[XviD-devel] [PATCH] calc_cbp_sse2 optimization
Mat Hostetter
mat at curl.com
Mon Apr 19 21:26:18 CEST 2004
>>>>> "syskin" == Radek Czyz <syskin at ihug.com.au> writes:
syskin> Mat Hostetter wrote:
>> This change (against 1.0.0-rc4) speeds up calc_cbp_sse2 from 131
>> cycles to 112 cycles on the Pentium 4 (for the in-cache case).
syskin> The most unbelievable thing appears to have happened: it's
syskin> b0rked...
You're right, it is. I extended my test suite to try more
combinations and it turns out I treat negative byte values as zero.
For some reason I thought pcmpgtb did unsigned compares, sigh!
I'll figure out a fix.
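
The pitfall can be sketched in scalar C by modelling a single byte lane of the SSE2 compares. This is only an illustration, not the code from the patch: the helper names (pcmpgtb_lane, pcmpeqb_lane, nonzero_via_gt, nonzero_via_eq) are hypothetical, and the equality-based variant is just one possible way to avoid the sign problem, not necessarily the fix that will be used. It assumes the usual two's-complement conversion when a uint8_t above 127 is cast to int8_t.

```c
#include <stdint.h>

/* Scalar model of one byte lane of pcmpgtb: a SIGNED greater-than
 * compare. Yields 0xFF when a > b as signed 8-bit values, else 0x00. */
static uint8_t pcmpgtb_lane(uint8_t a, uint8_t b)
{
    return ((int8_t)a > (int8_t)b) ? 0xFF : 0x00;
}

/* Scalar model of one byte lane of pcmpeqb: 0xFF when equal, else 0x00. */
static uint8_t pcmpeqb_lane(uint8_t a, uint8_t b)
{
    return (a == b) ? 0xFF : 0x00;
}

/* Hypothetical nonzero test built on "byte > 0". Because the compare
 * is signed, bytes 0x80..0xFF count as negative and look like zero. */
static int nonzero_via_gt(uint8_t x)
{
    return pcmpgtb_lane(x, 0) != 0;
}

/* One possible alternative (an assumption, not the patch's fix):
 * compare for equality with zero and invert the result. */
static int nonzero_via_eq(uint8_t x)
{
    return pcmpeqb_lane(x, 0) == 0;
}
```

With these helpers, nonzero_via_gt(0xFF) reports "zero" even though the byte is set, while nonzero_via_eq(0xFF) reports "nonzero". Both agree for 0x00 and for bytes in the range 0x01..0x7F.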
I think it would also be nice to contribute my cbp test.
I could throw it into xvid_bench.c with every other test, but I'm
thinking it would be cleaner to have a "tests" directory and a "make
test" target that runs all the tests in it (for those of us who use
"make", anyway).
-Mat