[XviD-devel] [PATCH] calc_cbp_sse2 optimization

Mat Hostetter mat at curl.com
Mon Apr 19 21:26:18 CEST 2004


>>>>> "syskin" == Radek Czyz <syskin at ihug.com.au> writes:

 syskin> Mat Hostetter wrote:
 >> This change (against 1.0.0-rc4) speeds up calc_cbp_sse2 from 131
 >> cycles to 112 cycles on the Pentium 4 (for the in-cache case).

 syskin> The most unbelievable thing appears to have happened: it's
 syskin> b0rked...

You're right, it is.  I extended my test suite to try more
combinations, and it turns out my code treats negative byte values as
zero.  For some reason I thought pcmpgtb did unsigned compares, but it
compares signed bytes, sigh!

I'll figure out a fix.
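
For illustration only, here is a scalar C model of the compare behaviour
described above.  It is not the patch itself and does not show how the
patch uses pcmpgtb; the helper names (pcmpgtb_lane, pcmpeqb_lane,
nonzero_via_gt, nonzero_via_eq) are made up for this sketch.  It assumes
a "is this byte nonzero?" check written as "byte > 0", which is one way a
signed compare can end up treating negative bytes as zero.

```c
#include <stdint.h>

/* Scalar model of one byte lane of PCMPGTB: both operands are compared
 * as signed 8-bit values; the lane becomes 0xFF if a > b, else 0x00. */
uint8_t pcmpgtb_lane(uint8_t a, uint8_t b)
{
    int sa = (a >= 0x80) ? (int)a - 256 : (int)a;
    int sb = (b >= 0x80) ? (int)b - 256 : (int)b;
    return (sa > sb) ? 0xFF : 0x00;
}

/* Scalar model of one byte lane of PCMPEQB: 0xFF if equal, else 0x00.
 * Equality does not depend on signedness. */
uint8_t pcmpeqb_lane(uint8_t a, uint8_t b)
{
    return (a == b) ? 0xFF : 0x00;
}

/* Hypothetical nonzero test built on "byte > 0".  Bytes 0x80..0xFF are
 * negative under a signed compare, so they are reported as zero. */
int nonzero_via_gt(uint8_t byte)
{
    return pcmpgtb_lane(byte, 0) != 0;
}

/* Sign-independent alternative: "byte is not equal to zero". */
int nonzero_via_eq(uint8_t byte)
{
    return pcmpeqb_lane(byte, 0) == 0;
}
```

With this model, nonzero_via_gt(0xFF) reports "zero" while
nonzero_via_eq(0xFF) reports "nonzero"; both agree for 0x00 and 0x01.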

I think it would also be nice to contribute my cbp test.
I could throw it into xvid_bench.c with every other test, but I'm
thinking it would be cleaner to have a "tests" directory and a "make
test" target that runs all the tests in it (for those of us who use
"make", anyway).
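
As a rough sketch of what a standalone cbp test could look like (this is
not the test suite mentioned above, and every name in it is
hypothetical), the C below compares a candidate implementation against a
plain reference over inputs with a single nonzero coefficient, including
negative values.  It assumes the input is 6 blocks of 64 int16_t
coefficients, that index 0 of each block (the DC term) is ignored, and
that block i maps to bit (5 - i) of the result; check these against the
real calc_cbp before relying on them.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical common signature for implementations under test. */
typedef uint32_t (*cbp_fn)(const int16_t *codes);

/* Plain reference (assumed semantics): set bit (5 - i) if block i has
 * any nonzero coefficient other than the DC term at index 0. */
uint32_t cbp_ref(const int16_t *codes)
{
    uint32_t cbp = 0;
    for (int i = 0; i < 6; i++) {
        for (int j = 1; j < 64; j++) {
            if (codes[i * 64 + j] != 0) {
                cbp |= 1u << (5 - i);
                break;
            }
        }
    }
    return cbp;
}

/* Deliberately broken stand-in that treats negative values as zero,
 * used only to show that the test vectors catch that class of bug. */
uint32_t cbp_ignores_negatives(const int16_t *codes)
{
    uint32_t cbp = 0;
    for (int i = 0; i < 6; i++) {
        for (int j = 1; j < 64; j++) {
            if (codes[i * 64 + j] > 0) {
                cbp |= 1u << (5 - i);
                break;
            }
        }
    }
    return cbp;
}

/* Returns the number of test vectors on which candidate and reference
 * disagree.  Vectors: all zeros, then one nonzero value at each of the
 * 384 positions.  The values include negatives and cases where only the
 * low byte (255) or only the high byte (-256) of the word is nonzero. */
int cbp_count_mismatches(cbp_fn candidate)
{
    static const int16_t values[] = { 1, -1, 255, -256, 0x7FFF, -0x8000 };
    const int nvalues = (int)(sizeof values / sizeof values[0]);
    int16_t codes[6 * 64];
    int mismatches = 0;

    memset(codes, 0, sizeof codes);
    if (candidate(codes) != cbp_ref(codes))
        mismatches++;

    for (int pos = 0; pos < 6 * 64; pos++) {
        for (int v = 0; v < nvalues; v++) {
            memset(codes, 0, sizeof codes);
            codes[pos] = values[v];
            if (candidate(codes) != cbp_ref(codes))
                mismatches++;
        }
    }
    return mismatches;
}
```

A driver run from a "make test" target could call cbp_count_mismatches()
once per implementation and exit with a nonzero status if any call
returns a nonzero count.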

-Mat

