[XviD-devel] GMC rc3 - TODO

Christoph Lampert xvid-devel@xvid.org
Wed, 15 Jan 2003 17:24:36 +0100 (CET)


On 15 Jan 2003, skal wrote:
> On Wed, 2003-01-15 at 14:22, Christoph Lampert wrote:
> > On 14 Jan 2003, skal wrote:
> > > 	here's a test-bed (a la xvid_bench.c) for future tests on GMC.
> > > 	I've also quickly hacked a fixed-point incremental version,
> > > 	just to be sure I still can beat a linux compiler :)
> > > 
> 	oops, the evaluation of the average MV was missing an
> 	(x,y) offset! Here is a corrected version. I've also
> 	simplified the calculation of the gradients (gee...how 
> 	twisted-minded the ISO is!!) and sped the interpolation
> 	a little bit more. 

Neat. We'll have to remember not to do this on machines where
integer access has to be aligned. Someone please remind me if I
forget and complain one day...
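
For the archive, a minimal sketch of the portable alternative. The thread
does not say which access in xvid_gmc.c is meant, so this assumes it is a
multi-byte read through a byte pointer. The helper name load_u32_unaligned
is hypothetical and not part of XviD.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from xvid_gmc.c): read 32 bits from an address
 * with no alignment guarantee. memcpy has no alignment requirement, and
 * compilers commonly reduce it to a single load where the target allows it. */
static uint32_t load_u32_unaligned(const uint8_t *p)
{
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);   /* byte-wise copy, safe at any address */
    return v;
}

/* The shortcut this replaces, which can fault on strict-alignment CPUs
 * when p is not a multiple of 4:
 *
 *     uint32_t v = *(const uint32_t *)p;
 */
```

The value comes back in host byte order, exactly as the cast version would
give on x86, so callers that depend on a particular byte order still have
to handle that themselves.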

> 	Moreover, weird sizes (not power of 2) were not tested...

Weird sizes for what? For image size? 

> > thank you, for this and all, but why are you leaving? Did we annoy you too
> > much?
> 
> 	well, I like to switch context on a regular basis, so I think
> 	I'm gonna (randomly?) pick a new subject of interest for the
> 	next 6 months ;) 

Sorry to lose you. :-( It was fun! 

gruel

P.S. Skal, of course, you _do_ beat the compiler (at least gcc 2.95), but
not by as much as I expected, only by about 10%-12% here... Aren't you
tempted to make it at least 50%   >;-) 


xvid_gmc.c compiled with 

gcc -O2 -march=i686 -mcpu=i686 -funroll-loops -fstrict-aliasing
-fomit-frame-pointer -fPIC  -mpreferred-stack-boundary=4

on lame PII-450. 


"ugly gruel ISO" version =====  test GMC =====

res=2 ... [0][1][2][0][1][2][0][1][2][0][1][2] ... crc=0x7fb9 time=5.824sec  103.02 fps
res=4 ... [0][1][2][0][1][2][0][1][2][0][1][2] ... crc=0x8a58 time=5.816sec  103.16 fps
res=8 ... [0][1][2][0][1][2][0][1][2][0][1][2] ... crc=0x58c4 time=5.752sec  104.31 fps
res=16 ...[0][1][2][0][1][2][0][1][2][0][1][2] ... crc=0x8a58 time=5.743sec  104.48 fps


"Skal" version =====  test GMC =====

res=2 ... [0][1][2][0][1][2][0][1][2][0][1][2] ... crc=0x7fb9 time=5.247sec  114.36 fps
res=4 ... [0][1][2][0][1][2][0][1][2][0][1][2] ... crc=0x8a58 time=5.272sec  113.81 fps
res=8 ... [0][1][2][0][1][2][0][1][2][0][1][2] ... crc=0x58c4 time=5.214sec  115.08 fps
res=16 ...[0][1][2][0][1][2][0][1][2][0][1][2] ... crc=0x8a58 time=5.215sec  115.06 fps
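

To make the comparison explicit, here is a small sketch of how the relative
speedup can be derived from the fps columns above. The function name
speedup_percent is made up for illustration and is not part of the test-bed.

```c
#include <assert.h>

/* Hypothetical helper: relative speedup of fast_fps over base_fps,
 * in percent. Both arguments are frames per second. */
static double speedup_percent(double base_fps, double fast_fps)
{
    assert(base_fps > 0.0);
    return (fast_fps / base_fps - 1.0) * 100.0;
}
```

With the figures above, speedup_percent(103.02, 114.36) gives roughly 11.0
for res=2 and speedup_percent(104.48, 115.06) gives roughly 10.1 for res=16.
The CRCs are identical between the two versions for every res value, so both
produce the same output on this test.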