[XviD-devel] GMC rc3 - TODO

skal xvid-devel@xvid.org
15 Jan 2003 18:02:44 +0100


	Gruel,

On Wed, 2003-01-15 at 17:24, Christoph Lampert wrote:

> > 	Moreover, weird sizes (not power of 2) were not tested...
> 
> Weird sizes for what? For image size? 

	(for the sprite size, the one used in ROUND_DIV())

> 
> P.S. Skal, of course, you _do_ beat the compiler (at least gcc 2.95), but
> not as high as I expected, only by about 10%-12% here... Aren't you
> tempted to make it at least 50%   >;-) 
> 

	well, the evil temptation would be to keep on optimizing
	code that is just in test phase so far. Of course there are
	still tricks to inject, but it would be premature, and
	would make the code hardly readable...
	For instance, clamping the vector to [0..W[ and
	[0..H[ is faster using:

	int offset = 0;
	/* the unsigned compare checks 0 <= F < W in a single test */
	if ((uint32_t)F<W) offset = F;
	else if (F>=W) offset = W;	/* F < 0 keeps offset at 0 */
	if ((uint32_t)G<H) offset += G*stride;
	else if (G>=H) offset += H*stride;

	but rather obfuscated...
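
	To make the trick easier to check, here is a minimal sketch
	that puts it next to the plain two-compare clamp. The function
	names are made up for illustration, and F, G, W, H and stride
	are assumed to be plain ints with W and H non-negative. Both
	versions saturate at W and H themselves, as in the snippet
	above; if W and H were the true width and height rather than
	the last valid index, one would probably saturate at W-1 and
	H-1 instead.

```c
#include <assert.h>
#include <stdint.h>

/* Straightforward version: two signed compares per coordinate. */
static int clamp_offset_plain(int F, int G, int W, int H, int stride)
{
	int x = F < 0 ? 0 : (F >= W ? W : F);
	int y = G < 0 ? 0 : (G >= H ? H : G);
	return x + y * stride;
}

/* Unsigned-compare version. A negative value cast to uint32_t becomes
 * a very large number, so one compare rejects both "below 0" and
 * "at or above the limit". Only the rejected values need a second test. */
static int clamp_offset_fast(int F, int G, int W, int H, int stride)
{
	int offset = 0;                       /* covers F < 0 and G < 0 */
	assert(W >= 0 && H >= 0);             /* assumption of this sketch */
	if ((uint32_t)F < (uint32_t)W) offset = F;
	else if (F >= W) offset = W;
	if ((uint32_t)G < (uint32_t)H) offset += G * stride;
	else if (G >= H) offset += H * stride;
	return offset;
}
```

	In the common case (vector inside the picture) the fast
	version takes one compare per coordinate instead of two,
	which is where the gain would come from.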

	Anyway, a better hunt for speed would be splitting
	the loop in two:
	1st pass: iterate F and G, clamp, store offsets and
	weights into a temp array.
	2nd pass: perform the bilinear interp. itself, with
	a 1-sample delay line
	=> Tighter loops, and 2nd pass very suitable
	for MMX-izing
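
	A rough sketch of what such a split could look like, not the
	actual xvid_gmc.c code. Everything named here (warp_line,
	tap_t, FRAC_BITS, MAX_LINE) is hypothetical. It assumes F and
	G are fixed-point coordinates with 4 fractional bits, that W
	and H are the largest allowed top-left sample position, and
	that src is padded by at least one sample to the right and
	below so that the 2x2 reads stay in bounds. The delay line is
	left out to keep it short.

```c
#include <assert.h>
#include <stdint.h>

#define FRAC_BITS 4                   /* assumed precision: 1/16 pel */
#define FRAC_ONE  (1 << FRAC_BITS)
#define MAX_LINE  64                  /* assumed max pixels per call */

typedef struct {
	int     offset;               /* clamped top-left position in src */
	uint8_t wx, wy;               /* fractional weights, 0..FRAC_ONE-1 */
} tap_t;

static void warp_line(uint8_t *dst, int n,
                      const uint8_t *src, int stride, int W, int H,
                      int F, int G, int dF, int dG)
{
	tap_t tap[MAX_LINE];
	const int fmax = W << FRAC_BITS;
	const int gmax = H << FRAC_BITS;
	int i;

	assert(n > 0 && n <= MAX_LINE && W >= 0 && H >= 0);

	/* 1st pass: iterate F and G, clamp, store offsets and weights. */
	for (i = 0; i < n; i++, F += dF, G += dG) {
		int f = F < 0 ? 0 : (F > fmax ? fmax : F);
		int g = G < 0 ? 0 : (G > gmax ? gmax : G);
		tap[i].offset = (f >> FRAC_BITS) + (g >> FRAC_BITS) * stride;
		tap[i].wx = (uint8_t)(f & (FRAC_ONE - 1));
		tap[i].wy = (uint8_t)(g & (FRAC_ONE - 1));
	}

	/* 2nd pass: bilinear interpolation only, no address logic left. */
	for (i = 0; i < n; i++) {
		const uint8_t *p = src + tap[i].offset;
		int wx = tap[i].wx, wy = tap[i].wy;
		int top = (FRAC_ONE - wx) * p[0]      + wx * p[1];
		int bot = (FRAC_ONE - wx) * p[stride] + wx * p[stride + 1];
		/* the four weights sum to 256, so round and shift by 8 */
		dst[i] = (uint8_t)(((FRAC_ONE - wy) * top + wy * bot + 128) >> 8);
	}
}
```

	The second loop is then just multiply-adds over a small
	array, which is the part that should map well to MMX.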

	Also: move the calc of avgMV away from the main loop:
	it may fatten the code too much...

	bye,
		Skal

> xvid_gmc.c compiled with 
> 
> gcc -O2 -march=i686 -mcpu=i686 -funroll-loops -fstrict-aliasing
> -fomit-frame-pointer -fPIC  -mpreferred-stack-boundary=4
> 
> on lame PII-450. 
> 
> 
> "ugly gruel ISO" version =====  test GMC =====
> 
> res=2 ... [0][1][2][0][1][2][0][1][2][0][1][2] ... crc=0x7fb9 time=5.824sec  103.02 fps
> res=4 ... [0][1][2][0][1][2][0][1][2][0][1][2] ... crc=0x8a58 time=5.816sec  103.16 fps
> res=8 ... [0][1][2][0][1][2][0][1][2][0][1][2] ... crc=0x58c4 time=5.752sec  104.31 fps
> res=16 ...[0][1][2][0][1][2][0][1][2][0][1][2] ... crc=0x8a58 time=5.743sec  104.48 fps
> 
> 
> "Skal" version =====  test GMC =====
> 
> res=2 ... [0][1][2][0][1][2][0][1][2][0][1][2] ... crc=0x7fb9 time=5.247sec  114.36 fps
> res=4 ... [0][1][2][0][1][2][0][1][2][0][1][2] ... crc=0x8a58 time=5.272sec  113.81 fps
> res=8 ... [0][1][2][0][1][2][0][1][2][0][1][2] ... crc=0x58c4 time=5.214sec  115.08 fps
> res=16 ...[0][1][2][0][1][2][0][1][2][0][1][2] ... crc=0x8a58 time=5.215sec  115.06 fps
> 
> 
> 