[XviD-devel] GMC rc3 - TODO

Christoph Lampert xvid-devel@xvid.org
Wed, 15 Jan 2003 18:31:41 +0100 (CET)


On 15 Jan 2003, skal wrote:
> > P.S. Skal, of course, you _do_ beat the compiler (at least gcc 2.95), but
> > not as high as I expected, only by about 10%-12% here... Aren't you
> > tempted to make it at least 50%   >;-) 
>
> 	well, the evil temptation would be to keep on optimizing
> 	code that is just in test phase so far. Of course there
> 	are still tricks to inject, but it would be premature, and
> 	would make the code hardly readable...

Yes, there will be 3-warp-point code, though it would be better to
create a separate routine for that, so that calculations for the
2-warp-point case are not slowed down. Most likely, these routines (whose
output is tightly constrained by the standard) are going to stay as they
are. What really _is_ going to change is GME...


> 	For instance, clamping the vector to [0..W[ and
> 	[0..H[ is faster using:
> 
> 	int offset = 0;
> 	if ((uint32_t)F<W) offset = F;
> 	else if (F>=W) offset = W;
> 	if ((uint32_t)G<H) offset += G*stride;
> 	else if (G>=H) offset += H*stride;
> 
> 	but rather obfuscated...

errr, yes. Give me a few hours to think about that, maybe then I'll
understand ;-)  
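
For readers puzzling over the same snippet, here is a minimal sketch of
what it appears to do. The function names clamp_offset and
clamp_offset_ref are made up for illustration and are not XviD code; F,
G, W, H and stride follow the quoted snippet, and W and H are assumed to
be non-negative. The idea is that casting a negative int to uint32_t
gives a huge value, so one unsigned compare tests both "F >= 0" and
"F < W".

```c
#include <stdint.h>

/* Hypothetical wrapper around the quoted snippet (not XviD code).
   Assumes W >= 0 and H >= 0. */
int clamp_offset(int F, int G, int W, int H, int stride)
{
    int offset = 0;

    /* A negative F wraps to a large unsigned value and fails this
       test, so a single compare covers 0 <= F < W. */
    if ((uint32_t)F < (uint32_t)W) offset = F;
    else if (F >= W) offset = W;          /* too large: clamp to W */
    /* otherwise F < 0 and the horizontal part stays 0 */

    if ((uint32_t)G < (uint32_t)H) offset += G * stride;
    else if (G >= H) offset += H * stride;
    /* otherwise G < 0 and nothing is added */

    return offset;
}

/* The same clamp written with plain signed compares, for comparison. */
int clamp_offset_ref(int F, int G, int W, int H, int stride)
{
    int x = F < 0 ? 0 : (F > W ? W : F);
    int y = G < 0 ? 0 : (G > H ? H : G);
    return x + y * stride;
}
```

In the common in-range case the first version needs one compare per
coordinate instead of two, which is presumably where the speedup comes
from.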

> 	Anyway, a better hunt for speed would be splitting
> 	the loop in two: 
> 	1st pass: iterate F and G, clamp, store offsets and
> 	weights into a temp array.
> 	2nd pass: perform the bilinear interp. itself, with
> 	a 1-sample delay line
> 	=> Tighter loops, and 2nd pass very suitable
> 	for MMX-izing
> 
> 	Also: move the calc of avgMV away from the main loop:
> 	it may fatten the code too much...
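
The two-pass split suggested above could look roughly like the sketch
below. Everything in it is an assumption made for illustration, not the
actual XviD routine: the names warp_tap, warp_pass1 and warp_pass2, the
1/16-pel fixed-point format, and the rounding (which is not claimed to
match what the standard requires). W and H are taken to be the largest
allowed top-left sample coordinates, so the source must have at least
W+2 columns and H+2 rows readable. The 1-sample delay line is left out
to keep the sketch short.

```c
#include <stdint.h>

#define FRAC_BITS 4                 /* assumed: 1/16-pel fixed point */
#define FRAC_ONE  (1 << FRAC_BITS)

/* Hypothetical per-pixel record produced by pass 1. */
typedef struct {
    int     offset;   /* clamped top-left sample position in src */
    uint8_t wx, wy;   /* fractional weights, 0..FRAC_ONE-1 */
} warp_tap;

/* Pass 1: step (F,G) along one row, clamp, store offsets and weights.
   Relies on >> of a negative int being an arithmetic shift, which is
   implementation-defined in C but true on common compilers. */
void warp_pass1(warp_tap *taps, int n, int F, int G, int dF, int dG,
                int W, int H, int stride)
{
    for (int i = 0; i < n; i++, F += dF, G += dG) {
        int x = F >> FRAC_BITS, wx = F & (FRAC_ONE - 1);
        int y = G >> FRAC_BITS, wy = G & (FRAC_ONE - 1);

        /* Outside the allowed range: snap to the border, drop the weight. */
        if (x < 0) { x = 0; wx = 0; } else if (x >= W) { x = W; wx = 0; }
        if (y < 0) { y = 0; wy = 0; } else if (y >= H) { y = H; wy = 0; }

        taps[i].offset = y * stride + x;
        taps[i].wx = (uint8_t)wx;
        taps[i].wy = (uint8_t)wy;
    }
}

/* Pass 2: bilinear interpolation only, no coordinate arithmetic left. */
void warp_pass2(uint8_t *dst, const uint8_t *src, const warp_tap *taps,
                int n, int stride)
{
    for (int i = 0; i < n; i++) {
        const uint8_t *p = src + taps[i].offset;
        int wx = taps[i].wx, wy = taps[i].wy;
        int top = p[0] * (FRAC_ONE - wx) + p[1] * wx;
        int bot = p[stride] * (FRAC_ONE - wx) + p[stride + 1] * wx;
        int v = top * (FRAC_ONE - wy) + bot * wy;
        /* Round to nearest; this rounding is an assumption. */
        dst[i] = (uint8_t)((v + (1 << (2 * FRAC_BITS - 1))) >> (2 * FRAC_BITS));
    }
}
```

The second loop only loads, multiplies and adds, which is the part that
should map well onto MMX.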

Okay, we'll try that. I guess we all learned quite a lot from you, bye!
(And tell us what your next project is going to be. Maybe audio
encoding? XVID still lacks that ;-) )

gruel