[XviD-devel] GMC rc3 - TODO

15 Jan 2003 19:36:44 +0100

	Christoph,

On Wed, 2003-01-15 at 18:31, Christoph Lampert wrote:
> On 15 Jan 2003, skal wrote:
> > > P.S. Skal, of course, you _do_ beat the compiler (at least gcc 2.95), but
> > > not as high as I expected, only by about 10%-12% here... Aren't you
> > > tempted to make it at least 50%   >;-) 
> >
> > 	well, the evil temptation would be to keep on optimizing
> > 	code that is still in the test phase. Of course there are
> > 	still tricks to inject, but it would be premature, and
> > 	would make the code hardly readable...
> 
> Yes, there will be 3-warp-point code, even though it would be better to
> create a separate routine for that, so calculations for the 2-warp-point
> codes are not slowed down. Most likely, these routines (which are under
> very strong restrictions from the standard what their output may be) are
> going to stay as they are. What really _is_ going to change is GME... 

	Note: the main interpolation loop is the same for the
	3-warp-point case as for the 2-warp-point one.
	Only the gradients (dxF, dyF, dxG, dyG) differ from
	the rotation (Cos,-Sin,Sin,Cos). Actually, the second version
	of xvid_gmc.c I've sent already contains the setup code for
	the case num_wp = 3. Hope I didn't mess up reading the ISO.
> 
> 
> > 	For instance, clamping the vector to [0..W[ and
> > 	[0..H[ is faster using:
> > 
> > 	int offset = 0;
> > 	if ((uint32_t)F<W) offset = F;
> > 	else if (F>=W) offset = W;
> > 	if ((uint32_t)G<H) offset += G*stride;
> > 	else if (G>=H) offset += H*stride;
> > 
> > 	but rather obfuscated...
> 
> errr, yes. Give me a few hours to think about that, maybe then I'll
> understand ;-)  
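
	The trick is that a negative int cast to uint32_t wraps to a
	very large value, so the single test (uint32_t)F<W covers both
	0<=F and F<W. A minimal sketch of it in isolation, clamping to
	[0..n-1]; clamp_index is a name made up for illustration and
	is not in xvid_gmc.c:

```c
#include <stdint.h>

/* Hypothetical helper: clamp x to [0, n-1]. Assumes n > 0. */
static int clamp_index(int x, int n)
{
    /* A negative x becomes a huge unsigned value, so this one
       compare is true only when 0 <= x < n. */
    if ((uint32_t)x < (uint32_t)n)
        return x;
    /* Out of range: x >= n clamps high, anything else is x < 0. */
    return (x >= n) ? n - 1 : 0;
}
```

	The common in-range case then costs one compare instead of two.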

	Btw, I've noticed a little 'bug': one should take the
	residuals (ri, rj) *after* the F/G vector has been clamped
	to width/height. Actually, clamping to [-1,W]/[-1,H] is
	correct if one assumes the edges have been replicated.
	Otherwise, one can use the code:

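	/* W, H: frame size in 1/16-pel units, like F and G */
	int W = gmc_data->W<<4;
	int H = gmc_data->H<<4;

	int offset = 0;
	uint32_t ri = 0, rj = 0;
	/* the unsigned compare is true only when 0 <= F < W */
	if ((uint32_t)F<(uint32_t)W) { ri = MTab[F&15]; offset = F>>4; }
	else if (F>=W) offset = (W>>4)-1;
	if ((uint32_t)G<(uint32_t)H) { rj = MTab[G&15]; offset += (G>>4)*stride; }
	else if (G>=H) offset += ((H>>4)-1)*stride;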

	instead, which does not make such an assumption.
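
	For anyone who wants to try this outside the codec, here is a
	self-contained sketch of the same idea. It is only an
	illustration under my own assumptions: SamplePos and
	clamp_position are hypothetical names, the MTab lookup is left
	out so the raw 1/16-pel fractions are returned instead, and
	width/height are taken in whole pixels.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical result type, for illustration only. */
typedef struct {
    ptrdiff_t offset;  /* index of the top-left source pixel */
    uint32_t fx, fy;   /* 1/16-pel fractions, 0 when clamped */
} SamplePos;

/* F, G: position in 1/16-pel units.
   width, height: plane size in pixels (both > 0).
   stride: distance between rows, in pixels. */
static SamplePos clamp_position(int F, int G,
                                int width, int height, int stride)
{
    const int W = width << 4;   /* same units as F */
    const int H = height << 4;  /* same units as G */
    SamplePos p = { 0, 0, 0 };

    /* Fractions are taken only when the coordinate is in range,
       i.e. after the clamping decision has been made. */
    if ((uint32_t)F < (uint32_t)W) { p.fx = F & 15; p.offset = F >> 4; }
    else if (F >= W)               p.offset = width - 1;
    /* else F < 0: column 0, fraction 0 */

    if ((uint32_t)G < (uint32_t)H) {
        p.fy = G & 15;
        p.offset += (ptrdiff_t)(G >> 4) * stride;
    } else if (G >= H) {
        p.offset += (ptrdiff_t)(height - 1) * stride;
    }
    /* else G < 0: row 0, fraction 0 */

    return p;
}
```

	Note that this only clamps the top-left sample. With a nonzero
	fraction in the last column or row, the interpolation still
	has to deal with the right/bottom neighbour itself.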

	bye,


		Skal