[XviD-devel] GMC rc3 - TODO
skal
xvid-devel@xvid.org
15 Jan 2003 19:36:44 +0100
Christoph,
On Wed, 2003-01-15 at 18:31, Christoph Lampert wrote:
> On 15 Jan 2003, skal wrote:
> > > P.S. Skal, of course, you _do_ beat the compiler (at least gcc 2.95), but
> > > not as high as I expected, only by about 10%-12% here... Aren't you
> > > tempted to make it at least 50% >;-)
> >
> > well, the evil temptation would be to keep on optimizing
> > a code that is just in test phase so far. Of course there's
> > still tricks to inject, but it would be premature, and
> > would make the code hardly readable...
>
> Yes, there will be 3-warp-point code, even though it would be better to
> create a separate routine for that, so calculations for the 2-warp-point
> codes are not slowed down. Most likely, these routines (which are under
> very strong restrictions from the standard what their output may be) are
> going to stay as they are. What really _is_ going to change is GME...
Note: the main interpolation loop is the same for the 3-warp-point
case as for the 2-warp-point one.
Only the gradients (dxF, dyF, dxG, dyG) differ from the
rotation values (Cos,-Sin,Sin,Cos). Actually, the second version
of xvid_gmc.c I sent already contains the setup code for
the case num_wp = 3. I hope I didn't misread the ISO spec.
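
To illustrate why one loop can serve both cases, here is a minimal
sketch, not the actual xvid_gmc.c code. The struct warp_t, the
function walk_block and the fields Fo/Go are hypothetical names made
up for this example; only dxF, dyF, dxG, dyG come from the text above.
The loop just accumulates the gradients, so it does not care whether
they were set up from a rotation or from the 3-warp-point case:

```c
#include <stdint.h>

/* Hypothetical container for the per-block warp setup. */
typedef struct {
    int32_t Fo, Go;    /* fixed-point source position of the top-left pixel */
    int32_t dxF, dyF;  /* change of F per step in x / per step in y */
    int32_t dxG, dyG;  /* change of G per step in x / per step in y */
} warp_t;

/* Walks a bw x bh block and stores the (F, G) source coordinate
 * reached for every destination pixel. Only additions are needed
 * inside the loop, whatever the values of the four gradients. */
static void walk_block(const warp_t *w, int bw, int bh,
                       int32_t *outF, int32_t *outG)
{
    int32_t Frow = w->Fo, Grow = w->Go;
    for (int j = 0; j < bh; j++) {
        int32_t F = Frow, G = Grow;
        for (int i = 0; i < bw; i++) {
            outF[j * bw + i] = F;
            outG[j * bw + i] = G;
            F += w->dxF;   /* one pixel to the right */
            G += w->dxG;
        }
        Frow += w->dyF;    /* one line down */
        Grow += w->dyG;
    }
}
```

For the 2-warp-point case the setup would fill the four gradients
with (Cos,-Sin,Sin,Cos) in fixed point; for num_wp = 3 it would fill
them with four independent values. The loop itself stays untouched.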
>
>
> > For instance, clamping the vector to [0..W[ and
> > [0..H[ is faster using:
> >
> > int offset = 0;
> > if ((uint32_t)F<W) offset = F;
> > else if (F>=W) offset = W;
> > if ((uint32_t)G<H) offset += G*stride;
> > else if (G>=H) offset += H*stride;
> >
> > but rather obfuscated...
>
> errr, yes. Give me a few hours to think about that, maybe then I'll
> understand ;-)
Btw, I've noticed a little 'bug': it is better to take the
residuals (ri, rj) *after* the F/G vector has been clamped
to width/height. Actually, clamping to [-1,W]/[-1,H] is
correct if one assumes the edges have been replicated.
Otherwise, one can use the code:
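  int W = gmc_data->W<<4;   /* width in 1/16-pel units */
  int H = gmc_data->H<<4;   /* height in 1/16-pel units */
  int offset = 0;
  uint32_t ri = 0, rj = 0;
  /* the unsigned compare tests 0<=F and F<W at once */
  if ((uint32_t)F<(uint32_t)W) { ri = MTab[F&15]; offset = F>>4; }
  else if (F>=W) offset = (W>>4)-1;   /* clamp to last pixel column */
  if ((uint32_t)G<(uint32_t)H) { rj = MTab[G&15]; offset += (G>>4)*stride; }
  else if (G>=H) offset += ((H>>4)-1)*stride;   /* clamp to last line */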
instead, which does not make such an assumption.
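
For whoever finds the cast obfuscated, here is a small stand-alone
sketch of the trick for one coordinate. The helper clamp_pel is a
hypothetical name for this example and is not part of xvid_gmc.c; it
assumes a coordinate with 4 fractional bits (1/16 pel), as in the code
above, and returns the raw fraction rather than an MTab entry:

```c
#include <stdint.h>

/* Hypothetical helper: clamps the 1/16-pel coordinate v to the
 * pixel range [0, size-1] and returns the integer pixel position.
 * *frac receives the sub-pel fraction (0..15), or 0 when the
 * coordinate had to be clamped. */
static int clamp_pel(int v, int size, uint32_t *frac)
{
    const int limit = size << 4;   /* size in 1/16-pel units */
    *frac = 0;
    /* A negative v converts to a huge unsigned value, so this
     * single comparison checks both 0 <= v and v < limit. */
    if ((uint32_t)v < (uint32_t)limit) {
        *frac = (uint32_t)(v & 15);
        return v >> 4;
    }
    /* Out of range: too large goes to the last pixel, negative to 0. */
    return (v >= limit) ? size - 1 : 0;
}
```

With size = 16, a value of 37 gives pixel 2 with fraction 5, a value
of -5 gives pixel 0, and a value of 256 gives pixel 15, both of the
latter with fraction 0.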
bye,
Skal