[XviD-devel] Rounding

Tue, 29 Oct 2002 21:43:28 +0100

Hi,

> > I have been looking into the qpel rounding issues again today because
> > ms-fdam does not decode chroma of XVID qpel (as well as DivX5)
correctly...
> >
> > > here's the code I've been using for the Round  7-6 -> 7-9
> > > tables during computation of chroma MV:
>
> What do you mean be "7-6 -> 7-9" ? For me (and I think this is rather
> clear in the standard), Table 7-6 is for rounding 16th-pel positions to
> halfpel-positions (needed for inter4v chroma).

ok. sorry, it seems it was not clear that this was not my code but just an
excerpt from an earlier discussion between you (Christoph) and Skal. I've
shortened the code part, but maybe this was no good idea, so here's the
(whole) original mail again:

----- Original Message -----

From: "skal" <skal@planet-d.net>
To: <xvid-devel@xvid.org>
Sent: Monday, October 14, 2002 9:40 AM
Subject: Re: [XviD-devel] Rounding

> On Sun, 2002-10-13 at 23:33, Christoph Lampert wrote:
> >
> > Hi,
> >
> >
> > I've been looking at rounding of motion vectors half the day. It's done
in
> > 3 different ways through the code. :( And I don't even think it's
right...
> >
> > I keep reading code like this:
> >
> > uv_dy =  (uv_dy ? SIGN(uv_dy) *
> > (roundtab8[ABS(uv_dy) % 8] + (ABS(uv_dy) / 8) * 2) : 0);
> >
> >
> > which is supposed to round from 8th-pel to halfpel. Result is symmetric
> > to 0, however, it's not similar to anything I read in the Standard. So
> > where is it from?
> >
> > Also, why the complicated expression using ? and % operator?
> >
> > My impression is that a simple & and >> should be sufficient
> > (if we assume that negative values are shifted in the logical way).
> >
> > uv_dy = roundtab8[uv_dy & 0x7] + (uv_dy >> 3) * 2;
> >
> > and similar fof 16th-pel -> halfpel with
> >
> > uv_dy = roundtab16[uv_dy & 0xF] + (uv_dy >> 4) * 2;
> >
> > For 16th-Pel this is the same, thought much less complicatedly put.
> > FOr 8th-pel there's a difference, and even though it's not large, the
> > PSNR gets slightly little higher in average and I believe it's what the
> > standard demands.
> >
> > Or am I wrong here?
> >
> >
> > 8th->half
> >
> > UV_DY     old           roundtab8
> >
> > dy=-15    res=-3        res2=-4
> > dy=-14    res=-3        res2=-3
> > dy=-13    res=-3        res2=-3
> > dy=-12    res=-3        res2=-3
> > dy=-11    res=-3        res2=-3
> > dy=-10    res=-2        res2=-3
> > dy=-9     res=-2        res2=-2
> > dy=-8     res=-2        res2=-2
> > dy=-7     res=-1        res2=-2
> > dy=-6     res=-1        res2=-1
> > dy=-5     res=-1        res2=-1
> > dy=-4     res=-1        res2=-1
> > dy=-3     res=-1        res2=-1
> > dy=-2     res= 0        res2=-1
> > dy=-1     res= 0        res2= 0
> > dy= 0     res= 0        res2= 0
> > dy= 1     res= 0        res2= 0
> > dy= 2     res= 0        res2= 1
> > dy= 3     res= 1        res2= 1
> > dy= 4     res= 1        res2= 1
> > dy= 5     res= 1        res2= 1
> > dy= 6     res= 1        res2= 1
> > dy= 7     res= 1        res2= 2
> > dy= 8     res= 2        res2= 2
> > dy= 9     res= 2        res2= 2
> > dy=10     res= 2        res2= 3
> > dy=11     res= 3        res2= 3
> > dy=12     res= 3        res2= 3
> > dy=13     res= 3        res2= 3
> > dy=14     res= 3        res2= 3
> > dy=15     res= 3        res2= 4
>
> Hi all,
>
> here's the code I've been using for the Round  7-6 -> 7-9
> tables during computation of chroma MV:
>
>
>   // Table 7-6 (K=4)  (modified)
> const int SKL_MB::Rnd_Tab_76[16] = { 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0,
0, 0, 1, 1 };
>   // Table 7-8 (K=2)  (modified)
> const int SKL_MB::Rnd_Tab_78[ 8] = { 0, 0, 1, 1, 0, 0, 0, 1  };
>   // Table 7-9 (K=1)  (modified)
> const int SKL_MB::Rnd_Tab_79[ 4] = { 0, 1, 1, 1 };
>
> typedef SKL_INT16 SKL_MV[2];
>
>
> #define DIV2RND(x) ( ((x)>>1) | ((x)&1) )
>
> void SKL_MB::Derive_uv_MV_MPEG12(SKL_MV *MV)
> {
>   uv_MV[0] = MV[0][0] >> 1;
>   uv_MV[1] = MV[0][1] >> 1;
> }
>
> void SKL_MB::Derive_uv_MV_From_1MV(int Quarter, SKL_MV *MV)
> {
>   int x = MV[0][0];
>   int y = MV[0][1];
>   if (Quarter) { x = DIV2RND(x); y = DIV2RND(y); }
>   uv_MV[0] = (x>>1) | Rnd_Tab_79[ x & 0x3];
>   uv_MV[1] = (y>>1) | Rnd_Tab_79[ y & 0x3];
> }
>
> void SKL_MB::Derive_uv_MV_From_2MV(int Quarter, SKL_MV *MV)
> {
>   int x = MV[0][0];
>   int y = MV[0][1];
>   if (Quarter) { x = DIV2RND(x); y = DIV2RND(y); }
>   uv_MV[0] = (x>>2) | Rnd_Tab_78[ x & 0x7];
>   uv_MV[1] = (y>>2) | Rnd_Tab_78[ y & 0x7];
> }
>
> void SKL_MB::Derive_uv_MV_From_4MV(int Quarter, SKL_MV *MV)
> {
>   int x = MV[0][0] + MV[1][0] + MV[MV_Stride+0][0] + MV[MV_Stride+1][0];
>   int y = MV[0][1] + MV[1][1] + MV[MV_Stride+0][1] + MV[MV_Stride+1][1];
>   if (Quarter) { x = DIV2RND(x); y = DIV2RND(y); }
>   uv_MV[0] = (x>>3) + Rnd_Tab_76[ x & 0xf ];
>   uv_MV[1] = (y>>3) + Rnd_Tab_76[ y & 0xf ];
> }
>
> #undef DIV2RND
>
>
> I  think it's correct as soon as the '>>' is
> really a bitwise shift (signed or unsigned)...
>
> For the half and q-pel truncation, I think using
> the shift is the correct way, too
> (e.g.:
>
>  int Quads = (mv_x&3) | ((mv_y&3)<<2);
>  mv_x >>= 2;
>  my_y >>= 2;
>
> for q-pel MV snapping)
>
>
>
> bye
>
> Skal
>
>
>
>
> _______________________________________________
> XviD-devel mailing list
> XviD-devel@xvid.org
> http://list.xvid.org/mailman/listinfo/xvid-devel
>

----- End of Original Message -----

> Table 7-9 is needed for rounding 4th-pel position to halfpel-positions
> (needed for inter-mode chroma).
yes

> There is also Table 7-8, which is rounding 8th-pel positions to halfpel
> (needed for inter4v lumi).
yes

> Table 7-7 is 12th-pel to halfpel, we don't need that at the moment.
yes

> But in all cases, only 1 rounding operation is done! No 2-step
> procedure. Did you interpret Table 7-6 as 16th-pel to quarterpel
> because of the "2"s ? '2' simply stands for "next fullpel position",
> which is the right choice for the 16th-pel positions '14/16' and '15/16'.

he??? table 7-6 is 16thpel->halfpel rounding, of course and there were never
any doubts about this...

> > >   // Table 7-6 (K=4)  (modified)
> > > const int SKL_MB::Rnd_Tab_76[16] = { 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0,
0,
> > 0, 0, 1, 1 };
> > >   // Table 7-8 (K=2)  (modified)
> > > const int SKL_MB::Rnd_Tab_78[ 8] = { 0, 0, 1, 1, 0, 0, 0, 1  };
> > >   // Table 7-9 (K=1)  (modified)
> > > const int SKL_MB::Rnd_Tab_79[ 4] = { 0, 1, 1, 1 };
> > >
> > > typedef SKL_INT16 SKL_MV[2];
> > >
> > > #define DIV2RND(x) ( ((x)>>1) | ((x)&1) )
> > >
> > > void SKL_MB::Derive_uv_MV_From_1MV(int Quarter, SKL_MV *MV)
> > > {
> > >   int x = MV[0][0];
> > >   int y = MV[0][1];
> > >   if (Quarter) { x = DIV2RND(x); y = DIV2RND(y); }
> > >   uv_MV[0] = (x>>1) | Rnd_Tab_79[ x & 0x3];
> > >   uv_MV[1] = (y>>1) | Rnd_Tab_79[ y & 0x3];
> > > }
> >
> > this is exactly how I've understood the standard and that's what we're
> > currently already doing.
>
> Sorry, maybe I'm too stupid, but I don't see how this code has anything to
> do with the standard.

It seems we totally misunderstood eachother. Nonetheless, your answer was
very useful because I thought a bit about the Skal method and your method
again and finally found what's wrong: the above code has a lot to do with
the standard ;-) And the Skal_Tab_79 rounding code is (accidently) correct,
however the Skal method (shift + OR) does not work for any other rounding
method than quarter->halfpel.

In general a shift + add seems ok using Skal's modified tables, so for
example:

// Table 7-6 (K=4)  (modified)
const int SKL_MB::Rnd_Tab_76[16] = { 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0,
0, 1, 1 };
// Table 7-8 (K=2)  (modified)
const int SKL_MB::Rnd_Tab_78[ 8] = { 0, 0, 1, 1, 0, 0, 0, 1  };
// Table 7-9 (K=1)  (modified)
const int SKL_MB::Rnd_Tab_79[ 4] = { 0, 1, 1, 1 };

x = (x>>3) + Rnd_Tab_76[x & 0xF]; for rounding from 16th pel -> halfpel

x = (x>>2) + Rnd_Tab_78[x & 0x7]; for rounding from 8th pel -> halfpel

x = (x>>1) + Rnd_Tab_79[x & 0x3]; for rounding from quarterpel -> halfpel

This code should do what the standard requires and should be also equivalent
to the code Christoph provided...

> In my eyes the standard clear demands rounding as follows:
>
> roundtab16[]={0,0,0,1,1,1,1,1,1,1,1,1,1,2,2};
> roundtab8[]={0,0,0,1,1,1,1,1,1,1,2,2};
> roundtab4[]={0,1,1,1};

yep, that's what the above code does

> round_16th_to_half(int pos)
> {
>   int fullpel = pos & 0xFFFFFFF0;   // might be negative (-3 -> -16)
>   int pure_16th_pel = pos & 0x0000000F;  // always positive (-3 -> +13)
>
>   fullpel_to_halfpel = fullpel*2;
>   16thpel_to_halfpel = roundtab16[pure_16th_pel];
>
>   final_halfpel_position = fullpel_to_halfpel + 16thpel_to_halfpel;
>
>   return final_halfpel_position;
> }
>
> and the analogue routines for 8th and quarterpel position.
>
> Maybe you can test if this simple way gives correct results in reference
> decoder? I don't have a running version here...

The problem was not (only) INTER4V related (which means 16th pel ->
halfpel), but ms-fdam didn't even correctly decode XVID qpel content without
4MV mode. So far I did two rounding steps (each from quarterpel->halfpel) in
motion compensation (one for the rounding from qpel vectors to halfpel
vectors, and one to match the lower resolution of the subsampled u,v
planes). It seemed correct this way since the standard explicitely says that
the halfpel u,v vectors should be calculated using table 7-9 (see subclause
7.6.2.2) and all following steps should be performed as in normal halfpel
mode then. Obviously ms-fdam does the rounding in one step using table 7-8
(which is not at all a stupid idea btw) and after I modified XVID to
consider this our chroma problems were gone. So I'd say the way ms-fdam
handles chroma compensation is correct, but the description in the standard
is misleading then imho...

bye,
Michael