[XviD-devel] Rounding
skal
xvid-devel@xvid.org
30 Oct 2002 11:08:50 +0100
Hi all,
OMG, this rounding is such a mess :)
I've re-read the infamous section 7.6.5 and
here are some things I noticed:
a) "...in quarter pixel mode the vectors are divided
by 2 *BEFORE* summation"
So, the code I sent is wrong for K=4 (INTER4V).
It should be something like:
mv_x = DIV2RND(mv[0].x) + DIV2RND(mv[1].x) + DIV2RND(mv[2].x) + DIV2RND(mv[3].x);
mv_y = DIV2RND(mv[0].y) + DIV2RND(mv[1].y) + DIV2RND(mv[2].y) + DIV2RND(mv[3].y);
and then proceed with table table 7-6...
I think it's really a two-step process (qpel->halfpel), probably
mergable into a single one...
b) This section always mention "dividing" ("by 2K", "by 2", ...)
But it's not clearly stated which division it is (/? //? ///?).
The only place where I see the exact operator is in the
legend of table 7-6 -> 7.9. It seems to be // (nearest integer,
rounding away from 0). Hence the DIV2RND(x) ... Might
other implementors fail to use this one too?
c) Maybe a bad news: at the beginning of clause, we have:
"A I macroblock comprises either one MV or K motion vectors,
one for each non-transp 8x8 pel blocks forming the 16x16
pel macroblock, as is INDICATED BY THE CBPC CODE" !!
Would it mean that K is not always 4 when CPBC!=0xf??
Hope not...
Some additional remarks:
On Tue, 2002-10-29 at 21:43, Michael Militzer wrote:
> From: "skal" <skal@planet-d.net>
> To: <xvid-devel@xvid.org>
> Sent: Monday, October 14, 2002 9:40 AM
> Subject: Re: [XviD-devel] Rounding
>
>
> > On Sun, 2002-10-13 at 23:33, Christoph Lampert wrote:
> > >
> > > Hi,
> > >
> > >
> > > I've been looking at rounding of motion vectors half the day. It's done
> in
> > > 3 different ways through the code. :( And I don't even think it's
> right...
> > >
> > > I keep reading code like this:
> > >
> > > uv_dy = (uv_dy ? SIGN(uv_dy) *
> > > (roundtab8[ABS(uv_dy) % 8] + (ABS(uv_dy) / 8) * 2) : 0);
> > >
> > >
> > > which is supposed to round from 8th-pel to halfpel. Result is symmetric
> > > to 0, however, it's not similar to anything I read in the Standard. So
> > > where is it from?
> > >
> > > Also, why the complicated expression using ? and % operator?
> > >
> > > My impression is that a simple & and >> should be sufficient
> > > (if we assume that negative values are shifted in the logical way).
> > >
> > > uv_dy = roundtab8[uv_dy & 0x7] + (uv_dy >> 3) * 2;
> > >
> > > and similar fof 16th-pel -> halfpel with
> > >
> > > uv_dy = roundtab16[uv_dy & 0xF] + (uv_dy >> 4) * 2;
> > >
> > > For 16th-Pel this is the same, thought much less complicatedly put.
> > > FOr 8th-pel there's a difference, and even though it's not large, the
> > > PSNR gets slightly little higher in average and I believe it's what the
> > > standard demands.
> > >
> > > Or am I wrong here?
> > >
> > >
> > > 8th->half
> > >
> > > UV_DY old roundtab8
> > >
> > > dy=-15 res=-3 res2=-4
> > > dy=-14 res=-3 res2=-3
> > > dy=-13 res=-3 res2=-3
> > > dy=-12 res=-3 res2=-3
> > > dy=-11 res=-3 res2=-3
> > > dy=-10 res=-2 res2=-3
> > > dy=-9 res=-2 res2=-2
> > > dy=-8 res=-2 res2=-2
> > > dy=-7 res=-1 res2=-2
> > > dy=-6 res=-1 res2=-1
> > > dy=-5 res=-1 res2=-1
> > > dy=-4 res=-1 res2=-1
> > > dy=-3 res=-1 res2=-1
> > > dy=-2 res= 0 res2=-1
> > > dy=-1 res= 0 res2= 0
> > > dy= 0 res= 0 res2= 0
> > > dy= 1 res= 0 res2= 0
> > > dy= 2 res= 0 res2= 1
> > > dy= 3 res= 1 res2= 1
> > > dy= 4 res= 1 res2= 1
> > > dy= 5 res= 1 res2= 1
> > > dy= 6 res= 1 res2= 1
> > > dy= 7 res= 1 res2= 2
> > > dy= 8 res= 2 res2= 2
> > > dy= 9 res= 2 res2= 2
> > > dy=10 res= 2 res2= 3
> > > dy=11 res= 3 res2= 3
> > > dy=12 res= 3 res2= 3
> > > dy=13 res= 3 res2= 3
> > > dy=14 res= 3 res2= 3
> > > dy=15 res= 3 res2= 4
> >
> > Hi all,
> >
> > here's the code I've been using for the Round 7-6 -> 7-9
> > tables during computation of chroma MV:
> >
> >
> > // Table 7-6 (K=4) (modified)
> > const int SKL_MB::Rnd_Tab_76[16] = { 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0,
> 0, 0, 1, 1 };
> > // Table 7-8 (K=2) (modified)
> > const int SKL_MB::Rnd_Tab_78[ 8] = { 0, 0, 1, 1, 0, 0, 0, 1 };
> > // Table 7-9 (K=1) (modified)
> > const int SKL_MB::Rnd_Tab_79[ 4] = { 0, 1, 1, 1 };
> >
> > typedef SKL_INT16 SKL_MV[2];
> >
> >
> > #define DIV2RND(x) ( ((x)>>1) | ((x)&1) )
> >
> > void SKL_MB::Derive_uv_MV_MPEG12(SKL_MV *MV)
> > {
> > uv_MV[0] = MV[0][0] >> 1;
> > uv_MV[1] = MV[0][1] >> 1;
> > }
For the record, it's actually wrong. I add a look at the
ref MPEG2 implementation, and it should be "/2", not ">>1".
Oh well...
> >
> > void SKL_MB::Derive_uv_MV_From_1MV(int Quarter, SKL_MV *MV)
> > {
> > int x = MV[0][0];
> > int y = MV[0][1];
> > if (Quarter) { x = DIV2RND(x); y = DIV2RND(y); }
> > uv_MV[0] = (x>>1) | Rnd_Tab_79[ x & 0x3];
> > uv_MV[1] = (y>>1) | Rnd_Tab_79[ y & 0x3];
> > }
> >
> > void SKL_MB::Derive_uv_MV_From_2MV(int Quarter, SKL_MV *MV)
> > {
> > int x = MV[0][0];
> > int y = MV[0][1];
> > if (Quarter) { x = DIV2RND(x); y = DIV2RND(y); }
> > uv_MV[0] = (x>>2) | Rnd_Tab_78[ x & 0x7];
> > uv_MV[1] = (y>>2) | Rnd_Tab_78[ y & 0x7];
> > }
> >
> > void SKL_MB::Derive_uv_MV_From_4MV(int Quarter, SKL_MV *MV)
> > {
> > int x = MV[0][0] + MV[1][0] + MV[MV_Stride+0][0] + MV[MV_Stride+1][0];
> > int y = MV[0][1] + MV[1][1] + MV[MV_Stride+0][1] + MV[MV_Stride+1][1];
> > if (Quarter) { x = DIV2RND(x); y = DIV2RND(y); }
> > uv_MV[0] = (x>>3) + Rnd_Tab_76[ x & 0xf ];
> > uv_MV[1] = (y>>3) + Rnd_Tab_76[ y & 0xf ];
> > }
> >
> > #undef DIV2RND
> >
> >
> > I think it's correct as soon as the '>>' is
> > really a bitwise shift (signed or unsigned)...
> >
> > For the half and q-pel truncation, I think using
> > the shift is the correct way, too
> > (e.g.:
> >
> > int Quads = (mv_x&3) | ((mv_y&3)<<2);
> > mv_x >>= 2;
> > my_y >>= 2;
> >
> > for q-pel MV snapping)
> >
> >
Micheal have noticed that the norm says "reference
block should be *biased* in the direction quarter or
half sample position". We're now leaving the field of
computing, and entering literature!:)
>
> ----- End of Original Message -----
>
> [...]
> > >
> > > this is exactly how I've understood the standard and that's what we're
> > > currently already doing.
> >
> > Sorry, maybe I'm too stupid, but I don't see how this code has anything to
> > do with the standard.
>
> It seems we totally misunderstood eachother. Nonetheless, your answer was
> very useful because I thought a bit about the Skal method and your method
> again and finally found what's wrong: the above code has a lot to do with
> the standard ;-) And the Skal_Tab_79 rounding code is (accidently) correct,
> however the Skal method (shift + OR) does not work for any other rounding
> method than quarter->halfpel.
>
> In general a shift + add seems ok using Skal's modified tables, so for
> example:
>
> // Table 7-6 (K=4) (modified)
> const int SKL_MB::Rnd_Tab_76[16] = { 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0,
> 0, 1, 1 };
> // Table 7-8 (K=2) (modified)
> const int SKL_MB::Rnd_Tab_78[ 8] = { 0, 0, 1, 1, 0, 0, 0, 1 };
> // Table 7-9 (K=1) (modified)
> const int SKL_MB::Rnd_Tab_79[ 4] = { 0, 1, 1, 1 };
>
> x = (x>>3) + Rnd_Tab_76[x & 0xF]; for rounding from 16th pel -> halfpel
>
> x = (x>>2) + Rnd_Tab_78[x & 0x7]; for rounding from 8th pel -> halfpel
>
> x = (x>>1) + Rnd_Tab_79[x & 0x3]; for rounding from quarterpel -> halfpel
>
> This code should do what the standard requires and should be also equivalent
> to the code Christoph provided...
>
Arghhh... I think I mess with the tables, indeed. I was using
the OR because it's sometime faster than arithmetic operations
(e.g. on Sparcs if I remember well). It's really a coincidence
the ' (x>>1)+ Rnd_Tab_79[x&3]' works for qpel -> halpel.
For 4th pel -> halfpel rounding, the Rnd_Tab_79 should be:
Rnd_Tab_79[4] = { 0,1,0,0 };
(i hope it's correct this time...). I'm gonna dig (again) into
this tables, just to cross-check...
> The problem was not (only) INTER4V related (which means 16th pel ->
> halfpel), but ms-fdam didn't even correctly decode XVID qpel content without
> 4MV mode. So far I did two rounding steps (each from quarterpel->halfpel) in
> motion compensation (one for the rounding from qpel vectors to halfpel
> vectors, and one to match the lower resolution of the subsampled u,v
> planes). It seemed correct this way since the standard explicitely says that
> the halfpel u,v vectors should be calculated using table 7-9 (see subclause
> 7.6.2.2) and all following steps should be performed as in normal halfpel
> mode then. Obviously ms-fdam does the rounding in one step using table 7-8
> (which is not at all a stupid idea btw) and after I modified XVID to
> consider this our chroma problems were gone. So I'd say the way ms-fdam
> handles chroma compensation is correct, but the description in the standard
> is misleading then imho...
>
Hope this "divide before summation" will fix this...
bye,
Skal