[XviD-devel] Rounding

30 Oct 2002 11:08:50 +0100

	Hi all,

	OMG, this rounding is such a mess :)

	I've re-read the infamous section  7.6.5 and
	here are some things I noticed:

	a) "...in quarter pixel mode the vectors are divided
         by 2 *BEFORE* summation"

	So, the code I sent is wrong for K=4 (INTER4V).
	It should be something like:

mv_x = DIV2RND(mv[0].x) + DIV2RND(mv[1].x) + DIV2RND(mv[2].x) + DIV2RND(mv[3].x);
mv_y = DIV2RND(mv[0].y) + DIV2RND(mv[1].y) + DIV2RND(mv[2].y) + DIV2RND(mv[3].y);

	and then proceed with table table 7-6...
	I think it's really a two-step process (qpel->halfpel), probably
	mergable into a single one...

        b) This section always mention "dividing" ("by 2K", "by 2", ...)
	But it's not clearly stated which division it is (/? //? ///?).
	The only place where I see the exact operator is in the
	legend of table 7-6 -> 7.9. It seems to be // (nearest integer,
	rounding away from 0). Hence the DIV2RND(x) ... Might 
	other  implementors fail to use this one too?

	c) Maybe a bad news: at the beginning of clause, we have:
	"A I macroblock comprises either one MV or K motion vectors,
	one for each non-transp 8x8 pel blocks forming the 16x16
	pel macroblock, as is INDICATED BY THE CBPC CODE" !!
	Would it mean that K is not always 4 when CPBC!=0xf??
	Hope not...

Some additional remarks:

On Tue, 2002-10-29 at 21:43, Michael Militzer wrote:
> From: "skal" <skal@planet-d.net>
> To: <xvid-devel@xvid.org>
> Sent: Monday, October 14, 2002 9:40 AM
> Subject: Re: [XviD-devel] Rounding
> 
> 
> > On Sun, 2002-10-13 at 23:33, Christoph Lampert wrote:
> > >
> > > Hi,
> > >
> > >
> > > I've been looking at rounding of motion vectors half the day. It's done
> in
> > > 3 different ways through the code. :( And I don't even think it's
> right...
> > >
> > > I keep reading code like this:
> > >
> > > uv_dy =  (uv_dy ? SIGN(uv_dy) *
> > > (roundtab8[ABS(uv_dy) % 8] + (ABS(uv_dy) / 8) * 2) : 0);
> > >
> > >
> > > which is supposed to round from 8th-pel to halfpel. Result is symmetric
> > > to 0, however, it's not similar to anything I read in the Standard. So
> > > where is it from?
> > >
> > > Also, why the complicated expression using ? and % operator?
> > >
> > > My impression is that a simple & and >> should be sufficient
> > > (if we assume that negative values are shifted in the logical way).
> > >
> > > uv_dy = roundtab8[uv_dy & 0x7] + (uv_dy >> 3) * 2;
> > >
> > > and similar fof 16th-pel -> halfpel with
> > >
> > > uv_dy = roundtab16[uv_dy & 0xF] + (uv_dy >> 4) * 2;
> > >
> > > For 16th-Pel this is the same, thought much less complicatedly put.
> > > FOr 8th-pel there's a difference, and even though it's not large, the
> > > PSNR gets slightly little higher in average and I believe it's what the
> > > standard demands.
> > >
> > > Or am I wrong here?
> > >
> > >
> > > 8th->half
> > >
> > > UV_DY     old           roundtab8
> > >
> > > dy=-15    res=-3        res2=-4
> > > dy=-14    res=-3        res2=-3
> > > dy=-13    res=-3        res2=-3
> > > dy=-12    res=-3        res2=-3
> > > dy=-11    res=-3        res2=-3
> > > dy=-10    res=-2        res2=-3
> > > dy=-9     res=-2        res2=-2
> > > dy=-8     res=-2        res2=-2
> > > dy=-7     res=-1        res2=-2
> > > dy=-6     res=-1        res2=-1
> > > dy=-5     res=-1        res2=-1
> > > dy=-4     res=-1        res2=-1
> > > dy=-3     res=-1        res2=-1
> > > dy=-2     res= 0        res2=-1
> > > dy=-1     res= 0        res2= 0
> > > dy= 0     res= 0        res2= 0
> > > dy= 1     res= 0        res2= 0
> > > dy= 2     res= 0        res2= 1
> > > dy= 3     res= 1        res2= 1
> > > dy= 4     res= 1        res2= 1
> > > dy= 5     res= 1        res2= 1
> > > dy= 6     res= 1        res2= 1
> > > dy= 7     res= 1        res2= 2
> > > dy= 8     res= 2        res2= 2
> > > dy= 9     res= 2        res2= 2
> > > dy=10     res= 2        res2= 3
> > > dy=11     res= 3        res2= 3
> > > dy=12     res= 3        res2= 3
> > > dy=13     res= 3        res2= 3
> > > dy=14     res= 3        res2= 3
> > > dy=15     res= 3        res2= 4
> >
> > Hi all,
> >
> > here's the code I've been using for the Round  7-6 -> 7-9
> > tables during computation of chroma MV:
> >
> >
> >   // Table 7-6 (K=4)  (modified)
> > const int SKL_MB::Rnd_Tab_76[16] = { 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0,
> 0, 0, 1, 1 };
> >   // Table 7-8 (K=2)  (modified)
> > const int SKL_MB::Rnd_Tab_78[ 8] = { 0, 0, 1, 1, 0, 0, 0, 1  };
> >   // Table 7-9 (K=1)  (modified)
> > const int SKL_MB::Rnd_Tab_79[ 4] = { 0, 1, 1, 1 };
> >
> > typedef SKL_INT16 SKL_MV[2];
> >
> >
> > #define DIV2RND(x) ( ((x)>>1) | ((x)&1) )
> >
> > void SKL_MB::Derive_uv_MV_MPEG12(SKL_MV *MV)
> > {
> >   uv_MV[0] = MV[0][0] >> 1;
> >   uv_MV[1] = MV[0][1] >> 1;
> > }

	For the record, it's actually wrong. I add a look at the
	ref MPEG2 implementation, and it should be "/2", not ">>1".
	Oh well...

> >
> > void SKL_MB::Derive_uv_MV_From_1MV(int Quarter, SKL_MV *MV)
> > {
> >   int x = MV[0][0];
> >   int y = MV[0][1];
> >   if (Quarter) { x = DIV2RND(x); y = DIV2RND(y); }
> >   uv_MV[0] = (x>>1) | Rnd_Tab_79[ x & 0x3];
> >   uv_MV[1] = (y>>1) | Rnd_Tab_79[ y & 0x3];
> > }
> >
> > void SKL_MB::Derive_uv_MV_From_2MV(int Quarter, SKL_MV *MV)
> > {
> >   int x = MV[0][0];
> >   int y = MV[0][1];
> >   if (Quarter) { x = DIV2RND(x); y = DIV2RND(y); }
> >   uv_MV[0] = (x>>2) | Rnd_Tab_78[ x & 0x7];
> >   uv_MV[1] = (y>>2) | Rnd_Tab_78[ y & 0x7];
> > }
> >
> > void SKL_MB::Derive_uv_MV_From_4MV(int Quarter, SKL_MV *MV)
> > {
> >   int x = MV[0][0] + MV[1][0] + MV[MV_Stride+0][0] + MV[MV_Stride+1][0];
> >   int y = MV[0][1] + MV[1][1] + MV[MV_Stride+0][1] + MV[MV_Stride+1][1];
> >   if (Quarter) { x = DIV2RND(x); y = DIV2RND(y); }
> >   uv_MV[0] = (x>>3) + Rnd_Tab_76[ x & 0xf ];
> >   uv_MV[1] = (y>>3) + Rnd_Tab_76[ y & 0xf ];
> > }
> >
> > #undef DIV2RND
> >
> >
> > I  think it's correct as soon as the '>>' is
> > really a bitwise shift (signed or unsigned)...
> >
> > For the half and q-pel truncation, I think using
> > the shift is the correct way, too
> > (e.g.:
> >
> >  int Quads = (mv_x&3) | ((mv_y&3)<<2);
> >  mv_x >>= 2;
> >  my_y >>= 2;
> >
> > for q-pel MV snapping)
> >
> >

	Micheal have noticed that the norm says "reference
	block should be *biased* in the direction quarter or
	half sample position". We're now leaving the field of
	computing, and entering literature!:)

> 
> ----- End of Original Message -----
> 
>	[...]
> > >
> > > this is exactly how I've understood the standard and that's what we're
> > > currently already doing.
> >
> > Sorry, maybe I'm too stupid, but I don't see how this code has anything to
> > do with the standard.
> 
> It seems we totally misunderstood eachother. Nonetheless, your answer was
> very useful because I thought a bit about the Skal method and your method
> again and finally found what's wrong: the above code has a lot to do with
> the standard ;-) And the Skal_Tab_79 rounding code is (accidently) correct,
> however the Skal method (shift + OR) does not work for any other rounding
> method than quarter->halfpel.
> 
> In general a shift + add seems ok using Skal's modified tables, so for
> example:
> 
> // Table 7-6 (K=4)  (modified)
> const int SKL_MB::Rnd_Tab_76[16] = { 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0,
> 0, 1, 1 };
> // Table 7-8 (K=2)  (modified)
> const int SKL_MB::Rnd_Tab_78[ 8] = { 0, 0, 1, 1, 0, 0, 0, 1  };
> // Table 7-9 (K=1)  (modified)
> const int SKL_MB::Rnd_Tab_79[ 4] = { 0, 1, 1, 1 };
> 
> x = (x>>3) + Rnd_Tab_76[x & 0xF]; for rounding from 16th pel -> halfpel
> 
> x = (x>>2) + Rnd_Tab_78[x & 0x7]; for rounding from 8th pel -> halfpel
> 
> x = (x>>1) + Rnd_Tab_79[x & 0x3]; for rounding from quarterpel -> halfpel
> 
> This code should do what the standard requires and should be also equivalent
> to the code Christoph provided...
> 

	Arghhh... I think I mess with the tables, indeed. I was using
	the OR because it's sometime faster than arithmetic operations
	(e.g. on Sparcs if I remember well). It's really a coincidence
	the ' (x>>1)+ Rnd_Tab_79[x&3]' works for qpel -> halpel.
	For 4th pel -> halfpel rounding, the Rnd_Tab_79 should be:

Rnd_Tab_79[4] = { 0,1,0,0 };

	(i hope it's correct this time...). I'm gonna dig (again) into
	this tables, just to cross-check...

> The problem was not (only) INTER4V related (which means 16th pel ->
> halfpel), but ms-fdam didn't even correctly decode XVID qpel content without
> 4MV mode. So far I did two rounding steps (each from quarterpel->halfpel) in
> motion compensation (one for the rounding from qpel vectors to halfpel
> vectors, and one to match the lower resolution of the subsampled u,v
> planes). It seemed correct this way since the standard explicitely says that
> the halfpel u,v vectors should be calculated using table 7-9 (see subclause
> 7.6.2.2) and all following steps should be performed as in normal halfpel
> mode then. Obviously ms-fdam does the rounding in one step using table 7-8
> (which is not at all a stupid idea btw) and after I modified XVID to
> consider this our chroma problems were gone. So I'd say the way ms-fdam
> handles chroma compensation is correct, but the description in the standard
> is misleading then imho...
> 

	Hope this "divide before summation" will fix this...

	bye,
		Skal