[XviD-devel] Quality optimization

Christoph Lampert xvid-devel@xvid.org
Thu, 23 Jan 2003 17:11:11 +0100 (CET)


On 23 Jan 2003, skal wrote:
> > * I'm sure that GMC can do better than our current version, in particular
> > GME and mode decision.
> > 
> 
> 	I've looked at your GME code (yeah, I couldn't help:) and
> 	have some question:
> 
> 	a) Where is GMC-accuracy taken into account? 

it isn't.

>       It should change the scaling of your solution...

Errr, should it? I don't think so. Values for sprite reference points are
(as far as I know) always stored in halfpel units, even in 16th-pel mode,
so it's not possible to have e.g. a 1/16th-pel translation. Which is a
pity, btw. The upside is that with 1 warppoint, i.e. purely translational
GMC (as DIVX does), we will be able to re-use the halfpel images and
don't have to do "real" GMC decoding.
 
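To illustrate the re-use of halfpel images: a purely translational
global motion vector given in halfpel units splits into an integer-pel
offset plus a choice among four pre-interpolated planes. A minimal
sketch in Python; the function name and the plane labels are made up
for illustration and are not XviD identifiers.

```python
def split_halfpel(dx_half, dy_half):
    """Split a translation in half-pel units into an integer-pel offset
    and the label of the pre-interpolated plane to read from.

    Hypothetical helper for illustration only.
    """
    # Arithmetic shift and mask floor correctly for negative values too,
    # e.g. -3 half-pels = -2 full pels + 1 half-pel.
    ix, fx = dx_half >> 1, dx_half & 1
    iy, fy = dy_half >> 1, dy_half & 1
    # full: no interpolation, h: horizontal, v: vertical, hv: both
    planes = ("full", "h", "v", "hv")
    return (ix, iy), planes[fx + 2 * fy]
```

With this split, every block of the frame reads from the same plane at
the same integer offset, which is why no per-pixel warping is needed in
the 1-warppoint case.
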
> 	b) why didn't you jump directly on 3-pts warping? The
> 	least-square minimization equations are quite similar
> 	to the 2-pts ones you seem to have used...

Yes, the estimation wouldn't be very different, maybe easier 
because X and Y are independent then. I had two reasons: 
a) I had a paper by Smolic, who used 2 warp points and got 
results as good as with 3 on natural video. 
b) With 2 additional parameters I'm afraid the chance of overfitting
would be even greater. When just using motion vectors without 
differential refinement, I'd like to keep the number of free 
parameters as low as possible. 
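As a rough illustration of estimating from motion vectors alone, here is
a sketch of a least-squares fit of a 4-parameter model (translation plus
rotation and isotropic zoom, which is what 2 warp points can express) to
block motion vectors. This is not the XviD GME code; the function name,
the parameterisation (a, b, tx, ty) and the input layout are assumptions
made for the example. The assumed model is u = a*x - b*y + tx and
v = b*x + a*y + ty, where (u, v) is the motion vector of the block
centred at (x, y).

```python
def fit_two_warp_point_model(samples):
    """Least-squares fit of u = a*x - b*y + tx, v = b*x + a*y + ty.

    samples: iterable of (x, y, u, v) tuples, one per block.
    Returns (a, b, tx, ty). Hypothetical helper for illustration.
    """
    n = sx = sy = su = sv = s = p = q = 0.0
    for x, y, u, v in samples:
        n += 1.0
        sx += x
        sy += y
        su += u
        sv += v
        s += x * x + y * y   # sum of squared radii
        p += x * u + y * v   # right-hand side of the equation for a
        q += x * v - y * u   # right-hand side of the equation for b
    # Eliminating tx and ty from the 4x4 normal equations leaves one
    # shared denominator, so no numerical matrix inversion is needed.
    det = n * s - sx * sx - sy * sy
    if det == 0.0:
        raise ValueError("degenerate input: need at least two distinct positions")
    a = (n * p - sx * su - sy * sv) / det
    b = (n * q + sy * su - sx * sv) / det
    tx = (su - a * sx + b * sy) / n
    ty = (sv - b * sx - a * sy) / n
    return a, b, tx, ty
```

With 3 warp points (6 parameters) the model becomes
u = a1*x + a2*y + a3, v = a4*x + a5*y + a6, and the fit separates into
two independent 3-parameter problems, one for u and one for v.
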

> 	c) It's somehow agreed in the literature that scaling the
> 	x-y coords to [-1,1] range is vital to ensure numerical
> 	stability of the computations (or maybe this was only
> 	required for the Kalman-filtering ... non linear stuff...
> 	or maybe for the 4-pts case... dunno)

Hm, I can't really see why at the moment. Matrix values are in the range 
of 10^7 and 10^{-7} and can (and will) easily be reduced by
another factor of 64 or so... Maybe if I had to invert the 4x4 or 6x6
matrix using numerical calculations, stability would become crucial, 
but I have a closed formula for the inverse, so it's just a few float
operations per entry per iteration. 
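For reference, a small sketch of what the coordinate scaling changes.
It compares the two diagonal entries of the normal matrix of a
4-parameter fit (the block count and the sum of x^2 + y^2) for raw pixel
coordinates and for coordinates mapped to roughly [-1,1]. The frame
size, the 16x16 block grid and all function names are assumptions for
the example, not values taken from the XviD code.

```python
def diagonal_entries(points):
    """Return (N, sum of x^2 + y^2), the diagonal of the normal matrix."""
    n = float(len(points))
    s = sum(x * x + y * y for x, y in points)
    return n, s


def to_unit_range(points, width, height):
    """Map pixel coordinates in [0, width] x [0, height] to [-1, 1]."""
    return [(2.0 * x / width - 1.0, 2.0 * y / height - 1.0) for x, y in points]


def diagonal_spread(points):
    """Ratio between the larger and the smaller diagonal entry."""
    n, s = diagonal_entries(points)
    return max(n, s) / min(n, s)


# Assumed example: centres of 16x16 blocks in a 720x576 frame.
WIDTH, HEIGHT = 720, 576
centers = [(16 * i + 8, 16 * j + 8)
           for j in range(HEIGHT // 16) for i in range(WIDTH // 16)]

raw_spread = diagonal_spread(centers)
scaled_spread = diagonal_spread(to_unit_range(centers, WIDTH, HEIGHT))
```

For this grid the spread is on the order of 10^5 in pixel coordinates
and close to 1 after scaling. Whether that difference matters when the
inverse is evaluated by a closed formula is a separate question; the
sketch only shows the magnitudes involved.
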

gruel