[XviD-devel] Quality optimization

Wed, 22 Jan 2003 11:42:05 +0100 (CET)

On Wed, 22 Jan 2003, Michael Militzer wrote:
> > Please comment on these:
> 
> ok ;)
> 
> > a) Mode decision. The FFMPEG "vhq" mode decision (checking by brute force
> > which coding type needs the fewest bits) often improves the PSNR by
> > 0.2-0.5, and it (almost) never does any harm (except for lower speed, of
> > course).
> >
> > Proposal #1: Implement brute force mode decision into XVID !
> > (The routines are all there, maybe they have to be slightly modified to be
> > "reentrant")
> 
> yes, vhq mode especially seems to help with 4MV mode. INTER4V mode does not
> give much benefit currently, maybe we can improve this

After today's computer science class, I would like to turn this proposal
into  1a) Implement brute force ...
      1b) Collect lots of data of sad/dev/bits and mail it to me
      1c) Scan through the data to find some decision regions and some
          "regions of doubt".
      1d) Implement a "stable" decision: if the decision is clear, use the
          fast one; when in doubt, use the "bitcount decision".
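
A minimal sketch of what such a "stable" decision could look like, written
in Python only for readability. The names choose_mode, margin and
count_bits are hypothetical, and comparing inter SAD directly against intra
deviation with a symmetric margin is just an assumption; the real decision
regions would have to come out of the collected sad/dev/bits data.

```python
def choose_mode(sad_inter, dev_intra, margin, count_bits):
    """Pick 'inter' or 'intra' for one macroblock.

    sad_inter  -- SAD of the best inter prediction
    dev_intra  -- deviation of the block from its mean (intra estimate)
    margin     -- half-width of the "region of doubt" (assumed, to be tuned)
    count_bits -- callable(mode) -> bits needed to really code the block
    """
    diff = sad_inter - dev_intra
    if diff <= -margin:
        return "inter"   # clearly cheaper as inter: fast decision
    if diff >= margin:
        return "intra"   # clearly cheaper as intra: fast decision
    # Region of doubt: fall back to the slow brute-force bit count.
    return min(("inter", "intra"), key=count_bits)
```

The expensive count_bits callback is only invoked inside the doubt region,
so the speed penalty shrinks as the margin gets narrower.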

> > b) Using SAD of Hadamard-transformed blocks instead of ordinary
> > SAD for _subpixel_ resolution refinement (not for ordinary search!)
> > can also increase PSNR significantly (though not for anime).
> >
> > Proposal #2: Implement SAD_hadamard which does the transform and SAD.
> > (The routines are almost all there (or used to be), the result just
> > might have to be scaled to fit into current SAD-checks)
> 
> why hadamard and not DCT? (because it's faster?) - shouldn't DCT give better
> results (at least in theory)? And why SAD and not SSE? SSE should more
> evenly distribute the error over a whole block which should make it easier
> to code.

For 3 and a half reasons (+1 double):
a) SSE is much slower than SAD
b) SSE instead of SAD didn't give any advantage in ffmpeg tests, except
   for Anime, but I'll retest this on the VQEG set.
c) skal's SIMDed Hadamard is twice the speed of DCT (for C it might be
   more)
d) H.26L proposed Hadamard, not DCT (and they are the experts ;-)
e) ffmpeg showed no improvement going from Hadamard to DCT, but just
   this second I noticed that what I tested was the _quantization error_
   with DCT, not SAD of DCT, so I might have to retest this, too.

Of course we can have as many subpel-searchroutines as we want and test
them, but for predefined quality profiles we should only have "pure SAD"
and one "better-but-slower", where I guess SAD+SATD would be a good
compromise. 
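
To make clear what SATD means here, a small sketch on a 4x4 block (Python
for readability; the block size and the names hadamard4, satd4x4 and
sad4x4 are my own choices for illustration, not existing XVID routines).
The transform is left unnormalized, so the result is on a different scale
than plain SAD and would have to be scaled before being compared against
the current SAD thresholds.

```python
def hadamard4(v):
    """Unnormalized 4-point Hadamard transform using a butterfly."""
    a, b, c, d = v
    s0, s1, d0, d1 = a + b, c + d, a - b, c - d
    return [s0 + s1, d0 + d1, s0 - s1, d0 - d1]


def satd4x4(cur, ref):
    """Sum of absolute Hadamard-transformed differences of two 4x4 blocks."""
    diff = [[c - r for c, r in zip(crow, rrow)] for crow, rrow in zip(cur, ref)]
    rows = [hadamard4(row) for row in diff]          # transform each row
    cols = [hadamard4(col) for col in zip(*rows)]    # then each column
    return sum(abs(x) for col in cols for x in col)


def sad4x4(cur, ref):
    """Plain sum of absolute differences, for comparison."""
    return sum(abs(c - r) for crow, rrow in zip(cur, ref)
               for c, r in zip(crow, rrow))
```

With this scaling, a uniform brightness offset of 1 gives SAD = SATD = 16,
while a single wrong pixel gives SAD = 1 but SATD = 16, because its energy
spreads over all 16 transform coefficients.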

> > c) Rate-Distortion optimized Quantization
> >
> > Sometimes it's good not to just quantize the values and save the result,
> > but modify results a little to save bits while at the same time not doing
> > much damage to the image. ffmpeg has "Trellis Quantization" for this,
> > which is
> > really slow, but often gives a boost to PSNR. (Note that from the
> > computational point this is by far less work than "optimal adaptive
> > quantization")
> >
> > Proposal #3: Implement "intelligent" quantization, e.g. Trellis.
> > (or port it from Ffmpeg ;-)
> 
> yes, but it's quite slow. Also some additional optimal adaptive quantization
> would be nice...

Optimal Adaptive Quant is another step, which could be based on results of
Optimal Quantization. But real OAQ is _extremely_ slow, much slower than
Trellis (which drops fps in ffmpeg to half, but on the other hand it's not
optimized and is pure C at the moment).
I proposed Trellis, because it's possible to implement within a few hours,
since the algorithm is there, whereas for usable OAQ I guess somebody
would have to come up with the right theory for our needs, first. 
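
To illustrate the idea behind rate-distortion optimized quantization,
here is a deliberately simplified sketch (Python for readability). It is
NOT Trellis: it decides greedily per coefficient between the nearest level
and the next one towards zero, and ignores how zero runs interact in the
VLC, which is exactly what Trellis handles. The step size of 2*quant, the
function names and the toy bit-cost model are assumptions for the example,
not taken from ffmpeg or XVID.

```python
def toy_bit_cost(level):
    """Hypothetical stand-in for a VLC table lookup: zero is free,
    larger magnitudes cost more bits."""
    return 0 if level == 0 else 2 + 2 * level.bit_length()


def rd_quantize(coeffs, quant, lam, bit_cost=toy_bit_cost):
    """Quantize coefficients, minimizing distortion + lam * bits each."""
    step = 2 * quant                      # assumed uniform reconstruction step
    levels = []
    for c in coeffs:
        sign = -1 if c < 0 else 1
        mag = abs(c)
        nearest = (mag + step // 2) // step
        best_level, best_cost = nearest, None
        # Candidates: plain rounding, and one level closer to zero.
        for level in (nearest, max(nearest - 1, 0)):
            err = mag - level * step
            cost = err * err + lam * bit_cost(level)
            if best_cost is None or cost < best_cost:
                best_level, best_cost = level, cost
        levels.append(sign * best_level)
    return levels
```

With lam = 0 this reduces to ordinary rounding. As lam grows, small
coefficients whose removal costs little distortion get zeroed to save bits.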

> > d) Set B-frames only when helpful
> >
> > A clip can benefit very much from Bframes, or the results can be horrible.
> > In one of the VQEG test clips, I had a PSNR drop from 46.05 to 41.40
> > (-4.65 dB) just because of activating 1 bframe! Average(!) quantizers
> > jumped from 2.26 to 5.22. But there is also a clip where PSNR was
> > increased from 34.3 to 35.5 (+1.2dB). XVID already has a dynamic test for
> > this, but it's a very simple check...
> >
> > Proposal #4: Implement a better criterion for dynamic bframes.
> > (first we have to find one, but if necessary, just use brute force)
> 
> I'm not sure if the benefits of b-frames can be measured by simple PSNR
> comparisons: Of course the PSNR of bframes is generally lower than if we had
> coded a pframe instead (simply because bframe quant is higher), however
> bframes' lower quality doesn't influence the remaining picture sequence
> (because bframes are not used as reference). If we now assume normal viewing
> conditions (a video played back at ~25 fps), the quality decrease when a
> bframe is presented for only a short time (1/25s) might not be noticed at
> all. 

In a typical IBPBPB GOP half of all frames are B-frames. So with the same
argument as yours I'd say, quality of all other frames doesn't matter,
because they are presented only for a short while. 
I'd say the "weight" of B-frame quality for _viewing_ is the same as that
of the others. For encoding of error it is not, but that's taken care of
by the higher quant for Bframes and by ratecontrol. Average quant might
not be a good measure for sequences with bframes, but average PSNR, I'd
say, is.
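
For reference, this is the kind of average PSNR I mean, as a small sketch
(Python for readability; the function names are made up, frames are taken
as flat lists of 8-bit luma samples, and averaging the per-frame PSNR
values rather than the per-frame MSE is an assumption about how the test
script works). Every frame gets the same weight, whether I, P or B.

```python
import math


def psnr(orig, decoded, peak=255):
    """PSNR in dB between two equally sized lists of samples."""
    mse = sum((o - d) ** 2 for o, d in zip(orig, decoded)) / len(orig)
    if mse == 0:
        return math.inf          # identical frames
    return 10 * math.log10(peak * peak / mse)


def average_psnr(frame_pairs):
    """Mean of per-frame PSNR over (orig, decoded) pairs."""
    values = [psnr(o, d) for o, d in frame_pairs]
    return sum(values) / len(values)
```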

Btw., in later tests I noticed that ffmpeg's overall quality with Bframes
gets much better when vhq is enabled in ffmpeg, so their mode decision in
bframes might simply be very bad. Still, it's a fact that sometimes
bframes are beneficial and sometimes they are not, and I really have to
adapt the test-script to XVID.

> So for a fixed quant encoding with maybe 5 - 10 bframes/sec, the
> perceived video quality might stay the same while we have a noticeable
> decrease in file size at the same time.

5-10 bframes per second is rarely the case, unless you already have a
dynamic mode where bframes are suppressed. Typical should be 12.5-15
bframes per second (maxbframes=1), or even 18-20 (maxbframes=2).

> Of course there might be video clips, where b-frames are really doing harm
> (and not only PSNR-wise). So for example if a PBP sequence (viewing order)
> is not significantly smaller than a corresponding PPP sequence, bframes
> should of course not be used (well, I guess I don't tell you anything new
> here ;-))

I would say: Let's let ratecontrol take care of that (fixed quant encoding
is difficult for bframes, especially because their quant isn't the one you
choose as fixed) and watch the result.

gruel