[XviD-devel] Quality optimization

Radek Czyz xvid-devel@xvid.org
Thu, 23 Jan 2003 22:11:56 +1030


Hello,

Allow me to present some of my opinions related to the discussion.

> a) Mode decision. The FFMPEG "vhq" mode decision (checking by brute force
> which coding type needs the fewest bits) often improves the PSNR by
> 0.2-0.5, and it (almost) never does any harm (except for lower speed, of
> course).
>
> Proposal #1: Implement brute force mode decision into XVID !
> (The routines are all there, maybe they have to be slightly modified to be
> "reentrant")

Let me just say that I agree :>
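
Just to illustrate, here is a minimal C sketch of such a bitcount
decision. The CodeBlocks* helpers are hypothetical stand-ins for
whatever routine runs the full quantize/code path for one macroblock
and returns the bits produced - they are not real XviD functions:

#include <stdint.h>

/* hypothetical hooks: code one macroblock each way and return
   the number of bits produced (placeholders, not XviD routines) */
uint32_t CodeBlocksInter(const void *mb);
uint32_t CodeBlocksInter4V(const void *mb);
uint32_t CodeBlocksIntra(const void *mb);

typedef enum { MODE_INTER, MODE_INTER4V, MODE_INTRA } MBMode;

/* pick whichever coding type needs the fewest bits */
MBMode brute_force_mode(const void *mb)
{
    uint32_t inter   = CodeBlocksInter(mb);
    uint32_t inter4v = CodeBlocksInter4V(mb);
    uint32_t intra   = CodeBlocksIntra(mb);

    if (inter <= inter4v && inter <= intra) return MODE_INTER;
    if (inter4v <= intra)                   return MODE_INTER4V;
    return MODE_INTRA;
}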

> After today's computer science class, I would like to make this Proposal
> into  1a) Implement brute force ...
>       1b) Collect lots of data of sad/dev/bits and mail it to me
>       1c) Scan through data to find some decision regions and some
>           "regions of doubt".
>       1d) Implement a "stable" decision: If the decision is clear, use the
>           fast one; when in doubt, use "bitcount decision".

And we can also use this statistical data to derive a better
'simple' mode decision.
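
The "stable" decision from 1c/1d could then sit on top of that -
something like the sketch below, where the margins are pure
placeholders that the collected sad/dev statistics would have to
calibrate (MBMode and brute_force_mode() as in the previous sketch):

#include <stdint.h>

#define CLEAR_INTER_MARGIN 512   /* placeholder, fit from the data */
#define CLEAR_INTRA_MARGIN 512   /* placeholder, fit from the data */

typedef enum { MODE_INTER, MODE_INTER4V, MODE_INTRA } MBMode;

MBMode brute_force_mode(const void *mb);  /* bitcount decision from above */

/* fast path when the case is clear-cut; full bitcount decision
   only inside the "region of doubt" */
MBMode stable_mode_decision(const void *mb, int32_t sad, int32_t dev)
{
    if (sad + CLEAR_INTER_MARGIN < dev) return MODE_INTER;  /* clearly inter */
    if (dev + CLEAR_INTRA_MARGIN < sad) return MODE_INTRA;  /* clearly intra */
    return brute_force_mode(mb);        /* in doubt: count bits */
}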

>> > b) Using SAD of Hadamard-transformed blocks instead of ordinary
>> > SAD for _subpixel_ resolution refinement (not for ordinary search!)
>> > can also increase PSNR significantly (though not for anime).
>> >
>> > Proposal #2: Implement SAD_hadamard which does the transform and SAD.
>> > (The routines are almost all there (or used to be), the result just
>> > might have to be scaled to fit into current SAD-checks)
>> 
>> why hadamard and not DCT? (because it's faster?) - shouldn't DCT give better
>> results (at least in theory)? And why SAD and not SSE? SSE should more
>> evenly distribute the error over a whole block which should make it easier
>> to code.

My experiments show that during halfpel refinement, the biggest
slowdown comes not from the speed of the comparison function, but from
memory access. I guess that means we can use a more complicated
function (maybe even DCT) while still keeping the same speed.
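
For reference, here is a plain-C sketch of the proposed SAD_hadamard:
an unnormalized 8x8 Walsh-Hadamard transform of the difference block
followed by a sum of absolute values. As noted above, the result would
still have to be scaled before comparing it against plain-SAD
thresholds:

#include <stdint.h>
#include <stdlib.h>

/* in-place 1-D length-8 Hadamard butterfly (unnormalized) */
static void hadamard8(int32_t v[8])
{
    for (int step = 1; step < 8; step <<= 1)
        for (int i = 0; i < 8; i += step << 1)
            for (int j = i; j < i + step; j++) {
                int32_t a = v[j], b = v[j + step];
                v[j]        = a + b;
                v[j + step] = a - b;
            }
}

/* SAD of the Hadamard-transformed 8x8 difference block */
uint32_t sad8_hadamard(const uint8_t *cur, int stride_cur,
                       const uint8_t *ref, int stride_ref)
{
    int32_t diff[8][8];
    uint32_t sum = 0;

    for (int i = 0; i < 8; i++)
        for (int j = 0; j < 8; j++)
            diff[i][j] = cur[i * stride_cur + j] - ref[i * stride_ref + j];

    for (int i = 0; i < 8; i++)   /* transform rows */
        hadamard8(diff[i]);

    for (int j = 0; j < 8; j++) { /* transform columns, accumulate */
        int32_t col[8];
        for (int i = 0; i < 8; i++)
            col[i] = diff[i][j];
        hadamard8(col);
        for (int i = 0; i < 8; i++)
            sum += abs(col[i]);
    }
    return sum;
}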

> d) Set B-frames only when helpful
>
> A clip can benefit very much from Bframes, or the results can be horrible.
> In one of the VQEG test clips, I had a PSNR drop from 46.05 to 41.40
> (-4.95db) just because of activating 1 bframe! Average(!) quantizers
> jumped from 2.26 to 5.22. But there is also a clip where PSNR was
> increased from 34.3 to 35.5 (+1.2dB). XVID already has a dynamic test for
> this, but it's a very simple check...
>
> Proposal #4: Implement a better criterion for dynamic bframes.
> (first we have to find one, but if necessary, just use brute force)

Do you have any idea how brute force should work? The goal is to have
the future frame, with its vectors, ready before the decision. But if
it's ready, then the decision has already been made... That doesn't
make sense.
Of course I'd also like to see a better decision.
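
To make "simple check" concrete: one could imagine a criterion of
roughly this shape, using only quantities available before the future
frame's ME. This is a toy sketch, not XviD's actual dynamic test:

#include <stdint.h>

/* toy criterion: code the candidate frame as a B-frame only when it
   is cheap to predict from its past reference; b_threshold (percent)
   is a tunable placeholder */
int should_code_as_bframe(uint32_t sad_to_prev_ref,
                          uint32_t avg_pframe_sad,
                          uint32_t b_threshold)
{
    return sad_to_prev_ref * 100 < avg_pframe_sad * b_threshold;
}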

> I'm not sure if the benefits of b-frames can be measured by simple PSNR
> comparisons: Of course the PSNR of bframes is generally lower than if we had
> coded a pframe instead (simply because bframe quant is higher), however
> bframes' lower quality doesn't influence the remaining picture sequence
> (because bframes are not used as reference). If we now assume normal viewing
> conditions (a video played back at ~25 fps), the quality decrease when a
> bframe is presented for only a short time (1/25s) might not be noticed at
> all.

I agree, but with slightly different arguments ;) . B-frames might
have horrible PSNR but still look very good. The picture, when
compared pixel-by-pixel, might be different from the original, but
it's still sharp, because both references are sharp, and it's not
blocky, because SAD wouldn't allow visible blocks (that's the way SAD
is). As a result, b-frames don't look bad even when someone looks
at a still picture, while at the same time the original might look
very different - especially the noise is very different.

> Btw. in later tests I noticed that ffmpeg's overall quality with Bframes
> gets much better when vhq is enabled in ffmpeg, so their mode decision in
> bframes might simply be very bad. Still, it's a fact that sometimes
> bframes are beneficial and sometimes they are not, and I really have to
> adapt the test-script to XVID.

Do they use vhq for mode decision in b-frames? I'd be a bit worried
about that. As b-frames have higher quant and good motion compensation,
they usually have no DCT data at all. Therefore, a vhq mode decision
would simply compare the bits needed to code the mode (1..4) plus the
bits needed to code the vector. Direct mode would always win...
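
A made-up bit count shows why: with cbp == 0 there is no residual
left to pay for, so the comparison collapses to mode + vector
overhead (all numbers below are invented for the example):

#include <stdio.h>

/* invented overhead for a B-frame macroblock with cbp == 0: with no
   DCT bits in play, the bitcount decision reduces to "smallest mode
   + vector overhead", which direct mode wins almost by construction */
int main(void)
{
    struct { const char *mode; unsigned mode_bits, mv_bits; } cost[] = {
        { "direct",       1,  2 },  /* one small delta vector */
        { "forward",      2,  8 },
        { "backward",     3,  8 },
        { "interpolated", 4, 16 },  /* two full vectors */
    };

    for (unsigned i = 0; i < sizeof cost / sizeof cost[0]; i++)
        printf("%-12s : %u bits\n", cost[i].mode,
               cost[i].mode_bits + cost[i].mv_bits);
    return 0;
}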


As you might remember, I used to experiment with mrSAD for motion
estimation. This is what I discovered while doing so:

SAD is a very good way to search motion PROVIDED that it is able to
find a good match. The picture created by SAD-based ME is very smooth
and in fact looks good before DCT data are applied (as long as you
don't look too closely, of course). If there are several possible
vectors which need no DCT correction at all, SAD will choose the one
which looks best.
However, this is no longer true when DCT coefficients are to be
written. A single DCT coefficient will affect the entire block. SAD
tries to match most pixels as nicely as possible, but after the
quantized DCT is applied, these fine-tuned pixels will not match
anymore. This is where SAD starts to suck.

I used to work on a second-stage ME search, done for a macroblock
if the SAD value was not good enough.
I discovered that 'not good enough' is something like "comparable to,
or bigger than, the deviation; or more than 3000". I used mrSAD for the
search and easily improved PSNR by 0.25dB. I suspended the project
because it still didn't help on fades, so it was not good enough.
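
For context: mrSAD is just SAD with the mean of the difference block
removed, so a constant brightness offset does not dominate the score.
A sketch, together with the trigger described above ("comparable to"
is simplified to >= here):

#include <stdint.h>
#include <stdlib.h>

/* mean-removed SAD over a 16x16 macroblock */
uint32_t mrsad16(const uint8_t *cur, const uint8_t *ref, int stride)
{
    int32_t diff[256];
    int32_t mean = 0;
    uint32_t sum = 0;

    for (int i = 0; i < 16; i++)
        for (int j = 0; j < 16; j++) {
            int32_t d = cur[i * stride + j] - ref[i * stride + j];
            diff[i * 16 + j] = d;
            mean += d;
        }
    mean /= 256;

    for (int k = 0; k < 256; k++)
        sum += abs(diff[k] - mean);
    return sum;
}

/* the "not good enough" trigger for the second-stage search */
int needs_second_stage(uint32_t sad, uint32_t dev)
{
    return sad >= dev || sad > 3000;
}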

This can be done better with DCT. What I think we should try
(a rough skeleton follows the list):

- use normal SAD for the search.
- use DCT to choose INTER/INTRA/INTER4V. Choose the one which uses
the fewest bits and looks better (so when INTER4V uses 1 more bit
but its SAD is much lower, we still choose INTER4V).
- if cbp == 0 (which we know after the DCT check), leave it there. SAD
really was the best matching function when the block is not corrected
anymore.
- else, we can use a smarter search to minimize the number of bits. The
macroblock (well, block...) will look as good as quantization allows
anyway, so we can use more advanced techniques to minimize the number
of bits.
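
As a skeleton (all names here are illustrative placeholders, not real
XviD routines):

#include <stdint.h>

/* hypothetical encoder hooks */
void     sad_search(void *mb);             /* normal SAD-based ME       */
void     dct_mode_decision(void *mb);      /* bitcount INTER/INTRA/4V   */
uint32_t mb_cbp(const void *mb);           /* cbp known after DCT check */
void     bit_minimizing_search(void *mb);  /* smarter second-stage ME   */

/* the proposed per-macroblock flow */
void encode_mb(void *mb)
{
    sad_search(mb);              /* normal SAD for the search */
    dct_mode_decision(mb);       /* DCT-based mode decision   */

    if (mb_cbp(mb) == 0)
        return;                  /* no correction: SAD's match stands */

    bit_minimizing_search(mb);   /* else refine to minimize coded bits */
}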

OK, this mail is too long already. Thanks for reading.

Best Regards,
Radek