[XviD-devel] minor changes / todo list

Fri Jun 13 16:44:10 CEST 2003

Hi,

Quoting suxen_drol <suxen_drol at hotmail.com>:

> On Thu, 12 Jun 2003 18:39:55 +0200 Michael Militzer <michael at xvid.org>
> wrote:
> > Hi,
> > 
> > Quoting suxen_drol <suxen_drol at hotmail.com>:
> > 
> > > - (ed as previously suggested) consider moving "min/max_quant" and
> > > "zones array" to individual plugins, rather than maintaining these at the
> > > encoder level (the encoder need not be aware of quantizer restrictions
> > > or zones). though, this will mean the plugins have to allocate their own
> > > zone arrays.
> > 
> > I find the zones concept rather confusing. Especially when looking at the
> > current dev-api-4 gui. Can someone explain this in more detail? I suppose
> > the zones shall replace the credits encoding mode? In my testing I found
> > that credits encoding (and supposedly zones too) decrease rate control
> > accuracy. I'd rather vote to remove the credits options because they do
> > not belong in a codec but should rather be part of a tool/application.
> 
> zones are intended to replace 2pass credits encoding mode, and enable
> the user to better control the encoder. for example, control the
> distribution of bits in 2nd pass, varying bvop thresh, or forcing keyframes
> on a particular frame (not implemented yet).
>
> the old credits mode worked sufficiently, although the credits frames
> sizes were not fully taken into account by the 2nd pass rate controller.
> this resulted in slightly non-optimal 2pass bitrate allocation. the new
> xvidcore 2pass (and cbr to a lesser extent) takes into consideration fixed
> quantizer zones.

well, I wouldn't say that the old credits encoding option worked
sufficiently. I checked it myself while assisting doom9 with his latest
codec comparison and found that credits encoding led to heavy
undersizing problems. This gives a bad impression of our ratecontrol algo
which normally works pretty nicely (without credits encoding).

Also consider that credits encoding could be easily done by a tool or
application (or a slightly more advanced user): only the credits part could
be encoded at first, then the filesize of the credits part is subtracted
from the overall desired filesize and a normal 2pass encode is done for the
rest of the movie (without credits). This process is not really complex and
should always give better results than our credits mode. That is why I
suggested dropping the credits option.
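
The arithmetic a tool would need for this is small. A minimal sketch in
Python, assuming all sizes are in bytes and ignoring audio and container
overhead; the function and parameter names are hypothetical and not part of
any existing XviD tool:

```python
def main_pass_budget(target_total_bytes, credits_bytes, main_duration_s):
    """Return (bytes, kbit/s) left for the movie once the credits are encoded.

    Hypothetical helper: the credits are assumed to have been encoded
    separately, so their final file size is known exactly.
    """
    if main_duration_s <= 0:
        raise ValueError("main movie duration must be positive")
    remaining = target_total_bytes - credits_bytes
    if remaining <= 0:
        raise ValueError("credits already use up the whole target size")
    # Average video bitrate for the normal 2pass encode of the rest.
    kbps = remaining * 8 / main_duration_s / 1000
    return remaining, kbps


if __name__ == "__main__":
    size, rate = main_pass_budget(700_000_000, 20_000_000, 6800)
    print(size, rate)
```

The resulting size would then be given to the normal 2pass encode as its
desired filesize, so the rate controller never has to deal with the credits.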

> i agree that zones do make the interface more complex, and therefore i
> welcome any comments or ideas (good,bad) on how it can be improved.
> even if improving it means removing it.

I honestly doubt that the zones concept will be really useful for cbr or
2pass (except for credits, but for credits handling it's not really needed).
Forcing certain frame types (especially I-frames) might be useful sometimes,
but the use cases for this should be rather limited. Also, zones don't make
much sense for cbr: cbr modes are mostly designed for and used in real-time
applications or capturing. Both of these are characterized by the fact that
the user has no advance knowledge of the content to be encoded. So
user-selectable zones make no sense there.

Something similar is true for 2pass. The user also does not know much about
the quality of a 2pass encode after only the first pass has finished. The
first-pass usually looks quite good (because it's a quant 2 encode), so
weighting certain scenes and giving them more bits for the second pass is
just guessing, may not reflect actual needs and may give bad results.

So I'd rather suggest the following: Remove the zones option from vfw and
instead turn it into a small application of its own. Leave the normal 2pass
algorithm but add support to write second pass statistics (quantizer/bits
distribution) into the stats file as well. Then such a stats file could be
opened with our new application after both first and second pass have been
finished.

The application opens the stats file and displays the second pass bit 
distribution as a simple graph. The user could review the result of his
second pass and search for "bad looking" scenes. Using the application, he
could then assign more bits to difficult/bad looking scenes and less to
very good looking ones. So he could manually refine the bit distribution.
Once the user has finished, the application stores the refined bit
distribution scheme into the stats file again.

Now the user could perform a third-pass encoding where the XVID encoder
would exactly respect the user refined bit distribution. Since we've
collected useful information during the second pass already and because
the user-refined bit distribution shouldn't be too different from the
original 2pass distribution, the encoder should be able to respect the user
distribution with pretty high accuracy.
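
The refinement step such an application performs could look roughly like
this. It is only a sketch under assumptions: the second pass statistics are
taken to be a plain list of bits per frame, the user's choices are
(start, end, weight) ranges, and the overall size is kept constant by
rescaling. None of these names or formats exist in the current stats file.

```python
def refine_distribution(frame_bits, adjustments):
    """Reweight per-frame bit targets while keeping the total size unchanged.

    frame_bits  -- bits spent per frame in the second pass (assumed format)
    adjustments -- list of (start, end, weight); frames start..end-1 are
                   multiplied by weight (>1 gives more bits, <1 fewer)
    """
    total = sum(frame_bits)
    weighted = [float(b) for b in frame_bits]
    for start, end, weight in adjustments:
        if weight <= 0:
            raise ValueError("weight must be positive")
        for i in range(max(start, 0), min(end, len(weighted))):
            weighted[i] *= weight
    weighted_total = sum(weighted)
    if weighted_total == 0:
        return weighted  # nothing to distribute
    # Rescale so the third pass aims at the same overall file size.
    scale = total / weighted_total
    return [b * scale for b in weighted]


if __name__ == "__main__":
    second_pass = [1000, 1000, 1000, 1000]
    # Give the first two frames twice the relative weight of the rest.
    print(refine_distribution(second_pass, [(0, 2, 2.0)]))
```

Because the total is preserved, giving more bits to one scene automatically
takes them from the others, which keeps the refined distribution close to
the original 2pass one.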

So to sum it up: we could turn our current 2pass system into a 2pass plus
optional third-pass encoding process. And this third-pass could really
bring an improvement because it does not rely on mathematical error metrics
but on real human viewing impressions. So I think this could be a valuable
application for a zone-like concept while not being very difficult to
implement and not adding too much of a computational overhead.

bye,
Michael