[XviD-devel] minor changes / todo list

Sun Jun 15 12:30:31 CEST 2003

On Fri, 13 Jun 2003 15:44:10 +0200 Michael Militzer <michael at xvid.org> wrote:
> well, I wouldn't say that the old credits encoding option worked
> sufficiently. I checked it myself while assisting doom9 with his latest
> codec comparison and experienced that credits encoding did lead to heavy
> undersizing problems. This gives a bad impression of our ratecontrol algo
> which normally works pretty nicely (without credits encoding).

which credits mode (quant, bitrate, size)? bitrate and size modes
will decrease the bits allocated to the body of the video sequence, so
the target file size should have been met. i can't recall what quant mode
does, though i assume it's some form of approximation.

if target size is not being met, then there is a bug in the old code.
this of course depends on what is considered to be unacceptable
undersizing. imho <1% is acceptable (e.g. 3 out of 610meg), though
ideally it should be exact to the last kilobyte.
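
to make the "<1%" figure concrete, here is a small sketch of the arithmetic. the function names and the 1% default are only illustrative assumptions, nothing here exists in xvid:

```python
def undersize_percent(target_size, actual_size):
    """Return how far the encode fell short of the target, in percent.

    A negative result means the file came out larger than requested.
    """
    if target_size <= 0:
        raise ValueError("target_size must be positive")
    return 100.0 * (target_size - actual_size) / target_size


def within_tolerance(target_size, actual_size, max_percent=1.0):
    # max_percent=1.0 mirrors the "<1%" opinion above; it is a personal
    # threshold, not an xvid setting.
    return abs(undersize_percent(target_size, actual_size)) < max_percent


# the example from above: 3 megabytes short of a 610 megabyte target
print(round(undersize_percent(610, 607), 2))  # 0.49
print(within_tolerance(610, 607))             # True
```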

> Also consider that credits encoding could be easily done by a tool or
> application (or a slightly more advanced user): only the credits part could
> be encoded at first, then the filesize of the credits part is subtracted
> from the overall desired filesize and a normal 2pass encode is done for the
> rest of the movie (without credits). This process is not really complex and
> should always give better results than our credits mode. Therefore I
> suggested to drop the credits option.

i find encoding credits separately to be plain annoying. but i do
agree that the management of encoding (credits,chapters,etc.) is more
suited to a higher encoding layer than the codec.
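
for reference, the tool-side approach michael describes boils down to one subtraction. a rough sketch, assuming sizes are handled in bytes (the names are hypothetical, not part of any xvid frontend):

```python
MB = 1024 * 1024


def body_target_size(total_target, credits_size):
    """Bytes left for the main movie once the credits are already encoded."""
    if credits_size > total_target:
        raise ValueError("credits alone exceed the overall target size")
    return total_target - credits_size


# made-up numbers: a 700 MB overall target with 12 MB of credits
# already encoded leaves 688 MB for the normal 2pass encode.
print(body_target_size(700 * MB, 12 * MB) // MB)  # 688
```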

> > i agree that zones do make the interface more complex, and therefore i
> > welcome any comments or ideas (good,bad) on how it can be improved.
> > even if improving it means removing it.
> 
> I honestly doubt that the zones concept will be really useful for cbr or
> 2pass (except for credits, but for credits handling it's not really needed).
> Forcing certain frametypes (especially I-frames) might be useful sometimes,
> but an application of this should be rather limited. Also zones don't make
> much sense for cbr: cbr-modes are mostly designed for and used in real-time
> applications or capturing. Both these applications are characterized by the
> fact that the user has no knowledge about the content to be encoded in
> advance. So user selectable zones make no sense.

okay. the idea was to merge cbr and fixed quant into one simpler single pass
mode. however it seems what i implemented is actually more difficult to
use. that's okay, it was experimental and reverting back to plugin_cbr/fixed
and removing zones won't be hard.

> Something similar is true for 2pass. The user also does not know much about
> the quality of a 2pass encode after only the first pass has finished. The
> first-pass usually looks quite good (because it's a quant 2 encode), so
> weighting certain scenes and giving them more bits for the second pass is
> just guessing, may not reflect actual needs and may give bad results.

weighting was intended for such repeated second passes only (as you have
suggested).

> So I'd rather suggest the following: Remove the zones option from vfw and
> instead create an own small application out of it. Leave the normal 2pass
> algorithm but add support to write second pass statistics (quantizer/bits
> distribution) into the stats file as well. Then such a stats file could be
> opened with our new application after both first and second pass have been
> finished.
> 
> The application opens the stats file and displays the second pass bit 
> distribution as a simple graph. The user could watch the result of his
> second pass and search for "bad looking" scenes. Using the application, he
> could then assign more bits to difficult/bad looking scenes and less to
> very good looking ones. So he could manually refine the bit distribution.
> Once the user has finished, the application stores the refined bit
> distribution scheme into the stats file again.
> 
> Now the user could perform a third-pass encoding where the XVID encoder
> would exactly respect the user refined bit distribution. Since we've
> collected useful information during the second pass already and because
> the user-refined bit distribution shouldn't be too different from the
> original 2pass distribution, the encoder should be able to respect the user
> distribution with pretty high accuracy.
> 
> So to sum it up: we could turn our current 2pass system into a 2pass plus
> optional third-pass encoding process. And this third-pass could really
> bring an improvement because it does not rely on mathematic error metrics
> but on real human viewing impressions. So I think this could be a valuable
> application for a zone-like concept while not being very difficult to
> implement and not adding too much of a computational overhead.

okay, well this sounds good.

pass  stats input             stats output
----------------------------------------------
1.    fixed quant=2           1st pass stats
2.    1st pass stats          2nd pass stats
(edit 2nd pass stats file and set curve weights as necessary)
3.    edited 2nd pass stats   nothing

the 1st pass stats will contain:
	frame-type
	quant
	length

the 2nd pass stats will contain:
	frame-type
	(1stpass) quant, length
	(2ndpass) desired-length, quant, length

the edited 2nd pass stats will contain an additional weight field.
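
a minimal sketch of what one edited stats record and the weighting step could look like. the field names, the use of the achieved 2nd pass length as the base, and the renormalisation (so the total size stays unchanged) are all assumptions for illustration, not a decided file format:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameStats:
    frame_type: str         # e.g. 'i', 'p' or 'b'
    pass1_quant: int
    pass1_length: int
    desired_length: int     # what the rc wanted in the 2nd pass
    pass2_quant: int
    pass2_length: int       # what the 2nd pass actually produced
    weight: float = 1.0     # only set by the stats editor


def weighted_targets(frames):
    """Scale each frame's 2nd pass length by its weight, then renormalise
    so the sum of the targets equals the original 2nd pass total."""
    original_total = sum(f.pass2_length for f in frames)
    raw = [f.pass2_length * f.weight for f in frames]
    raw_total = sum(raw)
    if raw_total <= 0:
        raise ValueError("weights leave no bits to distribute")
    scale = original_total / raw_total
    return [r * scale for r in raw]


frames = [
    FrameStats('i', 2, 9000, 2000, 5, 2000),
    FrameStats('p', 2, 4000, 1000, 4, 1000, weight=2.0),  # "bad looking" frame
    FrameStats('p', 2, 4000, 1000, 4, 1000),
]
print(weighted_targets(frames))
```

with this scheme a weight above 1.0 takes bits from every other frame in proportion, so the overall file size target does not move.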

desired-length indicates the length the rc wanted for each frame, and
quant/length will indicate what was actually achieved. in the third pass
we might be able to use the difference between desired and actual to
increase size-quantizer relationship accuracy.
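
one way this could work, as a very rough sketch: treat length * quant as roughly constant for a given frame and use the 2nd pass observation to pick the third pass quant. that inverse model is only an assumption for illustration (the real relationship is not this simple), and the function names are hypothetical. the 1..31 clamp is the usual mpeg-4 quantizer range:

```python
def miss_ratio(desired_length, actual_length):
    """How far the 2nd pass missed its own target (1.0 means spot on)."""
    return actual_length / desired_length


def estimate_quant(pass2_quant, pass2_length, target_length, min_q=1, max_q=31):
    """Guess a third pass quant from what the 2nd pass achieved.

    Assumed model: length * quant is roughly constant per frame.
    """
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    q = pass2_quant * pass2_length / target_length
    return max(min_q, min(max_q, round(q)))


# the 2nd pass produced 6000 bytes at quant 4; the edited stats ask for
# 8000 bytes, so the guess is a lower quant.
print(estimate_quant(4, 6000, 8000))  # 3
```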

all of this isn't terribly difficult, however writing a nice stats
editor will take some time. it could also be written in a
platform-independent manner to give non-win32 folk access to bitrate
tuning. anyway, i will start thinking about a win32 prototype.

cheers,
-- pete