[XviD-devel] better rate control

Christoph Lampert xvid-devel@xvid.org
Tue, 10 Sep 2002 11:19:04 +0200 (CEST)


On Mon, 9 Sep 2002, daniel smith wrote:

> the somewhat disturbingly knowledgeable MfA from doom9's forum pointed us to http://www-iplab.ece.ucsb.edu/zhihai/papers/icassp011.pdf , which concerns a very accurate quantizer decision based on the bitcosts and zero-content of DCT coefficients.
> 
> doesn't look too nasty for use in core, unless anyone has objections?

Okay, I looked into it a little closer. If you are interested in the theory,
Zhihai's PhD thesis is worth reading; it has many more examples than the
short paper:
http://vision.ece.ucsb.edu/publications/Zhihai_Thesis.pdf

I see three applications for this in XviD (after discussion and testing): 

1) CBR ratecontrol (not #1 on the todo-list)
2) Adaptive Quantization 
3) Choice of quantizers in second pass of twopass (#1, a better fit to the
target filesize is really needed here) 

All three are connected to the fact that Zhihai found a close and
easy-to-calculate relationship between the number of zeros in the quantized
DCT coefficients (maybe Hadamard will work, too?) and the resulting bitrate
on one hand and image quality/distortion on the other.
This means you can predict the number of bits needed faster than by
counting them, but more accurately than with the old two-pass model, which
assumed texture size grows inversely proportional to the quantizer. 

However... all of this would have been possible before, just with more
computational effort. Still, it is not going to be hyperfast, because a
DCT of the whole frame is needed before the decision on the quantizer is
made. 
But first we'll have to test whether the results are really as good as he
claims (he always uses the "Foreman" sequence for testing, which is rather
different from "Matrix"). Also, he uses a lot of formulas, but not all his
statements are mathematically sound...

The first place for application would be two-pass, I guess. That's also a
place where it's easy to check how good the method is: 

1) Let quantization return the number of zero coefficients during the first
pass. Write that to the logfile. Precalculate the special constants \theta
or \alpha or \theta_0 or whatever he uses to describe the linear
model. Write them to the logfile, too. 

2) In second pass, use his formulas to decide on the right quantizer 
for the current frame, using the precalculated constant instead of some 
other "complexity" value. 

3) Compare the resulting bitstream length with the calculated one. 



If for our purposes the prediction is really as good as in his tests, 
we can continue e.g. with ordinary ratecontrol (the constants for the
current frame are approximated using the constants from the previous
frame of the same type).
For real adaptive quantization it would be more of a speedup than a
better approach. I want to test that with the slow DCT-based method first
before starting on approximate versions.

gruel