[XviD-devel] Cartoon mode

Christoph Lampert chl at math.uni-bonn.de
Thu Apr 17 15:50:26 CEST 2003


On Thu, 17 Apr 2003, Michael Militzer wrote:
> well, I still hope to find a way to make Qpel always better than halfpel. Also 
> it should be possible to tweak the dynamic b-frame decision to achieve 
> reasonably optimized b-frame placement (I think rate-distortion optimizations 
> may not reflect subjective quality well: so, a clip with b-frames might look 
> better than one without even though overall PSNR is lower...)

Yes, but cartoons have other problems as well, e.g. they are often
created at a lower framerate than the one used later in the bitstream.
A 15fps clip played at 30fps with 1 b-frame is most likely worse than
one without b-frames (I would assume): in IPP, where the middle P is
identical to the I, the middle frame might be 99% SKIP, but in b-frames SKIP
means linear motion, not "no motion", so in IBP vectors would have to be stored.
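
To make the SKIP argument concrete, here is a toy 1-D sketch, not XviD
code. It assumes a single object whose position is the only thing that
changes, a 15fps source with every frame shown twice, and a b-frame SKIP
that takes half of the I-to-P motion (one b-frame exactly in the middle).
All function names are made up for illustration.

```python
def duplicate_frames(positions):
    """Show every source frame twice (15fps material played at 30fps)."""
    out = []
    for p in positions:
        out.extend([p, p])
    return out


def p_skip_prediction(ref_pos):
    """P-frame SKIP: copy the reference block, i.e. no motion."""
    return ref_pos


def b_skip_prediction(past_pos, future_pos, trb=1, trd=2):
    """B-frame SKIP modelled as linear motion between the two references.

    trb/trd is the temporal position of the b-frame between its references
    (assumed 1/2 here, one b-frame in the middle).
    """
    return past_pos + (future_pos - past_pos) * trb / trd


source = [0, 8, 16]                  # object position per 15fps frame
display = duplicate_frames(source)   # [0, 0, 8, 8, 16, 16]

# IPP: frame 1 is a P-frame predicted from frame 0.
ipp_error = abs(display[1] - p_skip_prediction(display[0]))

# IBP: frame 1 is a b-frame between frame 0 (I) and frame 2 (P).
ibp_error = abs(display[1] - b_skip_prediction(display[0], display[2]))
```

In this model the duplicated frame is predicted exactly by a P-frame SKIP,
while the b-frame SKIP lands halfway between the two positions, so the
block needs an explicit vector or a residual.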

> BTW: cartoon mode. I've thought about it and when we consider that a frame of 
> a "normal" cartoon is rather simple (solid colored areas, some sharp lines), I 
> think it should be possible to well "reconstruct" the original image also from 
> an image distorted by compression artifacts. This leads to the idea that a 
> special postprocessing method could be helpful for cartoons. "Cartoon mode" in 
> XVID could then simply mean to write a "cartoon=yes" flag into the user data of 
> the bitstream that then indicates that the decoder should use a special cartoon 
> postprocessing mode...

Possibly. I'm not so sure, because it's not just sharp lines + mosquito
noise: the noise also sits in weakly structured areas, which don't have
to be of constant color or anything.

But still, sharp lines mean high frequencies which cannot completely be
quantized away. Adaptive Quantization might be interesting for this. 
Or a very special quant matrix? 
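
A rough sketch of what adaptive quantization for lines could look like:
blocks that contain a strong edge get a lower quantizer than flat blocks.
This is only an illustration, not XviD's adaptive quantization. The edge
measure, the threshold and the quantizer offset are made-up values. The
clamp to 1..31 is the MPEG-4 quantizer range.

```python
def edge_strength(block):
    """Largest difference between horizontally or vertically adjacent pixels."""
    h = max(abs(row[x + 1] - row[x])
            for row in block for x in range(len(row) - 1))
    v = max(abs(block[y + 1][x] - block[y][x])
            for y in range(len(block) - 1) for x in range(len(block[0])))
    return max(h, v)


def block_quant(block, base_quant, edge_threshold=64, delta=2):
    """Lower the quantizer for blocks with a sharp edge (hypothetical rule)."""
    q = base_quant
    if edge_strength(block) >= edge_threshold:
        q = base_quant - delta
    return max(1, min(31, q))  # keep inside the MPEG-4 quantizer range


flat_block = [[128] * 8 for _ in range(8)]             # solid colored area
edge_block = [[0] * 4 + [255] * 4 for _ in range(8)]   # sharp vertical line edge

flat_q = block_quant(flat_block, base_quant=6)
edge_q = block_quant(edge_block, base_quant=6)
```

With a base quantizer of 6, the flat block keeps 6 and the edge block
drops to 4, so the bits go where the high frequencies are.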

ME has very different problems, too, I guess: removing 90% of a line
by MC might be worse than not removing anything, because a straight line is
simple, but short segments of a line or even single points are more difficult
(more high frequencies).
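
The "more high frequencies" point can be checked with a toy 1-D DCT. The
sketch assumes an 8-sample row lying along the line: the untouched line is
a constant row, and the leftover after an imperfect MC is a single point.
The sample values and the significance threshold are arbitrary.

```python
import math


def dct_1d(samples):
    """Orthonormal 1-D DCT-II of a list of samples."""
    n = len(samples)
    out = []
    for k in range(n):
        s = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                for i, x in enumerate(samples))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def count_significant(coeffs, threshold=1.0):
    """Number of coefficients that would survive a (hypothetical) threshold."""
    return sum(1 for c in coeffs if abs(c) >= threshold)


full_line = [100] * 8                    # row along an intact line
fragment = [0, 0, 0, 100, 0, 0, 0, 0]    # single point left over after MC

full_count = count_significant(dct_1d(full_line))
fragment_count = count_significant(dct_1d(fragment))
```

The intact row needs only the DC coefficient, while the single leftover
point spreads over all eight coefficients.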

My favourite idea still is to extract the lines from the image, fill the gaps
with an easy-to-compress gradient or so, and put the lines into a binary mask
image or something. But that's not MPEG-4 anymore, of course...
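
A minimal sketch of that split, under heavy assumptions: lines are simply
"dark pixels" in a grayscale image, and each gap is filled per row with a
linear gradient between its left and right neighbours. The threshold and
the function name are invented for illustration. A real line detector
would need to be smarter than this.

```python
def split_lines(image, line_threshold=40):
    """Return (mask, filled): a binary line mask and the image without lines."""
    mask = [[1 if p < line_threshold else 0 for p in row] for row in image]
    filled = []
    for row, mrow in zip(image, mask):
        out = list(row)
        n = len(row)
        x = 0
        while x < n:
            if not mrow[x]:
                x += 1
                continue
            start = x
            while x < n and mrow[x]:   # find the end of this run of line pixels
                x += 1
            left = row[start - 1] if start > 0 else None
            right = row[x] if x < n else None
            if left is None and right is None:
                left = right = 0       # whole row is line: nothing to borrow from
            elif left is None:
                left = right
            elif right is None:
                right = left
            run = x - start
            for i in range(run):       # linear gradient across the gap
                t = (i + 1) / (run + 1)
                out[start + i] = round(left + (right - left) * t)
        filled.append(out)
    return mask, filled


image = [[200, 200, 10, 100, 100]]     # one dark line pixel between two areas
mask, filled = split_lines(image)
```

The mask and the filled image would then be coded separately, which is the
part that leaves MPEG-4.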

Prefiltering could be good: Cartoons from TV tend to be much noisier than
the original (because the original usually is very "clean", much cleaner
than natural video). 
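
As an illustration of prefiltering, here is a 3-tap median filter on one
row of pixels. A median filter is only one possible choice, assumed here
because it removes isolated noise spikes while leaving a sharp edge where
it is. It is not a filter that XviD is known to use for this.

```python
def median3_row(row):
    """3-tap horizontal median filter; border pixels are left untouched."""
    out = list(row)
    for x in range(1, len(row) - 1):
        out[x] = sorted(row[x - 1:x + 2])[1]
    return out


noisy_flat = [100, 100, 180, 100, 100]      # solid area with one noise spike
sharp_edge = [0, 0, 0, 255, 255, 255]       # clean line edge

clean_flat = median3_row(noisy_flat)
kept_edge = median3_row(sharp_edge)
```

The spike in the flat area disappears, and the edge row comes out unchanged.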

gruel



