[XviD-devel] inter48x48

Marc FD xvid-devel@xvid.org
Sat, 27 Jul 2002 14:36:59 +0200


> but what if the frame that you begin the sequence of repeated frames with,
> looks terrible because of a high quantizer?  such a frame may take 2 or 3
> subsequent "correction" frames to start looking good.  thus dropping those
> frames would result in a rather ugly still image.

Let me explain: in the first pass, XviD uses the statistical algo
(which is really safe, see below) to detect the Z(ero)-frames to drop.
In the second pass, a frame that comes right before a dropped frame can't
have a quantizer higher than a defined value (e.g. 3 for HQ),
and a frame that comes before several dropped frames _should_ have
quantizer 2. But the tests I've made show that even on HQ
anime the gain is so huge that I think the encode would be
saturated at quant 2 with 130 min on one CD :))
Moreover, the bits you save when you drop a frame
let you use better (smaller) quantizers
without breaking curve detection. The whole thing will need
some testing, of course, and would be totally
_inappropriate_ for fast action movies on 2 CDs, but it would really
rock for HQ anime or low bitrate encodes.
This feature is designed to make XviD an RV killer :)) (I don't like
$RV$)
To be honest, I think a quant 2 frame followed by 2 dropped frames
is smaller than a quant 8 frame followed by
2 correction frames, and it would result in a rather sweet still image :))
the sooner the better !
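
A minimal sketch of the second-pass rule described above, assuming the
limit is meant as a cap on the quantizer (lower quantizer = better
quality). The function name, its signature and the hard-coded limits 3
and 2 are only illustrative; nothing like this exists in XviD.

```c
/* Hypothetical helper: cap the second-pass quantizer of a frame according
 * to how many dropped Z-frames follow it. The caps 3 (one dropped frame)
 * and 2 (several dropped frames) are the example values from the text. */
int cap_quant_before_drops(int quant, int dropped_after)
{
    int cap;

    if (dropped_after <= 0)
        return quant;                 /* no dropped frame follows: untouched */

    cap = (dropped_after >= 2) ? 2 : 3;
    return (quant > cap) ? cap : quant;
}
```

With these assumptions, a frame the rate control wanted at quant 8 would
be encoded at quant 3 if one dropped frame follows it and at quant 2 if
several do, while a frame already at quant 2 is left alone.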

> if you'd like, you could compile a version of xvid that looks for
> 0xDEADBEEF as the first 4 bytes of the Y block, and drops that frame.
> that'd solve your problem - the harder part is finding a method suitable
> for committing to cvs for everyone else.  open source means you can do as
> you please in your own sand box :-)

- I'm not doing this only for myself. I know tons of anime fans who would
use XviD instead of RV if they could get the same overall
quality, and I would really like XviD to beat RV on this point. I'm more
of a coder than a ripper. Making a little hack for
my-once-a-year-little-rip-to-see-what-our-encoders-can-do-today is really
not worth it.
- I dislike this ugly hack because it's totally dumb: a simple colorspace
conversion would disable it...

> > we could always add a checkbox somewhere, but i think we can easily
> > make a statistical 99,99...% perfect algo, using the fact that video is
> > always noisy (even DVD's, even HDTV) specially on the chroma planes.
> > and it could be disabled
>
> such a dangerous feature should definitely be disabled by default.  pete's
> suggestion of optionally calling image_mad before compressing each frame
> may suit your purposes, if you can find the right threshold.  what
> algorithm are you using in your avisynth filter to detect duplicate
> frames, and how often does it miss frames due to small amounts of
> motion / noise?

The algo I use in my filter is very, very basic; it's a sort of modified MAD.
I tested it on several scenes from some anime.
When it's well configured, it's 100% accurate (it sees very small movements
and doesn't get confused by noise),
but you can't use it with XviD, because it's a filter (several
thresholds) and needs a "debug" mode.
A plain MAD is really not accurate enough.
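
To show what a plain MAD measures, here is a small sketch of a mean
absolute difference over one 8-bit plane. It is not XviD's image_mad and
not my filter's modified version; the name plane_mad and its signature
are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Mean absolute difference between two 8-bit planes of npix pixels.
 * Hypothetical helper for illustration only. */
double plane_mad(const uint8_t *a, const uint8_t *b, size_t npix)
{
    uint64_t sum = 0;
    size_t i;

    if (npix == 0)
        return 0.0;

    for (i = 0; i < npix; i++)
        sum += (uint64_t)abs((int)a[i] - (int)b[i]);

    return (double)sum / (double)npix;
}
```

As a made-up illustration of the problem: if noise moves every pel by 1,
the MAD is 1.0, but if only a 16x16 block changes by 40 levels in a
576x320 picture, the MAD is 16*16*40/184320, about 0.056. A single
threshold on the average can't separate those two cases.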

I don't know what you mean by "such a dangerous feature" ???

I've made some tests/calculations on the statistical Z-frame detection algo:
Let's say we keep just 1% of the last picture (i.e. 576x320 => 1843 pels /
3.6 KB), and that in the new picture _only_ 1% of the pels change (a MAD
between 0.01 and 1.28).
In the tests I've made on an anime DVD with heavy denoising
(SpatialSoften(10,50,50), etc.), the fewest changing pels I saw was 10'000
(6%).
That was only noise: there was no visible motion, even with VDub at 4x zoom,
frame by frame. And when there is very little motion, like guys speaking in
an anime,
many more pixels change, about 60'000 minimum (30%).
Some probabilities (with 576x320):
The chance that none of the 1% changing pels falls among the 1%
reference pels
is (99/100)^1843 = 1/110'748'118 => about 51 days of continuous 25 fps
video :)
If 5% change (the real-life minimum noise), even with only 1/1'000 of the
pels kept as reference, we have
(19/20)^184 = 1 frame in 12'556 accidentally dropped (I can't do better
than that anyway when I
capture from VHS, because of audio synchronisation :)
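
The numbers above can be rechecked with a few lines of C. This is only a
sketch: the function name is made up, and it assumes the reference pels
are picked independently of where the changes happen, so each reference
pel misses the change with probability (1 - changed).

```c
/* Probability that none of n_ref reference pels lands on a changed pel,
 * when a fraction `changed` (0..1) of the picture changes.
 * Computed by repeated multiplication so no math library is needed. */
double miss_probability(double changed, long n_ref)
{
    double p = 1.0;
    long i;

    for (i = 0; i < n_ref; i++)
        p *= 1.0 - changed;

    return p;
}
```

For example, 1.0 / miss_probability(0.01, 1843) should come out close to
110 million, and 1.0 / miss_probability(0.05, 184) close to 12'556, which
matches the figures quoted above.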

I can code the whole algo; I just need someone who knows XviD well to
integrate it into XviD's encoder afterwards.
I'm thinking of two functions:
void KeepDetectData(const int8_t* srcpic, int8_t* data, int32_t amount)
to keep amount/1000 of srcpic in the data buffer
bool IsZframe(const int8_t* srcpic, const int8_t* data, int32_t amount)
to say whether it's a Z(ero)-frame or not (bool would be replaced by a
portable type)
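
Here is a rough sketch of how these two functions could look. Several
things are assumptions made for the example and not part of the proposal
above: the lowercase names, the extra npix parameter (number of pels in
the plane), the use of uint8_t, the fixed LCG that picks the sample
positions, and the exact-match test with no tolerance.

```c
#include <stddef.h>
#include <stdint.h>

/* Deterministic position generator (a plain LCG, chosen for this sketch):
 * both functions below walk the same sequence of positions in [0, npix). */
static uint32_t next_pos(uint32_t *state, uint32_t npix)
{
    *state = *state * 1664525u + 1013904223u;
    return (uint32_t)(((uint64_t)*state * npix) >> 32);
}

/* Number of reference pels: amount/1000 of the plane. */
static size_t sample_count(uint32_t npix, int32_t amount)
{
    if (amount <= 0)
        return 0;
    return (size_t)(((uint64_t)npix * (uint64_t)amount) / 1000u);
}

/* Keep amount/1000 of the pels of src in data.
 * data must have room for npix * amount / 1000 bytes. */
void keep_detect_data(const uint8_t *src, uint32_t npix,
                      uint8_t *data, int32_t amount)
{
    uint32_t state = 12345u;          /* same seed as in is_zframe */
    size_t i, n = sample_count(npix, amount);

    for (i = 0; i < n; i++)
        data[i] = src[next_pos(&state, npix)];
}

/* Return 1 if every sampled pel of src equals the stored one, else 0. */
int is_zframe(const uint8_t *src, uint32_t npix,
              const uint8_t *data, int32_t amount)
{
    uint32_t state = 12345u;
    size_t i, n = sample_count(npix, amount);

    for (i = 0; i < n; i++)
        if (src[next_pos(&state, npix)] != data[i])
            return 0;                 /* one differing pel is enough */
    return 1;
}
```

The encoder would call keep_detect_data on each frame it keeps and
is_zframe on the next incoming frame with the same amount. The early
return in is_zframe means that on a noisy, non-repeated frame the loop
usually stops after a handful of comparisons.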

It should be very, very fast (only a few thousand values to check).
It will not eat up memory (10K is nothing... it's less than the L1
cache
of recent CPUs).
It will be accurate enough (I think a 3% value will be best:
(99/100)^2304 = 1/11'389'683'271 on a 320x240 clip...)

If you are paranoid enough to fear it could drop a frame once every
11'389'683'271
frames (that's roughly 14 years of video at 25 fps...), this option
would of course be controlled by a checkbox.

I never said it was the best way, it's just what I came up with.

MarcFD  marc.fd@libertysurf.fr