Re[2]: [XviD-devel] scene change detecting in Bframes mode

Marc FD xvid-devel@xvid.org
Tue, 20 Aug 2002 12:09:00 +0200


> > So, thinking of tweaking/designing/coding/improving a scene detector, I have
> > some questions:
> > It's better for the codec to code an I-frame than a P/B-frame when there is
> > very fast movement, but it's better to code a P/B-frame when the movement is
> > easy to estimate (pans, zooms), right? (Seems logical.)
>
> If the movement is more difficult than ME can handle, then it's better
> to make it an I frame.

It seemed logical; now I'm sure.

> > In this case, a histogram-based algo would be a must.
> > But in scenes with, for example, machine guns in action in a dark area, the
> > algo will go mad and create long runs of I-frames... so I come to the next
> > question:
>
> Which is not very bad, because ME will also get mad here. ME gets
> crazy when there is a flash (of entire scene).

Yep, but when there is a flash in 25% of the scene, a histogram algo goes mad
too.
It's very noticeable in VDub.
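
To make the partial-flash problem concrete, here is a minimal sketch of a
histogram-based detector in Python. Everything in it (the function names, the
16 bins, the 0.2 threshold, frames as flat lists of luma values) is an
assumption for illustration and is not XviD or VDub code:

```python
def luma_histogram(frame, bins=16):
    """Count luma samples (0..255) into equal-width bins."""
    hist = [0] * bins
    for y in frame:
        hist[y * bins // 256] += 1
    return hist


def histogram_distance(prev, cur, bins=16):
    """Normalised L1 distance between two frames of equal size.

    0.0 means identical histograms, 1.0 means no overlap at all.
    """
    hp = luma_histogram(prev, bins)
    hc = luma_histogram(cur, bins)
    return sum(abs(a - b) for a, b in zip(hp, hc)) / (2 * len(cur))


def is_scene_change(prev, cur, threshold=0.2):
    """Hypothetical decision rule: request an I-frame above the threshold."""
    return histogram_distance(prev, cur) > threshold


# Toy frames of 1000 luma samples each (made-up values).
dark = [20] * 1000                    # dark scene
flash = [240] * 250 + [20] * 750      # same scene, flash in 25% of the area
cut = [200] * 1000                    # completely different scene

print(histogram_distance(dark, flash), is_scene_change(dark, flash))
print(histogram_distance(dark, cut), is_scene_change(dark, cut))
```

With these made-up numbers, a flash covering a quarter of a dark frame gives a
distance of 0.25, which is already above the threshold, so the detector would
ask for an I-frame even though 75% of the picture is unchanged.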

> > Is it better to encode a P/B-frame or an I-frame when only a part of the
> > frame changes?
> > Why can an I-frame be better than a P/B-frame? Because of the size of the
> > P/B-frame motion vectors?
>
> If the part of a picture that got changed is easy to encode (for
> example a smooth white 'fire' from the gun) then P frame is better.
> Especially if the part of the scene which is static is difficult.

So there are some big compression opportunities here ;)

> I frames are smaller because they use predictions of DCT coefficients
> between macroblocks, while in a P frame all blocks are independent. (OK,
> I might be wrong here, please correct me.)

Aahh, makes sense. Thanks a lot for this info.

> > Would it be possible to code a false I-frame? I mean a P-frame that keeps
> > the background and adds a static image on it, instead of trying to estimate
> > a non-existent movement.
>
> Every P frame is like this, because every MB in such frame can be
> intra. However there is no prediction between the intra blocks so a P
> frame saturated with intra blocks will be much bigger than a similar I
> frame. (again, please correct if I'm wrong).

I'm thinking of flashes in 25% of the screen, which are traditionally coded
as ugly runs of I-frames in almost all codecs. If we could do a 75% P-frame,
25% I-frame, XviD would simply _rule_. Runs of I-frames are often very nasty
and problematic, and some XviD-magic[tm] would be _very_ appreciated.
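
A rough sketch of what such a "75% P, 25% intra" decision could look like, in
Python. The per-macroblock test (compare the block's own deviation with the
zero-motion SAD minus a bias) and the frame-level ratio are assumptions of
mine, and so are all names and constants; this is not the actual XviD mode
decision:

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two blocks of equal size."""
    return sum(abs(a - b) for a, b in zip(block_a, block_b))


def deviation(block):
    """Sum of absolute deviations from the block mean (rough intra cost)."""
    mean = sum(block) // len(block)
    return sum(abs(p - mean) for p in block)


def choose_mb_modes(prev_blocks, cur_blocks, inter_bias=512):
    """Pick 'intra' where the block is cheaper to code on its own.

    The inter cost here is the SAD against the co-located block (zero
    motion vector); a real encoder would use the best ME result instead.
    """
    modes = []
    for prev, cur in zip(prev_blocks, cur_blocks):
        if deviation(cur) < sad(prev, cur) - inter_bias:
            modes.append("intra")
        else:
            modes.append("inter")
    return modes


def choose_frame_type(modes, intra_ratio_limit=0.5):
    """Hypothetical rule: only switch to an I-frame if most blocks are intra."""
    return "I" if modes.count("intra") > intra_ratio_limit * len(modes) else "P"


# Four 16x16 macroblocks; the last one is hit by a flash (made-up values).
prev_blocks = [[20] * 256 for _ in range(4)]
cur_blocks = [[20] * 256, [20] * 256, [20] * 256, [240] * 256]

modes = choose_mb_modes(prev_blocks, cur_blocks)
print(modes, choose_frame_type(modes))
```

In this toy case only the flashed macroblock becomes intra and the frame stays
a P-frame. Whether that is really smaller than an I-frame depends on how well
the intra blocks inside a P-frame compress, which is the open point above.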

> I don't think my answers helped you ;) But, as we all know, it is a
> difficult subject.

Yes, but interesting too. It seems to be improvable somehow :)

> Now I have a question: Why can't we use the current (ME-based)
> scene-detection algo?
> Of course I don't want to run P-frame ME for every frame. I only mean
> doing it for a P frame. If this frame turns out to be from a new
> scene, THEN go back one frame (in viewing order) and check which scene
> it belongs to. Keep going back until you find the last frame of the old
> scene (and encode it as P, since ME is already done; encode the frames
> before it as B, and the next frame [first of the new scene] as I).
> There is no waste of computing power unless there are many B-frames and
> lots of scene changes. No more than with the current max_bframes = -1, anyway.
>
> The only problem I see is putting it into bitstream in VfW
> compatibility mode.
> Please enlighten me what's wrong with it. Or ask questions if what I
> said is difficult to understand.
>
> Radek

I think the same as gruel.
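
For what it's worth, the backtracking idea quoted above can be sketched in a
few lines of Python. The is_new_scene predicate stands in for the ME-based
check against the last reference frame, and the function name, the list-based
interface and the choice to leave the frames after the new I-frame for the
next group are all assumptions, not existing XviD code:

```python
def assign_frame_types(group, is_new_scene):
    """Assign frame types to a group of frames given in viewing order.

    group        -- frames after the last reference; the final one is the
                    candidate P-frame
    is_new_scene -- hypothetical predicate: True if ME against the last
                    reference says the frame belongs to a new scene

    Returns the types for the frames up to and including the last decided
    frame; any frames after a detected I-frame are left for the next group.
    """
    candidate = len(group) - 1
    if not is_new_scene(group[candidate]):
        # No scene change: the usual B...B P pattern.
        return ["B"] * candidate + ["P"]

    # Walk back (in viewing order) to find the first frame of the new scene.
    first_new = candidate
    while first_new > 0 and is_new_scene(group[first_new - 1]):
        first_new -= 1

    types = []
    if first_new > 0:
        # The last frame of the old scene becomes P, everything before it B.
        types = ["B"] * (first_new - 1) + ["P"]
    types.append("I")  # first frame of the new scene
    return types


# Toy example: frames named by scene; 'b...' frames belong to the new scene.
group = ["a1", "a2", "b1", "b2"]
print(assign_frame_types(group, lambda f: f.startswith("b")))
print(assign_frame_types(["a1", "a2", "a3"], lambda f: f.startswith("b")))
```

In the first example a1 is coded as B, a2 as P and b1 as I, while b2 is left
for the next group. The extra cost is one scene check per frame stepped back,
so it only adds up when there are many B-frames and frequent scene changes.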