Re[2]: [XviD-devel] scene change detecting in Bframes mode

Tue, 20 Aug 2002 19:06:35 +0930

> So, thinking of tweaking/designing/coding/improving a scenedetector, i've
> some questions :
> it's better for the codec to code an I-frame than an P/B when there very
> fast movement,
> but it's better to code a P/B frame when the movement is easy to estimate
> (pans,zooms),
> right ? (seems logical)

If the movement is more difficult than ME can handle, then it's better
to make it an I frame.

> In this case, a histogram based algo would be a must.
> but in scenes with for ex machine-guns in action in a dark area, the algo
> will turn mad and
> will create big rows of I-frames... so i come to the next question :

That is not so bad, because ME will go mad here too. ME breaks down
when there is a flash (across the entire scene).
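
To make the histogram idea concrete, here is a rough sketch of such a
detector. It is only an illustration: the frame is assumed to be a flat
list of 8-bit luma values, and the function names, the bin count and the
0.5 threshold are all made up for this example, not taken from XviD.

```python
def luma_histogram(frame, bins=16):
    """Count how many 8-bit luma samples fall into each of `bins` buckets."""
    hist = [0] * bins
    for y in frame:
        hist[y * bins // 256] += 1
    return hist


def histogram_distance(a, b):
    """Normalised L1 distance between two histograms of equal total (0..1)."""
    total = sum(a)
    return sum(abs(x - y) for x, y in zip(a, b)) / (2 * total)


def looks_like_scene_change(prev, cur, threshold=0.5):
    """Hypothetical rule: flag a cut when the luma histograms differ a lot."""
    d = histogram_distance(luma_histogram(prev), luma_histogram(cur))
    return d > threshold


# A dark frame, then a muzzle flash lighting the whole picture, then dark again.
dark = [20] * 64
flash = [230] * 64
sequence = [dark, flash, dark, flash, dark]
flags = [looks_like_scene_change(p, c) for p, c in zip(sequence, sequence[1:])]
```

With these toy frames every transition is flagged, which is exactly the
"big rows of I-frames" behaviour described above.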

> Is it better to encode a P/B frame or an I-frame when only a part of the
> frame change ?
> Why a I-frame can be better than a P/B-frame ? due to the P/B-frame motion
> vectors size ??

If the part of the picture that changed is easy to encode (for
example, a smooth white 'fire' from the gun), then a P frame is better,
especially if the static part of the scene is difficult to encode.

An I frame can be smaller than a P frame full of intra blocks because it
uses prediction of DCT coefficients between macroblocks, while in a P
frame all blocks are independent. (OK, I might be wrong here, please
correct me.)
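
A toy sketch of what coefficient prediction buys, under heavy
assumptions: only the DC coefficient is predicted, always from the
previous block in the row, and the default predictor of 128 is invented
for the example. A real codec picks its predictor from the neighbours
adaptively, so treat this purely as an illustration.

```python
def predicted_residuals(dc_values, default_pred=128):
    """Code each DC value as the difference from the previous block's DC."""
    residuals = []
    pred = default_pred
    for dc in dc_values:
        residuals.append(dc - pred)
        pred = dc  # the next block predicts from this one
    return residuals


def independent_residuals(dc_values, default_pred=128):
    """Code each DC value against the fixed default, ignoring neighbours."""
    return [dc - default_pred for dc in dc_values]


def magnitude(values):
    """Crude stand-in for bit cost: the sum of absolute values to be coded."""
    return sum(abs(v) for v in values)


row = [200, 202, 201, 203]  # DC values of four neighbouring, similar blocks
with_pred = predicted_residuals(row)       # [72, 2, -1, 2]
without_pred = independent_residuals(row)  # [72, 74, 73, 75]
```

Here the predicted version has to code a total magnitude of 77 against
294 for the independent version, which is the kind of saving I mean.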

> would it be possible to code a false-I-frame? i mean a P-frame who keeps the
> background and
> adds a static image on it, instead of trying to estimate an non-existing
> movement.

Every P frame is like this, because every MB in such a frame can be
intra. However, there is no prediction between the intra blocks, so a P
frame saturated with intra blocks will be much bigger than a similar I
frame. (Again, please correct me if I'm wrong.)
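
One way to act on this, sketched under assumptions: after mode decision,
count how many MBs ended up intra and fall back to an I frame when their
share passes some limit. The function name, the mode labels and the 0.5
limit are hypothetical, chosen only for this example.

```python
def choose_frame_type(mb_modes, intra_limit=0.5):
    """Return "I" when more than `intra_limit` of the MBs were coded intra.

    `mb_modes` is a list with one entry per macroblock, "intra" or "inter".
    """
    intra = sum(1 for mode in mb_modes if mode == "intra")
    return "I" if intra > intra_limit * len(mb_modes) else "P"


mostly_intra = ["intra"] * 8 + ["inter"] * 2
mostly_inter = ["inter"] * 9 + ["intra"]
```

A frame like `mostly_intra` would be re-coded as an I frame, while
`mostly_inter` stays a P frame with a single intra MB in it.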

I don't think my answers helped you ;) But, as we all know, it is a
difficult subject.

Now I have a question: why can't we use the current (ME-based)
scene-detection algo?
Of course I don't want to run P-frame ME for every frame. I only mean
doing it for a frame planned as a P frame. If this frame turns out to
be from a new scene, THEN go back one frame (in viewing order) and check
which scene it belongs to. Keep going back until you find the last frame
of the old scene (encode it as P, since its ME is already done; encode
the frames before it as B, and the next frame [the first of the new
scene] as I).
No computing power is wasted unless there are many B-frames and lots of
scene changes. No more than with the current max_bframes = -1, anyway.
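
In case the description is hard to follow, here is a minimal sketch of
the backtracking. Everything in it is an assumption for illustration:
`is_new_scene(ref, i)` stands for "P-frame ME of frame i against
reference frame ref says scene change", and frames are just indices in
viewing order. It is not how the codec is structured today.

```python
def assign_types(is_new_scene, ref, planned_p):
    """Decide frame types for the frames after `ref`, up to the next reference.

    ref        index of the last reference frame (already encoded)
    planned_p  index of the frame that was going to be the next P frame
    Returns a dict {frame_index: "B" | "P" | "I"}.
    """
    # Normal case: the planned P frame is still in the old scene.
    if not is_new_scene(ref, planned_p):
        types = {i: "B" for i in range(ref + 1, planned_p)}
        types[planned_p] = "P"
        return types

    # Scene change somewhere in between: walk back in viewing order
    # until we hit the last frame that still belongs to the old scene.
    last_old = planned_p - 1
    while last_old > ref and is_new_scene(ref, last_old):
        last_old -= 1

    types = {}
    if last_old > ref:
        # Frames before the last old-scene frame become B, that frame becomes P.
        for i in range(ref + 1, last_old):
            types[i] = "B"
        types[last_old] = "P"
    # The first frame of the new scene becomes the I frame.
    types[last_old + 1] = "I"
    return types


# Example: reference at frame 0, P planned at frame 5, cut between frames 3 and 4.
cut_at_4 = lambda ref, i: i >= 4
result = assign_types(cut_at_4, ref=0, planned_p=5)
```

In this example frames 1 and 2 come out as B, frame 3 as P and frame 4
as I. Frame 5 is left for the next round, with frame 4 as the new
reference.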

The only problem I see is putting it into the bitstream in VfW
compatibility mode.
Please enlighten me as to what's wrong with it. Or ask questions if
what I said is difficult to understand.

Radek