[XviD-devel] streaming mpeg-4 / decoding b-frames
Christoph Lampert
xvid-devel@xvid.org
Mon, 2 Sep 2002 09:58:14 +0200 (CEST)
On Sat, 31 Aug 2002, shatty wrote:
> I am trying to ensure that there will be no problems with
> bidirectionally encoded frames in the new media kit api for
> OpenBeOS. Internally the media kit has a set of nodes and wires
> similar to various other media APIs. (directshow, gstreamer)
> The data is passed in a set of buffers that are handed from
> node to node. The media kit is oriented around low latency
> media manipulation, so each node maintains some latency
> information.
>
> Which brings me to the b-frames/streaming mpeg4. If we do the
> naive thing and simply pass the frames in the order in which
> they are to be presented, we have a big latency hit if a group
> of b-frames occurs. We don't have to take this hit if I am not
> mistaken, because each b-frame has only the prior frame and
> the next key frame as its reference. (right?)
>
> So we could pass the reference frame ahead of the group of
> b-frames and then it would be available for decoding the
> b-frames, or we could pass it after the first one, for
> example. Either of these seems better than the naive approach.
>
> My question is: how do people handle this now with streaming
> mpeg4 and also, how is this handled in the xvidcore lib?
> Does the lib expect the frames out of order? Please feel free
> to respond offlist/onlist/IRC as you like.
In case nobody else answers...
---------------- DECODING -------------------------------
The MPEG standard specifies the order in which frames have to be
passed. If the GOP in display order is IBBPBBPBBP
0 1 2 3 4 5 6 7 8 9
I B B P B B P B B P
then the frames are stored in the bitstream (and transmitted, of course)
as
0 3 1 2 6 4 5 9 7 8
I P B B P B B P B B
So whenever an encoded frame arrives, all the reference frames needed to
decode it have already arrived.
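
The reordering above can be sketched in a few lines of Python. This is only
an illustration of the rule, not code from xvidcore: the function name
coded_order is made up here, and appending trailing B-frames at the end is
an assumption for GOPs that do not finish on a reference frame.

```python
def coded_order(frame_types):
    """Return display indices in the order they appear in the bitstream.

    frame_types is a string in display order, e.g. "IBBPBBPBBP".
    Each run of B-frames is emitted after the I- or P-frame that follows it.
    """
    order = []
    pending_b = []  # B-frames still waiting for their later reference frame
    for index, kind in enumerate(frame_types):
        if kind == "B":
            pending_b.append(index)
        else:
            # I or P: the reference goes first, then the B-frames shown before it
            order.append(index)
            order.extend(pending_b)
            pending_b = []
    # Assumption: B-frames with no following reference are simply appended.
    order.extend(pending_b)
    return order


print(coded_order("IBBPBBPBBP"))  # [0, 3, 1, 2, 6, 4, 5, 9, 7, 8]
```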
----------------ENCODING-------------------------------
Encoding is more difficult. XVID (and all other codecs I know of)
take their input in normal display order. Frames that are supposed to
become B-frames are buffered until their other reference frame has been
encoded. At the beginning there has to be a delay of N frames if the maximum
is N consecutive B-frames (that's important for A-V sync!)
viewing:              0  1  2  3  4  5  6  7  8  9
input to encoder:     0  1  2  3  4  5  6  7  8  9
action in encoder:    -  -  e0 e3 e1 e2 e6 e4 e5 e9 e7 e8
output from encoder:  -  -  0  3  1  2  6  4  5  9  7  8
so there's a delay of 2 empty frames at the beginning. One could also choose
action in encoder:    e0 -  -  e3 e1 e2 e6 e4 e5 e9 e7 e8
output from encoder:  0  -  -  3  1  2  6  4  5  9  7  8
then there would be an image right at the beginning, but it is shown
for 3 timesteps instead of only 1 (I think DivX5 did that, but DivX5 only
does 1 B-frame).
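
To make the first timing variant concrete, here is a small simulation of it
in Python. It is a sketch under simple assumptions, not how xvidcore is
written: frame i is assumed to arrive at time step i, the encoder handles at
most one frame per step, and the names encoder_timeline and max_b are
invented for this example.

```python
def encoder_timeline(frame_types, max_b):
    """For each time step, return the display index encoded (None = idle)."""
    # Coded order: each reference frame precedes the B-frames shown before it.
    order, pending_b = [], []
    for index, kind in enumerate(frame_types):
        if kind == "B":
            pending_b.append(index)
        else:
            order.append(index)
            order.extend(pending_b)
            pending_b = []
    order.extend(pending_b)  # assumption: trailing B-frames go last

    timeline = []
    next_coded = 0
    step = 0
    while next_coded < len(order):
        # Newest frame received so far (input stops after the last frame).
        arrived = min(step, len(frame_types) - 1)
        # Wait out the initial delay, and never encode a frame not yet received.
        if step >= max_b and order[next_coded] <= arrived:
            timeline.append(order[next_coded])
            next_coded += 1
        else:
            timeline.append(None)
        step += 1
    return timeline


print(encoder_timeline("IBBPBBPBBP", 2))
# [None, None, 0, 3, 1, 2, 6, 4, 5, 9, 7, 8]
```

With max_b=2 the encoder never has to wait again after the first two idle
steps. A smaller value (say max_b=1 for the same GOP) makes the sketch idle
again later on, because a P-frame is wanted before it has arrived.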
Btw, it's not a good idea to let the application reorder the frames before
sending them to the codec, because there is a "design flaw" (or simply a
bug) in the MPEG-4 standard that makes it necessary to look at the
intermediate B-frames before taking the SKIP decision at the _following_
P-frame.
So you cannot _encode_ frames 1 and 2 before having encoded 3, but they
must be available when encoding 3 in order to check whether blocks in
frame 3 can be SKIPped or not.
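
A rough sketch of what such a check could look like, in Python. Everything
here is hypothetical: the names can_skip_p_block and sad, the flat pixel
lists, and the use of a sum of absolute differences against a threshold are
assumptions for illustration, not the actual test used by XVID or required
by the standard. It only shows why the buffered B-frames must be at hand
when the P-frame is encoded.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized pixel blocks."""
    return sum(abs(a - b) for a, b in zip(block_a, block_b))


def can_skip_p_block(ref_block, p_block, b_blocks, threshold=0):
    """Decide whether a block of the P-frame may be marked as skipped.

    ref_block: co-located block in the previous reference frame
    p_block:   co-located block in the P-frame being encoded
    b_blocks:  co-located blocks of the B-frames displayed in between
    """
    # Usual condition: the P-frame block matches the reference.
    if sad(ref_block, p_block) > threshold:
        return False
    # Extra condition: no intermediate B-frame differs from the reference.
    return all(sad(ref_block, b) <= threshold for b in b_blocks)


static = [10, 10, 10, 10]
flash = [200, 200, 200, 200]  # something visible only in one B-frame

print(can_skip_p_block(static, static, [static, static]))  # True
print(can_skip_p_block(static, static, [flash, static]))   # False
```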
Christoph