[XviD-devel] API transparency with regards to requiring future frames

Wed Jul 11 13:44:50 CEST 2007

Hi Stephan,

Finally, my reply in full...

Quoting Stephan Assmus <superstippi at gmx.de>:

[...]

> I would like to confirm an assumption I have with regards to the decoder
> requiring "future" frames for a particular frame. I hope I am right in
> thinking this is a feature of MPEG4/XViD that a frame can reference a
> future frame, otherwise just ignore this mail :-).

Yes, you are right. The feature is called B-frames: a B-frame can use
bidirectional prediction, which linearly combines samples from a past
and a future frame.
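
To give a rough idea of what "linearly combines" means, here is a minimal
sketch, assuming the combination is a rounded average of the two
motion-compensated predictions (which, as far as I know, is what the
interpolated B-frame mode of MPEG-4 uses). The function name
bidir_average is made up for illustration and is not part of the Xvid
API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not an Xvid function: combine a prediction taken
 * from a past frame with one taken from a future frame by a rounded
 * average, sample by sample. */
static void bidir_average(uint8_t *dst, const uint8_t *past,
                          const uint8_t *future, size_t n)
{
    assert(dst != NULL && past != NULL && future != NULL);
    for (size_t i = 0; i < n; i++) {
        /* +1 rounds halves upwards before the shift divides by two */
        dst[i] = (uint8_t)((past[i] + future[i] + 1) >> 1);
    }
}
```

The real decoder does this on motion-compensated blocks and then adds
the decoded residual; the sketch only shows the combining step.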

> Suppose the following situation: I keep fetching chunk data and stuffing it
> into the decoder, and whenever xvidDecoderStats.type > XVID_TYPE_NOTHING I
> am happy to have decoded a frame and display it. As an example, assume I
> decoded 3 frames in this way, but what really happened, without me being
> aware of it, is that I fed the decoder data so that it could decode until
> frame 4, because frame 3 needed frame 4. The library kept quiet about
> having already parsed frame 3, because it wanted me to keep feeding it data
> until it could parse frame 4 as well so that it could reconstruct frame 3
> fully. After having displayed frame 3, progressing to "decode" frame 4 -
> what I would expect from the library, is that it handles this
> transparently, and just gives me frame 4, because it buffered this
> internally. My question is... is this assumption correct, or is this
> more involved?

Yes, this is handled transparently. There's no need to 're-feed' any
compressed data or the like. Frames that were already decoded internally
in order to decode a B-frame are buffered and output when you feed the next
chunk of input data. At the end of your stream, you should call the
decoder with a NULL pointer as input and the length set to -1, and the
decoder will return a frame that may still be buffered.
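
The buffering described above can be sketched with a toy model. This is
not the Xvid API and not how the library is implemented internally; the
names toy_frame, toy_decoder, toy_decode and toy_run are all hypothetical,
and the model assumes one frame per chunk (no packed mode) and a single
held reference frame. It only illustrates the calling pattern: feed
chunks, accept "nothing yet" as an answer, and flush with NULL and -1 at
the end.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy types, for illustration only. */
typedef struct { char type; int display_no; } toy_frame; /* 'I', 'P', 'B' */
typedef struct { int has_held; toy_frame held; } toy_decoder;

/* Feed one coded frame, or flush with in == NULL and length == -1.
 * Returns 1 and fills *out when a picture is ready, 0 for "nothing yet". */
static int toy_decode(toy_decoder *dec, const toy_frame *in, int length,
                      toy_frame *out)
{
    assert(dec != NULL && out != NULL);
    if (in == NULL) {
        if (length != -1 || !dec->has_held)
            return 0;
        *out = dec->held;            /* end of stream: drain the buffer */
        dec->has_held = 0;
        return 1;
    }
    if (in->type == 'B') {
        *out = *in;                  /* both references already decoded */
        return 1;
    }
    /* I- or P-frame: release the held reference, hold the new one. */
    int ready = dec->has_held;
    if (ready)
        *out = dec->held;
    dec->held = *in;
    dec->has_held = 1;
    return ready;
}

/* Decode n frames given in coded order, store the display numbers in
 * output order in shown[], and return how many pictures came out. */
static int toy_run(const toy_frame *coded, int n, int *shown)
{
    toy_decoder dec = {0};
    toy_frame pic;
    int count = 0;

    for (int i = 0; i < n; i++)
        if (toy_decode(&dec, &coded[i], (int)sizeof coded[i], &pic))
            shown[count++] = pic.display_no;

    /* No more input: flush whatever is still buffered. */
    while (toy_decode(&dec, NULL, -1, &pic))
        shown[count++] = pic.display_no;

    return count;
}
```

Feeding the coded order I0 P2 B1 P4 B3 to toy_run gives nothing for the
first chunk, then one picture per chunk in display order, and the last
picture only comes out through the flush call.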

BTW: Many Xvid AVI files with B-frames are created with 'packed mode'. In
that case, several frames of data can be packed into one AVI chunk. So if
you feed these packed chunks to the decoder, the decoder will have all the
frames required to decode the next frame and you'll get a valid output
picture for each AVI chunk (so no XVID_TYPE_NOTHING and no delay).

> How common are streams that use this feature? My xvid based decoder works
> fine so far, but I don't know if I tested it with such streams.

The B-frame feature is very common ;)

Regards,
Michael