[XviD-devel] PATCH: Per slices rendering

peter ross xvid-devel@xvid.org
Mon, 08 Jul 2002 10:46:47 +1000


>From: "Michael Militzer" <michael@xvid.org>
>Reply-To: xvid-devel@xvid.org
>To: <xvid-devel@xvid.org>
>Subject: Re: [XviD-devel] PATCH: Per slices rendering
>Date: Mon, 8 Jul 2002 01:27:29 +0200
>
>----- Original Message -----
>From: "peter ross" <suxen_drol@hotmail.com>
>To: <xvid-devel@xvid.org>
>Sent: Monday, July 08, 2002 12:26 AM
>Subject: Re: [XviD-devel] PATCH: Per slices rendering
>
>
> > hey all,
> >
> > how about adding a 'dirty' field for each macroblock, which is set if the
> > macroblock has changed? then at the image_output stage, perform some
> > macroblock checking and call slice_copy.
> >
> > out of curiosity, have you tried rendering at the macroblock/block level?
>
>yes, this was my first idea. The downside is that this 'dirty' field is only
>used for the slice_copy and for nothing else. Then I thought a bit about it
>and came
>to this idea: The idct data is currently stored in a int16_t data[64 * 6]
>for a MB. Why not make one big array [64*6*mb_width*mb_height] for all MBs
>and additionally one status variable for every MB? It should be possible
>then to skip the final transfer step in mb_intra and mb_inter and instead
>perform the transfer at once at the end of decoder_iframe and
>decoder_pframe. The advantage would be that non-coded blocks don't need to be
>copied from ref to current frame, only MBs that have changed (intra/inter)
>need to be copied directly over the ref frame.
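A minimal sketch of the layout described above, assuming one contiguous int16_t array of 64*6 coefficients per macroblock plus one status byte per macroblock. The names frame_blocks, mb_status, frame_blocks_init, frame_blocks_free and mb_block are hypothetical and are not taken from the xvid source.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* hypothetical per-macroblock status values */
enum mb_status { MB_NOT_CODED = 0, MB_INTRA = 1, MB_INTER = 2 };

/* hypothetical container: idct data and status for every MB in a frame */
typedef struct {
    int mb_width, mb_height;
    int16_t *data;    /* 64 * 6 coefficients per macroblock */
    uint8_t *status;  /* one mb_status value per macroblock */
} frame_blocks;

/* allocate zeroed storage; returns 0 on success, -1 on failure */
static int frame_blocks_init(frame_blocks *fb, int mb_width, int mb_height)
{
    assert(fb != NULL && mb_width > 0 && mb_height > 0);
    size_t n = (size_t)mb_width * (size_t)mb_height;
    fb->mb_width = mb_width;
    fb->mb_height = mb_height;
    fb->data = calloc(n * 64 * 6, sizeof *fb->data);
    fb->status = calloc(n, sizeof *fb->status);
    if (fb->data == NULL || fb->status == NULL) {
        free(fb->data);
        free(fb->status);
        fb->data = NULL;
        fb->status = NULL;
        return -1;
    }
    return 0;
}

static void frame_blocks_free(frame_blocks *fb)
{
    free(fb->data);
    free(fb->status);
    fb->data = NULL;
    fb->status = NULL;
}

/* pointer to the 64 coefficients of one 8x8 block (0..5) of MB (mx, my) */
static int16_t *mb_block(frame_blocks *fb, int mx, int my, int block)
{
    assert(block >= 0 && block < 6);
    assert(mx >= 0 && mx < fb->mb_width && my >= 0 && my < fb->mb_height);
    size_t mb = (size_t)my * (size_t)fb->mb_width + (size_t)mx;
    return fb->data + (mb * 6 + (size_t)block) * 64;
}
```

With this layout, mb_intra and mb_inter would only write coefficients and set the status, and a later pass over the status array would do the transfer. Whether that pays off depends on cache behaviour, since the array for a full frame is much larger than a single MB's data.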

hmm, this should offer a significant improvement. don't forget that bframe
decoding needs copies of the two most recent reference frames.

cache really comes into play here, so including the image_output stage inside
the decoder stage is a sacrifice i'm happy to make.
i'm willing to bet macroblock-level is faster. mbs are always 16x16 and an
mmx/sse2 transfer routine can be written specifically for the task.
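
a rough sketch of the macroblock-level copy, assuming a per-macroblock dirty flag array and luma only. the names copy_mb_16x16, copy_dirty_mbs and mb_dirty are hypothetical, and the plain C row copy is only a stand-in for a dedicated mmx/sse2 routine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MB_SIZE 16

/* copy one 16x16 luma macroblock, row by row
 * (plain C stand-in for a dedicated mmx/sse2 routine) */
static void copy_mb_16x16(uint8_t *dst, const uint8_t *src, int stride)
{
    for (int y = 0; y < MB_SIZE; y++) {
        memcpy(dst, src, MB_SIZE);
        dst += stride;
        src += stride;
    }
}

/* copy only the macroblocks flagged dirty and clear their flags;
 * returns the number of macroblocks copied */
static int copy_dirty_mbs(uint8_t *dst, const uint8_t *src, int stride,
                          uint8_t *mb_dirty, int mb_width, int mb_height)
{
    assert(stride >= mb_width * MB_SIZE);
    int copied = 0;
    for (int my = 0; my < mb_height; my++) {
        for (int mx = 0; mx < mb_width; mx++) {
            uint8_t *flag = &mb_dirty[my * mb_width + mx];
            if (!*flag)
                continue;
            size_t off = (size_t)my * MB_SIZE * (size_t)stride
                       + (size_t)mx * MB_SIZE;
            copy_mb_16x16(dst + off, src + off, stride);
            *flag = 0;
            copied++;
        }
    }
    return copied;
}
```

the chroma planes would need the same treatment with 8x8 blocks, which is left out here.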

>
>Also with the help of the status variable, the "dirty" blocks could then be
>copied during image_output using a copy_slice like function...
>
>Again: I didn't test this yet, maybe I missed something and it won't work.
>I still have some uni work to do now and tomorrow before I can test this.
>
>btw: pete, you wrote that you wanted to rewrite the bframe decoding
>support - anything done already?

some design only; this weekend flew past rather quickly.

basically i want to fix chemns code such that bframes can be decoded without 
"unnecessary" delay.

there are three types of avis out there.
1. xvid/divx4/divx5  : no bframes, low_delay is not specified
2. divx500+bframes   : bframes (unpacked), low_delay is not specified
3. divx501+bframes   : bframes (packed), low_delay is not specified, 
includes divx 'p' identifier string

notes:
- the mpeg4 iso docs say, if low_delay is not specified then we must assume 
low_delay = 0.
- there is no way to distinguish type 2 from type 1 (other than the decoder
encountering a bframe)
- xvid+bframes always specifies the low_delay.

the problem is: some decoder frontends can handle delay whilst others can't.
my solution is to add an "enable_decoder_delay" global flag which changes the
way xvid handles bframes.

when enable_decoder_delay = true:
	we will have an "output_valid" flag in the DEC_FRAME struct.
	this is set by the decoder; when it is not set, the frontend should
	ignore (not display) the output frame.

when enable_decoder_delay = false:
	basically, xvid will do its best to decode the frame without delay.
	if low_delay is not specified we will assume low_delay=1 (unless the
divx 'p' identifier is detected, or we come across a bframe).
	this ensures 100% compatibility with types 1 and 3, whilst being roughly 
compatible with type 2.
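
a minimal sketch of the decision described above, in C. the stream_info struct and effective_low_delay function are hypothetical names for illustration only. it also assumes that with enable_decoder_delay = true an unspecified low_delay follows the iso default of 0, which the proposal implies but does not state outright.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* hypothetical summary of what the decoder has seen in the stream so far */
typedef struct {
    bool low_delay_specified;  /* stream carries an explicit low_delay */
    int  low_delay;            /* only meaningful if specified */
    bool divx_packed_id;       /* divx 'p' identifier string detected */
    bool seen_bframe;          /* at least one bframe encountered */
} stream_info;

/* returns the low_delay value the decoder should act on */
static int effective_low_delay(const stream_info *s, bool enable_decoder_delay)
{
    assert(s != NULL);
    if (s->low_delay_specified)
        return s->low_delay;   /* e.g. xvid+bframes always specifies it */
    if (enable_decoder_delay)
        return 0;              /* assumption: follow the iso default */
    if (s->divx_packed_id || s->seen_bframe)
        return 0;              /* types 3 and 2 (once a bframe shows up) */
    return 1;                  /* type 1: decode without delay */
}
```

for a type 2 stream the result flips from 1 to 0 at the first bframe, which is where the "roughly compatible" caveat comes from.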

what do you think?

>
> > btw, i'm quite pleased to hear xvid's decoder is fast.
>
>yesterday I committed some additional asm code that was again tweaked by
>Skal. He especially enhanced the dequant4_mmx code, it's now nearly twice
>as fast as before (!!). I think we should now always beat ffmpeg and there
>still seems to be some room left for improvements :-)
>

wow, mmx'ing the mpeg-4 quantizers was a real headache. this skal person is
extremely talented.

-- pete


