[XviD-devel] Question about bvop decoding

skal skal at planet-d.net
Wed Jul 21 11:07:45 CEST 2004


	Hi all,

On Tue, 2004-07-20 at 21:24, Edouard Gomez wrote:
> Christoph Lampert (chl at math.uni-bonn.de) wrote:
> > valgrind/cachegrind seems to produce results similar to yours,
> > decode_bf_interpolate_mbinter has 14% of instructions, and 
> > 5.3% of total CPU cycles. With both, it's top of the list, followed by 
> > decoder_bframes (7.3% of instructions) and decode_mbinter(6.8%). 
> 
> Glad to see i'm not crazy, and/or my box doesn't behave like
> being part of the 4th dimension !
> 
> > The largest portion is due to complicated calculation of 
> > 
> > const uint8_t *const src = refn + (int)((y+(dy>>1))*stride+x+(dx>>1));
> > 
> > and the less complicated 
> > 
> > uint8_t *const dst = cur + (int)(y*stride+x);
> > 
> > switch (((dx&1)<<1)+(dy&1)) {
> >  
> > Those are in fact not in decoder.c, but inlined from 
> > interpolate8x8_switch(), which is called 6 times per MB. 
> > So I guess that high number of cycles is due to counting inlined code. 
> > Have you maybe checked how big interpolate_mbinter is in the ASM step?

	This is most probably the bigger part, more than the above
	calculations... Unfortunately, gprof can't instrument the
	ASM code.


> 
> I'm still amazed the CK kernel could bring 15% improvement for
> free (of course that implies you do nothing else but decoding)

	Let's reverse the point of view: how could previous
	kernels spend 15% of their time doing counter-productive
	things? :))



	But back on topic:

	Ed, come on, there's no need to use complicated profiling
	techniques to see where interpolate mode could be
	improved: for interp mode, you're doing:

	a) fwd predict into buf1
	b) bwd predict into buf2
	c) average buf1 and buf2 and send to pic.

	Now it's pretty obvious you're losing
	time in memory I/O and 16/8-bit conversion
	during steps b) and c). These two could be
	merged into a single 'averaging' bwd predicting
	step.
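
	A rough C sketch of the idea, not the actual XviD code:
	the function names, the 8x8 block size, the full-pel-only
	copy and the (a+b+1)>>1 rounding are all assumptions made
	for illustration, and the 16/8-bit conversion is not
	modelled. The first function does a), b), c) through two
	temporary buffers; the second writes the fwd prediction
	straight to the picture and averages the bwd prediction
	into it in one pass.

```c
#include <stdint.h>
#include <string.h>

#define BLK 8  /* assumed block size for this sketch */

/* Three passes: a) fwd -> buf1, b) bwd -> buf2, c) average -> dst.
 * Hypothetical helper, full-pel prediction only. */
static void interp_three_pass(uint8_t *dst, const uint8_t *fwd,
                              const uint8_t *bwd, int stride)
{
    uint8_t buf1[BLK * BLK], buf2[BLK * BLK];

    for (int j = 0; j < BLK; j++) {
        memcpy(buf1 + j * BLK, fwd + j * stride, BLK);  /* step a) */
        memcpy(buf2 + j * BLK, bwd + j * stride, BLK);  /* step b) */
    }
    for (int j = 0; j < BLK; j++)                       /* step c) */
        for (int i = 0; i < BLK; i++)
            dst[j * stride + i] =
                (uint8_t)((buf1[j * BLK + i] + buf2[j * BLK + i] + 1) >> 1);
}

/* Two passes: a) fwd -> dst, then b)+c) fused: read the bwd
 * reference and average it into dst, with no temporary buffers. */
static void interp_fused(uint8_t *dst, const uint8_t *fwd,
                         const uint8_t *bwd, int stride)
{
    for (int j = 0; j < BLK; j++)                       /* step a) */
        memcpy(dst + j * stride, fwd + j * stride, BLK);

    for (int j = 0; j < BLK; j++)                       /* steps b)+c) */
        for (int i = 0; i < BLK; i++)
            dst[j * stride + i] =
                (uint8_t)((dst[j * stride + i] + bwd[j * stride + i] + 1) >> 1);
}
```

	Both give the same pixels; the fused one just skips the
	round trip through buf1 and buf2.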

	Now, if you dig very hard into this mailing list's
	archive, you may find that I once, a long time
	ago, sent ASM code that does combined b)+c) steps...:)

	bye!

Skal
 


