[XviD-devel] Question about bvop decoding

Edouard Gomez ed.gomez at free.fr
Tue Jul 20 01:33:37 CEST 2004


Christoph Lampert (chl at math.uni-bonn.de) wrote:
> could this be an error in measurement/profiling? With gprof I get 1%-2% for
> decoder_bf_interpolate_mbinter() if ASM is switched on. This was the case
> for QCIF foreman, and also for 720p parkrun.

I ran these tests on Naruto episodes; since it's anime, xvid
uses a lot of interpolated/direct blocks.

> Since the encode was a low quant, the clearly dominating routine
> is get_coeff(), followed by get_inter_block_h263.
> 
>  33.97      0.53     0.53 12445358     0.00     0.00  get_coeff
>  12.82      0.73     0.20   489109     0.00     0.00  get_inter_block_h263
>   7.69      0.85     0.12                             idct_3dne
>   7.05      0.96     0.11                             transfer8x8_copy_3dne
>   7.05      1.07     0.11                             yv12_to_yv12_xmm
>   3.85      1.13     0.06       17     3.53    29.34  decoder_bframe
>   3.85      1.19     0.06                             interpolate8x8_halfpel_h_3dne
>   3.21      1.24     0.05                             interpolate8x8_halfpel_v_3dne
>   2.56      1.28     0.04   113085     0.00     0.01  decoder_mb_decode
>   2.56      1.32     0.04    60449     0.00     0.01  decoder_mbinter
>   2.56      1.36     0.04                             image_brightness_mmx
>   2.56      1.40     0.04                             interpolate8x8_halfpel_hv_3dne
>   1.92      1.43     0.03    54701     0.00     0.00  decoder_bf_interpolate_mbinter
>   1.92      1.46     0.03       15     2.00    28.93  decoder_pframe
>   1.28      1.48     0.02   489109     0.00     0.00  get_inter_matrix

Hmm, I don't get a profile like that at all...
See:
http://ed.gomez.free.fr/vrac/profile-gprof.txt
http://ed.gomez.free.fr/vrac/profile-oprofile.txt

Note that the oprofile one doesn't seem to include some
functions, like get_coeff, even though they rank high in the
gprof profile. I'm wondering if gcc isn't doing some magic by
emitting both inlined and normal copies of these functions, so
that the gprof profile includes all symbols, whereas oprofile,
which only knows about symbol addresses, doesn't see the
inlined copies.
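
To illustrate the guess above, here is a minimal sketch. It is
not xvid code: decode_level, sum_levels and get_level_decoder
are made-up names, and whether gcc really inlines the helper
depends on the optimization flags used for the build.

```c
#include <stddef.h>

/* Hypothetical stand-in for a small, hot helper such as get_coeff().
 * Maps an unsigned code to a signed level: even -> +n/2, odd -> -(n/2). */
static inline int decode_level(int code)
{
    return (code & 1) ? -(code >> 1) : (code >> 1);
}

/* Hot caller. With optimization enabled, gcc may inline decode_level()
 * into this loop. A profiler that maps sampled addresses to the
 * enclosing symbol would then charge that time to sum_levels(). */
int sum_levels(const int *codes, size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += decode_level(codes[i]);
    return sum;
}

typedef int (*level_fn)(int);

/* Taking the helper's address forces the compiler to also emit an
 * out-of-line copy, so a decode_level symbol exists in the object
 * file next to any inlined copies. */
level_fn get_level_decoder(void)
{
    return decode_level;
}
```

In such a build, the same helper exists both inlined and as a
normal function, so two profilers can disagree about where its
time goes.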

NB: the oprofile profile doesn't match the gprof run at all,
but it gives you a rough idea of why I claim the bf
interpolating function eats up to 14% (even 16% in this run).

Btw, one thing is sure: our decoder is slow only on sequences
that use bvops; simple profile ones decode at normal speed
(still slower than ffmpeg, but not as far behind as it is
with clips that use bvops).

-- 
Edouard Gomez

