[XviD-devel] [RFC] Decoder speedup
Edouard Gomez
ed.gomez at free.fr
Sun Jun 13 22:32:37 CEST 2004
Hey hey,
In my continuing effort to get a faster decoder, I tweaked
get_coeff this afternoon. Though the speedup isn't that great
(only about 2%), it suggests that our bitstream functions are
pretty inefficient, or that our usage pattern of these functions
is not optimal.
First, the benchmarks, on a 640x480 sequence lasting 1384.9 s.
For those who are not used to mplayer benchmarks: they consist of
playing the video as fast as the codec allows, thus
benchmarking codec performance only (plus the video pipeline,
but its overhead is not worth counting). This is
obtained using the options -vc xvid -benchmark -vo null -nosound.
BENCHMARKs: VC: 255,768s VO: 0,189s A: 0,000s Sys: 12,763s = 268,720s <-- 1.0.1
BENCHMARKs: VC: 249,709s VO: 0,193s A: 0,000s Sys: 13,287s = 263,189s <-- 1.0.1 + get_coeff optimization
BENCHMARKs: VC: 222,033s VO: 0,196s A: 0,000s Sys: 12,269s = 234,498s <-- head
BENCHMARKs: VC: 217,613s VO: 0,187s A: 0,000s Sys: 12,141s = 229,940s <-- head + get_coeff optimization
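To make the percentages easy to recheck, here is a small sketch of the arithmetic on the figures above. The helper names time_saved_pct and speedup_pct are made up for this illustration and are not part of mplayer or XviD; the decimal commas in the mplayer output become decimal points in C.

```c
#include <stdio.h>

/* Percentage of run time saved when going from `before` to `after`
 * seconds (both must be > 0). */
static double time_saved_pct(double before, double after)
{
    return 100.0 * (before - after) / before;
}

/* Throughput gain in percent: how much faster the `after` run is,
 * relative to the `before` run, on the same sequence. */
static double speedup_pct(double before, double after)
{
    return 100.0 * (before / after - 1.0);
}

/* Print both views for one pair of timings. */
static void report(const char *label, double before, double after)
{
    printf("%-32s saved %5.2f%%, faster by %5.2f%%\n",
           label, time_saved_pct(before, after), speedup_pct(before, after));
}
```

Calling report("1.0.1 VC, get_coeff", 255.768, 249.709) and report("head VC, get_coeff", 222.033, 217.613) gives a time saving of roughly 2.4% and 2.0% respectively, and speedup_pct(268.720, 229.940) on the total times gives roughly 16.9%.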
So now the reason why I call for comments...
I think that if we want to speed up the decoder, we have only
a few choices left (profiling the code no longer shows any
really good candidate):
- Merging some MPEG4 decoding operations (like DCT + zigzag,
zigzag + dequant, memtransfer + interpolation, etc.)
- Using the bitstream functions as little as we can. A simple
uint32_t cache variable simplifies the code a lot for the
compiler and often improves speed, as my little experiment
in get_coeff shows (get_coeff + get_intra_block
+ get_inter_block together take less total time, and gcc has
even decided to inline get_coeff, which suggests the function
is now much less complex).
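As an illustration of the second point, here is a minimal sketch of the "local uint32_t cache" idea. It is not the actual get_coeff change and does not use XviD's real bitstream API: BitReader, load32, get_bits and read_fields_cached are hypothetical names, and the fixed-width fields stand in for real VLC decoding.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reader state, not XviD's Bitstream struct. */
typedef struct {
    const uint8_t *buf;  /* input bytes */
    size_t size;         /* number of bytes in buf */
    size_t bitpos;       /* absolute position of the next unread bit */
} BitReader;

/* Load the 32 bits that start at bit position `bitpos`, MSB first.
 * Bytes past the end of the buffer read as zero. */
static uint32_t load32(const BitReader *br, size_t bitpos)
{
    uint64_t acc = 0;
    size_t byte = bitpos >> 3;
    int i;

    for (i = 0; i < 5; i++) {
        uint8_t b = (byte + i < br->size) ? br->buf[byte + i] : 0;
        acc = (acc << 8) | b;
    }
    /* acc holds 40 bits; keep the 32 that follow the (bitpos & 7)
     * bits already consumed in the first byte. */
    return (uint32_t)(acc >> (8 - (bitpos & 7)));
}

/* Baseline: every read goes through the reader struct (1 <= n <= 32). */
static unsigned get_bits(BitReader *br, int n)
{
    unsigned v = load32(br, br->bitpos) >> (32 - n);
    br->bitpos += (size_t)n;
    return v;
}

/* Cached variant: read `count` fields of `width` bits (1 <= width <= 8).
 * The bits live in a local uint32_t that the compiler can keep in a
 * register; the struct is only touched to refill and once at the end. */
static void read_fields_cached(BitReader *br, int width, int count,
                               unsigned *out)
{
    size_t pos = br->bitpos;
    uint32_t cache = load32(br, pos);
    int avail = 32;  /* valid bits left in cache */
    int i;

    for (i = 0; i < count; i++) {
        if (avail < width) {            /* refill from the current position */
            cache = load32(br, pos);
            avail = 32;
        }
        out[i] = cache >> (32 - width); /* peek the top `width` bits */
        cache <<= width;                /* consume them */
        avail -= width;
        pos += (size_t)width;
    }
    br->bitpos = pos;                   /* write back once */
}
```

Both paths return the same values; the difference is only in how often the reader state is loaded and stored, which is the part the compiler has trouble optimizing across function calls.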
So, though I know how to complete the second proposed change, I
have only very limited experience with the first one.
Michael, Pete (or any other reader), maybe you could
advise me a bit on what I should work on.
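To give the discussion of the first proposal something concrete, here is a rough sketch of one possible merge, zigzag + dequant, for an inter block. It assumes H.263-style inverse quantization and (run, level) events that are already decoded; RunLevel and place_dequant_h263_inter are hypothetical names, not existing XviD functions, and this is not a proposal for the final code.

```c
#include <stdint.h>
#include <string.h>

/* Classic 8x8 zigzag scan: scan position -> raster index in the block. */
static const uint8_t zigzag[64] = {
     0,  1,  8, 16,  9,  2,  3, 10,
    17, 24, 32, 25, 18, 11,  4,  5,
    12, 19, 26, 33, 40, 48, 41, 34,
    27, 20, 13,  6,  7, 14, 21, 28,
    35, 42, 49, 56, 57, 50, 43, 36,
    29, 22, 15, 23, 30, 37, 44, 51,
    58, 59, 52, 45, 38, 31, 39, 46,
    53, 60, 61, 54, 47, 55, 62, 63
};

/* Hypothetical decoded event: `run` zeros followed by one `level`. */
typedef struct {
    int run;
    int level;
} RunLevel;

/* Write each coefficient to its final raster position, already
 * dequantized, instead of filling the block in scan order and running
 * a separate dequant pass over all 64 entries afterwards.
 * Assumes H.263-style inter dequantization:
 *   |F| = 2 * quant * |level| + (quant odd ? quant : quant - 1)
 * clipped to [-2048, 2047]. */
static void place_dequant_h263_inter(int16_t block[64], const RunLevel *ev,
                                     int n, int quant)
{
    const int quant_m2 = quant << 1;
    const int quant_add = (quant & 1) ? quant : quant - 1;
    int pos = 0;
    int i;

    memset(block, 0, 64 * sizeof(int16_t));

    for (i = 0; i < n; i++) {
        int level = ev[i].level;
        int v;

        pos += ev[i].run;
        if (pos > 63)
            break;                 /* guard against a corrupt stream */
        if (level == 0) {          /* should not happen; leave the zero */
            pos++;
            continue;
        }
        if (level < 0) {
            v = -(quant_m2 * -level + quant_add);
            if (v < -2048) v = -2048;
        } else {
            v = quant_m2 * level + quant_add;
            if (v > 2047) v = 2047;
        }
        block[zigzag[pos]] = (int16_t)v;
        pos++;
    }
}
```

The gain, if any, would come from touching only the nonzero coefficients; whether it shows up in practice would have to be measured with the same mplayer benchmark.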
PS: note that xvid head is already 16.8% faster than 1.0.1
with only minor changes; that's what encourages me to
continue my decoder work... because if I were just
looking at ffmpeg numbers I would give up right now
(150 s on the same sequence).
--
Edouard Gomez