[XviD-devel] Decoder performance
Christoph Lampert
chl at math.uni-bonn.de
Sat Aug 9 17:50:52 CEST 2003
On Sat, 9 Aug 2003, Edouard Gomez wrote:
> Well i was wrong for MVs < 0. As usual, the "most usual" rounding bug
> got me... i was using shifts instead of divisions.
>
> This patch replaces all shifts:
> foo = bar >> dec->quaterpel
> by
> foo = bar / (1+dec->quaterpel)
Hm, can't you handle this differently? As far as I know, integer division
isn't pipelined on any CPU architecture. When dividing by a constant,
compilers replace the division by a shift plus a rounding trick, but when
dividing by a variable, at least gcc doesn't manage to. For example,
foo = bar/2
becomes
movl %edx,%eax
shrl $31,%eax
addl %eax,%edx
sarl $1,%edx
So for our case, where quarterpel is either 0 or 1, we might use
foo = (bar+((bar>>31)&quarterpel))>>quarterpel;
Ugly, but should work. Anyone got a simpler idea?
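To check the idea, here is a minimal sketch in C that compares the shift expression with the truncating division from the patch. The function names (mv_div_ref, mv_div_shift, mv_div_mismatches) are made up for illustration and are not part of the XviD source. It assumes a 32-bit int, that quarterpel is 0 or 1, and that >> on a negative int is an arithmetic shift, which the C standard leaves implementation-defined but gcc provides.

```c
#include <assert.h>

/* Reference behaviour: truncating division, as in the patch.
 * The divisor is 1 (quarterpel == 0) or 2 (quarterpel == 1). */
int mv_div_ref(int bar, int quarterpel)
{
    return bar / (1 + quarterpel);
}

/* Shift-based variant.
 * (bar >> 31) is all ones for negative bar and 0 otherwise, so the
 * mask adds quarterpel (0 or 1) to negative values only. That bias
 * makes the following shift round toward zero, like the division.
 * Assumes 32-bit int and arithmetic right shift of signed values. */
int mv_div_shift(int bar, int quarterpel)
{
    return (bar + ((bar >> 31) & quarterpel)) >> quarterpel;
}

/* Count disagreements between the two variants for every value
 * in [lo, hi] and for both quarterpel settings. */
int mv_div_mismatches(int lo, int hi)
{
    int bad = 0;
    for (int q = 0; q <= 1; q++) {
        for (int v = lo; v <= hi; v++) {
            if (mv_div_ref(v, q) != mv_div_shift(v, q))
                bad++;
        }
    }
    return bad;
}
```

Calling mv_div_mismatches over a range that covers the motion vector values of interest should return 0 if the assumptions above hold on the target compiler. Whether the shift version is actually faster than the division in the decoder would still need to be profiled.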
gruel
> The code is still a lot more branchless. I used some LUTs for small sets
> of values like for bframe dquants or bframe mb type... even if these two
> functions are called not so often, their ranking in the profile
> changed... it's a bit better even if it's not an impressive speedup (you
> won't notice it ;-).
>
> I also added cbp test to skip complete macroblocks when possible.
>
> Still wanting some feedback to see if the patch impacts the decoding.
>
> --
> Edouard Gomez