[XviD-devel] What are the most important optimization opportunities?
Christoph Lampert
xvid-devel@xvid.org
Tue, 21 Jan 2003 13:23:34 +0100 (CET)
On Sun, 19 Jan 2003, Felix von Leitner wrote:
> I'm interested in spending time helping optimize the xvid sources.
> I have next to no understanding of how MPEG 4 works, but I am currently
> trying to learn MMX, SSE and 3dnow, so I would be willing to convert a
> few functions if they lend themselves to vectorization.
Two things come to mind:
a) Almost all C functions for which SIMD equivalents exist are heavily
unoptimized, e.g. transferX-X-etc. Those would certainly need a speedup,
e.g. through loop unrolling, prefetching, etc.
The trick could be that the MMX/SSE versions _are_ very good (some of them
are even _extremely_ good), and that this is not only due to MMX/SSE itself
but also to memory management etc., so it might be worth studying the MMX
code to see what to do in the C parts.
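
To make the loop-unrolling idea concrete, here is a minimal sketch in plain
C. The function names are hypothetical and the code is only modelled on the
general shape of an 8x8 "transfer" helper (copy 8-bit pixels into a 16-bit
block); it is not taken from the xvid sources.

```c
#include <stdint.h>

/* Straightforward version: two nested loops over an 8x8 block.
 * Hypothetical name, for illustration only. */
void copy_8to16_plain(int16_t *dst, const uint8_t *src, uint32_t stride)
{
    for (uint32_t j = 0; j < 8; j++)
        for (uint32_t i = 0; i < 8; i++)
            dst[j * 8 + i] = (int16_t)src[j * stride + i];
}

/* Same result with the inner loop unrolled by hand and the index
 * multiplications replaced by pointer increments. */
void copy_8to16_unrolled(int16_t *dst, const uint8_t *src, uint32_t stride)
{
    for (int j = 0; j < 8; j++) {
        dst[0] = src[0];
        dst[1] = src[1];
        dst[2] = src[2];
        dst[3] = src[3];
        dst[4] = src[4];
        dst[5] = src[5];
        dst[6] = src[6];
        dst[7] = src[7];
        dst += 8;        /* next row of the packed 8x8 block */
        src += stride;   /* next row of the source image */
    }
}
```

Whether the unrolled form is actually faster depends on compiler and CPU,
so it would have to be benchmarked against the plain loop.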
Other functions, like image_input, might be much faster when caching of the
source is avoided (streaming data transfer). There are also parts where we
call "memset(&p,0x00,size)" for arrays of fixed size. This might be much
more elegant to do with an unrolled MMX-XOR macro (or am I talking rubbish
here?)
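
A rough sketch of what such an unrolled zeroing routine could look like,
written with compiler intrinsics instead of an assembler macro. The block
size of 128 bytes and the name zero_block are assumptions made for the
example, and the pointer is assumed to be 8-byte aligned. Where MMX is not
available the code simply falls back to memset.

```c
#include <stdint.h>
#include <string.h>

#if defined(__MMX__)
#include <mmintrin.h>
#endif

/* Assumed block size: 64 coefficients of 16 bits each. */
#define BLOCK_BYTES 128

/* Zero one fixed-size block. Hypothetical helper; p is assumed to be
 * 8-byte aligned. */
void zero_block(int16_t *p)
{
#if defined(__MMX__)
    __m64 *q = (__m64 *)p;
    const __m64 zero = _mm_setzero_si64();  /* typically a pxor */

    /* 16 stores of 8 bytes each, unrolled by hand. */
    q[0]  = zero; q[1]  = zero; q[2]  = zero; q[3]  = zero;
    q[4]  = zero; q[5]  = zero; q[6]  = zero; q[7]  = zero;
    q[8]  = zero; q[9]  = zero; q[10] = zero; q[11] = zero;
    q[12] = zero; q[13] = zero; q[14] = zero; q[15] = zero;

    _mm_empty();  /* emms: leave the FPU state usable again */
#else
    memset(p, 0, BLOCK_BYTES);
#endif
}
```

Compilers often expand a memset of known, small size inline, so this is
only worth keeping if a measurement shows a gain.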
b) Another thing is _rounding_ in general. Everything that is connected to
motion (MotionEst, MotionComp) has to do rounding, and the standard
describes exactly which rounding has to be used:
Round to -infinity, round to 0, round away from 0, round to +infinity
Some of these are simple and fast (e.g. >>1 rounds to -infinity), some are
simple and slow (e.g. /2 rounds to 0), but the other two are more
difficult. Maybe you can come up with faster versions than we already
have, e.g. make them CPU-specific (in portab.h) because jumps and division
might get different penalties on different CPUs.
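
For reference, the four modes for "divide by 2" can be sketched in C as
follows. The function names are made up for this example and are not from
portab.h. All shift-based variants assume that >> on a negative int is an
arithmetic shift, which C leaves implementation-defined but which the usual
compilers provide, and that x is small enough for x+1 not to overflow.

```c
/* Round to -infinity: floor(x / 2). Assumes arithmetic right shift. */
int div2_floor(int x)
{
    return x >> 1;
}

/* Round to 0: this is what integer division does in C99. */
int div2_trunc(int x)
{
    return x / 2;
}

/* Round to 0 without a division: add 1 to negative x, then shift. */
int div2_trunc_shift(int x)
{
    return (x + (x < 0)) >> 1;
}

/* Round to +infinity: ceil(x / 2). */
int div2_ceil(int x)
{
    return (x + 1) >> 1;
}

/* Round away from 0: ceil for x >= 0, floor for x < 0. */
int div2_away(int x)
{
    return (x + (x >= 0)) >> 1;
}
```

For x = -3 these give -2 (floor), -1 (to 0), -1 (ceil) and -2 (away from
0); for x = 3 they give 1, 1, 2 and 2. Which variant is cheapest on a given
CPU would still have to be measured.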
gruel