[XviD-devel] What are the most important optimization opportunities?

Christoph Lampert xvid-devel@xvid.org
Tue, 21 Jan 2003 13:23:34 +0100 (CET)


On Sun, 19 Jan 2003, Felix von Leitner wrote:

> I'm interested in spending time helping optimize the xvid sources.
> I have next to no understanding of how MPEG 4 works, but I am currently
> trying to learn MMX, SSE and 3dnow, so I would be willing to convert a
> few functions if they lend themselves to vectorization.

Two things come to mind: 

a) Almost all C functions for which SIMD equivalents exist are heavily
unoptimized, e.g. transferX-X-etc. Those would certainly benefit from a
speedup, e.g. loop unrolling, prefetch, etc. 

The trick could be that the MMX/SSE versions _are_ very good (some of them
are even _extremely_ good), and that is due not only to the MMX/SSE
instructions but also to memory management etc., so it might be worth
studying the MMX code to see what to do in the C parts. 
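
As a rough illustration of the kind of loop unrolling meant here, a
minimal sketch in plain C. The name copy_8to16_block and its signature
are made up for this example and are not the actual XviD transfer
functions; it only assumes an 8x8 block of 8-bit pixels being widened
into a contiguous 16-bit block.

```c
#include <stdint.h>

/* Hypothetical stand-in for an 8-bit -> 16-bit block transfer.
 * Copies an 8x8 block from src (rows `stride` bytes apart) into
 * dst (64 contiguous int16_t values). The inner loop over x is
 * unrolled by hand, so only the row loop remains. */
void copy_8to16_block(int16_t *dst, const uint8_t *src, int stride)
{
    int y;

    for (y = 0; y < 8; y++) {
        dst[0] = src[0];
        dst[1] = src[1];
        dst[2] = src[2];
        dst[3] = src[3];
        dst[4] = src[4];
        dst[5] = src[5];
        dst[6] = src[6];
        dst[7] = src[7];
        dst += 8;        /* destination rows are packed */
        src += stride;   /* source rows follow the image stride */
    }
}
```

Whether this actually beats what the compiler produces from the rolled
loop has to be measured per compiler and CPU.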

Other functions, like image_input, might be much faster when caching of
the source is avoided (streaming data transfer). There are also parts
where we call "memset(&p,0x00,size)" for arrays of fixed size. This might
be much more elegant to do with an unrolled MMX-XOR macro (or am I talking
rubbish here?)
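
To make the idea concrete, a portable C sketch with 64-bit stores
standing in for MMX registers. The union coeff_block, the macro CLEAR_Q4
and the function clear_block are all hypothetical names for this
example; the block size of 64 coefficients is an assumption too. A real
MMX version would zero a register with pxor and store it with movq.

```c
#include <stdint.h>

/* Hypothetical fixed-size block: 64 16-bit coefficients, also
 * viewable as 16 64-bit words (the size of one MMX register). */
typedef union {
    int16_t  coeff[64];
    uint64_t q[16];
} coeff_block;

/* Clear four consecutive 64-bit words starting at index i. */
#define CLEAR_Q4(b, i)            \
    do {                          \
        (b)->q[(i)]     = 0;      \
        (b)->q[(i) + 1] = 0;      \
        (b)->q[(i) + 2] = 0;      \
        (b)->q[(i) + 3] = 0;      \
    } while (0)

/* Fully unrolled clear of the whole block: 16 stores, no loop,
 * no size argument and no call into the C library. */
void clear_block(coeff_block *b)
{
    CLEAR_Q4(b, 0);
    CLEAR_Q4(b, 4);
    CLEAR_Q4(b, 8);
    CLEAR_Q4(b, 12);
}
```

Many compilers already expand memset with a constant size into
something similar, so this would need benchmarking before replacing
anything.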


b) Another thing is _rounding_ in general. Everything that is connected to 
motion (MotionEst, MotionComp) has to do rounding, and the standard
describes exactly which rounding has to be used: 
round to -infinity, round to 0, round away from 0, round to +infinity.

Some of these are simple and fast (e.g. >>1 rounds to -infinity), some are
simple and slow (e.g. /2 rounds to 0), but the other two are more
difficult. Maybe you can come up with faster versions than we already
have, e.g. make them CPU-specific (in portab.h) because jumps and division
might get different penalties on different CPUs. 
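
For the divide-by-2 case, the four rounding modes can be sketched
without jumps or division as below. The function names are made up for
this example and are not the ones in portab.h. All four assume that >>
on a negative int is an arithmetic shift, which C leaves
implementation-defined, and that x + 1 does not overflow (fine for
motion vector ranges). Whether they are faster than the existing
versions depends on the CPU.

```c
/* Division by 2 in the four rounding modes, branch-free.
 * Assumption: >> on negative ints is an arithmetic shift
 * (implementation-defined in C, true on common compilers). */

/* Round to -infinity: -3 -> -2, 3 -> 1 */
int div2_floor(int x)
{
    return x >> 1;
}

/* Round to 0, same result as x / 2: -3 -> -1, 3 -> 1.
 * (x < 0) is 1 for negative x, which biases the shift upwards. */
int div2_trunc(int x)
{
    return (x + (x < 0)) >> 1;
}

/* Round to +infinity: -3 -> -1, 3 -> 2 */
int div2_ceil(int x)
{
    return (x + 1) >> 1;
}

/* Round away from 0: -3 -> -2, 3 -> 2.
 * Positive x is rounded up, negative x is rounded down. */
int div2_away(int x)
{
    return (x + (x > 0)) >> 1;
}
```

The comparisons (x < 0) and (x > 0) usually compile to a set-on-condition
or a shift of the sign bit rather than a jump, but that is worth checking
in the generated assembly.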


gruel