[XviD-devel] Request: optimized version of image_setedges

Alban Bedel xvid-devel@xvid.org
Mon, 8 Jul 2002 02:50:02 +0200


Hi Christoph Lampert,

on Sun, 7 Jul 2002 12:24:38 +0200 (CEST) you wrote:

> Hi,
> 
> I just saw in profiling SMP that image_setedges() is one
> of the slowest parts in XviD now. I doubt that this is needed!
> 
> I guess the reason is many loops and many calls to library functions
> memcpy/memset for very small memory blocks of 32 or even 16 bytes, which
> could be done by loop unrolling or MMX-copy much faster.
> 
> There could be a fixed "copy 16 bytes by MMX" inlined function and
> something tricky for memset(), too. However, I don't know enough
> MMX/assembler for that.
> 
> Anyone else?
> 

I know nothing about asm programming, but MPlayer has optimized versions of
memcpy for MMX, MMX2, 3DNow! and SSE2 which may be useful. They are written
in inline asm, though, so they would need to be rewritten :( It may be a good
start if someone is interested. The code is in the following files:
  libvo/fastmemcpy.h
  libvo/aclib.c
  libvo/aclib_template.c 
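
To make the idea of fixed-size copies concrete, here is a rough plain C
sketch (no MMX). It is not taken from MPlayer or XviD: copy16, fill16 and
pad_row are made-up names, and it assumes the edge width is a multiple of 16
bytes and that the buffer has room for the edges on both sides of the row.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy exactly 16 bytes. With a constant size,
 * compilers usually expand memcpy inline instead of calling the library. */
static inline void copy16(uint8_t *dst, const uint8_t *src)
{
    memcpy(dst, src, 16);
}

/* Hypothetical helper: fill 16 bytes with one value via two 8-byte stores. */
static inline void fill16(uint8_t *dst, uint8_t value)
{
    /* Replicate the byte into all 8 bytes of a 64-bit word. */
    uint64_t v = value * 0x0101010101010101ULL;
    memcpy(dst, &v, 8);
    memcpy(dst + 8, &v, 8);
}

/* Hypothetical helper: pad one row by repeating its first pixel into the
 * `edge` bytes before it and its last pixel into the `edge` bytes after it.
 * Assumption: edge is a positive multiple of 16. */
static void pad_row(uint8_t *row, int width, int edge)
{
    assert(edge > 0 && edge % 16 == 0);
    for (int i = 0; i < edge; i += 16) {
        fill16(row - edge + i, row[0]);
        fill16(row + width + i, row[width - 1]);
    }
}
```

An MMX version would replace the bodies of copy16 and fill16 with two movq
loads/stores each; the callers would stay the same.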

	Albeu