[XviD-devel] New Trellis Quant

James Bilotto jb13 at gomerbud.com
Mon May 12 12:26:03 CEST 2003


On Mon, May 12, 2003 at 11:27:34AM +0200, Christoph Lampert wrote:
> On Mon, 12 May 2003, Marco Al wrote:
> > Well, bzero vs memset(0, ...) on a modern PC isn't really relevant anymore;
> > instruction count in the inner loop is hardly going to be the limiting
> > factor on a modern PC for such simple code (the only really important
> > improvement which would be helpful in some instances would be to use
> > non-temporal stores, but memset wouldn't know when exactly to use them).
> 
> Maybe not, but still I hate to waste cycles ;-) Memset() writes arbitrary
> bytes to arbitrarily many possibly non-aligned positions.
> 
> XVID needs to write exactly 128 zero bytes to (16-byte?) aligned
> positions. A simple unrolled loop of int/long/double/long double or
> anything is three times as fast as memset(). All 4 data types compile to
> the same asm code btw. (gcc 2.95). MMX is even faster. 
[...]
>  
> P.S. Okay, maybe the main advantage would be that handwritten bzero/memset
> could be inlined. After all, this routine is called 6 to 12 times per
> macroblock. 
> 
Yes, those speeds are about the same as what I get, and now I find that in
FreeBSD at least, memset() is done with bzero() & bcopy(). Could I see the code
you used for the test?
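
For reference, the kind of routine discussed above (clearing exactly 128
bytes at an aligned address with an unrolled loop instead of a general
memset()) could be sketched roughly as follows. This is only an illustration,
not the code used for the timings in this thread and not XviD's actual
routine: the names block_t, clear_block and clear_block_memset are
hypothetical, and the layout (64 coefficients of 16 bits each) is an
assumption chosen to give 128 bytes.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical block type: 64 coefficients * 2 bytes = 128 bytes.
 * The union lets us view the same storage as 16 64-bit words and
 * gives the block 8-byte alignment without any pointer casts. */
typedef union {
    int16_t  coeff[64];
    uint64_t words[16];
} block_t;

/* Fully unrolled clear: 16 fixed-size stores, no length or
 * alignment checks, and small enough for the compiler to inline. */
static inline void clear_block(block_t *b)
{
    uint64_t *w = b->words;

    w[0]  = 0; w[1]  = 0; w[2]  = 0; w[3]  = 0;
    w[4]  = 0; w[5]  = 0; w[6]  = 0; w[7]  = 0;
    w[8]  = 0; w[9]  = 0; w[10] = 0; w[11] = 0;
    w[12] = 0; w[13] = 0; w[14] = 0; w[15] = 0;
}

/* General-purpose baseline to compare against in a benchmark. */
static inline void clear_block_memset(block_t *b)
{
    memset(b, 0, sizeof *b);
}
```

Both functions leave the block in the same state; whether the unrolled
version is actually faster depends on the compiler and libc, so it would need
to be measured on the target machine.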

