[XviD-devel] New Trellis Quant

Christoph Lampert chl at math.uni-bonn.de
Tue May 13 17:55:31 CEST 2003


On Tue, 13 May 2003, Christoph Lampert wrote:

> On Mon, 12 May 2003, James Bilotto wrote:
> > On Mon, May 12, 2003 at 11:27:34AM +0200, Christoph Lampert wrote:
> > > P.S. Okay, maybe the main advantage would be that handwritten bzero/memset
> > > could be inlined. After all, this routine is called 6 to 12 times per
> > > macroblock. 
> > > 
> > yes, those speeds are about the same as what i get, and now i find that in
> > freebsd at least memset() is done with bzero() & bcopy(). could i see the code
> > you used for the test?
> 
> It's just a plain framework for measuring the speed of small routines, but
> sure, here it is.
> I guess I have modified it a couple of times since then.
> On gcc you have to compile with -O2 instead of -O3, otherwise the unrolled
> routines are simply removed (or called just once).

Off-topic, but I was just checking the Portland Group C compiler (pgc), and
storing doubles (8 bytes) using "fstl" is indeed faster than storing
4-byte ints, at least on a Pentium 3. Strange to see that gcc replaces the
explicitly cast doubles with shorter ints.

aligned:
       _real_ double C                  0.290000 s 1644 MB/s
unaligned:
       _real_ double C                  0.520000 s 916 MB/s
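
For readers without the test framework, here is a minimal sketch of the two
store widths being compared. It is not the code from the benchmark: the
names clear_int32, clear_double and all_zero are made up for illustration,
and it assumes an 8-byte IEEE 754 double (where 0.0 is all bits zero), a
suitably aligned buffer, and a size that is a multiple of 8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: clear nbytes using 4-byte integer stores.
 * Assumes nbytes is a multiple of 4 and dst is suitably aligned. */
static void clear_int32(void *dst, size_t nbytes)
{
    int32_t *p = dst;
    size_t i;

    for (i = 0; i < nbytes / sizeof(int32_t); i++)
        p[i] = 0;
}

/* Hypothetical helper: clear nbytes using 8-byte double stores.
 * Assumes sizeof(double) == 8, an IEEE 754 representation in which
 * 0.0 is all bits zero, nbytes a multiple of 8, and dst aligned
 * for double. */
static void clear_double(void *dst, size_t nbytes)
{
    double *p = dst;
    size_t i;

    for (i = 0; i < nbytes / sizeof(double); i++)
        p[i] = 0.0;
}

/* Returns 1 if every byte in buf is zero, 0 otherwise. */
static int all_zero(const void *buf, size_t nbytes)
{
    const unsigned char *b = buf;
    size_t i;

    for (i = 0; i < nbytes; i++)
        if (b[i] != 0)
            return 0;
    return 1;
}
```

Both helpers leave the same bytes in memory; only the width of each store
differs. A compiler is free to rewrite either loop (into narrower stores or
a memset call, for example), so the generated assembly has to be checked
before drawing any conclusion from timings.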


gruel 



