[XviD-devel] Quality optimization

Christoph Lampert chl at math.uni-bonn.de
Tue Feb 25 18:50:45 CET 2003


Hi Skal, 

since you are still (or again) amongst us ;-) 
I saw in your xvid_bench.c code that it checks e.g. SAD speed on 
16*16 arrays (so stride is 16) and only in perfect alignment. Isn't that
atypical, because in "real life" the stride should be something like 720,
in any case much larger than a cache line, and also only the "Reference"
pointer would be aligned, not "Current"? Or doesn't this matter on x86? 
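
To make the two cases concrete, here is a minimal sketch in plain C. It is
not the xvid_bench.c code; sad16_ref, fill_plane and pack16 are names made
up for illustration, and the 720-byte row width is just the example stride
from above. The same SAD routine can be pointed either at packed 16*16
buffers (stride 16) or at blocks sitting at odd byte offsets inside a wide
plane (stride 720), and both must give the same result:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical plain C reference SAD over a 16x16 block.
 * Both blocks are read with the same stride (bytes per row). */
static uint32_t sad16_ref(const uint8_t *cur, const uint8_t *ref, size_t stride)
{
    uint32_t sad = 0;
    assert(stride >= 16);              /* rows must not overlap */
    for (size_t y = 0; y < 16; y++)
        for (size_t x = 0; x < 16; x++)
            sad += (uint32_t)abs((int)cur[y * stride + x] -
                                 (int)ref[y * stride + x]);
    return sad;
}

/* Fill a w*h plane with a deterministic pseudo-random pattern (simple LCG). */
static void fill_plane(uint8_t *p, size_t w, size_t h, uint32_t seed)
{
    for (size_t i = 0; i < w * h; i++) {
        seed = seed * 1103515245u + 12345u;
        p[i] = (uint8_t)(seed >> 16);
    }
}

/* Copy a 16x16 block out of a strided plane into a packed buffer
 * (stride 16), i.e. the layout a 16*16 micro-benchmark would use. */
static void pack16(uint8_t dst[256], const uint8_t *src, size_t stride)
{
    assert(stride >= 16);
    for (size_t y = 0; y < 16; y++)
        for (size_t x = 0; x < 16; x++)
            dst[y * 16 + x] = src[y * stride + x];
}
```

With two 720*32 planes a and b, timing sad16_ref(pa, pb, 16) on packed
copies against sad16_ref(a + 3, b + 5*720 + 7, 720) over many iterations
would be one way to see whether stride and alignment matter. I have not
measured this.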

Also, even though your code is so fast, I didn't find any
"prefetch" instructions in ASM or C, whereas ffmpeg's SAD routines are full
of them. Didn't you test them, or didn't they yield a speedup? 
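
For reference, this is roughly what I mean by prefetching, as a hedged
sketch and not ffmpeg's or XviD's actual code. __builtin_prefetch is a
GCC/Clang builtin; the PREFETCH macro and sad16_prefetch are hypothetical
names, and on other compilers the macro simply does nothing. While row y is
being summed, the next row of each block is requested from memory:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Read-prefetch hint with high temporal locality on GCC/Clang,
 * a no-op elsewhere. Purely a hint: it never changes the result. */
#if defined(__GNUC__) || defined(__clang__)
#define PREFETCH(p) __builtin_prefetch((p), 0, 3)
#else
#define PREFETCH(p) ((void)0)
#endif

/* Hypothetical 16x16 SAD that prefetches the next row of both blocks. */
static uint32_t sad16_prefetch(const uint8_t *cur, const uint8_t *ref,
                               size_t stride)
{
    uint32_t sad = 0;
    assert(stride >= 16);              /* rows must not overlap */
    for (size_t y = 0; y < 16; y++) {
        const uint8_t *c = cur + y * stride;
        const uint8_t *r = ref + y * stride;
        if (y < 15) {                  /* there is no next row after the last */
            PREFETCH(c + stride);
            PREFETCH(r + stride);
        }
        for (size_t x = 0; x < 16; x++)
            sad += (uint32_t)abs((int)c[x] - (int)r[x]);
    }
    return sad;
}
```

Whether the hint buys anything with a large stride is exactly the open
question; the sketch only shows where such instructions would go.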
 
gruel 


On 25 Feb 2003, skal wrote:
> 	almost forgot this one too:
> 
> On Wed, 2003-01-22 at 20:01, Marco Al wrote:
> > Christoph Lampert wrote:
> > 
> > >> Do we have some timings for an 8-bit Hadamard transform yet?
> > >
> > > I did some a while ago of skal's MMXEXT(?) version and posted them to
> > > the list. I don't remember exactly, but it might have been twice the
> > > speed of DCT, but half the speed of SAD.
> > 
> > The unattributed asm code only managed 173 cycles with 8 bits according to
> > the source; that is not twice as fast as DCT AFAIK.
> > 
> 	here's a C/MMX/SSE version of the Hadamard transform (16bits).
> 	Without the 'pshufw' re-ordering, output columns are re-ordered
> 	according to: [03127465]. C-version spits the correct order...
> 	Note: Output is also scaled by 8.
> 
> 
> 	bye!
> 
> 		Skal
> 


