[XviD-devel] Adding SSE2 asm code for color space transform functions

Xwen.Kong konsunwin at gmail.com
Fri Jun 19 09:16:34 CEST 2009


        Yes, I only tested it on an Intel Celeron processor. Are there any
tools or documents for evaluating the code's performance? At the moment I am
using the Intel(R) VTune(TM) Performance Analyzer.
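
Besides a profiler, one rough way to compare the two store variants discussed
below is a small timing loop. The following is only a minimal sketch in C with
SSE2 intrinsics, not the Xvid code: the function names (store_movdqu,
store_split, stores_match, time_store), the buffer size and the use of clock()
are all assumptions made for illustration. The numbers it gives are coarse and
will differ from CPU to CPU, which is exactly the point raised in the reply.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; also provides the SSE store intrinsics */
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: one unaligned 16-byte store (movdqu). */
static void store_movdqu(uint8_t *dst, __m128i v)
{
    _mm_storeu_si128((__m128i *)dst, v);
}

/* Hypothetical helper: two 8-byte stores (movlps + movhps). */
static void store_split(uint8_t *dst, __m128i v)
{
    __m128 f = _mm_castsi128_ps(v);        /* reinterpret bits, no conversion */
    _mm_storel_pi((__m64 *)dst, f);        /* low 8 bytes  */
    _mm_storeh_pi((__m64 *)(dst + 8), f);  /* high 8 bytes */
}

/* Returns 1 if both variants write the same 16 bytes. */
static int stores_match(void)
{
    uint8_t a[16], b[16];
    __m128i v = _mm_set_epi32(0x0f0e0d0c, 0x0b0a0908, 0x07060504, 0x03020100);
    store_movdqu(a, v);
    store_split(b, v);
    return memcmp(a, b, 16) == 0;
}

/* Times `iters` misaligned stores through `store`; returns CPU seconds.
 * clock() is coarse, so use a large iteration count and repeat the run. */
static double time_store(void (*store)(uint8_t *, __m128i), long iters)
{
    static uint8_t buf[4096 + 16];
    __m128i v = _mm_set1_epi8(0x5a);
    clock_t t0 = clock();
    for (long i = 0; i < iters; i++) {
        /* "+ 1" keeps the destination off a 16-byte boundary */
        store(buf + 1 + (i & 255) * 16, v);
    }
    clock_t t1 = clock();
    volatile uint8_t sink = buf[1];  /* read back so the stores are not dead */
    (void)sink;
    return (double)(t1 - t0) / CLOCKS_PER_SEC;
}
```

Calling time_store(store_movdqu, n) and time_store(store_split, n) with the
same large n on each target CPU gives a first impression; a cycle counter or
VTune would be needed for anything more precise.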

2009/6/19 Jason Garrett-Glaser <darkshikari at gmail.com>

> > movlps [edi + 32],xmm0   ;  movlps + movhps are faster than one movdqu :)
>
> Only on Athlon 64, probably.
>
> On Phenom and Nehalem it will be most definitely slower, and probably
> slower on basically everything else too.
>
> Also, since the shuffle unit is slow on the Conroe, that code will
> almost certainly be slower than the MMX version on Conroe.
>
> Dark Shikari
> _______________________________________________
> Xvid-devel mailing list
> Xvid-devel at xvid.org
> http://list.xvid.org/mailman/listinfo/xvid-devel
>

