[XviD-devel] PlainC optimization
Christoph Lampert
chl at math.uni-bonn.de
Wed Mar 5 13:05:31 CET 2003
On Tue, 4 Mar 2003, Michael Militzer wrote:
> Hi,
>
> have you checked xmm for comparison? Is it much faster than mmx?
>
> BTW: I believe that my interpolate8x8_avg2 function should be faster than
> the normal interpolate8x8_[h,v] functions (at least mmx vs. mmx). However I
> think I never exactly profiled the functions (or if I did I forgot about
> it). I remember that I had planned to replace the old mmx interpolation code
> with the new one from avg2 but then didn't have the time/forgot about it.
>
> So I would be quite interested to know how avg2 mmx performs vs. normal
> interpolate[h,v] mmx. And since you are currently profiling anyway... ;-)
Hm, avg4 seems to be worse, btw. I guess 4 pointers take up too many
registers, compared to simply addressing +1, +stride, +stride+1 off a
single pointer. And using 2 or 4 pointers (avg2, avg4) is no better in
PlainC either.
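
To illustrate the two addressing styles compared here, the following is a minimal sketch, not the actual XviD code. The function names (`interp8x8_hv_offsets`, `avg4_8x8`, `variants_agree`) are hypothetical, and the rounding formula `(sum + 2 - rounding) >> 2` is assumed to be the usual MPEG-4 half-pel rule. The first variant walks one source pointer and reaches the neighbours by offset; the second keeps four live source pointers, which is the register pressure suspected above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names for illustration; not the real XviD functions. */

/* One base pointer; neighbours addressed as +1, +stride, +stride+1. */
static void interp8x8_hv_offsets(uint8_t *dst, const uint8_t *src,
                                 int stride, int rounding)
{
    assert(stride >= 9);  /* each row reads 9 source pixels */
    for (int y = 0; y < 8; y++) {
        for (int x = 0; x < 8; x++) {
            int sum = src[x] + src[x + 1]
                    + src[x + stride] + src[x + stride + 1];
            /* assumed half-pel rounding rule */
            dst[x] = (uint8_t)((sum + 2 - rounding) >> 2);
        }
        src += stride;
        dst += stride;
    }
}

/* avg4 style: four independent source pointers advanced in lockstep. */
static void avg4_8x8(uint8_t *dst,
                     const uint8_t *s1, const uint8_t *s2,
                     const uint8_t *s3, const uint8_t *s4,
                     int stride, int rounding)
{
    for (int y = 0; y < 8; y++) {
        for (int x = 0; x < 8; x++) {
            int sum = s1[x] + s2[x] + s3[x] + s4[x];
            dst[x] = (uint8_t)((sum + 2 - rounding) >> 2);
        }
        s1 += stride; s2 += stride; s3 += stride; s4 += stride;
        dst += stride;
    }
}

/* Returns 1 if both variants produce identical output on a synthetic block. */
static int variants_agree(int rounding)
{
    enum { STRIDE = 16, ROWS = 9 };
    uint8_t src[STRIDE * ROWS], a[STRIDE * 8], b[STRIDE * 8];

    for (int i = 0; i < STRIDE * ROWS; i++)
        src[i] = (uint8_t)(i * 37 + 11);   /* arbitrary test pattern */
    memset(a, 0, sizeof a);
    memset(b, 0, sizeof b);

    interp8x8_hv_offsets(a, src, STRIDE, rounding);
    avg4_8x8(b, src, src + 1, src + STRIDE, src + STRIDE + 1,
             STRIDE, rounding);
    return memcmp(a, b, sizeof a) == 0;
}
```

When avg4 is fed the same block shifted by +1, +stride and +stride+1, both variants compute the same pixels, which matches the identical iCrc values for interp-hv and avg4 in the numbers below. Only the number of pointers kept live in the inner loop differs.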
gruel
***************************************************************
P2 - 450 MHz
***************************************************************
=== test block motion ===
PLAINC - interp- h-round0  0.508 usec  iCrc=8107
PLAINC -           round1  0.510 usec  iCrc=8100
PLAINC -   avg2- h-round0  0.621 usec  iCrc=8107
PLAINC -           round1  0.615 usec  iCrc=8100
PLAINC - interp- v-round0  0.513 usec  iCrc=8108
PLAINC -           round1  0.482 usec  iCrc=8105
PLAINC -   avg2- v-round0  0.615 usec  iCrc=8108
PLAINC -           round1  0.615 usec  iCrc=8105
PLAINC - interp-hv-round0  0.777 usec  iCrc=8112
PLAINC -           round1  0.784 usec  iCrc=8103
PLAINC -   avg4_hv_round0  1.240 usec  iCrc=8112
PLAINC -           round1  1.242 usec  iCrc=8103
---
MMX - interp- h-round0  0.224 usec  iCrc=8107
MMX -           round1  0.237 usec  iCrc=8100
MMX -   avg2- h-round0  0.206 usec  iCrc=8107
MMX -           round1  0.206 usec  iCrc=8100
MMX - interp- v-round0  0.225 usec  iCrc=8108
MMX -           round1  0.234 usec  iCrc=8105
MMX -   avg2- v-round0  0.204 usec  iCrc=8108
MMX -           round1  0.206 usec  iCrc=8105
MMX - interp-hv-round0  0.336 usec  iCrc=8112
MMX -           round1  0.340 usec  iCrc=8103
MMX -   avg4_hv_round0  0.478 usec  iCrc=8112
MMX -           round1  0.478 usec  iCrc=8103
*****************************************************************
P3 - 700 MHz
*****************************************************************
=== test block motion ===
PLAINC - interp- h-round0  0.328 usec  iCrc=8107
PLAINC -           round1  0.328 usec  iCrc=8100
PLAINC -   avg2- h-round0  0.396 usec  iCrc=8107
PLAINC -           round1  0.397 usec  iCrc=8100
PLAINC - interp- v-round0  0.329 usec  iCrc=8108
PLAINC -           round1  0.310 usec  iCrc=8105
PLAINC -   avg2- v-round0  0.396 usec  iCrc=8108
PLAINC -           round1  0.396 usec  iCrc=8105
PLAINC - interp-hv-round0  0.500 usec  iCrc=8112
PLAINC -           round1  0.505 usec  iCrc=8103
PLAINC -   avg4_hv_round0  0.798 usec  iCrc=8112
PLAINC -           round1  0.798 usec  iCrc=8103
---
MMX - interp- h-round0  0.147 usec  iCrc=8107
MMX -           round1  0.148 usec  iCrc=8100
MMX -   avg2- h-round0  0.127 usec  iCrc=8107
MMX -           round1  0.128 usec  iCrc=8100
MMX - interp- v-round0  0.142 usec  iCrc=8108
MMX -           round1  0.148 usec  iCrc=8105
MMX -   avg2- v-round0  0.129 usec  iCrc=8108
MMX -           round1  0.127 usec  iCrc=8105
MMX - interp-hv-round0  0.216 usec  iCrc=8112
MMX -           round1  0.219 usec  iCrc=8103
MMX -   avg4_hv_round0  0.295 usec  iCrc=8112
MMX -           round1  0.294 usec  iCrc=8103
---
MMXEXT - interp- h-round0  0.054 usec  iCrc=8107
MMXEXT -           round1  0.070 usec  iCrc=8100
MMXEXT - interp- v-round0  0.050 usec  iCrc=8108
MMXEXT -           round1  0.070 usec  iCrc=8105
MMXEXT - interp-hv-round0  0.111 usec  iCrc=8112
MMXEXT -           round1  0.109 usec  iCrc=8103
---