[XviD-devel] PlainC optimization

Christoph Lampert chl at math.uni-bonn.de
Wed Mar 5 13:05:31 CET 2003


On Tue, 4 Mar 2003, Michael Militzer wrote:

> Hi,
> 
> have you checked xmm for comparison? Is it much faster than mmx?
> 
> BTW: I believe that my interpolate8x8_avg2 function should be faster than
> the normal interpolate8x8_[h,v] functions (at least mmx vs. mmx). However I
> think I never exactly profiled the functions (or if I did I forgot about
> it). I remember that I had planned to replace the old mmx interpolation code
> with the new one from avg2 but then didn't have the time/forgot about it.
> 
> So I would be quite interested to know how avg2 mmx performs vs. normal
> interpolate[h,v] mmx. And since you are currently profiling anyway... ;-)

Hm, avg4 seems to be worse, btw. I guess keeping 4 pointers takes too many
registers compared to simply addressing +1, +stride and +stride+1 from a
single pointer. And using 2 or 4 pointers (avg2, avg4) is not any faster in
PlainC either.
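
For illustration, a minimal sketch of the two addressing styles compared
here, for the hv (4-sample) case. The function names and signatures are
hypothetical and not taken from the XviD sources; the rounding term
(2 - rounding) is an assumption about how round0/round1 are applied. The
first version derives all neighbours from one pointer, the second keeps
four live pointers, which is where the extra register pressure would come
from.

```c
#include <stdint.h>

/* Hypothetical sketch: one source pointer, neighbours addressed as
   +1, +stride and +stride+1. */
void interp8x8_hv_offsets(uint8_t *dst, const uint8_t *src,
                          int stride, int rounding)
{
    for (int y = 0; y < 8; y++) {
        for (int x = 0; x < 8; x++) {
            int sum = src[x] + src[x + 1]
                    + src[x + stride] + src[x + stride + 1];
            /* assumed rounding: +2 for round0, +1 for round1 */
            dst[x] = (uint8_t)((sum + 2 - rounding) >> 2);
        }
        dst += stride;
        src += stride;
    }
}

/* Hypothetical sketch: the same average, but fed by four independent
   source pointers that all have to be kept live and advanced. */
void avg4_8x8_pointers(uint8_t *dst,
                       const uint8_t *s1, const uint8_t *s2,
                       const uint8_t *s3, const uint8_t *s4,
                       int stride, int rounding)
{
    for (int y = 0; y < 8; y++) {
        for (int x = 0; x < 8; x++) {
            int sum = s1[x] + s2[x] + s3[x] + s4[x];
            dst[x] = (uint8_t)((sum + 2 - rounding) >> 2);
        }
        dst += stride;
        s1 += stride;
        s2 += stride;
        s3 += stride;
        s4 += stride;
    }
}
```

Calling avg4_8x8_pointers(dst, src, src + 1, src + stride,
src + stride + 1, stride, rounding) gives the same output as the
single-pointer version, which matches the identical iCrc values for the
interp-hv and avg4 rows below.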

gruel

***************************************************************
P2 - 450 MHz
***************************************************************
 ===  test block motion ===
PLAINC - interp- h-round0 0.508 usec       iCrc=8107
PLAINC -           round1 0.510 usec       iCrc=8100
PLAINC -   avg2- h-round0 0.621 usec       iCrc=8107
PLAINC -           round1 0.615 usec       iCrc=8100
PLAINC - interp- v-round0 0.513 usec       iCrc=8108
PLAINC -           round1 0.482 usec       iCrc=8105
PLAINC -   avg2- v-round0 0.615 usec       iCrc=8108
PLAINC -           round1 0.615 usec       iCrc=8105
PLAINC - interp-hv-round0 0.777 usec       iCrc=8112
PLAINC -           round1 0.784 usec       iCrc=8103
PLAINC -   avg4_hv_round0 1.240 usec       iCrc=8112
PLAINC -           round1 1.242 usec       iCrc=8103
 --- 
MMX    - interp- h-round0 0.224 usec       iCrc=8107
MMX    -           round1 0.237 usec       iCrc=8100
MMX    -   avg2- h-round0 0.206 usec       iCrc=8107
MMX    -           round1 0.206 usec       iCrc=8100
MMX    - interp- v-round0 0.225 usec       iCrc=8108
MMX    -           round1 0.234 usec       iCrc=8105
MMX    -   avg2- v-round0 0.204 usec       iCrc=8108
MMX    -           round1 0.206 usec       iCrc=8105
MMX    - interp-hv-round0 0.336 usec       iCrc=8112
MMX    -           round1 0.340 usec       iCrc=8103
MMX    -   avg4_hv_round0 0.478 usec       iCrc=8112
MMX    -           round1 0.478 usec       iCrc=8103

*****************************************************************
P3 - 700MHz
*****************************************************************
 ===  test block motion ===
PLAINC - interp- h-round0 0.328 usec       iCrc=8107
PLAINC -           round1 0.328 usec       iCrc=8100
PLAINC -   avg2- h-round0 0.396 usec       iCrc=8107
PLAINC -           round1 0.397 usec       iCrc=8100
PLAINC - interp- v-round0 0.329 usec       iCrc=8108
PLAINC -           round1 0.310 usec       iCrc=8105
PLAINC -   avg2- v-round0 0.396 usec       iCrc=8108
PLAINC -           round1 0.396 usec       iCrc=8105
PLAINC - interp-hv-round0 0.500 usec       iCrc=8112
PLAINC -           round1 0.505 usec       iCrc=8103
PLAINC -   avg4_hv_round0 0.798 usec       iCrc=8112
PLAINC -           round1 0.798 usec       iCrc=8103
 --- 
MMX    - interp- h-round0 0.147 usec       iCrc=8107
MMX    -           round1 0.148 usec       iCrc=8100
MMX    -   avg2- h-round0 0.127 usec       iCrc=8107
MMX    -           round1 0.128 usec       iCrc=8100
MMX    - interp- v-round0 0.142 usec       iCrc=8108
MMX    -           round1 0.148 usec       iCrc=8105
MMX    -   avg2- v-round0 0.129 usec       iCrc=8108
MMX    -           round1 0.127 usec       iCrc=8105
MMX    - interp-hv-round0 0.216 usec       iCrc=8112
MMX    -           round1 0.219 usec       iCrc=8103
MMX    -   avg4_hv_round0 0.295 usec       iCrc=8112
MMX    -           round1 0.294 usec       iCrc=8103
 --- 
MMXEXT - interp- h-round0 0.054 usec       iCrc=8107
MMXEXT -           round1 0.070 usec       iCrc=8100
MMXEXT - interp- v-round0 0.050 usec       iCrc=8108
MMXEXT -           round1 0.070 usec       iCrc=8105
MMXEXT - interp-hv-round0 0.111 usec       iCrc=8112
MMXEXT -           round1 0.109 usec       iCrc=8103
 --- 



