[XviD-devel] PlainC optimization
Christoph Lampert
chl at math.uni-bonn.de
Tue Mar 4 20:30:36 CET 2003
On Tue, 4 Mar 2003, Michael Militzer wrote:
> Hi,
>
> have you checked xmm for comparison? Is it much faster than mmx?
Well, I can't check on the P2, because that one doesn't have xmm. On the
P3, the MMX speedup is larger than on the P2:
PLAINC - interp- h-round0  0.323 usec  iCrc=8107
PLAINC -           round1  0.330 usec  iCrc=8100
PLAINC - interp- v-round0  0.325 usec  iCrc=8108
PLAINC -           round1  0.306 usec  iCrc=8105
PLAINC - interp-hv-round0  0.496 usec  iCrc=8112
PLAINC -           round1  0.499 usec  iCrc=8103
---
MMX    - interp- h-round0  0.273 usec  iCrc=8107
MMX    -           round1  0.281 usec  iCrc=8100
MMX    - interp- v-round0  0.298 usec  iCrc=8108
MMX    -           round1  0.293 usec  iCrc=8105
MMX    - interp-hv-round0  0.432 usec  iCrc=8112
MMX    -           round1  0.432 usec  iCrc=8103
---
MMXEXT - interp- h-round0  0.211 usec  iCrc=8107
MMXEXT -           round1  0.214 usec  iCrc=8100
MMXEXT - interp- v-round0  0.161 usec  iCrc=8108
MMXEXT -           round1  0.170 usec  iCrc=8105
MMXEXT - interp-hv-round0  0.267 usec  iCrc=8112
MMXEXT -           round1  0.268 usec  iCrc=8103
> BTW: I believe that my interpolate8x8_avg2 function should be faster than
> the normal interpolate8x8_[h,v] functions (at least mmx vs. mmx). However I
> think I never exactly profiled the functions (or if I did I forgot about
> it). I remember that I had planned to replace the old mmx interpolation code
> with the new one from avg2 but then didn't have the time/forgot about it.
>
> So I would be quite interested to know how avg2 mmx performs vs. normal
> interpolate[h,v] mmx. And since you are currently profiling anyway... ;-)
>
You seem to be right! The MMX version seems to be faster.
--- P2 ---
PLAINC - inter-avg2_h_round0  0.609 usec  iCrc=8107
PLAINC -              round1  0.610 usec  iCrc=8100
MMX    - inter-avg2_h_round0  0.387 usec  iCrc=8107
MMX    -              round1  0.387 usec  iCrc=8100
--- P3 ---
PLAINC - inter-avg2_h_round0  0.392 usec  iCrc=8107
PLAINC -              round1  0.392 usec  iCrc=8100
MMX    - inter-avg2_h_round0  0.249 usec  iCrc=8107
MMX    -              round1  0.251 usec  iCrc=8100
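
For anyone who wants to compare the two approaches in plain C: the avg2 idea is roughly "average two source blocks with rounding", and a half-pel h or v interpolation then becomes avg2 of the block with its right or lower neighbour. The sketch below only illustrates that idea and is not the code from the tree. The names avg2_8xh, halfpel_h_via_avg2 and halfpel_v_via_avg2, the parameter order and the fixed width of 8 are assumptions made for this example. The matching iCrc values above (8107/8100 for both interp-h and avg2_h) are at least consistent with the two paths producing the same output.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: dst = (src1 + src2 + 1 - rounding) >> 1 for an
 * 8-pixel-wide block of `height` rows. Names and signature are assumed. */
static void avg2_8xh(uint8_t *dst, const uint8_t *src1, const uint8_t *src2,
                     uint32_t stride, uint32_t rounding, uint32_t height)
{
    const uint32_t r = 1 - rounding;   /* computed once, not per pixel */
    uint32_t i, j;

    assert(rounding <= 1);
    for (j = 0; j < height; j++) {
        for (i = 0; i < 8; i++)
            dst[i] = (uint8_t)((src1[i] + src2[i] + r) >> 1);
        dst  += stride;
        src1 += stride;
        src2 += stride;
    }
}

/* Horizontal half-pel: average each pixel with its right neighbour.
 * Reads 9 pixels per row. */
static void halfpel_h_via_avg2(uint8_t *dst, const uint8_t *src,
                               uint32_t stride, uint32_t rounding)
{
    avg2_8xh(dst, src, src + 1, stride, rounding, 8);
}

/* Vertical half-pel: average each pixel with the one below it.
 * Reads 9 rows. */
static void halfpel_v_via_avg2(uint8_t *dst, const uint8_t *src,
                               uint32_t stride, uint32_t rounding)
{
    avg2_8xh(dst, src, src + stride, stride, rounding, 8);
}
```

Note that the caller has to provide one extra column (h) or one extra row (v) of readable source pixels, as with the normal interpolation functions.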
gruel
> >
> > Sorry, me again...
> > I just checked the same for P2 450 MHz (MMX, no MMXEXT).
> >
> > === test block motion ===
> > PLAINC - interp- h-round0  0.502 usec  iCrc=8107
> > PLAINC -           round1  0.511 usec  iCrc=8100
> > PLAINC - interp- v-round0  0.504 usec  iCrc=8108
> > PLAINC -           round1  0.475 usec  iCrc=8105
> > PLAINC - interp-hv-round0  0.771 usec  iCrc=8112
> > PLAINC -           round1  0.775 usec  iCrc=8103
> > ---
> > MMX    - interp- h-round0  0.454 usec  iCrc=8107
> > MMX    -           round1  0.455 usec  iCrc=8100
> > MMX    - interp- v-round0  0.466 usec  iCrc=8108
> > MMX    -           round1  0.466 usec  iCrc=8105
> > MMX    - interp-hv-round0  0.670 usec  iCrc=8112
> > MMX    -           round1  0.671 usec  iCrc=8103
> >
> >
> > Is this possible? Plain MMX really can't do better than _that_?
> > Ouch...
> >
> > gruel
> >
> >
> >
> >
> >
> > On Tue, 4 Mar 2003, Christoph Lampert wrote:
> >
> > > Hi,
> > >
> > > if anyone out there is bored: XVID has lots of places where C-code can be
> > > optimized (in particular many routines for which MMX equivalents exist
> > > are not optimized at all):
> > >
> > > I did the simplest tasks for interpolate8x8: Loop unrolling, removal of
> > > dependencies, removal of redundant calculations (of "1-rounding" in this
> > > case)
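
To illustrate those three steps for the horizontal half-pel case, here is a minimal sketch in plain C. It is not the actual XviD code: the names halfpel_h_naive and halfpel_h_unrolled, the parameter order and the fixed 8x8 block size are assumptions made for this example. The first function recomputes "1 - rounding" for every pixel inside a nested loop; the second computes it once, unrolls the inner loop, and writes eight results that do not depend on each other.

```c
#include <assert.h>
#include <stdint.h>

/* Baseline (hypothetical): nested loops, "1 - rounding" redone per pixel. */
static void halfpel_h_naive(uint8_t *dst, const uint8_t *src,
                            uint32_t stride, uint32_t rounding)
{
    uint32_t i, j;

    assert(rounding <= 1);
    for (j = 0; j < 8; j++) {
        for (i = 0; i < 8; i++) {
            int32_t tot = (int32_t)src[j * stride + i]
                        + (int32_t)src[j * stride + i + 1];
            tot = (tot + 1 - (int32_t)rounding) >> 1;
            dst[j * stride + i] = (uint8_t)tot;
        }
    }
}

/* Sketch of the optimized form (hypothetical):
 *  - "1 - rounding" is hoisted out of the loops,
 *  - the inner loop is unrolled,
 *  - each store depends only on src, never on an earlier result. */
static void halfpel_h_unrolled(uint8_t *dst, const uint8_t *src,
                               uint32_t stride, uint32_t rounding)
{
    const uint32_t r = 1 - rounding;
    uint32_t j;

    assert(rounding <= 1);
    for (j = 0; j < 8; j++) {
        dst[0] = (uint8_t)((src[0] + src[1] + r) >> 1);
        dst[1] = (uint8_t)((src[1] + src[2] + r) >> 1);
        dst[2] = (uint8_t)((src[2] + src[3] + r) >> 1);
        dst[3] = (uint8_t)((src[3] + src[4] + r) >> 1);
        dst[4] = (uint8_t)((src[4] + src[5] + r) >> 1);
        dst[5] = (uint8_t)((src[5] + src[6] + r) >> 1);
        dst[6] = (uint8_t)((src[6] + src[7] + r) >> 1);
        dst[7] = (uint8_t)((src[7] + src[8] + r) >> 1);
        dst += stride;
        src += stride;
    }
}
```

Both functions should produce identical output for the same input, which is what an unchanged checksum before and after such a rewrite would indicate. How much of the gain comes from each step will depend on the compiler and CPU.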
> > >
> > > Of course it's not important for everyone with MMX, but if it helps on
> > > other platforms and doesn't make the code too unreadable... why not?
> > >
> > >
> > >
> > > before:
> > > === test block motion ===
> > > PLAINC - interp- h-round0  1.992 usec  iCrc=8107
> > > PLAINC -           round1  1.990 usec  iCrc=8100
> > > PLAINC - interp- v-round0  1.989 usec  iCrc=8108
> > > PLAINC -           round1  1.989 usec  iCrc=8105
> > > PLAINC - interp-hv-round0  3.181 usec  iCrc=8112
> > > PLAINC -           round1  3.180 usec  iCrc=8103
> > >
> > > after:
> > > === test block motion ===
> > > PLAINC - interp- h-round0  0.322 usec  iCrc=8107
> > > PLAINC -           round1  0.329 usec  iCrc=8100
> > > PLAINC - interp- v-round0  0.343 usec  iCrc=8108
> > > PLAINC -           round1  0.306 usec  iCrc=8105
> > > PLAINC - interp-hv-round0  0.496 usec  iCrc=8112
> > > PLAINC -           round1  0.497 usec  iCrc=8103
> > >
> > >
> > > Yeah, I'm such a super-hero ;-)))
> > >
> > > gruel
> > >
> > >
> > > _______________________________________________
> > > XviD-devel mailing list
> > > XviD-devel at xvid.org
> > > http://list.xvid.org/mailman/listinfo/xvid-devel
> > >