[XviD-devel] A SSIM Plugin for XviD
skal
skal65535 at orange.fr
Mon Oct 30 12:42:42 CET 2006
Hi Johannes und alles
(this rhymes too;)
> Message of 28/10/06 19:31
> From: "Johannes Reinhardt" <Johannes.Reinhardt at uni-konstanz.de>
>
> >>>> The SSE2 implementation of consim is not faster than the MMX version
> >>>> on any of the CPUs I tested (Pentium IV and Pentium M). Is there a
> >>>> chance to speed it up, or should I disable SSE2? Or is SSE2 perhaps
> >>>> faster on other CPUs?
> >>>>
> >
> > Note: the MMX version consim_mmx uses 'pshufw', which is an SSE+ instruction.
> >
> I replaced it by a copy and a shift. I couldn't think of a better way to
> do it.
looks ok to me.. not a big deal.
> > [...]
> >
> > Anyway, I had a look at the C version of consim, and am
> > not sure it couldn't be made faster (and *then*
> > optimized in SSE ;). If I get you right, you are computing deviations
> > as <a-<a>><b-<b>>, where < > is the average operator \sum_i{a_i} / N
> > (and this is where it could also be \sum_i{a_i w_i} / \sum_i{w_i})
> >
> > Now, we have <a-<a>><b-<b>> = <ab> - <a><b> which is lighter (fewer subs).
btw, typo. Should be <(a-<a>)(b-<b>)> = <ab> - <a><b> of course.
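For illustration, expanding the product and using the linearity of < > gives
<(a-<a>)(b-<b>)> = <ab> - <a><b> - <a><b> + <a><b> = <ab> - <a><b>.
A minimal C sketch of the two forms follows. It is not the actual consim
code: the function names cov_two_pass and cov_one_pass, the 8-bit sample
type and the unweighted average are assumptions made for this example.

```c
#include <stddef.h>

/* Two-pass form: first compute the means, then average (a - <a>)(b - <b>).
 * Hypothetical helper, not taken from the XviD sources. */
double cov_two_pass(const unsigned char *a, const unsigned char *b, size_t n)
{
    double mean_a = 0.0, mean_b = 0.0, acc = 0.0;
    size_t i;

    for (i = 0; i < n; i++) {
        mean_a += a[i];
        mean_b += b[i];
    }
    mean_a /= (double)n;
    mean_b /= (double)n;

    /* One subtraction per sample and per signal. */
    for (i = 0; i < n; i++)
        acc += (a[i] - mean_a) * (b[i] - mean_b);

    return acc / (double)n;
}

/* One-pass form: <ab> - <a><b>, using three running integer sums.
 * The sums fit easily in unsigned long for small blocks of 8-bit samples. */
double cov_one_pass(const unsigned char *a, const unsigned char *b, size_t n)
{
    unsigned long sum_a = 0, sum_b = 0, sum_ab = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        sum_a  += a[i];
        sum_b  += b[i];
        sum_ab += (unsigned long)a[i] * b[i];
    }

    return (double)sum_ab / (double)n
         - ((double)sum_a / (double)n) * ((double)sum_b / (double)n);
}
```

The one-pass form needs only the running sums of a, b and a*b, so the
per-sample subtraction of the mean disappears and the block is read once.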
[...]
> Is there a better way of finding the right order of operations than
> brute-force trial and error? It seems the order of instructions is quite
> important for speed.
At this point it heavily depends on your compiler and platform.
So no, there's no silver bullet.
[...]
> Patches are here:
>
> http://xvid.ist-dein-freund.de/stuff/ssim_part2.diff
ok, applied. There was a little rounding error for pdevc/pcorr
in the ASM code (missing constant 32 before descaling >>6).
Should be ok now.
xvid_bench.c updated.
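The rounding fix mentioned above can be sketched in C as follows. This is
an illustration only, not the XviD ASM code: the function names are
hypothetical, and the only detail taken from this message is the constant
32 added before the >>6 descaling. Adding half of the divisor (64/2 = 32)
before the shift turns truncation into rounding to nearest.

```c
/* Descale a fixed-point value carrying 6 fractional bits (divide by 64).
 * Both helpers are hypothetical names used only for this sketch. */

/* Without the constant: the shift truncates, i.e. floor(x / 64). */
unsigned descale_trunc(unsigned x)
{
    return x >> 6;
}

/* With the constant 32: rounds to the nearest integer (halves round up). */
unsigned descale_round(unsigned x)
{
    return (x + 32) >> 6;
}
```

For example, descale_trunc(96) gives 1 while descale_round(96) gives 2,
since 96/64 = 1.5.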
> http://xvid.ist-dein-freund.de/stuff/encraw_stats_fix.diff
This one i don't understand: it seems to me totalPSNR[] is
already accumulated in every case...
bye!
Skal