[XviD-devel] A SSIM Plugin for XviD

skal skal65535 at orange.fr
Mon Oct 30 12:42:42 CET 2006


  Hi Johannes und alles

 (this rhymes too;)


> Message of 28/10/06 19:31
> From: "Johannes Reinhardt" <Johannes.Reinhardt at uni-konstanz.de>
>
> >>>> The SSE2 implementation of consim is not faster than the mmx version
> >>>> on any of the CPUs (Pentium IV and Pentium M) I tested. Is there a
> >>>> chance to speed it up or should I disable SSE2? Or is SSE2 perhaps
> >>>> faster on other CPUs?
> >>>>         
> >
> >    Note: the mmx version consim_mmx uses 'pshufw', which is an SSE+ instruction.
> >   
> I replaced it by a copy and a shift. I couldn't think of a better way to 
> do it.

    looks ok to me.. not a big deal.

> > [...]
> >
> >    Anyway, i had a look at the c version of consim, and am
> >    not sure it couldn't be made faster (and *then*
> >    optimized in SSE ;). If I get you right, you are computing deviations
> >    as <a-<a>><b-<b>>, where < > is the average operator \sum_i{a_i} / N
> >    (and this is where it could also be \sum_i{a_i w_i } / \sum_i { w_i })
> >
> >    Now, we have <a-<a>><b-<b>> = <ab> - <a><b> which is lighter (fewer subs).

   btw, typo. Should be <(a-<a>)(b-<b>)> = <ab> - <a><b> of course.
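
   For illustration, here is a minimal C sketch of that identity on one
   64-sample block. It is not the actual consim code: the names
   cov_direct, cov_onepass and BLOCK_N are hypothetical, and plain
   unweighted averages are assumed. Both functions return
   N*N * <(a-<a>)(b-<b>)> so that everything stays in exact integers.

```c
#include <stdint.h>

#define BLOCK_N 64  /* assumed block size (8x8 samples), hypothetical */

/* Two-pass form: first the sums, then the products of deviations.
   (N*a_i - sum_a) equals N*(a_i - <a>), so no division is needed
   inside the loop. */
static int64_t cov_direct(const uint8_t *a, const uint8_t *b)
{
    int64_t sum_a = 0, sum_b = 0, acc = 0;
    for (int i = 0; i < BLOCK_N; i++) {
        sum_a += a[i];
        sum_b += b[i];
    }
    for (int i = 0; i < BLOCK_N; i++) {
        acc += (BLOCK_N * (int64_t)a[i] - sum_a)
             * (BLOCK_N * (int64_t)b[i] - sum_b);
    }
    /* acc = N * (N*sum_ab - sum_a*sum_b), so this division is exact. */
    return acc / BLOCK_N;
}

/* One-pass form: <ab> - <a><b>, scaled by N*N.
   No per-sample subtraction at all. */
static int64_t cov_onepass(const uint8_t *a, const uint8_t *b)
{
    int64_t sum_a = 0, sum_b = 0, sum_ab = 0;
    for (int i = 0; i < BLOCK_N; i++) {
        sum_a  += a[i];
        sum_b  += b[i];
        sum_ab += (int64_t)a[i] * b[i];
    }
    return BLOCK_N * sum_ab - sum_a * sum_b;
}
```

   The one-pass form only needs three running sums per block, which is
   also the shape that maps most naturally onto multiply-accumulate
   SIMD instructions.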
[...]

> Is there a better way of finding the right order of operations than 
> brute-force trying? It seems that the order of instructions is quite 
> important for speed.

    At this point it heavily depends on your compiler and platform.
    So no, there's no silver bullet.
[...]

> Patches are here:
> 
> http://xvid.ist-dein-freund.de/stuff/ssim_part2.diff

   ok, applied. There was a little rounding error for pdevc/pcorr
   in the ASM code (missing constant 32 before descaling >>6). 
   Should be ok now.
   xvid_bench.c updated.
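
   As a small illustration of that rounding fix (a sketch only, with
   hypothetical helper names, not the code from the patch): a plain
   >>6 truncates, while adding half of the divisor (32 = 64/2) first
   rounds to nearest.

```c
#include <stdint.h>

/* Descale by 64 with truncation: what you get when the constant is missing. */
static uint32_t descale_trunc(uint32_t v)
{
    return v >> 6;
}

/* Descale by 64 with round-to-nearest: add 32 before shifting. */
static uint32_t descale_round(uint32_t v)
{
    return (v + 32) >> 6;
}
```

   For example, an input of 96 (that is, 1.5 * 64) gives 1 with the
   truncating version and 2 with the rounding one, so C and ASM code
   only agree when both apply the same bias.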

> http://xvid.ist-dein-freund.de/stuff/encraw_stats_fix.diff
   This one i don't understand: it seems to me totalPSNR[] is 
   already accumulated in every case...


   bye!
Skal


