[XviD-devel] [PATCH] calc_cbp_sse2 optimization

Mat Hostetter mat at curl.com
Sun Apr 18 17:43:55 CEST 2004


>>>>> "ed.gomez" == Edouard Gomez <ed.gomez at free.fr> writes:

 >> This change uses pcmpgtb/pmovmskb to extract a zero/nonzero mask,
 >> rather than the longer sequence used previously, and eliminates
 >> all conditional branches.  I also changed a movdqu to movdqa; if
 >> there's a reason the load might be unaligned (some bogus platform
 >> that can't align static arrays mod 16?) please let me know.

 ed.gomez> The blocks passed to calc_cbp should be aligned as they're
 ed.gomez> allocated on stack and aligned by DECLARE_ALIGNED_ARRAY (or
 ed.gomez> matrix, never remember its name).

Actually the unaligned load was for the static constant 'ignore_dc'
mask, rather than the array argument.  It seems like every effort was
made to make this data aligned, but the code was using movdqu anyway.
This made me worry that someone had discovered that alignment
sometimes doesn't work on one of your platforms, but there's no
comment to that effect.  Here's the original code:

%ifdef FORMAT_COFF
SECTION .rodata data
%else
SECTION .rodata data align=16
%endif

ALIGN 16
ignore_dc:
  dw 0, -1, -1, -1, -1, -1, -1, -1
...
  movdqu xmm7, [ignore_dc] ; mask to ignore dc value
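
For anyone who would rather read C than asm, here is a rough sketch of
the idea using SSE2 intrinsics.  It is not the patch itself.  The name
calc_cbp_sketch is made up, the bit order (block i sets bit 5 - i) is my
assumption about what calc_cbp returns, and the zero test uses
_mm_cmpeq_epi8 (pcmpeqb) plus pmovmskb rather than the pcmpgtb sequence
in the patch.  Both the constant and the coefficient array are assumed
to be 16-byte aligned, so every load is an aligned one (movdqa).

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Mask that clears the DC coefficient (word 0) of a block's first row.
 * _Alignas(16) makes the aligned load below legal. */
static _Alignas(16) const int16_t ignore_dc[8] = { 0, -1, -1, -1, -1, -1, -1, -1 };

/* Hypothetical helper: coeff points to 6 blocks of 64 int16_t and must be
 * 16-byte aligned.  Assumed convention: block i sets bit (5 - i). */
static uint32_t calc_cbp_sketch(const int16_t *coeff)
{
    const __m128i dc_mask = _mm_load_si128((const __m128i *)ignore_dc);
    const __m128i zero = _mm_setzero_si128();
    uint32_t cbp = 0;

    for (int i = 0; i < 6; i++) {
        const __m128i *row = (const __m128i *)(coeff + i * 64);

        /* OR the eight rows together, dropping the DC term of row 0. */
        __m128i acc = _mm_and_si128(_mm_load_si128(row), dc_mask);
        for (int r = 1; r < 8; r++)
            acc = _mm_or_si128(acc, _mm_load_si128(row + r));

        /* One mask bit per byte that equals zero; 0xFFFF means all zero. */
        unsigned zero_bytes = (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(acc, zero));

        /* Turn the comparison result into a bit without a data-dependent branch. */
        cbp |= (uint32_t)(zero_bytes != 0xFFFFu) << (5 - i);
    }
    return cbp;
}
```

The two loops have fixed trip counts, so a compiler can unroll them
fully; nothing branches on the coefficient data.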

 ed.gomez> you can use xvid_bench, but its timing function isn't very
 ed.gomez> precise because it's based on ms (time duration not MS(tm))
 ed.gomez> precision.  Maybe you can give a try at better high
 ed.gomez> precision timers available in Win32 APIs.

For my small test program I used the 'rdtsc' instruction to count
machine cycles.  A loop runs the benchmark many times and keeps the
minimum cycle count, which filters out noise from context switches,
cache misses, etc.
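
The timing loop looks roughly like the sketch below.  This is not my
actual test program: min_cycles and sample_work are made-up names, and
it assumes GCC or Clang on x86, where __rdtsc() comes from
<x86intrin.h> (MSVC provides it in <intrin.h>).

```c
#include <stdint.h>
#include <x86intrin.h>  /* __rdtsc(); GCC/Clang on x86 */

/* Hypothetical helper: run fn(arg) 'runs' times and return the smallest
 * cycle count seen.  Taking the minimum discards runs that were inflated
 * by context switches or cold caches. */
static uint64_t min_cycles(void (*fn)(void *), void *arg, int runs)
{
    uint64_t best = UINT64_MAX;

    for (int i = 0; i < runs; i++) {
        uint64_t start = __rdtsc();
        fn(arg);
        uint64_t elapsed = __rdtsc() - start;
        if (elapsed < best)
            best = elapsed;
    }
    return best;
}

/* Placeholder workload standing in for the routine under test. */
static void sample_work(void *arg)
{
    volatile uint32_t *counter = arg;
    for (int i = 0; i < 1000; i++)
        *counter += 1;
}
```

A call such as min_cycles(sample_work, &counter, 1000) then gives a
best-case figure for one invocation.  The number includes the overhead
of the two rdtsc reads and the call itself, so it is best used for
comparing two versions of the same routine rather than as an absolute
cost.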

Thanks for the list of profiling tools.  I do most of my hacking with
gprof on Linux, but I sometimes use VTune on Win32.  It's pretty good,
but sadly it's not free.  VTune also exists for Linux, but I don't own
a copy.

I'll be sure to check out oprofile when I upgrade to a 2.6 kernel.

-Mat

