[XviD-devel] sse2
peter ross
xvid-devel@xvid.org
Tue, 30 Jul 2002 15:31:41 +1000
okay, i had a look at the sse2 dequant code on friday. yep, it's slower than
xmm as it stands. but replacing the loop increment and saturation code makes
sse2 ~20% faster than xmm.
i noticed we have a problem regarding sse2 alignment:
nasm can only guarantee the .data section to be 4-byte aligned, and an
"align 16" only pads relative to the start of the section. so the
following symbol is not 16-byte aligned under plain-old msvc.
.text
align 16
sse2_value times 8 dw 1
the only way i could make the value 16-byte aligned was to export the
symbol:
cglobal sse2_value
.text
align 16
sse2_value times 8 dw 1
i assume the sp5+processor pack will fix this
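
a quick way to confirm whether a constant really landed on a 16-byte boundary is to test its address from the c side. this is only a minimal sketch: is_aligned16 and align_up16 are hypothetical helper names, not anything in the xvid tree, and the second one assumes the caller has over-allocated the buffer by at least 15 bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if p sits on a 16-byte boundary, 0 otherwise.
 * Aligned sse2 loads (movdqa) fault on addresses where this is 0. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical helper: round p up to the next 16-byte boundary.
 * Assumes the underlying buffer has at least 15 spare bytes. */
static void *align_up16(void *p)
{
    return (void *)(((uintptr_t)p + 15u) & ~(uintptr_t)15u);
}
```

calling is_aligned16 on the address of the exported symbol in a debug build would show whether the linker honoured the alignment.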
the mmx, xmm and sse2 dequant code is near identical, and could easily be
macro'ized.
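
to illustrate the macro idea in c rather than nasm, here is a rough sketch where one macro stamps out variants that differ only in how many words they handle per step (4 for a 64-bit register, 8 for a 128-bit one). everything here is an assumption for illustration: the names dequant_one, DEFINE_DEQUANT, dequant_w4 and dequant_w8 are made up, and the per-coefficient formula is the standard h.263 inverse quantisation with saturation to [-2048, 2047], which may not match the actual asm routines line for line.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar h.263-style inverse quantisation of one coefficient
 * (assumed formula), saturated to the 12-bit signed range. */
static int16_t dequant_one(int16_t level, int quant)
{
    int32_t mag, v;

    if (level == 0)
        return 0;
    mag = level < 0 ? -(int32_t)level : level;
    v = quant * (2 * mag + 1);
    if ((quant & 1) == 0)      /* even quantiser: subtract 1 */
        v -= 1;
    if (level < 0)
        v = -v;
    if (v > 2047)  v = 2047;   /* saturation */
    if (v < -2048) v = -2048;
    return (int16_t)v;
}

/* Hypothetical macro: one body, instantiated per register width.
 * step is the number of 16-bit words processed per outer iteration. */
#define DEFINE_DEQUANT(name, step)                                          \
    static void name(int16_t *dst, const int16_t *src, int quant, size_t n) \
    {                                                                       \
        size_t i, j;                                                        \
        for (i = 0; i < n; i += (step))                                     \
            for (j = 0; j < (step) && i + j < n; j++)                       \
                dst[i + j] = dequant_one(src[i + j], quant);                \
    }

DEFINE_DEQUANT(dequant_w4, 4)   /* mmx/xmm-sized step: 4 words */
DEFINE_DEQUANT(dequant_w8, 8)   /* sse2-sized step: 8 words */
```

in nasm the same shape would be a %macro taking the register width and load/store instructions as parameters, so the three files collapse into one body.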
cya
-- pete