Re[4]: [XviD-devel] Quality optimization
Christoph Lampert
xvid-devel@xvid.org
Mon, 27 Jan 2003 14:53:00 +0100 (CET)
On Mon, 27 Jan 2003, Klaus Post (KPO) wrote:
> Hi!
>
> Actually it doesn't use cmov.
>
> Here is the code, MSVC generated for:
>
> eax = abs(edx - eax)
>
> --------------
> sub eax,edx ;Difference one way ( = abs(Luma1-Luma2))
> cdq
> xor eax,edx
> sub eax,edx ; eax=absolute difference
> --------------
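For reference, the sub/cdq/xor/sub sequence above corresponds roughly to the following C. This is only a sketch: the function name is made up for illustration, and it assumes that right-shifting a negative signed value is an arithmetic shift (implementation-defined in C, though it is what x86 compilers do) and that the subtraction does not overflow.

```c
#include <stdint.h>

/* Branchless absolute difference, mirroring sub / cdq / xor / sub.
   Assumption: >> on a negative int32_t is an arithmetic shift. */
int32_t abs_diff_branchless(int32_t a, int32_t b)
{
    int32_t diff = a - b;          /* sub eax,edx                          */
    int32_t mask = diff >> 31;     /* cdq: 0 if diff >= 0, -1 if diff < 0  */
    return (diff ^ mask) - mask;   /* xor eax,edx ; sub eax,edx            */
}
```

With a mask of 0 the value passes through unchanged; with a mask of -1 the xor flips all bits and the subtraction adds 1, which is two's complement negation.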
Btw. I looked into the Intel optimization guide, and it advises using
copy and shift instead of CDQ. Shouldn't MSVC know that? Or has scheduling
changed on the P3 and later? The guide says:
> Typically, an integer divide is preceded by a cdq instruction (divide
> instructions use EDX:EAX as the dividend and cdq sets up EDX). It is
> better to copy EAX into EDX, then right shift EDX 31 places to
> sign-extend. The copy/shift takes the same number of clocks as cdq on
> Pentium processors, but the copy/shift scheme allows two other
> instructions to execute at the same time on the Pentium processor. If
> you know that the value is positive, use xor edx, edx. On Pentium Pro
> and Pentium II processors the cdq instruction is faster since cdq is a
> single op instruction as opposed to two instructions for the copy/shift
> sequence.
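To make the equivalence concrete, here is a small C sketch of what ends up in EDX in both cases. The function names are hypothetical, and the second function again assumes an arithmetic right shift for negative values.

```c
#include <stdint.h>

/* What cdq leaves in EDX: all ones if EAX is negative, zero otherwise. */
int32_t edx_after_cdq(int32_t eax)
{
    return eax < 0 ? -1 : 0;
}

/* The copy/shift alternative from the guide.
   Assumption: >> on a negative int32_t is an arithmetic shift (sar). */
int32_t edx_after_copy_shift(int32_t eax)
{
    int32_t edx = eax;   /* mov edx,eax */
    edx >>= 31;          /* sar edx,31  */
    return edx;
}
```

Both give the same result for every input; the difference is only in how the instructions are scheduled.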
GCC 2.95.2 (Linux) uses a conditional move for -march=i686:
> subl %edx,%eax
> movl %eax,%edx
> negl %eax
> cmpl $-1,%edx
> cmovg %edx,%eax
> movl %ebp,%esp
but a jump for i586 and lower:
> movl %esp,%ebp
> subl %edx,%eax
> jns .L33
> negl %eax
> .L33:
gcc 3.2 has slightly different routines, but the basics are the same:
cmov for 686 and higher (including Athlons), a jump for 586.
No cdq, no shifting...
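Written back as C, the two GCC variants look roughly like this. It is only a sketch with made-up function names; whether a compiler actually emits cmov or a jump for either form depends on the compiler version and the -march setting.

```c
/* Select form, mirroring the i686 output. */
int abs_diff_select(int a, int b)
{
    int diff = a - b;                /* subl %edx,%eax ; movl %eax,%edx   */
    int neg  = -diff;                /* negl %eax                         */
    return diff > -1 ? diff : neg;   /* cmpl $-1,%edx ; cmovg %edx,%eax   */
}

/* Branch form, mirroring the i586 output. */
int abs_diff_branch(int a, int b)
{
    int diff = a - b;                /* subl %edx,%eax                    */
    if (diff < 0)                    /* jns .L33 skips the negate         */
        diff = -diff;                /* negl %eax                         */
    return diff;
}
```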
gruel