Re[4]: [XviD-devel] Quality optimization

Christoph Lampert xvid-devel@xvid.org
Mon, 27 Jan 2003 14:53:00 +0100 (CET)


On Mon, 27 Jan 2003, Klaus Post (KPO) wrote:

> Hi!
> 
> Actually it doesn't use cmov.
> 
> Here is the code, MSVC generated for:
> 
> eax = abs(edx - eax)
> 
> --------------
> sub eax,edx         ;Difference one way  ( = abs(Luma1-Luma2))
> cdq				
> xor eax,edx
> sub eax,edx         ; eax=absolute difference	
> --------------
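
For readers who prefer C, the four instructions above can be sketched roughly as follows. This is only an illustration: the function name abs_diff_mask is hypothetical (it is not taken from XviD), and the sketch assumes the true difference fits in 32 bits, which holds for 8-bit luma values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not from XviD: branch-free |a - b|.
   Assumes the true difference fits in a signed 32-bit value. */
uint32_t abs_diff_mask(int32_t a, int32_t b)
{
    /* sub eax,edx : the difference, done in unsigned arithmetic so that
       wraparound is well defined in C */
    uint32_t d = (uint32_t)a - (uint32_t)b;

    /* cdq : all ones if the difference is negative, otherwise zero */
    uint32_t mask = 0u - (d >> 31);

    /* xor eax,edx / sub eax,edx : negate only when the mask is all ones */
    return (d ^ mask) - mask;
}

/* Small self-check with luma-sized inputs. */
void abs_diff_mask_selfcheck(void)
{
    assert(abs_diff_mask(3, 10) == 7u);
    assert(abs_diff_mask(10, 3) == 7u);
    assert(abs_diff_mask(0, 0) == 0u);
}
```

The mask is built from the sign bit of the unsigned difference rather than with a signed right shift, because right-shifting a negative signed value is implementation-defined in C.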

Btw, I looked into the "Intel Optimization guide", and it says not to use
CDQ but copy & shift instead. Shouldn't MSVC know that? Or has scheduling
changed on the P3 and later? Quoting the guide:

> Typically, an integer divide is preceded by a cdq instruction (divide
> instructions use EDX:EAX as the dividend and cdq sets up EDX). It is
> better to copy EAX into EDX, then right shift EDX 31 places to
> sign-extend. The copy/shift takes the same number of clocks as cdq on
> Pentium processors, but the copy/shift scheme allows two other
> instructions to execute at the same time on the Pentium processor. If
> you know that the value is positive, use xor edx, edx. On Pentium Pro
> and Pentium II processors the cdq instruction is faster since cdq is a
> single op instruction as opposed to two instructions for the copy/shift
> sequence.

GCC 2.95.2 (Linux) uses conditional move for -march=i686

>        subl %edx,%eax
>        movl %eax,%edx
>        negl %eax
>        cmpl $-1,%edx
>        cmovg %edx,%eax
>        movl %ebp,%esp

but a jump for i586 and lower. 

>        movl %esp,%ebp
>        subl %edx,%eax
>        jns .L33
>        negl %eax
> .L33:

gcc 3.2 generates slightly different code, but the basics are the same:
cmov for i686 and higher (including Athlons), a jump for i586.

No cdq, no shifting...
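
As a rough illustration, here are two C source shapes that correspond to the two GCC listings above. The function names are hypothetical, and whether a given compiler actually emits a jump or a cmov for either form depends on the compiler version and flags, so this is a sketch for experimenting (for example by compiling with gcc -S and comparing -march=i586 against -march=i686), not a statement about what any compiler will produce. Inputs are assumed to be small (luma-sized), so the subtraction and negation cannot overflow.

```c
#include <assert.h>

/* Hypothetical name: "if" form, the shape of the subl / jns / negl listing. */
int abs_diff_branch(int a, int b)
{
    int d = a - b;
    if (d < 0)          /* jns skips the negation when d is non-negative */
        d = -d;
    return d;
}

/* Hypothetical name: "select" form, the shape of the negl / cmpl / cmovg
   listing. Both candidates are computed, then one of them is chosen. */
int abs_diff_select(int a, int b)
{
    int d = a - b;      /* subl, then a copy of the difference */
    int n = -d;         /* negl */
    return (d > -1) ? d : n;   /* cmpl $-1 / cmovg */
}

/* Small self-check: both forms must agree on luma-sized inputs. */
void abs_diff_forms_selfcheck(void)
{
    assert(abs_diff_branch(3, 10) == 7);
    assert(abs_diff_select(3, 10) == 7);
    assert(abs_diff_branch(10, 3) == abs_diff_select(10, 3));
}
```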

gruel