[XviD-devel] XVID performance on P4/Opteron

Christoph Lampert chl at math.uni-bonn.de
Mon Jul 28 23:55:42 CEST 2003


On Mon, 28 Jul 2003, Edouard Gomez wrote:

> Christoph Lampert (chl at math.uni-bonn.de) wrote:
> > I couldn't check ASM on Opteron, yet, so only P4 numbers:
> 
> Technical problems  or just lack of  time ? cause  opteron is compatible
> with ia32  (sure of that), including  mmx iirc (not sure  about that one
> tho)... 

Yes, Opteron supports MMX, 3DNow!, SSE, 3DNowExt, SSE2, etc. But
XviD's automatic CPU detection sets the CPU flags to 0. 

Some #define is misinterpreted by configure, I guess. Let me check... 

Yes. The default gcc acts as a 64-bit compiler, so after ./configure,
platform.inc becomes:

ARCHITECTURE=-DARCH_IS_GENERIC
BUS=-DARCH_IS_64BIT
ENDIANNESS=-DARCH_IS_LITTLE_ENDIAN

The "GENERIC" disables all ASM. But on the other hand, in 64bit mode,
ARCH_IS_IA32 would indeed be a strange value. Maybe ARCH_IS_X86 would
be better? Or simply add a new flag: ARCH_IS_AMD64 ?
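
To illustrate the idea of a separate architecture flag, here is a minimal sketch of a compile-time check. It relies only on the macros `__x86_64__` and `__i386__`, which gcc predefines for 64-bit and 32-bit x86 targets. The names `ARCH_NAME`, `ARCH_BITS`, `arch_name`, `arch_bits` and `arch_is_x86` are hypothetical and are not taken from the XviD sources or its configure script.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names for illustration only. The mail above mentions
 * ARCH_IS_GENERIC, ARCH_IS_IA32 and ARCH_IS_64BIT; everything defined
 * here is an assumption, not XviD's build system. */
#if defined(__x86_64__)
#  define ARCH_NAME "x86_64"    /* gcc targeting AMD64, e.g. Opteron default */
#  define ARCH_BITS 64
#elif defined(__i386__)
#  define ARCH_NAME "ia32"      /* 32-bit x86, e.g. gcc -m32 */
#  define ARCH_BITS 32
#else
#  define ARCH_NAME "generic"   /* anything else: plain C code only */
#  define ARCH_BITS ((int)(sizeof(void *) * 8))
#endif

/* Name of the architecture the compiler was targeting. */
const char *arch_name(void)
{
    return ARCH_NAME;
}

/* Word size assumed for that architecture. */
int arch_bits(void)
{
    return ARCH_BITS;
}

/* 1 if x86 assembly routines could be considered at all, 0 for generic. */
int arch_is_x86(void)
{
    const char *name = arch_name();

    assert(name != NULL);
    return strcmp(name, "generic") != 0;
}
```

With a check like this, a 64-bit build would no longer fall through to the generic case just because it is not IA32, although the existing 32-bit ASM would still need porting before it could be enabled.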

When I add -m32 to CFLAGS to switch to 32-bit mode, and change ARCHITECTURE
to IA32 and BUS to 32BIT, it tries to compile the ASM code, but there's no
NASM installed on the compile farm :-( So that doesn't work at all. 
If I leave ARCHITECTURE at GENERIC, it compiles in 32-bit mode, but without
MMX... 
Anyway, there could be a ./configure switch to choose between 32 and 64 bit
on architectures where both are possible. 

What I can do, of course, is simply copy 32-bit binaries from my Athlon
XPs. Then it's:

without ASM: decoding 112fps, 
             encoding  32fps,

with ASM:    decoding 205fps,
             encoding 174fps

I guess the speedup is as expected here... But no numbers for 64-bit :-( 
The current routines would be suboptimal in 64-bit mode anyway, because
there are twice as many SSE registers available. That would make many
MMX/SSE routines easier and faster, I guess, but so far I haven't heard of
a NASM version supporting the AMD64 extensions. :( 
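
For reference, the ratios implied by the 32-bit figures quoted above can be worked out with a few lines of C. This is only arithmetic on the four fps values from this mail; the helper names `speedup` and `print_speedups` are made up for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Ratio of frames per second with ASM routines to fps without them. */
double speedup(double with_asm_fps, double without_asm_fps)
{
    assert(without_asm_fps > 0.0);
    return with_asm_fps / without_asm_fps;
}

/* Print the ratios for the fps figures quoted in this mail. */
void print_speedups(void)
{
    printf("decoding: %.2fx\n", speedup(205.0, 112.0)); /* 205 vs 112 fps */
    printf("encoding: %.2fx\n", speedup(174.0, 32.0));  /* 174 vs  32 fps */
}
```

That comes to roughly 1.8x for decoding and 5.4x for encoding, so the encoder gains far more from the ASM routines than the decoder does.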

gruel

P.S. The gcc options "-fschedule-insns" and "-fschedule-insns2" don't seem
to work with gcc 3.2.2 on Opteron. 

----------------------------
Opteron /proc/cpuinfo is

vendor_id       : AuthenticAMD
cpu family      : 15
model           : 5
model name      : AMD Opteron(tm) Processor Sample
stepping        : 0
cpu MHz         : 1593.084
cache size      : 1024 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext lm 3dnowext
3dnow
bogomips        : 3185.04
TLB size        : 1088 4K pages
clflush size    : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts ttp
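
As a rough cross-check of the flags line above, the following sketch tests whether a given feature name appears as a whole token in such a string. It is not XviD's CPU detection, which is not shown in this mail; `has_cpu_flag` and `opteron_flags` are hypothetical names, and the string is simply the flags line from the dump with the line breaks replaced by spaces.

```c
#include <assert.h>
#include <string.h>

/* Return 1 if `flag` occurs as a complete whitespace-separated token in
 * `flags`, 0 otherwise. Whole-token matching matters here: "sse" must not
 * match inside "sse2", and "3dnow" must not match inside "3dnowext". */
int has_cpu_flag(const char *flags, const char *flag)
{
    size_t len;
    const char *p;

    assert(flags != NULL && flag != NULL);
    len = strlen(flag);
    if (len == 0)
        return 0;

    for (p = strstr(flags, flag); p != NULL; p = strstr(p + 1, flag)) {
        char end = p[len];
        int starts = (p == flags) ||
                     p[-1] == ' ' || p[-1] == '\t' || p[-1] == '\n';
        int ends = (end == '\0' || end == ' ' || end == '\t' || end == '\n');

        if (starts && ends)
            return 1;
    }
    return 0;
}

/* The flags line from the cpuinfo dump above, joined into one string. */
static const char opteron_flags[] =
    "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca "
    "cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext lm 3dnowext "
    "3dnow";
```

For this string, `has_cpu_flag(opteron_flags, "sse2")` and `has_cpu_flag(opteron_flags, "lm")` both return 1, so the kernel does report the SIMD extensions and long mode on this machine.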



