[XviD-devel] [CVS commit] Linux amd64 preliminary support

Guillaume POIRIER guillaume.poirier at etudiant.univ-rennes1.fr
Thu Jan 6 23:54:33 CET 2005


On Thursday 06 January 2005 at 10:14 +0100, Edouard Gomez wrote:
> Quoting Guillaume POIRIER <guillaume.poirier at etudiant.univ-rennes1.fr>:
> > What a coincidence, I just happened to run a bench yesterday between the
> > "original" AMD-64 SIMD port by Andre Werthmann against the IA-32 SIMD
> > version from your tla tree. I've been quite disappointed to see that the
> > AMD-64 version was just as fast as IA-32 when libavcodec gives me (from
> > the top of my head) a 50% bonus regarding encoding speed[1].
> 
> I doubt this can be true. Technically, using MMX/XMM/SSE2 should be equivalent
> whether the amd64 runs as a 32bit CPU or as a 64bit processor, because the code
> is exactly the same and is executed on the same simd units. The only way ffmpeg
> could speed up by 50% is if it uses a "real" 64bit port using the extra
> registers, which could save some bandwidth in some parts of the code... but
> really 50% is not realistic.

You are unfortunately right about that; good thing I said it was "from
the top of my head". I guess I compared 2 encodes of different sizes,
which is meaningless.
Plus, given the studies I do, I should have been more suspicious about
such a large speed-up on the same chip. However, I really look forward
to seeing what hackers will produce once this platform is more
common.


> > Anyway, thanks a lot for the work you've done, I'll test it and
> > benchmark it tonight.
> 
> You're welcome.


So here are the figures:

Filters: -vf crop=688:448:18:60,pp=fd,scale=480:288
Source: MPEG-2 (PAL DVD)  518.960 secs, 12975 frames

XviD options: 
me_quality=6:chroma_me:chroma_opt:trellis:closed_gop:max_bframes=2:hq_ac:vhq=4:bvhq=1:autoaspect:psnr:bitrate=900

Lavc options:
vcodec=mpeg4:vmax_b_frames=1:mbd=2:v4mv:vqblur=0.37:vlelim=-4:vcelim=7:vqblur=0.37:vqcomp=0.65:lumi_mask=0.03:dark_mask=0.01:cmp=2:subcmp=2:precmp=2:dia=-1:trell:cbp:mv0:turbo:autoaspect:psnr:vbitrate=900

On x86-64
        | Pure C | XviD SIMD yasm | XviD Edouard SIMD | lavc + SIMD
Pass 1: | 37fps  | 68fps          | 78fps             | 79fps
Pass 2: | 11fps  | 30fps          | 32fps             | 30fps

On IA-32
        | Pure C | XviD SIMD yasm | XviD Edouard SIMD | lavc + SIMD
Pass 1: | 25fps  | N/A            | 80fps             | 74fps
Pass 2: | 9fps   | N/A            | 31fps             | 28fps


The same without filters[1]:
----------------------------

On x86-64
        | Pure C | XviD SIMD yasm | XviD Edouard SIMD | lavc + SIMD
Pass 1: | N/A    | 40fps          | 45fps             | 45fps
Pass 2: | N/A    | 13fps          | 14fps             | 12fps


On IA-32
        | Pure C | XviD SIMD yasm | XviD Edouard SIMD | lavc + SIMD
Pass 1: | 10fps  | N/A            | 44fps             | 42fps
Pass 2: | 3fps   | N/A            | 13fps             | 11fps


Comments:
---------
- The lavc test is only here to show how much a similar project gained
from porting its IA-32 SIMD ASM to x86-64.

Conclusion:
-----------
- Edouard's AMD-64 port is faster than Andre Werthmann's (based on
XviD-1.0.2) ;-)
- On AMD-64, there really is a speed-up, but it's limited.
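
To put numbers on "limited", here is a rough sketch that computes the
relative x86-64 vs IA-32 gain from the filtered figures above. The fps
values are copied from the tables; the dictionary layout and the names
gain_percent and port_gains are only made up for this illustration, they
do not come from XviD or MEncoder.

```python
# Frame rates (fps) copied from the filtered benchmark tables.
# Keys are (platform, pass number); the build names are illustrative labels.
fps = {
    ("x86-64", 1): {"pure_c": 37, "xvid_edouard": 78, "lavc": 79},
    ("x86-64", 2): {"pure_c": 11, "xvid_edouard": 32, "lavc": 30},
    ("ia32", 1): {"pure_c": 25, "xvid_edouard": 80, "lavc": 74},
    ("ia32", 2): {"pure_c": 9, "xvid_edouard": 31, "lavc": 28},
}


def gain_percent(new, old):
    """Relative change from old to new, in percent."""
    return 100.0 * (new - old) / old


def port_gains(table):
    """Gain of the x86-64 build over the IA-32 build, per build and pass."""
    gains = {}
    for pass_no in (1, 2):
        for build, fps64 in table[("x86-64", pass_no)].items():
            fps32 = table[("ia32", pass_no)][build]
            gains[(build, pass_no)] = round(gain_percent(fps64, fps32), 1)
    return gains


if __name__ == "__main__":
    for (build, pass_no), gain in sorted(port_gains(fps).items()):
        print(f"{build:13s} pass {pass_no}: {gain:+.1f}%")
```

With these figures the pure C build gains 48% on pass 1, while the SIMD
builds stay within a few percent either way. The fps values are rounded
to whole frames, so differences that small should not be over-interpreted.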


That's all for today, I'll try to test the qpel fix soon.

Regards,
Guillaume


[1] It looks like MEncoder's SSE asm code is deactivated on x86-64 due
to a bug in the CPU detection code (cpudetect.c), which is why I tried to
maximize encoding time relative to filter time.


