[XviD-devel] [CVS commit] Linux amd64 preliminary support
Guillaume POIRIER
guillaume.poirier at etudiant.univ-rennes1.fr
Thu Jan 6 23:54:33 CET 2005
On Thursday 06 January 2005 at 10:14 +0100, Edouard Gomez wrote:
> Quoting Guillaume POIRIER <guillaume.poirier at etudiant.univ-rennes1.fr>:
> > What a coincidence, I just happened to run a bench yesterday comparing the
> > "original" AMD-64 SIMD port by Andre Werthmann against the IA-32 SIMD
> > version from your tla tree. I've been quite disappointed to see that the
> > AMD-64 version was just as fast as IA-32, when libavcodec gives me (off
> > the top of my head) a 50% bonus in encoding speed[1].
>
> I doubt this can be true. Technically, MMX/XMM/SSE2 code should perform the
> same whether the amd64 is used as a 32bit CPU or as a 64bit processor, because
> the code is exactly the same and is executed on the same SIMD units. The only
> way ffmpeg could speed up by 50% is if it uses a "real" 64bit port that
> exploits the extra registers, which could save some bandwidth in some parts of
> the code... but really, 50% is not realistic.
You're unfortunately right about that; good thing I said it was "off
the top of my head". I guess I compared 2 encodes of different sizes,
which is meaningless.
Plus, given the studies I do, I should have been more suspicious about
such a large speed-up on the same chip. However, I really look forward
to seeing what hackers will produce once this platform is more
common.
> > Anyway, thanks a lot for the work you've done, I'll test it and
> > benchmark it tonight.
>
> You're welcome.
So here are the figures:
Filters: -vf crop=688:448:18:60,pp=fd,scale=480:288
Source: MPEG-2 (PAL DVD), 518.960 secs, 12975 frames
XviD options:
me_quality=6:chroma_me:chroma_opt:trellis:closed_gop:max_bframes=2:hq_ac:vhq=4:bvhq=1:autoaspect:psnr:bitrate=900
Lavc options:
vcodec=mpeg4:vmax_b_frames=1:mbd=2:v4mv:vqblur=0.37:vlelim=-4:vcelim=7:vqblur=0.37:vqcomp=0.65:lumi_mask=0.03:dark_mask=0.01:cmp=2:subcmp=2:precmp=2:dia=-1:trell:cbp:mv0:turbo:autoaspect:psnr:vbitrate=900
On x86-64
        | Pure C | XviD SIMD yasm | XviD Edouard SIMD | lavc + SIMD
Pass 1: | 37fps  | 68fps          | 78fps             | 79fps
Pass 2: | 11fps  | 30fps          | 32fps             | 30fps
On IA-32
        | Pure C | XviD SIMD yasm | XviD Edouard SIMD | lavc + SIMD
Pass 1: | 25fps  | N/A            | 80fps             | 74fps
Pass 2: | 9fps   | N/A            | 31fps             | 28fps
The same without filters[1]:
----------------------------
On x86-64
        | Pure C | XviD SIMD yasm | XviD Edouard SIMD | lavc + SIMD
Pass 1: | N/A    | 40fps          | 45fps             | 45fps
Pass 2: | N/A    | 13fps          | 14fps             | 12fps
On IA-32
        | Pure C | XviD SIMD yasm | XviD Edouard SIMD | lavc + SIMD
Pass 1: | 10fps  | N/A            | 44fps             | 42fps
Pass 2: | 3fps   | N/A            | 13fps             | 11fps
Comments:
---------
- The lavc test is only here to show how much a similar project gained
from porting its IA-32 SIMD ASM to x86-64.
Conclusion:
-----------
- Edouard's AMD-64 port is faster than Andre Werthmann's (based on
XviD-1.0.2) ;-)
- On AMD-64, there really is a speed-up, but it's limited.
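For anyone who wants to redo the arithmetic behind these conclusions, here is
a minimal Python sketch. Only the fps figures are taken from the "with
filters" tables above; the dictionary layout, the key names and the helper
functions are made up for illustration, and it assumes that the "XviD SIMD
yasm" column is Andre Werthmann's port.

```python
# Encoding speeds in fps, copied from the "with filters" tables above.
# None marks an N/A cell. Key names are hypothetical, chosen for this sketch.
FPS = {
    ("x86-64", 1): {"c": 37, "yasm": 68, "edouard": 78, "lavc": 79},
    ("x86-64", 2): {"c": 11, "yasm": 30, "edouard": 32, "lavc": 30},
    ("ia32", 1): {"c": 25, "yasm": None, "edouard": 80, "lavc": 74},
    ("ia32", 2): {"c": 9, "yasm": None, "edouard": 31, "lavc": 28},
}


def speedup(new, old):
    """Return new/old as a ratio, or None if either figure is missing."""
    if new is None or old is None:
        return None
    return new / old


def arch_gain(codec, pass_no):
    """x86-64 fps divided by IA-32 fps for one codec and one pass."""
    return speedup(FPS[("x86-64", pass_no)][codec],
                   FPS[("ia32", pass_no)][codec])


def port_gain(pass_no):
    """Edouard's port divided by the yasm port, both on x86-64.

    Assumes the "yasm" column is Andre Werthmann's port.
    """
    row = FPS[("x86-64", pass_no)]
    return speedup(row["edouard"], row["yasm"])


for p in (1, 2):
    print(f"pass {p}: Edouard vs yasm port on x86-64: {port_gain(p):.2f}x")
    for codec in ("c", "edouard", "lavc"):
        print(f"pass {p}: {codec} x86-64 vs IA-32: {arch_gain(codec, p):.2f}x")
```

A ratio above 1.0 means the first build is faster; cells that were N/A in the
tables simply yield None instead of a ratio.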
That's all for today, I'll try to test the qpel fix soon.
Regards,
Guillaume
[1] It looks like MEncoder's SSE asm code is deactivated on x86-64 due
to a bug in the CPU detection code (cpudetect.c); that's why I tried to
maximize encoding time versus filter time.