[XviD-devel] Inlined ASM code again
Edouard Gomez
ed.gomez at free.fr
Wed Aug 20 18:02:01 CEST 2003
Michael Militzer (michael at xvid.org) wrote:
> I'd like to comment on this: Your benchmark is very artificial, so it
> doesn't say much. I'd suggest you should create a sad16 replacement using
> gcc intrinsics, patch XviD to use your newly created sad16 version, switch
> to a 16x16 block search only quality mode (<4) and compare encoding speed
> between your patch and the standard XviD version.
I know, I did warn that it was artificial :)
I'm working on getting sad8, sad16 and sad16v inlined in XviD, replacing
the function pointers with defines (nasty).
Here is an early gprof profile of the relevant functions (those that
heavily depend on the sad functions):
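To illustrate the difference between the two dispatch styles, here is a minimal sketch. It is not XviD's actual code: the names sad16_c, sad16_func, sad16_ptr, SAD16 and USE_INLINE_SAD are hypothetical, the signature is simplified, and the SAD is plain C rather than intrinsics.

```c
#include <stdint.h>
#include <stdlib.h>

/* Plain C sum of absolute differences over a 16x16 block (hypothetical
 * stand-in for an intrinsics version). 'static inline' allows the
 * compiler to merge the body into the caller. */
static inline uint32_t
sad16_c(const uint8_t *cur, const uint8_t *ref, uint32_t stride)
{
    uint32_t sad = 0;
    for (int y = 0; y < 16; y++) {
        for (int x = 0; x < 16; x++)
            sad += (uint32_t)abs((int)cur[x] - (int)ref[x]);
        cur += stride;
        ref += stride;
    }
    return sad;
}

/* Runtime dispatch: the call goes through a function pointer, which the
 * compiler normally cannot inline. */
typedef uint32_t (*sad16_func)(const uint8_t *, const uint8_t *, uint32_t);
static sad16_func sad16_ptr = sad16_c;

/* Compile-time dispatch: the define names the function directly, so the
 * call site can be inlined. USE_INLINE_SAD is an assumed build flag. */
#ifdef USE_INLINE_SAD
#define SAD16(cur, ref, stride) sad16_c((cur), (ref), (stride))
#else
#define SAD16(cur, ref, stride) sad16_ptr((cur), (ref), (stride))
#endif
```

The price of the define is that the implementation is fixed at build time, so CPU-specific versions can no longer be selected at startup.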
./xvid_encraw -asm -i coastguard-352x288.yuv -w 352 -h 288
With inline/intrinsic functions:
Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 18.29      7.99     2.89  3705592     0.00     0.00  CheckCandidate16
  8.48      9.33     1.34  6136982     0.00     0.00  CheckCandidate8
  2.91     11.28     0.46   118008     0.00     0.05  SearchP
  1.58     13.81     0.25   641266     0.00     0.00  AdvDiamondSearch
  1.58     14.06     0.25   382432     0.00     0.01  Search8
  0.76     14.89     0.12      298     0.40    44.12  FrameCodeP
  0.32     15.59     0.05    23840     0.00     0.02  MEanalyzeMB
  0.00     15.80     0.00    23840     0.00     0.01  DiamondSearch
  0.00     15.80     0.00      298     0.00     1.60  MEanalysis
[...]
Without inline/intrinsic functions:
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
  6.67      8.65     0.98  3705592     0.00     0.00  CheckCandidate16
  8.71      6.44     1.28  6136982     0.00     0.00  CheckCandidate8
  1.84     12.11     0.27   118008     0.00     0.03  SearchP
  2.72     11.48     0.40   641266     0.00     0.00  AdvDiamondSearch
  1.16     12.95     0.17   382432     0.00     0.01  Search8
  1.36     12.78     0.20      298     0.67    34.98  FrameCodeP
  0.27     14.37     0.04    23840     0.00     0.00  MEanalyzeMB
  0.00     14.70     0.00    23840     0.00     0.00  DiamondSearch
  0.00     14.70     0.00      298     0.00     0.22  MEanalysis
[...]
Next I will add an RTC timer for these functions, because gprof's time
resolution is too coarse to conclude anything. The only thing you can
see is that %time and self seconds change from one version to the
other, but most of the time that just shows that code has been inlined
and that the function is now self-contained.
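For scale: 2.89 s of self time over 3705592 calls is roughly 0.78 microseconds per call, which is why the ms/call columns show 0.00. A finer-grained per-function timer could look like the sketch below. The names func_timer, now_ns, TIMED and timer_avg_ns are hypothetical, and C11 timespec_get is used only as a portable stand-in for whatever timer ends up being used; it reads the wall clock, so a clock adjustment during a run would distort the totals.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical per-function accumulator: total nanoseconds and call count. */
typedef struct {
    uint64_t total_ns;
    uint64_t calls;
} func_timer;

/* Current wall-clock time in nanoseconds (C11 timespec_get). */
static uint64_t now_ns(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}

/* Time one statement and add the elapsed time to accumulator t.
 * Example: TIMED(&t16, sad = some_function(args)); */
#define TIMED(t, stmt)                      \
    do {                                    \
        uint64_t start_ = now_ns();         \
        stmt;                               \
        (t)->total_ns += now_ns() - start_; \
        (t)->calls++;                       \
    } while (0)

/* Average cost per call in nanoseconds (0 if never called). */
static double timer_avg_ns(const func_timer *t)
{
    return t->calls ? (double)t->total_ns / (double)t->calls : 0.0;
}
```

Two clock reads per call are not free for a function called millions of times, so the absolute numbers would be inflated; comparing the two builds under the same instrumentation is the more meaningful use.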
PS: overall fps seems to be about the same.
--
Edouard Gomez