[XviD-devel] Inlined ASM code again
Christoph Lampert
chl at math.uni-bonn.de
Thu Aug 21 16:03:30 CEST 2003
Hi,
please don't think that this is related to your patch, which I couldn't
test because gcc 2.95 lacks the intrinsics include files.
I just thought the thread may be read by people who are interested in this
stuff, so check out this discussion:
http://lists.insecure.org/lists/linux-kernel/2003/Feb/0501.html
and maybe this description of how icc optimizes (I still haven't found a
source on how gcc really does optimization):
http://www.linuxjournal.com/article.php?sid=4885
gruel
P.S. For me the benchmark given in
http://lists.insecure.org/lists/linux-kernel/2003/Feb/0401.html
shows:
Pentium III at 650: "just gcc"
Proc std: 35880 kticks
Proc std inline: 35930 kticks
Proc sse: 5420 kticks
Proc sse inline: 5280 kticks
Pentium4 at 2400: "just gcc"
Proc std: 6830 kticks
Proc std inline: 6770 kticks
Proc sse: 1540 kticks
Proc sse inline: 1560 kticks
but on Pentium4 when compiled with -O3 -march=pentium4 it changes to
Proc std: 4950 kticks
Proc std inline: 4940 kticks
Proc sse: 4200 kticks
Proc sse inline: 4170 kticks
and with -O3 alone, the complete intrinsic loop seems to be gone:
Proc std: 4240 kticks
Proc std inline: 4270 kticks
Proc sse: 130 kticks
Proc sse inline: 130 kticks
What does this mean? I don't know, but I'm sure it means something.
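One plausible reading of the 130 kticks figure, and it is only a guess, is
that the optimizer noticed the result of the loop is never used and dropped
the work entirely. A minimal sketch of how a timing loop can guard against
that is below. It is not the benchmark from the linux-kernel thread: the
names saxpy, time_saxpy and bench_sink are made up for illustration, and the
kernel is plain C rather than intrinsics. The idea is just to store a result
into a volatile variable on every repetition, which the compiler is not
allowed to optimize away.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for the "Proc std" kernel: y[i] += a * x[i]. */
void saxpy(float *y, const float *x, float a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}

/* Volatile sink: a store to it is a side effect the compiler must keep. */
volatile float bench_sink;

/*
 * Run the kernel `reps` times and return the elapsed CPU time in clock()
 * units. n must be greater than zero.
 */
clock_t time_saxpy(float *y, const float *x, float a, size_t n, int reps)
{
    clock_t start = clock();

    for (int r = 0; r < reps; r++) {
        saxpy(y, x, a, n);
        /* Consume one element per repetition so the work stays live. */
        bench_sink = y[(size_t)r % n];
    }
    return clock() - start;
}
```

Calling time_saxpy() on the same buffers with and without the bench_sink
store, at the same optimization flags, would show whether the compiler was
discarding the loop. If the two timings differ by an order of magnitude, the
fast number is probably not measuring the kernel at all.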
On Thu, 21 Aug 2003, Edouard Gomez wrote:
> Edouard Gomez (ed.gomez at free.fr) wrote:
> > My mirror is up on free.fr for fellows that test arch/tla otherwise
> > you can find the patch in attachment.
>
> As usual the filter cut my emails, i wonder why it always dislikes my
> attachment :-)
>
> Available here:
> http://ed.gomez.free.fr/vrac/gcc-intrinsics.diff.gz
>
> @skal: sorry but i don't see any advantage in using nasm over a cpp+cc
> couple. You have macros in both cases, you have a lot more
> control over variable declarations with a cc (and that counts on unix where
> namespace pollution is a pain), and the most important one is
> that you can hopefully use complex types (structures) directly in
> the code... doing so in nasm is far from easy because of the cc
> structure packing rules.
> Now i must admit i used intrinsics to test gcc capabilities, but
> it could be done with simple macros that would just wrap mmx
> opcodes and nothing more (mmx.h in ffmpeg or mplayer or ...) so
> you would still have the same flexibility as you have nowadays with nasm.
>
> --
> Edouard Gomez
> _______________________________________________
> XviD-devel mailing list
> XviD-devel at xvid.org
> http://list.xvid.org/mailman/listinfo/xvid-devel
>