From: Artur Skawina <art_k@o2.pl>
To: Olivier Galibert <galibert@pobox.com>,
"H. Peter Anvin" <hpa@zytor.com>,
Chris Lattner <clattner@apple.com>, Michael Matz <matz@suse.de>,
Richard Guenther <richard.guenther@gmail.com>,
Joe Buck <Joe.Buck@synopsys.com>, Jan Hubicka <hubicka@ucw.cz>,
Aurelien Jarno <aurelien@aurel32.net>,
linux-kernel@vger.kernel.org, gcc@gcc.gnu.org
Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag
Date: Thu, 06 Mar 2008 17:14:32 +0100
Message-ID: <47D01868.9060609@o2.pl>
In-Reply-To: <20080306135139.GA5236@dspnet.fr.eu.org>

Olivier Galibert wrote:
> On Wed, Mar 05, 2008 at 05:12:07PM -0800, H. Peter Anvin wrote:
>> It's a kernel bug, and it needs to be fixed.
>
> I'm not convinced. It's been that way for 15 years, it's that way in
> the BSD kernels, at that point it's a feature. The bug is in the
> documentation, nowhere else. And in gcc for blindly trusting the
> documentation.

Well, you could see this either way: either the kernel is buggy and
needs to be fixed, or the current behavior is correct and the ABI needs
an erratum. If there were no performance implications I'd go for the
latter, mostly because of the security aspect.

But this thread made me dig up an old benchmark, and apparently omitting
the cld before the string ops makes a significant difference: on a P2 it
was ~8% and on a P4 it's ~6% for 1480-byte copies; for 32-byte ones the
gain is more like 90% on a P4 [1].

So the impact on small-structure memcpy/memset etc. is significant, hence
fixing the kernel looks like the better long-term plan.

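To make the "with cld" versus "without cld" distinction concrete, here is a minimal sketch in C with GCC-style inline asm. It is not the code behind the kernel_memcpy686* benchmark entries; the names copy_rep_movsb and demo_copy_32 are made up for illustration. The variant without cld is only correct if DF is already clear on entry, which is exactly what the ABI promises and what this thread is about.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: forward byte copy using "rep movsb".
 * with_cld != 0: clear the direction flag explicitly before the copy.
 * with_cld == 0: trust the ABI guarantee that DF is clear on function entry.
 * On non-x86 targets this falls back to plain memcpy.
 */
static void *copy_rep_movsb(void *dst, const void *src, size_t n, int with_cld)
{
#if defined(__x86_64__) || defined(__i386__)
    void *d = dst;
    const void *s = src;
    size_t c = n;

    if (with_cld)
        __asm__ volatile("cld\n\trep movsb"
                         : "+D"(d), "+S"(s), "+c"(c)
                         :
                         : "memory", "cc");
    else
        __asm__ volatile("rep movsb"
                         : "+D"(d), "+S"(s), "+c"(c)
                         :
                         : "memory");
#else
    (void)with_cld;
    memcpy(dst, src, n);
#endif
    return dst;
}

/* Copies a 32-byte buffer with both variants; returns 1 if both match the source. */
static int demo_copy_32(void)
{
    unsigned char src[32], a[32], b[32];
    size_t i;

    for (i = 0; i < sizeof src; i++)
        src[i] = (unsigned char)(i * 7 + 1);
    memset(a, 0, sizeof a);
    memset(b, 0, sizeof b);

    copy_rep_movsb(a, src, sizeof src, 1);
    copy_rep_movsb(b, src, sizeof src, 0);

    return memcmp(a, src, sizeof src) == 0 && memcmp(b, src, sizeof src) == 0;
}
```

In ordinary user code both variants give the same result; the second one goes wrong only when something (such as a signal handler entered with DF set) breaks the ABI assumption.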
artur
[1]
P4 # ./bcsp m
IACCK 0.9.29 Artur Skawina <...>
[ exec time; lower is better ]      [speed ]  [ time ]  [ok?]
TIME-N+S  TIME32  TIME33  TIME1480  MBYTES/S  TIMEXXXX  CSUM  FUNCTION ( rdtsc_overhead=0 null=0 )
       0       0       0         0       inf         0  ffff  csum_partial_copy_null
    1885     375     389       156   7589.74     39350     0  generic_memcpy
   10894     532     666      1696    698.11    108557     0  kernel_memcpylib
    1804     325     346       151   7841.06     19614     0  kernel_memcpy686
    1804     325     346       151   7841.06     19693     0  kernel_memcpy686ncld
    1744     323     381       148   8000.00     19687     0  kernel_memcpy686as1
    1332     157     232       139   8517.99     19235     0  kernel_memcpy686as1ncld
    1782     318     339       148   8000.00     19607     0  kernel_memcpy686as2
    1371     168     189       139   8517.99     19221     0  kernel_memcpy686as2ncld

P2 # ./bcsp m
IACKK 0.9.28 Artur Skawina <...>
TIME-N+S  TIME32  TIME33  TIME1480  MBYTES/S  TIMEXXXX   CKSUM  FUNCTION ( rdtsc_overhead=1 null=0 )
       0       0       0         0       inf         0 :  ffff  csum_partial_copy_null
    7121     746    1215       730   1621.92    127418 :     0  generic_memcpy
   43604    2032    1709      6574    180.10    416409 :     0  kernel_memcpylib
    7480     771     726       684   1730.99     96084 :     0  kernel_memcpy686
    7036     735     543       685   1728.47     95508 :     0  kernel_memcpy686ncld
    7498    1015     711       716   1653.63     92200 :     0  kernel_memcpy686as1
    5826     438     489       662   1788.52     91598 :     0  kernel_memcpy686as1ncld
    6667     657     488       708   1672.32     89366 :     0  kernel_memcpy686as2
    6614     456     270       658   1799.39     91203 :     0  kernel_memcpy686as2ncld
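
For anyone who wants to check the percentages against the tables, a small sketch of the arithmetic follows. It assumes the quoted gains compare a variant with cld against its "ncld" twin (for example kernel_memcpy686as2 versus kernel_memcpy686as2ncld) and that "gain" means how much longer the cld version takes relative to the ncld one; the message does not spell out either point, so both are assumptions. The function name gain_percent is made up for illustration.

```c
#include <stdio.h>

/*
 * Relative gain in percent when the reported time drops from
 * t_with_cld to t_without_cld (same units as the TIME columns above).
 */
static double gain_percent(double t_with_cld, double t_without_cld)
{
    return (t_with_cld / t_without_cld - 1.0) * 100.0;
}

/* Prints the gains for a few row pairs taken from the tables above. */
static void print_gains(void)
{
    printf("P4 as2, TIME1480: %.1f%%\n", gain_percent(148.0, 139.0));
    printf("P4 as2, TIME32:   %.1f%%\n", gain_percent(318.0, 168.0));
    printf("P2 as1, TIME1480: %.1f%%\n", gain_percent(716.0, 662.0));
}
```

Under those assumptions the three pairs come out at roughly 6%, 89% and 8%, in line with the figures in the message. Other row pairs give somewhat different numbers (the P4 as1 pair at 32 bytes, 323 versus 157, is closer to a 2x difference).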