From: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
To: Andi Kleen <andi@firstfloor.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>,
Steven Rostedt <srostedt@redhat.com>,
Jeremy Fitzhardinge <jeremy@goop.org>,
LKML <linux-kernel@vger.kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Peter Zijlstra <peterz@infradead.org>,
Andrew Morton <akpm@linux-foundation.org>,
David Miller <davem@davemloft.net>,
Roland McGrath <roland@redhat.com>,
Ulrich Drepper <drepper@redhat.com>,
Rusty Russell <rusty@rustcorp.com.au>,
Gregory Haskins <ghaskins@novell.com>,
Arnaldo Carvalho de Melo <acme@redhat.com>,
"Luis Claudio R. Goncalves" <lclaudio@uudg.org>,
Clark Williams <williams@redhat.com>,
Christoph Lameter <cl@linux-foundation.org>
Subject: [RFC PATCH] x86 alternatives : fix LOCK_PREFIX race with preemptible kernel and CPU hotplug
Date: Wed, 13 Aug 2008 19:41:56 -0400 [thread overview]
Message-ID: <20080813234156.GA25775@Krystal> (raw)
In-Reply-To: <20080813200119.GA18966@Krystal>
If a kernel thread is preempted in single-cpu mode right after the NOP (nop
about to be turned into a lock prefix), then we CPU hotplug a CPU, and then the
thread is scheduled back again, a SMP-unsafe atomic operation will be used on
shared SMP variables, leading to corruption. No corruption would happen in the
reverse case : going from SMP to UP is ok because we split a bit instruction
into tiny pieces, which does not present this condition.
Changing the 0x90 (single-byte nop) currently used into a 0x3E DS segment
override prefix should fix this issue. Since the default of the atomic
instructions is to use the DS segment anyway, it should not affect the
behavior.
** BIG FAT WARNING (tm) ** : this patch assumes that the 0x3E prefix will
leave atomic operations as-is (thus assuming they normally touch data in the DS
segment). Is there an obnoxious corner-case I would have missed ?
This source :
AMD64 Architecture Programmer's Manual Volume 3: General-Purpose and System
Instructions
States
"Instructions that Reference a Non-Stack Segment—If an instruction encoding
references any base register other than rBP or rSP, or if an instruction
contains an immediate offset, the default segment is the data segment (DS).
These instructions can use the segment-override prefix to select one of the
non-default segments, as shown in Table 1-5."
Therefore, forcing the DS segment on the atomic operations, which already use
the DS segment, should not change.
This source :
http://wiki.osdev.org/X86_Instruction_Encoding
States
"In 64-bit the CS, SS, DS and ES segment overrides are ignored."
(since it's a wiki, it would have to be confirmed, but at least the worse that
happens is that the DS override is ignored)
This patch applies to 2.6.27-rc2, but would also have to be applied to earlier
kernels (2.6.26, 2.6.25, ...).
Performance impact of the fix : tests done on "xaddq" and "xaddl" shows it
actually improves performances on Intel Xeon, AMD64, Pentium M. It does not
change the performance on Pentium 3 and Pentium 4.
Xeon E5405 2.0GHz :
NR_TESTS 10000000
test empty cycles : 162207948
test test 1-byte nop xadd cycles : 170755422
test test DS override prefix xadd cycles : 170000118 *
test test LOCK xadd cycles : 472012134
AMD64 2.0GHz :
NR_TESTS 10000000
test empty cycles : 146674549
test test 1-byte nop xadd cycles : 150273860
test test DS override prefix xadd cycles : 149982382 *
test test LOCK xadd cycles : 270000690
Pentium 4 3.0GHz
NR_TESTS 10000000
test empty cycles : 290001195
test test 1-byte nop xadd cycles : 310000560
test test DS override prefix xadd cycles : 310000575 *
test test LOCK xadd cycles : 1050103740
Pentium M 2.0GHz
NR_TESTS 10000000
test empty cycles : 180000523
test test 1-byte nop xadd cycles : 320000345
test test DS override prefix xadd cycles : 310000374 *
test test LOCK xadd cycles : 480000357
Pentium 3 550MHz
NR_TESTS 10000000
test empty cycles : 510000231
test test 1-byte nop xadd cycles : 620000128
test test DS override prefix xadd cycles : 620000110 *
test test LOCK xadd cycles : 800000088
Speed test modules can be found at
http://ltt.polymtl.ca/svn/trunk/tests/kernel/test-prefix-speed-32.c
http://ltt.polymtl.ca/svn/trunk/tests/kernel/test-prefix-speed.c
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Andi Kleen <andi@firstfloor.org>
CC: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
CC: Steven Rostedt <srostedt@redhat.com>
CC: Jeremy Fitzhardinge <jeremy@goop.org>
CC: Ingo Molnar <mingo@elte.hu>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: David Miller <davem@davemloft.net>
CC: Roland McGrath <roland@redhat.com>
CC: Ulrich Drepper <drepper@redhat.com>
CC: Rusty Russell <rusty@rustcorp.com.au>
CC: Gregory Haskins <ghaskins@novell.com>
CC: Arnaldo Carvalho de Melo <acme@redhat.com>
CC: "Luis Claudio R. Goncalves" <lclaudio@uudg.org>
CC: Clark Williams <williams@redhat.com>
CC: Christoph Lameter <cl@linux-foundation.org>
---
arch/x86/kernel/alternative.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
Index: linux-2.6-lttng/arch/x86/kernel/alternative.c
===================================================================
--- linux-2.6-lttng.orig/arch/x86/kernel/alternative.c 2008-08-13 18:42:47.000000000 -0400
+++ linux-2.6-lttng/arch/x86/kernel/alternative.c 2008-08-13 18:54:36.000000000 -0400
@@ -241,25 +241,25 @@ static void alternatives_smp_lock(u8 **s
continue;
if (*ptr > text_end)
continue;
- text_poke(*ptr, ((unsigned char []){0xf0}), 1); /* add lock prefix */
+ /* turn DS segment override prefix into lock prefix */
+ text_poke(*ptr, ((unsigned char []){0xf0}), 1);
};
}
static void alternatives_smp_unlock(u8 **start, u8 **end, u8 *text, u8 *text_end)
{
u8 **ptr;
- char insn[1];
if (noreplace_smp)
return;
- add_nops(insn, 1);
for (ptr = start; ptr < end; ptr++) {
if (*ptr < text)
continue;
if (*ptr > text_end)
continue;
- text_poke(*ptr, insn, 1);
+ /* turn lock prefix into DS segment override prefix */
+ text_poke(*ptr, ((unsigned char []){0x3E}), 1);
};
}
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
next prev parent reply other threads:[~2008-08-13 23:47 UTC|newest]
Thread overview: 100+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-08-07 18:20 [PATCH 0/5] ftrace: to kill a daemon Steven Rostedt
2008-08-07 18:20 ` [PATCH 1/5] ftrace: create __mcount_loc section Steven Rostedt
2008-08-07 18:20 ` [PATCH 2/5] ftrace: mcount call site on boot nops core Steven Rostedt
2008-08-07 18:20 ` [PATCH 3/5] ftrace: enable mcount recording for modules Steven Rostedt
2008-08-08 6:43 ` Rusty Russell
2008-08-08 12:51 ` Steven Rostedt
2008-08-07 18:20 ` [PATCH 4/5] ftrace: rebuild everything on change to FTRACE_MCOUNT_RECORD Steven Rostedt
2008-08-07 18:20 ` [PATCH 5/5] ftrace: enable using mcount recording on x86 Steven Rostedt
2008-08-07 18:47 ` [PATCH 0/5] ftrace: to kill a daemon Mathieu Desnoyers
2008-08-07 20:42 ` Steven Rostedt
2008-08-08 17:22 ` Mathieu Desnoyers
2008-08-08 17:36 ` Steven Rostedt
2008-08-08 17:46 ` Mathieu Desnoyers
2008-08-08 18:13 ` Steven Rostedt
2008-08-08 18:15 ` Peter Zijlstra
2008-08-08 18:21 ` Mathieu Desnoyers
2008-08-08 18:41 ` Steven Rostedt
2008-08-08 19:04 ` Linus Torvalds
2008-08-08 19:05 ` Mathieu Desnoyers
2008-08-08 23:38 ` Steven Rostedt
2008-08-09 0:23 ` Andi Kleen
2008-08-09 0:36 ` Steven Rostedt
2008-08-09 0:47 ` Jeremy Fitzhardinge
2008-08-09 0:51 ` Linus Torvalds
2008-08-09 1:25 ` Steven Rostedt
2008-08-13 6:31 ` Mathieu Desnoyers
2008-08-13 15:38 ` Mathieu Desnoyers
2008-08-13 17:52 ` Efficient x86 and x86_64 NOP microbenchmarks Mathieu Desnoyers
2008-08-13 18:27 ` Linus Torvalds
2008-08-13 18:41 ` Andi Kleen
2008-08-13 18:45 ` Avi Kivity
2008-08-13 18:51 ` Andi Kleen
2008-08-13 18:56 ` Avi Kivity
2008-08-13 19:30 ` Mathieu Desnoyers
2008-08-13 19:37 ` Andi Kleen
2008-08-13 20:01 ` Mathieu Desnoyers
2008-08-13 23:41 ` Mathieu Desnoyers [this message]
2008-08-14 0:01 ` [RFC PATCH] x86 alternatives : fix LOCK_PREFIX race with preemptible kernel and CPU hotplug H. Peter Anvin
2008-08-14 1:13 ` Mathieu Desnoyers
2008-08-14 1:22 ` Jeremy Fitzhardinge
2008-08-14 1:26 ` Roland McGrath
2008-08-14 1:49 ` Mathieu Desnoyers
2008-08-14 3:35 ` Jeremy Fitzhardinge
2008-08-14 15:18 ` Mathieu Desnoyers
2008-08-14 16:10 ` Linus Torvalds
2008-08-14 16:13 ` H. Peter Anvin
2008-08-14 16:58 ` Mathieu Desnoyers
2008-08-14 17:05 ` Jeremy Fitzhardinge
2008-08-14 17:30 ` Mathieu Desnoyers
2008-08-14 17:43 ` Jeremy Fitzhardinge
2008-08-14 18:37 ` H. Peter Anvin
2008-08-14 18:53 ` Mathieu Desnoyers
2008-08-14 19:29 ` Jeremy Fitzhardinge
2008-08-14 20:31 ` Mathieu Desnoyers
2008-08-14 20:39 ` H. Peter Anvin
2008-08-14 21:46 ` Jeremy Fitzhardinge
2008-08-14 22:26 ` H. Peter Anvin
2008-08-14 17:17 ` H. Peter Anvin
2008-08-14 18:09 ` Mathieu Desnoyers
2008-08-14 19:49 ` Mathieu Desnoyers
2008-08-14 17:04 ` Jeremy Fitzhardinge
2008-08-14 17:18 ` H. Peter Anvin
2008-08-14 17:28 ` Jeremy Fitzhardinge
2008-08-14 17:31 ` H. Peter Anvin
2008-08-14 17:46 ` Mathieu Desnoyers
2008-08-14 17:49 ` Jeremy Fitzhardinge
2008-08-14 17:55 ` Mathieu Desnoyers
2008-08-14 18:59 ` Gregory Haskins
2008-08-15 21:34 ` Efficient x86 and x86_64 NOP microbenchmarks Steven Rostedt
2008-08-15 21:51 ` Andi Kleen
2008-08-13 19:16 ` Mathieu Desnoyers
2008-08-09 0:51 ` [PATCH 0/5] ftrace: to kill a daemon Steven Rostedt
2008-08-09 0:53 ` Roland McGrath
2008-08-09 1:13 ` Andi Kleen
2008-08-09 1:19 ` Andi Kleen
2008-08-09 1:30 ` Steven Rostedt
2008-08-09 1:55 ` Andi Kleen
2008-08-09 2:03 ` Steven Rostedt
2008-08-09 2:23 ` Andi Kleen
2008-08-09 4:12 ` Steven Rostedt
2008-08-09 0:30 ` Steven Rostedt
2008-08-11 18:21 ` Mathieu Desnoyers
2008-08-11 19:28 ` Steven Rostedt
2008-08-08 19:08 ` Jeremy Fitzhardinge
2008-08-11 2:41 ` Rusty Russell
2008-08-11 12:33 ` Steven Rostedt
2008-08-07 21:11 ` Jeremy Fitzhardinge
2008-08-07 21:29 ` Steven Rostedt
2008-08-07 22:26 ` Roland McGrath
2008-08-08 1:21 ` Steven Rostedt
2008-08-08 1:24 ` Steven Rostedt
2008-08-08 1:56 ` Steven Rostedt
2008-08-08 7:22 ` Peter Zijlstra
2008-08-08 11:31 ` Steven Rostedt
2008-08-08 4:54 ` Sam Ravnborg
2008-08-09 9:48 ` Abhishek Sagar
2008-08-09 13:01 ` Steven Rostedt
2008-08-09 15:01 ` Abhishek Sagar
2008-08-09 15:37 ` Steven Rostedt
2008-08-09 17:14 ` Abhishek Sagar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20080813234156.GA25775@Krystal \
--to=mathieu.desnoyers@polymtl.ca \
--cc=acme@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=andi@firstfloor.org \
--cc=cl@linux-foundation.org \
--cc=davem@davemloft.net \
--cc=drepper@redhat.com \
--cc=ghaskins@novell.com \
--cc=jeremy@goop.org \
--cc=lclaudio@uudg.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=peterz@infradead.org \
--cc=roland@redhat.com \
--cc=rostedt@goodmis.org \
--cc=rusty@rustcorp.com.au \
--cc=srostedt@redhat.com \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
--cc=williams@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox