All of lore.kernel.org
 help / color / mirror / Atom feed
From: Steven Rostedt <rostedt@goodmis.org>
To: linux-kernel@vger.kernel.org
Cc: Ingo Molnar <mingo@elte.hu>,
	Andrew Morton <akpm@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	"H. Peter Anvin" <hpa@zytor.com>
Subject: [PATCH 2/3] x86: Remove cmpxchg from i386 NMI nesting code
Date: Fri, 08 Jun 2012 09:52:02 -0400	[thread overview]
Message-ID: <20120608135603.699156773@goodmis.org> (raw)
In-Reply-To: 20120608135200.371649691@goodmis.org

[-- Attachment #1: Type: text/plain, Size: 4342 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

I've been informed by someone on LWN called 'slashdot' that
some i386 machines do not support a true cmpxchg. The cmpxchg
used by the i386 NMI nesting code must be a true cmpxchg as
disabling interrupts will not work for NMIs (which is the work
around for i386s that do not have a true cmpxchg).

This 'slashdot' character also suggested a fix to the issue.
As the state of the nesting NMIs goes as follows:

  NOT_RUNNING -> EXECUTING
  EXECUTING   -> NOT_RUNNING
  EXECUTING   -> LATCHED
  LATCHED     -> EXECUTING

Having these states as enum values of:

  NOT_RUNNING = 0
  EXECUTING   = 1
  LATCHED     = 2

Instead of a cmpxchg to make EXECUTING -> NOT_RUNNING a
dec_and_test() would work as well. If the dec_and_test brings
the state to NOT_RUNNING, that is the same as a cmpxchg
succeeding to change EXECUTING to NOT_RUNNING. If a nested NMI
were to come in and change it to LATCHED, the dec_and_test() would
convert the state to EXECUTING (what we want it to be in such a
case anyway).

I asked 'slashdot' to post this as a patch, but it never came to
be. I decided to do the work instead.

Link: http://lwn.net/Articles/484932/

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 arch/x86/kernel/nmi.c |   39 ++++++++++++++++++++++++---------------
 1 file changed, 24 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
index a0b2f84..43cce77 100644
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -28,6 +28,7 @@
 #include <asm/mach_traps.h>
 #include <asm/nmi.h>
 #include <asm/x86_init.h>
+#include <asm/local.h>
 
 struct nmi_desc {
 	spinlock_t lock;
@@ -365,8 +366,9 @@ static __kprobes void default_do_nmi(struct pt_regs *regs)
 #ifdef CONFIG_X86_32
 /*
  * For i386, NMIs use the same stack as the kernel, and we can
- * add a workaround to the iret problem in C. Simply have 3 states
- * the NMI can be in.
+ * add a workaround to the iret problem in C (preventing nested
+ * NMIs if an NMI takes a trap). Simply have 3 states the NMI
+ * can be in:
  *
  *  1) not running
  *  2) executing
@@ -383,32 +385,39 @@ static __kprobes void default_do_nmi(struct pt_regs *regs)
  * If an NMI hits a breakpoint that executes an iret, another
  * NMI can preempt it. We do not want to allow this new NMI
  * to run, but we want to execute it when the first one finishes.
- * We set the state to "latched", and the first NMI will perform
- * an cmpxchg on the state, and if it doesn't successfully
- * reset the state to "not running" it will restart the next
- * NMI.
+ * We set the state to "latched", and the exit of the first NMI will
+ * perform a dec_and_test, if the result is zero (NOT_RUNNING), then
+ * it will simply exit the NMI handler. If not, the dec_and_test
+ * would have set the state to NMI_EXECUTING (what we want it to
+ * be when we are running). In this case, we simply jump back
+ * to rerun the NMI handler again, and restart the 'latched' NMI.
+ *
+ * No trap (breakpoint or page fault) should be hit before nmi_restart,
+ * thus there is no race between the first check of state for NOT_RUNNING
+ * and setting it to NMI_EXECUTING. The HW will prevent nested NMIs
+ * at this point.
  */
 enum nmi_states {
-	NMI_NOT_RUNNING,
+	NMI_NOT_RUNNING = 0,
 	NMI_EXECUTING,
 	NMI_LATCHED,
 };
-static DEFINE_PER_CPU(enum nmi_states, nmi_state);
+static DEFINE_PER_CPU(local_t, nmi_state);
 
 #define nmi_nesting_preprocess(regs)					\
 	do {								\
-		if (__get_cpu_var(nmi_state) != NMI_NOT_RUNNING) {	\
-			__get_cpu_var(nmi_state) = NMI_LATCHED;		\
+		local_t *__state = &__get_cpu_var(nmi_state);		\
+		if (local_read(__state) != NMI_NOT_RUNNING) {		\
+			local_set(__state, NMI_LATCHED);		\
 			return;						\
 		}							\
-	nmi_restart:							\
-		__get_cpu_var(nmi_state) = NMI_EXECUTING;		\
-	} while (0)
+		local_set(__state, NMI_EXECUTING);			\
+	} while (0);							\
+	nmi_restart:
 
 #define nmi_nesting_postprocess()					\
 	do {								\
-		if (cmpxchg(&__get_cpu_var(nmi_state),			\
-		    NMI_EXECUTING, NMI_NOT_RUNNING) != NMI_EXECUTING)	\
+		if (!local_dec_and_test(&__get_cpu_var(nmi_state)))	\
 			goto nmi_restart;				\
 	} while (0)
 #else /* x86_64 */
-- 
1.7.10



[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

  parent reply	other threads:[~2012-06-08 13:56 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-06-08 13:52 [PATCH 0/3] [GIT PULL][3.6] x86: cr2 and cmpxchg issues of NMI Steven Rostedt
2012-06-08 13:52 ` [PATCH 1/3] x86: Save cr2 in NMI in case NMIs take a page fault Steven Rostedt
2012-06-08 13:52 ` Steven Rostedt [this message]
2012-06-08 16:10   ` [PATCH 2/3] x86: Remove cmpxchg from i386 NMI nesting code H. Peter Anvin
2012-06-08 16:41     ` Steven Rostedt
2012-06-08 17:28       ` H. Peter Anvin
2012-06-08 17:36         ` Steven Rostedt
2012-06-08 17:39           ` H. Peter Anvin
2012-06-08 17:52             ` Steven Rostedt
2012-06-08 18:04               ` H. Peter Anvin
2012-06-08 18:09               ` Borislav Petkov
2012-06-11  0:25               ` Maciej W. Rozycki
2012-06-11  2:23                 ` H. Peter Anvin
2012-06-11  3:14                   ` Maciej W. Rozycki
2012-06-11  3:17                     ` H. Peter Anvin
2012-06-08 13:52 ` [PATCH 3/3] x86: Save cr2 in NMI in case NMIs take a page fault (for i386) Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120608135603.699156773@goodmis.org \
    --to=rostedt@goodmis.org \
    --cc=akpm@linux-foundation.org \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.