public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3 v2] [GIT PULL][3.6] x86: cr2 and cmpxchg issues of NMI
@ 2012-06-09  2:02 Steven Rostedt
  2012-06-09  2:02 ` [PATCH 1/3 v2] x86: Save cr2 in NMI in case NMIs take a page fault Steven Rostedt
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Steven Rostedt @ 2012-06-09  2:02 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, H. Peter Anvin

[-- Attachment #1: Type: text/plain, Size: 702 bytes --]


Ingo,

This is the version 2 with the updates suggested by H. Peter Anvin.

Please pull the latest tip/x86/core-2 tree, which can be found at:

  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
tip/x86/core-2

Head SHA1: 70fb74a5420f9caa3e001d65004e4b669124283e


Steven Rostedt (3):
      x86: Save cr2 in NMI in case NMIs take a page fault
      x86: Remove cmpxchg from i386 NMI nesting code
      x86: Save cr2 in NMI in case NMIs take a page fault (for i386)

----
 arch/x86/kernel/entry_64.S |   20 +++++++++++++++++++
 arch/x86/kernel/nmi.c      |   47 +++++++++++++++++++++++++++++++-------------
 2 files changed, 53 insertions(+), 14 deletions(-)

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/3 v2] x86: Save cr2 in NMI in case NMIs take a page fault
  2012-06-09  2:02 [PATCH 0/3 v2] [GIT PULL][3.6] x86: cr2 and cmpxchg issues of NMI Steven Rostedt
@ 2012-06-09  2:02 ` Steven Rostedt
  2012-06-09  2:02 ` [PATCH 2/3 v2] x86: Remove cmpxchg from i386 NMI nesting code Steven Rostedt
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Steven Rostedt @ 2012-06-09  2:02 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, H. Peter Anvin,
	Avi Kivity, Linus Torvalds, Brian Gerst

[-- Attachment #1: Type: text/plain, Size: 2531 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

Avi Kivity reported that page faults in NMIs could cause havic if
the NMI preempted another page fault handler:

   The recent changes to NMI allow exceptions to take place in NMI
   handlers, but I think that a #PF (say, due to access to vmalloc space)
   is still problematic.  Consider the sequence

    #PF  (cr2 set by processor)
      NMI
        ...
        #PF (cr2 clobbered)
          do_page_fault()
          IRET
        ...
        IRET
      do_page_fault()
        address = read_cr2()

   The last line reads the overwritten cr2 value.

Originally I wrote a patch to solve this by saving the cr2 on the stack.
Brian Gerst suggested to save it in the r12 register as both r12 and rbx
are saved by the do_nmi handler as required by the C standard. But rbx
is already used for saving if swapgs needs to be run on exit of the NMI
handler.

Link: http://lkml.kernel.org/r/4FBB8C40.6080304@redhat.com
Link: http://lkml.kernel.org/r/1337763411.13348.140.camel@gandalf.stny.rr.com

Reported-by: Avi Kivity <avi@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Suggested-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 arch/x86/kernel/entry_64.S |   20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 7d65133..111f6bb 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -1758,10 +1758,30 @@ end_repeat_nmi:
 	 */
 	call save_paranoid
 	DEFAULT_FRAME 0
+
+	/*
+	 * Save off the CR2 register. If we take a page fault in the NMI then
+	 * it could corrupt the CR2 value. If the NMI preempts a page fault
+	 * handler before it was able to read the CR2 register, and then the
+	 * NMI itself takes a page fault, the page fault that was preempted
+	 * will read the information from the NMI page fault and not the
+	 * origin fault. Save it off and restore it if it changes.
+	 * Use the r12 callee-saved register.
+	 */
+	movq %cr2, %r12
+
 	/* paranoidentry do_nmi, 0; without TRACE_IRQS_OFF */
 	movq %rsp,%rdi
 	movq $-1,%rsi
 	call do_nmi
+
+	/* Did the NMI take a page fault? Restore cr2 if it did */
+	movq %cr2, %rcx
+	cmpq %rcx, %r12
+	je 1f
+	movq %r12, %cr2
+1:
+	
 	testl %ebx,%ebx				/* swapgs needed? */
 	jnz nmi_restore
 nmi_swapgs:
-- 
1.7.10



[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/3 v2] x86: Remove cmpxchg from i386 NMI nesting code
  2012-06-09  2:02 [PATCH 0/3 v2] [GIT PULL][3.6] x86: cr2 and cmpxchg issues of NMI Steven Rostedt
  2012-06-09  2:02 ` [PATCH 1/3 v2] x86: Save cr2 in NMI in case NMIs take a page fault Steven Rostedt
@ 2012-06-09  2:02 ` Steven Rostedt
  2012-06-09  2:03 ` [PATCH 3/3 v2] x86: Save cr2 in NMI in case NMIs take a page fault (for i386) Steven Rostedt
  2012-06-09  2:58 ` [PATCH 0/3 v2] [GIT PULL][3.6] x86: cr2 and cmpxchg issues of NMI Steven Rostedt
  3 siblings, 0 replies; 8+ messages in thread
From: Steven Rostedt @ 2012-06-09  2:02 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, H. Peter Anvin

[-- Attachment #1: Type: text/plain, Size: 4254 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

I've been informed by someone on LWN called 'slashdot' that
some i386 machines do not support a true cmpxchg. The cmpxchg
used by the i386 NMI nesting code must be a true cmpxchg as
disabling interrupts will not work for NMIs (which is the work
around for i386s that do not have a true cmpxchg).

This 'slashdot' character also suggested a fix to the issue.
As the state of the nesting NMIs goes as follows:

  NOT_RUNNING -> EXECUTING
  EXECUTING   -> NOT_RUNNING
  EXECUTING   -> LATCHED
  LATCHED     -> EXECUTING

Having these states as enum values of:

  NOT_RUNNING = 0
  EXECUTING   = 1
  LATCHED     = 2

Instead of a cmpxchg to make EXECUTING -> NOT_RUNNING a
dec_and_test() would work as well. If the dec_and_test brings
the state to NOT_RUNNING, that is the same as a cmpxchg
succeeding to change EXECUTING to NOT_RUNNING. If a nested NMI
were to come in and change it to LATCHED, the dec_and_test() would
convert the state to EXECUTING (what we want it to be in such a
case anyway).

I asked 'slashdot' to post this as a patch, but it never came to
be. I decided to do the work instead.

Thanks to H. Peter Anvin for suggesting to use this_cpu_dec_and_return()
instead of local_dec_and_test(&__get_cpu_var()).

Link: http://lwn.net/Articles/484932/

Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 arch/x86/kernel/nmi.c |   35 +++++++++++++++++++++--------------
 1 file changed, 21 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
index a0b2f84..a15a888 100644
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -365,8 +365,9 @@ static __kprobes void default_do_nmi(struct pt_regs *regs)
 #ifdef CONFIG_X86_32
 /*
  * For i386, NMIs use the same stack as the kernel, and we can
- * add a workaround to the iret problem in C. Simply have 3 states
- * the NMI can be in.
+ * add a workaround to the iret problem in C (preventing nested
+ * NMIs if an NMI takes a trap). Simply have 3 states the NMI
+ * can be in:
  *
  *  1) not running
  *  2) executing
@@ -383,13 +384,20 @@ static __kprobes void default_do_nmi(struct pt_regs *regs)
  * If an NMI hits a breakpoint that executes an iret, another
  * NMI can preempt it. We do not want to allow this new NMI
  * to run, but we want to execute it when the first one finishes.
- * We set the state to "latched", and the first NMI will perform
- * an cmpxchg on the state, and if it doesn't successfully
- * reset the state to "not running" it will restart the next
- * NMI.
+ * We set the state to "latched", and the exit of the first NMI will
+ * perform a dec_return, if the result is zero (NOT_RUNNING), then
+ * it will simply exit the NMI handler. If not, the dec_return
+ * would have set the state to NMI_EXECUTING (what we want it to
+ * be when we are running). In this case, we simply jump back
+ * to rerun the NMI handler again, and restart the 'latched' NMI.
+ *
+ * No trap (breakpoint or page fault) should be hit before nmi_restart,
+ * thus there is no race between the first check of state for NOT_RUNNING
+ * and setting it to NMI_EXECUTING. The HW will prevent nested NMIs
+ * at this point.
  */
 enum nmi_states {
-	NMI_NOT_RUNNING,
+	NMI_NOT_RUNNING = 0,
 	NMI_EXECUTING,
 	NMI_LATCHED,
 };
@@ -397,18 +405,17 @@ static DEFINE_PER_CPU(enum nmi_states, nmi_state);
 
 #define nmi_nesting_preprocess(regs)					\
 	do {								\
-		if (__get_cpu_var(nmi_state) != NMI_NOT_RUNNING) {	\
-			__get_cpu_var(nmi_state) = NMI_LATCHED;		\
+		if (this_cpu_read(nmi_state) != NMI_NOT_RUNNING) {	\
+			this_cpu_write(nmi_state, NMI_LATCHED);		\
 			return;						\
 		}							\
-	nmi_restart:							\
-		__get_cpu_var(nmi_state) = NMI_EXECUTING;		\
-	} while (0)
+		this_cpu_write(nmi_state, NMI_EXECUTING);		\
+	} while (0);							\
+	nmi_restart:
 
 #define nmi_nesting_postprocess()					\
 	do {								\
-		if (cmpxchg(&__get_cpu_var(nmi_state),			\
-		    NMI_EXECUTING, NMI_NOT_RUNNING) != NMI_EXECUTING)	\
+		if (this_cpu_dec_return(nmi_state))			\
 			goto nmi_restart;				\
 	} while (0)
 #else /* x86_64 */
-- 
1.7.10



[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 3/3 v2] x86: Save cr2 in NMI in case NMIs take a page fault (for i386)
  2012-06-09  2:02 [PATCH 0/3 v2] [GIT PULL][3.6] x86: cr2 and cmpxchg issues of NMI Steven Rostedt
  2012-06-09  2:02 ` [PATCH 1/3 v2] x86: Save cr2 in NMI in case NMIs take a page fault Steven Rostedt
  2012-06-09  2:02 ` [PATCH 2/3 v2] x86: Remove cmpxchg from i386 NMI nesting code Steven Rostedt
@ 2012-06-09  2:03 ` Steven Rostedt
  2012-06-09  2:58 ` [PATCH 0/3 v2] [GIT PULL][3.6] x86: cr2 and cmpxchg issues of NMI Steven Rostedt
  3 siblings, 0 replies; 8+ messages in thread
From: Steven Rostedt @ 2012-06-09  2:03 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, H. Peter Anvin,
	Avi Kivity, Linus Torvalds

[-- Attachment #1: Type: text/plain, Size: 2765 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

Avi Kivity reported that page faults in NMIs could cause havic if
the NMI preempted another page fault handler:

   The recent changes to NMI allow exceptions to take place in NMI
   handlers, but I think that a #PF (say, due to access to vmalloc space)
   is still problematic.  Consider the sequence

    #PF  (cr2 set by processor)
      NMI
        ...
        #PF (cr2 clobbered)
          do_page_fault()
          IRET
        ...
        IRET
      do_page_fault()
        address = read_cr2()

   The last line reads the overwritten cr2 value.

This is the i386 version, which has the luxury of doing the work
in C code.

Link: http://lkml.kernel.org/r/4FBB8C40.6080304@redhat.com

Reported-by: Avi Kivity <avi@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 arch/x86/kernel/nmi.c |   12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
index a15a888..f84f5c5 100644
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -395,6 +395,14 @@ static __kprobes void default_do_nmi(struct pt_regs *regs)
  * thus there is no race between the first check of state for NOT_RUNNING
  * and setting it to NMI_EXECUTING. The HW will prevent nested NMIs
  * at this point.
+ *
+ * In case the NMI takes a page fault, we need to save off the CR2
+ * because the NMI could have preempted another page fault and corrupt
+ * the CR2 that is about to be read. As nested NMIs must be restarted
+ * and they can not take breakpoints or page faults, the update of the
+ * CR2 must be done before converting the nmi state back to NOT_RUNNING.
+ * Otherwise, there would be a race of another nested NMI coming in
+ * after setting state to NOT_RUNNING but before updating the nmi_cr2.
  */
 enum nmi_states {
 	NMI_NOT_RUNNING = 0,
@@ -402,6 +410,7 @@ enum nmi_states {
 	NMI_LATCHED,
 };
 static DEFINE_PER_CPU(enum nmi_states, nmi_state);
+static DEFINE_PER_CPU(unsigned long, nmi_cr2);
 
 #define nmi_nesting_preprocess(regs)					\
 	do {								\
@@ -410,11 +419,14 @@ static DEFINE_PER_CPU(enum nmi_states, nmi_state);
 			return;						\
 		}							\
 		this_cpu_write(nmi_state, NMI_EXECUTING);		\
+		this_cpu_write(nmi_cr2, read_cr2());			\
 	} while (0);							\
 	nmi_restart:
 
 #define nmi_nesting_postprocess()					\
 	do {								\
+		if (unlikely(this_cpu_read(nmi_cr2) != read_cr2()))	\
+			write_cr2(this_cpu_read(nmi_cr2));		\
 		if (this_cpu_dec_return(nmi_state))			\
 			goto nmi_restart;				\
 	} while (0)
-- 
1.7.10



[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/3 v2] [GIT PULL][3.6] x86: cr2 and cmpxchg issues of NMI
  2012-06-09  2:02 [PATCH 0/3 v2] [GIT PULL][3.6] x86: cr2 and cmpxchg issues of NMI Steven Rostedt
                   ` (2 preceding siblings ...)
  2012-06-09  2:03 ` [PATCH 3/3 v2] x86: Save cr2 in NMI in case NMIs take a page fault (for i386) Steven Rostedt
@ 2012-06-09  2:58 ` Steven Rostedt
  2012-06-09  6:30   ` H. Peter Anvin
  2012-06-09  6:32   ` H. Peter Anvin
  3 siblings, 2 replies; 8+ messages in thread
From: Steven Rostedt @ 2012-06-09  2:58 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, H. Peter Anvin

On Fri, 2012-06-08 at 22:02 -0400, Steven Rostedt wrote:
> Ingo,
> 
> This is the version 2 with the updates suggested by H. Peter Anvin.
> 
> Please pull the latest tip/x86/core-2 tree, which can be found at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
> tip/x86/core-2
> 

FYI, I based this off of tip/x86/debug.

-- Steve



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/3 v2] [GIT PULL][3.6] x86: cr2 and cmpxchg issues of NMI
  2012-06-09  2:58 ` [PATCH 0/3 v2] [GIT PULL][3.6] x86: cr2 and cmpxchg issues of NMI Steven Rostedt
@ 2012-06-09  6:30   ` H. Peter Anvin
  2012-06-09  6:32   ` H. Peter Anvin
  1 sibling, 0 replies; 8+ messages in thread
From: H. Peter Anvin @ 2012-06-09  6:30 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner

On 06/08/2012 07:58 PM, Steven Rostedt wrote:
> On Fri, 2012-06-08 at 22:02 -0400, Steven Rostedt wrote:
>> Ingo,
>>
>> This is the version 2 with the updates suggested by H. Peter Anvin.
>>
>> Please pull the latest tip/x86/core-2 tree, which can be found at:
>>
>>   git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
>> tip/x86/core-2
>>
> 
> FYI, I based this off of tip/x86/debug.
> 

No, you didn't... you based it on an ancient (pre-v3.3) version of
tip/x86/debug...

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/3 v2] [GIT PULL][3.6] x86: cr2 and cmpxchg issues of NMI
  2012-06-09  2:58 ` [PATCH 0/3 v2] [GIT PULL][3.6] x86: cr2 and cmpxchg issues of NMI Steven Rostedt
  2012-06-09  6:30   ` H. Peter Anvin
@ 2012-06-09  6:32   ` H. Peter Anvin
  2012-06-09 12:44     ` Steven Rostedt
  1 sibling, 1 reply; 8+ messages in thread
From: H. Peter Anvin @ 2012-06-09  6:32 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner

On 06/08/2012 07:58 PM, Steven Rostedt wrote:
> On Fri, 2012-06-08 at 22:02 -0400, Steven Rostedt wrote:
>> Ingo,
>>
>> This is the version 2 with the updates suggested by H. Peter Anvin.
>>
>> Please pull the latest tip/x86/core-2 tree, which can be found at:
>>
>>   git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
>> tip/x86/core-2
>>
> 
> FYI, I based this off of tip/x86/debug.
> 

Hm... actually it looks more like you reused an old branch and forgot to
actually push it out.

	-hpa


-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/3 v2] [GIT PULL][3.6] x86: cr2 and cmpxchg issues of NMI
  2012-06-09  6:32   ` H. Peter Anvin
@ 2012-06-09 12:44     ` Steven Rostedt
  0 siblings, 0 replies; 8+ messages in thread
From: Steven Rostedt @ 2012-06-09 12:44 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner

On Fri, 2012-06-08 at 23:32 -0700, H. Peter Anvin wrote:

> Hm... actually it looks more like you reused an old branch and forgot to
> actually push it out.

Actully, I pushed it out, I just forgot to check to see if it succeeded:

error: failed to push some refs to 'gitolite@ra.kernel.org:/pub/scm/linux/kernel/git/rostedt/linux-trace.git'
To prevent you from losing history, non-fast-forward updates were rejected
Merge the remote changes (e.g. 'git pull') before pushing again.  See the
'Note about fast-forwards' section of 'git push --help' for details.

This is also why I include the HEAD SHA1 in my pull messages ;-)

Force pushed just made.

Thanks,

-- Steve



^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2012-06-09 12:44 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-06-09  2:02 [PATCH 0/3 v2] [GIT PULL][3.6] x86: cr2 and cmpxchg issues of NMI Steven Rostedt
2012-06-09  2:02 ` [PATCH 1/3 v2] x86: Save cr2 in NMI in case NMIs take a page fault Steven Rostedt
2012-06-09  2:02 ` [PATCH 2/3 v2] x86: Remove cmpxchg from i386 NMI nesting code Steven Rostedt
2012-06-09  2:03 ` [PATCH 3/3 v2] x86: Save cr2 in NMI in case NMIs take a page fault (for i386) Steven Rostedt
2012-06-09  2:58 ` [PATCH 0/3 v2] [GIT PULL][3.6] x86: cr2 and cmpxchg issues of NMI Steven Rostedt
2012-06-09  6:30   ` H. Peter Anvin
2012-06-09  6:32   ` H. Peter Anvin
2012-06-09 12:44     ` Steven Rostedt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox