From: Linus Torvalds <torvalds@linux-foundation.org>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>, stable@vger.kernel.org
Cc: Raphael Prevost <raphael@buro.asia>,
	Suresh Siddha <suresh.b.siddha@intel.com>,
	Peter Anvin <hpa@zytor.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: [PATCH 3/5] i387: fix x86-64 preemption-unsafe user stack save/restore
Date: Wed, 22 Feb 2012 12:57:33 -0800 (PST)
Message-ID: <alpine.LFD.2.02.1202221256450.20522@i5.linux-foundation.org>
In-Reply-To: <alpine.LFD.2.02.1202221256130.20522@i5.linux-foundation.org>


From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Thu, 16 Feb 2012 09:15:04 -0800

[ Upstream commit 15d8791cae75dca27bfda8ecfe87dca9379d6bb0 ]

Commit 5b1cbac37798 ("i387: make irq_fpu_usable() tests more robust")
added a sanity check to the #NM handler to verify that we never cause
the "Device Not Available" exception in kernel mode.

However, that check actually pinpointed a (fundamental) race where we do
cause that exception as part of the signal stack FPU state save/restore
code.

Because we use the floating point instructions themselves to save and
restore state directly from user mode, we cannot do that atomically with
testing the TS_USEDFPU bit: the user mode access itself may cause a page
fault, which causes a task switch, which saves and restores the FP/MMX
state from the kernel buffers.

This kind of "recursive" FP state save is fine per se, but it means that
when the signal stack save/restore gets restarted, it will now take the
'#NM' exception we originally tried to avoid.  With preemption this can
happen even without the page fault - but because of the user access, we
cannot just disable preemption around the save/restore instruction.
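
As a rough illustration of that sequence, here is a toy userspace model
in C.  It is not kernel code: the struct and every model_* function are
made up for this sketch, and it assumes that a page fault on the user
access always leads to a task switch, which is the worst case described
above rather than what always happens.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not kernel code: every name here is hypothetical. */
struct fpu_model {
	bool cr0_ts;      /* models CR0.TS: FP instructions trap while set */
	bool ts_usedfpu;  /* models TS_USEDFPU: FP registers hold live state */
	int nm_faults;    /* number of modelled #NM exceptions taken */
};

/* Models being switched out: FP state goes to the kernel buffer. */
static void model_task_switch(struct fpu_model *m)
{
	m->ts_usedfpu = false;
	m->cr0_ts = true;
}

/* Models the #NM handler: reload the state and hand the FPU back. */
static void model_nm_handler(struct fpu_model *m)
{
	m->nm_faults++;
	m->cr0_ts = false;
	m->ts_usedfpu = true;
}

/*
 * Models the signal stack save: test the flag first, then run the save
 * instruction against user memory.  'page_fault' says whether the first
 * attempt faults, which in this model always causes a task switch.
 */
static void model_save_to_user(struct fpu_model *m, bool page_fault)
{
	if (!m->ts_usedfpu)
		return;	/* the real code would copy from the kernel buffer */

	if (page_fault)
		model_task_switch(m);	/* flag test above is now stale */

	if (m->cr0_ts)
		model_nm_handler(m);	/* restarted instruction traps */

	/* the save instruction completes against live registers here */
}
```

In this model a save with no page fault takes zero #NM exceptions, while
a save whose user access faults takes exactly one on restart and ends
with the FPU owned by the task again.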

There are various ways to solve this, including using the
"enable/disable_page_fault()" helpers to not allow page faults at all
during the sequence, and falling back to copying things by hand without
the use of the native FP state save/restore instructions.

However, the simplest thing to do is to just allow the #NM from kernel
space, but fix the race in setting and clearing CR0.TS that this all
exposed: the TS bit changes and the TS_USEDFPU bit absolutely have to be
atomic wrt scheduling, so while the actual state save/restore can be
interrupted and restarted, the act of actually clearing/setting CR0.TS
and the TS_USEDFPU bit together must not.
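
A toy userspace model in C may help show why the two updates have to be
paired.  It is only a sketch: the struct, the model_* functions and the
"torn" counter are invented here, and it assumes the simplified rule
that CR0.TS is set exactly when TS_USEDFPU is clear.  The scheduling
point is simulated by a plain function call, not by real preemption.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not kernel code: every name here is hypothetical. */
struct cpu_model {
	bool cr0_ts;        /* models CR0.TS */
	bool ts_usedfpu;    /* models the TS_USEDFPU status bit */
	int preempt_count;  /* > 0 means the modelled scheduler cannot run */
	int torn;           /* times the scheduler saw inconsistent bits */
};

/* A place where the modelled scheduler could run and inspect the bits. */
static void model_resched_point(struct cpu_model *c)
{
	if (c->preempt_count > 0)
		return;		/* preemption is held off */
	if (c->cr0_ts == c->ts_usedfpu)
		c->torn++;	/* assumed invariant is violated */
}

/* The two updates with a scheduling window between them. */
static void model_fpu_end_unsafe(struct cpu_model *c)
{
	c->ts_usedfpu = false;
	model_resched_point(c);	/* window between the two updates */
	c->cr0_ts = true;
	model_resched_point(c);
}

/* Same updates, but with the window closed by a preempt count. */
static void model_fpu_end_safe(struct cpu_model *c)
{
	c->preempt_count++;
	c->ts_usedfpu = false;
	model_resched_point(c);	/* same window, scheduler held off */
	c->cr0_ts = true;
	c->preempt_count--;
	assert(c->preempt_count >= 0);
	model_resched_point(c);
}
```

Starting from a task that owns the FPU (cr0_ts false, ts_usedfpu true),
the unsafe variant lets the modelled scheduler observe one inconsistent
state, while the safe variant lets it observe none.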

Instead of just adding random "preempt_disable/enable()" calls to what
is already excessively ugly code, this introduces some helper functions
that mostly mirror the "kernel_fpu_begin/end()" functionality, just for
the user state instead.

Those helper functions should probably eventually replace the other
ad-hoc CR0.TS and TS_USEDFPU tests too, but I'll need to think about it
some more: the task switching functionality in particular needs to
expose the difference between the 'prev' and 'next' threads, while the
new helper functions intentionally were written to only work with
'current'.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: stable@kernel.org	# Fixes preemption bug and helps backporting
---
 arch/x86/include/asm/i387.h |   42 ++++++++++++++++++++++++++++++++++++++++++
 arch/x86/kernel/xsave.c     |   10 +++-------
 2 files changed, 45 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/i387.h b/arch/x86/include/asm/i387.h
index 262bea981aa5..6e87fa43c357 100644
--- a/arch/x86/include/asm/i387.h
+++ b/arch/x86/include/asm/i387.h
@@ -400,6 +400,48 @@ static inline void irq_ts_restore(int TS_state)
 }
 
 /*
+ * The question "does this thread have fpu access?"
+ * is slightly racy, since preemption could come in
+ * and revoke it immediately after the test.
+ *
+ * However, even in that very unlikely scenario,
+ * we can just assume we have FPU access - typically
+ * to save the FP state - we'll just take a #NM
+ * fault and get the FPU access back.
+ *
+ * The actual user_fpu_begin/end() functions
+ * need to be preemption-safe, though.
+ *
+ * NOTE! user_fpu_end() must be used only after you
+ * have saved the FP state, and user_fpu_begin() must
+ * be used only immediately before restoring it.
+ * These functions do not do any save/restore on
+ * their own.
+ */
+static inline int user_has_fpu(void)
+{
+	return current_thread_info()->status & TS_USEDFPU;
+}
+
+static inline void user_fpu_end(void)
+{
+	preempt_disable();
+	current_thread_info()->status &= ~TS_USEDFPU;
+	stts();
+	preempt_enable();
+}
+
+static inline void user_fpu_begin(void)
+{
+	preempt_disable();
+	if (!user_has_fpu()) {
+		clts();
+		current_thread_info()->status |= TS_USEDFPU;
+	}
+	preempt_enable();
+}
+
+/*
  * These disable preemption on their own and are safe
  */
 static inline void save_init_fpu(struct task_struct *tsk)
diff --git a/arch/x86/kernel/xsave.c b/arch/x86/kernel/xsave.c
index a3911343976b..86f1f09a738a 100644
--- a/arch/x86/kernel/xsave.c
+++ b/arch/x86/kernel/xsave.c
@@ -168,7 +168,7 @@ int save_i387_xstate(void __user *buf)
 	if (!used_math())
 		return 0;
 
-	if (task_thread_info(tsk)->status & TS_USEDFPU) {
+	if (user_has_fpu()) {
 		if (use_xsave())
 			err = xsave_user(buf);
 		else
@@ -176,8 +176,7 @@ int save_i387_xstate(void __user *buf)
 
 		if (err)
 			return err;
-		task_thread_info(tsk)->status &= ~TS_USEDFPU;
-		stts();
+		user_fpu_end();
 	} else {
 		sanitize_i387_state(tsk);
 		if (__copy_to_user(buf, &tsk->thread.fpu.state->fxsave,
@@ -292,10 +291,7 @@ int restore_i387_xstate(void __user *buf)
 			return err;
 	}
 
-	if (!(task_thread_info(current)->status & TS_USEDFPU)) {
-		clts();
-		task_thread_info(current)->status |= TS_USEDFPU;
-	}
+	user_fpu_begin();
 	if (use_xsave())
 		err = restore_user_xstate(buf);
 	else
-- 
1.7.9.188.g12766.dirty

