From: Greg KH <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: torvalds@linux-foundation.org, akpm@linux-foundation.org,
	alan@lxorguk.ukuu.org.uk,
	Arjan van de Ven <arjan@linux.intel.com>,
	Peter Anvin <hpa@zytor.com>
Subject: [ 54/73] i387: move TS_USEDFPU flag from thread_info to task_struct
Date: Mon, 27 Feb 2012 17:02:57 -0800
Message-ID: <20120228010213.201264229@linuxfoundation.org>
In-Reply-To: <20120228010246.GA24299@kroah.com>

3.0-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Linus Torvalds <torvalds@linux-foundation.org>

commit f94edacf998516ac9d849f7bc6949a703977a7f3 upstream.

This moves the bit that indicates whether a thread has ownership of the
FPU from the TS_USEDFPU bit in thread_info->status to a word of its own
(called 'has_fpu') in task_struct->thread.has_fpu.

This fixes two independent bugs at the same time:

 - changing 'thread_info->status' from the scheduler causes nasty
   problems for the other users of that variable, since it is defined to
   be thread-synchronous (that's what the "TS_" part of the naming was
   supposed to indicate).

   So perfectly valid code could (and did) do

	ti->status |= TS_RESTORE_SIGMASK;

   and the compiler was free to do that as separate load, 'or', and
   store instructions.  That can cause problems with preemption, since a
   task switch could happen in between and change the TS_USEDFPU bit.
   The change to TS_USEDFPU would then be overwritten by the final store.

   In practice, this seldom happened, though, because the 'status' field
   was seldom used more than once, so gcc would generally tend to
   generate code that used a read-modify-write instruction and thus
   happened to avoid this problem - RMW instructions are naturally low
   fat and preemption-safe.

 - On x86-32, the current_thread_info() pointer would, during interrupts
   and softirqs, point to a *copy* of the real thread_info, because
   x86-32 uses %esp to calculate the thread_info address, and thus the
   separate irq (and softirq) stacks would cause these kinds of odd
   thread_info copy aliases.

   This is normally not a problem, since interrupts aren't supposed to
   look at thread information anyway (what thread is running at
   interrupt time really isn't very well-defined), but it confused the
   heck out of irq_fpu_usable() and the code that tried to squirrel
   away the FPU state.

   (It also caused untold confusion for us poor kernel developers).
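
As a rough illustration of the x86-32 aliasing in the second item above,
the sketch below derives a thread_info address by masking a stack
pointer.  It is a user-space model, not kernel code: THREAD_SIZE is
assumed to be 8 KiB, and thread_info_from_sp() and the sample stack
pointers are hypothetical names and values picked for the example.

```c
#include <stdint.h>

/* Assumption: 8 KiB, size-aligned kernel stacks. */
#define THREAD_SIZE 8192u

/*
 * Model of an %esp-based lookup: the thread_info is taken to live at
 * the lowest address of whatever stack the CPU is currently on.
 */
static uintptr_t thread_info_from_sp(uintptr_t sp)
{
	return sp & ~((uintptr_t)THREAD_SIZE - 1);
}

/* Hypothetical stack pointers: one on the task stack, one on an irq stack. */
static const uintptr_t task_sp = 0xc1234f00u;
static const uintptr_t irq_sp  = 0xc5678e80u;
```

Under these assumptions thread_info_from_sp(task_sp) and
thread_info_from_sp(irq_sp) give different addresses for the same
running task, so a flag changed through one of them is not seen through
the other unless something copies it across.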

It also turns out that using 'task_struct' is actually much more natural
for most of the call sites that care about the FPU state, since they
tend to work with the task struct for other reasons anyway (i.e.
scheduling).  And the FPU data that we are going to save/restore is
found there too.
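
To make the first bug concrete, here is a small user-space sketch of the
lost update and of how a separate word avoids it.  Nothing here is
kernel code: struct fake_thread, the fake_task_switch_*() functions and
the set_sigmask_*() helpers are all hypothetical, the value of
TS_RESTORE_SIGMASK is assumed, and the "task switch" is simply a
function call placed between the load and the store.

```c
/*
 * TS_USEDFPU is the value this patch removes from thread_info.h;
 * the TS_RESTORE_SIGMASK value is an assumption for illustration.
 */
#define TS_USEDFPU		0x0001u
#define TS_RESTORE_SIGMASK	0x0008u

struct fake_thread {
	unsigned int status;	/* models thread_info->status */
	unsigned int has_fpu;	/* models task_struct->thread.has_fpu */
};

/* Before: the scheduler drops FPU ownership by clearing a bit in 'status'. */
static void fake_task_switch_old(struct fake_thread *t)
{
	t->status &= ~TS_USEDFPU;
}

/* After: the scheduler only writes the dedicated word. */
static void fake_task_switch_new(struct fake_thread *t)
{
	t->has_fpu = 0;
}

/* 'status |= TS_RESTORE_SIGMASK' as load/or/store, preempted in between. */
static void set_sigmask_old(struct fake_thread *t)
{
	unsigned int tmp = t->status;	/* load */

	fake_task_switch_old(t);	/* task switch clears TS_USEDFPU */
	tmp |= TS_RESTORE_SIGMASK;	/* or */
	t->status = tmp;		/* store: stale TS_USEDFPU comes back */
}

/* Same sequence, but the task switch no longer touches 'status'. */
static void set_sigmask_new(struct fake_thread *t)
{
	unsigned int tmp = t->status;	/* load */

	fake_task_switch_new(t);	/* task switch writes has_fpu only */
	tmp |= TS_RESTORE_SIGMASK;	/* or */
	t->status = tmp;		/* store: has_fpu is unaffected */
}
```

Starting from status == TS_USEDFPU, set_sigmask_old() ends with
TS_USEDFPU still set even though the simulated task switch cleared it.
With set_sigmask_new() the scheduler's write to has_fpu survives,
because the final store only covers 'status'.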

Thanks to Arjan van de Ven <arjan@linux.intel.com> for pointing us to
the %esp issue.

Cc: Arjan van de Ven <arjan@linux.intel.com>
Reported-and-tested-by: Raphael Prevost <raphael@buro.asia>
Acked-and-tested-by: Suresh Siddha <suresh.b.siddha@intel.com>
Tested-by: Peter Anvin <hpa@zytor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/include/asm/i387.h        |   44 ++++++++++++++++++-------------------
 arch/x86/include/asm/processor.h   |    1 
 arch/x86/include/asm/thread_info.h |    2 -
 arch/x86/kernel/traps.c            |   11 ++++-----
 arch/x86/kernel/xsave.c            |    2 -
 arch/x86/kvm/vmx.c                 |    2 -
 6 files changed, 30 insertions(+), 32 deletions(-)

--- a/arch/x86/include/asm/i387.h
+++ b/arch/x86/include/asm/i387.h
@@ -264,21 +264,21 @@ static inline int restore_fpu_checking(s
  * be preemption protection *and* they need to be
  * properly paired with the CR0.TS changes!
  */
-static inline int __thread_has_fpu(struct thread_info *ti)
+static inline int __thread_has_fpu(struct task_struct *tsk)
 {
-	return ti->status & TS_USEDFPU;
+	return tsk->thread.has_fpu;
 }
 
 /* Must be paired with an 'stts' after! */
-static inline void __thread_clear_has_fpu(struct thread_info *ti)
+static inline void __thread_clear_has_fpu(struct task_struct *tsk)
 {
-	ti->status &= ~TS_USEDFPU;
+	tsk->thread.has_fpu = 0;
 }
 
 /* Must be paired with a 'clts' before! */
-static inline void __thread_set_has_fpu(struct thread_info *ti)
+static inline void __thread_set_has_fpu(struct task_struct *tsk)
 {
-	ti->status |= TS_USEDFPU;
+	tsk->thread.has_fpu = 1;
 }
 
 /*
@@ -288,16 +288,16 @@ static inline void __thread_set_has_fpu(
  * These generally need preemption protection to work,
  * do try to avoid using these on their own.
  */
-static inline void __thread_fpu_end(struct thread_info *ti)
+static inline void __thread_fpu_end(struct task_struct *tsk)
 {
-	__thread_clear_has_fpu(ti);
+	__thread_clear_has_fpu(tsk);
 	stts();
 }
 
-static inline void __thread_fpu_begin(struct thread_info *ti)
+static inline void __thread_fpu_begin(struct task_struct *tsk)
 {
 	clts();
-	__thread_set_has_fpu(ti);
+	__thread_set_has_fpu(tsk);
 }
 
 /*
@@ -308,21 +308,21 @@ extern int restore_i387_xstate(void __us
 
 static inline void __unlazy_fpu(struct task_struct *tsk)
 {
-	if (__thread_has_fpu(task_thread_info(tsk))) {
+	if (__thread_has_fpu(tsk)) {
 		__save_init_fpu(tsk);
-		__thread_fpu_end(task_thread_info(tsk));
+		__thread_fpu_end(tsk);
 	} else
 		tsk->fpu_counter = 0;
 }
 
 static inline void __clear_fpu(struct task_struct *tsk)
 {
-	if (__thread_has_fpu(task_thread_info(tsk))) {
+	if (__thread_has_fpu(tsk)) {
 		/* Ignore delayed exceptions from user space */
 		asm volatile("1: fwait\n"
 			     "2:\n"
 			     _ASM_EXTABLE(1b, 2b));
-		__thread_fpu_end(task_thread_info(tsk));
+		__thread_fpu_end(tsk);
 	}
 }
 
@@ -337,7 +337,7 @@ static inline void __clear_fpu(struct ta
  */
 static inline bool interrupted_kernel_fpu_idle(void)
 {
-	return !__thread_has_fpu(current_thread_info()) &&
+	return !__thread_has_fpu(current) &&
 		(read_cr0() & X86_CR0_TS);
 }
 
@@ -371,12 +371,12 @@ static inline bool irq_fpu_usable(void)
 
 static inline void kernel_fpu_begin(void)
 {
-	struct thread_info *me = current_thread_info();
+	struct task_struct *me = current;
 
 	WARN_ON_ONCE(!irq_fpu_usable());
 	preempt_disable();
 	if (__thread_has_fpu(me)) {
-		__save_init_fpu(me->task);
+		__save_init_fpu(me);
 		__thread_clear_has_fpu(me);
 		/* We do 'stts()' in kernel_fpu_end() */
 	} else
@@ -441,13 +441,13 @@ static inline void irq_ts_restore(int TS
  */
 static inline int user_has_fpu(void)
 {
-	return __thread_has_fpu(current_thread_info());
+	return __thread_has_fpu(current);
 }
 
 static inline void user_fpu_end(void)
 {
 	preempt_disable();
-	__thread_fpu_end(current_thread_info());
+	__thread_fpu_end(current);
 	preempt_enable();
 }
 
@@ -455,7 +455,7 @@ static inline void user_fpu_begin(void)
 {
 	preempt_disable();
 	if (!user_has_fpu())
-		__thread_fpu_begin(current_thread_info());
+		__thread_fpu_begin(current);
 	preempt_enable();
 }
 
@@ -464,10 +464,10 @@ static inline void user_fpu_begin(void)
  */
 static inline void save_init_fpu(struct task_struct *tsk)
 {
-	WARN_ON_ONCE(!__thread_has_fpu(task_thread_info(tsk)));
+	WARN_ON_ONCE(!__thread_has_fpu(tsk));
 	preempt_disable();
 	__save_init_fpu(tsk);
-	__thread_fpu_end(task_thread_info(tsk));
+	__thread_fpu_end(tsk);
 	preempt_enable();
 }
 
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -454,6 +454,7 @@ struct thread_struct {
 	unsigned long		trap_no;
 	unsigned long		error_code;
 	/* floating point and extended processor state */
+	unsigned long		has_fpu;
 	struct fpu		fpu;
 #ifdef CONFIG_X86_32
 	/* Virtual 86 mode info */
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -242,8 +242,6 @@ static inline struct thread_info *curren
  * ever touches our thread-synchronous status, so we don't
  * have to worry about atomic accesses.
  */
-#define TS_USEDFPU		0x0001	/* FPU was used by this task
-					   this quantum (SMP) */
 #define TS_COMPAT		0x0002	/* 32bit syscall active (64BIT)*/
 #define TS_POLLING		0x0004	/* idle task polling need_resched,
 					   skip sending interrupt */
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -728,12 +728,11 @@ asmlinkage void __attribute__((weak)) sm
  */
 void math_state_restore(void)
 {
-	struct thread_info *thread = current_thread_info();
-	struct task_struct *tsk = thread->task;
+	struct task_struct *tsk = current;
 
 	/* We need a safe address that is cheap to find and that is already
-	   in L1. We just brought in "thread->task", so use that */
-#define safe_address (thread->task)
+	   in L1. We're just bringing in "tsk->thread.has_fpu", so use that */
+#define safe_address (tsk->thread.has_fpu)
 
 	if (!tsk_used_math(tsk)) {
 		local_irq_enable();
@@ -750,7 +749,7 @@ void math_state_restore(void)
 		local_irq_disable();
 	}
 
-	__thread_fpu_begin(thread);
+	__thread_fpu_begin(tsk);
 
 	/* AMD K7/K8 CPUs don't save/restore FDP/FIP/FOP unless an exception
 	   is pending.  Clear the x87 state here by setting it to fixed
@@ -766,7 +765,7 @@ void math_state_restore(void)
 	 * Paranoid restore. send a SIGSEGV if we fail to restore the state.
 	 */
 	if (unlikely(restore_fpu_checking(tsk))) {
-		__thread_fpu_end(thread);
+		__thread_fpu_end(tsk);
 		force_sig(SIGSEGV, tsk);
 		return;
 	}
--- a/arch/x86/kernel/xsave.c
+++ b/arch/x86/kernel/xsave.c
@@ -47,7 +47,7 @@ void __sanitize_i387_state(struct task_s
 	if (!fx)
 		return;
 
-	BUG_ON(__thread_has_fpu(task_thread_info(tsk)));
+	BUG_ON(__thread_has_fpu(tsk));
 
 	xstate_bv = tsk->thread.fpu.state->xsave.xsave_hdr.xstate_bv;
 
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -948,7 +948,7 @@ static void __vmx_load_host_state(struct
 #ifdef CONFIG_X86_64
 	wrmsrl(MSR_KERNEL_GS_BASE, vmx->msr_host_kernel_gs_base);
 #endif
-	if (__thread_has_fpu(current_thread_info()))
+	if (__thread_has_fpu(current))
 		clts();
 	load_gdt(&__get_cpu_var(host_gdt));
 }


