From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Andy Lutomirski <luto@kernel.org>,
Steven Rostedt <rostedt@goodmis.org>,
Borislav Petkov <bp@suse.de>,
Linus Torvalds <torvalds@linux-foundation.org>,
Peter Zijlstra <peterz@infradead.org>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@kernel.org>
Subject: [PATCH 3.14 79/84] x86/nmi/64: Switch stacks on userspace NMI entry
Date: Tue, 29 Sep 2015 17:19:11 +0200
Message-ID: <20150929145334.482419451@linuxfoundation.org>
In-Reply-To: <20150929145330.924730721@linuxfoundation.org>
3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@kernel.org>

commit 9b6e6a8334d56354853f9c255d1395c2ba570e0a upstream.

Returning to userspace is tricky: IRET can fail, and ESPFIX can
rearrange the stack prior to IRET.

The NMI nesting fixup relies on a precise stack layout and
atomic IRET. Rather than trying to teach the NMI nesting fixup
to handle ESPFIX and failed IRET, punt: run NMIs that came from
user mode on the normal kernel stack.

This will make some nested NMIs visible to C code, but the C
code is okay with that.

As a side effect, this should speed up perf: it eliminates an
RDMSR when NMIs come from user mode.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
arch/x86/kernel/entry_64.S | 77 ++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 73 insertions(+), 4 deletions(-)
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -1715,19 +1715,88 @@ ENTRY(nmi)
* a nested NMI that updated the copy interrupt stack frame, a
* jump will be made to the repeat_nmi code that will handle the second
* NMI.
+ *
+ * However, espfix prevents us from directly returning to userspace
+ * with a single IRET instruction. Similarly, IRET to user mode
+ * can fault. We therefore handle NMIs from user space like
+ * other IST entries.
*/
/* Use %rdx as out temp variable throughout */
pushq_cfi %rdx
CFI_REL_OFFSET rdx, 0
+ testb $3, CS-RIP+8(%rsp)
+ jz .Lnmi_from_kernel
+
+ /*
+ * NMI from user mode. We need to run on the thread stack, but we
+ * can't go through the normal entry paths: NMIs are masked, and
+ * we don't want to enable interrupts, because then we'll end
+ * up in an awkward situation in which IRQs are on but NMIs
+ * are off.
+ */
+ SWAPGS
+ cld
+ movq %rsp, %rdx
+ movq PER_CPU_VAR(kernel_stack), %rsp
+ addq $KERNEL_STACK_OFFSET, %rsp
+ pushq 5*8(%rdx) /* pt_regs->ss */
+ pushq 4*8(%rdx) /* pt_regs->rsp */
+ pushq 3*8(%rdx) /* pt_regs->flags */
+ pushq 2*8(%rdx) /* pt_regs->cs */
+ pushq 1*8(%rdx) /* pt_regs->rip */
+ pushq $-1 /* pt_regs->orig_ax */
+ pushq %rdi /* pt_regs->di */
+ pushq %rsi /* pt_regs->si */
+ pushq (%rdx) /* pt_regs->dx */
+ pushq %rcx /* pt_regs->cx */
+ pushq %rax /* pt_regs->ax */
+ pushq %r8 /* pt_regs->r8 */
+ pushq %r9 /* pt_regs->r9 */
+ pushq %r10 /* pt_regs->r10 */
+ pushq %r11 /* pt_regs->r11 */
+ pushq %rbx /* pt_regs->rbx */
+ pushq %rbp /* pt_regs->rbp */
+ pushq %r12 /* pt_regs->r12 */
+ pushq %r13 /* pt_regs->r13 */
+ pushq %r14 /* pt_regs->r14 */
+ pushq %r15 /* pt_regs->r15 */
+
+ /*
+ * At this point we no longer need to worry about stack damage
+ * due to nesting -- we're on the normal thread stack and we're
+ * done with the NMI stack.
+ */
+ movq %rsp, %rdi
+ movq $-1, %rsi
+ call do_nmi
+
+ /*
+ * Return back to user mode. We must *not* do the normal exit
+ * work, because we don't want to enable interrupts. Fortunately,
+ * do_nmi doesn't modify pt_regs.
+ */
+ SWAPGS
+
/*
- * If %cs was not the kernel segment, then the NMI triggered in user
- * space, which means it is definitely not nested.
+ * Open-code the entire return process for compatibility with varying
+ * register layouts across different kernel versions.
*/
- cmpl $__KERNEL_CS, 16(%rsp)
- jne first_nmi
+ addq $6*8, %rsp /* skip bx, bp, and r12-r15 */
+ popq %r11 /* pt_regs->r11 */
+ popq %r10 /* pt_regs->r10 */
+ popq %r9 /* pt_regs->r9 */
+ popq %r8 /* pt_regs->r8 */
+ popq %rax /* pt_regs->ax */
+ popq %rcx /* pt_regs->cx */
+ popq %rdx /* pt_regs->dx */
+ popq %rsi /* pt_regs->si */
+ popq %rdi /* pt_regs->di */
+ addq $8, %rsp /* skip orig_ax */
+ INTERRUPT_RETURN
+.Lnmi_from_kernel:
/*
* Check the special variable on the stack to see if NMIs are
* executing.
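
Reviewer's aid, not part of the patch: the user-mode entry path in the
hunk above can be modelled in plain C to check the frame arithmetic.
This is a minimal sketch under stated assumptions. The struct and
function names (model_pt_regs, model_iret_frame, nmi_from_user,
build_regs, words_before_iret) are hypothetical and exist only for
illustration; the field order simply follows the pushq sequence in the
hunk, read from the lowest address upward. It is not kernel code and
does not touch real registers or stacks.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical model of the frame built by the 21 pushq instructions,
 * lowest address first (the last register pushed sits lowest).
 */
struct model_pt_regs {
	uint64_t r15, r14, r13, r12, bp, bx;            /* skipped on return */
	uint64_t r11, r10, r9, r8, ax, cx, dx, si, di;  /* popped on return  */
	uint64_t orig_ax;                               /* set to -1         */
	uint64_t ip, cs, flags, sp, ss;                 /* IRET frame        */
};

/* Hypothetical model of the hardware frame found on the NMI stack. */
struct model_iret_frame {
	uint64_t ip, cs, flags, sp, ss;
};

/*
 * Models "testb $3, CS-RIP+8(%rsp)": the low two bits of the saved CS
 * are the privilege level, so any nonzero value means the NMI did not
 * interrupt ring 0.
 */
static int nmi_from_user(const struct model_iret_frame *f)
{
	return (f->cs & 3) != 0;
}

/*
 * Models the pushq sequence: copy the five-word IRET frame from the NMI
 * stack and store -1 in orig_ax. The general-purpose registers are left
 * zero here because this model has no register state to save.
 */
static void build_regs(struct model_pt_regs *regs,
		       const struct model_iret_frame *f)
{
	memset(regs, 0, sizeof(*regs));
	regs->ip = f->ip;
	regs->cs = f->cs;
	regs->flags = f->flags;
	regs->sp = f->sp;
	regs->ss = f->ss;
	regs->orig_ax = (uint64_t)-1;
}

/*
 * Counts the 8-byte words the open-coded return path consumes before
 * INTERRUPT_RETURN, which must leave the stack pointer at the saved RIP.
 */
static size_t words_before_iret(void)
{
	return 6	/* addq $6*8, %rsp: bx, bp, r12-r15 */
	     + 9	/* nine popq: r11, r10, r9, r8, ax, cx, dx, si, di */
	     + 1;	/* addq $8, %rsp: orig_ax */
}
```

In this model, words_before_iret() * 8 equals
offsetof(struct model_pt_regs, ip), which is the property the
open-coded return depends on: after the skip, the nine pops, and the
final addq, the stack pointer sits at the start of the five-word IRET
frame. Calling nmi_from_user() with a CS whose low two bits are clear
corresponds to the jz .Lnmi_from_kernel branch.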