From: tip-bot for Andy Lutomirski <tipbot@zytor.com>
To: linux-tip-commits@vger.kernel.org
Cc: hpa@zytor.com, dvlasenk@redhat.com, peterz@infradead.org,
	brgerst@gmail.com, torvalds@linux-foundation.org, bp@alien8.de,
	tglx@linutronix.de, luto@amacapital.net, luto@kernel.org,
	mingo@kernel.org, linux-kernel@vger.kernel.org
Subject: [tip:x86/asm] x86/sched/64: Don't save flags on context switch (reinstated)
Date: Fri, 25 Sep 2015 05:21:46 -0700
Message-ID: <tip-3f2c5085ed99b6ad233cf77009c2f4f898b2f7c8@git.kernel.org>
In-Reply-To: <d4440fdc2a89247bffb7c003d2a9a2952bd46827.1441146105.git.luto@kernel.org>

Commit-ID:  3f2c5085ed99b6ad233cf77009c2f4f898b2f7c8
Gitweb:     http://git.kernel.org/tip/3f2c5085ed99b6ad233cf77009c2f4f898b2f7c8
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Tue, 1 Sep 2015 15:41:06 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 25 Sep 2015 09:29:17 +0200

x86/sched/64: Don't save flags on context switch (reinstated)

This reinstates the following commit:

  2c7577a75837 ("sched/x86_64: Don't save flags on context switch")

which was reverted in:

  512255a2ad2c ("Revert 'sched/x86_64: Don't save flags on context switch'")

Historically, Linux has always saved and restored EFLAGS across
context switches.  As far as I know, the only reason to do this
is the NT flag.  In particular, if something calls switch_to()
with the NT flag set, then we don't want to leak the NT flag into
a different task that might try to IRET and fail because NT is
set.

Before this commit:

  8c7aa698baca ("x86_64, entry: Filter RFLAGS.NT on entry from userspace")

we could run system call bodies with NT set.  This would be a DoS or possibly
privilege escalation hole if scheduling in such a system call leaked
NT into a different task.

Importantly, we don't need to worry about NT being set while
preemptible or across page faults.  The only way we can schedule
due to preemption or a page fault is in an interrupt entry that
nests inside the SYSENTER prologue.  The CPU will clear NT when
entering through an interrupt gate, so we won't schedule with NT
set.

The only other interesting flags are IOPL and AC.  Allowing
switch_to() to change IOPL has no effect, as the value loaded
during kernel execution doesn't matter at all except between a
SYSENTER entry and the subsequent PUSHF, and anything that
interrupts in that window will restore IOPL on return.

If we call __switch_to() with AC set, we have bigger problems.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/d4440fdc2a89247bffb7c003d2a9a2952bd46827.1441146105.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/switch_to.h | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)
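
For readers who want to see the argument in miniature, here is a rough
userspace sketch of the reasoning in the changelog.  It is only a model:
the struct and function names (model_cpu, model_task, model_switch_*,
model_entry_filter_nt) are hypothetical and do not exist in the kernel.
The FLAG_* values are the architectural RFLAGS bit positions, defined
locally rather than taken from the kernel's X86_EFLAGS_* headers.  It
tries to show that dropping pushf/popf lets NT leak only if NT can reach
the switch in the first place, which entry filtering is meant to rule out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural RFLAGS bits discussed in the changelog. */
#define FLAG_IOPL (UINT64_C(3) << 12) /* I/O privilege level, bits 12-13 */
#define FLAG_NT   (UINT64_C(1) << 14) /* nested task, bit 14 */
#define FLAG_AC   (UINT64_C(1) << 18) /* alignment check, bit 18 */

/* Hypothetical model types: one live flags register, one saved copy per task. */
struct model_cpu  { uint64_t flags; };
struct model_task { uint64_t saved_flags; };

/* Models the old SAVE_CONTEXT/RESTORE_CONTEXT: pushf going out, popf coming in. */
void model_switch_saving_flags(struct model_cpu *cpu,
                               struct model_task *prev,
                               struct model_task *next)
{
	assert(cpu && prev && next);
	prev->saved_flags = cpu->flags;
	cpu->flags = next->saved_flags;
}

/* Models the new macros: the flags register is left untouched by the switch. */
void model_switch_keeping_flags(struct model_cpu *cpu,
                                struct model_task *prev,
                                struct model_task *next)
{
	assert(cpu && prev && next);
	(void)prev;
	(void)next;
}

/* Models entry code that clears NT before the system call body runs. */
void model_entry_filter_nt(struct model_cpu *cpu)
{
	assert(cpu);
	cpu->flags &= ~FLAG_NT;
}

/* True if the next task would start running with NT set. */
bool model_nt_set(const struct model_cpu *cpu)
{
	assert(cpu);
	return (cpu->flags & FLAG_NT) != 0;
}
```

In this model, calling model_switch_keeping_flags() with FLAG_NT set in
cpu->flags leaves NT visible to the next task, whereas calling
model_entry_filter_nt() first leaves nothing to leak.  That is the
property the changelog relies on.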

diff --git a/arch/x86/include/asm/switch_to.h b/arch/x86/include/asm/switch_to.h
index d7f3b3b..751bf4b 100644
--- a/arch/x86/include/asm/switch_to.h
+++ b/arch/x86/include/asm/switch_to.h
@@ -79,12 +79,12 @@ do {									\
 #else /* CONFIG_X86_32 */
 
 /* frame pointer must be last for get_wchan */
-#define SAVE_CONTEXT    "pushf ; pushq %%rbp ; movq %%rsi,%%rbp\n\t"
-#define RESTORE_CONTEXT "movq %%rbp,%%rsi ; popq %%rbp ; popf\t"
+#define SAVE_CONTEXT    "pushq %%rbp ; movq %%rsi,%%rbp\n\t"
+#define RESTORE_CONTEXT "movq %%rbp,%%rsi ; popq %%rbp\t"
 
 #define __EXTRA_CLOBBER  \
 	, "rcx", "rbx", "rdx", "r8", "r9", "r10", "r11", \
-	  "r12", "r13", "r14", "r15"
+	  "r12", "r13", "r14", "r15", "flags"
 
 #ifdef CONFIG_CC_STACKPROTECTOR
 #define __switch_canary							  \
@@ -100,7 +100,11 @@ do {									\
 #define __switch_canary_iparam
 #endif	/* CC_STACKPROTECTOR */
 
-/* Save restore flags to clear handle leaking NT */
+/*
+ * There is no need to save or restore flags, because flags are always
+ * clean in kernel mode, with the possible exception of IOPL.  Kernel IOPL
+ * has no effect.
+ */
 #define switch_to(prev, next, last) \
 	asm volatile(SAVE_CONTEXT					  \
 	     "movq %%rsp,%P[threadrsp](%[prev])\n\t" /* save RSP */	  \
