From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755830AbbIYMXZ (ORCPT ); Fri, 25 Sep 2015 08:23:25 -0400
Received: from terminus.zytor.com ([198.137.202.10]:39357 "EHLO
	terminus.zytor.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1756071AbbIYMW6 (ORCPT );
	Fri, 25 Sep 2015 08:22:58 -0400
Date: Fri, 25 Sep 2015 05:21:46 -0700
From: tip-bot for Andy Lutomirski
Message-ID:
Cc: hpa@zytor.com, dvlasenk@redhat.com, peterz@infradead.org,
	brgerst@gmail.com, torvalds@linux-foundation.org, bp@alien8.de,
	tglx@linutronix.de, luto@amacapital.net, luto@kernel.org,
	mingo@kernel.org, linux-kernel@vger.kernel.org
Reply-To: luto@kernel.org, mingo@kernel.org, bp@alien8.de,
	luto@amacapital.net, tglx@linutronix.de,
	linux-kernel@vger.kernel.org, peterz@infradead.org,
	brgerst@gmail.com, hpa@zytor.com, dvlasenk@redhat.com,
	torvalds@linux-foundation.org
In-Reply-To:
References:
To: linux-tip-commits@vger.kernel.org
Subject: [tip:x86/asm] x86/sched/64: Don't save flags on context switch (reinstated)
Git-Commit-ID: 3f2c5085ed99b6ad233cf77009c2f4f898b2f7c8
X-Mailer: tip-git-log-daemon
Robot-ID:
Robot-Unsubscribe: Contact to get blacklisted from these emails
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=UTF-8
Content-Disposition: inline
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Commit-ID:  3f2c5085ed99b6ad233cf77009c2f4f898b2f7c8
Gitweb:     http://git.kernel.org/tip/3f2c5085ed99b6ad233cf77009c2f4f898b2f7c8
Author:     Andy Lutomirski
AuthorDate: Tue, 1 Sep 2015 15:41:06 -0700
Committer:  Ingo Molnar
CommitDate: Fri, 25 Sep 2015 09:29:17 +0200

x86/sched/64: Don't save flags on context switch (reinstated)

This reinstates the following commit:

  2c7577a75837 ("sched/x86_64: Don't save flags on context switch")

which was reverted in:

  512255a2ad2c ("Revert 'sched/x86_64: Don't save flags on context switch'")

Historically, Linux has always saved and restored EFLAGS across
context switches.  As far as I know, the only reason to do this
is because of the NT flag.  In particular, if something calls
switch_to() with the NT flag set, then we don't want to leak the
NT flag into a different task that might try to IRET and fail
because NT is set.

Before this commit:

  8c7aa698baca ("x86_64, entry: Filter RFLAGS.NT on entry from userspace")

we could run system call bodies with NT set.  This would be a DoS
or possibly privilege escalation hole if scheduling in such a
system call would leak NT into a different task.

Importantly, we don't need to worry about NT being set while
preemptible or across page faults.  The only way we can schedule
due to preemption or a page fault is in an interrupt entry that
nests inside the SYSENTER prologue.  The CPU will clear NT when
entering through an interrupt gate, so we won't schedule with NT
set.

The only other interesting flags are IOPL and AC.  Allowing
switch_to() to change IOPL has no effect, as the value loaded
during kernel execution doesn't matter at all except between a
SYSENTER entry and the subsequent PUSHF, and anything that
interrupts in that window will restore IOPL on return.

If we call __switch_to() with AC set, we have bigger problems.

Signed-off-by: Andy Lutomirski
Cc: Andy Lutomirski
Cc: Borislav Petkov
Cc: Brian Gerst
Cc: Denys Vlasenko
Cc: H. Peter Anvin
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/d4440fdc2a89247bffb7c003d2a9a2952bd46827.1441146105.git.luto@kernel.org
Signed-off-by: Ingo Molnar
---
 arch/x86/include/asm/switch_to.h | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/switch_to.h b/arch/x86/include/asm/switch_to.h
index d7f3b3b..751bf4b 100644
--- a/arch/x86/include/asm/switch_to.h
+++ b/arch/x86/include/asm/switch_to.h
@@ -79,12 +79,12 @@ do { \
 #else /* CONFIG_X86_32 */
 
 /* frame pointer must be last for get_wchan */
-#define SAVE_CONTEXT "pushf ; pushq %%rbp ; movq %%rsi,%%rbp\n\t"
-#define RESTORE_CONTEXT "movq %%rbp,%%rsi ; popq %%rbp ; popf\t"
+#define SAVE_CONTEXT "pushq %%rbp ; movq %%rsi,%%rbp\n\t"
+#define RESTORE_CONTEXT "movq %%rbp,%%rsi ; popq %%rbp\t"
 
 #define __EXTRA_CLOBBER \
 	, "rcx", "rbx", "rdx", "r8", "r9", "r10", "r11", \
-	  "r12", "r13", "r14", "r15"
+	  "r12", "r13", "r14", "r15", "flags"
 
 #ifdef CONFIG_CC_STACKPROTECTOR
 #define __switch_canary \
@@ -100,7 +100,11 @@ do { \
 #define __switch_canary_iparam
 #endif /* CC_STACKPROTECTOR */
 
-/* Save restore flags to clear handle leaking NT */
+/*
+ * There is no need to save or restore flags, because flags are always
+ * clean in kernel mode, with the possible exception of IOPL. Kernel IOPL
+ * has no effect.
+ */
 #define switch_to(prev, next, last) \
 	asm volatile(SAVE_CONTEXT \
 	     "movq %%rsp,%P[threadrsp](%[prev])\n\t" /* save RSP */ \