Message-ID: <54299979.6080705@amacapital.net>
Date: Mon, 29 Sep 2014 10:40:09 -0700
From: Andy Lutomirski
To: Anish Bhatt, linux-kernel@vger.kernel.org
CC: x86@kernel.org, tglx@linutronix.de, mingo@redhat.com, hpa@zytor.com,
    sebastian@fds-team.de
Subject: Re: [PATCH] x86 : Ensure X86_FLAGS_NT is cleared on syscall entry
References: <1411674171-24442-1-git-send-email-anish@chelsio.com>
In-Reply-To: <1411674171-24442-1-git-send-email-anish@chelsio.com>

On 09/25/2014 12:42 PM, Anish Bhatt wrote:
> The MSR_SYSCALL_MASK, which is responsible for clearing specific EFLAGS on
> syscall entry, should also clear the nested task (NT) flag to be safe from
> userspace injection. Without this fix the application segmentation
> faults on syscall return because of the changed meaning of the IRET
> instruction.
>
> Further details can be seen here https://bugs.winehq.org/show_bug.cgi?id=33275
>
> Signed-off-by: Anish Bhatt
> Signed-off-by: Sebastian Lackner
> ---
>  arch/x86/kernel/cpu/common.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index e4ab2b4..3126558 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -1184,7 +1184,7 @@ void syscall_init(void)
>  	/* Flags to clear on syscall */
>  	wrmsrl(MSR_SYSCALL_MASK,
>  	       X86_EFLAGS_TF|X86_EFLAGS_DF|X86_EFLAGS_IF|
> -	       X86_EFLAGS_IOPL|X86_EFLAGS_AC);
> +	       X86_EFLAGS_IOPL|X86_EFLAGS_AC|X86_EFLAGS_NT);

Something's weird here, and at the very least the changelog is
insufficiently informative.

The Intel SDM says:

    If the NT flag is set and the processor is in IA-32e mode, the IRET
    instruction causes a general protection exception.

Presumably interrupt delivery clears NT. I haven't spotted where that's
documented yet.

sysret doesn't appear to care about NT at all.

So: the test code doesn't appear to do anything interesting *unless* it
goes through syscall followed by the iret exit path. Then it receives
#GP on return, which turns into a signal.

On the premise that the slow and fast return paths ought to be
indistinguishable from userspace, I think we should fix this. But I
want to understand it better first.

Also, 32-bit may need more care here.

--Andy
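
For illustration, the effect of the mask change in the quoted patch can be modelled in user space. This is only a sketch, not kernel code: it assumes the documented behaviour that SYSCALL clears in RFLAGS every bit that is set in the mask MSR, and it uses the architectural EFLAGS bit positions. The names OLD_SYSCALL_MASK, NEW_SYSCALL_MASK, apply_syscall_mask() and nt_survives_syscall() are hypothetical helpers invented for this example.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Architectural EFLAGS bits (same names as the kernel uses). */
#define X86_EFLAGS_TF   UINT64_C(0x00000100) /* trap flag, bit 8 */
#define X86_EFLAGS_IF   UINT64_C(0x00000200) /* interrupt enable, bit 9 */
#define X86_EFLAGS_DF   UINT64_C(0x00000400) /* direction flag, bit 10 */
#define X86_EFLAGS_IOPL UINT64_C(0x00003000) /* I/O privilege level, bits 12-13 */
#define X86_EFLAGS_NT   UINT64_C(0x00004000) /* nested task, bit 14 */
#define X86_EFLAGS_AC   UINT64_C(0x00040000) /* alignment check, bit 18 */

/* Hypothetical names: mask value before and after the quoted patch. */
#define OLD_SYSCALL_MASK (X86_EFLAGS_TF | X86_EFLAGS_DF | X86_EFLAGS_IF | \
                          X86_EFLAGS_IOPL | X86_EFLAGS_AC)
#define NEW_SYSCALL_MASK (OLD_SYSCALL_MASK | X86_EFLAGS_NT)

/* Model of SYSCALL entry: every flag bit set in the mask is cleared. */
static uint64_t apply_syscall_mask(uint64_t rflags, uint64_t mask)
{
    return rflags & ~mask;
}

/* Returns 1 if NT is still set after entry, i.e. a later IRET would #GP. */
static int nt_survives_syscall(uint64_t user_rflags, uint64_t mask)
{
    return (apply_syscall_mask(user_rflags, mask) & X86_EFLAGS_NT) != 0;
}

#ifdef NT_MASK_DEMO_MAIN
int main(void)
{
    /* Userspace flags with NT and IF set; bit 1 is always 1. */
    uint64_t user_rflags = X86_EFLAGS_NT | X86_EFLAGS_IF | UINT64_C(0x2);

    printf("old mask: 0x%" PRIx64 ", NT survives: %d\n",
           (uint64_t)OLD_SYSCALL_MASK,
           nt_survives_syscall(user_rflags, OLD_SYSCALL_MASK));
    printf("new mask: 0x%" PRIx64 ", NT survives: %d\n",
           (uint64_t)NEW_SYSCALL_MASK,
           nt_survives_syscall(user_rflags, NEW_SYSCALL_MASK));
    return 0;
}
#endif
```

Building with -DNT_MASK_DEMO_MAIN gives a small standalone demo. With the old mask the NT bit set by userspace is still present after entry, which is the state in which an iret exit path would fault; with the new mask it is cleared. The sketch models only the masking step and says nothing about the sysret path or about 32-bit entry.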