From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wi0-f172.google.com ([209.85.212.172]:38142 "EHLO mail-wi0-f172.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751517AbbHRTcl (ORCPT ); Tue, 18 Aug 2015 15:32:41 -0400 Received: by wicja10 with SMTP id ja10so117363951wic.1 for ; Tue, 18 Aug 2015 12:32:39 -0700 (PDT) Subject: Re: [PATCH-v3.14.y 3/6] x86/nmi/64: Switch stacks on userspace NMI entry To: "Thomas D." , "stable@vger.kernel.org" References: <20150817132349.GA26797@kroah.com> <1439852125-6581-1-git-send-email-whissi@whissi.de> <1439852125-6581-4-git-send-email-whissi@whissi.de> <55D35304.8050000@suse.cz> <55D36763.80609@whissi.de> Cc: "luto@kernel.org" , Linus Torvalds , Peter Zijlstra , Thomas Gleixner , Ingo Molnar , Greg Kroah-Hartman From: Jiri Slaby Message-ID: <55D38856.5000801@suse.cz> Date: Tue, 18 Aug 2015 21:32:38 +0200 MIME-Version: 1.0 In-Reply-To: <55D36763.80609@whissi.de> Content-Type: text/plain; charset=iso-8859-2 Content-Transfer-Encoding: 7bit Sender: stable-owner@vger.kernel.org List-ID: On 08/18/2015, 07:12 PM, Thomas D. wrote: >>> --- a/arch/x86/kernel/entry_64.S >>> +++ b/arch/x86/kernel/entry_64.S >>> @@ -1715,19 +1715,88 @@ ENTRY(nmi) >>> * a nested NMI that updated the copy interrupt stack frame, a >>> * jump will be made to the repeat_nmi code that will handle the second >>> * NMI. >>> + * >>> + * However, espfix prevents us from directly returning to userspace >>> + * with a single IRET instruction. Similarly, IRET to user mode >>> + * can fault. We therefore handle NMIs from user space like >>> + * other IST entries. >>> */ >>> >>> /* Use %rdx as out temp variable throughout */ >>> pushq_cfi %rdx >>> CFI_REL_OFFSET rdx, 0 >>> >>> + testb $3, CS-RIP+8(%rsp) >>> + jz .Lnmi_from_kernel >>> + >>> + /* >>> + * NMI from user mode. We need to run on the thread stack, but we >>> + * can't go through the normal entry paths: NMIs are masked, and >>> + * we don't want to enable interrupts, because then we'll end >>> + * up in an awkward situation in which IRQs are on but NMIs >>> + * are off. >>> + */ >>> + >>> + SWAPGS >>> + cld >>> + movq %rsp, %rdx >>> + movq PER_CPU_VAR(kernel_stack), %rsp >> >> I think you are wasting stack space here. With kernel_stack, you should >> add 5*8 (KERNEL_STACK_OFFSET) to the pointer here. I.e. space for 5 >> registers is pre-reserved at kernel_stack already. (Or use movq instead >> of the 5 pushq below.) >> >> Why don't you re-use the 3.16's version anyway? >> >>> + pushq 5*8(%rdx) /* pt_regs->ss */ >>> + pushq 4*8(%rdx) /* pt_regs->rsp */ >>> + pushq 3*8(%rdx) /* pt_regs->flags */ >>> + pushq 2*8(%rdx) /* pt_regs->cs */ >>> + pushq 1*8(%rdx) /* pt_regs->rip */ >>> + pushq $-1 /* pt_regs->orig_ax */ >>> + pushq %rdi /* pt_regs->di */ >>> + pushq %rsi /* pt_regs->si */ >>> + pushq (%rdx) /* pt_regs->dx */ >>> + pushq %rcx /* pt_regs->cx */ >>> + pushq %rax /* pt_regs->ax */ >>> + pushq %r8 /* pt_regs->r8 */ >>> + pushq %r9 /* pt_regs->r9 */ >>> + pushq %r10 /* pt_regs->r10 */ >>> + pushq %r11 /* pt_regs->r11 */ >>> + pushq %rbx /* pt_regs->rbx */ >>> + pushq %rbp /* pt_regs->rbp */ >>> + pushq %r12 /* pt_regs->r12 */ >>> + pushq %r13 /* pt_regs->r13 */ >>> + pushq %r14 /* pt_regs->r14 */ >>> + pushq %r15 /* pt_regs->r15 */ > > Mh, so you mean > >> + addq $KERNEL_STACK_OFFSET, %rsp > > between > >> + movq PER_CPU_VAR(kernel_stack), %rsp > > and > >> + pushq 5*8(%rdx) /* pt_regs->ss */ > > is missing? Yep, that makes sense. But I am not an x86 maintainer :P. -- js suse labs