From: Denys Vlasenko <dvlasenk@redhat.com>
To: Linus Torvalds <torvalds@linux-foundation.org>,
	Andy Lutomirski <luto@amacapital.net>
Cc: Stefan Seyfried <stefan.seyfried@googlemail.com>,
	Takashi Iwai <tiwai@suse.de>, X86 ML <x86@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>, Tejun Heo <tj@kernel.org>
Subject: Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
Date: Wed, 18 Mar 2015 22:42:57 +0100
Message-ID: <5509F161.3010101@redhat.com>
In-Reply-To: <CA+55aFwT4BJVR10i2Cm8pMH0UGd-J3EwnEUYKf3BWTM0awebbA@mail.gmail.com>

On 03/18/2015 10:32 PM, Linus Torvalds wrote:
> On Wed, Mar 18, 2015 at 12:26 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>>>
>>> crash> disassemble page_fault
>>> Dump of assembler code for function page_fault:
>>>    0xffffffff816834a0 <+0>:     data32 xchg %ax,%ax
>>>    0xffffffff816834a3 <+3>:     data32 xchg %ax,%ax
>>>    0xffffffff816834a6 <+6>:     data32 xchg %ax,%ax
>>>    0xffffffff816834a9 <+9>:     sub    $0x78,%rsp
>>>    0xffffffff816834ad <+13>:    callq  0xffffffff81683620 <error_entry>
>>
>> The callq was the double-faulting instruction, and it is indeed the
>> first instruction in here that would have accessed the stack.  (The sub
>> *changes* rsp but isn't a memory access.)  So, since RSP is bogus, we
>> page fault, and the page fault is promoted to a double fault.  The
>> surprising thing is that the page fault itself seems to have been
>> delivered okay, and RSP wasn't on a page boundary.
> 
> Not at all surprising, and sure it was on a page boundary.
> 
> Look closer.
> 
> %rsp is 00007fffa55eafb8.
> 
> But that's *after* page_fault has done that
> 
>     sub    $0x78,%rsp
> 
> so %rsp when the page fault happened was 0x7fffa55eb030. Which is a
> different page.
> 
> And that page happened to be mapped.
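
The arithmetic above can be checked with a short sketch. Python is used
purely for illustration; the 4 KiB page size is an assumption (the usual
x86-64 value), and the helper and variable names are made up here.

```python
PAGE_SIZE = 4096  # assumption: standard 4 KiB x86-64 pages

def page_base(addr):
    """Return the base address of the page containing addr."""
    return addr & ~(PAGE_SIZE - 1)

rsp_after_sub = 0x00007fffa55eafb8  # %rsp as reported after the sub
frame_adjust = 0x78                 # from "sub $0x78,%rsp" in page_fault

# Undo the sub to recover %rsp at the moment page_fault was entered.
rsp_at_entry = rsp_after_sub + frame_adjust

print(hex(rsp_at_entry))             # 0x7fffa55eb030
print(hex(page_base(rsp_at_entry)))  # 0x7fffa55eb000
print(hex(page_base(rsp_after_sub))) # 0x7fffa55ea000
```

The two page bases differ, so the first fault could push its frame onto a
mapped page while the callq after the sub touched the page below it.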
> 
> So what happened is:
> 
>  - we somehow entered kernel mode without switching stacks
> 
>    (ie presumably syscall)
> 
>  - the user stack was still fine
> 
>  - we took a page fault, which once again didn't switch stacks,
> because we were already in kernel mode. And this page fault worked,
> because it just pushed the error code onto the user stack which was
> mapped.
> 
>  - we now took a second page fault within the page fault handler,
> because now the stack pointer has been decremented and points one user
> page down that is *not* mapped, so now that page fault cannot push the
> error code and return information.
> 
> Now, how we took that original page fault is sadly not very clear at
> all.  I agree that it's something about system-call (how could we not
> change stacks otherwise), but why it should have started now, I don't
> know. I don't think "system_call" has changed at all.
> 
> Maybe there is something wrong with the new "ret_from_sys_call" logic,
> and that "use sysret to return to user mode" thing. Because this code
> sequence:
> 
> +       movq (RSP-RIP)(%rsp),%rsp
> +       USERGS_SYSRET64
> 
> in 'irq_return_via_sysret' is new to 4.0, and instead of entering the
> kernel with a user stack pointer, maybe we're *exiting* the kernel,
> and have just reloaded the user stack pointer when "USERGS_SYSRET64"
> takes some fault.

Yes, so far we happily thought that SYSRET never fails...

This merits adding some code which would at least BUG_ON
if the faulting RIP is seen to match the SYSRET64 instruction.

Right now we only check for a faulting IRETQ:

error_kernelspace:
        CFI_REL_OFFSET rcx, RCX+8
        incl %ebx
        leaq native_irq_return_iret(%rip),%rcx
        cmpq %rcx,RIP+8(%rsp)
        je error_bad_iret
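
As a rough user-space model of the idea (Python, not kernel code): the
handler compares the saved RIP against the addresses of known return
instructions. Only the IRETQ comparison exists in the snippet above; the
SYSRET64 comparison is the proposed addition. All addresses and names
below are hypothetical and chosen only for illustration.

```python
# Hypothetical addresses standing in for kernel text labels.
NATIVE_IRQ_RETURN_IRET = 0xffffffff81000100  # made-up address of the IRETQ
SYSRET64_INSN = 0xffffffff81000200           # made-up address of the SYSRET64

def classify_kernel_fault(saved_rip):
    """Classify a kernel-mode fault by the RIP saved in the fault frame."""
    if saved_rip == NATIVE_IRQ_RETURN_IRET:
        # Existing check: the fault hit the IRETQ itself.
        return "bad_iret"
    if saved_rip == SYSRET64_INSN:
        # Proposed check: SYSRET64 was assumed never to fault,
        # so treat a match as a bug to be reported loudly.
        return "bug_sysret"
    return "ordinary"

print(classify_kernel_fault(SYSRET64_INSN))
```

In the real entry code this would be one more compare next to the
existing cmpq/je pair; the model only shows the decision being made.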

> 
> Is PARAVIRT enabled? The three nops at the beginning of 'page_fault'
> make me suspect it is, and that that is some paravirt rewriting
> area. What does paravirt do for that USERGS_SYSRET64 (or for
> SWAPGS_UNSAFE_STACK, for that matter)?
> 
>                         Linus
> 

