From: Takashi Iwai <tiwai@suse.de>
To: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>,
	Denys Vlasenko <vda.linux@googlemail.com>,
	Jiri Kosina <jkosina@suse.cz>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Stefan Seyfried <stefan.seyfried@googlemail.com>,
	X86 ML <x86@kernel.org>, LKML <linux-kernel@vger.kernel.org>,
	Tejun Heo <tj@kernel.org>
Subject: Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?
Date: Mon, 23 Mar 2015 19:43:31 +0100
Message-ID: <s5hmw33lfl8.wl-tiwai@suse.de>
In-Reply-To: <55105185.5090200@redhat.com>

At Mon, 23 Mar 2015 18:46:45 +0100,
Denys Vlasenko wrote:
> 
> On 03/23/2015 06:18 PM, Takashi Iwai wrote:
> > At Mon, 23 Mar 2015 17:07:15 +0100, Denys Vlasenko wrote:
> >>>> I pulled the tip tree on top of 4.0-rc5, built with your patch, and now
> >>>> succeeded in getting a better message:
> >>>>
> >>>>  kvm: zapping shadow pages for mmio generation wraparound
> >>>>  kvm [5126]: vcpu0 disabled perfctr wrmsr: 0xc1 data 0xffff
> >>>>  Exception on user stack 00007ffd22c23ef0: RSP: 0018:00007ffd22c23f28  EFLAGS: 00010006
> >>>>  RIP: 0010:[<ffffffff8162681d>]  [<ffffffff8162681d>] netlink_attachskb+0x1d/0x1d0
> >>>>  PANIC: double fault, error_code: 0x0
> >>>>  CPU: 1 PID: 10819 Comm: cc1 Tainted: G        W       4.0.0-rc5-debug1+ #2
> >>>>  Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A12 01/10/2013
> >>>>  task: ffff8800d1b34b10 ti: ffff8800d1b30000 task.ti: ffff8800d1b30000
> >>>>  RIP: 0010:[<ffffffff8162681d>]  [<ffffffff8162681d>] netlink_attachskb+0x1d/0x1d0
> >>>>  RSP: 0018:00007ffd22c23f28  EFLAGS: 00010006
> >>>>  RAX: 0000000000000000 RBX: 0000000000000005 RCX: 00000000c0000101
> >>>>  RDX: 0000000000000000 RSI: 0000000000000001 RDI: 00007ffd22c23ef0
> 
> >> FYI: the disassembly of netlink_attachskb (from "Code:" line) is:
> >>
> >>    0:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
> >>    5:   55                      push   %rbp
> >>    6:   48 89 e5                mov    %rsp,%rbp
> >>    9:   41 56                   push   %r14
> >>    b:   41 55                   push   %r13
> >>    d:   49 89 d5                mov    %rdx,%r13
> >>   10:   41 54                   push   %r12
> >>   12:   49 89 f4                mov    %rsi,%r12
> >>   15:   53                      push   %rbx
> >>   16:   48 89 fb                mov    %rdi,%rbx
> >>   19:   48 83 ec 30             sub    $0x30,%rsp
> >>   1d:   8b 87 68 01 00 00       mov    0x168(%rdi),%eax
> >> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >>   23:   39 87 9c 01 00 00       cmp    %eax,0x19c(%rdi)
> >>   29:   7c 25                   jl     50 <_start+0x50>
> >>   2b:   48 8b 87 88 04 00 00    mov    0x488(%rdi),%rax
> >>
> >> The ^^^^^ instruction is the one which faults. Since you said it
> >> consistently happens here, this should be a page fault, not an external
> >> hardware interrupt.
> >>
> >> The code corresponds to the comparison in if():
> >>
> >> int netlink_attachskb(struct sock *sk, struct sk_buff *skb,
> >>                       long *timeo, struct sock *ssk)
> >> {
> >>         struct netlink_sock *nlk;
> >>
> >>         nlk = nlk_sk(sk);
> >>
> >>         if ((atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf ||
> 
> >>> - Another piece is that the bug happens only when a KVM is running.
> >>>   The kernel ran without problem over days with similar tasks
> >>>   (compiling kernel, etc) when no KVM was used.
> >>
> >> Conceivably, virtualization support in CPUs can have nasty errata.
> >> However, you and the other reporter have different CPUs - yours
> >> is Ivy Bridge, his CPU is a Penryn.
> >>
> >> I don't see how KVM could help trigger this.
> >>
> >>> - And now I get the trace as above, pointing netlink_attachskb().
> >>>
> >>> I have difficulty imagining how all these pieces fit into a single
> >>> picture.  Is something already screwed up before that?
> >>
> >> Well, a tiny bit more info would be visible if you changed %rdi
> >> to, say, %r15 in these two lines in my patch:
> >>
> >>        /* Save bogus RSP value */
> >>        movq    %rsp,%rdi
> >> ...
> >>        push    %rdi            /* pt_regs->sp */
> >>
> >> Then the original %rdi will be visible in the crash message.
> > 
> > OK, here we go.
> > 
> >  kvm: zapping shadow pages for mmio generation wraparound
> >  kvm [5490]: vcpu0 disabled perfctr wrmsr: 0xc1 data 0xffff
> >  Exception on user stack 00007fff1d7e5ec0: RSP: 0018:00007fff1d7e5ef8  EFLAGS: 00010002
> >  RIP: 0010:[<ffffffff8162681d>]  [<ffffffff8162681d>] netlink_attachskb+0x1d/0x1d0
> >  PANIC: double fault, error_code: 0x0
> >  CPU: 5 PID: 14285 Comm: fixdep Tainted: G        W       4.0.0-rc5-debug1+ #3
> >  Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A12 01/10/2013
> >  task: ffff88020ba1c690 ti: ffff880206ba4000 task.ti: ffff880206ba4000
> >  RIP: 0010:[<ffffffff8162681d>]  [<ffffffff8162681d>] netlink_attachskb+0x1d/0x1d0
> >  RSP: 0018:00007fff1d7e5ef8  EFLAGS: 00010002
> >  RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000000c0000101
> >  RDX: 0000000000000000 RSI: 0000000000001ebb RDI: 0000000000000000
> 
> Thanks for your testing. So %rdi was NULL... not very informative.
> 
> Notice that every one of your crashes is preceded by
> 
>     kvm: zapping shadow pages for mmio generation wraparound
>     kvm [5490]: vcpu0 disabled perfctr wrmsr: 0xc1 data 0xffff
> 
> This hints that kvm _is_ somehow responsible.

It's likely irrelevant, as this appears when a VM starts, not at
crash time.  I get this message all the time.  Sorry for the
confusion.


Takashi
