From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Andy Lutomirski <luto@amacapital.net>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: xen <xen@lists.fedoraproject.org>,
	Xen Devel <xen-devel@lists.xensource.com>,
	kvm list <kvm@vger.kernel.org>,
	Cole Robinson <crobinso@redhat.com>,
	Borislav Petkov <bp@alien8.de>,
	M A Young <m.a.young@durham.ac.uk>,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: rdmsr_safe in Linux PV (under Xen) gets an #GP:Re: [Fedora-xen] Running fedora xen on top of KVM?
Date: Thu, 17 Sep 2015 22:29:34 +0100
Message-ID: <55FB30BE.3080603@citrix.com>
In-Reply-To: <CALCETrWT0WmZqv3jbd9jpmBD6M9fdtdJEz7yD92=C+PJH=PipQ@mail.gmail.com>

On 17/09/2015 21:23, Andy Lutomirski wrote:
> On Thu, Sep 17, 2015 at 1:10 PM, Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com> wrote:
>> On Wed, Sep 16, 2015 at 06:39:03PM -0400, Cole Robinson wrote:
>>> On 09/16/2015 05:08 PM, Konrad Rzeszutek Wilk wrote:
>>>> On Wed, Sep 16, 2015 at 05:04:31PM -0400, Cole Robinson wrote:
>>>>> On 09/16/2015 04:07 PM, M A Young wrote:
>>>>>> On Wed, 16 Sep 2015, Cole Robinson wrote:
>>>>>>
>>>>>>> Unfortunately I couldn't get anything else extra out of xen using any of these
>>>>>>> options or the ones Major recommended... in fact I couldn't get anything to
>>>>>>> the serial console at all. console=con1 would seem to redirect messages since
>>>>>>> they wouldn't show up on the graphical display, but nothing went to the serial
>>>>>>> log. Maybe I'm missing something...
>>>>>> That should be console=com1 so you have a typo either in this message or
>>>>>> in your tests.
>>>>>>
>>>>> Yeah that was it :/ So here's the crash output use -cpu host:
>>>>>
>>>>> - Cole
>>>>>
>>> <snip>
>>>
>>>>> about to get started...
>>>>> (XEN) traps.c:459:d0v0 Unhandled general protection fault fault/trap [#13] on
>>>>> VCPU 0 [ec=0000]
>>>>> (XEN) domain_crash_sync called from entry.S: fault at ffff82d08023a5d3
>>>>> create_bounce_frame+0x12b/0x13a
>>>>> (XEN) Domain 0 (vcpu#0) crashed on cpu#0:
>>>>> (XEN) ----[ Xen-4.5.1  x86_64  debug=n  Not tainted ]----
>>>>> (XEN) CPU:    0
>>>>> (XEN) RIP:    e033:[<ffffffff810032b0>]
>>>> That is the Linux kernel EIP. Can you figure out what is at ffffffff810032b0 ?
>>>>
>>>> gdb vmlinux and then
>>>> x/20i 0xffffffff810032b0
>>>>
>>>> can help with that.
>>>>
>>> Updated to the latest kernel 4.1.6-201.fc22.x86_64. Trace is now:
>>>
>>> about to get started...
>>> (XEN) traps.c:459:d0v0 Unhandled general protection fault fault/trap [#13] on
>>> VCPU 0 [ec=0000]
> What exactly does this mean?

This means that there was a #GP fault originating from dom0 context, but
dom0 has not yet registered a #GP handler with Xen.  (I already have a
patch pending to correct the wording of that error message.)

This would be a double fault on native.
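
To make that condition concrete, here is a minimal sketch of the decision
as a standalone C model.  It is not Xen's code: the struct, enum and
function names are invented for illustration, and the only thing taken
from this thread is the assumption that a fault with no registered guest
handler cannot be bounced and so ends with the domain being crashed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VECTOR_GP   13u    /* #GP, general protection fault */
#define NR_VECTORS  256u

/* Hypothetical stand-in for one guest-registered trap handler. */
struct trap_entry {
    uint64_t handler;      /* guest address of the handler; 0 = not registered */
};

/* Hypothetical per-vCPU state: the table a PV guest fills in via set_trap_table. */
struct vcpu_model {
    struct trap_entry traps[NR_VECTORS];
};

enum delivery { BOUNCED_TO_GUEST, DOMAIN_CRASHED };

/* Decide what happens to an exception raised in guest context. */
static enum delivery deliver_guest_exception(const struct vcpu_model *v,
                                             unsigned int vector)
{
    assert(vector < NR_VECTORS);

    /* No handler registered: nothing to bounce to, so the domain is crashed. */
    if (v->traps[vector].handler == 0)
        return DOMAIN_CRASHED;

    return BOUNCED_TO_GUEST;
}
```

With a zero-initialised table, vector 13 yields DOMAIN_CRASHED; once a
non-zero handler address is stored for that vector, the same call yields
BOUNCED_TO_GUEST.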

>
>>> (XEN) domain_crash_sync called from entry.S: fault at ffff82d08023a5d3
>>> create_bounce_frame+0x12b/0x13a
>>> (XEN) Domain 0 (vcpu#0) crashed on cpu#0:
>>> (XEN) ----[ Xen-4.5.1  x86_64  debug=n  Not tainted ]----
>>> (XEN) CPU:    0
>>> (XEN) RIP:    e033:[<ffffffff810031f0>]
>>> (XEN) RFLAGS: 0000000000000282   EM: 1   CONTEXT: pv guest
>>> (XEN) rax: 0000000000000015   rbx: ffffffff81c03e1c   rcx: 00000000c0010112
>>> (XEN) rdx: 0000000000000001   rsi: ffffffff81c03e1c   rdi: 00000000c0010112
>>> (XEN) rbp: ffffffff81c03df8   rsp: ffffffff81c03da0   r8:  ffffffff81c03e28
>>> (XEN) r9:  ffffffff81c03e2c   r10: 0000000000000000   r11: 00000000ffffffff
>>> (XEN) r12: ffffffff81d25a60   r13: 0000000004000000   r14: 0000000000000000
>>> (XEN) r15: 0000000000000000   cr0: 0000000080050033   cr4: 00000000000406f0
>>> (XEN) cr3: 0000000075c0b000   cr2: 0000000000000000
>>> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e02b   cs: e033
>>> (XEN) Guest stack trace from rsp=ffffffff81c03da0:
>>> (XEN)    00000000c0010112 00000000ffffffff 0000000000000000 ffffffff810031f0
>>> (XEN)    000000010000e030 0000000000010082 ffffffff81c03de0 000000000000e02b
>>> (XEN)    0000000000000000 000000000000000c ffffffff81c03e1c ffffffff81c03e48
>>> (XEN)    ffffffff8102a7a4 ffffffff81c03e48 ffffffff8102aa3b ffffffff81c03e48
>>> (XEN)    cf1fa5f5e026f464 0000000001000000 ffffffff81c03ef8 0000000004000000
>>> (XEN)    0000000000000000 ffffffff81c03e58 ffffffff81d5d142 ffffffff81c03ee8
>>> (XEN)    ffffffff81d58b56 0000000000000000 0000000000000000 ffffffff81c03e88
>>> (XEN)    ffffffff810f8a39 ffffffff81c03ee8 ffffffff81798b13 ffffffff00000010
>>> (XEN)    ffffffff81c03ef8 ffffffff81c03eb8 cf1fa5f5e026f464 ffffffff81f1de9c
>>> (XEN)    ffffffffffffffff 0000000000000000 ffffffff81df7920 0000000000000000
>>> (XEN)    0000000000000000 ffffffff81c03f28 ffffffff81d51c74 cf1fa5f5e026f464
>>> (XEN)    0000000000000000 ffffffff81c03f60 ffffffff81c03f5c 0000000000000000
>>> (XEN)    0000000000000000 ffffffff81c03f38 ffffffff81d51339 ffffffff81c03ff8
>>> (XEN)    ffffffff81d548b1 0000000000000000 00600f1200000000 0000000100000800
>>> (XEN)    0300000100000032 0000000000000005 0000000000000000 0000000000000000
>>> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>> (XEN)    0f00000060c0c748 ccccccccccccc305 cccccccccccccccc cccccccccccccccc
>>> (XEN) Domain 0 crashed: rebooting machine in 5 seconds.
>>>
>>>
>>> gdb output:
>>>
>>> (gdb) x/20i 0xffffffff810031f0
>>>    0xffffffff810031f0 <xen_read_msr_safe+16>: rdmsr
>> Fantastic! So we have some rdmsr that makes KVM inject an
>> GP.
> What's the scenario?  Is this Xen on KVM?

I believe from the thread that this is a Xen/dom0 combo running as a KVM
guest.

>
> Why didn't the guest print anything?

Lack of earlyprintk=xen on the dom0 command line.  (IMO this really
should be the default when a PVOPs kernel detects that it is running
under Xen.)

>
> Is the issue here that the guest died due to failure to handle an
> RDMSR failure or did the *hypervisor* die?

The guest suffered a GP fault which it couldn't handle.  Therefore Xen
crashed the domain.

When dom0 crashes, Xen goes down too.

>
> It looks like null_trap_bounce is returning true, which suggests that
> the failure is happening before the guest sets up exception handling.

I concur.

>
>> Looking at the stack you have some other values:
>> ffffffff81c03de0, ffffffff81c03e1c .. they should correspond
>> to other functions calling this one. If you do 'nm --defined vmlinux | grep ffffffff81c03e1'
>> that should give an idea where they are. Or use 'gdb'.
>>
>> That will give us an stack - and we can find what type of MSR
>> this is. Oh wait, it is on the registers: 00000000c0010112
>>
>> Ok, so where in the code is that MSR ah, that looks to be:
>>  #define MSR_K8_TSEG_ADDR                0xc0010112
>>
>> which is called at bsp_init_amd.
>>
>> I think the problem here is that we are calling the
>> 'safe' variant of MSR but we still get an injected #GP and
>> don't expect that.
>>
>> I am not really sure what the expected outcome should be here.
>>
>> CC-ing xen-devel, KVM folks, and Andy who has been looking
>> in mucking around in the _safe* pvops.
> It's too early of a failure, I think.
>
> Cc: Borislav.  Is TSEG guaranteed to exist?  Can we defer that until
> we have exception handling working?  Do we need to rig up exception
> handling so that it works earlier (e.g. in early_trap_init, which is
> presumably early enough)?  Or is this just a KVM and/or Xen bug.

It would certainly help to move the exception setup as early as possible.

From a Xen PV guest's point of view, the kernel is already executing on
working pagetables and a flat GDT when it starts.  A set_trap_table
hypercall (the equivalent of `lidt`) ought to be the second action,
following the stack switch.

This appears not to be the case, and the load_idt() is deferred until
native cpu_init().
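
The effect of that ordering can be sketched with a small standalone C
model.  This is an illustration under stated assumptions, not kernel or
Xen code: every name below is hypothetical, trap registration is reduced
to a single flag, and a "safe" MSR read is assumed to recover from a
fault only once a #GP handler has been registered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum boot_result { BOOT_OK, BOOT_DOMAIN_CRASHED };

/* Hypothetical guest state for this model. */
struct guest_model {
    bool traps_registered;   /* true once the trap table has been handed to Xen */
};

/*
 * Model of a "safe" MSR read.  If the MSR is not implemented, the read
 * faults.  With a registered #GP handler the fault is fixed up and the
 * caller sees an error code; without one, the fault is fatal.
 */
static int model_rdmsr_safe(const struct guest_model *g, bool msr_implemented,
                            uint64_t *val, bool *fatal)
{
    assert(g && val && fatal);

    *fatal = false;
    if (msr_implemented) {
        *val = 0;            /* placeholder value, contents are irrelevant here */
        return 0;
    }
    if (!g->traps_registered) {
        *fatal = true;       /* nothing to bounce the #GP to */
        return -1;
    }
    return -1;               /* fault fixed up, error reported to the caller */
}

/* Run a two-step early boot in either order and report the outcome. */
static enum boot_result model_early_boot(bool register_traps_first,
                                         bool msr_implemented)
{
    struct guest_model g = { .traps_registered = register_traps_first };
    uint64_t val = 0;
    bool fatal;

    (void)model_rdmsr_safe(&g, msr_implemented, &val, &fatal);
    if (fatal)
        return BOOT_DOMAIN_CRASHED;

    g.traps_registered = true;   /* deferred registration, if not done already */
    return BOOT_OK;
}
```

In this model, reading an unimplemented MSR before registration ends in
BOOT_DOMAIN_CRASHED, while registering first turns the same read into an
ordinary error return.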

~Andrew
