From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Andy Lutomirski <luto@amacapital.net>,
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: xen <xen@lists.fedoraproject.org>,
Xen Devel <xen-devel@lists.xensource.com>,
kvm list <kvm@vger.kernel.org>,
Cole Robinson <crobinso@redhat.com>,
Borislav Petkov <bp@alien8.de>,
M A Young <m.a.young@durham.ac.uk>,
Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: rdmsr_safe in Linux PV (under Xen) gets an #GP:Re: [Fedora-xen] Running fedora xen on top of KVM?
Date: Thu, 17 Sep 2015 22:29:34 +0100
Message-ID: <55FB30BE.3080603@citrix.com>
In-Reply-To: <CALCETrWT0WmZqv3jbd9jpmBD6M9fdtdJEz7yD92=C+PJH=PipQ@mail.gmail.com>
On 17/09/2015 21:23, Andy Lutomirski wrote:
> On Thu, Sep 17, 2015 at 1:10 PM, Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com> wrote:
>> On Wed, Sep 16, 2015 at 06:39:03PM -0400, Cole Robinson wrote:
>>> On 09/16/2015 05:08 PM, Konrad Rzeszutek Wilk wrote:
>>>> On Wed, Sep 16, 2015 at 05:04:31PM -0400, Cole Robinson wrote:
>>>>> On 09/16/2015 04:07 PM, M A Young wrote:
>>>>>> On Wed, 16 Sep 2015, Cole Robinson wrote:
>>>>>>
>>>>>>> Unfortunately I couldn't get anything else extra out of xen using any of these
>>>>>>> options or the ones Major recommended... in fact I couldn't get anything to
>>>>>>> the serial console at all. console=con1 would seem to redirect messages since
>>>>>>> they wouldn't show up on the graphical display, but nothing went to the serial
>>>>>>> log. Maybe I'm missing something...
>>>>>> That should be console=com1 so you have a typo either in this message or
>>>>>> in your tests.
>>>>>>
>>>>> Yeah that was it :/ So here's the crash output using -cpu host:
>>>>>
>>>>> - Cole
>>>>>
>>> <snip>
>>>
>>>>> about to get started...
>>>>> (XEN) traps.c:459:d0v0 Unhandled general protection fault fault/trap [#13] on
>>>>> VCPU 0 [ec=0000]
>>>>> (XEN) domain_crash_sync called from entry.S: fault at ffff82d08023a5d3
>>>>> create_bounce_frame+0x12b/0x13a
>>>>> (XEN) Domain 0 (vcpu#0) crashed on cpu#0:
>>>>> (XEN) ----[ Xen-4.5.1 x86_64 debug=n Not tainted ]----
>>>>> (XEN) CPU: 0
>>>>> (XEN) RIP: e033:[<ffffffff810032b0>]
>>>> That is the Linux kernel EIP. Can you figure out what is at ffffffff810032b0 ?
>>>>
>>>> gdb vmlinux and then
>>>> x/20i 0xffffffff810032b0
>>>>
>>>> can help with that.
>>>>
>>> Updated to the latest kernel 4.1.6-201.fc22.x86_64. Trace is now:
>>>
>>> about to get started...
>>> (XEN) traps.c:459:d0v0 Unhandled general protection fault fault/trap [#13] on
>>> VCPU 0 [ec=0000]
> What exactly does this mean?
This means that there was a #GP fault originating from dom0 context, but
dom0 has not yet registered a #GP handler with Xen.  (I already have a
patch pending to correct the wording of that error message.)
It would be a double fault on native.
>
>>> (XEN) domain_crash_sync called from entry.S: fault at ffff82d08023a5d3
>>> create_bounce_frame+0x12b/0x13a
>>> (XEN) Domain 0 (vcpu#0) crashed on cpu#0:
>>> (XEN) ----[ Xen-4.5.1 x86_64 debug=n Not tainted ]----
>>> (XEN) CPU: 0
>>> (XEN) RIP: e033:[<ffffffff810031f0>]
>>> (XEN) RFLAGS: 0000000000000282 EM: 1 CONTEXT: pv guest
>>> (XEN) rax: 0000000000000015 rbx: ffffffff81c03e1c rcx: 00000000c0010112
>>> (XEN) rdx: 0000000000000001 rsi: ffffffff81c03e1c rdi: 00000000c0010112
>>> (XEN) rbp: ffffffff81c03df8 rsp: ffffffff81c03da0 r8: ffffffff81c03e28
>>> (XEN) r9: ffffffff81c03e2c r10: 0000000000000000 r11: 00000000ffffffff
>>> (XEN) r12: ffffffff81d25a60 r13: 0000000004000000 r14: 0000000000000000
>>> (XEN) r15: 0000000000000000 cr0: 0000000080050033 cr4: 00000000000406f0
>>> (XEN) cr3: 0000000075c0b000 cr2: 0000000000000000
>>> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e02b cs: e033
>>> (XEN) Guest stack trace from rsp=ffffffff81c03da0:
>>> (XEN) 00000000c0010112 00000000ffffffff 0000000000000000 ffffffff810031f0
>>> (XEN) 000000010000e030 0000000000010082 ffffffff81c03de0 000000000000e02b
>>> (XEN) 0000000000000000 000000000000000c ffffffff81c03e1c ffffffff81c03e48
>>> (XEN) ffffffff8102a7a4 ffffffff81c03e48 ffffffff8102aa3b ffffffff81c03e48
>>> (XEN) cf1fa5f5e026f464 0000000001000000 ffffffff81c03ef8 0000000004000000
>>> (XEN) 0000000000000000 ffffffff81c03e58 ffffffff81d5d142 ffffffff81c03ee8
>>> (XEN) ffffffff81d58b56 0000000000000000 0000000000000000 ffffffff81c03e88
>>> (XEN) ffffffff810f8a39 ffffffff81c03ee8 ffffffff81798b13 ffffffff00000010
>>> (XEN) ffffffff81c03ef8 ffffffff81c03eb8 cf1fa5f5e026f464 ffffffff81f1de9c
>>> (XEN) ffffffffffffffff 0000000000000000 ffffffff81df7920 0000000000000000
>>> (XEN) 0000000000000000 ffffffff81c03f28 ffffffff81d51c74 cf1fa5f5e026f464
>>> (XEN) 0000000000000000 ffffffff81c03f60 ffffffff81c03f5c 0000000000000000
>>> (XEN) 0000000000000000 ffffffff81c03f38 ffffffff81d51339 ffffffff81c03ff8
>>> (XEN) ffffffff81d548b1 0000000000000000 00600f1200000000 0000000100000800
>>> (XEN) 0300000100000032 0000000000000005 0000000000000000 0000000000000000
>>> (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>> (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>> (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>> (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>> (XEN) 0f00000060c0c748 ccccccccccccc305 cccccccccccccccc cccccccccccccccc
>>> (XEN) Domain 0 crashed: rebooting machine in 5 seconds.
>>>
>>>
>>> gdb output:
>>>
>>> (gdb) x/20i 0xffffffff810031f0
>>> 0xffffffff810031f0 <xen_read_msr_safe+16>: rdmsr
>> Fantastic! So we have some rdmsr that makes KVM inject a #GP.
> What's the scenario? Is this Xen on KVM?
I believe from the thread that this is a Xen/dom0 combo running as a KVM
guest.
>
> Why didn't the guest print anything?
Lack of earlyprintk=xen on the dom0 command line.  (IMO this really
should be the default when a PVOPs kernel detects that it is running
under Xen.)
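For anyone trying to reproduce this, the grub entry would look roughly
like the following -- illustrative paths and settings only, not Cole's
exact configuration (console=com1 belongs on the Xen line, while
earlyprintk=xen and console=hvc0 belong on the dom0 line):

    multiboot /boot/xen.gz console=com1 com1=115200,8n1 loglvl=all guest_loglvl=all
    module    /boot/vmlinuz root=/dev/mapper/fedora-root ro earlyprintk=xen console=hvc0
    module    /boot/initramfs.img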
>
> Is the issue here that the guest died due to failure to handle an
> RDMSR failure or did the *hypervisor* die?
The guest suffered a GP fault which it couldn't handle. Therefore Xen
crashed the domain.
When dom0 crashes, Xen goes down too.
>
> It looks like null_trap_bounce is returning true, which suggests that
> the failure is happening before the guest sets up exception handling.
I concur.
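For reference, this is exactly the machinery the "safe" MSR accessors
rely on: the rdmsr is covered by an exception-table entry, so the #GP
(raised by hardware, or injected by KVM/Xen underneath) is supposed to
reach the guest's #GP handler and be redirected to a fixup path that
returns an error instead of crashing.  A rough sketch of the idea, not
the kernel's exact implementation (kernel context assumed, i.e.
asm/asm.h for _ASM_EXTABLE and linux/errno.h for EIO):

    /*
     * Sketch only: probe an MSR and report failure instead of oopsing.
     * If the #GP never reaches the guest's handler -- e.g. because no
     * trap table has been registered with Xen yet -- the fixup cannot
     * run and the domain is crashed instead, as in the log above.
     */
    static int rdmsr_safe_sketch(unsigned int msr, unsigned long long *val)
    {
            unsigned int low, high;
            int err;

            asm volatile("1: rdmsr\n\t"
                         "xorl %[err], %[err]\n"
                         "2:\n"
                         ".section .fixup,\"ax\"\n"
                         "3: movl %[fault], %[err]\n\t"
                         "jmp 2b\n"
                         ".previous\n"
                         _ASM_EXTABLE(1b, 3b)
                         : [err] "=r" (err), "=a" (low), "=d" (high)
                         : "c" (msr), [fault] "i" (-EIO));

            *val = ((unsigned long long)high << 32) | low;
            return err;
    }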
>
>> Looking at the stack you have some other values:
>> ffffffff81c03de0, ffffffff81c03e1c .. they should correspond
>> to other functions calling this one. If you do 'nm --defined vmlinux | grep ffffffff81c03e1'
>> that should give an idea where they are. Or use 'gdb'.
>>
>> That will give us a stack - and we can find what type of MSR
>> this is. Oh wait, it is on the registers: 00000000c0010112
>>
>> Ok, so where in the code is that MSR ah, that looks to be:
>> #define MSR_K8_TSEG_ADDR 0xc0010112
>>
>> which is called at bsp_init_amd.
>>
>> I think the problem here is that we are calling the
>> 'safe' variant of MSR but we still get an injected #GP and
>> don't expect that.
>>
>> I am not really sure what the expected outcome should be here.
>>
>> CC-ing xen-devel, KVM folks, and Andy, who has been looking into
>> mucking around in the _safe* pvops.
> It's too early of a failure, I think.
>
> Cc: Borislav. Is TSEG guaranteed to exist? Can we defer that until
> we have exception handling working? Do we need to rig up exception
> handling so that it works earlier (e.g. in early_trap_init, which is
> presumably early enough)? Or is this just a KVM and/or Xen bug.
It would certainly help to move the exception setup as early as possible.
From a Xen PV guest's point of view, the kernel is already executing on
working pagetables and a flat GDT when it starts.  A set_trap_table
hypercall (equivalent of `lidt`) ought to be the second action,
following the stack switch.
This appears not to be the case, and the load_idt() is deferred until
native cpu_init().
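A rough sketch of what that early registration could look like for a
64-bit PV dom0, purely illustrative (assumes the usual Xen public
headers; early_gp_entry is a placeholder for a real PV exception entry
point, not an existing symbol):

    #include <linux/init.h>
    #include <linux/bug.h>
    #include <xen/interface/xen.h>      /* struct trap_info */
    #include <asm/xen/hypercall.h>      /* HYPERVISOR_set_trap_table() */
    #include <asm/traps.h>              /* X86_TRAP_GP */
    #include <asm/segment.h>            /* __KERNEL_CS */

    extern void early_gp_entry(void);   /* placeholder handler entry point */

    static struct trap_info early_trap_table[] __initdata = {
            { X86_TRAP_GP, 0 /* DPL 0 */, __KERNEL_CS,
              (unsigned long)early_gp_entry },
            { 0, 0, 0, 0 },             /* zero address terminates the table */
    };

    /* PV equivalent of lidt: hand the exception vectors to Xen. */
    static void __init xen_register_early_traps(void)
    {
            if (HYPERVISOR_set_trap_table(early_trap_table))
                    BUG();
    }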
~Andrew