From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: andrew.cooper3@citrix.com, kevin.tian@intel.com,
	wim.coekaerts@oracle.com, jun.nakajima@intel.com,
	xen-devel <xen-devel@lists.xenproject.org>
Subject: Re: Nested virtualization off VMware vSphere 6.0 with EL6 guests crashes on Xen 4.6
Date: Tue, 2 Feb 2016 17:05:45 -0500
Message-ID: <20160202220545.GA9915@char.us.oracle.com>
In-Reply-To: <569CC17002000078000C7D91@prv-mh.provo.novell.com>

On Mon, Jan 18, 2016 at 02:41:52AM -0700, Jan Beulich wrote:
> >>> On 15.01.16 at 22:39, <konrad.wilk@oracle.com> wrote:
> > On Tue, Jan 12, 2016 at 02:22:03AM -0700, Jan Beulich wrote:
> >> Since we can (I hope) pretty much exclude a paging type, the
> >> ASSERT() must have triggered because of vapic_pg being NULL.
> >> That might be verifiable without extra printk()s, just by checking
> >> the disassembly (assuming the value sits in a register). In which
> >> case vapic_gpfn would be of interest too.
> > 
> > The vapic_gpfn is 0xffffffffffff.
> > 
> > To be exact:
> > 
> > nvmx_update_virtual_apic_address:vCPU0 0xffffffffffffffff(vAPIC) 0x0(APIC), 0x0(TPR) ctrl=b5b9effe
> > 
> > Based on this:
> > 
> > diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
> > index cb6f9b8..8a0abfc 100644
> > --- a/xen/arch/x86/hvm/vmx/vvmx.c
> > +++ b/xen/arch/x86/hvm/vmx/vvmx.c
> > @@ -695,7 +695,15 @@ static void nvmx_update_virtual_apic_address(struct vcpu *v)
> >  
> >          vapic_gpfn = __get_vvmcs(nvcpu->nv_vvmcx, VIRTUAL_APIC_PAGE_ADDR) >> PAGE_SHIFT;
> >          vapic_pg = get_page_from_gfn(v->domain, vapic_gpfn, &p2mt, P2M_ALLOC);
> > -        ASSERT(vapic_pg && !p2m_is_paging(p2mt));
> > +       if ( !vapic_pg ) {
> > +               printk("%s:vCPU%d 0x%lx(vAPIC) 0x%lx(APIC), 0x%lx(TPR) ctrl=%x\n", __func__,v->vcpu_id,
> > +                       __get_vvmcs(nvcpu->nv_vvmcx, VIRTUAL_APIC_PAGE_ADDR),
> > +                       __get_vvmcs(nvcpu->nv_vvmcx, APIC_ACCESS_ADDR),
> > +                       __get_vvmcs(nvcpu->nv_vvmcx, TPR_THRESHOLD),
> > +                       ctrl);
> > +       }
> > +        ASSERT(vapic_pg);
> > +       ASSERT(vapic_pg && !p2m_is_paging(p2mt));
> >          __vmwrite(VIRTUAL_APIC_PAGE_ADDR, page_to_maddr(vapic_pg));
> >          put_page(vapic_pg);
> >      }
> 
> Interesting: I can't see VIRTUAL_APIC_PAGE_ADDR to be written
> with all ones anywhere, neither for the real VMCS nor for the virtual
> one (page_to_maddr() can't, afaict, return such a value). Could you
> check where the L1 guest itself is writing that value, or whether it
> fails to initialize that field and it happens to start out as all ones?

This is getting more and more bizarre.

I realized that this machine has VMCS shadowing, so Xen does not trap on
any vmwrite or vmread unless I update the VMCS shadowing bitmaps, which
I did for both vmwrite and vmread to get a better view of this. It never
traps on VIRTUAL_APIC_PAGE_ADDR accesses. It does trap on VIRTUAL_PROCESSOR_ID,
VM_EXIT_MSR_LOAD_ADDR and GUEST_[ES,DS,FS,GS,TR]_SELECTOR.

(It may also trap on IO_BITMAP_A,B but I didn't print that out).
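
For anyone who wants to reproduce the bitmap tweak: a minimal standalone
sketch of how a field encoding maps to a bit in the VMREAD/VMWRITE bitmap,
assuming the layout described in the Intel SDM (bits 14:0 of the encoding
index a 4 KiB bitmap, and a set bit means the access causes a VM exit).
The helper names are hypothetical and are not Xen functions; the field
encodings are the architectural ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VMCS_BITMAP_BYTES 4096u   /* one 4 KiB page per bitmap */

/* Architectural field encodings (Intel SDM), named as in the log above. */
#define VIRTUAL_PROCESSOR_ID   0x0000u
#define VM_EXIT_MSR_LOAD_ADDR  0x2008u
#define VIRTUAL_APIC_PAGE_ADDR 0x2012u

/* Hypothetical helper: mark 'field' so that accessing it causes a VM exit. */
static void shadow_bitmap_trap(uint8_t *bitmap, uint32_t field)
{
    uint32_t bit = field & 0x7fffu;   /* bits 14:0 select the bitmap bit */

    assert(bitmap != NULL);
    bitmap[bit / 8] |= (uint8_t)(1u << (bit % 8));
}

/* Hypothetical helper: would an access to 'field' cause a VM exit? */
static bool shadow_bitmap_traps(const uint8_t *bitmap, uint32_t field)
{
    uint32_t bit = field & 0x7fffu;

    assert(bitmap != NULL);
    return (bitmap[bit / 8] >> (bit % 8)) & 1u;
}
```

With a zeroed bitmap nothing traps; calling shadow_bitmap_trap() once per
field of interest on both the vmread and the vmwrite bitmap is the kind of
change described above.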

To confirm that the VMCS that will be given to the L2 guest is correct,
I added printk()s for some state that ought to be sane, such as
HOST_RIP and HOST_RSP, which are all 0!
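
Instead of eyeballing printk output, the same plausibility checks could be
expressed in code. This is only a sketch with hypothetical helper names
(nothing here exists in Xen). It assumes a page-address field must be
4 KiB aligned and must not set bits above the physical address width,
which the caller passes in, and that a zero HOST_RIP or HOST_RSP is never
valid for a running hypervisor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12

/* Hypothetical check for page-address fields such as VIRTUAL_APIC_PAGE_ADDR. */
static bool vmcs_page_addr_plausible(uint64_t addr, unsigned int paddr_bits)
{
    assert(paddr_bits > PAGE_SHIFT && paddr_bits < 64);

    if ( addr & ((UINT64_C(1) << PAGE_SHIFT) - 1) )
        return false;                 /* not 4 KiB aligned */

    return (addr >> paddr_bits) == 0; /* no bits above the physical width */
}

/* Hypothetical check: all-zero host entry state cannot be right. */
static bool host_entry_point_plausible(uint64_t host_rip, uint64_t host_rsp)
{
    return host_rip != 0 && host_rsp != 0;
}
```

The all-ones vAPIC value fails the first check and the zero
HOST_RIP/HOST_RSP pair fails the second.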

If I let nvmx_update_virtual_apic_address keep going without
modifying VIRTUAL_APIC_PAGE_ADDR, it later crashes the nested
guest:

(XEN) nvmx_handle_vmwrite: 0
(XEN) nvmx_handle_vmwrite: 0                                                    
(XEN) nvmx_handle_vmwrite: 2008                                                 
(XEN) nvmx_handle_vmwrite: 2008                                                 
(XEN) nvmx_handle_vmwrite: 0                                                    
(XEN) nvmx_handle_vmwrite: 2008                                                 
(XEN) nvmx_handle_vmwrite: 0                                                    
(XEN) nvmx_handle_vmwrite: 2008                                                 
(XEN) nvmx_handle_vmwrite: 2008                                                 
(XEN) nvmx_handle_vmwrite: 2008                                                 
(XEN) nvmx_handle_vmwrite: 2008                                                 
(XEN) nvmx_handle_vmwrite: 2008                                                 
(XEN) nvmx_handle_vmwrite: 800                                                  
(XEN) nvmx_handle_vmwrite: 804                                                  
(XEN) nvmx_handle_vmwrite: 806                                                  
(XEN) nvmx_handle_vmwrite: 80a                                                  
(XEN) nvmx_handle_vmwrite: 80e                                                  
(XEN) nvmx_update_virtual_apic_address: vCPU1 0xffffffffffffffff(vAPIC) 0x0(APIC), 0x0(TPR) ctrl=b5b9effe sec=0 
(XEN) nvmx_update_virtual_apic_address: TPR threshold = 0x0 updated 0.          
(XEN) nvmx_update_virtual_apic_address: Virtual APIC = 0x0 updated 0.           
(XEN) nvmx_update_virtual_apic_address: APIC address = 0x0 updated 0.           
(XEN) HOST_RIP=0x0 HOST_RSP=0x0                                                 
(XEN) <vm_launch_fail> error code 7                                             
(XEN) domain_crash_sync called from vmcs.c:1597                                 
(XEN) Domain 1 (vcpu#1) crashed on cpu#37:                                      
(XEN) ----[ Xen-4.6.0  x86_64  debug=n  Tainted:    C ]----                     
(XEN) CPU:    37                                                                
(XEN) RIP:    0000:[<0000000000000000>]                                         
(XEN) RFLAGS: 0000000000000000   CONTEXT: hvm guest (d1v1)                      
(XEN) rax: ffff82d08010648b   rbx: ffff8340007fb000   rcx: 0000000000000000     
(XEN) rdx: ffff82d0801ddf5f   rsi: 0000000000000000   rdi: ffff82d0801ebd6a     
(XEN) rbp: ffff82d08018cb09   rsp: 0000000000000000   r8:  0000000000000000     
(XEN) r9:  ffff834007980000   r10: 000000000000063d   r11: ffff82d080106465     
(XEN) r12: 0000000000000000   r13: 0000000000000000   r14: 0000000000000000     
(XEN) r15: ffff834007980000   cr0: 0000000000000010   cr4: 0000000000000000     
(XEN) cr3: 00000000efd06000   cr2: 0000000000000000                             
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: 0000          

which should be no surprise as the VMCS is corrupt.
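
For reference, a small decoder for the error code, assuming the value
printed by vm_launch_fail is the VM-instruction error field from the VMCS.
The strings are taken from the Intel SDM's VM-instruction error table, only
a subset is listed, and the function name is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: describe a VM-instruction error number (subset only). */
static const char *vmx_insn_error_str(unsigned long err)
{
    switch ( err )
    {
    case 4:  return "VMLAUNCH with non-clear VMCS";
    case 5:  return "VMRESUME with non-launched VMCS";
    case 7:  return "VM entry with invalid control field(s)";
    case 8:  return "VM entry with invalid host-state field(s)";
    case 12: return "VMREAD/VMWRITE from/to unsupported VMCS component";
    case 13: return "VMWRITE to read-only VMCS component";
    default: return NULL;   /* not covered by this sketch */
    }
}
```

Under that assumption, error 7 points at the control fields, which would
fit a bogus VIRTUAL_APIC_PAGE_ADDR.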

I need to do some more double-checking to see how it is possible
for this VMCS to get so messed up.

And of course if I run Xen under Xen with an HVM guest, it works fine.

