From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: andrew.cooper3@citrix.com, kevin.tian@intel.com,
wim.coekaerts@oracle.com, jun.nakajima@intel.com,
xen-devel <xen-devel@lists.xenproject.org>
Subject: Re: Nested virtualization off VMware vSphere 6.0 with EL6 guests crashes on Xen 4.6
Date: Fri, 15 Jan 2016 16:39:58 -0500
Message-ID: <20160115213958.GA16118@char.us.oracle.com>
In-Reply-To: <5694D3CB02000078000C5D00@prv-mh.provo.novell.com>
On Tue, Jan 12, 2016 at 02:22:03AM -0700, Jan Beulich wrote:
> >>> On 12.01.16 at 04:38, <konrad.wilk@oracle.com> wrote:
> > (XEN) Assertion 'vapic_pg && !p2m_is_paging(p2mt)' failed at vvmx.c:698
> > (XEN) ----[ Xen-4.6.0 x86_64 debug=y Tainted: C ]----
> > (XEN) CPU: 39
> > (XEN) RIP: e008:[<ffff82d0801ed053>] virtual_vmentry+0x487/0xac9
> > (XEN) RFLAGS: 0000000000010246 CONTEXT: hypervisor (d1v3)
> > (XEN) rax: 0000000000000000 rbx: ffff83007786c000 rcx: 0000000000000000
> > (XEN) rdx: 0000000000000e00 rsi: 000fffffffffffff rdi: ffff83407f81e010
> > (XEN) rbp: ffff834008a47ea8 rsp: ffff834008a47e38 r8: 0000000000000000
> > (XEN) r9: 0000000000000000 r10: 0000000000000000 r11: 0000000000000000
> > (XEN) r12: 0000000000000000 r13: ffff82c000341000 r14: ffff834008a47f18
> > (XEN) r15: ffff83407f7c4000 cr0: 0000000080050033 cr4: 00000000001526e0
> > (XEN) cr3: 000000407fb22000 cr2: 0000000000000000
> > (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
> > (XEN) Xen stack trace from rsp=ffff834008a47e38:
> > (XEN) ffff834008a47e68 ffff82d0801d2cde ffff834008a47e68 0000000000000d00
> > (XEN) 0000000000000000 0000000000000000 ffff834008a47e88 00000004801cc30e
> > (XEN) ffff83007786c000 ffff83007786c000 ffff834008a40000 0000000000000000
> > (XEN) ffff834008a47f18 0000000000000000 ffff834008a47f08 ffff82d0801edf94
> > (XEN) ffff834008a47ef8 0000000000000000 ffff834008f62000 ffff834008a47f18
> > (XEN) 000000ae8c99eb8d ffff83007786c000 0000000000000000 0000000000000000
> > (XEN) 0000000000000000 0000000000000000 0000000000000000 ffff82d0801ee2ab
> > (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > (XEN) 00000000078bfbff 0000000000000000 0000000000000000 0000beef0000beef
> > (XEN) fffffffffc4b3440 000000bf0000beef 0000000000040046 fffffffffc607f00
> > (XEN) 000000000000beef 000000000000beef 000000000000beef 000000000000beef
> > (XEN) 000000000000beef 0000000000000027 ffff83007786c000 0000006f88716300
> > (XEN) 0000000000000000
> > (XEN) Xen call trace:
> > (XEN) [<ffff82d0801ed053>] virtual_vmentry+0x487/0xac9
> > (XEN) [<ffff82d0801edf94>] nvmx_switch_guest+0x8ff/0x915
> > (XEN) [<ffff82d0801ee2ab>] vmx_asm_vmexit_handler+0x4b/0xc0
> > (XEN)
> > (XEN)
> > (XEN) ****************************************
> > (XEN) Panic on CPU 39:
> > (XEN) Assertion 'vapic_pg && !p2m_is_paging(p2mt)' failed at vvmx.c:698
> > (XEN) ****************************************
> > (XEN)
> >
> > ..and then to my surprise the hypervisor stopped hitting this.
>
> Since we can (I hope) pretty much exclude a paging type, the
> ASSERT() must have triggered because of vapic_pg being NULL.
> That might be verifiable without extra printk()s, just by checking
> the disassembly (assuming the value sits in a register). In which
> case vapic_gpfn would be of interest too.
The vapic_gpfn is 0xfffffffffffff (the all-ones vAPIC address shifted right by PAGE_SHIFT).
To be exact:
nvmx_update_virtual_apic_address:vCPU0 0xffffffffffffffff(vAPIC) 0x0(APIC), 0x0(TPR) ctrl=b5b9effe
Based on this:
diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index cb6f9b8..8a0abfc 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -695,7 +695,15 @@ static void nvmx_update_virtual_apic_address(struct vcpu *v)
 
         vapic_gpfn = __get_vvmcs(nvcpu->nv_vvmcx, VIRTUAL_APIC_PAGE_ADDR) >> PAGE_SHIFT;
         vapic_pg = get_page_from_gfn(v->domain, vapic_gpfn, &p2mt, P2M_ALLOC);
-        ASSERT(vapic_pg && !p2m_is_paging(p2mt));
+        if ( !vapic_pg ) {
+            printk("%s:vCPU%d 0x%lx(vAPIC) 0x%lx(APIC), 0x%lx(TPR) ctrl=%x\n", __func__,v->vcpu_id,
+                   __get_vvmcs(nvcpu->nv_vvmcx, VIRTUAL_APIC_PAGE_ADDR),
+                   __get_vvmcs(nvcpu->nv_vvmcx, APIC_ACCESS_ADDR),
+                   __get_vvmcs(nvcpu->nv_vvmcx, TPR_THRESHOLD),
+                   ctrl);
+        }
+        ASSERT(vapic_pg);
+        ASSERT(vapic_pg && !p2m_is_paging(p2mt));
         __vmwrite(VIRTUAL_APIC_PAGE_ADDR, page_to_maddr(vapic_pg));
         put_page(vapic_pg);
     }
>
> What looks odd to me is the connection between
> CPU_BASED_TPR_SHADOW being set and the use of a (valid)
> virtual APIC page: Wouldn't this rather need to depend on
> SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES, just like in
> nvmx_update_apic_access_address()?
Could be. I added a read of the secondary control:
nvmx_update_virtual_apic_address:vCPU2 0xffffffffffffffff(vAPIC) 0x0(APIC), 0x0(TPR) ctrl=b5b9effe sec=0
So trying your recommendation:
diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index cb6f9b8..d291c91 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -686,8 +686,8 @@ static void nvmx_update_virtual_apic_address(struct vcpu *v)
     struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
     u32 ctrl;
 
-    ctrl = __n2_exec_control(v);
-    if ( ctrl & CPU_BASED_TPR_SHADOW )
+    ctrl = __n2_secondary_exec_control(v);
+    if ( ctrl & SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES )
     {
         p2m_type_t p2mt;
         unsigned long vapic_gpfn;
Got me:
(XEN) stdvga.c:151:d1v0 leaving stdvga mode
(XEN) stdvga.c:147:d1v0 entering stdvga and caching modes
(XEN) stdvga.c:520:d1v0 leaving caching mode
(XEN) vvmx.c:2491:d1v0 Unknown nested vmexit reason 80000021.
(XEN) Failed vm entry (exit reason 0x80000021) caused by invalid guest state (0).
(XEN) ************* VMCS Area **************
(XEN) *** Guest State ***
(XEN) CR0: actual=0x0000000000000030, shadow=0x0000000000000000, gh_mask=ffffffffffffffff
(XEN) CR4: actual=0x0000000000002050, shadow=0x0000000000000000, gh_mask=ffffffffffffffff
(XEN) CR3 = 0x00000000800ed000
(XEN) RSP = 0x0000000000000000 (0x0000000000000000) RIP = 0x0000000000000000 (0x0000000000000000)
(XEN) RFLAGS=0x00000002 (0x00000002) DR7 = 0x0000000000000400
(XEN) Sysenter RSP=0000000000000000 CS:RIP=0000:0000000000000000
(XEN) sel attr limit base
(XEN) CS: 0000 00000 00000000 0000000000000000
(XEN) DS: 0000 00000 00000000 0000000000000000
(XEN) SS: 0000 00000 00000000 0000000000000000
(XEN) ES: 0000 00000 00000000 0000000000000000
(XEN) FS: 0000 00000 00000000 0000000000000000
(XEN) GS: 0000 00000 00000000 0000000000000000
(XEN) GDTR: 00000000 0000000000000000
(XEN) LDTR: 0000 00000 00000000 0000000000000000
(XEN) IDTR: 00000000 0000000000000000
(XEN) TR: 0000 00000 00000000 0000000000000000
(XEN) EFER = 0x0000000000000800 PAT = 0x0000000000000000
(XEN) PreemptionTimer = 0x00000000 SM Base = 0x00000000
(XEN) DebugCtl = 0x0000000000000000 DebugExceptions = 0x0000000000000000
(XEN) Interruptibility = 00000000 ActivityState = 00000000
(XEN) *** Host State ***
(XEN) RIP = 0xffff82d0801ee3a0 (vmx_asm_vmexit_handler) RSP = 0xffff8340077d7f90
(XEN) CS=e008 SS=0000 DS=0000 ES=0000 FS=0000 GS=0000 TR=e040
(XEN) FSBase=0000000000000000 GSBase=0000000000000000 TRBase=ffff8340077dfc00
(XEN) GDTBase=ffff8340077d0000 IDTBase=ffff8340077dc000
(XEN) CR0=0000000080050033 CR3=000000400076c000 CR4=00000000001526e0
(XEN) Sysenter RSP=ffff8340077d7fc0 CS:RIP=e008:ffff82d080238870
(XEN) EFER = 0x0000000000000000 PAT = 0x0000050100070406
(XEN) *** Control State ***
(XEN) PinBased=0000003f CPUBased=b5b9effe SecondaryExec=000054eb
(XEN) EntryControls=000011fb ExitControls=001fefff
(XEN) ExceptionBitmap=00062042 PFECmask=00000000 PFECmatch=ffffffff
(XEN) VMEntry: intr_info=00000000 errcode=00000000 ilen=00000000
(XEN) VMExit: intr_info=00000000 errcode=00000000 ilen=00000006
(XEN) reason=80000021 qualification=0000000000000000
(XEN) IDTVectoring: info=00000000 errcode=00000000
(XEN) TSC Offset = 0xfffd34adb2c3a149
(XEN) TPR Threshold = 0x00 PostedIntrVec = 0x00
(XEN) EPT pointer = 0x000000400079a01e EPTP index = 0x0000
(XEN) PLE Gap=00000080 Window=00001000
(XEN) Virtual processor ID = 0x004e VMfunc controls = 0000000000000000
(XEN) **************************************
(XEN) domain_crash called from vmx.c:2729
(XEN) Domain 1 (vcpu#0) crashed on cpu#21:
(XEN) ----[ Xen-4.6.0 x86_64 debug=y Tainted: C ]----
(XEN) CPU: 21
(XEN) RIP: 0000:[<0000000000000000>]
(XEN) RFLAGS: 0000000000000002 CONTEXT: hvm guest (d1v0)
(XEN) rax: 0000000000000000 rbx: 0000000000000000 rcx: 0000000000000000
(XEN) rdx: 00000000078bfbff rsi: 0000000000000000 rdi: 0000000000000000
(XEN) rbp: 0000000000000000 rsp: 0000000000000000 r8: 0000000000000000
(XEN) r9: 0000000000000000 r10: 0000000000000000 r11: 0000000000000000
(XEN) r12: 0000000000000000 r13: 0000000000000000 r14: 0000000000000000
(XEN) r15: 0000000000000000 cr0: 0000000000000010 cr4: 0000000000000000
(XEN) cr3: 00000000800ed000 cr2: 0000000000000000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: 0000
..
>
> Anyway, the writing of the respective VMCS field to zero in the
> alternative worries me a little: Aren't we risking MFN zero to be
> wrongly accessed due to this?
>
> Furthermore, nvmx_update_apic_access_address() having a
> similar ASSERT() seems entirely wrong: The APIC access
> page doesn't really need to match up with any actual page
> belonging to the guest - a guest could choose to point this
> into no-where (note that we've been at least considering this
> option recently for our own purposes, in the context of
> http://lists.xenproject.org/archives/html/xen-devel/2015-12/msg02191.html).
>
> > Instead I started getting an even more bizarre crash:
Ignore this part please.
.. snip..
> this doesn't match the call stack. Something's pretty fishy here.
Yes. The hypervisor was modified alongside me and I hadn't connected
the dots...
>
> Jan