* [PATCH] xen/x86: ensure copying to L1 guest in update_runstate_area()
@ 2017-02-21 2:11 Haozhong Zhang
2017-02-21 9:15 ` Jan Beulich
0 siblings, 1 reply; 8+ messages in thread
From: Haozhong Zhang @ 2017-02-21 2:11 UTC
To: xen-devel; +Cc: Andrew Cooper, Jan Beulich, Haozhong Zhang
For a HVM domain, if a vcpu is in the nested guest mode,
__raw_copy_to_guest() and __copy_to_guest() used by
update_runstate_area() will copy data to L2 guest other than L1 guest.
Besides copying to the wrong address, this bug also causes a crash in
the following code path:
  context_switch(prev, next)
    _update_runstate_area(next)
      update_runstate_area(next)
        __raw_copy_to_guest(...)
          ...
            __hvm_copy(...)
              paging_gva_to_gfn(...)
                nestedhap_walk_L1_p2m(...)
                  nvmx_hap_walk_L1_p2m(..)
                    vmx_vmcs_enter(v)       [ v = next ]
                      vmx_vmcs_try_enter(v) [ v = next ]
                        if ( likely(v == current) )
                            return v->arch.hvm_vmx.vmcs_pa == this_cpu(current_vmcs);
vmx_vmcs_try_enter() will fail and trigger the assert in
vmx_vmcs_enter() if vcpu 'next' is in nested guest mode and is
being scheduled to another CPU.
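For reference, the assert in question sits in vmx_vmcs_enter(); a rough
sketch (simplified, the exact shape may differ across versions):

    void vmx_vmcs_enter(struct vcpu *v)
    {
        bool_t okay = vmx_vmcs_try_enter(v);

        /* Fails when the v == current fast path returns 0, i.e. the
         * vcpu's VMCS is not the one loaded on this pCPU. */
        ASSERT(okay);
    }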
This commit temporarily clears the nested guest flag before all
__raw_copy_to_guest() and __copy_to_guest() calls in update_runstate_area(),
and restores the flag after those guest copy operations.
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
xen/arch/x86/domain.c | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 7d3071e..5f0444c 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -50,6 +50,7 @@
 #include <asm/mpspec.h>
 #include <asm/ldt.h>
 #include <asm/hvm/hvm.h>
+#include <asm/hvm/nestedhvm.h>
 #include <asm/hvm/support.h>
 #include <asm/hvm/viridian.h>
 #include <asm/debugreg.h>
@@ -1931,10 +1932,29 @@ bool_t update_runstate_area(struct vcpu *v)
     bool_t rc;
     smap_check_policy_t smap_policy;
     void __user *guest_handle = NULL;
+    bool nested_guest_mode = 0;
 
     if ( guest_handle_is_null(runstate_guest(v)) )
         return 1;
 
+    /*
+     * Must be before all following __raw_copy_to_guest() and __copy_to_guest().
+     *
+     * Otherwise, if 'v' is in the nested guest mode, paging_gva_to_gfn() called
+     * from __raw_copy_to_guest() and __copy_to_guest() will treat the target
+     * address as L2 gva, and __raw_copy_to_guest() and __copy_to_guest() will
+     * consequently copy runstate to L2 guest other than L1 guest.
+     *
+     * Therefore, we clear the nested guest flag before __raw_copy_to_guest()
+     * and __copy_to_guest(), and restore the flag after all guest copy.
+     */
+    if ( is_hvm_vcpu(v) && paging_mode_hap(v->domain) )
+    {
+        nested_guest_mode = nestedhvm_is_n2(v);
+        if ( nested_guest_mode )
+            nestedhvm_vcpu_exit_guestmode(v);
+    }
+
     smap_policy = smap_policy_change(v, SMAP_CHECK_ENABLED);
 
     if ( VM_ASSIST(v->domain, runstate_update_flag) )
@@ -1971,6 +1991,9 @@ bool_t update_runstate_area(struct vcpu *v)
 
     smap_policy_change(v, smap_policy);
 
+    if ( nested_guest_mode )
+        nestedhvm_vcpu_enter_guestmode(v);
+
     return rc;
 }
--
2.10.1
* Re: [PATCH] xen/x86: ensure copying to L1 guest in update_runstate_area()
2017-02-21 2:11 [PATCH] xen/x86: ensure copying to L1 guest in update_runstate_area() Haozhong Zhang
@ 2017-02-21 9:15 ` Jan Beulich
2017-02-22 1:28 ` Haozhong Zhang
0 siblings, 1 reply; 8+ messages in thread
From: Jan Beulich @ 2017-02-21 9:15 UTC
To: Haozhong Zhang; +Cc: Andrew Cooper, xen-devel
>>> On 21.02.17 at 03:11, <haozhong.zhang@intel.com> wrote:
> For a HVM domain, if a vcpu is in the nested guest mode,
> __raw_copy_to_guest() and __copy_to_guest() used by
> update_runstate_area() will copy data to L2 guest other than L1 guest.
>
> Besides copying to the wrong address, this bug also causes a crash in
> the following code path:
>   context_switch(prev, next)
>     _update_runstate_area(next)
>       update_runstate_area(next)
>         __raw_copy_to_guest(...)
>           ...
>             __hvm_copy(...)
>               paging_gva_to_gfn(...)
>                 nestedhap_walk_L1_p2m(...)
>                   nvmx_hap_walk_L1_p2m(..)
>                     vmx_vmcs_enter(v)       [ v = next ]
>                       vmx_vmcs_try_enter(v) [ v = next ]
>                         if ( likely(v == current) )
>                             return v->arch.hvm_vmx.vmcs_pa ==
>                                    this_cpu(current_vmcs);
> vmx_vmcs_try_enter() will fail and trigger the assert in
> vmx_vmcs_enter() if vcpu 'next' is in nested guest mode and is
> being scheduled to another CPU.
>
> This commit temporarily clears the nested guest flag before all
> __raw_copy_to_guest() and __copy_to_guest() calls in update_runstate_area(),
> and restores the flag after those guest copy operations.
Nice catch, but doesn't the same apply to
update_secondary_system_time() then?
> @@ -1931,10 +1932,29 @@ bool_t update_runstate_area(struct vcpu *v)
>      bool_t rc;
>      smap_check_policy_t smap_policy;
>      void __user *guest_handle = NULL;
> +    bool nested_guest_mode = 0;
false
>      if ( guest_handle_is_null(runstate_guest(v)) )
>          return 1;
>
> +    /*
> +     * Must be before all following __raw_copy_to_guest() and __copy_to_guest().
> +     *
> +     * Otherwise, if 'v' is in the nested guest mode, paging_gva_to_gfn() called
> +     * from __raw_copy_to_guest() and __copy_to_guest() will treat the target
> +     * address as L2 gva, and __raw_copy_to_guest() and __copy_to_guest() will
> +     * consequently copy runstate to L2 guest other than L1 guest.
... rather than ...
> +     *
> +     * Therefore, we clear the nested guest flag before __raw_copy_to_guest()
> +     * and __copy_to_guest(), and restore the flag after all guest copy.
> +     */
> +    if ( is_hvm_vcpu(v) && paging_mode_hap(v->domain) )
has_hvm_container_vcpu()
And why is this HAP-specific?
> @@ -1971,6 +1991,9 @@ bool_t update_runstate_area(struct vcpu *v)
>
>      smap_policy_change(v, smap_policy);
>
> +    if ( nested_guest_mode )
Can we have an unlikely() here please?
Jan
* Re: [PATCH] xen/x86: ensure copying to L1 guest in update_runstate_area()
2017-02-21 9:15 ` Jan Beulich
@ 2017-02-22 1:28 ` Haozhong Zhang
2017-02-22 2:20 ` Haozhong Zhang
0 siblings, 1 reply; 8+ messages in thread
From: Haozhong Zhang @ 2017-02-22 1:28 UTC
To: Jan Beulich; +Cc: Andrew Cooper, xen-devel
On 02/21/17 02:15 -0700, Jan Beulich wrote:
> >>> On 21.02.17 at 03:11, <haozhong.zhang@intel.com> wrote:
> > For a HVM domain, if a vcpu is in the nested guest mode,
> > __raw_copy_to_guest() and __copy_to_guest() used by
> > update_runstate_area() will copy data to L2 guest other than L1 guest.
> >
> > Besides copying to the wrong address, this bug also causes a crash in
> > the following code path:
> >   context_switch(prev, next)
> >     _update_runstate_area(next)
> >       update_runstate_area(next)
> >         __raw_copy_to_guest(...)
> >           ...
> >             __hvm_copy(...)
> >               paging_gva_to_gfn(...)
> >                 nestedhap_walk_L1_p2m(...)
> >                   nvmx_hap_walk_L1_p2m(..)
> >                     vmx_vmcs_enter(v)       [ v = next ]
> >                       vmx_vmcs_try_enter(v) [ v = next ]
> >                         if ( likely(v == current) )
> >                             return v->arch.hvm_vmx.vmcs_pa ==
> >                                    this_cpu(current_vmcs);
> > vmx_vmcs_try_enter() will fail and trigger the assert in
> > vmx_vmcs_enter() if vcpu 'next' is in nested guest mode and is
> > being scheduled to another CPU.
> >
> > This commit temporarily clears the nested guest flag before all
> > __raw_copy_to_guest() and __copy_to_guest() calls in update_runstate_area(),
> > and restores the flag after those guest copy operations.
>
> Nice catch, but doesn't the same apply to
> update_secondary_system_time() then?
>
Yes, I'll apply the same change to update_secondary_system_time().
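A rough sketch of the same guard applied there (function body elided;
signature as in xen/arch/x86/time.c at the time, and already folding in
the bool/false and unlikely() review comments from this thread):

    bool_t update_secondary_system_time(struct vcpu *v,
                                        struct vcpu_time_info *u)
    {
        bool nested_guest_mode = false;

        /* Leave nested guest mode so the copy targets L1's address space. */
        if ( is_hvm_vcpu(v) && paging_mode_hap(v->domain) )
        {
            nested_guest_mode = nestedhvm_is_n2(v);
            if ( nested_guest_mode )
                nestedhvm_vcpu_exit_guestmode(v);
        }

        /* ... existing copy of the time info to the guest ... */

        if ( unlikely(nested_guest_mode) )
            nestedhvm_vcpu_enter_guestmode(v);

        return 1;  /* placeholder; the real code returns 0 on copy failure */
    }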
> > @@ -1931,10 +1932,29 @@ bool_t update_runstate_area(struct vcpu *v)
> >      bool_t rc;
> >      smap_check_policy_t smap_policy;
> >      void __user *guest_handle = NULL;
> > +    bool nested_guest_mode = 0;
>
> false
will change
>
> >      if ( guest_handle_is_null(runstate_guest(v)) )
> >          return 1;
> >
> > +    /*
> > +     * Must be before all following __raw_copy_to_guest() and __copy_to_guest().
> > +     *
> > +     * Otherwise, if 'v' is in the nested guest mode, paging_gva_to_gfn() called
> > +     * from __raw_copy_to_guest() and __copy_to_guest() will treat the target
> > +     * address as L2 gva, and __raw_copy_to_guest() and __copy_to_guest() will
> > +     * consequently copy runstate to L2 guest other than L1 guest.
>
> ... rather than ...
ditto
>
> > +     *
> > +     * Therefore, we clear the nested guest flag before __raw_copy_to_guest()
> > +     * and __copy_to_guest(), and restore the flag after all guest copy.
> > +     */
> > +    if ( is_hvm_vcpu(v) && paging_mode_hap(v->domain) )
>
> has_hvm_container_vcpu()
Nested HVM is only available to HVM domains, so I think is_hvm_vcpu(v) is enough.
>
> And why is this HAP-specific?
>
IIUC, nested HVM relies on HAP.
> > @@ -1971,6 +1991,9 @@ bool_t update_runstate_area(struct vcpu *v)
> >
> >      smap_policy_change(v, smap_policy);
> >
> > +    if ( nested_guest_mode )
>
> Can we have an unlikely() here please?
will add
Thanks,
Haozhong
* Re: [PATCH] xen/x86: ensure copying to L1 guest in update_runstate_area()
2017-02-22 1:28 ` Haozhong Zhang
@ 2017-02-22 2:20 ` Haozhong Zhang
2017-02-22 3:15 ` Tian, Kevin
2017-02-22 7:46 ` Jan Beulich
0 siblings, 2 replies; 8+ messages in thread
From: Haozhong Zhang @ 2017-02-22 2:20 UTC
To: Jan Beulich; +Cc: Andrew Cooper, Tian, Kevin, xen-devel
On 02/22/17 09:28 +0800, Haozhong Zhang wrote:
> On 02/21/17 02:15 -0700, Jan Beulich wrote:
> > >>> On 21.02.17 at 03:11, <haozhong.zhang@intel.com> wrote:
[..]
> > > +     *
> > > +     * Therefore, we clear the nested guest flag before __raw_copy_to_guest()
> > > +     * and __copy_to_guest(), and restore the flag after all guest copy.
> > > +     */
> > > +    if ( is_hvm_vcpu(v) && paging_mode_hap(v->domain) )
> >
I think it would be clearer to use nestedhvm_enabled() here.
> > has_hvm_container_vcpu()
>
> Nested HVM is only available to HVM domain, so I think is_hvm_vcpu(v) is enough.
>
> >
> > And why is this HAP-specific?
> >
>
> IIUC, nested HVM relies on HAP.
For nested SVM, I find the following check in hvmop_set_param():
    case HVM_PARAM_NESTEDHVM:
        if ( cpu_has_svm && !paging_mode_hap(d) && a.value )
            rc = -EINVAL;
I don't find a similar check for nested VMX here or in vvmx.c.
Though an L1 HVM domain w/ nestedhvm=1 and hap=0 can boot up on an Intel
machine (because of the lack of the above check?), starting an L2 guest does
crash L1 at the very beginning, and L0 Xen reports the following debug
messages:
(XEN) realmode.c:111:d18v9 Failed to emulate insn.
(XEN) Real-mode emulation failed: d18v9 Real @ f000:0000fff0 ->
(XEN) domain_crash called from realmode.c:157
(XEN) Domain 18 (vcpu#9) crashed on cpu#29:
(XEN) ----[ Xen-4.9-unstable x86_64 debug=y Not tainted ]----
(XEN) CPU: 29
(XEN) RIP: f000:[<000000000000fff0>]
(XEN) RFLAGS: 0000000000000002 CONTEXT: hvm guest (d18v9)
(XEN) rax: 0000000000000000 rbx: 0000000000000000 rcx: 0000000000000000
(XEN) rdx: 0000000000000f61 rsi: 0000000000000000 rdi: 0000000000000000
(XEN) rbp: 0000000000000000 rsp: 0000000000000000 r8: 0000000000000000
(XEN) r9: 0000000000000000 r10: 0000000000000000 r11: 0000000000000000
(XEN) r12: 0000000000000000 r13: 0000000000000000 r14: 0000000000000000
(XEN) r15: 0000000000000000 cr0: 0000000000000030 cr4: 0000000000002050
(XEN) cr3: 00000000feffc000 cr2: 0000000000000000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: f000
I haven't dug into this problem, but I suspect there would be other
bugs when using nested VMX w/o HAP. Maybe we should add a similar check in
hvmop_set_param() for nested VMX as well.
Kevin, any comments?
Thanks,
Haozhong
* Re: [PATCH] xen/x86: ensure copying to L1 guest in update_runstate_area()
2017-02-22 2:20 ` Haozhong Zhang
@ 2017-02-22 3:15 ` Tian, Kevin
2017-02-22 7:46 ` Jan Beulich
1 sibling, 0 replies; 8+ messages in thread
From: Tian, Kevin @ 2017-02-22 3:15 UTC
To: Zhang, Haozhong, Jan Beulich; +Cc: Andrew Cooper, xen-devel@lists.xen.org
> From: Zhang, Haozhong
> Sent: Wednesday, February 22, 2017 10:21 AM
> >
> > >
> > > And why is this HAP-specific?
> > >
> >
> > IIUC, nested HVM relies on HAP.
>
> For nested SVM, I find the following check in hvmop_set_param():
>     case HVM_PARAM_NESTEDHVM:
>         if ( cpu_has_svm && !paging_mode_hap(d) && a.value )
>             rc = -EINVAL;
>
> I don't find a similar check for nested VMX here or in vvmx.c.
> Though an L1 HVM domain w/ nestedhvm=1 and hap=0 can boot up on an Intel
> machine (because of the lack of the above check?), starting an L2 guest does
> crash L1 at the very beginning, and L0 Xen reports the following debug
> messages:
>
> (XEN) realmode.c:111:d18v9 Failed to emulate insn.
> (XEN) Real-mode emulation failed: d18v9 Real @ f000:0000fff0 ->
> (XEN) domain_crash called from realmode.c:157
> (XEN) Domain 18 (vcpu#9) crashed on cpu#29:
> (XEN) ----[ Xen-4.9-unstable x86_64 debug=y Not tainted ]----
> (XEN) CPU: 29
> (XEN) RIP: f000:[<000000000000fff0>]
> (XEN) RFLAGS: 0000000000000002 CONTEXT: hvm guest (d18v9)
> (XEN) rax: 0000000000000000 rbx: 0000000000000000 rcx: 0000000000000000
> (XEN) rdx: 0000000000000f61 rsi: 0000000000000000 rdi: 0000000000000000
> (XEN) rbp: 0000000000000000 rsp: 0000000000000000 r8: 0000000000000000
> (XEN) r9: 0000000000000000 r10: 0000000000000000 r11: 0000000000000000
> (XEN) r12: 0000000000000000 r13: 0000000000000000 r14: 0000000000000000
> (XEN) r15: 0000000000000000 cr0: 0000000000000030 cr4: 0000000000002050
> (XEN) cr3: 00000000feffc000 cr2: 0000000000000000
> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: f000
>
> I haven't dug into this problem, but I suspect there would be other
> bugs when using nested VMX w/o HAP. Maybe we should add a similar check in
> hvmop_set_param() for nested VMX as well.
>
> Kevin, any comments?
>
I don't recall the initial version of vvmx support - it possibly predated
HAP support, which was introduced later. That's probably the reason why we
don't limit it to HAP=1. But I'm OK to add a similar check now - given
HAP=0 is broken and less efficient than HAP=1 in the nested situation.
Thanks
Kevin
* Re: [PATCH] xen/x86: ensure copying to L1 guest in update_runstate_area()
2017-02-22 2:20 ` Haozhong Zhang
2017-02-22 3:15 ` Tian, Kevin
@ 2017-02-22 7:46 ` Jan Beulich
2017-02-23 8:00 ` Haozhong Zhang
1 sibling, 1 reply; 8+ messages in thread
From: Jan Beulich @ 2017-02-22 7:46 UTC
To: Haozhong Zhang; +Cc: Andrew Cooper, Kevin Tian, xen-devel
>>> On 22.02.17 at 03:20, <haozhong.zhang@intel.com> wrote:
> On 02/22/17 09:28 +0800, Haozhong Zhang wrote:
>> On 02/21/17 02:15 -0700, Jan Beulich wrote:
>> > >>> On 21.02.17 at 03:11, <haozhong.zhang@intel.com> wrote:
> [..]
>> > > +     *
>> > > +     * Therefore, we clear the nested guest flag before __raw_copy_to_guest()
>> > > +     * and __copy_to_guest(), and restore the flag after all guest copy.
>> > > +     */
>> > > +    if ( is_hvm_vcpu(v) && paging_mode_hap(v->domain) )
>> >
>
> I think it would be clearer to use nestedhvm_enabled() here.
Indeed - that would eliminate all this open-coding of assumptions
which may be valid at present, but not down the road.
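With that change the guard would presumably shrink to something like
(sketch; nestedhvm_enabled() takes the domain):

    if ( nestedhvm_enabled(v->domain) )
    {
        nested_guest_mode = nestedhvm_is_n2(v);
        if ( nested_guest_mode )
            nestedhvm_vcpu_exit_guestmode(v);
    }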
>> > And why is this HAP-specific?
>> >
>>
>> IIUC, nested HVM relies on HAP.
>
> For nested SVM, I find the following check in hvmop_set_param():
>     case HVM_PARAM_NESTEDHVM:
>         if ( cpu_has_svm && !paging_mode_hap(d) && a.value )
>             rc = -EINVAL;
>
> I don't find a similar check for nested VMX here or in vvmx.c.
> Though an L1 HVM domain w/ nestedhvm=1 and hap=0 can boot up on an Intel
> machine (because of the lack of the above check?), starting an L2 guest does
> crash L1 at the very beginning, and L0 Xen reports the following debug
> messages:
>
> (XEN) realmode.c:111:d18v9 Failed to emulate insn.
> (XEN) Real-mode emulation failed: d18v9 Real @ f000:0000fff0 ->
> (XEN) domain_crash called from realmode.c:157
> (XEN) Domain 18 (vcpu#9) crashed on cpu#29:
> (XEN) ----[ Xen-4.9-unstable x86_64 debug=y Not tainted ]----
> (XEN) CPU: 29
> (XEN) RIP: f000:[<000000000000fff0>]
> (XEN) RFLAGS: 0000000000000002 CONTEXT: hvm guest (d18v9)
> (XEN) rax: 0000000000000000 rbx: 0000000000000000 rcx: 0000000000000000
> (XEN) rdx: 0000000000000f61 rsi: 0000000000000000 rdi: 0000000000000000
> (XEN) rbp: 0000000000000000 rsp: 0000000000000000 r8: 0000000000000000
> (XEN) r9: 0000000000000000 r10: 0000000000000000 r11: 0000000000000000
> (XEN) r12: 0000000000000000 r13: 0000000000000000 r14: 0000000000000000
> (XEN) r15: 0000000000000000 cr0: 0000000000000030 cr4: 0000000000002050
> (XEN) cr3: 00000000feffc000 cr2: 0000000000000000
> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: f000
Now that's of course quite odd: The instruction at that address
ought to be a direct far branch, which I think the emulator is able
to deal with. Which leaves the possibility of it fetching the wrong
bytes (which sadly don't get shown above).
Jan
* Re: [PATCH] xen/x86: ensure copying to L1 guest in update_runstate_area()
2017-02-22 7:46 ` Jan Beulich
@ 2017-02-23 8:00 ` Haozhong Zhang
2017-02-23 13:08 ` Andrew Cooper
0 siblings, 1 reply; 8+ messages in thread
From: Haozhong Zhang @ 2017-02-23 8:00 UTC
To: Jan Beulich; +Cc: Andrew Cooper, Kevin Tian, xen-devel
On 02/22/17 00:46 -0700, Jan Beulich wrote:
> >>> On 22.02.17 at 03:20, <haozhong.zhang@intel.com> wrote:
> > On 02/22/17 09:28 +0800, Haozhong Zhang wrote:
> >> On 02/21/17 02:15 -0700, Jan Beulich wrote:
> >> > >>> On 21.02.17 at 03:11, <haozhong.zhang@intel.com> wrote:
> > [..]
> >> > > +     *
> >> > > +     * Therefore, we clear the nested guest flag before __raw_copy_to_guest()
> >> > > +     * and __copy_to_guest(), and restore the flag after all guest copy.
> >> > > +     */
> >> > > +    if ( is_hvm_vcpu(v) && paging_mode_hap(v->domain) )
> >> >
> >
> > I think it would be clearer to use nestedhvm_enabled() here.
>
> Indeed - that would eliminate all these open coding of assumption
> which may be valid at present, but not down the road.
>
> >> > And why is this HAP-specific?
> >> >
> >>
> >> IIUC, nested HVM relies on HAP.
> >
> > For nested SVM, I find the following check in hvmop_set_param():
> >     case HVM_PARAM_NESTEDHVM:
> >         if ( cpu_has_svm && !paging_mode_hap(d) && a.value )
> >             rc = -EINVAL;
> >
> > I don't find a similar check for nested VMX here or in vvmx.c.
> > Though an L1 HVM domain w/ nestedhvm=1 and hap=0 can boot up on an Intel
> > machine (because of the lack of the above check?), starting an L2 guest does
> > crash L1 at the very beginning, and L0 Xen reports the following debug
> > messages:
> >
> > (XEN) realmode.c:111:d18v9 Failed to emulate insn.
> > (XEN) Real-mode emulation failed: d18v9 Real @ f000:0000fff0 ->
> > (XEN) domain_crash called from realmode.c:157
> > (XEN) Domain 18 (vcpu#9) crashed on cpu#29:
> > (XEN) ----[ Xen-4.9-unstable x86_64 debug=y Not tainted ]----
> > (XEN) CPU: 29
> > (XEN) RIP: f000:[<000000000000fff0>]
> > (XEN) RFLAGS: 0000000000000002 CONTEXT: hvm guest (d18v9)
> > (XEN) rax: 0000000000000000 rbx: 0000000000000000 rcx: 0000000000000000
> > (XEN) rdx: 0000000000000f61 rsi: 0000000000000000 rdi: 0000000000000000
> > (XEN) rbp: 0000000000000000 rsp: 0000000000000000 r8: 0000000000000000
> > (XEN) r9: 0000000000000000 r10: 0000000000000000 r11: 0000000000000000
> > (XEN) r12: 0000000000000000 r13: 0000000000000000 r14: 0000000000000000
> > (XEN) r15: 0000000000000000 cr0: 0000000000000030 cr4: 0000000000002050
> > (XEN) cr3: 00000000feffc000 cr2: 0000000000000000
> > (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: f000
>
> Now that's of course quite odd: The instruction at that address
> ought to be a direct far branch, which I think the emulator is able
> to deal with. Which leaves the possibility of it fetching the wrong
> bytes (which sadly don't get shown above).
Yes, the emulator fails to fetch the instruction from the L2
guest. This happens at the VM entry to the L2 guest after handling the L1
vmlaunch. The code path from vmx_asm_do_vmentry is shown below:
  vmx_asm_do_vmentry
    vmx_realmode
      vmx_realmode_emulate_one
        hvm_emulate_one
          _hvm_emulate_one
            x86_emulate
              x86_decode
                insn_fetch_type
                ...
                  hvmemul_insn_fetch
                    __hvmemul_read
                      hvm_fetch_from_guest_linear
                        __hvm_copy(addr = 0xfffffff0,
                                   flags = HVMCOPY_from_guest | HVMCOPY_linear)
    3117     if ( flags & HVMCOPY_linear )
             {
    3119         gfn = paging_gva_to_gfn(v, addr, &pfec);
                 ...
    3133         gpa |= gfn << PAGE_SHIFT;
             }
             ...
    3151     page = get_page_from_gfn(v->domain, gfn, &p2mt, P2M_UNSHARE);
    3153     if ( !page )
    3154         return HVMCOPY_bad_gfn_to_mfn;
When hap is not enabled, paging_gva_to_gfn at line 3119 uses the host p2m
rather than walking through the nested EPT, so line 3119 treats the *L2*
gva 0xfffffff0 as an *L1* gva and translates it to L1 gfn 0xfffff. The
p2m type at gfn 0xfffff is p2m_mmio_dm, so get_page_from_gfn() at line
3151 fails.
I don't know whether nested VMX without HAP ever worked, but it does not
now. I'll include a patch to enforce the HAP requirement for nested VMX.
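Such a check could presumably mirror the existing SVM one, e.g. by
dropping the cpu_has_svm qualification (hypothetical sketch, not the
patch as posted):

    case HVM_PARAM_NESTEDHVM:
        /* Require HAP for nested HVM on both SVM and VMX. */
        if ( !paging_mode_hap(d) && a.value )
            rc = -EINVAL;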
Haozhong
* Re: [PATCH] xen/x86: ensure copying to L1 guest in update_runstate_area()
2017-02-23 8:00 ` Haozhong Zhang
@ 2017-02-23 13:08 ` Andrew Cooper
0 siblings, 0 replies; 8+ messages in thread
From: Andrew Cooper @ 2017-02-23 13:08 UTC
To: Jan Beulich, Kevin Tian, xen-devel
On 23/02/17 08:00, Haozhong Zhang wrote:
> On 02/22/17 00:46 -0700, Jan Beulich wrote:
>>>>> On 22.02.17 at 03:20, <haozhong.zhang@intel.com> wrote:
>>> On 02/22/17 09:28 +0800, Haozhong Zhang wrote:
>>>> On 02/21/17 02:15 -0700, Jan Beulich wrote:
>>>>>>>> On 21.02.17 at 03:11, <haozhong.zhang@intel.com> wrote:
>>> [..]
>>>>>> +     *
>>>>>> +     * Therefore, we clear the nested guest flag before __raw_copy_to_guest()
>>>>>> +     * and __copy_to_guest(), and restore the flag after all guest copy.
>>>>>> +     */
>>>>>> +    if ( is_hvm_vcpu(v) && paging_mode_hap(v->domain) )
>>> I think it would be clearer to use nestedhvm_enabled() here.
>> Indeed - that would eliminate all these open coding of assumption
>> which may be valid at present, but not down the road.
>>
>>>>> And why is this HAP-specific?
>>>>>
>>>> IIUC, nested HVM relies on HAP.
>>> For nested SVM, I find the following check in hvmop_set_param():
>>>     case HVM_PARAM_NESTEDHVM:
>>>         if ( cpu_has_svm && !paging_mode_hap(d) && a.value )
>>>             rc = -EINVAL;
>>>
>>> I don't find a similar check for nested VMX here or in vvmx.c.
>>> Though an L1 HVM domain w/ nestedhvm=1 and hap=0 can boot up on an Intel
>>> machine (because of the lack of the above check?), starting an L2 guest does
>>> crash L1 at the very beginning, and L0 Xen reports the following debug
>>> messages:
>>>
>>> (XEN) realmode.c:111:d18v9 Failed to emulate insn.
>>> (XEN) Real-mode emulation failed: d18v9 Real @ f000:0000fff0 ->
>>> (XEN) domain_crash called from realmode.c:157
>>> (XEN) Domain 18 (vcpu#9) crashed on cpu#29:
>>> (XEN) ----[ Xen-4.9-unstable x86_64 debug=y Not tainted ]----
>>> (XEN) CPU: 29
>>> (XEN) RIP: f000:[<000000000000fff0>]
>>> (XEN) RFLAGS: 0000000000000002 CONTEXT: hvm guest (d18v9)
>>> (XEN) rax: 0000000000000000 rbx: 0000000000000000 rcx: 0000000000000000
>>> (XEN) rdx: 0000000000000f61 rsi: 0000000000000000 rdi: 0000000000000000
>>> (XEN) rbp: 0000000000000000 rsp: 0000000000000000 r8: 0000000000000000
>>> (XEN) r9: 0000000000000000 r10: 0000000000000000 r11: 0000000000000000
>>> (XEN) r12: 0000000000000000 r13: 0000000000000000 r14: 0000000000000000
>>> (XEN) r15: 0000000000000000 cr0: 0000000000000030 cr4: 0000000000002050
>>> (XEN) cr3: 00000000feffc000 cr2: 0000000000000000
>>> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: f000
>> Now that's of course quite odd: The instruction at that address
>> ought to be a direct far branch, which I think the emulator is able
>> to deal with. Which leaves the possibility of it fetching the wrong
>> bytes (which sadly don't get shown above).
> Yes, the emulator fails to fetch the instruction from the L2
> guest. This happens at the VM entry to the L2 guest after handling the L1
> vmlaunch. The code path from vmx_asm_do_vmentry is shown below:
>   vmx_asm_do_vmentry
>     vmx_realmode
>       vmx_realmode_emulate_one
>         hvm_emulate_one
>           _hvm_emulate_one
>             x86_emulate
>               x86_decode
>                 insn_fetch_type
>                 ...
>                   hvmemul_insn_fetch
>                     __hvmemul_read
>                       hvm_fetch_from_guest_linear
>                         __hvm_copy(addr = 0xfffffff0,
>                                    flags = HVMCOPY_from_guest | HVMCOPY_linear)
>     3117     if ( flags & HVMCOPY_linear )
>              {
>     3119         gfn = paging_gva_to_gfn(v, addr, &pfec);
>                  ...
>     3133         gpa |= gfn << PAGE_SHIFT;
>              }
>              ...
>     3151     page = get_page_from_gfn(v->domain, gfn, &p2mt, P2M_UNSHARE);
>     3153     if ( !page )
>     3154         return HVMCOPY_bad_gfn_to_mfn;
>
> When hap is not enabled, paging_gva_to_gfn at line 3119 uses the host p2m
> rather than walking through the nested EPT, so line 3119 treats the *L2*
> gva 0xfffffff0 as an *L1* gva and translates it to L1 gfn 0xfffff. The
> p2m type at gfn 0xfffff is p2m_mmio_dm, so get_page_from_gfn() at line
> 3151 fails.
>
> I don't know whether nested VMX without HAP ever worked, but it does not
> now. I'll include a patch to enforce the HAP requirement for nested VMX.
This answers one of my open tickets (although being shadow-related, it
wasn't the highest priority item to look at).
Nested VMX definitely used to work with shadow paging, but I am not
surprised if it has bitrotted in the meantime. One larger area I have
identified in the current swamp of nested-virt problems is that the
current p2m infrastructure makes it far too easy to make mistakes like
this (e.g. c/s bab2bd8e2).
My idea (probably needs refining) to address the issue is to introduce
new TYPE_SAFE()'s for l1 and l2 gfns, and use the compiler to make it
harder to get this wrong.
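As a rough illustration of that idea (the names are hypothetical; Xen
already uses TYPE_SAFE() to generate mfn_t and gfn_t), distinct wrapper
types would turn mixing the two levels into a compile error:

    /* Hypothetical: generates l1_gfn_t/_l1_gfn()/l1_gfn_x(), and likewise
     * for l2_gfn_t, exactly as TYPE_SAFE(unsigned long, gfn) does today. */
    TYPE_SAFE(unsigned long, l1_gfn);
    TYPE_SAFE(unsigned long, l2_gfn);

    /* A nested translation helper then can't silently take the wrong one: */
    l1_gfn_t nested_translate_gfn(struct vcpu *v, l2_gfn_t l2);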
~Andrew