* [PATCH 0/4] Fix shadow-on-shadow nested VMX
@ 2013-09-25 9:51 Gleb Natapov
2013-09-25 9:51 ` [PATCH 1/4] KVM: nVMX: Amend nested_run_pending logic Gleb Natapov
` (4 more replies)
0 siblings, 5 replies; 21+ messages in thread
From: Gleb Natapov @ 2013-09-25 9:51 UTC (permalink / raw)
To: kvm; +Cc: pbonzini
This series fixes shadow-on-shadow nested VMX (at least for me).
Gleb Natapov (4):
KVM: nVMX: Amend nested_run_pending logic
KVM: nVMX: Do not put exception that caused vmexit to
IDT_VECTORING_INFO
KVM: nVMX: Check all exceptions for intercept during delivery to L2
KVM: nVMX: Do not generate #DF if #PF happens during exception
delivery into L2
arch/x86/kvm/vmx.c | 60 ++++++++++++++++++++++++++++++++++--------------------
1 file changed, 38 insertions(+), 22 deletions(-)
--
1.8.4.rc3
* [PATCH 1/4] KVM: nVMX: Amend nested_run_pending logic
2013-09-25 9:51 [PATCH 0/4] Fix shadow-on-shadow nested VMX Gleb Natapov
@ 2013-09-25 9:51 ` Gleb Natapov
2013-09-25 9:51 ` [PATCH 2/4] KVM: nVMX: Do not put exception that caused vmexit to IDT_VECTORING_INFO Gleb Natapov
` (3 subsequent siblings)
4 siblings, 0 replies; 21+ messages in thread
From: Gleb Natapov @ 2013-09-25 9:51 UTC (permalink / raw)
To: kvm; +Cc: pbonzini
An EXIT_REASON_VMLAUNCH/EXIT_REASON_VMRESUME exit does not mean that the
nested VM will actually run during the next entry. Move the setting of
nested_run_pending closer to the vmentry emulation code, and its clearing
closer to vmexit, to minimize the amount of code that can erroneously run
with nested_run_pending set.
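In short, the intended lifecycle becomes (a condensed sketch of the diff
below, not new behaviour):

        /* nested_vmx_run(): vmentry emulation is now the only place
         * that arms the flag */
        enter_guest_mode(vcpu);
        vmx->nested.nested_run_pending = 1;

        /* vmx_vcpu_run(): the flag is consumed and dropped right after
         * the hardware vmexit */
        if (vmx->nested.nested_run_pending)
                kvm_make_request(KVM_REQ_EVENT, vcpu);
        vmx->nested.nested_run_pending = 0;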
Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
arch/x86/kvm/vmx.c | 26 ++++++++++++--------------
1 file changed, 12 insertions(+), 14 deletions(-)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index e6e8fbc..7eb0512 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -6742,20 +6742,6 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu)
 	if (vmx->emulation_required)
 		return handle_invalid_guest_state(vcpu);
 
-	/*
-	 * the KVM_REQ_EVENT optimization bit is only on for one entry, and if
-	 * we did not inject a still-pending event to L1 now because of
-	 * nested_run_pending, we need to re-enable this bit.
-	 */
-	if (vmx->nested.nested_run_pending)
-		kvm_make_request(KVM_REQ_EVENT, vcpu);
-
-	if (!is_guest_mode(vcpu) && (exit_reason == EXIT_REASON_VMLAUNCH ||
-	    exit_reason == EXIT_REASON_VMRESUME))
-		vmx->nested.nested_run_pending = 1;
-	else
-		vmx->nested.nested_run_pending = 0;
-
 	if (is_guest_mode(vcpu) && nested_vmx_exit_handled(vcpu)) {
 		nested_vmx_vmexit(vcpu);
 		return 1;
@@ -7290,6 +7276,16 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
 	vmx->exit_reason = vmcs_read32(VM_EXIT_REASON);
 	trace_kvm_exit(vmx->exit_reason, vcpu, KVM_ISA_VMX);
 
+	/*
+	 * the KVM_REQ_EVENT optimization bit is only on for one entry, and if
+	 * we did not inject a still-pending event to L1 now because of
+	 * nested_run_pending, we need to re-enable this bit.
+	 */
+	if (vmx->nested.nested_run_pending)
+		kvm_make_request(KVM_REQ_EVENT, vcpu);
+
+	vmx->nested.nested_run_pending = 0;
+
 	vmx_complete_atomic_exit(vmx);
 	vmx_recover_nmi_blocking(vmx);
 	vmx_complete_interrupts(vmx);
@@ -7948,6 +7944,8 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
 
 	enter_guest_mode(vcpu);
 
+	vmx->nested.nested_run_pending = 1;
+
 	vmx->nested.vmcs01_tsc_offset = vmcs_read64(TSC_OFFSET);
 
 	cpu = get_cpu();
--
1.8.4.rc3
* [PATCH 2/4] KVM: nVMX: Do not put exception that caused vmexit to IDT_VECTORING_INFO
2013-09-25 9:51 [PATCH 0/4] Fix shadow-on-shadow nested VMX Gleb Natapov
2013-09-25 9:51 ` [PATCH 1/4] KVM: nVMX: Amend nested_run_pending logic Gleb Natapov
@ 2013-09-25 9:51 ` Gleb Natapov
2013-09-25 9:51 ` [PATCH 3/4] KVM: nVMX: Check all exceptions for intercept during delivery to L2 Gleb Natapov
` (2 subsequent siblings)
4 siblings, 0 replies; 21+ messages in thread
From: Gleb Natapov @ 2013-09-25 9:51 UTC (permalink / raw)
To: kvm; +Cc: pbonzini
If an exception causes a vmexit directly, it should not be reported in
IDT_VECTORING_INFO during the exit. For that we need to be able to
distinguish between an exception that is injected into the nested VM and
one that is reinjected because its delivery failed. Fortunately, we
already have such a mechanism for nested SVM, so here we just use the
correct function to requeue exceptions and make sure that a reinjected
exception is neither moved to IDT_VECTORING_INFO during vmexit emulation
nor re-checked for interception during delivery.
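For reference, the requeue machinery this relies on lives in the common
x86 code; roughly (a sketch of the x86.c helpers, shown here only for
context):

        /* a brand new exception: subject to the L1 intercept check */
        void kvm_queue_exception(struct kvm_vcpu *vcpu, unsigned nr)
        {
                kvm_multiple_exception(vcpu, nr, false, 0, false);
        }

        /* delivery already started and failed: re-inject as-is; this
         * sets vcpu->arch.exception.reinject, so the intercept check and
         * the IDT_VECTORING_INFO logic can tell the two cases apart */
        void kvm_requeue_exception(struct kvm_vcpu *vcpu, unsigned nr)
        {
                kvm_multiple_exception(vcpu, nr, false, 0, true);
        }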
Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
arch/x86/kvm/vmx.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 7eb0512..663bc5e 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1921,7 +1921,7 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu, unsigned nr,
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 	u32 intr_info = nr | INTR_INFO_VALID_MASK;
 
-	if (nr == PF_VECTOR && is_guest_mode(vcpu) &&
+	if (!reinject && nr == PF_VECTOR && is_guest_mode(vcpu) &&
 	    !vmx->nested.nested_run_pending && nested_pf_handled(vcpu))
 		return;
 
@@ -7053,9 +7053,9 @@ static void __vmx_complete_interrupts(struct kvm_vcpu *vcpu,
 	case INTR_TYPE_HARD_EXCEPTION:
 		if (idt_vectoring_info & VECTORING_INFO_DELIVER_CODE_MASK) {
 			u32 err = vmcs_read32(error_code_field);
-			kvm_queue_exception_e(vcpu, vector, err);
+			kvm_requeue_exception_e(vcpu, vector, err);
 		} else
-			kvm_queue_exception(vcpu, vector);
+			kvm_requeue_exception(vcpu, vector);
 		break;
 	case INTR_TYPE_SOFT_INTR:
 		vcpu->arch.event_exit_inst_len = vmcs_read32(instr_len_field);
@@ -8013,7 +8013,7 @@ static void vmcs12_save_pending_event(struct kvm_vcpu *vcpu,
 	u32 idt_vectoring;
 	unsigned int nr;
 
-	if (vcpu->arch.exception.pending) {
+	if (vcpu->arch.exception.pending && vcpu->arch.exception.reinject) {
 		nr = vcpu->arch.exception.nr;
 		idt_vectoring = nr | VECTORING_INFO_VALID_MASK;
 
--
1.8.4.rc3
* [PATCH 3/4] KVM: nVMX: Check all exceptions for intercept during delivery to L2
2013-09-25 9:51 [PATCH 0/4] Fix shadow-on-shadow nested VMX Gleb Natapov
2013-09-25 9:51 ` [PATCH 1/4] KVM: nVMX: Amend nested_run_pending logic Gleb Natapov
2013-09-25 9:51 ` [PATCH 2/4] KVM: nVMX: Do not put exception that caused vmexit to IDT_VECTORING_INFO Gleb Natapov
@ 2013-09-25 9:51 ` Gleb Natapov
2013-09-25 10:38 ` Paolo Bonzini
2013-09-25 14:00 ` Paolo Bonzini
2013-09-25 9:51 ` [PATCH 4/4] KVM: nVMX: Do not generate #DF if #PF happens during exception delivery into L2 Gleb Natapov
2013-09-26 15:10 ` [PATCH 0/4] Fix shadow-on-shadow nested VMX Paolo Bonzini
4 siblings, 2 replies; 21+ messages in thread
From: Gleb Natapov @ 2013-09-25 9:51 UTC (permalink / raw)
To: kvm; +Cc: pbonzini
All exceptions should be checked for intercept during delivery to L2,
but currently we check only #PF. Drop the nested_run_pending check while
we are at it, since an exception cannot be injected during vmentry anyway.
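The resulting check in vmx_queue_exception() looks like this (condensed
from the diff below):

        /* before: only a #PF could be reflected to L1 */
        if (!reinject && nr == PF_VECTOR && is_guest_mode(vcpu) &&
            !vmx->nested.nested_run_pending && nested_pf_handled(vcpu))
                return;

        /* after: any vector whose bit is set in vmcs12's exception
         * bitmap is reflected, unless the exception is reinjected */
        if (!reinject && is_guest_mode(vcpu) && nested_ex_handled(vcpu, nr))
                return;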
Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
arch/x86/kvm/vmx.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 663bc5e..5bfa09d 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1902,12 +1902,11 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
  * a #PF exception (this is the only case in which KVM injects a #PF when L2
  * is running).
  */
-static int nested_pf_handled(struct kvm_vcpu *vcpu)
+static int nested_ex_handled(struct kvm_vcpu *vcpu, unsigned nr)
 {
 	struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
 
-	/* TODO: also check PFEC_MATCH/MASK, not just EB.PF. */
-	if (!(vmcs12->exception_bitmap & (1u << PF_VECTOR)))
+	if (!(vmcs12->exception_bitmap & (1u << nr)))
 		return 0;
 
 	nested_vmx_vmexit(vcpu);
@@ -1921,8 +1920,7 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu, unsigned nr,
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 	u32 intr_info = nr | INTR_INFO_VALID_MASK;
 
-	if (!reinject && nr == PF_VECTOR && is_guest_mode(vcpu) &&
-	    !vmx->nested.nested_run_pending && nested_pf_handled(vcpu))
+	if (!reinject && is_guest_mode(vcpu) && nested_ex_handled(vcpu, nr))
 		return;
 
 	if (has_error_code) {
--
1.8.4.rc3
* [PATCH 4/4] KVM: nVMX: Do not generate #DF if #PF happens during exception delivery into L2
2013-09-25 9:51 [PATCH 0/4] Fix shadow-on-shadow nested VMX Gleb Natapov
` (2 preceding siblings ...)
2013-09-25 9:51 ` [PATCH 3/4] KVM: nVMX: Check all exceptions for intercept during delivery to L2 Gleb Natapov
@ 2013-09-25 9:51 ` Gleb Natapov
2013-09-25 11:24 ` Paolo Bonzini
2013-09-26 15:10 ` [PATCH 0/4] Fix shadow-on-shadow nested VMX Paolo Bonzini
4 siblings, 1 reply; 21+ messages in thread
From: Gleb Natapov @ 2013-09-25 9:51 UTC (permalink / raw)
To: kvm; +Cc: pbonzini
If a #PF happens during delivery of an exception into L2, and L1 also
does not have the page mapped in its shadow page table, then L0 needs to
generate a vmexit from L2 to L1 with the original event in
IDT_VECTORING_INFO, but the current code combines both exceptions and
generates a #DF instead. Fix that by providing an nVMX-specific function
for page faults raised during the page table walk, which handles this
case correctly.
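The path in question is the software walk of L2's page tables; roughly
(a simplified sketch, with helper names as in paging_tmpl.h and mmu.c):

        /* FNAME(page_fault): walk L2's guest page tables in software */
        r = FNAME(walk_addr)(&walker, vcpu, addr, error_code);
        if (!r) {
                /*
                 * The walk faulted: the fault is routed through whatever
                 * hook is installed in walk_mmu->inject_page_fault.
                 * Without this patch that is always
                 * kvm_inject_page_fault(), which merges a #PF raised
                 * during delivery of another event into #DF; the new
                 * hook instead turns it into a vmexit when L1
                 * intercepts #PF.
                 */
                inject_page_fault(vcpu, &walker.fault);
        }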
Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
arch/x86/kvm/vmx.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 5bfa09d..07c36fd 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -7520,6 +7520,20 @@ static void nested_ept_uninit_mmu_context(struct kvm_vcpu *vcpu)
 	vcpu->arch.walk_mmu = &vcpu->arch.mmu;
 }
 
+static void vmx_inject_page_fault_nested(struct kvm_vcpu *vcpu,
+		struct x86_exception *fault)
+{
+	struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
+
+	WARN_ON(!is_guest_mode(vcpu));
+
+	/* TODO: also check PFEC_MATCH/MASK, not just EB.PF. */
+	if (vmcs12->exception_bitmap & (1u << PF_VECTOR))
+		nested_vmx_vmexit(vcpu);
+	else
+		kvm_inject_page_fault(vcpu, fault);
+}
+
 /*
  * prepare_vmcs02 is called when the L1 guest hypervisor runs its nested
  * L2 guest. L1 has a vmcs for L2 (vmcs12), and this function "merges" it
@@ -7773,6 +7787,9 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
 	kvm_set_cr3(vcpu, vmcs12->guest_cr3);
 	kvm_mmu_reset_context(vcpu);
 
+	if (!enable_ept)
+		vcpu->arch.walk_mmu->inject_page_fault = vmx_inject_page_fault_nested;
+
 	/*
 	 * L1 may access the L2's PDPTR, so save them to construct vmcs12
 	 */
@@ -8232,6 +8249,9 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
 	kvm_set_cr3(vcpu, vmcs12->host_cr3);
 	kvm_mmu_reset_context(vcpu);
 
+	if (!enable_ept)
+		vcpu->arch.walk_mmu->inject_page_fault = kvm_inject_page_fault;
+
 	if (enable_vpid) {
 		/*
 		 * Trivially support vpid by letting L2s share their parent
--
1.8.4.rc3
* Re: [PATCH 3/4] KVM: nVMX: Check all exceptions for intercept during delivery to L2
2013-09-25 9:51 ` [PATCH 3/4] KVM: nVMX: Check all exceptions for intercept during delivery to L2 Gleb Natapov
@ 2013-09-25 10:38 ` Paolo Bonzini
2013-09-25 11:00 ` Gleb Natapov
2013-09-25 14:00 ` Paolo Bonzini
1 sibling, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2013-09-25 10:38 UTC (permalink / raw)
To: Gleb Natapov; +Cc: kvm
On 25/09/2013 11:51, Gleb Natapov wrote:
> All exceptions should be checked for intercept during delivery to L2,
> but currently we check only #PF. Drop the nested_run_pending check while
> we are at it, since an exception cannot be injected during vmentry anyway.
>
> Signed-off-by: Gleb Natapov <gleb@redhat.com>
> ---
> arch/x86/kvm/vmx.c | 8 +++-----
> 1 file changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 663bc5e..5bfa09d 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -1902,12 +1902,11 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
> * a #PF exception (this is the only case in which KVM injects a #PF when L2
> * is running).
> */
> -static int nested_pf_handled(struct kvm_vcpu *vcpu)
> +static int nested_ex_handled(struct kvm_vcpu *vcpu, unsigned nr)
> {
> struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
>
> - /* TODO: also check PFEC_MATCH/MASK, not just EB.PF. */
> - if (!(vmcs12->exception_bitmap & (1u << PF_VECTOR)))
> + if (!(vmcs12->exception_bitmap & (1u << nr)))
> return 0;
>
> nested_vmx_vmexit(vcpu);
> @@ -1921,8 +1920,7 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu, unsigned nr,
> struct vcpu_vmx *vmx = to_vmx(vcpu);
> u32 intr_info = nr | INTR_INFO_VALID_MASK;
>
> - if (!reinject && nr == PF_VECTOR && is_guest_mode(vcpu) &&
> - !vmx->nested.nested_run_pending && nested_pf_handled(vcpu))
> + if (!reinject && is_guest_mode(vcpu) && nested_ex_handled(vcpu, nr))
The code is now pretty similar to what svm.c does. Do we want to move
the is_guest_mode(vcpu) check into nested_ex_handled, too? (Or vice
versa, take it out in svm.c). Perhaps you could also name the function
nested_vmx_check_exception.
Paolo
> return;
>
> if (has_error_code) {
>
* Re: [PATCH 3/4] KVM: nVMX: Check all exceptions for intercept during delivery to L2
2013-09-25 10:38 ` Paolo Bonzini
@ 2013-09-25 11:00 ` Gleb Natapov
2013-09-25 11:25 ` Paolo Bonzini
0 siblings, 1 reply; 21+ messages in thread
From: Gleb Natapov @ 2013-09-25 11:00 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: kvm
On Wed, Sep 25, 2013 at 12:38:20PM +0200, Paolo Bonzini wrote:
> On 25/09/2013 11:51, Gleb Natapov wrote:
> > All exceptions should be checked for intercept during delivery to L2,
> > but currently we check only #PF. Drop the nested_run_pending check while
> > we are at it, since an exception cannot be injected during vmentry anyway.
> >
> > Signed-off-by: Gleb Natapov <gleb@redhat.com>
> > ---
> > arch/x86/kvm/vmx.c | 8 +++-----
> > 1 file changed, 3 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> > index 663bc5e..5bfa09d 100644
> > --- a/arch/x86/kvm/vmx.c
> > +++ b/arch/x86/kvm/vmx.c
> > @@ -1902,12 +1902,11 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
> > * a #PF exception (this is the only case in which KVM injects a #PF when L2
> > * is running).
> > */
> > -static int nested_pf_handled(struct kvm_vcpu *vcpu)
> > +static int nested_ex_handled(struct kvm_vcpu *vcpu, unsigned nr)
> > {
> > struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
> >
> > - /* TODO: also check PFEC_MATCH/MASK, not just EB.PF. */
> > - if (!(vmcs12->exception_bitmap & (1u << PF_VECTOR)))
> > + if (!(vmcs12->exception_bitmap & (1u << nr)))
> > return 0;
> >
> > nested_vmx_vmexit(vcpu);
> > @@ -1921,8 +1920,7 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu, unsigned nr,
> > struct vcpu_vmx *vmx = to_vmx(vcpu);
> > u32 intr_info = nr | INTR_INFO_VALID_MASK;
> >
> > - if (!reinject && nr == PF_VECTOR && is_guest_mode(vcpu) &&
> > - !vmx->nested.nested_run_pending && nested_pf_handled(vcpu))
> > + if (!reinject && is_guest_mode(vcpu) && nested_ex_handled(vcpu, nr))
>
> The code is now pretty similar to what svm.c does. Do we want to move
> the is_guest_mode(vcpu) check into nested_ex_handled, too? (Or vice
> versa, take it out in svm.c). Perhaps you could also name the function
> nested_vmx_check_exception.
>
I want to try to move the logic into common code eventually. I do not
mind renaming for now, but it will have to wait until next week :)
--
Gleb.
* Re: [PATCH 4/4] KVM: nVMX: Do not generate #DF if #PF happens during exception delivery into L2
2013-09-25 9:51 ` [PATCH 4/4] KVM: nVMX: Do not generate #DF if #PF happens during exception delivery into L2 Gleb Natapov
@ 2013-09-25 11:24 ` Paolo Bonzini
2013-09-25 11:51 ` Gleb Natapov
0 siblings, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2013-09-25 11:24 UTC (permalink / raw)
To: Gleb Natapov; +Cc: kvm, Marcelo Tosatti, Jan Kiszka, Alexander Graf
On 25/09/2013 11:51, Gleb Natapov wrote:
> If a #PF happens during delivery of an exception into L2, and L1 also
> does not have the page mapped in its shadow page table, then L0 needs
> to generate a vmexit from L2 to L1 with the original event in
> IDT_VECTORING_INFO, but the current code combines both exceptions and
> generates a #DF instead. Fix that by providing an nVMX-specific
> function for page faults raised during the page table walk, which
> handles this case correctly.
>
> Signed-off-by: Gleb Natapov <gleb@redhat.com>
> ---
> arch/x86/kvm/vmx.c | 20 ++++++++++++++++++++
> 1 file changed, 20 insertions(+)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 5bfa09d..07c36fd 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -7520,6 +7520,20 @@ static void nested_ept_uninit_mmu_context(struct kvm_vcpu *vcpu)
> vcpu->arch.walk_mmu = &vcpu->arch.mmu;
> }
>
> +static void vmx_inject_page_fault_nested(struct kvm_vcpu *vcpu,
> + struct x86_exception *fault)
> +{
> + struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
> +
> + WARN_ON(!is_guest_mode(vcpu));
> +
> + /* TODO: also check PFEC_MATCH/MASK, not just EB.PF. */
> + if (vmcs12->exception_bitmap & (1u << PF_VECTOR))
> + nested_vmx_vmexit(vcpu);
> + else
> + kvm_inject_page_fault(vcpu, fault);
> +}
> +
> /*
> * prepare_vmcs02 is called when the L1 guest hypervisor runs its nested
> * L2 guest. L1 has a vmcs for L2 (vmcs12), and this function "merges" it
> @@ -7773,6 +7787,9 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
> kvm_set_cr3(vcpu, vmcs12->guest_cr3);
> kvm_mmu_reset_context(vcpu);
>
> + if (!enable_ept)
> + vcpu->arch.walk_mmu->inject_page_fault = vmx_inject_page_fault_nested;
> +
> /*
> * L1 may access the L2's PDPTR, so save them to construct vmcs12
> */
> @@ -8232,6 +8249,9 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
> kvm_set_cr3(vcpu, vmcs12->host_cr3);
> kvm_mmu_reset_context(vcpu);
>
> + if (!enable_ept)
> + vcpu->arch.walk_mmu->inject_page_fault = kvm_inject_page_fault;
This is strictly speaking not needed, because kvm_mmu_reset_context
takes care of it.
But I wonder if it is cleaner to not touch the struct here, and instead
add a new member to kvm_x86_ops---used directly in init_kvm_softmmu like
kvm_x86_ops->set_cr3. The new member can do something like
        if (is_guest_mode(vcpu)) {
                struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
                if (vmcs12->exception_bitmap & (1u << PF_VECTOR)) {
                        nested_vmx_vmexit(vcpu);
                        return;
                }
        }

        kvm_inject_page_fault(vcpu, fault);
Marcelo, Jan, what do you think?
Alex (or Gleb :)), do you have any idea why SVM does not need this?
Paolo
> if (enable_vpid) {
> /*
> * Trivially support vpid by letting L2s share their parent
>
* Re: [PATCH 3/4] KVM: nVMX: Check all exceptions for intercept during delivery to L2
2013-09-25 11:00 ` Gleb Natapov
@ 2013-09-25 11:25 ` Paolo Bonzini
2013-09-25 11:52 ` Gleb Natapov
0 siblings, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2013-09-25 11:25 UTC (permalink / raw)
To: Gleb Natapov; +Cc: kvm
On 25/09/2013 13:00, Gleb Natapov wrote:
>>> > > @@ -1921,8 +1920,7 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu, unsigned nr,
>>> > > struct vcpu_vmx *vmx = to_vmx(vcpu);
>>> > > u32 intr_info = nr | INTR_INFO_VALID_MASK;
>>> > >
>>> > > - if (!reinject && nr == PF_VECTOR && is_guest_mode(vcpu) &&
>>> > > - !vmx->nested.nested_run_pending && nested_pf_handled(vcpu))
>>> > > + if (!reinject && is_guest_mode(vcpu) && nested_ex_handled(vcpu, nr))
>> >
>> > The code is now pretty similar to what svm.c does. Do we want to move
>> > the is_guest_mode(vcpu) check into nested_ex_handled, too? (Or vice
>> > versa, take it out in svm.c). Perhaps you could also name the function
>> > nested_vmx_check_exception.
>> >
> I want to try to move the logic into common code eventually. I do not
> mind renaming for now, but it will have to wait until next week :)
I can rename while applying the patch. Making more logic common to vmx
and svm can wait.
Paolo
* Re: [PATCH 4/4] KVM: nVMX: Do not generate #DF if #PF happens during exception delivery into L2
2013-09-25 11:24 ` Paolo Bonzini
@ 2013-09-25 11:51 ` Gleb Natapov
2013-09-25 12:08 ` Paolo Bonzini
0 siblings, 1 reply; 21+ messages in thread
From: Gleb Natapov @ 2013-09-25 11:51 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: kvm, Marcelo Tosatti, Jan Kiszka, Alexander Graf
On Wed, Sep 25, 2013 at 01:24:49PM +0200, Paolo Bonzini wrote:
> On 25/09/2013 11:51, Gleb Natapov wrote:
> > If a #PF happens during delivery of an exception into L2, and L1 also
> > does not have the page mapped in its shadow page table, then L0 needs
> > to generate a vmexit from L2 to L1 with the original event in
> > IDT_VECTORING_INFO, but the current code combines both exceptions and
> > generates a #DF instead. Fix that by providing an nVMX-specific
> > function for page faults raised during the page table walk, which
> > handles this case correctly.
> >
> > Signed-off-by: Gleb Natapov <gleb@redhat.com>
> > ---
> > arch/x86/kvm/vmx.c | 20 ++++++++++++++++++++
> > 1 file changed, 20 insertions(+)
> >
> > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> > index 5bfa09d..07c36fd 100644
> > --- a/arch/x86/kvm/vmx.c
> > +++ b/arch/x86/kvm/vmx.c
> > @@ -7520,6 +7520,20 @@ static void nested_ept_uninit_mmu_context(struct kvm_vcpu *vcpu)
> > vcpu->arch.walk_mmu = &vcpu->arch.mmu;
> > }
> >
> > +static void vmx_inject_page_fault_nested(struct kvm_vcpu *vcpu,
> > + struct x86_exception *fault)
> > +{
> > + struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
> > +
> > + WARN_ON(!is_guest_mode(vcpu));
> > +
> > + /* TODO: also check PFEC_MATCH/MASK, not just EB.PF. */
> > + if (vmcs12->exception_bitmap & (1u << PF_VECTOR))
> > + nested_vmx_vmexit(vcpu);
> > + else
> > + kvm_inject_page_fault(vcpu, fault);
> > +}
> > +
> > /*
> > * prepare_vmcs02 is called when the L1 guest hypervisor runs its nested
> > * L2 guest. L1 has a vmcs for L2 (vmcs12), and this function "merges" it
> > @@ -7773,6 +7787,9 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
> > kvm_set_cr3(vcpu, vmcs12->guest_cr3);
> > kvm_mmu_reset_context(vcpu);
> >
> > + if (!enable_ept)
> > + vcpu->arch.walk_mmu->inject_page_fault = vmx_inject_page_fault_nested;
> > +
> > /*
> > * L1 may access the L2's PDPTR, so save them to construct vmcs12
> > */
> > @@ -8232,6 +8249,9 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
> > kvm_set_cr3(vcpu, vmcs12->host_cr3);
> > kvm_mmu_reset_context(vcpu);
> >
> > + if (!enable_ept)
> > + vcpu->arch.walk_mmu->inject_page_fault = kvm_inject_page_fault;
>
> This is strictly speaking not needed, because kvm_mmu_reset_context
> takes care of it.
>
Yeah, but it is better to make it explicit; it does not hurt and makes
it clearer what is going on. Or at least add a comment above
kvm_mmu_reset_context() about this side effect.
> But I wonder if it is cleaner to not touch the struct here, and instead
> add a new member to kvm_x86_ops---used directly in init_kvm_softmmu like
> kvm_x86_ops->set_cr3. The new member can do something like
>
> if (is_guest_mode(vcpu)) {
> struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
> if (vmcs12->exception_bitmap & (1u << PF_VECTOR)) {
> nested_vmx_vmexit(vcpu);
> return;
> }
> }
>
> kvm_inject_page_fault(vcpu, fault);
I do not quite understand what you mean here. inject_page_fault() is
called from the depths of the page table walking code. How would the
code avoid calling the new member in some circumstances?
>
> Marcelo, Jan, what do you think?
>
> Alex (or Gleb :)), do you have any idea why SVM does not need this?
>
It's probably needed there too. At least I fail to see why it would
not be. Without that patch the guest actually boots (most of the time),
but sometimes random processes crash with a double fault exception.
--
Gleb.
* Re: [PATCH 3/4] KVM: nVMX: Check all exceptions for intercept during delivery to L2
2013-09-25 11:25 ` Paolo Bonzini
@ 2013-09-25 11:52 ` Gleb Natapov
0 siblings, 0 replies; 21+ messages in thread
From: Gleb Natapov @ 2013-09-25 11:52 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: kvm
On Wed, Sep 25, 2013 at 01:25:39PM +0200, Paolo Bonzini wrote:
> On 25/09/2013 13:00, Gleb Natapov wrote:
> >>> > > @@ -1921,8 +1920,7 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu, unsigned nr,
> >>> > > struct vcpu_vmx *vmx = to_vmx(vcpu);
> >>> > > u32 intr_info = nr | INTR_INFO_VALID_MASK;
> >>> > >
> >>> > > - if (!reinject && nr == PF_VECTOR && is_guest_mode(vcpu) &&
> >>> > > - !vmx->nested.nested_run_pending && nested_pf_handled(vcpu))
> >>> > > + if (!reinject && is_guest_mode(vcpu) && nested_ex_handled(vcpu, nr))
> >> >
> >> > The code is now pretty similar to what svm.c does. Do we want to move
> >> > the is_guest_mode(vcpu) check into nested_ex_handled, too? (Or vice
> >> > versa, take it out in svm.c). Perhaps you could also name the function
> >> > nested_vmx_check_exception.
> >> >
> > I want to try to move the logic into common code eventually. I do not
> > mind renaming for now, but it will have to wait until next week :)
>
> I can rename while applying the patch. Making more logic common to vmx
> and svm can wait.
>
Yes, of course, logic unification is not for the immediate future.
--
Gleb.
* Re: [PATCH 4/4] KVM: nVMX: Do not generate #DF if #PF happens during exception delivery into L2
2013-09-25 11:51 ` Gleb Natapov
@ 2013-09-25 12:08 ` Paolo Bonzini
2013-09-25 12:21 ` Gleb Natapov
0 siblings, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2013-09-25 12:08 UTC (permalink / raw)
To: Gleb Natapov; +Cc: kvm, Marcelo Tosatti, Jan Kiszka, Alexander Graf
On 25/09/2013 13:51, Gleb Natapov wrote:
> On Wed, Sep 25, 2013 at 01:24:49PM +0200, Paolo Bonzini wrote:
>> On 25/09/2013 11:51, Gleb Natapov wrote:
>>> @@ -7773,6 +7787,9 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
>>> kvm_set_cr3(vcpu, vmcs12->guest_cr3);
>>> kvm_mmu_reset_context(vcpu);
>>>
>>> + if (!enable_ept)
>>> + vcpu->arch.walk_mmu->inject_page_fault = vmx_inject_page_fault_nested;
>>> +
>>> /*
>>> * L1 may access the L2's PDPTR, so save them to construct vmcs12
>>> */
>>> @@ -8232,6 +8249,9 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
>>> kvm_set_cr3(vcpu, vmcs12->host_cr3);
>>> kvm_mmu_reset_context(vcpu);
>>>
>>> + if (!enable_ept)
>>> + vcpu->arch.walk_mmu->inject_page_fault = kvm_inject_page_fault;
>>
>> This is strictly speaking not needed, because kvm_mmu_reset_context
>> takes care of it.
>>
> Yeah, but it is better to make it explicit; it does not hurt and makes
> it clearer what is going on. Or at least add a comment above
> kvm_mmu_reset_context() about this side effect.
Yes, I agree the code is cleaner like you wrote it.
>> But I wonder if it is cleaner to not touch the struct here, and instead
>> add a new member to kvm_x86_ops---used directly in init_kvm_softmmu like
>> kvm_x86_ops->set_cr3. The new member can do something like
>>
>> if (is_guest_mode(vcpu)) {
>> struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
>> if (vmcs12->exception_bitmap & (1u << PF_VECTOR)) {
>> nested_vmx_vmexit(vcpu);
>> return;
>> }
>> }
>>
>> kvm_inject_page_fault(vcpu, fault);
>
> I do not quite understand what you mean here. inject_page_fault() is
> called from the depths of the page table walking code. How would the
> code avoid calling the new member in some circumstances?
IIUC the new function is called if and only if is_guest_mode(vcpu) &&
!enable_ept. So what I'm suggesting is something like this:
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -735,6 +735,8 @@ struct kvm_x86_ops {
 	void (*adjust_tsc_offset)(struct kvm_vcpu *vcpu, s64 adjustment, bool host);
 
 	void (*set_tdp_cr3)(struct kvm_vcpu *vcpu, unsigned long cr3);
+	void (*inject_softmmu_page_fault)(struct kvm_vcpu *vcpu,
+					  struct x86_exception *fault);
 
 	void (*set_supported_cpuid)(u32 func, struct kvm_cpuid_entry2 *entry);
 
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3805,7 +3805,7 @@ static int init_kvm_softmmu(struct kvm_vcpu *vcpu)
 	vcpu->arch.walk_mmu->set_cr3 = kvm_x86_ops->set_cr3;
 	vcpu->arch.walk_mmu->get_cr3 = get_cr3;
 	vcpu->arch.walk_mmu->get_pdptr = kvm_pdptr_read;
-	vcpu->arch.walk_mmu->inject_page_fault = kvm_inject_page_fault;
+	vcpu->arch.walk_mmu->inject_page_fault = kvm_x86_ops->inject_softmmu_page_fault;
 
 	return r;
 }
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -7499,6 +7499,20 @@ static void nested_ept_inject_page_fault(struct kvm_vcpu *vcpu,
 	vmcs12->guest_physical_address = fault->address;
 }
 
+static void vmx_inject_softmmu_page_fault(struct kvm_vcpu *vcpu,
+		struct x86_exception *fault)
+{
+	if (is_guest_mode(vcpu)) {
+		struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
+		if (vmcs12->exception_bitmap & (1u << PF_VECTOR)) {
+			nested_vmx_vmexit(vcpu);
+			return;
+		}
+	}
+
+	kvm_inject_page_fault(vcpu, fault);
+}
+
 /* Callbacks for nested_ept_init_mmu_context: */
 
 static unsigned long nested_ept_get_cr3(struct kvm_vcpu *vcpu)
@@ -8490,6 +8504,7 @@ static struct kvm_x86_ops vmx_x86_ops = {
 	.read_l1_tsc = vmx_read_l1_tsc,
 
 	.set_tdp_cr3 = vmx_set_cr3,
+	.inject_softmmu_page_fault = vmx_inject_softmmu_page_fault,
 
 	.check_intercept = vmx_check_intercept,
 	.handle_external_intr = vmx_handle_external_intr,
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -4347,6 +4347,7 @@ static struct kvm_x86_ops svm_x86_ops = {
 	.read_l1_tsc = svm_read_l1_tsc,
 
 	.set_tdp_cr3 = set_tdp_cr3,
+	.inject_softmmu_page_fault = kvm_inject_page_fault, /* FIXME */
 
 	.check_intercept = svm_check_intercept,
 	.handle_external_intr = svm_handle_external_intr,
>> Alex (or Gleb :)), do you have any idea why SVM does not need this?
>
> It's probably needed there too. At least I fail to see why it would
> not be. Without that patch the guest actually boots (most of the time),
> but sometimes random processes crash with a double fault exception.
Sounds indeed like the same bug.
Paolo
* Re: [PATCH 4/4] KVM: nVMX: Do not generate #DF if #PF happens during exception delivery into L2
2013-09-25 12:08 ` Paolo Bonzini
@ 2013-09-25 12:21 ` Gleb Natapov
2013-09-25 13:26 ` Paolo Bonzini
0 siblings, 1 reply; 21+ messages in thread
From: Gleb Natapov @ 2013-09-25 12:21 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: kvm, Marcelo Tosatti, Jan Kiszka, Alexander Graf
On Wed, Sep 25, 2013 at 02:08:09PM +0200, Paolo Bonzini wrote:
> On 25/09/2013 13:51, Gleb Natapov wrote:
> > On Wed, Sep 25, 2013 at 01:24:49PM +0200, Paolo Bonzini wrote:
> >> On 25/09/2013 11:51, Gleb Natapov wrote:
> >>> @@ -7773,6 +7787,9 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
> >>> kvm_set_cr3(vcpu, vmcs12->guest_cr3);
> >>> kvm_mmu_reset_context(vcpu);
> >>>
> >>> + if (!enable_ept)
> >>> + vcpu->arch.walk_mmu->inject_page_fault = vmx_inject_page_fault_nested;
> >>> +
> >>> /*
> >>> * L1 may access the L2's PDPTR, so save them to construct vmcs12
> >>> */
> >>> @@ -8232,6 +8249,9 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
> >>> kvm_set_cr3(vcpu, vmcs12->host_cr3);
> >>> kvm_mmu_reset_context(vcpu);
> >>>
> >>> + if (!enable_ept)
> >>> + vcpu->arch.walk_mmu->inject_page_fault = kvm_inject_page_fault;
> >>
> >> This is strictly speaking not needed, because kvm_mmu_reset_context
> >> takes care of it.
> >>
> > Yeah, but it is better to make it explicit; it does not hurt and makes
> > it clearer what is going on. Or at least add a comment above
> > kvm_mmu_reset_context() about this side effect.
>
> Yes, I agree the code is cleaner like you wrote it.
>
> >> But I wonder if it is cleaner to not touch the struct here, and instead
> >> add a new member to kvm_x86_ops---used directly in init_kvm_softmmu like
> >> kvm_x86_ops->set_cr3. The new member can do something like
> >>
> >> if (is_guest_mode(vcpu)) {
> >> struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
> >> if (vmcs12->exception_bitmap & (1u << PF_VECTOR)) {
> >> nested_vmx_vmexit(vcpu);
> >> return;
> >> }
> >> }
> >>
> >> kvm_inject_page_fault(vcpu, fault);
> >
> > I do not quite understand what you mean here. inject_page_fault() is
> > called from the depths of the page table walking code. How would the
> > code avoid calling the new member in some circumstances?
>
> IIUC the new function is called if and only if is_guest_mode(vcpu) &&
> !enable_ept. So what I'm suggesting is something like this:
>
Ah, I see, so you propose to check for guest mode and enable_ept in the
function instead of switching to another function, but switching to
another function is how the code was designed to work. Nested NPT/EPT
provide their own functions too, but there is nothing that stops you
from checking which MMU you are in inside the function itself.
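For example, nested EPT already does this kind of swap (a sketch of the
relevant assignment in nested_ept_init_mmu_context()):

        /* install the nested-EPT flavour of the page fault hook */
        vcpu->arch.mmu.inject_page_fault = nested_ept_inject_page_fault;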
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -735,6 +735,8 @@ struct kvm_x86_ops {
> void (*adjust_tsc_offset)(struct kvm_vcpu *vcpu, s64 adjustment, bool host);
>
> void (*set_tdp_cr3)(struct kvm_vcpu *vcpu, unsigned long cr3);
> + void (*inject_softmmu_page_fault)(struct kvm_vcpu *vcpu,
> + struct x86_exception *fault);
>
> void (*set_supported_cpuid)(u32 func, struct kvm_cpuid_entry2 *entry);
>
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -3805,7 +3805,7 @@ static int init_kvm_softmmu(struct kvm_vcpu *vcpu)
> vcpu->arch.walk_mmu->set_cr3 = kvm_x86_ops->set_cr3;
> vcpu->arch.walk_mmu->get_cr3 = get_cr3;
> vcpu->arch.walk_mmu->get_pdptr = kvm_pdptr_read;
> - vcpu->arch.walk_mmu->inject_page_fault = kvm_inject_page_fault;
> + vcpu->arch.walk_mmu->inject_page_fault = kvm_x86_ops->inject_softmmu_page_fault;
>
> return r;
> }
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -7499,6 +7499,20 @@ static void nested_ept_inject_page_fault(struct kvm_vcpu *vcpu,
> vmcs12->guest_physical_address = fault->address;
> }
>
> +static void vmx_inject_softmmu_page_fault(struct kvm_vcpu *vcpu,
> + struct x86_exception *fault)
> +{
> + if (is_guest_mode(vcpu)) {
is_guest_mode(vcpu) && !enable_ept
> + struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
> + if (vmcs12->exception_bitmap & (1u << PF_VECTOR)) {
> + nested_vmx_vmexit(vcpu);
> + return;
> + }
> + }
> +
> + kvm_inject_page_fault(vcpu, fault);
> +}
> +
> /* Callbacks for nested_ept_init_mmu_context: */
>
> static unsigned long nested_ept_get_cr3(struct kvm_vcpu *vcpu)
> @@ -8490,6 +8504,7 @@ static struct kvm_x86_ops vmx_x86_ops = {
> .read_l1_tsc = vmx_read_l1_tsc,
>
> .set_tdp_cr3 = vmx_set_cr3,
> + .inject_softmmu_page_fault = vmx_inject_softmmu_page_fault,
>
> .check_intercept = vmx_check_intercept,
> .handle_external_intr = vmx_handle_external_intr,
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -4347,6 +4347,7 @@ static struct kvm_x86_ops svm_x86_ops = {
> .read_l1_tsc = svm_read_l1_tsc,
>
> .set_tdp_cr3 = set_tdp_cr3,
> + .inject_softmmu_page_fault = kvm_inject_page_fault, /* FIXME */
>
> .check_intercept = svm_check_intercept,
> .handle_external_intr = svm_handle_external_intr,
>
> >> Alex (or Gleb :)), do you have any idea why SVM does not need this?
> >
> > It's probably needed there too. At least I fail to see why it would
> > not be. Without that patch the guest actually boots (most of the time),
> > but sometimes random processes crash with a double fault exception.
>
> Sounds indeed like the same bug.
>
I described what I saw with VMX; I am not saying the same happens with
SVM :) I just do not see why it should not, and the non-fatality of the
bug can explain why it was missed.
--
Gleb.
* Re: [PATCH 4/4] KVM: nVMX: Do not generate #DF if #PF happens during exception delivery into L2
2013-09-25 12:21 ` Gleb Natapov
@ 2013-09-25 13:26 ` Paolo Bonzini
2013-09-25 13:36 ` Gleb Natapov
0 siblings, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2013-09-25 13:26 UTC (permalink / raw)
To: Gleb Natapov; +Cc: kvm, Marcelo Tosatti, Jan Kiszka, Alexander Graf
On 25/09/2013 14:21, Gleb Natapov wrote:
> On Wed, Sep 25, 2013 at 02:08:09PM +0200, Paolo Bonzini wrote:
>> On 25/09/2013 13:51, Gleb Natapov wrote:
>>> On Wed, Sep 25, 2013 at 01:24:49PM +0200, Paolo Bonzini wrote:
>>>> On 25/09/2013 11:51, Gleb Natapov wrote:
>>>>> @@ -7773,6 +7787,9 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
>>>>> kvm_set_cr3(vcpu, vmcs12->guest_cr3);
>>>>> kvm_mmu_reset_context(vcpu);
>>>>>
>>>>> + if (!enable_ept)
>>>>> + vcpu->arch.walk_mmu->inject_page_fault = vmx_inject_page_fault_nested;
>>>>> +
>>>>> /*
>>>>> * L1 may access the L2's PDPTR, so save them to construct vmcs12
>>>>> */
>>>>> @@ -8232,6 +8249,9 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
>>>>> kvm_set_cr3(vcpu, vmcs12->host_cr3);
>>>>> kvm_mmu_reset_context(vcpu);
>>>>>
>>>>> + if (!enable_ept)
>>>>> + vcpu->arch.walk_mmu->inject_page_fault = kvm_inject_page_fault;
>>>>
>>>> This is strictly speaking not needed, because kvm_mmu_reset_context
>>>> takes care of it.
>>>>
>>> Yeah, but it is better to make it explicit; it does not hurt and makes
>>> it clearer what is going on. Or at least add a comment above
>>> kvm_mmu_reset_context() about this side effect.
>>
>> Yes, I agree the code is cleaner like you wrote it.
>>
>>>> But I wonder if it is cleaner to not touch the struct here, and instead
>>>> add a new member to kvm_x86_ops---used directly in init_kvm_softmmu like
>>>> kvm_x86_ops->set_cr3. The new member can do something like
>>>>
>>>> if (is_guest_mode(vcpu)) {
>>>> struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
>>>> if (vmcs12->exception_bitmap & (1u << PF_VECTOR)) {
>>>> nested_vmx_vmexit(vcpu);
>>>> return;
>>>> }
>>>> }
>>>>
>>>> kvm_inject_page_fault(vcpu, fault);
>>>
>>> I do not quite understand what you mean here. inject_page_fault() is
>>> called from the depths of the page table walking code. How would the
>>> code avoid calling the new member in some circumstances?
>>
>> IIUC the new function is called if and only if is_guest_mode(vcpu) &&
>> !enable_ept. So what I'm suggesting is something like this:
>>
> Ah, I see, so you propose to check for guest mode and enable_ept in the
> function instead of switching to another function, but switching to
> another function is how the code was designed to work.
You do not need to check enable_ept if I understand the code correctly,
because the new function is only installed by init_kvm_softmmu,
i.e. not for the nested_mmu and not for tdp_enabled.
I'm asking because I didn't find any other place that modifies function
pointers this way after kvm_mmu_reset_context.
> Nested NPT/EPT
> provide their own functions too, but there is nothing that stops you
> from checking which MMU you are in inside the function itself.
The difference is that NPT/EPT use a completely different paging mode
for nested and non-nested (non-nested uses direct mapping, nested uses
shadow mapping). Shadow paging is really the same thing for nested and
non-nested, you just have to do the injection the right way.
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -735,6 +735,8 @@ struct kvm_x86_ops {
>> void (*adjust_tsc_offset)(struct kvm_vcpu *vcpu, s64 adjustment, bool host);
>>
>> void (*set_tdp_cr3)(struct kvm_vcpu *vcpu, unsigned long cr3);
>> + void (*inject_softmmu_page_fault)(struct kvm_vcpu *vcpu,
>> + struct x86_exception *fault);
>>
>> void (*set_supported_cpuid)(u32 func, struct kvm_cpuid_entry2 *entry);
>>
>> --- a/arch/x86/kvm/mmu.c
>> +++ b/arch/x86/kvm/mmu.c
>> @@ -3805,7 +3805,7 @@ static int init_kvm_softmmu(struct kvm_vcpu *vcpu)
>> vcpu->arch.walk_mmu->set_cr3 = kvm_x86_ops->set_cr3;
>> vcpu->arch.walk_mmu->get_cr3 = get_cr3;
>> vcpu->arch.walk_mmu->get_pdptr = kvm_pdptr_read;
>> - vcpu->arch.walk_mmu->inject_page_fault = kvm_inject_page_fault;
>> + vcpu->arch.walk_mmu->inject_page_fault = kvm_x86_ops->inject_softmmu_page_fault;
>>
>> return r;
>> }
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -7499,6 +7499,20 @@ static void nested_ept_inject_page_fault(struct kvm_vcpu *vcpu,
>> vmcs12->guest_physical_address = fault->address;
>> }
>>
>> +static void vmx_inject_softmmu_page_fault(struct kvm_vcpu *vcpu,
>> + struct x86_exception *fault)
>> +{
>> + if (is_guest_mode(vcpu)) {
> is_guest_mode(vcpu) && !enable_ept
You don't really need to check for enable_ept (perhaps
WARN_ON(enable_ept) instead) because the function is not always used,
only in init_kvm_softmmu.
> > I described what I saw with VMX; I am not saying the same happens with
> > SVM :) I just do not see why it should not, and the non-fatality of the
> > bug can explain why it was missed.
Interesting, I got something completely different. The guest just got
stuck before even getting to the GRUB prompt. I'm trying your patches
now...
Paolo
* Re: [PATCH 4/4] KVM: nVMX: Do not generate #DF if #PF happens during exception delivery into L2
2013-09-25 13:26 ` Paolo Bonzini
@ 2013-09-25 13:36 ` Gleb Natapov
2013-09-25 13:53 ` Paolo Bonzini
0 siblings, 1 reply; 21+ messages in thread
From: Gleb Natapov @ 2013-09-25 13:36 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: kvm, Marcelo Tosatti, Jan Kiszka, Alexander Graf
On Wed, Sep 25, 2013 at 03:26:56PM +0200, Paolo Bonzini wrote:
> On 25/09/2013 14:21, Gleb Natapov wrote:
> > On Wed, Sep 25, 2013 at 02:08:09PM +0200, Paolo Bonzini wrote:
> >> On 25/09/2013 13:51, Gleb Natapov wrote:
> >>> On Wed, Sep 25, 2013 at 01:24:49PM +0200, Paolo Bonzini wrote:
> >>>> On 25/09/2013 11:51, Gleb Natapov wrote:
> >>>>> @@ -7773,6 +7787,9 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
> >>>>> kvm_set_cr3(vcpu, vmcs12->guest_cr3);
> >>>>> kvm_mmu_reset_context(vcpu);
> >>>>>
> >>>>> + if (!enable_ept)
> >>>>> + vcpu->arch.walk_mmu->inject_page_fault = vmx_inject_page_fault_nested;
> >>>>> +
> >>>>> /*
> >>>>> * L1 may access the L2's PDPTR, so save them to construct vmcs12
> >>>>> */
> >>>>> @@ -8232,6 +8249,9 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
> >>>>> kvm_set_cr3(vcpu, vmcs12->host_cr3);
> >>>>> kvm_mmu_reset_context(vcpu);
> >>>>>
> >>>>> + if (!enable_ept)
> >>>>> + vcpu->arch.walk_mmu->inject_page_fault = kvm_inject_page_fault;
> >>>>
> >>>> This is strictly speaking not needed, because kvm_mmu_reset_context
> >>>> takes care of it.
> >>>>
> >>> Yeah, but it is better to make it explicit; it does not hurt and makes
> >>> it clearer what is going on. Or at least add a comment above
> >>> kvm_mmu_reset_context() about this side effect.
> >>
> >> Yes, I agree the code is cleaner like you wrote it.
> >>
> >>>> But I wonder if it is cleaner to not touch the struct here, and instead
> >>>> add a new member to kvm_x86_ops---used directly in init_kvm_softmmu like
> >>>> kvm_x86_ops->set_cr3. The new member can do something like
> >>>>
> >>>> if (is_guest_mode(vcpu)) {
> >>>> struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
> >>>> if (vmcs12->exception_bitmap & (1u << PF_VECTOR)) {
> >>>> nested_vmx_vmexit(vcpu);
> >>>> return;
> >>>> }
> >>>> }
> >>>>
> >>>> kvm_inject_page_fault(vcpu, fault);
> >>>
> >>> I do not quite understand what you mean here. inject_page_fault() is
> >>> called from the depths of the page table walking code. How would the
> >>> code avoid calling the new member in some circumstances?
> >>
> >> IIUC the new function is called if and only if is_guest_mode(vcpu) &&
> >> !enable_ept. So what I'm suggesting is something like this:
> >>
> > Ah, I see, so you propose to check for guest mode and enable_ept in the
> > function instead of switching to another function, but switching to
> > another function is how the code was designed to work.
>
> You do not need to check enable_ept if I understand the code correctly,
> because the new function is only installed by init_kvm_softmmu,
> i.e. not for the nested_mmu and not for tdp_enabled.
>
Ah, true. With EPT the shadow paging code will not run, so the function
will not be called.
> I'm asking because I didn't find any other place that modifies function
> pointers this way after kvm_mmu_reset_context.
>
nested_ept_init_mmu_context() modifies the pointer. Not after
kvm_mmu_reset_context(), but that does not matter much. The canonical
way to do what I did here would be to create a special mmu context to
handle shadow-on-shadow, but the only difference between it and the
regular shadow one would be this pointer, so it looked like overkill.
Just changing the pointer does the trick.
> > Nested NPT/EPT
> > provide their own functions too, but there is nothing that stops you
> > from checking which MMU you are in inside the function itself.
>
> The difference is that NPT/EPT use a completely different paging mode
> for nested and non-nested (non-nested uses direct mapping, nested uses
> shadow mapping). Shadow paging is really the same thing for nested and
> non-nested, you just have to do the injection the right way.
>
> >> --- a/arch/x86/include/asm/kvm_host.h
> >> +++ b/arch/x86/include/asm/kvm_host.h
> >> @@ -735,6 +735,8 @@ struct kvm_x86_ops {
> >> void (*adjust_tsc_offset)(struct kvm_vcpu *vcpu, s64 adjustment, bool host);
> >>
> >> void (*set_tdp_cr3)(struct kvm_vcpu *vcpu, unsigned long cr3);
> >> + void (*inject_softmmu_page_fault)(struct kvm_vcpu *vcpu,
> >> + struct x86_exception *fault);
> >>
> >> void (*set_supported_cpuid)(u32 func, struct kvm_cpuid_entry2 *entry);
> >>
> >> --- a/arch/x86/kvm/mmu.c
> >> +++ b/arch/x86/kvm/mmu.c
> >> @@ -3805,7 +3805,7 @@ static int init_kvm_softmmu(struct kvm_vcpu *vcpu)
> >> vcpu->arch.walk_mmu->set_cr3 = kvm_x86_ops->set_cr3;
> >> vcpu->arch.walk_mmu->get_cr3 = get_cr3;
> >> vcpu->arch.walk_mmu->get_pdptr = kvm_pdptr_read;
> >> - vcpu->arch.walk_mmu->inject_page_fault = kvm_inject_page_fault;
> >> + vcpu->arch.walk_mmu->inject_page_fault = kvm_x86_ops->inject_softmmu_page_fault;
> >>
> >> return r;
> >> }
> >> --- a/arch/x86/kvm/vmx.c
> >> +++ b/arch/x86/kvm/vmx.c
> >> @@ -7499,6 +7499,20 @@ static void nested_ept_inject_page_fault(struct kvm_vcpu *vcpu,
> >> vmcs12->guest_physical_address = fault->address;
> >> }
> >>
> >> +static void vmx_inject_softmmu_page_fault(struct kvm_vcpu *vcpu,
> >> + struct x86_exception *fault)
> >> +{
> >> + if (is_guest_mode(vcpu)) {
> > is_guest_mode(vcpu) && !enable_ept
>
> You don't really need to check for enable_ept (perhaps
> WARN_ON(enable_ept) instead) because the function is not always used,
> only in init_kvm_softmmu.
>
> > I described what I saw with VMX; I am not saying the same happens with
> > SVM :) I just do not see why it should not, and the non-fatality of the
> > bug can explain why it was missed.
>
> Interesting, I got something completely different. The guest just got
> stuck before even getting to the GRUB prompt. I'm trying your patches
> now...
>
That's what patch 2 is supposed to fix.
--
Gleb.
* Re: [PATCH 4/4] KVM: nVMX: Do not generate #DF if #PF happens during exception delivery into L2
2013-09-25 13:36 ` Gleb Natapov
@ 2013-09-25 13:53 ` Paolo Bonzini
0 siblings, 0 replies; 21+ messages in thread
From: Paolo Bonzini @ 2013-09-25 13:53 UTC (permalink / raw)
To: Gleb Natapov; +Cc: kvm, Marcelo Tosatti, Jan Kiszka, Alexander Graf
On 25/09/2013 15:36, Gleb Natapov wrote:
>> > I'm asking because I didn't find any other place that modifies function
>> > pointers this way after kvm_mmu_reset_context.
>> >
> nested_ept_init_mmu_context() modifies the pointer. Not after
> kvm_mmu_reset_context(), but that does not matter much. The canonical
> way to do what I did here would be to create a special mmu context to
> handle shadow-on-shadow, but the only difference between it and the
> regular shadow one would be this pointer, so it looked like overkill.
> Just changing the pointer does the trick.
Ok, I think it's possible to clean up the code a bit but I can do that
on top of your patches.
Paolo
* Re: [PATCH 3/4] KVM: nVMX: Check all exceptions for intercept during delivery to L2
2013-09-25 9:51 ` [PATCH 3/4] KVM: nVMX: Check all exceptions for intercept during delivery to L2 Gleb Natapov
2013-09-25 10:38 ` Paolo Bonzini
@ 2013-09-25 14:00 ` Paolo Bonzini
2013-09-25 14:19 ` Gleb Natapov
1 sibling, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2013-09-25 14:00 UTC (permalink / raw)
To: Gleb Natapov; +Cc: kvm
On 25/09/2013 11:51, Gleb Natapov wrote:
> All exceptions should be checked for intercept during delivery to L2,
> but currently we check only #PF. Drop the nested_run_pending check while
> we are at it, since an exception cannot be injected during vmentry anyway.
>
> Signed-off-by: Gleb Natapov <gleb@redhat.com>
> ---
> arch/x86/kvm/vmx.c | 8 +++-----
> 1 file changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 663bc5e..5bfa09d 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -1902,12 +1902,11 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
> * a #PF exception (this is the only case in which KVM injects a #PF when L2
> * is running).
> */
> -static int nested_pf_handled(struct kvm_vcpu *vcpu)
> +static int nested_ex_handled(struct kvm_vcpu *vcpu, unsigned nr)
> {
> struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
>
> - /* TODO: also check PFEC_MATCH/MASK, not just EB.PF. */
Just one more question, why drop this comment? It doesn't seem
incorrect in the particular case where nr == PF_VECTOR.
> - if (!(vmcs12->exception_bitmap & (1u << PF_VECTOR)))
> + if (!(vmcs12->exception_bitmap & (1u << nr)))
> return 0;
>
> nested_vmx_vmexit(vcpu);
> @@ -1921,8 +1920,7 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu, unsigned nr,
> struct vcpu_vmx *vmx = to_vmx(vcpu);
> u32 intr_info = nr | INTR_INFO_VALID_MASK;
>
> - if (!reinject && nr == PF_VECTOR && is_guest_mode(vcpu) &&
> - !vmx->nested.nested_run_pending && nested_pf_handled(vcpu))
> + if (!reinject && is_guest_mode(vcpu) && nested_ex_handled(vcpu, nr))
> return;
>
> if (has_error_code) {
>
* Re: [PATCH 3/4] KVM: nVMX: Check all exceptions for intercept during delivery to L2
2013-09-25 14:00 ` Paolo Bonzini
@ 2013-09-25 14:19 ` Gleb Natapov
2013-09-25 14:22 ` Paolo Bonzini
0 siblings, 1 reply; 21+ messages in thread
From: Gleb Natapov @ 2013-09-25 14:19 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: kvm
On Wed, Sep 25, 2013 at 04:00:16PM +0200, Paolo Bonzini wrote:
> On 25/09/2013 11:51, Gleb Natapov wrote:
> > All exceptions should be checked for intercept during delivery to L2,
> > but currently we check only #PF. Drop the nested_run_pending check while
> > we are at it, since an exception cannot be injected during vmentry anyway.
> >
> > Signed-off-by: Gleb Natapov <gleb@redhat.com>
> > ---
> > arch/x86/kvm/vmx.c | 8 +++-----
> > 1 file changed, 3 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> > index 663bc5e..5bfa09d 100644
> > --- a/arch/x86/kvm/vmx.c
> > +++ b/arch/x86/kvm/vmx.c
> > @@ -1902,12 +1902,11 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
> > * a #PF exception (this is the only case in which KVM injects a #PF when L2
> > * is running).
> > */
> > -static int nested_pf_handled(struct kvm_vcpu *vcpu)
> > +static int nested_ex_handled(struct kvm_vcpu *vcpu, unsigned nr)
> > {
> > struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
> >
> > - /* TODO: also check PFEC_MATCH/MASK, not just EB.PF. */
>
> Just one more question, why drop this comment? It doesn't seem
> incorrect in the particular case where nr == PF_VECTOR.
>
I moved it to vmx_inject_page_fault_nested().
> > - if (!(vmcs12->exception_bitmap & (1u << PF_VECTOR)))
> > + if (!(vmcs12->exception_bitmap & (1u << nr)))
> > return 0;
> >
> > nested_vmx_vmexit(vcpu);
> > @@ -1921,8 +1920,7 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu, unsigned nr,
> > struct vcpu_vmx *vmx = to_vmx(vcpu);
> > u32 intr_info = nr | INTR_INFO_VALID_MASK;
> >
> > - if (!reinject && nr == PF_VECTOR && is_guest_mode(vcpu) &&
> > - !vmx->nested.nested_run_pending && nested_pf_handled(vcpu))
> > + if (!reinject && is_guest_mode(vcpu) && nested_ex_handled(vcpu, nr))
> > return;
> >
> > if (has_error_code) {
> >
--
Gleb.
* Re: [PATCH 3/4] KVM: nVMX: Check all exceptions for intercept during delivery to L2
2013-09-25 14:19 ` Gleb Natapov
@ 2013-09-25 14:22 ` Paolo Bonzini
2013-09-25 16:31 ` Gleb Natapov
0 siblings, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2013-09-25 14:22 UTC (permalink / raw)
To: Gleb Natapov; +Cc: kvm
On 25/09/2013 16:19, Gleb Natapov wrote:
>>> > > -static int nested_pf_handled(struct kvm_vcpu *vcpu)
>>> > > +static int nested_ex_handled(struct kvm_vcpu *vcpu, unsigned nr)
>>> > > {
>>> > > struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
>>> > >
>>> > > - /* TODO: also check PFEC_MATCH/MASK, not just EB.PF. */
>> >
>> > Just one more question, why drop this comment? It doesn't seem
>> > incorrect in the particular case where nr == PF_VECTOR.
>
> I moved it to vmx_inject_page_fault_nested().
Ah, so if you test it there, you shouldn't get here unless the
PFEC_MASK/MATCH matches. In the case of EPT, you run with L1
PFEC_MASK/MATCH so you do not need any check.
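For completeness, the rule the TODO refers to is: the #PF is reflected
to L1 iff EB.PF agrees with the error-code match (a sketch; the local
name pf_intercepted is only for illustration, the vmcs12 fields are the
real ones):

        bool pf_intercepted =
                !!(vmcs12->exception_bitmap & (1u << PF_VECTOR)) ==
                ((error_code & vmcs12->page_fault_error_code_mask) ==
                 vmcs12->page_fault_error_code_match);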
Paolo
* Re: [PATCH 3/4] KVM: nVMX: Check all exceptions for intercept during delivery to L2
2013-09-25 14:22 ` Paolo Bonzini
@ 2013-09-25 16:31 ` Gleb Natapov
0 siblings, 0 replies; 21+ messages in thread
From: Gleb Natapov @ 2013-09-25 16:31 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: kvm
On Wed, Sep 25, 2013 at 04:22:48PM +0200, Paolo Bonzini wrote:
> On 25/09/2013 16:19, Gleb Natapov wrote:
> >>> > > -static int nested_pf_handled(struct kvm_vcpu *vcpu)
> >>> > > +static int nested_ex_handled(struct kvm_vcpu *vcpu, unsigned nr)
> >>> > > {
> >>> > > struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
> >>> > >
> >>> > > - /* TODO: also check PFEC_MATCH/MASK, not just EB.PF. */
> >> >
> >> > Just one more question, why drop this comment? It doesn't seem
> >> > incorrect in the particular case where nr == PF_VECTOR.
> >
> > I moved it to vmx_inject_page_fault_nested().
>
> Ah, so if you test it there, you shouldn't get here unless the
> PFEC_MASK/MATCH matches.
Yes, a #PF will never reach here in the case of shadow-on-shadow.
--
Gleb.
* Re: [PATCH 0/4] Fix shadow-on-shadow nested VMX
2013-09-25 9:51 [PATCH 0/4] Fix shadow-on-shadow nested VMX Gleb Natapov
` (3 preceding siblings ...)
2013-09-25 9:51 ` [PATCH 4/4] KVM: nVMX: Do not generate #DF if #PF happens during exception delivery into L2 Gleb Natapov
@ 2013-09-26 15:10 ` Paolo Bonzini
4 siblings, 0 replies; 21+ messages in thread
From: Paolo Bonzini @ 2013-09-26 15:10 UTC (permalink / raw)
To: Gleb Natapov; +Cc: kvm
On 25/09/2013 11:51, Gleb Natapov wrote:
> This series fixes shadow-on-shadow nested VMX (at least for me).
>
> Gleb Natapov (4):
> KVM: nVMX: Amend nested_run_pending logic
> KVM: nVMX: Do not put exception that caused vmexit to
> IDT_VECTORING_INFO
> KVM: nVMX: Check all exceptions for intercept during delivery to L2
> KVM: nVMX: Do not generate #DF if #PF happens during exception
> delivery into L2
>
> arch/x86/kvm/vmx.c | 60 ++++++++++++++++++++++++++++++++++--------------------
> 1 file changed, 38 insertions(+), 22 deletions(-)
>
For me too. :) Applied to kvm/next branch, thanks!
Paolo