[PATCH 0/3] KVM: nVMX: Make direct IRQ/NMI injection work

public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH 0/3] KVM: nVMX: Make direct IRQ/NMI injection work
@ 2013-03-13 16:53 Jan Kiszka
  2013-03-13 16:53 ` [PATCH 1/3] KVM: nVMX: Fix injection of PENDING_INTERRUPT and NMI_WINDOW exits to L1 Jan Kiszka
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Jan Kiszka @ 2013-03-13 16:53 UTC (permalink / raw)
  To: Gleb Natapov, Marcelo Tosatti; +Cc: kvm, Paolo Bonzini, Nadav Har'El

As everyone is so busy taking nVMX patches today, I'm pushing some more:

Patch 1 I already posted earlier, it is still valid. Patch 2 & 3 were
developed out of a previous version which turned out to be incomplete.
Things became more complex since then, but they seem to work correctly
now.

These patches apply on top of what I posted so far, but should not
directly depend on the other fixes.

Thanks in advance for wrapping your mind around this topic.

Jan Kiszka (3):
  KVM: nVMX: Fix injection of PENDING_INTERRUPT and NMI_WINDOW exits to
    L1
  KVM: nVMX: Fix conditions for NMI and interrupt injection
  KVM: nVMX: Rework event injection and recovery

 arch/x86/kvm/vmx.c |  187 +++++++++++++++++++++++++++++++++++++++-------------
 1 files changed, 141 insertions(+), 46 deletions(-)

-- 
1.7.3.4

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/3] KVM: nVMX: Fix injection of PENDING_INTERRUPT and NMI_WINDOW exits to L1
  2013-03-13 16:53 [PATCH 0/3] KVM: nVMX: Make direct IRQ/NMI injection work Jan Kiszka
@ 2013-03-13 16:53 ` Jan Kiszka
  2013-03-13 16:53 ` [PATCH 2/3] KVM: nVMX: Fix conditions for NMI and interrupt injection Jan Kiszka
  2013-03-13 16:53 ` [PATCH 3/3] KVM: nVMX: Rework event injection and recovery Jan Kiszka
  2 siblings, 0 replies; 10+ messages in thread
From: Jan Kiszka @ 2013-03-13 16:53 UTC (permalink / raw)
  To: Gleb Natapov, Marcelo Tosatti; +Cc: kvm, Paolo Bonzini, Nadav Har'El

The comment was wrong: enable_irq_window might be called after
prepare_vmcs02 when we left L2 to prepare IRQ injecting for L1. Same for
NMIs.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 arch/x86/kvm/vmx.c |   10 +++-------
 1 files changed, 3 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 313f8b3..b50174d 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -6134,14 +6134,10 @@ static bool nested_vmx_exit_handled(struct kvm_vcpu *vcpu)
 	case EXIT_REASON_TRIPLE_FAULT:
 		return 1;
 	case EXIT_REASON_PENDING_INTERRUPT:
+		return nested_cpu_has(vmcs12, CPU_BASED_VIRTUAL_INTR_PENDING);
 	case EXIT_REASON_NMI_WINDOW:
-		/*
-		 * prepare_vmcs02() set the CPU_BASED_VIRTUAL_INTR_PENDING bit
-		 * (aka Interrupt Window Exiting) only when L1 turned it on,
-		 * so if we got a PENDING_INTERRUPT exit, this must be for L1.
-		 * Same for NMI Window Exiting.
-		 */
-		return 1;
+		return vmcs12->pin_based_vm_exec_control &
+			PIN_BASED_NMI_EXITING;
 	case EXIT_REASON_TASK_SWITCH:
 		return 1;
 	case EXIT_REASON_CPUID:
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/3] KVM: nVMX: Fix conditions for NMI and interrupt injection
  2013-03-13 16:53 [PATCH 0/3] KVM: nVMX: Make direct IRQ/NMI injection work Jan Kiszka
  2013-03-13 16:53 ` [PATCH 1/3] KVM: nVMX: Fix injection of PENDING_INTERRUPT and NMI_WINDOW exits to L1 Jan Kiszka
@ 2013-03-13 16:53 ` Jan Kiszka
  2013-03-14 13:59   ` Gleb Natapov
  2013-03-14 15:12   ` Gleb Natapov
  2013-03-13 16:53 ` [PATCH 3/3] KVM: nVMX: Rework event injection and recovery Jan Kiszka
  2 siblings, 2 replies; 10+ messages in thread
From: Jan Kiszka @ 2013-03-13 16:53 UTC (permalink / raw)
  To: Gleb Natapov, Marcelo Tosatti; +Cc: kvm, Paolo Bonzini, Nadav Har'El

If we are in guest mode, L0 can only inject events into L2 if L1 has
nothing pending. Otherwise, L0 would overwrite L1's events and they
would get lost. This check is conceptually independent of
nested_exit_on_intr.

If L1 traps external interrupts, then we also need to look at L1's
idt_vectoring_info_field. If it is empty, we can kick the guest from L2
to L1, just like the previous code worked.

Finally, the logic for checking interrupt has to be applied also on NMIs
in an analogous way. This enables NMI interception for nested guests.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 arch/x86/kvm/vmx.c |   59 ++++++++++++++++++++++++++++++++++++++++++++-------
 1 files changed, 51 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index b50174d..10de336 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -4211,6 +4211,12 @@ static bool nested_exit_on_intr(struct kvm_vcpu *vcpu)
 		PIN_BASED_EXT_INTR_MASK;
 }
 
+static bool nested_exit_on_nmi(struct kvm_vcpu *vcpu)
+{
+	return get_vmcs12(vcpu)->pin_based_vm_exec_control &
+		PIN_BASED_NMI_EXITING;
+}
+
 static void enable_irq_window(struct kvm_vcpu *vcpu)
 {
 	u32 cpu_based_vm_exec_control;
@@ -4307,6 +4313,30 @@ static void vmx_inject_nmi(struct kvm_vcpu *vcpu)
 
 static int vmx_nmi_allowed(struct kvm_vcpu *vcpu)
 {
+	if (is_guest_mode(vcpu)) {
+		struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
+
+		if (to_vmx(vcpu)->nested.nested_run_pending &&
+		    (vmcs12->vm_entry_intr_info_field & INTR_INFO_VALID_MASK))
+			return 0;
+		if (nested_exit_on_nmi(vcpu)) {
+			/*
+			 * Check if the idt_vectoring_info_field is free. We
+			 * cannot raise EXIT_REASON_EXCEPTION_NMI if it isn't.
+			 */
+			if (vmcs12->idt_vectoring_info_field &
+			    VECTORING_INFO_VALID_MASK)
+				return 0;
+			nested_vmx_vmexit(vcpu);
+			vmcs12->vm_exit_reason = EXIT_REASON_EXCEPTION_NMI;
+			vmcs12->vm_exit_intr_info = NMI_VECTOR |
+				INTR_TYPE_NMI_INTR | INTR_INFO_VALID_MASK;
+			/*
+			 * fall through to normal code, but now in L1, not L2
+			 */
+		}
+	}
+
 	if (!cpu_has_virtual_nmis() && to_vmx(vcpu)->soft_vnmi_blocked)
 		return 0;
 
@@ -4346,16 +4376,29 @@ static void vmx_set_nmi_mask(struct kvm_vcpu *vcpu, bool masked)
 
 static int vmx_interrupt_allowed(struct kvm_vcpu *vcpu)
 {
-	if (is_guest_mode(vcpu) && nested_exit_on_intr(vcpu)) {
+	if (is_guest_mode(vcpu)) {
 		struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
-		if (to_vmx(vcpu)->nested.nested_run_pending ||
-		    (vmcs12->idt_vectoring_info_field &
-		     VECTORING_INFO_VALID_MASK))
+
+		if (to_vmx(vcpu)->nested.nested_run_pending &&
+		    (vmcs12->vm_entry_intr_info_field & INTR_INFO_VALID_MASK))
 			return 0;
-		nested_vmx_vmexit(vcpu);
-		vmcs12->vm_exit_reason = EXIT_REASON_EXTERNAL_INTERRUPT;
-		vmcs12->vm_exit_intr_info = 0;
-		/* fall through to normal code, but now in L1, not L2 */
+		if (nested_exit_on_intr(vcpu)) {
+			/*
+			 * Check if the idt_vectoring_info_field is free. We
+			 * cannot raise EXIT_REASON_EXTERNAL_INTERRUPT if it
+			 * isn't.
+			 */
+			if (vmcs12->idt_vectoring_info_field &
+			    VECTORING_INFO_VALID_MASK)
+				return 0;
+			nested_vmx_vmexit(vcpu);
+			vmcs12->vm_exit_reason =
+				EXIT_REASON_EXTERNAL_INTERRUPT;
+			vmcs12->vm_exit_intr_info = 0;
+			/*
+			 * fall through to normal code, but now in L1, not L2
+			 */
+		}
 	}
 
 	return (vmcs_readl(GUEST_RFLAGS) & X86_EFLAGS_IF) &&
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/3] KVM: nVMX: Rework event injection and recovery
  2013-03-13 16:53 [PATCH 0/3] KVM: nVMX: Make direct IRQ/NMI injection work Jan Kiszka
  2013-03-13 16:53 ` [PATCH 1/3] KVM: nVMX: Fix injection of PENDING_INTERRUPT and NMI_WINDOW exits to L1 Jan Kiszka
  2013-03-13 16:53 ` [PATCH 2/3] KVM: nVMX: Fix conditions for NMI and interrupt injection Jan Kiszka
@ 2013-03-13 16:53 ` Jan Kiszka
  2 siblings, 0 replies; 10+ messages in thread
From: Jan Kiszka @ 2013-03-13 16:53 UTC (permalink / raw)
  To: Gleb Natapov, Marcelo Tosatti; +Cc: kvm, Paolo Bonzini, Nadav Har'El

The changes finally allow to inject interrupts directly from L0 to L2.

The basic idea is to always transfer the pending event injection on
vmexit into the architectural state of the VCPU and then drop it from
there if it turns out that we left L2 to enter L1.

VMX and SVM are now identical in how they recover event injections from
unperformed vmlaunch/vmresume: We detect that VM_ENTRY_INTR_INFO_FIELD
still contains a valid event and, if yes, transfer the content into L1's
idt_vectoring_info_field.

However, we differ on how to deal with events that L0 wanted to inject
into L2. Likely, this case is still broken in SVM. For VMX, the function
vmcs12_save_pending_events deals with transferring pending L0 events
into the queue of L1. That is mandatory as L1 may decide to switch the
guest state completely, invalidating or preserving the pending events
for later injection (including on a different node, once we support
migration).

Note that we treat directly injected NMIs differently as they can hit
both L1 and L2. In this case, we let L0 try to injection again also over
L1 after leaving L2.

To avoid that we incorrectly leak an event into the architectural VCPU
state that L1 wants to inject, we skip cancellation on nested run.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 arch/x86/kvm/vmx.c |  118 ++++++++++++++++++++++++++++++++++++++--------------
 1 files changed, 87 insertions(+), 31 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 10de336..3ca5461 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -6559,8 +6559,6 @@ static void __vmx_complete_interrupts(struct kvm_vcpu *vcpu,
 
 static void vmx_complete_interrupts(struct vcpu_vmx *vmx)
 {
-	if (is_guest_mode(&vmx->vcpu))
-		return;
 	__vmx_complete_interrupts(&vmx->vcpu, vmx->idt_vectoring_info,
 				  VM_EXIT_INSTRUCTION_LEN,
 				  IDT_VECTORING_ERROR_CODE);
@@ -6568,7 +6566,7 @@ static void vmx_complete_interrupts(struct vcpu_vmx *vmx)
 
 static void vmx_cancel_injection(struct kvm_vcpu *vcpu)
 {
-	if (is_guest_mode(vcpu))
+	if (to_vmx(vcpu)->nested.nested_run_pending)
 		return;
 	__vmx_complete_interrupts(vcpu,
 				  vmcs_read32(VM_ENTRY_INTR_INFO_FIELD),
@@ -6601,21 +6599,6 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 	unsigned long debugctlmsr;
 
-	if (is_guest_mode(vcpu) && !vmx->nested.nested_run_pending) {
-		struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
-		if (vmcs12->idt_vectoring_info_field &
-				VECTORING_INFO_VALID_MASK) {
-			vmcs_write32(VM_ENTRY_INTR_INFO_FIELD,
-				vmcs12->idt_vectoring_info_field);
-			vmcs_write32(VM_ENTRY_INSTRUCTION_LEN,
-				vmcs12->vm_exit_instruction_len);
-			if (vmcs12->idt_vectoring_info_field &
-					VECTORING_INFO_DELIVER_CODE_MASK)
-				vmcs_write32(VM_ENTRY_EXCEPTION_ERROR_CODE,
-					vmcs12->idt_vectoring_error_code);
-		}
-	}
-
 	/* Record the guest's net vcpu time for enforced NMI injections. */
 	if (unlikely(!cpu_has_virtual_nmis() && vmx->soft_vnmi_blocked))
 		vmx->entry_time = ktime_get();
@@ -6774,17 +6757,6 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
 
 	vmx->idt_vectoring_info = vmcs_read32(IDT_VECTORING_INFO_FIELD);
 
-	if (is_guest_mode(vcpu)) {
-		struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
-		vmcs12->idt_vectoring_info_field = vmx->idt_vectoring_info;
-		if (vmx->idt_vectoring_info & VECTORING_INFO_VALID_MASK) {
-			vmcs12->idt_vectoring_error_code =
-				vmcs_read32(IDT_VECTORING_ERROR_CODE);
-			vmcs12->vm_exit_instruction_len =
-				vmcs_read32(VM_EXIT_INSTRUCTION_LEN);
-		}
-	}
-
 	vmx->loaded_vmcs->launched = 1;
 
 	vmx->exit_reason = vmcs_read32(VM_EXIT_REASON);
@@ -7391,6 +7363,52 @@ vmcs12_guest_cr4(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
 			vcpu->arch.cr4_guest_owned_bits));
 }
 
+static void vmcs12_save_pending_events(struct kvm_vcpu *vcpu,
+				       struct vmcs12 *vmcs12)
+{
+	u32 idt_vectoring;
+	unsigned int nr;
+
+	/*
+	 * We only transfer exceptions and maskable interrupts. It is fine if
+	 * L0 retries to inject a pending NMI over L1.
+	 */
+	if (vcpu->arch.exception.pending) {
+		nr = vcpu->arch.exception.nr;
+		idt_vectoring = nr | VECTORING_INFO_VALID_MASK;
+
+		if (kvm_exception_is_soft(nr)) {
+			vmcs12->vm_exit_instruction_len =
+				vcpu->arch.event_exit_inst_len;
+			idt_vectoring |= INTR_TYPE_SOFT_EXCEPTION;
+		} else
+			idt_vectoring |= INTR_TYPE_HARD_EXCEPTION;
+
+		if (vcpu->arch.exception.has_error_code) {
+			idt_vectoring |= VECTORING_INFO_DELIVER_CODE_MASK;
+			vmcs12->idt_vectoring_error_code =
+				vcpu->arch.exception.error_code;
+		}
+
+		vmcs12->idt_vectoring_info_field = idt_vectoring;
+	} else if (vcpu->arch.interrupt.pending) {
+		nr = vcpu->arch.interrupt.nr;
+		idt_vectoring = nr | VECTORING_INFO_VALID_MASK;
+
+		if (vcpu->arch.interrupt.soft) {
+			idt_vectoring |= INTR_TYPE_SOFT_INTR;
+			vmcs12->vm_entry_instruction_len =
+				vcpu->arch.event_exit_inst_len;
+		} else
+			idt_vectoring |= INTR_TYPE_EXT_INTR;
+
+		vmcs12->idt_vectoring_info_field = idt_vectoring;
+	}
+
+	kvm_clear_exception_queue(vcpu);
+	kvm_clear_interrupt_queue(vcpu);
+}
+
 /*
  * prepare_vmcs12 is part of what we need to do when the nested L2 guest exits
  * and we want to prepare to run its L1 parent. L1 keeps a vmcs for L2 (vmcs12),
@@ -7482,9 +7500,47 @@ static void prepare_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
 	vmcs12->vm_exit_instruction_len = vmcs_read32(VM_EXIT_INSTRUCTION_LEN);
 	vmcs12->vmx_instruction_info = vmcs_read32(VMX_INSTRUCTION_INFO);
 
-	/* clear vm-entry fields which are to be cleared on exit */
-	if (!(vmcs12->vm_exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY))
+	if (!(vmcs12->vm_exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY)) {
+		if ((vmcs12->vm_entry_intr_info_field &
+		     INTR_INFO_VALID_MASK) &&
+		    (vmcs_read32(VM_ENTRY_INTR_INFO_FIELD) &
+		     INTR_INFO_VALID_MASK)) {
+			/*
+			 * Preserve the event that was supposed to be injected
+			 * by L1 via emulating it would have been returned in
+			 * IDT_VECTORING_INFO_FIELD.
+			 */
+			vmcs12->idt_vectoring_info_field =
+				vmcs12->vm_entry_intr_info_field;
+			vmcs12->idt_vectoring_error_code =
+				vmcs12->vm_entry_exception_error_code;
+			vmcs12->vm_exit_instruction_len =
+				vmcs12->vm_entry_instruction_len;
+			vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, 0);
+
+			/*
+			 * We do not drop NMIs that targeted L2 below as they
+			 * can also be reinjected over L1. But if this event
+			 * was an NMI, it was synthetic and came from L1.
+			 */
+			vcpu->arch.nmi_injected = false;
+		} else
+			/*
+			 * Transfer the event L0 may wanted to inject into L2
+			 * to IDT_VECTORING_INFO_FIELD.
+			 */
+			vmcs12_save_pending_events(vcpu, vmcs12);
+
+		/* clear vm-entry fields which are to be cleared on exit */
 		vmcs12->vm_entry_intr_info_field &= ~INTR_INFO_VALID_MASK;
+	}
+
+	/*
+	 * Drop what we picked up for L2 via vmx_complete_interrupts. It is
+	 * preserved above and would only end up incorrectly in L1.
+	 */
+	kvm_clear_exception_queue(vcpu);
+	kvm_clear_interrupt_queue(vcpu);
 }
 
 /*
-- 
1.7.3.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/3] KVM: nVMX: Fix conditions for NMI and interrupt injection
  2013-03-13 16:53 ` [PATCH 2/3] KVM: nVMX: Fix conditions for NMI and interrupt injection Jan Kiszka
@ 2013-03-14 13:59   ` Gleb Natapov
  2013-03-14 15:33     ` Jan Kiszka
  2013-03-14 15:12   ` Gleb Natapov
  1 sibling, 1 reply; 10+ messages in thread
From: Gleb Natapov @ 2013-03-14 13:59 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Marcelo Tosatti, kvm, Paolo Bonzini, Nadav Har'El

On Wed, Mar 13, 2013 at 05:53:45PM +0100, Jan Kiszka wrote:
> If we are in guest mode, L0 can only inject events into L2 if L1 has
> nothing pending. Otherwise, L0 would overwrite L1's events and they
> would get lost. This check is conceptually independent of
> nested_exit_on_intr.
> 
> If L1 traps external interrupts, then we also need to look at L1's
> idt_vectoring_info_field. If it is empty, we can kick the guest from L2
> to L1, just like the previous code worked.
> 
> Finally, the logic for checking interrupt has to be applied also on NMIs
> in an analogous way. This enables NMI interception for nested guests.
> 
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> ---
>  arch/x86/kvm/vmx.c |   59 ++++++++++++++++++++++++++++++++++++++++++++-------
>  1 files changed, 51 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index b50174d..10de336 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -4211,6 +4211,12 @@ static bool nested_exit_on_intr(struct kvm_vcpu *vcpu)
>  		PIN_BASED_EXT_INTR_MASK;
>  }
>  
> +static bool nested_exit_on_nmi(struct kvm_vcpu *vcpu)
> +{
> +	return get_vmcs12(vcpu)->pin_based_vm_exec_control &
> +		PIN_BASED_NMI_EXITING;
> +}
> +
It will take me some time to review this, but I have a small nit now.
You open code checking of this bit in your previous patch, why not move
this hunk there?

>  static void enable_irq_window(struct kvm_vcpu *vcpu)
>  {
>  	u32 cpu_based_vm_exec_control;
> @@ -4307,6 +4313,30 @@ static void vmx_inject_nmi(struct kvm_vcpu *vcpu)
>  
>  static int vmx_nmi_allowed(struct kvm_vcpu *vcpu)
>  {
> +	if (is_guest_mode(vcpu)) {
> +		struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
> +
> +		if (to_vmx(vcpu)->nested.nested_run_pending &&
> +		    (vmcs12->vm_entry_intr_info_field & INTR_INFO_VALID_MASK))
> +			return 0;
> +		if (nested_exit_on_nmi(vcpu)) {
> +			/*
> +			 * Check if the idt_vectoring_info_field is free. We
> +			 * cannot raise EXIT_REASON_EXCEPTION_NMI if it isn't.
> +			 */
> +			if (vmcs12->idt_vectoring_info_field &
> +			    VECTORING_INFO_VALID_MASK)
> +				return 0;
> +			nested_vmx_vmexit(vcpu);
> +			vmcs12->vm_exit_reason = EXIT_REASON_EXCEPTION_NMI;
> +			vmcs12->vm_exit_intr_info = NMI_VECTOR |
> +				INTR_TYPE_NMI_INTR | INTR_INFO_VALID_MASK;
> +			/*
> +			 * fall through to normal code, but now in L1, not L2
> +			 */
> +		}
> +	}
> +
>  	if (!cpu_has_virtual_nmis() && to_vmx(vcpu)->soft_vnmi_blocked)
>  		return 0;
>  
> @@ -4346,16 +4376,29 @@ static void vmx_set_nmi_mask(struct kvm_vcpu *vcpu, bool masked)
>  
>  static int vmx_interrupt_allowed(struct kvm_vcpu *vcpu)
>  {
> -	if (is_guest_mode(vcpu) && nested_exit_on_intr(vcpu)) {
> +	if (is_guest_mode(vcpu)) {
>  		struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
> -		if (to_vmx(vcpu)->nested.nested_run_pending ||
> -		    (vmcs12->idt_vectoring_info_field &
> -		     VECTORING_INFO_VALID_MASK))
> +
> +		if (to_vmx(vcpu)->nested.nested_run_pending &&
> +		    (vmcs12->vm_entry_intr_info_field & INTR_INFO_VALID_MASK))
>  			return 0;
> -		nested_vmx_vmexit(vcpu);
> -		vmcs12->vm_exit_reason = EXIT_REASON_EXTERNAL_INTERRUPT;
> -		vmcs12->vm_exit_intr_info = 0;
> -		/* fall through to normal code, but now in L1, not L2 */
> +		if (nested_exit_on_intr(vcpu)) {
> +			/*
> +			 * Check if the idt_vectoring_info_field is free. We
> +			 * cannot raise EXIT_REASON_EXTERNAL_INTERRUPT if it
> +			 * isn't.
> +			 */
> +			if (vmcs12->idt_vectoring_info_field &
> +			    VECTORING_INFO_VALID_MASK)
> +				return 0;
> +			nested_vmx_vmexit(vcpu);
> +			vmcs12->vm_exit_reason =
> +				EXIT_REASON_EXTERNAL_INTERRUPT;
> +			vmcs12->vm_exit_intr_info = 0;
> +			/*
> +			 * fall through to normal code, but now in L1, not L2
> +			 */
> +		}
>  	}
>  
>  	return (vmcs_readl(GUEST_RFLAGS) & X86_EFLAGS_IF) &&
> -- 
> 1.7.3.4

--
			Gleb.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/3] KVM: nVMX: Fix conditions for NMI and interrupt injection
  2013-03-13 16:53 ` [PATCH 2/3] KVM: nVMX: Fix conditions for NMI and interrupt injection Jan Kiszka
  2013-03-14 13:59   ` Gleb Natapov
@ 2013-03-14 15:12   ` Gleb Natapov
  2013-03-14 15:24     ` Jan Kiszka
  1 sibling, 1 reply; 10+ messages in thread
From: Gleb Natapov @ 2013-03-14 15:12 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Marcelo Tosatti, kvm, Paolo Bonzini, Nadav Har'El

On Wed, Mar 13, 2013 at 05:53:45PM +0100, Jan Kiszka wrote:
> If we are in guest mode, L0 can only inject events into L2 if L1 has
> nothing pending. Otherwise, L0 would overwrite L1's events and they
> would get lost. This check is conceptually independent of
> nested_exit_on_intr.
> 
> If L1 traps external interrupts, then we also need to look at L1's
> idt_vectoring_info_field. If it is empty, we can kick the guest from L2
> to L1, just like the previous code worked.
> 
> Finally, the logic for checking interrupt has to be applied also on NMIs
> in an analogous way. This enables NMI interception for nested guests.
> 
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> ---
>  arch/x86/kvm/vmx.c |   59 ++++++++++++++++++++++++++++++++++++++++++++-------
>  1 files changed, 51 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index b50174d..10de336 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -4211,6 +4211,12 @@ static bool nested_exit_on_intr(struct kvm_vcpu *vcpu)
>  		PIN_BASED_EXT_INTR_MASK;
>  }
>  
> +static bool nested_exit_on_nmi(struct kvm_vcpu *vcpu)
> +{
> +	return get_vmcs12(vcpu)->pin_based_vm_exec_control &
> +		PIN_BASED_NMI_EXITING;
> +}
> +
>  static void enable_irq_window(struct kvm_vcpu *vcpu)
>  {
>  	u32 cpu_based_vm_exec_control;
> @@ -4307,6 +4313,30 @@ static void vmx_inject_nmi(struct kvm_vcpu *vcpu)
>  
>  static int vmx_nmi_allowed(struct kvm_vcpu *vcpu)
>  {
> +	if (is_guest_mode(vcpu)) {
> +		struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
> +
> +		if (to_vmx(vcpu)->nested.nested_run_pending &&
> +		    (vmcs12->vm_entry_intr_info_field & INTR_INFO_VALID_MASK))
> +			return 0;
> +		if (nested_exit_on_nmi(vcpu)) {
> +			/*
> +			 * Check if the idt_vectoring_info_field is free. We
> +			 * cannot raise EXIT_REASON_EXCEPTION_NMI if it isn't.
> +			 */
> +			if (vmcs12->idt_vectoring_info_field &
> +			    VECTORING_INFO_VALID_MASK)
> +				return 0;
> +			nested_vmx_vmexit(vcpu);
> +			vmcs12->vm_exit_reason = EXIT_REASON_EXCEPTION_NMI;
> +			vmcs12->vm_exit_intr_info = NMI_VECTOR |
> +				INTR_TYPE_NMI_INTR | INTR_INFO_VALID_MASK;
> +			/*
> +			 * fall through to normal code, but now in L1, not L2
> +			 */
> +		}
> +	}
> +
>  	if (!cpu_has_virtual_nmis() && to_vmx(vcpu)->soft_vnmi_blocked)
>  		return 0;
>  
> @@ -4346,16 +4376,29 @@ static void vmx_set_nmi_mask(struct kvm_vcpu *vcpu, bool masked)
>  
>  static int vmx_interrupt_allowed(struct kvm_vcpu *vcpu)
>  {
> -	if (is_guest_mode(vcpu) && nested_exit_on_intr(vcpu)) {
> +	if (is_guest_mode(vcpu)) {
>  		struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
> -		if (to_vmx(vcpu)->nested.nested_run_pending ||
> -		    (vmcs12->idt_vectoring_info_field &
> -		     VECTORING_INFO_VALID_MASK))
> +
> +		if (to_vmx(vcpu)->nested.nested_run_pending &&
> +		    (vmcs12->vm_entry_intr_info_field & INTR_INFO_VALID_MASK))
>  			return 0;
I do not understand this. As far as I remember Nadav's explanation we
have to enter guest if nested_run_pending is set because VMX does not
expect vmexit to happen without running guest at all. May be
idt_vectoring_info_field processing is the only reason, may be not. I
wouldn't gamble on it.

> -		nested_vmx_vmexit(vcpu);
> -		vmcs12->vm_exit_reason = EXIT_REASON_EXTERNAL_INTERRUPT;
> -		vmcs12->vm_exit_intr_info = 0;
> -		/* fall through to normal code, but now in L1, not L2 */
> +		if (nested_exit_on_intr(vcpu)) {
> +			/*
> +			 * Check if the idt_vectoring_info_field is free. We
> +			 * cannot raise EXIT_REASON_EXTERNAL_INTERRUPT if it
> +			 * isn't.
> +			 */
> +			if (vmcs12->idt_vectoring_info_field &
> +			    VECTORING_INFO_VALID_MASK)
> +				return 0;
I think we actually need to return 0 if idt_vectoring_info_field is
valid even if !nested_exit_on_intr(). If we do not we let L0 inject
interrupt into L2 and then overwrite it on entry from
vmcs12->idt_vectoring_info_field.

> +			vmcs12->vm_exit_reason =
> +				EXIT_REASON_EXTERNAL_INTERRUPT;
> +			vmcs12->vm_exit_intr_info = 0;
> +			/*
> +			 * fall through to normal code, but now in L1, not L2
> +			 */
> +		}
>  	}
>  
>  	return (vmcs_readl(GUEST_RFLAGS) & X86_EFLAGS_IF) &&
> -- 
> 1.7.3.4

--
			Gleb.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/3] KVM: nVMX: Fix conditions for NMI and interrupt injection
  2013-03-14 15:12   ` Gleb Natapov
@ 2013-03-14 15:24     ` Jan Kiszka
  2013-03-14 15:37       ` Gleb Natapov
  0 siblings, 1 reply; 10+ messages in thread
From: Jan Kiszka @ 2013-03-14 15:24 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Marcelo Tosatti, kvm, Paolo Bonzini, Nadav Har'El

On 2013-03-14 16:12, Gleb Natapov wrote:
> On Wed, Mar 13, 2013 at 05:53:45PM +0100, Jan Kiszka wrote:
>> If we are in guest mode, L0 can only inject events into L2 if L1 has
>> nothing pending. Otherwise, L0 would overwrite L1's events and they
>> would get lost. This check is conceptually independent of
>> nested_exit_on_intr.
>>
>> If L1 traps external interrupts, then we also need to look at L1's
>> idt_vectoring_info_field. If it is empty, we can kick the guest from L2
>> to L1, just like the previous code worked.
>>
>> Finally, the logic for checking interrupt has to be applied also on NMIs
>> in an analogous way. This enables NMI interception for nested guests.
>>
>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>> ---
>>  arch/x86/kvm/vmx.c |   59 ++++++++++++++++++++++++++++++++++++++++++++-------
>>  1 files changed, 51 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index b50174d..10de336 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -4211,6 +4211,12 @@ static bool nested_exit_on_intr(struct kvm_vcpu *vcpu)
>>  		PIN_BASED_EXT_INTR_MASK;
>>  }
>>  
>> +static bool nested_exit_on_nmi(struct kvm_vcpu *vcpu)
>> +{
>> +	return get_vmcs12(vcpu)->pin_based_vm_exec_control &
>> +		PIN_BASED_NMI_EXITING;
>> +}
>> +
>>  static void enable_irq_window(struct kvm_vcpu *vcpu)
>>  {
>>  	u32 cpu_based_vm_exec_control;
>> @@ -4307,6 +4313,30 @@ static void vmx_inject_nmi(struct kvm_vcpu *vcpu)
>>  
>>  static int vmx_nmi_allowed(struct kvm_vcpu *vcpu)
>>  {
>> +	if (is_guest_mode(vcpu)) {
>> +		struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
>> +
>> +		if (to_vmx(vcpu)->nested.nested_run_pending &&
>> +		    (vmcs12->vm_entry_intr_info_field & INTR_INFO_VALID_MASK))
>> +			return 0;
>> +		if (nested_exit_on_nmi(vcpu)) {
>> +			/*
>> +			 * Check if the idt_vectoring_info_field is free. We
>> +			 * cannot raise EXIT_REASON_EXCEPTION_NMI if it isn't.
>> +			 */
>> +			if (vmcs12->idt_vectoring_info_field &
>> +			    VECTORING_INFO_VALID_MASK)
>> +				return 0;
>> +			nested_vmx_vmexit(vcpu);
>> +			vmcs12->vm_exit_reason = EXIT_REASON_EXCEPTION_NMI;
>> +			vmcs12->vm_exit_intr_info = NMI_VECTOR |
>> +				INTR_TYPE_NMI_INTR | INTR_INFO_VALID_MASK;
>> +			/*
>> +			 * fall through to normal code, but now in L1, not L2
>> +			 */
>> +		}
>> +	}
>> +
>>  	if (!cpu_has_virtual_nmis() && to_vmx(vcpu)->soft_vnmi_blocked)
>>  		return 0;
>>  
>> @@ -4346,16 +4376,29 @@ static void vmx_set_nmi_mask(struct kvm_vcpu *vcpu, bool masked)
>>  
>>  static int vmx_interrupt_allowed(struct kvm_vcpu *vcpu)
>>  {
>> -	if (is_guest_mode(vcpu) && nested_exit_on_intr(vcpu)) {
>> +	if (is_guest_mode(vcpu)) {
>>  		struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
>> -		if (to_vmx(vcpu)->nested.nested_run_pending ||
>> -		    (vmcs12->idt_vectoring_info_field &
>> -		     VECTORING_INFO_VALID_MASK))
>> +
>> +		if (to_vmx(vcpu)->nested.nested_run_pending &&
>> +		    (vmcs12->vm_entry_intr_info_field & INTR_INFO_VALID_MASK))
>>  			return 0;
> I do not understand this. As far as I remember Nadav's explanation we
> have to enter guest if nested_run_pending is set because VMX does not
> expect vmexit to happen without running guest at all. May be
> idt_vectoring_info_field processing is the only reason, may be not. I
> wouldn't gamble on it.

What are the problems? Specifically if we emulate the effects of an
immediate vmexit properly. I'm not categorically excluding that some
case is missing. If we do not allow soft-vmexit here, we will set the
interrupt window later and will get such an immediate exit as well
(provided the L2 was interruptible).

> 
>> -		nested_vmx_vmexit(vcpu);
>> -		vmcs12->vm_exit_reason = EXIT_REASON_EXTERNAL_INTERRUPT;
>> -		vmcs12->vm_exit_intr_info = 0;
>> -		/* fall through to normal code, but now in L1, not L2 */
>> +		if (nested_exit_on_intr(vcpu)) {
>> +			/*
>> +			 * Check if the idt_vectoring_info_field is free. We
>> +			 * cannot raise EXIT_REASON_EXTERNAL_INTERRUPT if it
>> +			 * isn't.
>> +			 */
>> +			if (vmcs12->idt_vectoring_info_field &
>> +			    VECTORING_INFO_VALID_MASK)
>> +				return 0;
> I think we actually need to return 0 if idt_vectoring_info_field is
> valid even if !nested_exit_on_intr(). If we do not we let L0 inject
> interrupt into L2 and then overwrite it on entry from
> vmcs12->idt_vectoring_info_field.

Sorry, don't understand the last sentence. This check is about the
software idt_vectoring_info_field, the one we keep for L1, not the real
field.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/3] KVM: nVMX: Fix conditions for NMI and interrupt injection
  2013-03-14 13:59   ` Gleb Natapov
@ 2013-03-14 15:33     ` Jan Kiszka
  0 siblings, 0 replies; 10+ messages in thread
From: Jan Kiszka @ 2013-03-14 15:33 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Marcelo Tosatti, kvm, Paolo Bonzini, Nadav Har'El

On 2013-03-14 14:59, Gleb Natapov wrote:
> On Wed, Mar 13, 2013 at 05:53:45PM +0100, Jan Kiszka wrote:
>> If we are in guest mode, L0 can only inject events into L2 if L1 has
>> nothing pending. Otherwise, L0 would overwrite L1's events and they
>> would get lost. This check is conceptually independent of
>> nested_exit_on_intr.
>>
>> If L1 traps external interrupts, then we also need to look at L1's
>> idt_vectoring_info_field. If it is empty, we can kick the guest from L2
>> to L1, just like the previous code worked.
>>
>> Finally, the logic for checking interrupt has to be applied also on NMIs
>> in an analogous way. This enables NMI interception for nested guests.
>>
>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>> ---
>>  arch/x86/kvm/vmx.c |   59 ++++++++++++++++++++++++++++++++++++++++++++-------
>>  1 files changed, 51 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index b50174d..10de336 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -4211,6 +4211,12 @@ static bool nested_exit_on_intr(struct kvm_vcpu *vcpu)
>>  		PIN_BASED_EXT_INTR_MASK;
>>  }
>>  
>> +static bool nested_exit_on_nmi(struct kvm_vcpu *vcpu)
>> +{
>> +	return get_vmcs12(vcpu)->pin_based_vm_exec_control &
>> +		PIN_BASED_NMI_EXITING;
>> +}
>> +
> It will take me some time to review this, but I have a small nit now.
> You open code checking of this bit in your previous patch, why not move
> this hunk there?

True. Patch 1 is several weeks older, and I didn't recheck this. Will
clean it up.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/3] KVM: nVMX: Fix conditions for NMI and interrupt injection
  2013-03-14 15:24     ` Jan Kiszka
@ 2013-03-14 15:37       ` Gleb Natapov
  2013-03-14 15:41         ` Jan Kiszka
  0 siblings, 1 reply; 10+ messages in thread
From: Gleb Natapov @ 2013-03-14 15:37 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Marcelo Tosatti, kvm, Paolo Bonzini, Nadav Har'El

On Thu, Mar 14, 2013 at 04:24:18PM +0100, Jan Kiszka wrote:
> On 2013-03-14 16:12, Gleb Natapov wrote:
> > On Wed, Mar 13, 2013 at 05:53:45PM +0100, Jan Kiszka wrote:
> >> If we are in guest mode, L0 can only inject events into L2 if L1 has
> >> nothing pending. Otherwise, L0 would overwrite L1's events and they
> >> would get lost. This check is conceptually independent of
> >> nested_exit_on_intr.
> >>
> >> If L1 traps external interrupts, then we also need to look at L1's
> >> idt_vectoring_info_field. If it is empty, we can kick the guest from L2
> >> to L1, just like the previous code worked.
> >>
> >> Finally, the logic for checking interrupt has to be applied also on NMIs
> >> in an analogous way. This enables NMI interception for nested guests.
> >>
> >> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> >> ---
> >>  arch/x86/kvm/vmx.c |   59 ++++++++++++++++++++++++++++++++++++++++++++-------
> >>  1 files changed, 51 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> >> index b50174d..10de336 100644
> >> --- a/arch/x86/kvm/vmx.c
> >> +++ b/arch/x86/kvm/vmx.c
> >> @@ -4211,6 +4211,12 @@ static bool nested_exit_on_intr(struct kvm_vcpu *vcpu)
> >>  		PIN_BASED_EXT_INTR_MASK;
> >>  }
> >>  
> >> +static bool nested_exit_on_nmi(struct kvm_vcpu *vcpu)
> >> +{
> >> +	return get_vmcs12(vcpu)->pin_based_vm_exec_control &
> >> +		PIN_BASED_NMI_EXITING;
> >> +}
> >> +
> >>  static void enable_irq_window(struct kvm_vcpu *vcpu)
> >>  {
> >>  	u32 cpu_based_vm_exec_control;
> >> @@ -4307,6 +4313,30 @@ static void vmx_inject_nmi(struct kvm_vcpu *vcpu)
> >>  
> >>  static int vmx_nmi_allowed(struct kvm_vcpu *vcpu)
> >>  {
> >> +	if (is_guest_mode(vcpu)) {
> >> +		struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
> >> +
> >> +		if (to_vmx(vcpu)->nested.nested_run_pending &&
> >> +		    (vmcs12->vm_entry_intr_info_field & INTR_INFO_VALID_MASK))
> >> +			return 0;
> >> +		if (nested_exit_on_nmi(vcpu)) {
> >> +			/*
> >> +			 * Check if the idt_vectoring_info_field is free. We
> >> +			 * cannot raise EXIT_REASON_EXCEPTION_NMI if it isn't.
> >> +			 */
> >> +			if (vmcs12->idt_vectoring_info_field &
> >> +			    VECTORING_INFO_VALID_MASK)
> >> +				return 0;
> >> +			nested_vmx_vmexit(vcpu);
> >> +			vmcs12->vm_exit_reason = EXIT_REASON_EXCEPTION_NMI;
> >> +			vmcs12->vm_exit_intr_info = NMI_VECTOR |
> >> +				INTR_TYPE_NMI_INTR | INTR_INFO_VALID_MASK;
> >> +			/*
> >> +			 * fall through to normal code, but now in L1, not L2
> >> +			 */
> >> +		}
> >> +	}
> >> +
> >>  	if (!cpu_has_virtual_nmis() && to_vmx(vcpu)->soft_vnmi_blocked)
> >>  		return 0;
> >>  
> >> @@ -4346,16 +4376,29 @@ static void vmx_set_nmi_mask(struct kvm_vcpu *vcpu, bool masked)
> >>  
> >>  static int vmx_interrupt_allowed(struct kvm_vcpu *vcpu)
> >>  {
> >> -	if (is_guest_mode(vcpu) && nested_exit_on_intr(vcpu)) {
> >> +	if (is_guest_mode(vcpu)) {
> >>  		struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
> >> -		if (to_vmx(vcpu)->nested.nested_run_pending ||
> >> -		    (vmcs12->idt_vectoring_info_field &
> >> -		     VECTORING_INFO_VALID_MASK))
> >> +
> >> +		if (to_vmx(vcpu)->nested.nested_run_pending &&
> >> +		    (vmcs12->vm_entry_intr_info_field & INTR_INFO_VALID_MASK))
> >>  			return 0;
> > I do not understand this. As far as I remember Nadav's explanation we
> > have to enter guest if nested_run_pending is set because VMX does not
> > expect vmexit to happen without running guest at all. May be
> > idt_vectoring_info_field processing is the only reason, may be not. I
> > wouldn't gamble on it.
> 
> What are the problems? Specifically if we emulate the effects of an
> immediate vmexit properly. I'm not categorically excluding that some
> case is missing. If we do not allow soft-vmexit here, we will set the
> interrupt window later and will get such an immediate exit as well
> (provided the L2 was interruptible).
> 
Don't know. Some field that VMX change on vmresume and since from L2
point of view vmresume was executed, but in reality it was not the
result that L2 will see will be incorrect. vm_entry_intr_info_field is
on of such fields, may be there are or will be more (vAPIC+posted
interrupt)?

> > 
> >> -		nested_vmx_vmexit(vcpu);
> >> -		vmcs12->vm_exit_reason = EXIT_REASON_EXTERNAL_INTERRUPT;
> >> -		vmcs12->vm_exit_intr_info = 0;
> >> -		/* fall through to normal code, but now in L1, not L2 */
> >> +		if (nested_exit_on_intr(vcpu)) {
> >> +			/*
> >> +			 * Check if the idt_vectoring_info_field is free. We
> >> +			 * cannot raise EXIT_REASON_EXTERNAL_INTERRUPT if it
> >> +			 * isn't.
> >> +			 */
> >> +			if (vmcs12->idt_vectoring_info_field &
> >> +			    VECTORING_INFO_VALID_MASK)
> >> +				return 0;
> > I think we actually need to return 0 if idt_vectoring_info_field is
> > valid even if !nested_exit_on_intr(). If we do not we let L0 inject
> > interrupt into L2 and then overwrite it on entry from
> > vmcs12->idt_vectoring_info_field.
> 
> Sorry, don't understand the last sentence. This check is about the
> software idt_vectoring_info_field, the one we keep for L1, not the real
> field.
> 

Suppose the vmcs12->idt_vectoring_info_field is valid and L0 want to
inject an interrupt directly into L2 and L2 does not block interrupts.
vmx_interrupt_allowed() will return true, so vmx_inject_irq()
will be called and L0->L2 interrupt information will be written
to VM_ENTRY_INTR_INFO_FIELD.  Now in vmx_vcpu_run() there is again
check for valid vmcs12->idt_vectoring_info_field and if it is valid
VM_ENTRY_INTR_INFO_FIELD is overwritten with it. Interrupt that L0 just
injected into L2 was lost forever.

--
			Gleb.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/3] KVM: nVMX: Fix conditions for NMI and interrupt injection
  2013-03-14 15:37       ` Gleb Natapov
@ 2013-03-14 15:41         ` Jan Kiszka
  0 siblings, 0 replies; 10+ messages in thread
From: Jan Kiszka @ 2013-03-14 15:41 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Marcelo Tosatti, kvm, Paolo Bonzini, Nadav Har'El

On 2013-03-14 16:37, Gleb Natapov wrote:
> On Thu, Mar 14, 2013 at 04:24:18PM +0100, Jan Kiszka wrote:
>> On 2013-03-14 16:12, Gleb Natapov wrote:
>>> On Wed, Mar 13, 2013 at 05:53:45PM +0100, Jan Kiszka wrote:
>>>> If we are in guest mode, L0 can only inject events into L2 if L1 has
>>>> nothing pending. Otherwise, L0 would overwrite L1's events and they
>>>> would get lost. This check is conceptually independent of
>>>> nested_exit_on_intr.
>>>>
>>>> If L1 traps external interrupts, then we also need to look at L1's
>>>> idt_vectoring_info_field. If it is empty, we can kick the guest from L2
>>>> to L1, just like the previous code worked.
>>>>
>>>> Finally, the logic for checking interrupt has to be applied also on NMIs
>>>> in an analogous way. This enables NMI interception for nested guests.
>>>>
>>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>>>> ---
>>>>  arch/x86/kvm/vmx.c |   59 ++++++++++++++++++++++++++++++++++++++++++++-------
>>>>  1 files changed, 51 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>>>> index b50174d..10de336 100644
>>>> --- a/arch/x86/kvm/vmx.c
>>>> +++ b/arch/x86/kvm/vmx.c
>>>> @@ -4211,6 +4211,12 @@ static bool nested_exit_on_intr(struct kvm_vcpu *vcpu)
>>>>  		PIN_BASED_EXT_INTR_MASK;
>>>>  }
>>>>  
>>>> +static bool nested_exit_on_nmi(struct kvm_vcpu *vcpu)
>>>> +{
>>>> +	return get_vmcs12(vcpu)->pin_based_vm_exec_control &
>>>> +		PIN_BASED_NMI_EXITING;
>>>> +}
>>>> +
>>>>  static void enable_irq_window(struct kvm_vcpu *vcpu)
>>>>  {
>>>>  	u32 cpu_based_vm_exec_control;
>>>> @@ -4307,6 +4313,30 @@ static void vmx_inject_nmi(struct kvm_vcpu *vcpu)
>>>>  
>>>>  static int vmx_nmi_allowed(struct kvm_vcpu *vcpu)
>>>>  {
>>>> +	if (is_guest_mode(vcpu)) {
>>>> +		struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
>>>> +
>>>> +		if (to_vmx(vcpu)->nested.nested_run_pending &&
>>>> +		    (vmcs12->vm_entry_intr_info_field & INTR_INFO_VALID_MASK))
>>>> +			return 0;
>>>> +		if (nested_exit_on_nmi(vcpu)) {
>>>> +			/*
>>>> +			 * Check if the idt_vectoring_info_field is free. We
>>>> +			 * cannot raise EXIT_REASON_EXCEPTION_NMI if it isn't.
>>>> +			 */
>>>> +			if (vmcs12->idt_vectoring_info_field &
>>>> +			    VECTORING_INFO_VALID_MASK)
>>>> +				return 0;
>>>> +			nested_vmx_vmexit(vcpu);
>>>> +			vmcs12->vm_exit_reason = EXIT_REASON_EXCEPTION_NMI;
>>>> +			vmcs12->vm_exit_intr_info = NMI_VECTOR |
>>>> +				INTR_TYPE_NMI_INTR | INTR_INFO_VALID_MASK;
>>>> +			/*
>>>> +			 * fall through to normal code, but now in L1, not L2
>>>> +			 */
>>>> +		}
>>>> +	}
>>>> +
>>>>  	if (!cpu_has_virtual_nmis() && to_vmx(vcpu)->soft_vnmi_blocked)
>>>>  		return 0;
>>>>  
>>>> @@ -4346,16 +4376,29 @@ static void vmx_set_nmi_mask(struct kvm_vcpu *vcpu, bool masked)
>>>>  
>>>>  static int vmx_interrupt_allowed(struct kvm_vcpu *vcpu)
>>>>  {
>>>> -	if (is_guest_mode(vcpu) && nested_exit_on_intr(vcpu)) {
>>>> +	if (is_guest_mode(vcpu)) {
>>>>  		struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
>>>> -		if (to_vmx(vcpu)->nested.nested_run_pending ||
>>>> -		    (vmcs12->idt_vectoring_info_field &
>>>> -		     VECTORING_INFO_VALID_MASK))
>>>> +
>>>> +		if (to_vmx(vcpu)->nested.nested_run_pending &&
>>>> +		    (vmcs12->vm_entry_intr_info_field & INTR_INFO_VALID_MASK))
>>>>  			return 0;
>>> I do not understand this. As far as I remember Nadav's explanation we
>>> have to enter guest if nested_run_pending is set because VMX does not
>>> expect vmexit to happen without running guest at all. May be
>>> idt_vectoring_info_field processing is the only reason, may be not. I
>>> wouldn't gamble on it.
>>
>> What are the problems? Specifically if we emulate the effects of an
>> immediate vmexit properly. I'm not categorically excluding that some
>> case is missing. If we do not allow soft-vmexit here, we will set the
>> interrupt window later and will get such an immediate exit as well
>> (provided the L2 was interruptible).
>>
> Don't know. Some field that VMX change on vmresume and since from L2
> point of view vmresume was executed, but in reality it was not the
> result that L2 will see will be incorrect. vm_entry_intr_info_field is
> on of such fields, may be there are or will be more (vAPIC+posted
> interrupt)?

OK, better safe than sorry. Easy to change.

> 
>>>
>>>> -		nested_vmx_vmexit(vcpu);
>>>> -		vmcs12->vm_exit_reason = EXIT_REASON_EXTERNAL_INTERRUPT;
>>>> -		vmcs12->vm_exit_intr_info = 0;
>>>> -		/* fall through to normal code, but now in L1, not L2 */
>>>> +		if (nested_exit_on_intr(vcpu)) {
>>>> +			/*
>>>> +			 * Check if the idt_vectoring_info_field is free. We
>>>> +			 * cannot raise EXIT_REASON_EXTERNAL_INTERRUPT if it
>>>> +			 * isn't.
>>>> +			 */
>>>> +			if (vmcs12->idt_vectoring_info_field &
>>>> +			    VECTORING_INFO_VALID_MASK)
>>>> +				return 0;
>>> I think we actually need to return 0 if idt_vectoring_info_field is
>>> valid even if !nested_exit_on_intr(). If we do not we let L0 inject
>>> interrupt into L2 and then overwrite it on entry from
>>> vmcs12->idt_vectoring_info_field.
>>
>> Sorry, don't understand the last sentence. This check is about the
>> software idt_vectoring_info_field, the one we keep for L1, not the real
>> field.
>>
> 
> Suppose the vmcs12->idt_vectoring_info_field is valid and L0 want to
> inject an interrupt directly into L2 and L2 does not block interrupts.
> vmx_interrupt_allowed() will return true, so vmx_inject_irq()
> will be called and L0->L2 interrupt information will be written
> to VM_ENTRY_INTR_INFO_FIELD.  Now in vmx_vcpu_run() there is again
> check for valid vmcs12->idt_vectoring_info_field and if it is valid
> VM_ENTRY_INTR_INFO_FIELD is overwritten with it. Interrupt that L0 just
> injected into L2 was lost forever.

OK, that's my fault in trying to split patch 2 and 3. Patch 3 will
remove that bogus overwriting from vmx_vcpu_run. I'll merge both patches
in the next round.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2013-03-14 15:41 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-03-13 16:53 [PATCH 0/3] KVM: nVMX: Make direct IRQ/NMI injection work Jan Kiszka
2013-03-13 16:53 ` [PATCH 1/3] KVM: nVMX: Fix injection of PENDING_INTERRUPT and NMI_WINDOW exits to L1 Jan Kiszka
2013-03-13 16:53 ` [PATCH 2/3] KVM: nVMX: Fix conditions for NMI and interrupt injection Jan Kiszka
2013-03-14 13:59   ` Gleb Natapov
2013-03-14 15:33     ` Jan Kiszka
2013-03-14 15:12   ` Gleb Natapov
2013-03-14 15:24     ` Jan Kiszka
2013-03-14 15:37       ` Gleb Natapov
2013-03-14 15:41         ` Jan Kiszka
2013-03-13 16:53 ` [PATCH 3/3] KVM: nVMX: Rework event injection and recovery Jan Kiszka

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox