linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 1/2] KVM: nVMX: Fix nested vmexit ack intr before load vmcs01
@ 2014-08-05  4:42 Wanpeng Li
  2014-08-05  4:42 ` [PATCH v2 2/2] KVM: nVMX: fix acknowledge interrupt on exit when APICv is in use Wanpeng Li
  0 siblings, 1 reply; 5+ messages in thread
From: Wanpeng Li @ 2014-08-05  4:42 UTC (permalink / raw)
  To: Paolo Bonzini, Jan Kiszka
  Cc: Marcelo Tosatti, Gleb Natapov, Bandan Das, Zhang Yang,
	Davidlohr Bueso, kvm, linux-kernel, Wanpeng Li

When L2 is running, an external interrupt causes a vmexit to L1 with exit
reason "external interrupt". L1 then picks up the interrupt through vmcs12
if it has set the "acknowledge interrupt on exit" bit. Commit 77b0f5d
(KVM: nVMX: Ack and write vector info to intr_info if L1 asks us to)
acknowledges the interrupt, which belongs to L1, before loading vmcs01.
This is wrong, and with APICv in particular it makes the L1 ack behave
strangely, since APICv is for L1 rather than L2. Fix it by acknowledging
the interrupt after loading vmcs01.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Liu, RongrongX <rongrongx.liu@intel.com>
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
---
 arch/x86/kvm/vmx.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index e618f34..b8122b3 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -8754,14 +8754,6 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
 	prepare_vmcs12(vcpu, vmcs12, exit_reason, exit_intr_info,
 		       exit_qualification);
 
-	if ((exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT)
-	    && nested_exit_intr_ack_set(vcpu)) {
-		int irq = kvm_cpu_get_interrupt(vcpu);
-		WARN_ON(irq < 0);
-		vmcs12->vm_exit_intr_info = irq |
-			INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR;
-	}
-
 	trace_kvm_nested_vmexit_inject(vmcs12->vm_exit_reason,
 				       vmcs12->exit_qualification,
 				       vmcs12->idt_vectoring_info_field,
@@ -8771,6 +8763,14 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
 
 	vmx_load_vmcs01(vcpu);
 
+	if ((exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT)
+	    && nested_exit_intr_ack_set(vcpu)) {
+		int irq = kvm_cpu_get_interrupt(vcpu);
+		WARN_ON(irq < 0);
+		vmcs12->vm_exit_intr_info = irq |
+			INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR;
+	}
+
 	vm_entry_controls_init(vmx, vmcs_read32(VM_ENTRY_CONTROLS));
 	vm_exit_controls_init(vmx, vmcs_read32(VM_EXIT_CONTROLS));
 	vmx_segment_cache_clear(vmx);
-- 
1.9.1
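
For readers unfamiliar with the field written in the moved hunk, here is a
minimal user-space sketch of how vm_exit_intr_info is composed for an
external interrupt. The constant values are assumed to match the kernel's
VMX definitions (vector in bits 7:0, type in bits 10:8, valid in bit 31).
The helper names make_ext_intr_info, intr_info_vector and
intr_info_is_valid_ext_intr are hypothetical and are not part of KVM.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror the kernel's VMX interruption-information layout. */
#define INTR_INFO_VECTOR_MASK		0xffu		/* bits 7:0 */
#define INTR_INFO_INTR_TYPE_MASK	0x700u		/* bits 10:8 */
#define INTR_INFO_VALID_MASK		0x80000000u	/* bit 31 */
#define INTR_TYPE_EXT_INTR		(0u << 8)	/* external interrupt */

/* Hypothetical helper: compose the field the way the patched hunk does. */
static uint32_t make_ext_intr_info(int irq)
{
	/* The patch uses WARN_ON(irq < 0); a plain assert stands in here. */
	assert(irq >= 0 && irq <= 0xff);
	return (uint32_t)irq | INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR;
}

/* Hypothetical helper: extract the vector that L1 would read back. */
static int intr_info_vector(uint32_t info)
{
	return (int)(info & INTR_INFO_VECTOR_MASK);
}

/* Hypothetical helper: nonzero if the field is a valid external interrupt. */
static int intr_info_is_valid_ext_intr(uint32_t info)
{
	return (info & INTR_INFO_VALID_MASK) &&
	       (info & INTR_INFO_INTR_TYPE_MASK) == INTR_TYPE_EXT_INTR;
}
```

The sketch only covers the encoding. The point of the patch is where the
vector comes from: kvm_cpu_get_interrupt() has to run after
vmx_load_vmcs01() so that the acknowledge acts on L1's state.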



* [PATCH v2 2/2] KVM: nVMX: fix acknowledge interrupt on exit when APICv is in use
  2014-08-05  4:42 [PATCH v2 1/2] KVM: nVMX: Fix nested vmexit ack intr before load vmcs01 Wanpeng Li
@ 2014-08-05  4:42 ` Wanpeng Li
  2014-08-05 11:04   ` Paolo Bonzini
  0 siblings, 1 reply; 5+ messages in thread
From: Wanpeng Li @ 2014-08-05  4:42 UTC (permalink / raw)
  To: Paolo Bonzini, Jan Kiszka
  Cc: Marcelo Tosatti, Gleb Natapov, Bandan Das, Zhang Yang,
	Davidlohr Bueso, kvm, linux-kernel, Wanpeng Li

After commit 77b0f5d (KVM: nVMX: Ack and write vector info to intr_info
if L1 asks us to), "Acknowledge interrupt on exit" behavior can be
emulated. To do so, KVM will ask the APIC for the interrupt vector
during a nested vmexit if VM_EXIT_ACK_INTR_ON_EXIT is set.  With APICv,
kvm_get_apic_interrupt would return -1 and give the following WARNING:

Call Trace:
 [<ffffffff81493563>] dump_stack+0x49/0x5e
 [<ffffffff8103f0eb>] warn_slowpath_common+0x7c/0x96
 [<ffffffffa059709a>] ? nested_vmx_vmexit+0xa4/0x233 [kvm_intel]
 [<ffffffff8103f11a>] warn_slowpath_null+0x15/0x17
 [<ffffffffa059709a>] nested_vmx_vmexit+0xa4/0x233 [kvm_intel]
 [<ffffffffa0594295>] ? nested_vmx_exit_handled+0x6a/0x39e [kvm_intel]
 [<ffffffffa0537931>] ? kvm_apic_has_interrupt+0x80/0xd5 [kvm]
 [<ffffffffa05972ec>] vmx_check_nested_events+0xc3/0xd3 [kvm_intel]
 [<ffffffffa051ebe9>] inject_pending_event+0xd0/0x16e [kvm]
 [<ffffffffa051efa0>] vcpu_enter_guest+0x319/0x704 [kvm]

With APIC-v enabled, all interrupts to L1 are delivered through APIC-v.
But when L2 is running, an external interrupt will cause a vmexit to L1
with reason external interrupt. L1 will then pick up the interrupt through
vmcs12. When L1 acks the interrupt, APIC-v is enabled (L1 is running), so
the APIC-v hardware will still do the vEOI update. The problem is that the
interrupt was not delivered through the APIC-v hardware, which means
SVI/RVI/vPPR are not set, but the hardware requires them when doing the
vEOI update. The solution is that, when L1 picks up the interrupt from
vmcs12, the hypervisor updates SVI/RVI/vPPR so that the following vEOI
and vPPR updates are done correctly.

Also, since the interrupt is delivered through vmcs12, the APIC-v hardware
will not clear vIRR, and the hypervisor needs to clear it before L1 runs.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Suggested-by: "Zhang, Yang Z" <yang.z.zhang@intel.com>
Tested-by: Liu, RongrongX <rongrongx.liu@intel.com>
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
---
v1 -> v2:
 * reusing kvm_get_apic_interrupt here (by modifying kvm_cpu_get_interrupt, 
   apic_set_isr and apic_clear_irr)

 arch/x86/kvm/irq.c   |  2 +-
 arch/x86/kvm/lapic.c | 52 +++++++++++++++++++++++++++++++++++++++-------------
 2 files changed, 40 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
index bd0da43..a1ec6a5 100644
--- a/arch/x86/kvm/irq.c
+++ b/arch/x86/kvm/irq.c
@@ -108,7 +108,7 @@ int kvm_cpu_get_interrupt(struct kvm_vcpu *v)
 
 	vector = kvm_cpu_get_extint(v);
 
-	if (kvm_apic_vid_enabled(v->kvm) || vector != -1)
+	if (vector != -1)
 		return vector;			/* PIC */
 
 	return kvm_get_apic_interrupt(v);	/* APIC */
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 3855103..08e8a89 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -352,25 +352,46 @@ static inline int apic_find_highest_irr(struct kvm_lapic *apic)
 
 static inline void apic_clear_irr(int vec, struct kvm_lapic *apic)
 {
-	apic->irr_pending = false;
+	struct kvm_vcpu *vcpu;
+
+	vcpu = apic->vcpu;
+
 	apic_clear_vector(vec, apic->regs + APIC_IRR);
-	if (apic_search_irr(apic) != -1)
-		apic->irr_pending = true;
+	if (unlikely(kvm_apic_vid_enabled(vcpu->kvm)))
+		/* try to update RVI */
+		kvm_make_request(KVM_REQ_EVENT, vcpu);
+	else {
+		vec = apic_search_irr(apic);
+		apic->irr_pending = (vec != -1);
+	}
 }
 
 static inline void apic_set_isr(int vec, struct kvm_lapic *apic)
 {
-	/* Note that we never get here with APIC virtualization enabled.  */
+	struct kvm_vcpu *vcpu;
+
+	if (__apic_test_and_set_vector(vec, apic->regs + APIC_ISR))
+		return;
+
+	vcpu = apic->vcpu;
 
-	if (!__apic_test_and_set_vector(vec, apic->regs + APIC_ISR))
-		++apic->isr_count;
-	BUG_ON(apic->isr_count > MAX_APIC_VECTOR);
 	/*
-	 * ISR (in service register) bit is set when injecting an interrupt.
-	 * The highest vector is injected. Thus the latest bit set matches
-	 * the highest bit in ISR.
+	 * With APIC virtualization enabled, all caching is disabled
+	 * because the processor can modify ISR under the hood.  Instead
+	 * just set SVI.
 	 */
-	apic->highest_isr_cache = vec;
+	if (unlikely(kvm_apic_vid_enabled(vcpu->kvm)))
+		kvm_x86_ops->hwapic_isr_update(vcpu->kvm, vec);
+	else {
+		++apic->isr_count;
+		BUG_ON(apic->isr_count > MAX_APIC_VECTOR);
+		/*
+		 * ISR (in service register) bit is set when injecting an interrupt.
+		 * The highest vector is injected. Thus the latest bit set matches
+		 * the highest bit in ISR.
+		 */
+		apic->highest_isr_cache = vec;
+	}
 }
 
 static inline int apic_find_highest_isr(struct kvm_lapic *apic)
@@ -1627,11 +1648,16 @@ int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu)
 	int vector = kvm_apic_has_interrupt(vcpu);
 	struct kvm_lapic *apic = vcpu->arch.apic;
 
-	/* Note that we never get here with APIC virtualization enabled.  */
-
 	if (vector == -1)
 		return -1;
 
+	/*
+	 * We get here even with APIC virtualization enabled, if doing
+	 * nested virtualization and L1 runs with the "acknowledge interrupt
+	 * on exit" mode.  Then we cannot inject the interrupt via RVI,
+	 * because the processor would deliver it through the IDT.
+	 */
+
 	apic_set_isr(vector, apic);
 	apic_update_ppr(apic);
 	apic_clear_irr(vector, apic);
-- 
1.9.1
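
The acknowledge sequence at the end of kvm_get_apic_interrupt() (set ISR,
update PPR, clear IRR) can be illustrated with a small user-space model.
This is only a sketch under simplifying assumptions: struct toy_apic and
every toy_* function are hypothetical stand-ins, not KVM code, and the
model ignores the APICv-specific parts of the patch (SVI via
hwapic_isr_update and the KVM_REQ_EVENT request that refreshes RVI).

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NR_VECTORS 256

/* Toy stand-in for local APIC state; not the kernel's struct kvm_lapic. */
struct toy_apic {
	uint32_t irr[TOY_NR_VECTORS / 32];	/* requested interrupts */
	uint32_t isr[TOY_NR_VECTORS / 32];	/* in-service interrupts */
	uint8_t tpr;				/* task priority */
	uint8_t ppr;				/* processor priority */
};

static void toy_set_bit(uint32_t *reg, int vec)
{
	assert(vec >= 0 && vec < TOY_NR_VECTORS);
	reg[vec / 32] |= 1u << (vec % 32);
}

static void toy_clear_bit(uint32_t *reg, int vec)
{
	assert(vec >= 0 && vec < TOY_NR_VECTORS);
	reg[vec / 32] &= ~(1u << (vec % 32));
}

static int toy_test_bit(const uint32_t *reg, int vec)
{
	return (int)((reg[vec / 32] >> (vec % 32)) & 1u);
}

/* Highest set vector in a 256-bit register, or -1 if it is empty. */
static int toy_find_highest(const uint32_t *reg)
{
	for (int vec = TOY_NR_VECTORS - 1; vec >= 0; vec--)
		if (toy_test_bit(reg, vec))
			return vec;
	return -1;
}

/* PPR follows TPR unless a higher priority class is in service. */
static void toy_update_ppr(struct toy_apic *apic)
{
	int isrv = toy_find_highest(apic->isr);

	if (isrv < 0)
		isrv = 0;
	if ((apic->tpr & 0xf0) >= (isrv & 0xf0))
		apic->ppr = apic->tpr;
	else
		apic->ppr = (uint8_t)(isrv & 0xf0);
}

/* Highest pending vector whose priority class beats PPR, or -1. */
static int toy_has_interrupt(struct toy_apic *apic)
{
	int vec;

	toy_update_ppr(apic);
	vec = toy_find_highest(apic->irr);
	if (vec == -1 || (vec & 0xf0) <= apic->ppr)
		return -1;
	return vec;
}

/* Software acknowledge, in the same order as the three calls in the hunk. */
static int toy_get_interrupt(struct toy_apic *apic)
{
	int vec = toy_has_interrupt(apic);

	if (vec == -1)
		return -1;
	toy_set_bit(apic->isr, vec);	/* stands in for apic_set_isr() */
	toy_update_ppr(apic);		/* stands in for apic_update_ppr() */
	toy_clear_bit(apic->irr, vec);	/* stands in for apic_clear_irr() */
	return vec;
}

/* EOI: retire the highest in-service vector and recompute PPR. */
static void toy_eoi(struct toy_apic *apic)
{
	int vec = toy_find_highest(apic->isr);

	if (vec != -1)
		toy_clear_bit(apic->isr, vec);
	toy_update_ppr(apic);
}
```

In this model, with vectors 0x31 and 0x51 both pending, the first
toy_get_interrupt() call returns 0x51 and raises PPR to 0x50, so 0x31 is
held back until toy_eoi() retires 0x51. This is the bookkeeping that the
commit message says the hypervisor must do itself when the interrupt
reaches L1 through vmcs12 instead of through the APIC-v hardware.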



* Re: [PATCH v2 2/2] KVM: nVMX: fix acknowledge interrupt on exit when APICv is in use
  2014-08-05  4:42 ` [PATCH v2 2/2] KVM: nVMX: fix acknowledge interrupt on exit when APICv is in use Wanpeng Li
@ 2014-08-05 11:04   ` Paolo Bonzini
  2014-08-05 12:39     ` Felipe Reyes
  0 siblings, 1 reply; 5+ messages in thread
From: Paolo Bonzini @ 2014-08-05 11:04 UTC (permalink / raw)
  To: Wanpeng Li, Jan Kiszka
  Cc: Marcelo Tosatti, Gleb Natapov, Bandan Das, Zhang Yang,
	Davidlohr Bueso, kvm, linux-kernel, freyes

Thanks, this looks good.  Felipe, can you test this patch together with
http://article.gmane.org/gmane.linux.kernel/1762356/raw please?

Paolo


* Re: [PATCH v2 2/2] KVM: nVMX: fix acknowledge interrupt on exit when APICv is in use
  2014-08-05 11:04   ` Paolo Bonzini
@ 2014-08-05 12:39     ` Felipe Reyes
  2014-08-05 12:44       ` Wanpeng Li
  0 siblings, 1 reply; 5+ messages in thread
From: Felipe Reyes @ 2014-08-05 12:39 UTC (permalink / raw)
  To: Paolo Bonzini, Wanpeng Li, Jan Kiszka
  Cc: Marcelo Tosatti, Gleb Natapov, Bandan Das, Zhang Yang,
	Davidlohr Bueso, kvm, linux-kernel

Hi,

On 08/05/2014 01:04 PM, Paolo Bonzini wrote:
> Thanks, this looks good.  Felipe, can you test this patch together with
> http://article.gmane.org/gmane.linux.kernel/1762356/raw please?
>
> Paolo
>

I confirmed that the bug is still present in the latest version of
Linus' kernel (8e099d1). After applying the patches you indicated, the
problem went away, so those 2 patches fix the problem for me.

Best,


* Re: [PATCH v2 2/2] KVM: nVMX: fix acknowledge interrupt on exit when APICv is in use
  2014-08-05 12:39     ` Felipe Reyes
@ 2014-08-05 12:44       ` Wanpeng Li
  0 siblings, 0 replies; 5+ messages in thread
From: Wanpeng Li @ 2014-08-05 12:44 UTC (permalink / raw)
  To: Felipe Reyes
  Cc: Paolo Bonzini, Jan Kiszka, Marcelo Tosatti, Gleb Natapov,
	Bandan Das, Zhang Yang, Davidlohr Bueso, kvm, linux-kernel

On Tue, Aug 05, 2014 at 02:39:05PM +0200, Felipe Reyes wrote:
>Hi,
>
>On 08/05/2014 01:04 PM, Paolo Bonzini wrote:
>>Thanks, this looks good.  Felipe, can you test this patch together with
>>http://article.gmane.org/gmane.linux.kernel/1762356/raw please?
>>
>>Paolo
>>
>
>I confirmed that the bug is still present in the latest version of
>Linus' kernel (8e099d1). After applying the patches you indicated, the
>problem went away, so those 2 patches fix the problem for me.
>

Thanks for verifying.

Regards,
Wanpeng Li 


