public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/3] Fix 'Spurious APIC interrupt (vector 0xFF) on CPU#n' issue
@ 2023-07-26 13:59 Maxim Levitsky
  2023-07-26 13:59 ` [PATCH v2 1/3] KVM: x86: VMX: __kvm_apic_update_irr must update the IRR atomically Maxim Levitsky
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Maxim Levitsky @ 2023-07-26 13:59 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, x86, Borislav Petkov, Paolo Bonzini,
	Sean Christopherson, Dave Hansen, Thomas Gleixner, H. Peter Anvin,
	Ingo Molnar, Maxim Levitsky

Recently we found an issue which causes these error messages
to be sometimes logged if the guest has VFIO device attached:

'Spurious APIC interrupt (vector 0xFF) on CPU#0, should never happen'

It was traced to the incorrect APICv inhibition bug which started with
'KVM: x86: inhibit APICv/AVIC on changes to APIC ID or APIC base'
(All these issues are now fixed)

However, there are valid cases for the APICv to be inhibited and it should not
cause spurious interrupts to be injected to the guest.

After some debug, the root cause was found and it is that __kvm_apic_update_irr
doesn't set irr_pending which later triggers a int->unsigned char conversion
bug which leads to the wrong 0xFF injection.

This also leads to an unbounded delay in injecting the interrupt and hurts
performance.

In addition to that, I also noticed that __kvm_apic_update_irr is not atomic
in regard to IRR, which can lead to an even harder to debug bug.

V2: applied Paolo's feedback for the patch 1.

Best regards,
	Maxim Levitsky

Maxim Levitsky (3):
  KVM: x86: VMX: __kvm_apic_update_irr must update the IRR atomically
  KVM: x86: VMX: set irr_pending in kvm_apic_update_irr
  KVM: x86: check the kvm_cpu_get_interrupt result before using it

 arch/x86/kvm/lapic.c | 25 +++++++++++++++++--------
 arch/x86/kvm/x86.c   | 10 +++++++---
 2 files changed, 24 insertions(+), 11 deletions(-)

-- 
2.26.3



^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v2 1/3] KVM: x86: VMX: __kvm_apic_update_irr must update the IRR atomically
  2023-07-26 13:59 [PATCH v2 0/3] Fix 'Spurious APIC interrupt (vector 0xFF) on CPU#n' issue Maxim Levitsky
@ 2023-07-26 13:59 ` Maxim Levitsky
  2023-07-26 13:59 ` [PATCH v2 2/3] KVM: x86: VMX: set irr_pending in kvm_apic_update_irr Maxim Levitsky
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Maxim Levitsky @ 2023-07-26 13:59 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, x86, Borislav Petkov, Paolo Bonzini,
	Sean Christopherson, Dave Hansen, Thomas Gleixner, H. Peter Anvin,
	Ingo Molnar, Maxim Levitsky

If APICv is inhibited, then IPIs from peer vCPUs are done by
atomically setting bits in IRR.

This means, that when __kvm_apic_update_irr copies PIR to IRR,
it has to modify IRR atomically as well.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/lapic.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 113ca9661ab21d..b17b37e4d4fcd1 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -637,16 +637,22 @@ bool __kvm_apic_update_irr(u32 *pir, void *regs, int *max_irr)
 	*max_irr = -1;
 
 	for (i = vec = 0; i <= 7; i++, vec += 32) {
+		u32 *p_irr = (u32 *)(regs + APIC_IRR + i * 0x10);
+
+		irr_val = *p_irr;
 		pir_val = READ_ONCE(pir[i]);
-		irr_val = *((u32 *)(regs + APIC_IRR + i * 0x10));
+
 		if (pir_val) {
+			pir_val = xchg(&pir[i], 0);
+
 			prev_irr_val = irr_val;
-			irr_val |= xchg(&pir[i], 0);
-			*((u32 *)(regs + APIC_IRR + i * 0x10)) = irr_val;
-			if (prev_irr_val != irr_val) {
-				max_updated_irr =
-					__fls(irr_val ^ prev_irr_val) + vec;
-			}
+			do {
+				irr_val = prev_irr_val | pir_val;
+			} while (prev_irr_val != irr_val &&
+				 !try_cmpxchg(p_irr, &prev_irr_val, irr_val));
+
+			if (prev_irr_val != irr_val)
+				max_updated_irr = __fls(irr_val ^ prev_irr_val) + vec;
 		}
 		if (irr_val)
 			*max_irr = __fls(irr_val) + vec;
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v2 2/3] KVM: x86: VMX: set irr_pending in kvm_apic_update_irr
  2023-07-26 13:59 [PATCH v2 0/3] Fix 'Spurious APIC interrupt (vector 0xFF) on CPU#n' issue Maxim Levitsky
  2023-07-26 13:59 ` [PATCH v2 1/3] KVM: x86: VMX: __kvm_apic_update_irr must update the IRR atomically Maxim Levitsky
@ 2023-07-26 13:59 ` Maxim Levitsky
  2023-07-26 13:59 ` [PATCH v2 3/3] KVM: x86: check the kvm_cpu_get_interrupt result before using it Maxim Levitsky
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Maxim Levitsky @ 2023-07-26 13:59 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, x86, Borislav Petkov, Paolo Bonzini,
	Sean Christopherson, Dave Hansen, Thomas Gleixner, H. Peter Anvin,
	Ingo Molnar, Maxim Levitsky

When the APICv is inhibited, the irr_pending optimization is used.

Therefore, when kvm_apic_update_irr sets bits in the IRR,
it must set irr_pending to true as well.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/lapic.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index b17b37e4d4fcd1..a983a16163b137 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -666,8 +666,11 @@ EXPORT_SYMBOL_GPL(__kvm_apic_update_irr);
 bool kvm_apic_update_irr(struct kvm_vcpu *vcpu, u32 *pir, int *max_irr)
 {
 	struct kvm_lapic *apic = vcpu->arch.apic;
+	bool irr_updated = __kvm_apic_update_irr(pir, apic->regs, max_irr);
 
-	return __kvm_apic_update_irr(pir, apic->regs, max_irr);
+	if (unlikely(!apic->apicv_active && irr_updated))
+		apic->irr_pending = true;
+	return irr_updated;
 }
 EXPORT_SYMBOL_GPL(kvm_apic_update_irr);
 
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v2 3/3] KVM: x86: check the kvm_cpu_get_interrupt result before using it
  2023-07-26 13:59 [PATCH v2 0/3] Fix 'Spurious APIC interrupt (vector 0xFF) on CPU#n' issue Maxim Levitsky
  2023-07-26 13:59 ` [PATCH v2 1/3] KVM: x86: VMX: __kvm_apic_update_irr must update the IRR atomically Maxim Levitsky
  2023-07-26 13:59 ` [PATCH v2 2/3] KVM: x86: VMX: set irr_pending in kvm_apic_update_irr Maxim Levitsky
@ 2023-07-26 13:59 ` Maxim Levitsky
  2023-07-29  1:41 ` [PATCH v2 0/3] Fix 'Spurious APIC interrupt (vector 0xFF) on CPU#n' issue Sean Christopherson
  2023-07-29 14:27 ` Paolo Bonzini
  4 siblings, 0 replies; 6+ messages in thread
From: Maxim Levitsky @ 2023-07-26 13:59 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, x86, Borislav Petkov, Paolo Bonzini,
	Sean Christopherson, Dave Hansen, Thomas Gleixner, H. Peter Anvin,
	Ingo Molnar, Maxim Levitsky

The code was blindly assuming that kvm_cpu_get_interrupt never returns -1
when there is a pending interrupt.

While this should be true, a bug in KVM can still cause this.

If -1 is returned, the code before this patch was converting it to 0xFF,
and 0xFF interrupt was injected to the guest, which results in an issue
which was hard to debug.

Add WARN_ON_ONCE to catch this case and	skip the injection
if this happens again.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/x86.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a6b9bea62fb8ac..00b87fcf6da4af 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10203,9 +10203,13 @@ static int kvm_check_and_inject_events(struct kvm_vcpu *vcpu,
 		if (r < 0)
 			goto out;
 		if (r) {
-			kvm_queue_interrupt(vcpu, kvm_cpu_get_interrupt(vcpu), false);
-			static_call(kvm_x86_inject_irq)(vcpu, false);
-			WARN_ON(static_call(kvm_x86_interrupt_allowed)(vcpu, true) < 0);
+			int irq = kvm_cpu_get_interrupt(vcpu);
+
+			if (!WARN_ON_ONCE(irq == -1)) {
+				kvm_queue_interrupt(vcpu, irq, false);
+				static_call(kvm_x86_inject_irq)(vcpu, false);
+				WARN_ON(static_call(kvm_x86_interrupt_allowed)(vcpu, true) < 0);
+			}
 		}
 		if (kvm_cpu_has_injectable_intr(vcpu))
 			static_call(kvm_x86_enable_irq_window)(vcpu);
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 0/3] Fix 'Spurious APIC interrupt (vector 0xFF) on CPU#n' issue
  2023-07-26 13:59 [PATCH v2 0/3] Fix 'Spurious APIC interrupt (vector 0xFF) on CPU#n' issue Maxim Levitsky
                   ` (2 preceding siblings ...)
  2023-07-26 13:59 ` [PATCH v2 3/3] KVM: x86: check the kvm_cpu_get_interrupt result before using it Maxim Levitsky
@ 2023-07-29  1:41 ` Sean Christopherson
  2023-07-29 14:27 ` Paolo Bonzini
  4 siblings, 0 replies; 6+ messages in thread
From: Sean Christopherson @ 2023-07-29  1:41 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: kvm, linux-kernel, x86, Borislav Petkov, Paolo Bonzini,
	Dave Hansen, Thomas Gleixner, H. Peter Anvin, Ingo Molnar

On Wed, Jul 26, 2023, Maxim Levitsky wrote:
> Maxim Levitsky (3):
>   KVM: x86: VMX: __kvm_apic_update_irr must update the IRR atomically
>   KVM: x86: VMX: set irr_pending in kvm_apic_update_irr
>   KVM: x86: check the kvm_cpu_get_interrupt result before using it
> 
>  arch/x86/kvm/lapic.c | 25 +++++++++++++++++--------
>  arch/x86/kvm/x86.c   | 10 +++++++---
>  2 files changed, 24 insertions(+), 11 deletions(-)

Paolo, are you still planning on taking these directly?  I can also grab them
and send them your way next week.  I have a couple of (not super urgent, but
kinda urgent) fixes for 6.5 that I'm planning on sending a PULL request for, just
didn't get around to that today.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 0/3] Fix 'Spurious APIC interrupt (vector 0xFF) on CPU#n' issue
  2023-07-26 13:59 [PATCH v2 0/3] Fix 'Spurious APIC interrupt (vector 0xFF) on CPU#n' issue Maxim Levitsky
                   ` (3 preceding siblings ...)
  2023-07-29  1:41 ` [PATCH v2 0/3] Fix 'Spurious APIC interrupt (vector 0xFF) on CPU#n' issue Sean Christopherson
@ 2023-07-29 14:27 ` Paolo Bonzini
  4 siblings, 0 replies; 6+ messages in thread
From: Paolo Bonzini @ 2023-07-29 14:27 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: kvm, linux-kernel, x86, Borislav Petkov, Sean Christopherson,
	Dave Hansen, Thomas Gleixner, H . Peter Anvin, Ingo Molnar

Queued, thanks.

Paolo



^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2023-07-29 14:30 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-07-26 13:59 [PATCH v2 0/3] Fix 'Spurious APIC interrupt (vector 0xFF) on CPU#n' issue Maxim Levitsky
2023-07-26 13:59 ` [PATCH v2 1/3] KVM: x86: VMX: __kvm_apic_update_irr must update the IRR atomically Maxim Levitsky
2023-07-26 13:59 ` [PATCH v2 2/3] KVM: x86: VMX: set irr_pending in kvm_apic_update_irr Maxim Levitsky
2023-07-26 13:59 ` [PATCH v2 3/3] KVM: x86: check the kvm_cpu_get_interrupt result before using it Maxim Levitsky
2023-07-29  1:41 ` [PATCH v2 0/3] Fix 'Spurious APIC interrupt (vector 0xFF) on CPU#n' issue Sean Christopherson
2023-07-29 14:27 ` Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox