From: Chao Gao <chao.gao@intel.com>
To: Dongli Zhang <dongli.zhang@oracle.com>
Cc: <kvm@vger.kernel.org>, <x86@kernel.org>,
<linux-kernel@vger.kernel.org>, <seanjc@google.com>,
<pbonzini@redhat.com>, <tglx@linutronix.de>, <mingo@redhat.com>,
<bp@alien8.de>, <dave.hansen@linux.intel.com>, <hpa@zytor.com>,
<joe.jin@oracle.com>, <alejandro.j.jimenez@oracle.com>
Subject: Re: [PATCH v2 1/1] KVM: VMX: configure SVI during runtime APICv activation
Date: Mon, 10 Nov 2025 15:08:33 +0800
Message-ID: <aRGPcYE4liEI+DfT@intel.com>
In-Reply-To: <20251110063212.34902-1-dongli.zhang@oracle.com>
On Sun, Nov 09, 2025 at 10:32:12PM -0800, Dongli Zhang wrote:
>APICv (apic->apicv_active) can be activated or deactivated at runtime,
>for instance, because of APICv inhibit reasons. Intel VMX employs different
>mechanisms to virtualize the LAPIC depending on whether APICv is active.
>
>When APICv is activated at runtime, GUEST_INTR_STATUS is used to configure
>and report the current pending IRR and ISR states. Unless a specific vector
>is explicitly included in EOI_EXIT_BITMAP, its EOI will not be trapped to
>KVM. Intel VMX automatically clears the corresponding ISR bit based on the
>GUEST_INTR_STATUS.SVI field.
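
For readers less familiar with the field: GUEST_INTR_STATUS is a 16-bit VMCS field whose low byte is RVI (requesting virtual interrupt) and whose high byte is SVI (servicing virtual interrupt). A minimal sketch of that packing follows. The helper names are hypothetical and are not KVM functions; they only model the bit layout.

```c
#include <stdint.h>

/* Hypothetical helpers; they only model the 16-bit guest interrupt status. */

/* Low byte: RVI (requesting virtual interrupt). */
static inline uint8_t intr_status_rvi(uint16_t status)
{
	return status & 0xff;
}

/* High byte: SVI (servicing virtual interrupt). */
static inline uint8_t intr_status_svi(uint16_t status)
{
	return status >> 8;
}

/* Return 'status' with SVI replaced and RVI left untouched. */
static inline uint16_t intr_status_set_svi(uint16_t status, uint8_t svi)
{
	return (uint16_t)((status & 0xff) | ((uint16_t)svi << 8));
}
```

With vector 236 (0xec) in service, SVI would be expected to hold 0xec in the high byte.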
>
>When APICv is deactivated at runtime, the VM_ENTRY_INTR_INFO_FIELD is used
>to specify the next interrupt vector to invoke upon VM-entry. The
>VMX IDT_VECTORING_INFO_FIELD is used to report un-invoked vectors on
>VM-exit. EOIs are always trapped to KVM, so the software can manually clear
>pending ISR bits.
>
>There are scenarios where, with APICv activated at runtime, a guest-issued
>EOI may not be able to clear the pending ISR bit.
>
>Taking vector 236 as an example, here is one scenario.
>
>1. Suppose APICv is inactive. Vector 236 is pending in the IRR.
>2. To handle KVM_REQ_EVENT, KVM moves vector 236 from the IRR to the ISR,
>and configures the VM_ENTRY_INTR_INFO_FIELD via vmx_inject_irq().
>3. After VM-entry, vector 236 is invoked through the guest IDT. At this
>point, the data in VM_ENTRY_INTR_INFO_FIELD is no longer valid. The guest
>interrupt handler for vector 236 is invoked.
>4. Suppose a VM exit occurs very early in the guest interrupt handler,
>before the EOI is issued.
>5. Nothing is reported through the IDT_VECTORING_INFO_FIELD because
>vector 236 has already been invoked in the guest.
>6. Now, suppose APICv is activated. Before the next VM-entry, KVM calls
>kvm_vcpu_update_apicv() to activate APICv.
>7. Unfortunately, GUEST_INTR_STATUS.SVI is not configured, although
>vector 236 is still pending in the ISR.
>8. After VM-entry, the guest finally issues the EOI for vector 236.
>However, because SVI is not configured, vector 236 is not cleared.
>9. ISR is stalled forever on vector 236.
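
The steps above can be sketched with a toy model. This is a simplified illustration based on a reading of the SDM's description of EOI virtualization, not KVM or hardware code: all names are hypothetical, and PPR virtualization and the EOI-exit bitmap are ignored. The point it shows is that a virtualized EOI clears the ISR bit named by SVI, so a stale SVI of 0 leaves vector 236 in service.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy vCPU state (hypothetical): a 256-bit virtual ISR plus the SVI byte. */
struct toy_vapic {
	uint32_t visr[8];
	uint8_t svi;
};

/* True if 'vec' is marked in service. */
static bool toy_isr_test(const struct toy_vapic *a, unsigned int vec)
{
	return a->visr[vec / 32] & (1u << (vec % 32));
}

/* Mark 'vec' as in service, as software injection would with APICv off. */
static void toy_isr_set(struct toy_vapic *a, unsigned int vec)
{
	a->visr[vec / 32] |= 1u << (vec % 32);
}

/* Highest in-service vector, or 0 if the virtual ISR is empty. */
static uint8_t toy_highest_isr(const struct toy_vapic *a)
{
	for (int vec = 255; vec >= 0; vec--)
		if (toy_isr_test(a, vec))
			return (uint8_t)vec;
	return 0;
}

/* Virtualized EOI: clear the ISR bit named by SVI, then re-derive SVI. */
static void toy_virtual_eoi(struct toy_vapic *a)
{
	a->visr[a->svi / 32] &= ~(1u << (a->svi % 32));
	a->svi = toy_highest_isr(a);
}
```

In this model, setting bit 236 while leaving svi at 0 and then calling toy_virtual_eoi() once leaves bit 236 set; setting svi to 236 first lets the same single EOI clear it.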
>
>Here is another scenario.
>
>1. Suppose APICv is inactive. Vector 236 is pending in the IRR.
>2. To handle KVM_REQ_EVENT, KVM moves vector 236 from the IRR to the ISR,
>and configures the VM_ENTRY_INTR_INFO_FIELD via vmx_inject_irq().
>3. A VM-exit occurs immediately after the next VM-entry. Vector 236 is
>not invoked through the guest IDT. Instead, it is saved to the
>IDT_VECTORING_INFO_FIELD during the VM-exit.
>4. KVM calls kvm_queue_interrupt() to re-queue the un-invoked vector 236
>into vcpu->arch.interrupt. A KVM_REQ_EVENT is requested.
>5. Now, suppose APICv is activated. Before the next VM-entry, KVM calls
>kvm_vcpu_update_apicv() to activate APICv.
>6. Although APICv is now active, KVM still uses the legacy
>VM_ENTRY_INTR_INFO_FIELD to re-inject vector 236. GUEST_INTR_STATUS.SVI is
>not configured.
>7. After the next VM-entry, vector 236 is invoked through the guest IDT.
>Finally, an EOI occurs. However, due to the lack of GUEST_INTR_STATUS.SVI
>configuration, vector 236 is not cleared from the ISR.
>8. ISR is stalled forever on vector 236.
>
>Using QEMU as an example, vector 236 is stuck in ISR forever.
>
>(qemu) info lapic 1
>dumping local APIC state for CPU 1
>
>LVT0 0x00010700 active-hi edge masked ExtINT (vec 0)
>LVT1 0x00010400 active-hi edge masked NMI
>LVTPC 0x00000400 active-hi edge NMI
>LVTERR 0x000000fe active-hi edge Fixed (vec 254)
>LVTTHMR 0x00010000 active-hi edge masked Fixed (vec 0)
>LVTT 0x000400ec active-hi edge tsc-deadline Fixed (vec 236)
>Timer DCR=0x0 (divide by 2) initial_count = 0 current_count = 0
>SPIV 0x000001ff APIC enabled, focus=off, spurious vec 255
>ICR 0x000000fd physical edge de-assert no-shorthand
>ICR2 0x00000000 cpu 0 (X2APIC ID)
>ESR 0x00000000
>ISR 236
>IRR 37(level) 236
>
>The issue is not applicable to AMD SVM, which employs a different LAPIC
>virtualization mechanism. In addition, APICV_INHIBIT_REASON_IRQWIN ensures
>AMD SVM AVIC is not activated until the last interrupt has been EOI'd.
>
>Fix the bug by configuring Intel VMX GUEST_INTR_STATUS.SVI if APICv is
>activated at runtime.
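
As an illustration of the idea only (this is not the patch's diff, and the structure and function names below are hypothetical), the activation-time sync can be modeled as: find the highest vector still set in the software ISR and place it in the SVI byte, leaving RVI untouched. An empty ISR is assumed to map to SVI = 0.

```c
#include <stdint.h>

/* Hypothetical stand-in for the vCPU's software ISR (vectors 0..255). */
struct toy_lapic {
	uint32_t isr[8];
};

/* Highest in-service vector, or -1 if the ISR is empty. */
static int toy_find_highest_isr(const struct toy_lapic *apic)
{
	for (int vec = 255; vec >= 0; vec--)
		if (apic->isr[vec / 32] & (1u << (vec % 32)))
			return vec;
	return -1;
}

/*
 * Model of syncing SVI when APICv becomes active: returns the new 16-bit
 * guest interrupt status with SVI (high byte) set to the highest in-service
 * vector and RVI (low byte) preserved.
 */
static uint16_t toy_sync_svi_on_activation(const struct toy_lapic *apic,
					   uint16_t intr_status)
{
	int max_isr = toy_find_highest_isr(apic);

	/* Assumption: no vector in service is represented as SVI = 0. */
	if (max_isr < 0)
		max_isr = 0;

	return (uint16_t)((intr_status & 0xff) | (max_isr << 8));
}
```

With vector 236 in the ISR, the returned status carries 0xec in its high byte, so a later virtualized EOI targets the right ISR bit.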
>
>Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Chao Gao <chao.gao@intel.com>