From: Chao Gao <chao.gao@intel.com>
To: Chenyi Qiang <chenyi.qiang@intel.com>
Cc: <kvm@vger.kernel.org>, Sean Christopherson <seanjc@google.com>,
	"Jim Mattson" <jmattson@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	"Farrah Chen" <farrah.chen@intel.com>
Subject: Re: [PATCH] KVM: VMX: Fall back to IRR scan when PIR is empty despite PID.ON being set
Date: Tue, 28 Apr 2026 19:10:22 +0800
Message-ID: <afCVnpeoorjOqeCD@intel.com>
In-Reply-To: <20260428070349.1633238-1-chenyi.qiang@intel.com>

On Tue, Apr 28, 2026 at 03:03:26PM +0800, Chenyi Qiang wrote:
>Fall back to kvm_lapic_find_highest_irr() in vmx_sync_pir_to_irr() when
>PID.ON is set but PIR turns out to be empty, to correctly report the
>highest pending interrupt from the existing IRR.
>
>In a nested VM stress test, the following WARNING fires in
>vmx_check_nested_events() when kvm_cpu_has_interrupt() reports a pending
>interrupt but the subsequent kvm_apic_has_interrupt() (which invokes
>vmx_sync_pir_to_irr() again) returns -1:
>
>  WARNING: CPU: 99 PID: 57767 at arch/x86/kvm/vmx/nested.c:4449 vmx_check_nested_events+0x6bf/0x6e0 [kvm_intel]
>  Call Trace:
>   kvm_check_and_inject_events
>   vcpu_enter_guest.constprop.0
>   vcpu_run
>   kvm_arch_vcpu_ioctl_run
>   kvm_vcpu_ioctl
>   __x64_sys_ioctl
>   do_syscall_64
>   entry_SYSCALL_64_after_hwframe
>
>The root cause is a race between vmx_sync_pir_to_irr() on the target vCPU
>and __vmx_deliver_posted_interrupt() on a sender vCPU.  The sender
>performs two individually-atomic operations that are not a single
>transaction:
>
>  1. pi_test_and_set_pir(vector)  -- sets the PIR bit
>  2. pi_test_and_set_on()         -- sets PID.ON
>
>The following interleaving triggers the bug:
>
>  Sender vCPU (IPI):              Target vCPU (1st sync_pir_to_irr):
>  B1: set PIR[vector]
>                                  A1: pi_clear_on()
>                                  A2: pi_harvest_pir() -> sees B1 bit
>                                  A3: xchg() -> consumes bit, PIR=0
>                                      (1st sync returns correct max_irr)
>  B2: set PID.ON = 1
>
>                                  Target vCPU (2nd sync_pir_to_irr):
>                                  C1: pi_test_on() -> TRUE (from B2)
>                                  C2: pi_clear_on() -> ON=0
>                                  C3: pi_harvest_pir() -> PIR empty
>                                  C4: *max_irr = -1, early return
>                                      IRR NOT SCANNED
>
>The interrupt is not lost (it resides in the IRR from the first sync and
>is recovered on the next vcpu_enter_guest() iteration), but the incorrect
>max_irr causes a spurious WARNING and a wasted L2 VM-Enter/VM-Exit cycle.
>
>Fixes: b41f8638b9d3 ("KVM: VMX: Isolate pure loads from atomic XCHG when processing PIR")

Just FYI, I asked Copilot to review commit b41f8638b9d3, and it indeed
identified this subtle functional change:

"
Found 1 regression. In arch/x86/kvm/lapic.c::__kvm_apic_update_irr(), the new if
(!pending) return false; drops the old behavior of recomputing *max_irr from
APIC_IRR on an empty-PIR path. vmx_sync_pir_to_irr() still calls this helper
whenever PID.ON is set and then unconditionally passes max_irr to vmx_set_rvi(),
so when hardware has already drained PIR into vIRR/APIC_IRR, max_irr stays -1 and
KVM clears RVI despite an interrupt still being pending
"
