All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: David Woodhouse <dwmw2@infradead.org>
Cc: Khushit Shah <khushit.shah@nutanix.com>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	 "kai.huang@intel.com" <kai.huang@intel.com>,
	"mingo@redhat.com" <mingo@redhat.com>,
	 "x86@kernel.org" <x86@kernel.org>, "bp@alien8.de" <bp@alien8.de>,
	"hpa@zytor.com" <hpa@zytor.com>,
	 "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	 "dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	 Jon Kohler <jon@nutanix.com>,
	Shaju Abraham <shaju.abraham@nutanix.com>,
	 "stable@vger.kernel.org" <stable@vger.kernel.org>
Subject: Re: [PATCH v3] KVM: x86: Add x2APIC "features" to control EOI broadcast suppression
Date: Tue, 2 Dec 2025 08:36:21 -0800	[thread overview]
Message-ID: <aS8Vhb66UViQmY_Q@google.com> (raw)
In-Reply-To: <68ad817529c6661085ff0524472933ba9f69fd47.camel@infradead.org>

On Tue, Dec 02, 2025, David Woodhouse wrote:
> On Tue, 2025-12-02 at 07:42 -0800, Sean Christopherson wrote:
> > On Tue, Dec 02, 2025, David Woodhouse wrote:
> > > On Tue, 2025-12-02 at 12:58 +0000, Khushit Shah wrote:
> > > > Thanks for the review!
> > > > 
> > > > > On 2 Dec 2025, at 2:43 PM, David Woodhouse <dwmw2@infradead.org> wrote:
> > > > > 
> > > > > Firstly, excellent work debugging and diagnosing that!
> > > > > 
> > > > > On Tue, 2025-11-25 at 18:05 +0000, Khushit Shah wrote:
> > > > > > 
> > > > > > --- a/Documentation/virt/kvm/api.rst
> > > > > > +++ b/Documentation/virt/kvm/api.rst
> > > > > > @@ -7800,8 +7800,10 @@ Will return -EBUSY if a VCPU has already been created.
> > > > > >  
> > > > > >  Valid feature flags in args[0] are::
> > > > > >  
> > > > > > -  #define KVM_X2APIC_API_USE_32BIT_IDS            (1ULL << 0)
> > > > > > -  #define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK  (1ULL << 1)
> > > > > > +  #define KVM_X2APIC_API_USE_32BIT_IDS                               (1ULL << 0)
> > > > > > +  #define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK                     (1ULL << 1)
> > > > > > +  #define KVM_X2APIC_API_DISABLE_IGNORE_SUPPRESS_EOI_BROADCAST_QUIRK (1ULL << 2)
> > > > > > +  #define KVM_X2APIC_API_DISABLE_SUPPRESS_EOI_BROADCAST              (1ULL << 3)
> > > > > > 
> > > > > 
> > > > > I kind of hate these names. This part right here is what we leave
> > > > > behind for future generations, to understand the weird behaviour of
> > > > > KVM. To have "IGNORE" "SUPPRESS" "QUIRK" all in the same flag, quite
> > > > > apart from the length of the token, makes my brain hurt.
> > 
> > ...
> > 
> > > > > Could we perhaps call them 'ENABLE_SUPPRESS_EOI_BROADCAST' and
> > > > > 'DISABLE_SUPPRESS_EOI_BROADCAST', with a note saying that modern VMMs
> > > > > should always explicitly enable one or the other, because for
> > > > > historical reasons KVM only *pretends* to support it by default but it
> > > > > doesn't actually work correctly?
> > 
> > I don't disagree on the names being painful, but ENABLE_SUPPRESS_EOI_BROADCAST
> > vs. DISABLE_SUPPRESS_EOI_BROADCAST won't work, and is even more confusing IMO.
> 
> I dunno, KVM never actually *did* suppress the EOI broadcast anyway,
> did it? This fix really *does* enable it — as opposed to just
> pretending to?
> 
> I was thinking along the lines of ...
> 
> 
> Setting KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST causes KVM to
> advertise and correctly implement the Directed EOI feature in the local
> APIC, suppressing broadcast EOI when the feature is enabled by the
> guest.
> 
> Setting KVM_X2APIC_DISABLE_SUPPRESS_EOI_BROADCAST causes KVM not to
> advertise the Directed EOI feature in the local APIC.
> 
> Userspace should explicitly either enable or disable the EOI broadcast
> using one of the two flags above. For historical compatibility reasons,
> if neither flag is set then KVM will advertise the feature but will not
> actually suppress the EOI broadcast, leading to potential IRQ storms in
> some guest configurations.

Hmm, I suppose that could work for uAPI.  Having both an ENABLE and a DISABLE
is obviously a bit odd, but slowing down the reader might actually be a good
thing in this case.  And the documentation should be easy enough to write.

I was worried that having ENABLE and DISABLE controls would lead to confusing code
internally, but there's no reason KVM's internal tracking needs to match uAPI.

How about this?

---
 arch/x86/include/asm/kvm_host.h |  7 +++++++
 arch/x86/include/uapi/asm/kvm.h |  6 ++++--
 arch/x86/kvm/lapic.c            | 16 +++++++++++++++-
 arch/x86/kvm/x86.c              | 15 ++++++++++++---
 4 files changed, 38 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 5a3bfa293e8b..b4c41255f01d 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1226,6 +1226,12 @@ enum kvm_irqchip_mode {
 	KVM_IRQCHIP_SPLIT,        /* created with KVM_CAP_SPLIT_IRQCHIP */
 };
 
+enum kvm_suppress_eoi_broadcast_mode {
+	KVM_SUPPRESS_EOI_QUIRKED,
+	KVM_SUPPRESS_EOI_ENABLED,
+	KVM_SUPPRESS_EOI_DISABLED,
+};
+
 struct kvm_x86_msr_filter {
 	u8 count;
 	bool default_allow:1;
@@ -1475,6 +1481,7 @@ struct kvm_arch {
 
 	bool x2apic_format;
 	bool x2apic_broadcast_quirk_disabled;
+	enum kvm_suppress_eoi_broadcast_mode suppress_eoi_broadcast;
 
 	bool has_mapped_host_mmio;
 	bool guest_can_read_msr_platform_info;
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 7ceff6583652..bd51596001f8 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -914,8 +914,10 @@ struct kvm_sev_snp_launch_finish {
 	__u64 pad1[4];
 };
 
-#define KVM_X2APIC_API_USE_32BIT_IDS            (1ULL << 0)
-#define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK  (1ULL << 1)
+#define KVM_X2APIC_API_USE_32BIT_IDS			(_BITULL(0))
+#define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK		(_BITULL(1))
+#define KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST	(_BITULL(2))
+#define KVM_X2APIC_DISABLE_SUPPRESS_EOI_BROADCAST	(_BITULL(3))
 
 struct kvm_hyperv_eventfd {
 	__u32 conn_id;
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 1597dd0b0cc6..3f00c9640785 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -562,7 +562,8 @@ void kvm_apic_set_version(struct kvm_vcpu *vcpu)
 	 * IOAPIC.
 	 */
 	if (guest_cpu_cap_has(vcpu, X86_FEATURE_X2APIC) &&
-	    !ioapic_in_kernel(vcpu->kvm))
+	    !ioapic_in_kernel(vcpu->kvm) &&
+	    vcpu->kvm->arch.suppress_eoi_broadcast != KVM_SUPPRESS_EOI_DISABLED)
 		v |= APIC_LVR_DIRECTED_EOI;
 	kvm_lapic_set_reg(apic, APIC_LVR, v);
 }
@@ -1517,6 +1518,19 @@ static void kvm_ioapic_send_eoi(struct kvm_lapic *apic, int vector)
 
 	/* Request a KVM exit to inform the userspace IOAPIC. */
 	if (irqchip_split(apic->vcpu->kvm)) {
+		/*
+		 * Don't exit to userspace if the guest has enabled Directed
+		 * EOI, a.k.a. Suppress EOI Broadcasts, in which case the local
+		 * APIC doesn't broadcast EOIs (the guest must EOI the target
+		 * I/O APIC(s) directly).  Ignore the suppression if userspace
+		 * has not explictly enabled support (KVM's historical quirky
+		 * behavior is to advertise support for Suppress EOI Broadcasts
+		 * without actually suppressing EOIs).
+		 */
+		if ((kvm_lapic_get_reg(apic, APIC_SPIV) & APIC_SPIV_DIRECTED_EOI) &&
+		    apic->vcpu->kvm->arch.suppress_eoi_broadcast != KVM_SUPPRESS_EOI_QUIRKED)
+			return;
+
 		apic->vcpu->arch.pending_ioapic_eoi = vector;
 		kvm_make_request(KVM_REQ_IOAPIC_EOI_EXIT, apic->vcpu);
 		return;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0c6d899d53dd..b36e048c7862 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -121,8 +121,10 @@ static u64 __read_mostly efer_reserved_bits = ~((u64)EFER_SCE);
 
 #define KVM_CAP_PMU_VALID_MASK KVM_PMU_CAP_DISABLE
 
-#define KVM_X2APIC_API_VALID_FLAGS (KVM_X2APIC_API_USE_32BIT_IDS | \
-                                    KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK)
+#define KVM_X2APIC_API_VALID_FLAGS (KVM_X2APIC_API_USE_32BIT_IDS |		\
+				    KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK |	\
+				    KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST |	\
+				    KVM_X2APIC_DISABLE_SUPPRESS_EOI_BROADCAST)
 
 static void update_cr8_intercept(struct kvm_vcpu *vcpu);
 static void process_nmi(struct kvm_vcpu *vcpu);
@@ -6739,11 +6741,18 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		if (cap->args[0] & ~KVM_X2APIC_API_VALID_FLAGS)
 			break;
 
+		if (cap->args[0] & KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST &&
+		    cap->args[0] & KVM_X2APIC_DISABLE_SUPPRESS_EOI_BROADCAST)
+			break;
+
 		if (cap->args[0] & KVM_X2APIC_API_USE_32BIT_IDS)
 			kvm->arch.x2apic_format = true;
 		if (cap->args[0] & KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK)
 			kvm->arch.x2apic_broadcast_quirk_disabled = true;
-
+		if (cap->args[0] & KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST)
+			kvm->arch.suppress_eoi_broadcast = KVM_SUPPRESS_EOI_ENABLED;
+		if (cap->args[0] & KVM_X2APIC_DISABLE_SUPPRESS_EOI_BROADCAST)
+			kvm->arch.suppress_eoi_broadcast = KVM_SUPPRESS_EOI_DISABLED;
 		r = 0;
 		break;
 	case KVM_CAP_X86_DISABLE_EXITS:

base-commit: 6c3373b26189853230552bd3932b3edba5883423
--

  reply	other threads:[~2025-12-02 16:36 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-25 18:05 [PATCH v3] KVM: x86: Add x2APIC "features" to control EOI broadcast suppression Khushit Shah
2025-11-25 21:16 ` Huang, Kai
2025-12-02  9:13 ` David Woodhouse
2025-12-02 12:58   ` Khushit Shah
2025-12-02 13:31     ` David Woodhouse
2025-12-02 15:42       ` Sean Christopherson
2025-12-02 15:51         ` David Woodhouse
2025-12-02 16:36           ` Sean Christopherson [this message]
2025-12-02 17:10             ` David Woodhouse
2025-12-02 22:26               ` Sean Christopherson
2025-12-02 22:35                 ` David Woodhouse
2025-12-03  0:50             ` Huang, Kai
2025-12-03  1:14               ` David Woodhouse
2025-12-03 12:25               ` David Woodhouse
2025-12-03 13:10                 ` Paolo Bonzini
2025-12-03 13:32                   ` David Woodhouse
2025-12-03 13:36                     ` Paolo Bonzini
2025-12-03 13:55                       ` David Woodhouse
2025-12-03  7:45             ` Khushit Shah
2025-12-02 16:04         ` David Woodhouse

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aS8Vhb66UViQmY_Q@google.com \
    --to=seanjc@google.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=dwmw2@infradead.org \
    --cc=hpa@zytor.com \
    --cc=jon@nutanix.com \
    --cc=kai.huang@intel.com \
    --cc=khushit.shah@nutanix.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=shaju.abraham@nutanix.com \
    --cc=stable@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.