public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Kai Huang <kai.huang@intel.com>
Cc: "dwmw2@infradead.org" <dwmw2@infradead.org>,
	Jon Kohler <jon@nutanix.com>,
	 "khushit.shah@nutanix.com" <khushit.shah@nutanix.com>,
	"x86@kernel.org" <x86@kernel.org>, "bp@alien8.de" <bp@alien8.de>,
	 "hpa@zytor.com" <hpa@zytor.com>,
	"mingo@redhat.com" <mingo@redhat.com>,
	 "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	 "dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	 "stable@vger.kernel.org" <stable@vger.kernel.org>,
	 "shaju.abraham@nutanix.com" <shaju.abraham@nutanix.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	 "tglx@linutronix.de" <tglx@linutronix.de>
Subject: Re: [PATCH v6] KVM: x86: Add x2APIC "features" to control EOI broadcast suppression
Date: Wed, 28 Jan 2026 06:57:46 -0800	[thread overview]
Message-ID: <aXoj6szBMy6BSzYO@google.com> (raw)
In-Reply-To: <62f9bdb9f3c7f55db747a29c292fa632bb6ec749.camel@intel.com>

On Wed, Jan 28, 2026, Kai Huang wrote:
> On Tue, 2026-01-27 at 19:48 -0800, David Woodhouse wrote:
> > On Wed, 2026-01-28 at 02:22 +0000, Huang, Kai wrote:
> > >  
> > > > Ah, so userspace which checks all the kernel's capabilities *first*
> > > > will not see KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST advertised,
> > > > because it needs to enable KVM_CAP_SPLIT_IRQCHIP first?
> > > > 
> > > > I guess that's tolerable¹ but the documentation could make it clearer,
> > > > perhaps? I can see VMMs silently failing to detect the feature because
> > > > they just don't set split-irqchip before checking for it? 
> > > > 
> > > > 
> > > > ¹ although I still kind of hate it and would have preferred to have the
> > > >    I/O APIC patch; userspace still has to intentionally *enable* that
> > > >    combination. But OK, I've reluctantly conceded that.
> > > 
> > > To make it even more robust, perhaps we can grab kvm->lock mutex in
> > > kvm_vm_ioctl_enable_cap() for KVM_CAP_X2APIC_API, so that it won't race with
> > > KVM_CREATE_IRQCHIP (which already grabs kvm->lock) and
> > > KVM_CAP_SPLIT_IRQCHIP?
> > > 
> > > Even more, we can add additional check in KVM_CREATE_IRQCHIP to return -
> > > EINVAL when it sees kvm->arch.suppress_eoi_broadcast_mode is
> > > KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST?
> > 
> > If we do that, then the query for KVM_CAP_X2APIC_API could advertise
> > the KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST for a freshly created KVM,
> > even before userspace has enabled *either* KVM_CREATE_IRQCHIP nor
> > KVM_CAP_SPLIT_IRQCHIP?
> 
> No IIUC it doesn't change that?
> 
> The change I mentioned above is only related to "enable" part, but not
> "query" part.
> 
> The "query" is done via kvm_vm_ioctl_check_extension(KVM_CAP_X2APIC_API),
> and in this patch, it does:
> 
> @@ -4931,6 +4933,8 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long
> ext)
>  		break;
>  	case KVM_CAP_X2APIC_API:
>  		r = KVM_X2APIC_API_VALID_FLAGS;
> +		if (kvm && !irqchip_split(kvm))
> +			r &= ~KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST;
> 
> IIRC if this is called before KVM_CREATE_IRQCHIP and KVM_CAP_SPLIT_IRQCHIP,
> then !irqchip_split() will be true, so it will NOT advertise
> KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST.
> 
> If it is called after KVM_CAP_SPLIT_IRQCHIP, then it will advertise
> KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST.

Yep.  And when called at system-scope, i.e. with @kvm=NULL, userspace will see
the maximal support with KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST.

> Btw, it doesn't grab kvm->lock either, so theoretically it could race with
> KVM_CREATE_IRQCHIP and kvm_vm_ioctl_enable_cap(KVM_CAP_SPLIT_IRQCHIP) too.

That's totally fine.

> > That would be slightly better than the existing proposed awfulness
> > where the kernel doesn't *admit* to having the _ENABLE_ capability
> > until userspace first enables the KVM_CAP_SPLIT_IRQCHIP.
> 
> We could also make kvm_vm_ioctl_check_extension(KVM_CAP_X2APIC_API) to
> _always_ advertise KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST if that's
> better.

No, because then we'd need new uAPI if we add support for ENABLE_SUPPRESS_EOI_BROADCAST
with an in-kernel I/O APIC.

> I suppose what we need is to document such behaviour -- that albeit 
> KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST is advertise as supposed, but it
> cannot be enabled together with KVM_CREATE_IRQCHIP -- one will fail
> depending on which is called first.

No, we don't need to explicitly document this, because it's super duper basic
multi-threaded programming.  KVM only needs to documented that
KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST requires a VM with KVM_CAP_SPLIT_IRQCHIP.

> As a bonus, it can get rid of "calling irqchip_split() w/o holding kvm-
> >lock" awfulness too.

No, it's not awfulness.  It's userspace's responsibility to not be stupid.  KVM
taking kvm->lock changes *nothing*.  All holding kvm->lock does is serialize KVM
code, it doesn't prevent a race.  I.e. it just changes whether tasks are racing
to acquire kvm->lock versus racing against irqchip_mode.

If userspace invokes KVM_CAP_SPLIT_IRQCHIP and KVM_ENABLE_CAP concurrently on two
separate tasks, then KVM_ENABLE_CAP will fail ~50% of the time regardless of
whether or not KVM takes kvm->lock.

CPU0                         CPU1

1. Locked Failure
----------------------------------------------------
                             lock(kvm->lock)
			     KVM_ENABLE_CAP = EINVAL
			     unlock(kvm->lock)
lock(kvm->lock)
KVM_CAP_SPLIT_IRQCHIP = 0
unlock(kvm->lock)

1. Locked Success
----------------------------------------------------
lock(kvm->lock)
KVM_CAP_SPLIT_IRQCHIP = 0
unlock(kvm->lock)
                             lock(kvm->lock)
			     KVM_ENABLE_CAP = 0
			     unlock(kvm->lock)

3. Lockless Failure
----------------------------------------------------
			     KVM_ENABLE_CAP = EINVAL
KVM_CAP_SPLIT_IRQCHIP = 0


4. Lockless Success
----------------------------------------------------
CPU0                         CPU1
KVM_CAP_SPLIT_IRQCHIP = 0
			     KVM_ENABLE_CAP = 0

  reply	other threads:[~2026-01-28 14:57 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-23 12:56 [PATCH v6] KVM: x86: Add x2APIC "features" to control EOI broadcast suppression Khushit Shah
2026-01-27  2:21 ` Khushit Shah
2026-01-27  2:41   ` Khushit Shah
2026-01-27 21:09 ` David Woodhouse
2026-01-27 21:49   ` Sean Christopherson
2026-01-27 22:36     ` David Woodhouse
2026-01-28  2:22       ` Huang, Kai
2026-01-28  3:48         ` David Woodhouse
     [not found]           ` <SA2PR02MB756478359EE9185285ACE6158891A@SA2PR02MB7564.namprd02.prod.outlook.com>
2026-01-28  5:17             ` Khushit Shah
2026-01-28  5:32               ` David Woodhouse
2026-01-28  6:40               ` Huang, Kai
2026-01-28 15:04               ` Sean Christopherson
2026-01-28  6:15           ` Huang, Kai
2026-01-28 14:57             ` Sean Christopherson [this message]
2026-01-28 21:10               ` Huang, Kai
2026-01-28 14:44       ` Sean Christopherson
2026-02-04  0:10 ` Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aXoj6szBMy6BSzYO@google.com \
    --to=seanjc@google.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=dwmw2@infradead.org \
    --cc=hpa@zytor.com \
    --cc=jon@nutanix.com \
    --cc=kai.huang@intel.com \
    --cc=khushit.shah@nutanix.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=shaju.abraham@nutanix.com \
    --cc=stable@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox