Date: Wed, 28 Jan 2026 07:04:15 -0800
X-Mailing-List: linux-kernel@vger.kernel.org
References: <20260123125657.3384063-1-khushit.shah@nutanix.com>
 <699708d7f3da2e2a41e3282c1a87e6f4d69a4e89.camel@intel.com>
Subject: Re: [PATCH v6] KVM: x86: Add x2APIC "features" to control EOI broadcast suppression
From: Sean Christopherson
To: Khushit Shah
Cc: David Woodhouse, Kai Huang, Shaju Abraham, "x86@kernel.org",
 "bp@alien8.de", "stable@vger.kernel.org", "hpa@zytor.com",
 "linux-kernel@vger.kernel.org", "mingo@redhat.com",
 "dave.hansen@linux.intel.com", "pbonzini@redhat.com",
 "kvm@vger.kernel.org", Jon Kohler, "tglx@linutronix.de"

On Wed, Jan 28, 2026, Khushit Shah wrote:
>
> > On 28 Jan 2026, at 9:27 AM, Khushit Shah wrote:
> >
> >
> > On 28/01/26, 9:19 AM, "David Woodhouse" wrote:
> >
> > On Wed, 2026-01-28 at 02:22 +0000, Huang, Kai wrote:
> > >
> > > > Ah, so userspace which checks all the kernel's capabilities *first*
> > > > will not see KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST advertised,
> > > > because it needs to enable KVM_CAP_SPLIT_IRQCHIP first?
> > > >
> > > > I guess that's tolerable¹ but the documentation could make it clearer,
> > > > perhaps? I can see VMMs silently failing to detect the feature because
> > > > they just don't set split-irqchip before checking for it?
> > > >
> > > > ¹ although I still kind of hate it and would have preferred to have the
> > > >   I/O APIC patch; userspace still has to intentionally *enable* that
> > > >   combination. But OK, I've reluctantly conceded that.
> > >
> > > To make it even more robust, perhaps we can grab kvm->lock mutex in
> > > kvm_vm_ioctl_enable_cap() for KVM_CAP_X2APIC_API, so that it won't race with
> > > KVM_CREATE_IRQCHIP (which already grabs kvm->lock) and
> > > KVM_CAP_SPLIT_IRQCHIP?
> > >
> > > Even more, we can add additional check in KVM_CREATE_IRQCHIP to return
> > > -EINVAL when it sees kvm->arch.suppress_eoi_broadcast_mode is
> > > KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST?
> >
> > If we do that, then the query for KVM_CAP_X2APIC_API could advertise
> > the KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST for a freshly created KVM,
> > even before userspace has enabled *either* KVM_CREATE_IRQCHIP nor
> > KVM_CAP_SPLIT_IRQCHIP?
> >
> > That would be slightly better than the existing proposed awfulness
> > where the kernel doesn't *admit* to having the _ENABLE_ capability
> > until userspace first enables the KVM_CAP_SPLIT_IRQCHIP.

No.  If userspace wants to see if *KVM* supports the feature, then userspace can
do KVM_CHECK_EXTENSION on /dev/kvm.  If userspace does KVM_CHECK_EXTENSION on a
VM fd, then KVM absolutely must report exactly what that VM supports.

> How about we make an explicit _ENABLE_ bit for split IRQCHIP?
> When/if in-kernel IRQCHIP starts supporting I/O APIC 0x20, we
> can add a separate bit for that in the CAP.

NAK.  Conditionally enumerating support for a feature based on the configuration
of the VM has been KVM's documented behavior since KVM_CHECK_EXTENSION was added
by commit 92b591a4c46b ("KVM: Allow KVM_CHECK_EXTENSION on the vm fd").

I don't see any reason why KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST needs to do
something different.
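
For illustration, the two kinds of query described above (KVM_CHECK_EXTENSION
on /dev/kvm versus on a VM fd) could look roughly like this from userspace.
This is a minimal sketch and not part of the patch. It assumes x86 and
hard-codes the ioctl numbers and the KVM_CAP_X2APIC_API ID as they appear in
<linux/kvm.h>; the helper names check_extension() and describe() are
hypothetical. It prints only the raw bitmask, because the bit value of
KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST is not stated in this thread.

```python
import fcntl
import os

# Values as defined in <linux/kvm.h> on x86, where _IO(0xAE, nr) encodes to
# (0xAE << 8) | nr. Assumption: other architectures may encode _IO differently.
KVM_CREATE_VM = 0xAE01
KVM_CHECK_EXTENSION = 0xAE03
KVM_CAP_X2APIC_API = 129


def check_extension(fd, cap):
    """Return the KVM_CHECK_EXTENSION result for cap on fd, or None on error."""
    try:
        return fcntl.ioctl(fd, KVM_CHECK_EXTENSION, cap)
    except OSError:
        return None


def describe(flags):
    """Render a KVM_CAP_X2APIC_API result as a hex bitmask string."""
    return "unavailable" if flags is None else f"{flags:#x}"


def main():
    try:
        kvm_fd = os.open("/dev/kvm", os.O_RDWR | os.O_CLOEXEC)
    except OSError as err:
        print(f"cannot open /dev/kvm: {err}")
        return
    try:
        # System-wide query: what KVM itself is able to support.
        print("/dev/kvm:", describe(check_extension(kvm_fd, KVM_CAP_X2APIC_API)))
        try:
            vm_fd = fcntl.ioctl(kvm_fd, KVM_CREATE_VM, 0)
        except OSError as err:
            print(f"KVM_CREATE_VM failed: {err}")
            return
        try:
            # Per-VM query: what this VM supports as currently configured.
            print("VM fd:   ", describe(check_extension(vm_fd, KVM_CAP_X2APIC_API)))
        finally:
            os.close(vm_fd)
    finally:
        os.close(kvm_fd)


if __name__ == "__main__":
    main()
```

A VMM that probes capabilities before configuring the irqchip would use the
first query. The second is only meaningful once the VM is set up the way it
will actually run, which is why the two results are allowed to differ.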