From: Sean Christopherson <seanjc@google.com>
To: Marc Zyngier <maz@kernel.org>, Oliver Upton <oupton@kernel.org>,
Tianrui Zhao <zhaotianrui@loongson.cn>,
Bibo Mao <maobibo@loongson.cn>,
Huacai Chen <chenhuacai@kernel.org>,
Anup Patel <anup@brainfault.org>, Paul Walmsley <pjw@kernel.org>,
Palmer Dabbelt <palmer@dabbelt.com>,
Albert Ou <aou@eecs.berkeley.edu>, Xin Li <xin@zytor.com>,
"H. Peter Anvin" <hpa@zytor.com>,
Andy Lutomirski <luto@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Namhyung Kim <namhyung@kernel.org>,
Sean Christopherson <seanjc@google.com>,
Paolo Bonzini <pbonzini@redhat.com>
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
kvm@vger.kernel.org, loongarch@lists.linux.dev,
kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org,
linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
Mingwei Zhang <mizhang@google.com>,
Xudong Hao <xudong.hao@intel.com>,
Sandipan Das <sandipan.das@amd.com>,
Dapeng Mi <dapeng1.mi@linux.intel.com>,
Xiong Zhang <xiong.y.zhang@linux.intel.com>,
Manali Shukla <manali.shukla@amd.com>,
Jim Mattson <jmattson@google.com>
Subject: [PATCH v6 16/44] KVM: x86/pmu: Start stubbing in mediated PMU support
Date: Fri, 5 Dec 2025 16:16:52 -0800 [thread overview]
Message-ID: <20251206001720.468579-17-seanjc@google.com> (raw)
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
From: Dapeng Mi <dapeng1.mi@linux.intel.com>
Introduce enable_mediated_pmu as a global variable, with the intent of
exposing it to userspace a vendor module parameter, to control and reflect
mediated vPMU support. Wire up the perf plumbing to create+release a
mediated PMU, but defer exposing the parameter to userspace until KVM
support for a mediated PMUs is fully landed.
To (a) minimize compatibility issues, (b) to give userspace a chance to
opt out of the restrictive side-effects of perf_create_mediated_pmu(),
and (c) to avoid adding new dependencies between enabling an in-kernel
irqchip and a mediated vPMU, defer "creating" a mediated PMU in perf
until the first vCPU is created.
Regarding userspace compatibility, an alternative solution would be to
make the mediated PMU fully opt-in, e.g. to avoid unexpected failure due
to perf_create_mediated_pmu() failing. Ironically, that approach creates
an even bigger compatibility issue, as turning on enable_mediated_pmu
would silently break VMMs that don't utilize KVM_CAP_PMU_CAPABILITY (well,
silently until the guest tried to access PMU assets).
Regarding an in-kernel irqchip, create a mediated PMU if and only if the
VM has an in-kernel local APIC, as the mediated PMU will take a hard
dependency on forwarding PMIs to the guest without bouncing through host
userspace. Silently "drop" the PMU instead of rejecting KVM_CREATE_VCPU,
as KVM's existing vPMU support doesn't function correctly if the local
APIC is emulated by userspace, e.g. PMIs will never be delivered. I.e.
it's far, far more likely that rejecting KVM_CREATE_VCPU would cause
problems, e.g. for tests or userspace daemons that just want to probe
basic KVM functionality.
Note! Deliberately make mediated PMU creation "sticky", i.e. don't unwind
it on failure to create a vCPU. Practically speaking, there's no harm to
having a VM with a mediated PMU and no vCPUs. To avoid an "impossible" VM
setup, reject KVM_CAP_PMU_CAPABILITY if a mediated PMU has been created,
i.e. don't let userspace disable PMU support after failed vCPU creation
(with PMU support enabled).
Defer vendor specific requirements and constraints to the future.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Co-developed-by: Mingwei Zhang <mizhang@google.com>
Signed-off-by: Mingwei Zhang <mizhang@google.com>
Tested-by: Xudong Hao <xudong.hao@intel.com>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/pmu.c | 4 ++++
arch/x86/kvm/pmu.h | 7 +++++++
arch/x86/kvm/x86.c | 37 +++++++++++++++++++++++++++++++--
arch/x86/kvm/x86.h | 1 +
5 files changed, 48 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 5a3bfa293e8b..defd979003be 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1484,6 +1484,7 @@ struct kvm_arch {
bool bus_lock_detection_enabled;
bool enable_pmu;
+ bool created_mediated_pmu;
u32 notify_window;
u32 notify_vmexit_flags;
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 7c219305b61d..0de0af5c6e4f 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -137,6 +137,10 @@ void kvm_init_pmu_capability(const struct kvm_pmu_ops *pmu_ops)
enable_pmu = false;
}
+ if (!enable_pmu || !enable_mediated_pmu || !kvm_host_pmu.mediated ||
+ !pmu_ops->is_mediated_pmu_supported(&kvm_host_pmu))
+ enable_mediated_pmu = false;
+
if (!enable_pmu) {
memset(&kvm_pmu_cap, 0, sizeof(kvm_pmu_cap));
return;
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 5c3939e91f1d..a5c7c026b919 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -37,6 +37,8 @@ struct kvm_pmu_ops {
void (*deliver_pmi)(struct kvm_vcpu *vcpu);
void (*cleanup)(struct kvm_vcpu *vcpu);
+ bool (*is_mediated_pmu_supported)(struct x86_pmu_capability *host_pmu);
+
const u64 EVENTSEL_EVENT;
const int MAX_NR_GP_COUNTERS;
const int MIN_NR_GP_COUNTERS;
@@ -58,6 +60,11 @@ static inline bool kvm_pmu_has_perf_global_ctrl(struct kvm_pmu *pmu)
return pmu->version > 1;
}
+static inline bool kvm_vcpu_has_mediated_pmu(struct kvm_vcpu *vcpu)
+{
+ return enable_mediated_pmu && vcpu_to_pmu(vcpu)->version;
+}
+
/*
* KVM tracks all counters in 64-bit bitmaps, with general purpose counters
* mapped to bits 31:0 and fixed counters mapped to 63:32, e.g. fixed counter 0
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1b2827cecf38..fb3a5e861553 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -183,6 +183,10 @@ bool __read_mostly enable_pmu = true;
EXPORT_SYMBOL_FOR_KVM_INTERNAL(enable_pmu);
module_param(enable_pmu, bool, 0444);
+/* Enable/disabled mediated PMU virtualization. */
+bool __read_mostly enable_mediated_pmu;
+EXPORT_SYMBOL_FOR_KVM_INTERNAL(enable_mediated_pmu);
+
bool __read_mostly eager_page_split = true;
module_param(eager_page_split, bool, 0644);
@@ -6854,7 +6858,7 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
break;
mutex_lock(&kvm->lock);
- if (!kvm->created_vcpus) {
+ if (!kvm->created_vcpus && !kvm->arch.created_mediated_pmu) {
kvm->arch.enable_pmu = !(cap->args[0] & KVM_PMU_CAP_DISABLE);
r = 0;
}
@@ -12641,8 +12645,13 @@ static int sync_regs(struct kvm_vcpu *vcpu)
return 0;
}
+#define PERF_MEDIATED_PMU_MSG \
+ "Failed to enable mediated vPMU, try disabling system wide perf events and nmi_watchdog.\n"
+
int kvm_arch_vcpu_precreate(struct kvm *kvm, unsigned int id)
{
+ int r;
+
if (kvm_check_tsc_unstable() && kvm->created_vcpus)
pr_warn_once("SMP vm created on host with unstable TSC; "
"guest TSC will not be reliable\n");
@@ -12653,7 +12662,29 @@ int kvm_arch_vcpu_precreate(struct kvm *kvm, unsigned int id)
if (id >= kvm->arch.max_vcpu_ids)
return -EINVAL;
- return kvm_x86_call(vcpu_precreate)(kvm);
+ /*
+ * Note, any actions done by .vcpu_create() must be idempotent with
+ * respect to creating multiple vCPUs, and therefore are not undone if
+ * creating a vCPU fails (including failure during pre-create).
+ */
+ r = kvm_x86_call(vcpu_precreate)(kvm);
+ if (r)
+ return r;
+
+ if (enable_mediated_pmu && kvm->arch.enable_pmu &&
+ !kvm->arch.created_mediated_pmu) {
+ if (irqchip_in_kernel(kvm)) {
+ r = perf_create_mediated_pmu();
+ if (r) {
+ pr_warn_ratelimited(PERF_MEDIATED_PMU_MSG);
+ return r;
+ }
+ kvm->arch.created_mediated_pmu = true;
+ } else {
+ kvm->arch.enable_pmu = false;
+ }
+ }
+ return 0;
}
int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
@@ -13319,6 +13350,8 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
__x86_set_memory_region(kvm, TSS_PRIVATE_MEMSLOT, 0, 0);
mutex_unlock(&kvm->slots_lock);
}
+ if (kvm->arch.created_mediated_pmu)
+ perf_release_mediated_pmu();
kvm_destroy_vcpus(kvm);
kvm_free_msr_filter(srcu_dereference_check(kvm->arch.msr_filter, &kvm->srcu, 1));
#ifdef CONFIG_KVM_IOAPIC
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index fdab0ad49098..6e1fb1680c0a 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -470,6 +470,7 @@ extern struct kvm_caps kvm_caps;
extern struct kvm_host_values kvm_host;
extern bool enable_pmu;
+extern bool enable_mediated_pmu;
/*
* Get a filtered version of KVM's supported XCR0 that strips out dynamic
--
2.52.0.223.gf5cc29aaa4-goog
next prev parent reply other threads:[~2025-12-06 0:17 UTC|newest]
Thread overview: 60+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-12-06 0:16 [PATCH v6 00/44] KVM: x86: Add support for mediated vPMUs Sean Christopherson
2025-12-06 0:16 ` [PATCH v6 01/44] perf: Skip pmu_ctx based on event_type Sean Christopherson
2025-12-06 0:16 ` [PATCH v6 02/44] perf: Add generic exclude_guest support Sean Christopherson
2025-12-06 0:16 ` [PATCH v6 03/44] perf: Move security_perf_event_free() call to __free_event() Sean Christopherson
2025-12-06 0:16 ` [PATCH v6 04/44] perf: Add APIs to create/release mediated guest vPMUs Sean Christopherson
2025-12-08 11:51 ` Peter Zijlstra
2025-12-08 18:07 ` Sean Christopherson
2025-12-06 0:16 ` [PATCH v6 05/44] perf: Clean up perf ctx time Sean Christopherson
2025-12-06 0:16 ` [PATCH v6 06/44] perf: Add a EVENT_GUEST flag Sean Christopherson
2025-12-06 0:16 ` [PATCH v6 07/44] perf: Add APIs to load/put guest mediated PMU context Sean Christopherson
2025-12-06 0:16 ` [PATCH v6 08/44] perf/x86/core: Register a new vector for handling mediated guest PMIs Sean Christopherson
2025-12-06 0:16 ` [PATCH v6 09/44] perf/x86/core: Add APIs to switch to/from mediated PMI vector (for KVM) Sean Christopherson
2025-12-06 0:16 ` [PATCH v6 10/44] perf/x86/core: Do not set bit width for unavailable counters Sean Christopherson
2025-12-06 0:16 ` [PATCH v6 11/44] perf/x86/core: Plumb mediated PMU capability from x86_pmu to x86_pmu_cap Sean Christopherson
2025-12-06 0:16 ` [PATCH v6 12/44] perf/x86/intel: Support PERF_PMU_CAP_MEDIATED_VPMU Sean Christopherson
2025-12-06 0:16 ` [PATCH v6 13/44] perf/x86/amd: Support PERF_PMU_CAP_MEDIATED_VPMU for AMD host Sean Christopherson
2025-12-06 0:16 ` [PATCH v6 14/44] KVM: Add a simplified wrapper for registering perf callbacks Sean Christopherson
2025-12-06 0:16 ` [PATCH v6 15/44] KVM: x86/pmu: Snapshot host (i.e. perf's) reported PMU capabilities Sean Christopherson
2025-12-06 0:16 ` Sean Christopherson [this message]
2025-12-06 0:16 ` [PATCH v6 17/44] KVM: x86/pmu: Implement Intel mediated PMU requirements and constraints Sean Christopherson
2025-12-06 0:16 ` [PATCH v6 18/44] KVM: x86/pmu: Implement AMD mediated PMU requirements Sean Christopherson
2025-12-06 0:16 ` [PATCH v6 19/44] KVM: x86/pmu: Register PMI handler for mediated vPMU Sean Christopherson
2025-12-06 0:16 ` [PATCH v6 20/44] KVM: x86/pmu: Disable RDPMC interception for compatible " Sean Christopherson
2025-12-06 0:16 ` [PATCH v6 21/44] KVM: x86/pmu: Load/save GLOBAL_CTRL via entry/exit fields for mediated PMU Sean Christopherson
2025-12-06 0:16 ` [PATCH v6 22/44] KVM: x86/pmu: Disable interception of select PMU MSRs for mediated vPMUs Sean Christopherson
2025-12-06 0:16 ` [PATCH v6 23/44] KVM: x86/pmu: Bypass perf checks when emulating mediated PMU counter accesses Sean Christopherson
2025-12-06 0:17 ` [PATCH v6 24/44] KVM: x86/pmu: Introduce eventsel_hw to prepare for pmu event filtering Sean Christopherson
2025-12-06 0:17 ` [PATCH v6 25/44] KVM: x86/pmu: Reprogram mediated PMU event selectors on event filter updates Sean Christopherson
2025-12-06 0:17 ` [PATCH v6 26/44] KVM: x86/pmu: Always stuff GuestOnly=1,HostOnly=0 for mediated PMCs on AMD Sean Christopherson
2025-12-06 0:17 ` [PATCH v6 27/44] KVM: x86/pmu: Load/put mediated PMU context when entering/exiting guest Sean Christopherson
2025-12-06 0:17 ` [PATCH v6 28/44] KVM: x86/pmu: Disallow emulation in the fastpath if mediated PMCs are active Sean Christopherson
2025-12-06 0:17 ` [PATCH v6 29/44] KVM: x86/pmu: Handle emulated instruction for mediated vPMU Sean Christopherson
2025-12-06 0:17 ` [PATCH v6 30/44] KVM: nVMX: Add macros to simplify nested MSR interception setting Sean Christopherson
2025-12-06 0:17 ` [PATCH v6 31/44] KVM: nVMX: Disable PMU MSR interception as appropriate while running L2 Sean Christopherson
2025-12-06 0:17 ` [PATCH v6 32/44] KVM: nSVM: " Sean Christopherson
2025-12-06 0:17 ` [PATCH v6 33/44] KVM: x86/pmu: Expose enable_mediated_pmu parameter to user space Sean Christopherson
2025-12-06 0:17 ` [PATCH v6 34/44] KVM: x86/pmu: Elide WRMSRs when loading guest PMCs if values already match Sean Christopherson
2025-12-06 0:17 ` [PATCH v6 35/44] KVM: VMX: Drop intermediate "guest" field from msr_autostore Sean Christopherson
2025-12-08 9:14 ` Mi, Dapeng
2025-12-06 0:17 ` [PATCH v6 36/44] KVM: nVMX: Don't update msr_autostore count when saving TSC for vmcs12 Sean Christopherson
2025-12-06 0:17 ` [PATCH v6 37/44] KVM: VMX: Dedup code for removing MSR from VMCS's auto-load list Sean Christopherson
2025-12-08 9:29 ` Mi, Dapeng
2025-12-09 17:37 ` Sean Christopherson
2025-12-10 1:08 ` Mi, Dapeng
2025-12-06 0:17 ` [PATCH v6 38/44] KVM: VMX: Drop unused @entry_only param from add_atomic_switch_msr() Sean Christopherson
2025-12-08 9:32 ` Mi, Dapeng
2025-12-06 0:17 ` [PATCH v6 39/44] KVM: VMX: Bug the VM if either MSR auto-load list is full Sean Christopherson
2025-12-08 9:32 ` Mi, Dapeng
2025-12-08 9:34 ` Mi, Dapeng
2025-12-06 0:17 ` [PATCH v6 40/44] KVM: VMX: Set MSR index auto-load entry if and only if entry is "new" Sean Christopherson
2025-12-08 9:35 ` Mi, Dapeng
2025-12-06 0:17 ` [PATCH v6 41/44] KVM: VMX: Compartmentalize adding MSRs to host vs. guest auto-load list Sean Christopherson
2025-12-08 9:36 ` Mi, Dapeng
2025-12-06 0:17 ` [PATCH v6 42/44] KVM: VMX: Dedup code for adding MSR to VMCS's auto list Sean Christopherson
2025-12-08 9:37 ` Mi, Dapeng
2025-12-06 0:17 ` [PATCH v6 43/44] KVM: VMX: Initialize vmcs01.VM_EXIT_MSR_STORE_ADDR with list address Sean Christopherson
2025-12-06 0:17 ` [PATCH v6 44/44] KVM: VMX: Add mediated PMU support for CPUs without "save perf global ctrl" Sean Christopherson
2025-12-08 9:39 ` Mi, Dapeng
2025-12-09 6:31 ` Mi, Dapeng
2025-12-08 15:37 ` [PATCH v6 00/44] KVM: x86: Add support for mediated vPMUs Peter Zijlstra
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20251206001720.468579-17-seanjc@google.com \
--to=seanjc@google.com \
--cc=acme@kernel.org \
--cc=anup@brainfault.org \
--cc=aou@eecs.berkeley.edu \
--cc=chenhuacai@kernel.org \
--cc=dapeng1.mi@linux.intel.com \
--cc=hpa@zytor.com \
--cc=jmattson@google.com \
--cc=kvm-riscv@lists.infradead.org \
--cc=kvm@vger.kernel.org \
--cc=kvmarm@lists.linux.dev \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-perf-users@vger.kernel.org \
--cc=linux-riscv@lists.infradead.org \
--cc=loongarch@lists.linux.dev \
--cc=luto@kernel.org \
--cc=manali.shukla@amd.com \
--cc=maobibo@loongson.cn \
--cc=maz@kernel.org \
--cc=mingo@redhat.com \
--cc=mizhang@google.com \
--cc=namhyung@kernel.org \
--cc=oupton@kernel.org \
--cc=palmer@dabbelt.com \
--cc=pbonzini@redhat.com \
--cc=peterz@infradead.org \
--cc=pjw@kernel.org \
--cc=sandipan.das@amd.com \
--cc=xin@zytor.com \
--cc=xiong.y.zhang@linux.intel.com \
--cc=xudong.hao@intel.com \
--cc=zhaotianrui@loongson.cn \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).