public inbox for linux-perf-users@vger.kernel.org
* [PATCH] perf/x86/intel: Don't update PEBS_ENABLE MSR in NMI context
@ 2026-03-20 15:27 Jim Mattson
  2026-03-20 18:58 ` Jim Mattson
  0 siblings, 1 reply; 4+ messages in thread
From: Jim Mattson @ 2026-03-20 15:27 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, James Clark, Thomas Gleixner,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	linux-perf-users, linux-kernel, Stephane Eranian, Mingwei Zhang
  Cc: Jim Mattson

Writing MSR_IA32_PEBS_ENABLE in handle_pmi_common() when a PEBS event is
throttled is problematic for KVM, which may have an older value for the MSR
(previously obtained via perf_guest_get_msrs()) stored in the VM-exit
MSR-load list.

It isn't necessary to update the MSR in the PMI handler, because the
throttled counter's PerfEvtSel ENABLE bit has been cleared, so its
PEBS_ENABLE bit is irrelevant. However, the MSR must be updated before the
stale bit becomes relevant again (i.e. when the PMC starts counting a new
perf event).

When the PEBS_ENABLE MSR is rendered stale in the PMI handler, just record
that fact in a new cpu_hw_events boolean, pebs_stale.  Extend
intel_pmu_pebs_enable_all() to write MSR_IA32_PEBS_ENABLE when pebs_stale
is set and to clear pebs_stale.  This ensures stale bits are cleared
during the normal PMU enable path, before PERF_GLOBAL_CTRL is restored and
any new event can start counting.

Signed-off-by: Jim Mattson <jmattson@google.com>
---
 arch/x86/events/intel/core.c | 28 ++++++++++++++++++++++------
 arch/x86/events/intel/ds.c   |  4 +++-
 arch/x86/events/perf_event.h |  1 +
 3 files changed, 26 insertions(+), 7 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index cf3a4fe06ff2..3b0173d25232 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3548,14 +3548,26 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)
 		static_call(x86_pmu_drain_pebs)(regs, &data);
 
 		/*
-		 * PMI throttle may be triggered, which stops the PEBS event.
-		 * Although cpuc->pebs_enabled is updated accordingly, the
-		 * MSR_IA32_PEBS_ENABLE is not updated. Because the
-		 * cpuc->enabled has been forced to 0 in PMI.
-		 * Update the MSR if pebs_enabled is changed.
+		 * PMI throttling may be triggered, which stops the PEBS
+		 * event.  Although cpuc->pebs_enabled was updated
+		 * accordingly, MSR_IA32_PEBS_ENABLE has not been updated.
+		 *
+		 * The MSR must not be updated in NMI context, because KVM
+		 * may be in a critical region on its way to VM-entry, with
+		 * the now-stale MSR value stored in the VM-exit MSR-load
+		 * list.  On VM-exit, the CPU will restore the stale value.
+		 *
+		 * Deferring the MSR update is harmless because the
+		 * throttled event's PerfEvtSel ENABLE bit has been
+		 * cleared, so the stale PEBS_ENABLE bit is irrelevant for
+		 * now.
+		 *
+		 * Track when MSR_IA32_PEBS_ENABLE is stale, so that
+		 * pebs_enable_all() will write cpuc->pebs_enabled to the
+		 * MSR, even when cpuc->pebs_enabled is 0.
 		 */
 		if (pebs_enabled != cpuc->pebs_enabled)
-			wrmsrq(MSR_IA32_PEBS_ENABLE, cpuc->pebs_enabled);
+			cpuc->pebs_stale = true;
 
 		/*
 		 * Above PEBS handler (PEBS counters snapshotting) has updated fixed
@@ -4978,6 +4990,10 @@ static int intel_pmu_hw_config(struct perf_event *event)
  * These values have nothing to do with the emulated values the guest sees
  * when it uses {RD,WR}MSR, which should be handled by the KVM context,
  * specifically in the intel_pmu_{get,set}_msr().
+ *
+ * Note that MSRs returned from this function must not be modified in NMI
+ * context. Doing so could result in a stale host value being restored at
+ * the next VM-exit.
  */
 static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data)
 {
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 5027afc97b65..852baf6a05e9 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1982,8 +1982,10 @@ void intel_pmu_pebs_enable_all(void)
 {
 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 
-	if (cpuc->pebs_enabled)
+	if (cpuc->pebs_enabled || cpuc->pebs_stale)
 		wrmsrq(MSR_IA32_PEBS_ENABLE, cpuc->pebs_enabled);
+
+	cpuc->pebs_stale = false;
 }
 
 void intel_pmu_pebs_disable_all(void)
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index fad87d3c8b2c..22102296e31b 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -298,6 +298,7 @@ struct cpu_hw_events {
 	/* DS based PEBS or arch-PEBS buffer address */
 	void			*pebs_vaddr;
 	u64			pebs_enabled;
+	bool			pebs_stale;
 	int			n_pebs;
 	int			n_large_pebs;
 	int			n_pebs_via_pt;
-- 
2.53.0.959.g497ff81fa9-goog




Thread overview: 4+ messages
2026-03-20 15:27 [PATCH] perf/x86/intel: Don't update PEBS_ENABLE MSR in NMI context Jim Mattson
2026-03-20 18:58 ` Jim Mattson
2026-04-10  2:24   ` Mi, Dapeng
2026-04-10  2:58     ` Jim Mattson
