Linux Perf Users
* [PATCH v3 0/8] perf/x86/intel/uncore: PMU setup robustness fixes
From: Zide Chen @ 2026-06-11 16:00 UTC
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen

This series fixes correctness issues in Intel uncore PMU setup:

- If every init_box() call on a PMU fails, the PMU sysfs node may still
  exist, while perf events read zeros and silently report wrong data.
- If init_box() fails on only some dies, perf may return partial
  non-zero counts, which is harder to diagnose.
- CPU hotplug ref/unref ordering bugs can skip init_box() when the first
  CPU in a die comes online, and can call box_exit() prematurely when
  the second-to-last CPU goes offline.
- PCI PMU cleanup on setup failure can leak activeboxes counts and may
  dereference a NULL pointer in error paths.
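
For readers unfamiliar with the hotplug issue, the intended ordering can be
modeled in plain userspace C. This is only a toy sketch, not code from the
series: struct toy_box, toy_box_ref() and toy_box_unref() are hypothetical
names, and the real driver uses atomics and per-die box lookup. The point is
that init must run on the 0 -> 1 transition and exit only on the 1 -> 0
transition.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a per-die uncore box; not the kernel struct. */
struct toy_box {
	int refcnt;        /* online CPUs in the die that reference the box */
	bool initialized;  /* true between the init and exit steps */
	int init_calls;    /* how many times the init step ran */
	int exit_calls;    /* how many times the exit step ran */
};

/* A CPU in the die comes online: only the 0 -> 1 transition initializes. */
static void toy_box_ref(struct toy_box *box)
{
	if (box->refcnt++ == 0) {
		box->initialized = true;
		box->init_calls++;
	}
}

/* A CPU in the die goes offline: only the 1 -> 0 transition tears down. */
static void toy_box_unref(struct toy_box *box)
{
	assert(box->refcnt > 0);
	if (--box->refcnt == 0) {
		box->initialized = false;
		box->exit_calls++;
	}
}
```

With two CPUs in a die, the first toy_box_ref() runs the init step, and the
box stays initialized until the second toy_box_unref(), so taking the
second-to-last CPU offline does not tear it down.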

To address this, the series introduces a PMU broken state to track setup
failures and switches MSR/MMIO PMUs to lazy registration, matching
existing PCI behavior.
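
As a rough illustration of the idea (not the actual patch), a sticky broken
flag combined with lazy registration could look like the userspace sketch
below. All names (struct toy_pmu, TOY_PMU_REGISTERED, TOY_PMU_BROKEN,
toy_pmu_box_setup()) are made up for this example, and unregistering the PMU
as soon as a box fails is an assumption of the sketch, not something stated
above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits; the series defines its own PMU flags. */
#define TOY_PMU_REGISTERED (1u << 0)
#define TOY_PMU_BROKEN     (1u << 1)

struct toy_pmu {
	unsigned int flags;
	int activeboxes;   /* boxes that were set up successfully */
};

/*
 * Set up one box. init_ok models the result of the box's init step.
 * Returns 0 on success, -1 if the PMU is already broken or becomes broken.
 */
static int toy_pmu_box_setup(struct toy_pmu *pmu, bool init_ok)
{
	assert(pmu->activeboxes >= 0);

	/* Sticky: once broken, later setups are refused. */
	if (pmu->flags & TOY_PMU_BROKEN)
		return -1;

	if (!init_ok) {
		/* Assumption: a failed box hides the whole PMU. */
		pmu->flags |= TOY_PMU_BROKEN;
		pmu->flags &= ~TOY_PMU_REGISTERED;
		return -1;
	}

	/* Lazy registration: register when the first box becomes active. */
	if (pmu->activeboxes++ == 0)
		pmu->flags |= TOY_PMU_REGISTERED;
	return 0;
}
```

In this model a PMU is never visible before one box has been initialized, and
a later failure on another die removes it instead of leaving it to report
partial counts.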

To avoid merge conflicts, this series should be applied after:
https://lore.kernel.org/lkml/20260527151154.130505-1-zide.chen@intel.com/
(textual conflict, no logical dependency)

Only cosmetic changes in v3.

V3 changes:
- patch 2/8: Instead of removing atomic_inc(&box->refcnt) in PMU
  register, add the corresponding atomic_dec_return(&box->refcnt) in
  PMU unregister. (Dapeng)
- patch 6/8: Minor changes in code comments.
- patch 7/8: Minor changelog update. (Dapeng)
- Add Reviewed-by tags.

V2 changes:
- Add new patch 1 to fix PCI PMU cleanup issues (Sashiko)
- Keep pmu->activeboxes naming and semantics to avoid potential refcnt
  leaks in the uncore_pci_remove() path. To accomplish this, make the
  PMU broken flag sticky and decrement pmu->activeboxes on active box
  only.
- Update commit messages and changelogs accordingly.
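
The "decrement on active box only" rule can be pictured with a small
userspace model. This is a sketch under assumptions, not the driver code:
struct rm_pmu, struct rm_box and rm_box_remove() are hypothetical, and the
per-box active flag is simply one way to tell a counted box from one whose
setup failed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical box state: active means it was counted in activeboxes. */
struct rm_box {
	bool active;
};

/* Hypothetical PMU state for the removal path. */
struct rm_pmu {
	bool broken;       /* sticky; the removal path never clears it */
	int activeboxes;   /* number of boxes with active == true */
};

/*
 * Remove one box. Returns true when the last active box is gone, which is
 * where a caller would unregister the PMU. A box that never became active
 * leaves the count untouched.
 */
static bool rm_box_remove(struct rm_pmu *pmu, struct rm_box *box)
{
	if (!box->active)
		return false;

	box->active = false;
	assert(pmu->activeboxes > 0);
	return --pmu->activeboxes == 0;
}
```

Removing a box whose setup failed, or removing the same box twice, does not
change the count here, so the counter cannot underflow or drift when some
boxes failed to initialize.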

V2: https://lore.kernel.org/lkml/20260601170114.173359-1-zide.chen@intel.com/
V1: https://lore.kernel.org/lkml/20260512233048.9577-1-zide.chen@intel.com/
Sashiko's review: https://sashiko.dev/#/patchset/20260512233048.9577-1-zide.chen@intel.com

Zide Chen (8):
  perf/x86/intel/uncore: Fix PCI PMU cleanup on setup failure
  perf/x86/intel/uncore: Fix refcnt and other cleanups
  perf/x86/intel/uncore: Let init_box() callback report failures
  perf/x86/intel/uncore: Keep PCI PMUs working when MMIO/MSR setup fails
  perf/x86/intel/uncore: Factor out box setup code
  perf/x86/intel/uncore: Introduce PMU flags and broken state
  perf/x86/intel/uncore: Fix uncore_box ref/unref ordering
  perf/x86/intel/uncore: Implement lazy setup for MSR/MMIO PMUs

 arch/x86/events/intel/uncore.c           | 225 +++++++++++------------
 arch/x86/events/intel/uncore.h           |  39 +++-
 arch/x86/events/intel/uncore_discovery.c |  21 ++-
 arch/x86/events/intel/uncore_discovery.h |   6 +-
 arch/x86/events/intel/uncore_nhmex.c     |   3 +-
 arch/x86/events/intel/uncore_snb.c       |  82 ++++++---
 arch/x86/events/intel/uncore_snbep.c     |  77 +++++---
 7 files changed, 255 insertions(+), 198 deletions(-)

-- 
2.54.0



Thread overview: 15+ messages
2026-06-11 16:00 [PATCH v3 0/8] perf/x86/intel/uncore: PMU setup robustness fixes Zide Chen
2026-06-11 16:00 ` [PATCH V3 1/8] perf/x86/intel/uncore: Fix PCI PMU cleanup on setup failure Zide Chen
2026-06-11 16:26   ` sashiko-bot
2026-06-11 16:00 ` [PATCH V3 2/8] perf/x86/intel/uncore: Fix refcnt and other cleanups Zide Chen
2026-06-11 16:29   ` sashiko-bot
2026-06-11 16:00 ` [PATCH V3 3/8] perf/x86/intel/uncore: Let init_box() callback report failures Zide Chen
2026-06-11 16:38   ` sashiko-bot
2026-06-11 16:00 ` [PATCH V3 4/8] perf/x86/intel/uncore: Keep PCI PMUs working when MMIO/MSR setup fails Zide Chen
2026-06-11 16:00 ` [PATCH V3 5/8] perf/x86/intel/uncore: Factor out box setup code Zide Chen
2026-06-11 16:00 ` [PATCH V3 6/8] perf/x86/intel/uncore: Introduce PMU flags and broken state Zide Chen
2026-06-11 16:30   ` sashiko-bot
2026-06-11 16:00 ` [PATCH V3 7/8] perf/x86/intel/uncore: Fix uncore_box ref/unref ordering Zide Chen
2026-06-11 16:29   ` sashiko-bot
2026-06-11 16:00 ` [PATCH V3 8/8] perf/x86/intel/uncore: Implement lazy setup for MSR/MMIO PMUs Zide Chen
2026-06-11 16:33   ` sashiko-bot
