From: Zide Chen <zide.chen@intel.com>
To: Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Namhyung Kim <namhyung@kernel.org>,
Ian Rogers <irogers@google.com>,
Adrian Hunter <adrian.hunter@intel.com>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Andi Kleen <ak@linux.intel.com>,
Eranian Stephane <eranian@google.com>
Cc: linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
Dapeng Mi <dapeng1.mi@linux.intel.com>,
Zide Chen <zide.chen@intel.com>
Subject: [PATCH v3 0/8] perf/x86/intel/uncore: PMU setup robustness fixes
Date: Thu, 11 Jun 2026 09:00:25 -0700
Message-ID: <20260611160033.66760-1-zide.chen@intel.com>
This series fixes correctness issues in Intel uncore PMU setup:
- If every init_box() call on a PMU fails, the PMU sysfs node may still
exist, while perf events read zeros and silently report wrong data.
- If init_box() fails on only some dies, perf may return partial
non-zero counts, which is harder to diagnose.
- CPU hotplug ref/unref ordering bugs can skip init_box() when the first
CPU in a die comes online, and can call box_exit() prematurely when
the second-to-last CPU goes offline.
- PCI PMU cleanup on setup failure leaks activeboxes counts and can
dereference a NULL pointer in error paths.
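For readers unfamiliar with the ref/unref invariant behind the hotplug item, here is a minimal userspace sketch. It is not the kernel code: struct demo_box and every demo_* function are hypothetical stand-ins, and C11 atomics replace the kernel's atomic_t helpers. It only illustrates the intended ordering: the box is initialized on the 0 -> 1 transition and torn down on the 1 -> 0 transition, never earlier.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a per-die uncore box; not the kernel struct. */
struct demo_box {
	atomic_int refcnt;	/* online CPUs in the die referencing the box */
	bool initialized;	/* true while the box hardware is set up */
	int init_calls;		/* bookkeeping for the illustration only */
	int exit_calls;
};

static void demo_init_box(struct demo_box *box)
{
	box->initialized = true;
	box->init_calls++;
}

static void demo_exit_box(struct demo_box *box)
{
	box->initialized = false;
	box->exit_calls++;
}

/* CPU online: only the 0 -> 1 transition initializes the box. */
static void demo_box_ref(struct demo_box *box)
{
	if (atomic_fetch_add(&box->refcnt, 1) == 0)
		demo_init_box(box);
}

/* CPU offline: only the 1 -> 0 transition tears the box down. */
static void demo_box_unref(struct demo_box *box)
{
	int old = atomic_fetch_sub(&box->refcnt, 1);

	assert(old > 0);	/* an unbalanced unref would be a bug */
	if (old == 1)
		demo_exit_box(box);
}
```

With two CPUs in a die, the first demo_box_ref() runs the init path, the second does not, and the exit path runs only when the last CPU drops its reference.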
To address these issues, the series introduces a PMU broken state to track setup
failures and switches MSR/MMIO PMUs to lazy registration, matching
existing PCI behavior.
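As a rough illustration of how a sticky broken flag and lazy registration could interact, here is a userspace sketch under stated assumptions. DEMO_PMU_BROKEN, struct demo_pmu, struct demo_pmu_box and the demo_* functions are all hypothetical names, and the policy shown (hide the PMU once any box fails, register on the first good box) is one plausible reading of this cover letter, not the logic of the patches themselves.

```c
#include <errno.h>
#include <stdbool.h>

#define DEMO_PMU_BROKEN	(1u << 0)	/* hypothetical flag bit */

/* Hypothetical stand-ins; not the kernel structures. */
struct demo_pmu {
	unsigned int flags;
	int activeboxes;	/* boxes whose setup succeeded */
	bool registered;	/* stands in for a visible sysfs node */
};

struct demo_pmu_box {
	bool active;
};

/* Set up one box; init_ret mimics an init_box() result (0 or -errno). */
static int demo_box_setup(struct demo_pmu *pmu, struct demo_pmu_box *box,
			  int init_ret)
{
	if (init_ret) {
		/* Sticky: later successes never clear the flag. */
		pmu->flags |= DEMO_PMU_BROKEN;
		pmu->registered = false;
		return init_ret;
	}

	box->active = true;
	pmu->activeboxes++;

	/* Lazy registration: expose the PMU on the first good box only. */
	if (pmu->activeboxes == 1 && !(pmu->flags & DEMO_PMU_BROKEN))
		pmu->registered = true;
	return 0;
}

/* Tear down one box; a box that never became active leaves the count alone. */
static void demo_box_teardown(struct demo_pmu *pmu, struct demo_pmu_box *box)
{
	if (!box->active)
		return;

	box->active = false;
	if (--pmu->activeboxes == 0)
		pmu->registered = false;
}
```

Decrementing only for active boxes keeps the counter balanced even when some boxes failed setup, which is the kind of leak the cleanup fixes are concerned with.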
To avoid merge conflicts, this series should be applied after:
https://lore.kernel.org/lkml/20260527151154.130505-1-zide.chen@intel.com/
(textual conflict, no logical dependency)
Only cosmetic changes in v3.
V3 changes:
- patch 2/8: Instead of removing atomic_inc(&box->refcnt) in PMU
register, add the corresponding atomic_dec_return(&box->refcnt) in
PMU unregister. (Dapeng)
- patch 6/8: Minor changes in code comments.
- patch 7/8: Minor changelog update. (Dapeng)
- Add Reviewed-by tags.
V2 changes:
- Add new patch 1 to fix PCI PMU cleanup issues (Sashiko)
- Keep pmu->activeboxes naming and semantics to avoid potential refcnt
leaks in the uncore_pci_remove() path. To accomplish this, make the
PMU broken flag sticky and decrement pmu->activeboxes for active boxes
only.
- Update commit messages and changelogs accordingly.
V2: https://lore.kernel.org/lkml/20260601170114.173359-1-zide.chen@intel.com/
V1: https://lore.kernel.org/lkml/20260512233048.9577-1-zide.chen@intel.com/
Sashiko's review: https://sashiko.dev/#/patchset/20260512233048.9577-1-zide.chen@intel.com
Zide Chen (8):
perf/x86/intel/uncore: Fix PCI PMU cleanup on setup failure
perf/x86/intel/uncore: Fix refcnt and other cleanups
perf/x86/intel/uncore: Let init_box() callback report failures
perf/x86/intel/uncore: Keep PCI PMUs working when MMIO/MSR setup fails
perf/x86/intel/uncore: Factor out box setup code
perf/x86/intel/uncore: Introduce PMU flags and broken state
perf/x86/intel/uncore: Fix uncore_box ref/unref ordering
perf/x86/intel/uncore: Implement lazy setup for MSR/MMIO PMUs
arch/x86/events/intel/uncore.c | 225 +++++++++++------------
arch/x86/events/intel/uncore.h | 39 +++-
arch/x86/events/intel/uncore_discovery.c | 21 ++-
arch/x86/events/intel/uncore_discovery.h | 6 +-
arch/x86/events/intel/uncore_nhmex.c | 3 +-
arch/x86/events/intel/uncore_snb.c | 82 ++++++---
arch/x86/events/intel/uncore_snbep.c | 77 +++++---
7 files changed, 255 insertions(+), 198 deletions(-)
--
2.54.0