* [PATCH V3 1/8] perf/x86/intel/uncore: Fix PCI PMU cleanup on setup failure
2026-06-11 16:00 [PATCH v3 0/8] perf/x86/intel/uncore: PMU setup robustness fixes Zide Chen
@ 2026-06-11 16:00 ` Zide Chen
2026-06-11 16:26 ` sashiko-bot
2026-06-11 16:00 ` [PATCH V3 2/8] perf/x86/intel/uncore: Fix refcnt and other cleanups Zide Chen
` (6 subsequent siblings)
7 siblings, 1 reply; 15+ messages in thread
From: Zide Chen @ 2026-06-11 16:00 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen
When uncore_pci_pmu_register() fails, pmu->boxes[die] is set to NULL
before returning. In the uncore_pci_remove() path, this causes
uncore_pci_pmu_unregister() to be skipped entirely, leaking
pmu->activeboxes. In the uncore_bus_notify() path,
uncore_pci_pmu_unregister() may still be called and must exit early
when pmu->boxes[die] is NULL to avoid a NULL pointer dereference, and
to ensure activeboxes is only decremented for a previously active box.
Additionally, since pci_get_drvdata() returns NULL on registration
failure, uncore_pci_remove() can no longer treat NULL drvdata as an
indicator of an auxiliary PCI device. Remove the associated
WARN_ON_ONCE().
Link: https://sashiko.dev/#/patchset/20260512233048.9577-1-zide.chen@intel.com?part=1
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
---
v3: Add Reviewed-by tag.
v2: New patch.
---
arch/x86/events/intel/uncore.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index 7857959c6e82..b69b6a21d46b 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -1183,6 +1183,7 @@ static int uncore_pci_pmu_register(struct pci_dev *pdev,
/* First active box registers the pmu */
ret = uncore_pmu_register(pmu);
if (ret) {
+ atomic_dec(&pmu->activeboxes);
pmu->boxes[die] = NULL;
uncore_box_exit(box);
kfree(box);
@@ -1248,6 +1249,9 @@ static void uncore_pci_pmu_unregister(struct intel_uncore_pmu *pmu, int die)
{
struct intel_uncore_box *box = pmu->boxes[die];
+ if (!box)
+ return;
+
pmu->boxes[die] = NULL;
if (atomic_dec_return(&pmu->activeboxes) == 0)
uncore_pmu_unregister(pmu);
@@ -1272,7 +1276,6 @@ static void uncore_pci_remove(struct pci_dev *pdev)
break;
}
}
- WARN_ON_ONCE(i >= UNCORE_EXTRA_PCI_DEV_MAX);
return;
}
--
2.54.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* Re: [PATCH V3 1/8] perf/x86/intel/uncore: Fix PCI PMU cleanup on setup failure
2026-06-11 16:00 ` [PATCH V3 1/8] perf/x86/intel/uncore: Fix PCI PMU cleanup on setup failure Zide Chen
@ 2026-06-11 16:26 ` sashiko-bot
0 siblings, 0 replies; 15+ messages in thread
From: sashiko-bot @ 2026-06-11 16:26 UTC (permalink / raw)
To: Zide Chen; +Cc: linux-perf-users
Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
New issues:
- [Medium] Unjustified removal of `WARN_ON_ONCE` safety check based on a misunderstanding of the Linux device model.
Pre-existing issues:
- [High] Active perf events will dereference a freed `intel_uncore_box` when a PCI device is forcefully unbound.
- [High] Lockless update of `pmu->boxes[die]` in PCI probe/remove paths races with CPU hotplug callbacks, leading to Use-After-Free.
--
commit 1ebe3c291f253a0ba73f6277fd61a8e1138e1725
Author: Zide Chen <zide.chen@intel.com>
perf/x86/intel/uncore: Fix PCI PMU cleanup on setup failure
This commit attempts to fix a cleanup issue when uncore_pci_pmu_register()
fails by explicitly managing pmu->boxes[die] and activeboxes. It also removes
a WARN_ON_ONCE() under the assumption that a NULL driver data pointer could
result from a registration failure during probe.
Link: https://sashiko.dev/#/patchset/20260512233048.9577-1-zide.chen@intel.com?part=1
> diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
> index 7857959c6e823..b69b6a21d46bf 100644
> --- a/arch/x86/events/intel/uncore.c
> +++ b/arch/x86/events/intel/uncore.c
[ ... ]
> @@ -1248,6 +1249,9 @@ static void uncore_pci_pmu_unregister(struct intel_uncore_pmu *pmu, int die)
> {
> struct intel_uncore_box *box = pmu->boxes[die];
>
> + if (!box)
> + return;
> +
> pmu->boxes[die] = NULL;
> if (atomic_dec_return(&pmu->activeboxes) == 0)
> uncore_pmu_unregister(pmu);
[Severity: High]
This is a pre-existing issue, but when a PCI device is forcefully unbound,
does unconditionally freeing the box memory here lead to a use-after-free if
there are active perf events?
If active perf events are currently open for this PMU, the perf core retains
them. Subsequent perf operations will dereference the freed memory:
uncore_pmu_event_read() {
struct intel_uncore_box *box = uncore_event_to_box(event);
uncore_perf_event_update(box, event);
...
}
[Severity: High]
This is also a pre-existing issue, but is there a race condition between
setting pmu->boxes[die] to NULL here and concurrent CPU hotplug callbacks?
CPU hotplug callbacks run under cpus_read_lock(), but PCI bind/unbind
operations can execute concurrently without it. If a CPU hotplug event reads
the box right before it is cleared:
CPU1
uncore_change_type_ctx()
box = pmu->boxes[die];
CPU2 would then execute uncore_pci_pmu_unregister():
pmu->boxes[die] = NULL;
...
kfree(box);
Would CPU1 then dereference the freed box memory when it subsequently accesses
box->cpu?
[ ... ]
> @@ -1272,7 +1276,6 @@ static void uncore_pci_remove(struct pci_dev *pdev)
> break;
> }
> }
> - WARN_ON_ONCE(i >= UNCORE_EXTRA_PCI_DEV_MAX);
> return;
> }
[Severity: Medium]
The commit message states that since pci_get_drvdata() returns NULL on
registration failure, uncore_pci_remove() can no longer treat NULL drvdata as
an indicator of an auxiliary PCI device. Is this assumption accurate?
If PMU registration fails during uncore_pci_probe(), the probe function
returns an error code. The PCI driver core responds by aborting the bind
process and will never invoke the driver's remove callback for that device.
Because uncore_pci_remove() is never called as a result of a registration
failure, does removing this check silently mask bugs where a successfully
bound primary device unexpectedly loses its driver data?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260611160033.66760-1-zide.chen@intel.com?part=1
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH V3 2/8] perf/x86/intel/uncore: Fix refcnt and other cleanups
2026-06-11 16:00 [PATCH v3 0/8] perf/x86/intel/uncore: PMU setup robustness fixes Zide Chen
2026-06-11 16:00 ` [PATCH V3 1/8] perf/x86/intel/uncore: Fix PCI PMU cleanup on setup failure Zide Chen
@ 2026-06-11 16:00 ` Zide Chen
2026-06-11 16:29 ` sashiko-bot
2026-06-11 16:00 ` [PATCH V3 3/8] perf/x86/intel/uncore: Let init_box() callback report failures Zide Chen
` (5 subsequent siblings)
7 siblings, 1 reply; 15+ messages in thread
From: Zide Chen @ 2026-06-11 16:00 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen
Fix typo UNCORE_BOX_FLAG_INITIATED to UNCORE_BOX_FLAG_INITIALIZED.
Rename the 'id' parameter in uncore_box_{ref,unref}() to 'die' to
reflect its actual meaning and be consistent with other functions.
box->refcnt is incremented in the PCI PMU register path but has never
been checked or decremented. Although for PCI PMUs box->refcnt
effectively tracks only a single user, add atomic_dec_return() in the
PCI PMU unregister path to make the reference counting complete and
consistent.
Signed-off-by: Zide Chen <zide.chen@intel.com>
---
v3:
- Instead of removing atomic_inc(&box->refcnt) in PMU register, add
the corresponding atomic_dec_return(&box->refcnt) in PMU unregister.
(Dapeng)
v2:
- Don't rename pmu->activeboxes and keep its semantics because in
uncore_pci_remove() path, uncore_pci_pmu_unregister() won't be
called for non-active boxes.
- Since pmu->activeboxes keeps its name, don't need to rename
box->refcnt to box->cpu_refcnt.
---
arch/x86/events/intel/uncore.c | 16 +++++++++-------
arch/x86/events/intel/uncore.h | 6 +++---
2 files changed, 12 insertions(+), 10 deletions(-)
diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index b69b6a21d46b..21c8ed1628cb 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -1255,8 +1255,10 @@ static void uncore_pci_pmu_unregister(struct intel_uncore_pmu *pmu, int die)
pmu->boxes[die] = NULL;
if (atomic_dec_return(&pmu->activeboxes) == 0)
uncore_pmu_unregister(pmu);
- uncore_box_exit(box);
- kfree(box);
+ if (atomic_dec_return(&box->refcnt) == 0) {
+ uncore_box_exit(box);
+ kfree(box);
+ }
}
static void uncore_pci_remove(struct pci_dev *pdev)
@@ -1518,7 +1520,7 @@ static void uncore_change_context(struct intel_uncore_type **uncores,
uncore_change_type_ctx(*uncores, old_cpu, new_cpu);
}
-static void uncore_box_unref(struct intel_uncore_type **types, int id)
+static void uncore_box_unref(struct intel_uncore_type **types, int die)
{
struct intel_uncore_type *type;
struct intel_uncore_pmu *pmu;
@@ -1529,7 +1531,7 @@ static void uncore_box_unref(struct intel_uncore_type **types, int id)
type = *types;
pmu = type->pmus;
for (i = 0; i < type->num_boxes; i++, pmu++) {
- box = pmu->boxes[id];
+ box = pmu->boxes[die];
if (box && box->cpu >= 0 && atomic_dec_return(&box->refcnt) == 0)
uncore_box_exit(box);
}
@@ -1604,14 +1606,14 @@ static int allocate_boxes(struct intel_uncore_type **types,
}
static int uncore_box_ref(struct intel_uncore_type **types,
- int id, unsigned int cpu)
+ int die, unsigned int cpu)
{
struct intel_uncore_type *type;
struct intel_uncore_pmu *pmu;
struct intel_uncore_box *box;
int i, ret;
- ret = allocate_boxes(types, id, cpu);
+ ret = allocate_boxes(types, die, cpu);
if (ret)
return ret;
@@ -1619,7 +1621,7 @@ static int uncore_box_ref(struct intel_uncore_type **types,
type = *types;
pmu = type->pmus;
for (i = 0; i < type->num_boxes; i++, pmu++) {
- box = pmu->boxes[id];
+ box = pmu->boxes[die];
if (box && box->cpu >= 0 && atomic_inc_return(&box->refcnt) == 1)
uncore_box_init(box);
}
diff --git a/arch/x86/events/intel/uncore.h b/arch/x86/events/intel/uncore.h
index c2e5ccb1d72c..bad5d8dec8e0 100644
--- a/arch/x86/events/intel/uncore.h
+++ b/arch/x86/events/intel/uncore.h
@@ -185,7 +185,7 @@ struct intel_uncore_box {
#define CFL_UNC_CBO_7_PERFEVTSEL0 0xf70
#define CFL_UNC_CBO_7_PER_CTR0 0xf76
-#define UNCORE_BOX_FLAG_INITIATED 0
+#define UNCORE_BOX_FLAG_INITIALIZED 0
/* event config registers are 8-byte apart */
#define UNCORE_BOX_FLAG_CTL_OFFS8 1
/* CFL 8th CBOX has different MSR space */
@@ -559,7 +559,7 @@ static inline u64 uncore_read_counter(struct intel_uncore_box *box,
static inline void uncore_box_init(struct intel_uncore_box *box)
{
- if (!test_and_set_bit(UNCORE_BOX_FLAG_INITIATED, &box->flags)) {
+ if (!test_and_set_bit(UNCORE_BOX_FLAG_INITIALIZED, &box->flags)) {
if (box->pmu->type->ops->init_box)
box->pmu->type->ops->init_box(box);
}
@@ -567,7 +567,7 @@ static inline void uncore_box_init(struct intel_uncore_box *box)
static inline void uncore_box_exit(struct intel_uncore_box *box)
{
- if (test_and_clear_bit(UNCORE_BOX_FLAG_INITIATED, &box->flags)) {
+ if (test_and_clear_bit(UNCORE_BOX_FLAG_INITIALIZED, &box->flags)) {
if (box->pmu->type->ops->exit_box)
box->pmu->type->ops->exit_box(box);
}
--
2.54.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* Re: [PATCH V3 2/8] perf/x86/intel/uncore: Fix refcnt and other cleanups
2026-06-11 16:00 ` [PATCH V3 2/8] perf/x86/intel/uncore: Fix refcnt and other cleanups Zide Chen
@ 2026-06-11 16:29 ` sashiko-bot
0 siblings, 0 replies; 15+ messages in thread
From: sashiko-bot @ 2026-06-11 16:29 UTC (permalink / raw)
To: Zide Chen; +Cc: linux-perf-users
Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
Pre-existing issues:
- [High] Use-After-Free of uncore_box via active perf events and hrtimers after PCI PMU unregistration
- [High] Lockless race condition on `activeboxes` leading to concurrent perf PMU register/unregister
- [High] Missing memory barrier when publishing `box` allowing readers to observe partially initialized memory
- [High] Use-After-Free race between CPU hotplug context migration and PCI PMU unregistration
--
commit 372caa30bbc8374ec4e96cf8e02d24ffa69fe538
Author: Zide Chen <zide.chen@intel.com>
Subject: perf/x86/intel/uncore: Fix refcnt and other cleanups
This commit fixes a typo for the initialized flag and renames the 'id'
parameter to 'die' in uncore_box_{ref,unref}. It also adds reference
counting decrement logic in the PCI PMU unregister path to properly
clean up the uncore box.
> diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
> index b69b6a21d46bf..21c8ed1628cb2 100644
> --- a/arch/x86/events/intel/uncore.c
> +++ b/arch/x86/events/intel/uncore.c
> @@ -1255,8 +1255,10 @@ static void uncore_pci_pmu_unregister(struct intel_uncore_pmu *pmu, int die)
> pmu->boxes[die] = NULL;
> if (atomic_dec_return(&pmu->activeboxes) == 0)
> uncore_pmu_unregister(pmu);
[Severity: High]
This is a pre-existing issue, but does this lockless decrement of activeboxes
create a race condition with concurrent PMU registrations?
If Thread A (unregistering die 0) and Thread B (registering die 1) run
concurrently, Thread A might decrement activeboxes from 1 to 0 here, while
Thread B increments it from 0 to 1 in uncore_pci_pmu_register().
Both threads could then concurrently invoke uncore_pmu_unregister(pmu) and
perf_pmu_register(&pmu->pmu) on the exact same struct pmu object, potentially
corrupting the internal perf core PMU lists.
> - uncore_box_exit(box);
> - kfree(box);
> + if (atomic_dec_return(&box->refcnt) == 0) {
> + uncore_box_exit(box);
> + kfree(box);
> + }
[Severity: High]
This isn't a bug introduced by this patch, but does freeing the box here
leave dangling pointers in active perf events or hrtimers?
The perf core doesn't actively detach or destroy open events when the
underlying device is removed. If other dies still have this PMU active,
uncore_pmu_unregister(pmu) is bypassed.
Any armed hrtimer firing asynchronously in uncore_pmu_hrtimer() or userspace
reads invoking uncore_pmu_event_read() would then dereference the freed box
memory, leading to a use-after-free.
[Severity: High]
This is a pre-existing issue, but could this kfree(box) also race with CPU
hotplug context migration?
During CPU hotplug, uncore_change_type_ctx() iterates over pmu->boxes[die]
to migrate perf contexts. It fetches the box pointer and subsequently
accesses it (e.g., reading box->cpu or calling
uncore_pmu_cancel_hrtimer(box)).
If a PCI device removal executes uncore_pci_pmu_unregister() concurrently,
it nullifies pmu->boxes[die] and frees the box here.
If the hotplug thread reads the pointer before it is set to NULL but accesses
it after the memory is freed, it could result in a use-after-free.
[Severity: High]
This isn't a bug introduced by this patch, but is there a missing memory
barrier when publishing the box pointer during allocation?
In uncore_pci_pmu_register(), a newly allocated box is published to
pmu->boxes[die] using a plain store, lacking an smp_store_release().
Concurrently, a user opening a perf event triggers uncore_pmu_event_init(),
which looks up the box via uncore_pmu_to_box() using a plain load. Due to
CPU reordering, the reader may observe the non-NULL box pointer but read a
stale, uninitialized memory state (such as box->cpu == 0 instead of -1).
This would bypass the if (!box || box->cpu < 0) safety check and allow
the perf core to operate on partially initialized memory.
> }
[ ... ]
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260611160033.66760-1-zide.chen@intel.com?part=2
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH V3 3/8] perf/x86/intel/uncore: Let init_box() callback report failures
2026-06-11 16:00 [PATCH v3 0/8] perf/x86/intel/uncore: PMU setup robustness fixes Zide Chen
2026-06-11 16:00 ` [PATCH V3 1/8] perf/x86/intel/uncore: Fix PCI PMU cleanup on setup failure Zide Chen
2026-06-11 16:00 ` [PATCH V3 2/8] perf/x86/intel/uncore: Fix refcnt and other cleanups Zide Chen
@ 2026-06-11 16:00 ` Zide Chen
2026-06-11 16:38 ` sashiko-bot
2026-06-11 16:00 ` [PATCH V3 4/8] perf/x86/intel/uncore: Keep PCI PMUs working when MMIO/MSR setup fails Zide Chen
` (4 subsequent siblings)
7 siblings, 1 reply; 15+ messages in thread
From: Zide Chen @ 2026-06-11 16:00 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen
The init_box() callback currently returns void, so initialization
failures are silently ignored and the box is still marked initialized.
Change the callback to return int so platform code can report errors
back to the common uncore layer.
Update uncore_box_init() to set the initialized flag only when
init_box() succeeds. Because box->refcnt guarantees that at most
one CPU calls uncore_box_init() for a given box at a time, plain
__set_bit() is safe for the initialized flag without atomic overhead.
Convert all init_box() implementations to return 0 on success or a
negative error code on failure. This is a prerequisite for propagating
initialization errors to the caller so they can be handled properly.
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
---
v3: Add Reviewed-by tag.
---
arch/x86/events/intel/uncore.h | 16 +++--
arch/x86/events/intel/uncore_discovery.c | 21 ++++---
arch/x86/events/intel/uncore_discovery.h | 6 +-
arch/x86/events/intel/uncore_nhmex.c | 3 +-
arch/x86/events/intel/uncore_snb.c | 80 +++++++++++++++---------
arch/x86/events/intel/uncore_snbep.c | 77 ++++++++++++++---------
6 files changed, 126 insertions(+), 77 deletions(-)
diff --git a/arch/x86/events/intel/uncore.h b/arch/x86/events/intel/uncore.h
index bad5d8dec8e0..d732b87be0a9 100644
--- a/arch/x86/events/intel/uncore.h
+++ b/arch/x86/events/intel/uncore.h
@@ -129,7 +129,7 @@ struct intel_uncore_type {
#define events_group attr_groups[2]
struct intel_uncore_ops {
- void (*init_box)(struct intel_uncore_box *);
+ int (*init_box)(struct intel_uncore_box *);
void (*exit_box)(struct intel_uncore_box *);
void (*disable_box)(struct intel_uncore_box *);
void (*enable_box)(struct intel_uncore_box *);
@@ -557,12 +557,18 @@ static inline u64 uncore_read_counter(struct intel_uncore_box *box,
return box->pmu->type->ops->read_counter(box, event);
}
-static inline void uncore_box_init(struct intel_uncore_box *box)
+static inline int uncore_box_init(struct intel_uncore_box *box)
{
- if (!test_and_set_bit(UNCORE_BOX_FLAG_INITIALIZED, &box->flags)) {
- if (box->pmu->type->ops->init_box)
- box->pmu->type->ops->init_box(box);
+ int ret = 0;
+
+ if (!test_bit(UNCORE_BOX_FLAG_INITIALIZED, &box->flags) &&
+ box->pmu->type->ops->init_box) {
+ ret = box->pmu->type->ops->init_box(box);
+ if (!ret)
+ __set_bit(UNCORE_BOX_FLAG_INITIALIZED, &box->flags);
}
+
+ return ret;
}
static inline void uncore_box_exit(struct intel_uncore_box *box)
diff --git a/arch/x86/events/intel/uncore_discovery.c b/arch/x86/events/intel/uncore_discovery.c
index e50776222256..0a22edf4d509 100644
--- a/arch/x86/events/intel/uncore_discovery.c
+++ b/arch/x86/events/intel/uncore_discovery.c
@@ -489,14 +489,15 @@ static u64 intel_generic_uncore_box_ctl(struct intel_uncore_box *box)
return unit->addr;
}
-void intel_generic_uncore_msr_init_box(struct intel_uncore_box *box)
+int intel_generic_uncore_msr_init_box(struct intel_uncore_box *box)
{
u64 box_ctl = intel_generic_uncore_box_ctl(box);
if (!box_ctl)
- return;
+ return -ENODEV;
wrmsrq(box_ctl, GENERIC_PMON_BOX_CTL_INT);
+ return 0;
}
void intel_generic_uncore_msr_disable_box(struct intel_uncore_box *box)
@@ -578,15 +579,16 @@ static inline int intel_pci_uncore_box_ctl(struct intel_uncore_box *box)
return UNCORE_DISCOVERY_PCI_BOX_CTRL(intel_generic_uncore_box_ctl(box));
}
-void intel_generic_uncore_pci_init_box(struct intel_uncore_box *box)
+int intel_generic_uncore_pci_init_box(struct intel_uncore_box *box)
{
int box_ctl = intel_pci_uncore_box_ctl(box);
if (!box_ctl)
- return;
+ return -ENODEV;
__set_bit(UNCORE_BOX_FLAG_CTL_OFFS8, &box->flags);
- pci_write_config_dword(box->pci_dev, box_ctl, GENERIC_PMON_BOX_CTL_INT);
+ return pci_write_config_dword(box->pci_dev, box_ctl,
+ GENERIC_PMON_BOX_CTL_INT);
}
void intel_generic_uncore_pci_disable_box(struct intel_uncore_box *box)
@@ -648,7 +650,7 @@ static struct intel_uncore_ops generic_uncore_pci_ops = {
#define UNCORE_GENERIC_MMIO_SIZE 0x4000
-void intel_generic_uncore_mmio_init_box(struct intel_uncore_box *box)
+int intel_generic_uncore_mmio_init_box(struct intel_uncore_box *box)
{
static struct intel_uncore_discovery_unit *unit;
struct intel_uncore_type *type = box->pmu->type;
@@ -658,13 +660,13 @@ void intel_generic_uncore_mmio_init_box(struct intel_uncore_box *box)
if (!unit) {
pr_warn("Uncore type %d id %d: Cannot find box control address.\n",
type->type_id, box->pmu->pmu_idx);
- return;
+ return -ENODEV;
}
if (!unit->addr) {
pr_warn("Uncore type %d box %d: Invalid box control address.\n",
type->type_id, unit->id);
- return;
+ return -ENODEV;
}
addr = unit->addr;
@@ -672,10 +674,11 @@ void intel_generic_uncore_mmio_init_box(struct intel_uncore_box *box)
if (!box->io_addr) {
pr_warn("Uncore type %d box %d: ioremap error for 0x%llx.\n",
type->type_id, unit->id, (unsigned long long)addr);
- return;
+ return -ENOMEM;
}
writel(GENERIC_PMON_BOX_CTL_INT, box->io_addr);
+ return 0;
}
void intel_generic_uncore_mmio_disable_box(struct intel_uncore_box *box)
diff --git a/arch/x86/events/intel/uncore_discovery.h b/arch/x86/events/intel/uncore_discovery.h
index e1330342b92e..142e1b56cfc2 100644
--- a/arch/x86/events/intel/uncore_discovery.h
+++ b/arch/x86/events/intel/uncore_discovery.h
@@ -148,11 +148,11 @@ void intel_uncore_generic_uncore_cpu_init(void);
int intel_uncore_generic_uncore_pci_init(void);
void intel_uncore_generic_uncore_mmio_init(void);
-void intel_generic_uncore_msr_init_box(struct intel_uncore_box *box);
+int intel_generic_uncore_msr_init_box(struct intel_uncore_box *box);
void intel_generic_uncore_msr_disable_box(struct intel_uncore_box *box);
void intel_generic_uncore_msr_enable_box(struct intel_uncore_box *box);
-void intel_generic_uncore_mmio_init_box(struct intel_uncore_box *box);
+int intel_generic_uncore_mmio_init_box(struct intel_uncore_box *box);
void intel_generic_uncore_mmio_disable_box(struct intel_uncore_box *box);
void intel_generic_uncore_mmio_enable_box(struct intel_uncore_box *box);
void intel_generic_uncore_mmio_disable_event(struct intel_uncore_box *box,
@@ -160,7 +160,7 @@ void intel_generic_uncore_mmio_disable_event(struct intel_uncore_box *box,
void intel_generic_uncore_mmio_enable_event(struct intel_uncore_box *box,
struct perf_event *event);
-void intel_generic_uncore_pci_init_box(struct intel_uncore_box *box);
+int intel_generic_uncore_pci_init_box(struct intel_uncore_box *box);
void intel_generic_uncore_pci_disable_box(struct intel_uncore_box *box);
void intel_generic_uncore_pci_enable_box(struct intel_uncore_box *box);
void intel_generic_uncore_pci_disable_event(struct intel_uncore_box *box,
diff --git a/arch/x86/events/intel/uncore_nhmex.c b/arch/x86/events/intel/uncore_nhmex.c
index 8962e7cb21e3..7a6855281102 100644
--- a/arch/x86/events/intel/uncore_nhmex.c
+++ b/arch/x86/events/intel/uncore_nhmex.c
@@ -199,9 +199,10 @@ DEFINE_UNCORE_FORMAT_ATTR(counter, counter, "config:6-7");
DEFINE_UNCORE_FORMAT_ATTR(match, match, "config1:0-63");
DEFINE_UNCORE_FORMAT_ATTR(mask, mask, "config2:0-63");
-static void nhmex_uncore_msr_init_box(struct intel_uncore_box *box)
+static int nhmex_uncore_msr_init_box(struct intel_uncore_box *box)
{
wrmsrq(NHMEX_U_MSR_PMON_GLOBAL_CTL, NHMEX_U_PMON_GLOBAL_EN_ALL);
+ return 0;
}
static void nhmex_uncore_msr_exit_box(struct intel_uncore_box *box)
diff --git a/arch/x86/events/intel/uncore_snb.c b/arch/x86/events/intel/uncore_snb.c
index edddd4f9ab5f..c5347920541c 100644
--- a/arch/x86/events/intel/uncore_snb.c
+++ b/arch/x86/events/intel/uncore_snb.c
@@ -295,12 +295,14 @@ static void snb_uncore_msr_disable_event(struct intel_uncore_box *box, struct pe
wrmsrq(event->hw.config_base, 0);
}
-static void snb_uncore_msr_init_box(struct intel_uncore_box *box)
+static int snb_uncore_msr_init_box(struct intel_uncore_box *box)
{
if (box->pmu->pmu_idx == 0) {
wrmsrq(SNB_UNC_PERF_GLOBAL_CTL,
SNB_UNC_GLOBAL_CTL_EN | SNB_UNC_GLOBAL_CTL_CORE_ALL);
}
+
+ return 0;
}
static void snb_uncore_msr_enable_box(struct intel_uncore_box *box)
@@ -394,7 +396,7 @@ void snb_uncore_cpu_init(void)
snb_uncore_cbox.num_boxes = topology_num_cores_per_package();
}
-static void skl_uncore_msr_init_box(struct intel_uncore_box *box)
+static int skl_uncore_msr_init_box(struct intel_uncore_box *box)
{
if (box->pmu->pmu_idx == 0) {
wrmsrq(SKL_UNC_PERF_GLOBAL_CTL,
@@ -404,6 +406,8 @@ static void skl_uncore_msr_init_box(struct intel_uncore_box *box)
/* The 8th CBOX has different MSR space */
if (box->pmu->pmu_idx == 7)
__set_bit(UNCORE_BOX_FLAG_CFL8_CBOX_MSR_OFFS, &box->flags);
+
+ return 0;
}
static void skl_uncore_msr_enable_box(struct intel_uncore_box *box)
@@ -547,10 +551,12 @@ static struct intel_uncore_type *tgl_msr_uncores[] = {
NULL,
};
-static void rkl_uncore_msr_init_box(struct intel_uncore_box *box)
+static int rkl_uncore_msr_init_box(struct intel_uncore_box *box)
{
if (box->pmu->pmu_idx == 0)
wrmsrq(SKL_UNC_PERF_GLOBAL_CTL, SNB_UNC_GLOBAL_CTL_EN);
+
+ return 0;
}
void tgl_uncore_cpu_init(void)
@@ -707,9 +713,10 @@ static struct intel_uncore_type mtl_uncore_hac_cbox = {
.format_group = &adl_uncore_format_group,
};
-static void mtl_uncore_msr_init_box(struct intel_uncore_box *box)
+static int mtl_uncore_msr_init_box(struct intel_uncore_box *box)
{
wrmsrq(uncore_msr_box_ctl(box), SNB_UNC_GLOBAL_CTL_EN);
+ return 0;
}
static struct intel_uncore_ops mtl_uncore_msr_ops = {
@@ -773,10 +780,12 @@ static struct intel_uncore_type *lnl_msr_uncores[] = {
#define LNL_UNC_MSR_GLOBAL_CTL 0x240e
-static void lnl_uncore_msr_init_box(struct intel_uncore_box *box)
+static int lnl_uncore_msr_init_box(struct intel_uncore_box *box)
{
if (box->pmu->pmu_idx == 0)
wrmsrq(LNL_UNC_MSR_GLOBAL_CTL, SNB_UNC_GLOBAL_CTL_EN);
+
+ return 0;
}
static struct intel_uncore_ops lnl_uncore_msr_ops = {
@@ -874,7 +883,7 @@ static const struct attribute_group snb_uncore_imc_format_group = {
.attrs = snb_uncore_imc_formats_attr,
};
-static void snb_uncore_imc_init_box(struct intel_uncore_box *box)
+static int snb_uncore_imc_init_box(struct intel_uncore_box *box)
{
struct intel_uncore_type *type = box->pmu->type;
struct pci_dev *pdev = box->pci_dev;
@@ -893,10 +902,13 @@ static void snb_uncore_imc_init_box(struct intel_uncore_box *box)
addr &= ~(PAGE_SIZE - 1);
box->io_addr = ioremap(addr, type->mmio_map_size);
- if (!box->io_addr)
+ if (!box->io_addr) {
pr_warn("perf uncore: Failed to ioremap for %s.\n", type->name);
+ return -ENOMEM;
+ }
box->hrtimer_duration = UNCORE_SNB_IMC_HRTIMER_INTERVAL;
+ return 0;
}
static void snb_uncore_imc_enable_box(struct intel_uncore_box *box)
@@ -1532,7 +1544,7 @@ static struct pci_dev *tgl_uncore_get_mc_dev(void)
#define TGL_UNCORE_MMIO_IMC_MEM_OFFSET 0x10000
#define TGL_UNCORE_PCI_IMC_MAP_SIZE 0xe000
-static void
+static int
uncore_get_box_mmio_addr(struct intel_uncore_box *box,
unsigned int base_offset,
int bar_offset, int step)
@@ -1541,19 +1553,20 @@ uncore_get_box_mmio_addr(struct intel_uncore_box *box,
struct intel_uncore_pmu *pmu = box->pmu;
struct intel_uncore_type *type = pmu->type;
resource_size_t addr;
+ int ret = 0;
u32 bar;
if (!pdev) {
pr_warn("perf uncore: Cannot find matched IMC device.\n");
- return;
+ return -ENODEV;
}
pci_read_config_dword(pdev, bar_offset, &bar);
if (!(bar & BIT(0))) {
pr_warn("perf uncore: BAR 0x%x is disabled. Failed to map %s counters.\n",
bar_offset, type->name);
- pci_dev_put(pdev);
- return;
+ ret = -ENODEV;
+ goto out;
}
bar &= ~BIT(0);
addr = (resource_size_t)(bar + step * pmu->pmu_idx);
@@ -1565,23 +1578,26 @@ uncore_get_box_mmio_addr(struct intel_uncore_box *box,
addr += base_offset;
box->io_addr = ioremap(addr, type->mmio_map_size);
- if (!box->io_addr)
+ if (!box->io_addr) {
+ ret = -ENOMEM;
pr_warn("perf uncore: Failed to ioremap for %s.\n", type->name);
-
+ }
+out:
pci_dev_put(pdev);
+ return ret;
}
-static void __uncore_imc_init_box(struct intel_uncore_box *box,
+static int __uncore_imc_init_box(struct intel_uncore_box *box,
unsigned int base_offset)
{
- uncore_get_box_mmio_addr(box, base_offset,
+ return uncore_get_box_mmio_addr(box, base_offset,
SNB_UNCORE_PCI_IMC_BAR_OFFSET,
TGL_UNCORE_MMIO_IMC_MEM_OFFSET);
}
-static void tgl_uncore_imc_freerunning_init_box(struct intel_uncore_box *box)
+static int tgl_uncore_imc_freerunning_init_box(struct intel_uncore_box *box)
{
- __uncore_imc_init_box(box, 0);
+ return __uncore_imc_init_box(box, 0);
}
static struct intel_uncore_ops tgl_uncore_imc_freerunning_ops = {
@@ -1648,13 +1664,15 @@ void tgl_uncore_mmio_init(void)
#define ADL_UNCORE_IMC_CTL_INT (ADL_UNCORE_IMC_CTL_RST_CTRL | \
ADL_UNCORE_IMC_CTL_RST_CTRS)
-static void adl_uncore_imc_init_box(struct intel_uncore_box *box)
+static int adl_uncore_imc_init_box(struct intel_uncore_box *box)
{
- __uncore_imc_init_box(box, ADL_UNCORE_IMC_BASE);
+ int ret = __uncore_imc_init_box(box, ADL_UNCORE_IMC_BASE);
/* The global control in MC1 can control both MCs. */
- if (box->io_addr && (box->pmu->pmu_idx == 1))
+ if (!ret && (box->pmu->pmu_idx == 1))
writel(ADL_UNCORE_IMC_CTL_INT, box->io_addr + ADL_UNCORE_IMC_GLOBAL_CTL);
+
+ return ret;
}
static void adl_uncore_mmio_disable_box(struct intel_uncore_box *box)
@@ -1731,9 +1749,9 @@ static struct freerunning_counters adl_uncore_imc_freerunning[] = {
[ADL_MMIO_UNCORE_IMC_DATA_WRITE] = { 0xA0, 0x0, 0x0, 1, 64 },
};
-static void adl_uncore_imc_freerunning_init_box(struct intel_uncore_box *box)
+static int adl_uncore_imc_freerunning_init_box(struct intel_uncore_box *box)
{
- __uncore_imc_init_box(box, ADL_UNCORE_IMC_FREERUNNING_BASE);
+ return __uncore_imc_init_box(box, ADL_UNCORE_IMC_FREERUNNING_BASE);
}
static struct intel_uncore_ops adl_uncore_imc_freerunning_ops = {
@@ -1803,9 +1821,9 @@ static const struct attribute_group lnl_uncore_format_group = {
.attrs = lnl_uncore_formats_attr,
};
-static void lnl_uncore_hbo_init_box(struct intel_uncore_box *box)
+static int lnl_uncore_hbo_init_box(struct intel_uncore_box *box)
{
- uncore_get_box_mmio_addr(box, LNL_UNCORE_HBO_BASE,
+ return uncore_get_box_mmio_addr(box, LNL_UNCORE_HBO_BASE,
LNL_UNCORE_PCI_SAFBAR_OFFSET,
LNL_UNCORE_HBO_OFFSET);
}
@@ -1829,14 +1847,16 @@ static struct intel_uncore_type lnl_uncore_hbo = {
.format_group = &lnl_uncore_format_group,
};
-static void lnl_uncore_sncu_init_box(struct intel_uncore_box *box)
+static int lnl_uncore_sncu_init_box(struct intel_uncore_box *box)
{
- uncore_get_box_mmio_addr(box, LNL_UNCORE_SNCU_BASE,
+ int ret = uncore_get_box_mmio_addr(box, LNL_UNCORE_SNCU_BASE,
LNL_UNCORE_PCI_SAFBAR_OFFSET,
0);
- if (box->io_addr)
+ if (!ret)
writel(ADL_UNCORE_IMC_CTL_INT, box->io_addr + LNL_UNCORE_GLOBAL_CTL);
+
+ return ret;
}
static struct intel_uncore_ops lnl_uncore_sncu_ops = {
@@ -1887,13 +1907,15 @@ static struct intel_uncore_type ptl_uncore_imc = {
.mmio_map_size = 0xf00,
};
-static void ptl_uncore_sncu_init_box(struct intel_uncore_box *box)
+static int ptl_uncore_sncu_init_box(struct intel_uncore_box *box)
{
- intel_generic_uncore_mmio_init_box(box);
+ int ret = intel_generic_uncore_mmio_init_box(box);
/* Clear the global freeze bit */
if (box->io_addr)
writel(0, box->io_addr + PTL_UNCORE_GLOBAL_CTL_OFFSET);
+
+ return ret;
}
static struct intel_uncore_ops ptl_uncore_sncu_ops = {
diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
index 334dc384b5b9..a97cd029db36 100644
--- a/arch/x86/events/intel/uncore_snbep.c
+++ b/arch/x86/events/intel/uncore_snbep.c
@@ -627,12 +627,12 @@ static u64 snbep_uncore_pci_read_counter(struct intel_uncore_box *box, struct pe
return count;
}
-static void snbep_uncore_pci_init_box(struct intel_uncore_box *box)
+static int snbep_uncore_pci_init_box(struct intel_uncore_box *box)
{
struct pci_dev *pdev = box->pci_dev;
int box_ctl = uncore_pci_box_ctl(box);
- pci_write_config_dword(pdev, box_ctl, SNBEP_PMON_BOX_CTL_INT);
+ return pci_write_config_dword(pdev, box_ctl, SNBEP_PMON_BOX_CTL_INT);
}
static void snbep_uncore_msr_disable_box(struct intel_uncore_box *box)
@@ -680,12 +680,14 @@ static void snbep_uncore_msr_disable_event(struct intel_uncore_box *box,
wrmsrq(hwc->config_base, hwc->config);
}
-static void snbep_uncore_msr_init_box(struct intel_uncore_box *box)
+static int snbep_uncore_msr_init_box(struct intel_uncore_box *box)
{
unsigned msr = uncore_msr_box_ctl(box);
if (msr)
wrmsrq(msr, SNBEP_PMON_BOX_CTL_INT);
+
+ return 0;
}
static struct attribute *snbep_uncore_formats_attr[] = {
@@ -1507,18 +1509,21 @@ int snbep_uncore_pci_init(void)
/* end of Sandy Bridge-EP uncore support */
/* IvyTown uncore support */
-static void ivbep_uncore_msr_init_box(struct intel_uncore_box *box)
+static int ivbep_uncore_msr_init_box(struct intel_uncore_box *box)
{
unsigned msr = uncore_msr_box_ctl(box);
if (msr)
wrmsrq(msr, IVBEP_PMON_BOX_CTL_INT);
+
+ return 0;
}
-static void ivbep_uncore_pci_init_box(struct intel_uncore_box *box)
+static int ivbep_uncore_pci_init_box(struct intel_uncore_box *box)
{
struct pci_dev *pdev = box->pci_dev;
- pci_write_config_dword(pdev, SNBEP_PCI_PMON_BOX_CTL, IVBEP_PMON_BOX_CTL_INT);
+ return pci_write_config_dword(pdev, SNBEP_PCI_PMON_BOX_CTL,
+ IVBEP_PMON_BOX_CTL_INT);
}
#define IVBEP_UNCORE_MSR_OPS_COMMON_INIT() \
@@ -2784,7 +2789,7 @@ static struct intel_uncore_type hswep_uncore_cbox = {
/*
* Write SBOX Initialization register bit by bit to avoid spurious #GPs
*/
-static void hswep_uncore_sbox_msr_init_box(struct intel_uncore_box *box)
+static int hswep_uncore_sbox_msr_init_box(struct intel_uncore_box *box)
{
unsigned msr = uncore_msr_box_ctl(box);
@@ -2798,6 +2803,8 @@ static void hswep_uncore_sbox_msr_init_box(struct intel_uncore_box *box)
wrmsrq(msr, flags);
}
}
+
+ return 0;
}
static struct intel_uncore_ops hswep_uncore_sbox_msr_ops = {
@@ -4162,12 +4169,13 @@ static const struct attribute_group skx_upi_uncore_format_group = {
.attrs = skx_upi_uncore_formats_attr,
};
-static void skx_upi_uncore_pci_init_box(struct intel_uncore_box *box)
+static int skx_upi_uncore_pci_init_box(struct intel_uncore_box *box)
{
struct pci_dev *pdev = box->pci_dev;
__set_bit(UNCORE_BOX_FLAG_CTL_OFFS8, &box->flags);
- pci_write_config_dword(pdev, SKX_UPI_PCI_PMON_BOX_CTL, IVBEP_PMON_BOX_CTL_INT);
+ return pci_write_config_dword(pdev, SKX_UPI_PCI_PMON_BOX_CTL,
+ IVBEP_PMON_BOX_CTL_INT);
}
static struct intel_uncore_ops skx_upi_uncore_pci_ops = {
@@ -4323,12 +4331,13 @@ static struct intel_uncore_type skx_uncore_upi = {
.cleanup_mapping = skx_upi_cleanup_mapping,
};
-static void skx_m2m_uncore_pci_init_box(struct intel_uncore_box *box)
+static int skx_m2m_uncore_pci_init_box(struct intel_uncore_box *box)
{
struct pci_dev *pdev = box->pci_dev;
__set_bit(UNCORE_BOX_FLAG_CTL_OFFS8, &box->flags);
- pci_write_config_dword(pdev, SKX_M2M_PCI_PMON_BOX_CTL, IVBEP_PMON_BOX_CTL_INT);
+ return pci_write_config_dword(pdev, SKX_M2M_PCI_PMON_BOX_CTL,
+ IVBEP_PMON_BOX_CTL_INT);
}
static struct intel_uncore_ops skx_m2m_uncore_pci_ops = {
@@ -4831,13 +4840,13 @@ void snr_uncore_cpu_init(void)
uncore_msr_uncores = snr_msr_uncores;
}
-static void snr_m2m_uncore_pci_init_box(struct intel_uncore_box *box)
+static int snr_m2m_uncore_pci_init_box(struct intel_uncore_box *box)
{
struct pci_dev *pdev = box->pci_dev;
int box_ctl = uncore_pci_box_ctl(box);
__set_bit(UNCORE_BOX_FLAG_CTL_OFFS8, &box->flags);
- pci_write_config_dword(pdev, box_ctl, IVBEP_PMON_BOX_CTL_INT);
+ return pci_write_config_dword(pdev, box_ctl, IVBEP_PMON_BOX_CTL_INT);
}
static struct intel_uncore_ops snr_m2m_uncore_pci_ops = {
@@ -5010,17 +5019,22 @@ static int snr_uncore_mmio_map(struct intel_uncore_box *box,
return 0;
}
-static void __snr_uncore_mmio_init_box(struct intel_uncore_box *box,
+static int __snr_uncore_mmio_init_box(struct intel_uncore_box *box,
unsigned int box_ctl, int mem_offset,
unsigned int device)
{
- if (!snr_uncore_mmio_map(box, box_ctl, mem_offset, device))
+ int ret;
+
+ ret = snr_uncore_mmio_map(box, box_ctl, mem_offset, device);
+ if (!ret)
writel(IVBEP_PMON_BOX_CTL_INT, box->io_addr);
+
+ return ret;
}
-static void snr_uncore_mmio_init_box(struct intel_uncore_box *box)
+static int snr_uncore_mmio_init_box(struct intel_uncore_box *box)
{
- __snr_uncore_mmio_init_box(box, uncore_mmio_box_ctl(box),
+ return __snr_uncore_mmio_init_box(box, uncore_mmio_box_ctl(box),
SNR_IMC_MMIO_MEM0_OFFSET,
SNR_MC_DEVICE_ID);
}
@@ -5637,14 +5651,14 @@ int icx_uncore_pci_init(void)
return 0;
}
-static void icx_uncore_imc_init_box(struct intel_uncore_box *box)
+static int icx_uncore_imc_init_box(struct intel_uncore_box *box)
{
unsigned int box_ctl = box->pmu->type->box_ctl +
box->pmu->type->mmio_offset * (box->pmu->pmu_idx % ICX_NUMBER_IMC_CHN);
int mem_offset = (box->pmu->pmu_idx / ICX_NUMBER_IMC_CHN) * ICX_IMC_MEM_STRIDE +
SNR_IMC_MMIO_MEM0_OFFSET;
- __snr_uncore_mmio_init_box(box, box_ctl, mem_offset,
+ return __snr_uncore_mmio_init_box(box, box_ctl, mem_offset,
SNR_MC_DEVICE_ID);
}
@@ -5701,12 +5715,12 @@ static struct uncore_event_desc icx_uncore_imc_freerunning_events[] = {
{ /* end: all zeroes */ },
};
-static void icx_uncore_imc_freerunning_init_box(struct intel_uncore_box *box)
+static int icx_uncore_imc_freerunning_init_box(struct intel_uncore_box *box)
{
int mem_offset = box->pmu->pmu_idx * ICX_IMC_MEM_STRIDE +
SNR_IMC_MMIO_MEM0_OFFSET;
- snr_uncore_mmio_map(box, uncore_mmio_box_ctl(box),
+ return snr_uncore_mmio_map(box, uncore_mmio_box_ctl(box),
mem_offset, SNR_MC_DEVICE_ID);
}
@@ -6003,10 +6017,10 @@ static struct intel_uncore_type spr_uncore_mdf = {
.name = "mdf",
};
-static void spr_uncore_mmio_offs8_init_box(struct intel_uncore_box *box)
+static int spr_uncore_mmio_offs8_init_box(struct intel_uncore_box *box)
{
__set_bit(UNCORE_BOX_FLAG_CTL_OFFS8, &box->flags);
- intel_generic_uncore_mmio_init_box(box);
+ return intel_generic_uncore_mmio_init_box(box);
}
static struct intel_uncore_ops spr_uncore_mmio_offs8_ops = {
@@ -6187,12 +6201,11 @@ static struct uncore_event_desc spr_uncore_imc_freerunning_events[] = {
#define SPR_MC_DEVICE_ID 0x3251
-static void spr_uncore_imc_freerunning_init_box(struct intel_uncore_box *box)
+static int spr_uncore_imc_freerunning_init_box(struct intel_uncore_box *box)
{
int mem_offset = box->pmu->pmu_idx * ICX_IMC_MEM_STRIDE + SNR_IMC_MMIO_MEM0_OFFSET;
-
- snr_uncore_mmio_map(box, uncore_mmio_box_ctl(box),
- mem_offset, SPR_MC_DEVICE_ID);
+ return snr_uncore_mmio_map(box, uncore_mmio_box_ctl(box),
+ mem_offset, SPR_MC_DEVICE_ID);
}
static struct intel_uncore_ops spr_uncore_imc_freerunning_ops = {
@@ -6881,20 +6894,24 @@ static unsigned int dmr_iio_freerunning_box_offsets[] = {
0x0, 0x8000, 0x18000, 0x20000
};
-static void dmr_uncore_freerunning_init_box(struct intel_uncore_box *box)
+static int dmr_uncore_freerunning_init_box(struct intel_uncore_box *box)
{
struct intel_uncore_type *type = box->pmu->type;
u64 mmio_base;
if (box->pmu->pmu_idx >= type->num_boxes)
- return;
+ return -ENODEV;
mmio_base = DMR_IMH1_HIOP_MMIO_BASE;
mmio_base += dmr_iio_freerunning_box_offsets[box->pmu->pmu_idx];
box->io_addr = ioremap(mmio_base, type->mmio_map_size);
- if (!box->io_addr)
+ if (!box->io_addr) {
pr_warn("perf uncore: Failed to ioremap for %s.\n", type->name);
+ return -ENOMEM;
+ }
+
+ return 0;
}
static struct intel_uncore_ops dmr_uncore_freerunning_ops = {
--
2.54.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* Re: [PATCH V3 3/8] perf/x86/intel/uncore: Let init_box() callback report failures
2026-06-11 16:00 ` [PATCH V3 3/8] perf/x86/intel/uncore: Let init_box() callback report failures Zide Chen
@ 2026-06-11 16:38 ` sashiko-bot
0 siblings, 0 replies; 15+ messages in thread
From: sashiko-bot @ 2026-06-11 16:38 UTC (permalink / raw)
To: Zide Chen; +Cc: linux-perf-users
Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
New issues:
- [Medium] The `adl_uncore_msr_ops` struct lacks an `init_box` callback, causing `exit_box` to be skipped during teardown and leaving hardware PMU logic enabled.
- [Low] PCI `init_box` callbacks return positive `PCIBIOS_*` error codes instead of negative error codes as documented.
Pre-existing issues:
- [High] `ptl_uncore_sncu_init_box` performs a `writel()` gated by a check on `box->io_addr` rather than checking for initialization success, leading to Use-After-Unmap if reinitialization fails.
- [High] `uncore_mmio_exit_box()` calls `iounmap()` on `box->io_addr` but fails to clear the pointer to NULL, leaving a dangling pointer.
--
commit eb92519036a2afbd738b5e91732e11fc662b60f7
Author: Zide Chen <zide.chen@intel.com>
perf/x86/intel/uncore: Let init_box() callback report failures
The init_box() callback currently returns void, so initialization
failures are silently ignored and the box is still marked initialized.
This patch changes the callback to return int so platform code can report
errors back to the common uncore layer.
> diff --git a/arch/x86/events/intel/uncore.h b/arch/x86/events/intel/uncore.h
> --- a/arch/x86/events/intel/uncore.h
> +++ b/arch/x86/events/intel/uncore.h
[ ... ]
> @@ -557,12 +557,18 @@ static inline u64 uncore_read_counter(struct intel_uncore_box *box,
> return box->pmu->type->ops->read_counter(box, event);
> }
>
> -static inline void uncore_box_init(struct intel_uncore_box *box)
> +static inline int uncore_box_init(struct intel_uncore_box *box)
> {
> - if (!test_and_set_bit(UNCORE_BOX_FLAG_INITIALIZED, &box->flags)) {
> - if (box->pmu->type->ops->init_box)
> - box->pmu->type->ops->init_box(box);
> + int ret = 0;
> +
> + if (!test_bit(UNCORE_BOX_FLAG_INITIALIZED, &box->flags) &&
> + box->pmu->type->ops->init_box) {
> + ret = box->pmu->type->ops->init_box(box);
> + if (!ret)
> + __set_bit(UNCORE_BOX_FLAG_INITIALIZED, &box->flags);
> }
[Severity: Medium]
Will this cause exit_box() to be skipped for PMUs that lack an init_box()
callback?
For example, adl_uncore_msr_ops defines exit_box but lacks an init_box.
Because ops->init_box is NULL, the UNCORE_BOX_FLAG_INITIALIZED flag is
never set for Alder Lake MSR PMUs.
During teardown, uncore_box_exit() checks the initialized flag.
Since it is false, adl_uncore_msr_exit_box() is skipped, and the
ADL_UNC_PERF_GLOBAL_CTL MSR is never cleared.
Does this leave the hardware PMON logic improperly enabled?
> diff --git a/arch/x86/events/intel/uncore_discovery.c b/arch/x86/events/intel/uncore_discovery.c
> --- a/arch/x86/events/intel/uncore_discovery.c
> +++ b/arch/x86/events/intel/uncore_discovery.c
[ ... ]
> @@ -578,15 +579,16 @@ static inline int intel_pci_uncore_box_ctl(struct intel_uncore_box *box)
> return UNCORE_DISCOVERY_PCI_BOX_CTRL(intel_generic_uncore_box_ctl(box));
> }
>
> -void intel_generic_uncore_pci_init_box(struct intel_uncore_box *box)
> +int intel_generic_uncore_pci_init_box(struct intel_uncore_box *box)
> {
> int box_ctl = intel_pci_uncore_box_ctl(box);
>
> if (!box_ctl)
> - return;
> + return -ENODEV;
>
> __set_bit(UNCORE_BOX_FLAG_CTL_OFFS8, &box->flags);
> - pci_write_config_dword(box->pci_dev, box_ctl, GENERIC_PMON_BOX_CTL_INT);
> + return pci_write_config_dword(box->pci_dev, box_ctl,
> + GENERIC_PMON_BOX_CTL_INT);
> }
[Severity: Low]
Does this violate the contract mentioned in the commit message to "return 0
on success or a negative error code on failure"?
pci_write_config_dword() yields positive PCIBIOS_* error codes on failure.
Returning positive values directly mixes standard Linux error codes with PCI
BIOS error codes.
> diff --git a/arch/x86/events/intel/uncore_snb.c b/arch/x86/events/intel/uncore_snb.c
> --- a/arch/x86/events/intel/uncore_snb.c
> +++ b/arch/x86/events/intel/uncore_snb.c
[ ... ]
> @@ -1887,13 +1907,15 @@ static struct intel_uncore_type ptl_uncore_imc = {
> .mmio_map_size = 0xf00,
> };
>
> -static void ptl_uncore_sncu_init_box(struct intel_uncore_box *box)
> +static int ptl_uncore_sncu_init_box(struct intel_uncore_box *box)
> {
> - intel_generic_uncore_mmio_init_box(box);
> + int ret = intel_generic_uncore_mmio_init_box(box);
>
> /* Clear the global freeze bit */
> if (box->io_addr)
> writel(0, box->io_addr + PTL_UNCORE_GLOBAL_CTL_OFFSET);
> +
> + return ret;
> }
[Severity: High]
This is a pre-existing issue, but should this check if (!ret) instead of
if (box->io_addr)?
When a CPU goes offline, the MMIO region is unmapped by
uncore_mmio_exit_box(), but the pointer box->io_addr is not cleared to NULL.
If the CPU is later brought back online,
intel_generic_uncore_mmio_init_box() may fail early (e.g., returning
-ENODEV) without overwriting the stale box->io_addr pointer.
Because the stale pointer evaluates to true, the kernel would execute
writel(0, box->io_addr + PTL_UNCORE_GLOBAL_CTL_OFFSET) to an unmapped
address.
Could this trigger a page fault?
[Severity: High]
This isn't a bug introduced by this patch, but does uncore_mmio_exit_box()
leave a dangling pointer?
uncore_mmio_exit_box() calls iounmap() on box->io_addr but fails to clear
the pointer to NULL.
If the box is later reused and suffers an early initialization failure,
box->io_addr remains non-NULL.
Core handlers (such as uncore_mmio_read_counter) validate readiness by
checking if (!box->io_addr).
Since the pointer is not cleared, these checks can bypass validation and
permit perf events to trigger memory accesses.
Could this lead to a use-after-unmap memory access?
> diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
> --- a/arch/x86/events/intel/uncore_snbep.c
> +++ b/arch/x86/events/intel/uncore_snbep.c
[ ... ]
> @@ -627,12 +627,12 @@ static u64 snbep_uncore_pci_read_counter(struct intel_uncore_box *box, struct pe
> return count;
> }
>
> -static void snbep_uncore_pci_init_box(struct intel_uncore_box *box)
> +static int snbep_uncore_pci_init_box(struct intel_uncore_box *box)
> {
> struct pci_dev *pdev = box->pci_dev;
> int box_ctl = uncore_pci_box_ctl(box);
>
> - pci_write_config_dword(pdev, box_ctl, SNBEP_PMON_BOX_CTL_INT);
> + return pci_write_config_dword(pdev, box_ctl, SNBEP_PMON_BOX_CTL_INT);
> }
[Severity: Low]
Similar to intel_generic_uncore_pci_init_box(), does this also violate the
negative error code API contract by returning positive PCIBIOS error codes?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260611160033.66760-1-zide.chen@intel.com?part=3
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH V3 4/8] perf/x86/intel/uncore: Keep PCI PMUs working when MMIO/MSR setup fails
2026-06-11 16:00 [PATCH v3 0/8] perf/x86/intel/uncore: PMU setup robustness fixes Zide Chen
` (2 preceding siblings ...)
2026-06-11 16:00 ` [PATCH V3 3/8] perf/x86/intel/uncore: Let init_box() callback report failures Zide Chen
@ 2026-06-11 16:00 ` Zide Chen
2026-06-11 16:00 ` [PATCH V3 5/8] perf/x86/intel/uncore: Factor out box setup code Zide Chen
` (3 subsequent siblings)
7 siblings, 0 replies; 15+ messages in thread
From: Zide Chen @ 2026-06-11 16:00 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen
uncore_event_cpu_online() returns -ENOMEM early when both the MSR and
MMIO box allocations fail. This also aborts PCI uncore setup, even
though PCI PMUs are independent of the MSR/MMIO paths.
Remove the early return so PCI uncore setup always runs regardless
of whether MSR or MMIO box allocation succeeds.
Fixes: 3da04b8a00dd ("perf/x86/intel/uncore: Support MMIO type uncore blocks")
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
---
v3: Add Reviewed-by tag.
---
arch/x86/events/intel/uncore.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index 21c8ed1628cb..eae335df7634 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -1636,8 +1636,6 @@ static int uncore_event_cpu_online(unsigned int cpu)
die = topology_logical_die_id(cpu);
msr_ret = uncore_box_ref(uncore_msr_uncores, die, cpu);
mmio_ret = uncore_box_ref(uncore_mmio_uncores, die, cpu);
- if (msr_ret && mmio_ret)
- return -ENOMEM;
/*
* Check if there is an online cpu in the package
--
2.54.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* [PATCH V3 5/8] perf/x86/intel/uncore: Factor out box setup code
2026-06-11 16:00 [PATCH v3 0/8] perf/x86/intel/uncore: PMU setup robustness fixes Zide Chen
` (3 preceding siblings ...)
2026-06-11 16:00 ` [PATCH V3 4/8] perf/x86/intel/uncore: Keep PCI PMUs working when MMIO/MSR setup fails Zide Chen
@ 2026-06-11 16:00 ` Zide Chen
2026-06-11 16:00 ` [PATCH V3 6/8] perf/x86/intel/uncore: Introduce PMU flags and broken state Zide Chen
` (2 subsequent siblings)
7 siblings, 0 replies; 15+ messages in thread
From: Zide Chen @ 2026-06-11 16:00 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen
The PCI uncore PMU path already implements a lazy registration model:
the PMU is registered when the first active box appears and
unregistered when the last active box is removed.
Factor this registration management into a shared helper, so the same
code can be reused by the MSR and MMIO paths in later changes.
No functional change intended.
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
---
v3: Add Reviewed-by tag.
---
arch/x86/events/intel/uncore.c | 40 ++++++++++++++++++++++++----------
1 file changed, 28 insertions(+), 12 deletions(-)
diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index eae335df7634..06ef89f6ccc2 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -1148,6 +1148,29 @@ uncore_pci_find_dev_pmu(struct pci_dev *pdev, const struct pci_device_id *ids)
return pmu;
}
+static int uncore_box_setup(struct intel_uncore_pmu *pmu,
+ struct intel_uncore_box *box)
+{
+ int ret;
+
+ uncore_box_init(box);
+
+ /* First active box registers the pmu. */
+ if (atomic_inc_return(&pmu->activeboxes) > 1)
+ return 0;
+
+ ret = uncore_pmu_register(pmu);
+ if (ret) {
+ atomic_dec(&pmu->activeboxes);
+ goto err;
+ }
+
+ return 0;
+err:
+ uncore_box_exit(box);
+ return ret;
+}
+
/*
* Register the PMU for a PCI device
* @pdev: The PCI device.
@@ -1174,20 +1197,13 @@ static int uncore_pci_pmu_register(struct pci_dev *pdev,
box->dieid = die;
box->pci_dev = pdev;
box->pmu = pmu;
- uncore_box_init(box);
- pmu->boxes[die] = box;
- if (atomic_inc_return(&pmu->activeboxes) > 1)
- return 0;
-
- /* First active box registers the pmu */
- ret = uncore_pmu_register(pmu);
- if (ret) {
- atomic_dec(&pmu->activeboxes);
- pmu->boxes[die] = NULL;
- uncore_box_exit(box);
+ ret = uncore_box_setup(pmu, box);
+ if (!ret)
+ pmu->boxes[die] = box;
+ else
kfree(box);
- }
+
return ret;
}
--
2.54.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* [PATCH V3 6/8] perf/x86/intel/uncore: Introduce PMU flags and broken state
2026-06-11 16:00 [PATCH v3 0/8] perf/x86/intel/uncore: PMU setup robustness fixes Zide Chen
` (4 preceding siblings ...)
2026-06-11 16:00 ` [PATCH V3 5/8] perf/x86/intel/uncore: Factor out box setup code Zide Chen
@ 2026-06-11 16:00 ` Zide Chen
2026-06-11 16:30 ` sashiko-bot
2026-06-11 16:00 ` [PATCH V3 7/8] perf/x86/intel/uncore: Fix uncore_box ref/unref ordering Zide Chen
2026-06-11 16:00 ` [PATCH V3 8/8] perf/x86/intel/uncore: Implement lazy setup for MSR/MMIO PMUs Zide Chen
7 siblings, 1 reply; 15+ messages in thread
From: Zide Chen @ 2026-06-11 16:00 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen
Replace the boolean 'registered' field in intel_uncore_pmu with an
unsigned long 'flags' field, and add a PMU_BROKEN flag to track box
setup failures. The broken flag is sticky, meaning it is cleared only
by a module reload or system reboot.
Broken PMUs are skipped in the CPU hotplug and box allocation paths.
When any box fails to initialize, the PMU is marked broken. Broken
PMUs reject new event assignments and skip future box setup attempts.
If the PMU was already registered, it remains so to avoid disrupting
in-flight events on other boxes.
Signed-off-by: Zide Chen <zide.chen@intel.com>
---
v3:
- Fix typo stick->sticky and other cosmetic fixes in code comment.
v2:
- Make the broken flag sticky by using clear_bit() in
uncore_pmu_unregister() rather than zeroing out pmu->flags.
- In uncore_change_type_ctx(), don't stop CPU migration on broken PMU
for in-flight events. (Sashiko).
- Use box->cpu == -1 to identify inactive boxes that don't need
migration, no need to check uncore_die_has_box(), which is incomplete.
---
arch/x86/events/intel/uncore.c | 44 ++++++++++++++++++++++--------
arch/x86/events/intel/uncore.h | 13 ++++++++-
arch/x86/events/intel/uncore_snb.c | 2 +-
3 files changed, 46 insertions(+), 13 deletions(-)
diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index 06ef89f6ccc2..feb8c3b0076b 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -757,7 +757,7 @@ static int uncore_pmu_event_init(struct perf_event *event)
pmu = uncore_event_to_pmu(event);
/* no device found for this pmu */
- if (!pmu->registered)
+ if (!uncore_pmu_available(pmu))
return -ENOENT;
/* Sampling not supported yet */
@@ -953,16 +953,18 @@ static int uncore_pmu_register(struct intel_uncore_pmu *pmu)
ret = perf_pmu_register(&pmu->pmu, pmu->name, -1);
if (!ret)
- pmu->registered = true;
+ uncore_pmu_set_registered(pmu);
return ret;
}
static void uncore_pmu_unregister(struct intel_uncore_pmu *pmu)
{
- if (!pmu->registered)
+ if (!uncore_pmu_registered(pmu))
return;
perf_pmu_unregister(&pmu->pmu);
- pmu->registered = false;
+
+ /* Keep PMU_BROKEN_BIT sticky. */
+ uncore_pmu_clear_registered(pmu);
}
static void uncore_free_boxes(struct intel_uncore_pmu *pmu)
@@ -1153,7 +1155,12 @@ static int uncore_box_setup(struct intel_uncore_pmu *pmu,
{
int ret;
- uncore_box_init(box);
+ if (uncore_pmu_broken(pmu))
+ return -ENODEV;
+
+ ret = uncore_box_init(box);
+ if (ret)
+ goto err;
/* First active box registers the pmu. */
if (atomic_inc_return(&pmu->activeboxes) > 1)
@@ -1167,6 +1174,16 @@ static int uncore_box_setup(struct intel_uncore_pmu *pmu,
return 0;
err:
+ /*
+ * If any box fails, mark the per-package PMU as broken regardless of
+ * whether it was registered or not.
+ *
+ * Don't decrement refcnt to avoid other in-die CPUs from trying to set
+ * up the PMU box again.
+ *
+ * Don't kfree box; MSR and MMIO boxes are freed at module exit only.
+ */
+ uncore_pmu_set_broken(pmu);
uncore_box_exit(box);
return ret;
}
@@ -1190,8 +1207,10 @@ static int uncore_pci_pmu_register(struct pci_dev *pdev,
return -EINVAL;
box = uncore_alloc_box(type, NUMA_NO_NODE);
- if (!box)
+ if (!box) {
+ uncore_pmu_set_broken(pmu);
return -ENOMEM;
+ }
atomic_inc(&box->refcnt);
box->dieid = die;
@@ -1507,7 +1526,8 @@ static void uncore_change_type_ctx(struct intel_uncore_type *type, int old_cpu,
if (old_cpu < 0) {
WARN_ON_ONCE(box->cpu != -1);
- if (uncore_die_has_box(type, die, pmu->pmu_idx)) {
+ if (uncore_die_has_box(type, die, pmu->pmu_idx) &&
+ !uncore_pmu_broken(pmu)) {
box->cpu = new_cpu;
cpumask_set_cpu(new_cpu, &pmu->cpu_mask);
}
@@ -1515,12 +1535,14 @@ static void uncore_change_type_ctx(struct intel_uncore_type *type, int old_cpu,
}
WARN_ON_ONCE(box->cpu != -1 && box->cpu != old_cpu);
- box->cpu = -1;
cpumask_clear_cpu(old_cpu, &pmu->cpu_mask);
- if (new_cpu < 0)
+ if (new_cpu < 0) {
+ box->cpu = -1;
continue;
+ }
- if (!uncore_die_has_box(type, die, pmu->pmu_idx))
+ /* An inactive box doesn't need migration. */
+ if (box->cpu == -1)
continue;
uncore_pmu_cancel_hrtimer(box);
perf_pmu_migrate_context(&pmu->pmu, old_cpu, new_cpu);
@@ -1596,7 +1618,7 @@ static int allocate_boxes(struct intel_uncore_type **types,
type = *types;
pmu = type->pmus;
for (i = 0; i < type->num_boxes; i++, pmu++) {
- if (pmu->boxes[die])
+ if (pmu->boxes[die] || uncore_pmu_broken(pmu))
continue;
box = uncore_alloc_box(type, cpu_to_node(cpu));
if (!box)
diff --git a/arch/x86/events/intel/uncore.h b/arch/x86/events/intel/uncore.h
index d732b87be0a9..0adb477d9708 100644
--- a/arch/x86/events/intel/uncore.h
+++ b/arch/x86/events/intel/uncore.h
@@ -146,13 +146,24 @@ struct intel_uncore_pmu {
struct pmu pmu;
char name[UNCORE_PMU_NAME_LEN];
int pmu_idx;
- bool registered;
+ unsigned long flags;
atomic_t activeboxes;
cpumask_t cpu_mask;
struct intel_uncore_type *type;
struct intel_uncore_box **boxes;
};
+#define PMU_REGISTERED_BIT 0
+#define PMU_BROKEN_BIT 1
+
+#define uncore_pmu_registered(pmu) test_bit(PMU_REGISTERED_BIT, &(pmu)->flags)
+#define uncore_pmu_broken(pmu) test_bit(PMU_BROKEN_BIT, &(pmu)->flags)
+#define uncore_pmu_available(pmu) (uncore_pmu_registered(pmu) && \
+ !uncore_pmu_broken(pmu))
+#define uncore_pmu_set_registered(pmu) set_bit(PMU_REGISTERED_BIT, &(pmu)->flags)
+#define uncore_pmu_set_broken(pmu) set_bit(PMU_BROKEN_BIT, &(pmu)->flags)
+#define uncore_pmu_clear_registered(pmu) clear_bit(PMU_REGISTERED_BIT, &(pmu)->flags)
+
struct intel_uncore_extra_reg {
raw_spinlock_t lock;
u64 config, config1, config2;
diff --git a/arch/x86/events/intel/uncore_snb.c b/arch/x86/events/intel/uncore_snb.c
index c5347920541c..055131c508ff 100644
--- a/arch/x86/events/intel/uncore_snb.c
+++ b/arch/x86/events/intel/uncore_snb.c
@@ -940,7 +940,7 @@ static int snb_uncore_imc_event_init(struct perf_event *event)
pmu = uncore_event_to_pmu(event);
/* no device found for this pmu */
- if (!pmu->registered)
+ if (!uncore_pmu_available(pmu))
return -ENOENT;
/* Sampling not supported yet */
--
2.54.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* Re: [PATCH V3 6/8] perf/x86/intel/uncore: Introduce PMU flags and broken state
2026-06-11 16:00 ` [PATCH V3 6/8] perf/x86/intel/uncore: Introduce PMU flags and broken state Zide Chen
@ 2026-06-11 16:30 ` sashiko-bot
0 siblings, 0 replies; 15+ messages in thread
From: sashiko-bot @ 2026-06-11 16:30 UTC (permalink / raw)
To: Zide Chen; +Cc: linux-perf-users
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Medium] Removal of `box->cpu = -1` introduces a race condition where concurrent `perf_event_open` can install events on an offlining CPU during migration.
--
commit 90f307cb2cce79d442ba12ad2d9b5ba2c1d5018e
Author: Zide Chen <zide.chen@intel.com>
perf/x86/intel/uncore: Introduce PMU flags and broken state
This commit replaces the boolean 'registered' field in intel_uncore_pmu with
an unsigned long 'flags' field. It introduces a PMU_BROKEN flag to track box
setup failures, which is sticky and clears only on module reload or reboot.
Broken PMUs are skipped in CPU hotplug and box allocation paths.
> diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
> index 06ef89f6ccc28..feb8c3b0076b2 100644
> --- a/arch/x86/events/intel/uncore.c
> +++ b/arch/x86/events/intel/uncore.c
[ ... ]
> @@ -1507,7 +1526,8 @@ static void uncore_change_type_ctx(struct intel_uncore_type *type, int old_cpu,
[ ... ]
> WARN_ON_ONCE(box->cpu != -1 && box->cpu != old_cpu);
> - box->cpu = -1;
> cpumask_clear_cpu(old_cpu, &pmu->cpu_mask);
> - if (new_cpu < 0)
> + if (new_cpu < 0) {
> + box->cpu = -1;
> continue;
> + }
>
> - if (!uncore_die_has_box(type, die, pmu->pmu_idx))
> + /* An inactive box doesn't need migration. */
> + if (box->cpu == -1)
> continue;
> uncore_pmu_cancel_hrtimer(box);
> perf_pmu_migrate_context(&pmu->pmu, old_cpu, new_cpu);
[Severity: Medium]
Does removing the temporary assignment of box->cpu = -1 before the
perf_pmu_migrate_context() call introduce a race condition with concurrent
sys_perf_event_open() calls?
During the CPU offline process in uncore_change_type_ctx(), old_cpu is still
marked as online. If a user concurrently calls sys_perf_event_open(),
uncore_pmu_event_init() locklessly reads box->cpu. Since it is no longer
temporarily set to -1, it sees old_cpu and sets event->cpu = old_cpu.
If sys_perf_event_open() acquires the CPU's context mutex just after
perf_pmu_migrate_context() has finished and unlocked it, the core perf code
will successfully install the new event into old_cpu's context after the
migration has completed. When old_cpu fully offlines, this stranded event is
forcefully killed, leaving the user with a dead event even though the syscall
succeeded.
Previously, setting box->cpu = -1 prevented this by safely rejecting such
concurrent open requests with -EINVAL.
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260611160033.66760-1-zide.chen@intel.com?part=6
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH V3 7/8] perf/x86/intel/uncore: Fix uncore_box ref/unref ordering
2026-06-11 16:00 [PATCH v3 0/8] perf/x86/intel/uncore: PMU setup robustness fixes Zide Chen
` (5 preceding siblings ...)
2026-06-11 16:00 ` [PATCH V3 6/8] perf/x86/intel/uncore: Introduce PMU flags and broken state Zide Chen
@ 2026-06-11 16:00 ` Zide Chen
2026-06-11 16:29 ` sashiko-bot
2026-06-11 16:00 ` [PATCH V3 8/8] perf/x86/intel/uncore: Implement lazy setup for MSR/MMIO PMUs Zide Chen
7 siblings, 1 reply; 15+ messages in thread
From: Zide Chen @ 2026-06-11 16:00 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen
In uncore_event_cpu_online(), uncore_box_ref() was called before
uncore_change_context(). uncore_box_ref() gates on box->cpu >= 0,
but box->cpu is still -1 at that point because uncore_change_context()
has not run yet. As a result, the box is never initialized on the
first CPU to come online in a die, leaving it permanently
uninitialized in the single-CPU-per-die case.
Thus, box->refcnt is one count below the true value, and in the CPU
offline path, the box will be torn down on the second-to-last CPU.
In uncore_event_cpu_offline(), uncore_box_unref() was called after
uncore_change_context(), so box->cpu is already -1 when the collector
CPU goes offline, which prevents it from tearing down the box.
Fix by swapping the call order in both paths so that
uncore_box_{ref,unref}() runs at the point where box->cpu reflects
the correct context.
Move allocate_boxes() out of uncore_box_ref() to enable this
reordering.
Fixes: c74443d92f68 ("perf/x86/uncore: Support per PMU cpumask")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
---
v3:
- Update changelog to mention moving allocate_boxes(). (Dapeng)
- Update title; the bug is not limited to CPU hotplug.
---
arch/x86/events/intel/uncore.c | 50 ++++++++++++++++------------------
1 file changed, 23 insertions(+), 27 deletions(-)
diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index feb8c3b0076b..b9ac2f7d31ca 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -1580,9 +1580,15 @@ static int uncore_event_cpu_offline(unsigned int cpu)
{
int die, target;
+ /* Clear the references */
+ die = topology_logical_die_id(cpu);
+ uncore_box_unref(uncore_msr_uncores, die);
+ uncore_box_unref(uncore_mmio_uncores, die);
+
/* Check if exiting cpu is used for collecting uncore events */
if (!cpumask_test_and_clear_cpu(cpu, &uncore_cpu_mask))
- goto unref;
+ return 0;
+
/* Find a new cpu to collect uncore events */
target = cpumask_any_but(topology_die_cpumask(cpu), cpu);
@@ -1595,16 +1601,10 @@ static int uncore_event_cpu_offline(unsigned int cpu)
uncore_change_context(uncore_msr_uncores, cpu, target);
uncore_change_context(uncore_mmio_uncores, cpu, target);
uncore_change_context(uncore_pci_uncores, cpu, target);
-
-unref:
- /* Clear the references */
- die = topology_logical_die_id(cpu);
- uncore_box_unref(uncore_msr_uncores, die);
- uncore_box_unref(uncore_mmio_uncores, die);
return 0;
}
-static int allocate_boxes(struct intel_uncore_type **types,
+static void allocate_boxes(struct intel_uncore_type **types,
unsigned int die, unsigned int cpu)
{
struct intel_uncore_box *box, *tmp;
@@ -1621,8 +1621,10 @@ static int allocate_boxes(struct intel_uncore_type **types,
if (pmu->boxes[die] || uncore_pmu_broken(pmu))
continue;
box = uncore_alloc_box(type, cpu_to_node(cpu));
- if (!box)
+ if (!box) {
+ uncore_pmu_set_broken(pmu);
goto cleanup;
+ }
box->pmu = pmu;
box->dieid = die;
list_add(&box->active_list, &allocated);
@@ -1633,14 +1635,13 @@ static int allocate_boxes(struct intel_uncore_type **types,
list_del_init(&box->active_list);
box->pmu->boxes[die] = box;
}
- return 0;
+ return;
cleanup:
list_for_each_entry_safe(box, tmp, &allocated, active_list) {
list_del_init(&box->active_list);
kfree(box);
}
- return -ENOMEM;
}
static int uncore_box_ref(struct intel_uncore_type **types,
@@ -1649,11 +1650,7 @@ static int uncore_box_ref(struct intel_uncore_type **types,
struct intel_uncore_type *type;
struct intel_uncore_pmu *pmu;
struct intel_uncore_box *box;
- int i, ret;
-
- ret = allocate_boxes(types, die, cpu);
- if (ret)
- return ret;
+ int i;
for (; *types; types++) {
type = *types;
@@ -1669,27 +1666,26 @@ static int uncore_box_ref(struct intel_uncore_type **types,
static int uncore_event_cpu_online(unsigned int cpu)
{
- int die, target, msr_ret, mmio_ret;
+ int die, target;
die = topology_logical_die_id(cpu);
- msr_ret = uncore_box_ref(uncore_msr_uncores, die, cpu);
- mmio_ret = uncore_box_ref(uncore_mmio_uncores, die, cpu);
+ allocate_boxes(uncore_msr_uncores, die, cpu);
+ allocate_boxes(uncore_mmio_uncores, die, cpu);
/*
* Check if there is an online cpu in the package
* which collects uncore events already.
*/
target = cpumask_any_and(&uncore_cpu_mask, topology_die_cpumask(cpu));
- if (target < nr_cpu_ids)
- return 0;
-
- cpumask_set_cpu(cpu, &uncore_cpu_mask);
-
- if (!msr_ret)
+ if (target >= nr_cpu_ids) {
+ cpumask_set_cpu(cpu, &uncore_cpu_mask);
uncore_change_context(uncore_msr_uncores, -1, cpu);
- if (!mmio_ret)
uncore_change_context(uncore_mmio_uncores, -1, cpu);
- uncore_change_context(uncore_pci_uncores, -1, cpu);
+ uncore_change_context(uncore_pci_uncores, -1, cpu);
+ }
+
+ uncore_box_ref(uncore_msr_uncores, die, cpu);
+ uncore_box_ref(uncore_mmio_uncores, die, cpu);
return 0;
}
--
2.54.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* Re: [PATCH V3 7/8] perf/x86/intel/uncore: Fix uncore_box ref/unref ordering
2026-06-11 16:00 ` [PATCH V3 7/8] perf/x86/intel/uncore: Fix uncore_box ref/unref ordering Zide Chen
@ 2026-06-11 16:29 ` sashiko-bot
0 siblings, 0 replies; 15+ messages in thread
From: sashiko-bot @ 2026-06-11 16:29 UTC (permalink / raw)
To: Zide Chen; +Cc: linux-perf-users
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [High] Tearing down the uncore box (unmapping MMIO) before the perf core stops active events causes a use-after-unmap during CPU offline, leading to a kernel panic.
--
commit e2e044b2e3ac13715fc0c671d7186d9b5f0626a5
Author: Zide Chen <zide.chen@intel.com>
perf/x86/intel/uncore: Fix uncore_box ref/unref ordering
This commit swaps the call order in the CPU online and offline paths
so that uncore_box_{ref,unref}() runs at the point where the box CPU
reflects the correct context.
> diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
> index feb8c3b0076b2..b9ac2f7d31caa 100644
> --- a/arch/x86/events/intel/uncore.c
> +++ b/arch/x86/events/intel/uncore.c
> @@ -1580,9 +1580,15 @@ static int uncore_event_cpu_offline(unsigned int cpu)
> {
> int die, target;
>
> + /* Clear the references */
> + die = topology_logical_die_id(cpu);
> + uncore_box_unref(uncore_msr_uncores, die);
> + uncore_box_unref(uncore_mmio_uncores, die);
[Severity: High]
Will this early unref cause a use-after-unmap panic during CPU offline?
During CPU offline, the CPUHP_AP_PERF_X86_UNCORE_ONLINE teardown callback
runs before the CPUHP_AP_PERF_ONLINE callback.
When the last CPU on a die goes offline, this uncore_box_unref() drops refcnt
to 0 and calls uncore_box_exit(). For MMIO PMUs, this leads to
iounmap(box->io_addr) without clearing the box->io_addr pointer.
Later in the offline process, the CPUHP_AP_PERF_ONLINE callback runs
perf_event_exit_cpu_context(), which stops active events via:
uncore_pmu_event_stop() -> uncore_perf_event_update() -> uncore_mmio_read_counter()
Since uncore_mmio_read_counter() does this:
arch/x86/events/intel/uncore.c:uncore_mmio_read_counter() {
...
return readq(box->io_addr + event->hw.event_base);
}
Will it proceed to read from the unmapped box->io_addr because the active
events have not been stopped yet?
[ ... ]
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260611160033.66760-1-zide.chen@intel.com?part=7
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH V3 8/8] perf/x86/intel/uncore: Implement lazy setup for MSR/MMIO PMUs
2026-06-11 16:00 [PATCH v3 0/8] perf/x86/intel/uncore: PMU setup robustness fixes Zide Chen
` (6 preceding siblings ...)
2026-06-11 16:00 ` [PATCH V3 7/8] perf/x86/intel/uncore: Fix uncore_box ref/unref ordering Zide Chen
@ 2026-06-11 16:00 ` Zide Chen
2026-06-11 16:33 ` sashiko-bot
7 siblings, 1 reply; 15+ messages in thread
From: Zide Chen @ 2026-06-11 16:00 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen
MSR and MMIO uncore PMUs are currently registered at module init time
and appear in sysfs even when no PMU boxes are functional.
Apply the same lazy registration model used by PCI uncore PMUs: the
PMU is registered when the first box is successfully initialized, and
unregistered when the last box exits. If a box fails to initialize on
a subsequent die, the PMU is marked broken but remains registered to
avoid disrupting any in-flight perf events.
Box allocation and freeing remain at module init/exit time to avoid
repeated kfree/alloc cycles across CPU offline/online events.
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
---
v3:
- Update title: PMU -> PMUs.
- Add Reviewed-by tag.
v2:
- uncore_box_unref(): Only decrement pmu->activeboxes for active boxes
(those without init_box callback or successfully initialized) to prevent
underflow when initialization fails.
- Set uncore_{msr,mmio}_uncores to empty_uncore when
uncore_pmu_types_init() fails.
- Rename uncore_cpu_mmio_init() to uncore_pmu_types_init() (Dapeng).
---
arch/x86/events/intel/uncore.c | 78 +++++++---------------------------
arch/x86/events/intel/uncore.h | 6 +++
2 files changed, 22 insertions(+), 62 deletions(-)
diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index b9ac2f7d31ca..9f8a80c9dcb6 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -1570,8 +1570,13 @@ static void uncore_box_unref(struct intel_uncore_type **types, int die)
pmu = type->pmus;
for (i = 0; i < type->num_boxes; i++, pmu++) {
box = pmu->boxes[die];
- if (box && box->cpu >= 0 && atomic_dec_return(&box->refcnt) == 0)
+ if (box && box->cpu >= 0 &&
+ atomic_dec_return(&box->refcnt) == 0) {
+ if (uncore_box_active(box) &&
+ atomic_dec_return(&pmu->activeboxes) == 0)
+ uncore_pmu_unregister(pmu);
uncore_box_exit(box);
+ }
}
}
}
@@ -1658,7 +1663,7 @@ static int uncore_box_ref(struct intel_uncore_type **types,
for (i = 0; i < type->num_boxes; i++, pmu++) {
box = pmu->boxes[die];
if (box && box->cpu >= 0 && atomic_inc_return(&box->refcnt) == 1)
- uncore_box_init(box);
+ uncore_box_setup(pmu, box);
}
}
return 0;
@@ -1689,67 +1694,12 @@ static int uncore_event_cpu_online(unsigned int cpu)
return 0;
}
-static int __init type_pmu_register(struct intel_uncore_type *type)
+static int __init uncore_pmu_types_init(struct intel_uncore_type **types)
{
- int i, ret;
-
- for (i = 0; i < type->num_boxes; i++) {
- ret = uncore_pmu_register(&type->pmus[i]);
- if (ret)
- return ret;
- }
- return 0;
-}
-
-static int __init uncore_msr_pmus_register(void)
-{
- struct intel_uncore_type **types = uncore_msr_uncores;
- int ret;
-
- for (; *types; types++) {
- ret = type_pmu_register(*types);
- if (ret)
- return ret;
- }
- return 0;
-}
-
-static int __init uncore_cpu_init(void)
-{
- int ret;
-
- ret = uncore_types_init(uncore_msr_uncores);
- if (ret)
- goto err;
-
- ret = uncore_msr_pmus_register();
- if (ret)
- goto err;
- return 0;
-err:
- uncore_types_exit(uncore_msr_uncores);
- uncore_msr_uncores = empty_uncore;
- return ret;
-}
-
-static int __init uncore_mmio_init(void)
-{
- struct intel_uncore_type **types = uncore_mmio_uncores;
- int ret;
-
- ret = uncore_types_init(types);
+ int ret = uncore_types_init(types);
if (ret)
- goto err;
+ uncore_types_exit(types);
- for (; *types; types++) {
- ret = type_pmu_register(*types);
- if (ret)
- goto err;
- }
- return 0;
-err:
- uncore_types_exit(uncore_mmio_uncores);
- uncore_mmio_uncores = empty_uncore;
return ret;
}
@@ -2050,12 +2000,16 @@ static int __init intel_uncore_init(void)
if (uncore_init->cpu_init) {
uncore_init->cpu_init();
- cret = uncore_cpu_init();
+ cret = uncore_pmu_types_init(uncore_msr_uncores);
+ if (cret)
+ uncore_msr_uncores = empty_uncore;
}
if (uncore_init->mmio_init) {
uncore_init->mmio_init();
- mret = uncore_mmio_init();
+ mret = uncore_pmu_types_init(uncore_mmio_uncores);
+ if (mret)
+ uncore_mmio_uncores = empty_uncore;
}
if (cret && pret && mret) {
diff --git a/arch/x86/events/intel/uncore.h b/arch/x86/events/intel/uncore.h
index 0adb477d9708..c8dfa2d21bfd 100644
--- a/arch/x86/events/intel/uncore.h
+++ b/arch/x86/events/intel/uncore.h
@@ -568,6 +568,12 @@ static inline u64 uncore_read_counter(struct intel_uncore_box *box,
return box->pmu->type->ops->read_counter(box, event);
}
+static inline bool uncore_box_active(struct intel_uncore_box *box)
+{
+ return (!box->pmu->type->ops->init_box ||
+ test_bit(UNCORE_BOX_FLAG_INITIALIZED, &box->flags));
+}
+
static inline int uncore_box_init(struct intel_uncore_box *box)
{
int ret = 0;
--
2.54.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* Re: [PATCH V3 8/8] perf/x86/intel/uncore: Implement lazy setup for MSR/MMIO PMUs
2026-06-11 16:00 ` [PATCH V3 8/8] perf/x86/intel/uncore: Implement lazy setup for MSR/MMIO PMUs Zide Chen
@ 2026-06-11 16:33 ` sashiko-bot
0 siblings, 0 replies; 15+ messages in thread
From: sashiko-bot @ 2026-06-11 16:33 UTC (permalink / raw)
To: Zide Chen; +Cc: linux-perf-users
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Medium] The `pmu->activeboxes` counter can underflow during CPU offline if `uncore_box_setup()` fails for an MSR/MMIO PMU lacking an `init_box` callback.
--
commit be5a594990bc5ad6318159289b770e88f7e8958d
Author: Zide Chen <zide.chen@intel.com>
perf/x86/intel/uncore: Implement lazy setup for MSR/MMIO PMUs
This commit delays the registration of MSR and MMIO uncore PMUs until the
first box is successfully initialized, applying the same lazy registration
model used by PCI uncore PMUs. Unregistration is deferred until the last
box exits, while box allocation remains at module init/exit time.
> diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
> index b9ac2f7d31caa..9f8a80c9dcb66 100644
> --- a/arch/x86/events/intel/uncore.c
> +++ b/arch/x86/events/intel/uncore.c
> @@ -1570,8 +1570,13 @@ static void uncore_box_unref(struct intel_uncore_type **types, int die)
> pmu = type->pmus;
> for (i = 0; i < type->num_boxes; i++, pmu++) {
> box = pmu->boxes[die];
> - if (box && box->cpu >= 0 && atomic_dec_return(&box->refcnt) == 0)
> + if (box && box->cpu >= 0 &&
> + atomic_dec_return(&box->refcnt) == 0) {
> + if (uncore_box_active(box) &&
> + atomic_dec_return(&pmu->activeboxes) == 0)
[Severity: Medium]
Could this atomic decrement cause pmu->activeboxes to underflow during CPU
offline?
If uncore_box_setup() fails during initialization for a PMU that lacks an
init_box callback, the activeboxes counter won't be incremented.
Because uncore_box_active() unconditionally returns true when init_box is
NULL, wouldn't uncore_box_unref() still decrement pmu->activeboxes here and
cause it to become negative?
> + uncore_pmu_unregister(pmu);
> uncore_box_exit(box);
> + }
> }
> }
> }
[ ... ]
> diff --git a/arch/x86/events/intel/uncore.h b/arch/x86/events/intel/uncore.h
> index 0adb477d97086..c8dfa2d21bfd9 100644
> --- a/arch/x86/events/intel/uncore.h
> +++ b/arch/x86/events/intel/uncore.h
> @@ -568,6 +568,12 @@ static inline u64 uncore_read_counter(struct intel_uncore_box *box,
> return box->pmu->type->ops->read_counter(box, event);
> }
>
> +static inline bool uncore_box_active(struct intel_uncore_box *box)
> +{
> + return (!box->pmu->type->ops->init_box ||
> + test_bit(UNCORE_BOX_FLAG_INITIALIZED, &box->flags));
> +}
[Severity: Medium]
Is it safe to unconditionally return true when init_box is NULL?
If uncore_box_setup() fails and marks the PMU as broken, this definition
causes uncore_box_unref() to still consider the box active and perform an
unbalanced decrement on the activeboxes counter.
> +
> static inline int uncore_box_init(struct intel_uncore_box *box)
> {
> int ret = 0;
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260611160033.66760-1-zide.chen@intel.com?part=8
^ permalink raw reply [flat|nested] 15+ messages in thread