* [PATCH V2 2/7] perf/x86/intel/uncore: Guard against invalid box control address
2026-05-27 15:11 [PATCH V2 1/7] perf/x86/intel/uncore: Fix discovery unit lookup for multi-die systems Zide Chen
@ 2026-05-27 15:11 ` Zide Chen
2026-05-27 17:28 ` sashiko-bot
2026-05-28 6:03 ` Mi, Dapeng
2026-05-27 15:11 ` [PATCH V2 3/7] perf/x86/intel/uncore: Fix PCI device refcount leak in UPI discovery Zide Chen
` (6 subsequent siblings)
7 siblings, 2 replies; 19+ messages in thread
From: Zide Chen @ 2026-05-27 15:11 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen
Theoretically, intel_uncore_find_discovery_unit() could return NULL,
e.g., when a CPU die is offline during uncore enumeration and its PMU
units are not added to the discovery RB-tree.
Guard against a NULL return value and the resulting invalid box control
address (0) before accessing hardware.
Signed-off-by: Zide Chen <zide.chen@intel.com>
---
V2:
- New patch.
- Address pre-existing invalid box control address issue (Sashiko).
---
arch/x86/events/intel/uncore_discovery.c | 36 ++++++++++++++++++------
1 file changed, 27 insertions(+), 9 deletions(-)
diff --git a/arch/x86/events/intel/uncore_discovery.c b/arch/x86/events/intel/uncore_discovery.c
index 60e1200c4691..af2217b44a81 100644
--- a/arch/x86/events/intel/uncore_discovery.c
+++ b/arch/x86/events/intel/uncore_discovery.c
@@ -490,17 +490,28 @@ static u64 intel_generic_uncore_box_ctl(struct intel_uncore_box *box)
void intel_generic_uncore_msr_init_box(struct intel_uncore_box *box)
{
- wrmsrq(intel_generic_uncore_box_ctl(box), GENERIC_PMON_BOX_CTL_INT);
+ u64 box_ctl = intel_generic_uncore_box_ctl(box);
+
+ if (!box_ctl)
+ return;
+
+ wrmsrq(box_ctl, GENERIC_PMON_BOX_CTL_INT);
}
void intel_generic_uncore_msr_disable_box(struct intel_uncore_box *box)
{
- wrmsrq(intel_generic_uncore_box_ctl(box), GENERIC_PMON_BOX_CTL_FRZ);
+ u64 box_ctl = intel_generic_uncore_box_ctl(box);
+
+ if (box_ctl)
+ wrmsrq(box_ctl, GENERIC_PMON_BOX_CTL_FRZ);
}
void intel_generic_uncore_msr_enable_box(struct intel_uncore_box *box)
{
- wrmsrq(intel_generic_uncore_box_ctl(box), 0);
+ u64 box_ctl = intel_generic_uncore_box_ctl(box);
+
+ if (box_ctl)
+ wrmsrq(box_ctl, 0);
}
static void intel_generic_uncore_msr_enable_event(struct intel_uncore_box *box,
@@ -549,6 +560,10 @@ bool intel_generic_uncore_assign_hw_event(struct perf_event *event,
if (box->pci_dev) {
box_ctl = UNCORE_DISCOVERY_PCI_BOX_CTRL(box_ctl);
+
+ if (!box_ctl)
+ return false;
+
hwc->config_base = box_ctl + uncore_pci_event_ctl(box, hwc->idx);
hwc->event_base = box_ctl + uncore_pci_perf_ctr(box, hwc->idx);
return true;
@@ -567,27 +582,30 @@ static inline int intel_pci_uncore_box_ctl(struct intel_uncore_box *box)
void intel_generic_uncore_pci_init_box(struct intel_uncore_box *box)
{
- struct pci_dev *pdev = box->pci_dev;
int box_ctl = intel_pci_uncore_box_ctl(box);
+ if (!box_ctl)
+ return;
+
__set_bit(UNCORE_BOX_FLAG_CTL_OFFS8, &box->flags);
- pci_write_config_dword(pdev, box_ctl, GENERIC_PMON_BOX_CTL_INT);
+ pci_write_config_dword(box->pci_dev, box_ctl, GENERIC_PMON_BOX_CTL_INT);
}
void intel_generic_uncore_pci_disable_box(struct intel_uncore_box *box)
{
- struct pci_dev *pdev = box->pci_dev;
int box_ctl = intel_pci_uncore_box_ctl(box);
- pci_write_config_dword(pdev, box_ctl, GENERIC_PMON_BOX_CTL_FRZ);
+ if (box_ctl)
+ pci_write_config_dword(box->pci_dev, box_ctl,
+ GENERIC_PMON_BOX_CTL_FRZ);
}
void intel_generic_uncore_pci_enable_box(struct intel_uncore_box *box)
{
- struct pci_dev *pdev = box->pci_dev;
int box_ctl = intel_pci_uncore_box_ctl(box);
- pci_write_config_dword(pdev, box_ctl, 0);
+ if (box_ctl)
+ pci_write_config_dword(box->pci_dev, box_ctl, 0);
}
static void intel_generic_uncore_pci_enable_event(struct intel_uncore_box *box,
--
2.54.0
^ permalink raw reply related [flat|nested] 19+ messages in thread* Re: [PATCH V2 2/7] perf/x86/intel/uncore: Guard against invalid box control address
2026-05-27 15:11 ` [PATCH V2 2/7] perf/x86/intel/uncore: Guard against invalid box control address Zide Chen
@ 2026-05-27 17:28 ` sashiko-bot
2026-05-28 6:03 ` Mi, Dapeng
1 sibling, 0 replies; 19+ messages in thread
From: sashiko-bot @ 2026-05-27 17:28 UTC (permalink / raw)
To: Zide Chen; +Cc: linux-perf-users
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
Pre-existing issues:
- [High] The patch guards box-level operations against invalid (0) control addresses when a discovery unit is missing, but it critically misses event-level operations. This omission allows an unprivileged or privileged user running `perf` to trigger a #GP fault (kernel panic) or PCI configuration space corruption.
--
commit 65ddbcdd36196085490e8783cb658d7e20c2e9c5
Author: Zide Chen <zide.chen@intel.com>
perf/x86/intel/uncore: Guard against invalid box control address
This commit guards against a NULL return value from
intel_uncore_find_discovery_unit() before accessing hardware box control
addresses. It prevents potential faults when a CPU die is offline during
uncore enumeration.
> diff --git a/arch/x86/events/intel/uncore_discovery.c b/arch/x86/events/intel/uncore_discovery.c
> index 60e1200c4691d..af2217b44a814 100644
> --- a/arch/x86/events/intel/uncore_discovery.c
> +++ b/arch/x86/events/intel/uncore_discovery.c
[ ... ]
> @@ -549,6 +560,10 @@ bool intel_generic_uncore_assign_hw_event(struct perf_event *event,
>
> if (box->pci_dev) {
> box_ctl = UNCORE_DISCOVERY_PCI_BOX_CTRL(box_ctl);
> +
> + if (!box_ctl)
> + return false;
> +
[Severity: High]
This is a pre-existing issue, but does returning false here when box_ctl is
invalid still leave us vulnerable to hardware access faults during event
execution?
When intel_generic_uncore_assign_hw_event() returns false, the caller
uncore_assign_hw_event() unconditionally falls back to legacy offset
calculations using uncore_event_ctl(). For discovery PMUs, this results in
hwc->config_base being set to a small relative offset like 0x0.
Later, when a user schedules a perf event on this box, the event startup
routines execute blindly using this base:
intel_generic_uncore_msr_enable_event()
wrmsrq(hwc->config_base, hwc->config);
Could this result in a fatal #GP fault by executing wrmsrq(0, ...), or
corrupt PCI configuration space by overwriting the Vendor or Device ID at
offset 0x0?
While this patch successfully prevents faults during boot or hotplug box
initialization, could it leave the crash accessible through user-controlled
perf execution?
> hwc->config_base = box_ctl + uncore_pci_event_ctl(box, hwc->idx);
> hwc->event_base = box_ctl + uncore_pci_perf_ctr(box, hwc->idx);
> return true;
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260527151154.130505-1-zide.chen@intel.com?part=2
^ permalink raw reply [flat|nested] 19+ messages in thread* Re: [PATCH V2 2/7] perf/x86/intel/uncore: Guard against invalid box control address
2026-05-27 15:11 ` [PATCH V2 2/7] perf/x86/intel/uncore: Guard against invalid box control address Zide Chen
2026-05-27 17:28 ` sashiko-bot
@ 2026-05-28 6:03 ` Mi, Dapeng
1 sibling, 0 replies; 19+ messages in thread
From: Mi, Dapeng @ 2026-05-28 6:03 UTC (permalink / raw)
To: Zide Chen, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
On 5/27/2026 11:11 PM, Zide Chen wrote:
> Theoretically, intel_uncore_find_discovery_unit() could return NULL,
> e.g., when a CPU die is offline during uncore enumeration and its PMU
> units are not added to the discovery RB-tree.
>
> Guard against a NULL return value and the resulting invalid box control
> address (0) before accessing hardware.
>
> Signed-off-by: Zide Chen <zide.chen@intel.com>
> ---
> V2:
> - New patch.
> - Address pre-existing invalid box control address issue (Sashiko).
> ---
> arch/x86/events/intel/uncore_discovery.c | 36 ++++++++++++++++++------
> 1 file changed, 27 insertions(+), 9 deletions(-)
>
> diff --git a/arch/x86/events/intel/uncore_discovery.c b/arch/x86/events/intel/uncore_discovery.c
> index 60e1200c4691..af2217b44a81 100644
> --- a/arch/x86/events/intel/uncore_discovery.c
> +++ b/arch/x86/events/intel/uncore_discovery.c
> @@ -490,17 +490,28 @@ static u64 intel_generic_uncore_box_ctl(struct intel_uncore_box *box)
>
> void intel_generic_uncore_msr_init_box(struct intel_uncore_box *box)
> {
> - wrmsrq(intel_generic_uncore_box_ctl(box), GENERIC_PMON_BOX_CTL_INT);
> + u64 box_ctl = intel_generic_uncore_box_ctl(box);
> +
> + if (!box_ctl)
> + return;
> +
> + wrmsrq(box_ctl, GENERIC_PMON_BOX_CTL_INT);
> }
>
> void intel_generic_uncore_msr_disable_box(struct intel_uncore_box *box)
> {
> - wrmsrq(intel_generic_uncore_box_ctl(box), GENERIC_PMON_BOX_CTL_FRZ);
> + u64 box_ctl = intel_generic_uncore_box_ctl(box);
> +
> + if (box_ctl)
> + wrmsrq(box_ctl, GENERIC_PMON_BOX_CTL_FRZ);
> }
>
> void intel_generic_uncore_msr_enable_box(struct intel_uncore_box *box)
> {
> - wrmsrq(intel_generic_uncore_box_ctl(box), 0);
> + u64 box_ctl = intel_generic_uncore_box_ctl(box);
> +
> + if (box_ctl)
> + wrmsrq(box_ctl, 0);
> }
>
> static void intel_generic_uncore_msr_enable_event(struct intel_uncore_box *box,
> @@ -549,6 +560,10 @@ bool intel_generic_uncore_assign_hw_event(struct perf_event *event,
>
> if (box->pci_dev) {
> box_ctl = UNCORE_DISCOVERY_PCI_BOX_CTRL(box_ctl);
> +
> + if (!box_ctl)
> + return false;
> +
> hwc->config_base = box_ctl + uncore_pci_event_ctl(box, hwc->idx);
> hwc->event_base = box_ctl + uncore_pci_perf_ctr(box, hwc->idx);
> return true;
> @@ -567,27 +582,30 @@ static inline int intel_pci_uncore_box_ctl(struct intel_uncore_box *box)
>
> void intel_generic_uncore_pci_init_box(struct intel_uncore_box *box)
> {
> - struct pci_dev *pdev = box->pci_dev;
> int box_ctl = intel_pci_uncore_box_ctl(box);
>
> + if (!box_ctl)
> + return;
> +
> __set_bit(UNCORE_BOX_FLAG_CTL_OFFS8, &box->flags);
> - pci_write_config_dword(pdev, box_ctl, GENERIC_PMON_BOX_CTL_INT);
> + pci_write_config_dword(box->pci_dev, box_ctl, GENERIC_PMON_BOX_CTL_INT);
> }
>
> void intel_generic_uncore_pci_disable_box(struct intel_uncore_box *box)
> {
> - struct pci_dev *pdev = box->pci_dev;
> int box_ctl = intel_pci_uncore_box_ctl(box);
>
> - pci_write_config_dword(pdev, box_ctl, GENERIC_PMON_BOX_CTL_FRZ);
> + if (box_ctl)
> + pci_write_config_dword(box->pci_dev, box_ctl,
> + GENERIC_PMON_BOX_CTL_FRZ);
> }
>
> void intel_generic_uncore_pci_enable_box(struct intel_uncore_box *box)
> {
> - struct pci_dev *pdev = box->pci_dev;
> int box_ctl = intel_pci_uncore_box_ctl(box);
>
> - pci_write_config_dword(pdev, box_ctl, 0);
> + if (box_ctl)
> + pci_write_config_dword(box->pci_dev, box_ctl, 0);
> }
>
> static void intel_generic_uncore_pci_enable_event(struct intel_uncore_box *box,
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH V2 3/7] perf/x86/intel/uncore: Fix PCI device refcount leak in UPI discovery
2026-05-27 15:11 [PATCH V2 1/7] perf/x86/intel/uncore: Fix discovery unit lookup for multi-die systems Zide Chen
2026-05-27 15:11 ` [PATCH V2 2/7] perf/x86/intel/uncore: Guard against invalid box control address Zide Chen
@ 2026-05-27 15:11 ` Zide Chen
2026-05-28 6:34 ` Mi, Dapeng
2026-05-27 15:11 ` [PATCH V2 4/7] perf/x86/intel/uncore: Defer ADL global PMON enable to enable_box() Zide Chen
` (5 subsequent siblings)
7 siblings, 1 reply; 19+ messages in thread
From: Zide Chen @ 2026-05-27 15:11 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen
pci_get_domain_bus_and_slot() increments the reference count of the
returned PCI device and therefore requires a matching pci_dev_put().
In skx_upi_topology_cb() and discover_upi_topology(), the lookup is
performed inside a loop, but pci_dev_put() is only called once after
the loop. As a result, references from all previous iterations are
leaked.
Move pci_dev_put(dev) into the if (dev) block immediately after
upi_fill_topology() returns.
Opportunistically, fix uninitialized variable in skx_upi_topology_cb().
Fixes: 4cfce57fa42d ("perf/x86/intel/uncore: Enable UPI topology discovery for Skylake Server")
Fixes: f680b6e6062e ("perf/x86/intel/uncore: Enable UPI topology discovery for Icelake Server")
Signed-off-by: Zide Chen <zide.chen@intel.com>
---
arch/x86/events/intel/uncore_snbep.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
index 215d33e260ed..c9ce206fcbb6 100644
--- a/arch/x86/events/intel/uncore_snbep.c
+++ b/arch/x86/events/intel/uncore_snbep.c
@@ -4261,7 +4261,7 @@ static int upi_fill_topology(struct pci_dev *dev, struct intel_uncore_topology *
static int skx_upi_topology_cb(struct intel_uncore_type *type, int segment,
int die, u64 cpu_bus_msr)
{
- int idx, ret;
+ int idx, ret = 0;
struct intel_uncore_topology *upi;
unsigned int devfn;
struct pci_dev *dev = NULL;
@@ -4274,12 +4274,12 @@ static int skx_upi_topology_cb(struct intel_uncore_type *type, int segment,
dev = pci_get_domain_bus_and_slot(segment, bus, devfn);
if (dev) {
ret = upi_fill_topology(dev, upi, idx);
+ pci_dev_put(dev);
if (ret)
break;
}
}
- pci_dev_put(dev);
return ret;
}
@@ -5499,6 +5499,7 @@ static int discover_upi_topology(struct intel_uncore_type *type, int ubox_did, i
devfn);
if (dev) {
ret = upi_fill_topology(dev, upi, idx);
+ pci_dev_put(dev);
if (ret)
goto err;
}
@@ -5506,7 +5507,6 @@ static int discover_upi_topology(struct intel_uncore_type *type, int ubox_did, i
}
err:
pci_dev_put(ubox);
- pci_dev_put(dev);
return ret;
}
--
2.54.0
^ permalink raw reply related [flat|nested] 19+ messages in thread* Re: [PATCH V2 3/7] perf/x86/intel/uncore: Fix PCI device refcount leak in UPI discovery
2026-05-27 15:11 ` [PATCH V2 3/7] perf/x86/intel/uncore: Fix PCI device refcount leak in UPI discovery Zide Chen
@ 2026-05-28 6:34 ` Mi, Dapeng
0 siblings, 0 replies; 19+ messages in thread
From: Mi, Dapeng @ 2026-05-28 6:34 UTC (permalink / raw)
To: Zide Chen, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
On 5/27/2026 11:11 PM, Zide Chen wrote:
> pci_get_domain_bus_and_slot() increments the reference count of the
> returned PCI device and therefore requires a matching pci_dev_put().
>
> In skx_upi_topology_cb() and discover_upi_topology(), the lookup is
> performed inside a loop, but pci_dev_put() is only called once after
> the loop. As a result, references from all previous iterations are
> leaked.
>
> Move pci_dev_put(dev) into the if (dev) block immediately after
> upi_fill_topology() returns.
>
> Opportunistically, fix uninitialized variable in skx_upi_topology_cb().
>
> Fixes: 4cfce57fa42d ("perf/x86/intel/uncore: Enable UPI topology discovery for Skylake Server")
> Fixes: f680b6e6062e ("perf/x86/intel/uncore: Enable UPI topology discovery for Icelake Server")
> Signed-off-by: Zide Chen <zide.chen@intel.com>
> ---
> arch/x86/events/intel/uncore_snbep.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
> index 215d33e260ed..c9ce206fcbb6 100644
> --- a/arch/x86/events/intel/uncore_snbep.c
> +++ b/arch/x86/events/intel/uncore_snbep.c
> @@ -4261,7 +4261,7 @@ static int upi_fill_topology(struct pci_dev *dev, struct intel_uncore_topology *
> static int skx_upi_topology_cb(struct intel_uncore_type *type, int segment,
> int die, u64 cpu_bus_msr)
> {
> - int idx, ret;
> + int idx, ret = 0;
> struct intel_uncore_topology *upi;
> unsigned int devfn;
> struct pci_dev *dev = NULL;
> @@ -4274,12 +4274,12 @@ static int skx_upi_topology_cb(struct intel_uncore_type *type, int segment,
> dev = pci_get_domain_bus_and_slot(segment, bus, devfn);
> if (dev) {
> ret = upi_fill_topology(dev, upi, idx);
> + pci_dev_put(dev);
> if (ret)
> break;
> }
> }
>
> - pci_dev_put(dev);
> return ret;
> }
>
> @@ -5499,6 +5499,7 @@ static int discover_upi_topology(struct intel_uncore_type *type, int ubox_did, i
> devfn);
> if (dev) {
> ret = upi_fill_topology(dev, upi, idx);
> + pci_dev_put(dev);
> if (ret)
> goto err;
> }
> @@ -5506,7 +5507,6 @@ static int discover_upi_topology(struct intel_uncore_type *type, int ubox_did, i
> }
> err:
> pci_dev_put(ubox);
> - pci_dev_put(dev);
> return ret;
> }
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH V2 4/7] perf/x86/intel/uncore: Defer ADL global PMON enable to enable_box()
2026-05-27 15:11 [PATCH V2 1/7] perf/x86/intel/uncore: Fix discovery unit lookup for multi-die systems Zide Chen
2026-05-27 15:11 ` [PATCH V2 2/7] perf/x86/intel/uncore: Guard against invalid box control address Zide Chen
2026-05-27 15:11 ` [PATCH V2 3/7] perf/x86/intel/uncore: Fix PCI device refcount leak in UPI discovery Zide Chen
@ 2026-05-27 15:11 ` Zide Chen
2026-05-27 18:17 ` sashiko-bot
2026-05-28 6:35 ` Mi, Dapeng
2026-05-27 15:11 ` [PATCH V2 5/7] perf/x86/intel/uncore: Move die_to_cpu() to uncore.c Zide Chen
` (4 subsequent siblings)
7 siblings, 2 replies; 19+ messages in thread
From: Zide Chen @ 2026-05-27 15:11 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen, stable
On some Raptor Cove CPUs, enabling uncore PMON globally at driver init
may increase power consumption even when no perf events are in use.
Drop adl_uncore_msr_init_box() and defer programming the global control
register to enable_box(), so it is only set when a box is actually used.
IMC and IMC freerunning counters use a separate control path and are
unaffected.
Cc: stable@vger.kernel.org
Signed-off-by: Zide Chen <zide.chen@intel.com>
---
arch/x86/events/intel/uncore_snb.c | 7 -------
1 file changed, 7 deletions(-)
diff --git a/arch/x86/events/intel/uncore_snb.c b/arch/x86/events/intel/uncore_snb.c
index 3dbc6bacbd9d..edddd4f9ab5f 100644
--- a/arch/x86/events/intel/uncore_snb.c
+++ b/arch/x86/events/intel/uncore_snb.c
@@ -563,12 +563,6 @@ void tgl_uncore_cpu_init(void)
skl_uncore_msr_ops.init_box = rkl_uncore_msr_init_box;
}
-static void adl_uncore_msr_init_box(struct intel_uncore_box *box)
-{
- if (box->pmu->pmu_idx == 0)
- wrmsrq(ADL_UNC_PERF_GLOBAL_CTL, SNB_UNC_GLOBAL_CTL_EN);
-}
-
static void adl_uncore_msr_enable_box(struct intel_uncore_box *box)
{
wrmsrq(ADL_UNC_PERF_GLOBAL_CTL, SNB_UNC_GLOBAL_CTL_EN);
@@ -587,7 +581,6 @@ static void adl_uncore_msr_exit_box(struct intel_uncore_box *box)
}
static struct intel_uncore_ops adl_uncore_msr_ops = {
- .init_box = adl_uncore_msr_init_box,
.enable_box = adl_uncore_msr_enable_box,
.disable_box = adl_uncore_msr_disable_box,
.exit_box = adl_uncore_msr_exit_box,
--
2.54.0
^ permalink raw reply related [flat|nested] 19+ messages in thread* Re: [PATCH V2 4/7] perf/x86/intel/uncore: Defer ADL global PMON enable to enable_box()
2026-05-27 15:11 ` [PATCH V2 4/7] perf/x86/intel/uncore: Defer ADL global PMON enable to enable_box() Zide Chen
@ 2026-05-27 18:17 ` sashiko-bot
2026-05-28 6:35 ` Mi, Dapeng
1 sibling, 0 replies; 19+ messages in thread
From: sashiko-bot @ 2026-05-27 18:17 UTC (permalink / raw)
To: Zide Chen; +Cc: linux-perf-users
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Medium] The global PMON control MSR remains permanently enabled after the first perf event is removed, defeating the patch's intended dynamic power-saving goal.
--
commit cfa3cf64f648ad5bd39932da84def0d92dadb8dc
Author: Zide Chen <zide.chen@intel.com>
perf/x86/intel/uncore: Defer ADL global PMON enable to enable_box()
This commit removes adl_uncore_msr_init_box() to stop unconditionally
enabling uncore PMON globally at driver init on Raptor Cove CPUs, which
was increasing power consumption. Instead, it defers programming the
global control register to enable_box() so it is only turned on when a
box is in use.
> diff --git a/arch/x86/events/intel/uncore_snb.c b/arch/x86/events/intel/uncore_snb.c
> index 3dbc6bacbd9de..edddd4f9ab5fc 100644
> --- a/arch/x86/events/intel/uncore_snb.c
> +++ b/arch/x86/events/intel/uncore_snb.c
[ ... ]
> @@ -563,12 +563,6 @@ void tgl_uncore_cpu_init(void)
> skl_uncore_msr_ops.init_box = rkl_uncore_msr_init_box;
> }
>
> -static void adl_uncore_msr_init_box(struct intel_uncore_box *box)
> -{
> - if (box->pmu->pmu_idx == 0)
> - wrmsrq(ADL_UNC_PERF_GLOBAL_CTL, SNB_UNC_GLOBAL_CTL_EN);
> -}
> -
> static void adl_uncore_msr_enable_box(struct intel_uncore_box *box)
> {
> wrmsrq(ADL_UNC_PERF_GLOBAL_CTL, SNB_UNC_GLOBAL_CTL_EN);
[Severity: Medium]
When the last perf event is removed, will the MSR remain permanently enabled?
The perf core framework wraps PMU transactions in perf_pmu_disable() and
perf_pmu_enable(). For example, when removing the last event,
perf_remove_from_context() ends with a call to perf_pmu_enable().
This ultimately invokes uncore_pmu_enable(), which unconditionally calls
the enable_box() callback. Since adl_uncore_msr_enable_box() doesn't
check if there are any active events (like checking box->n_active > 0),
it writes SNB_UNC_GLOBAL_CTL_EN to the MSR.
While this saves power before the first event is created, does it fail to
dynamically power down the hardware after its first use?
> }
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260527151154.130505-1-zide.chen@intel.com?part=4
^ permalink raw reply [flat|nested] 19+ messages in thread* Re: [PATCH V2 4/7] perf/x86/intel/uncore: Defer ADL global PMON enable to enable_box()
2026-05-27 15:11 ` [PATCH V2 4/7] perf/x86/intel/uncore: Defer ADL global PMON enable to enable_box() Zide Chen
2026-05-27 18:17 ` sashiko-bot
@ 2026-05-28 6:35 ` Mi, Dapeng
1 sibling, 0 replies; 19+ messages in thread
From: Mi, Dapeng @ 2026-05-28 6:35 UTC (permalink / raw)
To: Zide Chen, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, stable
On 5/27/2026 11:11 PM, Zide Chen wrote:
> On some Raptor Cove CPUs, enabling uncore PMON globally at driver init
> may increase power consumption even when no perf events are in use.
>
> Drop adl_uncore_msr_init_box() and defer programming the global control
> register to enable_box(), so it is only set when a box is actually used.
>
> IMC and IMC freerunning counters use a separate control path and are
> unaffected.
>
> Cc: stable@vger.kernel.org
Need a "Fixes" tag?
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> Signed-off-by: Zide Chen <zide.chen@intel.com>
> ---
> arch/x86/events/intel/uncore_snb.c | 7 -------
> 1 file changed, 7 deletions(-)
>
> diff --git a/arch/x86/events/intel/uncore_snb.c b/arch/x86/events/intel/uncore_snb.c
> index 3dbc6bacbd9d..edddd4f9ab5f 100644
> --- a/arch/x86/events/intel/uncore_snb.c
> +++ b/arch/x86/events/intel/uncore_snb.c
> @@ -563,12 +563,6 @@ void tgl_uncore_cpu_init(void)
> skl_uncore_msr_ops.init_box = rkl_uncore_msr_init_box;
> }
>
> -static void adl_uncore_msr_init_box(struct intel_uncore_box *box)
> -{
> - if (box->pmu->pmu_idx == 0)
> - wrmsrq(ADL_UNC_PERF_GLOBAL_CTL, SNB_UNC_GLOBAL_CTL_EN);
> -}
> -
> static void adl_uncore_msr_enable_box(struct intel_uncore_box *box)
> {
> wrmsrq(ADL_UNC_PERF_GLOBAL_CTL, SNB_UNC_GLOBAL_CTL_EN);
> @@ -587,7 +581,6 @@ static void adl_uncore_msr_exit_box(struct intel_uncore_box *box)
> }
>
> static struct intel_uncore_ops adl_uncore_msr_ops = {
> - .init_box = adl_uncore_msr_init_box,
> .enable_box = adl_uncore_msr_enable_box,
> .disable_box = adl_uncore_msr_disable_box,
> .exit_box = adl_uncore_msr_exit_box,
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH V2 5/7] perf/x86/intel/uncore: Move die_to_cpu() to uncore.c
2026-05-27 15:11 [PATCH V2 1/7] perf/x86/intel/uncore: Fix discovery unit lookup for multi-die systems Zide Chen
` (2 preceding siblings ...)
2026-05-27 15:11 ` [PATCH V2 4/7] perf/x86/intel/uncore: Defer ADL global PMON enable to enable_box() Zide Chen
@ 2026-05-27 15:11 ` Zide Chen
2026-05-28 6:36 ` Mi, Dapeng
2026-05-27 15:11 ` [PATCH V2 6/7] perf/x86/intel/uncore: Fix uncore_die_to_cpu() for offline dies Zide Chen
` (3 subsequent siblings)
7 siblings, 1 reply; 19+ messages in thread
From: Zide Chen @ 2026-05-27 15:11 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen
Move die_to_cpu() into uncore.c so it can be reused by the MSR
initialization path, preparing for the introduction of an MSR global
initialization callback.
Move the cpus_read_{lock,unlock}() out of the API, in order to make
it possible to be called when the lock is being held.
Add the uncore_ prefix for consistency with other uncore APIs.
Signed-off-by: Zide Chen <zide.chen@intel.com>
V2:
- Move cpus_read_{lock,unlock}() out of uncore_die_to_cpu() and rely
on callers to manage the lock. (Sashiko)
- Remove "No functional change intended" from the changelog.
---
arch/x86/events/intel/uncore.c | 19 +++++++++++++++++++
arch/x86/events/intel/uncore.h | 1 +
arch/x86/events/intel/uncore_snbep.c | 23 +++--------------------
3 files changed, 23 insertions(+), 20 deletions(-)
diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index e9cc1ba921c5..22256ded2d67 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -83,6 +83,25 @@ int uncore_device_to_die(struct pci_dev *dev)
return -1;
}
+/*
+ * Using cpus_read_lock() to ensure cpu is not going down between
+ * looking at cpu_online_mask.
+ *
+ * The lock must be held by the caller.
+ */
+int uncore_die_to_cpu(int die)
+{
+ int res = 0, cpu;
+
+ for_each_online_cpu(cpu) {
+ if (topology_logical_die_id(cpu) == die) {
+ res = cpu;
+ break;
+ }
+ }
+ return res;
+}
+
static void uncore_free_pcibus_map(void)
{
struct pci2phy_map *map, *tmp;
diff --git a/arch/x86/events/intel/uncore.h b/arch/x86/events/intel/uncore.h
index c35918c01afa..94c68e3417b6 100644
--- a/arch/x86/events/intel/uncore.h
+++ b/arch/x86/events/intel/uncore.h
@@ -235,6 +235,7 @@ struct pci2phy_map *__find_pci2phy_map(int segment);
int uncore_pcibus_to_dieid(struct pci_bus *bus);
int uncore_die_to_segment(int die);
int uncore_device_to_die(struct pci_dev *dev);
+int uncore_die_to_cpu(int die);
ssize_t uncore_event_show(struct device *dev,
struct device_attribute *attr, char *buf);
diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
index c9ce206fcbb6..772b78237424 100644
--- a/arch/x86/events/intel/uncore_snbep.c
+++ b/arch/x86/events/intel/uncore_snbep.c
@@ -3704,25 +3704,6 @@ static int skx_msr_cpu_bus_read(int cpu, u64 *topology)
return 0;
}
-static int die_to_cpu(int die)
-{
- int res = 0, cpu, current_die;
- /*
- * Using cpus_read_lock() to ensure cpu is not going down between
- * looking at cpu_online_mask.
- */
- cpus_read_lock();
- for_each_online_cpu(cpu) {
- current_die = topology_logical_die_id(cpu);
- if (current_die == die) {
- res = cpu;
- break;
- }
- }
- cpus_read_unlock();
- return res;
-}
-
enum {
IIO_TOPOLOGY_TYPE,
UPI_TOPOLOGY_TYPE,
@@ -3794,8 +3775,9 @@ static int skx_pmu_get_topology(struct intel_uncore_type *type,
int die, ret = -EPERM;
u64 cpu_bus_msr;
+ cpus_read_lock();
for (die = 0; die < uncore_max_dies(); die++) {
- ret = skx_msr_cpu_bus_read(die_to_cpu(die), &cpu_bus_msr);
+ ret = skx_msr_cpu_bus_read(uncore_die_to_cpu(die), &cpu_bus_msr);
if (ret)
break;
@@ -3807,6 +3789,7 @@ static int skx_pmu_get_topology(struct intel_uncore_type *type,
if (ret)
break;
}
+ cpus_read_unlock();
return ret;
}
--
2.54.0
^ permalink raw reply related [flat|nested] 19+ messages in thread* Re: [PATCH V2 5/7] perf/x86/intel/uncore: Move die_to_cpu() to uncore.c
2026-05-27 15:11 ` [PATCH V2 5/7] perf/x86/intel/uncore: Move die_to_cpu() to uncore.c Zide Chen
@ 2026-05-28 6:36 ` Mi, Dapeng
0 siblings, 0 replies; 19+ messages in thread
From: Mi, Dapeng @ 2026-05-28 6:36 UTC (permalink / raw)
To: Zide Chen, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
On 5/27/2026 11:11 PM, Zide Chen wrote:
> Move die_to_cpu() into uncore.c so it can be reused by the MSR
> initialization path, preparing for the introduction of an MSR global
> initialization callback.
>
> Move the cpus_read_{lock,unlock}() out of the API, in order to make
> it possible to be called when the lock is being held.
>
> Add the uncore_ prefix for consistency with other uncore APIs.
>
> Signed-off-by: Zide Chen <zide.chen@intel.com>
>
> V2:
> - Move cpus_read_{lock,unlock}() out of uncore_die_to_cpu() and rely
> on callers to manage the lock. (Sashiko)
> - Remove "No functional change intended" from the changelog.
> ---
> arch/x86/events/intel/uncore.c | 19 +++++++++++++++++++
> arch/x86/events/intel/uncore.h | 1 +
> arch/x86/events/intel/uncore_snbep.c | 23 +++--------------------
> 3 files changed, 23 insertions(+), 20 deletions(-)
>
> diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
> index e9cc1ba921c5..22256ded2d67 100644
> --- a/arch/x86/events/intel/uncore.c
> +++ b/arch/x86/events/intel/uncore.c
> @@ -83,6 +83,25 @@ int uncore_device_to_die(struct pci_dev *dev)
> return -1;
> }
>
> +/*
> + * Using cpus_read_lock() to ensure cpu is not going down between
> + * looking at cpu_online_mask.
> + *
> + * The lock must be held by the caller.
> + */
> +int uncore_die_to_cpu(int die)
> +{
> + int res = 0, cpu;
> +
> + for_each_online_cpu(cpu) {
> + if (topology_logical_die_id(cpu) == die) {
> + res = cpu;
> + break;
> + }
> + }
> + return res;
> +}
> +
> static void uncore_free_pcibus_map(void)
> {
> struct pci2phy_map *map, *tmp;
> diff --git a/arch/x86/events/intel/uncore.h b/arch/x86/events/intel/uncore.h
> index c35918c01afa..94c68e3417b6 100644
> --- a/arch/x86/events/intel/uncore.h
> +++ b/arch/x86/events/intel/uncore.h
> @@ -235,6 +235,7 @@ struct pci2phy_map *__find_pci2phy_map(int segment);
> int uncore_pcibus_to_dieid(struct pci_bus *bus);
> int uncore_die_to_segment(int die);
> int uncore_device_to_die(struct pci_dev *dev);
> +int uncore_die_to_cpu(int die);
>
> ssize_t uncore_event_show(struct device *dev,
> struct device_attribute *attr, char *buf);
> diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
> index c9ce206fcbb6..772b78237424 100644
> --- a/arch/x86/events/intel/uncore_snbep.c
> +++ b/arch/x86/events/intel/uncore_snbep.c
> @@ -3704,25 +3704,6 @@ static int skx_msr_cpu_bus_read(int cpu, u64 *topology)
> return 0;
> }
>
> -static int die_to_cpu(int die)
> -{
> - int res = 0, cpu, current_die;
> - /*
> - * Using cpus_read_lock() to ensure cpu is not going down between
> - * looking at cpu_online_mask.
> - */
> - cpus_read_lock();
> - for_each_online_cpu(cpu) {
> - current_die = topology_logical_die_id(cpu);
> - if (current_die == die) {
> - res = cpu;
> - break;
> - }
> - }
> - cpus_read_unlock();
> - return res;
> -}
> -
> enum {
> IIO_TOPOLOGY_TYPE,
> UPI_TOPOLOGY_TYPE,
> @@ -3794,8 +3775,9 @@ static int skx_pmu_get_topology(struct intel_uncore_type *type,
> int die, ret = -EPERM;
> u64 cpu_bus_msr;
>
> + cpus_read_lock();
> for (die = 0; die < uncore_max_dies(); die++) {
> - ret = skx_msr_cpu_bus_read(die_to_cpu(die), &cpu_bus_msr);
> + ret = skx_msr_cpu_bus_read(uncore_die_to_cpu(die), &cpu_bus_msr);
> if (ret)
> break;
>
> @@ -3807,6 +3789,7 @@ static int skx_pmu_get_topology(struct intel_uncore_type *type,
> if (ret)
> break;
> }
> + cpus_read_unlock();
>
> return ret;
> }
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH V2 6/7] perf/x86/intel/uncore: Fix uncore_die_to_cpu() for offline dies
2026-05-27 15:11 [PATCH V2 1/7] perf/x86/intel/uncore: Fix discovery unit lookup for multi-die systems Zide Chen
` (3 preceding siblings ...)
2026-05-27 15:11 ` [PATCH V2 5/7] perf/x86/intel/uncore: Move die_to_cpu() to uncore.c Zide Chen
@ 2026-05-27 15:11 ` Zide Chen
2026-05-27 19:56 ` sashiko-bot
2026-05-28 6:38 ` Mi, Dapeng
2026-05-27 15:11 ` [PATCH V2 7/7] perf/x86/intel/uncore: Implement global init callback for GNR uncore Zide Chen
` (2 subsequent siblings)
7 siblings, 2 replies; 19+ messages in thread
From: Zide Chen @ 2026-05-27 15:11 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen
If the die is offline when uncore_die_to_cpu() is called, it silently
returns 0, which is misleading. Return -1 in this case to indicate
that all CPUs on the die are offline and the caller can take care of
it accordingly.
Opportunistically, replace -EPERM with -ENODEV, as -ENODEV is
the appropriate error when no CPUs are online across all dies.
Signed-off-by: Zide Chen <zide.chen@intel.com>
---
arch/x86/events/intel/uncore.c | 2 +-
arch/x86/events/intel/uncore_snbep.c | 9 +++++++--
2 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index 22256ded2d67..4b3a1fa5b41b 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -91,7 +91,7 @@ int uncore_device_to_die(struct pci_dev *dev)
*/
int uncore_die_to_cpu(int die)
{
- int res = 0, cpu;
+ int res = -1, cpu;
for_each_online_cpu(cpu) {
if (topology_logical_die_id(cpu) == die) {
diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
index 772b78237424..334dc384b5b9 100644
--- a/arch/x86/events/intel/uncore_snbep.c
+++ b/arch/x86/events/intel/uncore_snbep.c
@@ -3772,12 +3772,17 @@ static void pmu_free_topology(struct intel_uncore_type *type)
static int skx_pmu_get_topology(struct intel_uncore_type *type,
int (*topology_cb)(struct intel_uncore_type*, int, int, u64))
{
- int die, ret = -EPERM;
+ int die, ret = -ENODEV;
u64 cpu_bus_msr;
+ int cpu;
cpus_read_lock();
for (die = 0; die < uncore_max_dies(); die++) {
- ret = skx_msr_cpu_bus_read(uncore_die_to_cpu(die), &cpu_bus_msr);
+ cpu = uncore_die_to_cpu(die);
+ if (cpu == -1)
+ continue;
+
+ ret = skx_msr_cpu_bus_read(cpu, &cpu_bus_msr);
if (ret)
break;
--
2.54.0
^ permalink raw reply related [flat|nested] 19+ messages in thread* Re: [PATCH V2 6/7] perf/x86/intel/uncore: Fix uncore_die_to_cpu() for offline dies
2026-05-27 15:11 ` [PATCH V2 6/7] perf/x86/intel/uncore: Fix uncore_die_to_cpu() for offline dies Zide Chen
@ 2026-05-27 19:56 ` sashiko-bot
2026-05-28 6:38 ` Mi, Dapeng
1 sibling, 0 replies; 19+ messages in thread
From: sashiko-bot @ 2026-05-27 19:56 UTC (permalink / raw)
To: Zide Chen; +Cc: linux-perf-users
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
Pre-existing issues:
- [High] This problem wasn't introduced by this patch, but in `pmu_alloc_topology()`, if the memory allocation for `topology[die]` fails, the error cleanup loop dereferences the newly-NULL pointer, causing a kernel crash.
- [High] This problem wasn't introduced by this patch, but there is an unprotected list traversal of PCI buses in `uncore_die_to_segment()` leading to a potential Use-After-Free if a PCI bus is hot-removed.
--
commit ff1f5cc12e40244014f762aa4e444d2ef1d867a3
Author: Zide Chen <zide.chen@intel.com>
perf/x86/intel/uncore: Fix uncore_die_to_cpu() for offline dies
If the die is offline when uncore_die_to_cpu() is called, it silently
returns 0, which is misleading. This patch modifies the function to return
-1 in this case, indicating that all CPUs on the die are offline. It also
opportunistically replaces -EPERM with -ENODEV.
> diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
> index 22256ded2d67d..4b3a1fa5b41b7 100644
> --- a/arch/x86/events/intel/uncore.c
> +++ b/arch/x86/events/intel/uncore.c
> @@ -91,7 +91,7 @@ int uncore_device_to_die(struct pci_dev *dev)
> */
> int uncore_die_to_cpu(int die)
> {
> - int res = 0, cpu;
> + int res = -1, cpu;
>
> for_each_online_cpu(cpu) {
> if (topology_logical_die_id(cpu) == die) {
[ ... ]
> diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
> index 772b78237424c..334dc384b5b93 100644
> --- a/arch/x86/events/intel/uncore_snbep.c
> +++ b/arch/x86/events/intel/uncore_snbep.c
> @@ -3772,12 +3772,17 @@ static void pmu_free_topology(struct intel_uncore_type *type)
> static int skx_pmu_get_topology(struct intel_uncore_type *type,
> int (*topology_cb)(struct intel_uncore_type*, int, int, u64))
> {
> - int die, ret = -EPERM;
> + int die, ret = -ENODEV;
> u64 cpu_bus_msr;
> + int cpu;
>
> cpus_read_lock();
> for (die = 0; die < uncore_max_dies(); die++) {
> - ret = skx_msr_cpu_bus_read(uncore_die_to_cpu(die), &cpu_bus_msr);
> + cpu = uncore_die_to_cpu(die);
> + if (cpu == -1)
> + continue;
> +
> + ret = skx_msr_cpu_bus_read(cpu, &cpu_bus_msr);
> if (ret)
> break;
[Severity: High]
This isn't a bug introduced by this patch, but later in this function, we
call uncore_die_to_segment() which iterates over PCI root buses using
pci_find_next_bus():
arch/x86/events/intel/uncore.c:uncore_die_to_segment() {
struct pci_bus *bus = NULL;
/* Find first pci bus which attributes to specified die. */
while ((bus = pci_find_next_bus(bus)) &&
(die != uncore_pcibus_to_dieid(bus)))
;
return bus ? pci_domain_nr(bus) : -EINVAL;
}
The pci_find_next_bus() function acquires and releases pci_bus_sem internally,
returning a raw bus pointer without holding a lock or taking a reference count.
If an unrelated hot-pluggable PCI root bus is hot-removed concurrently, could
the bus pointer be freed before the next iteration?
Passing this freed pointer back to pci_find_next_bus() could cause a
use-after-free when accessing from->node.next.
While cpus_read_lock() is held in skx_pmu_get_topology(), does it prevent
concurrent hotplug of unrelated PCI root buses?
[Severity: High]
This is a pre-existing issue, but I noticed a potential NULL pointer
dereference in pmu_alloc_topology() in this same file.
If kzalloc_objs() for topology[die] fails, the code jumps to the clear label:
arch/x86/events/intel/uncore_snbep.c:pmu_alloc_topology() {
...
for (die = 0; die < uncore_max_dies(); die++) {
topology[die] = kzalloc_objs(**topology, type->num_boxes);
if (!topology[die])
goto clear;
...
clear:
for (; die >= 0; die--) {
for (idx = 0; idx < type->num_boxes; idx++)
kfree(topology[die][idx].untyped);
kfree(topology[die]);
...
}
At the clear label, the loop variable die still points to the index that
failed allocation, so topology[die] is NULL.
Does the cleanup loop's attempt to execute kfree(topology[die][idx].untyped)
evaluate to (topology[die] + idx)->untyped and crash the kernel?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260527151154.130505-1-zide.chen@intel.com?part=6
^ permalink raw reply [flat|nested] 19+ messages in thread* Re: [PATCH V2 6/7] perf/x86/intel/uncore: Fix uncore_die_to_cpu() for offline dies
2026-05-27 15:11 ` [PATCH V2 6/7] perf/x86/intel/uncore: Fix uncore_die_to_cpu() for offline dies Zide Chen
2026-05-27 19:56 ` sashiko-bot
@ 2026-05-28 6:38 ` Mi, Dapeng
1 sibling, 0 replies; 19+ messages in thread
From: Mi, Dapeng @ 2026-05-28 6:38 UTC (permalink / raw)
To: Zide Chen, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
On 5/27/2026 11:11 PM, Zide Chen wrote:
> If the die is offline when uncore_die_to_cpu() is called, it silently
> returns 0, which is misleading. Return -1 in this case to indicate
> that all CPUs on the die are offline and the caller can take care of
> it accordingly.
>
> Opportunistically, replace -EPERM with -ENODEV, as -ENODEV is
> the appropriate error when no CPUs are online across all dies.
>
> Signed-off-by: Zide Chen <zide.chen@intel.com>
> ---
> arch/x86/events/intel/uncore.c | 2 +-
> arch/x86/events/intel/uncore_snbep.c | 9 +++++++--
> 2 files changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
> index 22256ded2d67..4b3a1fa5b41b 100644
> --- a/arch/x86/events/intel/uncore.c
> +++ b/arch/x86/events/intel/uncore.c
> @@ -91,7 +91,7 @@ int uncore_device_to_die(struct pci_dev *dev)
> */
> int uncore_die_to_cpu(int die)
> {
> - int res = 0, cpu;
> + int res = -1, cpu;
>
> for_each_online_cpu(cpu) {
> if (topology_logical_die_id(cpu) == die) {
> diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
> index 772b78237424..334dc384b5b9 100644
> --- a/arch/x86/events/intel/uncore_snbep.c
> +++ b/arch/x86/events/intel/uncore_snbep.c
> @@ -3772,12 +3772,17 @@ static void pmu_free_topology(struct intel_uncore_type *type)
> static int skx_pmu_get_topology(struct intel_uncore_type *type,
> int (*topology_cb)(struct intel_uncore_type*, int, int, u64))
> {
> - int die, ret = -EPERM;
> + int die, ret = -ENODEV;
> u64 cpu_bus_msr;
> + int cpu;
>
> cpus_read_lock();
> for (die = 0; die < uncore_max_dies(); die++) {
> - ret = skx_msr_cpu_bus_read(uncore_die_to_cpu(die), &cpu_bus_msr);
> + cpu = uncore_die_to_cpu(die);
> + if (cpu == -1)
> + continue;
> +
> + ret = skx_msr_cpu_bus_read(cpu, &cpu_bus_msr);
> if (ret)
> break;
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH V2 7/7] perf/x86/intel/uncore: Implement global init callback for GNR uncore
2026-05-27 15:11 [PATCH V2 1/7] perf/x86/intel/uncore: Fix discovery unit lookup for multi-die systems Zide Chen
` (4 preceding siblings ...)
2026-05-27 15:11 ` [PATCH V2 6/7] perf/x86/intel/uncore: Fix uncore_die_to_cpu() for offline dies Zide Chen
@ 2026-05-27 15:11 ` Zide Chen
2026-05-27 20:45 ` sashiko-bot
2026-05-28 6:46 ` Mi, Dapeng
2026-05-27 15:45 ` [PATCH V2 1/7] perf/x86/intel/uncore: Fix discovery unit lookup for multi-die systems sashiko-bot
2026-05-28 6:01 ` Mi, Dapeng
7 siblings, 2 replies; 19+ messages in thread
From: Zide Chen @ 2026-05-27 15:11 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen
On Sierra Forest and Clearwater Forest, the FRZ_ALL bit in the global
control register defaults to 0 at boot, but UBOX PMON units do not
work until the global control register is explicitly written with 0
to trigger hardware initialization properly.
Implement the generic uncore_msr_global_init() callback and add it to
gnr_uncore_init[], which is shared by GNR, GRR, SRF, and CWF.
Signed-off-by: Zide Chen <zide.chen@intel.com>
---
V2:
- Propagate return value of wrmsrq_on_cpu() to global_init().
---
arch/x86/events/intel/uncore.c | 13 ++++++++++++-
arch/x86/events/intel/uncore.h | 2 +-
arch/x86/events/intel/uncore_discovery.c | 2 +-
3 files changed, 14 insertions(+), 3 deletions(-)
diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index 4b3a1fa5b41b..7857959c6e82 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -1716,7 +1716,7 @@ static int __init uncore_mmio_init(void)
return ret;
}
-static int uncore_mmio_global_init(u64 ctl)
+static int uncore_mmio_global_init(int die, u64 ctl)
{
void __iomem *io_addr;
@@ -1731,6 +1731,16 @@ static int uncore_mmio_global_init(u64 ctl)
return 0;
}
+static int uncore_msr_global_init(int die, u64 msr)
+{
+ int cpu = uncore_die_to_cpu(die);
+
+ if (cpu == -1)
+ return -ENODEV;
+
+ return wrmsrq_on_cpu(cpu, msr, 0);
+}
+
static const struct uncore_plat_init nhm_uncore_init __initconst = {
.cpu_init = nhm_uncore_cpu_init,
};
@@ -1871,6 +1881,7 @@ static const struct uncore_plat_init gnr_uncore_init __initconst = {
.domain[0].base_is_pci = true,
.domain[0].discovery_base = UNCORE_DISCOVERY_TABLE_DEVICE,
.domain[0].units_ignore = gnr_uncore_units_ignore,
+ .domain[0].global_init = uncore_msr_global_init,
};
static const struct uncore_plat_init dmr_uncore_init __initconst = {
diff --git a/arch/x86/events/intel/uncore.h b/arch/x86/events/intel/uncore.h
index 94c68e3417b6..c2e5ccb1d72c 100644
--- a/arch/x86/events/intel/uncore.h
+++ b/arch/x86/events/intel/uncore.h
@@ -53,7 +53,7 @@ struct uncore_discovery_domain {
/* MSR address or PCI device used as the discovery base */
u32 discovery_base;
bool base_is_pci;
- int (*global_init)(u64 ctl);
+ int (*global_init)(int die, u64 ctl);
/* The units in the discovery table should be ignored. */
int *units_ignore;
diff --git a/arch/x86/events/intel/uncore_discovery.c b/arch/x86/events/intel/uncore_discovery.c
index af2217b44a81..e36613d934b1 100644
--- a/arch/x86/events/intel/uncore_discovery.c
+++ b/arch/x86/events/intel/uncore_discovery.c
@@ -287,7 +287,7 @@ static int __parse_discovery_table(struct uncore_discovery_domain *domain,
if (!io_addr)
return -ENOMEM;
- if (domain->global_init && domain->global_init(global.ctl)) {
+ if (domain->global_init && domain->global_init(die, global.ctl)) {
ret = -ENODEV;
goto out;
}
--
2.54.0
^ permalink raw reply related [flat|nested] 19+ messages in thread* Re: [PATCH V2 7/7] perf/x86/intel/uncore: Implement global init callback for GNR uncore
2026-05-27 15:11 ` [PATCH V2 7/7] perf/x86/intel/uncore: Implement global init callback for GNR uncore Zide Chen
@ 2026-05-27 20:45 ` sashiko-bot
2026-05-28 6:46 ` Mi, Dapeng
1 sibling, 0 replies; 19+ messages in thread
From: sashiko-bot @ 2026-05-27 20:45 UTC (permalink / raw)
To: Zide Chen; +Cc: linux-perf-users
Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
- [High] Calling `uncore_die_to_cpu()` without holding `cpus_read_lock()` introduces a race condition with CPU hotplug, potentially aborting PMU discovery.
- [High] Use of the unsafe `wrmsrq_on_cpu()` function with an unvalidated MSR index can trigger a General Protection Fault (#GP) and crash the kernel.
- [High] The required global initialization write is only performed during discovery and is missed upon system resume, leaving PMON units uninitialized after sleep.
- [Medium] PMU discovery permanently fails for a die if all of its CPUs are offline during module initialization.
--
commit d885055ecaea97356a710a83b6d4de7ee7d6ba76
Author: Zide Chen <zide.chen@intel.com>
Subject: perf/x86/intel/uncore: Implement global init callback for GNR uncore
This commit implements a global initialization callback for GNR uncore to
explicitly write 0 to the global control register at boot. This triggers
hardware initialization properly for UBOX PMON units on Sierra Forest and
Clearwater Forest.
> diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
> index 4b3a1fa5b41b7..7857959c6e823 100644
> --- a/arch/x86/events/intel/uncore.c
> +++ b/arch/x86/events/intel/uncore.c
[ ... ]
> @@ -1731,6 +1731,16 @@ static int uncore_mmio_global_init(u64 ctl)
> return 0;
> }
>
> +static int uncore_msr_global_init(int die, u64 msr)
> +{
> + int cpu = uncore_die_to_cpu(die);
> +
> + if (cpu == -1)
> + return -ENODEV;
[Severity: Medium]
If a die has no online CPUs during module load or boot, this will return
-ENODEV and abort parsing the discovery table for that die. Since there
is no mechanism to re-trigger discovery when a CPU on that die is later
brought online via hotplug, will the uncore PMUs for that die remain
permanently unavailable?
[Severity: High]
Is this safe against CPU hotplug? uncore_discovery_pci() invokes
__parse_discovery_table() without holding cpus_read_lock(). If the
selected CPU goes offline in the narrow window between uncore_die_to_cpu()
and the subsequent wrmsrq_on_cpu() call, smp_call_function_single() will
return -ENXIO and silently abort PMU discovery for the entire die.
> +
> + return wrmsrq_on_cpu(cpu, msr, 0);
> +}
[Severity: High]
Can this trigger a General Protection Fault (#GP)? The msr parameter is
derived from a 64-bit value dynamically read from the hardware/firmware
discovery table. If the firmware provides an invalid value, an MMIO physical
address, or if the table is corrupted, wrmsrq_on_cpu() will execute a bare
wrmsr instruction. Should this use wrmsrq_safe_on_cpu() to gracefully handle
invalid MSR indices?
[Severity: High]
Since domain->global_init() is only called once during initial uncore
discovery, what happens during S3 suspend or hibernation? The hardware drops
power and loses its state, reverting the FRZ_ALL bit back to the default.
Without a suspend/resume callback to re-trigger this explicit global
initialization write, will the UBOX PMON units silently fail to work after
the system resumes from sleep?
[ ... ]
> diff --git a/arch/x86/events/intel/uncore_discovery.c b/arch/x86/events/intel/uncore_discovery.c
> index af2217b44a814..e36613d934b1f 100644
> --- a/arch/x86/events/intel/uncore_discovery.c
> +++ b/arch/x86/events/intel/uncore_discovery.c
[ ... ]
> @@ -287,7 +287,7 @@ static int __parse_discovery_table(struct uncore_discovery_domain *domain,
> if (!io_addr)
> return -ENOMEM;
>
> - if (domain->global_init && domain->global_init(global.ctl)) {
> + if (domain->global_init && domain->global_init(die, global.ctl)) {
> ret = -ENODEV;
> goto out;
> }
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260527151154.130505-1-zide.chen@intel.com?part=7
^ permalink raw reply [flat|nested] 19+ messages in thread* Re: [PATCH V2 7/7] perf/x86/intel/uncore: Implement global init callback for GNR uncore
2026-05-27 15:11 ` [PATCH V2 7/7] perf/x86/intel/uncore: Implement global init callback for GNR uncore Zide Chen
2026-05-27 20:45 ` sashiko-bot
@ 2026-05-28 6:46 ` Mi, Dapeng
1 sibling, 0 replies; 19+ messages in thread
From: Mi, Dapeng @ 2026-05-28 6:46 UTC (permalink / raw)
To: Zide Chen, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users
On 5/27/2026 11:11 PM, Zide Chen wrote:
> On Sierra Forest and Clearwater Forest, the FRZ_ALL bit in the global
> control register defaults to 0 at boot, but UBOX PMON units do not
> work until the global control register is explicitly written with 0
> to trigger hardware initialization properly.
>
> Implement the generic uncore_msr_global_init() callback and add it to
> gnr_uncore_init[], which is shared by GNR, GRR, SRF, and CWF.
Need a "Fixes" tag?
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> Signed-off-by: Zide Chen <zide.chen@intel.com>
> ---
> V2:
> - Propagate return value of wrmsrq_on_cpu() to global_init().
> ---
> arch/x86/events/intel/uncore.c | 13 ++++++++++++-
> arch/x86/events/intel/uncore.h | 2 +-
> arch/x86/events/intel/uncore_discovery.c | 2 +-
> 3 files changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
> index 4b3a1fa5b41b..7857959c6e82 100644
> --- a/arch/x86/events/intel/uncore.c
> +++ b/arch/x86/events/intel/uncore.c
> @@ -1716,7 +1716,7 @@ static int __init uncore_mmio_init(void)
> return ret;
> }
>
> -static int uncore_mmio_global_init(u64 ctl)
> +static int uncore_mmio_global_init(int die, u64 ctl)
> {
> void __iomem *io_addr;
>
> @@ -1731,6 +1731,16 @@ static int uncore_mmio_global_init(u64 ctl)
> return 0;
> }
>
> +static int uncore_msr_global_init(int die, u64 msr)
> +{
> + int cpu = uncore_die_to_cpu(die);
> +
> + if (cpu == -1)
> + return -ENODEV;
> +
> + return wrmsrq_on_cpu(cpu, msr, 0);
> +}
> +
> static const struct uncore_plat_init nhm_uncore_init __initconst = {
> .cpu_init = nhm_uncore_cpu_init,
> };
> @@ -1871,6 +1881,7 @@ static const struct uncore_plat_init gnr_uncore_init __initconst = {
> .domain[0].base_is_pci = true,
> .domain[0].discovery_base = UNCORE_DISCOVERY_TABLE_DEVICE,
> .domain[0].units_ignore = gnr_uncore_units_ignore,
> + .domain[0].global_init = uncore_msr_global_init,
> };
>
> static const struct uncore_plat_init dmr_uncore_init __initconst = {
> diff --git a/arch/x86/events/intel/uncore.h b/arch/x86/events/intel/uncore.h
> index 94c68e3417b6..c2e5ccb1d72c 100644
> --- a/arch/x86/events/intel/uncore.h
> +++ b/arch/x86/events/intel/uncore.h
> @@ -53,7 +53,7 @@ struct uncore_discovery_domain {
> /* MSR address or PCI device used as the discovery base */
> u32 discovery_base;
> bool base_is_pci;
> - int (*global_init)(u64 ctl);
> + int (*global_init)(int die, u64 ctl);
>
> /* The units in the discovery table should be ignored. */
> int *units_ignore;
> diff --git a/arch/x86/events/intel/uncore_discovery.c b/arch/x86/events/intel/uncore_discovery.c
> index af2217b44a81..e36613d934b1 100644
> --- a/arch/x86/events/intel/uncore_discovery.c
> +++ b/arch/x86/events/intel/uncore_discovery.c
> @@ -287,7 +287,7 @@ static int __parse_discovery_table(struct uncore_discovery_domain *domain,
> if (!io_addr)
> return -ENOMEM;
>
> - if (domain->global_init && domain->global_init(global.ctl)) {
> + if (domain->global_init && domain->global_init(die, global.ctl)) {
> ret = -ENODEV;
> goto out;
> }
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH V2 1/7] perf/x86/intel/uncore: Fix discovery unit lookup for multi-die systems
2026-05-27 15:11 [PATCH V2 1/7] perf/x86/intel/uncore: Fix discovery unit lookup for multi-die systems Zide Chen
` (5 preceding siblings ...)
2026-05-27 15:11 ` [PATCH V2 7/7] perf/x86/intel/uncore: Implement global init callback for GNR uncore Zide Chen
@ 2026-05-27 15:45 ` sashiko-bot
2026-05-28 6:01 ` Mi, Dapeng
7 siblings, 0 replies; 19+ messages in thread
From: sashiko-bot @ 2026-05-27 15:45 UTC (permalink / raw)
To: Zide Chen; +Cc: linux-perf-users
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Critical] Unhandled return value of 0 from `intel_generic_uncore_box_ctl()` causes fatal writes to MSR 0 or PCI offset 0.
--
Note: The format of this report has been altered to provide a free-form summary without directly quoting the original patch code, due to automated recitation restrictions.
commit e9069bffe8688f6e4044b8f15ff7ac27dcdbb094
Author: Zide Chen <zide.chen@intel.com>
Subject: perf/x86/intel/uncore: Fix discovery unit lookup for multi-die systems
This commit fixes a unit lookup issue on multi-die systems by using the box's
die ID instead of a hardcoded -1. It also removes a warning when a discovery
unit is not found, as offline CPU dies will now rightfully return a NULL unit.
[Severity: Critical]
In the updated intel_generic_uncore_box_ctl(), returning 0 when the unit is
not found (such as for offline dies) appears to introduce a regression in
its callers.
Looking at callers like intel_generic_uncore_msr_init_box() and
intel_generic_uncore_pci_init_box(), they seem to use the returned control
address blindly. If 0 is returned, MSR-based uncores would attempt to write
to MSR 0x0 (IA32_P5_MC_ADDR), which triggers a general protection fault and
panics the kernel on modern CPUs. For PCI uncores, writing to offset 0
targets the read-only vendor and device ID, which could trigger AER NMIs.
Is it safe to return 0 here without adding validation to the callers? I
note that this is fixed later in the patch series by a commit named
"perf/x86/intel/uncore: Guard against invalid box control address", but
does leaving it unhandled in this commit break git bisectability?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260527151154.130505-1-zide.chen@intel.com?part=1
^ permalink raw reply [flat|nested] 19+ messages in thread* Re: [PATCH V2 1/7] perf/x86/intel/uncore: Fix discovery unit lookup for multi-die systems
2026-05-27 15:11 [PATCH V2 1/7] perf/x86/intel/uncore: Fix discovery unit lookup for multi-die systems Zide Chen
` (6 preceding siblings ...)
2026-05-27 15:45 ` [PATCH V2 1/7] perf/x86/intel/uncore: Fix discovery unit lookup for multi-die systems sashiko-bot
@ 2026-05-28 6:01 ` Mi, Dapeng
7 siblings, 0 replies; 19+ messages in thread
From: Mi, Dapeng @ 2026-05-28 6:01 UTC (permalink / raw)
To: Zide Chen, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users
LGTM.
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
On 5/27/2026 11:11 PM, Zide Chen wrote:
> In uncore_find_add_unit(), PMON units with the same unit ID may be
> added to the uncore discovery RB-tree for different dies. These units
> are distinguished by node->die.
>
> However, intel_generic_uncore_box_ctl() uses a fixed die ID of -1 when
> looking up the discovery unit, which may retrieve the wrong node on
> multi-die systems.
>
> Use box->dieid instead so the correct discovery unit is selected.
>
> No functional issue has been observed so far because currently supported
> platforms happen to use the same unit control register for such units.
>
> Remove WARN_ON_ONCE() because with the above change a NULL unit can be
> expected, e.g. when a CPU die is offline during uncore enumeration and
> the unit is not added to the RB-tree. In this case,
> intel_uncore_find_discovery_unit() returns NULL once the die becomes
> online, and it is expected that the PMU box is not functional for that
> die.
>
> Fixes: b1d9ea2e1ca4 ("perf/x86/uncore: Apply the unit control RB tree to MSR uncore units")
> Signed-off-by: Zide Chen <zide.chen@intel.com>
> ---
> v2:
> - Remove WARN_ON_ONCE().
> ---
> arch/x86/events/intel/uncore_discovery.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/events/intel/uncore_discovery.c b/arch/x86/events/intel/uncore_discovery.c
> index 583cbd06b9b8..60e1200c4691 100644
> --- a/arch/x86/events/intel/uncore_discovery.c
> +++ b/arch/x86/events/intel/uncore_discovery.c
> @@ -481,8 +481,8 @@ static u64 intel_generic_uncore_box_ctl(struct intel_uncore_box *box)
> struct intel_uncore_discovery_unit *unit;
>
> unit = intel_uncore_find_discovery_unit(box->pmu->type->boxes,
> - -1, box->pmu->pmu_idx);
> - if (WARN_ON_ONCE(!unit))
> + box->dieid, box->pmu->pmu_idx);
> + if (!unit)
> return 0;
>
> return unit->addr;
^ permalink raw reply [flat|nested] 19+ messages in thread