* [PATCH V5 0/5] Miscellaneous Intel uncore patches
@ 2026-03-24 21:49 Zide Chen
2026-03-24 21:49 ` [PATCH V5 1/4] perf/x86/intel/uncore: Fix iounmap() leak on global_init failure Zide Chen
` (3 more replies)
0 siblings, 4 replies; 10+ messages in thread
From: Zide Chen @ 2026-03-24 21:49 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen, Steve Wahl,
Chun-Tse Shao, Markus Elfring
This is mainly a RESEND of the patch set below.
No code changes since the last posting, except for a minor change in
patch 3.
I grouped these patches together for the convenience of the maintainers
and bumped the whole series to v5.
Changes since v4:
- Remove unused die_id variable from patch 3.
v4:
https://lore.kernel.org/lkml/20260313174050.171704-1-zide.chen@intel.com/
Previous versions:
https://lore.kernel.org/lkml/cda9ab9b-4581-409f-a9bb-0e8a67ad3530@web.de/T/
https://lore.kernel.org/lkml/20260116031113.45098-1-zide.chen@intel.com/
https://lore.kernel.org/lkml/20260206231203.11760-1-zide.chen@intel.com/
Zide Chen (4):
perf/x86/intel/uncore: Fix iounmap() leak on global_init failure
perf/x86/intel/uncore: Skip discovery table for offline dies
perf/x86/intel/uncore: Fix die ID init and look up bugs
perf/x86/intel/uncore: Remove extra double quote mark
arch/x86/events/intel/uncore.c | 1 +
arch/x86/events/intel/uncore_discovery.c | 17 ++++---
arch/x86/events/intel/uncore_snbep.c | 65 ++++++++++++------------
3 files changed, 44 insertions(+), 39 deletions(-)
--
2.53.0
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH V5 1/4] perf/x86/intel/uncore: Fix iounmap() leak on global_init failure
2026-03-24 21:49 [PATCH V5 0/5] Miscellaneous Intel uncore patches Zide Chen
@ 2026-03-24 21:49 ` Zide Chen
2026-03-24 21:49 ` [PATCH V5 2/4] perf/x86/intel/uncore: Skip discovery table for offline dies Zide Chen
` (2 subsequent siblings)
3 siblings, 0 replies; 10+ messages in thread
From: Zide Chen @ 2026-03-24 21:49 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen, Steve Wahl,
Chun-Tse Shao, Markus Elfring, kernel test robot
Kernel test robot reported:
Unverified Error/Warning (likely false positive, kindly check if
interested):
arch/x86/events/intel/uncore_discovery.c:293:2-8:
ERROR: missing iounmap; ioremap on line 288 and execution via
conditional on line 292
If domain->global_init() fails in __parse_discovery_table(), the
ioremap'ed MMIO region is not released before returning, resulting
in an MMIO mapping leak.
Reported-by: kernel test robot <lkp@intel.com>
Fixes: b575fc0e3357 ("perf/x86/intel/uncore: Add domain global init callback")
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
---
V2:
- As suggested by Markus, add an `out` label and use goto-based error
handling to reduce duplicated iounmap() code.
- Add the original warning from the kernel test robot to the commit message.
- Trivial rewording of the commit message.
V3:
- Add Reviewed-by tag.
---
arch/x86/events/intel/uncore_discovery.c | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/arch/x86/events/intel/uncore_discovery.c b/arch/x86/events/intel/uncore_discovery.c
index 1b1a6b221048..6a4e892cd525 100644
--- a/arch/x86/events/intel/uncore_discovery.c
+++ b/arch/x86/events/intel/uncore_discovery.c
@@ -264,6 +264,7 @@ static int __parse_discovery_table(struct uncore_discovery_domain *domain,
struct uncore_unit_discovery unit;
void __iomem *io_addr;
unsigned long size;
+ int ret = 0;
int i;
size = UNCORE_DISCOVERY_GLOBAL_MAP_SIZE;
@@ -273,21 +274,23 @@ static int __parse_discovery_table(struct uncore_discovery_domain *domain,
/* Read Global Discovery State */
memcpy_fromio(&global, io_addr, sizeof(struct uncore_global_discovery));
+ iounmap(io_addr);
+
if (uncore_discovery_invalid_unit(global)) {
pr_info("Invalid Global Discovery State: 0x%llx 0x%llx 0x%llx\n",
global.table1, global.ctl, global.table3);
- iounmap(io_addr);
return -EINVAL;
}
- iounmap(io_addr);
size = (1 + global.max_units) * global.stride * 8;
io_addr = ioremap(addr, size);
if (!io_addr)
return -ENOMEM;
- if (domain->global_init && domain->global_init(global.ctl))
- return -ENODEV;
+ if (domain->global_init && domain->global_init(global.ctl)) {
+ ret = -ENODEV;
+ goto out;
+ }
/* Parsing Unit Discovery State */
for (i = 0; i < global.max_units; i++) {
@@ -307,8 +310,10 @@ static int __parse_discovery_table(struct uncore_discovery_domain *domain,
}
*parsed = true;
+
+out:
iounmap(io_addr);
- return 0;
+ return ret;
}
static int parse_discovery_table(struct uncore_discovery_domain *domain,
--
2.53.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH V5 2/4] perf/x86/intel/uncore: Skip discovery table for offline dies
2026-03-24 21:49 [PATCH V5 0/5] Miscellaneous Intel uncore patches Zide Chen
2026-03-24 21:49 ` [PATCH V5 1/4] perf/x86/intel/uncore: Fix iounmap() leak on global_init failure Zide Chen
@ 2026-03-24 21:49 ` Zide Chen
2026-03-24 21:49 ` [PATCH V5 3/4] perf/x86/intel/uncore: Fix die ID init and look up bugs Zide Chen
2026-03-24 21:49 ` [PATCH V5 4/4] perf/x86/intel/uncore: Remove extra double quote mark Zide Chen
3 siblings, 0 replies; 10+ messages in thread
From: Zide Chen @ 2026-03-24 21:49 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen, Steve Wahl,
Chun-Tse Shao, Markus Elfring
This warning can be triggered if NUMA is disabled and the system
boots with fewer CPUs than the number of CPUs in die 0.
WARNING: CPU: 9 PID: 7257 at uncore.c:1157 uncore_pci_pmu_register+0x136/0x160 [intel_uncore]
Currently, the discovery table continues to be parsed even if all CPUs
in the associated die are offline. This can lead to an array overflow
at "pmu->boxes[die] = box" in uncore_pci_pmu_register(), which may
trigger the warning above or cause other issues.
Reported-by: Steve Wahl <steve.wahl@hpe.com>
Tested-by: Steve Wahl <steve.wahl@hpe.com>
Fixes: edae1f06c2cd ("perf/x86/intel/uncore: Parse uncore discovery tables")
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
---
V2:
- Add the Tested-by tag
- Rebase onto perf/core (base commit: a491c02c2770)
V3:
- Remove the overly cautious WARN_ON()
V4:
- Add Reviewed-by tag.
---
arch/x86/events/intel/uncore_discovery.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/events/intel/uncore_discovery.c b/arch/x86/events/intel/uncore_discovery.c
index 6a4e892cd525..749db3649f8f 100644
--- a/arch/x86/events/intel/uncore_discovery.c
+++ b/arch/x86/events/intel/uncore_discovery.c
@@ -371,7 +371,7 @@ static bool uncore_discovery_pci(struct uncore_discovery_domain *domain)
(val & UNCORE_DISCOVERY_DVSEC2_BIR_MASK) * UNCORE_DISCOVERY_BIR_STEP;
die = get_device_die_id(dev);
- if (die < 0)
+ if ((die < 0) || (die >= uncore_max_dies()))
continue;
parse_discovery_table(domain, dev, die, bar_offset, &parsed);
--
2.53.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH V5 3/4] perf/x86/intel/uncore: Fix die ID init and look up bugs
2026-03-24 21:49 [PATCH V5 0/5] Miscellaneous Intel uncore patches Zide Chen
2026-03-24 21:49 ` [PATCH V5 1/4] perf/x86/intel/uncore: Fix iounmap() leak on global_init failure Zide Chen
2026-03-24 21:49 ` [PATCH V5 2/4] perf/x86/intel/uncore: Skip discovery table for offline dies Zide Chen
@ 2026-03-24 21:49 ` Zide Chen
2026-03-25 0:35 ` Mi, Dapeng
2026-03-26 6:03 ` Mi, Dapeng
2026-03-24 21:49 ` [PATCH V5 4/4] perf/x86/intel/uncore: Remove extra double quote mark Zide Chen
3 siblings, 2 replies; 10+ messages in thread
From: Zide Chen @ 2026-03-24 21:49 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen, Steve Wahl,
Chun-Tse Shao, Markus Elfring
In snbep_pci2phy_map_init(), in the nr_node_ids > 8 path,
uncore_device_to_die() may return -1 when all CPUs associated
with the UBOX device are offline.
Remove the WARN_ON_ONCE(die_id == -1) check for two reasons:
- The current code breaks out of the loop. This is incorrect because
pci_get_device() does not guarantee iteration in domain or bus order,
so additional UBOX devices may be skipped during the scan.
- Returning -EINVAL is incorrect, since marking offline buses with
die_id == -1 is expected and should not be treated as an error.
Separately, when NUMA is disabled on a NUMA-capable platform,
pcibus_to_node() returns NUMA_NO_NODE, causing uncore_device_to_die()
to return -1 for all PCI devices. As a result,
spr_update_device_location(), used on Intel SPR and EMR, ignores the
corresponding PMON units and does not add them to the RB tree.
Fix this by using uncore_pcibus_to_dieid(), which retrieves topology
from the UBOX GIDNIDMAP register and works regardless of whether NUMA
is enabled in Linux. This requires snbep_pci2phy_map_init() to be
added in spr_uncore_pci_init().
Keep uncore_device_to_die() only for the nr_node_ids > 8 case, where
NUMA is expected to be enabled.
Fixes: 9a7832ce3d92 ("perf/x86/intel/uncore: With > 8 nodes, get pci bus die id from NUMA info")
Fixes: 65248a9a9ee1 ("perf/x86/uncore: Add a quirk for UPI on SPR")
Tested-by: Steve Wahl <steve.wahl@hpe.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
---
V2:
- Fix the commit message to note that spr_update_device_location() is
used by EMR, not GNR.
- Rewrite the commit message for clarity.
- Add a Tested-by tag.
V5:
- Remove unused variable die_id (Dapeng).
---
arch/x86/events/intel/uncore.c | 1 +
arch/x86/events/intel/uncore_snbep.c | 17 ++++++++---------
2 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index 786bd51a0d89..e9cc1ba921c5 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -67,6 +67,7 @@ int uncore_die_to_segment(int die)
return bus ? pci_domain_nr(bus) : -EINVAL;
}
+/* Note: This API can only be used when NUMA information is available. */
int uncore_device_to_die(struct pci_dev *dev)
{
int node = pcibus_to_node(dev->bus);
diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
index 9b51883fd6fd..5ef205a70559 100644
--- a/arch/x86/events/intel/uncore_snbep.c
+++ b/arch/x86/events/intel/uncore_snbep.c
@@ -1413,7 +1413,7 @@ static int topology_gidnid_map(int nodeid, u32 gidnid)
static int snbep_pci2phy_map_init(int devid, int nodeid_loc, int idmap_loc, bool reverse)
{
struct pci_dev *ubox_dev = NULL;
- int i, bus, nodeid, segment, die_id;
+ int i, bus, nodeid, segment;
struct pci2phy_map *map;
int err = 0;
u32 config = 0;
@@ -1458,14 +1458,8 @@ static int snbep_pci2phy_map_init(int devid, int nodeid_loc, int idmap_loc, bool
break;
}
- map->pbus_to_dieid[bus] = die_id = uncore_device_to_die(ubox_dev);
-
+ map->pbus_to_dieid[bus] = uncore_device_to_die(ubox_dev);
raw_spin_unlock(&pci2phy_map_lock);
-
- if (WARN_ON_ONCE(die_id == -1)) {
- err = -EINVAL;
- break;
- }
}
}
@@ -6420,7 +6414,7 @@ static void spr_update_device_location(int type_id)
while ((dev = pci_get_device(PCI_VENDOR_ID_INTEL, device, dev)) != NULL) {
- die = uncore_device_to_die(dev);
+ die = uncore_pcibus_to_dieid(dev->bus);
if (die < 0)
continue;
@@ -6444,6 +6438,11 @@ static void spr_update_device_location(int type_id)
int spr_uncore_pci_init(void)
{
+ int ret = snbep_pci2phy_map_init(0x3250, SKX_CPUNODEID, SKX_GIDNIDMAP, true);
+
+ if (ret)
+ return ret;
+
/*
* The discovery table of UPI on some SPR variant is broken,
* which impacts the detection of both UPI and M3UPI uncore PMON.
--
2.53.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH V5 4/4] perf/x86/intel/uncore: Remove extra double quote mark
2026-03-24 21:49 [PATCH V5 0/5] Miscellaneous Intel uncore patches Zide Chen
` (2 preceding siblings ...)
2026-03-24 21:49 ` [PATCH V5 3/4] perf/x86/intel/uncore: Fix die ID init and look up bugs Zide Chen
@ 2026-03-24 21:49 ` Zide Chen
3 siblings, 0 replies; 10+ messages in thread
From: Zide Chen @ 2026-03-24 21:49 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen, Steve Wahl,
Chun-Tse Shao, Markus Elfring
The third argument in INTEL_UNCORE_FR_EVENT_DESC() is subject to
__stringify(), and the extra double quote marks can result in the
expansion "3.814697266e-6" in the sysfs knobs, instead of
3.814697266e-6.
This is incorrect, though it may still work for perf, e.g.
perf stat -e uncore_iio_free_running_0/bw_in_port0/
Fixes: d8987048f665 ("perf/x86/intel/uncore: Support IIO free-running counters on DMR")
Reported-by: Chun-Tse Shao <ctshao@google.com>
Closes: https://lore.kernel.org/all/20251231224233.113839-1-zide.chen@intel.com/
Reviewed-by: Chun-Tse Shao <ctshao@google.com>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
---
V2: Added Reviewed-by tags.
---
arch/x86/events/intel/uncore_snbep.c | 48 ++++++++++++++--------------
1 file changed, 24 insertions(+), 24 deletions(-)
diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
index 5ef205a70559..526e995c63da 100644
--- a/arch/x86/events/intel/uncore_snbep.c
+++ b/arch/x86/events/intel/uncore_snbep.c
@@ -6960,34 +6960,34 @@ static struct freerunning_counters dmr_iio_freerunning[] = {
static struct uncore_event_desc dmr_uncore_iio_freerunning_events[] = {
/* ITC Free Running Data BW counter for inbound traffic */
- INTEL_UNCORE_FR_EVENT_DESC(inb_data_port0, 0x10, "3.814697266e-6"),
- INTEL_UNCORE_FR_EVENT_DESC(inb_data_port1, 0x11, "3.814697266e-6"),
- INTEL_UNCORE_FR_EVENT_DESC(inb_data_port2, 0x12, "3.814697266e-6"),
- INTEL_UNCORE_FR_EVENT_DESC(inb_data_port3, 0x13, "3.814697266e-6"),
- INTEL_UNCORE_FR_EVENT_DESC(inb_data_port4, 0x14, "3.814697266e-6"),
- INTEL_UNCORE_FR_EVENT_DESC(inb_data_port5, 0x15, "3.814697266e-6"),
- INTEL_UNCORE_FR_EVENT_DESC(inb_data_port6, 0x16, "3.814697266e-6"),
- INTEL_UNCORE_FR_EVENT_DESC(inb_data_port7, 0x17, "3.814697266e-6"),
+ INTEL_UNCORE_FR_EVENT_DESC(inb_data_port0, 0x10, 3.814697266e-6),
+ INTEL_UNCORE_FR_EVENT_DESC(inb_data_port1, 0x11, 3.814697266e-6),
+ INTEL_UNCORE_FR_EVENT_DESC(inb_data_port2, 0x12, 3.814697266e-6),
+ INTEL_UNCORE_FR_EVENT_DESC(inb_data_port3, 0x13, 3.814697266e-6),
+ INTEL_UNCORE_FR_EVENT_DESC(inb_data_port4, 0x14, 3.814697266e-6),
+ INTEL_UNCORE_FR_EVENT_DESC(inb_data_port5, 0x15, 3.814697266e-6),
+ INTEL_UNCORE_FR_EVENT_DESC(inb_data_port6, 0x16, 3.814697266e-6),
+ INTEL_UNCORE_FR_EVENT_DESC(inb_data_port7, 0x17, 3.814697266e-6),
/* ITC Free Running BW IN counters */
- INTEL_UNCORE_FR_EVENT_DESC(bw_in_port0, 0x20, "3.814697266e-6"),
- INTEL_UNCORE_FR_EVENT_DESC(bw_in_port1, 0x21, "3.814697266e-6"),
- INTEL_UNCORE_FR_EVENT_DESC(bw_in_port2, 0x22, "3.814697266e-6"),
- INTEL_UNCORE_FR_EVENT_DESC(bw_in_port3, 0x23, "3.814697266e-6"),
- INTEL_UNCORE_FR_EVENT_DESC(bw_in_port4, 0x24, "3.814697266e-6"),
- INTEL_UNCORE_FR_EVENT_DESC(bw_in_port5, 0x25, "3.814697266e-6"),
- INTEL_UNCORE_FR_EVENT_DESC(bw_in_port6, 0x26, "3.814697266e-6"),
- INTEL_UNCORE_FR_EVENT_DESC(bw_in_port7, 0x27, "3.814697266e-6"),
+ INTEL_UNCORE_FR_EVENT_DESC(bw_in_port0, 0x20, 3.814697266e-6),
+ INTEL_UNCORE_FR_EVENT_DESC(bw_in_port1, 0x21, 3.814697266e-6),
+ INTEL_UNCORE_FR_EVENT_DESC(bw_in_port2, 0x22, 3.814697266e-6),
+ INTEL_UNCORE_FR_EVENT_DESC(bw_in_port3, 0x23, 3.814697266e-6),
+ INTEL_UNCORE_FR_EVENT_DESC(bw_in_port4, 0x24, 3.814697266e-6),
+ INTEL_UNCORE_FR_EVENT_DESC(bw_in_port5, 0x25, 3.814697266e-6),
+ INTEL_UNCORE_FR_EVENT_DESC(bw_in_port6, 0x26, 3.814697266e-6),
+ INTEL_UNCORE_FR_EVENT_DESC(bw_in_port7, 0x27, 3.814697266e-6),
/* ITC Free Running BW OUT counters */
- INTEL_UNCORE_FR_EVENT_DESC(bw_out_port0, 0x30, "3.814697266e-6"),
- INTEL_UNCORE_FR_EVENT_DESC(bw_out_port1, 0x31, "3.814697266e-6"),
- INTEL_UNCORE_FR_EVENT_DESC(bw_out_port2, 0x32, "3.814697266e-6"),
- INTEL_UNCORE_FR_EVENT_DESC(bw_out_port3, 0x33, "3.814697266e-6"),
- INTEL_UNCORE_FR_EVENT_DESC(bw_out_port4, 0x34, "3.814697266e-6"),
- INTEL_UNCORE_FR_EVENT_DESC(bw_out_port5, 0x35, "3.814697266e-6"),
- INTEL_UNCORE_FR_EVENT_DESC(bw_out_port6, 0x36, "3.814697266e-6"),
- INTEL_UNCORE_FR_EVENT_DESC(bw_out_port7, 0x37, "3.814697266e-6"),
+ INTEL_UNCORE_FR_EVENT_DESC(bw_out_port0, 0x30, 3.814697266e-6),
+ INTEL_UNCORE_FR_EVENT_DESC(bw_out_port1, 0x31, 3.814697266e-6),
+ INTEL_UNCORE_FR_EVENT_DESC(bw_out_port2, 0x32, 3.814697266e-6),
+ INTEL_UNCORE_FR_EVENT_DESC(bw_out_port3, 0x33, 3.814697266e-6),
+ INTEL_UNCORE_FR_EVENT_DESC(bw_out_port4, 0x34, 3.814697266e-6),
+ INTEL_UNCORE_FR_EVENT_DESC(bw_out_port5, 0x35, 3.814697266e-6),
+ INTEL_UNCORE_FR_EVENT_DESC(bw_out_port6, 0x36, 3.814697266e-6),
+ INTEL_UNCORE_FR_EVENT_DESC(bw_out_port7, 0x37, 3.814697266e-6),
/* Free Running Clock Counter */
INTEL_UNCORE_EVENT_DESC(clockticks, "event=0xff,umask=0x40"),
--
2.53.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH V5 3/4] perf/x86/intel/uncore: Fix die ID init and look up bugs
2026-03-24 21:49 ` [PATCH V5 3/4] perf/x86/intel/uncore: Fix die ID init and look up bugs Zide Chen
@ 2026-03-25 0:35 ` Mi, Dapeng
2026-03-26 6:03 ` Mi, Dapeng
1 sibling, 0 replies; 10+ messages in thread
From: Mi, Dapeng @ 2026-03-25 0:35 UTC (permalink / raw)
To: Zide Chen, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Steve Wahl, Chun-Tse Shao,
Markus Elfring
LGTM.
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
On 3/25/2026 5:49 AM, Zide Chen wrote:
> In snbep_pci2phy_map_init(), in the nr_node_ids > 8 path,
> uncore_device_to_die() may return -1 when all CPUs associated
> with the UBOX device are offline.
>
> Remove the WARN_ON_ONCE(die_id == -1) check for two reasons:
>
> - The current code breaks out of the loop. This is incorrect because
> pci_get_device() does not guarantee iteration in domain or bus order,
> so additional UBOX devices may be skipped during the scan.
>
> - Returning -EINVAL is incorrect, since marking offline buses with
> die_id == -1 is expected and should not be treated as an error.
>
> Separately, when NUMA is disabled on a NUMA-capable platform,
> pcibus_to_node() returns NUMA_NO_NODE, causing uncore_device_to_die()
> to return -1 for all PCI devices. As a result,
> spr_update_device_location(), used on Intel SPR and EMR, ignores the
> corresponding PMON units and does not add them to the RB tree.
>
> Fix this by using uncore_pcibus_to_dieid(), which retrieves topology
> from the UBOX GIDNIDMAP register and works regardless of whether NUMA
> is enabled in Linux. This requires snbep_pci2phy_map_init() to be
> added in spr_uncore_pci_init().
>
> Keep uncore_device_to_die() only for the nr_node_ids > 8 case, where
> NUMA is expected to be enabled.
>
> Fixes: 9a7832ce3d92 ("perf/x86/intel/uncore: With > 8 nodes, get pci bus die id from NUMA info")
> Fixes: 65248a9a9ee1 ("perf/x86/uncore: Add a quirk for UPI on SPR")
> Tested-by: Steve Wahl <steve.wahl@hpe.com>
> Signed-off-by: Zide Chen <zide.chen@intel.com>
> ---
> V2:
> - Fix the commit message to note that spr_update_device_location() is
> used by EMR, not GNR.
> - Rewrite the commit message for clarity.
> - Add a Tested-by tag.
>
> V5:
> - Remove unused variable die_id (Dapeng).
> ---
> arch/x86/events/intel/uncore.c | 1 +
> arch/x86/events/intel/uncore_snbep.c | 17 ++++++++---------
> 2 files changed, 9 insertions(+), 9 deletions(-)
>
> diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
> index 786bd51a0d89..e9cc1ba921c5 100644
> --- a/arch/x86/events/intel/uncore.c
> +++ b/arch/x86/events/intel/uncore.c
> @@ -67,6 +67,7 @@ int uncore_die_to_segment(int die)
> return bus ? pci_domain_nr(bus) : -EINVAL;
> }
>
> +/* Note: This API can only be used when NUMA information is available. */
> int uncore_device_to_die(struct pci_dev *dev)
> {
> int node = pcibus_to_node(dev->bus);
> diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
> index 9b51883fd6fd..5ef205a70559 100644
> --- a/arch/x86/events/intel/uncore_snbep.c
> +++ b/arch/x86/events/intel/uncore_snbep.c
> @@ -1413,7 +1413,7 @@ static int topology_gidnid_map(int nodeid, u32 gidnid)
> static int snbep_pci2phy_map_init(int devid, int nodeid_loc, int idmap_loc, bool reverse)
> {
> struct pci_dev *ubox_dev = NULL;
> - int i, bus, nodeid, segment, die_id;
> + int i, bus, nodeid, segment;
> struct pci2phy_map *map;
> int err = 0;
> u32 config = 0;
> @@ -1458,14 +1458,8 @@ static int snbep_pci2phy_map_init(int devid, int nodeid_loc, int idmap_loc, bool
> break;
> }
>
> - map->pbus_to_dieid[bus] = die_id = uncore_device_to_die(ubox_dev);
> -
> + map->pbus_to_dieid[bus] = uncore_device_to_die(ubox_dev);
> raw_spin_unlock(&pci2phy_map_lock);
> -
> - if (WARN_ON_ONCE(die_id == -1)) {
> - err = -EINVAL;
> - break;
> - }
> }
> }
>
> @@ -6420,7 +6414,7 @@ static void spr_update_device_location(int type_id)
>
> while ((dev = pci_get_device(PCI_VENDOR_ID_INTEL, device, dev)) != NULL) {
>
> - die = uncore_device_to_die(dev);
> + die = uncore_pcibus_to_dieid(dev->bus);
> if (die < 0)
> continue;
>
> @@ -6444,6 +6438,11 @@ static void spr_update_device_location(int type_id)
>
> int spr_uncore_pci_init(void)
> {
> + int ret = snbep_pci2phy_map_init(0x3250, SKX_CPUNODEID, SKX_GIDNIDMAP, true);
> +
> + if (ret)
> + return ret;
> +
> /*
> * The discovery table of UPI on some SPR variant is broken,
> * which impacts the detection of both UPI and M3UPI uncore PMON.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH V5 3/4] perf/x86/intel/uncore: Fix die ID init and look up bugs
2026-03-24 21:49 ` [PATCH V5 3/4] perf/x86/intel/uncore: Fix die ID init and look up bugs Zide Chen
2026-03-25 0:35 ` Mi, Dapeng
@ 2026-03-26 6:03 ` Mi, Dapeng
2026-03-26 23:57 ` Chen, Zide
1 sibling, 1 reply; 10+ messages in thread
From: Mi, Dapeng @ 2026-03-26 6:03 UTC (permalink / raw)
To: Zide Chen, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Steve Wahl, Chun-Tse Shao,
Markus Elfring
Zide, Sashiko gave some comments on this patch. Could you please have a
look if they are reasonable? Thanks.
https://sashiko.dev/#/patchset/20260324214932.10068-1-zide.chen%40intel.com
On 3/25/2026 5:49 AM, Zide Chen wrote:
> In snbep_pci2phy_map_init(), in the nr_node_ids > 8 path,
> uncore_device_to_die() may return -1 when all CPUs associated
> with the UBOX device are offline.
>
> Remove the WARN_ON_ONCE(die_id == -1) check for two reasons:
>
> - The current code breaks out of the loop. This is incorrect because
> pci_get_device() does not guarantee iteration in domain or bus order,
> so additional UBOX devices may be skipped during the scan.
>
> - Returning -EINVAL is incorrect, since marking offline buses with
> die_id == -1 is expected and should not be treated as an error.
>
> Separately, when NUMA is disabled on a NUMA-capable platform,
> pcibus_to_node() returns NUMA_NO_NODE, causing uncore_device_to_die()
> to return -1 for all PCI devices. As a result,
> spr_update_device_location(), used on Intel SPR and EMR, ignores the
> corresponding PMON units and does not add them to the RB tree.
>
> Fix this by using uncore_pcibus_to_dieid(), which retrieves topology
> from the UBOX GIDNIDMAP register and works regardless of whether NUMA
> is enabled in Linux. This requires snbep_pci2phy_map_init() to be
> added in spr_uncore_pci_init().
>
> Keep uncore_device_to_die() only for the nr_node_ids > 8 case, where
> NUMA is expected to be enabled.
>
> Fixes: 9a7832ce3d92 ("perf/x86/intel/uncore: With > 8 nodes, get pci bus die id from NUMA info")
> Fixes: 65248a9a9ee1 ("perf/x86/uncore: Add a quirk for UPI on SPR")
> Tested-by: Steve Wahl <steve.wahl@hpe.com>
> Signed-off-by: Zide Chen <zide.chen@intel.com>
> ---
> V2:
> - Fix the commit message to note that spr_update_device_location() is
> used by EMR, not GNR.
> - Rewrite the commit message for clarity.
> - Add a Tested-by tag.
>
> V5:
> - Remove unused variable die_id (Dapeng).
> ---
> arch/x86/events/intel/uncore.c | 1 +
> arch/x86/events/intel/uncore_snbep.c | 17 ++++++++---------
> 2 files changed, 9 insertions(+), 9 deletions(-)
>
> diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
> index 786bd51a0d89..e9cc1ba921c5 100644
> --- a/arch/x86/events/intel/uncore.c
> +++ b/arch/x86/events/intel/uncore.c
> @@ -67,6 +67,7 @@ int uncore_die_to_segment(int die)
> return bus ? pci_domain_nr(bus) : -EINVAL;
> }
>
> +/* Note: This API can only be used when NUMA information is available. */
> int uncore_device_to_die(struct pci_dev *dev)
> {
> int node = pcibus_to_node(dev->bus);
> diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
> index 9b51883fd6fd..5ef205a70559 100644
> --- a/arch/x86/events/intel/uncore_snbep.c
> +++ b/arch/x86/events/intel/uncore_snbep.c
> @@ -1413,7 +1413,7 @@ static int topology_gidnid_map(int nodeid, u32 gidnid)
> static int snbep_pci2phy_map_init(int devid, int nodeid_loc, int idmap_loc, bool reverse)
> {
> struct pci_dev *ubox_dev = NULL;
> - int i, bus, nodeid, segment, die_id;
> + int i, bus, nodeid, segment;
> struct pci2phy_map *map;
> int err = 0;
> u32 config = 0;
> @@ -1458,14 +1458,8 @@ static int snbep_pci2phy_map_init(int devid, int nodeid_loc, int idmap_loc, bool
> break;
> }
>
> - map->pbus_to_dieid[bus] = die_id = uncore_device_to_die(ubox_dev);
> -
> + map->pbus_to_dieid[bus] = uncore_device_to_die(ubox_dev);
> raw_spin_unlock(&pci2phy_map_lock);
> -
> - if (WARN_ON_ONCE(die_id == -1)) {
> - err = -EINVAL;
> - break;
> - }
> }
> }
>
> @@ -6420,7 +6414,7 @@ static void spr_update_device_location(int type_id)
>
> while ((dev = pci_get_device(PCI_VENDOR_ID_INTEL, device, dev)) != NULL) {
>
> - die = uncore_device_to_die(dev);
> + die = uncore_pcibus_to_dieid(dev->bus);
> if (die < 0)
> continue;
>
> @@ -6444,6 +6438,11 @@ static void spr_update_device_location(int type_id)
>
> int spr_uncore_pci_init(void)
> {
> + int ret = snbep_pci2phy_map_init(0x3250, SKX_CPUNODEID, SKX_GIDNIDMAP, true);
> +
> + if (ret)
> + return ret;
> +
> /*
> * The discovery table of UPI on some SPR variant is broken,
> * which impacts the detection of both UPI and M3UPI uncore PMON.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH V5 3/4] perf/x86/intel/uncore: Fix die ID init and look up bugs
2026-03-26 6:03 ` Mi, Dapeng
@ 2026-03-26 23:57 ` Chen, Zide
2026-03-27 2:03 ` Mi, Dapeng
0 siblings, 1 reply; 10+ messages in thread
From: Chen, Zide @ 2026-03-26 23:57 UTC (permalink / raw)
To: Mi, Dapeng, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Steve Wahl, Chun-Tse Shao,
Markus Elfring
On 3/25/2026 11:03 PM, Mi, Dapeng wrote:
> Zide, Sashiko gave some comments on this patch. Could you please have a
> look if they are reasonable? Thanks.
>
> https://sashiko.dev/#/patchset/20260324214932.10068-1-zide.chen%40intel.com
1. Regarding the concern that this change may replace an offline node's
-1 with the die ID of an adjacent online node, I do not think this is an
issue.
After this fix, the logic is the same for both (nr_node_ids <= 8) and
(nr_node_ids > 8): map->pbus_to_dieid[bus] may be written with an
invalid die_id (e.g., -1). This is not an error and is expected in some
cases. We should continue to populate the map->pbus_to_dieid[] array.
Regardless of the traversal order (as determined by the reverse
argument), for a given die, the UBOX device is expected to reside on the
first valid bus in the die it is affined to.
Under the current assignment algorithm, all buses following a UBOX
device, up to the next UBOX device or the end of traversal, are assigned
the same die ID.
For example, on SPR, there are two UBOX devices: one device on bus 0x7e
in die 0, and another on bus 0xfe in die 1. With reversed traversal
order, buses 0xff–0x7f are assigned die ID 1, while buses 0x7e–0x00 are
assigned die ID 0.
If all CPUs in die 1 are offline, then buses 0xff–0x7f are assigned -1.
This is fine.
That being said, the die ID for invalid buses is not consistent, which
is not ideal.
2. Regarding the repeated snbep_pci2phy_map_init() calls. I wanted a
"simple" fix initially. I may need to split this patch into two
separate patches.
>
> On 3/25/2026 5:49 AM, Zide Chen wrote:
>> In snbep_pci2phy_map_init(), in the nr_node_ids > 8 path,
>> uncore_device_to_die() may return -1 when all CPUs associated
>> with the UBOX device are offline.
>>
>> Remove the WARN_ON_ONCE(die_id == -1) check for two reasons:
>>
>> - The current code breaks out of the loop. This is incorrect because
>> pci_get_device() does not guarantee iteration in domain or bus order,
>> so additional UBOX devices may be skipped during the scan.
>>
>> - Returning -EINVAL is incorrect, since marking offline buses with
>> die_id == -1 is expected and should not be treated as an error.
>>
>> Separately, when NUMA is disabled on a NUMA-capable platform,
>> pcibus_to_node() returns NUMA_NO_NODE, causing uncore_device_to_die()
>> to return -1 for all PCI devices. As a result,
>> spr_update_device_location(), used on Intel SPR and EMR, ignores the
>> corresponding PMON units and does not add them to the RB tree.
>>
>> Fix this by using uncore_pcibus_to_dieid(), which retrieves topology
>> from the UBOX GIDNIDMAP register and works regardless of whether NUMA
>> is enabled in Linux. This requires snbep_pci2phy_map_init() to be
>> added in spr_uncore_pci_init().
>>
>> Keep uncore_device_to_die() only for the nr_node_ids > 8 case, where
>> NUMA is expected to be enabled.
>>
>> Fixes: 9a7832ce3d92 ("perf/x86/intel/uncore: With > 8 nodes, get pci bus die id from NUMA info")
>> Fixes: 65248a9a9ee1 ("perf/x86/uncore: Add a quirk for UPI on SPR")
>> Tested-by: Steve Wahl <steve.wahl@hpe.com>
>> Signed-off-by: Zide Chen <zide.chen@intel.com>
>> ---
>> V2:
>> - Fix the commit message to note that spr_update_device_location() is
>> used by EMR, not GNR.
>> - Rewrite the commit message for clarity.
>> - Add a Tested-by tag.
>>
>> V5:
>> - Remove unused variable die_id (Dapeng).
>> ---
>> arch/x86/events/intel/uncore.c | 1 +
>> arch/x86/events/intel/uncore_snbep.c | 17 ++++++++---------
>> 2 files changed, 9 insertions(+), 9 deletions(-)
>>
>> diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
>> index 786bd51a0d89..e9cc1ba921c5 100644
>> --- a/arch/x86/events/intel/uncore.c
>> +++ b/arch/x86/events/intel/uncore.c
>> @@ -67,6 +67,7 @@ int uncore_die_to_segment(int die)
>> return bus ? pci_domain_nr(bus) : -EINVAL;
>> }
>>
>> +/* Note: This API can only be used when NUMA information is available. */
>> int uncore_device_to_die(struct pci_dev *dev)
>> {
>> int node = pcibus_to_node(dev->bus);
>> diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
>> index 9b51883fd6fd..5ef205a70559 100644
>> --- a/arch/x86/events/intel/uncore_snbep.c
>> +++ b/arch/x86/events/intel/uncore_snbep.c
>> @@ -1413,7 +1413,7 @@ static int topology_gidnid_map(int nodeid, u32 gidnid)
>> static int snbep_pci2phy_map_init(int devid, int nodeid_loc, int idmap_loc, bool reverse)
>> {
>> struct pci_dev *ubox_dev = NULL;
>> - int i, bus, nodeid, segment, die_id;
>> + int i, bus, nodeid, segment;
>> struct pci2phy_map *map;
>> int err = 0;
>> u32 config = 0;
>> @@ -1458,14 +1458,8 @@ static int snbep_pci2phy_map_init(int devid, int nodeid_loc, int idmap_loc, bool
>> break;
>> }
>>
>> - map->pbus_to_dieid[bus] = die_id = uncore_device_to_die(ubox_dev);
>> -
>> + map->pbus_to_dieid[bus] = uncore_device_to_die(ubox_dev);
>> raw_spin_unlock(&pci2phy_map_lock);
>> -
>> - if (WARN_ON_ONCE(die_id == -1)) {
>> - err = -EINVAL;
>> - break;
>> - }
>> }
>> }
>>
>> @@ -6420,7 +6414,7 @@ static void spr_update_device_location(int type_id)
>>
>> while ((dev = pci_get_device(PCI_VENDOR_ID_INTEL, device, dev)) != NULL) {
>>
>> - die = uncore_device_to_die(dev);
>> + die = uncore_pcibus_to_dieid(dev->bus);
>> if (die < 0)
>> continue;
>>
>> @@ -6444,6 +6438,11 @@ static void spr_update_device_location(int type_id)
>>
>> int spr_uncore_pci_init(void)
>> {
>> + int ret = snbep_pci2phy_map_init(0x3250, SKX_CPUNODEID, SKX_GIDNIDMAP, true);
>> +
>> + if (ret)
>> + return ret;
>> +
>> /*
>> * The discovery table of UPI on some SPR variant is broken,
>> * which impacts the detection of both UPI and M3UPI uncore PMON.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH V5 3/4] perf/x86/intel/uncore: Fix die ID init and look up bugs
2026-03-26 23:57 ` Chen, Zide
@ 2026-03-27 2:03 ` Mi, Dapeng
2026-03-27 16:55 ` Chen, Zide
0 siblings, 1 reply; 10+ messages in thread
From: Mi, Dapeng @ 2026-03-27 2:03 UTC (permalink / raw)
To: Chen, Zide, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Steve Wahl, Chun-Tse Shao,
Markus Elfring
On 3/27/2026 7:57 AM, Chen, Zide wrote:
>
> On 3/25/2026 11:03 PM, Mi, Dapeng wrote:
>> Zide, Sashiko gave some comments on this patch. Could you please have a
>> look if they are reasonable? Thanks.
>>
>> https://sashiko.dev/#/patchset/20260324214932.10068-1-zide.chen%40intel.com
> 1. Regarding the concern that this change may replace an offline node's
> -1 with the die ID of an adjacent online node, I do not think this is an
> issue.
>
> After this fix, the logic is the same for both (nr_node_ids <= 8) and
> (nr_node_ids > 8): map->pbus_to_dieid[bus] may be written with an
> invalid die_id (e.g., -1). This is not an error and is expected in some
> cases. We should continue to populate the map->pbus_to_dieid[] array.
>
> Regardless of the traversal order (as determined by the reverse
> argument), for a given die, the UBOX device is expected to reside on the
> first valid bus in the die it is affined to.
>
> Under the current assignment algorithm, all buses following a UBOX
> device, up to the next UBOX device or the end of traversal, are assigned
> the same die ID.
>
> For example, on SPR, there are two UBOX devices: one device on bus 0x7e
> in die 0, and another on bus 0xfe in die 1. With reversed traversal
> order, buses 0xff–0x7f are assigned die ID 1, while buses 0x7e–0x00 are
> assigned die ID 0.
>
> If all CPUs in die 1 are offline, then buses 0xff–0x7f are assigned -1.
> This is fine.
>
> That being said, the die ID for invalid buses is not consistent, which
> is not ideal.
Yes, for the case with 2 sockets and socket 1 is offline, it's correct. But
assume there are 4 sockets (0/1/2/3), buses 0x0-0x3f are attached to socket
0, buses 0x40-0x7f are attached to socket 1, buses 0x80-0xbf are attached
to socket 2 and buses 0xc0-0xff are attached to socket 3, the socket 2 is
offline. In reverse order, the die id of buses 0x80-0xbf would be
overwritten to 3 instead of -1, right?
But it seems there is not a good way to fix this issue and the function
spr_update_device_location() won't really find the ubox device of socket 2
since socket 2 has been offline. So it won't cause a real issue.
>
> 2. Regarding the repeated snbep_pci2phy_map_init() calls. I wanted a
> "simple" fix initially. I may need to split this patch into two
> separate patches.
>
>> On 3/25/2026 5:49 AM, Zide Chen wrote:
>>> In snbep_pci2phy_map_init(), in the nr_node_ids > 8 path,
>>> uncore_device_to_die() may return -1 when all CPUs associated
>>> with the UBOX device are offline.
>>>
>>> Remove the WARN_ON_ONCE(die_id == -1) check for two reasons:
>>>
>>> - The current code breaks out of the loop. This is incorrect because
>>> pci_get_device() does not guarantee iteration in domain or bus order,
>>> so additional UBOX devices may be skipped during the scan.
>>>
>>> - Returning -EINVAL is incorrect, since marking offline buses with
>>> die_id == -1 is expected and should not be treated as an error.
>>>
>>> Separately, when NUMA is disabled on a NUMA-capable platform,
>>> pcibus_to_node() returns NUMA_NO_NODE, causing uncore_device_to_die()
>>> to return -1 for all PCI devices. As a result,
>>> spr_update_device_location(), used on Intel SPR and EMR, ignores the
>>> corresponding PMON units and does not add them to the RB tree.
>>>
>>> Fix this by using uncore_pcibus_to_dieid(), which retrieves topology
>>> from the UBOX GIDNIDMAP register and works regardless of whether NUMA
>>> is enabled in Linux. This requires snbep_pci2phy_map_init() to be
>>> added in spr_uncore_pci_init().
>>>
>>> Keep uncore_device_to_die() only for the nr_node_ids > 8 case, where
>>> NUMA is expected to be enabled.
>>>
>>> Fixes: 9a7832ce3d92 ("perf/x86/intel/uncore: With > 8 nodes, get pci bus die id from NUMA info")
>>> Fixes: 65248a9a9ee1 ("perf/x86/uncore: Add a quirk for UPI on SPR")
>>> Tested-by: Steve Wahl <steve.wahl@hpe.com>
>>> Signed-off-by: Zide Chen <zide.chen@intel.com>
>>> ---
>>> V2:
>>> - Fix the commit message to note that spr_update_device_location() is
>>> used by EMR, not GNR.
>>> - Rewrite the commit message for clarity.
>>> - Add a Tested-by tag.
>>>
>>> V5:
>>> - Remove unused variable die_id (Dapeng).
>>> ---
>>> arch/x86/events/intel/uncore.c | 1 +
>>> arch/x86/events/intel/uncore_snbep.c | 17 ++++++++---------
>>> 2 files changed, 9 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
>>> index 786bd51a0d89..e9cc1ba921c5 100644
>>> --- a/arch/x86/events/intel/uncore.c
>>> +++ b/arch/x86/events/intel/uncore.c
>>> @@ -67,6 +67,7 @@ int uncore_die_to_segment(int die)
>>> return bus ? pci_domain_nr(bus) : -EINVAL;
>>> }
>>>
>>> +/* Note: This API can only be used when NUMA information is available. */
>>> int uncore_device_to_die(struct pci_dev *dev)
>>> {
>>> int node = pcibus_to_node(dev->bus);
>>> diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
>>> index 9b51883fd6fd..5ef205a70559 100644
>>> --- a/arch/x86/events/intel/uncore_snbep.c
>>> +++ b/arch/x86/events/intel/uncore_snbep.c
>>> @@ -1413,7 +1413,7 @@ static int topology_gidnid_map(int nodeid, u32 gidnid)
>>> static int snbep_pci2phy_map_init(int devid, int nodeid_loc, int idmap_loc, bool reverse)
>>> {
>>> struct pci_dev *ubox_dev = NULL;
>>> - int i, bus, nodeid, segment, die_id;
>>> + int i, bus, nodeid, segment;
>>> struct pci2phy_map *map;
>>> int err = 0;
>>> u32 config = 0;
>>> @@ -1458,14 +1458,8 @@ static int snbep_pci2phy_map_init(int devid, int nodeid_loc, int idmap_loc, bool
>>> break;
>>> }
>>>
>>> - map->pbus_to_dieid[bus] = die_id = uncore_device_to_die(ubox_dev);
>>> -
>>> + map->pbus_to_dieid[bus] = uncore_device_to_die(ubox_dev);
>>> raw_spin_unlock(&pci2phy_map_lock);
>>> -
>>> - if (WARN_ON_ONCE(die_id == -1)) {
>>> - err = -EINVAL;
>>> - break;
>>> - }
>>> }
>>> }
>>>
>>> @@ -6420,7 +6414,7 @@ static void spr_update_device_location(int type_id)
>>>
>>> while ((dev = pci_get_device(PCI_VENDOR_ID_INTEL, device, dev)) != NULL) {
>>>
>>> - die = uncore_device_to_die(dev);
>>> + die = uncore_pcibus_to_dieid(dev->bus);
>>> if (die < 0)
>>> continue;
>>>
>>> @@ -6444,6 +6438,11 @@ static void spr_update_device_location(int type_id)
>>>
>>> int spr_uncore_pci_init(void)
>>> {
>>> + int ret = snbep_pci2phy_map_init(0x3250, SKX_CPUNODEID, SKX_GIDNIDMAP, true);
>>> +
>>> + if (ret)
>>> + return ret;
>>> +
>>> /*
>>> * The discovery table of UPI on some SPR variant is broken,
>>> * which impacts the detection of both UPI and M3UPI uncore PMON.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH V5 3/4] perf/x86/intel/uncore: Fix die ID init and look up bugs
2026-03-27 2:03 ` Mi, Dapeng
@ 2026-03-27 16:55 ` Chen, Zide
0 siblings, 0 replies; 10+ messages in thread
From: Chen, Zide @ 2026-03-27 16:55 UTC (permalink / raw)
To: Mi, Dapeng, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
Andi Kleen, Eranian Stephane
Cc: linux-kernel, linux-perf-users, Steve Wahl, Chun-Tse Shao,
Markus Elfring
On 3/26/2026 7:03 PM, Mi, Dapeng wrote:
>
> On 3/27/2026 7:57 AM, Chen, Zide wrote:
>>
>> On 3/25/2026 11:03 PM, Mi, Dapeng wrote:
>>> Zide, Sashiko gave some comments on this patch. Could you please have a
>>> look if they are reasonable? Thanks.
>>>
>>> https://sashiko.dev/#/patchset/20260324214932.10068-1-zide.chen%40intel.com
>> 1. Regarding the concern that this change may replace an offline node's
>> -1 with the die ID of an adjacent online node, I do not think this is an
>> issue.
>>
>> After this fix, the logic is the same for both (nr_node_ids <= 8) and
>> (nr_node_ids > 8): map->pbus_to_dieid[bus] may be written with an
>> invalid die_id (e.g., -1). This is not an error and is expected in some
>> cases. We should continue to populate the map->pbus_to_dieid[] array.
>>
>> Regardless of the traversal order (as determined by the reverse
>> argument), for a given die, the UBOX device is expected to reside on the
>> first valid bus in the die it is affined to.
>>
>> Under the current assignment algorithm, all buses following a UBOX
>> device, up to the next UBOX device or the end of traversal, are assigned
>> the same die ID.
>>
>> For example, on SPR, there are two UBOX devices: one device on bus 0x7e
>> in die 0, and another on bus 0xfe in die 1. With reversed traversal
>> order, buses 0xff–0x7f are assigned die ID 1, while buses 0x7e–0x00 are
>> assigned die ID 0.
>>
>> If all CPUs in die 1 are offline, then buses 0xff–0x7f are assigned -1.
>> This is fine.
>>
>> That being said, the die ID for invalid buses is not consistent, which
>> is not ideal.
>
> Yes, for the case with 2 sockets and socket 1 is offline, it's correct. But
> assume there are 4 sockets (0/1/2/3), buses 0x0-0x3f are attached to socket
> 0, buses 0x40-0x7f are attached to socket 1, buses 0x80-0xbf are attached
> to socket 2 and buses 0xc0-0xff are attached to socket 3, the socket 2 is
> offline. In reverse order, the die id of buses 0x80-0xbf would be
> overwritten to 3 instead of -1, right?
Yes, that’s correct theoretically.
>
> But it seems there is not a good way to fix this issue and the function
> spr_update_device_location() won't really find the ubox device of socket 2
> since socket 2 has been offline. So it won't cause a real issue.
Yes. In this algorithm, for any "invalid" buses — whether they belong to
an offline die or are not present in the system — the associated
map->pbus_to_dieid[] entry is arbitrary.
PMON units on offline dies won't be enumerated.
>>
>> 2. Regarding the repeated snbep_pci2phy_map_init() calls. I wanted a
>> "simple" fix initially. I may need to split this patch into two
>> separate patches.
>>
>>> On 3/25/2026 5:49 AM, Zide Chen wrote:
>>>> In snbep_pci2phy_map_init(), in the nr_node_ids > 8 path,
>>>> uncore_device_to_die() may return -1 when all CPUs associated
>>>> with the UBOX device are offline.
>>>>
>>>> Remove the WARN_ON_ONCE(die_id == -1) check for two reasons:
>>>>
>>>> - The current code breaks out of the loop. This is incorrect because
>>>> pci_get_device() does not guarantee iteration in domain or bus order,
>>>> so additional UBOX devices may be skipped during the scan.
>>>>
>>>> - Returning -EINVAL is incorrect, since marking offline buses with
>>>> die_id == -1 is expected and should not be treated as an error.
>>>>
>>>> Separately, when NUMA is disabled on a NUMA-capable platform,
>>>> pcibus_to_node() returns NUMA_NO_NODE, causing uncore_device_to_die()
>>>> to return -1 for all PCI devices. As a result,
>>>> spr_update_device_location(), used on Intel SPR and EMR, ignores the
>>>> corresponding PMON units and does not add them to the RB tree.
>>>>
>>>> Fix this by using uncore_pcibus_to_dieid(), which retrieves topology
>>>> from the UBOX GIDNIDMAP register and works regardless of whether NUMA
>>>> is enabled in Linux. This requires snbep_pci2phy_map_init() to be
>>>> added in spr_uncore_pci_init().
>>>>
>>>> Keep uncore_device_to_die() only for the nr_node_ids > 8 case, where
>>>> NUMA is expected to be enabled.
>>>>
>>>> Fixes: 9a7832ce3d92 ("perf/x86/intel/uncore: With > 8 nodes, get pci bus die id from NUMA info")
>>>> Fixes: 65248a9a9ee1 ("perf/x86/uncore: Add a quirk for UPI on SPR")
>>>> Tested-by: Steve Wahl <steve.wahl@hpe.com>
>>>> Signed-off-by: Zide Chen <zide.chen@intel.com>
>>>> ---
>>>> V2:
>>>> - Fix the commit message to note that spr_update_device_location() is
>>>> used by EMR, not GNR.
>>>> - Rewrite the commit message for clarity.
>>>> - Add a Tested-by tag.
>>>>
>>>> V5:
>>>> - Remove unused variable die_id (Dapeng).
>>>> ---
>>>> arch/x86/events/intel/uncore.c | 1 +
>>>> arch/x86/events/intel/uncore_snbep.c | 17 ++++++++---------
>>>> 2 files changed, 9 insertions(+), 9 deletions(-)
>>>>
>>>> diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
>>>> index 786bd51a0d89..e9cc1ba921c5 100644
>>>> --- a/arch/x86/events/intel/uncore.c
>>>> +++ b/arch/x86/events/intel/uncore.c
>>>> @@ -67,6 +67,7 @@ int uncore_die_to_segment(int die)
>>>> return bus ? pci_domain_nr(bus) : -EINVAL;
>>>> }
>>>>
>>>> +/* Note: This API can only be used when NUMA information is available. */
>>>> int uncore_device_to_die(struct pci_dev *dev)
>>>> {
>>>> int node = pcibus_to_node(dev->bus);
>>>> diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
>>>> index 9b51883fd6fd..5ef205a70559 100644
>>>> --- a/arch/x86/events/intel/uncore_snbep.c
>>>> +++ b/arch/x86/events/intel/uncore_snbep.c
>>>> @@ -1413,7 +1413,7 @@ static int topology_gidnid_map(int nodeid, u32 gidnid)
>>>> static int snbep_pci2phy_map_init(int devid, int nodeid_loc, int idmap_loc, bool reverse)
>>>> {
>>>> struct pci_dev *ubox_dev = NULL;
>>>> - int i, bus, nodeid, segment, die_id;
>>>> + int i, bus, nodeid, segment;
>>>> struct pci2phy_map *map;
>>>> int err = 0;
>>>> u32 config = 0;
>>>> @@ -1458,14 +1458,8 @@ static int snbep_pci2phy_map_init(int devid, int nodeid_loc, int idmap_loc, bool
>>>> break;
>>>> }
>>>>
>>>> - map->pbus_to_dieid[bus] = die_id = uncore_device_to_die(ubox_dev);
>>>> -
>>>> + map->pbus_to_dieid[bus] = uncore_device_to_die(ubox_dev);
>>>> raw_spin_unlock(&pci2phy_map_lock);
>>>> -
>>>> - if (WARN_ON_ONCE(die_id == -1)) {
>>>> - err = -EINVAL;
>>>> - break;
>>>> - }
>>>> }
>>>> }
>>>>
>>>> @@ -6420,7 +6414,7 @@ static void spr_update_device_location(int type_id)
>>>>
>>>> while ((dev = pci_get_device(PCI_VENDOR_ID_INTEL, device, dev)) != NULL) {
>>>>
>>>> - die = uncore_device_to_die(dev);
>>>> + die = uncore_pcibus_to_dieid(dev->bus);
>>>> if (die < 0)
>>>> continue;
>>>>
>>>> @@ -6444,6 +6438,11 @@ static void spr_update_device_location(int type_id)
>>>>
>>>> int spr_uncore_pci_init(void)
>>>> {
>>>> + int ret = snbep_pci2phy_map_init(0x3250, SKX_CPUNODEID, SKX_GIDNIDMAP, true);
>>>> +
>>>> + if (ret)
>>>> + return ret;
>>>> +
>>>> /*
>>>> * The discovery table of UPI on some SPR variant is broken,
>>>> * which impacts the detection of both UPI and M3UPI uncore PMON.
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2026-03-27 16:55 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-24 21:49 [PATCH V5 0/5] Miscellaneous Intel uncore patches Zide Chen
2026-03-24 21:49 ` [PATCH V5 1/4] perf/x86/intel/uncore: Fix iounmap() leak on global_init failure Zide Chen
2026-03-24 21:49 ` [PATCH V5 2/4] perf/x86/intel/uncore: Skip discovery table for offline dies Zide Chen
2026-03-24 21:49 ` [PATCH V5 3/4] perf/x86/intel/uncore: Fix die ID init and look up bugs Zide Chen
2026-03-25 0:35 ` Mi, Dapeng
2026-03-26 6:03 ` Mi, Dapeng
2026-03-26 23:57 ` Chen, Zide
2026-03-27 2:03 ` Mi, Dapeng
2026-03-27 16:55 ` Chen, Zide
2026-03-24 21:49 ` [PATCH V5 4/4] perf/x86/intel/uncore: Remove extra double quote mark Zide Chen
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox