public inbox for linux-perf-users@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH V6 0/5] Miscellaneous Intel uncore patches
@ 2026-03-30 21:24 Zide Chen
  2026-03-30 21:24 ` [PATCH V6 1/5] perf/x86/intel/uncore: Fix iounmap() leak on global_init failure Zide Chen
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Zide Chen @ 2026-03-30 21:24 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen, Steve Wahl,
	Chun-Tse Shao, Markus Elfring

This is mainly a RESEND of the patch set below, with patch v5 3/4
split into two patches.

I grouped these patches together for the convenience of the maintainers
and bumped the whole series to V6.

Changes since v5:
- Split patch v5 3/4 into two patches for clearer scope.
- Patch 2/5: Add changelog to clarify that it doesn't support die ID gaps.
- Patch 3/5: Change title and changelog to reflect the reduced scope.
- Patch 4/5: New patch. Remove redundant snbep_pci2phy_map_init() calls.

Changes since v4:
- Remove unused die_id variable from patch 3.

V5:
https://lore.kernel.org/lkml/20260324214932.10068-1-zide.chen@intel.com/

v4:
https://lore.kernel.org/lkml/20260313174050.171704-1-zide.chen@intel.com/

Previous versions:
https://lore.kernel.org/lkml/cda9ab9b-4581-409f-a9bb-0e8a67ad3530@web.de/T/
https://lore.kernel.org/lkml/20260116031113.45098-1-zide.chen@intel.com/
https://lore.kernel.org/lkml/20260206231203.11760-1-zide.chen@intel.com/

Zide Chen (5):
  perf/x86/intel/uncore: Fix iounmap() leak on global_init failure
  perf/x86/intel/uncore: Skip discovery table for offline dies
  perf/x86/intel/uncore: Do not treat -1 die_id as error during UBOX
    scan
  perf/x86/intel/uncore: Fix PMON enumeration with NUMA disabled
  perf/x86/intel/uncore: Remove extra double quote mark

 arch/x86/events/intel/uncore.c           |  1 +
 arch/x86/events/intel/uncore_discovery.c | 17 +++--
 arch/x86/events/intel/uncore_snbep.c     | 83 +++++++++++-------------
 3 files changed, 49 insertions(+), 52 deletions(-)
-- 
2.53.0


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH V6 1/5] perf/x86/intel/uncore: Fix iounmap() leak on global_init failure
  2026-03-30 21:24 [PATCH V6 0/5] Miscellaneous Intel uncore patches Zide Chen
@ 2026-03-30 21:24 ` Zide Chen
  2026-03-30 21:24 ` [PATCH V6 2/5] perf/x86/intel/uncore: Skip discovery table for offline dies Zide Chen
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Zide Chen @ 2026-03-30 21:24 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen, Steve Wahl,
	Chun-Tse Shao, Markus Elfring, kernel test robot

Kernel test robot reported:

Unverified Error/Warning (likely false positive, kindly check if
interested):
    arch/x86/events/intel/uncore_discovery.c:293:2-8:
    ERROR: missing iounmap; ioremap on line 288 and execution via
    conditional on line 292

If domain->global_init() fails in __parse_discovery_table(), the
ioremap'ed MMIO region is not released before returning, resulting
in an MMIO mapping leak.

Reported-by: kernel test robot <lkp@intel.com>
Fixes: b575fc0e3357 ("perf/x86/intel/uncore: Add domain global init callback")
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
---
V2:
- As suggested by Markus, add an `out` label and use goto-based error
  handling to reduce duplicated iounmap() code.
- Add the original warning from the kernel test robot to the commit message.
- Trivial rewording of the commit message.

V3:
- Add Reviewed-by tag.
---
 arch/x86/events/intel/uncore_discovery.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/arch/x86/events/intel/uncore_discovery.c b/arch/x86/events/intel/uncore_discovery.c
index 1b1a6b221048..6a4e892cd525 100644
--- a/arch/x86/events/intel/uncore_discovery.c
+++ b/arch/x86/events/intel/uncore_discovery.c
@@ -264,6 +264,7 @@ static int __parse_discovery_table(struct uncore_discovery_domain *domain,
 	struct uncore_unit_discovery unit;
 	void __iomem *io_addr;
 	unsigned long size;
+	int ret = 0;
 	int i;
 
 	size = UNCORE_DISCOVERY_GLOBAL_MAP_SIZE;
@@ -273,21 +274,23 @@ static int __parse_discovery_table(struct uncore_discovery_domain *domain,
 
 	/* Read Global Discovery State */
 	memcpy_fromio(&global, io_addr, sizeof(struct uncore_global_discovery));
+	iounmap(io_addr);
+
 	if (uncore_discovery_invalid_unit(global)) {
 		pr_info("Invalid Global Discovery State: 0x%llx 0x%llx 0x%llx\n",
 			global.table1, global.ctl, global.table3);
-		iounmap(io_addr);
 		return -EINVAL;
 	}
-	iounmap(io_addr);
 
 	size = (1 + global.max_units) * global.stride * 8;
 	io_addr = ioremap(addr, size);
 	if (!io_addr)
 		return -ENOMEM;
 
-	if (domain->global_init && domain->global_init(global.ctl))
-		return -ENODEV;
+	if (domain->global_init && domain->global_init(global.ctl)) {
+		ret = -ENODEV;
+		goto out;
+	}
 
 	/* Parsing Unit Discovery State */
 	for (i = 0; i < global.max_units; i++) {
@@ -307,8 +310,10 @@ static int __parse_discovery_table(struct uncore_discovery_domain *domain,
 	}
 
 	*parsed = true;
+
+out:
 	iounmap(io_addr);
-	return 0;
+	return ret;
 }
 
 static int parse_discovery_table(struct uncore_discovery_domain *domain,
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH V6 2/5] perf/x86/intel/uncore: Skip discovery table for offline dies
  2026-03-30 21:24 [PATCH V6 0/5] Miscellaneous Intel uncore patches Zide Chen
  2026-03-30 21:24 ` [PATCH V6 1/5] perf/x86/intel/uncore: Fix iounmap() leak on global_init failure Zide Chen
@ 2026-03-30 21:24 ` Zide Chen
  2026-03-30 21:24 ` [PATCH V6 3/5] perf/x86/intel/uncore: Do not treat -1 die_id as error during UBOX scan Zide Chen
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Zide Chen @ 2026-03-30 21:24 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen, Steve Wahl,
	Chun-Tse Shao, Markus Elfring

This warning can be triggered if NUMA is disabled and the system
boots with fewer CPUs than the number of CPUs in die 0.

WARNING: CPU: 9 PID: 7257 at uncore.c:1157 uncore_pci_pmu_register+0x136/0x160 [intel_uncore]

Currently, the discovery table continues to be parsed even if all CPUs
in the associated die are offline.  This can lead to an array overflow
at "pmu->boxes[die] = box" in uncore_pci_pmu_register(), which may
trigger the warning above or cause other issues.

Reported-by: Steve Wahl <steve.wahl@hpe.com>
Tested-by: Steve Wahl <steve.wahl@hpe.com>
Fixes: edae1f06c2cd ("perf/x86/intel/uncore: Parse uncore discovery tables")
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
---
V2:
- Add the Tested-by tag
- Rebase onto perf/core (base commit: a491c02c2770)

V3:
- Remove the overly cautious WARN_ON()

V4:
- Add Reviewed-by tag.
---
 arch/x86/events/intel/uncore_discovery.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/uncore_discovery.c b/arch/x86/events/intel/uncore_discovery.c
index 6a4e892cd525..749db3649f8f 100644
--- a/arch/x86/events/intel/uncore_discovery.c
+++ b/arch/x86/events/intel/uncore_discovery.c
@@ -371,7 +371,7 @@ static bool uncore_discovery_pci(struct uncore_discovery_domain *domain)
 				     (val & UNCORE_DISCOVERY_DVSEC2_BIR_MASK) * UNCORE_DISCOVERY_BIR_STEP;
 
 			die = get_device_die_id(dev);
-			if (die < 0)
+			if ((die < 0) || (die >= uncore_max_dies()))
 				continue;
 
 			parse_discovery_table(domain, dev, die, bar_offset, &parsed);
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH V6 3/5] perf/x86/intel/uncore: Do not treat -1 die_id as error during UBOX scan
  2026-03-30 21:24 [PATCH V6 0/5] Miscellaneous Intel uncore patches Zide Chen
  2026-03-30 21:24 ` [PATCH V6 1/5] perf/x86/intel/uncore: Fix iounmap() leak on global_init failure Zide Chen
  2026-03-30 21:24 ` [PATCH V6 2/5] perf/x86/intel/uncore: Skip discovery table for offline dies Zide Chen
@ 2026-03-30 21:24 ` Zide Chen
  2026-03-31  1:13   ` Mi, Dapeng
  2026-03-30 21:24 ` [PATCH V6 4/5] perf/x86/intel/uncore: Fix PMON enumeration with NUMA disabled Zide Chen
  2026-03-30 21:24 ` [PATCH V6 5/5] perf/x86/intel/uncore: Remove extra double quote mark Zide Chen
  4 siblings, 1 reply; 12+ messages in thread
From: Zide Chen @ 2026-03-30 21:24 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen, Steve Wahl,
	Chun-Tse Shao, Markus Elfring

In snbep_pci2phy_map_init(), in the nr_node_ids > 8 path,
uncore_device_to_die() may return -1 when all CPUs associated
with the UBOX device are offline.

Remove the WARN_ON_ONCE(die_id == -1) check for two reasons:

- The current code breaks out of the loop. This is incorrect because
  pci_get_device() does not guarantee iteration in domain or bus order,
  so additional UBOX devices may be skipped during the scan.

- Returning -EINVAL is incorrect, since marking offline buses with
  die_id == -1 is expected and should not be treated as an error.

Fixes: 9a7832ce3d92 ("perf/x86/intel/uncore: With > 8 nodes, get pci bus die id from NUMA info")
Signed-off-by: Zide Chen <zide.chen@intel.com>
---
V2:
- Fix the commit message to note that spr_update_device_location() is
  used by EMR, not GNR.
- Rewrite the commit message for clarity.
- Add a Tested-by tag.

V5:
- Removed unused die_id (Dapeng).

V6:
- Move the spr_update_device_location() change to a separate patch.
- Update the comit message and title to reflect the reduced scope of
  this patch.
- Remove Tested-by since the code has changed.
---
 arch/x86/events/intel/uncore_snbep.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
index 9b51883fd6fd..8ee06d4659bb 100644
--- a/arch/x86/events/intel/uncore_snbep.c
+++ b/arch/x86/events/intel/uncore_snbep.c
@@ -1413,7 +1413,7 @@ static int topology_gidnid_map(int nodeid, u32 gidnid)
 static int snbep_pci2phy_map_init(int devid, int nodeid_loc, int idmap_loc, bool reverse)
 {
 	struct pci_dev *ubox_dev = NULL;
-	int i, bus, nodeid, segment, die_id;
+	int i, bus, nodeid, segment;
 	struct pci2phy_map *map;
 	int err = 0;
 	u32 config = 0;
@@ -1458,14 +1458,9 @@ static int snbep_pci2phy_map_init(int devid, int nodeid_loc, int idmap_loc, bool
 				break;
 			}
 
-			map->pbus_to_dieid[bus] = die_id = uncore_device_to_die(ubox_dev);
+			map->pbus_to_dieid[bus] = uncore_device_to_die(ubox_dev);
 
 			raw_spin_unlock(&pci2phy_map_lock);
-
-			if (WARN_ON_ONCE(die_id == -1)) {
-				err = -EINVAL;
-				break;
-			}
 		}
 	}
 
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH V6 4/5] perf/x86/intel/uncore: Fix PMON enumeration with NUMA disabled
  2026-03-30 21:24 [PATCH V6 0/5] Miscellaneous Intel uncore patches Zide Chen
                   ` (2 preceding siblings ...)
  2026-03-30 21:24 ` [PATCH V6 3/5] perf/x86/intel/uncore: Do not treat -1 die_id as error during UBOX scan Zide Chen
@ 2026-03-30 21:24 ` Zide Chen
  2026-03-31  1:26   ` Mi, Dapeng
  2026-03-30 21:24 ` [PATCH V6 5/5] perf/x86/intel/uncore: Remove extra double quote mark Zide Chen
  4 siblings, 1 reply; 12+ messages in thread
From: Zide Chen @ 2026-03-30 21:24 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen, Steve Wahl,
	Chun-Tse Shao, Markus Elfring

When NUMA is disabled on a NUMA-capable platform, UPI and M3UPI PMON
units are not enumerated.

In this case, pcibus_to_node() always returns NUMA_NO_NODE, causing
uncore_device_to_die() to return -1 for all PCI devices. As a result,
the corresponding PMON units are not added to the RB tree.

These PMON units are per-die resources, and their utility when NUMA is
disabled is limited.  The driver does not prohibit their use, and the
enumeration should still work correctly.

Fix this by using uncore_pcibus_to_dieid(), which works regardless of
whether NUMA is enabled.  This requires calling
snbep_pci2phy_map_init() in spr_uncore_pci_init().

Since pci_init() is called before mmio_init(), remove the redundant
snbep_pci2phy_map_init() call from spr_uncore_mmio_init().  If
snbep_pci2phy_map_init() fails, uncore driver should be bailed out,
so the fallback path in spr_uncore_mmio_init() can be removed.

Signed-off-by: Zide Chen <zide.chen@intel.com>
---
V6:
- Split from patch v5 3/4.
- Remove the redundant call in spr_uncore_mmio_init().
- Update commit messages.
---
 arch/x86/events/intel/uncore.c       |  1 +
 arch/x86/events/intel/uncore_snbep.c | 26 +++++++++++---------------
 2 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index 786bd51a0d89..e9cc1ba921c5 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -67,6 +67,7 @@ int uncore_die_to_segment(int die)
 	return bus ? pci_domain_nr(bus) : -EINVAL;
 }
 
+/* Note: This API can only be used when NUMA information is available. */
 int uncore_device_to_die(struct pci_dev *dev)
 {
 	int node = pcibus_to_node(dev->bus);
diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
index 8ee06d4659bb..73da1e88e286 100644
--- a/arch/x86/events/intel/uncore_snbep.c
+++ b/arch/x86/events/intel/uncore_snbep.c
@@ -6415,7 +6415,7 @@ static void spr_update_device_location(int type_id)
 
 	while ((dev = pci_get_device(PCI_VENDOR_ID_INTEL, device, dev)) != NULL) {
 
-		die = uncore_device_to_die(dev);
+		die = uncore_pcibus_to_dieid(dev->bus);
 		if (die < 0)
 			continue;
 
@@ -6439,6 +6439,10 @@ static void spr_update_device_location(int type_id)
 
 int spr_uncore_pci_init(void)
 {
+	int ret = snbep_pci2phy_map_init(0x3250, SKX_CPUNODEID, SKX_GIDNIDMAP, true);
+	if (ret)
+		return ret;
+
 	/*
 	 * The discovery table of UPI on some SPR variant is broken,
 	 * which impacts the detection of both UPI and M3UPI uncore PMON.
@@ -6460,21 +6464,13 @@ int spr_uncore_pci_init(void)
 
 void spr_uncore_mmio_init(void)
 {
-	int ret = snbep_pci2phy_map_init(0x3250, SKX_CPUNODEID, SKX_GIDNIDMAP, true);
+	uncore_mmio_uncores = uncore_get_uncores(UNCORE_ACCESS_MMIO,
+						 UNCORE_SPR_MMIO_EXTRA_UNCORES,
+						 spr_mmio_uncores,
+						 UNCORE_SPR_NUM_UNCORE_TYPES,
+						 spr_uncores);
 
-	if (ret) {
-		uncore_mmio_uncores = uncore_get_uncores(UNCORE_ACCESS_MMIO, 0, NULL,
-							 UNCORE_SPR_NUM_UNCORE_TYPES,
-							 spr_uncores);
-	} else {
-		uncore_mmio_uncores = uncore_get_uncores(UNCORE_ACCESS_MMIO,
-							 UNCORE_SPR_MMIO_EXTRA_UNCORES,
-							 spr_mmio_uncores,
-							 UNCORE_SPR_NUM_UNCORE_TYPES,
-							 spr_uncores);
-
-		spr_uncore_imc_free_running.num_boxes = uncore_type_max_boxes(uncore_mmio_uncores, UNCORE_SPR_IMC) / 2;
-	}
+	spr_uncore_imc_free_running.num_boxes = uncore_type_max_boxes(uncore_mmio_uncores, UNCORE_SPR_IMC) / 2;
 }
 
 /* end of SPR uncore support */
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH V6 5/5] perf/x86/intel/uncore: Remove extra double quote mark
  2026-03-30 21:24 [PATCH V6 0/5] Miscellaneous Intel uncore patches Zide Chen
                   ` (3 preceding siblings ...)
  2026-03-30 21:24 ` [PATCH V6 4/5] perf/x86/intel/uncore: Fix PMON enumeration with NUMA disabled Zide Chen
@ 2026-03-30 21:24 ` Zide Chen
  4 siblings, 0 replies; 12+ messages in thread
From: Zide Chen @ 2026-03-30 21:24 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Dapeng Mi, Zide Chen, Steve Wahl,
	Chun-Tse Shao, Markus Elfring

The third argument in INTEL_UNCORE_FR_EVENT_DESC() is subject to
__stringify(), and the extra double quote marks can result in the
expansion "3.814697266e-6" in the sysfs knobs, instead of
3.814697266e-6.

This is incorrect, though it may still work for perf, e.g.
perf stat -e uncore_iio_free_running_0/bw_in_port0/

Fixes: d8987048f665 ("perf/x86/intel/uncore: Support IIO free-running counters on DMR")
Reported-by: Chun-Tse Shao <ctshao@google.com>
Closes: https://lore.kernel.org/all/20251231224233.113839-1-zide.chen@intel.com/
Reviewed-by: Chun-Tse Shao <ctshao@google.com>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
---
V2: Added Reviewed-by tags.
---
 arch/x86/events/intel/uncore_snbep.c | 48 ++++++++++++++--------------
 1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
index 73da1e88e286..fdbdabb5026e 100644
--- a/arch/x86/events/intel/uncore_snbep.c
+++ b/arch/x86/events/intel/uncore_snbep.c
@@ -6952,34 +6952,34 @@ static struct freerunning_counters dmr_iio_freerunning[] = {
 
 static struct uncore_event_desc dmr_uncore_iio_freerunning_events[] = {
 	/*  ITC Free Running Data BW counter for inbound traffic */
-	INTEL_UNCORE_FR_EVENT_DESC(inb_data_port0, 0x10, "3.814697266e-6"),
-	INTEL_UNCORE_FR_EVENT_DESC(inb_data_port1, 0x11, "3.814697266e-6"),
-	INTEL_UNCORE_FR_EVENT_DESC(inb_data_port2, 0x12, "3.814697266e-6"),
-	INTEL_UNCORE_FR_EVENT_DESC(inb_data_port3, 0x13, "3.814697266e-6"),
-	INTEL_UNCORE_FR_EVENT_DESC(inb_data_port4, 0x14, "3.814697266e-6"),
-	INTEL_UNCORE_FR_EVENT_DESC(inb_data_port5, 0x15, "3.814697266e-6"),
-	INTEL_UNCORE_FR_EVENT_DESC(inb_data_port6, 0x16, "3.814697266e-6"),
-	INTEL_UNCORE_FR_EVENT_DESC(inb_data_port7, 0x17, "3.814697266e-6"),
+	INTEL_UNCORE_FR_EVENT_DESC(inb_data_port0, 0x10, 3.814697266e-6),
+	INTEL_UNCORE_FR_EVENT_DESC(inb_data_port1, 0x11, 3.814697266e-6),
+	INTEL_UNCORE_FR_EVENT_DESC(inb_data_port2, 0x12, 3.814697266e-6),
+	INTEL_UNCORE_FR_EVENT_DESC(inb_data_port3, 0x13, 3.814697266e-6),
+	INTEL_UNCORE_FR_EVENT_DESC(inb_data_port4, 0x14, 3.814697266e-6),
+	INTEL_UNCORE_FR_EVENT_DESC(inb_data_port5, 0x15, 3.814697266e-6),
+	INTEL_UNCORE_FR_EVENT_DESC(inb_data_port6, 0x16, 3.814697266e-6),
+	INTEL_UNCORE_FR_EVENT_DESC(inb_data_port7, 0x17, 3.814697266e-6),
 
 	/*  ITC Free Running BW IN counters */
-	INTEL_UNCORE_FR_EVENT_DESC(bw_in_port0, 0x20, "3.814697266e-6"),
-	INTEL_UNCORE_FR_EVENT_DESC(bw_in_port1, 0x21, "3.814697266e-6"),
-	INTEL_UNCORE_FR_EVENT_DESC(bw_in_port2, 0x22, "3.814697266e-6"),
-	INTEL_UNCORE_FR_EVENT_DESC(bw_in_port3, 0x23, "3.814697266e-6"),
-	INTEL_UNCORE_FR_EVENT_DESC(bw_in_port4, 0x24, "3.814697266e-6"),
-	INTEL_UNCORE_FR_EVENT_DESC(bw_in_port5, 0x25, "3.814697266e-6"),
-	INTEL_UNCORE_FR_EVENT_DESC(bw_in_port6, 0x26, "3.814697266e-6"),
-	INTEL_UNCORE_FR_EVENT_DESC(bw_in_port7, 0x27, "3.814697266e-6"),
+	INTEL_UNCORE_FR_EVENT_DESC(bw_in_port0, 0x20, 3.814697266e-6),
+	INTEL_UNCORE_FR_EVENT_DESC(bw_in_port1, 0x21, 3.814697266e-6),
+	INTEL_UNCORE_FR_EVENT_DESC(bw_in_port2, 0x22, 3.814697266e-6),
+	INTEL_UNCORE_FR_EVENT_DESC(bw_in_port3, 0x23, 3.814697266e-6),
+	INTEL_UNCORE_FR_EVENT_DESC(bw_in_port4, 0x24, 3.814697266e-6),
+	INTEL_UNCORE_FR_EVENT_DESC(bw_in_port5, 0x25, 3.814697266e-6),
+	INTEL_UNCORE_FR_EVENT_DESC(bw_in_port6, 0x26, 3.814697266e-6),
+	INTEL_UNCORE_FR_EVENT_DESC(bw_in_port7, 0x27, 3.814697266e-6),
 
 	/*  ITC Free Running BW OUT counters */
-	INTEL_UNCORE_FR_EVENT_DESC(bw_out_port0, 0x30, "3.814697266e-6"),
-	INTEL_UNCORE_FR_EVENT_DESC(bw_out_port1, 0x31, "3.814697266e-6"),
-	INTEL_UNCORE_FR_EVENT_DESC(bw_out_port2, 0x32, "3.814697266e-6"),
-	INTEL_UNCORE_FR_EVENT_DESC(bw_out_port3, 0x33, "3.814697266e-6"),
-	INTEL_UNCORE_FR_EVENT_DESC(bw_out_port4, 0x34, "3.814697266e-6"),
-	INTEL_UNCORE_FR_EVENT_DESC(bw_out_port5, 0x35, "3.814697266e-6"),
-	INTEL_UNCORE_FR_EVENT_DESC(bw_out_port6, 0x36, "3.814697266e-6"),
-	INTEL_UNCORE_FR_EVENT_DESC(bw_out_port7, 0x37, "3.814697266e-6"),
+	INTEL_UNCORE_FR_EVENT_DESC(bw_out_port0, 0x30, 3.814697266e-6),
+	INTEL_UNCORE_FR_EVENT_DESC(bw_out_port1, 0x31, 3.814697266e-6),
+	INTEL_UNCORE_FR_EVENT_DESC(bw_out_port2, 0x32, 3.814697266e-6),
+	INTEL_UNCORE_FR_EVENT_DESC(bw_out_port3, 0x33, 3.814697266e-6),
+	INTEL_UNCORE_FR_EVENT_DESC(bw_out_port4, 0x34, 3.814697266e-6),
+	INTEL_UNCORE_FR_EVENT_DESC(bw_out_port5, 0x35, 3.814697266e-6),
+	INTEL_UNCORE_FR_EVENT_DESC(bw_out_port6, 0x36, 3.814697266e-6),
+	INTEL_UNCORE_FR_EVENT_DESC(bw_out_port7, 0x37, 3.814697266e-6),
 
 	/* Free Running Clock Counter */
 	INTEL_UNCORE_EVENT_DESC(clockticks, "event=0xff,umask=0x40"),
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH V6 3/5] perf/x86/intel/uncore: Do not treat -1 die_id as error during UBOX scan
  2026-03-30 21:24 ` [PATCH V6 3/5] perf/x86/intel/uncore: Do not treat -1 die_id as error during UBOX scan Zide Chen
@ 2026-03-31  1:13   ` Mi, Dapeng
  0 siblings, 0 replies; 12+ messages in thread
From: Mi, Dapeng @ 2026-03-31  1:13 UTC (permalink / raw)
  To: Zide Chen, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Steve Wahl, Chun-Tse Shao,
	Markus Elfring


On 3/31/2026 5:24 AM, Zide Chen wrote:
> In snbep_pci2phy_map_init(), in the nr_node_ids > 8 path,
> uncore_device_to_die() may return -1 when all CPUs associated
> with the UBOX device are offline.
>
> Remove the WARN_ON_ONCE(die_id == -1) check for two reasons:
>
> - The current code breaks out of the loop. This is incorrect because
>   pci_get_device() does not guarantee iteration in domain or bus order,
>   so additional UBOX devices may be skipped during the scan.
>
> - Returning -EINVAL is incorrect, since marking offline buses with
>   die_id == -1 is expected and should not be treated as an error.
>
> Fixes: 9a7832ce3d92 ("perf/x86/intel/uncore: With > 8 nodes, get pci bus die id from NUMA info")
> Signed-off-by: Zide Chen <zide.chen@intel.com>
> ---
> V2:
> - Fix the commit message to note that spr_update_device_location() is
>   used by EMR, not GNR.
> - Rewrite the commit message for clarity.
> - Add a Tested-by tag.
>
> V5:
> - Removed unused die_id (Dapeng).
>
> V6:
> - Move the spr_update_device_location() change to a separate patch.
> - Update the comit message and title to reflect the reduced scope of
>   this patch.
> - Remove Tested-by since the code has changed.
> ---
>  arch/x86/events/intel/uncore_snbep.c | 9 ++-------
>  1 file changed, 2 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
> index 9b51883fd6fd..8ee06d4659bb 100644
> --- a/arch/x86/events/intel/uncore_snbep.c
> +++ b/arch/x86/events/intel/uncore_snbep.c
> @@ -1413,7 +1413,7 @@ static int topology_gidnid_map(int nodeid, u32 gidnid)
>  static int snbep_pci2phy_map_init(int devid, int nodeid_loc, int idmap_loc, bool reverse)
>  {
>  	struct pci_dev *ubox_dev = NULL;
> -	int i, bus, nodeid, segment, die_id;
> +	int i, bus, nodeid, segment;
>  	struct pci2phy_map *map;
>  	int err = 0;
>  	u32 config = 0;
> @@ -1458,14 +1458,9 @@ static int snbep_pci2phy_map_init(int devid, int nodeid_loc, int idmap_loc, bool
>  				break;
>  			}
>  
> -			map->pbus_to_dieid[bus] = die_id = uncore_device_to_die(ubox_dev);
> +			map->pbus_to_dieid[bus] = uncore_device_to_die(ubox_dev);
>  
>  			raw_spin_unlock(&pci2phy_map_lock);
> -
> -			if (WARN_ON_ONCE(die_id == -1)) {
> -				err = -EINVAL;
> -				break;
> -			}
>  		}
>  	}
>  

Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH V6 4/5] perf/x86/intel/uncore: Fix PMON enumeration with NUMA disabled
  2026-03-30 21:24 ` [PATCH V6 4/5] perf/x86/intel/uncore: Fix PMON enumeration with NUMA disabled Zide Chen
@ 2026-03-31  1:26   ` Mi, Dapeng
  2026-04-01 20:25     ` Chen, Zide
  0 siblings, 1 reply; 12+ messages in thread
From: Mi, Dapeng @ 2026-03-31  1:26 UTC (permalink / raw)
  To: Zide Chen, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Steve Wahl, Chun-Tse Shao,
	Markus Elfring


On 3/31/2026 5:24 AM, Zide Chen wrote:
> When NUMA is disabled on a NUMA-capable platform, UPI and M3UPI PMON
> units are not enumerated.
>
> In this case, pcibus_to_node() always returns NUMA_NO_NODE, causing
> uncore_device_to_die() to return -1 for all PCI devices. As a result,
> the corresponding PMON units are not added to the RB tree.
>
> These PMON units are per-die resources, and their utility when NUMA is
> disabled is limited.  The driver does not prohibit their use, and the
> enumeration should still work correctly.
>
> Fix this by using uncore_pcibus_to_dieid(), which works regardless of
> whether NUMA is enabled.  This requires calling
> snbep_pci2phy_map_init() in spr_uncore_pci_init().
>
> Since pci_init() is called before mmio_init(), remove the redundant
> snbep_pci2phy_map_init() call from spr_uncore_mmio_init().  If
> snbep_pci2phy_map_init() fails, uncore driver should be bailed out,
> so the fallback path in spr_uncore_mmio_init() can be removed.
>
> Signed-off-by: Zide Chen <zide.chen@intel.com>
> ---
> V6:
> - Split from patch v5 3/4.
> - Remove the redundant call in spr_uncore_mmio_init().
> - Update commit messages.
> ---
>  arch/x86/events/intel/uncore.c       |  1 +
>  arch/x86/events/intel/uncore_snbep.c | 26 +++++++++++---------------
>  2 files changed, 12 insertions(+), 15 deletions(-)
>
> diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
> index 786bd51a0d89..e9cc1ba921c5 100644
> --- a/arch/x86/events/intel/uncore.c
> +++ b/arch/x86/events/intel/uncore.c
> @@ -67,6 +67,7 @@ int uncore_die_to_segment(int die)
>  	return bus ? pci_domain_nr(bus) : -EINVAL;
>  }
>  
> +/* Note: This API can only be used when NUMA information is available. */
>  int uncore_device_to_die(struct pci_dev *dev)
>  {
>  	int node = pcibus_to_node(dev->bus);
> diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
> index 8ee06d4659bb..73da1e88e286 100644
> --- a/arch/x86/events/intel/uncore_snbep.c
> +++ b/arch/x86/events/intel/uncore_snbep.c
> @@ -6415,7 +6415,7 @@ static void spr_update_device_location(int type_id)
>  
>  	while ((dev = pci_get_device(PCI_VENDOR_ID_INTEL, device, dev)) != NULL) {
>  
> -		die = uncore_device_to_die(dev);
> +		die = uncore_pcibus_to_dieid(dev->bus);
>  		if (die < 0)
>  			continue;
>  
> @@ -6439,6 +6439,10 @@ static void spr_update_device_location(int type_id)
>  
>  int spr_uncore_pci_init(void)
>  {
> +	int ret = snbep_pci2phy_map_init(0x3250, SKX_CPUNODEID, SKX_GIDNIDMAP, true);
> +	if (ret)
> +		return ret;
> +
>  	/*
>  	 * The discovery table of UPI on some SPR variant is broken,
>  	 * which impacts the detection of both UPI and M3UPI uncore PMON.
> @@ -6460,21 +6464,13 @@ int spr_uncore_pci_init(void)
>  
>  void spr_uncore_mmio_init(void)
>  {
> -	int ret = snbep_pci2phy_map_init(0x3250, SKX_CPUNODEID, SKX_GIDNIDMAP, true);
> +	uncore_mmio_uncores = uncore_get_uncores(UNCORE_ACCESS_MMIO,
> +						 UNCORE_SPR_MMIO_EXTRA_UNCORES,
> +						 spr_mmio_uncores,
> +						 UNCORE_SPR_NUM_UNCORE_TYPES,
> +						 spr_uncores);
>  
> -	if (ret) {
> -		uncore_mmio_uncores = uncore_get_uncores(UNCORE_ACCESS_MMIO, 0, NULL,
> -							 UNCORE_SPR_NUM_UNCORE_TYPES,
> -							 spr_uncores);
> -	} else {
> -		uncore_mmio_uncores = uncore_get_uncores(UNCORE_ACCESS_MMIO,
> -							 UNCORE_SPR_MMIO_EXTRA_UNCORES,
> -							 spr_mmio_uncores,
> -							 UNCORE_SPR_NUM_UNCORE_TYPES,
> -							 spr_uncores);
> -
> -		spr_uncore_imc_free_running.num_boxes = uncore_type_max_boxes(uncore_mmio_uncores, UNCORE_SPR_IMC) / 2;
> -	}
> +	spr_uncore_imc_free_running.num_boxes = uncore_type_max_boxes(uncore_mmio_uncores, UNCORE_SPR_IMC) / 2;

I'm not sure if we can directly remove the snbep_pci2phy_map_init() call
here. In theory, the snbep_pci2phy_map_init() call in spr_uncore_pci_init()
could fail and then spr_uncore_mmio_init() doesn't know it and directly
initializes MMIO PMU, then it could lead to the MMIO initialization fails.

Currently the PCI, CPU and MMIO initialization are totally independent,
only when the 3 types initialization all fail, then uncore PMU can abort.

``` 

   if (uncore_init->pci_init) {
        pret = uncore_init->pci_init();
        if (!pret)
            pret = uncore_pci_init();
    }

    if (uncore_init->cpu_init) {
        uncore_init->cpu_init();
        cret = uncore_cpu_init();
    }

    if (uncore_init->mmio_init) {
        uncore_init->mmio_init();
        mret = uncore_mmio_init();
    }

    if (cret && pret && mret) {
        ret = -ENODEV;
        goto free_discovery;
    }
```


>  }
>  
>  /* end of SPR uncore support */

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH V6 4/5] perf/x86/intel/uncore: Fix PMON enumeration with NUMA disabled
  2026-03-31  1:26   ` Mi, Dapeng
@ 2026-04-01 20:25     ` Chen, Zide
  2026-04-02  2:48       ` Mi, Dapeng
  0 siblings, 1 reply; 12+ messages in thread
From: Chen, Zide @ 2026-04-01 20:25 UTC (permalink / raw)
  To: Mi, Dapeng, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Steve Wahl, Chun-Tse Shao,
	Markus Elfring



On 3/30/2026 6:26 PM, Mi, Dapeng wrote:
> 
> On 3/31/2026 5:24 AM, Zide Chen wrote:
>> When NUMA is disabled on a NUMA-capable platform, UPI and M3UPI PMON
>> units are not enumerated.
>>
>> In this case, pcibus_to_node() always returns NUMA_NO_NODE, causing
>> uncore_device_to_die() to return -1 for all PCI devices. As a result,
>> the corresponding PMON units are not added to the RB tree.
>>
>> These PMON units are per-die resources, and their utility when NUMA is
>> disabled is limited.  The driver does not prohibit their use, and the
>> enumeration should still work correctly.
>>
>> Fix this by using uncore_pcibus_to_dieid(), which works regardless of
>> whether NUMA is enabled.  This requires calling
>> snbep_pci2phy_map_init() in spr_uncore_pci_init().
>>
>> Since pci_init() is called before mmio_init(), remove the redundant
>> snbep_pci2phy_map_init() call from spr_uncore_mmio_init().  If
>> snbep_pci2phy_map_init() fails, uncore driver should be bailed out,
>> so the fallback path in spr_uncore_mmio_init() can be removed.
>>
>> Signed-off-by: Zide Chen <zide.chen@intel.com>
>> ---
>> V6:
>> - Split from patch v5 3/4.
>> - Remove the redundant call in spr_uncore_mmio_init().
>> - Update commit messages.
>> ---
>>  arch/x86/events/intel/uncore.c       |  1 +
>>  arch/x86/events/intel/uncore_snbep.c | 26 +++++++++++---------------
>>  2 files changed, 12 insertions(+), 15 deletions(-)
>>
>> diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
>> index 786bd51a0d89..e9cc1ba921c5 100644
>> --- a/arch/x86/events/intel/uncore.c
>> +++ b/arch/x86/events/intel/uncore.c
>> @@ -67,6 +67,7 @@ int uncore_die_to_segment(int die)
>>  	return bus ? pci_domain_nr(bus) : -EINVAL;
>>  }
>>  
>> +/* Note: This API can only be used when NUMA information is available. */
>>  int uncore_device_to_die(struct pci_dev *dev)
>>  {
>>  	int node = pcibus_to_node(dev->bus);
>> diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
>> index 8ee06d4659bb..73da1e88e286 100644
>> --- a/arch/x86/events/intel/uncore_snbep.c
>> +++ b/arch/x86/events/intel/uncore_snbep.c
>> @@ -6415,7 +6415,7 @@ static void spr_update_device_location(int type_id)
>>  
>>  	while ((dev = pci_get_device(PCI_VENDOR_ID_INTEL, device, dev)) != NULL) {
>>  
>> -		die = uncore_device_to_die(dev);
>> +		die = uncore_pcibus_to_dieid(dev->bus);
>>  		if (die < 0)
>>  			continue;
>>  
>> @@ -6439,6 +6439,10 @@ static void spr_update_device_location(int type_id)
>>  
>>  int spr_uncore_pci_init(void)
>>  {
>> +	int ret = snbep_pci2phy_map_init(0x3250, SKX_CPUNODEID, SKX_GIDNIDMAP, true);
>> +	if (ret)
>> +		return ret;
>> +
>>  	/*
>>  	 * The discovery table of UPI on some SPR variant is broken,
>>  	 * which impacts the detection of both UPI and M3UPI uncore PMON.
>> @@ -6460,21 +6464,13 @@ int spr_uncore_pci_init(void)
>>  
>>  void spr_uncore_mmio_init(void)
>>  {
>> -	int ret = snbep_pci2phy_map_init(0x3250, SKX_CPUNODEID, SKX_GIDNIDMAP, true);
>> +	uncore_mmio_uncores = uncore_get_uncores(UNCORE_ACCESS_MMIO,
>> +						 UNCORE_SPR_MMIO_EXTRA_UNCORES,
>> +						 spr_mmio_uncores,
>> +						 UNCORE_SPR_NUM_UNCORE_TYPES,
>> +						 spr_uncores);
>>  
>> -	if (ret) {
>> -		uncore_mmio_uncores = uncore_get_uncores(UNCORE_ACCESS_MMIO, 0, NULL,
>> -							 UNCORE_SPR_NUM_UNCORE_TYPES,
>> -							 spr_uncores);
>> -	} else {
>> -		uncore_mmio_uncores = uncore_get_uncores(UNCORE_ACCESS_MMIO,
>> -							 UNCORE_SPR_MMIO_EXTRA_UNCORES,
>> -							 spr_mmio_uncores,
>> -							 UNCORE_SPR_NUM_UNCORE_TYPES,
>> -							 spr_uncores);
>> -
>> -		spr_uncore_imc_free_running.num_boxes = uncore_type_max_boxes(uncore_mmio_uncores, UNCORE_SPR_IMC) / 2;
>> -	}
>> +	spr_uncore_imc_free_running.num_boxes = uncore_type_max_boxes(uncore_mmio_uncores, UNCORE_SPR_IMC) / 2;
> 
> I'm not sure if we can directly remove the snbep_pci2phy_map_init() call
> here. In theory, the snbep_pci2phy_map_init() call in spr_uncore_pci_init()
> could fail and then spr_uncore_mmio_init() doesn't know it and directly
> initializes MMIO PMU, then it could lead to the MMIO initialization fails.


Yes, this is true. But I would argue that the fix in this patch is
correct, and the issue you pointed out is not new: the uncore driver
registers a PMU device without guaranteeing it's functioning.

This is because the Intel uncore driver employs a lazy init approach.
And when init_box() fails, it doesn't unregister the inaccessible PMU
devices. For example, intel_generic_uncore_mmio_init_box() could fail
for a number of reasons, making all associated PMU devices non-functional.

Originally the uncore driver tried to enumerate PCI/MSR/MMIO uncore
independently, but evolving hardware complexity makes this more
challenging.  This patch is just one example, IMC Freerunning is
MMIO-accessed but relies on PCI devices to read the die-specific MMIO
base address. Explicitly gating sysfs node creation with PCI init code
in mmio_init() is neither clean nor reliable.

To fix it, it seems reasonable to have init_box() return int and
unregister the PMU device if deemed inaccessible — similar to what
perf_event_ibs_init() does.

--- a/arch/x86/events/intel/uncore.h
+++ b/arch/x86/events/intel/uncore.h
@@ -129,7 +129,7 @@ struct intel_uncore_type {
 #define events_group attr_groups[2]

 struct intel_uncore_ops {
-       void (*init_box)(struct intel_uncore_box *);
+       int (*init_box)(struct intel_uncore_box *);

--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -1155,7 +1155,8 @@ static int uncore_pci_pmu_register(struct pci_dev
*pdev,
        box->dieid = die;
        box->pci_dev = pdev;
        box->pmu = pmu;
-       uncore_box_init(box);
+       ret = uncore_box_init(box);
+	if (ret)
+               return ret;

@@ -1598,8 +1599,10 @@ static int uncore_box_ref(struct
intel_uncore_type **types,
                pmu = type->pmus;
                for (i = 0; i < type->num_boxes; i++, pmu++) {
                        box = pmu->boxes[id];
-                       if (box && box->cpu >= 0 &&
atomic_inc_return(&box->refcnt) == 1)
-                               uncore_box_init(box);
+                       if (box && box->cpu >= 0 &&
atomic_inc_return(&box->refcnt) == 1)
+                               if (uncore_box_init(box))
+                                       uncore_pmu_unregister(pmu);


> Currently the PCI, CPU and MMIO initialization are totally independent,
> only when the 3 types initialization all fail, then uncore PMU can abort.
> 
> ``` 
> 
>    if (uncore_init->pci_init) {
>         pret = uncore_init->pci_init();
>         if (!pret)
>             pret = uncore_pci_init();
>     }
> 
>     if (uncore_init->cpu_init) {
>         uncore_init->cpu_init();
>         cret = uncore_cpu_init();
>     }
> 
>     if (uncore_init->mmio_init) {
>         uncore_init->mmio_init();
>         mret = uncore_mmio_init();
>     }
> 
>     if (cret && pret && mret) {
>         ret = -ENODEV;
>         goto free_discovery;
>     }
> ```
> 
> 
>>  }
>>  
>>  /* end of SPR uncore support */


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH V6 4/5] perf/x86/intel/uncore: Fix PMON enumeration with NUMA disabled
  2026-04-01 20:25     ` Chen, Zide
@ 2026-04-02  2:48       ` Mi, Dapeng
  2026-04-02 21:31         ` Chen, Zide
  0 siblings, 1 reply; 12+ messages in thread
From: Mi, Dapeng @ 2026-04-02  2:48 UTC (permalink / raw)
  To: Chen, Zide, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Steve Wahl, Chun-Tse Shao,
	Markus Elfring


On 4/2/2026 4:25 AM, Chen, Zide wrote:
>
> On 3/30/2026 6:26 PM, Mi, Dapeng wrote:
>> On 3/31/2026 5:24 AM, Zide Chen wrote:
>>> When NUMA is disabled on a NUMA-capable platform, UPI and M3UPI PMON
>>> units are not enumerated.
>>>
>>> In this case, pcibus_to_node() always returns NUMA_NO_NODE, causing
>>> uncore_device_to_die() to return -1 for all PCI devices. As a result,
>>> the corresponding PMON units are not added to the RB tree.
>>>
>>> These PMON units are per-die resources, and their utility when NUMA is
>>> disabled is limited.  The driver does not prohibit their use, and the
>>> enumeration should still work correctly.
>>>
>>> Fix this by using uncore_pcibus_to_dieid(), which works regardless of
>>> whether NUMA is enabled.  This requires calling
>>> snbep_pci2phy_map_init() in spr_uncore_pci_init().
>>>
>>> Since pci_init() is called before mmio_init(), remove the redundant
>>> snbep_pci2phy_map_init() call from spr_uncore_mmio_init().  If
>>> snbep_pci2phy_map_init() fails, uncore driver should be bailed out,
>>> so the fallback path in spr_uncore_mmio_init() can be removed.
>>>
>>> Signed-off-by: Zide Chen <zide.chen@intel.com>
>>> ---
>>> V6:
>>> - Split from patch v5 3/4.
>>> - Remove the redundant call in spr_uncore_mmio_init().
>>> - Update commit messages.
>>> ---
>>>  arch/x86/events/intel/uncore.c       |  1 +
>>>  arch/x86/events/intel/uncore_snbep.c | 26 +++++++++++---------------
>>>  2 files changed, 12 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
>>> index 786bd51a0d89..e9cc1ba921c5 100644
>>> --- a/arch/x86/events/intel/uncore.c
>>> +++ b/arch/x86/events/intel/uncore.c
>>> @@ -67,6 +67,7 @@ int uncore_die_to_segment(int die)
>>>  	return bus ? pci_domain_nr(bus) : -EINVAL;
>>>  }
>>>  
>>> +/* Note: This API can only be used when NUMA information is available. */
>>>  int uncore_device_to_die(struct pci_dev *dev)
>>>  {
>>>  	int node = pcibus_to_node(dev->bus);
>>> diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
>>> index 8ee06d4659bb..73da1e88e286 100644
>>> --- a/arch/x86/events/intel/uncore_snbep.c
>>> +++ b/arch/x86/events/intel/uncore_snbep.c
>>> @@ -6415,7 +6415,7 @@ static void spr_update_device_location(int type_id)
>>>  
>>>  	while ((dev = pci_get_device(PCI_VENDOR_ID_INTEL, device, dev)) != NULL) {
>>>  
>>> -		die = uncore_device_to_die(dev);
>>> +		die = uncore_pcibus_to_dieid(dev->bus);
>>>  		if (die < 0)
>>>  			continue;
>>>  
>>> @@ -6439,6 +6439,10 @@ static void spr_update_device_location(int type_id)
>>>  
>>>  int spr_uncore_pci_init(void)
>>>  {
>>> +	int ret = snbep_pci2phy_map_init(0x3250, SKX_CPUNODEID, SKX_GIDNIDMAP, true);
>>> +	if (ret)
>>> +		return ret;
>>> +
>>>  	/*
>>>  	 * The discovery table of UPI on some SPR variant is broken,
>>>  	 * which impacts the detection of both UPI and M3UPI uncore PMON.
>>> @@ -6460,21 +6464,13 @@ int spr_uncore_pci_init(void)
>>>  
>>>  void spr_uncore_mmio_init(void)
>>>  {
>>> -	int ret = snbep_pci2phy_map_init(0x3250, SKX_CPUNODEID, SKX_GIDNIDMAP, true);
>>> +	uncore_mmio_uncores = uncore_get_uncores(UNCORE_ACCESS_MMIO,
>>> +						 UNCORE_SPR_MMIO_EXTRA_UNCORES,
>>> +						 spr_mmio_uncores,
>>> +						 UNCORE_SPR_NUM_UNCORE_TYPES,
>>> +						 spr_uncores);
>>>  
>>> -	if (ret) {
>>> -		uncore_mmio_uncores = uncore_get_uncores(UNCORE_ACCESS_MMIO, 0, NULL,
>>> -							 UNCORE_SPR_NUM_UNCORE_TYPES,
>>> -							 spr_uncores);
>>> -	} else {
>>> -		uncore_mmio_uncores = uncore_get_uncores(UNCORE_ACCESS_MMIO,
>>> -							 UNCORE_SPR_MMIO_EXTRA_UNCORES,
>>> -							 spr_mmio_uncores,
>>> -							 UNCORE_SPR_NUM_UNCORE_TYPES,
>>> -							 spr_uncores);
>>> -
>>> -		spr_uncore_imc_free_running.num_boxes = uncore_type_max_boxes(uncore_mmio_uncores, UNCORE_SPR_IMC) / 2;
>>> -	}
>>> +	spr_uncore_imc_free_running.num_boxes = uncore_type_max_boxes(uncore_mmio_uncores, UNCORE_SPR_IMC) / 2;
>> I'm not sure if we can directly remove the snbep_pci2phy_map_init() call
>> here. In theory, the snbep_pci2phy_map_init() call in spr_uncore_pci_init()
>> could fail and then spr_uncore_mmio_init() doesn't know it and directly
>> initializes MMIO PMU, then it could lead to the MMIO initialization fails.
>
> Yes, this is true. But I would argue that the fix in this patch is
> correct, and the issue you pointed out is not new: the uncore driver
> registers a PMU device without guaranteeing it's functioning.
>
> This is because the Intel uncore driver employs a lazy init approach.
> And when init_box() fails, it doesn't unregister the inaccessible PMU
> devices. For example, intel_generic_uncore_mmio_init_box() could fail
> for a number of reasons, making all associated PMU devices non-functional.
>
> Originally the uncore driver tried to enumerate PCI/MSR/MMIO uncore
> independently, but evolving hardware complexity makes this more
> challenging.  This patch is just one example, IMC Freerunning is
> MMIO-accessed but relies on PCI devices to read the die-specific MMIO
> base address. Explicitly gating sysfs node creation with PCI init code
> in mmio_init() is neither clean nor reliable.
>
> To fix it, it seems reasonable to have init_box() return int and
> unregister the PMU device if deemed inaccessible — similar to what
> perf_event_ibs_init() does.
>
> --- a/arch/x86/events/intel/uncore.h
> +++ b/arch/x86/events/intel/uncore.h
> @@ -129,7 +129,7 @@ struct intel_uncore_type {
>  #define events_group attr_groups[2]
>
>  struct intel_uncore_ops {
> -       void (*init_box)(struct intel_uncore_box *);
> +       int (*init_box)(struct intel_uncore_box *);
>
> --- a/arch/x86/events/intel/uncore.c
> +++ b/arch/x86/events/intel/uncore.c
> @@ -1155,7 +1155,8 @@ static int uncore_pci_pmu_register(struct pci_dev
> *pdev,
>         box->dieid = die;
>         box->pci_dev = pdev;
>         box->pmu = pmu;
> -       uncore_box_init(box);
> +       ret = uncore_box_init(box);
> +	if (ret)
> +               return ret;
>
> @@ -1598,8 +1599,10 @@ static int uncore_box_ref(struct
> intel_uncore_type **types,
>                 pmu = type->pmus;
>                 for (i = 0; i < type->num_boxes; i++, pmu++) {
>                         box = pmu->boxes[id];
> -                       if (box && box->cpu >= 0 &&
> atomic_inc_return(&box->refcnt) == 1)
> -                               uncore_box_init(box);
> +                       if (box && box->cpu >= 0 &&
> atomic_inc_return(&box->refcnt) == 1)
> +                               if (uncore_box_init(box))
> +                                       uncore_pmu_unregister(pmu);

Yes, I like this idea. The return value of init_box() should always be
checked. I'm not quite sure if there are other resources need to be cleaned
besides unregistering the corresponding uncore pmu, please double check.
Thanks.


>
>
>> Currently the PCI, CPU and MMIO initialization are totally independent,
>> only when the 3 types initialization all fail, then uncore PMU can abort.
>>
>> ``` 
>>
>>    if (uncore_init->pci_init) {
>>         pret = uncore_init->pci_init();
>>         if (!pret)
>>             pret = uncore_pci_init();
>>     }
>>
>>     if (uncore_init->cpu_init) {
>>         uncore_init->cpu_init();
>>         cret = uncore_cpu_init();
>>     }
>>
>>     if (uncore_init->mmio_init) {
>>         uncore_init->mmio_init();
>>         mret = uncore_mmio_init();
>>     }
>>
>>     if (cret && pret && mret) {
>>         ret = -ENODEV;
>>         goto free_discovery;
>>     }
>> ```
>>
>>
>>>  }
>>>  
>>>  /* end of SPR uncore support */

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH V6 4/5] perf/x86/intel/uncore: Fix PMON enumeration with NUMA disabled
  2026-04-02  2:48       ` Mi, Dapeng
@ 2026-04-02 21:31         ` Chen, Zide
  2026-04-03  0:58           ` Mi, Dapeng
  0 siblings, 1 reply; 12+ messages in thread
From: Chen, Zide @ 2026-04-02 21:31 UTC (permalink / raw)
  To: Mi, Dapeng, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Steve Wahl, Chun-Tse Shao,
	Markus Elfring



On 4/1/2026 7:48 PM, Mi, Dapeng wrote:
> 
> On 4/2/2026 4:25 AM, Chen, Zide wrote:
>>
>> On 3/30/2026 6:26 PM, Mi, Dapeng wrote:
>>> On 3/31/2026 5:24 AM, Zide Chen wrote:
>>>> When NUMA is disabled on a NUMA-capable platform, UPI and M3UPI PMON
>>>> units are not enumerated.
>>>>
>>>> In this case, pcibus_to_node() always returns NUMA_NO_NODE, causing
>>>> uncore_device_to_die() to return -1 for all PCI devices. As a result,
>>>> the corresponding PMON units are not added to the RB tree.
>>>>
>>>> These PMON units are per-die resources, and their utility when NUMA is
>>>> disabled is limited.  The driver does not prohibit their use, and the
>>>> enumeration should still work correctly.
>>>>
>>>> Fix this by using uncore_pcibus_to_dieid(), which works regardless of
>>>> whether NUMA is enabled.  This requires calling
>>>> snbep_pci2phy_map_init() in spr_uncore_pci_init().
>>>>
>>>> Since pci_init() is called before mmio_init(), remove the redundant
>>>> snbep_pci2phy_map_init() call from spr_uncore_mmio_init().  If
>>>> snbep_pci2phy_map_init() fails, uncore driver should be bailed out,
>>>> so the fallback path in spr_uncore_mmio_init() can be removed.
>>>>
>>>> Signed-off-by: Zide Chen <zide.chen@intel.com>
>>>> ---
>>>> V6:
>>>> - Split from patch v5 3/4.
>>>> - Remove the redundant call in spr_uncore_mmio_init().
>>>> - Update commit messages.
>>>> ---
>>>>  arch/x86/events/intel/uncore.c       |  1 +
>>>>  arch/x86/events/intel/uncore_snbep.c | 26 +++++++++++---------------
>>>>  2 files changed, 12 insertions(+), 15 deletions(-)
>>>>
>>>> diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
>>>> index 786bd51a0d89..e9cc1ba921c5 100644
>>>> --- a/arch/x86/events/intel/uncore.c
>>>> +++ b/arch/x86/events/intel/uncore.c
>>>> @@ -67,6 +67,7 @@ int uncore_die_to_segment(int die)
>>>>  	return bus ? pci_domain_nr(bus) : -EINVAL;
>>>>  }
>>>>  
>>>> +/* Note: This API can only be used when NUMA information is available. */
>>>>  int uncore_device_to_die(struct pci_dev *dev)
>>>>  {
>>>>  	int node = pcibus_to_node(dev->bus);
>>>> diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
>>>> index 8ee06d4659bb..73da1e88e286 100644
>>>> --- a/arch/x86/events/intel/uncore_snbep.c
>>>> +++ b/arch/x86/events/intel/uncore_snbep.c
>>>> @@ -6415,7 +6415,7 @@ static void spr_update_device_location(int type_id)
>>>>  
>>>>  	while ((dev = pci_get_device(PCI_VENDOR_ID_INTEL, device, dev)) != NULL) {
>>>>  
>>>> -		die = uncore_device_to_die(dev);
>>>> +		die = uncore_pcibus_to_dieid(dev->bus);
>>>>  		if (die < 0)
>>>>  			continue;
>>>>  
>>>> @@ -6439,6 +6439,10 @@ static void spr_update_device_location(int type_id)
>>>>  
>>>>  int spr_uncore_pci_init(void)
>>>>  {
>>>> +	int ret = snbep_pci2phy_map_init(0x3250, SKX_CPUNODEID, SKX_GIDNIDMAP, true);
>>>> +	if (ret)
>>>> +		return ret;
>>>> +
>>>>  	/*
>>>>  	 * The discovery table of UPI on some SPR variant is broken,
>>>>  	 * which impacts the detection of both UPI and M3UPI uncore PMON.
>>>> @@ -6460,21 +6464,13 @@ int spr_uncore_pci_init(void)
>>>>  
>>>>  void spr_uncore_mmio_init(void)
>>>>  {
>>>> -	int ret = snbep_pci2phy_map_init(0x3250, SKX_CPUNODEID, SKX_GIDNIDMAP, true);
>>>> +	uncore_mmio_uncores = uncore_get_uncores(UNCORE_ACCESS_MMIO,
>>>> +						 UNCORE_SPR_MMIO_EXTRA_UNCORES,
>>>> +						 spr_mmio_uncores,
>>>> +						 UNCORE_SPR_NUM_UNCORE_TYPES,
>>>> +						 spr_uncores);
>>>>  
>>>> -	if (ret) {
>>>> -		uncore_mmio_uncores = uncore_get_uncores(UNCORE_ACCESS_MMIO, 0, NULL,
>>>> -							 UNCORE_SPR_NUM_UNCORE_TYPES,
>>>> -							 spr_uncores);
>>>> -	} else {
>>>> -		uncore_mmio_uncores = uncore_get_uncores(UNCORE_ACCESS_MMIO,
>>>> -							 UNCORE_SPR_MMIO_EXTRA_UNCORES,
>>>> -							 spr_mmio_uncores,
>>>> -							 UNCORE_SPR_NUM_UNCORE_TYPES,
>>>> -							 spr_uncores);
>>>> -
>>>> -		spr_uncore_imc_free_running.num_boxes = uncore_type_max_boxes(uncore_mmio_uncores, UNCORE_SPR_IMC) / 2;
>>>> -	}
>>>> +	spr_uncore_imc_free_running.num_boxes = uncore_type_max_boxes(uncore_mmio_uncores, UNCORE_SPR_IMC) / 2;
>>> I'm not sure if we can directly remove the snbep_pci2phy_map_init() call
>>> here. In theory, the snbep_pci2phy_map_init() call in spr_uncore_pci_init()
>>> could fail and then spr_uncore_mmio_init() doesn't know it and directly
>>> initializes MMIO PMU, then it could lead to the MMIO initialization fails.
>>
>> Yes, this is true. But I would argue that the fix in this patch is
>> correct, and the issue you pointed out is not new: the uncore driver
>> registers a PMU device without guaranteeing it's functioning.
>>
>> This is because the Intel uncore driver employs a lazy init approach.
>> And when init_box() fails, it doesn't unregister the inaccessible PMU
>> devices. For example, intel_generic_uncore_mmio_init_box() could fail
>> for a number of reasons, making all associated PMU devices non-functional.
>>
>> Originally the uncore driver tried to enumerate PCI/MSR/MMIO uncore
>> independently, but evolving hardware complexity makes this more
>> challenging.  This patch is just one example, IMC Freerunning is
>> MMIO-accessed but relies on PCI devices to read the die-specific MMIO
>> base address. Explicitly gating sysfs node creation with PCI init code
>> in mmio_init() is neither clean nor reliable.
>>
>> To fix it, it seems reasonable to have init_box() return int and
>> unregister the PMU device if deemed inaccessible — similar to what
>> perf_event_ibs_init() does.
>>
>> --- a/arch/x86/events/intel/uncore.h
>> +++ b/arch/x86/events/intel/uncore.h
>> @@ -129,7 +129,7 @@ struct intel_uncore_type {
>>  #define events_group attr_groups[2]
>>
>>  struct intel_uncore_ops {
>> -       void (*init_box)(struct intel_uncore_box *);
>> +       int (*init_box)(struct intel_uncore_box *);
>>
>> --- a/arch/x86/events/intel/uncore.c
>> +++ b/arch/x86/events/intel/uncore.c
>> @@ -1155,7 +1155,8 @@ static int uncore_pci_pmu_register(struct pci_dev
>> *pdev,
>>         box->dieid = die;
>>         box->pci_dev = pdev;
>>         box->pmu = pmu;
>> -       uncore_box_init(box);
>> +       ret = uncore_box_init(box);
>> +	if (ret)
>> +               return ret;
>>
>> @@ -1598,8 +1599,10 @@ static int uncore_box_ref(struct
>> intel_uncore_type **types,
>>                 pmu = type->pmus;
>>                 for (i = 0; i < type->num_boxes; i++, pmu++) {
>>                         box = pmu->boxes[id];
>> -                       if (box && box->cpu >= 0 &&
>> atomic_inc_return(&box->refcnt) == 1)
>> -                               uncore_box_init(box);
>> +                       if (box && box->cpu >= 0 &&
>> atomic_inc_return(&box->refcnt) == 1)
>> +                               if (uncore_box_init(box))
>> +                                       uncore_pmu_unregister(pmu);
> 
> Yes, I like this idea. The return value of init_box() should always be
> checked. I'm not quite sure if there are other resources need to be cleaned
> besides unregistering the corresponding uncore pmu, please double check.
> Thanks.

I'm thinking of removing this patch from this series and putting it
together with the init_box() changes, where it will be a complete fix
and I need more time to double check all the init_box() callbacks.

>>
>>
>>> Currently the PCI, CPU and MMIO initialization are totally independent,
>>> only when the 3 types initialization all fail, then uncore PMU can abort.
>>>
>>> ``` 
>>>
>>>    if (uncore_init->pci_init) {
>>>         pret = uncore_init->pci_init();
>>>         if (!pret)
>>>             pret = uncore_pci_init();
>>>     }
>>>
>>>     if (uncore_init->cpu_init) {
>>>         uncore_init->cpu_init();
>>>         cret = uncore_cpu_init();
>>>     }
>>>
>>>     if (uncore_init->mmio_init) {
>>>         uncore_init->mmio_init();
>>>         mret = uncore_mmio_init();
>>>     }
>>>
>>>     if (cret && pret && mret) {
>>>         ret = -ENODEV;
>>>         goto free_discovery;
>>>     }
>>> ```
>>>
>>>
>>>>  }
>>>>  
>>>>  /* end of SPR uncore support */


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH V6 4/5] perf/x86/intel/uncore: Fix PMON enumeration with NUMA disabled
  2026-04-02 21:31         ` Chen, Zide
@ 2026-04-03  0:58           ` Mi, Dapeng
  0 siblings, 0 replies; 12+ messages in thread
From: Mi, Dapeng @ 2026-04-03  0:58 UTC (permalink / raw)
  To: Chen, Zide, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin,
	Andi Kleen, Eranian Stephane
  Cc: linux-kernel, linux-perf-users, Steve Wahl, Chun-Tse Shao,
	Markus Elfring


On 4/3/2026 5:31 AM, Chen, Zide wrote:
>
> On 4/1/2026 7:48 PM, Mi, Dapeng wrote:
>> On 4/2/2026 4:25 AM, Chen, Zide wrote:
>>> On 3/30/2026 6:26 PM, Mi, Dapeng wrote:
>>>> On 3/31/2026 5:24 AM, Zide Chen wrote:
>>>>> When NUMA is disabled on a NUMA-capable platform, UPI and M3UPI PMON
>>>>> units are not enumerated.
>>>>>
>>>>> In this case, pcibus_to_node() always returns NUMA_NO_NODE, causing
>>>>> uncore_device_to_die() to return -1 for all PCI devices. As a result,
>>>>> the corresponding PMON units are not added to the RB tree.
>>>>>
>>>>> These PMON units are per-die resources, and their utility when NUMA is
>>>>> disabled is limited.  The driver does not prohibit their use, and the
>>>>> enumeration should still work correctly.
>>>>>
>>>>> Fix this by using uncore_pcibus_to_dieid(), which works regardless of
>>>>> whether NUMA is enabled.  This requires calling
>>>>> snbep_pci2phy_map_init() in spr_uncore_pci_init().
>>>>>
>>>>> Since pci_init() is called before mmio_init(), remove the redundant
>>>>> snbep_pci2phy_map_init() call from spr_uncore_mmio_init().  If
>>>>> snbep_pci2phy_map_init() fails, uncore driver should be bailed out,
>>>>> so the fallback path in spr_uncore_mmio_init() can be removed.
>>>>>
>>>>> Signed-off-by: Zide Chen <zide.chen@intel.com>
>>>>> ---
>>>>> V6:
>>>>> - Split from patch v5 3/4.
>>>>> - Remove the redundant call in spr_uncore_mmio_init().
>>>>> - Update commit messages.
>>>>> ---
>>>>>  arch/x86/events/intel/uncore.c       |  1 +
>>>>>  arch/x86/events/intel/uncore_snbep.c | 26 +++++++++++---------------
>>>>>  2 files changed, 12 insertions(+), 15 deletions(-)
>>>>>
>>>>> diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
>>>>> index 786bd51a0d89..e9cc1ba921c5 100644
>>>>> --- a/arch/x86/events/intel/uncore.c
>>>>> +++ b/arch/x86/events/intel/uncore.c
>>>>> @@ -67,6 +67,7 @@ int uncore_die_to_segment(int die)
>>>>>  	return bus ? pci_domain_nr(bus) : -EINVAL;
>>>>>  }
>>>>>  
>>>>> +/* Note: This API can only be used when NUMA information is available. */
>>>>>  int uncore_device_to_die(struct pci_dev *dev)
>>>>>  {
>>>>>  	int node = pcibus_to_node(dev->bus);
>>>>> diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
>>>>> index 8ee06d4659bb..73da1e88e286 100644
>>>>> --- a/arch/x86/events/intel/uncore_snbep.c
>>>>> +++ b/arch/x86/events/intel/uncore_snbep.c
>>>>> @@ -6415,7 +6415,7 @@ static void spr_update_device_location(int type_id)
>>>>>  
>>>>>  	while ((dev = pci_get_device(PCI_VENDOR_ID_INTEL, device, dev)) != NULL) {
>>>>>  
>>>>> -		die = uncore_device_to_die(dev);
>>>>> +		die = uncore_pcibus_to_dieid(dev->bus);
>>>>>  		if (die < 0)
>>>>>  			continue;
>>>>>  
>>>>> @@ -6439,6 +6439,10 @@ static void spr_update_device_location(int type_id)
>>>>>  
>>>>>  int spr_uncore_pci_init(void)
>>>>>  {
>>>>> +	int ret = snbep_pci2phy_map_init(0x3250, SKX_CPUNODEID, SKX_GIDNIDMAP, true);
>>>>> +	if (ret)
>>>>> +		return ret;
>>>>> +
>>>>>  	/*
>>>>>  	 * The discovery table of UPI on some SPR variant is broken,
>>>>>  	 * which impacts the detection of both UPI and M3UPI uncore PMON.
>>>>> @@ -6460,21 +6464,13 @@ int spr_uncore_pci_init(void)
>>>>>  
>>>>>  void spr_uncore_mmio_init(void)
>>>>>  {
>>>>> -	int ret = snbep_pci2phy_map_init(0x3250, SKX_CPUNODEID, SKX_GIDNIDMAP, true);
>>>>> +	uncore_mmio_uncores = uncore_get_uncores(UNCORE_ACCESS_MMIO,
>>>>> +						 UNCORE_SPR_MMIO_EXTRA_UNCORES,
>>>>> +						 spr_mmio_uncores,
>>>>> +						 UNCORE_SPR_NUM_UNCORE_TYPES,
>>>>> +						 spr_uncores);
>>>>>  
>>>>> -	if (ret) {
>>>>> -		uncore_mmio_uncores = uncore_get_uncores(UNCORE_ACCESS_MMIO, 0, NULL,
>>>>> -							 UNCORE_SPR_NUM_UNCORE_TYPES,
>>>>> -							 spr_uncores);
>>>>> -	} else {
>>>>> -		uncore_mmio_uncores = uncore_get_uncores(UNCORE_ACCESS_MMIO,
>>>>> -							 UNCORE_SPR_MMIO_EXTRA_UNCORES,
>>>>> -							 spr_mmio_uncores,
>>>>> -							 UNCORE_SPR_NUM_UNCORE_TYPES,
>>>>> -							 spr_uncores);
>>>>> -
>>>>> -		spr_uncore_imc_free_running.num_boxes = uncore_type_max_boxes(uncore_mmio_uncores, UNCORE_SPR_IMC) / 2;
>>>>> -	}
>>>>> +	spr_uncore_imc_free_running.num_boxes = uncore_type_max_boxes(uncore_mmio_uncores, UNCORE_SPR_IMC) / 2;
>>>> I'm not sure if we can directly remove the snbep_pci2phy_map_init() call
>>>> here. In theory, the snbep_pci2phy_map_init() call in spr_uncore_pci_init()
>>>> could fail and then spr_uncore_mmio_init() doesn't know it and directly
>>>> initializes MMIO PMU, then it could lead to the MMIO initialization fails.
>>> Yes, this is true. But I would argue that the fix in this patch is
>>> correct, and the issue you pointed out is not new: the uncore driver
>>> registers a PMU device without guaranteeing it's functioning.
>>>
>>> This is because the Intel uncore driver employs a lazy init approach.
>>> And when init_box() fails, it doesn't unregister the inaccessible PMU
>>> devices. For example, intel_generic_uncore_mmio_init_box() could fail
>>> for a number of reasons, making all associated PMU devices non-functional.
>>>
>>> Originally the uncore driver tried to enumerate PCI/MSR/MMIO uncore
>>> independently, but evolving hardware complexity makes this more
>>> challenging.  This patch is just one example, IMC Freerunning is
>>> MMIO-accessed but relies on PCI devices to read the die-specific MMIO
>>> base address. Explicitly gating sysfs node creation with PCI init code
>>> in mmio_init() is neither clean nor reliable.
>>>
>>> To fix it, it seems reasonable to have init_box() return int and
>>> unregister the PMU device if deemed inaccessible — similar to what
>>> perf_event_ibs_init() does.
>>>
>>> --- a/arch/x86/events/intel/uncore.h
>>> +++ b/arch/x86/events/intel/uncore.h
>>> @@ -129,7 +129,7 @@ struct intel_uncore_type {
>>>  #define events_group attr_groups[2]
>>>
>>>  struct intel_uncore_ops {
>>> -       void (*init_box)(struct intel_uncore_box *);
>>> +       int (*init_box)(struct intel_uncore_box *);
>>>
>>> --- a/arch/x86/events/intel/uncore.c
>>> +++ b/arch/x86/events/intel/uncore.c
>>> @@ -1155,7 +1155,8 @@ static int uncore_pci_pmu_register(struct pci_dev
>>> *pdev,
>>>         box->dieid = die;
>>>         box->pci_dev = pdev;
>>>         box->pmu = pmu;
>>> -       uncore_box_init(box);
>>> +       ret = uncore_box_init(box);
>>> +	if (ret)
>>> +               return ret;
>>>
>>> @@ -1598,8 +1599,10 @@ static int uncore_box_ref(struct
>>> intel_uncore_type **types,
>>>                 pmu = type->pmus;
>>>                 for (i = 0; i < type->num_boxes; i++, pmu++) {
>>>                         box = pmu->boxes[id];
>>> -                       if (box && box->cpu >= 0 &&
>>> atomic_inc_return(&box->refcnt) == 1)
>>> -                               uncore_box_init(box);
>>> +                       if (box && box->cpu >= 0 &&
>>> atomic_inc_return(&box->refcnt) == 1)
>>> +                               if (uncore_box_init(box))
>>> +                                       uncore_pmu_unregister(pmu);
>> Yes, I like this idea. The return value of init_box() should always be
>> checked. I'm not quite sure if there are other resources need to be cleaned
>> besides unregistering the corresponding uncore pmu, please double check.
>> Thanks.
> I'm thinking of removing this patch from this series and putting it
> together with the init_box() changes, where it will be a complete fix
> and I need more time to double check all the init_box() callbacks.

Yeah, It's a fundamental change and let's do more tests. Thanks.


>
>>>
>>>> Currently the PCI, CPU and MMIO initialization are totally independent,
>>>> only when the 3 types initialization all fail, then uncore PMU can abort.
>>>>
>>>> ``` 
>>>>
>>>>    if (uncore_init->pci_init) {
>>>>         pret = uncore_init->pci_init();
>>>>         if (!pret)
>>>>             pret = uncore_pci_init();
>>>>     }
>>>>
>>>>     if (uncore_init->cpu_init) {
>>>>         uncore_init->cpu_init();
>>>>         cret = uncore_cpu_init();
>>>>     }
>>>>
>>>>     if (uncore_init->mmio_init) {
>>>>         uncore_init->mmio_init();
>>>>         mret = uncore_mmio_init();
>>>>     }
>>>>
>>>>     if (cret && pret && mret) {
>>>>         ret = -ENODEV;
>>>>         goto free_discovery;
>>>>     }
>>>> ```
>>>>
>>>>
>>>>>  }
>>>>>  
>>>>>  /* end of SPR uncore support */

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2026-04-03  0:58 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-30 21:24 [PATCH V6 0/5] Miscellaneous Intel uncore patches Zide Chen
2026-03-30 21:24 ` [PATCH V6 1/5] perf/x86/intel/uncore: Fix iounmap() leak on global_init failure Zide Chen
2026-03-30 21:24 ` [PATCH V6 2/5] perf/x86/intel/uncore: Skip discovery table for offline dies Zide Chen
2026-03-30 21:24 ` [PATCH V6 3/5] perf/x86/intel/uncore: Do not treat -1 die_id as error during UBOX scan Zide Chen
2026-03-31  1:13   ` Mi, Dapeng
2026-03-30 21:24 ` [PATCH V6 4/5] perf/x86/intel/uncore: Fix PMON enumeration with NUMA disabled Zide Chen
2026-03-31  1:26   ` Mi, Dapeng
2026-04-01 20:25     ` Chen, Zide
2026-04-02  2:48       ` Mi, Dapeng
2026-04-02 21:31         ` Chen, Zide
2026-04-03  0:58           ` Mi, Dapeng
2026-03-30 21:24 ` [PATCH V6 5/5] perf/x86/intel/uncore: Remove extra double quote mark Zide Chen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox