Re: [PATCH V2 6/7] perf/x86/intel/uncore: Fix uncore_die_to_cpu() for offline dies

From: sashiko-bot@kernel.org
To: "Zide Chen" <zide.chen@intel.com>
Cc: linux-perf-users@vger.kernel.org
Subject: Re: [PATCH V2 6/7] perf/x86/intel/uncore: Fix uncore_die_to_cpu() for offline dies
Date: Wed, 27 May 2026 19:56:01 +0000
Message-ID: <20260527195601.F20AF1F000E9@smtp.kernel.org>
In-Reply-To: <20260527151154.130505-6-zide.chen@intel.com>

Thank you for your contribution! Sashiko AI review found 2 potential issues to consider:

Pre-existing issues:
- [High] Not introduced by this patch: in `pmu_alloc_topology()`, if the allocation for `topology[die]` fails, the error cleanup loop indexes through the NULL `topology[die]` pointer, which would crash the kernel.
- [High] Not introduced by this patch: `uncore_die_to_segment()` walks the list of PCI root buses without holding a lock or a reference, which could lead to a use-after-free if a PCI bus is hot-removed during the walk.
--

commit ff1f5cc12e40244014f762aa4e444d2ef1d867a3
Author: Zide Chen <zide.chen@intel.com>

perf/x86/intel/uncore: Fix uncore_die_to_cpu() for offline dies

If the die is offline when uncore_die_to_cpu() is called, it silently
returns 0, which is misleading. This patch modifies the function to return
-1 in this case, indicating that all CPUs on the die are offline. It also
opportunistically replaces -EPERM with -ENODEV.
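
The return-value change described above can be sketched in userspace as follows. This is a minimal illustration, not the kernel code: `struct fake_cpu`, `die_to_cpu()` and `count_reachable_dies()` are hypothetical names, and the CPU table stands in for the kernel's online mask and topology_logical_die_id().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's per-CPU topology data. */
struct fake_cpu {
	int die;	/* logical die id this CPU belongs to */
	bool online;	/* whether the CPU is currently online */
};

/*
 * Return the first online CPU on @die, or -1 if every CPU on that die is
 * offline. Starting from -1 instead of 0 keeps "no CPU found" distinct
 * from "CPU 0 found".
 */
static int die_to_cpu(const struct fake_cpu *cpus, size_t ncpus, int die)
{
	int res = -1;

	assert(cpus != NULL || ncpus == 0);

	for (size_t cpu = 0; cpu < ncpus; cpu++) {
		if (cpus[cpu].online && cpus[cpu].die == die) {
			res = (int)cpu;
			break;
		}
	}
	return res;
}

/*
 * Count the dies that have at least one online CPU, skipping offline dies
 * with "continue" in the same way the patched loop does.
 */
static int count_reachable_dies(const struct fake_cpu *cpus, size_t ncpus,
				int max_dies)
{
	int reachable = 0;

	for (int die = 0; die < max_dies; die++) {
		if (die_to_cpu(cpus, ncpus, die) == -1)
			continue;	/* all CPUs on this die are offline */
		reachable++;
	}
	return reachable;
}
```

With a table in which one die has no online CPUs, die_to_cpu() returns -1 for that die and the counting loop skips it, while a die whose first online CPU is CPU 0 still returns 0.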

> diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
> index 22256ded2d67d..4b3a1fa5b41b7 100644
> --- a/arch/x86/events/intel/uncore.c
> +++ b/arch/x86/events/intel/uncore.c
> @@ -91,7 +91,7 @@ int uncore_device_to_die(struct pci_dev *dev)
>   */
>  int uncore_die_to_cpu(int die)
>  {
> -	int res = 0, cpu;
> +	int res = -1, cpu;
>  
>  	for_each_online_cpu(cpu) {
>  		if (topology_logical_die_id(cpu) == die) {

[ ... ]

> diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
> index 772b78237424c..334dc384b5b93 100644
> --- a/arch/x86/events/intel/uncore_snbep.c
> +++ b/arch/x86/events/intel/uncore_snbep.c
> @@ -3772,12 +3772,17 @@ static void pmu_free_topology(struct intel_uncore_type *type)
>  static int skx_pmu_get_topology(struct intel_uncore_type *type,
>  				 int (*topology_cb)(struct intel_uncore_type*, int, int, u64))
>  {
> -	int die, ret = -EPERM;
> +	int die, ret = -ENODEV;
>  	u64 cpu_bus_msr;
> +	int cpu;
>  
>  	cpus_read_lock();
>  	for (die = 0; die < uncore_max_dies(); die++) {
> -		ret = skx_msr_cpu_bus_read(uncore_die_to_cpu(die), &cpu_bus_msr);
> +		cpu = uncore_die_to_cpu(die);
> +		if (cpu == -1)
> +			continue;
> +
> +		ret = skx_msr_cpu_bus_read(cpu, &cpu_bus_msr);
>  		if (ret)
>  			break;

[Severity: High]
This isn't a bug introduced by this patch, but later in this function, we
call uncore_die_to_segment() which iterates over PCI root buses using
pci_find_next_bus():

arch/x86/events/intel/uncore.c:uncore_die_to_segment() {
	struct pci_bus *bus = NULL;

	/* Find first pci bus which attributes to specified die. */
	while ((bus = pci_find_next_bus(bus)) &&
	       (die != uncore_pcibus_to_dieid(bus)))
		;

	return bus ? pci_domain_nr(bus) : -EINVAL;
}

The pci_find_next_bus() function acquires and releases pci_bus_sem internally,
returning a raw bus pointer without holding a lock or taking a reference count.

If an unrelated hot-pluggable PCI root bus is hot-removed concurrently, could
the bus pointer be freed before the next iteration?

Passing this freed pointer back to pci_find_next_bus() could then cause a
use-after-free when the function reads from->node.next.

While cpus_read_lock() is held in skx_pmu_get_topology(), does it prevent
concurrent hotplug of unrelated PCI root buses?
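
To make the concern concrete, here is a userspace sketch of a reference-holding iterator, where each returned node is pinned until the caller asks for the next one. Everything in it is an assumption for illustration: `struct fake_bus`, `bus_get()`, `bus_put()`, `bus_get_next()` and `die_to_segment()` are hypothetical, and this says nothing about which helper, if any, the PCI core offers for root buses. The sketch is single-threaded; a real version would also need the list lock held while following `next`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical refcounted node standing in for a PCI root bus. */
struct fake_bus {
	struct fake_bus *next;
	int die;
	int domain;
	int refs;	/* the list itself owns one reference */
};

static struct fake_bus *bus_get(struct fake_bus *bus)
{
	if (bus)
		bus->refs++;
	return bus;
}

static void bus_put(struct fake_bus *bus)
{
	if (!bus)
		return;
	assert(bus->refs > 0);
	if (--bus->refs == 0)
		free(bus);
}

/*
 * Return the bus after @from (or @head if @from is NULL) with a reference
 * held, then drop the caller's reference on @from. The caller therefore
 * never passes in a pointer that it has not pinned.
 */
static struct fake_bus *bus_get_next(struct fake_bus *head,
				     struct fake_bus *from)
{
	struct fake_bus *next = bus_get(from ? from->next : head);

	bus_put(from);
	return next;
}

/* Find the domain of the first bus that belongs to @die. */
static int die_to_segment(struct fake_bus *head, int die)
{
	struct fake_bus *bus = NULL;
	int domain = -EINVAL;

	while ((bus = bus_get_next(head, bus)) != NULL) {
		if (bus->die == die) {
			domain = bus->domain;
			bus_put(bus);	/* done with the match */
			break;
		}
	}
	return domain;
}
```

The design point is only that the pointer handed back to the iterator is always one the caller still holds a reference on, so it cannot have been freed between iterations.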

[Severity: High]
This is a pre-existing issue, but I noticed a potential NULL pointer
dereference in pmu_alloc_topology() in this same file.

If kzalloc_objs() for topology[die] fails, the code jumps to the clear label:

arch/x86/events/intel/uncore_snbep.c:pmu_alloc_topology() {
	...
	for (die = 0; die < uncore_max_dies(); die++) {
		topology[die] = kzalloc_objs(**topology, type->num_boxes);
		if (!topology[die])
			goto clear;
	...
clear:
	for (; die >= 0; die--) {
		for (idx = 0; idx < type->num_boxes; idx++)
			kfree(topology[die][idx].untyped);
		kfree(topology[die]);
	...
}

At the clear label, the loop variable die still holds the index whose
allocation failed, so topology[die] is NULL.

On the first iteration of the cleanup loop, does
kfree(topology[die][idx].untyped) evaluate to
kfree((topology[die] + idx)->untyped), read through a NULL base pointer,
and crash the kernel?
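
One possible way to unwind safely is to start the cleanup at the die before the one that failed. The userspace sketch below illustrates that idea under stated assumptions: `struct fake_topology`, `alloc_boxes()`, `alloc_topology()` and the `fail_at_die` failure-injection knob are hypothetical, calloc()/free() stand in for the kernel allocators, and the elided parts of the real function are not reproduced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct fake_topology {
	void *untyped;	/* per-box payload, NULL until something sets it */
};

/* Hypothetical failure injection: allocation for this die fails (-1 = never). */
static int fail_at_die = -1;

static struct fake_topology *alloc_boxes(int die, size_t num_boxes)
{
	if (die == fail_at_die)
		return NULL;
	return calloc(num_boxes, sizeof(struct fake_topology));
}

static struct fake_topology **alloc_topology(int max_dies, size_t num_boxes)
{
	struct fake_topology **topology;
	int die;

	assert(max_dies > 0 && num_boxes > 0);

	topology = calloc((size_t)max_dies, sizeof(*topology));
	if (!topology)
		return NULL;

	for (die = 0; die < max_dies; die++) {
		topology[die] = alloc_boxes(die, num_boxes);
		if (!topology[die])
			goto clear;
	}
	return topology;

clear:
	/*
	 * topology[die] is NULL here, so begin at the previous die and never
	 * index through the failed slot.
	 */
	for (die--; die >= 0; die--) {
		for (size_t idx = 0; idx < num_boxes; idx++)
			free(topology[die][idx].untyped);
		free(topology[die]);
	}
	free(topology);
	return NULL;
}
```

An equivalent alternative would be to keep the loop bounds and skip any die whose slot is NULL; either way the failed index is never dereferenced.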

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260527151154.130505-1-zide.chen@intel.com?part=6

Thread overview: 19+ messages
2026-05-27 15:11 [PATCH V2 1/7] perf/x86/intel/uncore: Fix discovery unit lookup for multi-die systems Zide Chen
2026-05-27 15:11 ` [PATCH V2 2/7] perf/x86/intel/uncore: Guard against invalid box control address Zide Chen
2026-05-27 17:28   ` sashiko-bot
2026-05-28  6:03   ` Mi, Dapeng
2026-05-27 15:11 ` [PATCH V2 3/7] perf/x86/intel/uncore: Fix PCI device refcount leak in UPI discovery Zide Chen
2026-05-28  6:34   ` Mi, Dapeng
2026-05-27 15:11 ` [PATCH V2 4/7] perf/x86/intel/uncore: Defer ADL global PMON enable to enable_box() Zide Chen
2026-05-27 18:17   ` sashiko-bot
2026-05-28  6:35   ` Mi, Dapeng
2026-05-27 15:11 ` [PATCH V2 5/7] perf/x86/intel/uncore: Move die_to_cpu() to uncore.c Zide Chen
2026-05-28  6:36   ` Mi, Dapeng
2026-05-27 15:11 ` [PATCH V2 6/7] perf/x86/intel/uncore: Fix uncore_die_to_cpu() for offline dies Zide Chen
2026-05-27 19:56   ` sashiko-bot [this message]
2026-05-28  6:38   ` Mi, Dapeng
2026-05-27 15:11 ` [PATCH V2 7/7] perf/x86/intel/uncore: Implement global init callback for GNR uncore Zide Chen
2026-05-27 20:45   ` sashiko-bot
2026-05-28  6:46   ` Mi, Dapeng
2026-05-27 15:45 ` [PATCH V2 1/7] perf/x86/intel/uncore: Fix discovery unit lookup for multi-die systems sashiko-bot
2026-05-28  6:01 ` Mi, Dapeng
