From: sashiko-bot@kernel.org
To: "Zide Chen" <zide.chen@intel.com>
Cc: linux-perf-users@vger.kernel.org
Subject: Re: [PATCH V2 6/7] perf/x86/intel/uncore: Fix uncore_die_to_cpu() for offline dies
Date: Wed, 27 May 2026 19:56:01 +0000
Message-ID: <20260527195601.F20AF1F000E9@smtp.kernel.org>
In-Reply-To: <20260527151154.130505-6-zide.chen@intel.com>
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
Pre-existing issues:
- [High] This problem wasn't introduced by this patch, but in `pmu_alloc_topology()`, if the memory allocation for `topology[die]` fails, the error cleanup loop dereferences the NULL `topology[die]` pointer, causing a kernel crash.
- [High] This problem wasn't introduced by this patch, but `uncore_die_to_segment()` traverses the list of PCI buses without holding a lock or a reference, which can lead to a use-after-free if a PCI bus is hot-removed.
--
commit ff1f5cc12e40244014f762aa4e444d2ef1d867a3
Author: Zide Chen <zide.chen@intel.com>
perf/x86/intel/uncore: Fix uncore_die_to_cpu() for offline dies
If the die is offline when uncore_die_to_cpu() is called, it silently
returns 0, which is misleading. This patch modifies the function to return
-1 in this case, indicating that all CPUs on the die are offline. It also
opportunistically replaces -EPERM with -ENODEV.
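
The return-value change described above can be sketched in userspace as
follows. This is only a model under stated assumptions: struct model_cpu,
model_die_to_cpu() and model_count_readable_dies() are hypothetical names
for illustration and are not kernel API. It shows why 0 is ambiguous (CPU 0
is a valid CPU number) and how a caller can skip a die that reports -1.

```c
#include <stddef.h>

/* Hypothetical stand-in for per-CPU topology state; not kernel API. */
struct model_cpu {
	int die;	/* logical die this CPU belongs to */
	int online;	/* nonzero if the CPU is online */
};

/*
 * Return the first online CPU on @die, or -1 if every CPU on that die is
 * offline. Returning 0 here would be indistinguishable from "CPU 0".
 */
static int model_die_to_cpu(const struct model_cpu *cpus, size_t ncpus, int die)
{
	for (size_t cpu = 0; cpu < ncpus; cpu++) {
		if (cpus[cpu].online && cpus[cpu].die == die)
			return (int)cpu;
	}
	return -1;
}

/* Count dies with a usable CPU, skipping offline dies instead of failing. */
static int model_count_readable_dies(const struct model_cpu *cpus,
				     size_t ncpus, int max_dies)
{
	int readable = 0;

	for (int die = 0; die < max_dies; die++) {
		if (model_die_to_cpu(cpus, ncpus, die) == -1)
			continue;	/* all CPUs on this die are offline */
		readable++;
	}
	return readable;
}
```

With two CPUs online on die 0 and two CPUs offline on die 1, the model
returns CPU 0 for die 0 and -1 for die 1, so the two cases are no longer
conflated.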
> diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
> index 22256ded2d67d..4b3a1fa5b41b7 100644
> --- a/arch/x86/events/intel/uncore.c
> +++ b/arch/x86/events/intel/uncore.c
> @@ -91,7 +91,7 @@ int uncore_device_to_die(struct pci_dev *dev)
>   */
>  int uncore_die_to_cpu(int die)
>  {
> -	int res = 0, cpu;
> +	int res = -1, cpu;
>
>  	for_each_online_cpu(cpu) {
>  		if (topology_logical_die_id(cpu) == die) {
[ ... ]
> diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
> index 772b78237424c..334dc384b5b93 100644
> --- a/arch/x86/events/intel/uncore_snbep.c
> +++ b/arch/x86/events/intel/uncore_snbep.c
> @@ -3772,12 +3772,17 @@ static void pmu_free_topology(struct intel_uncore_type *type)
>  static int skx_pmu_get_topology(struct intel_uncore_type *type,
>  				 int (*topology_cb)(struct intel_uncore_type*, int, int, u64))
>  {
> -	int die, ret = -EPERM;
> +	int die, ret = -ENODEV;
>  	u64 cpu_bus_msr;
> +	int cpu;
>
>  	cpus_read_lock();
>  	for (die = 0; die < uncore_max_dies(); die++) {
> -		ret = skx_msr_cpu_bus_read(uncore_die_to_cpu(die), &cpu_bus_msr);
> +		cpu = uncore_die_to_cpu(die);
> +		if (cpu == -1)
> +			continue;
> +
> +		ret = skx_msr_cpu_bus_read(cpu, &cpu_bus_msr);
>  		if (ret)
>  			break;
[Severity: High]
This isn't a bug introduced by this patch, but later in this function, we
call uncore_die_to_segment() which iterates over PCI root buses using
pci_find_next_bus():
arch/x86/events/intel/uncore.c:uncore_die_to_segment() {
	struct pci_bus *bus = NULL;

	/* Find first pci bus which attributes to specified die. */
	while ((bus = pci_find_next_bus(bus)) &&
	       (die != uncore_pcibus_to_dieid(bus)))
		;

	return bus ? pci_domain_nr(bus) : -EINVAL;
}
The pci_find_next_bus() function acquires and releases pci_bus_sem internally,
returning a raw bus pointer without holding a lock or taking a reference count.
If an unrelated hot-pluggable PCI root bus is hot-removed concurrently, could
the bus pointer be freed before the next iteration?
Passing this freed pointer back to pci_find_next_bus() could cause a
use-after-free when accessing from->node.next.
While cpus_read_lock() is held in skx_pmu_get_topology(), does it prevent
concurrent hotplug of unrelated PCI root buses?
[Severity: High]
This is a pre-existing issue, but I noticed a potential NULL pointer
dereference in pmu_alloc_topology() in this same file.
If kzalloc_objs() for topology[die] fails, the code jumps to the clear label:
arch/x86/events/intel/uncore_snbep.c:pmu_alloc_topology() {
	...
	for (die = 0; die < uncore_max_dies(); die++) {
		topology[die] = kzalloc_objs(**topology, type->num_boxes);
		if (!topology[die])
			goto clear;
	...
clear:
	for (; die >= 0; die--) {
		for (idx = 0; idx < type->num_boxes; idx++)
			kfree(topology[die][idx].untyped);
		kfree(topology[die]);
	...
}
At the clear label, the loop variable die still holds the index whose
allocation failed, so topology[die] is NULL.
Would the cleanup loop's kfree(topology[die][idx].untyped) then evaluate
(topology[die] + idx)->untyped on a NULL base pointer and crash the kernel?
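
One possible shape for a safe unwind is sketched below as a userspace model.
It is an assumption-laden illustration, not the driver's code:
model_zalloc(), model_free(), model_fail_at, model_live_allocs and
model_alloc_topology() are hypothetical names, and only the per-die array
allocation is modelled. The cleanup loop skips any die whose array was never
allocated instead of indexing through a NULL pointer.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-box topology entry. */
struct model_topology {
	void *untyped;
};

static int model_fail_at = -1;	/* die index whose allocation fails, -1: none */
static int model_live_allocs;	/* outstanding allocations, for leak checks */

static void *model_zalloc(size_t n, size_t size)
{
	void *p = calloc(n, size);

	if (p)
		model_live_allocs++;
	return p;
}

static void model_free(void *p)
{
	if (p) {
		model_live_allocs--;
		free(p);
	}
}

static int model_alloc_topology(struct model_topology ***out,
				int max_dies, int num_boxes)
{
	struct model_topology **topology;
	int die, idx;

	topology = model_zalloc(max_dies, sizeof(*topology));
	if (!topology)
		return -ENOMEM;

	for (die = 0; die < max_dies; die++) {
		/* Inject a failure at model_fail_at to exercise the unwind. */
		topology[die] = (die == model_fail_at) ? NULL :
				model_zalloc(num_boxes, sizeof(**topology));
		if (!topology[die])
			goto clear;
	}

	*out = topology;
	return 0;

clear:
	for (; die >= 0; die--) {
		/* The failing die has no array; never index through NULL. */
		if (!topology[die])
			continue;
		for (idx = 0; idx < num_boxes; idx++)
			model_free(topology[die][idx].untyped);
		model_free(topology[die]);
	}
	model_free(topology);
	return -ENOMEM;
}
```

A NULL check is used here rather than starting the loop at die - 1 because
it stays correct if some other failure path reaches the same label with
topology[die] already allocated.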
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260527151154.130505-1-zide.chen@intel.com?part=6