Re: [PATCH 2/5] perf tools: Add bounds check to cpu__get_node()

Linux Perf Users
 help / color / mirror / Atom feed

From: sashiko-bot@kernel.org
To: "Arnaldo Carvalho de Melo" <acme@kernel.org>
Cc: linux-perf-users@vger.kernel.org
Subject: Re: [PATCH 2/5] perf tools: Add bounds check to cpu__get_node()
Date: Fri, 05 Jun 2026 14:30:00 +0000	[thread overview]
Message-ID: <20260605143000.DE0CD1F00893@smtp.kernel.org> (raw)
In-Reply-To: <20260605121515.1725549-3-acme@kernel.org>

Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:

Pre-existing issues:
- [High] `builtin-kmem` omits the `.attr` callback in its `perf_tool`, which breaks processing of piped streams natively.
- [High] Integer truncation in `set_max_cpu_num()` combined with missing bounds checks in `cpu__setup_cpunode_map()` causes heap out-of-bounds writes on systems with large sparse CPU topologies.
- [High] `perf_mmap__aio_bind` triggers an out-of-bounds memory write and subsequent crash when a CPU's NUMA node is unknown or mapped to a dummy CPU.
--

commit 84fade77fbbcea06a93ffe9275b1abcc29bbc6bc
Author: Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Add bounds check to cpu__get_node()

The patch adds a bounds check against max_cpu_num in cpu__get_node() to
protect against untrusted or out-of-bounds CPU index values before accessing
cpunode_map. This prevents out-of-bounds array indexing, particularly when
analyzing cross-machine perf recordings where the recorded CPU count may
exceed the host's CPU count.

> diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
> index b1e5c29c6e3ec..d3432622b2adc 100644
> --- a/tools/perf/util/cpumap.c
> +++ b/tools/perf/util/cpumap.c
> @@ -576,6 +576,10 @@ int cpu__get_node(struct perf_cpu cpu)
>  		return -1;
>  	}
>  
> +	/* cpunode_map allocated for max_cpu_num entries; input may be untrusted */
> +	if (cpu.cpu < 0 || cpu.cpu >= max_cpu_num.cpu)

[Severity: High]
This isn't a bug introduced by this patch, but parsing sysfs to determine
the maximum CPU number introduces a memory corruption vulnerability.

In tools/perf/util/cpumap.c:set_max_cpu_num(), the max possible CPU number
is parsed from sysfs into a local int max. This value is directly assigned to
max_cpu_num.cpu, which is an int16_t. If max exceeds 32,767 (as is possible on
large modern architectures or due to sparse APIC numbering), the assignment
truncates the value, resulting in a small or negative integer.

Later, init_cpunode_map() allocates cpunode_map based on this truncated size,
and cpu__setup_cpunode_map() writes to it without truncation:

tools/perf/util/cpumap.c:cpu__setup_cpunode_map() {
...
		while ((dent2 = readdir(dir2)) != NULL) {
			if (dent2->d_type != DT_LNK || sscanf(dent2->d_name, "cpu%u", &cpu) < 1)
				continue;
			cpunode_map[cpu] = mem;
		}
...

Could this result in an out-of-bounds write on the heap?

> +		return -1;

[Severity: High]
This isn't a bug introduced by this patch, but returning -1 here causes a
crash in perf_mmap__aio_bind() when a CPU's NUMA node is unknown or mapped to
a dummy CPU.

tools/perf/util/mmap.c:perf_mmap__aio_bind() {
...
		node_index = cpu__get_node(cpu);
		node_mask = bitmap_zalloc(node_index + 1);
		if (!node_mask) {
			pr_err("Failed to allocate node mask for mbind: error %m\n");
			return -1;
		}
		__set_bit(node_index, node_mask);
...

When cpu__get_node() returns -1, it is implicitly cast to unsigned long
node_index (ULONG_MAX). bitmap_zalloc(ULONG_MAX + 1) evaluates to
bitmap_zalloc(0) and allocates a minimal buffer.

Does the subsequent __set_bit(ULONG_MAX, node_mask) perform a wild memory
write vastly outside the allocated bitmap?

> +
>  	return cpunode_map[cpu.cpu];
>  }

[Severity: High]
This isn't a bug introduced by this patch, but while reviewing
builtin-kmem.c which is mentioned in the commit message as calling
cpu__get_node(), it appears the perf_tool initialization omits the
.attr callback:

tools/perf/builtin-kmem.c:cmd_kmem() {
...
	perf_tool__init(&perf_kmem, /*ordered_events=*/true);
	perf_kmem.sample	= process_sample_event;
	perf_kmem.comm		= perf_event__process_comm;
	perf_kmem.mmap		= perf_event__process_mmap;
	perf_kmem.mmap2		= perf_event__process_mmap2;
	perf_kmem.namespaces	= perf_event__process_namespaces;
...

In pipe mode (e.g., perf kmem record -o - | perf kmem stat -i -), perf.data
headers are missing and events must be processed sequentially. Omitting the
perf_event__process_attr callback causes perf_event_header_attr events to be
silently discarded by the default stub.

Does this prevent the creation of evsels and the initialization of the
perf_env, completely breaking the tool's ability to process piped data
streams natively?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260605121515.1725549-1-acme@kernel.org?part=2

next prev parent reply	other threads:[~2026-06-05 14:30 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-05 12:15 [PATCHES v1 0/5] perf tools: Fix OOB reads, reference leaks, and overflow in sched/script/auxtrace Arnaldo Carvalho de Melo
2026-06-05 12:15 ` [PATCH 1/5] perf tools: Guard remaining test_bit calls from OOB sample CPU Arnaldo Carvalho de Melo
2026-06-05 12:31   ` sashiko-bot
2026-06-05 12:15 ` [PATCH 2/5] perf tools: Add bounds check to cpu__get_node() Arnaldo Carvalho de Melo
2026-06-05 14:30   ` sashiko-bot [this message]
2026-06-05 14:45     ` Arnaldo Carvalho de Melo
2026-06-05 12:15 ` [PATCH 3/5] perf sched: Fix thread reference leaks in timehist_get_thread() Arnaldo Carvalho de Melo
2026-06-05 12:35   ` sashiko-bot
2026-06-05 12:15 ` [PATCH 4/5] perf sched: Cap max_cpu at MAX_CPUS in timehist sample processing Arnaldo Carvalho de Melo
2026-06-05 12:43   ` sashiko-bot
2026-06-05 14:34   ` David Ahern
2026-06-05 15:01     ` Arnaldo Carvalho de Melo
2026-06-05 12:15 ` [PATCH 5/5] perf sched: Fix register_pid() overflow, strcpy, and BUG_ON Arnaldo Carvalho de Melo
2026-06-05 12:29   ` sashiko-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260605143000.DE0CD1F00893@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=acme@kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox