Linux Perf Users
 help / color / mirror / Atom feed
* [PATCHES v1 0/5] perf tools: Fix OOB reads, reference leaks, and overflow in sched/script/auxtrace
@ 2026-06-05 12:15 Arnaldo Carvalho de Melo
  2026-06-05 12:15 ` [PATCH 1/5] perf tools: Guard remaining test_bit calls from OOB sample CPU Arnaldo Carvalho de Melo
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-06-05 12:15 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo

Hi,

Five more pre-existing bugs found by sashiko-bot during AI-assisted
review of the perf-data-validation hardening series.  All are
independent of that series -- they are latent bugs in surrounding code
exposed during review.

  1. test_bit(cpu, cpu_bitmap) reads out of bounds in
     auxtrace.c:filter_cpu() and builtin-script.c:filter_cpu() when
     the CPU value is negative or >= MAX_NR_CPUS.  Same class of bug
     fixed in the previous series for annotate, diff, report, and
     sched.

  2. cpu__get_node() in cpumap.c indexes cpunode_map[] without bounds
     checking against max_cpu_num.  Callers like builtin-kmem pass
     untrusted sample->cpu from perf.data.

  3. Thread reference leaks in perf sched timehist_get_thread() --
     two error paths and the success path in the idle_hist block
     fail to release thread references acquired via
     machine__findnew_thread() and get_idle_thread().

  4. sched->max_cpu updated from sample->cpu without bounds checking
     in perf_timehist__process_sample().  Later code uses max_cpu + 1
     as iteration count over arrays allocated with MAX_CPUS entries.
     Also caps the env->nr_cpus_online initialization.

  5. register_pid() in perf sched replay has integer overflow on
     32-bit (pid * sizeof wraps), strcpy into fixed 20-byte buffer
     without length check, BUG_ON on allocation failure, and unsafe
     realloc pattern that leaks on failure.

All require crafted or unusual perf.data inputs to trigger.
Verified with gcc and clang builds, checkpatch, and perf test.

Arnaldo Carvalho de Melo (5):
  perf tools: Guard remaining test_bit calls from OOB sample CPU
  perf tools: Add bounds check to cpu__get_node()
  perf sched: Fix thread reference leaks in timehist_get_thread()
  perf sched: Cap max_cpu at MAX_CPUS in timehist sample processing
  perf sched: Fix register_pid() overflow, strcpy, and BUG_ON

 tools/perf/builtin-sched.c  | 56 +++++++++++++++++++++++++++++++++------------
 tools/perf/builtin-script.c |  2 +-
 tools/perf/util/auxtrace.c  |  3 ++-
 tools/perf/util/cpumap.c    |  4 ++++
 4 files changed, 49 insertions(+), 16 deletions(-)

Developed with AI assistance (Claude/sashiko), tagged in commits.

Thanks,

- Arnaldo

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/5] perf tools: Guard remaining test_bit calls from OOB sample CPU
  2026-06-05 12:15 [PATCHES v1 0/5] perf tools: Fix OOB reads, reference leaks, and overflow in sched/script/auxtrace Arnaldo Carvalho de Melo
@ 2026-06-05 12:15 ` Arnaldo Carvalho de Melo
  2026-06-05 12:31   ` sashiko-bot
  2026-06-05 12:15 ` [PATCH 2/5] perf tools: Add bounds check to cpu__get_node() Arnaldo Carvalho de Melo
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-06-05 12:15 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot, Anton Blanchard,
	Claude Opus 4.6

From: Arnaldo Carvalho de Melo <acme@redhat.com>

auxtrace.c:filter_cpu() and builtin-script.c:filter_cpu() call
test_bit(cpu, cpu_bitmap) where cpu_bitmap is declared with
MAX_NR_CPUS bits.  When the CPU value from a perf.data event is
corrupt or absent (e.g. negative or >= MAX_NR_CPUS), test_bit reads
out of bounds.

Add bounds checks before test_bit(): >= 0 for the int16_t cpu.cpu in
auxtrace (which also covers the -1 sentinel), and < MAX_NR_CPUS for
both sites.  Matches the pattern applied in the previous series for
builtin-annotate.c, builtin-diff.c, builtin-report.c, and
builtin-sched.c.

Fixes: 644e0840ad46 ("perf auxtrace: Add CPU filter support")
Fixes: 5d67be97f890 ("perf report/annotate/script: Add option to specify a CPU range")
Reported-by: sashiko-bot <sashiko-bot@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-script.c | 2 +-
 tools/perf/util/auxtrace.c  | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index f4aa255fc3297f90..9ac29bdc3cd547e6 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -2646,7 +2646,7 @@ static int cleanup_scripting(void)
 
 static bool filter_cpu(struct perf_sample *sample)
 {
-	if (cpu_list && sample->cpu != (u32)-1)
+	if (cpu_list && sample->cpu != (u32)-1 && sample->cpu < MAX_NR_CPUS)
 		return !test_bit(sample->cpu, cpu_bitmap);
 	return false;
 }
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index 5f4aa1701aef649a..4cd2caf5401522ca 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -372,7 +372,8 @@ static bool filter_cpu(struct perf_session *session, struct perf_cpu cpu)
 {
 	unsigned long *cpu_bitmap = session->itrace_synth_opts->cpu_bitmap;
 
-	return cpu_bitmap && cpu.cpu != -1 && !test_bit(cpu.cpu, cpu_bitmap);
+	return cpu_bitmap && cpu.cpu >= 0 && cpu.cpu < MAX_NR_CPUS &&
+	       !test_bit(cpu.cpu, cpu_bitmap);
 }
 
 static int auxtrace_queues__add_buffer(struct auxtrace_queues *queues,
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/5] perf tools: Add bounds check to cpu__get_node()
  2026-06-05 12:15 [PATCHES v1 0/5] perf tools: Fix OOB reads, reference leaks, and overflow in sched/script/auxtrace Arnaldo Carvalho de Melo
  2026-06-05 12:15 ` [PATCH 1/5] perf tools: Guard remaining test_bit calls from OOB sample CPU Arnaldo Carvalho de Melo
@ 2026-06-05 12:15 ` Arnaldo Carvalho de Melo
  2026-06-05 14:30   ` sashiko-bot
  2026-06-05 12:15 ` [PATCH 3/5] perf sched: Fix thread reference leaks in timehist_get_thread() Arnaldo Carvalho de Melo
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-06-05 12:15 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot, Claude Opus 4.6

From: Arnaldo Carvalho de Melo <acme@redhat.com>

cpu__get_node() accesses cpunode_map[cpu.cpu] without checking against
max_cpu_num, the allocation size of cpunode_map.  Callers such as
builtin-kmem.c:evsel__process_alloc_event() pass sample->cpu from
perf.data events, which may exceed the host's CPU count when analyzing
cross-machine recordings.

Add a bounds check against max_cpu_num before indexing, returning -1
for out-of-range values.  This is a central fix that protects all
callers.

Fixes: 86895b480a2f ("perf stat: Add --per-node agregation support")
Reported-by: sashiko-bot <sashiko-bot@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/cpumap.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index b1e5c29c6e3ec8df..d3432622b2adc994 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -576,6 +576,10 @@ int cpu__get_node(struct perf_cpu cpu)
 		return -1;
 	}
 
+	/* cpunode_map allocated for max_cpu_num entries; input may be untrusted */
+	if (cpu.cpu < 0 || cpu.cpu >= max_cpu_num.cpu)
+		return -1;
+
 	return cpunode_map[cpu.cpu];
 }
 
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/5] perf sched: Fix thread reference leaks in timehist_get_thread()
  2026-06-05 12:15 [PATCHES v1 0/5] perf tools: Fix OOB reads, reference leaks, and overflow in sched/script/auxtrace Arnaldo Carvalho de Melo
  2026-06-05 12:15 ` [PATCH 1/5] perf tools: Guard remaining test_bit calls from OOB sample CPU Arnaldo Carvalho de Melo
  2026-06-05 12:15 ` [PATCH 2/5] perf tools: Add bounds check to cpu__get_node() Arnaldo Carvalho de Melo
@ 2026-06-05 12:15 ` Arnaldo Carvalho de Melo
  2026-06-05 12:35   ` sashiko-bot
  2026-06-05 12:15 ` [PATCH 4/5] perf sched: Cap max_cpu at MAX_CPUS in timehist sample processing Arnaldo Carvalho de Melo
  2026-06-05 12:15 ` [PATCH 5/5] perf sched: Fix register_pid() overflow, strcpy, and BUG_ON Arnaldo Carvalho de Melo
  4 siblings, 1 reply; 14+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-06-05 12:15 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot, Claude Opus 4.6

From: Arnaldo Carvalho de Melo <acme@redhat.com>

timehist_get_thread() acquires a thread reference via
machine__findnew_thread() and an idle thread reference via
get_idle_thread() (which calls thread__get()).  Two error paths in
the idle_hist block return NULL without releasing these references:

 - When get_idle_thread() fails, the thread reference leaks.
 - When thread__priv(idle) returns NULL, both idle and thread leak.

Additionally, the idle thread reference acquired on the success path
is never released, leaking a reference on every sample when
--idle-hist is active.

Add thread__put() calls on both error paths and release the idle
reference after use on the success path.

Fixes: 5d8f17fb5822 ("perf sched timehist: Add -I/--idle-hist option")
Reported-by: sashiko-bot <sashiko-bot@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-sched.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 4aa7833cae6e36b8..7bd61028327b39db 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -2546,12 +2546,16 @@ static struct thread *timehist_get_thread(struct perf_sched *sched,
 			idle = get_idle_thread(sample->cpu);
 			if (idle == NULL) {
 				pr_err("Failed to get idle thread for cpu %d.\n", sample->cpu);
+				thread__put(thread);
 				return NULL;
 			}
 
 			itr = thread__priv(idle);
-			if (itr == NULL)
+			if (itr == NULL) {
+				thread__put(idle);
+				thread__put(thread);
 				return NULL;
+			}
 
 			thread__put(itr->last_thread);
 			itr->last_thread = thread__get(thread);
@@ -2559,6 +2563,8 @@ static struct thread *timehist_get_thread(struct perf_sched *sched,
 			/* copy task callchain when entering to idle */
 			if (perf_sample__intval(sample, "next_pid") == 0)
 				save_idle_callchain(sched, itr, sample);
+
+			thread__put(idle);
 		}
 	}
 
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 4/5] perf sched: Cap max_cpu at MAX_CPUS in timehist sample processing
  2026-06-05 12:15 [PATCHES v1 0/5] perf tools: Fix OOB reads, reference leaks, and overflow in sched/script/auxtrace Arnaldo Carvalho de Melo
                   ` (2 preceding siblings ...)
  2026-06-05 12:15 ` [PATCH 3/5] perf sched: Fix thread reference leaks in timehist_get_thread() Arnaldo Carvalho de Melo
@ 2026-06-05 12:15 ` Arnaldo Carvalho de Melo
  2026-06-05 12:43   ` sashiko-bot
  2026-06-05 14:34   ` David Ahern
  2026-06-05 12:15 ` [PATCH 5/5] perf sched: Fix register_pid() overflow, strcpy, and BUG_ON Arnaldo Carvalho de Melo
  4 siblings, 2 replies; 14+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-06-05 12:15 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot, David Ahern,
	Claude Opus 4.6

From: Arnaldo Carvalho de Melo <acme@redhat.com>

perf_timehist__process_sample() updates sched->max_cpu from the
sample CPU without bounds checking.  Later code uses max_cpu + 1 as
an iteration count over arrays allocated with MAX_CPUS entries
(curr_thread, cpu_last_switched).  A recording with CPU IDs >= MAX_CPUS
causes out-of-bounds array accesses.

Also cap the env->nr_cpus_online initialization of max_cpu in
perf_sched__timehist(), which could exceed MAX_CPUS on very large
systems.

Add bounds checks before both max_cpu updates, matching the pattern
already used in map_switch_event().

Fixes: 49394a2a24c7 ("perf sched timehist: Introduce timehist command")
Reported-by: sashiko-bot <sashiko-bot@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-sched.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 7bd61028327b39db..87a1f4cf8760e1e9 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -3215,7 +3215,9 @@ static int perf_timehist__process_sample(const struct perf_tool *tool,
 		.cpu = sample->cpu,
 	};
 
-	if (this_cpu.cpu > sched->max_cpu.cpu)
+	/* max_cpu indexes arrays allocated with MAX_CPUS entries */
+	if (this_cpu.cpu >= 0 && this_cpu.cpu < MAX_CPUS &&
+	    this_cpu.cpu > sched->max_cpu.cpu)
 		sched->max_cpu = this_cpu;
 
 	if (evsel->handler != NULL) {
@@ -3385,8 +3387,8 @@ static int perf_sched__timehist(struct perf_sched *sched)
 		perf_session__set_tracepoints_handlers(session, migrate_handlers))
 		goto out;
 
-	/* pre-allocate struct for per-CPU idle stats */
-	sched->max_cpu.cpu = env->nr_cpus_online;
+	/* pre-allocate struct for per-CPU idle stats; cap to array bounds */
+	sched->max_cpu.cpu = min(env->nr_cpus_online, MAX_CPUS);
 	if (sched->max_cpu.cpu == 0)
 		sched->max_cpu.cpu = 4;
 	if (init_idle_threads(sched->max_cpu.cpu))
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 5/5] perf sched: Fix register_pid() overflow, strcpy, and BUG_ON
  2026-06-05 12:15 [PATCHES v1 0/5] perf tools: Fix OOB reads, reference leaks, and overflow in sched/script/auxtrace Arnaldo Carvalho de Melo
                   ` (3 preceding siblings ...)
  2026-06-05 12:15 ` [PATCH 4/5] perf sched: Cap max_cpu at MAX_CPUS in timehist sample processing Arnaldo Carvalho de Melo
@ 2026-06-05 12:15 ` Arnaldo Carvalho de Melo
  2026-06-05 12:29   ` sashiko-bot
  4 siblings, 1 reply; 14+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-06-05 12:15 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot, Ingo Molnar,
	Claude Opus 4.6

From: Arnaldo Carvalho de Melo <acme@redhat.com>

register_pid() has several issues when processing untrusted perf.data:

1. Integer overflow: (pid + 1) * sizeof(struct task_desc *) can wrap
   to a small value on 32-bit systems when pid is large (e.g.
   0x40000000), causing realloc to return a tiny buffer followed by
   out-of-bounds writes in the initialization loop.

2. Heap buffer overflow: strcpy(task->comm, comm) copies the
   untrusted comm string into a fixed 20-byte COMM_LEN buffer with
   no length check.

3. BUG_ON on allocation failure: perf.data is untrusted input, so
   allocation failures should be handled gracefully rather than
   killing the process.

4. Realloc of sched->tasks assigned directly back, leaking the old
   pointer on failure; nr_tasks incremented before the realloc,
   leaving corrupted state on failure.

Cap pid at PID_MAX_LIMIT (4194304, matching the kernel's maximum
on 64-bit), replace strcpy with strlcpy, guard against NULL comm,
replace BUG_ON with NULL returns using safe realloc patterns, and
add NULL checks in callers that dereference the result.

Fixes: ec156764d424 ("perf sched: Import schedbench.c")
Reported-by: sashiko-bot <sashiko-bot@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-sched.c | 40 ++++++++++++++++++++++++++++----------
 1 file changed, 30 insertions(+), 10 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 87a1f4cf8760e1e9..21fb820b625b43e1 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -55,6 +55,7 @@
 #define COMM_LEN		20
 #define SYM_LEN			129
 #define MAX_PID			1024000
+#define PID_MAX_LIMIT		4194304 /* kernel limit on 64-bit */
 #define MAX_PRIO		140
 #define SEP_LEN			100
 
@@ -448,17 +449,28 @@ static void add_sched_event_sleep(struct perf_sched *sched, struct task_desc *ta
 static struct task_desc *register_pid(struct perf_sched *sched,
 				      unsigned long pid, const char *comm)
 {
-	struct task_desc *task;
+	struct task_desc *task, **tasks_p;
 	static int pid_max;
 
+	/* perf.data is untrusted — cap pid to prevent overflow in size calculations */
+	if (pid >= PID_MAX_LIMIT) {
+		pr_err("pid %lu exceeds limit %d, skipping\n", pid, PID_MAX_LIMIT);
+		return NULL;
+	}
+
 	if (sched->pid_to_task == NULL) {
 		if (sysctl__read_int("kernel/pid_max", &pid_max) < 0)
 			pid_max = MAX_PID;
-		BUG_ON((sched->pid_to_task = calloc(pid_max, sizeof(struct task_desc *))) == NULL);
+		sched->pid_to_task = calloc(pid_max, sizeof(struct task_desc *));
+		if (sched->pid_to_task == NULL)
+			return NULL;
 	}
 	if (pid >= (unsigned long)pid_max) {
-		BUG_ON((sched->pid_to_task = realloc(sched->pid_to_task, (pid + 1) *
-			sizeof(struct task_desc *))) == NULL);
+		void *p = realloc(sched->pid_to_task, (pid + 1) * sizeof(struct task_desc *));
+
+		if (p == NULL)
+			return NULL;
+		sched->pid_to_task = p;
 		while (pid >= (unsigned long)pid_max)
 			sched->pid_to_task[pid_max++] = NULL;
 	}
@@ -469,9 +481,11 @@ static struct task_desc *register_pid(struct perf_sched *sched,
 		return task;
 
 	task = zalloc(sizeof(*task));
+	if (task == NULL)
+		return NULL;
 	task->pid = pid;
-	task->nr = sched->nr_tasks;
-	strcpy(task->comm, comm);
+	if (comm)
+		strlcpy(task->comm, comm, sizeof(task->comm));
 	/*
 	 * every task starts in sleeping state - this gets ignored
 	 * if there's no wakeup pointing to this sleep state:
@@ -479,10 +493,12 @@ static struct task_desc *register_pid(struct perf_sched *sched,
 	add_sched_event_sleep(sched, task, 0);
 
 	sched->pid_to_task[pid] = task;
-	sched->nr_tasks++;
-	sched->tasks = realloc(sched->tasks, sched->nr_tasks * sizeof(struct task_desc *));
-	BUG_ON(!sched->tasks);
-	sched->tasks[task->nr] = task;
+	tasks_p = realloc(sched->tasks, (sched->nr_tasks + 1) * sizeof(struct task_desc *));
+	if (!tasks_p)
+		return NULL;
+	sched->tasks = tasks_p;
+	sched->tasks[sched->nr_tasks] = task;
+	task->nr = sched->nr_tasks++;
 
 	if (verbose > 0)
 		printf("registered task #%ld, PID %ld (%s)\n", sched->nr_tasks, pid, comm);
@@ -841,6 +857,8 @@ replay_wakeup_event(struct perf_sched *sched,
 
 	waker = register_pid(sched, sample->tid, "<unknown>");
 	wakee = register_pid(sched, pid, comm);
+	if (waker == NULL || wakee == NULL)
+		return -1;
 
 	add_sched_event_wakeup(sched, waker, sample->time, wakee);
 	return 0;
@@ -881,6 +899,8 @@ static int replay_switch_event(struct perf_sched *sched,
 
 	prev = register_pid(sched, prev_pid, prev_comm);
 	next = register_pid(sched, next_pid, next_comm);
+	if (prev == NULL || next == NULL)
+		return -1;
 
 	sched->cpu_last_switched[cpu] = timestamp;
 
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 5/5] perf sched: Fix register_pid() overflow, strcpy, and BUG_ON
  2026-06-05 12:15 ` [PATCH 5/5] perf sched: Fix register_pid() overflow, strcpy, and BUG_ON Arnaldo Carvalho de Melo
@ 2026-06-05 12:29   ` sashiko-bot
  0 siblings, 0 replies; 14+ messages in thread
From: sashiko-bot @ 2026-06-05 12:29 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: linux-perf-users

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:

New issues:
- [High] Incomplete error rollback in register_pid() leads to state corruption, memory leaks, and deadlocks during replay.

Pre-existing issues:
- [High] Unhandled memory allocation failures and reachable BUG_ON() in get_new_event() and add_sched_event_wakeup() cause crashes on untrusted input.
--

commit ff8db8ecbe0e2e8aa902db3edce9b5b6e8db02f4
Author: Arnaldo Carvalho de Melo <acme@redhat.com>
Subject: perf sched: Fix register_pid() overflow, strcpy, and BUG_ON

This commit aims to fix several issues in register_pid() when processing
untrusted perf.data, including integer/heap overflows and BUG_ON calls on
allocation failures. It caps the PID, uses safe string copying, and handles
reallocation failures gracefully.

> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
> index 87a1f4cf8760e..21fb820b625b4 100644
> --- a/tools/perf/builtin-sched.c
> +++ b/tools/perf/builtin-sched.c

[ ... ]

> @@ -469,9 +481,11 @@ static struct task_desc *register_pid(struct perf_sched *sched,
>  		return task;
>  
>  	task = zalloc(sizeof(*task));
> +	if (task == NULL)
> +		return NULL;
>  	task->pid = pid;
> -	task->nr = sched->nr_tasks;
> -	strcpy(task->comm, comm);
> +	if (comm)
> +		strlcpy(task->comm, comm, sizeof(task->comm));
>  	/*
>  	 * every task starts in sleeping state - this gets ignored
>  	 * if there's no wakeup pointing to this sleep state:
>  	 */
>  	add_sched_event_sleep(sched, task, 0);

[Severity: High]
This is a pre-existing issue, but since the patch aims to prevent crashes on
memory exhaustion when parsing untrusted data, does this still leave a path
to crash?

register_pid() unconditionally calls add_sched_event_sleep(), which calls
get_new_event(). Inside get_new_event(), the result of zalloc() is used
without a NULL check, leading to an immediate dereference:

get_new_event()
{
        ...
        struct sched_atom *event = zalloc(sizeof(*event));
        ...
        event->timestamp = timestamp;
        ...
        BUG_ON(!task->atoms);
}

There is also a remaining BUG_ON() upon realloc() failure.

Similarly, add_sched_event_wakeup() allocates wait_sem and passes it
unchecked to sem_init(), causing another crash if zalloc() fails:

add_sched_event_wakeup()
{
        ...
        wakee_event->wait_sem = zalloc(sizeof(*wakee_event->wait_sem));
        sem_init(wakee_event->wait_sem, 0, 0);
}

>  
>  	sched->pid_to_task[pid] = task;
> -	sched->nr_tasks++;
> -	sched->tasks = realloc(sched->tasks, sched->nr_tasks * sizeof(struct task_desc *));
> -	BUG_ON(!sched->tasks);
> -	sched->tasks[task->nr] = task;
> +	tasks_p = realloc(sched->tasks, (sched->nr_tasks + 1) * sizeof(struct task_desc *));
> +	if (!tasks_p)
> +		return NULL;

[Severity: High]
If realloc() fails here, does this leak the newly allocated task and leave
a corrupted state?

The task has already been assigned to sched->pid_to_task[pid] above. If we
return NULL, the caller gets an error, but the partially initialized task
remains in the pid_to_task array and is never added to sched->tasks.

If a caller like replay_fork_event() ignores the NULL return value:

replay_fork_event()
{
        ...
        register_pid(sched, thread__tid(parent), thread__comm_str(parent));
        register_pid(sched, thread__tid(child), thread__comm_str(child));
}

Subsequent events for the same PID will find the partially initialized task
in pid_to_task and successfully add events to it. Because the task is not in
sched->tasks, perf sched replay will fail to spawn a thread for it in
create_tasks(). When other simulated threads expect to be awoken by this
missing thread, they might wait indefinitely on a semaphore, deadlocking the
tool.

> +	sched->tasks = tasks_p;
> +	sched->tasks[sched->nr_tasks] = task;
> +	task->nr = sched->nr_tasks++;
>  
>  	if (verbose > 0)
>  		printf("registered task #%ld, PID %ld (%s)\n", sched->nr_tasks, pid, comm);

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260605121515.1725549-1-acme@kernel.org?part=5

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/5] perf tools: Guard remaining test_bit calls from OOB sample CPU
  2026-06-05 12:15 ` [PATCH 1/5] perf tools: Guard remaining test_bit calls from OOB sample CPU Arnaldo Carvalho de Melo
@ 2026-06-05 12:31   ` sashiko-bot
  0 siblings, 0 replies; 14+ messages in thread
From: sashiko-bot @ 2026-06-05 12:31 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: linux-perf-users

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Medium] Logic error in CPU filtering bounds checks allows out-of-bounds CPU IDs to bypass the filter.
--

commit c7a6257406af3fb4587a3e6f3454c67efbc1b249
Author: Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Guard remaining test_bit calls from OOB sample CPU

This commit adds bounds checking to filter_cpu in auxtrace.c and
builtin-script.c to prevent out-of-bounds reads in test_bit when
the CPU value is corrupt or absent.

> diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
> index f4aa255fc3297..9ac29bdc3cd54 100644
> --- a/tools/perf/builtin-script.c
> +++ b/tools/perf/builtin-script.c
> @@ -2646,7 +2646,7 @@ static int cleanup_scripting(void)
>  
>  static bool filter_cpu(struct perf_sample *sample)
>  {
> -	if (cpu_list && sample->cpu != (u32)-1)
> +	if (cpu_list && sample->cpu != (u32)-1 && sample->cpu < MAX_NR_CPUS)
>  		return !test_bit(sample->cpu, cpu_bitmap);
>  	return false;
>  }

[Severity: Medium]
When sample->cpu is greater than or equal to MAX_NR_CPUS, the condition fails
and the function returns false. Does returning false here mean the out-of-bounds
sample is kept rather than dropped?

> diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
> index 5f4aa1701aef6..4cd2caf540152 100644
> --- a/tools/perf/util/auxtrace.c
> +++ b/tools/perf/util/auxtrace.c
> @@ -372,7 +372,8 @@ static bool filter_cpu(struct perf_session *session, struct perf_cpu cpu)
>  {
>  	unsigned long *cpu_bitmap = session->itrace_synth_opts->cpu_bitmap;
>  
> -	return cpu_bitmap && cpu.cpu != -1 && !test_bit(cpu.cpu, cpu_bitmap);
> +	return cpu_bitmap && cpu.cpu >= 0 && cpu.cpu < MAX_NR_CPUS &&
> +	       !test_bit(cpu.cpu, cpu_bitmap);
>  }

[Severity: Medium]
Similarly in auxtrace.c, if cpu.cpu is greater than or equal to MAX_NR_CPUS,
this boolean expression evaluates to false.

In auxtrace_queues__add_buffer, if filter_cpu returns false, the buffer is kept
and queued:

	if (filter_cpu(session, buffer->cpu))
		return 0;

	buffer = memdup(buffer, sizeof(*buffer));

Could this cause corrupted events to bypass the filter and be processed
downstream?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260605121515.1725549-1-acme@kernel.org?part=1

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/5] perf sched: Fix thread reference leaks in timehist_get_thread()
  2026-06-05 12:15 ` [PATCH 3/5] perf sched: Fix thread reference leaks in timehist_get_thread() Arnaldo Carvalho de Melo
@ 2026-06-05 12:35   ` sashiko-bot
  0 siblings, 0 replies; 14+ messages in thread
From: sashiko-bot @ 2026-06-05 12:35 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: linux-perf-users

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:

Pre-existing issues:
- [High] Unbounded reference leak of struct thread in timehist_sched_change_event
- [High] Missing .attr and .feature callbacks break perf sched in pipe mode
--

commit ff77c452c7d2cf0ab5fbb15792dac9ff2587a7cd
Author: Arnaldo Carvalho de Melo <acme@redhat.com>

perf sched: Fix thread reference leaks in timehist_get_thread()

timehist_get_thread() acquires a thread reference via
machine__findnew_thread() and an idle thread reference via
get_idle_thread() (which calls thread__get()).  Two error paths in
the idle_hist block return NULL without releasing these references.

> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
> index 4aa7833cae6e3..7bd61028327b3 100644
> --- a/tools/perf/builtin-sched.c
> +++ b/tools/perf/builtin-sched.c
> @@ -2546,12 +2546,16 @@ static struct thread *timehist_get_thread(struct perf_sched *sched,
>  			idle = get_idle_thread(sample->cpu);
>  			if (idle == NULL) {
>  				pr_err("Failed to get idle thread for cpu %d.\n", sample->cpu);
> +				thread__put(thread);
>  				return NULL;
>  			}
>  
>  			itr = thread__priv(idle);
> -			if (itr == NULL)
> +			if (itr == NULL) {
> +				thread__put(idle);
> +				thread__put(thread);
>  				return NULL;
> +			}
>  
>  			thread__put(itr->last_thread);
>  			itr->last_thread = thread__get(thread);

[Severity: High]
This isn't a bug introduced by this patch, but since itr->last_thread is now
properly reference counted here, I noticed it is being cleared in
timehist_sched_change_event() without a thread__put():

tools/perf/builtin-sched.c:timehist_sched_change_event() {
    ...
            if (itr->cursor.nr)
                callchain_append(&itr->callchain, &itr->cursor, t - tprev);

            itr->last_thread = NULL;
        }

        if (!sched->summary_only)
            timehist_print_sample(sched, sample, &al, thread, t, state);
    ...
}

Does this permanently leak the thread reference on every idle-to-busy
transition when --idle-hist is active?


[Severity: High]
This is a pre-existing issue, but while reviewing the sched tool
initialization, I noticed that the .feature callback is omitted for both
perf_sched__timehist and cmd_sched, and .attr is additionally omitted for
cmd_sched:

tools/perf/builtin-sched.c:cmd_sched() {
    ...
    perf_tool__init(&sched.tool, /*ordered_events=*/true);
    sched.tool.sample    = perf_sched__process_tracepoint_sample;
    sched.tool.comm      = perf_sched__process_comm;
    sched.tool.namespaces = perf_event__process_namespaces;
    sched.tool.lost      = perf_event__process_lost;
    sched.tool.fork      = perf_sched__process_fork_event;
    ...
}

tools/perf/builtin-sched.c:perf_sched__timehist() {
    ...
    sched->tool.sample   = perf_timehist__process_sample;
    sched->tool.mmap     = perf_event__process_mmap;
    sched->tool.mmap2    = perf_event__process_mmap2;
    ...
    sched->tool.attr     = perf_event__process_attr;
    ...
}

In pipe mode, does dropping .attr prevent the creation of evlists entirely,
and does dropping .feature prevent proper initialization of perf_env,
breaking the event processing pipeline?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260605121515.1725549-1-acme@kernel.org?part=3

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/5] perf sched: Cap max_cpu at MAX_CPUS in timehist sample processing
  2026-06-05 12:15 ` [PATCH 4/5] perf sched: Cap max_cpu at MAX_CPUS in timehist sample processing Arnaldo Carvalho de Melo
@ 2026-06-05 12:43   ` sashiko-bot
  2026-06-05 14:34   ` David Ahern
  1 sibling, 0 replies; 14+ messages in thread
From: sashiko-bot @ 2026-06-05 12:43 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: linux-perf-users

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:

Pre-existing issues:
- [High] perf_sched__timehist omits the .feature tool callback, breaking perf_env population in pipe mode.
- [High] Leaked callchain nodes and violation of reference counting rules during idle threads teardown.
--

commit fa7c274dda31324797cc9c2df04b0e7a2f0a40d2
Author: Arnaldo Carvalho de Melo <acme@redhat.com>

perf sched: Cap max_cpu at MAX_CPUS in timehist sample processing

This commit adds bounds checking to sched->max_cpu updates from the
sample CPU. It caps env->nr_cpus_online initialization of max_cpu to
MAX_CPUS to prevent out-of-bounds array accesses on large systems or
recordings with high CPU IDs.

> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
> index 7bd61028327b3..87a1f4cf8760e 100644
> --- a/tools/perf/builtin-sched.c
> +++ b/tools/perf/builtin-sched.c

[ ... ]

> @@ -3385,8 +3387,8 @@ static int perf_sched__timehist(struct perf_sched *sched)
>  		perf_session__set_tracepoints_handlers(session, migrate_handlers))
>  		goto out;
>  
> -	/* pre-allocate struct for per-CPU idle stats */
> -	sched->max_cpu.cpu = env->nr_cpus_online;
> +	/* pre-allocate struct for per-CPU idle stats; cap to array bounds */
> +	sched->max_cpu.cpu = min(env->nr_cpus_online, MAX_CPUS);

[Severity: High]
This is a pre-existing issue, but does perf_sched__timehist() correctly
populate perf_env in pipe mode?

Looking at perf_sched__timehist() in tools/perf/builtin-sched.c, the tool
callbacks are initialized without a .feature callback:

    sched->tool.sample       = perf_timehist__process_sample;
    sched->tool.mmap         = perf_event__process_mmap;
    ...
    sched->tool.build_id     = perf_event__process_build_id;

Without registering perf_event__process_feature, piped perf.data streams
might silently drop non-seekable PERF_RECORD_HEADER_FEATURE events.

Does this leave perf_env uninitialized and break environment metadata lookups
like the env->nr_cpus_online field accessed here?

[Severity: High]
This is another pre-existing issue, but is there a reference counting leak
during teardown in free_idle_threads()?

When cleaning up the idle threads array created below by init_idle_threads(),
the cleanup code uses the internal destructor instead of the lifecycle helper:

    free_idle_threads() {
        ...
        thread__delete(idle);
    }

Since reference handles allocated via thread__new() must be strictly paired
with thread__put(), does calling thread__delete() directly bypass the
reference wrapper lifecycle and leak the pointer handle, triggering
REFCNT_CHECKING traps? 

Also, does destroying the thread without proper cleanup leak the dynamically
allocated nodes inside itr->callchain?

>  	if (sched->max_cpu.cpu == 0)
>  		sched->max_cpu.cpu = 4;
>  	if (init_idle_threads(sched->max_cpu.cpu))

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260605121515.1725549-1-acme@kernel.org?part=4

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/5] perf tools: Add bounds check to cpu__get_node()
  2026-06-05 12:15 ` [PATCH 2/5] perf tools: Add bounds check to cpu__get_node() Arnaldo Carvalho de Melo
@ 2026-06-05 14:30   ` sashiko-bot
  2026-06-05 14:45     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 14+ messages in thread
From: sashiko-bot @ 2026-06-05 14:30 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: linux-perf-users

Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:

Pre-existing issues:
- [High] `builtin-kmem` omits the `.attr` callback in its `perf_tool`, which breaks processing of piped streams natively.
- [High] Integer truncation in `set_max_cpu_num()` combined with missing bounds checks in `cpu__setup_cpunode_map()` causes heap out-of-bounds writes on systems with large sparse CPU topologies.
- [High] `perf_mmap__aio_bind` triggers an out-of-bounds memory write and subsequent crash when a CPU's NUMA node is unknown or mapped to a dummy CPU.
--

commit 84fade77fbbcea06a93ffe9275b1abcc29bbc6bc
Author: Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Add bounds check to cpu__get_node()

The patch adds a bounds check against max_cpu_num in cpu__get_node() to
protect against untrusted or out-of-bounds CPU index values before accessing
cpunode_map. This prevents out-of-bounds array indexing, particularly when
analyzing cross-machine perf recordings where the recorded CPU count may
exceed the host's CPU count.

> diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
> index b1e5c29c6e3ec..d3432622b2adc 100644
> --- a/tools/perf/util/cpumap.c
> +++ b/tools/perf/util/cpumap.c
> @@ -576,6 +576,10 @@ int cpu__get_node(struct perf_cpu cpu)
>  		return -1;
>  	}
>  
> +	/* cpunode_map allocated for max_cpu_num entries; input may be untrusted */
> +	if (cpu.cpu < 0 || cpu.cpu >= max_cpu_num.cpu)

[Severity: High]
This isn't a bug introduced by this patch, but parsing sysfs to determine
the maximum CPU number introduces a memory corruption vulnerability.

In tools/perf/util/cpumap.c:set_max_cpu_num(), the max possible CPU number
is parsed from sysfs into a local int max. This value is directly assigned to
max_cpu_num.cpu, which is an int16_t. If max exceeds 32,767 (as is possible on
large modern architectures or due to sparse APIC numbering), the assignment
truncates the value, resulting in a small or negative integer.

Later, init_cpunode_map() allocates cpunode_map based on this truncated size,
and cpu__setup_cpunode_map() writes to it without truncation:

tools/perf/util/cpumap.c:cpu__setup_cpunode_map() {
...
		while ((dent2 = readdir(dir2)) != NULL) {
			if (dent2->d_type != DT_LNK || sscanf(dent2->d_name, "cpu%u", &cpu) < 1)
				continue;
			cpunode_map[cpu] = mem;
		}
...

Could this result in an out-of-bounds write on the heap?

> +		return -1;

[Severity: High]
This isn't a bug introduced by this patch, but returning -1 here causes a
crash in perf_mmap__aio_bind() when a CPU's NUMA node is unknown or mapped to
a dummy CPU.

tools/perf/util/mmap.c:perf_mmap__aio_bind() {
...
		node_index = cpu__get_node(cpu);
		node_mask = bitmap_zalloc(node_index + 1);
		if (!node_mask) {
			pr_err("Failed to allocate node mask for mbind: error %m\n");
			return -1;
		}
		__set_bit(node_index, node_mask);
...

When cpu__get_node() returns -1, it is implicitly cast to unsigned long
node_index (ULONG_MAX). bitmap_zalloc(ULONG_MAX + 1) evaluates to
bitmap_zalloc(0) and allocates a minimal buffer.

Does the subsequent __set_bit(ULONG_MAX, node_mask) perform a wild memory
write vastly outside the allocated bitmap?

> +
>  	return cpunode_map[cpu.cpu];
>  }

[Severity: High]
This isn't a bug introduced by this patch, but while reviewing
builtin-kmem.c which is mentioned in the commit message as calling
cpu__get_node(), it appears the perf_tool initialization omits the
.attr callback:

tools/perf/builtin-kmem.c:cmd_kmem() {
...
	perf_tool__init(&perf_kmem, /*ordered_events=*/true);
	perf_kmem.sample	= process_sample_event;
	perf_kmem.comm		= perf_event__process_comm;
	perf_kmem.mmap		= perf_event__process_mmap;
	perf_kmem.mmap2		= perf_event__process_mmap2;
	perf_kmem.namespaces	= perf_event__process_namespaces;
...

In pipe mode (e.g., perf kmem record -o - | perf kmem stat -i -), perf.data
headers are missing and events must be processed sequentially. Omitting the
perf_event__process_attr callback causes perf_event_header_attr events to be
silently discarded by the default stub.

Does this prevent the creation of evsels and the initialization of the
perf_env, completely breaking the tool's ability to process piped data
streams natively?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260605121515.1725549-1-acme@kernel.org?part=2

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/5] perf sched: Cap max_cpu at MAX_CPUS in timehist sample processing
  2026-06-05 12:15 ` [PATCH 4/5] perf sched: Cap max_cpu at MAX_CPUS in timehist sample processing Arnaldo Carvalho de Melo
  2026-06-05 12:43   ` sashiko-bot
@ 2026-06-05 14:34   ` David Ahern
  2026-06-05 15:01     ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 14+ messages in thread
From: David Ahern @ 2026-06-05 14:34 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot, Claude Opus 4.6

On 6/5/26 6:15 AM, Arnaldo Carvalho de Melo wrote:
> From: Arnaldo Carvalho de Melo <acme@redhat.com>
> 
> perf_timehist__process_sample() updates sched->max_cpu from the
> sample CPU without bounds checking.  Later code uses max_cpu + 1 as
> an iteration count over arrays allocated with MAX_CPUS entries
> (curr_thread, cpu_last_switched).  A recording with CPU IDs >= MAX_CPUS
> causes out-of-bounds array accesses.
> 
> Also cap the env->nr_cpus_online initialization of max_cpu in
> perf_sched__timehist(), which could exceed MAX_CPUS on very large
> systems.
> 
> Add bounds checks before both max_cpu updates, matching the pattern
> already used in map_switch_event().
> 
> Fixes: 49394a2a24c7 ("perf sched timehist: Introduce timehist command")
> Reported-by: sashiko-bot <sashiko-bot@kernel.org>
> Cc: David Ahern <dsahern@gmail.com>
> Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> ---
>  tools/perf/builtin-sched.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 

LGTM

Reviewed-by: David Ahern <dsahern@kernel.org>



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/5] perf tools: Add bounds check to cpu__get_node()
  2026-06-05 14:30   ` sashiko-bot
@ 2026-06-05 14:45     ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 14+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-06-05 14:45 UTC (permalink / raw)
  To: sashiko-reviews; +Cc: linux-perf-users

On Fri, Jun 05, 2026 at 02:30:00PM +0000, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
> 
> Pre-existing issues:
> - [High] `builtin-kmem` omits the `.attr` callback in its `perf_tool`, which breaks processing of piped streams natively.
> - [High] Integer truncation in `set_max_cpu_num()` combined with missing bounds checks in `cpu__setup_cpunode_map()` causes heap out-of-bounds writes on systems with large sparse CPU topologies.
> - [High] `perf_mmap__aio_bind` triggers an out-of-bounds memory write and subsequent crash when a CPU's NUMA node is unknown or mapped to a dummy CPU.
> --

Thanks, added all these pre-existing conditions for the following
series,

- Arnaldo
 
> commit 84fade77fbbcea06a93ffe9275b1abcc29bbc6bc
> Author: Arnaldo Carvalho de Melo <acme@redhat.com>
> 
> perf tools: Add bounds check to cpu__get_node()
> 
> The patch adds a bounds check against max_cpu_num in cpu__get_node() to
> protect against untrusted or out-of-bounds CPU index values before accessing
> cpunode_map. This prevents out-of-bounds array indexing, particularly when
> analyzing cross-machine perf recordings where the recorded CPU count may
> exceed the host's CPU count.
> 
> > diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
> > index b1e5c29c6e3ec..d3432622b2adc 100644
> > --- a/tools/perf/util/cpumap.c
> > +++ b/tools/perf/util/cpumap.c
> > @@ -576,6 +576,10 @@ int cpu__get_node(struct perf_cpu cpu)
> >  		return -1;
> >  	}
> >  
> > +	/* cpunode_map allocated for max_cpu_num entries; input may be untrusted */
> > +	if (cpu.cpu < 0 || cpu.cpu >= max_cpu_num.cpu)
> 
> [Severity: High]
> This isn't a bug introduced by this patch, but parsing sysfs to determine
> the maximum CPU number introduces a memory corruption vulnerability.
> 
> In tools/perf/util/cpumap.c:set_max_cpu_num(), the max possible CPU number
> is parsed from sysfs into a local int max. This value is directly assigned to
> max_cpu_num.cpu, which is an int16_t. If max exceeds 32,767 (as is possible on
> large modern architectures or due to sparse APIC numbering), the assignment
> truncates the value, resulting in a small or negative integer.
> 
> Later, init_cpunode_map() allocates cpunode_map based on this truncated size,
> and cpu__setup_cpunode_map() writes to it without truncation:
> 
> tools/perf/util/cpumap.c:cpu__setup_cpunode_map() {
> ...
> 		while ((dent2 = readdir(dir2)) != NULL) {
> 			if (dent2->d_type != DT_LNK || sscanf(dent2->d_name, "cpu%u", &cpu) < 1)
> 				continue;
> 			cpunode_map[cpu] = mem;
> 		}
> ...
> 
> Could this result in an out-of-bounds write on the heap?
> 
> > +		return -1;
> 
> [Severity: High]
> This isn't a bug introduced by this patch, but returning -1 here causes a
> crash in perf_mmap__aio_bind() when a CPU's NUMA node is unknown or mapped to
> a dummy CPU.
> 
> tools/perf/util/mmap.c:perf_mmap__aio_bind() {
> ...
> 		node_index = cpu__get_node(cpu);
> 		node_mask = bitmap_zalloc(node_index + 1);
> 		if (!node_mask) {
> 			pr_err("Failed to allocate node mask for mbind: error %m\n");
> 			return -1;
> 		}
> 		__set_bit(node_index, node_mask);
> ...
> 
> When cpu__get_node() returns -1, it is implicitly cast to unsigned long
> node_index (ULONG_MAX). bitmap_zalloc(ULONG_MAX + 1) evaluates to
> bitmap_zalloc(0) and allocates a minimal buffer.
> 
> Does the subsequent __set_bit(ULONG_MAX, node_mask) perform a wild memory
> write vastly outside the allocated bitmap?
> 
> > +
> >  	return cpunode_map[cpu.cpu];
> >  }
> 
> [Severity: High]
> This isn't a bug introduced by this patch, but while reviewing
> builtin-kmem.c which is mentioned in the commit message as calling
> cpu__get_node(), it appears the perf_tool initialization omits the
> .attr callback:
> 
> tools/perf/builtin-kmem.c:cmd_kmem() {
> ...
> 	perf_tool__init(&perf_kmem, /*ordered_events=*/true);
> 	perf_kmem.sample	= process_sample_event;
> 	perf_kmem.comm		= perf_event__process_comm;
> 	perf_kmem.mmap		= perf_event__process_mmap;
> 	perf_kmem.mmap2		= perf_event__process_mmap2;
> 	perf_kmem.namespaces	= perf_event__process_namespaces;
> ...
> 
> In pipe mode (e.g., perf kmem record -o - | perf kmem stat -i -), perf.data
> headers are missing and events must be processed sequentially. Omitting the
> perf_event__process_attr callback causes perf_event_header_attr events to be
> silently discarded by the default stub.
> 
> Does this prevent the creation of evsels and the initialization of the
> perf_env, completely breaking the tool's ability to process piped data
> streams natively?
> 
> -- 
> Sashiko AI review · https://sashiko.dev/#/patchset/20260605121515.1725549-1-acme@kernel.org?part=2

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/5] perf sched: Cap max_cpu at MAX_CPUS in timehist sample processing
  2026-06-05 14:34   ` David Ahern
@ 2026-06-05 15:01     ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 14+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-06-05 15:01 UTC (permalink / raw)
  To: David Ahern
  Cc: Namhyung Kim, Ingo Molnar, Thomas Gleixner, James Clark,
	Jiri Olsa, Ian Rogers, Adrian Hunter, Clark Williams,
	linux-kernel, linux-perf-users, Arnaldo Carvalho de Melo,
	sashiko-bot, Claude Opus 4.6

On Fri, Jun 05, 2026 at 08:34:53AM -0600, David Ahern wrote:
> On 6/5/26 6:15 AM, Arnaldo Carvalho de Melo wrote:
> > From: Arnaldo Carvalho de Melo <acme@redhat.com>
> > Add bounds checks before both max_cpu updates, matching the pattern
> > already used in map_switch_event().

> > Fixes: 49394a2a24c7 ("perf sched timehist: Introduce timehist command")
> > Reported-by: sashiko-bot <sashiko-bot@kernel.org>
> > Cc: David Ahern <dsahern@gmail.com>
> > Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
> > Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> > ---
> >  tools/perf/builtin-sched.c | 8 +++++---
> >  1 file changed, 5 insertions(+), 3 deletions(-)

> LGTM
 
> Reviewed-by: David Ahern <dsahern@kernel.org>
 
Thanks, added to the cset,

- Arnaldo

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2026-06-05 15:01 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-05 12:15 [PATCHES v1 0/5] perf tools: Fix OOB reads, reference leaks, and overflow in sched/script/auxtrace Arnaldo Carvalho de Melo
2026-06-05 12:15 ` [PATCH 1/5] perf tools: Guard remaining test_bit calls from OOB sample CPU Arnaldo Carvalho de Melo
2026-06-05 12:31   ` sashiko-bot
2026-06-05 12:15 ` [PATCH 2/5] perf tools: Add bounds check to cpu__get_node() Arnaldo Carvalho de Melo
2026-06-05 14:30   ` sashiko-bot
2026-06-05 14:45     ` Arnaldo Carvalho de Melo
2026-06-05 12:15 ` [PATCH 3/5] perf sched: Fix thread reference leaks in timehist_get_thread() Arnaldo Carvalho de Melo
2026-06-05 12:35   ` sashiko-bot
2026-06-05 12:15 ` [PATCH 4/5] perf sched: Cap max_cpu at MAX_CPUS in timehist sample processing Arnaldo Carvalho de Melo
2026-06-05 12:43   ` sashiko-bot
2026-06-05 14:34   ` David Ahern
2026-06-05 15:01     ` Arnaldo Carvalho de Melo
2026-06-05 12:15 ` [PATCH 5/5] perf sched: Fix register_pid() overflow, strcpy, and BUG_ON Arnaldo Carvalho de Melo
2026-06-05 12:29   ` sashiko-bot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox