Linux Perf Users
 help / color / mirror / Atom feed
* [PATCHES v2 0/9] perf tools: Fix OOB writes, refcount bugs, and BUG_ON in mmap/stat/c2c/sched
@ 2026-06-05 23:38 Arnaldo Carvalho de Melo
  2026-06-05 23:38 ` [PATCH 1/9] perf mmap: Guard cpu__get_node() return in aio_bind() Arnaldo Carvalho de Melo
                   ` (8 more replies)
  0 siblings, 9 replies; 15+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-06-05 23:38 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo

Hi,

Nine more pre-existing bugs found by sashiko-bot during AI-assisted
review of the perf-data-validation hardening series.  All are
independent of that series -- they are latent bugs in surrounding code
exposed during review.

  1. perf_mmap__aio_bind() passes cpu__get_node() return value to an
     unsigned long.  When -1 is returned, sign-extension to ULONG_MAX
     causes bitmap_zalloc(0) and a massive OOB __set_bit.

  2. Six perf_env__get_*_aggr_by_cpu() topology aggregation callbacks
     in builtin-stat.c access env->cpu[cpu.cpu] after only checking
     cpu.cpu != -1.  A CPU index from untrusted perf.data exceeding
     env->nr_cpus_avail causes OOB heap reads.

  3. perf c2c: __set_bit on cpuset/nodeset bitmaps without bounds
     checking sample->cpu and node IDs against their allocation sizes.
     Also, cpu2node[] array accessed without upper bound check.

  4. perf c2c: setup_nodes() iterates CPU maps from perf.data topology
     and uses cpu.cpu directly as index into cpu2node[] and __set_bit
     without validating against nr_cpus_avail.

  5. get_idle_thread() leaves a partially initialized thread in
     idle_threads[] when init_idle_thread() fails, causing subsequent
     calls to return a thread with no priv data -- later cast to a
     larger struct causes OOB writes.

  6. timehist_sched_change_event() uses thread__tid() == 0 to guard
     a cast from thread_runtime to idle_thread_runtime.  A crafted
     perf.data with common_pid=0 but prev_pid!=0 gets a machine
     thread with thread_runtime priv -- the cast reads past the
     allocation.

  7. timehist_sched_change_event() sets itr->last_thread to NULL
     without calling thread__put() first, leaking a thread reference
     on every idle context switch with --idle-hist.

  8. free_idle_threads() calls thread__delete() directly instead of
     thread__put(), bypassing the refcount lifecycle.

  9. get_new_event() dereferences unchecked zalloc() result and uses
     BUG_ON on realloc failure.  add_sched_event_wakeup() passes
     unchecked zalloc() to sem_init().  All crash on OOM with
     untrusted input.

All require crafted or unusual perf.data inputs to trigger.
Verified with gcc and clang builds, checkpatch, and perf test.

Arnaldo Carvalho de Melo (9):
  perf mmap: Guard cpu__get_node() return in aio_bind()
  perf stat: Bounds-check CPU index in topology aggregation callbacks
  perf c2c: Bounds-check CPU and node IDs before bitmap and array access
  perf c2c: Bounds-check CPU IDs in setup_nodes() topology loop
  perf sched: Clean up idle_threads entry on init failure
  perf sched: Use is_idle_sample() for idle thread runtime cast guard
  perf sched: Fix thread reference leak in idle hist processing
  perf sched: Use thread__put() in free_idle_threads()
  perf sched: Replace BUG_ON and add NULL checks in replay event helpers

 tools/perf/builtin-c2c.c   | 23 +++++++++++++++++++++--
 tools/perf/builtin-sched.c | 45 ++++++++++++++++++++++++++++++++++++++-------
 tools/perf/builtin-stat.c  | 13 +++++++------
 tools/perf/util/mmap.c     |  8 +++++++-
 4 files changed, 73 insertions(+), 16 deletions(-)

Developed with AI assistance (Claude/sashiko), tagged in commits.

v2:
  - New patch 6: use is_idle_sample() instead of thread__tid() == 0
    to guard idle_thread_runtime cast, preventing OOB read/arbitrary
    free with crafted common_pid=0/prev_pid!=0 events (sashiko Critical
    on v1 patch 6)
  - Patch 3: add cpu < 0 check to catch large u32 sample->cpu values
    that wrap negative when assigned to signed int, bypassing the
    >= cpus_cnt bounds check (sashiko High on v1 patch 3)

Thanks,

- Arnaldo

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/9] perf mmap: Guard cpu__get_node() return in aio_bind()
  2026-06-05 23:38 [PATCHES v2 0/9] perf tools: Fix OOB writes, refcount bugs, and BUG_ON in mmap/stat/c2c/sched Arnaldo Carvalho de Melo
@ 2026-06-05 23:38 ` Arnaldo Carvalho de Melo
  2026-06-05 23:56   ` sashiko-bot
  2026-06-05 23:38 ` [PATCH 2/9] perf stat: Bounds-check CPU index in topology aggregation callbacks Arnaldo Carvalho de Melo
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 15+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-06-05 23:38 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot, Alexey Budankov,
	Claude Opus 4.6

From: Arnaldo Carvalho de Melo <acme@redhat.com>

perf_mmap__aio_bind() passes the cpu__get_node() return value directly
to an unsigned long variable (node_index).  When cpu__get_node() returns
-1 for an unknown CPU, the implicit int-to-unsigned-long conversion
sign-extends it to ULONG_MAX.

This causes bitmap_zalloc(ULONG_MAX + 1) which wraps to
bitmap_zalloc(0), returning a zero-sized allocation.  The subsequent
__set_bit(ULONG_MAX, node_mask) then writes massively out of bounds.

Check the return value in a signed temporary before assigning to
node_index, and skip the NUMA binding when the node is unknown.

Fixes: c44a8b44ca9f ("perf record: Bind the AIO user space buffers to nodes")
Reported-by: sashiko-bot <sashiko-bot@kernel.org>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/mmap.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
index b69f926d314b148b..4404a99eee45f9c3 100644
--- a/tools/perf/util/mmap.c
+++ b/tools/perf/util/mmap.c
@@ -104,9 +104,15 @@ static int perf_mmap__aio_bind(struct mmap *map, int idx, struct perf_cpu cpu, i
 	int err = 0;
 
 	if (affinity != PERF_AFFINITY_SYS && cpu__max_node() > 1) {
+		int node;
+
 		data = map->aio.data[idx];
 		mmap_len = mmap__mmap_len(map);
-		node_index = cpu__get_node(cpu);
+		node = cpu__get_node(cpu);
+		/* -1 sign-extends to ULONG_MAX, wrapping bitmap_zalloc(0) and OOB __set_bit */
+		if (node < 0)
+			return 0;
+		node_index = node;
 		node_mask = bitmap_zalloc(node_index + 1);
 		if (!node_mask) {
 			pr_err("Failed to allocate node mask for mbind: error %m\n");
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/9] perf stat: Bounds-check CPU index in topology aggregation callbacks
  2026-06-05 23:38 [PATCHES v2 0/9] perf tools: Fix OOB writes, refcount bugs, and BUG_ON in mmap/stat/c2c/sched Arnaldo Carvalho de Melo
  2026-06-05 23:38 ` [PATCH 1/9] perf mmap: Guard cpu__get_node() return in aio_bind() Arnaldo Carvalho de Melo
@ 2026-06-05 23:38 ` Arnaldo Carvalho de Melo
  2026-06-05 23:55   ` sashiko-bot
  2026-06-05 23:38 ` [PATCH 3/9] perf c2c: Bounds-check CPU and node IDs before bitmap and array access Arnaldo Carvalho de Melo
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 15+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-06-05 23:38 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot, Claude Opus 4.6

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Six perf_env__get_*_aggr_by_cpu() functions access env->cpu[cpu.cpu]
after only checking cpu.cpu != -1.  env->cpu[] is allocated with
env->nr_cpus_avail entries, so a CPU index from an untrusted perf.data
file that exceeds that count causes an out-of-bounds heap read.

Replace the != -1 guard with >= 0 && < env->nr_cpus_avail in all six
functions.  The >= 0 check also catches -1 and any other negative values
that could bypass the old check.

Affected functions:
  - perf_env__get_socket_aggr_by_cpu()
  - perf_env__get_die_aggr_by_cpu()
  - perf_env__get_cache_aggr_by_cpu()
  - perf_env__get_cluster_aggr_by_cpu()
  - perf_env__get_core_aggr_by_cpu()
  - perf_env__get_cpu_aggr_by_cpu()

Fixes: 68d702f7a120 ("perf stat report: Add support to initialize aggr_map from file")
Reported-by: sashiko-bot <sashiko-bot@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-stat.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 99d7db372b480800..9a045811c4197ccd 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1638,7 +1638,8 @@ static struct aggr_cpu_id perf_env__get_socket_aggr_by_cpu(struct perf_cpu cpu,
 	struct perf_env *env = data;
 	struct aggr_cpu_id id = aggr_cpu_id__empty();
 
-	if (cpu.cpu != -1)
+	/* env->cpu[] has env->nr_cpus_avail entries; reject untrusted indices */
+	if (cpu.cpu >= 0 && cpu.cpu < env->nr_cpus_avail)
 		id.socket = env->cpu[cpu.cpu].socket_id;
 
 	return id;
@@ -1649,7 +1650,7 @@ static struct aggr_cpu_id perf_env__get_die_aggr_by_cpu(struct perf_cpu cpu, voi
 	struct perf_env *env = data;
 	struct aggr_cpu_id id = aggr_cpu_id__empty();
 
-	if (cpu.cpu != -1) {
+	if (cpu.cpu >= 0 && cpu.cpu < env->nr_cpus_avail) {
 		/*
 		 * die_id is relative to socket, so start
 		 * with the socket ID and then add die to
@@ -1705,7 +1706,7 @@ static struct aggr_cpu_id perf_env__get_cache_aggr_by_cpu(struct perf_cpu cpu,
 	struct perf_env *env = data;
 	struct aggr_cpu_id id = aggr_cpu_id__empty();
 
-	if (cpu.cpu != -1) {
+	if (cpu.cpu >= 0 && cpu.cpu < env->nr_cpus_avail) {
 		u32 cache_level = (perf_stat.aggr_level) ?: stat_config.aggr_level;
 
 		id.socket = env->cpu[cpu.cpu].socket_id;
@@ -1722,7 +1723,7 @@ static struct aggr_cpu_id perf_env__get_cluster_aggr_by_cpu(struct perf_cpu cpu,
 	struct perf_env *env = data;
 	struct aggr_cpu_id id = aggr_cpu_id__empty();
 
-	if (cpu.cpu != -1) {
+	if (cpu.cpu >= 0 && cpu.cpu < env->nr_cpus_avail) {
 		id.socket = env->cpu[cpu.cpu].socket_id;
 		id.die = env->cpu[cpu.cpu].die_id;
 		id.cluster = env->cpu[cpu.cpu].cluster_id;
@@ -1736,7 +1737,7 @@ static struct aggr_cpu_id perf_env__get_core_aggr_by_cpu(struct perf_cpu cpu, vo
 	struct perf_env *env = data;
 	struct aggr_cpu_id id = aggr_cpu_id__empty();
 
-	if (cpu.cpu != -1) {
+	if (cpu.cpu >= 0 && cpu.cpu < env->nr_cpus_avail) {
 		/*
 		 * core_id is relative to socket, die and cluster, we need a
 		 * global id. So we set socket, die id, cluster id and core id.
@@ -1755,7 +1756,7 @@ static struct aggr_cpu_id perf_env__get_cpu_aggr_by_cpu(struct perf_cpu cpu, voi
 	struct perf_env *env = data;
 	struct aggr_cpu_id id = aggr_cpu_id__empty();
 
-	if (cpu.cpu != -1) {
+	if (cpu.cpu >= 0 && cpu.cpu < env->nr_cpus_avail) {
 		/*
 		 * core_id is relative to socket and die,
 		 * we need a global id. So we set
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/9] perf c2c: Bounds-check CPU and node IDs before bitmap and array access
  2026-06-05 23:38 [PATCHES v2 0/9] perf tools: Fix OOB writes, refcount bugs, and BUG_ON in mmap/stat/c2c/sched Arnaldo Carvalho de Melo
  2026-06-05 23:38 ` [PATCH 1/9] perf mmap: Guard cpu__get_node() return in aio_bind() Arnaldo Carvalho de Melo
  2026-06-05 23:38 ` [PATCH 2/9] perf stat: Bounds-check CPU index in topology aggregation callbacks Arnaldo Carvalho de Melo
@ 2026-06-05 23:38 ` Arnaldo Carvalho de Melo
  2026-06-05 23:54   ` sashiko-bot
  2026-06-05 23:38 ` [PATCH 4/9] perf c2c: Bounds-check CPU IDs in setup_nodes() topology loop Arnaldo Carvalho de Melo
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 15+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-06-05 23:38 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot, Claude Opus 4.6

From: Arnaldo Carvalho de Melo <acme@redhat.com>

c2c_he__set_cpu() passes sample->cpu directly to __set_bit(cpu, cpuset)
after only checking for the (u32)-1 sentinel.  The cpuset bitmap is
allocated with c2c.cpus_cnt bits (from env->nr_cpus_avail), so a crafted
perf.data with CPU IDs exceeding that count causes out-of-bounds heap
writes.

c2c_he__set_node() similarly passes the node ID from mem2node__node()
to __set_bit(node, nodeset) after only checking for negative values.
The nodeset bitmap is sized to c2c.nodes_cnt (from env->nr_numa_nodes),
so a node ID exceeding that causes OOB writes.

process_sample_event() indexes c2c.cpu2node[cpu] and
c2c_he->node_stats[node] without bounds checking.  Both arrays are
sized to c2c.cpus_cnt and c2c.nodes_cnt respectively.

Add bounds checks in all three paths:
  - c2c_he__set_cpu(): return if sample->cpu >= c2c.cpus_cnt
  - c2c_he__set_node(): return if node >= c2c.nodes_cnt
  - process_sample_event(): clamp cpu to 0 if >= cpus_cnt,
    guard node_stats access with bounds check

Fixes: 1e181b92a2da ("perf c2c report: Add 'node' sort key")
Reported-by: sashiko-bot <sashiko-bot@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-c2c.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 3dd45a550fdb772a..f060dfbe11c285bf 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -245,6 +245,10 @@ static void c2c_he__set_cpu(struct c2c_hist_entry *c2c_he,
 		      "WARNING: no sample cpu value"))
 		return;
 
+	/* cpuset bitmap has c2c.cpus_cnt bits from env->nr_cpus_avail */
+	if (sample->cpu >= (unsigned int)c2c.cpus_cnt)
+		return;
+
 	__set_bit(sample->cpu, c2c_he->cpuset);
 }
 
@@ -262,6 +266,10 @@ static void c2c_he__set_node(struct c2c_hist_entry *c2c_he,
 	if (WARN_ONCE(node < 0, "WARNING: failed to find node\n"))
 		return;
 
+	/* nodeset bitmap has c2c.nodes_cnt bits from env->nr_numa_nodes */
+	if (node >= c2c.nodes_cnt)
+		return;
+
 	__set_bit(node, c2c_he->nodeset);
 
 	if (c2c_he->paddr != sample->phys_addr) {
@@ -391,7 +399,12 @@ static int process_sample_event(const struct perf_tool *tool __maybe_unused,
 		 * Doing node stats only for single callchain data.
 		 */
 		int cpu = sample->cpu == (unsigned int) -1 ? 0 : sample->cpu;
-		int node = c2c.cpu2node[cpu];
+		int node;
+
+		/* cpu2node[] has c2c.cpus_cnt entries; large u32 wraps signed negative */
+		if (cpu < 0 || cpu >= c2c.cpus_cnt)
+			cpu = 0;
+		node = c2c.cpu2node[cpu];
 
 		c2c_hists = he__get_c2c_hists(he, c2c.cl_sort, 2, machine->env);
 		if (!c2c_hists) {
@@ -410,7 +423,9 @@ static int process_sample_event(const struct perf_tool *tool __maybe_unused,
 		c2c_he = container_of(he, struct c2c_hist_entry, he);
 		c2c_add_stats(&c2c_he->stats, &stats);
 		c2c_add_stats(&c2c_hists->stats, &stats);
-		c2c_add_stats(&c2c_he->node_stats[node], &stats);
+		/* node_stats[] has c2c.nodes_cnt entries */
+		if (node >= 0 && node < c2c.nodes_cnt)
+			c2c_add_stats(&c2c_he->node_stats[node], &stats);
 
 		compute_stats(c2c_he, &stats, sample->weight);
 
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 4/9] perf c2c: Bounds-check CPU IDs in setup_nodes() topology loop
  2026-06-05 23:38 [PATCHES v2 0/9] perf tools: Fix OOB writes, refcount bugs, and BUG_ON in mmap/stat/c2c/sched Arnaldo Carvalho de Melo
                   ` (2 preceding siblings ...)
  2026-06-05 23:38 ` [PATCH 3/9] perf c2c: Bounds-check CPU and node IDs before bitmap and array access Arnaldo Carvalho de Melo
@ 2026-06-05 23:38 ` Arnaldo Carvalho de Melo
  2026-06-05 23:38 ` [PATCH 5/9] perf sched: Clean up idle_threads entry on init failure Arnaldo Carvalho de Melo
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-06-05 23:38 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot, Claude Opus 4.6

From: Arnaldo Carvalho de Melo <acme@redhat.com>

setup_nodes() iterates CPU maps from the perf.data topology header and
uses cpu.cpu directly as an array index into cpu2node[] (allocated with
c2c.cpus_cnt = env->nr_cpus_avail entries) and __set_bit(cpu.cpu, set)
(bitmap also sized to c2c.cpus_cnt).

A crafted perf.data with topology CPU IDs exceeding nr_cpus_avail causes
out-of-bounds heap writes into both the cpu2node array and the per-node
bitmap.

Add a bounds check to skip CPU IDs that fall outside the valid range.

Fixes: 1e181b92a2da ("perf c2c report: Add 'node' sort key")
Reported-by: sashiko-bot <sashiko-bot@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-c2c.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index f060dfbe11c285bf..cfc1ebe8c0af74dc 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -2371,6 +2371,10 @@ static int setup_nodes(struct perf_session *session)
 		nodes[node] = set;
 
 		perf_cpu_map__for_each_cpu_skip_any(cpu, idx, map) {
+			/* topology CPU IDs from perf.data may exceed nr_cpus_avail */
+			if (cpu.cpu < 0 || cpu.cpu >= c2c.cpus_cnt)
+				continue;
+
 			__set_bit(cpu.cpu, set);
 
 			if (WARN_ONCE(cpu2node[cpu.cpu] != -1, "node/cpu topology bug"))
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 5/9] perf sched: Clean up idle_threads entry on init failure
  2026-06-05 23:38 [PATCHES v2 0/9] perf tools: Fix OOB writes, refcount bugs, and BUG_ON in mmap/stat/c2c/sched Arnaldo Carvalho de Melo
                   ` (3 preceding siblings ...)
  2026-06-05 23:38 ` [PATCH 4/9] perf c2c: Bounds-check CPU IDs in setup_nodes() topology loop Arnaldo Carvalho de Melo
@ 2026-06-05 23:38 ` Arnaldo Carvalho de Melo
  2026-06-05 23:56   ` sashiko-bot
  2026-06-05 23:38 ` [PATCH 6/9] perf sched: Use is_idle_sample() for idle thread runtime cast guard Arnaldo Carvalho de Melo
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 15+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-06-05 23:38 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot, David Ahern,
	Claude Opus 4.6

From: Arnaldo Carvalho de Melo <acme@redhat.com>

get_idle_thread() allocates a thread via thread__new() and stores it in
idle_threads[cpu], then calls init_idle_thread() to set up the private
data.  If init_idle_thread() fails (e.g. OOM for the idle_thread_runtime
struct), the function returns NULL but leaves the partially initialized
thread in idle_threads[cpu].

On subsequent calls for the same CPU, get_idle_thread() finds a non-NULL
idle_threads[cpu], skips allocation, and returns thread__get() on a
thread that has no priv data.  Callers then get a thread whose
thread__priv() returns NULL, leading to unexpected behavior.

Release the thread and reset the slot to NULL on init failure so the
entry doesn't persist in a corrupted state.

Fixes: 49394a2a24c7 ("perf sched timehist: Introduce timehist command")
Reported-by: sashiko-bot <sashiko-bot@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-sched.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 21fb820b625b43e1..e4378cc9ab3ed48b 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -2514,8 +2514,11 @@ static struct thread *get_idle_thread(int cpu)
 		idle_threads[cpu] = thread__new(0, 0);
 
 		if (idle_threads[cpu]) {
-			if (init_idle_thread(idle_threads[cpu]) < 0)
+			if (init_idle_thread(idle_threads[cpu]) < 0) {
+				/* clean up so next call doesn't find a half-initialized thread */
+				thread__zput(idle_threads[cpu]);
 				return NULL;
+			}
 		}
 	}
 
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 6/9] perf sched: Use is_idle_sample() for idle thread runtime cast guard
  2026-06-05 23:38 [PATCHES v2 0/9] perf tools: Fix OOB writes, refcount bugs, and BUG_ON in mmap/stat/c2c/sched Arnaldo Carvalho de Melo
                   ` (4 preceding siblings ...)
  2026-06-05 23:38 ` [PATCH 5/9] perf sched: Clean up idle_threads entry on init failure Arnaldo Carvalho de Melo
@ 2026-06-05 23:38 ` Arnaldo Carvalho de Melo
  2026-06-05 23:38 ` [PATCH 7/9] perf sched: Fix thread reference leak in idle hist processing Arnaldo Carvalho de Melo
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-06-05 23:38 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot, Claude Opus 4.6

From: Arnaldo Carvalho de Melo <acme@redhat.com>

timehist_sched_change_event() uses thread__tid(thread) == 0 to decide
whether to cast thread_runtime to idle_thread_runtime.  However, a
crafted perf.data can set common_pid=0 and common_tid=0 (the perf_sample
fields) while prev_pid != 0 (the tracepoint field).  is_idle_sample()
returns false (it checks prev_pid for sched_switch), so
timehist_get_thread() goes through machine__findnew_thread() and returns
the machine's TID 0 thread — whose priv data is a regular thread_runtime,
not the larger idle_thread_runtime allocated by init_idle_thread().

The subsequent cast to idle_thread_runtime reads past the thread_runtime
allocation, accessing itr->last_thread, itr->cursor, and itr->callchain
from adjacent heap memory.  Writing to itr->last_thread corrupts the
heap; calling thread__put() on the OOB value frees an arbitrary pointer.

Replace the thread__tid() == 0 check with is_idle_sample(), which uses
the tracepoint-specific prev_pid field and correctly identifies whether
the sample originated from an idle thread with idle_thread_runtime priv.

Fixes: 5d8f17fb5822 ("perf sched timehist: Add -I/--idle-hist option")
Reported-by: sashiko-bot <sashiko-bot@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-sched.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index e4378cc9ab3ed48b..4600d70b486104dd 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -2902,7 +2902,13 @@ static int timehist_sched_change_event(const struct perf_tool *tool,
 			t = ptime->end;
 	}
 
-	if (!sched->idle_hist || thread__tid(thread) == 0) {
+	/*
+	 * Use is_idle_sample() not thread__tid() == 0: a crafted perf.data
+	 * can set common_pid=0 with prev_pid!=0, giving us a machine thread
+	 * whose priv is thread_runtime, not idle_thread_runtime — the cast
+	 * below would read past the allocation.
+	 */
+	if (!sched->idle_hist || is_idle_sample(sample)) {
 		if (!cpu_list || (sample->cpu < MAX_NR_CPUS &&
 				 test_bit(sample->cpu, cpu_bitmap)))
 			timehist_update_runtime_stats(tr, t, tprev);
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 7/9] perf sched: Fix thread reference leak in idle hist processing
  2026-06-05 23:38 [PATCHES v2 0/9] perf tools: Fix OOB writes, refcount bugs, and BUG_ON in mmap/stat/c2c/sched Arnaldo Carvalho de Melo
                   ` (5 preceding siblings ...)
  2026-06-05 23:38 ` [PATCH 6/9] perf sched: Use is_idle_sample() for idle thread runtime cast guard Arnaldo Carvalho de Melo
@ 2026-06-05 23:38 ` Arnaldo Carvalho de Melo
  2026-06-05 23:56   ` sashiko-bot
  2026-06-05 23:38 ` [PATCH 8/9] perf sched: Use thread__put() in free_idle_threads() Arnaldo Carvalho de Melo
  2026-06-05 23:38 ` [PATCH 9/9] perf sched: Replace BUG_ON and add NULL checks in replay event helpers Arnaldo Carvalho de Melo
  8 siblings, 1 reply; 15+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-06-05 23:38 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot, David Ahern,
	Claude Opus 4.6

From: Arnaldo Carvalho de Melo <acme@redhat.com>

timehist_sched_change_event() sets itr->last_thread to NULL at the end
of idle hist processing without calling thread__put() first.  The
thread reference was acquired via thread__get() in timehist_get_thread()
(line 2581), so every idle context switch leaks a thread reference when
--idle-hist is active.

Use thread__zput() to properly release the reference before clearing
the pointer.

Fixes: 5d8f17fb5822 ("perf sched timehist: Add -I/--idle-hist option")
Reported-by: sashiko-bot <sashiko-bot@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-sched.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 4600d70b486104dd..c0cd3cbb602aa4c0 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -2940,7 +2940,7 @@ static int timehist_sched_change_event(const struct perf_tool *tool,
 			if (itr->cursor.nr)
 				callchain_append(&itr->callchain, &itr->cursor, t - tprev);
 
-			itr->last_thread = NULL;
+			thread__zput(itr->last_thread);
 		}
 
 		if (!sched->summary_only)
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 8/9] perf sched: Use thread__put() in free_idle_threads()
  2026-06-05 23:38 [PATCHES v2 0/9] perf tools: Fix OOB writes, refcount bugs, and BUG_ON in mmap/stat/c2c/sched Arnaldo Carvalho de Melo
                   ` (6 preceding siblings ...)
  2026-06-05 23:38 ` [PATCH 7/9] perf sched: Fix thread reference leak in idle hist processing Arnaldo Carvalho de Melo
@ 2026-06-05 23:38 ` Arnaldo Carvalho de Melo
  2026-06-05 23:38 ` [PATCH 9/9] perf sched: Replace BUG_ON and add NULL checks in replay event helpers Arnaldo Carvalho de Melo
  8 siblings, 0 replies; 15+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-06-05 23:38 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot, David Ahern,
	Claude Opus 4.6

From: Arnaldo Carvalho de Melo <acme@redhat.com>

free_idle_threads() calls thread__delete() directly instead of
thread__put(), bypassing the reference counting lifecycle.  Under
REFCNT_CHECKING builds, this leaks the pointer handle since
thread__delete() frees the object without going through the refcount
wrapper.

The idle threads are created via thread__new() (refcount=1) in
get_idle_thread().  Callers get additional references via thread__get()
which they release with thread__put().  free_idle_threads() drops the
base reference — thread__put() is the correct call, matching the
thread__new() acquisition.

Fixes: 49394a2a24c7 ("perf sched timehist: Introduce timehist command")
Reported-by: sashiko-bot <sashiko-bot@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-sched.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index c0cd3cbb602aa4c0..732c65008a8a5240 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -2481,7 +2481,7 @@ static void free_idle_threads(void)
 			if (itr)
 				thread__put(itr->last_thread);
 
-			thread__delete(idle);
+			thread__put(idle);
 		}
 	}
 
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 9/9] perf sched: Replace BUG_ON and add NULL checks in replay event helpers
  2026-06-05 23:38 [PATCHES v2 0/9] perf tools: Fix OOB writes, refcount bugs, and BUG_ON in mmap/stat/c2c/sched Arnaldo Carvalho de Melo
                   ` (7 preceding siblings ...)
  2026-06-05 23:38 ` [PATCH 8/9] perf sched: Use thread__put() in free_idle_threads() Arnaldo Carvalho de Melo
@ 2026-06-05 23:38 ` Arnaldo Carvalho de Melo
  8 siblings, 0 replies; 15+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-06-05 23:38 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot, Claude Opus 4.6

From: Arnaldo Carvalho de Melo <acme@redhat.com>

get_new_event() has three issues:

1. The zalloc() result is dereferenced without a NULL check, crashing
   on allocation failure.

2. BUG_ON(!task->atoms) kills the process when realloc() fails.
   Since perf.data is untrusted input, this should be a graceful error.

3. The realloc pattern assigns directly to task->atoms, losing the old
   pointer on failure.  task->nr_events is also incremented before the
   realloc, leaving corrupted state on failure.

Fix get_new_event() to:
  - Check the zalloc() result before dereferencing
  - Use a temporary for realloc() to avoid losing the old pointer
  - Increment nr_events only after successful realloc
  - Return NULL instead of calling BUG_ON on failure

Also fix add_sched_event_wakeup() where zalloc() for wait_sem is
passed to sem_init() without a NULL check.

Update all callers (add_sched_event_run, add_sched_event_wakeup,
add_sched_event_sleep) to handle NULL returns by returning early.
The replay may produce incomplete output on OOM but will not crash.

Fixes: ec156764d424 ("perf sched: Import schedbench.c")
Reported-by: sashiko-bot <sashiko-bot@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-sched.c | 28 +++++++++++++++++++++++++---
 1 file changed, 25 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 732c65008a8a5240..b7ccdc6a985d1c7b 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -365,14 +365,25 @@ get_new_event(struct task_desc *task, u64 timestamp)
 	struct sched_atom *event = zalloc(sizeof(*event));
 	unsigned long idx = task->nr_events;
 	size_t size;
+	struct sched_atom **atoms_p;
+
+	if (event == NULL) {
+		pr_err("ERROR: sched: failed to allocate event\n");
+		return NULL;
+	}
 
 	event->timestamp = timestamp;
 	event->nr = idx;
 
+	size = sizeof(struct sched_atom *) * (task->nr_events + 1);
+	atoms_p = realloc(task->atoms, size);
+	if (!atoms_p) {
+		pr_err("ERROR: sched: failed to grow atoms array\n");
+		free(event);
+		return NULL;
+	}
+	task->atoms = atoms_p;
 	task->nr_events++;
-	size = sizeof(struct sched_atom *) * task->nr_events;
-	task->atoms = realloc(task->atoms, size);
-	BUG_ON(!task->atoms);
 
 	task->atoms[idx] = event;
 
@@ -403,6 +414,8 @@ static void add_sched_event_run(struct perf_sched *sched, struct task_desc *task
 	}
 
 	event = get_new_event(task, timestamp);
+	if (event == NULL)
+		return;
 
 	event->type = SCHED_EVENT_RUN;
 	event->duration = duration;
@@ -416,6 +429,8 @@ static void add_sched_event_wakeup(struct perf_sched *sched, struct task_desc *t
 	struct sched_atom *event, *wakee_event;
 
 	event = get_new_event(task, timestamp);
+	if (event == NULL)
+		return;
 	event->type = SCHED_EVENT_WAKEUP;
 	event->wakee = wakee;
 
@@ -430,6 +445,10 @@ static void add_sched_event_wakeup(struct perf_sched *sched, struct task_desc *t
 	}
 
 	wakee_event->wait_sem = zalloc(sizeof(*wakee_event->wait_sem));
+	if (!wakee_event->wait_sem) {
+		pr_err("ERROR: sched: failed to allocate semaphore\n");
+		return;
+	}
 	sem_init(wakee_event->wait_sem, 0, 0);
 	event->wait_sem = wakee_event->wait_sem;
 
@@ -441,6 +460,9 @@ static void add_sched_event_sleep(struct perf_sched *sched, struct task_desc *ta
 {
 	struct sched_atom *event = get_new_event(task, timestamp);
 
+	if (event == NULL)
+		return;
+
 	event->type = SCHED_EVENT_SLEEP;
 
 	sched->nr_sleep_events++;
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/9] perf c2c: Bounds-check CPU and node IDs before bitmap and array access
  2026-06-05 23:38 ` [PATCH 3/9] perf c2c: Bounds-check CPU and node IDs before bitmap and array access Arnaldo Carvalho de Melo
@ 2026-06-05 23:54   ` sashiko-bot
  0 siblings, 0 replies; 15+ messages in thread
From: sashiko-bot @ 2026-06-05 23:54 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: linux-perf-users

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:

Pre-existing issues:
- [High] Dangling pointer in the `he__get_c2c_hists()` error path leads to a Use-After-Free and double-free during cleanup.
--

commit 4a6b2b9d1bf179db3b813952c508e27db4f604e7
Author: Arnaldo Carvalho de Melo <acme@redhat.com>

perf c2c: Bounds-check CPU and node IDs before bitmap and array access

This commit adds bounds checks for CPU and node IDs in c2c_he__set_cpu(),
c2c_he__set_node(), and process_sample_event(). It aims to prevent
out-of-bounds heap writes and array access when processing perf.data files
with unusually large CPU or node values.

> diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
> index 3dd45a550fdb7..f060dfbe11c28 100644
> --- a/tools/perf/builtin-c2c.c
> +++ b/tools/perf/builtin-c2c.c

[ ... ]

> @@ -391,7 +399,12 @@ static int process_sample_event(const struct perf_tool *tool __maybe_unused,
>  		 * Doing node stats only for single callchain data.
>  		 */
>  		int cpu = sample->cpu == (unsigned int) -1 ? 0 : sample->cpu;
> -		int node = c2c.cpu2node[cpu];
> +		int node;
> +
> +		/* cpu2node[] has c2c.cpus_cnt entries; large u32 wraps signed negative */
> +		if (cpu < 0 || cpu >= c2c.cpus_cnt)
> +			cpu = 0;
> +		node = c2c.cpu2node[cpu];
>  
>  		c2c_hists = he__get_c2c_hists(he, c2c.cl_sort, 2, machine->env);

[Severity: High]
This is a pre-existing issue, but can the error path in he__get_c2c_hists()
lead to a use-after-free and double-free during cleanup?

If c2c_hists__init() fails, the memory is freed but c2c_he->hists is not
cleared:

tools/perf/builtin-c2c.c:he__get_c2c_hists() {
	...
	hists = c2c_he->hists = zalloc(sizeof(*hists));
	if (!hists)
		return NULL;

	ret = c2c_hists__init(hists, sort, nr_header_lines, env);
	if (ret) {
		free(hists);
		return NULL;
	}
	...
}

Since the hist_entry is already added to the tree earlier in
process_sample_event(), returning -ENOMEM from here triggers teardown via
c2c_he_free(). Does this cause c2c_he_free() to dereference the dangling
c2c_he->hists pointer?

tools/perf/builtin-c2c.c:c2c_he_free() {
	...
	if (c2c_he->hists) {
		hists__delete_entries(&c2c_he->hists->hists);
		zfree(&c2c_he->hists);
	}
	...
}

>  		if (!c2c_hists) {
> @@ -410,7 +423,9 @@ static int process_sample_event(const struct perf_tool *tool __maybe_unused,

[ ... ]

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260605233837.1773732-1-acme@kernel.org?part=3

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/9] perf stat: Bounds-check CPU index in topology aggregation callbacks
  2026-06-05 23:38 ` [PATCH 2/9] perf stat: Bounds-check CPU index in topology aggregation callbacks Arnaldo Carvalho de Melo
@ 2026-06-05 23:55   ` sashiko-bot
  0 siblings, 0 replies; 15+ messages in thread
From: sashiko-bot @ 2026-06-05 23:55 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: linux-perf-users

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:

Pre-existing issues:
- [High] Incomplete bounds-check fix: Missing NULL pointer check before accessing `env->cpu` array in `perf_env__get_*_aggr_by_cpu` callbacks.
- [High] Out-of-bounds heap read due to unvalidated `sample->cpu` index in `perf_event__preprocess_sample()`.
--

commit ebd8bf34de713904cef201f9c6fa2181cf91d727
Author: Arnaldo Carvalho de Melo <acme@redhat.com>

perf stat: Bounds-check CPU index in topology aggregation callbacks

This commit updates six perf_env__get_*_aggr_by_cpu() functions to properly
bounds-check the CPU index against env->nr_cpus_avail. This prevents
out-of-bounds heap reads when processing untrusted perf.data files.

> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -1638,7 +1638,8 @@ static struct aggr_cpu_id perf_env__get_socket_aggr_by_cpu(struct perf_cpu cpu,
>  	struct perf_env *env = data;
>  	struct aggr_cpu_id id = aggr_cpu_id__empty();
>  
> -	if (cpu.cpu != -1)
> +	/* env->cpu[] has env->nr_cpus_avail entries; reject untrusted indices */
> +	if (cpu.cpu >= 0 && cpu.cpu < env->nr_cpus_avail)
>  		id.socket = env->cpu[cpu.cpu].socket_id;

[Severity: High]
This is a pre-existing issue, but is it possible for env->cpu to be NULL
here while env->nr_cpus_avail is greater than zero?

When parsing an older perf.data file, process_cpu_topology() can free
env->cpu while leaving env->nr_cpus_avail populated.

If this occurs, the bounds check cpu.cpu < env->nr_cpus_avail would pass,
but accessing env->cpu[cpu.cpu] would result in a NULL pointer dereference.
Does this code need an additional check to ensure env->cpu is not NULL?

[ ... ]

[Severity: High]
This isn't a bug introduced by this patch, but while looking at CPU index
bounds checking, is there a similar missing upper-bound check in
perf_event__preprocess_sample()?

In tools/perf/util/event.c:perf_event__preprocess_sample(), the untrusted
sample->cpu value is checked for negative values, but does not appear to be
checked against env->nr_cpus_avail before accessing the env->cpu array:

    if (al->cpu >= 0)
        al->socket = env->cpu[al->cpu].socket_id;

Could a large positive value for al->cpu bypass this check and cause an
out-of-bounds heap read from the env->cpu array?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260605233837.1773732-1-acme@kernel.org?part=2

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/9] perf sched: Clean up idle_threads entry on init failure
  2026-06-05 23:38 ` [PATCH 5/9] perf sched: Clean up idle_threads entry on init failure Arnaldo Carvalho de Melo
@ 2026-06-05 23:56   ` sashiko-bot
  0 siblings, 0 replies; 15+ messages in thread
From: sashiko-bot @ 2026-06-05 23:56 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: linux-perf-users

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:

Pre-existing issues:
- [High] The perf sched command fails to register the `.feature` callback in its `perf_tool` initialization, which prevents `perf_env` from being populated in pipe mode.
- [Medium] free_idle_threads() leaks the dynamically allocated nodes of the callchain root attached to the idle thread's private data.
--

commit 22f7015aaa31f156b54bc85f9ec298a36781e9ba
Author: Arnaldo Carvalho de Melo <acme@redhat.com>

perf sched: Clean up idle_threads entry on init failure

get_idle_thread() allocates a thread via thread__new() and stores it in
idle_threads[cpu], then calls init_idle_thread() to set up the private
data.  If init_idle_thread() fails (e.g. OOM for the idle_thread_runtime
struct), the function returns NULL but leaves the partially initialized
thread in idle_threads[cpu].

> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
> index 21fb820b625b4..e4378cc9ab3ed 100644
> --- a/tools/perf/builtin-sched.c
> +++ b/tools/perf/builtin-sched.c
> @@ -2514,8 +2514,11 @@ static struct thread *get_idle_thread(int cpu)
>  		idle_threads[cpu] = thread__new(0, 0);

[Severity: Medium]
This is a pre-existing issue, but does free_idle_threads() leak the
dynamically allocated nodes of the callchain root attached to the idle
thread's private data?

During tool execution, callchain_append() dynamically allocates struct
callchain_node objects and attaches them to itr->callchain (inside struct
idle_thread_runtime).

When free_idle_threads() runs at exit:

tools/perf/builtin-sched.c:free_idle_threads() {
    ...
            itr = thread__priv(idle);
            if (itr)
                thread__put(itr->last_thread);

            thread__put(idle);
    ...
}

Because cmd_sched() registers the generic free() function as the private
data destructor (thread__set_priv_destructor(free)), thread__put(idle)
performs a shallow free on itr.

The function free_callchain(&itr->callchain) is never called, resulting in
a leak of all appended callchain nodes at tool exit.

>  
>  		if (idle_threads[cpu]) {
> -			if (init_idle_thread(idle_threads[cpu]) < 0)
> +			if (init_idle_thread(idle_threads[cpu]) < 0) {
> +				/* clean up so next call doesn't find a half-initialized thread */
> +				thread__zput(idle_threads[cpu]);
>  				return NULL;
> +			}
>  		}
>  	}
>  

[Severity: High]
This is a pre-existing issue, but does the perf sched command fail to
set up the .feature callback in its perf_tool initialization, preventing
perf_env from being populated in pipe mode?

In pipe mode, perf_env is populated via synthesized feature events from the
stream.  Because perf_sched__timehist (via cmd_sched) assigns callbacks like
.mmap, but completely omits .feature:

tools/perf/builtin-sched.c:perf_sched__timehist() {
    ...
        sched->tool.sample       = perf_timehist__process_sample;
        sched->tool.mmap         = perf_event__process_mmap;
        /* missing sched->tool.feature = perf_event__process_feature; */
    ...
}

Are feature events routed to a default stub and dropped?  This would prevent
proper initialization of tool structures (e.g., env->nr_cpus_online defaults
to 4 rather than the actual topology), breaking analysis functionality.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260605233837.1773732-1-acme@kernel.org?part=5

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 7/9] perf sched: Fix thread reference leak in idle hist processing
  2026-06-05 23:38 ` [PATCH 7/9] perf sched: Fix thread reference leak in idle hist processing Arnaldo Carvalho de Melo
@ 2026-06-05 23:56   ` sashiko-bot
  0 siblings, 0 replies; 15+ messages in thread
From: sashiko-bot @ 2026-06-05 23:56 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: linux-perf-users

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:

Pre-existing issues:
- [Critical] Unchecked reallocarray failure leads to out-of-bounds array write.
--

commit d067b628e210dabba51b3b303c2790b28507f63f
Author: Arnaldo Carvalho de Melo <acme@redhat.com>

perf sched: Fix thread reference leak in idle hist processing

This commit resolves a thread reference leak in timehist_sched_change_event()
by properly releasing the acquired thread reference using thread__zput() before
clearing the pointer during idle history processing.

> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
> index 4600d70b48610..c0cd3cbb602aa 100644
> --- a/tools/perf/builtin-sched.c
> +++ b/tools/perf/builtin-sched.c
> @@ -2940,7 +2940,7 @@ static int timehist_sched_change_event(const struct perf_tool *tool,
>  			if (itr->cursor.nr)
>  				callchain_append(&itr->callchain, &itr->cursor, t - tprev);
>  
> -			itr->last_thread = NULL;
> +			thread__zput(itr->last_thread);
>  		}
>  
>  		if (!sched->summary_only)

[Severity: Critical]
This isn't a bug introduced by this patch, but I noticed a potential
regression in a different file:
tools/perf/util/drm_pmu.c:for_each_drm_fdinfo_in_dir().

When expanding the minor device array:

		if (minors->minors_num == minors->minors_len) {
			unsigned int *tmp = reallocarray(minors->minors, minors->minors_len + 4,
							 sizeof(unsigned int));

			if (tmp) {
				minors->minors = tmp;
				minors->minors_len += 4;
			}
		}
		minors->minors[minors->minors_num++] = minor;

If reallocarray() fails (e.g., due to memory exhaustion), it returns a null
pointer and the array's capacity (minors_len) is not updated.

However, the code proceeds unconditionally to write the minor into
minors->minors[minors->minors_num++].

Could this cause a null pointer dereference on the first allocation failure?

Furthermore, because minors_num is incremented without minors_len
increasing, would subsequent loops bypass the minors_num == minors_len
check completely, leading to continuous and unbounded out-of-bounds heap
writes for each new file descriptor?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260605233837.1773732-1-acme@kernel.org?part=7

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/9] perf mmap: Guard cpu__get_node() return in aio_bind()
  2026-06-05 23:38 ` [PATCH 1/9] perf mmap: Guard cpu__get_node() return in aio_bind() Arnaldo Carvalho de Melo
@ 2026-06-05 23:56   ` sashiko-bot
  0 siblings, 0 replies; 15+ messages in thread
From: sashiko-bot @ 2026-06-05 23:56 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: linux-perf-users

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:

Pre-existing issues:
- [Medium] NULL pointer dereference in `perf_mmap__aio_free()` when memory allocation fails in `perf_mmap__aio_mmap()`.
- [Medium] Sign-extension out-of-bounds write vulnerability in `perf_mmap__setup_affinity_mask`.
--

commit 2a2adfec0327c3db4323154825d9029d39b512eb
Author: Arnaldo Carvalho de Melo <acme@redhat.com>

perf mmap: Guard cpu__get_node() return in aio_bind()

This commit fixes a sign-extension out-of-bounds write in perf_mmap__aio_bind()
by assigning the return of cpu__get_node() to a signed integer first.
If the return value is less than zero, the function safely returns early
instead of performing out-of-bounds operations.

> diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
> index b69f926d314b1..4404a99eee45f 100644
> --- a/tools/perf/util/mmap.c
> +++ b/tools/perf/util/mmap.c
> @@ -104,9 +104,15 @@ static int perf_mmap__aio_bind(struct mmap *map, int idx, struct perf_cpu cpu, i
>  	int err = 0;
>  
>  	if (affinity != PERF_AFFINITY_SYS && cpu__max_node() > 1) {
> +		int node;
> +
>  		data = map->aio.data[idx];
>  		mmap_len = mmap__mmap_len(map);
> -		node_index = cpu__get_node(cpu);
> +		node = cpu__get_node(cpu);
> +		/* -1 sign-extends to ULONG_MAX, wrapping bitmap_zalloc(0) and OOB __set_bit */
> +		if (node < 0)
> +			return 0;
> +		node_index = node;
>  		node_mask = bitmap_zalloc(node_index + 1);
>  		if (!node_mask) {
>  			pr_err("Failed to allocate node mask for mbind: error %m\n");

[Severity: Medium]
This is a pre-existing issue, but I noticed a similar sign-extension issue in
perf_mmap__setup_affinity_mask() that could cause an out-of-bounds write.

If map->core.cpu.cpu is -1 (for example, for a dummy CPU map created by perf
record -p <pid> --affinity cpu), wouldn't it be sign-extended to ULONG_MAX
when passed to __set_bit()?

tools/perf/util/mmap.c:perf_mmap__setup_affinity_mask() {
    ...
	if (mp->affinity == PERF_AFFINITY_NODE && cpu__max_node() > 1)
		build_node_mask(cpu__get_node(map->core.cpu), &map->affinity_mask);
	else if (mp->affinity == PERF_AFFINITY_CPU)
		__set_bit(map->core.cpu.cpu, map->affinity_mask.bits);
    ...
}


[Severity: Medium]
This is also a pre-existing issue, but while looking at the mmap code, I
noticed a potential NULL pointer dereference during error cleanup.

If memory allocation for map->aio.data fails in perf_mmap__aio_mmap(), the
function returns -1 while map->aio.nr_cblocks is already set.

During the cleanup sequence, perf_mmap__aio_munmap() unconditionally loops up
to map->aio.nr_cblocks:

tools/perf/util/mmap.c:perf_mmap__aio_munmap() {
    ...
	for (i = 0; i < map->aio.nr_cblocks; ++i)
		perf_mmap__aio_free(map, i);
    ...
}

Inside perf_mmap__aio_free(), the pointer offset is calculated and passed to
zfree():

tools/perf/util/mmap.c:perf_mmap__aio_free() {
	zfree(&(map->aio.data[idx]));
}

If map->aio.data is NULL, won't passing its evaluated offset to zfree()
cause a segmentation fault when __zfree() tries to free the invalid pointer?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260605233837.1773732-1-acme@kernel.org?part=1

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2026-06-05 23:56 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-05 23:38 [PATCHES v2 0/9] perf tools: Fix OOB writes, refcount bugs, and BUG_ON in mmap/stat/c2c/sched Arnaldo Carvalho de Melo
2026-06-05 23:38 ` [PATCH 1/9] perf mmap: Guard cpu__get_node() return in aio_bind() Arnaldo Carvalho de Melo
2026-06-05 23:56   ` sashiko-bot
2026-06-05 23:38 ` [PATCH 2/9] perf stat: Bounds-check CPU index in topology aggregation callbacks Arnaldo Carvalho de Melo
2026-06-05 23:55   ` sashiko-bot
2026-06-05 23:38 ` [PATCH 3/9] perf c2c: Bounds-check CPU and node IDs before bitmap and array access Arnaldo Carvalho de Melo
2026-06-05 23:54   ` sashiko-bot
2026-06-05 23:38 ` [PATCH 4/9] perf c2c: Bounds-check CPU IDs in setup_nodes() topology loop Arnaldo Carvalho de Melo
2026-06-05 23:38 ` [PATCH 5/9] perf sched: Clean up idle_threads entry on init failure Arnaldo Carvalho de Melo
2026-06-05 23:56   ` sashiko-bot
2026-06-05 23:38 ` [PATCH 6/9] perf sched: Use is_idle_sample() for idle thread runtime cast guard Arnaldo Carvalho de Melo
2026-06-05 23:38 ` [PATCH 7/9] perf sched: Fix thread reference leak in idle hist processing Arnaldo Carvalho de Melo
2026-06-05 23:56   ` sashiko-bot
2026-06-05 23:38 ` [PATCH 8/9] perf sched: Use thread__put() in free_idle_threads() Arnaldo Carvalho de Melo
2026-06-05 23:38 ` [PATCH 9/9] perf sched: Replace BUG_ON and add NULL checks in replay event helpers Arnaldo Carvalho de Melo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox