[PATCH 11/29] perf session: Validate nr fields against event size on both swap and common paths


* [PATCH 11/29] perf session: Validate nr fields against event size on both swap and common paths
  2026-05-24  3:26 [PATCHES v2 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
@ 2026-05-24  3:26 ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-05-24  3:26 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot,
	Claude Opus 4.6 (1M context)

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Several event types use an nr field to control iteration over
variable-length arrays.  The swap handlers byte-swap and loop using
these fields without bounds checks, and the native processing path
trusts them as well.

Add bounds checks on both paths for:

- PERF_RECORD_THREAD_MAP: validate nr against payload, return -1
  on the swap path.  On the native path, reject with -EINVAL.

- PERF_RECORD_NAMESPACES: clamp nr on the swap path (safe because
  each entry is indexed by type; missing entries just won't be
  resolved).  Skip the event on the native path.

- PERF_RECORD_CPU_MAP: clamp nr for CPUS and MASK sub-types on
  the swap path.  Add bounds checks for mask64 which previously
  had no nr validation.  Skip the event on the native path.

- PERF_RECORD_STAT_CONFIG: clamp nr on the swap path (safe because
  each config entry is self-describing via its tag).  Skip the
  event on the native path.
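
As a rough illustration of the check applied to each of the event types
above, here is a minimal standalone sketch. It is not the code in this
patch: max_entries(), check_nr() and enum nr_check are hypothetical names,
and the sketch assumes the caller has already enforced the per-type
minimum size so that the subtraction cannot wrap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical outcome of checking a count field against the record size. */
enum nr_check { NR_OK, NR_TOO_BIG };

/*
 * Largest entry count that fits in a record of hdr_size bytes whose
 * fixed part takes fixed_size bytes and whose entries take entry_size
 * bytes each.  The caller must already have enforced
 * hdr_size >= fixed_size.
 */
static uint64_t max_entries(uint16_t hdr_size, size_t fixed_size,
			    size_t entry_size)
{
	assert(entry_size != 0);
	assert(hdr_size >= fixed_size);	/* otherwise the subtraction wraps */
	return (hdr_size - fixed_size) / entry_size;
}

/* Compare the untrusted count against what the record can actually hold. */
static enum nr_check check_nr(uint64_t nr, uint16_t hdr_size,
			      size_t fixed_size, size_t entry_size)
{
	return nr > max_entries(hdr_size, fixed_size, entry_size) ?
	       NR_TOO_BIG : NR_OK;
}
```

What happens on NR_TOO_BIG (clamp, skip or reject) is a per-event-type
policy decision, as listed above.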

The swap path (cross-endian, writable MAP_PRIVATE mapping) can
safely clamp by writing back to the event.  The native path
(read-only MAP_SHARED mapping) must skip instead of clamping
because writing to the mmap'd event would segfault.
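
To show why clamping is harmless on the writable mapping, the sketch
below stores through a PROT_READ|PROT_WRITE, MAP_PRIVATE mapping of a
scratch file and then confirms the file itself is unchanged.  The
function name and the /tmp path template are assumptions made for the
example; none of this is perf code.

```c
#define _POSIX_C_SOURCE 200809L	/* mkstemp(), pread() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Store through a private writable mapping and report whether the file
 * kept its original bytes.  Returns 1 if the mapping shows the new byte
 * and the file is unchanged, 0 if not, -1 on a setup error.
 */
static int private_write_leaves_file_untouched(void)
{
	char path[] = "/tmp/nr-clamp-XXXXXX";	/* hypothetical scratch file */
	const char orig[8] = "ABCDEFG";
	char back[8];
	char *map;
	int fd, ret = -1;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);	/* the open descriptor keeps the file alive */

	if (write(fd, orig, sizeof(orig)) != (ssize_t)sizeof(orig))
		goto out;

	map = mmap(NULL, sizeof(orig), PROT_READ | PROT_WRITE, MAP_PRIVATE,
		   fd, 0);
	if (map == MAP_FAILED)
		goto out;

	map[0] = 'Z';	/* in-place "clamp": only this mapping sees it */

	if (pread(fd, back, sizeof(back), 0) == (ssize_t)sizeof(back))
		ret = map[0] == 'Z' && memcmp(back, orig, sizeof(orig)) == 0;

	munmap(map, sizeof(orig));
out:
	close(fd);
	return ret;
}
```

The same store through a mapping created with PROT_READ only would raise
SIGSEGV instead, which is the case the native path avoids by skipping.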

Also fix stat_config swap range: change size += 1 to
size += sizeof(event->stat_config.nr) for clarity.  The old +1
happened to work because mem_bswap_64 processes 8-byte chunks,
but the intent is to include the 8-byte nr field in the swap
range.
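
The reason the old +1 was equivalent can be seen with a small stand-in
for a chunked swapper.  swap_u64_chunks() below is a hypothetical helper
written for this example; it assumes, as described above, that one
8-byte word is swapped per iteration for as long as the remaining byte
count is positive.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable 64-bit byte swap. */
static uint64_t swap64(uint64_t v)
{
	uint64_t r = 0;

	for (int i = 0; i < 8; i++) {
		r = (r << 8) | (v & 0xff);
		v >>= 8;
	}
	return r;
}

/*
 * Swap one u64 per iteration while the remaining byte count is
 * positive.  Returns the number of u64 words touched.
 */
static size_t swap_u64_chunks(uint64_t *p, long byte_size)
{
	size_t n = 0;

	assert(p != NULL);
	while (byte_size > 0) {
		*p = swap64(*p);
		p++;
		n++;
		byte_size -= (long)sizeof(uint64_t);
	}
	return n;
}
```

With two 16-byte entries, a byte count of 2 * 16 + 1 and one of
2 * 16 + 8 both touch five words: the count field plus four data words.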

Changes in v2:
- Document that PERF_RECORD_NAMESPACES max_nr includes trailing
  sample_id space when sample_id_all is present — harmless on the
  swap path because both per-element bswap_64 and swap_sample_id_all()
  perform the same u64 byte swap (Reported-by: sashiko-bot@kernel.org)

Reported-by: sashiko-bot@kernel.org # Running on a local machine
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/session.c | 253 +++++++++++++++++++++++++++++++++++---
 1 file changed, 234 insertions(+), 19 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index aef10d42be35487a..8588e12f110fca70 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -496,13 +496,35 @@ static int perf_event__throttle_swap(union perf_event *event,
 static int perf_event__namespaces_swap(union perf_event *event,
 				       bool sample_id_all)
 {
-	u64 i;
+	u64 i, nr, max_nr;
 
 	event->namespaces.pid		= bswap_32(event->namespaces.pid);
 	event->namespaces.tid		= bswap_32(event->namespaces.tid);
 	event->namespaces.nr_namespaces	= bswap_64(event->namespaces.nr_namespaces);
 
-	for (i = 0; i < event->namespaces.nr_namespaces; i++) {
+	nr = event->namespaces.nr_namespaces;
+	/*
+	 * Cannot underflow: perf_event__min_size[] guarantees header.size >= sizeof.
+	 * When sample_id_all is present max_nr slightly overestimates the
+	 * array space because header.size includes the trailing sample_id.
+	 * Harmless: both the per-element bswap_64 loop and swap_sample_id_all()
+	 * perform the same u64 byte swap, so the result is correct regardless
+	 * of where the boundary between array and sample_id falls.
+	 */
+	max_nr = (event->header.size - sizeof(event->namespaces)) /
+		 sizeof(event->namespaces.link_info[0]);
+	/*
+	 * Safe to clamp: each namespace entry is indexed by type;
+	 * missing entries just won't be resolved.
+	 */
+	if (nr > max_nr) {
+		pr_warning("WARNING: PERF_RECORD_NAMESPACES: nr_namespaces %" PRIu64 " exceeds payload (max %" PRIu64 "), clamping\n",
+			   nr, max_nr);
+		nr = max_nr;
+		event->namespaces.nr_namespaces = nr;
+	}
+
+	for (i = 0; i < nr; i++) {
 		struct perf_ns_link_info *ns = &event->namespaces.link_info[i];
 
 		ns->dev = bswap_64(ns->dev);
@@ -734,11 +756,23 @@ static int perf_event__auxtrace_error_swap(union perf_event *event,
 static int perf_event__thread_map_swap(union perf_event *event,
 				       bool sample_id_all __maybe_unused)
 {
-	unsigned i;
+	unsigned int i;
+	u64 nr;
 
 	event->thread_map.nr = bswap_64(event->thread_map.nr);
 
-	for (i = 0; i < event->thread_map.nr; i++)
+	/*
+	 * Reject rather than clamp: unlike namespaces (indexed by type)
+	 * or stat_config (self-describing tags), a truncated thread map
+	 * is structurally broken — downstream would get a wrong map.
+	 */
+	/* Cannot underflow: perf_event__min_size[] guarantees header.size >= sizeof */
+	nr = event->thread_map.nr;
+	if (nr > (event->header.size - sizeof(event->thread_map)) /
+		  sizeof(event->thread_map.entries[0]))
+		return -1;
+
+	for (i = 0; i < nr; i++)
 		event->thread_map.entries[i].pid = bswap_64(event->thread_map.entries[i].pid);
 	return 0;
 }
@@ -747,32 +781,80 @@ static int perf_event__cpu_map_swap(union perf_event *event,
 				    bool sample_id_all __maybe_unused)
 {
 	struct perf_record_cpu_map_data *data = &event->cpu_map.data;
+	u32 payload = event->header.size - sizeof(event->header);
 
 	data->type = bswap_16(data->type);
 
+	/*
+	 * Safe to clamp: a shorter CPU map just means some CPUs
+	 * are absent; tools process the CPUs that are present.
+	 */
 	switch (data->type) {
-	case PERF_CPU_MAP__CPUS:
-		data->cpus_data.nr = bswap_16(data->cpus_data.nr);
+	case PERF_CPU_MAP__CPUS: {
+		u16 nr, max_nr;
 
-		for (unsigned i = 0; i < data->cpus_data.nr; i++)
+		data->cpus_data.nr = bswap_16(data->cpus_data.nr);
+		nr = data->cpus_data.nr;
+		max_nr = (payload - offsetof(struct perf_record_cpu_map_data,
+					     cpus_data.cpu)) /
+			 sizeof(data->cpus_data.cpu[0]);
+		if (nr > max_nr) {
+			pr_warning("WARNING: PERF_RECORD_CPU_MAP: nr %u exceeds payload (max %u), clamping\n",
+				   nr, max_nr);
+			nr = max_nr;
+			data->cpus_data.nr = nr;
+		}
+		for (unsigned int i = 0; i < nr; i++)
 			data->cpus_data.cpu[i] = bswap_16(data->cpus_data.cpu[i]);
 		break;
+	}
 	case PERF_CPU_MAP__MASK:
 		data->mask32_data.long_size = bswap_16(data->mask32_data.long_size);
 
 		switch (data->mask32_data.long_size) {
-		case 4:
+		case 4: {
+			u16 nr, max_nr;
+
 			data->mask32_data.nr = bswap_16(data->mask32_data.nr);
-			for (unsigned i = 0; i < data->mask32_data.nr; i++)
+			nr = data->mask32_data.nr;
+			max_nr = (payload - offsetof(struct perf_record_cpu_map_data,
+						     mask32_data.mask)) /
+				 sizeof(data->mask32_data.mask[0]);
+			if (nr > max_nr) {
+				pr_warning("WARNING: PERF_RECORD_CPU_MAP mask32: nr %u exceeds payload (max %u), clamping\n",
+					   nr, max_nr);
+				nr = max_nr;
+				data->mask32_data.nr = nr;
+			}
+			for (unsigned int i = 0; i < nr; i++)
 				data->mask32_data.mask[i] = bswap_32(data->mask32_data.mask[i]);
 			break;
-		case 8:
+		}
+		case 8: {
+			u16 nr, max_nr;
+
 			data->mask64_data.nr = bswap_16(data->mask64_data.nr);
-			for (unsigned i = 0; i < data->mask64_data.nr; i++)
+			nr = data->mask64_data.nr;
+			if (payload < offsetof(struct perf_record_cpu_map_data, mask64_data.mask)) {
+				data->mask64_data.nr = 0;
+				break;
+			}
+			max_nr = (payload - offsetof(struct perf_record_cpu_map_data,
+						     mask64_data.mask)) /
+				 sizeof(data->mask64_data.mask[0]);
+			if (nr > max_nr) {
+				pr_warning("WARNING: PERF_RECORD_CPU_MAP mask64: nr %u exceeds payload (max %u), clamping\n",
+					   nr, max_nr);
+				nr = max_nr;
+				data->mask64_data.nr = nr;
+			}
+			for (unsigned int i = 0; i < nr; i++)
 				data->mask64_data.mask[i] = bswap_64(data->mask64_data.mask[i]);
 			break;
+		}
 		default:
-			pr_err("cpu_map swap: unsupported long size\n");
+			pr_err("cpu_map swap: unsupported long size %u\n",
+			       data->mask32_data.long_size);
 		}
 		break;
 	case PERF_CPU_MAP__RANGE_CPUS:
@@ -788,11 +870,27 @@ static int perf_event__cpu_map_swap(union perf_event *event,
 static int perf_event__stat_config_swap(union perf_event *event,
 					bool sample_id_all __maybe_unused)
 {
-	u64 size;
+	u64 nr, max_nr, size;
 
-	size  = bswap_64(event->stat_config.nr) * sizeof(event->stat_config.data[0]);
-	size += 1; /* nr item itself */
+	nr = bswap_64(event->stat_config.nr);
+	/* Cannot underflow: perf_event__min_size[] guarantees header.size >= sizeof */
+	max_nr = (event->header.size - sizeof(event->stat_config)) /
+		 sizeof(event->stat_config.data[0]);
+	/*
+	 * Safe to clamp: each config entry is self-describing
+	 * via its tag; missing entries keep their defaults.
+	 */
+	if (nr > max_nr) {
+		pr_warning("WARNING: PERF_RECORD_STAT_CONFIG: nr %" PRIu64 " exceeds payload (max %" PRIu64 "), clamping\n",
+			   nr, max_nr);
+		nr = max_nr;
+	}
+	size = nr * sizeof(event->stat_config.data[0]);
+	/* The swap starts at &nr, so add its size to cover the full range */
+	size += sizeof(event->stat_config.nr);
 	mem_bswap_64(&event->stat_config.nr, size);
+	/* Persist the clamped value in native byte order */
+	event->stat_config.nr = nr;
 	return 0;
 }
 
@@ -1730,8 +1828,27 @@ static int machines__deliver_event(struct machines *machines,
 					   "COMM"))
 			return 0;
 		return tool->comm(tool, event, sample, machine);
-	case PERF_RECORD_NAMESPACES:
+	case PERF_RECORD_NAMESPACES: {
+		/*
+		 * Cannot underflow: perf_event__min_size[] guarantees header.size >= sizeof.
+		 * Includes trailing sample_id space when present, but prevents OOB.
+		 */
+		u64 max_nr = (event->header.size - sizeof(event->namespaces)) /
+			     sizeof(event->namespaces.link_info[0]);
+
+		/*
+		 * Native-endian events are mmap'd read-only, so we
+		 * cannot clamp nr in place.  Skip the event instead.
+		 * The swap handler already clamps on the writable
+		 * cross-endian path.
+		 */
+		if (event->namespaces.nr_namespaces > max_nr) {
+			pr_warning("WARNING: PERF_RECORD_NAMESPACES: nr_namespaces %" PRIu64 " exceeds payload (max %" PRIu64 "), skipping\n",
+				   (u64)event->namespaces.nr_namespaces, max_nr);
+			return 0;
+		}
 		return tool->namespaces(tool, event, sample, machine);
+	}
 	case PERF_RECORD_CGROUP:
 		if (!perf_event__check_nul(event->cgroup.path,
 					   (void *)event + event->header.size,
@@ -1912,15 +2029,112 @@ static s64 perf_session__process_user_event(struct perf_session *session,
 		perf_session__auxtrace_error_inc(session, event);
 		err = tool->auxtrace_error(tool, session, event);
 		break;
-	case PERF_RECORD_THREAD_MAP:
+	case PERF_RECORD_THREAD_MAP: {
+		u64 max_nr;
+
+		if (event->header.size < sizeof(event->thread_map)) {
+			pr_err("PERF_RECORD_THREAD_MAP: header.size (%u) too small\n",
+			       event->header.size);
+			err = -EINVAL;
+			break;
+		}
+
+		max_nr = (event->header.size - sizeof(event->thread_map)) /
+			 sizeof(event->thread_map.entries[0]);
+		if (event->thread_map.nr > max_nr) {
+			pr_err("PERF_RECORD_THREAD_MAP: nr %" PRIu64 " exceeds max %" PRIu64 "\n",
+			       (u64)event->thread_map.nr, max_nr);
+			err = -EINVAL;
+			break;
+		}
+
 		err = tool->thread_map(tool, session, event);
 		break;
-	case PERF_RECORD_CPU_MAP:
+	}
+	case PERF_RECORD_CPU_MAP: {
+		struct perf_record_cpu_map_data *data = &event->cpu_map.data;
+		u32 payload = event->header.size - sizeof(event->header);
+
+		/*
+		 * Native-endian events are mmap'd read-only, so we
+		 * cannot clamp nr fields in place.  Skip the event
+		 * if any variant overflows.
+		 */
+		switch (data->type) {
+		case PERF_CPU_MAP__CPUS: {
+			u16 max_nr = (payload - offsetof(struct perf_record_cpu_map_data,
+							 cpus_data.cpu)) /
+				     sizeof(data->cpus_data.cpu[0]);
+
+			if (data->cpus_data.nr > max_nr) {
+				pr_warning("WARNING: PERF_RECORD_CPU_MAP: nr %u exceeds payload (max %u), skipping\n",
+					   data->cpus_data.nr, max_nr);
+				err = 0;
+				goto out;
+			}
+			break;
+		}
+		case PERF_CPU_MAP__MASK:
+			if (data->mask32_data.long_size == 4) {
+				u16 max_nr = (payload - offsetof(struct perf_record_cpu_map_data,
+								 mask32_data.mask)) /
+					     sizeof(data->mask32_data.mask[0]);
+
+				if (data->mask32_data.nr > max_nr) {
+					pr_warning("WARNING: PERF_RECORD_CPU_MAP mask32: nr %u exceeds payload (max %u), skipping\n",
+						   data->mask32_data.nr, max_nr);
+					err = 0;
+					goto out;
+				}
+			} else if (data->mask64_data.long_size == 8) {
+				u16 max_nr;
+
+				if (payload < offsetof(struct perf_record_cpu_map_data, mask64_data.mask)) {
+					err = 0;
+					goto out;
+				}
+				max_nr = (payload - offsetof(struct perf_record_cpu_map_data,
+							     mask64_data.mask)) /
+					 sizeof(data->mask64_data.mask[0]);
+				if (data->mask64_data.nr > max_nr) {
+					pr_warning("WARNING: PERF_RECORD_CPU_MAP mask64: nr %u exceeds payload (max %u), skipping\n",
+						   data->mask64_data.nr, max_nr);
+					err = 0;
+					goto out;
+				}
+			} else {
+				pr_warning("WARNING: PERF_RECORD_CPU_MAP: unsupported long_size %u, skipping\n",
+					   data->mask32_data.long_size);
+				err = 0;
+				goto out;
+			}
+			break;
+		default:
+			break;
+		}
+
 		err = tool->cpu_map(tool, session, event);
 		break;
-	case PERF_RECORD_STAT_CONFIG:
+	}
+	case PERF_RECORD_STAT_CONFIG: {
+		/* Cannot underflow: perf_event__min_size[] guarantees header.size >= sizeof */
+		u64 max_nr = (event->header.size - sizeof(event->stat_config)) /
+			     sizeof(event->stat_config.data[0]);
+
+		/*
+		 * Native-endian events are mmap'd read-only, so we
+		 * cannot clamp nr in place.  Skip the event instead.
+		 */
+		if (event->stat_config.nr > max_nr) {
+			pr_warning("WARNING: PERF_RECORD_STAT_CONFIG: nr %" PRIu64 " exceeds payload (max %" PRIu64 "), skipping\n",
+				   (u64)event->stat_config.nr, max_nr);
+			err = 0;
+			goto out;
+		}
+
 		err = tool->stat_config(tool, session, event);
 		break;
+	}
 	case PERF_RECORD_STAT:
 		err = tool->stat(tool, session, event);
 		break;
@@ -1963,6 +2177,7 @@ static s64 perf_session__process_user_event(struct perf_session *session,
 		err = -EINVAL;
 		break;
 	}
+out:
 	perf_sample__exit(&sample);
 	return err;
 }
-- 
2.54.0



* [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files
@ 2026-05-26 21:17 Arnaldo Carvalho de Melo
  2026-05-26 21:17 ` [PATCH 01/29] perf session: Add minimum event size and alignment validation Arnaldo Carvalho de Melo
                   ` (29 more replies)
  0 siblings, 30 replies; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-05-26 21:17 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo

perf.data validation and hardening (29 patches)

A crafted or corrupted perf.data file can cause out-of-bounds
reads/writes, infinite loops, heap overflows, and segfaults in perf
report, perf script, perf inject, perf timechart, and perf kwork.
This series adds defense-in-depth validation for file parsing:

- Per-event-type minimum size table, enforced before swap and
  processing on both native and cross-endian paths.

- Bounds-checking the one_mmap fast path in peek_event against the
  mapped region size, preventing OOB reads from crafted file_offset.

- Swap handler return values (void -> int) so handlers can propagate
  errors instead of silently corrupting adjacent memory.

- Bounds checking for string fields (null-termination), array counts
  (nr vs payload size), feature section sizes (vs file size), and
  CPU indices (vs nr_cpus_avail / array allocation).

- ABI0 handling for perf_event_attr.size == 0 across all code paths
  (swap, native, synthesize, read_event_desc), with consistent
  behavior regardless of file endianness.

- READ_ONCE() snapshot of event->header.size in process_user_event()
  to prevent compiler rematerialization from MAP_SHARED memory.

- Sanitizer-aware shell test: the truncated perf.data test captures
  stderr and checks for ASAN/MSAN/TSAN/UBSAN markers, since sanitizer
  exits use code 1 which otherwise looks like a clean error exit.
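
For the READ_ONCE() item in the list above, the idea of taking one
snapshot and validating only that copy can be sketched as follows.
LOAD_ONCE_U16(), struct hdr and checked_size() are simplified,
hypothetical stand-ins for this example, not the definitions used in
the tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified volatile load in the spirit of READ_ONCE(); hypothetical name. */
#define LOAD_ONCE_U16(x) (*(const volatile uint16_t *)&(x))

/* Minimal record header used only by this sketch. */
struct hdr {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

/*
 * Return the validated record size, or 0 if it is out of range.
 * Every later use goes through the local copy, so a change to h->size
 * in the shared mapping cannot slip in between the check and the use.
 */
static uint16_t checked_size(const struct hdr *h, size_t min_size,
			     size_t avail)
{
	uint16_t size;

	assert(h != NULL);
	size = LOAD_ONCE_U16(h->size);	/* single load from the mapping */

	if (size < min_size || size > avail)
		return 0;
	return size;
}
```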

Pre-existing bugs fixed along the way:

- event_contains() macro off-by-one (checked start, not full extent)

- zstd_decompress_stream multi-iteration output.pos bug

- zstd_compress_stream_to_records: broken memcpy fallback -> return -1
  + ZSTD context reset + dst_size underflow guard

- PERF_RECORD_SWITCH sample_id_all offset wrong for non-CPU_WIDE

- cpu_map__from_range any_cpu used as count instead of boolean

- cpu_map__from_mask double-fetch heap overflow (j >= weight guard)

- kwork cpus_runtime BUG_ON with signed comparison

- perf_header__getbuffer64 EOF without errno (silent success)

- read_event_desc ABI0 sentinel (attr.size=0 -> free_event_desc early stop)

- EVENT_UPDATE MASK: missing offsetof underflow guard + pr_warning on
  mask32/mask64 validation paths

Additional pre-existing issues were noticed during review and will be
addressed in follow-up series.

Testing
-------

- perf test at baseline and at patches 1, 8, 11, 17, 21, 26, 29
  with 300s timeout -- no regressions detected.
- Build with both gcc and clang at every patch.
- checkpatch.pl on all 29 patches.
- Full root perf test on x86_64 (x1, i7-1260P) and aarch64
  (Raspberry Pi 4, Cortex-A72, Debian trixie).

Developed with AI assistance (Claude/sashiko), tagged in commits.

It is available at:

  https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git perf-data-validation

  https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/log/?h=perf-data-validation

I think this is the last one; follow-up series will deal with the
pre-existing issues found while working on this series. It's all in
several TODO files.

Best regards,

- Arnaldo

Changes in v4
-------------

- Patch 22: fix comment in process_mem_topology() — per-node fields
  are node_id + mem_size + bitmap_nr_bits, not version + bitmap_size.
- Patch 29: add mktemp failure guards (exit 2 = skip) so empty
  variables don't cause 'rm -f .old' in cleanup.  Use dd bs=$cut_at
  count=1 instead of bs=1 count=$cut_at to avoid one syscall per byte.

Changes in v3
-------------

- Patch 10: fix perf_event__repipe_attr() in builtin-inject.c to
  handle ABI0 attr.size==0 — was using the raw size for memcpy and
  the perf_record_header_attr_id() macro, which both break when
  attr.size is 0.
- Patch 12: add sample_id_all handling to perf_event__build_id_swap()
  — perf_event__synthesize_build_id() appends id_sample data, so
  cross-endian pipe mode must swap those trailing fields.
- Patch 24: remove comp_mmap_len upper-bound cap that rejected valid
  perf record -m 2G recordings (mmap_len exceeds 2GB - 4096).  The
  downstream decompression path already checks against SIZE_MAX.

Changes in v2
-------------

- Patch 8: strnlen with 'end - data' limit instead of open-ended strlen
- Patch 10: ABI0 attr.size==0 handling for native-endian path
- Patch 13: READ_ONCE snapshot for mask32_data.nr, long_size validation
- Patch 17: attr_size bounds check for all PRINT_ATTRn macros

Arnaldo Carvalho de Melo (29):
  perf session: Add minimum event size and alignment validation
  perf session: Bounds-check one_mmap event pointer in peek_event
  perf tools: Fix event_contains() macro to verify full field extent
  perf zstd: Fix compression error path in zstd_compress_stream_to_records()
  perf zstd: Fix multi-iteration decompression and error handling
  perf session: Fix PERF_RECORD_READ swap and dump for variable-length events
  perf session: Fix swap_sample_id_all() crash on crafted events
  perf session: Add validated swap infrastructure with null-termination checks
  perf session: Use bounded copy for PERF_RECORD_TIME_CONV
  perf session: Validate HEADER_ATTR attr.size before swapping
  perf session: Validate nr fields against event size on both swap and common paths
  perf header: Byte-swap build ID event pid and bounds check section entries
  perf cpumap: Reject RANGE_CPUS with start_cpu > end_cpu
  perf auxtrace: Harden auxtrace_error event handling
  perf session: Add byte-swap and bounds check for PERF_RECORD_BPF_METADATA events
  perf header: Validate null-termination in PERF_RECORD_EVENT_UPDATE string fields
  perf tools: Bounds check perf_event_attr fields against attr.size before printing
  perf header: Propagate feature section processing errors
  perf header: Validate f_attr.ids section before use in perf_session__read_header()
  perf header: Validate feature section size and add read path bounds checking
  perf header: Sanity check HEADER_EVENT_DESC attr.size before swap
  perf header: Validate bitmap size before allocating in do_read_bitmap()
  perf session: Add byte-swap handler for PERF_RECORD_COMPRESSED2
  perf tools: Harden compressed event processing
  perf session: Check for decompression buffer size overflow
  perf session: Bound nr_cpus_avail and validate sample CPU
  perf kwork: Bounds check work->cpu before indexing cpus_runtime[]
  perf session: Snapshot event->header.size in process_user_event()
  perf test: Add truncated perf.data robustness test

 tools/lib/perf/include/perf/event.h           |    9 +-
 tools/perf/builtin-inject.c                   |   23 +-
 tools/perf/builtin-kwork.c                    |   45 +-
 tools/perf/builtin-record.c                   |    6 +-
 tools/perf/tests/parse-no-sample-id-all.c     |    6 +
 tools/perf/tests/shell/data_validation.sh     |   85 ++
 tools/perf/trace/beauty/perf_event_open.c     |   23 +-
 tools/perf/util/arm-spe.c                     |    2 +-
 tools/perf/util/auxtrace.c                    |   24 +-
 tools/perf/util/cpumap.c                      |   62 +-
 tools/perf/util/cs-etm.c                      |    2 +-
 tools/perf/util/header.c                      |  625 +++++++-
 tools/perf/util/jitdump.c                     |    2 +-
 tools/perf/util/kwork.h                       |    1 +
 tools/perf/util/perf_event_attr_fprintf.c     |  141 +-
 .../scripting-engines/trace-event-python.c    |   28 +-
 tools/perf/util/session.c                     | 1355 +++++++++++++++--
 tools/perf/util/session.h                     |    2 +
 tools/perf/util/synthetic-events.c            |   25 +-
 tools/perf/util/tool.c                        |   51 +-
 tools/perf/util/tsc.c                         |    2 +-
 tools/perf/util/zstd.c                        |   47 +-
 22 files changed, 2272 insertions(+), 294 deletions(-)
 create mode 100755 tools/perf/tests/shell/data_validation.sh

-- 
2.54.0



* [PATCH 01/29] perf session: Add minimum event size and alignment validation
  2026-05-26 21:17 [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
@ 2026-05-26 21:17 ` Arnaldo Carvalho de Melo
  2026-05-26 21:17 ` [PATCH 02/29] perf session: Bounds-check one_mmap event pointer in peek_event Arnaldo Carvalho de Melo
                   ` (28 subsequent siblings)
  29 siblings, 0 replies; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-05-26 21:17 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot,
	Claude Opus 4.6 (1M context)

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Add a per-type minimum size table (perf_event__min_size[]) and
enforce it before swap and processing, so that both cross-endian
and native-endian paths are protected from accessing fields past
the event boundary.

The table uses offsetof() for types with trailing variable-length
fields (filenames, strings, msg arrays) and sizeof() for
fixed-size types.  Zero entries mean no minimum beyond the 8-byte
header already enforced by the reader.

Undersized events are skipped with a warning in process_event
and rejected in peek_event — both checked before the swap
handler runs, preventing OOB access on crafted event fields.
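
As a rough illustration of the table idea, here is a self-contained
sketch with made-up record types.  The REC_* values, struct rec_fixed,
struct rec_string and too_small() are assumptions for the example, not
the definitions from session.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct hdr {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

/* Hypothetical record types, for illustration only. */
enum { REC_FIXED = 1, REC_STRING = 2, REC_SAMPLE = 3, REC_MAX };

struct rec_fixed {
	struct hdr header;
	uint64_t id;
	uint64_t lost;
};

struct rec_string {
	struct hdr header;
	uint32_t pid, tid;
	char name[16];	/* trailing variable-length string */
};

static const uint32_t min_size[REC_MAX] = {
	/* Fixed-size record: the whole struct must be present. */
	[REC_FIXED]  = sizeof(struct rec_fixed),
	/* String record: fixed part plus room for a null terminator. */
	[REC_STRING] = offsetof(struct rec_string, name) + 1,
	/* REC_SAMPLE stays 0: no minimum beyond the header. */
};

/* Caller must ensure h->type < REC_MAX before indexing the table. */
static bool too_small(const struct hdr *h)
{
	uint32_t min = min_size[h->type];

	return min && h->size < min;
}
```

A zero entry falls through the `min &&` test, which is how types without
a meaningful fixed minimum are left alone.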

Also reject events whose header.size is not 8-byte aligned.  The
kernel aligns all event sizes to sizeof(u64) — see
perf_event_comm_event() (ALIGN), perf_event_mmap_event(),
perf_event_cgroup(), perf_event_ksymbol() (IS_ALIGNED loops),
and perf_event_text_poke() (ALIGN) in kernel/events/core.c.
An unaligned size means the file is corrupted or crafted; reject
early so downstream code that divides by sizeof(u64) to compute
array element counts gets exact results.

Three legacy user events are exempted from the alignment check:
TRACING_DATA (66) had a 12-byte struct before commit b39c915a4f36
("libperf event: Ensure tracing data is multiple of 8 sized")
added padding, COMPRESSED (81) carries raw ZSTD output (already
superseded by COMPRESSED2 with PERF_ALIGN), and HEADER_FEATURE
(80) uses do_write_string() with a 4-byte length prefix.
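
The alignment rule with its three exemptions can be sketched as a small
predicate.  The type numbers are the ones quoted above; the REC_* names
and size_is_misaligned() are hypothetical and only mirror the shape of
the check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Type numbers as quoted in the commit message above. */
enum {
	REC_HEADER_TRACING_DATA = 66,
	REC_HEADER_FEATURE      = 80,
	REC_COMPRESSED          = 81,
};

/* Hypothetical helper: true if the size must be rejected as unaligned. */
static bool size_is_misaligned(uint32_t type, uint16_t size)
{
	if (size % sizeof(uint64_t) == 0)
		return false;

	/* Legacy user events that predate the 8-byte rule are let through. */
	return type != REC_HEADER_TRACING_DATA &&
	       type != REC_COMPRESSED &&
	       type != REC_HEADER_FEATURE;
}
```

For example, a 12-byte record is rejected for an ordinary type but
accepted for type 66, matching the old TRACING_DATA layout.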

Also guard event_swap() against crafted event types >=
PERF_RECORD_HEADER_MAX to prevent OOB reads on the
perf_event__swap_ops[] array.

Changes in v2:
- Fix double-skip for unsupported event types: return 0 instead
  of event->header.size in perf_session__process_event() for
  HEADER_MAX, since reader__read_event() already advances by
  event->header.size (Reported-by: sashiko-bot@kernel.org)
- Exempt TRACING_DATA, COMPRESSED, and HEADER_FEATURE from the
  alignment check — these legacy user events predate the 8-byte
  alignment rule (Reported-by: sashiko-bot@kernel.org)
- peek_event: return 0 (skip) for unknown event types instead of
  -1 (error), consistent with process_event which already skips
  unsupported types gracefully (Reported-by: sashiko-bot@kernel.org)

Reported-by: sashiko-bot@kernel.org # Running on a local machine
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/session.c | 253 +++++++++++++++++++++++++++++++++-----
 1 file changed, 220 insertions(+), 33 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 1e25892963b7857a..0523fd243e02c09b 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1759,15 +1759,121 @@ int perf_session__deliver_synth_attr_event(struct perf_session *session,
 	return perf_session__deliver_synth_event(session, &ev.ev, NULL);
 }
 
-static void event_swap(union perf_event *event, bool sample_id_all)
+/*
+ * Minimum event sizes indexed by type.  Checked before swap and
+ * processing so that both cross-endian and native-endian paths
+ * are protected from accessing fields past the event boundary.
+ * Zero means no minimum beyond the 8-byte header (already
+ * enforced by the reader).
+ */
+static const u32 perf_event__min_size[PERF_RECORD_HEADER_MAX] = {
+	/*
+	 * offsetof() + 1 for types with a trailing variable-length
+	 * string (filename, comm, path, name, msg): the +1 ensures
+	 * room for at least a null terminator.  Full null-termination
+	 * within the event boundary is checked separately.
+	 *
+	 * PERF_RECORD_SAMPLE is omitted: all64_swap is bounded by
+	 * header.size, and the internal layout varies by sample_type
+	 * so a fixed minimum is not meaningful.
+	 */
+	[PERF_RECORD_MMAP]		  = offsetof(struct perf_record_mmap, filename) + 1,
+	[PERF_RECORD_LOST]		  = sizeof(struct perf_record_lost),
+	[PERF_RECORD_COMM]		  = offsetof(struct perf_record_comm, comm) + 1,
+	[PERF_RECORD_EXIT]		  = sizeof(struct perf_record_fork),
+	[PERF_RECORD_THROTTLE]		  = sizeof(struct perf_record_throttle),
+	[PERF_RECORD_UNTHROTTLE]	  = sizeof(struct perf_record_throttle),
+	[PERF_RECORD_FORK]		  = sizeof(struct perf_record_fork),
+	/*
+	 * The kernel dynamically sizes PERF_RECORD_READ based on
+	 * attr.read_format — the minimum has just pid + tid + value.
+	 */
+	[PERF_RECORD_READ]		  = offsetof(struct perf_record_read, time_enabled),
+	[PERF_RECORD_MMAP2]		  = offsetof(struct perf_record_mmap2, filename) + 1,
+	[PERF_RECORD_LOST_SAMPLES]	  = sizeof(struct perf_record_lost_samples),
+	[PERF_RECORD_AUX]		  = sizeof(struct perf_record_aux),
+	[PERF_RECORD_ITRACE_START]	  = sizeof(struct perf_record_itrace_start),
+	[PERF_RECORD_SWITCH]		  = sizeof(struct perf_event_header),
+	[PERF_RECORD_SWITCH_CPU_WIDE]	  = sizeof(struct perf_record_switch),
+	[PERF_RECORD_NAMESPACES]	  = sizeof(struct perf_record_namespaces),
+	[PERF_RECORD_CGROUP]		  = offsetof(struct perf_record_cgroup, path) + 1,
+	[PERF_RECORD_TEXT_POKE]		  = sizeof(struct perf_record_text_poke_event),
+	[PERF_RECORD_KSYMBOL]		  = offsetof(struct perf_record_ksymbol, name) + 1,
+	[PERF_RECORD_BPF_EVENT]		  = sizeof(struct perf_record_bpf_event),
+	[PERF_RECORD_HEADER_ATTR]	  = sizeof(struct perf_event_header) + PERF_ATTR_SIZE_VER0,
+	[PERF_RECORD_HEADER_EVENT_TYPE]	  = sizeof(struct perf_record_header_event_type),
+	/* Legacy events predate the __u32 pad field, accept 12-byte records */
+	[PERF_RECORD_HEADER_TRACING_DATA] = offsetof(struct perf_record_header_tracing_data, pad),
+	[PERF_RECORD_AUX_OUTPUT_HW_ID]	  = sizeof(struct perf_record_aux_output_hw_id),
+	[PERF_RECORD_AUXTRACE_INFO]	  = sizeof(struct perf_record_auxtrace_info),
+	[PERF_RECORD_AUXTRACE]		  = sizeof(struct perf_record_auxtrace),
+	[PERF_RECORD_AUXTRACE_ERROR]	  = offsetof(struct perf_record_auxtrace_error, msg) + 1,
+	[PERF_RECORD_THREAD_MAP]	  = sizeof(struct perf_record_thread_map),
+	/* Smallest valid variant is RANGE_CPUS: header(8) + type(2) + range(6) */
+	[PERF_RECORD_CPU_MAP]		  = sizeof(struct perf_event_header) +
+					    sizeof(__u16) +
+					    sizeof(struct perf_record_range_cpu_map),
+	[PERF_RECORD_STAT_CONFIG]	  = sizeof(struct perf_record_stat_config),
+	[PERF_RECORD_STAT]		  = sizeof(struct perf_record_stat),
+	[PERF_RECORD_STAT_ROUND]	  = sizeof(struct perf_record_stat_round),
+	/* Union inflates sizeof; use fixed header fields as minimum */
+	[PERF_RECORD_EVENT_UPDATE]	  = offsetof(struct perf_record_event_update, scale),
+	[PERF_RECORD_TIME_CONV]		  = offsetof(struct perf_record_time_conv, time_cycles),
+	[PERF_RECORD_ID_INDEX]		  = sizeof(struct perf_record_id_index),
+	[PERF_RECORD_HEADER_BUILD_ID]	  = sizeof(struct perf_record_header_build_id),
+	[PERF_RECORD_HEADER_FEATURE]	  = sizeof(struct perf_record_header_feature),
+	[PERF_RECORD_COMPRESSED2]	  = sizeof(struct perf_record_compressed2),
+	[PERF_RECORD_BPF_METADATA]	  = sizeof(struct perf_record_bpf_metadata),
+	[PERF_RECORD_CALLCHAIN_DEFERRED]  = sizeof(struct perf_event_header) + sizeof(__u64),
+	/*
+	 * SCHEDSTAT events have a version-dependent union after the
+	 * fixed header fields; the minimum is the base (pre-union)
+	 * portion so old and new versions both pass.
+	 */
+	[PERF_RECORD_SCHEDSTAT_CPU]	  = offsetof(struct perf_record_schedstat_cpu, v15),
+	[PERF_RECORD_SCHEDSTAT_DOMAIN]	  = offsetof(struct perf_record_schedstat_domain, v15),
+};
+
+/*
+ * Return true if the event is too small for its declared type.
+ * Caller must ensure event->header.type < PERF_RECORD_HEADER_MAX.
+ * If min is non-NULL, stores the required minimum on failure.
+ */
+static bool perf_event__too_small(const union perf_event *event, u32 *min)
 {
-	perf_event__swap_op swap;
+	u32 min_sz = perf_event__min_size[event->header.type];
+
+	if (min_sz && event->header.size < min_sz) {
+		if (min)
+			*min = min_sz;
+		return true;
+	}
 
-	swap = perf_event__swap_ops[event->header.type];
+	return false;
+}
+
+/* Caller must ensure event->header.type < PERF_RECORD_HEADER_MAX */
+static void event_swap(union perf_event *event, bool sample_id_all)
+{
+	perf_event__swap_op swap = perf_event__swap_ops[event->header.type];
 	if (swap)
 		swap(event, sample_id_all);
 }
 
+/*
+ * Read and validate the event at @file_offset.
+ *
+ * Returns:
+ *   0  — success: *event_ptr is set and safe to access.
+ *  -1  — error; check *event_ptr to decide whether to advance or abort:
+ *          *event_ptr set  — event header was read but the event is
+ *                            malformed (too small for its type, or byte-swap
+ *                            failed).  header.size is still valid, so the
+ *                            caller can advance past the event.
+ *          *event_ptr NULL — fatal: couldn't read the header at all
+ *                            (I/O error, offset out of range, pipe mode).
+ *                            Caller must abort.
+ */
 int perf_session__peek_event(struct perf_session *session, off_t file_offset,
 			     void *buf, size_t buf_sz,
 			     union perf_event **event_ptr,
@@ -1775,52 +1881,85 @@ int perf_session__peek_event(struct perf_session *session, off_t file_offset,
 {
 	union perf_event *event;
 	size_t hdr_sz, rest;
+	u32 min_sz;
 	int fd;
 
+	*event_ptr = NULL;
+
 	if (session->one_mmap && !session->header.needs_swap) {
 		event = file_offset - session->one_mmap_offset +
 			session->one_mmap_addr;
-		goto out_parse_sample;
-	}
 
-	if (perf_data__is_pipe(session->data))
-		return -1;
+		/* Every event must at least contain its own header */
+		if (event->header.size < sizeof(struct perf_event_header))
+			return -1;
+	} else {
+		if (perf_data__is_pipe(session->data))
+			return -1;
 
-	fd = perf_data__fd(session->data);
-	hdr_sz = sizeof(struct perf_event_header);
+		fd = perf_data__fd(session->data);
+		hdr_sz = sizeof(struct perf_event_header);
 
-	if (buf_sz < hdr_sz)
-		return -1;
+		if (buf_sz < hdr_sz)
+			return -1;
 
-	if (lseek(fd, file_offset, SEEK_SET) == (off_t)-1 ||
-	    readn(fd, buf, hdr_sz) != (ssize_t)hdr_sz)
-		return -1;
+		if (lseek(fd, file_offset, SEEK_SET) == (off_t)-1 ||
+		    readn(fd, buf, hdr_sz) != (ssize_t)hdr_sz)
+			return -1;
 
-	event = (union perf_event *)buf;
+		event = (union perf_event *)buf;
 
-	if (session->header.needs_swap)
-		perf_event_header__bswap(&event->header);
+		if (session->header.needs_swap)
+			perf_event_header__bswap(&event->header);
+
+		if (event->header.size < hdr_sz || event->header.size > buf_sz)
+			return -1;
+
+		buf += hdr_sz;
+		rest = event->header.size - hdr_sz;
+
+		if (readn(fd, buf, rest) != (ssize_t)rest)
+			return -1;
+	}
 
-	if (event->header.size < hdr_sz || event->header.size > buf_sz)
+	/* Event data is fully loaded — expose so callers can advance */
+	*event_ptr = event;
+
+	/*
+	 * Check alignment before type: an unaligned size misaligns the
+	 * stream for all subsequent reads regardless of event type.
+	 * Three legacy user events predate the 8-byte rule — exempt them.
+	 */
+	if (event->header.size % sizeof(u64) &&
+	    event->header.type != PERF_RECORD_HEADER_TRACING_DATA &&
+	    event->header.type != PERF_RECORD_COMPRESSED &&
+	    event->header.type != PERF_RECORD_HEADER_FEATURE) {
+		pr_warning("WARNING: peek_event: event type %u size %u not aligned to %zu\n",
+			   event->header.type,
+			   event->header.size, sizeof(u64));
 		return -1;
+	}
 
-	buf += hdr_sz;
-	rest = event->header.size - hdr_sz;
+	if (event->header.type >= PERF_RECORD_HEADER_MAX) {
+		pr_warning("WARNING: peek_event: unsupported event type %u, skipping\n",
+			   event->header.type);
+		return 0;
+	}
 
-	if (readn(fd, buf, rest) != (ssize_t)rest)
+	if (perf_event__too_small(event, &min_sz)) {
+		pr_warning("WARNING: peek_event: %s event size %u too small (min %u)\n",
+			   perf_event__name(event->header.type),
+			   event->header.size, min_sz);
 		return -1;
+	}
 
 	if (session->header.needs_swap)
 		event_swap(event, evlist__sample_id_all(session->evlist));
 
-out_parse_sample:
-
 	if (sample && event->header.type < PERF_RECORD_USER_TYPE_START &&
 	    evlist__parse_sample(session->evlist, event, sample))
 		return -1;
 
-	*event_ptr = event;
-
 	return 0;
 }
 
@@ -1858,23 +1997,71 @@ static s64 perf_session__process_event(struct perf_session *session,
 {
 	struct evlist *evlist = session->evlist;
 	const struct perf_tool *tool = session->tool;
+	u32 min_sz;
 	int ret;
 
-	if (session->header.needs_swap)
-		event_swap(event, evlist__sample_id_all(evlist));
+	/*
+	 * The kernel aligns all event sizes to sizeof(u64) — see
+	 * perf_event_comm_event() (ALIGN), perf_event_mmap_event(),
+	 * perf_event_cgroup(), perf_event_ksymbol() (IS_ALIGNED loops),
+	 * and perf_event_text_poke() (ALIGN) in kernel/events/core.c.
+	 *
+	 * An unaligned size means the file is corrupted or crafted.
+	 * Abort: there is no point continuing to read unaligned records
+	 * because the caller advances rd->head by event->header.size,
+	 * so every subsequent read would start at a misaligned offset,
+	 * producing garbage headers for the rest of the file.
+	 *
+	 * Exempt three legacy user events that predate the alignment rule:
+	 *
+	 * TRACING_DATA (66): struct tracing_data_event was 12 bytes before
+	 *   b39c915a4f36 ("libperf event: Ensure tracing data is multiple
+	 *   of 8 sized") added __u32 pad; old perf.data files still contain
+	 *   12-byte records.
+	 *   TODO: introduce HEADER_TRACING_DATA2 with guaranteed alignment.
+	 *
+	 * COMPRESSED (81): raw ZSTD output, arbitrary length.  Already
+	 *   superseded by COMPRESSED2 (83) with PERF_ALIGN.
+	 *
+	 * HEADER_FEATURE (80): do_write_string() uses a 4-byte length
+	 *   prefix with no padding to 8-byte total.
+	 *   TODO: introduce HEADER_FEATURE2 with guaranteed alignment.
+	 */
+	if (event->header.size % sizeof(u64) &&
+	    event->header.type != PERF_RECORD_HEADER_TRACING_DATA &&
+	    event->header.type != PERF_RECORD_COMPRESSED &&
+	    event->header.type != PERF_RECORD_HEADER_FEATURE) {
+		pr_err("ERROR: %s event size %u is not 8-byte aligned, aborting\n",
+		       perf_event__name(event->header.type),
+		       event->header.size);
+		return -EINVAL;
+	}
 
 	if (event->header.type >= PERF_RECORD_HEADER_MAX) {
-		/* perf should not support unaligned event, stop here. */
-		if (event->header.size % sizeof(u64))
-			return -EINVAL;
-
 		/* This perf is outdated and does not support the latest event type. */
 		ui__warning("Unsupported header type %u, please consider updating perf.\n",
 			    event->header.type);
-		/* Skip unsupported event by returning its size. */
-		return event->header.size;
+		/*
+		 * Return 0 to skip: the caller (reader__read_event)
+		 * already advances by event->header.size.
+		 */
+		return 0;
 	}
 
+	/*
+	 * Skip rather than abort: a too-small-but-aligned event
+	 * can be safely stepped over without misaligning the stream.
+	 */
+	if (perf_event__too_small(event, &min_sz)) {
+		pr_warning("WARNING: %s event size %u too small (min %u), skipping\n",
+			   perf_event__name(event->header.type),
+			   event->header.size, min_sz);
+		return 0;
+	}
+
+	if (session->header.needs_swap)
+		event_swap(event, evlist__sample_id_all(evlist));
+
 	events_stats__inc(&evlist->stats, event->header.type);
 
 	if (event->header.type >= PERF_RECORD_USER_TYPE_START)
-- 
2.54.0



* [PATCH 02/29] perf session: Bounds-check one_mmap event pointer in peek_event
  2026-05-26 21:17 [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
  2026-05-26 21:17 ` [PATCH 01/29] perf session: Add minimum event size and alignment validation Arnaldo Carvalho de Melo
@ 2026-05-26 21:17 ` Arnaldo Carvalho de Melo
  2026-05-26 22:00   ` sashiko-bot
  2026-05-26 21:17 ` [PATCH 03/29] perf tools: Fix event_contains() macro to verify full field extent Arnaldo Carvalho de Melo
                   ` (27 subsequent siblings)
  29 siblings, 1 reply; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-05-26 21:17 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot,
	Claude Opus 4.6 (1M context)

From: Arnaldo Carvalho de Melo <acme@redhat.com>

perf_session__peek_event() computes an event pointer directly from
file_offset when one_mmap is active, without verifying that file_offset
and the subsequent event->header.size fall within the mapped region.
A corrupted perf.data file could cause out-of-bounds memory reads.

Add one_mmap_size to the session struct and validate both the header
and full event fit within the mmap before dereferencing.
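
The order of the checks matters, so here is a standalone sketch of the
same idea under simplified assumptions.  struct ev_header and
event_in_map() are hypothetical; the real code works on the session
fields directly.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the 8-byte struct perf_event_header. */
struct ev_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

/*
 * Return a pointer to the event at @file_offset if both its header and
 * its full declared size lie inside the mapping, NULL otherwise.
 */
static const void *event_in_map(const uint8_t *map_addr, uint64_t map_offset,
				uint64_t map_size, uint64_t file_offset)
{
	struct ev_header hdr;
	uint64_t off;

	/* Integer comparison first, before any pointer is formed. */
	if (file_offset < map_offset)
		return NULL;
	off = file_offset - map_offset;

	/* Compare remaining bytes by subtraction so off + size cannot wrap. */
	if (off >= map_size || map_size - off < sizeof(hdr))
		return NULL;

	/* The header is known to be in bounds, so it is safe to read. */
	memcpy(&hdr, map_addr + off, sizeof(hdr));

	if (hdr.size < sizeof(hdr) || map_size - off < hdr.size)
		return NULL;

	return map_addr + off;
}
```

Each step only reads memory that the previous step proved to be inside
the mapping: offset, then header, then the full event.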

Reported-by: sashiko-bot@kernel.org # Running on a local machine
Cc: Ian Rogers <irogers@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/session.c | 29 ++++++++++++++++++++++++++---
 tools/perf/util/session.h |  2 ++
 2 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 0523fd243e02c09b..c4cd8ad6d810a74c 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1887,12 +1887,27 @@ int perf_session__peek_event(struct perf_session *session, off_t file_offset,
 	*event_ptr = NULL;
 
 	if (session->one_mmap && !session->header.needs_swap) {
-		event = file_offset - session->one_mmap_offset +
-			session->one_mmap_addr;
+		u64 offset_in_mmap;
+
+		/* Validate offset with integer arithmetic to avoid pointer UB */
+		if ((u64)file_offset < session->one_mmap_offset)
+			return -1;
+
+		offset_in_mmap = (u64)file_offset - session->one_mmap_offset;
+
+		/* Use subtraction to avoid addition overflow */
+		if (offset_in_mmap >= session->one_mmap_size ||
+		    session->one_mmap_size - offset_in_mmap < sizeof(struct perf_event_header))
+			return -1;
+
+		event = session->one_mmap_addr + offset_in_mmap;
 
-		/* Every event must at least contain its own header */
 		if (event->header.size < sizeof(struct perf_event_header))
 			return -1;
+
+		/* Ensure full event is within the mmap region */
+		if (session->one_mmap_size - offset_in_mmap < event->header.size)
+			return -1;
 	} else {
 		if (perf_data__is_pipe(session->data))
 			return -1;
@@ -2560,6 +2575,14 @@ reader__mmap(struct reader *rd, struct perf_session *session)
 	if (session->one_mmap) {
 		session->one_mmap_addr = buf;
 		session->one_mmap_offset = rd->file_offset;
+		/*
+		 * mmap_size was set to the full file extent (data_offset +
+		 * data_size) but file_offset was shifted forward by
+		 * page_offset for page alignment.  Reduce by page_offset
+		 * so the bounds check reflects the file-backed portion
+		 * of the mapping — pages beyond the file cause SIGBUS.
+		 */
+		session->one_mmap_size = rd->mmap_size - page_offset;
 	}
 
 	return 0;
diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
index f05f0d4a6c238dc8..d554e2a1a50ed304 100644
--- a/tools/perf/util/session.h
+++ b/tools/perf/util/session.h
@@ -71,6 +71,8 @@ struct perf_session {
 	void			*one_mmap_addr;
 	/** @one_mmap_offset: File offset in perf.data file when mapped. */
 	u64			one_mmap_offset;
+	/** @one_mmap_size: Size of the single mmap in bytes. */
+	u64			one_mmap_size;
 	/** @ordered_events: Used to turn unordered events into ordered ones. */
 	struct ordered_events	ordered_events;
 	/** @data: Optional perf data file being read from. */
-- 
2.54.0



* [PATCH 03/29] perf tools: Fix event_contains() macro to verify full field extent
  2026-05-26 21:17 [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
  2026-05-26 21:17 ` [PATCH 01/29] perf session: Add minimum event size and alignment validation Arnaldo Carvalho de Melo
  2026-05-26 21:17 ` [PATCH 02/29] perf session: Bounds-check one_mmap event pointer in peek_event Arnaldo Carvalho de Melo
@ 2026-05-26 21:17 ` Arnaldo Carvalho de Melo
  2026-05-26 21:17 ` [PATCH 04/29] perf zstd: Fix compression error path in zstd_compress_stream_to_records() Arnaldo Carvalho de Melo
                   ` (26 subsequent siblings)
  29 siblings, 0 replies; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-05-26 21:17 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot,
	Claude Opus 4.6 (1M context)

From: Arnaldo Carvalho de Melo <acme@redhat.com>

event_contains() checked whether a field's start offset was within
the event (header.size > offsetof), but not whether the full field
fit.  A crafted event with header.size = offsetof(field) + 1 would
pass the check, but an 8-byte access (bswap_64, direct read) would
overrun the event boundary by up to 7 bytes.

Fix the macro to verify the complete field:

  header.size >= offsetof(field) + sizeof(field)
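
A small sketch of the difference, using a simplified struct that is not
the real perf_record_time_conv.  contains_start() and contains_full()
are hypothetical names for the old and new forms of the check, and
__typeof__ is a GNU extension.

```c
#include <stddef.h>
#include <stdint.h>

struct hdr {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

/* Simplified, hypothetical layout: time_cycles sits at offset 32. */
struct time_conv_like {
	struct hdr header;
	uint64_t time_shift;
	uint64_t time_mult;
	uint64_t time_zero;
	uint64_t time_cycles;
};

/* Old form: only checks that the field's first byte is inside the event. */
#define contains_start(obj, mem) \
	((obj).header.size > offsetof(__typeof__(obj), mem))

/* New form: checks that the whole field is inside the event. */
#define contains_full(obj, mem) \
	((obj).header.size >= offsetof(__typeof__(obj), mem) + sizeof((obj).mem))
```

With header.size = 33 the old form accepts time_cycles even though only
one of its eight bytes is present, while the new form requires at least
40.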

Also update all callers that check event_contains(time_cycles) but
access later fields (time_mask, cap_user_time_zero,
cap_user_time_short) to check for cap_user_time_short — the last
field accessed — so the entire extended block is verified:
tsc.c, arm-spe.c, cs-etm.c, jitdump.c.

Note: session.c's perf_event__time_conv_swap() also guards on
time_cycles but accesses time_mask — a pre-existing issue not
introduced by this macro change.  It is fixed by a later patch
in this series ("perf session: Add validated swap
infrastructure with null-termination checks"), which decouples
time_cycles and time_mask into independent per-field
event_contains() checks.  The struct assignment overread
(session->time_conv = event->time_conv copies sizeof on a
potentially shorter event) is separately fixed by "perf
session: Use bounded copy for PERF_RECORD_TIME_CONV".

Reported-by: sashiko-bot@kernel.org # Running on a local machine
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/lib/perf/include/perf/event.h | 9 ++++++++-
 tools/perf/util/arm-spe.c           | 2 +-
 tools/perf/util/cs-etm.c            | 2 +-
 tools/perf/util/jitdump.c           | 2 +-
 tools/perf/util/tsc.c               | 2 +-
 5 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/tools/lib/perf/include/perf/event.h b/tools/lib/perf/include/perf/event.h
index 9043dc72b5d68d58..fdced574c889e503 100644
--- a/tools/lib/perf/include/perf/event.h
+++ b/tools/lib/perf/include/perf/event.h
@@ -8,7 +8,14 @@
 #include <linux/bpf.h>
 #include <sys/types.h> /* pid_t */
 
-#define event_contains(obj, mem) ((obj).header.size > offsetof(typeof(obj), mem))
+/*
+ * Verify the full field fits within the event, not just its start offset.
+ * Only valid for fixed-size scalar fields — for trailing arrays like
+ * filename[PATH_MAX], sizeof() evaluates to the declared maximum, not
+ * the actual string length, so this would spuriously return false.
+ */
+#define event_contains(obj, mem) \
+	((obj).header.size >= offsetof(typeof(obj), mem) + sizeof((obj).mem))
 
 struct perf_record_mmap {
 	struct perf_event_header header;
diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index 31f05f46781092c1..552f063f126e6769 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -2002,7 +2002,7 @@ int arm_spe_process_auxtrace_info(union perf_event *event,
 	spe->tc.time_mult = tc->time_mult;
 	spe->tc.time_zero = tc->time_zero;
 
-	if (event_contains(*tc, time_cycles)) {
+	if (event_contains(*tc, cap_user_time_short)) {
 		spe->tc.time_cycles = tc->time_cycles;
 		spe->tc.time_mask = tc->time_mask;
 		spe->tc.cap_user_time_zero = tc->cap_user_time_zero;
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 6ec48de29441012f..40c6ddfa8c8d91b6 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -3514,7 +3514,7 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event,
 	etm->tc.time_shift = tc->time_shift;
 	etm->tc.time_mult = tc->time_mult;
 	etm->tc.time_zero = tc->time_zero;
-	if (event_contains(*tc, time_cycles)) {
+	if (event_contains(*tc, cap_user_time_short)) {
 		etm->tc.time_cycles = tc->time_cycles;
 		etm->tc.time_mask = tc->time_mask;
 		etm->tc.cap_user_time_zero = tc->cap_user_time_zero;
diff --git a/tools/perf/util/jitdump.c b/tools/perf/util/jitdump.c
index 52e6ffac2b3e1039..18fd84a82153c2ab 100644
--- a/tools/perf/util/jitdump.c
+++ b/tools/perf/util/jitdump.c
@@ -409,7 +409,7 @@ static uint64_t convert_timestamp(struct jit_buf_desc *jd, uint64_t timestamp)
 	 * checks the event size and assigns these extended fields if these
 	 * fields are contained in the event.
 	 */
-	if (event_contains(*time_conv, time_cycles)) {
+	if (event_contains(*time_conv, cap_user_time_short)) {
 		tc.time_cycles	       = time_conv->time_cycles;
 		tc.time_mask	       = time_conv->time_mask;
 		tc.cap_user_time_zero  = time_conv->cap_user_time_zero;
diff --git a/tools/perf/util/tsc.c b/tools/perf/util/tsc.c
index 511a517ce613dff1..ebf289bf6b9d9add 100644
--- a/tools/perf/util/tsc.c
+++ b/tools/perf/util/tsc.c
@@ -127,7 +127,7 @@ size_t perf_event__fprintf_time_conv(union perf_event *event, FILE *fp)
 	 * when supported cap_user_time_short, for backward compatibility,
 	 * prints the extended fields only if they are contained in the event.
 	 */
-	if (event_contains(*tc, time_cycles)) {
+	if (event_contains(*tc, cap_user_time_short)) {
 		ret += fprintf(fp, "... Time Cycles     %" PRI_lu64 "\n",
 			       tc->time_cycles);
 		ret += fprintf(fp, "... Time Mask       %#" PRI_lx64 "\n",
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 04/29] perf zstd: Fix compression error path in zstd_compress_stream_to_records()
  2026-05-26 21:17 [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
                   ` (2 preceding siblings ...)
  2026-05-26 21:17 ` [PATCH 03/29] perf tools: Fix event_contains() macro to verify full field extent Arnaldo Carvalho de Melo
@ 2026-05-26 21:17 ` Arnaldo Carvalho de Melo
  2026-05-26 22:00   ` sashiko-bot
  2026-05-26 21:17 ` [PATCH 05/29] perf zstd: Fix multi-iteration decompression and error handling Arnaldo Carvalho de Melo
                   ` (25 subsequent siblings)
  29 siblings, 1 reply; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-05-26 21:17 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot,
	Claude Opus 4.6 (1M context)

From: Arnaldo Carvalho de Melo <acme@redhat.com>

The error fallback does memcpy(dst, src, src_size) intending to store
uncompressed data when compression fails, but this has three bugs:

1. dst has been advanced past the record header (and potentially
   past earlier compressed records), so the copy writes to the
   wrong offset in the output buffer.

2. src still points to the start of the input, not to the
   remaining uncompressed data at src + input.pos.  On a second
   or later iteration, previously compressed data would be
   duplicated.

3. No check that dst_size >= src_size — if the remaining output
   space is smaller, this is an out-of-bounds write.

Replace with return -1 after resetting the ZSTD compression
context via ZSTD_initCStream().  The -1 propagates through
zstd_compress() -> record__pushfn() -> perf_mmap__push() to the
recording loop, which breaks out and terminates recording.

Add an out_child_no_flush label in __cmd_record() so the
mmap-read failure path skips the final record__mmap_read_all()
flush: retrying the read that just failed would fail again, and
the flush is only useful when the mmap data is intact but the
control path (auxtrace, switch_output) had an error.

Consolidate all error paths through a single 'reset' label to
ensure the compression context is always reset on failure —
including the output-buffer-full path, where a bare return
without resetting would leave stale stream state that corrupts
output if the caller retries.

Also guard against process_header() writing the event header
before the buffer-full check: add a sizeof(struct perf_event_header)
pre-check so the callback never writes past the output buffer.

Guard against ZSTD making no progress: if output.pos is zero
after ZSTD_compressStream(), calling process_header(record, 0)
would re-trigger header initialization, double-subtracting the
header size from dst_size and underflowing the unsigned counter.

Also fix two pre-existing issues in the same function:

- Add a dst_size guard before subtracting the record header
  size: if the output buffer is nearly full, the unsigned
  dst_size -= size underflows to a huge value, causing
  ZSTD_compressStream to write past the buffer boundary.

- Check the ZSTD_initCStream() return value and log an error
  if the context reset itself fails.
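
As an illustration of the dst_size guard described above, here is a
minimal sketch outside of perf.  The helper name reserve_bytes() is
hypothetical; the only point is that the comparison has to come
before the subtraction when the counter is a size_t.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not a perf function: consume `need` bytes from
 * an output budget tracked in an unsigned counter.
 *
 * Returns 0 and shrinks *remaining on success, -1 (leaving *remaining
 * untouched) when the budget is too small.  Without the comparison,
 * `*remaining -= need` would wrap around to a huge value.
 */
static int reserve_bytes(size_t *remaining, size_t need)
{
	assert(remaining != NULL);

	if (need > *remaining)
		return -1;

	*remaining -= need;
	return 0;
}
```

With 12 bytes left, a first 8-byte reservation succeeds and leaves 4;
a second one is rejected and the counter stays at 4 instead of
wrapping.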

Reported-by: sashiko-bot@kernel.org # Running on a local machine
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-record.c |  6 +++++-
 tools/perf/util/zstd.c      | 27 +++++++++++++++++++++++++--
 2 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index cc601796b2c8ae60..f1877bac815d76b2 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -2743,7 +2743,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 			trigger_error(&auxtrace_snapshot_trigger);
 			trigger_error(&switch_output_trigger);
 			err = -1;
-			goto out_child;
+			goto out_child_no_flush;
 		}
 
 		if (auxtrace_record__snapshot_started) {
@@ -2890,6 +2890,10 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 out_child:
 	record__stop_threads(rec);
 	record__mmap_read_all(rec, true);
+	goto out_free_threads;
+out_child_no_flush:
+	/* mmap read already failed — retrying would just fail again */
+	record__stop_threads(rec);
 out_free_threads:
 	record__free_thread_data(rec);
 	evlist__finalize_ctlfd(rec->evlist);
diff --git a/tools/perf/util/zstd.c b/tools/perf/util/zstd.c
index 57027e0ac7b658a8..ecda9deb53b738fa 100644
--- a/tools/perf/util/zstd.c
+++ b/tools/perf/util/zstd.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 
 #include <string.h>
+#include <linux/perf_event.h>
 
 #include "util/compress.h"
 #include "util/debug.h"
@@ -54,7 +55,13 @@ ssize_t zstd_compress_stream_to_records(struct zstd_data *data, void *dst, size_
 
 	while (input.pos < input.size) {
 		record = dst;
+		/* process_header writes the event header into record */
+		if (dst_size < sizeof(struct perf_event_header))
+			goto reset;
 		size = process_header(record, 0);
+		/* Output buffer full — cannot fit even the record header */
+		if (size > dst_size)
+			goto reset;
 		compressed += size;
 		dst += size;
 		dst_size -= size;
@@ -65,10 +72,18 @@ ssize_t zstd_compress_stream_to_records(struct zstd_data *data, void *dst, size_
 		if (ZSTD_isError(ret)) {
 			pr_err("failed to compress %ld bytes: %s\n",
 				(long)src_size, ZSTD_getErrorName(ret));
-			memcpy(dst, src, src_size);
-			return src_size;
+			goto reset;
 		}
 		size = output.pos;
+		/*
+		 * No progress: ZSTD couldn't emit any bytes into the
+		 * remaining output buffer.  Calling process_header
+		 * with size=0 would re-trigger header initialization,
+		 * double-subtracting the header size from dst_size and
+		 * underflowing the unsigned counter.
+		 */
+		if (size == 0)
+			goto reset;
 		size = process_header(record, size);
 		compressed += size;
 		dst += size;
@@ -76,6 +91,14 @@ ssize_t zstd_compress_stream_to_records(struct zstd_data *data, void *dst, size_
 	}
 
 	return compressed;
+
+reset:
+	/* Reset so the context is usable if the caller retries */
+	ret = ZSTD_initCStream(data->cstream, data->comp_level);
+	if (ZSTD_isError(ret))
+		pr_err("failed to reset compression context: %s\n",
+			ZSTD_getErrorName(ret));
+	return -1;
 }
 
 size_t zstd_decompress_stream(struct zstd_data *data, void *src, size_t src_size,
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 05/29] perf zstd: Fix multi-iteration decompression and error handling
  2026-05-26 21:17 [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
                   ` (3 preceding siblings ...)
  2026-05-26 21:17 ` [PATCH 04/29] perf zstd: Fix compression error path in zstd_compress_stream_to_records() Arnaldo Carvalho de Melo
@ 2026-05-26 21:17 ` Arnaldo Carvalho de Melo
  2026-05-26 21:49   ` sashiko-bot
  2026-05-26 21:17 ` [PATCH 06/29] perf session: Fix PERF_RECORD_READ swap and dump for variable-length events Arnaldo Carvalho de Melo
                   ` (24 subsequent siblings)
  29 siblings, 1 reply; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-05-26 21:17 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot,
	Claude Opus 4.6 (1M context)

From: Arnaldo Carvalho de Melo <acme@redhat.com>

zstd_decompress_stream() has two bugs in its multi-iteration loop:

1. After each ZSTD_decompressStream() call, the code advances
   output.dst by output.pos but doesn't reset output.pos to 0.
   ZSTD interprets output.pos relative to output.dst, so the
   next iteration writes at (dst + pos) + pos = dst + 2*pos,
   skipping a gap and potentially writing out of bounds.

2. On ZSTD_decompressStream() error, the loop executes break
   and returns output.pos (which is > 0 if some bytes were
   decompressed before the error).  The caller only checks
   !decomp_size, so it misses the error and silently accepts
   truncated or corrupted data.

Fix the first by removing the output buffer adjustment, since
ZSTD correctly accumulates output.pos across calls without it.
Fix the second by returning 0 on decompression error so the
caller detects it.  Add a no-progress guard to prevent infinite
loops if the output buffer fills before all input is consumed.
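
The pos accounting can be sketched without linking against ZSTD.
In the sketch below, struct out_buf only mirrors the dst/size/pos
fields of ZSTD_outBuffer, and emit() is a hypothetical stand-in for
a streaming call that writes at dst + pos and then advances pos.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in with the same three fields as ZSTD_outBuffer. */
struct out_buf {
	void *dst;
	size_t size;
	size_t pos;
};

/*
 * Mock producer (hypothetical, not a ZSTD call): like a streaming
 * decompressor, it writes at dst + pos and then advances pos.
 * Returns the number of bytes written, clamped to the space left.
 */
static size_t emit(struct out_buf *out, const char *src, size_t n)
{
	size_t room;

	assert(out->pos <= out->size);
	room = out->size - out->pos;
	if (n > room)
		n = room;
	memcpy((char *)out->dst + out->pos, src, n);
	out->pos += n;
	return n;
}

/* Correct pattern: leave dst and size alone, let pos accumulate. */
static size_t fill_twice(char *dst, size_t dst_size)
{
	struct out_buf out = { .dst = dst, .size = dst_size, .pos = 0 };

	emit(&out, "abcd", 4);
	emit(&out, "efgh", 4);
	return out.pos;
}
```

Had the caller also advanced dst by pos after the first call, the
second chunk would have started at offset 8 rather than 4, which is
the dst + 2*pos gap described in item 1.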

Note: the compressed event data_size is validated against
header.size by a subsequent patch in this series
("perf tools: Harden compressed event processing").

Reported-by: sashiko-bot@kernel.org # Running on a local machine
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/zstd.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/zstd.c b/tools/perf/util/zstd.c
index ecda9deb53b738fa..21a0eb58597c21f9 100644
--- a/tools/perf/util/zstd.c
+++ b/tools/perf/util/zstd.c
@@ -123,14 +123,26 @@ size_t zstd_decompress_stream(struct zstd_data *data, void *src, size_t src_size
 		}
 	}
 	while (input.pos < input.size) {
+		size_t prev_in = input.pos;
+		size_t prev_out = output.pos;
+
 		ret = ZSTD_decompressStream(data->dstream, &output, &input);
 		if (ZSTD_isError(ret)) {
 			pr_err("failed to decompress (B): %zd -> %zd, dst_size %zd : %s\n",
-			       src_size, output.size, dst_size, ZSTD_getErrorName(ret));
-			break;
+			       src_size, output.pos, dst_size, ZSTD_getErrorName(ret));
+			return 0;
 		}
-		output.dst  = dst + output.pos;
-		output.size = dst_size - output.pos;
+		/*
+		 * Neither stream advanced — decompression is stuck.
+		 * Return 0 (error) rather than partial output: perf
+		 * uses ZSTD_flushStream (not ZSTD_endStream), so the
+		 * stream is continuous across compressed events.
+		 * Discarding unconsumed input would desynchronize the
+		 * decompressor, causing the next call to produce
+		 * garbage that could be misinterpreted as valid events.
+		 */
+		if (input.pos == prev_in && output.pos == prev_out)
+			return 0;
 	}
 
 	return output.pos;
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 06/29] perf session: Fix PERF_RECORD_READ swap and dump for variable-length events
  2026-05-26 21:17 [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
                   ` (4 preceding siblings ...)
  2026-05-26 21:17 ` [PATCH 05/29] perf zstd: Fix multi-iteration decompression and error handling Arnaldo Carvalho de Melo
@ 2026-05-26 21:17 ` Arnaldo Carvalho de Melo
  2026-05-26 21:17 ` [PATCH 07/29] perf session: Fix swap_sample_id_all() crash on crafted events Arnaldo Carvalho de Melo
                   ` (23 subsequent siblings)
  29 siblings, 0 replies; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-05-26 21:17 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Claude Opus 4.6 (1M context)

From: Arnaldo Carvalho de Melo <acme@redhat.com>

The kernel dynamically sizes PERF_RECORD_READ based on
attr.read_format: only the fields enabled by PERF_FORMAT_TOTAL_TIME_ENABLED,
PERF_FORMAT_TOTAL_TIME_RUNNING, PERF_FORMAT_ID, and PERF_FORMAT_LOST
are emitted, packed with no gaps.

perf_event__read_swap() unconditionally byte-swapped time_enabled,
time_running, and id at their fixed struct offsets, causing
out-of-bounds access on smaller events and swapping the wrong
bytes when not all format fields are present.  It also swapped
sample_id_all at a fixed offset past the full struct, which is
wrong for shorter events.

Replace the individual field swaps with a single mem_bswap_64()
over the entire tail from value onward.  Since every field after
pid/tid is u64 regardless of which combination is present, this
correctly handles any read_format combination and any trailing
sample_id_all fields.

Similarly, dump_read() accessed optional fields via fixed struct
offsets, displaying values from wrong positions when not all
format bits are set.  Walk the packed u64 array sequentially
instead, with bounds checks against event->header.size.
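
The packed walk can be sketched as follows.  The FMT_* bits and the
packed_get() helper are hypothetical stand-ins for the PERF_FORMAT_*
bits and for the sequential walk in dump_read(); this is not the
code in the patch, only the same idea in a form that can be tested
in isolation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag bits, standing in for the PERF_FORMAT_* bits. */
#define FMT_TIME_ENABLED (1u << 0)
#define FMT_TIME_RUNNING (1u << 1)
#define FMT_ID           (1u << 2)

/*
 * Hypothetical helper: find the value for `want` in a packed array
 * where only the fields enabled in `format` are present, in bit order.
 * Returns 0 and stores the value in *out, or -1 if the field is not
 * enabled or would lie past the `nr` available entries.
 */
static int packed_get(const uint64_t *array, size_t nr, unsigned int format,
		      unsigned int want, uint64_t *out)
{
	static const unsigned int order[] = {
		FMT_TIME_ENABLED, FMT_TIME_RUNNING, FMT_ID,
	};
	size_t idx = 0;
	size_t i;

	assert(array != NULL && out != NULL);

	for (i = 0; i < sizeof(order) / sizeof(order[0]); i++) {
		if (!(format & order[i]))
			continue;	/* field absent: occupies no slot */
		if (idx >= nr)
			return -1;	/* truncated event */
		if (order[i] == want) {
			*out = array[idx];
			return 0;
		}
		idx++;
	}
	return -1;
}
```

With only FMT_TIME_ENABLED and FMT_ID set, the id sits in slot 1 of
the packed array, whereas a fixed struct layout would look for it in
slot 2.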

Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/session.c | 61 ++++++++++++++++++++++++++++-----------
 1 file changed, 44 insertions(+), 17 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index c4cd8ad6d810a74c..24f2ba599b8079bd 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -354,17 +354,24 @@ static void perf_event__task_swap(union perf_event *event, bool sample_id_all)
 		swap_sample_id_all(event, &event->fork + 1);
 }
 
-static void perf_event__read_swap(union perf_event *event, bool sample_id_all)
+static void perf_event__read_swap(union perf_event *event,
+				  bool sample_id_all __maybe_unused)
 {
+	size_t tail;
+
 	event->read.pid		 = bswap_32(event->read.pid);
 	event->read.tid		 = bswap_32(event->read.tid);
-	event->read.value	 = bswap_64(event->read.value);
-	event->read.time_enabled = bswap_64(event->read.time_enabled);
-	event->read.time_running = bswap_64(event->read.time_running);
-	event->read.id		 = bswap_64(event->read.id);
-
-	if (sample_id_all)
-		swap_sample_id_all(event, &event->read + 1);
+	/*
+	 * Everything after pid/tid is u64: the read values (variable
+	 * set determined by attr.read_format, which we don't have
+	 * here) optionally followed by sample_id_all fields.
+	 * Since all are u64, swap the entire remaining tail at once.
+	 */
+	tail = event->header.size - offsetof(struct perf_record_read, value);
+	/* mem_bswap_64 rounds up to 8-byte chunks — unaligned tail overruns the buffer */
+	if (tail % sizeof(u64))
+		return;
+	mem_bswap_64(&event->read.value, tail);
 }
 
 static void perf_event__aux_swap(union perf_event *event, bool sample_id_all)
@@ -1200,8 +1207,9 @@ static void dump_deferred_callchain(union perf_event *event, struct perf_sample
 
 static void dump_read(struct evsel *evsel, union perf_event *event)
 {
-	struct perf_record_read *read_event = &event->read;
 	u64 read_format;
+	__u64 *array;
+	void *end;
 
 	if (!dump_trace)
 		return;
@@ -1213,18 +1221,37 @@ static void dump_read(struct evsel *evsel, union perf_event *event)
 		return;
 
 	read_format = evsel->core.attr.read_format;
+	/*
+	 * The kernel packs only the enabled read_format fields
+	 * after value, with no gaps.  Walk the packed array
+	 * instead of using fixed struct offsets.
+	 */
+	array = &event->read.value + 1;
+	end = (void *)event + event->header.size;
 
-	if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED)
-		printf("... time enabled : %" PRI_lu64 "\n", read_event->time_enabled);
+	if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED) {
+		if ((void *)(array + 1) > end)
+			return;
+		printf("... time enabled : %" PRI_lu64 "\n", *array++);
+	}
 
-	if (read_format & PERF_FORMAT_TOTAL_TIME_RUNNING)
-		printf("... time running : %" PRI_lu64 "\n", read_event->time_running);
+	if (read_format & PERF_FORMAT_TOTAL_TIME_RUNNING) {
+		if ((void *)(array + 1) > end)
+			return;
+		printf("... time running : %" PRI_lu64 "\n", *array++);
+	}
 
-	if (read_format & PERF_FORMAT_ID)
-		printf("... id           : %" PRI_lu64 "\n", read_event->id);
+	if (read_format & PERF_FORMAT_ID) {
+		if ((void *)(array + 1) > end)
+			return;
+		printf("... id           : %" PRI_lu64 "\n", *array++);
+	}
 
-	if (read_format & PERF_FORMAT_LOST)
-		printf("... lost         : %" PRI_lu64 "\n", read_event->lost);
+	if (read_format & PERF_FORMAT_LOST) {
+		if ((void *)(array + 1) > end)
+			return;
+		printf("... lost         : %" PRI_lu64 "\n", *array++);
+	}
 }
 
 static struct machine *machines__find_for_cpumode(struct machines *machines,
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 07/29] perf session: Fix swap_sample_id_all() crash on crafted events
  2026-05-26 21:17 [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
                   ` (5 preceding siblings ...)
  2026-05-26 21:17 ` [PATCH 06/29] perf session: Fix PERF_RECORD_READ swap and dump for variable-length events Arnaldo Carvalho de Melo
@ 2026-05-26 21:17 ` Arnaldo Carvalho de Melo
  2026-05-26 21:17 ` [PATCH 08/29] perf session: Add validated swap infrastructure with null-termination checks Arnaldo Carvalho de Melo
                   ` (22 subsequent siblings)
  29 siblings, 0 replies; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-05-26 21:17 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot,
	Claude Opus 4.6 (1M context)

From: Arnaldo Carvalho de Melo <acme@redhat.com>

swap_sample_id_all() calls BUG_ON(size % sizeof(u64)) which kills
perf on any event where the sample_id_all tail is not 8-byte aligned.
A crafted perf.data can trigger this trivially.

Replace BUG_ON with non-fatal checks: skip the swap if the data
pointer is at or past the end of the event, warn and skip if the
remaining size is not a multiple of sizeof(u64), and only swap
when there are bytes remaining.
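
A standalone sketch of the same checks, using offsets instead of
pointers.  swap_tail() and swap64() are hypothetical names, and
unlike swap_sample_id_all(), which returns void and warns, the
sketch reports a rejected tail through its return value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Portable 64-bit byte swap, so the sketch needs no compiler builtins. */
static uint64_t swap64(uint64_t v)
{
	uint64_t r = 0;
	int i;

	for (i = 0; i < 8; i++) {
		r = (r << 8) | (v & 0xff);
		v >>= 8;
	}
	return r;
}

/*
 * Hypothetical helper: byte-swap the u64 words in buf[offset..total).
 * Returns 0 if the tail was swapped or empty, -1 (buffer untouched)
 * if the offset is past the end or the tail is not a multiple of 8.
 */
static int swap_tail(unsigned char *buf, size_t total, size_t offset)
{
	size_t size, i;

	assert(buf != NULL);

	if (offset > total)
		return -1;
	size = total - offset;
	if (size % sizeof(uint64_t))
		return -1;

	for (i = 0; i < size; i += sizeof(uint64_t)) {
		uint64_t v;

		/* memcpy avoids unaligned loads and stores */
		memcpy(&v, buf + offset + i, sizeof(v));
		v = swap64(v);
		memcpy(buf + offset + i, &v, sizeof(v));
	}
	return 0;
}
```

Rejecting the unaligned case up front matters because a swap loop
that works in 8-byte steps would otherwise touch bytes beyond the
end of the event.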

Note: the strlen calls in string-field swap handlers (comm,
mmap, mmap2, cgroup) are replaced with bounded strnlen by the
next patch in this series ("perf session: Add validated swap
infrastructure with null-termination checks").

Reported-by: sashiko-bot@kernel.org # Running on a local machine
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/session.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 24f2ba599b8079bd..37544a3574185bac 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -276,10 +276,18 @@ void perf_session__delete(struct perf_session *session)
 static void swap_sample_id_all(union perf_event *event, void *data)
 {
 	void *end = (void *) event + event->header.size;
-	int size = end - data;
+	int size;
 
-	BUG_ON(size % sizeof(u64));
-	mem_bswap_64(data, size);
+	if (data >= end)
+		return;
+
+	size = end - data;
+	if (size % sizeof(u64)) {
+		pr_warning("swap_sample_id_all: unaligned sample_id_all remainder (%d), skipping swap\n", size);
+		return;
+	}
+	if (size > 0)
+		mem_bswap_64(data, size);
 }
 
 static void perf_event__all64_swap(union perf_event *event,
-- 
2.54.0


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 08/29] perf session: Add validated swap infrastructure with null-termination checks
  2026-05-26 21:17 [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
                   ` (6 preceding siblings ...)
  2026-05-26 21:17 ` [PATCH 07/29] perf session: Fix swap_sample_id_all() crash on crafted events Arnaldo Carvalho de Melo
@ 2026-05-26 21:17 ` Arnaldo Carvalho de Melo
  2026-05-26 21:55   ` sashiko-bot
  2026-05-26 21:17 ` [PATCH 09/29] perf session: Use bounded copy for PERF_RECORD_TIME_CONV Arnaldo Carvalho de Melo
                   ` (21 subsequent siblings)
  29 siblings, 1 reply; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-05-26 21:17 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot, David Carrillo-Cisneros,
	Song Liu, Claude Opus 4.6 (1M context)

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Change swap callbacks from void to int return so handlers can
propagate errors.  All 28 existing handlers are converted to
return 0 on success, -1 on error.  Three new handlers (KSYMBOL,
BPF_EVENT, HEADER_FEATURE) are added returning int from the
start, with sample_id_all handling for the kernel event types.

event_swap() propagates the return to its callers (process_event
and peek_event), which skip events that fail to swap.

Add perf_event__check_nul() for null-termination enforcement
on the common event delivery path for MMAP, MMAP2, COMM,
CGROUP, and KSYMBOL events.  Events with unterminated strings
are skipped: native-endian files are mapped read-only, so
writing a NUL byte in place would segfault.

Swap handler hardening:

 - Use strnlen bounded by event size (instead of strlen) in
   COMM/MMAP/MMAP2/CGROUP swap handlers, returning -1 on
   unterminated strings.

 - Bounds check text_poke old_len+new_len before computing the
   sample_id offset, returning -1 on overflow.  Use offsetof()
   for the native-path check in machines__deliver_event() since
   sizeof() includes struct padding past the flexible array.

 - Fix PERF_RECORD_SWITCH sample_id_all: non-CPU_WIDE SWITCH
   events have sample_id immediately after the 8-byte header,
   not at sizeof(struct perf_record_switch) which is the
   CPU_WIDE variant size.

 - Fix perf_event__time_conv_swap(): decouple time_cycles and
   time_mask into independent per-field event_contains() checks,
   so each field is only swapped when the event is large enough
   to contain it.  The original code guarded both fields under
   a single time_cycles check, which would swap time_mask on a
   short event that contains time_cycles but not time_mask.

 - Handle ABI0 (attr.size == 0) in perf_event__attr_swap()
   by substituting PERF_ATTR_SIZE_VER0, so bswap_safe()
   correctly swaps VER0 fields instead of skipping everything.

 - peek_events: on swap failure, advance past the malformed
   entry instead of aborting the loop.
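
The bounded string scan used by the COMM/MMAP/MMAP2/CGROUP handlers
can be sketched in isolation.  after_string() and ALIGN_UP() are
hypothetical names, and the sketch uses memchr() instead of
strnlen() only to stay within ISO C; the bound is the same.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ALIGN_UP(x, a) (((x) + ((a) - 1)) & ~((size_t)(a) - 1))

/*
 * Hypothetical helper: given a buffer of `total` bytes whose string
 * field begins at `str_off`, return the offset of the first byte after
 * the NUL-terminated string, padded to 8 bytes.  Returns (size_t)-1
 * when no NUL is found before the end of the buffer.
 */
static size_t after_string(const char *buf, size_t total, size_t str_off)
{
	const char *nul;
	size_t avail, len;

	assert(buf != NULL);

	if (str_off >= total)
		return (size_t)-1;
	avail = total - str_off;

	/* The scan never looks past the end of the buffer. */
	nul = memchr(buf + str_off, '\0', avail);
	if (!nul)
		return (size_t)-1;	/* unterminated string */
	len = (size_t)(nul - (buf + str_off));

	return str_off + ALIGN_UP(len + 1, 8);
}
```

The returned offset can still land at or past total when the
padding is missing, so the caller has to compare it against the
event size before using it; in this series that is the data >= end
check in swap_sample_id_all().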

Note: the nr-field bounds checks for namespaces, thread_map,
cpu_map, and stat_config arrays are added by a subsequent
patch ("perf session: Validate nr fields against event size
on both swap and common paths").  The HEADER_ATTR attr.size
validation is added by ("perf session: Validate HEADER_ATTR
attr.size before swapping").

By establishing the int-returning swap infrastructure first,
all subsequent hardening patches can use direct error returns
from day one — no poison values, no workarounds for void return.
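
A minimal sketch of what int-returning handlers buy the caller.
Every name here (struct event, swap_fn, swap_table, deliver_all())
is hypothetical and much simpler than the perf types; in particular,
skipping unknown types is a choice made for the sketch, not a
description of what perf does.

```c
#include <assert.h>
#include <stddef.h>

/* Toy event: just a type index and a claimed size. */
struct event {
	unsigned int type;
	unsigned int size;
};

typedef int (*swap_fn)(struct event *ev);

/* Handler that can never fail. */
static int swap_ok(struct event *ev)
{
	(void)ev;
	return 0;
}

/* Handler that rejects events too short for its fields. */
static int swap_needs_16(struct event *ev)
{
	return ev->size >= 16 ? 0 : -1;
}

static const swap_fn swap_table[] = { swap_ok, swap_needs_16 };

/*
 * Run each event through its handler and count the ones that pass.
 * A handler returning < 0 causes that one event to be skipped
 * instead of aborting the whole loop.
 */
static size_t deliver_all(struct event *evs, size_t nr)
{
	size_t i, delivered = 0;

	assert(evs != NULL);

	for (i = 0; i < nr; i++) {
		if (evs[i].type >= sizeof(swap_table) / sizeof(swap_table[0]))
			continue;	/* sketch only: unknown type is skipped */
		if (swap_table[evs[i].type](&evs[i]) < 0)
			continue;	/* malformed: skip this event */
		delivered++;
	}
	return delivered;
}
```

With a void return, swap_needs_16() would have no way to tell the
loop that the short event must not be delivered.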

Changes in v2:
- peek_events: abort instead of skip for AUXTRACE events on
  validation failure — skipping only header.size would land
  inside the raw trace payload, causing subsequent iterations
  to misparse data as events (Reported-by: sashiko-bot@kernel.org)

Fixes: 9aa0bfa370b2 ("perf tools: Handle PERF_RECORD_KSYMBOL")
Fixes: 45178a928a4b ("perf tools: Handle PERF_RECORD_BPF_EVENT")
Fixes: e9def1b2e74e ("perf tools: Add feature header record to pipe-mode")
Reported-by: sashiko-bot@kernel.org # Running on a local machine
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Carrillo-Cisneros <davidcc@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/session.c | 406 ++++++++++++++++++++++++++++++--------
 1 file changed, 325 insertions(+), 81 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 37544a3574185bac..d5864e380c1bd52e 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -290,28 +290,44 @@ static void swap_sample_id_all(union perf_event *event, void *data)
 		mem_bswap_64(data, size);
 }
 
-static void perf_event__all64_swap(union perf_event *event,
-				   bool sample_id_all __maybe_unused)
+static int perf_event__all64_swap(union perf_event *event,
+				  bool sample_id_all __maybe_unused)
 {
 	struct perf_event_header *hdr = &event->header;
-	mem_bswap_64(hdr + 1, event->header.size - sizeof(*hdr));
+	size_t size = event->header.size - sizeof(*hdr);
+
+	/* mem_bswap_64 rounds up to 8-byte chunks — unaligned size overruns the buffer */
+	if (size % sizeof(u64))
+		return -1;
+	mem_bswap_64(hdr + 1, size);
+	return 0;
 }
 
-static void perf_event__comm_swap(union perf_event *event, bool sample_id_all)
+static int perf_event__comm_swap(union perf_event *event, bool sample_id_all)
 {
 	event->comm.pid = bswap_32(event->comm.pid);
 	event->comm.tid = bswap_32(event->comm.tid);
 
 	if (sample_id_all) {
 		void *data = &event->comm.comm;
+		void *end = (void *)event + event->header.size;
+		size_t len = strnlen(data, end - data);
 
-		data += PERF_ALIGN(strlen(data) + 1, sizeof(u64));
+		/*
+		 * No NUL within the event boundary — can't locate where
+		 * sample_id_all starts.  Reject so the event is skipped
+		 * rather than swapping garbage.
+		 */
+		if (len == (size_t)(end - data))
+			return -1;
+		data += PERF_ALIGN(len + 1, sizeof(u64));
 		swap_sample_id_all(event, data);
 	}
+	return 0;
 }
 
-static void perf_event__mmap_swap(union perf_event *event,
-				  bool sample_id_all)
+static int perf_event__mmap_swap(union perf_event *event,
+				 bool sample_id_all)
 {
 	event->mmap.pid	  = bswap_32(event->mmap.pid);
 	event->mmap.tid	  = bswap_32(event->mmap.tid);
@@ -321,13 +337,19 @@ static void perf_event__mmap_swap(union perf_event *event,
 
 	if (sample_id_all) {
 		void *data = &event->mmap.filename;
+		void *end = (void *)event + event->header.size;
+		size_t len = strnlen(data, end - data);
 
-		data += PERF_ALIGN(strlen(data) + 1, sizeof(u64));
+		/* See comment in perf_event__comm_swap() */
+		if (len == (size_t)(end - data))
+			return -1;
+		data += PERF_ALIGN(len + 1, sizeof(u64));
 		swap_sample_id_all(event, data);
 	}
+	return 0;
 }
 
-static void perf_event__mmap2_swap(union perf_event *event,
+static int perf_event__mmap2_swap(union perf_event *event,
 				  bool sample_id_all)
 {
 	event->mmap2.pid   = bswap_32(event->mmap2.pid);
@@ -345,12 +367,19 @@ static void perf_event__mmap2_swap(union perf_event *event,
 
 	if (sample_id_all) {
 		void *data = &event->mmap2.filename;
+		void *end = (void *)event + event->header.size;
+		size_t len = strnlen(data, end - data);
 
-		data += PERF_ALIGN(strlen(data) + 1, sizeof(u64));
+		/* See comment in perf_event__comm_swap() */
+		if (len == (size_t)(end - data))
+			return -1;
+		data += PERF_ALIGN(len + 1, sizeof(u64));
 		swap_sample_id_all(event, data);
 	}
+	return 0;
 }
-static void perf_event__task_swap(union perf_event *event, bool sample_id_all)
+
+static int perf_event__task_swap(union perf_event *event, bool sample_id_all)
 {
 	event->fork.pid	 = bswap_32(event->fork.pid);
 	event->fork.tid	 = bswap_32(event->fork.tid);
@@ -360,10 +389,11 @@ static void perf_event__task_swap(union perf_event *event, bool sample_id_all)
 
 	if (sample_id_all)
 		swap_sample_id_all(event, &event->fork + 1);
+	return 0;
 }
 
-static void perf_event__read_swap(union perf_event *event,
-				  bool sample_id_all __maybe_unused)
+static int perf_event__read_swap(union perf_event *event,
+				 bool sample_id_all __maybe_unused)
 {
 	size_t tail;
 
@@ -378,11 +408,12 @@ static void perf_event__read_swap(union perf_event *event,
 	tail = event->header.size - offsetof(struct perf_record_read, value);
 	/* mem_bswap_64 rounds up to 8-byte chunks — unaligned tail overruns the buffer */
 	if (tail % sizeof(u64))
-		return;
+		return -1;
 	mem_bswap_64(&event->read.value, tail);
+	return 0;
 }
 
-static void perf_event__aux_swap(union perf_event *event, bool sample_id_all)
+static int perf_event__aux_swap(union perf_event *event, bool sample_id_all)
 {
 	event->aux.aux_offset = bswap_64(event->aux.aux_offset);
 	event->aux.aux_size   = bswap_64(event->aux.aux_size);
@@ -390,19 +421,21 @@ static void perf_event__aux_swap(union perf_event *event, bool sample_id_all)
 
 	if (sample_id_all)
 		swap_sample_id_all(event, &event->aux + 1);
+	return 0;
 }
 
-static void perf_event__itrace_start_swap(union perf_event *event,
-					  bool sample_id_all)
+static int perf_event__itrace_start_swap(union perf_event *event,
+					 bool sample_id_all)
 {
 	event->itrace_start.pid	 = bswap_32(event->itrace_start.pid);
 	event->itrace_start.tid	 = bswap_32(event->itrace_start.tid);
 
 	if (sample_id_all)
 		swap_sample_id_all(event, &event->itrace_start + 1);
+	return 0;
 }
 
-static void perf_event__switch_swap(union perf_event *event, bool sample_id_all)
+static int perf_event__switch_swap(union perf_event *event, bool sample_id_all)
 {
 	if (event->header.type == PERF_RECORD_SWITCH_CPU_WIDE) {
 		event->context_switch.next_prev_pid =
@@ -411,30 +444,45 @@ static void perf_event__switch_swap(union perf_event *event, bool sample_id_all)
 				bswap_32(event->context_switch.next_prev_tid);
 	}
 
-	if (sample_id_all)
-		swap_sample_id_all(event, &event->context_switch + 1);
+	if (sample_id_all) {
+		/*
+		 * PERF_RECORD_SWITCH has no fields beyond the header;
+		 * SWITCH_CPU_WIDE adds pid/tid.  Use the right offset
+		 * so sample_id starts at the correct position.
+		 */
+		if (event->header.type == PERF_RECORD_SWITCH)
+			swap_sample_id_all(event, (void *)event + sizeof(event->header));
+		else
+			swap_sample_id_all(event, &event->context_switch + 1);
+	}
+	return 0;
 }
 
-static void perf_event__text_poke_swap(union perf_event *event, bool sample_id_all)
+static int perf_event__text_poke_swap(union perf_event *event, bool sample_id_all)
 {
 	event->text_poke.addr    = bswap_64(event->text_poke.addr);
 	event->text_poke.old_len = bswap_16(event->text_poke.old_len);
 	event->text_poke.new_len = bswap_16(event->text_poke.new_len);
 
 	if (sample_id_all) {
+		void *data = &event->text_poke.old_len;
+		void *end = (void *)event + event->header.size;
 		size_t len = sizeof(event->text_poke.old_len) +
 			     sizeof(event->text_poke.new_len) +
 			     event->text_poke.old_len +
 			     event->text_poke.new_len;
-		void *data = &event->text_poke.old_len;
 
+		/* old_len + new_len exceeds event — can't find sample_id_all */
+		if (data + len > end)
+			return -1;
 		data += PERF_ALIGN(len, sizeof(u64));
 		swap_sample_id_all(event, data);
 	}
+	return 0;
 }
 
-static void perf_event__throttle_swap(union perf_event *event,
-				      bool sample_id_all)
+static int perf_event__throttle_swap(union perf_event *event,
+				     bool sample_id_all)
 {
 	event->throttle.time	  = bswap_64(event->throttle.time);
 	event->throttle.id	  = bswap_64(event->throttle.id);
@@ -442,10 +490,11 @@ static void perf_event__throttle_swap(union perf_event *event,
 
 	if (sample_id_all)
 		swap_sample_id_all(event, &event->throttle + 1);
+	return 0;
 }
 
-static void perf_event__namespaces_swap(union perf_event *event,
-					bool sample_id_all)
+static int perf_event__namespaces_swap(union perf_event *event,
+				       bool sample_id_all)
 {
 	u64 i;
 
@@ -462,18 +511,25 @@ static void perf_event__namespaces_swap(union perf_event *event,
 
 	if (sample_id_all)
 		swap_sample_id_all(event, &event->namespaces.link_info[i]);
+	return 0;
 }
 
-static void perf_event__cgroup_swap(union perf_event *event, bool sample_id_all)
+static int perf_event__cgroup_swap(union perf_event *event, bool sample_id_all)
 {
 	event->cgroup.id = bswap_64(event->cgroup.id);
 
 	if (sample_id_all) {
 		void *data = &event->cgroup.path;
+		void *end = (void *)event + event->header.size;
+		size_t len = strnlen(data, end - data);
 
-		data += PERF_ALIGN(strlen(data) + 1, sizeof(u64));
+		/* See comment in perf_event__comm_swap() */
+		if (len == (size_t)(end - data))
+			return -1;
+		data += PERF_ALIGN(len + 1, sizeof(u64));
 		swap_sample_id_all(event, data);
 	}
+	return 0;
 }
 
 static u8 revbyte(u8 b)
@@ -514,9 +570,19 @@ void perf_event__attr_swap(struct perf_event_attr *attr)
 	attr->type		= bswap_32(attr->type);
 	attr->size		= bswap_32(attr->size);
 
-#define bswap_safe(f, n) 					\
-	(attr->size > (offsetof(struct perf_event_attr, f) + 	\
-		       sizeof(attr->f) * (n)))
+	/*
+	 * ABI0: size == 0 means the producer didn't set it.
+	 * Assume PERF_ATTR_SIZE_VER0 so bswap_safe() below
+	 * correctly swaps the VER0 fields instead of skipping
+	 * everything.  Same convention as read_attr().
+	 */
+	if (!attr->size)
+		attr->size = PERF_ATTR_SIZE_VER0;
+
+/* Verify the full field extent fits, not just its start offset */
+#define bswap_safe(f, n)					\
+	(attr->size >= (offsetof(struct perf_event_attr, f) +	\
+			sizeof(attr->f) * ((n) + 1)))
 #define bswap_field(f, sz) 			\
 do { 						\
 	if (bswap_safe(f, 0))			\
@@ -554,8 +620,8 @@ do { 						\
 #undef bswap_safe
 }
 
-static void perf_event__hdr_attr_swap(union perf_event *event,
-				      bool sample_id_all __maybe_unused)
+static int perf_event__hdr_attr_swap(union perf_event *event,
+				     bool sample_id_all __maybe_unused)
 {
 	size_t size;
 
@@ -564,30 +630,34 @@ static void perf_event__hdr_attr_swap(union perf_event *event,
 	size = event->header.size;
 	size -= perf_record_header_attr_id(event) - (void *)event;
 	mem_bswap_64(perf_record_header_attr_id(event), size);
+	return 0;
 }
 
-static void perf_event__event_update_swap(union perf_event *event,
-					  bool sample_id_all __maybe_unused)
+static int perf_event__event_update_swap(union perf_event *event,
+					 bool sample_id_all __maybe_unused)
 {
 	event->event_update.type = bswap_64(event->event_update.type);
 	event->event_update.id   = bswap_64(event->event_update.id);
+	return 0;
 }
 
-static void perf_event__event_type_swap(union perf_event *event,
-					bool sample_id_all __maybe_unused)
+static int perf_event__event_type_swap(union perf_event *event,
+				       bool sample_id_all __maybe_unused)
 {
 	event->event_type.event_type.event_id =
 		bswap_64(event->event_type.event_type.event_id);
+	return 0;
 }
 
-static void perf_event__tracing_data_swap(union perf_event *event,
-					  bool sample_id_all __maybe_unused)
+static int perf_event__tracing_data_swap(union perf_event *event,
+					 bool sample_id_all __maybe_unused)
 {
 	event->tracing_data.size = bswap_32(event->tracing_data.size);
+	return 0;
 }
 
-static void perf_event__auxtrace_info_swap(union perf_event *event,
-					   bool sample_id_all __maybe_unused)
+static int perf_event__auxtrace_info_swap(union perf_event *event,
+					  bool sample_id_all __maybe_unused)
 {
 	size_t size;
 
@@ -596,10 +666,11 @@ static void perf_event__auxtrace_info_swap(union perf_event *event,
 	size = event->header.size;
 	size -= (void *)&event->auxtrace_info.priv - (void *)event;
 	mem_bswap_64(event->auxtrace_info.priv, size);
+	return 0;
 }
 
-static void perf_event__auxtrace_swap(union perf_event *event,
-				      bool sample_id_all __maybe_unused)
+static int perf_event__auxtrace_swap(union perf_event *event,
+				     bool sample_id_all __maybe_unused)
 {
 	event->auxtrace.size      = bswap_64(event->auxtrace.size);
 	event->auxtrace.offset    = bswap_64(event->auxtrace.offset);
@@ -607,10 +678,11 @@ static void perf_event__auxtrace_swap(union perf_event *event,
 	event->auxtrace.idx       = bswap_32(event->auxtrace.idx);
 	event->auxtrace.tid       = bswap_32(event->auxtrace.tid);
 	event->auxtrace.cpu       = bswap_32(event->auxtrace.cpu);
+	return 0;
 }
 
-static void perf_event__auxtrace_error_swap(union perf_event *event,
-					    bool sample_id_all __maybe_unused)
+static int perf_event__auxtrace_error_swap(union perf_event *event,
+					   bool sample_id_all __maybe_unused)
 {
 	event->auxtrace_error.type = bswap_32(event->auxtrace_error.type);
 	event->auxtrace_error.code = bswap_32(event->auxtrace_error.code);
@@ -625,10 +697,11 @@ static void perf_event__auxtrace_error_swap(union perf_event *event,
 		event->auxtrace_error.machine_pid = bswap_32(event->auxtrace_error.machine_pid);
 		event->auxtrace_error.vcpu = bswap_32(event->auxtrace_error.vcpu);
 	}
+	return 0;
 }
 
-static void perf_event__thread_map_swap(union perf_event *event,
-					bool sample_id_all __maybe_unused)
+static int perf_event__thread_map_swap(union perf_event *event,
+				       bool sample_id_all __maybe_unused)
 {
 	unsigned i;
 
@@ -636,10 +709,11 @@ static void perf_event__thread_map_swap(union perf_event *event,
 
 	for (i = 0; i < event->thread_map.nr; i++)
 		event->thread_map.entries[i].pid = bswap_64(event->thread_map.entries[i].pid);
+	return 0;
 }
 
-static void perf_event__cpu_map_swap(union perf_event *event,
-				     bool sample_id_all __maybe_unused)
+static int perf_event__cpu_map_swap(union perf_event *event,
+				    bool sample_id_all __maybe_unused)
 {
 	struct perf_record_cpu_map_data *data = &event->cpu_map.data;
 
@@ -677,20 +751,22 @@ static void perf_event__cpu_map_swap(union perf_event *event,
 	default:
 		break;
 	}
+	return 0;
 }
 
-static void perf_event__stat_config_swap(union perf_event *event,
-					 bool sample_id_all __maybe_unused)
+static int perf_event__stat_config_swap(union perf_event *event,
+					bool sample_id_all __maybe_unused)
 {
 	u64 size;
 
 	size  = bswap_64(event->stat_config.nr) * sizeof(event->stat_config.data[0]);
 	size += 1; /* nr item itself */
 	mem_bswap_64(&event->stat_config.nr, size);
+	return 0;
 }
 
-static void perf_event__stat_swap(union perf_event *event,
-				  bool sample_id_all __maybe_unused)
+static int perf_event__stat_swap(union perf_event *event,
+				 bool sample_id_all __maybe_unused)
 {
 	event->stat.id     = bswap_64(event->stat.id);
 	event->stat.thread = bswap_32(event->stat.thread);
@@ -698,44 +774,90 @@ static void perf_event__stat_swap(union perf_event *event,
 	event->stat.val    = bswap_64(event->stat.val);
 	event->stat.ena    = bswap_64(event->stat.ena);
 	event->stat.run    = bswap_64(event->stat.run);
+	return 0;
 }
 
-static void perf_event__stat_round_swap(union perf_event *event,
-					bool sample_id_all __maybe_unused)
+static int perf_event__stat_round_swap(union perf_event *event,
+				       bool sample_id_all __maybe_unused)
 {
 	event->stat_round.type = bswap_64(event->stat_round.type);
 	event->stat_round.time = bswap_64(event->stat_round.time);
+	return 0;
 }
 
-static void perf_event__time_conv_swap(union perf_event *event,
-				       bool sample_id_all __maybe_unused)
+static int perf_event__time_conv_swap(union perf_event *event,
+				      bool sample_id_all __maybe_unused)
 {
 	event->time_conv.time_shift = bswap_64(event->time_conv.time_shift);
 	event->time_conv.time_mult  = bswap_64(event->time_conv.time_mult);
 	event->time_conv.time_zero  = bswap_64(event->time_conv.time_zero);
 
-	if (event_contains(event->time_conv, time_cycles)) {
+	if (event_contains(event->time_conv, time_cycles))
 		event->time_conv.time_cycles = bswap_64(event->time_conv.time_cycles);
+	if (event_contains(event->time_conv, time_mask))
 		event->time_conv.time_mask = bswap_64(event->time_conv.time_mask);
-	}
+	return 0;
 }
 
-static void
+static int
 perf_event__schedstat_cpu_swap(union perf_event *event __maybe_unused,
 			       bool sample_id_all __maybe_unused)
 {
 	/* FIXME */
+	return 0;
 }
 
-static void
+static int
 perf_event__schedstat_domain_swap(union perf_event *event __maybe_unused,
 				  bool sample_id_all __maybe_unused)
 {
 	/* FIXME */
+	return 0;
+}
+
+static int perf_event__ksymbol_swap(union perf_event *event,
+				    bool sample_id_all)
+{
+	event->ksymbol.addr = bswap_64(event->ksymbol.addr);
+	event->ksymbol.len = bswap_32(event->ksymbol.len);
+	event->ksymbol.ksym_type = bswap_16(event->ksymbol.ksym_type);
+	event->ksymbol.flags = bswap_16(event->ksymbol.flags);
+
+	if (sample_id_all) {
+		void *data = &event->ksymbol.name;
+		void *end = (void *)event + event->header.size;
+		size_t len = strnlen(data, end - data);
+
+		/* See comment in perf_event__comm_swap() */
+		if (len == (size_t)(end - data))
+			return -1;
+		data += PERF_ALIGN(len + 1, sizeof(u64));
+		swap_sample_id_all(event, data);
+	}
+	return 0;
+}
+
+static int perf_event__bpf_event_swap(union perf_event *event,
+				      bool sample_id_all)
+{
+	event->bpf.type  = bswap_16(event->bpf.type);
+	event->bpf.flags = bswap_16(event->bpf.flags);
+	event->bpf.id    = bswap_32(event->bpf.id);
+
+	if (sample_id_all)
+		swap_sample_id_all(event, &event->bpf + 1);
+	return 0;
 }
 
-typedef void (*perf_event__swap_op)(union perf_event *event,
-				    bool sample_id_all);
+static int perf_event__header_feature_swap(union perf_event *event,
+					   bool sample_id_all __maybe_unused)
+{
+	event->feat.feat_id = bswap_64(event->feat.feat_id);
+	return 0;
+}
+
+typedef int (*perf_event__swap_op)(union perf_event *event,
+				   bool sample_id_all);
 
 static perf_event__swap_op perf_event__swap_ops[] = {
 	[PERF_RECORD_MMAP]		  = perf_event__mmap_swap,
@@ -755,6 +877,8 @@ static perf_event__swap_op perf_event__swap_ops[] = {
 	[PERF_RECORD_SWITCH_CPU_WIDE]	  = perf_event__switch_swap,
 	[PERF_RECORD_NAMESPACES]	  = perf_event__namespaces_swap,
 	[PERF_RECORD_CGROUP]		  = perf_event__cgroup_swap,
+	[PERF_RECORD_KSYMBOL]		  = perf_event__ksymbol_swap,
+	[PERF_RECORD_BPF_EVENT]		  = perf_event__bpf_event_swap,
 	[PERF_RECORD_TEXT_POKE]		  = perf_event__text_poke_swap,
 	[PERF_RECORD_AUX_OUTPUT_HW_ID]	  = perf_event__all64_swap,
 	[PERF_RECORD_CALLCHAIN_DEFERRED]  = perf_event__all64_swap,
@@ -762,6 +886,7 @@ static perf_event__swap_op perf_event__swap_ops[] = {
 	[PERF_RECORD_HEADER_EVENT_TYPE]	  = perf_event__event_type_swap,
 	[PERF_RECORD_HEADER_TRACING_DATA] = perf_event__tracing_data_swap,
 	[PERF_RECORD_HEADER_BUILD_ID]	  = NULL,
+	[PERF_RECORD_HEADER_FEATURE]	  = perf_event__header_feature_swap,
 	[PERF_RECORD_ID_INDEX]		  = perf_event__all64_swap,
 	[PERF_RECORD_AUXTRACE_INFO]	  = perf_event__auxtrace_info_swap,
 	[PERF_RECORD_AUXTRACE]		  = perf_event__auxtrace_swap,
@@ -1488,6 +1613,25 @@ static int session__flush_deferred_samples(struct perf_session *session,
 	return ret;
 }
 
+/*
+ * Return true if the string field is properly null-terminated
+ * within the event boundary.  Native-endian files are mapped
+ * read-only (MAP_SHARED + PROT_READ) so we cannot write a
+ * null byte in place; skip the event instead.
+ */
+static bool perf_event__check_nul(const char *str, const void *end, const char *event_name)
+{
+	size_t max_len = (const char *)end - str;
+
+	if (max_len == 0 || strnlen(str, max_len) == max_len) {
+		pr_warning("WARNING: PERF_RECORD_%s: string not null-terminated, skipping event\n",
+			   event_name);
+		return false;
+	}
+
+	return true;
+}
+
 static int machines__deliver_event(struct machines *machines,
 				   struct evlist *evlist,
 				   union perf_event *event,
@@ -1536,16 +1680,32 @@ static int machines__deliver_event(struct machines *machines,
 		}
 		return evlist__deliver_sample(evlist, tool, event, sample, machine);
 	case PERF_RECORD_MMAP:
+		if (!perf_event__check_nul(event->mmap.filename,
+					   (void *)event + event->header.size,
+					   "MMAP"))
+			return 0;
 		return tool->mmap(tool, event, sample, machine);
 	case PERF_RECORD_MMAP2:
 		if (event->header.misc & PERF_RECORD_MISC_PROC_MAP_PARSE_TIMEOUT)
 			++evlist->stats.nr_proc_map_timeout;
+		if (!perf_event__check_nul(event->mmap2.filename,
+					   (void *)event + event->header.size,
+					   "MMAP2"))
+			return 0;
 		return tool->mmap2(tool, event, sample, machine);
 	case PERF_RECORD_COMM:
+		if (!perf_event__check_nul(event->comm.comm,
+					   (void *)event + event->header.size,
+					   "COMM"))
+			return 0;
 		return tool->comm(tool, event, sample, machine);
 	case PERF_RECORD_NAMESPACES:
 		return tool->namespaces(tool, event, sample, machine);
 	case PERF_RECORD_CGROUP:
+		if (!perf_event__check_nul(event->cgroup.path,
+					   (void *)event + event->header.size,
+					   "CGROUP"))
+			return 0;
 		return tool->cgroup(tool, event, sample, machine);
 	case PERF_RECORD_FORK:
 		return tool->fork(tool, event, sample, machine);
@@ -1584,11 +1744,25 @@ static int machines__deliver_event(struct machines *machines,
 	case PERF_RECORD_SWITCH_CPU_WIDE:
 		return tool->context_switch(tool, event, sample, machine);
 	case PERF_RECORD_KSYMBOL:
+		if (!perf_event__check_nul(event->ksymbol.name,
+					   (void *)event + event->header.size,
+					   "KSYMBOL"))
+			return 0;
 		return tool->ksymbol(tool, event, sample, machine);
 	case PERF_RECORD_BPF_EVENT:
 		return tool->bpf(tool, event, sample, machine);
-	case PERF_RECORD_TEXT_POKE:
+	case PERF_RECORD_TEXT_POKE: {
+		/* offsetof(bytes), not sizeof — sizeof includes padding past the flexible array */
+		size_t text_poke_len = offsetof(struct perf_record_text_poke_event, bytes) +
+				       event->text_poke.old_len +
+				       event->text_poke.new_len;
+
+		if (event->header.size < text_poke_len) {
+			pr_warning("WARNING: PERF_RECORD_TEXT_POKE: old_len+new_len exceeds event, skipping\n");
+			return 0;
+		}
 		return tool->text_poke(tool, event, sample, machine);
+	}
 	case PERF_RECORD_AUX_OUTPUT_HW_ID:
 		return tool->aux_output_hw_id(tool, event, sample, machine);
 	case PERF_RECORD_CALLCHAIN_DEFERRED:
@@ -1794,12 +1968,28 @@ int perf_session__deliver_synth_attr_event(struct perf_session *session,
 	return perf_session__deliver_synth_event(session, &ev.ev, NULL);
 }
 
+/* Caller must ensure event->header.type < PERF_RECORD_HEADER_MAX */
+static int event_swap(union perf_event *event, bool sample_id_all)
+{
+	perf_event__swap_op swap = perf_event__swap_ops[event->header.type];
+
+	if (swap)
+		return swap(event, sample_id_all);
+	return 0;
+}
+
 /*
  * Minimum event sizes indexed by type.  Checked before swap and
  * processing so that both cross-endian and native-endian paths
  * are protected from accessing fields past the event boundary.
  * Zero means no minimum beyond the 8-byte header (already
  * enforced by the reader).
+ *
+ * These values represent the smallest event the kernel has ever
+ * emitted for each type, so they do not reject legitimate legacy
+ * perf.data files from older kernels.  Variable-length events
+ * use offsetof() to the first variable field; the variable
+ * content is validated separately (e.g., perf_event__check_nul).
  */
 static const u32 perf_event__min_size[PERF_RECORD_HEADER_MAX] = {
 	/*
@@ -1821,7 +2011,9 @@ static const u32 perf_event__min_size[PERF_RECORD_HEADER_MAX] = {
 	[PERF_RECORD_FORK]		  = sizeof(struct perf_record_fork),
 	/*
 	 * The kernel dynamically sizes PERF_RECORD_READ based on
-	 * attr.read_format — the minimum has just pid + tid + value.
+	 * attr.read_format — only the enabled fields are emitted,
+	 * packed with no gaps.  The minimum valid event has just
+	 * pid + tid + one u64 value (no optional fields).
 	 */
 	[PERF_RECORD_READ]		  = offsetof(struct perf_record_read, time_enabled),
 	[PERF_RECORD_MMAP2]		  = offsetof(struct perf_record_mmap2, filename) + 1,
@@ -1844,14 +2036,25 @@ static const u32 perf_event__min_size[PERF_RECORD_HEADER_MAX] = {
 	[PERF_RECORD_AUXTRACE]		  = sizeof(struct perf_record_auxtrace),
 	[PERF_RECORD_AUXTRACE_ERROR]	  = offsetof(struct perf_record_auxtrace_error, msg) + 1,
 	[PERF_RECORD_THREAD_MAP]	  = sizeof(struct perf_record_thread_map),
-	/* Smallest valid variant is RANGE_CPUS: header(8) + type(2) + range(6) */
+	/*
+	 * sizeof(perf_record_cpu_map) is 20 because the outer struct
+	 * isn't packed and GCC adds 2 bytes of trailing padding.
+	 * The smallest valid variant (RANGE_CPUS) is only 16 bytes:
+	 * header(8) + type(2) + range_cpu_data(6).  Per-variant
+	 * bounds are checked in the swap handler via payload.
+	 */
 	[PERF_RECORD_CPU_MAP]		  = sizeof(struct perf_event_header) +
 					    sizeof(__u16) +
 					    sizeof(struct perf_record_range_cpu_map),
 	[PERF_RECORD_STAT_CONFIG]	  = sizeof(struct perf_record_stat_config),
 	[PERF_RECORD_STAT]		  = sizeof(struct perf_record_stat),
 	[PERF_RECORD_STAT_ROUND]	  = sizeof(struct perf_record_stat_round),
-	/* Union inflates sizeof; use fixed header fields as minimum */
+	/*
+	 * EVENT_UPDATE has a union whose largest member (cpus)
+	 * inflates sizeof to 40, but SCALE events are only 32
+	 * and UNIT/NAME events can be even smaller.  Use the
+	 * fixed header fields (header + type + id) as minimum.
+	 */
 	[PERF_RECORD_EVENT_UPDATE]	  = offsetof(struct perf_record_event_update, scale),
 	[PERF_RECORD_TIME_CONV]		  = offsetof(struct perf_record_time_conv, time_cycles),
 	[PERF_RECORD_ID_INDEX]		  = sizeof(struct perf_record_id_index),
@@ -1887,14 +2090,6 @@ static bool perf_event__too_small(const union perf_event *event, u32 *min)
 	return false;
 }
 
-/* Caller must ensure event->header.type < PERF_RECORD_HEADER_MAX */
-static void event_swap(union perf_event *event, bool sample_id_all)
-{
-	perf_event__swap_op swap = perf_event__swap_ops[event->header.type];
-	if (swap)
-		swap(event, sample_id_all);
-}
-
 /*
  * Read and validate the event at @file_offset.
  *
@@ -2003,8 +2198,16 @@ int perf_session__peek_event(struct perf_session *session, off_t file_offset,
 		return -1;
 	}
 
-	if (session->header.needs_swap)
-		event_swap(event, evlist__sample_id_all(session->evlist));
+	if (session->header.needs_swap &&
+	    event_swap(event, evlist__sample_id_all(session->evlist))) {
+		/*
+		 * The header was already swapped so header.size is
+		 * valid — expose the event so callers can advance
+		 * past this malformed entry instead of aborting.
+		 */
+		*event_ptr = event;
+		return -1;
+	}
 
 	if (sample && event->header.type < PERF_RECORD_USER_TYPE_START &&
 	    evlist__parse_sample(session->evlist, event, sample))
@@ -2022,11 +2225,37 @@ int perf_session__peek_events(struct perf_session *session, u64 offset,
 	int err;
 
 	do {
+		event = NULL;
 		err = perf_session__peek_event(session, offset, buf,
 					       PERF_SAMPLE_MAX_SIZE, &event,
 					       NULL);
-		if (err)
-			return err;
+		if (err) {
+			/*
+			 * Recoverable error: peek_event returns -1 but
+			 * sets event_ptr when the header was read
+			 * successfully but the event is malformed (too
+			 * small or swap failed).  Skip past it using
+			 * header.size — don't invoke the callback since
+			 * type-specific fields may be truncated.
+			 *
+			 * Must abort if: event_ptr is NULL (I/O error),
+			 * size is 0 (can't advance), type is AUXTRACE
+			 * (payload extends beyond header.size), or size
+			 * is unaligned (would misalign all subsequent reads).
+			 *
+			 * Direct callers (auxtrace, cs-etm) treat any
+			 * non-zero return as fatal — only this loop skips.
+			 */
+			if (event && event->header.size &&
+			    event->header.type != PERF_RECORD_AUXTRACE &&
+			    event->header.size % sizeof(u64) == 0) {
+				offset += event->header.size;
+				err = 0;
+			} else {
+				return err;
+			}
+			continue;
+		}
 
 		err = cb(session, event, offset, data);
 		if (err)
@@ -2109,8 +2338,12 @@ static s64 perf_session__process_event(struct perf_session *session,
 		return 0;
 	}
 
-	if (session->header.needs_swap)
-		event_swap(event, evlist__sample_id_all(evlist));
+	if (session->header.needs_swap &&
+	    event_swap(event, evlist__sample_id_all(evlist))) {
+		pr_warning("WARNING: swap failed for %s event, skipping\n",
+			   perf_event__name(event->header.type));
+		return 0;
+	}
 
 	events_stats__inc(&evlist->stats, event->header.type);
 
@@ -2579,6 +2812,17 @@ reader__mmap(struct reader *rd, struct perf_session *session)
 	char *buf, **mmaps = rd->mmaps;
 	u64 page_offset;
 
+	/*
+	 * Native-endian: MAP_SHARED + PROT_READ — the kernel
+	 * guarantees page-level coherence but a concurrent writer
+	 * could modify the file between validation and use.  This
+	 * is a theoretical TOCTOU that affects the entire perf.data
+	 * processing pipeline; fixing it would require copying each
+	 * event to a private buffer before processing.
+	 *
+	 * Cross-endian: MAP_PRIVATE + PROT_WRITE — swap handlers
+	 * get a copy-on-write snapshot immune to concurrent writes.
+	 */
 	mmap_prot  = PROT_READ;
 	mmap_flags = MAP_SHARED;
 
-- 
2.54.0
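
Sketch, not part of the patch: the null-termination test added as
perf_event__check_nul() can be tried in isolation. The helper name
str_terminated_before() is hypothetical, and memchr() stands in for the
patch's strnlen() so the snippet builds as plain ISO C. Both stop
scanning at the event boundary.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for perf_event__check_nul(): true only if a
 * NUL byte occurs somewhere in [str, end), i.e. the string terminates
 * inside the event.  Nothing is written, so this is usable on a
 * read-only mapping.
 */
static bool str_terminated_before(const char *str, const char *end)
{
	size_t max_len;

	/* An empty or inverted range cannot hold a terminator */
	if (str >= end)
		return false;

	max_len = (size_t)(end - str);

	/* memchr() inspects at most max_len bytes */
	return memchr(str, '\0', max_len) != NULL;
}
```

A caller would pass the start of the string field and
(char *)event + event->header.size as the end, and skip the event when
the helper returns false.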



* [PATCH 09/29] perf session: Use bounded copy for PERF_RECORD_TIME_CONV
  2026-05-26 21:17 [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
                   ` (7 preceding siblings ...)
  2026-05-26 21:17 ` [PATCH 08/29] perf session: Add validated swap infrastructure with null-termination checks Arnaldo Carvalho de Melo
@ 2026-05-26 21:17 ` Arnaldo Carvalho de Melo
  2026-05-26 21:17 ` [PATCH 10/29] perf session: Validate HEADER_ATTR attr.size before swapping Arnaldo Carvalho de Melo
                   ` (20 subsequent siblings)
  29 siblings, 0 replies; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-05-26 21:17 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot,
	Claude Opus 4.6 (1M context)

From: Arnaldo Carvalho de Melo <acme@redhat.com>

session->time_conv = event->time_conv copies sizeof(struct
perf_record_time_conv) bytes unconditionally, but older perf versions
emit shorter TIME_CONV events without the time_cycles, time_mask,
cap_user_time_zero, and cap_user_time_short fields.

For a 32-byte event (the original format), this reads 24 bytes
past the event boundary into adjacent mmap'd data.  The garbage
values end up in session->time_conv and can cause incorrect TSC
conversion if cap_user_time_zero happens to be non-zero.

Replace the struct assignment with a bounded memcpy capped at
event->header.size, zeroing the remainder so extended fields
default to off when absent.

Reported-by: sashiko-bot@kernel.org # Running on a local machine
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/session.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index d5864e380c1bd52e..3f72b80aac56b04e 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1897,7 +1897,14 @@ static s64 perf_session__process_user_event(struct perf_session *session,
 		err = tool->stat_round(tool, session, event);
 		break;
 	case PERF_RECORD_TIME_CONV:
-		session->time_conv = event->time_conv;
+		/*
+		 * Bounded copy: older perf versions emit a shorter
+		 * struct without time_cycles/time_mask/cap_user_time_*.
+		 * Zero the rest so extended fields default to off.
+		 */
+		memset(&session->time_conv, 0, sizeof(session->time_conv));
+		memcpy(&session->time_conv, &event->time_conv,
+		       min((size_t)event->header.size, sizeof(session->time_conv)));
 		err = tool->time_conv(tool, session, event);
 		break;
 	case PERF_RECORD_HEADER_FEATURE:
-- 
2.54.0
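
Sketch, not part of the patch: the bounded copy described in the commit
message, reduced to a standalone function. The demo_* types are
illustrative mirrors of the TIME_CONV layout (32 bytes in the original
format, 56 with the extended fields) and are assumptions, not the perf
headers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative mirror of struct perf_event_header (8 bytes) */
struct demo_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

/* Illustrative mirror of the extended TIME_CONV record (56 bytes) */
struct demo_time_conv {
	struct demo_header header;
	uint64_t time_shift, time_mult, time_zero;	/* original format */
	uint64_t time_cycles, time_mask;		/* extended fields */
	uint8_t  cap_user_time_zero, cap_user_time_short;
	uint8_t  reserved[6];
};

/*
 * Copy at most header.size bytes from the raw event and zero the rest,
 * so fields a short event does not carry read as 0 instead of whatever
 * follows the event in the mapping.
 */
static void demo_copy_time_conv(struct demo_time_conv *dst, const void *event)
{
	struct demo_header hdr;
	size_t n;

	memcpy(&hdr, event, sizeof(hdr));
	n = hdr.size < sizeof(*dst) ? hdr.size : sizeof(*dst);

	memset(dst, 0, sizeof(*dst));
	memcpy(dst, event, n);
}
```

With a 32-byte event, only the header and the three original fields are
copied. time_cycles, time_mask and both cap_user_time_* flags stay zero
regardless of what follows the event in the buffer.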



* [PATCH 10/29] perf session: Validate HEADER_ATTR attr.size before swapping
  2026-05-26 21:17 [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
                   ` (8 preceding siblings ...)
  2026-05-26 21:17 ` [PATCH 09/29] perf session: Use bounded copy for PERF_RECORD_TIME_CONV Arnaldo Carvalho de Melo
@ 2026-05-26 21:17 ` Arnaldo Carvalho de Melo
  2026-05-26 22:01   ` sashiko-bot
  2026-05-26 21:17 ` [PATCH 11/29] perf session: Validate nr fields against event size on both swap and common paths Arnaldo Carvalho de Melo
                   ` (19 subsequent siblings)
  29 siblings, 1 reply; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-05-26 21:17 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot,
	Claude Opus 4.6 (1M context)

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Harden PERF_RECORD_HEADER_ATTR handling against crafted perf.data:

- Validate attr.size: must be >= PERF_ATTR_SIZE_VER0, a multiple
  of sizeof(u64), and fit within the event payload.
- Copy only min(attr.size, sizeof(struct perf_event_attr)) bytes
  into a local attr, zeroing the rest so legacy files don't leak
  adjacent event data into new fields.
- Keep the original attr.size so perf_event__synthesize_attr()
  uses it for both allocation and ID-array placement.

Fix perf_event__synthesize_attr() to use attr->size (not the
compiled sizeof) for event allocation and layout, so perf inject
correctly re-synthesizes attrs from files recorded by a different
perf version.  Without this, the ID array destination pointer
(computed via perf_record_header_attr_id()) would be inconsistent
with the allocation when attr->size differs from sizeof.

Also fix the parse-no-sample-id-all test to set attr.size, which
is now validated, and improve error handling in read_attr() for
short reads and invalid attr sizes.

Handle ABI0 pipe/inject events where attr.size is 0: use a local
attr_size variable set to PERF_ATTR_SIZE_VER0 for both the bounded
copy and ID array position, instead of writing back to the event.
Native-endian files may be MAP_SHARED (read-only mmap), so writing
to the event buffer would SIGSEGV.  The swap path handles ABI0 in
perf_event__attr_swap() which writes to the MAP_PRIVATE copy.

header.size alignment is now validated centrally in
perf_session__process_event() (see "Add minimum event size and
alignment validation").
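
Sketch, not part of the patch: the attr.size rules listed above,
including the ABI0 default, written as two small helpers. The demo_*
names are hypothetical. The constants 8 and 64 are the sizes of struct
perf_event_header and PERF_ATTR_SIZE_VER0. The effective size is kept
in a local value and never written back to the event.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_HEADER_SIZE	8u	/* sizeof(struct perf_event_header) */
#define DEMO_ATTR_SIZE_VER0	64u	/* value of PERF_ATTR_SIZE_VER0 */

/*
 * Return the attr footprint inside a HEADER_ATTR event, or 0 if the
 * event must be rejected.  raw_attr_size is attr.size as read once
 * from the event; 0 means ABI0 and is treated as VER0.
 */
static uint32_t demo_attr_footprint(uint16_t header_size, uint32_t raw_attr_size)
{
	uint32_t payload;

	/* The event must at least hold a VER0 attr */
	if (header_size < DEMO_HEADER_SIZE + DEMO_ATTR_SIZE_VER0)
		return 0;
	payload = header_size - DEMO_HEADER_SIZE;

	if (raw_attr_size == 0)
		return DEMO_ATTR_SIZE_VER0;

	if (raw_attr_size < DEMO_ATTR_SIZE_VER0 ||	/* below VER0 */
	    raw_attr_size % sizeof(uint64_t) ||		/* not u64-aligned */
	    raw_attr_size > payload)			/* overruns the event */
		return 0;

	return raw_attr_size;
}

/* Number of u64 IDs that follow an attr of the given footprint */
static uint32_t demo_nr_ids(uint16_t header_size, uint32_t footprint)
{
	return (header_size - DEMO_HEADER_SIZE - footprint) / sizeof(uint64_t);
}
```

The ID array then starts footprint bytes after the start of the attr,
which is the same position for the bounded copy and for ID parsing.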

Reported-by: sashiko-bot@kernel.org # Running on a local machine
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-inject.c               | 23 ++++--
 tools/perf/tests/parse-no-sample-id-all.c |  6 ++
 tools/perf/util/header.c                  | 96 +++++++++++++++++++++--
 tools/perf/util/session.c                 | 31 ++++++++
 tools/perf/util/synthetic-events.c        | 25 +++++-
 5 files changed, 166 insertions(+), 15 deletions(-)

diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index 41a3721a194dc9b9..d8cb1f562f690ce4 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -229,6 +229,7 @@ static int perf_event__repipe_attr(const struct perf_tool *tool,
 	struct perf_inject *inject = container_of(tool, struct perf_inject,
 						  tool);
 	struct perf_event_attr attr;
+	u32 raw_attr_size, attr_size;
 	size_t n_ids;
 	u64 *ids;
 	int ret;
@@ -244,24 +245,34 @@ static int perf_event__repipe_attr(const struct perf_tool *tool,
 	if (!inject->itrace_synth_opts.set)
 		return perf_event__repipe_synth(tool, event);
 
-	if (event->header.size < sizeof(struct perf_event_header) + sizeof(u64)) {
+	if (event->header.size < sizeof(struct perf_event_header) + PERF_ATTR_SIZE_VER0) {
 		pr_err("Attribute event size %u is too small\n", event->header.size);
 		return -EINVAL;
 	}
 
-	if (event->header.size - sizeof(event->header) < event->attr.attr.size) {
+	/*
+	 * ABI0 pipe/inject events have attr.size == 0; default to
+	 * PERF_ATTR_SIZE_VER0 (the ABI0 footprint) for the bounded
+	 * copy and ID array position.  Same pattern as
+	 * perf_event__process_attr() in header.c.
+	 */
+	raw_attr_size = event->attr.attr.size;
+	attr_size = raw_attr_size ?: PERF_ATTR_SIZE_VER0;
+
+	if (raw_attr_size && (raw_attr_size < PERF_ATTR_SIZE_VER0 ||
+			      raw_attr_size > event->header.size - sizeof(event->header))) {
 		pr_err("Attribute event size %u is too small for attr.size %u\n",
-		       event->header.size, event->attr.attr.size);
+		       event->header.size, raw_attr_size);
 		return -EINVAL;
 	}
 
 	memset(&attr, 0, sizeof(attr));
 	memcpy(&attr, &event->attr.attr,
-	       min_t(size_t, sizeof(attr), (size_t)event->attr.attr.size));
+	       min_t(size_t, sizeof(attr), attr_size));
 
-	n_ids = event->header.size - sizeof(event->header) - event->attr.attr.size;
+	n_ids = event->header.size - sizeof(event->header) - attr_size;
 	n_ids /= sizeof(u64);
-	ids = perf_record_header_attr_id(event);
+	ids = (void *)&event->attr.attr + attr_size;
 
 	attr.size = sizeof(struct perf_event_attr);
 	attr.sample_type &= ~PERF_SAMPLE_AUX;
diff --git a/tools/perf/tests/parse-no-sample-id-all.c b/tools/perf/tests/parse-no-sample-id-all.c
index 50e68b7d43aad030..8ac862c94879f3a3 100644
--- a/tools/perf/tests/parse-no-sample-id-all.c
+++ b/tools/perf/tests/parse-no-sample-id-all.c
@@ -82,6 +82,9 @@ static int test__parse_no_sample_id_all(struct test_suite *test __maybe_unused,
 			.type = PERF_RECORD_HEADER_ATTR,
 			.size = sizeof(struct test_attr_event),
 		},
+		.attr = {
+			.size = sizeof(struct perf_event_attr),
+		},
 		.id = 1,
 	};
 	struct test_attr_event event2 = {
@@ -89,6 +92,9 @@ static int test__parse_no_sample_id_all(struct test_suite *test __maybe_unused,
 			.type = PERF_RECORD_HEADER_ATTR,
 			.size = sizeof(struct test_attr_event),
 		},
+		.attr = {
+			.size = sizeof(struct perf_event_attr),
+		},
 		.id = 2,
 	};
 	struct perf_record_mmap event3 = {
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index f30e48eb3fc32da2..967c3d8ff12c8676 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -4770,9 +4770,15 @@ static int read_attr(int fd, struct perf_header *ph,
 	if (sz == 0) {
 		/* assume ABI0 */
 		sz =  PERF_ATTR_SIZE_VER0;
+	} else if (sz < PERF_ATTR_SIZE_VER0) {
+		pr_debug("bad attr size %zu, expected at least %d\n",
+			 sz, PERF_ATTR_SIZE_VER0);
+		errno = EINVAL;
+		return -1;
 	} else if (sz > our_sz) {
 		pr_debug("file uses a more recent and unsupported ABI"
 			 " (%zu bytes extra)\n", sz - our_sz);
+		errno = EINVAL;
 		return -1;
 	}
 	/* what we have not yet read and that we know about */
@@ -4782,11 +4788,21 @@ static int read_attr(int fd, struct perf_header *ph,
 		ptr += PERF_ATTR_SIZE_VER0;
 
 		ret = readn(fd, ptr, left);
+		if (ret <= 0) {
+			if (ret == 0)
+				errno = EIO;
+			return -1;
+		}
 	}
 	/* read perf_file_section, ids are read in caller */
 	ret = readn(fd, &f_attr->ids, sizeof(f_attr->ids));
+	if (ret <= 0) {
+		if (ret == 0)
+			errno = EIO;
+		return -1;
+	}
 
-	return ret <= 0 ? -1 : 0;
+	return 0;
 }
 
 #ifdef HAVE_LIBTRACEEVENT
@@ -5094,11 +5110,42 @@ int perf_event__process_attr(const struct perf_tool *tool __maybe_unused,
 			     union perf_event *event,
 			     struct evlist **pevlist)
 {
-	u32 i, n_ids;
+	struct perf_event_attr attr;
+	u32 i, n_ids, raw_attr_size;
 	u64 *ids;
+	size_t attr_size, copy_size;
 	struct evsel *evsel;
 	struct evlist *evlist = *pevlist;
 
+	/*
+	 * HEADER_ATTR event layout (pipe/inject mode):
+	 *
+	 *   [header (8 bytes)] [attr (attr_size bytes)] [id0 id1 ... idN]
+	 *   |<------------------ header.size --------------------------->|
+	 *
+	 * attr_size varies across perf versions: VER0 = 64 bytes,
+	 * current sizeof(struct perf_event_attr) = larger.  A newer
+	 * producer may emit a larger attr than we understand.
+	 *
+	 * attr.size == 0 (ABI0) means the producer didn't set it
+	 * (e.g., bench/inject-buildid, older perf).  Treat as VER0.
+	 *
+	 * Require 8-byte alignment so the u64 ID array is aligned
+	 * and attr.size fits cleanly within the payload.
+	 *
+	 * Read attr.size once — the event may be on a shared mmap
+	 * and re-reading could yield a different value.
+	 */
+	raw_attr_size = event->attr.attr.size;
+	if (event->header.size < sizeof(event->header) + PERF_ATTR_SIZE_VER0 ||
+	    (raw_attr_size && (raw_attr_size < PERF_ATTR_SIZE_VER0 ||
+			      raw_attr_size % sizeof(u64) ||
+			      raw_attr_size > event->header.size - sizeof(event->header)))) {
+		pr_err("PERF_RECORD_HEADER_ATTR: invalid attr.size %u (event size %u, min %d)\n",
+		       raw_attr_size, event->header.size, PERF_ATTR_SIZE_VER0);
+		return -EINVAL;
+	}
+
 	if (dump_trace)
 		perf_event__fprintf_attr(event, stdout);
 
@@ -5108,13 +5155,46 @@ int perf_event__process_attr(const struct perf_tool *tool __maybe_unused,
 			return -ENOMEM;
 	}
 
-	evsel = evsel__new(&event->attr.attr);
+	/*
+	 * attr_size = footprint of the attr in the event — determines
+	 * where the ID array starts.  For ABI0, assume VER0 (64 bytes).
+	 *
+	 * copy_size = how much we copy into our local struct, capped at
+	 * sizeof(attr) so a newer producer's larger attr doesn't
+	 * overflow.  Fields beyond copy_size are zeroed.
+	 *
+	 * Do NOT write attr_size back to the event — native-endian
+	 * files use MAP_SHARED (read-only), writing would SIGSEGV.
+	 * The swap path handles ABI0 in perf_event__attr_swap()
+	 * which writes to the writable MAP_PRIVATE copy instead.
+	 */
+	attr_size = raw_attr_size ?: PERF_ATTR_SIZE_VER0;
+	copy_size = min(attr_size, sizeof(attr));
+	memcpy(&attr, &event->attr.attr, copy_size);
+	if (copy_size < sizeof(attr))
+		memset((void *)&attr + copy_size, 0, sizeof(attr) - copy_size);
+
+	/*
+	 * Normalize ABI0: the swap path sets attr.size = VER0 on the
+	 * event, but the native path leaves it as 0.  Set it on the
+	 * local copy so perf inject re-synthesizes with consistent
+	 * layout regardless of endianness.
+	 */
+	attr.size = attr_size;
+
+	evsel = evsel__new(&attr);
 	if (evsel == NULL)
 		return -ENOMEM;
 
 	evlist__add(evlist, evsel);
 
-	n_ids = event->header.size - sizeof(event->header) - event->attr.attr.size;
+	/*
+	 * IDs occupy the remainder after header + attr.  Use attr_size
+	 * (not copy_size) — even if the producer's attr is larger than
+	 * our struct, the IDs start after attr_size bytes in the event.
+	 * Validation above guarantees attr_size <= payload size.
+	 */
+	n_ids = event->header.size - sizeof(event->header) - attr_size;
 	n_ids = n_ids / sizeof(u64);
 	/*
 	 * We don't have the cpu and thread maps on the header, so
@@ -5124,7 +5204,13 @@ int perf_event__process_attr(const struct perf_tool *tool __maybe_unused,
 	if (perf_evsel__alloc_id(&evsel->core, 1, n_ids))
 		return -ENOMEM;
 
-	ids = perf_record_header_attr_id(event);
+	/*
+	 * Locate IDs at attr_size bytes past the attr start in the
+	 * event.  Cannot use perf_record_header_attr_id() — that
+	 * macro reads event->attr.attr.size, which is 0 for ABI0
+	 * on the native-endian path (no swap handler to fix it up).
+	 */
+	ids = (void *)&event->attr.attr + attr_size;
 	for (i = 0; i < n_ids; i++) {
 		perf_evlist__id_add(&evlist->core, &evsel->core, 0, i, ids[i]);
 	}
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 3f72b80aac56b04e..aef10d42be35487a 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -623,8 +623,39 @@ do { 						\
 static int perf_event__hdr_attr_swap(union perf_event *event,
 				     bool sample_id_all __maybe_unused)
 {
+	u32 attr_size, payload_size;
 	size_t size;
 
+	/*
+	 * Validate attr.size (still foreign-endian) before calling
+	 * perf_event__attr_swap(), which uses it via bswap_safe()
+	 * to decide which fields to swap.  A crafted attr.size
+	 * larger than the event payload would swap past the event
+	 * boundary and corrupt adjacent memory.
+	 *
+	 * header.size alignment is already validated by
+	 * perf_session__process_event().  The min_size table
+	 * guarantees header.size >= sizeof(header) +
+	 * PERF_ATTR_SIZE_VER0, so attr.size is safe to access.
+	 */
+	attr_size = bswap_32(event->attr.attr.size);
+	/*
+	 * ABI0: size field not set.  This only happens in pipe/inject
+	 * mode where HEADER_ATTR events carry their own attr.  For
+	 * regular perf.data files, read_attr() uses f_header.attr_size
+	 * from the file header instead.  Assume PERF_ATTR_SIZE_VER0.
+	 */
+	if (!attr_size)
+		attr_size = PERF_ATTR_SIZE_VER0;
+	payload_size = event->header.size - sizeof(event->header);
+
+	if (attr_size < PERF_ATTR_SIZE_VER0 || attr_size % sizeof(u64) ||
+	    attr_size > payload_size) {
+		pr_err("PERF_RECORD_HEADER_ATTR: invalid attr.size %u (min: %d, max: %u, 8-byte aligned)\n",
+		       attr_size, PERF_ATTR_SIZE_VER0, payload_size);
+		return -1;
+	}
+
 	perf_event__attr_swap(&event->attr.attr);
 
 	size = event->header.size;
diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
index d665b0f94b321433..5307d707711d876c 100644
--- a/tools/perf/util/synthetic-events.c
+++ b/tools/perf/util/synthetic-events.c
@@ -2181,11 +2181,21 @@ int perf_event__synthesize_attr(const struct perf_tool *tool, struct perf_event_
 				u32 ids, u64 *id, perf_event__handler_t process)
 {
 	union perf_event *ev;
-	size_t size;
+	size_t attr_size, size;
 	int err;
 
-	size = sizeof(struct perf_event_attr);
-	size = PERF_ALIGN(size, sizeof(u64));
+	/*
+	 * Use attr->size for the event layout, not the compiled
+	 * sizeof(struct perf_event_attr), so that synthesized events
+	 * match the source perf.data layout.  This matters for perf
+	 * inject, which re-synthesizes attrs from a file that may
+	 * have been recorded by a different version of perf.
+	 * perf_record_header_attr_id() locates the ID array at
+	 * attr->size bytes past the attr.
+	 */
+	attr_size = attr->size ?: sizeof(struct perf_event_attr);
+
+	size = PERF_ALIGN(attr_size, sizeof(u64));
 	size += sizeof(struct perf_event_header);
 	size += ids * sizeof(u64);
 
@@ -2194,7 +2204,14 @@ int perf_event__synthesize_attr(const struct perf_tool *tool, struct perf_event_
 	if (ev == NULL)
 		return -ENOMEM;
 
-	ev->attr.attr = *attr;
+	/*
+	 * Copy only the bytes we understand; zalloc ensures that any
+	 * extra bytes between sizeof(struct perf_event_attr) and
+	 * attr_size are zero when the source file uses a newer, larger
+	 * struct.
+	 */
+	memcpy(&ev->attr.attr, attr, min(sizeof(struct perf_event_attr), attr_size));
+	ev->attr.attr.size = attr_size;
 	memcpy(perf_record_header_attr_id(ev), id, ids * sizeof(u64));
 
 	ev->attr.header.type = PERF_RECORD_HEADER_ATTR;
-- 
2.54.0



* [PATCH 11/29] perf session: Validate nr fields against event size on both swap and common paths
  2026-05-26 21:17 [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
                   ` (9 preceding siblings ...)
  2026-05-26 21:17 ` [PATCH 10/29] perf session: Validate HEADER_ATTR attr.size before swapping Arnaldo Carvalho de Melo
@ 2026-05-26 21:17 ` Arnaldo Carvalho de Melo
  2026-05-26 21:54   ` sashiko-bot
  2026-05-26 21:17 ` [PATCH 12/29] perf header: Byte-swap build ID event pid and bounds check section entries Arnaldo Carvalho de Melo
                   ` (18 subsequent siblings)
  29 siblings, 1 reply; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-05-26 21:17 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot,
	Claude Opus 4.6 (1M context)

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Several event types use an nr field to control iteration over
variable-length arrays.  The swap handlers byte-swap and loop using
these fields without bounds checks, and the native processing path
trusts them as well.

Add bounds checks on both paths for:

- PERF_RECORD_THREAD_MAP: validate nr against payload, return -1
  on the swap path.  On the native path, reject with -EINVAL.

- PERF_RECORD_NAMESPACES: clamp nr on the swap path (safe because
  each entry is indexed by type; missing entries just won't be
  resolved).  Skip the event on the native path.

- PERF_RECORD_CPU_MAP: clamp nr for CPUS and MASK sub-types on
  the swap path.  Add bounds checks for mask64 which previously
  had no nr validation.  Skip the event on the native path.

- PERF_RECORD_STAT_CONFIG: clamp nr on the swap path (safe because
  each config entry is self-describing via its tag).  Skip the
  event on the native path.

The swap path (cross-endian, writable MAP_PRIVATE mapping) can
safely clamp by writing back to the event.  The native path
(read-only MAP_SHARED mapping) must skip instead of clamping
because writing to the mmap'd event would segfault.
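
A minimal sketch of the clamp-versus-skip split described above,
not the code from this patch.  The struct layout and every demo_*
name are hypothetical; only the arithmetic (payload bytes divided by
entry size) mirrors what the commit message describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 8-byte event header. */
struct demo_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;	/* total record size in bytes, header included */
};

/* Fixed part of a hypothetical record; nr u64 entries follow it. */
struct demo_record {
	struct demo_header header;
	uint64_t nr;
};

/* Largest nr that still fits inside header.size. */
static uint64_t demo_max_nr(const struct demo_record *rec)
{
	/* Assumed precondition: a minimum-size check already ran. */
	assert((size_t)rec->header.size >= sizeof(*rec));
	return ((size_t)rec->header.size - sizeof(*rec)) / sizeof(uint64_t);
}

/* Swap path: the mapping is writable, so clamp nr in place. */
static void demo_clamp_nr(struct demo_record *rec)
{
	uint64_t max_nr = demo_max_nr(rec);

	if (rec->nr > max_nr)
		rec->nr = max_nr;
}

/* Native path: the mapping is read-only, so only decide to skip. */
static int demo_should_skip(const struct demo_record *rec)
{
	return rec->nr > demo_max_nr(rec);
}
```

The native helper takes a const pointer on purpose: it can only
report the problem, which matches the "skip instead of clamping"
rule for the read-only mapping.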

Also fix stat_config swap range: change size += 1 to
size += sizeof(event->stat_config.nr) for clarity.  The old +1
happened to work because mem_bswap_64 processes 8-byte chunks,
but the intent is to include the 8-byte nr field in the swap
range.
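
To illustrate why the old +1 happened to work, here is a sketch that
assumes mem_bswap_64() walks the buffer in 8-byte steps for as long
as the remaining byte count is positive.  That loop shape is my
reading of the helper, not something quoted in this patch, and the
demo_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable 64-bit byte swap, standing in for bswap_64(). */
static uint64_t demo_bswap_64(uint64_t v)
{
	uint64_t r = 0;

	for (int i = 0; i < 8; i++) {
		r = (r << 8) | (v & 0xff);
		v >>= 8;
	}
	return r;
}

/*
 * Swap u64 chunks while bytes remain.  Returns the number of chunks
 * swapped so the effect of different byte_size values is visible.
 */
static int demo_mem_bswap_64(uint64_t *m, int byte_size)
{
	int chunks = 0;

	assert(m != NULL);
	while (byte_size > 0) {
		*m = demo_bswap_64(*m);
		byte_size -= (int)sizeof(uint64_t);
		m++;
		chunks++;
	}
	return chunks;
}
```

Under that assumption, a byte_size of nr * 8 + 1 and one of
nr * 8 + 8 both swap nr + 1 chunks, so the two forms behave the same
and the sizeof() version only states the intent more clearly.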

Changes in v2:
- Document that PERF_RECORD_NAMESPACES max_nr includes trailing
  sample_id space when sample_id_all is present — harmless on the
  swap path because both per-element bswap_64 and swap_sample_id_all()
  perform the same u64 byte swap (Reported-by: sashiko-bot@kernel.org)

Reported-by: sashiko-bot@kernel.org # Running on a local machine
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/session.c | 253 +++++++++++++++++++++++++++++++++++---
 1 file changed, 234 insertions(+), 19 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index aef10d42be35487a..8588e12f110fca70 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -496,13 +496,35 @@ static int perf_event__throttle_swap(union perf_event *event,
 static int perf_event__namespaces_swap(union perf_event *event,
 				       bool sample_id_all)
 {
-	u64 i;
+	u64 i, nr, max_nr;
 
 	event->namespaces.pid		= bswap_32(event->namespaces.pid);
 	event->namespaces.tid		= bswap_32(event->namespaces.tid);
 	event->namespaces.nr_namespaces	= bswap_64(event->namespaces.nr_namespaces);
 
-	for (i = 0; i < event->namespaces.nr_namespaces; i++) {
+	nr = event->namespaces.nr_namespaces;
+	/*
+	 * Cannot underflow: perf_event__min_size[] guarantees header.size >= sizeof.
+	 * When sample_id_all is present max_nr slightly overestimates the
+	 * array space because header.size includes the trailing sample_id.
+	 * Harmless: both the per-element bswap_64 loop and swap_sample_id_all()
+	 * perform the same u64 byte swap, so the result is correct regardless
+	 * of where the boundary between array and sample_id falls.
+	 */
+	max_nr = (event->header.size - sizeof(event->namespaces)) /
+		 sizeof(event->namespaces.link_info[0]);
+	/*
+	 * Safe to clamp: each namespace entry is indexed by type;
+	 * missing entries just won't be resolved.
+	 */
+	if (nr > max_nr) {
+		pr_warning("WARNING: PERF_RECORD_NAMESPACES: nr_namespaces %" PRIu64 " exceeds payload (max %" PRIu64 "), clamping\n",
+			   nr, max_nr);
+		nr = max_nr;
+		event->namespaces.nr_namespaces = nr;
+	}
+
+	for (i = 0; i < nr; i++) {
 		struct perf_ns_link_info *ns = &event->namespaces.link_info[i];
 
 		ns->dev = bswap_64(ns->dev);
@@ -734,11 +756,23 @@ static int perf_event__auxtrace_error_swap(union perf_event *event,
 static int perf_event__thread_map_swap(union perf_event *event,
 				       bool sample_id_all __maybe_unused)
 {
-	unsigned i;
+	unsigned int i;
+	u64 nr;
 
 	event->thread_map.nr = bswap_64(event->thread_map.nr);
 
-	for (i = 0; i < event->thread_map.nr; i++)
+	/*
+	 * Reject rather than clamp: unlike namespaces (indexed by type)
+	 * or stat_config (self-describing tags), a truncated thread map
+	 * is structurally broken — downstream would get a wrong map.
+	 */
+	/* Cannot underflow: perf_event__min_size[] guarantees header.size >= sizeof */
+	nr = event->thread_map.nr;
+	if (nr > (event->header.size - sizeof(event->thread_map)) /
+		  sizeof(event->thread_map.entries[0]))
+		return -1;
+
+	for (i = 0; i < nr; i++)
 		event->thread_map.entries[i].pid = bswap_64(event->thread_map.entries[i].pid);
 	return 0;
 }
@@ -747,32 +781,80 @@ static int perf_event__cpu_map_swap(union perf_event *event,
 				    bool sample_id_all __maybe_unused)
 {
 	struct perf_record_cpu_map_data *data = &event->cpu_map.data;
+	u32 payload = event->header.size - sizeof(event->header);
 
 	data->type = bswap_16(data->type);
 
+	/*
+	 * Safe to clamp: a shorter CPU map just means some CPUs
+	 * are absent; tools process the CPUs that are present.
+	 */
 	switch (data->type) {
-	case PERF_CPU_MAP__CPUS:
-		data->cpus_data.nr = bswap_16(data->cpus_data.nr);
+	case PERF_CPU_MAP__CPUS: {
+		u16 nr, max_nr;
 
-		for (unsigned i = 0; i < data->cpus_data.nr; i++)
+		data->cpus_data.nr = bswap_16(data->cpus_data.nr);
+		nr = data->cpus_data.nr;
+		max_nr = (payload - offsetof(struct perf_record_cpu_map_data,
+					     cpus_data.cpu)) /
+			 sizeof(data->cpus_data.cpu[0]);
+		if (nr > max_nr) {
+			pr_warning("WARNING: PERF_RECORD_CPU_MAP: nr %u exceeds payload (max %u), clamping\n",
+				   nr, max_nr);
+			nr = max_nr;
+			data->cpus_data.nr = nr;
+		}
+		for (unsigned int i = 0; i < nr; i++)
 			data->cpus_data.cpu[i] = bswap_16(data->cpus_data.cpu[i]);
 		break;
+	}
 	case PERF_CPU_MAP__MASK:
 		data->mask32_data.long_size = bswap_16(data->mask32_data.long_size);
 
 		switch (data->mask32_data.long_size) {
-		case 4:
+		case 4: {
+			u16 nr, max_nr;
+
 			data->mask32_data.nr = bswap_16(data->mask32_data.nr);
-			for (unsigned i = 0; i < data->mask32_data.nr; i++)
+			nr = data->mask32_data.nr;
+			max_nr = (payload - offsetof(struct perf_record_cpu_map_data,
+						     mask32_data.mask)) /
+				 sizeof(data->mask32_data.mask[0]);
+			if (nr > max_nr) {
+				pr_warning("WARNING: PERF_RECORD_CPU_MAP mask32: nr %u exceeds payload (max %u), clamping\n",
+					   nr, max_nr);
+				nr = max_nr;
+				data->mask32_data.nr = nr;
+			}
+			for (unsigned int i = 0; i < nr; i++)
 				data->mask32_data.mask[i] = bswap_32(data->mask32_data.mask[i]);
 			break;
-		case 8:
+		}
+		case 8: {
+			u16 nr, max_nr;
+
 			data->mask64_data.nr = bswap_16(data->mask64_data.nr);
-			for (unsigned i = 0; i < data->mask64_data.nr; i++)
+			nr = data->mask64_data.nr;
+			if (payload < offsetof(struct perf_record_cpu_map_data, mask64_data.mask)) {
+				data->mask64_data.nr = 0;
+				break;
+			}
+			max_nr = (payload - offsetof(struct perf_record_cpu_map_data,
+						     mask64_data.mask)) /
+				 sizeof(data->mask64_data.mask[0]);
+			if (nr > max_nr) {
+				pr_warning("WARNING: PERF_RECORD_CPU_MAP mask64: nr %u exceeds payload (max %u), clamping\n",
+					   nr, max_nr);
+				nr = max_nr;
+				data->mask64_data.nr = nr;
+			}
+			for (unsigned int i = 0; i < nr; i++)
 				data->mask64_data.mask[i] = bswap_64(data->mask64_data.mask[i]);
 			break;
+		}
 		default:
-			pr_err("cpu_map swap: unsupported long size\n");
+			pr_err("cpu_map swap: unsupported long size %u\n",
+			       data->mask32_data.long_size);
 		}
 		break;
 	case PERF_CPU_MAP__RANGE_CPUS:
@@ -788,11 +870,27 @@ static int perf_event__cpu_map_swap(union perf_event *event,
 static int perf_event__stat_config_swap(union perf_event *event,
 					bool sample_id_all __maybe_unused)
 {
-	u64 size;
+	u64 nr, max_nr, size;
 
-	size  = bswap_64(event->stat_config.nr) * sizeof(event->stat_config.data[0]);
-	size += 1; /* nr item itself */
+	nr = bswap_64(event->stat_config.nr);
+	/* Cannot underflow: perf_event__min_size[] guarantees header.size >= sizeof */
+	max_nr = (event->header.size - sizeof(event->stat_config)) /
+		 sizeof(event->stat_config.data[0]);
+	/*
+	 * Safe to clamp: each config entry is self-describing
+	 * via its tag; missing entries keep their defaults.
+	 */
+	if (nr > max_nr) {
+		pr_warning("WARNING: PERF_RECORD_STAT_CONFIG: nr %" PRIu64 " exceeds payload (max %" PRIu64 "), clamping\n",
+			   nr, max_nr);
+		nr = max_nr;
+	}
+	size = nr * sizeof(event->stat_config.data[0]);
+	/* The swap starts at &nr, so add its size to cover the full range */
+	size += sizeof(event->stat_config.nr);
 	mem_bswap_64(&event->stat_config.nr, size);
+	/* Persist the clamped value in native byte order */
+	event->stat_config.nr = nr;
 	return 0;
 }
 
@@ -1730,8 +1828,27 @@ static int machines__deliver_event(struct machines *machines,
 					   "COMM"))
 			return 0;
 		return tool->comm(tool, event, sample, machine);
-	case PERF_RECORD_NAMESPACES:
+	case PERF_RECORD_NAMESPACES: {
+		/*
+		 * Cannot underflow: perf_event__min_size[] guarantees header.size >= sizeof.
+		 * Includes trailing sample_id space when present, but prevents OOB.
+		 */
+		u64 max_nr = (event->header.size - sizeof(event->namespaces)) /
+			     sizeof(event->namespaces.link_info[0]);
+
+		/*
+		 * Native-endian events are mmap'd read-only, so we
+		 * cannot clamp nr in place.  Skip the event instead.
+		 * The swap handler already clamps on the writable
+		 * cross-endian path.
+		 */
+		if (event->namespaces.nr_namespaces > max_nr) {
+			pr_warning("WARNING: PERF_RECORD_NAMESPACES: nr_namespaces %" PRIu64 " exceeds payload (max %" PRIu64 "), skipping\n",
+				   (u64)event->namespaces.nr_namespaces, max_nr);
+			return 0;
+		}
 		return tool->namespaces(tool, event, sample, machine);
+	}
 	case PERF_RECORD_CGROUP:
 		if (!perf_event__check_nul(event->cgroup.path,
 					   (void *)event + event->header.size,
@@ -1912,15 +2029,112 @@ static s64 perf_session__process_user_event(struct perf_session *session,
 		perf_session__auxtrace_error_inc(session, event);
 		err = tool->auxtrace_error(tool, session, event);
 		break;
-	case PERF_RECORD_THREAD_MAP:
+	case PERF_RECORD_THREAD_MAP: {
+		u64 max_nr;
+
+		if (event->header.size < sizeof(event->thread_map)) {
+			pr_err("PERF_RECORD_THREAD_MAP: header.size (%u) too small\n",
+			       event->header.size);
+			err = -EINVAL;
+			break;
+		}
+
+		max_nr = (event->header.size - sizeof(event->thread_map)) /
+			 sizeof(event->thread_map.entries[0]);
+		if (event->thread_map.nr > max_nr) {
+			pr_err("PERF_RECORD_THREAD_MAP: nr %" PRIu64 " exceeds max %" PRIu64 "\n",
+			       (u64)event->thread_map.nr, max_nr);
+			err = -EINVAL;
+			break;
+		}
+
 		err = tool->thread_map(tool, session, event);
 		break;
-	case PERF_RECORD_CPU_MAP:
+	}
+	case PERF_RECORD_CPU_MAP: {
+		struct perf_record_cpu_map_data *data = &event->cpu_map.data;
+		u32 payload = event->header.size - sizeof(event->header);
+
+		/*
+		 * Native-endian events are mmap'd read-only, so we
+		 * cannot clamp nr fields in place.  Skip the event
+		 * if any variant overflows.
+		 */
+		switch (data->type) {
+		case PERF_CPU_MAP__CPUS: {
+			u16 max_nr = (payload - offsetof(struct perf_record_cpu_map_data,
+							 cpus_data.cpu)) /
+				     sizeof(data->cpus_data.cpu[0]);
+
+			if (data->cpus_data.nr > max_nr) {
+				pr_warning("WARNING: PERF_RECORD_CPU_MAP: nr %u exceeds payload (max %u), skipping\n",
+					   data->cpus_data.nr, max_nr);
+				err = 0;
+				goto out;
+			}
+			break;
+		}
+		case PERF_CPU_MAP__MASK:
+			if (data->mask32_data.long_size == 4) {
+				u16 max_nr = (payload - offsetof(struct perf_record_cpu_map_data,
+								 mask32_data.mask)) /
+					     sizeof(data->mask32_data.mask[0]);
+
+				if (data->mask32_data.nr > max_nr) {
+					pr_warning("WARNING: PERF_RECORD_CPU_MAP mask32: nr %u exceeds payload (max %u), skipping\n",
+						   data->mask32_data.nr, max_nr);
+					err = 0;
+					goto out;
+				}
+			} else if (data->mask64_data.long_size == 8) {
+				u16 max_nr;
+
+				if (payload < offsetof(struct perf_record_cpu_map_data, mask64_data.mask)) {
+					err = 0;
+					goto out;
+				}
+				max_nr = (payload - offsetof(struct perf_record_cpu_map_data,
+							     mask64_data.mask)) /
+					 sizeof(data->mask64_data.mask[0]);
+				if (data->mask64_data.nr > max_nr) {
+					pr_warning("WARNING: PERF_RECORD_CPU_MAP mask64: nr %u exceeds payload (max %u), skipping\n",
+						   data->mask64_data.nr, max_nr);
+					err = 0;
+					goto out;
+				}
+			} else {
+				pr_warning("WARNING: PERF_RECORD_CPU_MAP: unsupported long_size %u, skipping\n",
+					   data->mask32_data.long_size);
+				err = 0;
+				goto out;
+			}
+			break;
+		default:
+			break;
+		}
+
 		err = tool->cpu_map(tool, session, event);
 		break;
-	case PERF_RECORD_STAT_CONFIG:
+	}
+	case PERF_RECORD_STAT_CONFIG: {
+		/* Cannot underflow: perf_event__min_size[] guarantees header.size >= sizeof */
+		u64 max_nr = (event->header.size - sizeof(event->stat_config)) /
+			     sizeof(event->stat_config.data[0]);
+
+		/*
+		 * Native-endian events are mmap'd read-only, so we
+		 * cannot clamp nr in place.  Skip the event instead.
+		 */
+		if (event->stat_config.nr > max_nr) {
+			pr_warning("WARNING: PERF_RECORD_STAT_CONFIG: nr %" PRIu64 " exceeds payload (max %" PRIu64 "), skipping\n",
+				   (u64)event->stat_config.nr, max_nr);
+			err = 0;
+			goto out;
+		}
+
 		err = tool->stat_config(tool, session, event);
 		break;
+	}
 	case PERF_RECORD_STAT:
 		err = tool->stat(tool, session, event);
 		break;
@@ -1963,6 +2177,7 @@ static s64 perf_session__process_user_event(struct perf_session *session,
 		err = -EINVAL;
 		break;
 	}
+out:
 	perf_sample__exit(&sample);
 	return err;
 }
-- 
2.54.0



* [PATCH 12/29] perf header: Byte-swap build ID event pid and bounds check section entries
  2026-05-26 21:17 [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
                   ` (10 preceding siblings ...)
  2026-05-26 21:17 ` [PATCH 11/29] perf session: Validate nr fields against event size on both swap and common paths Arnaldo Carvalho de Melo
@ 2026-05-26 21:17 ` Arnaldo Carvalho de Melo
  2026-05-26 22:05   ` sashiko-bot
  2026-05-26 21:17 ` [PATCH 13/29] perf cpumap: Reject RANGE_CPUS with start_cpu > end_cpu Arnaldo Carvalho de Melo
                   ` (17 subsequent siblings)
  29 siblings, 1 reply; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-05-26 21:17 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot,
	Claude Opus 4.6 (1M context)

From: Arnaldo Carvalho de Melo <acme@redhat.com>

perf_header__read_build_ids() swaps the event header fields for cross-endian
perf.data files but not bev.pid. This causes perf_session__findnew_machine()
to look up the wrong machine for guest VM build IDs, misattributing them.
Swap bev.pid alongside the header fields.

Also add a build_id_swap callback for stream-mode build ID events,
and validate NUL-termination of build_id.filename on the native-endian
delivery path (perf_session__process_user_event) — events with
unterminated filenames are skipped.

Harden perf_header__read_build_ids() against crafted perf.data files:

- Add overflow check on offset + size to prevent wrap past ULLONG_MAX.
- Reject bev.header.size == 0 which would loop forever.
- Reject bev.header.size > remaining section to prevent reading past
  the section boundary.
- Guard memcmp(filename, "nel.kallsyms]", 13) with len >= 13 to avoid
  reading uninitialized stack memory on short filenames.
- Force NUL-termination of filename before passing it to functions
  like machine__findnew_dso() that use strlen/strcmp.
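
The first three checks in the list above can be sketched as a
section walk.  This is an illustration only: demo_walk_section() and
its sizes[] argument are hypothetical stand-ins for reading each
record's header.size from the file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Walk records in [offset, offset + size).  sizes[i] plays the role
 * of the i-th record's header.size.  Returns the number of records
 * walked, or -1 if the section is malformed.
 */
static int demo_walk_section(uint64_t offset, uint64_t size,
			     const uint16_t *sizes, int nr_sizes)
{
	uint64_t limit;
	int i = 0;

	assert(sizes != NULL || nr_sizes == 0);

	/* offset + size must not wrap past the largest u64 value */
	if (size > UINT64_MAX - offset)
		return -1;
	limit = offset + size;

	while (offset < limit) {
		uint16_t rec_size;

		if (i >= nr_sizes)
			return -1;	/* ran out of input, like a short read */
		rec_size = sizes[i];
		/* 0 would never advance; too large would leave the section */
		if (rec_size == 0 || rec_size > limit - offset)
			return -1;
		offset += rec_size;
		i++;
	}
	return i;
}
```

Comparing against limit - offset instead of computing
offset + rec_size keeps the check itself free of overflow.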

Reported-by: sashiko-bot@kernel.org # Running on a local machine
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/header.c  | 50 +++++++++++++++++++++++++++++++++++----
 tools/perf/util/session.c | 27 ++++++++++++++++++++-
 2 files changed, 72 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 967c3d8ff12c8676..c0b5c99f462ad925 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 #include <errno.h>
 #include <inttypes.h>
+#include <limits.h>
 #include "string2.h"
 #include <sys/param.h>
 #include <sys/types.h>
@@ -2578,7 +2579,13 @@ static int perf_header__read_build_ids_abi_quirk(struct perf_header *header,
 	} old_bev;
 	struct perf_record_header_build_id bev;
 	char filename[PATH_MAX];
-	u64 limit = offset + size;
+	u64 limit;
+
+	/* Prevent offset + size from wrapping past ULLONG_MAX */
+	if (size > ULLONG_MAX - offset)
+		return -1;
+
+	limit = offset + size;
 
 	while (offset < limit) {
 		ssize_t len;
@@ -2589,6 +2596,10 @@ static int perf_header__read_build_ids_abi_quirk(struct perf_header *header,
 		if (header->needs_swap)
 			perf_event_header__bswap(&old_bev.header);
 
+		/* size == 0 loops forever; size > remaining reads past section */
+		if (old_bev.header.size == 0 || old_bev.header.size > limit - offset)
+			return -1;
+
 		len = old_bev.header.size - sizeof(old_bev);
 		if (len < 0 || len >= PATH_MAX) {
 			pr_warning("invalid build_id filename length %zd\n", len);
@@ -2597,6 +2608,13 @@ static int perf_header__read_build_ids_abi_quirk(struct perf_header *header,
 
 		if (readn(input, filename, len) != len)
 			return -1;
+		/*
+		 * The file data may lack a null terminator, which could
+		 * indicate a corrupt or crafted perf.data file.  Ensure
+		 * filename is always a valid C string before passing it
+		 * to functions like machine__findnew_dso().
+		 */
+		filename[len] = '\0';
 
 		bev.header = old_bev.header;
 
@@ -2624,17 +2642,32 @@ static int perf_header__read_build_ids(struct perf_header *header,
 	struct perf_session *session = container_of(header, struct perf_session, header);
 	struct perf_record_header_build_id bev;
 	char filename[PATH_MAX];
-	u64 limit = offset + size, orig_offset = offset;
+	u64 limit, orig_offset = offset;
 	int err = -1;
 
+	/* Prevent offset + size from wrapping past ULLONG_MAX */
+	if (size > ULLONG_MAX - offset)
+		return -1;
+
+	limit = offset + size;
+
 	while (offset < limit) {
 		ssize_t len;
 
 		if (readn(input, &bev, sizeof(bev)) != sizeof(bev))
 			goto out;
 
-		if (header->needs_swap)
+		if (header->needs_swap) {
 			perf_event_header__bswap(&bev.header);
+			bev.pid = bswap_32(bev.pid);
+		}
+
+		/*
+		 * size == 0 would loop forever (offset never advances);
+		 * size > remaining would read past the section boundary.
+		 */
+		if (bev.header.size == 0 || bev.header.size > limit - offset)
+			goto out;
 
 		len = bev.header.size - sizeof(bev);
 		if (len < 0 || len >= PATH_MAX) {
@@ -2644,6 +2677,13 @@ static int perf_header__read_build_ids(struct perf_header *header,
 
 		if (readn(input, filename, len) != len)
 			goto out;
+		/*
+		 * The file data may lack a null terminator, which could
+		 * indicate a corrupt or crafted perf.data file.  Ensure
+		 * filename is always a valid C string before passing it
+		 * to functions like machine__findnew_dso().
+		 */
+		filename[len] = '\0';
 		/*
 		 * The a1645ce1 changeset:
 		 *
@@ -2657,7 +2697,9 @@ static int perf_header__read_build_ids(struct perf_header *header,
 		 * '[kernel.kallsyms]' string for the kernel build-id has the
 		 * first 4 characters chopped off (where the pid_t sits).
 		 */
-		if (memcmp(filename, "nel.kallsyms]", 13) == 0) {
+		/* Guard short filenames against memcmp reading past the buffer */
+		if (len >= (ssize_t)sizeof("nel.kallsyms]") - 1 &&
+		    memcmp(filename, "nel.kallsyms]", sizeof("nel.kallsyms]") - 1) == 0) {
 			if (lseek(input, orig_offset, SEEK_SET) == (off_t)-1)
 				return -1;
 			return perf_header__read_build_ids_abi_quirk(header, input, offset, size);
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 8588e12f110fca70..0fac8f4e0e22310f 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -686,6 +686,25 @@ static int perf_event__hdr_attr_swap(union perf_event *event,
 	return 0;
 }
 
+static int perf_event__build_id_swap(union perf_event *event,
+				     bool sample_id_all)
+{
+	event->build_id.pid = bswap_32(event->build_id.pid);
+
+	if (sample_id_all) {
+		void *data = &event->build_id.filename;
+		void *end = (void *)event + event->header.size;
+		size_t len = strnlen(data, end - data);
+
+		/* See comment in perf_event__comm_swap() */
+		if (len == (size_t)(end - data))
+			return -1;
+		data += PERF_ALIGN(len + 1, sizeof(u64));
+		swap_sample_id_all(event, data);
+	}
+	return 0;
+}
+
 static int perf_event__event_update_swap(union perf_event *event,
 					 bool sample_id_all __maybe_unused)
 {
@@ -1014,7 +1033,7 @@ static perf_event__swap_op perf_event__swap_ops[] = {
 	[PERF_RECORD_HEADER_ATTR]	  = perf_event__hdr_attr_swap,
 	[PERF_RECORD_HEADER_EVENT_TYPE]	  = perf_event__event_type_swap,
 	[PERF_RECORD_HEADER_TRACING_DATA] = perf_event__tracing_data_swap,
-	[PERF_RECORD_HEADER_BUILD_ID]	  = NULL,
+	[PERF_RECORD_HEADER_BUILD_ID]	  = perf_event__build_id_swap,
 	[PERF_RECORD_HEADER_FEATURE]	  = perf_event__header_feature_swap,
 	[PERF_RECORD_ID_INDEX]		  = perf_event__all64_swap,
 	[PERF_RECORD_AUXTRACE_INFO]	  = perf_event__auxtrace_info_swap,
@@ -2004,6 +2023,12 @@ static s64 perf_session__process_user_event(struct perf_session *session,
 		err = tool->tracing_data(tool, session, event);
 		break;
 	case PERF_RECORD_HEADER_BUILD_ID:
+		if (!perf_event__check_nul(event->build_id.filename,
+					   (void *)event + event->header.size,
+					   "HEADER_BUILD_ID")) {
+			err = 0;
+			break;
+		}
 		err = tool->build_id(tool, session, event);
 		break;
 	case PERF_RECORD_FINISHED_ROUND:
-- 
2.54.0



* [PATCH 13/29] perf cpumap: Reject RANGE_CPUS with start_cpu > end_cpu
  2026-05-26 21:17 [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
                   ` (11 preceding siblings ...)
  2026-05-26 21:17 ` [PATCH 12/29] perf header: Byte-swap build ID event pid and bounds check section entries Arnaldo Carvalho de Melo
@ 2026-05-26 21:17 ` Arnaldo Carvalho de Melo
  2026-05-26 22:03   ` sashiko-bot
  2026-05-26 21:17 ` [PATCH 14/29] perf auxtrace: Harden auxtrace_error event handling Arnaldo Carvalho de Melo
                   ` (16 subsequent siblings)
  29 siblings, 1 reply; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-05-26 21:17 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot,
	Claude Opus 4.6 (1M context)

From: Arnaldo Carvalho de Melo <acme@redhat.com>

cpu_map__from_range() computes nr_cpus as end_cpu - start_cpu + 1.
When a crafted perf.data has start_cpu > end_cpu, this wraps to a
huge value, causing perf_cpu_map__empty_new() to attempt a massive
allocation.

Return NULL when the range is inverted.

Also clamp any_cpu to boolean (0 or 1) since it is added to the
allocation count — a crafted value > 1 would inflate the map size.

Harden cpu_map__from_mask() to reject unsupported long_size values
(anything other than 4 or 8), preventing misinterpretation of the
mask data layout.

Snapshot mmap'd fields via READ_ONCE() into locals to prevent
TOCTOU re-reads — the data pointer references MAP_SHARED mmap'd
memory that could theoretically change between reads on a
FUSE-backed file:

- cpu_map__from_range(): snapshot start_cpu, end_cpu, any_cpu
- cpu_map__from_entries(): snapshot nr and each cpu[i] element
- cpu_map__from_mask(): snapshot long_size (before validation,
  closing the check-then-read gap), mask_nr
- perf_record_cpu_map_data__read_one_mask(): add u16 long_size
  parameter so callers pass the validated copy instead of
  re-reading data->mask32_data.long_size from mmap'd memory
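
A minimal standalone sketch of the snapshot-then-validate pattern
listed above.  It is only an illustration: read_once_u16() is a
hypothetical stand-in for READ_ONCE(), and struct mask_hdr and
mask_bit_base() are made-up names, not the perf structures.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for READ_ONCE(): force exactly one load. */
static inline uint16_t read_once_u16(const uint16_t *p)
{
	return *(const volatile uint16_t *)p;
}

/* Made-up layout, loosely modelled on the mask variant's header. */
struct mask_hdr {
	uint16_t nr;
	uint16_t long_size;
};

/*
 * Return the first CPU number covered by mask word i, or -1 if
 * long_size is not 4 or 8.  The field is loaded once into a local,
 * and only that local is validated and used afterwards.
 */
int mask_bit_base(const struct mask_hdr *hdr, int i)
{
	uint16_t long_size;

	assert(hdr && i >= 0);
	long_size = read_once_u16(&hdr->long_size);

	if (long_size != 4 && long_size != 8)
		return -1;

	return i * long_size * 8;
}
```

Because the check and the multiplication both use the local copy, a
later change to the underlying memory cannot slip an unvalidated
value past the check.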

Reported-by: sashiko-bot@kernel.org # Running on a local machine
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/cpumap.c | 62 +++++++++++++++++++++++++++++-----------
 1 file changed, 45 insertions(+), 17 deletions(-)
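
For illustration, the range check and the any_cpu clamp described in
the commit message can be sketched as a small standalone function.
range_nr_slots() is a hypothetical name, not a perf function.  In C
both u16 operands are promoted to int, so with start_cpu 5 and
end_cpu 2 the unchecked expression end_cpu - start_cpu + 1 yields -2
rather than a small positive count.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compute how many map slots a RANGE_CPUS record needs.
 * Returns 0 and stores the count in *out, or -1 for an inverted range.
 * Illustrative only: range_nr_slots() is not a perf function.
 */
int range_nr_slots(uint16_t start_cpu, uint16_t end_cpu, uint16_t any_cpu,
		   unsigned int *out)
{
	assert(out);

	if (end_cpu < start_cpu)
		return -1;

	/* !!any_cpu folds any non-zero value down to 1 */
	*out = (unsigned int)(end_cpu - start_cpu) + 1 + !!any_cpu;
	return 0;
}
```

With the check in place the largest possible count is 65536 + 1,
reached for the range 0..65535 with any_cpu set.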

diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index 11922e1ded844a03..b1e5c29c6e3ec8df 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -10,6 +10,7 @@
 #include <linux/bitmap.h>
 #include "asm/bug.h"
 
+#include <linux/compiler.h>
 #include <linux/ctype.h>
 #include <linux/zalloc.h>
 #include <internal/cpumap.h>
@@ -40,15 +41,16 @@ bool perf_record_cpu_map_data__test_bit(int i,
 
 /* Read ith mask value from data into the given 64-bit sized bitmap */
 static void perf_record_cpu_map_data__read_one_mask(const struct perf_record_cpu_map_data *data,
-						    int i, unsigned long *bitmap)
+						    int i, unsigned long *bitmap,
+						    u16 long_size)
 {
 #if __SIZEOF_LONG__ == 8
-	if (data->mask32_data.long_size == 4)
+	if (long_size == 4)
 		bitmap[0] = data->mask32_data.mask[i];
 	else
 		bitmap[0] = data->mask64_data.mask[i];
 #else
-	if (data->mask32_data.long_size == 4) {
+	if (long_size == 4) {
 		bitmap[0] = data->mask32_data.mask[i];
 		bitmap[1] = 0;
 	} else {
@@ -64,24 +66,27 @@ static void perf_record_cpu_map_data__read_one_mask(const struct perf_record_cpu
 }
 static struct perf_cpu_map *cpu_map__from_entries(const struct perf_record_cpu_map_data *data)
 {
+	/* Snapshot nr — data is mmap'd and could change between reads */
+	u16 nr = READ_ONCE(data->cpus_data.nr);
 	struct perf_cpu_map *map;
 
-	map = perf_cpu_map__empty_new(data->cpus_data.nr);
+	map = perf_cpu_map__empty_new(nr);
 	if (!map)
 		return NULL;
 
-	for (unsigned int i = 0; i < data->cpus_data.nr; i++) {
+	for (unsigned int i = 0; i < nr; i++) {
+		u16 cpu = READ_ONCE(data->cpus_data.cpu[i]);
 		/*
 		 * Special treatment for -1, which is not real cpu number,
 		 * and we need to use (int) -1 to initialize map[i],
 		 * otherwise it would become 65535.
 		 */
-		if (data->cpus_data.cpu[i] == (u16) -1) {
+		if (cpu == (u16) -1) {
 			RC_CHK_ACCESS(map)->map[i].cpu = -1;
-		} else if (data->cpus_data.cpu[i] < INT16_MAX) {
-			RC_CHK_ACCESS(map)->map[i].cpu = (int16_t) data->cpus_data.cpu[i];
+		} else if (cpu < INT16_MAX) {
+			RC_CHK_ACCESS(map)->map[i].cpu = (int16_t) cpu;
 		} else {
-			pr_err("Invalid cpumap entry %u\n", data->cpus_data.cpu[i]);
+			pr_err("Invalid cpumap entry %u\n", cpu);
 			perf_cpu_map__put(map);
 			return NULL;
 		}
@@ -93,11 +98,21 @@ static struct perf_cpu_map *cpu_map__from_entries(const struct perf_record_cpu_m
 static struct perf_cpu_map *cpu_map__from_mask(const struct perf_record_cpu_map_data *data)
 {
 	DECLARE_BITMAP(local_copy, 64);
-	int weight = 0, mask_nr = data->mask32_data.nr;
+	int weight = 0, mask_nr;
+	/* Snapshot before validation — data is mmap'd and could change */
+	u16 long_size = READ_ONCE(data->mask32_data.long_size);
 	struct perf_cpu_map *map;
 
+	/* long_size must be 4 or 8; other values overflow cpus_per_i below */
+	if (long_size != 4 && long_size != 8) {
+		pr_warning("WARNING: cpu_map mask: unsupported long_size %u\n", long_size);
+		return NULL;
+	}
+
+	mask_nr = READ_ONCE(data->mask32_data.nr);
+
 	for (int i = 0; i < mask_nr; i++) {
-		perf_record_cpu_map_data__read_one_mask(data, i, local_copy);
+		perf_record_cpu_map_data__read_one_mask(data, i, local_copy, long_size);
 		weight += bitmap_weight(local_copy, 64);
 	}
 
@@ -106,11 +121,14 @@ static struct perf_cpu_map *cpu_map__from_mask(const struct perf_record_cpu_map_
 		return NULL;
 
 	for (int i = 0, j = 0; i < mask_nr; i++) {
-		int cpus_per_i = (i * data->mask32_data.long_size  * BITS_PER_BYTE);
+		int cpus_per_i = (i * long_size * BITS_PER_BYTE);
 		int cpu;
 
-		perf_record_cpu_map_data__read_one_mask(data, i, local_copy);
+		perf_record_cpu_map_data__read_one_mask(data, i, local_copy, long_size);
 		for_each_set_bit(cpu, local_copy, 64) {
+			/* Guard against more set bits than the first pass counted */
+			if (j >= weight)
+				break;
 			if (cpu + cpus_per_i < INT16_MAX) {
 				RC_CHK_ACCESS(map)->map[j++].cpu = cpu + cpus_per_i;
 			} else {
@@ -126,18 +144,28 @@ static struct perf_cpu_map *cpu_map__from_mask(const struct perf_record_cpu_map_
 
 static struct perf_cpu_map *cpu_map__from_range(const struct perf_record_cpu_map_data *data)
 {
+	/* Snapshot fields — data is mmap'd and could change between reads */
+	u16 start_cpu = READ_ONCE(data->range_cpu_data.start_cpu);
+	u16 end_cpu = READ_ONCE(data->range_cpu_data.end_cpu);
+	u16 any_cpu = READ_ONCE(data->range_cpu_data.any_cpu);
 	struct perf_cpu_map *map;
 	unsigned int i = 0;
 
-	map = perf_cpu_map__empty_new(data->range_cpu_data.end_cpu -
-				data->range_cpu_data.start_cpu + 1 + data->range_cpu_data.any_cpu);
+	if (end_cpu < start_cpu) {
+		pr_warning("WARNING: cpu_map range: end_cpu %u < start_cpu %u\n",
+			   end_cpu, start_cpu);
+		return NULL;
+	}
+
+	/* any_cpu is boolean (0 or 1), not a count — clamp to avoid inflated nr */
+	map = perf_cpu_map__empty_new(end_cpu - start_cpu + 1 + !!any_cpu);
 	if (!map)
 		return NULL;
 
-	if (data->range_cpu_data.any_cpu)
+	if (any_cpu)
 		RC_CHK_ACCESS(map)->map[i++].cpu = -1;
 
-	for (int cpu = data->range_cpu_data.start_cpu; cpu <= data->range_cpu_data.end_cpu;
+	for (int cpu = start_cpu; cpu <= end_cpu;
 	     i++, cpu++) {
 		if (cpu < INT16_MAX) {
 			RC_CHK_ACCESS(map)->map[i].cpu = cpu;
-- 
2.54.0


* [PATCH 14/29] perf auxtrace: Harden auxtrace_error event handling
  2026-05-26 21:17 [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
                   ` (12 preceding siblings ...)
  2026-05-26 21:17 ` [PATCH 13/29] perf cpumap: Reject RANGE_CPUS with start_cpu > end_cpu Arnaldo Carvalho de Melo
@ 2026-05-26 21:17 ` Arnaldo Carvalho de Melo
  2026-05-26 21:17 ` [PATCH 15/29] perf session: Add byte-swap and bounds check for PERF_RECORD_BPF_METADATA events Arnaldo Carvalho de Melo
                   ` (15 subsequent siblings)
  29 siblings, 0 replies; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-05-26 21:17 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot,
	Claude Opus 4.6 (1M context)

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Fix four issues in PERF_RECORD_AUXTRACE_ERROR handling:

1. auxtrace_error_name() takes a signed int parameter, but e->type
   is __u32.  A crafted value like 0xFFFFFFFF converts to -1, passes
   the bounds check, and causes a negative array index.  Fix by
   changing the parameter to unsigned int.

2. The msg field is printed via %s without a length bound.  The
   min_size table only guarantees fields up to msg (offset 48), so
   a truncated event may have no msg bytes at all within the event
   boundary.  Compute the available msg length from header.size,
   cap it at sizeof(e->msg), and use %.*s.

3. fmt >= 2 adds machine_pid and vcpu fields after msg[64].  Older
   files may have fmt >= 2 but an event size that doesn't include
   these fields.  Add a size check in the swap handler to downgrade
   fmt before the conditional field access, and a matching size
   guard in the fprintf path for native-endian events (which are
   mmap'd read-only and can't be modified in place).

4. python_process_auxtrace_error() had the same issues: msg was
   passed to tuple_set_string() unbounded, and machine_pid/vcpu
   were accessed unconditionally without checking fmt or event
   size.  Apply the same bounds checks.

Reported-by: sashiko-bot@kernel.org # Running on a local machine
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/auxtrace.c                    | 24 +++++++++++++---
 .../scripting-engines/trace-event-python.c    | 28 +++++++++++++++++--
 tools/perf/util/session.c                     | 18 ++++++++++--
 3 files changed, 61 insertions(+), 9 deletions(-)
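
A rough standalone sketch of the msg length computation from item 2,
under simplified assumptions: struct err_rec is a made-up record with
a 16-byte fixed part, not struct perf_record_auxtrace_error, and
msg_max_len() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MSG_LEN 64	/* assumed size of the msg[] array */

/* Simplified, made-up record: 16-byte fixed part followed by msg[] */
struct err_rec {
	uint16_t size;		/* total bytes the record claims to occupy */
	uint16_t pad[7];
	char	 msg[MSG_LEN];
};

/*
 * Number of msg bytes that are both inside the record (per size)
 * and inside the msg[] array; suitable as the precision for "%.*s".
 */
int msg_max_len(const struct err_rec *r)
{
	int avail;

	assert(r);
	avail = (int)r->size - (int)offsetof(struct err_rec, msg);

	if (avail < 0)
		avail = 0;
	if (avail > MSG_LEN)
		avail = MSG_LEN;
	return avail;
}
```

The result would be passed as the precision, as in
printf("%.*s\n", msg_max_len(&r), r.msg), so that printf reads at
most that many bytes and still stops early at a NUL.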

diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index a224687ffbc1b5be..d9770e1d2f959fc4 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -1759,7 +1759,7 @@ static const char * const auxtrace_error_type_name[] = {
 	[PERF_AUXTRACE_ERROR_ITRACE] = "instruction trace",
 };
 
-static const char *auxtrace_error_name(int type)
+static const char *auxtrace_error_name(unsigned int type)
 {
 	const char *error_type_name = NULL;
 
@@ -1775,6 +1775,7 @@ size_t perf_event__fprintf_auxtrace_error(union perf_event *event, FILE *fp)
 	struct perf_record_auxtrace_error *e = &event->auxtrace_error;
 	unsigned long long nsecs = e->time;
 	const char *msg = e->msg;
+	int msg_max;
 	int ret;
 
 	ret = fprintf(fp, " %s error type %u",
@@ -1792,11 +1793,26 @@ size_t perf_event__fprintf_auxtrace_error(union perf_event *event, FILE *fp)
 	if (!e->fmt)
 		msg = (const char *)&e->time;
 
-	if (e->fmt >= 2 && e->machine_pid)
+	/* Bound msg to the bytes actually within the event, capped at the array size */
+	msg_max = (int)((void *)event + event->header.size - (void *)msg);
+	if (msg_max < 0)
+		msg_max = 0;
+	if (msg_max > (int)sizeof(e->msg))
+		msg_max = sizeof(e->msg);
+
+	/*
+	 * Unlike the swap path which downgrades fmt in place,
+	 * native-endian events are mmap'd read-only — check size
+	 * instead to avoid accessing machine_pid/vcpu OOB.
+	 */
+	if (e->fmt >= 2 &&
+	    event->header.size >= offsetof(typeof(event->auxtrace_error), vcpu) +
+				  sizeof(event->auxtrace_error.vcpu) &&
+	    e->machine_pid)
 		ret += fprintf(fp, " machine_pid %d vcpu %d", e->machine_pid, e->vcpu);
 
-	ret += fprintf(fp, " cpu %d pid %d tid %d ip %#"PRI_lx64" code %u: %s\n",
-		       e->cpu, e->pid, e->tid, e->ip, e->code, msg);
+	ret += fprintf(fp, " cpu %d pid %d tid %d ip %#"PRI_lx64" code %u: %.*s\n",
+		       e->cpu, e->pid, e->tid, e->ip, e->code, msg_max, msg);
 	return ret;
 }
 
diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
index 8edd2f36e5a95829..cee1f32d70225cc7 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -1607,6 +1607,9 @@ static void python_process_auxtrace_error(struct perf_session *session __maybe_u
 	const char *handler_name = "auxtrace_error";
 	unsigned long long tm = e->time;
 	const char *msg = e->msg;
+	s32 machine_pid = 0, vcpu = 0;
+	char msg_buf[MAX_AUXTRACE_ERROR_MSG + 1];
+	int msg_max;
 	PyObject *handler, *t;
 
 	handler = get_handler(handler_name);
@@ -1618,6 +1621,25 @@ static void python_process_auxtrace_error(struct perf_session *session __maybe_u
 		msg = (const char *)&e->time;
 	}
 
+	/* Bound msg to the bytes within the event, ensure NUL-termination */
+	msg_max = (int)((void *)event + event->header.size - (void *)msg);
+	if (msg_max <= 0) {
+		msg_buf[0] = '\0';
+	} else {
+		if (msg_max > (int)sizeof(msg_buf) - 1)
+			msg_max = sizeof(msg_buf) - 1;
+		memcpy(msg_buf, msg, msg_max);
+		msg_buf[msg_max] = '\0';
+	}
+
+	/* Only access fmt >= 2 fields if the event is large enough */
+	if (e->fmt >= 2 &&
+	    event->header.size >= offsetof(typeof(event->auxtrace_error), vcpu) +
+				  sizeof(event->auxtrace_error.vcpu)) {
+		machine_pid = e->machine_pid;
+		vcpu = e->vcpu;
+	}
+
 	t = tuple_new(11);
 
 	tuple_set_u32(t, 0, e->type);
@@ -1627,10 +1649,10 @@ static void python_process_auxtrace_error(struct perf_session *session __maybe_u
 	tuple_set_s32(t, 4, e->tid);
 	tuple_set_u64(t, 5, e->ip);
 	tuple_set_u64(t, 6, tm);
-	tuple_set_string(t, 7, msg);
+	tuple_set_string(t, 7, msg_buf);
 	tuple_set_u32(t, 8, cpumode);
-	tuple_set_s32(t, 9, e->machine_pid);
-	tuple_set_s32(t, 10, e->vcpu);
+	tuple_set_s32(t, 9, machine_pid);
+	tuple_set_s32(t, 10, vcpu);
 
 	call_object(handler, t, handler_name);
 
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 0fac8f4e0e22310f..092fccbea8f8017e 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -766,8 +766,22 @@ static int perf_event__auxtrace_error_swap(union perf_event *event,
 	if (event->auxtrace_error.fmt)
 		event->auxtrace_error.time = bswap_64(event->auxtrace_error.time);
 	if (event->auxtrace_error.fmt >= 2) {
-		event->auxtrace_error.machine_pid = bswap_32(event->auxtrace_error.machine_pid);
-		event->auxtrace_error.vcpu = bswap_32(event->auxtrace_error.vcpu);
+		/*
+		 * fmt >= 2 adds machine_pid and vcpu after msg[64].
+		 * Older files may have fmt >= 2 but an event size
+		 * that doesn't include these fields — downgrade to
+		 * avoid swapping out of bounds.
+		 */
+		if (event->header.size < offsetof(typeof(event->auxtrace_error), vcpu) +
+					 sizeof(event->auxtrace_error.vcpu)) {
+			pr_warning("WARNING: PERF_RECORD_AUXTRACE_ERROR: fmt %u but event too small for machine_pid/vcpu (%u bytes), downgrading fmt\n",
+				   event->auxtrace_error.fmt,
+				   event->header.size);
+			event->auxtrace_error.fmt = 1;
+		} else {
+			event->auxtrace_error.machine_pid = bswap_32(event->auxtrace_error.machine_pid);
+			event->auxtrace_error.vcpu = bswap_32(event->auxtrace_error.vcpu);
+		}
 	}
 	return 0;
 }
-- 
2.54.0


* [PATCH 15/29] perf session: Add byte-swap and bounds check for PERF_RECORD_BPF_METADATA events
  2026-05-26 21:17 [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
                   ` (13 preceding siblings ...)
  2026-05-26 21:17 ` [PATCH 14/29] perf auxtrace: Harden auxtrace_error event handling Arnaldo Carvalho de Melo
@ 2026-05-26 21:17 ` Arnaldo Carvalho de Melo
  2026-05-26 21:56   ` sashiko-bot
  2026-05-26 21:17 ` [PATCH 16/29] perf header: Validate null-termination in PERF_RECORD_EVENT_UPDATE string fields Arnaldo Carvalho de Melo
                   ` (14 subsequent siblings)
  29 siblings, 1 reply; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-05-26 21:17 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot, Blake Jones,
	Claude Opus 4.6 (1M context)

From: Arnaldo Carvalho de Melo <acme@redhat.com>

PERF_RECORD_BPF_METADATA has no entry in perf_event__swap_ops[], so its
nr_entries field is never byte-swapped when reading a cross-endian
perf.data file.  Downstream processing in
perf_event__fprintf_bpf_metadata() loops over nr_entries, so a
foreign-endian value causes out-of-bounds reads.

Add a swap handler that byte-swaps nr_entries after validating that
header.size is large enough.  The entries[] array contains only char
arrays (key/value strings), so no per-entry swap is needed — but ensure
NUL-termination on the writable cross-endian path.

Validate header.size, nr_entries, and string NUL-termination in the
common event delivery path so that native-endian files with malicious
values are also rejected.  Snapshot nr_entries via READ_ONCE() before
validation — the event is on a MAP_SHARED mmap that could theoretically
change between the bounds check and the loop.

Changes in v2:
- Snapshot event->header.size via READ_ONCE() into a local variable
  to prevent a double-fetch underflow in the max_entries calculation
  (Reported-by: sashiko-bot@kernel.org)
- Write back clamped nr_entries to the event on the swap path,
  consistent with NAMESPACES and STAT_CONFIG handlers.  Without
  the writeback, the common delivery path that runs after the swap
  handler sees the inflated nr and skips the event entirely
  (Reported-by: sashiko-bot@kernel.org)

Fixes: ab38e84ba9a8 ("perf record: collect BPF metadata from existing BPF programs")
Reported-by: sashiko-bot@kernel.org # Running on a local machine
Cc: Blake Jones <blakejones@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/session.c | 89 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 88 insertions(+), 1 deletion(-)
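
To illustrate the difference between the two strategies in the commit
message (clamp on the writable swap path, skip on the read-only
native path), here is a small standalone sketch.  FIXED_SIZE and
ENTRY_SIZE are assumed values and all three function names are
hypothetical; none of this is the perf record layout.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizes for illustration; not the real perf record layout */
#define FIXED_SIZE 32u	/* bytes before entries[] */
#define ENTRY_SIZE 16u	/* bytes per entry */

/*
 * How many whole entries fit in a record of hdr_size bytes.
 * Returns -1 if the fixed part itself does not fit, which also keeps
 * the subtraction below from underflowing.
 */
int64_t max_entries(uint32_t hdr_size)
{
	if (hdr_size < FIXED_SIZE)
		return -1;
	return (hdr_size - FIXED_SIZE) / ENTRY_SIZE;
}

/* Writable (swap) path: clamp nr in place.  Returns 0, or -1 if too small. */
int clamp_nr(uint32_t hdr_size, uint64_t *nr)
{
	int64_t max = max_entries(hdr_size);

	assert(nr);
	if (max < 0)
		return -1;
	if (*nr > (uint64_t)max)
		*nr = (uint64_t)max;
	return 0;
}

/* Read-only (native) path: report whether the record must be skipped. */
int must_skip(uint32_t hdr_size, uint64_t nr)
{
	int64_t max = max_entries(hdr_size);

	return max < 0 || nr > (uint64_t)max;
}
```

Both paths share the same bound, so they differ only in what happens
to a record whose count exceeds it.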

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 092fccbea8f8017e..95eb793026de6d8d 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -961,6 +961,48 @@ static int perf_event__time_conv_swap(union perf_event *event,
 	return 0;
 }
 
+static int perf_event__bpf_metadata_swap(union perf_event *event,
+					 bool sample_id_all __maybe_unused)
+{
+	u64 i, nr, max_nr;
+
+	/* Fixed header must fit before accessing nr_entries or prog_name */
+	if (event->header.size < sizeof(event->bpf_metadata))
+		return -1;
+
+	event->bpf_metadata.nr_entries = bswap_64(event->bpf_metadata.nr_entries);
+
+	/*
+	 * Ensure NUL-termination on the cross-endian path where the
+	 * mapping is writable (MAP_PRIVATE + PROT_WRITE).  Fixing
+	 * the string in place is preferred over rejecting because it
+	 * preserves the event for downstream processing — only the
+	 * last byte is lost.
+	 *
+	 * The native-endian path (MAP_SHARED + PROT_READ) cannot
+	 * write, so it validates and skips unterminated events in
+	 * perf_session__process_user_event() instead.  The two
+	 * strategies produce different outcomes for the same
+	 * malformed input (fix vs skip), which is inherent in the
+	 * writable-vs-read-only mapping model.
+	 */
+	event->bpf_metadata.prog_name[BPF_PROG_NAME_LEN - 1] = '\0';
+
+	nr = event->bpf_metadata.nr_entries;
+	max_nr = (event->header.size - sizeof(event->bpf_metadata)) /
+		 sizeof(event->bpf_metadata.entries[0]);
+	if (nr > max_nr) {
+		/* Persist the clamped value so the common path processes the entries instead of skipping the event */
+		nr = max_nr;
+		event->bpf_metadata.nr_entries = nr;
+	}
+
+	for (i = 0; i < nr; i++) {
+		event->bpf_metadata.entries[i].key[BPF_METADATA_KEY_LEN - 1] = '\0';
+		event->bpf_metadata.entries[i].value[BPF_METADATA_VALUE_LEN - 1] = '\0';
+	}
+	return 0;
+}
 static int
 perf_event__schedstat_cpu_swap(union perf_event *event __maybe_unused,
 			       bool sample_id_all __maybe_unused)
@@ -1060,6 +1102,7 @@ static perf_event__swap_op perf_event__swap_ops[] = {
 	[PERF_RECORD_STAT_ROUND]	  = perf_event__stat_round_swap,
 	[PERF_RECORD_EVENT_UPDATE]	  = perf_event__event_update_swap,
 	[PERF_RECORD_TIME_CONV]		  = perf_event__time_conv_swap,
+	[PERF_RECORD_BPF_METADATA]	  = perf_event__bpf_metadata_swap,
 	[PERF_RECORD_SCHEDSTAT_CPU]	  = perf_event__schedstat_cpu_swap,
 	[PERF_RECORD_SCHEDSTAT_DOMAIN]	  = perf_event__schedstat_domain_swap,
 	[PERF_RECORD_HEADER_MAX]	  = NULL,
@@ -2203,9 +2246,53 @@ static s64 perf_session__process_user_event(struct perf_session *session,
 	case PERF_RECORD_FINISHED_INIT:
 		err = tool->finished_init(tool, session, event);
 		break;
-	case PERF_RECORD_BPF_METADATA:
+	case PERF_RECORD_BPF_METADATA: {
+		u64 nr_entries, max_entries;
+		u32 hdr_size = READ_ONCE(event->header.size);
+
+		if (hdr_size < sizeof(event->bpf_metadata)) {
+			pr_warning("WARNING: PERF_RECORD_BPF_METADATA: header.size (%u) too small, skipping\n",
+				   hdr_size);
+			err = 0;
+			break;
+		}
+
+		/*
+		 * Native-endian files are mmap'd read-only — validate
+		 * NUL-termination instead of writing.
+		 */
+		if (strnlen(event->bpf_metadata.prog_name,
+			    BPF_PROG_NAME_LEN) == BPF_PROG_NAME_LEN) {
+			pr_warning("WARNING: PERF_RECORD_BPF_METADATA: prog_name not null-terminated, skipping\n");
+			err = 0;
+			break;
+		}
+
+		/* Snapshot — event is mmap'd and could change between reads */
+		nr_entries = READ_ONCE(event->bpf_metadata.nr_entries);
+		max_entries = (hdr_size - sizeof(event->bpf_metadata)) /
+			      sizeof(event->bpf_metadata.entries[0]);
+		if (nr_entries > max_entries) {
+			pr_warning("WARNING: PERF_RECORD_BPF_METADATA: nr_entries %" PRIu64 " exceeds max %" PRIu64 ", skipping\n",
+				   nr_entries, max_entries);
+			err = 0;
+			break;
+		}
+
+		for (u64 i = 0; i < nr_entries; i++) {
+			if (strnlen(event->bpf_metadata.entries[i].key,
+				    BPF_METADATA_KEY_LEN) == BPF_METADATA_KEY_LEN ||
+			    strnlen(event->bpf_metadata.entries[i].value,
+				    BPF_METADATA_VALUE_LEN) == BPF_METADATA_VALUE_LEN) {
+				pr_warning("WARNING: PERF_RECORD_BPF_METADATA: entry %" PRIu64 " key/value not null-terminated, skipping\n", i);
+				err = 0;
+				goto out;
+			}
+		}
+
 		err = tool->bpf_metadata(tool, session, event);
 		break;
+	}
 	case PERF_RECORD_SCHEDSTAT_CPU:
 		err = tool->schedstat_cpu(tool, session, event);
 		break;
-- 
2.54.0


* [PATCH 16/29] perf header: Validate null-termination in PERF_RECORD_EVENT_UPDATE string fields
  2026-05-26 21:17 [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
                   ` (14 preceding siblings ...)
  2026-05-26 21:17 ` [PATCH 15/29] perf session: Add byte-swap and bounds check for PERF_RECORD_BPF_METADATA events Arnaldo Carvalho de Melo
@ 2026-05-26 21:17 ` Arnaldo Carvalho de Melo
  2026-05-26 21:17 ` [PATCH 17/29] perf tools: Bounds check perf_event_attr fields against attr.size before printing Arnaldo Carvalho de Melo
                   ` (13 subsequent siblings)
  29 siblings, 0 replies; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-05-26 21:17 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot,
	Claude Opus 4.6 (1M context)

From: Arnaldo Carvalho de Melo <acme@redhat.com>

strdup(ev->unit) and strdup(ev->name) read until '\0' with no
guarantee the string is null-terminated within event->header.size.
The dump_trace fprintf path has the same problem with %s.

Validate before either path runs.  This is the same class of bug
that perf_event__check_nul() fixed for MMAP/MMAP2/COMM/CGROUP.

Also harden the event_update swap handler to:
- Validate SCALE event size before swapping the double at
  offset 24, which exceeds the 24-byte min_size.
- Validate CPUS event size before accessing the cpu_map
  type/nr/long_size fields, which also start at the min_size
  boundary.
- Swap CPUS variant fields (type, nr, long_size) so the
  processing path sees native byte order.

Add validation in perf_event__process_event_update() for all
event update variants (UNIT, NAME, SCALE, CPUS) before
dump_trace or processing.

Validate CPUS nr against payload size for both PERF_CPU_MAP__CPUS
and PERF_CPU_MAP__MASK types on the fprintf (dump_trace) path:
- CPUS: check nr does not exceed available cpu entries
- MASK: check nr does not exceed available mask entries for
  both mask32 (long_size == 4) and mask64 (long_size == 8)
  layouts, with underflow guards on the offsetof subtraction

Fix a missing break before the default case in the CPUS
switch path.

Reported-by: sashiko-bot@kernel.org # Running on a local machine
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/header.c  | 150 ++++++++++++++++++++++++++++++++++++--
 tools/perf/util/session.c |  99 ++++++++++++++++++++++++-
 2 files changed, 242 insertions(+), 7 deletions(-)
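
A minimal standalone sketch of the bounded NUL-termination check used
for the UNIT and NAME variants.  str_terminated() is a hypothetical
helper, not perf_event__check_nul(), and it uses memchr() in place of
the strnlen() == max_len comparison in the patch; the two tests are
equivalent.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if the string starting at str_off inside a record of
 * rec_size bytes has a NUL before the record ends, else 0.
 * Hypothetical helper, not the perf_event__check_nul() from the series.
 */
int str_terminated(const char *rec, size_t rec_size, size_t str_off)
{
	size_t max_len;

	assert(rec);
	/* Avoid underflow when the record ends before the string starts */
	max_len = rec_size > str_off ? rec_size - str_off : 0;

	/* memchr() looks at no more than max_len bytes */
	return max_len != 0 && memchr(rec + str_off, '\0', max_len) != NULL;
}
```

Only once this returns 1 is it safe to hand the string to strdup() or
to print it with a plain %s.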

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index c0b5c99f462ad925..9e3a08b1f8ae5a73 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -5117,15 +5117,76 @@ size_t perf_event__fprintf_event_update(union perf_event *event, FILE *fp)
 
 	switch (ev->type) {
 	case PERF_EVENT_UPDATE__SCALE:
+		if (event->header.size < offsetof(struct perf_record_event_update, scale) +
+					 sizeof(ev->scale)) {
+			ret += fprintf(fp, "... scale: (truncated)\n");
+			break;
+		}
 		ret += fprintf(fp, "... scale: %f\n", ev->scale.scale);
 		break;
 	case PERF_EVENT_UPDATE__UNIT:
-		ret += fprintf(fp, "... unit:  %s\n", ev->unit);
-		break;
-	case PERF_EVENT_UPDATE__NAME:
-		ret += fprintf(fp, "... name:  %s\n", ev->name);
+	case PERF_EVENT_UPDATE__NAME: {
+		size_t str_off = offsetof(struct perf_record_event_update, unit);
+		size_t max_len = event->header.size > str_off ?
+				 event->header.size - str_off : 0;
+
+		if (max_len == 0 || strnlen(ev->unit, max_len) == max_len) {
+			ret += fprintf(fp, "... %s: (unterminated)\n",
+				       ev->type == PERF_EVENT_UPDATE__UNIT ? "unit" : "name");
+			break;
+		}
+		ret += fprintf(fp, "... %s:  %s\n",
+			       ev->type == PERF_EVENT_UPDATE__UNIT ? "unit" : "name",
+			       ev->unit);
 		break;
-	case PERF_EVENT_UPDATE__CPUS:
+	}
+	case PERF_EVENT_UPDATE__CPUS: {
+		size_t cpus_off = offsetof(struct perf_record_event_update, cpus);
+		u32 cpus_payload;
+
+		if (event->header.size < cpus_off + sizeof(__u16) +
+					 sizeof(struct perf_record_range_cpu_map)) {
+			ret += fprintf(fp, "... cpus: (truncated)\n");
+			break;
+		}
+
+		/*
+		 * Validate nr against payload — this function may be
+		 * called from the stub handler (dump_trace path) which
+		 * bypasses perf_event__process_event_update() validation.
+		 */
+		cpus_payload = event->header.size - cpus_off;
+		if (ev->cpus.cpus.type == PERF_CPU_MAP__CPUS) {
+			if (cpus_payload < offsetof(struct perf_record_cpu_map_data, cpus_data.cpu) ||
+			    ev->cpus.cpus.cpus_data.nr >
+			    (cpus_payload - offsetof(struct perf_record_cpu_map_data, cpus_data.cpu)) /
+			    sizeof(ev->cpus.cpus.cpus_data.cpu[0])) {
+				ret += fprintf(fp, "... cpus: nr %u exceeds payload\n",
+					       ev->cpus.cpus.cpus_data.nr);
+				break;
+			}
+		} else if (ev->cpus.cpus.type == PERF_CPU_MAP__MASK) {
+			if (ev->cpus.cpus.mask32_data.long_size == 4) {
+				if (cpus_payload < offsetof(struct perf_record_cpu_map_data, mask32_data.mask) ||
+				    ev->cpus.cpus.mask32_data.nr >
+				    (cpus_payload - offsetof(struct perf_record_cpu_map_data, mask32_data.mask)) /
+				    sizeof(ev->cpus.cpus.mask32_data.mask[0])) {
+					ret += fprintf(fp, "... cpus: mask nr %u exceeds payload\n",
+						       ev->cpus.cpus.mask32_data.nr);
+					break;
+				}
+			} else if (ev->cpus.cpus.mask64_data.long_size == 8) {
+				if (cpus_payload < offsetof(struct perf_record_cpu_map_data, mask64_data.mask) ||
+				    ev->cpus.cpus.mask64_data.nr >
+				    (cpus_payload - offsetof(struct perf_record_cpu_map_data, mask64_data.mask)) /
+				    sizeof(ev->cpus.cpus.mask64_data.mask[0])) {
+					ret += fprintf(fp, "... cpus: mask nr %u exceeds payload\n",
+						       ev->cpus.cpus.mask64_data.nr);
+					break;
+				}
+			}
+		}
+
 		ret += fprintf(fp, "... ");
 
 		map = cpu_map__new_data(&ev->cpus.cpus);
@@ -5135,6 +5196,7 @@ size_t perf_event__fprintf_event_update(union perf_event *event, FILE *fp)
 		} else
 			ret += fprintf(fp, "failed to get cpus\n");
 		break;
+	}
 	default:
 		ret += fprintf(fp, "... unknown type\n");
 		break;
@@ -5269,6 +5331,83 @@ int perf_event__process_event_update(const struct perf_tool *tool __maybe_unused
 	struct evsel *evsel;
 	struct perf_cpu_map *map;
 
+	/*
+	 * Validate payload before dump_trace or processing — both
+	 * paths access variant-specific fields without further checks.
+	 */
+	if (ev->type == PERF_EVENT_UPDATE__UNIT ||
+	    ev->type == PERF_EVENT_UPDATE__NAME) {
+		size_t str_off = offsetof(struct perf_record_event_update, unit);
+		size_t max_len = event->header.size > str_off ?
+				 event->header.size - str_off : 0;
+
+		if (max_len == 0 || strnlen(ev->unit, max_len) == max_len) {
+			pr_warning("WARNING: PERF_RECORD_EVENT_UPDATE: %s not null-terminated, skipping\n",
+				   ev->type == PERF_EVENT_UPDATE__UNIT ? "unit" : "name");
+			return 0;
+		}
+	} else if (ev->type == PERF_EVENT_UPDATE__SCALE) {
+		if (event->header.size < offsetof(struct perf_record_event_update, scale) +
+					 sizeof(ev->scale)) {
+			pr_warning("WARNING: PERF_RECORD_EVENT_UPDATE: SCALE payload too small, skipping\n");
+			return 0;
+		}
+	} else if (ev->type == PERF_EVENT_UPDATE__CPUS) {
+		size_t cpus_off = offsetof(struct perf_record_event_update, cpus);
+		size_t min_cpus = sizeof(__u16) +
+				  sizeof(struct perf_record_range_cpu_map);
+		u32 cpus_payload;
+
+		if (event->header.size < cpus_off + min_cpus) {
+			pr_warning("WARNING: PERF_RECORD_EVENT_UPDATE: CPUS payload too small, skipping\n");
+			return 0;
+		}
+
+		/*
+		 * Validate per-variant nr against the remaining
+		 * payload on the native path — the swap path clamps
+		 * nr in perf_event__event_update_swap(), but native
+		 * events are read-only and cannot be clamped in place.
+		 * cpu_map__new_data() trusts nr for allocation and
+		 * iteration, so unchecked values cause OOB reads.
+		 */
+		cpus_payload = event->header.size - cpus_off;
+		switch (ev->cpus.cpus.type) {
+		case PERF_CPU_MAP__CPUS:
+			if (ev->cpus.cpus.cpus_data.nr >
+			    (cpus_payload - offsetof(struct perf_record_cpu_map_data, cpus_data.cpu)) /
+			    sizeof(ev->cpus.cpus.cpus_data.cpu[0])) {
+				pr_warning("WARNING: EVENT_UPDATE CPUS: nr %u exceeds payload, skipping\n",
+					   ev->cpus.cpus.cpus_data.nr);
+				return 0;
+			}
+			break;
+		case PERF_CPU_MAP__MASK:
+			if (ev->cpus.cpus.mask32_data.long_size == 4) {
+				if (cpus_payload < offsetof(struct perf_record_cpu_map_data, mask32_data.mask) ||
+				    ev->cpus.cpus.mask32_data.nr >
+				    (cpus_payload - offsetof(struct perf_record_cpu_map_data, mask32_data.mask)) /
+				    sizeof(ev->cpus.cpus.mask32_data.mask[0])) {
+					pr_warning("WARNING: EVENT_UPDATE MASK: nr %u exceeds payload, skipping\n",
+						   ev->cpus.cpus.mask32_data.nr);
+					return 0;
+				}
+			} else if (ev->cpus.cpus.mask64_data.long_size == 8) {
+				if (cpus_payload < offsetof(struct perf_record_cpu_map_data, mask64_data.mask) ||
+				    ev->cpus.cpus.mask64_data.nr >
+				    (cpus_payload - offsetof(struct perf_record_cpu_map_data, mask64_data.mask)) /
+				    sizeof(ev->cpus.cpus.mask64_data.mask[0])) {
+					pr_warning("WARNING: EVENT_UPDATE MASK: nr %u exceeds payload, skipping\n",
+						   ev->cpus.cpus.mask64_data.nr);
+					return 0;
+				}
+			}
+			break;
+		default:
+			break;
+		}
+	}
+
 	if (dump_trace)
 		perf_event__fprintf_event_update(event, stdout);
 
@@ -5300,6 +5439,7 @@ int perf_event__process_event_update(const struct perf_tool *tool __maybe_unused
 			evsel->core.pmu_cpus = map;
 		} else
 			pr_err("failed to get event_update cpus\n");
+		break;
 	default:
 		break;
 	}
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 95eb793026de6d8d..8280413f4528f53c 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -708,8 +708,103 @@ static int perf_event__build_id_swap(union perf_event *event,
 static int perf_event__event_update_swap(union perf_event *event,
 					 bool sample_id_all __maybe_unused)
 {
-	event->event_update.type = bswap_64(event->event_update.type);
-	event->event_update.id   = bswap_64(event->event_update.id);
+	struct perf_record_event_update *ev = &event->event_update;
+
+	ev->type = bswap_64(ev->type);
+	ev->id   = bswap_64(ev->id);
+
+	/*
+	 * Swap variant-specific fields so the processing path
+	 * sees native byte order.
+	 */
+	if (ev->type == PERF_EVENT_UPDATE__SCALE) {
+		if (event->header.size < offsetof(struct perf_record_event_update, scale) +
+					 sizeof(ev->scale))
+			return -1;
+		mem_bswap_64(&ev->scale.scale, sizeof(ev->scale.scale));
+	} else if (ev->type == PERF_EVENT_UPDATE__CPUS) {
+		u32 cpus_payload;
+		struct perf_record_cpu_map_data *data = &ev->cpus.cpus;
+
+		/* CPUS fields start at the same offset as scale (union) */
+		if (event->header.size < offsetof(struct perf_record_event_update, cpus) +
+					 sizeof(__u16) + sizeof(struct perf_record_range_cpu_map))
+			return -1;
+		cpus_payload = event->header.size - offsetof(struct perf_record_event_update, cpus);
+		data->type = bswap_16(data->type);
+		/*
+		 * Full swap including array elements — same logic as
+		 * perf_event__cpu_map_swap() but scoped to the
+		 * embedded cpu_map_data within EVENT_UPDATE.
+		 */
+		switch (data->type) {
+		case PERF_CPU_MAP__CPUS: {
+			u16 nr, max_nr;
+
+			data->cpus_data.nr = bswap_16(data->cpus_data.nr);
+			nr = data->cpus_data.nr;
+			max_nr = (cpus_payload - offsetof(struct perf_record_cpu_map_data,
+							  cpus_data.cpu)) /
+				 sizeof(data->cpus_data.cpu[0]);
+			if (nr > max_nr) {
+				nr = max_nr;
+				data->cpus_data.nr = nr;
+			}
+			for (unsigned int i = 0; i < nr; i++)
+				data->cpus_data.cpu[i] = bswap_16(data->cpus_data.cpu[i]);
+			break;
+		}
+		case PERF_CPU_MAP__MASK:
+			data->mask32_data.long_size = bswap_16(data->mask32_data.long_size);
+			switch (data->mask32_data.long_size) {
+			case 4: {
+				u16 nr, max_nr;
+
+				data->mask32_data.nr = bswap_16(data->mask32_data.nr);
+				nr = data->mask32_data.nr;
+				max_nr = (cpus_payload - offsetof(struct perf_record_cpu_map_data,
+								  mask32_data.mask)) /
+					 sizeof(data->mask32_data.mask[0]);
+				if (nr > max_nr) {
+					nr = max_nr;
+					data->mask32_data.nr = nr;
+				}
+				for (unsigned int i = 0; i < nr; i++)
+					data->mask32_data.mask[i] = bswap_32(data->mask32_data.mask[i]);
+				break;
+			}
+			case 8: {
+				u16 nr, max_nr;
+
+				data->mask64_data.nr = bswap_16(data->mask64_data.nr);
+				nr = data->mask64_data.nr;
+				if (cpus_payload < offsetof(struct perf_record_cpu_map_data, mask64_data.mask)) {
+					data->mask64_data.nr = 0;
+					break;
+				}
+				max_nr = (cpus_payload - offsetof(struct perf_record_cpu_map_data,
+								  mask64_data.mask)) /
+					 sizeof(data->mask64_data.mask[0]);
+				if (nr > max_nr) {
+					nr = max_nr;
+					data->mask64_data.nr = nr;
+				}
+				for (unsigned int i = 0; i < nr; i++)
+					data->mask64_data.mask[i] = bswap_64(data->mask64_data.mask[i]);
+				break;
+			}
+			default:
+				break;
+			}
+			break;
+		case PERF_CPU_MAP__RANGE_CPUS:
+			data->range_cpu_data.start_cpu = bswap_16(data->range_cpu_data.start_cpu);
+			data->range_cpu_data.end_cpu = bswap_16(data->range_cpu_data.end_cpu);
+			break;
+		default:
+			break;
+		}
+	}
 	return 0;
 }
 
-- 
2.54.0



* [PATCH 17/29] perf tools: Bounds check perf_event_attr fields against attr.size before printing
  2026-05-26 21:17 [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
                   ` (15 preceding siblings ...)
  2026-05-26 21:17 ` [PATCH 16/29] perf header: Validate null-termination in PERF_RECORD_EVENT_UPDATE string fields Arnaldo Carvalho de Melo
@ 2026-05-26 21:17 ` Arnaldo Carvalho de Melo
  2026-05-26 21:17 ` [PATCH 18/29] perf header: Propagate feature section processing errors Arnaldo Carvalho de Melo
                   ` (12 subsequent siblings)
  29 siblings, 0 replies; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-05-26 21:17 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot,
	Claude Opus 4.6 (1M context)

From: Arnaldo Carvalho de Melo <acme@redhat.com>

perf_event_attr__fprintf() accessed all struct fields unconditionally,
but attrs from older perf.data files or BPF-captured syscall payloads
may have a smaller size than the current struct.  Fields beyond the
recorded size contain uninitialized or zero-filled data.

Add size-guarded macros (PRINT_ATTRn, PRINT_ATTRn_bf) that compare
each field's offset against attr->size before accessing it.

Guard the bitfield block (disabled, inherit, ... defer_output) with
attr_size >= 48.  These bitfields share a single __u64 at offset 40,
which is within PERF_ATTR_SIZE_VER0 for validated perf.data attrs,
but BPF-captured attrs from perf trace can have a smaller size when
the tracee passes a minimal struct to sys_perf_event_open.

Also fix the BPF trace path: when perf trace intercepts
sys_perf_event_open via BPF, the program copies PERF_ATTR_SIZE_VER0
bytes when the tracee passes size=0, but leaves the size field as 0.
Set attr->size to PERF_ATTR_SIZE_VER0 in the augmented syscall
handler so the bounds checks match the actual copied size.
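
For illustration only, the size-guard idea described above can be
sketched outside of perf.  struct demo_attr, DEMO_ATTR_SIZE_VER0,
FIELD_COVERED() and the demo_* functions below are hypothetical names
invented for this sketch; they are not part of the patch, and the real
code uses PRINT_ATTRn against struct perf_event_attr.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical versioned struct; stands in for perf_event_attr. */
struct demo_attr {
	uint32_t type;
	uint32_t size;		/* bytes the producer actually filled in */
	uint64_t config;
	uint64_t extra;		/* field added in a later ABI revision */
};

/* Hypothetical "oldest ABI" footprint: type + size + config. */
#define DEMO_ATTR_SIZE_VER0 16

/* True when the whole field lies inside the first 'sz' bytes. */
#define FIELD_COVERED(sz, type, field) \
	((sz) >= offsetof(type, field) + sizeof(((type *)0)->field))

/* size == 0 means oldest ABI; never trust more than we understand. */
static uint32_t demo_attr_effective_size(const struct demo_attr *attr)
{
	uint32_t sz = attr->size ? attr->size : DEMO_ATTR_SIZE_VER0;

	if (sz > sizeof(*attr))
		sz = sizeof(*attr);
	return sz;
}

/* Print only the fields covered by the effective size; return the count. */
static int demo_attr_fprintf(FILE *fp, const struct demo_attr *attr)
{
	uint32_t sz = demo_attr_effective_size(attr);
	int printed = 0;

	if (FIELD_COVERED(sz, struct demo_attr, config)) {
		fprintf(fp, "config: %llu\n", (unsigned long long)attr->config);
		printed++;
	}
	if (FIELD_COVERED(sz, struct demo_attr, extra)) {
		fprintf(fp, "extra: %llu\n", (unsigned long long)attr->extra);
		printed++;
	}
	return printed;
}
```

The cap to sizeof(*attr) matters as much as the lower bound: without
it, an inflated size field would make every guard pass.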

Reported-by: sashiko-bot@kernel.org # Running on a local machine
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/trace/beauty/perf_event_open.c |  23 +++-
 tools/perf/util/perf_event_attr_fprintf.c | 141 ++++++++++++++--------
 2 files changed, 114 insertions(+), 50 deletions(-)

diff --git a/tools/perf/trace/beauty/perf_event_open.c b/tools/perf/trace/beauty/perf_event_open.c
index c1c7445dcff994cb..6315b46bcdf02b8c 100644
--- a/tools/perf/trace/beauty/perf_event_open.c
+++ b/tools/perf/trace/beauty/perf_event_open.c
@@ -1,4 +1,5 @@
 // SPDX-License-Identifier: LGPL-2.1
+#include <string.h>
 #include "trace/beauty/beauty.h"
 #include "util/evsel_fprintf.h"
 #include <linux/perf_event.h>
@@ -80,7 +81,27 @@ static size_t perf_event_attr___scnprintf(struct perf_event_attr *attr, char *bf
 
 static size_t syscall_arg__scnprintf_augmented_perf_event_attr(struct syscall_arg *arg, char *bf, size_t size)
 {
-	return perf_event_attr___scnprintf((void *)arg->augmented.args->value, bf, size,
+	struct perf_event_attr *attr = (void *)arg->augmented.args->value;
+	struct perf_event_attr local_attr;
+
+	/*
+	 * augmented_raw_syscalls.bpf.c (shipped with perf) copies
+	 * PERF_ATTR_SIZE_VER0 bytes when the tracee passes size=0,
+	 * but leaves the size field as 0.  The payload size is
+	 * guaranteed by perf's own BPF program, not externally
+	 * controllable.  Copy to a local so we can fix up size
+	 * without writing to the potentially read-only augmented
+	 * args buffer.
+	 */
+	if (!attr->size) {
+		memcpy(&local_attr, attr, PERF_ATTR_SIZE_VER0);
+		memset((void *)&local_attr + PERF_ATTR_SIZE_VER0, 0,
+		       sizeof(local_attr) - PERF_ATTR_SIZE_VER0);
+		local_attr.size = PERF_ATTR_SIZE_VER0;
+		attr = &local_attr;
+	}
+
+	return perf_event_attr___scnprintf(attr, bf, size,
 					   trace__show_zeros(arg->trace));
 }
 
diff --git a/tools/perf/util/perf_event_attr_fprintf.c b/tools/perf/util/perf_event_attr_fprintf.c
index 741c3d657a8b6ae7..3933639d76c54bb3 100644
--- a/tools/perf/util/perf_event_attr_fprintf.c
+++ b/tools/perf/util/perf_event_attr_fprintf.c
@@ -275,24 +275,56 @@ static void __p_config_id(struct perf_pmu *pmu, char *buf, size_t size, u32 type
 #define p_type_id(val)		__p_type_id(buf, BUF_SIZE, pmu, val)
 #define p_config_id(val)	__p_config_id(pmu, buf, BUF_SIZE, attr->type, val)
 
-#define PRINT_ATTRn(_n, _f, _p, _a)			\
-do {							\
-	if (_a || attr->_f) {				\
-		_p(attr->_f);				\
-		ret += attr__fprintf(fp, _n, buf, priv);\
-	}						\
+#define PRINT_ATTRn(_n, _f, _p, _a)					\
+do {									\
+	if (attr_size >= offsetof(struct perf_event_attr, _f) +		\
+			 sizeof(attr->_f) &&				\
+	    (_a || attr->_f)) {						\
+		_p(attr->_f);						\
+		ret += attr__fprintf(fp, _n, buf, priv);		\
+	}								\
+} while (0)
+
+/* bitfield members share an offset; most are within PERF_ATTR_SIZE_VER0 */
+#define PRINT_ATTRn_bf(_n, _f, _p, _a)					\
+do {									\
+	if (_a || attr->_f) {						\
+		_p(attr->_f);						\
+		ret += attr__fprintf(fp, _n, buf, priv);		\
+	}								\
 } while (0)
 
 #define PRINT_ATTRf(_f, _p)	PRINT_ATTRn(#_f, _f, _p, false)
+#define PRINT_ATTRf_bf(_f, _p)	PRINT_ATTRn_bf(#_f, _f, _p, false)
 
 int perf_event_attr__fprintf(FILE *fp, struct perf_event_attr *attr,
 			     attr__fprintf_f attr__fprintf, void *priv)
 {
 	struct perf_pmu *pmu = perf_pmus__find_by_type(attr->type);
+	/*
+	 * size == 0 means ABI0 — the producer didn't set attr.size.
+	 * perf_event__fprintf_attr() may pass the raw mmap'd event
+	 * before the local copy, so default to PERF_ATTR_SIZE_VER0
+	 * (the ABI0 footprint) to avoid reading past the attr into
+	 * the ID array that follows it in HEADER_ATTR events.
+	 */
+	u32 attr_size = attr->size ?: PERF_ATTR_SIZE_VER0;
 	char buf[BUF_SIZE];
 	int ret = 0;
 
-	if (!pmu && (attr->type == PERF_TYPE_HARDWARE || attr->type == PERF_TYPE_HW_CACHE)) {
+	/*
+	 * Cap to what we understand: all callers store the attr in a
+	 * buffer of sizeof(*attr) bytes (perf.data read path copies
+	 * min(attr.size, sizeof), BPF augmented path copies into a
+	 * fixed-size value[] array).  A spoofed attr->size larger
+	 * than sizeof would cause PRINT_ATTRn to read past the
+	 * actual buffer.
+	 */
+	if (attr_size > sizeof(*attr))
+		attr_size = sizeof(*attr);
+
+	if (!pmu && attr_size >= offsetof(struct perf_event_attr, config) + sizeof(attr->config) &&
+	    (attr->type == PERF_TYPE_HARDWARE || attr->type == PERF_TYPE_HW_CACHE)) {
 		u32 extended_type = attr->config >> PERF_PMU_TYPE_SHIFT;
 
 		if (extended_type)
@@ -306,45 +338,53 @@ int perf_event_attr__fprintf(FILE *fp, struct perf_event_attr *attr,
 	PRINT_ATTRf(sample_type, p_sample_type);
 	PRINT_ATTRf(read_format, p_read_format);
 
-	PRINT_ATTRf(disabled, p_unsigned);
-	PRINT_ATTRf(inherit, p_unsigned);
-	PRINT_ATTRf(pinned, p_unsigned);
-	PRINT_ATTRf(exclusive, p_unsigned);
-	PRINT_ATTRf(exclude_user, p_unsigned);
-	PRINT_ATTRf(exclude_kernel, p_unsigned);
-	PRINT_ATTRf(exclude_hv, p_unsigned);
-	PRINT_ATTRf(exclude_idle, p_unsigned);
-	PRINT_ATTRf(mmap, p_unsigned);
-	PRINT_ATTRf(comm, p_unsigned);
-	PRINT_ATTRf(freq, p_unsigned);
-	PRINT_ATTRf(inherit_stat, p_unsigned);
-	PRINT_ATTRf(enable_on_exec, p_unsigned);
-	PRINT_ATTRf(task, p_unsigned);
-	PRINT_ATTRf(watermark, p_unsigned);
-	PRINT_ATTRf(precise_ip, p_unsigned);
-	PRINT_ATTRf(mmap_data, p_unsigned);
-	PRINT_ATTRf(sample_id_all, p_unsigned);
-	PRINT_ATTRf(exclude_host, p_unsigned);
-	PRINT_ATTRf(exclude_guest, p_unsigned);
-	PRINT_ATTRf(exclude_callchain_kernel, p_unsigned);
-	PRINT_ATTRf(exclude_callchain_user, p_unsigned);
-	PRINT_ATTRf(mmap2, p_unsigned);
-	PRINT_ATTRf(comm_exec, p_unsigned);
-	PRINT_ATTRf(use_clockid, p_unsigned);
-	PRINT_ATTRf(context_switch, p_unsigned);
-	PRINT_ATTRf(write_backward, p_unsigned);
-	PRINT_ATTRf(namespaces, p_unsigned);
-	PRINT_ATTRf(ksymbol, p_unsigned);
-	PRINT_ATTRf(bpf_event, p_unsigned);
-	PRINT_ATTRf(aux_output, p_unsigned);
-	PRINT_ATTRf(cgroup, p_unsigned);
-	PRINT_ATTRf(text_poke, p_unsigned);
-	PRINT_ATTRf(build_id, p_unsigned);
-	PRINT_ATTRf(inherit_thread, p_unsigned);
-	PRINT_ATTRf(remove_on_exec, p_unsigned);
-	PRINT_ATTRf(sigtrap, p_unsigned);
-	PRINT_ATTRf(defer_callchain, p_unsigned);
-	PRINT_ATTRf(defer_output, p_unsigned);
+	/*
+	 * All bitfields share a single __u64 right after read_format.
+	 * BPF-captured attrs from perf trace may have a small size
+	 * when the tracee passes a minimal struct, so skip the
+	 * entire block when it's not covered.
+	 */
+	if (attr_size >= offsetof(struct perf_event_attr, wakeup_events)) {
+		PRINT_ATTRf_bf(disabled, p_unsigned);
+		PRINT_ATTRf_bf(inherit, p_unsigned);
+		PRINT_ATTRf_bf(pinned, p_unsigned);
+		PRINT_ATTRf_bf(exclusive, p_unsigned);
+		PRINT_ATTRf_bf(exclude_user, p_unsigned);
+		PRINT_ATTRf_bf(exclude_kernel, p_unsigned);
+		PRINT_ATTRf_bf(exclude_hv, p_unsigned);
+		PRINT_ATTRf_bf(exclude_idle, p_unsigned);
+		PRINT_ATTRf_bf(mmap, p_unsigned);
+		PRINT_ATTRf_bf(comm, p_unsigned);
+		PRINT_ATTRf_bf(freq, p_unsigned);
+		PRINT_ATTRf_bf(inherit_stat, p_unsigned);
+		PRINT_ATTRf_bf(enable_on_exec, p_unsigned);
+		PRINT_ATTRf_bf(task, p_unsigned);
+		PRINT_ATTRf_bf(watermark, p_unsigned);
+		PRINT_ATTRf_bf(precise_ip, p_unsigned);
+		PRINT_ATTRf_bf(mmap_data, p_unsigned);
+		PRINT_ATTRf_bf(sample_id_all, p_unsigned);
+		PRINT_ATTRf_bf(exclude_host, p_unsigned);
+		PRINT_ATTRf_bf(exclude_guest, p_unsigned);
+		PRINT_ATTRf_bf(exclude_callchain_kernel, p_unsigned);
+		PRINT_ATTRf_bf(exclude_callchain_user, p_unsigned);
+		PRINT_ATTRf_bf(mmap2, p_unsigned);
+		PRINT_ATTRf_bf(comm_exec, p_unsigned);
+		PRINT_ATTRf_bf(use_clockid, p_unsigned);
+		PRINT_ATTRf_bf(context_switch, p_unsigned);
+		PRINT_ATTRf_bf(write_backward, p_unsigned);
+		PRINT_ATTRf_bf(namespaces, p_unsigned);
+		PRINT_ATTRf_bf(ksymbol, p_unsigned);
+		PRINT_ATTRf_bf(bpf_event, p_unsigned);
+		PRINT_ATTRf_bf(aux_output, p_unsigned);
+		PRINT_ATTRf_bf(cgroup, p_unsigned);
+		PRINT_ATTRf_bf(text_poke, p_unsigned);
+		PRINT_ATTRf_bf(build_id, p_unsigned);
+		PRINT_ATTRf_bf(inherit_thread, p_unsigned);
+		PRINT_ATTRf_bf(remove_on_exec, p_unsigned);
+		PRINT_ATTRf_bf(sigtrap, p_unsigned);
+		PRINT_ATTRf_bf(defer_callchain, p_unsigned);
+		PRINT_ATTRf_bf(defer_output, p_unsigned);
+	}
 
 	PRINT_ATTRn("{ wakeup_events, wakeup_watermark }", wakeup_events, p_unsigned, false);
 	PRINT_ATTRf(bp_type, p_unsigned);
@@ -359,9 +399,12 @@ int perf_event_attr__fprintf(FILE *fp, struct perf_event_attr *attr,
 	PRINT_ATTRf(sample_max_stack, p_unsigned);
 	PRINT_ATTRf(aux_sample_size, p_unsigned);
 	PRINT_ATTRf(sig_data, p_unsigned);
-	PRINT_ATTRf(aux_start_paused, p_unsigned);
-	PRINT_ATTRf(aux_pause, p_unsigned);
-	PRINT_ATTRf(aux_resume, p_unsigned);
+	/* aux_{start_paused,pause,resume} are at byte 116, past VER0 */
+	if (attr_size >= offsetof(struct perf_event_attr, sig_data)) {
+		PRINT_ATTRf_bf(aux_start_paused, p_unsigned);
+		PRINT_ATTRf_bf(aux_pause, p_unsigned);
+		PRINT_ATTRf_bf(aux_resume, p_unsigned);
+	}
 
 	return ret;
 }
-- 
2.54.0



* [PATCH 18/29] perf header: Propagate feature section processing errors
  2026-05-26 21:17 [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
                   ` (16 preceding siblings ...)
  2026-05-26 21:17 ` [PATCH 17/29] perf tools: Bounds check perf_event_attr fields against attr.size before printing Arnaldo Carvalho de Melo
@ 2026-05-26 21:17 ` Arnaldo Carvalho de Melo
  2026-05-26 21:17 ` [PATCH 19/29] perf header: Validate f_attr.ids section before use in perf_session__read_header() Arnaldo Carvalho de Melo
                   ` (11 subsequent siblings)
  29 siblings, 0 replies; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-05-26 21:17 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Claude Opus 4.6 (1M context)

From: Arnaldo Carvalho de Melo <acme@redhat.com>

perf_session__read_header() discards the return value from
perf_header__process_sections(), so any error from a feature
section processor (process_nrcpus, process_compressed, etc.)
is silently ignored and the session opens as if nothing went
wrong.

This defeats the validation added by subsequent commits in this
series: a crafted perf.data that fails a feature section check
would still be processed with partially-initialized state.

Check the return value and fail the session if any feature
section processor returns an error.

For truncated files (data.size == 0, i.e. recording was
interrupted before the header was finalized), skip feature
section processing entirely and clear the feature bitmap so
tools use their "feature not present" fallbacks instead of
accessing uninitialized env fields.

Change the feature processor stubs for optional libraries
(libtraceevent, libbpf) from returning -1 to returning 0,
so that perf.data files containing these features can still be
opened on builds without the optional library — the feature is
simply skipped rather than causing a fatal error.

Also propagate evlist__prepare_tracepoint_events() failure as
-ENOMEM, since the function can fail due to strdup() allocation
failure inside evsel__prepare_tracepoint_event().
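
As a rough sketch of the control flow described above (not the actual
header.c code), the following shows a caller that propagates the first
failing section processor's error and skips processing entirely when
the data size is zero.  All demo_* names, including the
demo_reject_odd() processor, are hypothetical and exist only for this
illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-section callback: 0 on success, negative errno on error. */
typedef int (*demo_section_fn)(int feat);

/* Example processor for this sketch: treats odd feature ids as malformed. */
static int demo_reject_odd(int feat)
{
	return (feat & 1) ? -EINVAL : 0;
}

/* Run every processor, stopping at and returning the first error. */
static int demo_process_sections(const int *feats, size_t n,
				 demo_section_fn fn)
{
	for (size_t i = 0; i < n; i++) {
		int err = fn(feats[i]);

		if (err < 0)
			return err;
	}
	return 0;
}

/*
 * data_size == 0 models a truncated file: the section table is not
 * trusted, so feature processing is skipped instead of failing.
 */
static int demo_read_header(const int *feats, size_t n, uint64_t data_size,
			    demo_section_fn fn)
{
	int err;

	if (data_size == 0)
		return 0;

	err = demo_process_sections(feats, n, fn);
	if (err < 0)
		return err;	/* propagate instead of discarding */

	return 0;
}
```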

Fixes: 1c0b04d12ae9 ("perf tools: Add perf_session__read_header function")
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/header.c | 51 ++++++++++++++++++++++++++++++----------
 1 file changed, 38 insertions(+), 13 deletions(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 9e3a08b1f8ae5a73..f4e0e257ff7226ac 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -2748,8 +2748,9 @@ static int process_tracing_data(struct feat_fd *ff __maybe_unused, void *data __
 
 	return ret < 0 ? -1 : 0;
 #else
-	pr_err("ERROR: Trying to read tracing data without libtraceevent support.\n");
-	return -1;
+	/* Not an error — the feature is simply unsupported in this build */
+	pr_debug("Tracing data present but libtraceevent not available, skipping.\n");
+	return 0;
 #endif
 }
 
@@ -3643,8 +3644,9 @@ static int process_bpf_prog_info(struct feat_fd *ff __maybe_unused, void *data _
 	up_write(&env->bpf_progs.lock);
 	return err;
 #else
-	pr_err("ERROR: Trying to read bpf_prog_info without libbpf support.\n");
-	return -1;
+	/* Not an error — the feature is simply unsupported in this build */
+	pr_debug("BPF prog info present but libbpf not available, skipping.\n");
+	return 0;
 #endif // HAVE_LIBBPF_SUPPORT
 }
 
@@ -3712,8 +3714,9 @@ static int process_bpf_btf(struct feat_fd *ff  __maybe_unused, void *data __mayb
 	free(node);
 	return err;
 #else
-	pr_err("ERROR: Trying to read btf data without libbpf support.\n");
-	return -1;
+	/* Not an error — the feature is simply unsupported in this build */
+	pr_debug("BTF data present but libbpf not available, skipping.\n");
+	return 0;
 #endif // HAVE_LIBBPF_SUPPORT
 }
 
@@ -4900,7 +4903,7 @@ int perf_session__read_header(struct perf_session *session)
 	struct perf_file_header	f_header;
 	struct perf_file_attr	f_attr;
 	u64			f_id;
-	int nr_attrs, nr_ids, i, j, err;
+	int nr_attrs, nr_ids, i, j, err = -ENOMEM;
 	int fd = perf_data__fd(data);
 
 	session->evlist = evlist__new();
@@ -4920,6 +4923,7 @@ int perf_session__read_header(struct perf_session *session)
 		return err;
 	}
 
+	err = -ENOMEM;
 	if (perf_file_header__read(&f_header, header, fd) < 0)
 		return -EINVAL;
 
@@ -4997,15 +5001,36 @@ int perf_session__read_header(struct perf_session *session)
 		lseek(fd, tmp, SEEK_SET);
 	}
 
+	/*
+	 * Skip feature section processing for truncated files
+	 * (data.size == 0 means recording was interrupted).  The
+	 * section table is unreliable in that case, and the event
+	 * data can still be processed without the feature headers.
+	 * Clear the bitmap so has_feat() returns false and tools
+	 * use their "feature not present" fallbacks instead of
+	 * accessing uninitialized env fields.
+	 */
+	if (f_header.data.size == 0) {
+		bitmap_zero(header->adds_features, HEADER_FEAT_BITS);
+	} else {
 #ifdef HAVE_LIBTRACEEVENT
-	perf_header__process_sections(header, fd, &session->tevent,
-				      perf_file_section__process);
+		err = perf_header__process_sections(header, fd, &session->tevent,
+						    perf_file_section__process);
+		if (err < 0)
+			goto out_delete_evlist;
 
-	if (evlist__prepare_tracepoint_events(session->evlist, session->tevent.pevent))
-		goto out_delete_evlist;
+		if (evlist__prepare_tracepoint_events(session->evlist,
+						      session->tevent.pevent)) {
+			err = -ENOMEM;
+			goto out_delete_evlist;
+		}
 #else
-	perf_header__process_sections(header, fd, NULL, perf_file_section__process);
+		err = perf_header__process_sections(header, fd, NULL,
+						    perf_file_section__process);
+		if (err < 0)
+			goto out_delete_evlist;
 #endif
+	}
 
 	return 0;
 out_errno:
@@ -5014,7 +5039,7 @@ int perf_session__read_header(struct perf_session *session)
 out_delete_evlist:
 	evlist__delete(session->evlist);
 	session->evlist = NULL;
-	return -ENOMEM;
+	return err;
 }
 
 int perf_event__process_feature(const struct perf_tool *tool __maybe_unused,
-- 
2.54.0



* [PATCH 19/29] perf header: Validate f_attr.ids section before use in perf_session__read_header()
  2026-05-26 21:17 [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
                   ` (17 preceding siblings ...)
  2026-05-26 21:17 ` [PATCH 18/29] perf header: Propagate feature section processing errors Arnaldo Carvalho de Melo
@ 2026-05-26 21:17 ` Arnaldo Carvalho de Melo
  2026-05-26 21:17 ` [PATCH 20/29] perf header: Validate feature section size and add read path bounds checking Arnaldo Carvalho de Melo
                   ` (10 subsequent siblings)
  29 siblings, 0 replies; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-05-26 21:17 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot,
	Claude Opus 4.6 (1M context)

From: Arnaldo Carvalho de Melo <acme@redhat.com>

perf_session__read_header() reads f_attr.ids.size from the perf.data
file and divides it by sizeof(u64) to compute nr_ids, which is
declared as int.  No validation is performed on the value before it
is used to allocate arrays and drive a read loop.

On 32-bit architectures, a crafted f_attr.ids.size of 0x100000000
(4 GB) produces nr_ids = 0x20000000, but the allocation size
1 * 0x20000000 * 8 overflows size_t to 0, so zalloc(0) returns a
valid pointer.  The subsequent loop writes 0x20000000 IDs into that
zero-length buffer, corrupting the heap.

On 64-bit, the u64-to-int truncation silently drops high bits,
processing fewer IDs than the file claims.  While not exploitable,
this is a data integrity issue.

Add validation before using f_attr.ids:

- Cap nr_attrs (attrs.size / attr_size) to MAX_NR_ATTRS (1 << 16)
  with overflow-safe u64 comparison before assigning to int
- Reject ids.size not aligned to sizeof(u64)
- Cap ids.size / sizeof(u64) to MAX_IDS_PER_ATTR (1 << 24) to
  prevent int truncation and size_t overflow on 32-bit
- Reject ids sections that extend past the end of the file,
  guarded by S_ISREG() so non-regular files (block devices,
  pipes) are not falsely rejected

Also fix perf_header__getbuffer64() to set errno = EIO when
readn() returns 0 (EOF).  Without this, the out_errno path in
perf_session__read_header() returns -errno which is 0 (success)
on truncated files, causing downstream NULL dereferences.
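
The 32-bit wraparound and the checks listed above can be reproduced in
a small standalone sketch.  alloc_bytes_32bit() models a 32-bit size_t
with uint32_t, and ids_section_valid() is a hypothetical helper that
only mirrors the order of the checks in this patch; neither exists in
perf, and DEMO_MAX_IDS_PER_ATTR merely copies the value of
MAX_IDS_PER_ATTR.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MAX_IDS_PER_ATTR	(1u << 24)

/* What a 32-bit size_t would compute for nr_ids * sizeof(u64). */
static uint32_t alloc_bytes_32bit(uint64_t ids_size)
{
	uint32_t nr_ids = (uint32_t)(ids_size / sizeof(uint64_t));

	/* unsigned multiplication wraps modulo 2^32 */
	return nr_ids * (uint32_t)sizeof(uint64_t);
}

/* Alignment, count cap, then an overflow-safe "fits in file" check. */
static bool ids_section_valid(uint64_t offset, uint64_t size,
			      uint64_t file_size)
{
	if (size % sizeof(uint64_t))
		return false;
	if (size / sizeof(uint64_t) > DEMO_MAX_IDS_PER_ATTR)
		return false;
	/* compare against the remaining bytes; never compute offset + size */
	if (offset > file_size || size > file_size - offset)
		return false;
	return true;
}
```

With ids_size = 0x100000000 the modelled allocation size is 0, which
is the zero-length buffer the loop would then overrun.  Comparing size
against file_size - offset, rather than offset + size against
file_size, avoids a second wraparound in the check itself.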

Reported-by: sashiko-bot@kernel.org # Running on a local machine
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/header.c | 77 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 76 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index f4e0e257ff7226ac..fe23bbd8370c0190 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -64,6 +64,25 @@
 #include <event-parse.h>
 #endif
 
+/*
+ * nr_ids * sizeof(struct perf_sample_id) must not overflow
+ * size_t on 32-bit; the struct is ~104 bytes (32-bit) or
+ * ~184 bytes (64-bit), so 1<<24 (16M) keeps the product
+ * under 2 GB on 32-bit.
+ *
+ * This is a per-attribute cap only — the total across all
+ * attributes is not capped because legitimate high-core-count
+ * workloads (e.g. 5000 tracepoints × 4096 CPUs) can exceed
+ * a single-attribute limit.
+ */
+#define MAX_IDS_PER_ATTR	(1 << 24)
+/*
+ * Cap nr_attrs to prevent resource exhaustion from crafted
+ * files.  65536 is well beyond any real workload (perf stat
+ * typically uses < 100 events) but prevents u64-to-int
+ * truncation on the attr count.
+ */
+#define MAX_NR_ATTRS		(1 << 16)
 #define MAX_BPF_DATA_LEN	(256 * 1024 * 1024)
 #define MAX_BPF_PROGS		131072
 #define MAX_CACHE_ENTRIES	32768
@@ -4468,8 +4487,13 @@ int perf_session__inject_header(struct perf_session *session,
 static int perf_header__getbuffer64(struct perf_header *header,
 				    int fd, void *buf, size_t size)
 {
-	if (readn(fd, buf, size) <= 0)
+	ssize_t n = readn(fd, buf, size);
+
+	if (n <= 0) {
+		if (n == 0)
+			errno = EIO;
 		return -1;
+	}
 
 	if (header->needs_swap)
 		mem_bswap_64(buf, size);
@@ -4803,6 +4827,8 @@ static int read_attr(int fd, struct perf_header *ph,
 	if (ret <= 0) {
 		pr_debug("cannot read %d bytes of header attr\n",
 			 PERF_ATTR_SIZE_VER0);
+		if (ret == 0)
+			errno = EIO;
 		return -1;
 	}
 
@@ -4903,6 +4929,7 @@ int perf_session__read_header(struct perf_session *session)
 	struct perf_file_header	f_header;
 	struct perf_file_attr	f_attr;
 	u64			f_id;
+	struct stat		input_stat;
 	int nr_attrs, nr_ids, i, j, err = -ENOMEM;
 	int fd = perf_data__fd(data);
 
@@ -4951,6 +4978,15 @@ int perf_session__read_header(struct perf_session *session)
 		return -EINVAL;
 	}
 
+	if (fstat(fd, &input_stat) < 0)
+		return -errno;
+
+	/* Check before assigning to int to avoid u64-to-int truncation */
+	if (f_header.attrs.size / f_header.attr_size > MAX_NR_ATTRS) {
+		pr_err("Too many attributes: %" PRIu64 " (max %d)\n",
+		       f_header.attrs.size / f_header.attr_size, MAX_NR_ATTRS);
+		return -EINVAL;
+	}
 	nr_attrs = f_header.attrs.size / f_header.attr_size;
 	lseek(fd, f_header.attrs.offset, SEEK_SET);
 
@@ -4967,6 +5003,45 @@ int perf_session__read_header(struct perf_session *session)
 			perf_event__attr_swap(&f_attr.attr);
 		}
 
+		/*
+		 * Validate ids section: must be aligned to u64, and
+		 * the count must fit in an int to avoid truncation in
+		 * nr_ids and size_t overflow in perf_evsel__alloc_id()
+		 * on 32-bit architectures.
+		 */
+		if (f_attr.ids.size % sizeof(u64)) {
+			pr_err("Invalid ids section size %" PRIu64 " for attr %d, not aligned to u64\n",
+			       f_attr.ids.size, i);
+			err = -EINVAL;
+			goto out_delete_evlist;
+		}
+
+		/*
+		 * Cap the ID count to avoid int truncation of nr_ids
+		 * on 64-bit and size_t overflow in the allocation
+		 * paths (nr_ids * sizeof(u64), nr_ids *
+		 * sizeof(struct perf_sample_id)) on 32-bit.
+		 */
+		if (f_attr.ids.size / sizeof(u64) > MAX_IDS_PER_ATTR) {
+			pr_err("Invalid ids section size %" PRIu64 " for attr %d, too many IDs\n",
+			       f_attr.ids.size, i);
+			err = -EINVAL;
+			goto out_delete_evlist;
+		}
+
+		/*
+		 * FIXME: see perf_header__process_sections() — block
+		 * devices bypass this check because st_size is 0.
+		 */
+		if (S_ISREG(input_stat.st_mode) &&
+		    (f_attr.ids.offset > (u64)input_stat.st_size ||
+		     f_attr.ids.size > (u64)input_stat.st_size - f_attr.ids.offset)) {
+			pr_err("Invalid ids section for attr %d: offset=%" PRIu64 " size=%" PRIu64 " exceeds file size %" PRIu64 "\n",
+			       i, f_attr.ids.offset, f_attr.ids.size, (u64)input_stat.st_size);
+			err = -EINVAL;
+			goto out_delete_evlist;
+		}
+
 		tmp = lseek(fd, 0, SEEK_CUR);
 		evsel = evsel__new(&f_attr.attr);
 
-- 
2.54.0



* [PATCH 20/29] perf header: Validate feature section size and add read path bounds checking
  2026-05-26 21:17 [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
                   ` (18 preceding siblings ...)
  2026-05-26 21:17 ` [PATCH 19/29] perf header: Validate f_attr.ids section before use in perf_session__read_header() Arnaldo Carvalho de Melo
@ 2026-05-26 21:17 ` Arnaldo Carvalho de Melo
  2026-05-26 21:17 ` [PATCH 21/29] perf header: Sanity check HEADER_EVENT_DESC attr.size before swap Arnaldo Carvalho de Melo
                   ` (9 subsequent siblings)
  29 siblings, 0 replies; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-05-26 21:17 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot, David Carrillo-Cisneros,
	Claude Opus 4.6 (1M context)

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Harden feature section parsing against crafted perf.data files:

1. perf_header__process_sections() reads the feature section table
   and passes each section's offset and size directly to the
   processing callbacks without validating them against the actual
   file size.  A crafted section size would make all downstream
   bounds checks against ff->size ineffective since they compare
   against the untrusted, inflated bound.  Add an fstat() check
   with S_ISREG() guard and verify that each section's offset +
   size does not extend past EOF.

2. __do_read_buf() validates reads against ff->size (section size),
   but __do_read_fd() had no such check, so a malformed perf.data
   with an understated section size could cause reads past the end
   of the current section into the next section's data.  Add the
   bounds check in __do_read(), the common caller of both helpers,
   so it is enforced uniformly for both the fd and buf paths.
   Track the section-relative offset in __do_read_fd() so the
   check works for the fd path.  Reject negative sizes, which on
   32-bit can occur when a u32 >= 0x80000000 is passed as ssize_t.

3. do_read_string() relied on file data being null-padded.  Add
   explicit null-termination (buf[len-1] = '\0') after reading
   and validate length (>= 1, fits within section) before
   allocating, so callers like process_cpu_topology() never
   receive an unterminated string.

4. Initialize feat_fd.offset to 0 (section-relative) instead of
   section->offset (file-absolute) so the bounds tracking is
   consistent with __do_read()'s section-relative comparison.
   Adjust process_build_id() to use lseek() for its file-absolute
   offset needs since it cannot rely on ff->offset for that.

5. Propagate ff->size to perf_file_section__fprintf_info() so its
   reads are also bounded.
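
Points 2 to 4 can be sketched with a minimal in-memory reader.  This
is an assumption-laden illustration, not the header.c implementation:
struct demo_reader and the demo_* functions are hypothetical, the
length prefix is read in host byte order, and only the buffer path is
modelled (no fd, no byte swapping).

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical section reader, loosely modelled on struct feat_fd. */
struct demo_reader {
	const uint8_t *buf;
	size_t size;	/* section size in bytes */
	size_t offset;	/* section-relative, starts at 0 */
};

/* Single choke point: reject negative and out-of-section reads. */
static int demo_read(struct demo_reader *r, void *dst, ssize_t len)
{
	if (len < 0 || (size_t)len > r->size - r->offset)
		return -1;

	memcpy(dst, r->buf + r->offset, (size_t)len);
	r->offset += (size_t)len;
	return 0;
}

/* Length-prefixed string; the result is always null-terminated. */
static char *demo_read_string(struct demo_reader *r)
{
	uint32_t len;
	char *s;

	if (demo_read(r, &len, sizeof(len)))
		return NULL;

	/* at least the terminator, and no more than what is left */
	if (len < 1 || len > r->size - r->offset)
		return NULL;

	s = malloc(len);
	if (!s)
		return NULL;

	if (demo_read(r, s, (ssize_t)len)) {
		free(s);
		return NULL;
	}

	s[len - 1] = '\0';	/* do not rely on padding in the input */
	return s;
}
```

Validating the length before malloc() keeps a crafted prefix from
driving a large allocation, and forcing the last byte to '\0' means a
payload without padding is truncated rather than left unterminated.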

Reported-by: sashiko-bot@kernel.org # Running on a local machine
Cc: David Carrillo-Cisneros <davidcc@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/header.c | 66 ++++++++++++++++++++++++++++++++++------
 1 file changed, 57 insertions(+), 9 deletions(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index fe23bbd8370c0190..90417a478c8db2e1 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -233,23 +233,32 @@ static int __do_read_fd(struct feat_fd *ff, void *addr, ssize_t size)
 
 	if (ret != size)
 		return ret < 0 ? (int)ret : -1;
+	ff->offset += size;
 	return 0;
 }
 
 static int __do_read_buf(struct feat_fd *ff, void *addr, ssize_t size)
 {
-	if (size > (ssize_t)ff->size - ff->offset)
-		return -1;
-
 	memcpy(addr, ff->buf + ff->offset, size);
 	ff->offset += size;
 
 	return 0;
-
 }
 
 static int __do_read(struct feat_fd *ff, void *addr, ssize_t size)
 {
+	/*
+	 * Reject negative sizes, which on 32-bit can occur when a
+	 * u32 >= 0x80000000 is passed as ssize_t.  The cast to
+	 * ssize_t is safe because perf_header__process_sections()
+	 * validates that each section fits within the file size
+	 * before any feature callback reaches here, and only
+	 * feature sections (metadata like build IDs, topology, etc.)
+	 * use this path — these cannot legitimately approach 2GB.
+	 */
+	if (size < 0 || size > (ssize_t)ff->size - ff->offset)
+		return -1;
+
 	if (!ff->buf)
 		return __do_read_fd(ff, addr, size);
 	return __do_read_buf(ff, addr, size);
@@ -289,16 +298,25 @@ static char *do_read_string(struct feat_fd *ff)
 	if (do_read_u32(ff, &len))
 		return NULL;
 
+	/* At least the null terminator. */
+	if (len < 1 || len > ff->size - ff->offset) {
+		pr_debug("do_read_string: invalid length %u (remaining %zu)\n",
+			 len, (size_t)(ff->size - ff->offset));
+		return NULL;
+	}
+
 	buf = malloc(len);
 	if (!buf)
 		return NULL;
 
 	if (!__do_read(ff, buf, len)) {
 		/*
-		 * strings are padded by zeroes
-		 * thus the actual strlen of buf
-		 * may be less than len
+		 * do_write_string() writes len including the null
+		 * terminator, padded to NAME_ALIGN.  Ensure the
+		 * string is always null-terminated even if the file
+		 * data has been tampered with.
 		 */
+		buf[len - 1] = '\0';
 		return buf;
 	}
 
@@ -2775,7 +2793,13 @@ static int process_tracing_data(struct feat_fd *ff __maybe_unused, void *data __
 
 static int process_build_id(struct feat_fd *ff, void *data __maybe_unused)
 {
-	if (perf_header__read_build_ids(ff->ph, ff->fd, ff->offset, ff->size))
+	/* lseek fails in pipe mode — fall back to ff->offset */
+	off_t offset = lseek(ff->fd, 0, SEEK_CUR);
+
+	if (offset == (off_t)-1)
+		offset = ff->offset;
+
+	if (perf_header__read_build_ids(ff->ph, ff->fd, offset, ff->size))
 		pr_debug("Failed to read buildids, continuing...\n");
 	return 0;
 }
@@ -4152,6 +4176,7 @@ static int perf_file_section__fprintf_info(struct perf_file_section *section,
 	ff = (struct  feat_fd) {
 		.fd = fd,
 		.ph = ph,
+		.size = section->size,
 	};
 
 	if (!feat_ops[feat].full_only || hd->full)
@@ -4512,6 +4537,7 @@ int perf_header__process_sections(struct perf_header *header, int fd,
 	int sec_size;
 	int feat;
 	int err;
+	struct stat st;
 
 	nr_sections = bitmap_weight(header->adds_features, HEADER_FEAT_BITS);
 	if (!nr_sections)
@@ -4529,7 +4555,29 @@ int perf_header__process_sections(struct perf_header *header, int fd,
 	if (err < 0)
 		goto out_free;
 
+	if (fstat(fd, &st) < 0) {
+		pr_err("Failed to stat the perf data file\n");
+		err = -1;
+		goto out_free;
+	}
+
 	for_each_set_bit(feat, header->adds_features, header->last_feat) {
+		/*
+		 * FIXME: block devices have st_size == 0, so we skip
+		 * bounds checking entirely.  Historically perf never
+		 * prevented using a block device as input, but it
+		 * probably should — there's no valid use case for it
+		 * and it bypasses all file-size validation.
+		 */
+		if (S_ISREG(st.st_mode) &&
+		    (sec->offset > (u64)st.st_size ||
+		     sec->size > (u64)st.st_size - sec->offset)) {
+			pr_err("Feature %s (%d) section extends past EOF (offset=%" PRIu64 ", size=%" PRIu64 ", file=%" PRIu64 ")\n",
+			       header_feat__name(feat), feat,
+			       sec->offset, sec->size, (u64)st.st_size);
+			err = -1;
+			goto out_free;
+		}
 		err = process(sec++, header, feat, fd, data);
 		if (err < 0)
 			goto out_free;
@@ -4756,7 +4804,7 @@ static int perf_file_section__process(struct perf_file_section *section,
 		.fd	= fd,
 		.ph	= ph,
 		.size	= section->size,
-		.offset	= section->offset,
+		.offset	= 0,
 	};
 
 	if (lseek(fd, section->offset, SEEK_SET) == (off_t)-1) {
-- 
2.54.0



* [PATCH 21/29] perf header: Sanity check HEADER_EVENT_DESC attr.size before swap
  2026-05-26 21:17 [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
                   ` (19 preceding siblings ...)
  2026-05-26 21:17 ` [PATCH 20/29] perf header: Validate feature section size and add read path bounds checking Arnaldo Carvalho de Melo
@ 2026-05-26 21:17 ` Arnaldo Carvalho de Melo
  2026-05-26 21:17 ` [PATCH 22/29] perf header: Validate bitmap size before allocating in do_read_bitmap() Arnaldo Carvalho de Melo
                   ` (8 subsequent siblings)
  29 siblings, 0 replies; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-05-26 21:17 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot, Wang Nan,
	Claude Opus 4.6 (1M context)

From: Arnaldo Carvalho de Melo <acme@redhat.com>

read_event_desc() reads nre (event count), sz (attr size), and nr
(IDs per event) from the file and uses them to control allocations
and loops without validating them against the section size.

A crafted perf.data could trigger large allocations or many loop
iterations before __do_read() eventually rejects the reads.

Add bounds checks in read_event_desc():
- Reject sz smaller than PERF_ATTR_SIZE_VER0.
- Require at least one event (nre > 0).
- Check that nre events fit in the remaining section, using the
  minimum per-event footprint of sz + sizeof(u32).
- Pre-swap attr->size to native byte order, then reject values
  below PERF_ATTR_SIZE_VER0 or above sz before calling
  perf_event__attr_swap() to prevent heap out-of-bounds access.
- Handle ABI0 (attr.size == 0): substitute PERF_ATTR_SIZE_VER0,
  and on native-endian files write the value back so
  free_event_desc() does not treat the zero as its end-of-array
  sentinel (it iterates while attr.size != 0).  The swap path
  skips the write-back — perf_event__attr_swap() has its own
  ABI0 fallback that sets VER0 after swapping.
- Check that nr IDs fit in the remaining section before allocating.
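
A rough standalone sketch of the count and attr.size checks listed
above, not the code from the patch.  event_desc_counts_ok(),
resolve_attr_size() and swap32() are hypothetical helper names;
ATTR_SIZE_VER0 mirrors PERF_ATTR_SIZE_VER0 (64).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ATTR_SIZE_VER0 64u /* same value as PERF_ATTR_SIZE_VER0 */

/* Portable 32-bit byte swap, standing in for bswap_32(). */
static uint32_t swap32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Can nre events of sz bytes each fit in the remaining section bytes? */
static bool event_desc_counts_ok(uint32_t nre, uint32_t sz, size_t remaining)
{
	/* Widen first so sz + 4 cannot wrap, even where size_t is 32 bits. */
	uint64_t per_event = (uint64_t)sz + sizeof(uint32_t);

	if (nre == 0 || sz < ATTR_SIZE_VER0)
		return false;

	/* Divide rather than multiply so nre * per_event never overflows. */
	return nre <= remaining / per_event;
}

/*
 * Returns attr.size in native byte order, or 0 if it is invalid.
 * raw is the value as stored in the file; sz is the per-attr size
 * read from the section header.
 */
static uint32_t resolve_attr_size(uint32_t raw, bool needs_swap, uint32_t sz)
{
	uint32_t size = needs_swap ? swap32(raw) : raw;

	/* ABI0: the producer left size as 0, treat it as VER0. */
	if (size == 0)
		size = ATTR_SIZE_VER0;

	if (size < ATTR_SIZE_VER0 || size > sz)
		return 0;

	return size;
}
```

Validating the swapped value before the attr itself is swapped is the
point: the swap routine trusts attr.size to decide how far it may
touch the buffer.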

Fixes: b30b61729246 ("perf tools: Fix a problem when opening old perf.data with different byte order")
Reported-by: sashiko-bot@kernel.org # Running on a local machine
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/header.c | 54 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 90417a478c8db2e1..37d7c9849e0e9199 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -2173,9 +2173,28 @@ static struct evsel *read_event_desc(struct feat_fd *ff)
 	if (do_read_u32(ff, &nre))
 		goto error;
 
+	/* Size of each of the nre attributes. */
 	if (do_read_u32(ff, &sz))
 		goto error;
 
+	/*
+	 * Require at least one event with an attr no smaller than the
+	 * first published struct, and reject sz values where
+	 * sz + sizeof(u32) would overflow size_t (possible on 32-bit)
+	 * or nre == UINT32_MAX where nre + 1 wraps to 0 in the calloc.
+	 *
+	 * The minimum section footprint per event is sz bytes for the
+	 * attr plus a u32 for the id count, check that nre events fit.
+	 */
+	if (!nre || sz < PERF_ATTR_SIZE_VER0 ||
+	    sz > ff->size || (size_t)sz > SIZE_MAX - sizeof(u32) ||
+	    nre == UINT32_MAX ||
+	    nre > (ff->size - ff->offset) / (sz + sizeof(u32))) {
+		pr_err("Invalid HEADER_EVENT_DESC: nre=%u sz=%u (min %d)\n",
+		       nre, sz, PERF_ATTR_SIZE_VER0);
+		goto error;
+	}
+
 	/* buffer to hold on file attr struct */
 	buf = malloc(sz);
 	if (!buf)
@@ -2191,6 +2210,9 @@ static struct evsel *read_event_desc(struct feat_fd *ff)
 		msz = sz;
 
 	for (i = 0, evsel = events; i < nre; evsel++, i++) {
+		struct perf_event_attr *attr = buf;
+		u32 attr_size;
+
 		evsel->core.idx = i;
 
 		/*
@@ -2200,6 +2222,32 @@ static struct evsel *read_event_desc(struct feat_fd *ff)
 		if (__do_read(ff, buf, sz))
 			goto error;
 
+		/* Reject before attr_swap to prevent OOB via bswap_safe() */
+		attr_size = ff->ph->needs_swap ? bswap_32(attr->size) : attr->size;
+		/* ABI0: size == 0 means the producer didn't set it */
+		if (!attr_size) {
+			attr_size = PERF_ATTR_SIZE_VER0;
+			/*
+			 * Write back so free_event_desc() doesn't
+			 * treat this event as the end-of-array sentinel
+			 * (it iterates while attr.size != 0).
+			 *
+			 * Only for native — the swap path must NOT
+			 * write native-endian VER0 here because
+			 * perf_event__attr_swap() would re-swap it
+			 * to 0x40000000, defeating bswap_safe() bounds.
+			 * perf_event__attr_swap() has its own ABI0
+			 * fallback that sets VER0 after swapping.
+			 */
+			if (!ff->ph->needs_swap)
+				attr->size = attr_size;
+		}
+		if (attr_size < PERF_ATTR_SIZE_VER0 || attr_size > sz) {
+			pr_err("Event %d attr.size (%u) invalid (min: %d, max: %u)\n",
+			       i, attr_size, PERF_ATTR_SIZE_VER0, sz);
+			goto error;
+		}
+
 		if (ff->ph->needs_swap)
 			perf_event__attr_swap(buf);
 
@@ -2221,6 +2269,12 @@ static struct evsel *read_event_desc(struct feat_fd *ff)
 		if (!nr)
 			continue;
 
+		/* Prevent oversized allocation from crafted nr */
+		if (nr > (ff->size - ff->offset) / sizeof(*id)) {
+			pr_err("Event %d: id count %u exceeds remaining section\n", i, nr);
+			goto error;
+		}
+
 		id = calloc(nr, sizeof(*id));
 		if (!id)
 			goto error;
-- 
2.54.0



* [PATCH 22/29] perf header: Validate bitmap size before allocating in do_read_bitmap()
  2026-05-26 21:17 [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
                   ` (20 preceding siblings ...)
  2026-05-26 21:17 ` [PATCH 21/29] perf header: Sanity check HEADER_EVENT_DESC attr.size before swap Arnaldo Carvalho de Melo
@ 2026-05-26 21:17 ` Arnaldo Carvalho de Melo
  2026-05-26 21:17 ` [PATCH 23/29] perf session: Add byte-swap handler for PERF_RECORD_COMPRESSED2 Arnaldo Carvalho de Melo
                   ` (7 subsequent siblings)
  29 siblings, 0 replies; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-05-26 21:17 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot,
	Claude Opus 4.6 (1M context)

From: Arnaldo Carvalho de Melo <acme@redhat.com>

do_read_bitmap() reads a u64 bit count from the file and passes it
to bitmap_zalloc() without checking it against the remaining section
size. A crafted perf.data could trigger a large allocation that would
only fail later when the per-element reads exceed section bounds.

Additionally, bitmap_zalloc() takes an int parameter, so a crafted
size with bits set above bit 31 (e.g. 0x100000040) would pass the
section bounds check but truncate when passed to bitmap_zalloc(),
allocating a much smaller buffer than the subsequent read loop
expects.

Reject size values that exceed INT_MAX, and check that the data
needed (BITS_TO_U64(size) u64 values) fits in the remaining section
before allocating.  Switch from bitmap_zalloc() to calloc() of u64
units so the allocation size matches the u64 read/write granularity
and avoids unsigned long vs u64 mismatch on 32-bit architectures.

Fix do_write_bitmap() to use memcpy to read u64-sized chunks from
the unsigned long bitmap, preventing out-of-bounds reads on 32-bit
systems where sizeof(unsigned long) is 4 but the bitmap is stored
in u64 units.

Fix process_mem_topology() minimum section size: the check assumed
2 * sizeof(u64) per node, but do_read_bitmap() reads an additional
u64 for the bitmap size, so the per-node minimum is 3 * sizeof(u64).

Fix memory leak in process_mem_topology() error paths: replace
free(nodes) with memory_node__delete_nodes() to free per-node
bitmaps allocated by do_read_bitmap().

do_read_bitmap() is currently used by process_mem_topology() for
HEADER_MEM_TOPOLOGY.
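
To make the truncation concrete, here is a small standalone sketch of
the size validation, under the assumption that the bit count has
already been read from the file.  alloc_bitmap_words() and
BITS_TO_U64_WORDS() are hypothetical names, not the perf helpers.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Number of u64 words needed to hold nbits bits. */
#define BITS_TO_U64_WORDS(nbits) (((nbits) + 63) / 64)

/*
 * Returns a zeroed array of u64 words for an nbits-wide bitmap, or
 * NULL if nbits does not fit an int, the words would not fit in the
 * remaining section bytes, or the allocation fails.
 */
static uint64_t *alloc_bitmap_words(uint64_t nbits, size_t remaining)
{
	size_t words;

	/* 0x100000040 would become 64 if narrowed to a 32-bit int. */
	if (nbits > INT_MAX)
		return NULL;

	words = (size_t)BITS_TO_U64_WORDS(nbits);
	if (words > remaining / sizeof(uint64_t))
		return NULL;

	/* u64 units, so 8-byte stores never run past the buffer. */
	return calloc(words, sizeof(uint64_t));
}
```

With 65 bits and 16 bytes left in the section the allocation succeeds
(two words); with 15 bytes left it is refused before any memory is
requested.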

Fixes: a881fc56038a ("perf header: Sanity check HEADER_MEM_TOPOLOGY")
Reported-by: sashiko-bot@kernel.org # Running on a local machine
Closes: https://lore.kernel.org/linux-perf-users/20260414224622.2AE69C19425@smtp.kernel.org/
Closes: https://lore.kernel.org/linux-perf-users/20260410223242.DD76FC19421@smtp.kernel.org/
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/header.c | 34 +++++++++++++++++++++++++++++-----
 1 file changed, 29 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 37d7c9849e0e9199..2fea0172140e4abd 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -178,15 +178,25 @@ int do_write(struct feat_fd *ff, const void *buf, size_t size)
 /* Return: 0 if succeeded, -ERR if failed. */
 static int do_write_bitmap(struct feat_fd *ff, unsigned long *set, u64 size)
 {
-	u64 *p = (u64 *) set;
+	size_t byte_size = BITS_TO_LONGS(size) * sizeof(unsigned long);
 	int i, ret;
 
 	ret = do_write(ff, &size, sizeof(size));
 	if (ret < 0)
 		return ret;
 
+	/*
+	 * The on-disk format uses u64 elements, but the in-memory bitmap
+	 * uses unsigned long, which is only 4 bytes on 32-bit architectures.
+	 * Copy with bounded size so the last element doesn't read past the
+	 * bitmap allocation when BITS_TO_LONGS(size) is odd.
+	 */
 	for (i = 0; (u64) i < BITS_TO_U64(size); i++) {
-		ret = do_write(ff, p + i, sizeof(*p));
+		u64 val = 0;
+		size_t off = i * sizeof(val);
+
+		memcpy(&val, (char *)set + off, min(sizeof(val), byte_size - off));
+		ret = do_write(ff, &val, sizeof(val));
 		if (ret < 0)
 			return ret;
 	}
@@ -335,7 +345,20 @@ static int do_read_bitmap(struct feat_fd *ff, unsigned long **pset, u64 *psize)
 	if (ret)
 		return ret;
 
-	set = bitmap_zalloc(size);
+	/* Bitmap APIs use int for nbits; reject u64 values that truncate. */
+	if (size > INT_MAX ||
+	    BITS_TO_U64(size) > (ff->size - ff->offset) / sizeof(u64)) {
+		pr_debug("do_read_bitmap: size %" PRIu64 " exceeds section bounds\n", size);
+		return -1;
+	}
+
+	/*
+	 * bitmap_zalloc() allocates in unsigned long units, which are only
+	 * 4 bytes on 32-bit architectures. The read loop below casts the
+	 * buffer to u64 * and writes 8-byte elements, so allocate in u64
+	 * units to ensure the buffer is large enough.
+	 */
+	set = calloc(BITS_TO_U64(size), sizeof(u64));
 	if (!set)
 		return -ENOMEM;
 
@@ -3497,7 +3520,8 @@ static int process_mem_topology(struct feat_fd *ff,
 		return -1;
 	}
 
-	if (ff->size < 3 * sizeof(u64) + nr * 2 * sizeof(u64)) {
+	/* Per node: node_id(u64) + mem_size(u64) + bitmap_nr_bits(u64) */
+	if (ff->size < 3 * sizeof(u64) + nr * 3 * sizeof(u64)) {
 		pr_err("Invalid HEADER_MEM_TOPOLOGY: section too small (%zu) for %llu nodes\n",
 		       ff->size, (unsigned long long)nr);
 		return -1;
@@ -3532,7 +3556,7 @@ static int process_mem_topology(struct feat_fd *ff,
 
 out:
 	if (ret)
-		free(nodes);
+		memory_node__delete_nodes(nodes, nr);
 	return ret;
 }
 
-- 
2.54.0



* [PATCH 23/29] perf session: Add byte-swap handler for PERF_RECORD_COMPRESSED2
  2026-05-26 21:17 [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
                   ` (21 preceding siblings ...)
  2026-05-26 21:17 ` [PATCH 22/29] perf header: Validate bitmap size before allocating in do_read_bitmap() Arnaldo Carvalho de Melo
@ 2026-05-26 21:17 ` Arnaldo Carvalho de Melo
  2026-05-26 21:18 ` [PATCH 24/29] perf tools: Harden compressed event processing Arnaldo Carvalho de Melo
                   ` (6 subsequent siblings)
  29 siblings, 0 replies; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-05-26 21:17 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot, Chun-Tse Shao,
	Claude Opus 4.6 (1M context)

From: Arnaldo Carvalho de Melo <acme@redhat.com>

PERF_RECORD_COMPRESSED2 events carry a data_size field that must be
byte-swapped when reading cross-endian perf.data files.  Without a
swap handler, reading COMPRESSED2 events on a different-endian machine
would misinterpret data_size as a garbage value, causing the
decompression path to read the wrong number of bytes.

The compressed payload itself is a raw byte stream and needs no
swapping.
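
As a small illustration of why the swap matters (swap64() here is a
portable, hypothetical stand-in for bswap_64()): a data_size of 0x100
written by an opposite-endian producer is seen as 0x0001000000000000
until it is swapped.

```c
#include <stdint.h>

/* Reverse the byte order of a 64-bit value. */
static uint64_t swap64(uint64_t v)
{
	/* Swap adjacent bytes, then adjacent 16-bit pairs, then halves. */
	v = ((v & 0x00ff00ff00ff00ffULL) << 8) |
	    ((v >> 8) & 0x00ff00ff00ff00ffULL);
	v = ((v & 0x0000ffff0000ffffULL) << 16) |
	    ((v >> 16) & 0x0000ffff0000ffffULL);
	return (v << 32) | (v >> 32);
}
```

Swapping twice returns the original value, which is why only the
header field is touched and the payload bytes are left alone.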

Fixes: 208c0e16834472bb ("perf record: Add 8-byte aligned event type PERF_RECORD_COMPRESSED2")
Reported-by: sashiko-bot@kernel.org # Running on a local machine
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/session.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 8280413f4528f53c..9271885e3920f897 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1056,6 +1056,14 @@ static int perf_event__time_conv_swap(union perf_event *event,
 	return 0;
 }
 
+static int perf_event__compressed2_swap(union perf_event *event,
+					bool sample_id_all __maybe_unused)
+{
+	/* Only data_size needs swapping — compressed payload is a raw byte stream */
+	event->pack2.data_size = bswap_64(event->pack2.data_size);
+	return 0;
+}
+
 static int perf_event__bpf_metadata_swap(union perf_event *event,
 					 bool sample_id_all __maybe_unused)
 {
@@ -1197,6 +1205,7 @@ static perf_event__swap_op perf_event__swap_ops[] = {
 	[PERF_RECORD_STAT_ROUND]	  = perf_event__stat_round_swap,
 	[PERF_RECORD_EVENT_UPDATE]	  = perf_event__event_update_swap,
 	[PERF_RECORD_TIME_CONV]		  = perf_event__time_conv_swap,
+	[PERF_RECORD_COMPRESSED2]	  = perf_event__compressed2_swap,
 	[PERF_RECORD_BPF_METADATA]	  = perf_event__bpf_metadata_swap,
 	[PERF_RECORD_SCHEDSTAT_CPU]	  = perf_event__schedstat_cpu_swap,
 	[PERF_RECORD_SCHEDSTAT_DOMAIN]	  = perf_event__schedstat_domain_swap,
-- 
2.54.0



* [PATCH 24/29] perf tools: Harden compressed event processing
  2026-05-26 21:17 [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
                   ` (22 preceding siblings ...)
  2026-05-26 21:17 ` [PATCH 23/29] perf session: Add byte-swap handler for PERF_RECORD_COMPRESSED2 Arnaldo Carvalho de Melo
@ 2026-05-26 21:18 ` Arnaldo Carvalho de Melo
  2026-05-26 22:23   ` sashiko-bot
  2026-05-26 21:18 ` [PATCH 25/29] perf session: Check for decompression buffer size overflow Arnaldo Carvalho de Melo
                   ` (5 subsequent siblings)
  29 siblings, 1 reply; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-05-26 21:18 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot,
	Claude Opus 4.6 (1M context)

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Add several hardening checks to the compressed event decompression
pipeline:

1. Guard against decomp_last_rem underflow: check that
   decomp_last->head does not exceed decomp_last->size before
   subtracting.  A u64 underflow here would produce a huge
   decomp_len, causing an oversized mmap allocation.

2. Validate comp_mmap_len from the HEADER_COMPRESSED feature
   section: reject values that are smaller than 4096 or not
   4K-aligned.  The downstream decompression path checks allocation
   sizes against SIZE_MAX, which handles 32-bit safety.

3. Validate COMPRESSED event header size: reject events where
   header.size is too small to contain the fixed struct fields,
   preventing underflow in the payload size calculation.

4. Validate COMPRESSED2 event data_size: check that data_size
   does not exceed the available payload (header.size minus the
   fixed struct fields) for the newer compressed format.

5. Reject compressed events when the HEADER_COMPRESSED feature
   is missing from the file header, which means no decompression
   context was initialized.
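
Checks 2 and 4 reduce to two small predicates, sketched here in
isolation.  comp_mmap_len_ok() and compressed2_payload_ok() are
hypothetical names; the fixed-struct size is passed in as a parameter
rather than taken from the real struct perf_record_compressed2.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* comp_mmap_len must be at least 4096 and a multiple of 4096. */
static bool comp_mmap_len_ok(uint32_t len)
{
	return len >= 4096 && len % 4096 == 0;
}

/*
 * hdr_size:  header.size of the event (a 16-bit field)
 * fixed:     size of the fixed struct fields preceding the payload
 * data_size: payload length claimed by the event
 */
static bool compressed2_payload_ok(uint16_t hdr_size, size_t fixed,
				   uint64_t data_size)
{
	/* The event must be big enough to contain data_size at all. */
	if (hdr_size < fixed)
		return false;

	/* hdr_size >= fixed, so this subtraction cannot wrap. */
	return data_size <= hdr_size - fixed;
}
```

The order matters in the second helper: comparing data_size first
would subtract before knowing that the header covers the fixed fields.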

Reported-by: sashiko-bot@kernel.org # Running on a local machine
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/header.c | 17 +++++++++++++++++
 tools/perf/util/tool.c   | 38 +++++++++++++++++++++++++++++++++++++-
 2 files changed, 54 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 2fea0172140e4abd..f771a76321c10a02 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -3861,6 +3861,23 @@ static int process_compressed(struct feat_fd *ff,
 	if (do_read_u32(ff, &(env->comp_mmap_len)))
 		return -1;
 
+	/*
+	 * FIXME: perf.data should record the recording system's page
+	 * size — it affects mmap buffer alignment, sample addresses,
+	 * and data_page_size/code_page_size interpretation.  Without
+	 * it we assume 4K (the smallest Linux page size) as a safe
+	 * minimum alignment for comp_mmap_len validation.
+	 *
+	 * No upper-bound cap: perf_session__process_compressed_event()
+	 * checks decomp_len + sizeof(struct decomp) against SIZE_MAX
+	 * before allocating, which handles 32-bit safety.
+	 */
+	if (env->comp_mmap_len < 4096 || env->comp_mmap_len % 4096) {
+		pr_err("Invalid HEADER_COMPRESSED: comp_mmap_len (%u) must be a 4K-aligned value >= 4096\n",
+		       env->comp_mmap_len);
+		return -1;
+	}
+
 	return 0;
 }
 
diff --git a/tools/perf/util/tool.c b/tools/perf/util/tool.c
index 225a77d530ce8ab3..18641919473a859f 100644
--- a/tools/perf/util/tool.c
+++ b/tools/perf/util/tool.c
@@ -24,7 +24,15 @@ static int perf_session__process_compressed_event(const struct perf_tool *tool _
 	size_t mmap_len, decomp_len = perf_session__env(session)->comp_mmap_len;
 	struct decomp *decomp, *decomp_last = session->active_decomp->decomp_last;
 
+	if (!decomp_len) {
+		pr_err("Compressed events found but HEADER_COMPRESSED not set\n");
+		return -1;
+	}
+
 	if (decomp_last) {
+		/* Prevent u64 underflow in decomp_last_rem */
+		if (decomp_last->head > decomp_last->size)
+			return -1;
 		decomp_last_rem = decomp_last->size - decomp_last->head;
 		decomp_len += decomp_last_rem;
 	}
@@ -47,14 +55,37 @@ static int perf_session__process_compressed_event(const struct perf_tool *tool _
 		decomp->size = decomp_last_rem;
 	}
 
+	/*
+	 * Events are read directly from the mmap'd file; fields could
+	 * theoretically change via a FUSE-backed file, but that applies
+	 * to the entire event processing pipeline, not just here.
+	 */
 	if (event->header.type == PERF_RECORD_COMPRESSED) {
+		if (event->header.size < sizeof(struct perf_record_compressed))
+			goto err_decomp;
 		src = (void *)event + sizeof(struct perf_record_compressed);
 		src_size = event->pack.header.size - sizeof(struct perf_record_compressed);
 	} else if (event->header.type == PERF_RECORD_COMPRESSED2) {
+		/*
+		 * prefetch_event() only guarantees that the 8-byte
+		 * event header fits; validate that header.size covers
+		 * the data_size field before accessing it, otherwise a
+		 * crafted event reads data_size from adjacent memory.
+		 */
+		if (event->header.size < sizeof(struct perf_record_compressed2))
+			goto err_decomp;
 		src = (void *)event + sizeof(struct perf_record_compressed2);
 		src_size = event->pack2.data_size;
+		/*
+		 * data_size is independent of header.size (which
+		 * includes padding); verify it doesn't exceed the
+		 * actual payload to prevent out-of-bounds reads in
+		 * zstd_decompress_stream().
+		 */
+		if (src_size > event->header.size - sizeof(struct perf_record_compressed2))
+			goto err_decomp;
 	} else {
-		return -1;
+		goto err_decomp;
 	}
 
 	decomp_size = zstd_decompress_stream(session->active_decomp->zstd_decomp, src, src_size,
@@ -77,6 +108,11 @@ static int perf_session__process_compressed_event(const struct perf_tool *tool _
 	pr_debug("decomp (B): %zd to %zd\n", src_size, decomp_size);
 
 	return 0;
+
+err_decomp:
+	munmap(decomp, mmap_len);
+	pr_err("Couldn't decompress data\n");
+	return -1;
 }
 #endif
 
-- 
2.54.0



* [PATCH 25/29] perf session: Check for decompression buffer size overflow
  2026-05-26 21:17 [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
                   ` (23 preceding siblings ...)
  2026-05-26 21:18 ` [PATCH 24/29] perf tools: Harden compressed event processing Arnaldo Carvalho de Melo
@ 2026-05-26 21:18 ` Arnaldo Carvalho de Melo
  2026-05-26 21:18 ` [PATCH 26/29] perf session: Bound nr_cpus_avail and validate sample CPU Arnaldo Carvalho de Melo
                   ` (4 subsequent siblings)
  29 siblings, 0 replies; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-05-26 21:18 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot,
	Claude Opus 4.6 (1M context)

From: Arnaldo Carvalho de Melo <acme@redhat.com>

On 32-bit systems, sizeof(struct decomp) + decomp_len can wrap
size_t when comp_mmap_len is large.  The preceding patch validates
comp_mmap_len alignment but does not cap the upper bound, so two
additions can still overflow:

1. decomp_len += decomp_last_rem: on 32-bit, adding a u64 to
   size_t silently truncates, producing a corrupted decomp_len
   that would bypass the subsequent overflow check and result
   in an undersized buffer allocation.

2. sizeof(struct decomp) + decomp_len: the final addition could
   overflow on systems with small size_t.

Add explicit overflow checks before each addition as
defense-in-depth.
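
The two checks can be modelled as one helper, shown below as a sketch
rather than the patch's code.  decomp_total_len() is a hypothetical
name, hdr stands for sizeof(struct decomp), and the checks are ordered
so that no intermediate subtraction can wrap.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Computes hdr + decomp_len + last_rem into *out.
 * Returns false, leaving *out untouched, if the sum does not fit size_t.
 */
static bool decomp_total_len(size_t decomp_len, uint64_t last_rem,
			     size_t hdr, size_t *out)
{
	/* hdr + decomp_len must fit on its own. */
	if (decomp_len > SIZE_MAX - hdr)
		return false;

	/*
	 * Compare as u64 before narrowing: on 32-bit a plain
	 * size_t += u64 would silently drop the high bits.
	 */
	if (last_rem > SIZE_MAX - hdr - decomp_len)
		return false;

	*out = hdr + decomp_len + (size_t)last_rem;
	return true;
}
```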

Reported-by: sashiko-bot@kernel.org # Running on a local machine
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/tool.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/tools/perf/util/tool.c b/tools/perf/util/tool.c
index 18641919473a859f..25c9b378aa163664 100644
--- a/tools/perf/util/tool.c
+++ b/tools/perf/util/tool.c
@@ -34,9 +34,22 @@ static int perf_session__process_compressed_event(const struct perf_tool *tool _
 		if (decomp_last->head > decomp_last->size)
 			return -1;
 		decomp_last_rem = decomp_last->size - decomp_last->head;
+		/*
+		 * Check before adding: on 32-bit, size_t += u64
+		 * silently truncates, bypassing the overflow check
+		 * below and producing an undersized buffer.
+		 */
+		if (decomp_last_rem > SIZE_MAX - decomp_len - sizeof(struct decomp)) {
+			pr_err("Decompression buffer size overflow\n");
+			return -1;
+		}
 		decomp_len += decomp_last_rem;
 	}
 
+	if (decomp_len > SIZE_MAX - sizeof(struct decomp)) {
+		pr_err("Decompression buffer size overflow\n");
+		return -1;
+	}
 	mmap_len = sizeof(struct decomp) + decomp_len;
 	decomp = mmap(NULL, mmap_len, PROT_READ|PROT_WRITE,
 		      MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
-- 
2.54.0



* [PATCH 26/29] perf session: Bound nr_cpus_avail and validate sample CPU
  2026-05-26 21:17 [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
                   ` (24 preceding siblings ...)
  2026-05-26 21:18 ` [PATCH 25/29] perf session: Check for decompression buffer size overflow Arnaldo Carvalho de Melo
@ 2026-05-26 21:18 ` Arnaldo Carvalho de Melo
  2026-05-26 22:40   ` sashiko-bot
  2026-05-26 21:18 ` [PATCH 27/29] perf kwork: Bounds check work->cpu before indexing cpus_runtime[] Arnaldo Carvalho de Melo
                   ` (3 subsequent siblings)
  29 siblings, 1 reply; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-05-26 21:18 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot,
	Claude Opus 4.6 (1M context)

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Several downstream consumers (timechart, kwork, sched) use fixed-size
arrays indexed by CPU.  A crafted perf.data can supply arbitrary CPU
values that index past these arrays, causing out-of-bounds access.

Validate sample.cpu against min(nr_cpus_avail, MAX_NR_CPUS) in
perf_session__deliver_event() before any tool callback runs.  The
cap at MAX_NR_CPUS protects fixed-size downstream arrays; the true
nr_cpus_avail is preserved in env for header parsing (e.g.
process_cpu_topology) which needs the real count.

Fall back to MAX_NR_CPUS when HEADER_NRCPUS is missing (truncated
files, pipe mode, pre-2017 perf).

Only validate when PERF_SAMPLE_CPU is set in sample_type — when
absent, evsel__parse_sample() leaves sample.cpu as (u32)-1, a
sentinel that downstream tools (script, inject) check to identify
events without CPU info.  Clamping it to 0 would break those checks.

Inline evlist__parse_sample() into perf_session__deliver_event()
so the evsel lookup needed for sample_type checking reuses the same
evsel that parsed the sample, avoiding a second evlist__event2evsel()
call on every event.

For pipe-mode streams where HEADER_NRCPUS may arrive late or not at
all, the MAX_NR_CPUS fallback ensures the bounds check is still
effective against the fixed-size downstream arrays.
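
The bound described above can be sketched as a standalone predicate.
sample_cpu_ok() and SKETCH_MAX_NR_CPUS are hypothetical; the 4096
value is an assumption standing in for MAX_NR_CPUS from perf.h, and
what the caller does with an out-of-range CPU is left out here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value; perf defines the real MAX_NR_CPUS in perf.h. */
#define SKETCH_MAX_NR_CPUS 4096u

/*
 * cpu:            sample.cpu as parsed from the event
 * has_sample_cpu: whether PERF_SAMPLE_CPU is set in sample_type
 * nr_cpus_avail:  value from HEADER_NRCPUS, 0 if the feature is missing
 */
static bool sample_cpu_ok(uint32_t cpu, bool has_sample_cpu,
			  uint32_t nr_cpus_avail)
{
	uint32_t limit;

	/* Without PERF_SAMPLE_CPU, cpu is the (u32)-1 sentinel: keep it. */
	if (!has_sample_cpu)
		return true;

	/* Missing HEADER_NRCPUS falls back to the array-size cap. */
	limit = nr_cpus_avail ? nr_cpus_avail : SKETCH_MAX_NR_CPUS;
	if (limit > SKETCH_MAX_NR_CPUS)
		limit = SKETCH_MAX_NR_CPUS;

	return cpu < limit;
}
```

Taking the minimum of the two limits means a header that reports more
CPUs than the fixed-size arrays hold still cannot push an index past
them.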

Reported-by: sashiko-bot@kernel.org # Running on a local machine
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/header.c  | 30 +++++++++++++
 tools/perf/util/session.c | 88 ++++++++++++++++++++++++++++++++++++++-
 2 files changed, 117 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index f771a76321c10a02..5b1fa1653d2a48cc 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -48,6 +48,7 @@
 #include <api/io_dir.h>
 #include "asm/bug.h"
 #include "tool.h"
+#include "../perf.h"
 #include "time-utils.h"
 #include "units.h"
 #include "util/util.h" // perf_exe()
@@ -2895,6 +2896,17 @@ static int process_nrcpus(struct feat_fd *ff, void *data __maybe_unused)
 	if (ret)
 		return ret;
 
+	/*
+	 * Cap at 1M CPUs — generous for any real system but prevents
+	 * stack overflow from VLA allocations sized by nr_cpus_avail
+	 * (e.g. DECLARE_BITMAP in builtin-c2c.c node_entry()).
+	 */
+	if (nr_cpus_avail > (1U << 20)) {
+		pr_err("Invalid HEADER_NRCPUS: nr_cpus_avail (%u) exceeds maximum (%u)\n",
+		       nr_cpus_avail, 1U << 20);
+		return -1;
+	}
+
 	if (nr_cpus_online > nr_cpus_avail) {
 		pr_err("Invalid HEADER_NRCPUS: nr_cpus_online (%u) > nr_cpus_avail (%u)\n",
 		       nr_cpus_online, nr_cpus_avail);
@@ -5250,6 +5262,24 @@ int perf_session__read_header(struct perf_session *session)
 #endif
 	}
 
+	/*
+	 * Without nr_cpus_avail the sample CPU bounds check in
+	 * perf_session__deliver_event() is bypassed, allowing crafted
+	 * CPU IDs to reach downstream consumers that index fixed-size
+	 * arrays (timechart, kwork, sched — all sized MAX_NR_CPUS).
+	 *
+	 * This can happen with truncated files (interrupted recording
+	 * loses all feature sections), very old files that predate
+	 * HEADER_NRCPUS, or crafted files that omit it.  Fall back to
+	 * MAX_NR_CPUS so the bounds check is still effective — any
+	 * CPU ID below that limit is safe for all downstream arrays.
+	 */
+	if (header->env.nr_cpus_avail == 0) {
+		header->env.nr_cpus_avail = MAX_NR_CPUS;
+		pr_warning("WARNING: perf.data is missing HEADER_NRCPUS, using MAX_NR_CPUS (%d) as CPU bound\n",
+			   MAX_NR_CPUS);
+	}
+
 	return 0;
 out_errno:
 	return -errno;
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 9271885e3920f897..6de665d3c9054179 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -2110,14 +2110,100 @@ static int perf_session__deliver_event(struct perf_session *session,
 				       const char *file_path)
 {
 	struct perf_sample sample;
+	struct evsel *evsel;
 	int ret;
 
 	perf_sample__init(&sample, /*all=*/false);
-	ret = evlist__parse_sample(session->evlist, event, &sample);
+	evsel = evlist__event2evsel(session->evlist, event);
+	if (!evsel) {
+		pr_err("No evsel found for event type %u\n",
+		       event->header.type);
+		ret = -EFAULT;
+		goto out;
+	}
+	ret = evsel__parse_sample(evsel, event, &sample);
 	if (ret) {
 		pr_err("Can't parse sample, err = %d\n", ret);
 		goto out;
 	}
+	/*
+	 * evsel__parse_sample() doesn't populate machine_pid/vcpu,
+	 * which are needed by machines__find_for_cpumode() to
+	 * attribute samples to guest VMs.  The SID table maps
+	 * sample IDs to the guest that owns the event.
+	 */
+	if (perf_guest && sample.id) {
+		struct perf_sample_id *sid = evlist__id2sid(session->evlist, sample.id);
+
+		if (sid) {
+			sample.machine_pid = sid->machine_pid;
+			sample.vcpu = sid->vcpu.cpu;
+		}
+	}
+
+	/*
+	 * Validate sample.cpu before any callback can use it as an
+	 * array index (kwork cpus_runtime, timechart cpus_cstate_*,
+	 * sched cpu_last_switched).
+	 *
+	 * When PERF_SAMPLE_CPU is absent, evsel__parse_sample() leaves
+	 * sample.cpu as (u32)-1 — a sentinel that downstream tools
+	 * (script, inject) check to identify events without CPU info.
+	 * Only check when sample.cpu was actually populated from event
+	 * data: PERF_RECORD_SAMPLE always has it when PERF_SAMPLE_CPU
+	 * is set; non-sample events only have it when sample_id_all is
+	 * enabled.  Otherwise sample.cpu is the (u32)-1 sentinel from
+	 * evsel__parse_sample() and must not be validated or clamped.
+	 */
+	if ((evsel->core.attr.sample_type & PERF_SAMPLE_CPU) &&
+	    (event->header.type == PERF_RECORD_SAMPLE ||
+	     evsel->core.attr.sample_id_all)) {
+		int nr_cpus_avail = perf_session__env(session)->nr_cpus_avail;
+
+		/*
+		 * For perf.data files the MAX_NR_CPUS fallback in
+		 * perf_session__read_header() guarantees this is set.
+		 * For pipe mode, HEADER_NRCPUS may arrive late or not
+		 * at all (pre-2017 perf, third-party tools).  Fall
+		 * back to MAX_NR_CPUS so the bounds check still works
+		 * against fixed-size downstream arrays.
+		 *
+		 * Do NOT write back to env: this function runs during
+		 * recording (synthesized events) when nr_cpus_avail is
+		 * legitimately 0.  Writing MAX_NR_CPUS would cause
+		 * write_cpu_topology() to emit 4096 core_id/socket_id
+		 * pairs instead of the real CPU count, corrupting the
+		 * topology section in the generated perf.data.
+		 */
+		if (nr_cpus_avail <= 0)
+			nr_cpus_avail = MAX_NR_CPUS;
+		/*
+		 * Cap at MAX_NR_CPUS for the bounds check — downstream
+		 * consumers use fixed-size arrays of that size.  Keep
+		 * the true nr_cpus_avail in env for header parsing
+		 * (e.g. process_cpu_topology) which needs the real count.
+		 */
+		if (nr_cpus_avail > MAX_NR_CPUS)
+			nr_cpus_avail = MAX_NR_CPUS;
+		if (sample.cpu >= (u32)nr_cpus_avail &&
+		    sample.cpu != (u32)-1) {
+			/*
+			 * Warn rather than abort: synthesized events
+			 * (MMAP, COMM) lack sample_id_all data, so
+			 * parse_id_sample reads garbage from the event
+			 * payload.  Clamping to 0 protects downstream
+			 * array indexing while keeping the session alive.
+			 *
+			 * Preserve (u32)-1: perf script and perf inject
+			 * use it as a sentinel for "CPU not applicable."
+			 * Downstream array users (timechart, kwork) have
+			 * their own per-callback bounds checks.
+			 */
+			pr_warning_once("WARNING: sample CPU %u >= nr_cpus_avail %u, clamping to 0\n",
+					sample.cpu, nr_cpus_avail);
+			sample.cpu = 0;
+		}
+	}
 
 	ret = auxtrace__process_event(session, event, &sample, tool);
 	if (ret < 0)
-- 
2.54.0



* [PATCH 27/29] perf kwork: Bounds check work->cpu before indexing cpus_runtime[]
  2026-05-26 21:17 [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
                   ` (25 preceding siblings ...)
  2026-05-26 21:18 ` [PATCH 26/29] perf session: Bound nr_cpus_avail and validate sample CPU Arnaldo Carvalho de Melo
@ 2026-05-26 21:18 ` Arnaldo Carvalho de Melo
  2026-05-26 21:18 ` [PATCH 28/29] perf session: Snapshot event->header.size in process_user_event() Arnaldo Carvalho de Melo
                   ` (2 subsequent siblings)
  29 siblings, 0 replies; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-05-26 21:18 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot, Yang Jihong,
	Claude Opus 4.6 (1M context)

From: Arnaldo Carvalho de Melo <acme@redhat.com>

work->cpu comes from sample->cpu which is (u32)-1 when
PERF_SAMPLE_CPU is absent.  Stored as int, this becomes -1
which passes the signed BUG_ON(work->cpu >= MAX_NR_CPUS) but
causes an out-of-bounds access on cpus_runtime[-1].

Replace the BUG_ON in top_calc_total_runtime() with an unsigned
bounds check that skips entries with invalid CPU values, counting
them for a summary warning.

Guard the same index in profile_event_match() (bitmap OOB),
top_calc_idle_time(), top_calc_irq_runtime(), top_calc_cpu_usage(),
and top_calc_load_runtime().  Also guard against division by zero
in top_calc_cpu_usage() when no runtime was accumulated.
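
The difference between the old signed check and the new unsigned one
can be shown with two small predicates.  This is only a sketch: the
function names are hypothetical and SKETCH_MAX_NR_CPUS is an assumed
stand-in for MAX_NR_CPUS.

```c
#include <stdbool.h>

/* Assumed stand-in for perf's MAX_NR_CPUS. */
#define SKETCH_MAX_NR_CPUS 4096

/*
 * Shape of the old check: a signed comparison.  cpu == -1 is not
 * >= SKETCH_MAX_NR_CPUS, so it is treated as in range.
 */
static bool sketch_signed_check_ok(int cpu)
{
	return !(cpu >= SKETCH_MAX_NR_CPUS);
}

/*
 * Shape of the new check: the cast converts -1 to UINT_MAX, which is
 * far above the limit, so negative values are rejected together with
 * values that are too large.
 */
static bool sketch_unsigned_check_ok(int cpu)
{
	return (unsigned int)cpu < SKETCH_MAX_NR_CPUS;
}
```

For cpu == -1 the signed predicate returns true while the unsigned
one returns false; both agree for every value from 0 upwards.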

Reported-by: sashiko-bot@kernel.org # Running on a local machine
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-kwork.c | 45 +++++++++++++++++++++++++++++++++-----
 tools/perf/util/kwork.h    |  1 +
 2 files changed, 40 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-kwork.c b/tools/perf/builtin-kwork.c
index f793ea578515d08c..99dc293a0744726e 100644
--- a/tools/perf/builtin-kwork.c
+++ b/tools/perf/builtin-kwork.c
@@ -467,7 +467,9 @@ static bool profile_event_match(struct perf_kwork *kwork,
 	u64 time = sample->time;
 	struct perf_time_interval *ptime = &kwork->ptime;
 
-	if ((kwork->cpu_list != NULL) && !test_bit(cpu, kwork->cpu_bitmap))
+	/* Guard test_bit: cpu == -1 (absent PERF_SAMPLE_CPU) would index past the bitmap */
+	if ((kwork->cpu_list != NULL) &&
+	    ((unsigned int)cpu >= MAX_NR_CPUS || !test_bit(cpu, kwork->cpu_bitmap)))
 		return false;
 
 	if (((ptime->start != 0) && (ptime->start > time)) ||
@@ -2041,7 +2043,18 @@ static void top_calc_total_runtime(struct perf_kwork *kwork)
 	next = rb_first_cached(&class->work_root);
 	while (next) {
 		work = rb_entry(next, struct kwork_work, node);
-		BUG_ON(work->cpu >= MAX_NR_CPUS);
+		/*
+		 * work->cpu comes from sample->cpu which is -1 when
+		 * PERF_SAMPLE_CPU is absent.  As int that's -1, but as
+		 * unsigned it exceeds MAX_NR_CPUS — skip to avoid OOB
+		 * on cpus_runtime[].
+		 */
+		/* Counted and reported in perf_kwork__top_report() */
+		if ((unsigned int)work->cpu >= MAX_NR_CPUS) {
+			stat->nr_skipped_cpu++;
+			next = rb_next(next);
+			continue;
+		}
 		stat->cpus_runtime[work->cpu].total += work->total_runtime;
 		stat->cpus_runtime[MAX_NR_CPUS].total += work->total_runtime;
 		next = rb_next(next);
@@ -2053,7 +2066,8 @@ static void top_calc_idle_time(struct perf_kwork *kwork,
 {
 	struct kwork_top_stat *stat = &kwork->top_stat;
 
-	if (work->id == 0) {
+	/* See comment in top_calc_total_runtime() */
+	if (work->id == 0 && (unsigned int)work->cpu < MAX_NR_CPUS) {
 		stat->cpus_runtime[work->cpu].idle += work->total_runtime;
 		stat->cpus_runtime[MAX_NR_CPUS].idle += work->total_runtime;
 	}
@@ -2065,6 +2079,10 @@ static void top_calc_irq_runtime(struct perf_kwork *kwork,
 {
 	struct kwork_top_stat *stat = &kwork->top_stat;
 
+	/* See comment in top_calc_total_runtime() */
+	if ((unsigned int)work->cpu >= MAX_NR_CPUS)
+		return;
+
 	if (type == KWORK_CLASS_IRQ) {
 		stat->cpus_runtime[work->cpu].irq += work->total_runtime;
 		stat->cpus_runtime[MAX_NR_CPUS].irq += work->total_runtime;
@@ -2117,12 +2135,19 @@ static void top_calc_cpu_usage(struct perf_kwork *kwork)
 		if (work->total_runtime == 0)
 			goto next;
 
+		/* See comment in top_calc_total_runtime() */
+		if ((unsigned int)work->cpu >= MAX_NR_CPUS)
+			goto next;
+
 		__set_bit(work->cpu, stat->all_cpus_bitmap);
 
 		top_subtract_irq_runtime(kwork, work);
 
-		work->cpu_usage = work->total_runtime * 10000 /
-			stat->cpus_runtime[work->cpu].total;
+		/* Guard against division by zero if no runtime was accumulated */
+		if (stat->cpus_runtime[work->cpu].total) {
+			work->cpu_usage = work->total_runtime * 10000 /
+				stat->cpus_runtime[work->cpu].total;
+		}
 
 		top_calc_idle_time(kwork, work);
 next:
@@ -2135,7 +2160,8 @@ static void top_calc_load_runtime(struct perf_kwork *kwork,
 {
 	struct kwork_top_stat *stat = &kwork->top_stat;
 
-	if (work->id != 0) {
+	/* See comment in top_calc_total_runtime() */
+	if (work->id != 0 && (unsigned int)work->cpu < MAX_NR_CPUS) {
 		stat->cpus_runtime[work->cpu].load += work->total_runtime;
 		stat->cpus_runtime[MAX_NR_CPUS].load += work->total_runtime;
 	}
@@ -2211,6 +2237,13 @@ static void perf_kwork__top_report(struct perf_kwork *kwork)
 		next = rb_next(next);
 	}
 
+	if (kwork->top_stat.nr_skipped_cpu) {
+		printf("  Warning: %u work entries with invalid CPU were excluded from totals.\n"
+		       "  Task runtimes may appear inflated (IRQ time not subtracted).\n"
+		       "  Consider re-recording with PERF_SAMPLE_CPU enabled.\n",
+		       kwork->top_stat.nr_skipped_cpu);
+	}
+
 	printf("\n");
 }
 
diff --git a/tools/perf/util/kwork.h b/tools/perf/util/kwork.h
index 81d39a7f78c8b811..6ec70dcc4157a1f2 100644
--- a/tools/perf/util/kwork.h
+++ b/tools/perf/util/kwork.h
@@ -202,6 +202,7 @@ struct __top_cpus_runtime {
 struct kwork_top_stat {
 	DECLARE_BITMAP(all_cpus_bitmap, MAX_NR_CPUS);
 	struct __top_cpus_runtime *cpus_runtime;
+	unsigned int nr_skipped_cpu;
 };
 
 struct perf_kwork {
-- 
2.54.0



* [PATCH 28/29] perf session: Snapshot event->header.size in process_user_event()
  2026-05-26 21:17 [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
                   ` (26 preceding siblings ...)
  2026-05-26 21:18 ` [PATCH 27/29] perf kwork: Bounds check work->cpu before indexing cpus_runtime[] Arnaldo Carvalho de Melo
@ 2026-05-26 21:18 ` Arnaldo Carvalho de Melo
  2026-05-26 22:31   ` sashiko-bot
  2026-05-26 21:18 ` [PATCH 29/29] perf test: Add truncated perf.data robustness test Arnaldo Carvalho de Melo
  2026-05-27  1:06 ` [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
  29 siblings, 1 reply; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-05-26 21:18 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot,
	Claude Opus 4.6 (1M context)

From: Arnaldo Carvalho de Melo <acme@redhat.com>

On native-endian files, events are read from MAP_SHARED memory.
Multiple reads of event->header.size can return different values
if the file is concurrently modified, allowing an attacker to
bypass bounds checks performed on an earlier read.

Snapshot header.size into a local variable at function entry using
READ_ONCE() to prevent compiler rematerialization, and use it for
all size-dependent arithmetic within the function.  This ensures
every bounds calculation uses the same value that was validated
by the reader.
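
A standalone sketch of the snapshot pattern, under stated assumptions:
struct sketch_record is a hypothetical layout (a header followed by a
count and 64-bit entries), and sketch_read_once_u16() is a simplified
stand-in for READ_ONCE() that forces a single load through a volatile
pointer.  It is not the code in this patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header, loosely modelled on a type/misc/size triple. */
struct sketch_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

/* Hypothetical variable-length record. */
struct sketch_record {
	struct sketch_header header;
	uint64_t nr;
	uint64_t entries[];
};

/* Simplified stand-in for READ_ONCE(): exactly one load of *p. */
static uint16_t sketch_read_once_u16(const uint16_t *p)
{
	return *(const volatile uint16_t *)p;
}

/*
 * Return the validated entry count, or -1 if the record is too small
 * or claims more entries than its size allows.  Both the size and the
 * count are read once; every later calculation uses the local copies.
 */
static int64_t sketch_validated_nr(const struct sketch_record *rec)
{
	const uint32_t size = sketch_read_once_u16(&rec->header.size);
	uint64_t nr, max_nr;

	if (size < sizeof(*rec))
		return -1;

	/* Cannot underflow: size >= sizeof(*rec) was checked above. */
	max_nr = (size - sizeof(*rec)) / sizeof(rec->entries[0]);
	nr = *(const volatile uint64_t *)&rec->nr;
	if (nr > max_nr)
		return -1;

	return (int64_t)nr;
}
```

The point of the local copy is that the value checked and the value
used for arithmetic cannot differ, even if the backing memory changes
between the two statements.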

Reported-by: sashiko-bot@kernel.org # Running on a local machine
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/session.c | 27 +++++++++++++--------------
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 6de665d3c9054179..e2e821b77766dbfc 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -2230,6 +2230,7 @@ static s64 perf_session__process_user_event(struct perf_session *session,
 {
 	struct ordered_events *oe = &session->ordered_events;
 	const struct perf_tool *tool = session->tool;
+	const u32 event_size = READ_ONCE(event->header.size);
 	struct perf_sample sample;
 	int fd = perf_data__fd(session->data);
 	s64 err;
@@ -2271,7 +2272,7 @@ static s64 perf_session__process_user_event(struct perf_session *session,
 		break;
 	case PERF_RECORD_HEADER_BUILD_ID:
 		if (!perf_event__check_nul(event->build_id.filename,
-					   (void *)event + event->header.size,
+					   (void *)event + event_size,
 					   "HEADER_BUILD_ID")) {
 			err = 0;
 			break;
@@ -2294,7 +2295,7 @@ static s64 perf_session__process_user_event(struct perf_session *session,
 		 * place already.
 		 */
 		if (!perf_data__is_pipe(session->data))
-			lseek(fd, file_offset + event->header.size, SEEK_SET);
+			lseek(fd, file_offset + event_size, SEEK_SET);
 		err = tool->auxtrace(tool, session, event);
 		break;
 	case PERF_RECORD_AUXTRACE_ERROR:
@@ -2304,14 +2305,14 @@ static s64 perf_session__process_user_event(struct perf_session *session,
 	case PERF_RECORD_THREAD_MAP: {
 		u64 max_nr;
 
-		if (event->header.size < sizeof(event->thread_map)) {
+		if (event_size < sizeof(event->thread_map)) {
 			pr_err("PERF_RECORD_THREAD_MAP: header.size (%u) too small\n",
-			       event->header.size);
+			       event_size);
 			err = -EINVAL;
 			break;
 		}
 
-		max_nr = (event->header.size - sizeof(event->thread_map)) /
+		max_nr = (event_size - sizeof(event->thread_map)) /
 			 sizeof(event->thread_map.entries[0]);
 		if (event->thread_map.nr > max_nr) {
 			pr_err("PERF_RECORD_THREAD_MAP: nr %" PRIu64 " exceeds max %" PRIu64 "\n",
@@ -2325,7 +2326,7 @@ static s64 perf_session__process_user_event(struct perf_session *session,
 	}
 	case PERF_RECORD_CPU_MAP: {
 		struct perf_record_cpu_map_data *data = &event->cpu_map.data;
-		u32 payload = event->header.size - sizeof(event->header);
+		u32 payload = event_size - sizeof(event->header);
 
 		/*
 		 * Native-endian events are mmap'd read-only, so we
@@ -2389,8 +2390,8 @@ static s64 perf_session__process_user_event(struct perf_session *session,
 		break;
 	}
 	case PERF_RECORD_STAT_CONFIG: {
-		/* Cannot underflow: perf_event__min_size[] guarantees header.size >= sizeof */
-		u64 max_nr = (event->header.size - sizeof(event->stat_config)) /
+		/* Cannot underflow: perf_event__min_size[] guarantees event_size >= sizeof */
+		u64 max_nr = (event_size - sizeof(event->stat_config)) /
 			     sizeof(event->stat_config.data[0]);
 
 		/*
@@ -2421,7 +2422,7 @@ static s64 perf_session__process_user_event(struct perf_session *session,
 		 */
 		memset(&session->time_conv, 0, sizeof(session->time_conv));
 		memcpy(&session->time_conv, &event->time_conv,
-		       min((size_t)event->header.size, sizeof(session->time_conv)));
+		       min((size_t)event_size, sizeof(session->time_conv)));
 		err = tool->time_conv(tool, session, event);
 		break;
 	case PERF_RECORD_HEADER_FEATURE:
@@ -2438,11 +2439,10 @@ static s64 perf_session__process_user_event(struct perf_session *session,
 		break;
 	case PERF_RECORD_BPF_METADATA: {
 		u64 nr_entries, max_entries;
-		u32 hdr_size = READ_ONCE(event->header.size);
 
-		if (hdr_size < sizeof(event->bpf_metadata)) {
+		if (event_size < sizeof(event->bpf_metadata)) {
 			pr_warning("WARNING: PERF_RECORD_BPF_METADATA: header.size (%u) too small, skipping\n",
-				   hdr_size);
+				   event_size);
 			err = 0;
 			break;
 		}
@@ -2458,9 +2458,8 @@ static s64 perf_session__process_user_event(struct perf_session *session,
 			break;
 		}
 
-		/* Snapshot — event is mmap'd and could change between reads */
 		nr_entries = READ_ONCE(event->bpf_metadata.nr_entries);
-		max_entries = (hdr_size - sizeof(event->bpf_metadata)) /
+		max_entries = (event_size - sizeof(event->bpf_metadata)) /
 			      sizeof(event->bpf_metadata.entries[0]);
 		if (nr_entries > max_entries) {
 			pr_warning("WARNING: PERF_RECORD_BPF_METADATA: nr_entries %" PRIu64 " exceeds max %" PRIu64 ", skipping\n",
-- 
2.54.0



* [PATCH 29/29] perf test: Add truncated perf.data robustness test
  2026-05-26 21:17 [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
                   ` (27 preceding siblings ...)
  2026-05-26 21:18 ` [PATCH 28/29] perf session: Snapshot event->header.size in process_user_event() Arnaldo Carvalho de Melo
@ 2026-05-26 21:18 ` Arnaldo Carvalho de Melo
  2026-05-26 22:19   ` sashiko-bot
  2026-05-27  1:06 ` [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
  29 siblings, 1 reply; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-05-26 21:18 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, Claude Opus 4.6 (1M context)

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Add a shell test that verifies perf report handles truncated perf.data
files gracefully — exiting with an error code rather than crashing with
SIGSEGV or SIGABRT.

The test records a simple workload, then truncates the resulting
perf.data at four offsets that exercise different parsing stages:

  8 bytes   — file header magic only
  64 bytes  — partial file header (attr section incomplete)
  256 bytes — into the first events (partial event headers)
  75% size  — mid-stream truncation (partial event data)

For each truncation, perf report is run and the exit code is checked:

- Exit code 0 (success) fails the test — a truncated file should
  never parse without error.

- Crash signals are detected portably via kill -l, which maps the
  signal number to a name on the running system.  This handles
  architectures where signal numbers differ (e.g. SIGBUS is 7 on
  x86/ARM but 10 on MIPS/SPARC).  Core-dump and fatal signals
  (KILL, ILL, ABRT, BUS, FPE, SEGV, TRAP, SYS) fail the test.

- Higher exit codes (200+) are perf's own negative-errno returns
  (e.g. -EINVAL is -22, which the shell sees as 256 - 22 = 234) and
  are expected.
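
The exit-code arithmetic behind these three cases can be written out
as a few helpers.  This is an illustrative sketch in C rather than the
shell used by the test; the sketch_* names are hypothetical, and the
129..199 window simply mirrors the range the script checks.

```c
#include <errno.h>
#include <stdbool.h>

/* Exit status as seen by the parent: only the low 8 bits survive. */
static int sketch_exit_status(int ret)
{
	return ret & 0xff;
}

/* The window the test treats as "terminated by signal N" (128 + N). */
static bool sketch_in_signal_window(int status)
{
	return status > 128 && status < 200;
}

/* Recover the signal number from a 128 + N status. */
static int sketch_signal_number(int status)
{
	return status - 128;
}
```

With EINVAL == 22, sketch_exit_status(-EINVAL) is 234, which falls
outside the signal window, whereas a status of 139 maps to signal 11.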

This exercises the bounds checking, minimum-size validation, and error
propagation added by the preceding patches in this series.

Testing it:

  root@number:~# perf test truncat
   84: Test that perf report handles truncated perf.data gracefully (no crash, no segfault — clean error exit).: Ok
  root@number:~# perf test -vv truncat
   84: Test that perf report handles truncated perf.data gracefully (no crash, no segfault — clean error exit).:
  --- start ---
  test child forked, pid 62890
  ---- end(0) ----
   84: Test that perf report handles truncated perf.data gracefully (no crash, no segfault — clean error exit).: Ok
  root@number:~#

Changes in v2:
- Add SIGKILL to the list of fatal signals so OOM kills from
  resource exhaustion bugs are detected (Reported-by: sashiko-bot@kernel.org)

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
[ Fixed the SPDX on the line where 'perf test' expects the test description, reviewed by Ian Rogers ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/tests/shell/data_validation.sh | 85 +++++++++++++++++++++++
 1 file changed, 85 insertions(+)
 create mode 100755 tools/perf/tests/shell/data_validation.sh

diff --git a/tools/perf/tests/shell/data_validation.sh b/tools/perf/tests/shell/data_validation.sh
new file mode 100755
index 0000000000000000..5c3e6fced6fce0cb
--- /dev/null
+++ b/tools/perf/tests/shell/data_validation.sh
@@ -0,0 +1,85 @@
+#!/bin/bash
+# Test that perf report handles truncated perf.data gracefully (no crash, no segfault — clean error exit).
+# SPDX-License-Identifier: GPL-2.0
+#
+# Exercises the bounds checking and minimum-size validation added
+# by the perf-data-validation hardening series.
+
+err=0
+
+cleanup() {
+	rm -f "${perfdata}" "${perfdata}.old" "${truncated}" "${stderrfile}"
+	trap - EXIT TERM INT
+}
+trap 'cleanup; exit 1' TERM INT
+trap cleanup EXIT
+
+perfdata=$(mktemp /tmp/__perf_test.perf.data.XXXXX) || exit 2
+truncated=$(mktemp /tmp/__perf_test.perf.data.XXXXX) || exit 2
+stderrfile=$(mktemp /tmp/__perf_test.perf.data.XXXXX) || exit 2
+
+# Record a simple workload
+if ! perf record -o "${perfdata}" -- perf test -w noploop 2>/dev/null; then
+	echo "Skip: perf record failed"
+	cleanup
+	exit 2
+fi
+
+file_size=$(wc -c < "${perfdata}")
+if [ "${file_size}" -lt 512 ]; then
+	echo "Skip: perf.data too small (${file_size} bytes)"
+	cleanup
+	exit 2
+fi
+
+# Test truncation at various offsets that exercise different
+# parsing stages:
+#   8    — file header magic only, no attrs or data
+#   64   — partial file header (attr section incomplete)
+#   256  — into the first events (partial event headers)
+#   75%  — mid-stream truncation (partial event data)
+for cut_at in 8 64 256 $((file_size * 3 / 4)); do
+	if [ "${cut_at}" -ge "${file_size}" ]; then
+		continue
+	fi
+	dd if="${perfdata}" of="${truncated}" bs="${cut_at}" count=1 2>/dev/null
+
+	# perf report should exit with an error, not crash.
+	# Capture stderr to detect sanitizer violations.
+	perf report -i "${truncated}" --stdio > /dev/null 2> "${stderrfile}"
+	exit_code=$?
+
+	# A truncated file should never parse successfully
+	if [ ${exit_code} -eq 0 ]; then
+		echo "FAIL: perf report exited 0 (success) on ${cut_at}-byte truncated file — expected an error"
+		err=1
+		continue
+	fi
+
+	# Detect sanitizer violations — ASAN/MSAN/TSAN/UBSAN exit
+	# with code 1 by default, which would otherwise look like a
+	# clean error exit.  Check stderr for their markers.
+	if grep -qE "^(==[0-9]+==ERROR:|SUMMARY: [A-Za-z]*Sanitizer)" "${stderrfile}" 2>/dev/null; then
+		sanitizer=$(grep -oE "(Address|Memory|Thread|UndefinedBehavior)Sanitizer" "${stderrfile}" | head -1)
+		echo "FAIL: perf report triggered ${sanitizer:-sanitizer} on ${cut_at}-byte truncated file"
+		err=1
+		continue
+	fi
+
+	# Detect crash signals portably — signal numbers differ
+	# across architectures (e.g. SIGBUS is 7 on x86/ARM but
+	# 10 on MIPS/SPARC).  Use kill -l to map the number to a
+	# name on the running system.
+	if [ ${exit_code} -gt 128 ] && [ ${exit_code} -lt 200 ]; then
+		sig_name=$(kill -l $((exit_code - 128)) 2>/dev/null)
+		case ${sig_name} in
+		KILL|ILL|ABRT|BUS|FPE|SEGV|TRAP|SYS)
+			echo "FAIL: perf report crashed (SIG${sig_name}) on ${cut_at}-byte truncated file"
+			err=1
+			;;
+		esac
+	fi
+done
+
+cleanup
+exit ${err}
-- 
2.54.0



* Re: [PATCH 05/29] perf zstd: Fix multi-iteration decompression and error handling
  2026-05-26 21:17 ` [PATCH 05/29] perf zstd: Fix multi-iteration decompression and error handling Arnaldo Carvalho de Melo
@ 2026-05-26 21:49   ` sashiko-bot
  0 siblings, 0 replies; 47+ messages in thread
From: sashiko-bot @ 2026-05-26 21:49 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: linux-perf-users

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:

Pre-existing issues:
- [High] Missing O_NONBLOCK when opening potentially user-controlled file paths, which can hang the process.
--

commit e38284f65bb9657e0257bfeee414f3d1b5f409dd
Author: Arnaldo Carvalho de Melo <acme@redhat.com>

perf zstd: Fix multi-iteration decompression and error handling

zstd_decompress_stream() has two bugs in its multi-iteration loop:

1. After each ZSTD_decompressStream() call, the code advances
   output.dst by output.pos but doesn't reset output.pos to 0.

[ ... ]

> diff --git a/tools/perf/util/zstd.c b/tools/perf/util/zstd.c
> index ecda9deb53b73..21a0eb58597c2 100644
> --- a/tools/perf/util/zstd.c
> +++ b/tools/perf/util/zstd.c
> @@ -123,14 +123,26 @@ size_t zstd_decompress_stream(struct zstd_data *data, void *src, size_t src_size
>  		}
>  	}
>  	while (input.pos < input.size) {

[Severity: High]
This isn't a bug introduced by this patch, but functions that extract build
IDs, such as filename__read_build_id() in tools/perf/util/symbol-elf.c and
sysfs__read_build_id() in tools/perf/util/symbol-minimal.c, open file paths
originating from user-controlled sources.

The underlying open(..., O_RDONLY) calls lack the O_NONBLOCK flag. Even though
checks like is_regular_file() exist in some places, they suffer from
Time-of-Check to Time-of-Use (TOCTOU) races.

If a path resolves to a blocking special file like a FIFO or a hung network
mount, the open() call will hang the process.

Could we add O_NONBLOCK to these open() calls?
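
One way the suggestion could look, as a hedged sketch rather than a
proposed patch: sketch_open_regular() is a hypothetical helper, not an
existing perf function.  It opens with O_NONBLOCK so that opening a
FIFO for reading returns immediately, then checks the file type with
fstat() on the descriptor it actually got, which avoids the
check-then-open race.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: open path read-only without blocking, and
 * accept it only if the opened object is a regular file.
 * Returns a file descriptor, or -1 with errno set.
 */
static int sketch_open_regular(const char *path)
{
	struct stat st;
	int fd = open(path, O_RDONLY | O_NONBLOCK);

	if (fd < 0)
		return -1;

	/* Check the opened descriptor, not the path, to avoid TOCTOU. */
	if (fstat(fd, &st) < 0) {
		int saved = errno;

		close(fd);
		errno = saved;
		return -1;
	}
	if (!S_ISREG(st.st_mode)) {
		close(fd);
		errno = EINVAL;
		return -1;
	}
	return fd;
}
```

For regular files O_NONBLOCK has no effect on subsequent read()
calls, so callers would not need to change how they read the data.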

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260526211806.1193848-1-acme@kernel.org?part=5


* Re: [PATCH 11/29] perf session: Validate nr fields against event size on both swap and common paths
  2026-05-26 21:17 ` [PATCH 11/29] perf session: Validate nr fields against event size on both swap and common paths Arnaldo Carvalho de Melo
@ 2026-05-26 21:54   ` sashiko-bot
  0 siblings, 0 replies; 47+ messages in thread
From: sashiko-bot @ 2026-05-26 21:54 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: linux-perf-users

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Medium] Inconsistent error handling for malformed PERF_RECORD_THREAD_MAP events leads to unnecessary session aborts on native architectures.
--

commit 8cc6c3cf14b21b51f888c8eeb92fee4d28388fbd
Author: Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Validate nr fields against event size on both swap and common paths

This commit adds bounds checking to variable-length arrays in events such as
PERF_RECORD_THREAD_MAP, PERF_RECORD_NAMESPACES, PERF_RECORD_CPU_MAP, and
PERF_RECORD_STAT_CONFIG. It applies these checks on both the native and
cross-endian processing paths.

> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
> index aef10d42be354..8588e12f110fc 100644
> --- a/tools/perf/util/session.c
> +++ b/tools/perf/util/session.c

[ ... ]

> @@ -1912,15 +2029,112 @@ static s64 perf_session__process_user_event(struct perf_session *session,
>  		perf_session__auxtrace_error_inc(session, event);
>  		err = tool->auxtrace_error(tool, session, event);
>  		break;
> -	case PERF_RECORD_THREAD_MAP:
> +	case PERF_RECORD_THREAD_MAP: {
> +		u64 max_nr;
> +
> +		if (event->header.size < sizeof(event->thread_map)) {
> +			pr_err("PERF_RECORD_THREAD_MAP: header.size (%u) too small\n",
> +			       event->header.size);
> +			err = -EINVAL;
> +			break;
> +		}
> +
> +		max_nr = (event->header.size - sizeof(event->thread_map)) /
> +			 sizeof(event->thread_map.entries[0]);
> +		if (event->thread_map.nr > max_nr) {
> +			pr_err("PERF_RECORD_THREAD_MAP: nr %" PRIu64 " exceeds max %" PRIu64 "\n",
> +			       (u64)event->thread_map.nr, max_nr);
> +			err = -EINVAL;

[Severity: Medium]
Does returning -EINVAL here introduce inconsistent error handling between the
native-endian path and the cross-endian swap path?

On the cross-endian swap path, perf_event__thread_map_swap() returns -1
if the thread map nr exceeds the payload size. perf_session__process_event()
handles this swap failure by skipping the event and allowing the session to
continue.

Here on the native-endian path, the bounds check detects the same condition
but sets err = -EINVAL. This -EINVAL propagates up to
__perf_session__process_events() and causes the entire perf session to abort.

Since other events like CPU_MAP and STAT_CONFIG are skipped on error, and the
swap path skips malformed THREAD_MAP events, should the native path also skip
them to prevent a single malformed event from breaking the entire session?

> +			break;
> +		}
> +
>  		err = tool->thread_map(tool, session, event);
>  		break;

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260526211806.1193848-1-acme@kernel.org?part=11


* Re: [PATCH 08/29] perf session: Add validated swap infrastructure with null-termination checks
  2026-05-26 21:17 ` [PATCH 08/29] perf session: Add validated swap infrastructure with null-termination checks Arnaldo Carvalho de Melo
@ 2026-05-26 21:55   ` sashiko-bot
  0 siblings, 0 replies; 47+ messages in thread
From: sashiko-bot @ 2026-05-26 21:55 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: linux-perf-users

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:

Pre-existing issues:
- [High] The `prot` and `flags` fields in `PERF_RECORD_MMAP2` are completely ignored during endianness swap.
- [High] The `perf_event__event_update_swap` function fails to byte-swap the variable union fields of `PERF_RECORD_EVENT_UPDATE`.
--

commit bc0a5c7da863b9ea5a7bf13cf8099e8bd03a1abc
Author: Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Add validated swap infrastructure with null-termination checks

This commit changes swap callbacks to return an int instead of void to allow
error propagation. It also adds a string null-termination check and includes
several security hardenings for cross-endian swap handlers.

> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
> index 37544a3574185..d5864e380c1bd 100644
> --- a/tools/perf/util/session.c
> +++ b/tools/perf/util/session.c
[ ... ]
> -static void perf_event__mmap2_swap(union perf_event *event,
> +static int perf_event__mmap2_swap(union perf_event *event,
>  				  bool sample_id_all)
>  {
>  	event->mmap2.pid   = bswap_32(event->mmap2.pid);

[Severity: High]
This is a pre-existing issue, but I noticed the prot and flags fields of
PERF_RECORD_MMAP2 are skipped during the endianness swap.

When processing cross-endian perf.data files, these 32-bit values will be
interpreted with the wrong endianness. Could this result in garbage values
when the flags field is read directly during processing (such as in
machine__process_mmap2_event() to check for MAP_SHARED), leading to incorrect
memory permissions or misattribution of samples in memory maps?
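
A minimal sketch of what swapping the two fields could look like, assuming a
simplified stand-in for the relevant part of the record. The struct
`mmap2_tail` and the helpers `swap32()` and `mmap2_tail_swap()` are
hypothetical names for illustration, not perf's own types or functions:

```c
#include <stdint.h>

/* Hypothetical stand-in for the two 32-bit fields at issue; not perf's struct. */
struct mmap2_tail {
	uint32_t prot;
	uint32_t flags;
};

/* Portable 32-bit byte swap, equivalent in effect to bswap_32(). */
static uint32_t swap32(uint32_t v)
{
	return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
	       ((v & 0x00ff0000u) >> 8)  | ((v & 0xff000000u) >> 24);
}

/* Swap both fields in place, the same way pid and tid are already handled. */
static void mmap2_tail_swap(struct mmap2_tail *t)
{
	t->prot  = swap32(t->prot);
	t->flags = swap32(t->flags);
}
```

With this model, a flags value recorded as 0x1 on a foreign-endian machine
reads as 0x01000000 until it is swapped back.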

[ ... ]
> -static void perf_event__event_update_swap(union perf_event *event,
> -					  bool sample_id_all __maybe_unused)
> +static int perf_event__event_update_swap(union perf_event *event,
> +					 bool sample_id_all __maybe_unused)
>  {
>  	event->event_update.type = bswap_64(event->event_update.type);
>  	event->event_update.id   = bswap_64(event->event_update.id);
> +	return 0;
>  }

[Severity: High]
This is a pre-existing issue, but it looks like the variable union fields
in struct perf_record_event_update are never byte-swapped.

The struct contains a union payload (like cpus, scale, or name) based on
event->event_update.type. For cross-endian captures, multi-byte fields
inside the union remain unswapped. Could this lead to data corruption in
downstream stats processing, or potential memory exhaustion and
out-of-bounds reads when allocating CPU maps using unswapped nr values in
cpu_map__new_data?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260526211806.1193848-1-acme@kernel.org?part=8

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 15/29] perf session: Add byte-swap and bounds check for PERF_RECORD_BPF_METADATA events
  2026-05-26 21:17 ` [PATCH 15/29] perf session: Add byte-swap and bounds check for PERF_RECORD_BPF_METADATA events Arnaldo Carvalho de Melo
@ 2026-05-26 21:56   ` sashiko-bot
  0 siblings, 0 replies; 47+ messages in thread
From: sashiko-bot @ 2026-05-26 21:56 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: linux-perf-users

Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:

New issues:
- [High] A double-fetch of `event->header.size` in `perf_event__bpf_metadata_swap()` allows an attacker to underflow `max_nr`, bypassing clamping logic and causing massive out-of-bounds writes.

Pre-existing issues:
- [High] The bounds checking and string validation introduced for PERF_RECORD_BPF_METADATA are bypassed by downstream consumers due to a Time-Of-Check to Time-Of-Use (TOCTOU) vulnerability.
- [High] In `perf_session__process_user_event()`, fetching `event->header.size` via `READ_ONCE()` can retrieve a size larger than what the caller validated, leading to out-of-bounds reads past the memory mapping.
--

commit be863429ae5d472c5216e543fecba1e6eae818cc
Author: Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Add byte-swap and bounds check for PERF_RECORD_BPF_METADATA events

PERF_RECORD_BPF_METADATA has no entry in perf_event__swap_ops[], so its
nr_entries field is never byte-swapped when reading a cross-endian
perf.data file. Downstream processing in perf_event__fprintf_bpf_metadata()
loops over nr_entries, so a foreign-endian value causes out-of-bounds reads.

> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
> index 092fccbea8f80..95eb793026de6 100644
> --- a/tools/perf/util/session.c
> +++ b/tools/perf/util/session.c
> @@ -961,6 +961,48 @@ static int perf_event__time_conv_swap(union perf_event *event,
>  	return 0;
>  }
>  
> +static int perf_event__bpf_metadata_swap(union perf_event *event,
> +					 bool sample_id_all __maybe_unused)
> +{
> +	u64 i, nr, max_nr;
> +
> +	/* Fixed header must fit before accessing nr_entries or prog_name */
> +	if (event->header.size < sizeof(event->bpf_metadata))
> +		return -1;
> +
> +	event->bpf_metadata.nr_entries = bswap_64(event->bpf_metadata.nr_entries);

[ ... ]

> +	nr = event->bpf_metadata.nr_entries;
> +	max_nr = (event->header.size - sizeof(event->bpf_metadata)) /
> +		 sizeof(event->bpf_metadata.entries[0]);

[Severity: High]
Can this double-fetch of event->header.size allow an attacker to bypass the
clamping logic?

Because event points to a memory-mapped file, the file can be modified
concurrently. If event->header.size is modified to a value smaller than
sizeof(event->bpf_metadata) after the initial size check above, but before this
calculation, wouldn't the subtraction underflow and yield a massive unsigned
value for max_nr?

This would bypass the nr > max_nr check below, allowing the loop to write NUL
bytes out-of-bounds. The commit message mentions fixing a double-fetch via
READ_ONCE(), but it appears it was only applied to process_user_event(),
leaving the swap path vulnerable.
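
One way to avoid the second fetch, sketched under assumptions: the caller
reads the header size once (for example with READ_ONCE()) and passes that
snapshot to a helper that does both the size check and the division. The
helper name `max_entries()` and its parameters are hypothetical and not part
of the patch:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute how many fixed-size entries fit in an event
 * from a single snapshot of the header size.
 *   hdr_size: value of header.size, read once by the caller
 *   fixed:    size of the fixed part of the record
 *   entry:    size of one array element
 */
static uint64_t max_entries(uint16_t hdr_size, size_t fixed, size_t entry)
{
	if (entry == 0 || hdr_size < fixed)
		return 0;	/* too small: no entries, and no underflow */
	return (hdr_size - fixed) / entry;
}
```

Because the check and the subtraction use the same local value, a concurrent
change to the mapped field cannot make the subtraction wrap.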

> +	if (nr > max_nr) {
> +		/* Persist clamped value so the native path processes entries, not skips */
> +		nr = max_nr;
> +		event->bpf_metadata.nr_entries = nr;
> +	}
> +
> +	for (i = 0; i < nr; i++) {
> +		event->bpf_metadata.entries[i].key[BPF_METADATA_KEY_LEN - 1] = '\0';
> +		event->bpf_metadata.entries[i].value[BPF_METADATA_VALUE_LEN - 1] = '\0';
> +	}
> +	return 0;
> +}

[ ... ]

> @@ -2203,9 +2246,53 @@ static s64 perf_session__process_user_event(struct perf_session *session,
>  	case PERF_RECORD_FINISHED_INIT:
>  		err = tool->finished_init(tool, session, event);
>  		break;
> -	case PERF_RECORD_BPF_METADATA:
> +	case PERF_RECORD_BPF_METADATA: {
> +		u64 nr_entries, max_entries;
> +		u32 hdr_size = READ_ONCE(event->header.size);

[Severity: High]
This is a pre-existing issue, but by re-fetching event->header.size via
READ_ONCE() here, could we retrieve a size larger than what the caller
originally validated?

If a malicious writer enlarges the size between the caller's check against the
memory mapping bounds and this READ_ONCE(), max_entries would be calculated
based on the enlarged size. This would cause the strnlen validations below
to read memory past the end of the memory-mapped file.

> +
> +		if (hdr_size < sizeof(event->bpf_metadata)) {
> +			pr_warning("WARNING: PERF_RECORD_BPF_METADATA: header.size (%u) too small, skipping\n",
> +				   hdr_size);
> +			err = 0;
> +			break;
> +		}
> +
> +		/*
> +		 * Native-endian files are mmap'd read-only — validate
> +		 * NUL-termination instead of writing.
> +		 */
> +		if (strnlen(event->bpf_metadata.prog_name,
> +			    BPF_PROG_NAME_LEN) == BPF_PROG_NAME_LEN) {
> +			pr_warning("WARNING: PERF_RECORD_BPF_METADATA: prog_name not null-terminated, skipping\n");
> +			err = 0;
> +			break;
> +		}

[ ... ]

> +		for (u64 i = 0; i < nr_entries; i++) {
> +			if (strnlen(event->bpf_metadata.entries[i].key,
> +				    BPF_METADATA_KEY_LEN) == BPF_METADATA_KEY_LEN ||
> +			    strnlen(event->bpf_metadata.entries[i].value,
> +				    BPF_METADATA_VALUE_LEN) == BPF_METADATA_VALUE_LEN) {
> +				pr_warning("WARNING: PERF_RECORD_BPF_METADATA: entry %" PRIu64 " key/value not null-terminated, skipping\n", i);
> +				err = 0;
> +				goto out;
> +			}
> +		}
> +
>  		err = tool->bpf_metadata(tool, session, event);

[Severity: High]
This is a pre-existing issue, but can the validation added above be bypassed here?

After verifying the string lengths and entry bounds locally, the raw
memory-mapped event pointer is passed directly to the downstream callback.
If downstream handlers read event->bpf_metadata.nr_entries and the payload
strings from the shared memory again, an attacker could concurrently modify
the mmap'd file after these checks, leading to out-of-bounds reads.

>  		break;
> +	}
>  	case PERF_RECORD_SCHEDSTAT_CPU:
>  		err = tool->schedstat_cpu(tool, session, event);
>  		break;

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260526211806.1193848-1-acme@kernel.org?part=15

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 04/29] perf zstd: Fix compression error path in zstd_compress_stream_to_records()
  2026-05-26 21:17 ` [PATCH 04/29] perf zstd: Fix compression error path in zstd_compress_stream_to_records() Arnaldo Carvalho de Melo
@ 2026-05-26 22:00   ` sashiko-bot
  0 siblings, 0 replies; 47+ messages in thread
From: sashiko-bot @ 2026-05-26 22:00 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: linux-perf-users

Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:

Pre-existing issues:
- [Critical] Mishandling of multi-record compressed streams leads to header size overflow, stream corruption, and out-of-bounds stack reads.
- [High] AIO compressed records contain an uninitialized `data_size` field, breaking decompression.
- [High] `ZSTD_flushStream()` is invoked on potentially errored contexts and its return value is ignored, causing silent data loss.
- [High] `zstd_decompress_stream` manually advances the output buffer pointers without resetting `output.pos`, leading to double-increment and uninitialized memory gaps.
--

commit e4a35c8e9f1ac44e063d21b4f97da413caa1b375
Author: Arnaldo Carvalho de Melo <acme@redhat.com>

perf zstd: Fix compression error path in zstd_compress_stream_to_records()

This commit fixes multiple issues in the error paths of
zstd_compress_stream_to_records(). It properly resets the ZSTD
compression context on failures instead of incorrectly copying
uncompressed data, and ensures process_header() doesn't write past the
output buffer. It also fixes dst_size underflows and adds an
out_child_no_flush label to prevent retrying failed mmap reads.

> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index cc601796b2c8a..f1877bac815d7 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
[ ... ]
> diff --git a/tools/perf/util/zstd.c b/tools/perf/util/zstd.c
> index 57027e0ac7b65..ecda9deb53b73 100644
> --- a/tools/perf/util/zstd.c
> +++ b/tools/perf/util/zstd.c
[ ... ]
> @@ -54,7 +55,13 @@ ssize_t zstd_compress_stream_to_records(struct zstd_data *data, void *dst, size_
>  
>  	while (input.pos < input.size) {
>  		record = dst;

[Severity: Critical]
This is a pre-existing issue, but does returning the aggregate size of
multiple compressed records from this loop cause problems for callers like
record__pushfn()?

If record__pushfn() assumes the output is a single record and assigns
the total aggregate size to event->header.size, a total compressed size
exceeding 65535 bytes could cause the 16-bit header.size to overflow and
truncate.

This truncation could cause an underflow in the subsequent padding
calculation (padding = event->header.size - compressed), resulting in a
massive size passed to record__write(&pad, padding) which would dump
out-of-bounds stack memory into the trace file.
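
The arithmetic behind this concern can be modelled in isolation. This is a
sketch under assumptions, not the code in record__pushfn(): the helpers
`store_header_size()` and `padding_for()` are hypothetical and only mirror
the 16-bit store and the subtraction described above:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of storing an aggregate size in a 16-bit header field. */
static uint16_t store_header_size(size_t total)
{
	return (uint16_t)total;		/* keeps only the low 16 bits */
}

/* Mirrors the subtraction described above; wraps if the header was truncated. */
static size_t padding_for(uint16_t hdr_size, size_t compressed)
{
	return (size_t)hdr_size - compressed;
}
```

For an aggregate of 70000 bytes the stored size becomes 4464, and the
resulting padding is a very large unsigned value rather than a small pad.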

> +		/* process_header writes the event header into record */
> +		if (dst_size < sizeof(struct perf_event_header))
> +			goto reset;
>  		size = process_header(record, 0);

[Severity: High]
This isn't a bug introduced by this patch, but does the process_header
callback leave the 8-byte data_size field uninitialized in AIO mode?

In AIO mode, record__aio_pushfn() calls zstd_compress() but does not
manually set event->data_size. When process_header() initializes
header.type and header.size, the data_size field appears to be left
uninitialized.

This causes perf_session__process_compressed_event() to fail
decompression when reading perf.data.

> +		/* Output buffer full — cannot fit even the record header */
> +		if (size > dst_size)
> +			goto reset;
>  		compressed += size;
>  		dst += size;
>  		dst_size -= size;
[ ... ]
> @@ -65,10 +72,18 @@ ssize_t zstd_compress_stream_to_records(struct zstd_data *data, void *dst, size_
>  		if (ZSTD_isError(ret)) {

[Severity: High]
This is a pre-existing issue, but does calling ZSTD_flushStream() just
before this error check, without verifying if ZSTD_compressStream()
succeeded, violate the API contract?

Also, the return value of ZSTD_flushStream() is discarded. If the output
buffer fills up during flush, ZSTD returns > 0 to indicate more flushing
is required.

Because the loop only evaluates input.pos < input.size, if the input is
exhausted but the flush is incomplete, the loop will exit and leave the
remaining compressed data trapped in the internal context, dropping data.

>  			pr_err("failed to compress %ld bytes: %s\n",
>  				(long)src_size, ZSTD_getErrorName(ret));
> -			memcpy(dst, src, src_size);
> -			return src_size;
> +			goto reset;
>  		}
>  		size = output.pos;
[ ... ]
> @@ -76,6 +91,14 @@ ssize_t zstd_compress_stream_to_records(struct zstd_data *data, void *dst, size_
>  	}
>  
>  	return compressed;
> +
> +reset:
> +	/* Reset so the context is usable if the caller retries */
> +	ret = ZSTD_initCStream(data->cstream, data->comp_level);
> +	if (ZSTD_isError(ret))
> +		pr_err("failed to reset compression context: %s\n",
> +			ZSTD_getErrorName(ret));
> +	return -1;
>  }
>  
>  size_t zstd_decompress_stream(struct zstd_data *data, void *src, size_t src_size,

[Severity: High]
This isn't a bug introduced by this patch, but does
zstd_decompress_stream() fail to reset output.pos to 0 during each
iteration?

At the end of the decompression loop:

    output.dst  = dst + output.pos;
    output.size = dst_size - output.pos;

Without resetting output.pos = 0, the next iteration will write at
(dst + pos) + pos, skipping memory and leaving uninitialized gaps.
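
A small model of the pointer bookkeeping, assuming an output descriptor with
the same dst/size/pos fields where the next write lands at dst + pos. The
struct `out_buf` and the helper `advance_output()` are hypothetical and are
not the zstd.c code:

```c
#include <stddef.h>

/* Hypothetical model of a streaming output descriptor: writes land at dst + pos. */
struct out_buf {
	char  *dst;
	size_t size;
	size_t pos;
};

/* Move the window past what was already produced and restart pos inside it. */
static void advance_output(struct out_buf *out, char *base, size_t base_size,
			   size_t produced_total)
{
	out->dst  = base + produced_total;
	out->size = base_size - produced_total;
	out->pos  = 0;	/* without this, the next write starts at dst + pos */
}
```

After the reset, dst + pos points at the first unwritten byte instead of
skipping ahead by the amount already produced.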

I see this was fixed later in the series in commit e38284f65bb96
("perf zstd: Fix multi-iteration decompression and error handling").

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260526211806.1193848-1-acme@kernel.org?part=4

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 02/29] perf session: Bounds-check one_mmap event pointer in peek_event
  2026-05-26 21:17 ` [PATCH 02/29] perf session: Bounds-check one_mmap event pointer in peek_event Arnaldo Carvalho de Melo
@ 2026-05-26 22:00   ` sashiko-bot
  0 siblings, 0 replies; 47+ messages in thread
From: sashiko-bot @ 2026-05-26 22:00 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: linux-perf-users

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [High] Flawed session->one_mmap_size calculation leads to integer underflow and bounds check bypass.
--

commit 748b22cbaf335af84a4611fd76fa89c066f02a10
Author: Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Bounds-check one_mmap event pointer in peek_event

This commit adds bounds checking to perf_session__peek_event() to ensure
that event pointers computed from file_offset fall within the single mmap
region. It adds one_mmap_size to the session struct to track the valid extent.

> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
> index 0523fd243e02c..c4cd8ad6d810a 100644
> --- a/tools/perf/util/session.c
> +++ b/tools/perf/util/session.c
[ ... ]
> @@ -2560,6 +2575,14 @@ reader__mmap(struct reader *rd, struct perf_session *session)
>  	if (session->one_mmap) {
>  		session->one_mmap_addr = buf;
>  		session->one_mmap_offset = rd->file_offset;
> +		/*
> +		 * mmap_size was set to the full file extent (data_offset +
> +		 * data_size) but file_offset was shifted forward by
> +		 * page_offset for page alignment.  Reduce by page_offset
> +		 * so the bounds check reflects the file-backed portion
> +		 * of the mapping — pages beyond the file cause SIGBUS.
> +		 */
> +		session->one_mmap_size = rd->mmap_size - page_offset;

[Severity: High]
Can this subtraction underflow if a crafted perf.data file causes
rd->mmap_size to be smaller than page_offset?

In reader__init(), the size can be calculated from an unvalidated header
as rd->data_size + rd->data_offset. If a file is crafted where
data_offset = 8192 and data_size = -4096 (as a 64-bit two's complement
value), the sum wraps around to 4096. Since 4096 is less than MMAP_SIZE,
one_mmap mode is enabled.

When this reaches reader__mmap(), page_offset becomes 8192 (from
data_offset). The subtraction 4096 - 8192 then underflows to a massive
64-bit value for session->one_mmap_size.

Would this bypass the newly added bounds checks in perf_session__peek_event()
and allow arbitrary out-of-bounds memory reads? The new validation check:

    if (offset_in_mmap >= session->one_mmap_size ...

would pass for arbitrary offsets because of the artificially huge underflowed
size.
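
The wrap described above, and one possible guard, can be sketched as follows.
The helper `one_mmap_extent()` is a hypothetical name used only for this
illustration, and rejecting the file is an assumption about the desired
behaviour:

```c
#include <stdint.h>

/*
 * Hypothetical checked form of "mmap_size - page_offset".
 * Returns 0 and stores the extent on success, -1 if it would wrap.
 */
static int one_mmap_extent(uint64_t mmap_size, uint64_t page_offset,
			   uint64_t *out)
{
	if (page_offset > mmap_size)
		return -1;	/* would wrap to a huge unsigned value */
	*out = mmap_size - page_offset;
	return 0;
}
```

With the values from the example, the unchecked subtraction 4096 - 8192
yields 2^64 - 4096, while the checked form reports an error.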

>  	}
>  
>  	return 0;

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260526211806.1193848-1-acme@kernel.org?part=2

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 10/29] perf session: Validate HEADER_ATTR attr.size before swapping
  2026-05-26 21:17 ` [PATCH 10/29] perf session: Validate HEADER_ATTR attr.size before swapping Arnaldo Carvalho de Melo
@ 2026-05-26 22:01   ` sashiko-bot
  0 siblings, 0 replies; 47+ messages in thread
From: sashiko-bot @ 2026-05-26 22:01 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: linux-perf-users

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Medium] Missing TOCTOU protection for `event->header.size` during validation on shared mmaps.
--

commit e4ad3606afbf2146d8e039237ce12985c68071fa
Author: Arnaldo Carvalho de Melo <acme@redhat.com>
Subject: perf session: Validate HEADER_ATTR attr.size before swapping

This patch hardens PERF_RECORD_HEADER_ATTR handling against crafted perf.data
files. It validates attr.size, bounds the copy size to prevent leaking
adjacent event data, and retains the original attr.size for correct event
re-synthesis.

> diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
> index 41a3721a194dc..d8cb1f562f690 100644
> --- a/tools/perf/builtin-inject.c
> +++ b/tools/perf/builtin-inject.c
[ ... ]
> @@ -244,24 +245,34 @@ static int perf_event__repipe_attr(const struct perf_tool *tool,
>  	if (!inject->itrace_synth_opts.set)
>  		return perf_event__repipe_synth(tool, event);
>  
> -	if (event->header.size < sizeof(struct perf_event_header) + sizeof(u64)) {
> +	if (event->header.size < sizeof(struct perf_event_header) + PERF_ATTR_SIZE_VER0) {
>  		pr_err("Attribute event size %u is too small\n", event->header.size);
>  		return -EINVAL;
>  	}
>  
> -	if (event->header.size - sizeof(event->header) < event->attr.attr.size) {
> +	/*
> +	 * ABI0 pipe/inject events have attr.size == 0; default to
> +	 * PERF_ATTR_SIZE_VER0 (the ABI0 footprint) for the bounded
> +	 * copy and ID array position.  Same pattern as
> +	 * perf_event__process_attr() in header.c.
> +	 */
> +	raw_attr_size = event->attr.attr.size;
> +	attr_size = raw_attr_size ?: PERF_ATTR_SIZE_VER0;
> +
> +	if (raw_attr_size && (raw_attr_size < PERF_ATTR_SIZE_VER0 ||
> +			      raw_attr_size > event->header.size - sizeof(event->header))) {
>  		pr_err("Attribute event size %u is too small for attr.size %u\n",
> -		       event->header.size, event->attr.attr.size);
> +		       event->header.size, raw_attr_size);
>  		return -EINVAL;
>  	}
>  
>  	memset(&attr, 0, sizeof(attr));
>  	memcpy(&attr, &event->attr.attr,
> -	       min_t(size_t, sizeof(attr), (size_t)event->attr.attr.size));
> +	       min_t(size_t, sizeof(attr), attr_size));
>  
> -	n_ids = event->header.size - sizeof(event->header) - event->attr.attr.size;
> +	n_ids = event->header.size - sizeof(event->header) - attr_size;

[Severity: Medium]
Is it possible for event->header.size to change between the validation above
and this calculation?

Since raw_attr_size was explicitly snapshotted to prevent race conditions on
shared memory mappings, could a concurrently modified event->header.size
bypass the bounds check?

If event->header.size is shrunk concurrently after the validation, could this
cause n_ids to underflow?

>  	n_ids /= sizeof(u64);
> -	ids = perf_record_header_attr_id(event);
> +	ids = (void *)&event->attr.attr + attr_size;
>  
>  	attr.size = sizeof(struct perf_event_attr);
>  	attr.sample_type &= ~PERF_SAMPLE_AUX;

> diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
> index f30e48eb3fc32..967c3d8ff12c8 100644
> --- a/tools/perf/util/header.c
> +++ b/tools/perf/util/header.c
[ ... ]
> @@ -5094,11 +5110,42 @@ int perf_event__process_attr(const struct perf_tool *tool __maybe_unused,
>  			     union perf_event *event,
>  			     struct evlist **pevlist)
>  {
> -	u32 i, n_ids;
> +	struct perf_event_attr attr;
> +	u32 i, n_ids, raw_attr_size;
>  	u64 *ids;
> +	size_t attr_size, copy_size;
>  	struct evsel *evsel;
>  	struct evlist *evlist = *pevlist;
>  
> +	/*
> +	 * HEADER_ATTR event layout (pipe/inject mode):
> +	 *
> +	 *   [header (8 bytes)] [attr (attr_size bytes)] [id0 id1 ... idN]
> +	 *   |<------------------ header.size --------------------------->|
> +	 *
> +	 * attr_size varies across perf versions: VER0 = 64 bytes,
> +	 * current sizeof(struct perf_event_attr) = larger.  A newer
> +	 * producer may emit a larger attr than we understand.
> +	 *
> +	 * attr.size == 0 (ABI0) means the producer didn't set it
> +	 * (e.g., bench/inject-buildid, older perf).  Treat as VER0.
> +	 *
> +	 * Require 8-byte alignment so the u64 ID array is aligned
> +	 * and attr.size fits cleanly within the payload.
> +	 *
> +	 * Read attr.size once — the event may be on a shared mmap
> +	 * and re-reading could yield a different value.
> +	 */
> +	raw_attr_size = event->attr.attr.size;
> +	if (event->header.size < sizeof(event->header) + PERF_ATTR_SIZE_VER0 ||
> +	    (raw_attr_size && (raw_attr_size < PERF_ATTR_SIZE_VER0 ||
> +			      raw_attr_size % sizeof(u64) ||
> +			      raw_attr_size > event->header.size - sizeof(event->header)))) {
> +		pr_err("PERF_RECORD_HEADER_ATTR: invalid attr.size %u (event size %u, min %d)\n",
> +		       raw_attr_size, event->header.size, PERF_ATTR_SIZE_VER0);
> +		return -EINVAL;
> +	}
> +
[ ... ]
> -	n_ids = event->header.size - sizeof(event->header) - event->attr.attr.size;
> +	n_ids = event->header.size - sizeof(event->header) - attr_size;

[Severity: Medium]
Could this calculation also suffer from the same race condition if
event->header.size is modified after the validation above?

If an attacker shrinks event->header.size concurrently, could n_ids underflow
and lead to an out-of-bounds memory read when allocating and populating the
IDs array?

>  	n_ids = n_ids / sizeof(u64);
>  	/*
>  	 * We don't have the cpu and thread maps on the header, so

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260526211806.1193848-1-acme@kernel.org?part=10

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 13/29] perf cpumap: Reject RANGE_CPUS with start_cpu > end_cpu
  2026-05-26 21:17 ` [PATCH 13/29] perf cpumap: Reject RANGE_CPUS with start_cpu > end_cpu Arnaldo Carvalho de Melo
@ 2026-05-26 22:03   ` sashiko-bot
  0 siblings, 0 replies; 47+ messages in thread
From: sashiko-bot @ 2026-05-26 22:03 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: linux-perf-users

Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:

Pre-existing issues:
- [High] The `READ_ONCE()` snapshots added to prevent TOCTOU re-reads of mmap'd fields do not close the window between payload size validation in `session.c` and use in `cpumap.c`. An attacker can change fields such as `nr` or `mask_nr` in a FUSE-backed file after `session.c` has bounds-checked them, so `cpumap.c` snapshots and uses unvalidated values, leading to out-of-bounds memory reads.
- [High] On 32-bit big-endian architectures, `perf_record_cpu_map_data__read_one_mask()` incorrectly swaps the upper and lower 32 bits of a 64-bit mask, inverting the logical CPU IDs.
- [High] A double-read of mmap'd bitmasks in `cpu_map__from_mask` can result in partially initialized CPU maps with unsorted entries, breaking map invariants.
--

commit 5918f47b5e37275ddfd5dce5525f764061776724
Author: Arnaldo Carvalho de Melo <acme@redhat.com>

perf cpumap: Reject RANGE_CPUS with start_cpu > end_cpu

This commit addresses potential issues with malformed perf.data files
by rejecting inverted CPU ranges, clamping the any_cpu flag, and adding
READ_ONCE() snapshots to prevent TOCTOU vulnerabilities on mmap'd data.

> diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
> index 11922e1ded844..b1e5c29c6e3ec 100644
> --- a/tools/perf/util/cpumap.c
> +++ b/tools/perf/util/cpumap.c
[ ... ]
> @@ -40,15 +41,16 @@ bool perf_record_cpu_map_data__test_bit(int i,
>  
>  /* Read ith mask value from data into the given 64-bit sized bitmap */
>  static void perf_record_cpu_map_data__read_one_mask(const struct perf_record_cpu_map_data *data,
> -						    int i, unsigned long *bitmap)
> +						    int i, unsigned long *bitmap,
> +						    u16 long_size)
>  {
>  #if __SIZEOF_LONG__ == 8
[ ... ]
>  #else
> -	if (data->mask32_data.long_size == 4) {
> +	if (long_size == 4) {
>  		bitmap[0] = data->mask32_data.mask[i];
>  		bitmap[1] = 0;
>  	} else {
>  #if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
>  		bitmap[0] = (unsigned long)(data->mask64_data.mask[i] >> 32);
>  		bitmap[1] = (unsigned long)data->mask64_data.mask[i];

[Severity: High]
This is a pre-existing issue, but does this mapping reverse the logical CPU
IDs on 32-bit big-endian systems?

The native u64 mask stores CPU k at bit k. When mapping this into the Linux
bitops unsigned long array on 32-bit machines, for_each_set_bit() evaluates
bit indices based strictly on array position. This means bitmap[0] logically
needs to contain bits 0-31 and bitmap[1] needs to contain bits 32-63,
regardless of host endianness.

By placing the upper 32 bits into bitmap[0] and the lower 32 bits into
bitmap[1], are we mapping CPUs 0-31 to bits 32-63, and CPUs 32-63 to
bits 0-31?
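
If that reading is right, the layout can be checked with a small model of a
bitmap made of 32-bit words. The helpers `test_bit32()` and `split_mask64()`
are hypothetical stand-ins, not the perf or kernel bitops:

```c
#include <stdint.h>

/* Model of a bitmap built from 32-bit words: bit n lives in word n / 32. */
static int test_bit32(const uint32_t *bitmap, unsigned int n)
{
	return (bitmap[n / 32] >> (n % 32)) & 1u;
}

/* Split a 64-bit CPU mask so that CPU k stays at bit k, on any byte order. */
static void split_mask64(uint64_t mask, uint32_t bitmap[2])
{
	bitmap[0] = (uint32_t)mask;		/* CPUs 0-31 */
	bitmap[1] = (uint32_t)(mask >> 32);	/* CPUs 32-63 */
}
```

The shifts operate on the value of the u64, not on its bytes in memory, so
the same split applies to both little-endian and big-endian hosts.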

>  #else
>  		bitmap[0] = (unsigned long)data->mask64_data.mask[i];
>  		bitmap[1] = (unsigned long)(data->mask64_data.mask[i] >> 32);
>  #endif
>  	}
>  #endif
>  }
>  static struct perf_cpu_map *cpu_map__from_entries(const struct perf_record_cpu_map_data *data)
>  {
> +	/* Snapshot nr — data is mmap'd and could change between reads */
> +	u16 nr = READ_ONCE(data->cpus_data.nr);
>  	struct perf_cpu_map *map;

[Severity: High]
This isn't a bug introduced by this patch, but does this snapshot actually
close the TOCTOU window?

If the bounds validation against the event payload size occurs earlier in
perf_session__process_user_event() without snapshotting the value, couldn't
an attacker supply a small, valid nr value to pass the check there, and
then mutate the FUSE-backed memory to a large value (like 65535) before
we reach this READ_ONCE()?

If so, wouldn't this late snapshot just capture the unvalidated malicious
value, resulting in a massive allocation and out-of-bounds reads from
the perf.data mapping?

>  
> -	map = perf_cpu_map__empty_new(data->cpus_data.nr);
> +	map = perf_cpu_map__empty_new(nr);
>  	if (!map)
>  		return NULL;
[ ... ]
> @@ -93,11 +98,21 @@ static struct perf_cpu_map *cpu_map__from_entries(const struct perf_record_cpu_m
>  static struct perf_cpu_map *cpu_map__from_mask(const struct perf_record_cpu_map_data *data)
>  {
>  	DECLARE_BITMAP(local_copy, 64);
> -	int weight = 0, mask_nr = data->mask32_data.nr;
> +	int weight = 0, mask_nr;
[ ... ]
>  	for (int i = 0; i < mask_nr; i++) {
> -		perf_record_cpu_map_data__read_one_mask(data, i, local_copy);
> +		perf_record_cpu_map_data__read_one_mask(data, i, local_copy, long_size);
>  		weight += bitmap_weight(local_copy, 64);
>  	}
>  
> @@ -106,11 +121,14 @@ static struct perf_cpu_map *cpu_map__from_mask(const struct perf_record_cpu_map_
>  		return NULL;
>  
>  	for (int i = 0, j = 0; i < mask_nr; i++) {
> -		int cpus_per_i = (i * data->mask32_data.long_size  * BITS_PER_BYTE);
> +		int cpus_per_i = (i * long_size * BITS_PER_BYTE);
>  		int cpu;
>  
> -		perf_record_cpu_map_data__read_one_mask(data, i, local_copy);
> +		perf_record_cpu_map_data__read_one_mask(data, i, local_copy, long_size);
>  		for_each_set_bit(cpu, local_copy, 64) {
> +			/* Guard against more set bits than the first pass counted */
> +			if (j >= weight)
> +				break;

[Severity: High]
This is a pre-existing issue, but can the double-read of the bitmask
result in an incorrectly sorted map?

The code reads the mask array twice: first to calculate the weight, and
second to populate the map. Since perf_record_cpu_map_data__read_one_mask()
reads directly from mmap'd memory without READ_ONCE(), the underlying data
could mutate between passes.

If the data changes so that fewer bits are set during the second pass,
the map is only partially populated. Because perf_cpu_map__empty_new()
initializes all slots to -1 (representing 'any CPU'), wouldn't those -1
values remain at the end of the array?

Since downstream logic relies on strict monotonic sorting (e.g.
perf_cpu_map__max() assumes the last element is the maximum CPU), could
returning -1 here lead to 0-sized allocations later and subsequent
out-of-bounds writes?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260526211806.1193848-1-acme@kernel.org?part=13

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 12/29] perf header: Byte-swap build ID event pid and bounds check section entries
  2026-05-26 21:17 ` [PATCH 12/29] perf header: Byte-swap build ID event pid and bounds check section entries Arnaldo Carvalho de Melo
@ 2026-05-26 22:05   ` sashiko-bot
  0 siblings, 0 replies; 47+ messages in thread
From: sashiko-bot @ 2026-05-26 22:05 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: linux-perf-users

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Medium] Incorrect `sample_id` offset calculation in `perf_event__build_id_swap` causes cross-endian stream parsing failures.
--

commit b5fb0a97a4ece334d3f6c106328194158f99aa94
Author: Arnaldo Carvalho de Melo <acme@redhat.com>

perf header: Byte-swap build ID event pid and bounds check section entries

This commit fixes a missing byte swap of the pid field for build ID events in
cross-endian perf.data files. It also adds a build_id_swap callback for
stream-mode events, checks NUL-termination, and hardens the reader against
crafted files.

> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
> index 8588e12f110fc..0fac8f4e0e223 100644
> --- a/tools/perf/util/session.c
> +++ b/tools/perf/util/session.c
[ ... ]
> @@ -686,6 +686,25 @@ static int perf_event__hdr_attr_swap(union perf_event *event,
>  	return 0;
>  }
>  
> +static int perf_event__build_id_swap(union perf_event *event,
> +				     bool sample_id_all)
> +{
> +	event->build_id.pid = bswap_32(event->build_id.pid);
> +
> +	if (sample_id_all) {
> +		void *data = &event->build_id.filename;
> +		void *end = (void *)event + event->header.size;
> +		size_t len = strnlen(data, end - data);
> +
> +		/* See comment in perf_event__comm_swap() */
> +		if (len == (size_t)(end - data))
> +			return -1;
> +		data += PERF_ALIGN(len + 1, sizeof(u64));

[Severity: Medium]
Will this result in an incorrect offset for the sample_id data?

Since data points to &event->build_id.filename, which is at offset 36 in the
event structure, adding an 8-byte aligned value to an unaligned base offset
yields an address that is 4 bytes misaligned relative to the event start.

However, the event synthesizer correctly pads the entire variable-length event
to an 8-byte boundary before appending sample_id using:

    PERF_ALIGN(36 + len + 1, 8)

As a result, does this leave data pointing 4 bytes away from the actual
sample_id location?

Furthermore, won't swap_sample_id_all() unconditionally trigger an alignment
warning and skip swapping because the remaining size is not a multiple of 8?
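
The two offset computations can be compared directly. This sketch assumes
the offset of 36 quoted above; `ALIGN_UP`, `FILENAME_OFF`,
`sample_id_off_swap()` and `sample_id_off_synth()` are hypothetical names
for illustration:

```c
#include <stddef.h>

/* Round x up to a multiple of a (a must be a power of two), like PERF_ALIGN(). */
#define ALIGN_UP(x, a)	(((x) + ((a) - 1)) & ~((size_t)(a) - 1))

#define FILENAME_OFF	36u	/* offset of filename[] quoted in the review */

/* Offset computed by the swap handler: align only the string length. */
static size_t sample_id_off_swap(size_t len)
{
	return FILENAME_OFF + ALIGN_UP(len + 1, 8);
}

/* Offset described for the synthesizer: align the whole event prefix. */
static size_t sample_id_off_synth(size_t len)
{
	return ALIGN_UP(FILENAME_OFF + len + 1, 8);
}
```

For a 3-character filename the first form gives 44 and the second gives 40.
The first form always lands 4 bytes past an 8-byte boundary, because 36 is
not a multiple of 8.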

> +		swap_sample_id_all(event, data);
> +	}
> +	return 0;
> +}

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260526211806.1193848-1-acme@kernel.org?part=12

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 29/29] perf test: Add truncated perf.data robustness test
  2026-05-26 21:18 ` [PATCH 29/29] perf test: Add truncated perf.data robustness test Arnaldo Carvalho de Melo
@ 2026-05-26 22:19   ` sashiko-bot
  2026-05-27  0:50     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 47+ messages in thread
From: sashiko-bot @ 2026-05-26 22:19 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: linux-perf-users

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Medium] The cleanup routine may delete a file named '.old' in the current working directory if temporary file creation fails.
--

commit 6d5a7740cf583f089ba666236f38032085207496
Author: Arnaldo Carvalho de Melo <acme@redhat.com>

perf test: Add truncated perf.data robustness test

This commit adds a shell test to verify that perf report gracefully handles
truncated perf.data files. It records a simple workload, truncates the file
at various offsets, and checks that perf report exits cleanly with an error
rather than crashing.

> diff --git a/tools/perf/tests/shell/data_validation.sh b/tools/perf/tests/shell/data_validation.sh
> new file mode 100755
> index 0000000000000..5c3e6fced6fce
> --- /dev/null
> +++ b/tools/perf/tests/shell/data_validation.sh
> @@ -0,0 +1,85 @@
[ ... ]
> +err=0
> +
> +cleanup() {
> +	rm -f "${perfdata}" "${perfdata}.old" "${truncated}" "${stderrfile}"
> +	trap - EXIT TERM INT
> +}
> +trap 'cleanup; exit 1' TERM INT
> +trap cleanup EXIT
> +
> +perfdata=$(mktemp /tmp/__perf_test.perf.data.XXXXX) || exit 2

[Severity: Medium]
If mktemp fails (for example, if /tmp is out of space), the perfdata
variable will be empty and the script will exit, triggering the EXIT trap.

Will the cleanup function then expand the command to rm -f "" ".old" and
silently delete a file named .old in the current working directory?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260526211806.1193848-1-acme@kernel.org?part=29

* Re: [PATCH 24/29] perf tools: Harden compressed event processing
  2026-05-26 21:18 ` [PATCH 24/29] perf tools: Harden compressed event processing Arnaldo Carvalho de Melo
@ 2026-05-26 22:23   ` sashiko-bot
  0 siblings, 0 replies; 47+ messages in thread
From: sashiko-bot @ 2026-05-26 22:23 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: linux-perf-users

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Medium] Double-fetch of event header size from a memory-mapped file leads to integer underflow and out-of-bounds read.
--

commit 6f8ff54f59a432aeacf15c3aba9115ca23082597
Author: Arnaldo Carvalho de Melo <acme@redhat.com>

perf tools: Harden compressed event processing

This commit adds several hardening checks to the compressed event
decompression pipeline. It validates comp_mmap_len, checks for underflow
in decomp_last_rem, and ensures event header sizes are large enough to
contain the fixed struct fields.

> diff --git a/tools/perf/util/tool.c b/tools/perf/util/tool.c
> --- a/tools/perf/util/tool.c
> +++ b/tools/perf/util/tool.c

[ ... ]

> @@ -47,14 +55,37 @@ static int perf_session__process_compressed_event(const struct perf_tool *tool _
>  		decomp->size = decomp_last_rem;
>  	}
>  
> +	/*
> +	 * Events are read directly from the mmap'd file; fields could
> +	 * theoretically change via a FUSE-backed file, but that applies
> +	 * to the entire event processing pipeline, not just here.
> +	 */
>  	if (event->header.type == PERF_RECORD_COMPRESSED) {
> +		if (event->header.size < sizeof(struct perf_record_compressed))
> +			goto err_decomp;
>  		src = (void *)event + sizeof(struct perf_record_compressed);
>  		src_size = event->pack.header.size - sizeof(struct perf_record_compressed);

[Severity: Medium]
Since event points to a memory-mapped file, can the size field be concurrently
modified between the bounds check and the subtraction here?

If the size is changed to a smaller value after passing the check, could this
double-fetch lead to an integer underflow, producing a massive src_size that
gets passed to zstd_decompress_stream() and causes an out-of-bounds read?

While the comment mentions FUSE modifications apply to the entire pipeline,
later commits reportedly fix this issue globally by snapshotting the size in
process_user_event(). It seems these newly introduced double-fetches in this
function miss that mitigation.
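
One way to avoid the double fetch described here is to read the size field once into a local and use only that copy for both the check and the subtraction. The sketch below is a standalone illustration, not the perf code: struct hdr, FIXED_SIZE and payload_size() are hypothetical, and a volatile read stands in for READ_ONCE():

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an 8-byte event header. */
struct hdr {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

/* Hypothetical size of the fixed part of the record. */
#define FIXED_SIZE ((size_t)8)

/*
 * Read h->size exactly once, then validate and subtract using the local
 * copy, so the check and the subtraction cannot see different values.
 * Returns 0 and stores the payload size on success, -1 if size is too small.
 */
static int payload_size(const struct hdr *h, size_t *out)
{
	const uint16_t size = *(const volatile uint16_t *)&h->size;

	assert(out != NULL);
	if (size < FIXED_SIZE)
		return -1;
	*out = size - FIXED_SIZE;
	return 0;
}
```

The point of the design is that nothing after the first line touches h->size again, so a concurrent writer can at worst make the snapshot stale, not inconsistent.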

>  	} else if (event->header.type == PERF_RECORD_COMPRESSED2) {
> +		/*
> +		 * prefetch_event() only guarantees that the 8-byte
> +		 * event header fits; validate that header.size covers
> +		 * the data_size field before accessing it, otherwise a
> +		 * crafted event reads data_size from adjacent memory.
> +		 */
> +		if (event->header.size < sizeof(struct perf_record_compressed2))
> +			goto err_decomp;
>  		src = (void *)event + sizeof(struct perf_record_compressed2);
>  		src_size = event->pack2.data_size;
> +		/*
> +		 * data_size is independent of header.size (which
> +		 * includes padding); verify it doesn't exceed the
> +		 * actual payload to prevent out-of-bounds reads in
> +		 * zstd_decompress_stream().
> +		 */
> +		if (src_size > event->header.size - sizeof(struct perf_record_compressed2))

[Severity: Medium]
Does this also suffer from the same double-fetch problem?

If event->header.size is modified by a FUSE-backed file between the initial
check and this second evaluation, could the subtraction underflow and
erroneously bypass the out-of-bounds check for data_size?
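
The bypass described here follows from unsigned wraparound: if the size operand is smaller than the struct size at the time of the subtraction, the difference wraps to a huge value and the "greater than" test never fires. A minimal sketch, with hypothetical function names and the sizes passed in as plain parameters:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Mirrors the shape of the quoted check. hdr_size is converted to size_t
 * before the subtraction, so hdr_size < fixed wraps around instead of
 * going negative. Returns nonzero when data_size is rejected.
 */
static int data_size_rejected(size_t data_size, uint16_t hdr_size, size_t fixed)
{
	return data_size > hdr_size - fixed;
}

/* Same check, but refusing to subtract when hdr_size is too small. */
static int data_size_rejected_safe(size_t data_size, uint16_t hdr_size, size_t fixed)
{
	return hdr_size < fixed || data_size > hdr_size - fixed;
}
```

With hdr_size = 8 and fixed = 16, data_size_rejected(1000, 8, 16) returns 0 (the bogus size is accepted) while data_size_rejected_safe(1000, 8, 16) returns 1.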

> +			goto err_decomp;
>  	} else {
> -		return -1;
> +		goto err_decomp;
>  	}

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260526211806.1193848-1-acme@kernel.org?part=24

* Re: [PATCH 28/29] perf session: Snapshot event->header.size in process_user_event()
  2026-05-26 21:18 ` [PATCH 28/29] perf session: Snapshot event->header.size in process_user_event() Arnaldo Carvalho de Melo
@ 2026-05-26 22:31   ` sashiko-bot
  0 siblings, 0 replies; 47+ messages in thread
From: sashiko-bot @ 2026-05-26 22:31 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: linux-perf-users

Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:

Pre-existing issues:
- [High] Incomplete Vulnerability Fix / TOCTOU Bypass on event->header.size
- [High] Type Confusion / Validation Bypass (TOCTOU on event->header.type)
- [High] TOCTOU / Bounds Check Bypass on Payload Array Counts
- [High] Incomplete TOCTOU Fix / Unprotected Reads in Other Event Paths
--

commit 80788dea311fbc98ef2cb9a00824d6f40689bd54
Author: Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Snapshot event->header.size in process_user_event()

This commit attempts to fix a vulnerability where multiple reads of
event->header.size from MAP_SHARED memory could return different values.
It snapshots the header size into a local variable using READ_ONCE() to
ensure a consistent value is used for bounds calculations.

> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
> index 6de665d3c9054..e2e821b77766d 100644
> --- a/tools/perf/util/session.c
> +++ b/tools/perf/util/session.c
> @@ -2230,6 +2230,7 @@ static s64 perf_session__process_user_event(struct perf_session *session,
>  {
>  	struct ordered_events *oe = &session->ordered_events;
>  	const struct perf_tool *tool = session->tool;
> +	const u32 event_size = READ_ONCE(event->header.size);

[Severity: High]
This is a pre-existing issue, but since structural bounds validation is
performed earlier by callers like perf_session__process_event() calling
perf_event__too_small(), can the size be modified in MAP_SHARED memory
between that caller validation and this READ_ONCE() snapshot?

If the size is shrunk concurrently, could subsequent subtraction operations
like (event_size - sizeof(event->stat_config)) underflow, bypassing array
bounds checks?

[Severity: High]
This is a pre-existing issue, but does this function also need to protect
event->header.type with a READ_ONCE() snapshot?

If an attacker sets the type to a small struct to pass the size check in
perf_event__too_small(), and then concurrently overwrites it to a larger
struct type, will this function process it as the larger struct using a
size validated for the smaller one?

[ ... ]

> @@ -2271,7 +2272,7 @@ static s64 perf_session__process_user_event(struct perf_session *session,
>  		break;
>  	case PERF_RECORD_HEADER_BUILD_ID:
>  		if (!perf_event__check_nul(event->build_id.filename,
> -					   (void *)event + event->header.size,
> +					   (void *)event + event_size,

[Severity: High]
This is a pre-existing issue, but does this READ_ONCE() pattern need to be
applied to other execution paths that process memory-mapped events?

For example, perf_session__deliver_event() delegates to
machines__deliver_event(), which appears to still read event->header.size
directly from memory to calculate string boundaries before calling
perf_event__check_nul().

Could concurrent modification of the size allow the string checker's
strnlen() to scan out-of-bounds unmapped memory?

>  					   "HEADER_BUILD_ID")) {
>  			err = 0;
>  			break;

[ ... ]

> @@ -2304,14 +2305,14 @@ static s64 perf_session__process_user_event(struct perf_session *session,
>  	case PERF_RECORD_THREAD_MAP: {
>  		u64 max_nr;
>  
> -		if (event->header.size < sizeof(event->thread_map)) {
> +		if (event_size < sizeof(event->thread_map)) {
>  			pr_err("PERF_RECORD_THREAD_MAP: header.size (%u) too small\n",
> -			       event->header.size);
> +			       event_size);
>  			err = -EINVAL;
>  			break;
>  		}
>  
> -		max_nr = (event->header.size - sizeof(event->thread_map)) /
> +		max_nr = (event_size - sizeof(event->thread_map)) /
>  			 sizeof(event->thread_map.entries[0]);
>  		if (event->thread_map.nr > max_nr) {

[Severity: High]
This is a pre-existing issue, but while event_size is snapshotted, are
array count fields like event->thread_map.nr still vulnerable?

If event->thread_map.nr is validated here against max_nr, but downstream
handlers like thread_map__new_event() re-read the count directly from
MAP_SHARED memory, could an attacker overwrite it with a huge value after
this check to cause out-of-bounds heap allocations?
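
A common way to close this kind of gap is to read the count once, validate the local copy, and hand downstream code a private copy of the entries together with the validated count, so nothing re-reads the shared buffer. The sketch below is a simplified illustration only: the record layout (a u64 count followed by u64 entries) and copy_entries() are hypothetical, and memcpy() stands in for a READ_ONCE()-style single read:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * buf holds a u64 count followed by that many u64 entries (hypothetical
 * layout). The count is read into a local once, checked against what
 * buf_size can hold, and the entries are copied out using the local.
 * Returns a malloc'd copy (caller frees) or NULL on rejection.
 */
static uint64_t *copy_entries(const void *buf, size_t buf_size, uint64_t *nr_out)
{
	uint64_t nr;
	uint64_t *copy;

	assert(buf != NULL && nr_out != NULL);
	if (buf_size < sizeof(nr))
		return NULL;
	memcpy(&nr, buf, sizeof(nr));		/* only read of the count */
	if (nr > (buf_size - sizeof(nr)) / sizeof(*copy))
		return NULL;
	copy = malloc(nr ? nr * sizeof(*copy) : 1);
	if (!copy)
		return NULL;
	memcpy(copy, (const char *)buf + sizeof(nr), nr * sizeof(*copy));
	*nr_out = nr;
	return copy;
}
```

Callers then iterate over the returned array using *nr_out and never consult the count in the original buffer again.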

>  			pr_err("PERF_RECORD_THREAD_MAP: nr %" PRIu64 " exceeds max %" PRIu64 "\n",

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260526211806.1193848-1-acme@kernel.org?part=28

* Re: [PATCH 26/29] perf session: Bound nr_cpus_avail and validate sample CPU
  2026-05-26 21:18 ` [PATCH 26/29] perf session: Bound nr_cpus_avail and validate sample CPU Arnaldo Carvalho de Melo
@ 2026-05-26 22:40   ` sashiko-bot
  0 siblings, 0 replies; 47+ messages in thread
From: sashiko-bot @ 2026-05-26 22:40 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: linux-perf-users

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [High] Globally mutating `sample.cpu` to 0 for CPUs exceeding `MAX_NR_CPUS` corrupts valid data for tools that correctly support large CPU counts.
--

commit 17ecab60f87deb66695ee576a168167eb8c64bf5
Author: Arnaldo Carvalho de Melo <acme@redhat.com>

perf session: Bound nr_cpus_avail and validate sample CPU
    
This commit adds a bounds check for the sample CPU index in the core event
delivery loop to protect downstream tools with fixed-size arrays. It bounds
the available CPU count to MAX_NR_CPUS and clamps any out-of-bounds
sample.cpu value to 0.

> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
> index 9271885e3920f..6de665d3c9054 100644
> --- a/tools/perf/util/session.c
> +++ b/tools/perf/util/session.c

[ ... ]

> +		/*
> +		 * Cap at MAX_NR_CPUS for the bounds check — downstream
> +		 * consumers use fixed-size arrays of that size.  Keep
> +		 * the true nr_cpus_avail in env for header parsing
> +		 * (e.g. process_cpu_topology) which needs the real count.
> +		 */
> +		if (nr_cpus_avail > MAX_NR_CPUS)
> +			nr_cpus_avail = MAX_NR_CPUS;
> +		if (sample.cpu >= (u32)nr_cpus_avail &&
> +		    sample.cpu != (u32)-1) {
> +			/*
> +			 * Warn rather than abort: synthesized events
> +			 * (MMAP, COMM) lack sample_id_all data, so
> +			 * parse_id_sample reads garbage from the event
> +			 * payload.  Clamping to 0 protects downstream
> +			 * array indexing while keeping the session alive.
> +			 *
> +			 * Preserve (u32)-1: perf script and perf inject
> +			 * use it as a sentinel for "CPU not applicable."
> +			 * Downstream array users (timechart, kwork) have
> +			 * their own per-callback bounds checks.
> +			 */
> +			pr_warning_once("WARNING: sample CPU %u >= nr_cpus_avail %u, clamping to 0\n",
> +					sample.cpu, nr_cpus_avail);
> +			sample.cpu = 0;

[Severity: High]
Does this global mutation of sample.cpu corrupt valid data for tools that
correctly support large CPU counts?

If a system legitimately has more than MAX_NR_CPUS (4096) CPUs, clamping
nr_cpus_avail to MAX_NR_CPUS and then reassigning sample.cpu to 0 will
attribute all samples from CPU 4096 and above to CPU 0.

Downstream tools like perf script and perf report handle CPU indices
dynamically and fully support greater than 4096 CPUs.

Would it be safer to push this fixed-size array bounds checking down into
the specific tools (like timechart and kwork) rather than altering the
shared sample structure globally in the core event delivery loop?
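
A per-tool check of the kind suggested here might look roughly like the following. This is only a sketch: struct sample_sketch, MAX_CPUS_SKETCH, per_cpu_count[] and count_sample() are hypothetical names, and the 4096 limit is taken from the figure quoted above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed array size for one consumer; 4096 per the review text. */
#define MAX_CPUS_SKETCH 4096

/* Hypothetical, minimal stand-in for the shared sample structure. */
struct sample_sketch {
	uint32_t cpu;
};

static uint64_t per_cpu_count[MAX_CPUS_SKETCH];

/*
 * The consumer checks the index against its own array and leaves the
 * sample untouched, so other tools still see the original CPU number.
 * Returns 0 if counted, -1 if the CPU does not fit (including (u32)-1).
 */
static int count_sample(const struct sample_sketch *s)
{
	assert(s != NULL);
	if (s->cpu >= MAX_CPUS_SKETCH)
		return -1;
	per_cpu_count[s->cpu]++;
	return 0;
}
```

Because the sample is passed as const, a tool with a fixed-size array can drop or warn about an out-of-range CPU without changing what dynamically sized consumers observe.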

> +		}
> +	}
>  
>  	ret = auxtrace__process_event(session, event, &sample, tool);

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260526211806.1193848-1-acme@kernel.org?part=26

* Re: [PATCH 29/29] perf test: Add truncated perf.data robustness test
  2026-05-26 22:19   ` sashiko-bot
@ 2026-05-27  0:50     ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-05-27  0:50 UTC (permalink / raw)
  To: sashiko-reviews; +Cc: linux-perf-users

On Tue, May 26, 2026 at 10:19:40PM +0000, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
> - [Medium] The cleanup routine may delete a file named '.old' in the current working directory if temporary file creation fails.
> --
> 
> commit 6d5a7740cf583f089ba666236f38032085207496
> Author: Arnaldo Carvalho de Melo <acme@redhat.com>
> 
> perf test: Add truncated perf.data robustness test
> 
> This commit adds a shell test to verify that perf report gracefully handles
> truncated perf.data files. It records a simple workload, truncates the file
> at various offsets, and checks that perf report exits cleanly with an error
> rather than crashing.
> 
> > diff --git a/tools/perf/tests/shell/data_validation.sh b/tools/perf/tests/shell/data_validation.sh
> > new file mode 100755
> > index 0000000000000..5c3e6fced6fce
> > --- /dev/null
> > +++ b/tools/perf/tests/shell/data_validation.sh
> > @@ -0,0 +1,85 @@
> [ ... ]
> > +err=0
> > +
> > +cleanup() {
> > +	rm -f "${perfdata}" "${perfdata}.old" "${truncated}" "${stderrfile}"
> > +	trap - EXIT TERM INT
> > +}
> > +trap 'cleanup; exit 1' TERM INT
> > +trap cleanup EXIT
> > +
> > +perfdata=$(mktemp /tmp/__perf_test.perf.data.XXXXX) || exit 2
> 
> [Severity: Medium]
> If mktemp fails (for example, if /tmp is out of space), the perfdata
> variable will be empty and the script will exit, triggering the EXIT trap.
> 
> Will the cleanup function then expand the command to rm -f "" ".old" and
> silently delete a file named .old in the current working directory?

Good catch, added the following patch to it:

   err=0
 
   cleanup() {
  -  rm -f "${perfdata}" "${perfdata}.old" "${truncated}" "${stderrfile}"
  +  [ -n "${perfdata}" ] && rm -f "${perfdata}" "${perfdata}.old"
  +  rm -f "${truncated}" "${stderrfile}"
     trap - EXIT TERM INT
   }
   trap 'cleanup; exit 1' TERM INT

* Re: [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files
  2026-05-26 21:17 [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
                   ` (28 preceding siblings ...)
  2026-05-26 21:18 ` [PATCH 29/29] perf test: Add truncated perf.data robustness test Arnaldo Carvalho de Melo
@ 2026-05-27  1:06 ` Arnaldo Carvalho de Melo
  29 siblings, 0 replies; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-05-27  1:06 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users

On Tue, May 26, 2026 at 06:17:36PM -0300, Arnaldo Carvalho de Melo wrote:
> perf.data validation and hardening (29 patches)
> 
> A crafted or corrupted perf.data file can cause out-of-bounds
> reads/writes, infinite loops, heap overflows, and segfaults in perf
> report, perf script, perf inject, perf timechart, and perf kwork.
> This series adds defense-in-depth validation for file parsing:

The analysis of Sashiko's remaining comments is below. Unless someone
has something not raised by Sashiko, I'll merge this tomorrow and
continue processing the other outstanding patches.

● Here's the v4 sashiko.dev review triage — 13 of 29 patches got reviews:

  Patches with findings:

  Patch: 02 (peek_event bounds)
  Findings: 1 High: mmap_size - page_offset underflow
  Verdict: Pre-existing — reader__init() validates data_size, page_offset can't exceed mmap_size
  ────────────────────────────────────────
  Patch: 04 (zstd compress)
  Findings: 1 Critical + 3 High: multi-record header overflow, AIO data_size, flush return, decompress pos
  Verdict: All pre-existing — the Critical is about process_header() aggregate size, and the decompress issue is fixed later in patch 05
  ────────────────────────────────────────
  Patch: 05 (zstd decompress)
  Findings: 1 High: O_NONBLOCK missing on file opens
  Verdict: Pre-existing — not introduced by this patch, unrelated to zstd

For this one, IIRC Ian sent a patch for review-prompts, merged recently,
that will make its way to Sashiko so that this stops being flagged as a
problem:

  "kernel/subsystem/perf.md: Remove section describing non-blocking IO"
  https://github.com/masoncl/review-prompts/commit/261d73261dbb11f38ff9c653da3608b162741e03

  ────────────────────────────────────────
  Patch: 08 (swap infra)
  Findings: 2 High: mmap2 prot/flags not swapped, event_update union not swapped
  Verdict: Both pre-existing — correct observations for follow-up series
  ────────────────────────────────────────
  Patch: 10 (HEADER_ATTR)
  Findings: same as v3 — already triaged
  Verdict:
  ────────────────────────────────────────
  Patch: 11 (nr validation)
  Findings: 1 Medium: native path aborts vs swap path skips on bad THREAD_MAP
  Verdict: Valid observation — but intentional: native path returns -EINVAL to catch corruption, swap path skips to keep session alive.
  ────────────────────────────────────────
  Patch: 12 (build_id_swap)
  Findings: same as v3 — already fixed in v4
  Verdict:
  ────────────────────────────────────────
  Patch: 15 (BPF_METADATA)
  Findings: 1 High new: double-fetch of header.size in swap path; 2 High pre-existing: TOCTOU on native path
  Verdict: The double-fetch is valid for swap but swap runs on MAP_PRIVATE (writable copy), so no concurrent modification possible.
  ────────────────────────────────────────
  Patch: 24 (compressed hardening)
  Findings: 1 Medium: double-fetch of event->header.size in tool.c
  Verdict: Same TOCTOU pattern
  ────────────────────────────────────────
  Patch: 26 (CPU bounds)
  Findings: 1 High: global clamp corrupts data for >4096 CPUs
  Verdict: Known limitation; MAX_NR_CPUS dynamic allocation is a TODO
  ────────────────────────────────────────
  Patch: 28 (READ_ONCE snapshot)
  Findings: 4 High: incomplete TOCTOU fix, type confusion, array count re-reads
  Verdict: All pre-existing MAP_SHARED TOCTOU. The full fix would be MAP_PRIVATE, noted as follow-up
  ────────────────────────────────────────
  Patch: 29 (shell test)

  Fixed and sent the diff in response to Sashiko's review e-mail.

  Summary: 1 new actionable issue in v4. All the other findings are
  either pre-existing (documented in the cover letter), already fixed in
  this version, or intentional design decisions. The mmap2 prot/flags
  and event_update union swap gaps (patch 08) are valid pre-existing
  bugs for a follow-up series.

-------------------------------------------------------------------------

Thread overview: 47+ messages
2026-05-26 21:17 [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
2026-05-26 21:17 ` [PATCH 01/29] perf session: Add minimum event size and alignment validation Arnaldo Carvalho de Melo
2026-05-26 21:17 ` [PATCH 02/29] perf session: Bounds-check one_mmap event pointer in peek_event Arnaldo Carvalho de Melo
2026-05-26 22:00   ` sashiko-bot
2026-05-26 21:17 ` [PATCH 03/29] perf tools: Fix event_contains() macro to verify full field extent Arnaldo Carvalho de Melo
2026-05-26 21:17 ` [PATCH 04/29] perf zstd: Fix compression error path in zstd_compress_stream_to_records() Arnaldo Carvalho de Melo
2026-05-26 22:00   ` sashiko-bot
2026-05-26 21:17 ` [PATCH 05/29] perf zstd: Fix multi-iteration decompression and error handling Arnaldo Carvalho de Melo
2026-05-26 21:49   ` sashiko-bot
2026-05-26 21:17 ` [PATCH 06/29] perf session: Fix PERF_RECORD_READ swap and dump for variable-length events Arnaldo Carvalho de Melo
2026-05-26 21:17 ` [PATCH 07/29] perf session: Fix swap_sample_id_all() crash on crafted events Arnaldo Carvalho de Melo
2026-05-26 21:17 ` [PATCH 08/29] perf session: Add validated swap infrastructure with null-termination checks Arnaldo Carvalho de Melo
2026-05-26 21:55   ` sashiko-bot
2026-05-26 21:17 ` [PATCH 09/29] perf session: Use bounded copy for PERF_RECORD_TIME_CONV Arnaldo Carvalho de Melo
2026-05-26 21:17 ` [PATCH 10/29] perf session: Validate HEADER_ATTR attr.size before swapping Arnaldo Carvalho de Melo
2026-05-26 22:01   ` sashiko-bot
2026-05-26 21:17 ` [PATCH 11/29] perf session: Validate nr fields against event size on both swap and common paths Arnaldo Carvalho de Melo
2026-05-26 21:54   ` sashiko-bot
2026-05-26 21:17 ` [PATCH 12/29] perf header: Byte-swap build ID event pid and bounds check section entries Arnaldo Carvalho de Melo
2026-05-26 22:05   ` sashiko-bot
2026-05-26 21:17 ` [PATCH 13/29] perf cpumap: Reject RANGE_CPUS with start_cpu > end_cpu Arnaldo Carvalho de Melo
2026-05-26 22:03   ` sashiko-bot
2026-05-26 21:17 ` [PATCH 14/29] perf auxtrace: Harden auxtrace_error event handling Arnaldo Carvalho de Melo
2026-05-26 21:17 ` [PATCH 15/29] perf session: Add byte-swap and bounds check for PERF_RECORD_BPF_METADATA events Arnaldo Carvalho de Melo
2026-05-26 21:56   ` sashiko-bot
2026-05-26 21:17 ` [PATCH 16/29] perf header: Validate null-termination in PERF_RECORD_EVENT_UPDATE string fields Arnaldo Carvalho de Melo
2026-05-26 21:17 ` [PATCH 17/29] perf tools: Bounds check perf_event_attr fields against attr.size before printing Arnaldo Carvalho de Melo
2026-05-26 21:17 ` [PATCH 18/29] perf header: Propagate feature section processing errors Arnaldo Carvalho de Melo
2026-05-26 21:17 ` [PATCH 19/29] perf header: Validate f_attr.ids section before use in perf_session__read_header() Arnaldo Carvalho de Melo
2026-05-26 21:17 ` [PATCH 20/29] perf header: Validate feature section size and add read path bounds checking Arnaldo Carvalho de Melo
2026-05-26 21:17 ` [PATCH 21/29] perf header: Sanity check HEADER_EVENT_DESC attr.size before swap Arnaldo Carvalho de Melo
2026-05-26 21:17 ` [PATCH 22/29] perf header: Validate bitmap size before allocating in do_read_bitmap() Arnaldo Carvalho de Melo
2026-05-26 21:17 ` [PATCH 23/29] perf session: Add byte-swap handler for PERF_RECORD_COMPRESSED2 Arnaldo Carvalho de Melo
2026-05-26 21:18 ` [PATCH 24/29] perf tools: Harden compressed event processing Arnaldo Carvalho de Melo
2026-05-26 22:23   ` sashiko-bot
2026-05-26 21:18 ` [PATCH 25/29] perf session: Check for decompression buffer size overflow Arnaldo Carvalho de Melo
2026-05-26 21:18 ` [PATCH 26/29] perf session: Bound nr_cpus_avail and validate sample CPU Arnaldo Carvalho de Melo
2026-05-26 22:40   ` sashiko-bot
2026-05-26 21:18 ` [PATCH 27/29] perf kwork: Bounds check work->cpu before indexing cpus_runtime[] Arnaldo Carvalho de Melo
2026-05-26 21:18 ` [PATCH 28/29] perf session: Snapshot event->header.size in process_user_event() Arnaldo Carvalho de Melo
2026-05-26 22:31   ` sashiko-bot
2026-05-26 21:18 ` [PATCH 29/29] perf test: Add truncated perf.data robustness test Arnaldo Carvalho de Melo
2026-05-26 22:19   ` sashiko-bot
2026-05-27  0:50     ` Arnaldo Carvalho de Melo
2026-05-27  1:06 ` [PATCHES v4 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
  -- strict thread matches above, loose matches on Subject: below --
2026-05-25  1:05 [PATCHES v3 " Arnaldo Carvalho de Melo
2026-05-25  1:05 ` [PATCH 11/29] perf session: Validate nr fields against event size on both swap and common paths Arnaldo Carvalho de Melo
2026-05-24  3:26 [PATCHES v2 00/29] perf: Harden perf.data parsing against crafted/corrupted files Arnaldo Carvalho de Melo
2026-05-24  3:26 ` [PATCH 11/29] perf session: Validate nr fields against event size on both swap and common paths Arnaldo Carvalho de Melo
