The Linux Kernel Mailing List
 help / color / mirror / Atom feed
* [PATCH 0/3] perf record: fix multi-record Zstd compression
@ 2026-06-18 15:59 Dmitry Ilvokhin
  2026-06-18 15:59 ` [PATCH 1/3] perf record: Fix multiple PERF_RECORD_COMPRESSED2 records per push Dmitry Ilvokhin
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Dmitry Ilvokhin @ 2026-06-18 15:59 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, James Clark, Nick Terrell,
	David Sterba
  Cc: linux-kernel, linux-perf-users, kernel-team, Farid Zakaria,
	Dmitry Ilvokhin

Patch 1 fixes a 'perf record -z' regression that aborts recording with
"Bad address" and produces an undecompressable perf.data.

While fixing it I found two more latent bugs in the same compressor,
zstd_compress_stream_to_records(): an output-buffer overrun and a broken
zstd-error fallback. Addressed in patches 2 and 3.

Dmitry Ilvokhin (3):
  perf record: Fix multiple PERF_RECORD_COMPRESSED2 records per push
  perf record: Avoid overrunning the zstd output buffer
  perf record: Don't store raw data on zstd compression failure

 tools/perf/builtin-record.c                   | 40 ++++++------
 .../record+zstd_comp_decomp_multi_record.sh   | 64 +++++++++++++++++++
 tools/perf/util/zstd.c                        | 15 +++--
 3 files changed, 94 insertions(+), 25 deletions(-)
 create mode 100755 tools/perf/tests/shell/record+zstd_comp_decomp_multi_record.sh

-- 
2.53.0-Meta


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH 1/3] perf record: Fix multiple PERF_RECORD_COMPRESSED2 records per push
  2026-06-18 15:59 [PATCH 0/3] perf record: fix multi-record Zstd compression Dmitry Ilvokhin
@ 2026-06-18 15:59 ` Dmitry Ilvokhin
  2026-06-18 15:59 ` [PATCH 2/3] perf record: Avoid overrunning the zstd output buffer Dmitry Ilvokhin
  2026-06-18 15:59 ` [PATCH 3/3] perf record: Don't store raw data on zstd compression failure Dmitry Ilvokhin
  2 siblings, 0 replies; 4+ messages in thread
From: Dmitry Ilvokhin @ 2026-06-18 15:59 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, James Clark, Nick Terrell,
	David Sterba
  Cc: linux-kernel, linux-perf-users, kernel-team, Farid Zakaria,
	Dmitry Ilvokhin

With Zstd compression enabled ('perf record -z'), a single mmap push
whose compressed output exceeds the maximum record size makes
zstd_compress_stream_to_records() emit several PERF_RECORD_COMPRESSED2
records back to back. record__pushfn() however rewrote only the first
record's header to describe the whole blob as one record:

  event->data_size   = compressed - sizeof(struct perf_record_compressed2);
  event->header.size = PERF_ALIGN(compressed, sizeof(u64));
  padding            = event->header.size - compressed;
  ...
  record__write(rec, map, &pad, padding);

perf_event_header::size is a __u16, so once the compressed blob no
longer fits in it the header.size assignment truncates and 'padding'
(size_t) underflows. write() is then handed that bogus length and fails
with EFAULT, aborting the recording:

  failed to write perf data, error: Bad address

The bytes that did reach the file are mis-framed, so reading it back
cannot be decompressed.

This is easy to hit with a high event rate and a large buffer, e.g.:

  perf record -z -F max -m 32M --per-thread -- perf test -w thloop 5 1

The single-record fixup is wrong by construction: because header.size is
16 bits a compressed record cannot exceed 64KB, so the compressor must
split a push into a chain of records, and the session reader already
consumes them as such.

Frame each record where it is produced instead: process_comp_header()
now sets the per-record data_size, 8-byte-aligns header.size and zeroes
the trailing padding, and record__pushfn() just writes the resulting
blob, as the AIO path already does. max_record_size is reduced by
sizeof(u64) so the per-record alignment padding cannot push header.size
past its u16 field.

There is no on-disk format change; a perf.data written by the fixed tool
is still read by existing perf.

Fixes: 208c0e168344 ("perf record: Add 8-byte aligned event type PERF_RECORD_COMPRESSED2")
Reported-by: Farid Zakaria <fmzakari@meta.com>
Signed-off-by: Dmitry Ilvokhin <d@ilvokhin.com>
---
 tools/perf/builtin-record.c                   | 40 ++++++------
 .../record+zstd_comp_decomp_multi_record.sh   | 64 +++++++++++++++++++
 tools/perf/util/zstd.c                        |  2 +-
 3 files changed, 86 insertions(+), 20 deletions(-)
 create mode 100755 tools/perf/tests/shell/record+zstd_comp_decomp_multi_record.sh

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 4a5eba498c02..314600812207 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -652,27 +652,14 @@ static int record__pushfn(struct mmap *map, void *to, void *bf, size_t size)
 	struct record *rec = to;
 
 	if (record__comp_enabled(rec)) {
-		struct perf_record_compressed2 *event = map->data;
-		size_t padding = 0;
-		u8 pad[8] = {0};
 		ssize_t compressed = zstd_compress(rec->session, map, map->data,
 						   mmap__mmap_len(map), bf, size);
 
 		if (compressed < 0)
 			return (int)compressed;
 
-		bf = event;
 		thread->samples++;
-
-		/*
-		 * The record from `zstd_compress` is not 8 bytes aligned, which would cause asan
-		 * error. We make it aligned here.
-		 */
-		event->data_size = compressed - sizeof(struct perf_record_compressed2);
-		event->header.size = PERF_ALIGN(compressed, sizeof(u64));
-		padding = event->header.size - compressed;
-		return record__write(rec, map, bf, compressed) ||
-		       record__write(rec, map, &pad, padding);
+		return record__write(rec, map, map->data, compressed);
 	}
 
 	thread->samples++;
@@ -1590,18 +1577,28 @@ static void record__adjust_affinity(struct record *rec, struct mmap *map)
 	}
 }
 
-static size_t process_comp_header(void *record, size_t increment)
+/*
+ * Called once with data_size == 0 to start a record, then once with
+ * data_size == compressed payload size to finalize and 8-byte-pad it.
+ */
+static size_t process_comp_header(void *record, size_t data_size)
 {
 	struct perf_record_compressed2 *event = record;
 	size_t size = sizeof(*event);
 
-	if (increment) {
-		event->header.size += increment;
-		return increment;
+	if (data_size) {
+		size_t padding;
+
+		event->data_size = data_size;
+		event->header.size = PERF_ALIGN(size + data_size, sizeof(u64));
+		padding = event->header.size - size - data_size;
+		memset(record + size + data_size, 0, padding);
+		return data_size + padding;
 	}
 
 	event->header.type = PERF_RECORD_COMPRESSED2;
 	event->header.size = size;
+	event->data_size = 0;
 
 	return size;
 }
@@ -1610,7 +1607,12 @@ static ssize_t zstd_compress(struct perf_session *session, struct mmap *map,
 			    void *dst, size_t dst_size, void *src, size_t src_size)
 {
 	ssize_t compressed;
-	size_t max_record_size = PERF_SAMPLE_MAX_SIZE - sizeof(struct perf_record_compressed2) - 1;
+	/*
+	 * Reserve space so per-record PERF_ALIGN() padding keeps header.size
+	 * within u16.
+	 */
+	size_t max_record_size = PERF_SAMPLE_MAX_SIZE
+		- sizeof(struct perf_record_compressed2) - sizeof(u64);
 	struct zstd_data *zstd_data = &session->zstd_data;
 
 	if (map && map->file)
diff --git a/tools/perf/tests/shell/record+zstd_comp_decomp_multi_record.sh b/tools/perf/tests/shell/record+zstd_comp_decomp_multi_record.sh
new file mode 100755
index 000000000000..42efe7260def
--- /dev/null
+++ b/tools/perf/tests/shell/record+zstd_comp_decomp_multi_record.sh
@@ -0,0 +1,64 @@
+#!/bin/bash
+# Zstd perf.data compression/decompression of multi-record data
+
+# SPDX-License-Identifier: GPL-2.0
+
+perfdata=$(mktemp /tmp/__perf_test.perf.data.XXXXX)
+recout=$(mktemp /tmp/__perf_test.zstd.rec.XXXXX)
+injout=$(mktemp /tmp/__perf_test.zstd.inj.XXXXX)
+perf_tool=perf
+
+cleanup() {
+	rm -f "${perfdata}" "${perfdata}".old "${perfdata}".decomp "${recout}" "${injout}"
+}
+trap cleanup EXIT TERM INT
+
+skip_if_no_z_record() {
+	$perf_tool record -h 2>&1 | grep -q -- '-z, --compression-level'
+}
+
+collect_z_record() {
+	echo "Collecting compressed record file:"
+	[ "$(uname -m)" != s390x ] && gflag='-g'
+	$perf_tool record -o "${perfdata}" $gflag -z -F max -m 32M --per-thread -- \
+		$perf_tool test -w thloop 5 1 \
+		>/dev/null 2>"${recout}"
+}
+
+check_record() {
+	echo "Checking record did not fail to write data:"
+	if grep -q "failed to write perf data" "${recout}"; then
+		cat "${recout}"
+		return 1
+	fi
+}
+
+check_decompress() {
+	echo "Checking compressed file decompresses cleanly:"
+	if ! $perf_tool inject -i "${perfdata}" -o "${perfdata}".decomp 2>"${injout}"; then
+		cat "${injout}"
+		return 1
+	fi
+	if grep -Eqi "decompress|corrupt|failed to process type" "${injout}"; then
+		cat "${injout}"
+		return 1
+	fi
+}
+
+skip_if_no_z_record || exit 2
+collect_z_record
+check_record || exit 1
+
+# Need >1 record, else the multi-record path wasn't exercised.
+# Skip rather than pass/fail spuriously.
+nr=$($perf_tool report -i "${perfdata}" --stats 2>/dev/null |
+	awk '/COMPRESSED2 events:/ { print $3 }')
+if [ -z "${nr}" ] || [ "${nr}" -lt 2 ]; then
+	echo "less than two compressed records (${nr:-0}), skipping"
+	exit 2
+fi
+echo "Produced ${nr} compressed records"
+
+check_decompress
+err=$?
+exit $err
diff --git a/tools/perf/util/zstd.c b/tools/perf/util/zstd.c
index 57027e0ac7b6..1955fa2431d1 100644
--- a/tools/perf/util/zstd.c
+++ b/tools/perf/util/zstd.c
@@ -30,7 +30,7 @@ int zstd_fini(struct zstd_data *data)
 
 ssize_t zstd_compress_stream_to_records(struct zstd_data *data, void *dst, size_t dst_size,
 				       void *src, size_t src_size, size_t max_record_size,
-				       size_t process_header(void *record, size_t increment))
+				       size_t process_header(void *record, size_t data_size))
 {
 	size_t ret, size, compressed = 0;
 	ZSTD_inBuffer input = { src, src_size, 0 };
-- 
2.53.0-Meta


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH 2/3] perf record: Avoid overrunning the zstd output buffer
  2026-06-18 15:59 [PATCH 0/3] perf record: fix multi-record Zstd compression Dmitry Ilvokhin
  2026-06-18 15:59 ` [PATCH 1/3] perf record: Fix multiple PERF_RECORD_COMPRESSED2 records per push Dmitry Ilvokhin
@ 2026-06-18 15:59 ` Dmitry Ilvokhin
  2026-06-18 15:59 ` [PATCH 3/3] perf record: Don't store raw data on zstd compression failure Dmitry Ilvokhin
  2 siblings, 0 replies; 4+ messages in thread
From: Dmitry Ilvokhin @ 2026-06-18 15:59 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, James Clark, Nick Terrell,
	David Sterba
  Cc: linux-kernel, linux-perf-users, kernel-team, Farid Zakaria,
	Dmitry Ilvokhin

zstd_compress_stream_to_records() compresses one mmap chunk into 'dst'
as a chain of records, decrementing 'dst_size' as each record's header
and compressed payload are written, but it never checks that the space
left can still hold a record before writing one.

When the compressed output does not fit in 'dst', 'dst_size -= size'
underflows ('dst_size' is a size_t), the output size is then taken as
max_record_size again, and ZSTD_compressStream() writes past the end
of 'dst'.

Bail out when the remaining 'dst_size' can no longer hold a full record
(header + max_record_size + 8-byte alignment) rather than writing past
'dst'.

Fixes: f24c1d7523e6 ("perf tools: Introduce Zstd streaming based compression API")
Signed-off-by: Dmitry Ilvokhin <d@ilvokhin.com>
---
 tools/perf/util/zstd.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/zstd.c b/tools/perf/util/zstd.c
index 1955fa2431d1..a0f6572ad201 100644
--- a/tools/perf/util/zstd.c
+++ b/tools/perf/util/zstd.c
@@ -32,7 +32,7 @@ ssize_t zstd_compress_stream_to_records(struct zstd_data *data, void *dst, size_
 				       void *src, size_t src_size, size_t max_record_size,
 				       size_t process_header(void *record, size_t data_size))
 {
-	size_t ret, size, compressed = 0;
+	size_t ret, size, compressed = 0, hdr_size = 0;
 	ZSTD_inBuffer input = { src, src_size, 0 };
 	ZSTD_outBuffer output;
 	void *record;
@@ -53,8 +53,14 @@ ssize_t zstd_compress_stream_to_records(struct zstd_data *data, void *dst, size_
 	}
 
 	while (input.pos < input.size) {
+		if (hdr_size &&
+		    dst_size < hdr_size + max_record_size + sizeof(u64)) {
+			pr_err("failed to compress: output buffer too small\n");
+			return -1;
+		}
 		record = dst;
 		size = process_header(record, 0);
+		hdr_size = size;
 		compressed += size;
 		dst += size;
 		dst_size -= size;
-- 
2.53.0-Meta


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH 3/3] perf record: Don't store raw data on zstd compression failure
  2026-06-18 15:59 [PATCH 0/3] perf record: fix multi-record Zstd compression Dmitry Ilvokhin
  2026-06-18 15:59 ` [PATCH 1/3] perf record: Fix multiple PERF_RECORD_COMPRESSED2 records per push Dmitry Ilvokhin
  2026-06-18 15:59 ` [PATCH 2/3] perf record: Avoid overrunning the zstd output buffer Dmitry Ilvokhin
@ 2026-06-18 15:59 ` Dmitry Ilvokhin
  2 siblings, 0 replies; 4+ messages in thread
From: Dmitry Ilvokhin @ 2026-06-18 15:59 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, James Clark, Nick Terrell,
	David Sterba
  Cc: linux-kernel, linux-perf-users, kernel-team, Farid Zakaria,
	Dmitry Ilvokhin

On a ZSTD_compressStream() error, zstd_compress_stream_to_records()
falls back to copying the uncompressed input into the output buffer and
returning its size. That cannot work:

- a COMPRESSED2 payload is always fed to zstd_decompress_stream(), so
  raw bytes placed there can never be read back.

- the record header is never finalized: process_header() is not called
  to write the real header.size and data_size, so they keep their
  initial values and do not match the bytes written.

- the copy is unbounded and can write past the output buffer.

Propagate the error instead.

Fixes: f24c1d7523e6 ("perf tools: Introduce Zstd streaming based compression API")
Signed-off-by: Dmitry Ilvokhin <d@ilvokhin.com>
---
 tools/perf/util/zstd.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/tools/perf/util/zstd.c b/tools/perf/util/zstd.c
index a0f6572ad201..2de777d49ee0 100644
--- a/tools/perf/util/zstd.c
+++ b/tools/perf/util/zstd.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
 
-#include <string.h>
-
 #include "util/compress.h"
 #include "util/debug.h"
 
@@ -71,8 +69,7 @@ ssize_t zstd_compress_stream_to_records(struct zstd_data *data, void *dst, size_
 		if (ZSTD_isError(ret)) {
 			pr_err("failed to compress %ld bytes: %s\n",
 				(long)src_size, ZSTD_getErrorName(ret));
-			memcpy(dst, src, src_size);
-			return src_size;
+			return -1;
 		}
 		size = output.pos;
 		size = process_header(record, size);
-- 
2.53.0-Meta


^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2026-06-18 16:06 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-18 15:59 [PATCH 0/3] perf record: fix multi-record Zstd compression Dmitry Ilvokhin
2026-06-18 15:59 ` [PATCH 1/3] perf record: Fix multiple PERF_RECORD_COMPRESSED2 records per push Dmitry Ilvokhin
2026-06-18 15:59 ` [PATCH 2/3] perf record: Avoid overrunning the zstd output buffer Dmitry Ilvokhin
2026-06-18 15:59 ` [PATCH 3/3] perf record: Don't store raw data on zstd compression failure Dmitry Ilvokhin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox