[PATCH 0/3] perf record: fix multi-record Zstd compression

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH 0/3] perf record: fix multi-record Zstd compression
@ 2026-06-18 15:59 Dmitry Ilvokhin
  2026-06-18 15:59 ` [PATCH 1/3] perf record: Fix multiple PERF_RECORD_COMPRESSED2 records per push Dmitry Ilvokhin
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Dmitry Ilvokhin @ 2026-06-18 15:59 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, James Clark, Nick Terrell,
	David Sterba
  Cc: linux-kernel, linux-perf-users, kernel-team, Farid Zakaria,
	Dmitry Ilvokhin

Patch 1 fixes a 'perf record -z' regression that aborts recording with
"Bad address" and produces an undecompressable perf.data.

While fixing it I found two more latent bugs in the same compressor,
zstd_compress_stream_to_records(): an output-buffer overrun and a broken
zstd-error fallback. Addressed in patches 2 and 3.

Dmitry Ilvokhin (3):
  perf record: Fix multiple PERF_RECORD_COMPRESSED2 records per push
  perf record: Avoid overrunning the zstd output buffer
  perf record: Don't store raw data on zstd compression failure

 tools/perf/builtin-record.c                   | 40 ++++++------
 .../record+zstd_comp_decomp_multi_record.sh   | 64 +++++++++++++++++++
 tools/perf/util/zstd.c                        | 15 +++--
 3 files changed, 94 insertions(+), 25 deletions(-)
 create mode 100755 tools/perf/tests/shell/record+zstd_comp_decomp_multi_record.sh

-- 
2.53.0-Meta


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/3] perf record: Fix multiple PERF_RECORD_COMPRESSED2 records per push
  2026-06-18 15:59 [PATCH 0/3] perf record: fix multi-record Zstd compression Dmitry Ilvokhin
@ 2026-06-18 15:59 ` Dmitry Ilvokhin
  2026-06-18 16:26   ` sashiko-bot
  2026-06-18 15:59 ` [PATCH 2/3] perf record: Avoid overrunning the zstd output buffer Dmitry Ilvokhin
  2026-06-18 15:59 ` [PATCH 3/3] perf record: Don't store raw data on zstd compression failure Dmitry Ilvokhin
  2 siblings, 1 reply; 7+ messages in thread
From: Dmitry Ilvokhin @ 2026-06-18 15:59 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, James Clark, Nick Terrell,
	David Sterba
  Cc: linux-kernel, linux-perf-users, kernel-team, Farid Zakaria,
	Dmitry Ilvokhin

With Zstd compression enabled ('perf record -z'), a single mmap push
whose compressed output exceeds the maximum record size makes
zstd_compress_stream_to_records() emit several PERF_RECORD_COMPRESSED2
records back to back. record__pushfn() however rewrote only the first
record's header to describe the whole blob as one record:

  event->data_size   = compressed - sizeof(struct perf_record_compressed2);
  event->header.size = PERF_ALIGN(compressed, sizeof(u64));
  padding            = event->header.size - compressed;
  ...
  record__write(rec, map, &pad, padding);

perf_event_header::size is a __u16, so once the compressed blob no
longer fits in it the header.size assignment truncates and 'padding'
(size_t) underflows. write() is then handed that bogus length and fails
with EFAULT, aborting the recording:

  failed to write perf data, error: Bad address

The bytes that did reach the file are mis-framed, so reading it back
cannot be decompressed.

This is easy to hit with a high event rate and a large buffer, e.g.:

  perf record -z -F max -m 32M --per-thread -- perf test -w thloop 5 1

The single-record fixup is wrong by construction: because header.size is
16 bits a compressed record cannot exceed 64KB, so the compressor must
split a push into a chain of records, and the session reader already
consumes them as such.

Frame each record where it is produced instead: process_comp_header()
now sets the per-record data_size, 8-byte-aligns header.size and zeroes
the trailing padding, and record__pushfn() just writes the resulting
blob, as the AIO path already does. max_record_size is reduced by
sizeof(u64) so the per-record alignment padding cannot push header.size
past its u16 field.

There is no on-disk format change; a perf.data written by the fixed tool
is still read by existing perf.

Fixes: 208c0e168344 ("perf record: Add 8-byte aligned event type PERF_RECORD_COMPRESSED2")
Reported-by: Farid Zakaria <fmzakari@meta.com>
Signed-off-by: Dmitry Ilvokhin <d@ilvokhin.com>
---
 tools/perf/builtin-record.c                   | 40 ++++++------
 .../record+zstd_comp_decomp_multi_record.sh   | 64 +++++++++++++++++++
 tools/perf/util/zstd.c                        |  2 +-
 3 files changed, 86 insertions(+), 20 deletions(-)
 create mode 100755 tools/perf/tests/shell/record+zstd_comp_decomp_multi_record.sh

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 4a5eba498c02..314600812207 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -652,27 +652,14 @@ static int record__pushfn(struct mmap *map, void *to, void *bf, size_t size)
 	struct record *rec = to;
 
 	if (record__comp_enabled(rec)) {
-		struct perf_record_compressed2 *event = map->data;
-		size_t padding = 0;
-		u8 pad[8] = {0};
 		ssize_t compressed = zstd_compress(rec->session, map, map->data,
 						   mmap__mmap_len(map), bf, size);
 
 		if (compressed < 0)
 			return (int)compressed;
 
-		bf = event;
 		thread->samples++;
-
-		/*
-		 * The record from `zstd_compress` is not 8 bytes aligned, which would cause asan
-		 * error. We make it aligned here.
-		 */
-		event->data_size = compressed - sizeof(struct perf_record_compressed2);
-		event->header.size = PERF_ALIGN(compressed, sizeof(u64));
-		padding = event->header.size - compressed;
-		return record__write(rec, map, bf, compressed) ||
-		       record__write(rec, map, &pad, padding);
+		return record__write(rec, map, map->data, compressed);
 	}
 
 	thread->samples++;
@@ -1590,18 +1577,28 @@ static void record__adjust_affinity(struct record *rec, struct mmap *map)
 	}
 }
 
-static size_t process_comp_header(void *record, size_t increment)
+/*
+ * Called once with data_size == 0 to start a record, then once with
+ * data_size == compressed payload size to finalize and 8-byte-pad it.
+ */
+static size_t process_comp_header(void *record, size_t data_size)
 {
 	struct perf_record_compressed2 *event = record;
 	size_t size = sizeof(*event);
 
-	if (increment) {
-		event->header.size += increment;
-		return increment;
+	if (data_size) {
+		size_t padding;
+
+		event->data_size = data_size;
+		event->header.size = PERF_ALIGN(size + data_size, sizeof(u64));
+		padding = event->header.size - size - data_size;
+		memset(record + size + data_size, 0, padding);
+		return data_size + padding;
 	}
 
 	event->header.type = PERF_RECORD_COMPRESSED2;
 	event->header.size = size;
+	event->data_size = 0;
 
 	return size;
 }
@@ -1610,7 +1607,12 @@ static ssize_t zstd_compress(struct perf_session *session, struct mmap *map,
 			    void *dst, size_t dst_size, void *src, size_t src_size)
 {
 	ssize_t compressed;
-	size_t max_record_size = PERF_SAMPLE_MAX_SIZE - sizeof(struct perf_record_compressed2) - 1;
+	/*
+	 * Reserve space so per-record PERF_ALIGN() padding keeps header.size
+	 * within u16.
+	 */
+	size_t max_record_size = PERF_SAMPLE_MAX_SIZE
+		- sizeof(struct perf_record_compressed2) - sizeof(u64);
 	struct zstd_data *zstd_data = &session->zstd_data;
 
 	if (map && map->file)
diff --git a/tools/perf/tests/shell/record+zstd_comp_decomp_multi_record.sh b/tools/perf/tests/shell/record+zstd_comp_decomp_multi_record.sh
new file mode 100755
index 000000000000..42efe7260def
--- /dev/null
+++ b/tools/perf/tests/shell/record+zstd_comp_decomp_multi_record.sh
@@ -0,0 +1,64 @@
+#!/bin/bash
+# Zstd perf.data compression/decompression of multi-record data
+
+# SPDX-License-Identifier: GPL-2.0
+
+perfdata=$(mktemp /tmp/__perf_test.perf.data.XXXXX)
+recout=$(mktemp /tmp/__perf_test.zstd.rec.XXXXX)
+injout=$(mktemp /tmp/__perf_test.zstd.inj.XXXXX)
+perf_tool=perf
+
+cleanup() {
+	rm -f "${perfdata}" "${perfdata}".old "${perfdata}".decomp "${recout}" "${injout}"
+}
+trap cleanup EXIT TERM INT
+
+skip_if_no_z_record() {
+	$perf_tool record -h 2>&1 | grep -q -- '-z, --compression-level'
+}
+
+collect_z_record() {
+	echo "Collecting compressed record file:"
+	[ "$(uname -m)" != s390x ] && gflag='-g'
+	$perf_tool record -o "${perfdata}" $gflag -z -F max -m 32M --per-thread -- \
+		$perf_tool test -w thloop 5 1 \
+		>/dev/null 2>"${recout}"
+}
+
+check_record() {
+	echo "Checking record did not fail to write data:"
+	if grep -q "failed to write perf data" "${recout}"; then
+		cat "${recout}"
+		return 1
+	fi
+}
+
+check_decompress() {
+	echo "Checking compressed file decompresses cleanly:"
+	if ! $perf_tool inject -i "${perfdata}" -o "${perfdata}".decomp 2>"${injout}"; then
+		cat "${injout}"
+		return 1
+	fi
+	if grep -Eqi "decompress|corrupt|failed to process type" "${injout}"; then
+		cat "${injout}"
+		return 1
+	fi
+}
+
+skip_if_no_z_record || exit 2
+collect_z_record
+check_record || exit 1
+
+# Need >1 record, else the multi-record path wasn't exercised.
+# Skip rather than pass/fail spuriously.
+nr=$($perf_tool report -i "${perfdata}" --stats 2>/dev/null |
+	awk '/COMPRESSED2 events:/ { print $3 }')
+if [ -z "${nr}" ] || [ "${nr}" -lt 2 ]; then
+	echo "less than two compressed records (${nr:-0}), skipping"
+	exit 2
+fi
+echo "Produced ${nr} compressed records"
+
+check_decompress
+err=$?
+exit $err
diff --git a/tools/perf/util/zstd.c b/tools/perf/util/zstd.c
index 57027e0ac7b6..1955fa2431d1 100644
--- a/tools/perf/util/zstd.c
+++ b/tools/perf/util/zstd.c
@@ -30,7 +30,7 @@ int zstd_fini(struct zstd_data *data)
 
 ssize_t zstd_compress_stream_to_records(struct zstd_data *data, void *dst, size_t dst_size,
 				       void *src, size_t src_size, size_t max_record_size,
-				       size_t process_header(void *record, size_t increment))
+				       size_t process_header(void *record, size_t data_size))
 {
 	size_t ret, size, compressed = 0;
 	ZSTD_inBuffer input = { src, src_size, 0 };
-- 
2.53.0-Meta


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/3] perf record: Avoid overrunning the zstd output buffer
  2026-06-18 15:59 [PATCH 0/3] perf record: fix multi-record Zstd compression Dmitry Ilvokhin
  2026-06-18 15:59 ` [PATCH 1/3] perf record: Fix multiple PERF_RECORD_COMPRESSED2 records per push Dmitry Ilvokhin
@ 2026-06-18 15:59 ` Dmitry Ilvokhin
  2026-06-18 16:20   ` sashiko-bot
  2026-06-18 15:59 ` [PATCH 3/3] perf record: Don't store raw data on zstd compression failure Dmitry Ilvokhin
  2 siblings, 1 reply; 7+ messages in thread
From: Dmitry Ilvokhin @ 2026-06-18 15:59 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, James Clark, Nick Terrell,
	David Sterba
  Cc: linux-kernel, linux-perf-users, kernel-team, Farid Zakaria,
	Dmitry Ilvokhin

zstd_compress_stream_to_records() compresses one mmap chunk into 'dst'
as a chain of records, decrementing 'dst_size' as each record's header
and compressed payload are written, but it never checks that the space
left can still hold a record before writing one.

When the compressed output does not fit in 'dst', 'dst_size -= size'
underflows ('dst_size' is a size_t), the output size is then taken as
max_record_size again, and ZSTD_compressStream() writes past the end
of 'dst'.

Bail out when the remaining 'dst_size' can no longer hold a full record
(header + max_record_size + 8-byte alignment) rather than writing past
'dst'.

Fixes: f24c1d7523e6 ("perf tools: Introduce Zstd streaming based compression API")
Signed-off-by: Dmitry Ilvokhin <d@ilvokhin.com>
---
 tools/perf/util/zstd.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/zstd.c b/tools/perf/util/zstd.c
index 1955fa2431d1..a0f6572ad201 100644
--- a/tools/perf/util/zstd.c
+++ b/tools/perf/util/zstd.c
@@ -32,7 +32,7 @@ ssize_t zstd_compress_stream_to_records(struct zstd_data *data, void *dst, size_
 				       void *src, size_t src_size, size_t max_record_size,
 				       size_t process_header(void *record, size_t data_size))
 {
-	size_t ret, size, compressed = 0;
+	size_t ret, size, compressed = 0, hdr_size = 0;
 	ZSTD_inBuffer input = { src, src_size, 0 };
 	ZSTD_outBuffer output;
 	void *record;
@@ -53,8 +53,14 @@ ssize_t zstd_compress_stream_to_records(struct zstd_data *data, void *dst, size_
 	}
 
 	while (input.pos < input.size) {
+		if (hdr_size &&
+		    dst_size < hdr_size + max_record_size + sizeof(u64)) {
+			pr_err("failed to compress: output buffer too small\n");
+			return -1;
+		}
 		record = dst;
 		size = process_header(record, 0);
+		hdr_size = size;
 		compressed += size;
 		dst += size;
 		dst_size -= size;
-- 
2.53.0-Meta


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 3/3] perf record: Don't store raw data on zstd compression failure
  2026-06-18 15:59 [PATCH 0/3] perf record: fix multi-record Zstd compression Dmitry Ilvokhin
  2026-06-18 15:59 ` [PATCH 1/3] perf record: Fix multiple PERF_RECORD_COMPRESSED2 records per push Dmitry Ilvokhin
  2026-06-18 15:59 ` [PATCH 2/3] perf record: Avoid overrunning the zstd output buffer Dmitry Ilvokhin
@ 2026-06-18 15:59 ` Dmitry Ilvokhin
  2026-06-18 16:24   ` sashiko-bot
  2 siblings, 1 reply; 7+ messages in thread
From: Dmitry Ilvokhin @ 2026-06-18 15:59 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, James Clark, Nick Terrell,
	David Sterba
  Cc: linux-kernel, linux-perf-users, kernel-team, Farid Zakaria,
	Dmitry Ilvokhin

On a ZSTD_compressStream() error, zstd_compress_stream_to_records()
falls back to copying the uncompressed input into the output buffer and
returning its size. That cannot work:

- a COMPRESSED2 payload is always fed to zstd_decompress_stream(), so
  raw bytes placed there can never be read back.

- the record header is never finalized: process_header() is not called
  to write the real header.size and data_size, so they keep their
  initial values and do not match the bytes written.

- the copy is unbounded and can write past the output buffer.

Propagate the error instead.

Fixes: f24c1d7523e6 ("perf tools: Introduce Zstd streaming based compression API")
Signed-off-by: Dmitry Ilvokhin <d@ilvokhin.com>
---
 tools/perf/util/zstd.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/tools/perf/util/zstd.c b/tools/perf/util/zstd.c
index a0f6572ad201..2de777d49ee0 100644
--- a/tools/perf/util/zstd.c
+++ b/tools/perf/util/zstd.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
 
-#include <string.h>
-
 #include "util/compress.h"
 #include "util/debug.h"
 
@@ -71,8 +69,7 @@ ssize_t zstd_compress_stream_to_records(struct zstd_data *data, void *dst, size_
 		if (ZSTD_isError(ret)) {
 			pr_err("failed to compress %ld bytes: %s\n",
 				(long)src_size, ZSTD_getErrorName(ret));
-			memcpy(dst, src, src_size);
-			return src_size;
+			return -1;
 		}
 		size = output.pos;
 		size = process_header(record, size);
-- 
2.53.0-Meta


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/3] perf record: Avoid overrunning the zstd output buffer
  2026-06-18 15:59 ` [PATCH 2/3] perf record: Avoid overrunning the zstd output buffer Dmitry Ilvokhin
@ 2026-06-18 16:20   ` sashiko-bot
  0 siblings, 0 replies; 7+ messages in thread
From: sashiko-bot @ 2026-06-18 16:20 UTC (permalink / raw)
  To: Dmitry Ilvokhin; +Cc: linux-perf-users

Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:

New issues:
- [High] The patch introduces an overly strict buffer size check that causes compression to fail prematurely whenever less than `max_record_size` (~64KB) space remains.

Pre-existing issues:
- [High] The safety bounds check added to `zstd_compress_stream_to_records` is bypassed on the first loop iteration, leaving the buffer vulnerable to overflow and integer underflow.
- [High] In `zstd_decompress_stream()`, the `output.dst` pointer is advanced while `output.pos` is improperly preserved, leading to out-of-bounds writes and corrupted output if the decompression loop executes more than once.
- [High] If `perf_mmap__push` processes the ring buffer in two chunks and the second chunk's compression fails, the reference to `map->core` acquired during the first chunk is leaked.
--

commit 09f418a2fa19ace0234b4443e193f65fb347ed8b
Author: Dmitry Ilvokhin <d@ilvokhin.com>

perf record: Avoid overrunning the zstd output buffer

This commit adds a bounds check in zstd_compress_stream_to_records() to
prevent writing past the end of the output buffer. It verifies if the
remaining buffer size is sufficient for a full record.

> diff --git a/tools/perf/util/zstd.c b/tools/perf/util/zstd.c
> index 1955fa2431d17..a0f6572ad2017 100644
> --- a/tools/perf/util/zstd.c
> +++ b/tools/perf/util/zstd.c

[ ... ]

> @@ -53,8 +53,14 @@ ssize_t zstd_compress_stream_to_records(struct zstd_data *data, void *dst, size_
>  	}
>  
>  	while (input.pos < input.size) {
> +		if (hdr_size &&
> +		    dst_size < hdr_size + max_record_size + sizeof(u64)) {

[Severity: High]
This isn't a bug introduced by this patch, but by gating the bounds check on
hdr_size, isn't the check bypassed on the very first loop iteration since
hdr_size is initialized to 0?

If dst_size is initially very small (which can legitimately happen in 
record__aio_pushfn() when calculating mmap__mmap_len(map) - aio->size), the 
code will unconditionally write the header and decrement dst_size.

Could this underflow dst_size to a massive positive value, bypassing
subsequent bounds checks and allowing ZSTD_compressStream() to write out of
bounds?

[Severity: High]
Does this check introduce an overly strict requirement that causes compression 
to fail prematurely?

By requiring dst_size < hdr_size + max_record_size + sizeof(u64), this 
condition unconditionally fails compression if the remaining buffer is less 
than the maximum record size (typically ~64KB).

In AIO mode, dst_size represents the remaining space in the AIO buffer, which
naturally dwindles. If a small chunk needs to be compressed and could fit in
the remaining space, wouldn't this check spuriously abort the compression and
render the last ~64KB of every buffer unusable?

> +			pr_err("failed to compress: output buffer too small\n");
> +			return -1;
> +		}
>  		record = dst;
>  		size = process_header(record, 0);
> +		hdr_size = size;
>  		compressed += size;
>  		dst += size;
>  		dst_size -= size;

[Severity: High]
This isn't a bug introduced by this patch, but looking at
zstd_decompress_stream() in the same file, is there an issue with how
output.pos is handled?

tools/perf/util/zstd.c:zstd_decompress_stream() {
    ...
    output.dst  = dst + output.pos;
    output.size = dst_size - output.pos;
    ...
}

The loop advances the destination pointer but fails to clear output.pos.

If the decompression loop executes more than once, will ZSTD start writing
at the new output.dst + output.pos, effectively skipping output.pos bytes
and corrupting the decompressed stream?

[Severity: High]
This isn't a bug introduced by this patch, but does the strict bounds check
introduced above make a memory leak in AIO mode more easily triggerable?

In tools/perf/builtin-record.c:record__aio_pushfn(), a reference to map->core 
is acquired on the first successful chunk:

tools/perf/builtin-record.c:record__aio_pushfn() {
    ...
    if (!aio->size) {
        perf_mmap__get(&map->core);
    }
    ...
}

If perf_mmap__push() processes the ring buffer in two chunks, and the
second chunk's compression fails (which happens more frequently now due
to the ~64KB bounds check), perf_mmap__push() aborts and passes the error
back to record__aio_push():

tools/perf/builtin-record.c:record__aio_push() {
    ...
    ret = perf_mmap__push(map, &aio, record__aio_pushfn);
    if (ret != 0)
        return ret;
    ...
}

Will this return immediately on failure and bypass the perf_mmap__put()
cleanup that occurs later, resulting in a leak of the map->core resource?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/cover.1781797544.git.d@ilvokhin.com?part=2

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 3/3] perf record: Don't store raw data on zstd compression failure
  2026-06-18 15:59 ` [PATCH 3/3] perf record: Don't store raw data on zstd compression failure Dmitry Ilvokhin
@ 2026-06-18 16:24   ` sashiko-bot
  0 siblings, 0 replies; 7+ messages in thread
From: sashiko-bot @ 2026-06-18 16:24 UTC (permalink / raw)
  To: Dmitry Ilvokhin; +Cc: linux-perf-users

Thank you for your contribution! Sashiko AI review found 7 potential issue(s) to consider:

Pre-existing issues:
- [Critical] The `zstd_decompress_stream` function incorrectly updates the output buffer pointer, causing an out-of-bounds write and data corruption.
- [Critical] Buffer overflow via integer underflow in `zstd_compress_stream_to_records` bypasses the bounds check.
- [High] Unflushed compressed data is left in the ZSTD internal buffer and permanently lost at the end of a recording session.
- [High] The return value of `ZSTD_flushStream()` is entirely ignored, masking potential compression state errors.
- [High] A missing `perf_mmap__put` in the error path of `record__aio_push` causes a reference count leak of `map->core`.
- [High] Skipped `perf_mmap__consume` on partial failure causes duplicate event records in the `perf.data` file.
- [High] An overly strict bounds check in `zstd_compress_stream_to_records` causes `perf record` to abort when compressing large chunks of trace data.
--

commit 460b94129652aafbc57f9667aa7c4a4a81ffacbe
Author: Dmitry Ilvokhin <d@ilvokhin.com>

perf record: Don't store raw data on zstd compression failure

On a ZSTD_compressStream() error, zstd_compress_stream_to_records()
falls back to copying the uncompressed input into the output buffer and
returning its size. That cannot work:

- a COMPRESSED2 payload is always fed to zstd_decompress_stream(), so
  raw bytes placed there can never be read back.

- the record header is never finalized: process_header() is not called
  to write the real header.size and data_size, so they keep their
  initial values and do not match the bytes written.

- the copy is unbounded and can write past the output buffer.

Propagate the error instead.

Fixes: f24c1d7523e6 ("perf tools: Introduce Zstd streaming based compression API")

> diff --git a/tools/perf/util/zstd.c b/tools/perf/util/zstd.c
> index a0f6572ad2017..2de777d49ee0b 100644
> --- a/tools/perf/util/zstd.c
> +++ b/tools/perf/util/zstd.c
[ ... ]
> @@ -71,8 +69,7 @@ ssize_t zstd_compress_stream_to_records(struct zstd_data *data, void *dst, size_
>  		if (ZSTD_isError(ret)) {
>  			pr_err("failed to compress %ld bytes: %s\n",
>  				(long)src_size, ZSTD_getErrorName(ret));
> -			memcpy(dst, src, src_size);
> -			return src_size;
> +			return -1;
>  		}
>  		size = output.pos;
>  		size = process_header(record, size);

[Severity: High]
This isn't a bug introduced by this patch, but does the loop ignore the
return value of ZSTD_flushStream()?

Looking at zstd_compress_stream_to_records():

		ret = ZSTD_compressStream(data->cstream, &output, &input);
		ZSTD_flushStream(data->cstream, &output);
		if (ZSTD_isError(ret)) {

The return value of ZSTD_flushStream() is never checked. The subsequent
ZSTD_isError(ret) only evaluates the error code of ZSTD_compressStream().
If the flush operation encounters a failure, the error code is discarded,
potentially resulting in silently corrupted data streams or emitting an
incomplete compressed payload without propagating an error to the user.

[Severity: High]
This is a pre-existing issue, but does zstd_compress_stream_to_records()
leave unflushed compressed data in the ZSTD internal buffer?

In tools/perf/util/zstd.c:zstd_compress_stream_to_records():

	while (input.pos < input.size) {
        ...

The loop exits strictly when input.pos == input.size. Since
ZSTD_flushStream() can leave unflushed data in the internal compression
stream if the output buffer fills up, and perf record never calls
ZSTD_endStream() to force a final flush at the end of the session, any
remaining compressed data trapped in the stream buffer is silently and
permanently lost, resulting in truncated or corrupted trace files.

[Severity: Critical]
This is a pre-existing issue, but does zstd_decompress_stream() incorrectly
advance the output buffer pointer, causing an out-of-bounds write?

In tools/perf/util/zstd.c:zstd_decompress_stream():

	while (input.pos < input.size) {
		ret = ZSTD_decompressStream(data->dstream, &output, &input);
        ...
		output.dst  = dst + output.pos;
		output.size = dst_size - output.pos;
	}

The output.dst pointer is advanced by output.pos
(output.dst = dst + output.pos). However, output.pos is the accumulated
number of bytes written, which ZSTD_decompressStream() internally adds to
output.dst. This evaluates to an effective write address of
dst + 2 * output.pos.

Consequently, the decompression loop skips over output.pos bytes on every
iteration, leaving uninitialized gaps in the decompressed output and quickly
writing past the dst_size bounds, causing severe memory corruption.

[Severity: Critical]
This is a pre-existing issue, but does zstd_compress_stream_to_records()
have a buffer overflow via integer underflow that bypasses bounds checking?

In tools/perf/util/zstd.c:zstd_compress_stream_to_records():

		if (hdr_size &&
		    dst_size < hdr_size + max_record_size + sizeof(u64)) {
			pr_err("failed to compress: output buffer too small\n");
			return -1;
		}
		record = dst;
		size = process_header(record, 0);

When perf_mmap__push() splits the ring buffer, the second chunk can be
passed to zstd_compress() with a very small dst_size (e.g., 8 bytes). Since
hdr_size is initially 0, the boundary check is bypassed on the first
iteration.

process_header() unconditionally writes a 24-byte header into the <24 byte
remaining buffer, corrupting memory. Then dst_size -= 24 underflows to a
huge value (SIZE_MAX), causing ZSTD_compressStream() to write up to 64KB
(max_record_size) entirely out of bounds.

[Severity: High]
This is a pre-existing issue, but does the error path in record__aio_push()
cause a reference count leak of map->core?

In tools/perf/builtin-record.c:record__aio_push():

	if (ret != 0) /* ret > 0 - no data, ret < 0 - error */
		return ret;

When zstd_compress() fails on the second mmap chunk, record__aio_pushfn()
propagates the error (< 0). perf_mmap__push() then aborts and returns the
error code.

The code checks if (ret != 0) and returns immediately, skipping the
perf_mmap__put(&map->core) cleanup that balances the perf_mmap__get()
acquired during the successful first chunk. This unhandled error path causes
an LSAN-reportable memory leak and violates perf tool reference counting
invariants.

[Severity: High]
This is a pre-existing issue, but does skipping perf_mmap__consume() on
partial failure cause duplicate event records in the perf.data file?

In tools/perf/util/mmap.c:perf_mmap__push():

	if (push(md, to, buf, size) < 0) {
		rc = -1;
		goto out;
	}

If the second chunk of a split ring buffer fails during zstd_compress(),
perf_mmap__push() aborts via goto out, skipping the
perf_mmap__consume(&md->core) call.

Because perf_mmap__consume() is skipped, the ring buffer's consumer tail
pointer is never advanced, even though the first chunk was successfully
compressed and written. The caller record__mmap_read_evlist() receives the
error, aborts, and triggers a retry via record__mmap_read_all(rec, true),
which reads and writes the exact same first chunk again, resulting in
duplicated events and corrupted profiling data.

[Severity: High]
This is a pre-existing issue, but does an overly strict bounds check in
zstd_compress_stream_to_records() cause perf record to improperly abort?

In tools/perf/util/zstd.c:zstd_compress_stream_to_records():

		if (hdr_size &&
		    dst_size < hdr_size + max_record_size + sizeof(u64)) {
			pr_err("failed to compress: output buffer too small\n");
			return -1;
		}

The loop requires dst_size to be larger than approximately 64KB
(hdr_size + max_record_size + sizeof(u64)) to proceed to the next iteration.

If the remaining buffer space falls below this threshold near the end of a
compression loop, it unconditionally returns -1 and prints "output buffer
too small", even if only a few bytes of input remain to be compressed. This
logic effectively turns a normal buffer sizing condition into a fatal error,
instantly aborting the entire perf record session.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/cover.1781797544.git.d@ilvokhin.com?part=3

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/3] perf record: Fix multiple PERF_RECORD_COMPRESSED2 records per push
  2026-06-18 15:59 ` [PATCH 1/3] perf record: Fix multiple PERF_RECORD_COMPRESSED2 records per push Dmitry Ilvokhin
@ 2026-06-18 16:26   ` sashiko-bot
  0 siblings, 0 replies; 7+ messages in thread
From: sashiko-bot @ 2026-06-18 16:26 UTC (permalink / raw)
  To: Dmitry Ilvokhin; +Cc: linux-perf-users

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:

Pre-existing issues:
- [Critical] Missing bounds check on `dst_size` during the first iteration causes unsigned underflow and out-of-bounds writes.
--

commit 05c1411c44b6c31eba9854745a0f8d5ce767f17a
Author: Dmitry Ilvokhin <d@ilvokhin.com>

perf record: Fix multiple PERF_RECORD_COMPRESSED2 records per push

With Zstd compression enabled, a single mmap push whose compressed output
exceeds the maximum record size emits several back-to-back compressed
records. The record__pushfn() function incorrectly rewrote only the first
record's header to describe the whole blob, which resulted in a bogus
truncated length and corrupted the perf.data file. This patch fixes the
issue by framing each record properly where it is produced, ensuring the
sizes remain valid.

[ ... ]

> diff --git a/tools/perf/util/zstd.c b/tools/perf/util/zstd.c
> index 57027e0ac7b65..1955fa2431d17 100644
> --- a/tools/perf/util/zstd.c
> +++ b/tools/perf/util/zstd.c
> @@ -30,7 +30,7 @@ int zstd_fini(struct zstd_data *data)
>  
>  ssize_t zstd_compress_stream_to_records(struct zstd_data *data, void *dst, size_t dst_size,
>  				       void *src, size_t src_size, size_t max_record_size,
> -				       size_t process_header(void *record, size_t increment))
> +				       size_t process_header(void *record, size_t data_size))
>  {
>  	size_t ret, size, compressed = 0;
>  	ZSTD_inBuffer input = { src, src_size, 0 };

[Severity: Critical]
This isn't a bug introduced by this patch, but is there a missing bounds check
on dst_size during the first iteration in this function?

Looking at zstd_compress_stream_to_records(), the bounds check explicitly
skips the first iteration because hdr_size is initialized to 0:

	while (input.pos < input.size) {
		if (hdr_size &&
		    dst_size < hdr_size + max_record_size + sizeof(u64)) {
			pr_err("failed to compress: output buffer too small\n");
			return -1;
		}
		record = dst;
		size = process_header(record, 0);
		hdr_size = size;
		compressed += size;
		dst += size;
		dst_size -= size;

If this function is called with an initially small dst_size (e.g., < 16),
process_header() will perform an out-of-bounds write of the 16-byte
perf_record_compressed2 header.

Following this, dst_size -= size will unconditionally unsigned underflow to a
massive value. This allows ZSTD_compressStream() to write up to
max_record_size bytes out-of-bounds into the dst buffer.

Could this be triggered in aio mode if perf_mmap__push() splits a wrap-around
kernel ring buffer, and the first compressed chunk leaves less than 16 bytes
in the aio->data buffer? The second call to zstd_compress() would then pass
a dst_size < 16, bypassing the bounds check on the first iteration and
triggering a buffer overflow.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/cover.1781797544.git.d@ilvokhin.com?part=1

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2026-06-18 16:26 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-18 15:59 [PATCH 0/3] perf record: fix multi-record Zstd compression Dmitry Ilvokhin
2026-06-18 15:59 ` [PATCH 1/3] perf record: Fix multiple PERF_RECORD_COMPRESSED2 records per push Dmitry Ilvokhin
2026-06-18 16:26   ` sashiko-bot
2026-06-18 15:59 ` [PATCH 2/3] perf record: Avoid overrunning the zstd output buffer Dmitry Ilvokhin
2026-06-18 16:20   ` sashiko-bot
2026-06-18 15:59 ` [PATCH 3/3] perf record: Don't store raw data on zstd compression failure Dmitry Ilvokhin
2026-06-18 16:24   ` sashiko-bot

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.