All of lore.kernel.org
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Dmitry Ilvokhin" <d@ilvokhin.com>
Cc: linux-perf-users@vger.kernel.org
Subject: Re: [PATCH 2/3] perf record: Avoid overrunning the zstd output buffer
Date: Thu, 18 Jun 2026 16:20:59 +0000	[thread overview]
Message-ID: <20260618162059.8B59B1F000E9@smtp.kernel.org> (raw)
In-Reply-To: <7860259ac13e539dfb907489be29d0487d36aeb8.1781797544.git.d@ilvokhin.com>

Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:

New issues:
- [High] The patch introduces an overly strict buffer size check that causes compression to fail prematurely whenever less than `max_record_size` (~64KB) space remains.

Pre-existing issues:
- [High] The safety bounds check added to `zstd_compress_stream_to_records` is bypassed on the first loop iteration, leaving the buffer vulnerable to overflow and integer underflow.
- [High] In `zstd_decompress_stream()`, the `output.dst` pointer is advanced while `output.pos` is improperly preserved, leading to out-of-bounds writes and corrupted output if the decompression loop executes more than once.
- [High] If `perf_mmap__push` processes the ring buffer in two chunks and the second chunk's compression fails, the reference to `map->core` acquired during the first chunk is leaked.
--

commit 09f418a2fa19ace0234b4443e193f65fb347ed8b
Author: Dmitry Ilvokhin <d@ilvokhin.com>

perf record: Avoid overrunning the zstd output buffer

This commit adds a bounds check in zstd_compress_stream_to_records() to
prevent writing past the end of the output buffer. It verifies if the
remaining buffer size is sufficient for a full record.

> diff --git a/tools/perf/util/zstd.c b/tools/perf/util/zstd.c
> index 1955fa2431d17..a0f6572ad2017 100644
> --- a/tools/perf/util/zstd.c
> +++ b/tools/perf/util/zstd.c

[ ... ]

> @@ -53,8 +53,14 @@ ssize_t zstd_compress_stream_to_records(struct zstd_data *data, void *dst, size_
>  	}
>  
>  	while (input.pos < input.size) {
> +		if (hdr_size &&
> +		    dst_size < hdr_size + max_record_size + sizeof(u64)) {

[Severity: High]
This isn't a bug introduced by this patch, but by gating the bounds check on
hdr_size, isn't the check bypassed on the very first loop iteration since
hdr_size is initialized to 0?

If dst_size is initially very small (which can legitimately happen in 
record__aio_pushfn() when calculating mmap__mmap_len(map) - aio->size), the 
code will unconditionally write the header and decrement dst_size.

Could this underflow dst_size to a massive positive value, bypassing
subsequent bounds checks and allowing ZSTD_compressStream() to write out of
bounds?

[Severity: High]
Does this check introduce an overly strict requirement that causes compression 
to fail prematurely?

By requiring dst_size < hdr_size + max_record_size + sizeof(u64), this 
condition unconditionally fails compression if the remaining buffer is less 
than the maximum record size (typically ~64KB).

In AIO mode, dst_size represents the remaining space in the AIO buffer, which
naturally dwindles. If a small chunk needs to be compressed and could fit in
the remaining space, wouldn't this check spuriously abort the compression and
render the last ~64KB of every buffer unusable?

> +			pr_err("failed to compress: output buffer too small\n");
> +			return -1;
> +		}
>  		record = dst;
>  		size = process_header(record, 0);
> +		hdr_size = size;
>  		compressed += size;
>  		dst += size;
>  		dst_size -= size;

[Severity: High]
This isn't a bug introduced by this patch, but looking at
zstd_decompress_stream() in the same file, is there an issue with how
output.pos is handled?

tools/perf/util/zstd.c:zstd_decompress_stream() {
    ...
    output.dst  = dst + output.pos;
    output.size = dst_size - output.pos;
    ...
}

The loop advances the destination pointer but fails to clear output.pos.

If the decompression loop executes more than once, will ZSTD start writing
at the new output.dst + output.pos, effectively skipping output.pos bytes
and corrupting the decompressed stream?

[Severity: High]
This isn't a bug introduced by this patch, but does the strict bounds check
introduced above make a memory leak in AIO mode more easily triggerable?

In tools/perf/builtin-record.c:record__aio_pushfn(), a reference to map->core 
is acquired on the first successful chunk:

tools/perf/builtin-record.c:record__aio_pushfn() {
    ...
    if (!aio->size) {
        perf_mmap__get(&map->core);
    }
    ...
}

If perf_mmap__push() processes the ring buffer in two chunks, and the
second chunk's compression fails (which happens more frequently now due
to the ~64KB bounds check), perf_mmap__push() aborts and passes the error
back to record__aio_push():

tools/perf/builtin-record.c:record__aio_push() {
    ...
    ret = perf_mmap__push(map, &aio, record__aio_pushfn);
    if (ret != 0)
        return ret;
    ...
}

Will this return immediately on failure and bypass the perf_mmap__put()
cleanup that occurs later, resulting in a leak of the map->core resource?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/cover.1781797544.git.d@ilvokhin.com?part=2

  reply	other threads:[~2026-06-18 16:20 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-18 15:59 [PATCH 0/3] perf record: fix multi-record Zstd compression Dmitry Ilvokhin
2026-06-18 15:59 ` [PATCH 1/3] perf record: Fix multiple PERF_RECORD_COMPRESSED2 records per push Dmitry Ilvokhin
2026-06-18 16:26   ` sashiko-bot
2026-06-18 15:59 ` [PATCH 2/3] perf record: Avoid overrunning the zstd output buffer Dmitry Ilvokhin
2026-06-18 16:20   ` sashiko-bot [this message]
2026-06-18 15:59 ` [PATCH 3/3] perf record: Don't store raw data on zstd compression failure Dmitry Ilvokhin
2026-06-18 16:24   ` sashiko-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260618162059.8B59B1F000E9@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=d@ilvokhin.com \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.