All of lore.kernel.org
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Arnaldo Carvalho de Melo" <acme@kernel.org>
Cc: linux-perf-users@vger.kernel.org
Subject: Re: [PATCH 5/8] perf header: Use a max number of command line args
Date: Wed, 08 Apr 2026 18:17:47 +0000	[thread overview]
Message-ID: <20260408181748.ABAD2C19421@smtp.kernel.org> (raw)
In-Reply-To: <20260408172846.96360-6-acme@kernel.org>

Sashiko AI review found 5 potential issue(s):
- [Critical] Incomplete Mitigation for Heap Buffer Overflow (Ignores section bounds)
- [High] Incomplete Mitigation for Integer Overflow in `cmdline` allocation
- [High] Pre-existing Out-of-Bounds Read and Heap Data Leak in `do_read_string`
- [High] Pre-existing Inconsistent Error State Leads to NULL Pointer Dereference
- [Medium] Silent Truncation of Remaining Header Features on Large Command Lines
--

commit f7dfe248f37e0c8cc488e6a1ae29de4882de3598
Author: Arnaldo Carvalho de Melo <acme@redhat.com>

perf header: Use a max number of command line args

This commit limits the maximum number of command line arguments processed
from the perf.data header to avoid overflows.

> diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
> --- a/tools/perf/util/header.c
> +++ b/tools/perf/util/header.c
[ ... ]
> @@ -2795,6 +2795,9 @@ static int process_cmdline(struct feat_fd *ff, void *data __maybe_unused)
>  	if (do_read_u32(ff, &nr))
>  		return -1;
>  
> +	if (nr > MAX_CMDLINE_NR)
> +		return -1;

If nr exceeds this limit, returning -1 here aborts the rest of the section
processing. Since perf_session__read_header() ignores errors from
perf_header__process_sections(), does this silently discard remaining
header features (like CPU_TOPOLOGY) without warning the user?

Furthermore, valid command lines can exceed 32,768 arguments when using
wildcards, which might result in silently dropping features on valid data.

>  	env->nr_cmdline = nr;
>  
>  	cmdline = zalloc(ff->size + nr + 1);

The commit message notes this is to avoid overflows, but can
ff->size + nr + 1 still integer overflow? Since ff->size is a 64-bit
value read directly from the user-controlled perf.data header, a value near
SIZE_MAX could wrap the allocation size to a small number. zalloc() would
succeed, but it could lead to a heap buffer overflow during the memcpy()
operations below.

Also, if zalloc() fails here, or do_read_string() fails in the loop,
env->nr_cmdline remains set to nr while env->cmdline_argv is NULL.
Could this cause a NULL pointer dereference later in functions like
print_cmdline() that iterate over env->nr_cmdline?

>  	if (!cmdline)
>  		return -1;
>  
>  	argv = zalloc(sizeof(char *) * (nr + 1));
>  	if (!argv)
>  		goto error;
>  
>  	for (i = 0; i < nr; i++) {
>  		str = do_read_string(ff);
>  		if (!str)
>  			goto error;
>  
>  		argv[i] = cmdline + len;
>  		memcpy(argv[i], str, strlen(str) + 1);

While looking at the overflow mitigations here, I noticed a couple of
potential issues with how do_read_string() interacts with this buffer.

First, do_read_string() reads strings without bounding them to ff->size.
If a file provides a small ff->size (resulting in a small cmdline
allocation) but embeds huge strings, will this memcpy() write past the
end of the cmdline buffer?

Second, if the data returned by do_read_string() lacks a null terminator,
will strlen(str) read out of bounds into adjacent heap memory, causing the
subsequent memcpy() to leak that heap data into the cmdline buffer?

>  		len += strlen(str) + 1;
>  		free(str);
>  	}
>  	env->cmdline = cmdline;
>  	env->cmdline_argv = (const char **) argv;
>  	return 0;

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260408172846.96360-1-acme@kernel.org?part=5

  reply	other threads:[~2026-04-08 18:17 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-08 17:28 [PATCHES perf-tools-next v5 0/7] Cleanups and fixes Arnaldo Carvalho de Melo
2026-04-08 17:28 ` [PATCH 1/8] perf symbols: Make variable receiving result strrchr() const Arnaldo Carvalho de Melo
2026-04-08 17:28 ` [PATCH 2/8] perf util: Kill die() prototype, dead for a long time Arnaldo Carvalho de Melo
2026-04-08 17:28 ` [PATCH 3/8] perf tools: Make more global variables static Arnaldo Carvalho de Melo
2026-04-08 17:28 ` [PATCH 4/8] perf bench: Constify tables Arnaldo Carvalho de Melo
2026-04-08 17:28 ` [PATCH 5/8] perf header: Use a max number of command line args Arnaldo Carvalho de Melo
2026-04-08 18:17   ` sashiko-bot [this message]
2026-04-08 17:28 ` [PATCH 6/8] perf header: Do validation of perf.data HEADER_CPU_DOMAIN_INFO Arnaldo Carvalho de Melo
2026-04-08 18:38   ` sashiko-bot
2026-04-08 17:28 ` [PATCH 7/8] perf tools: Use calloc() where applicable Arnaldo Carvalho de Melo
2026-04-08 17:28 ` [PATCH 8/8] perf tools: Replace basename() calls with perf_basename() Arnaldo Carvalho de Melo
  -- strict thread matches above, loose matches on Subject: below --
2026-04-08 17:31 [PATCHES perf-tools-next v6 0/8] Cleanups and fixes Arnaldo Carvalho de Melo
2026-04-08 17:32 ` [PATCH 5/8] perf header: Use a max number of command line args Arnaldo Carvalho de Melo
2026-04-08 18:09   ` sashiko-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260408181748.ABAD2C19421@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=acme@kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=sashiko@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.