public inbox for linux-perf-users@vger.kernel.org
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Arnaldo Carvalho de Melo" <acme@kernel.org>
Cc: linux-perf-users@vger.kernel.org
Subject: Re: [PATCH 03/13] perf header: Sanity check HEADER_NRCPUS and HEADER_CPU_DOMAIN_INFO
Date: Fri, 10 Apr 2026 22:45:55 +0000	[thread overview]
Message-ID: <20260410224556.67892C19421@smtp.kernel.org> (raw)
In-Reply-To: <20260410220905.200051-4-acme@kernel.org>

Sashiko AI review found 4 potential issue(s):
- [Medium] Mismatch in domain array traversal bounds causes a memory leak on malformed domain indices.
- [Medium] Unbounded `nr_cpus_avail` allows massive memory allocations and integer cast sign extension.
- [Medium] Integer overflow in the section size validation check bypasses bounds validation.
- [Low] Format specifier mismatch in error messages prints large values as negative numbers.
--

commit 05b748d77f989fb396557569862fea65e5d2e722
Author: Arnaldo Carvalho de Melo <acme@redhat.com>

perf header: Sanity check HEADER_NRCPUS and HEADER_CPU_DOMAIN_INFO

This commit adds sanity checks to process_nrcpus and process_cpu_domain_info
to validate CPU counts and prevent duplicate domain allocations.

> diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
> --- a/tools/perf/util/header.c
> +++ b/tools/perf/util/header.c
[ ... ]
> @@ -2722,6 +2724,13 @@ static int process_nrcpus(struct feat_fd *ff, void *data __maybe_unused)
>  	ret = do_read_u32(ff, &nr_cpus_online);
>  	if (ret)
>  		return ret;
> +
> +	if (nr_cpus_online > nr_cpus_avail) {
> +		pr_err("Invalid HEADER_NRCPUS: nr_cpus_online (%u) > nr_cpus_avail (%u)\n",
> +		       nr_cpus_online, nr_cpus_avail);
> +		return -1;
> +	}
> +

Since there is no absolute upper limit checked for nr_cpus_avail here, could
a malformed perf.data file provide an excessively large value?

For example, if nr_cpus_online is 1 and nr_cpus_avail is 0x40000000,
it seems this check would pass.

Later, env->nr_cpus_avail is cast to int, which would result in a negative
value if nr_cpus_avail exceeds INT_MAX. Additionally, in
process_cpu_domain_info, this could lead to a massive memory allocation
when allocating the cd_map array.

[ ... ]
> @@ -3698,6 +3707,17 @@ static int process_cpu_domain_info(struct feat_fd *ff, void *data __maybe_unused
>  	nra = env->nr_cpus_avail;
>  	nr = env->nr_cpus_online;
>  
> +	if (nra == 0 || nr == 0) {
> +		pr_err("Invalid HEADER_CPU_DOMAIN_INFO: missing HEADER_NRCPUS\n");
> +		return -1;
> +	}
> +
> +	if (ff->size < 2 * sizeof(u32) + nr * 2 * sizeof(u32)) {

Can this size validation be bypassed due to an integer overflow?

Because nr is a 32-bit unsigned integer, if a crafted file provides a very
large value like 0x80000000, nr * 2 could overflow to 0 before the promotion
to size_t. This would evaluate the requirement to just 8 bytes, completely
bypassing the bounds check.

> +		pr_err("Invalid HEADER_CPU_DOMAIN_INFO: section too small (%zu) for %u CPUs\n",
> +		       (size_t)ff->size, nr);
> +		return -1;
> +	}
> +
[ ... ]
> @@ -3714,6 +3734,18 @@ static int process_cpu_domain_info(struct feat_fd *ff, void *data __maybe_unused
>  	if (ret)
>  		return ret;
>  
> +	/*
> +	 * Sanity check: real systems have at most ~10 sched domain levels
> +	 * (SMT, CLS, MC, PKG + NUMA hops). Reject obviously bogus values
> +	 * from malformed perf.data files before they cause excessive
> +	 * allocation in the per-CPU loop.
> +	 */
> +	if (max_sched_domains > MAX_SCHED_DOMAINS) {
> +		pr_err("Invalid HEADER_CPU_DOMAIN_INFO: max_sched_domains %u > %u\n",
> +		       max_sched_domains, MAX_SCHED_DOMAINS);
> +		return -1;
> +	}

This isn't a bug, but some of the error messages in this function (such as
the nearby checks for cpu >= nra and domain >= max_sched_domains) use the
%d format specifier for unsigned 32-bit integers. If a malformed file
provides values >= 0x80000000, would they be logged as confusing negative
numbers? Updating them to %u might improve error logging clarity.

[ ... ]
> @@ -3760,7 +3797,13 @@ static int process_cpu_domain_info(struct feat_fd *ff, void *data __maybe_unused
>  			if (!d_info)
>  				return -1;
>  
> -			assert(cd_map[cpu]->domains[domain] == NULL);
> +			if (cd_map[cpu]->domains[domain]) {
> +				pr_err("Invalid HEADER_CPU_DOMAIN_INFO: duplicate domain %u for cpu %u\n",
> +				       domain, cpu);
> +				free(d_info);
> +				return -1;
> +			}
> +
>  			cd_map[cpu]->domains[domain] = d_info;

If a malformed file specifies a domain index such that
nr_domains <= domain < max_sched_domains, does this create a memory leak?

The array is sized by max_sched_domains, so the allocation succeeds and is
stored safely here. However, if free_cpu_domain_info only loops up to
nr_domains - 1 during teardown, any structures placed at an index >=
nr_domains (including their dynamically allocated strings) might never be
accessed and permanently leaked.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260410220905.200051-1-acme@kernel.org?part=3

  reply	other threads:[~2026-04-10 22:45 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-10 22:08 [PATCHES perf-tools-next v2 00/13] Sanity check perf.data headers Arnaldo Carvalho de Melo
2026-04-10 22:08 ` [PATCH 01/13] perf header: Validate nr_domains when reading HEADER_CPU_DOMAIN_INFO Arnaldo Carvalho de Melo
2026-04-10 22:08 ` [PATCH 02/13] perf header: Bump up the max number of command line args allowed Arnaldo Carvalho de Melo
2026-04-10 22:34   ` sashiko-bot
2026-04-10 22:08 ` [PATCH 03/13] perf header: Sanity check HEADER_NRCPUS and HEADER_CPU_DOMAIN_INFO Arnaldo Carvalho de Melo
2026-04-10 22:45   ` sashiko-bot [this message]
2026-04-10 22:08 ` [PATCH 04/13] perf header: Sanity check HEADER_CPU_TOPOLOGY Arnaldo Carvalho de Melo
2026-04-10 22:38   ` sashiko-bot
2026-04-10 22:08 ` [PATCH 05/13] perf header: Sanity check HEADER_NUMA_TOPOLOGY Arnaldo Carvalho de Melo
2026-04-10 22:28   ` sashiko-bot
2026-04-10 22:08 ` [PATCH 06/13] perf header: Sanity check HEADER_MEM_TOPOLOGY Arnaldo Carvalho de Melo
2026-04-10 22:32   ` sashiko-bot
2026-04-10 22:08 ` [PATCH 07/13] perf header: Sanity check HEADER_PMU_MAPPINGS Arnaldo Carvalho de Melo
2026-04-10 22:33   ` sashiko-bot
2026-04-10 22:09 ` [PATCH 08/13] perf header: Sanity check HEADER_GROUP_DESC Arnaldo Carvalho de Melo
2026-04-10 22:28   ` sashiko-bot
2026-04-10 22:09 ` [PATCH 09/13] perf header: Sanity check HEADER_CACHE Arnaldo Carvalho de Melo
2026-04-10 22:09 ` [PATCH 10/13] perf header: Sanity check HEADER_HYBRID_TOPOLOGY Arnaldo Carvalho de Melo
2026-04-10 22:09 ` [PATCH 11/13] perf header: Sanity check HEADER_PMU_CAPS Arnaldo Carvalho de Melo
2026-04-10 22:09 ` [PATCH 12/13] perf header: Sanity check HEADER_BPF_PROG_INFO Arnaldo Carvalho de Melo
2026-04-10 22:09 ` [PATCH 13/13] perf header: Add sanity checks to HEADER_BPF_BTF processing Arnaldo Carvalho de Melo
  -- strict thread matches above, loose matches on Subject: below --
2026-04-10  0:39 [PATCHES perf-tools-next v1 00/13] Sanity check perf.data headers Arnaldo Carvalho de Melo
2026-04-10  0:39 ` [PATCH 03/13] perf header: Sanity check HEADER_NRCPUS and HEADER_CPU_DOMAIN_INFO Arnaldo Carvalho de Melo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260410224556.67892C19421@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=acme@kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=sashiko@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox