Linux Perf Users
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Athira Rajeev" <atrajeev@linux.ibm.com>
Cc: linux-perf-users@vger.kernel.org
Subject: Re: [PATCH 7/9] perf tools powerpc: Add physical to logical address mapping for HTM traces
Date: Wed, 01 Jul 2026 09:07:25 +0000	[thread overview]
Message-ID: <20260701090726.34FF71F00A3A@smtp.kernel.org> (raw)
In-Reply-To: <20260701084115.80383-8-atrajeev@linux.ibm.com>

Thank you for your contribution! Sashiko AI review found 8 potential issue(s) to consider:
- [Critical] Multiple integer overflows exist when calculating allocation sizes on 32-bit systems, leading to severe heap buffer overflows.
- [Critical] The `write_htm` function blindly trusts data from the untrusted AUX buffer to determine how much memory to write to the translation file, allowing an attacker to leak sensitive heap memory.
- [High] Trace analysis improperly queries the host machine's `/proc/powerpc/lparcfg` rather than utilizing recorded trace metadata, breaking cross-platform analysis.
- [High] Missing bounds check and an off-by-one chunk error in `create_mem_maps` lead to out-of-bounds memory reads and misinterpretation of chunk headers.
- [High] Logical addresses are truncated on 32-bit hosts, corrupting addresses greater than 4GB.
- [High] Unconditional byte swapping of HTM memory map entries corrupts data on matching-endian architectures.
- [Medium] Missing explicit POSIX libc header inclusions will cause compilation failures under musl libc.
- [Medium] The unchecked return value of `add_map_entry` silently drops trace entries when memory allocation fails.
--

commit bb5ab27d1cf88495a27db7cca665426d73fc046d
Author: Tanushree Shah <tshah@linux.ibm.com>
Subject: perf tools powerpc: Add physical to logical address mapping for HTM traces

This commit adds support for mapping physical addresses from HTM traces
to logical addresses. It reads the current partition ID and extracts memory
map entries, then parses decoded HTM trace files to generate mappings.

> diff --git a/tools/perf/util/powerpc-htm.c b/tools/perf/util/powerpc-htm.c
> index 487989ca4fc7d..83253850870c2 100644
> --- a/tools/perf/util/powerpc-htm.c
> +++ b/tools/perf/util/powerpc-htm.c
> @@ -20,6 +20,13 @@
>  #include "sample.h"
>  #include <sys/types.h>
>  #include <sys/wait.h>
> +#include <regex.h>
> +#include <ctype.h>
> +#include <errno.h>

[Severity: Medium]
Could we include the standard posix headers here? Functions like fopen,
malloc, free, and strerror are used in this file, but <stdio.h>,
<stdlib.h>, <string.h>, and <fcntl.h> are missing. This causes build
failures when compiling with musl libc.

[ ... ]
> +struct mem_entries {
> +	unsigned long long phy_addr;
> +	unsigned long logical_addr;

[Severity: High]
Will this truncate logical addresses on 32-bit analysis hosts? PowerPC uses
64-bit addresses, but logical_addr is defined as an unsigned long. This
corrupts logical addresses over 4GB and causes the UINT64_MAX sentinel check
to fail when analyzed on a 32-bit system.

> +	u32 lp_index;
> +	u64 size;
> +};

[ ... ]
> +static int add_map_entry(struct addr_map **arr, size_t *count, size_t *cap, struct addr_map entry)
> +{
> +	if (*count >= *cap) {
> +		size_t new_cap = (*cap == 0) ? 1024 : (*cap * 2);
> +		void *tmp = safe_realloc(*arr, new_cap * sizeof(struct addr_map));

[Severity: Critical]
Could this multiplication wrap around on 32-bit systems? If new_cap is
sufficiently large from a large trace file, the size calculation can
overflow. This allocates a smaller buffer than expected, leading to
sequential out-of-bounds writes.

> +
> +		if (!tmp)
> +			return -1;
> +		*arr = tmp;
> +		*cap = new_cap;
> +
> +	}
> +
> +	(*arr)[(*count)++] = entry;
> +	return 0;
> +}

[ ... ]
> +static unsigned long find_logical_addr(unsigned long long given_addr,
> +				struct mem_entries *mem_entries_array,
> +				size_t n_entries,
> +				u32 lp_filter)
> +{
> +	for (size_t i = 0; i < n_entries; i++) {
[ ... ]
> +		if (start <= given_addr && given_addr < end &&
> +			mem_entries_array[i].lp_index == lp_filter) {
> +			unsigned long long offset = given_addr - start;
> +			unsigned long logical = mem_entries_array[i].logical_addr + offset;

[Severity: High]
This variable also uses unsigned long, which truncates the 64-bit address on
32-bit architectures. Does it make sense to use a fixed-width type like u64
or unsigned long long here and in the structures?

> +
> +			pr_debug("DEBUG: Condition hit at i=%zu given_addr=0x%llx start=0x%llx end=0x%llx lp_index=%u\n",

[ ... ]
> +				entry.phys_addr   = phys_addr;
> +				entry.logical_addr = logical_addr;
> +
> +				add_map_entry(&maps, &count, &cap, entry);

[Severity: Medium]
Is it safe to ignore the return value here? If safe_realloc fails due to
memory pressure, add_map_entry returns -1. Ignoring this silently drops
trace entries without any error propagation.

> +
> +				ptr += pmatch[0].rm_eo;

[ ... ]
> +	/* get the lp index */
> +	fp = fopen(lpar_cfg, "r");

[Severity: High]
Does this correctly handle cross-platform analysis? Reading from the local
host's /proc/powerpc/lparcfg means trace decoding might fail or misidentify
the partition if the analysis is run on an x86 machine or a different
system. Should this query the recorded trace metadata via perf_env instead?

> +	if (!fp) {

[ ... ]
> +	mem_entries_array = malloc(n_entries * sizeof(struct mem_entries));

[Severity: Critical]
Could this allocation overflow on 32-bit systems? If n_entries is very
large, the multiplication wraps around, leading to a small allocation that
will be followed by out-of-bounds writes in the initialization loop below.

> +	if (!mem_entries_array) {

[ ... ]
> +	/* get the HTM memory map data and store it in mem_entries_array
> +	 * to use it later on for physical->logical mapping
> +	 */
> +	for (u64 i = 0; i < n_entries; i++, mem++) {

[Severity: High]
Can this loop read past the end of the memory mapped region? The variable mem
starts at mapped_data + 0x20. If htm->htm_mem_entries includes chunk headers
across all auxiliary buffers, skipping the first 32-byte header but still
iterating n_entries times could cause the loop to read out of bounds and
misinterpret subsequent chunk headers.

> +		mem_entries_array[i].phy_addr = bswap_64(mem->phy_real);
> +		mem_entries_array[i].logical_addr = bswap_64(mem->logical_real);
> +		mem_entries_array[i].lp_index = bswap_32(mem->lp_index);
> +		mem_entries_array[i].size = bswap_64(mem->size);

[Severity: High]
Should this use endian-aware helpers like be64_to_cpu() instead of
unconditional byte swapping? Applying bswap_64() unconditionally corrupts
the native data format when the offline analysis is run on a big-endian host
like ppc64.

> +	}


[Severity: Critical]
Is it possible for write_htm() to read out of bounds from heap memory?

The function extracts an unbounded size from an untrusted buffer and passes
it directly to fwrite():

write_htm() {
    ...
    num_entries = data + 0x10;
    entries = be64_to_cpu(*num_entries);
    entries++;
    written = fwrite(data, 32, entries, fp);
    ...
}

A trace file with a small size but massive num_entries could cause an
out-of-bounds heap read, leaking sensitive memory into the translation file.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260701084115.80383-1-atrajeev@linux.ibm.com?part=7

  reply	other threads:[~2026-07-01  9:07 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-07-01  8:41 [PATCH 0/9] tools/perf: Add interface to expose HTM trace data via perf Athira Rajeev
2026-07-01  8:41 ` [PATCH 1/9] tool/perf: Move auxtrace_record__init for powerpc-vpadtl as separate utility Athira Rajeev
2026-07-01  8:56   ` sashiko-bot
2026-07-01  8:41 ` [PATCH 2/9] tools/perf: Add CONFIG_AUXTRACE support for HTM pmu on powerpc Athira Rajeev
2026-07-01  8:55   ` sashiko-bot
2026-07-01  8:41 ` [PATCH 3/9] tools/perf: Add arch_record__collect_final_data to collect additional data before closing the event Athira Rajeev
2026-07-01  8:54   ` sashiko-bot
2026-07-01  8:41 ` [PATCH 4/9] tools/perf: Add powerpc callback support for arch_record__collect_final_data Athira Rajeev
2026-07-01  8:55   ` sashiko-bot
2026-07-01  8:41 ` [PATCH 5/9] tools/perf: process htm auxtrace events and display in perf report -D Athira Rajeev
2026-07-01  9:05   ` sashiko-bot
2026-07-01  8:41 ` [PATCH 6/9] perf tools powerpc: Add HTM trace data processing and decoding support Athira Rajeev
2026-07-01  9:06   ` sashiko-bot
2026-07-01  8:41 ` [PATCH 7/9] perf tools powerpc: Add physical to logical address mapping for HTM traces Athira Rajeev
2026-07-01  9:07   ` sashiko-bot [this message]
2026-07-01  8:41 ` [PATCH 8/9] tools/perf/powerpc: Add event name as htm of PERF_TYPE_SYNTH type to present htm samples Athira Rajeev
2026-07-01  9:12   ` sashiko-bot
2026-07-01  8:41 ` [PATCH 9/9] tools/perf/powerpc: Add logical address in decoded traces Athira Rajeev
2026-07-01  9:13   ` sashiko-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260701090726.34FF71F00A3A@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=atrajeev@linux.ibm.com \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox