* [PATCHES v2 00/11] perf tools: Assorted fixes
@ 2026-06-08  1:30 Arnaldo Carvalho de Melo
From: Arnaldo Carvalho de Melo @ 2026-06-08  1:30 UTC
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo

Hi,

Sixth batch of pre-existing bug fixes found by sashiko-bot AI review
during the perf-data-validation hardening series.  All bugs are latent
in existing code — none were introduced by the hardening patches.

Three broad categories:

1. snprintf() accumulation overflows (patches 2, 9, 10, 11):

   Several functions accumulate formatted output via ret += snprintf().
   snprintf() returns the would-have-been-written count, so on truncation
   ret overshoots the buffer size and the next 'size - ret' underflows
   to a huge unsigned value, disabling bounds checking.  Switched to
   scnprintf() which returns actual bytes written.

   Affected: cpu_map__snprint(), snprintf_hex(),
   synthesize_bpf_prog_name(), hists__scnprintf_title(),
   build_id__snprintf(), hwmon_pmu__describe_items().
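
To make the wraparound in category 1 concrete, here is a minimal
userspace sketch.  It assumes a local helper, scnprintf_sketch(), that
mimics the return convention of scnprintf(); that helper, the two
remaining_after_*() functions and the format string are illustrative
and do not come from the patches.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical userspace stand-in for scnprintf(): returns the number of
 * bytes actually stored in buf, excluding the trailing NUL.
 */
static int scnprintf_sketch(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (size == 0)
		return 0;

	va_start(ap, fmt);
	n = vsnprintf(buf, size, fmt, ap);
	va_end(ap);

	if (n < 0)
		return 0;
	return (size_t)n >= size ? (int)(size - 1) : n;
}

/* Buggy pattern: after truncation ret exceeds size, so size - ret wraps. */
static size_t remaining_after_snprintf(char *buf, size_t size)
{
	size_t ret = 0;

	ret += snprintf(buf + ret, size - ret, "%s", "0123456789");
	return size - ret;	/* huge value once ret > size */
}

/* Fixed pattern: ret never exceeds size - 1, so size - ret stays >= 1. */
static size_t remaining_after_scnprintf(char *buf, size_t size)
{
	size_t ret = 0;

	ret += scnprintf_sketch(buf + ret, size - ret, "%s", "0123456789");
	return size - ret;
}
```

Called with an 8-byte buffer, remaining_after_snprintf() returns
(size_t)-2, which a following snprintf() would accept as its size
argument, while remaining_after_scnprintf() returns 1.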

2. Missing safety checks on untrusted data (patches 1, 3, 6, 7):

   - get_max_num(): size_t underflow on empty sysfs file causes heap
     over-read.
   - machine__resolve(): unguarded env->cpu[] access with untrusted
     CPU index — switched to perf_env__get_cpu_topology() accessor.
   - timehist: test_bit(prio, ...) without bounds check on untrusted
     tracepoint priority.
   - idle-hist: rb_first_cached() on a tree populated with plain
     rb_insert_color() — rb_leftmost never set, callchains silently
     dropped.
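
The get_max_num() item can be illustrated with a small sketch.  It
assumes the parser scans the buffer right to left with a pre-decremented
size_t index; the function names below are hypothetical and the real
code may differ in detail.

```c
#include <stddef.h>

/*
 * Assumed buggy shape: with len == 0 the pre-decrement wraps to SIZE_MAX,
 * and a loop using it as an index would read far outside the buffer.
 * The wrapped index is only returned here, never dereferenced.
 */
static size_t first_index_unguarded(size_t len)
{
	return --len;
}

/* Guarded scan: find where the last number starts in "0-7" or "0,2-5". */
static int last_number_start(const char *buf, size_t len, size_t *start)
{
	if (len == 0)
		return -1;	/* empty file: nothing to parse */

	while (--len) {
		if (buf[len] == ',' || buf[len] == '-') {
			len++;
			break;
		}
	}
	*start = len;
	return 0;
}
```

For an empty buffer the guarded version returns -1 instead of scanning,
and for "0-7" it reports that the last number starts at offset 2.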

3. Resource hygiene (patches 4, 5, 8):

   - mbind() bitmap allocation one bit short of what kernel reads.
   - bitmap_free() without NULLing the pointer (3 call sites).
   - O_CLOEXEC missing from open() calls in DSO and ELF code
     (12 call sites across 2 files).

Also expanded the libperf ABI TODO (tools/lib/perf/TODO) to emphasize
the code simplification argument for widening struct perf_cpu.cpu from
int16_t to int — the narrow type forces defensive truncation checks at
every boundary where wider CPU indices are narrowed.

 tools/lib/perf/TODO          |  7 +++++++
 tools/perf/builtin-record.c  |  1 +
 tools/perf/builtin-sched.c   |  7 +++++--
 tools/perf/util/bpf-event.c  | 11 ++++++-----
 tools/perf/util/build-id.c   |  2 +-
 tools/perf/util/cpumap.c     | 24 +++++++++++++++---------
 tools/perf/util/dso.c        |  4 ++--
 tools/perf/util/event.c      |  9 +++++++--
 tools/perf/util/header.c     |  4 +++-
 tools/perf/util/hist.c       |  7 ++++---
 tools/perf/util/hwmon_pmu.c  | 12 ++++++------
 tools/perf/util/mmap.c       |  4 +++-
 tools/perf/util/symbol-elf.c | 20 ++++++++++----------
 13 files changed, 70 insertions(+), 42 deletions(-)

Changes since v1:
  - Patch 6: fix prio bounds-check logic — the v1 condition
    (prio < 0 || prio >= MAX_PRIO || !test_bit(...)) incorrectly
    skipped events with unknown priority (prio == -1).  Changed to
    (prio >= 0 && (prio >= MAX_PRIO || !test_bit(...))) to preserve
    the original pass-through for events without priority info.
    (Found by sashiko-bot lore review)
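
The difference between the two conditions can be checked in isolation.
This is a sketch only: prio_selected() is a hypothetical stand-in for
test_bit() on a byte array, and MAX_PRIO_SKETCH is an assumed bitmap
size chosen for illustration.

```c
#include <stdbool.h>

#define MAX_PRIO_SKETCH 140	/* assumed bitmap size, for illustration only */

/* Hypothetical stand-in for test_bit(prio, bitmap) on a byte array. */
static bool prio_selected(const unsigned char *bitmap, int prio)
{
	return bitmap[prio / 8] & (1u << (prio % 8));
}

/* v1 condition: also skips prio == -1 (event carries no priority info). */
static bool skip_v1(const unsigned char *bitmap, int prio)
{
	return prio < 0 || prio >= MAX_PRIO_SKETCH ||
	       !prio_selected(bitmap, prio);
}

/* v2 condition: only known priorities are filtered; -1 passes through. */
static bool skip_v2(const unsigned char *bitmap, int prio)
{
	return prio >= 0 &&
	       (prio >= MAX_PRIO_SKETCH || !prio_selected(bitmap, prio));
}
```

Both variants keep out-of-range values away from the bitmap lookup; they
differ only for negative prio, where skip_v1() returns true and
skip_v2() returns false.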


Thanks,

- Arnaldo

* [PATCHES v4 00/11] perf tools: Assorted fixes
@ 2026-06-09  1:05 Arnaldo Carvalho de Melo
From: Arnaldo Carvalho de Melo @ 2026-06-09  1:05 UTC
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo

Hi,

Sixth batch of pre-existing bug fixes found by sashiko-bot AI review
during the perf-data-validation hardening series.  All bugs are latent
in existing code — none were introduced by the hardening patches.

Three broad categories:

1. snprintf() accumulation overflows (patches 2, 8, 9, 10):

   Several functions accumulate formatted output via ret += snprintf().
   snprintf() returns the would-have-been-written count, so on truncation
   ret overshoots the buffer size and the next 'size - ret' underflows
   to a huge unsigned value, disabling bounds checking.  Switched to
   scnprintf() which returns actual bytes written.

   Affected: cpu_map__snprint(), snprintf_hex(),
   synthesize_bpf_prog_name(), hists__scnprintf_title(),
   build_id__snprintf(), hwmon_pmu__describe_items().

2. Missing safety checks on untrusted data (patches 1, 3, 5, 6):

   - get_max_num(): size_t underflow on empty sysfs file causes heap
     over-read.
   - machine__resolve(): unguarded env->cpu[] access with untrusted
     CPU index — switched to perf_env__get_cpu_topology() accessor,
     added bounds check before int16_t truncation.
   - timehist: test_bit(prio, ...) without bounds check on untrusted
     tracepoint priority.
   - idle-hist: rb_first_cached() on a tree populated with plain
     rb_insert_color() — rb_leftmost never set, callchains silently
     dropped.

3. Resource hygiene (patches 4, 7):

   - bitmap_free() without NULLing the pointer (2 call sites).
   - O_CLOEXEC missing from open() calls in DSO and ELF code
     (12 call sites across 2 files).

Patch 11 expands the libperf ABI TODO with the code simplification
argument for widening struct perf_cpu.cpu — the int16_t forces
truncation checks at every boundary where wider CPU indices are
narrowed.
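
A sketch of the kind of truncation check the narrow type forces.
struct cpu16 below is a hypothetical mirror of a 16-bit CPU wrapper,
not the libperf definition, and both helper names are made up for
illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of a 16-bit CPU wrapper; not the libperf type. */
struct cpu16 {
	int16_t cpu;
};

/*
 * Narrowing without a check keeps only the low 16 bits (65536 -> 0).
 * Converting an out-of-range value to int16_t is implementation-defined
 * in C; GCC and Clang wrap modulo 2^16.
 */
static struct cpu16 narrow_unchecked(int cpu)
{
	struct cpu16 c = { .cpu = (int16_t)(uint16_t)cpu };

	return c;
}

/* Check the wide value against the CPU count first, then narrow. */
static bool narrow_checked(int cpu, int nr_cpus, struct cpu16 *out)
{
	if (cpu < 0 || cpu >= nr_cpus || cpu > INT16_MAX)
		return false;
	out->cpu = (int16_t)cpu;
	return true;
}
```

narrow_unchecked(65536).cpu is 0, so a range check applied after
narrowing would accept it; narrow_checked() rejects the same input
because it tests the wide value.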

Arnaldo Carvalho de Melo (11):
  perf tools: Fix get_max_num() size_t underflow on empty sysfs file
  perf tools: Use scnprintf() in cpu_map__snprint() to prevent overflow
  perf tools: Use perf_env__get_cpu_topology() in machine__resolve()
  perf tools: NULL bitmap pointers after bitmap_free()
  perf sched: Bounds-check prio before test_bit() in timehist
  perf sched: Fix idle-hist callchain display using wrong rb_first variant
  perf tools: Add O_CLOEXEC to open() calls in DSO and ELF code
  perf bpf: Use scnprintf() in snprintf_hex() and synthesize_bpf_prog_name()
  perf hists: Fix snprintf() in hists__scnprintf_title() UID filter path
  perf tools: Use scnprintf() in build_id__snprintf() and hwmon read_events()
  libperf: Document code simplification case for widening struct perf_cpu

 tools/lib/perf/TODO          |  8 ++++++++
 tools/perf/builtin-record.c  |  1 +
 tools/perf/builtin-sched.c   |  7 +++++--
 tools/perf/util/bpf-event.c  | 11 ++++++-----
 tools/perf/util/build-id.c   |  4 ++--
 tools/perf/util/cpumap.c     | 24 +++++++++++++++---------
 tools/perf/util/dso.c        |  4 ++--
 tools/perf/util/event.c      | 15 +++++++++++++--
 tools/perf/util/hist.c       |  7 ++++---
 tools/perf/util/hwmon_pmu.c  | 12 ++++++------
 tools/perf/util/mmap.c       |  1 +
 tools/perf/util/symbol-elf.c | 20 ++++++++++----------
 12 files changed, 73 insertions(+), 41 deletions(-)

Changes since v3:
  - Patch 3 (machine__resolve): expanded comment explaining why the
    outer al->cpu < nr_cpus_avail check is needed — the int16_t cast
    to struct perf_cpu silently truncates e.g. 65536 to 0, bypassing
    the accessor's internal bounds check.
    (Ian Rogers review)
  - Patch 10 (build_id__snprintf): fixed loop termination — after
    switching to scnprintf(), offs never reaches bf_size, so the
    loop spun doing zero-byte writes.  Changed condition to
    offs + 1 < bf_size.
    (Found by sashiko-bot, confirmed by Ian Rogers)
  - Patch 11 (TODO wording): fixed "wrap to small positive numbers"
    to "wrap to negative numbers (two's complement)".
    (Found by sashiko-bot, confirmed by Ian Rogers)
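
The loop-termination point for patch 10 can be reproduced with a short
sketch.  scnprintf_sketch() is a hypothetical userspace stand-in for
scnprintf(), and hex_sketch() is an illustrative loop, not the actual
build_id__snprintf() body; the 'strict' flag selects between the two
loop conditions being compared.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for scnprintf(): bytes actually stored, sans NUL. */
static int scnprintf_sketch(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (size == 0)
		return 0;

	va_start(ap, fmt);
	n = vsnprintf(buf, size, fmt, ap);
	va_end(ap);

	if (n < 0)
		return 0;
	return (size_t)n >= size ? (int)(size - 1) : n;
}

/*
 * Hex-format data into bf.  Returns the bytes written and stores the
 * number of loop iterations in *iters.  With strict == 0 the loop tests
 * offs < bf_size; with strict != 0 it tests offs + 1 < bf_size.
 */
static size_t hex_sketch(const unsigned char *data, size_t len,
			 char *bf, size_t bf_size, int strict, size_t *iters)
{
	size_t offs = 0, i;

	*iters = 0;
	for (i = 0; i < len; i++) {
		if (strict ? offs + 1 >= bf_size : offs >= bf_size)
			break;
		offs += scnprintf_sketch(bf + offs, bf_size - offs, "%02x",
					 (unsigned int)data[i]);
		(*iters)++;
	}
	return offs;
}
```

With 8 input bytes and a 4-byte buffer, both variants write 3 bytes, but
the offs < bf_size form runs all 8 iterations (the last 6 writing
nothing) while the offs + 1 < bf_size form stops after 2.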

Changes since v2:
  - Dropped mbind patch (was v2 patch 4): the original code was
    correct.  get_nodes() does --maxnode before computing
    BITS_TO_LONGS, so bitmap_zalloc(node_index + 1) and
    maxnode = node_index + 2 already match.  The commit message
    misstated the kernel-side semantics.
  - Split libperf ABI TODO hunk out of prio patch into standalone
    patch 11.
  - Patch 3 (machine__resolve): bounds-check al->cpu against
    env->nr_cpus_avail before truncating to int16_t struct perf_cpu.
    (Found by sashiko-bot lore review)
  - Patch 4 (was v2 patch 5, bitmap_free): reworded from "Three
    call sites" to "Two call sites" — removed dead store from
    memory_node__delete_nodes() where NULLing a pointer right
    before freeing the containing struct was useless.
    (Found by sashiko-bot lore review)
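
The arithmetic behind dropping the mbind patch can be written out as a
sketch.  It models only what the changelog states (a --maxnode step
before BITS_TO_LONGS); the function names are hypothetical and
bits_to_longs() is a local reimplementation of the usual round-up
division.

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG_SKETCH (sizeof(unsigned long) * CHAR_BIT)

/* Round a bit count up to whole unsigned longs. */
static size_t bits_to_longs(size_t bits)
{
	return (bits + BITS_PER_LONG_SKETCH - 1) / BITS_PER_LONG_SKETCH;
}

/* Userspace side: bitmap sized for node_index + 1 bits. */
static size_t longs_allocated(size_t node_index)
{
	return bits_to_longs(node_index + 1);
}

/* Kernel side as described above: --maxnode, then BITS_TO_LONGS. */
static size_t longs_read_by_kernel(size_t maxnode)
{
	return bits_to_longs(maxnode - 1);
}
```

With maxnode = node_index + 2, both functions evaluate
bits_to_longs(node_index + 1), so the allocation and the read size agree
for every node_index.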

Changes since v1:
  - Patch 5 (was v1 patch 6): fix prio bounds-check logic — the
    v1 condition (prio < 0 || prio >= MAX_PRIO || !test_bit(...))
    incorrectly skipped events with unknown priority (prio == -1).
    Changed to (prio >= 0 && (prio >= MAX_PRIO || !test_bit(...)))
    to preserve the original pass-through for events without
    priority info.
    (Found by sashiko-bot lore review)

Developed with AI assistance (Claude/sashiko), tagged in commits.

Thanks,

- Arnaldo

* [PATCHES v3 00/11] perf tools: Assorted fixes
@ 2026-06-08 20:17 Arnaldo Carvalho de Melo
From: Arnaldo Carvalho de Melo @ 2026-06-08 20:17 UTC
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Hi,

Sixth batch of pre-existing bug fixes found by sashiko-bot AI review
during the perf-data-validation hardening series.  All bugs are latent
in existing code — none were introduced by the hardening patches.

Three broad categories:

1. snprintf() accumulation overflows (patches 2, 8, 9, 10):

   Several functions accumulate formatted output via ret += snprintf().
   snprintf() returns the would-have-been-written count, so on truncation
   ret overshoots the buffer size and the next 'size - ret' underflows
   to a huge unsigned value, disabling bounds checking.  Switched to
   scnprintf() which returns actual bytes written.

   Affected: cpu_map__snprint(), snprintf_hex(),
   synthesize_bpf_prog_name(), hists__scnprintf_title(),
   build_id__snprintf(), hwmon_pmu__describe_items().

2. Missing safety checks on untrusted data (patches 1, 3, 5, 6):

   - get_max_num(): size_t underflow on empty sysfs file causes heap
     over-read.
   - machine__resolve(): unguarded env->cpu[] access with untrusted
     CPU index — switched to perf_env__get_cpu_topology() accessor,
     added bounds check before int16_t truncation.
   - timehist: test_bit(prio, ...) without bounds check on untrusted
     tracepoint priority.
   - idle-hist: rb_first_cached() on a tree populated with plain
     rb_insert_color() — rb_leftmost never set, callchains silently
     dropped.

3. Resource hygiene (patches 4, 7):

   - bitmap_free() without NULLing the pointer (2 call sites).
   - O_CLOEXEC missing from open() calls in DSO and ELF code
     (12 call sites across 2 files).

Also expanded the libperf ABI TODO (tools/lib/perf/TODO) to emphasize
the code simplification argument for widening struct perf_cpu.cpu from
int16_t to int — the narrow type forces defensive truncation checks at
every boundary where wider CPU indices are narrowed.

Arnaldo Carvalho de Melo (10):
  perf tools: Fix get_max_num() size_t underflow on empty sysfs file
  perf tools: Use scnprintf() in cpu_map__snprint() to prevent overflow
  perf tools: Use perf_env__get_cpu_topology() in machine__resolve()
  perf tools: NULL bitmap pointers after bitmap_free()
  perf sched: Bounds-check prio before test_bit() in timehist
  perf sched: Fix idle-hist callchain display using wrong rb_first variant
  perf tools: Add O_CLOEXEC to open() calls in DSO and ELF code
  perf bpf: Use scnprintf() in snprintf_hex() and synthesize_bpf_prog_name()
  perf hists: Fix snprintf() in hists__scnprintf_title() UID filter path
  perf tools: Use scnprintf() in build_id__snprintf() and hwmon read_events()

 tools/lib/perf/TODO          |  7 +++++++
 tools/perf/builtin-record.c  |  1 +
 tools/perf/builtin-sched.c   |  7 +++++--
 tools/perf/util/bpf-event.c  | 11 ++++++-----
 tools/perf/util/build-id.c   |  2 +-
 tools/perf/util/cpumap.c     | 24 +++++++++++++++---------
 tools/perf/util/dso.c        |  4 ++--
 tools/perf/util/event.c      | 11 +++++++++--
 tools/perf/util/hist.c       |  7 ++++---
 tools/perf/util/hwmon_pmu.c  | 12 ++++++------
 tools/perf/util/mmap.c       |  1 +
 tools/perf/util/symbol-elf.c | 20 ++++++++++----------
 12 files changed, 67 insertions(+), 40 deletions(-)

Changes since v2:
  - Dropped mbind patch (was v2 patch 4): the original code was
    correct.  get_nodes() does --maxnode before computing
    BITS_TO_LONGS, so bitmap_zalloc(node_index + 1) and
    maxnode = node_index + 2 already match.  The commit message
    misstated the kernel-side semantics.
  - Patch 3 (machine__resolve): bounds-check al->cpu against
    env->nr_cpus_avail before truncating to int16_t struct perf_cpu.
    (Found by sashiko-bot lore review)
  - Patch 4 (was v2 patch 5, bitmap_free): reworded from "Three
    call sites" to "Two call sites" — removed dead store from
    memory_node__delete_nodes() where NULLing a pointer right
    before freeing the containing struct was useless.
    (Found by sashiko-bot lore review)

Changes since v1:
  - Patch 5 (was v1 patch 6): fix prio bounds-check logic — the
    v1 condition (prio < 0 || prio >= MAX_PRIO || !test_bit(...))
    incorrectly skipped events with unknown priority (prio == -1).
    Changed to (prio >= 0 && (prio >= MAX_PRIO || !test_bit(...)))
    to preserve the original pass-through for events without
    priority info.
    (Found by sashiko-bot lore review)

Developed with AI assistance (Claude/sashiko), tagged in commits.

Thanks,

- Arnaldo

* [PATCHES v1 00/11] perf tools: Assorted fixes
@ 2026-06-07 23:29 Arnaldo Carvalho de Melo
From: Arnaldo Carvalho de Melo @ 2026-06-07 23:29 UTC
  To: Namhyung Kim
  Cc: Ingo Molnar, Thomas Gleixner, James Clark, Jiri Olsa, Ian Rogers,
	Adrian Hunter, Clark Williams, linux-kernel, linux-perf-users,
	Arnaldo Carvalho de Melo, sashiko-bot, Claude Opus 4.6,
	Arnaldo Carvalho de Melo

Hi,

Sixth batch of pre-existing bug fixes found by sashiko-bot AI review
during the perf-data-validation hardening series.  All bugs are latent
in existing code — none were introduced by the hardening patches.

Three broad categories:

1. snprintf() accumulation overflows (patches 2, 9, 10, 11):

   Several functions accumulate formatted output via ret += snprintf().
   snprintf() returns the would-have-been-written count, so on truncation
   ret overshoots the buffer size and the next 'size - ret' underflows
   to a huge unsigned value, disabling bounds checking.  Switched to
   scnprintf() which returns actual bytes written.

   Affected: cpu_map__snprint(), snprintf_hex(),
   synthesize_bpf_prog_name(), hists__scnprintf_title(),
   build_id__snprintf(), hwmon_pmu__describe_items().

2. Missing safety checks on untrusted data (patches 1, 3, 6, 7):

   - get_max_num(): size_t underflow on empty sysfs file causes heap
     over-read.
   - machine__resolve(): unguarded env->cpu[] access with untrusted
     CPU index — switched to perf_env__get_cpu_topology() accessor.
   - timehist: test_bit(prio, ...) without bounds check on untrusted
     tracepoint priority.
   - idle-hist: rb_first_cached() on a tree populated with plain
     rb_insert_color() — rb_leftmost never set, callchains silently
     dropped.

3. Resource hygiene (patches 4, 5, 8):

   - mbind() bitmap allocation one bit short of what kernel reads.
   - bitmap_free() without NULLing the pointer (3 call sites).
   - O_CLOEXEC missing from open() calls in DSO and ELF code
     (12 call sites across 2 files).

Also expanded the libperf ABI TODO (tools/lib/perf/TODO) to emphasize
the code simplification argument for widening struct perf_cpu.cpu from
int16_t to int — the narrow type forces defensive truncation checks at
every boundary where wider CPU indices are narrowed.

 tools/lib/perf/TODO          |  7 +++++++
 tools/perf/builtin-record.c  |  1 +
 tools/perf/builtin-sched.c   |  7 +++++--
 tools/perf/util/bpf-event.c  | 11 ++++++-----
 tools/perf/util/build-id.c   |  2 +-
 tools/perf/util/cpumap.c     | 24 +++++++++++++++---------
 tools/perf/util/dso.c        |  4 ++--
 tools/perf/util/event.c      |  9 +++++++--
 tools/perf/util/header.c     |  4 +++-
 tools/perf/util/hist.c       |  7 ++++---
 tools/perf/util/hwmon_pmu.c  | 12 ++++++------
 tools/perf/util/mmap.c       |  4 +++-
 tools/perf/util/symbol-elf.c | 20 ++++++++++----------
 13 files changed, 70 insertions(+), 42 deletions(-)

Reported-by: sashiko-bot <sashiko-bot@kernel.org>
Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Thanks,

- Arnaldo




Thread overview: 24+ messages
2026-06-08  1:30 [PATCHES v2 00/11] perf tools: Assorted fixes Arnaldo Carvalho de Melo
2026-06-08  1:30 ` [PATCH 01/11] perf tools: Fix get_max_num() size_t underflow on empty sysfs file Arnaldo Carvalho de Melo
2026-06-08  1:45   ` sashiko-bot
2026-06-08  1:30 ` [PATCH 02/11] perf tools: Use scnprintf() in cpu_map__snprint() to prevent overflow Arnaldo Carvalho de Melo
2026-06-08  1:30 ` [PATCH 03/11] perf tools: Use perf_env__get_cpu_topology() in machine__resolve() Arnaldo Carvalho de Melo
2026-06-08  1:51   ` sashiko-bot
2026-06-08  1:30 ` [PATCH 04/11] perf mmap: Fix mbind() maxnode vs bitmap allocation mismatch in aio_bind Arnaldo Carvalho de Melo
2026-06-08  1:30 ` [PATCH 05/11] perf tools: NULL bitmap pointers after bitmap_free() Arnaldo Carvalho de Melo
2026-06-08  1:45   ` sashiko-bot
2026-06-08  1:30 ` [PATCH 06/11] perf sched: Bounds-check prio before test_bit() in timehist Arnaldo Carvalho de Melo
2026-06-08  1:51   ` sashiko-bot
2026-06-08  1:30 ` [PATCH 07/11] perf sched: Fix idle-hist callchain display using wrong rb_first variant Arnaldo Carvalho de Melo
2026-06-08  1:30 ` [PATCH 08/11] perf tools: Add O_CLOEXEC to open() calls in DSO and ELF code Arnaldo Carvalho de Melo
2026-06-08  1:44   ` sashiko-bot
2026-06-08  1:30 ` [PATCH 09/11] perf bpf: Use scnprintf() in snprintf_hex() and synthesize_bpf_prog_name() Arnaldo Carvalho de Melo
2026-06-08  1:30 ` [PATCH 10/11] perf hists: Fix snprintf() in hists__scnprintf_title() UID filter path Arnaldo Carvalho de Melo
2026-06-08  1:51   ` sashiko-bot
2026-06-08  1:30 ` [PATCH 11/11] perf tools: Use scnprintf() in build_id__snprintf() and hwmon read_events() Arnaldo Carvalho de Melo
2026-06-08  1:54   ` sashiko-bot
  -- strict thread matches above, loose matches on Subject: below --
2026-06-09  1:05 [PATCHES v4 00/11] perf tools: Assorted fixes Arnaldo Carvalho de Melo
2026-06-09  1:05 ` [PATCH 03/11] perf tools: Use perf_env__get_cpu_topology() in machine__resolve() Arnaldo Carvalho de Melo
2026-06-09  1:22   ` sashiko-bot
2026-06-08 20:17 [PATCHES v3 00/11] perf tools: Assorted fixes Arnaldo Carvalho de Melo
2026-06-08 20:17 ` [PATCH 03/11] perf tools: Use perf_env__get_cpu_topology() in machine__resolve() Arnaldo Carvalho de Melo
2026-06-08 21:56   ` Ian Rogers
2026-06-07 23:29 [PATCHES v1 00/11] perf tools: Assorted fixes Arnaldo Carvalho de Melo
2026-06-07 23:29 ` [PATCH 03/11] perf tools: Use perf_env__get_cpu_topology() in machine__resolve() Arnaldo Carvalho de Melo
