* [PATCH 0/9] tools/perf: Add interface to expose HTM trace data via perf
@ 2026-07-01 8:41 Athira Rajeev
2026-07-01 8:41 ` [PATCH 1/9] tool/perf: Move auxtrace_record__init for powerpc-vpadtl as separate utility Athira Rajeev
` (8 more replies)
0 siblings, 9 replies; 19+ messages in thread
From: Athira Rajeev @ 2026-07-01 8:41 UTC (permalink / raw)
To: acme, jolsa, adrian.hunter, maddy, irogers, namhyung
Cc: linux-perf-users, linuxppc-dev, atrajeev, hbathini, tejas05,
tshah, venkat88
H_HTM is a hypervisor call (hcall) that exports data from the Hardware
Trace Macro (HTM) function. This patchset adds support for setup,
configuration and control of HTM functions, as well as trace data
collection, through the perf PMU interface.

H_HTM is the interface for executing HTM functions, including setup,
configuration, control and dumping of the HTM trace data. The hcall can
be invoked for any core/chip of the system from within a partition
itself.
HTM perf interface usage:
The HTM (Hardware Trace Macro) perf interface enables collection and
analysis of hardware trace data from PowerPC systems. This interface
allows users to capture detailed execution traces for performance
analysis and debugging. The interface uses the perf AUX infrastructure
to capture trace data.
Patch 1: Move auxtrace_record__init for powerpc-vpadtl as separate
utility
- Refactor VPA-DTL auxtrace initialization into arch/powerpc/util/vpa-dtl.c
to allow multiple PMUs to use auxtrace
Patch 2: Add CONFIG_AUXTRACE support for HTM pmu on powerpc
- Enable HTM PMU to use AUX buffers with recording options, info callbacks,
and PERF_AUXTRACE_POWERPC_HTM type
Patch 3: Add arch_record__collect_final_data to collect additional data
before closing the event
- Introduce callback mechanism to capture remaining data after
evlist__disable but before event removal
Patch 4: Add powerpc callback support for arch_record__collect_final_data
- Implement HTM-specific callback to read trace data until perf_evsel__read
returns zero indicating completion
Patch 5: Process htm auxtrace events and display in perf report -D
- Add PERF_RECORD_AUXTRACE_INFO processing and write HTM trace data
to htm.bin.nXpXcX files
Patch 6: Add HTM trace data processing and decoding support
- Extract system memory configuration, write translation files, and integrate
htmdecode for trace analysis
Patch 7: Add physical to logical address mapping for HTM traces
- Map physical addresses from HTM traces to logical addresses using LPAR
memory configuration from /proc/powerpc/lparcfg
Patch 8: Add event name as htm of PERF_TYPE_SYNTH type to present htm samples
- Create synthetic HTM event with PERF_SYNTH_POWERPC_HTM config to display
logical addresses in perf report
Patch 9: Add logical address in decoded nest traces
- Translate physical to logical addresses in decoded output and create
.l files for source code correlation
Link to tools side changes:
https://lore.kernel.org/linux-perf-users/20260701083806.79358-1-atrajeev@linux.ibm.com/
Event Configuration:
Use "perf record" with the htm PMU event. The event is configured using
named parameters that specify the target hardware location and trace type:
- htm_type
- Type of HTM trace to collect (bits 0-3)
- nodeindex
- Node index in the system topology (bits 4-11)
- nodalchipindex
- Chip index within the specified node (bits 12-19)
- coreindexonchip
- Core index on the specified chip (bits 20-27)
event: "config:0-27"
htm_type: "config:0-3"
nodeindex: "config:4-11"
nodalchipindex: "config:12-19"
coreindexonchip: "config:20-27"
1) nodeindex, nodalchipindex, coreindexonchip: specify which partition
to configure the HTM for.
2) htm_type: specifies the type of HTM.
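As a rough illustration of the bit layout above, the sketch below packs the four named parameters into a single config value. The helper names htm_field() and htm_pack_config() are hypothetical and are not part of this series; the sketch only assumes the documented ranges (config:0-3, 4-11, 12-19, 20-27) with bit 0 as the least significant bit.

```c
#include <assert.h>
#include <stdint.h>

/* Place `value` into bits [lo, lo + width) of a 64-bit config word. */
static uint64_t htm_field(uint64_t value, unsigned int lo, unsigned int width)
{
	assert(width > 0 && width < 64);
	/* Mask off anything that does not fit in the field, then shift. */
	return (value & ((UINT64_C(1) << width) - 1)) << lo;
}

/*
 * Hypothetical helper (not part of this series): build the raw config
 * value from the named parameters, using the documented bit ranges.
 */
static uint64_t htm_pack_config(unsigned int htm_type,
				unsigned int nodeindex,
				unsigned int nodalchipindex,
				unsigned int coreindexonchip)
{
	return htm_field(htm_type, 0, 4) |		/* config:0-3   */
	       htm_field(nodeindex, 4, 8) |		/* config:4-11  */
	       htm_field(nodalchipindex, 12, 8) |	/* config:12-19 */
	       htm_field(coreindexonchip, 20, 8);	/* config:20-27 */
}
```

Under this layout, htm_type=1, nodeindex=0 and nodalchipindex=2 pack to 0x2001. In practice perf does this packing itself from the PMU format files, so the helper is only meant to make the layout concrete.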
Event Syntax:
The event configuration uses named parameters::
htm/nodeindex=N,nodalchipindex=C,coreindexonchip=R,htm_type=T/
Where:
- N = node index
- C = chip index within the node
- R = core index on the chip
- T = HTM type
Basic Usage Example:
To collect HTM trace data for a specific chip:
# perf record -C 1 -e htm/nodalchipindex=2,nodeindex=0,htm_type=1/ <workload>
In this example:
- nodeindex=0: Target node 0
- nodalchipindex=2: Target chip 2 within node 0
- htm_type=1: HTM trace type 1
Output Files:
After running "perf record", the following files are generated:
# ls htm.bin.*
htm.bin.n0.p2.c0 htm.bin.n1.p3.c0 # Binary trace files
# ls translation.*
translation.n0.p2.c0 translation.n1.p3.c0 # Memory configuration files
These files contain:
- **htm.bin.*** - Raw HTM trace data in binary format
- **translation.*** - Memory address translation information for decoding
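The file names above follow the pattern htm.bin.n<nodeindex>.p<nodalchipindex>.c<coreindexonchip>. A minimal sketch of building such a name is shown below; htm_bin_filename() is a hypothetical helper used only for illustration, and the series may construct the name differently.

```c
#include <stdio.h>

/*
 * Hypothetical helper: format "htm.bin.n<node>.p<chip>.c<core>" into buf.
 * Returns 0 on success, -1 if the name did not fit in len bytes.
 */
static int htm_bin_filename(char *buf, size_t len, unsigned int nodeindex,
			    unsigned int nodalchipindex,
			    unsigned int coreindexonchip)
{
	int n = snprintf(buf, len, "htm.bin.n%u.p%u.c%u",
			 nodeindex, nodalchipindex, coreindexonchip);

	/* snprintf() reports the length it wanted; detect truncation. */
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With nodeindex=0, nodalchipindex=2 and coreindexonchip=0 this yields htm.bin.n0.p2.c0, matching the first file in the listing above.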
Trace Data Processing:
Process the collected trace data using perf script:
# perf script -D
This command:
1. Reads the perf.data file
2. Decodes HTM trace data using translation files
3. Displays human-readable trace output
The decoder automatically:
- Translates physical addresses to logical addresses
- Creates decoded output files for analysis
- Correlates trace data with memory mappings
Here's a complete example of collecting and analyzing HTM traces:
# Step 1: Collect trace data
perf record -C 1 -e htm/nodalchipindex=2,nodeindex=0,htm_type=1/ sleep 5
# Step 2: Verify output files
ls htm.bin.* # Binary trace files
ls translation.* # Memory configuration files
ls perf.data # Perf data file
# Step 3: Decode and view traces
perf script -D > decoded_trace.txt
# Step 4: Analyze with perf report to see the hot logical address
perf report
Thanks
Athira
Athira Rajeev (8):
tool/perf: Move auxtrace_record__init for powerpc-vpadtl as separate
utility
tools/perf: Add CONFIG_AUXTRACE support for HTM pmu on powerpc
tools/perf: Add arch_record__collect_final_data to collect additional
data before closing the event
tools/perf: Add powerpc callback support for
arch_record__collect_final_data
tools/perf: process htm auxtrace events and display in perf report -D
perf tools powerpc: Add HTM trace data processing and decoding support
tools/perf/powerpc: Add event name as htm of PERF_TYPE_SYNTH type to
present htm samples
tools/perf/powerpc: Add logical address in decoded nest traces
Tanushree Shah (1):
perf tools powerpc: Add physical to logical address mapping for HTM
traces
tools/perf/arch/powerpc/util/Build | 2 +
tools/perf/arch/powerpc/util/auxtrace.c | 87 +--
tools/perf/arch/powerpc/util/htm.c | 116 ++++
tools/perf/arch/powerpc/util/vpa-dtl.c | 96 +++
tools/perf/builtin-record.c | 29 +
tools/perf/util/Build | 1 +
tools/perf/util/auxtrace.c | 4 +
tools/perf/util/auxtrace.h | 1 +
tools/perf/util/event.h | 1 +
tools/perf/util/powerpc-htm.c | 883 ++++++++++++++++++++++++
tools/perf/util/powerpc-htm.h | 25 +
tools/perf/util/powerpc-vpadtl.h | 1 +
tools/perf/util/record.h | 4 +
13 files changed, 1177 insertions(+), 73 deletions(-)
create mode 100644 tools/perf/arch/powerpc/util/htm.c
create mode 100644 tools/perf/arch/powerpc/util/vpa-dtl.c
create mode 100644 tools/perf/util/powerpc-htm.c
create mode 100644 tools/perf/util/powerpc-htm.h
--
2.52.0
* [PATCH 1/9] tool/perf: Move auxtrace_record__init for powerpc-vpadtl as separate utility
2026-07-01 8:41 [PATCH 0/9] tools/perf: Add interface to expose HTM trace data via perf Athira Rajeev
@ 2026-07-01 8:41 ` Athira Rajeev
2026-07-01 8:56 ` sashiko-bot
2026-07-01 8:41 ` [PATCH 2/9] tools/perf: Add CONFIG_AUXTRACE support for HTM pmu on powerpc Athira Rajeev
` (7 subsequent siblings)
8 siblings, 1 reply; 19+ messages in thread
From: Athira Rajeev @ 2026-07-01 8:41 UTC (permalink / raw)
To: acme, jolsa, adrian.hunter, maddy, irogers, namhyung
Cc: linux-perf-users, linuxppc-dev, atrajeev, hbathini, tejas05,
tshah, venkat88
The powerpc PMU that collects Dispatch Trace Log (DTL) entries uses the
AUX support in the perf infrastructure. To enable the creation of
PERF_RECORD_AUXTRACE records, auxtrace_record__init(), which initializes
the auxtrace record, is currently part of arch/powerpc/util/auxtrace.c.

To enable other PMUs to use auxtrace, move the powerpc-vpadtl part of
auxtrace_record__init() to a separate file: arch/powerpc/util/vpa-dtl.c.

In auxtrace_record__init(), call vpa_dtl_recording_init() based on the
PMU in use to initialize the recording options for DTL.
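The dispatch idea can be sketched in isolation as follows. This is a simplified stand-alone illustration, not the code in the patch: it works on plain strings instead of struct evsel, name_starts_with() stands in for the strstarts() helper, and the "htm" prefix is included only to show how a second PMU would slot in.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum aux_pmu_kind { AUX_PMU_NONE, AUX_PMU_VPA_DTL, AUX_PMU_HTM };

/* Prefix comparison, in the spirit of strstarts(). */
static bool name_starts_with(const char *name, const char *prefix)
{
	return strncmp(name, prefix, strlen(prefix)) == 0;
}

/* Return the kind for the first event name matching a known prefix. */
static enum aux_pmu_kind pick_aux_pmu(const char *const *names, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		if (name_starts_with(names[i], "vpa_dtl"))
			return AUX_PMU_VPA_DTL;
		if (name_starts_with(names[i], "htm"))
			return AUX_PMU_HTM;
	}
	/* No auxtrace-capable event in the list. */
	return AUX_PMU_NONE;
}
```

Each kind would then map to its own init routine, so that the per-PMU callbacks live in their own files and the shared entry point only selects between them.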
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
---
tools/perf/arch/powerpc/util/Build | 1 +
tools/perf/arch/powerpc/util/auxtrace.c | 79 ++------------------
tools/perf/arch/powerpc/util/vpa-dtl.c | 96 +++++++++++++++++++++++++
tools/perf/util/powerpc-vpadtl.h | 1 +
4 files changed, 104 insertions(+), 73 deletions(-)
create mode 100644 tools/perf/arch/powerpc/util/vpa-dtl.c
diff --git a/tools/perf/arch/powerpc/util/Build b/tools/perf/arch/powerpc/util/Build
index ae928050e07a..7819c8f5af2d 100644
--- a/tools/perf/arch/powerpc/util/Build
+++ b/tools/perf/arch/powerpc/util/Build
@@ -7,3 +7,4 @@ perf-util-y += evsel.o
perf-util-$(CONFIG_LIBDW) += skip-callchain-idx.o
perf-util-y += auxtrace.o
+perf-util-y += vpa-dtl.o
diff --git a/tools/perf/arch/powerpc/util/auxtrace.c b/tools/perf/arch/powerpc/util/auxtrace.c
index 4600a1661b4f..0053526329e0 100644
--- a/tools/perf/arch/powerpc/util/auxtrace.c
+++ b/tools/perf/arch/powerpc/util/auxtrace.c
@@ -13,63 +13,12 @@
#include "../../util/auxtrace.h"
#include "../../util/powerpc-vpadtl.h"
#include "../../util/record.h"
-#include <internal/lib.h> // page_size
-
-#define KiB(x) ((x) * 1024)
-
-static int
-powerpc_vpadtl_recording_options(struct auxtrace_record *ar __maybe_unused,
- struct evlist *evlist __maybe_unused,
- struct record_opts *opts)
-{
- opts->full_auxtrace = true;
-
- /*
- * Set auxtrace_mmap_pages to minimum
- * two pages
- */
- if (!opts->auxtrace_mmap_pages) {
- opts->auxtrace_mmap_pages = KiB(128) / page_size;
- if (opts->mmap_pages == UINT_MAX)
- opts->mmap_pages = KiB(256) / page_size;
- }
-
- return 0;
-}
-
-static size_t powerpc_vpadtl_info_priv_size(struct auxtrace_record *itr __maybe_unused,
- struct evlist *evlist __maybe_unused)
-{
- return VPADTL_AUXTRACE_PRIV_SIZE;
-}
-
-static int
-powerpc_vpadtl_info_fill(struct auxtrace_record *itr __maybe_unused,
- struct perf_session *session __maybe_unused,
- struct perf_record_auxtrace_info *auxtrace_info,
- size_t priv_size __maybe_unused)
-{
- auxtrace_info->type = PERF_AUXTRACE_VPA_DTL;
-
- return 0;
-}
-
-static void powerpc_vpadtl_free(struct auxtrace_record *itr)
-{
- free(itr);
-}
-
-static u64 powerpc_vpadtl_reference(struct auxtrace_record *itr __maybe_unused)
-{
- return 0;
-}
struct auxtrace_record *auxtrace_record__init(struct evlist *evlist,
int *err)
{
- struct auxtrace_record *aux;
struct evsel *pos;
- int found = 0;
+ int found_vpa_dtl = 0;
/*
* Set err value to zero here. Any fail later
@@ -79,32 +28,16 @@ struct auxtrace_record *auxtrace_record__init(struct evlist *evlist,
evlist__for_each_entry(evlist, pos) {
if (strstarts(pos->name, "vpa_dtl")) {
- found = 1;
+ found_vpa_dtl = 1;
pos->needs_auxtrace_mmap = true;
break;
}
}
- if (!found)
- return NULL;
-
- /*
- * To obtain the auxtrace buffer file descriptor, the auxtrace event
- * must come first.
- */
- evlist__to_front(pos->evlist, pos);
-
- aux = zalloc(sizeof(*aux));
- if (aux == NULL) {
- pr_debug("aux record is NULL\n");
- *err = -ENOMEM;
+ if (found_vpa_dtl)
+ return vpa_dtl_recording_init(pos);
+ else {
+ *err = -EINVAL;
return NULL;
}
-
- aux->recording_options = powerpc_vpadtl_recording_options;
- aux->info_priv_size = powerpc_vpadtl_info_priv_size;
- aux->info_fill = powerpc_vpadtl_info_fill;
- aux->free = powerpc_vpadtl_free;
- aux->reference = powerpc_vpadtl_reference;
- return aux;
}
diff --git a/tools/perf/arch/powerpc/util/vpa-dtl.c b/tools/perf/arch/powerpc/util/vpa-dtl.c
new file mode 100644
index 000000000000..ae81cbad0c38
--- /dev/null
+++ b/tools/perf/arch/powerpc/util/vpa-dtl.c
@@ -0,0 +1,96 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * VPA support
+ */
+
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/bitops.h>
+#include <linux/log2.h>
+#include <linux/string.h>
+#include <time.h>
+
+#include "../../util/cpumap.h"
+#include "../../util/evsel.h"
+#include "../../util/evlist.h"
+#include "../../util/session.h"
+#include "../../util/util.h"
+#include "../../util/pmu.h"
+#include "../../util/debug.h"
+#include "../../util/auxtrace.h"
+#include "../../util/powerpc-vpadtl.h"
+#include "../../util/record.h"
+#include <internal/lib.h> // page_size
+
+#define KiB(x) ((x) * 1024)
+
+static int
+powerpc_vpadtl_recording_options(struct auxtrace_record *ar __maybe_unused,
+ struct evlist *evlist __maybe_unused,
+ struct record_opts *opts)
+{
+ opts->full_auxtrace = true;
+
+ /*
+ * Set auxtrace_mmap_pages to minimum
+ * two pages
+ */
+ if (!opts->auxtrace_mmap_pages) {
+ opts->auxtrace_mmap_pages = KiB(128) / page_size;
+ if (opts->mmap_pages == UINT_MAX)
+ opts->mmap_pages = KiB(256) / page_size;
+ }
+
+ return 0;
+}
+
+static size_t powerpc_vpadtl_info_priv_size(struct auxtrace_record *itr __maybe_unused,
+ struct evlist *evlist __maybe_unused)
+{
+ return VPADTL_AUXTRACE_PRIV_SIZE;
+}
+
+static int
+powerpc_vpadtl_info_fill(struct auxtrace_record *itr __maybe_unused,
+ struct perf_session *session __maybe_unused,
+ struct perf_record_auxtrace_info *auxtrace_info,
+ size_t priv_size __maybe_unused)
+{
+ auxtrace_info->type = PERF_AUXTRACE_VPA_DTL;
+
+ return 0;
+}
+
+static void powerpc_vpadtl_free(struct auxtrace_record *itr)
+{
+ free(itr);
+}
+
+static u64 powerpc_vpadtl_reference(struct auxtrace_record *itr __maybe_unused)
+{
+ return 0;
+}
+
+struct auxtrace_record *vpa_dtl_recording_init(struct evsel *pos)
+{
+ struct auxtrace_record *aux;
+
+ /*
+ * To obtain the auxtrace buffer file descriptor, the auxtrace event
+ * must come first.
+ */
+ evlist__to_front(pos->evlist, pos);
+
+ aux = zalloc(sizeof(*aux));
+ if (aux == NULL) {
+ pr_debug("aux record is NULL\n");
+ return NULL;
+ }
+
+ aux->recording_options = powerpc_vpadtl_recording_options;
+ aux->info_priv_size = powerpc_vpadtl_info_priv_size;
+ aux->info_fill = powerpc_vpadtl_info_fill;
+ aux->free = powerpc_vpadtl_free;
+ aux->reference = powerpc_vpadtl_reference;
+ return aux;
+}
diff --git a/tools/perf/util/powerpc-vpadtl.h b/tools/perf/util/powerpc-vpadtl.h
index ca809660b9bb..5f17e660c562 100644
--- a/tools/perf/util/powerpc-vpadtl.h
+++ b/tools/perf/util/powerpc-vpadtl.h
@@ -20,4 +20,5 @@ struct perf_pmu;
int powerpc_vpadtl_process_auxtrace_info(union perf_event *event,
struct perf_session *session);
+struct auxtrace_record *vpa_dtl_recording_init(struct evsel *pos);
#endif
--
2.52.0
* [PATCH 2/9] tools/perf: Add CONFIG_AUXTRACE support for HTM pmu on powerpc
2026-07-01 8:41 [PATCH 0/9] tools/perf: Add interface to expose HTM trace data via perf Athira Rajeev
2026-07-01 8:41 ` [PATCH 1/9] tool/perf: Move auxtrace_record__init for powerpc-vpadtl as separate utility Athira Rajeev
@ 2026-07-01 8:41 ` Athira Rajeev
2026-07-01 8:55 ` sashiko-bot
2026-07-01 8:41 ` [PATCH 3/9] tools/perf: Add arch_record__collect_final_data to collect additional data before closing the event Athira Rajeev
` (6 subsequent siblings)
8 siblings, 1 reply; 19+ messages in thread
From: Athira Rajeev @ 2026-07-01 8:41 UTC (permalink / raw)
To: acme, jolsa, adrian.hunter, maddy, irogers, namhyung
Cc: linux-perf-users, linuxppc-dev, atrajeev, hbathini, tejas05,
tshah, venkat88
The powerpc PMU that collects Hardware Trace Macro (HTM) entries uses
the AUX support in the perf infrastructure. The PMU driver collects
trace entries in the AUX buffer. On the tools side, this data is made
available as PERF_RECORD_AUXTRACE records, which are generated by the
"perf record" command. To enable the creation of PERF_RECORD_AUXTRACE
records, add a function to initialize the auxtrace record, i.e.
htm_recording_init(). Fill in the other callbacks: info_priv_size,
info_fill, free, recording_options and reference. Add a header file
to define HTM PMU specific details.
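For reference, the default AUX buffer sizing used by the recording options can be worked through with a small stand-alone sketch. The helper names below are hypothetical; the arithmetic mirrors the KiB(128) / page_size default and the aux_watermark assignment in htm_recording_options(), and the 64 KiB page size is only an example value.

```c
#include <stddef.h>

#define KiB(x) ((x) * 1024)

/*
 * Default AUX mmap size in pages when the user gave none:
 * 128 KiB divided by the system page size.
 */
static size_t default_aux_pages(size_t page_size)
{
	return KiB(128) / page_size;
}

/* Watermark in bytes when it is set to the size of the whole AUX buffer. */
static size_t aux_watermark_bytes(size_t aux_pages, size_t page_size)
{
	return aux_pages * page_size;
}
```

With a 64 KiB page size this gives two pages, which is where the "minimum two pages" comment in the code comes from; with 4 KiB pages the same default gives 32 pages. In both cases the watermark works out to 128 KiB.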
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
---
tools/perf/arch/powerpc/util/Build | 1 +
tools/perf/arch/powerpc/util/auxtrace.c | 10 ++-
tools/perf/arch/powerpc/util/htm.c | 113 ++++++++++++++++++++++++
tools/perf/util/powerpc-htm.h | 23 +++++
4 files changed, 146 insertions(+), 1 deletion(-)
create mode 100644 tools/perf/arch/powerpc/util/htm.c
create mode 100644 tools/perf/util/powerpc-htm.h
diff --git a/tools/perf/arch/powerpc/util/Build b/tools/perf/arch/powerpc/util/Build
index 7819c8f5af2d..297152591046 100644
--- a/tools/perf/arch/powerpc/util/Build
+++ b/tools/perf/arch/powerpc/util/Build
@@ -8,3 +8,4 @@ perf-util-$(CONFIG_LIBDW) += skip-callchain-idx.o
perf-util-y += auxtrace.o
perf-util-y += vpa-dtl.o
+perf-util-y += htm.o
diff --git a/tools/perf/arch/powerpc/util/auxtrace.c b/tools/perf/arch/powerpc/util/auxtrace.c
index 0053526329e0..ec84f8876a4a 100644
--- a/tools/perf/arch/powerpc/util/auxtrace.c
+++ b/tools/perf/arch/powerpc/util/auxtrace.c
@@ -12,6 +12,7 @@
#include "../../util/debug.h"
#include "../../util/auxtrace.h"
#include "../../util/powerpc-vpadtl.h"
+#include "../../util/powerpc-htm.h"
#include "../../util/record.h"
struct auxtrace_record *auxtrace_record__init(struct evlist *evlist,
@@ -19,6 +20,7 @@ struct auxtrace_record *auxtrace_record__init(struct evlist *evlist,
{
struct evsel *pos;
int found_vpa_dtl = 0;
+ int found_htm = 0;
/*
* Set err value to zero here. Any fail later
@@ -31,13 +33,19 @@ struct auxtrace_record *auxtrace_record__init(struct evlist *evlist,
found_vpa_dtl = 1;
pos->needs_auxtrace_mmap = true;
break;
+ } else if (strstarts(pos->name, "htm")) {
+ found_htm = 1;
+ pos->needs_auxtrace_mmap = true;
+ break;
}
}
if (found_vpa_dtl)
return vpa_dtl_recording_init(pos);
+ else if (found_htm)
+ return htm_recording_init(pos);
else {
- *err = -EINVAL;
+ *err = 0;
return NULL;
}
}
diff --git a/tools/perf/arch/powerpc/util/htm.c b/tools/perf/arch/powerpc/util/htm.c
new file mode 100644
index 000000000000..cc733f45ac9b
--- /dev/null
+++ b/tools/perf/arch/powerpc/util/htm.c
@@ -0,0 +1,113 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * HTM support
+ */
+
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/bitops.h>
+#include <linux/log2.h>
+#include <linux/string.h>
+#include <time.h>
+
+#include "../../util/cpumap.h"
+#include "../../util/evsel.h"
+#include "../../util/evlist.h"
+#include "../../util/session.h"
+#include "../../util/util.h"
+#include "../../util/pmu.h"
+#include "../../util/debug.h"
+#include "../../util/auxtrace.h"
+#include "../../util/powerpc-htm.h"
+#include "../../util/record.h"
+#include <internal/lib.h> // page_size
+#include <errno.h>
+
+#define KiB(x) ((x) * 1024)
+
+static int
+htm_recording_options(struct auxtrace_record *ar __maybe_unused,
+ struct evlist *evlist __maybe_unused,
+ struct record_opts *opts)
+{
+ struct evsel *pos;
+
+ opts->full_auxtrace = true;
+
+ if (opts->target.system_wide) {
+ pr_info("System wide monitoring not supported, specify -C <cpu>\n");
+ return -EINVAL;
+ } else if (!opts->target.cpu_list) {
+ pr_info("-C option not provided, specify -C <cpu> to use HTM event\n");
+ return -EINVAL;
+ }
+
+ /*
+ * Set auxtrace_mmap_pages to minimum
+ * two pages
+ */
+ if (!opts->auxtrace_mmap_pages) {
+ opts->auxtrace_mmap_pages = KiB(128) / page_size;
+ if (opts->mmap_pages == UINT_MAX)
+ opts->mmap_pages = KiB(256) / page_size;
+ }
+
+ evlist__for_each_entry(evlist, pos) {
+ if (strstarts(pos->name, "htm")) {
+ pos->needs_auxtrace_mmap = true;
+ pos->core.attr.aux_watermark = opts->auxtrace_mmap_pages * (size_t)page_size;
+ break;
+ }
+ }
+
+ return 0;
+}
+
+static size_t htm_info_priv_size(struct auxtrace_record *itr __maybe_unused,
+ struct evlist *evlist __maybe_unused)
+{
+ return HTM_AUXTRACE_PRIV_SIZE;
+}
+
+static int
+htm_info_fill(struct auxtrace_record *itr __maybe_unused,
+ struct perf_session *session __maybe_unused,
+ struct perf_record_auxtrace_info *auxtrace_info __maybe_unused,
+ size_t priv_size __maybe_unused)
+{
+ return 0;
+}
+
+static u64 htm_reference(struct auxtrace_record *itr __maybe_unused)
+{
+ return 0;
+}
+
+static void htm_free(struct auxtrace_record *itr)
+{
+ free(itr);
+}
+
+struct auxtrace_record *htm_recording_init(struct evsel *pos)
+{
+ struct auxtrace_record *aux;
+
+ /*
+ * To obtain the auxtrace buffer file descriptor, the auxtrace event
+ * must come first.
+ */
+ evlist__to_front(pos->evlist, pos);
+
+ aux = zalloc(sizeof(*aux));
+ if (aux == NULL) {
+ pr_debug("aux record is NULL\n");
+ return NULL;
+ }
+
+ aux->recording_options = htm_recording_options;
+ aux->info_priv_size = htm_info_priv_size;
+ aux->info_fill = htm_info_fill;
+ aux->free = htm_free;
+ aux->reference = htm_reference;
+ return aux;
+}
diff --git a/tools/perf/util/powerpc-htm.h b/tools/perf/util/powerpc-htm.h
new file mode 100644
index 000000000000..be7f8c03e161
--- /dev/null
+++ b/tools/perf/util/powerpc-htm.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * HTM PMU Support
+ */
+
+#ifndef INCLUDE__PERF_POWERPC_HTM_H__
+#define INCLUDE__PERF_POWERPC_HTM_H__
+
+#define POWERPC_HTM_NAME "powerpc_htm_"
+
+enum {
+ POWERPC_HTM_TYPE,
+ HTM_AUXTRACE_PRIV_MAX,
+};
+
+#define HTM_AUXTRACE_PRIV_SIZE (HTM_AUXTRACE_PRIV_MAX * sizeof(u64))
+
+union perf_event;
+struct perf_session;
+struct perf_pmu;
+
+struct auxtrace_record *htm_recording_init(struct evsel *pos);
+#endif
--
2.52.0
* [PATCH 3/9] tools/perf: Add arch_record__collect_final_data to collect additional data before closing the event
2026-07-01 8:41 [PATCH 0/9] tools/perf: Add interface to expose HTM trace data via perf Athira Rajeev
2026-07-01 8:41 ` [PATCH 1/9] tool/perf: Move auxtrace_record__init for powerpc-vpadtl as separate utility Athira Rajeev
2026-07-01 8:41 ` [PATCH 2/9] tools/perf: Add CONFIG_AUXTRACE support for HTM pmu on powerpc Athira Rajeev
@ 2026-07-01 8:41 ` Athira Rajeev
2026-07-01 8:54 ` sashiko-bot
2026-07-01 8:41 ` [PATCH 4/9] tools/perf: Add powerpc callback support for arch_record__collect_final_data Athira Rajeev
` (5 subsequent siblings)
8 siblings, 1 reply; 19+ messages in thread
From: Athira Rajeev @ 2026-07-01 8:41 UTC (permalink / raw)
To: acme, jolsa, adrian.hunter, maddy, irogers, namhyung
Cc: linux-perf-users, linuxppc-dev, atrajeev, hbathini, tejas05,
tshah, venkat88
While collecting samples using "perf record", __cmd_record() checks
whether monitoring is done. Once recording is done, the event list is
disabled using evlist__disable(). After this, the event fd is no
longer read and the event is removed. If any additional data needs to
be captured and written to perf.data before the event is removed,
that is currently not possible. Introduce
arch_record__collect_final_data to collect additional data before
closing the event.
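The shape of the resulting drain loop can be sketched outside of perf as below. Everything here is a stand-in: struct fake_evlist and the three helpers are hypothetical replacements for struct evlist, the arch hook, record__mmap_read_all() and evlist__enable(), so that the control flow can be followed and tested in isolation.

```c
/* Stand-in for the event list; real code would use struct evlist. */
struct fake_evlist {
	int pending_batches;	/* batches the "driver" still holds */
	int reads;		/* how many times the buffer was drained */
	int enables;		/* how many times events were re-enabled */
};

/* Arch hook: nonzero while more data remains to be collected. */
static int need_read(const struct fake_evlist *el)
{
	return el->pending_batches > 0;
}

/* Stand-in for record__mmap_read_all(): consume one batch. */
static int read_all(struct fake_evlist *el)
{
	el->pending_batches--;
	el->reads++;
	return 0;
}

/* Stand-in for evlist__enable(). */
static void enable(struct fake_evlist *el)
{
	el->enables++;
}

/* Keep reading until the arch hook reports that nothing is left. */
static void collect_final_data(struct fake_evlist *el)
{
	while (need_read(el)) {
		if (read_all(el) < 0)
			break;
		enable(el);	/* re-arm the event for the next batch */
	}
}
```

On architectures that do not override the hook, the default returns 0 and the loop body never runs, so the existing record path is unchanged.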
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
---
tools/perf/builtin-record.c | 29 +++++++++++++++++++++++++++++
tools/perf/util/record.h | 4 ++++
2 files changed, 33 insertions(+)
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index ebd3ed0c9b3e..1312f7223455 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -2426,6 +2426,34 @@ static unsigned long record__waking(struct record *rec)
return waking;
}
+/*
+ * Weak symbol - architecture can override to indicate if more
+ * data needs to be collected before finishing output.
+ *
+ * Returns: 1 if more data exists, 0 if collection is complete
+ */
+__weak int arch_perf_record__need_read(struct evlist *evlist __maybe_unused)
+{
+ return 0; /* Default: no arch-specific data to collect */
+}
+
+static void record__final_data(struct record *rec)
+{
+ /*
+ * Collect any remaining architecture-specific data.
+ * The arch code checks if more data exists, and we do the actual
+ * reading here since we have access to record__mmap_read_all().
+ */
+ while (arch_perf_record__need_read(rec->evlist)) {
+ if (record__mmap_read_all(rec, false) < 0)
+ break;
+ /* Re-enable events for next batch */
+ evlist__enable(rec->evlist);
+ }
+
+ return;
+}
+
static int __cmd_record(struct record *rec, int argc, const char **argv)
{
int err;
@@ -2853,6 +2881,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
*/
if (done && !disabled && !target__none(&opts->target)) {
trigger_off(&auxtrace_snapshot_trigger);
+ record__final_data(rec);
evlist__disable(rec->evlist);
disabled = true;
}
diff --git a/tools/perf/util/record.h b/tools/perf/util/record.h
index 93627c9a7338..21f51efd36fc 100644
--- a/tools/perf/util/record.h
+++ b/tools/perf/util/record.h
@@ -8,6 +8,8 @@
#include <linux/stddef.h>
#include <linux/perf_event.h>
#include "util/target.h"
+#include "util/evlist.h"
+#include "util/util.h"
struct option;
@@ -95,4 +97,6 @@ static inline bool record_opts__no_switch_events(const struct record_opts *opts)
return opts->record_switch_events_set && !opts->record_switch_events;
}
+int arch_perf_record__need_read(struct evlist *evlist);
+
#endif // _PERF_RECORD_H
--
2.52.0
* [PATCH 4/9] tools/perf: Add powerpc callback support for arch_record__collect_final_data
2026-07-01 8:41 [PATCH 0/9] tools/perf: Add interface to expose HTM trace data via perf Athira Rajeev
` (2 preceding siblings ...)
2026-07-01 8:41 ` [PATCH 3/9] tools/perf: Add arch_record__collect_final_data to collect additional data before closing the event Athira Rajeev
@ 2026-07-01 8:41 ` Athira Rajeev
2026-07-01 8:55 ` sashiko-bot
2026-07-01 8:41 ` [PATCH 5/9] tools/perf: process htm auxtrace events and display in perf report -D Athira Rajeev
` (4 subsequent siblings)
8 siblings, 1 reply; 19+ messages in thread
From: Athira Rajeev @ 2026-07-01 8:41 UTC (permalink / raw)
To: acme, jolsa, adrian.hunter, maddy, irogers, namhyung
Cc: linux-perf-users, linuxppc-dev, atrajeev, hbathini, tejas05,
tshah, venkat88
Add arch_record__collect_final_data to collect additional
data before closing the event. Define the callback in
util/powerpc-htm.c.

Invoke record__mmap_read_all() until the complete trace
data is collected in the auxtrace buffer and copied to
perf.data. When the auxtrace buffer is full, perf_aux_output_end()
disables the event until the data is written. Hence re-enable
the event using evlist__enable() after reading the event using the
htm_read_data function. perf_evsel__read() returns a zero count
once the trace data has been completely read. If the count is
zero for the event, stop the data collection.
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
---
tools/perf/util/Build | 1 +
tools/perf/util/powerpc-htm.c | 74 +++++++++++++++++++++++++++++++++++
2 files changed, 75 insertions(+)
create mode 100644 tools/perf/util/powerpc-htm.c
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 330311cac550..7fa354853d2a 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -141,6 +141,7 @@ perf-util-y += hisi-ptt.o
perf-util-y += hisi-ptt-decoder/
perf-util-y += s390-cpumsf.o
perf-util-y += powerpc-vpadtl.o
+perf-util-y += powerpc-htm.o
ifdef CONFIG_LIBOPENCSD
perf-util-y += cs-etm.o
diff --git a/tools/perf/util/powerpc-htm.c b/tools/perf/util/powerpc-htm.c
new file mode 100644
index 000000000000..5043ff41a609
--- /dev/null
+++ b/tools/perf/util/powerpc-htm.c
@@ -0,0 +1,74 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * HTM support
+ */
+
+#include "../../../util/record.h"
+#include "evlist.h"
+#include "evsel.h"
+#include "session.h"
+#include "debug.h"
+#include <internal/xyarray.h>
+
+/*
+ * Check if HTM events have more data to collect.
+ *
+ * This function reads the HTM event counts. When the kernel driver
+ * has more data available, it returns a non-zero count. When all
+ * data has been collected, it returns zero.
+ *
+ * Returns: 1 if more data exists, 0 if collection is complete
+ */
+int arch_perf_record__need_read(struct evlist *evlist)
+{
+ struct evsel *evsel;
+ int found_htm = 0;
+
+ /* there was an error during record__open */
+ if (!evlist)
+ return 0;
+
+ /* First, check if any HTM events exist */
+ evlist__for_each_entry(evlist, evsel) {
+ if (strstr(evsel->name, "htm") != NULL)
+ found_htm = 1;
+ }
+
+ if (!found_htm)
+ return 0;
+
+ /* Read HTM event counts to check if more data is available */
+ evlist__for_each_entry(evlist, evsel) {
+ struct xyarray *xy = evsel->core.sample_id;
+
+ if (strstr(evsel->name, "htm") == NULL)
+ continue;
+
+ if (xy == NULL || evsel->core.fd == NULL)
+ continue;
+ if (xyarray__max_x(evsel->core.fd) != xyarray__max_x(xy) ||
+ xyarray__max_y(evsel->core.fd) != xyarray__max_y(xy)) {
+ pr_debug("Unmatched FD vs. sample ID: skip reading LOST count\n");
+ continue;
+ }
+
+ for (int x = 0; x < xyarray__max_x(xy); x++) {
+ for (int y = 0; y < xyarray__max_y(xy); y++) {
+ struct perf_counts_values count;
+
+ if (!strcmp(evsel->name, "dummy:u"))
+ continue;
+
+ if (strstr(evsel->name, "htm")) {
+ perf_evsel__read(&evsel->core, x, y, &count);
+ y = xyarray__max_y(xy);
+ x = xyarray__max_x(xy);
+ }
+ if (!count.val)
+ return 0;
+ }
+ }
+ }
+
+ return 1;
+}
--
2.52.0
* [PATCH 5/9] tools/perf: process htm auxtrace events and display in perf report -D
2026-07-01 8:41 [PATCH 0/9] tools/perf: Add interface to expose HTM trace data via perf Athira Rajeev
` (3 preceding siblings ...)
2026-07-01 8:41 ` [PATCH 4/9] tools/perf: Add powerpc callback support for arch_record__collect_final_data Athira Rajeev
@ 2026-07-01 8:41 ` Athira Rajeev
2026-07-01 9:05 ` sashiko-bot
2026-07-01 8:41 ` [PATCH 6/9] perf tools powerpc: Add HTM trace data processing and decoding support Athira Rajeev
` (3 subsequent siblings)
8 siblings, 1 reply; 19+ messages in thread
From: Athira Rajeev @ 2026-07-01 8:41 UTC (permalink / raw)
To: acme, jolsa, adrian.hunter, maddy, irogers, namhyung
Cc: linux-perf-users, linuxppc-dev, atrajeev, hbathini, tejas05,
tshah, venkat88
Add the htm PMU auxtrace processing function for "perf script -D".
The auxtrace event processing functions are defined in the file
"util/powerpc-htm.c". Different PERF_RECORD_XXX records are generated
during recording. PERF_RECORD_AUXTRACE_INFO is processed first,
since it is of type perf_user_event_type and the perf session
delivers it through perf_session__process_user_event() first. Define
powerpc_htm_process_auxtrace_info() to handle the processing of
PERF_RECORD_AUXTRACE_INFO records. In this function, initialize
the aux buffer queues using auxtrace_queues__init() and set up the
required infrastructure for aux data processing.
The trace data which is part of each AUXTRACE record will be written
to a file named htm.bin.n<nodeindex>.p<nodalchipindex>.c<coreindexonchip>
Sample output:
# perf record -C 8 -m,256 -e htm/event=0x901032/ ls
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2048.915 MB perf.data ]
# perf script -D -i perf.data
. ... HTM PMU data: size <N> bytes
# perf record -C 1 -e htm/nodalchipindex=2,nodeindex=0,htm_type=1/ -o perf_1.data ls 1>out
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 257.504 MB perf_1.data ]
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
---
tools/perf/arch/powerpc/util/htm.c | 3 +
tools/perf/util/auxtrace.c | 4 +
tools/perf/util/auxtrace.h | 1 +
tools/perf/util/powerpc-htm.c | 149 +++++++++++++++++++++++++++++
tools/perf/util/powerpc-htm.h | 2 +
5 files changed, 159 insertions(+)
diff --git a/tools/perf/arch/powerpc/util/htm.c b/tools/perf/arch/powerpc/util/htm.c
index cc733f45ac9b..0e6638c02716 100644
--- a/tools/perf/arch/powerpc/util/htm.c
+++ b/tools/perf/arch/powerpc/util/htm.c
@@ -56,6 +56,7 @@ htm_recording_options(struct auxtrace_record *ar __maybe_unused,
if (strstarts(pos->name, "htm")) {
pos->needs_auxtrace_mmap = true;
pos->core.attr.aux_watermark = opts->auxtrace_mmap_pages * (size_t)page_size;
+ pos->core.attr.sample_type |= PERF_SAMPLE_RAW;
break;
}
}
@@ -75,6 +76,8 @@ htm_info_fill(struct auxtrace_record *itr __maybe_unused,
struct perf_record_auxtrace_info *auxtrace_info __maybe_unused,
size_t priv_size __maybe_unused)
{
+ auxtrace_info->type = PERF_AUXTRACE_POWERPC_HTM;
+
return 0;
}
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index 0b851f32e98c..9f32f54fad43 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -56,6 +56,7 @@
#include "s390-cpumsf.h"
#include "util/mmap.h"
#include "powerpc-vpadtl.h"
+#include "powerpc-htm.h"
#include <linux/ctype.h>
#include "symbol/kallsyms.h"
@@ -1427,6 +1428,9 @@ int perf_event__process_auxtrace_info(const struct perf_tool *tool __maybe_unuse
case PERF_AUXTRACE_VPA_DTL:
err = powerpc_vpadtl_process_auxtrace_info(event, session);
break;
+ case PERF_AUXTRACE_POWERPC_HTM:
+ err = powerpc_htm_process_auxtrace_info(event, session);
+ break;
case PERF_AUXTRACE_UNKNOWN:
default:
return -EINVAL;
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index 6947f3f284c0..68b17802a419 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -46,6 +46,7 @@ enum auxtrace_type {
PERF_AUXTRACE_S390_CPUMSF,
PERF_AUXTRACE_HISI_PTT,
PERF_AUXTRACE_VPA_DTL,
+ PERF_AUXTRACE_POWERPC_HTM,
};
enum itrace_period_type {
diff --git a/tools/perf/util/powerpc-htm.c b/tools/perf/util/powerpc-htm.c
index 5043ff41a609..ffddf0e59fc1 100644
--- a/tools/perf/util/powerpc-htm.c
+++ b/tools/perf/util/powerpc-htm.c
@@ -9,6 +9,24 @@
#include "session.h"
#include "debug.h"
#include <internal/xyarray.h>
+#include <linux/string.h>
+#include "color.h"
+#include <inttypes.h>
+#include "powerpc-htm.h"
+#include <errno.h>
+
+struct perf_session;
+
+struct powerpc_htm {
+ struct auxtrace auxtrace;
+ struct auxtrace_queues queues;
+ struct auxtrace_heap heap;
+ u32 auxtrace_type;
+ struct perf_session *session;
+ struct machine *machine;
+ u32 pmu_type;
+ char htmbin_file[64];
+};
/*
* Check if HTM events have more data to collect.
@@ -72,3 +90,134 @@ int arch_perf_record__need_read(struct evlist *evlist)
return 1;
}
+
+static void powerpc_htm_dump_event(size_t len)
+{
+ const char *color = PERF_COLOR_BLUE;
+
+ color_fprintf(stdout, color,
+ ". ... HTM PMU data: size %zu bytes\n",
+ len);
+}
+
+static int powerpc_htm_process_event(struct perf_session *session __maybe_unused,
+ union perf_event *event __maybe_unused,
+ struct perf_sample *sample __maybe_unused,
+ const struct perf_tool *tool __maybe_unused)
+{
+ return 0;
+}
+
+static int powerpc_htm_process_auxtrace_event(struct perf_session *session __maybe_unused,
+ union perf_event *event,
+ const struct perf_tool *tool __maybe_unused)
+{
+ powerpc_htm_dump_event(event->auxtrace.size);
+
+ return 0;
+}
+
+static int powerpc_htm_flush(struct perf_session *session __maybe_unused,
+ const struct perf_tool *tool __maybe_unused)
+{
+ return 0;
+}
+
+static void powerpc_htm_free_events(struct perf_session *session)
+{
+ struct powerpc_htm *htm = container_of(session->auxtrace, struct powerpc_htm,
+ auxtrace);
+ struct auxtrace_queues *queues = &htm->queues;
+ unsigned int i;
+
+ for (i = 0; i < queues->nr_queues; i++)
+ zfree(&queues->queue_array[i].priv);
+
+ auxtrace_queues__free(queues);
+}
+
+static void powerpc_htm_free(struct perf_session *session)
+{
+ struct powerpc_htm *htm = container_of(session->auxtrace, struct powerpc_htm,
+ auxtrace);
+
+ powerpc_htm_free_events(session);
+ session->auxtrace = NULL;
+ free(htm);
+}
+static const char * const powerpc_htm_info_fmts[] = {
+ [POWERPC_HTM_TYPE] = " PMU Type %"PRId64"\n",
+};
+
+static void powerpc_htm_print_info(__u64 *arr)
+{
+ if (!dump_trace)
+ return;
+
+ fprintf(stdout, powerpc_htm_info_fmts[POWERPC_HTM_TYPE], arr[POWERPC_HTM_TYPE]);
+}
+
+int powerpc_htm_process_auxtrace_info(union perf_event *event,
+ struct perf_session *session)
+{
+ struct perf_record_auxtrace_info *auxtrace_info = &event->auxtrace_info;
+ struct evsel *evsel = evlist__event2evsel(session->evlist, event);
+ u32 nodeindex, nodalchipindex, coreindexonchip;
+ int config = (evsel->core.attr.config);
+ size_t min_sz = sizeof(u64) * POWERPC_HTM_TYPE;
+ struct powerpc_htm *htm;
+ int err;
+ FILE *fp;
+
+ nodeindex = (config >> 4) & 0xff;
+ nodalchipindex = (config >> 12) & 0xff;
+ coreindexonchip = (config >> 20) & 0xff;
+
+ if (auxtrace_info->header.size < sizeof(struct perf_record_auxtrace_info) +
+ min_sz)
+ return -EINVAL;
+
+ htm = zalloc(sizeof(struct powerpc_htm));
+ if (!htm)
+ return -ENOMEM;
+
+ err = auxtrace_queues__init(&htm->queues);
+ if (err)
+ goto err_free;
+
+ htm->session = session;
+ htm->machine = &session->machines.host; /* No kvm support */
+ htm->auxtrace_type = auxtrace_info->type;
+ htm->pmu_type = auxtrace_info->priv[POWERPC_HTM_TYPE];
+
+ htm->auxtrace.process_event = powerpc_htm_process_event;
+ htm->auxtrace.process_auxtrace_event = powerpc_htm_process_auxtrace_event;
+ htm->auxtrace.flush_events = powerpc_htm_flush;
+ htm->auxtrace.free_events = powerpc_htm_free_events;
+ htm->auxtrace.free = powerpc_htm_free;
+ session->auxtrace = &htm->auxtrace;
+
+ snprintf(htm->htmbin_file, sizeof(htm->htmbin_file), "htm.bin.n%d.p%d.c%d", nodeindex, nodalchipindex, coreindexonchip);
+ fp = fopen(htm->htmbin_file, "w");
+ if (!fp) {
+ pr_err("Failed to create %s: %s\n", htm->htmbin_file, strerror(errno));
+ return -errno;
+ }
+ fclose(fp);
+
+ powerpc_htm_print_info(&auxtrace_info->priv[0]);
+
+ err = auxtrace_queues__process_index(&htm->queues, session);
+ if (err)
+ goto err_free_queues;
+
+ return 0;
+
+err_free_queues:
+ auxtrace_queues__free(&htm->queues);
+ session->auxtrace = NULL;
+
+err_free:
+ free(htm);
+ return err;
+}
diff --git a/tools/perf/util/powerpc-htm.h b/tools/perf/util/powerpc-htm.h
index be7f8c03e161..0dc31fa252b4 100644
--- a/tools/perf/util/powerpc-htm.h
+++ b/tools/perf/util/powerpc-htm.h
@@ -20,4 +20,6 @@ struct perf_session;
struct perf_pmu;
struct auxtrace_record *htm_recording_init(struct evsel *pos);
+int powerpc_htm_process_auxtrace_info(union perf_event *event,
+ struct perf_session *session);
#endif
--
2.52.0
* [PATCH 6/9] perf tools powerpc: Add HTM trace data processing and decoding support
2026-07-01 8:41 [PATCH 0/9] tools/perf: Add interface to expose HTM trace data via perf Athira Rajeev
` (4 preceding siblings ...)
2026-07-01 8:41 ` [PATCH 5/9] tools/perf: process htm auxtrace events and display in perf report -D Athira Rajeev
@ 2026-07-01 8:41 ` Athira Rajeev
2026-07-01 9:06 ` sashiko-bot
2026-07-01 8:41 ` [PATCH 7/9] perf tools powerpc: Add physical to logical address mapping for HTM traces Athira Rajeev
` (2 subsequent siblings)
8 siblings, 1 reply; 19+ messages in thread
From: Athira Rajeev @ 2026-07-01 8:41 UTC (permalink / raw)
To: acme, jolsa, adrian.hunter, maddy, irogers, namhyung
Cc: linux-perf-users, linuxppc-dev, atrajeev, hbathini, tejas05,
tshah, venkat88
perf data includes a System Memory Configuration dump. This
information helps to understand the physical to logical real
address mapping for the logical partitions in the system.
To help identify the start of the memory
mapping data in the auxiliary buffer, two PERF_SAMPLE_RAW records
are also present in the ring buffer. The first PERF_SAMPLE_RAW record
marks the beginning of the system memory mapping data in the aux buffer.
The second PERF_SAMPLE_RAW record marks the end of the trace
data in the aux buffer and also contains the total size of the memory
map data. These raw sample records are used during post-processing.
Add support for processing Hardware Trace Macro (HTM) auxiliary trace
data collected via perf AUX buffers. This enables post-processing of
HTM traces, including the system memory configuration and trace data.
HTM trace data includes two types of information:
1. Bus traces captured in the AUX buffer
2. System Memory Configuration that maps physical to logical real
addresses for logical partitions
The implementation handles the challenge of large HTM trace buffers
(up to 8GB) being collected through perf AUX buffers (typically
16MB) by reading data in chunks during post-processing.
Key features:
- Process PERF_RECORD_SAMPLE events with RAW data that mark boundaries
between trace data and memory configuration data in the AUX buffer
- Write HTM trace data to htm.bin.nXpXcX files where X represents
node, chip, and core indices extracted from the event configuration
- Write system memory configuration to translation.nXpXcX files for
address mapping analysis
- Integrate with external htmdecode tool for trace decoding
when available (config bit 0 set indicates Bus traces)
- Use fork/exec pattern for secure external command execution with
proper error handling and exit code checking
The memory configuration data is written in 32-byte entries with the
entry count stored at offset 0x10 in big-endian format. The first
PERF_SAMPLE_RAW record marks the start of memory mapping data, while
the second marks the end and contains the total buffer count.
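The layout described above can be sketched as follows: read the
big-endian entry count at offset 0x10 and work out how many bytes of
32-byte entries to copy. This is a minimal sketch, not the patch's code.
The "+ 1" mirrors the entries++ in write_htm(), which appears to account
for the 32-byte header entry. The bounds check against the buffer size
is an addition that the patch does not have, and the names load_be64()
and htm_mem_chunk_bytes() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HTM_MEM_ENTRY_SIZE	32	/* bytes per memory map entry */
#define HTM_MEM_COUNT_OFF	0x10	/* offset of the big-endian count */

/* Read a 64-bit big-endian value without alignment requirements. */
static uint64_t load_be64(const unsigned char *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/*
 * Return the number of bytes occupied by the header entry plus the
 * counted entries, or 0 if the buffer is too small to hold them.
 */
static size_t htm_mem_chunk_bytes(const unsigned char *buf, size_t size)
{
	uint64_t entries;

	assert(buf);
	if (size < HTM_MEM_ENTRY_SIZE)
		return 0;

	/* +1 for the header entry that carries the count itself */
	entries = load_be64(buf + HTM_MEM_COUNT_OFF) + 1;
	if (entries > size / HTM_MEM_ENTRY_SIZE)
		return 0;

	return (size_t)entries * HTM_MEM_ENTRY_SIZE;
}
```

Returning 0 for an oversized count keeps a corrupted header from
driving a write past the end of the AUX buffer.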
Error handling includes:
- NULL checks for all file operations
- Verification of write operations
- Graceful degradation if htmdecode is not installed
- Proper resource cleanup (file handles, memory mappings)
Example usage:
# perf record -C 1 -e htm/nodalchipindex=2,nodeindex=0,htm_type=1/ <workload> # Collect trace data
# perf script -D # Shows HTM trace data
# ls htm.bin.* # Binary trace files
# ls translation.* # Memory configuration files
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
---
tools/perf/util/powerpc-htm.c | 225 +++++++++++++++++++++++++++++++++-
1 file changed, 224 insertions(+), 1 deletion(-)
diff --git a/tools/perf/util/powerpc-htm.c b/tools/perf/util/powerpc-htm.c
index ffddf0e59fc1..487989ca4fc7 100644
--- a/tools/perf/util/powerpc-htm.c
+++ b/tools/perf/util/powerpc-htm.c
@@ -14,6 +14,12 @@
#include <inttypes.h>
#include "powerpc-htm.h"
#include <errno.h>
+#include <sys/stat.h>
+#include <unistd.h>
+#include <sys/mman.h>
+#include "sample.h"
+#include <sys/types.h>
+#include <sys/wait.h>
struct perf_session;
@@ -26,8 +32,140 @@ struct powerpc_htm {
struct machine *machine;
u32 pmu_type;
char htmbin_file[64];
+ char trans_file[64];
+ int htm_mem_entries;
+ int mem_maps;
};
+struct htm_mem {
+ uint64_t phy_real;
+ uint64_t logical_real;
+ uint32_t lp_index;
+ uint8_t mem_tier;
+ uint8_t mem_type;
+ uint16_t res;
+ uint64_t size;
+};
+
+static int run_htmdecode(const char *input_file, const char *output_file)
+{
+ pid_t pid;
+ int status;
+
+ pid = fork();
+ if (pid == -1) {
+ pr_err("fork() failed: %s\n", strerror(errno));
+ return -errno;
+ }
+
+ if (pid == 0) {
+ /* Child process */
+ int fd = open(output_file, O_WRONLY | O_CREAT | O_TRUNC, 0644);
+
+ if (fd == -1) {
+ pr_err("Failed to open output file: %s\n", strerror(errno));
+ exit(1);
+ }
+
+ /* Redirect stdout to output file */
+ dup2(fd, STDOUT_FILENO);
+ close(fd);
+
+ /* Execute htmdecode - execlp searches PATH automatically */
+ execlp("htmdecode", "htmdecode", "-o", "-j", "-w", "1",
+ "-f", input_file, NULL);
+
+ /* If execlp returns, it failed */
+ pr_err("Failed to execute htmdecode: %s\n", strerror(errno));
+ if (errno == ENOENT)
+ pr_err("htmdecode not found in PATH\n");
+
+ exit(127); /* Standard "command not found" exit code */
+ }
+
+ /* Parent process - wait for child */
+ if (waitpid(pid, &status, 0) == -1) {
+ pr_err("waitpid() failed: %s\n", strerror(errno));
+ return -errno;
+ }
+
+ /* Check exit status */
+ if (WIFEXITED(status)) {
+ int exit_code = WEXITSTATUS(status);
+
+ if (exit_code == 127) {
+ pr_err("htmdecode not found in PATH\n");
+ return -ENOENT;
+ } else if (exit_code != 0) {
+ pr_err("htmdecode failed with exit code %d\n", exit_code);
+ return -EINVAL;
+ }
+ } else if (WIFSIGNALED(status)) {
+ pr_err("htmdecode killed by signal %d\n", WTERMSIG(status));
+ return -EINTR;
+ }
+
+ return 0;
+}
+
+static int create_mem_maps(struct powerpc_htm *htm)
+{
+ off_t file_size;
+ void *htmdata, *mapped_data;
+ int fd;
+ struct stat file_info;
+ struct htm_mem *mem;
+ char tracefile[128];
+ int ret;
+
+ snprintf(tracefile, sizeof(tracefile), "%s.out", htm->htmbin_file);
+
+ ret = run_htmdecode(htm->htmbin_file, tracefile);
+ if (ret) {
+ if (ret == -ENOENT)
+ pr_info("htmdecode not found. Install htmdecode to decode traces.\n");
+ else
+ pr_info("htmdecode failed with error %d\n", ret);
+ return ret;
+ }
+
+ fd = open(htm->trans_file, O_RDONLY);
+ if (fd == -1) {
+ pr_err("Failed to open %s: %s\n", htm->trans_file, strerror(errno));
+ return -1;
+ }
+
+ if (fstat(fd, &file_info) == -1) {
+ close(fd);
+ pr_err("fstat failed on %s: %s\n", htm->trans_file, strerror(errno));
+ return -1;
+ }
+
+ file_size = file_info.st_size;
+
+ mapped_data = mmap(NULL, file_size, PROT_READ, MAP_PRIVATE, fd, 0);
+ if (mapped_data == MAP_FAILED) {
+ close(fd);
+ pr_err("mmap failed on %s: %s\n", htm->trans_file, strerror(errno));
+ return -1;
+ }
+
+ htmdata = mapped_data + 0x20;
+ mem = (struct htm_mem *)htmdata;
+
+ if (!mem || !htm->htm_mem_entries) {
+ pr_info("No memory mapping entries captured in HTM translation\n");
+ munmap(mapped_data, file_size);
+ close(fd);
+ return -1;
+ }
+
+ munmap(mapped_data, file_size);
+ close(fd);
+
+ return 0;
+}
+
/*
* Check if HTM events have more data to collect.
*
@@ -95,9 +233,55 @@ static void powerpc_htm_dump_event(size_t len)
{
const char *color = PERF_COLOR_BLUE;
- color_fprintf(stdout, color,
+ if (dump_trace) {
+ color_fprintf(stdout, color,
". ... HTM PMU data: size %zu bytes\n",
len);
+ }
+}
+
+static int write_htm(void *data, size_t size, struct powerpc_htm *htm)
+{
+ FILE *fp;
+ u64 *num_entries;
+ size_t entries;
+ size_t written;
+ int ret = -1;
+
+ if (htm->mem_maps) {
+ fp = fopen(htm->trans_file, "ab");
+ if (!fp) {
+ pr_err("Failed to open %s: %s\n", htm->trans_file, strerror(errno));
+ return ret;
+ }
+ num_entries = data + 0x10;
+ entries = be64_to_cpu(*num_entries);
+ entries++;
+ written = fwrite(data, 32, entries, fp);
+ if (written != entries) {
+ pr_err("Failed to write data: expected %zu, wrote %zu\n", entries, written);
+ fclose(fp);
+ return ret;
+ }
+ fclose(fp);
+ htm->htm_mem_entries += entries;
+ return 0;
+ }
+
+ fp = fopen(htm->htmbin_file, "a");
+ if (!fp) {
+ pr_err("Failed to open %s: %s\n", htm->htmbin_file, strerror(errno));
+ return ret;
+ }
+ written = fwrite(data, size, 1, fp);
+ if (!written) {
+ pr_err("Failed to write htm trace data\n");
+ fclose(fp);
+ return ret;
+ }
+ fclose(fp);
+
+ return 0;
}
static int powerpc_htm_process_event(struct perf_session *session __maybe_unused,
@@ -105,6 +289,37 @@ static int powerpc_htm_process_event(struct perf_session *session __maybe_unused
struct perf_sample *sample __maybe_unused,
const struct perf_tool *tool __maybe_unused)
{
+ struct powerpc_htm *htm = container_of(session->auxtrace, struct powerpc_htm,
+ auxtrace);
+
+ if ((event->header.type == PERF_RECORD_SAMPLE) && sample->raw_data) {
+ int *content = (int *)sample->raw_data;
+ struct evsel *evsel = evlist__event2evsel(session->evlist, event);
+ int config = (evsel->core.attr.config) & 0xF;
+ struct auxtrace_buffer *buffer = NULL;
+ struct auxtrace_queues *queues = &htm->queues;
+ unsigned int i = 0;
+ int j = 0;
+
+ if (strstr(evsel->name, "htm") == NULL)
+ return 0;
+
+ for (i = 0; i < queues->nr_queues; i++) {
+ buffer = auxtrace_buffer__next(&queues->queue_array[i], buffer);
+ for (; buffer;) {
+ if (j >= *content)
+ htm->mem_maps = 1;
+ if (write_htm(buffer->data, buffer->size, htm))
+ return -1;
+ j++;
+ buffer = auxtrace_buffer__next(&queues->queue_array[i], buffer);
+ }
+ }
+ /* Only for power bus traces, we decode traces */
+ if (config == 1)
+ create_mem_maps(htm);
+ }
+
return 0;
}
@@ -205,6 +420,14 @@ int powerpc_htm_process_auxtrace_info(union perf_event *event,
}
fclose(fp);
+ snprintf(htm->trans_file, sizeof(htm->trans_file), "translation.n%d.p%d.c%d", nodeindex, nodalchipindex, coreindexonchip);
+ fp = fopen(htm->trans_file, "w");
+ if (!fp) {
+ pr_err("Failed to create %s: %s\n", htm->trans_file, strerror(errno));
+ return -errno;
+ }
+ fclose(fp);
+
powerpc_htm_print_info(&auxtrace_info->priv[0]);
err = auxtrace_queues__process_index(&htm->queues, session);
--
2.52.0
* [PATCH 7/9] perf tools powerpc: Add physical to logical address mapping for HTM traces
2026-07-01 8:41 [PATCH 0/9] tools/perf: Add interface to expose HTM trace data via perf Athira Rajeev
` (5 preceding siblings ...)
2026-07-01 8:41 ` [PATCH 6/9] perf tools powerpc: Add HTM trace data processing and decoding support Athira Rajeev
@ 2026-07-01 8:41 ` Athira Rajeev
2026-07-01 9:07 ` sashiko-bot
2026-07-01 8:41 ` [PATCH 8/9] tools/perf/powerpc: Add event name as htm of PERF_TYPE_SYNTH type to present htm samples Athira Rajeev
2026-07-01 8:41 ` [PATCH 9/9] tools/perf/powerpc: Add logical address in decoded traces Athira Rajeev
8 siblings, 1 reply; 19+ messages in thread
From: Athira Rajeev @ 2026-07-01 8:41 UTC (permalink / raw)
To: acme, jolsa, adrian.hunter, maddy, irogers, namhyung
Cc: linux-perf-users, linuxppc-dev, atrajeev, hbathini, tejas05,
tshah, venkat88
From: Tanushree Shah <tshah@linux.ibm.com>
Add support for mapping physical addresses from HTM (Hardware Trace Macro)
traces to logical addresses within the current LPAR (Logical Partition).
This enables correlation of HTM trace data with the logical address space
visible to applications and the kernel.
HTM traces capture physical memory addresses from the transactions, but for
meaningful analysis, these need to be mapped to the logical addresses used
by the partition. This patch implements the mapping by:
1. Reading the current partition ID from /proc/powerpc/lparcfg to identify
which LPAR the trace belongs to
2. Extracting memory map entries from the HTM system memory configuration
data, which contains the physical start address, logical start address,
LP index and size of each memory range
3. Parsing the decoded HTM trace file using regex patterns to extract
physical addresses and their associated event labels from each trace
entry
4. For each physical address in the trace, finding the matching memory
map entry by:
- Checking if the address falls within the entry's physical range
- Verifying the entry belongs to the current partition (LP index match)
- Computing the offset from the physical range start
- Adding the offset to the logical range start to get the logical address
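The lookup in step 4 can be sketched as a standalone function. This is
a minimal sketch under stated assumptions, not the patch's
find_logical_addr(): struct mem_range and phys_to_logical() are
hypothetical names, and the result is returned through an out-parameter
so that a logical address of 0 can be told apart from "no match",
whereas the patch returns 0 for both cases.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same mask as the patch: clear the top bit of the physical start. */
#define PHYS_ADDR_MASK 0x7FFFFFFFFFFFFFFFULL

/* Hypothetical, host-endian view of one memory map entry. */
struct mem_range {
	uint64_t phys_start;
	uint64_t logical_start;
	uint64_t size;
	uint32_t lp_index;
};

/*
 * Translate phys into a logical address for partition lp_index.
 * Returns 1 and stores the result in *logical on a match, else 0.
 */
static int phys_to_logical(uint64_t phys, const struct mem_range *ranges,
			   size_t n, uint32_t lp_index, uint64_t *logical)
{
	assert(ranges && logical);

	for (size_t i = 0; i < n; i++) {
		uint64_t start = ranges[i].phys_start & PHYS_ADDR_MASK;

		/* Entry must belong to the requested partition */
		if (ranges[i].lp_index != lp_index)
			continue;
		/* phys - start < size avoids overflow in start + size */
		if (phys < start || phys - start >= ranges[i].size)
			continue;

		*logical = ranges[i].logical_start + (phys - start);
		return 1;
	}

	return 0;
}
```

For example, with a range starting at physical 0x20000000 mapped to
logical 0x40000000 for LP index 2, physical address 0x20001000 maps to
logical 0x40001000.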
The implementation uses dynamic memory allocation to handle variable numbers
of trace entries and memory map entries.
Data structures:
- struct mem_entries: Stores memory map metadata (physical start, logical
start, LP index, size) extracted from HTM system memory configuration
- struct addr_map: Stores the mapping results (physical address,
logical address) for each trace entry
The mapping results are output via pr_debug() for verification during
development and debugging. This information is essential for subsequent
patches that will use the logical addresses to generate synthetic perf
samples.
Error handling includes:
- NULL checks for all memory allocations
- Validation of file operations (fopen, fstat, mmap)
- Proper resource cleanup on all error paths
- Consistent error return codes using negative errno values
- Descriptive error messages using perf's pr_err() infrastructure
This patch is part of the HTM trace processing pipeline and works in
conjunction with:
- HTM trace collection (kernel driver)
- HTM trace decoding (htmdecode tool)
- Synthetic sample generation (subsequent patch)
This patch incorporates the following changes:
Store HTM memory map entries (start physical, start logical,
LP index) into a dedicated struct.
Parse the decoded HTM trace file, extract the "addr" and
"label" fields, and store them in a struct.
For each address in the trace, the code checks for a matching
memory map entry with the same LP index. If the address falls
within the entry's range, the offset is computed and added to
the logical start address (taken from the memory map entry) to
get the logical address of the given address.
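The address extraction can be sketched in isolation with POSIX regex.
The pattern "addr:0x[0-9A-Fa-f]+" is the one used in
process_trace_file(); the function name extract_addrs() is hypothetical,
and any input line used with it here is made up for illustration, since
the htmdecode output format is not shown in this series.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Collect up to max "addr:0x..." values from one line into out[].
 * Returns the number of addresses stored.
 */
static size_t extract_addrs(const char *line, unsigned long long *out,
			    size_t max)
{
	regex_t re;
	regmatch_t m;
	const char *p = line;
	size_t n = 0;

	assert(line && out);
	if (regcomp(&re, "addr:0x[0-9A-Fa-f]+", REG_EXTENDED) != 0)
		return 0;

	while (n < max && regexec(&re, p, 1, &m, 0) == 0) {
		/* Skip "addr:"; strtoull() in base 16 accepts the 0x prefix */
		out[n++] = strtoull(p + m.rm_so + 5, NULL, 16);
		/* Always advance past the match so the loop terminates */
		p += m.rm_eo;
	}

	regfree(&re);
	return n;
}
```

Advancing the cursor on every iteration, and freeing the compiled
pattern on the single exit path, keeps the loop bounded and leak-free
regardless of how many addresses a line carries.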
Signed-off-by: Tanushree Shah <tshah@linux.ibm.com>
---
tools/perf/util/powerpc-htm.c | 264 ++++++++++++++++++++++++++++++++++
1 file changed, 264 insertions(+)
diff --git a/tools/perf/util/powerpc-htm.c b/tools/perf/util/powerpc-htm.c
index 487989ca4fc7..83253850870c 100644
--- a/tools/perf/util/powerpc-htm.c
+++ b/tools/perf/util/powerpc-htm.c
@@ -20,6 +20,13 @@
#include "sample.h"
#include <sys/types.h>
#include <sys/wait.h>
+#include <regex.h>
+#include <ctype.h>
+#include <errno.h>
+
+/* mask the 64th bit of a physical address */
+#define PHYS_ADDR_MASK 0x7FFFFFFFFFFFFFFFUL
+const char *lpar_cfg = "/proc/powerpc/lparcfg";
struct perf_session;
@@ -47,6 +54,19 @@ struct htm_mem {
uint64_t size;
};
+struct mem_entries {
+ unsigned long long phy_addr;
+ unsigned long logical_addr;
+ u32 lp_index;
+ u64 size;
+};
+
+struct addr_map {
+ char event[64];
+ unsigned long long phys_addr;
+ unsigned long logical_addr;
+};
+
static int run_htmdecode(const char *input_file, const char *output_file)
{
pid_t pid;
@@ -108,6 +128,187 @@ static int run_htmdecode(const char *input_file, const char *output_file)
return 0;
}
+static void *safe_realloc(void *ptr, size_t new_size)
+{
+ void *tmp = realloc(ptr, new_size);
+
+ if (!tmp) {
+ pr_err("realloc failed: %s\n", strerror(errno));
+ return NULL;
+ }
+
+ return tmp;
+}
+
+static int add_map_entry(struct addr_map **arr, size_t *count, size_t *cap, struct addr_map entry)
+{
+ if (*count >= *cap) {
+ size_t new_cap = (*cap == 0) ? 1024 : (*cap * 2);
+ void *tmp = safe_realloc(*arr, new_cap * sizeof(struct addr_map));
+
+ if (!tmp)
+ return -1; // allocation failed
+ *arr = tmp;
+ *cap = new_cap;
+
+ }
+
+ (*arr)[(*count)++] = entry;
+ return 0;
+}
+
+/*
+ * This effectively maps a physical address to its logical address
+ * within the selected LP partition.
+ */
+static unsigned long find_logical_addr(unsigned long long given_addr,
+ struct mem_entries *mem_entries_array,
+ size_t n_entries,
+ u32 lp_filter)
+{
+ for (size_t i = 0; i < n_entries; i++) {
+ unsigned long long start = mem_entries_array[i].phy_addr & PHYS_ADDR_MASK;
+ unsigned long long end = start + mem_entries_array[i].size;
+
+ /* Skip entries with invalid logical_start sentinel */
+ if (mem_entries_array[i].logical_addr == UINT64_MAX) {
+ pr_debug(" SKIP i=%zu: logical_start sentinel=0x%016llx\n",
+ i, (unsigned long long)mem_entries_array[i].logical_addr);
+ continue;
+ }
+
+ /*
+ * Check if 'given_addr' falls within the physical memory range of this entry
+ * and belongs to the LP partition indicated by 'lp_filter'.
+ * If so, calculate:
+ * 'offset' and the 'logical address'
+ */
+ if (start <= given_addr && given_addr < end &&
+ mem_entries_array[i].lp_index == lp_filter) {
+ unsigned long long offset = given_addr - start;
+ unsigned long logical = mem_entries_array[i].logical_addr + offset;
+
+ pr_debug("DEBUG: Condition hit at i=%zu given_addr=0x%llx start=0x%llx end=0x%llx lp_index=%u\n",
+ i, given_addr,
+ start, end,
+ (unsigned int)mem_entries_array[i].lp_index);
+ pr_debug("logical = 0x%016lx\n", logical);
+ return logical;
+ }
+ }
+
+ return 0;
+}
+
+/*
+ * Parse the HTM trace file line by line, extracting memory addresses and labels.
+ * Map each memory address to a corresponding logical address for a given
+ * lp_filter. Store the results in a dynamically growing map of entries.
+ */
+static struct addr_map *process_trace_file(const char *trace_file,
+ struct mem_entries *mem_entries_array,
+ size_t n_entries,
+ u32 lp_filter,
+ size_t *count_out)
+{
+ regex_t addr_regex, label_regex;
+ struct addr_map *maps;
+ size_t count, cap;
+ char *line = NULL;
+ size_t len;
+ regmatch_t pmatch[2];
+ const char *ptr;
+ unsigned long logical_addr;
+ size_t total_phys = 0;
+ size_t total_phys_to_logical = 0;
+
+ FILE *fp = fopen(trace_file, "r");
+
+ if (!fp) {
+ pr_err("Failed to open trace file %s: %s\n", trace_file, strerror(errno));
+ return NULL;
+ }
+
+ if (regcomp(&addr_regex, "addr:0x[0-9A-Fa-f]+", REG_EXTENDED) != 0) {
+ pr_err("Failed to compile addr_regex\n");
+ return NULL;
+ }
+
+ if (regcomp(&label_regex,
+ "^[[:space:]]*[0-9A-Fa-f]+ : [^[:space:]]+[[:space:]]+([^[:space:]]+)",
+ REG_EXTENDED) != 0) {
+ pr_err("Failed to compile label_regex\n");
+ return NULL;
+ }
+
+ maps = NULL;
+ count = 0;
+ cap = 0;
+
+ while (getline(&line, &len, fp) != -1) {
+ if (regexec(&label_regex, line, 2, pmatch, 0) == 0) {
+ char label[64] = {0};
+ int start = pmatch[1].rm_so;
+ int end = pmatch[1].rm_eo;
+ int line_len = end - start;
+
+ if (line_len < 0)
+ line_len = 0;
+ if ((size_t)line_len > sizeof(label) - 1)
+ line_len = (int)(sizeof(label) - 1);
+
+ /* Use snprintf to copy exactly len characters and
+ * always null terminate
+ */
+ snprintf(label, sizeof(label), "%.*s", line_len, line + start);
+ ptr = line;
+ while (regexec(&addr_regex, ptr, 1, pmatch, 0) == 0) {
+ unsigned long long phys_addr = 0;
+ struct addr_map entry = {0};
+
+ if (sscanf(ptr + pmatch[0].rm_so + strlen("addr:"),
+ "%llx", &phys_addr) != 1) {
+ pr_debug("Failed to parse phys addr from trace line\n");
+ continue;
+ }
+
+ total_phys++;
+ pr_debug("Total Phys[%zu]: 0x%016llx\n", total_phys, phys_addr);
+ logical_addr = find_logical_addr(phys_addr,
+ mem_entries_array,
+ n_entries,
+ lp_filter);
+ if (logical_addr == 0) {
+ ptr += pmatch[0].rm_eo;
+ continue;
+ } else {
+ total_phys_to_logical++;
+ pr_debug(" Phys: 0x%016llx to Logical: 0x%016lx\n",
+ phys_addr,
+ logical_addr);
+ }
+ pr_debug("Total physical to logical found : %zu\n",
+ total_phys_to_logical);
+ snprintf(entry.event, sizeof(entry.event), "%s", label);
+ entry.phys_addr = phys_addr;
+ entry.logical_addr = logical_addr;
+
+ add_map_entry(&maps, &count, &cap, entry);
+
+ ptr += pmatch[0].rm_eo;
+ }
+ }
+ }
+
+ free(line);
+ fclose(fp);
+ regfree(&addr_regex);
+ regfree(&label_regex);
+
+ *count_out = count;
+ return maps;
+}
+
static int create_mem_maps(struct powerpc_htm *htm)
{
off_t file_size;
@@ -117,6 +318,13 @@ static int create_mem_maps(struct powerpc_htm *htm)
struct htm_mem *mem;
char tracefile[128];
int ret;
+ u32 lp_filter = 0;
+ size_t n_entries = htm->htm_mem_entries;
+ struct mem_entries *mem_entries_array;
+ size_t num_maps = 0;
+ struct addr_map *maps;
+ FILE *fp;
+ char lp_line[256];
snprintf(tracefile, sizeof(tracefile), "%s.out", htm->htmbin_file);
@@ -129,6 +337,27 @@ static int create_mem_maps(struct powerpc_htm *htm)
return ret;
}
+ /* get the lp index */
+ fp = fopen(lpar_cfg, "r");
+ if (!fp) {
+ pr_err("Failed to open %s: %s\n", lpar_cfg, strerror(errno));
+ return -errno;
+ }
+
+ while (fgets(lp_line, sizeof(lp_line), fp)) {
+ if (strncmp(lp_line, "partition_id=", 13) == 0) {
+ lp_filter = strtoul(lp_line + 13, NULL, 10);
+ break;
+ }
+ }
+
+ fclose(fp);
+
+ if (lp_filter == 0)
+ pr_info("partition_id not found in %s\n", lpar_cfg);
+ else
+ pr_info("Using partition_id = %" PRIu32 "\n", lp_filter);
+
fd = open(htm->trans_file, O_RDONLY);
if (fd == -1) {
pr_err("Failed to open %s: %s\n", htm->trans_file, strerror(errno));
@@ -160,9 +389,44 @@ static int create_mem_maps(struct powerpc_htm *htm)
return -1;
}
+ mem_entries_array = malloc(n_entries * sizeof(struct mem_entries));
+ if (!mem_entries_array) {
+ pr_err("Failed to allocate memory for mem entries: %s\n", strerror(errno));
+ munmap(mapped_data, file_size);
+ close(fd);
+ return -ENOMEM;
+ }
+
+ /* get the HTM memory map data and store it in mem_entries_array
+ * to use it later on for physical->logical mapping
+ */
+ for (u64 i = 0; i < n_entries; i++, mem++) {
+ mem_entries_array[i].phy_addr = bswap_64(mem->phy_real);
+ mem_entries_array[i].logical_addr = bswap_64(mem->logical_real);
+ mem_entries_array[i].lp_index = bswap_32(mem->lp_index);
+ mem_entries_array[i].size = bswap_64(mem->size);
+ }
+
munmap(mapped_data, file_size);
close(fd);
+ maps = process_trace_file(tracefile, mem_entries_array, n_entries, lp_filter, &num_maps);
+ if (!maps) {
+ pr_err("Error processing physical addresses from trace file\n");
+ free(mem_entries_array);
+ return -EINVAL;
+ }
+
+ for (size_t i = 0; i < num_maps; i++) {
+ pr_debug("Event: %-20s | Phys: 0x%016llx | Logical: 0x%lx\n",
+ maps[i].event,
+ maps[i].phys_addr,
+ (unsigned long)maps[i].logical_addr);
+ }
+
+ free(maps);
+ free(mem_entries_array);
+
return 0;
}
--
2.52.0
* [PATCH 8/9] tools/perf/powerpc: Add event name as htm of PERF_TYPE_SYNTH type to present htm samples
2026-07-01 8:41 [PATCH 0/9] tools/perf: Add interface to expose HTM trace data via perf Athira Rajeev
` (6 preceding siblings ...)
2026-07-01 8:41 ` [PATCH 7/9] perf tools powerpc: Add physical to logical address mapping for HTM traces Athira Rajeev
@ 2026-07-01 8:41 ` Athira Rajeev
2026-07-01 9:12 ` sashiko-bot
2026-07-01 8:41 ` [PATCH 9/9] tools/perf/powerpc: Add logical address in decoded traces Athira Rajeev
8 siblings, 1 reply; 19+ messages in thread
From: Athira Rajeev @ 2026-07-01 8:41 UTC (permalink / raw)
To: acme, jolsa, adrian.hunter, maddy, irogers, namhyung
Cc: linux-perf-users, linuxppc-dev, atrajeev, hbathini, tejas05,
tshah, venkat88
HTM trace details are captured as-is in PERF_RECORD_AUXTRACE
records. To present HTM entries as samples, create an event
named "htm" of type PERF_TYPE_SYNTH.
Add a perf_synth_id, PERF_SYNTH_POWERPC_HTM, as the config value for the
event. Create the sample id as a fixed offset from the evsel id.
Invoke powerpc_htm_create_sample() using the logical address
as the sample ip.
This helps in identifying hot logical addresses in the
traces.
Usage:
# perf record -C 1 -e htm/nodalchipindex=2,nodeindex=0,htm_type=1/ -o perf_1.data ls 1>out
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 257.504 MB perf_1.data ]
# ./perf report -i perf_1.data
# Samples: 8M of event 'htm'
# Event count (approx.): 8515199
#
# Children Self Command Shared Object Symbol
# ........ ........ ....... ................ ......................
#
0.61% 0.61% swapper [unknown] [.] 0x00000006fd567fe0
0.33% 0.33% swapper [unknown] [.] 0x00000006fc194b20
0.20% 0.20% swapper [unknown] [.] 0x0000000066113f80
0.18% 0.18% swapper [unknown] [.] 0x00000007fd888f20
0.15% 0.15% swapper [unknown] [.] 0x00000006fd567fc0
0.08% 0.08% swapper [unknown] [.] 0x00000006fc194b00
0.05% 0.05% swapper [unknown] [.] 0x00000007fd888f00
0.03% 0.03% swapper [unknown] [.] 0x0000000422510700
0.03% 0.03% swapper [unknown] [.] 0x0000000422510820
0.03% 0.03% swapper [unknown] [.] 0x00000007fd888b80
0.02% 0.02% swapper [unknown] [.] 0x000000000a0ece40
0.02% 0.02% swapper [unknown] [.] 0x000000000a0ed2e0
0.01% 0.01% swapper [unknown] [.] 0x00000007fd888c40
0.01% 0.01% swapper [unknown] [.] 0x00000007fd889000
0.01% 0.01% swapper [unknown] [.] 0x00000007fd5bc200
0.01% 0.01% swapper [unknown] [.] 0x00000007fd61c200
0.01% 0.01% swapper [unknown] [.] 0x00000007fd28c200
0.01% 0.01% swapper [unknown] [.] 0x00000007fd22c200
0.01% 0.01% swapper [unknown] [.] 0x00000007fd1fc200
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
---
tools/perf/util/event.h | 1 +
tools/perf/util/powerpc-htm.c | 110 +++++++++++++++++++++++++++++++++-
2 files changed, 109 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 2ea83fdf8a03..f5aa7eb9f5b7 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -118,6 +118,7 @@ enum perf_synth_id {
PERF_SYNTH_INTEL_EVT,
PERF_SYNTH_INTEL_IFLAG_CHG,
PERF_SYNTH_POWERPC_VPA_DTL,
+ PERF_SYNTH_POWERPC_HTM,
};
/*
diff --git a/tools/perf/util/powerpc-htm.c b/tools/perf/util/powerpc-htm.c
index 83253850870c..050fbceac71e 100644
--- a/tools/perf/util/powerpc-htm.c
+++ b/tools/perf/util/powerpc-htm.c
@@ -42,6 +42,7 @@ struct powerpc_htm {
char trans_file[64];
int htm_mem_entries;
int mem_maps;
+ u64 sample_id;
};
struct htm_mem {
@@ -128,6 +129,43 @@ static int run_htmdecode(const char *input_file, const char *output_file)
return 0;
}
+static int powerpc_htm_create_sample(unsigned long addr, struct perf_session *session,
+ struct powerpc_htm *htm)
+{
+ struct perf_sample sample;
+ union perf_event event;
+
+ if (dump_trace)
+ return 0;
+
+ memset(&sample, 0, sizeof(sample));
+ sample.cpumode = PERF_RECORD_MISC_USER;
+
+ if (!addr)
+ return 0;
+
+ if (addr >= 0xc000000000000000)
+ sample.cpumode = PERF_RECORD_MISC_KERNEL;
+
+ sample.ip = addr;
+ sample.period = 1;
+ sample.cpu = 0;
+ sample.id = htm->sample_id;
+ sample.callchain = NULL;
+ sample.branch_stack = NULL;
+ memset(&event, 0, sizeof(event));
+ event.sample.header.type = PERF_RECORD_SAMPLE;
+ event.sample.header.misc = sample.cpumode;
+ event.sample.header.size = sizeof(struct perf_event_header);
+
+ if (perf_session__deliver_synth_event(session, &event, &sample)) {
+ pr_debug("Failed to create sample for htm entry\n");
+ return -1;
+ }
+
+ return 0;
+}
+
static void *safe_realloc(void *ptr, size_t new_size)
{
void *tmp = realloc(ptr, new_size);
@@ -309,7 +347,7 @@ static struct addr_map *process_trace_file(const char *trace_file,
return maps;
}
-static int create_mem_maps(struct powerpc_htm *htm)
+static int create_mem_maps(struct perf_session *session, struct powerpc_htm *htm)
{
off_t file_size;
void *htmdata, *mapped_data;
@@ -422,6 +460,7 @@ static int create_mem_maps(struct powerpc_htm *htm)
maps[i].event,
maps[i].phys_addr,
(unsigned long)maps[i].logical_addr);
+ powerpc_htm_create_sample(maps[i].logical_addr, session, htm);
}
free(maps);
@@ -581,7 +620,7 @@ static int powerpc_htm_process_event(struct perf_session *session __maybe_unused
}
/* Only for power bus traces, we decode traces */
if (config == 1)
- create_mem_maps(htm);
+ create_mem_maps(session, htm);
}
return 0;
@@ -636,6 +675,69 @@ static void powerpc_htm_print_info(__u64 *arr)
fprintf(stdout, powerpc_htm_info_fmts[POWERPC_HTM_TYPE], arr[POWERPC_HTM_TYPE]);
}
+static void set_event_name(struct evlist *evlist, u64 id,
+ const char *name)
+{
+ struct evsel *evsel;
+
+ evlist__for_each_entry(evlist, evsel) {
+ if (evsel->core.id && evsel->core.id[0] == id) {
+ if (evsel->name)
+ zfree(&evsel->name);
+ evsel->name = strdup(name);
+ if (!evsel->name) {
+ pr_err("Failed to allocate memory for event name\n");
+ return;
+ }
+ break;
+ }
+ }
+}
+
+static int
+powerpc_htm_synth_events(struct powerpc_htm *htm, struct perf_session *session)
+{
+ struct evlist *evlist = session->evlist;
+ struct evsel *evsel;
+ struct perf_event_attr attr;
+ bool found = false;
+ u64 id;
+ int err;
+
+ evlist__for_each_entry(evlist, evsel) {
+ if (strstarts(evsel->name, "htm")) {
+ found = true;
+ break;
+ }
+ }
+
+ if (!found) {
+ pr_debug("No selected events with HTM trace data\n");
+ return 0;
+ }
+
+ memset(&attr, 0, sizeof(struct perf_event_attr));
+ attr.size = sizeof(struct perf_event_attr);
+ attr.sample_type = evsel->core.attr.sample_type;
+ attr.sample_id_all = evsel->core.attr.sample_id_all;
+ attr.type = PERF_TYPE_SYNTH;
+ attr.config = PERF_SYNTH_POWERPC_HTM;
+
+ /* create new id val to be a fixed offset from evsel id */
+ id = evsel->core.id[0] + 1000000000;
+ if (!id)
+ id = 1;
+
+ err = perf_session__deliver_synth_attr_event(session, &attr, id);
+ if (err)
+ return err;
+
+ htm->sample_id = id;
+ set_event_name(evlist, id, "htm");
+
+ return 0;
+}
+
int powerpc_htm_process_auxtrace_info(union perf_event *event,
struct perf_session *session)
{
@@ -698,6 +800,10 @@ int powerpc_htm_process_auxtrace_info(union perf_event *event,
if (err)
goto err_free_queues;
+ err = powerpc_htm_synth_events(htm, session);
+ if (err)
+ goto err_free;
+
return 0;
err_free_queues:
--
2.52.0
* [PATCH 9/9] tools/perf/powerpc: Add logical address in decoded traces
2026-07-01 8:41 [PATCH 0/9] tools/perf: Add interface to expose HTM trace data via perf Athira Rajeev
` (7 preceding siblings ...)
2026-07-01 8:41 ` [PATCH 8/9] tools/perf/powerpc: Add event name as htm of PERF_TYPE_SYNTH type to present htm samples Athira Rajeev
@ 2026-07-01 8:41 ` Athira Rajeev
2026-07-01 9:13 ` sashiko-bot
8 siblings, 1 reply; 19+ messages in thread
From: Athira Rajeev @ 2026-07-01 8:41 UTC (permalink / raw)
To: acme, jolsa, adrian.hunter, maddy, irogers, namhyung
Cc: linux-perf-users, linuxppc-dev, atrajeev, hbathini, tejas05,
tshah, venkat88
Enhance the post-processing to translate physical addresses to logical
addresses in the decoded output. This improves debuggability by
allowing direct correlation with source code and debug symbols.
The decoder now creates a .l output file with logical addresses,
making it easier to analyze traces using symbol tables and debuggers.
LP index filtering is made optional when dumping traces to show all
addresses.
Add logical address translation in the post-processing step of
process_trace_file(). For each physical address found in the decoded
trace, find_logical_addr() is used to look up the corresponding
logical address from the mem_entries array. The translated output is
written to a new file with a ".l" suffix alongside the original
decoded trace file.
LP index filtering in find_logical_addr() is made optional via a
new 'filter' parameter. When running with dump_trace (perf script -D),
filtering is disabled so all addresses are translated regardless of
LP index. This ensures complete coverage when dumping traces for
analysis.
Example output files after decoding:
htm.bin.n0.p2.c0.out - decoded trace with physical addresses
htm.bin.n0.p2.c0.out.l - decoded trace with logical addresses
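The range lookup with optional LP index filtering described above can be
sketched as follows. This is a minimal, self-contained illustration and not
the code from this patch: struct mem_range, lookup_logical() and the field
names are hypothetical stand-ins for the mem_entries array and
find_logical_addr().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one entry of the mem_entries array. */
struct mem_range {
	uint64_t phys_start;    /* first physical address of the range */
	uint64_t size;          /* length of the range in bytes */
	uint64_t logical_start; /* logical address that phys_start maps to */
	uint32_t lp_index;      /* logical partition index owning the range */
};

/*
 * Translate a physical address to a logical one.
 * When 'filter' is non-zero, only ranges whose lp_index equals lp_filter
 * are considered; when it is zero, any containing range matches.
 * Returns 0 when no range matches.
 */
static uint64_t lookup_logical(uint64_t phys, const struct mem_range *r,
			       size_t n, uint32_t lp_filter, int filter)
{
	for (size_t i = 0; i < n; i++) {
		/* Skip ranges that do not contain the physical address. */
		if (phys < r[i].phys_start ||
		    phys - r[i].phys_start >= r[i].size)
			continue;
		/* Skip ranges of other partitions only when filtering. */
		if (filter && r[i].lp_index != lp_filter)
			continue;
		return r[i].logical_start + (phys - r[i].phys_start);
	}
	return 0;
}
```

With filter set to 0, as in the dump_trace case, an address belonging to a
different LP index is still translated; with filter set to 1 it is reported
as not found.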
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
---
tools/perf/util/powerpc-htm.c | 81 ++++++++++++++++++++++++++++++++---
1 file changed, 74 insertions(+), 7 deletions(-)
diff --git a/tools/perf/util/powerpc-htm.c b/tools/perf/util/powerpc-htm.c
index 050fbceac71e..1bd2664453c6 100644
--- a/tools/perf/util/powerpc-htm.c
+++ b/tools/perf/util/powerpc-htm.c
@@ -202,7 +202,7 @@ static int add_map_entry(struct addr_map **arr, size_t *count, size_t *cap, stru
static unsigned long find_logical_addr(unsigned long long given_addr,
struct mem_entries *mem_entries_array,
size_t n_entries,
- u32 lp_filter)
+ u32 lp_filter, int filter)
{
for (size_t i = 0; i < n_entries; i++) {
unsigned long long start = mem_entries_array[i].phy_addr & PHYS_ADDR_MASK;
@@ -221,11 +221,13 @@ static unsigned long find_logical_addr(unsigned long long given_addr,
* If so, calculate:
* 'offset' and the 'logical address
*/
- if (start <= given_addr && given_addr < end &&
- mem_entries_array[i].lp_index == lp_filter) {
+ if (start <= given_addr && given_addr < end) {
unsigned long long offset = given_addr - start;
unsigned long logical = mem_entries_array[i].logical_addr + offset;
+ if (filter && (mem_entries_array[i].lp_index != lp_filter))
+ continue;
+
pr_debug("DEBUG: Condition hit at i=%zu given_addr=0x%llx start=0x%llx end=0x%llx lp_index=%u\n",
i, given_addr,
start, end,
@@ -259,36 +261,60 @@ static struct addr_map *process_trace_file(const char *trace_file,
unsigned long logical_addr;
size_t total_phys = 0;
size_t total_phys_to_logical = 0;
-
FILE *fp = fopen(trace_file, "r");
+ size_t prefix_len;
+ int found_match = 0;
+ FILE *fout;
+ int filter_lp = 1;
+ char *output = malloc(strlen(trace_file) + 3); /* +3 for ".l" and null */
+
+ if (!output) {
+ pr_err("Failed to allocate memory for output filename\n");
+ fclose(fp);
+ return NULL;
+ }
if (!fp) {
pr_err("Failed to open trace file %s: %s\n", trace_file, strerror(errno));
return NULL;
}
+ snprintf(output, strlen(trace_file) + 3, "%s.l", trace_file);
+ fout = fopen(output, "w");
+ if (!fout) {
+ pr_err("Failed to open trace output file: %s\n", output);
+ fclose(fp);
+ return NULL;
+ }
+
if (regcomp(&addr_regex, "addr:0x[0-9A-Fa-f]+", REG_EXTENDED) != 0) {
pr_err("Failed to compile addr_regex\n");
- return NULL;
+ goto out;
}
if (regcomp(&label_regex,
"^[[:space:]]*[0-9A-Fa-f]+ : [^[:space:]]+[[:space:]]+([^[:space:]]+)",
REG_EXTENDED) != 0) {
pr_err("Failed to compile label_regex\n");
- return NULL;
+ regfree(&addr_regex);
+ goto out;
}
maps = NULL;
count = 0;
cap = 0;
+ /* When dumping traces, show all addresses regardless of LP index */
+ if (dump_trace)
+ filter_lp = 0;
+
while (getline(&line, &len, fp) != -1) {
if (regexec(&label_regex, line, 2, pmatch, 0) == 0) {
char label[64] = {0};
int start = pmatch[1].rm_so;
int end = pmatch[1].rm_eo;
int line_len = end - start;
+ found_match = 0;
if (line_len < 0)
line_len = 0;
@@ -303,6 +329,10 @@ static struct addr_map *process_trace_file(const char *trace_file,
while (regexec(&addr_regex, ptr, 1, pmatch, 0) == 0) {
unsigned long long phys_addr = 0;
struct addr_map entry = {0};
+ char *hex_start = strstr(line, "addr:0x");
+ const char *target = "addr:0x";
+ char *old_val_ptr;
+ size_t written;
if (sscanf(ptr + pmatch[0].rm_so + strlen("addr:"),
"%llx", &phys_addr) != 1) {
@@ -315,7 +345,30 @@ static struct addr_map *process_trace_file(const char *trace_file,
logical_addr = find_logical_addr(phys_addr,
mem_entries_array,
n_entries,
- lp_filter);
+ lp_filter, filter_lp);
+ /* create output.txt with logical address */
+ if (dump_trace && hex_start) {
+ old_val_ptr = hex_start + strlen(target);
+ prefix_len = hex_start - line;
+ written = fwrite(line, 1, prefix_len, fout);
+ if (written != prefix_len) {
+ pr_err("Failed to write prefix to output file\n");
+ continue;
+ }
+ if (fprintf(fout, "addr:0x%llx\t", (unsigned long long)logical_addr) < 0) {
+ pr_err("Failed to write to output file\n");
+ continue;
+ }
+ while (*old_val_ptr != ' ' && *old_val_ptr != '\n' && *old_val_ptr != '\0') {
+ old_val_ptr++;
+ }
+ if (fprintf(fout, "%s", old_val_ptr) < 0) {
+ pr_err("Failed to write suffix to output file\n");
+ continue;
+ }
+ found_match = 1;
+ }
+
if (logical_addr == 0) {
ptr += pmatch[0].rm_eo;
continue;
@@ -335,6 +388,12 @@ static struct addr_map *process_trace_file(const char *trace_file,
ptr += pmatch[0].rm_eo;
}
+ if (dump_trace && (!found_match) && line) {
+ if (fprintf(fout, "%s", line) < 0) {
+ pr_err("Failed to write line to output file\n");
+ continue;
+ }
+ }
}
}
@@ -342,9 +401,17 @@ static struct addr_map *process_trace_file(const char *trace_file,
fclose(fp);
regfree(&addr_regex);
regfree(&label_regex);
+ fclose(fout);
+ free(output);
*count_out = count;
return maps;
+
+out:
+ fclose(fp);
+ fclose(fout);
+ free(output);
+ return NULL;
}
static int create_mem_maps(struct perf_session *session, struct powerpc_htm *htm)
--
2.52.0
* Re: [PATCH 3/9] tools/perf: Add arch_record__collect_final_data to collect additional data before closing the event
2026-07-01 8:41 ` [PATCH 3/9] tools/perf: Add arch_record__collect_final_data to collect additional data before closing the event Athira Rajeev
@ 2026-07-01 8:54 ` sashiko-bot
0 siblings, 0 replies; 19+ messages in thread
From: sashiko-bot @ 2026-07-01 8:54 UTC (permalink / raw)
To: Athira Rajeev; +Cc: linux-perf-users
Thank you for your contribution! Sashiko AI review found 6 potential issue(s) to consider:
- [High] `record__final_data` implements a defective busy-wait loop that causes CPU pegging and infinite loops.
- [High] Redundant and potentially harmful calls to `evlist__enable` during the shutdown phase.
- [High] Collection of final architecture-specific data is randomly skipped due to an early loop break race condition.
- [High] Final data collection is completely omitted when tracing child workloads.
- [Low] Unnecessary full header inclusions in `record.h` violate subsystem guidelines.
- [Low] The function name mentioned in the commit message does not match the actual code implementation.
--
commit 4b101c454c9b203ae14e9902eb253bcd6be33523
Author: Athira Rajeev <atrajeev@linux.ibm.com>
tools/perf: Add arch_record__collect_final_data to collect additional data before closing the event
While collecting samples using "perf record", function
"__cmd_record" checks if monitoring is done. Once recording
is done, event list will be disabled using "evlist__disable".
After this, event fd won't be read and event will be removed.
Before removing the event, if any additional data needs
to be captured/written to perf.data, currently it's not
possible. Introduce arch_record__collect_final_data to
collect additional data before closing the event
> Introduce arch_record__collect_final_data to
> collect additional data before closing the event
[Severity: Low]
This isn't a bug, but the commit message mentions introducing
arch_record__collect_final_data, whereas the function actually added is named
arch_perf_record__need_read.
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index e915390556752..a60bc039401a2 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -2425,6 +2425,34 @@ static unsigned long record__waking(struct record *rec)
> return waking;
> }
>
> +/*
> + * Weak symbol - architecture can override to indicate if more
> + * data needs to be collected before finishing output.
> + *
> + * Returns: 1 if more data exists, 0 if collection is complete
> + */
> +__weak int arch_perf_record__need_read(struct evlist *evlist __maybe_unused)
> +{
> + return 0; /* Default: no arch-specific data to collect */
> +}
> +
> +static void record__final_data(struct record *rec)
> +{
> + /*
> + * Collect any remaining architecture-specific data.
> + * The arch code checks if more data exists, and we do the actual
> + * reading here since we have access to record__mmap_read_all().
> + */
> + while (arch_perf_record__need_read(rec->evlist)) {
> + if (record__mmap_read_all(rec, false) < 0)
> + break;
[Severity: High]
Does this create a busy-wait loop that pegs the CPU? This loop repeatedly calls
record__mmap_read_all() without yielding or polling.
Additionally, record__mmap_read_all() flushes mmap buffers based on the
thread-local variable, so when running on the main thread it only flushes
thread_data[0]. If the data to be consumed resides in a worker thread's buffer,
could this starve the worker thread and cause an infinite loop?
> + /* Re-enable events for next batch */
> + evlist__enable(rec->evlist);
[Severity: High]
Is it intentional to re-enable the events here during the shutdown phase?
Since this is called when done is true to drain the final buffers, calling
evlist__enable() instructs the kernel to resume generating normal profiling
samples. Could this continuously feed new samples into the buffers and trap
the tool in an infinite loop while generating massive amounts of ioctls?
> + }
> +
> + return;
> +}
> +
[ ... ]
> @@ -2852,6 +2880,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
> */
> if (done && !disabled && !target__none(&opts->target)) {
> trigger_off(&auxtrace_snapshot_trigger);
> + record__final_data(rec);
[Severity: High]
Could the break condition earlier in the event loop cause this final data
collection to be silently bypassed?
If the main thread's mmap buffers contain no new normal events during the
iteration when the done flag is observed:
__cmd_record() {
...
if (hits == thread->samples) {
if (done || draining)
break;
...
}
It breaks out of the loop and skips over this block, randomly missing the
final architecture-specific data.
[Severity: High]
Will this fail to collect final data when tracing child workloads?
The condition !target__none(&opts->target) evaluates to false when a child
workload is specified (e.g., perf record ./my_app). Because record__final_data()
is placed inside this pre-existing guard, won't it silently omit the data
collection for normal child workloads?
> evlist__disable(rec->evlist);
> disabled = true;
> }
> diff --git a/tools/perf/util/record.h b/tools/perf/util/record.h
> index 93627c9a73387..21f51efd36fc7 100644
> --- a/tools/perf/util/record.h
> +++ b/tools/perf/util/record.h
> @@ -8,6 +8,8 @@
> #include <linux/stddef.h>
> #include <linux/perf_event.h>
> #include "util/target.h"
> +#include "util/evlist.h"
> +#include "util/util.h"
[Severity: Low]
Is it possible to use a forward declaration here instead of full header
inclusions?
Since arch_perf_record__need_read() only references struct evlist by pointer, a
forward declaration like "struct evlist;" would avoid including these heavy
headers, aligning with the perf subsystem guidelines.
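A minimal sketch of the forward-declaration approach follows. The prototype
is the one quoted from the patch; the function body is only a stand-in added
here so the fragment compiles on its own, and it is not the perf
implementation.

```c
/* Forward declaration: sufficient because only a pointer is used below. */
struct evlist;

/* Prototype as it might appear in the header; no struct layout is needed. */
int arch_perf_record__need_read(struct evlist *evlist);

/* Stand-in body for illustration only; it never dereferences evlist. */
int arch_perf_record__need_read(struct evlist *evlist)
{
	(void)evlist;
	return 0;
}
```

Only code that accesses members of struct evlist would need the full
definition from the evlist header.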
>
> struct option;
>
> @@ -95,4 +97,6 @@ static inline bool record_opts__no_switch_events(const struct record_opts *opts)
> return opts->record_switch_events_set && !opts->record_switch_events;
> }
>
> +int arch_perf_record__need_read(struct evlist *evlist);
> +
> #endif // _PERF_RECORD_H
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260701084115.80383-1-atrajeev@linux.ibm.com?part=3
* Re: [PATCH 4/9] tools/perf: Add powerpc callback support for arch_record__collect_final_data
2026-07-01 8:41 ` [PATCH 4/9] tools/perf: Add powerpc callback support for arch_record__collect_final_data Athira Rajeev
@ 2026-07-01 8:55 ` sashiko-bot
0 siblings, 0 replies; 19+ messages in thread
From: sashiko-bot @ 2026-07-01 8:55 UTC (permalink / raw)
To: Athira Rajeev; +Cc: linux-perf-users
Thank you for your contribution! Sashiko AI review found 6 potential issue(s) to consider:
- [High] Direct access to `evsel->name` without NULL checks causes a segmentation fault.
- [High] Premature loop termination in `arch_perf_record__need_read` due to artificial overwriting of loop variables.
- [High] Unconditional compilation of PowerPC-specific execution logic globally overrides a cross-platform weak symbol.
- [High] The negative error return value from `perf_evsel__read()` is ignored, leading to premature termination of trace data collection.
- [Medium] The functions and callbacks described in the commit message are completely missing from the implemented code.
- [Low] Malformed include path directive `../../../util/record.h` causes a build failure.
--
commit dee0b2f32508f51e2b57675240fda0c4192160d5
Author: Athira Rajeev <atrajeev@linux.ibm.com>
tools/perf: Add powerpc callback support for arch_record__collect_final_data
This patch adds an architecture-specific hook to collect additional HTM trace
data before closing events. It reads the HTM event counts to determine if
more data is available to be written to the perf data file.
> Add arch_record__collect_final_data to collect additional
> data before closing the event. Define the callback in
> util/powerpc-htm.c
[ ... ]
> the event using evlist__enable after reading event using
> htm_read_data function. The perf_evsel__read returns zero, when
[Severity: Medium]
The commit message mentions adding arch_record__collect_final_data and
reading events using the htm_read_data function. However, the implemented
code defines arch_perf_record__need_read and uses perf_evsel__read instead.
Could the commit message be updated to match the actual implementation?
> diff --git a/tools/perf/util/Build b/tools/perf/util/Build
> index 5e2265018826f..4d79662e88266 100644
> --- a/tools/perf/util/Build
> +++ b/tools/perf/util/Build
> @@ -142,6 +142,7 @@ perf-util-y += hisi-ptt.o
> perf-util-y += hisi-ptt-decoder/
> perf-util-y += s390-cpumsf.o
> perf-util-y += powerpc-vpadtl.o
> +perf-util-y += powerpc-htm.o
[Severity: High]
Does unconditionally adding powerpc-htm.o here cause it to be compiled for
all architectures? Since it defines arch_perf_record__need_read globally,
this could override the weak symbol in builtin-record.c and incorrectly
execute PowerPC-specific logic on other architectures like x86 or ARM.
> diff --git a/tools/perf/util/powerpc-htm.c b/tools/perf/util/powerpc-htm.c
> new file mode 100644
> index 0000000000000..5043ff41a609e
> --- /dev/null
> +++ b/tools/perf/util/powerpc-htm.c
> @@ -0,0 +1,74 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * HTM support
> + */
> +
> +#include "../../../util/record.h"
[Severity: Low]
Will this include path resolve correctly during the build? It appears to
point outside the normal include directories for the perf tool, which might
result in a compile-time failure.
[ ... ]
> +int arch_perf_record__need_read(struct evlist *evlist)
> +{
> + struct evsel *evsel;
> + int found_htm = 0;
> +
> + /* there was an error during record__open */
> + if (!evlist)
> + return 0;
> +
> + /* First, check if any HTM events exist */
> + evlist__for_each_entry(evlist, evsel) {
> + if (strstr(evsel->name, "htm") != NULL)
[Severity: High]
Can evsel->name be NULL here? For raw hardware events or events before
parsing has dynamically assigned a name, passing a NULL pointer to strstr
or strcmp will cause a segmentation fault.
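A NULL-safe prefix test could look like the sketch below. The helper name
name_has_prefix() is hypothetical and not part of perf. It does a prefix
match, as strstarts() does, not the substring match that strstr() performs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true when name is non-NULL and begins with prefix. */
static bool name_has_prefix(const char *name, const char *prefix)
{
	/* The NULL check short-circuits before strncmp() can touch name. */
	return name && strncmp(name, prefix, strlen(prefix)) == 0;
}
```

A caller would then write name_has_prefix(evsel->name, "htm") instead of
passing evsel->name straight to a string function.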
[ ... ]
> + for (int x = 0; x < xyarray__max_x(xy); x++) {
> + for (int y = 0; y < xyarray__max_y(xy); y++) {
> + struct perf_counts_values count;
> +
> + if (!strcmp(evsel->name, "dummy:u"))
> + continue;
> +
> + if (strstr(evsel->name, "htm")) {
> + perf_evsel__read(&evsel->core, x, y, &count);
[Severity: High]
Does this properly handle failures from perf_evsel__read()? It returns a
negative error code on failure and leaves the count struct uninitialized or
zeroed. If ignored, count.val could evaluate to 0 below, causing the trace
data collection to terminate prematurely.
> + y = xyarray__max_y(xy);
> + x = xyarray__max_x(xy);
[Severity: High]
Are these assignments intentionally overwriting the loop iteration variables?
This forcefully terminates both the inner and outer loops after checking
only the very first CPU and thread index (0, 0), preventing the iteration
from checking any of the remaining CPUs.
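The two loop concerns above (ignored read errors and forced loop exit) can be
illustrated with a small sketch. It assumes the intent is "more data exists
if any CPU/thread cell reports a non-zero count", which the patch does not
state explicitly. struct cell and any_cell_has_data() are hypothetical; each
cell stands in for the result of one perf_evsel__read() call.

```c
#include <stddef.h>

/* Hypothetical result of reading one (x, y) cell. */
struct cell {
	int err;                /* 0 on success, negative error code on failure */
	unsigned long long val; /* count read from the cell */
};

/*
 * Scan a max_x by max_y grid stored row by row.
 * Returns 1 if any cell reports data, 0 if none do, or the negative
 * error code of the first failed read.
 */
static int any_cell_has_data(const struct cell *cells, int max_x, int max_y)
{
	for (int x = 0; x < max_x; x++) {
		for (int y = 0; y < max_y; y++) {
			const struct cell *c = &cells[(size_t)x * max_y + y];

			if (c->err < 0)
				return c->err; /* propagate the failure */
			if (c->val)
				return 1;      /* leave both loops at once */
		}
	}
	return 0;
}
```

Returning from a helper leaves both loops without touching the loop
counters, and every cell is visited until one of the exit conditions is met.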
> + }
> + if (!count.val)
> + return 0;
> + }
> + }
> + }
> +
> + return 1;
> +}
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260701084115.80383-1-atrajeev@linux.ibm.com?part=4
* Re: [PATCH 2/9] tools/perf: Add CONFIG_AUXTRACE support for HTM pmu on powerpc
2026-07-01 8:41 ` [PATCH 2/9] tools/perf: Add CONFIG_AUXTRACE support for HTM pmu on powerpc Athira Rajeev
@ 2026-07-01 8:55 ` sashiko-bot
0 siblings, 0 replies; 19+ messages in thread
From: sashiko-bot @ 2026-07-01 8:55 UTC (permalink / raw)
To: Athira Rajeev; +Cc: linux-perf-users
Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
New issues:
- [Medium] Missing explicit inclusions of POSIX standard headers `<stdlib.h>` and `<limits.h>` for standard libc symbols.
- [Medium] Memory allocation failure during HTM auxtrace initialization is silently dropped, causing a broken recording state.
Pre-existing issues:
- [High] Directly accessing `pos->name` with `strstarts()` causes a segmentation fault when the event name is NULL.
--
commit d4675d2351298802a38f0a4941c6bb4d86a4ed1f
Author: Athira Rajeev <atrajeev@linux.ibm.com>
tools/perf: Add CONFIG_AUXTRACE support for HTM pmu on powerpc
The powerpc PMU collecting Hardware Trace Macro (HTM) entries makes use of
AUX support in perf infrastructure. The PMU driver has the functionality
to collect trace entries in the aux buffer. On the tools side, this data
is made available as PERF_RECORD_AUXTRACE records. This record is
generated by "perf record" command. To enable the creation of
PERF_RECORD_AUXTRACE, add functions to initialize auxtrace records, i.e.
"htm_recording_init()". Fill in fields for other callbacks like
info_priv_size, info_fill, free, recording options etc. Add header file
to define htm pmu specific details.
> diff --git a/tools/perf/arch/powerpc/util/auxtrace.c b/tools/perf/arch/powerpc/util/auxtrace.c
> index 0053526329e0e..ec84f8876a4ad 100644
> --- a/tools/perf/arch/powerpc/util/auxtrace.c
> +++ b/tools/perf/arch/powerpc/util/auxtrace.c
> @@ -31,13 +33,19 @@ struct auxtrace_record *auxtrace_record__init(struct evlist *evlist,
> found_vpa_dtl = 1;
> pos->needs_auxtrace_mmap = true;
> break;
> + } else if (strstarts(pos->name, "htm")) {
^^^^^^^^^^^^^^^^^^^^^^^^^^
[Severity: High]
This isn't a bug introduced by this patch, but does accessing pos->name
directly cause a segmentation fault if the event name is NULL?
If a standard hardware event is parsed without an explicit name (like
"perf record -e cycles"), the name remains NULL until it is lazily generated
via evsel__name(). Evaluating strncmp(NULL, "htm", 3) inside strstarts()
would crash.
Since this patch duplicates this pattern here and in htm_recording_options(),
should we use evsel__name() or check for NULL first?
> + found_htm = 1;
> + pos->needs_auxtrace_mmap = true;
> + break;
> }
> }
>
> if (found_vpa_dtl)
> return vpa_dtl_recording_init(pos);
> + else if (found_htm)
> + return htm_recording_init(pos);
[Severity: Medium]
If htm_recording_init() fails to allocate memory, it returns NULL. Since we
don't update the *err parameter before returning, does this propagate a false
success to the caller?
If *err remains 0, builtin-record.c might assume auxtrace is simply not
required and continue recording, silently dropping the HTM trace data instead
of aborting with -ENOMEM.
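The distinction between "no auxtrace needed" and "initialization failed" can
be sketched as below. All names are hypothetical stand-ins (struct recorder
for struct auxtrace_record, recorder_init() for htm_recording_init()), and
the simulate_failure flag exists only to make the failure path easy to
exercise.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct auxtrace_record. */
struct recorder {
	int unused;
};

/* Hypothetical init: returns NULL when the allocation fails. */
static struct recorder *recorder_init(int simulate_failure)
{
	if (simulate_failure)
		return NULL;
	return calloc(1, sizeof(struct recorder));
}

/*
 * Returns the recorder, or NULL. A NULL return with *err == 0 means no
 * matching event was found; a NULL return with *err < 0 means failure.
 */
static struct recorder *record_init(int found, int simulate_failure, int *err)
{
	struct recorder *rec;

	*err = 0;
	if (!found)
		return NULL;

	rec = recorder_init(simulate_failure);
	if (!rec)
		*err = -ENOMEM; /* do not report success on allocation failure */
	return rec;
}
```

The caller can then abort on a negative *err and carry on without auxtrace
only when *err is 0.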
> else {
> - *err = -EINVAL;
> + *err = 0;
> return NULL;
> }
> }
> diff --git a/tools/perf/arch/powerpc/util/htm.c b/tools/perf/arch/powerpc/util/htm.c
> new file mode 100644
> index 0000000000000..cc733f45ac9b0
> --- /dev/null
> +++ b/tools/perf/arch/powerpc/util/htm.c
> @@ -0,0 +1,113 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * HTM support
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/types.h>
> +#include <linux/bitops.h>
> +#include <linux/log2.h>
> +#include <linux/string.h>
> +#include <time.h>
> +
> +#include "../../util/cpumap.h"
> +#include "../../util/evsel.h"
> +#include "../../util/evlist.h"
> +#include "../../util/session.h"
> +#include "../../util/util.h"
> +#include "../../util/pmu.h"
> +#include "../../util/debug.h"
> +#include "../../util/auxtrace.h"
> +#include "../../util/powerpc-htm.h"
> +#include "../../util/record.h"
> +#include <internal/lib.h> // page_size
> +#include <errno.h>
[Severity: Medium]
Are we missing the explicit inclusions of <stdlib.h> and <limits.h> here?
This file uses UINT_MAX in htm_recording_options() and free() in htm_free().
Without directly including <limits.h> and <stdlib.h>, building the perf tools
with musl libc might fail due to missing declarations.
[ ... ]
> + /*
> + * Set auxtrace_mmap_pages to minimum
> + * two pages
> + */
> + if (!opts->auxtrace_mmap_pages) {
> + opts->auxtrace_mmap_pages = KiB(128) / page_size;
> + if (opts->mmap_pages == UINT_MAX)
> + opts->mmap_pages = KiB(256) / page_size;
> + }
> +
> + evlist__for_each_entry(evlist, pos) {
> + if (strstarts(pos->name, "htm")) {
[Severity: High]
This is a pre-existing issue, but as mentioned above, this could crash if
pos->name is NULL.
> + pos->needs_auxtrace_mmap = true;
> + pos->core.attr.aux_watermark = opts->auxtrace_mmap_pages * (size_t)page_size;
> + break;
> + }
> + }
> +
> + return 0;
> +}
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260701084115.80383-1-atrajeev@linux.ibm.com?part=2
* Re: [PATCH 1/9] tool/perf: Move auxtrace_record__init for powerpc-vpadtl as separate utility
2026-07-01 8:41 ` [PATCH 1/9] tool/perf: Move auxtrace_record__init for powerpc-vpadtl as separate utility Athira Rajeev
@ 2026-07-01 8:56 ` sashiko-bot
0 siblings, 0 replies; 19+ messages in thread
From: sashiko-bot @ 2026-07-01 8:56 UTC (permalink / raw)
To: Athira Rajeev; +Cc: linux-perf-users
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [High] Memory allocation failure in vpa_dtl_recording_init silently returns success, masking -ENOMEM and causing a subsequent NULL pointer dereference.
- [Low] Missing explicit inclusion of POSIX standard headers <stdlib.h> and <limits.h> breaks musl libc compatibility.
--
(Note: The format of this report is altered to provide a free-form summary of the findings, rather than inline diff comments, due to system recitation restrictions.)
commit 9f7d3d8f8dc04ee94edc0507b7b561997e9bd6a5
Author: Athira Rajeev <atrajeev@linux.ibm.com>
tool/perf: Move auxtrace_record__init for powerpc-vpadtl as separate utility
This commit extracts the auxtrace_record__init logic for powerpc-vpadtl into a
separate utility file, vpa-dtl.c, allowing other PMUs to share the auxtrace
infrastructure.
[Severity: High]
When zalloc() fails in the new vpa_dtl_recording_init() function, it returns
NULL. However, because the refactored function no longer takes the err pointer
argument, it cannot propagate the -ENOMEM error back up the stack.
In auxtrace_record__init(), the err variable is initially set to 0. If
vpa_dtl_recording_init() returns NULL due to memory exhaustion,
auxtrace_record__init() will also return NULL while leaving the error code
as 0 (success).
Could this regression lead to memory corruption or crashes? Since
pos->needs_auxtrace_mmap is already set to true earlier in the function,
the caller will see a success code and incorrectly proceed to map the auxtrace
buffer. Eventually, this reaches __auxtrace_mmap__read(), which evaluates
itr->alignment. If itr is NULL because of the masked allocation failure, won't
this trigger a NULL pointer dereference?
[Severity: Low]
The newly added file tools/perf/arch/powerpc/util/vpa-dtl.c uses the POSIX
function free() in powerpc_vpadtl_free() and the constant UINT_MAX in
powerpc_vpadtl_recording_options().
Are we missing some required include directives here? The file doesn't seem to
explicitly include stdlib.h or limits.h. While this might compile fine under
glibc due to implicit header inclusions, does this regression break compilation
on stricter libc implementations like musl?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260701084115.80383-1-atrajeev@linux.ibm.com?part=1
* Re: [PATCH 5/9] tools/perf: process htm auxtrace events and display in perf report -D
2026-07-01 8:41 ` [PATCH 5/9] tools/perf: process htm auxtrace events and display in perf report -D Athira Rajeev
@ 2026-07-01 9:05 ` sashiko-bot
0 siblings, 0 replies; 19+ messages in thread
From: sashiko-bot @ 2026-07-01 9:05 UTC (permalink / raw)
To: Athira Rajeev; +Cc: linux-perf-users
Thank you for your contribution! Sashiko AI review found 6 potential issue(s) to consider:
- [High] Unchecked return value of evlist__event2evsel() leads to immediate null pointer dereference.
- [High] Incorrect minimum size calculation for the auxtrace_info event allows out-of-bounds read of the priv array.
- [High] Unconditionally creates and truncates files with predictable names in the working directory, enabling data clobbering and symlink attacks.
- [High] Direct access to evsel->name without a NULL check leads to a crash when an event is unnamed.
- [Medium] Missing explicit inclusion of POSIX standard headers (<stdio.h>, <stdlib.h>, <string.h>), which breaks compilation on musl libc.
- [Medium] Metadata is improperly extracted from local evsel config instead of serialized priv array, and htm_info_fill fails to populate the payload.
--
commit 30d1cdd44c59f9d127ce163684cc99673a095817
Author: Athira Rajeev <atrajeev@linux.ibm.com>
Subject: tools/perf: process htm auxtrace events and display in perf report -D
This commit adds the htm pmu auxtrace process function for perf script -D.
It defines functions to handle PERF_RECORD_AUXTRACE_INFO records and sets up
the aux data queues. It also writes the trace data to files based on node,
chip, and core indices.
> diff --git a/tools/perf/arch/powerpc/util/htm.c b/tools/perf/arch/powerpc/util/htm.c
> index cc733f45ac9b0..0e6638c02716c 100644
> --- a/tools/perf/arch/powerpc/util/htm.c
> +++ b/tools/perf/arch/powerpc/util/htm.c
> @@ -56,6 +56,7 @@ htm_recording_options(struct auxtrace_record *ar __maybe_unused,
> if (strstarts(pos->name, "htm")) {
[Severity: High]
Does this unconditionally dereference pos->name?
The name field of an evsel can be dynamically allocated and might be NULL
for certain parsed or synthesized events.
Can passing NULL to strstarts() (which wraps strncmp()) cause an immediate
crash? A similar unconditional access to evsel->name without a NULL check
also appears to happen in arch_perf_record__need_read().
> pos->needs_auxtrace_mmap = true;
> pos->core.attr.aux_watermark = opts->auxtrace_mmap_pages * (size_t)page_size;
> + pos->core.attr.sample_type |= PERF_SAMPLE_RAW;
> break;
> }
> }
[ ... ]
> @@ -75,6 +76,8 @@ htm_info_fill(struct auxtrace_record *itr __maybe_unused,
> struct perf_record_auxtrace_info *auxtrace_info __maybe_unused,
> size_t priv_size __maybe_unused)
> {
> + auxtrace_info->type = PERF_AUXTRACE_POWERPC_HTM;
>
> return 0;
> }
[Severity: Medium]
Does this logic fail to serialize the hardware metadata?
Following the auxtrace standard, the hardware metadata should be serialized
into the auxtrace_info->priv array here, but it appears to be left empty.
[ ... ]
> diff --git a/tools/perf/util/powerpc-htm.c b/tools/perf/util/powerpc-htm.c
> index 5043ff41a609e..ffddf0e59fc14 100644
> --- a/tools/perf/util/powerpc-htm.c
> +++ b/tools/perf/util/powerpc-htm.c
> @@ -9,6 +9,24 @@
> #include "session.h"
> #include "debug.h"
> #include <internal/xyarray.h>
> +#include <linux/string.h>
> +#include "color.h"
> +#include <inttypes.h>
> +#include "powerpc-htm.h"
> +#include <errno.h>
[Severity: Medium]
Will this break the build on musl libc systems?
The code in powerpc-htm.c heavily uses FILE, fopen, fclose, snprintf, zalloc,
and free, but does not explicitly include the required POSIX standard headers
<stdio.h> and <stdlib.h>.
[ ... ]
> +int powerpc_htm_process_auxtrace_info(union perf_event *event,
> + struct perf_session *session)
> +{
> + struct perf_record_auxtrace_info *auxtrace_info = &event->auxtrace_info;
> + struct evsel *evsel = evlist__event2evsel(session->evlist, event);
> + u32 nodeindex, nodalchipindex, coreindexonchip;
> + int config = (evsel->core.attr.config);
[Severity: High]
Can evlist__event2evsel() return NULL here if it fails to map the event?
If so, dereferencing it immediately for evsel->core.attr.config would lead
to a segmentation fault on crafted or corrupted perf.data files.
[Severity: Medium]
Is it correct to extract the metadata indices (nodeindex, etc.) from the local
runtime evsel config?
For synthesized auxtrace events, evlist__event2evsel() often returns the first
event in the list (which could be a dummy event), potentially yielding garbage
config values. Should these instead be read from the serialized
auxtrace_info->priv array?
> + size_t min_sz = sizeof(u64) * POWERPC_HTM_TYPE;
> + struct powerpc_htm *htm;
> + int err;
> + FILE *fp;
> +
> + nodeindex = (config >> 4) & 0xff;
> + nodalchipindex = (config >> 12) & 0xff;
> + coreindexonchip = (config >> 20) & 0xff;
> +
> + if (auxtrace_info->header.size < sizeof(struct perf_record_auxtrace_info) +
> + min_sz)
> + return -EINVAL;
[Severity: High]
Will min_sz evaluate to 0 because POWERPC_HTM_TYPE is 0 (the first element
in the enum)?
If so, this check will pass even for a payload with no priv elements.
[ ... ]
> + htm->session = session;
> + htm->machine = &session->machines.host; /* No kvm support */
> + htm->auxtrace_type = auxtrace_info->type;
> + htm->pmu_type = auxtrace_info->priv[POWERPC_HTM_TYPE];
[Severity: High]
Because of the size validation above passing for 0 elements, does reading
auxtrace_info->priv[POWERPC_HTM_TYPE] result in an out-of-bounds memory read
past the end of the validated event buffer?
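A minimal sketch of a size check that counts entries rather than using an index as the count. The enum names below are illustrative assumptions and are not taken from the patch; the idea is only that a trailing "MAX" enumerator gives the number of mandatory priv[] entries:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical priv[] layout: the names are illustrative and only
 * mirror the idea of a trailing "MAX" enumerator that counts the
 * mandatory entries.
 */
enum {
	SKETCH_HTM_TYPE,	/* index 0, like POWERPC_HTM_TYPE in the patch */
	SKETCH_HTM_PRIV_MAX,	/* number of mandatory priv[] entries */
};

/*
 * Return true when an event of total size event_size, whose fixed
 * part occupies fixed_size bytes, has room for every mandatory
 * priv[] entry.
 */
static bool priv_payload_ok(size_t event_size, size_t fixed_size)
{
	size_t min_sz = sizeof(uint64_t) * SKETCH_HTM_PRIV_MAX;

	if (event_size < fixed_size)
		return false;
	return event_size - fixed_size >= min_sz;
}
```

Here an event with no priv payload is rejected, because the minimum is one 8-byte entry rather than zero bytes.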
> +
> + htm->auxtrace.process_event = powerpc_htm_process_event;
> + htm->auxtrace.process_auxtrace_event = powerpc_htm_process_auxtrace_event;
> + htm->auxtrace.flush_events = powerpc_htm_flush;
> + htm->auxtrace.free_events = powerpc_htm_free_events;
> + htm->auxtrace.free = powerpc_htm_free;
> + session->auxtrace = &htm->auxtrace;
> +
> + snprintf(htm->htmbin_file, sizeof(htm->htmbin_file), "htm.bin.n%d.p%d.c%d", nodeindex, nodalchipindex, coreindexonchip);
> + fp = fopen(htm->htmbin_file, "w");
> + if (!fp) {
> + pr_err("Failed to create %s: %s\n", htm->htmbin_file, strerror(errno));
> + return -errno;
> + }
> + fclose(fp);
[Severity: High]
Could this enable data clobbering and symlink attacks?
By unconditionally constructing a predictable filename and opening it with "w",
this truncates any existing file of that name in the current working
directory. An attacker could place a symlink here to truncate arbitrary files
when another user runs perf script.
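One way this could be hardened, sketched under the assumption that refusing to reuse an existing file is acceptable for this tool. `create_output_exclusive()` is a hypothetical helper name; the flags themselves are standard POSIX open() flags:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper: create path only if it does not exist yet and
 * never follow a symlink in the final path component.
 * Returns an open file descriptor, or -errno on failure.
 */
static int create_output_exclusive(const char *path)
{
	int fd = open(path, O_WRONLY | O_CREAT | O_EXCL | O_NOFOLLOW, 0600);

	return fd < 0 ? -errno : fd;
}
```

O_CREAT together with O_EXCL already fails when the final component is a symlink, so O_NOFOLLOW is belt and braces here. A caller that still wants a FILE pointer could wrap the descriptor with fdopen().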
[ ... ]
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260701084115.80383-1-atrajeev@linux.ibm.com?part=5
* Re: [PATCH 6/9] perf tools powerpc: Add HTM trace data processing and decoding support
2026-07-01 8:41 ` [PATCH 6/9] perf tools powerpc: Add HTM trace data processing and decoding support Athira Rajeev
@ 2026-07-01 9:06 ` sashiko-bot
0 siblings, 0 replies; 19+ messages in thread
From: sashiko-bot @ 2026-07-01 9:06 UTC (permalink / raw)
To: Athira Rajeev; +Cc: linux-perf-users
Thank you for your contribution! Sashiko AI review found 10 potential issue(s) to consider:
- [Critical] Use-after-free and inconsistent resource cleanup on error paths in `powerpc_htm_process_auxtrace_info()`.
- [Critical] Out-of-bounds read and information leak in `write_htm()` due to missing bounds checks against `buffer->size`.
- [High] Missing size validation before dereferencing `sample->raw_data`.
- [High] Stateless queue iteration logic causes redundant processing and massive trace duplication.
- [High] Missing NULL checks for `evsel` pointer and `evsel->name`.
- [High] Missing call to `auxtrace_buffer__get_data()` causes `buffer->data` to be NULL when passed to `write_htm()`.
- [High] Deadlock in child process after fork() due to non-async-signal-safe functions.
- [High] Predictable filenames are created in the current working directory without O_EXCL or O_NOFOLLOW flags, enabling arbitrary file overwrites via symlink attacks.
- [Medium] Cross-platform analysis broken due to native endianness assumption on raw sample data.
- [Low] Missing explicit inclusion of POSIX standard libc headers.
--
commit 699e32993a6b92ec36264eeaf2b99217af504b57
Author: Athira Rajeev <atrajeev@linux.ibm.com>
Subject: perf tools powerpc: Add HTM trace data processing and decoding support
This patch adds support for processing Hardware Trace Macro (HTM) auxiliary
trace data collected via perf AUX buffers. It allows post-processing of HTM
traces, extracting bus traces and system memory configuration. It integrates
with an external htmdecode tool for trace decoding.
> diff --git a/tools/perf/util/powerpc-htm.c b/tools/perf/util/powerpc-htm.c
> index ffddf0e59fc14..487989ca4fc7d 100644
> --- a/tools/perf/util/powerpc-htm.c
> +++ b/tools/perf/util/powerpc-htm.c
> @@ -14,6 +14,12 @@
> #include <inttypes.h>
> #include "powerpc-htm.h"
> #include <errno.h>
> +#include <sys/stat.h>
> +#include <unistd.h>
> +#include <sys/mman.h>
> +#include "sample.h"
> +#include <sys/types.h>
> +#include <sys/wait.h>
[Severity: Low]
Does this file compile successfully against musl libc?
This code uses POSIX standard functions and constants like open, O_WRONLY,
fopen, exit, malloc, and strerror. However, it looks like the explicit
POSIX headers (<fcntl.h>, <stdio.h>, <stdlib.h>, <string.h>) are missing.
Without these explicit inclusions, compilation under musl libc might fail.
[ ... ]
> @@ -26,8 +32,140 @@ struct powerpc_htm {
> struct machine *machine;
> u32 pmu_type;
> char htmbin_file[64];
> + char trans_file[64];
> + int htm_mem_entries;
> + int mem_maps;
> };
[ ... ]
> +static int run_htmdecode(const char *input_file, const char *output_file)
> +{
> + pid_t pid;
> + int status;
> +
> + pid = fork();
> + if (pid == -1) {
> + pr_err("fork() failed: %s\n", strerror(errno));
> + return -errno;
> + }
> +
> + if (pid == 0) {
> + /* Child process */
> + int fd = open(output_file, O_WRONLY | O_CREAT | O_TRUNC, 0644);
[Severity: High]
Could creating this output file with a predictable name make it vulnerable to
a symlink attack?
If this is executed in a world-writable directory like /tmp, and a user
pre-creates a symlink matching the output filename, opening it without
O_EXCL and O_NOFOLLOW might allow an attacker to overwrite arbitrary files.
> +
> + if (fd == -1) {
> + pr_err("Failed to open output file: %s\n", strerror(errno));
> + exit(1);
> + }
> +
> + /* Redirect stdout to output file */
> + dup2(fd, STDOUT_FILENO);
> + close(fd);
> +
> + /* Execute htmdecode - execlp searches PATH automatically */
> + execlp("htmdecode", "htmdecode", "-o", "-j", "-w", "1",
> + "-f", input_file, NULL);
> +
> + /* If execlp returns, it failed */
> + pr_err("Failed to execute htmdecode: %s\n", strerror(errno));
> + if (errno == ENOENT)
> + pr_err("htmdecode not found in PATH\n");
> +
> + exit(127); /* Standard "command not found" exit code */
> + }
[Severity: High]
Is there a risk of deadlocking the child process if perf is multi-threaded?
The child process calls pr_err and exit if open or execlp fails. Since
pr_err uses fprintf and exit flushes streams, these are not
async-signal-safe functions.
If another thread held a libc lock, like the stderr lock or malloc heap
lock, when fork was called, the child process could deadlock trying to
acquire it. Would it be safer to use write to file descriptor 2 and
_exit here?
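A minimal sketch of a child path that stays within POSIX's async-signal-safe list. `run_and_wait()` is a hypothetical helper, and it uses execv() with an explicit path because execv() is on that list, whereas the patch uses execlp() to search PATH:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run prog with no arguments and return its exit
 * status, or -1 if fork()/waitpid() fails or the child did not exit
 * normally. After fork() the child only calls execv(), write() and
 * _exit(), all of which are async-signal-safe.
 */
static int run_and_wait(const char *prog)
{
	static const char msg[] = "exec failed\n";
	char *const argv[] = { (char *)prog, NULL };
	int status;
	pid_t pid = fork();

	if (pid == -1)
		return -1;

	if (pid == 0) {
		ssize_t ignored;

		execv(prog, argv);
		/* Only reached when execv() failed. */
		ignored = write(STDERR_FILENO, msg, sizeof(msg) - 1);
		(void)ignored;
		_exit(127);
	}

	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The parent remains free to use pr_err() on the returned status, since it is not subject to the post-fork restrictions.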
[ ... ]
> +static int write_htm(void *data, size_t size, struct powerpc_htm *htm)
> +{
> + FILE *fp;
> + u64 *num_entries;
> + size_t entries;
> + size_t written;
> + int ret = -1;
> +
> + if (htm->mem_maps) {
> + fp = fopen(htm->trans_file, "ab");
> + if (!fp) {
> + pr_err("Failed to open %s: %s\n", htm->trans_file, strerror(errno));
> + return ret;
> + }
> + num_entries = data + 0x10;
> + entries = be64_to_cpu(*num_entries);
> + entries++;
> + written = fwrite(data, 32, entries, fp);
[Severity: Critical]
Can this result in an out-of-bounds memory read?
The code calculates the number of entries dynamically from data + 0x10 and
then calls fwrite with entries as the count multiplier.
If the trace data is malformed and contains an artificially large entry
count, it seems we could write gigabytes of out-of-bounds heap or mapped
memory to the file without verifying that the requested write length is
bounded by the size parameter.
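A minimal sketch of the missing bound, assuming the layout implied by the quoted code (a big-endian 64-bit count at offset 0x10, and 32-byte records with one extra record for the header). The helper names and macros are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ENTRY_SIZE	32	/* record size used by fwrite() in the patch */
#define SKETCH_COUNT_OFFSET	0x10	/* offset of the big-endian entry count */

/* Decode a big-endian 64-bit value without alignment assumptions. */
static uint64_t load_be64(const unsigned char *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/*
 * Hypothetical helper: return the number of 32-byte records (header
 * included) that may be written from data, or 0 when the count stored
 * in the header does not fit inside size bytes.
 */
static size_t bounded_records(const void *data, size_t size)
{
	const unsigned char *p = data;
	uint64_t entries;

	if (!p || size < SKETCH_COUNT_OFFSET + sizeof(uint64_t))
		return 0;

	entries = load_be64(p + SKETCH_COUNT_OFFSET);
	/* One extra record for the header; reject counts past the buffer. */
	if (entries >= size / SKETCH_ENTRY_SIZE)
		return 0;
	return (size_t)entries + 1;
}
```

Comparing against size / 32 instead of multiplying the count avoids a second overflow in the check itself.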
> + if (written != entries) {
> + pr_err("Failed to write data: expected %zu, wrote %zu\n", entries, written);
> + fclose(fp);
> + return ret;
> + }
> + fclose(fp);
> + htm->htm_mem_entries += entries;
> + return 0;
> + }
[ ... ]
> static int powerpc_htm_process_event(struct perf_session *session __maybe_unused,
> union perf_event *event,
> struct perf_sample *sample __maybe_unused,
> const struct perf_tool *tool __maybe_unused)
> {
> + struct powerpc_htm *htm = container_of(session->auxtrace, struct powerpc_htm,
> + auxtrace);
> +
> + if ((event->header.type == PERF_RECORD_SAMPLE) && sample->raw_data) {
> + int *content = (int *)sample->raw_data;
[Severity: High]
Could casting sample->raw_data to an integer pointer cause an out-of-bounds
read if the raw data is smaller than sizeof(int)?
Should we verify that sample->raw_size is at least 4 bytes before
dereferencing it?
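A small sketch of a checked read, assuming the value of interest is the first 32 bits of the raw payload as in the quoted cast. `read_raw_u32()` is a hypothetical helper; the size argument stands in for sample->raw_size:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: copy a 32-bit value out of a raw sample payload
 * only when the payload is large enough. memcpy() avoids relying on
 * the alignment of raw_data.
 */
static bool read_raw_u32(const void *raw_data, size_t raw_size, uint32_t *out)
{
	if (!raw_data || raw_size < sizeof(*out))
		return false;
	memcpy(out, raw_data, sizeof(*out));
	return true;
}
```

This sketch copies the bytes as they are stored and does no byte-order conversion.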
> + struct evsel *evsel = evlist__event2evsel(session->evlist, event);
> + int config = (evsel->core.attr.config) & 0xF;
[Severity: High]
Is it possible for evlist__event2evsel to return NULL here?
If it fails to find a matching event for the sample, dereferencing evsel
immediately in the config assignment would cause a crash.
> + struct auxtrace_buffer *buffer = NULL;
> + struct auxtrace_queues *queues = &htm->queues;
> + unsigned int i = 0;
> + int j = 0;
> +
> + if (strstr(evsel->name, "htm") == NULL)
> + return 0;
[Severity: High]
Is it guaranteed that evsel->name is non-NULL?
If the event is unnamed, passing evsel->name directly to strstr might
trigger a NULL pointer dereference.
> +
> + for (i = 0; i < queues->nr_queues; i++) {
> + buffer = auxtrace_buffer__next(&queues->queue_array[i], buffer);
> + for (; buffer;) {
> + if (j >= *content)
[Severity: Medium]
Does this code handle cross-endian analysis correctly?
The trace data is read directly into content as a native integer. If a
trace recorded on a big-endian system is analyzed on a little-endian
host, it seems the value will have the wrong byte order and the
comparison here will evaluate incorrectly.
> + htm->mem_maps = 1;
> + if (write_htm(buffer->data, buffer->size, htm))
[Severity: High]
Is it possible for buffer->data to be NULL here?
Because the auxtrace subsystem lazily maps file payloads, passing
buffer->data directly to write_htm without calling
auxtrace_buffer__get_data first might pass a NULL pointer, leading to a
crash when it is dereferenced inside write_htm.
[Severity: High]
Does this loop iteration lead to massive data duplication?
The code iterates over all auxtrace_buffer elements starting from the
queue head for every PERF_RECORD_SAMPLE event. Because buffers aren't
marked as processed or dequeued, subsequent events will cause the same
buffers to be appended to the output files multiple times.
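One way to make the iteration stateful, shown with hypothetical stand-in structures rather than the real auxtrace_queue and auxtrace_buffer types. The queue remembers the last buffer it handled, so a later call resumes after it:

```c
#include <stddef.h>

/* Hypothetical stand-ins for an auxtrace buffer list and its consumer state. */
struct sketch_buffer {
	struct sketch_buffer *next;
	size_t size;
};

struct sketch_queue {
	struct sketch_buffer *head;
	struct sketch_buffer *last_done;	/* last buffer already written out */
};

/*
 * Visit only the buffers that have not been handled by an earlier call
 * and remember how far we got. Returns the number of buffers visited.
 */
static size_t process_new_buffers(struct sketch_queue *q)
{
	struct sketch_buffer *b = q->last_done ? q->last_done->next : q->head;
	size_t n = 0;

	for (; b; b = b->next) {
		/* The real code would write the buffer payload here. */
		q->last_done = b;
		n++;
	}
	return n;
}
```

A second call on an unchanged queue visits nothing, which is the property the per-sample loop in the patch appears to lack.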
> + return -1;
> + j++;
> + buffer = auxtrace_buffer__next(&queues->queue_array[i], buffer);
> + }
> + }
> + /* Only for power bus traces, we decode traces */
> + if (config == 1)
> + create_mem_maps(htm);
> + }
> +
> return 0;
> }
[ ... ]
> @@ -205,6 +420,14 @@ int powerpc_htm_process_auxtrace_info(union perf_event *event,
> }
> fclose(fp);
>
> + snprintf(htm->trans_file, sizeof(htm->trans_file), "translation.n%d.p%d.c%d", nodeindex, nodalchipindex, coreindexonchip);
> + fp = fopen(htm->trans_file, "w");
[Severity: High]
Are we vulnerable to symlink attacks here by using predictable filenames?
Creating files with predictable names like translation.n%d.p%d.c%d using
fopen with the w flag without O_EXCL could allow arbitrary file
overwrites if run in a shared directory.
> + if (!fp) {
> + pr_err("Failed to create %s: %s\n", htm->trans_file, strerror(errno));
> + return -errno;
[Severity: Critical]
Does returning directly here leak the htm struct allocated earlier in the
function?
Additionally, if the function jumps to the err_free label on other error
paths, such as when powerpc_htm_synth_events fails, does it bypass the
session->auxtrace = NULL assignment?
If session->auxtrace was set to &htm->auxtrace, leaving a dangling
pointer could cause a use-after-free when the session is later deleted
via perf_session__delete.
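The usual pattern is a single exit label that unpublishes the pointer before freeing it. A minimal sketch with hypothetical stand-in types (`sketch_session`, `sketch_htm`), not the real perf structures:

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the session and decoder state. */
struct sketch_session {
	void *auxtrace;
};

struct sketch_htm {
	int placeholder;
};

/*
 * Every failure after the allocation goes through one label that
 * clears the published pointer before freeing the object it points to.
 * fail_late simulates an error in a later setup step.
 */
static int sketch_setup(struct sketch_session *session, int fail_late)
{
	struct sketch_htm *htm = calloc(1, sizeof(*htm));
	int err;

	if (!htm)
		return -ENOMEM;

	session->auxtrace = htm;

	if (fail_late) {
		err = -EINVAL;
		goto err_free;
	}
	return 0;

err_free:
	session->auxtrace = NULL;
	free(htm);
	return err;
}
```

Applied to the patch, the fopen() failure would jump to such a label instead of returning directly.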
> + }
> + fclose(fp);
> +
> powerpc_htm_print_info(&auxtrace_info->priv[0]);
>
> err = auxtrace_queues__process_index(&htm->queues, session);
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260701084115.80383-1-atrajeev@linux.ibm.com?part=6
* Re: [PATCH 7/9] perf tools powerpc: Add physical to logical address mapping for HTM traces
2026-07-01 8:41 ` [PATCH 7/9] perf tools powerpc: Add physical to logical address mapping for HTM traces Athira Rajeev
@ 2026-07-01 9:07 ` sashiko-bot
0 siblings, 0 replies; 19+ messages in thread
From: sashiko-bot @ 2026-07-01 9:07 UTC (permalink / raw)
To: Athira Rajeev; +Cc: linux-perf-users
Thank you for your contribution! Sashiko AI review found 8 potential issue(s) to consider:
- [Critical] Multiple integer overflows exist when calculating allocation sizes on 32-bit systems, leading to severe heap buffer overflows.
- [Critical] The `write_htm` function blindly trusts data from the untrusted AUX buffer to determine how much memory to write to the translation file, allowing an attacker to leak sensitive heap memory.
- [High] Trace analysis improperly queries the host machine's `/proc/powerpc/lparcfg` rather than utilizing recorded trace metadata, breaking cross-platform analysis.
- [High] Missing bounds check and an off-by-one chunk error in `create_mem_maps` lead to out-of-bounds memory reads and misinterpretation of chunk headers.
- [High] Logical addresses are truncated on 32-bit hosts, corrupting addresses greater than 4GB.
- [High] Unconditional byte swapping of HTM memory map entries corrupts data on matching-endian architectures.
- [Medium] Missing explicit POSIX libc header inclusions will cause compilation failures under musl libc.
- [Medium] The unchecked return value of `add_map_entry` silently drops trace entries when memory allocation fails.
--
commit bb5ab27d1cf88495a27db7cca665426d73fc046d
Author: Tanushree Shah <tshah@linux.ibm.com>
Subject: perf tools powerpc: Add physical to logical address mapping for HTM traces
This commit adds support for mapping physical addresses from HTM traces
to logical addresses. It reads the current partition ID and extracts memory
map entries, then parses decoded HTM trace files to generate mappings.
> diff --git a/tools/perf/util/powerpc-htm.c b/tools/perf/util/powerpc-htm.c
> index 487989ca4fc7d..83253850870c2 100644
> --- a/tools/perf/util/powerpc-htm.c
> +++ b/tools/perf/util/powerpc-htm.c
> @@ -20,6 +20,13 @@
> #include "sample.h"
> #include <sys/types.h>
> #include <sys/wait.h>
> +#include <regex.h>
> +#include <ctype.h>
> +#include <errno.h>
[Severity: Medium]
Could we include the standard POSIX headers here? Functions like fopen,
malloc, free, and strerror are used in this file, but <stdio.h>,
<stdlib.h>, <string.h>, and <fcntl.h> are missing. This causes build
failures when compiling with musl libc.
[ ... ]
> +struct mem_entries {
> + unsigned long long phy_addr;
> + unsigned long logical_addr;
[Severity: High]
Will this truncate logical addresses on 32-bit analysis hosts? PowerPC uses
64-bit addresses, but logical_addr is defined as an unsigned long. This
corrupts logical addresses over 4GB and causes the UINT64_MAX sentinel check
to fail when analyzed on a 32-bit system.
> + u32 lp_index;
> + u64 size;
> +};
[ ... ]
> +static int add_map_entry(struct addr_map **arr, size_t *count, size_t *cap, struct addr_map entry)
> +{
> + if (*count >= *cap) {
> + size_t new_cap = (*cap == 0) ? 1024 : (*cap * 2);
> + void *tmp = safe_realloc(*arr, new_cap * sizeof(struct addr_map));
[Severity: Critical]
Could this multiplication wrap around on 32-bit systems? If new_cap is
sufficiently large from a large trace file, the size calculation can
overflow. This allocates a smaller buffer than expected, leading to
sequential out-of-bounds writes.
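A sketch of an overflow-checked growth step. `grow_array()` is a hypothetical helper and plain realloc() stands in for safe_realloc(), whose behaviour is not shown in the quoted diff:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: double the capacity of an array of elem_size
 * byte elements, refusing any growth whose byte size would not fit
 * in size_t. Returns 0 on success or a negative errno value.
 */
static int grow_array(void **arr, size_t *cap, size_t elem_size)
{
	size_t new_cap;
	void *tmp;

	if (elem_size == 0)
		return -EINVAL;
	if (*cap > SIZE_MAX / 2)
		return -EOVERFLOW;
	new_cap = *cap ? *cap * 2 : 1024;
	if (new_cap > SIZE_MAX / elem_size)
		return -EOVERFLOW;

	tmp = realloc(*arr, new_cap * elem_size);
	if (!tmp)
		return -ENOMEM;
	*arr = tmp;
	*cap = new_cap;
	return 0;
}
```

Because the helper reports failure through its return value, callers have something concrete to check and propagate.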
> +
> + if (!tmp)
> + return -1;
> + *arr = tmp;
> + *cap = new_cap;
> +
> + }
> +
> + (*arr)[(*count)++] = entry;
> + return 0;
> +}
[ ... ]
> +static unsigned long find_logical_addr(unsigned long long given_addr,
> + struct mem_entries *mem_entries_array,
> + size_t n_entries,
> + u32 lp_filter)
> +{
> + for (size_t i = 0; i < n_entries; i++) {
[ ... ]
> + if (start <= given_addr && given_addr < end &&
> + mem_entries_array[i].lp_index == lp_filter) {
> + unsigned long long offset = given_addr - start;
> + unsigned long logical = mem_entries_array[i].logical_addr + offset;
[Severity: High]
This variable also uses unsigned long, which truncates the 64-bit address on
32-bit architectures. Does it make sense to use a fixed-width type like u64
or unsigned long long here and in the structures?
> +
> + pr_debug("DEBUG: Condition hit at i=%zu given_addr=0x%llx start=0x%llx end=0x%llx lp_index=%u\n",
[ ... ]
> + entry.phys_addr = phys_addr;
> + entry.logical_addr = logical_addr;
> +
> + add_map_entry(&maps, &count, &cap, entry);
[Severity: Medium]
Is it safe to ignore the return value here? If safe_realloc fails due to
memory pressure, add_map_entry returns -1. Ignoring this silently drops
trace entries without any error propagation.
> +
> + ptr += pmatch[0].rm_eo;
[ ... ]
> + /* get the lp index */
> + fp = fopen(lpar_cfg, "r");
[Severity: High]
Does this correctly handle cross-platform analysis? Reading from the local
host's /proc/powerpc/lparcfg means trace decoding might fail or misidentify
the partition if the analysis is run on an x86 machine or a different
system. Should this query the recorded trace metadata via perf_env instead?
> + if (!fp) {
[ ... ]
> + mem_entries_array = malloc(n_entries * sizeof(struct mem_entries));
[Severity: Critical]
Could this allocation overflow on 32-bit systems? If n_entries is very
large, the multiplication wraps around, leading to a small allocation that
will be followed by out-of-bounds writes in the initialization loop below.
> + if (!mem_entries_array) {
[ ... ]
> + /* get the HTM memory map data and store it in mem_entries_array
> + * to use it later on for physical->logical mapping
> + */
> + for (u64 i = 0; i < n_entries; i++, mem++) {
[Severity: High]
Can this loop read past the end of the memory mapped region? The variable mem
starts at mapped_data + 0x20. If htm->htm_mem_entries includes chunk headers
across all auxiliary buffers, skipping the first 32-byte header but still
iterating n_entries times could cause the loop to read out of bounds and
misinterpret subsequent chunk headers.
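A sketch of a chunk-aware walk, under the assumption (inferred from write_htm(), not confirmed by the patch) that the translation file is a sequence of chunks, each a 32-byte header with a big-endian entry count at offset 0x10 followed by that many 32-byte entries. All names are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_REC_SIZE	32	/* header and entry size assumed from write_htm() */

/* Decode a big-endian 64-bit value byte by byte. */
static uint64_t sketch_be64(const unsigned char *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/*
 * Walk a mapped file made of chunks, each one a 32-byte header (entry
 * count at offset 0x10, big-endian) followed by that many 32-byte
 * entries. Returns the number of entries, headers excluded, or -1
 * when a chunk would run past len.
 */
static long count_map_entries(const unsigned char *map, size_t len)
{
	size_t off = 0;
	long total = 0;

	while (off < len) {
		uint64_t n;

		if (len - off < SKETCH_REC_SIZE)
			return -1;
		n = sketch_be64(map + off + 0x10);
		off += SKETCH_REC_SIZE;		/* step over this chunk's header */
		if (n > (len - off) / SKETCH_REC_SIZE)
			return -1;
		/* This chunk's entries start at map + off. */
		off += (size_t)n * SKETCH_REC_SIZE;
		total += (long)n;
	}
	return total;
}
```

Walking chunk by chunk keeps headers out of the entry array and ties every read to the mapped length.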
> + mem_entries_array[i].phy_addr = bswap_64(mem->phy_real);
> + mem_entries_array[i].logical_addr = bswap_64(mem->logical_real);
> + mem_entries_array[i].lp_index = bswap_32(mem->lp_index);
> + mem_entries_array[i].size = bswap_64(mem->size);
[Severity: High]
Should this use endian-aware helpers like be64_to_cpu() instead of
unconditional byte swapping? Applying bswap_64() unconditionally corrupts
the native data format when the offline analysis is run on a big-endian host
like ppc64.
> + }
[Severity: Critical]
Is it possible for write_htm() to read out of bounds from heap memory?
The function extracts an unbounded size from an untrusted buffer and passes
it directly to fwrite():
write_htm() {
...
num_entries = data + 0x10;
entries = be64_to_cpu(*num_entries);
entries++;
written = fwrite(data, 32, entries, fp);
...
}
A trace file with a small size but massive num_entries could cause an
out-of-bounds heap read, leaking sensitive memory into the translation file.
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260701084115.80383-1-atrajeev@linux.ibm.com?part=7
* Re: [PATCH 8/9] tools/perf/powerpc: Add event name as htm of PERF_TYPE_SYNTH type to present htm samples
2026-07-01 8:41 ` [PATCH 8/9] tools/perf/powerpc: Add event name as htm of PERF_TYPE_SYNTH type to present htm samples Athira Rajeev
@ 2026-07-01 9:12 ` sashiko-bot
0 siblings, 0 replies; 19+ messages in thread
From: sashiko-bot @ 2026-07-01 9:12 UTC (permalink / raw)
To: Athira Rajeev; +Cc: linux-perf-users
Thank you for your contribution! Sashiko AI review found 11 potential issue(s) to consider:
- [High] Error handling path bypasses auxtrace_queues cleanup, causing a memory leak and leaving a dangling pointer that leads to Use-After-Free.
- [High] Unchecked direct access to `evsel->name` passed to `strstarts()` within an event loop.
- [High] Unchecked dereference of `evsel->core.id` when synthesizing event IDs.
- [High] Synthesized `PERF_RECORD_SAMPLE` events lack payload serialization, causing downstream parser faults.
- [High] Off-by-one accounting of translation file headers causes out-of-bounds mmap read and data corruption.
- [High] Processing the auxtrace queues for every PERF_RECORD_SAMPLE duplicates trace data and heavily impacts performance.
- [High] Missing POSIX libc headers violate musl compatibility guidelines.
- [High] Massive out-of-bounds heap read in `write_htm` due to unvalidated length field from untrusted auxtrace data.
- [High] Missing `NULL` check on `evlist__event2evsel()` return value leads to immediate crash on malformed events.
- [Medium] Using strstr from the beginning of the line inside a regex match loop corrupts the output when a trace line contains multiple addresses.
- [Medium] Using `unsigned long` for 64-bit PowerPC addresses truncates them on 32-bit hosts, breaking cross-platform analysis.
--
(Note: The standard patch quoting format has been altered to a free-form
summary due to system recitation restrictions. No code diffs are quoted below.)
commit bb9556f09003be87e7766614cf5cfdd9647bd477
Author: Athira Rajeev <atrajeev@linux.ibm.com>
Subject: tools/perf/powerpc: Add event name as htm of PERF_TYPE_SYNTH type to present htm samples
This commit creates a synthetic "htm" event to present Hardware Trace Macro
(HTM) entries as samples in perf. It adds configuration values and uses the
logical address as the sample instruction pointer to aid in identifying hot
logical addresses.
[Severity: High]
In powerpc_htm_process_auxtrace_info(), if powerpc_htm_synth_events() returns
an error, the code jumps to the err_free label. Does this skip the cleanup of
the auxtrace queues and leave a dangling pointer in session->auxtrace?
It looks like the err_free_queues label handles both auxtrace_queues__free()
and setting session->auxtrace to NULL, which might be needed here to prevent
a use-after-free during session teardown.
[Severity: High]
In powerpc_htm_synth_events(), the loop iterates over evsel entries and checks
if they start with "htm" using strstarts(evsel->name, "htm"). Could
evsel->name be NULL for unnamed or dummy events?
Passing a NULL name into strstarts() could trigger a segmentation fault.
[Severity: High]
Also in powerpc_htm_synth_events(), the synthesized event ID is generated by
adding a fixed offset to evsel->core.id[0]. Is it possible for evsel->core.id
to be NULL if the event was recorded without sample IDs?
Dereferencing it directly in that case might cause a crash.
[Severity: High]
Looking at powerpc_htm_create_sample(), the synthesized event size is hardcoded
to exactly sizeof(struct perf_event_header), which is 8 bytes. Does this omit
the actual sample payload serialization that perf_event__synthesize_sample()
usually handles?
Downstream tools parsing this PERF_RECORD_SAMPLE event might expect a payload
based on the sample_type, potentially leading to out-of-bounds reads when
they process the 8-byte event.
[Severity: High]
In write_htm(), the total number of entries is incremented to account for the
32-byte header, and this is added to htm->htm_mem_entries. However, when
create_mem_maps() later maps the file, it skips the first header but loops
exactly htm->htm_mem_entries times.
Will this result in reading past the end of the memory-mapped region?
Additionally, if multiple chunks were written, would their embedded headers
be misinterpreted as valid trace data?
[Severity: High]
In powerpc_htm_process_event(), the code iterates through all auxtrace queues
and writes their buffer contents to the file for every PERF_RECORD_SAMPLE
event processed.
Does this repeatedly duplicate the entire trace data? Since the buffers are
not consumed or marked as processed, every sample appends the full set of
buffers again, so the trace file grows in proportion to the number of samples
and processing slows down accordingly.
[Severity: Medium]
In process_trace_file(), while iterating through regex matches for addresses
on a line, the replacement logic uses strstr() from the beginning of the line
to find the "addr:0x" prefix.
If a single trace line contains multiple addresses, won't strstr() repeatedly
find the first occurrence instead of the current match, potentially corrupting
the output?
[Severity: Medium]
The physical and logical addresses from the HTM trace are 64-bit values, but
powerpc_htm_create_sample() takes an unsigned long address.
On 32-bit host systems performing cross-platform analysis, could this truncate
the 64-bit addresses? The boundary check for kernel addresses would also be
optimized out on 32-bit hosts. The addr_map structure seems to share this
same truncation issue.
[Severity: High]
This file utilizes several standard POSIX functions and constants such as
fopen(), malloc(), strtoul(), and O_WRONLY.
To comply with musl libc compatibility guidelines, shouldn't the standard
headers like <stdio.h>, <stdlib.h>, and <fcntl.h> be explicitly included
at the top of the file?
[Severity: High]
In write_htm(), the num_entries value is parsed from the incoming auxtrace
data, converted to cpu endianness, incremented, and then passed directly to
fwrite().
Is there a missing bounds check to ensure that the calculated size does not
exceed the buffer's actual bounds? An unvalidated length could lead to an
out-of-bounds heap read if the trace data is malformed.
[Severity: High]
Both powerpc_htm_process_event() and powerpc_htm_process_auxtrace_info()
fetch an evsel using evlist__event2evsel() and immediately dereference its
config attribute.
If the event's sample ID isn't found in the evlist, could
evlist__event2evsel() return NULL? A check might be needed to avoid a
segmentation fault on malformed perf.data files.
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260701084115.80383-1-atrajeev@linux.ibm.com?part=8
* Re: [PATCH 9/9] tools/perf/powerpc: Add logical address in decoded traces
2026-07-01 8:41 ` [PATCH 9/9] tools/perf/powerpc: Add logical address in decoded traces Athira Rajeev
@ 2026-07-01 9:13 ` sashiko-bot
0 siblings, 0 replies; 19+ messages in thread
From: sashiko-bot @ 2026-07-01 9:13 UTC (permalink / raw)
To: Athira Rajeev; +Cc: linux-perf-users
Thank you for your contribution! Sashiko AI review found 6 potential issue(s) to consider:
New issues:
- [High] Improper error handling causes memory leak and potential segfault during initialization.
- [High] File write failures cause an infinite loop in trace line parsing.
- [High] String replacement logic corrupts output for trace lines with multiple addresses.
- [Medium] Unmapped physical addresses are incorrectly overwritten with 0x0, destroying trace information.
- [Medium] Trace lines that do not match the event label regex are silently dropped from the translated dump.
Pre-existing issues:
- [High] Malformed physical addresses cause an infinite loop in trace line parsing.
--
commit cce34e6215ec4758fecae703f279c8591debaca6
Author: Athira Rajeev <atrajeev@linux.ibm.com>
Subject: tools/perf/powerpc: Add logical address in decoded traces
This patch enhances post-processing to translate physical addresses to
logical addresses in the decoded output. It creates a new .l output file
with these translated addresses. It also allows optional LP index filtering
when dumping traces.
> diff --git a/tools/perf/util/powerpc-htm.c b/tools/perf/util/powerpc-htm.c
> index 050fbceac71eb..1bd2664453c66 100644
> --- a/tools/perf/util/powerpc-htm.c
> +++ b/tools/perf/util/powerpc-htm.c
[ ... ]
> @@ -259,36 +261,60 @@ static struct addr_map *process_trace_file(const char *trace_file,
> unsigned long logical_addr;
> size_t total_phys = 0;
> size_t total_phys_to_logical = 0;
> -
> FILE *fp = fopen(trace_file, "r");
> + size_t prefix_len;
> + int found_match = 0;
> + FILE *fout;
> + int filter_lp = 1;
> + char *output = malloc(strlen(trace_file) + 3); /* +3 for ".l" and null */
> +
> + if (!output) {
> + pr_err("Failed to allocate memory for output filename\n");
> + fclose(fp);
> + return NULL;
> + }
> +
> if (!fp) {
> pr_err("Failed to open trace file %s: %s\n", trace_file, strerror(errno));
> return NULL;
> }
[Severity: High]
Does this logic mishandle failures from fopen() and malloc()?
If trace_file cannot be opened, fopen() returns NULL. If malloc() for output
succeeds, the subsequent if (!fp) block returns NULL without freeing output,
leaking the allocated memory.
Additionally, if malloc() fails when fp is NULL, the if (!output) block calls
fclose(fp), which would dereference the NULL pointer and crash the tool.
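One possible ordering, as a minimal sketch: check each resource immediately after acquiring it and release what is already held on failure. `open_trace()` is a hypothetical helper, not a function from the patch:

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: open trace_file for reading and build the
 * "<trace_file>.l" output name. Each resource is checked right after
 * it is acquired, and anything already acquired is released on
 * failure. Returns 0 on success or a negative errno value.
 */
static int open_trace(const char *trace_file, FILE **fp_out, char **name_out)
{
	size_t len = strlen(trace_file) + sizeof(".l");	/* sizeof counts the NUL */
	FILE *fp;
	char *name;

	*fp_out = NULL;
	*name_out = NULL;

	fp = fopen(trace_file, "r");
	if (!fp)
		return -errno;

	name = malloc(len);
	if (!name) {
		fclose(fp);
		return -ENOMEM;
	}
	snprintf(name, len, "%s.l", trace_file);

	*fp_out = fp;
	*name_out = name;
	return 0;
}
```

With this ordering fclose() is never called on a NULL stream, and the name buffer cannot leak when the open fails.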
[ ... ]
> @@ -303,6 +329,10 @@ static struct addr_map *process_trace_file(const char *trace_file,
> while (regexec(&addr_regex, ptr, 1, pmatch, 0) == 0) {
> unsigned long long phys_addr = 0;
> struct addr_map entry = {0};
> + char *hex_start = strstr(line, "addr:0x");
[Severity: High]
Can this strstr() call corrupt the output for trace lines with multiple
addresses?
When a trace line contains multiple addresses, the inner loop iterates over
each regex match. However, hex_start is calculated using
strstr(line, "addr:0x"), which always finds the first occurrence in the
entire line, ignoring the regex match bounds. This will cause the same prefix
and suffix to be printed multiple times, duplicating portions of the line.
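A sketch of rewriting driven by the regex match offsets rather than by strstr() on the whole line. Everything here is an assumption for illustration: `sketch_translate()` stands in for the physical-to-logical lookup, the output goes to a caller-supplied buffer instead of a file, and the pattern is a guess at the address format:

```c
#include <regex.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical lookup used only for this sketch: always "mapped". */
static int sketch_translate(unsigned long long phys, unsigned long long *logical)
{
	*logical = phys + 0x1000;
	return 1;
}

/* Append len bytes of s to out, keeping it NUL-terminated. */
static int sketch_append(char *out, size_t out_sz, size_t *used,
			 const char *s, size_t len)
{
	if (len >= out_sz - *used)
		return -1;
	memcpy(out + *used, s, len);
	*used += len;
	out[*used] = '\0';
	return 0;
}

/*
 * Copy line to out, replacing each "addr:0x<hex>" with its translated
 * value. Positions come from the match offsets, so several addresses
 * on one line are handled one by one. Returns the number of addresses
 * rewritten, or -1 on error.
 */
static int rewrite_addrs(const char *line, char *out, size_t out_sz)
{
	regex_t re;
	regmatch_t m;
	const char *ptr = line;
	size_t used = 0;
	int count = 0;

	if (out_sz == 0 || regcomp(&re, "addr:0x[0-9a-fA-F]+", REG_EXTENDED))
		return -1;
	out[0] = '\0';

	while (regexec(&re, ptr, 1, &m, 0) == 0) {
		const char *match = ptr + m.rm_so;
		size_t match_len = (size_t)(m.rm_eo - m.rm_so);
		unsigned long long phys, logical;
		char buf[32];
		int n;

		/* Copy the text that precedes this match. */
		if (sketch_append(out, out_sz, &used, ptr, (size_t)m.rm_so))
			goto fail;

		if (sscanf(match + strlen("addr:"), "%llx", &phys) == 1 &&
		    sketch_translate(phys, &logical)) {
			n = snprintf(buf, sizeof(buf), "addr:0x%llx", logical);
			if (n < 0 || sketch_append(out, out_sz, &used, buf, (size_t)n))
				goto fail;
			count++;
		} else {
			/* Unparsed or unmapped: keep the original text. */
			if (sketch_append(out, out_sz, &used, match, match_len))
				goto fail;
		}

		ptr += m.rm_eo;	/* always advance past the match */
	}

	/* Copy whatever follows the last match. */
	if (sketch_append(out, out_sz, &used, ptr, strlen(ptr)))
		goto fail;
	regfree(&re);
	return count;

fail:
	regfree(&re);
	return -1;
}
```

Because ptr advances on every path through the loop body, a parse failure cannot make regexec() match the same text again, and lines without any address are copied through unchanged.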
> + const char *target = "addr:0x";
> + char *old_val_ptr;
> + size_t written;
>
> if (sscanf(ptr + pmatch[0].rm_so + strlen("addr:"),
> "%llx", &phys_addr) != 1) {
> pr_debug("Failed to parse phys addr from trace line\n");
> continue;
> }
[Severity: High]
This is a pre-existing issue, but does this error path cause an infinite loop?
If sscanf() fails to parse a physical address, it executes continue. This
skips the ptr += pmatch[0].rm_eo; offset advancement at the bottom of the loop.
As a result, the next iteration of the while (regexec(&addr_regex, ptr...))
loop will match the exact same string and fail to parse it again, looping
indefinitely.
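One way to guarantee forward progress is to advance ptr before any early
continue, as in the minimal sketch below. count_parsed_addrs() is a
hypothetical function and its regex is an assumption chosen so that the
sscanf() failure path can actually be reached; the real addr_regex may differ.

```c
#include <regex.h>
#include <stdio.h>
#include <string.h>

/* Count the addresses that parse; must terminate even on bad input. */
static int count_parsed_addrs(const char *line)
{
	regex_t re;
	regmatch_t m[1];
	const char *ptr = line;
	int parsed = 0;

	/* Assumed pattern: "addr:" followed by any non-whitespace run. */
	if (regcomp(&re, "addr:[^ \t\n]*", REG_EXTENDED))
		return -1;

	while (regexec(&re, ptr, 1, m, 0) == 0) {
		unsigned long long phys = 0;
		const char *val = ptr + m[0].rm_so + strlen("addr:");

		/* Advance first, so every exit from this iteration moves on. */
		ptr += m[0].rm_eo;

		if (sscanf(val, "%llx", &phys) != 1)
			continue; /* safe: ptr is already past this match */

		parsed++;
	}

	regfree(&re);
	return parsed;
}
```

The same effect could be had with a goto label placed just above the existing
ptr += pmatch[0].rm_eo; line, which would keep the diff smaller.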
> @@ -315,7 +345,30 @@ static struct addr_map *process_trace_file(const char *trace_file,
> logical_addr = find_logical_addr(phys_addr,
> mem_entries_array,
> n_entries,
> - lp_filter);
> + lp_filter, filter_lp);
> + /* create output.txt with logical address */
> + if (dump_trace && hex_start) {
> + old_val_ptr = hex_start + strlen(target);
> + prefix_len = hex_start - line;
> + written = fwrite(line, 1, prefix_len, fout);
> + if (written != prefix_len) {
> + pr_err("Failed to write prefix to output file\n");
> + continue;
> + }
[Severity: High]
Can file write failures here cause an infinite loop in trace line parsing?
If either the fwrite() or fprintf() calls fail (e.g., due to a full disk),
the code logs an error and executes continue. Similar to the parsing error
above, this skips the ptr += pmatch[0].rm_eo; advancement at the bottom of
the loop, causing regexec() to repeatedly match the same address string
forever.
> + if (fprintf(fout, "addr:0x%llx\t", (unsigned long long)logical_addr) < 0) {
[Severity: Medium]
Are unmapped physical addresses incorrectly overwritten with 0x0 here?
If find_logical_addr() fails to find a mapping, it returns 0. The code then
unconditionally writes logical_addr to the output file. This replaces valid but
unmapped physical addresses with addr:0x0, rather than preserving the original
address.
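A minimal sketch of falling back to the physical address is shown below. It
assumes, as described above, that a return value of 0 means no mapping was
found; format_addr() is a hypothetical helper and not part of the patch.

```c
#include <stdio.h>

/*
 * Format the address field for the translated dump.
 * Assumption: logical == 0 means "no mapping found", so the original
 * physical address is kept rather than printing addr:0x0.
 */
static int format_addr(char *buf, size_t sz, unsigned long long phys,
		       unsigned long long logical)
{
	return snprintf(buf, sz, "addr:0x%llx", logical ? logical : phys);
}
```

If 0 can also be a valid logical address, a separate found/not-found
indication from find_logical_addr() would be needed instead of overloading the
return value.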
> + pr_err("Failed to write to output file\n");
> + continue;
> + }
[ ... ]
> @@ -335,6 +388,12 @@ static struct addr_map *process_trace_file(const char *trace_file,
>
> ptr += pmatch[0].rm_eo;
> }
> + if (dump_trace && (!found_match) && line) {
> + if (fprintf(fout, "%s", line) < 0) {
> + pr_err("Failed to write line to output file\n");
> + continue;
> + }
> + }
> }
> }
[Severity: Medium]
Are lines that do not match the event label regex dropped entirely from the
translated dump?
The logic to dump lines to fout when dump_trace is enabled appears to be
nested completely within this if (regexec(&label_regex, line, 2, pmatch, 0) == 0)
block. There is no corresponding else block, which would cause any
non-matching lines (like headers or metadata) to be silently omitted from
the .l output file.
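A rough sketch of a per-line step that keeps non-matching lines is given
below. emit_line() is hypothetical, the "[event] " prefix is only a
placeholder for the address translation performed on event lines, and the
output goes to a buffer rather than fout for brevity.

```c
#include <regex.h>
#include <stdio.h>

/*
 * Returns 1 if line carried an event label, 0 if it was copied verbatim,
 * and -1 if out was too small.
 */
static int emit_line(const regex_t *label_re, const char *line,
		     char *out, size_t out_sz)
{
	regmatch_t m[2];
	int n;

	if (regexec(label_re, line, 2, m, 0) == 0) {
		/* Placeholder for the translation done on event lines. */
		n = snprintf(out, out_sz, "[event] %s", line);
		return (n < 0 || (size_t)n >= out_sz) ? -1 : 1;
	}

	/* No label: keep headers and metadata in the dump unchanged. */
	n = snprintf(out, out_sz, "%s", line);
	return (n < 0 || (size_t)n >= out_sz) ? -1 : 0;
}
```

In the patch itself this would correspond to an else branch that writes line
to fout when dump_trace is set.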
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260701084115.80383-1-atrajeev@linux.ibm.com?part=9
Thread overview: 19+ messages
2026-07-01 8:41 [PATCH 0/9] tools/perf: Add interface to expose HTM trace data via perf Athira Rajeev
2026-07-01 8:41 ` [PATCH 1/9] tool/perf: Move auxtrace_record__init for powerpc-vpadtl as separate utility Athira Rajeev
2026-07-01 8:56 ` sashiko-bot
2026-07-01 8:41 ` [PATCH 2/9] tools/perf: Add CONFIG_AUXTRACE support for HTM pmu on powerpc Athira Rajeev
2026-07-01 8:55 ` sashiko-bot
2026-07-01 8:41 ` [PATCH 3/9] tools/perf: Add arch_record__collect_final_data to collect additional data before closing the event Athira Rajeev
2026-07-01 8:54 ` sashiko-bot
2026-07-01 8:41 ` [PATCH 4/9] tools/perf: Add powerpc callback support for arch_record__collect_final_data Athira Rajeev
2026-07-01 8:55 ` sashiko-bot
2026-07-01 8:41 ` [PATCH 5/9] tools/perf: process htm auxtrace events and display in perf report -D Athira Rajeev
2026-07-01 9:05 ` sashiko-bot
2026-07-01 8:41 ` [PATCH 6/9] perf tools powerpc: Add HTM trace data processing and decoding support Athira Rajeev
2026-07-01 9:06 ` sashiko-bot
2026-07-01 8:41 ` [PATCH 7/9] perf tools powerpc: Add physical to logical address mapping for HTM traces Athira Rajeev
2026-07-01 9:07 ` sashiko-bot
2026-07-01 8:41 ` [PATCH 8/9] tools/perf/powerpc: Add event name as htm of PERF_TYPE_SYNTH type to present htm samples Athira Rajeev
2026-07-01 9:12 ` sashiko-bot
2026-07-01 8:41 ` [PATCH 9/9] tools/perf/powerpc: Add logical address in decoded traces Athira Rajeev
2026-07-01 9:13 ` sashiko-bot