From: Ian Rogers <irogers@google.com>
To: Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Namhyung Kim <namhyung@kernel.org>,
Adrian Hunter <adrian.hunter@intel.com>,
James Clark <james.clark@linaro.org>,
Leo Yan <leo.yan@linux.dev>,
Ravi Bangoria <ravi.bangoria@amd.com>,
linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org,
thomas.falcon@intel.com, dapeng1.mi@linux.intel.com
Cc: Ian Rogers <irogers@google.com>
Subject: [PATCH v1 2/2] perf inject: Fix itrace branch stack synthesis
Date: Tue, 28 Apr 2026 00:03:28 -0700
Message-ID: <20260428070328.1880314-3-irogers@google.com>
In-Reply-To: <20260428070328.1880314-1-irogers@google.com>
When using "perf inject --itrace=L" to synthesize branch stacks from
AUX data, several issues caused failures with the generated file:
1. The synthesized samples were delivered without the
PERF_SAMPLE_BRANCH_STACK flag if it was not in the original event's
sample_type. Fixed by passing sample_type | evsel->synth_sample_type
to intel_pt_deliver_synth_event.
2. The record layout was misaligned because of inconsistent handling
of PERF_SAMPLE_BRANCH_HW_INDEX. Fixed by explicitly writing nr and
hw_idx in perf_event__synthesize_sample.
3. Modifying evsel->core.attr.sample_type early in __cmd_inject caused
parse failures for subsequent records in the input file. Fixed by
moving this modification to just before writing the header.
4. perf_event__repipe_sample now synthesizes samples only when branch
stack injection is requested, and otherwise falls back to
perf_inject__cut_auxtrace_sample to preserve existing
functionality.
5. Potential heap overflow in perf_event__repipe_sample: addressed by
adding a check that prints an error and returns -EFAULT if the
calculated event size exceeds PERF_SAMPLE_MAX_SIZE.
6. Header vs payload mismatch in __cmd_inject: addressed by narrowing
the condition so that HEADER_BRANCH_STACK is only set in the file
header if add_last_branch is true.
7. NULL pointer dereference in intel-pt.c: addressed by updating the
condition in intel_pt_do_synth_pebs_sample to fill
sample.branch_stack if it was synthesized, even if
PERF_SAMPLE_BRANCH_STACK is not in the original sample_type.
Assisted-by: Gemini:gemini-3.1-pro-preview
Signed-off-by: Ian Rogers <irogers@google.com>
---
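Note (not part of the commit message): the layout issue in item 2 can
be illustrated with a minimal userspace sketch. It assumes the branch
stack is laid out as a u64 nr, then a u64 hw_idx only when the
hardware-index flag is set, then nr entries of three u64 words each.
All demo_ names and DEMO_BRANCH_HW_INDEX are hypothetical stand-ins
for this illustration and are not the perf implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for PERF_SAMPLE_BRANCH_HW_INDEX; the value is illustrative only. */
#define DEMO_BRANCH_HW_INDEX (1ULL << 0)

/* Hypothetical mirror of one branch entry: three u64 words. */
struct demo_branch_entry {
	uint64_t from;
	uint64_t to;
	uint64_t flags;
};

/* Bytes the branch stack occupies in the record for a given flag set. */
static size_t demo_branch_stack_size(uint64_t nr, uint64_t branch_sample_type)
{
	size_t sz = sizeof(uint64_t); /* nr is always present */

	if (branch_sample_type & DEMO_BRANCH_HW_INDEX)
		sz += sizeof(uint64_t); /* hw_idx only when the flag is set */
	return sz + (size_t)nr * sizeof(struct demo_branch_entry);
}

/*
 * Write nr, the optional hw_idx, then the entries. Returns the number of
 * bytes written, or 0 if the destination is too small.
 */
static size_t demo_write_branch_stack(uint8_t *dst, size_t cap,
				      uint64_t nr, uint64_t hw_idx,
				      const struct demo_branch_entry *entries,
				      uint64_t branch_sample_type)
{
	size_t need = demo_branch_stack_size(nr, branch_sample_type);
	uint8_t *p = dst;

	assert(dst != NULL);
	assert(nr == 0 || entries != NULL);
	if (need > cap)
		return 0;

	memcpy(p, &nr, sizeof(nr));
	p += sizeof(nr);
	if (branch_sample_type & DEMO_BRANCH_HW_INDEX) {
		memcpy(p, &hw_idx, sizeof(hw_idx));
		p += sizeof(hw_idx);
	}
	if (nr)
		memcpy(p, entries, (size_t)nr * sizeof(*entries));
	p += (size_t)nr * sizeof(*entries);
	return (size_t)(p - dst);
}
```

The point of the sketch is that the size computation and the writer
test the same flag, so a reader that honours the flag sees the first
entry at the offset the writer used.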
tools/perf/builtin-inject.c | 87 ++++++++++++++++++++++++++++++++-----
tools/perf/util/intel-pt.c | 6 ++-
2 files changed, 81 insertions(+), 12 deletions(-)
diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index 0c51cb4250d1..1f4d25a0efba 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -223,6 +223,11 @@ static int perf_event__repipe_attr(const struct perf_tool *tool,
tool);
int ret;
+ if (inject->itrace_synth_opts.add_last_branch) {
+ event->attr.attr.sample_type |= PERF_SAMPLE_BRANCH_STACK;
+ event->attr.attr.branch_sample_type |= PERF_SAMPLE_BRANCH_HW_INDEX;
+ }
+
ret = perf_event__process_attr(tool, event, pevlist);
if (ret)
return ret;
@@ -375,7 +380,60 @@ static int perf_event__repipe_sample(const struct perf_tool *tool,
build_id__mark_dso_hit(tool, event, sample, evsel, machine);
- if (inject->itrace_synth_opts.set && sample->aux_sample.size) {
+ if (inject->itrace_synth_opts.set &&
+ (inject->itrace_synth_opts.last_branch ||
+ inject->itrace_synth_opts.add_last_branch)) {
+ union perf_event *event_copy = (void *)inject->event_copy;
+ struct branch_stack dummy_bs = { .nr = 0, .hw_idx = 0 };
+ int err;
+ size_t sz;
+ u64 orig_type = evsel->core.attr.sample_type;
+ u64 orig_branch_type = evsel->core.attr.branch_sample_type;
+
+ if (event_copy == NULL) {
+ inject->event_copy = malloc(PERF_SAMPLE_MAX_SIZE);
+ if (!inject->event_copy)
+ return -ENOMEM;
+
+ event_copy = (void *)inject->event_copy;
+ }
+
+ if (!sample->branch_stack)
+ sample->branch_stack = &dummy_bs;
+
+ if (inject->itrace_synth_opts.add_last_branch) {
+ /* Temporarily add in type bits for synthesis. */
+ evsel->core.attr.sample_type |= PERF_SAMPLE_BRANCH_STACK;
+ evsel->core.attr.branch_sample_type |= PERF_SAMPLE_BRANCH_HW_INDEX;
+ }
+ evsel->core.attr.sample_type &= ~PERF_SAMPLE_AUX;
+
+ sz = perf_event__sample_event_size(sample, evsel->core.attr.sample_type,
+ evsel->core.attr.read_format,
+ evsel->core.attr.branch_sample_type);
+
+ if (sz > PERF_SAMPLE_MAX_SIZE) {
+ pr_err("Sample size %zu exceeds max size %d\n", sz, PERF_SAMPLE_MAX_SIZE);
+ return -EFAULT;
+ }
+
+ event_copy->header.type = PERF_RECORD_SAMPLE;
+ event_copy->header.misc = event->header.misc;
+ event_copy->header.size = sz;
+
+ err = perf_event__synthesize_sample(event_copy, evsel->core.attr.sample_type,
+ evsel->core.attr.read_format,
+ evsel->core.attr.branch_sample_type, sample);
+
+ evsel->core.attr.sample_type = orig_type;
+ evsel->core.attr.branch_sample_type = orig_branch_type;
+
+ if (err) {
+ pr_err("Failed to synthesize sample\n");
+ return err;
+ }
+ event = event_copy;
+ } else if (inject->itrace_synth_opts.set && sample->aux_sample.size) {
event = perf_inject__cut_auxtrace_sample(inject, event, sample);
if (IS_ERR(event))
return PTR_ERR(event);
@@ -463,13 +521,9 @@ static int perf_event__convert_sample_callchain(const struct perf_tool *tool,
/* remove sample_type {STACK,REGS}_USER for synthesize */
sample_type &= ~(PERF_SAMPLE_STACK_USER | PERF_SAMPLE_REGS_USER);
- ret = perf_event__synthesize_sample(event_copy, evsel->core.attr.sample_type,
- evsel->core.attr.read_format,
- evsel->core.attr.branch_sample_type, sample);
- if (ret) {
- pr_err("Failed to synthesize sample\n");
- return ret;
- }
+ perf_event__synthesize_sample(event_copy, sample_type,
+ evsel->core.attr.read_format,
+ evsel->core.attr.branch_sample_type, sample);
return perf_event__repipe_synth(tool, event_copy);
}
@@ -2440,12 +2494,25 @@ static int __cmd_inject(struct perf_inject *inject)
* synthesized hardware events, so clear the feature flag.
*/
if (inject->itrace_synth_opts.set) {
+ struct evsel *evsel;
+
perf_header__clear_feat(&session->header,
HEADER_AUXTRACE);
- if (inject->itrace_synth_opts.last_branch ||
- inject->itrace_synth_opts.add_last_branch)
+
+ evlist__for_each_entry(session->evlist, evsel) {
+ evsel->core.attr.sample_type &= ~PERF_SAMPLE_AUX;
+ }
+
+ if (inject->itrace_synth_opts.add_last_branch) {
perf_header__set_feat(&session->header,
HEADER_BRANCH_STACK);
+
+ evlist__for_each_entry(session->evlist, evsel) {
+ evsel->core.attr.sample_type |= PERF_SAMPLE_BRANCH_STACK;
+ evsel->core.attr.branch_sample_type |=
+ PERF_SAMPLE_BRANCH_HW_INDEX;
+ }
+ }
}
/*
diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index 5142983e3243..2dce6106c038 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -1731,6 +1731,7 @@ static void intel_pt_prep_b_sample(struct intel_pt *pt,
static int intel_pt_inject_event(union perf_event *event,
struct perf_sample *sample, u64 type)
{
+ event->header.type = PERF_RECORD_SAMPLE;
event->header.size = perf_event__sample_event_size(sample, type, /*read_format=*/0,
/*branch_sample_type=*/0);
@@ -2489,7 +2490,7 @@ static int intel_pt_do_synth_pebs_sample(struct intel_pt_queue *ptq, struct evse
intel_pt_add_xmm(intr_regs, pos, items, regs_mask);
}
- if (sample_type & PERF_SAMPLE_BRANCH_STACK) {
+ if ((sample_type | evsel->synth_sample_type) & PERF_SAMPLE_BRANCH_STACK) {
if (items->mask[INTEL_PT_LBR_0_POS] ||
items->mask[INTEL_PT_LBR_1_POS] ||
items->mask[INTEL_PT_LBR_2_POS]) {
@@ -2560,7 +2561,8 @@ static int intel_pt_do_synth_pebs_sample(struct intel_pt_queue *ptq, struct evse
sample.transaction = txn;
}
- ret = intel_pt_deliver_synth_event(pt, event, &sample, sample_type);
+ ret = intel_pt_deliver_synth_event(pt, event, &sample,
+ sample_type | evsel->synth_sample_type);
perf_sample__exit(&sample);
return ret;
}
--
2.54.0.545.g6539524ca2-goog
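Note (outside the patch): a minimal sketch of the two checks that
perf_event__repipe_sample performs before synthesizing, written as
standalone helpers. It assumes the output sample type is the original
type with the branch stack bit optionally added and the AUX bit
cleared, and that an oversized record is rejected with -EFAULT. The
demo_ names and DEMO_ constants are hypothetical stand-ins, with
illustrative values, and do not come from the perf sources.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for PERF_SAMPLE_BRANCH_STACK and PERF_SAMPLE_AUX (illustrative values). */
#define DEMO_SAMPLE_BRANCH_STACK (1ULL << 0)
#define DEMO_SAMPLE_AUX          (1ULL << 1)
/* Stand-in for the copy buffer limit; the value is an assumption for this sketch. */
#define DEMO_SAMPLE_MAX_SIZE     ((size_t)1 << 16)

/*
 * Compute the sample type used for the rewritten record without touching
 * the caller's copy, so no restore step is needed on any return path.
 */
static uint64_t demo_output_sample_type(uint64_t orig_type, int add_last_branch)
{
	uint64_t type = orig_type;

	if (add_last_branch)
		type |= DEMO_SAMPLE_BRANCH_STACK;
	return type & ~DEMO_SAMPLE_AUX; /* AUX data is not carried over */
}

/* Reject a record that would not fit in the fixed-size copy buffer. */
static int demo_check_sample_size(size_t sz)
{
	return sz > DEMO_SAMPLE_MAX_SIZE ? -EFAULT : 0;
}
```

Working on a local value is a design choice of this sketch; the patch
itself temporarily modifies evsel->core.attr and restores it after
synthesis.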
Thread overview: 10+ messages
2026-04-28 7:03 [PATCH v1 0/2] perf inject intel-PT LBR/brstack synthesis fixes Ian Rogers
2026-04-28 7:03 ` [PATCH v1 1/2] perf event: Fix size of synthesized sample with branch stacks Ian Rogers
2026-04-28 23:19 ` Namhyung Kim
2026-04-28 7:03 ` Ian Rogers [this message]
2026-04-28 20:20 ` [PATCH v1 2/2] perf inject: Fix itrace branch stack synthesis sashiko-bot
2026-04-29 18:11 ` [PATCH v2 0/2] perf inject intel-PT LBR/brstack synthesis fixes Ian Rogers
2026-04-29 18:11 ` [PATCH v2 1/2] perf event: Fix size of synthesized sample with branch stacks Ian Rogers
2026-04-29 20:51 ` sashiko-bot
2026-04-29 18:11 ` [PATCH v2 2/2] perf inject: Fix itrace branch stack synthesis Ian Rogers
2026-04-29 21:18 ` sashiko-bot