* [PATCH v10 1/4] perf arm-spe: Set sample.addr to target address for instruction sample
2024-10-25 10:52 [PATCH v10 0/4] perf arm-spe: Allow synthesizing of branch instructions Graham Woodward
@ 2024-10-25 10:52 ` Graham Woodward
2024-10-25 10:52 ` [PATCH v10 2/4] perf arm-spe: Use ARM_SPE_OP_BRANCH_ERET when synthesizing branches Graham Woodward
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Graham Woodward @ 2024-10-25 10:52 UTC
To: acme, namhyung, mark.rutland, jolsa, irogers, james.clark,
mike.leach, leo.yan, linux-perf-users, linux-kernel,
linux-arm-kernel
Cc: nd, Graham Woodward
For an instruction sample, set sample.addr to the branch target address
held in the record's 'to_ip' field.
If it is a non-branch record, to_ip will be 0, which indicates that
there is no valid target address.
Signed-off-by: Graham Woodward <graham.woodward@arm.com>
---
tools/perf/util/arm-spe.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index 138ffc71b..76d41c91f 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -400,7 +400,7 @@ static int arm_spe__synth_instruction_sample(struct arm_spe_queue *speq,
sample.id = spe_events_id;
sample.stream_id = spe_events_id;
- sample.addr = record->virt_addr;
+ sample.addr = record->to_ip;
sample.phys_addr = record->phys_addr;
sample.data_src = data_src;
sample.period = spe->instructions_sample_period;
--
2.40.1
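
A minimal sketch of the behaviour this patch describes, not the perf code itself. The struct and helper names (spe_record_sketch, sketch_sample_addr, sketch_has_target) are hypothetical stand-ins; only the rule "sample.addr takes to_ip, and to_ip is 0 for a non-branch record" comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for struct arm_spe_record. */
struct spe_record_sketch {
	uint64_t virt_addr; /* data virtual address of a load/store */
	uint64_t to_ip;     /* branch target; 0 when the record is not a branch */
};

/* Mirrors the patched assignment: sample.addr = record->to_ip. */
static uint64_t sketch_sample_addr(const struct spe_record_sketch *rec)
{
	assert(rec != NULL);
	return rec->to_ip;
}

/* A consumer can treat a zero address as "no valid target". */
static bool sketch_has_target(const struct spe_record_sketch *rec)
{
	return sketch_sample_addr(rec) != 0;
}
```

With this rule, a load record that previously reported its data virtual address in the addr field of an instruction sample would now report 0 there, so a reader of the addr field sees a value only for branches.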
* [PATCH v10 2/4] perf arm-spe: Use ARM_SPE_OP_BRANCH_ERET when synthesizing branches
2024-10-25 10:52 [PATCH v10 0/4] perf arm-spe: Allow synthesizing of branch instructions Graham Woodward
2024-10-25 10:52 ` [PATCH v10 1/4] perf arm-spe: Set sample.addr to target address for instruction sample Graham Woodward
@ 2024-10-25 10:52 ` Graham Woodward
2024-10-25 10:52 ` [PATCH v10 3/4] perf arm-spe: Correctly set sample flags Graham Woodward
` (2 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Graham Woodward @ 2024-10-25 10:52 UTC
To: acme, namhyung, mark.rutland, jolsa, irogers, james.clark,
mike.leach, leo.yan, linux-perf-users, linux-kernel,
linux-arm-kernel
Cc: nd, Graham Woodward
Rather than checking the record type for branch misses only, check the
operation for ARM_SPE_OP_BRANCH_ERET so that branches are synthesised
as well as branch misses.
Signed-off-by: Graham Woodward <graham.woodward@arm.com>
---
tools/perf/util/arm-spe.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index 76d41c91f..d27500c53 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -69,7 +69,7 @@ struct arm_spe {
u64 llc_access_id;
u64 tlb_miss_id;
u64 tlb_access_id;
- u64 branch_miss_id;
+ u64 branch_id;
u64 remote_access_id;
u64 memory_id;
u64 instructions_id;
@@ -601,8 +601,8 @@ static int arm_spe_sample(struct arm_spe_queue *speq)
}
}
- if (spe->sample_branch && (record->type & ARM_SPE_BRANCH_MISS)) {
- err = arm_spe__synth_branch_sample(speq, spe->branch_miss_id);
+ if (spe->sample_branch && (record->op & ARM_SPE_OP_BRANCH_ERET)) {
+ err = arm_spe__synth_branch_sample(speq, spe->branch_id);
if (err)
return err;
}
@@ -1202,12 +1202,12 @@ arm_spe_synth_events(struct arm_spe *spe, struct perf_session *session)
if (spe->synth_opts.branches) {
spe->sample_branch = true;
- /* Branch miss */
+ /* Branch */
err = perf_session__deliver_synth_attr_event(session, &attr, id);
if (err)
return err;
- spe->branch_miss_id = id;
- arm_spe_set_event_name(evlist, id, "branch-miss");
+ spe->branch_id = id;
+ arm_spe_set_event_name(evlist, id, "branch");
id += 1;
}
--
2.40.1
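
The change in the selection condition can be sketched as two predicates side by side. This is an illustration under assumptions: the bit values below are placeholders rather than the values in perf's SPE decoder header, and the struct and function names are hypothetical. Only the shape of the two conditions is taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder bit values; the real definitions live in perf's SPE decoder. */
#define SKETCH_TYPE_BRANCH_MISS (1u << 0) /* stands in for ARM_SPE_BRANCH_MISS */
#define SKETCH_OP_BRANCH_ERET   (1u << 0) /* stands in for ARM_SPE_OP_BRANCH_ERET */

/* Hypothetical subset of the decoded record. */
struct spe_select_sketch {
	uint32_t type; /* event bits, e.g. "this branch was mispredicted" */
	uint32_t op;   /* operation bits, e.g. "this is a branch or ERET" */
};

/* Old condition: only mispredicted branches produce a sample. */
static bool sketch_select_old(const struct spe_select_sketch *rec, bool sample_branch)
{
	assert(rec != NULL);
	return sample_branch && (rec->type & SKETCH_TYPE_BRANCH_MISS) != 0;
}

/* New condition: every branch/ERET operation produces a sample. */
static bool sketch_select_new(const struct spe_select_sketch *rec, bool sample_branch)
{
	assert(rec != NULL);
	return sample_branch && (rec->op & SKETCH_OP_BRANCH_ERET) != 0;
}
```

A correctly predicted branch has the operation bit set but not the miss bit, so it is skipped by the old predicate and selected by the new one. Records that are not branches are skipped by the new predicate.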
* [PATCH v10 3/4] perf arm-spe: Correctly set sample flags
2024-10-25 10:52 [PATCH v10 0/4] perf arm-spe: Allow synthesizing of branch instructions Graham Woodward
2024-10-25 10:52 ` [PATCH v10 1/4] perf arm-spe: Set sample.addr to target address for instruction sample Graham Woodward
2024-10-25 10:52 ` [PATCH v10 2/4] perf arm-spe: Use ARM_SPE_OP_BRANCH_ERET when synthesizing branches Graham Woodward
@ 2024-10-25 10:52 ` Graham Woodward
2024-10-25 10:52 ` [PATCH v10 4/4] perf arm-spe: Update --itrace help text Graham Woodward
2024-10-25 13:08 ` [PATCH v10 0/4] perf arm-spe: Allow synthesizing of branch instructions James Clark
4 siblings, 0 replies; 6+ messages in thread
From: Graham Woodward @ 2024-10-25 10:52 UTC
To: acme, namhyung, mark.rutland, jolsa, irogers, james.clark,
mike.leach, leo.yan, linux-perf-users, linux-kernel,
linux-arm-kernel
Cc: nd, Graham Woodward
Set flags on all synthesized instruction and branch samples.
Signed-off-by: Graham Woodward <graham.woodward@arm.com>
---
tools/perf/builtin-script.c | 1 +
tools/perf/util/arm-spe.c | 17 +++++++++++++++++
tools/perf/util/event.h | 1 +
3 files changed, 19 insertions(+)
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index a644787fa..6f3db0737 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1728,6 +1728,7 @@ static struct {
{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_TRACE_END, "tr end"},
{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL | PERF_IP_FLAG_VMENTRY, "vmentry"},
{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL | PERF_IP_FLAG_VMEXIT, "vmexit"},
+ {PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_BRANCH_MISS, "br miss"},
{0, NULL}
};
diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index d27500c53..830ab653f 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -96,6 +96,7 @@ struct arm_spe_queue {
u64 timestamp;
struct thread *thread;
u64 period_instructions;
+ u32 flags;
};
static void arm_spe_dump(struct arm_spe *spe __maybe_unused,
@@ -376,6 +377,7 @@ static int arm_spe__synth_branch_sample(struct arm_spe_queue *speq,
sample.stream_id = spe_events_id;
sample.addr = record->to_ip;
sample.weight = record->latency;
+ sample.flags = speq->flags;
return arm_spe_deliver_synth_event(spe, speq, event, &sample);
}
@@ -405,10 +407,24 @@ static int arm_spe__synth_instruction_sample(struct arm_spe_queue *speq,
sample.data_src = data_src;
sample.period = spe->instructions_sample_period;
sample.weight = record->latency;
+ sample.flags = speq->flags;
return arm_spe_deliver_synth_event(spe, speq, event, &sample);
}
+static void arm_spe__sample_flags(struct arm_spe_queue *speq)
+{
+ const struct arm_spe_record *record = &speq->decoder->record;
+
+ speq->flags = 0;
+ if (record->op & ARM_SPE_OP_BRANCH_ERET) {
+ speq->flags = PERF_IP_FLAG_BRANCH;
+
+ if (record->type & ARM_SPE_BRANCH_MISS)
+ speq->flags |= PERF_IP_FLAG_BRANCH_MISS;
+ }
+}
+
static const struct midr_range neoverse_spe[] = {
MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N1),
MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2),
@@ -551,6 +567,7 @@ static int arm_spe_sample(struct arm_spe_queue *speq)
u64 data_src;
int err;
+ arm_spe__sample_flags(speq);
data_src = arm_spe__synth_data_source(record, spe->midr);
if (spe->sample_flc) {
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index f8742e623..2744c54f4 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -66,6 +66,7 @@ enum {
PERF_IP_FLAG_VMEXIT = 1ULL << 12,
PERF_IP_FLAG_INTR_DISABLE = 1ULL << 13,
PERF_IP_FLAG_INTR_TOGGLE = 1ULL << 14,
+ PERF_IP_FLAG_BRANCH_MISS = 1ULL << 15,
};
#define PERF_IP_FLAG_CHARS "bcrosyiABExghDt"
--
2.40.1
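
The flag computation added by this patch can be tried in isolation with a sketch like the one below. It follows the logic of arm_spe__sample_flags() from the diff, but returns the flags instead of storing them in the queue. The value of the branch-miss flag (bit 15) is taken from the event.h hunk; the branch flag value and the two record bit values are assumptions made for illustration, as are all the sketch_ names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_IP_FLAG_BRANCH      (1u << 0)  /* assumed value for PERF_IP_FLAG_BRANCH */
#define SKETCH_IP_FLAG_BRANCH_MISS (1u << 15) /* value added in the event.h hunk */

/* Placeholder record bits, not the real decoder values. */
#define SKETCH_OP_BRANCH_ERET   (1u << 0)
#define SKETCH_TYPE_BRANCH_MISS (1u << 0)

/* Hypothetical subset of the decoded record. */
struct spe_flags_sketch {
	uint32_t type;
	uint32_t op;
};

/* Same decision tree as the patch: flag branches, then flag misses on top. */
static uint32_t sketch_sample_flags(const struct spe_flags_sketch *rec)
{
	uint32_t flags = 0;

	assert(rec != NULL);
	if (rec->op & SKETCH_OP_BRANCH_ERET) {
		flags = SKETCH_IP_FLAG_BRANCH;

		if (rec->type & SKETCH_TYPE_BRANCH_MISS)
			flags |= SKETCH_IP_FLAG_BRANCH_MISS;
	}
	return flags;
}

/* Exact-match lookup for the one table entry this patch adds. */
static const char *sketch_flags_name(uint32_t flags)
{
	if (flags == (SKETCH_IP_FLAG_BRANCH | SKETCH_IP_FLAG_BRANCH_MISS))
		return "br miss";
	return NULL; /* other combinations are outside this sketch */
}
```

Note that the miss flag is only ever set together with the branch flag, which is why the new builtin-script.c table entry is keyed on the combination of both bits.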
* [PATCH v10 4/4] perf arm-spe: Update --itrace help text
2024-10-25 10:52 [PATCH v10 0/4] perf arm-spe: Allow synthesizing of branch instructions Graham Woodward
` (2 preceding siblings ...)
2024-10-25 10:52 ` [PATCH v10 3/4] perf arm-spe: Correctly set sample flags Graham Woodward
@ 2024-10-25 10:52 ` Graham Woodward
2024-10-25 13:08 ` [PATCH v10 0/4] perf arm-spe: Allow synthesizing of branch instructions James Clark
4 siblings, 0 replies; 6+ messages in thread
From: Graham Woodward @ 2024-10-25 10:52 UTC
To: acme, namhyung, mark.rutland, jolsa, irogers, james.clark,
mike.leach, leo.yan, linux-perf-users, linux-kernel,
linux-arm-kernel
Cc: nd, Graham Woodward
The --itrace help now needs updating to reflect that
the --itrace=b argument synthesises branches as well
as branch misses.
Signed-off-by: Graham Woodward <graham.woodward@arm.com>
---
tools/perf/Documentation/itrace.txt | 2 +-
tools/perf/Documentation/perf-arm-spe.txt | 2 +-
tools/perf/util/auxtrace.h | 3 +--
3 files changed, 3 insertions(+), 4 deletions(-)
diff --git a/tools/perf/Documentation/itrace.txt b/tools/perf/Documentation/itrace.txt
index 19cc179be..40476b227 100644
--- a/tools/perf/Documentation/itrace.txt
+++ b/tools/perf/Documentation/itrace.txt
@@ -1,6 +1,6 @@
i synthesize instructions events
y synthesize cycles events
- b synthesize branches events (branch misses for Arm SPE)
+ b synthesize branches events
c synthesize branches events (calls only)
r synthesize branches events (returns only)
x synthesize transactions events
diff --git a/tools/perf/Documentation/perf-arm-spe.txt b/tools/perf/Documentation/perf-arm-spe.txt
index 0a3eda482..de2b0b479 100644
--- a/tools/perf/Documentation/perf-arm-spe.txt
+++ b/tools/perf/Documentation/perf-arm-spe.txt
@@ -187,7 +187,7 @@ groups:
7 llc-access
2 tlb-miss
1K tlb-access
- 36 branch-miss
+ 36 branch
0 remote-access
900 memory
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index a1895a4f5..dddaf4f3f 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -75,7 +75,6 @@ enum itrace_period_type {
* (not fully accurate, since CYC packets are only emitted
* together with other events, such as branches)
* @branches: whether to synthesize 'branches' events
- * (branch misses only for Arm SPE)
* @transactions: whether to synthesize events for transactions
* @ptwrites: whether to synthesize events for ptwrites
* @pwr_events: whether to synthesize power events
@@ -650,7 +649,7 @@ bool auxtrace__evsel_is_auxtrace(struct perf_session *session,
#define ITRACE_HELP \
" i[period]: synthesize instructions events\n" \
" y[period]: synthesize cycles events (same period as i)\n" \
-" b: synthesize branches events (branch misses for Arm SPE)\n" \
+" b: synthesize branches events\n" \
" c: synthesize branches events (calls only)\n" \
" r: synthesize branches events (returns only)\n" \
" x: synthesize transactions events\n" \
--
2.40.1
* Re: [PATCH v10 0/4] perf arm-spe: Allow synthesizing of branch instructions
2024-10-25 10:52 [PATCH v10 0/4] perf arm-spe: Allow synthesizing of branch instructions Graham Woodward
` (3 preceding siblings ...)
2024-10-25 10:52 ` [PATCH v10 4/4] perf arm-spe: Update --itrace help text Graham Woodward
@ 2024-10-25 13:08 ` James Clark
4 siblings, 0 replies; 6+ messages in thread
From: James Clark @ 2024-10-25 13:08 UTC
To: Graham Woodward
Cc: nd, acme, namhyung, mark.rutland, jolsa, irogers, mike.leach,
leo.yan, linux-perf-users, linux-kernel, linux-arm-kernel
On 25/10/2024 11:52 am, Graham Woodward wrote:
> Currently the --itrace=b will only show branch-misses but this change
> allows perf to synthesize branches as well.
>
> The change also incorporates the ability to display the target
> addresses when specifying the addr field if the instruction is a branch.
>
> Graham Woodward (4):
> perf arm-spe: Set sample.addr to target address for instruction sample
> perf arm-spe: Use ARM_SPE_OP_BRANCH_ERET when synthesizing branches
> perf arm-spe: Correctly set sample flags
> perf arm-spe: Update --itrace help text
>
> tools/perf/Documentation/itrace.txt | 2 +-
> tools/perf/Documentation/perf-arm-spe.txt | 2 +-
> tools/perf/builtin-script.c | 1 +
> tools/perf/util/arm-spe.c | 31 ++++++++++++++++++-----
> tools/perf/util/auxtrace.h | 3 +--
> tools/perf/util/event.h | 1 +
> 6 files changed, 29 insertions(+), 11 deletions(-)
>
Hi Graham,
I think this is V1? Also it looks like the base doesn't include a few of
the new SPE changes and it doesn't apply cleanly. Make sure it's based
on the latest (46610ba41ef1).
With that LGTM:
Reviewed-by: James Clark <james.clark@linaro.org>