linux-perf-users.vger.kernel.org archive mirror
* [PATCH v10 0/4] perf arm-spe: Allow synthesizing of branch instructions
@ 2024-10-25 10:52 Graham Woodward
  2024-10-25 10:52 ` [PATCH v10 1/4] perf arm-spe: Set sample.addr to target address for instruction sample Graham Woodward
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Graham Woodward @ 2024-10-25 10:52 UTC
  To: acme, namhyung, mark.rutland, jolsa, irogers, james.clark,
	mike.leach, leo.yan, linux-perf-users, linux-kernel,
	linux-arm-kernel
  Cc: nd, Graham Woodward

Currently the --itrace=b will only show branch-misses but this change
allows perf to synthesize branches as well.

The change also incorporates the ability to display the target
addresses when specifying the addr field if the instruction is a branch.

Graham Woodward (4):
  perf arm-spe: Set sample.addr to target address for instruction sample
  perf arm-spe: Use ARM_SPE_OP_BRANCH_ERET when synthesizing branches
  perf arm-spe: Correctly set sample flags
  perf arm-spe: Update --itrace help text

 tools/perf/Documentation/itrace.txt       |  2 +-
 tools/perf/Documentation/perf-arm-spe.txt |  2 +-
 tools/perf/builtin-script.c               |  1 +
 tools/perf/util/arm-spe.c                 | 31 ++++++++++++++++++-----
 tools/perf/util/auxtrace.h                |  3 +--
 tools/perf/util/event.h                   |  1 +
 6 files changed, 29 insertions(+), 11 deletions(-)

-- 
2.40.1



* [PATCH v10 1/4] perf arm-spe: Set sample.addr to target address for instruction sample
  2024-10-25 10:52 [PATCH v10 0/4] perf arm-spe: Allow synthesizing of branch instructions Graham Woodward
@ 2024-10-25 10:52 ` Graham Woodward
  2024-10-25 10:52 ` [PATCH v10 2/4] perf arm-spe: Use ARM_SPE_OP_BRANCH_ERET when synthesizing branches Graham Woodward
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Graham Woodward @ 2024-10-25 10:52 UTC
  To: acme, namhyung, mark.rutland, jolsa, irogers, james.clark,
	mike.leach, leo.yan, linux-perf-users, linux-kernel,
	linux-arm-kernel
  Cc: nd, Graham Woodward

For an instruction sample, assign the target address (the record's
'to_ip' field) to sample.addr.
If the record is not a branch, to_ip is 0, which indicates that there
is no valid target address.

Signed-off-by: Graham Woodward <graham.woodward@arm.com>
---
 tools/perf/util/arm-spe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index 138ffc71b..76d41c91f 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -400,7 +400,7 @@ static int arm_spe__synth_instruction_sample(struct arm_spe_queue *speq,
 
 	sample.id = spe_events_id;
 	sample.stream_id = spe_events_id;
-	sample.addr = record->virt_addr;
+	sample.addr = record->to_ip;
 	sample.phys_addr = record->phys_addr;
 	sample.data_src = data_src;
 	sample.period = spe->instructions_sample_period;
-- 
2.40.1
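
To illustrate what this one-line change means for a consumer of the
synthesized instruction samples, here is a minimal standalone sketch. The
struct and helper names (spe_record_sketch, instruction_sample_addr,
sample_has_target) are hypothetical and only mirror the two record fields
touched by the diff; they are not perf's real types or API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two arm_spe_record fields in the diff. */
struct spe_record_sketch {
	uint64_t virt_addr;	/* previously copied into sample.addr */
	uint64_t to_ip;		/* branch target; 0 for non-branch records */
};

/* After the patch, the instruction sample's addr carries the target. */
static uint64_t instruction_sample_addr(const struct spe_record_sketch *rec)
{
	return rec->to_ip;
}

/* Assumption: a consumer treats addr == 0 as "no valid branch target". */
static bool sample_has_target(uint64_t addr)
{
	return addr != 0;
}
```

With this shape, a branch record yields a non-zero addr that can be shown
as the target, while any other record yields 0 and can be skipped.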



* [PATCH v10 2/4] perf arm-spe: Use ARM_SPE_OP_BRANCH_ERET when synthesizing branches
  2024-10-25 10:52 [PATCH v10 0/4] perf arm-spe: Allow synthesizing of branch instructions Graham Woodward
  2024-10-25 10:52 ` [PATCH v10 1/4] perf arm-spe: Set sample.addr to target address for instruction sample Graham Woodward
@ 2024-10-25 10:52 ` Graham Woodward
  2024-10-25 10:52 ` [PATCH v10 3/4] perf arm-spe: Correctly set sample flags Graham Woodward
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Graham Woodward @ 2024-10-25 10:52 UTC
  To: acme, namhyung, mark.rutland, jolsa, irogers, james.clark,
	mike.leach, leo.yan, linux-perf-users, linux-kernel,
	linux-arm-kernel
  Cc: nd, Graham Woodward

Instead of checking the type for just branch misses, we can instead
check for the OP_BRANCH_ERET and synthesise branches as well as
branch misses.

Signed-off-by: Graham Woodward <graham.woodward@arm.com>
---
 tools/perf/util/arm-spe.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index 76d41c91f..d27500c53 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -69,7 +69,7 @@ struct arm_spe {
 	u64				llc_access_id;
 	u64				tlb_miss_id;
 	u64				tlb_access_id;
-	u64				branch_miss_id;
+	u64				branch_id;
 	u64				remote_access_id;
 	u64				memory_id;
 	u64				instructions_id;
@@ -601,8 +601,8 @@ static int arm_spe_sample(struct arm_spe_queue *speq)
 		}
 	}
 
-	if (spe->sample_branch && (record->type & ARM_SPE_BRANCH_MISS)) {
-		err = arm_spe__synth_branch_sample(speq, spe->branch_miss_id);
+	if (spe->sample_branch && (record->op & ARM_SPE_OP_BRANCH_ERET)) {
+		err = arm_spe__synth_branch_sample(speq, spe->branch_id);
 		if (err)
 			return err;
 	}
@@ -1202,12 +1202,12 @@ arm_spe_synth_events(struct arm_spe *spe, struct perf_session *session)
 	if (spe->synth_opts.branches) {
 		spe->sample_branch = true;
 
-		/* Branch miss */
+		/* Branch */
 		err = perf_session__deliver_synth_attr_event(session, &attr, id);
 		if (err)
 			return err;
-		spe->branch_miss_id = id;
-		arm_spe_set_event_name(evlist, id, "branch-miss");
+		spe->branch_id = id;
+		arm_spe_set_event_name(evlist, id, "branch");
 		id += 1;
 	}
 
-- 
2.40.1
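
The difference between the old and new conditions in arm_spe_sample() can
be sketched in isolation as below. This is an illustration only: the
SKETCH_* bit values are placeholders (the real constants are defined in
perf's SPE decoder header), and the struct and function names are
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values, NOT the real perf definitions. */
#define SKETCH_TYPE_BRANCH_MISS	(1u << 0)	/* event type: mispredicted */
#define SKETCH_OP_BRANCH_ERET	(1u << 1)	/* operation: branch or ERET */

/* Hypothetical stand-in for the record fields tested by the patch. */
struct spe_branch_record_sketch {
	uint32_t type;	/* event bits reported for the sampled operation */
	uint32_t op;	/* operation class of the sampled instruction */
};

/* Old behaviour: only mispredicted branches produced a sample. */
static bool old_synth_branch(const struct spe_branch_record_sketch *rec,
			     bool sample_branch)
{
	return sample_branch && (rec->type & SKETCH_TYPE_BRANCH_MISS);
}

/* New behaviour: every branch/ERET operation produces a sample. */
static bool new_synth_branch(const struct spe_branch_record_sketch *rec,
			     bool sample_branch)
{
	return sample_branch && (rec->op & SKETCH_OP_BRANCH_ERET);
}
```

A correctly predicted branch (op bit set, miss bit clear) is dropped by the
old check and kept by the new one, which is why the event is renamed from
"branch-miss" to "branch".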



* [PATCH v10 3/4] perf arm-spe: Correctly set sample flags
  2024-10-25 10:52 [PATCH v10 0/4] perf arm-spe: Allow synthesizing of branch instructions Graham Woodward
  2024-10-25 10:52 ` [PATCH v10 1/4] perf arm-spe: Set sample.addr to target address for instruction sample Graham Woodward
  2024-10-25 10:52 ` [PATCH v10 2/4] perf arm-spe: Use ARM_SPE_OP_BRANCH_ERET when synthesizing branches Graham Woodward
@ 2024-10-25 10:52 ` Graham Woodward
  2024-10-25 10:52 ` [PATCH v10 4/4] perf arm-spe: Update --itrace help text Graham Woodward
  2024-10-25 13:08 ` [PATCH v10 0/4] perf arm-spe: Allow synthesizing of branch instructions James Clark
  4 siblings, 0 replies; 6+ messages in thread
From: Graham Woodward @ 2024-10-25 10:52 UTC
  To: acme, namhyung, mark.rutland, jolsa, irogers, james.clark,
	mike.leach, leo.yan, linux-perf-users, linux-kernel,
	linux-arm-kernel
  Cc: nd, Graham Woodward

Set flags on all synthesized instruction and branch samples.

Signed-off-by: Graham Woodward <graham.woodward@arm.com>
---
 tools/perf/builtin-script.c |  1 +
 tools/perf/util/arm-spe.c   | 17 +++++++++++++++++
 tools/perf/util/event.h     |  1 +
 3 files changed, 19 insertions(+)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index a644787fa..6f3db0737 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1728,6 +1728,7 @@ static struct {
 	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_TRACE_END, "tr end"},
 	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL | PERF_IP_FLAG_VMENTRY, "vmentry"},
 	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL | PERF_IP_FLAG_VMEXIT, "vmexit"},
+	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_BRANCH_MISS, "br miss"},
 	{0, NULL}
 };
 
diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index d27500c53..830ab653f 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -96,6 +96,7 @@ struct arm_spe_queue {
 	u64				timestamp;
 	struct thread			*thread;
 	u64				period_instructions;
+	u32				flags;
 };
 
 static void arm_spe_dump(struct arm_spe *spe __maybe_unused,
@@ -376,6 +377,7 @@ static int arm_spe__synth_branch_sample(struct arm_spe_queue *speq,
 	sample.stream_id = spe_events_id;
 	sample.addr = record->to_ip;
 	sample.weight = record->latency;
+	sample.flags = speq->flags;
 
 	return arm_spe_deliver_synth_event(spe, speq, event, &sample);
 }
@@ -405,10 +407,24 @@ static int arm_spe__synth_instruction_sample(struct arm_spe_queue *speq,
 	sample.data_src = data_src;
 	sample.period = spe->instructions_sample_period;
 	sample.weight = record->latency;
+	sample.flags = speq->flags;
 
 	return arm_spe_deliver_synth_event(spe, speq, event, &sample);
 }
 
+static void arm_spe__sample_flags(struct arm_spe_queue *speq)
+{
+	const struct arm_spe_record *record = &speq->decoder->record;
+
+	speq->flags = 0;
+	if (record->op & ARM_SPE_OP_BRANCH_ERET) {
+		speq->flags = PERF_IP_FLAG_BRANCH;
+
+		if (record->type & ARM_SPE_BRANCH_MISS)
+			speq->flags |= PERF_IP_FLAG_BRANCH_MISS;
+	}
+}
+
 static const struct midr_range neoverse_spe[] = {
 	MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N1),
 	MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2),
@@ -551,6 +567,7 @@ static int arm_spe_sample(struct arm_spe_queue *speq)
 	u64 data_src;
 	int err;
 
+	arm_spe__sample_flags(speq);
 	data_src = arm_spe__synth_data_source(record, spe->midr);
 
 	if (spe->sample_flc) {
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index f8742e623..2744c54f4 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -66,6 +66,7 @@ enum {
 	PERF_IP_FLAG_VMEXIT		= 1ULL << 12,
 	PERF_IP_FLAG_INTR_DISABLE	= 1ULL << 13,
 	PERF_IP_FLAG_INTR_TOGGLE	= 1ULL << 14,
+	PERF_IP_FLAG_BRANCH_MISS	= 1ULL << 15,
 };
 
 #define PERF_IP_FLAG_CHARS "bcrosyiABExghDt"
-- 
2.40.1
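
As a rough standalone illustration of the flag logic added in
arm_spe__sample_flags(), the sketch below computes the same result as a pure
function. The SKETCH_* names are hypothetical. SKETCH_IP_FLAG_BRANCH_MISS
uses bit 15 as in the event.h hunk above; the other bit positions are
assumptions made for the example and should not be read as perf's actual
values.

```c
#include <stdint.h>

/* Bit 15 follows the event.h hunk; the BRANCH bit is an assumption. */
#define SKETCH_IP_FLAG_BRANCH		(1u << 0)
#define SKETCH_IP_FLAG_BRANCH_MISS	(1u << 15)

/* Placeholder record bits, NOT the real SPE decoder definitions. */
#define SKETCH_REC_TYPE_BRANCH_MISS	(1u << 0)
#define SKETCH_REC_OP_BRANCH_ERET	(1u << 1)

/* Hypothetical stand-in for the record fields read by the patch. */
struct spe_flags_record_sketch {
	uint32_t type;
	uint32_t op;
};

/* Mirrors the patch: flags stay 0 unless the operation is a branch. */
static uint32_t sketch_sample_flags(const struct spe_flags_record_sketch *rec)
{
	uint32_t flags = 0;

	if (rec->op & SKETCH_REC_OP_BRANCH_ERET) {
		flags = SKETCH_IP_FLAG_BRANCH;

		/* The miss bit is only meaningful on top of a branch. */
		if (rec->type & SKETCH_REC_TYPE_BRANCH_MISS)
			flags |= SKETCH_IP_FLAG_BRANCH_MISS;
	}

	return flags;
}
```

Note that, as in the patch, a miss bit on a record that is not a branch
operation leaves the flags at 0, so "br miss" can only appear together with
the branch flag.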



* [PATCH v10 4/4] perf arm-spe: Update --itrace help text
  2024-10-25 10:52 [PATCH v10 0/4] perf arm-spe: Allow synthesizing of branch instructions Graham Woodward
                   ` (2 preceding siblings ...)
  2024-10-25 10:52 ` [PATCH v10 3/4] perf arm-spe: Correctly set sample flags Graham Woodward
@ 2024-10-25 10:52 ` Graham Woodward
  2024-10-25 13:08 ` [PATCH v10 0/4] perf arm-spe: Allow synthesizing of branch instructions James Clark
  4 siblings, 0 replies; 6+ messages in thread
From: Graham Woodward @ 2024-10-25 10:52 UTC
  To: acme, namhyung, mark.rutland, jolsa, irogers, james.clark,
	mike.leach, leo.yan, linux-perf-users, linux-kernel,
	linux-arm-kernel
  Cc: nd, Graham Woodward

The --itrace help now needs updating to reflect that
the --itrace=b argument synthesises branches as well
as branch misses.

Signed-off-by: Graham Woodward <graham.woodward@arm.com>
---
 tools/perf/Documentation/itrace.txt       | 2 +-
 tools/perf/Documentation/perf-arm-spe.txt | 2 +-
 tools/perf/util/auxtrace.h                | 3 +--
 3 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/tools/perf/Documentation/itrace.txt b/tools/perf/Documentation/itrace.txt
index 19cc179be..40476b227 100644
--- a/tools/perf/Documentation/itrace.txt
+++ b/tools/perf/Documentation/itrace.txt
@@ -1,6 +1,6 @@
 		i	synthesize instructions events
 		y	synthesize cycles events
-		b	synthesize branches events (branch misses for Arm SPE)
+		b	synthesize branches events
 		c	synthesize branches events (calls only)
 		r	synthesize branches events (returns only)
 		x	synthesize transactions events
diff --git a/tools/perf/Documentation/perf-arm-spe.txt b/tools/perf/Documentation/perf-arm-spe.txt
index 0a3eda482..de2b0b479 100644
--- a/tools/perf/Documentation/perf-arm-spe.txt
+++ b/tools/perf/Documentation/perf-arm-spe.txt
@@ -187,7 +187,7 @@ groups:
   7 llc-access
   2 tlb-miss
   1K tlb-access
-  36 branch-miss
+  36 branch
   0 remote-access
   900 memory
 
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index a1895a4f5..dddaf4f3f 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -75,7 +75,6 @@ enum itrace_period_type {
  *          (not fully accurate, since CYC packets are only emitted
  *          together with other events, such as branches)
  * @branches: whether to synthesize 'branches' events
- *            (branch misses only for Arm SPE)
  * @transactions: whether to synthesize events for transactions
  * @ptwrites: whether to synthesize events for ptwrites
  * @pwr_events: whether to synthesize power events
@@ -650,7 +649,7 @@ bool auxtrace__evsel_is_auxtrace(struct perf_session *session,
 #define ITRACE_HELP \
 "				i[period]:    		synthesize instructions events\n" \
 "				y[period]:    		synthesize cycles events (same period as i)\n" \
-"				b:	    		synthesize branches events (branch misses for Arm SPE)\n" \
+"				b:	    		synthesize branches events\n" \
 "				c:	    		synthesize branches events (calls only)\n"	\
 "				r:	    		synthesize branches events (returns only)\n" \
 "				x:	    		synthesize transactions events\n"		\
-- 
2.40.1



* Re: [PATCH v10 0/4] perf arm-spe: Allow synthesizing of branch instructions
  2024-10-25 10:52 [PATCH v10 0/4] perf arm-spe: Allow synthesizing of branch instructions Graham Woodward
                   ` (3 preceding siblings ...)
  2024-10-25 10:52 ` [PATCH v10 4/4] perf arm-spe: Update --itrace help text Graham Woodward
@ 2024-10-25 13:08 ` James Clark
  4 siblings, 0 replies; 6+ messages in thread
From: James Clark @ 2024-10-25 13:08 UTC
  To: Graham Woodward
  Cc: nd, acme, namhyung, mark.rutland, jolsa, irogers, mike.leach,
	leo.yan, linux-perf-users, linux-kernel, linux-arm-kernel



On 25/10/2024 11:52 am, Graham Woodward wrote:
> Currently the --itrace=b will only show branch-misses but this change
> allows perf to synthesize branches as well.
> 
> The change also incorporates the ability to display the target
> addresses when specifying the addr field if the instruction is a branch.
> 
> Graham Woodward (4):
>    perf arm-spe: Set sample.addr to target address for instruction sample
>    perf arm-spe: Use ARM_SPE_OP_BRANCH_ERET when synthesizing branches
>    perf arm-spe: Correctly set sample flags
>    perf arm-spe: Update --itrace help text
> 
>   tools/perf/Documentation/itrace.txt       |  2 +-
>   tools/perf/Documentation/perf-arm-spe.txt |  2 +-
>   tools/perf/builtin-script.c               |  1 +
>   tools/perf/util/arm-spe.c                 | 31 ++++++++++++++++++-----
>   tools/perf/util/auxtrace.h                |  3 +--
>   tools/perf/util/event.h                   |  1 +
>   6 files changed, 29 insertions(+), 11 deletions(-)
> 

Hi Graham,

I think this is V1? Also, it looks like the base doesn't include a few of
the new SPE changes, so it doesn't apply cleanly. Make sure it's based
off the latest (46610ba41ef1).

With that LGTM:

Reviewed-by: James Clark <james.clark@linaro.org>


