* [PATCH v19 00/11] arm64/perf: Enable branch stack sampling
@ 2025-02-03 0:42 Rob Herring (Arm)
2025-02-03 0:42 ` [PATCH v19 01/11] perf: arm_pmuv3: Call kvm_vcpu_pmu_resync_el0() before enabling counters Rob Herring (Arm)
` (10 more replies)
0 siblings, 11 replies; 43+ messages in thread
From: Rob Herring (Arm) @ 2025-02-03 0:42 UTC (permalink / raw)
To: Will Deacon, Mark Rutland, Catalin Marinas, Jonathan Corbet,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, James Clark, Anshuman Khandual
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm, Mark Brown
This series enables perf branch stack sampling support on arm64 via a
v9.2 arch feature called Branch Record Buffer Extension (BRBE). Details
on BRBE can be found in the Arm ARM[1] chapter D18.
I've picked up this series from Anshuman. This version has been reworked
quite a bit by Mark and myself. The bulk of those changes are in patch
11.
Patches 1-7 are new clean-ups/prep which stand on their own. They
were previously posted here[2]. Please pick them up if there are no issues
with them.
Patches 8-11 add BRBE support with the actual support in patch 11.
[1] https://developer.arm.com/documentation/ddi0487/latest/
[2] https://lore.kernel.org/all/20250107-arm-pmu-cleanups-v1-v1-0-313951346a25@kernel.org/
v19:
- Drop saving of branch records when a task is scheduled out (Mark).
Make the sched_task() callback actually get called: enabling it
requires a call to perf_sched_cb_inc(), so previously the saving of
branch records never happened.
- Got rid of added armpmu ops. All BRBE support is contained within
pmuv3 code.
- Fix freeze on overflow for VHE
- The cycle counter doesn't freeze BRBE on overflow, so avoid assigning
it when BRBE is enabled.
- Drop all the Arm-specific exception branches. There is no clear need
for them.
- Fix handling of branch 'cycles' reading. CC field is
mantissa/exponent, not an integer.
- Rework s/w filtering to better match h/w filtering
- Reject events with disjoint event filter and branch filter or with
exclude_host set
- Dropped perf test patch which has been applied for 6.14
- Dropped patch "KVM: arm64: Explicitly handle BRBE traps as UNDEFINED"
which has been applied for 6.14
v18:
- https://lore.kernel.org/all/20240613061731.3109448-1-anshuman.khandual@arm.com/
For v1-v17, see the above link. Not going to duplicate it all here...
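
To illustrate the v19 item about the 'cycles' reading, here is a minimal sketch of decoding a mantissa/exponent CC field. The field positions (CCU bit 46, CC_EXP bits 45:40, CC_MANT bits 39:32) come from the BRBINFx_EL1 layout in patch 08. The decode rule itself (implicit leading one when the exponent is non-zero) and the helper name brbinf_cycles_sketch() are assumptions for illustration only and are not taken from patch 11.

```c
#include <stdint.h>

/* Field positions as listed for BRBINFx_EL1 in patch 08. */
#define BRBINF_CCU_BIT		46
#define BRBINF_CC_EXP_SHIFT	40	/* bits 45:40 */
#define BRBINF_CC_EXP_MASK	0x3fULL
#define BRBINF_CC_MANT_SHIFT	32	/* bits 39:32 */
#define BRBINF_CC_MANT_MASK	0xffULL

/*
 * Hypothetical helper, not the driver's code.
 * Assumed encoding: exp == 0 means the mantissa is the count as-is;
 * otherwise an implicit leading one is prepended to the mantissa and
 * the result is shifted left by (exp - 1).
 */
static uint64_t brbinf_cycles_sketch(uint64_t brbinf)
{
	uint64_t exp, mant;

	/* CCU set: the count is unknown, so report nothing. */
	if (brbinf & (1ULL << BRBINF_CCU_BIT))
		return 0;

	exp = (brbinf >> BRBINF_CC_EXP_SHIFT) & BRBINF_CC_EXP_MASK;
	mant = (brbinf >> BRBINF_CC_MANT_SHIFT) & BRBINF_CC_MANT_MASK;

	if (exp == 0)
		return mant;

	/* A 9-bit value shifted by more than 55 no longer fits in 64 bits. */
	if (exp - 1 > 55)
		return UINT64_MAX;

	return (mant | 0x100) << (exp - 1);
}
```

Under these assumptions, exp = 2 with mant = 1 decodes to 514, whereas reading the same 14 bits as a plain integer would give 513.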
Signed-off-by: "Rob Herring (Arm)" <robh@kernel.org>
---
Anshuman Khandual (4):
arm64/sysreg: Add BRBE registers and fields
arm64: Handle BRBE booting requirements
KVM: arm64: nvhe: Disable branch generation in nVHE guests
perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE)
Mark Rutland (3):
perf: arm_pmu: Don't disable counter in armpmu_add()
perf: arm_pmuv3: Don't disable counter in armv8pmu_enable_event()
perf: arm_pmu: Move PMUv3-specific data
Rob Herring (Arm) (4):
perf: arm_pmuv3: Call kvm_vcpu_pmu_resync_el0() before enabling counters
perf: arm_v7_pmu: Drop obvious comments for enabling/disabling counters and interrupts
perf: arm_v7_pmu: Don't disable counter in (armv7|krait_|scorpion_)pmu_enable_event()
perf: apple_m1: Don't disable counter in m1_pmu_enable_event()
Documentation/arch/arm64/booting.rst | 21 +
arch/arm64/include/asm/el2_setup.h | 86 +++-
arch/arm64/include/asm/kvm_host.h | 2 +
arch/arm64/include/asm/sysreg.h | 17 +-
arch/arm64/kvm/debug.c | 4 +
arch/arm64/kvm/hyp/nvhe/debug-sr.c | 32 ++
arch/arm64/tools/sysreg | 132 ++++++
drivers/perf/Kconfig | 11 +
drivers/perf/Makefile | 1 +
drivers/perf/apple_m1_cpu_pmu.c | 4 -
drivers/perf/arm_brbe.c | 794 +++++++++++++++++++++++++++++++++++
drivers/perf/arm_brbe.h | 47 +++
drivers/perf/arm_pmu.c | 23 +-
drivers/perf/arm_pmuv3.c | 96 ++++-
drivers/perf/arm_v7_pmu.c | 50 ---
include/linux/perf/arm_pmu.h | 21 +-
16 files changed, 1250 insertions(+), 91 deletions(-)
---
base-commit: 2014c95afecee3e76ca4a56956a936e23283f05b
change-id: 20250129-arm-brbe-v19-24d5d9e5e623
Best regards,
--
Rob Herring (Arm) <robh@kernel.org>
* [PATCH v19 01/11] perf: arm_pmuv3: Call kvm_vcpu_pmu_resync_el0() before enabling counters
2025-02-03 0:42 [PATCH v19 00/11] arm64/perf: Enable branch stack sampling Rob Herring (Arm)
@ 2025-02-03 0:42 ` Rob Herring (Arm)
2025-02-03 4:07 ` Anshuman Khandual
2025-02-03 0:42 ` [PATCH v19 02/11] perf: arm_pmu: Don't disable counter in armpmu_add() Rob Herring (Arm)
` (9 subsequent siblings)
10 siblings, 1 reply; 43+ messages in thread
From: Rob Herring (Arm) @ 2025-02-03 0:42 UTC (permalink / raw)
To: Will Deacon, Mark Rutland, Catalin Marinas, Jonathan Corbet,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, James Clark, Anshuman Khandual
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm
Counting events related to setup of the PMU is not desired, but
kvm_vcpu_pmu_resync_el0() is called just after the PMU counters have
been enabled. Move the call to before enabling the counters.
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
---
drivers/perf/arm_pmuv3.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/perf/arm_pmuv3.c b/drivers/perf/arm_pmuv3.c
index 0e360feb3432..9ebc950559c0 100644
--- a/drivers/perf/arm_pmuv3.c
+++ b/drivers/perf/arm_pmuv3.c
@@ -825,10 +825,10 @@ static void armv8pmu_start(struct arm_pmu *cpu_pmu)
else
armv8pmu_disable_user_access();
+ kvm_vcpu_pmu_resync_el0();
+
/* Enable all counters */
armv8pmu_pmcr_write(armv8pmu_pmcr_read() | ARMV8_PMU_PMCR_E);
-
- kvm_vcpu_pmu_resync_el0();
}
static void armv8pmu_stop(struct arm_pmu *cpu_pmu)
--
2.47.2
* [PATCH v19 02/11] perf: arm_pmu: Don't disable counter in armpmu_add()
2025-02-03 0:42 [PATCH v19 00/11] arm64/perf: Enable branch stack sampling Rob Herring (Arm)
2025-02-03 0:42 ` [PATCH v19 01/11] perf: arm_pmuv3: Call kvm_vcpu_pmu_resync_el0() before enabling counters Rob Herring (Arm)
@ 2025-02-03 0:42 ` Rob Herring (Arm)
2025-02-03 6:04 ` Anshuman Khandual
2025-02-03 0:42 ` [PATCH v19 03/11] perf: arm_pmuv3: Don't disable counter in armv8pmu_enable_event() Rob Herring (Arm)
` (8 subsequent siblings)
10 siblings, 1 reply; 43+ messages in thread
From: Rob Herring (Arm) @ 2025-02-03 0:42 UTC (permalink / raw)
To: Will Deacon, Mark Rutland, Catalin Marinas, Jonathan Corbet,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, James Clark, Anshuman Khandual
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm
From: Mark Rutland <mark.rutland@arm.com>
Currently armpmu_add() tries to handle a newly-allocated counter having
a stale associated event, but this should not be possible, and if this
were to happen the current mitigation is insufficient and potentially
expensive. It would be better to warn if we encounter the impossible
case.
Calls to pmu::add() and pmu::del() are serialized by the core perf code,
and armpmu_del() clears the relevant slot in pmu_hw_events::events[]
before clearing the bit in pmu_hw_events::used_mask such that the
counter can be reallocated. Thus when armpmu_add() allocates a counter
index from pmu_hw_events::used_mask, it should not be possible to observe
a stale event in pmu_hw_events::events[] unless either
pmu_hw_events::used_mask or pmu_hw_events::events[] have been corrupted.
If this were to happen, we'd end up with two events with the same
event->hw.idx, which would clash with each other during reprogramming,
deletion, etc, and produce bogus results. Add a WARN_ON_ONCE() for this
case so that we can detect if this ever occurs in practice.
That possibility aside, there's no need to call arm_pmu::disable(event)
for the new event. The PMU reset code initialises the counter in a
disabled state, and armpmu_del() will disable the counter before it can
be reused. Remove the redundant disable.
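
The ordering argument above can be modelled in a few lines of stand-alone C. This is only a sketch: the *_sketch types and functions are hypothetical stand-ins for pmu_hw_events, armpmu_add() and armpmu_del(), and a plain counter stands in for WARN_ON_ONCE().

```c
#include <stddef.h>

#define NR_COUNTERS_SKETCH 4

struct event_sketch {
	int idx;
};

/* Hypothetical stand-in for struct pmu_hw_events. */
struct hw_events_sketch {
	unsigned long used_mask;
	struct event_sketch *events[NR_COUNTERS_SKETCH];
	unsigned int stale_warnings;	/* stands in for WARN_ON_ONCE() */
};

/* Allocate the first free counter and install the event in its slot. */
static int add_sketch(struct hw_events_sketch *hw, struct event_sketch *ev)
{
	for (int idx = 0; idx < NR_COUNTERS_SKETCH; idx++) {
		if (hw->used_mask & (1UL << idx))
			continue;

		hw->used_mask |= 1UL << idx;

		/* A freshly allocated counter should have an empty slot. */
		if (hw->events[idx])
			hw->stale_warnings++;

		ev->idx = idx;
		hw->events[idx] = ev;
		return idx;
	}

	return -1;	/* no free counter */
}

/* Clear the slot first, then release the counter for reallocation. */
static void del_sketch(struct hw_events_sketch *hw, struct event_sketch *ev)
{
	hw->events[ev->idx] = NULL;
	hw->used_mask &= ~(1UL << ev->idx);
	ev->idx = -1;
}
```

As long as add and del are serialized and del clears the slot before the mask bit, the stale check in add_sketch() never fires; it could only trip if one of the two structures were corrupted.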
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
---
drivers/perf/arm_pmu.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 398cce3d76fc..2f33e69a8caf 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -342,12 +342,10 @@ armpmu_add(struct perf_event *event, int flags)
if (idx < 0)
return idx;
- /*
- * If there is an event in the counter we are going to use then make
- * sure it is disabled.
- */
+ /* The newly-allocated counter should be empty */
+ WARN_ON_ONCE(hw_events->events[idx]);
+
event->hw.idx = idx;
- armpmu->disable(event);
hw_events->events[idx] = event;
hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE;
--
2.47.2
* [PATCH v19 03/11] perf: arm_pmuv3: Don't disable counter in armv8pmu_enable_event()
2025-02-03 0:42 [PATCH v19 00/11] arm64/perf: Enable branch stack sampling Rob Herring (Arm)
2025-02-03 0:42 ` [PATCH v19 01/11] perf: arm_pmuv3: Call kvm_vcpu_pmu_resync_el0() before enabling counters Rob Herring (Arm)
2025-02-03 0:42 ` [PATCH v19 02/11] perf: arm_pmu: Don't disable counter in armpmu_add() Rob Herring (Arm)
@ 2025-02-03 0:42 ` Rob Herring (Arm)
2025-02-03 6:38 ` Anshuman Khandual
2025-02-03 0:42 ` [PATCH v19 04/11] perf: arm_v7_pmu: Drop obvious comments for enabling/disabling counters and interrupts Rob Herring (Arm)
` (7 subsequent siblings)
10 siblings, 1 reply; 43+ messages in thread
From: Rob Herring (Arm) @ 2025-02-03 0:42 UTC (permalink / raw)
To: Will Deacon, Mark Rutland, Catalin Marinas, Jonathan Corbet,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, James Clark, Anshuman Khandual
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm
From: Mark Rutland <mark.rutland@arm.com>
Currently armv8pmu_enable_event() starts by disabling the event counter
it has been asked to enable. This should not be necessary as the counter
(and the PMU as a whole) should not be active when
armv8pmu_enable_event() is called.
Remove the redundant call to armv8pmu_disable_event_counter(). At the
same time, remove the comment immediately above as everything it says
is obvious from the function names below.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
---
drivers/perf/arm_pmuv3.c | 5 -----
1 file changed, 5 deletions(-)
diff --git a/drivers/perf/arm_pmuv3.c b/drivers/perf/arm_pmuv3.c
index 9ebc950559c0..5406b9ca591a 100644
--- a/drivers/perf/arm_pmuv3.c
+++ b/drivers/perf/arm_pmuv3.c
@@ -795,11 +795,6 @@ static void armv8pmu_enable_user_access(struct arm_pmu *cpu_pmu)
static void armv8pmu_enable_event(struct perf_event *event)
{
- /*
- * Enable counter and interrupt, and set the counter to count
- * the event that we're interested in.
- */
- armv8pmu_disable_event_counter(event);
armv8pmu_write_event_type(event);
armv8pmu_enable_event_irq(event);
armv8pmu_enable_event_counter(event);
--
2.47.2
* [PATCH v19 04/11] perf: arm_v7_pmu: Drop obvious comments for enabling/disabling counters and interrupts
2025-02-03 0:42 [PATCH v19 00/11] arm64/perf: Enable branch stack sampling Rob Herring (Arm)
` (2 preceding siblings ...)
2025-02-03 0:42 ` [PATCH v19 03/11] perf: arm_pmuv3: Don't disable counter in armv8pmu_enable_event() Rob Herring (Arm)
@ 2025-02-03 0:42 ` Rob Herring (Arm)
2025-02-03 4:09 ` Anshuman Khandual
2025-02-03 0:42 ` [PATCH v19 05/11] perf: arm_v7_pmu: Don't disable counter in (armv7|krait_|scorpion_)pmu_enable_event() Rob Herring (Arm)
` (6 subsequent siblings)
10 siblings, 1 reply; 43+ messages in thread
From: Rob Herring (Arm) @ 2025-02-03 0:42 UTC (permalink / raw)
To: Will Deacon, Mark Rutland, Catalin Marinas, Jonathan Corbet,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, James Clark, Anshuman Khandual
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm
The function calls for enabling/disabling counters and interrupts are
pretty obvious as to what they are doing, and the comments don't add
any additional value.
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
---
drivers/perf/arm_v7_pmu.c | 44 --------------------------------------------
1 file changed, 44 deletions(-)
diff --git a/drivers/perf/arm_v7_pmu.c b/drivers/perf/arm_v7_pmu.c
index 420cadd108e7..7fa88e3b64e0 100644
--- a/drivers/perf/arm_v7_pmu.c
+++ b/drivers/perf/arm_v7_pmu.c
@@ -857,14 +857,6 @@ static void armv7pmu_enable_event(struct perf_event *event)
return;
}
- /*
- * Enable counter and interrupt, and set the counter to count
- * the event that we're interested in.
- */
-
- /*
- * Disable counter
- */
armv7_pmnc_disable_counter(idx);
/*
@@ -875,14 +867,7 @@ static void armv7pmu_enable_event(struct perf_event *event)
if (cpu_pmu->set_event_filter || idx != ARMV7_IDX_CYCLE_COUNTER)
armv7_pmnc_write_evtsel(idx, hwc->config_base);
- /*
- * Enable interrupt for this counter
- */
armv7_pmnc_enable_intens(idx);
-
- /*
- * Enable counter
- */
armv7_pmnc_enable_counter(idx);
}
@@ -898,18 +883,7 @@ static void armv7pmu_disable_event(struct perf_event *event)
return;
}
- /*
- * Disable counter and interrupt
- */
-
- /*
- * Disable counter
- */
armv7_pmnc_disable_counter(idx);
-
- /*
- * Disable interrupt for this counter
- */
armv7_pmnc_disable_intens(idx);
}
@@ -1476,12 +1450,6 @@ static void krait_pmu_enable_event(struct perf_event *event)
struct hw_perf_event *hwc = &event->hw;
int idx = hwc->idx;
- /*
- * Enable counter and interrupt, and set the counter to count
- * the event that we're interested in.
- */
-
- /* Disable counter */
armv7_pmnc_disable_counter(idx);
/*
@@ -1494,10 +1462,7 @@ static void krait_pmu_enable_event(struct perf_event *event)
else
armv7_pmnc_write_evtsel(idx, hwc->config_base);
- /* Enable interrupt for this counter */
armv7_pmnc_enable_intens(idx);
-
- /* Enable counter */
armv7_pmnc_enable_counter(idx);
}
@@ -1797,12 +1762,6 @@ static void scorpion_pmu_enable_event(struct perf_event *event)
struct hw_perf_event *hwc = &event->hw;
int idx = hwc->idx;
- /*
- * Enable counter and interrupt, and set the counter to count
- * the event that we're interested in.
- */
-
- /* Disable counter */
armv7_pmnc_disable_counter(idx);
/*
@@ -1815,10 +1774,7 @@ static void scorpion_pmu_enable_event(struct perf_event *event)
else if (idx != ARMV7_IDX_CYCLE_COUNTER)
armv7_pmnc_write_evtsel(idx, hwc->config_base);
- /* Enable interrupt for this counter */
armv7_pmnc_enable_intens(idx);
-
- /* Enable counter */
armv7_pmnc_enable_counter(idx);
}
--
2.47.2
* [PATCH v19 05/11] perf: arm_v7_pmu: Don't disable counter in (armv7|krait_|scorpion_)pmu_enable_event()
2025-02-03 0:42 [PATCH v19 00/11] arm64/perf: Enable branch stack sampling Rob Herring (Arm)
` (3 preceding siblings ...)
2025-02-03 0:42 ` [PATCH v19 04/11] perf: arm_v7_pmu: Drop obvious comments for enabling/disabling counters and interrupts Rob Herring (Arm)
@ 2025-02-03 0:42 ` Rob Herring (Arm)
2025-02-03 6:54 ` Anshuman Khandual
2025-02-03 0:43 ` [PATCH v19 06/11] perf: apple_m1: Don't disable counter in m1_pmu_enable_event() Rob Herring (Arm)
` (5 subsequent siblings)
10 siblings, 1 reply; 43+ messages in thread
From: Rob Herring (Arm) @ 2025-02-03 0:42 UTC (permalink / raw)
To: Will Deacon, Mark Rutland, Catalin Marinas, Jonathan Corbet,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, James Clark, Anshuman Khandual
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm
Currently (armv7|krait_|scorpion_)pmu_enable_event() start by disabling
the event counter they have been asked to enable. This should not be
necessary as the counter (and the PMU as a whole) should not be active
when *_enable_event() is called.
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
---
drivers/perf/arm_v7_pmu.c | 6 ------
1 file changed, 6 deletions(-)
diff --git a/drivers/perf/arm_v7_pmu.c b/drivers/perf/arm_v7_pmu.c
index 7fa88e3b64e0..17831e1920bd 100644
--- a/drivers/perf/arm_v7_pmu.c
+++ b/drivers/perf/arm_v7_pmu.c
@@ -857,8 +857,6 @@ static void armv7pmu_enable_event(struct perf_event *event)
return;
}
- armv7_pmnc_disable_counter(idx);
-
/*
* Set event (if destined for PMNx counters)
* We only need to set the event for the cycle counter if we
@@ -1450,8 +1448,6 @@ static void krait_pmu_enable_event(struct perf_event *event)
struct hw_perf_event *hwc = &event->hw;
int idx = hwc->idx;
- armv7_pmnc_disable_counter(idx);
-
/*
* Set event (if destined for PMNx counters)
* We set the event for the cycle counter because we
@@ -1762,8 +1758,6 @@ static void scorpion_pmu_enable_event(struct perf_event *event)
struct hw_perf_event *hwc = &event->hw;
int idx = hwc->idx;
- armv7_pmnc_disable_counter(idx);
-
/*
* Set event (if destined for PMNx counters)
* We don't set the event for the cycle counter because we
--
2.47.2
* [PATCH v19 06/11] perf: apple_m1: Don't disable counter in m1_pmu_enable_event()
2025-02-03 0:42 [PATCH v19 00/11] arm64/perf: Enable branch stack sampling Rob Herring (Arm)
` (4 preceding siblings ...)
2025-02-03 0:42 ` [PATCH v19 05/11] perf: arm_v7_pmu: Don't disable counter in (armv7|krait_|scorpion_)pmu_enable_event() Rob Herring (Arm)
@ 2025-02-03 0:43 ` Rob Herring (Arm)
2025-02-03 8:10 ` Anshuman Khandual
2025-02-03 0:43 ` [PATCH v19 07/11] perf: arm_pmu: Move PMUv3-specific data Rob Herring (Arm)
` (4 subsequent siblings)
10 siblings, 1 reply; 43+ messages in thread
From: Rob Herring (Arm) @ 2025-02-03 0:43 UTC (permalink / raw)
To: Will Deacon, Mark Rutland, Catalin Marinas, Jonathan Corbet,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, James Clark, Anshuman Khandual
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm
Currently m1_pmu_enable_event() starts by disabling the event counter
it has been asked to enable. This should not be necessary as the
counter (and the PMU as a whole) should not be active when
m1_pmu_enable_event() is called.
Cc: Marc Zyngier <maz@kernel.org>
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
---
drivers/perf/apple_m1_cpu_pmu.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/drivers/perf/apple_m1_cpu_pmu.c b/drivers/perf/apple_m1_cpu_pmu.c
index 06fd317529fc..39349ecec3c1 100644
--- a/drivers/perf/apple_m1_cpu_pmu.c
+++ b/drivers/perf/apple_m1_cpu_pmu.c
@@ -396,10 +396,6 @@ static void m1_pmu_enable_event(struct perf_event *event)
user = event->hw.config_base & M1_PMU_CFG_COUNT_USER;
kernel = event->hw.config_base & M1_PMU_CFG_COUNT_KERNEL;
- m1_pmu_disable_counter_interrupt(event->hw.idx);
- m1_pmu_disable_counter(event->hw.idx);
- isb();
-
m1_pmu_configure_counter(event->hw.idx, evt, user, kernel);
m1_pmu_enable_counter(event->hw.idx);
m1_pmu_enable_counter_interrupt(event->hw.idx);
--
2.47.2
* [PATCH v19 07/11] perf: arm_pmu: Move PMUv3-specific data
2025-02-03 0:42 [PATCH v19 00/11] arm64/perf: Enable branch stack sampling Rob Herring (Arm)
` (5 preceding siblings ...)
2025-02-03 0:43 ` [PATCH v19 06/11] perf: apple_m1: Don't disable counter in m1_pmu_enable_event() Rob Herring (Arm)
@ 2025-02-03 0:43 ` Rob Herring (Arm)
2025-02-03 8:16 ` Anshuman Khandual
2025-02-03 0:43 ` [PATCH v19 08/11] arm64/sysreg: Add BRBE registers and fields Rob Herring (Arm)
` (3 subsequent siblings)
10 siblings, 1 reply; 43+ messages in thread
From: Rob Herring (Arm) @ 2025-02-03 0:43 UTC (permalink / raw)
To: Will Deacon, Mark Rutland, Catalin Marinas, Jonathan Corbet,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, James Clark, Anshuman Khandual
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm
From: Mark Rutland <mark.rutland@arm.com>
A few fields in struct arm_pmu are only used with PMUv3, and soon we
will need to add more for BRBE. Group the fields together so that we
have a logical place to add more data in future.
At the same time, remove the comment for reg_pmmir as it doesn't convey
anything useful.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
---
include/linux/perf/arm_pmu.h | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 4b5b83677e3f..c70d528594f2 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -84,7 +84,6 @@ struct arm_pmu {
struct pmu pmu;
cpumask_t supported_cpus;
char *name;
- int pmuver;
irqreturn_t (*handle_irq)(struct arm_pmu *pmu);
void (*enable)(struct perf_event *event);
void (*disable)(struct perf_event *event);
@@ -102,18 +101,20 @@ struct arm_pmu {
int (*map_event)(struct perf_event *event);
DECLARE_BITMAP(cntr_mask, ARMPMU_MAX_HWEVENTS);
bool secure_access; /* 32-bit ARM only */
-#define ARMV8_PMUV3_MAX_COMMON_EVENTS 0x40
- DECLARE_BITMAP(pmceid_bitmap, ARMV8_PMUV3_MAX_COMMON_EVENTS);
-#define ARMV8_PMUV3_EXT_COMMON_EVENT_BASE 0x4000
- DECLARE_BITMAP(pmceid_ext_bitmap, ARMV8_PMUV3_MAX_COMMON_EVENTS);
struct platform_device *plat_device;
struct pmu_hw_events __percpu *hw_events;
struct hlist_node node;
struct notifier_block cpu_pm_nb;
/* the attr_groups array must be NULL-terminated */
const struct attribute_group *attr_groups[ARMPMU_NR_ATTR_GROUPS + 1];
- /* store the PMMIR_EL1 to expose slots */
+
+ /* PMUv3 only */
+ int pmuver;
u64 reg_pmmir;
+#define ARMV8_PMUV3_MAX_COMMON_EVENTS 0x40
+ DECLARE_BITMAP(pmceid_bitmap, ARMV8_PMUV3_MAX_COMMON_EVENTS);
+#define ARMV8_PMUV3_EXT_COMMON_EVENT_BASE 0x4000
+ DECLARE_BITMAP(pmceid_ext_bitmap, ARMV8_PMUV3_MAX_COMMON_EVENTS);
/* Only to be used by ACPI probing code */
unsigned long acpi_cpuid;
--
2.47.2
* [PATCH v19 08/11] arm64/sysreg: Add BRBE registers and fields
2025-02-03 0:42 [PATCH v19 00/11] arm64/perf: Enable branch stack sampling Rob Herring (Arm)
` (6 preceding siblings ...)
2025-02-03 0:43 ` [PATCH v19 07/11] perf: arm_pmu: Move PMUv3-specific data Rob Herring (Arm)
@ 2025-02-03 0:43 ` Rob Herring (Arm)
2025-02-03 8:32 ` Anshuman Khandual
2025-02-03 0:43 ` [PATCH v19 09/11] arm64: Handle BRBE booting requirements Rob Herring (Arm)
` (2 subsequent siblings)
10 siblings, 1 reply; 43+ messages in thread
From: Rob Herring (Arm) @ 2025-02-03 0:43 UTC (permalink / raw)
To: Will Deacon, Mark Rutland, Catalin Marinas, Jonathan Corbet,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, James Clark, Anshuman Khandual
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm, Mark Brown
From: Anshuman Khandual <anshuman.khandual@arm.com>
This patch adds definitions related to the Branch Record Buffer Extension
(BRBE) as per ARM DDI 0487K.a. These will be used by KVM and a BRBE driver
in subsequent patches.
Some existing BRBE definitions in asm/sysreg.h are replaced with equivalent
generated definitions.
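
For readers unfamiliar with the encodings, the following sketch shows how the per-record macros that remain in asm/sysreg.h map a record index n onto CRm and op2. The expressions for CRm and op2 are copied from the SYS_BRB{INF,SRC,TGT}_EL1(n) macros in this patch; sys_reg_sketch() is a hypothetical stand-in for the kernel's sys_reg() macro, and its shift values (19, 16, 12, 8, 5) are stated here as an assumption about that macro rather than taken from this patch.

```c
#include <stdint.h>

/* Hypothetical stand-in for sys_reg(op0, op1, crn, crm, op2). */
static uint32_t sys_reg_sketch(uint32_t op0, uint32_t op1, uint32_t crn,
			       uint32_t crm, uint32_t op2)
{
	return (op0 << 19) | (op1 << 16) | (crn << 12) | (crm << 8) | (op2 << 5);
}

/* The low four bits of n select CRm; bit 4 of n becomes bit 2 of op2. */
static uint32_t brbinf_el1_enc(unsigned int n)
{
	return sys_reg_sketch(2, 1, 8, n & 15, ((n & 16) >> 2) | 0);
}

static uint32_t brbsrc_el1_enc(unsigned int n)
{
	return sys_reg_sketch(2, 1, 8, n & 15, ((n & 16) >> 2) | 1);
}

static uint32_t brbtgt_el1_enc(unsigned int n)
{
	return sys_reg_sketch(2, 1, 8, n & 15, ((n & 16) >> 2) | 2);
}
```

With this scheme, record 17 of BRBSRC lands at CRm = 1, op2 = 5, and the 32 records of each kind fit in CRn = 8 without colliding with the control registers in CRn = 9.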
Cc: Marc Zyngier <maz@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
---
v19:
- split BRBINF.CC field into mantissa and exponent
---
arch/arm64/include/asm/sysreg.h | 17 ++----
arch/arm64/tools/sysreg | 132 ++++++++++++++++++++++++++++++++++++++++
2 files changed, 138 insertions(+), 11 deletions(-)
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 05ea5223d2d5..a8257e13f8f1 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -198,16 +198,8 @@
#define SYS_DBGVCR32_EL2 sys_reg(2, 4, 0, 7, 0)
#define SYS_BRBINF_EL1(n) sys_reg(2, 1, 8, (n & 15), (((n & 16) >> 2) | 0))
-#define SYS_BRBINFINJ_EL1 sys_reg(2, 1, 9, 1, 0)
#define SYS_BRBSRC_EL1(n) sys_reg(2, 1, 8, (n & 15), (((n & 16) >> 2) | 1))
-#define SYS_BRBSRCINJ_EL1 sys_reg(2, 1, 9, 1, 1)
#define SYS_BRBTGT_EL1(n) sys_reg(2, 1, 8, (n & 15), (((n & 16) >> 2) | 2))
-#define SYS_BRBTGTINJ_EL1 sys_reg(2, 1, 9, 1, 2)
-#define SYS_BRBTS_EL1 sys_reg(2, 1, 9, 0, 2)
-
-#define SYS_BRBCR_EL1 sys_reg(2, 1, 9, 0, 0)
-#define SYS_BRBFCR_EL1 sys_reg(2, 1, 9, 0, 1)
-#define SYS_BRBIDR0_EL1 sys_reg(2, 1, 9, 2, 0)
#define SYS_TRCITECR_EL1 sys_reg(3, 0, 1, 2, 3)
#define SYS_TRCACATR(m) sys_reg(2, 1, 2, ((m & 7) << 1), (2 | (m >> 3)))
@@ -273,8 +265,6 @@
/* ETM */
#define SYS_TRCOSLAR sys_reg(2, 1, 1, 0, 4)
-#define SYS_BRBCR_EL2 sys_reg(2, 4, 9, 0, 0)
-
#define SYS_MIDR_EL1 sys_reg(3, 0, 0, 0, 0)
#define SYS_MPIDR_EL1 sys_reg(3, 0, 0, 0, 5)
#define SYS_REVIDR_EL1 sys_reg(3, 0, 0, 0, 6)
@@ -610,7 +600,6 @@
#define SYS_CNTHV_CVAL_EL2 sys_reg(3, 4, 14, 3, 2)
/* VHE encodings for architectural EL0/1 system registers */
-#define SYS_BRBCR_EL12 sys_reg(2, 5, 9, 0, 0)
#define SYS_SCTLR_EL12 sys_reg(3, 5, 1, 0, 0)
#define SYS_CPACR_EL12 sys_reg(3, 5, 1, 0, 2)
#define SYS_SCTLR2_EL12 sys_reg(3, 5, 1, 0, 3)
@@ -821,6 +810,12 @@
#define OP_COSP_RCTX sys_insn(1, 3, 7, 3, 6)
#define OP_CPP_RCTX sys_insn(1, 3, 7, 3, 7)
+/*
+ * BRBE Instructions
+ */
+#define BRB_IALL_INSN __emit_inst(0xd5000000 | OP_BRB_IALL | (0x1f))
+#define BRB_INJ_INSN __emit_inst(0xd5000000 | OP_BRB_INJ | (0x1f))
+
/* Common SCTLR_ELx flags. */
#define SCTLR_ELx_ENTP2 (BIT(60))
#define SCTLR_ELx_DSSBS (BIT(44))
diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
index 762ee084b37c..c0943579977a 100644
--- a/arch/arm64/tools/sysreg
+++ b/arch/arm64/tools/sysreg
@@ -1038,6 +1038,138 @@ UnsignedEnum 3:0 MTEPERM
EndEnum
EndSysreg
+
+SysregFields BRBINFx_EL1
+Res0 63:47
+Field 46 CCU
+Field 45:40 CC_EXP
+Field 39:32 CC_MANT
+Res0 31:18
+Field 17 LASTFAILED
+Field 16 T
+Res0 15:14
+Enum 13:8 TYPE
+ 0b000000 DIRECT_UNCOND
+ 0b000001 INDIRECT
+ 0b000010 DIRECT_LINK
+ 0b000011 INDIRECT_LINK
+ 0b000101 RET
+ 0b000111 ERET
+ 0b001000 DIRECT_COND
+ 0b100001 DEBUG_HALT
+ 0b100010 CALL
+ 0b100011 TRAP
+ 0b100100 SERROR
+ 0b100110 INSN_DEBUG
+ 0b100111 DATA_DEBUG
+ 0b101010 ALIGN_FAULT
+ 0b101011 INSN_FAULT
+ 0b101100 DATA_FAULT
+ 0b101110 IRQ
+ 0b101111 FIQ
+ 0b110000 IMPDEF_TRAP_EL3
+ 0b111001 DEBUG_EXIT
+EndEnum
+Enum 7:6 EL
+ 0b00 EL0
+ 0b01 EL1
+ 0b10 EL2
+ 0b11 EL3
+EndEnum
+Field 5 MPRED
+Res0 4:2
+Enum 1:0 VALID
+ 0b00 NONE
+ 0b01 TARGET
+ 0b10 SOURCE
+ 0b11 FULL
+EndEnum
+EndSysregFields
+
+SysregFields BRBCR_ELx
+Res0 63:24
+Field 23 EXCEPTION
+Field 22 ERTN
+Res0 21:10
+Field 9 FZPSS
+Field 8 FZP
+Res0 7
+Enum 6:5 TS
+ 0b01 VIRTUAL
+ 0b10 GUEST_PHYSICAL
+ 0b11 PHYSICAL
+EndEnum
+Field 4 MPRED
+Field 3 CC
+Res0 2
+Field 1 ExBRE
+Field 0 E0BRE
+EndSysregFields
+
+Sysreg BRBCR_EL1 2 1 9 0 0
+Fields BRBCR_ELx
+EndSysreg
+
+Sysreg BRBFCR_EL1 2 1 9 0 1
+Res0 63:30
+Enum 29:28 BANK
+ 0b00 BANK_0
+ 0b01 BANK_1
+EndEnum
+Res0 27:23
+Field 22 CONDDIR
+Field 21 DIRCALL
+Field 20 INDCALL
+Field 19 RTN
+Field 18 INDIRECT
+Field 17 DIRECT
+Field 16 EnI
+Res0 15:8
+Field 7 PAUSED
+Field 6 LASTFAILED
+Res0 5:0
+EndSysreg
+
+Sysreg BRBTS_EL1 2 1 9 0 2
+Field 63:0 TS
+EndSysreg
+
+Sysreg BRBINFINJ_EL1 2 1 9 1 0
+Fields BRBINFx_EL1
+EndSysreg
+
+Sysreg BRBSRCINJ_EL1 2 1 9 1 1
+Field 63:0 ADDRESS
+EndSysreg
+
+Sysreg BRBTGTINJ_EL1 2 1 9 1 2
+Field 63:0 ADDRESS
+EndSysreg
+
+Sysreg BRBIDR0_EL1 2 1 9 2 0
+Res0 63:16
+Enum 15:12 CC
+ 0b0101 20_BIT
+EndEnum
+Enum 11:8 FORMAT
+ 0b0000 FORMAT_0
+EndEnum
+Enum 7:0 NUMREC
+ 0b00001000 8
+ 0b00010000 16
+ 0b00100000 32
+ 0b01000000 64
+EndEnum
+EndSysreg
+
+Sysreg BRBCR_EL2 2 4 9 0 0
+Fields BRBCR_ELx
+EndSysreg
+
+Sysreg BRBCR_EL12 2 5 9 0 0
+Fields BRBCR_ELx
+EndSysreg
+
Sysreg ID_AA64ZFR0_EL1 3 0 0 4 4
Res0 63:60
UnsignedEnum 59:56 F64MM
--
2.47.2
* [PATCH v19 09/11] arm64: Handle BRBE booting requirements
2025-02-03 0:42 [PATCH v19 00/11] arm64/perf: Enable branch stack sampling Rob Herring (Arm)
` (7 preceding siblings ...)
2025-02-03 0:43 ` [PATCH v19 08/11] arm64/sysreg: Add BRBE registers and fields Rob Herring (Arm)
@ 2025-02-03 0:43 ` Rob Herring (Arm)
2025-02-03 8:47 ` Anshuman Khandual
2025-02-12 12:10 ` Leo Yan
2025-02-03 0:43 ` [PATCH v19 10/11] KVM: arm64: nvhe: Disable branch generation in nVHE guests Rob Herring (Arm)
2025-02-03 0:43 ` [PATCH v19 11/11] perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE) Rob Herring (Arm)
10 siblings, 2 replies; 43+ messages in thread
From: Rob Herring (Arm) @ 2025-02-03 0:43 UTC (permalink / raw)
To: Will Deacon, Mark Rutland, Catalin Marinas, Jonathan Corbet,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, James Clark, Anshuman Khandual
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm
From: Anshuman Khandual <anshuman.khandual@arm.com>
To use the Branch Record Buffer Extension (BRBE), some configuration is
necessary at EL3 and EL2. This patch documents the requirements and adds
the initial EL2 setup code, which largely consists of configuring the
fine-grained traps and initializing a couple of BRBE control registers.
Before this patch, __init_el2_fgt() would initialize HDFGRTR_EL2 and
HDFGWTR_EL2 with the same value, relying on the read/write trap controls
for a register occupying the same bit position in either register. The
'nBRBIDR' trap control only exists in bit 59 of HDFGRTR_EL2, while bit
59 of HDFGWTR_EL2 is RES0, and so this assumption no longer holds.
To handle HDFGRTR_EL2 and HDFGWTR_EL2 having (slightly) different bit
layouts, __init_el2_fgt() is changed to accumulate the HDFGRTR_EL2 and
HDFGWTR_EL2 control bits separately. While making this change the
open-coded value (1 << 62) is replaced with
HDFG{R,W}TR_EL2_nPMSNEVFR_EL1_MASK.
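
The separate accumulation can be pictured with the following C sketch. The bit positions (62, 61, 60, 59) are the ones given in this patch and in the booting.rst hunk; the struct, the helper name and its boolean parameters are hypothetical, and the real code is the assembly macro in el2_setup.h.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT64_SKETCH(n)		(1ULL << (n))

/* Present at the same position in both HDFGRTR_EL2 and HDFGWTR_EL2. */
#define FGT_nPMSNEVFR_EL1	BIT64_SKETCH(62)
#define FGT_nBRBDATA		BIT64_SKETCH(61)
#define FGT_nBRBCTL		BIT64_SKETCH(60)
/* Only present in HDFGRTR_EL2; bit 59 of HDFGWTR_EL2 is RES0. */
#define FGT_R_nBRBIDR		BIT64_SKETCH(59)

struct fgt_masks_sketch {
	uint64_t hdfgrtr;	/* read trap controls */
	uint64_t hdfgwtr;	/* write trap controls */
};

/* Hypothetical model: accumulate the read and write controls separately. */
static struct fgt_masks_sketch build_fgt_sketch(bool has_pmsnevfr, bool has_brbe)
{
	struct fgt_masks_sketch m = { 0, 0 };

	if (has_pmsnevfr) {
		m.hdfgrtr |= FGT_nPMSNEVFR_EL1;
		m.hdfgwtr |= FGT_nPMSNEVFR_EL1;
	}

	if (has_brbe) {
		m.hdfgrtr |= FGT_nBRBDATA | FGT_nBRBCTL | FGT_R_nBRBIDR;
		m.hdfgwtr |= FGT_nBRBDATA | FGT_nBRBCTL;
	}

	return m;
}
```

With BRBE present the two values differ in exactly bit 59, which is why a single shared value can no longer be written to both registers.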
The BRBCR_EL1 and BRBCR_EL2 registers are unusual and require special
initialisation: even though they are subject to E2H renaming, both have
an effect regardless of HCR_EL2.TGE, even when running at EL2, and
consequently both need to be initialised. This is handled in
__init_el2_brbe() with a comment to explain the situation.
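
As a rough model of why both registers are initialised: the comment added by this patch states that cycle counts and mispredict information are only recorded when the corresponding bit is set in both BRBCR_EL1 and BRBCR_EL2. The sketch below encodes just that rule. The bit positions (CC bit 3, MPRED bit 4) are from patch 08; the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define BRBCR_CC_SKETCH		(1ULL << 3)
#define BRBCR_MPRED_SKETCH	(1ULL << 4)

/* Cycle counts are recorded only if CC is set in both registers. */
static bool brbe_records_cycles_sketch(uint64_t brbcr_el1, uint64_t brbcr_el2)
{
	return (brbcr_el1 & brbcr_el2 & BRBCR_CC_SKETCH) != 0;
}

/* Likewise, mispredict info needs MPRED set in both registers. */
static bool brbe_records_mpred_sketch(uint64_t brbcr_el1, uint64_t brbcr_el2)
{
	return (brbcr_el1 & brbcr_el2 & BRBCR_MPRED_SKETCH) != 0;
}
```

Setting both bits in BRBCR_EL2 once at boot therefore leaves BRBCR_EL1 as the only register that has to be toggled at runtime.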
Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
[Mark: rewrite commit message, fix typo in comment]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
---
Documentation/arch/arm64/booting.rst | 21 +++++++++
arch/arm64/include/asm/el2_setup.h | 86 ++++++++++++++++++++++++++++++++++--
2 files changed, 104 insertions(+), 3 deletions(-)
diff --git a/Documentation/arch/arm64/booting.rst b/Documentation/arch/arm64/booting.rst
index cad6fdc96b98..0a421757cacf 100644
--- a/Documentation/arch/arm64/booting.rst
+++ b/Documentation/arch/arm64/booting.rst
@@ -352,6 +352,27 @@ Before jumping into the kernel, the following conditions must be met:
- HWFGWTR_EL2.nSMPRI_EL1 (bit 54) must be initialised to 0b01.
+ For CPUs with the Branch Record Buffer Extension feature (FEAT_BRBE):
+
+ - If EL3 is present:
+
+ - MDCR_EL3.SBRBE (bits 33:32) must be initialised to 0b11.
+
+ - If the kernel is entered at EL1 and EL2 is present:
+
+ - BRBCR_EL2.CC (bit 3) must be initialised to 0b1.
+ - BRBCR_EL2.MPRED (bit 4) must be initialised to 0b1.
+
+ - HDFGRTR_EL2.nBRBDATA (bit 61) must be initialised to 0b1.
+ - HDFGRTR_EL2.nBRBCTL (bit 60) must be initialised to 0b1.
+ - HDFGRTR_EL2.nBRBIDR (bit 59) must be initialised to 0b1.
+
+ - HDFGWTR_EL2.nBRBDATA (bit 61) must be initialised to 0b1.
+ - HDFGWTR_EL2.nBRBCTL (bit 60) must be initialised to 0b1.
+
+ - HFGITR_EL2.nBRBIALL (bit 56) must be initialised to 0b1.
+ - HFGITR_EL2.nBRBINJ (bit 55) must be initialised to 0b1.
+
For CPUs with the Scalable Matrix Extension FA64 feature (FEAT_SME_FA64):
- If EL3 is present:
diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h
index 25e162651750..bf21ce513aff 100644
--- a/arch/arm64/include/asm/el2_setup.h
+++ b/arch/arm64/include/asm/el2_setup.h
@@ -163,6 +163,39 @@
.Lskip_set_cptr_\@:
.endm
+/*
+ * Configure BRBE to permit recording cycle counts and branch mispredicts.
+ *
+ * At any EL, to record cycle counts BRBE requires that both BRBCR_EL2.CC=1 and
+ * BRBCR_EL1.CC=1.
+ *
+ * At any EL, to record branch mispredicts BRBE requires that both
+ * BRBCR_EL2.MPRED=1 and BRBCR_EL1.MPRED=1.
+ *
+ * When HCR_EL2.E2H=1, the BRBCR_EL1 encoding is redirected to BRBCR_EL2, but
+ * the {CC,MPRED} bits in the real BRBCR_EL1 register still apply.
+ *
+ * Set {CC,MPRED} in both BRBCR_EL2 and BRBCR_EL1 so that at runtime we only
+ * need to enable/disable these in BRBCR_EL1 regardless of whether the kernel
+ * ends up executing in EL1 or EL2.
+ */
+.macro __init_el2_brbe
+ mrs x1, id_aa64dfr0_el1
+ ubfx x1, x1, #ID_AA64DFR0_EL1_BRBE_SHIFT, #4
+ cbz x1, .Lskip_brbe_\@
+
+ mov_q x0, BRBCR_ELx_CC | BRBCR_ELx_MPRED
+ msr_s SYS_BRBCR_EL2, x0
+
+ __check_hvhe .Lset_brbe_nvhe_\@, x1
+ msr_s SYS_BRBCR_EL12, x0 // VHE
+ b .Lskip_brbe_\@
+
+.Lset_brbe_nvhe_\@:
+ msr_s SYS_BRBCR_EL1, x0 // NVHE
+.Lskip_brbe_\@:
+.endm
+
/* Disable any fine grained traps */
.macro __init_el2_fgt
mrs x1, id_aa64mmfr0_el1
@@ -170,16 +203,48 @@
cbz x1, .Lskip_fgt_\@
mov x0, xzr
+ mov x2, xzr
mrs x1, id_aa64dfr0_el1
ubfx x1, x1, #ID_AA64DFR0_EL1_PMSVer_SHIFT, #4
cmp x1, #3
b.lt .Lskip_spe_fgt_\@
+
/* Disable PMSNEVFR_EL1 read and write traps */
- orr x0, x0, #(1 << 62)
+ orr x0, x0, #HDFGRTR_EL2_nPMSNEVFR_EL1_MASK
+ orr x2, x2, #HDFGWTR_EL2_nPMSNEVFR_EL1_MASK
.Lskip_spe_fgt_\@:
+#ifdef CONFIG_ARM64_BRBE
+ mrs x1, id_aa64dfr0_el1
+ ubfx x1, x1, #ID_AA64DFR0_EL1_BRBE_SHIFT, #4
+ cbz x1, .Lskip_brbe_reg_fgt_\@
+
+ /*
+ * Disable read traps for the following registers
+ *
+ * [BRBSRC|BRBTGT|BRBINF]_EL1
+ * [BRBSRCINJ|BRBTGTINJ|BRBINFINJ|BRBTS]_EL1
+ */
+ orr x0, x0, #HDFGRTR_EL2_nBRBDATA_MASK
+
+ /*
+ * Disable write traps for the following registers
+ *
+ * [BRBSRCINJ|BRBTGTINJ|BRBINFINJ|BRBTS]_EL1
+ */
+ orr x2, x2, #HDFGWTR_EL2_nBRBDATA_MASK
+
+ /* Disable read and write traps for [BRBCR|BRBFCR]_EL1 */
+ orr x0, x0, #HDFGRTR_EL2_nBRBCTL_MASK
+ orr x2, x2, #HDFGWTR_EL2_nBRBCTL_MASK
+
+ /* Disable read traps for BRBIDR_EL1 */
+ orr x0, x0, #HDFGRTR_EL2_nBRBIDR_MASK
+
+.Lskip_brbe_reg_fgt_\@:
+#endif /* CONFIG_ARM64_BRBE */
msr_s SYS_HDFGRTR_EL2, x0
- msr_s SYS_HDFGWTR_EL2, x0
+ msr_s SYS_HDFGWTR_EL2, x2
mov x0, xzr
mrs x1, id_aa64pfr1_el1
@@ -220,7 +285,21 @@
.Lset_fgt_\@:
msr_s SYS_HFGRTR_EL2, x0
msr_s SYS_HFGWTR_EL2, x0
- msr_s SYS_HFGITR_EL2, xzr
+ mov x0, xzr
+#ifdef CONFIG_ARM64_BRBE
+ mrs x1, id_aa64dfr0_el1
+ ubfx x1, x1, #ID_AA64DFR0_EL1_BRBE_SHIFT, #4
+ cbz x1, .Lskip_brbe_insn_fgt_\@
+
+ /* Disable traps for BRBIALL instruction */
+ orr x0, x0, #HFGITR_EL2_nBRBIALL_MASK
+
+ /* Disable traps for BRBINJ instruction */
+ orr x0, x0, #HFGITR_EL2_nBRBINJ_MASK
+
+.Lskip_brbe_insn_fgt_\@:
+#endif /* CONFIG_ARM64_BRBE */
+ msr_s SYS_HFGITR_EL2, x0
mrs x1, id_aa64pfr0_el1 // AMU traps UNDEF without AMU
ubfx x1, x1, #ID_AA64PFR0_EL1_AMU_SHIFT, #4
@@ -275,6 +354,7 @@
__init_el2_hcrx
__init_el2_timers
__init_el2_debug
+ __init_el2_brbe
__init_el2_lor
__init_el2_stage2
__init_el2_gicv3
--
2.47.2
* [PATCH v19 10/11] KVM: arm64: nvhe: Disable branch generation in nVHE guests
2025-02-03 0:42 [PATCH v19 00/11] arm64/perf: Enable branch stack sampling Rob Herring (Arm)
` (8 preceding siblings ...)
2025-02-03 0:43 ` [PATCH v19 09/11] arm64: Handle BRBE booting requirements Rob Herring (Arm)
@ 2025-02-03 0:43 ` Rob Herring (Arm)
2025-02-03 9:16 ` Anshuman Khandual
` (2 more replies)
2025-02-03 0:43 ` [PATCH v19 11/11] perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE) Rob Herring (Arm)
10 siblings, 3 replies; 43+ messages in thread
From: Rob Herring (Arm) @ 2025-02-03 0:43 UTC (permalink / raw)
To: Will Deacon, Mark Rutland, Catalin Marinas, Jonathan Corbet,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, James Clark, Anshuman Khandual
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm
From: Anshuman Khandual <anshuman.khandual@arm.com>
While BRBE can record branches within guests, the host recording
branches in guests is not supported by perf. Therefore, BRBE needs to be
disabled on guest entry and restored on exit.
For nVHE, this requires explicit handling for guests. Before
entering a guest, save the BRBE state and disable it. When
returning to the host, restore the state.
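The nVHE save/restore sequence described above can be sketched as a
small userspace model. This is only an illustration, not the hypervisor
code: the register is modelled as a plain variable, and every sketch_*
and SKETCH_* name, including the bit positions, is a placeholder
invented for this example.

```c
#include <stdint.h>

/* Placeholder enable bits; the real layout comes from the kernel's sysreg definitions. */
#define SKETCH_E0BRE (UINT64_C(1) << 0)
#define SKETCH_ExBRE (UINT64_C(1) << 1)

/* Plain variable standing in for the BRBCR_EL1 system register. */
static uint64_t sketch_brbcr_el1;

/* Guest entry: record the host value only if recording was on, then turn it off. */
static void sketch_save_brbe(uint64_t *saved)
{
	*saved = 0;

	if (!(sketch_brbcr_el1 & (SKETCH_E0BRE | SKETCH_ExBRE)))
		return;

	*saved = sketch_brbcr_el1;
	sketch_brbcr_el1 = 0;
}

/* Guest exit: a saved value of zero means recording was off, so leave the register alone. */
static void sketch_restore_brbe(uint64_t saved)
{
	if (!saved)
		return;

	sketch_brbcr_el1 = saved;
}
```

Using zero as the "nothing saved" marker works in this model because a
value is only ever saved when at least one enable bit is set.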
For VHE, it is not necessary. We initialize
BRBCR_EL1.{E1BRE,E0BRE}=={0,0} at boot time, and HCR_EL2.TGE==1 while
running in the host. We configure BRBCR_EL2.{E2BRE,E0HBRE} to enable
branch recording in the host. When entering the guest, we set
HCR_EL2.TGE==0 which means BRBCR_EL1 is used instead of BRBCR_EL2.
Consequently for VHE, BRBE recording is disabled at EL1 and EL0 when
running a guest.
Should recording in guests (by the host) ever be desired, the perf ABI
will need to be extended to distinguish guest addresses (struct
perf_branch_entry.priv) for starters. BRBE records would also need to be
invalidated on guest entry/exit as guest/host EL1 and EL0 records can't
be distinguished.
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Co-developed-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
---
v19:
- Rework due to v6.14 debug flag changes
- Redo commit message
---
arch/arm64/include/asm/kvm_host.h | 2 ++
arch/arm64/kvm/debug.c | 4 ++++
arch/arm64/kvm/hyp/nvhe/debug-sr.c | 32 ++++++++++++++++++++++++++++++++
3 files changed, 38 insertions(+)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 7cfa024de4e3..4fc246a1ee6b 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -619,6 +619,7 @@ struct kvm_host_data {
#define KVM_HOST_DATA_FLAG_HOST_SME_ENABLED 3
#define KVM_HOST_DATA_FLAG_TRBE_ENABLED 4
#define KVM_HOST_DATA_FLAG_EL1_TRACING_CONFIGURED 5
+#define KVM_HOST_DATA_FLAG_HAS_BRBE 6
unsigned long flags;
struct kvm_cpu_context host_ctxt;
@@ -662,6 +663,7 @@ struct kvm_host_data {
u64 trfcr_el1;
/* Values of trap registers for the host before guest entry. */
u64 mdcr_el2;
+ u64 brbcr_el1;
} host_debug_state;
/* Guest trace filter value */
diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
index 0e4c805e7e89..bc6015108a68 100644
--- a/arch/arm64/kvm/debug.c
+++ b/arch/arm64/kvm/debug.c
@@ -81,6 +81,10 @@ void kvm_init_host_debug_data(void)
!(read_sysreg_s(SYS_PMBIDR_EL1) & PMBIDR_EL1_P))
host_data_set_flag(HAS_SPE);
+ /* Check if we have BRBE implemented and available at the host */
+ if (cpuid_feature_extract_unsigned_field(dfr0, ID_AA64DFR0_EL1_BRBE_SHIFT))
+ host_data_set_flag(HAS_BRBE);
+
if (cpuid_feature_extract_unsigned_field(dfr0, ID_AA64DFR0_EL1_TraceFilt_SHIFT)) {
/* Force disable trace in protected mode in case of no TRBE */
if (is_protected_kvm_enabled())
diff --git a/arch/arm64/kvm/hyp/nvhe/debug-sr.c b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
index 2f4a4f5036bb..2a1c0f49792b 100644
--- a/arch/arm64/kvm/hyp/nvhe/debug-sr.c
+++ b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
@@ -92,12 +92,42 @@ static void __trace_switch_to_host(void)
*host_data_ptr(host_debug_state.trfcr_el1));
}
+static void __debug_save_brbe(u64 *brbcr_el1)
+{
+ *brbcr_el1 = 0;
+
+ /* Check if the BRBE is enabled */
+ if (!(read_sysreg_el1(SYS_BRBCR) & (BRBCR_ELx_E0BRE | BRBCR_ELx_ExBRE)))
+ return;
+
+ /*
+ * Prohibit branch record generation while we are in guest.
+ * Since access to BRBCR_EL1 is trapped, the guest can't
+ * modify the filtering set by the host.
+ */
+ *brbcr_el1 = read_sysreg_el1(SYS_BRBCR);
+ write_sysreg_el1(0, SYS_BRBCR);
+}
+
+static void __debug_restore_brbe(u64 brbcr_el1)
+{
+ if (!brbcr_el1)
+ return;
+
+ /* Restore BRBE controls */
+ write_sysreg_el1(brbcr_el1, SYS_BRBCR);
+}
+
void __debug_save_host_buffers_nvhe(struct kvm_vcpu *vcpu)
{
/* Disable and flush SPE data generation */
if (host_data_test_flag(HAS_SPE))
__debug_save_spe(host_data_ptr(host_debug_state.pmscr_el1));
+ /* Disable BRBE branch records */
+ if (host_data_test_flag(HAS_BRBE))
+ __debug_save_brbe(host_data_ptr(host_debug_state.brbcr_el1));
+
if (__trace_needs_switch())
__trace_switch_to_guest();
}
@@ -111,6 +141,8 @@ void __debug_restore_host_buffers_nvhe(struct kvm_vcpu *vcpu)
{
if (host_data_test_flag(HAS_SPE))
__debug_restore_spe(*host_data_ptr(host_debug_state.pmscr_el1));
+ if (host_data_test_flag(HAS_BRBE))
+ __debug_restore_brbe(*host_data_ptr(host_debug_state.brbcr_el1));
if (__trace_needs_switch())
__trace_switch_to_host();
}
--
2.47.2
* [PATCH v19 11/11] perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE)
2025-02-03 0:42 [PATCH v19 00/11] arm64/perf: Enable branch stack sampling Rob Herring (Arm)
` (9 preceding siblings ...)
2025-02-03 0:43 ` [PATCH v19 10/11] KVM: arm64: nvhe: Disable branch generation in nVHE guests Rob Herring (Arm)
@ 2025-02-03 0:43 ` Rob Herring (Arm)
2025-02-03 16:53 ` James Clark
` (2 more replies)
10 siblings, 3 replies; 43+ messages in thread
From: Rob Herring (Arm) @ 2025-02-03 0:43 UTC (permalink / raw)
To: Will Deacon, Mark Rutland, Catalin Marinas, Jonathan Corbet,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, James Clark, Anshuman Khandual
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm
From: Anshuman Khandual <anshuman.khandual@arm.com>
The ARMv9.2 architecture introduces the optional Branch Record Buffer
Extension (BRBE), which records information about branches as they are
executed into a set of branch record registers. BRBE is similar to x86's
Last Branch Record (LBR) and PowerPC's Branch History Rolling Buffer
(BHRB).
BRBE supports filtering by exception level and can filter just the
source or target address if excluded to avoid leaking privileged
addresses. The h/w filter would be sufficient except when there are
multiple events with disjoint filtering requirements. In this case, BRBE
is configured with a union of all the events' desired branches, and then
the recorded branches are filtered based on each event's filter. For
example, with one event capturing kernel events and another event
capturing user events, BRBE will be configured to capture both kernel
and user branches. When handling event overflow, the branch records have
to be filtered by software to only include kernel or user branch
addresses for that event. In contrast, x86 simply configures LBR using
the last installed event which seems broken.
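To make the union-then-filter scheme above concrete, here is a minimal
userspace sketch. It is not the driver code: the flag values, the record
layout and all sketch_* names are assumptions made up for illustration.
The hardware filter is the OR of every event's filter, and each event
then keeps only the records matching its own filter.

```c
#include <stddef.h>

/* Placeholder privilege flags, not the PERF_SAMPLE_BRANCH_* values. */
enum { SKETCH_USER = 1 << 0, SKETCH_KERNEL = 1 << 1 };

/* Simplified branch record: only the privilege level it was captured at. */
struct sketch_record {
	int priv;	/* SKETCH_USER or SKETCH_KERNEL */
};

/* The hardware is programmed with the union of all the events' filters. */
static int sketch_hw_filter(const int *event_filters, size_t n)
{
	int merged = 0;

	for (size_t i = 0; i < n; i++)
		merged |= event_filters[i];

	return merged;
}

/* Copy the records one event asked for into out; returns how many were kept. */
static size_t sketch_sw_filter(int event_filter,
			       const struct sketch_record *in, size_t n,
			       struct sketch_record *out)
{
	size_t kept = 0;

	for (size_t i = 0; i < n; i++) {
		if (in[i].priv & event_filter)
			out[kept++] = in[i];
	}

	return kept;
}
```

With one kernel-only and one user-only event, sketch_hw_filter() yields
both flags, and sketch_sw_filter() hands each event only its own subset
of the captured records.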
The event and branch exception level filtering are separately
controlled. On x86, it is possible to request filtering which is
disjoint (e.g. kernel only event with user only branches). It is also
possible on x86 to configure branch filter such that no branches are
ever recorded (e.g. -j save_type). For BRBE, events with mismatched
exception level filtering or a configuration that will result in no
samples are rejected. This can be relaxed in the future if such a need
arises.
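The rejection rule above can be sketched as follows. This is a
simplified model that only covers the user and kernel levels and leaves
out the hypervisor case; struct sketch_attr and the SKETCH_BR_* flags
are hypothetical stand-ins for the perf attribute fields.

```c
#include <stdbool.h>

/* Placeholder branch privilege flags, not the perf ABI values. */
enum { SKETCH_BR_USER = 1 << 0, SKETCH_BR_KERNEL = 1 << 1 };

/* Hypothetical, cut-down view of an event's attributes. */
struct sketch_attr {
	bool exclude_user;
	bool exclude_kernel;
	int branch_filter;
};

/* Accept only events whose event filter and branch filter agree. */
static bool sketch_filters_match(const struct sketch_attr *a)
{
	bool br_user = (a->branch_filter & SKETCH_BR_USER) != 0;
	bool br_kernel = (a->branch_filter & SKETCH_BR_KERNEL) != 0;

	/* Excluding both levels could never produce a sample. */
	if (a->exclude_user && a->exclude_kernel)
		return false;

	/* Each level must be either sampled and recorded, or neither. */
	return a->exclude_user == !br_user &&
	       a->exclude_kernel == !br_kernel;
}
```

A kernel-only event that asks for user-only branches fails the second
check, which is the disjoint case mentioned above.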
The handling of KVM guests is similar to the above. On x86, branch
recording is always disabled when a guest is running. However,
requesting branch recording in guests is allowed. The guest events are
recorded, but the resulting branches are all from the host. For BRBE,
branch recording is similarly disabled when a guest is running. In
addition, events with branch recording and "exclude_host" set are
rejected. Requiring "exclude_guest" to be set did not work. The default
for the perf tool does set "exclude_guest" if no exception level
options are specified. However, specifying kernel or user defaults to
including both host and guest. In this case, only host branches are
recorded.
BRBE can support some additional exception, FIQ, and debug branch
types, but they are not supported currently. There's no control in the
perf ABI to enable/disable these branch types, so they could only be
enabled for the 'any' filter which might be undesired or unexpected.
Other architectures don't have any support for similar events (at least
with perf). These can be added in the future if there is demand by
adding additional specific filter types.
BRBE records are invalidated whenever events are reconfigured, a new
task is scheduled in, or after recording is paused (and the records
have been recorded for the event). The architecture allows branch
records to be invalidated by the PE under implementation defined
conditions. It is expected that these conditions are rare.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Co-developed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Co-developed-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
---
v19:
- Drop saving of branch records when task scheduled out. (Mark)
- Got rid of added armpmu ops. All BRBE support contained within pmuv3
code.
- Dropped armpmu.num_branch_records as reg_brbidr has same info.
- Make sched_task() callback actually get called. Enabling requires a
call to perf_sched_cb_inc().
- Fix freeze on overflow for VHE
- The cycle counter doesn't freeze BRBE on overflow, so avoid assigning
it when BRBE is enabled.
- Drop all the Arm specific exception branches. Not a clear need for
them.
- Simplify enable/disable to avoid RMW and document ISBs needed
- Fix handling of branch 'cycles' reading. CC field is
mantissa/exponent, not an integer.
- Save BRBFCR and BRBCR settings in event->hw.branch_reg.config and
event->hw.extra_reg.config to avoid recalculating the register value
each time the event is installed.
- Rework s/w filtering to better match h/w filtering
- Reject events with disjoint event filter and branch filter
- Reject events if exclude_host is set
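For reference, the mantissa/exponent decoding mentioned in the 'cycles'
item above can be sketched in plain C. This is a standalone
illustration, assuming the caller has already extracted an 8-bit
mantissa and a 6-bit exponent; sketch_decode_cycles() is a made-up name,
and the early saturation for large exponents is a choice made in this
sketch to keep the shift small.

```c
#include <stdint.h>

/*
 * Decode a cycle count given as mantissa and exponent:
 *   exp == 0: the count is the mantissa itself
 *   exp != 0: the count is (mant | 0x100) << (exp - 1)
 * The result saturates at UINT16_MAX.
 */
static uint16_t sketch_decode_cycles(uint32_t exp, uint32_t mant)
{
	mant &= 0xff;

	if (exp == 0)
		return (uint16_t)mant;

	/* A shift of 8 or more gives at least 0x10000, so saturate without shifting. */
	if (exp - 1 >= 8)
		return UINT16_MAX;

	/* At most 0x1ff << 7 == 0xff80 here, which still fits in 16 bits. */
	return (uint16_t)((mant | 0x100) << (exp - 1));
}
```

For example, exp=1 with mant=0 decodes to 256, and exp=2 with mant=0x80
decodes to 768.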
v18: https://lore.kernel.org/all/20240613061731.3109448-6-anshuman.khandual@arm.com/
---
drivers/perf/Kconfig | 11 +
drivers/perf/Makefile | 1 +
drivers/perf/arm_brbe.c | 794 +++++++++++++++++++++++++++++++++++++++++++
drivers/perf/arm_brbe.h | 47 +++
drivers/perf/arm_pmu.c | 15 +-
drivers/perf/arm_pmuv3.c | 87 ++++-
include/linux/perf/arm_pmu.h | 8 +
7 files changed, 958 insertions(+), 5 deletions(-)
diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
index 4e268de351c4..3be60ff4236d 100644
--- a/drivers/perf/Kconfig
+++ b/drivers/perf/Kconfig
@@ -223,6 +223,17 @@ config ARM_SPE_PMU
Extension, which provides periodic sampling of operations in
the CPU pipeline and reports this via the perf AUX interface.
+config ARM64_BRBE
+ bool "Enable support for branch stack sampling using FEAT_BRBE"
+ depends on ARM_PMUV3 && ARM64
+ default y
+ help
+ Enable perf support for Branch Record Buffer Extension (BRBE) which
+ records all branches taken in an execution path. This supports some
+ branch types and privilege based filtering. It captures additional
+ relevant information such as cycle count, misprediction and branch
+ type, branch privilege level etc.
+
config ARM_DMC620_PMU
tristate "Enable PMU support for the ARM DMC-620 memory controller"
depends on (ARM64 && ACPI) || COMPILE_TEST
diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
index de71d2574857..192fc8b16204 100644
--- a/drivers/perf/Makefile
+++ b/drivers/perf/Makefile
@@ -23,6 +23,7 @@ obj-$(CONFIG_STARFIVE_STARLINK_PMU) += starfive_starlink_pmu.o
obj-$(CONFIG_THUNDERX2_PMU) += thunderx2_pmu.o
obj-$(CONFIG_XGENE_PMU) += xgene_pmu.o
obj-$(CONFIG_ARM_SPE_PMU) += arm_spe_pmu.o
+obj-$(CONFIG_ARM64_BRBE) += arm_brbe.o
obj-$(CONFIG_ARM_DMC620_PMU) += arm_dmc620_pmu.o
obj-$(CONFIG_MARVELL_CN10K_TAD_PMU) += marvell_cn10k_tad_pmu.o
obj-$(CONFIG_MARVELL_CN10K_DDR_PMU) += marvell_cn10k_ddr_pmu.o
diff --git a/drivers/perf/arm_brbe.c b/drivers/perf/arm_brbe.c
new file mode 100644
index 000000000000..18eb9bfa1f9c
--- /dev/null
+++ b/drivers/perf/arm_brbe.c
@@ -0,0 +1,794 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Branch Record Buffer Extension Driver.
+ *
+ * Copyright (C) 2022-2025 ARM Limited
+ *
+ * Author: Anshuman Khandual <anshuman.khandual@arm.com>
+ */
+#include <linux/types.h>
+#include <linux/bitmap.h>
+#include <linux/perf/arm_pmu.h>
+#include "arm_brbe.h"
+
+#define BRBFCR_EL1_BRANCH_FILTERS (BRBFCR_EL1_DIRECT | \
+ BRBFCR_EL1_INDIRECT | \
+ BRBFCR_EL1_RTN | \
+ BRBFCR_EL1_INDCALL | \
+ BRBFCR_EL1_DIRCALL | \
+ BRBFCR_EL1_CONDDIR)
+
+/*
+ * BRBTS_EL1 is currently not used by the branch stack implementation,
+ * but BRBCR_ELx.TS still needs a valid value from the available
+ * options. BRBCR_ELx_TS_VIRTUAL is selected for this.
+ */
+#define BRBCR_ELx_DEFAULT_TS FIELD_PREP(BRBCR_ELx_TS_MASK, BRBCR_ELx_TS_VIRTUAL)
+
+/*
+ * BRBE Buffer Organization
+ *
+ * BRBE buffer is arranged as multiple banks of 32 branch record
+ * entries each. An individual branch record in a given bank could
+ * be accessed, after selecting the bank in BRBFCR_EL1.BANK and
+ * accessing the registers i.e [BRBSRC, BRBTGT, BRBINF] set with
+ * indices [0..31].
+ *
+ * Bank 0
+ *
+ * --------------------------------- ------
+ * | 00 | BRBSRC | BRBTGT | BRBINF | | 00 |
+ * --------------------------------- ------
+ * | 01 | BRBSRC | BRBTGT | BRBINF | | 01 |
+ * --------------------------------- ------
+ * | .. | BRBSRC | BRBTGT | BRBINF | | .. |
+ * --------------------------------- ------
+ * | 31 | BRBSRC | BRBTGT | BRBINF | | 31 |
+ * --------------------------------- ------
+ *
+ * Bank 1
+ *
+ * --------------------------------- ------
+ * | 32 | BRBSRC | BRBTGT | BRBINF | | 00 |
+ * --------------------------------- ------
+ * | 33 | BRBSRC | BRBTGT | BRBINF | | 01 |
+ * --------------------------------- ------
+ * | .. | BRBSRC | BRBTGT | BRBINF | | .. |
+ * --------------------------------- ------
+ * | 63 | BRBSRC | BRBTGT | BRBINF | | 31 |
+ * --------------------------------- ------
+ */
+#define BRBE_BANK_MAX_ENTRIES 32
+#define BRBE_MAX_BANK 2
+#define BRBE_MAX_ENTRIES (BRBE_BANK_MAX_ENTRIES * BRBE_MAX_BANK)
+
+struct brbe_regset {
+ unsigned long brbsrc;
+ unsigned long brbtgt;
+ unsigned long brbinf;
+};
+
+#define PERF_BR_ARM64_MAX (PERF_BR_MAX + PERF_BR_NEW_MAX)
+
+struct brbe_hw_attr {
+ int brbe_version;
+ int brbe_cc;
+ int brbe_nr;
+ int brbe_format;
+};
+
+#define BRBE_REGN_CASE(n, case_macro) \
+ case n: case_macro(n); break
+
+#define BRBE_REGN_SWITCH(x, case_macro) \
+ do { \
+ switch (x) { \
+ BRBE_REGN_CASE(0, case_macro); \
+ BRBE_REGN_CASE(1, case_macro); \
+ BRBE_REGN_CASE(2, case_macro); \
+ BRBE_REGN_CASE(3, case_macro); \
+ BRBE_REGN_CASE(4, case_macro); \
+ BRBE_REGN_CASE(5, case_macro); \
+ BRBE_REGN_CASE(6, case_macro); \
+ BRBE_REGN_CASE(7, case_macro); \
+ BRBE_REGN_CASE(8, case_macro); \
+ BRBE_REGN_CASE(9, case_macro); \
+ BRBE_REGN_CASE(10, case_macro); \
+ BRBE_REGN_CASE(11, case_macro); \
+ BRBE_REGN_CASE(12, case_macro); \
+ BRBE_REGN_CASE(13, case_macro); \
+ BRBE_REGN_CASE(14, case_macro); \
+ BRBE_REGN_CASE(15, case_macro); \
+ BRBE_REGN_CASE(16, case_macro); \
+ BRBE_REGN_CASE(17, case_macro); \
+ BRBE_REGN_CASE(18, case_macro); \
+ BRBE_REGN_CASE(19, case_macro); \
+ BRBE_REGN_CASE(20, case_macro); \
+ BRBE_REGN_CASE(21, case_macro); \
+ BRBE_REGN_CASE(22, case_macro); \
+ BRBE_REGN_CASE(23, case_macro); \
+ BRBE_REGN_CASE(24, case_macro); \
+ BRBE_REGN_CASE(25, case_macro); \
+ BRBE_REGN_CASE(26, case_macro); \
+ BRBE_REGN_CASE(27, case_macro); \
+ BRBE_REGN_CASE(28, case_macro); \
+ BRBE_REGN_CASE(29, case_macro); \
+ BRBE_REGN_CASE(30, case_macro); \
+ BRBE_REGN_CASE(31, case_macro); \
+ default: WARN(1, "Invalid BRB* index %d\n", x); \
+ } \
+ } while (0)
+
+#define RETURN_READ_BRBSRCN(n) \
+ return read_sysreg_s(SYS_BRBSRC_EL1(n))
+static inline u64 get_brbsrc_reg(int idx)
+{
+ BRBE_REGN_SWITCH(idx, RETURN_READ_BRBSRCN);
+ return 0;
+}
+
+#define RETURN_READ_BRBTGTN(n) \
+ return read_sysreg_s(SYS_BRBTGT_EL1(n))
+static u64 get_brbtgt_reg(int idx)
+{
+ BRBE_REGN_SWITCH(idx, RETURN_READ_BRBTGTN);
+ return 0;
+}
+
+#define RETURN_READ_BRBINFN(n) \
+ return read_sysreg_s(SYS_BRBINF_EL1(n))
+static u64 get_brbinf_reg(int idx)
+{
+ BRBE_REGN_SWITCH(idx, RETURN_READ_BRBINFN);
+ return 0;
+}
+
+static u64 brbe_record_valid(u64 brbinf)
+{
+ return FIELD_GET(BRBINFx_EL1_VALID_MASK, brbinf);
+}
+
+static bool brbe_invalid(u64 brbinf)
+{
+ return brbe_record_valid(brbinf) == BRBINFx_EL1_VALID_NONE;
+}
+
+static bool brbe_record_is_complete(u64 brbinf)
+{
+ return brbe_record_valid(brbinf) == BRBINFx_EL1_VALID_FULL;
+}
+
+static bool brbe_record_is_source_only(u64 brbinf)
+{
+ return brbe_record_valid(brbinf) == BRBINFx_EL1_VALID_SOURCE;
+}
+
+static bool brbe_record_is_target_only(u64 brbinf)
+{
+ return brbe_record_valid(brbinf) == BRBINFx_EL1_VALID_TARGET;
+}
+
+static int brbinf_get_in_tx(u64 brbinf)
+{
+ return FIELD_GET(BRBINFx_EL1_T_MASK, brbinf);
+}
+
+static int brbinf_get_mispredict(u64 brbinf)
+{
+ return FIELD_GET(BRBINFx_EL1_MPRED_MASK, brbinf);
+}
+
+static int brbinf_get_lastfailed(u64 brbinf)
+{
+ return FIELD_GET(BRBINFx_EL1_LASTFAILED_MASK, brbinf);
+}
+
+static u16 brbinf_get_cycles(u64 brbinf)
+{
+ u32 exp, mant, cycles;
+ /*
+ * Captured cycle count is unknown and hence
+ * should not be passed on to userspace.
+ */
+ if (brbinf & BRBINFx_EL1_CCU)
+ return 0;
+
+ exp = FIELD_GET(BRBINFx_EL1_CC_EXP_MASK, brbinf);
+ mant = FIELD_GET(BRBINFx_EL1_CC_MANT_MASK, brbinf);
+
+ if (!exp)
+ return mant;
+
+ cycles = (mant | 0x100) << (exp - 1);
+
+ return (cycles > U16_MAX) ? U16_MAX : cycles;
+}
+
+static int brbinf_get_type(u64 brbinf)
+{
+ return FIELD_GET(BRBINFx_EL1_TYPE_MASK, brbinf);
+}
+
+static int brbinf_get_el(u64 brbinf)
+{
+ return FIELD_GET(BRBINFx_EL1_EL_MASK, brbinf);
+}
+
+static void brbe_invalidate_nosync(void)
+{
+ asm volatile(BRB_IALL_INSN);
+}
+
+void brbe_invalidate(void)
+{
+ // Ensure all branches before this point are recorded
+ isb();
+ brbe_invalidate_nosync();
+ // Ensure all branch records are invalidated after this point
+ isb();
+}
+
+static bool valid_brbe_nr(int brbe_nr)
+{
+ return brbe_nr == BRBIDR0_EL1_NUMREC_8 ||
+ brbe_nr == BRBIDR0_EL1_NUMREC_16 ||
+ brbe_nr == BRBIDR0_EL1_NUMREC_32 ||
+ brbe_nr == BRBIDR0_EL1_NUMREC_64;
+}
+
+static bool valid_brbe_cc(int brbe_cc)
+{
+ return brbe_cc == BRBIDR0_EL1_CC_20_BIT;
+}
+
+static bool valid_brbe_format(int brbe_format)
+{
+ return brbe_format == BRBIDR0_EL1_FORMAT_FORMAT_0;
+}
+
+static bool valid_brbidr(u64 brbidr)
+{
+ int brbe_format, brbe_cc, brbe_nr;
+
+ brbe_format = FIELD_GET(BRBIDR0_EL1_FORMAT_MASK, brbidr);
+ brbe_cc = FIELD_GET(BRBIDR0_EL1_CC_MASK, brbidr);
+ brbe_nr = FIELD_GET(BRBIDR0_EL1_NUMREC_MASK, brbidr);
+
+ return valid_brbe_format(brbe_format) && valid_brbe_cc(brbe_cc) && valid_brbe_nr(brbe_nr);
+}
+
+static bool valid_brbe_version(int brbe_version)
+{
+ return brbe_version == ID_AA64DFR0_EL1_BRBE_IMP ||
+ brbe_version == ID_AA64DFR0_EL1_BRBE_BRBE_V1P1;
+}
+
+static void select_brbe_bank(int bank)
+{
+ u64 brbfcr;
+
+ brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
+ brbfcr &= ~BRBFCR_EL1_BANK_MASK;
+ brbfcr |= SYS_FIELD_PREP(BRBFCR_EL1, BANK, bank);
+ write_sysreg_s(brbfcr, SYS_BRBFCR_EL1);
+ /*
+ * Arm ARM (DDI 0487K.a) D.18.4 rule PPBZP requires explicit sync
+ * between setting BANK and accessing branch records.
+ */
+ isb();
+}
+
+static bool __read_brbe_regset(struct brbe_regset *entry, int idx)
+{
+ entry->brbinf = get_brbinf_reg(idx);
+
+ if (brbe_invalid(entry->brbinf))
+ return false;
+
+ entry->brbsrc = get_brbsrc_reg(idx);
+ entry->brbtgt = get_brbtgt_reg(idx);
+ return true;
+}
+
+/*
+ * Generic perf branch filters supported on BRBE
+ *
+ * New branch filters need to be evaluated to determine whether they can be
+ * supported on BRBE. This ensures that such filters are not simply accepted,
+ * only to fail silently. PERF_SAMPLE_BRANCH_HV is a special case that is
+ * supported only on platforms where the kernel is in hyp mode.
+ */
+#define BRBE_EXCLUDE_BRANCH_FILTERS (PERF_SAMPLE_BRANCH_ABORT_TX | \
+ PERF_SAMPLE_BRANCH_IN_TX | \
+ PERF_SAMPLE_BRANCH_NO_TX | \
+ PERF_SAMPLE_BRANCH_CALL_STACK | \
+ PERF_SAMPLE_BRANCH_COUNTERS)
+
+#define BRBE_ALLOWED_BRANCH_TYPES (PERF_SAMPLE_BRANCH_ANY | \
+ PERF_SAMPLE_BRANCH_ANY_CALL | \
+ PERF_SAMPLE_BRANCH_ANY_RETURN | \
+ PERF_SAMPLE_BRANCH_IND_CALL | \
+ PERF_SAMPLE_BRANCH_COND | \
+ PERF_SAMPLE_BRANCH_IND_JUMP | \
+ PERF_SAMPLE_BRANCH_CALL)
+
+
+#define BRBE_ALLOWED_BRANCH_FILTERS (PERF_SAMPLE_BRANCH_USER | \
+ PERF_SAMPLE_BRANCH_KERNEL | \
+ PERF_SAMPLE_BRANCH_HV | \
+ BRBE_ALLOWED_BRANCH_TYPES | \
+ PERF_SAMPLE_BRANCH_NO_FLAGS | \
+ PERF_SAMPLE_BRANCH_NO_CYCLES | \
+ PERF_SAMPLE_BRANCH_TYPE_SAVE | \
+ PERF_SAMPLE_BRANCH_HW_INDEX | \
+ PERF_SAMPLE_BRANCH_PRIV_SAVE)
+
+#define BRBE_PERF_BRANCH_FILTERS (BRBE_ALLOWED_BRANCH_FILTERS | \
+ BRBE_EXCLUDE_BRANCH_FILTERS)
+
+/*
+ * BRBE supports the following functional branch type filters while
+ * generating branch records. These branch filters can be enabled,
+ * either individually or as a group i.e ORing multiple filters
+ * with each other.
+ *
+ * BRBFCR_EL1_CONDDIR - Conditional direct branch
+ * BRBFCR_EL1_DIRCALL - Direct call
+ * BRBFCR_EL1_INDCALL - Indirect call
+ * BRBFCR_EL1_INDIRECT - Indirect branch
+ * BRBFCR_EL1_DIRECT - Direct branch
+ * BRBFCR_EL1_RTN - Subroutine return
+ */
+static u64 branch_type_to_brbfcr(int branch_type)
+{
+ u64 brbfcr = 0;
+
+ if (branch_type & PERF_SAMPLE_BRANCH_ANY) {
+ brbfcr |= BRBFCR_EL1_BRANCH_FILTERS;
+ return brbfcr;
+ }
+
+ if (branch_type & PERF_SAMPLE_BRANCH_ANY_CALL) {
+ brbfcr |= BRBFCR_EL1_INDCALL;
+ brbfcr |= BRBFCR_EL1_DIRCALL;
+ }
+
+ if (branch_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
+ brbfcr |= BRBFCR_EL1_RTN;
+
+ if (branch_type & PERF_SAMPLE_BRANCH_IND_CALL)
+ brbfcr |= BRBFCR_EL1_INDCALL;
+
+ if (branch_type & PERF_SAMPLE_BRANCH_COND)
+ brbfcr |= BRBFCR_EL1_CONDDIR;
+
+ if (branch_type & PERF_SAMPLE_BRANCH_IND_JUMP)
+ brbfcr |= BRBFCR_EL1_INDIRECT;
+
+ if (branch_type & PERF_SAMPLE_BRANCH_CALL)
+ brbfcr |= BRBFCR_EL1_DIRCALL;
+
+ return brbfcr;
+}
+
+/*
+ * BRBE supports the following privilege mode filters while generating
+ * branch records.
+ *
+ * BRBCR_ELx_E0BRE - EL0 branch records
+ * BRBCR_ELx_ExBRE - EL1/EL2 branch records
+ *
+ * BRBE also supports the following additional functional branch type
+ * filters while generating branch records.
+ *
+ * BRBCR_ELx_EXCEPTION - Exception
+ * BRBCR_ELx_ERTN - Exception return
+ */
+static u64 branch_type_to_brbcr(int branch_type)
+{
+ u64 brbcr = BRBCR_ELx_FZP | BRBCR_ELx_DEFAULT_TS;
+
+ if (branch_type & PERF_SAMPLE_BRANCH_USER)
+ brbcr |= BRBCR_ELx_E0BRE;
+
+ /*
+ * When running in the hyp mode, writing into BRBCR_EL1
+ * actually writes into BRBCR_EL2 instead. Field E2BRE
+ * is also at the same position as E1BRE.
+ */
+ if (branch_type & PERF_SAMPLE_BRANCH_KERNEL)
+ brbcr |= BRBCR_ELx_ExBRE;
+
+ if (branch_type & PERF_SAMPLE_BRANCH_HV) {
+ if (is_kernel_in_hyp_mode())
+ brbcr |= BRBCR_ELx_ExBRE;
+ }
+
+ if (!(branch_type & PERF_SAMPLE_BRANCH_NO_CYCLES))
+ brbcr |= BRBCR_ELx_CC;
+
+ if (!(branch_type & PERF_SAMPLE_BRANCH_NO_FLAGS))
+ brbcr |= BRBCR_ELx_MPRED;
+
+ /*
+ * The exception and exception return branches could be
+ * captured, irrespective of the perf event's privilege.
+ * If the perf event does not have enough privilege for
+ * a given exception level, then addresses which fall
+ * under that exception level will be reported as zero
+ * for the captured branch record, creating source only
+ * or target only records.
+ */
+ if (branch_type & PERF_SAMPLE_BRANCH_ANY) {
+ brbcr |= BRBCR_ELx_EXCEPTION;
+ brbcr |= BRBCR_ELx_ERTN;
+ }
+
+ if (branch_type & PERF_SAMPLE_BRANCH_ANY_CALL)
+ brbcr |= BRBCR_ELx_EXCEPTION;
+
+ if (branch_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
+ brbcr |= BRBCR_ELx_ERTN;
+
+ return brbcr;
+}
+
+bool brbe_branch_attr_valid(struct perf_event *event)
+{
+ u64 branch_type = event->attr.branch_sample_type;
+
+ /*
+ * Ensure both perf branch filter allowed and exclude
+ * masks are always in sync with the generic perf ABI.
+ */
+ BUILD_BUG_ON(BRBE_PERF_BRANCH_FILTERS != (PERF_SAMPLE_BRANCH_MAX - 1));
+
+ if (branch_type & BRBE_EXCLUDE_BRANCH_FILTERS) {
+ pr_debug_once("requested branch filter not supported 0x%llx\n", branch_type);
+ return false;
+ }
+
+ /* Ensure at least 1 branch type is enabled */
+ if (!(branch_type & BRBE_ALLOWED_BRANCH_TYPES)) {
+ pr_debug_once("no branch type enabled 0x%llx\n", branch_type);
+ return false;
+ }
+
+ /*
+ * No branches are recorded in guests nor nVHE hypervisors, so
+ * excluding the host or both kernel and user is invalid.
+ *
+ * Ideally we'd just require exclude_guest and exclude_hv, but setting
+ * event filters with perf for kernel or user doesn't set exclude_guest.
+ * So effectively, exclude_guest and exclude_hv are ignored.
+ */
+ if (event->attr.exclude_host || (event->attr.exclude_user && event->attr.exclude_kernel))
+ return false;
+
+ /*
+ * Require that the event filter and branch filter permissions match.
+ *
+ * The event and branch permissions can only mismatch if the user set
+ * at least one of the privilege branch filters in PERF_SAMPLE_BRANCH_PLM_ALL.
+ * Otherwise, the core will set the branch sample permissions in
+ * perf_copy_attr().
+ */
+ if ((event->attr.exclude_user != !(branch_type & PERF_SAMPLE_BRANCH_USER)) ||
+ (event->attr.exclude_kernel != !(branch_type & PERF_SAMPLE_BRANCH_KERNEL)) ||
+ (!is_kernel_in_hyp_mode() &&
+ (event->attr.exclude_hv != !(branch_type & PERF_SAMPLE_BRANCH_HV))))
+ return false;
+
+ event->hw.branch_reg.config = branch_type_to_brbfcr(event->attr.branch_sample_type);
+ event->hw.extra_reg.config = branch_type_to_brbcr(event->attr.branch_sample_type);
+
+ return true;
+}
+
+unsigned int brbe_num_branch_records(const struct arm_pmu *armpmu)
+{
+ return FIELD_GET(BRBIDR0_EL1_NUMREC_MASK, armpmu->reg_brbidr);
+}
+
+void brbe_probe(struct arm_pmu *armpmu)
+{
+ u64 aa64dfr0 = read_sysreg_s(SYS_ID_AA64DFR0_EL1);
+ u32 brbe;
+
+ brbe = cpuid_feature_extract_unsigned_field(aa64dfr0, ID_AA64DFR0_EL1_BRBE_SHIFT);
+ if (!valid_brbe_version(brbe))
+ return;
+
+ u64 brbidr = read_sysreg_s(SYS_BRBIDR0_EL1);
+ if (!valid_brbidr(brbidr))
+ return;
+
+ armpmu->reg_brbidr = brbidr;
+}
+
+void brbe_enable(const struct arm_pmu *arm_pmu)
+{
+ struct pmu_hw_events *cpuc = this_cpu_ptr(arm_pmu->hw_events);
+ u64 brbfcr = 0, brbcr = 0;
+
+ /*
+ * Merge the permitted branch filters of all events.
+ */
+ for (int i = 0; i < ARMPMU_MAX_HWEVENTS; i++) {
+ struct perf_event *event = cpuc->events[i];
+
+ if (event && has_branch_stack(event)) {
+ brbfcr |= event->hw.branch_reg.config;
+ brbcr |= event->hw.extra_reg.config;
+ }
+ }
+
+ /*
+ * If the record buffer contains any branches, we've already read them
+ * out and don't want to read them again.
+ * No need to sync as we're already stopped.
+ */
+ brbe_invalidate_nosync();
+ isb(); // Make sure invalidate takes effect before enabling
+
+ /*
+ * In VHE mode with MDCR_EL2.HPMN set to PMCR_EL0.N, the counters are
+ * controlled by BRBCR_EL1 rather than BRBCR_EL2 (to which writes to
+ * BRBCR_EL1 are redirected). Use the same value for both registers,
+ * except keep EL1 and EL0 recording disabled in guests.
+ */
+ if (is_kernel_in_hyp_mode())
+ write_sysreg_s(brbcr & ~(BRBCR_ELx_ExBRE | BRBCR_ELx_E0BRE), SYS_BRBCR_EL12);
+ write_sysreg_s(brbcr, SYS_BRBCR_EL1);
+ isb(); // Ensure BRBCR_ELx settings take effect before unpausing
+
+ write_sysreg_s(brbfcr, SYS_BRBFCR_EL1);
+}
+
+void brbe_disable(void)
+{
+ /*
+ * No need for synchronization here as synchronization in PMCR write
+ * ensures ordering and in the interrupt handler this is a NOP as
+ * we're already paused.
+ */
+ write_sysreg_s(BRBFCR_EL1_PAUSED, SYS_BRBFCR_EL1);
+}
+
+static const int brbe_type_to_perf_type_map[BRBINFx_EL1_TYPE_DEBUG_EXIT + 1][2] = {
+ [BRBINFx_EL1_TYPE_DIRECT_UNCOND] = { PERF_BR_UNCOND, 0 },
+ [BRBINFx_EL1_TYPE_INDIRECT] = { PERF_BR_IND, 0 },
+ [BRBINFx_EL1_TYPE_DIRECT_LINK] = { PERF_BR_CALL, 0 },
+ [BRBINFx_EL1_TYPE_INDIRECT_LINK] = { PERF_BR_IND_CALL, 0 },
+ [BRBINFx_EL1_TYPE_RET] = { PERF_BR_RET, 0 },
+ [BRBINFx_EL1_TYPE_DIRECT_COND] = { PERF_BR_COND, 0 },
+ [BRBINFx_EL1_TYPE_CALL] = { PERF_BR_SYSCALL, 0 },
+ [BRBINFx_EL1_TYPE_ERET] = { PERF_BR_ERET, 0 },
+ [BRBINFx_EL1_TYPE_IRQ] = { PERF_BR_IRQ, 0 },
+};
+
+static void brbe_set_perf_entry_type(struct perf_branch_entry *entry, u64 brbinf)
+{
+ int brbe_type = brbinf_get_type(brbinf);
+
+ if (brbe_type <= BRBINFx_EL1_TYPE_DEBUG_EXIT) {
+ const int *br_type = brbe_type_to_perf_type_map[brbe_type];
+
+ entry->type = br_type[0];
+ entry->new_type = br_type[1];
+ }
+}
+
+static int brbinf_get_perf_priv(u64 brbinf)
+{
+ int brbe_el = brbinf_get_el(brbinf);
+
+ switch (brbe_el) {
+ case BRBINFx_EL1_EL_EL0:
+ return PERF_BR_PRIV_USER;
+ case BRBINFx_EL1_EL_EL1:
+ return PERF_BR_PRIV_KERNEL;
+ case BRBINFx_EL1_EL_EL2:
+ if (is_kernel_in_hyp_mode())
+ return PERF_BR_PRIV_KERNEL;
+ return PERF_BR_PRIV_HV;
+ default:
+ pr_warn_once("%d - unknown branch privilege captured\n", brbe_el);
+ return PERF_BR_PRIV_UNKNOWN;
+ }
+}
+
+static void capture_brbe_flags(struct perf_branch_entry *entry,
+ const struct perf_event *event,
+ u64 brbinf)
+{
+ brbe_set_perf_entry_type(entry, brbinf);
+
+ if (!branch_sample_no_cycles(event))
+ entry->cycles = brbinf_get_cycles(brbinf);
+
+ if (!branch_sample_no_flags(event)) {
+ /* Mispredict info is available for source only and complete branch records. */
+ if (!brbe_record_is_target_only(brbinf)) {
+ entry->mispred = brbinf_get_mispredict(brbinf);
+ entry->predicted = !entry->mispred;
+ }
+
+ /*
+ * Currently the TME feature is neither implemented in any
+ * hardware nor supported in the kernel. Just warn here once
+ * if TME-related information shows up unexpectedly.
+ */
+ if (brbinf_get_lastfailed(brbinf) || brbinf_get_in_tx(brbinf))
+ pr_warn_once("Unknown transaction states\n");
+ }
+
+ /*
+ * Branch privilege level is available for target only and complete
+ * branch records.
+ */
+ if (!brbe_record_is_source_only(brbinf))
+ entry->priv = brbinf_get_perf_priv(brbinf);
+}
+
+static bool perf_entry_from_brbe_regset(int index, struct perf_branch_entry *entry,
+ const struct perf_event *event)
+{
+ struct brbe_regset bregs;
+
+ if (!__read_brbe_regset(&bregs, index))
+ return false;
+
+ perf_clear_branch_entry_bitfields(entry);
+ if (brbe_record_is_complete(bregs.brbinf)) {
+ entry->from = bregs.brbsrc;
+ entry->to = bregs.brbtgt;
+ } else if (brbe_record_is_source_only(bregs.brbinf)) {
+ entry->from = bregs.brbsrc;
+ entry->to = 0;
+ } else if (brbe_record_is_target_only(bregs.brbinf)) {
+ entry->from = 0;
+ entry->to = bregs.brbtgt;
+ }
+ capture_brbe_flags(entry, event, bregs.brbinf);
+ return true;
+}
+
+#define PERF_BR_ARM64_ALL ( \
+ BIT(PERF_BR_COND) | \
+ BIT(PERF_BR_UNCOND) | \
+ BIT(PERF_BR_IND) | \
+ BIT(PERF_BR_CALL) | \
+ BIT(PERF_BR_IND_CALL) | \
+ BIT(PERF_BR_RET) | \
+ BIT(PERF_BR_SYSCALL) | \
+ BIT(PERF_BR_ERET) | \
+ BIT(PERF_BR_IRQ))
+
+static void prepare_event_branch_type_mask(const struct perf_event *event,
+ unsigned long *event_type_mask)
+{
+ u64 branch_sample = event->attr.branch_sample_type;
+
+ if (branch_sample & PERF_SAMPLE_BRANCH_ANY) {
+ bitmap_from_u64(event_type_mask, PERF_BR_ARM64_ALL);
+ return;
+ }
+
+ bitmap_zero(event_type_mask, PERF_BR_ARM64_MAX);
+
+ if (branch_sample & PERF_SAMPLE_BRANCH_IND_JUMP)
+ set_bit(PERF_BR_IND, event_type_mask);
+
+ if (branch_sample & PERF_SAMPLE_BRANCH_COND)
+ set_bit(PERF_BR_COND, event_type_mask);
+
+ if (branch_sample & PERF_SAMPLE_BRANCH_CALL)
+ set_bit(PERF_BR_CALL, event_type_mask);
+
+ if (branch_sample & PERF_SAMPLE_BRANCH_IND_CALL)
+ set_bit(PERF_BR_IND_CALL, event_type_mask);
+
+ if (branch_sample & PERF_SAMPLE_BRANCH_ANY_CALL) {
+ set_bit(PERF_BR_CALL, event_type_mask);
+ set_bit(PERF_BR_IND_CALL, event_type_mask);
+ set_bit(PERF_BR_SYSCALL, event_type_mask);
+
+ if (!event->attr.exclude_kernel)
+ set_bit(PERF_BR_IRQ, event_type_mask);
+ }
+
+ if (branch_sample & PERF_SAMPLE_BRANCH_ANY_RETURN) {
+ set_bit(PERF_BR_RET, event_type_mask);
+
+ if (!event->attr.exclude_kernel)
+ set_bit(PERF_BR_ERET, event_type_mask);
+ }
+}
+
+/*
+ * BRBE is configured with an OR of permissions from all events, so there may
+ * be events which have to be dropped or events where just the source or target
+ * address has to be zeroed.
+ */
+static bool filter_branch_privilege(struct perf_branch_entry *entry, u64 branch_sample_type)
+{
+ /* We can only have a half record if permissions have not been expanded */
+ if (!entry->from || !entry->to)
+ return true;
+
+ bool from_user = access_ok((void __user *)(unsigned long)entry->from, 4);
+ bool to_user = access_ok((void __user *)(unsigned long)entry->to, 4);
+ bool exclude_kernel = !((branch_sample_type & PERF_SAMPLE_BRANCH_KERNEL) ||
+ (is_kernel_in_hyp_mode() && (branch_sample_type & PERF_SAMPLE_BRANCH_HV)));
+
+ /*
+ * If record is within a single exception level, just need to either
+ * drop or keep the entire record.
+ */
+ if (from_user == to_user)
+ return ((entry->priv == PERF_BR_PRIV_KERNEL) && !exclude_kernel) ||
+ ((entry->priv == PERF_BR_PRIV_USER) &&
+ (branch_sample_type & PERF_SAMPLE_BRANCH_USER));
+
+ /*
+ * Record is across exception levels, mask addresses for the exception
+ * level we're not capturing.
+ */
+ if (!(branch_sample_type & PERF_SAMPLE_BRANCH_USER)) {
+ if (from_user)
+ entry->from = 0;
+ if (to_user)
+ entry->to = 0;
+ }
+
+ if (exclude_kernel) {
+ if (!from_user)
+ entry->from = 0;
+ if (!to_user)
+ entry->to = 0;
+ }
+ return true;
+}
+
+static bool filter_branch_record(struct perf_branch_entry *entry,
+ u64 branch_sample,
+ const unsigned long *event_type_mask)
+{
+ return test_bit(entry->type, event_type_mask) &&
+ filter_branch_privilege(entry, branch_sample);
+}
+
+void brbe_read_filtered_entries(struct perf_branch_stack *branch_stack,
+ const struct perf_event *event)
+{
+ struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
+ int nr_hw = brbe_num_branch_records(cpu_pmu);
+ int nr_banks = DIV_ROUND_UP(nr_hw, BRBE_BANK_MAX_ENTRIES);
+ int nr_filtered = 0;
+ DECLARE_BITMAP(event_type_mask, PERF_BR_ARM64_MAX);
+
+ prepare_event_branch_type_mask(event, event_type_mask);
+
+ for (int bank = 0; bank < nr_banks; bank++) {
+ int nr_remaining = nr_hw - (bank * BRBE_BANK_MAX_ENTRIES);
+ int nr_this_bank = min(nr_remaining, BRBE_BANK_MAX_ENTRIES);
+
+ select_brbe_bank(bank);
+
+ for (int i = 0; i < nr_this_bank; i++) {
+ struct perf_branch_entry *pbe = &branch_stack->entries[nr_filtered];
+
+ if (!perf_entry_from_brbe_regset(i, pbe, event))
+ goto done;
+
+ if (!filter_branch_record(pbe, event->attr.branch_sample_type, event_type_mask))
+ continue;
+
+ nr_filtered++;
+ }
+ }
+
+done:
+ branch_stack->nr = nr_filtered;
+}
diff --git a/drivers/perf/arm_brbe.h b/drivers/perf/arm_brbe.h
new file mode 100644
index 000000000000..b7c7d8796c86
--- /dev/null
+++ b/drivers/perf/arm_brbe.h
@@ -0,0 +1,47 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Branch Record Buffer Extension Helpers.
+ *
+ * Copyright (C) 2022-2025 ARM Limited
+ *
+ * Author: Anshuman Khandual <anshuman.khandual@arm.com>
+ */
+
+struct arm_pmu;
+struct perf_branch_stack;
+struct perf_event;
+
+#ifdef CONFIG_ARM64_BRBE
+void brbe_probe(struct arm_pmu *arm_pmu);
+unsigned int brbe_num_branch_records(const struct arm_pmu *armpmu);
+void brbe_invalidate(void);
+
+void brbe_enable(const struct arm_pmu *arm_pmu);
+void brbe_disable(void);
+
+bool brbe_branch_attr_valid(struct perf_event *event);
+void brbe_read_filtered_entries(struct perf_branch_stack *branch_stack,
+ const struct perf_event *event);
+#else
+static inline void brbe_probe(struct arm_pmu *arm_pmu) { }
+static inline unsigned int brbe_num_branch_records(const struct arm_pmu *armpmu)
+{
+ return 0;
+}
+
+static inline void brbe_invalidate(void) { }
+
+static inline void brbe_enable(const struct arm_pmu *arm_pmu) { }
+static inline void brbe_disable(void) { }
+
+static inline bool brbe_branch_attr_valid(struct perf_event *event)
+{
+ WARN_ON_ONCE(!has_branch_stack(event));
+ return false;
+}
+
+static inline void brbe_read_filtered_entries(struct perf_branch_stack *branch_stack,
+ const struct perf_event *event)
+{
+}
+#endif
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 2f33e69a8caf..df9867c0dc57 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -99,7 +99,7 @@ static const struct pmu_irq_ops percpu_pmunmi_ops = {
.free_pmuirq = armpmu_free_percpu_pmunmi
};
-static DEFINE_PER_CPU(struct arm_pmu *, cpu_armpmu);
+DEFINE_PER_CPU(struct arm_pmu *, cpu_armpmu);
static DEFINE_PER_CPU(int, cpu_irq);
static DEFINE_PER_CPU(const struct pmu_irq_ops *, cpu_irq_ops);
@@ -317,6 +317,11 @@ armpmu_del(struct perf_event *event, int flags)
struct hw_perf_event *hwc = &event->hw;
int idx = hwc->idx;
+ if (has_branch_stack(event)) {
+ hw_events->branch_users--;
+ perf_sched_cb_dec(event->pmu);
+ }
+
armpmu_stop(event, PERF_EF_UPDATE);
hw_events->events[idx] = NULL;
armpmu->clear_event_idx(hw_events, event);
@@ -345,6 +350,11 @@ armpmu_add(struct perf_event *event, int flags)
/* The newly-allocated counter should be empty */
WARN_ON_ONCE(hw_events->events[idx]);
+ if (has_branch_stack(event)) {
+ hw_events->branch_users++;
+ perf_sched_cb_inc(event->pmu);
+ }
+
event->hw.idx = idx;
hw_events->events[idx] = event;
@@ -509,8 +519,7 @@ static int armpmu_event_init(struct perf_event *event)
!cpumask_test_cpu(event->cpu, &armpmu->supported_cpus))
return -ENOENT;
- /* does not support taken branch sampling */
- if (has_branch_stack(event))
+ if (has_branch_stack(event) && !armpmu->reg_brbidr)
return -EOPNOTSUPP;
return __hw_perf_event_init(event);
diff --git a/drivers/perf/arm_pmuv3.c b/drivers/perf/arm_pmuv3.c
index 5406b9ca591a..748728c4227d 100644
--- a/drivers/perf/arm_pmuv3.c
+++ b/drivers/perf/arm_pmuv3.c
@@ -25,6 +25,8 @@
#include <linux/smp.h>
#include <linux/nmi.h>
+#include "arm_brbe.h"
+
/* ARMv8 Cortex-A53 specific event types. */
#define ARMV8_A53_PERFCTR_PREF_LINEFILL 0xC2
@@ -809,6 +811,7 @@ static void armv8pmu_disable_event(struct perf_event *event)
static void armv8pmu_start(struct arm_pmu *cpu_pmu)
{
struct perf_event_context *ctx;
+ struct pmu_hw_events *hw_events = this_cpu_ptr(cpu_pmu->hw_events);
int nr_user = 0;
ctx = perf_cpu_task_ctx();
@@ -822,16 +825,34 @@ static void armv8pmu_start(struct arm_pmu *cpu_pmu)
kvm_vcpu_pmu_resync_el0();
+ if (hw_events->branch_users)
+ brbe_enable(cpu_pmu);
+
/* Enable all counters */
armv8pmu_pmcr_write(armv8pmu_pmcr_read() | ARMV8_PMU_PMCR_E);
}
static void armv8pmu_stop(struct arm_pmu *cpu_pmu)
{
+ struct pmu_hw_events *hw_events = this_cpu_ptr(cpu_pmu->hw_events);
+
+ if (hw_events->branch_users)
+ brbe_disable();
+
/* Disable all counters */
armv8pmu_pmcr_write(armv8pmu_pmcr_read() & ~ARMV8_PMU_PMCR_E);
}
+static void read_branch_records(struct pmu_hw_events *cpuc,
+ struct perf_event *event,
+ struct perf_sample_data *data)
+{
+ struct perf_branch_stack *branch_stack = cpuc->branch_stack;
+
+ brbe_read_filtered_entries(branch_stack, event);
+ perf_sample_save_brstack(data, event, branch_stack, NULL);
+}
+
static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
{
u64 pmovsr;
@@ -882,6 +903,13 @@ static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
if (!armpmu_event_set_period(event))
continue;
+ /*
+ * PMU IRQ should remain asserted until all branch records
+ * are captured and processed into struct perf_sample_data.
+ */
+ if (has_branch_stack(event))
+ read_branch_records(cpuc, event, &data);
+
/*
* Perf event overflow will queue the processing of the event as
* an irq_work which will be taken care of in the handling of
@@ -939,7 +967,7 @@ static int armv8pmu_get_event_idx(struct pmu_hw_events *cpuc,
/* Always prefer to place a cycle counter into the cycle counter. */
if ((evtype == ARMV8_PMUV3_PERFCTR_CPU_CYCLES) &&
- !armv8pmu_event_get_threshold(&event->attr)) {
+ !armv8pmu_event_get_threshold(&event->attr) && !has_branch_stack(event)) {
if (!test_and_set_bit(ARMV8_PMU_CYCLE_IDX, cpuc->used_mask))
return ARMV8_PMU_CYCLE_IDX;
else if (armv8pmu_event_is_64bit(event) &&
@@ -988,6 +1016,18 @@ static int armv8pmu_user_event_idx(struct perf_event *event)
return event->hw.idx + 1;
}
+static void armv8pmu_sched_task(struct perf_event_pmu_context *pmu_ctx, bool sched_in)
+{
+ struct arm_pmu *armpmu = *this_cpu_ptr(&cpu_armpmu);
+ struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events);
+
+ if (!hw_events->branch_users)
+ return;
+
+ if (sched_in)
+ brbe_invalidate();
+}
+
/*
* Add an event filter to a given event.
*/
@@ -1005,6 +1045,13 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
return -EOPNOTSUPP;
}
+ if (has_branch_stack(perf_event)) {
+ if (!brbe_num_branch_records(cpu_pmu) || !brbe_branch_attr_valid(perf_event))
+ return -EOPNOTSUPP;
+
+ perf_event->attach_state |= PERF_ATTACH_SCHED_CB;
+ }
+
/*
* If we're running in hyp mode, then we *are* the hypervisor.
* Therefore we ignore exclude_hv in this configuration, since
@@ -1071,6 +1118,9 @@ static void armv8pmu_reset(void *info)
/* Clear the counters we flip at guest entry/exit */
kvm_clr_pmu_events(mask);
+ if (brbe_num_branch_records(cpu_pmu))
+ brbe_disable();
+
/*
* Initialize & Reset PMNC. Request overflow interrupt for
* 64 bit cycle counter but cheat in armv8pmu_write_counter().
@@ -1239,6 +1289,30 @@ static void __armv8pmu_probe_pmu(void *info)
cpu_pmu->reg_pmmir = read_pmmir();
else
cpu_pmu->reg_pmmir = 0;
+
+ brbe_probe(cpu_pmu);
+}
+
+static int branch_records_alloc(struct arm_pmu *armpmu)
+{
+ struct perf_branch_stack *branch_stack;
+ size_t size = struct_size(branch_stack, entries, brbe_num_branch_records(armpmu));
+ int cpu;
+
+ branch_stack = __alloc_percpu_gfp(size, __alignof__(*branch_stack),
+ GFP_KERNEL);
+ if (!branch_stack)
+ return -ENOMEM;
+
+ for_each_possible_cpu(cpu) {
+ struct pmu_hw_events *events_cpu;
+ struct perf_branch_stack *branch_stack_cpu;
+
+ events_cpu = per_cpu_ptr(armpmu->hw_events, cpu);
+ branch_stack_cpu = per_cpu_ptr(branch_stack, cpu);
+ events_cpu->branch_stack = branch_stack_cpu;
+ }
+ return 0;
}
static int armv8pmu_probe_pmu(struct arm_pmu *cpu_pmu)
@@ -1255,7 +1329,15 @@ static int armv8pmu_probe_pmu(struct arm_pmu *cpu_pmu)
if (ret)
return ret;
- return probe.present ? 0 : -ENODEV;
+ if (!probe.present)
+ return -ENODEV;
+
+ if (brbe_num_branch_records(cpu_pmu)) {
+ ret = branch_records_alloc(cpu_pmu);
+ if (ret)
+ return ret;
+ }
+ return 0;
}
static void armv8pmu_disable_user_access_ipi(void *unused)
@@ -1314,6 +1396,7 @@ static int armv8_pmu_init(struct arm_pmu *cpu_pmu, char *name,
cpu_pmu->set_event_filter = armv8pmu_set_event_filter;
cpu_pmu->pmu.event_idx = armv8pmu_user_event_idx;
+ cpu_pmu->pmu.sched_task = armv8pmu_sched_task;
cpu_pmu->name = name;
cpu_pmu->map_event = map_event;
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index c70d528594f2..219d58259857 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -70,6 +70,11 @@ struct pmu_hw_events {
struct arm_pmu *percpu_pmu;
int irq;
+
+ struct perf_branch_stack *branch_stack;
+
+ /* Active events requesting branch records */
+ unsigned int branch_users;
};
enum armpmu_attr_groups {
@@ -111,6 +116,7 @@ struct arm_pmu {
/* PMUv3 only */
int pmuver;
u64 reg_pmmir;
+ u64 reg_brbidr;
#define ARMV8_PMUV3_MAX_COMMON_EVENTS 0x40
DECLARE_BITMAP(pmceid_bitmap, ARMV8_PMUV3_MAX_COMMON_EVENTS);
#define ARMV8_PMUV3_EXT_COMMON_EVENT_BASE 0x4000
@@ -122,6 +128,8 @@ struct arm_pmu {
#define to_arm_pmu(p) (container_of(p, struct arm_pmu, pmu))
+DECLARE_PER_CPU(struct arm_pmu *, cpu_armpmu);
+
u64 armpmu_event_update(struct perf_event *event);
int armpmu_event_set_period(struct perf_event *event);
--
2.47.2
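
For illustration only, the read loop in brbe_read_filtered_entries() can be
modelled in plain user-space C. This is a rough sketch under stated
assumptions, not the driver code: every model_* name, the flat hw[] array
and the 32-entry bank size are hypothetical stand-ins, and the privilege
filtering done by filter_branch_privilege() is left out. It shows the two
properties of the loop that are easy to miss: each record is written into
the next free output slot and the slot is only consumed if the record
passes the filter, and reading stops at the first record that cannot be
read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the perf branch types; not the kernel enum. */
enum model_br_type { MODEL_BR_COND, MODEL_BR_CALL, MODEL_BR_RET, MODEL_BR_MAX };

struct model_record {
	bool valid;		/* false models a slot that fails to read */
	enum model_br_type type;
	uint64_t from, to;
};

/* Assumed bank size for this sketch; stands in for BRBE_BANK_MAX_ENTRIES. */
#define MODEL_BANK_ENTRIES 32

/*
 * Copy the records whose type is set in type_mask from hw[] into out[],
 * walking the buffer bank by bank. out[] needs room for nr_hw entries
 * because a record is stored before it is filtered. Returns the number
 * of records kept.
 */
static size_t model_read_filtered(const struct model_record *hw, size_t nr_hw,
				  unsigned long type_mask,
				  struct model_record *out)
{
	size_t nr_banks = (nr_hw + MODEL_BANK_ENTRIES - 1) / MODEL_BANK_ENTRIES;
	size_t nr_filtered = 0;

	assert(hw && out);

	for (size_t bank = 0; bank < nr_banks; bank++) {
		size_t remaining = nr_hw - bank * MODEL_BANK_ENTRIES;
		size_t in_bank = remaining < MODEL_BANK_ENTRIES ?
				 remaining : MODEL_BANK_ENTRIES;

		for (size_t i = 0; i < in_bank; i++) {
			const struct model_record *rec =
				&hw[bank * MODEL_BANK_ENTRIES + i];

			/* Mirrors the 'goto done' on a failed regset read. */
			if (!rec->valid)
				return nr_filtered;

			/* Store into the next free slot ... */
			out[nr_filtered] = *rec;

			/* ... and only keep it if the type is wanted. */
			if (type_mask & (1UL << rec->type))
				nr_filtered++;
		}
	}
	return nr_filtered;
}
```

A rejected record is simply overwritten by the next one, so no second
pass is needed to compact the output.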
* Re: [PATCH v19 01/11] perf: arm_pmuv3: Call kvm_vcpu_pmu_resync_el0() before enabling counters
2025-02-03 0:42 ` [PATCH v19 01/11] perf: arm_pmuv3: Call kvm_vcpu_pmu_resync_el0() before enabling counters Rob Herring (Arm)
@ 2025-02-03 4:07 ` Anshuman Khandual
0 siblings, 0 replies; 43+ messages in thread
From: Anshuman Khandual @ 2025-02-03 4:07 UTC (permalink / raw)
To: Rob Herring (Arm), Will Deacon, Mark Rutland, Catalin Marinas,
Jonathan Corbet, Marc Zyngier, Oliver Upton, Joey Gouly,
Suzuki K Poulose, Zenghui Yu, James Clark
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm
On 2/3/25 06:12, Rob Herring (Arm) wrote:
> Counting events related to setup of the PMU is not desired, but
> kvm_vcpu_pmu_resync_el0() is called just after the PMU counters have
> been enabled. Move the call to before enabling the counters.
>
> Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
> ---
> drivers/perf/arm_pmuv3.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/perf/arm_pmuv3.c b/drivers/perf/arm_pmuv3.c
> index 0e360feb3432..9ebc950559c0 100644
> --- a/drivers/perf/arm_pmuv3.c
> +++ b/drivers/perf/arm_pmuv3.c
> @@ -825,10 +825,10 @@ static void armv8pmu_start(struct arm_pmu *cpu_pmu)
> else
> armv8pmu_disable_user_access();
>
> + kvm_vcpu_pmu_resync_el0();
> +
> /* Enable all counters */
> armv8pmu_pmcr_write(armv8pmu_pmcr_read() | ARMV8_PMU_PMCR_E);
> -
> - kvm_vcpu_pmu_resync_el0();
> }
>
> static void armv8pmu_stop(struct arm_pmu *cpu_pmu)
>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
* Re: [PATCH v19 04/11] perf: arm_v7_pmu: Drop obvious comments for enabling/disabling counters and interrupts
2025-02-03 0:42 ` [PATCH v19 04/11] perf: arm_v7_pmu: Drop obvious comments for enabling/disabling counters and interrupts Rob Herring (Arm)
@ 2025-02-03 4:09 ` Anshuman Khandual
0 siblings, 0 replies; 43+ messages in thread
From: Anshuman Khandual @ 2025-02-03 4:09 UTC (permalink / raw)
To: Rob Herring (Arm), Will Deacon, Mark Rutland, Catalin Marinas,
Jonathan Corbet, Marc Zyngier, Oliver Upton, Joey Gouly,
Suzuki K Poulose, Zenghui Yu, James Clark
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm
On 2/3/25 06:12, Rob Herring (Arm) wrote:
> The function calls for enabling/disabling counters and interrupts are
> pretty obvious as to what they are doing, and the comments don't add
> any additional value.
>
> Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
> ---
> drivers/perf/arm_v7_pmu.c | 44 --------------------------------------------
> 1 file changed, 44 deletions(-)
>
> diff --git a/drivers/perf/arm_v7_pmu.c b/drivers/perf/arm_v7_pmu.c
> index 420cadd108e7..7fa88e3b64e0 100644
> --- a/drivers/perf/arm_v7_pmu.c
> +++ b/drivers/perf/arm_v7_pmu.c
> @@ -857,14 +857,6 @@ static void armv7pmu_enable_event(struct perf_event *event)
> return;
> }
>
> - /*
> - * Enable counter and interrupt, and set the counter to count
> - * the event that we're interested in.
> - */
> -
> - /*
> - * Disable counter
> - */
> armv7_pmnc_disable_counter(idx);
>
> /*
> @@ -875,14 +867,7 @@ static void armv7pmu_enable_event(struct perf_event *event)
> if (cpu_pmu->set_event_filter || idx != ARMV7_IDX_CYCLE_COUNTER)
> armv7_pmnc_write_evtsel(idx, hwc->config_base);
>
> - /*
> - * Enable interrupt for this counter
> - */
> armv7_pmnc_enable_intens(idx);
> -
> - /*
> - * Enable counter
> - */
> armv7_pmnc_enable_counter(idx);
> }
>
> @@ -898,18 +883,7 @@ static void armv7pmu_disable_event(struct perf_event *event)
> return;
> }
>
> - /*
> - * Disable counter and interrupt
> - */
> -
> - /*
> - * Disable counter
> - */
> armv7_pmnc_disable_counter(idx);
> -
> - /*
> - * Disable interrupt for this counter
> - */
> armv7_pmnc_disable_intens(idx);
> }
>
> @@ -1476,12 +1450,6 @@ static void krait_pmu_enable_event(struct perf_event *event)
> struct hw_perf_event *hwc = &event->hw;
> int idx = hwc->idx;
>
> - /*
> - * Enable counter and interrupt, and set the counter to count
> - * the event that we're interested in.
> - */
> -
> - /* Disable counter */
> armv7_pmnc_disable_counter(idx);
>
> /*
> @@ -1494,10 +1462,7 @@ static void krait_pmu_enable_event(struct perf_event *event)
> else
> armv7_pmnc_write_evtsel(idx, hwc->config_base);
>
> - /* Enable interrupt for this counter */
> armv7_pmnc_enable_intens(idx);
> -
> - /* Enable counter */
> armv7_pmnc_enable_counter(idx);
> }
>
> @@ -1797,12 +1762,6 @@ static void scorpion_pmu_enable_event(struct perf_event *event)
> struct hw_perf_event *hwc = &event->hw;
> int idx = hwc->idx;
>
> - /*
> - * Enable counter and interrupt, and set the counter to count
> - * the event that we're interested in.
> - */
> -
> - /* Disable counter */
> armv7_pmnc_disable_counter(idx);
>
> /*
> @@ -1815,10 +1774,7 @@ static void scorpion_pmu_enable_event(struct perf_event *event)
> else if (idx != ARMV7_IDX_CYCLE_COUNTER)
> armv7_pmnc_write_evtsel(idx, hwc->config_base);
>
> - /* Enable interrupt for this counter */
> armv7_pmnc_enable_intens(idx);
> -
> - /* Enable counter */
> armv7_pmnc_enable_counter(idx);
> }
>
>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
* Re: [PATCH v19 02/11] perf: arm_pmu: Don't disable counter in armpmu_add()
2025-02-03 0:42 ` [PATCH v19 02/11] perf: arm_pmu: Don't disable counter in armpmu_add() Rob Herring (Arm)
@ 2025-02-03 6:04 ` Anshuman Khandual
0 siblings, 0 replies; 43+ messages in thread
From: Anshuman Khandual @ 2025-02-03 6:04 UTC (permalink / raw)
To: Rob Herring (Arm), Will Deacon, Mark Rutland, Catalin Marinas,
Jonathan Corbet, Marc Zyngier, Oliver Upton, Joey Gouly,
Suzuki K Poulose, Zenghui Yu, James Clark
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm
On 2/3/25 06:12, Rob Herring (Arm) wrote:
> From: Mark Rutland <mark.rutland@arm.com>
>
> Currently armpmu_add() tries to handle a newly-allocated counter having
> a stale associated event, but this should not be possible, and if this
A stale associated event? Does that mean hw_events->events[idx] still
points to a valid event even though counter idx has already been freed
up and allocated to a new event?
> were to happen the current mitigation is insufficient and potentially
> expensive. It would be better to warn if we encounter the impossible
> case.
Makes sense.
>
> Calls to pmu::add() and pmu::del() are serialized by the core perf code,
> and armpmu_del() clears the relevant slot in pmu_hw_events::events[]
> before clearing the bit in pmu_hw_events::used_mask such that the
> counter can be reallocated. Thus when armpmu_add() allocates a counter
> index from pmu_hw_events::used_mask, it should not be possible to observe
> a stale event in pmu_hw_events::events[] unless either
> pmu_hw_events::used_mask or pmu_hw_events::events[] have been corrupted.
>
> If this were to happen, we'd end up with two events with the same
> event->hw.idx, which would clash with each other during reprogramming,
> deletion, etc, and produce bogus results. Add a WARN_ON_ONCE() for this
> case so that we can detect if this ever occurs in practice.
Agreed.
>
> That possiblity aside, there's no need to call arm_pmu::disable(event)
s/possiblity/possibility
> for the new event. The PMU reset code initialises the counter in a
> disabled state, and armpmu_del() will disable the counter before it can
> be reused. Remove the redundant disable.
>
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
> ---
> drivers/perf/arm_pmu.c | 8 +++-----
> 1 file changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> index 398cce3d76fc..2f33e69a8caf 100644
> --- a/drivers/perf/arm_pmu.c
> +++ b/drivers/perf/arm_pmu.c
> @@ -342,12 +342,10 @@ armpmu_add(struct perf_event *event, int flags)
> if (idx < 0)
> return idx;
>
> - /*
> - * If there is an event in the counter we are going to use then make
> - * sure it is disabled.
> - */
> + /* The newly-allocated counter should be empty */
Should this comment also describe what happens when two events somehow end
up using the same 'event->hw.idx', as mentioned in the commit message, just
to make things clearer?
> + WARN_ON_ONCE(hw_events->events[idx]);
> +
> event->hw.idx = idx;
> - armpmu->disable(event);
> hw_events->events[idx] = event;
>
> hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE;
>
Otherwise LGTM.
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
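
For illustration only, the ordering argument in the commit message can be
modelled in user space. This is a minimal sketch, assuming a simplified
structure: the model_* names and the four-counter limit are hypothetical,
and the 'stale' flag merely stands in for the WARN_ON_ONCE() added by the
patch. The delete path clears the events[] slot before releasing the bit
in used_mask, so an add that allocates a free index should always find an
empty slot unless one of the two fields has been corrupted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_COUNTERS 4	/* hypothetical counter count */

struct model_hw_events {
	const void *events[MODEL_NR_COUNTERS];	/* owner of each counter, or NULL */
	unsigned long used_mask;		/* bit n set => counter n allocated */
};

/*
 * Allocate the first free counter for 'event'. Sets *stale if the slot
 * was unexpectedly occupied. Returns the index, or -1 if all are busy.
 */
static int model_add(struct model_hw_events *hw, const void *event, bool *stale)
{
	for (int idx = 0; idx < MODEL_NR_COUNTERS; idx++) {
		if (hw->used_mask & (1UL << idx))
			continue;

		hw->used_mask |= 1UL << idx;
		/* Stand-in for WARN_ON_ONCE(hw_events->events[idx]). */
		*stale = hw->events[idx] != NULL;
		hw->events[idx] = event;
		return idx;
	}
	return -1;
}

static void model_del(struct model_hw_events *hw, int idx)
{
	assert(idx >= 0 && idx < MODEL_NR_COUNTERS);

	/* Clear the slot first, then release the index for reuse. */
	hw->events[idx] = NULL;
	hw->used_mask &= ~(1UL << idx);
}
```

In this model the stale flag can only be raised by clearing used_mask
behind the allocator's back, which corresponds to the corruption case the
commit message describes.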
* Re: [PATCH v19 03/11] perf: arm_pmuv3: Don't disable counter in armv8pmu_enable_event()
2025-02-03 0:42 ` [PATCH v19 03/11] perf: arm_pmuv3: Don't disable counter in armv8pmu_enable_event() Rob Herring (Arm)
@ 2025-02-03 6:38 ` Anshuman Khandual
0 siblings, 0 replies; 43+ messages in thread
From: Anshuman Khandual @ 2025-02-03 6:38 UTC (permalink / raw)
To: Rob Herring (Arm), Will Deacon, Mark Rutland, Catalin Marinas,
Jonathan Corbet, Marc Zyngier, Oliver Upton, Joey Gouly,
Suzuki K Poulose, Zenghui Yu, James Clark
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm
On 2/3/25 06:12, Rob Herring (Arm) wrote:
> From: Mark Rutland <mark.rutland@arm.com>
>
> Currently armv8pmu_enable_event() starts by disabling the event counter
> it has been asked to enable. This should not be necessary as the counter
> (and the PMU as a whole) should not be active when
> armv8pmu_enable_event() is called.
Makes sense.
>
> Remove the redundant call to armv8pmu_disable_event_counter(). At the
> same time, remove the comment immeditately above as everything it says
s/immeditately/immediately
> is obvious from the function names below.
But should this comment removal be folded into the next patch, which
exclusively drops all the obviously redundant disable/enable comments?
>
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
> ---
> drivers/perf/arm_pmuv3.c | 5 -----
> 1 file changed, 5 deletions(-)
>
> diff --git a/drivers/perf/arm_pmuv3.c b/drivers/perf/arm_pmuv3.c
> index 9ebc950559c0..5406b9ca591a 100644
> --- a/drivers/perf/arm_pmuv3.c
> +++ b/drivers/perf/arm_pmuv3.c
> @@ -795,11 +795,6 @@ static void armv8pmu_enable_user_access(struct arm_pmu *cpu_pmu)
>
> static void armv8pmu_enable_event(struct perf_event *event)
> {
> - /*
> - * Enable counter and interrupt, and set the counter to count
> - * the event that we're interested in.
> - */
> - armv8pmu_disable_event_counter(event);
> armv8pmu_write_event_type(event);
> armv8pmu_enable_event_irq(event);
> armv8pmu_enable_event_counter(event);
>
Otherwise LGTM.
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
* Re: [PATCH v19 05/11] perf: arm_v7_pmu: Don't disable counter in (armv7|krait_|scorpion_)pmu_enable_event()
2025-02-03 0:42 ` [PATCH v19 05/11] perf: arm_v7_pmu: Don't disable counter in (armv7|krait_|scorpion_)pmu_enable_event() Rob Herring (Arm)
@ 2025-02-03 6:54 ` Anshuman Khandual
0 siblings, 0 replies; 43+ messages in thread
From: Anshuman Khandual @ 2025-02-03 6:54 UTC (permalink / raw)
To: Rob Herring (Arm), Will Deacon, Mark Rutland, Catalin Marinas,
Jonathan Corbet, Marc Zyngier, Oliver Upton, Joey Gouly,
Suzuki K Poulose, Zenghui Yu, James Clark
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm
On 2/3/25 06:12, Rob Herring (Arm) wrote:
> Currently (armv7|krait_|scorpion_)pmu_enable_event() start by disabling
> the event counter it has been asked to enable. This should not be
> necessary as the counter (and the PMU as a whole) should not be active
> when *_enable_event() is called.
>
> Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
> ---
> drivers/perf/arm_v7_pmu.c | 6 ------
> 1 file changed, 6 deletions(-)
>
> diff --git a/drivers/perf/arm_v7_pmu.c b/drivers/perf/arm_v7_pmu.c
> index 7fa88e3b64e0..17831e1920bd 100644
> --- a/drivers/perf/arm_v7_pmu.c
> +++ b/drivers/perf/arm_v7_pmu.c
> @@ -857,8 +857,6 @@ static void armv7pmu_enable_event(struct perf_event *event)
> return;
> }
>
> - armv7_pmnc_disable_counter(idx);
> -
> /*
> * Set event (if destined for PMNx counters)
> * We only need to set the event for the cycle counter if we
> @@ -1450,8 +1448,6 @@ static void krait_pmu_enable_event(struct perf_event *event)
> struct hw_perf_event *hwc = &event->hw;
> int idx = hwc->idx;
>
> - armv7_pmnc_disable_counter(idx);
> -
> /*
> * Set event (if destined for PMNx counters)
> * We set the event for the cycle counter because we
> @@ -1762,8 +1758,6 @@ static void scorpion_pmu_enable_event(struct perf_event *event)
> struct hw_perf_event *hwc = &event->hw;
> int idx = hwc->idx;
>
> - armv7_pmnc_disable_counter(idx);
> -
> /*
> * Set event (if destined for PMNx counters)
> * We don't set the event for the cycle counter because we
>
LGTM
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
* Re: [PATCH v19 06/11] perf: apple_m1: Don't disable counter in m1_pmu_enable_event()
2025-02-03 0:43 ` [PATCH v19 06/11] perf: apple_m1: Don't disable counter in m1_pmu_enable_event() Rob Herring (Arm)
@ 2025-02-03 8:10 ` Anshuman Khandual
0 siblings, 0 replies; 43+ messages in thread
From: Anshuman Khandual @ 2025-02-03 8:10 UTC (permalink / raw)
To: Rob Herring (Arm), Will Deacon, Mark Rutland, Catalin Marinas,
Jonathan Corbet, Marc Zyngier, Oliver Upton, Joey Gouly,
Suzuki K Poulose, Zenghui Yu, James Clark
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm
On 2/3/25 06:13, Rob Herring (Arm) wrote:
> Currently m1_pmu_enable_event() starts by disabling the event counter
> it has been asked to enable. This should not be necessary as the
> counter (and the PMU as a whole) should not be active when
> m1_pmu_enable_event() is called.
>
> Cc: Marc Zyngier <maz@kernel.org>
> Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
> ---
> drivers/perf/apple_m1_cpu_pmu.c | 4 ----
> 1 file changed, 4 deletions(-)
>
> diff --git a/drivers/perf/apple_m1_cpu_pmu.c b/drivers/perf/apple_m1_cpu_pmu.c
> index 06fd317529fc..39349ecec3c1 100644
> --- a/drivers/perf/apple_m1_cpu_pmu.c
> +++ b/drivers/perf/apple_m1_cpu_pmu.c
> @@ -396,10 +396,6 @@ static void m1_pmu_enable_event(struct perf_event *event)
> user = event->hw.config_base & M1_PMU_CFG_COUNT_USER;
> kernel = event->hw.config_base & M1_PMU_CFG_COUNT_KERNEL;
>
> - m1_pmu_disable_counter_interrupt(event->hw.idx);
> - m1_pmu_disable_counter(event->hw.idx);
> - isb();
> -
> m1_pmu_configure_counter(event->hw.idx, evt, user, kernel);
> m1_pmu_enable_counter(event->hw.idx);
> m1_pmu_enable_counter_interrupt(event->hw.idx);
>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
* Re: [PATCH v19 07/11] perf: arm_pmu: Move PMUv3-specific data
2025-02-03 0:43 ` [PATCH v19 07/11] perf: arm_pmu: Move PMUv3-specific data Rob Herring (Arm)
@ 2025-02-03 8:16 ` Anshuman Khandual
0 siblings, 0 replies; 43+ messages in thread
From: Anshuman Khandual @ 2025-02-03 8:16 UTC (permalink / raw)
To: Rob Herring (Arm), Will Deacon, Mark Rutland, Catalin Marinas,
Jonathan Corbet, Marc Zyngier, Oliver Upton, Joey Gouly,
Suzuki K Poulose, Zenghui Yu, James Clark
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm
On 2/3/25 06:13, Rob Herring (Arm) wrote:
> From: Mark Rutland <mark.rutland@arm.com>
>
> A few fields in struct arm_pmu are only used with PMUv3, and soon we
> will need to add more for BRBE. Group the fields together so that we
> have a logical place to add more data in future.
>
> At the same time, remove the comment for reg_pmmir as it doesn't convey
> anything useful.
>
> There should be no functional change as a result of this patch.
>
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
> ---
> include/linux/perf/arm_pmu.h | 13 +++++++------
> 1 file changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
> index 4b5b83677e3f..c70d528594f2 100644
> --- a/include/linux/perf/arm_pmu.h
> +++ b/include/linux/perf/arm_pmu.h
> @@ -84,7 +84,6 @@ struct arm_pmu {
> struct pmu pmu;
> cpumask_t supported_cpus;
> char *name;
> - int pmuver;
> irqreturn_t (*handle_irq)(struct arm_pmu *pmu);
> void (*enable)(struct perf_event *event);
> void (*disable)(struct perf_event *event);
> @@ -102,18 +101,20 @@ struct arm_pmu {
> int (*map_event)(struct perf_event *event);
> DECLARE_BITMAP(cntr_mask, ARMPMU_MAX_HWEVENTS);
> bool secure_access; /* 32-bit ARM only */
> -#define ARMV8_PMUV3_MAX_COMMON_EVENTS 0x40
> - DECLARE_BITMAP(pmceid_bitmap, ARMV8_PMUV3_MAX_COMMON_EVENTS);
> -#define ARMV8_PMUV3_EXT_COMMON_EVENT_BASE 0x4000
> - DECLARE_BITMAP(pmceid_ext_bitmap, ARMV8_PMUV3_MAX_COMMON_EVENTS);
> struct platform_device *plat_device;
> struct pmu_hw_events __percpu *hw_events;
> struct hlist_node node;
> struct notifier_block cpu_pm_nb;
> /* the attr_groups array must be NULL-terminated */
> const struct attribute_group *attr_groups[ARMPMU_NR_ATTR_GROUPS + 1];
> - /* store the PMMIR_EL1 to expose slots */
> +
> + /* PMUv3 only */
> + int pmuver;
> u64 reg_pmmir;
> +#define ARMV8_PMUV3_MAX_COMMON_EVENTS 0x40
> + DECLARE_BITMAP(pmceid_bitmap, ARMV8_PMUV3_MAX_COMMON_EVENTS);
> +#define ARMV8_PMUV3_EXT_COMMON_EVENT_BASE 0x4000
> + DECLARE_BITMAP(pmceid_ext_bitmap, ARMV8_PMUV3_MAX_COMMON_EVENTS);
>
> /* Only to be used by ACPI probing code */
> unsigned long acpi_cpuid;
>
Makes sense.
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
* Re: [PATCH v19 08/11] arm64/sysreg: Add BRBE registers and fields
2025-02-03 0:43 ` [PATCH v19 08/11] arm64/sysreg: Add BRBE registers and fields Rob Herring (Arm)
@ 2025-02-03 8:32 ` Anshuman Khandual
0 siblings, 0 replies; 43+ messages in thread
From: Anshuman Khandual @ 2025-02-03 8:32 UTC (permalink / raw)
To: Rob Herring (Arm), Will Deacon, Mark Rutland, Catalin Marinas,
Jonathan Corbet, Marc Zyngier, Oliver Upton, Joey Gouly,
Suzuki K Poulose, Zenghui Yu, James Clark
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm, Mark Brown
On 2/3/25 06:13, Rob Herring (Arm) wrote:
> From: Anshuman Khandual <anshuman.khandual@arm.com>
>
> This patch adds definitions related to the Branch Record Buffer Extension
> (BRBE) as per ARM DDI 0487K.a. These will be used by KVM and a BRBE driver
> in subsequent patches.
>
> Some existing BRBE definitions in asm/sysreg.h are replaced with equivalent
> generated definitions.
>
> Cc: Marc Zyngier <maz@kernel.org>
> Reviewed-by: Mark Brown <broonie@kernel.org>
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
> ---
> v19:
> - split BRBINF.CC field into mantissa and exponent
> ---
> arch/arm64/include/asm/sysreg.h | 17 ++----
> arch/arm64/tools/sysreg | 132 ++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 138 insertions(+), 11 deletions(-)
>
> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> index 05ea5223d2d5..a8257e13f8f1 100644
> --- a/arch/arm64/include/asm/sysreg.h
> +++ b/arch/arm64/include/asm/sysreg.h
> @@ -198,16 +198,8 @@
> #define SYS_DBGVCR32_EL2 sys_reg(2, 4, 0, 7, 0)
>
> #define SYS_BRBINF_EL1(n) sys_reg(2, 1, 8, (n & 15), (((n & 16) >> 2) | 0))
> -#define SYS_BRBINFINJ_EL1 sys_reg(2, 1, 9, 1, 0)
> #define SYS_BRBSRC_EL1(n) sys_reg(2, 1, 8, (n & 15), (((n & 16) >> 2) | 1))
> -#define SYS_BRBSRCINJ_EL1 sys_reg(2, 1, 9, 1, 1)
> #define SYS_BRBTGT_EL1(n) sys_reg(2, 1, 8, (n & 15), (((n & 16) >> 2) | 2))
> -#define SYS_BRBTGTINJ_EL1 sys_reg(2, 1, 9, 1, 2)
> -#define SYS_BRBTS_EL1 sys_reg(2, 1, 9, 0, 2)
> -
> -#define SYS_BRBCR_EL1 sys_reg(2, 1, 9, 0, 0)
> -#define SYS_BRBFCR_EL1 sys_reg(2, 1, 9, 0, 1)
> -#define SYS_BRBIDR0_EL1 sys_reg(2, 1, 9, 2, 0)
>
> #define SYS_TRCITECR_EL1 sys_reg(3, 0, 1, 2, 3)
> #define SYS_TRCACATR(m) sys_reg(2, 1, 2, ((m & 7) << 1), (2 | (m >> 3)))
> @@ -273,8 +265,6 @@
> /* ETM */
> #define SYS_TRCOSLAR sys_reg(2, 1, 1, 0, 4)
>
> -#define SYS_BRBCR_EL2 sys_reg(2, 4, 9, 0, 0)
> -
> #define SYS_MIDR_EL1 sys_reg(3, 0, 0, 0, 0)
> #define SYS_MPIDR_EL1 sys_reg(3, 0, 0, 0, 5)
> #define SYS_REVIDR_EL1 sys_reg(3, 0, 0, 0, 6)
> @@ -610,7 +600,6 @@
> #define SYS_CNTHV_CVAL_EL2 sys_reg(3, 4, 14, 3, 2)
>
> /* VHE encodings for architectural EL0/1 system registers */
> -#define SYS_BRBCR_EL12 sys_reg(2, 5, 9, 0, 0)
> #define SYS_SCTLR_EL12 sys_reg(3, 5, 1, 0, 0)
> #define SYS_CPACR_EL12 sys_reg(3, 5, 1, 0, 2)
> #define SYS_SCTLR2_EL12 sys_reg(3, 5, 1, 0, 3)
> @@ -821,6 +810,12 @@
> #define OP_COSP_RCTX sys_insn(1, 3, 7, 3, 6)
> #define OP_CPP_RCTX sys_insn(1, 3, 7, 3, 7)
>
> +/*
> + * BRBE Instructions
> + */
> +#define BRB_IALL_INSN __emit_inst(0xd5000000 | OP_BRB_IALL | (0x1f))
> +#define BRB_INJ_INSN __emit_inst(0xd5000000 | OP_BRB_INJ | (0x1f))
> +
> /* Common SCTLR_ELx flags. */
> #define SCTLR_ELx_ENTP2 (BIT(60))
> #define SCTLR_ELx_DSSBS (BIT(44))
> diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
> index 762ee084b37c..c0943579977a 100644
> --- a/arch/arm64/tools/sysreg
> +++ b/arch/arm64/tools/sysreg
> @@ -1038,6 +1038,138 @@ UnsignedEnum 3:0 MTEPERM
> EndEnum
> EndSysreg
>
> +
> +SysregFields BRBINFx_EL1
> +Res0 63:47
> +Field 46 CCU
> +Field 45:40 CC_EXP
> +Field 39:32 CC_MANT
> +Res0 31:18
> +Field 17 LASTFAILED
> +Field 16 T
> +Res0 15:14
> +Enum 13:8 TYPE
> + 0b000000 DIRECT_UNCOND
> + 0b000001 INDIRECT
> + 0b000010 DIRECT_LINK
> + 0b000011 INDIRECT_LINK
> + 0b000101 RET
> + 0b000111 ERET
> + 0b001000 DIRECT_COND
> + 0b100001 DEBUG_HALT
> + 0b100010 CALL
> + 0b100011 TRAP
> + 0b100100 SERROR
> + 0b100110 INSN_DEBUG
> + 0b100111 DATA_DEBUG
> + 0b101010 ALIGN_FAULT
> + 0b101011 INSN_FAULT
> + 0b101100 DATA_FAULT
> + 0b101110 IRQ
> + 0b101111 FIQ
> + 0b110000 IMPDEF_TRAP_EL3
> + 0b111001 DEBUG_EXIT
> +EndEnum
> +Enum 7:6 EL
> + 0b00 EL0
> + 0b01 EL1
> + 0b10 EL2
> + 0b11 EL3
> +EndEnum
> +Field 5 MPRED
> +Res0 4:2
> +Enum 1:0 VALID
> + 0b00 NONE
> + 0b01 TARGET
> + 0b10 SOURCE
> + 0b11 FULL
> +EndEnum
> +EndSysregFields
> +
> +SysregFields BRBCR_ELx
> +Res0 63:24
> +Field 23 EXCEPTION
> +Field 22 ERTN
> +Res0 21:10
> +Field 9 FZPSS
> +Field 8 FZP
> +Res0 7
> +Enum 6:5 TS
> + 0b01 VIRTUAL
> + 0b10 GUEST_PHYSICAL
> + 0b11 PHYSICAL
> +EndEnum
> +Field 4 MPRED
> +Field 3 CC
> +Res0 2
> +Field 1 ExBRE
> +Field 0 E0BRE
> +EndSysregFields
> +
> +Sysreg BRBCR_EL1 2 1 9 0 0
> +Fields BRBCR_ELx
> +EndSysreg
> +
> +Sysreg BRBFCR_EL1 2 1 9 0 1
> +Res0 63:30
> +Enum 29:28 BANK
> + 0b00 BANK_0
> + 0b01 BANK_1
> +EndEnum
> +Res0 27:23
> +Field 22 CONDDIR
> +Field 21 DIRCALL
> +Field 20 INDCALL
> +Field 19 RTN
> +Field 18 INDIRECT
> +Field 17 DIRECT
> +Field 16 EnI
> +Res0 15:8
> +Field 7 PAUSED
> +Field 6 LASTFAILED
> +Res0 5:0
> +EndSysreg
> +
> +Sysreg BRBTS_EL1 2 1 9 0 2
> +Field 63:0 TS
> +EndSysreg
> +
> +Sysreg BRBINFINJ_EL1 2 1 9 1 0
> +Fields BRBINFx_EL1
> +EndSysreg
> +
> +Sysreg BRBSRCINJ_EL1 2 1 9 1 1
> +Field 63:0 ADDRESS
> +EndSysreg
> +
> +Sysreg BRBTGTINJ_EL1 2 1 9 1 2
> +Field 63:0 ADDRESS
> +EndSysreg
> +
> +Sysreg BRBIDR0_EL1 2 1 9 2 0
> +Res0 63:16
> +Enum 15:12 CC
> + 0b0101 20_BIT
> +EndEnum
> +Enum 11:8 FORMAT
> + 0b0000 FORMAT_0
> +EndEnum
> +Enum 7:0 NUMREC
> + 0b00001000 8
> + 0b00010000 16
> + 0b00100000 32
> + 0b01000000 64
> +EndEnum
> +EndSysreg
> +
> +Sysreg BRBCR_EL2 2 4 9 0 0
> +Fields BRBCR_ELx
> +EndSysreg
> +
> +Sysreg BRBCR_EL12 2 5 9 0 0
> +Fields BRBCR_ELx
> +EndSysreg
> +
> Sysreg ID_AA64ZFR0_EL1 3 0 0 4 4
> Res0 63:60
> UnsignedEnum 59:56 F64MM
>
LGTM.
The only change from V18 is that BRBINFx_EL1's CC[45:32] field has been
split into CC_EXP[45:40] and CC_MANT[39:32].
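
To make the CC split concrete, here is a minimal userspace sketch of how the
two fields could be combined into a cycle count. The helper name and macros
are made up for illustration. The field positions come from the BRBINFx_EL1
layout quoted above, and the decode rule (mantissa used directly when the
exponent is zero, otherwise an implicit leading one shifted left by
exponent - 1) is my reading of the Arm ARM, so please check it against the
spec before relying on it.

```c
#include <stdint.h>

/* Field positions as in the quoted BRBINFx_EL1 sysreg description. */
#define SKETCH_CC_MANT_SHIFT	32
#define SKETCH_CC_MANT_MASK	0xffULL		/* bits 39:32 */
#define SKETCH_CC_EXP_SHIFT	40
#define SKETCH_CC_EXP_MASK	0x3fULL		/* bits 45:40 */
#define SKETCH_CCU		(1ULL << 46)	/* cycle count unknown */

/*
 * Hypothetical helper, not the driver's function: returns the decoded
 * cycle count, or 0 when the hardware flags the count as unknown.
 * No saturation or range checking is attempted here.
 */
static uint64_t sketch_brbinf_cycles(uint64_t brbinf)
{
	uint64_t exp, mant;

	if (brbinf & SKETCH_CCU)
		return 0;

	exp = (brbinf >> SKETCH_CC_EXP_SHIFT) & SKETCH_CC_EXP_MASK;
	mant = (brbinf >> SKETCH_CC_MANT_SHIFT) & SKETCH_CC_MANT_MASK;

	/* Exponent of zero: the mantissa is the count itself. */
	if (exp == 0)
		return mant;

	/* Otherwise prepend an implicit 1 and scale by 2^(exp - 1). */
	return (mant | 0x100) << (exp - 1);
}
```

Treating the field as a plain 14-bit integer would give the wrong answer for
any record with a non-zero exponent, which is why the two halves are worth
naming separately.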
* Re: [PATCH v19 09/11] arm64: Handle BRBE booting requirements
2025-02-03 0:43 ` [PATCH v19 09/11] arm64: Handle BRBE booting requirements Rob Herring (Arm)
@ 2025-02-03 8:47 ` Anshuman Khandual
2025-02-12 12:10 ` Leo Yan
1 sibling, 0 replies; 43+ messages in thread
From: Anshuman Khandual @ 2025-02-03 8:47 UTC (permalink / raw)
To: Rob Herring (Arm), Will Deacon, Mark Rutland, Catalin Marinas,
Jonathan Corbet, Marc Zyngier, Oliver Upton, Joey Gouly,
Suzuki K Poulose, Zenghui Yu, James Clark
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm
On 2/3/25 06:13, Rob Herring (Arm) wrote:
> From: Anshuman Khandual <anshuman.khandual@arm.com>
>
> To use the Branch Record Buffer Extension (BRBE), some configuration is
> necessary at EL3 and EL2. This patch documents the requirements and adds
> the initial EL2 setup code, which largely consists of configuring the
> fine-grained traps and initializing a couple of BRBE control registers.
>
> Before this patch, __init_el2_fgt() would initialize HDFGRTR_EL2 and
> HDFGWTR_EL2 with the same value, relying on the read/write trap controls
> for a register occupying the same bit position in either register. The
> 'nBRBIDR' trap control only exists in bit 59 of HDFGRTR_EL2, while bit
> 59 of HDFGWTR_EL2 is RES0, and so this assumption no longer holds.
>
> To handle HDFGRTR_EL2 and HDFGWTR_EL2 having (slightly) different bit
> layouts, __init_el2_fgt() is changed to accumulate the HDFGRTR_EL2 and
> HDFGWTR_EL2 control bits separately. While making this change the
> open-coded value (1 << 62) is replaced with
> HDFG{R,W}TR_EL2_nPMSNEVFR_EL1_MASK.
>
> The BRBCR_EL1 and BRBCR_EL2 registers are unusual and require special
> initialisation: even though they are subject to E2H renaming, both have
> an effect regardless of HCR_EL2.TGE, even when running at EL2, and
> consequently both need to be initialised. This is handled in
> __init_el2_brbe() with a comment to explain the situation.
>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Oliver Upton <oliver.upton@linux.dev>
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> [Mark: rewrite commit message, fix typo in comment]
The rewritten commit message is better, thanks!
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
> ---
> Documentation/arch/arm64/booting.rst | 21 +++++++++
> arch/arm64/include/asm/el2_setup.h | 86 ++++++++++++++++++++++++++++++++++--
> 2 files changed, 104 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/arch/arm64/booting.rst b/Documentation/arch/arm64/booting.rst
> index cad6fdc96b98..0a421757cacf 100644
> --- a/Documentation/arch/arm64/booting.rst
> +++ b/Documentation/arch/arm64/booting.rst
> @@ -352,6 +352,27 @@ Before jumping into the kernel, the following conditions must be met:
>
> - HWFGWTR_EL2.nSMPRI_EL1 (bit 54) must be initialised to 0b01.
>
> + For CPUs with feature Branch Record Buffer Extension (FEAT_BRBE):
> +
> + - If EL3 is present:
> +
> + - MDCR_EL3.SBRBE (bits 33:32) must be initialised to 0b11.
> +
> + - If the kernel is entered at EL1 and EL2 is present:
> +
> + - BRBCR_EL2.CC (bit 3) must be initialised to 0b1.
> + - BRBCR_EL2.MPRED (bit 4) must be initialised to 0b1.
> +
> + - HDFGRTR_EL2.nBRBDATA (bit 61) must be initialised to 0b1.
> + - HDFGRTR_EL2.nBRBCTL (bit 60) must be initialised to 0b1.
> + - HDFGRTR_EL2.nBRBIDR (bit 59) must be initialised to 0b1.
> +
> + - HDFGWTR_EL2.nBRBDATA (bit 61) must be initialised to 0b1.
> + - HDFGWTR_EL2.nBRBCTL (bit 60) must be initialised to 0b1.
> +
> + - HFGITR_EL2.nBRBIALL (bit 56) must be initialised to 0b1.
> + - HFGITR_EL2.nBRBINJ (bit 55) must be initialised to 0b1.
> +
> For CPUs with the Scalable Matrix Extension FA64 feature (FEAT_SME_FA64):
>
> - If EL3 is present:
> diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h
> index 25e162651750..bf21ce513aff 100644
> --- a/arch/arm64/include/asm/el2_setup.h
> +++ b/arch/arm64/include/asm/el2_setup.h
> @@ -163,6 +163,39 @@
> .Lskip_set_cptr_\@:
> .endm
>
> +/*
> + * Configure BRBE to permit recording cycle counts and branch mispredicts.
> + *
> + * At any EL, to record cycle counts BRBE requires that both BRBCR_EL2.CC=1 and
> + * BRBCR_EL1.CC=1.
> + *
> + * At any EL, to record branch mispredicts BRBE requires that both
> + * BRBCR_EL2.MPRED=1 and BRBCR_EL1.MPRED=1.
> + *
> + * When HCR_EL2.E2H=1, the BRBCR_EL1 encoding is redirected to BRBCR_EL2, but
> + * the {CC,MPRED} bits in the real BRBCR_EL1 register still apply.
> + *
> + * Set {CC,MPRED} in both BRBCR_EL2 and BRBCR_EL1 so that at runtime we only
> + * need to enable/disable these in BRBCR_EL1 regardless of whether the kernel
> + * ends up executing in EL1 or EL2.
> + */
> +.macro __init_el2_brbe
> + mrs x1, id_aa64dfr0_el1
> + ubfx x1, x1, #ID_AA64DFR0_EL1_BRBE_SHIFT, #4
> + cbz x1, .Lskip_brbe_\@
> +
> + mov_q x0, BRBCR_ELx_CC | BRBCR_ELx_MPRED
> + msr_s SYS_BRBCR_EL2, x0
> +
> + __check_hvhe .Lset_brbe_nvhe_\@, x1
> + msr_s SYS_BRBCR_EL12, x0 // VHE
> + b .Lskip_brbe_\@
> +
> +.Lset_brbe_nvhe_\@:
> + msr_s SYS_BRBCR_EL1, x0 // NVHE
> +.Lskip_brbe_\@:
> +.endm
> +
> /* Disable any fine grained traps */
> .macro __init_el2_fgt
> mrs x1, id_aa64mmfr0_el1
> @@ -170,16 +203,48 @@
> cbz x1, .Lskip_fgt_\@
>
> mov x0, xzr
> + mov x2, xzr
> mrs x1, id_aa64dfr0_el1
> ubfx x1, x1, #ID_AA64DFR0_EL1_PMSVer_SHIFT, #4
> cmp x1, #3
> b.lt .Lskip_spe_fgt_\@
> +
> /* Disable PMSNEVFR_EL1 read and write traps */
> - orr x0, x0, #(1 << 62)
> + orr x0, x0, #HDFGRTR_EL2_nPMSNEVFR_EL1_MASK
> + orr x2, x2, #HDFGWTR_EL2_nPMSNEVFR_EL1_MASK
>
> .Lskip_spe_fgt_\@:
> +#ifdef CONFIG_ARM64_BRBE
> + mrs x1, id_aa64dfr0_el1
> + ubfx x1, x1, #ID_AA64DFR0_EL1_BRBE_SHIFT, #4
> + cbz x1, .Lskip_brbe_reg_fgt_\@
> +
> + /*
> + * Disable read traps for the following registers
> + *
> + * [BRBSRC|BRBTGT|BRBINF]_EL1
> + * [BRBSRCINJ|BRBTGTINJ|BRBINFINJ|BRBTS]_EL1
> + */
> + orr x0, x0, #HDFGRTR_EL2_nBRBDATA_MASK
> +
> + /*
> + * Disable write traps for the following registers
> + *
> + * [BRBSRCINJ|BRBTGTINJ|BRBINFINJ|BRBTS]_EL1
> + */
> + orr x2, x2, #HDFGWTR_EL2_nBRBDATA_MASK
> +
> + /* Disable read and write traps for [BRBCR|BRBFCR]_EL1 */
> + orr x0, x0, #HDFGRTR_EL2_nBRBCTL_MASK
> + orr x2, x2, #HDFGWTR_EL2_nBRBCTL_MASK
> +
> + /* Disable read traps for BRBIDR0_EL1 */
> + orr x0, x0, #HDFGRTR_EL2_nBRBIDR_MASK
> +
> +.Lskip_brbe_reg_fgt_\@:
> +#endif /* CONFIG_ARM64_BRBE */
> msr_s SYS_HDFGRTR_EL2, x0
> - msr_s SYS_HDFGWTR_EL2, x0
> + msr_s SYS_HDFGWTR_EL2, x2
>
> mov x0, xzr
> mrs x1, id_aa64pfr1_el1
> @@ -220,7 +285,21 @@
> .Lset_fgt_\@:
> msr_s SYS_HFGRTR_EL2, x0
> msr_s SYS_HFGWTR_EL2, x0
> - msr_s SYS_HFGITR_EL2, xzr
> + mov x0, xzr
> +#ifdef CONFIG_ARM64_BRBE
> + mrs x1, id_aa64dfr0_el1
> + ubfx x1, x1, #ID_AA64DFR0_EL1_BRBE_SHIFT, #4
> + cbz x1, .Lskip_brbe_insn_fgt_\@
> +
> + /* Disable traps for BRBIALL instruction */
> + orr x0, x0, #HFGITR_EL2_nBRBIALL_MASK
> +
> + /* Disable traps for BRBINJ instruction */
> + orr x0, x0, #HFGITR_EL2_nBRBINJ_MASK
> +
> +.Lskip_brbe_insn_fgt_\@:
> +#endif /* CONFIG_ARM64_BRBE */
> + msr_s SYS_HFGITR_EL2, x0
>
> mrs x1, id_aa64pfr0_el1 // AMU traps UNDEF without AMU
> ubfx x1, x1, #ID_AA64PFR0_EL1_AMU_SHIFT, #4
> @@ -275,6 +354,7 @@
> __init_el2_hcrx
> __init_el2_timers
> __init_el2_debug
> + __init_el2_brbe
> __init_el2_lor
> __init_el2_stage2
> __init_el2_gicv3
>
LGTM.
Both the commit message and the in-code comment formatting have changed since V18.
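
For readers skimming the assembly, the read/write split in __init_el2_fgt()
boils down to something like the C sketch below. This is only an
illustration: the struct, function, and macro names are invented, and the bit
positions are the ones given in the booting.rst hunk and the old
(1 << 62) constant quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as quoted in this patch; the names are local stand-ins. */
#define SKETCH_nPMSNEVFR_EL1	(1ULL << 62)	/* read and write */
#define SKETCH_nBRBDATA		(1ULL << 61)	/* read and write */
#define SKETCH_nBRBCTL		(1ULL << 60)	/* read and write */
#define SKETCH_nBRBIDR		(1ULL << 59)	/* HDFGRTR_EL2 only */

struct sketch_fgt {
	uint64_t hdfgrtr;	/* value destined for HDFGRTR_EL2 */
	uint64_t hdfgwtr;	/* value destined for HDFGWTR_EL2 */
};

/* Accumulate the two registers independently instead of sharing one value. */
static struct sketch_fgt sketch_debug_fgt(bool has_pmsnevfr, bool has_brbe)
{
	struct sketch_fgt v = { 0, 0 };

	if (has_pmsnevfr) {
		v.hdfgrtr |= SKETCH_nPMSNEVFR_EL1;
		v.hdfgwtr |= SKETCH_nPMSNEVFR_EL1;
	}

	if (has_brbe) {
		v.hdfgrtr |= SKETCH_nBRBDATA | SKETCH_nBRBCTL | SKETCH_nBRBIDR;
		/* The ID register is read-only, so no write control exists. */
		v.hdfgwtr |= SKETCH_nBRBDATA | SKETCH_nBRBCTL;
	}

	return v;
}
```

The point is simply that bit 59 ends up set in the read value and clear in
the write value, which a single shared register could not express.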
* Re: [PATCH v19 10/11] KVM: arm64: nvhe: Disable branch generation in nVHE guests
2025-02-03 0:43 ` [PATCH v19 10/11] KVM: arm64: nvhe: Disable branch generation in nVHE guests Rob Herring (Arm)
@ 2025-02-03 9:16 ` Anshuman Khandual
2025-02-03 11:28 ` James Clark
2025-02-13 17:03 ` Leo Yan
2 siblings, 0 replies; 43+ messages in thread
From: Anshuman Khandual @ 2025-02-03 9:16 UTC (permalink / raw)
To: Rob Herring (Arm), Will Deacon, Mark Rutland, Catalin Marinas,
Jonathan Corbet, Marc Zyngier, Oliver Upton, Joey Gouly,
Suzuki K Poulose, Zenghui Yu, James Clark
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm
On 2/3/25 06:13, Rob Herring (Arm) wrote:
> From: Anshuman Khandual <anshuman.khandual@arm.com>
>
> While BRBE can record branches within guests, the host recording
> branches in guests is not supported by perf. Therefore, BRBE needs to be
> disabled on guest entry and restored on exit.
>
> For nVHE, this requires explicit handling for guests. Before
> entering a guest, save the BRBE state and disable it. When
> returning to the host, restore the state.
>
> For VHE, it is not necessary. We initialize
> BRBCR_EL1.{E1BRE,E0BRE}=={0,0} at boot time, and HCR_EL2.TGE==1 while
> running in the host. We configure BRBCR_EL2.{E2BRE,E0HBRE} to enable
> branch recording in the host. When entering the guest, we set
> HCR_EL2.TGE==0 which means BRBCR_EL1 is used instead of BRBCR_EL2.
> Consequently for VHE, BRBE recording is disabled at EL1 and EL0 when
> running a guest.
>
> Should recording in guests (by the host) ever be desired, the perf ABI
> will need to be extended to distinguish guest addresses (struct
> perf_branch_entry.priv) for starters. BRBE records would also need to be
> invalidated on guest entry/exit as guest/host EL1 and EL0 records can't
> be distinguished.
>
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Co-developed-by: Rob Herring (Arm) <robh@kernel.org>
> Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
> ---
> v19:
> - Rework due to v6.14 debug flag changes
> - Redo commit message
> ---
> arch/arm64/include/asm/kvm_host.h | 2 ++
> arch/arm64/kvm/debug.c | 4 ++++
> arch/arm64/kvm/hyp/nvhe/debug-sr.c | 32 ++++++++++++++++++++++++++++++++
> 3 files changed, 38 insertions(+)
>
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 7cfa024de4e3..4fc246a1ee6b 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -619,6 +619,7 @@ struct kvm_host_data {
> #define KVM_HOST_DATA_FLAG_HOST_SME_ENABLED 3
> #define KVM_HOST_DATA_FLAG_TRBE_ENABLED 4
> #define KVM_HOST_DATA_FLAG_EL1_TRACING_CONFIGURED 5
> +#define KVM_HOST_DATA_FLAG_HAS_BRBE 6
There is some variation in the flag names above, but
KVM_HOST_DATA_FLAG_HAS_BRBE seems appropriate for BRBE handling.
> unsigned long flags;
>
> struct kvm_cpu_context host_ctxt;
> @@ -662,6 +663,7 @@ struct kvm_host_data {
> u64 trfcr_el1;
> /* Values of trap registers for the host before guest entry. */
> u64 mdcr_el2;
> + u64 brbcr_el1;
> } host_debug_state;
>
> /* Guest trace filter value */
> diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
> index 0e4c805e7e89..bc6015108a68 100644
> --- a/arch/arm64/kvm/debug.c
> +++ b/arch/arm64/kvm/debug.c
> @@ -81,6 +81,10 @@ void kvm_init_host_debug_data(void)
> !(read_sysreg_s(SYS_PMBIDR_EL1) & PMBIDR_EL1_P))
> host_data_set_flag(HAS_SPE);
>
> + /* Check if we have BRBE implemented and available at the host */
> + if (cpuid_feature_extract_unsigned_field(dfr0, ID_AA64DFR0_EL1_BRBE_SHIFT))
> + host_data_set_flag(HAS_BRBE);
> +
> if (cpuid_feature_extract_unsigned_field(dfr0, ID_AA64DFR0_EL1_TraceFilt_SHIFT)) {
> /* Force disable trace in protected mode in case of no TRBE */
> if (is_protected_kvm_enabled())
> diff --git a/arch/arm64/kvm/hyp/nvhe/debug-sr.c b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
> index 2f4a4f5036bb..2a1c0f49792b 100644
> --- a/arch/arm64/kvm/hyp/nvhe/debug-sr.c
> +++ b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
> @@ -92,12 +92,42 @@ static void __trace_switch_to_host(void)
> *host_data_ptr(host_debug_state.trfcr_el1));
> }
>
> +static void __debug_save_brbe(u64 *brbcr_el1)
> +{
> + *brbcr_el1 = 0;
> +
> + /* Check if the BRBE is enabled */
> + if (!(read_sysreg_el1(SYS_BRBCR) & (BRBCR_ELx_E0BRE | BRBCR_ELx_ExBRE)))
> + return;
> +
> + /*
> + * Prohibit branch record generation while we are in guest.
> + * Since access to BRBCR_EL1 is trapped, the guest can't
> + * modify the filtering set by the host.
> + */
> + *brbcr_el1 = read_sysreg_el1(SYS_BRBCR);
> + write_sysreg_el1(0, SYS_BRBCR);
> +}
> +
> +static void __debug_restore_brbe(u64 brbcr_el1)
> +{
> + if (!brbcr_el1)
> + return;
> +
> + /* Restore BRBE controls */
> + write_sysreg_el1(brbcr_el1, SYS_BRBCR);
> +}
> +
> void __debug_save_host_buffers_nvhe(struct kvm_vcpu *vcpu)
> {
> /* Disable and flush SPE data generation */
> if (host_data_test_flag(HAS_SPE))
> __debug_save_spe(host_data_ptr(host_debug_state.pmscr_el1));
>
> + /* Disable BRBE branch records */
> + if (host_data_test_flag(HAS_BRBE))
> + __debug_save_brbe(host_data_ptr(host_debug_state.brbcr_el1));
> +
> if (__trace_needs_switch())
> __trace_switch_to_guest();
> }
> @@ -111,6 +141,8 @@ void __debug_restore_host_buffers_nvhe(struct kvm_vcpu *vcpu)
> {
> if (host_data_test_flag(HAS_SPE))
> __debug_restore_spe(*host_data_ptr(host_debug_state.pmscr_el1));
> + if (host_data_test_flag(HAS_BRBE))
> + __debug_restore_brbe(*host_data_ptr(host_debug_state.brbcr_el1));
> if (__trace_needs_switch())
> __trace_switch_to_host();
> }
>
LGTM
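
As a quick mental model of the save/restore pair, the sketch below replays
the same logic in userspace against a plain variable standing in for
BRBCR_EL1. Everything here is illustrative: the variable and function names
are invented, and only the E0BRE/ExBRE bit positions are taken from the
sysreg definitions in patch 08.

```c
#include <stdint.h>

#define SKETCH_E0BRE	(1ULL << 0)
#define SKETCH_ExBRE	(1ULL << 1)

/* Stand-in for the real system register; an assumption for this sketch. */
static uint64_t sketch_brbcr_el1;

/* Guest entry: remember the host value and stop record generation. */
static void sketch_save_brbe(uint64_t *saved)
{
	*saved = 0;

	/* Nothing to do if recording is not enabled at any level. */
	if (!(sketch_brbcr_el1 & (SKETCH_E0BRE | SKETCH_ExBRE)))
		return;

	*saved = sketch_brbcr_el1;
	sketch_brbcr_el1 = 0;
}

/* Guest exit: a zero saved value means the register was left untouched. */
static void sketch_restore_brbe(uint64_t saved)
{
	if (!saved)
		return;

	sketch_brbcr_el1 = saved;
}
```

One consequence worth noting: when neither enable bit is set, other bits
such as CC or MPRED stay in the register across the guest run, since the
early return skips the write of zero.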
* Re: [PATCH v19 10/11] KVM: arm64: nvhe: Disable branch generation in nVHE guests
2025-02-03 0:43 ` [PATCH v19 10/11] KVM: arm64: nvhe: Disable branch generation in nVHE guests Rob Herring (Arm)
2025-02-03 9:16 ` Anshuman Khandual
@ 2025-02-03 11:28 ` James Clark
2025-02-13 17:03 ` Leo Yan
2 siblings, 0 replies; 43+ messages in thread
From: James Clark @ 2025-02-03 11:28 UTC (permalink / raw)
To: Rob Herring (Arm)
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm, Will Deacon, Mark Rutland, Catalin Marinas,
Jonathan Corbet, Marc Zyngier, Oliver Upton, Joey Gouly,
Suzuki K Poulose, Zenghui Yu, Anshuman Khandual
On 03/02/2025 12:43 am, Rob Herring (Arm) wrote:
> From: Anshuman Khandual <anshuman.khandual@arm.com>
>
> While BRBE can record branches within guests, the host recording
> branches in guests is not supported by perf. Therefore, BRBE needs to be
> disabled on guest entry and restored on exit.
I don't think this is strictly true. You only need a Perf session in the
guest to record sideband events. That allows you to make sense of the
userspace addresses, but by then you might as well record BRBE in the
guest in the first place. See [1] for an example.
With kernel addresses it might be even easier as all you need is
--guestvmlinux, --guestkallsyms etc and no sideband events.
[1]:
https://lore.kernel.org/all/20220711093218.10967-25-adrian.hunter@intel.com/
>
> For nVHE, this requires explicit handling for guests. Before
> entering a guest, save the BRBE state and disable it. When
> returning to the host, restore the state.
>
> For VHE, it is not necessary. We initialize
> BRBCR_EL1.{E1BRE,E0BRE}=={0,0} at boot time, and HCR_EL2.TGE==1 while
> running in the host. We configure BRBCR_EL2.{E2BRE,E0HBRE} to enable
> branch recording in the host. When entering the guest, we set
> HCR_EL2.TGE==0 which means BRBCR_EL1 is used instead of BRBCR_EL2.
> Consequently for VHE, BRBE recording is disabled at EL1 and EL0 when
> running a guest.
>
> Should recording in guests (by the host) ever be desired, the perf ABI
> will need to be extended to distinguish guest addresses (struct
> perf_branch_entry.priv) for starters.
There's already this which would be enough (if every entry in the branch
buffer matches it):
sample->cpumode == PERF_RECORD_MISC_GUEST_KERNEL
sample->cpumode == PERF_RECORD_MISC_GUEST_USER
But I don't think we need all the extra complexity. Just let the guest
use all of BRBE and then there isn't really a use case that's not
supported. I assume a lot of these workflows were added for trace
because it's not supported in guests, but I don't think that applies to
BRBE so we can skip them and go straight to full BRBE in guest support.
As a later change obviously, these comments are more about the commit
message.
James
* Re: [PATCH v19 11/11] perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE)
2025-02-03 0:43 ` [PATCH v19 11/11] perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE) Rob Herring (Arm)
@ 2025-02-03 16:53 ` James Clark
2025-02-03 17:58 ` Rob Herring
2025-02-12 18:52 ` Leo Yan
2025-02-13 16:16 ` Leo Yan
2 siblings, 1 reply; 43+ messages in thread
From: James Clark @ 2025-02-03 16:53 UTC (permalink / raw)
To: Rob Herring (Arm)
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm, Will Deacon, Mark Rutland, Catalin Marinas,
Jonathan Corbet, Marc Zyngier, Oliver Upton, Joey Gouly,
Suzuki K Poulose, Zenghui Yu, Anshuman Khandual
On 03/02/2025 12:43 am, Rob Herring (Arm) wrote:
> From: Anshuman Khandual <anshuman.khandual@arm.com>
>
> The ARMv9.2 architecture introduces the optional Branch Record Buffer
> Extension (BRBE), which records information about branches as they are
> executed into a set of branch record registers. BRBE is similar to x86's
> Last Branch Record (LBR) and PowerPC's Branch History Rolling Buffer
> (BHRB).
>
> BRBE supports filtering by exception level and can filter just the
> source or target address if excluded to avoid leaking privileged
> addresses. The h/w filter would be sufficient except when there are
> multiple events with disjoint filtering requirements. In this case, BRBE
> is configured with a union of all the events' desired branches, and then
> the recorded branches are filtered based on each event's filter. For
> example, with one event capturing kernel events and another event
> capturing user events, BRBE will be configured to capture both kernel
> and user branches. When handling event overflow, the branch records have
> to be filtered by software to only include kernel or user branch
> addresses for that event. In contrast, x86 simply configures LBR using
> the last installed event which seems broken.
>
> The event and branch exception level filtering are separately
> controlled. On x86, it is possible to request filtering which is
> disjoint (e.g. kernel only event with user only branches). It is also
> possible on x86 to configure branch filter such that no branches are
> ever recorded (e.g. -j save_type). For BRBE, events with mismatched
> exception level filtering or a configuration that will result in no
> samples are rejected. This can be relaxed in the future if such a need
> arises.
>
> The handling of KVM guests is similar to the above. On x86, branch
> recording is always disabled when a guest is running. However,
> requesting branch recording in guests is allowed. The guest events are
> recorded, but the resulting branches are all from the host. For BRBE,
> branch recording is similarly disabled when a guest is running. In
> addition, events with branch recording and "exclude_host" set are
> rejected. Requiring "exclude_guest" to be set did not work. The default
> for the perf tool does set "exclude_guest" if no exception level
> options are specified. However, specifying kernel or user defaults to
> including both host and guest. In this case, only host branches are
> recorded.
>
> BRBE can support some additional exception, FIQ, and debug branch
> types, but they are not supported currently. There's no control in the
> perf ABI to enable/disable these branch types, so they could only be
> enabled for the 'any' filter which might be undesired or unexpected.
> The other architectures don't have any support for similar events (at least
> with perf). These can be added in the future if there is demand by
> adding additional specific filter types.
>
> BRBE records are invalidated whenever events are reconfigured, a new
> task is scheduled in, or after recording is paused (and the records
> have been recorded for the event). The architecture allows branch
> records to be invalidated by the PE under implementation defined
> conditions. It is expected that these conditions are rare.
>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> Co-developed-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Co-developed-by: Rob Herring (Arm) <robh@kernel.org>
> Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
> ---
> v19:
> - Drop saving of branch records when task scheduled out. (Mark)
> - Got rid of added armpmu ops. All BRBE support contained within pmuv3
> code.
> - Dropped armpmu.num_branch_records as reg_brbidr has same info.
> - Make sched_task() callback actually get called. Enabling requires a
> call to perf_sched_cb_inc().
> - Fix freeze on overflow for VHE
> - The cycle counter doesn't freeze BRBE on overflow, so avoid assigning
> it when BRBE is enabled.
> - Drop all the Arm specific exception branches. Not a clear need for
> them.
> - Simplify enable/disable to avoid RMW and document ISBs needed
> - Fix handling of branch 'cycles' reading. CC field is
> mantissa/exponent, not an integer.
> - Save BRBFCR and BRBCR settings in event->hw.branch_reg.config and
> event->hw.extra_reg.config to avoid recalculating the register value
> each time the event is installed.
> - Rework s/w filtering to better match h/w filtering
> - Reject events with disjoint event filter and branch filter
> - Reject events if exclude_host is set
>
> v18: https://lore.kernel.org/all/20240613061731.3109448-6-anshuman.khandual@arm.com/
> ---
> drivers/perf/Kconfig | 11 +
> drivers/perf/Makefile | 1 +
> drivers/perf/arm_brbe.c | 794 +++++++++++++++++++++++++++++++++++++++++++
> drivers/perf/arm_brbe.h | 47 +++
> drivers/perf/arm_pmu.c | 15 +-
> drivers/perf/arm_pmuv3.c | 87 ++++-
> include/linux/perf/arm_pmu.h | 8 +
> 7 files changed, 958 insertions(+), 5 deletions(-)
>
[...]
> +bool brbe_branch_attr_valid(struct perf_event *event)
> +{
> + u64 branch_type = event->attr.branch_sample_type;
> +
> + /*
> + * Ensure both perf branch filter allowed and exclude
> + * masks are always in sync with the generic perf ABI.
> + */
> + BUILD_BUG_ON(BRBE_PERF_BRANCH_FILTERS != (PERF_SAMPLE_BRANCH_MAX - 1));
> +
> + if (branch_type & BRBE_EXCLUDE_BRANCH_FILTERS) {
> + pr_debug_once("requested branch filter not supported 0x%llx\n", branch_type);
> + return false;
> + }
> +
> + /* Ensure at least 1 branch type is enabled */
> + if (!(branch_type & BRBE_ALLOWED_BRANCH_TYPES)) {
> + pr_debug_once("no branch type enabled 0x%llx\n", branch_type);
> + return false;
> + }
> +
> + /*
> + * No branches are recorded in guests nor nVHE hypervisors, so
> + * excluding the host or both kernel and user is invalid.
> + *
> + * Ideally we'd just require exclude_guest and exclude_hv, but setting
> + * event filters with perf for kernel or user doesn't set exclude_guest.
> + * So effectively, exclude_guest and exclude_hv are ignored.
> + */
> + if (event->attr.exclude_host || (event->attr.exclude_user && event->attr.exclude_kernel))
> + return false;
Is there a reason to do the pr_debugs for the two cases above, but not
for the remaining ones? Seems like it should be all or nothing.
> +
> + /*
> + * Require that the event filter and branch filter permissions match.
> + *
> + * The event and branch permissions can only mismatch if the user set
> + * at least one of the privilege branch filters in PERF_SAMPLE_BRANCH_PLM_ALL.
> + * Otherwise, the core will set the branch sample permissions in
> + * perf_copy_attr().
> + */
> + if ((event->attr.exclude_user != !(branch_type & PERF_SAMPLE_BRANCH_USER)) ||
> + (event->attr.exclude_kernel != !(branch_type & PERF_SAMPLE_BRANCH_KERNEL)) ||
I don't think this one is right. By default perf_copy_attr() copies the
exclude_ settings into the branch settings so this works, but if the
user sets any _less_ permissive branch setting this fails. For example:
# perf record -j any,u -- true
Error:
cycles:PH: PMU Hardware or event type doesn't support branch stack
sampling.
Here we want the default sampling permissions (exclude_kernel == 0,
exclude_user == 0), but only user branch records, which doesn't match.
It should be allowed because it doesn't include anything that we're not
allowed to see.
This also makes the Perf branch test skip because it uses
any,save_type,u to see if BRBE exists.
> + (!is_kernel_in_hyp_mode() &&
> + (event->attr.exclude_hv != !(branch_type & PERF_SAMPLE_BRANCH_HV))))
> + return false;
> +
> + event->hw.branch_reg.config = branch_type_to_brbfcr(event->attr.branch_sample_type);
> + event->hw.extra_reg.config = branch_type_to_brbcr(event->attr.branch_sample_type);
> +
> + return true;
> +}
> +
[...]
> +static const int brbe_type_to_perf_type_map[BRBINFx_EL1_TYPE_DEBUG_EXIT + 1][2] = {
> + [BRBINFx_EL1_TYPE_DIRECT_UNCOND] = { PERF_BR_UNCOND, 0 },
Does the second field go into 'new_type'? They all seem to be zero so
I'm not sure why new_type isn't ignored instead of having it mapped.
> + [BRBINFx_EL1_TYPE_INDIRECT] = { PERF_BR_IND, 0 },
> + [BRBINFx_EL1_TYPE_DIRECT_LINK] = { PERF_BR_CALL, 0 },
> + [BRBINFx_EL1_TYPE_INDIRECT_LINK] = { PERF_BR_IND_CALL, 0 },
> + [BRBINFx_EL1_TYPE_RET] = { PERF_BR_RET, 0 },
> + [BRBINFx_EL1_TYPE_DIRECT_COND] = { PERF_BR_COND, 0 },
> + [BRBINFx_EL1_TYPE_CALL] = { PERF_BR_CALL, 0 },
> + [BRBINFx_EL1_TYPE_ERET] = { PERF_BR_ERET, 0 },
> + [BRBINFx_EL1_TYPE_IRQ] = { PERF_BR_IRQ, 0 },
How do ones that don't map to anything appear in Perf? For example
BRBINFx_EL1_TYPE_TRAP is missing, and the test that was attached to the
previous versions fails because it doesn't see the trap that jumps to
the kernel, but it does still see the ERET back to userspace:
[unknown]/trap_bench+0x20/-/-/-/0/ERET/-
In older versions we'd also have BRBINFx_EL1_TYPE_TRAP mapping to
PERF_BR_SYSCALL so you could see it go into the kernel before the return:
trap_bench+0x1C/[unknown]/-/-/-/0/SYSCALL/-
[unknown]/trap_bench+0x20/-/-/-/0/ERET/-
> +};
> +
> +static void brbe_set_perf_entry_type(struct perf_branch_entry *entry, u64 brbinf)
> +{
> + int brbe_type = brbinf_get_type(brbinf);
> +
> + if (brbe_type <= BRBINFx_EL1_TYPE_DEBUG_EXIT) {
> + const int *br_type = brbe_type_to_perf_type_map[brbe_type];
> +
> + entry->type = br_type[0];
> + entry->new_type = br_type[1];
> + }
> +}
> +
[...]
> + if (branch_sample & PERF_SAMPLE_BRANCH_ANY_RETURN) {
> + set_bit(PERF_BR_RET, event_type_mask);
> +
> + if (!event->attr.exclude_kernel)
> + set_bit(PERF_BR_ERET, event_type_mask);
You could argue that ERET should be included even if exclude_kernel is
set, otherwise you miss the point that you returned to in userspace and
leave a gap in the program flow. See the trap and eret example above.
It looks like we still have the zeroing of the kernel address in this
version if we only have userspace privilege, so it should be fine to
show the ERET and the target address.
* Re: [PATCH v19 11/11] perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE)
2025-02-03 16:53 ` James Clark
@ 2025-02-03 17:58 ` Rob Herring
2025-02-04 12:02 ` James Clark
0 siblings, 1 reply; 43+ messages in thread
From: Rob Herring @ 2025-02-03 17:58 UTC (permalink / raw)
To: James Clark
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm, Will Deacon, Mark Rutland, Catalin Marinas,
Jonathan Corbet, Marc Zyngier, Oliver Upton, Joey Gouly,
Suzuki K Poulose, Zenghui Yu, Anshuman Khandual
On Mon, Feb 3, 2025 at 10:53 AM James Clark <james.clark@linaro.org> wrote:
>
>
>
> On 03/02/2025 12:43 am, Rob Herring (Arm) wrote:
> > From: Anshuman Khandual <anshuman.khandual@arm.com>
> >
> > The ARMv9.2 architecture introduces the optional Branch Record Buffer
> > Extension (BRBE), which records information about branches as they are
> > executed into a set of branch record registers. BRBE is similar to x86's
> > Last Branch Record (LBR) and PowerPC's Branch History Rolling Buffer
> > (BHRB).
> >
> > BRBE supports filtering by exception level and can filter just the
> > source or target address if excluded to avoid leaking privileged
> > addresses. The h/w filter would be sufficient except when there are
> > multiple events with disjoint filtering requirements. In this case, BRBE
> > is configured with a union of all the events' desired branches, and then
> > the recorded branches are filtered based on each event's filter. For
> > example, with one event capturing kernel events and another event
> > capturing user events, BRBE will be configured to capture both kernel
> > and user branches. When handling event overflow, the branch records have
> > to be filtered by software to only include kernel or user branch
> > addresses for that event. In contrast, x86 simply configures LBR using
> > the last installed event which seems broken.
> >
> > The event and branch exception level filtering are separately
> > controlled. On x86, it is possible to request filtering which is
> > disjoint (e.g. kernel only event with user only branches). It is also
> > possible on x86 to configure branch filter such that no branches are
> > ever recorded (e.g. -j save_type). For BRBE, events with mismatched
> > exception level filtering or a configuration that will result in no
> > samples are rejected. This can be relaxed in the future if such a need
> > arises.
> >
> > The handling of KVM guests is similar to the above. On x86, branch
> > recording is always disabled when a guest is running. However,
> > requesting branch recording in guests is allowed. The guest events are
> > recorded, but the resulting branches are all from the host. For BRBE,
> > branch recording is similarly disabled when guest is running. In
> > addition, events with branch recording and "exclude_host" set are
> > rejected. Requiring "exclude_guest" to be set did not work. The default
> > for the perf tool does set "exclude_guest" if no exception level
> > options are specified. However, specifying kernel or user defaults to
> > including both host and guest. In this case, only host branches are
> > recorded.
> >
> > BRBE can support some additional exception, FIQ, and debug branch
> > types, but they are not supported currently. There's no control in the
> > perf ABI to enable/disable these branch types, so they could only be
> > enabled for the 'any' filter which might be undesired or unexpected.
> > The other architectures don't have any support for similar events (at least
> > with perf). These can be added in the future if there is demand by
> > adding additional specific filter types.
> >
> > BRBE records are invalidated whenever events are reconfigured, a new
> > task is scheduled in, or after recording is paused (and the records
> > have been recorded for the event). The architecture allows branch
> > records to be invalidated by the PE under implementation defined
> > conditions. It is expected that these conditions are rare.
> >
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> > Co-developed-by: Mark Rutland <mark.rutland@arm.com>
> > Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> > Co-developed-by: Rob Herring (Arm) <robh@kernel.org>
> > Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
> > ---
> > v19:
> > - Drop saving of branch records when task scheduled out. (Mark)
> > - Got rid of added armpmu ops. All BRBE support contained within pmuv3
> > code.
> > - Dropped armpmu.num_branch_records as reg_brbidr has same info.
> > - Make sched_task() callback actually get called. Enabling requires a
> > call to perf_sched_cb_inc().
> > - Fix freeze on overflow for VHE
> > - The cycle counter doesn't freeze BRBE on overflow, so avoid assigning
> > it when BRBE is enabled.
> > - Drop all the Arm specific exception branches. Not a clear need for
> > them.
> > - Simplify enable/disable to avoid RMW and document ISBs needed
> > - Fix handling of branch 'cycles' reading. CC field is
> > mantissa/exponent, not an integer.
> > - Save BRBFCR and BRBCR settings in event->hw.branch_reg.config and
> > event->hw.extra_reg.config to avoid recalculating the register value
> > each time the event is installed.
> > - Rework s/w filtering to better match h/w filtering
> > - Reject events with disjoint event filter and branch filter
> > - Reject events if exclude_host is set
> >
> > v18: https://lore.kernel.org/all/20240613061731.3109448-6-anshuman.khandual@arm.com/
> > ---
> > drivers/perf/Kconfig | 11 +
> > drivers/perf/Makefile | 1 +
> > drivers/perf/arm_brbe.c | 794 +++++++++++++++++++++++++++++++++++++++++++
> > drivers/perf/arm_brbe.h | 47 +++
> > drivers/perf/arm_pmu.c | 15 +-
> > drivers/perf/arm_pmuv3.c | 87 ++++-
> > include/linux/perf/arm_pmu.h | 8 +
> > 7 files changed, 958 insertions(+), 5 deletions(-)
> >
> [...]
> > +bool brbe_branch_attr_valid(struct perf_event *event)
> > +{
> > + u64 branch_type = event->attr.branch_sample_type;
> > +
> > + /*
> > + * Ensure both perf branch filter allowed and exclude
> > + * masks are always in sync with the generic perf ABI.
> > + */
> > + BUILD_BUG_ON(BRBE_PERF_BRANCH_FILTERS != (PERF_SAMPLE_BRANCH_MAX - 1));
> > +
> > + if (branch_type & BRBE_EXCLUDE_BRANCH_FILTERS) {
> > + pr_debug_once("requested branch filter not supported 0x%llx\n", branch_type);
> > + return false;
> > + }
> > +
> > + /* Ensure at least 1 branch type is enabled */
> > + if (!(branch_type & BRBE_ALLOWED_BRANCH_TYPES)) {
> > + pr_debug_once("no branch type enabled 0x%llx\n", branch_type);
> > + return false;
> > + }
> > +
> > + /*
> > + * No branches are recorded in guests nor nVHE hypervisors, so
> > + * excluding the host or both kernel and user is invalid.
> > + *
> > + * Ideally we'd just require exclude_guest and exclude_hv, but setting
> > + * event filters with perf for kernel or user don't set exclude_guest.
> > + * So effectively, exclude_guest and exclude_hv are ignored.
> > + */
> > + if (event->attr.exclude_host || (event->attr.exclude_user && event->attr.exclude_kernel))
> > + return false;
>
> Is there a reason to do the pr_debugs for the two cases above, but not
> for the remaining ones? Seems like it should be all or nothing.
Shrug. Anshuman wrote the pr_debugs. I wrote this part. Honestly, I
don't know why you'd want them only once if they are gated off by
debug. I guess since other cases of rejecting events outside this
function have pr_debug() we should do the same here.
> > +
> > + /*
> > + * Require that the event filter and branch filter permissions match.
> > + *
> > + * The event and branch permissions can only mismatch if the user set
> > + * at least one of the privilege branch filters in PERF_SAMPLE_BRANCH_PLM_ALL.
> > + * Otherwise, the core will set the branch sample permissions in
> > + * perf_copy_attr().
> > + */
> > + if ((event->attr.exclude_user != !(branch_type & PERF_SAMPLE_BRANCH_USER)) ||
> > + (event->attr.exclude_kernel != !(branch_type & PERF_SAMPLE_BRANCH_KERNEL)) ||
>
> I don't think this one is right. By default perf_copy_attr() copies the
> exclude_ settings into the branch settings so this works, but if the
> user sets any _less_ permissive branch setting this fails. For example:
>
> # perf record -j any,u -- true
> Error:
> cycles:PH: PMU Hardware or event type doesn't support branch stack
> sampling.
>
> Here we want the default sampling permissions (exclude_kernel == 0,
> exclude_user == 0), but only user branch records, which doesn't match.
> It should be allowed because it doesn't include anything that we're not
> allowed to see.
I know it is allowed (on x86), but why would we want that? If you do
something even more restricted:
perf record -e cycles:k -j any,u -- true
That's allowed on x86 and gives you samples with user addresses. But
all the events happened in the kernel. How does that make any sense?
I suppose in your example, we could avoid attaching branch stack on
samples from the kernel. However, given how my example works, I'm
pretty sure that's not what x86 does.
There are also combinations that are allowed, but record no samples.
Though I think that was with guest events. I've gone with rejecting
nonsensical combinations as much as possible. We can easily remove those
restrictions later if needed. Changing the behavior later (for the
same configuration) wouldn't be good.
> This also makes the Perf branch test skip because it uses
> any,save_type,u to see if BRBE exists.
Yes, I plan to update that if we keep this behavior.
> > + (!is_kernel_in_hyp_mode() &&
> > + (event->attr.exclude_hv != !(branch_type & PERF_SAMPLE_BRANCH_HV))))
> > + return false;
> > +
> > + event->hw.branch_reg.config = branch_type_to_brbfcr(event->attr.branch_sample_type);
> > + event->hw.extra_reg.config = branch_type_to_brbcr(event->attr.branch_sample_type);
> > +
> > + return true;
> > +}
> > +
> [...]
> > +static const int brbe_type_to_perf_type_map[BRBINFx_EL1_TYPE_DEBUG_EXIT + 1][2] = {
> > + [BRBINFx_EL1_TYPE_DIRECT_UNCOND] = { PERF_BR_UNCOND, 0 },
>
> Does the second field go into 'new_type'? They all seem to be zero so
> I'm not sure why new_type isn't ignored instead of having it mapped.
Well, left over from when all the Arm specific types were supported.
So yeah, that can be simplified.
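A minimal sketch of what the simplified table could look like, assuming
new_type is dropped entirely. The enum names below are hypothetical
stand-ins, not the real BRBINFx_EL1_TYPE_* or PERF_BR_* definitions; the
only thing carried over from the perf ABI is that PERF_BR_UNKNOWN is 0:

```c
/* Hypothetical stand-ins for the BRBE record types. */
enum sketch_brbe_type {
	SK_BRBE_DIRECT_UNCOND,
	SK_BRBE_INDIRECT,
	SK_BRBE_RET,
	SK_BRBE_TRAP,		/* deliberately left out of the map below */
	SK_BRBE_TYPE_MAX,
};

/* Hypothetical stand-ins for the perf branch types; 0 means "unknown". */
enum sketch_perf_type {
	SK_PERF_BR_UNKNOWN = 0,
	SK_PERF_BR_UNCOND,
	SK_PERF_BR_IND,
	SK_PERF_BR_RET,
};

/* One entry per BRBE type; entries not listed are zero-initialized. */
static const int sketch_type_map[SK_BRBE_TYPE_MAX] = {
	[SK_BRBE_DIRECT_UNCOND]	= SK_PERF_BR_UNCOND,
	[SK_BRBE_INDIRECT]	= SK_PERF_BR_IND,
	[SK_BRBE_RET]		= SK_PERF_BR_RET,
};

/* Look up the perf type, treating out-of-range input as unknown. */
static int sketch_perf_type(unsigned int brbe_type)
{
	if (brbe_type >= (unsigned int)SK_BRBE_TYPE_MAX)
		return SK_PERF_BR_UNKNOWN;
	return sketch_type_map[brbe_type];
}
```

With this layout an unmapped record type reads back as 0 (unknown).
Whether such records should be reported at all is a separate question.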
> > + [BRBINFx_EL1_TYPE_INDIRECT] = { PERF_BR_IND, 0 },
> > + [BRBINFx_EL1_TYPE_DIRECT_LINK] = { PERF_BR_CALL, 0 },
> > + [BRBINFx_EL1_TYPE_INDIRECT_LINK] = { PERF_BR_IND_CALL, 0 },
> > + [BRBINFx_EL1_TYPE_RET] = { PERF_BR_RET, 0 },
> > + [BRBINFx_EL1_TYPE_DIRECT_COND] = { PERF_BR_COND, 0 },
> > + [BRBINFx_EL1_TYPE_CALL] = { PERF_BR_CALL, 0 },
> > + [BRBINFx_EL1_TYPE_ERET] = { PERF_BR_ERET, 0 },
> > + [BRBINFx_EL1_TYPE_IRQ] = { PERF_BR_IRQ, 0 },
>
> How do ones that don't map to anything appear in Perf? For example
> BRBINFx_EL1_TYPE_TRAP is missing, and the test that was attached to the
> previous versions fails because it doesn't see the trap that jumps to
> the kernel, but it does still see the ERET back to userspace:
>
> [unknown]/trap_bench+0x20/-/-/-/0/ERET/-
>
> In older versions we'd also have BRBINFx_EL1_TYPE_TRAP mapping to
> PERF_BR_SYSCALL so you could see it go into the kernel before the return:
>
> trap_bench+0x1C/[unknown]/-/-/-/0/SYSCALL/-
> [unknown]/trap_bench+0x20/-/-/-/0/ERET/-
My read of that was we should see a CALL in this case. Whether SVC
generates a TRAP or CALL depends on HFGITR_EL2.SVC_EL0 (table D18-2).
I assumed "SVC due to HFGITR_EL2.SVC_EL0" means when SVC_EL0 is set
(and set has additional conditions). We have SVC_EL0 cleared, so that
should be a CALL. Maybe the FVP has this wrong?
> > +};
> > +
> > +static void brbe_set_perf_entry_type(struct perf_branch_entry *entry, u64 brbinf)
> > +{
> > + int brbe_type = brbinf_get_type(brbinf);
> > +
> > + if (brbe_type <= BRBINFx_EL1_TYPE_DEBUG_EXIT) {
> > + const int *br_type = brbe_type_to_perf_type_map[brbe_type];
> > +
> > + entry->type = br_type[0];
> > + entry->new_type = br_type[1];
> > + }
> > +}
> > +
>
> [...]
>
> > + if (branch_sample & PERF_SAMPLE_BRANCH_ANY_RETURN) {
> > + set_bit(PERF_BR_RET, event_type_mask);
> > +
> > + if (!event->attr.exclude_kernel)
> > + set_bit(PERF_BR_ERET, event_type_mask);
>
> You could argue that ERET should be included even if exclude_kernel is
> set, otherwise you miss the point that you returned to in userspace and
> leave a gap in the program flow. See the trap and eret example above.
>
> It looks like we still have the zeroing of the kernel address in this
> version if we only have userspace privilege, so it should be fine to
> show the ERET and the target address.
Yes, agreed.
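Roughly like this, as a sketch only. The bit numbers and the helper name
are hypothetical and stand in for the PERF_BR_* values and the
event_type_mask handling in the patch; the only change shown is that ERET
no longer depends on exclude_kernel:

```c
#include <stdbool.h>

/* Hypothetical bit positions standing in for PERF_BR_RET / PERF_BR_ERET. */
#define SK_BIT_RET	0
#define SK_BIT_ERET	1

/*
 * Build the return-type part of the software filter mask. ERET is set
 * whenever returns are requested, regardless of exclude_kernel, so the
 * point of return into userspace stays visible.
 */
static unsigned long sketch_return_mask(bool any_return, bool exclude_kernel)
{
	unsigned long mask = 0;

	(void)exclude_kernel;	/* no longer consulted for ERET */

	if (any_return) {
		mask |= 1UL << SK_BIT_RET;
		mask |= 1UL << SK_BIT_ERET;
	}
	return mask;
}
```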
Rob
* Re: [PATCH v19 11/11] perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE)
2025-02-03 17:58 ` Rob Herring
@ 2025-02-04 12:02 ` James Clark
2025-02-04 15:03 ` Rob Herring
0 siblings, 1 reply; 43+ messages in thread
From: James Clark @ 2025-02-04 12:02 UTC (permalink / raw)
To: Rob Herring
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm, Will Deacon, Mark Rutland, Catalin Marinas,
Jonathan Corbet, Marc Zyngier, Oliver Upton, Joey Gouly,
Suzuki K Poulose, Zenghui Yu, Anshuman Khandual
On 03/02/2025 5:58 pm, Rob Herring wrote:
> On Mon, Feb 3, 2025 at 10:53 AM James Clark <james.clark@linaro.org> wrote:
>>
>>
>>
>> On 03/02/2025 12:43 am, Rob Herring (Arm) wrote:
>>> From: Anshuman Khandual <anshuman.khandual@arm.com>
>>>
>>> The ARMv9.2 architecture introduces the optional Branch Record Buffer
>>> Extension (BRBE), which records information about branches as they are
>>> executed into a set of branch record registers. BRBE is similar to x86's
>>> Last Branch Record (LBR) and PowerPC's Branch History Rolling Buffer
>>> (BHRB).
>>>
>>> BRBE supports filtering by exception level and can filter just the
>>> source or target address if excluded to avoid leaking privileged
>>> addresses. The h/w filter would be sufficient except when there are
>>> multiple events with disjoint filtering requirements. In this case, BRBE
>>> is configured with a union of all the events' desired branches, and then
>>> the recorded branches are filtered based on each event's filter. For
>>> example, with one event capturing kernel events and another event
>>> capturing user events, BRBE will be configured to capture both kernel
>>> and user branches. When handling event overflow, the branch records have
>>> to be filtered by software to only include kernel or user branch
>>> addresses for that event. In contrast, x86 simply configures LBR using
>>> the last installed event which seems broken.
>>>
>>> The event and branch exception level filtering are separately
>>> controlled. On x86, it is possible to request filtering which is
>>> disjoint (e.g. kernel only event with user only branches). It is also
>>> possible on x86 to configure branch filter such that no branches are
>>> ever recorded (e.g. -j save_type). For BRBE, events with mismatched
>>> exception level filtering or a configuration that will result in no
>>> samples are rejected. This can be relaxed in the future if such a need
>>> arises.
>>>
>>> The handling of KVM guests is similar to the above. On x86, branch
>>> recording is always disabled when a guest is running. However,
>>> requesting branch recording in guests is allowed. The guest events are
>>> recorded, but the resulting branches are all from the host. For BRBE,
>>> branch recording is similarly disabled when guest is running. In
>>> addition, events with branch recording and "exclude_host" set are
>>> rejected. Requiring "exclude_guest" to be set did not work. The default
>>> for the perf tool does set "exclude_guest" if no exception level
>>> options are specified. However, specifying kernel or user defaults to
>>> including both host and guest. In this case, only host branches are
>>> recorded.
>>>
>>> BRBE can support some additional exception, FIQ, and debug branch
>>> types, but they are not supported currently. There's no control in the
>>> perf ABI to enable/disable these branch types, so they could only be
>>> enabled for the 'any' filter which might be undesired or unexpected.
>>> The other architectures don't have any support for similar events (at least
>>> with perf). These can be added in the future if there is demand by
>>> adding additional specific filter types.
>>>
>>> BRBE records are invalidated whenever events are reconfigured, a new
>>> task is scheduled in, or after recording is paused (and the records
>>> have been recorded for the event). The architecture allows branch
>>> records to be invalidated by the PE under implementation defined
>>> conditions. It is expected that these conditions are rare.
>>>
>>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>>> Co-developed-by: Mark Rutland <mark.rutland@arm.com>
>>> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
>>> Co-developed-by: Rob Herring (Arm) <robh@kernel.org>
>>> Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
>>> ---
>>> v19:
>>> - Drop saving of branch records when task scheduled out. (Mark)
>>> - Got rid of added armpmu ops. All BRBE support contained within pmuv3
>>> code.
>>> - Dropped armpmu.num_branch_records as reg_brbidr has same info.
>>> - Make sched_task() callback actually get called. Enabling requires a
>>> call to perf_sched_cb_inc().
>>> - Fix freeze on overflow for VHE
>>> - The cycle counter doesn't freeze BRBE on overflow, so avoid assigning
>>> it when BRBE is enabled.
>>> - Drop all the Arm specific exception branches. Not a clear need for
>>> them.
>>> - Simplify enable/disable to avoid RMW and document ISBs needed
>>> - Fix handling of branch 'cycles' reading. CC field is
>>> mantissa/exponent, not an integer.
>>> - Save BRBFCR and BRBCR settings in event->hw.branch_reg.config and
>>> event->hw.extra_reg.config to avoid recalculating the register value
>>> each time the event is installed.
>>> - Rework s/w filtering to better match h/w filtering
>>> - Reject events with disjoint event filter and branch filter
>>> - Reject events if exclude_host is set
>>>
>>> v18: https://lore.kernel.org/all/20240613061731.3109448-6-anshuman.khandual@arm.com/
>>> ---
>>> drivers/perf/Kconfig | 11 +
>>> drivers/perf/Makefile | 1 +
>>> drivers/perf/arm_brbe.c | 794 +++++++++++++++++++++++++++++++++++++++++++
>>> drivers/perf/arm_brbe.h | 47 +++
>>> drivers/perf/arm_pmu.c | 15 +-
>>> drivers/perf/arm_pmuv3.c | 87 ++++-
>>> include/linux/perf/arm_pmu.h | 8 +
>>> 7 files changed, 958 insertions(+), 5 deletions(-)
>>>
>> [...]
>>> +bool brbe_branch_attr_valid(struct perf_event *event)
>>> +{
>>> + u64 branch_type = event->attr.branch_sample_type;
>>> +
>>> + /*
>>> + * Ensure both perf branch filter allowed and exclude
>>> + * masks are always in sync with the generic perf ABI.
>>> + */
>>> + BUILD_BUG_ON(BRBE_PERF_BRANCH_FILTERS != (PERF_SAMPLE_BRANCH_MAX - 1));
>>> +
>>> + if (branch_type & BRBE_EXCLUDE_BRANCH_FILTERS) {
>>> + pr_debug_once("requested branch filter not supported 0x%llx\n", branch_type);
>>> + return false;
>>> + }
>>> +
>>> + /* Ensure at least 1 branch type is enabled */
>>> + if (!(branch_type & BRBE_ALLOWED_BRANCH_TYPES)) {
>>> + pr_debug_once("no branch type enabled 0x%llx\n", branch_type);
>>> + return false;
>>> + }
>>> +
>>> + /*
>>> + * No branches are recorded in guests nor nVHE hypervisors, so
>>> + * excluding the host or both kernel and user is invalid.
>>> + *
>>> + * Ideally we'd just require exclude_guest and exclude_hv, but setting
>>> + * event filters with perf for kernel or user don't set exclude_guest.
>>> + * So effectively, exclude_guest and exclude_hv are ignored.
>>> + */
>>> + if (event->attr.exclude_host || (event->attr.exclude_user && event->attr.exclude_kernel))
>>> + return false;
>>
>> Is there a reason to do the pr_debugs for the two cases above, but not
>> for the remaining ones? Seems like it should be all or nothing.
>
> Shrug. Anshuman wrote the pr_debugs. I wrote this part. Honestly, I
> don't know why you'd want them only once if they are gated off by
> debug. I guess since other cases of rejecting events outside this
> function have pr_debug() we should do the same here.
>
>>> +
>>> + /*
>>> + * Require that the event filter and branch filter permissions match.
>>> + *
>>> + * The event and branch permissions can only mismatch if the user set
>>> + * at least one of the privilege branch filters in PERF_SAMPLE_BRANCH_PLM_ALL.
>>> + * Otherwise, the core will set the branch sample permissions in
>>> + * perf_copy_attr().
>>> + */
>>> + if ((event->attr.exclude_user != !(branch_type & PERF_SAMPLE_BRANCH_USER)) ||
>>> + (event->attr.exclude_kernel != !(branch_type & PERF_SAMPLE_BRANCH_KERNEL)) ||
>>
>> I don't think this one is right. By default perf_copy_attr() copies the
>> exclude_ settings into the branch settings so this works, but if the
>> user sets any _less_ permissive branch setting this fails. For example:
>>
>> # perf record -j any,u -- true
>> Error:
>> cycles:PH: PMU Hardware or event type doesn't support branch stack
>> sampling.
>>
>> Here we want the default sampling permissions (exclude_kernel == 0,
>> exclude_user == 0), but only user branch records, which doesn't match.
>> It should be allowed because it doesn't include anything that we're not
>> allowed to see.
>
> I know it is allowed (on x86), but why would we want that? If you do
> something even more restricted:
>
> perf record -e cycles:k -j any,u -- true
>
> That's allowed on x86 and gives you samples with user addresses. But
> all the events happened in the kernel. How does that make any sense?
>
> I suppose in your example, we could avoid attaching branch stack on
> samples from the kernel. However, given how my example works, I'm
> pretty sure that's not what x86 does.
>
> There are also combinations that are allowed, but record no samples.
> Though I think that was with guest events. I've gone with rejecting
> nonsensical combinations as much as possible. We can easily remove those
> restrictions later if needed. Changing the behavior later (for the
> same configuration) wouldn't be good.
>
>
Rejecting ones that produce no samples is fair enough, but my example
does produce samples. To answer the question "why would we want that?",
nothing major, but there are a few small reasons:
* Perf includes both user and kernel by default, so the shortest
command to only gather user branches doesn't work (-j any,u)
* The test already checks for branch stack support like this, so old
Perf test versions don't work
* You might only be optimising userspace, but still interested in the
proportion of time spent or particular place in the kernel
* Consistency with existing implementations and for people porting
existing tools to Arm
* It doesn't cost anything to support it (I think we would just
check whether exclude_* is set rather than comparing with !=)
* Permissions checks should be handled by the core code so that
they're consistent
* What's the point of separate branch filters anyway if they always
have to match the event filter?
Some of these things could be fixed in Perf, but not in older versions.
Even if we can't think of a real use case now, it doesn't sound like the
driver should be so restrictive of an option that doesn't do any harm.
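Something like the following is what I have in mind, as a rough sketch
only. The struct and the SKETCH_* names are hypothetical stand-ins for the
relevant perf_event_attr fields and PERF_SAMPLE_BRANCH_* bits, and the
exclude_hv case is left out:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the user/kernel privilege bits of the branch filter. */
#define SKETCH_BRANCH_USER	(1ULL << 0)
#define SKETCH_BRANCH_KERNEL	(1ULL << 1)

/* Hypothetical reduced view of the attr fields the check looks at. */
struct sketch_attr {
	bool exclude_user;
	bool exclude_kernel;
	uint64_t branch_sample_type;
};

/*
 * Reject only when the branch filter asks for a privilege level that
 * the event filter excludes. A branch filter that is narrower than
 * the event filter is accepted.
 */
static bool sketch_branch_plm_valid(const struct sketch_attr *attr)
{
	if (attr->exclude_user &&
	    (attr->branch_sample_type & SKETCH_BRANCH_USER))
		return false;

	if (attr->exclude_kernel &&
	    (attr->branch_sample_type & SKETCH_BRANCH_KERNEL))
		return false;

	return true;
}
```

Under this sketch the -j any,u case above is accepted, while a kernel-only
event asking for user branches (cycles:k with -j any,u) is still rejected.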
>> This also makes the Perf branch test skip because it uses
>> any,save_type,u to see if BRBE exists.
>
> Yes, I plan to update that if we keep this behavior.
>
>>> + (!is_kernel_in_hyp_mode() &&
>>> + (event->attr.exclude_hv != !(branch_type & PERF_SAMPLE_BRANCH_HV))))
>>> + return false;
>>> +
>>> + event->hw.branch_reg.config = branch_type_to_brbfcr(event->attr.branch_sample_type);
>>> + event->hw.extra_reg.config = branch_type_to_brbcr(event->attr.branch_sample_type);
>>> +
>>> + return true;
>>> +}
>>> +
>> [...]
>>> +static const int brbe_type_to_perf_type_map[BRBINFx_EL1_TYPE_DEBUG_EXIT + 1][2] = {
>>> + [BRBINFx_EL1_TYPE_DIRECT_UNCOND] = { PERF_BR_UNCOND, 0 },
>>
>> Does the second field go into 'new_type'? They all seem to be zero so
>> I'm not sure why new_type isn't ignored instead of having it mapped.
>
> Well, left over from when all the Arm specific types were supported.
> So yeah, that can be simplified.
>
>>> + [BRBINFx_EL1_TYPE_INDIRECT] = { PERF_BR_IND, 0 },
>>> + [BRBINFx_EL1_TYPE_DIRECT_LINK] = { PERF_BR_CALL, 0 },
>>> + [BRBINFx_EL1_TYPE_INDIRECT_LINK] = { PERF_BR_IND_CALL, 0 },
>>> + [BRBINFx_EL1_TYPE_RET] = { PERF_BR_RET, 0 },
>>> + [BRBINFx_EL1_TYPE_DIRECT_COND] = { PERF_BR_COND, 0 },
>>> + [BRBINFx_EL1_TYPE_CALL] = { PERF_BR_CALL, 0 },
>>> + [BRBINFx_EL1_TYPE_ERET] = { PERF_BR_ERET, 0 },
>>> + [BRBINFx_EL1_TYPE_IRQ] = { PERF_BR_IRQ, 0 },
>>
>> How do ones that don't map to anything appear in Perf? For example
>> BRBINFx_EL1_TYPE_TRAP is missing, and the test that was attached to the
>> previous versions fails because it doesn't see the trap that jumps to
>> the kernel, but it does still see the ERET back to userspace:
>>
>> [unknown]/trap_bench+0x20/-/-/-/0/ERET/-
>>
>> In older versions we'd also have BRBINFx_EL1_TYPE_TRAP mapping to
>> PERF_BR_SYSCALL so you could see it go into the kernel before the return:
>>
>> trap_bench+0x1C/[unknown]/-/-/-/0/SYSCALL/-
>> [unknown]/trap_bench+0x20/-/-/-/0/ERET/-
>
> My read of that was we should see a CALL in this case. Whether SVC
> generates a TRAP or CALL depends on HFGITR_EL2.SVC_EL0 (table D18-2).
> I assumed "SVC due to HFGITR_EL2.SVC_EL0" means when SVC_EL0 is set
> (and set has additional conditions). We have SVC_EL0 cleared, so that
> should be a CALL. Maybe the FVP has this wrong?
>
The test is doing this rather than a syscall:
asm("mrs %0, ID_AA64ISAR0_EL1" : "=r" (val)); /* TRAP + ERET */
So I think trap is right. Whether that should be mapped to SYSCALL or
some other branch type I don't know, but the point is that it's missing now.
* Re: [PATCH v19 11/11] perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE)
2025-02-04 12:02 ` James Clark
@ 2025-02-04 15:03 ` Rob Herring
2025-02-05 14:38 ` James Clark
0 siblings, 1 reply; 43+ messages in thread
From: Rob Herring @ 2025-02-04 15:03 UTC (permalink / raw)
To: James Clark
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm, Will Deacon, Mark Rutland, Catalin Marinas,
Jonathan Corbet, Marc Zyngier, Oliver Upton, Joey Gouly,
Suzuki K Poulose, Zenghui Yu, Anshuman Khandual
On Tue, Feb 4, 2025 at 6:03 AM James Clark <james.clark@linaro.org> wrote:
>
>
>
> On 03/02/2025 5:58 pm, Rob Herring wrote:
> > On Mon, Feb 3, 2025 at 10:53 AM James Clark <james.clark@linaro.org> wrote:
> >>
> >>
> >>
> >> On 03/02/2025 12:43 am, Rob Herring (Arm) wrote:
> >>> From: Anshuman Khandual <anshuman.khandual@arm.com>
> >>>
> >>> The ARMv9.2 architecture introduces the optional Branch Record Buffer
> >>> Extension (BRBE), which records information about branches as they are
> >>> executed into a set of branch record registers. BRBE is similar to x86's
> >>> Last Branch Record (LBR) and PowerPC's Branch History Rolling Buffer
> >>> (BHRB).
[...]
> >>> + /*
> >>> + * Require that the event filter and branch filter permissions match.
> >>> + *
> >>> + * The event and branch permissions can only mismatch if the user set
> >>> + * at least one of the privilege branch filters in PERF_SAMPLE_BRANCH_PLM_ALL.
> >>> + * Otherwise, the core will set the branch sample permissions in
> >>> + * perf_copy_attr().
> >>> + */
> >>> + if ((event->attr.exclude_user != !(branch_type & PERF_SAMPLE_BRANCH_USER)) ||
> >>> + (event->attr.exclude_kernel != !(branch_type & PERF_SAMPLE_BRANCH_KERNEL)) ||
> >>
> >> I don't think this one is right. By default perf_copy_attr() copies the
> >> exclude_ settings into the branch settings so this works, but if the
> >> user sets any _less_ permissive branch setting this fails. For example:
> >>
> >> # perf record -j any,u -- true
> >> Error:
> >> cycles:PH: PMU Hardware or event type doesn't support branch stack
> >> sampling.
> >>
> >> Here we want the default sampling permissions (exclude_kernel == 0,
> >> exclude_user == 0), but only user branch records, which doesn't match.
> >> It should be allowed because it doesn't include anything that we're not
> >> allowed to see.
> >
> > I know it is allowed (on x86), but why would we want that? If you do
> > something even more restricted:
> >
> > perf record -e cycles:k -j any,u -- true
> >
> > That's allowed on x86 and gives you samples with user addresses. But
> > all the events happened in the kernel. How does that make any sense?
> >
> > I suppose in your example, we could avoid attaching branch stack on
> > samples from the kernel. However, given how my example works, I'm
> > pretty sure that's not what x86 does.
> >
> > There are also combinations that are allowed, but record no samples.
> > Though I think that was with guest events. I've gone with rejecting
> > nonsensical combinations as much as possible. We can easily remove those
> > restrictions later if needed. Changing the behavior later (for the
> > same configuration) wouldn't be good.
> >
> >
>
> Rejecting ones that produce no samples is fair enough, but my example
> does produce samples. To answer the question "why would we want that?",
> nothing major, but there are a few small reasons:
>
> * Perf includes both user and kernel by default, so the shortest
> command to only gather user branches doesn't work (-j any,u)
> * The test already checks for branch stack support like this, so old
> Perf test versions don't work
I would be more concerned about this one except that *we* wrote that
test. (I'm not sure why we wrote a new test rather than adapting
record_lbr.sh...)
> * You might only be optimising userspace, but still interested in the
> proportion of time spent or particular place in the kernel
How do you see that? It looks completely misleading to me. 'perf
report' seems to only list branch stack addresses in this case. There
doesn't seem to be any matching of the event address to branch stack
addresses.
> * Consistency with existing implementations and for people porting
> existing tools to Arm
> * It doesn't cost anything to support it (I think we just
> only check if exclude_* is set rather than !=)
> * Permissions checks should be handled by the core code so that
> they're consistent
> * What's the point of separate branch filters anyway if they always
> have to match the event filter?
IDK, I wish someone could tell me. I don't see the usecase for them
being mismatched.

In any case, I don't care too much one way or the other what we do
here. If everyone thinks we should relax this, then that's fine with
me.
> Some of these things could be fixed in Perf, but not in older versions.
> Even if we can't think of a real use case now, it doesn't sound like the
> driver should be so restrictive of an option that doesn't do any harm.
>
> >> This also makes the Perf branch test skip because it uses
> >> any,save_type,u to see if BRBE exists.
> >
> > Yes, I plan to update that if we keep this behavior.
> >
> >>> + (!is_kernel_in_hyp_mode() &&
> >>> + (event->attr.exclude_hv != !(branch_type & PERF_SAMPLE_BRANCH_HV))))
> >>> + return false;
> >>> +
> >>> + event->hw.branch_reg.config = branch_type_to_brbfcr(event->attr.branch_sample_type);
> >>> + event->hw.extra_reg.config = branch_type_to_brbcr(event->attr.branch_sample_type);
> >>> +
> >>> + return true;
> >>> +}
> >>> +
> >> [...]
> >>> +static const int brbe_type_to_perf_type_map[BRBINFx_EL1_TYPE_DEBUG_EXIT + 1][2] = {
> >>> + [BRBINFx_EL1_TYPE_DIRECT_UNCOND] = { PERF_BR_UNCOND, 0 },
> >>
> >> Does the second field go into 'new_type'? They all seem to be zero so
> >> I'm not sure why new_type isn't ignored instead of having it mapped.
> >
> > Well, left over from when all the Arm specific types were supported.
> > So yeah, that can be simplified.
> >
> >>> + [BRBINFx_EL1_TYPE_INDIRECT] = { PERF_BR_IND, 0 },
> >>> + [BRBINFx_EL1_TYPE_DIRECT_LINK] = { PERF_BR_CALL, 0 },
> >>> + [BRBINFx_EL1_TYPE_INDIRECT_LINK] = { PERF_BR_IND_CALL, 0 },
> >>> + [BRBINFx_EL1_TYPE_RET] = { PERF_BR_RET, 0 },
> >>> + [BRBINFx_EL1_TYPE_DIRECT_COND] = { PERF_BR_COND, 0 },
> >>> + [BRBINFx_EL1_TYPE_CALL] = { PERF_BR_CALL, 0 },
> >>> + [BRBINFx_EL1_TYPE_ERET] = { PERF_BR_ERET, 0 },
> >>> + [BRBINFx_EL1_TYPE_IRQ] = { PERF_BR_IRQ, 0 },
> >>
> >> How do ones that don't map to anything appear in Perf? For example
> >> BRBINFx_EL1_TYPE_TRAP is missing, and the test that was attached to the
> >> previous versions fails because it doesn't see the trap that jumps to
> >> the kernel, but it does still see the ERET back to userspace:
> >>
> >> [unknown]/trap_bench+0x20/-/-/-/0/ERET/-
> >>
> >> In older versions we'd also have BRBINFx_EL1_TYPE_TRAP mapping to
> >> PERF_BR_SYSCALL so you could see it go into the kernel before the return:
> >>
> >> trap_bench+0x1C/[unknown]/-/-/-/0/SYSCALL/-
> >> [unknown]/trap_bench+0x20/-/-/-/0/ERET/-
> >
> > My read of that was we should see a CALL in this case. Whether SVC
> > generates a TRAP or CALL depends on HFGITR_EL2.SVC_EL0 (table D18-2).
> > I assumed "SVC due to HFGITR_EL2.SVC_EL0" means when SVC_EL0 is set
> > (and set has additional conditions). We have SVC_EL0 cleared, so that
> > should be a CALL. Maybe the FVP has this wrong?
> >
>
> The test is doing this rather than a syscall:
>
> asm("mrs %0, ID_AA64ISAR0_EL1" : "=r" (val)); /* TRAP + ERET */
>
> So I think trap is right. Whether that should be mapped to SYSCALL or
> some other branch type I don't know, but the point is that it's missing now.
We aren't supporting any of the Arm specific traps/exceptions. One
reason is for consistency with x86 like you just argued for. The only
exception types supported are syscall and IRQ. Part of the issue is
there is no userspace control over enabling all the extra Arm ones.
There's no way to say enable all branches except debug, fault, etc.
exceptions. If we want to support these, I think there should be user
control over enabling them. But that can come later if there's any
demand for them.
Rob
* Re: [PATCH v19 11/11] perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE)
2025-02-04 15:03 ` Rob Herring
@ 2025-02-05 14:38 ` James Clark
2025-02-05 14:51 ` James Clark
2025-02-05 16:15 ` Rob Herring
0 siblings, 2 replies; 43+ messages in thread
From: James Clark @ 2025-02-05 14:38 UTC (permalink / raw)
To: Rob Herring
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm, Will Deacon, Mark Rutland, Catalin Marinas,
Jonathan Corbet, Marc Zyngier, Oliver Upton, Joey Gouly,
Suzuki K Poulose, Zenghui Yu, Anshuman Khandual
On 04/02/2025 3:03 pm, Rob Herring wrote:
> On Tue, Feb 4, 2025 at 6:03 AM James Clark <james.clark@linaro.org> wrote:
>>
>>
>>
>> On 03/02/2025 5:58 pm, Rob Herring wrote:
>>> On Mon, Feb 3, 2025 at 10:53 AM James Clark <james.clark@linaro.org> wrote:
>>>>
>>>>
>>>>
>>>> On 03/02/2025 12:43 am, Rob Herring (Arm) wrote:
>>>>> From: Anshuman Khandual <anshuman.khandual@arm.com>
>>>>>
>>>>> The ARMv9.2 architecture introduces the optional Branch Record Buffer
>>>>> Extension (BRBE), which records information about branches as they are
>>>>> executed into set of branch record registers. BRBE is similar to x86's
>>>>> Last Branch Record (LBR) and PowerPC's Branch History Rolling Buffer
>>>>> (BHRB).
>
> [...]
>
>>>>> + /*
>>>>> + * Require that the event filter and branch filter permissions match.
>>>>> + *
>>>>> + * The event and branch permissions can only mismatch if the user set
>>>>> + * at least one of the privilege branch filters in PERF_SAMPLE_BRANCH_PLM_ALL.
>>>>> + * Otherwise, the core will set the branch sample permissions in
>>>>> + * perf_copy_attr().
>>>>> + */
>>>>> + if ((event->attr.exclude_user != !(branch_type & PERF_SAMPLE_BRANCH_USER)) ||
>>>>> + (event->attr.exclude_kernel != !(branch_type & PERF_SAMPLE_BRANCH_KERNEL)) ||
>>>>
>>>> I don't think this one is right. By default perf_copy_attr() copies the
>>>> exclude_ settings into the branch settings so this works, but if the
>>>> user sets any _less_ permissive branch setting this fails. For example:
>>>>
>>>> # perf record -j any,u -- true
>>>> Error:
>>>> cycles:PH: PMU Hardware or event type doesn't support branch stack
>>>> sampling.
>>>>
>>>> Here we want the default sampling permissions (exclude_kernel == 0,
>>>> exclude_user == 0), but only user branch records, which doesn't match.
>>>> It should be allowed because it doesn't include anything that we're not
>>>> allowed to see.
>>>
>>> I know it is allowed(on x86), but why would we want that? If you do
>>> something even more restricted:
>>>
>>> perf record -e cycles:k -j any,u -- true
>>>
>>> That's allowed on x86 and gives you samples with user addresses. But
>>> all the events happened in the kernel. How does that make any sense?
>>>
>>> I suppose in your example, we could avoid attaching branch stack on
>>> samples from the kernel. However, given how my example works, I'm
>>> pretty sure that's not what x86 does.
>>>
>>> There's also combinations that are allowed, but record no samples.
>>> Though I think that was with guest events. I've gone with reject
>>> non-sense combinations as much as possible. We can easily remove those
>>> restrictions later if needed. Changing the behavior later (for the
>>> same configuration) wouldn't be good.
>>>
>>>
>>
>> Rejecting ones that produce no samples is fair enough, but my example
>> does produce samples. To answer the question "why would we want that?",
>> nothing major, but there are a few small reasons:
>>
>> * Perf includes both user and kernel by default, so the shortest
>> command to only gather user branches doesn't work (-j any,u)
>> * The test already checks for branch stack support like this, so old
>> Perf test versions don't work
>
> I would be more concerned about this one except that *we* wrote that
> test. (I'm not sure why we wrote a new test rather than adapting
> record_lbr.sh...)
>
record_lbr.sh was added 6 months ago, test_brstack.sh 3 years ago so
it's the other way around.

Although record_lbr.sh also tests --call-graph and --stitch-lbr as well,
so I think it's fine for test_brstack.sh to test only --branch-filter
options at the lowest level.

Looking at that test though I see there is a capability
"/sys/devices/cpu/caps/branches". I'm wondering whether we should be
adding that on the Arm PMU for BRBE?

Ignoring the tests, the man pages (and some pages on the internet) give
this example: "--branch-filter any_ret,u,k". This doesn't work either
because it doesn't match the default exclude_hv option. It just seems a
bit awkward and incompatible to me, for not much gain.
>> * You might only be optimising userspace, but still interested in the
>> proportion of time spent or particular place in the kernel
>
> How do you see that? It looks completely misleading to me. 'perf
> report' seems to only list branch stack addresses in this case. There
> doesn't seem to be any matching of the event address to branch stack
> addresses.
>
Perf script will show everything with all its various options, or
--branch-history on perf report will show both too. Also there are tools
other than Perf; AutoFDO seems like something that BRBE can be used with.
>> * Consistency with existing implementations and for people porting
>> existing tools to Arm
>> * It doesn't cost anything to support it (I think we just
>> only check if exclude_* is set rather than !=)
>> * Permissions checks should be handled by the core code so that
>> they're consistent
>> * What's the point of separate branch filters anyway if they always
>> have to match the event filter?
>
> IDK, I wish someone could tell me. I don't see the usecase for them
> being mismatched.
>
> In any case, I don't care too much one way or the other what we do
> here. If everyone thinks we should relax this, then that's fine with
> me.
>
Seeing the branch history from userspace that led up to a certain thing
in the kernel happening doesn't seem like that much of an edge case to
me. If you always have to have both on then you lose the userspace
branch history because the buffer isn't that big and gets overwritten.
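
To make the difference concrete, here is a minimal standalone sketch of
the two policies, assuming the relaxed form only rejects a branch filter
that asks for a privilege level the event itself excludes. struct
sketch_attr, the SKETCH_BRANCH_* bits and both function names are made up
for illustration (they stand in for perf_event_attr and the
PERF_SAMPLE_BRANCH_USER/KERNEL bits), and the exclude_hv handling is left
out. It is not the driver code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for PERF_SAMPLE_BRANCH_{USER,KERNEL}; illustrative only. */
#define SKETCH_BRANCH_USER   (1ULL << 0)
#define SKETCH_BRANCH_KERNEL (1ULL << 1)

/* Hypothetical, reduced view of the perf_event_attr fields involved. */
struct sketch_attr {
	bool exclude_user;
	bool exclude_kernel;
	uint64_t branch_sample_type;
};

/* Strict form, as in the patch: event and branch privilege filters must match. */
static bool filters_match_strict(const struct sketch_attr *a)
{
	uint64_t bt = a->branch_sample_type;

	return a->exclude_user == !(bt & SKETCH_BRANCH_USER) &&
	       a->exclude_kernel == !(bt & SKETCH_BRANCH_KERNEL);
}

/*
 * Relaxed form (an assumption about what "only check if exclude_* is set"
 * means): reject only when the branch filter requests a level that the
 * event excludes.
 */
static bool filters_allowed_relaxed(const struct sketch_attr *a)
{
	uint64_t bt = a->branch_sample_type;

	if (a->exclude_user && (bt & SKETCH_BRANCH_USER))
		return false;
	if (a->exclude_kernel && (bt & SKETCH_BRANCH_KERNEL))
		return false;
	return true;
}
```

With default permissions and -j any,u (branch_sample_type ==
SKETCH_BRANCH_USER) the strict form rejects the event and the relaxed
form accepts it. Something like -e cycles:k -j any,u would still be
rejected by this particular relaxed form, which may or may not be what
we want.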
>> Some of these things could be fixed in Perf, but not in older versions.
>> Even if we can't think of a real use case now, it doesn't sound like the
>> driver should be so restrictive of an option that doesn't do any harm.
>>
>>>> This also makes the Perf branch test skip because it uses
>>>> any,save_type,u to see if BRBE exists.
>>>
>>> Yes, I plan to update that if we keep this behavior.
>>>
>>>>> + (!is_kernel_in_hyp_mode() &&
>>>>> + (event->attr.exclude_hv != !(branch_type & PERF_SAMPLE_BRANCH_HV))))
>>>>> + return false;
>>>>> +
>>>>> + event->hw.branch_reg.config = branch_type_to_brbfcr(event->attr.branch_sample_type);
>>>>> + event->hw.extra_reg.config = branch_type_to_brbcr(event->attr.branch_sample_type);
>>>>> +
>>>>> + return true;
>>>>> +}
>>>>> +
>>>> [...]
>>>>> +static const int brbe_type_to_perf_type_map[BRBINFx_EL1_TYPE_DEBUG_EXIT + 1][2] = {
>>>>> + [BRBINFx_EL1_TYPE_DIRECT_UNCOND] = { PERF_BR_UNCOND, 0 },
>>>>
>>>> Does the second field go into 'new_type'? They all seem to be zero so
>>>> I'm not sure why new_type isn't ignored instead of having it mapped.
>>>
>>> Well, left over from when all the Arm specific types were supported.
>>> So yeah, that can be simplified.
>>>
>>>>> + [BRBINFx_EL1_TYPE_INDIRECT] = { PERF_BR_IND, 0 },
>>>>> + [BRBINFx_EL1_TYPE_DIRECT_LINK] = { PERF_BR_CALL, 0 },
>>>>> + [BRBINFx_EL1_TYPE_INDIRECT_LINK] = { PERF_BR_IND_CALL, 0 },
>>>>> + [BRBINFx_EL1_TYPE_RET] = { PERF_BR_RET, 0 },
>>>>> + [BRBINFx_EL1_TYPE_DIRECT_COND] = { PERF_BR_COND, 0 },
>>>>> + [BRBINFx_EL1_TYPE_CALL] = { PERF_BR_CALL, 0 },
>>>>> + [BRBINFx_EL1_TYPE_ERET] = { PERF_BR_ERET, 0 },
>>>>> + [BRBINFx_EL1_TYPE_IRQ] = { PERF_BR_IRQ, 0 },
>>>>
>>>> How do ones that don't map to anything appear in Perf? For example
>>>> BRBINFx_EL1_TYPE_TRAP is missing, and the test that was attached to the
>>>> previous versions fails because it doesn't see the trap that jumps to
>>>> the kernel, but it does still see the ERET back to userspace:
>>>>
>>>> [unknown]/trap_bench+0x20/-/-/-/0/ERET/-
>>>>
>>>> In older versions we'd also have BRBINFx_EL1_TYPE_TRAP mapping to
>>>> PERF_BR_SYSCALL so you could see it go into the kernel before the return:
>>>>
>>>> trap_bench+0x1C/[unknown]/-/-/-/0/SYSCALL/-
>>>> [unknown]/trap_bench+0x20/-/-/-/0/ERET/-
>>>
>>> My read of that was we should see a CALL in this case. Whether SVC
>>> generates a TRAP or CALL depends on HFGITR_EL2.SVC_EL0 (table D18-2).
>>> I assumed "SVC due to HFGITR_EL2.SVC_EL0" means when SVC_EL0 is set
>>> (and set has additional conditions). We have SVC_EL0 cleared, so that
>>> should be a CALL. Maybe the FVP has this wrong?
>>>
>>
>> The test is doing this rather than a syscall:
>>
>> asm("mrs %0, ID_AA64ISAR0_EL1" : "=r" (val)); /* TRAP + ERET */
>>
>> So I think trap is right. Whether that should be mapped to SYSCALL or
>> some other branch type I don't know, but the point is that it's missing now.
>
> We aren't supporting any of the Arm specific traps/exceptions. One
> reason is for consistency with x86 like you just argued for. The only
Does x86 leave holes in the program flow though, or is it complete? IMO
it makes it harder for tools to make sense of the branch buffer if there
are things like an ERET with no previous trap to match it up to.
> exception types supported are syscall and IRQ. Part of the issue is
> there is no userspace control over enabling all the extra Arm ones.
> There's no way to say enable all branches except debug, fault, etc.
> exceptions. If we want to support these, I think there should be user
> control over enabling them. But that can come later if there's any
> demand for them.
>
> Rob
In this patchset we enable PERF_BR_IRQ with PERF_SAMPLE_BRANCH_ANY,
without any way to selectively disable it. I would assume trap could be
done with the same option.

If we're filtering some of them out it might be worth documenting that
"PERF_SAMPLE_BRANCH_ANY" doesn't actually mean 'any' branch type on Arm,
and some types are recorded but discarded before being sent to userspace.

There could be some confusion when there are partially filled or empty
branch buffers, and the reason wouldn't be that there weren't any
branches recorded, but they were all filtered out even with the 'any'
option.
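
As a rough illustration of that effect, the sketch below uses a sparse
designated-initializer table in the same style as
brbe_type_to_perf_type_map. All enum names and values here are invented
for the example (they are not the BRBINFx_EL1_TYPE_* encodings or the
PERF_BR_* values), and dropping unmapped records in keep_mapped() is an
assumption about the behaviour, not a copy of the patch.

```c
#include <stddef.h>

/* Hypothetical hardware record types; the numbering is illustrative. */
enum hw_type { HW_DIRECT_UNCOND, HW_INDIRECT, HW_ERET, HW_IRQ, HW_TRAP, HW_TYPE_MAX };

/* Stand-ins for the perf branch types; 0 plays the role of "unknown". */
enum sw_type { SW_UNKNOWN = 0, SW_UNCOND, SW_IND, SW_ERET, SW_IRQ };

/* HW_TRAP is deliberately left out, so its slot is zero-initialized. */
static const int type_map[HW_TYPE_MAX] = {
	[HW_DIRECT_UNCOND] = SW_UNCOND,
	[HW_INDIRECT]      = SW_IND,
	[HW_ERET]          = SW_ERET,
	[HW_IRQ]           = SW_IRQ,
};

/* Look up a hardware type; anything unmapped or out of range is unknown. */
static int map_type(unsigned int hw)
{
	return hw < (unsigned int)HW_TYPE_MAX ? type_map[hw] : SW_UNKNOWN;
}

/* Copy only records with a known mapping; returns how many were kept. */
static size_t keep_mapped(const unsigned int *hw, size_t n, int *out)
{
	size_t kept = 0;

	for (size_t i = 0; i < n; i++) {
		int t = map_type(hw[i]);

		if (t != SW_UNKNOWN)
			out[kept++] = t;
	}
	return kept;
}
```

Feeding it a TRAP record followed by an ERET record keeps only the ERET,
which is the same shape as the trap_bench output quoted earlier: the
buffer held two records, but userspace sees one.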
* Re: [PATCH v19 11/11] perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE)
2025-02-05 14:38 ` James Clark
@ 2025-02-05 14:51 ` James Clark
2025-02-05 16:15 ` Rob Herring
1 sibling, 0 replies; 43+ messages in thread
From: James Clark @ 2025-02-05 14:51 UTC (permalink / raw)
To: Rob Herring
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm, Will Deacon, Mark Rutland, Catalin Marinas,
Jonathan Corbet, Marc Zyngier, Oliver Upton, Joey Gouly,
Suzuki K Poulose, Zenghui Yu, Anshuman Khandual
On 05/02/2025 2:38 pm, James Clark wrote:
>
>
> On 04/02/2025 3:03 pm, Rob Herring wrote:
>> On Tue, Feb 4, 2025 at 6:03 AM James Clark <james.clark@linaro.org> wrote:
>>>
>>>
>>>
>>> On 03/02/2025 5:58 pm, Rob Herring wrote:
>>>> On Mon, Feb 3, 2025 at 10:53 AM James Clark <james.clark@linaro.org> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 03/02/2025 12:43 am, Rob Herring (Arm) wrote:
>>>>>> From: Anshuman Khandual <anshuman.khandual@arm.com>
>>>>>>
>>>>>> The ARMv9.2 architecture introduces the optional Branch Record Buffer
>>>>>> Extension (BRBE), which records information about branches as they are
>>>>>> executed into set of branch record registers. BRBE is similar to x86's
>>>>>> Last Branch Record (LBR) and PowerPC's Branch History Rolling Buffer
>>>>>> (BHRB).
>>
>> [...]
>>
>>>>>> + /*
>>>>>> + * Require that the event filter and branch filter permissions match.
>>>>>> + *
>>>>>> + * The event and branch permissions can only mismatch if the user set
>>>>>> + * at least one of the privilege branch filters in PERF_SAMPLE_BRANCH_PLM_ALL.
>>>>>> + * Otherwise, the core will set the branch sample permissions in
>>>>>> + * perf_copy_attr().
>>>>>> + */
>>>>>> + if ((event->attr.exclude_user != !(branch_type & PERF_SAMPLE_BRANCH_USER)) ||
>>>>>> + (event->attr.exclude_kernel != !(branch_type & PERF_SAMPLE_BRANCH_KERNEL)) ||
>>>>>
>>>>> I don't think this one is right. By default perf_copy_attr() copies the
>>>>> exclude_ settings into the branch settings so this works, but if the
>>>>> user sets any _less_ permissive branch setting this fails. For example:
>>>>>
>>>>> # perf record -j any,u -- true
>>>>> Error:
>>>>> cycles:PH: PMU Hardware or event type doesn't support branch stack
>>>>> sampling.
>>>>>
>>>>> Here we want the default sampling permissions (exclude_kernel == 0,
>>>>> exclude_user == 0), but only user branch records, which doesn't match.
>>>>> It should be allowed because it doesn't include anything that we're not
>>>>> allowed to see.
>>>>
>>>> I know it is allowed(on x86), but why would we want that? If you do
>>>> something even more restricted:
>>>>
>>>> perf record -e cycles:k -j any,u -- true
>>>>
>>>> That's allowed on x86 and gives you samples with user addresses. But
>>>> all the events happened in the kernel. How does that make any sense?
>>>>
>>>> I suppose in your example, we could avoid attaching branch stack on
>>>> samples from the kernel. However, given how my example works, I'm
>>>> pretty sure that's not what x86 does.
>>>>
>>>> There's also combinations that are allowed, but record no samples.
>>>> Though I think that was with guest events. I've gone with reject
>>>> non-sense combinations as much as possible. We can easily remove those
>>>> restrictions later if needed. Changing the behavior later (for the
>>>> same configuration) wouldn't be good.
>>>>
>>>>
>>>
>>> Rejecting ones that produce no samples is fair enough, but my example
>>> does produce samples. To answer the question "why would we want that?",
>>> nothing major, but there are a few small reasons:
>>>
>>> * Perf includes both user and kernel by default, so the shortest
>>> command to only gather user branches doesn't work (-j any,u)
>>> * The test already checks for branch stack support like this, so old
>>> Perf test versions don't work
>>
>> I would be more concerned about this one except that *we* wrote that
>> test. (I'm not sure why we wrote a new test rather than adapting
>> record_lbr.sh...)
>>
>
> record_lbr.sh was added 6 months ago, test_brstack.sh 3 years ago so
> it's the other way around.
>
> Although record_lbr.sh also tests --call-graph and --stitch-lbr as well,
> so I think it's fine for test_brstack.sh to test only --branch-filter
> options at the lowest level.
>
> Looking at that test though I see there is a capability
> "/sys/devices/cpu/caps/branches". I'm wondering whether we should be
> adding that on the Arm PMU for BRBE?
>
> Ignoring the tests, the man pages (and some pages on the internet) give
> this example: "--branch-filter any_ret,u,k". This doesn't work either
> because it doesn't match the default exclude_hv option. It just seems a
> bit awkward and incompatible to me, for not much gain.
>
Looking at record_lbr.sh led me to the fact that --call-graph=lbr sets
"PERF_SAMPLE_BRANCH_USER" with the default kernel/user sampling mode,
causing the same issue.
>>> * You might only be optimising userspace, but still interested in the
>>> proportion of time spent or particular place in the kernel
>>
>> How do you see that? It looks completely misleading to me. 'perf
>> report' seems to only list branch stack addresses in this case. There
>> doesn't seem to be any matching of the event address to branch stack
>> addresses.
>>
>
> Perf script will show everything with all its various options, or
> --branch-history on perf report will show both too. Also there are tools
> other than Perf, AutoFDO seems like something that BRBE can be used with.
>
>>> * Consistency with existing implementations and for people porting
>>> existing tools to Arm
>>> * It doesn't cost anything to support it (I think we just
>>> only check if exclude_* is set rather than !=)
>>> * Permissions checks should be handled by the core code so that
>>> they're consistent
>>> * What's the point of separate branch filters anyway if they always
>>> have to match the event filter?
>>
>> IDK, I wish someone could tell me. I don't see the usecase for them
>> being mismatched.
>>
>> In any case, I don't care too much one way or the other what we do
>> here. If everyone thinks we should relax this, then that's fine with
>> me.
>>
>
> Seeing the branch history from userspace that led up to a certain thing
> in the kernel happening doesn't seem like that much of an edge case to
> me. If you always have to have both on then you lose the userspace
> branch history because the buffer isn't that big and gets overwritten.
>
>>> Some of these things could be fixed in Perf, but not in older versions.
>>> Even if we can't think of a real use case now, it doesn't sound like the
>>> driver should be so restrictive of an option that doesn't do any harm.
>>>
>>>>> This also makes the Perf branch test skip because it uses
>>>>> any,save_type,u to see if BRBE exists.
>>>>
>>>> Yes, I plan to update that if we keep this behavior.
>>>>
>>>>>> + (!is_kernel_in_hyp_mode() &&
>>>>>> + (event->attr.exclude_hv != !(branch_type & PERF_SAMPLE_BRANCH_HV))))
>>>>>> + return false;
>>>>>> +
>>>>>> + event->hw.branch_reg.config = branch_type_to_brbfcr(event->attr.branch_sample_type);
>>>>>> + event->hw.extra_reg.config = branch_type_to_brbcr(event->attr.branch_sample_type);
>>>>>> +
>>>>>> + return true;
>>>>>> +}
>>>>>> +
>>>>> [...]
>>>>>> +static const int brbe_type_to_perf_type_map[BRBINFx_EL1_TYPE_DEBUG_EXIT + 1][2] = {
>>>>>> + [BRBINFx_EL1_TYPE_DIRECT_UNCOND] = { PERF_BR_UNCOND, 0 },
>>>>>
>>>>> Does the second field go into 'new_type'? They all seem to be zero so
>>>>> I'm not sure why new_type isn't ignored instead of having it mapped.
>>>>
>>>> Well, left over from when all the Arm specific types were supported.
>>>> So yeah, that can be simplified.
>>>>
>>>>>> + [BRBINFx_EL1_TYPE_INDIRECT] = { PERF_BR_IND, 0 },
>>>>>> + [BRBINFx_EL1_TYPE_DIRECT_LINK] = { PERF_BR_CALL, 0 },
>>>>>> + [BRBINFx_EL1_TYPE_INDIRECT_LINK] = { PERF_BR_IND_CALL, 0 },
>>>>>> + [BRBINFx_EL1_TYPE_RET] = { PERF_BR_RET, 0 },
>>>>>> + [BRBINFx_EL1_TYPE_DIRECT_COND] = { PERF_BR_COND, 0 },
>>>>>> + [BRBINFx_EL1_TYPE_CALL] = { PERF_BR_CALL, 0 },
>>>>>> + [BRBINFx_EL1_TYPE_ERET] = { PERF_BR_ERET, 0 },
>>>>>> + [BRBINFx_EL1_TYPE_IRQ] = { PERF_BR_IRQ, 0 },
>>>>>
>>>>> How do ones that don't map to anything appear in Perf? For example
>>>>> BRBINFx_EL1_TYPE_TRAP is missing, and the test that was attached to the
>>>>> previous versions fails because it doesn't see the trap that jumps to
>>>>> the kernel, but it does still see the ERET back to userspace:
>>>>>
>>>>> [unknown]/trap_bench+0x20/-/-/-/0/ERET/-
>>>>>
>>>>> In older versions we'd also have BRBINFx_EL1_TYPE_TRAP mapping to
>>>>> PERF_BR_SYSCALL so you could see it go into the kernel before the return:
>>>>>
>>>>> trap_bench+0x1C/[unknown]/-/-/-/0/SYSCALL/-
>>>>> [unknown]/trap_bench+0x20/-/-/-/0/ERET/-
>>>>
>>>> My read of that was we should see a CALL in this case. Whether SVC
>>>> generates a TRAP or CALL depends on HFGITR_EL2.SVC_EL0 (table D18-2).
>>>> I assumed "SVC due to HFGITR_EL2.SVC_EL0" means when SVC_EL0 is set
>>>> (and set has additional conditions). We have SVC_EL0 cleared, so that
>>>> should be a CALL. Maybe the FVP has this wrong?
>>>>
>>>
>>> The test is doing this rather than a syscall:
>>>
>>> asm("mrs %0, ID_AA64ISAR0_EL1" : "=r" (val)); /* TRAP + ERET */
>>>
>>> So I think trap is right. Whether that should be mapped to SYSCALL or
>>> some other branch type I don't know, but the point is that it's missing now.
>>
>> We aren't supporting any of the Arm specific traps/exceptions. One
>> reason is for consistency with x86 like you just argued for. The only
>
> Does x86 leave holes in the program flow though, or is it complete? IMO
> it makes it harder for tools to make sense of the branch buffer if there
> are things like an ERET with no previous trap to match it up to.
>
>> exception types supported are syscall and IRQ. Part of the issue is
>> there is no userspace control over enabling all the extra Arm ones.
>> There's no way to say enable all branches except debug, fault, etc.
>> exceptions. If we want to support these, I think there should be user
>> control over enabling them. But that can come later if there's any
>> demand for them.
>>
>> Rob
>
> In this patchset we enable PERF_BR_IRQ with PERF_SAMPLE_BRANCH_ANY,
> without any way to selectively disable it. I would assume trap could be
> done with the same option.
>
> If we're filtering some of them out it might be worth documenting that
> "PERF_SAMPLE_BRANCH_ANY" doesn't actually mean 'any' branch type on Arm,
> and some types are recorded but discarded before being sent to userspace.
>
> There could be some confusion when there are partially filled or empty
> branch buffers, and the reason wouldn't be that there weren't any
> branches recorded, but they were all filtered out even with the 'any'
> option.
>
* Re: [PATCH v19 11/11] perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE)
2025-02-05 14:38 ` James Clark
2025-02-05 14:51 ` James Clark
@ 2025-02-05 16:15 ` Rob Herring
2025-02-06 12:58 ` James Clark
1 sibling, 1 reply; 43+ messages in thread
From: Rob Herring @ 2025-02-05 16:15 UTC (permalink / raw)
To: James Clark, Mark Rutland
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm, Will Deacon, Catalin Marinas, Jonathan Corbet,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, Anshuman Khandual
On Wed, Feb 5, 2025 at 8:38 AM James Clark <james.clark@linaro.org> wrote:
> On 04/02/2025 3:03 pm, Rob Herring wrote:
> > On Tue, Feb 4, 2025 at 6:03 AM James Clark <james.clark@linaro.org> wrote:
> >> On 03/02/2025 5:58 pm, Rob Herring wrote:
> >>> On Mon, Feb 3, 2025 at 10:53 AM James Clark <james.clark@linaro.org> wrote:
> >>>> On 03/02/2025 12:43 am, Rob Herring (Arm) wrote:
> >>>>> From: Anshuman Khandual <anshuman.khandual@arm.com>
> >>>>>
> >>>>> The ARMv9.2 architecture introduces the optional Branch Record Buffer
> >>>>> Extension (BRBE), which records information about branches as they are
> >>>>> executed into set of branch record registers. BRBE is similar to x86's
> >>>>> Last Branch Record (LBR) and PowerPC's Branch History Rolling Buffer
> >>>>> (BHRB).
> >
> > [...]
> >
> >>>>> + /*
> >>>>> + * Require that the event filter and branch filter permissions match.
> >>>>> + *
> >>>>> + * The event and branch permissions can only mismatch if the user set
> >>>>> + * at least one of the privilege branch filters in PERF_SAMPLE_BRANCH_PLM_ALL.
> >>>>> + * Otherwise, the core will set the branch sample permissions in
> >>>>> + * perf_copy_attr().
> >>>>> + */
> >>>>> + if ((event->attr.exclude_user != !(branch_type & PERF_SAMPLE_BRANCH_USER)) ||
> >>>>> + (event->attr.exclude_kernel != !(branch_type & PERF_SAMPLE_BRANCH_KERNEL)) ||
> >>>>
> >>>> I don't think this one is right. By default perf_copy_attr() copies the
> >>>> exclude_ settings into the branch settings so this works, but if the
> >>>> user sets any _less_ permissive branch setting this fails. For example:
> >>>>
> >>>> # perf record -j any,u -- true
> >>>> Error:
> >>>> cycles:PH: PMU Hardware or event type doesn't support branch stack
> >>>> sampling.
> >>>>
> >>>> Here we want the default sampling permissions (exclude_kernel == 0,
> >>>> exclude_user == 0), but only user branch records, which doesn't match.
> >>>> It should be allowed because it doesn't include anything that we're not
> >>>> allowed to see.
> >>>
> >>> I know it is allowed(on x86), but why would we want that? If you do
> >>> something even more restricted:
> >>>
> >>> perf record -e cycles:k -j any,u -- true
> >>>
> >>> That's allowed on x86 and gives you samples with user addresses. But
> >>> all the events happened in the kernel. How does that make any sense?
> >>>
> >>> I suppose in your example, we could avoid attaching branch stack on
> >>> samples from the kernel. However, given how my example works, I'm
> >>> pretty sure that's not what x86 does.
> >>>
> >>> There's also combinations that are allowed, but record no samples.
> >>> Though I think that was with guest events. I've gone with reject
> >>> non-sense combinations as much as possible. We can easily remove those
> >>> restrictions later if needed. Changing the behavior later (for the
> >>> same configuration) wouldn't be good.
> >>>
> >>>
> >>
> >> Rejecting ones that produce no samples is fair enough, but my example
> >> does produce samples. To answer the question "why would we want that?",
> >> nothing major, but there are a few small reasons:
> >>
> >> * Perf includes both user and kernel by default, so the shortest
> >> command to only gather user branches doesn't work (-j any,u)
> >> * The test already checks for branch stack support like this, so old
> >> Perf test versions don't work
> >
> > I would be more concerned about this one except that *we* wrote that
> > test. (I'm not sure why we wrote a new test rather than adapting
> > record_lbr.sh...)
> >
>
> record_lbr.sh was added 6 months ago, test_brstack.sh 3 years ago so
> it's the other way around.
Sigh...
> Although record_lbr.sh also tests --call-graph and --stitch-lbr as well,
> so I think it's fine for test_brstack.sh to test only --branch-filter
> options at the lowest level.
>
> Looking at that test though I see there is a capability
> "/sys/devices/cpu/caps/branches". I'm wondering whether we should be
> adding that on the Arm PMU for BRBE?
I noticed that too. I suppose we should. Though I suppose that could
give weird results if userspace is expecting LBR. Adding that would
make record_lbr.sh run and then the LBR callgraph test is going to
fail.
> Ignoring the tests, the man pages (and some pages on the internet) give
> this example: "--branch-filter any_ret,u,k". This doesn't work either
> because it doesn't match the default exclude_hv option. It just seems a
> bit awkward and incompatible to me, for not much gain.
>
> >> * You might only be optimising userspace, but still interested in the
> >> proportion of time spent or particular place in the kernel
> >
> > How do you see that? It looks completely misleading to me. 'perf
> > report' seems to only list branch stack addresses in this case. There
> > doesn't seem to be any matching of the event address to branch stack
> > addresses.
> >
>
> Perf script will show everything with all its various options, or
> --branch-history on perf report will show both too. Also there are tools
> other than Perf, AutoFDO seems like something that BRBE can be used with.
>
> >> * Consistency with existing implementations and for people porting
> >> existing tools to Arm
> >> * It doesn't cost anything to support it (I think we just
> >> only check if exclude_* is set rather than !=)
> >> * Permissions checks should be handled by the core code so that
> >> they're consistent
> >> * What's the point of separate branch filters anyway if they always
> >> have to match the event filter?
> >
> > IDK, I wish someone could tell me. I don't see the usecase for them
> > being mismatched.
> >
> > In any case, I don't care too much one way or the other what we do
> > here. If everyone thinks we should relax this, then that's fine with
> > me.
> >
>
> Seeing the branch history from userspace that led up to a certain thing
> in the kernel happening doesn't seem like that much of an edge case to
> me. If you always have to have both on then you lose the userspace
> branch history because the buffer isn't that big and gets overwritten.
Okay, let's drop this check...
> >> Some of these things could be fixed in Perf, but not in older versions.
> >> Even if we can't think of a real use case now, it doesn't sound like the
> >> driver should be so restrictive of an option that doesn't do any harm.
> >>
> >>>> This also makes the Perf branch test skip because it uses
> >>>> any,save_type,u to see if BRBE exists.
> >>>
> >>> Yes, I plan to update that if we keep this behavior.
> >>>
> >>>>> + (!is_kernel_in_hyp_mode() &&
> >>>>> + (event->attr.exclude_hv != !(branch_type & PERF_SAMPLE_BRANCH_HV))))
> >>>>> + return false;
> >>>>> +
> >>>>> + event->hw.branch_reg.config = branch_type_to_brbfcr(event->attr.branch_sample_type);
> >>>>> + event->hw.extra_reg.config = branch_type_to_brbcr(event->attr.branch_sample_type);
> >>>>> +
> >>>>> + return true;
> >>>>> +}
> >>>>> +
> >>>> [...]
> >>>>> +static const int brbe_type_to_perf_type_map[BRBINFx_EL1_TYPE_DEBUG_EXIT + 1][2] = {
> >>>>> + [BRBINFx_EL1_TYPE_DIRECT_UNCOND] = { PERF_BR_UNCOND, 0 },
> >>>>
> >>>> Does the second field go into 'new_type'? They all seem to be zero so
> >>>> I'm not sure why new_type isn't ignored instead of having it mapped.
> >>>
> >>> Well, left over from when all the Arm specific types were supported.
> >>> So yeah, that can be simplified.
> >>>
> >>>>> + [BRBINFx_EL1_TYPE_INDIRECT] = { PERF_BR_IND, 0 },
> >>>>> + [BRBINFx_EL1_TYPE_DIRECT_LINK] = { PERF_BR_CALL, 0 },
> >>>>> + [BRBINFx_EL1_TYPE_INDIRECT_LINK] = { PERF_BR_IND_CALL, 0 },
> >>>>> + [BRBINFx_EL1_TYPE_RET] = { PERF_BR_RET, 0 },
> >>>>> + [BRBINFx_EL1_TYPE_DIRECT_COND] = { PERF_BR_COND, 0 },
> >>>>> + [BRBINFx_EL1_TYPE_CALL] = { PERF_BR_CALL, 0 },
> >>>>> + [BRBINFx_EL1_TYPE_ERET] = { PERF_BR_ERET, 0 },
> >>>>> + [BRBINFx_EL1_TYPE_IRQ] = { PERF_BR_IRQ, 0 },
> >>>>
> >>>> How do ones that don't map to anything appear in Perf? For example
> >>>> BRBINFx_EL1_TYPE_TRAP is missing, and the test that was attached to the
> >>>> previous versions fails because it doesn't see the trap that jumps to
> >>>> the kernel, but it does still see the ERET back to userspace:
> >>>>
> >>>> [unknown]/trap_bench+0x20/-/-/-/0/ERET/-
> >>>>
> >>>> In older versions we'd also have BRBINFx_EL1_TYPE_TRAP mapping to
> >>>> PERF_BR_SYSCALL so you could see it go into the kernel before the return:
> >>>>
> >>>> trap_bench+0x1C/[unknown]/-/-/-/0/SYSCALL/-
> >>>> [unknown]/trap_bench+0x20/-/-/-/0/ERET/-
> >>>
> >>> My read of that was we should see a CALL in this case. Whether SVC
> >>> generates a TRAP or CALL depends on HFGITR_EL2.SVC_EL0 (table D18-2).
> >>> I assumed "SVC due to HFGITR_EL2.SVC_EL0" means when SVC_EL0 is set
> >>> (and set has additional conditions). We have SVC_EL0 cleared, so that
> >>> should be a CALL. Maybe the FVP has this wrong?
> >>>
> >>
> >> The test is doing this rather than a syscall:
> >>
> >> asm("mrs %0, ID_AA64ISAR0_EL1" : "=r" (val)); /* TRAP + ERET */
> >>
> >> So I think trap is right. Whether that should be mapped to SYSCALL or
> >> some other branch type I don't know, but the point is that it's missing now.
> >
> > We aren't supporting any of the Arm specific traps/exceptions. One
> > reason is for consistency with x86 like you just argued for. The only
>
> Does x86 leave holes in the program flow though, or is it complete? IMO
> it makes it harder for tools to make sense of the branch buffer if there
> are things like an ERET with no previous trap to match it up to.
I'll have to test that. x86 has SYSRET for "syscall return". We added
ERET which maps to x86 interrupt return. So I guess x86 only records
syscalls and their returns. There's also "sw interrupt" on x86 which
gets mapped to PERF_BR_UNKNOWN. I don't think there's any way for us
to distinguish a syscall return from any other exception return.
> > exception types supported are syscall and IRQ. Part of the issue is
> > there is no userspace control over enabling all the extra Arm ones.
> > There's no way to say enable all branches except debug, fault, etc.
> > exceptions. If we want to support these, I think there should be user
> > control over enabling them. But that can come later if there's any
> > demand for them.
> >
> > Rob
>
> In this patchset we enable PERF_BR_IRQ with PERF_SAMPLE_BRANCH_ANY,
> without any way to selectively disable it. I would assume trap could be
> done with the same option.
If I was designing the interface, I would make PERF_BR_IRQ separately
controllable. But we're kind of stuck with what x86 did. I suppose we
could add a negative 'noirq' option.
Are you of the opinion that we should enable everything or some subset
of them? There's basically inst/data/algn faults, FIQ, SError, and
debug. The debug ones seem questionable to me, or at least ones you'd
want to opt-in for. For FIQ, if that's used by secure world, do we
want non-secure world recording when FIQs happen? Could the timing of
those be used maliciously?
> If we're filtering some of them out it might be worth documenting that
> "PERF_SAMPLE_BRANCH_ANY" doesn't actually mean 'any' branch type on Arm,
> and some types are recorded but discarded before being sent to userspace.
>
> There could be some confusion when there are partially filled or empty
> branch buffers, and the reason wouldn't be that there weren't any
> branches recorded, but they were all filtered out even with the 'any'
> option.
Fair enough. I think we need Mark to chime in here. He was questioning
the need for these.
Rob
* Re: [PATCH v19 11/11] perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE)
2025-02-05 16:15 ` Rob Herring
@ 2025-02-06 12:58 ` James Clark
0 siblings, 0 replies; 43+ messages in thread
From: James Clark @ 2025-02-06 12:58 UTC (permalink / raw)
To: Rob Herring, Mark Rutland
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm, Will Deacon, Catalin Marinas, Jonathan Corbet,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, Anshuman Khandual
On 05/02/2025 4:15 pm, Rob Herring wrote:
> On Wed, Feb 5, 2025 at 8:38 AM James Clark <james.clark@linaro.org> wrote:
>> On 04/02/2025 3:03 pm, Rob Herring wrote:
>>> On Tue, Feb 4, 2025 at 6:03 AM James Clark <james.clark@linaro.org> wrote:
>>>> On 03/02/2025 5:58 pm, Rob Herring wrote:
>>>>> On Mon, Feb 3, 2025 at 10:53 AM James Clark <james.clark@linaro.org> wrote:
>>>>>> On 03/02/2025 12:43 am, Rob Herring (Arm) wrote:
>>>>>>> From: Anshuman Khandual <anshuman.khandual@arm.com>
>>>>>>>
>>>>>>> The ARMv9.2 architecture introduces the optional Branch Record Buffer
>>>>>>> Extension (BRBE), which records information about branches as they are
>>>>>>> executed into a set of branch record registers. BRBE is similar to x86's
>>>>>>> Last Branch Record (LBR) and PowerPC's Branch History Rolling Buffer
>>>>>>> (BHRB).
>>>
>>> [...]
>>>
>>>>>>> + /*
>>>>>>> + * Require that the event filter and branch filter permissions match.
>>>>>>> + *
>>>>>>> + * The event and branch permissions can only mismatch if the user set
>>>>>>> + * at least one of the privilege branch filters in PERF_SAMPLE_BRANCH_PLM_ALL.
>>>>>>> + * Otherwise, the core will set the branch sample permissions in
>>>>>>> + * perf_copy_attr().
>>>>>>> + */
>>>>>>> + if ((event->attr.exclude_user != !(branch_type & PERF_SAMPLE_BRANCH_USER)) ||
>>>>>>> + (event->attr.exclude_kernel != !(branch_type & PERF_SAMPLE_BRANCH_KERNEL)) ||
>>>>>>
>>>>>> I don't think this one is right. By default perf_copy_attr() copies the
>>>>>> exclude_ settings into the branch settings so this works, but if the
>>>>>> user sets any _less_ permissive branch setting this fails. For example:
>>>>>>
>>>>>> # perf record -j any,u -- true
>>>>>> Error:
>>>>>> cycles:PH: PMU Hardware or event type doesn't support branch stack
>>>>>> sampling.
>>>>>>
>>>>>> Here we want the default sampling permissions (exclude_kernel == 0,
>>>>>> exclude_user == 0), but only user branch records, which doesn't match.
>>>>>> It should be allowed because it doesn't include anything that we're not
>>>>>> allowed to see.
>>>>>
>>>>> I know it is allowed (on x86), but why would we want that? If you do
>>>>> something even more restricted:
>>>>>
>>>>> perf record -e cycles:k -j any,u -- true
>>>>>
>>>>> That's allowed on x86 and gives you samples with user addresses. But
>>>>> all the events happened in the kernel. How does that make any sense?
>>>>>
>>>>> I suppose in your example, we could avoid attaching branch stack on
>>>>> samples from the kernel. However, given how my example works, I'm
>>>>> pretty sure that's not what x86 does.
>>>>>
>>>>> There's also combinations that are allowed, but record no samples.
>>>>> Though I think that was with guest events. I've gone with rejecting
>>>>> nonsensical combinations as much as possible. We can easily remove those
>>>>> restrictions later if needed. Changing the behavior later (for the
>>>>> same configuration) wouldn't be good.
>>>>>
>>>>>
>>>>
>>>> Rejecting ones that produce no samples is fair enough, but my example
>>>> does produce samples. To answer the question "why would we want that?",
>>>> nothing major, but there are a few small reasons:
>>>>
>>>> * Perf includes both user and kernel by default, so the shortest
>>>> command to only gather user branches doesn't work (-j any,u)
>>>> * The test already checks for branch stack support like this, so old
>>>> Perf test versions don't work
>>>
>>> I would be more concerned about this one except that *we* wrote that
>>> test. (I'm not sure why we wrote a new test rather than adapting
>>> record_lbr.sh...)
>>>
>>
>> record_lbr.sh was added 6 months ago, test_brstack.sh 3 years ago so
>> it's the other way around.
>
> Sigh...
>
>> Although record_lbr.sh also tests --call-graph and --stitch-lbr as well,
>> so I think it's fine for test_brstack.sh to test only --branch-filter
>> options at the lowest level.
>>
>> Looking at that test though I see there is a capability
>> "/sys/devices/cpu/caps/branches". I'm wondering whether we should be
>> adding that on the Arm PMU for BRBE?
>
> I noticed that too. I suppose we should. Though I suppose that could
> give weird results if userspace is expecting LBR. Adding that would
> make record_lbr.sh run and then the LBR callgraph test is going to
> fail.
Looks like we should add it. The "branches" cap seems to imply that any
of the branch recording options are supported.
For --call-graph=lbr, that's a special branch type
PERF_SAMPLE_BRANCH_CALL_STACK which we reject as not supported in BRBE.
The test already does an additional skip if --call-graph=lbr isn't
supported, on top of checking the branches cap. But there are other
sub tests that don't use that option that should pass. They are only
checking for non zero branch stack entries.
>
>> Ignoring the tests, the man pages (and some pages on the internet) give
>> this example: "--branch-filter any_ret,u,k". This doesn't work either
>> because it doesn't match the default exclude_hv option. It just seems a
>> bit awkward and incompatible to me, for not much gain.
>>
>>>> * You might only be optimising userspace, but still interested in the
>>>> proportion of time spent or particular place in the kernel
>>>
>>> How do you see that? It looks completely misleading to me. 'perf
>>> report' seems to only list branch stack addresses in this case. There
>>> doesn't seem to be any matching of the event address to branch stack
>>> addresses.
>>>
>>
>> Perf script will show everything with all its various options, or
>> --branch-history on perf report will show both too. Also there are tools
>> other than Perf, AutoFDO seems like something that BRBE can be used with.
>>
>>>> * Consistency with existing implementations and for people porting
>>>> existing tools to Arm
>>>> * It doesn't cost anything to support it (I think we just
>>>> only check if exclude_* is set rather than !=)
>>>> * Permissions checks should be handled by the core code so that
>>>> they're consistent
>>>> * What's the point of separate branch filters anyway if they always
>>>> have to match the event filter?
>>>
>>> IDK, I wish someone could tell me. I don't see the usecase for them
>>> being mismatched.
>>>
>>> In any case, I don't care too much one way or the other what we do
>>> here. If everyone thinks we should relax this, then that's fine with
>>> me.
>>>
>>
>> Seeing the branch history from userspace that led up to a certain thing
>> in the kernel happening doesn't seem like that much of an edge case to
>> me. If you always have to have both on then you lose the userspace
>> branch history because the buffer isn't that big and gets overwritten.
>
> Okay, let's drop this check...
>
>>>> Some of these things could be fixed in Perf, but not in older versions.
>>>> Even if we can't think of a real use case now, it doesn't sound like the
>>>> driver should be so restrictive of an option that doesn't do any harm.
>>>>
>>>>>> This also makes the Perf branch test skip because it uses
>>>>>> any,save_type,u to see if BRBE exists.
>>>>>
>>>>> Yes, I plan to update that if we keep this behavior.
>>>>>
>>>>>>> + (!is_kernel_in_hyp_mode() &&
>>>>>>> + (event->attr.exclude_hv != !(branch_type & PERF_SAMPLE_BRANCH_HV))))
>>>>>>> + return false;
>>>>>>> +
>>>>>>> + event->hw.branch_reg.config = branch_type_to_brbfcr(event->attr.branch_sample_type);
>>>>>>> + event->hw.extra_reg.config = branch_type_to_brbcr(event->attr.branch_sample_type);
>>>>>>> +
>>>>>>> + return true;
>>>>>>> +}
>>>>>>> +
>>>>>> [...]
>>>>>>> +static const int brbe_type_to_perf_type_map[BRBINFx_EL1_TYPE_DEBUG_EXIT + 1][2] = {
>>>>>>> + [BRBINFx_EL1_TYPE_DIRECT_UNCOND] = { PERF_BR_UNCOND, 0 },
>>>>>>
>>>>>> Does the second field go into 'new_type'? They all seem to be zero so
>>>>>> I'm not sure why new_type isn't ignored instead of having it mapped.
>>>>>
>>>>> Well, left over from when all the Arm specific types were supported.
>>>>> So yeah, that can be simplified.
>>>>>
>>>>>>> + [BRBINFx_EL1_TYPE_INDIRECT] = { PERF_BR_IND, 0 },
>>>>>>> + [BRBINFx_EL1_TYPE_DIRECT_LINK] = { PERF_BR_CALL, 0 },
>>>>>>> + [BRBINFx_EL1_TYPE_INDIRECT_LINK] = { PERF_BR_IND_CALL, 0 },
>>>>>>> + [BRBINFx_EL1_TYPE_RET] = { PERF_BR_RET, 0 },
>>>>>>> + [BRBINFx_EL1_TYPE_DIRECT_COND] = { PERF_BR_COND, 0 },
>>>>>>> + [BRBINFx_EL1_TYPE_CALL] = { PERF_BR_CALL, 0 },
>>>>>>> + [BRBINFx_EL1_TYPE_ERET] = { PERF_BR_ERET, 0 },
>>>>>>> + [BRBINFx_EL1_TYPE_IRQ] = { PERF_BR_IRQ, 0 },
>>>>>>
>>>>>> How do ones that don't map to anything appear in Perf? For example
>>>>>> BRBINFx_EL1_TYPE_TRAP is missing, and the test that was attached to the
>>>>>> previous versions fails because it doesn't see the trap that jumps to
>>>>>> the kernel, but it does still see the ERET back to userspace:
>>>>>>
>>>>>> [unknown]/trap_bench+0x20/-/-/-/0/ERET/-
>>>>>>
>>>>>> In older versions we'd also have BRBINFx_EL1_TYPE_TRAP mapping to
>>>>>> PERF_BR_SYSCALL so you could see it go into the kernel before the return:
>>>>>>
>>>>>> trap_bench+0x1C/[unknown]/-/-/-/0/SYSCALL/-
>>>>>> [unknown]/trap_bench+0x20/-/-/-/0/ERET/-
>>>>>
>>>>> My read of that was we should see a CALL in this case. Whether SVC
>>>>> generates a TRAP or CALL depends on HFGITR_EL2.SVC_EL0 (table D18-2).
>>>>> I assumed "SVC due to HFGITR_EL2.SVC_EL0" means when SVC_EL0 is set
>>>>> (and set has additional conditions). We have SVC_EL0 cleared, so that
>>>>> should be a CALL. Maybe the FVP has this wrong?
>>>>>
>>>>
>>>> The test is doing this rather than a syscall:
>>>>
>>>> asm("mrs %0, ID_AA64ISAR0_EL1" : "=r" (val)); /* TRAP + ERET */
>>>>
>>>> So I think trap is right. Whether that should be mapped to SYSCALL or
>>>> some other branch type I don't know, but the point is that it's missing now.
>>>
>>> We aren't supporting any of the Arm specific traps/exceptions. One
>>> reason is for consistency with x86 like you just argued for. The only
>>
>> Does x86 leave holes in the program flow though, or is it complete? IMO
>> it makes it harder for tools to make sense of the branch buffer if there
>> are things like an ERET with no previous trap to match it up to.
>
> I'll have to test that. x86 has SYSRET for "syscall return". We added
> ERET which maps to x86 interrupt return. So I guess x86 only records
> syscalls and their returns. There's also "sw interrupt" on x86 which
> gets mapped to PERF_BR_UNKNOWN. I don't think there's any way for us
> to distinguish a syscall return from any other exception return.
>
Any return type is fine really, as long as you can potentially make
sense of it in the end.
>>> exception types supported are syscall and IRQ. Part of the issue is
>>> there is no userspace control over enabling all the extra Arm ones.
>>> There's no way to say enable all branches except debug, fault, etc.
>>> exceptions. If we want to support these, I think there should be user
>>> control over enabling them. But that can come later if there's any
>>> demand for them.
>>>
>>> Rob
>>
>> In this patchset we enable PERF_BR_IRQ with PERF_SAMPLE_BRANCH_ANY,
>> without any way to selectively disable it. I would assume trap could be
>> done with the same option.
>
> If I was designing the interface, I would make PERF_BR_IRQ separately
> controllable. But we're kind of stuck with what x86 did. I suppose we
> could add a negative 'noirq' option.
>
> Are you of the opinion that we should enable everything or some subset
> of them? There's basically inst/data/algn faults, FIQ, SError, and
> debug. The debug ones seem questionable to me, or at least ones you'd
> want to opt-in for. For FIQ, if that's used by secure world, do we
> want non-secure world recording when FIQs happen? Could the timing of
> those be used maliciously?
>
I would say include everything that's already filling the buffer and can
affect the program flow, even if they have to map to unknown or some
slightly off mapping. These tools are supposed to increase visibility,
not hide it. If silicon and buffer space are being consumed by branches,
let userspace decide if it wants to do anything with them or not. It
doesn't sound like changing 'unknown' to a more specific type in the
future would be a breaking change.
Except for any security stuff of course, if FIQ is an issue because of
that, filter it out.
>> If we're filtering some of them out it might be worth documenting that
>> "PERF_SAMPLE_BRANCH_ANY" doesn't actually mean 'any' branch type on Arm,
>> and some types are recorded but discarded before being sent to userspace.
>>
>> There could be some confusion when there are partially filled or empty
>> branch buffers, and the reason wouldn't be that there weren't any
>> branches recorded, but they were all filtered out even with the 'any'
>> option.
>
> Fair enough. I think we need Mark to chime in here. He was questioning
> the need for these.
>
> Rob
I suppose you could say any branches that leave and return to the same
place in userspace aren't useful, like trap and eret (but isn't syscall
the same, and we have those?). But that's only if you filter out kernel.
With both enabled the trap actually goes somewhere and I'm sure that's
interesting to someone.
It might be fine to say that the types that we have now don't match up
well enough, so we can revisit this in the future and add them in with
the right types rather than a potentially breaking change from unknown.
I will leave it to you.
I was mainly stuck on the permissions issue which seemed like a blocker.
I noticed this one because the test was actually testing it, but you're
right these more obscure branch types in userspace aren't exactly the
MVP of BRBE.
Although I will say that leaving the associated ERETs in but filtering
out the thing that took it there is a bit odd. Maybe that's just a
personal thing without much technical merit.
A lot of words to say I don't really know for sure either.
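For reference, a minimal sketch of the difference between the strict
permission check in the patch and one possible reading of the "check if
exclude_* is set rather than !=" suggestion above. The struct, the flag
values and the function names are hypothetical stand-ins for illustration,
not the perf UAPI definitions or the driver code, and the thread above
settles on dropping the check rather than relaxing it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the privilege bits of the branch filter. */
#define SKETCH_BRANCH_USER   (1u << 0)
#define SKETCH_BRANCH_KERNEL (1u << 1)

/* Hypothetical, trimmed-down view of the event attributes. */
struct sketch_attr {
	bool exclude_user;
	bool exclude_kernel;
	uint32_t branch_type;
};

/* Strict form: the event filter and branch filter must match exactly. */
bool sketch_strict_ok(const struct sketch_attr *a)
{
	if (a->exclude_user != !(a->branch_type & SKETCH_BRANCH_USER))
		return false;
	if (a->exclude_kernel != !(a->branch_type & SKETCH_BRANCH_KERNEL))
		return false;
	return true;
}

/* Relaxed form: only reject branch levels that the event itself excludes. */
bool sketch_relaxed_ok(const struct sketch_attr *a)
{
	if (a->exclude_user && (a->branch_type & SKETCH_BRANCH_USER))
		return false;
	if (a->exclude_kernel && (a->branch_type & SKETCH_BRANCH_KERNEL))
		return false;
	return true;
}

/* Walks through the two command lines discussed in the thread. */
void sketch_demo(void)
{
	/* "-j any,u" with default event permissions (nothing excluded). */
	struct sketch_attr any_u = { false, false, SKETCH_BRANCH_USER };
	/* "-e cycles:k -j any,u": user excluded, user branches requested. */
	struct sketch_attr k_any_u = { true, false, SKETCH_BRANCH_USER };

	assert(!sketch_strict_ok(&any_u));
	assert(sketch_relaxed_ok(&any_u));
	assert(!sketch_relaxed_ok(&k_any_u));
}
```

Under these assumptions "-j any,u" passes the relaxed form but not the
strict one, while "-e cycles:k -j any,u" is still rejected by the relaxed
form, which differs from the x86 behaviour described earlier in the thread.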
* Re: [PATCH v19 09/11] arm64: Handle BRBE booting requirements
2025-02-03 0:43 ` [PATCH v19 09/11] arm64: Handle BRBE booting requirements Rob Herring (Arm)
2025-02-03 8:47 ` Anshuman Khandual
@ 2025-02-12 12:10 ` Leo Yan
2025-02-12 21:21 ` Rob Herring
1 sibling, 1 reply; 43+ messages in thread
From: Leo Yan @ 2025-02-12 12:10 UTC (permalink / raw)
To: Rob Herring (Arm)
Cc: Will Deacon, Mark Rutland, Catalin Marinas, Jonathan Corbet,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, James Clark, Anshuman Khandual, linux-arm-kernel,
linux-perf-users, linux-kernel, linux-doc, kvmarm
On Sun, Feb 02, 2025 at 06:43:03PM -0600, Rob Herring (Arm) wrote:
>
> From: Anshuman Khandual <anshuman.khandual@arm.com>
>
> To use the Branch Record Buffer Extension (BRBE), some configuration is
> necessary at EL3 and EL2. This patch documents the requirements and adds
> the initial EL2 setup code, which largely consists of configuring the
> fine-grained traps and initializing a couple of BRBE control registers.
>
> Before this patch, __init_el2_fgt() would initialize HDFGRTR_EL2 and
> HDFGWTR_EL2 with the same value, relying on the read/write trap controls
> for a register occupying the same bit position in either register. The
> 'nBRBIDR' trap control only exists in bit 59 of HDFGRTR_EL2, while bit
> 59 of HDFGRTR_EL2 is RES0, and so this assumption no longer holds.
s/HDFGRTR_EL2/HDFGWTR_EL2
> To handle HDFGRTR_EL2 and HDFGWTR_EL2 having (slightly) different bit
> layouts, __init_el2_fgt() is changed to accumulate the HDFGRTR_EL2 and
> HDFGWTR_EL2 control bits separately. While making this change the
> open-coded value (1 << 62) is replaced with
> HDFG{R,W}TR_EL2_nPMSNEVFR_EL1_MASK.
>
> The BRBCR_EL1 and BRBCR_EL2 registers are unusual and require special
> initialisation: even though they are subject to E2H renaming, both have
> an effect regardless of HCR_EL2.TGE, even when running at EL2, and
> consequently both need to be initialised. This is handled in
> __init_el2_brbe() with a comment to explain the situation.
>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Oliver Upton <oliver.upton@linux.dev>
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> [Mark: rewrite commit message, fix typo in comment]
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
> ---
> Documentation/arch/arm64/booting.rst | 21 +++++++++
> arch/arm64/include/asm/el2_setup.h | 86 ++++++++++++++++++++++++++++++++++--
> 2 files changed, 104 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/arch/arm64/booting.rst b/Documentation/arch/arm64/booting.rst
> index cad6fdc96b98..0a421757cacf 100644
> --- a/Documentation/arch/arm64/booting.rst
> +++ b/Documentation/arch/arm64/booting.rst
> @@ -352,6 +352,27 @@ Before jumping into the kernel, the following conditions must be met:
>
> - HWFGWTR_EL2.nSMPRI_EL1 (bit 54) must be initialised to 0b01.
>
> + For CPUs with feature Branch Record Buffer Extension (FEAT_BRBE):
> +
> + - If EL3 is present:
> +
> + - MDCR_EL3.SBRBE (bits 33:32) must be initialised to 0b11.
Can MDCR_EL3.SBRBE be 0b01?
> +
> + - If the kernel is entered at EL1 and EL2 is present:
> +
> + - BRBCR_EL2.CC (bit 3) must be initialised to 0b1.
> + - BRBCR_EL2.MPRED (bit 4) must be initialised to 0b1.
Should this also state that BRBCR_EL2.TS must be initialised to 0b00? The
Arm ARM says the TS field resets to an UNKNOWN value, and the assembly
code below actually initialises the TS field to zero.
Apart from the above minor comments, I read the assembly code and it looks
good to me:
Reviewed-by: Leo Yan <leo.yan@arm.com>
> +
> + - HDFGRTR_EL2.nBRBDATA (bit 61) must be initialised to 0b1.
> + - HDFGRTR_EL2.nBRBCTL (bit 60) must be initialised to 0b1.
> + - HDFGRTR_EL2.nBRBIDR (bit 59) must be initialised to 0b1.
> +
> + - HDFGWTR_EL2.nBRBDATA (bit 61) must be initialised to 0b1.
> + - HDFGWTR_EL2.nBRBCTL (bit 60) must be initialised to 0b1.
> +
> + - HFGITR_EL2.nBRBIALL (bit 56) must be initialised to 0b1.
> + - HFGITR_EL2.nBRBINJ (bit 55) must be initialised to 0b1.
> +
> For CPUs with the Scalable Matrix Extension FA64 feature (FEAT_SME_FA64):
>
> - If EL3 is present:
> diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h
> index 25e162651750..bf21ce513aff 100644
> --- a/arch/arm64/include/asm/el2_setup.h
> +++ b/arch/arm64/include/asm/el2_setup.h
> @@ -163,6 +163,39 @@
> .Lskip_set_cptr_\@:
> .endm
>
> +/*
> + * Configure BRBE to permit recording cycle counts and branch mispredicts.
> + *
> + * At any EL, to record cycle counts BRBE requires that both BRBCR_EL2.CC=1 and
> + * BRBCR_EL1.CC=1.
> + *
> + * At any EL, to record branch mispredicts BRBE requires that both
> + * BRBCR_EL2.MPRED=1 and BRBCR_EL1.MPRED=1.
> + *
> + * When HCR_EL2.E2H=1, the BRBCR_EL1 encoding is redirected to BRBCR_EL2, but
> + * the {CC,MPRED} bits in the real BRBCR_EL1 register still apply.
> + *
> + * Set {CC,MPRED} in both BRBCR_EL2 and BRBCR_EL1 so that at runtime we only
> + * need to enable/disable these in BRBCR_EL1 regardless of whether the kernel
> + * ends up executing in EL1 or EL2.
> + */
> +.macro __init_el2_brbe
> + mrs x1, id_aa64dfr0_el1
> + ubfx x1, x1, #ID_AA64DFR0_EL1_BRBE_SHIFT, #4
> + cbz x1, .Lskip_brbe_\@
> +
> + mov_q x0, BRBCR_ELx_CC | BRBCR_ELx_MPRED
> + msr_s SYS_BRBCR_EL2, x0
> +
> + __check_hvhe .Lset_brbe_nvhe_\@, x1
> + msr_s SYS_BRBCR_EL12, x0 // VHE
> + b .Lskip_brbe_\@
> +
> +.Lset_brbe_nvhe_\@:
> + msr_s SYS_BRBCR_EL1, x0 // NVHE
> +.Lskip_brbe_\@:
> +.endm
> +
> /* Disable any fine grained traps */
> .macro __init_el2_fgt
> mrs x1, id_aa64mmfr0_el1
> @@ -170,16 +203,48 @@
> cbz x1, .Lskip_fgt_\@
>
> mov x0, xzr
> + mov x2, xzr
> mrs x1, id_aa64dfr0_el1
> ubfx x1, x1, #ID_AA64DFR0_EL1_PMSVer_SHIFT, #4
> cmp x1, #3
> b.lt .Lskip_spe_fgt_\@
> +
> /* Disable PMSNEVFR_EL1 read and write traps */
> - orr x0, x0, #(1 << 62)
> + orr x0, x0, #HDFGRTR_EL2_nPMSNEVFR_EL1_MASK
> + orr x2, x2, #HDFGWTR_EL2_nPMSNEVFR_EL1_MASK
>
> .Lskip_spe_fgt_\@:
> +#ifdef CONFIG_ARM64_BRBE
> + mrs x1, id_aa64dfr0_el1
> + ubfx x1, x1, #ID_AA64DFR0_EL1_BRBE_SHIFT, #4
> + cbz x1, .Lskip_brbe_reg_fgt_\@
> +
> + /*
> + * Disable read traps for the following registers
> + *
> + * [BRBSRC|BRBTGT|BRBINF]_EL1
> + * [BRBSRCINJ|BRBTGTINJ|BRBINFINJ|BRBTS]_EL1
> + */
> + orr x0, x0, #HDFGRTR_EL2_nBRBDATA_MASK
> +
> + /*
> + * Disable write traps for the following registers
> + *
> + * [BRBSRCINJ|BRBTGTINJ|BRBINFINJ|BRBTS]_EL1
> + */
> + orr x2, x2, #HDFGWTR_EL2_nBRBDATA_MASK
> +
> + /* Disable read and write traps for [BRBCR|BRBFCR]_EL1 */
> + orr x0, x0, #HDFGRTR_EL2_nBRBCTL_MASK
> + orr x2, x2, #HDFGWTR_EL2_nBRBCTL_MASK
> +
> + /* Disable read traps for BRBIDR_EL1 */
> + orr x0, x0, #HDFGRTR_EL2_nBRBIDR_MASK
> +
> +.Lskip_brbe_reg_fgt_\@:
> +#endif /* CONFIG_ARM64_BRBE */
> msr_s SYS_HDFGRTR_EL2, x0
> - msr_s SYS_HDFGWTR_EL2, x0
> + msr_s SYS_HDFGWTR_EL2, x2
>
> mov x0, xzr
> mrs x1, id_aa64pfr1_el1
> @@ -220,7 +285,21 @@
> .Lset_fgt_\@:
> msr_s SYS_HFGRTR_EL2, x0
> msr_s SYS_HFGWTR_EL2, x0
> - msr_s SYS_HFGITR_EL2, xzr
> + mov x0, xzr
> +#ifdef CONFIG_ARM64_BRBE
> + mrs x1, id_aa64dfr0_el1
> + ubfx x1, x1, #ID_AA64DFR0_EL1_BRBE_SHIFT, #4
> + cbz x1, .Lskip_brbe_insn_fgt_\@
> +
> + /* Disable traps for BRBIALL instruction */
> + orr x0, x0, #HFGITR_EL2_nBRBIALL_MASK
> +
> + /* Disable traps for BRBINJ instruction */
> + orr x0, x0, #HFGITR_EL2_nBRBINJ_MASK
> +
> +.Lskip_brbe_insn_fgt_\@:
> +#endif /* CONFIG_ARM64_BRBE */
> + msr_s SYS_HFGITR_EL2, x0
>
> mrs x1, id_aa64pfr0_el1 // AMU traps UNDEF without AMU
> ubfx x1, x1, #ID_AA64PFR0_EL1_AMU_SHIFT, #4
> @@ -275,6 +354,7 @@
> __init_el2_hcrx
> __init_el2_timers
> __init_el2_debug
> + __init_el2_brbe
> __init_el2_lor
> __init_el2_stage2
> __init_el2_gicv3
>
> --
> 2.47.2
>
>
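For reference, a minimal C model of the read/write split described in the
commit message above: the two trap registers are accumulated separately
because nBRBIDR only exists on the read side. The bit positions are the ones
listed in the booting.rst hunk and the replaced (1 << 62) constant; the
macro, struct and function names are hypothetical and only model what the
assembly in __init_el2_fgt appears to do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions taken from the booting.rst hunk and the commit message. */
#define SK_nPMSNEVFR_EL1 (UINT64_C(1) << 62) /* HDFGRTR_EL2 and HDFGWTR_EL2 */
#define SK_nBRBDATA      (UINT64_C(1) << 61) /* HDFGRTR_EL2 and HDFGWTR_EL2 */
#define SK_nBRBCTL       (UINT64_C(1) << 60) /* HDFGRTR_EL2 and HDFGWTR_EL2 */
#define SK_nBRBIDR       (UINT64_C(1) << 59) /* HDFGRTR_EL2 only */

/* Hypothetical pair of values to be written to the two trap registers. */
struct sk_fgt {
	uint64_t hdfgrtr; /* read trap controls */
	uint64_t hdfgwtr; /* write trap controls */
};

/*
 * Accumulate the read and write controls separately. 'spe_traps' models
 * the PMSVer >= 3 check and 'brbe' models a non-zero BRBE ID field.
 */
struct sk_fgt sk_fgt_values(bool spe_traps, bool brbe)
{
	struct sk_fgt v = { 0, 0 };

	if (spe_traps) {
		v.hdfgrtr |= SK_nPMSNEVFR_EL1;
		v.hdfgwtr |= SK_nPMSNEVFR_EL1;
	}

	if (brbe) {
		v.hdfgrtr |= SK_nBRBDATA | SK_nBRBCTL | SK_nBRBIDR;
		v.hdfgwtr |= SK_nBRBDATA | SK_nBRBCTL; /* bit 59 stays 0 */
	}

	/* The write-side value must never carry the read-only control. */
	assert(!(v.hdfgwtr & SK_nBRBIDR));
	return v;
}
```

Writing the same value to both registers, as the code did before this patch,
would set bit 59 on the write side as well, which is the case the separate
accumulator avoids.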
* Re: [PATCH v19 11/11] perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE)
2025-02-03 0:43 ` [PATCH v19 11/11] perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE) Rob Herring (Arm)
2025-02-03 16:53 ` James Clark
@ 2025-02-12 18:52 ` Leo Yan
2025-02-12 19:00 ` Leo Yan
2025-02-13 16:16 ` Leo Yan
2 siblings, 1 reply; 43+ messages in thread
From: Leo Yan @ 2025-02-12 18:52 UTC (permalink / raw)
To: Rob Herring (Arm)
Cc: Will Deacon, Mark Rutland, Catalin Marinas, Jonathan Corbet,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, James Clark, Anshuman Khandual, linux-arm-kernel,
linux-perf-users, linux-kernel, linux-doc, kvmarm
On Sun, Feb 02, 2025 at 06:43:05PM -0600, Rob Herring (Arm) wrote:
>
> From: Anshuman Khandual <anshuman.khandual@arm.com>
>
> The ARMv9.2 architecture introduces the optional Branch Record Buffer
> Extension (BRBE), which records information about branches as they are
> executed into a set of branch record registers. BRBE is similar to x86's
> Last Branch Record (LBR) and PowerPC's Branch History Rolling Buffer
> (BHRB).
[...]
> diff --git a/drivers/perf/arm_brbe.c b/drivers/perf/arm_brbe.c
> new file mode 100644
> index 000000000000..18eb9bfa1f9c
> --- /dev/null
> +++ b/drivers/perf/arm_brbe.c
> @@ -0,0 +1,794 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Branch Record Buffer Extension Driver.
> + *
> + * Copyright (C) 2022-2025 ARM Limited
> + *
> + * Author: Anshuman Khandual <anshuman.khandual@arm.com>
> + */
> +#include <linux/types.h>
> +#include <linux/bitmap.h>
> +#include <linux/perf/arm_pmu.h>
> +#include "arm_brbe.h"
> +
> +#define BRBFCR_EL1_BRANCH_FILTERS (BRBFCR_EL1_DIRECT | \
> + BRBFCR_EL1_INDIRECT | \
> + BRBFCR_EL1_RTN | \
> + BRBFCR_EL1_INDCALL | \
> + BRBFCR_EL1_DIRCALL | \
> + BRBFCR_EL1_CONDDIR)
> +
> +/*
> + * BRBTS_EL1 is currently not used for branch stack implementation
> + * purpose but BRBCR_ELx.TS needs to have a valid value from all
> + * available options. BRBCR_ELx_TS_VIRTUAL is selected for this.
> + */
> +#define BRBCR_ELx_DEFAULT_TS FIELD_PREP(BRBCR_ELx_TS_MASK, BRBCR_ELx_TS_VIRTUAL)
> +
> +/*
> + * BRBE Buffer Organization
> + *
> + * BRBE buffer is arranged as multiple banks of 32 branch record
> + * entries each. An individual branch record in a given bank could
> + * be accessed, after selecting the bank in BRBFCR_EL1.BANK and
> + * accessing the registers i.e [BRBSRC, BRBTGT, BRBINF] set with
> + * indices [0..31].
> + *
> + * Bank 0
> + *
> + * --------------------------------- ------
> + * | 00 | BRBSRC | BRBTGT | BRBINF | | 00 |
> + * --------------------------------- ------
> + * | 01 | BRBSRC | BRBTGT | BRBINF | | 01 |
> + * --------------------------------- ------
> + * | .. | BRBSRC | BRBTGT | BRBINF | | .. |
> + * --------------------------------- ------
> + * | 31 | BRBSRC | BRBTGT | BRBINF | | 31 |
> + * --------------------------------- ------
> + *
> + * Bank 1
> + *
> + * --------------------------------- ------
> + * | 32 | BRBSRC | BRBTGT | BRBINF | | 00 |
> + * --------------------------------- ------
> + * | 33 | BRBSRC | BRBTGT | BRBINF | | 01 |
> + * --------------------------------- ------
> + * | .. | BRBSRC | BRBTGT | BRBINF | | .. |
> + * --------------------------------- ------
> + * | 63 | BRBSRC | BRBTGT | BRBINF | | 31 |
> + * --------------------------------- ------
> + */
> +#define BRBE_BANK_MAX_ENTRIES 32
> +#define BRBE_MAX_BANK 2
> +#define BRBE_MAX_ENTRIES (BRBE_BANK_MAX_ENTRIES * BRBE_MAX_BANK)
BRBE_MAX_BANK and BRBE_MAX_ENTRIES are not used. Should they be removed?
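For reference, the bank layout described in the comment above maps a flat
record index onto a (bank, register) pair roughly as sketched below. This is
a stand-alone illustration, not code from the patch: struct brbe_slot and
brbe_locate() are hypothetical names.

```c
#define BRBE_BANK_MAX_ENTRIES	32

/* Hypothetical helper type: where a flat record index lives. */
struct brbe_slot {
	unsigned int bank;	/* value to select in BRBFCR_EL1.BANK */
	unsigned int reg;	/* index n of BRBSRC<n>/BRBTGT<n>/BRBINF<n> */
};

/* Split a flat index (0..63 for two banks) into bank and register index. */
static struct brbe_slot brbe_locate(unsigned int idx)
{
	struct brbe_slot slot = {
		.bank = idx / BRBE_BANK_MAX_ENTRIES,
		.reg  = idx % BRBE_BANK_MAX_ENTRIES,
	};

	return slot;
}
```

With this mapping, record 33 is register index 1 in bank 1, matching the
second table in the comment.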
> +
> +struct brbe_regset {
> + unsigned long brbsrc;
> + unsigned long brbtgt;
> + unsigned long brbinf;
These are 64-bit registers, so define them explicitly as 'u64'.
> +};
> +
> +#define PERF_BR_ARM64_MAX (PERF_BR_MAX + PERF_BR_NEW_MAX)
> +
> +struct brbe_hw_attr {
> + int brbe_version;
> + int brbe_cc;
> + int brbe_nr;
> + int brbe_format;
> +};
> +
> +#define BRBE_REGN_CASE(n, case_macro) \
> + case n: case_macro(n); break
> +
> +#define BRBE_REGN_SWITCH(x, case_macro) \
> + do { \
> + switch (x) { \
> + BRBE_REGN_CASE(0, case_macro); \
> + BRBE_REGN_CASE(1, case_macro); \
> + BRBE_REGN_CASE(2, case_macro); \
> + BRBE_REGN_CASE(3, case_macro); \
> + BRBE_REGN_CASE(4, case_macro); \
> + BRBE_REGN_CASE(5, case_macro); \
> + BRBE_REGN_CASE(6, case_macro); \
> + BRBE_REGN_CASE(7, case_macro); \
> + BRBE_REGN_CASE(8, case_macro); \
> + BRBE_REGN_CASE(9, case_macro); \
> + BRBE_REGN_CASE(10, case_macro); \
> + BRBE_REGN_CASE(11, case_macro); \
> + BRBE_REGN_CASE(12, case_macro); \
> + BRBE_REGN_CASE(13, case_macro); \
> + BRBE_REGN_CASE(14, case_macro); \
> + BRBE_REGN_CASE(15, case_macro); \
> + BRBE_REGN_CASE(16, case_macro); \
> + BRBE_REGN_CASE(17, case_macro); \
> + BRBE_REGN_CASE(18, case_macro); \
> + BRBE_REGN_CASE(19, case_macro); \
> + BRBE_REGN_CASE(20, case_macro); \
> + BRBE_REGN_CASE(21, case_macro); \
> + BRBE_REGN_CASE(22, case_macro); \
> + BRBE_REGN_CASE(23, case_macro); \
> + BRBE_REGN_CASE(24, case_macro); \
> + BRBE_REGN_CASE(25, case_macro); \
> + BRBE_REGN_CASE(26, case_macro); \
> + BRBE_REGN_CASE(27, case_macro); \
> + BRBE_REGN_CASE(28, case_macro); \
> + BRBE_REGN_CASE(29, case_macro); \
> + BRBE_REGN_CASE(30, case_macro); \
> + BRBE_REGN_CASE(31, case_macro); \
> + default: WARN(1, "Invalid BRB* index %d\n", x); \
> + } \
> + } while (0)
> +
> +#define RETURN_READ_BRBSRCN(n) \
> + return read_sysreg_s(SYS_BRBSRC_EL1(n))
> +static inline u64 get_brbsrc_reg(int idx)
> +{
> + BRBE_REGN_SWITCH(idx, RETURN_READ_BRBSRCN);
> + return 0;
> +}
> +
> +#define RETURN_READ_BRBTGTN(n) \
> + return read_sysreg_s(SYS_BRBTGT_EL1(n))
> +static u64 get_brbtgt_reg(int idx)
> +{
> + BRBE_REGN_SWITCH(idx, RETURN_READ_BRBTGTN);
> + return 0;
> +}
> +
> +#define RETURN_READ_BRBINFN(n) \
> + return read_sysreg_s(SYS_BRBINF_EL1(n))
> +static u64 get_brbinf_reg(int idx)
> +{
> + BRBE_REGN_SWITCH(idx, RETURN_READ_BRBINFN);
> + return 0;
> +}
> +
> +static u64 brbe_record_valid(u64 brbinf)
> +{
> + return FIELD_GET(BRBINFx_EL1_VALID_MASK, brbinf);
> +}
> +
> +static bool brbe_invalid(u64 brbinf)
> +{
> + return brbe_record_valid(brbinf) == BRBINFx_EL1_VALID_NONE;
> +}
> +
> +static bool brbe_record_is_complete(u64 brbinf)
> +{
> + return brbe_record_valid(brbinf) == BRBINFx_EL1_VALID_FULL;
> +}
> +
> +static bool brbe_record_is_source_only(u64 brbinf)
> +{
> + return brbe_record_valid(brbinf) == BRBINFx_EL1_VALID_SOURCE;
> +}
> +
> +static bool brbe_record_is_target_only(u64 brbinf)
> +{
> + return brbe_record_valid(brbinf) == BRBINFx_EL1_VALID_TARGET;
> +}
> +
> +static int brbinf_get_in_tx(u64 brbinf)
> +{
> + return FIELD_GET(BRBINFx_EL1_T_MASK, brbinf);
> +}
> +
> +static int brbinf_get_mispredict(u64 brbinf)
> +{
> + return FIELD_GET(BRBINFx_EL1_MPRED_MASK, brbinf);
> +}
I would expect the naming of brbinf_get_mispredict() will cause
confusion. When the function returns 1, it means "Branch was
incorrectly predicted".
Maybe consider to use '!FIELD_GET(...)' for a reversed value?
> +
> +static int brbinf_get_lastfailed(u64 brbinf)
> +{
> + return FIELD_GET(BRBINFx_EL1_LASTFAILED_MASK, brbinf);
> +}
> +
> +static u16 brbinf_get_cycles(u64 brbinf)
> +{
> + u32 exp, mant, cycles;
> + /*
> + * Captured cycle count is unknown and hence
> + * should not be passed on to userspace.
> + */
> + if (brbinf & BRBINFx_EL1_CCU)
> + return 0;
> +
> + exp = FIELD_GET(BRBINFx_EL1_CC_EXP_MASK, brbinf);
> + mant = FIELD_GET(BRBINFx_EL1_CC_MANT_MASK, brbinf);
> +
> + if (!exp)
> + return mant;
> +
> + cycles = (mant | 0x100) << (exp - 1);
> +
> + return (cycles > U16_MAX) ? U16_MAX : cycles;
min(cycles, (u32)U16_MAX);
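To make the mantissa/exponent decode easier to follow, here is a minimal
user-space sketch of the same arithmetic. The field widths (8-bit mantissa,
6-bit exponent) are assumptions taken from the 0x100 implicit bit in the
quoted code, cc_decode() is a hypothetical name, and the early saturation
check is an addition of this sketch that keeps the shift count small.

```c
#include <stdint.h>

/*
 * Decode a mantissa/exponent cycle count, saturating at UINT16_MAX.
 * Assumed for illustration: mant fits in 8 bits, exp fits in 6 bits.
 */
static uint16_t cc_decode(uint32_t exp, uint32_t mant)
{
	uint32_t cycles;

	if (!exp)
		return (uint16_t)mant;	/* exponent 0: value is the mantissa */

	/*
	 * (mant | 0x100) is at least 0x100, so a shift of 8 or more
	 * always exceeds UINT16_MAX. Saturating here also keeps the
	 * shift count below the width of uint32_t.
	 */
	if (exp - 1 >= 8)
		return UINT16_MAX;

	cycles = (mant | 0x100) << (exp - 1);

	return cycles > UINT16_MAX ? UINT16_MAX : (uint16_t)cycles;
}
```

For example, exp = 2 and mant = 1 gives (0x101 << 1) = 514 cycles, and any
exp of 9 or more saturates.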
Please expect more comments from me tomorrow.
Thanks,
Leo
* Re: [PATCH v19 11/11] perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE)
2025-02-12 18:52 ` Leo Yan
@ 2025-02-12 19:00 ` Leo Yan
0 siblings, 0 replies; 43+ messages in thread
From: Leo Yan @ 2025-02-12 19:00 UTC (permalink / raw)
To: Rob Herring (Arm)
Cc: Will Deacon, Mark Rutland, Catalin Marinas, Jonathan Corbet,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, James Clark, Anshuman Khandual, linux-arm-kernel,
linux-perf-users, linux-kernel, linux-doc, kvmarm
On Wed, Feb 12, 2025 at 06:52:27PM +0000, Leo Yan wrote:
[...]
> > +static int brbinf_get_mispredict(u64 brbinf)
> > +{
> > + return FIELD_GET(BRBINFx_EL1_MPRED_MASK, brbinf);
> > +}
>
> I would expect the naming of brbinf_get_mispredict() will cause
> confusion. When the function returns 1, it means "Branch was
> incorrectly predicted".
>
> Maybe consider to use '!FIELD_GET(...)' for a reversed value?
Please ignore this comment. Sorry for my misreading and noise.
* Re: [PATCH v19 09/11] arm64: Handle BRBE booting requirements
2025-02-12 12:10 ` Leo Yan
@ 2025-02-12 21:21 ` Rob Herring
2025-02-13 12:27 ` Leo Yan
0 siblings, 1 reply; 43+ messages in thread
From: Rob Herring @ 2025-02-12 21:21 UTC (permalink / raw)
To: Leo Yan
Cc: Will Deacon, Mark Rutland, Catalin Marinas, Jonathan Corbet,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, James Clark, Anshuman Khandual, linux-arm-kernel,
linux-perf-users, linux-kernel, linux-doc, kvmarm
On Wed, Feb 12, 2025 at 6:10 AM Leo Yan <leo.yan@arm.com> wrote:
>
> On Sun, Feb 02, 2025 at 06:43:03PM -0600, Rob Herring (Arm) wrote:
> >
> > From: Anshuman Khandual <anshuman.khandual@arm.com>
> >
> > To use the Branch Record Buffer Extension (BRBE), some configuration is
> > necessary at EL3 and EL2. This patch documents the requirements and adds
> > the initial EL2 setup code, which largely consists of configuring the
> > fine-grained traps and initializing a couple of BRBE control registers.
> >
> > Before this patch, __init_el2_fgt() would initialize HDFGRTR_EL2 and
> > HDFGWTR_EL2 with the same value, relying on the read/write trap controls
> > for a register occupying the same bit position in either register. The
> > 'nBRBIDR' trap control only exists in bit 59 of HDFGRTR_EL2, while bit
> > 59 of HDFGRTR_EL2 is RES0, and so this assumption no longer holds.
>
> s/HDFGRTR_EL2/HDFGWTR_EL2
>
> > To handle HDFGRTR_EL2 and HDFGWTR_EL2 having (slightly) different bit
> > layouts, __init_el2_fgt() is changed to accumulate the HDFGRTR_EL2 and
> > HDFGWTR_EL2 control bits separately. While making this change the
> > open-coded value (1 << 62) is replaced with
> > HDFG{R,W}TR_EL2_nPMSNEVFR_EL1_MASK.
> >
> > The BRBCR_EL1 and BRBCR_EL2 registers are unusual and require special
> > initialisation: even though they are subject to E2H renaming, both have
> > an effect regardless of HCR_EL2.TGE, even when running at EL2, and
> > consequently both need to be initialised. This is handled in
> > __init_el2_brbe() with a comment to explain the situation.
> >
> > Cc: Marc Zyngier <maz@kernel.org>
> > Cc: Oliver Upton <oliver.upton@linux.dev>
> > Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> > [Mark: rewrite commit message, fix typo in comment]
> > Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> > Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
> > ---
> > Documentation/arch/arm64/booting.rst | 21 +++++++++
> > arch/arm64/include/asm/el2_setup.h | 86 ++++++++++++++++++++++++++++++++++--
> > 2 files changed, 104 insertions(+), 3 deletions(-)
> >
> > diff --git a/Documentation/arch/arm64/booting.rst b/Documentation/arch/arm64/booting.rst
> > index cad6fdc96b98..0a421757cacf 100644
> > --- a/Documentation/arch/arm64/booting.rst
> > +++ b/Documentation/arch/arm64/booting.rst
> > @@ -352,6 +352,27 @@ Before jumping into the kernel, the following conditions must be met:
> >
> > - HWFGWTR_EL2.nSMPRI_EL1 (bit 54) must be initialised to 0b01.
> >
> > + For CPUs with feature Branch Record Buffer Extension (FEAT_BRBE):
> > +
> > + - If EL3 is present:
> > +
> > + - MDCR_EL3.SBRBE (bits 33:32) must be initialised to 0b11.
>
> Can MDCR_EL3.SBRBE be 0b01 ?
Yes, in fact I think that should be required instead. If it is 0b11,
then recording of secure EL0, EL1, and EL2 would be allowed and
accessible to non-secure world. Though I suppose EL3 could explicitly
pause BRBE instead.
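As a reminder of what the field update looks like, below is a small
stand-alone sketch of programming a two-bit field at bits 33:32 (the position
given in the documentation hunk above). The macro and function names are
hypothetical, and this runs on a plain 64-bit variable rather than the real
EL3 register.

```c
#include <stdint.h>

/* Field position taken from the booting.rst text: SBRBE is bits 33:32. */
#define MDCR_EL3_SBRBE_SHIFT	32
#define MDCR_EL3_SBRBE_MASK	(UINT64_C(0x3) << MDCR_EL3_SBRBE_SHIFT)

/* Return mdcr with the SBRBE field replaced by val (0b00..0b11). */
static uint64_t mdcr_el3_set_sbrbe(uint64_t mdcr, uint64_t val)
{
	mdcr &= ~MDCR_EL3_SBRBE_MASK;
	mdcr |= (val << MDCR_EL3_SBRBE_SHIFT) & MDCR_EL3_SBRBE_MASK;
	return mdcr;
}

/* Extract the SBRBE field from a register value. */
static unsigned int mdcr_el3_get_sbrbe(uint64_t mdcr)
{
	return (unsigned int)((mdcr & MDCR_EL3_SBRBE_MASK) >> MDCR_EL3_SBRBE_SHIFT);
}
```

Either encoding discussed above (0b01 or 0b11) would be passed as val; the
rest of the register is left unchanged.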
> > +
> > + - If the kernel is entered at EL1 and EL2 is present:
> > +
> > + - BRBCR_EL2.CC (bit 3) must be initialised to 0b1.
> > + - BRBCR_EL2.MPRED (bit 4) must be initialised to 0b1.
>
> Should clarify BRBCR_EL2.TS to be initialised to 0b00 ? Arm ARM
> claims the reset behaviour of the TS field is unknown value. The
> assembly code below actually has initializes the TS field as zero.
Humm, we don't currently care what it is initialized to because the
timestamp is never used. We would care in the future if we use
timestamps. Will 0b00 be the only correct value? I'm not sure.
> Except the above minor comments, I read the assembly code and it looks
> good to me:
>
> Reviewed-by: Leo Yan <leo.yan@arm.com>
Thank you.
Rob
* Re: [PATCH v19 09/11] arm64: Handle BRBE booting requirements
2025-02-12 21:21 ` Rob Herring
@ 2025-02-13 12:27 ` Leo Yan
0 siblings, 0 replies; 43+ messages in thread
From: Leo Yan @ 2025-02-13 12:27 UTC (permalink / raw)
To: Rob Herring
Cc: Will Deacon, Mark Rutland, Catalin Marinas, Jonathan Corbet,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, James Clark, Anshuman Khandual, linux-arm-kernel,
linux-perf-users, linux-kernel, linux-doc, kvmarm
On Wed, Feb 12, 2025 at 03:21:46PM -0600, Rob Herring wrote:
[...]
> > > + - If the kernel is entered at EL1 and EL2 is present:
> > > +
> > > + - BRBCR_EL2.CC (bit 3) must be initialised to 0b1.
> > > + - BRBCR_EL2.MPRED (bit 4) must be initialised to 0b1.
> >
> > Should clarify BRBCR_EL2.TS to be initialised to 0b00 ? Arm ARM
> > claims the reset behaviour of the TS field is unknown value. The
> > assembly code below actually has initializes the TS field as zero.
>
> Humm, we don't currently care what it is initialized to because the
> timestamp is never used. We would care in the future if we use
> timestamps. Will 0b00 be the only correct value? I'm not sure.
In the initialization phase, if BRBCR_EL2.TS is set to 0b00, then the
timestamp is decided by BRBCR_EL1.TS. I expect the BRBE driver will always
write BRBCR_EL1.TS.
Thanks,
Leo
* Re: [PATCH v19 11/11] perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE)
2025-02-03 0:43 ` [PATCH v19 11/11] perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE) Rob Herring (Arm)
2025-02-03 16:53 ` James Clark
2025-02-12 18:52 ` Leo Yan
@ 2025-02-13 16:16 ` Leo Yan
2025-02-13 17:13 ` Rob Herring
2 siblings, 1 reply; 43+ messages in thread
From: Leo Yan @ 2025-02-13 16:16 UTC (permalink / raw)
To: Rob Herring (Arm)
Cc: Will Deacon, Mark Rutland, Catalin Marinas, Jonathan Corbet,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, James Clark, Anshuman Khandual, linux-arm-kernel,
linux-perf-users, linux-kernel, linux-doc, kvmarm
On Sun, Feb 02, 2025 at 06:43:05PM -0600, Rob Herring (Arm) wrote:
[...]
> +void brbe_enable(const struct arm_pmu *arm_pmu)
> +{
> + struct pmu_hw_events *cpuc = this_cpu_ptr(arm_pmu->hw_events);
> + u64 brbfcr = 0, brbcr = 0;
> +
> + /*
> + * Merge the permitted branch filters of all events.
> + */
> + for (int i = 0; i < ARMPMU_MAX_HWEVENTS; i++) {
> + struct perf_event *event = cpuc->events[i];
> +
> + if (event && has_branch_stack(event)) {
> + brbfcr |= event->hw.branch_reg.config;
> + brbcr |= event->hw.extra_reg.config;
> + }
> + }
> +
> + /*
> + * If the record buffer contains any branches, we've already read them
> + * out and don't want to read them again.
> + * No need to sync as we're already stopped.
> + */
> + brbe_invalidate_nosync();
> + isb(); // Make sure invalidate takes effect before enabling
> +
> + /*
> + * In VHE mode with MDCR_EL2.HPMN set to PMCR_EL0.N, the counters are
> + * controlled by BRBCR_EL1 rather than BRBCR_EL2 (which writes to
> + * BRBCR_EL1 are redirected to). Use the same value for both register
> + * except keep EL1 and EL0 recording disabled in guests.
> + */
> + if (is_kernel_in_hyp_mode())
> + write_sysreg_s(brbcr & ~(BRBCR_ELx_ExBRE | BRBCR_ELx_E0BRE), SYS_BRBCR_EL12);
> + write_sysreg_s(brbcr, SYS_BRBCR_EL1);
> + isb(); // Ensure BRBCR_ELx settings take effect before unpausing
> +
> + write_sysreg_s(brbfcr, SYS_BRBFCR_EL1);
It seems odd to me to first enable recording (BRBCR) and only then set the
control register BRBFCR. The write to SYS_BRBFCR_EL1 not being guarded
by a barrier is also a bit concerning.
> +}
> +
> +void brbe_disable(void)
> +{
> + /*
> + * No need for synchronization here as synchronization in PMCR write
> + * ensures ordering and in the interrupt handler this is a NOP as
> + * we're already paused.
> + */
> + write_sysreg_s(BRBFCR_EL1_PAUSED, SYS_BRBFCR_EL1);
Maybe the Arm ARM description of the PAUSED bit is causing the confusion:
I read it as a status bit that indicates branch recording is paused.
> +}
> +
> +static const int brbe_type_to_perf_type_map[BRBINFx_EL1_TYPE_DEBUG_EXIT + 1][2] = {
> + [BRBINFx_EL1_TYPE_DIRECT_UNCOND] = { PERF_BR_UNCOND, 0 },
> + [BRBINFx_EL1_TYPE_INDIRECT] = { PERF_BR_IND, 0 },
> + [BRBINFx_EL1_TYPE_DIRECT_LINK] = { PERF_BR_CALL, 0 },
> + [BRBINFx_EL1_TYPE_INDIRECT_LINK] = { PERF_BR_IND_CALL, 0 },
> + [BRBINFx_EL1_TYPE_RET] = { PERF_BR_RET, 0 },
> + [BRBINFx_EL1_TYPE_DIRECT_COND] = { PERF_BR_COND, 0 },
> + [BRBINFx_EL1_TYPE_CALL] = { PERF_BR_CALL, 0 },
> + [BRBINFx_EL1_TYPE_ERET] = { PERF_BR_ERET, 0 },
> + [BRBINFx_EL1_TYPE_IRQ] = { PERF_BR_IRQ, 0 },
As far as I can see, this table cannot reflect the complete set of branch
types. We might need to consider extending the perf branch flags later.
If 'new_type' is always zero, it is not necessary to maintain an
array with two items (the second one is always 0).
> +};
> +
> +static void brbe_set_perf_entry_type(struct perf_branch_entry *entry, u64 brbinf)
> +{
> + int brbe_type = brbinf_get_type(brbinf);
> +
> + if (brbe_type <= BRBINFx_EL1_TYPE_DEBUG_EXIT) {
> + const int *br_type = brbe_type_to_perf_type_map[brbe_type];
> +
> + entry->type = br_type[0];
> + entry->new_type = br_type[1];
> + }
> +}
> +
> +static int brbinf_get_perf_priv(u64 brbinf)
> +{
> + int brbe_el = brbinf_get_el(brbinf);
> +
> + switch (brbe_el) {
> + case BRBINFx_EL1_EL_EL0:
> + return PERF_BR_PRIV_USER;
> + case BRBINFx_EL1_EL_EL1:
> + return PERF_BR_PRIV_KERNEL;
> + case BRBINFx_EL1_EL_EL2:
> + if (is_kernel_in_hyp_mode())
> + return PERF_BR_PRIV_KERNEL;
> + return PERF_BR_PRIV_HV;
> + default:
> + pr_warn_once("%d - unknown branch privilege captured\n", brbe_el);
> + return PERF_BR_PRIV_UNKNOWN;
> + }
> +}
> +
> +static void capture_brbe_flags(struct perf_branch_entry *entry,
> + const struct perf_event *event,
> + u64 brbinf)
> +{
> + brbe_set_perf_entry_type(entry, brbinf);
> +
> + if (!branch_sample_no_cycles(event))
> + entry->cycles = brbinf_get_cycles(brbinf);
> +
> + if (!branch_sample_no_flags(event)) {
> + /* Mispredict info is available for source only and complete branch records. */
> + if (!brbe_record_is_target_only(brbinf)) {
> + entry->mispred = brbinf_get_mispredict(brbinf);
> + entry->predicted = !entry->mispred;
> + }
> +
> + /*
> + * Currently TME feature is neither implemented in any hardware
> + * nor it is being supported in the kernel. Just warn here once
> + * if TME related information shows up rather unexpectedly.
> + */
> + if (brbinf_get_lastfailed(brbinf) || brbinf_get_in_tx(brbinf))
> + pr_warn_once("Unknown transaction states\n");
If the branch is in a transaction, we can set:
entry->in_tx = 1;
> + }
> +
> + /*
> + * Branch privilege level is available for target only and complete
> + * branch records.
> + */
> + if (!brbe_record_is_source_only(brbinf))
> + entry->priv = brbinf_get_perf_priv(brbinf);
This logic is not quite right. In theory, if we check with the above
condition (!brbe_record_is_source_only(brbinf)), it might be the
case that neither source nor target is valid.
Thanks,
Leo
* Re: [PATCH v19 10/11] KVM: arm64: nvhe: Disable branch generation in nVHE guests
2025-02-03 0:43 ` [PATCH v19 10/11] KVM: arm64: nvhe: Disable branch generation in nVHE guests Rob Herring (Arm)
2025-02-03 9:16 ` Anshuman Khandual
2025-02-03 11:28 ` James Clark
@ 2025-02-13 17:03 ` Leo Yan
2025-02-13 23:16 ` Rob Herring
2 siblings, 1 reply; 43+ messages in thread
From: Leo Yan @ 2025-02-13 17:03 UTC (permalink / raw)
To: Rob Herring (Arm)
Cc: Will Deacon, Mark Rutland, Catalin Marinas, Jonathan Corbet,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, James Clark, Anshuman Khandual, linux-arm-kernel,
linux-perf-users, linux-kernel, linux-doc, kvmarm
On Sun, Feb 02, 2025 at 06:43:04PM -0600, Rob Herring (Arm) wrote:
[...]
> +static void __debug_save_brbe(u64 *brbcr_el1)
> +{
> + *brbcr_el1 = 0;
> +
> + /* Check if the BRBE is enabled */
> + if (!(read_sysreg_el1(SYS_BRBCR) & (BRBCR_ELx_E0BRE | BRBCR_ELx_ExBRE)))
> + return;
> +
> + /*
> + * Prohibit branch record generation while we are in guest.
> + * Since access to BRBCR_EL1 is trapped, the guest can't
> + * modify the filtering set by the host.
> + */
> + *brbcr_el1 = read_sysreg_el1(SYS_BRBCR);
> + write_sysreg_el1(0, SYS_BRBCR);
> +}
Should we flush the branch records and use isb() before exiting the host kernel?
I see an inconsistency between the function above and BRBE's disable
function. Here it clears the E0BRE / ExBRE bits to disable BRBE, but the
BRBE driver sets the PAUSED bit in BRBFCR_EL1 to disable BRBE.
Thanks,
Leo
* Re: [PATCH v19 11/11] perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE)
2025-02-13 16:16 ` Leo Yan
@ 2025-02-13 17:13 ` Rob Herring
2025-02-13 17:45 ` Leo Yan
0 siblings, 1 reply; 43+ messages in thread
From: Rob Herring @ 2025-02-13 17:13 UTC (permalink / raw)
To: Leo Yan
Cc: Will Deacon, Mark Rutland, Catalin Marinas, Jonathan Corbet,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, James Clark, Anshuman Khandual, linux-arm-kernel,
linux-perf-users, linux-kernel, linux-doc, kvmarm
On Thu, Feb 13, 2025 at 10:16 AM Leo Yan <leo.yan@arm.com> wrote:
>
> On Sun, Feb 02, 2025 at 06:43:05PM -0600, Rob Herring (Arm) wrote:
>
> [...]
>
> > +void brbe_enable(const struct arm_pmu *arm_pmu)
> > +{
> > + struct pmu_hw_events *cpuc = this_cpu_ptr(arm_pmu->hw_events);
> > + u64 brbfcr = 0, brbcr = 0;
> > +
> > + /*
> > + * Merge the permitted branch filters of all events.
> > + */
> > + for (int i = 0; i < ARMPMU_MAX_HWEVENTS; i++) {
> > + struct perf_event *event = cpuc->events[i];
> > +
> > + if (event && has_branch_stack(event)) {
> > + brbfcr |= event->hw.branch_reg.config;
> > + brbcr |= event->hw.extra_reg.config;
> > + }
> > + }
> > +
> > + /*
> > + * If the record buffer contains any branches, we've already read them
> > + * out and don't want to read them again.
> > + * No need to sync as we're already stopped.
> > + */
> > + brbe_invalidate_nosync();
> > + isb(); // Make sure invalidate takes effect before enabling
> > +
> > + /*
> > + * In VHE mode with MDCR_EL2.HPMN set to PMCR_EL0.N, the counters are
> > + * controlled by BRBCR_EL1 rather than BRBCR_EL2 (which writes to
> > + * BRBCR_EL1 are redirected to). Use the same value for both register
> > + * except keep EL1 and EL0 recording disabled in guests.
> > + */
> > + if (is_kernel_in_hyp_mode())
> > + write_sysreg_s(brbcr & ~(BRBCR_ELx_ExBRE | BRBCR_ELx_E0BRE), SYS_BRBCR_EL12);
> > + write_sysreg_s(brbcr, SYS_BRBCR_EL1);
> > + isb(); // Ensure BRBCR_ELx settings take effect before unpausing
> > +
> > + write_sysreg_s(brbfcr, SYS_BRBFCR_EL1);
>
> Seems to me, it is weird that first enable recording (BRBCR), then set
> control register BRBFCR. And the writing SYS_BRBFCR_EL1 not guarded
> by a barrier is also a bit concerned.
We are always disabled (paused) when we enter brbe_enable(). So the
last thing we do is unpause. The only ordering we care about after
writing SYS_BRBFCR_EL1 is writing PMCR which has an isb before it is
written.
> > +}
> > +
> > +void brbe_disable(void)
> > +{
> > + /*
> > + * No need for synchronization here as synchronization in PMCR write
> > + * ensures ordering and in the interrupt handler this is a NOP as
> > + * we're already paused.
> > + */
> > + write_sysreg_s(BRBFCR_EL1_PAUSED, SYS_BRBFCR_EL1);
>
> Maybe the Arm ARM causes the confusion for the description of the
> PAUSED bit, I read it as this bit is a status bit to indicate
> branch recording is paused.
I agree, but I tested that writing it sets the bit (on FVP). Rule
RSRJND says s/w clears the bit to unpause, so it is definitely
writeable. While it doesn't say anything explicitly about s/w setting
the bit, there is no definition in the Arm ARM of a 'write 0 to clear'
only bit while there are W1C and W1S definitions.
> > +}
> > +
> > +static const int brbe_type_to_perf_type_map[BRBINFx_EL1_TYPE_DEBUG_EXIT + 1][2] = {
> > + [BRBINFx_EL1_TYPE_DIRECT_UNCOND] = { PERF_BR_UNCOND, 0 },
> > + [BRBINFx_EL1_TYPE_INDIRECT] = { PERF_BR_IND, 0 },
> > + [BRBINFx_EL1_TYPE_DIRECT_LINK] = { PERF_BR_CALL, 0 },
> > + [BRBINFx_EL1_TYPE_INDIRECT_LINK] = { PERF_BR_IND_CALL, 0 },
> > + [BRBINFx_EL1_TYPE_RET] = { PERF_BR_RET, 0 },
> > + [BRBINFx_EL1_TYPE_DIRECT_COND] = { PERF_BR_COND, 0 },
> > + [BRBINFx_EL1_TYPE_CALL] = { PERF_BR_CALL, 0 },
> > + [BRBINFx_EL1_TYPE_ERET] = { PERF_BR_ERET, 0 },
> > + [BRBINFx_EL1_TYPE_IRQ] = { PERF_BR_IRQ, 0 },
>
> I saw this table cannot reflect the complete branch type. We might
> need to consider to extend the perf branch flags later.
>
> If the 'new_type' is always zero, it is not necessary to maintain a
> array with two items (the second one is always 0).
I'm adding the new_type's back in the next version.
>
> > +};
> > +
> > +static void brbe_set_perf_entry_type(struct perf_branch_entry *entry, u64 brbinf)
> > +{
> > + int brbe_type = brbinf_get_type(brbinf);
> > +
> > + if (brbe_type <= BRBINFx_EL1_TYPE_DEBUG_EXIT) {
> > + const int *br_type = brbe_type_to_perf_type_map[brbe_type];
> > +
> > + entry->type = br_type[0];
> > + entry->new_type = br_type[1];
> > + }
> > +}
> > +
> > +static int brbinf_get_perf_priv(u64 brbinf)
> > +{
> > + int brbe_el = brbinf_get_el(brbinf);
> > +
> > + switch (brbe_el) {
> > + case BRBINFx_EL1_EL_EL0:
> > + return PERF_BR_PRIV_USER;
> > + case BRBINFx_EL1_EL_EL1:
> > + return PERF_BR_PRIV_KERNEL;
> > + case BRBINFx_EL1_EL_EL2:
> > + if (is_kernel_in_hyp_mode())
> > + return PERF_BR_PRIV_KERNEL;
> > + return PERF_BR_PRIV_HV;
> > + default:
> > + pr_warn_once("%d - unknown branch privilege captured\n", brbe_el);
> > + return PERF_BR_PRIV_UNKNOWN;
> > + }
> > +}
> > +
> > +static void capture_brbe_flags(struct perf_branch_entry *entry,
> > + const struct perf_event *event,
> > + u64 brbinf)
> > +{
> > + brbe_set_perf_entry_type(entry, brbinf);
> > +
> > + if (!branch_sample_no_cycles(event))
> > + entry->cycles = brbinf_get_cycles(brbinf);
> > +
> > + if (!branch_sample_no_flags(event)) {
> > + /* Mispredict info is available for source only and complete branch records. */
> > + if (!brbe_record_is_target_only(brbinf)) {
> > + entry->mispred = brbinf_get_mispredict(brbinf);
> > + entry->predicted = !entry->mispred;
> > + }
> > +
> > + /*
> > + * Currently TME feature is neither implemented in any hardware
> > + * nor it is being supported in the kernel. Just warn here once
> > + * if TME related information shows up rather unexpectedly.
> > + */
> > + if (brbinf_get_lastfailed(brbinf) || brbinf_get_in_tx(brbinf))
> > + pr_warn_once("Unknown transaction states\n");
>
> If the branch is in transaction, we can set:
>
> entry->in_tx = 1;
We actively don't want to support the feature. The comment there is
from Mark's feedback on a prior version.
>
> > + }
> > +
> > + /*
> > + * Branch privilege level is available for target only and complete
> > + * branch records.
> > + */
> > + if (!brbe_record_is_source_only(brbinf))
> > + entry->priv = brbinf_get_perf_priv(brbinf);
>
> This logic is not quite right. In theory, if we check with above
> condition (!brbe_record_is_source_only(brbinf)), it might be the
> case both source and target are not valid.
We never get here if the record is not valid. A valid record must have
at least 1 address valid.
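To spell out why the two checks are equivalent once invalid records are
skipped, here is a stand-alone sketch. The numeric VALID encodings and the
brbe_has_*() helpers are assumptions for illustration, not taken from the
patch.

```c
#include <stdbool.h>

/* Assumed encodings of BRBINF<n>_EL1.VALID, for illustration only. */
enum brbe_valid {
	BRBE_VALID_NONE   = 0,	/* record not valid */
	BRBE_VALID_TARGET = 1,	/* only the target address is valid */
	BRBE_VALID_SOURCE = 2,	/* only the source address is valid */
	BRBE_VALID_FULL   = 3,	/* both addresses are valid */
};

/* Mispredict info is available for source-only and complete records. */
static bool brbe_has_mispred(enum brbe_valid v)
{
	return v == BRBE_VALID_SOURCE || v == BRBE_VALID_FULL;
}

/* Privilege level is available for target-only and complete records. */
static bool brbe_has_priv(enum brbe_valid v)
{
	return v == BRBE_VALID_TARGET || v == BRBE_VALID_FULL;
}
```

For the three valid states, brbe_has_priv(v) gives the same answer as
(v != BRBE_VALID_SOURCE). The two only differ for BRBE_VALID_NONE, which is
the case that is filtered out before the flags are captured.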
I could merge capture_brbe_flags() into perf_entry_from_brbe_regset().
There's not much reason to have a separate function.
Rob
* Re: [PATCH v19 11/11] perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE)
2025-02-13 17:13 ` Rob Herring
@ 2025-02-13 17:45 ` Leo Yan
0 siblings, 0 replies; 43+ messages in thread
From: Leo Yan @ 2025-02-13 17:45 UTC (permalink / raw)
To: Rob Herring
Cc: Will Deacon, Mark Rutland, Catalin Marinas, Jonathan Corbet,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, James Clark, Anshuman Khandual, linux-arm-kernel,
linux-perf-users, linux-kernel, linux-doc, kvmarm
On Thu, Feb 13, 2025 at 11:13:49AM -0600, Rob Herring wrote:
[...]
> > > +void brbe_enable(const struct arm_pmu *arm_pmu)
> > > +{
> > > + struct pmu_hw_events *cpuc = this_cpu_ptr(arm_pmu->hw_events);
> > > + u64 brbfcr = 0, brbcr = 0;
> > > +
> > > + /*
> > > + * Merge the permitted branch filters of all events.
> > > + */
> > > + for (int i = 0; i < ARMPMU_MAX_HWEVENTS; i++) {
> > > + struct perf_event *event = cpuc->events[i];
> > > +
> > > + if (event && has_branch_stack(event)) {
> > > + brbfcr |= event->hw.branch_reg.config;
> > > + brbcr |= event->hw.extra_reg.config;
> > > + }
> > > + }
> > > +
> > > + /*
> > > + * If the record buffer contains any branches, we've already read them
> > > + * out and don't want to read them again.
> > > + * No need to sync as we're already stopped.
> > > + */
> > > + brbe_invalidate_nosync();
> > > + isb(); // Make sure invalidate takes effect before enabling
> > > +
> > > + /*
> > > + * In VHE mode with MDCR_EL2.HPMN set to PMCR_EL0.N, the counters are
> > > + * controlled by BRBCR_EL1 rather than BRBCR_EL2 (which writes to
> > > + * BRBCR_EL1 are redirected to). Use the same value for both register
> > > + * except keep EL1 and EL0 recording disabled in guests.
> > > + */
> > > + if (is_kernel_in_hyp_mode())
> > > + write_sysreg_s(brbcr & ~(BRBCR_ELx_ExBRE | BRBCR_ELx_E0BRE), SYS_BRBCR_EL12);
> > > + write_sysreg_s(brbcr, SYS_BRBCR_EL1);
> > > + isb(); // Ensure BRBCR_ELx settings take effect before unpausing
> > > +
> > > + write_sysreg_s(brbfcr, SYS_BRBFCR_EL1);
> >
> > Seems to me, it is weird that first enable recording (BRBCR), then set
> > control register BRBFCR. And the writing SYS_BRBFCR_EL1 not guarded
> > by a barrier is also a bit concerned.
>
> We are always disabled (paused) when we enter brbe_enable(). So the
> last thing we do is unpause. The only ordering we care about after
> writing SYS_BRBFCR_EL1 is writing PMCR which has an isb before it is
> written.
Maybe it would be good to add a comment recording this.
Thanks,
Leo
* Re: [PATCH v19 10/11] KVM: arm64: nvhe: Disable branch generation in nVHE guests
2025-02-13 17:03 ` Leo Yan
@ 2025-02-13 23:16 ` Rob Herring
2025-02-14 9:55 ` Leo Yan
0 siblings, 1 reply; 43+ messages in thread
From: Rob Herring @ 2025-02-13 23:16 UTC (permalink / raw)
To: Leo Yan
Cc: Will Deacon, Mark Rutland, Catalin Marinas, Jonathan Corbet,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, James Clark, Anshuman Khandual, linux-arm-kernel,
linux-perf-users, linux-kernel, linux-doc, kvmarm
On Thu, Feb 13, 2025 at 11:03 AM Leo Yan <leo.yan@arm.com> wrote:
>
> On Sun, Feb 02, 2025 at 06:43:04PM -0600, Rob Herring (Arm) wrote:
>
> [...]
>
> > +static void __debug_save_brbe(u64 *brbcr_el1)
> > +{
> > + *brbcr_el1 = 0;
> > +
> > + /* Check if the BRBE is enabled */
> > + if (!(read_sysreg_el1(SYS_BRBCR) & (BRBCR_ELx_E0BRE | BRBCR_ELx_ExBRE)))
> > + return;
> > +
> > + /*
> > + * Prohibit branch record generation while we are in guest.
> > + * Since access to BRBCR_EL1 is trapped, the guest can't
> > + * modify the filtering set by the host.
> > + */
> > + *brbcr_el1 = read_sysreg_el1(SYS_BRBCR);
> > + write_sysreg_el1(0, SYS_BRBCR);
> > +}
>
> Should flush branch record and use isb() before exit host kernel?
I don't think so. The isb()'s in the other cases appear to be related
to ordering WRT memory buffers. BRBE is just registers. I would assume
that there's some barrier before we switch to the guest.
> I see inconsistence between the function above and BRBE's disable
> function. Here it clears E0BRE / ExBRE bits for disabling BRBE, but the
> BRBE driver sets the PAUSED bit in BRBFCR_EL1 for disabling BRBE.
Indeed. This works, but the enabled check won't work. I'm going to add
clearing BRBCR to brbe_disable(), and this part will stay the same.
Rob
* Re: [PATCH v19 10/11] KVM: arm64: nvhe: Disable branch generation in nVHE guests
2025-02-13 23:16 ` Rob Herring
@ 2025-02-14 9:55 ` Leo Yan
2025-02-18 14:17 ` Rob Herring
0 siblings, 1 reply; 43+ messages in thread
From: Leo Yan @ 2025-02-14 9:55 UTC (permalink / raw)
To: Rob Herring
Cc: Will Deacon, Mark Rutland, Catalin Marinas, Jonathan Corbet,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, James Clark, Anshuman Khandual, linux-arm-kernel,
linux-perf-users, linux-kernel, linux-doc, kvmarm
On Thu, Feb 13, 2025 at 05:16:45PM -0600, Rob Herring wrote:
[...]
> > > +static void __debug_save_brbe(u64 *brbcr_el1)
> > > +{
> > > + *brbcr_el1 = 0;
> > > +
> > > + /* Check if the BRBE is enabled */
> > > + if (!(read_sysreg_el1(SYS_BRBCR) & (BRBCR_ELx_E0BRE | BRBCR_ELx_ExBRE)))
> > > + return;
> > > +
> > > + /*
> > > + * Prohibit branch record generation while we are in guest.
> > > + * Since access to BRBCR_EL1 is trapped, the guest can't
> > > + * modify the filtering set by the host.
> > > + */
> > > + *brbcr_el1 = read_sysreg_el1(SYS_BRBCR);
> > > + write_sysreg_el1(0, SYS_BRBCR);
> > > +}
> >
> > Should flush branch record and use isb() before exit host kernel?
>
> I don't think so. The isb()'s in the other cases appear to be related
> to ordering WRT memory buffers. BRBE is just registers. I would assume
> that there's some barrier before we switch to the guest.
Given BRBCR is a system register, my understanding is that a following ISB
ensures the write to BRBCR has finished and taken effect. As a result,
it is guaranteed that branch recording has stopped.
However, an isb() does not necessarily mean the branch records have
been flushed to the buffer. The purpose here is just to stop recording.
The BRBE driver will take care of the flush issue when it reads records.
I agree that barriers in the subsequent switch flow likely ensure that
the write to BRBCR takes effect. It might be good to add a comment for
easier maintenance.
> > I see inconsistence between the function above and BRBE's disable
> > function. Here it clears E0BRE / ExBRE bits for disabling BRBE, but the
> > BRBE driver sets the PAUSED bit in BRBFCR_EL1 for disabling BRBE.
>
> Indeed. This works, but the enabled check won't work. I'm going to add
> clearing BRBCR to brbe_disable(), and this part will stay the same.

It seems to me the right logic would be:

- In the BRBE driver, brbe_disable() should clear the E0BRE and ExBRE
  bits in BRBCR. This makes sure BRBE is totally disabled when a perf
  session is terminated.

- For a KVM context switch, it is good to use the PAUSED bit. If branch
  recording is enabled on the host, this is a lightweight way to
  temporarily pause branch recording for the switched-in VM.

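
A minimal sketch of that split, written as a userspace simulation rather
than kernel code. The two globals stand in for BRBCR_EL1 and BRBFCR_EL1,
the bit positions are my reading of the sysreg definitions and should be
treated as illustrative, and the sketch_*() names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit positions (assumption, check against the sysreg file). */
#define BRBCR_ELx_E0BRE		(UINT64_C(1) << 0)
#define BRBCR_ELx_ExBRE		(UINT64_C(1) << 1)
#define BRBFCR_EL1_PAUSED	(UINT64_C(1) << 7)

/* Simulated registers, stand-ins for BRBCR_EL1 and BRBFCR_EL1. */
static uint64_t sim_brbcr;
static uint64_t sim_brbfcr;

/* Driver path: fully disable BRBE when the perf session ends. */
static void sketch_brbe_disable(void)
{
	sim_brbcr &= ~(BRBCR_ELx_E0BRE | BRBCR_ELx_ExBRE);
}

/*
 * KVM path: pause recording on guest entry if the host has it enabled.
 * Returns the previous BRBFCR value so it can be put back on exit.
 */
static uint64_t sketch_kvm_pause(void)
{
	uint64_t prev = sim_brbfcr;

	if (sim_brbcr & (BRBCR_ELx_E0BRE | BRBCR_ELx_ExBRE))
		sim_brbfcr = prev | BRBFCR_EL1_PAUSED;
	return prev;
}

/* KVM path: restore whatever pause state the host had before entry. */
static void sketch_kvm_resume(uint64_t prev)
{
	sim_brbfcr = prev;
}

/* Usage: one guest round trip followed by session teardown. */
static int sketch_usage(void)
{
	uint64_t prev;

	sim_brbcr = BRBCR_ELx_E0BRE | BRBCR_ELx_ExBRE;
	sim_brbfcr = 0;

	prev = sketch_kvm_pause();
	assert(sim_brbfcr & BRBFCR_EL1_PAUSED);	/* paused in guest */
	sketch_kvm_resume(prev);
	assert(!(sim_brbfcr & BRBFCR_EL1_PAUSED));	/* running again */

	sketch_brbe_disable();
	assert(!(sim_brbcr & (BRBCR_ELx_E0BRE | BRBCR_ELx_ExBRE)));
	return 0;
}
```

Note the pause path still has to look at BRBCR to know whether the host
is recording at all, so it is a read plus a write either way.
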
Thanks,
Leo
* Re: [PATCH v19 10/11] KVM: arm64: nvhe: Disable branch generation in nVHE guests
2025-02-14 9:55 ` Leo Yan
@ 2025-02-18 14:17 ` Rob Herring
0 siblings, 0 replies; 43+ messages in thread
From: Rob Herring @ 2025-02-18 14:17 UTC (permalink / raw)
To: Leo Yan
Cc: Will Deacon, Mark Rutland, Catalin Marinas, Jonathan Corbet,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, James Clark, Anshuman Khandual, linux-arm-kernel,
linux-perf-users, linux-kernel, linux-doc, kvmarm
On Fri, Feb 14, 2025 at 3:55 AM Leo Yan <leo.yan@arm.com> wrote:
>
> On Thu, Feb 13, 2025 at 05:16:45PM -0600, Rob Herring wrote:
>
> [...]
>
> > > > +static void __debug_save_brbe(u64 *brbcr_el1)
> > > > +{
> > > > + *brbcr_el1 = 0;
> > > > +
> > > > + /* Check if the BRBE is enabled */
> > > > + if (!(read_sysreg_el1(SYS_BRBCR) & (BRBCR_ELx_E0BRE | BRBCR_ELx_ExBRE)))
> > > > + return;
> > > > +
> > > > + /*
> > > > + * Prohibit branch record generation while we are in guest.
> > > > + * Since access to BRBCR_EL1 is trapped, the guest can't
> > > > + * modify the filtering set by the host.
> > > > + */
> > > > + *brbcr_el1 = read_sysreg_el1(SYS_BRBCR);
> > > > + write_sysreg_el1(0, SYS_BRBCR);
> > > > +}
> > >
> > > Should flush branch record and use isb() before exit host kernel?
> >
> > I don't think so. The isb()'s in the other cases appear to be related
> > to ordering WRT memory buffers. BRBE is just registers. I would assume
> > that there's some barrier before we switch to the guest.
>
> Given that BRBCR is a system register, my understanding is that a
> following ISB ensures the write to BRBCR has completed and taken effect.
> As a result, branch recording is guaranteed to have stopped.
>
> However, an isb() does not necessarily mean the branch records have been
> flushed to the buffer. The purpose here is just to stop recording. The
> BRBE driver will take care of flushing when it reads the records.
>
> I agree that the barriers in the switch flow that follows will likely
> ensure the write to BRBCR takes effect. It might be good to add a
> comment for easier maintenance.
>
> > > I see an inconsistency between the function above and BRBE's disable
> > > function. Here it clears E0BRE / ExBRE bits for disabling BRBE, but the
> > > BRBE driver sets the PAUSED bit in BRBFCR_EL1 for disabling BRBE.
> >
> > Indeed. This works, but the enabled check won't work. I'm going to add
> > clearing BRBCR to brbe_disable(), and this part will stay the same.
>
> It seems to me the right logic would be:
>
> - In the BRBE driver, brbe_disable() should clear the E0BRE and ExBRE
>   bits in BRBCR. This makes sure BRBE is totally disabled when a perf
>   session is terminated.
>
> - For a KVM context switch, it is good to use the PAUSED bit. If branch
>   recording is enabled on the host, this is a lightweight way to
>   temporarily pause branch recording for the switched-in VM.

We have to read BRBCR to see if it is enabled, as PAUSED is unknown out
of reset and the driver may not exist to initialize it. Either way, it
is a register read and a write, so the overhead is the same for both.

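
A rough userspace sketch of that reasoning, not the actual hyp code. The
save side mirrors __debug_save_brbe() quoted above; the restore side and
all sketch_*()/sim_* names are hypothetical, the bit positions are
illustrative, and it assumes the BRBCR enable bits are clear whenever
the host is not using BRBE.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit positions (assumption, check against the sysreg file). */
#define BRBCR_ELx_E0BRE		(UINT64_C(1) << 0)
#define BRBCR_ELx_ExBRE		(UINT64_C(1) << 1)
#define BRBFCR_EL1_PAUSED	(UINT64_C(1) << 7)

/* Simulated registers, stand-ins for BRBCR_EL1 and BRBFCR_EL1. */
static uint64_t sim_brbcr;
static uint64_t sim_brbfcr;

/* Mirrors the quoted save path: one read to decide, one write to stop. */
static void sketch_debug_save_brbe(uint64_t *saved)
{
	*saved = 0;

	/* The decision looks only at BRBCR, never at PAUSED. */
	if (!(sim_brbcr & (BRBCR_ELx_E0BRE | BRBCR_ELx_ExBRE)))
		return;

	*saved = sim_brbcr;
	sim_brbcr = 0;
}

/* Hypothetical counterpart run on guest exit. */
static void sketch_debug_restore_brbe(uint64_t saved)
{
	if (saved)
		sim_brbcr = saved;
}

/*
 * One guest round trip starting from the given register state.
 * Returns 1 if recording had to be stopped for the guest, else 0.
 */
static int sketch_guest_roundtrip(uint64_t brbcr, uint64_t brbfcr)
{
	uint64_t saved;

	sim_brbcr = brbcr;
	sim_brbfcr = brbfcr;

	sketch_debug_save_brbe(&saved);
	/* No records are generated while in the guest. */
	assert(!(sim_brbcr & (BRBCR_ELx_E0BRE | BRBCR_ELx_ExBRE)));

	sketch_debug_restore_brbe(saved);
	/* Host state is back exactly as it was, PAUSED untouched. */
	assert(sim_brbcr == brbcr);
	assert(sim_brbfcr == brbfcr);

	return saved != 0;
}
```

Calling sketch_guest_roundtrip() with PAUSED set or clear gives the same
answer for a given BRBCR value, which is the property we need when
PAUSED may never have been initialized.
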
Rob
Thread overview: 43+ messages
2025-02-03 0:42 [PATCH v19 00/11] arm64/perf: Enable branch stack sampling Rob Herring (Arm)
2025-02-03 0:42 ` [PATCH v19 01/11] perf: arm_pmuv3: Call kvm_vcpu_pmu_resync_el0() before enabling counters Rob Herring (Arm)
2025-02-03 4:07 ` Anshuman Khandual
2025-02-03 0:42 ` [PATCH v19 02/11] perf: arm_pmu: Don't disable counter in armpmu_add() Rob Herring (Arm)
2025-02-03 6:04 ` Anshuman Khandual
2025-02-03 0:42 ` [PATCH v19 03/11] perf: arm_pmuv3: Don't disable counter in armv8pmu_enable_event() Rob Herring (Arm)
2025-02-03 6:38 ` Anshuman Khandual
2025-02-03 0:42 ` [PATCH v19 04/11] perf: arm_v7_pmu: Drop obvious comments for enabling/disabling counters and interrupts Rob Herring (Arm)
2025-02-03 4:09 ` Anshuman Khandual
2025-02-03 0:42 ` [PATCH v19 05/11] perf: arm_v7_pmu: Don't disable counter in (armv7|krait_|scorpion_)pmu_enable_event() Rob Herring (Arm)
2025-02-03 6:54 ` Anshuman Khandual
2025-02-03 0:43 ` [PATCH v19 06/11] perf: apple_m1: Don't disable counter in m1_pmu_enable_event() Rob Herring (Arm)
2025-02-03 8:10 ` Anshuman Khandual
2025-02-03 0:43 ` [PATCH v19 07/11] perf: arm_pmu: Move PMUv3-specific data Rob Herring (Arm)
2025-02-03 8:16 ` Anshuman Khandual
2025-02-03 0:43 ` [PATCH v19 08/11] arm64/sysreg: Add BRBE registers and fields Rob Herring (Arm)
2025-02-03 8:32 ` Anshuman Khandual
2025-02-03 0:43 ` [PATCH v19 09/11] arm64: Handle BRBE booting requirements Rob Herring (Arm)
2025-02-03 8:47 ` Anshuman Khandual
2025-02-12 12:10 ` Leo Yan
2025-02-12 21:21 ` Rob Herring
2025-02-13 12:27 ` Leo Yan
2025-02-03 0:43 ` [PATCH v19 10/11] KVM: arm64: nvhe: Disable branch generation in nVHE guests Rob Herring (Arm)
2025-02-03 9:16 ` Anshuman Khandual
2025-02-03 11:28 ` James Clark
2025-02-13 17:03 ` Leo Yan
2025-02-13 23:16 ` Rob Herring
2025-02-14 9:55 ` Leo Yan
2025-02-18 14:17 ` Rob Herring
2025-02-03 0:43 ` [PATCH v19 11/11] perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE) Rob Herring (Arm)
2025-02-03 16:53 ` James Clark
2025-02-03 17:58 ` Rob Herring
2025-02-04 12:02 ` James Clark
2025-02-04 15:03 ` Rob Herring
2025-02-05 14:38 ` James Clark
2025-02-05 14:51 ` James Clark
2025-02-05 16:15 ` Rob Herring
2025-02-06 12:58 ` James Clark
2025-02-12 18:52 ` Leo Yan
2025-02-12 19:00 ` Leo Yan
2025-02-13 16:16 ` Leo Yan
2025-02-13 17:13 ` Rob Herring
2025-02-13 17:45 ` Leo Yan