* [PATCH v21 0/4] arm64/perf: Enable branch stack sampling
@ 2025-04-07 17:41 Rob Herring (Arm)
2025-04-07 17:41 ` [PATCH v21 1/4] arm64/sysreg: Add BRBE registers and fields Rob Herring (Arm)
` (4 more replies)
0 siblings, 5 replies; 18+ messages in thread
From: Rob Herring (Arm) @ 2025-04-07 17:41 UTC (permalink / raw)
To: Will Deacon, Mark Rutland, Catalin Marinas, Jonathan Corbet,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, James Clark, Anshuman Khandual, Leo Yan
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm, Mark Brown
This series enables perf branch stack sampling support on arm64 via a
v9.2 arch feature called Branch Record Buffer Extension (BRBE). Details
on BRBE can be found in the Arm ARM[1] chapter D18.
I've picked up this series from Anshuman. v19 and v20 versions have been
reworked quite a bit by Mark and myself. The bulk of those changes are
in patch 4.
A git branch is here[2].
[1] https://developer.arm.com/documentation/ddi0487/latest/
[2] git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git arm/brbe-v21
v21:
- Drop clean-up patches 1-7 already applied
- Rebase on v6.15-rc1
v20:
- https://lore.kernel.org/r/20250218-arm-brbe-v19-v20-0-4e9922fc2e8e@kernel.org
- Added back some of the arm64 specific exception types. The x86 IRQ
branches also include other exceptions like page faults. On arm64, we
can distinguish the exception types, so we do. Also, to better
align with x86, we convert 'call' branches which are user to kernel
to 'syscall'.
- Only enable exceptions and exception returns if recording kernel
branches (matching x86)
- Drop requiring event and branch privileges to match
- Add "branches" caps sysfs attribute like x86
- Reword comment about FZP and MDCR_EL2.HPMN interaction
- Rework BRBE invalidation to avoid invalidating in interrupt handler
when no handled events capture the branch stack (i.e. when there are
multiple users).
- Also clear BRBCR_ELx bits in brbe_disable(). This is for KVM nVHE
checks if BRBE is enabled.
- Document that MDCR_EL3.SBRBE can be 0b01 also
v19:
- https://lore.kernel.org/all/20250202-arm-brbe-v19-v19-0-1c1300802385@kernel.org/
- Drop saving of branch records when task scheduled out (Mark). Make
sched_task() callback actually get called. Enabling requires a call
to perf_sched_cb_inc(). So the saving of branch records never
happened.
- Got rid of added armpmu ops. All BRBE support is contained within
pmuv3 code.
- Fix freeze on overflow for VHE
- The cycle counter doesn't freeze BRBE on overflow, so avoid assigning
it when BRBE is enabled.
- Drop all the Arm specific exception branches. Not a clear need for
them.
- Fix handling of branch 'cycles' reading. CC field is
mantissa/exponent, not an integer.
- Rework s/w filtering to better match h/w filtering
- Reject events with disjoint event filter and branch filter or with
exclude_host set
- Dropped perf test patch which has been applied for 6.14
- Dropped patch "KVM: arm64: Explicitly handle BRBE traps as UNDEFINED"
which has been applied for 6.14
v18:
- https://lore.kernel.org/all/20240613061731.3109448-1-anshuman.khandual@arm.com/
For v1-v17, see the above link. Not going to duplicate it all here...
Signed-off-by: "Rob Herring (Arm)" <robh@kernel.org>
---
Anshuman Khandual (4):
arm64/sysreg: Add BRBE registers and fields
arm64: Handle BRBE booting requirements
KVM: arm64: nvhe: Disable branch generation in nVHE guests
perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE)
Documentation/arch/arm64/booting.rst | 21 +
arch/arm64/include/asm/el2_setup.h | 86 +++-
arch/arm64/include/asm/kvm_host.h | 2 +
arch/arm64/include/asm/sysreg.h | 17 +-
arch/arm64/kvm/debug.c | 4 +
arch/arm64/kvm/hyp/nvhe/debug-sr.c | 32 ++
arch/arm64/kvm/hyp/nvhe/switch.c | 2 +-
arch/arm64/tools/sysreg | 132 ++++++
drivers/perf/Kconfig | 11 +
drivers/perf/Makefile | 1 +
drivers/perf/arm_brbe.c | 802 +++++++++++++++++++++++++++++++++++
drivers/perf/arm_brbe.h | 47 ++
drivers/perf/arm_pmu.c | 15 +-
drivers/perf/arm_pmuv3.c | 129 +++++-
include/linux/perf/arm_pmu.h | 8 +
15 files changed, 1287 insertions(+), 22 deletions(-)
---
base-commit: 0af2f6be1b4281385b618cb86ad946eded089ac8
change-id: 20250129-arm-brbe-v19-24d5d9e5e623
Best regards,
--
Rob Herring (Arm) <robh@kernel.org>
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v21 1/4] arm64/sysreg: Add BRBE registers and fields
2025-04-07 17:41 [PATCH v21 0/4] arm64/perf: Enable branch stack sampling Rob Herring (Arm)
@ 2025-04-07 17:41 ` Rob Herring (Arm)
2025-05-19 14:08 ` Will Deacon
2025-04-07 17:41 ` [PATCH v21 2/4] arm64: Handle BRBE booting requirements Rob Herring (Arm)
` (3 subsequent siblings)
4 siblings, 1 reply; 18+ messages in thread
From: Rob Herring (Arm) @ 2025-04-07 17:41 UTC (permalink / raw)
To: Will Deacon, Mark Rutland, Catalin Marinas, Jonathan Corbet,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, James Clark, Anshuman Khandual, Leo Yan
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm, Mark Brown
From: Anshuman Khandual <anshuman.khandual@arm.com>
This patch adds definitions related to the Branch Record Buffer Extension
(BRBE) as per ARM DDI 0487K.a. These will be used by KVM and a BRBE driver
in subsequent patches.
Some existing BRBE definitions in asm/sysreg.h are replaced with equivalent
generated definitions.
Cc: Marc Zyngier <maz@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: "Rob Herring (Arm)" <robh@kernel.org>
Tested-by: James Clark <james.clark@linaro.org>
---
v19:
- split BRBINF.CC field into mantissa and exponent
---
arch/arm64/include/asm/sysreg.h | 17 ++----
arch/arm64/tools/sysreg | 132 ++++++++++++++++++++++++++++++++++++++++
2 files changed, 138 insertions(+), 11 deletions(-)
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 2639d3633073..67e96494749e 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -198,16 +198,8 @@
#define SYS_DBGVCR32_EL2 sys_reg(2, 4, 0, 7, 0)
#define SYS_BRBINF_EL1(n) sys_reg(2, 1, 8, (n & 15), (((n & 16) >> 2) | 0))
-#define SYS_BRBINFINJ_EL1 sys_reg(2, 1, 9, 1, 0)
#define SYS_BRBSRC_EL1(n) sys_reg(2, 1, 8, (n & 15), (((n & 16) >> 2) | 1))
-#define SYS_BRBSRCINJ_EL1 sys_reg(2, 1, 9, 1, 1)
#define SYS_BRBTGT_EL1(n) sys_reg(2, 1, 8, (n & 15), (((n & 16) >> 2) | 2))
-#define SYS_BRBTGTINJ_EL1 sys_reg(2, 1, 9, 1, 2)
-#define SYS_BRBTS_EL1 sys_reg(2, 1, 9, 0, 2)
-
-#define SYS_BRBCR_EL1 sys_reg(2, 1, 9, 0, 0)
-#define SYS_BRBFCR_EL1 sys_reg(2, 1, 9, 0, 1)
-#define SYS_BRBIDR0_EL1 sys_reg(2, 1, 9, 2, 0)
#define SYS_TRCITECR_EL1 sys_reg(3, 0, 1, 2, 3)
#define SYS_TRCACATR(m) sys_reg(2, 1, 2, ((m & 7) << 1), (2 | (m >> 3)))
@@ -273,8 +265,6 @@
/* ETM */
#define SYS_TRCOSLAR sys_reg(2, 1, 1, 0, 4)
-#define SYS_BRBCR_EL2 sys_reg(2, 4, 9, 0, 0)
-
#define SYS_MIDR_EL1 sys_reg(3, 0, 0, 0, 0)
#define SYS_MPIDR_EL1 sys_reg(3, 0, 0, 0, 5)
#define SYS_REVIDR_EL1 sys_reg(3, 0, 0, 0, 6)
@@ -607,7 +597,6 @@
#define SYS_CNTHV_CVAL_EL2 sys_reg(3, 4, 14, 3, 2)
/* VHE encodings for architectural EL0/1 system registers */
-#define SYS_BRBCR_EL12 sys_reg(2, 5, 9, 0, 0)
#define SYS_SCTLR_EL12 sys_reg(3, 5, 1, 0, 0)
#define SYS_CPACR_EL12 sys_reg(3, 5, 1, 0, 2)
#define SYS_SCTLR2_EL12 sys_reg(3, 5, 1, 0, 3)
@@ -818,6 +807,12 @@
#define OP_COSP_RCTX sys_insn(1, 3, 7, 3, 6)
#define OP_CPP_RCTX sys_insn(1, 3, 7, 3, 7)
+/*
+ * BRBE Instructions
+ */
+#define BRB_IALL_INSN __emit_inst(0xd5000000 | OP_BRB_IALL | (0x1f))
+#define BRB_INJ_INSN __emit_inst(0xd5000000 | OP_BRB_INJ | (0x1f))
+
/* Common SCTLR_ELx flags. */
#define SCTLR_ELx_ENTP2 (BIT(60))
#define SCTLR_ELx_DSSBS (BIT(44))
diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
index f9476848a2ed..ec74419dc1ae 100644
--- a/arch/arm64/tools/sysreg
+++ b/arch/arm64/tools/sysreg
@@ -1038,6 +1038,138 @@ UnsignedEnum 3:0 MTEPERM
EndEnum
EndSysreg
+
+SysregFields BRBINFx_EL1
+Res0 63:47
+Field 46 CCU
+Field 45:40 CC_EXP
+Field 39:32 CC_MANT
+Res0 31:18
+Field 17 LASTFAILED
+Field 16 T
+Res0 15:14
+Enum 13:8 TYPE
+ 0b000000 DIRECT_UNCOND
+ 0b000001 INDIRECT
+ 0b000010 DIRECT_LINK
+ 0b000011 INDIRECT_LINK
+ 0b000101 RET
+ 0b000111 ERET
+ 0b001000 DIRECT_COND
+ 0b100001 DEBUG_HALT
+ 0b100010 CALL
+ 0b100011 TRAP
+ 0b100100 SERROR
+ 0b100110 INSN_DEBUG
+ 0b100111 DATA_DEBUG
+ 0b101010 ALIGN_FAULT
+ 0b101011 INSN_FAULT
+ 0b101100 DATA_FAULT
+ 0b101110 IRQ
+ 0b101111 FIQ
+ 0b110000 IMPDEF_TRAP_EL3
+ 0b111001 DEBUG_EXIT
+EndEnum
+Enum 7:6 EL
+ 0b00 EL0
+ 0b01 EL1
+ 0b10 EL2
+ 0b11 EL3
+EndEnum
+Field 5 MPRED
+Res0 4:2
+Enum 1:0 VALID
+ 0b00 NONE
+ 0b01 TARGET
+ 0b10 SOURCE
+ 0b11 FULL
+EndEnum
+EndSysregFields
+
+SysregFields BRBCR_ELx
+Res0 63:24
+Field 23 EXCEPTION
+Field 22 ERTN
+Res0 21:10
+Field 9 FZPSS
+Field 8 FZP
+Res0 7
+Enum 6:5 TS
+ 0b01 VIRTUAL
+ 0b10 GUEST_PHYSICAL
+ 0b11 PHYSICAL
+EndEnum
+Field 4 MPRED
+Field 3 CC
+Res0 2
+Field 1 ExBRE
+Field 0 E0BRE
+EndSysregFields
+
+Sysreg BRBCR_EL1 2 1 9 0 0
+Fields BRBCR_ELx
+EndSysreg
+
+Sysreg BRBFCR_EL1 2 1 9 0 1
+Res0 63:30
+Enum 29:28 BANK
+ 0b00 BANK_0
+ 0b01 BANK_1
+EndEnum
+Res0 27:23
+Field 22 CONDDIR
+Field 21 DIRCALL
+Field 20 INDCALL
+Field 19 RTN
+Field 18 INDIRECT
+Field 17 DIRECT
+Field 16 EnI
+Res0 15:8
+Field 7 PAUSED
+Field 6 LASTFAILED
+Res0 5:0
+EndSysreg
+
+Sysreg BRBTS_EL1 2 1 9 0 2
+Field 63:0 TS
+EndSysreg
+
+Sysreg BRBINFINJ_EL1 2 1 9 1 0
+Fields BRBINFx_EL1
+EndSysreg
+
+Sysreg BRBSRCINJ_EL1 2 1 9 1 1
+Field 63:0 ADDRESS
+EndSysreg
+
+Sysreg BRBTGTINJ_EL1 2 1 9 1 2
+Field 63:0 ADDRESS
+EndSysreg
+
+Sysreg BRBIDR0_EL1 2 1 9 2 0
+Res0 63:16
+Enum 15:12 CC
+ 0b0101 20_BIT
+EndEnum
+Enum 11:8 FORMAT
+ 0b0000 FORMAT_0
+EndEnum
+Enum 7:0 NUMREC
+ 0b00001000 8
+ 0b00010000 16
+ 0b00100000 32
+ 0b01000000 64
+EndEnum
+EndSysreg
+
+Sysreg BRBCR_EL2 2 4 9 0 0
+Fields BRBCR_ELx
+EndSysreg
+
+Sysreg BRBCR_EL12 2 5 9 0 0
+Fields BRBCR_ELx
+EndSysreg
+
Sysreg ID_AA64ZFR0_EL1 3 0 0 4 4
Res0 63:60
UnsignedEnum 59:56 F64MM
--
2.47.2
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v21 2/4] arm64: Handle BRBE booting requirements
2025-04-07 17:41 [PATCH v21 0/4] arm64/perf: Enable branch stack sampling Rob Herring (Arm)
2025-04-07 17:41 ` [PATCH v21 1/4] arm64/sysreg: Add BRBE registers and fields Rob Herring (Arm)
@ 2025-04-07 17:41 ` Rob Herring (Arm)
2025-05-19 14:07 ` Will Deacon
2025-04-07 17:41 ` [PATCH v21 3/4] KVM: arm64: nvhe: Disable branch generation in nVHE guests Rob Herring (Arm)
` (2 subsequent siblings)
4 siblings, 1 reply; 18+ messages in thread
From: Rob Herring (Arm) @ 2025-04-07 17:41 UTC (permalink / raw)
To: Will Deacon, Mark Rutland, Catalin Marinas, Jonathan Corbet,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, James Clark, Anshuman Khandual, Leo Yan
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm
From: Anshuman Khandual <anshuman.khandual@arm.com>
To use the Branch Record Buffer Extension (BRBE), some configuration is
necessary at EL3 and EL2. This patch documents the requirements and adds
the initial EL2 setup code, which largely consists of configuring the
fine-grained traps and initializing a couple of BRBE control registers.
Before this patch, __init_el2_fgt() would initialize HDFGRTR_EL2 and
HDFGWTR_EL2 with the same value, relying on the read/write trap controls
for a register occupying the same bit position in either register. The
'nBRBIDR' trap control only exists in bit 59 of HDFGRTR_EL2, while bit
59 of HDFGWTR_EL2 is RES0, and so this assumption no longer holds.
To handle HDFGRTR_EL2 and HDFGWTR_EL2 having (slightly) different bit
layouts, __init_el2_fgt() is changed to accumulate the HDFGRTR_EL2 and
HDFGWTR_EL2 control bits separately. While making this change the
open-coded value (1 << 62) is replaced with
HDFG{R,W}TR_EL2_nPMSNEVFR_EL1_MASK.
The BRBCR_EL1 and BRBCR_EL2 registers are unusual and require special
initialisation: even though they are subject to E2H renaming, both have
an effect regardless of HCR_EL2.TGE, even when running at EL2, and
consequently both need to be initialised. This is handled in
__init_el2_brbe() with a comment to explain the situation.
Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
[Mark: rewrite commit message, fix typo in comment]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: "Rob Herring (Arm)" <robh@kernel.org>
Tested-by: James Clark <james.clark@linaro.org>
---
v20:
- Document that MDCR_EL3.SBRBE can be 0b01 also
- Fix "HDFGWTR_EL2 is RES0" in commit msg
---
Documentation/arch/arm64/booting.rst | 21 +++++++++
arch/arm64/include/asm/el2_setup.h | 86 ++++++++++++++++++++++++++++++++++--
2 files changed, 104 insertions(+), 3 deletions(-)
diff --git a/Documentation/arch/arm64/booting.rst b/Documentation/arch/arm64/booting.rst
index dee7b6de864f..a627c1e0e4a0 100644
--- a/Documentation/arch/arm64/booting.rst
+++ b/Documentation/arch/arm64/booting.rst
@@ -358,6 +358,27 @@ Before jumping into the kernel, the following conditions must be met:
- HWFGWTR_EL2.nSMPRI_EL1 (bit 54) must be initialised to 0b01.
+ For CPUs with feature Branch Record Buffer Extension (FEAT_BRBE):
+
+ - If EL3 is present:
+
+ - MDCR_EL3.SBRBE (bits 33:32) must be initialised to 0b01 or 0b11.
+
+ - If the kernel is entered at EL1 and EL2 is present:
+
+ - BRBCR_EL2.CC (bit 3) must be initialised to 0b1.
+ - BRBCR_EL2.MPRED (bit 4) must be initialised to 0b1.
+
+ - HDFGRTR_EL2.nBRBDATA (bit 61) must be initialised to 0b1.
+ - HDFGRTR_EL2.nBRBCTL (bit 60) must be initialised to 0b1.
+ - HDFGRTR_EL2.nBRBIDR (bit 59) must be initialised to 0b1.
+
+ - HDFGWTR_EL2.nBRBDATA (bit 61) must be initialised to 0b1.
+ - HDFGWTR_EL2.nBRBCTL (bit 60) must be initialised to 0b1.
+
+ - HFGITR_EL2.nBRBIALL (bit 56) must be initialised to 0b1.
+ - HFGITR_EL2.nBRBINJ (bit 55) must be initialised to 0b1.
+
For CPUs with the Scalable Matrix Extension FA64 feature (FEAT_SME_FA64):
- If EL3 is present:
diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h
index ebceaae3c749..e7fcba1e7d8e 100644
--- a/arch/arm64/include/asm/el2_setup.h
+++ b/arch/arm64/include/asm/el2_setup.h
@@ -189,6 +189,39 @@
.Lskip_set_cptr_\@:
.endm
+/*
+ * Configure BRBE to permit recording cycle counts and branch mispredicts.
+ *
+ * At any EL, to record cycle counts BRBE requires that both BRBCR_EL2.CC=1 and
+ * BRBCR_EL1.CC=1.
+ *
+ * At any EL, to record branch mispredicts BRBE requires that both
+ * BRBCR_EL2.MPRED=1 and BRBCR_EL1.MPRED=1.
+ *
+ * When HCR_EL2.E2H=1, the BRBCR_EL1 encoding is redirected to BRBCR_EL2, but
+ * the {CC,MPRED} bits in the real BRBCR_EL1 register still apply.
+ *
+ * Set {CC,MPRED} in both BRBCR_EL2 and BRBCR_EL1 so that at runtime we only
+ * need to enable/disable these in BRBCR_EL1 regardless of whether the kernel
+ * ends up executing in EL1 or EL2.
+ */
+.macro __init_el2_brbe
+ mrs x1, id_aa64dfr0_el1
+ ubfx x1, x1, #ID_AA64DFR0_EL1_BRBE_SHIFT, #4
+ cbz x1, .Lskip_brbe_\@
+
+ mov_q x0, BRBCR_ELx_CC | BRBCR_ELx_MPRED
+ msr_s SYS_BRBCR_EL2, x0
+
+ __check_hvhe .Lset_brbe_nvhe_\@, x1
+ msr_s SYS_BRBCR_EL12, x0 // VHE
+ b .Lskip_brbe_\@
+
+.Lset_brbe_nvhe_\@:
+ msr_s SYS_BRBCR_EL1, x0 // NVHE
+.Lskip_brbe_\@:
+.endm
+
/* Disable any fine grained traps */
.macro __init_el2_fgt
mrs x1, id_aa64mmfr0_el1
@@ -196,16 +229,48 @@
cbz x1, .Lskip_fgt_\@
mov x0, xzr
+ mov x2, xzr
mrs x1, id_aa64dfr0_el1
ubfx x1, x1, #ID_AA64DFR0_EL1_PMSVer_SHIFT, #4
cmp x1, #3
b.lt .Lskip_spe_fgt_\@
+
/* Disable PMSNEVFR_EL1 read and write traps */
- orr x0, x0, #(1 << 62)
+ orr x0, x0, #HDFGRTR_EL2_nPMSNEVFR_EL1_MASK
+ orr x2, x2, #HDFGWTR_EL2_nPMSNEVFR_EL1_MASK
.Lskip_spe_fgt_\@:
+#ifdef CONFIG_ARM64_BRBE
+ mrs x1, id_aa64dfr0_el1
+ ubfx x1, x1, #ID_AA64DFR0_EL1_BRBE_SHIFT, #4
+ cbz x1, .Lskip_brbe_reg_fgt_\@
+
+ /*
+ * Disable read traps for the following registers
+ *
+ * [BRBSRC|BRBTGT|RBINF]_EL1
+ * [BRBSRCINJ|BRBTGTINJ|BRBINFINJ|BRBTS]_EL1
+ */
+ orr x0, x0, #HDFGRTR_EL2_nBRBDATA_MASK
+
+ /*
+ * Disable write traps for the following registers
+ *
+ * [BRBSRCINJ|BRBTGTINJ|BRBINFINJ|BRBTS]_EL1
+ */
+ orr x2, x2, #HDFGWTR_EL2_nBRBDATA_MASK
+
+ /* Disable read and write traps for [BRBCR|BRBFCR]_EL1 */
+ orr x0, x0, #HDFGRTR_EL2_nBRBCTL_MASK
+ orr x2, x2, #HDFGWTR_EL2_nBRBCTL_MASK
+
+ /* Disable read traps for BRBIDR_EL1 */
+ orr x0, x0, #HDFGRTR_EL2_nBRBIDR_MASK
+
+.Lskip_brbe_reg_fgt_\@:
+#endif /* CONFIG_ARM64_BRBE */
msr_s SYS_HDFGRTR_EL2, x0
- msr_s SYS_HDFGWTR_EL2, x0
+ msr_s SYS_HDFGWTR_EL2, x2
mov x0, xzr
mrs x1, id_aa64pfr1_el1
@@ -246,7 +311,21 @@
.Lset_fgt_\@:
msr_s SYS_HFGRTR_EL2, x0
msr_s SYS_HFGWTR_EL2, x0
- msr_s SYS_HFGITR_EL2, xzr
+ mov x0, xzr
+#ifdef CONFIG_ARM64_BRBE
+ mrs x1, id_aa64dfr0_el1
+ ubfx x1, x1, #ID_AA64DFR0_EL1_BRBE_SHIFT, #4
+ cbz x1, .Lskip_brbe_insn_fgt_\@
+
+ /* Disable traps for BRBIALL instruction */
+ orr x0, x0, #HFGITR_EL2_nBRBIALL_MASK
+
+ /* Disable traps for BRBINJ instruction */
+ orr x0, x0, #HFGITR_EL2_nBRBINJ_MASK
+
+.Lskip_brbe_insn_fgt_\@:
+#endif /* CONFIG_ARM64_BRBE */
+ msr_s SYS_HFGITR_EL2, x0
mrs x1, id_aa64pfr0_el1 // AMU traps UNDEF without AMU
ubfx x1, x1, #ID_AA64PFR0_EL1_AMU_SHIFT, #4
@@ -320,6 +399,7 @@
__init_el2_hcrx
__init_el2_timers
__init_el2_debug
+ __init_el2_brbe
__init_el2_lor
__init_el2_stage2
__init_el2_gicv3
--
2.47.2
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v21 3/4] KVM: arm64: nvhe: Disable branch generation in nVHE guests
2025-04-07 17:41 [PATCH v21 0/4] arm64/perf: Enable branch stack sampling Rob Herring (Arm)
2025-04-07 17:41 ` [PATCH v21 1/4] arm64/sysreg: Add BRBE registers and fields Rob Herring (Arm)
2025-04-07 17:41 ` [PATCH v21 2/4] arm64: Handle BRBE booting requirements Rob Herring (Arm)
@ 2025-04-07 17:41 ` Rob Herring (Arm)
2025-05-19 14:11 ` Will Deacon
2025-05-28 16:09 ` Suzuki K Poulose
2025-04-07 17:41 ` [PATCH v21 4/4] perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE) Rob Herring (Arm)
2025-05-06 14:47 ` [PATCH v21 0/4] arm64/perf: Enable branch stack sampling Jonathan Cameron
4 siblings, 2 replies; 18+ messages in thread
From: Rob Herring (Arm) @ 2025-04-07 17:41 UTC (permalink / raw)
To: Will Deacon, Mark Rutland, Catalin Marinas, Jonathan Corbet,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, James Clark, Anshuman Khandual, Leo Yan
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm
From: Anshuman Khandual <anshuman.khandual@arm.com>
While BRBE can record branches within guests, the host recording
branches in guests is not supported by perf (though events are).
Support for BRBE in guests will supported by providing direct access
to BRBE within the guests. That is how x86 LBR works for guests.
Therefore, BRBE needs to be disabled on guest entry and restored on
exit.
For nVHE, this requires explicit handling for guests. Before
entering a guest, save the BRBE state and disable the it. When
returning to the host, restore the state.
For VHE, it is not necessary. We initialize
BRBCR_EL1.{E1BRE,E0BRE}=={0,0} at boot time, and HCR_EL2.TGE==1 while
running in the host. We configure BRBCR_EL2.{E2BRE,E0HBRE} to enable
branch recording in the host. When entering the guest, we set
HCR_EL2.TGE==0 which means BRBCR_EL1 is used instead of BRBCR_EL2.
Consequently for VHE, BRBE recording is disabled at EL1 and EL0 when
running a guest.
Should recording in guests (by the host) ever be desired, the perf ABI
will need to be extended to distinguish guest addresses (struct
perf_branch_entry.priv) for starters. BRBE records would also need to be
invalidated on guest entry/exit as guest/host EL1 and EL0 records can't
be distinguished.
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Co-developed-by: Rob Herring (Arm) <robh@kernel.org>
Tested-by: James Clark <james.clark@linaro.org>
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
---
v20:
- Reword commit message about no guest recording.
- Add BRBE to __kvm_vcpu_run() synchronization comment
v19:
- Rework due to v6.14 debug flag changes
- Redo commit message
---
arch/arm64/include/asm/kvm_host.h | 2 ++
arch/arm64/kvm/debug.c | 4 ++++
arch/arm64/kvm/hyp/nvhe/debug-sr.c | 32 ++++++++++++++++++++++++++++++++
arch/arm64/kvm/hyp/nvhe/switch.c | 2 +-
4 files changed, 39 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index e98cfe7855a6..e3f1e7b5ce52 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -654,6 +654,7 @@ struct kvm_host_data {
#define KVM_HOST_DATA_FLAG_HAS_TRBE 1
#define KVM_HOST_DATA_FLAG_TRBE_ENABLED 4
#define KVM_HOST_DATA_FLAG_EL1_TRACING_CONFIGURED 5
+#define KVM_HOST_DATA_FLAG_HAS_BRBE 6
unsigned long flags;
struct kvm_cpu_context host_ctxt;
@@ -687,6 +688,7 @@ struct kvm_host_data {
u64 trfcr_el1;
/* Values of trap registers for the host before guest entry. */
u64 mdcr_el2;
+ u64 brbcr_el1;
} host_debug_state;
/* Guest trace filter value */
diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
index 0e4c805e7e89..bc6015108a68 100644
--- a/arch/arm64/kvm/debug.c
+++ b/arch/arm64/kvm/debug.c
@@ -81,6 +81,10 @@ void kvm_init_host_debug_data(void)
!(read_sysreg_s(SYS_PMBIDR_EL1) & PMBIDR_EL1_P))
host_data_set_flag(HAS_SPE);
+ /* Check if we have BRBE implemented and available at the host */
+ if (cpuid_feature_extract_unsigned_field(dfr0, ID_AA64DFR0_EL1_BRBE_SHIFT))
+ host_data_set_flag(HAS_BRBE);
+
if (cpuid_feature_extract_unsigned_field(dfr0, ID_AA64DFR0_EL1_TraceFilt_SHIFT)) {
/* Force disable trace in protected mode in case of no TRBE */
if (is_protected_kvm_enabled())
diff --git a/arch/arm64/kvm/hyp/nvhe/debug-sr.c b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
index 2f4a4f5036bb..2a1c0f49792b 100644
--- a/arch/arm64/kvm/hyp/nvhe/debug-sr.c
+++ b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
@@ -92,12 +92,42 @@ static void __trace_switch_to_host(void)
*host_data_ptr(host_debug_state.trfcr_el1));
}
+static void __debug_save_brbe(u64 *brbcr_el1)
+{
+ *brbcr_el1 = 0;
+
+ /* Check if the BRBE is enabled */
+ if (!(read_sysreg_el1(SYS_BRBCR) & (BRBCR_ELx_E0BRE | BRBCR_ELx_ExBRE)))
+ return;
+
+ /*
+ * Prohibit branch record generation while we are in guest.
+ * Since access to BRBCR_EL1 is trapped, the guest can't
+ * modify the filtering set by the host.
+ */
+ *brbcr_el1 = read_sysreg_el1(SYS_BRBCR);
+ write_sysreg_el1(0, SYS_BRBCR);
+}
+
+static void __debug_restore_brbe(u64 brbcr_el1)
+{
+ if (!brbcr_el1)
+ return;
+
+ /* Restore BRBE controls */
+ write_sysreg_el1(brbcr_el1, SYS_BRBCR);
+}
+
void __debug_save_host_buffers_nvhe(struct kvm_vcpu *vcpu)
{
/* Disable and flush SPE data generation */
if (host_data_test_flag(HAS_SPE))
__debug_save_spe(host_data_ptr(host_debug_state.pmscr_el1));
+ /* Disable BRBE branch records */
+ if (host_data_test_flag(HAS_BRBE))
+ __debug_save_brbe(host_data_ptr(host_debug_state.brbcr_el1));
+
if (__trace_needs_switch())
__trace_switch_to_guest();
}
@@ -111,6 +141,8 @@ void __debug_restore_host_buffers_nvhe(struct kvm_vcpu *vcpu)
{
if (host_data_test_flag(HAS_SPE))
__debug_restore_spe(*host_data_ptr(host_debug_state.pmscr_el1));
+ if (host_data_test_flag(HAS_BRBE))
+ __debug_restore_brbe(*host_data_ptr(host_debug_state.brbcr_el1));
if (__trace_needs_switch())
__trace_switch_to_host();
}
diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
index 7d2ba6ef0261..0d66d3b54800 100644
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -319,7 +319,7 @@ int __kvm_vcpu_run(struct kvm_vcpu *vcpu)
* We're about to restore some new MMU state. Make sure
* ongoing page-table walks that have started before we
* trapped to EL2 have completed. This also synchronises the
- * above disabling of SPE and TRBE.
+ * above disabling of BRBE, SPE and TRBE.
*
* See DDI0487I.a D8.1.5 "Out-of-context translation regimes",
* rule R_LFHQG and subsequent information statements.
--
2.47.2
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v21 4/4] perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE)
2025-04-07 17:41 [PATCH v21 0/4] arm64/perf: Enable branch stack sampling Rob Herring (Arm)
` (2 preceding siblings ...)
2025-04-07 17:41 ` [PATCH v21 3/4] KVM: arm64: nvhe: Disable branch generation in nVHE guests Rob Herring (Arm)
@ 2025-04-07 17:41 ` Rob Herring (Arm)
2025-05-19 15:06 ` Will Deacon
2025-05-06 14:47 ` [PATCH v21 0/4] arm64/perf: Enable branch stack sampling Jonathan Cameron
4 siblings, 1 reply; 18+ messages in thread
From: Rob Herring (Arm) @ 2025-04-07 17:41 UTC (permalink / raw)
To: Will Deacon, Mark Rutland, Catalin Marinas, Jonathan Corbet,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, James Clark, Anshuman Khandual, Leo Yan
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm
From: Anshuman Khandual <anshuman.khandual@arm.com>
The ARMv9.2 architecture introduces the optional Branch Record Buffer
Extension (BRBE), which records information about branches as they are
executed into set of branch record registers. BRBE is similar to x86's
Last Branch Record (LBR) and PowerPC's Branch History Rolling Buffer
(BHRB).
BRBE supports filtering by exception level and can filter just the
source or target address if excluded to avoid leaking privileged
addresses. The h/w filter would be sufficient except when there are
multiple events with disjoint filtering requirements. In this case, BRBE
is configured with a union of all the events' desired branches, and then
the recorded branches are filtered based on each event's filter. For
example, with one event capturing kernel events and another event
capturing user events, BRBE will be configured to capture both kernel
and user branches. When handling event overflow, the branch records have
to be filtered by software to only include kernel or user branch
addresses for that event. In contrast, x86 simply configures LBR using
the last installed event which seems broken.
It is possible on x86 to configure branch filter such that no branches
are ever recorded (e.g. -j save_type). For BRBE, events with a
configuration that will result in no samples are rejected.
Recording branches in KVM guests is not supported like x86. However,
perf on x86 allows requesting branch recording in guests. The guest
events are recorded, but the resulting branches are all from the host.
For BRBE, events with branch recording and "exclude_host" set are
rejected. Requiring "exclude_guest" to be set did not work. The default
for the perf tool does set "exclude_guest" if no exception level
options are specified. However, specifying kernel or user events
defaults to including both host and guest. In this case, only host
branches are recorded.
BRBE can support some additional exception branch types compared to
x86. On x86, all exceptions other than syscalls are recorded as IRQ.
With BRBE, it is possible to better categorize these exceptions. One
limitation relative to x86 is we cannot distinguish a syscall return
from other exception returns. So all exception returns are recorded as
ERET type. The FIQ branch type is omitted as the only FIQ user is Apple
platforms which don't support BRBE. The debug branch types are omitted
as there is no clear need for them.
BRBE records are invalidated whenever events are reconfigured, a new
task is scheduled in, or after recording is paused (and the records
have been recorded for the event). The architecture allows branch
records to be invalidated by the PE under implementation defined
conditions. It is expected that these conditions are rare.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Co-developed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Co-developed-by: Rob Herring (Arm) <robh@kernel.org>
Tested-by: James Clark <james.clark@linaro.org>
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
---
v21:
- update .sched_task() parameters. task_struct added in v6.15
v20:
- Fix sparse percpu warning in branch_records_alloc()
- Use pr_debug instead of pr_debug_once. Add debug print on all error
cases when event.attr is rejected.
- Add back some arm64 specific exception types
- Convert 'call' from user to kernel to syscall as appropriate
- Drop requiring event and branch privileges to match
- Only enable exceptions and exception returns if recording kernel
branches
- Add "branches" caps sysfs attribute like x86
- Drop some unused defines
- Use u64 instead of unsigned long for branch record fields
- use min() in brbinf_get_cycles()
- Reword comment about FZP and MDCR_EL2.HPMN interaction
- Add comments on assumptions about calling brbe_enable()
- Merge capture_brbe_flags() into perf_entry_from_brbe_regset()
- Rework BRBE invalidation to avoid invalidating in interrupt handler
when no handled events capture the branch stack.
- Also clear BRBCR_ELx in brbe_disable(). This is for KVM nVHE checks
if BRBE is enabled.
v19:
- Drop saving of branch records when task scheduled out. (Mark)
- Got rid of added armpmu ops. All BRBE support contained within pmuv3
code.
- Dropped armpmu.num_branch_records as reg_brbidr has same info.
- Make sched_task() callback actually get called. Enabling requires a
call to perf_sched_cb_inc().
- Fix freeze on overflow for VHE
- The cycle counter doesn't freeze BRBE on overflow, so avoid assigning
it when BRBE is enabled.
- Drop all the Arm specific exception branches. Not a clear need for
them.
- Simplify enable/disable to avoid RMW and document ISBs needed
- Fix handling of branch 'cycles' reading. CC field is
mantissa/exponent, not an integer.
- Save BRBFCR and BRBCR settings in event->hw.branch_reg.config and
event->hw.extra_reg.config to avoid recalculating the register value
each time the event is installed.
- Rework s/w filtering to better match h/w filtering
- Reject events with disjoint event filter and branch filter
- Reject events if exclude_host is set
v18: https://lore.kernel.org/all/20240613061731.3109448-6-anshuman.khandual@arm.com/
---
drivers/perf/Kconfig | 11 +
drivers/perf/Makefile | 1 +
drivers/perf/arm_brbe.c | 802 +++++++++++++++++++++++++++++++++++++++++++
drivers/perf/arm_brbe.h | 47 +++
drivers/perf/arm_pmu.c | 15 +-
drivers/perf/arm_pmuv3.c | 129 ++++++-
include/linux/perf/arm_pmu.h | 8 +
7 files changed, 1006 insertions(+), 7 deletions(-)
diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
index 4e268de351c4..3be60ff4236d 100644
--- a/drivers/perf/Kconfig
+++ b/drivers/perf/Kconfig
@@ -223,6 +223,17 @@ config ARM_SPE_PMU
Extension, which provides periodic sampling of operations in
the CPU pipeline and reports this via the perf AUX interface.
+config ARM64_BRBE
+ bool "Enable support for branch stack sampling using FEAT_BRBE"
+ depends on ARM_PMUV3 && ARM64
+ default y
+ help
+ Enable perf support for Branch Record Buffer Extension (BRBE) which
+ records all branches taken in an execution path. This supports some
+ branch types and privilege based filtering. It captures additional
+ relevant information such as cycle count, misprediction and branch
+ type, branch privilege level etc.
+
config ARM_DMC620_PMU
tristate "Enable PMU support for the ARM DMC-620 memory controller"
depends on (ARM64 && ACPI) || COMPILE_TEST
diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
index de71d2574857..192fc8b16204 100644
--- a/drivers/perf/Makefile
+++ b/drivers/perf/Makefile
@@ -23,6 +23,7 @@ obj-$(CONFIG_STARFIVE_STARLINK_PMU) += starfive_starlink_pmu.o
obj-$(CONFIG_THUNDERX2_PMU) += thunderx2_pmu.o
obj-$(CONFIG_XGENE_PMU) += xgene_pmu.o
obj-$(CONFIG_ARM_SPE_PMU) += arm_spe_pmu.o
+obj-$(CONFIG_ARM64_BRBE) += arm_brbe.o
obj-$(CONFIG_ARM_DMC620_PMU) += arm_dmc620_pmu.o
obj-$(CONFIG_MARVELL_CN10K_TAD_PMU) += marvell_cn10k_tad_pmu.o
obj-$(CONFIG_MARVELL_CN10K_DDR_PMU) += marvell_cn10k_ddr_pmu.o
diff --git a/drivers/perf/arm_brbe.c b/drivers/perf/arm_brbe.c
new file mode 100644
index 000000000000..2f254bd40af3
--- /dev/null
+++ b/drivers/perf/arm_brbe.c
@@ -0,0 +1,802 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Branch Record Buffer Extension Driver.
+ *
+ * Copyright (C) 2022-2025 ARM Limited
+ *
+ * Author: Anshuman Khandual <anshuman.khandual@arm.com>
+ */
+#include <linux/types.h>
+#include <linux/bitmap.h>
+#include <linux/perf/arm_pmu.h>
+#include "arm_brbe.h"
+
+#define BRBFCR_EL1_BRANCH_FILTERS (BRBFCR_EL1_DIRECT | \
+ BRBFCR_EL1_INDIRECT | \
+ BRBFCR_EL1_RTN | \
+ BRBFCR_EL1_INDCALL | \
+ BRBFCR_EL1_DIRCALL | \
+ BRBFCR_EL1_CONDDIR)
+
+/*
+ * BRBTS_EL1 is currently not used for branch stack implementation
+ * purpose but BRBCR_ELx.TS needs to have a valid value from all
+ * available options. BRBCR_ELx_TS_VIRTUAL is selected for this.
+ */
+#define BRBCR_ELx_DEFAULT_TS FIELD_PREP(BRBCR_ELx_TS_MASK, BRBCR_ELx_TS_VIRTUAL)
+
+/*
+ * BRBE Buffer Organization
+ *
+ * BRBE buffer is arranged as multiple banks of 32 branch record
+ * entries each. An individual branch record in a given bank could
+ * be accessed, after selecting the bank in BRBFCR_EL1.BANK and
+ * accessing the registers i.e [BRBSRC, BRBTGT, BRBINF] set with
+ * indices [0..31].
+ *
+ * Bank 0
+ *
+ * --------------------------------- ------
+ * | 00 | BRBSRC | BRBTGT | BRBINF | | 00 |
+ * --------------------------------- ------
+ * | 01 | BRBSRC | BRBTGT | BRBINF | | 01 |
+ * --------------------------------- ------
+ * | .. | BRBSRC | BRBTGT | BRBINF | | .. |
+ * --------------------------------- ------
+ * | 31 | BRBSRC | BRBTGT | BRBINF | | 31 |
+ * --------------------------------- ------
+ *
+ * Bank 1
+ *
+ * --------------------------------- ------
+ * | 32 | BRBSRC | BRBTGT | BRBINF | | 00 |
+ * --------------------------------- ------
+ * | 33 | BRBSRC | BRBTGT | BRBINF | | 01 |
+ * --------------------------------- ------
+ * | .. | BRBSRC | BRBTGT | BRBINF | | .. |
+ * --------------------------------- ------
+ * | 63 | BRBSRC | BRBTGT | BRBINF | | 31 |
+ * --------------------------------- ------
+ */
+#define BRBE_BANK_MAX_ENTRIES 32
+
+struct brbe_regset {
+ u64 brbsrc;
+ u64 brbtgt;
+ u64 brbinf;
+};
+
+#define PERF_BR_ARM64_MAX (PERF_BR_MAX + PERF_BR_NEW_MAX)
+
+struct brbe_hw_attr {
+ int brbe_version;
+ int brbe_cc;
+ int brbe_nr;
+ int brbe_format;
+};
+
+#define BRBE_REGN_CASE(n, case_macro) \
+ case n: case_macro(n); break
+
+#define BRBE_REGN_SWITCH(x, case_macro) \
+ do { \
+ switch (x) { \
+ BRBE_REGN_CASE(0, case_macro); \
+ BRBE_REGN_CASE(1, case_macro); \
+ BRBE_REGN_CASE(2, case_macro); \
+ BRBE_REGN_CASE(3, case_macro); \
+ BRBE_REGN_CASE(4, case_macro); \
+ BRBE_REGN_CASE(5, case_macro); \
+ BRBE_REGN_CASE(6, case_macro); \
+ BRBE_REGN_CASE(7, case_macro); \
+ BRBE_REGN_CASE(8, case_macro); \
+ BRBE_REGN_CASE(9, case_macro); \
+ BRBE_REGN_CASE(10, case_macro); \
+ BRBE_REGN_CASE(11, case_macro); \
+ BRBE_REGN_CASE(12, case_macro); \
+ BRBE_REGN_CASE(13, case_macro); \
+ BRBE_REGN_CASE(14, case_macro); \
+ BRBE_REGN_CASE(15, case_macro); \
+ BRBE_REGN_CASE(16, case_macro); \
+ BRBE_REGN_CASE(17, case_macro); \
+ BRBE_REGN_CASE(18, case_macro); \
+ BRBE_REGN_CASE(19, case_macro); \
+ BRBE_REGN_CASE(20, case_macro); \
+ BRBE_REGN_CASE(21, case_macro); \
+ BRBE_REGN_CASE(22, case_macro); \
+ BRBE_REGN_CASE(23, case_macro); \
+ BRBE_REGN_CASE(24, case_macro); \
+ BRBE_REGN_CASE(25, case_macro); \
+ BRBE_REGN_CASE(26, case_macro); \
+ BRBE_REGN_CASE(27, case_macro); \
+ BRBE_REGN_CASE(28, case_macro); \
+ BRBE_REGN_CASE(29, case_macro); \
+ BRBE_REGN_CASE(30, case_macro); \
+ BRBE_REGN_CASE(31, case_macro); \
+ default: WARN(1, "Invalid BRB* index %d\n", x); \
+ } \
+ } while (0)
+
+#define RETURN_READ_BRBSRCN(n) \
+ return read_sysreg_s(SYS_BRBSRC_EL1(n))
+static inline u64 get_brbsrc_reg(int idx)
+{
+ BRBE_REGN_SWITCH(idx, RETURN_READ_BRBSRCN);
+ return 0;
+}
+
+#define RETURN_READ_BRBTGTN(n) \
+ return read_sysreg_s(SYS_BRBTGT_EL1(n))
+static u64 get_brbtgt_reg(int idx)
+{
+ BRBE_REGN_SWITCH(idx, RETURN_READ_BRBTGTN);
+ return 0;
+}
+
+#define RETURN_READ_BRBINFN(n) \
+ return read_sysreg_s(SYS_BRBINF_EL1(n))
+static u64 get_brbinf_reg(int idx)
+{
+ BRBE_REGN_SWITCH(idx, RETURN_READ_BRBINFN);
+ return 0;
+}
+
+static u64 brbe_record_valid(u64 brbinf)
+{
+ return FIELD_GET(BRBINFx_EL1_VALID_MASK, brbinf);
+}
+
+static bool brbe_invalid(u64 brbinf)
+{
+ return brbe_record_valid(brbinf) == BRBINFx_EL1_VALID_NONE;
+}
+
+static bool brbe_record_is_complete(u64 brbinf)
+{
+ return brbe_record_valid(brbinf) == BRBINFx_EL1_VALID_FULL;
+}
+
+static bool brbe_record_is_source_only(u64 brbinf)
+{
+ return brbe_record_valid(brbinf) == BRBINFx_EL1_VALID_SOURCE;
+}
+
+static bool brbe_record_is_target_only(u64 brbinf)
+{
+ return brbe_record_valid(brbinf) == BRBINFx_EL1_VALID_TARGET;
+}
+
+static int brbinf_get_in_tx(u64 brbinf)
+{
+ return FIELD_GET(BRBINFx_EL1_T_MASK, brbinf);
+}
+
+static int brbinf_get_mispredict(u64 brbinf)
+{
+ return FIELD_GET(BRBINFx_EL1_MPRED_MASK, brbinf);
+}
+
+static int brbinf_get_lastfailed(u64 brbinf)
+{
+ return FIELD_GET(BRBINFx_EL1_LASTFAILED_MASK, brbinf);
+}
+
+static u16 brbinf_get_cycles(u64 brbinf)
+{
+ u32 exp, mant, cycles;
+ /*
+ * Captured cycle count is unknown and hence
+ * should not be passed on to userspace.
+ */
+ if (brbinf & BRBINFx_EL1_CCU)
+ return 0;
+
+ exp = FIELD_GET(BRBINFx_EL1_CC_EXP_MASK, brbinf);
+ mant = FIELD_GET(BRBINFx_EL1_CC_MANT_MASK, brbinf);
+
+ if (!exp)
+ return mant;
+
+ cycles = (mant | 0x100) << (exp - 1);
+
+ return min(cycles, U16_MAX);
+}
+
+static int brbinf_get_type(u64 brbinf)
+{
+ return FIELD_GET(BRBINFx_EL1_TYPE_MASK, brbinf);
+}
+
+static int brbinf_get_el(u64 brbinf)
+{
+ return FIELD_GET(BRBINFx_EL1_EL_MASK, brbinf);
+}
+
+void brbe_invalidate(void)
+{
+ // Ensure all branches before this point are recorded
+ isb();
+ asm volatile(BRB_IALL_INSN);
+ // Ensure all branch records are invalidated after this point
+ isb();
+}
+
+static bool valid_brbe_nr(int brbe_nr)
+{
+ return brbe_nr == BRBIDR0_EL1_NUMREC_8 ||
+ brbe_nr == BRBIDR0_EL1_NUMREC_16 ||
+ brbe_nr == BRBIDR0_EL1_NUMREC_32 ||
+ brbe_nr == BRBIDR0_EL1_NUMREC_64;
+}
+
+static bool valid_brbe_cc(int brbe_cc)
+{
+ return brbe_cc == BRBIDR0_EL1_CC_20_BIT;
+}
+
+static bool valid_brbe_format(int brbe_format)
+{
+ return brbe_format == BRBIDR0_EL1_FORMAT_FORMAT_0;
+}
+
+static bool valid_brbidr(u64 brbidr)
+{
+ int brbe_format, brbe_cc, brbe_nr;
+
+ brbe_format = FIELD_GET(BRBIDR0_EL1_FORMAT_MASK, brbidr);
+ brbe_cc = FIELD_GET(BRBIDR0_EL1_CC_MASK, brbidr);
+ brbe_nr = FIELD_GET(BRBIDR0_EL1_NUMREC_MASK, brbidr);
+
+ return valid_brbe_format(brbe_format) && valid_brbe_cc(brbe_cc) && valid_brbe_nr(brbe_nr);
+}
+
+static bool valid_brbe_version(int brbe_version)
+{
+ return brbe_version == ID_AA64DFR0_EL1_BRBE_IMP ||
+ brbe_version == ID_AA64DFR0_EL1_BRBE_BRBE_V1P1;
+}
+
+static void select_brbe_bank(int bank)
+{
+ u64 brbfcr;
+
+ brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
+ brbfcr &= ~BRBFCR_EL1_BANK_MASK;
+ brbfcr |= SYS_FIELD_PREP(BRBFCR_EL1, BANK, bank);
+ write_sysreg_s(brbfcr, SYS_BRBFCR_EL1);
+ /*
+ * Arm ARM (DDI 0487K.a) D.18.4 rule PPBZP requires explicit sync
+ * between setting BANK and accessing branch records.
+ */
+ isb();
+}
+
+static bool __read_brbe_regset(struct brbe_regset *entry, int idx)
+{
+ entry->brbinf = get_brbinf_reg(idx);
+
+ if (brbe_invalid(entry->brbinf))
+ return false;
+
+ entry->brbsrc = get_brbsrc_reg(idx);
+ entry->brbtgt = get_brbtgt_reg(idx);
+ return true;
+}
+
+/*
+ * Generic perf branch filters supported on BRBE
+ *
+ * New branch filters need to be evaluated whether they could be supported on
+ * BRBE. This ensures that such branch filters would not just be accepted, to
+ * fail silently. PERF_SAMPLE_BRANCH_HV is a special case that is selectively
+ * supported only on platforms where kernel is in hyp mode.
+ */
+#define BRBE_EXCLUDE_BRANCH_FILTERS (PERF_SAMPLE_BRANCH_ABORT_TX | \
+ PERF_SAMPLE_BRANCH_IN_TX | \
+ PERF_SAMPLE_BRANCH_NO_TX | \
+ PERF_SAMPLE_BRANCH_CALL_STACK | \
+ PERF_SAMPLE_BRANCH_COUNTERS)
+
+#define BRBE_ALLOWED_BRANCH_TYPES (PERF_SAMPLE_BRANCH_ANY | \
+ PERF_SAMPLE_BRANCH_ANY_CALL | \
+ PERF_SAMPLE_BRANCH_ANY_RETURN | \
+ PERF_SAMPLE_BRANCH_IND_CALL | \
+ PERF_SAMPLE_BRANCH_COND | \
+ PERF_SAMPLE_BRANCH_IND_JUMP | \
+ PERF_SAMPLE_BRANCH_CALL)
+
+
+#define BRBE_ALLOWED_BRANCH_FILTERS (PERF_SAMPLE_BRANCH_USER | \
+ PERF_SAMPLE_BRANCH_KERNEL | \
+ PERF_SAMPLE_BRANCH_HV | \
+ BRBE_ALLOWED_BRANCH_TYPES | \
+ PERF_SAMPLE_BRANCH_NO_FLAGS | \
+ PERF_SAMPLE_BRANCH_NO_CYCLES | \
+ PERF_SAMPLE_BRANCH_TYPE_SAVE | \
+ PERF_SAMPLE_BRANCH_HW_INDEX | \
+ PERF_SAMPLE_BRANCH_PRIV_SAVE)
+
+#define BRBE_PERF_BRANCH_FILTERS (BRBE_ALLOWED_BRANCH_FILTERS | \
+ BRBE_EXCLUDE_BRANCH_FILTERS)
+
+/*
+ * BRBE supports the following functional branch type filters while
+ * generating branch records. These branch filters can be enabled,
+ * either individually or as a group i.e ORing multiple filters
+ * with each other.
+ *
+ * BRBFCR_EL1_CONDDIR - Conditional direct branch
+ * BRBFCR_EL1_DIRCALL - Direct call
+ * BRBFCR_EL1_INDCALL - Indirect call
+ * BRBFCR_EL1_INDIRECT - Indirect branch
+ * BRBFCR_EL1_DIRECT - Direct branch
+ * BRBFCR_EL1_RTN - Subroutine return
+ */
+static u64 branch_type_to_brbfcr(int branch_type)
+{
+ u64 brbfcr = 0;
+
+ if (branch_type & PERF_SAMPLE_BRANCH_ANY) {
+ brbfcr |= BRBFCR_EL1_BRANCH_FILTERS;
+ return brbfcr;
+ }
+
+ if (branch_type & PERF_SAMPLE_BRANCH_ANY_CALL) {
+ brbfcr |= BRBFCR_EL1_INDCALL;
+ brbfcr |= BRBFCR_EL1_DIRCALL;
+ }
+
+ if (branch_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
+ brbfcr |= BRBFCR_EL1_RTN;
+
+ if (branch_type & PERF_SAMPLE_BRANCH_IND_CALL)
+ brbfcr |= BRBFCR_EL1_INDCALL;
+
+ if (branch_type & PERF_SAMPLE_BRANCH_COND)
+ brbfcr |= BRBFCR_EL1_CONDDIR;
+
+ if (branch_type & PERF_SAMPLE_BRANCH_IND_JUMP)
+ brbfcr |= BRBFCR_EL1_INDIRECT;
+
+ if (branch_type & PERF_SAMPLE_BRANCH_CALL)
+ brbfcr |= BRBFCR_EL1_DIRCALL;
+
+ return brbfcr;
+}
+
+/*
+ * BRBE supports the following privilege mode filters while generating
+ * branch records.
+ *
+ * BRBCR_ELx_E0BRE - EL0 branch records
+ * BRBCR_ELx_ExBRE - EL1/EL2 branch records
+ *
+ * BRBE also supports the following additional functional branch type
+ * filters while generating branch records.
+ *
+ * BRBCR_ELx_EXCEPTION - Exception
+ * BRBCR_ELx_ERTN - Exception return
+ */
+static u64 branch_type_to_brbcr(int branch_type)
+{
+ u64 brbcr = BRBCR_ELx_FZP | BRBCR_ELx_DEFAULT_TS;
+
+ if (branch_type & PERF_SAMPLE_BRANCH_USER)
+ brbcr |= BRBCR_ELx_E0BRE;
+
+ /*
+ * When running in the hyp mode, writing into BRBCR_EL1
+ * actually writes into BRBCR_EL2 instead. Field E2BRE
+ * is also at the same position as E1BRE.
+ */
+ if (branch_type & PERF_SAMPLE_BRANCH_KERNEL)
+ brbcr |= BRBCR_ELx_ExBRE;
+
+ if (branch_type & PERF_SAMPLE_BRANCH_HV) {
+ if (is_kernel_in_hyp_mode())
+ brbcr |= BRBCR_ELx_ExBRE;
+ }
+
+ if (!(branch_type & PERF_SAMPLE_BRANCH_NO_CYCLES))
+ brbcr |= BRBCR_ELx_CC;
+
+ if (!(branch_type & PERF_SAMPLE_BRANCH_NO_FLAGS))
+ brbcr |= BRBCR_ELx_MPRED;
+
+ /*
+ * The exception and exception return branches could be
+ * captured, irrespective of the perf event's privilege.
+ * If the perf event does not have enough privilege for
+ * a given exception level, then addresses which falls
+ * under that exception level will be reported as zero
+ * for the captured branch record, creating source only
+ * or target only records.
+ */
+ if (branch_type & PERF_SAMPLE_BRANCH_KERNEL) {
+ if (branch_type & PERF_SAMPLE_BRANCH_ANY) {
+ brbcr |= BRBCR_ELx_EXCEPTION;
+ brbcr |= BRBCR_ELx_ERTN;
+ }
+
+ if (branch_type & PERF_SAMPLE_BRANCH_ANY_CALL)
+ brbcr |= BRBCR_ELx_EXCEPTION;
+
+ if (branch_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
+ brbcr |= BRBCR_ELx_ERTN;
+ }
+ return brbcr;
+}
+
+bool brbe_branch_attr_valid(struct perf_event *event)
+{
+ u64 branch_type = event->attr.branch_sample_type;
+
+ /*
+ * Ensure both perf branch filter allowed and exclude
+ * masks are always in sync with the generic perf ABI.
+ */
+ BUILD_BUG_ON(BRBE_PERF_BRANCH_FILTERS != (PERF_SAMPLE_BRANCH_MAX - 1));
+
+ if (branch_type & BRBE_EXCLUDE_BRANCH_FILTERS) {
+ pr_debug("requested branch filter not supported 0x%llx\n", branch_type);
+ return false;
+ }
+
+ /* Ensure at least 1 branch type is enabled */
+ if (!(branch_type & BRBE_ALLOWED_BRANCH_TYPES)) {
+ pr_debug("no branch type enabled 0x%llx\n", branch_type);
+ return false;
+ }
+
+ /*
+ * No branches are recorded in guests nor nVHE hypervisors, so
+ * excluding the host or both kernel and user is invalid.
+ *
+ * Ideally we'd just require exclude_guest and exclude_hv, but setting
+ * event filters with perf for kernel or user don't set exclude_guest.
+ * So effectively, exclude_guest and exclude_hv are ignored.
+ */
+ if (event->attr.exclude_host || (event->attr.exclude_user && event->attr.exclude_kernel)) {
+ pr_debug("branch filter in hypervisor or guest only not supported 0x%llx\n", branch_type);
+ return false;
+ }
+
+ event->hw.branch_reg.config = branch_type_to_brbfcr(event->attr.branch_sample_type);
+ event->hw.extra_reg.config = branch_type_to_brbcr(event->attr.branch_sample_type);
+
+ return true;
+}
+
+unsigned int brbe_num_branch_records(const struct arm_pmu *armpmu)
+{
+ return FIELD_GET(BRBIDR0_EL1_NUMREC_MASK, armpmu->reg_brbidr);
+}
+
+void brbe_probe(struct arm_pmu *armpmu)
+{
+ u64 aa64dfr0 = read_sysreg_s(SYS_ID_AA64DFR0_EL1);
+ u32 brbe;
+
+ brbe = cpuid_feature_extract_unsigned_field(aa64dfr0, ID_AA64DFR0_EL1_BRBE_SHIFT);
+ if (!valid_brbe_version(brbe))
+ return;
+
+ u64 brbidr = read_sysreg_s(SYS_BRBIDR0_EL1);
+ if (!valid_brbidr(brbidr))
+ return;
+
+ armpmu->reg_brbidr = brbidr;
+}
+
+/*
+ * BRBE is assumed to be disabled/paused on entry
+ */
+void brbe_enable(const struct arm_pmu *arm_pmu)
+{
+ struct pmu_hw_events *cpuc = this_cpu_ptr(arm_pmu->hw_events);
+ u64 brbfcr = 0, brbcr = 0;
+
+ /*
+ * Merge the permitted branch filters of all events.
+ */
+ for (int i = 0; i < ARMPMU_MAX_HWEVENTS; i++) {
+ struct perf_event *event = cpuc->events[i];
+
+ if (event && has_branch_stack(event)) {
+ brbfcr |= event->hw.branch_reg.config;
+ brbcr |= event->hw.extra_reg.config;
+ }
+ }
+
+ /*
+ * In VHE mode with MDCR_EL2.HPMN equal to PMCR_EL0.N, BRBCR_EL1.FZP
+ * controls freezing the branch records on counter overflow rather than
+ * BRBCR_EL2.FZP (which writes to BRBCR_EL1 are redirected to).
+ * The exception levels are enabled/disabled in BRBCR_EL2, so keep EL1
+ * and EL0 recording disabled for guests.
+ *
+ * As BRBCR_EL1 CC and MPRED bits also need to match, use the same
+ * value for both registers just masking the exception levels.
+ */
+ if (is_kernel_in_hyp_mode())
+ write_sysreg_s(brbcr & ~(BRBCR_ELx_ExBRE | BRBCR_ELx_E0BRE), SYS_BRBCR_EL12);
+ write_sysreg_s(brbcr, SYS_BRBCR_EL1);
+ isb(); // Ensure BRBCR_ELx settings take effect before unpausing
+
+ // Finally write SYS_BRBFCR_EL to unpause BRBE
+ write_sysreg_s(brbfcr, SYS_BRBFCR_EL1);
+ // Synchronization in PMCR write ensures ordering WRT PMU enabling
+}
+
+void brbe_disable(void)
+{
+ /*
+ * No need for synchronization here as synchronization in PMCR write
+ * ensures ordering and in the interrupt handler this is a NOP as
+ * we're already paused.
+ */
+ write_sysreg_s(BRBFCR_EL1_PAUSED, SYS_BRBFCR_EL1);
+ write_sysreg_s(0, SYS_BRBCR_EL1);
+}
+
+static const int brbe_type_to_perf_type_map[BRBINFx_EL1_TYPE_DEBUG_EXIT + 1][2] = {
+ [BRBINFx_EL1_TYPE_DIRECT_UNCOND] = { PERF_BR_UNCOND, 0 },
+ [BRBINFx_EL1_TYPE_INDIRECT] = { PERF_BR_IND, 0 },
+ [BRBINFx_EL1_TYPE_DIRECT_LINK] = { PERF_BR_CALL, 0 },
+ [BRBINFx_EL1_TYPE_INDIRECT_LINK] = { PERF_BR_IND_CALL, 0 },
+ [BRBINFx_EL1_TYPE_RET] = { PERF_BR_RET, 0 },
+ [BRBINFx_EL1_TYPE_DIRECT_COND] = { PERF_BR_COND, 0 },
+ [BRBINFx_EL1_TYPE_CALL] = { PERF_BR_CALL, 0 },
+ [BRBINFx_EL1_TYPE_ERET] = { PERF_BR_ERET, 0 },
+ [BRBINFx_EL1_TYPE_IRQ] = { PERF_BR_IRQ, 0 },
+ [BRBINFx_EL1_TYPE_TRAP] = { PERF_BR_IRQ, 0 },
+ [BRBINFx_EL1_TYPE_SERROR] = { PERF_BR_SERROR, 0 },
+ [BRBINFx_EL1_TYPE_ALIGN_FAULT] = { PERF_BR_EXTEND_ABI, PERF_BR_NEW_FAULT_ALGN },
+ [BRBINFx_EL1_TYPE_INSN_FAULT] = { PERF_BR_EXTEND_ABI, PERF_BR_NEW_FAULT_INST },
+ [BRBINFx_EL1_TYPE_DATA_FAULT] = { PERF_BR_EXTEND_ABI, PERF_BR_NEW_FAULT_DATA },
+};
+
+static void brbe_set_perf_entry_type(struct perf_branch_entry *entry, u64 brbinf)
+{
+ int brbe_type = brbinf_get_type(brbinf);
+
+ if (brbe_type <= BRBINFx_EL1_TYPE_DEBUG_EXIT) {
+ const int *br_type = brbe_type_to_perf_type_map[brbe_type];
+
+ entry->type = br_type[0];
+ entry->new_type = br_type[1];
+ }
+}
+
+static int brbinf_get_perf_priv(u64 brbinf)
+{
+ int brbe_el = brbinf_get_el(brbinf);
+
+ switch (brbe_el) {
+ case BRBINFx_EL1_EL_EL0:
+ return PERF_BR_PRIV_USER;
+ case BRBINFx_EL1_EL_EL1:
+ return PERF_BR_PRIV_KERNEL;
+ case BRBINFx_EL1_EL_EL2:
+ if (is_kernel_in_hyp_mode())
+ return PERF_BR_PRIV_KERNEL;
+ return PERF_BR_PRIV_HV;
+ default:
+ pr_warn_once("%d - unknown branch privilege captured\n", brbe_el);
+ return PERF_BR_PRIV_UNKNOWN;
+ }
+}
+
+static bool perf_entry_from_brbe_regset(int index, struct perf_branch_entry *entry,
+ const struct perf_event *event)
+{
+ struct brbe_regset bregs;
+ u64 brbinf;
+
+ if (!__read_brbe_regset(&bregs, index))
+ return false;
+
+ brbinf = bregs.brbinf;
+ perf_clear_branch_entry_bitfields(entry);
+ if (brbe_record_is_complete(brbinf)) {
+ entry->from = bregs.brbsrc;
+ entry->to = bregs.brbtgt;
+ } else if (brbe_record_is_source_only(brbinf)) {
+ entry->from = bregs.brbsrc;
+ entry->to = 0;
+ } else if (brbe_record_is_target_only(brbinf)) {
+ entry->from = 0;
+ entry->to = bregs.brbtgt;
+ }
+
+ brbe_set_perf_entry_type(entry, brbinf);
+
+ if (!branch_sample_no_cycles(event))
+ entry->cycles = brbinf_get_cycles(brbinf);
+
+ if (!branch_sample_no_flags(event)) {
+ /* Mispredict info is available for source only and complete branch records. */
+ if (!brbe_record_is_target_only(brbinf)) {
+ entry->mispred = brbinf_get_mispredict(brbinf);
+ entry->predicted = !entry->mispred;
+ }
+
+ /*
+ * Currently TME feature is neither implemented in any hardware
+ * nor it is being supported in the kernel. Just warn here once
+ * if TME related information shows up rather unexpectedly.
+ */
+ if (brbinf_get_lastfailed(brbinf) || brbinf_get_in_tx(brbinf))
+ pr_warn_once("Unknown transaction states\n");
+ }
+
+ /*
+ * Branch privilege level is available for target only and complete
+ * branch records.
+ */
+ if (!brbe_record_is_source_only(brbinf))
+ entry->priv = brbinf_get_perf_priv(brbinf);
+
+ return true;
+}
+
+#define PERF_BR_ARM64_ALL ( \
+ BIT(PERF_BR_COND) | \
+ BIT(PERF_BR_UNCOND) | \
+ BIT(PERF_BR_IND) | \
+ BIT(PERF_BR_CALL) | \
+ BIT(PERF_BR_IND_CALL) | \
+ BIT(PERF_BR_RET))
+
+#define PERF_BR_ARM64_ALL_KERNEL ( \
+ BIT(PERF_BR_SYSCALL) | \
+ BIT(PERF_BR_IRQ) | \
+ BIT(PERF_BR_SERROR) | \
+ BIT(PERF_BR_MAX + PERF_BR_NEW_FAULT_ALGN) | \
+ BIT(PERF_BR_MAX + PERF_BR_NEW_FAULT_DATA) | \
+ BIT(PERF_BR_MAX + PERF_BR_NEW_FAULT_INST))
+
+static void prepare_event_branch_type_mask(u64 branch_sample,
+ unsigned long *event_type_mask)
+{
+ if (branch_sample & PERF_SAMPLE_BRANCH_ANY) {
+ if (branch_sample & PERF_SAMPLE_BRANCH_KERNEL)
+ bitmap_from_u64(event_type_mask,
+ BIT(PERF_BR_ERET) | PERF_BR_ARM64_ALL |
+ PERF_BR_ARM64_ALL_KERNEL);
+ else
+ bitmap_from_u64(event_type_mask, PERF_BR_ARM64_ALL);
+ return;
+ }
+
+ bitmap_zero(event_type_mask, PERF_BR_ARM64_MAX);
+
+ if (branch_sample & PERF_SAMPLE_BRANCH_ANY_CALL) {
+ if (branch_sample & PERF_SAMPLE_BRANCH_KERNEL)
+ bitmap_from_u64(event_type_mask, PERF_BR_ARM64_ALL_KERNEL);
+
+ set_bit(PERF_BR_CALL, event_type_mask);
+ set_bit(PERF_BR_IND_CALL, event_type_mask);
+ }
+
+ if (branch_sample & PERF_SAMPLE_BRANCH_IND_JUMP)
+ set_bit(PERF_BR_IND, event_type_mask);
+
+ if (branch_sample & PERF_SAMPLE_BRANCH_COND)
+ set_bit(PERF_BR_COND, event_type_mask);
+
+ if (branch_sample & PERF_SAMPLE_BRANCH_CALL)
+ set_bit(PERF_BR_CALL, event_type_mask);
+
+ if (branch_sample & PERF_SAMPLE_BRANCH_IND_CALL)
+ set_bit(PERF_BR_IND_CALL, event_type_mask);
+
+ if (branch_sample & PERF_SAMPLE_BRANCH_ANY_RETURN) {
+ set_bit(PERF_BR_RET, event_type_mask);
+
+ if (branch_sample & PERF_SAMPLE_BRANCH_KERNEL)
+ set_bit(PERF_BR_ERET, event_type_mask);
+ }
+}
+
+/*
+ * BRBE is configured with an OR of permissions from all events, so there may
+ * be events which have to be dropped or events where just the source or target
+ * address has to be zeroed.
+ */
+static bool filter_branch_privilege(struct perf_branch_entry *entry, u64 branch_sample_type)
+{
+ /* We can only have a half record if permissions have not been expanded */
+ if (!entry->from || !entry->to)
+ return true;
+
+ bool from_user = access_ok((void __user *)(unsigned long)entry->from, 4);
+ bool to_user = access_ok((void __user *)(unsigned long)entry->to, 4);
+ bool exclude_kernel = !((branch_sample_type & PERF_SAMPLE_BRANCH_KERNEL) ||
+ (is_kernel_in_hyp_mode() && (branch_sample_type & PERF_SAMPLE_BRANCH_HV)));
+
+ /*
+ * If record is within a single exception level, just need to either
+ * drop or keep the entire record.
+ */
+ if (from_user == to_user)
+ return ((entry->priv == PERF_BR_PRIV_KERNEL) && !exclude_kernel) ||
+ ((entry->priv == PERF_BR_PRIV_USER) &&
+ (branch_sample_type & PERF_SAMPLE_BRANCH_USER));
+
+ // Fixup calls which are syscalls
+ if (entry->type == PERF_BR_CALL && from_user && !to_user)
+ entry->type = PERF_BR_SYSCALL;
+
+ /*
+ * Record is across exception levels, mask addresses for the exception
+ * level we're not capturing.
+ */
+ if (!(branch_sample_type & PERF_SAMPLE_BRANCH_USER)) {
+ if (from_user)
+ entry->from = 0;
+ if (to_user)
+ entry->to = 0;
+ }
+
+ if (exclude_kernel) {
+ if (!from_user)
+ entry->from = 0;
+ if (!to_user)
+ entry->to = 0;
+ }
+ return true;
+}
+
+static bool filter_branch_type(struct perf_branch_entry *entry,
+ const unsigned long *event_type_mask)
+{
+ if (entry->type == PERF_BR_EXTEND_ABI)
+ return test_bit(PERF_BR_MAX + entry->new_type, event_type_mask);
+ else
+ return test_bit(entry->type, event_type_mask);
+}
+
+static bool filter_branch_record(struct perf_branch_entry *entry,
+ u64 branch_sample,
+ const unsigned long *event_type_mask)
+{
+ return filter_branch_type(entry, event_type_mask) &&
+ filter_branch_privilege(entry, branch_sample);
+}
+
+void brbe_read_filtered_entries(struct perf_branch_stack *branch_stack,
+ const struct perf_event *event)
+{
+ struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
+ int nr_hw = brbe_num_branch_records(cpu_pmu);
+ int nr_banks = DIV_ROUND_UP(nr_hw, BRBE_BANK_MAX_ENTRIES);
+ int nr_filtered = 0;
+ u64 branch_sample_type = event->attr.branch_sample_type;
+ DECLARE_BITMAP(event_type_mask, PERF_BR_ARM64_MAX);
+
+ prepare_event_branch_type_mask(branch_sample_type, event_type_mask);
+
+ for (int bank = 0; bank < nr_banks; bank++) {
+ int nr_remaining = nr_hw - (bank * BRBE_BANK_MAX_ENTRIES);
+ int nr_this_bank = min(nr_remaining, BRBE_BANK_MAX_ENTRIES);
+
+ select_brbe_bank(bank);
+
+ for (int i = 0; i < nr_this_bank; i++) {
+ struct perf_branch_entry *pbe = &branch_stack->entries[nr_filtered];
+
+ if (!perf_entry_from_brbe_regset(i, pbe, event))
+ goto done;
+
+ if (!filter_branch_record(pbe, branch_sample_type, event_type_mask))
+ continue;
+
+ nr_filtered++;
+ }
+ }
+
+done:
+ brbe_invalidate();
+ branch_stack->nr = nr_filtered;
+}
diff --git a/drivers/perf/arm_brbe.h b/drivers/perf/arm_brbe.h
new file mode 100644
index 000000000000..b7c7d8796c86
--- /dev/null
+++ b/drivers/perf/arm_brbe.h
@@ -0,0 +1,47 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Branch Record Buffer Extension Helpers.
+ *
+ * Copyright (C) 2022-2025 ARM Limited
+ *
+ * Author: Anshuman Khandual <anshuman.khandual@arm.com>
+ */
+
+struct arm_pmu;
+struct perf_branch_stack;
+struct perf_event;
+
+#ifdef CONFIG_ARM64_BRBE
+void brbe_probe(struct arm_pmu *arm_pmu);
+unsigned int brbe_num_branch_records(const struct arm_pmu *armpmu);
+void brbe_invalidate(void);
+
+void brbe_enable(const struct arm_pmu *arm_pmu);
+void brbe_disable(void);
+
+bool brbe_branch_attr_valid(struct perf_event *event);
+void brbe_read_filtered_entries(struct perf_branch_stack *branch_stack,
+ const struct perf_event *event);
+#else
+static inline void brbe_probe(struct arm_pmu *arm_pmu) { }
+static inline unsigned int brbe_num_branch_records(const struct arm_pmu *armpmu)
+{
+ return 0;
+}
+
+static inline void brbe_invalidate(void) { }
+
+static inline void brbe_enable(const struct arm_pmu *arm_pmu) { };
+static inline void brbe_disable(void) { };
+
+static inline bool brbe_branch_attr_valid(struct perf_event *event)
+{
+ WARN_ON_ONCE(!has_branch_stack(event));
+ return false;
+}
+
+static void brbe_read_filtered_entries(struct perf_branch_stack *branch_stack,
+ const struct perf_event *event)
+{
+}
+#endif
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 2f33e69a8caf..df9867c0dc57 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -99,7 +99,7 @@ static const struct pmu_irq_ops percpu_pmunmi_ops = {
.free_pmuirq = armpmu_free_percpu_pmunmi
};
-static DEFINE_PER_CPU(struct arm_pmu *, cpu_armpmu);
+DEFINE_PER_CPU(struct arm_pmu *, cpu_armpmu);
static DEFINE_PER_CPU(int, cpu_irq);
static DEFINE_PER_CPU(const struct pmu_irq_ops *, cpu_irq_ops);
@@ -317,6 +317,11 @@ armpmu_del(struct perf_event *event, int flags)
struct hw_perf_event *hwc = &event->hw;
int idx = hwc->idx;
+ if (has_branch_stack(event)) {
+ hw_events->branch_users--;
+ perf_sched_cb_dec(event->pmu);
+ }
+
armpmu_stop(event, PERF_EF_UPDATE);
hw_events->events[idx] = NULL;
armpmu->clear_event_idx(hw_events, event);
@@ -345,6 +350,11 @@ armpmu_add(struct perf_event *event, int flags)
/* The newly-allocated counter should be empty */
WARN_ON_ONCE(hw_events->events[idx]);
+ if (has_branch_stack(event)) {
+ hw_events->branch_users++;
+ perf_sched_cb_inc(event->pmu);
+ }
+
event->hw.idx = idx;
hw_events->events[idx] = event;
@@ -509,8 +519,7 @@ static int armpmu_event_init(struct perf_event *event)
!cpumask_test_cpu(event->cpu, &armpmu->supported_cpus))
return -ENOENT;
- /* does not support taken branch sampling */
- if (has_branch_stack(event))
+ if (has_branch_stack(event) && !armpmu->reg_brbidr)
return -EOPNOTSUPP;
return __hw_perf_event_init(event);
diff --git a/drivers/perf/arm_pmuv3.c b/drivers/perf/arm_pmuv3.c
index e506d59654e7..4f101fb4525d 100644
--- a/drivers/perf/arm_pmuv3.c
+++ b/drivers/perf/arm_pmuv3.c
@@ -25,6 +25,8 @@
#include <linux/smp.h>
#include <linux/nmi.h>
+#include "arm_brbe.h"
+
/* ARMv8 Cortex-A53 specific event types. */
#define ARMV8_A53_PERFCTR_PREF_LINEFILL 0xC2
@@ -438,7 +440,19 @@ static ssize_t threshold_max_show(struct device *dev,
static DEVICE_ATTR_RO(threshold_max);
+static ssize_t branches_show(struct device *dev,
+ struct device_attribute *attr, char *page)
+{
+ struct pmu *pmu = dev_get_drvdata(dev);
+ struct arm_pmu *cpu_pmu = container_of(pmu, struct arm_pmu, pmu);
+
+ return sysfs_emit(page, "%d\n", brbe_num_branch_records(cpu_pmu));
+}
+
+static DEVICE_ATTR_RO(branches);
+
static struct attribute *armv8_pmuv3_caps_attrs[] = {
+ &dev_attr_branches.attr,
&dev_attr_slots.attr,
&dev_attr_bus_slots.attr,
&dev_attr_bus_width.attr,
@@ -446,9 +460,22 @@ static struct attribute *armv8_pmuv3_caps_attrs[] = {
NULL,
};
+static umode_t caps_is_visible(struct kobject *kobj, struct attribute *attr, int i)
+{
+ struct device *dev = kobj_to_dev(kobj);
+ struct pmu *pmu = dev_get_drvdata(dev);
+ struct arm_pmu *cpu_pmu = container_of(pmu, struct arm_pmu, pmu);
+
+ if (i == 0)
+ return brbe_num_branch_records(cpu_pmu) ? attr->mode : 0;
+
+ return attr->mode;
+}
+
static const struct attribute_group armv8_pmuv3_caps_attr_group = {
.name = "caps",
.attrs = armv8_pmuv3_caps_attrs,
+ .is_visible = caps_is_visible,
};
/*
@@ -806,9 +833,10 @@ static void armv8pmu_disable_event(struct perf_event *event)
armv8pmu_disable_event_irq(event);
}
-static void armv8pmu_start(struct arm_pmu *cpu_pmu)
+static void armv8pmu_restart(struct arm_pmu *cpu_pmu)
{
struct perf_event_context *ctx;
+ struct pmu_hw_events *hw_events = this_cpu_ptr(cpu_pmu->hw_events);
int nr_user = 0;
ctx = perf_cpu_task_ctx();
@@ -822,16 +850,44 @@ static void armv8pmu_start(struct arm_pmu *cpu_pmu)
kvm_vcpu_pmu_resync_el0();
+ if (hw_events->branch_users)
+ brbe_enable(cpu_pmu);
+
/* Enable all counters */
armv8pmu_pmcr_write(armv8pmu_pmcr_read() | ARMV8_PMU_PMCR_E);
}
+static void armv8pmu_start(struct arm_pmu *cpu_pmu)
+{
+ struct pmu_hw_events *hw_events = this_cpu_ptr(cpu_pmu->hw_events);
+
+ if (hw_events->branch_users)
+ brbe_invalidate();
+
+ armv8pmu_restart(cpu_pmu);
+}
+
static void armv8pmu_stop(struct arm_pmu *cpu_pmu)
{
+ struct pmu_hw_events *hw_events = this_cpu_ptr(cpu_pmu->hw_events);
+
+ if (hw_events->branch_users)
+ brbe_disable();
+
/* Disable all counters */
armv8pmu_pmcr_write(armv8pmu_pmcr_read() & ~ARMV8_PMU_PMCR_E);
}
+static void read_branch_records(struct pmu_hw_events *cpuc,
+ struct perf_event *event,
+ struct perf_sample_data *data)
+{
+ struct perf_branch_stack *branch_stack = cpuc->branch_stack;
+
+ brbe_read_filtered_entries(branch_stack, event);
+ perf_sample_save_brstack(data, event, branch_stack, NULL);
+}
+
static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
{
u64 pmovsr;
@@ -882,6 +938,13 @@ static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
if (!armpmu_event_set_period(event))
continue;
+ /*
+ * PMU IRQ should remain asserted until all branch records
+ * are captured and processed into struct perf_sample_data.
+ */
+ if (has_branch_stack(event))
+ read_branch_records(cpuc, event, &data);
+
/*
* Perf event overflow will queue the processing of the event as
* an irq_work which will be taken care of in the handling of
@@ -890,7 +953,7 @@ static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
if (perf_event_overflow(event, &data, regs))
cpu_pmu->disable(event);
}
- armv8pmu_start(cpu_pmu);
+ armv8pmu_restart(cpu_pmu);
return IRQ_HANDLED;
}
@@ -939,7 +1002,7 @@ static int armv8pmu_get_event_idx(struct pmu_hw_events *cpuc,
/* Always prefer to place a cycle counter into the cycle counter. */
if ((evtype == ARMV8_PMUV3_PERFCTR_CPU_CYCLES) &&
- !armv8pmu_event_get_threshold(&event->attr)) {
+ !armv8pmu_event_get_threshold(&event->attr) && !has_branch_stack(event)) {
if (!test_and_set_bit(ARMV8_PMU_CYCLE_IDX, cpuc->used_mask))
return ARMV8_PMU_CYCLE_IDX;
else if (armv8pmu_event_is_64bit(event) &&
@@ -988,6 +1051,19 @@ static int armv8pmu_user_event_idx(struct perf_event *event)
return event->hw.idx + 1;
}
+static void armv8pmu_sched_task(struct perf_event_pmu_context *pmu_ctx,
+ struct task_struct *task, bool sched_in)
+{
+ struct arm_pmu *armpmu = *this_cpu_ptr(&cpu_armpmu);
+ struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events);
+
+ if (!hw_events->branch_users)
+ return;
+
+ if (sched_in)
+ brbe_invalidate();
+}
+
/*
* Add an event filter to a given event.
*/
@@ -1005,6 +1081,13 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
return -EOPNOTSUPP;
}
+ if (has_branch_stack(perf_event)) {
+ if (!brbe_num_branch_records(cpu_pmu) || !brbe_branch_attr_valid(perf_event))
+ return -EOPNOTSUPP;
+
+ perf_event->attach_state |= PERF_ATTACH_SCHED_CB;
+ }
+
/*
* If we're running in hyp mode, then we *are* the hypervisor.
* Therefore we ignore exclude_hv in this configuration, since
@@ -1071,6 +1154,11 @@ static void armv8pmu_reset(void *info)
/* Clear the counters we flip at guest entry/exit */
kvm_clr_pmu_events(mask);
+ if (brbe_num_branch_records(cpu_pmu)) {
+ brbe_disable();
+ brbe_invalidate();
+ }
+
/*
* Initialize & Reset PMNC. Request overflow interrupt for
* 64 bit cycle counter but cheat in armv8pmu_write_counter().
@@ -1239,6 +1327,30 @@ static void __armv8pmu_probe_pmu(void *info)
cpu_pmu->reg_pmmir = read_pmmir();
else
cpu_pmu->reg_pmmir = 0;
+
+ brbe_probe(cpu_pmu);
+}
+
+static int branch_records_alloc(struct arm_pmu *armpmu)
+{
+ struct perf_branch_stack *branch_stack_cpu;
+ struct perf_branch_stack __percpu *branch_stack;
+ size_t size = struct_size(branch_stack_cpu, entries, brbe_num_branch_records(armpmu));
+ int cpu;
+
+ branch_stack = __alloc_percpu_gfp(size, __alignof__(*branch_stack_cpu),
+ GFP_KERNEL);
+ if (!branch_stack)
+ return -ENOMEM;
+
+ for_each_possible_cpu(cpu) {
+ struct pmu_hw_events *events_cpu;
+
+ events_cpu = per_cpu_ptr(armpmu->hw_events, cpu);
+ branch_stack_cpu = per_cpu_ptr(branch_stack, cpu);
+ events_cpu->branch_stack = branch_stack_cpu;
+ }
+ return 0;
}
static int armv8pmu_probe_pmu(struct arm_pmu *cpu_pmu)
@@ -1255,7 +1367,15 @@ static int armv8pmu_probe_pmu(struct arm_pmu *cpu_pmu)
if (ret)
return ret;
- return probe.present ? 0 : -ENODEV;
+ if (!probe.present)
+ return -ENODEV;
+
+ if (brbe_num_branch_records(cpu_pmu)) {
+ ret = branch_records_alloc(cpu_pmu);
+ if (ret)
+ return ret;
+ }
+ return 0;
}
static void armv8pmu_disable_user_access_ipi(void *unused)
@@ -1314,6 +1434,7 @@ static int armv8_pmu_init(struct arm_pmu *cpu_pmu, char *name,
cpu_pmu->set_event_filter = armv8pmu_set_event_filter;
cpu_pmu->pmu.event_idx = armv8pmu_user_event_idx;
+ cpu_pmu->pmu.sched_task = armv8pmu_sched_task;
cpu_pmu->name = name;
cpu_pmu->map_event = map_event;
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 6dc5e0cd76ca..93c9a26492fc 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -70,6 +70,11 @@ struct pmu_hw_events {
struct arm_pmu *percpu_pmu;
int irq;
+
+ struct perf_branch_stack *branch_stack;
+
+ /* Active events requesting branch records */
+ unsigned int branch_users;
};
enum armpmu_attr_groups {
@@ -115,6 +120,7 @@ struct arm_pmu {
/* PMUv3 only */
int pmuver;
u64 reg_pmmir;
+ u64 reg_brbidr;
#define ARMV8_PMUV3_MAX_COMMON_EVENTS 0x40
DECLARE_BITMAP(pmceid_bitmap, ARMV8_PMUV3_MAX_COMMON_EVENTS);
#define ARMV8_PMUV3_EXT_COMMON_EVENT_BASE 0x4000
@@ -126,6 +132,8 @@ struct arm_pmu {
#define to_arm_pmu(p) (container_of(p, struct arm_pmu, pmu))
+DECLARE_PER_CPU(struct arm_pmu *, cpu_armpmu);
+
u64 armpmu_event_update(struct perf_event *event);
int armpmu_event_set_period(struct perf_event *event);
--
2.47.2
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH v21 0/4] arm64/perf: Enable branch stack sampling
2025-04-07 17:41 [PATCH v21 0/4] arm64/perf: Enable branch stack sampling Rob Herring (Arm)
` (3 preceding siblings ...)
2025-04-07 17:41 ` [PATCH v21 4/4] perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE) Rob Herring (Arm)
@ 2025-05-06 14:47 ` Jonathan Cameron
2025-05-06 21:30 ` Rob Herring
4 siblings, 1 reply; 18+ messages in thread
From: Jonathan Cameron @ 2025-05-06 14:47 UTC (permalink / raw)
To: Rob Herring (Arm)
Cc: Will Deacon, Mark Rutland, Catalin Marinas, Jonathan Corbet,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, James Clark, Anshuman Khandual, Leo Yan,
linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm, Mark Brown
On Mon, 07 Apr 2025 12:41:29 -0500
"Rob Herring (Arm)" <robh@kernel.org> wrote:
> This series enables perf branch stack sampling support on arm64 via a
> v9.2 arch feature called Branch Record Buffer Extension (BRBE). Details
> on BRBE can be found in the Arm ARM[1] chapter D18.
>
> I've picked up this series from Anshuman. v19 and v20 versions have been
> reworked quite a bit by Mark and myself. The bulk of those changes are
> in patch 4.
>
> A git branch is here[2].
>
> [1] https://developer.arm.com/documentation/ddi0487/latest/
> [2] git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git arm/brbe-v21
Hi Rob,
Expecting any changes on this? Anything specific we can do to help then shout.
We need to pull in the reworked version into openeuler that is carrying older
code and would rather not do it more than once.
Jonathan
>
> v21:
> - Drop clean-up patches 1-7 already applied
> - Rebase on v6.15-rc1
>
> v20:
> - https://lore.kernel.org/r/20250218-arm-brbe-v19-v20-0-4e9922fc2e8e@kernel.org
> - Added back some of the arm64 specific exception types. The x86 IRQ
> branches also include other exceptions like page faults. On arm64, we
> can distinguish the exception types, so we do. Also, to better
> align with x86, we convert 'call' branches which are user to kernel
> to 'syscall'.
> - Only enable exceptions and exception returns if recording kernel
> branches (matching x86)
> - Drop requiring event and branch privileges to match
> - Add "branches" caps sysfs attribute like x86
> - Reword comment about FZP and MDCR_EL2.HPMN interaction
> - Rework BRBE invalidation to avoid invalidating in interrupt handler
> when no handled events capture the branch stack (i.e. when there are
> multiple users).
> - Also clear BRBCR_ELx bits in brbe_disable(). This is for KVM nVHE
> checks if BRBE is enabled.
> - Document that MDCR_EL3.SBRBE can be 0b01 also
>
> v19:
> - https://lore.kernel.org/all/20250202-arm-brbe-v19-v19-0-1c1300802385@kernel.org/
> - Drop saving of branch records when task scheduled out (Mark). Make
> sched_task() callback actually get called. Enabling requires a call
> to perf_sched_cb_inc(). So the saving of branch records never
> happened.
> - Got rid of added armpmu ops. All BRBE support is contained within
> pmuv3 code.
> - Fix freeze on overflow for VHE
> - The cycle counter doesn't freeze BRBE on overflow, so avoid assigning
> it when BRBE is enabled.
> - Drop all the Arm specific exception branches. Not a clear need for
> them.
> - Fix handling of branch 'cycles' reading. CC field is
> mantissa/exponent, not an integer.
> - Rework s/w filtering to better match h/w filtering
> - Reject events with disjoint event filter and branch filter or with
> exclude_host set
> - Dropped perf test patch which has been applied for 6.14
> - Dropped patch "KVM: arm64: Explicitly handle BRBE traps as UNDEFINED"
> which has been applied for 6.14
>
> v18:
> - https://lore.kernel.org/all/20240613061731.3109448-1-anshuman.khandual@arm.com/
>
> For v1-v17, see the above link. Not going to duplicate it all here...
>
> Signed-off-by: "Rob Herring (Arm)" <robh@kernel.org>
> ---
> Anshuman Khandual (4):
> arm64/sysreg: Add BRBE registers and fields
> arm64: Handle BRBE booting requirements
> KVM: arm64: nvhe: Disable branch generation in nVHE guests
> perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE)
>
> Documentation/arch/arm64/booting.rst | 21 +
> arch/arm64/include/asm/el2_setup.h | 86 +++-
> arch/arm64/include/asm/kvm_host.h | 2 +
> arch/arm64/include/asm/sysreg.h | 17 +-
> arch/arm64/kvm/debug.c | 4 +
> arch/arm64/kvm/hyp/nvhe/debug-sr.c | 32 ++
> arch/arm64/kvm/hyp/nvhe/switch.c | 2 +-
> arch/arm64/tools/sysreg | 132 ++++++
> drivers/perf/Kconfig | 11 +
> drivers/perf/Makefile | 1 +
> drivers/perf/arm_brbe.c | 802 +++++++++++++++++++++++++++++++++++
> drivers/perf/arm_brbe.h | 47 ++
> drivers/perf/arm_pmu.c | 15 +-
> drivers/perf/arm_pmuv3.c | 129 +++++-
> include/linux/perf/arm_pmu.h | 8 +
> 15 files changed, 1287 insertions(+), 22 deletions(-)
> ---
> base-commit: 0af2f6be1b4281385b618cb86ad946eded089ac8
> change-id: 20250129-arm-brbe-v19-24d5d9e5e623
>
> Best regards,
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v21 0/4] arm64/perf: Enable branch stack sampling
2025-05-06 14:47 ` [PATCH v21 0/4] arm64/perf: Enable branch stack sampling Jonathan Cameron
@ 2025-05-06 21:30 ` Rob Herring
0 siblings, 0 replies; 18+ messages in thread
From: Rob Herring @ 2025-05-06 21:30 UTC (permalink / raw)
To: Jonathan Cameron
Cc: Will Deacon, Mark Rutland, Catalin Marinas, Jonathan Corbet,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, James Clark, Anshuman Khandual, Leo Yan,
linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm, Mark Brown
On Tue, May 6, 2025 at 9:47 AM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> On Mon, 07 Apr 2025 12:41:29 -0500
> "Rob Herring (Arm)" <robh@kernel.org> wrote:
>
> > This series enables perf branch stack sampling support on arm64 via a
> > v9.2 arch feature called Branch Record Buffer Extension (BRBE). Details
> > on BRBE can be found in the Arm ARM[1] chapter D18.
> >
> > I've picked up this series from Anshuman. v19 and v20 versions have been
> > reworked quite a bit by Mark and myself. The bulk of those changes are
> > in patch 4.
> >
> > A git branch is here[2].
> >
> > [1] https://developer.arm.com/documentation/ddi0487/latest/
> > [2] git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git arm/brbe-v21
>
> Hi Rob,
>
> Expecting any changes on this? Anything specific we can do to help then shout.
I hope not. Still waiting on Mark R's review though... I'm sure there
will be something and since perf drivers seemingly only get reviewed
once a cycle, I guess it will be another cycle... Sigh.
Rob
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v21 2/4] arm64: Handle BRBE booting requirements
2025-04-07 17:41 ` [PATCH v21 2/4] arm64: Handle BRBE booting requirements Rob Herring (Arm)
@ 2025-05-19 14:07 ` Will Deacon
2025-05-19 19:31 ` Rob Herring
0 siblings, 1 reply; 18+ messages in thread
From: Will Deacon @ 2025-05-19 14:07 UTC (permalink / raw)
To: Rob Herring (Arm)
Cc: Mark Rutland, Catalin Marinas, Jonathan Corbet, Marc Zyngier,
Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
James Clark, Anshuman Khandual, Leo Yan, linux-arm-kernel,
linux-perf-users, linux-kernel, linux-doc, kvmarm
On Mon, Apr 07, 2025 at 12:41:31PM -0500, Rob Herring (Arm) wrote:
> From: Anshuman Khandual <anshuman.khandual@arm.com>
>
> To use the Branch Record Buffer Extension (BRBE), some configuration is
> necessary at EL3 and EL2. This patch documents the requirements and adds
> the initial EL2 setup code, which largely consists of configuring the
> fine-grained traps and initializing a couple of BRBE control registers.
>
> Before this patch, __init_el2_fgt() would initialize HDFGRTR_EL2 and
> HDFGWTR_EL2 with the same value, relying on the read/write trap controls
> for a register occupying the same bit position in either register. The
> 'nBRBIDR' trap control only exists in bit 59 of HDFGRTR_EL2, while bit
> 59 of HDFGWTR_EL2 is RES0, and so this assumption no longer holds.
>
> To handle HDFGRTR_EL2 and HDFGWTR_EL2 having (slightly) different bit
> layouts, __init_el2_fgt() is changed to accumulate the HDFGRTR_EL2 and
> HDFGWTR_EL2 control bits separately. While making this change the
> open-coded value (1 << 62) is replaced with
> HDFG{R,W}TR_EL2_nPMSNEVFR_EL1_MASK.
>
> The BRBCR_EL1 and BRBCR_EL2 registers are unusual and require special
> initialisation: even though they are subject to E2H renaming, both have
> an effect regardless of HCR_EL2.TGE, even when running at EL2, and
> consequently both need to be initialised. This is handled in
> __init_el2_brbe() with a comment to explain the situation.
>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Oliver Upton <oliver.upton@linux.dev>
> Reviewed-by: Leo Yan <leo.yan@arm.com>
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> [Mark: rewrite commit message, fix typo in comment]
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: "Rob Herring (Arm)" <robh@kernel.org>
> Tested-by: James Clark <james.clark@linaro.org>
> ---
> v20:
> - Document that MDCR_EL3.SBRBE can be 0b01 also
> - Fix "HDFGWTR_EL2 is RES0" in commit msg
> ---
> Documentation/arch/arm64/booting.rst | 21 +++++++++
> arch/arm64/include/asm/el2_setup.h | 86 ++++++++++++++++++++++++++++++++++--
> 2 files changed, 104 insertions(+), 3 deletions(-)
It would be good to have an Ack from the kvm/arm64 side on this, but in
the meantime I've left some comments inline.
> diff --git a/Documentation/arch/arm64/booting.rst b/Documentation/arch/arm64/booting.rst
> index dee7b6de864f..a627c1e0e4a0 100644
> --- a/Documentation/arch/arm64/booting.rst
> +++ b/Documentation/arch/arm64/booting.rst
> @@ -358,6 +358,27 @@ Before jumping into the kernel, the following conditions must be met:
>
> - HWFGWTR_EL2.nSMPRI_EL1 (bit 54) must be initialised to 0b01.
>
> + For CPUs with feature Branch Record Buffer Extension (FEAT_BRBE):
This doesn't make sense ^^^
> diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h
> index ebceaae3c749..e7fcba1e7d8e 100644
> --- a/arch/arm64/include/asm/el2_setup.h
> +++ b/arch/arm64/include/asm/el2_setup.h
> @@ -189,6 +189,39 @@
> .Lskip_set_cptr_\@:
> .endm
>
> +/*
> + * Configure BRBE to permit recording cycle counts and branch mispredicts.
> + *
> + * At any EL, to record cycle counts BRBE requires that both BRBCR_EL2.CC=1 and
> + * BRBCR_EL1.CC=1.
> + *
> + * At any EL, to record branch mispredicts BRBE requires that both
> + * BRBCR_EL2.MPRED=1 and BRBCR_EL1.MPRED=1.
> + *
> + * When HCR_EL2.E2H=1, the BRBCR_EL1 encoding is redirected to BRBCR_EL2, but
> + * the {CC,MPRED} bits in the real BRBCR_EL1 register still apply.
> + *
> + * Set {CC,MPRED} in both BRBCR_EL2 and BRBCR_EL1 so that at runtime we only
> + * need to enable/disable these in BRBCR_EL1 regardless of whether the kernel
> + * ends up executing in EL1 or EL2.
> + */
> +.macro __init_el2_brbe
> + mrs x1, id_aa64dfr0_el1
> + ubfx x1, x1, #ID_AA64DFR0_EL1_BRBE_SHIFT, #4
> + cbz x1, .Lskip_brbe_\@
> +
> + mov_q x0, BRBCR_ELx_CC | BRBCR_ELx_MPRED
> + msr_s SYS_BRBCR_EL2, x0
> +
> + __check_hvhe .Lset_brbe_nvhe_\@, x1
> + msr_s SYS_BRBCR_EL12, x0 // VHE
> + b .Lskip_brbe_\@
> +
> +.Lset_brbe_nvhe_\@:
> + msr_s SYS_BRBCR_EL1, x0 // NVHE
> +.Lskip_brbe_\@:
Why do we have to poke BRBCR_EL12/BRBCR_EL1 here rather than in the BRBE
driver code?
> +.endm
> +
> /* Disable any fine grained traps */
> .macro __init_el2_fgt
> mrs x1, id_aa64mmfr0_el1
> @@ -196,16 +229,48 @@
> cbz x1, .Lskip_fgt_\@
>
> mov x0, xzr
> + mov x2, xzr
> mrs x1, id_aa64dfr0_el1
> ubfx x1, x1, #ID_AA64DFR0_EL1_PMSVer_SHIFT, #4
> cmp x1, #3
> b.lt .Lskip_spe_fgt_\@
> +
> /* Disable PMSNEVFR_EL1 read and write traps */
> - orr x0, x0, #(1 << 62)
> + orr x0, x0, #HDFGRTR_EL2_nPMSNEVFR_EL1_MASK
> + orr x2, x2, #HDFGWTR_EL2_nPMSNEVFR_EL1_MASK
>
> .Lskip_spe_fgt_\@:
> +#ifdef CONFIG_ARM64_BRBE
Why is this gated on CONFIG_ARM64_BRBE?
> + mrs x1, id_aa64dfr0_el1
> + ubfx x1, x1, #ID_AA64DFR0_EL1_BRBE_SHIFT, #4
> + cbz x1, .Lskip_brbe_reg_fgt_\@
> +
> + /*
> + * Disable read traps for the following registers
> + *
> + * [BRBSRC|BRBTGT|RBINF]_EL1
> + * [BRBSRCINJ|BRBTGTINJ|BRBINFINJ|BRBTS]_EL1
> + */
> + orr x0, x0, #HDFGRTR_EL2_nBRBDATA_MASK
> +
> + /*
> + * Disable write traps for the following registers
> + *
> + * [BRBSRCINJ|BRBTGTINJ|BRBINFINJ|BRBTS]_EL1
> + */
> + orr x2, x2, #HDFGWTR_EL2_nBRBDATA_MASK
> +
> + /* Disable read and write traps for [BRBCR|BRBFCR]_EL1 */
> + orr x0, x0, #HDFGRTR_EL2_nBRBCTL_MASK
> + orr x2, x2, #HDFGWTR_EL2_nBRBCTL_MASK
> +
> + /* Disable read traps for BRBIDR_EL1 */
> + orr x0, x0, #HDFGRTR_EL2_nBRBIDR_MASK
> +
> +.Lskip_brbe_reg_fgt_\@:
I think this label should become .Lset_debug_fgt_\@:. That way, we have
a clear point at which we're done with HDFG*TR_EL2. We can zero x0 and
x2 and from then on we can focus on HFG*TR + HFGITR.
nit: the existing .Lskip_debug_fgt_\@ label looks to be misnamed -- it
should probably be .Lskip_sme_fgt_\@ ?
> +#endif /* CONFIG_ARM64_BRBE */
> msr_s SYS_HDFGRTR_EL2, x0
> - msr_s SYS_HDFGWTR_EL2, x0
> + msr_s SYS_HDFGWTR_EL2, x2
nit: It would be cleaner to use x0/x1 for the pair of trap registers
but I can see that would be a more invasive change.
> mov x0, xzr
> mrs x1, id_aa64pfr1_el1
> @@ -246,7 +311,21 @@
> .Lset_fgt_\@:
> msr_s SYS_HFGRTR_EL2, x0
> msr_s SYS_HFGWTR_EL2, x0
> - msr_s SYS_HFGITR_EL2, xzr
> + mov x0, xzr
> +#ifdef CONFIG_ARM64_BRBE
> + mrs x1, id_aa64dfr0_el1
> + ubfx x1, x1, #ID_AA64DFR0_EL1_BRBE_SHIFT, #4
> + cbz x1, .Lskip_brbe_insn_fgt_\@
It would probably scale better if we unconditionally stick x2 in
HFGITR_EL2 and zero that register in '.Lset_debug_fgt_\@'. We still have
two checks for BRBE, but at least '.Lset_fgt_\@' could just stick to
writing the registers.
Will
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v21 1/4] arm64/sysreg: Add BRBE registers and fields
2025-04-07 17:41 ` [PATCH v21 1/4] arm64/sysreg: Add BRBE registers and fields Rob Herring (Arm)
@ 2025-05-19 14:08 ` Will Deacon
0 siblings, 0 replies; 18+ messages in thread
From: Will Deacon @ 2025-05-19 14:08 UTC (permalink / raw)
To: Rob Herring (Arm)
Cc: Mark Rutland, Catalin Marinas, Jonathan Corbet, Marc Zyngier,
Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
James Clark, Anshuman Khandual, Leo Yan, linux-arm-kernel,
linux-perf-users, linux-kernel, linux-doc, kvmarm, Mark Brown
On Mon, Apr 07, 2025 at 12:41:30PM -0500, Rob Herring (Arm) wrote:
> From: Anshuman Khandual <anshuman.khandual@arm.com>
>
> This patch adds definitions related to the Branch Record Buffer Extension
> (BRBE) as per ARM DDI 0487K.a. These will be used by KVM and a BRBE driver
> in subsequent patches.
>
> Some existing BRBE definitions in asm/sysreg.h are replaced with equivalent
> generated definitions.
>
> Cc: Marc Zyngier <maz@kernel.org>
> Reviewed-by: Mark Brown <broonie@kernel.org>
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: "Rob Herring (Arm)" <robh@kernel.org>
> Tested-by: James Clark <james.clark@linaro.org>
> ---
> v19:
> - split BRBINF.CC field into mantissa and exponent
> ---
> arch/arm64/include/asm/sysreg.h | 17 ++----
> arch/arm64/tools/sysreg | 132 ++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 138 insertions(+), 11 deletions(-)
Some of the field names here are made up, but overall they seem to
correspond to the architecture, so:
Acked-by: Will Deacon <will@kernel.org>
Will
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v21 3/4] KVM: arm64: nvhe: Disable branch generation in nVHE guests
2025-04-07 17:41 ` [PATCH v21 3/4] KVM: arm64: nvhe: Disable branch generation in nVHE guests Rob Herring (Arm)
@ 2025-05-19 14:11 ` Will Deacon
2025-05-28 16:09 ` Suzuki K Poulose
1 sibling, 0 replies; 18+ messages in thread
From: Will Deacon @ 2025-05-19 14:11 UTC (permalink / raw)
To: Rob Herring (Arm)
Cc: Mark Rutland, Catalin Marinas, Jonathan Corbet, Marc Zyngier,
Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
James Clark, Anshuman Khandual, Leo Yan, linux-arm-kernel,
linux-perf-users, linux-kernel, linux-doc, kvmarm
On Mon, Apr 07, 2025 at 12:41:32PM -0500, Rob Herring (Arm) wrote:
> From: Anshuman Khandual <anshuman.khandual@arm.com>
>
> While BRBE can record branches within guests, the host recording
> branches in guests is not supported by perf (though events are).
> Support for BRBE in guests will supported by providing direct access
> to BRBE within the guests. That is how x86 LBR works for guests.
> Therefore, BRBE needs to be disabled on guest entry and restored on
> exit.
>
> For nVHE, this requires explicit handling for guests. Before
> entering a guest, save the BRBE state and disable the it. When
> returning to the host, restore the state.
>
> For VHE, it is not necessary. We initialize
> BRBCR_EL1.{E1BRE,E0BRE}=={0,0} at boot time, and HCR_EL2.TGE==1 while
> running in the host. We configure BRBCR_EL2.{E2BRE,E0HBRE} to enable
> branch recording in the host. When entering the guest, we set
> HCR_EL2.TGE==0 which means BRBCR_EL1 is used instead of BRBCR_EL2.
> Consequently for VHE, BRBE recording is disabled at EL1 and EL0 when
> running a guest.
>
> Should recording in guests (by the host) ever be desired, the perf ABI
> will need to be extended to distinguish guest addresses (struct
> perf_branch_entry.priv) for starters. BRBE records would also need to be
> invalidated on guest entry/exit as guest/host EL1 and EL0 records can't
> be distinguished.
>
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Co-developed-by: Rob Herring (Arm) <robh@kernel.org>
> Tested-by: James Clark <james.clark@linaro.org>
> Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
> ---
> v20:
> - Reword commit message about no guest recording.
> - Add BRBE to __kvm_vcpu_run() synchronization comment
>
> v19:
> - Rework due to v6.14 debug flag changes
> - Redo commit message
> ---
> arch/arm64/include/asm/kvm_host.h | 2 ++
> arch/arm64/kvm/debug.c | 4 ++++
> arch/arm64/kvm/hyp/nvhe/debug-sr.c | 32 ++++++++++++++++++++++++++++++++
> arch/arm64/kvm/hyp/nvhe/switch.c | 2 +-
> 4 files changed, 39 insertions(+), 1 deletion(-)
This one also needs reviewing from the KVM/arm64 side...
Will
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v21 4/4] perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE)
2025-04-07 17:41 ` [PATCH v21 4/4] perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE) Rob Herring (Arm)
@ 2025-05-19 15:06 ` Will Deacon
2025-05-19 21:56 ` Rob Herring
0 siblings, 1 reply; 18+ messages in thread
From: Will Deacon @ 2025-05-19 15:06 UTC (permalink / raw)
To: Rob Herring (Arm)
Cc: Mark Rutland, Catalin Marinas, Jonathan Corbet, Marc Zyngier,
Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
James Clark, Anshuman Khandual, Leo Yan, linux-arm-kernel,
linux-perf-users, linux-kernel, linux-doc, kvmarm
Hey Rob,
On Mon, Apr 07, 2025 at 12:41:33PM -0500, Rob Herring (Arm) wrote:
> From: Anshuman Khandual <anshuman.khandual@arm.com>
>
> The ARMv9.2 architecture introduces the optional Branch Record Buffer
> Extension (BRBE), which records information about branches as they are
> executed into set of branch record registers. BRBE is similar to x86's
> Last Branch Record (LBR) and PowerPC's Branch History Rolling Buffer
> (BHRB).
Since you picked this up from v19, the driver has changed considerably
and I presume you will be continuing to extend it in future as the
architecture progresses. Perhaps having you listed as Author (and
crucially, in git blame :p) with Anshuman as a Co-developed-by: would be
more appropriate?
> ---
> drivers/perf/Kconfig | 11 +
> drivers/perf/Makefile | 1 +
> drivers/perf/arm_brbe.c | 802 +++++++++++++++++++++++++++++++++++++++++++
> drivers/perf/arm_brbe.h | 47 +++
> drivers/perf/arm_pmu.c | 15 +-
> drivers/perf/arm_pmuv3.c | 129 ++++++-
> include/linux/perf/arm_pmu.h | 8 +
> 7 files changed, 1006 insertions(+), 7 deletions(-)
Do you know if James Clark's tests [1] are going to be respun for the
perf tool? It would be handy to have some way to test this new
functionality.
> diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
> index 4e268de351c4..3be60ff4236d 100644
> --- a/drivers/perf/Kconfig
> +++ b/drivers/perf/Kconfig
> @@ -223,6 +223,17 @@ config ARM_SPE_PMU
> Extension, which provides periodic sampling of operations in
> the CPU pipeline and reports this via the perf AUX interface.
>
> +config ARM64_BRBE
> + bool "Enable support for branch stack sampling using FEAT_BRBE"
> + depends on ARM_PMUV3 && ARM64
> + default y
> + help
> + Enable perf support for Branch Record Buffer Extension (BRBE) which
> + records all branches taken in an execution path. This supports some
> + branch types and privilege based filtering. It captures additional
> + relevant information such as cycle count, misprediction and branch
> + type, branch privilege level etc.
It's a shame that this can't be modular, but I suppose the tight
integration with the CPU PMU driver precludes that. Oh well.
> diff --git a/drivers/perf/arm_brbe.c b/drivers/perf/arm_brbe.c
> new file mode 100644
> index 000000000000..2f254bd40af3
> --- /dev/null
> +++ b/drivers/perf/arm_brbe.c
(The driver code looks fine to me but I'd like an Ack from Mark on the
UAPI).
> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> index 2f33e69a8caf..df9867c0dc57 100644
> --- a/drivers/perf/arm_pmu.c
> +++ b/drivers/perf/arm_pmu.c
> @@ -99,7 +99,7 @@ static const struct pmu_irq_ops percpu_pmunmi_ops = {
> .free_pmuirq = armpmu_free_percpu_pmunmi
> };
>
> -static DEFINE_PER_CPU(struct arm_pmu *, cpu_armpmu);
> +DEFINE_PER_CPU(struct arm_pmu *, cpu_armpmu);
> static DEFINE_PER_CPU(int, cpu_irq);
> static DEFINE_PER_CPU(const struct pmu_irq_ops *, cpu_irq_ops);
>
> @@ -317,6 +317,11 @@ armpmu_del(struct perf_event *event, int flags)
> struct hw_perf_event *hwc = &event->hw;
> int idx = hwc->idx;
>
> + if (has_branch_stack(event)) {
> + hw_events->branch_users--;
> + perf_sched_cb_dec(event->pmu);
> + }
Shouldn't we decrement this *after* calling armpmu_stop()?
> +
> armpmu_stop(event, PERF_EF_UPDATE);
> hw_events->events[idx] = NULL;
> armpmu->clear_event_idx(hw_events, event);
[...]
> +static int branch_records_alloc(struct arm_pmu *armpmu)
> +{
> + struct perf_branch_stack *branch_stack_cpu;
> + struct perf_branch_stack __percpu *branch_stack;
> + size_t size = struct_size(branch_stack_cpu, entries, brbe_num_branch_records(armpmu));
> + int cpu;
> +
> + branch_stack = __alloc_percpu_gfp(size, __alignof__(*branch_stack_cpu),
> + GFP_KERNEL);
> + if (!branch_stack)
> + return -ENOMEM;
> +
> + for_each_possible_cpu(cpu) {
> + struct pmu_hw_events *events_cpu;
> +
> + events_cpu = per_cpu_ptr(armpmu->hw_events, cpu);
> + branch_stack_cpu = per_cpu_ptr(branch_stack, cpu);
> + events_cpu->branch_stack = branch_stack_cpu;
> + }
> + return 0;
> }
How does this work in a heterogeneous system? Shouldn't we at least
scope the allocation to the CPUs associated with this PMU?
> static int armv8pmu_probe_pmu(struct arm_pmu *cpu_pmu)
> @@ -1255,7 +1367,15 @@ static int armv8pmu_probe_pmu(struct arm_pmu *cpu_pmu)
> if (ret)
> return ret;
>
> - return probe.present ? 0 : -ENODEV;
> + if (!probe.present)
> + return -ENODEV;
> +
> + if (brbe_num_branch_records(cpu_pmu)) {
> + ret = branch_records_alloc(cpu_pmu);
> + if (ret)
> + return ret;
> + }
> + return 0;
> }
>
> static void armv8pmu_disable_user_access_ipi(void *unused)
> @@ -1314,6 +1434,7 @@ static int armv8_pmu_init(struct arm_pmu *cpu_pmu, char *name,
> cpu_pmu->set_event_filter = armv8pmu_set_event_filter;
>
> cpu_pmu->pmu.event_idx = armv8pmu_user_event_idx;
> + cpu_pmu->pmu.sched_task = armv8pmu_sched_task;
Can we avoid assigning this unless BRBE actually probed?
Will
[1] https://gitlab.arm.com/linux-arm/linux-jc/-/commit/3a7ddce70c2daadb63fcc511de0a89055ca48b32
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v21 2/4] arm64: Handle BRBE booting requirements
2025-05-19 14:07 ` Will Deacon
@ 2025-05-19 19:31 ` Rob Herring
0 siblings, 0 replies; 18+ messages in thread
From: Rob Herring @ 2025-05-19 19:31 UTC (permalink / raw)
To: Will Deacon
Cc: Mark Rutland, Catalin Marinas, Jonathan Corbet, Marc Zyngier,
Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
James Clark, Anshuman Khandual, Leo Yan, linux-arm-kernel,
linux-perf-users, linux-kernel, linux-doc, kvmarm
On Mon, May 19, 2025 at 03:07:15PM +0100, Will Deacon wrote:
> On Mon, Apr 07, 2025 at 12:41:31PM -0500, Rob Herring (Arm) wrote:
> > From: Anshuman Khandual <anshuman.khandual@arm.com>
> >
> > To use the Branch Record Buffer Extension (BRBE), some configuration is
> > necessary at EL3 and EL2. This patch documents the requirements and adds
> > the initial EL2 setup code, which largely consists of configuring the
> > fine-grained traps and initializing a couple of BRBE control registers.
> >
> > Before this patch, __init_el2_fgt() would initialize HDFGRTR_EL2 and
> > HDFGWTR_EL2 with the same value, relying on the read/write trap controls
> > for a register occupying the same bit position in either register. The
> > 'nBRBIDR' trap control only exists in bit 59 of HDFGRTR_EL2, while bit
> > 59 of HDFGWTR_EL2 is RES0, and so this assumption no longer holds.
> >
> > To handle HDFGRTR_EL2 and HDFGWTR_EL2 having (slightly) different bit
> > layouts, __init_el2_fgt() is changed to accumulate the HDFGRTR_EL2 and
> > HDFGWTR_EL2 control bits separately. While making this change the
> > open-coded value (1 << 62) is replaced with
> > HDFG{R,W}TR_EL2_nPMSNEVFR_EL1_MASK.
> >
> > The BRBCR_EL1 and BRBCR_EL2 registers are unusual and require special
> > initialisation: even though they are subject to E2H renaming, both have
> > an effect regardless of HCR_EL2.TGE, even when running at EL2, and
> > consequently both need to be initialised. This is handled in
> > __init_el2_brbe() with a comment to explain the situation.
> >
> > Cc: Marc Zyngier <maz@kernel.org>
> > Cc: Oliver Upton <oliver.upton@linux.dev>
> > Reviewed-by: Leo Yan <leo.yan@arm.com>
> > Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> > [Mark: rewrite commit message, fix typo in comment]
> > Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> > Signed-off-by: "Rob Herring (Arm)" <robh@kernel.org>
> > Tested-by: James Clark <james.clark@linaro.org>
> > ---
> > v20:
> > - Document that MDCR_EL3.SBRBE can be 0b01 also
> > - Fix "HDFGWTR_EL2 is RES0" in commit msg
> > ---
> > Documentation/arch/arm64/booting.rst | 21 +++++++++
> > arch/arm64/include/asm/el2_setup.h | 86 ++++++++++++++++++++++++++++++++++--
> > 2 files changed, 104 insertions(+), 3 deletions(-)
>
> It would be good to have an Ack from the kvm/arm64 side on this, but in
> the meantime I've left some comments inline.
>
> > diff --git a/Documentation/arch/arm64/booting.rst b/Documentation/arch/arm64/booting.rst
> > index dee7b6de864f..a627c1e0e4a0 100644
> > --- a/Documentation/arch/arm64/booting.rst
> > +++ b/Documentation/arch/arm64/booting.rst
> > @@ -358,6 +358,27 @@ Before jumping into the kernel, the following conditions must be met:
> >
> > - HWFGWTR_EL2.nSMPRI_EL1 (bit 54) must be initialised to 0b01.
> >
> > + For CPUs with feature Branch Record Buffer Extension (FEAT_BRBE):
>
> This doesn't make sense ^^^
You mean it should be "with the Branch Record Buffer Extension feature"
instead?
>
> > diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h
> > index ebceaae3c749..e7fcba1e7d8e 100644
> > --- a/arch/arm64/include/asm/el2_setup.h
> > +++ b/arch/arm64/include/asm/el2_setup.h
> > @@ -189,6 +189,39 @@
> > .Lskip_set_cptr_\@:
> > .endm
> >
> > +/*
> > + * Configure BRBE to permit recording cycle counts and branch mispredicts.
> > + *
> > + * At any EL, to record cycle counts BRBE requires that both BRBCR_EL2.CC=1 and
> > + * BRBCR_EL1.CC=1.
> > + *
> > + * At any EL, to record branch mispredicts BRBE requires that both
> > + * BRBCR_EL2.MPRED=1 and BRBCR_EL1.MPRED=1.
> > + *
> > + * When HCR_EL2.E2H=1, the BRBCR_EL1 encoding is redirected to BRBCR_EL2, but
> > + * the {CC,MPRED} bits in the real BRBCR_EL1 register still apply.
> > + *
> > + * Set {CC,MPRED} in both BRBCR_EL2 and BRBCR_EL1 so that at runtime we only
> > + * need to enable/disable these in BRBCR_EL1 regardless of whether the kernel
> > + * ends up executing in EL1 or EL2.
> > + */
> > +.macro __init_el2_brbe
> > + mrs x1, id_aa64dfr0_el1
> > + ubfx x1, x1, #ID_AA64DFR0_EL1_BRBE_SHIFT, #4
> > + cbz x1, .Lskip_brbe_\@
> > +
> > + mov_q x0, BRBCR_ELx_CC | BRBCR_ELx_MPRED
> > + msr_s SYS_BRBCR_EL2, x0
> > +
> > + __check_hvhe .Lset_brbe_nvhe_\@, x1
> > + msr_s SYS_BRBCR_EL12, x0 // VHE
> > + b .Lskip_brbe_\@
> > +
> > +.Lset_brbe_nvhe_\@:
> > + msr_s SYS_BRBCR_EL1, x0 // NVHE
> > +.Lskip_brbe_\@:
>
> Why do we have to poke BRBCR_EL12/BRBCR_EL1 here rather than in the BRBE
> driver code?
Yeah, I think we can drop this. Originally, the driver did not touch
SYS_BRBCR_EL12, but it turns out we need to for freeze on overflow to
work correctly with VHE (see the comment in the driver for
SYS_BRBCR_EL12 access).
The only other reason I can come up with is some fields reset to UNKNOWN
and there may be no driver. However, the important ones, ExBRE, reset to
0, so we should be fine.
> > +.endm
> > +
> > /* Disable any fine grained traps */
> > .macro __init_el2_fgt
> > mrs x1, id_aa64mmfr0_el1
> > @@ -196,16 +229,48 @@
> > cbz x1, .Lskip_fgt_\@
> >
> > mov x0, xzr
> > + mov x2, xzr
> > mrs x1, id_aa64dfr0_el1
> > ubfx x1, x1, #ID_AA64DFR0_EL1_PMSVer_SHIFT, #4
> > cmp x1, #3
> > b.lt .Lskip_spe_fgt_\@
> > +
> > /* Disable PMSNEVFR_EL1 read and write traps */
> > - orr x0, x0, #(1 << 62)
> > + orr x0, x0, #HDFGRTR_EL2_nPMSNEVFR_EL1_MASK
> > + orr x2, x2, #HDFGWTR_EL2_nPMSNEVFR_EL1_MASK
> >
> > .Lskip_spe_fgt_\@:
> > +#ifdef CONFIG_ARM64_BRBE
>
> Why is this gated on CONFIG_ARM64_BRBE?
Shrug. We don't do that anywhere else, so I'll drop it.
> > + mrs x1, id_aa64dfr0_el1
> > + ubfx x1, x1, #ID_AA64DFR0_EL1_BRBE_SHIFT, #4
> > + cbz x1, .Lskip_brbe_reg_fgt_\@
> > +
> > + /*
> > + * Disable read traps for the following registers
> > + *
> > + * [BRBSRC|BRBTGT|RBINF]_EL1
> > + * [BRBSRCINJ|BRBTGTINJ|BRBINFINJ|BRBTS]_EL1
> > + */
> > + orr x0, x0, #HDFGRTR_EL2_nBRBDATA_MASK
> > +
> > + /*
> > + * Disable write traps for the following registers
> > + *
> > + * [BRBSRCINJ|BRBTGTINJ|BRBINFINJ|BRBTS]_EL1
> > + */
> > + orr x2, x2, #HDFGWTR_EL2_nBRBDATA_MASK
> > +
> > + /* Disable read and write traps for [BRBCR|BRBFCR]_EL1 */
> > + orr x0, x0, #HDFGRTR_EL2_nBRBCTL_MASK
> > + orr x2, x2, #HDFGWTR_EL2_nBRBCTL_MASK
> > +
> > + /* Disable read traps for BRBIDR_EL1 */
> > + orr x0, x0, #HDFGRTR_EL2_nBRBIDR_MASK
> > +
> > +.Lskip_brbe_reg_fgt_\@:
>
> I think this label should become .Lset_debug_fgt_\@:. That way, we have
> a clear point at which we're done with HDFG*TR_EL2. We can zero x0 and
> x2 and from then on we can focus on HFG*TR + HFGITR.
>
> nit: the existing .Lskip_debug_fgt_\@ label looks to be misnamed -- it
> should probably be .Lskip_sme_fgt_\@ ?
>
> > +#endif /* CONFIG_ARM64_BRBE */
> > msr_s SYS_HDFGRTR_EL2, x0
> > - msr_s SYS_HDFGWTR_EL2, x0
> > + msr_s SYS_HDFGWTR_EL2, x2
>
> nit: It would be cleaner to use x0/x1 for the pair of trap registers
> but I can see that would be a more invasive change.
That would not be worth the churn IMO. We'd want to just swap x1 and x2
everywhere so we consistently use x2 for ID registers.
And then I'd be to blame for *all* this wonderful code. ;)
>
> > mov x0, xzr
> > mrs x1, id_aa64pfr1_el1
> > @@ -246,7 +311,21 @@
> > .Lset_fgt_\@:
> > msr_s SYS_HFGRTR_EL2, x0
> > msr_s SYS_HFGWTR_EL2, x0
> > - msr_s SYS_HFGITR_EL2, xzr
> > + mov x0, xzr
> > +#ifdef CONFIG_ARM64_BRBE
> > + mrs x1, id_aa64dfr0_el1
> > + ubfx x1, x1, #ID_AA64DFR0_EL1_BRBE_SHIFT, #4
> > + cbz x1, .Lskip_brbe_insn_fgt_\@
>
> It would probably scale better if we unconditionally stick x2 in
> HFGITR_EL2 and zero that register in '.Lset_debug_fgt_\@'. We still have
> two checks for BRBE, but at least '.Lset_fgt_\@' could just stick to
> writing the registers.
All that looks doable.
Rob
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v21 4/4] perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE)
2025-05-19 15:06 ` Will Deacon
@ 2025-05-19 21:56 ` Rob Herring
2025-05-20 22:22 ` Rob Herring
2025-05-21 15:58 ` James Clark
0 siblings, 2 replies; 18+ messages in thread
From: Rob Herring @ 2025-05-19 21:56 UTC (permalink / raw)
To: Will Deacon
Cc: Mark Rutland, Catalin Marinas, Jonathan Corbet, Marc Zyngier,
Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
James Clark, Anshuman Khandual, Leo Yan, linux-arm-kernel,
linux-perf-users, linux-kernel, linux-doc, kvmarm
On Mon, May 19, 2025 at 04:06:22PM +0100, Will Deacon wrote:
> Hey Rob,
>
> On Mon, Apr 07, 2025 at 12:41:33PM -0500, Rob Herring (Arm) wrote:
> > From: Anshuman Khandual <anshuman.khandual@arm.com>
> >
> > The ARMv9.2 architecture introduces the optional Branch Record Buffer
> > Extension (BRBE), which records information about branches as they are
> > executed into set of branch record registers. BRBE is similar to x86's
> > Last Branch Record (LBR) and PowerPC's Branch History Rolling Buffer
> > (BHRB).
>
> Since you picked this up from v19, the driver has changed considerably
> and I presume you will be continuing to extend it in future as the
> architecture progresses. Perhaps having you listed as Author (and
> crucially, in git blame :p) with Anshuman as a Co-developed-by: would be
> more appropriate?
Shrug.
> > ---
> > drivers/perf/Kconfig | 11 +
> > drivers/perf/Makefile | 1 +
> > drivers/perf/arm_brbe.c | 802 +++++++++++++++++++++++++++++++++++++++++++
> > drivers/perf/arm_brbe.h | 47 +++
> > drivers/perf/arm_pmu.c | 15 +-
> > drivers/perf/arm_pmuv3.c | 129 ++++++-
> > include/linux/perf/arm_pmu.h | 8 +
> > 7 files changed, 1006 insertions(+), 7 deletions(-)
>
> Do you know if James Clark's tests [1] are going to be respun for the
> perf tool? It would be handy to have some way to test this new
> functionality.
Yes. I dropped them here because I've been told by Arnaldo in the past
to send userspace stuff separately.
> > diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
> > index 4e268de351c4..3be60ff4236d 100644
> > --- a/drivers/perf/Kconfig
> > +++ b/drivers/perf/Kconfig
> > @@ -223,6 +223,17 @@ config ARM_SPE_PMU
> > Extension, which provides periodic sampling of operations in
> > the CPU pipeline and reports this via the perf AUX interface.
> >
> > +config ARM64_BRBE
> > + bool "Enable support for branch stack sampling using FEAT_BRBE"
> > + depends on ARM_PMUV3 && ARM64
> > + default y
> > + help
> > + Enable perf support for Branch Record Buffer Extension (BRBE) which
> > + records all branches taken in an execution path. This supports some
> > + branch types and privilege based filtering. It captures additional
> > + relevant information such as cycle count, misprediction and branch
> > + type, branch privilege level etc.
>
> It's a shame that this can't be modular, but I suppose the tight
> integration with the CPU PMU driver precludes that. Oh well.
>
> > diff --git a/drivers/perf/arm_brbe.c b/drivers/perf/arm_brbe.c
> > new file mode 100644
> > index 000000000000..2f254bd40af3
> > --- /dev/null
> > +++ b/drivers/perf/arm_brbe.c
>
> (The driver code looks fine to me but I'd like an Ack from Mark on the
> UAPI).
>
> > diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> > index 2f33e69a8caf..df9867c0dc57 100644
> > --- a/drivers/perf/arm_pmu.c
> > +++ b/drivers/perf/arm_pmu.c
> > @@ -99,7 +99,7 @@ static const struct pmu_irq_ops percpu_pmunmi_ops = {
> > .free_pmuirq = armpmu_free_percpu_pmunmi
> > };
> >
> > -static DEFINE_PER_CPU(struct arm_pmu *, cpu_armpmu);
> > +DEFINE_PER_CPU(struct arm_pmu *, cpu_armpmu);
> > static DEFINE_PER_CPU(int, cpu_irq);
> > static DEFINE_PER_CPU(const struct pmu_irq_ops *, cpu_irq_ops);
> >
> > @@ -317,6 +317,11 @@ armpmu_del(struct perf_event *event, int flags)
> > struct hw_perf_event *hwc = &event->hw;
> > int idx = hwc->idx;
> >
> > + if (has_branch_stack(event)) {
> > + hw_events->branch_users--;
> > + perf_sched_cb_dec(event->pmu);
> > + }
>
> Shouldn't we decrement this *after* calling armpmu_stop()?
Logically, that would make more sense. It's all serialized by the perf
core though, so we can't get .sched_task() during this.
> > +
> > armpmu_stop(event, PERF_EF_UPDATE);
> > hw_events->events[idx] = NULL;
> > armpmu->clear_event_idx(hw_events, event);
>
> [...]
>
> > +static int branch_records_alloc(struct arm_pmu *armpmu)
> > +{
> > + struct perf_branch_stack *branch_stack_cpu;
> > + struct perf_branch_stack __percpu *branch_stack;
> > + size_t size = struct_size(branch_stack_cpu, entries, brbe_num_branch_records(armpmu));
> > + int cpu;
> > +
> > + branch_stack = __alloc_percpu_gfp(size, __alignof__(*branch_stack_cpu),
> > + GFP_KERNEL);
> > + if (!branch_stack)
> > + return -ENOMEM;
> > +
> > + for_each_possible_cpu(cpu) {
> > + struct pmu_hw_events *events_cpu;
> > +
> > + events_cpu = per_cpu_ptr(armpmu->hw_events, cpu);
> > + branch_stack_cpu = per_cpu_ptr(branch_stack, cpu);
> > + events_cpu->branch_stack = branch_stack_cpu;
> > + }
> > + return 0;
> > }
>
> How does this work in a heterogeneous system? Shouldn't we at least
> scope the allocation to the CPUs associated with this PMU?
Leaks memory, that's how.
As a bonus, it could overrun some memory too if the record sizes are
different though mostly overrunning into other BRBE buffers...
I think we just need to loop over cpu_pmu->supported_cpus and call
kmalloc(). Since events_cpu is already percpu, we don't need a percpu
allocation of the branch stack buffers.
I'm assuming it is safe to assume all CPUs either have BRBE or they
don't? Different record sizes at least should work fine (other than the
above issue).
BTW, I was scratching my head how armpmu_alloc() works with
for_each_possible_cpu(), but I guess the hotplug callbacks overwrite the
events->percpu_pmu value. I think we could just remove the loop there.
I'll investigate that.
Rob
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v21 4/4] perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE)
2025-05-19 21:56 ` Rob Herring
@ 2025-05-20 22:22 ` Rob Herring
2025-05-21 15:58 ` James Clark
1 sibling, 0 replies; 18+ messages in thread
From: Rob Herring @ 2025-05-20 22:22 UTC (permalink / raw)
To: Will Deacon
Cc: Mark Rutland, Catalin Marinas, Jonathan Corbet, Marc Zyngier,
Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
James Clark, Anshuman Khandual, Leo Yan, linux-arm-kernel,
linux-perf-users, linux-kernel, linux-doc, kvmarm
On Mon, May 19, 2025 at 4:59 PM Rob Herring <robh@kernel.org> wrote:
>
> On Mon, May 19, 2025 at 04:06:22PM +0100, Will Deacon wrote:
> > Hey Rob,
> >
> > On Mon, Apr 07, 2025 at 12:41:33PM -0500, Rob Herring (Arm) wrote:
> > > From: Anshuman Khandual <anshuman.khandual@arm.com>
> > >
> > > The ARMv9.2 architecture introduces the optional Branch Record Buffer
> > > Extension (BRBE), which records information about branches as they are
> > > executed into set of branch record registers. BRBE is similar to x86's
> > > Last Branch Record (LBR) and PowerPC's Branch History Rolling Buffer
> > > (BHRB).
> >
> > Since you picked this up from v19, the driver has changed considerably
> > and I presume you will be continuing to extend it in future as the
> > architecture progresses. Perhaps having you listed as Author (and
> > crucially, in git blame :p) with Anshuman as a Co-developed-by: would be
> > more appropriate?
>
> Shrug.
I did a comparison to v18 arm_brbe.c. It was 1134 lines. I (and Mark)
have removed 332 lines and changed 369 lines. So a little more than
half remains, but the unchanged parts are a lot of 1 line helper
functions, brackets and whitespace. So given the more complex parts
have changed, I will change the authorship.
Rob
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v21 4/4] perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE)
2025-05-19 21:56 ` Rob Herring
2025-05-20 22:22 ` Rob Herring
@ 2025-05-21 15:58 ` James Clark
2025-05-27 10:50 ` Will Deacon
1 sibling, 1 reply; 18+ messages in thread
From: James Clark @ 2025-05-21 15:58 UTC (permalink / raw)
To: Rob Herring, Will Deacon
Cc: Mark Rutland, Catalin Marinas, Jonathan Corbet, Marc Zyngier,
Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
Anshuman Khandual, Leo Yan, linux-arm-kernel, linux-perf-users,
linux-kernel, linux-doc, kvmarm
On 19/05/2025 10:56 pm, Rob Herring wrote:
> On Mon, May 19, 2025 at 04:06:22PM +0100, Will Deacon wrote:
>> Hey Rob,
>>
>> On Mon, Apr 07, 2025 at 12:41:33PM -0500, Rob Herring (Arm) wrote:
>>> From: Anshuman Khandual <anshuman.khandual@arm.com>
>>>
>>> The ARMv9.2 architecture introduces the optional Branch Record Buffer
>>> Extension (BRBE), which records information about branches as they are
>>> executed into set of branch record registers. BRBE is similar to x86's
>>> Last Branch Record (LBR) and PowerPC's Branch History Rolling Buffer
>>> (BHRB).
>>
>> Since you picked this up from v19, the driver has changed considerably
>> and I presume you will be continuing to extend it in future as the
>> architecture progresses. Perhaps having you listed as Author (and
>> crucially, in git blame :p) with Anshuman as a Co-developed-by: would be
>> more appropriate?
>
> Shrug.
>
>>> ---
>>> drivers/perf/Kconfig | 11 +
>>> drivers/perf/Makefile | 1 +
>>> drivers/perf/arm_brbe.c | 802 +++++++++++++++++++++++++++++++++++++++++++
>>> drivers/perf/arm_brbe.h | 47 +++
>>> drivers/perf/arm_pmu.c | 15 +-
>>> drivers/perf/arm_pmuv3.c | 129 ++++++-
>>> include/linux/perf/arm_pmu.h | 8 +
>>> 7 files changed, 1006 insertions(+), 7 deletions(-)
>>
>> Do you know if James Clark's tests [1] are going to be respun for the
>> perf tool? It would be handy to have some way to test this new
>> functionality.
>
> Yes. I dropped them here because I've been told by Arnaldo in the past
> to send userspace stuff separately.
>
That version of the test was out of date so I've pushed the new version
here:
https://git.linaro.org/plugins/gitiles/people/james.clark/linux.git/+/16e4a18c2d5fc53736f05c9052b1d11d74909707
But I'll wait for the driver changes to be finalised before posting it.
Or Rob can take it back into the patchset.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v21 4/4] perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE)
2025-05-21 15:58 ` James Clark
@ 2025-05-27 10:50 ` Will Deacon
2025-05-27 17:57 ` James Clark
0 siblings, 1 reply; 18+ messages in thread
From: Will Deacon @ 2025-05-27 10:50 UTC (permalink / raw)
To: James Clark
Cc: Rob Herring, Mark Rutland, Catalin Marinas, Jonathan Corbet,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, Anshuman Khandual, Leo Yan, linux-arm-kernel,
linux-perf-users, linux-kernel, linux-doc, kvmarm
On Wed, May 21, 2025 at 04:58:42PM +0100, James Clark wrote:
>
>
> On 19/05/2025 10:56 pm, Rob Herring wrote:
> > On Mon, May 19, 2025 at 04:06:22PM +0100, Will Deacon wrote:
> > > Hey Rob,
> > >
> > > On Mon, Apr 07, 2025 at 12:41:33PM -0500, Rob Herring (Arm) wrote:
> > > > From: Anshuman Khandual <anshuman.khandual@arm.com>
> > > >
> > > > The ARMv9.2 architecture introduces the optional Branch Record Buffer
> > > > Extension (BRBE), which records information about branches as they are
> > > > executed into set of branch record registers. BRBE is similar to x86's
> > > > Last Branch Record (LBR) and PowerPC's Branch History Rolling Buffer
> > > > (BHRB).
> > >
> > > Since you picked this up from v19, the driver has changed considerably
> > > and I presume you will be continuing to extend it in future as the
> > > architecture progresses. Perhaps having you listed as Author (and
> > > crucially, in git blame :p) with Anshuman as a Co-developed-by: would be
> > > more appropriate?
> >
> > Shrug.
> >
> > > > ---
> > > > drivers/perf/Kconfig | 11 +
> > > > drivers/perf/Makefile | 1 +
> > > > drivers/perf/arm_brbe.c | 802 +++++++++++++++++++++++++++++++++++++++++++
> > > > drivers/perf/arm_brbe.h | 47 +++
> > > > drivers/perf/arm_pmu.c | 15 +-
> > > > drivers/perf/arm_pmuv3.c | 129 ++++++-
> > > > include/linux/perf/arm_pmu.h | 8 +
> > > > 7 files changed, 1006 insertions(+), 7 deletions(-)
> > >
> > > Do you know if James Clark's tests [1] are going to be respun for the
> > > perf tool? It would be handy to have some way to test this new
> > > functionality.
> >
> > Yes. I dropped them here because I've been told by Arnaldo in the past
> > to send userspace stuff separately.
> >
>
> That version of the test was out of date so I've pushed the new version
> here: https://git.linaro.org/plugins/gitiles/people/james.clark/linux.git/+/16e4a18c2d5fc53736f05c9052b1d11d74909707
>
> But I'll wait for the driver changes to be finalised before posting it. Or
> Rob can take it back into the patchset.
Thanks, James. I just wanted to make sure that there was still a plan
for upstreaming these separately from the driver. Have you had a chance
to run them against this v21?
Will
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v21 4/4] perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE)
2025-05-27 10:50 ` Will Deacon
@ 2025-05-27 17:57 ` James Clark
0 siblings, 0 replies; 18+ messages in thread
From: James Clark @ 2025-05-27 17:57 UTC (permalink / raw)
To: Will Deacon
Cc: Rob Herring, Mark Rutland, Catalin Marinas, Jonathan Corbet,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, Anshuman Khandual, Leo Yan, linux-arm-kernel,
linux-perf-users, linux-kernel, linux-doc, kvmarm
On 27/05/2025 11:50 am, Will Deacon wrote:
> On Wed, May 21, 2025 at 04:58:42PM +0100, James Clark wrote:
>>
>>
>> On 19/05/2025 10:56 pm, Rob Herring wrote:
>>> On Mon, May 19, 2025 at 04:06:22PM +0100, Will Deacon wrote:
>>>> Hey Rob,
>>>>
>>>> On Mon, Apr 07, 2025 at 12:41:33PM -0500, Rob Herring (Arm) wrote:
>>>>> From: Anshuman Khandual <anshuman.khandual@arm.com>
>>>>>
>>>>> The ARMv9.2 architecture introduces the optional Branch Record Buffer
>>>>> Extension (BRBE), which records information about branches as they are
>>>>> executed into set of branch record registers. BRBE is similar to x86's
>>>>> Last Branch Record (LBR) and PowerPC's Branch History Rolling Buffer
>>>>> (BHRB).
>>>>
>>>> Since you picked this up from v19, the driver has changed considerably
>>>> and I presume you will be continuing to extend it in future as the
>>>> architecture progresses. Perhaps having you listed as Author (and
>>>> crucially, in git blame :p) with Anshuman as a Co-developed-by: would be
>>>> more appropriate?
>>>
>>> Shrug.
>>>
>>>>> ---
>>>>> drivers/perf/Kconfig | 11 +
>>>>> drivers/perf/Makefile | 1 +
>>>>> drivers/perf/arm_brbe.c | 802 +++++++++++++++++++++++++++++++++++++++++++
>>>>> drivers/perf/arm_brbe.h | 47 +++
>>>>> drivers/perf/arm_pmu.c | 15 +-
>>>>> drivers/perf/arm_pmuv3.c | 129 ++++++-
>>>>> include/linux/perf/arm_pmu.h | 8 +
>>>>> 7 files changed, 1006 insertions(+), 7 deletions(-)
>>>>
>>>> Do you know if James Clark's tests [1] are going to be respun for the
>>>> perf tool? It would be handy to have some way to test this new
>>>> functionality.
>>>
>>> Yes. I dropped them here because I've been told by Arnaldo in the past
>>> to send userspace stuff separately.
>>>
>>
>> That version of the test was out of date so I've pushed the new version
>> here: https://git.linaro.org/plugins/gitiles/people/james.clark/linux.git/+/16e4a18c2d5fc53736f05c9052b1d11d74909707
>>
>> But I'll wait for the driver changes to be finalised before posting it. Or
>> Rob can take it back into the patchset.
>
> Thanks, James. I just wanted to make sure that there was still a plan
> for upstreaming these separately from the driver. Have you had a chance
> to run them against this v21?
>
> Will
Yes I'll send it once the driver is merged. And yes they're passing
against v21, but I'll retest v22 as well.
James
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v21 3/4] KVM: arm64: nvhe: Disable branch generation in nVHE guests
2025-04-07 17:41 ` [PATCH v21 3/4] KVM: arm64: nvhe: Disable branch generation in nVHE guests Rob Herring (Arm)
2025-05-19 14:11 ` Will Deacon
@ 2025-05-28 16:09 ` Suzuki K Poulose
1 sibling, 0 replies; 18+ messages in thread
From: Suzuki K Poulose @ 2025-05-28 16:09 UTC (permalink / raw)
To: Rob Herring (Arm), Will Deacon, Mark Rutland, Catalin Marinas,
Jonathan Corbet, Marc Zyngier, Oliver Upton, Joey Gouly,
Zenghui Yu, James Clark, Anshuman Khandual, Leo Yan
Cc: linux-arm-kernel, linux-perf-users, linux-kernel, linux-doc,
kvmarm
On 07/04/2025 18:41, Rob Herring (Arm) wrote:
> From: Anshuman Khandual <anshuman.khandual@arm.com>
>
> While BRBE can record branches within guests, the host recording
> branches in guests is not supported by perf (though events are).
> Support for BRBE in guests will supported by providing direct access
> to BRBE within the guests. That is how x86 LBR works for guests.
> Therefore, BRBE needs to be disabled on guest entry and restored on
> exit.
>
> For nVHE, this requires explicit handling for guests. Before
> entering a guest, save the BRBE state and disable the it. When
> returning to the host, restore the state.
>
> For VHE, it is not necessary. We initialize
> BRBCR_EL1.{E1BRE,E0BRE}=={0,0} at boot time, and HCR_EL2.TGE==1 while
> running in the host. We configure BRBCR_EL2.{E2BRE,E0HBRE} to enable
> branch recording in the host. When entering the guest, we set
> HCR_EL2.TGE==0 which means BRBCR_EL1 is used instead of BRBCR_EL2.
> Consequently for VHE, BRBE recording is disabled at EL1 and EL0 when
> running a guest.
>
> Should recording in guests (by the host) ever be desired, the perf ABI
> will need to be extended to distinguish guest addresses (struct
> perf_branch_entry.priv) for starters. BRBE records would also need to be
> invalidated on guest entry/exit as guest/host EL1 and EL0 records can't
> be distinguished.
>
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Co-developed-by: Rob Herring (Arm) <robh@kernel.org>
> Tested-by: James Clark <james.clark@linaro.org>
> Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2025-05-28 16:09 UTC | newest]
Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-07 17:41 [PATCH v21 0/4] arm64/perf: Enable branch stack sampling Rob Herring (Arm)
2025-04-07 17:41 ` [PATCH v21 1/4] arm64/sysreg: Add BRBE registers and fields Rob Herring (Arm)
2025-05-19 14:08 ` Will Deacon
2025-04-07 17:41 ` [PATCH v21 2/4] arm64: Handle BRBE booting requirements Rob Herring (Arm)
2025-05-19 14:07 ` Will Deacon
2025-05-19 19:31 ` Rob Herring
2025-04-07 17:41 ` [PATCH v21 3/4] KVM: arm64: nvhe: Disable branch generation in nVHE guests Rob Herring (Arm)
2025-05-19 14:11 ` Will Deacon
2025-05-28 16:09 ` Suzuki K Poulose
2025-04-07 17:41 ` [PATCH v21 4/4] perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE) Rob Herring (Arm)
2025-05-19 15:06 ` Will Deacon
2025-05-19 21:56 ` Rob Herring
2025-05-20 22:22 ` Rob Herring
2025-05-21 15:58 ` James Clark
2025-05-27 10:50 ` Will Deacon
2025-05-27 17:57 ` James Clark
2025-05-06 14:47 ` [PATCH v21 0/4] arm64/perf: Enable branch stack sampling Jonathan Cameron
2025-05-06 21:30 ` Rob Herring
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).