* [PATCH V6 0/7] arm64/perf: Enable branch stack sampling
From: Anshuman Khandual @ 2022-12-08 8:43 UTC
To: linux-arm-kernel, linux-kernel, mark.rutland
Cc: Anshuman Khandual, Catalin Marinas, Will Deacon, Mark Brown,
James Clark, Rob Herring, Marc Zyngier, Suzuki Poulose,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
linux-perf-users
This series enables perf branch stack sampling support on the arm64 platform
via a new architecture feature called the Branch Record Buffer Extension
(BRBE). All the relevant register definitions can be found here:
https://developer.arm.com/documentation/ddi0601/2021-12/AArch64-Registers
This series applies on top of v6.1-rc8.
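Once applied on a platform implementing FEAT_BRBE, branch stack sampling
can be exercised with the standard perf tooling, for example:

  perf record -j any,u -- <workload>
  perf report

Platforms without BRBE continue to reject such events with -EOPNOTSUPP.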
Changes in V6:
- Restore the exception level privilege after reading the branch records
- Unpause the buffer after reading the branch records
- Decouple BRBCR_EL1_EXCEPTION/ERTN from perf event privilege level
- Reworked BRBE implementation and branch stack sampling support on arm pmu
- BRBE implementation is now part of overall ARMV8 PMU implementation
- BRBE implementation moved from drivers/perf/ to inside arch/arm64/kernel/
- CONFIG_ARM_BRBE_PMU renamed as CONFIG_ARM64_BRBE in arch/arm64/Kconfig
- File moved - drivers/perf/arm_pmu_brbe.c -> arch/arm64/kernel/brbe.c
- File moved - drivers/perf/arm_pmu_brbe.h -> arch/arm64/kernel/brbe.h
- BRBE name has been dropped from struct arm_pmu and struct hw_pmu_events
- BRBE name has been abstracted out as 'branches' in arm_pmu and hw_pmu_events
- BRBE name has been abstracted out as 'branches' in ARMV8 PMU implementation
- Added sched_task() callback into struct arm_pmu
- Added 'hw_attr' into struct arm_pmu encapsulating possible PMU HW attributes
- Dropped explicit attributes brbe_(v1p1, nr, cc, format) from struct arm_pmu
- Dropped brbfcr, brbcr, registers scratch area from struct hw_pmu_events
- Dropped brbe_users, brbe_context tracking in struct hw_pmu_events
- Added 'features' tracking into struct arm_pmu with ARM_PMU_BRANCH_STACK flag
- armpmu->hw_attr maps into 'struct brbe_hw_attr' inside BRBE implementation
- Set ARM_PMU_BRANCH_STACK in 'arm_pmu->features' after successful BRBE probe
- Added armv8pmu_branch_reset() inside armv8pmu_branch_enable()
- Dropped brbe_supported() as events will be rejected via ARM_PMU_BRANCH_STACK
- Dropped set_brbe_disabled() as well
- Reformatted armv8pmu_branch_valid() warnings while rejecting unsupported events
Changes in V5:
https://lore.kernel.org/linux-arm-kernel/20221107062514.2851047-1-anshuman.khandual@arm.com/
- Changed BRBCR_EL1.VIRTUAL from 0b1 to 0b01
- Changed BRBFCR_EL1.EnL into BRBFCR_EL1.EnI
- Changed config ARM_BRBE_PMU from 'tristate' to 'bool'
Changes in V4:
https://lore.kernel.org/all/20221017055713.451092-1-anshuman.khandual@arm.com/
- Changed ../tools/sysreg declarations as suggested
- Set PERF_SAMPLE_BRANCH_STACK in data.sample_flags
- Dropped perfmon_capable() check in armpmu_event_init()
- s/pr_warn_once/pr_info in armpmu_event_init()
- Added brbe_format element into struct pmu_hw_events
- Changed v1p1 as brbe_v1p1 in struct pmu_hw_events
- Dropped pr_info() from arm64_pmu_brbe_probe(), solved LOCKDEP warning
Changes in V3:
https://lore.kernel.org/all/20220929075857.158358-1-anshuman.khandual@arm.com/
- brbe_stack is now dynamically allocated instead of living on the stack
- Return PERF_BR_PRIV_UNKNOWN instead of -1 in brbe_fetch_perf_priv()
- Moved BRBIDR0, BRBCR, BRBFCR registers and fields into tools/sysreg
- Created dummy BRBINF_EL1 field definitions in tools/sysreg
- Dropped ARMPMU_EVT_PRIV framework which cached perfmon_capable()
- Both exception and exception return branch records are now captured
only if the event has PERF_SAMPLE_BRANCH_KERNEL, which would have
already been checked in generic perf via perf_allow_kernel()
Changes in V2:
https://lore.kernel.org/all/20220908051046.465307-1-anshuman.khandual@arm.com/
- Dropped branch sample filter helpers consolidation patch from this series
- Added new hw_perf_event.flags element ARMPMU_EVT_PRIV to cache perfmon_capable()
- Use cached perfmon_capable() while configuring BRBE branch record filters
Changes in V1:
https://lore.kernel.org/linux-arm-kernel/20220613100119.684673-1-anshuman.khandual@arm.com/
- Added CONFIG_PERF_EVENTS wrapper for all branch sample filter helpers
- Process new perf branch types via PERF_BR_EXTEND_ABI
Changes in RFC V2:
https://lore.kernel.org/linux-arm-kernel/20220412115455.293119-1-anshuman.khandual@arm.com/
- Added branch_sample_priv() while consolidating other branch sample filter helpers
- Changed all SYS_BRBXXXN_EL1 register definition encodings per Marc
- Changed the BRBE driver as per proposed BRBE related perf ABI changes (V5)
- Added documentation for struct arm_pmu changes, updated commit message
- Updated commit message for BRBE detection infrastructure patch
- PERF_SAMPLE_BRANCH_KERNEL gets checked during arm event init (outside the driver)
- Branch privilege state capture mechanism has now moved inside the driver
Changes in RFC V1:
https://lore.kernel.org/all/1642998653-21377-1-git-send-email-anshuman.khandual@arm.com/
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Suzuki Poulose <suzuki.poulose@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-perf-users@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Anshuman Khandual (7):
drivers: perf: arm_pmu: Add new sched_task() callback
arm64/perf: Add BRBE registers and fields
arm64/perf: Add branch stack support in struct arm_pmu
arm64/perf: Add branch stack support in struct pmu_hw_events
arm64/perf: Add branch stack support in ARMV8 PMU
arm64/perf: Enable branch stack events via FEAT_BRBE
drivers: perf: arm_pmu: Enable branch stack sampling event
arch/arm64/Kconfig | 11 +
arch/arm64/include/asm/perf_event.h | 19 ++
arch/arm64/include/asm/sysreg.h | 103 +++++++
arch/arm64/kernel/Makefile | 1 +
arch/arm64/kernel/brbe.c | 454 ++++++++++++++++++++++++++++
arch/arm64/kernel/brbe.h | 266 ++++++++++++++++
arch/arm64/kernel/perf_event.c | 31 ++
arch/arm64/tools/sysreg | 161 ++++++++++
drivers/perf/arm_pmu.c | 12 +-
include/linux/perf/arm_pmu.h | 19 ++
10 files changed, 1075 insertions(+), 2 deletions(-)
create mode 100644 arch/arm64/kernel/brbe.c
create mode 100644 arch/arm64/kernel/brbe.h
--
2.25.1
* [PATCH V6 1/7] drivers: perf: arm_pmu: Add new sched_task() callback
From: Anshuman Khandual @ 2022-12-08 8:43 UTC
To: linux-arm-kernel, linux-kernel, mark.rutland
Cc: Anshuman Khandual, Catalin Marinas, Will Deacon, Mark Brown,
James Clark, Rob Herring, Marc Zyngier, Suzuki Poulose,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
linux-perf-users
This adds armpmu_sched_task() as the generic pmu's sched_task() override,
which in turn invokes the new arm_pmu.sched_task() callback when the
arm_pmu instance provides one. This new callback will be used while
enabling BRBE in the ARMV8 PMU.
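For example, an arm_pmu based driver that needs to react to context
switches could install the callback at init time. A minimal sketch, with
my_pmu_sched_task() and my_pmu_reset_hw_state() being hypothetical names
used only for illustration:

static void my_pmu_reset_hw_state(void)
{
	/* Hardware specific reset, hypothetical */
}

static void my_pmu_sched_task(struct perf_event_context *ctx, bool sched_in)
{
	/* Discard stale hardware state when a new task is scheduled in */
	if (sched_in)
		my_pmu_reset_hw_state();
}

static int my_pmu_init(struct arm_pmu *cpu_pmu)
{
	cpu_pmu->sched_task = my_pmu_sched_task;
	return 0;
}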
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
drivers/perf/arm_pmu.c | 9 +++++++++
include/linux/perf/arm_pmu.h | 1 +
2 files changed, 10 insertions(+)
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 3f07df5a7e95..66880a4bb248 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -520,6 +520,14 @@ static int armpmu_event_init(struct perf_event *event)
return __hw_perf_event_init(event);
}
+static void armpmu_sched_task(struct perf_event_context *ctx, bool sched_in)
+{
+ struct arm_pmu *armpmu = to_arm_pmu(ctx->pmu);
+
+ if (armpmu->sched_task)
+ armpmu->sched_task(ctx, sched_in);
+}
+
static void armpmu_enable(struct pmu *pmu)
{
struct arm_pmu *armpmu = to_arm_pmu(pmu);
@@ -877,6 +885,7 @@ static struct arm_pmu *__armpmu_alloc(gfp_t flags)
}
pmu->pmu = (struct pmu) {
+ .sched_task = armpmu_sched_task,
.pmu_enable = armpmu_enable,
.pmu_disable = armpmu_disable,
.event_init = armpmu_event_init,
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 0356cb6a215d..60d4e4c9b3ea 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -101,6 +101,7 @@ struct arm_pmu {
void (*reset)(void *);
int (*map_event)(struct perf_event *event);
int (*filter_match)(struct perf_event *event);
+ void (*sched_task)(struct perf_event_context *ctx, bool sched_in);
int num_events;
bool secure_access; /* 32-bit ARM only */
#define ARMV8_PMUV3_MAX_COMMON_EVENTS 0x40
--
2.25.1
* [PATCH V6 2/7] arm64/perf: Add BRBE registers and fields
From: Anshuman Khandual @ 2022-12-08 8:43 UTC
To: linux-arm-kernel, linux-kernel, mark.rutland
Cc: Anshuman Khandual, Catalin Marinas, Will Deacon, Mark Brown,
James Clark, Rob Herring, Marc Zyngier, Suzuki Poulose,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
linux-perf-users
This adds BRBE related register definitions and the associated field
macros. These will be used subsequently by the BRBE driver added later
in this series.
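As an illustrative sketch (assuming FEAT_BRBE is implemented and the
first bank is currently selected via BRBFCR_EL1.BANK), a single branch
record could be read with these definitions:

	u64 brbinf, brbsrc, brbtgt;

	brbinf = read_sysreg_s(SYS_BRBINF0_EL1);	/* record 0 metadata */
	brbsrc = read_sysreg_s(SYS_BRBSRC0_EL1);	/* branch source address */
	brbtgt = read_sysreg_s(SYS_BRBTGT0_EL1);	/* branch target address */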
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
arch/arm64/include/asm/sysreg.h | 103 ++++++++++++++++++++
arch/arm64/tools/sysreg | 161 ++++++++++++++++++++++++++++++++
2 files changed, 264 insertions(+)
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 7d301700d1a9..78335c7807dc 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -161,6 +161,109 @@
#define SYS_DBGDTRTX_EL0 sys_reg(2, 3, 0, 5, 0)
#define SYS_DBGVCR32_EL2 sys_reg(2, 4, 0, 7, 0)
+#define __SYS_BRBINFO(n) sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10) >> 2) + 0))
+#define __SYS_BRBSRC(n) sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10) >> 2) + 1))
+#define __SYS_BRBTGT(n) sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10) >> 2) + 2))
+
+#define SYS_BRBINF0_EL1 __SYS_BRBINFO(0)
+#define SYS_BRBINF1_EL1 __SYS_BRBINFO(1)
+#define SYS_BRBINF2_EL1 __SYS_BRBINFO(2)
+#define SYS_BRBINF3_EL1 __SYS_BRBINFO(3)
+#define SYS_BRBINF4_EL1 __SYS_BRBINFO(4)
+#define SYS_BRBINF5_EL1 __SYS_BRBINFO(5)
+#define SYS_BRBINF6_EL1 __SYS_BRBINFO(6)
+#define SYS_BRBINF7_EL1 __SYS_BRBINFO(7)
+#define SYS_BRBINF8_EL1 __SYS_BRBINFO(8)
+#define SYS_BRBINF9_EL1 __SYS_BRBINFO(9)
+#define SYS_BRBINF10_EL1 __SYS_BRBINFO(10)
+#define SYS_BRBINF11_EL1 __SYS_BRBINFO(11)
+#define SYS_BRBINF12_EL1 __SYS_BRBINFO(12)
+#define SYS_BRBINF13_EL1 __SYS_BRBINFO(13)
+#define SYS_BRBINF14_EL1 __SYS_BRBINFO(14)
+#define SYS_BRBINF15_EL1 __SYS_BRBINFO(15)
+#define SYS_BRBINF16_EL1 __SYS_BRBINFO(16)
+#define SYS_BRBINF17_EL1 __SYS_BRBINFO(17)
+#define SYS_BRBINF18_EL1 __SYS_BRBINFO(18)
+#define SYS_BRBINF19_EL1 __SYS_BRBINFO(19)
+#define SYS_BRBINF20_EL1 __SYS_BRBINFO(20)
+#define SYS_BRBINF21_EL1 __SYS_BRBINFO(21)
+#define SYS_BRBINF22_EL1 __SYS_BRBINFO(22)
+#define SYS_BRBINF23_EL1 __SYS_BRBINFO(23)
+#define SYS_BRBINF24_EL1 __SYS_BRBINFO(24)
+#define SYS_BRBINF25_EL1 __SYS_BRBINFO(25)
+#define SYS_BRBINF26_EL1 __SYS_BRBINFO(26)
+#define SYS_BRBINF27_EL1 __SYS_BRBINFO(27)
+#define SYS_BRBINF28_EL1 __SYS_BRBINFO(28)
+#define SYS_BRBINF29_EL1 __SYS_BRBINFO(29)
+#define SYS_BRBINF30_EL1 __SYS_BRBINFO(30)
+#define SYS_BRBINF31_EL1 __SYS_BRBINFO(31)
+
+#define SYS_BRBSRC0_EL1 __SYS_BRBSRC(0)
+#define SYS_BRBSRC1_EL1 __SYS_BRBSRC(1)
+#define SYS_BRBSRC2_EL1 __SYS_BRBSRC(2)
+#define SYS_BRBSRC3_EL1 __SYS_BRBSRC(3)
+#define SYS_BRBSRC4_EL1 __SYS_BRBSRC(4)
+#define SYS_BRBSRC5_EL1 __SYS_BRBSRC(5)
+#define SYS_BRBSRC6_EL1 __SYS_BRBSRC(6)
+#define SYS_BRBSRC7_EL1 __SYS_BRBSRC(7)
+#define SYS_BRBSRC8_EL1 __SYS_BRBSRC(8)
+#define SYS_BRBSRC9_EL1 __SYS_BRBSRC(9)
+#define SYS_BRBSRC10_EL1 __SYS_BRBSRC(10)
+#define SYS_BRBSRC11_EL1 __SYS_BRBSRC(11)
+#define SYS_BRBSRC12_EL1 __SYS_BRBSRC(12)
+#define SYS_BRBSRC13_EL1 __SYS_BRBSRC(13)
+#define SYS_BRBSRC14_EL1 __SYS_BRBSRC(14)
+#define SYS_BRBSRC15_EL1 __SYS_BRBSRC(15)
+#define SYS_BRBSRC16_EL1 __SYS_BRBSRC(16)
+#define SYS_BRBSRC17_EL1 __SYS_BRBSRC(17)
+#define SYS_BRBSRC18_EL1 __SYS_BRBSRC(18)
+#define SYS_BRBSRC19_EL1 __SYS_BRBSRC(19)
+#define SYS_BRBSRC20_EL1 __SYS_BRBSRC(20)
+#define SYS_BRBSRC21_EL1 __SYS_BRBSRC(21)
+#define SYS_BRBSRC22_EL1 __SYS_BRBSRC(22)
+#define SYS_BRBSRC23_EL1 __SYS_BRBSRC(23)
+#define SYS_BRBSRC24_EL1 __SYS_BRBSRC(24)
+#define SYS_BRBSRC25_EL1 __SYS_BRBSRC(25)
+#define SYS_BRBSRC26_EL1 __SYS_BRBSRC(26)
+#define SYS_BRBSRC27_EL1 __SYS_BRBSRC(27)
+#define SYS_BRBSRC28_EL1 __SYS_BRBSRC(28)
+#define SYS_BRBSRC29_EL1 __SYS_BRBSRC(29)
+#define SYS_BRBSRC30_EL1 __SYS_BRBSRC(30)
+#define SYS_BRBSRC31_EL1 __SYS_BRBSRC(31)
+
+#define SYS_BRBTGT0_EL1 __SYS_BRBTGT(0)
+#define SYS_BRBTGT1_EL1 __SYS_BRBTGT(1)
+#define SYS_BRBTGT2_EL1 __SYS_BRBTGT(2)
+#define SYS_BRBTGT3_EL1 __SYS_BRBTGT(3)
+#define SYS_BRBTGT4_EL1 __SYS_BRBTGT(4)
+#define SYS_BRBTGT5_EL1 __SYS_BRBTGT(5)
+#define SYS_BRBTGT6_EL1 __SYS_BRBTGT(6)
+#define SYS_BRBTGT7_EL1 __SYS_BRBTGT(7)
+#define SYS_BRBTGT8_EL1 __SYS_BRBTGT(8)
+#define SYS_BRBTGT9_EL1 __SYS_BRBTGT(9)
+#define SYS_BRBTGT10_EL1 __SYS_BRBTGT(10)
+#define SYS_BRBTGT11_EL1 __SYS_BRBTGT(11)
+#define SYS_BRBTGT12_EL1 __SYS_BRBTGT(12)
+#define SYS_BRBTGT13_EL1 __SYS_BRBTGT(13)
+#define SYS_BRBTGT14_EL1 __SYS_BRBTGT(14)
+#define SYS_BRBTGT15_EL1 __SYS_BRBTGT(15)
+#define SYS_BRBTGT16_EL1 __SYS_BRBTGT(16)
+#define SYS_BRBTGT17_EL1 __SYS_BRBTGT(17)
+#define SYS_BRBTGT18_EL1 __SYS_BRBTGT(18)
+#define SYS_BRBTGT19_EL1 __SYS_BRBTGT(19)
+#define SYS_BRBTGT20_EL1 __SYS_BRBTGT(20)
+#define SYS_BRBTGT21_EL1 __SYS_BRBTGT(21)
+#define SYS_BRBTGT22_EL1 __SYS_BRBTGT(22)
+#define SYS_BRBTGT23_EL1 __SYS_BRBTGT(23)
+#define SYS_BRBTGT24_EL1 __SYS_BRBTGT(24)
+#define SYS_BRBTGT25_EL1 __SYS_BRBTGT(25)
+#define SYS_BRBTGT26_EL1 __SYS_BRBTGT(26)
+#define SYS_BRBTGT27_EL1 __SYS_BRBTGT(27)
+#define SYS_BRBTGT28_EL1 __SYS_BRBTGT(28)
+#define SYS_BRBTGT29_EL1 __SYS_BRBTGT(29)
+#define SYS_BRBTGT30_EL1 __SYS_BRBTGT(30)
+#define SYS_BRBTGT31_EL1 __SYS_BRBTGT(31)
+
#define SYS_MIDR_EL1 sys_reg(3, 0, 0, 0, 0)
#define SYS_MPIDR_EL1 sys_reg(3, 0, 0, 0, 5)
#define SYS_REVIDR_EL1 sys_reg(3, 0, 0, 0, 6)
diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
index 384757a7eda9..45b1834de1ae 100644
--- a/arch/arm64/tools/sysreg
+++ b/arch/arm64/tools/sysreg
@@ -167,6 +167,167 @@ Enum 3:0 BT
EndEnum
EndSysreg
+
+# This is just a dummy register declaration to get all common field masks and
+# shifts for accessing given BRBINF contents.
+Sysreg BRBINF_EL1 2 1 8 0 0
+Res0 63:47
+Field 46 CCU
+Field 45:32 CC
+Res0 31:18
+Field 17 LASTFAILED
+Field 16 T
+Res0 15:14
+Enum 13:8 TYPE
+ 0b000000 UNCOND_DIR
+ 0b000001 INDIR
+ 0b000010 DIR_LINK
+ 0b000011 INDIR_LINK
+ 0b000101 RET_SUB
+ 0b000111 RET_EXCPT
+ 0b001000 COND_DIR
+ 0b100001 DEBUG_HALT
+ 0b100010 CALL
+ 0b100011 TRAP
+ 0b100100 SERROR
+ 0b100110 INST_DEBUG
+ 0b100111 DATA_DEBUG
+ 0b101010 ALGN_FAULT
+ 0b101011 INST_FAULT
+ 0b101100 DATA_FAULT
+ 0b101110 IRQ
+ 0b101111 FIQ
+ 0b111001 DEBUG_EXIT
+EndEnum
+Enum 7:6 EL
+ 0b00 EL0
+ 0b01 EL1
+ 0b10 EL2
+ 0b11 EL3
+EndEnum
+Field 5 MPRED
+Res0 4:2
+Enum 1:0 VALID
+ 0b00 NONE
+ 0b01 TARGET
+ 0b10 SOURCE
+ 0b11 FULL
+EndEnum
+EndSysreg
+
+Sysreg BRBCR_EL1 2 1 9 0 0
+Res0 63:24
+Field 23 EXCEPTION
+Field 22 ERTN
+Res0 21:9
+Field 8 FZP
+Res0 7
+Enum 6:5 TS
+ 0b01 VIRTUAL
+ 0b10 GST_PHYSICAL
+ 0b11 PHYSICAL
+EndEnum
+Field 4 MPRED
+Field 3 CC
+Res0 2
+Field 1 E1BRE
+Field 0 E0BRE
+EndSysreg
+
+Sysreg BRBFCR_EL1 2 1 9 0 1
+Res0 63:30
+Enum 29:28 BANK
+ 0b00 FIRST
+ 0b01 SECOND
+EndEnum
+Res0 27:23
+Field 22 CONDDIR
+Field 21 DIRCALL
+Field 20 INDCALL
+Field 19 RTN
+Field 18 INDIRECT
+Field 17 DIRECT
+Field 16 EnI
+Res0 15:8
+Field 7 PAUSED
+Field 6 LASTFAILED
+Res0 5:0
+EndSysreg
+
+Sysreg BRBTS_EL1 2 1 9 0 2
+Field 63:0 TS
+EndSysreg
+
+Sysreg BRBINFINJ_EL1 2 1 9 1 0
+Res0 63:47
+Field 46 CCU
+Field 45:32 CC
+Res0 31:18
+Field 17 LASTFAILED
+Field 16 T
+Res0 15:14
+Enum 13:8 TYPE
+ 0b000000 UNCOND_DIR
+ 0b000001 INDIR
+ 0b000010 DIR_LINK
+ 0b000011 INDIR_LINK
+ 0b000101 RET_SUB
+ 0b000111 RET_EXCPT
+ 0b001000 COND_DIR
+ 0b100001 DEBUG_HALT
+ 0b100010 CALL
+ 0b100011 TRAP
+ 0b100100 SERROR
+ 0b100110 INST_DEBUG
+ 0b100111 DATA_DEBUG
+ 0b101010 ALGN_FAULT
+ 0b101011 INST_FAULT
+ 0b101100 DATA_FAULT
+ 0b101110 IRQ
+ 0b101111 FIQ
+ 0b111001 DEBUG_EXIT
+EndEnum
+Enum 7:6 EL
+ 0b00 EL0
+ 0b01 EL1
+ 0b10 EL2
+ 0b11 EL3
+EndEnum
+Field 5 MPRED
+Res0 4:2
+Enum 1:0 VALID
+ 0b00 NONE
+ 0b01 TARGET
+ 0b10 SOURCE
+ 0b11 FULL
+EndEnum
+EndSysreg
+
+Sysreg BRBSRCINJ_EL1 2 1 9 1 1
+Field 63:0 ADDRESS
+EndSysreg
+
+Sysreg BRBTGTINJ_EL1 2 1 9 1 2
+Field 63:0 ADDRESS
+EndSysreg
+
+Sysreg BRBIDR0_EL1 2 1 9 2 0
+Res0 63:16
+Enum 15:12 CC
+ 0b0101 20_BIT
+EndEnum
+Enum 11:8 FORMAT
+ 0b0000 0
+EndEnum
+Enum 7:0 NUMREC
+ 0b00001000 8
+ 0b00010000 16
+ 0b00100000 32
+ 0b01000000 64
+EndEnum
+EndSysreg
+
Sysreg ID_AA64ZFR0_EL1 3 0 0 4 4
Res0 63:60
Enum 59:56 F64MM
--
2.25.1
* [PATCH V6 3/7] arm64/perf: Add branch stack support in struct arm_pmu
From: Anshuman Khandual @ 2022-12-08 8:43 UTC
To: linux-arm-kernel, linux-kernel, mark.rutland
Cc: Anshuman Khandual, Catalin Marinas, Will Deacon, Mark Brown,
James Clark, Rob Herring, Marc Zyngier, Suzuki Poulose,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
linux-perf-users
This updates 'struct arm_pmu' for branch stack sampling support. It adds
a new 'features' element in the structure to track supported features,
and another 'hw_attr' element to encapsulate implementation attributes
of a given 'struct arm_pmu'. These updates will help in tracking branch
stack sampling support, which is added later in this series. This also
adds a helper arm_pmu_branch_stack_supported().
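For instance, a driver probe would advertise the capability and event
initialization would then test for it, mirroring how the later patches
in this series use these helpers:

	/* During probe, once branch record hardware has been detected */
	armpmu->features |= ARM_PMU_BRANCH_STACK;

	/* While validating an event that requests branch stack sampling */
	if (has_branch_stack(event) && !arm_pmu_branch_stack_supported(armpmu))
		return -EOPNOTSUPP;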
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
include/linux/perf/arm_pmu.h | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 60d4e4c9b3ea..8c4f42189904 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -80,11 +80,14 @@ enum armpmu_attr_groups {
ARMPMU_NR_ATTR_GROUPS
};
+#define ARM_PMU_BRANCH_STACK BIT(0)
+
struct arm_pmu {
struct pmu pmu;
cpumask_t supported_cpus;
char *name;
int pmuver;
+ int features;
irqreturn_t (*handle_irq)(struct arm_pmu *pmu);
void (*enable)(struct perf_event *event);
void (*disable)(struct perf_event *event);
@@ -119,8 +122,14 @@ struct arm_pmu {
/* Only to be used by ACPI probing code */
unsigned long acpi_cpuid;
+ void *hw_attr;
};
+static inline bool arm_pmu_branch_stack_supported(struct arm_pmu *armpmu)
+{
+ return armpmu->features & ARM_PMU_BRANCH_STACK;
+}
+
#define to_arm_pmu(p) (container_of(p, struct arm_pmu, pmu))
u64 armpmu_event_update(struct perf_event *event);
--
2.25.1
* [PATCH V6 4/7] arm64/perf: Add branch stack support in struct pmu_hw_events
From: Anshuman Khandual @ 2022-12-08 8:43 UTC
To: linux-arm-kernel, linux-kernel, mark.rutland
Cc: Anshuman Khandual, Catalin Marinas, Will Deacon, Mark Brown,
James Clark, Rob Herring, Marc Zyngier, Suzuki Poulose,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
linux-perf-users
This adds a branch records buffer pointer in 'struct pmu_hw_events',
which can be used to capture branch records during a PMU interrupt.
This percpu pointer needs to be allocated before use.
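A minimal allocation sketch, along the lines of what the BRBE driver
does later in this series (my_branch_records_alloc() is a hypothetical
name used only for illustration):

static int my_branch_records_alloc(struct arm_pmu *armpmu)
{
	struct pmu_hw_events *events;
	int cpu;

	for_each_possible_cpu(cpu) {
		events = per_cpu_ptr(armpmu->hw_events, cpu);

		events->branches = kzalloc(sizeof(struct branch_records), GFP_KERNEL);
		if (!events->branches)
			return -ENOMEM;
	}
	return 0;
}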
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
include/linux/perf/arm_pmu.h | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 8c4f42189904..871289fe4774 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -44,6 +44,13 @@ static_assert((PERF_EVENT_FLAG_ARCH & ARMPMU_EVT_47BIT) == ARMPMU_EVT_47BIT);
}, \
}
+#define MAX_BRANCH_RECORDS 64
+
+struct branch_records {
+ struct perf_branch_stack branch_stack;
+ struct perf_branch_entry branch_entries[MAX_BRANCH_RECORDS];
+};
+
/* The events for a given PMU register set. */
struct pmu_hw_events {
/*
@@ -70,6 +77,8 @@ struct pmu_hw_events {
struct arm_pmu *percpu_pmu;
int irq;
+
+ struct branch_records *branches;
};
enum armpmu_attr_groups {
--
2.25.1
* [PATCH V6 5/7] arm64/perf: Add branch stack support in ARMV8 PMU
From: Anshuman Khandual @ 2022-12-08 8:44 UTC
To: linux-arm-kernel, linux-kernel, mark.rutland
Cc: Anshuman Khandual, Catalin Marinas, Will Deacon, Mark Brown,
James Clark, Rob Herring, Marc Zyngier, Suzuki Poulose,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
linux-perf-users
This enables support for branch stack sampling events in the ARMV8 PMU
by checking has_branch_stack() on the event inside the 'struct arm_pmu'
callbacks, although the branch stack helpers armv8pmu_branch_XXXXX() are
just stub functions for now. While here, this also wires up arm_pmu's
sched_task() callback to armv8pmu_sched_task(), which resets the branch
record buffer on a sched_in.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
arch/arm64/include/asm/perf_event.h | 10 ++++++++++
arch/arm64/kernel/perf_event.c | 31 +++++++++++++++++++++++++++++
2 files changed, 41 insertions(+)
diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
index 3eaf462f5752..3be9b7a987e9 100644
--- a/arch/arm64/include/asm/perf_event.h
+++ b/arch/arm64/include/asm/perf_event.h
@@ -273,4 +273,14 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs);
(regs)->pstate = PSR_MODE_EL1h; \
}
+struct pmu_hw_events;
+struct arm_pmu;
+struct perf_event;
+
+static inline void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event) { }
+static inline bool armv8pmu_branch_valid(struct perf_event *event) {return false; }
+static inline void armv8pmu_branch_enable(struct perf_event *event) { }
+static inline void armv8pmu_branch_disable(struct perf_event *event) { }
+static inline void armv8pmu_branch_probe(struct arm_pmu *arm_pmu) { }
+static inline void armv8pmu_branch_reset(void) { }
#endif
diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 7b0643fe2f13..25878978843e 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -789,6 +789,12 @@ static void armv8pmu_enable_event(struct perf_event *event)
* Enable counter
*/
armv8pmu_enable_event_counter(event);
+
+ /*
+ * Enable BRBE
+ */
+ if (has_branch_stack(event))
+ armv8pmu_branch_enable(event);
}
static void armv8pmu_disable_event(struct perf_event *event)
@@ -802,6 +808,12 @@ static void armv8pmu_disable_event(struct perf_event *event)
* Disable interrupt for this counter
*/
armv8pmu_disable_event_irq(event);
+
+ /*
+ * Disable BRBE
+ */
+ if (has_branch_stack(event))
+ armv8pmu_branch_disable(event);
}
static void armv8pmu_start(struct arm_pmu *cpu_pmu)
@@ -874,6 +886,13 @@ static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
if (!armpmu_event_set_period(event))
continue;
+ if (has_branch_stack(event)) {
+ WARN_ON(!cpuc->branches);
+ armv8pmu_branch_read(cpuc, event);
+ data.br_stack = &cpuc->branches->branch_stack;
+ data.sample_flags |= PERF_SAMPLE_BRANCH_STACK;
+ }
+
/*
* Perf event overflow will queue the processing of the event as
* an irq_work which will be taken care of in the handling of
@@ -972,6 +991,12 @@ static int armv8pmu_user_event_idx(struct perf_event *event)
return event->hw.idx;
}
+static void armv8pmu_sched_task(struct perf_event_context *ctx, bool sched_in)
+{
+ if (sched_in)
+ armv8pmu_branch_reset();
+}
+
/*
* Add an event filter to a given event.
*/
@@ -1048,6 +1073,7 @@ static void armv8pmu_reset(void *info)
pmcr |= ARMV8_PMU_PMCR_LP;
armv8pmu_pmcr_write(pmcr);
+ armv8pmu_branch_reset();
}
static int __armv8_pmuv3_map_event(struct perf_event *event,
@@ -1065,6 +1091,9 @@ static int __armv8_pmuv3_map_event(struct perf_event *event,
&armv8_pmuv3_perf_cache_map,
ARMV8_PMU_EVTYPE_EVENT);
+ if (has_branch_stack(event) && !armv8pmu_branch_valid(event))
+ return -EOPNOTSUPP;
+
if (armv8pmu_event_is_64bit(event))
event->hw.flags |= ARMPMU_EVT_64BIT;
@@ -1176,6 +1205,7 @@ static void __armv8pmu_probe_pmu(void *info)
cpu_pmu->reg_pmmir = read_cpuid(PMMIR_EL1);
else
cpu_pmu->reg_pmmir = 0;
+ armv8pmu_branch_probe(cpu_pmu);
}
static int armv8pmu_probe_pmu(struct arm_pmu *cpu_pmu)
@@ -1256,6 +1286,7 @@ static int armv8_pmu_init(struct arm_pmu *cpu_pmu, char *name,
cpu_pmu->filter_match = armv8pmu_filter_match;
cpu_pmu->pmu.event_idx = armv8pmu_user_event_idx;
+ cpu_pmu->sched_task = armv8pmu_sched_task;
cpu_pmu->name = name;
cpu_pmu->map_event = map_event;
--
2.25.1
* [PATCH V6 6/7] arm64/perf: Enable branch stack events via FEAT_BRBE
From: Anshuman Khandual @ 2022-12-08 8:44 UTC
To: linux-arm-kernel, linux-kernel, mark.rutland
Cc: Anshuman Khandual, Catalin Marinas, Will Deacon, Mark Brown,
James Clark, Rob Herring, Marc Zyngier, Suzuki Poulose,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
linux-perf-users
This enables branch stack sampling events in the ARMV8 PMU via the
architecture feature FEAT_BRBE, aka the Branch Record Buffer Extension.
This defines the required branch helper functions armv8pmu_branch_XXXXX(),
and the implementation is wrapped in a new config option CONFIG_ARM64_BRBE.
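For context, the helpers in brbe.h split a linear record index into a
bank number plus an in-bank register index. A standalone illustration
of that arithmetic (function names here are for illustration only):

	/* Records 0-31 live in bank 0, records 32-63 in bank 1 */
	static inline int brbe_bank(int buffer_idx)
	{
		return buffer_idx / 32;
	}

	/* In-bank register index, as computed by buffer_to_brbe_idx() */
	static inline int brbe_bank_offset(int buffer_idx)
	{
		return buffer_idx % 32;
	}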
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
arch/arm64/Kconfig | 11 +
arch/arm64/include/asm/perf_event.h | 9 +
arch/arm64/kernel/Makefile | 1 +
arch/arm64/kernel/brbe.c | 454 ++++++++++++++++++++++++++++
arch/arm64/kernel/brbe.h | 266 ++++++++++++++++
5 files changed, 741 insertions(+)
create mode 100644 arch/arm64/kernel/brbe.c
create mode 100644 arch/arm64/kernel/brbe.h
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 505c8a1ccbe0..6869fa5ef3e8 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1355,6 +1355,17 @@ config HW_PERF_EVENTS
def_bool y
depends on ARM_PMU
+config ARM64_BRBE
+ bool "Enable support for Branch Record Buffer Extension (BRBE)"
+ depends on PERF_EVENTS && ARM64 && ARM_PMU
+ default y
+ help
+ Enable perf support for the Branch Record Buffer Extension (BRBE),
+ which records all branches taken in an execution path. It supports
+ filtering based on branch type and privilege level, and captures
+ additional relevant information such as cycle count, misprediction
+ and branch privilege level.
+
# Supported by clang >= 7.0 or GCC >= 12.0.0
config CC_HAVE_SHADOW_CALL_STACK
def_bool $(cc-option, -fsanitize=shadow-call-stack -ffixed-x18)
diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
index 3be9b7a987e9..a87ab55cb253 100644
--- a/arch/arm64/include/asm/perf_event.h
+++ b/arch/arm64/include/asm/perf_event.h
@@ -277,6 +277,14 @@ struct pmu_hw_events;
struct arm_pmu;
struct perf_event;
+#ifdef CONFIG_ARM64_BRBE
+void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event);
+bool armv8pmu_branch_valid(struct perf_event *event);
+void armv8pmu_branch_enable(struct perf_event *event);
+void armv8pmu_branch_disable(struct perf_event *event);
+void armv8pmu_branch_probe(struct arm_pmu *arm_pmu);
+void armv8pmu_branch_reset(void);
+#else
static inline void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event) { }
static inline bool armv8pmu_branch_valid(struct perf_event *event) {return false; }
static inline void armv8pmu_branch_enable(struct perf_event *event) { }
@@ -284,3 +292,4 @@ static inline void armv8pmu_branch_disable(struct perf_event *event) { }
static inline void armv8pmu_branch_probe(struct arm_pmu *arm_pmu) { }
static inline void armv8pmu_branch_reset(void) { }
#endif
+#endif
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 2f361a883d8c..561151a78082 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -52,6 +52,7 @@ obj-$(CONFIG_MODULES) += module.o
obj-$(CONFIG_ARM64_MODULE_PLTS) += module-plts.o
obj-$(CONFIG_PERF_EVENTS) += perf_regs.o perf_callchain.o
obj-$(CONFIG_HW_PERF_EVENTS) += perf_event.o
+obj-$(CONFIG_ARM64_BRBE) += brbe.o
obj-$(CONFIG_HAVE_HW_BREAKPOINT) += hw_breakpoint.o
obj-$(CONFIG_CPU_PM) += sleep.o suspend.o
obj-$(CONFIG_CPU_IDLE) += cpuidle.o
diff --git a/arch/arm64/kernel/brbe.c b/arch/arm64/kernel/brbe.c
new file mode 100644
index 000000000000..b8e4f9263630
--- /dev/null
+++ b/arch/arm64/kernel/brbe.c
@@ -0,0 +1,454 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Branch Record Buffer Extension Driver.
+ *
+ * Copyright (C) 2021 ARM Limited
+ *
+ * Author: Anshuman Khandual <anshuman.khandual@arm.com>
+ */
+#include "brbe.h"
+
+#define BRBFCR_BRANCH_ALL (BRBFCR_EL1_DIRECT | BRBFCR_EL1_INDIRECT | \
+ BRBFCR_EL1_RTN | BRBFCR_EL1_INDCALL | \
+ BRBFCR_EL1_DIRCALL | BRBFCR_EL1_CONDDIR)
+
+#define BRBE_FCR_MASK (BRBFCR_BRANCH_ALL)
+#define BRBE_CR_MASK (BRBCR_EL1_EXCEPTION | BRBCR_EL1_ERTN | BRBCR_EL1_CC | \
+ BRBCR_EL1_MPRED | BRBCR_EL1_E1BRE | BRBCR_EL1_E0BRE)
+
+bool armv8pmu_branch_valid(struct perf_event *event)
+{
+ /*
+ * If the event does not have at least one of the privilege
+ * branch filters in PERF_SAMPLE_BRANCH_PLM_ALL, core perf
+ * will adjust its value based on the perf event's existing
+ * privilege level via attr.exclude_[user|kernel|hv].
+ *
+ * As event->attr.branch_sample_type might have been changed
+ * by the time the event reaches here, it is not possible to
+ * figure out whether the event originally requested the HV
+ * privilege or whether core perf added it. Just report this
+ * situation once and ignore any further instances.
+ */
+ if (event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_HV)
+ pr_warn_once("branch filter not supported - hypervisor privilege\n");
+
+ if (event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_ABORT_TX) {
+ pr_warn_once("branch filter not supported - aborted transaction\n");
+ return false;
+ }
+
+ if (event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_NO_TX) {
+ pr_warn_once("branch filter not supported - no transaction\n");
+ return false;
+ }
+
+ if (event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_IN_TX) {
+ pr_warn_once("branch filter not supported - in transaction\n");
+ return false;
+ }
+ return true;
+}
+
+static void branch_records_alloc(struct arm_pmu *armpmu)
+{
+ struct pmu_hw_events *events;
+ int cpu;
+
+ for_each_possible_cpu(cpu) {
+ events = per_cpu_ptr(armpmu->hw_events, cpu);
+
+ events->branches = kmalloc(sizeof(struct branch_records), GFP_KERNEL);
+ WARN_ON(!events->branches);
+ }
+}
+
+void armv8pmu_branch_probe(struct arm_pmu *armpmu)
+{
+ struct brbe_hw_attr *brbe_attr;
+ u64 aa64dfr0, brbidr;
+ unsigned int brbe;
+
+ brbe_attr = kmalloc(sizeof(struct brbe_hw_attr), GFP_KERNEL);
+ armpmu->hw_attr = brbe_attr;
+ WARN_ON(!brbe_attr);
+
+ aa64dfr0 = read_sysreg_s(SYS_ID_AA64DFR0_EL1);
+ brbe = cpuid_feature_extract_unsigned_field(aa64dfr0, ID_AA64DFR0_EL1_BRBE_SHIFT);
+ if (!brbe)
+ return;
+
+ if (brbe == ID_AA64DFR0_EL1_BRBE_IMP)
+ brbe_attr->brbe_v1p1 = false;
+
+ if (brbe == ID_AA64DFR0_EL1_BRBE_BRBE_V1P1)
+ brbe_attr->brbe_v1p1 = true;
+
+ brbidr = read_sysreg_s(SYS_BRBIDR0_EL1);
+ brbe_attr->brbe_format = brbe_fetch_format(brbidr);
+ if (brbe_attr->brbe_format != BRBIDR0_EL1_FORMAT_0)
+ return;
+
+ brbe_attr->brbe_cc = brbe_fetch_cc_bits(brbidr);
+ if (brbe_attr->brbe_cc != BRBIDR0_EL1_CC_20_BIT)
+ return;
+
+ brbe_attr->brbe_nr = brbe_fetch_numrec(brbidr);
+ if (!valid_brbe_nr(brbe_attr->brbe_nr))
+ return;
+
+ branch_records_alloc(armpmu);
+ armpmu->features |= ARM_PMU_BRANCH_STACK;
+}
+
+static u64 branch_type_to_brbfcr(int branch_type)
+{
+ u64 brbfcr = 0;
+
+ if (branch_type & PERF_SAMPLE_BRANCH_ANY) {
+ brbfcr |= BRBFCR_BRANCH_ALL;
+ return brbfcr;
+ }
+
+ if (branch_type & PERF_SAMPLE_BRANCH_ANY_CALL)
+ brbfcr |= (BRBFCR_EL1_INDCALL | BRBFCR_EL1_DIRCALL);
+
+ if (branch_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
+ brbfcr |= BRBFCR_EL1_RTN;
+
+ if (branch_type & PERF_SAMPLE_BRANCH_IND_CALL)
+ brbfcr |= BRBFCR_EL1_INDCALL;
+
+ if (branch_type & PERF_SAMPLE_BRANCH_COND)
+ brbfcr |= BRBFCR_EL1_CONDDIR;
+
+ if (branch_type & PERF_SAMPLE_BRANCH_IND_JUMP)
+ brbfcr |= BRBFCR_EL1_INDIRECT;
+
+ if (branch_type & PERF_SAMPLE_BRANCH_CALL)
+ brbfcr |= BRBFCR_EL1_DIRCALL;
+
+ return brbfcr;
+}
+
+static u64 branch_type_to_brbcr(int branch_type)
+{
+ u64 brbcr = (BRBCR_EL1_CC | BRBCR_EL1_MPRED);
+
+ if (branch_type & PERF_SAMPLE_BRANCH_USER)
+ brbcr |= BRBCR_EL1_E0BRE;
+
+ if (branch_type & PERF_SAMPLE_BRANCH_KERNEL)
+ brbcr |= BRBCR_EL1_E1BRE;
+
+ if (branch_type & PERF_SAMPLE_BRANCH_NO_CYCLES)
+ brbcr &= ~BRBCR_EL1_CC;
+
+ if (branch_type & PERF_SAMPLE_BRANCH_NO_FLAGS)
+ brbcr &= ~BRBCR_EL1_MPRED;
+
+ /*
+ * The exception and exception return branches could be
+ * captured, irrespective of the perf event's privilege.
+ * If the perf event does not have enough privilege for
+ * a given exception level, then addresses that fall
+ * under that exception level will be reported as zero
+ * for the captured branch record, creating source only
+ * or target only records.
+ */
+ if (branch_type & PERF_SAMPLE_BRANCH_ANY) {
+ brbcr |= BRBCR_EL1_EXCEPTION;
+ brbcr |= BRBCR_EL1_ERTN;
+ }
+
+ if (branch_type & PERF_SAMPLE_BRANCH_ANY_CALL)
+ brbcr |= BRBCR_EL1_EXCEPTION;
+
+ if (branch_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
+ brbcr |= BRBCR_EL1_ERTN;
+
+ return brbcr;
+}
+
+void armv8pmu_branch_enable(struct perf_event *event)
+{
+ u64 branch_type = event->attr.branch_sample_type;
+ u64 brbfcr, brbcr;
+
+ brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
+ brbfcr &= ~BRBFCR_EL1_BANK_MASK;
+ brbfcr &= ~(BRBFCR_EL1_EnI | BRBFCR_EL1_PAUSED | BRBE_FCR_MASK);
+ brbfcr |= (branch_type_to_brbfcr(branch_type) & BRBE_FCR_MASK);
+ write_sysreg_s(brbfcr, SYS_BRBFCR_EL1);
+ isb();
+
+ brbcr = read_sysreg_s(SYS_BRBCR_EL1);
+ brbcr &= ~BRBE_CR_MASK;
+ brbcr |= BRBCR_EL1_FZP;
+ brbcr |= (BRBCR_EL1_TS_PHYSICAL << BRBCR_EL1_TS_SHIFT);
+ brbcr |= (branch_type_to_brbcr(branch_type) & BRBE_CR_MASK);
+ write_sysreg_s(brbcr, SYS_BRBCR_EL1);
+ isb();
+ armv8pmu_branch_reset();
+}
+
+void armv8pmu_branch_disable(struct perf_event *event)
+{
+ u64 brbcr = read_sysreg_s(SYS_BRBCR_EL1);
+
+ brbcr &= ~(BRBCR_EL1_E0BRE | BRBCR_EL1_E1BRE);
+ write_sysreg_s(brbcr, SYS_BRBCR_EL1);
+ isb();
+}
+
+static int brbe_fetch_perf_type(u64 brbinf, bool *new_branch_type)
+{
+ int brbe_type = brbe_fetch_type(brbinf);
+ *new_branch_type = false;
+
+ switch (brbe_type) {
+ case BRBINF_EL1_TYPE_UNCOND_DIR:
+ return PERF_BR_UNCOND;
+ case BRBINF_EL1_TYPE_INDIR:
+ return PERF_BR_IND;
+ case BRBINF_EL1_TYPE_DIR_LINK:
+ return PERF_BR_CALL;
+ case BRBINF_EL1_TYPE_INDIR_LINK:
+ return PERF_BR_IND_CALL;
+ case BRBINF_EL1_TYPE_RET_SUB:
+ return PERF_BR_RET;
+ case BRBINF_EL1_TYPE_COND_DIR:
+ return PERF_BR_COND;
+ case BRBINF_EL1_TYPE_CALL:
+ return PERF_BR_CALL;
+ case BRBINF_EL1_TYPE_TRAP:
+ return PERF_BR_SYSCALL;
+ case BRBINF_EL1_TYPE_RET_EXCPT:
+ return PERF_BR_ERET;
+ case BRBINF_EL1_TYPE_IRQ:
+ return PERF_BR_IRQ;
+ case BRBINF_EL1_TYPE_DEBUG_HALT:
+ *new_branch_type = true;
+ return PERF_BR_ARM64_DEBUG_HALT;
+ case BRBINF_EL1_TYPE_SERROR:
+ return PERF_BR_SERROR;
+ case BRBINF_EL1_TYPE_INST_DEBUG:
+ *new_branch_type = true;
+ return PERF_BR_ARM64_DEBUG_INST;
+ case BRBINF_EL1_TYPE_DATA_DEBUG:
+ *new_branch_type = true;
+ return PERF_BR_ARM64_DEBUG_DATA;
+ case BRBINF_EL1_TYPE_ALGN_FAULT:
+ *new_branch_type = true;
+ return PERF_BR_NEW_FAULT_ALGN;
+ case BRBINF_EL1_TYPE_INST_FAULT:
+ *new_branch_type = true;
+ return PERF_BR_NEW_FAULT_INST;
+ case BRBINF_EL1_TYPE_DATA_FAULT:
+ *new_branch_type = true;
+ return PERF_BR_NEW_FAULT_DATA;
+ case BRBINF_EL1_TYPE_FIQ:
+ *new_branch_type = true;
+ return PERF_BR_ARM64_FIQ;
+ case BRBINF_EL1_TYPE_DEBUG_EXIT:
+ *new_branch_type = true;
+ return PERF_BR_ARM64_DEBUG_EXIT;
+ default:
+ pr_warn("unknown branch type captured\n");
+ return PERF_BR_UNKNOWN;
+ }
+}
+
+static int brbe_fetch_perf_priv(u64 brbinf)
+{
+ int brbe_el = brbe_fetch_el(brbinf);
+
+ switch (brbe_el) {
+ case BRBINF_EL1_EL_EL0:
+ return PERF_BR_PRIV_USER;
+ case BRBINF_EL1_EL_EL1:
+ return PERF_BR_PRIV_KERNEL;
+ case BRBINF_EL1_EL_EL2:
+ if (is_kernel_in_hyp_mode())
+ return PERF_BR_PRIV_KERNEL;
+ return PERF_BR_PRIV_HV;
+ default:
+ pr_warn("unknown branch privilege captured\n");
+ return PERF_BR_PRIV_UNKNOWN;
+ }
+}
+
+static void capture_brbe_flags(struct pmu_hw_events *cpuc, struct perf_event *event,
+ u64 brbinf, int idx)
+{
+ int branch_type, type = brbe_record_valid(brbinf);
+ bool new_branch_type;
+
+ if (!branch_sample_no_cycles(event))
+ cpuc->branches->branch_entries[idx].cycles = brbe_fetch_cycles(brbinf);
+
+ if (branch_sample_type(event)) {
+ branch_type = brbe_fetch_perf_type(brbinf, &new_branch_type);
+ if (new_branch_type) {
+ cpuc->branches->branch_entries[idx].type = PERF_BR_EXTEND_ABI;
+ cpuc->branches->branch_entries[idx].new_type = branch_type;
+ } else {
+ cpuc->branches->branch_entries[idx].type = branch_type;
+ }
+ }
+
+ if (!branch_sample_no_flags(event)) {
+ /*
+ * BRBINF_LASTFAILED does not indicate that the last transaction
+ * failed or got aborted during the current branch record itself.
+ * Rather, it indicates that all the branch records which were
+ * in transaction until the current branch record have failed. So
+ * the entire BRBE buffer needs to be processed later on to find
+ * all branch records which might have failed.
+ */
+ cpuc->branches->branch_entries[idx].abort = brbinf & BRBINF_EL1_LASTFAILED;
+
+ /*
+ * This information (i.e. transaction state and mispredicts)
+ * is not available for target only branch records.
+ */
+ if (type != BRBINF_EL1_VALID_TARGET) {
+ cpuc->branches->branch_entries[idx].mispred = brbinf & BRBINF_EL1_MPRED;
+ cpuc->branches->branch_entries[idx].predicted = !(brbinf & BRBINF_EL1_MPRED);
+ cpuc->branches->branch_entries[idx].in_tx = brbinf & BRBINF_EL1_T;
+ }
+ }
+
+ if (branch_sample_priv(event)) {
+ /*
+ * This information (i.e. branch privilege level) is not
+ * available for source only branch records.
+ */
+ if (type != BRBINF_EL1_VALID_SOURCE)
+ cpuc->branches->branch_entries[idx].priv = brbe_fetch_perf_priv(brbinf);
+ }
+}
+
+/*
+ * A branch record with BRBINF_EL1.LASTFAILED set implies that all
+ * preceding consecutive branch records that were in a transaction
+ * (i.e. their BRBINF_EL1.T set) have been aborted.
+ *
+ * Similarly BRBFCR_EL1.LASTFAILED set indicates that all preceding
+ * consecutive branch records up to the last record, which were in a
+ * transaction (i.e. their BRBINF_EL1.T set), have been aborted.
+ *
+ * --------------------------------- -------------------
+ * | 00 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX success]
+ * --------------------------------- -------------------
+ * | 01 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX success]
+ * --------------------------------- -------------------
+ * | 02 | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 0 |
+ * --------------------------------- -------------------
+ * | 03 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
+ * --------------------------------- -------------------
+ * | 04 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
+ * --------------------------------- -------------------
+ * | 05 | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 1 |
+ * --------------------------------- -------------------
+ * | .. | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 0 |
+ * --------------------------------- -------------------
+ * | 61 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
+ * --------------------------------- -------------------
+ * | 62 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
+ * --------------------------------- -------------------
+ * | 63 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
+ * --------------------------------- -------------------
+ *
+ * BRBFCR_EL1.LASTFAILED == 1
+ *
+ * Here BRBFCR_EL1.LASTFAILED fails all those consecutive, in
+ * transaction branches near the end of the BRBE buffer.
+ */
+static void process_branch_aborts(struct pmu_hw_events *cpuc)
+{
+ struct brbe_hw_attr *brbe_attr = (struct brbe_hw_attr *) cpuc->percpu_pmu->hw_attr;
+ u64 brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
+ bool lastfailed = !!(brbfcr & BRBFCR_EL1_LASTFAILED);
+ int idx = brbe_attr->brbe_nr - 1;
+
+ do {
+ if (cpuc->branches->branch_entries[idx].in_tx) {
+ cpuc->branches->branch_entries[idx].abort = lastfailed;
+ } else {
+ lastfailed = cpuc->branches->branch_entries[idx].abort;
+ cpuc->branches->branch_entries[idx].abort = false;
+ }
+ } while (idx--, idx >= 0);
+}
+
+void armv8pmu_branch_reset(void)
+{
+ asm volatile(BRB_IALL);
+ isb();
+}
+
+void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event)
+{
+ struct brbe_hw_attr *brbe_attr = (struct brbe_hw_attr *) cpuc->percpu_pmu->hw_attr;
+ u64 brbinf, brbfcr, brbcr, saved_priv;
+ int idx;
+
+ brbcr = read_sysreg_s(SYS_BRBCR_EL1);
+ brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
+
+ /* Ensure pause on PMU interrupt is enabled */
+ WARN_ON_ONCE(~brbcr & BRBCR_EL1_FZP);
+
+ /* Save and clear the privilege */
+ saved_priv = brbcr & (BRBCR_EL1_E0BRE | BRBCR_EL1_E1BRE);
+ brbcr &= ~(BRBCR_EL1_E0BRE | BRBCR_EL1_E1BRE);
+
+ /* Pause the buffer */
+ brbfcr |= BRBFCR_EL1_PAUSED;
+
+ write_sysreg_s(brbcr, SYS_BRBCR_EL1);
+ write_sysreg_s(brbfcr, SYS_BRBFCR_EL1);
+ isb();
+
+ for (idx = 0; idx < brbe_attr->brbe_nr; idx++) {
+ select_brbe_bank_index(idx);
+ brbinf = get_brbinf_reg(idx);
+ /*
+ * There are no more valid entries in the buffer.
+ * Abort the branch record processing to save some
+ * cycles and also reduce the capture/processing
+ * load for user space as well.
+ */
+ if (brbe_invalid(brbinf))
+ break;
+
+ perf_clear_branch_entry_bitfields(&cpuc->branches->branch_entries[idx]);
+ if (brbe_valid(brbinf)) {
+ cpuc->branches->branch_entries[idx].from = get_brbsrc_reg(idx);
+ cpuc->branches->branch_entries[idx].to = get_brbtgt_reg(idx);
+ } else if (brbe_source(brbinf)) {
+ cpuc->branches->branch_entries[idx].from = get_brbsrc_reg(idx);
+ cpuc->branches->branch_entries[idx].to = 0;
+ } else if (brbe_target(brbinf)) {
+ cpuc->branches->branch_entries[idx].from = 0;
+ cpuc->branches->branch_entries[idx].to = get_brbtgt_reg(idx);
+ }
+ capture_brbe_flags(cpuc, event, brbinf, idx);
+ }
+ cpuc->branches->branch_stack.nr = idx;
+ cpuc->branches->branch_stack.hw_idx = -1ULL;
+ process_branch_aborts(cpuc);
+
+ /* Restore privilege, enable pause on PMU interrupt */
+ brbcr |= saved_priv;
+ brbcr |= BRBCR_EL1_FZP;
+
+ /* Unpause the buffer */
+ brbfcr &= ~BRBFCR_EL1_PAUSED;
+
+ write_sysreg_s(brbcr, SYS_BRBCR_EL1);
+ write_sysreg_s(brbfcr, SYS_BRBFCR_EL1);
+ isb();
+ armv8pmu_branch_reset();
+}
diff --git a/arch/arm64/kernel/brbe.h b/arch/arm64/kernel/brbe.h
new file mode 100644
index 000000000000..b265e6bf3d23
--- /dev/null
+++ b/arch/arm64/kernel/brbe.h
@@ -0,0 +1,266 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Branch Record Buffer Extension Helpers.
+ *
+ * Copyright (C) 2021 ARM Limited
+ *
+ * Author: Anshuman Khandual <anshuman.khandual@arm.com>
+ */
+#define pr_fmt(fmt) "brbe: " fmt
+
+#include <linux/perf/arm_pmu.h>
+
+struct brbe_hw_attr {
+ bool brbe_v1p1;
+ int brbe_cc;
+ int brbe_nr;
+ int brbe_format;
+};
+
+/*
+ * BRBE Instructions
+ *
+ * BRB_IALL : Invalidate the entire buffer
+ * BRB_INJ : Inject latest branch record derived from [BRBSRCINJ, BRBTGTINJ, BRBINFINJ]
+ */
+#define BRB_IALL __emit_inst(0xD5000000 | sys_insn(1, 1, 7, 2, 4) | (0x1f))
+#define BRB_INJ __emit_inst(0xD5000000 | sys_insn(1, 1, 7, 2, 5) | (0x1f))
+
+/*
+ * BRBE Buffer Organization
+ *
+ * The BRBE buffer is arranged as multiple banks of 32 branch
+ * record entries each. An individual branch record in a given
+ * bank can be accessed after selecting the bank in
+ * BRBFCR_EL1.BANK, via the register set [BRBSRC, BRBTGT,
+ * BRBINF] with indices [0..31].
+ *
+ * Bank 0
+ *
+ * --------------------------------- ------
+ * | 00 | BRBSRC | BRBTGT | BRBINF | | 00 |
+ * --------------------------------- ------
+ * | 01 | BRBSRC | BRBTGT | BRBINF | | 01 |
+ * --------------------------------- ------
+ * | .. | BRBSRC | BRBTGT | BRBINF | | .. |
+ * --------------------------------- ------
+ * | 31 | BRBSRC | BRBTGT | BRBINF | | 31 |
+ * --------------------------------- ------
+ *
+ * Bank 1
+ *
+ * --------------------------------- ------
+ * | 32 | BRBSRC | BRBTGT | BRBINF | | 00 |
+ * --------------------------------- ------
+ * | 33 | BRBSRC | BRBTGT | BRBINF | | 01 |
+ * --------------------------------- ------
+ * | .. | BRBSRC | BRBTGT | BRBINF | | .. |
+ * --------------------------------- ------
+ * | 63 | BRBSRC | BRBTGT | BRBINF | | 31 |
+ * --------------------------------- ------
+ */
+#define BRBE_BANK0_IDX_MIN 0
+#define BRBE_BANK0_IDX_MAX 31
+#define BRBE_BANK1_IDX_MIN 32
+#define BRBE_BANK1_IDX_MAX 63
+
+#define RETURN_READ_BRBSRCN(n) \
+ read_sysreg_s(SYS_BRBSRC##n##_EL1)
+
+#define RETURN_READ_BRBTGTN(n) \
+ read_sysreg_s(SYS_BRBTGT##n##_EL1)
+
+#define RETURN_READ_BRBINFN(n) \
+ read_sysreg_s(SYS_BRBINF##n##_EL1)
+
+#define BRBE_REGN_CASE(n, case_macro) \
+ case n: return case_macro(n); break
+
+#define BRBE_REGN_SWITCH(x, case_macro) \
+ do { \
+ switch (x) { \
+ BRBE_REGN_CASE(0, case_macro); \
+ BRBE_REGN_CASE(1, case_macro); \
+ BRBE_REGN_CASE(2, case_macro); \
+ BRBE_REGN_CASE(3, case_macro); \
+ BRBE_REGN_CASE(4, case_macro); \
+ BRBE_REGN_CASE(5, case_macro); \
+ BRBE_REGN_CASE(6, case_macro); \
+ BRBE_REGN_CASE(7, case_macro); \
+ BRBE_REGN_CASE(8, case_macro); \
+ BRBE_REGN_CASE(9, case_macro); \
+ BRBE_REGN_CASE(10, case_macro); \
+ BRBE_REGN_CASE(11, case_macro); \
+ BRBE_REGN_CASE(12, case_macro); \
+ BRBE_REGN_CASE(13, case_macro); \
+ BRBE_REGN_CASE(14, case_macro); \
+ BRBE_REGN_CASE(15, case_macro); \
+ BRBE_REGN_CASE(16, case_macro); \
+ BRBE_REGN_CASE(17, case_macro); \
+ BRBE_REGN_CASE(18, case_macro); \
+ BRBE_REGN_CASE(19, case_macro); \
+ BRBE_REGN_CASE(20, case_macro); \
+ BRBE_REGN_CASE(21, case_macro); \
+ BRBE_REGN_CASE(22, case_macro); \
+ BRBE_REGN_CASE(23, case_macro); \
+ BRBE_REGN_CASE(24, case_macro); \
+ BRBE_REGN_CASE(25, case_macro); \
+ BRBE_REGN_CASE(26, case_macro); \
+ BRBE_REGN_CASE(27, case_macro); \
+ BRBE_REGN_CASE(28, case_macro); \
+ BRBE_REGN_CASE(29, case_macro); \
+ BRBE_REGN_CASE(30, case_macro); \
+ BRBE_REGN_CASE(31, case_macro); \
+ default: \
+ pr_warn("unknown register index\n"); \
+ return -1; \
+ } \
+ } while (0)
+
+static inline int buffer_to_brbe_idx(int buffer_idx)
+{
+ return buffer_idx % 32;
+}
+
+static inline u64 get_brbsrc_reg(int buffer_idx)
+{
+ int brbe_idx = buffer_to_brbe_idx(buffer_idx);
+
+ BRBE_REGN_SWITCH(brbe_idx, RETURN_READ_BRBSRCN);
+}
+
+static inline u64 get_brbtgt_reg(int buffer_idx)
+{
+ int brbe_idx = buffer_to_brbe_idx(buffer_idx);
+
+ BRBE_REGN_SWITCH(brbe_idx, RETURN_READ_BRBTGTN);
+}
+
+static inline u64 get_brbinf_reg(int buffer_idx)
+{
+ int brbe_idx = buffer_to_brbe_idx(buffer_idx);
+
+ BRBE_REGN_SWITCH(brbe_idx, RETURN_READ_BRBINFN);
+}
+
+static inline u64 brbe_record_valid(u64 brbinf)
+{
+ return (brbinf & BRBINF_EL1_VALID_MASK) >> BRBINF_EL1_VALID_SHIFT;
+}
+
+static inline bool brbe_invalid(u64 brbinf)
+{
+ return brbe_record_valid(brbinf) == BRBINF_EL1_VALID_NONE;
+}
+
+static inline bool brbe_valid(u64 brbinf)
+{
+ return brbe_record_valid(brbinf) == BRBINF_EL1_VALID_FULL;
+}
+
+static inline bool brbe_source(u64 brbinf)
+{
+ return brbe_record_valid(brbinf) == BRBINF_EL1_VALID_SOURCE;
+}
+
+static inline bool brbe_target(u64 brbinf)
+{
+ return brbe_record_valid(brbinf) == BRBINF_EL1_VALID_TARGET;
+}
+
+static inline int brbe_fetch_cycles(u64 brbinf)
+{
+ /*
+ * Captured cycle count is unknown and hence
+ * should not be passed on to user space.
+ */
+ if (brbinf & BRBINF_EL1_CCU)
+ return 0;
+
+ return (brbinf & BRBINF_EL1_CC_MASK) >> BRBINF_EL1_CC_SHIFT;
+}
+
+static inline int brbe_fetch_type(u64 brbinf)
+{
+ return (brbinf & BRBINF_EL1_TYPE_MASK) >> BRBINF_EL1_TYPE_SHIFT;
+}
+
+static inline int brbe_fetch_el(u64 brbinf)
+{
+ return (brbinf & BRBINF_EL1_EL_MASK) >> BRBINF_EL1_EL_SHIFT;
+}
+
+static inline int brbe_fetch_numrec(u64 brbidr)
+{
+ return (brbidr & BRBIDR0_EL1_NUMREC_MASK) >> BRBIDR0_EL1_NUMREC_SHIFT;
+}
+
+static inline int brbe_fetch_format(u64 brbidr)
+{
+ return (brbidr & BRBIDR0_EL1_FORMAT_MASK) >> BRBIDR0_EL1_FORMAT_SHIFT;
+}
+
+static inline int brbe_fetch_cc_bits(u64 brbidr)
+{
+ return (brbidr & BRBIDR0_EL1_CC_MASK) >> BRBIDR0_EL1_CC_SHIFT;
+}
+
+static inline void select_brbe_bank(int bank)
+{
+ static int brbe_current_bank = -1;
+ u64 brbfcr;
+
+ if (brbe_current_bank == bank)
+ return;
+
+ WARN_ON(bank > 1);
+ brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
+ brbfcr &= ~BRBFCR_EL1_BANK_MASK;
+ brbfcr |= ((bank << BRBFCR_EL1_BANK_SHIFT) & BRBFCR_EL1_BANK_MASK);
+ write_sysreg_s(brbfcr, SYS_BRBFCR_EL1);
+ isb();
+ brbe_current_bank = bank;
+}
+
+static inline void select_brbe_bank_index(int buffer_idx)
+{
+ switch (buffer_idx) {
+ case BRBE_BANK0_IDX_MIN ... BRBE_BANK0_IDX_MAX:
+ select_brbe_bank(0);
+ break;
+ case BRBE_BANK1_IDX_MIN ... BRBE_BANK1_IDX_MAX:
+ select_brbe_bank(1);
+ break;
+ default:
+ pr_warn("unsupported BRBE index\n");
+ }
+}
+
+static inline bool valid_brbe_nr(int brbe_nr)
+{
+ switch (brbe_nr) {
+ case BRBIDR0_EL1_NUMREC_8:
+ case BRBIDR0_EL1_NUMREC_16:
+ case BRBIDR0_EL1_NUMREC_32:
+ case BRBIDR0_EL1_NUMREC_64:
+ return true;
+ default:
+ pr_warn("unsupported BRBE entries\n");
+ return false;
+ }
+}
+
+static inline bool brbe_paused(void)
+{
+ u64 brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
+
+ return brbfcr & BRBFCR_EL1_PAUSED;
+}
+
+static inline void set_brbe_paused(void)
+{
+ u64 brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
+
+ write_sysreg_s(brbfcr | BRBFCR_EL1_PAUSED, SYS_BRBFCR_EL1);
+ isb();
+}
--
2.25.1
* [PATCH V6 7/7] drivers: perf: arm_pmu: Enable branch stack sampling event
From: Anshuman Khandual @ 2022-12-08 8:44 UTC
To: linux-arm-kernel, linux-kernel, mark.rutland
Cc: Anshuman Khandual, Catalin Marinas, Will Deacon, Mark Brown,
James Clark, Rob Herring, Marc Zyngier, Suzuki Poulose,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
linux-perf-users
Now that all the required pieces are in place, enable perf branch stack
sampling events on supported platforms by removing the gate which blocked
them unconditionally in armpmu_event_init(). Instead, a quick probe is
now performed first via arm_pmu_branch_stack_supported().
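From user space, such an event could now be opened with something like
the following sketch (error handling elided, values illustrative):

#include <linux/perf_event.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(void)
{
	struct perf_event_attr attr = {
		.type			= PERF_TYPE_HARDWARE,
		.config			= PERF_COUNT_HW_CPU_CYCLES,
		.size			= sizeof(attr),
		.sample_period		= 100000,
		.sample_type		= PERF_SAMPLE_IP | PERF_SAMPLE_BRANCH_STACK,
		.branch_sample_type	= PERF_SAMPLE_BRANCH_ANY |
					  PERF_SAMPLE_BRANCH_USER,
		.disabled		= 1,
	};

	/* Fails with EOPNOTSUPP when branch stack sampling is unsupported */
	int fd = syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);

	return fd < 0;
}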
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-kernel@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
drivers/perf/arm_pmu.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 66880a4bb248..52a93b9bcbda 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -510,8 +510,7 @@ static int armpmu_event_init(struct perf_event *event)
!cpumask_test_cpu(event->cpu, &armpmu->supported_cpus))
return -ENOENT;
- /* does not support taken branch sampling */
- if (has_branch_stack(event))
+ if (has_branch_stack(event) && !arm_pmu_branch_stack_supported(armpmu))
return -EOPNOTSUPP;
if (armpmu->map_event(event) == -ENOENT)
--
2.25.1
* Re: [PATCH V6 7/7] drivers: perf: arm_pmu: Enable branch stack sampling event
From: James Clark @ 2022-12-08 11:43 UTC
To: Anshuman Khandual
Cc: Catalin Marinas, Will Deacon, Mark Brown, Rob Herring,
Marc Zyngier, Suzuki Poulose, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, linux-perf-users, linux-arm-kernel,
linux-kernel, mark.rutland
On 08/12/2022 08:44, Anshuman Khandual wrote:
> Now that all the required pieces are in place, enable perf branch stack
> sampling events on supported platforms by removing the gate which blocked
> them unconditionally in armpmu_event_init(). Instead, a quick probe is
> now performed first via arm_pmu_branch_stack_supported().
>
All the issues from the previous versions seem to be resolved now:
Tested-by: James Clark <james.clark@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: linux-kernel@vger.kernel.org
> Cc: linux-arm-kernel@lists.infradead.org
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
> drivers/perf/arm_pmu.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> index 66880a4bb248..52a93b9bcbda 100644
> --- a/drivers/perf/arm_pmu.c
> +++ b/drivers/perf/arm_pmu.c
> @@ -510,8 +510,7 @@ static int armpmu_event_init(struct perf_event *event)
> !cpumask_test_cpu(event->cpu, &armpmu->supported_cpus))
> return -ENOENT;
>
> - /* does not support taken branch sampling */
> - if (has_branch_stack(event))
> + if (has_branch_stack(event) && !arm_pmu_branch_stack_supported(armpmu))
> return -EOPNOTSUPP;
>
> if (armpmu->map_event(event) == -ENOENT)