linux-perf-users.vger.kernel.org archive mirror
* [PATCH v2 0/7] riscv: pmu: Add support for Control Transfer Records Ext.
@ 2025-01-16 23:09 Rajnesh Kanwal
  2025-01-16 23:09 ` [PATCH v2 1/7] perf: Increase the maximum number of samples to 256 Rajnesh Kanwal
                   ` (6 more replies)
  0 siblings, 7 replies; 16+ messages in thread
From: Rajnesh Kanwal @ 2025-01-16 23:09 UTC
  To: linux-kernel, linux-riscv
  Cc: linux-perf-users, adrian.hunter, alexander.shishkin, ajones, anup,
	acme, atishp, beeman, brauner, conor, heiko, irogers, mingo,
	james.clark, renyu.zj, jolsa, jisheng.teoh, palmer, will,
	kaiwenxue1, vincent.chen, Rajnesh Kanwal

This series enables Control Transfer Records (CTR) extension support on
the RISC-V architecture. The extension is similar to Arch LBR on x86 and
BRBE on Arm. The extension has been ratified; the latest release can be
found at [0].

The CTR extension depends on both the implementation of S-mode and the
Sscsrind extension v1.0.0 [1]. CTR accesses the ctrsource, ctrtarget and
ctrdata CSRs using the Sscsrind extension.

The series is based on the Smcdeleg/Ssccfg counter delegation extension
[2] patches [3]. CTR itself doesn't depend on counter delegation support;
the rebase is primarily to pull in the Smcsrind patches.

The last patch updates the perf tool to allow processing 256 entries.
Without it, perf considers such samples corrupted and discards them.

Here is a quick guide [4] on how to set up and run a basic perf demo on
Linux using the CTR extension.
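
(As an illustration, with the exact steps in the guide: branch records
are typically exercised with something like 'perf record -b -e cycles --
<workload>' followed by 'perf report'.)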

QEMU patches can be found here:
https://github.com/rajnesh-kanwal/qemu/tree/b4/ctr_upstream_v5

OpenSBI patch can be found here:
https://github.com/rajnesh-kanwal/opensbi/tree/ctr_upstream_v2

Linux kernel patches can be found here:
https://github.com/rajnesh-kanwal/linux/tree/b4/ctr_upstream_v2

[0]: https://github.com/riscv/riscv-control-transfer-records/releases/tag/v1.0
[1]: https://github.com/riscvarchive/riscv-indirect-csr-access/releases/tag/v1.0.0
[2]: https://github.com/riscvarchive/riscv-smcdeleg-ssccfg/releases/tag/v1.0.0
[3]: https://lore.kernel.org/lkml/20250114-counter_delegation-v2-0-8ba74cdb851b@rivosinc.com/
[4]: https://github.com/rajnesh-kanwal/linux/wiki/Running-CTR-basic-demo-on-QEMU-RISC%E2%80%90V-Virt-machine

Rajnesh Kanwal (7):
  perf: Increase the maximum number of samples to 256.
  riscv: pmu: Add Control transfer records CSR definitions.
  riscv: Add Control Transfer Records extension parsing
  dt-bindings: riscv: add Sxctr ISA extension description
  riscv: pmu: Add infrastructure for Control Transfer Record
  riscv: pmu: Add driver for Control Transfer Records Ext.
  riscv: pmu: Integrate CTR Ext support in riscv_pmu_dev driver

 .../devicetree/bindings/riscv/extensions.yaml |  14 +
 MAINTAINERS                                   |   1 +
 arch/riscv/include/asm/csr.h                  |  83 +++
 arch/riscv/include/asm/hwcap.h                |   4 +
 arch/riscv/kernel/cpufeature.c                |   2 +
 drivers/perf/Kconfig                          |  11 +
 drivers/perf/Makefile                         |   1 +
 drivers/perf/riscv_ctr.c                      | 608 ++++++++++++++++++
 drivers/perf/riscv_pmu_common.c               |  23 +-
 drivers/perf/riscv_pmu_dev.c                  |  82 +++
 drivers/perf/riscv_pmu_legacy.c               |   2 +
 include/linux/perf/riscv_pmu.h                |  55 ++
 tools/perf/util/machine.c                     |  21 +-
 13 files changed, 898 insertions(+), 9 deletions(-)
 create mode 100644 drivers/perf/riscv_ctr.c

-- 
2.34.1



* [PATCH v2 1/7] perf: Increase the maximum number of samples to 256.
  2025-01-16 23:09 [PATCH v2 0/7] riscv: pmu: Add support for Control Transfer Records Ext Rajnesh Kanwal
@ 2025-01-16 23:09 ` Rajnesh Kanwal
  2025-02-20 18:51   ` Ian Rogers
  2025-01-16 23:09 ` [PATCH v2 2/7] riscv: pmu: Add Control transfer records CSR definitions Rajnesh Kanwal
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 16+ messages in thread
From: Rajnesh Kanwal @ 2025-01-16 23:09 UTC
  To: linux-kernel, linux-riscv
  Cc: linux-perf-users, adrian.hunter, alexander.shishkin, ajones, anup,
	acme, atishp, beeman, brauner, conor, heiko, irogers, mingo,
	james.clark, renyu.zj, jolsa, jisheng.teoh, palmer, will,
	kaiwenxue1, vincent.chen, Rajnesh Kanwal

The RISC-V CTR extension supports a maximum depth of 256 last branch
records. The existing 127-entry limit corrupts CTR entries on RISC-V
when the hardware is configured for 256 entries. Other architectures are
unaffected, as this only raises the maximum number of possible entries.
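
A minimal sketch of the index computation, assuming the kernel's
multiplicative hash_64() (GOLDEN_RATIO_64 as in include/linux/hash.h);
with 8 bits the result already lands in [0, 255], so no extra modulo is
needed:

  #include <stdint.h>

  #define GOLDEN_RATIO_64 0x61C8864680B583EBull

  /* hash_64(v, bits): multiply by the golden ratio, keep the top bits. */
  static inline unsigned int hash64_sketch(uint64_t v, unsigned int bits)
  {
          return (unsigned int)((v * GOLDEN_RATIO_64) >> (64 - bits));
  }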

Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
---
 tools/perf/util/machine.c | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 27d5345d2b30..f2eb3c20274e 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -2174,25 +2174,32 @@ static void save_iterations(struct iterations *iter,
 		iter->cycles += be[i].flags.cycles;
 }
 
-#define CHASHSZ 127
-#define CHASHBITS 7
-#define NO_ENTRY 0xff
+#define CHASHBITS 8
+#define NO_ENTRY 0xffU
 
-#define PERF_MAX_BRANCH_DEPTH 127
+#define PERF_MAX_BRANCH_DEPTH 256
 
 /* Remove loops. */
+/* Note: the last entry (i == 0xff) is never checked against NO_ENTRY,
+ * so an unsigned char array can safely hold 256 entries without the
+ * last index clashing with the NO_ENTRY value.
+ */
 static int remove_loops(struct branch_entry *l, int nr,
 			struct iterations *iter)
 {
 	int i, j, off;
-	unsigned char chash[CHASHSZ];
+	unsigned char chash[PERF_MAX_BRANCH_DEPTH];
 
 	memset(chash, NO_ENTRY, sizeof(chash));
 
-	BUG_ON(PERF_MAX_BRANCH_DEPTH > 255);
+	BUG_ON(PERF_MAX_BRANCH_DEPTH > 256);
 
 	for (i = 0; i < nr; i++) {
-		int h = hash_64(l[i].from, CHASHBITS) % CHASHSZ;
+		/* No modulo by PERF_MAX_BRANCH_DEPTH is needed, as
+		 * hash_64() already limits the hash value to
+		 * CHASHBITS bits.
+		 */
+		int h = hash_64(l[i].from, CHASHBITS);
 
 		/* no collision handling for now */
 		if (chash[h] == NO_ENTRY) {
-- 
2.34.1



* [PATCH v2 2/7] riscv: pmu: Add Control transfer records CSR definitions.
  2025-01-16 23:09 [PATCH v2 0/7] riscv: pmu: Add support for Control Transfer Records Ext Rajnesh Kanwal
  2025-01-16 23:09 ` [PATCH v2 1/7] perf: Increase the maximum number of samples to 256 Rajnesh Kanwal
@ 2025-01-16 23:09 ` Rajnesh Kanwal
  2025-01-16 23:09 ` [PATCH v2 3/7] riscv: Add Control Transfer Records extension parsing Rajnesh Kanwal
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 16+ messages in thread
From: Rajnesh Kanwal @ 2025-01-16 23:09 UTC
  To: linux-kernel, linux-riscv
  Cc: linux-perf-users, adrian.hunter, alexander.shishkin, ajones, anup,
	acme, atishp, beeman, brauner, conor, heiko, irogers, mingo,
	james.clark, renyu.zj, jolsa, jisheng.teoh, palmer, will,
	kaiwenxue1, vincent.chen, Rajnesh Kanwal

Add CSR defines for the RISC-V Control Transfer Records extension [0],
along with bit-field macros for each CSR.

[0]: https://github.com/riscv/riscv-control-transfer-records
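
As a hedged illustration (not part of this patch), the defines compose
into a control value via the kernel's csr_write() accessor, e.g. to
record user and supervisor transfers and freeze recording on a local
counter overflow interrupt:

  /* Record U- and S-mode transfers; freeze on LCOF interrupt. */
  unsigned long ctl = CTRCTL_U_ENABLE | CTRCTL_S_ENABLE | CTRCTL_LCOFIFRZ;

  csr_write(CSR_SCTRCTL, ctl);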

Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
---
 arch/riscv/include/asm/csr.h | 83 ++++++++++++++++++++++++++++++++++++
 1 file changed, 83 insertions(+)

diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
index a06d5fec6e6d..465a5e338ccb 100644
--- a/arch/riscv/include/asm/csr.h
+++ b/arch/riscv/include/asm/csr.h
@@ -325,6 +325,85 @@
 
 #define CSR_SCOUNTOVF		0xda0
 
+/* M-mode Control Transfer Records CSRs */
+#define CSR_MCTRCTL		0x34e
+
+/* S-mode Control Transfer Records CSRs */
+#define CSR_SCTRCTL		0x14e
+#define CSR_SCTRSTATUS		0x14f
+#define CSR_SCTRDEPTH		0x15f
+
+/* VS-mode Control Transfer Records CSRs */
+#define CSR_VSCTRCTL		0x24e
+
+/* xctrctl CSR bits. */
+#define CTRCTL_U_ENABLE		_AC(0x1, UL)
+#define CTRCTL_S_ENABLE		_AC(0x2, UL)
+#define CTRCTL_M_ENABLE		_AC(0x4, UL)
+#define CTRCTL_RASEMU		_AC(0x80, UL)
+#define CTRCTL_STE		_AC(0x100, UL)
+#define CTRCTL_MTE		_AC(0x200, UL)
+#define CTRCTL_BPFRZ		_AC(0x800, UL)
+#define CTRCTL_LCOFIFRZ		_AC(0x1000, UL)
+#define CTRCTL_EXCINH		_AC(0x200000000, UL)
+#define CTRCTL_INTRINH		_AC(0x400000000, UL)
+#define CTRCTL_TRETINH		_AC(0x800000000, UL)
+#define CTRCTL_NTBREN		_AC(0x1000000000, UL)
+#define CTRCTL_TKBRINH		_AC(0x2000000000, UL)
+#define CTRCTL_INDCALL_INH	_AC(0x10000000000, UL)
+#define CTRCTL_DIRCALL_INH	_AC(0x20000000000, UL)
+#define CTRCTL_INDJUMP_INH	_AC(0x40000000000, UL)
+#define CTRCTL_DIRJUMP_INH	_AC(0x80000000000, UL)
+#define CTRCTL_CORSWAP_INH	_AC(0x100000000000, UL)
+#define CTRCTL_RET_INH		_AC(0x200000000000, UL)
+#define CTRCTL_INDOJUMP_INH	_AC(0x400000000000, UL)
+#define CTRCTL_DIROJUMP_INH	_AC(0x800000000000, UL)
+
+/* sctrstatus CSR bits. */
+#define SCTRSTATUS_WRPTR_MASK	0xFF
+#define SCTRSTATUS_FROZEN	_AC(0x80000000, UL)
+
+#ifdef CONFIG_RISCV_M_MODE
+#define CTRCTL_KERNEL_ENABLE	CTRCTL_M_ENABLE
+#else
+#define CTRCTL_KERNEL_ENABLE	CTRCTL_S_ENABLE
+#endif
+
+/* sctrdepth CSR bits. */
+#define SCTRDEPTH_MASK		0x7
+
+#define SCTRDEPTH_MIN		0x0 /* 16 Entries. */
+#define SCTRDEPTH_MAX		0x4 /* 256 Entries. */
+
+/* ctrsource, ctrtarget and ctrdata CSR bits. */
+#define CTRSOURCE_VALID		0x1ULL
+#define CTRTARGET_MISP		0x1ULL
+
+#define CTRDATA_TYPE_MASK	0xF
+#define CTRDATA_CCV		0x8000
+#define CTRDATA_CCM_MASK	0xFFF0000
+#define CTRDATA_CCE_MASK	0xF0000000
+
+#define CTRDATA_TYPE_NONE			0
+#define CTRDATA_TYPE_EXCEPTION			1
+#define CTRDATA_TYPE_INTERRUPT			2
+#define CTRDATA_TYPE_TRAP_RET			3
+#define CTRDATA_TYPE_NONTAKEN_BRANCH		4
+#define CTRDATA_TYPE_TAKEN_BRANCH		5
+#define CTRDATA_TYPE_RESERVED_6			6
+#define CTRDATA_TYPE_RESERVED_7			7
+#define CTRDATA_TYPE_INDIRECT_CALL		8
+#define CTRDATA_TYPE_DIRECT_CALL		9
+#define CTRDATA_TYPE_INDIRECT_JUMP		10
+#define CTRDATA_TYPE_DIRECT_JUMP		11
+#define CTRDATA_TYPE_CO_ROUTINE_SWAP		12
+#define CTRDATA_TYPE_RETURN			13
+#define CTRDATA_TYPE_OTHER_INDIRECT_JUMP	14
+#define CTRDATA_TYPE_OTHER_DIRECT_JUMP		15
+
+#define CTR_ENTRIES_FIRST	0x200
+#define CTR_ENTRIES_LAST	0x2ff
+
 #define CSR_SSTATUS		0x100
 #define CSR_SIE			0x104
 #define CSR_STVEC		0x105
@@ -508,6 +587,8 @@
 # define CSR_TOPEI	CSR_MTOPEI
 # define CSR_TOPI	CSR_MTOPI
 
+# define CSR_CTRCTL     CSR_MCTRCTL
+
 # define SR_IE		SR_MIE
 # define SR_PIE		SR_MPIE
 # define SR_PP		SR_MPP
@@ -538,6 +619,8 @@
 # define CSR_TOPEI	CSR_STOPEI
 # define CSR_TOPI	CSR_STOPI
 
+# define CSR_CTRCTL     CSR_SCTRCTL
+
 # define SR_IE		SR_SIE
 # define SR_PIE		SR_SPIE
 # define SR_PP		SR_SPP
-- 
2.34.1



* [PATCH v2 3/7] riscv: Add Control Transfer Records extension parsing
  2025-01-16 23:09 [PATCH v2 0/7] riscv: pmu: Add support for Control Transfer Records Ext Rajnesh Kanwal
  2025-01-16 23:09 ` [PATCH v2 1/7] perf: Increase the maximum number of samples to 256 Rajnesh Kanwal
  2025-01-16 23:09 ` [PATCH v2 2/7] riscv: pmu: Add Control transfer records CSR definitions Rajnesh Kanwal
@ 2025-01-16 23:09 ` Rajnesh Kanwal
  2025-01-16 23:09 ` [PATCH v2 4/7] dt-bindings: riscv: add Sxctr ISA extension description Rajnesh Kanwal
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 16+ messages in thread
From: Rajnesh Kanwal @ 2025-01-16 23:09 UTC
  To: linux-kernel, linux-riscv
  Cc: linux-perf-users, adrian.hunter, alexander.shishkin, ajones, anup,
	acme, atishp, beeman, brauner, conor, heiko, irogers, mingo,
	james.clark, renyu.zj, jolsa, jisheng.teoh, palmer, will,
	kaiwenxue1, vincent.chen, Rajnesh Kanwal

Add the CTR extensions to the ISA extension map so that their
availability can be looked up.
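
Purely for illustration, a driver can then gate its setup on the new map
entries using the existing helper (the CTR driver later in this series
follows the same pattern):

  if (!riscv_isa_extension_available(NULL, SSCTR))
          return 0;  /* CTR not present; skip setup */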

Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
---
 arch/riscv/include/asm/hwcap.h | 4 ++++
 arch/riscv/kernel/cpufeature.c | 2 ++
 2 files changed, 6 insertions(+)

diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
index 42b34e2f80e8..552c7ebae7be 100644
--- a/arch/riscv/include/asm/hwcap.h
+++ b/arch/riscv/include/asm/hwcap.h
@@ -105,6 +105,8 @@
 #define RISCV_ISA_EXT_SSCCFG            96
 #define RISCV_ISA_EXT_SMCDELEG          97
 #define RISCV_ISA_EXT_SMCNTRPMF         98
+#define RISCV_ISA_EXT_SMCTR             99
+#define RISCV_ISA_EXT_SSCTR             100
 
 #define RISCV_ISA_EXT_XLINUXENVCFG	127
 
@@ -115,11 +117,13 @@
 #define RISCV_ISA_EXT_SxAIA		RISCV_ISA_EXT_SMAIA
 #define RISCV_ISA_EXT_SUPM		RISCV_ISA_EXT_SMNPM
 #define RISCV_ISA_EXT_SxCSRIND		RISCV_ISA_EXT_SMCSRIND
+#define RISCV_ISA_EXT_SxCTR		RISCV_ISA_EXT_SMCTR
 #else
 #define RISCV_ISA_EXT_SxAIA		RISCV_ISA_EXT_SSAIA
 #define RISCV_ISA_EXT_SUPM		RISCV_ISA_EXT_SSNPM
 #define RISCV_ISA_EXT_SxAIA		RISCV_ISA_EXT_SSAIA
 #define RISCV_ISA_EXT_SxCSRIND		RISCV_ISA_EXT_SSCSRIND
+#define RISCV_ISA_EXT_SxCTR		RISCV_ISA_EXT_SSCTR
 #endif
 
 #endif /* _ASM_RISCV_HWCAP_H */
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index ec068c9130e5..ef3b70f7d5d2 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -391,6 +391,7 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = {
 	__RISCV_ISA_EXT_DATA(zvkt, RISCV_ISA_EXT_ZVKT),
 	__RISCV_ISA_EXT_DATA(smaia, RISCV_ISA_EXT_SMAIA),
 	__RISCV_ISA_EXT_DATA(smcdeleg, RISCV_ISA_EXT_SMCDELEG),
+	__RISCV_ISA_EXT_DATA(smctr, RISCV_ISA_EXT_SMCTR),
 	__RISCV_ISA_EXT_DATA(smmpm, RISCV_ISA_EXT_SMMPM),
 	__RISCV_ISA_EXT_SUPERSET(smnpm, RISCV_ISA_EXT_SMNPM, riscv_xlinuxenvcfg_exts),
 	__RISCV_ISA_EXT_DATA(smstateen, RISCV_ISA_EXT_SMSTATEEN),
@@ -400,6 +401,7 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = {
 	__RISCV_ISA_EXT_DATA(sscsrind, RISCV_ISA_EXT_SSCSRIND),
 	__RISCV_ISA_EXT_DATA(ssccfg, RISCV_ISA_EXT_SSCCFG),
 	__RISCV_ISA_EXT_DATA(sscofpmf, RISCV_ISA_EXT_SSCOFPMF),
+	__RISCV_ISA_EXT_DATA(ssctr, RISCV_ISA_EXT_SSCTR),
 	__RISCV_ISA_EXT_SUPERSET(ssnpm, RISCV_ISA_EXT_SSNPM, riscv_xlinuxenvcfg_exts),
 	__RISCV_ISA_EXT_DATA(sstc, RISCV_ISA_EXT_SSTC),
 	__RISCV_ISA_EXT_DATA(svade, RISCV_ISA_EXT_SVADE),
-- 
2.34.1



* [PATCH v2 4/7] dt-bindings: riscv: add Sxctr ISA extension description
  2025-01-16 23:09 [PATCH v2 0/7] riscv: pmu: Add support for Control Transfer Records Ext Rajnesh Kanwal
                   ` (2 preceding siblings ...)
  2025-01-16 23:09 ` [PATCH v2 3/7] riscv: Add Control Transfer Records extension parsing Rajnesh Kanwal
@ 2025-01-16 23:09 ` Rajnesh Kanwal
  2025-01-17  7:26   ` Krzysztof Kozlowski
  2025-01-20 18:49   ` Conor Dooley
  2025-01-16 23:09 ` [PATCH v2 5/7] riscv: pmu: Add infrastructure for Control Transfer Record Rajnesh Kanwal
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 16+ messages in thread
From: Rajnesh Kanwal @ 2025-01-16 23:09 UTC
  To: linux-kernel, linux-riscv
  Cc: linux-perf-users, adrian.hunter, alexander.shishkin, ajones, anup,
	acme, atishp, beeman, brauner, conor, heiko, irogers, mingo,
	james.clark, renyu.zj, jolsa, jisheng.teoh, palmer, will,
	kaiwenxue1, vincent.chen, Rajnesh Kanwal

Add the S[m|s]ctr ISA extension description.

Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
---
 .../devicetree/bindings/riscv/extensions.yaml      | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/Documentation/devicetree/bindings/riscv/extensions.yaml b/Documentation/devicetree/bindings/riscv/extensions.yaml
index 848354e3048f..8322503f0773 100644
--- a/Documentation/devicetree/bindings/riscv/extensions.yaml
+++ b/Documentation/devicetree/bindings/riscv/extensions.yaml
@@ -167,6 +167,13 @@ properties:
 	    extension allows other ISA extension to use indirect CSR access
 	    mechanism in M-mode.
 
+        - const: smctr
+          description: |
+            The standard Smctr machine-level extension allows M-mode to enable
+            recording of limited branch history in register-accessible internal
+            core storage. Smctr depends on both the implementation of S-mode
+            and the Sscsrind extension.
+
 	- const: sscsrind
           description: |
             The standard Sscsrind supervisor-level extension extends the
@@ -193,6 +200,13 @@ properties:
             and mode-based filtering as ratified at commit 01d1df0 ("Add ability
             to manually trigger workflow. (#2)") of riscv-count-overflow.
 
+        - const: ssctr
+          description: |
+            The standard Ssctr supervisor-level extension enables recording of
+            limited branch history in a register-accessible internal core
+            storage. Ssctr depends on both the implementation of S-mode and the
+            Sscsrind extension.
+
         - const: ssnpm
           description: |
             The standard Ssnpm extension for next-mode pointer masking as
-- 
2.34.1



* [PATCH v2 5/7] riscv: pmu: Add infrastructure for Control Transfer Record
  2025-01-16 23:09 [PATCH v2 0/7] riscv: pmu: Add support for Control Transfer Records Ext Rajnesh Kanwal
                   ` (3 preceding siblings ...)
  2025-01-16 23:09 ` [PATCH v2 4/7] dt-bindings: riscv: add Sxctr ISA extension description Rajnesh Kanwal
@ 2025-01-16 23:09 ` Rajnesh Kanwal
  2025-01-16 23:09 ` [PATCH v2 6/7] riscv: pmu: Add driver for Control Transfer Records Ext Rajnesh Kanwal
  2025-01-16 23:09 ` [PATCH v2 7/7] riscv: pmu: Integrate CTR Ext support in riscv_pmu_dev driver Rajnesh Kanwal
  6 siblings, 0 replies; 16+ messages in thread
From: Rajnesh Kanwal @ 2025-01-16 23:09 UTC
  To: linux-kernel, linux-riscv
  Cc: linux-perf-users, adrian.hunter, alexander.shishkin, ajones, anup,
	acme, atishp, beeman, brauner, conor, heiko, irogers, mingo,
	james.clark, renyu.zj, jolsa, jisheng.teoh, palmer, will,
	kaiwenxue1, vincent.chen, Rajnesh Kanwal

To support the Control Transfer Records (CTR) extension, we need to extend
the riscv_pmu framework with some basic infrastructure for branch stack
sampling. Subsequent patches will use this to add CTR support in the
riscv_pmu_dev driver.

With CTR, branches are stored in a hardware FIFO, which software samples
when perf events overflow. A task may be context-switched between
overflows, and to avoid leaking samples we need to clear the previous
task's records when a task is switched in. To do this we use the
pmu::sched_task() callback added in this patch.
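
A minimal sketch of the policy this enables (the helper name is a
placeholder for illustration, not the driver API):

  static void ctr_sched_task_sketch(bool sched_in)
  {
          /* On switch-in, drop whatever the previous task recorded. */
          if (sched_in)
                  ctr_clear_all_records();  /* placeholder helper */
  }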

Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
---
 drivers/perf/riscv_pmu_common.c | 20 ++++++++++++++++++++
 drivers/perf/riscv_pmu_dev.c    | 17 +++++++++++++++++
 drivers/perf/riscv_pmu_legacy.c |  2 ++
 include/linux/perf/riscv_pmu.h  | 18 ++++++++++++++++++
 4 files changed, 57 insertions(+)

diff --git a/drivers/perf/riscv_pmu_common.c b/drivers/perf/riscv_pmu_common.c
index 7644147d50b4..c4c4b5d6bed0 100644
--- a/drivers/perf/riscv_pmu_common.c
+++ b/drivers/perf/riscv_pmu_common.c
@@ -157,6 +157,19 @@ u64 riscv_pmu_ctr_get_width_mask(struct perf_event *event)
 	return GENMASK_ULL(cwidth, 0);
 }
 
+static void riscv_pmu_sched_task(struct perf_event_pmu_context *pmu_ctx,
+				 bool sched_in)
+{
+	struct riscv_pmu *pmu;
+
+	if (!pmu_ctx)
+		return;
+
+	pmu = to_riscv_pmu(pmu_ctx->pmu);
+	if (pmu->sched_task)
+		pmu->sched_task(pmu_ctx, sched_in);
+}
+
 u64 riscv_pmu_event_update(struct perf_event *event)
 {
 	struct riscv_pmu *rvpmu = to_riscv_pmu(event->pmu);
@@ -269,6 +282,8 @@ static int riscv_pmu_add(struct perf_event *event, int flags)
 	cpuc->events[idx] = event;
 	cpuc->n_events++;
 	hwc->state = PERF_HES_UPTODATE | PERF_HES_STOPPED;
+	if (rvpmu->ctr_add)
+		rvpmu->ctr_add(event, flags);
 	if (flags & PERF_EF_START)
 		riscv_pmu_start(event, PERF_EF_RELOAD);
 
@@ -286,6 +301,9 @@ static void riscv_pmu_del(struct perf_event *event, int flags)
 
 	riscv_pmu_stop(event, PERF_EF_UPDATE);
 	cpuc->events[hwc->idx] = NULL;
+	if (rvpmu->ctr_del)
+		rvpmu->ctr_del(event, flags);
+
 	/* The firmware need to reset the counter mapping */
 	if (rvpmu->ctr_stop)
 		rvpmu->ctr_stop(event, RISCV_PMU_STOP_FLAG_RESET);
@@ -402,6 +420,7 @@ struct riscv_pmu *riscv_pmu_alloc(void)
 	for_each_possible_cpu(cpuid) {
 		cpuc = per_cpu_ptr(pmu->hw_events, cpuid);
 		cpuc->n_events = 0;
+		cpuc->ctr_users = 0;
 		for (i = 0; i < RISCV_MAX_COUNTERS; i++)
 			cpuc->events[i] = NULL;
 		cpuc->snapshot_addr = NULL;
@@ -416,6 +435,7 @@ struct riscv_pmu *riscv_pmu_alloc(void)
 		.start		= riscv_pmu_start,
 		.stop		= riscv_pmu_stop,
 		.read		= riscv_pmu_read,
+		.sched_task	= riscv_pmu_sched_task,
 	};
 
 	return pmu;
diff --git a/drivers/perf/riscv_pmu_dev.c b/drivers/perf/riscv_pmu_dev.c
index d28d60abaaf2..b9b257607b76 100644
--- a/drivers/perf/riscv_pmu_dev.c
+++ b/drivers/perf/riscv_pmu_dev.c
@@ -1027,6 +1027,12 @@ static void rvpmu_sbi_ctr_stop(struct perf_event *event, unsigned long flag)
 	}
 }
 
+static void pmu_sched_task(struct perf_event_pmu_context *pmu_ctx,
+			   bool sched_in)
+{
+	/* Call CTR specific Sched hook. */
+}
+
 static int rvpmu_sbi_find_num_ctrs(void)
 {
 	struct sbiret ret;
@@ -1569,6 +1575,14 @@ static int rvpmu_deleg_ctr_get_idx(struct perf_event *event)
 	return -ENOENT;
 }
 
+static void rvpmu_ctr_add(struct perf_event *event, int flags)
+{
+}
+
+static void rvpmu_ctr_del(struct perf_event *event, int flags)
+{
+}
+
 static void rvpmu_ctr_start(struct perf_event *event, u64 ival)
 {
 	struct hw_perf_event *hwc = &event->hw;
@@ -1984,6 +1998,8 @@ static int rvpmu_device_probe(struct platform_device *pdev)
 	else
 		pmu->pmu.attr_groups = riscv_sbi_pmu_attr_groups;
 	pmu->cmask = cmask;
+	pmu->ctr_add = rvpmu_ctr_add;
+	pmu->ctr_del = rvpmu_ctr_del;
 	pmu->ctr_start = rvpmu_ctr_start;
 	pmu->ctr_stop = rvpmu_ctr_stop;
 	pmu->event_map = rvpmu_event_map;
@@ -1995,6 +2011,7 @@ static int rvpmu_device_probe(struct platform_device *pdev)
 	pmu->event_mapped = rvpmu_event_mapped;
 	pmu->event_unmapped = rvpmu_event_unmapped;
 	pmu->csr_index = rvpmu_csr_index;
+	pmu->sched_task = pmu_sched_task;
 
 	ret = riscv_pm_pmu_register(pmu);
 	if (ret)
diff --git a/drivers/perf/riscv_pmu_legacy.c b/drivers/perf/riscv_pmu_legacy.c
index 93c8e0fdb589..bee6742d35fa 100644
--- a/drivers/perf/riscv_pmu_legacy.c
+++ b/drivers/perf/riscv_pmu_legacy.c
@@ -115,6 +115,8 @@ static void pmu_legacy_init(struct riscv_pmu *pmu)
 		BIT(RISCV_PMU_LEGACY_INSTRET);
 	pmu->ctr_start = pmu_legacy_ctr_start;
 	pmu->ctr_stop = NULL;
+	pmu->ctr_add = NULL;
+	pmu->ctr_del = NULL;
 	pmu->event_map = pmu_legacy_event_map;
 	pmu->ctr_get_idx = pmu_legacy_ctr_get_idx;
 	pmu->ctr_get_width = pmu_legacy_ctr_get_width;
diff --git a/include/linux/perf/riscv_pmu.h b/include/linux/perf/riscv_pmu.h
index e58f83811988..883781f12ae0 100644
--- a/include/linux/perf/riscv_pmu.h
+++ b/include/linux/perf/riscv_pmu.h
@@ -46,6 +46,13 @@
 	},								\
 }
 
+#define MAX_BRANCH_RECORDS 256
+
+struct branch_records {
+	struct perf_branch_stack branch_stack;
+	struct perf_branch_entry branch_entries[MAX_BRANCH_RECORDS];
+};
+
 struct cpu_hw_events {
 	/* currently enabled events */
 	int			n_events;
@@ -65,6 +72,12 @@ struct cpu_hw_events {
 	bool snapshot_set_done;
 	/* A shadow copy of the counter values to avoid clobbering during multiple SBI calls */
 	u64 snapshot_cval_shcopy[RISCV_MAX_COUNTERS];
+
+	/* Saved branch records. */
+	struct branch_records *branches;
+
+	/* Active events requesting branch records */
+	int ctr_users;
 };
 
 struct riscv_pmu {
@@ -78,6 +91,8 @@ struct riscv_pmu {
 	int		(*ctr_get_idx)(struct perf_event *event);
 	int		(*ctr_get_width)(int idx);
 	void		(*ctr_clear_idx)(struct perf_event *event);
+	void		(*ctr_add)(struct perf_event *event, int flags);
+	void		(*ctr_del)(struct perf_event *event, int flags);
 	void		(*ctr_start)(struct perf_event *event, u64 init_val);
 	void		(*ctr_stop)(struct perf_event *event, unsigned long flag);
 	int		(*event_map)(struct perf_event *event, u64 *config);
@@ -85,10 +100,13 @@ struct riscv_pmu {
 	void		(*event_mapped)(struct perf_event *event, struct mm_struct *mm);
 	void		(*event_unmapped)(struct perf_event *event, struct mm_struct *mm);
 	uint8_t		(*csr_index)(struct perf_event *event);
+	void		(*sched_task)(struct perf_event_pmu_context *ctx, bool sched_in);
 
 	struct cpu_hw_events	__percpu *hw_events;
 	struct hlist_node	node;
 	struct notifier_block   riscv_pm_nb;
+
+	unsigned int ctr_depth;
 };
 
 #define to_riscv_pmu(p) (container_of(p, struct riscv_pmu, pmu))
-- 
2.34.1



* [PATCH v2 6/7] riscv: pmu: Add driver for Control Transfer Records Ext.
  2025-01-16 23:09 [PATCH v2 0/7] riscv: pmu: Add support for Control Transfer Records Ext Rajnesh Kanwal
                   ` (4 preceding siblings ...)
  2025-01-16 23:09 ` [PATCH v2 5/7] riscv: pmu: Add infrastructure for Control Transfer Record Rajnesh Kanwal
@ 2025-01-16 23:09 ` Rajnesh Kanwal
  2025-01-16 23:09 ` [PATCH v2 7/7] riscv: pmu: Integrate CTR Ext support in riscv_pmu_dev driver Rajnesh Kanwal
  6 siblings, 0 replies; 16+ messages in thread
From: Rajnesh Kanwal @ 2025-01-16 23:09 UTC
  To: linux-kernel, linux-riscv
  Cc: linux-perf-users, adrian.hunter, alexander.shishkin, ajones, anup,
	acme, atishp, beeman, brauner, conor, heiko, irogers, mingo,
	james.clark, renyu.zj, jolsa, jisheng.teoh, palmer, will,
	kaiwenxue1, vincent.chen, Rajnesh Kanwal

This adds support for the CTR extension defined in [0]. The extension
allows recording up to a maximum of 256 last branch records.

The CTR extension depends on the S[m|s]csrind and Sscofpmf extensions.
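
One detail worth a worked example: cycle counts are reconstructed from
the ctrdata CCM/CCE fields with the spec formula (2^12 + CCM) << (CCE - 1)
when CCE > 0. A self-contained sketch mirroring ctr_get_cycles() below:

  /* Illustrative mirror of ctr_get_cycles(); not the driver code itself. */
  static unsigned int cycles_sketch(unsigned int ccm, unsigned int cce)
  {
          return cce ? (4096u + ccm) << (cce - 1) : ccm;
  }

  /* e.g. ccm = 16, cce = 3 -> (4096 + 16) << 2 = 16448 cycles */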

Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
---
 MAINTAINERS                    |   1 +
 drivers/perf/Kconfig           |  11 +
 drivers/perf/Makefile          |   1 +
 drivers/perf/riscv_ctr.c       | 608 +++++++++++++++++++++++++++++++++
 include/linux/perf/riscv_pmu.h |  37 ++
 5 files changed, 658 insertions(+)
 create mode 100644 drivers/perf/riscv_ctr.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 2ef7ff933266..7bcd79f33811 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -20177,6 +20177,7 @@ M:	Atish Patra <atishp@atishpatra.org>
 R:	Anup Patel <anup@brainfault.org>
 L:	linux-riscv@lists.infradead.org
 S:	Supported
+F:	drivers/perf/riscv_ctr.c
 F:	drivers/perf/riscv_pmu_common.c
 F:	drivers/perf/riscv_pmu_dev.c
 F:	drivers/perf/riscv_pmu_legacy.c
diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
index b3bdff2a99a4..9107c5208bf5 100644
--- a/drivers/perf/Kconfig
+++ b/drivers/perf/Kconfig
@@ -129,6 +129,17 @@ config ANDES_CUSTOM_PMU
 
 	  If you don't know what to do here, say "Y".
 
+config RISCV_CTR
+       bool "Enable support for Control Transfer Records (CTR)"
+       depends on PERF_EVENTS && RISCV_PMU
+       default y
+       help
+         Enable support for Control Transfer Records (CTR), which
+         allows recording branches, jumps, calls, returns, etc. taken in
+         an execution path. It also supports privilege-based filtering
+         and captures additional relevant information such as cycle
+         counts and branch mispredictions.
+
 config ARM_PMU_ACPI
 	depends on ARM_PMU && ACPI
 	def_bool y
diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
index 0805d740c773..755609f184fe 100644
--- a/drivers/perf/Makefile
+++ b/drivers/perf/Makefile
@@ -20,6 +20,7 @@ obj-$(CONFIG_RISCV_PMU_COMMON) += riscv_pmu_common.o
 obj-$(CONFIG_RISCV_PMU_LEGACY) += riscv_pmu_legacy.o
 obj-$(CONFIG_RISCV_PMU) += riscv_pmu_dev.o
 obj-$(CONFIG_STARFIVE_STARLINK_PMU) += starfive_starlink_pmu.o
+obj-$(CONFIG_RISCV_CTR) += riscv_ctr.o
 obj-$(CONFIG_THUNDERX2_PMU) += thunderx2_pmu.o
 obj-$(CONFIG_XGENE_PMU) += xgene_pmu.o
 obj-$(CONFIG_ARM_SPE_PMU) += arm_spe_pmu.o
diff --git a/drivers/perf/riscv_ctr.c b/drivers/perf/riscv_ctr.c
new file mode 100644
index 000000000000..53419a656043
--- /dev/null
+++ b/drivers/perf/riscv_ctr.c
@@ -0,0 +1,608 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Control transfer records extension Helpers.
+ *
+ * Copyright (C) 2024 Rivos Inc.
+ *
+ * Author: Rajnesh Kanwal <rkanwal@rivosinc.com>
+ */
+
+#define pr_fmt(fmt) "CTR: " fmt
+
+#include <linux/bitfield.h>
+#include <linux/printk.h>
+#include <linux/types.h>
+#include <linux/perf_event.h>
+#include <linux/perf/riscv_pmu.h>
+#include <linux/cpufeature.h>
+#include <asm/hwcap.h>
+#include <asm/csr_ind.h>
+#include <asm/csr.h>
+
+#define CTR_BRANCH_FILTERS_INH  (CTRCTL_EXCINH       | \
+				 CTRCTL_INTRINH      | \
+				 CTRCTL_TRETINH      | \
+				 CTRCTL_TKBRINH      | \
+				 CTRCTL_INDCALL_INH  | \
+				 CTRCTL_DIRCALL_INH  | \
+				 CTRCTL_INDJUMP_INH  | \
+				 CTRCTL_DIRJUMP_INH  | \
+				 CTRCTL_CORSWAP_INH  | \
+				 CTRCTL_RET_INH      | \
+				 CTRCTL_INDOJUMP_INH | \
+				 CTRCTL_DIROJUMP_INH)
+
+#define CTR_BRANCH_ENABLE_BITS (CTRCTL_KERNEL_ENABLE | CTRCTL_U_ENABLE)
+
+/* Branch filters not-supported by CTR extension. */
+#define CTR_EXCLUDE_BRANCH_FILTERS (PERF_SAMPLE_BRANCH_ABORT_TX	        | \
+				    PERF_SAMPLE_BRANCH_IN_TX		| \
+				    PERF_SAMPLE_BRANCH_PRIV_SAVE        | \
+				    PERF_SAMPLE_BRANCH_NO_TX            | \
+				    PERF_SAMPLE_BRANCH_COUNTERS)
+
+/* Branch filters supported by CTR extension. */
+#define CTR_ALLOWED_BRANCH_FILTERS (PERF_SAMPLE_BRANCH_USER		| \
+				    PERF_SAMPLE_BRANCH_KERNEL		| \
+				    PERF_SAMPLE_BRANCH_HV		| \
+				    PERF_SAMPLE_BRANCH_ANY		| \
+				    PERF_SAMPLE_BRANCH_ANY_CALL	        | \
+				    PERF_SAMPLE_BRANCH_ANY_RETURN	| \
+				    PERF_SAMPLE_BRANCH_IND_CALL	        | \
+				    PERF_SAMPLE_BRANCH_COND		| \
+				    PERF_SAMPLE_BRANCH_IND_JUMP	        | \
+				    PERF_SAMPLE_BRANCH_HW_INDEX	        | \
+				    PERF_SAMPLE_BRANCH_NO_FLAGS	        | \
+				    PERF_SAMPLE_BRANCH_NO_CYCLES	| \
+				    PERF_SAMPLE_BRANCH_CALL_STACK       | \
+				    PERF_SAMPLE_BRANCH_CALL		| \
+				    PERF_SAMPLE_BRANCH_TYPE_SAVE)
+
+#define CTR_PERF_BRANCH_FILTERS    (CTR_ALLOWED_BRANCH_FILTERS	        | \
+				    CTR_EXCLUDE_BRANCH_FILTERS)
+
+static u64 allowed_filters __read_mostly;
+
+struct ctr_regset {
+	unsigned long src;
+	unsigned long target;
+	unsigned long ctr_data;
+};
+
+enum {
+	CTR_STATE_NONE,
+	CTR_STATE_VALID,
+};
+
+/* Head is the index of the next available slot. The slot may already be
+ * populated by an old entry, which will be lost on new writes.
+ */
+struct riscv_perf_task_context {
+	int callstack_users;
+	int stack_state;
+	unsigned int num_entries;
+	uint32_t ctr_status;
+	uint64_t ctr_control;
+	struct ctr_regset store[MAX_BRANCH_RECORDS];
+};
+
+static inline u64 get_ctr_src_reg(unsigned int ctr_idx)
+{
+	return csr_ind_read(CSR_SIREG, CTR_ENTRIES_FIRST, ctr_idx);
+}
+
+static inline void set_ctr_src_reg(unsigned int ctr_idx, u64 value)
+{
+	return csr_ind_write(CSR_SIREG, CTR_ENTRIES_FIRST, ctr_idx, value);
+}
+
+static inline u64 get_ctr_tgt_reg(unsigned int ctr_idx)
+{
+	return csr_ind_read(CSR_SIREG2, CTR_ENTRIES_FIRST, ctr_idx);
+}
+
+static inline void set_ctr_tgt_reg(unsigned int ctr_idx, u64 value)
+{
+	return csr_ind_write(CSR_SIREG2, CTR_ENTRIES_FIRST, ctr_idx, value);
+}
+
+static inline u64 get_ctr_data_reg(unsigned int ctr_idx)
+{
+	return csr_ind_read(CSR_SIREG3, CTR_ENTRIES_FIRST, ctr_idx);
+}
+
+static inline void set_ctr_data_reg(unsigned int ctr_idx, u64 value)
+{
+	return csr_ind_write(CSR_SIREG3, CTR_ENTRIES_FIRST, ctr_idx, value);
+}
+
+static inline bool ctr_record_valid(u64 ctr_src)
+{
+	return !!FIELD_GET(CTRSOURCE_VALID, ctr_src);
+}
+
+static inline int ctr_get_mispredict(u64 ctr_target)
+{
+	return FIELD_GET(CTRTARGET_MISP, ctr_target);
+}
+
+static inline unsigned int ctr_get_cycles(u64 ctr_data)
+{
+	const unsigned int cce = FIELD_GET(CTRDATA_CCE_MASK, ctr_data);
+	const unsigned int ccm = FIELD_GET(CTRDATA_CCM_MASK, ctr_data);
+
+	if (ctr_data & CTRDATA_CCV)
+		return 0;
+
+	/* Cycle count formula from the spec: (2^12 + CCM) << (CCE - 1) */
+	if (cce > 0)
+		return (4096 + ccm) << (cce - 1);
+
+	return ccm;
+}
+
+static inline unsigned int ctr_get_type(u64 ctr_data)
+{
+	return FIELD_GET(CTRDATA_TYPE_MASK, ctr_data);
+}
+
+static inline unsigned int ctr_get_depth(u64 ctr_depth)
+{
+	/* Depth table from CTR Spec: 2.4 sctrdepth.
+	 *
+	 * sctrdepth.depth       Depth
+	 * 000			- 16
+	 * 001			- 32
+	 * 010			- 64
+	 * 011			- 128
+	 * 100			- 256
+	 *
+	 * Depth = 16 * 2 ^ (sctrdepth.depth)
+	 * or
+	 * Depth = 16 << sctrdepth.depth.
+	 */
+	return 16 << FIELD_GET(SCTRDEPTH_MASK, ctr_depth);
+}
+
+static inline struct riscv_perf_task_context *task_context(void *ctx)
+{
+	return (struct riscv_perf_task_context *)ctx;
+}
+
+/* Reads CTR entry at idx and stores it in entry struct. */
+static bool get_ctr_regset(struct ctr_regset *entry, unsigned int idx)
+{
+	entry->src = get_ctr_src_reg(idx);
+
+	if (!ctr_record_valid(entry->src))
+		return false;
+
+	/* Keep the VALID bit in src; it is masked off when consumed. */
+	entry->target = get_ctr_tgt_reg(idx);
+	entry->ctr_data = get_ctr_data_reg(idx);
+
+	return true;
+}
+
+static void set_ctr_regset(struct ctr_regset *entry, unsigned int idx)
+{
+	set_ctr_src_reg(idx, entry->src);
+	set_ctr_tgt_reg(idx, entry->target);
+	set_ctr_data_reg(idx, entry->ctr_data);
+}
+
+static u64 branch_type_to_ctr(int branch_type)
+{
+	u64 config = CTR_BRANCH_FILTERS_INH | CTRCTL_LCOFIFRZ;
+
+	if (branch_type & PERF_SAMPLE_BRANCH_USER)
+		config |= CTRCTL_U_ENABLE;
+
+	if (branch_type & PERF_SAMPLE_BRANCH_KERNEL)
+		config |= CTRCTL_KERNEL_ENABLE;
+
+	if (branch_type & PERF_SAMPLE_BRANCH_HV) {
+		if (riscv_isa_extension_available(NULL, h))
+			config |= CTRCTL_KERNEL_ENABLE;
+	}
+
+	if (branch_type & PERF_SAMPLE_BRANCH_ANY) {
+		config &= ~CTR_BRANCH_FILTERS_INH;
+		return config;
+	}
+
+	if (branch_type & PERF_SAMPLE_BRANCH_ANY_CALL) {
+		config &= ~CTRCTL_INDCALL_INH;
+		config &= ~CTRCTL_DIRCALL_INH;
+		config &= ~CTRCTL_EXCINH;
+		config &= ~CTRCTL_INTRINH;
+	}
+
+	if (branch_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
+		config &= ~(CTRCTL_RET_INH | CTRCTL_TRETINH);
+
+	if (branch_type & PERF_SAMPLE_BRANCH_IND_CALL)
+		config &= ~CTRCTL_INDCALL_INH;
+
+	if (branch_type & PERF_SAMPLE_BRANCH_COND)
+		config &= ~CTRCTL_TKBRINH;
+
+	if (branch_type & PERF_SAMPLE_BRANCH_CALL_STACK)
+		config |= CTRCTL_RASEMU;
+
+	if (branch_type & PERF_SAMPLE_BRANCH_IND_JUMP) {
+		config &= ~CTRCTL_INDJUMP_INH;
+		config &= ~CTRCTL_INDOJUMP_INH;
+	}
+
+	if (branch_type & PERF_SAMPLE_BRANCH_CALL)
+		config &= ~CTRCTL_DIRCALL_INH;
+
+	return config;
+}
+
+static const int ctr_perf_map[] = {
+	[CTRDATA_TYPE_NONE]			= PERF_BR_UNKNOWN,
+	[CTRDATA_TYPE_EXCEPTION]		= PERF_BR_SYSCALL,
+	[CTRDATA_TYPE_INTERRUPT]		= PERF_BR_IRQ,
+	[CTRDATA_TYPE_TRAP_RET]			= PERF_BR_ERET,
+	[CTRDATA_TYPE_NONTAKEN_BRANCH]		= PERF_BR_COND,
+	[CTRDATA_TYPE_TAKEN_BRANCH]		= PERF_BR_COND,
+	[CTRDATA_TYPE_RESERVED_6]		= PERF_BR_UNKNOWN,
+	[CTRDATA_TYPE_RESERVED_7]		= PERF_BR_UNKNOWN,
+	[CTRDATA_TYPE_INDIRECT_CALL]		= PERF_BR_IND_CALL,
+	[CTRDATA_TYPE_DIRECT_CALL]		= PERF_BR_CALL,
+	[CTRDATA_TYPE_INDIRECT_JUMP]		= PERF_BR_IND,
+	[CTRDATA_TYPE_DIRECT_JUMP]		= PERF_BR_UNCOND,
+	[CTRDATA_TYPE_CO_ROUTINE_SWAP]		= PERF_BR_UNKNOWN,
+	[CTRDATA_TYPE_RETURN]			= PERF_BR_RET,
+	[CTRDATA_TYPE_OTHER_INDIRECT_JUMP]	= PERF_BR_IND,
+	[CTRDATA_TYPE_OTHER_DIRECT_JUMP]	= PERF_BR_UNCOND,
+};
+
+static void ctr_set_perf_entry_type(struct perf_branch_entry *entry,
+				    u64 ctr_data)
+{
+	int ctr_type = ctr_get_type(ctr_data);
+
+	entry->type = ctr_perf_map[ctr_type];
+	if (entry->type == PERF_BR_UNKNOWN)
+		pr_warn("%d - unknown branch type captured\n", ctr_type);
+}
+
+static void capture_ctr_flags(struct perf_branch_entry *entry,
+			      struct perf_event *event, u64 ctr_data,
+			      u64 ctr_target)
+{
+	if (branch_sample_type(event))
+		ctr_set_perf_entry_type(entry, ctr_data);
+
+	if (!branch_sample_no_cycles(event))
+		entry->cycles = ctr_get_cycles(ctr_data);
+
+	if (!branch_sample_no_flags(event)) {
+		entry->abort = 0;
+		entry->mispred = ctr_get_mispredict(ctr_target);
+		entry->predicted = !entry->mispred;
+	}
+
+	if (branch_sample_priv(event))
+		entry->priv = PERF_BR_PRIV_UNKNOWN;
+}
+
+static void ctr_regset_to_branch_entry(struct cpu_hw_events *cpuc,
+				       struct perf_event *event,
+				       struct ctr_regset *regset,
+				       unsigned int idx)
+{
+	struct perf_branch_entry *entry = &cpuc->branches->branch_entries[idx];
+
+	perf_clear_branch_entry_bitfields(entry);
+	entry->from = regset->src & (~CTRSOURCE_VALID);
+	entry->to = regset->target & (~CTRTARGET_MISP);
+	capture_ctr_flags(entry, event, regset->ctr_data, regset->target);
+}
+
+static void ctr_read_entries(struct cpu_hw_events *cpuc,
+			     struct perf_event *event,
+			     unsigned int depth)
+{
+	struct ctr_regset entry = {};
+	u64 ctr_ctl;
+	int i;
+
+	ctr_ctl = csr_read_clear(CSR_CTRCTL, CTR_BRANCH_ENABLE_BITS);
+
+	for (i = 0; i < depth; i++) {
+		if (!get_ctr_regset(&entry, i))
+			break;
+
+		ctr_regset_to_branch_entry(cpuc, event, &entry, i);
+	}
+
+	csr_set(CSR_CTRCTL, ctr_ctl & CTR_BRANCH_ENABLE_BITS);
+
+	cpuc->branches->branch_stack.nr = i;
+	cpuc->branches->branch_stack.hw_idx = 0;
+}
+
+bool riscv_pmu_ctr_valid(struct perf_event *event)
+{
+	u64 branch_type = event->attr.branch_sample_type;
+
+	if (branch_type & ~allowed_filters) {
+		pr_debug_once("Requested branch filters not supported 0x%llx\n",
+				branch_type & ~allowed_filters);
+		return false;
+	}
+
+	return true;
+}
+
+void riscv_pmu_ctr_consume(struct cpu_hw_events *cpuc, struct perf_event *event)
+{
+	unsigned int depth = to_riscv_pmu(event->pmu)->ctr_depth;
+
+	ctr_read_entries(cpuc, event, depth);
+
+	/* Clear frozen bit. */
+	csr_clear(CSR_SCTRSTATUS, SCTRSTATUS_FROZEN);
+}
+
+static void riscv_pmu_ctr_reset(void)
+{
+	/* FIXME: Replace with sctrclr instruction once support is merged
+	 * into toolchain.
+	 */
+	asm volatile(".4byte 0x10400073\n" ::: "memory");
+	csr_write(CSR_SCTRSTATUS, 0);
+	csr_write(CSR_CTRCTL, 0);
+}
+
+static void __riscv_pmu_ctr_restore(void *ctx)
+{
+	struct riscv_perf_task_context *task_ctx = ctx;
+	unsigned int i;
+
+	csr_write(CSR_SCTRSTATUS, task_ctx->ctr_status);
+
+	for (i = 0; i < task_ctx->num_entries; i++)
+		set_ctr_regset(&task_ctx->store[i], i);
+}
+
+static void riscv_pmu_ctr_restore(void *ctx)
+{
+	if (task_context(ctx)->callstack_users == 0 ||
+	    task_context(ctx)->stack_state == CTR_STATE_NONE) {
+		riscv_pmu_ctr_reset();
+		return;
+	}
+
+	__riscv_pmu_ctr_restore(ctx);
+
+	task_context(ctx)->stack_state = CTR_STATE_NONE;
+}
+
+static void __riscv_pmu_ctr_save(void *ctx, unsigned int depth)
+{
+	struct riscv_perf_task_context *task_ctx = ctx;
+	struct ctr_regset *dst;
+	unsigned int i;
+
+	for (i = 0; i < depth; i++) {
+		dst = &task_ctx->store[i];
+		if (!get_ctr_regset(dst, i))
+			break;
+	}
+
+	task_ctx->num_entries = i;
+
+	task_ctx->ctr_status = csr_read(CSR_SCTRSTATUS);
+}
+
+static void riscv_pmu_ctr_save(void *ctx, unsigned int depth)
+{
+	if (task_context(ctx)->callstack_users == 0) {
+		task_context(ctx)->stack_state = CTR_STATE_NONE;
+		return;
+	}
+
+	__riscv_pmu_ctr_save(ctx, depth);
+
+	task_context(ctx)->stack_state = CTR_STATE_VALID;
+}
+
+/*
+ * On context switch in, we need to make sure no samples from previous tasks
+ * are left in the CTR.
+ *
+ * On ctxswin, sched_in = true, called after the PMU has started
+ * On ctxswout, sched_in = false, called before the PMU is stopped
+ */
+void riscv_pmu_ctr_sched_task(struct perf_event_pmu_context *pmu_ctx,
+			      bool sched_in)
+{
+	struct riscv_pmu *rvpmu = to_riscv_pmu(pmu_ctx->pmu);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(rvpmu->hw_events);
+	void *task_ctx;
+
+	if (!cpuc->ctr_users)
+		return;
+
+	/* Save branch records in task_ctx on sched out */
+	task_ctx = pmu_ctx ? pmu_ctx->task_ctx_data : NULL;
+	if (task_ctx) {
+		if (sched_in)
+			riscv_pmu_ctr_restore(task_ctx);
+		else
+			riscv_pmu_ctr_save(task_ctx, rvpmu->ctr_depth);
+		return;
+	}
+
+	/* Reset branch records on sched in */
+	if (sched_in)
+		riscv_pmu_ctr_reset();
+}
+
+static inline bool branch_user_callstack(unsigned int br_type)
+{
+	return (br_type & PERF_SAMPLE_BRANCH_USER) &&
+		(br_type & PERF_SAMPLE_BRANCH_CALL_STACK);
+}
+
+void riscv_pmu_ctr_add(struct perf_event *event)
+{
+	struct riscv_pmu *rvpmu = to_riscv_pmu(event->pmu);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(rvpmu->hw_events);
+
+	if (branch_user_callstack(event->attr.branch_sample_type) &&
+			event->pmu_ctx->task_ctx_data)
+		task_context(event->pmu_ctx->task_ctx_data)->callstack_users++;
+
+	perf_sched_cb_inc(event->pmu);
+
+	if (!cpuc->ctr_users++)
+		riscv_pmu_ctr_reset();
+}
+
+void riscv_pmu_ctr_del(struct perf_event *event)
+{
+	struct riscv_pmu *rvpmu = to_riscv_pmu(event->pmu);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(rvpmu->hw_events);
+
+	if (branch_user_callstack(event->attr.branch_sample_type) &&
+			event->pmu_ctx->task_ctx_data)
+		task_context(event->pmu_ctx->task_ctx_data)->callstack_users--;
+
+	cpuc->ctr_users--;
+	WARN_ON_ONCE(cpuc->ctr_users < 0);
+
+	perf_sched_cb_dec(event->pmu);
+}
+
+void riscv_pmu_ctr_enable(struct perf_event *event)
+{
+	u64 branch_type = event->attr.branch_sample_type;
+	u64 ctr;
+
+	ctr = branch_type_to_ctr(branch_type);
+	csr_write(CSR_CTRCTL, ctr);
+}
+
+void riscv_pmu_ctr_disable(struct perf_event *event)
+{
+	/* Clear CTRCTL to disable the recording. */
+	csr_write(CSR_CTRCTL, 0);
+}
+
+/*
+ * Check for hardware-supported perf filters here. To avoid missing any
+ * newly added perf filter, we do a BUILD_BUG_ON() check; make sure to
+ * update the CTR_ALLOWED_BRANCH_FILTERS or CTR_EXCLUDE_BRANCH_FILTERS
+ * defines when adding support for one in the function below.
+ */
+static void __init check_available_filters(void)
+{
+	u64 ctr_ctl;
+
+	/*
+	 * Ensure both perf branch filter allowed and exclude
+	 * masks are always in sync with the generic perf ABI.
+	 */
+	BUILD_BUG_ON(CTR_PERF_BRANCH_FILTERS != (PERF_SAMPLE_BRANCH_MAX - 1));
+
+	allowed_filters = PERF_SAMPLE_BRANCH_USER      |
+			  PERF_SAMPLE_BRANCH_KERNEL    |
+			  PERF_SAMPLE_BRANCH_ANY       |
+			  PERF_SAMPLE_BRANCH_HW_INDEX  |
+			  PERF_SAMPLE_BRANCH_NO_FLAGS  |
+			  PERF_SAMPLE_BRANCH_NO_CYCLES |
+			  PERF_SAMPLE_BRANCH_TYPE_SAVE;
+
+	csr_write(CSR_CTRCTL, ~0);
+	ctr_ctl = csr_read(CSR_CTRCTL);
+
+	if (riscv_isa_extension_available(NULL, h))
+		allowed_filters |= PERF_SAMPLE_BRANCH_HV;
+
+	if (ctr_ctl & (CTRCTL_INDCALL_INH | CTRCTL_DIRCALL_INH))
+		allowed_filters |= PERF_SAMPLE_BRANCH_ANY_CALL;
+
+	if (ctr_ctl & (CTRCTL_RET_INH | CTRCTL_TRETINH))
+		allowed_filters |= PERF_SAMPLE_BRANCH_ANY_RETURN;
+
+	if (ctr_ctl & CTRCTL_INDCALL_INH)
+		allowed_filters |= PERF_SAMPLE_BRANCH_IND_CALL;
+
+	if (ctr_ctl & CTRCTL_TKBRINH)
+		allowed_filters |= PERF_SAMPLE_BRANCH_COND;
+
+	if (ctr_ctl & CTRCTL_RASEMU)
+		allowed_filters |= PERF_SAMPLE_BRANCH_CALL_STACK;
+
+	if (ctr_ctl & (CTRCTL_INDOJUMP_INH | CTRCTL_INDJUMP_INH))
+		allowed_filters |= PERF_SAMPLE_BRANCH_IND_JUMP;
+
+	if (ctr_ctl & CTRCTL_DIRCALL_INH)
+		allowed_filters |= PERF_SAMPLE_BRANCH_CALL;
+}
+
+void riscv_pmu_ctr_starting_cpu(void)
+{
+	if (!riscv_isa_extension_available(NULL, SxCTR) ||
+	    !riscv_isa_extension_available(NULL, SSCOFPMF) ||
+	    !riscv_isa_extension_available(NULL, SxCSRIND))
+		return;
+
+	/* Set depth to maximum. */
+	csr_write(CSR_SCTRDEPTH, SCTRDEPTH_MASK);
+}
+
+void riscv_pmu_ctr_dying_cpu(void)
+{
+	if (!riscv_isa_extension_available(NULL, SxCTR) ||
+	    !riscv_isa_extension_available(NULL, SSCOFPMF) ||
+	    !riscv_isa_extension_available(NULL, SxCSRIND))
+		return;
+
+	/* Clear and reset CTR CSRs. */
+	csr_write(CSR_SCTRDEPTH, 0);
+	riscv_pmu_ctr_reset();
+}
+
+int riscv_pmu_ctr_init(struct riscv_pmu *riscv_pmu)
+{
+	size_t size = sizeof(struct riscv_perf_task_context);
+
+	if (!riscv_isa_extension_available(NULL, SxCTR) ||
+	    !riscv_isa_extension_available(NULL, SSCOFPMF) ||
+	    !riscv_isa_extension_available(NULL, SxCSRIND))
+		return 0;
+
+	riscv_pmu->pmu.task_ctx_cache =
+		kmem_cache_create("ctr_task_ctx", size, sizeof(u64), 0, NULL);
+	if (!riscv_pmu->pmu.task_ctx_cache)
+		return -ENOMEM;
+
+	check_available_filters();
+
+	/* Set depth to maximum. */
+	csr_write(CSR_SCTRDEPTH, SCTRDEPTH_MASK);
+	riscv_pmu->ctr_depth = ctr_get_depth(csr_read(CSR_SCTRDEPTH));
+
+	pr_info("Perf CTR available with depth %d\n", riscv_pmu->ctr_depth);
+
+	return 0;
+}
+
+void riscv_pmu_ctr_finish(struct riscv_pmu *riscv_pmu)
+{
+	if (!riscv_pmu_ctr_supported(riscv_pmu))
+		return;
+
+	csr_write(CSR_SCTRDEPTH, 0);
+	riscv_pmu->ctr_depth = 0;
+	riscv_pmu_ctr_reset();
+
+	kmem_cache_destroy(riscv_pmu->pmu.task_ctx_cache);
+}
diff --git a/include/linux/perf/riscv_pmu.h b/include/linux/perf/riscv_pmu.h
index 883781f12ae0..f32b6dcc3491 100644
--- a/include/linux/perf/riscv_pmu.h
+++ b/include/linux/perf/riscv_pmu.h
@@ -127,6 +127,43 @@ struct riscv_pmu *riscv_pmu_alloc(void);
 int riscv_pmu_get_hpm_info(u32 *hw_ctr_width, u32 *num_hw_ctr);
 #endif
 
+static inline bool riscv_pmu_ctr_supported(struct riscv_pmu *pmu)
+{
+	return !!pmu->ctr_depth;
+}
+
 #endif /* CONFIG_RISCV_PMU_COMMON */
 
+#ifdef CONFIG_RISCV_CTR
+
+bool riscv_pmu_ctr_valid(struct perf_event *event);
+void riscv_pmu_ctr_consume(struct cpu_hw_events *cpuc, struct perf_event *event);
+void riscv_pmu_ctr_sched_task(struct perf_event_pmu_context *pmu_ctx, bool sched_in);
+void riscv_pmu_ctr_add(struct perf_event *event);
+void riscv_pmu_ctr_del(struct perf_event *event);
+void riscv_pmu_ctr_enable(struct perf_event *event);
+void riscv_pmu_ctr_disable(struct perf_event *event);
+void riscv_pmu_ctr_dying_cpu(void);
+void riscv_pmu_ctr_starting_cpu(void);
+int riscv_pmu_ctr_init(struct riscv_pmu *riscv_pmu);
+void riscv_pmu_ctr_finish(struct riscv_pmu *riscv_pmu);
+
+#else
+
+static inline bool riscv_pmu_ctr_valid(struct perf_event *event) { return false; }
+static inline void riscv_pmu_ctr_consume(struct cpu_hw_events *cpuc,
+				      struct perf_event *event) { }
+static inline void riscv_pmu_ctr_sched_task(struct perf_event_pmu_context *pmu_ctx,
+					    bool sched_in) { }
+static inline void riscv_pmu_ctr_add(struct perf_event *event) { }
+static inline void riscv_pmu_ctr_del(struct perf_event *event) { }
+static inline void riscv_pmu_ctr_enable(struct perf_event *event) { }
+static inline void riscv_pmu_ctr_disable(struct perf_event *event) { }
+static inline void riscv_pmu_ctr_dying_cpu(void) { }
+static inline void riscv_pmu_ctr_starting_cpu(void) { }
+static inline int riscv_pmu_ctr_init(struct riscv_pmu *riscv_pmu) { return 0; }
+static inline void riscv_pmu_ctr_finish(struct riscv_pmu *riscv_pmu) { }
+
+#endif /* CONFIG_RISCV_CTR */
+
 #endif /* _RISCV_PMU_H */
-- 
2.34.1



* [PATCH v2 7/7] riscv: pmu: Integrate CTR Ext support in riscv_pmu_dev driver
  2025-01-16 23:09 [PATCH v2 0/7] riscv: pmu: Add support for Control Transfer Records Ext Rajnesh Kanwal
                   ` (5 preceding siblings ...)
  2025-01-16 23:09 ` [PATCH v2 6/7] riscv: pmu: Add driver for Control Transfer Records Ext Rajnesh Kanwal
@ 2025-01-16 23:09 ` Rajnesh Kanwal
  6 siblings, 0 replies; 16+ messages in thread
From: Rajnesh Kanwal @ 2025-01-16 23:09 UTC
  To: linux-kernel, linux-riscv
  Cc: linux-perf-users, adrian.hunter, alexander.shishkin, ajones, anup,
	acme, atishp, beeman, brauner, conor, heiko, irogers, mingo,
	james.clark, renyu.zj, jolsa, jisheng.teoh, palmer, will,
	kaiwenxue1, vincent.chen, Rajnesh Kanwal

This integrates the recently added CTR extension support into the
riscv_pmu_dev driver to enable branch stack sampling using PMU events.

This mainly adds CTR enable/disable callbacks in the rvpmu_ctr_stop()
and rvpmu_ctr_start() functions to start/stop branch recording along
with the event.

The PMU overflow handler rvpmu_ovf_handler() is also updated to sample
CTR entries when an overflow occurs for an event programmed to record
branches. The recorded entries are fed to core perf for further
processing.
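
From userspace this is driven through the regular perf ABI; a hedged
sketch of the relevant perf_event_attr fields (values illustrative):

  #include <linux/perf_event.h>

  struct perf_event_attr attr = {
          .type               = PERF_TYPE_HARDWARE,
          .config             = PERF_COUNT_HW_CPU_CYCLES,
          .sample_period      = 100000,
          .sample_type        = PERF_SAMPLE_BRANCH_STACK,
          .branch_sample_type = PERF_SAMPLE_BRANCH_ANY |
                                PERF_SAMPLE_BRANCH_USER,
  };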

Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
---
 drivers/perf/riscv_pmu_common.c |  3 +-
 drivers/perf/riscv_pmu_dev.c    | 67 ++++++++++++++++++++++++++++++++-
 2 files changed, 67 insertions(+), 3 deletions(-)

diff --git a/drivers/perf/riscv_pmu_common.c b/drivers/perf/riscv_pmu_common.c
index c4c4b5d6bed0..23077a6c4931 100644
--- a/drivers/perf/riscv_pmu_common.c
+++ b/drivers/perf/riscv_pmu_common.c
@@ -327,8 +327,7 @@ static int riscv_pmu_event_init(struct perf_event *event)
 	u64 event_config = 0;
 	uint64_t cmask;
 
-	/* driver does not support branch stack sampling */
-	if (has_branch_stack(event))
+	if (needs_branch_stack(event) && !riscv_pmu_ctr_supported(rvpmu))
 		return -EOPNOTSUPP;
 
 	hwc->flags = 0;
diff --git a/drivers/perf/riscv_pmu_dev.c b/drivers/perf/riscv_pmu_dev.c
index b9b257607b76..10697deb1d26 100644
--- a/drivers/perf/riscv_pmu_dev.c
+++ b/drivers/perf/riscv_pmu_dev.c
@@ -1030,7 +1030,7 @@ static void rvpmu_sbi_ctr_stop(struct perf_event *event, unsigned long flag)
 static void pmu_sched_task(struct perf_event_pmu_context *pmu_ctx,
 			   bool sched_in)
 {
-	/* Call CTR specific Sched hook. */
+	riscv_pmu_ctr_sched_task(pmu_ctx, sched_in);
 }
 
 static int rvpmu_sbi_find_num_ctrs(void)
@@ -1379,6 +1379,13 @@ static irqreturn_t rvpmu_ovf_handler(int irq, void *dev)
 		hw_evt->state |= PERF_HES_UPTODATE;
 		perf_sample_data_init(&data, 0, hw_evt->last_period);
 		if (riscv_pmu_event_set_period(event)) {
+			if (needs_branch_stack(event)) {
+				riscv_pmu_ctr_consume(cpu_hw_evt, event);
+				perf_sample_save_brstack(
+					&data, event,
+					&cpu_hw_evt->branches->branch_stack, NULL);
+			}
+
 			/*
 			 * Unlike other ISAs, RISC-V don't have to disable interrupts
 			 * to avoid throttling here. As per the specification, the
@@ -1577,10 +1584,14 @@ static int rvpmu_deleg_ctr_get_idx(struct perf_event *event)
 
 static void rvpmu_ctr_add(struct perf_event *event, int flags)
 {
+	if (needs_branch_stack(event))
+		riscv_pmu_ctr_add(event);
 }
 
 static void rvpmu_ctr_del(struct perf_event *event, int flags)
 {
+	if (needs_branch_stack(event))
+		riscv_pmu_ctr_del(event);
 }
 
 static void rvpmu_ctr_start(struct perf_event *event, u64 ival)
@@ -1595,6 +1606,9 @@ static void rvpmu_ctr_start(struct perf_event *event, u64 ival)
 	if ((hwc->flags & PERF_EVENT_FLAG_USER_ACCESS) &&
 	    (hwc->flags & PERF_EVENT_FLAG_USER_READ_CNT))
 		rvpmu_set_scounteren((void *)event);
+
+	if (needs_branch_stack(event))
+		riscv_pmu_ctr_enable(event);
 }
 
 static void rvpmu_ctr_stop(struct perf_event *event, unsigned long flag)
@@ -1617,6 +1631,9 @@ static void rvpmu_ctr_stop(struct perf_event *event, unsigned long flag)
 	} else {
 		rvpmu_sbi_ctr_stop(event, flag);
 	}
+
+	if (needs_branch_stack(event) && flag != RISCV_PMU_STOP_FLAG_RESET)
+		riscv_pmu_ctr_disable(event);
 }
 
 static int rvpmu_find_ctrs(void)
@@ -1652,6 +1669,9 @@ static int rvpmu_event_map(struct perf_event *event, u64 *econfig)
 {
 	u64 config1;
 
+	if (needs_branch_stack(event) && !riscv_pmu_ctr_valid(event))
+		return -EOPNOTSUPP;
+
 	config1 = event->attr.config1;
 	if (riscv_pmu_cdeleg_available() && !pmu_sbi_is_fw_event(event) &&
 	    !(config1 & RISCV_PMU_CONFIG1_GUEST_EVENTS)) { /* GUEST events rely on SBI encoding */
@@ -1701,6 +1721,8 @@ static int rvpmu_starting_cpu(unsigned int cpu, struct hlist_node *node)
 		enable_percpu_irq(riscv_pmu_irq, IRQ_TYPE_NONE);
 	}
 
+	riscv_pmu_ctr_starting_cpu();
+
 	if (sbi_pmu_snapshot_available())
 		return pmu_sbi_snapshot_setup(pmu, cpu);
 
@@ -1715,6 +1737,7 @@ static int rvpmu_dying_cpu(unsigned int cpu, struct hlist_node *node)
 
 	/* Disable all counters access for user mode now */
 	csr_write(CSR_SCOUNTEREN, 0x0);
+	riscv_pmu_ctr_dying_cpu();
 
 	if (sbi_pmu_snapshot_available())
 		return pmu_sbi_snapshot_disable();
@@ -1838,6 +1861,29 @@ static void riscv_pmu_destroy(struct riscv_pmu *pmu)
 	cpuhp_state_remove_instance(CPUHP_AP_PERF_RISCV_STARTING, &pmu->node);
 }
 
+static int branch_records_alloc(struct riscv_pmu *pmu)
+{
+	struct branch_records __percpu *tmp_alloc_ptr;
+	struct branch_records *records;
+	struct cpu_hw_events *events;
+	int cpu;
+
+	if (!riscv_pmu_ctr_supported(pmu))
+		return 0;
+
+	tmp_alloc_ptr = alloc_percpu_gfp(struct branch_records, GFP_KERNEL);
+	if (!tmp_alloc_ptr)
+		return -ENOMEM;
+
+	for_each_possible_cpu(cpu) {
+		events = per_cpu_ptr(pmu->hw_events, cpu);
+		records = per_cpu_ptr(tmp_alloc_ptr, cpu);
+		events->branches = records;
+	}
+
+	return 0;
+}
+
 static void rvpmu_event_init(struct perf_event *event)
 {
 	/*
@@ -1850,6 +1896,9 @@ static void rvpmu_event_init(struct perf_event *event)
 		event->hw.flags |= PERF_EVENT_FLAG_USER_ACCESS;
 	else
 		event->hw.flags |= PERF_EVENT_FLAG_LEGACY;
+
+	if (branch_sample_call_stack(event))
+		event->attach_state |= PERF_ATTACH_TASK_DATA;
 }
 
 static void rvpmu_event_mapped(struct perf_event *event, struct mm_struct *mm)
@@ -1997,6 +2046,15 @@ static int rvpmu_device_probe(struct platform_device *pdev)
 		pmu->pmu.attr_groups = riscv_cdeleg_pmu_attr_groups;
 	else
 		pmu->pmu.attr_groups = riscv_sbi_pmu_attr_groups;
+
+	ret = riscv_pmu_ctr_init(pmu);
+	if (ret)
+		goto out_free;
+
+	ret = branch_records_alloc(pmu);
+	if (ret)
+		goto out_ctr_finish;
+
 	pmu->cmask = cmask;
 	pmu->ctr_add = rvpmu_ctr_add;
 	pmu->ctr_del = rvpmu_ctr_del;
@@ -2013,6 +2071,10 @@ static int rvpmu_device_probe(struct platform_device *pdev)
 	pmu->csr_index = rvpmu_csr_index;
 	pmu->sched_task = pmu_sched_task;
 
+	ret = cpuhp_state_add_instance(CPUHP_AP_PERF_RISCV_STARTING, &pmu->node);
+	if (ret)
+		goto out_ctr_finish;
+
 	ret = riscv_pm_pmu_register(pmu);
 	if (ret)
 		goto out_unregister;
@@ -2062,6 +2124,9 @@ static int rvpmu_device_probe(struct platform_device *pdev)
 out_unregister:
 	riscv_pmu_destroy(pmu);
 
+out_ctr_finish:
+	riscv_pmu_ctr_finish(pmu);
+
 out_free:
 	kfree(pmu);
 	return ret;
-- 
2.34.1



* Re: [PATCH v2 4/7] dt-bindings: riscv: add Sxctr ISA extension description
  2025-01-16 23:09 ` [PATCH v2 4/7] dt-bindings: riscv: add Sxctr ISA extension description Rajnesh Kanwal
@ 2025-01-17  7:26   ` Krzysztof Kozlowski
  2025-01-20 14:31     ` Rajnesh Kanwal
  2025-01-20 18:49   ` Conor Dooley
  1 sibling, 1 reply; 16+ messages in thread
From: Krzysztof Kozlowski @ 2025-01-17  7:26 UTC
  To: Rajnesh Kanwal, linux-kernel, linux-riscv
  Cc: linux-perf-users, adrian.hunter, alexander.shishkin, ajones, anup,
	acme, atishp, beeman, brauner, conor, heiko, irogers, mingo,
	james.clark, renyu.zj, jolsa, jisheng.teoh, palmer, will,
	kaiwenxue1, vincent.chen

On 17/01/2025 00:09, Rajnesh Kanwal wrote:
> Add the S[m|s]ctr ISA extension description.
> 
> Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
> ---
>  .../devicetree/bindings/riscv/extensions.yaml      | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)

<form letter>
Please use scripts/get_maintainer.pl to get a list of necessary people
and lists to CC. It might happen, that command when run on an older
kernel, gives you outdated entries. Therefore please be sure you base
your patches on recent Linux kernel.

Tools like b4 or scripts/get_maintainer.pl provide you proper list of
people, so fix your workflow. Tools might also fail if you work on some
ancient tree (don't, instead use mainline) or work on fork of kernel
(don't, instead use mainline). Just use b4 and everything should be
fine, although remember about `b4 prep --auto-to-cc` if you added new
patches to the patchset.

You missed at least devicetree list (maybe more), so this won't be
tested by automated tooling. Performing review on untested code might be
a waste of time.

Please kindly resend and include all necessary To/Cc entries.
</form letter>

Best regards,
Krzysztof


* Re: [PATCH v2 4/7] dt-bindings: riscv: add Sxctr ISA extension description
  2025-01-17  7:26   ` Krzysztof Kozlowski
@ 2025-01-20 14:31     ` Rajnesh Kanwal
  0 siblings, 0 replies; 16+ messages in thread
From: Rajnesh Kanwal @ 2025-01-20 14:31 UTC
  To: Krzysztof Kozlowski
  Cc: linux-kernel, linux-riscv, linux-perf-users, adrian.hunter,
	alexander.shishkin, ajones, anup, acme, atishp, beeman, brauner,
	conor, heiko, irogers, mingo, james.clark, renyu.zj, jolsa,
	jisheng.teoh, palmer, will, kaiwenxue1, vincent.chen

Hi Krzysztof,

Sorry, my bad. I will keep this in mind next time.
Thanks for pointing it out.

- Rajnesh

On Fri, Jan 17, 2025 at 7:26 AM Krzysztof Kozlowski <krzk@kernel.org> wrote:
>
> On 17/01/2025 00:09, Rajnesh Kanwal wrote:
> > Add the S[m|s]ctr ISA extension description.
> >
> > Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
> > ---
> >  .../devicetree/bindings/riscv/extensions.yaml      | 14 ++++++++++++++
> >  1 file changed, 14 insertions(+)
>
> <form letter>
> Please use scripts/get_maintainers.pl to get a list of necessary people
> and lists to CC. It might happen, that command when run on an older
> kernel, gives you outdated entries. Therefore please be sure you base
> your patches on recent Linux kernel.
>
> Tools like b4 or scripts/get_maintainer.pl provide you proper list of
> people, so fix your workflow. Tools might also fail if you work on some
> ancient tree (don't, instead use mainline) or work on fork of kernel
> (don't, instead use mainline). Just use b4 and everything should be
> fine, although remember about `b4 prep --auto-to-cc` if you added new
> patches to the patchset.
>
> You missed at least devicetree list (maybe more), so this won't be
> tested by automated tooling. Performing review on untested code might be
> a waste of time.
>
> Please kindly resend and include all necessary To/Cc entries.
> </form letter>
>
> Best regards,
> Krzysztof

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 4/7] dt-bindings: riscv: add Sxctr ISA extension description
  2025-01-16 23:09 ` [PATCH v2 4/7] dt-bindings: riscv: add Sxctr ISA extension description Rajnesh Kanwal
  2025-01-17  7:26   ` Krzysztof Kozlowski
@ 2025-01-20 18:49   ` Conor Dooley
  1 sibling, 0 replies; 16+ messages in thread
From: Conor Dooley @ 2025-01-20 18:49 UTC (permalink / raw)
  To: Rajnesh Kanwal
  Cc: linux-kernel, linux-riscv, linux-perf-users, adrian.hunter,
	alexander.shishkin, ajones, anup, acme, atishp, beeman, brauner,
	heiko, irogers, mingo, james.clark, renyu.zj, jolsa, jisheng.teoh,
	palmer, will, kaiwenxue1, vincent.chen

On Thu, Jan 16, 2025 at 11:09:52PM +0000, Rajnesh Kanwal wrote:
> Add the S[m|s]ctr ISA extension description.
> 
> Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
> ---
>  .../devicetree/bindings/riscv/extensions.yaml      | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/riscv/extensions.yaml b/Documentation/devicetree/bindings/riscv/extensions.yaml
> index 848354e3048f..8322503f0773 100644
> --- a/Documentation/devicetree/bindings/riscv/extensions.yaml
> +++ b/Documentation/devicetree/bindings/riscv/extensions.yaml
> @@ -167,6 +167,13 @@ properties:
>  	    extension allows other ISA extension to use indirect CSR access
>  	    mechanism in M-mode.
>  
> +        - const: smctr
> +          description: |
> +            The standard Smctr supervisor-level extension for the machine mode
> +            to enable recording limited branch history in a register-accessible
> +            internal core storage. Smctr depend on both the implementation of
> +            S-mode and the Sscsrind extension.

Please, like the other extensions, cite the commit (and repo) where the
extension was frozen or ratified.
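
For example, following the pattern of the sscofpmf entry further down
(the commit hash and subject here are placeholders, not real values):

    as ratified at commit <sha> ("<subject>") of
    riscv-control-transfer-records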

> +
>  	- const: sscsrind
>            description: |
>              The standard Sscsrind supervisor-level extension extends the
> @@ -193,6 +200,13 @@ properties:
>              and mode-based filtering as ratified at commit 01d1df0 ("Add ability
>              to manually trigger workflow. (#2)") of riscv-count-overflow.
>  
> +        - const: ssctr
> +          description: |
> +            The standard Ssctr supervisor-level extension enables recording of
> +            limited branch history in a register-accessible internal core
> +            storage. Ssctr depend on both the implementation of S-mode and the
> +            Sscsrind extension.
> +
>          - const: ssnpm
>            description: |
>              The standard Ssnpm extension for next-mode pointer masking as
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/7] perf: Increase the maximum number of samples to 256.
  2025-01-16 23:09 ` [PATCH v2 1/7] perf: Increase the maximum number of samples to 256 Rajnesh Kanwal
@ 2025-02-20 18:51   ` Ian Rogers
  2025-04-17 12:51     ` Rajnesh Kanwal
  0 siblings, 1 reply; 16+ messages in thread
From: Ian Rogers @ 2025-02-20 18:51 UTC (permalink / raw)
  To: Rajnesh Kanwal, Liang, Kan
  Cc: linux-kernel, linux-riscv, linux-perf-users, adrian.hunter,
	alexander.shishkin, ajones, anup, acme, atishp, beeman, brauner,
	conor, heiko, mingo, james.clark, renyu.zj, jolsa, jisheng.teoh,
	palmer, will, kaiwenxue1, vincent.chen

On Thu, Jan 16, 2025 at 3:10 PM Rajnesh Kanwal <rkanwal@rivosinc.com> wrote:
>
> The RISC-V CTR extension supports a maximum depth of 256 last branch
> records. The 127-entry limit results in corrupted CTR entries on
> RISC-V when configured for 256 entries. This will not impact any
> other architecture, as it only raises the maximum number of possible
> entries.

I wonder if, rather than a constant, this code should just use the
auto-resizing hashmap code?
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/hashmap.h

I assume the value of 127 comes from perf_event.h's PERF_MAX_STACK_DEPTH:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/uapi/linux/perf_event.h#n1252

Perhaps these constants shouldn't exist. The perf-record man page mentions:
sysctl.kernel.perf_event_max_stack
which I believe gets a value from
/proc/sys/kernel/perf_event_max_stack, so maybe these should be
determined at runtime rather than at compile time.
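
E.g. a minimal sketch of reading that sysctl once at start-up (untested,
and the helper name is made up rather than existing perf code):

	#include <stdio.h>

	/* Read the kernel's limit from /proc at runtime, falling back
	 * to the historical default of 127 if the sysctl is missing or
	 * unreadable. */
	static int read_perf_event_max_stack(void)
	{
		FILE *f = fopen("/proc/sys/kernel/perf_event_max_stack", "r");
		int val = 127;

		if (f) {
			if (fscanf(f, "%d", &val) != 1)
				val = 127;
			fclose(f);
		}
		return val;
	}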

Thanks,
Ian

> Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
> ---
>  tools/perf/util/machine.c | 21 ++++++++++++++-------
>  1 file changed, 14 insertions(+), 7 deletions(-)
>
> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> index 27d5345d2b30..f2eb3c20274e 100644
> --- a/tools/perf/util/machine.c
> +++ b/tools/perf/util/machine.c
> @@ -2174,25 +2174,32 @@ static void save_iterations(struct iterations *iter,
>                 iter->cycles += be[i].flags.cycles;
>  }
>
> -#define CHASHSZ 127
> -#define CHASHBITS 7
> -#define NO_ENTRY 0xff
> +#define CHASHBITS 8
> +#define NO_ENTRY 0xffU
>
> -#define PERF_MAX_BRANCH_DEPTH 127
> +#define PERF_MAX_BRANCH_DEPTH 256
>
>  /* Remove loops. */
> +/* Note: Last entry (i==ff) will never be checked against NO_ENTRY
> + * so it's safe to have an unsigned char array to process 256 entries
> + * without causing clash between last entry and NO_ENTRY value.
> + */
>  static int remove_loops(struct branch_entry *l, int nr,
>                         struct iterations *iter)
>  {
>         int i, j, off;
> -       unsigned char chash[CHASHSZ];
> +       unsigned char chash[PERF_MAX_BRANCH_DEPTH];
>
>         memset(chash, NO_ENTRY, sizeof(chash));
>
> -       BUG_ON(PERF_MAX_BRANCH_DEPTH > 255);
> +       BUG_ON(PERF_MAX_BRANCH_DEPTH > 256);
>
>         for (i = 0; i < nr; i++) {
> -               int h = hash_64(l[i].from, CHASHBITS) % CHASHSZ;
> +               /* Remainder division by PERF_MAX_BRANCH_DEPTH is not
> +                * needed as hash_64 will anyway limit the hash
> +                * to CHASHBITS
> +                */
> +               int h = hash_64(l[i].from, CHASHBITS);
>
>                 /* no collision handling for now */
>                 if (chash[h] == NO_ENTRY) {
> --
> 2.34.1
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/7] perf: Increase the maximum number of samples to 256.
  2025-02-20 18:51   ` Ian Rogers
@ 2025-04-17 12:51     ` Rajnesh Kanwal
  2025-05-21 10:47       ` Rajnesh Kanwal
  0 siblings, 1 reply; 16+ messages in thread
From: Rajnesh Kanwal @ 2025-04-17 12:51 UTC (permalink / raw)
  To: Ian Rogers, ak
  Cc: Liang, Kan, linux-kernel, linux-riscv, linux-perf-users,
	adrian.hunter, alexander.shishkin, ajones, anup, acme, atishp,
	beeman, brauner, conor, heiko, mingo, james.clark, renyu.zj,
	jolsa, jisheng.teoh, palmer, will, kaiwenxue1, vincent.chen

+ Adding Andi Kleen.

On Thu, Feb 20, 2025 at 6:51 PM Ian Rogers <irogers@google.com> wrote:
>
> On Thu, Jan 16, 2025 at 3:10 PM Rajnesh Kanwal <rkanwal@rivosinc.com> wrote:
> >
> > RISCV CTR extension support a maximum depth of 256 last branch records.
> > The 127 entries limit results in corrupting CTR entries for RISC-V if
> > configured to be 256 entries. This will not impact any other architectures
> > as it is just increasing maximum limit of possible entries.
>
> I wonder if rather than a constant this code should just use the auto
> resizing hashmap code?
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/hashmap.h
>
> I assume the value of 127 comes from perf_event.h's PERF_MAX_STACK_DEPTH:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/uapi/linux/perf_event.h#n1252
>
> Perhaps these constants shouldn't exist. The perf-record man page mentions:
> sysctl.kernel.perf_event_max_stack
> which I believe gets a value from
> /proc/sys/kernel/perf_event_max_stack, so maybe these should be
> runtime determined constants rather than compile time.
>

Thanks Ian for your feedback. I am not sure it's feasible to use an
auto-resizing hashmap here. On each sample of 256 entries we would be
doing six callocs and transferring a whole lot of entries in
hashmap_grow. We can't reuse the old hashmap either, so we bear the
same cost on every sample.

I do agree this should be more dynamic, but the maximum number of
entries remove_loops can process is limited by the element type of the
chash array here. I can change it and the related logic to use uint16_t
or wider, but we will still have a cap on the number of entries.
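
To illustrate the cap, here is a standalone sketch of the indexing
scheme (not the actual perf code; the hash is a stand-in for
hash_64(from, 8)):

	#include <stdint.h>
	#include <string.h>

	#define NO_ENTRY  0xffU
	#define MAX_DEPTH 256	/* 8-bit indices cover 0..255 */

	/* chash[h] holds the index of the first entry seen at hash h.
	 * Index 255 aliases NO_ENTRY, but it is only ever stored for
	 * the final entry and nothing after it performs a lookup, so
	 * the clash is harmless. */
	static void index_entries(const uint64_t *from, int nr)
	{
		unsigned char chash[MAX_DEPTH];
		int i;

		memset(chash, NO_ENTRY, sizeof(chash));

		for (i = 0; i < nr; i++) {
			unsigned int h = (unsigned int)
				((from[i] * 0x61C8864680B583EBULL) >> 56);

			if (chash[h] == NO_ENTRY)
				chash[h] = i;	/* first occurrence */
			/* else: possible loop back to entry chash[h] */
		}
	}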

PERF_MAX_BRANCH_DEPTH seems to denote how many entries remove_loops can
process. It is used by thread__resolve_callchain_sample to check
whether a sample is processable before calling remove_loops. I don't
think it can be changed to use perf_event_max_stack, but I can rename
the macro to avoid confusion.

I hadn't noticed PERF_MAX_STACK_DEPTH. It seems to be defined in
multiple places and touches BPF as well. I agree that we should avoid
this macro and use a runtime-determined value instead. To be honest, I
don't have a super in-depth understanding of perf. I will give this a
try and send a patch in the next update. It would be helpful if you
could review it.

Thanks
-Rajnesh

> Thanks,
> Ian
>
> > Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
> > ---
> >  tools/perf/util/machine.c | 21 ++++++++++++++-------
> >  1 file changed, 14 insertions(+), 7 deletions(-)
> >
> > diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> > index 27d5345d2b30..f2eb3c20274e 100644
> > --- a/tools/perf/util/machine.c
> > +++ b/tools/perf/util/machine.c
> > @@ -2174,25 +2174,32 @@ static void save_iterations(struct iterations *iter,
> >                 iter->cycles += be[i].flags.cycles;
> >  }
> >
> > -#define CHASHSZ 127
> > -#define CHASHBITS 7
> > -#define NO_ENTRY 0xff
> > +#define CHASHBITS 8
> > +#define NO_ENTRY 0xffU
> >
> > -#define PERF_MAX_BRANCH_DEPTH 127
> > +#define PERF_MAX_BRANCH_DEPTH 256
> >
> >  /* Remove loops. */
> > +/* Note: Last entry (i==ff) will never be checked against NO_ENTRY
> > + * so it's safe to have an unsigned char array to process 256 entries
> > + * without causing clash between last entry and NO_ENTRY value.
> > + */
> >  static int remove_loops(struct branch_entry *l, int nr,
> >                         struct iterations *iter)
> >  {
> >         int i, j, off;
> > -       unsigned char chash[CHASHSZ];
> > +       unsigned char chash[PERF_MAX_BRANCH_DEPTH];
> >
> >         memset(chash, NO_ENTRY, sizeof(chash));
> >
> > -       BUG_ON(PERF_MAX_BRANCH_DEPTH > 255);
> > +       BUG_ON(PERF_MAX_BRANCH_DEPTH > 256);
> >
> >         for (i = 0; i < nr; i++) {
> > -               int h = hash_64(l[i].from, CHASHBITS) % CHASHSZ;
> > +               /* Remainder division by PERF_MAX_BRANCH_DEPTH is not
> > +                * needed as hash_64 will anyway limit the hash
> > +                * to CHASHBITS
> > +                */
> > +               int h = hash_64(l[i].from, CHASHBITS);
> >
> >                 /* no collision handling for now */
> >                 if (chash[h] == NO_ENTRY) {
> > --
> > 2.34.1
> >

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/7] perf: Increase the maximum number of samples to 256.
  2025-04-17 12:51     ` Rajnesh Kanwal
@ 2025-05-21 10:47       ` Rajnesh Kanwal
  2025-05-21 15:36         ` Ian Rogers
  0 siblings, 1 reply; 16+ messages in thread
From: Rajnesh Kanwal @ 2025-05-21 10:47 UTC (permalink / raw)
  To: Ian Rogers, ak
  Cc: Liang, Kan, linux-kernel, linux-riscv, linux-perf-users,
	adrian.hunter, alexander.shishkin, ajones, anup, acme, atishp,
	beeman, brauner, conor, heiko, mingo, james.clark, renyu.zj,
	jolsa, jisheng.teoh, palmer, will, kaiwenxue1, vincent.chen

On Thu, Apr 17, 2025 at 1:51 PM Rajnesh Kanwal <rkanwal@rivosinc.com> wrote:
>
> + Adding Andi Kleen.
>
> On Thu, Feb 20, 2025 at 6:51 PM Ian Rogers <irogers@google.com> wrote:
> >
> > On Thu, Jan 16, 2025 at 3:10 PM Rajnesh Kanwal <rkanwal@rivosinc.com> wrote:
> > >
> > > RISCV CTR extension support a maximum depth of 256 last branch records.
> > > The 127 entries limit results in corrupting CTR entries for RISC-V if
> > > configured to be 256 entries. This will not impact any other architectures
> > > as it is just increasing maximum limit of possible entries.
> >
> > I wonder if rather than a constant this code should just use the auto
> > resizing hashmap code?
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/hashmap.h
> >
> > I assume the value of 127 comes from perf_event.h's PERF_MAX_STACK_DEPTH:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/uapi/linux/perf_event.h#n1252
> >
> > Perhaps these constants shouldn't exist. The perf-record man page mentions:
> > sysctl.kernel.perf_event_max_stack
> > which I believe gets a value from
> > /proc/sys/kernel/perf_event_max_stack, so maybe these should be
> > runtime determined constants rather than compile time.

While working on this, I came across the following two patches. It
looks like what you suggested was tried before, but Arnaldo later
reverted the change from the report and script commands for the reasons
mentioned in the second patch.

https://lore.kernel.org/lkml/1461767472-8827-31-git-send-email-acme@kernel.org/
https://lore.kernel.org/lkml/1463696493-27528-8-git-send-email-acme@kernel.org/

Regards
Rajnesh


> >
>
> Thanks Ian for your feedback. I am not sure if it's feasible to use auto
> resizing hashmap here. On each sample of 256 entries we will be doing
> 6 callocs and transferring a whole lot of entries in hashmap_grow. We
> can't reuse old hashmap as well. On each sample we bear the same cost
>
> But I do agree this should be more dynamic but the maximum number
> of entries remove_loops can process is limited by the type of chash array
> here. I can change it and related logic to use uint16_t or higher but we
> will still have a cap on the number of entries.
>
> PERF_MAX_BRANCH_DEPTH seems to be denoting what remove_loops
> can process. This is being used by thread__resolve_callchain_sample to
> check if the sample is processable before calling remove_loops. I think
> this can't be changed to use perf_event_max_stack. But I can rename
> this macro to avoid confusion.
>
> I didn't notice PERF_MAX_STACK_DEPTH. This seems to be defined in
> multiple places and touches bpf as well. I agree that we should avoid
> using this macro and use runtime determined value instead. Tbh I don't
> have super in-depth perf understanding. I will give this a try and send a
> patch in the next update. It would be helpful if you can review it.
>
> Thanks
> -Rajnesh
>
> > Thanks,
> > Ian
> >
> > > Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
> > > ---
> > >  tools/perf/util/machine.c | 21 ++++++++++++++-------
> > >  1 file changed, 14 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> > > index 27d5345d2b30..f2eb3c20274e 100644
> > > --- a/tools/perf/util/machine.c
> > > +++ b/tools/perf/util/machine.c
> > > @@ -2174,25 +2174,32 @@ static void save_iterations(struct iterations *iter,
> > >                 iter->cycles += be[i].flags.cycles;
> > >  }
> > >
> > > -#define CHASHSZ 127
> > > -#define CHASHBITS 7
> > > -#define NO_ENTRY 0xff
> > > +#define CHASHBITS 8
> > > +#define NO_ENTRY 0xffU
> > >
> > > -#define PERF_MAX_BRANCH_DEPTH 127
> > > +#define PERF_MAX_BRANCH_DEPTH 256
> > >
> > >  /* Remove loops. */
> > > +/* Note: Last entry (i==ff) will never be checked against NO_ENTRY
> > > + * so it's safe to have an unsigned char array to process 256 entries
> > > + * without causing clash between last entry and NO_ENTRY value.
> > > + */
> > >  static int remove_loops(struct branch_entry *l, int nr,
> > >                         struct iterations *iter)
> > >  {
> > >         int i, j, off;
> > > -       unsigned char chash[CHASHSZ];
> > > +       unsigned char chash[PERF_MAX_BRANCH_DEPTH];
> > >
> > >         memset(chash, NO_ENTRY, sizeof(chash));
> > >
> > > -       BUG_ON(PERF_MAX_BRANCH_DEPTH > 255);
> > > +       BUG_ON(PERF_MAX_BRANCH_DEPTH > 256);
> > >
> > >         for (i = 0; i < nr; i++) {
> > > -               int h = hash_64(l[i].from, CHASHBITS) % CHASHSZ;
> > > +               /* Remainder division by PERF_MAX_BRANCH_DEPTH is not
> > > +                * needed as hash_64 will anyway limit the hash
> > > +                * to CHASHBITS
> > > +                */
> > > +               int h = hash_64(l[i].from, CHASHBITS);
> > >
> > >                 /* no collision handling for now */
> > >                 if (chash[h] == NO_ENTRY) {
> > > --
> > > 2.34.1
> > >

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/7] perf: Increase the maximum number of samples to 256.
  2025-05-21 10:47       ` Rajnesh Kanwal
@ 2025-05-21 15:36         ` Ian Rogers
  2025-05-21 17:40           ` Rajnesh Kanwal
  0 siblings, 1 reply; 16+ messages in thread
From: Ian Rogers @ 2025-05-21 15:36 UTC (permalink / raw)
  To: Rajnesh Kanwal
  Cc: ak, Liang, Kan, linux-kernel, linux-riscv, linux-perf-users,
	adrian.hunter, alexander.shishkin, ajones, anup, acme, atishp,
	beeman, brauner, conor, heiko, mingo, james.clark, renyu.zj,
	jolsa, jisheng.teoh, palmer, will, kaiwenxue1, vincent.chen

On Wed, May 21, 2025 at 3:47 AM Rajnesh Kanwal <rkanwal@rivosinc.com> wrote:
>
> On Thu, Apr 17, 2025 at 1:51 PM Rajnesh Kanwal <rkanwal@rivosinc.com> wrote:
> >
> > + Adding Andi Kleen.
> >
> > On Thu, Feb 20, 2025 at 6:51 PM Ian Rogers <irogers@google.com> wrote:
> > >
> > > On Thu, Jan 16, 2025 at 3:10 PM Rajnesh Kanwal <rkanwal@rivosinc.com> wrote:
> > > >
> > > > RISCV CTR extension support a maximum depth of 256 last branch records.
> > > > The 127 entries limit results in corrupting CTR entries for RISC-V if
> > > > configured to be 256 entries. This will not impact any other architectures
> > > > as it is just increasing maximum limit of possible entries.
> > >
> > > I wonder if rather than a constant this code should just use the auto
> > > resizing hashmap code?
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/hashmap.h
> > >
> > > I assume the value of 127 comes from perf_event.h's PERF_MAX_STACK_DEPTH:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/uapi/linux/perf_event.h#n1252
> > >
> > > Perhaps these constants shouldn't exist. The perf-record man page mentions:
> > > sysctl.kernel.perf_event_max_stack
> > > which I believe gets a value from
> > > /proc/sys/kernel/perf_event_max_stack, so maybe these should be
> > > runtime determined constants rather than compile time.
>
> While working on this, I came across the following two patches. It
> looks like what
> you have suggested, it was tried before but later on Arnaldo reverted the change
> from report and script cmds due to reasons mentioned in the second patch.
>
> https://lore.kernel.org/lkml/1461767472-8827-31-git-send-email-acme@kernel.org/
> https://lore.kernel.org/lkml/1463696493-27528-8-git-send-email-acme@kernel.org/

Thanks Rajnesh, agreed on what you found. I wonder whether, to resolve
the issue, we could add a header feature:
https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/header.h?h=perf-tools-next#n21
for the max stack depth.

Thanks,
Ian

> Regards
> Rajnesh
>
>
> > >
> >
> > Thanks Ian for your feedback. I am not sure if it's feasible to use auto
> > resizing hashmap here. On each sample of 256 entries we will be doing
> > 6 callocs and transferring a whole lot of entries in hashmap_grow. We
> > can't reuse old hashmap as well. On each sample we bear the same cost
> >
> > But I do agree this should be more dynamic but the maximum number
> > of entries remove_loops can process is limited by the type of chash array
> > here. I can change it and related logic to use uint16_t or higher but we
> > will still have a cap on the number of entries.
> >
> > PERF_MAX_BRANCH_DEPTH seems to be denoting what remove_loops
> > can process. This is being used by thread__resolve_callchain_sample to
> > check if the sample is processable before calling remove_loops. I think
> > this can't be changed to use perf_event_max_stack. But I can rename
> > this macro to avoid confusion.
> >
> > I didn't notice PERF_MAX_STACK_DEPTH. This seems to be defined in
> > multiple places and touches bpf as well. I agree that we should avoid
> > using this macro and use runtime determined value instead. Tbh I don't
> > have super in-depth perf understanding. I will give this a try and send a
> > patch in the next update. It would be helpful if you can review it.
> >
> > Thanks
> > -Rajnesh
> >
> > > Thanks,
> > > Ian
> > >
> > > > Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
> > > > ---
> > > >  tools/perf/util/machine.c | 21 ++++++++++++++-------
> > > >  1 file changed, 14 insertions(+), 7 deletions(-)
> > > >
> > > > diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> > > > index 27d5345d2b30..f2eb3c20274e 100644
> > > > --- a/tools/perf/util/machine.c
> > > > +++ b/tools/perf/util/machine.c
> > > > @@ -2174,25 +2174,32 @@ static void save_iterations(struct iterations *iter,
> > > >                 iter->cycles += be[i].flags.cycles;
> > > >  }
> > > >
> > > > -#define CHASHSZ 127
> > > > -#define CHASHBITS 7
> > > > -#define NO_ENTRY 0xff
> > > > +#define CHASHBITS 8
> > > > +#define NO_ENTRY 0xffU
> > > >
> > > > -#define PERF_MAX_BRANCH_DEPTH 127
> > > > +#define PERF_MAX_BRANCH_DEPTH 256
> > > >
> > > >  /* Remove loops. */
> > > > +/* Note: Last entry (i==ff) will never be checked against NO_ENTRY
> > > > + * so it's safe to have an unsigned char array to process 256 entries
> > > > + * without causing clash between last entry and NO_ENTRY value.
> > > > + */
> > > >  static int remove_loops(struct branch_entry *l, int nr,
> > > >                         struct iterations *iter)
> > > >  {
> > > >         int i, j, off;
> > > > -       unsigned char chash[CHASHSZ];
> > > > +       unsigned char chash[PERF_MAX_BRANCH_DEPTH];
> > > >
> > > >         memset(chash, NO_ENTRY, sizeof(chash));
> > > >
> > > > -       BUG_ON(PERF_MAX_BRANCH_DEPTH > 255);
> > > > +       BUG_ON(PERF_MAX_BRANCH_DEPTH > 256);
> > > >
> > > >         for (i = 0; i < nr; i++) {
> > > > -               int h = hash_64(l[i].from, CHASHBITS) % CHASHSZ;
> > > > +               /* Remainder division by PERF_MAX_BRANCH_DEPTH is not
> > > > +                * needed as hash_64 will anyway limit the hash
> > > > +                * to CHASHBITS
> > > > +                */
> > > > +               int h = hash_64(l[i].from, CHASHBITS);
> > > >
> > > >                 /* no collision handling for now */
> > > >                 if (chash[h] == NO_ENTRY) {
> > > > --
> > > > 2.34.1
> > > >

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/7] perf: Increase the maximum number of samples to 256.
  2025-05-21 15:36         ` Ian Rogers
@ 2025-05-21 17:40           ` Rajnesh Kanwal
  0 siblings, 0 replies; 16+ messages in thread
From: Rajnesh Kanwal @ 2025-05-21 17:40 UTC (permalink / raw)
  To: Ian Rogers
  Cc: ak, Liang, Kan, linux-kernel, linux-riscv, linux-perf-users,
	adrian.hunter, alexander.shishkin, ajones, anup, acme, atishp,
	beeman, brauner, conor, heiko, mingo, james.clark, renyu.zj,
	jolsa, jisheng.teoh, palmer, will, kaiwenxue1, vincent.chen

I am not sure, to be honest. I have never worked with this part of the
perf tool before. I think someone else with a deeper understanding of
perf can answer this better. I will also try to go through it to build
some understanding.

Also, I think this can be done as a new work item/series. Currently I am
just trying to increase the number of entries the remove_loops function
can process. Going through the commit description now, I feel like
I have not done a good job of describing the change.

Basically, when I use the --branch-history option on the report command
to add the branches to the callstack, the remove_loops logic complains
about the size of the sample and discards it as corrupted, because it
is hardcoded to process a maximum of 127 entries.
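
For reference, this is roughly how I hit it (commands for illustration;
it needs a kernel and CPU that can supply that many branch records):

	# record branch stacks (CTR can supply up to 256 records/sample)
	perf record -b -- ./workload

	# fold the branches into the call stacks; with more than 127
	# branch entries per sample, remove_loops rejects the sample
	perf report --branch-history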

Here is the patch that added this option:
https://lore.kernel.org/all/1415844328-4884-3-git-send-email-andi@firstfloor.org/T/#u

Thanks
Rajnesh

On Wed, May 21, 2025 at 4:36 PM Ian Rogers <irogers@google.com> wrote:
>
> On Wed, May 21, 2025 at 3:47 AM Rajnesh Kanwal <rkanwal@rivosinc.com> wrote:
> >
> > On Thu, Apr 17, 2025 at 1:51 PM Rajnesh Kanwal <rkanwal@rivosinc.com> wrote:
> > >
> > > + Adding Andi Kleen.
> > >
> > > On Thu, Feb 20, 2025 at 6:51 PM Ian Rogers <irogers@google.com> wrote:
> > > >
> > > > On Thu, Jan 16, 2025 at 3:10 PM Rajnesh Kanwal <rkanwal@rivosinc.com> wrote:
> > > > >
> > > > > RISCV CTR extension support a maximum depth of 256 last branch records.
> > > > > The 127 entries limit results in corrupting CTR entries for RISC-V if
> > > > > configured to be 256 entries. This will not impact any other architectures
> > > > > as it is just increasing maximum limit of possible entries.
> > > >
> > > > I wonder if rather than a constant this code should just use the auto
> > > > resizing hashmap code?
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/hashmap.h
> > > >
> > > > I assume the value of 127 comes from perf_event.h's PERF_MAX_STACK_DEPTH:
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/uapi/linux/perf_event.h#n1252
> > > >
> > > > Perhaps these constants shouldn't exist. The perf-record man page mentions:
> > > > sysctl.kernel.perf_event_max_stack
> > > > which I believe gets a value from
> > > > /proc/sys/kernel/perf_event_max_stack, so maybe these should be
> > > > runtime determined constants rather than compile time.
> >
> > While working on this, I came across the following two patches. It
> > looks like what
> > you have suggested, it was tried before but later on Arnaldo reverted the change
> > from report and script cmds due to reasons mentioned in the second patch.
> >
> > https://lore.kernel.org/lkml/1461767472-8827-31-git-send-email-acme@kernel.org/
> > https://lore.kernel.org/lkml/1463696493-27528-8-git-send-email-acme@kernel.org/
>
> Thanks Rajnash, agreed on what you found. I wonder to resolve the
> issue we could add a header feature:
> https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/header.h?h=perf-tools-next#n21
> for max stack depth.
>
> Thanks,
> Ian
>
> > Regards
> > Rajnesh
> >
> >
> > > >
> > >
> > > Thanks Ian for your feedback. I am not sure if it's feasible to use auto
> > > resizing hashmap here. On each sample of 256 entries we will be doing
> > > 6 callocs and transferring a whole lot of entries in hashmap_grow. We
> > > can't reuse old hashmap as well. On each sample we bear the same cost
> > >
> > > But I do agree this should be more dynamic but the maximum number
> > > of entries remove_loops can process is limited by the type of chash array
> > > here. I can change it and related logic to use uint16_t or higher but we
> > > will still have a cap on the number of entries.
> > >
> > > PERF_MAX_BRANCH_DEPTH seems to be denoting what remove_loops
> > > can process. This is being used by thread__resolve_callchain_sample to
> > > check if the sample is processable before calling remove_loops. I think
> > > this can't be changed to use perf_event_max_stack. But I can rename
> > > this macro to avoid confusion.
> > >
> > > I didn't notice PERF_MAX_STACK_DEPTH. This seems to be defined in
> > > multiple places and touches bpf as well. I agree that we should avoid
> > > using this macro and use runtime determined value instead. Tbh I don't
> > > have super in-depth perf understanding. I will give this a try and send a
> > > patch in the next update. It would be helpful if you can review it.
> > >
> > > Thanks
> > > -Rajnesh
> > >
> > > > Thanks,
> > > > Ian
> > > >
> > > > > Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
> > > > > ---
> > > > >  tools/perf/util/machine.c | 21 ++++++++++++++-------
> > > > >  1 file changed, 14 insertions(+), 7 deletions(-)
> > > > >
> > > > > diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> > > > > index 27d5345d2b30..f2eb3c20274e 100644
> > > > > --- a/tools/perf/util/machine.c
> > > > > +++ b/tools/perf/util/machine.c
> > > > > @@ -2174,25 +2174,32 @@ static void save_iterations(struct iterations *iter,
> > > > >                 iter->cycles += be[i].flags.cycles;
> > > > >  }
> > > > >
> > > > > -#define CHASHSZ 127
> > > > > -#define CHASHBITS 7
> > > > > -#define NO_ENTRY 0xff
> > > > > +#define CHASHBITS 8
> > > > > +#define NO_ENTRY 0xffU
> > > > >
> > > > > -#define PERF_MAX_BRANCH_DEPTH 127
> > > > > +#define PERF_MAX_BRANCH_DEPTH 256
> > > > >
> > > > >  /* Remove loops. */
> > > > > +/* Note: Last entry (i==ff) will never be checked against NO_ENTRY
> > > > > + * so it's safe to have an unsigned char array to process 256 entries
> > > > > + * without causing clash between last entry and NO_ENTRY value.
> > > > > + */
> > > > >  static int remove_loops(struct branch_entry *l, int nr,
> > > > >                         struct iterations *iter)
> > > > >  {
> > > > >         int i, j, off;
> > > > > -       unsigned char chash[CHASHSZ];
> > > > > +       unsigned char chash[PERF_MAX_BRANCH_DEPTH];
> > > > >
> > > > >         memset(chash, NO_ENTRY, sizeof(chash));
> > > > >
> > > > > -       BUG_ON(PERF_MAX_BRANCH_DEPTH > 255);
> > > > > +       BUG_ON(PERF_MAX_BRANCH_DEPTH > 256);
> > > > >
> > > > >         for (i = 0; i < nr; i++) {
> > > > > -               int h = hash_64(l[i].from, CHASHBITS) % CHASHSZ;
> > > > > +               /* Remainder division by PERF_MAX_BRANCH_DEPTH is not
> > > > > +                * needed as hash_64 will anyway limit the hash
> > > > > +                * to CHASHBITS
> > > > > +                */
> > > > > +               int h = hash_64(l[i].from, CHASHBITS);
> > > > >
> > > > >                 /* no collision handling for now */
> > > > >                 if (chash[h] == NO_ENTRY) {
> > > > > --
> > > > > 2.34.1
> > > > >

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2025-05-21 17:40 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-01-16 23:09 [PATCH v2 0/7] riscv: pmu: Add support for Control Transfer Records Ext Rajnesh Kanwal
2025-01-16 23:09 ` [PATCH v2 1/7] perf: Increase the maximum number of samples to 256 Rajnesh Kanwal
2025-02-20 18:51   ` Ian Rogers
2025-04-17 12:51     ` Rajnesh Kanwal
2025-05-21 10:47       ` Rajnesh Kanwal
2025-05-21 15:36         ` Ian Rogers
2025-05-21 17:40           ` Rajnesh Kanwal
2025-01-16 23:09 ` [PATCH v2 2/7] riscv: pmu: Add Control transfer records CSR definations Rajnesh Kanwal
2025-01-16 23:09 ` [PATCH v2 3/7] riscv: Add Control Transfer Records extension parsing Rajnesh Kanwal
2025-01-16 23:09 ` [PATCH v2 4/7] dt-bindings: riscv: add Sxctr ISA extension description Rajnesh Kanwal
2025-01-17  7:26   ` Krzysztof Kozlowski
2025-01-20 14:31     ` Rajnesh Kanwal
2025-01-20 18:49   ` Conor Dooley
2025-01-16 23:09 ` [PATCH v2 5/7] riscv: pmu: Add infrastructure for Control Transfer Record Rajnesh Kanwal
2025-01-16 23:09 ` [PATCH v2 6/7] riscv: pmu: Add driver for Control Transfer Records Ext Rajnesh Kanwal
2025-01-16 23:09 ` [PATCH v2 7/7] riscv: pmu: Integrate CTR Ext support in riscv_pmu_dev driver Rajnesh Kanwal
