* [PATCH v3 1/7] perf: Increase the maximum number of branches remove_loops() can process.
2025-05-22 23:25 [PATCH v3 0/7] riscv: pmu: Add support for Control Transfer Records Ext Rajnesh Kanwal
@ 2025-05-22 23:25 ` Rajnesh Kanwal
2025-05-27 9:59 ` Rajnesh Kanwal
2025-05-22 23:25 ` [PATCH v3 2/7] riscv: pmu: Add Control transfer records CSR definitions Rajnesh Kanwal
` (5 subsequent siblings)
6 siblings, 1 reply; 11+ messages in thread
From: Rajnesh Kanwal @ 2025-05-22 23:25 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Alexandre Ghiti, Atish Kumar Patra, Anup Patel,
Will Deacon, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
Beeman Strong
Cc: linux-perf-users, linux-kernel, linux-riscv, linux-arm-kernel,
Palmer Dabbelt, Conor Dooley, devicetree, Rajnesh Kanwal
The RISC-V CTR extension supports a maximum depth of 256 last branch records.
Currently remove_loops() can process at most 127 entries, so samples with
more than 127 entries are skipped. Update the remove_loops() logic to be
able to process 256 entries.
Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
---
tools/perf/util/machine.c | 21 ++++++++++++++-------
1 file changed, 14 insertions(+), 7 deletions(-)
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 2d51badfbf2e2d1588fa4fdd42ef6c8fea35bf0e..5414528b9d336790decfb42a4f6a4da6c6b68b07 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -2176,25 +2176,32 @@ static void save_iterations(struct iterations *iter,
iter->cycles += be[i].flags.cycles;
}
-#define CHASHSZ 127
-#define CHASHBITS 7
-#define NO_ENTRY 0xff
+#define CHASHBITS 8
+#define NO_ENTRY 0xffU
-#define PERF_MAX_BRANCH_DEPTH 127
+#define PERF_MAX_BRANCH_DEPTH 256
/* Remove loops. */
+/* Note: the last entry (i == 0xff) is never compared against NO_ENTRY,
+ * so an unsigned char array can safely hold indices for 256 entries
+ * without the last index clashing with the NO_ENTRY sentinel.
+ */
static int remove_loops(struct branch_entry *l, int nr,
struct iterations *iter)
{
int i, j, off;
- unsigned char chash[CHASHSZ];
+ unsigned char chash[PERF_MAX_BRANCH_DEPTH];
memset(chash, NO_ENTRY, sizeof(chash));
- BUG_ON(PERF_MAX_BRANCH_DEPTH > 255);
+ BUG_ON(PERF_MAX_BRANCH_DEPTH > 256);
for (i = 0; i < nr; i++) {
- int h = hash_64(l[i].from, CHASHBITS) % CHASHSZ;
+ /* The modulo by CHASHSZ is no longer needed:
+ * hash_64() already limits the hash to
+ * CHASHBITS bits, i.e. the range [0, 255].
+ */
+ int h = hash_64(l[i].from, CHASHBITS);
/* no collision handling for now */
if (chash[h] == NO_ENTRY) {
--
2.43.0
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
* Re: [PATCH v3 1/7] perf: Increase the maximum number of branches remove_loops() can process.
2025-05-22 23:25 ` [PATCH v3 1/7] perf: Increase the maximum number of branches remove_loops() can process Rajnesh Kanwal
@ 2025-05-27 9:59 ` Rajnesh Kanwal
0 siblings, 0 replies; 11+ messages in thread
From: Rajnesh Kanwal @ 2025-05-27 9:59 UTC (permalink / raw)
To: ak, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Alexandre Ghiti, Atish Kumar Patra, Anup Patel,
Will Deacon, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
Beeman Strong
Cc: linux-perf-users, linux-kernel, linux-riscv, linux-arm-kernel,
Conor Dooley, devicetree
Adding Andi Kleen as this was originally written by him.
-Rajnesh
On Fri, May 23, 2025 at 12:26 AM Rajnesh Kanwal <rkanwal@rivosinc.com> wrote:
>
> The RISC-V CTR extension supports a maximum depth of 256 last branch records.
> Currently remove_loops() can process at most 127 entries, so samples with
> more than 127 entries are skipped. Update the remove_loops() logic to be
> able to process 256 entries.
>
> Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
> [...]
* [PATCH v3 2/7] riscv: pmu: Add Control transfer records CSR definitions.
2025-05-22 23:25 [PATCH v3 0/7] riscv: pmu: Add support for Control Transfer Records Ext Rajnesh Kanwal
2025-05-22 23:25 ` [PATCH v3 1/7] perf: Increase the maximum number of branches remove_loops() can process Rajnesh Kanwal
@ 2025-05-22 23:25 ` Rajnesh Kanwal
2025-05-22 23:25 ` [PATCH v3 3/7] riscv: Add Control Transfer Records extension parsing Rajnesh Kanwal
` (4 subsequent siblings)
6 siblings, 0 replies; 11+ messages in thread
From: Rajnesh Kanwal @ 2025-05-22 23:25 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Alexandre Ghiti, Atish Kumar Patra, Anup Patel,
Will Deacon, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
Beeman Strong
Cc: linux-perf-users, linux-kernel, linux-riscv, linux-arm-kernel,
Palmer Dabbelt, Conor Dooley, devicetree, Rajnesh Kanwal
Add CSR definitions for the RISC-V Control Transfer Records extension [0],
along with bit-field macros for each CSR.
[0]: https://github.com/riscv/riscv-control-transfer-records
Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
---
arch/riscv/include/asm/csr.h | 83 ++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 83 insertions(+)
diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
index 8b2f5ae1d60efadbec90eab4b1a3637488a9431f..3aef621657603483e1cafd036f126692a731a333 100644
--- a/arch/riscv/include/asm/csr.h
+++ b/arch/riscv/include/asm/csr.h
@@ -331,6 +331,85 @@
#define CSR_SCOUNTOVF 0xda0
+/* M-mode Control Transfer Records CSRs */
+#define CSR_MCTRCTL 0x34e
+
+/* S-mode Control Transfer Records CSRs */
+#define CSR_SCTRCTL 0x14e
+#define CSR_SCTRSTATUS 0x14f
+#define CSR_SCTRDEPTH 0x15f
+
+/* VS-mode Control Transfer Records CSRs */
+#define CSR_VSCTRCTL 0x24e
+
+/* xctrctl CSR bits. */
+#define CTRCTL_U_ENABLE _AC(0x1, UL)
+#define CTRCTL_S_ENABLE _AC(0x2, UL)
+#define CTRCTL_M_ENABLE _AC(0x4, UL)
+#define CTRCTL_RASEMU _AC(0x80, UL)
+#define CTRCTL_STE _AC(0x100, UL)
+#define CTRCTL_MTE _AC(0x200, UL)
+#define CTRCTL_BPFRZ _AC(0x800, UL)
+#define CTRCTL_LCOFIFRZ _AC(0x1000, UL)
+#define CTRCTL_EXCINH _AC(0x200000000, UL)
+#define CTRCTL_INTRINH _AC(0x400000000, UL)
+#define CTRCTL_TRETINH _AC(0x800000000, UL)
+#define CTRCTL_NTBREN _AC(0x1000000000, UL)
+#define CTRCTL_TKBRINH _AC(0x2000000000, UL)
+#define CTRCTL_INDCALL_INH _AC(0x10000000000, UL)
+#define CTRCTL_DIRCALL_INH _AC(0x20000000000, UL)
+#define CTRCTL_INDJUMP_INH _AC(0x40000000000, UL)
+#define CTRCTL_DIRJUMP_INH _AC(0x80000000000, UL)
+#define CTRCTL_CORSWAP_INH _AC(0x100000000000, UL)
+#define CTRCTL_RET_INH _AC(0x200000000000, UL)
+#define CTRCTL_INDOJUMP_INH _AC(0x400000000000, UL)
+#define CTRCTL_DIROJUMP_INH _AC(0x800000000000, UL)
+
+/* sctrstatus CSR bits. */
+#define SCTRSTATUS_WRPTR_MASK 0xFF
+#define SCTRSTATUS_FROZEN _AC(0x80000000, UL)
+
+#ifdef CONFIG_RISCV_M_MODE
+#define CTRCTL_KERNEL_ENABLE CTRCTL_M_ENABLE
+#else
+#define CTRCTL_KERNEL_ENABLE CTRCTL_S_ENABLE
+#endif
+
+/* sctrdepth CSR bits. */
+#define SCTRDEPTH_MASK 0x7
+
+#define SCTRDEPTH_MIN 0x0 /* 16 Entries. */
+#define SCTRDEPTH_MAX 0x4 /* 256 Entries. */
+
+/* ctrsource, ctrtarget and ctrdata CSR bits. */
+#define CTRSOURCE_VALID 0x1ULL
+#define CTRTARGET_MISP 0x1ULL
+
+#define CTRDATA_TYPE_MASK 0xF
+#define CTRDATA_CCV 0x8000
+#define CTRDATA_CCM_MASK 0xFFF0000
+#define CTRDATA_CCE_MASK 0xF0000000
+
+#define CTRDATA_TYPE_NONE 0
+#define CTRDATA_TYPE_EXCEPTION 1
+#define CTRDATA_TYPE_INTERRUPT 2
+#define CTRDATA_TYPE_TRAP_RET 3
+#define CTRDATA_TYPE_NONTAKEN_BRANCH 4
+#define CTRDATA_TYPE_TAKEN_BRANCH 5
+#define CTRDATA_TYPE_RESERVED_6 6
+#define CTRDATA_TYPE_RESERVED_7 7
+#define CTRDATA_TYPE_INDIRECT_CALL 8
+#define CTRDATA_TYPE_DIRECT_CALL 9
+#define CTRDATA_TYPE_INDIRECT_JUMP 10
+#define CTRDATA_TYPE_DIRECT_JUMP 11
+#define CTRDATA_TYPE_CO_ROUTINE_SWAP 12
+#define CTRDATA_TYPE_RETURN 13
+#define CTRDATA_TYPE_OTHER_INDIRECT_JUMP 14
+#define CTRDATA_TYPE_OTHER_DIRECT_JUMP 15
+
+#define CTR_ENTRIES_FIRST 0x200
+#define CTR_ENTRIES_LAST 0x2ff
+
#define CSR_SSTATUS 0x100
#define CSR_SIE 0x104
#define CSR_STVEC 0x105
@@ -523,6 +602,8 @@
# define CSR_TOPEI CSR_MTOPEI
# define CSR_TOPI CSR_MTOPI
+# define CSR_CTRCTL CSR_MCTRCTL
+
# define SR_IE SR_MIE
# define SR_PIE SR_MPIE
# define SR_PP SR_MPP
@@ -553,6 +634,8 @@
# define CSR_TOPEI CSR_STOPEI
# define CSR_TOPI CSR_STOPI
+# define CSR_CTRCTL CSR_SCTRCTL
+
# define SR_IE SR_SIE
# define SR_PIE SR_SPIE
# define SR_PP SR_SPP
--
2.43.0
* [PATCH v3 3/7] riscv: Add Control Transfer Records extension parsing
2025-05-22 23:25 [PATCH v3 0/7] riscv: pmu: Add support for Control Transfer Records Ext Rajnesh Kanwal
2025-05-22 23:25 ` [PATCH v3 1/7] perf: Increase the maximum number of branches remove_loops() can process Rajnesh Kanwal
2025-05-22 23:25 ` [PATCH v3 2/7] riscv: pmu: Add Control transfer records CSR definitions Rajnesh Kanwal
@ 2025-05-22 23:25 ` Rajnesh Kanwal
2025-05-22 23:25 ` [PATCH v3 4/7] riscv: pmu: Add infrastructure for Control Transfer Record Rajnesh Kanwal
` (3 subsequent siblings)
6 siblings, 0 replies; 11+ messages in thread
From: Rajnesh Kanwal @ 2025-05-22 23:25 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Alexandre Ghiti, Atish Kumar Patra, Anup Patel,
Will Deacon, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
Beeman Strong
Cc: linux-perf-users, linux-kernel, linux-riscv, linux-arm-kernel,
Palmer Dabbelt, Conor Dooley, devicetree, Rajnesh Kanwal
Add the Smctr and Ssctr CTR extensions to the ISA extension map so that
their availability can be looked up.
Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
---
arch/riscv/include/asm/hwcap.h | 4 ++++
arch/riscv/kernel/cpufeature.c | 2 ++
2 files changed, 6 insertions(+)
diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
index fa5e01bcb990ec26a2681916be6f9b27262a0add..9b88dfd0e53c7070793ec71d363f8cd46ea43b92 100644
--- a/arch/riscv/include/asm/hwcap.h
+++ b/arch/riscv/include/asm/hwcap.h
@@ -105,6 +105,8 @@
#define RISCV_ISA_EXT_SMCNTRPMF 96
#define RISCV_ISA_EXT_SSCCFG 97
#define RISCV_ISA_EXT_SMCDELEG 98
+#define RISCV_ISA_EXT_SMCTR 99
+#define RISCV_ISA_EXT_SSCTR 100
#define RISCV_ISA_EXT_XLINUXENVCFG 127
@@ -115,11 +117,13 @@
#define RISCV_ISA_EXT_SxAIA RISCV_ISA_EXT_SMAIA
#define RISCV_ISA_EXT_SUPM RISCV_ISA_EXT_SMNPM
#define RISCV_ISA_EXT_SxCSRIND RISCV_ISA_EXT_SMCSRIND
+#define RISCV_ISA_EXT_SxCTR RISCV_ISA_EXT_SMCTR
#else
#define RISCV_ISA_EXT_SxAIA RISCV_ISA_EXT_SSAIA
#define RISCV_ISA_EXT_SUPM RISCV_ISA_EXT_SSNPM
#define RISCV_ISA_EXT_SxAIA RISCV_ISA_EXT_SSAIA
#define RISCV_ISA_EXT_SxCSRIND RISCV_ISA_EXT_SSCSRIND
+#define RISCV_ISA_EXT_SxCTR RISCV_ISA_EXT_SSCTR
#endif
#endif /* _ASM_RISCV_HWCAP_H */
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index f72552adb257681c35a9f94ad5bbf7165fb93945..7fcbde89e4b9ee55b30b27f5b93e33dbe8f9ce58 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -419,6 +419,7 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = {
riscv_ext_smcdeleg_validate),
__RISCV_ISA_EXT_DATA(smcntrpmf, RISCV_ISA_EXT_SMCNTRPMF),
__RISCV_ISA_EXT_DATA(smcsrind, RISCV_ISA_EXT_SMCSRIND),
+ __RISCV_ISA_EXT_DATA(smctr, RISCV_ISA_EXT_SMCTR),
__RISCV_ISA_EXT_DATA(smmpm, RISCV_ISA_EXT_SMMPM),
__RISCV_ISA_EXT_SUPERSET(smnpm, RISCV_ISA_EXT_SMNPM, riscv_xlinuxenvcfg_exts),
__RISCV_ISA_EXT_DATA(smstateen, RISCV_ISA_EXT_SMSTATEEN),
@@ -426,6 +427,7 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = {
__RISCV_ISA_EXT_DATA_VALIDATE(ssccfg, RISCV_ISA_EXT_SSCCFG, riscv_ext_ssccfg_validate),
__RISCV_ISA_EXT_DATA(sscofpmf, RISCV_ISA_EXT_SSCOFPMF),
__RISCV_ISA_EXT_DATA(sscsrind, RISCV_ISA_EXT_SSCSRIND),
+ __RISCV_ISA_EXT_DATA(ssctr, RISCV_ISA_EXT_SSCTR),
__RISCV_ISA_EXT_SUPERSET(ssnpm, RISCV_ISA_EXT_SSNPM, riscv_xlinuxenvcfg_exts),
__RISCV_ISA_EXT_DATA(sstc, RISCV_ISA_EXT_SSTC),
__RISCV_ISA_EXT_DATA(svade, RISCV_ISA_EXT_SVADE),
--
2.43.0
* [PATCH v3 4/7] riscv: pmu: Add infrastructure for Control Transfer Record
2025-05-22 23:25 [PATCH v3 0/7] riscv: pmu: Add support for Control Transfer Records Ext Rajnesh Kanwal
` (2 preceding siblings ...)
2025-05-22 23:25 ` [PATCH v3 3/7] riscv: Add Control Transfer Records extension parsing Rajnesh Kanwal
@ 2025-05-22 23:25 ` Rajnesh Kanwal
2025-05-22 23:25 ` [PATCH v3 5/7] riscv: pmu: Add driver for Control Transfer Records Ext Rajnesh Kanwal
` (2 subsequent siblings)
6 siblings, 0 replies; 11+ messages in thread
From: Rajnesh Kanwal @ 2025-05-22 23:25 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Alexandre Ghiti, Atish Kumar Patra, Anup Patel,
Will Deacon, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
Beeman Strong
Cc: linux-perf-users, linux-kernel, linux-riscv, linux-arm-kernel,
Palmer Dabbelt, Conor Dooley, devicetree, Rajnesh Kanwal
To support the Control Transfer Records (CTR) extension, we need to extend
the riscv_pmu framework with some basic infrastructure for branch stack
sampling. Subsequent patches will use this to add support for CTR in the
riscv_pmu_dev driver.
With CTR, branches are stored in a hardware FIFO, which is sampled by
software when perf events overflow. A task may be context-switched between
overflows, and to avoid leaking samples we need to clear the previous
task's records when a task is context-switched in. To do this we use the
pmu::sched_task() callback added in this patch.
Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
---
drivers/perf/riscv_pmu_common.c | 22 ++++++++++++++++++++++
drivers/perf/riscv_pmu_dev.c | 17 +++++++++++++++++
drivers/perf/riscv_pmu_legacy.c | 2 ++
include/linux/perf/riscv_pmu.h | 18 ++++++++++++++++++
4 files changed, 59 insertions(+)
diff --git a/drivers/perf/riscv_pmu_common.c b/drivers/perf/riscv_pmu_common.c
index 7644147d50b46a79f349d6cb7e32554cc9a39a74..b2dc78cbbb93926964f81f30be9ef4a1c02501df 100644
--- a/drivers/perf/riscv_pmu_common.c
+++ b/drivers/perf/riscv_pmu_common.c
@@ -157,6 +157,19 @@ u64 riscv_pmu_ctr_get_width_mask(struct perf_event *event)
return GENMASK_ULL(cwidth, 0);
}
+static void riscv_pmu_sched_task(struct perf_event_pmu_context *pmu_ctx,
+ bool sched_in)
+{
+ struct riscv_pmu *pmu;
+
+ if (!pmu_ctx)
+ return;
+
+ pmu = to_riscv_pmu(pmu_ctx->pmu);
+ if (pmu->sched_task)
+ pmu->sched_task(pmu_ctx, sched_in);
+}
+
u64 riscv_pmu_event_update(struct perf_event *event)
{
struct riscv_pmu *rvpmu = to_riscv_pmu(event->pmu);
@@ -269,6 +282,8 @@ static int riscv_pmu_add(struct perf_event *event, int flags)
cpuc->events[idx] = event;
cpuc->n_events++;
hwc->state = PERF_HES_UPTODATE | PERF_HES_STOPPED;
+ if (rvpmu->ctr_add)
+ rvpmu->ctr_add(event, flags);
if (flags & PERF_EF_START)
riscv_pmu_start(event, PERF_EF_RELOAD);
@@ -290,8 +305,13 @@ static void riscv_pmu_del(struct perf_event *event, int flags)
if (rvpmu->ctr_stop)
rvpmu->ctr_stop(event, RISCV_PMU_STOP_FLAG_RESET);
cpuc->n_events--;
+
+ if (rvpmu->ctr_del)
+ rvpmu->ctr_del(event, flags);
+
if (rvpmu->ctr_clear_idx)
rvpmu->ctr_clear_idx(event);
+
perf_event_update_userpage(event);
hwc->idx = -1;
}
@@ -402,6 +422,7 @@ struct riscv_pmu *riscv_pmu_alloc(void)
for_each_possible_cpu(cpuid) {
cpuc = per_cpu_ptr(pmu->hw_events, cpuid);
cpuc->n_events = 0;
+ cpuc->ctr_users = 0;
for (i = 0; i < RISCV_MAX_COUNTERS; i++)
cpuc->events[i] = NULL;
cpuc->snapshot_addr = NULL;
@@ -416,6 +437,7 @@ struct riscv_pmu *riscv_pmu_alloc(void)
.start = riscv_pmu_start,
.stop = riscv_pmu_stop,
.read = riscv_pmu_read,
+ .sched_task = riscv_pmu_sched_task,
};
return pmu;
diff --git a/drivers/perf/riscv_pmu_dev.c b/drivers/perf/riscv_pmu_dev.c
index cd2ac4cf34f12618a2df1895f1fab8522016d325..95e6dd272db69f53b679e5fc3450785e45d5e8b9 100644
--- a/drivers/perf/riscv_pmu_dev.c
+++ b/drivers/perf/riscv_pmu_dev.c
@@ -1035,6 +1035,12 @@ static void rvpmu_sbi_ctr_stop(struct perf_event *event, unsigned long flag)
}
}
+static void pmu_sched_task(struct perf_event_pmu_context *pmu_ctx,
+ bool sched_in)
+{
+ /* Call CTR specific Sched hook. */
+}
+
static int rvpmu_sbi_find_num_ctrs(void)
{
struct sbiret ret;
@@ -1561,6 +1567,14 @@ static int rvpmu_deleg_ctr_get_idx(struct perf_event *event)
return -ENOENT;
}
+static void rvpmu_ctr_add(struct perf_event *event, int flags)
+{
+}
+
+static void rvpmu_ctr_del(struct perf_event *event, int flags)
+{
+}
+
static void rvpmu_ctr_start(struct perf_event *event, u64 ival)
{
struct hw_perf_event *hwc = &event->hw;
@@ -1979,6 +1993,8 @@ static int rvpmu_device_probe(struct platform_device *pdev)
else
pmu->pmu.attr_groups = riscv_sbi_pmu_attr_groups;
pmu->cmask = cmask;
+ pmu->ctr_add = rvpmu_ctr_add;
+ pmu->ctr_del = rvpmu_ctr_del;
pmu->ctr_start = rvpmu_ctr_start;
pmu->ctr_stop = rvpmu_ctr_stop;
pmu->event_map = rvpmu_event_map;
@@ -1990,6 +2006,7 @@ static int rvpmu_device_probe(struct platform_device *pdev)
pmu->event_mapped = rvpmu_event_mapped;
pmu->event_unmapped = rvpmu_event_unmapped;
pmu->csr_index = rvpmu_csr_index;
+ pmu->sched_task = pmu_sched_task;
ret = riscv_pm_pmu_register(pmu);
if (ret)
diff --git a/drivers/perf/riscv_pmu_legacy.c b/drivers/perf/riscv_pmu_legacy.c
index 93c8e0fdb5898587e89115c10587d69380da19ec..bee6742d35fa54a9b82d4a4842b481efaa226765 100644
--- a/drivers/perf/riscv_pmu_legacy.c
+++ b/drivers/perf/riscv_pmu_legacy.c
@@ -115,6 +115,8 @@ static void pmu_legacy_init(struct riscv_pmu *pmu)
BIT(RISCV_PMU_LEGACY_INSTRET);
pmu->ctr_start = pmu_legacy_ctr_start;
pmu->ctr_stop = NULL;
+ pmu->ctr_add = NULL;
+ pmu->ctr_del = NULL;
pmu->event_map = pmu_legacy_event_map;
pmu->ctr_get_idx = pmu_legacy_ctr_get_idx;
pmu->ctr_get_width = pmu_legacy_ctr_get_width;
diff --git a/include/linux/perf/riscv_pmu.h b/include/linux/perf/riscv_pmu.h
index e58f8381198849ea6134a46e894d91064a1a6154..883781f12ae0be93d8292ae1a7e7b03fea3ea955 100644
--- a/include/linux/perf/riscv_pmu.h
+++ b/include/linux/perf/riscv_pmu.h
@@ -46,6 +46,13 @@
}, \
}
+#define MAX_BRANCH_RECORDS 256
+
+struct branch_records {
+ struct perf_branch_stack branch_stack;
+ struct perf_branch_entry branch_entries[MAX_BRANCH_RECORDS];
+};
+
struct cpu_hw_events {
/* currently enabled events */
int n_events;
@@ -65,6 +72,12 @@ struct cpu_hw_events {
bool snapshot_set_done;
/* A shadow copy of the counter values to avoid clobbering during multiple SBI calls */
u64 snapshot_cval_shcopy[RISCV_MAX_COUNTERS];
+
+ /* Saved branch records. */
+ struct branch_records *branches;
+
+ /* Active events requesting branch records */
+ int ctr_users;
};
struct riscv_pmu {
@@ -78,6 +91,8 @@ struct riscv_pmu {
int (*ctr_get_idx)(struct perf_event *event);
int (*ctr_get_width)(int idx);
void (*ctr_clear_idx)(struct perf_event *event);
+ void (*ctr_add)(struct perf_event *event, int flags);
+ void (*ctr_del)(struct perf_event *event, int flags);
void (*ctr_start)(struct perf_event *event, u64 init_val);
void (*ctr_stop)(struct perf_event *event, unsigned long flag);
int (*event_map)(struct perf_event *event, u64 *config);
@@ -85,10 +100,13 @@ struct riscv_pmu {
void (*event_mapped)(struct perf_event *event, struct mm_struct *mm);
void (*event_unmapped)(struct perf_event *event, struct mm_struct *mm);
uint8_t (*csr_index)(struct perf_event *event);
+ void (*sched_task)(struct perf_event_pmu_context *ctx, bool sched_in);
struct cpu_hw_events __percpu *hw_events;
struct hlist_node node;
struct notifier_block riscv_pm_nb;
+
+ unsigned int ctr_depth;
};
#define to_riscv_pmu(p) (container_of(p, struct riscv_pmu, pmu))
--
2.43.0
* [PATCH v3 5/7] riscv: pmu: Add driver for Control Transfer Records Ext.
2025-05-22 23:25 [PATCH v3 0/7] riscv: pmu: Add support for Control Transfer Records Ext Rajnesh Kanwal
` (3 preceding siblings ...)
2025-05-22 23:25 ` [PATCH v3 4/7] riscv: pmu: Add infrastructure for Control Transfer Record Rajnesh Kanwal
@ 2025-05-22 23:25 ` Rajnesh Kanwal
2025-05-23 17:48 ` kernel test robot
2025-05-22 23:25 ` [PATCH v3 6/7] riscv: pmu: Integrate CTR Ext support in riscv_pmu_dev driver Rajnesh Kanwal
2025-05-22 23:25 ` [PATCH v3 7/7] dt-bindings: riscv: add Sxctr ISA extension description Rajnesh Kanwal
6 siblings, 1 reply; 11+ messages in thread
From: Rajnesh Kanwal @ 2025-05-22 23:25 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Alexandre Ghiti, Atish Kumar Patra, Anup Patel,
Will Deacon, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
Beeman Strong
Cc: linux-perf-users, linux-kernel, linux-riscv, linux-arm-kernel,
Palmer Dabbelt, Conor Dooley, devicetree, Rajnesh Kanwal
This adds support for the CTR extension defined in [0]. The extension
allows recording up to 256 last branch records.
The CTR extension depends on the S[m|s]csrind and Sscofpmf extensions.
Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
---
MAINTAINERS | 1 +
drivers/perf/Kconfig | 11 +
drivers/perf/Makefile | 1 +
drivers/perf/riscv_ctr.c | 612 +++++++++++++++++++++++++++++++++++++++++
include/linux/perf/riscv_pmu.h | 37 +++
5 files changed, 662 insertions(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index b6d174f7735e6d8e4c3c2eac91450e38f8b48519..068994eff9fdfda82f61f607e76ecacb54809792 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -20406,6 +20406,7 @@ M: Atish Patra <atishp@atishpatra.org>
R: Anup Patel <anup@brainfault.org>
L: linux-riscv@lists.infradead.org
S: Supported
+F: drivers/perf/riscv_ctr.c
F: drivers/perf/riscv_pmu_common.c
F: drivers/perf/riscv_pmu_dev.c
F: drivers/perf/riscv_pmu_legacy.c
diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
index b3bdff2a99a4a160718a322ed3b0a6af2b01a750..9107c5208bf5eba6c9db378ae8ed596f2b27498c 100644
--- a/drivers/perf/Kconfig
+++ b/drivers/perf/Kconfig
@@ -129,6 +129,17 @@ config ANDES_CUSTOM_PMU
If you don't know what to do here, say "Y".
+config RISCV_CTR
+ bool "Enable support for Control Transfer Records (CTR)"
+ depends on PERF_EVENTS && RISCV_PMU
+ default y
+ help
+ Enable support for Control Transfer Records (CTR), which
+ allows recording branches, jumps, calls, returns, etc. taken in an
+ execution path. It also supports privilege-based filtering and
+ captures additional relevant information such as cycle count and
+ branch misprediction.
+
config ARM_PMU_ACPI
depends on ARM_PMU && ACPI
def_bool y
diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
index 0805d740c773f51263c94cf97c9fb4339bcd6767..755609f184fe4b4ad7cd77de10cc56319489f495 100644
--- a/drivers/perf/Makefile
+++ b/drivers/perf/Makefile
@@ -20,6 +20,7 @@ obj-$(CONFIG_RISCV_PMU_COMMON) += riscv_pmu_common.o
obj-$(CONFIG_RISCV_PMU_LEGACY) += riscv_pmu_legacy.o
obj-$(CONFIG_RISCV_PMU) += riscv_pmu_dev.o
obj-$(CONFIG_STARFIVE_STARLINK_PMU) += starfive_starlink_pmu.o
+obj-$(CONFIG_RISCV_CTR) += riscv_ctr.o
obj-$(CONFIG_THUNDERX2_PMU) += thunderx2_pmu.o
obj-$(CONFIG_XGENE_PMU) += xgene_pmu.o
obj-$(CONFIG_ARM_SPE_PMU) += arm_spe_pmu.o
diff --git a/drivers/perf/riscv_ctr.c b/drivers/perf/riscv_ctr.c
new file mode 100644
index 0000000000000000000000000000000000000000..4bbac1ce29c5dd558a3ebd89d6efef9db3a405b8
--- /dev/null
+++ b/drivers/perf/riscv_ctr.c
@@ -0,0 +1,612 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Control transfer records extension Helpers.
+ *
+ * Copyright (C) 2024 Rivos Inc.
+ *
+ * Author: Rajnesh Kanwal <rkanwal@rivosinc.com>
+ */
+
+#define pr_fmt(fmt) "CTR: " fmt
+
+#include <linux/bitfield.h>
+#include <linux/printk.h>
+#include <linux/types.h>
+#include <linux/perf_event.h>
+#include <linux/perf/riscv_pmu.h>
+#include <linux/cpufeature.h>
+#include <asm/hwcap.h>
+#include <asm/csr_ind.h>
+#include <asm/csr.h>
+
+#define CTR_BRANCH_FILTERS_INH (CTRCTL_EXCINH | \
+ CTRCTL_INTRINH | \
+ CTRCTL_TRETINH | \
+ CTRCTL_TKBRINH | \
+ CTRCTL_INDCALL_INH | \
+ CTRCTL_DIRCALL_INH | \
+ CTRCTL_INDJUMP_INH | \
+ CTRCTL_DIRJUMP_INH | \
+ CTRCTL_CORSWAP_INH | \
+ CTRCTL_RET_INH | \
+ CTRCTL_INDOJUMP_INH | \
+ CTRCTL_DIROJUMP_INH)
+
+#define CTR_BRANCH_ENABLE_BITS (CTRCTL_KERNEL_ENABLE | CTRCTL_U_ENABLE)
+
+/* Branch filters not-supported by CTR extension. */
+#define CTR_EXCLUDE_BRANCH_FILTERS (PERF_SAMPLE_BRANCH_ABORT_TX | \
+ PERF_SAMPLE_BRANCH_IN_TX | \
+ PERF_SAMPLE_BRANCH_PRIV_SAVE | \
+ PERF_SAMPLE_BRANCH_NO_TX | \
+ PERF_SAMPLE_BRANCH_COUNTERS)
+
+/* Branch filters supported by CTR extension. */
+#define CTR_ALLOWED_BRANCH_FILTERS (PERF_SAMPLE_BRANCH_USER | \
+ PERF_SAMPLE_BRANCH_KERNEL | \
+ PERF_SAMPLE_BRANCH_HV | \
+ PERF_SAMPLE_BRANCH_ANY | \
+ PERF_SAMPLE_BRANCH_ANY_CALL | \
+ PERF_SAMPLE_BRANCH_ANY_RETURN | \
+ PERF_SAMPLE_BRANCH_IND_CALL | \
+ PERF_SAMPLE_BRANCH_COND | \
+ PERF_SAMPLE_BRANCH_IND_JUMP | \
+ PERF_SAMPLE_BRANCH_HW_INDEX | \
+ PERF_SAMPLE_BRANCH_NO_FLAGS | \
+ PERF_SAMPLE_BRANCH_NO_CYCLES | \
+ PERF_SAMPLE_BRANCH_CALL_STACK | \
+ PERF_SAMPLE_BRANCH_CALL | \
+ PERF_SAMPLE_BRANCH_TYPE_SAVE)
+
+#define CTR_PERF_BRANCH_FILTERS (CTR_ALLOWED_BRANCH_FILTERS | \
+ CTR_EXCLUDE_BRANCH_FILTERS)
+
+static u64 allowed_filters __read_mostly;
+
+struct ctr_regset {
+ unsigned long src;
+ unsigned long target;
+ unsigned long ctr_data;
+};
+
+enum {
+ CTR_STATE_NONE,
+ CTR_STATE_VALID,
+};
+
+/* Head is the index of the next available slot. The slot may already be
+ * populated by an old entry, which will be lost on new writes.
+ */
+struct riscv_perf_task_context {
+ int callstack_users;
+ int stack_state;
+ unsigned int num_entries;
+ uint32_t ctr_status;
+ uint64_t ctr_control;
+ struct ctr_regset store[MAX_BRANCH_RECORDS];
+};
+
+static inline u64 get_ctr_src_reg(unsigned int ctr_idx)
+{
+ return csr_ind_read(CSR_SIREG, CTR_ENTRIES_FIRST, ctr_idx);
+}
+
+static inline void set_ctr_src_reg(unsigned int ctr_idx, u64 value)
+{
+ return csr_ind_write(CSR_SIREG, CTR_ENTRIES_FIRST, ctr_idx, value);
+}
+
+static inline u64 get_ctr_tgt_reg(unsigned int ctr_idx)
+{
+ return csr_ind_read(CSR_SIREG2, CTR_ENTRIES_FIRST, ctr_idx);
+}
+
+static inline void set_ctr_tgt_reg(unsigned int ctr_idx, u64 value)
+{
+ return csr_ind_write(CSR_SIREG2, CTR_ENTRIES_FIRST, ctr_idx, value);
+}
+
+static inline u64 get_ctr_data_reg(unsigned int ctr_idx)
+{
+ return csr_ind_read(CSR_SIREG3, CTR_ENTRIES_FIRST, ctr_idx);
+}
+
+static inline void set_ctr_data_reg(unsigned int ctr_idx, u64 value)
+{
+ return csr_ind_write(CSR_SIREG3, CTR_ENTRIES_FIRST, ctr_idx, value);
+}
+
+static inline bool ctr_record_valid(u64 ctr_src)
+{
+ return !!FIELD_GET(CTRSOURCE_VALID, ctr_src);
+}
+
+static inline int ctr_get_mispredict(u64 ctr_target)
+{
+ return FIELD_GET(CTRTARGET_MISP, ctr_target);
+}
+
+static inline unsigned int ctr_get_cycles(u64 ctr_data)
+{
+ const unsigned int cce = FIELD_GET(CTRDATA_CCE_MASK, ctr_data);
+ const unsigned int ccm = FIELD_GET(CTRDATA_CCM_MASK, ctr_data);
+
+ if (ctr_data & CTRDATA_CCV)
+ return 0;
+
+ /* Cycle count formula from the spec: (2^12 + CCM) << (CCE - 1) */
+ if (cce > 0)
+ return (4096 + ccm) << (cce - 1);
+
+ return FIELD_GET(CTRDATA_CCM_MASK, ctr_data);
+}
+
+static inline unsigned int ctr_get_type(u64 ctr_data)
+{
+ return FIELD_GET(CTRDATA_TYPE_MASK, ctr_data);
+}
+
+static inline unsigned int ctr_get_depth(u64 ctr_depth)
+{
+ /* Depth table from CTR Spec: 2.4 sctrdepth.
+ *
+ * sctrdepth.depth Depth
+ * 000 - 16
+ * 001 - 32
+ * 010 - 64
+ * 011 - 128
+ * 100 - 256
+ *
+ * Depth = 16 * 2 ^ (ctrdepth.depth)
+ * or
+ * Depth = 16 << ctrdepth.depth.
+ */
+ return 16 << FIELD_GET(SCTRDEPTH_MASK, ctr_depth);
+}
+
+static inline struct riscv_perf_task_context *task_context(void *ctx)
+{
+ return (struct riscv_perf_task_context *)ctx;
+}
+
+/* Reads CTR entry at idx and stores it in entry struct. */
+static bool get_ctr_regset(struct ctr_regset *entry, unsigned int idx)
+{
+ entry->src = get_ctr_src_reg(idx);
+
+ if (!ctr_record_valid(entry->src))
+ return false;
+
+ entry->target = get_ctr_tgt_reg(idx);
+ entry->ctr_data = get_ctr_data_reg(idx);
+
+ return true;
+}
+
+static void set_ctr_regset(struct ctr_regset *entry, unsigned int idx)
+{
+ set_ctr_src_reg(idx, entry->src);
+ set_ctr_tgt_reg(idx, entry->target);
+ set_ctr_data_reg(idx, entry->ctr_data);
+}
+
+static u64 branch_type_to_ctr(int branch_type)
+{
+ u64 config = CTR_BRANCH_FILTERS_INH | CTRCTL_LCOFIFRZ;
+
+ if (branch_type & PERF_SAMPLE_BRANCH_USER)
+ config |= CTRCTL_U_ENABLE;
+
+ if (branch_type & PERF_SAMPLE_BRANCH_KERNEL)
+ config |= CTRCTL_KERNEL_ENABLE;
+
+ if (branch_type & PERF_SAMPLE_BRANCH_HV) {
+ if (riscv_isa_extension_available(NULL, h))
+ config |= CTRCTL_KERNEL_ENABLE;
+ }
+
+ if (branch_type & PERF_SAMPLE_BRANCH_ANY) {
+ config &= ~CTR_BRANCH_FILTERS_INH;
+ return config;
+ }
+
+ if (branch_type & PERF_SAMPLE_BRANCH_ANY_CALL) {
+ config &= ~CTRCTL_INDCALL_INH;
+ config &= ~CTRCTL_DIRCALL_INH;
+ config &= ~CTRCTL_EXCINH;
+ config &= ~CTRCTL_INTRINH;
+ }
+
+ if (branch_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
+ config &= ~(CTRCTL_RET_INH | CTRCTL_TRETINH);
+
+ if (branch_type & PERF_SAMPLE_BRANCH_IND_CALL)
+ config &= ~CTRCTL_INDCALL_INH;
+
+ if (branch_type & PERF_SAMPLE_BRANCH_COND)
+ config &= ~CTRCTL_TKBRINH;
+
+ if (branch_type & PERF_SAMPLE_BRANCH_CALL_STACK)
+ config |= CTRCTL_RASEMU;
+
+ if (branch_type & PERF_SAMPLE_BRANCH_IND_JUMP) {
+ config &= ~CTRCTL_INDJUMP_INH;
+ config &= ~CTRCTL_INDOJUMP_INH;
+ }
+
+ if (branch_type & PERF_SAMPLE_BRANCH_CALL)
+ config &= ~CTRCTL_DIRCALL_INH;
+
+ return config;
+}
+
+static const int ctr_perf_map[] = {
+ [CTRDATA_TYPE_NONE] = PERF_BR_UNKNOWN,
+ [CTRDATA_TYPE_EXCEPTION] = PERF_BR_SYSCALL,
+ [CTRDATA_TYPE_INTERRUPT] = PERF_BR_IRQ,
+ [CTRDATA_TYPE_TRAP_RET] = PERF_BR_ERET,
+ [CTRDATA_TYPE_NONTAKEN_BRANCH] = PERF_BR_COND,
+ [CTRDATA_TYPE_TAKEN_BRANCH] = PERF_BR_COND,
+ [CTRDATA_TYPE_RESERVED_6] = PERF_BR_UNKNOWN,
+ [CTRDATA_TYPE_RESERVED_7] = PERF_BR_UNKNOWN,
+ [CTRDATA_TYPE_INDIRECT_CALL] = PERF_BR_IND_CALL,
+ [CTRDATA_TYPE_DIRECT_CALL] = PERF_BR_CALL,
+ [CTRDATA_TYPE_INDIRECT_JUMP] = PERF_BR_IND,
+ [CTRDATA_TYPE_DIRECT_JUMP] = PERF_BR_UNCOND,
+ [CTRDATA_TYPE_CO_ROUTINE_SWAP] = PERF_BR_UNKNOWN,
+ [CTRDATA_TYPE_RETURN] = PERF_BR_RET,
+ [CTRDATA_TYPE_OTHER_INDIRECT_JUMP] = PERF_BR_IND,
+ [CTRDATA_TYPE_OTHER_DIRECT_JUMP] = PERF_BR_UNCOND,
+};
+
+static void ctr_set_perf_entry_type(struct perf_branch_entry *entry,
+ u64 ctr_data)
+{
+ int ctr_type = ctr_get_type(ctr_data);
+
+ entry->type = ctr_perf_map[ctr_type];
+ if (entry->type == PERF_BR_UNKNOWN)
+ pr_warn("unknown branch type %d captured\n", ctr_type);
+}
+
+static void capture_ctr_flags(struct perf_branch_entry *entry,
+ struct perf_event *event, u64 ctr_data,
+ u64 ctr_target)
+{
+ if (branch_sample_type(event))
+ ctr_set_perf_entry_type(entry, ctr_data);
+
+ if (!branch_sample_no_cycles(event))
+ entry->cycles = ctr_get_cycles(ctr_data);
+
+ if (!branch_sample_no_flags(event)) {
+ entry->abort = 0;
+ entry->mispred = ctr_get_mispredict(ctr_target);
+ entry->predicted = !entry->mispred;
+ }
+
+ if (branch_sample_priv(event))
+ entry->priv = PERF_BR_PRIV_UNKNOWN;
+}
+
+static void ctr_regset_to_branch_entry(struct cpu_hw_events *cpuc,
+ struct perf_event *event,
+ struct ctr_regset *regset,
+ unsigned int idx)
+{
+ struct perf_branch_entry *entry = &cpuc->branches->branch_entries[idx];
+
+ perf_clear_branch_entry_bitfields(entry);
+ entry->from = regset->src & (~CTRSOURCE_VALID);
+ entry->to = regset->target & (~CTRTARGET_MISP);
+ capture_ctr_flags(entry, event, regset->ctr_data, regset->target);
+}
+
+static void ctr_read_entries(struct cpu_hw_events *cpuc,
+ struct perf_event *event,
+ unsigned int depth)
+{
+ struct ctr_regset entry = {};
+ u64 ctr_ctl;
+ int i;
+
+ ctr_ctl = csr_read_clear(CSR_CTRCTL, CTR_BRANCH_ENABLE_BITS);
+
+ for (i = 0; i < depth; i++) {
+ if (!get_ctr_regset(&entry, i))
+ break;
+
+ ctr_regset_to_branch_entry(cpuc, event, &entry, i);
+ }
+
+ csr_set(CSR_CTRCTL, ctr_ctl & CTR_BRANCH_ENABLE_BITS);
+
+ cpuc->branches->branch_stack.nr = i;
+ cpuc->branches->branch_stack.hw_idx = 0;
+}
+
+bool riscv_pmu_ctr_valid(struct perf_event *event)
+{
+ u64 branch_type = event->attr.branch_sample_type;
+
+ if (branch_type & ~allowed_filters) {
+ pr_debug_once("Requested branch filters not supported 0x%llx\n",
+ branch_type & ~allowed_filters);
+ return false;
+ }
+
+ return true;
+}
+
+void riscv_pmu_ctr_consume(struct cpu_hw_events *cpuc, struct perf_event *event)
+{
+ unsigned int depth = to_riscv_pmu(event->pmu)->ctr_depth;
+
+ ctr_read_entries(cpuc, event, depth);
+
+ /* Clear frozen bit. */
+ csr_clear(CSR_SCTRSTATUS, SCTRSTATUS_FROZEN);
+}
+
+static void riscv_pmu_ctr_reset(void)
+{
+ /*
+ * FIXME: Replace with the sctrclr instruction once support is
+ * merged into the toolchain.
+ */
+ asm volatile(".4byte 0x10400073\n" ::: "memory");
+ csr_write(CSR_SCTRSTATUS, 0);
+}
+
+static void __riscv_pmu_ctr_restore(void *ctx)
+{
+ struct riscv_perf_task_context *task_ctx = ctx;
+ unsigned int i;
+
+ csr_write(CSR_SCTRSTATUS, task_ctx->ctr_status);
+
+ for (i = 0; i < task_ctx->num_entries; i++)
+ set_ctr_regset(&task_ctx->store[i], i);
+}
+
+static void riscv_pmu_ctr_restore(void *ctx)
+{
+ if (task_context(ctx)->stack_state == CTR_STATE_NONE ||
+ task_context(ctx)->callstack_users == 0) {
+ return;
+ }
+
+ riscv_pmu_ctr_reset();
+ __riscv_pmu_ctr_restore(ctx);
+
+ task_context(ctx)->stack_state = CTR_STATE_NONE;
+}
+
+static void __riscv_pmu_ctr_save(void *ctx, unsigned int depth)
+{
+ struct riscv_perf_task_context *task_ctx = ctx;
+ struct ctr_regset *dst;
+ unsigned int i;
+
+ for (i = 0; i < depth; i++) {
+ dst = &task_ctx->store[i];
+ if (!get_ctr_regset(dst, i))
+ break;
+ }
+
+ task_ctx->num_entries = i;
+
+ task_ctx->ctr_status = csr_read(CSR_SCTRSTATUS);
+}
+
+static void riscv_pmu_ctr_save(void *ctx, unsigned int depth)
+{
+ if (task_context(ctx)->stack_state == CTR_STATE_VALID)
+ return;
+
+ if (task_context(ctx)->callstack_users == 0) {
+ task_context(ctx)->stack_state = CTR_STATE_NONE;
+ return;
+ }
+
+ __riscv_pmu_ctr_save(ctx, depth);
+
+ task_context(ctx)->stack_state = CTR_STATE_VALID;
+}
+
+/*
+ * On context switch in, we need to make sure no samples from previous tasks
+ * are left in the CTR.
+ *
+ * On ctxswin, sched_in = true, called after the PMU has started
+ * On ctxswout, sched_in = false, called before the PMU is stopped
+ */
+void riscv_pmu_ctr_sched_task(struct perf_event_pmu_context *pmu_ctx,
+ bool sched_in)
+{
+ struct riscv_pmu *rvpmu = to_riscv_pmu(pmu_ctx->pmu);
+ struct cpu_hw_events *cpuc = this_cpu_ptr(rvpmu->hw_events);
+ void *task_ctx;
+
+ if (!cpuc->ctr_users)
+ return;
+
+ /* Save branch records in task_ctx on sched out */
+ task_ctx = pmu_ctx ? pmu_ctx->task_ctx_data : NULL;
+ if (task_ctx) {
+ if (sched_in)
+ riscv_pmu_ctr_restore(task_ctx);
+ else
+ riscv_pmu_ctr_save(task_ctx, rvpmu->ctr_depth);
+ return;
+ }
+
+ /* Reset branch records on sched in */
+ if (sched_in)
+ riscv_pmu_ctr_reset();
+}
+
+static inline bool branch_user_callstack(unsigned int br_type)
+{
+ return (br_type & PERF_SAMPLE_BRANCH_USER) &&
+ (br_type & PERF_SAMPLE_BRANCH_CALL_STACK);
+}
+
+void riscv_pmu_ctr_add(struct perf_event *event)
+{
+ struct riscv_pmu *rvpmu = to_riscv_pmu(event->pmu);
+ struct cpu_hw_events *cpuc = this_cpu_ptr(rvpmu->hw_events);
+
+ if (branch_user_callstack(event->attr.branch_sample_type) &&
+ event->pmu_ctx->task_ctx_data)
+ task_context(event->pmu_ctx->task_ctx_data)->callstack_users++;
+
+ perf_sched_cb_inc(event->pmu);
+
+ if (!cpuc->ctr_users++)
+ riscv_pmu_ctr_reset();
+}
+
+void riscv_pmu_ctr_del(struct perf_event *event)
+{
+ struct riscv_pmu *rvpmu = to_riscv_pmu(event->pmu);
+ struct cpu_hw_events *cpuc = this_cpu_ptr(rvpmu->hw_events);
+
+ if (branch_user_callstack(event->attr.branch_sample_type) &&
+ event->pmu_ctx->task_ctx_data)
+ task_context(event->pmu_ctx->task_ctx_data)->callstack_users--;
+
+ perf_sched_cb_dec(event->pmu);
+ cpuc->ctr_users--;
+ WARN_ON_ONCE(cpuc->ctr_users < 0);
+}
+
+void riscv_pmu_ctr_enable(struct perf_event *event)
+{
+ u64 branch_type = event->attr.branch_sample_type;
+ u64 ctr;
+
+ ctr = branch_type_to_ctr(branch_type);
+ csr_write(CSR_CTRCTL, ctr);
+}
+
+void riscv_pmu_ctr_disable(struct perf_event *event)
+{
+ /* Clear CTRCTL to disable the recording. */
+ csr_write(CSR_CTRCTL, 0);
+}
+
+/*
+ * Check for hardware-supported perf filters here. To avoid missing
+ * any newly added perf filter, we do a BUILD_BUG_ON() check; make
+ * sure to update the CTR_ALLOWED_BRANCH_FILTERS or
+ * CTR_EXCLUDE_BRANCH_FILTERS defines when adding support for a new
+ * filter in the function below.
+ */
+static void __init check_available_filters(void)
+{
+ u64 ctr_ctl;
+
+ /*
+ * Ensure both perf branch filter allowed and exclude
+ * masks are always in sync with the generic perf ABI.
+ */
+ BUILD_BUG_ON(CTR_PERF_BRANCH_FILTERS != (PERF_SAMPLE_BRANCH_MAX - 1));
+
+ allowed_filters = PERF_SAMPLE_BRANCH_USER |
+ PERF_SAMPLE_BRANCH_KERNEL |
+ PERF_SAMPLE_BRANCH_ANY |
+ PERF_SAMPLE_BRANCH_HW_INDEX |
+ PERF_SAMPLE_BRANCH_NO_FLAGS |
+ PERF_SAMPLE_BRANCH_NO_CYCLES |
+ PERF_SAMPLE_BRANCH_TYPE_SAVE;
+
+ csr_write(CSR_CTRCTL, ~0);
+ ctr_ctl = csr_read(CSR_CTRCTL);
+ csr_write(CSR_CTRCTL, 0);
+
+ if (riscv_isa_extension_available(NULL, h))
+ allowed_filters |= PERF_SAMPLE_BRANCH_HV;
+
+ if (ctr_ctl & (CTRCTL_INDCALL_INH | CTRCTL_DIRCALL_INH))
+ allowed_filters |= PERF_SAMPLE_BRANCH_ANY_CALL;
+
+ if (ctr_ctl & (CTRCTL_RET_INH | CTRCTL_TRETINH))
+ allowed_filters |= PERF_SAMPLE_BRANCH_ANY_RETURN;
+
+ if (ctr_ctl & CTRCTL_INDCALL_INH)
+ allowed_filters |= PERF_SAMPLE_BRANCH_IND_CALL;
+
+ if (ctr_ctl & CTRCTL_TKBRINH)
+ allowed_filters |= PERF_SAMPLE_BRANCH_COND;
+
+ if (ctr_ctl & CTRCTL_RASEMU)
+ allowed_filters |= PERF_SAMPLE_BRANCH_CALL_STACK;
+
+ if (ctr_ctl & (CTRCTL_INDOJUMP_INH | CTRCTL_INDJUMP_INH))
+ allowed_filters |= PERF_SAMPLE_BRANCH_IND_JUMP;
+
+ if (ctr_ctl & CTRCTL_DIRCALL_INH)
+ allowed_filters |= PERF_SAMPLE_BRANCH_CALL;
+}
+
+void riscv_pmu_ctr_starting_cpu(void)
+{
+ if (!riscv_isa_extension_available(NULL, SxCTR) ||
+ !riscv_isa_extension_available(NULL, SSCOFPMF) ||
+ !riscv_isa_extension_available(NULL, SxCSRIND))
+ return;
+
+ /* Set depth to maximum. */
+ csr_write(CSR_SCTRDEPTH, SCTRDEPTH_MASK);
+}
+
+void riscv_pmu_ctr_dying_cpu(void)
+{
+ if (!riscv_isa_extension_available(NULL, SxCTR) ||
+ !riscv_isa_extension_available(NULL, SSCOFPMF) ||
+ !riscv_isa_extension_available(NULL, SxCSRIND))
+ return;
+
+ /* Clear and reset CTR CSRs. */
+ csr_write(CSR_SCTRDEPTH, 0);
+ csr_write(CSR_CTRCTL, 0);
+ riscv_pmu_ctr_reset();
+}
+
+int riscv_pmu_ctr_init(struct riscv_pmu *riscv_pmu)
+{
+ size_t size = sizeof(struct riscv_perf_task_context);
+
+ if (!riscv_isa_extension_available(NULL, SxCTR) ||
+ !riscv_isa_extension_available(NULL, SSCOFPMF) ||
+ !riscv_isa_extension_available(NULL, SxCSRIND))
+ return 0;
+
+ riscv_pmu->pmu.task_ctx_cache =
+ kmem_cache_create("ctr_task_ctx", size, sizeof(u64), 0, NULL);
+ if (!riscv_pmu->pmu.task_ctx_cache)
+ return -ENOMEM;
+
+ check_available_filters();
+
+ /* Set depth to maximum. */
+ csr_write(CSR_SCTRDEPTH, SCTRDEPTH_MASK);
+ riscv_pmu->ctr_depth = ctr_get_depth(csr_read(CSR_SCTRDEPTH));
+
+ pr_info("Perf CTR available, with depth %d\n", riscv_pmu->ctr_depth);
+
+ return 0;
+}
+
+void riscv_pmu_ctr_finish(struct riscv_pmu *riscv_pmu)
+{
+ if (!riscv_pmu_ctr_supported(riscv_pmu))
+ return;
+
+ riscv_pmu->ctr_depth = 0;
+ csr_write(CSR_SCTRDEPTH, 0);
+ csr_write(CSR_CTRCTL, 0);
+ riscv_pmu_ctr_reset();
+
+ kmem_cache_destroy(riscv_pmu->pmu.task_ctx_cache);
+}
diff --git a/include/linux/perf/riscv_pmu.h b/include/linux/perf/riscv_pmu.h
index 883781f12ae0be93d8292ae1a7e7b03fea3ea955..f32b6dcc349109dc0aa74cbe152381c0b2c662d0 100644
--- a/include/linux/perf/riscv_pmu.h
+++ b/include/linux/perf/riscv_pmu.h
@@ -127,6 +127,43 @@ struct riscv_pmu *riscv_pmu_alloc(void);
int riscv_pmu_get_hpm_info(u32 *hw_ctr_width, u32 *num_hw_ctr);
#endif
+static inline bool riscv_pmu_ctr_supported(struct riscv_pmu *pmu)
+{
+ return !!pmu->ctr_depth;
+}
+
#endif /* CONFIG_RISCV_PMU_COMMON */
+#ifdef CONFIG_RISCV_CTR
+
+bool riscv_pmu_ctr_valid(struct perf_event *event);
+void riscv_pmu_ctr_consume(struct cpu_hw_events *cpuc, struct perf_event *event);
+void riscv_pmu_ctr_sched_task(struct perf_event_pmu_context *pmu_ctx, bool sched_in);
+void riscv_pmu_ctr_add(struct perf_event *event);
+void riscv_pmu_ctr_del(struct perf_event *event);
+void riscv_pmu_ctr_enable(struct perf_event *event);
+void riscv_pmu_ctr_disable(struct perf_event *event);
+void riscv_pmu_ctr_dying_cpu(void);
+void riscv_pmu_ctr_starting_cpu(void);
+int riscv_pmu_ctr_init(struct riscv_pmu *riscv_pmu);
+void riscv_pmu_ctr_finish(struct riscv_pmu *riscv_pmu);
+
+#else
+
+static inline bool riscv_pmu_ctr_valid(struct perf_event *event) { return false; }
+static inline void riscv_pmu_ctr_consume(struct cpu_hw_events *cpuc,
+ struct perf_event *event) { }
+static inline void riscv_pmu_ctr_sched_task(struct perf_event_pmu_context *,
+ bool sched_in) { }
+static void riscv_pmu_ctr_add(struct perf_event *event) { }
+static void riscv_pmu_ctr_del(struct perf_event *event) { }
+static inline void riscv_pmu_ctr_enable(struct perf_event *event) { }
+static inline void riscv_pmu_ctr_disable(struct perf_event *event) { }
+static inline void riscv_pmu_ctr_dying_cpu(void) { }
+static inline void riscv_pmu_ctr_starting_cpu(void) { }
+static inline int riscv_pmu_ctr_init(struct riscv_pmu *riscv_pmu) { return 0; }
+static inline void riscv_pmu_ctr_finish(struct riscv_pmu *riscv_pmu) { }
+
+#endif /* CONFIG_RISCV_CTR */
+
#endif /* _RISCV_PMU_H */
--
2.43.0
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v3 5/7] riscv: pmu: Add driver for Control Transfer Records Ext.
2025-05-22 23:25 ` [PATCH v3 5/7] riscv: pmu: Add driver for Control Transfer Records Ext Rajnesh Kanwal
@ 2025-05-23 17:48 ` kernel test robot
0 siblings, 0 replies; 11+ messages in thread
From: kernel test robot @ 2025-05-23 17:48 UTC (permalink / raw)
To: Rajnesh Kanwal, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
Atish Kumar Patra, Anup Patel, Will Deacon, Rob Herring,
Krzysztof Kozlowski, Conor Dooley, Beeman Strong
Cc: llvm, oe-kbuild-all, linux-perf-users, linux-kernel, linux-riscv,
linux-arm-kernel, devicetree, Rajnesh Kanwal
Hi Rajnesh,
kernel test robot noticed the following build warnings:
[auto build test WARNING on e0200e37637e573cd68f522ecd550be87e304c6c]
url: https://github.com/intel-lab-lkp/linux/commits/Rajnesh-Kanwal/perf-Increase-the-maximum-number-of-branches-remove_loops-can-process/20250523-073341
base: e0200e37637e573cd68f522ecd550be87e304c6c
patch link: https://lore.kernel.org/r/20250523-b4-ctr_upstream_v3-v3-5-ad355304ba1c%40rivosinc.com
patch subject: [PATCH v3 5/7] riscv: pmu: Add driver for Control Transfer Records Ext.
config: riscv-randconfig-002-20250523 (https://download.01.org/0day-ci/archive/20250524/202505240131.OJkUGGvA-lkp@intel.com/config)
compiler: clang version 17.0.6 (https://github.com/llvm/llvm-project 6009708b4367171ccdbf4b5905cb6a803753fe18)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250524/202505240131.OJkUGGvA-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202505240131.OJkUGGvA-lkp@intel.com/
All warnings (new ones prefixed by >>):
In file included from arch/riscv/kernel/asm-offsets.c:12:
In file included from arch/riscv/include/asm/kvm_host.h:23:
In file included from arch/riscv/include/asm/kvm_vcpu_pmu.h:12:
>> include/linux/perf/riscv_pmu.h:156:76: warning: omitting the parameter name in a function definition is a C2x extension [-Wc2x-extensions]
156 | static inline void riscv_pmu_ctr_sched_task(struct perf_event_pmu_context *,
| ^
include/linux/perf/riscv_pmu.h:158:13: warning: unused function 'riscv_pmu_ctr_add' [-Wunused-function]
158 | static void riscv_pmu_ctr_add(struct perf_event *event) { }
| ^~~~~~~~~~~~~~~~~
include/linux/perf/riscv_pmu.h:159:13: warning: unused function 'riscv_pmu_ctr_del' [-Wunused-function]
159 | static void riscv_pmu_ctr_del(struct perf_event *event) { }
| ^~~~~~~~~~~~~~~~~
3 warnings generated.
vim +156 include/linux/perf/riscv_pmu.h
152
153 static inline bool riscv_pmu_ctr_valid(struct perf_event *event) { return false; }
154 static inline void riscv_pmu_ctr_consume(struct cpu_hw_events *cpuc,
155 struct perf_event *event) { }
> 156 static inline void riscv_pmu_ctr_sched_task(struct perf_event_pmu_context *,
157 bool sched_in) { }
158 static void riscv_pmu_ctr_add(struct perf_event *event) { }
159 static void riscv_pmu_ctr_del(struct perf_event *event) { }
160 static inline void riscv_pmu_ctr_enable(struct perf_event *event) { }
161 static inline void riscv_pmu_ctr_disable(struct perf_event *event) { }
162 static inline void riscv_pmu_ctr_dying_cpu(void) { }
163 static inline void riscv_pmu_ctr_starting_cpu(void) { }
164 static inline int riscv_pmu_ctr_init(struct riscv_pmu *riscv_pmu) { return 0; }
165 static inline void riscv_pmu_ctr_finish(struct riscv_pmu *riscv_pmu) { }
166
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v3 6/7] riscv: pmu: Integrate CTR Ext support in riscv_pmu_dev driver
2025-05-22 23:25 [PATCH v3 0/7] riscv: pmu: Add support for Control Transfer Records Ext Rajnesh Kanwal
` (4 preceding siblings ...)
2025-05-22 23:25 ` [PATCH v3 5/7] riscv: pmu: Add driver for Control Transfer Records Ext Rajnesh Kanwal
@ 2025-05-22 23:25 ` Rajnesh Kanwal
2025-05-22 23:25 ` [PATCH v3 7/7] dt-bindings: riscv: add Sxctr ISA extension description Rajnesh Kanwal
6 siblings, 0 replies; 11+ messages in thread
From: Rajnesh Kanwal @ 2025-05-22 23:25 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Alexandre Ghiti, Atish Kumar Patra, Anup Patel,
Will Deacon, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
Beeman Strong
Cc: linux-perf-users, linux-kernel, linux-riscv, linux-arm-kernel,
Palmer Dabbelt, Conor Dooley, devicetree, Rajnesh Kanwal
This integrates recently added CTR ext support in riscv_pmu_dev driver
to enable branch stack sampling using PMU events.
This mainly adds CTR enable/disable callbacks in rvpmu_ctr_stop()
and rvpmu_ctr_start() function to start/stop branch recording along
with the event.
PMU overflow handler rvpmu_ovf_handler() is also updated to sample
CTR entries in case of the overflow for the particular event programmed
to records branches. The recorded entries are fed to core perf for
further processing.
Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
---
drivers/perf/riscv_pmu_common.c | 3 +-
drivers/perf/riscv_pmu_dev.c | 67 ++++++++++++++++++++++++++++++++++++++++-
2 files changed, 67 insertions(+), 3 deletions(-)
diff --git a/drivers/perf/riscv_pmu_common.c b/drivers/perf/riscv_pmu_common.c
index b2dc78cbbb93926964f81f30be9ef4a1c02501df..0b032b8d8762e77d2b553643b0f9064e7c789cfe 100644
--- a/drivers/perf/riscv_pmu_common.c
+++ b/drivers/perf/riscv_pmu_common.c
@@ -329,8 +329,7 @@ static int riscv_pmu_event_init(struct perf_event *event)
u64 event_config = 0;
uint64_t cmask;
- /* driver does not support branch stack sampling */
- if (has_branch_stack(event))
+ if (needs_branch_stack(event) && !riscv_pmu_ctr_supported(rvpmu))
return -EOPNOTSUPP;
hwc->flags = 0;
diff --git a/drivers/perf/riscv_pmu_dev.c b/drivers/perf/riscv_pmu_dev.c
index 95e6dd272db69f53b679e5fc3450785e45d5e8b9..b0c616fb939fcc61f7493877a8801916069f16f7 100644
--- a/drivers/perf/riscv_pmu_dev.c
+++ b/drivers/perf/riscv_pmu_dev.c
@@ -1038,7 +1038,7 @@ static void rvpmu_sbi_ctr_stop(struct perf_event *event, unsigned long flag)
static void pmu_sched_task(struct perf_event_pmu_context *pmu_ctx,
bool sched_in)
{
- /* Call CTR specific Sched hook. */
+ riscv_pmu_ctr_sched_task(pmu_ctx, sched_in);
}
static int rvpmu_sbi_find_num_ctrs(void)
@@ -1370,6 +1370,13 @@ static irqreturn_t rvpmu_ovf_handler(int irq, void *dev)
hw_evt->state |= PERF_HES_UPTODATE;
perf_sample_data_init(&data, 0, hw_evt->last_period);
if (riscv_pmu_event_set_period(event)) {
+ if (needs_branch_stack(event)) {
+ riscv_pmu_ctr_consume(cpu_hw_evt, event);
+ perf_sample_save_brstack(
+ &data, event,
+ &cpu_hw_evt->branches->branch_stack, NULL);
+ }
+
/*
* Unlike other ISAs, RISC-V don't have to disable interrupts
* to avoid throttling here. As per the specification, the
@@ -1569,16 +1576,23 @@ static int rvpmu_deleg_ctr_get_idx(struct perf_event *event)
static void rvpmu_ctr_add(struct perf_event *event, int flags)
{
+ if (needs_branch_stack(event))
+ riscv_pmu_ctr_add(event);
}
static void rvpmu_ctr_del(struct perf_event *event, int flags)
{
+ if (needs_branch_stack(event))
+ riscv_pmu_ctr_del(event);
}
static void rvpmu_ctr_start(struct perf_event *event, u64 ival)
{
struct hw_perf_event *hwc = &event->hw;
+ if (needs_branch_stack(event))
+ riscv_pmu_ctr_enable(event);
+
if (riscv_pmu_cdeleg_available() && !pmu_sbi_is_fw_event(event))
rvpmu_deleg_ctr_start(event, ival);
else
@@ -1593,6 +1607,9 @@ static void rvpmu_ctr_stop(struct perf_event *event, unsigned long flag)
{
struct hw_perf_event *hwc = &event->hw;
+ if (needs_branch_stack(event) && flag != RISCV_PMU_STOP_FLAG_RESET)
+ riscv_pmu_ctr_disable(event);
+
if ((hwc->flags & PERF_EVENT_FLAG_USER_ACCESS) &&
(hwc->flags & PERF_EVENT_FLAG_USER_READ_CNT))
rvpmu_reset_scounteren((void *)event);
@@ -1650,6 +1667,9 @@ static u32 rvpmu_find_ctrs(void)
static int rvpmu_event_map(struct perf_event *event, u64 *econfig)
{
+ if (needs_branch_stack(event) && !riscv_pmu_ctr_valid(event))
+ return -EOPNOTSUPP;
+
if (riscv_pmu_cdeleg_available() && !pmu_sbi_is_fw_event(event))
return rvpmu_cdeleg_event_map(event, econfig);
else
@@ -1696,6 +1716,8 @@ static int rvpmu_starting_cpu(unsigned int cpu, struct hlist_node *node)
enable_percpu_irq(riscv_pmu_irq, IRQ_TYPE_NONE);
}
+ riscv_pmu_ctr_starting_cpu();
+
if (sbi_pmu_snapshot_available())
return pmu_sbi_snapshot_setup(pmu, cpu);
@@ -1710,6 +1732,7 @@ static int rvpmu_dying_cpu(unsigned int cpu, struct hlist_node *node)
/* Disable all counters access for user mode now */
csr_write(CSR_SCOUNTEREN, 0x0);
+ riscv_pmu_ctr_dying_cpu();
if (sbi_pmu_snapshot_available())
return pmu_sbi_snapshot_disable();
@@ -1833,6 +1856,29 @@ static void riscv_pmu_destroy(struct riscv_pmu *pmu)
cpuhp_state_remove_instance(CPUHP_AP_PERF_RISCV_STARTING, &pmu->node);
}
+static int branch_records_alloc(struct riscv_pmu *pmu)
+{
+ struct branch_records __percpu *tmp_alloc_ptr;
+ struct branch_records *records;
+ struct cpu_hw_events *events;
+ int cpu;
+
+ if (!riscv_pmu_ctr_supported(pmu))
+ return 0;
+
+ tmp_alloc_ptr = alloc_percpu_gfp(struct branch_records, GFP_KERNEL);
+ if (!tmp_alloc_ptr)
+ return -ENOMEM;
+
+ for_each_possible_cpu(cpu) {
+ events = per_cpu_ptr(pmu->hw_events, cpu);
+ records = per_cpu_ptr(tmp_alloc_ptr, cpu);
+ events->branches = records;
+ }
+
+ return 0;
+}
+
static void rvpmu_event_init(struct perf_event *event)
{
/*
@@ -1845,6 +1891,9 @@ static void rvpmu_event_init(struct perf_event *event)
event->hw.flags |= PERF_EVENT_FLAG_USER_ACCESS;
else
event->hw.flags |= PERF_EVENT_FLAG_LEGACY;
+
+ if (branch_sample_call_stack(event))
+ event->attach_state |= PERF_ATTACH_TASK_DATA;
}
static void rvpmu_event_mapped(struct perf_event *event, struct mm_struct *mm)
@@ -1992,6 +2041,15 @@ static int rvpmu_device_probe(struct platform_device *pdev)
pmu->pmu.attr_groups = riscv_cdeleg_pmu_attr_groups;
else
pmu->pmu.attr_groups = riscv_sbi_pmu_attr_groups;
+
+ ret = riscv_pmu_ctr_init(pmu);
+ if (ret)
+ goto out_free;
+
+ ret = branch_records_alloc(pmu);
+ if (ret)
+ goto out_ctr_finish;
+
pmu->cmask = cmask;
pmu->ctr_add = rvpmu_ctr_add;
pmu->ctr_del = rvpmu_ctr_del;
@@ -2008,6 +2066,10 @@ static int rvpmu_device_probe(struct platform_device *pdev)
pmu->csr_index = rvpmu_csr_index;
pmu->sched_task = pmu_sched_task;
+ ret = cpuhp_state_add_instance(CPUHP_AP_PERF_RISCV_STARTING, &pmu->node);
+ if (ret)
+ goto out_ctr_finish;
+
ret = riscv_pm_pmu_register(pmu);
if (ret)
goto out_unregister;
@@ -2057,6 +2119,9 @@ static int rvpmu_device_probe(struct platform_device *pdev)
out_unregister:
riscv_pmu_destroy(pmu);
+out_ctr_finish:
+ riscv_pmu_ctr_finish(pmu);
+
out_free:
kfree(pmu);
return ret;
--
2.43.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH v3 7/7] dt-bindings: riscv: add Sxctr ISA extension description
2025-05-22 23:25 [PATCH v3 0/7] riscv: pmu: Add support for Control Transfer Records Ext Rajnesh Kanwal
` (5 preceding siblings ...)
2025-05-22 23:25 ` [PATCH v3 6/7] riscv: pmu: Integrate CTR Ext support in riscv_pmu_dev driver Rajnesh Kanwal
@ 2025-05-22 23:25 ` Rajnesh Kanwal
2025-05-23 15:36 ` Conor Dooley
6 siblings, 1 reply; 11+ messages in thread
From: Rajnesh Kanwal @ 2025-05-22 23:25 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Alexandre Ghiti, Atish Kumar Patra, Anup Patel,
Will Deacon, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
Beeman Strong
Cc: linux-perf-users, linux-kernel, linux-riscv, linux-arm-kernel,
Palmer Dabbelt, Conor Dooley, devicetree, Rajnesh Kanwal
Add the S[m|s]ctr ISA extension description.
Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
---
.../devicetree/bindings/riscv/extensions.yaml | 28 ++++++++++++++++++++++
1 file changed, 28 insertions(+)
diff --git a/Documentation/devicetree/bindings/riscv/extensions.yaml b/Documentation/devicetree/bindings/riscv/extensions.yaml
index f34bc66940c06bf9b3c18fcd7cce7bfd0593cd28..193751400933ca3fe69e0b2bc03e9c635e2db244 100644
--- a/Documentation/devicetree/bindings/riscv/extensions.yaml
+++ b/Documentation/devicetree/bindings/riscv/extensions.yaml
@@ -149,6 +149,13 @@ properties:
to enable privilege mode filtering for cycle and instret counters as
ratified in the 20240326 version of the privileged ISA specification.
+ - const: smctr
+ description: |
+ The standard Smctr machine-level extension to enable recording
+ limited branch history in a register-accessible
+ internal core storage as ratified at commit 9c87013 ("Merge pull
+ request #44 from riscv/issue-42-fix") of riscv-control-transfer-records.
+
- const: smmpm
description: |
The standard Smmpm extension for M-mode pointer masking as
@@ -196,6 +203,13 @@ properties:
and mode-based filtering as ratified at commit 01d1df0 ("Add ability
to manually trigger workflow. (#2)") of riscv-count-overflow.
+ - const: ssctr
+ description: |
+ The standard Ssctr supervisor-level extension for recording limited
+ branch history in a register-accessible internal core storage as
+ ratified at commit 9c87013 ("Merge pull request #44 from
+ riscv/issue-42-fix") of riscv-control-transfer-records.
+
- const: ssnpm
description: |
The standard Ssnpm extension for next-mode pointer masking as
@@ -740,6 +754,20 @@ properties:
const: zihpm
- contains:
const: zicntr
+ # Smctr depends on Sscsrind
+ - if:
+ contains:
+ const: smctr
+ then:
+ contains:
+ const: sscsrind
+ # Ssctr depends on Sscsrind
+ - if:
+ contains:
+ const: ssctr
+ then:
+ contains:
+ const: sscsrind
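The dependency rules above mean a devicetree advertising Ssctr must also list Sscsrind. A minimal (hypothetical) cpu-node sketch honoring that constraint:

```dts
cpu@0 {
	compatible = "riscv";
	riscv,isa-base = "rv64i";
	/* Ssctr requires Sscsrind, per the schema dependency above. */
	riscv,isa-extensions = "i", "m", "a", "sscofpmf",
			       "sscsrind", "ssctr";
};
```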
allOf:
# Zcf extension does not exist on rv64
--
2.43.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v3 7/7] dt-bindings: riscv: add Sxctr ISA extension description
2025-05-22 23:25 ` [PATCH v3 7/7] dt-bindings: riscv: add Sxctr ISA extension description Rajnesh Kanwal
@ 2025-05-23 15:36 ` Conor Dooley
0 siblings, 0 replies; 11+ messages in thread
From: Conor Dooley @ 2025-05-23 15:36 UTC (permalink / raw)
To: Rajnesh Kanwal
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Alexandre Ghiti, Atish Kumar Patra, Anup Patel,
Will Deacon, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
Beeman Strong, linux-perf-users, linux-kernel, linux-riscv,
linux-arm-kernel, Palmer Dabbelt, devicetree
On Fri, May 23, 2025 at 12:25:13AM +0100, Rajnesh Kanwal wrote:
> Add the S[m|s]ctr ISA extension description.
>
> Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
^ permalink raw reply [flat|nested] 11+ messages in thread