* [PATCH v8 01/22] RISC-V: perf: fix resource cleanup on driver probe failure
2026-07-01 8:46 [PATCH v8 00/22] Add Counter delegation ISA extension support Atish Patra
@ 2026-07-01 8:46 ` Atish Patra
2026-07-01 9:11 ` sashiko-bot
2026-07-01 8:46 ` [PATCH v8 02/22] RISC-V: Add Sxcsrind ISA extension CSR definitions Atish Patra
` (20 subsequent siblings)
21 siblings, 1 reply; 34+ messages in thread
From: Atish Patra @ 2026-07-01 8:46 UTC (permalink / raw)
To: Jiri Olsa, Paul Walmsley, Mark Rutland, Rob Herring, Anup Patel,
Namhyung Kim, Arnaldo Carvalho de Melo, Krzysztof Kozlowski,
Atish Patra, Ian Rogers, Will Deacon, James Clark
Cc: linux-arm-kernel, linux-riscv, linux-kernel, devicetree,
linux-perf-users, Conor Dooley
From: Atish Patra <atishp@meta.com>
Sashiko pointed out various UAF and memory leak issues around
pmu_sbi_device_probe() error paths.
If the probe fails, the following cleanups are needed:
a. Already registered pmu must be freed
b. per cpu IRQ must be released
c. pmu_ctr_list data structure must be freed
d. cpu hotplug state must be cleaned up only if added.
Fix the resource cleanup by reorganizing the code around probe failure.
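
The unwind order described above can be modelled in userspace. This is only a sketch under stated assumptions: `struct fake_pmu`, `fake_probe()` and the `fail_at` parameter are hypothetical stand-ins, not driver code, and the acquisition order is chosen for illustration rather than copied from pmu_sbi_device_probe(). It shows the general pattern of one label per acquired resource, released in reverse order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins for the resources listed above. */
struct fake_pmu {
	int *ctr_list;        /* models pmu_ctr_list */
	bool irq_held;        /* models a requested per-CPU IRQ */
	bool hotplug_added;   /* models an added CPU hotplug instance */
	bool registered;      /* models a registered PMU */
};

/* Number of resources still held; 0 after a clean unwind. */
static int fake_pmu_live_resources(const struct fake_pmu *p)
{
	return (p->ctr_list != NULL) + p->irq_held + p->hotplug_added + p->registered;
}

/*
 * Acquire resources in order. fail_at selects the step that fails
 * (1..5); 0 means every step succeeds. Each label releases exactly
 * what was acquired before the failing step, in reverse order.
 */
static int fake_probe(struct fake_pmu *p, int fail_at)
{
	int ret = -1;

	assert(fail_at >= 0 && fail_at <= 5);

	if (fail_at == 1)
		goto out;
	p->ctr_list = calloc(4, sizeof(*p->ctr_list));
	if (!p->ctr_list)
		goto out;

	if (fail_at == 2)
		goto out_free;
	p->irq_held = true;

	if (fail_at == 3)
		goto out_irq;
	p->hotplug_added = true;

	if (fail_at == 4)
		goto out_destroy;
	p->registered = true;

	if (fail_at == 5)	/* a later step fails after registration */
		goto out_unregister;
	return 0;

out_unregister:
	p->registered = false;
out_destroy:
	p->hotplug_added = false;
out_irq:
	p->irq_held = false;
out_free:
	free(p->ctr_list);
	p->ctr_list = NULL;
out:
	return ret;
}
```

In this model, calling fake_probe() with any non-zero fail_at leaves fake_pmu_live_resources() at 0, which is the property the patch aims for in the real probe path.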
Reported-by: Sashiko AI <sashiko-bot@kernel.org>
Signed-off-by: Atish Patra <atishp@meta.com>
---
drivers/perf/riscv_pmu_sbi.c | 33 +++++++++++++++++++++++++++------
1 file changed, 27 insertions(+), 6 deletions(-)
diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c
index 385af5e6e6d0..5c8924ce1f38 100644
--- a/drivers/perf/riscv_pmu_sbi.c
+++ b/drivers/perf/riscv_pmu_sbi.c
@@ -1220,22 +1220,29 @@ static int pmu_sbi_setup_irqs(struct riscv_pmu *pmu, struct platform_device *pde
DOMAIN_BUS_ANY);
if (!domain) {
pr_err("Failed to find INTC IRQ root domain\n");
- return -ENODEV;
+ ret = -ENODEV;
+ goto err;
}
riscv_pmu_irq = irq_create_mapping(domain, riscv_pmu_irq_num);
if (!riscv_pmu_irq) {
pr_err("Failed to map PMU interrupt for node\n");
- return -ENODEV;
+ ret = -ENODEV;
+ goto err;
}
ret = request_percpu_irq(riscv_pmu_irq, pmu_sbi_ovf_handler, "riscv-pmu", hw_events);
if (ret) {
pr_err("registering percpu irq failed [%d]\n", ret);
- return ret;
+ irq_dispose_mapping(riscv_pmu_irq);
+ riscv_pmu_irq = 0;
+ goto err;
}
return 0;
+err:
+ riscv_pmu_use_irq = false;
+ return ret;
}
#ifdef CONFIG_CPU_PM
@@ -1302,7 +1309,8 @@ static void riscv_pmu_destroy(struct riscv_pmu *pmu)
}
}
riscv_pm_pmu_unregister(pmu);
- cpuhp_state_remove_instance(CPUHP_AP_PERF_RISCV_STARTING, &pmu->node);
+ if (!hlist_unhashed(&pmu->node))
+ cpuhp_state_remove_instance(CPUHP_AP_PERF_RISCV_STARTING, &pmu->node);
}
static void pmu_sbi_event_init(struct perf_event *event)
@@ -1424,6 +1432,7 @@ static int pmu_sbi_device_probe(struct platform_device *pdev)
struct riscv_pmu *pmu = NULL;
int ret = -ENODEV;
int num_counters;
+ bool irq_requested = false;
pr_info("SBI PMU extension is available\n");
pmu = riscv_pmu_alloc();
@@ -1452,6 +1461,7 @@ static int pmu_sbi_device_probe(struct platform_device *pdev)
pmu->pmu.capabilities |= PERF_PMU_CAP_NO_INTERRUPT;
pmu->pmu.capabilities |= PERF_PMU_CAP_NO_EXCLUDE;
}
+ irq_requested = (ret == 0);
pmu->pmu.attr_groups = riscv_pmu_attr_groups;
pmu->pmu.parent = &pdev->dev;
@@ -1470,11 +1480,11 @@ static int pmu_sbi_device_probe(struct platform_device *pdev)
ret = riscv_pm_pmu_register(pmu);
if (ret)
- goto out_unregister;
+ goto out_destroy;
ret = perf_pmu_register(&pmu->pmu, "cpu", PERF_TYPE_RAW);
if (ret)
- goto out_unregister;
+ goto out_destroy;
/* SBI PMU Snapsphot is only available in SBI v2.0 */
if (sbi_v2_available) {
@@ -1515,9 +1525,20 @@ static int pmu_sbi_device_probe(struct platform_device *pdev)
return 0;
out_unregister:
+ perf_pmu_unregister(&pmu->pmu);
+
+out_destroy:
riscv_pmu_destroy(pmu);
+ if (irq_requested) {
+ free_percpu_irq(riscv_pmu_irq, pmu->hw_events);
+ irq_dispose_mapping(riscv_pmu_irq);
+ riscv_pmu_irq = 0;
+ }
out_free:
+ free_percpu(pmu->hw_events);
+ kfree(pmu_ctr_list);
+ pmu_ctr_list = NULL;
kfree(pmu);
return ret;
}
--
2.53.0-Meta
* Re: [PATCH v8 01/22] RISC-V: perf: fix resource cleanup on driver probe failure
2026-07-01 8:46 ` [PATCH v8 01/22] RISC-V: perf: fix resource cleanup on driver probe failure Atish Patra
@ 2026-07-01 9:11 ` sashiko-bot
0 siblings, 0 replies; 34+ messages in thread
From: sashiko-bot @ 2026-07-01 9:11 UTC (permalink / raw)
To: Atish Patra; +Cc: robh, linux-perf-users, devicetree, conor+dt
Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
Pre-existing issues:
- [High] Preexisting race condition and ignored -EBUSY return from perf_pmu_unregister() leads to UAF if events are active during probe failure.
- [High] The global bitmask `cmask` is modified during device probe but not cleared on error paths, leading to a guaranteed NULL pointer dereference in exported functions.
- [High] The static key `sbi_pmu_snapshot_available` is never disabled on probe failure, leading to a potential NULL pointer dereference.
--
commit 4312c8eb8c51217703e557d54647da346fae9bf0
Author: Atish Patra <atishp@meta.com>
Subject: RISC-V: perf: fix resource cleanup on driver probe failure
This commit reorganizes the error handling paths in pmu_sbi_device_probe()
to properly free registered PMUs, per-cpu IRQs, the pmu_ctr_list, and
CPU hotplug states if the probe fails.
> diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c
> index 385af5e6e6d0d..5c8924ce1f387 100644
> --- a/drivers/perf/riscv_pmu_sbi.c
> +++ b/drivers/perf/riscv_pmu_sbi.c
[ ... ]
> @@ -1515,9 +1525,20 @@ static int pmu_sbi_device_probe(struct platform_device *pdev)
> return 0;
>
> out_unregister:
> + perf_pmu_unregister(&pmu->pmu);
> +
[Severity: High]
This isn't a bug introduced by this patch, but does this error path ignore a
potential -EBUSY return from perf_pmu_unregister()?
If perf_pmu_register() previously succeeded, an unprivileged user might race
to open an event on it. If a subsequent initialization step fails and triggers
this cleanup path, perf_pmu_unregister() can return -EBUSY and fail to remove
the PMU from the core structures.
Since the return value is ignored, the code will proceed to call kfree(pmu)
below. Can this leave a dangling pointer in the perf subsystem and result in a
use-after-free if those events are still active?
> +out_destroy:
> riscv_pmu_destroy(pmu);
> + if (irq_requested) {
> + free_percpu_irq(riscv_pmu_irq, pmu->hw_events);
> + irq_dispose_mapping(riscv_pmu_irq);
> + riscv_pmu_irq = 0;
> + }
>
> out_free:
> + free_percpu(pmu->hw_events);
> + kfree(pmu_ctr_list);
> + pmu_ctr_list = NULL;
[Severity: High]
This is a pre-existing issue, but does this error path leave the global cmask
variable set when pmu_ctr_list is freed?
During probe, pmu_sbi_get_ctrinfo() populates the global cmask. If the probe
fails and we take this error path, pmu_ctr_list is freed and set to NULL,
but cmask is never cleared.
If a subsequent call to the exported function riscv_pmu_get_hpm_info()
occurs (e.g., triggered by KVM vCPU initialization), could it bypass the
safety check because cmask is non-zero, and dereference the NULL pointer?
drivers/perf/riscv_pmu_sbi.c:riscv_pmu_get_hpm_info() {
...
for_each_set_bit(i, &cmask, RISCV_MAX_COUNTERS) {
info = &pmu_ctr_list[i];
if (!info)
continue;
...
}
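
One way to picture the concern is a userspace model in which the mask and the list are always set and cleared together. This is a minimal sketch, not a proposed fix: `fake_ctr_list`, `fake_cmask` and the helper names are hypothetical stand-ins for pmu_ctr_list and cmask, and the real driver may need different handling.

```c
#include <assert.h>
#include <stdlib.h>

#define FAKE_MAX_COUNTERS 64

struct fake_ctr_info {
	unsigned int width;
};

static struct fake_ctr_info *fake_ctr_list;	/* models pmu_ctr_list */
static unsigned long long fake_cmask;		/* models cmask */

/* Populate the list and the mask together, as probe would. */
static int fake_get_ctrinfo(unsigned long long mask)
{
	fake_ctr_list = calloc(FAKE_MAX_COUNTERS, sizeof(*fake_ctr_list));
	if (!fake_ctr_list)
		return -1;
	fake_cmask = mask;
	return 0;
}

/* Release both, so a later reader never sees a mask without a list. */
static void fake_release_ctrinfo(void)
{
	free(fake_ctr_list);
	fake_ctr_list = NULL;
	fake_cmask = 0;
}

/* Reader that trusts the mask; returns the number of counters. */
static int fake_count_counters(void)
{
	int n = 0;

	if (!fake_cmask)
		return 0;

	/* Invariant the model relies on: non-zero mask implies a list. */
	assert(fake_ctr_list != NULL);
	for (int i = 0; i < FAKE_MAX_COUNTERS; i++)
		if (fake_cmask & (1ULL << i))
			n++;
	return n;
}
```

If fake_release_ctrinfo() left fake_cmask untouched, the assertion in fake_count_counters() would fire, which mirrors the NULL dereference described above.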
[Severity: High]
This is also a pre-existing issue, but does this error path need to disable
the sbi_pmu_snapshot_available static key?
Earlier in pmu_sbi_device_probe(), the static key is enabled:
drivers/perf/riscv_pmu_sbi.c:pmu_sbi_device_probe() {
...
static_branch_enable(&sbi_pmu_snapshot_available);
...
}
If a subsequent step fails, the probe aborts and frees the snapshot memory,
but the static key remains enabled. If the driver is re-probed and setup
fails on the retry, the probe might continue successfully without snapshot
memory. Could this leave the static key true, causing functions like
pmu_sbi_ctr_read() to dereference a NULL snapshot_addr?
drivers/perf/riscv_pmu_sbi.c:pmu_sbi_ctr_read() {
...
struct riscv_pmu_snapshot_data *sdata = cpu_hw_evt->snapshot_addr;
...
if (sbi_pmu_snapshot_available() && (hwc->state & PERF_HES_STOPPED)) {
val = sdata->ctr_values[idx];
...
}
> kfree(pmu);
> return ret;
> }
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260701-counter_delegation-v8-0-7909f863a645@meta.com?part=1
* [PATCH v8 02/22] RISC-V: Add Sxcsrind ISA extension CSR definitions
2026-07-01 8:46 [PATCH v8 00/22] Add Counter delegation ISA extension support Atish Patra
2026-07-01 8:46 ` [PATCH v8 01/22] RISC-V: perf: fix resource cleanup on driver probe failure Atish Patra
@ 2026-07-01 8:46 ` Atish Patra
2026-07-01 8:46 ` [PATCH v8 03/22] RISC-V: Add Sxcsrind ISA extension definition and parsing Atish Patra
` (19 subsequent siblings)
21 siblings, 0 replies; 34+ messages in thread
From: Atish Patra @ 2026-07-01 8:46 UTC (permalink / raw)
To: Jiri Olsa, Paul Walmsley, Mark Rutland, Rob Herring, Anup Patel,
Namhyung Kim, Arnaldo Carvalho de Melo, Krzysztof Kozlowski,
Atish Patra, Ian Rogers, Will Deacon, James Clark
Cc: linux-arm-kernel, linux-riscv, linux-kernel, devicetree,
linux-perf-users, Conor Dooley
From: Kaiwen Xue <kaiwenx@rivosinc.com>
Add definitions of the new CSRs and bits defined in the Sxcsrind ISA
extension. These CSRs enable an indirect access mechanism for accessing
any CSR in M-, S-, and VS-mode. The range of the select values
and ireg will be defined by the ISA extension that uses Sxcsrind.
Signed-off-by: Kaiwen Xue <kaiwenx@rivosinc.com>
Reviewed-by: Clément Léger <cleger@rivosinc.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
arch/riscv/include/asm/csr.h | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)
diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
index 31b8988f4488..b4551a6cf7cb 100644
--- a/arch/riscv/include/asm/csr.h
+++ b/arch/riscv/include/asm/csr.h
@@ -347,6 +347,12 @@
/* Supervisor-Level Window to Indirectly Accessed Registers (AIA) */
#define CSR_SISELECT 0x150
#define CSR_SIREG 0x151
+/* Supervisor-Level Window to Indirectly Accessed Registers (Sxcsrind) */
+#define CSR_SIREG2 0x152
+#define CSR_SIREG3 0x153
+#define CSR_SIREG4 0x155
+#define CSR_SIREG5 0x156
+#define CSR_SIREG6 0x157
/* Supervisor-Level Interrupts (AIA) */
#define CSR_STOPEI 0x15c
@@ -394,6 +400,14 @@
/* VS-Level Window to Indirectly Accessed Registers (H-extension with AIA) */
#define CSR_VSISELECT 0x250
#define CSR_VSIREG 0x251
+/*
+ * VS-Level Window to Indirectly Accessed Registers (H-extension with Sxcsrind)
+ */
+#define CSR_VSIREG2 0x252
+#define CSR_VSIREG3 0x253
+#define CSR_VSIREG4 0x255
+#define CSR_VSIREG5 0x256
+#define CSR_VSIREG6 0x257
/* VS-Level Interrupts (H-extension with AIA) */
#define CSR_VSTOPEI 0x25c
@@ -436,6 +450,12 @@
/* Machine-Level Window to Indirectly Accessed Registers (AIA) */
#define CSR_MISELECT 0x350
#define CSR_MIREG 0x351
+/* Machine-Level Window to Indirectly Accessed Registers (Sxcsrind) */
+#define CSR_MIREG2 0x352
+#define CSR_MIREG3 0x353
+#define CSR_MIREG4 0x355
+#define CSR_MIREG5 0x356
+#define CSR_MIREG6 0x357
/* Machine-Level Interrupts (AIA) */
#define CSR_MTOPEI 0x35c
@@ -498,6 +518,11 @@
# define CSR_IEH CSR_MIEH
# define CSR_ISELECT CSR_MISELECT
# define CSR_IREG CSR_MIREG
+# define CSR_IREG2 CSR_MIREG2
+# define CSR_IREG3 CSR_MIREG3
+# define CSR_IREG4 CSR_MIREG4
+# define CSR_IREG5 CSR_MIREG5
+# define CSR_IREG6 CSR_MIREG6
# define CSR_IPH CSR_MIPH
# define CSR_TOPEI CSR_MTOPEI
# define CSR_TOPI CSR_MTOPI
@@ -523,6 +548,11 @@
# define CSR_IEH CSR_SIEH
# define CSR_ISELECT CSR_SISELECT
# define CSR_IREG CSR_SIREG
+# define CSR_IREG2 CSR_SIREG2
+# define CSR_IREG3 CSR_SIREG3
+# define CSR_IREG4 CSR_SIREG4
+# define CSR_IREG5 CSR_SIREG5
+# define CSR_IREG6 CSR_SIREG6
# define CSR_IPH CSR_SIPH
# define CSR_TOPEI CSR_STOPEI
# define CSR_TOPI CSR_STOPI
--
2.53.0-Meta
* [PATCH v8 03/22] RISC-V: Add Sxcsrind ISA extension definition and parsing
2026-07-01 8:46 [PATCH v8 00/22] Add Counter delegation ISA extension support Atish Patra
2026-07-01 8:46 ` [PATCH v8 01/22] RISC-V: perf: fix resource cleanup on driver probe failure Atish Patra
2026-07-01 8:46 ` [PATCH v8 02/22] RISC-V: Add Sxcsrind ISA extension CSR definitions Atish Patra
@ 2026-07-01 8:46 ` Atish Patra
2026-07-01 8:46 ` [PATCH v8 04/22] dt-bindings: riscv: add Sxcsrind ISA extension description Atish Patra
` (18 subsequent siblings)
21 siblings, 0 replies; 34+ messages in thread
From: Atish Patra @ 2026-07-01 8:46 UTC (permalink / raw)
To: Jiri Olsa, Paul Walmsley, Mark Rutland, Rob Herring, Anup Patel,
Namhyung Kim, Arnaldo Carvalho de Melo, Krzysztof Kozlowski,
Atish Patra, Ian Rogers, Will Deacon, James Clark
Cc: linux-arm-kernel, linux-riscv, linux-kernel, devicetree,
linux-perf-users, Conor Dooley
From: Atish Patra <atishp@rivosinc.com>
The S[m|s]csrind extension extends the indirect CSR access mechanism
defined in the Smaia/Ssaia extensions.
This patch just enables the definition and parsing.
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
arch/riscv/include/asm/hwcap.h | 4 ++++
arch/riscv/kernel/cpufeature.c | 2 ++
2 files changed, 6 insertions(+)
diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
index 7ef8e5f55c8d..d4a7b90e2d78 100644
--- a/arch/riscv/include/asm/hwcap.h
+++ b/arch/riscv/include/asm/hwcap.h
@@ -112,6 +112,8 @@
#define RISCV_ISA_EXT_ZCLSD 103
#define RISCV_ISA_EXT_ZICFILP 104
#define RISCV_ISA_EXT_ZICFISS 105
+#define RISCV_ISA_EXT_SSCSRIND 106
+#define RISCV_ISA_EXT_SMCSRIND 107
#define RISCV_ISA_EXT_XLINUXENVCFG 127
@@ -121,9 +123,11 @@
#ifdef CONFIG_RISCV_M_MODE
#define RISCV_ISA_EXT_SxAIA RISCV_ISA_EXT_SMAIA
#define RISCV_ISA_EXT_SUPM RISCV_ISA_EXT_SMNPM
+#define RISCV_ISA_EXT_SxCSRIND RISCV_ISA_EXT_SMCSRIND
#else
#define RISCV_ISA_EXT_SxAIA RISCV_ISA_EXT_SSAIA
#define RISCV_ISA_EXT_SUPM RISCV_ISA_EXT_SSNPM
+#define RISCV_ISA_EXT_SxCSRIND RISCV_ISA_EXT_SSCSRIND
#endif
#endif /* _ASM_RISCV_HWCAP_H */
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index f46aa5602d74..3fa0a563fb21 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -576,11 +576,13 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = {
__RISCV_ISA_EXT_BUNDLE_VALIDATE(zvksg, riscv_zvksg_bundled_exts, riscv_ext_vector_crypto_validate),
__RISCV_ISA_EXT_DATA_VALIDATE(zvkt, RISCV_ISA_EXT_ZVKT, riscv_ext_vector_crypto_validate),
__RISCV_ISA_EXT_DATA(smaia, RISCV_ISA_EXT_SMAIA),
+ __RISCV_ISA_EXT_DATA(smcsrind, RISCV_ISA_EXT_SMCSRIND),
__RISCV_ISA_EXT_DATA(smmpm, RISCV_ISA_EXT_SMMPM),
__RISCV_ISA_EXT_SUPERSET(smnpm, RISCV_ISA_EXT_SMNPM, riscv_xlinuxenvcfg_exts),
__RISCV_ISA_EXT_DATA(smstateen, RISCV_ISA_EXT_SMSTATEEN),
__RISCV_ISA_EXT_DATA(ssaia, RISCV_ISA_EXT_SSAIA),
__RISCV_ISA_EXT_DATA(sscofpmf, RISCV_ISA_EXT_SSCOFPMF),
+ __RISCV_ISA_EXT_DATA(sscsrind, RISCV_ISA_EXT_SSCSRIND),
__RISCV_ISA_EXT_SUPERSET(ssnpm, RISCV_ISA_EXT_SSNPM, riscv_xlinuxenvcfg_exts),
__RISCV_ISA_EXT_DATA(sstc, RISCV_ISA_EXT_SSTC),
__RISCV_ISA_EXT_DATA(svade, RISCV_ISA_EXT_SVADE),
--
2.53.0-Meta
* [PATCH v8 04/22] dt-bindings: riscv: add Sxcsrind ISA extension description
2026-07-01 8:46 [PATCH v8 00/22] Add Counter delegation ISA extension support Atish Patra
` (2 preceding siblings ...)
2026-07-01 8:46 ` [PATCH v8 03/22] RISC-V: Add Sxcsrind ISA extension definition and parsing Atish Patra
@ 2026-07-01 8:46 ` Atish Patra
2026-07-01 8:46 ` [PATCH v8 05/22] RISC-V: Define indirect CSR access helpers Atish Patra
` (17 subsequent siblings)
21 siblings, 0 replies; 34+ messages in thread
From: Atish Patra @ 2026-07-01 8:46 UTC (permalink / raw)
To: Jiri Olsa, Paul Walmsley, Mark Rutland, Rob Herring, Anup Patel,
Namhyung Kim, Arnaldo Carvalho de Melo, Krzysztof Kozlowski,
Atish Patra, Ian Rogers, Will Deacon, James Clark
Cc: linux-arm-kernel, linux-riscv, linux-kernel, devicetree,
linux-perf-users, Conor Dooley
From: Atish Patra <atishp@rivosinc.com>
Add the S[m|s]csrind ISA extension description.
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
Documentation/devicetree/bindings/riscv/extensions.yaml | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/Documentation/devicetree/bindings/riscv/extensions.yaml b/Documentation/devicetree/bindings/riscv/extensions.yaml
index 2b0a8a93bb21..15cf0e2ee3ed 100644
--- a/Documentation/devicetree/bindings/riscv/extensions.yaml
+++ b/Documentation/devicetree/bindings/riscv/extensions.yaml
@@ -181,6 +181,14 @@ properties:
changes to interrupts as frozen at commit ccbddab ("Merge pull
request #42 from riscv/jhauser-2023-RC4") of riscv-aia.
+ - const: smcsrind
+ description: |
+ The standard Smcsrind machine-level extension extends the
+ indirect CSR access mechanism defined by the Smaia extension. This
+ extension allows other ISA extensions to use indirect CSR access
+ mechanism in M-mode as ratified in the 20240326 version of the
+ privileged ISA specification.
+
- const: smmpm
description: |
The standard Smmpm extension for M-mode pointer masking as
@@ -226,6 +234,14 @@ properties:
Profiles Version 1.0, with commit b1d806605f87 ("Updated to
ratified state.")
+ - const: sscsrind
+ description: |
+ The standard Sscsrind supervisor-level extension extends the
+ indirect CSR access mechanism defined by the Ssaia extension. This
+ extension allows other ISA extensions to use indirect CSR access
+ mechanism in S-mode as ratified in the 20240326 version of the
+ privileged ISA specification.
+
- const: ssnpm
description: |
The standard Ssnpm extension for next-mode pointer masking as
--
2.53.0-Meta
* [PATCH v8 05/22] RISC-V: Define indirect CSR access helpers
2026-07-01 8:46 [PATCH v8 00/22] Add Counter delegation ISA extension support Atish Patra
` (3 preceding siblings ...)
2026-07-01 8:46 ` [PATCH v8 04/22] dt-bindings: riscv: add Sxcsrind ISA extension description Atish Patra
@ 2026-07-01 8:46 ` Atish Patra
2026-07-01 8:46 ` [PATCH v8 06/22] RISC-V: Add Smcntrpmf extension parsing Atish Patra
` (16 subsequent siblings)
21 siblings, 0 replies; 34+ messages in thread
From: Atish Patra @ 2026-07-01 8:46 UTC (permalink / raw)
To: Jiri Olsa, Paul Walmsley, Mark Rutland, Rob Herring, Anup Patel,
Namhyung Kim, Arnaldo Carvalho de Melo, Krzysztof Kozlowski,
Atish Patra, Ian Rogers, Will Deacon, James Clark
Cc: linux-arm-kernel, linux-riscv, linux-kernel, devicetree,
linux-perf-users, Conor Dooley
From: Atish Patra <atishp@rivosinc.com>
Indirect CSR access requires multiple instructions to read or write a
CSR. Add a few helper functions for ease of use.
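
The select-then-access sequence can be modelled in userspace as follows. This is only a sketch under stated assumptions: `struct fake_ind_csr` and the `fake_*` functions are hypothetical, the window is backed by a plain array rather than hardware CSRs, and the per-register `writable` mask is an invented way to imitate WARL behaviour. Interrupt masking is not modelled; the real helpers run the sequence with local interrupts disabled because the select register is shared state.

```c
#include <assert.h>
#include <stdint.h>

#define FAKE_NUM_REGS 8

/* Hypothetical model of a select register plus one window register. */
struct fake_ind_csr {
	uint32_t select;			/* models CSR_ISELECT */
	uint64_t regs[FAKE_NUM_REGS];		/* registers behind the window */
	uint64_t writable[FAKE_NUM_REGS];	/* bits that keep written values */
};

static uint64_t fake_window_read(const struct fake_ind_csr *c)
{
	assert(c->select < FAKE_NUM_REGS);
	return c->regs[c->select];
}

static void fake_window_write(struct fake_ind_csr *c, uint64_t v)
{
	assert(c->select < FAKE_NUM_REGS);
	/* WARL-style: unsupported bits silently read back as zero. */
	c->regs[c->select] = v & c->writable[c->select];
}

/* Step 1: program the select register. Step 2: read the window. */
static uint64_t fake_ind_read(struct fake_ind_csr *c, uint32_t base, uint32_t off)
{
	c->select = base + off;
	return fake_window_read(c);
}

static void fake_ind_write(struct fake_ind_csr *c, uint32_t base, uint32_t off,
			   uint64_t v)
{
	c->select = base + off;
	fake_window_write(c, v);
}

/* Probe which bits of a value stick, then restore the old contents. */
static uint64_t fake_ind_warl(struct fake_ind_csr *c, uint32_t base, uint32_t off,
			      uint64_t probe)
{
	uint64_t old, seen;

	c->select = base + off;
	old = fake_window_read(c);
	fake_window_write(c, probe);
	seen = fake_window_read(c);
	fake_window_write(c, old);
	return seen;
}
```

The three functions correspond loosely to the read, write and WARL-probe helpers: each one is a multi-step sequence that must not be interleaved with another user of the same select register.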
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
arch/riscv/include/asm/csr_ind.h | 41 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 41 insertions(+)
diff --git a/arch/riscv/include/asm/csr_ind.h b/arch/riscv/include/asm/csr_ind.h
new file mode 100644
index 000000000000..1b15e358484d
--- /dev/null
+++ b/arch/riscv/include/asm/csr_ind.h
@@ -0,0 +1,41 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#ifndef _ASM_RISCV_CSR_IND_H
+#define _ASM_RISCV_CSR_IND_H
+
+#include <linux/irqflags.h>
+
+#include <asm/csr.h>
+
+#define csr_ind_read(iregcsr, iselbase, iseloff) ({ \
+ unsigned long __value = 0; \
+ unsigned long __flags; \
+ local_irq_save(__flags); \
+ csr_write(CSR_ISELECT, (iselbase) + (iseloff)); \
+ __value = csr_read(iregcsr); \
+ local_irq_restore(__flags); \
+ __value; \
+})
+
+#define csr_ind_write(iregcsr, iselbase, iseloff, value) ({ \
+ unsigned long __flags; \
+ local_irq_save(__flags); \
+ csr_write(CSR_ISELECT, (iselbase) + (iseloff)); \
+ csr_write(iregcsr, (value)); \
+ local_irq_restore(__flags); \
+})
+
+#define csr_ind_warl(iregcsr, iselbase, iseloff, warl_val) ({ \
+ unsigned long __old_val = 0, __value = 0; \
+ unsigned long __flags; \
+ local_irq_save(__flags); \
+ csr_write(CSR_ISELECT, (iselbase) + (iseloff)); \
+ __old_val = csr_read(iregcsr); \
+ csr_write(iregcsr, (warl_val)); \
+ __value = csr_read(iregcsr); \
+ csr_write(iregcsr, __old_val); \
+ local_irq_restore(__flags); \
+ __value; \
+})
+
+#endif
--
2.53.0-Meta
* [PATCH v8 06/22] RISC-V: Add Smcntrpmf extension parsing
2026-07-01 8:46 [PATCH v8 00/22] Add Counter delegation ISA extension support Atish Patra
` (4 preceding siblings ...)
2026-07-01 8:46 ` [PATCH v8 05/22] RISC-V: Define indirect CSR access helpers Atish Patra
@ 2026-07-01 8:46 ` Atish Patra
2026-07-01 8:46 ` [PATCH v8 07/22] dt-bindings: riscv: add Smcntrpmf ISA extension description Atish Patra
` (15 subsequent siblings)
21 siblings, 0 replies; 34+ messages in thread
From: Atish Patra @ 2026-07-01 8:46 UTC (permalink / raw)
To: Jiri Olsa, Paul Walmsley, Mark Rutland, Rob Herring, Anup Patel,
Namhyung Kim, Arnaldo Carvalho de Melo, Krzysztof Kozlowski,
Atish Patra, Ian Rogers, Will Deacon, James Clark
Cc: linux-arm-kernel, linux-riscv, linux-kernel, devicetree,
linux-perf-users, Conor Dooley
From: Atish Patra <atishp@rivosinc.com>
The Smcntrpmf extension allows M-mode to enable privilege mode filtering
for the cycle/instret counters. However, the cyclecfg/instretcfg CSRs are
available in Ssccfg only if Smcntrpmf is present.
That is why the kernel needs to detect the presence of the Smcntrpmf
extension and enable privilege mode filtering for the cycle/instret counters.
Reviewed-by: Clément Léger <cleger@rivosinc.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
arch/riscv/include/asm/hwcap.h | 1 +
arch/riscv/kernel/cpufeature.c | 1 +
2 files changed, 2 insertions(+)
diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
index d4a7b90e2d78..51ad55b9677a 100644
--- a/arch/riscv/include/asm/hwcap.h
+++ b/arch/riscv/include/asm/hwcap.h
@@ -114,6 +114,7 @@
#define RISCV_ISA_EXT_ZICFISS 105
#define RISCV_ISA_EXT_SSCSRIND 106
#define RISCV_ISA_EXT_SMCSRIND 107
+#define RISCV_ISA_EXT_SMCNTRPMF 108
#define RISCV_ISA_EXT_XLINUXENVCFG 127
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index 3fa0a563fb21..1452521d740a 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -576,6 +576,7 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = {
__RISCV_ISA_EXT_BUNDLE_VALIDATE(zvksg, riscv_zvksg_bundled_exts, riscv_ext_vector_crypto_validate),
__RISCV_ISA_EXT_DATA_VALIDATE(zvkt, RISCV_ISA_EXT_ZVKT, riscv_ext_vector_crypto_validate),
__RISCV_ISA_EXT_DATA(smaia, RISCV_ISA_EXT_SMAIA),
+ __RISCV_ISA_EXT_DATA(smcntrpmf, RISCV_ISA_EXT_SMCNTRPMF),
__RISCV_ISA_EXT_DATA(smcsrind, RISCV_ISA_EXT_SMCSRIND),
__RISCV_ISA_EXT_DATA(smmpm, RISCV_ISA_EXT_SMMPM),
__RISCV_ISA_EXT_SUPERSET(smnpm, RISCV_ISA_EXT_SMNPM, riscv_xlinuxenvcfg_exts),
--
2.53.0-Meta
* [PATCH v8 07/22] dt-bindings: riscv: add Smcntrpmf ISA extension description
2026-07-01 8:46 [PATCH v8 00/22] Add Counter delegation ISA extension support Atish Patra
` (5 preceding siblings ...)
2026-07-01 8:46 ` [PATCH v8 06/22] RISC-V: Add Smcntrpmf extension parsing Atish Patra
@ 2026-07-01 8:46 ` Atish Patra
2026-07-01 8:46 ` [PATCH v8 08/22] RISC-V: Add Sscfg extension CSR definition Atish Patra
` (14 subsequent siblings)
21 siblings, 0 replies; 34+ messages in thread
From: Atish Patra @ 2026-07-01 8:46 UTC (permalink / raw)
To: Jiri Olsa, Paul Walmsley, Mark Rutland, Rob Herring, Anup Patel,
Namhyung Kim, Arnaldo Carvalho de Melo, Krzysztof Kozlowski,
Atish Patra, Ian Rogers, Will Deacon, James Clark
Cc: linux-arm-kernel, linux-riscv, linux-kernel, devicetree,
linux-perf-users, Conor Dooley
From: Atish Patra <atishp@rivosinc.com>
Add the description for the Smcntrpmf ISA extension.
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
Documentation/devicetree/bindings/riscv/extensions.yaml | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/Documentation/devicetree/bindings/riscv/extensions.yaml b/Documentation/devicetree/bindings/riscv/extensions.yaml
index 15cf0e2ee3ed..2493766e956d 100644
--- a/Documentation/devicetree/bindings/riscv/extensions.yaml
+++ b/Documentation/devicetree/bindings/riscv/extensions.yaml
@@ -181,6 +181,12 @@ properties:
changes to interrupts as frozen at commit ccbddab ("Merge pull
request #42 from riscv/jhauser-2023-RC4") of riscv-aia.
+ - const: smcntrpmf
+ description: |
+ The standard Smcntrpmf machine-level extension for the machine mode
+ to enable privilege mode filtering for cycle and instret counters as
+ ratified in the 20240326 version of the privileged ISA specification.
+
- const: smcsrind
description: |
The standard Smcsrind machine-level extension extends the
--
2.53.0-Meta
* [PATCH v8 08/22] RISC-V: Add Sscfg extension CSR definition
2026-07-01 8:46 [PATCH v8 00/22] Add Counter delegation ISA extension support Atish Patra
` (6 preceding siblings ...)
2026-07-01 8:46 ` [PATCH v8 07/22] dt-bindings: riscv: add Smcntrpmf ISA extension description Atish Patra
@ 2026-07-01 8:46 ` Atish Patra
2026-07-01 8:46 ` [PATCH v8 09/22] RISC-V: Add Ssccfg/Smcdeleg ISA extension definition and parsing Atish Patra
` (13 subsequent siblings)
21 siblings, 0 replies; 34+ messages in thread
From: Atish Patra @ 2026-07-01 8:46 UTC (permalink / raw)
To: Jiri Olsa, Paul Walmsley, Mark Rutland, Rob Herring, Anup Patel,
Namhyung Kim, Arnaldo Carvalho de Melo, Krzysztof Kozlowski,
Atish Patra, Ian Rogers, Will Deacon, James Clark
Cc: linux-arm-kernel, linux-riscv, linux-kernel, devicetree,
linux-perf-users, Conor Dooley
From: Kaiwen Xue <kaiwenx@rivosinc.com>
Add the scountinhibit CSR definition and the S-mode accessible hpmevent
bits defined by Smcdeleg/Ssccfg. scountinhibit allows S-mode to start/stop
counters directly, without invoking SBI calls to M-mode. It is also used
to determine which counters M-mode has delegated to S-mode.
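
The two uses of scountinhibit mentioned above can be sketched with a userspace model. The assumptions are loud here: `struct fake_hart` and the `fake_*` functions are hypothetical, a set bit is taken to mean "counter stopped", and bits for counters that were not delegated are assumed to read back as zero, so writing all ones and reading back reveals the delegated set. Consult the Smcdeleg/Ssccfg specification for the authoritative behaviour.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one hart's counter inhibit state. */
struct fake_hart {
	uint32_t delegated;	/* counters M-mode delegated (assumed bitmask) */
	uint32_t inhibit;	/* models scountinhibit; set bit = stopped */
};

/* Writes only affect bits of delegated counters in this model. */
static void fake_inhibit_write(struct fake_hart *h, uint32_t v)
{
	h->inhibit = v & h->delegated;
}

/* Stop a counter by setting its inhibit bit. */
static void fake_counter_stop(struct fake_hart *h, unsigned int idx)
{
	assert(idx < 32);
	fake_inhibit_write(h, h->inhibit | (UINT32_C(1) << idx));
}

/* Start a counter by clearing its inhibit bit. */
static void fake_counter_start(struct fake_hart *h, unsigned int idx)
{
	assert(idx < 32);
	fake_inhibit_write(h, h->inhibit & ~(UINT32_C(1) << idx));
}

/* Write all ones, read back what stuck, then restore the old value. */
static uint32_t fake_probe_delegated(struct fake_hart *h)
{
	uint32_t old = h->inhibit;
	uint32_t seen;

	fake_inhibit_write(h, UINT32_MAX);
	seen = h->inhibit;
	fake_inhibit_write(h, old);
	return seen;
}
```

No call into a higher privilege level appears anywhere in the model, which is the point of the delegation: starting, stopping and discovering counters are plain register accesses.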
Signed-off-by: Kaiwen Xue <kaiwenx@rivosinc.com>
Reviewed-by: Clément Léger <cleger@rivosinc.com>
---
arch/riscv/include/asm/csr.h | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
index b4551a6cf7cb..a3b24b88e401 100644
--- a/arch/riscv/include/asm/csr.h
+++ b/arch/riscv/include/asm/csr.h
@@ -241,6 +241,23 @@
#define SMSTATEEN0_HSENVCFG (_ULL(1) << SMSTATEEN0_HSENVCFG_SHIFT)
#define SMSTATEEN0_SSTATEEN0_SHIFT 63
#define SMSTATEEN0_SSTATEEN0 (_ULL(1) << SMSTATEEN0_SSTATEEN0_SHIFT)
+/* HPMEVENT bits. These are accessible in S-mode via Smcdeleg/Ssccfg */
+#define HPMEVENT_OF (BIT_ULL(63))
+#define HPMEVENT_MINH (BIT_ULL(62))
+#define HPMEVENT_SINH (BIT_ULL(61))
+#define HPMEVENT_UINH (BIT_ULL(60))
+#define HPMEVENT_VSINH (BIT_ULL(59))
+#define HPMEVENT_VUINH (BIT_ULL(58))
+#ifndef CONFIG_64BIT
+#define HPMEVENTH_OF (BIT(31))
+#define HPMEVENTH_MINH (BIT(30))
+#define HPMEVENTH_SINH (BIT(29))
+#define HPMEVENTH_UINH (BIT(28))
+#define HPMEVENTH_VSINH (BIT(27))
+#define HPMEVENTH_VUINH (BIT(26))
+#endif
+
+#define SISELECT_SSCCFG_BASE 0x40
/* mseccfg bits */
#define MSECCFG_PMM ENVCFG_PMM
@@ -322,6 +339,7 @@
#define CSR_SCOUNTEREN 0x106
#define CSR_SENVCFG 0x10a
#define CSR_SSTATEEN0 0x10c
+#define CSR_SCOUNTINHIBIT 0x120
#define CSR_SSCRATCH 0x140
#define CSR_SEPC 0x141
#define CSR_SCAUSE 0x142
--
2.53.0-Meta
* [PATCH v8 09/22] RISC-V: Add Ssccfg/Smcdeleg ISA extension definition and parsing
2026-07-01 8:46 [PATCH v8 00/22] Add Counter delegation ISA extension support Atish Patra
` (7 preceding siblings ...)
2026-07-01 8:46 ` [PATCH v8 08/22] RISC-V: Add Sscfg extension CSR definition Atish Patra
@ 2026-07-01 8:46 ` Atish Patra
2026-07-01 9:11 ` sashiko-bot
2026-07-01 8:46 ` [PATCH v8 10/22] dt-bindings: riscv: add Counter delegation ISA extensions description Atish Patra
` (12 subsequent siblings)
21 siblings, 1 reply; 34+ messages in thread
From: Atish Patra @ 2026-07-01 8:46 UTC (permalink / raw)
To: Jiri Olsa, Paul Walmsley, Mark Rutland, Rob Herring, Anup Patel,
Namhyung Kim, Arnaldo Carvalho de Melo, Krzysztof Kozlowski,
Atish Patra, Ian Rogers, Will Deacon, James Clark
Cc: linux-arm-kernel, linux-riscv, linux-kernel, devicetree,
linux-perf-users, Conor Dooley
From: Atish Patra <atishp@rivosinc.com>
The Smcdeleg extension allows M-mode to delegate selected counters
to S-mode so that S-mode can access those counters and the corresponding
hpmevent CSRs without M-mode involvement.
Ssccfg (‘Ss’ for Privileged architecture and Supervisor-level
extension, ‘ccfg’ for Counter Configuration) provides access to
delegated counters and new supervisor-level state.
This patch just adds these definitions and enables parsing.
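
The dependency rules applied during parsing can be illustrated with a small userspace model. This is a sketch that mirrors the checks as posted in this patch, not a statement of what the final rules should be: the `fake_*` names, the enum and the single-word bitmap are hypothetical simplifications of the kernel's ISA bitmap helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical extension IDs; the kernel uses RISCV_ISA_EXT_* values. */
enum fake_ext {
	FAKE_ZICNTR,
	FAKE_ZIHPM,
	FAKE_SSCSRIND,
	FAKE_SMCDELEG,
	FAKE_SSCCFG,
	FAKE_EXT_MAX
};

#define FAKE_BIT(e) (UINT32_C(1) << (e))

static bool fake_ext_available(uint32_t isa, enum fake_ext e)
{
	assert(e < FAKE_EXT_MAX);
	return (isa & FAKE_BIT(e)) != 0;
}

/* Smcdeleg is accepted only with Sscsrind, Zihpm and Zicntr present. */
static bool fake_smcdeleg_ok(uint32_t isa)
{
	return fake_ext_available(isa, FAKE_SSCSRIND) &&
	       fake_ext_available(isa, FAKE_ZIHPM) &&
	       fake_ext_available(isa, FAKE_ZICNTR);
}

/* Ssccfg additionally requires Smcdeleg itself, as in the posted patch. */
static bool fake_ssccfg_ok(uint32_t isa)
{
	return fake_smcdeleg_ok(isa) && fake_ext_available(isa, FAKE_SMCDELEG);
}
```

A bitmap that advertises Ssccfg without its prerequisites fails the check, so the extension is not enabled even though it was listed.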
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
arch/riscv/include/asm/hwcap.h | 2 ++
arch/riscv/kernel/cpufeature.c | 24 ++++++++++++++++++++++++
2 files changed, 26 insertions(+)
diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
index 51ad55b9677a..089353b250b0 100644
--- a/arch/riscv/include/asm/hwcap.h
+++ b/arch/riscv/include/asm/hwcap.h
@@ -115,6 +115,8 @@
#define RISCV_ISA_EXT_SSCSRIND 106
#define RISCV_ISA_EXT_SMCSRIND 107
#define RISCV_ISA_EXT_SMCNTRPMF 108
+#define RISCV_ISA_EXT_SSCCFG 109
+#define RISCV_ISA_EXT_SMCDELEG 110
#define RISCV_ISA_EXT_XLINUXENVCFG 127
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index 1452521d740a..1fe647e03515 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -330,6 +330,27 @@ static const unsigned int riscv_a_exts[] = {
RISCV_ISA_EXT_ZKNE, \
RISCV_ISA_EXT_ZKNH
+static int riscv_ext_smcdeleg_validate(const struct riscv_isa_ext_data *data,
+ const unsigned long *isa_bitmap)
+{
+ if (__riscv_isa_extension_available(isa_bitmap, RISCV_ISA_EXT_SSCSRIND) &&
+ __riscv_isa_extension_available(isa_bitmap, RISCV_ISA_EXT_ZIHPM) &&
+ __riscv_isa_extension_available(isa_bitmap, RISCV_ISA_EXT_ZICNTR))
+ return 0;
+
+ return -EPROBE_DEFER;
+}
+
+static int riscv_ext_ssccfg_validate(const struct riscv_isa_ext_data *data,
+ const unsigned long *isa_bitmap)
+{
+ if (!riscv_ext_smcdeleg_validate(data, isa_bitmap) &&
+ __riscv_isa_extension_available(isa_bitmap, RISCV_ISA_EXT_SMCDELEG))
+ return 0;
+
+ return -EPROBE_DEFER;
+}
+
static const unsigned int riscv_zk_bundled_exts[] = {
RISCV_ISA_EXT_ZKN,
RISCV_ISA_EXT_ZKR,
@@ -576,12 +597,15 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = {
__RISCV_ISA_EXT_BUNDLE_VALIDATE(zvksg, riscv_zvksg_bundled_exts, riscv_ext_vector_crypto_validate),
__RISCV_ISA_EXT_DATA_VALIDATE(zvkt, RISCV_ISA_EXT_ZVKT, riscv_ext_vector_crypto_validate),
__RISCV_ISA_EXT_DATA(smaia, RISCV_ISA_EXT_SMAIA),
+ __RISCV_ISA_EXT_DATA_VALIDATE(smcdeleg, RISCV_ISA_EXT_SMCDELEG,
+ riscv_ext_smcdeleg_validate),
__RISCV_ISA_EXT_DATA(smcntrpmf, RISCV_ISA_EXT_SMCNTRPMF),
__RISCV_ISA_EXT_DATA(smcsrind, RISCV_ISA_EXT_SMCSRIND),
__RISCV_ISA_EXT_DATA(smmpm, RISCV_ISA_EXT_SMMPM),
__RISCV_ISA_EXT_SUPERSET(smnpm, RISCV_ISA_EXT_SMNPM, riscv_xlinuxenvcfg_exts),
__RISCV_ISA_EXT_DATA(smstateen, RISCV_ISA_EXT_SMSTATEEN),
__RISCV_ISA_EXT_DATA(ssaia, RISCV_ISA_EXT_SSAIA),
+ __RISCV_ISA_EXT_DATA_VALIDATE(ssccfg, RISCV_ISA_EXT_SSCCFG, riscv_ext_ssccfg_validate),
__RISCV_ISA_EXT_DATA(sscofpmf, RISCV_ISA_EXT_SSCOFPMF),
__RISCV_ISA_EXT_DATA(sscsrind, RISCV_ISA_EXT_SSCSRIND),
__RISCV_ISA_EXT_SUPERSET(ssnpm, RISCV_ISA_EXT_SSNPM, riscv_xlinuxenvcfg_exts),
--
2.53.0-Meta
* Re: [PATCH v8 09/22] RISC-V: Add Ssccfg/Smcdeleg ISA extension definition and parsing
2026-07-01 8:46 ` [PATCH v8 09/22] RISC-V: Add Ssccfg/Smcdeleg ISA extension definition and parsing Atish Patra
@ 2026-07-01 9:11 ` sashiko-bot
0 siblings, 0 replies; 34+ messages in thread
From: sashiko-bot @ 2026-07-01 9:11 UTC (permalink / raw)
To: Atish Patra; +Cc: linux-perf-users, robh, conor+dt, devicetree
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [High] The Supervisor-level `Ssccfg` extension validation strictly requires the Machine-level `Smcdeleg` extension, breaking S-mode guest VMs.
- [Medium] riscv_ext_ssccfg_validate passes the wrong context pointer to riscv_ext_smcdeleg_validate and performs a logically redundant check.
--
commit 6cf182c3b2cfda28c919d7041cfd7b6c8a237b86
Author: Atish Patra <atishp@rivosinc.com>
RISC-V: Add Ssccfg/Smcdeleg ISA extension definition and parsing
This commit introduces definitions and parsing for the Smcdeleg and Ssccfg
ISA extensions on RISC-V. It adds validation logic to ensure that required
dependencies are present before enabling these extensions.
> diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> index 1452521d740a6..1fe647e035157 100644
> --- a/arch/riscv/kernel/cpufeature.c
> +++ b/arch/riscv/kernel/cpufeature.c
> @@ -330,6 +330,27 @@ static const unsigned int riscv_a_exts[] = {
> RISCV_ISA_EXT_ZKNE, \
> RISCV_ISA_EXT_ZKNH
>
> +static int riscv_ext_smcdeleg_validate(const struct riscv_isa_ext_data *data,
> + const unsigned long *isa_bitmap)
> +{
> + if (__riscv_isa_extension_available(isa_bitmap, RISCV_ISA_EXT_SSCSRIND) &&
> + __riscv_isa_extension_available(isa_bitmap, RISCV_ISA_EXT_ZIHPM) &&
> + __riscv_isa_extension_available(isa_bitmap, RISCV_ISA_EXT_ZICNTR))
> + return 0;
> +
> + return -EPROBE_DEFER;
> +}
> +
> +static int riscv_ext_ssccfg_validate(const struct riscv_isa_ext_data *data,
> + const unsigned long *isa_bitmap)
> +{
> + if (!riscv_ext_smcdeleg_validate(data, isa_bitmap) &&
> + __riscv_isa_extension_available(isa_bitmap, RISCV_ISA_EXT_SMCDELEG))
[Severity: High]
Does this code inadvertently break Ssccfg support in S-mode guest VMs?
When Linux runs as a guest VM under a hypervisor, M-mode extensions like
Smcdeleg are hidden from the guest OS. Since this validation strictly requires
RISCV_ISA_EXT_SMCDELEG to be present in the ISA string, Ssccfg will always
return -EPROBE_DEFER and be disabled.
Should a Supervisor-level extension depend on the presence of a Machine-level
extension?
[Severity: Medium]
Is it intentional to pass the SSCCFG data pointer to a function expecting
SMCDELEG's context?
While it currently doesn't crash because riscv_ext_smcdeleg_validate() ignores
the data pointer, this seems to violate the validation callback API contract.
Additionally, is it necessary to call riscv_ext_smcdeleg_validate() directly
here? If RISCV_ISA_EXT_SMCDELEG is present in the resolved isa_bitmap, it
should already be guaranteed to have passed its own validation function during
a previous loop iteration.
> + return 0;
> +
> + return -EPROBE_DEFER;
> +}
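To make the first question concrete, here is a small userspace model of the two dependency checks. It is only a sketch: the enum values, the single-word bitmap, and the function names are made up for illustration and are not kernel API, and whether Ssccfg should be accepted without Smcdeleg is a question for the specification and the author, not something this model settles.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical extension IDs for this sketch; the kernel uses RISCV_ISA_EXT_*. */
enum ext_id {
	EXT_ZICNTR,
	EXT_ZIHPM,
	EXT_SSCSRIND,
	EXT_SMCDELEG,
	EXT_SSCCFG,
	EXT_MAX
};

/* Kernel-internal errno value, not provided by userspace headers. */
#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517
#endif

/* Test one bit of a single-word ISA bitmap (a simplification of isa_bitmap). */
static bool ext_available(unsigned long bitmap, enum ext_id id)
{
	assert(id < EXT_MAX);
	return (bitmap >> id) & 1UL;
}

/* Dependencies that the posted patch requires for both extensions. */
static bool cdeleg_deps_present(unsigned long bitmap)
{
	return ext_available(bitmap, EXT_SSCSRIND) &&
	       ext_available(bitmap, EXT_ZIHPM) &&
	       ext_available(bitmap, EXT_ZICNTR);
}

/* Models the check as posted: Ssccfg also requires Smcdeleg. */
static int ssccfg_validate_as_posted(unsigned long bitmap)
{
	if (cdeleg_deps_present(bitmap) && ext_available(bitmap, EXT_SMCDELEG))
		return 0;
	return -EPROBE_DEFER;
}

/* Models the alternative raised above: ignore the M-mode extension. */
static int ssccfg_validate_no_mmode(unsigned long bitmap)
{
	return cdeleg_deps_present(bitmap) ? 0 : -EPROBE_DEFER;
}
```

With a guest-style bitmap that carries Zicntr, Zihpm, Sscsrind and Ssccfg but not Smcdeleg, the first function defers and the second accepts, which is the behavioural difference the question is about.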
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260701-counter_delegation-v8-0-7909f863a645@meta.com?part=9
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH v8 10/22] dt-bindings: riscv: add Counter delegation ISA extensions description
2026-07-01 8:46 [PATCH v8 00/22] Add Counter delegation ISA extension support Atish Patra
` (8 preceding siblings ...)
2026-07-01 8:46 ` [PATCH v8 09/22] RISC-V: Add Ssccfg/Smcdeleg ISA extension definition and parsing Atish Patra
@ 2026-07-01 8:46 ` Atish Patra
2026-07-01 8:46 ` [PATCH v8 11/22] RISC-V: perf: Restructure the SBI PMU code Atish Patra
` (11 subsequent siblings)
21 siblings, 0 replies; 34+ messages in thread
From: Atish Patra @ 2026-07-01 8:46 UTC (permalink / raw)
To: Jiri Olsa, Paul Walmsley, Mark Rutland, Rob Herring, Anup Patel,
Namhyung Kim, Arnaldo Carvalho de Melo, Krzysztof Kozlowski,
Atish Patra, Ian Rogers, Will Deacon, James Clark
Cc: linux-arm-kernel, linux-riscv, linux-kernel, devicetree,
linux-perf-users, Conor Dooley
From: Atish Patra <atishp@rivosinc.com>
Add descriptions for the Smcdeleg and Ssccfg extensions.
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
---
.../devicetree/bindings/riscv/extensions.yaml | 41 ++++++++++++++++++++++
1 file changed, 41 insertions(+)
diff --git a/Documentation/devicetree/bindings/riscv/extensions.yaml b/Documentation/devicetree/bindings/riscv/extensions.yaml
index 2493766e956d..eef5eeb198d0 100644
--- a/Documentation/devicetree/bindings/riscv/extensions.yaml
+++ b/Documentation/devicetree/bindings/riscv/extensions.yaml
@@ -181,6 +181,13 @@ properties:
changes to interrupts as frozen at commit ccbddab ("Merge pull
request #42 from riscv/jhauser-2023-RC4") of riscv-aia.
+ - const: smcdeleg
+ description: |
+ The standard Smcdeleg machine-level extension for the machine mode
+ to delegate the hpmcounters to supervisor mode so that they are
+ directly accessible in the supervisor mode as ratified in the
+ 20240213 version of the privileged ISA specification.
+
- const: smcntrpmf
description: |
The standard Smcntrpmf machine-level extension for the machine mode
@@ -220,6 +227,14 @@ properties:
behavioural changes to interrupts as frozen at commit ccbddab
("Merge pull request #42 from riscv/jhauser-2023-RC4") of riscv-aia.
+ - const: ssccfg
+ description: |
+ The standard Ssccfg supervisor-level extension for configuring
+ the delegated hpmcounters to be accessible directly in supervisor
+ mode as ratified in the 20240213 version of the privileged ISA
+ specification. This extension depends on the Sscsrind, Smcdeleg,
+ Zihpm and Zicntr extensions.
+
- const: ssccptr
description: |
The standard Ssccptr extension for main memory (cacheability and
@@ -1135,6 +1150,32 @@ properties:
allOf:
- const: zilsd
- const: zca
+ # Smcdeleg depends on Sscsrind, Zihpm, Zicntr
+ - if:
+ contains:
+ const: smcdeleg
+ then:
+ allOf:
+ - contains:
+ const: sscsrind
+ - contains:
+ const: zihpm
+ - contains:
+ const: zicntr
+ # Ssccfg depends on Smcdeleg, Sscsrind, Zihpm, Zicntr
+ - if:
+ contains:
+ const: ssccfg
+ then:
+ allOf:
+ - contains:
+ const: smcdeleg
+ - contains:
+ const: sscsrind
+ - contains:
+ const: zihpm
+ - contains:
+ const: zicntr
allOf:
# Zcf extension does not exist on rv64
--
2.53.0-Meta
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH v8 11/22] RISC-V: perf: Restructure the SBI PMU code
2026-07-01 8:46 [PATCH v8 00/22] Add Counter delegation ISA extension support Atish Patra
` (9 preceding siblings ...)
2026-07-01 8:46 ` [PATCH v8 10/22] dt-bindings: riscv: add Counter delegation ISA extensions description Atish Patra
@ 2026-07-01 8:46 ` Atish Patra
2026-07-01 8:47 ` [PATCH v8 12/22] RISC-V: perf: Modify the counter discovery mechanism Atish Patra
` (10 subsequent siblings)
21 siblings, 0 replies; 34+ messages in thread
From: Atish Patra @ 2026-07-01 8:46 UTC (permalink / raw)
To: Jiri Olsa, Paul Walmsley, Mark Rutland, Rob Herring, Anup Patel,
Namhyung Kim, Arnaldo Carvalho de Melo, Krzysztof Kozlowski,
Atish Patra, Ian Rogers, Will Deacon, James Clark
Cc: linux-arm-kernel, linux-riscv, linux-kernel, devicetree,
linux-perf-users, Conor Dooley
From: Atish Patra <atishp@rivosinc.com>
With Ssccfg/Smcdeleg, supervisor mode can program and access the
hpmcounters and events directly, without the SBI PMU extension. The SBI
PMU extension is still required for firmware counters. Restructure the
existing SBI PMU code so the hpmcounter/event helpers can be shared
between the SBI and the counter delegation paths that follow.
The driver, file, module and Kconfig names are intentionally kept
unchanged to avoid backport churn and userspace breakage (module listings,
udev rules, cmdline options).
No functional change intended.
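
For readers skimming the series, the shape of the shared wrappers can be pictured with the userspace sketch below. It is an illustration under assumptions, not the driver code: the boolean selector, the FIRST_FW_CTR boundary and the stand-in read functions are hypothetical, and in this patch the wrappers only call the SBI path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: counters at or above this index are firmware counters. */
#define FIRST_FW_CTR 32

/* Hypothetical backend selector standing in for a runtime feature check. */
static bool cdeleg_available;

/* Stand-in for the SBI PMU read path. */
static uint64_t sbi_ctr_read(int idx)
{
	return 1000u + (uint64_t)idx;
}

/* Stand-in for a direct CSR read through counter delegation. */
static uint64_t deleg_ctr_read(int idx)
{
	return 2000u + (uint64_t)idx;
}

static bool ctr_is_fw(int idx)
{
	return idx >= FIRST_FW_CTR;
}

/*
 * Common entry point: firmware counters always go through SBI, hardware
 * counters use delegation only when it is available.
 */
static uint64_t ctr_read(int idx)
{
	assert(idx >= 0);
	if (cdeleg_available && !ctr_is_fw(idx))
		return deleg_ctr_read(idx);
	return sbi_ctr_read(idx);
}
```

Keeping one entry point per operation means the perf core callbacks do not need to know which mechanism serves a given counter.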
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
drivers/perf/Kconfig | 14 ++-
drivers/perf/riscv_pmu_sbi.c | 238 +++++++++++++++++++++++++------------------
2 files changed, 150 insertions(+), 102 deletions(-)
diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
index ab90932fc2d0..3245bb2969e1 100644
--- a/drivers/perf/Kconfig
+++ b/drivers/perf/Kconfig
@@ -97,13 +97,17 @@ config RISCV_PMU_LEGACY
config RISCV_PMU_SBI
depends on RISCV_PMU && RISCV_SBI
- bool "RISC-V PMU based on SBI PMU extension"
+ bool "RISC-V PMU based on SBI PMU extension and/or counter delegation"
default y
help
- Say y if you want to use the CPU performance monitor
- using SBI PMU extension on RISC-V based systems. This option provides
- full perf feature support i.e. counter overflow, privilege mode
- filtering, counter configuration.
+ Say y if you want to use the CPU performance monitor on RISC-V based
+ systems. This single driver supports both hardware counter access
+ mechanisms: it uses the counter delegation (Smcdeleg/Ssccfg) ISA
+ extension to program and read the hpmcounters directly in supervisor
+ mode when available, and uses the SBI PMU extension for firmware
+ counters and when counter delegation is not present. This option
+ provides full perf feature support i.e. counter overflow, privilege
+ mode filtering, counter configuration.
config STARFIVE_STARLINK_PMU
depends on ARCH_STARFIVE || COMPILE_TEST
diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c
index 5c8924ce1f38..74d934238821 100644
--- a/drivers/perf/riscv_pmu_sbi.c
+++ b/drivers/perf/riscv_pmu_sbi.c
@@ -88,6 +88,8 @@ static const struct attribute_group *riscv_pmu_attr_groups[] = {
static int sysctl_perf_user_access __read_mostly = SYSCTL_USER_ACCESS;
/*
+ * This structure is SBI specific but counter delegation also require counter
+ * width, csr mapping. Reuse it for now.
* RISC-V doesn't have heterogeneous harts yet. This need to be part of
* per_cpu in case of harts with different pmu counters
*/
@@ -100,7 +102,7 @@ static unsigned int riscv_pmu_irq;
/* Cache the available counters in a bitmask */
static unsigned long cmask;
-static int pmu_event_find_cache(u64 config);
+static int sbi_pmu_event_find_cache(u64 config);
struct sbi_pmu_event_data {
union {
union {
@@ -121,7 +123,7 @@ struct sbi_pmu_event_data {
};
};
-static struct sbi_pmu_event_data pmu_hw_event_map[] = {
+static struct sbi_pmu_event_data pmu_hw_event_sbi_map[] = {
[PERF_COUNT_HW_CPU_CYCLES] = {.hw_gen_event = {
SBI_PMU_HW_CPU_CYCLES,
SBI_PMU_EVENT_TYPE_HW, 0}},
@@ -155,7 +157,7 @@ static struct sbi_pmu_event_data pmu_hw_event_map[] = {
};
#define C(x) PERF_COUNT_HW_CACHE_##x
-static struct sbi_pmu_event_data pmu_cache_event_map[PERF_COUNT_HW_CACHE_MAX]
+static struct sbi_pmu_event_data pmu_cache_event_sbi_map[PERF_COUNT_HW_CACHE_MAX]
[PERF_COUNT_HW_CACHE_OP_MAX]
[PERF_COUNT_HW_CACHE_RESULT_MAX] = {
[C(L1D)] = {
@@ -302,7 +304,7 @@ static struct sbi_pmu_event_data pmu_cache_event_map[PERF_COUNT_HW_CACHE_MAX]
static int pmu_sbi_check_event_info(void)
{
- int num_events = ARRAY_SIZE(pmu_hw_event_map) + PERF_COUNT_HW_CACHE_MAX *
+ int num_events = ARRAY_SIZE(pmu_hw_event_sbi_map) + PERF_COUNT_HW_CACHE_MAX *
PERF_COUNT_HW_CACHE_OP_MAX * PERF_COUNT_HW_CACHE_RESULT_MAX;
struct riscv_pmu_event_info *event_info_shmem;
phys_addr_t base_addr;
@@ -313,14 +315,14 @@ static int pmu_sbi_check_event_info(void)
if (!event_info_shmem)
return -ENOMEM;
- for (i = 0; i < ARRAY_SIZE(pmu_hw_event_map); i++)
- event_info_shmem[count++].event_idx = pmu_hw_event_map[i].event_idx;
+ for (i = 0; i < ARRAY_SIZE(pmu_hw_event_sbi_map); i++)
+ event_info_shmem[count++].event_idx = pmu_hw_event_sbi_map[i].event_idx;
- for (i = 0; i < ARRAY_SIZE(pmu_cache_event_map); i++) {
- for (j = 0; j < ARRAY_SIZE(pmu_cache_event_map[i]); j++) {
- for (k = 0; k < ARRAY_SIZE(pmu_cache_event_map[i][j]); k++)
+ for (i = 0; i < ARRAY_SIZE(pmu_cache_event_sbi_map); i++) {
+ for (j = 0; j < ARRAY_SIZE(pmu_cache_event_sbi_map[i]); j++) {
+ for (k = 0; k < ARRAY_SIZE(pmu_cache_event_sbi_map[i][j]); k++)
event_info_shmem[count++].event_idx =
- pmu_cache_event_map[i][j][k].event_idx;
+ pmu_cache_event_sbi_map[i][j][k].event_idx;
}
}
@@ -336,19 +338,19 @@ static int pmu_sbi_check_event_info(void)
goto free_mem;
}
- for (i = 0; i < ARRAY_SIZE(pmu_hw_event_map); i++) {
+ for (i = 0; i < ARRAY_SIZE(pmu_hw_event_sbi_map); i++) {
if (!(event_info_shmem[i].output & RISCV_PMU_EVENT_INFO_OUTPUT_MASK))
- pmu_hw_event_map[i].event_idx = -ENOENT;
+ pmu_hw_event_sbi_map[i].event_idx = -ENOENT;
}
- count = ARRAY_SIZE(pmu_hw_event_map);
+ count = ARRAY_SIZE(pmu_hw_event_sbi_map);
- for (i = 0; i < ARRAY_SIZE(pmu_cache_event_map); i++) {
- for (j = 0; j < ARRAY_SIZE(pmu_cache_event_map[i]); j++) {
- for (k = 0; k < ARRAY_SIZE(pmu_cache_event_map[i][j]); k++) {
+ for (i = 0; i < ARRAY_SIZE(pmu_cache_event_sbi_map); i++) {
+ for (j = 0; j < ARRAY_SIZE(pmu_cache_event_sbi_map[i]); j++) {
+ for (k = 0; k < ARRAY_SIZE(pmu_cache_event_sbi_map[i][j]); k++) {
if (!(event_info_shmem[count].output &
RISCV_PMU_EVENT_INFO_OUTPUT_MASK))
- pmu_cache_event_map[i][j][k].event_idx = -ENOENT;
+ pmu_cache_event_sbi_map[i][j][k].event_idx = -ENOENT;
count++;
}
}
@@ -360,7 +362,7 @@ static int pmu_sbi_check_event_info(void)
return result;
}
-static void pmu_sbi_check_event(struct sbi_pmu_event_data *edata)
+static void rvpmu_sbi_check_event(struct sbi_pmu_event_data *edata)
{
struct sbiret ret;
@@ -375,7 +377,7 @@ static void pmu_sbi_check_event(struct sbi_pmu_event_data *edata)
}
}
-static void pmu_sbi_check_std_events(struct work_struct *work)
+static void rvpmu_sbi_check_std_events(struct work_struct *work)
{
int ret;
@@ -386,23 +388,23 @@ static void pmu_sbi_check_std_events(struct work_struct *work)
return;
}
- for (int i = 0; i < ARRAY_SIZE(pmu_hw_event_map); i++)
- pmu_sbi_check_event(&pmu_hw_event_map[i]);
+ for (int i = 0; i < ARRAY_SIZE(pmu_hw_event_sbi_map); i++)
+ rvpmu_sbi_check_event(&pmu_hw_event_sbi_map[i]);
- for (int i = 0; i < ARRAY_SIZE(pmu_cache_event_map); i++)
- for (int j = 0; j < ARRAY_SIZE(pmu_cache_event_map[i]); j++)
- for (int k = 0; k < ARRAY_SIZE(pmu_cache_event_map[i][j]); k++)
- pmu_sbi_check_event(&pmu_cache_event_map[i][j][k]);
+ for (int i = 0; i < ARRAY_SIZE(pmu_cache_event_sbi_map); i++)
+ for (int j = 0; j < ARRAY_SIZE(pmu_cache_event_sbi_map[i]); j++)
+ for (int k = 0; k < ARRAY_SIZE(pmu_cache_event_sbi_map[i][j]); k++)
+ rvpmu_sbi_check_event(&pmu_cache_event_sbi_map[i][j][k]);
}
-static DECLARE_WORK(check_std_events_work, pmu_sbi_check_std_events);
+static DECLARE_WORK(check_std_events_work, rvpmu_sbi_check_std_events);
-static int pmu_sbi_ctr_get_width(int idx)
+static int rvpmu_ctr_get_width(int idx)
{
return pmu_ctr_list[idx].width;
}
-static bool pmu_sbi_ctr_is_fw(int cidx)
+static bool rvpmu_ctr_is_fw(int cidx)
{
union sbi_pmu_ctr_info *info;
@@ -421,10 +423,10 @@ int riscv_pmu_get_event_info(u32 type, u64 config, u64 *econfig)
case PERF_TYPE_HARDWARE:
if (config >= PERF_COUNT_HW_MAX)
return -EINVAL;
- ret = pmu_hw_event_map[config].event_idx;
+ ret = pmu_hw_event_sbi_map[config].event_idx;
break;
case PERF_TYPE_HW_CACHE:
- ret = pmu_event_find_cache(config);
+ ret = sbi_pmu_event_find_cache(config);
break;
case PERF_TYPE_RAW:
/*
@@ -509,12 +511,12 @@ int riscv_pmu_get_hpm_info(u32 *hw_ctr_width, u32 *num_hw_ctr)
}
EXPORT_SYMBOL_GPL(riscv_pmu_get_hpm_info);
-static uint8_t pmu_sbi_csr_index(struct perf_event *event)
+static uint8_t rvpmu_csr_index(struct perf_event *event)
{
return pmu_ctr_list[event->hw.idx].csr - CSR_CYCLE;
}
-static unsigned long pmu_sbi_get_filter_flags(struct perf_event *event)
+static unsigned long rvpmu_sbi_get_filter_flags(struct perf_event *event)
{
unsigned long cflags = 0;
bool guest_events = false;
@@ -535,7 +537,7 @@ static unsigned long pmu_sbi_get_filter_flags(struct perf_event *event)
return cflags;
}
-static int pmu_sbi_ctr_get_idx(struct perf_event *event)
+static int rvpmu_sbi_ctr_get_idx(struct perf_event *event)
{
struct hw_perf_event *hwc = &event->hw;
struct riscv_pmu *rvpmu = to_riscv_pmu(event->pmu);
@@ -545,7 +547,7 @@ static int pmu_sbi_ctr_get_idx(struct perf_event *event)
uint64_t cbase = 0, cmask = rvpmu->cmask;
unsigned long cflags = 0;
- cflags = pmu_sbi_get_filter_flags(event);
+ cflags = rvpmu_sbi_get_filter_flags(event);
/*
* In legacy mode, we have to force the fixed counters for those events
@@ -582,7 +584,7 @@ static int pmu_sbi_ctr_get_idx(struct perf_event *event)
return -ENOENT;
/* Additional sanity check for the counter id */
- if (pmu_sbi_ctr_is_fw(idx)) {
+ if (rvpmu_ctr_is_fw(idx)) {
if (!test_and_set_bit(idx, cpuc->used_fw_ctrs))
return idx;
} else {
@@ -593,7 +595,7 @@ static int pmu_sbi_ctr_get_idx(struct perf_event *event)
return -ENOENT;
}
-static void pmu_sbi_ctr_clear_idx(struct perf_event *event)
+static void rvpmu_ctr_clear_idx(struct perf_event *event)
{
struct hw_perf_event *hwc = &event->hw;
@@ -601,13 +603,13 @@ static void pmu_sbi_ctr_clear_idx(struct perf_event *event)
struct cpu_hw_events *cpuc = this_cpu_ptr(rvpmu->hw_events);
int idx = hwc->idx;
- if (pmu_sbi_ctr_is_fw(idx))
+ if (rvpmu_ctr_is_fw(idx))
clear_bit(idx, cpuc->used_fw_ctrs);
else
clear_bit(idx, cpuc->used_hw_ctrs);
}
-static int pmu_event_find_cache(u64 config)
+static int sbi_pmu_event_find_cache(u64 config)
{
unsigned int cache_type, cache_op, cache_result, ret;
@@ -623,7 +625,7 @@ static int pmu_event_find_cache(u64 config)
if (cache_result >= PERF_COUNT_HW_CACHE_RESULT_MAX)
return -EINVAL;
- ret = pmu_cache_event_map[cache_type][cache_op][cache_result].event_idx;
+ ret = pmu_cache_event_sbi_map[cache_type][cache_op][cache_result].event_idx;
return ret;
}
@@ -639,7 +641,7 @@ static bool pmu_sbi_is_fw_event(struct perf_event *event)
return false;
}
-static int pmu_sbi_event_map(struct perf_event *event, u64 *econfig)
+static int rvpmu_sbi_event_map(struct perf_event *event, u64 *econfig)
{
u32 type = event->attr.type;
u64 config = event->attr.config;
@@ -736,7 +738,7 @@ static int pmu_sbi_snapshot_setup(struct riscv_pmu *pmu, int cpu)
return 0;
}
-static u64 pmu_sbi_ctr_read(struct perf_event *event)
+static u64 rvpmu_sbi_ctr_read(struct perf_event *event)
{
struct hw_perf_event *hwc = &event->hw;
int idx = hwc->idx;
@@ -778,25 +780,25 @@ static u64 pmu_sbi_ctr_read(struct perf_event *event)
return val;
}
-static void pmu_sbi_set_scounteren(void *arg)
+static void rvpmu_set_scounteren(void *arg)
{
struct perf_event *event = (struct perf_event *)arg;
if (event->hw.idx != -1)
csr_write(CSR_SCOUNTEREN,
- csr_read(CSR_SCOUNTEREN) | BIT(pmu_sbi_csr_index(event)));
+ csr_read(CSR_SCOUNTEREN) | BIT(rvpmu_csr_index(event)));
}
-static void pmu_sbi_reset_scounteren(void *arg)
+static void rvpmu_reset_scounteren(void *arg)
{
struct perf_event *event = (struct perf_event *)arg;
if (event->hw.idx != -1)
csr_write(CSR_SCOUNTEREN,
- csr_read(CSR_SCOUNTEREN) & ~BIT(pmu_sbi_csr_index(event)));
+ csr_read(CSR_SCOUNTEREN) & ~BIT(rvpmu_csr_index(event)));
}
-static void pmu_sbi_ctr_start(struct perf_event *event, u64 ival)
+static void rvpmu_sbi_ctr_start(struct perf_event *event, u64 ival)
{
struct sbiret ret;
struct hw_perf_event *hwc = &event->hw;
@@ -816,10 +818,10 @@ static void pmu_sbi_ctr_start(struct perf_event *event, u64 ival)
if ((hwc->flags & PERF_EVENT_FLAG_USER_ACCESS) &&
(hwc->flags & PERF_EVENT_FLAG_USER_READ_CNT))
- pmu_sbi_set_scounteren((void *)event);
+ rvpmu_set_scounteren((void *)event);
}
-static void pmu_sbi_ctr_stop(struct perf_event *event, unsigned long flag)
+static void rvpmu_sbi_ctr_stop(struct perf_event *event, unsigned long flag)
{
struct sbiret ret;
struct hw_perf_event *hwc = &event->hw;
@@ -829,7 +831,7 @@ static void pmu_sbi_ctr_stop(struct perf_event *event, unsigned long flag)
if ((hwc->flags & PERF_EVENT_FLAG_USER_ACCESS) &&
(hwc->flags & PERF_EVENT_FLAG_USER_READ_CNT))
- pmu_sbi_reset_scounteren((void *)event);
+ rvpmu_reset_scounteren((void *)event);
if (sbi_pmu_snapshot_available())
flag |= SBI_PMU_STOP_FLAG_TAKE_SNAPSHOT;
@@ -855,7 +857,7 @@ static void pmu_sbi_ctr_stop(struct perf_event *event, unsigned long flag)
}
}
-static int pmu_sbi_find_num_ctrs(void)
+static int rvpmu_sbi_find_num_ctrs(void)
{
struct sbiret ret;
@@ -866,7 +868,7 @@ static int pmu_sbi_find_num_ctrs(void)
return sbi_err_map_linux_errno(ret.error);
}
-static int pmu_sbi_get_ctrinfo(int nctr, unsigned long *mask)
+static int rvpmu_sbi_get_ctrinfo(int nctr, unsigned long *mask)
{
struct sbiret ret;
int i, num_hw_ctr = 0, num_fw_ctr = 0;
@@ -897,7 +899,7 @@ static int pmu_sbi_get_ctrinfo(int nctr, unsigned long *mask)
return 0;
}
-static inline void pmu_sbi_stop_all(struct riscv_pmu *pmu)
+static inline void rvpmu_sbi_stop_all(struct riscv_pmu *pmu)
{
/*
* No need to check the error because we are disabling all the counters
@@ -907,7 +909,7 @@ static inline void pmu_sbi_stop_all(struct riscv_pmu *pmu)
0, pmu->cmask, SBI_PMU_STOP_FLAG_RESET, 0, 0, 0);
}
-static inline void pmu_sbi_stop_hw_ctrs(struct riscv_pmu *pmu)
+static inline void rvpmu_sbi_stop_hw_ctrs(struct riscv_pmu *pmu)
{
struct cpu_hw_events *cpu_hw_evt = this_cpu_ptr(pmu->hw_events);
struct riscv_pmu_snapshot_data *sdata = cpu_hw_evt->snapshot_addr;
@@ -951,8 +953,8 @@ static inline void pmu_sbi_stop_hw_ctrs(struct riscv_pmu *pmu)
* while the overflowed counters need to be started with updated initialization
* value.
*/
-static inline void pmu_sbi_start_ovf_ctrs_sbi(struct cpu_hw_events *cpu_hw_evt,
- u64 ctr_ovf_mask)
+static inline void rvpmu_sbi_start_ovf_ctrs_sbi(struct cpu_hw_events *cpu_hw_evt,
+ u64 ctr_ovf_mask)
{
int idx = 0, i;
struct perf_event *event;
@@ -992,8 +994,8 @@ static inline void pmu_sbi_start_ovf_ctrs_sbi(struct cpu_hw_events *cpu_hw_evt,
}
}
-static inline void pmu_sbi_start_ovf_ctrs_snapshot(struct cpu_hw_events *cpu_hw_evt,
- u64 ctr_ovf_mask)
+static inline void rvpmu_sbi_start_ovf_ctrs_snapshot(struct cpu_hw_events *cpu_hw_evt,
+ u64 ctr_ovf_mask)
{
int i, idx = 0;
struct perf_event *event;
@@ -1027,18 +1029,18 @@ static inline void pmu_sbi_start_ovf_ctrs_snapshot(struct cpu_hw_events *cpu_hw_
}
}
-static void pmu_sbi_start_overflow_mask(struct riscv_pmu *pmu,
- u64 ctr_ovf_mask)
+static void rvpmu_sbi_start_overflow_mask(struct riscv_pmu *pmu,
+ u64 ctr_ovf_mask)
{
struct cpu_hw_events *cpu_hw_evt = this_cpu_ptr(pmu->hw_events);
if (sbi_pmu_snapshot_available())
- pmu_sbi_start_ovf_ctrs_snapshot(cpu_hw_evt, ctr_ovf_mask);
+ rvpmu_sbi_start_ovf_ctrs_snapshot(cpu_hw_evt, ctr_ovf_mask);
else
- pmu_sbi_start_ovf_ctrs_sbi(cpu_hw_evt, ctr_ovf_mask);
+ rvpmu_sbi_start_ovf_ctrs_sbi(cpu_hw_evt, ctr_ovf_mask);
}
-static irqreturn_t pmu_sbi_ovf_handler(int irq, void *dev)
+static irqreturn_t rvpmu_ovf_handler(int irq, void *dev)
{
struct perf_sample_data data;
struct pt_regs *regs;
@@ -1070,7 +1072,7 @@ static irqreturn_t pmu_sbi_ovf_handler(int irq, void *dev)
}
pmu = to_riscv_pmu(event->pmu);
- pmu_sbi_stop_hw_ctrs(pmu);
+ rvpmu_sbi_stop_hw_ctrs(pmu);
/* Overflow status register should only be read after counter are stopped */
if (sbi_pmu_snapshot_available())
@@ -1139,13 +1141,55 @@ static irqreturn_t pmu_sbi_ovf_handler(int irq, void *dev)
hw_evt->state = 0;
}
- pmu_sbi_start_overflow_mask(pmu, overflowed_ctrs);
+ rvpmu_sbi_start_overflow_mask(pmu, overflowed_ctrs);
perf_sample_event_took(sched_clock() - start_clock);
return IRQ_HANDLED;
}
-static int pmu_sbi_starting_cpu(unsigned int cpu, struct hlist_node *node)
+static void rvpmu_ctr_start(struct perf_event *event, u64 ival)
+{
+ rvpmu_sbi_ctr_start(event, ival);
+ /* TODO: Counter delegation implementation */
+}
+
+static void rvpmu_ctr_stop(struct perf_event *event, unsigned long flag)
+{
+ rvpmu_sbi_ctr_stop(event, flag);
+ /* TODO: Counter delegation implementation */
+}
+
+static int rvpmu_find_num_ctrs(void)
+{
+ return rvpmu_sbi_find_num_ctrs();
+ /* TODO: Counter delegation implementation */
+}
+
+static int rvpmu_get_ctrinfo(int nctr, unsigned long *mask)
+{
+ return rvpmu_sbi_get_ctrinfo(nctr, mask);
+ /* TODO: Counter delegation implementation */
+}
+
+static int rvpmu_event_map(struct perf_event *event, u64 *econfig)
+{
+ return rvpmu_sbi_event_map(event, econfig);
+ /* TODO: Counter delegation implementation */
+}
+
+static int rvpmu_ctr_get_idx(struct perf_event *event)
+{
+ return rvpmu_sbi_ctr_get_idx(event);
+ /* TODO: Counter delegation implementation */
+}
+
+static u64 rvpmu_ctr_read(struct perf_event *event)
+{
+ return rvpmu_sbi_ctr_read(event);
+ /* TODO: Counter delegation implementation */
+}
+
+static int rvpmu_starting_cpu(unsigned int cpu, struct hlist_node *node)
{
struct riscv_pmu *pmu = hlist_entry_safe(node, struct riscv_pmu, node);
struct cpu_hw_events *cpu_hw_evt = this_cpu_ptr(pmu->hw_events);
@@ -1160,7 +1204,7 @@ static int pmu_sbi_starting_cpu(unsigned int cpu, struct hlist_node *node)
csr_write(CSR_SCOUNTEREN, 0x2);
/* Stop all the counters so that they can be enabled from perf */
- pmu_sbi_stop_all(pmu);
+ rvpmu_sbi_stop_all(pmu);
if (riscv_pmu_use_irq) {
cpu_hw_evt->irq = riscv_pmu_irq;
@@ -1174,7 +1218,7 @@ static int pmu_sbi_starting_cpu(unsigned int cpu, struct hlist_node *node)
return 0;
}
-static int pmu_sbi_dying_cpu(unsigned int cpu, struct hlist_node *node)
+static int rvpmu_dying_cpu(unsigned int cpu, struct hlist_node *node)
{
if (riscv_pmu_use_irq) {
disable_percpu_irq(riscv_pmu_irq);
@@ -1189,7 +1233,7 @@ static int pmu_sbi_dying_cpu(unsigned int cpu, struct hlist_node *node)
return 0;
}
-static int pmu_sbi_setup_irqs(struct riscv_pmu *pmu, struct platform_device *pdev)
+static int rvpmu_setup_irqs(struct riscv_pmu *pmu, struct platform_device *pdev)
{
int ret;
struct cpu_hw_events __percpu *hw_events = pmu->hw_events;
@@ -1231,7 +1275,7 @@ static int pmu_sbi_setup_irqs(struct riscv_pmu *pmu, struct platform_device *pde
goto err;
}
- ret = request_percpu_irq(riscv_pmu_irq, pmu_sbi_ovf_handler, "riscv-pmu", hw_events);
+ ret = request_percpu_irq(riscv_pmu_irq, rvpmu_ovf_handler, "riscv-pmu", hw_events);
if (ret) {
pr_err("registering percpu irq failed [%d]\n", ret);
irq_dispose_mapping(riscv_pmu_irq);
@@ -1313,7 +1357,7 @@ static void riscv_pmu_destroy(struct riscv_pmu *pmu)
cpuhp_state_remove_instance(CPUHP_AP_PERF_RISCV_STARTING, &pmu->node);
}
-static void pmu_sbi_event_init(struct perf_event *event)
+static void rvpmu_event_init(struct perf_event *event)
{
/*
* The permissions are set at event_init so that we do not depend
@@ -1327,7 +1371,7 @@ static void pmu_sbi_event_init(struct perf_event *event)
event->hw.flags |= PERF_EVENT_FLAG_LEGACY;
}
-static void pmu_sbi_event_mapped(struct perf_event *event, struct mm_struct *mm)
+static void rvpmu_event_mapped(struct perf_event *event, struct mm_struct *mm)
{
if (event->hw.flags & PERF_EVENT_FLAG_NO_USER_ACCESS)
return;
@@ -1355,14 +1399,14 @@ static void pmu_sbi_event_mapped(struct perf_event *event, struct mm_struct *mm)
* that it is possible to do so to avoid any race.
* And we must notify all cpus here because threads that currently run
* on other cpus will try to directly access the counter too without
- * calling pmu_sbi_ctr_start.
+ * calling rvpmu_sbi_ctr_start.
*/
if (event->hw.flags & PERF_EVENT_FLAG_USER_ACCESS)
on_each_cpu_mask(mm_cpumask(mm),
- pmu_sbi_set_scounteren, (void *)event, 1);
+ rvpmu_set_scounteren, (void *)event, 1);
}
-static void pmu_sbi_event_unmapped(struct perf_event *event, struct mm_struct *mm)
+static void rvpmu_event_unmapped(struct perf_event *event, struct mm_struct *mm)
{
if (event->hw.flags & PERF_EVENT_FLAG_NO_USER_ACCESS)
return;
@@ -1384,7 +1428,7 @@ static void pmu_sbi_event_unmapped(struct perf_event *event, struct mm_struct *m
if (event->hw.flags & PERF_EVENT_FLAG_USER_ACCESS)
on_each_cpu_mask(mm_cpumask(mm),
- pmu_sbi_reset_scounteren, (void *)event, 1);
+ rvpmu_reset_scounteren, (void *)event, 1);
}
static void riscv_pmu_update_counter_access(void *info)
@@ -1427,7 +1471,7 @@ static const struct ctl_table sbi_pmu_sysctl_table[] = {
},
};
-static int pmu_sbi_device_probe(struct platform_device *pdev)
+static int rvpmu_device_probe(struct platform_device *pdev)
{
struct riscv_pmu *pmu = NULL;
int ret = -ENODEV;
@@ -1439,7 +1483,7 @@ static int pmu_sbi_device_probe(struct platform_device *pdev)
if (!pmu)
return -ENOMEM;
- num_counters = pmu_sbi_find_num_ctrs();
+ num_counters = rvpmu_find_num_ctrs();
if (num_counters < 0) {
pr_err("SBI PMU extension doesn't provide any counters\n");
goto out_free;
@@ -1452,10 +1496,10 @@ static int pmu_sbi_device_probe(struct platform_device *pdev)
}
/* cache all the information about counters now */
- if (pmu_sbi_get_ctrinfo(num_counters, &cmask))
+ if (rvpmu_get_ctrinfo(num_counters, &cmask))
goto out_free;
- ret = pmu_sbi_setup_irqs(pmu, pdev);
+ ret = rvpmu_setup_irqs(pmu, pdev);
if (ret < 0) {
pr_info("Perf sampling/filtering is not supported as sscof extension is not available\n");
pmu->pmu.capabilities |= PERF_PMU_CAP_NO_INTERRUPT;
@@ -1466,17 +1510,17 @@ static int pmu_sbi_device_probe(struct platform_device *pdev)
pmu->pmu.attr_groups = riscv_pmu_attr_groups;
pmu->pmu.parent = &pdev->dev;
pmu->cmask = cmask;
- pmu->ctr_start = pmu_sbi_ctr_start;
- pmu->ctr_stop = pmu_sbi_ctr_stop;
- pmu->event_map = pmu_sbi_event_map;
- pmu->ctr_get_idx = pmu_sbi_ctr_get_idx;
- pmu->ctr_get_width = pmu_sbi_ctr_get_width;
- pmu->ctr_clear_idx = pmu_sbi_ctr_clear_idx;
- pmu->ctr_read = pmu_sbi_ctr_read;
- pmu->event_init = pmu_sbi_event_init;
- pmu->event_mapped = pmu_sbi_event_mapped;
- pmu->event_unmapped = pmu_sbi_event_unmapped;
- pmu->csr_index = pmu_sbi_csr_index;
+ pmu->ctr_start = rvpmu_ctr_start;
+ pmu->ctr_stop = rvpmu_ctr_stop;
+ pmu->event_map = rvpmu_event_map;
+ pmu->ctr_get_idx = rvpmu_ctr_get_idx;
+ pmu->ctr_get_width = rvpmu_ctr_get_width;
+ pmu->ctr_clear_idx = rvpmu_ctr_clear_idx;
+ pmu->ctr_read = rvpmu_ctr_read;
+ pmu->event_init = rvpmu_event_init;
+ pmu->event_mapped = rvpmu_event_mapped;
+ pmu->event_unmapped = rvpmu_event_unmapped;
+ pmu->csr_index = rvpmu_csr_index;
ret = riscv_pm_pmu_register(pmu);
if (ret)
@@ -1543,14 +1587,14 @@ static int pmu_sbi_device_probe(struct platform_device *pdev)
return ret;
}
-static struct platform_driver pmu_sbi_driver = {
- .probe = pmu_sbi_device_probe,
+static struct platform_driver rvpmu_driver = {
+ .probe = rvpmu_device_probe,
.driver = {
.name = RISCV_PMU_SBI_PDEV_NAME,
},
};
-static int __init pmu_sbi_devinit(void)
+static int __init rvpmu_devinit(void)
{
int ret;
struct platform_device *pdev;
@@ -1568,20 +1612,20 @@ static int __init pmu_sbi_devinit(void)
ret = cpuhp_setup_state_multi(CPUHP_AP_PERF_RISCV_STARTING,
"perf/riscv/pmu:starting",
- pmu_sbi_starting_cpu, pmu_sbi_dying_cpu);
+ rvpmu_starting_cpu, rvpmu_dying_cpu);
if (ret) {
pr_err("CPU hotplug notifier could not be registered: %d\n",
ret);
return ret;
}
- ret = platform_driver_register(&pmu_sbi_driver);
+ ret = platform_driver_register(&rvpmu_driver);
if (ret)
return ret;
pdev = platform_device_register_simple(RISCV_PMU_SBI_PDEV_NAME, -1, NULL, 0);
if (IS_ERR(pdev)) {
- platform_driver_unregister(&pmu_sbi_driver);
+ platform_driver_unregister(&rvpmu_driver);
return PTR_ERR(pdev);
}
@@ -1590,4 +1634,4 @@ static int __init pmu_sbi_devinit(void)
return ret;
}
-device_initcall(pmu_sbi_devinit)
+device_initcall(rvpmu_devinit)
--
2.53.0-Meta
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH v8 12/22] RISC-V: perf: Modify the counter discovery mechanism
2026-07-01 8:46 [PATCH v8 00/22] Add Counter delegation ISA extension support Atish Patra
` (10 preceding siblings ...)
2026-07-01 8:46 ` [PATCH v8 11/22] RISC-V: perf: Restructure the SBI PMU code Atish Patra
@ 2026-07-01 8:47 ` Atish Patra
2026-07-01 9:20 ` sashiko-bot
2026-07-01 8:47 ` [PATCH v8 13/22] RISC-V: perf: Add a mechanism to defined legacy event encoding Atish Patra
` (9 subsequent siblings)
21 siblings, 1 reply; 34+ messages in thread
From: Atish Patra @ 2026-07-01 8:47 UTC (permalink / raw)
To: Jiri Olsa, Paul Walmsley, Mark Rutland, Rob Herring, Anup Patel,
Namhyung Kim, Arnaldo Carvalho de Melo, Krzysztof Kozlowski,
Atish Patra, Ian Rogers, Will Deacon, James Clark
Cc: linux-arm-kernel, linux-riscv, linux-kernel, devicetree,
linux-perf-users, Conor Dooley
From: Atish Patra <atishp@rivosinc.com>
If both counter delegation and the SBI PMU extension are present, counter
delegation is used for the hardware PMU counters while the SBI PMU
extension is used for the firmware counters. The driver therefore has to
probe the counter info via SBI PMU to identify the firmware counters.
The hybrid scheme also requires better informational log messages
that tell the user which underlying interface is used in each case.
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
drivers/perf/riscv_pmu_sbi.c | 139 ++++++++++++++++++++++++++++++++-----------
1 file changed, 104 insertions(+), 35 deletions(-)
diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c
index 74d934238821..c20f1e33c65d 100644
--- a/drivers/perf/riscv_pmu_sbi.c
+++ b/drivers/perf/riscv_pmu_sbi.c
@@ -67,6 +67,20 @@ static bool sbi_v3_available;
static DEFINE_STATIC_KEY_FALSE(sbi_pmu_snapshot_available);
#define sbi_pmu_snapshot_available() \
static_branch_unlikely(&sbi_pmu_snapshot_available)
+static DEFINE_STATIC_KEY_FALSE(riscv_pmu_sbi_available);
+static DEFINE_STATIC_KEY_FALSE(riscv_pmu_cdeleg_available);
+
+/* Avoid unnecessary code patching in the one time booting path*/
+#define riscv_pmu_cdeleg_available_boot() \
+ static_key_enabled(&riscv_pmu_cdeleg_available)
+#define riscv_pmu_sbi_available_boot() \
+ static_key_enabled(&riscv_pmu_sbi_available)
+
+/* Perform a runtime code patching with static key */
+#define riscv_pmu_cdeleg_available() \
+ static_branch_unlikely(&riscv_pmu_cdeleg_available)
+#define riscv_pmu_sbi_available() \
+ static_branch_likely(&riscv_pmu_sbi_available)
static struct attribute *riscv_arch_formats_attr[] = {
&format_attr_event.attr,
@@ -89,7 +103,8 @@ static int sysctl_perf_user_access __read_mostly = SYSCTL_USER_ACCESS;
/*
* This structure is SBI specific but counter delegation also require counter
- * width, csr mapping. Reuse it for now.
+ * width, csr mapping. Reuse it for now, as we can have firmware counters on
+ * platforms with counter delegation support.
* RISC-V doesn't have heterogeneous harts yet. This need to be part of
* per_cpu in case of harts with different pmu counters
*/
@@ -101,6 +116,8 @@ static unsigned int riscv_pmu_irq;
/* Cache the available counters in a bitmask */
static unsigned long cmask;
+/* Cache the available firmware counters in another bitmask */
+static unsigned long firmware_cmask;
static int sbi_pmu_event_find_cache(u64 config);
struct sbi_pmu_event_data {
@@ -868,34 +885,38 @@ static int rvpmu_sbi_find_num_ctrs(void)
return sbi_err_map_linux_errno(ret.error);
}
-static int rvpmu_sbi_get_ctrinfo(int nctr, unsigned long *mask)
+static u32 rvpmu_deleg_find_ctrs(void)
+{
+ /* TODO */
+ return 0;
+}
+
+static int rvpmu_sbi_get_ctrinfo(u32 nsbi_ctr, u32 *num_fw_ctr, u32 *num_hw_ctr)
{
struct sbiret ret;
- int i, num_hw_ctr = 0, num_fw_ctr = 0;
+ int i;
union sbi_pmu_ctr_info cinfo;
- pmu_ctr_list = kzalloc_objs(*pmu_ctr_list, nctr);
- if (!pmu_ctr_list)
- return -ENOMEM;
-
- for (i = 0; i < nctr; i++) {
+ for (i = 0; i < nsbi_ctr; i++) {
ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_GET_INFO, i, 0, 0, 0, 0, 0);
if (ret.error)
/* The logical counter ids are not expected to be contiguous */
continue;
- *mask |= BIT(i);
-
cinfo.value = ret.value;
- if (cinfo.type == SBI_PMU_CTR_TYPE_FW)
- num_fw_ctr++;
- else
- num_hw_ctr++;
- pmu_ctr_list[i].value = cinfo.value;
+ if (cinfo.type == SBI_PMU_CTR_TYPE_FW) {
+ /* Track firmware counters in a different mask */
+ firmware_cmask |= BIT(i);
+ pmu_ctr_list[i].value = cinfo.value;
+ *num_fw_ctr = *num_fw_ctr + 1;
+ } else if (cinfo.type == SBI_PMU_CTR_TYPE_HW &&
+ !riscv_pmu_cdeleg_available_boot()) {
+ *num_hw_ctr = *num_hw_ctr + 1;
+ cmask |= BIT(i);
+ pmu_ctr_list[i].value = cinfo.value;
+ }
}
- pr_info("%d firmware and %d hardware counters\n", num_fw_ctr, num_hw_ctr);
-
return 0;
}
@@ -906,7 +927,7 @@ static inline void rvpmu_sbi_stop_all(struct riscv_pmu *pmu)
* which may include counters that are not enabled yet.
*/
sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_STOP,
- 0, pmu->cmask, SBI_PMU_STOP_FLAG_RESET, 0, 0, 0);
+ 0, pmu->cmask | firmware_cmask, SBI_PMU_STOP_FLAG_RESET, 0, 0, 0);
}
static inline void rvpmu_sbi_stop_hw_ctrs(struct riscv_pmu *pmu)
@@ -1159,16 +1180,48 @@ static void rvpmu_ctr_stop(struct perf_event *event, unsigned long flag)
/* TODO: Counter delegation implementation */
}
-static int rvpmu_find_num_ctrs(void)
+static int rvpmu_find_ctrs(void)
{
- return rvpmu_sbi_find_num_ctrs();
- /* TODO: Counter delegation implementation */
-}
+ int num_sbi_counters = 0;
+ u32 num_deleg_counters = 0;
+ u32 num_hw_ctr = 0, num_fw_ctr = 0, num_ctr = 0;
+ /*
+ * We don't know how many firmware counters are available. Just allocate
+ * for maximum counters the driver can support. The default is 64 anyways.
+ */
+ pmu_ctr_list = kcalloc(RISCV_MAX_COUNTERS, sizeof(*pmu_ctr_list),
+ GFP_KERNEL);
+ if (!pmu_ctr_list)
+ return -ENOMEM;
-static int rvpmu_get_ctrinfo(int nctr, unsigned long *mask)
-{
- return rvpmu_sbi_get_ctrinfo(nctr, mask);
- /* TODO: Counter delegation implementation */
+ if (riscv_pmu_cdeleg_available_boot())
+ num_deleg_counters = rvpmu_deleg_find_ctrs();
+
+ /* This is required for firmware counters even if the above is true */
+ if (riscv_pmu_sbi_available_boot()) {
+ num_sbi_counters = rvpmu_sbi_find_num_ctrs();
+ if (num_sbi_counters < 0) {
+ kfree(pmu_ctr_list);
+ pmu_ctr_list = NULL;
+ return num_sbi_counters;
+ }
+ if (num_sbi_counters > RISCV_MAX_COUNTERS)
+ num_sbi_counters = RISCV_MAX_COUNTERS;
+ }
+
+ /* cache all the information about counters now */
+ if (riscv_pmu_sbi_available_boot())
+ rvpmu_sbi_get_ctrinfo(num_sbi_counters, &num_fw_ctr, &num_hw_ctr);
+
+ if (riscv_pmu_cdeleg_available_boot()) {
+ pr_info("%u firmware and %u hardware counters\n", num_fw_ctr, num_deleg_counters);
+ num_ctr = num_fw_ctr + num_deleg_counters;
+ } else {
+ pr_info("%u firmware and %u hardware counters\n", num_fw_ctr, num_hw_ctr);
+ num_ctr = num_sbi_counters;
+ }
+
+ return num_ctr;
}
static int rvpmu_event_map(struct perf_event *event, u64 *econfig)
@@ -1478,12 +1531,21 @@ static int rvpmu_device_probe(struct platform_device *pdev)
int num_counters;
bool irq_requested = false;
- pr_info("SBI PMU extension is available\n");
+ if (riscv_pmu_cdeleg_available_boot()) {
+ pr_info("hpmcounters will use the counter delegation ISA extension\n");
+ if (riscv_pmu_sbi_available_boot())
+ pr_info("Firmware counters will use SBI PMU extension\n");
+ else
+ pr_info("Firmware counters will not be available as SBI PMU extension is not present\n");
+ } else if (riscv_pmu_sbi_available_boot()) {
+ pr_info("Both hpmcounters and firmware counters will use SBI PMU extension\n");
+ }
+
pmu = riscv_pmu_alloc();
if (!pmu)
return -ENOMEM;
- num_counters = rvpmu_find_num_ctrs();
+ num_counters = rvpmu_find_ctrs();
if (num_counters < 0) {
pr_err("SBI PMU extension doesn't provide any counters\n");
goto out_free;
@@ -1495,9 +1557,6 @@ static int rvpmu_device_probe(struct platform_device *pdev)
pr_info("SBI returned more than maximum number of counters. Limiting the number of counters to %d\n", num_counters);
}
- /* cache all the information about counters now */
- if (rvpmu_get_ctrinfo(num_counters, &cmask))
- goto out_free;
ret = rvpmu_setup_irqs(pmu, pdev);
if (ret < 0) {
@@ -1599,13 +1658,23 @@ static int __init rvpmu_devinit(void)
int ret;
struct platform_device *pdev;
- if (sbi_spec_version < sbi_mk_version(0, 3) ||
- !sbi_probe_extension(SBI_EXT_PMU)) {
- return 0;
- }
+ if (sbi_spec_version >= sbi_mk_version(0, 3) &&
+ sbi_probe_extension(SBI_EXT_PMU))
+ static_branch_enable(&riscv_pmu_sbi_available);
if (sbi_spec_version >= sbi_mk_version(2, 0))
sbi_v2_available = true;
+ /*
+ * We need all three extensions to be present to access the counters
+ * in S-mode via Supervisor Counter delegation.
+ */
+ if (riscv_isa_extension_available(NULL, SSCCFG) &&
+ riscv_isa_extension_available(NULL, SMCDELEG) &&
+ riscv_isa_extension_available(NULL, SSCSRIND))
+ static_branch_enable(&riscv_pmu_cdeleg_available);
+
+ if (!(riscv_pmu_sbi_available_boot() || riscv_pmu_cdeleg_available_boot()))
+ return 0;
if (sbi_spec_version >= sbi_mk_version(3, 0))
sbi_v3_available = true;
--
2.53.0-Meta
* Re: [PATCH v8 12/22] RISC-V: perf: Modify the counter discovery mechanism
2026-07-01 8:47 ` [PATCH v8 12/22] RISC-V: perf: Modify the counter discovery mechanism Atish Patra
@ 2026-07-01 9:20 ` sashiko-bot
0 siblings, 0 replies; 34+ messages in thread
From: sashiko-bot @ 2026-07-01 9:20 UTC (permalink / raw)
To: Atish Patra; +Cc: robh, devicetree, linux-perf-users, conor+dt
Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
New issues:
- [High] Hardware counter CSR indices and firmware counter SBI logical indices share the same index namespace, leading to silent overwriting of hardware counters in `pmu_ctr_list`.
- [High] Bitwise shift overflow on 32-bit architectures when processing PMU firmware counters with indices >= 32.
- [Medium] A failure to discover firmware counters unconditionally aborts PMU initialization, erroneously discarding successfully delegated hardware counters.
Pre-existing issues:
- [High] Invalid counter base index passed to `sbi_ecall` in `rvpmu_sbi_start_ovf_ctrs_snapshot`.
--
commit 96c77115054b0358f4d8c8c58dd7977dd68ff1d2
Author: Atish Patra <atishp@rivosinc.com>
RISC-V: perf: Modify the counter discovery mechanism
This commit modifies the PMU counter discovery mechanism to support a hybrid
scheme where hardware counters use counter delegation and firmware counters
use the SBI PMU extension. It queries the counters info via SBI PMU to
distinguish the firmware counters and track them in separate bitmasks.
> diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c
> index 74d9342388212..c20f1e33c65d5 100644
> --- a/drivers/perf/riscv_pmu_sbi.c
> +++ b/drivers/perf/riscv_pmu_sbi.c
[ ... ]
> @@ -868,34 +885,38 @@ static int rvpmu_sbi_find_num_ctrs(void)
> return sbi_err_map_linux_errno(ret.error);
> }
>
> -static int rvpmu_sbi_get_ctrinfo(int nctr, unsigned long *mask)
> +static u32 rvpmu_deleg_find_ctrs(void)
> +{
> + /* TODO */
> + return 0;
> +}
> +
> +static int rvpmu_sbi_get_ctrinfo(u32 nsbi_ctr, u32 *num_fw_ctr, u32 *num_hw_ctr)
> {
> struct sbiret ret;
> - int i, num_hw_ctr = 0, num_fw_ctr = 0;
> + int i;
> union sbi_pmu_ctr_info cinfo;
>
> - for (i = 0; i < nctr; i++) {
> + for (i = 0; i < nsbi_ctr; i++) {
> ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_GET_INFO, i, 0, 0, 0, 0, 0);
> if (ret.error)
> /* The logical counter ids are not expected to be contiguous */
> continue;
>
> cinfo.value = ret.value;
> - if (cinfo.type == SBI_PMU_CTR_TYPE_FW)
> - num_fw_ctr++;
> - else
> - num_hw_ctr++;
> - pmu_ctr_list[i].value = cinfo.value;
> + if (cinfo.type == SBI_PMU_CTR_TYPE_FW) {
> + /* Track firmware counters in a different mask */
> + firmware_cmask |= BIT(i);
[Severity: High]
Does this code trigger a bitwise shift out of bounds on 32-bit architectures?
The firmware_cmask variable is an unsigned long, which is 32 bits wide on RV32.
Since the SBI PMU specification allows up to 64 counters, and standard firmware
implementations (like OpenSBI) typically assign firmware counters to logical
indices 32 and above, the BIT(i) macro will execute a shift greater than or
equal to the width of the type when i >= 32.
Will this cause firmware counters to be dropped or corrupted on 32-bit
platforms?
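One possible shape of a fix, sketched as standalone C rather than kernel code:
keep the firmware mask 64 bits wide regardless of the native word size. The
helpers fw_mask_set() and fw_mask_test() and the MAX_COUNTERS constant are
hypothetical names used only for illustration; in the driver the equivalent
would presumably be a u64 mask or a bitmap sized by RISCV_MAX_COUNTERS.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_COUNTERS 64 /* illustrative stand-in for RISCV_MAX_COUNTERS */

/*
 * Hypothetical helper: record counter idx in a mask that is 64 bits wide on
 * both 32-bit and 64-bit builds, so indices 32..63 never shift past the
 * width of the type. Returns false for an out-of-range index.
 */
static inline bool fw_mask_set(uint64_t *mask, unsigned int idx)
{
	if (idx >= MAX_COUNTERS)
		return false;
	*mask |= UINT64_C(1) << idx;
	return true;
}

/* Hypothetical helper: report whether counter idx is recorded in mask. */
static inline bool fw_mask_test(uint64_t mask, unsigned int idx)
{
	return idx < MAX_COUNTERS && ((mask >> idx) & 1);
}
```

A mask like this cannot be handed to sbi_ecall() as a single unsigned long on
RV32, so callers would still need to split it into per-word chunks with a
matching counter base.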
> + pmu_ctr_list[i].value = cinfo.value;
[Severity: High]
Can this overwrite hardware counter entries in the pmu_ctr_list array?
The rvpmu_deleg_find_ctrs() function iterates over hardware counters and maps
them using their architectural CSR indices (0-31). The rvpmu_sbi_get_ctrinfo()
function then maps firmware counters using their SBI logical indices ('i'
from 0 to N-1).
Since the SBI PMU specification does not restrict firmware counters to indices
>= 32, if a platform assigns a firmware counter to a logical index <= 31,
wouldn't using 'i' directly here silently overwrite the delegated hardware
counter entry?
> + *num_fw_ctr = *num_fw_ctr + 1;
> + } else if (cinfo.type == SBI_PMU_CTR_TYPE_HW &&
> + !riscv_pmu_cdeleg_available_boot()) {
> + *num_hw_ctr = *num_hw_ctr + 1;
> + cmask |= BIT(i);
> + pmu_ctr_list[i].value = cinfo.value;
> + }
> }
>
> return 0;
> }
[ ... ]
> @@ -1159,16 +1180,48 @@ static void rvpmu_ctr_stop(struct perf_event *event, unsigned long flag)
> /* TODO: Counter delegation implementation */
> }
>
> -static int rvpmu_find_num_ctrs(void)
> +static int rvpmu_find_ctrs(void)
> {
> - return rvpmu_sbi_find_num_ctrs();
> - /* TODO: Counter delegation implementation */
> -}
> + int num_sbi_counters = 0;
> + u32 num_deleg_counters = 0;
> + u32 num_hw_ctr = 0, num_fw_ctr = 0, num_ctr = 0;
> + /*
> + * We don't know how many firmware counters are available. Just allocate
> + * for maximum counters the driver can support. The default is 64 anyways.
> + */
> + pmu_ctr_list = kcalloc(RISCV_MAX_COUNTERS, sizeof(*pmu_ctr_list),
> + GFP_KERNEL);
> + if (!pmu_ctr_list)
> + return -ENOMEM;
>
> -static int rvpmu_get_ctrinfo(int nctr, unsigned long *mask)
> -{
> - return rvpmu_sbi_get_ctrinfo(nctr, mask);
> - /* TODO: Counter delegation implementation */
> + if (riscv_pmu_cdeleg_available_boot())
> + num_deleg_counters = rvpmu_deleg_find_ctrs();
> +
> + /* This is required for firmware counters even if the above is true */
> + if (riscv_pmu_sbi_available_boot()) {
> + num_sbi_counters = rvpmu_sbi_find_num_ctrs();
> + if (num_sbi_counters < 0) {
> + kfree(pmu_ctr_list);
> + pmu_ctr_list = NULL;
> + return num_sbi_counters;
[Severity: Medium]
Should this gracefully continue with the available delegated hardware counters
instead of aborting PMU initialization?
If counter delegation for hardware performance counters succeeds, but the
optional firmware extension returns an error via rvpmu_sbi_find_num_ctrs(),
this error path frees the entire pmu_ctr_list array and returns an error.
This unnecessarily disables the system's hardware performance monitoring
instead of simply ignoring the broken firmware counters.
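A minimal sketch of the fallback policy suggested here, written as standalone
C. The helper sbi_ctrs_to_probe() and its parameters are hypothetical and not
part of the patch; the sketch only shows the decision, not the allocation or
cleanup around it.

```c
#include <stdint.h>

/*
 * Hypothetical policy helper, not driver code.
 *   num_deleg: hardware counters discovered via counter delegation
 *   sbi_ret:   SBI counter count, or a negative errno if the query failed
 *   max:       upper bound on counters the driver can track
 * Returns the number of SBI counters to probe, or a negative errno when the
 * SBI failure leaves no usable counters at all.
 */
static int sbi_ctrs_to_probe(uint32_t num_deleg, int sbi_ret, int max)
{
	if (sbi_ret < 0) {
		/* Ignore the firmware failure only if delegated counters exist. */
		return num_deleg ? 0 : sbi_ret;
	}
	return sbi_ret > max ? max : sbi_ret;
}
```

With this policy, the array would be freed and the error propagated only when
the helper itself returns a negative value.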
> + }
> + if (num_sbi_counters > RISCV_MAX_COUNTERS)
> + num_sbi_counters = RISCV_MAX_COUNTERS;
> + }
[ ... ]
[Severity: High]
This is a pre-existing issue, but there appears to be a logic error in
rvpmu_sbi_start_ovf_ctrs_snapshot() when passing the counter base index to
the firmware:
drivers/perf/riscv_pmu_sbi.c:rvpmu_sbi_start_ovf_ctrs_snapshot() {
...
for (i = 0; i < BITS_TO_LONGS(RISCV_MAX_COUNTERS); i++) {
...
for_each_set_bit(idx, &cpu_hw_evt->used_hw_ctrs[i], BITS_PER_LONG)
sdata->ctr_values[idx] =
cpu_hw_evt->snapshot_cval_shcopy[idx + i * BITS_PER_LONG];
/* Start all the counters in a single shot */
sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_START, idx * BITS_PER_LONG,
cpu_hw_evt->used_hw_ctrs[i], flag, 0, 0, 0);
...
}
Because idx is the exhausted iterator of for_each_set_bit(), its value will be
BITS_PER_LONG. When multiplied by BITS_PER_LONG, the base index evaluates to
an invalid out-of-bounds value (e.g., 4096 on RV64).
Should this pass i * BITS_PER_LONG instead of idx * BITS_PER_LONG to correctly
start the chunk?
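The effect can be reproduced outside the kernel. The sketch below uses a plain
loop as a simplified stand-in for for_each_set_bit(); iterator_after_loop(),
chunk_base() and BITS_PER_WORD are hypothetical names for illustration only.

```c
#include <limits.h>

/* Userspace stand-in for BITS_PER_LONG. */
#define BITS_PER_WORD ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Simplified stand-in for for_each_set_bit(): walks every set bit of word
 * and returns the value the iterator holds once the loop has finished.
 */
static unsigned int iterator_after_loop(unsigned long word)
{
	unsigned int idx;

	for (idx = 0; idx < BITS_PER_WORD; idx++) {
		if (!(word & (1UL << idx)))
			continue;
		/* per-counter work would go here */
	}
	return idx; /* equals BITS_PER_WORD once the loop is exhausted */
}

/* Base counter index for the i-th word of the used-counter bitmap. */
static unsigned int chunk_base(unsigned int i)
{
	return i * BITS_PER_WORD;
}
```

For the first word the intended base is chunk_base(0), which is 0, whereas the
exhausted iterator multiplied by the word size yields a value far outside the
counter range.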
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260701-counter_delegation-v8-0-7909f863a645@meta.com?part=12
* [PATCH v8 13/22] RISC-V: perf: Add a mechanism to defined legacy event encoding
2026-07-01 8:46 [PATCH v8 00/22] Add Counter delegation ISA extension support Atish Patra
` (11 preceding siblings ...)
2026-07-01 8:47 ` [PATCH v8 12/22] RISC-V: perf: Modify the counter discovery mechanism Atish Patra
@ 2026-07-01 8:47 ` Atish Patra
2026-07-01 9:19 ` sashiko-bot
2026-07-01 8:47 ` [PATCH v8 14/22] RISC-V: perf: Implement supervisor counter delegation support Atish Patra
` (8 subsequent siblings)
21 siblings, 1 reply; 34+ messages in thread
From: Atish Patra @ 2026-07-01 8:47 UTC (permalink / raw)
To: Jiri Olsa, Paul Walmsley, Mark Rutland, Rob Herring, Anup Patel,
Namhyung Kim, Arnaldo Carvalho de Melo, Krzysztof Kozlowski,
Atish Patra, Ian Rogers, Will Deacon, James Clark
Cc: linux-arm-kernel, linux-riscv, linux-kernel, devicetree,
linux-perf-users, Conor Dooley
From: Atish Patra <atishp@rivosinc.com>
The RISC-V ISA doesn't define any standard event encodings or specify
any event to counter mapping. Thus, event encoding information
and the corresponding counter mapping for those events need to be
provided in the driver for each vendor.
Add a framework to support that. The individual platform events
will be added later.
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
drivers/perf/riscv_pmu_sbi.c | 70 +++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 69 insertions(+), 1 deletion(-)
diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c
index c20f1e33c65d..2568c6808f5d 100644
--- a/drivers/perf/riscv_pmu_sbi.c
+++ b/drivers/perf/riscv_pmu_sbi.c
@@ -10,6 +10,7 @@
#define pr_fmt(fmt) "riscv-pmu-sbi: " fmt
+#include <linux/limits.h>
#include <linux/mod_devicetable.h>
#include <linux/perf/riscv_pmu.h>
#include <linux/platform_device.h>
@@ -379,6 +380,71 @@ static int pmu_sbi_check_event_info(void)
return result;
}
+/*
+ * Vendor specific PMU events.
+ */
+struct riscv_pmu_event {
+ u64 event_id;
+ u32 counter_mask;
+};
+
+#define HW_OP_UNSUPPORTED U64_MAX
+#define CACHE_OP_UNSUPPORTED U64_MAX
+
+#define PERF_MAP_ALL_UNSUPPORTED \
+ [0 ... PERF_COUNT_HW_MAX - 1] = {HW_OP_UNSUPPORTED, 0x0}
+
+#define PERF_CACHE_MAP_ALL_UNSUPPORTED \
+[0 ... PERF_COUNT_HW_CACHE_MAX - 1] = { \
+ [0 ... PERF_COUNT_HW_CACHE_OP_MAX - 1] = { \
+ [0 ... PERF_COUNT_HW_CACHE_RESULT_MAX - 1] = { \
+ CACHE_OP_UNSUPPORTED, 0x0 \
+ }, \
+ }, \
+}
+
+struct riscv_vendor_pmu_events {
+ unsigned long vendorid;
+ unsigned long archid;
+ unsigned long implid;
+ const struct riscv_pmu_event *hw_event_map;
+ const struct riscv_pmu_event (*cache_event_map)[PERF_COUNT_HW_CACHE_OP_MAX]
+ [PERF_COUNT_HW_CACHE_RESULT_MAX];
+};
+
+#define RISCV_VENDOR_PMU_EVENTS(_vendorid, _archid, _implid, _hw_event_map, _cache_event_map) \
+ { .vendorid = _vendorid, .archid = _archid, .implid = _implid, \
+ .hw_event_map = _hw_event_map, .cache_event_map = _cache_event_map },
+
+static struct riscv_vendor_pmu_events pmu_vendor_events_table[] = {
+};
+
+static const struct riscv_pmu_event *current_pmu_hw_event_map;
+static const struct riscv_pmu_event (*current_pmu_cache_event_map)[PERF_COUNT_HW_CACHE_OP_MAX]
+ [PERF_COUNT_HW_CACHE_RESULT_MAX];
+
+static void __init rvpmu_vendor_register_events(void)
+{
+ int cpu = raw_smp_processor_id();
+ unsigned long vendor_id = riscv_cached_mvendorid(cpu);
+ unsigned long impl_id = riscv_cached_mimpid(cpu);
+ unsigned long arch_id = riscv_cached_marchid(cpu);
+
+ for (int i = 0; i < ARRAY_SIZE(pmu_vendor_events_table); i++) {
+ if (pmu_vendor_events_table[i].vendorid == vendor_id &&
+ pmu_vendor_events_table[i].implid == impl_id &&
+ pmu_vendor_events_table[i].archid == arch_id) {
+ current_pmu_hw_event_map = pmu_vendor_events_table[i].hw_event_map;
+ current_pmu_cache_event_map = pmu_vendor_events_table[i].cache_event_map;
+ break;
+ }
+ }
+
+ if (!current_pmu_hw_event_map || !current_pmu_cache_event_map) {
+ pr_info("No default PMU events found\n");
+ }
+}
+
static void rvpmu_sbi_check_event(struct sbi_pmu_event_data *edata)
{
struct sbiret ret;
@@ -1670,8 +1736,10 @@ static int __init rvpmu_devinit(void)
*/
if (riscv_isa_extension_available(NULL, SSCCFG) &&
riscv_isa_extension_available(NULL, SMCDELEG) &&
- riscv_isa_extension_available(NULL, SSCSRIND))
+ riscv_isa_extension_available(NULL, SSCSRIND)) {
static_branch_enable(&riscv_pmu_cdeleg_available);
+ rvpmu_vendor_register_events();
+ }
if (!(riscv_pmu_sbi_available_boot() || riscv_pmu_cdeleg_available_boot()))
return 0;
--
2.53.0-Meta
* Re: [PATCH v8 13/22] RISC-V: perf: Add a mechanism to defined legacy event encoding
2026-07-01 8:47 ` [PATCH v8 13/22] RISC-V: perf: Add a mechanism to defined legacy event encoding Atish Patra
@ 2026-07-01 9:19 ` sashiko-bot
0 siblings, 0 replies; 34+ messages in thread
From: sashiko-bot @ 2026-07-01 9:19 UTC (permalink / raw)
To: Atish Patra; +Cc: linux-perf-users, robh, conor+dt, devicetree
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Low] The fallback log message incorrectly reports 'No default PMU events found' if a vendor only provides hardware events.
--
Note: The format of this review is altered to provide a free-form summary
rather than inline diff quotes due to recitation restrictions.
commit 8db6e371918071f188390a413ded48d85e22f4f9
Author: Atish Patra <atishp@rivosinc.com>
RISC-V: perf: Add a mechanism to defined legacy event encoding
This commit introduces a framework in the RISC-V PMU driver to support
vendor-specific legacy event encodings and counter mappings. Platform-specific
events will be added using this framework in later patches.
[Severity: Low]
In the newly added rvpmu_vendor_register_events() function, the fallback log
message evaluates if either the hardware event map or the cache event map is
missing before printing "No default PMU events found".
If a vendor provides a valid hardware event map but leaves the cache event
map as NULL (e.g., if they do not support legacy cache PMU events), won't
this condition incorrectly trigger the fallback log? Does this code need to
use a logical AND instead, or should it evaluate and log hardware and cache
event presence separately to avoid misleading kernel logs?
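A sketch of the second option, evaluating the two maps separately, written as
standalone C. The enum, classify_maps(), map_status_msg() and the message
strings are all hypothetical; the real function would test
current_pmu_hw_event_map and current_pmu_cache_event_map and call pr_info().

```c
#include <stddef.h>

/* Hypothetical classification of which vendor event maps were found. */
enum map_status { MAPS_NONE, MAPS_HW_ONLY, MAPS_CACHE_ONLY, MAPS_BOTH };

static enum map_status classify_maps(const void *hw_map, const void *cache_map)
{
	if (hw_map && cache_map)
		return MAPS_BOTH;
	if (hw_map)
		return MAPS_HW_ONLY;
	if (cache_map)
		return MAPS_CACHE_ONLY;
	return MAPS_NONE;
}

/* Returns the line to log, or NULL when nothing needs to be reported. */
static const char *map_status_msg(enum map_status status)
{
	switch (status) {
	case MAPS_NONE:
		return "No default PMU events found";
	case MAPS_HW_ONLY:
		return "No default cache PMU events found";
	case MAPS_CACHE_ONLY:
		return "No default hardware PMU events found";
	default:
		return NULL;
	}
}
```

Switching the existing condition to a logical AND would be the smaller change,
at the cost of staying silent when only one of the two maps is missing.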
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260701-counter_delegation-v8-0-7909f863a645@meta.com?part=13
* [PATCH v8 14/22] RISC-V: perf: Implement supervisor counter delegation support
2026-07-01 8:46 [PATCH v8 00/22] Add Counter delegation ISA extension support Atish Patra
` (12 preceding siblings ...)
2026-07-01 8:47 ` [PATCH v8 13/22] RISC-V: perf: Add a mechanism to defined legacy event encoding Atish Patra
@ 2026-07-01 8:47 ` Atish Patra
2026-07-01 9:27 ` sashiko-bot
2026-07-01 8:47 ` [PATCH v8 15/22] RISC-V: perf: Skip PMU SBI extension when not implemented Atish Patra
` (7 subsequent siblings)
21 siblings, 1 reply; 34+ messages in thread
From: Atish Patra @ 2026-07-01 8:47 UTC (permalink / raw)
To: Jiri Olsa, Paul Walmsley, Mark Rutland, Rob Herring, Anup Patel,
Namhyung Kim, Arnaldo Carvalho de Melo, Krzysztof Kozlowski,
Atish Patra, Ian Rogers, Will Deacon, James Clark
Cc: linux-arm-kernel, linux-riscv, linux-kernel, devicetree,
linux-perf-users, Conor Dooley
From: Atish Patra <atishp@rivosinc.com>
There are a few new RISC-V ISA extensions (ssccfg, sscsrind, smcntrpmf) which
allow the hpmcounter/hpmevents to be programmed directly from S-mode. The
implementation detects the ISA extensions at runtime and uses them if
available instead of the SBI PMU extension. The SBI PMU extension will still be
used for firmware counters if the user requests it.
The current Linux driver relies on the event encoding defined by the SBI PMU
specification for standard perf events. However, there is no standard
event encoding available in the ISA. In the future, we may want to
decouple counter delegation and SBI PMU completely. In that case,
platforms that support counter delegation must rely on the event encoding
defined in the perf json file or in the PMU driver.
Firmware events will continue to use the SBI PMU encoding, as
firmware events cannot be supported without SBI PMU.
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
arch/riscv/include/asm/csr.h | 1 +
drivers/perf/riscv_pmu_sbi.c | 578 +++++++++++++++++++++++++++++++++--------
include/linux/perf/riscv_pmu.h | 3 +
3 files changed, 478 insertions(+), 104 deletions(-)
diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
index a3b24b88e401..cd22b5168689 100644
--- a/arch/riscv/include/asm/csr.h
+++ b/arch/riscv/include/asm/csr.h
@@ -258,6 +258,7 @@
#endif
#define SISELECT_SSCCFG_BASE 0x40
+#define HPMEVENT_MASK GENMASK_ULL(63, 56)
/* mseccfg bits */
#define MSECCFG_PMM ENVCFG_PMM
diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c
index 2568c6808f5d..7995da4a98a1 100644
--- a/drivers/perf/riscv_pmu_sbi.c
+++ b/drivers/perf/riscv_pmu_sbi.c
@@ -28,6 +28,8 @@
#include <asm/cpufeature.h>
#include <asm/vendor_extensions.h>
#include <asm/vendor_extensions/andes.h>
+#include <asm/hwcap.h>
+#include <asm/csr_ind.h>
#define ALT_SBI_PMU_OVERFLOW(__ovl) \
asm volatile(ALTERNATIVE_2( \
@@ -60,7 +62,20 @@ asm volatile(ALTERNATIVE( \
#define PERF_EVENT_FLAG_USER_ACCESS BIT(SYSCTL_USER_ACCESS)
#define PERF_EVENT_FLAG_LEGACY BIT(SYSCTL_LEGACY)
-PMU_FORMAT_ATTR(event, "config:0-55");
+#define RVPMU_SBI_PMU_FORMAT_ATTR "config:0-47"
+#define RVPMU_CDELEG_PMU_FORMAT_ATTR "config:0-55"
+
+static ssize_t __maybe_unused rvpmu_format_show(struct device *dev, struct device_attribute *attr,
+ char *buf);
+
+#define RVPMU_ATTR_ENTRY(_name, _func, _config) ( \
+ &((struct dev_ext_attribute[]) { \
+ { __ATTR(_name, 0444, _func, NULL), (void *)_config } \
+ })[0].attr.attr)
+
+#define RVPMU_FORMAT_ATTR_ENTRY(_name, _config) \
+ RVPMU_ATTR_ENTRY(_name, rvpmu_format_show, (char *)_config)
+
PMU_FORMAT_ATTR(firmware, "config:62-63");
static bool sbi_v2_available;
@@ -68,7 +83,11 @@ static bool sbi_v3_available;
static DEFINE_STATIC_KEY_FALSE(sbi_pmu_snapshot_available);
#define sbi_pmu_snapshot_available() \
static_branch_unlikely(&sbi_pmu_snapshot_available)
+
static DEFINE_STATIC_KEY_FALSE(riscv_pmu_sbi_available);
+#define riscv_pmu_sbi_available() \
+ static_branch_likely(&riscv_pmu_sbi_available)
+
static DEFINE_STATIC_KEY_FALSE(riscv_pmu_cdeleg_available);
/* Avoid unnecessary code patching in the one time booting path*/
@@ -83,19 +102,35 @@ static DEFINE_STATIC_KEY_FALSE(riscv_pmu_cdeleg_available);
#define riscv_pmu_sbi_available() \
static_branch_likely(&riscv_pmu_sbi_available)
-static struct attribute *riscv_arch_formats_attr[] = {
- &format_attr_event.attr,
+static struct attribute *riscv_sbi_pmu_formats_attr[] = {
+ RVPMU_FORMAT_ATTR_ENTRY(event, RVPMU_SBI_PMU_FORMAT_ATTR),
&format_attr_firmware.attr,
NULL,
};
-static struct attribute_group riscv_pmu_format_group = {
+static struct attribute_group riscv_sbi_pmu_format_group = {
.name = "format",
- .attrs = riscv_arch_formats_attr,
+ .attrs = riscv_sbi_pmu_formats_attr,
};
-static const struct attribute_group *riscv_pmu_attr_groups[] = {
- &riscv_pmu_format_group,
+static const struct attribute_group *riscv_sbi_pmu_attr_groups[] = {
+ &riscv_sbi_pmu_format_group,
+ NULL,
+};
+
+static struct attribute *riscv_cdeleg_pmu_formats_attr[] = {
+ RVPMU_FORMAT_ATTR_ENTRY(event, RVPMU_CDELEG_PMU_FORMAT_ATTR),
+ &format_attr_firmware.attr,
+ NULL,
+};
+
+static struct attribute_group riscv_cdeleg_pmu_format_group = {
+ .name = "format",
+ .attrs = riscv_cdeleg_pmu_formats_attr,
+};
+
+static const struct attribute_group *riscv_cdeleg_pmu_attr_groups[] = {
+ &riscv_cdeleg_pmu_format_group,
NULL,
};
@@ -482,6 +517,14 @@ static void rvpmu_sbi_check_std_events(struct work_struct *work)
static DECLARE_WORK(check_std_events_work, rvpmu_sbi_check_std_events);
+static ssize_t rvpmu_format_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct dev_ext_attribute *eattr = container_of(attr,
+ struct dev_ext_attribute, attr);
+ return sysfs_emit(buf, "%s\n", (char *)eattr->var);
+}
+
static int rvpmu_ctr_get_width(int idx)
{
return pmu_ctr_list[idx].width;
@@ -599,6 +642,38 @@ static uint8_t rvpmu_csr_index(struct perf_event *event)
return pmu_ctr_list[event->hw.idx].csr - CSR_CYCLE;
}
+static uint64_t get_deleg_priv_filter_bits(struct perf_event *event)
+{
+ u64 priv_filter_bits = 0;
+ bool guest_events = false;
+
+ if (event->attr.config1 & RISCV_PMU_CONFIG1_GUEST_EVENTS)
+ guest_events = true;
+ if (event->attr.exclude_kernel)
+ priv_filter_bits |= guest_events ? HPMEVENT_VSINH : HPMEVENT_SINH;
+ if (event->attr.exclude_user)
+ priv_filter_bits |= guest_events ? HPMEVENT_VUINH : HPMEVENT_UINH;
+ if (guest_events && event->attr.exclude_hv)
+ priv_filter_bits |= HPMEVENT_SINH;
+ if (event->attr.exclude_host)
+ priv_filter_bits |= HPMEVENT_UINH | HPMEVENT_SINH;
+ if (event->attr.exclude_guest)
+ priv_filter_bits |= HPMEVENT_VSINH | HPMEVENT_VUINH;
+
+ return priv_filter_bits;
+}
+
+static bool pmu_sbi_is_fw_event(struct perf_event *event)
+{
+ u32 type = event->attr.type;
+ u64 config = event->attr.config;
+
+ if (type == PERF_TYPE_RAW && ((config >> 63) == 1))
+ return true;
+ else
+ return false;
+}
+
static unsigned long rvpmu_sbi_get_filter_flags(struct perf_event *event)
{
unsigned long cflags = 0;
@@ -627,7 +702,8 @@ static int rvpmu_sbi_ctr_get_idx(struct perf_event *event)
struct cpu_hw_events *cpuc = this_cpu_ptr(rvpmu->hw_events);
struct sbiret ret;
int idx;
- uint64_t cbase = 0, cmask = rvpmu->cmask;
+ u64 cbase = 0;
+ unsigned long ctr_mask = rvpmu->cmask;
unsigned long cflags = 0;
cflags = rvpmu_sbi_get_filter_flags(event);
@@ -640,21 +716,23 @@ static int rvpmu_sbi_ctr_get_idx(struct perf_event *event)
if ((hwc->flags & PERF_EVENT_FLAG_LEGACY) && (event->attr.type == PERF_TYPE_HARDWARE)) {
if (event->attr.config == PERF_COUNT_HW_CPU_CYCLES) {
cflags |= SBI_PMU_CFG_FLAG_SKIP_MATCH;
- cmask = 1;
+ ctr_mask = 1;
} else if (event->attr.config == PERF_COUNT_HW_INSTRUCTIONS) {
cflags |= SBI_PMU_CFG_FLAG_SKIP_MATCH;
- cmask = BIT(CSR_INSTRET - CSR_CYCLE);
+ ctr_mask = BIT(CSR_INSTRET - CSR_CYCLE);
}
+ } else if (pmu_sbi_is_fw_event(event)) {
+ ctr_mask = firmware_cmask;
}
/* retrieve the available counter index */
#if defined(CONFIG_32BIT)
ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_CFG_MATCH, cbase,
- cmask, cflags, hwc->event_base, hwc->config,
+ ctr_mask, cflags, hwc->event_base, hwc->config,
hwc->config >> 32);
#else
ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_CFG_MATCH, cbase,
- cmask, cflags, hwc->event_base, hwc->config, 0);
+ ctr_mask, cflags, hwc->event_base, hwc->config, 0);
#endif
if (ret.error) {
pr_debug("Not able to find a counter for event %lx config %llx\n",
@@ -663,7 +741,7 @@ static int rvpmu_sbi_ctr_get_idx(struct perf_event *event)
}
idx = ret.value;
- if (!test_bit(idx, &rvpmu->cmask) || !pmu_ctr_list[idx].value)
+ if (!test_bit(idx, &ctr_mask) || !pmu_ctr_list[idx].value)
return -ENOENT;
/* Additional sanity check for the counter id */
@@ -713,29 +791,98 @@ static int sbi_pmu_event_find_cache(u64 config)
return ret;
}
-static bool pmu_sbi_is_fw_event(struct perf_event *event)
+static int rvpmu_sbi_event_map(struct perf_event *event, u64 *econfig)
{
u32 type = event->attr.type;
u64 config = event->attr.config;
- if ((type == PERF_TYPE_RAW) && ((config >> 63) == 1))
- return true;
- else
- return false;
+ /*
+ * Ensure we are finished checking standard hardware events for
+ * validity before allowing userspace to configure any events.
+ */
+ flush_work(&check_std_events_work);
+
+ return riscv_pmu_get_event_info(type, config, econfig);
}
-static int rvpmu_sbi_event_map(struct perf_event *event, u64 *econfig)
+static int cdeleg_pmu_event_find_cache(u64 config, u64 *eventid, uint32_t *counter_mask)
+{
+ unsigned int cache_type, cache_op, cache_result;
+
+ if (!current_pmu_cache_event_map)
+ return -ENOENT;
+
+ cache_type = (config >> 0) & 0xff;
+ if (cache_type >= PERF_COUNT_HW_CACHE_MAX)
+ return -EINVAL;
+
+ cache_op = (config >> 8) & 0xff;
+ if (cache_op >= PERF_COUNT_HW_CACHE_OP_MAX)
+ return -EINVAL;
+
+ cache_result = (config >> 16) & 0xff;
+ if (cache_result >= PERF_COUNT_HW_CACHE_RESULT_MAX)
+ return -EINVAL;
+
+ if (eventid)
+ *eventid = current_pmu_cache_event_map[cache_type][cache_op]
+ [cache_result].event_id;
+ if (counter_mask)
+ *counter_mask = current_pmu_cache_event_map[cache_type][cache_op]
+ [cache_result].counter_mask;
+
+ return 0;
+}
+
+static int rvpmu_cdeleg_event_map(struct perf_event *event, u64 *econfig)
{
u32 type = event->attr.type;
u64 config = event->attr.config;
+ int ret = 0;
/*
- * Ensure we are finished checking standard hardware events for
- * validity before allowing userspace to configure any events.
+ * There are two ways standard perf events can be mapped to platform specific
+ * encoding.
+ * 1. The vendor may specify the encodings in the driver.
+ * 2. The Perf tool for RISC-V may remap the standard perf event to platform
+ * specific encoding.
+ *
+ * The RISC-V ISA doesn't define any standard event encoding, so the perf tool
+ * allows vendors to define it via a json file. The encoding defined in the json
+ * overrides the perf legacy encoding. However, some users may want to run
+ * performance monitoring without the perf tool. To support that use case,
+ * vendors may also specify the event encoding in the driver.
+ * If an encoding is defined in the json, it will be encoded as a raw event.
*/
- flush_work(&check_std_events_work);
- return riscv_pmu_get_event_info(type, config, econfig);
+ switch (type) {
+ case PERF_TYPE_HARDWARE:
+ if (config >= PERF_COUNT_HW_MAX)
+ return -EINVAL;
+ if (!current_pmu_hw_event_map)
+ return -ENOENT;
+
+ *econfig = current_pmu_hw_event_map[config].event_id;
+ if (*econfig == HW_OP_UNSUPPORTED)
+ ret = -ENOENT;
+ break;
+ case PERF_TYPE_HW_CACHE:
+ ret = cdeleg_pmu_event_find_cache(config, econfig, NULL);
+ if (ret)
+ break;
+ if (*econfig == CACHE_OP_UNSUPPORTED)
+ ret = -ENOENT;
+ break;
+ case PERF_TYPE_RAW:
+ *econfig = config & RISCV_PMU_DELEG_RAW_EVENT_MASK;
+ break;
+ default:
+ ret = -ENOENT;
+ break;
+ }
+
+ /* event_base is not used for counter delegation */
+ return ret;
}
static void pmu_sbi_snapshot_free(struct riscv_pmu *pmu)
@@ -821,7 +968,7 @@ static int pmu_sbi_snapshot_setup(struct riscv_pmu *pmu, int cpu)
return 0;
}
-static u64 rvpmu_sbi_ctr_read(struct perf_event *event)
+static u64 rvpmu_ctr_read(struct perf_event *event)
{
struct hw_perf_event *hwc = &event->hw;
int idx = hwc->idx;
@@ -898,10 +1045,6 @@ static void rvpmu_sbi_ctr_start(struct perf_event *event, u64 ival)
if (ret.error && (ret.error != SBI_ERR_ALREADY_STARTED))
pr_err("Starting counter idx %d failed with error %d\n",
hwc->idx, sbi_err_map_linux_errno(ret.error));
-
- if ((hwc->flags & PERF_EVENT_FLAG_USER_ACCESS) &&
- (hwc->flags & PERF_EVENT_FLAG_USER_READ_CNT))
- rvpmu_set_scounteren((void *)event);
}
static void rvpmu_sbi_ctr_stop(struct perf_event *event, unsigned long flag)
@@ -912,10 +1055,6 @@ static void rvpmu_sbi_ctr_stop(struct perf_event *event, unsigned long flag)
struct cpu_hw_events *cpu_hw_evt = this_cpu_ptr(pmu->hw_events);
struct riscv_pmu_snapshot_data *sdata = cpu_hw_evt->snapshot_addr;
- if ((hwc->flags & PERF_EVENT_FLAG_USER_ACCESS) &&
- (hwc->flags & PERF_EVENT_FLAG_USER_READ_CNT))
- rvpmu_reset_scounteren((void *)event);
-
if (sbi_pmu_snapshot_available())
flag |= SBI_PMU_STOP_FLAG_TAKE_SNAPSHOT;
@@ -951,12 +1090,6 @@ static int rvpmu_sbi_find_num_ctrs(void)
return sbi_err_map_linux_errno(ret.error);
}
-static u32 rvpmu_deleg_find_ctrs(void)
-{
- /* TODO */
- return 0;
-}
-
static int rvpmu_sbi_get_ctrinfo(u32 nsbi_ctr, u32 *num_fw_ctr, u32 *num_hw_ctr)
{
struct sbiret ret;
@@ -1034,55 +1167,75 @@ static inline void rvpmu_sbi_stop_hw_ctrs(struct riscv_pmu *pmu)
}
}
-/*
- * This function starts all the used counters in two step approach.
- * Any counter that did not overflow can be start in a single step
- * while the overflowed counters need to be started with updated initialization
- * value.
- */
-static inline void rvpmu_sbi_start_ovf_ctrs_sbi(struct cpu_hw_events *cpu_hw_evt,
- u64 ctr_ovf_mask)
+static void rvpmu_deleg_ctr_start_mask(unsigned long mask)
{
- int idx = 0, i;
- struct perf_event *event;
- unsigned long flag = SBI_PMU_START_FLAG_SET_INIT_VALUE;
- unsigned long ctr_start_mask = 0;
- uint64_t max_period;
- struct hw_perf_event *hwc;
- u64 init_val = 0;
+ unsigned long scountinhibit_val = 0;
- for (i = 0; i < BITS_TO_LONGS(RISCV_MAX_COUNTERS); i++) {
- ctr_start_mask = cpu_hw_evt->used_hw_ctrs[i] & ~ctr_ovf_mask;
- /* Start all the counters that did not overflow in a single shot */
- if (ctr_start_mask) {
- sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_START, i * BITS_PER_LONG,
- ctr_start_mask, 0, 0, 0, 0);
- }
- }
+ scountinhibit_val = csr_read(CSR_SCOUNTINHIBIT);
+ scountinhibit_val &= ~mask;
+
+ csr_write(CSR_SCOUNTINHIBIT, scountinhibit_val);
+}
+
+static void rvpmu_deleg_ctr_enable_irq(struct perf_event *event)
+{
+ unsigned long hpmevent_curr;
+ unsigned long of_mask;
+ struct hw_perf_event *hwc = &event->hw;
+ int counter_idx = hwc->idx;
+ unsigned long sip_val = csr_read(CSR_SIP);
+
+ if (!is_sampling_event(event) || (sip_val & SIP_LCOFIP))
+ return;
- /* Reinitialize and start all the counter that overflowed */
- while (ctr_ovf_mask) {
- if (ctr_ovf_mask & 0x01) {
- event = cpu_hw_evt->events[idx];
- hwc = &event->hw;
- max_period = riscv_pmu_ctr_get_width_mask(event);
- init_val = local64_read(&hwc->prev_count) & max_period;
#if defined(CONFIG_32BIT)
- sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_START, idx, 1,
- flag, init_val, init_val >> 32, 0);
+ hpmevent_curr = csr_ind_read(CSR_SIREG5, SISELECT_SSCCFG_BASE, counter_idx);
+ of_mask = (u32)~HPMEVENTH_OF;
#else
- sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_START, idx, 1,
- flag, init_val, 0, 0);
+ hpmevent_curr = csr_ind_read(CSR_SIREG2, SISELECT_SSCCFG_BASE, counter_idx);
+ of_mask = ~HPMEVENT_OF;
#endif
- perf_event_update_userpage(event);
- }
- ctr_ovf_mask = ctr_ovf_mask >> 1;
- idx++;
- }
+
+ hpmevent_curr &= of_mask;
+#if defined(CONFIG_32BIT)
+ csr_ind_write(CSR_SIREG5, SISELECT_SSCCFG_BASE, counter_idx, hpmevent_curr);
+#else
+ csr_ind_write(CSR_SIREG2, SISELECT_SSCCFG_BASE, counter_idx, hpmevent_curr);
+#endif
+}
+
+static void rvpmu_deleg_ctr_start(struct perf_event *event, u64 ival)
+{
+ unsigned long scountinhibit_val = 0;
+ struct hw_perf_event *hwc = &event->hw;
+
+#if defined(CONFIG_32BIT)
+ csr_ind_write(CSR_SIREG, SISELECT_SSCCFG_BASE, hwc->idx, ival & 0xFFFFFFFF);
+ csr_ind_write(CSR_SIREG4, SISELECT_SSCCFG_BASE, hwc->idx, ival >> BITS_PER_LONG);
+#else
+ csr_ind_write(CSR_SIREG, SISELECT_SSCCFG_BASE, hwc->idx, ival);
+#endif
+
+ rvpmu_deleg_ctr_enable_irq(event);
+
+ scountinhibit_val = csr_read(CSR_SCOUNTINHIBIT);
+ scountinhibit_val &= ~BIT(hwc->idx);
+
+ csr_write(CSR_SCOUNTINHIBIT, scountinhibit_val);
+}
+
+static void rvpmu_deleg_ctr_stop_mask(unsigned long mask)
+{
+ unsigned long scountinhibit_val = 0;
+
+ scountinhibit_val = csr_read(CSR_SCOUNTINHIBIT);
+ scountinhibit_val |= mask;
+
+ csr_write(CSR_SCOUNTINHIBIT, scountinhibit_val);
}
-static inline void rvpmu_sbi_start_ovf_ctrs_snapshot(struct cpu_hw_events *cpu_hw_evt,
- u64 ctr_ovf_mask)
+static void rvpmu_sbi_start_ovf_ctrs_snapshot(struct cpu_hw_events *cpu_hw_evt,
+ u64 ctr_ovf_mask)
{
int i, idx = 0;
struct perf_event *event;
@@ -1116,15 +1269,53 @@ static inline void rvpmu_sbi_start_ovf_ctrs_snapshot(struct cpu_hw_events *cpu_h
}
}
-static void rvpmu_sbi_start_overflow_mask(struct riscv_pmu *pmu,
- u64 ctr_ovf_mask)
+/*
+ * This function starts all the used counters in a two-step approach.
+ * Any counter that did not overflow can be started in a single step
+ * while the overflowed counters need to be started with updated initialization
+ * value.
+ */
+static void rvpmu_start_overflow_mask(struct riscv_pmu *pmu, u64 ctr_ovf_mask)
{
+ int idx = 0, i;
+ struct perf_event *event;
+ unsigned long ctr_start_mask = 0;
+ u64 max_period, init_val = 0;
+ struct hw_perf_event *hwc;
struct cpu_hw_events *cpu_hw_evt = this_cpu_ptr(pmu->hw_events);
if (sbi_pmu_snapshot_available())
- rvpmu_sbi_start_ovf_ctrs_snapshot(cpu_hw_evt, ctr_ovf_mask);
- else
- rvpmu_sbi_start_ovf_ctrs_sbi(cpu_hw_evt, ctr_ovf_mask);
+ return rvpmu_sbi_start_ovf_ctrs_snapshot(cpu_hw_evt, ctr_ovf_mask);
+
+ /* Start all the counters that did not overflow */
+ if (riscv_pmu_cdeleg_available()) {
+ ctr_start_mask = cpu_hw_evt->used_hw_ctrs[0] & ~ctr_ovf_mask;
+ rvpmu_deleg_ctr_start_mask(ctr_start_mask);
+ } else {
+ for (i = 0; i < BITS_TO_LONGS(RISCV_MAX_COUNTERS); i++) {
+ ctr_start_mask = cpu_hw_evt->used_hw_ctrs[i] & ~ctr_ovf_mask;
+ /* Start all the counters that did not overflow in a single shot */
+ sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_START, i * BITS_PER_LONG,
+ ctr_start_mask, 0, 0, 0, 0);
+ }
+ }
+
+ /* Reinitialize and start all the counters that overflowed */
+ while (ctr_ovf_mask) {
+ if (ctr_ovf_mask & 0x01) {
+ event = cpu_hw_evt->events[idx];
+ hwc = &event->hw;
+ max_period = riscv_pmu_ctr_get_width_mask(event);
+ init_val = local64_read(&hwc->prev_count) & max_period;
+ if (riscv_pmu_cdeleg_available())
+ rvpmu_deleg_ctr_start(event, init_val);
+ else
+ rvpmu_sbi_ctr_start(event, init_val);
+ perf_event_update_userpage(event);
+ }
+ ctr_ovf_mask = ctr_ovf_mask >> 1;
+ idx++;
+ }
}
static irqreturn_t rvpmu_ovf_handler(int irq, void *dev)
@@ -1159,10 +1350,18 @@ static irqreturn_t rvpmu_ovf_handler(int irq, void *dev)
}
pmu = to_riscv_pmu(event->pmu);
- rvpmu_sbi_stop_hw_ctrs(pmu);
+ if (riscv_pmu_cdeleg_available())
+ rvpmu_deleg_ctr_stop_mask(cpu_hw_evt->used_hw_ctrs[0]);
+ else
+ rvpmu_sbi_stop_hw_ctrs(pmu);
- /* Overflow status register should only be read after counter are stopped */
- if (sbi_pmu_snapshot_available())
+ /*
+ * Overflow status register should only be read after counters are stopped.
+ * In counter delegation mode the overflows are reported in scountovf, not
+ * in the SBI snapshot area, so read the CSR directly even when an SBI PMU
+ * snapshot is also available.
+ */
+ if (sbi_pmu_snapshot_available() && !riscv_pmu_cdeleg_available())
overflow = sdata->ctr_overflow_mask;
else
ALT_SBI_PMU_OVERFLOW(overflow);
@@ -1228,22 +1427,183 @@ static irqreturn_t rvpmu_ovf_handler(int irq, void *dev)
hw_evt->state = 0;
}
- rvpmu_sbi_start_overflow_mask(pmu, overflowed_ctrs);
+ rvpmu_start_overflow_mask(pmu, overflowed_ctrs);
perf_sample_event_took(sched_clock() - start_clock);
return IRQ_HANDLED;
}
+static int get_deleg_hw_ctr_width(int counter_offset)
+{
+ unsigned long hpm_warl;
+ int num_bits;
+
+ if (counter_offset < 3 || counter_offset > 31)
+ return 0;
+
+ hpm_warl = csr_ind_warl(CSR_SIREG, SISELECT_SSCCFG_BASE, counter_offset, -1);
+ if (!hpm_warl)
+ return 0;
+ num_bits = __fls(hpm_warl);
+
+#if defined(CONFIG_32BIT)
+ /*
+ * The low half contributes a full BITS_PER_LONG bits when the counter is
+ * wider than 32 bits; the high half's __fls() gives the remaining width.
+ */
+ hpm_warl = csr_ind_warl(CSR_SIREG4, SISELECT_SSCCFG_BASE, counter_offset, -1);
+ if (hpm_warl)
+ num_bits = BITS_PER_LONG + __fls(hpm_warl);
+#endif
+ return num_bits;
+}
+
+static int rvpmu_deleg_find_ctrs(void)
+{
+ int i, num_hw_ctr = 0;
+ union sbi_pmu_ctr_info cinfo;
+ unsigned long scountinhibit_old = 0;
+
+ /* Do a WARL write/read to detect which hpmcounters have been delegated */
+ scountinhibit_old = csr_read(CSR_SCOUNTINHIBIT);
+ csr_write(CSR_SCOUNTINHIBIT, -1);
+ cmask = csr_read(CSR_SCOUNTINHIBIT);
+
+ csr_write(CSR_SCOUNTINHIBIT, scountinhibit_old);
+
+ for_each_set_bit(i, &cmask, RISCV_MAX_HW_COUNTERS) {
+ if (unlikely(i == 1))
+ continue; /* This should never happen as TM is read only */
+ cinfo.value = 0;
+ cinfo.type = SBI_PMU_CTR_TYPE_HW;
+ /*
+ * If counter delegation is enabled, the csr stored to the cinfo will
+ * be a virtual counter that the delegation attempts to read.
+ */
+ cinfo.csr = CSR_CYCLE + i;
+ if (i == 0 || i == 2)
+ cinfo.width = 63;
+ else
+ cinfo.width = get_deleg_hw_ctr_width(i);
+
+ num_hw_ctr++;
+ pmu_ctr_list[i].value = cinfo.value;
+ }
+
+ return num_hw_ctr;
+}
+
+static int get_deleg_fixed_hw_idx(struct cpu_hw_events *cpuc, struct perf_event *event)
+{
+ return -EINVAL;
+}
+
+static int get_deleg_next_hpm_hw_idx(struct cpu_hw_events *cpuc, struct perf_event *event)
+{
+ unsigned long hw_ctr_mask = 0;
+
+ /*
+ * TODO: Assume that every hpmcounter can monitor every event for now.
+ * The event to counter mapping should come from the json file.
+ * The mapping should also tell if sampling is supported or not.
+ */
+
+ /* Select only hpmcounters */
+ hw_ctr_mask = cmask & (~0x7);
+ hw_ctr_mask &= ~(cpuc->used_hw_ctrs[0]);
+ return __ffs(hw_ctr_mask);
+}
+
+static void update_deleg_hpmevent(int counter_idx, uint64_t event_value, uint64_t filter_bits)
+{
+ u64 hpmevent_value = 0;
+
+ /* OF bit should be enabled during the start if sampling is requested */
+ hpmevent_value = (event_value & ~HPMEVENT_MASK) | filter_bits | HPMEVENT_OF;
+#if defined(CONFIG_32BIT)
+ csr_ind_write(CSR_SIREG2, SISELECT_SSCCFG_BASE, counter_idx, hpmevent_value & 0xFFFFFFFF);
+ if (riscv_isa_extension_available(NULL, SSCOFPMF))
+ csr_ind_write(CSR_SIREG5, SISELECT_SSCCFG_BASE, counter_idx,
+ hpmevent_value >> BITS_PER_LONG);
+#else
+ csr_ind_write(CSR_SIREG2, SISELECT_SSCCFG_BASE, counter_idx, hpmevent_value);
+#endif
+}
+
+static int rvpmu_deleg_ctr_get_idx(struct perf_event *event)
+{
+ struct hw_perf_event *hwc = &event->hw;
+ struct riscv_pmu *rvpmu = to_riscv_pmu(event->pmu);
+ struct cpu_hw_events *cpuc = this_cpu_ptr(rvpmu->hw_events);
+ unsigned long hw_ctr_max_id;
+ u64 priv_filter;
+ int idx;
+
+ /*
+ * TODO: We should not rely on SBI Perf encoding to check if the event
+ * is a fixed one or not.
+ */
+ if (!is_sampling_event(event)) {
+ idx = get_deleg_fixed_hw_idx(cpuc, event);
+ if (idx == 0 || idx == 2) {
+ /* Priv mode filter bits are only available if smcntrpmf is present */
+ if (riscv_isa_extension_available(NULL, SMCNTRPMF))
+ goto found_idx;
+ else
+ goto skip_update;
+ }
+ }
+
+ if (!cmask)
+ goto out_err;
+ hw_ctr_max_id = __fls(cmask);
+ idx = get_deleg_next_hpm_hw_idx(cpuc, event);
+ if (idx < 3 || idx > hw_ctr_max_id)
+ goto out_err;
+found_idx:
+ priv_filter = get_deleg_priv_filter_bits(event);
+ update_deleg_hpmevent(idx, hwc->config, priv_filter);
+skip_update:
+ if (!test_and_set_bit(idx, cpuc->used_hw_ctrs))
+ return idx;
+out_err:
+ return -ENOENT;
+}
+
static void rvpmu_ctr_start(struct perf_event *event, u64 ival)
{
- rvpmu_sbi_ctr_start(event, ival);
- /* TODO: Counter delegation implementation */
+ struct hw_perf_event *hwc = &event->hw;
+
+ if (riscv_pmu_cdeleg_available() && !pmu_sbi_is_fw_event(event))
+ rvpmu_deleg_ctr_start(event, ival);
+ else
+ rvpmu_sbi_ctr_start(event, ival);
+
+ if ((hwc->flags & PERF_EVENT_FLAG_USER_ACCESS) &&
+ (hwc->flags & PERF_EVENT_FLAG_USER_READ_CNT))
+ rvpmu_set_scounteren((void *)event);
}
static void rvpmu_ctr_stop(struct perf_event *event, unsigned long flag)
{
- rvpmu_sbi_ctr_stop(event, flag);
- /* TODO: Counter delegation implementation */
+ struct hw_perf_event *hwc = &event->hw;
+
+ if ((hwc->flags & PERF_EVENT_FLAG_USER_ACCESS) &&
+ (hwc->flags & PERF_EVENT_FLAG_USER_READ_CNT))
+ rvpmu_reset_scounteren((void *)event);
+
+ if (riscv_pmu_cdeleg_available() && !pmu_sbi_is_fw_event(event)) {
+ /*
+ * The counter is already stopped. No need to stop again. Counter
+ * mapping will be reset in clear_idx function.
+ */
+ if (flag != RISCV_PMU_STOP_FLAG_RESET)
+ rvpmu_deleg_ctr_stop_mask(BIT(hwc->idx));
+ else
+ update_deleg_hpmevent(hwc->idx, 0, 0);
+ } else {
+ rvpmu_sbi_ctr_stop(event, flag);
+ }
}
static int rvpmu_find_ctrs(void)
@@ -1292,20 +1652,18 @@ static int rvpmu_find_ctrs(void)
static int rvpmu_event_map(struct perf_event *event, u64 *econfig)
{
- return rvpmu_sbi_event_map(event, econfig);
- /* TODO: Counter delegation implementation */
+ if (riscv_pmu_cdeleg_available() && !pmu_sbi_is_fw_event(event))
+ return rvpmu_cdeleg_event_map(event, econfig);
+ else
+ return rvpmu_sbi_event_map(event, econfig);
}
static int rvpmu_ctr_get_idx(struct perf_event *event)
{
- return rvpmu_sbi_ctr_get_idx(event);
- /* TODO: Counter delegation implementation */
-}
-
-static u64 rvpmu_ctr_read(struct perf_event *event)
-{
- return rvpmu_sbi_ctr_read(event);
- /* TODO: Counter delegation implementation */
+ if (riscv_pmu_cdeleg_available() && !pmu_sbi_is_fw_event(event))
+ return rvpmu_deleg_ctr_get_idx(event);
+ else
+ return rvpmu_sbi_ctr_get_idx(event);
}
static int rvpmu_starting_cpu(unsigned int cpu, struct hlist_node *node)
@@ -1323,7 +1681,16 @@ static int rvpmu_starting_cpu(unsigned int cpu, struct hlist_node *node)
csr_write(CSR_SCOUNTEREN, 0x2);
/* Stop all the counters so that they can be enabled from perf */
- rvpmu_sbi_stop_all(pmu);
+ if (riscv_pmu_cdeleg_available()) {
+ rvpmu_deleg_ctr_stop_mask(cmask);
+ if (riscv_pmu_sbi_available()) {
+ /* Stop the firmware counters as well */
+ sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_STOP, 0, firmware_cmask,
+ 0, 0, 0, 0);
+ }
+ } else {
+ rvpmu_sbi_stop_all(pmu);
+ }
if (riscv_pmu_use_irq) {
cpu_hw_evt->irq = riscv_pmu_irq;
@@ -1632,8 +1999,11 @@ static int rvpmu_device_probe(struct platform_device *pdev)
}
irq_requested = (ret == 0);
- pmu->pmu.attr_groups = riscv_pmu_attr_groups;
pmu->pmu.parent = &pdev->dev;
+ if (riscv_pmu_cdeleg_available_boot())
+ pmu->pmu.attr_groups = riscv_cdeleg_pmu_attr_groups;
+ else
+ pmu->pmu.attr_groups = riscv_sbi_pmu_attr_groups;
pmu->cmask = cmask;
pmu->ctr_start = rvpmu_ctr_start;
pmu->ctr_stop = rvpmu_ctr_stop;
diff --git a/include/linux/perf/riscv_pmu.h b/include/linux/perf/riscv_pmu.h
index f82a28040594..3c64151cb038 100644
--- a/include/linux/perf/riscv_pmu.h
+++ b/include/linux/perf/riscv_pmu.h
@@ -20,6 +20,7 @@
*/
#define RISCV_MAX_COUNTERS 64
+#define RISCV_MAX_HW_COUNTERS 32
#define RISCV_OP_UNSUPP (-EOPNOTSUPP)
#define RISCV_PMU_SBI_PDEV_NAME "riscv-pmu-sbi"
#define RISCV_PMU_LEGACY_PDEV_NAME "riscv-pmu-legacy"
@@ -28,6 +29,8 @@
#define RISCV_PMU_CONFIG1_GUEST_EVENTS 0x1
+#define RISCV_PMU_DELEG_RAW_EVENT_MASK GENMASK_ULL(55, 0)
+
struct cpu_hw_events {
/* currently enabled events */
int n_events;
--
2.53.0-Meta
^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [PATCH v8 14/22] RISC-V: perf: Implement supervisor counter delegation support
2026-07-01 8:47 ` [PATCH v8 14/22] RISC-V: perf: Implement supervisor counter delegation support Atish Patra
@ 2026-07-01 9:27 ` sashiko-bot
0 siblings, 0 replies; 34+ messages in thread
From: sashiko-bot @ 2026-07-01 9:27 UTC (permalink / raw)
To: Atish Patra; +Cc: robh, conor+dt, devicetree, linux-perf-users
Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
New issues:
- [Critical] Improper handling of the HPMEVENT_OF bit causes non-sampling events to never count and leads to permanent PMU interrupt deadlocks for sampling events.
- [High] Counter delegation and SBI PMU snapshot features conflict, causing stale counter reads and broken overflow handling when both are present.
Pre-existing issues:
- [High] The PMU overflow handler unconditionally restarts counters, ignoring the throttling decision from perf_event_overflow().
--
commit 11b768ace0260db50741675707cb662cdb941c07
Author: Atish Patra <atishp@rivosinc.com>
RISC-V: perf: Implement supervisor counter delegation support
This commit adds support for the ssccfg, sscsrind, and smcntrpmf RISC-V ISA
extensions to program hpmcounters directly from supervisor mode. It falls
back to the SBI PMU extension for firmware counters or if the ISA extensions
are unavailable.
> diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c
> index 2568c6808f5d2..7995da4a98a1c 100644
> --- a/drivers/perf/riscv_pmu_sbi.c
> +++ b/drivers/perf/riscv_pmu_sbi.c
[ ... ]
> @@ -821,7 +968,7 @@ static int pmu_sbi_snapshot_setup(struct riscv_pmu *pmu, int cpu)
> return 0;
> }
>
> -static u64 rvpmu_sbi_ctr_read(struct perf_event *event)
> +static u64 rvpmu_ctr_read(struct perf_event *event)
> {
> struct hw_perf_event *hwc = &event->hw;
> int idx = hwc->idx;
[Severity: High]
If both counter delegation and SBI snapshot features are available, could
this return stale data?
When the counter is stopped via rvpmu_deleg_ctr_stop_mask(), the SBI snapshot
shared memory is not updated. Since sbi_pmu_snapshot_available() is still
true, it seems we might read a stale or zero value from sdata->ctr_values
instead of reading the hardware counter.
[ ... ]
> +static void rvpmu_deleg_ctr_enable_irq(struct perf_event *event)
> +{
> + unsigned long hpmevent_curr;
> + unsigned long of_mask;
> + struct hw_perf_event *hwc = &event->hw;
> + int counter_idx = hwc->idx;
> + unsigned long sip_val = csr_read(CSR_SIP);
> +
> + if (!is_sampling_event(event) || (sip_val & SIP_LCOFIP))
> + return;
[Severity: Critical]
Will non-sampling events remain permanently halted here?
In update_deleg_hpmevent(), the HPMEVENT_OF bit is unconditionally set, which
architecturally inhibits the counter. This function is responsible for clearing
it. However, if this returns early for non-sampling events, the OF bit is
never cleared. Could this cause events like standard perf stat to perpetually
read zero?
Additionally, if sip_val & SIP_LCOFIP is true (e.g. if another counter
overflows while processing), does returning early permanently deadlock this
newly restarted sampling counter?
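The two early-return cases can be traced with a small user-space model. This is only a sketch of the control flow quoted from the patch, not driver code: `model_start()` and `MODEL_OF` are hypothetical names, `MODEL_OF` assumes the OF bit sits at bit 63 of hpmevent, and the model makes no claim about what the hardware does while OF is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for HPMEVENT_OF (assumed to be bit 63 of hpmevent). */
#define MODEL_OF (UINT64_C(1) << 63)

/*
 * Mirrors the quoted flow: update_deleg_hpmevent() always sets OF, and
 * rvpmu_deleg_ctr_enable_irq() clears it only for a sampling event with
 * no LCOFI interrupt pending.
 */
static uint64_t model_start(uint64_t event, bool sampling, bool lcofip_pending)
{
	uint64_t hpmevent;

	/* The event encoding itself must not already use the OF bit. */
	assert((event & MODEL_OF) == 0);

	hpmevent = event | MODEL_OF;		/* update_deleg_hpmevent() */

	if (!sampling || lcofip_pending)	/* early return in enable_irq */
		return hpmevent;

	return hpmevent & ~MODEL_OF;		/* OF cleared by enable_irq */
}
```

In this model, only the (sampling, no pending LCOFI) combination ends with OF cleared; the other three combinations leave it set, which is the situation the questions above are asking about.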
[ ... ]
> -static inline void rvpmu_sbi_start_ovf_ctrs_snapshot(struct cpu_hw_events *cpu_hw_evt,
> - u64 ctr_ovf_mask)
> +static void rvpmu_sbi_start_ovf_ctrs_snapshot(struct cpu_hw_events *cpu_hw_evt,
> + u64 ctr_ovf_mask)
[ ... ]
> -static void rvpmu_sbi_start_overflow_mask(struct riscv_pmu *pmu,
> - u64 ctr_ovf_mask)
> +static void rvpmu_start_overflow_mask(struct riscv_pmu *pmu, u64 ctr_ovf_mask)
> {
> + int idx = 0, i;
> + struct perf_event *event;
> + unsigned long ctr_start_mask = 0;
> + u64 max_period, init_val = 0;
> + struct hw_perf_event *hwc;
> struct cpu_hw_events *cpu_hw_evt = this_cpu_ptr(pmu->hw_events);
>
> if (sbi_pmu_snapshot_available())
> - rvpmu_sbi_start_ovf_ctrs_snapshot(cpu_hw_evt, ctr_ovf_mask);
> - else
> - rvpmu_sbi_start_ovf_ctrs_sbi(cpu_hw_evt, ctr_ovf_mask);
> + return rvpmu_sbi_start_ovf_ctrs_snapshot(cpu_hw_evt, ctr_ovf_mask);
[Severity: High]
Does this bypass counter delegation entirely when SBI snapshots are present?
If sbi_pmu_snapshot_available() is true, this unconditionally delegates to
rvpmu_sbi_start_ovf_ctrs_snapshot(). This appears to use the SBI interface
to restart the counters and initializes them from the potentially stale
snapshot memory, which might clobber the hardware state managed by counter
delegation.
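One possible ordering of the checks, sketched as a user-space model: counter delegation takes precedence over the snapshot path, matching the condition the patch already uses when reading the overflow mask. The enum and `pick_restart_backend()` are hypothetical names used only for illustration; whether this precedence is the right policy for the restart path is an assumption.

```c
#include <stdbool.h>

/* Hypothetical labels for the three restart paths discussed above. */
enum restart_backend {
	RESTART_CDELEG,		/* program the counters through CSRs */
	RESTART_SNAPSHOT,	/* SBI restart using the snapshot area */
	RESTART_SBI,		/* plain SBI restart */
};

/*
 * Check delegation first so that an available SBI snapshot cannot
 * shadow the delegated counters.
 */
static enum restart_backend pick_restart_backend(bool cdeleg, bool snapshot)
{
	if (cdeleg)
		return RESTART_CDELEG;

	return snapshot ? RESTART_SNAPSHOT : RESTART_SBI;
}
```

With both features present, the model selects the delegation path, whereas the quoted hunk would take the snapshot branch first.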
[ ... ]
> @@ -1228,22 +1427,183 @@ static irqreturn_t rvpmu_ovf_handler(int irq, void *dev)
> hw_evt->state = 0;
> }
>
> - rvpmu_sbi_start_overflow_mask(pmu, overflowed_ctrs);
> + rvpmu_start_overflow_mask(pmu, overflowed_ctrs);
[Severity: High]
This is a pre-existing issue, but does this unconditionally restart counters
even if perf_event_overflow() requests throttling?
The core perf_event_overflow() returns non-zero to signal that an event has
exceeded the maximum sample rate and should be throttled. Since the return
value is ignored here, the counter is blindly restarted via
rvpmu_start_overflow_mask(). Could this allow a maliciously low sample period
to cause an unmitigated PMU interrupt storm?
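A minimal user-space sketch of honoring the throttling decision follows. It assumes a callback that behaves like perf_event_overflow() in one respect only: a non-zero return means the event should not be restarted. `overflow_fn`, `restart_mask_after_overflow()`, and `throttle_odd()` are hypothetical names, not driver or perf core APIs.

```c
#include <stdint.h>

/* Hypothetical stand-in for perf_event_overflow(): non-zero means throttle. */
typedef int (*overflow_fn)(int idx);

/*
 * Walk the overflowed counters, report each overflow, and collect only
 * the counters whose overflow was not throttled.
 */
static uint64_t restart_mask_after_overflow(uint64_t overflowed, overflow_fn report)
{
	uint64_t restart = 0;

	for (int idx = 0; idx < 64; idx++) {
		if (!(overflowed & (UINT64_C(1) << idx)))
			continue;
		if (report(idx) == 0)
			restart |= UINT64_C(1) << idx;
	}

	return restart;
}

/* Illustrative policy only: throttle the odd-numbered counters. */
static int throttle_odd(int idx)
{
	return idx & 1;
}
```

The resulting mask, rather than the full overflow mask, would then be the one passed to the restart routine, so a throttled counter stays stopped.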
> perf_sample_event_took(sched_clock() - start_clock);
>
> return IRQ_HANDLED;
> }
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260701-counter_delegation-v8-0-7909f863a645@meta.com?part=14
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH v8 15/22] RISC-V: perf: Skip PMU SBI extension when not implemented
2026-07-01 8:46 [PATCH v8 00/22] Add Counter delegation ISA extension support Atish Patra
` (13 preceding siblings ...)
2026-07-01 8:47 ` [PATCH v8 14/22] RISC-V: perf: Implement supervisor counter delegation support Atish Patra
@ 2026-07-01 8:47 ` Atish Patra
2026-07-01 9:26 ` sashiko-bot
2026-07-01 8:47 ` [PATCH v8 16/22] RISC-V: perf: Use config2/vendor table for event to counter mapping Atish Patra
` (6 subsequent siblings)
21 siblings, 1 reply; 34+ messages in thread
From: Atish Patra @ 2026-07-01 8:47 UTC (permalink / raw)
To: Jiri Olsa, Paul Walmsley, Mark Rutland, Rob Herring, Anup Patel,
Namhyung Kim, Arnaldo Carvalho de Melo, Krzysztof Kozlowski,
Atish Patra, Ian Rogers, Will Deacon, James Clark
Cc: linux-arm-kernel, linux-riscv, linux-kernel, devicetree,
linux-perf-users, Conor Dooley
From: Charlie Jenkins <charlie@rivosinc.com>
When the PMU SBI extension is not implemented, sbi_v2_available should
not be set to true. The SBI implementation for counter config matching
and firmware counter read should also be skipped when the SBI extension
is not implemented.
Signed-off-by: Atish Patra <atishp@meta.com>
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
---
drivers/perf/riscv_pmu_sbi.c | 49 ++++++++++++++++++++++++++------------------
1 file changed, 29 insertions(+), 20 deletions(-)
diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c
index 7995da4a98a1..fcf8fbb6fd86 100644
--- a/drivers/perf/riscv_pmu_sbi.c
+++ b/drivers/perf/riscv_pmu_sbi.c
@@ -495,27 +495,32 @@ static void rvpmu_sbi_check_event(struct sbi_pmu_event_data *edata)
}
}
-static void rvpmu_sbi_check_std_events(struct work_struct *work)
+static void rvpmu_check_std_events(struct work_struct *work)
{
int ret;
- if (sbi_v3_available) {
- ret = pmu_sbi_check_event_info();
- if (ret)
- pr_err("pmu_sbi_check_event_info failed with error %d\n", ret);
- return;
- }
+ if (riscv_pmu_sbi_available()) {
+ if (sbi_v3_available) {
+ ret = pmu_sbi_check_event_info();
+ if (ret)
+ pr_err("pmu_sbi_check_event_info failed with error %d\n", ret);
+ return;
+ }
- for (int i = 0; i < ARRAY_SIZE(pmu_hw_event_sbi_map); i++)
- rvpmu_sbi_check_event(&pmu_hw_event_sbi_map[i]);
+ for (int i = 0; i < ARRAY_SIZE(pmu_hw_event_sbi_map); i++)
+ rvpmu_sbi_check_event(&pmu_hw_event_sbi_map[i]);
- for (int i = 0; i < ARRAY_SIZE(pmu_cache_event_sbi_map); i++)
- for (int j = 0; j < ARRAY_SIZE(pmu_cache_event_sbi_map[i]); j++)
- for (int k = 0; k < ARRAY_SIZE(pmu_cache_event_sbi_map[i][j]); k++)
- rvpmu_sbi_check_event(&pmu_cache_event_sbi_map[i][j][k]);
+ for (int i = 0; i < ARRAY_SIZE(pmu_cache_event_sbi_map); i++)
+ for (int j = 0; j < ARRAY_SIZE(pmu_cache_event_sbi_map[i]); j++)
+ for (int k = 0; k < ARRAY_SIZE(pmu_cache_event_sbi_map[i][j]); k++)
+ rvpmu_sbi_check_event(&pmu_cache_event_sbi_map[i][j][k]);
+ } else {
+ DO_ONCE_LITE_IF(1, pr_info,
+ "Boot time config matching not required for smcdeleg\n");
+ }
}
-static DECLARE_WORK(check_std_events_work, rvpmu_sbi_check_std_events);
+static DECLARE_WORK(check_std_events_work, rvpmu_check_std_events);
static ssize_t rvpmu_format_show(struct device *dev,
struct device_attribute *attr, char *buf)
@@ -708,6 +713,9 @@ static int rvpmu_sbi_ctr_get_idx(struct perf_event *event)
cflags = rvpmu_sbi_get_filter_flags(event);
+ if (!riscv_pmu_sbi_available())
+ return -ENOENT;
+
/*
* In legacy mode, we have to force the fixed counters for those events
* but not in the user access mode as we want to use the other counters
@@ -985,7 +993,7 @@ static u64 rvpmu_ctr_read(struct perf_event *event)
return val;
}
- if (pmu_sbi_is_fw_event(event)) {
+ if (pmu_sbi_is_fw_event(event) && riscv_pmu_sbi_available()) {
ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_FW_READ,
hwc->idx, 0, 0, 0, 0, 0);
if (ret.error)
@@ -2094,12 +2102,13 @@ static int __init rvpmu_devinit(void)
int ret;
struct platform_device *pdev;
- if (sbi_spec_version >= sbi_mk_version(0, 3) &&
- sbi_probe_extension(SBI_EXT_PMU))
- static_branch_enable(&riscv_pmu_sbi_available);
+ if (sbi_probe_extension(SBI_EXT_PMU)) {
+ if (sbi_spec_version >= sbi_mk_version(0, 3))
+ static_branch_enable(&riscv_pmu_sbi_available);
+ if (sbi_spec_version >= sbi_mk_version(2, 0))
+ sbi_v2_available = true;
+ }
- if (sbi_spec_version >= sbi_mk_version(2, 0))
- sbi_v2_available = true;
/*
* We need all three extensions to be present to access the counters
* in S-mode via Supervisor Counter delegation.
--
2.53.0-Meta
^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [PATCH v8 15/22] RISC-V: perf: Skip PMU SBI extension when not implemented
2026-07-01 8:47 ` [PATCH v8 15/22] RISC-V: perf: Skip PMU SBI extension when not implemented Atish Patra
@ 2026-07-01 9:26 ` sashiko-bot
0 siblings, 0 replies; 34+ messages in thread
From: sashiko-bot @ 2026-07-01 9:26 UTC (permalink / raw)
To: Atish Patra; +Cc: conor+dt, linux-perf-users, robh, devicetree
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [High] Skipping standard event checking when SBI is unavailable leaves event maps with valid default encodings, causing false-positive event support.
--
commit b28b4646a82db47f9cf2c2d617a8c7ae78dfb038
Author: Charlie Jenkins <charlie@rivosinc.com>
RISC-V: perf: Skip PMU SBI extension when not implemented
This commit modifies the RISC-V perf driver to skip the SBI extension for
counter configuration matching and firmware counter reads when the PMU
SBI extension is not implemented. It also prevents sbi_v2_available from
being set to true when the extension is missing.
> diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c
> --- a/drivers/perf/riscv_pmu_sbi.c
> +++ b/drivers/perf/riscv_pmu_sbi.c
> @@ -495,27 +495,32 @@ static void rvpmu_sbi_check_event(struct sbi_pmu_event_data *edata)
[ ... ]
> -static void rvpmu_sbi_check_std_events(struct work_struct *work)
> +static void rvpmu_check_std_events(struct work_struct *work)
> {
> int ret;
>
> - if (sbi_v3_available) {
> - ret = pmu_sbi_check_event_info();
> - if (ret)
> - pr_err("pmu_sbi_check_event_info failed with error %d\n", ret);
> - return;
> - }
> + if (riscv_pmu_sbi_available()) {
> + if (sbi_v3_available) {
> + ret = pmu_sbi_check_event_info();
> + if (ret)
> + pr_err("pmu_sbi_check_event_info failed with error %d\n", ret);
> + return;
> + }
>
> - for (int i = 0; i < ARRAY_SIZE(pmu_hw_event_sbi_map); i++)
> - rvpmu_sbi_check_event(&pmu_hw_event_sbi_map[i]);
> + for (int i = 0; i < ARRAY_SIZE(pmu_hw_event_sbi_map); i++)
> + rvpmu_sbi_check_event(&pmu_hw_event_sbi_map[i]);
>
> - for (int i = 0; i < ARRAY_SIZE(pmu_cache_event_sbi_map); i++)
> - for (int j = 0; j < ARRAY_SIZE(pmu_cache_event_sbi_map[i]); j++)
> - for (int k = 0; k < ARRAY_SIZE(pmu_cache_event_sbi_map[i][j]); k++)
> - rvpmu_sbi_check_event(&pmu_cache_event_sbi_map[i][j][k]);
> + for (int i = 0; i < ARRAY_SIZE(pmu_cache_event_sbi_map); i++)
> + for (int j = 0; j < ARRAY_SIZE(pmu_cache_event_sbi_map[i]); j++)
> + for (int k = 0; k < ARRAY_SIZE(pmu_cache_event_sbi_map[i][j]); k++)
> + rvpmu_sbi_check_event(&pmu_cache_event_sbi_map[i][j][k]);
> + } else {
> + DO_ONCE_LITE_IF(1, pr_info,
> + "Boot time config matching not required for smcdeleg\n");
> + }
> }
[Severity: High]
When riscv_pmu_sbi_available() is false, does skipping this loop leave the
event map arrays like pmu_hw_event_sbi_map with valid default event codes?
If these standard events are not invalidated via the hardware verification
step, riscv_pmu_get_event_info() could unconditionally return the default
uninvalidated values or synthesize valid encodings for firmware events.
Would this incorrectly signal to callers, such as KVM, that the events
are supported?
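One conservative option, sketched in user space under stated assumptions: when no probe is available to verify the table, every entry is marked invalid instead of being left at its default. `struct model_event`, `MODEL_EVENT_INVALID`, `probe_fn`, and `model_check_events()` are hypothetical and do not correspond to the driver's actual types or sentinel values.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel meaning "event not supported". */
#define MODEL_EVENT_INVALID UINT32_MAX

struct model_event {
	uint32_t event_idx;
};

/* Hypothetical probe standing in for an SBI config-matching call. */
typedef bool (*probe_fn)(uint32_t event_idx);

/*
 * With a probe, keep only the entries it accepts. Without one
 * (probe == NULL) nothing can be verified, so every entry is marked
 * invalid. Returns the number of entries that remain valid.
 */
static size_t model_check_events(struct model_event *map, size_t n, probe_fn probe)
{
	size_t valid = 0;

	for (size_t i = 0; i < n; i++) {
		if (probe && probe(map[i].event_idx))
			valid++;
		else
			map[i].event_idx = MODEL_EVENT_INVALID;
	}

	return valid;
}

/* Illustrative probe only: accept even event indices. */
static bool probe_even(uint32_t event_idx)
{
	return (event_idx % 2) == 0;
}
```

Whether invalidating the table is appropriate when counter delegation supplies its own event maps is a separate question that this sketch does not answer.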
[ ... ]
> @@ -708,6 +713,9 @@ static int rvpmu_sbi_ctr_get_idx(struct perf_event *event)
>
> cflags = rvpmu_sbi_get_filter_flags(event);
>
> + if (!riscv_pmu_sbi_available())
> + return -ENOENT;
[Severity: High]
If firmware events successfully map via riscv_pmu_get_event_info() because
the mappings were left intact during initialization, they would be accepted
by perf_event_open.
Will they then fail silently here at schedule time instead of being
appropriately rejected during event initialization?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260701-counter_delegation-v8-0-7909f863a645@meta.com?part=15
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH v8 16/22] RISC-V: perf: Use config2/vendor table for event to counter mapping
2026-07-01 8:46 [PATCH v8 00/22] Add Counter delegation ISA extension support Atish Patra
` (14 preceding siblings ...)
2026-07-01 8:47 ` [PATCH v8 15/22] RISC-V: perf: Skip PMU SBI extension when not implemented Atish Patra
@ 2026-07-01 8:47 ` Atish Patra
2026-07-01 9:35 ` sashiko-bot
2026-07-01 8:47 ` [PATCH v8 17/22] RISC-V: perf: Add legacy event encodings via sysfs Atish Patra
` (5 subsequent siblings)
21 siblings, 1 reply; 34+ messages in thread
From: Atish Patra @ 2026-07-01 8:47 UTC (permalink / raw)
To: Jiri Olsa, Paul Walmsley, Mark Rutland, Rob Herring, Anup Patel,
Namhyung Kim, Arnaldo Carvalho de Melo, Krzysztof Kozlowski,
Atish Patra, Ian Rogers, Will Deacon, James Clark
Cc: linux-arm-kernel, linux-riscv, linux-kernel, devicetree,
linux-perf-users, Conor Dooley
From: Atish Patra <atishp@rivosinc.com>
The counter restriction specified in the json file is passed to
the drivers via the config2 parameter in perf attributes. This allows
any platform vendor to define their custom mapping between event and
hpmcounters without any rules defined in the ISA.
For legacy events, the platform vendor may define the mapping in
the driver in the vendor event table.
The cycle and instruction counters are fixed by the ISA (0 and 2
respectively) and map to the legacy events. The platform vendor must
list them in the driver if they are intended to be used while
profiling. Otherwise, the vendor can specify alternate hpmcounters
that may monitor and/or sample the cycle/instruction counts.
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
drivers/perf/riscv_pmu_sbi.c | 95 +++++++++++++++++++++++++++++++++++-------
include/linux/perf/riscv_pmu.h | 2 +
2 files changed, 81 insertions(+), 16 deletions(-)
diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c
index fcf8fbb6fd86..19d9e4750424 100644
--- a/drivers/perf/riscv_pmu_sbi.c
+++ b/drivers/perf/riscv_pmu_sbi.c
@@ -77,6 +77,7 @@ static ssize_t __maybe_unused rvpmu_format_show(struct device *dev, struct devic
RVPMU_ATTR_ENTRY(_name, rvpmu_format_show, (char *)_config)
PMU_FORMAT_ATTR(firmware, "config:62-63");
+PMU_FORMAT_ATTR(counterid_mask, "config2:0-31");
static bool sbi_v2_available;
static bool sbi_v3_available;
@@ -121,6 +122,7 @@ static const struct attribute_group *riscv_sbi_pmu_attr_groups[] = {
static struct attribute *riscv_cdeleg_pmu_formats_attr[] = {
RVPMU_FORMAT_ATTR_ENTRY(event, RVPMU_CDELEG_PMU_FORMAT_ATTR),
&format_attr_firmware.attr,
+ &format_attr_counterid_mask.attr,
NULL,
};
@@ -1501,24 +1503,85 @@ static int rvpmu_deleg_find_ctrs(void)
return num_hw_ctr;
}
+/*
+ * The json file must correctly specify counter 0 or counter 2 is available
+ * in the counter lists for cycle/instret events. Otherwise, the drivers have
+ * no way to figure out if a fixed counter must be used and pick a programmable
+ * counter if available.
+ */
static int get_deleg_fixed_hw_idx(struct cpu_hw_events *cpuc, struct perf_event *event)
{
- return -EINVAL;
+ bool guest_events = event->attr.config1 & RISCV_PMU_CONFIG1_GUEST_EVENTS;
+ int idx;
+
+ /* event_base is 0 on the delegation path; match via the original perf attrs. */
+ if (guest_events) {
+ if (event->attr.type != PERF_TYPE_HARDWARE)
+ return -EINVAL;
+ if (event->attr.config == PERF_COUNT_HW_CPU_CYCLES)
+ idx = 0; /* CY counter */
+ else if (event->attr.config == PERF_COUNT_HW_INSTRUCTIONS)
+ idx = 2; /* IR counter */
+ else
+ return -EINVAL;
+ } else if (event->attr.config2 & RISCV_PMU_CYCLE_FIXED_CTR_MASK) {
+ idx = 0; /* CY counter */
+ } else if (event->attr.config2 & RISCV_PMU_INSTRUCTION_FIXED_CTR_MASK) {
+ idx = 2; /* IR counter */
+ } else {
+ return -EINVAL;
+ }
+
+ /* Take the fixed counter only if delegated and free, else fall back. */
+ if (!(cmask & BIT(idx)) || test_bit(idx, cpuc->used_hw_ctrs))
+ return -EINVAL;
+
+ return idx;
}
static int get_deleg_next_hpm_hw_idx(struct cpu_hw_events *cpuc, struct perf_event *event)
{
- unsigned long hw_ctr_mask = 0;
+ u32 hw_ctr_mask = 0, temp_mask = 0;
+ u32 type = event->attr.type;
+ u64 config = event->attr.config;
+ int ret;
- /*
- * TODO: Treat every hpmcounter can monitor every event for now.
- * The event to counter mapping should come from the json file.
- * The mapping should also tell if sampling is supported or not.
- */
+ /* Select only available hpmcounters */
+ hw_ctr_mask = cmask & (~0x7) & ~(cpuc->used_hw_ctrs[0]);
+
+ switch (type) {
+ case PERF_TYPE_HARDWARE:
+ temp_mask = current_pmu_hw_event_map[config].counter_mask;
+ break;
+ case PERF_TYPE_HW_CACHE:
+ ret = cdeleg_pmu_event_find_cache(config, NULL, &temp_mask);
+ if (ret)
+ return ret;
+ break;
+ case PERF_TYPE_RAW:
+ /*
+ * Mask off the counters that can't monitor this event (specified via json)
+ * The counter mask for this event is set in config2 via the property 'Counter'
+ * in the json file or manual configuration of config2. If the config2 is not set,
+ * it is assumed all the available hpmcounters can monitor this event.
+ * Note: This assumption may fail for the virtualization use case where the hypervisor
+ * (e.g. KVM) virtualizes the counter. Any event to counter mapping provided by the
+ * guest is meaningless from a hypervisor perspective. Thus, the hypervisor doesn't
+ * set config2 when creating a kernel counter and relies on the default host mapping.
+ */
+ if (event->attr.config2)
+ temp_mask = event->attr.config2;
+ break;
+ default:
+ break;
+ }
+
+ if (temp_mask)
+ hw_ctr_mask &= temp_mask;
+
+ if (!hw_ctr_mask)
+ return -EINVAL;
- /* Select only hpmcounters */
- hw_ctr_mask = cmask & (~0x7);
- hw_ctr_mask &= ~(cpuc->used_hw_ctrs[0]);
return __ffs(hw_ctr_mask);
}
@@ -1547,10 +1610,6 @@ static int rvpmu_deleg_ctr_get_idx(struct perf_event *event)
u64 priv_filter;
int idx;
- /*
- * TODO: We should not rely on SBI Perf encoding to check if the event
- * is a fixed one or not.
- */
if (!is_sampling_event(event)) {
idx = get_deleg_fixed_hw_idx(cpuc, event);
if (idx == 0 || idx == 2) {
@@ -1570,10 +1629,14 @@ static int rvpmu_deleg_ctr_get_idx(struct perf_event *event)
goto out_err;
found_idx:
priv_filter = get_deleg_priv_filter_bits(event);
+ if (test_and_set_bit(idx, cpuc->used_hw_ctrs))
+ goto out_err;
update_deleg_hpmevent(idx, hwc->config, priv_filter);
+ return idx;
skip_update:
- if (!test_and_set_bit(idx, cpuc->used_hw_ctrs))
- return idx;
+ if (test_and_set_bit(idx, cpuc->used_hw_ctrs))
+ goto out_err;
+ return idx;
out_err:
return -ENOENT;
}
diff --git a/include/linux/perf/riscv_pmu.h b/include/linux/perf/riscv_pmu.h
index 3c64151cb038..b23b71cb4e66 100644
--- a/include/linux/perf/riscv_pmu.h
+++ b/include/linux/perf/riscv_pmu.h
@@ -30,6 +30,8 @@
#define RISCV_PMU_CONFIG1_GUEST_EVENTS 0x1
#define RISCV_PMU_DELEG_RAW_EVENT_MASK GENMASK_ULL(55, 0)
+#define RISCV_PMU_CYCLE_FIXED_CTR_MASK 0x01
+#define RISCV_PMU_INSTRUCTION_FIXED_CTR_MASK 0x04
struct cpu_hw_events {
/* currently enabled events */
--
2.53.0-Meta
* Re: [PATCH v8 16/22] RISC-V: perf: Use config2/vendor table for event to counter mapping
2026-07-01 8:47 ` [PATCH v8 16/22] RISC-V: perf: Use config2/vendor table for event to counter mapping Atish Patra
@ 2026-07-01 9:35 ` sashiko-bot
0 siblings, 0 replies; 34+ messages in thread
From: sashiko-bot @ 2026-07-01 9:35 UTC (permalink / raw)
To: Atish Patra; +Cc: robh, linux-perf-users, devicetree, conor+dt
Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
- [High] Legacy perf tools fail to profile hardware cycle and instruction events on platforms with counter delegation, rendering fixed counters and vendor mappings unusable.
- [High] Standard hardware events (PERF_TYPE_HARDWARE) fail unconditionally on platforms supporting counter delegation unless a vendor-specific event table is present.
- [Medium] The 64-bit `event->attr.config2` mask is truncated to a 32-bit integer `temp_mask`, causing invalid user constraints to be silently bypassed.
--
commit 25bb8be42a29d054e5ce833c3ed3ff8935990ee1
Author: Atish Patra <atishp@rivosinc.com>
RISC-V: perf: Use config2/vendor table for event to counter mapping
This commit allows platform vendors to define custom mappings between
performance events and hardware counters via the config2 attribute. Legacy
events can either map to ISA-defined fixed counters or fall back to
alternate programmable counters as specified by the platform driver.
> diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c
> index fcf8fbb6fd864..19d9e47504247 100644
> --- a/drivers/perf/riscv_pmu_sbi.c
> +++ b/drivers/perf/riscv_pmu_sbi.c
[ ... ]
> @@ -1501,24 +1503,85 @@ static int rvpmu_deleg_find_ctrs(void)
> return num_hw_ctr;
> }
>
> +/*
> + * The json file must correctly specify counter 0 or counter 2 is available
> + * in the counter lists for cycle/instret events. Otherwise, the drivers have
> + * no way to figure out if a fixed counter must be used and pick a programmable
> + * counter if available.
> + */
> static int get_deleg_fixed_hw_idx(struct cpu_hw_events *cpuc, struct perf_event *event)
> {
> - return -EINVAL;
> + bool guest_events = event->attr.config1 & RISCV_PMU_CONFIG1_GUEST_EVENTS;
> + int idx;
> +
> + /* event_base is 0 on the delegation path; match via the original perf attrs. */
> + if (guest_events) {
> + if (event->attr.type != PERF_TYPE_HARDWARE)
> + return -EINVAL;
> + if (event->attr.config == PERF_COUNT_HW_CPU_CYCLES)
> + idx = 0; /* CY counter */
> + else if (event->attr.config == PERF_COUNT_HW_INSTRUCTIONS)
> + idx = 2; /* IR counter */
> + else
> + return -EINVAL;
> + } else if (event->attr.config2 & RISCV_PMU_CYCLE_FIXED_CTR_MASK) {
> + idx = 0; /* CY counter */
> + } else if (event->attr.config2 & RISCV_PMU_INSTRUCTION_FIXED_CTR_MASK) {
> + idx = 2; /* IR counter */
> + } else {
> + return -EINVAL;
> + }
[Severity: High]
Does this break legacy perf tools (like 'perf stat -e cycles') on platforms
with counter delegation?
Older tools don't set config2, so legacy events without config2 set are
explicitly rejected here. The code then falls back to
get_deleg_next_hpm_hw_idx(), which excludes fixed counters:
hw_ctr_mask = cmask & (~0x7) & ~(cpuc->used_hw_ctrs[0]);
If the vendor table mapped the event to a fixed counter, the mask evaluation
will result in 0, causing the allocation to fail with -EINVAL.
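The failure mode described above can be reproduced with a small userspace
model of the allocation step. This is only a sketch: pick_hpm_counter() and
FIXED_CTR_MASK are hypothetical names, the function returns -1 where the
driver returns -EINVAL, and the lowest-set-bit loop stands in for the
kernel's __ffs().

```c
#include <stdint.h>

/* Hypothetical stand-in for the driver's allocation step; not kernel code. */
#define FIXED_CTR_MASK 0x7u	/* counters 0-2 are never picked on this path */

/*
 * Return the lowest free programmable counter permitted by event_mask,
 * or -1 if none qualifies. An event_mask of 0 means "no constraint".
 */
static int pick_hpm_counter(uint32_t delegated, uint32_t used, uint32_t event_mask)
{
	uint32_t candidates = delegated & ~FIXED_CTR_MASK & ~used;
	int idx = 0;

	if (event_mask)
		candidates &= event_mask;
	if (!candidates)
		return -1;

	/* Index of the lowest set bit, like the kernel's __ffs(). */
	while (!(candidates & 1u)) {
		candidates >>= 1;
		idx++;
	}
	return idx;
}
```

With every counter delegated and none in use, an event mask of 0x1 (fixed
counter 0 only) leaves no candidates and the call returns -1, while a mask
of 0xFFFFFFF8 returns counter 3.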
> +
> + /* Take the fixed counter only if delegated and free, else fall back. */
> + if (!(cmask & BIT(idx)) || test_bit(idx, cpuc->used_hw_ctrs))
> + return -EINVAL;
> +
> + return idx;
> }
>
> static int get_deleg_next_hpm_hw_idx(struct cpu_hw_events *cpuc, struct perf_event *event)
> {
> - unsigned long hw_ctr_mask = 0;
> + u32 hw_ctr_mask = 0, temp_mask = 0;
> + u32 type = event->attr.type;
> + u64 config = event->attr.config;
> + int ret;
>
> - /*
> - * TODO: Treat every hpmcounter can monitor every event for now.
> - * The event to counter mapping should come from the json file.
> - * The mapping should also tell if sampling is supported or not.
> - */
> + /* Select only available hpmcounters */
> + hw_ctr_mask = cmask & (~0x7) & ~(cpuc->used_hw_ctrs[0]);
> +
> + switch (type) {
> + case PERF_TYPE_HARDWARE:
> + temp_mask = current_pmu_hw_event_map[config].counter_mask;
[Severity: High]
Do standard hardware events (PERF_TYPE_HARDWARE) fail unconditionally on
platforms supporting counter delegation if a vendor-specific event table is
missing?
In rvpmu_event_map(), it appears that all non-firmware events are intercepted
when counter delegation is available:
if (riscv_pmu_cdeleg_available() && !pmu_sbi_is_fw_event(event))
return rvpmu_cdeleg_event_map(event, econfig);
Then in rvpmu_cdeleg_event_map(), it hard-fails standard hardware events if
the vendor table is missing:
case PERF_TYPE_HARDWARE:
if (config >= PERF_COUNT_HW_MAX)
return -EINVAL;
if (!current_pmu_hw_event_map)
return -ENOENT;
Without a fallback to the SBI PMU, does this completely break legacy hardware
events on new platforms that don't yet have a vendor-specific driver table?
> + break;
> + case PERF_TYPE_HW_CACHE:
> + ret = cdeleg_pmu_event_find_cache(config, NULL, &temp_mask);
> + if (ret)
> + return ret;
> + break;
> + case PERF_TYPE_RAW:
> + /*
> + * Mask off the counters that can't monitor this event (specified via json)
> + * The counter mask for this event is set in config2 via the property 'Counter'
> + * in the json file or manual configuration of config2. If the config2 is not set,
> + * it is assumed all the available hpmcounters can monitor this event.
> + * Note: This assumption may fail for the virtualization use case where the hypervisor
> + * (e.g. KVM) virtualizes the counter. Any event to counter mapping provided by the
> + * guest is meaningless from a hypervisor perspective. Thus, the hypervisor doesn't
> + * set config2 when creating a kernel counter and relies on the default host mapping.
> + */
> + if (event->attr.config2)
> + temp_mask = event->attr.config2;
[Severity: Medium]
Will this silently bypass user-provided counter constraints if only bits above
31 are set in config2?
event->attr.config2 is a u64, but temp_mask is defined as a u32. When config2
is assigned to temp_mask, the upper 32 bits are truncated. If the resulting
temp_mask evaluates to 0, it skips the constraint application entirely:
if (temp_mask)
hw_ctr_mask &= temp_mask;
Would this result in the kernel incorrectly programming the event on any
available programmable counter instead of rejecting the invalid constraint?
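The truncation can be shown in isolation with a userspace sketch. Both
function names below are hypothetical. The first models the narrowing
assignment quoted above. The second is one possible stricter policy,
written here as an assumption for illustration and not a fix proposed
in this thread.

```c
#include <stdint.h>

/* Hypothetical model of the RAW-event constraint step; not kernel code. */
static uint32_t apply_constraint_truncating(uint32_t available, uint64_t config2)
{
	uint32_t temp_mask = (uint32_t)config2;	/* upper 32 bits are dropped */

	if (temp_mask)
		available &= temp_mask;
	return available;
}

/*
 * Assumed alternative: test the full 64-bit value before narrowing, and
 * treat any bit above 31 as a constraint that cannot be satisfied.
 */
static uint32_t apply_constraint_checked(uint32_t available, uint64_t config2)
{
	if (!config2)
		return available;	/* no constraint requested */
	if (config2 >> 32)
		return 0;		/* names counters outside bits 0-31 */
	return available & (uint32_t)config2;
}
```

For config2 = 0x100000000 and an available mask of 0xFFFFFFF8, the
truncating version returns 0xFFFFFFF8 (constraint ignored), while the
checked version returns 0, which a caller would turn into an error.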
> + break;
> + default:
> + break;
> + }
> +
> + if (temp_mask)
> + hw_ctr_mask &= temp_mask;
> +
> + if (!hw_ctr_mask)
> + return -EINVAL;
>
> - /* Select only hpmcounters */
> - hw_ctr_mask = cmask & (~0x7);
> - hw_ctr_mask &= ~(cpuc->used_hw_ctrs[0]);
> return __ffs(hw_ctr_mask);
> }
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260701-counter_delegation-v8-0-7909f863a645@meta.com?part=16
* [PATCH v8 17/22] RISC-V: perf: Add legacy event encodings via sysfs
2026-07-01 8:46 [PATCH v8 00/22] Add Counter delegation ISA extension support Atish Patra
` (15 preceding siblings ...)
2026-07-01 8:47 ` [PATCH v8 16/22] RISC-V: perf: Use config2/vendor table for event to counter mapping Atish Patra
@ 2026-07-01 8:47 ` Atish Patra
2026-07-01 8:47 ` [PATCH v8 18/22] RISC-V: perf: Add Qemu virt machine events Atish Patra
` (4 subsequent siblings)
21 siblings, 0 replies; 34+ messages in thread
From: Atish Patra @ 2026-07-01 8:47 UTC (permalink / raw)
To: Jiri Olsa, Paul Walmsley, Mark Rutland, Rob Herring, Anup Patel,
Namhyung Kim, Arnaldo Carvalho de Melo, Krzysztof Kozlowski,
Atish Patra, Ian Rogers, Will Deacon, James Clark
Cc: linux-arm-kernel, linux-riscv, linux-kernel, devicetree,
linux-perf-users, Conor Dooley
From: Atish Patra <atishp@rivosinc.com>
Define sysfs details for the legacy events so that any tool can
parse these to understand the minimum set of legacy events
supported by the platform. The sysfs entry will describe both event
encoding and the corresponding counter map so that a perf event can be
programmed accordingly.
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
drivers/perf/riscv_pmu_sbi.c | 22 ++++++++++++++++++++--
1 file changed, 20 insertions(+), 2 deletions(-)
diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c
index 19d9e4750424..8d56bef95a1b 100644
--- a/drivers/perf/riscv_pmu_sbi.c
+++ b/drivers/perf/riscv_pmu_sbi.c
@@ -131,7 +131,20 @@ static struct attribute_group riscv_cdeleg_pmu_format_group = {
.attrs = riscv_cdeleg_pmu_formats_attr,
};
+#define RVPMU_EVENT_ATTR_RESOLVE(m) #m
+#define RVPMU_EVENT_CMASK_ATTR(_name, _var, config, mask) \
+ PMU_EVENT_ATTR_STRING(_name, rvpmu_event_attr_##_var, \
+ "event=" RVPMU_EVENT_ATTR_RESOLVE(config) \
+ ",counterid_mask=" RVPMU_EVENT_ATTR_RESOLVE(mask))
+
+#define RVPMU_EVENT_ATTR_PTR(name) (&rvpmu_event_attr_##name.attr.attr)
+
+static struct attribute_group riscv_cdeleg_pmu_event_group __ro_after_init = {
+ .name = "events",
+};
+
static const struct attribute_group *riscv_cdeleg_pmu_attr_groups[] = {
+ &riscv_cdeleg_pmu_event_group,
&riscv_cdeleg_pmu_format_group,
NULL,
};
@@ -447,11 +460,14 @@ struct riscv_vendor_pmu_events {
const struct riscv_pmu_event *hw_event_map;
const struct riscv_pmu_event (*cache_event_map)[PERF_COUNT_HW_CACHE_OP_MAX]
[PERF_COUNT_HW_CACHE_RESULT_MAX];
+ struct attribute **attrs_events;
};
-#define RISCV_VENDOR_PMU_EVENTS(_vendorid, _archid, _implid, _hw_event_map, _cache_event_map) \
+#define RISCV_VENDOR_PMU_EVENTS(_vendorid, _archid, _implid, _hw_event_map, \
+ _cache_event_map, _attrs) \
{ .vendorid = _vendorid, .archid = _archid, .implid = _implid, \
- .hw_event_map = _hw_event_map, .cache_event_map = _cache_event_map },
+ .hw_event_map = _hw_event_map, .cache_event_map = _cache_event_map, \
+ .attrs_events = _attrs },
static struct riscv_vendor_pmu_events pmu_vendor_events_table[] = {
};
@@ -473,6 +489,8 @@ static void __init rvpmu_vendor_register_events(void)
pmu_vendor_events_table[i].archid == arch_id) {
current_pmu_hw_event_map = pmu_vendor_events_table[i].hw_event_map;
current_pmu_cache_event_map = pmu_vendor_events_table[i].cache_event_map;
+ riscv_cdeleg_pmu_event_group.attrs =
+ pmu_vendor_events_table[i].attrs_events;
break;
}
}
--
2.53.0-Meta
* [PATCH v8 18/22] RISC-V: perf: Add Qemu virt machine events
2026-07-01 8:46 [PATCH v8 00/22] Add Counter delegation ISA extension support Atish Patra
` (16 preceding siblings ...)
2026-07-01 8:47 ` [PATCH v8 17/22] RISC-V: perf: Add legacy event encodings via sysfs Atish Patra
@ 2026-07-01 8:47 ` Atish Patra
2026-07-01 8:47 ` [PATCH v8 19/22] tools/perf: Support event code for arch standard events Atish Patra
` (3 subsequent siblings)
21 siblings, 0 replies; 34+ messages in thread
From: Atish Patra @ 2026-07-01 8:47 UTC (permalink / raw)
To: Jiri Olsa, Paul Walmsley, Mark Rutland, Rob Herring, Anup Patel,
Namhyung Kim, Arnaldo Carvalho de Melo, Krzysztof Kozlowski,
Atish Patra, Ian Rogers, Will Deacon, James Clark
Cc: linux-arm-kernel, linux-riscv, linux-kernel, devicetree,
linux-perf-users, Conor Dooley
From: Atish Patra <atishp@rivosinc.com>
Qemu virt machine supports a very minimal set of legacy perf events.
Add them to the vendor table so that users can use them when
counter delegation is enabled.
Qemu is identified by its marchid. Older Qemu reports all-zero
mvendorid/marchid/mimpid, while newer Qemu reports the marchid 0x2a (42)
allocated to it in the RISC-V ISA manual [1]. Register the events for
both ids so they are available across Qemu versions.
[1] https://github.com/riscv/riscv-isa-manual/blob/main/marchid.md
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
arch/riscv/include/asm/vendorid_list.h | 6 ++++++
drivers/perf/riscv_pmu_sbi.c | 39 ++++++++++++++++++++++++++++++++++
2 files changed, 45 insertions(+)
diff --git a/arch/riscv/include/asm/vendorid_list.h b/arch/riscv/include/asm/vendorid_list.h
index 7f5030ee1fcf..beaf9236dba7 100644
--- a/arch/riscv/include/asm/vendorid_list.h
+++ b/arch/riscv/include/asm/vendorid_list.h
@@ -11,4 +11,10 @@
#define SIFIVE_VENDOR_ID 0x489
#define THEAD_VENDOR_ID 0x5b7
+#define QEMU_VIRT_VENDOR_ID 0x000
+#define QEMU_VIRT_IMPL_ID 0x000
+#define QEMU_VIRT_ARCH_ID 0x000
+/* Newer Qemu reports the spec-allocated marchid 0x2a (42) for non-vendor CPUs */
+#define QEMU_VIRT_ARCH_ID_SPEC 0x2a
+
#endif
diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c
index 8d56bef95a1b..6d528eafb525 100644
--- a/drivers/perf/riscv_pmu_sbi.c
+++ b/drivers/perf/riscv_pmu_sbi.c
@@ -27,6 +27,7 @@
#include <asm/sbi.h>
#include <asm/cpufeature.h>
#include <asm/vendor_extensions.h>
+#include <asm/vendorid_list.h>
#include <asm/vendor_extensions/andes.h>
#include <asm/hwcap.h>
#include <asm/csr_ind.h>
@@ -469,7 +470,45 @@ struct riscv_vendor_pmu_events {
.hw_event_map = _hw_event_map, .cache_event_map = _cache_event_map, \
.attrs_events = _attrs },
+/* QEMU virt PMU events */
+static const struct riscv_pmu_event qemu_virt_hw_event_map[PERF_COUNT_HW_MAX] = {
+ PERF_MAP_ALL_UNSUPPORTED,
+ [PERF_COUNT_HW_CPU_CYCLES] = {0x01, 0xFFFFFFF8},
+ [PERF_COUNT_HW_INSTRUCTIONS] = {0x02, 0xFFFFFFF8}
+};
+
+static const struct riscv_pmu_event qemu_virt_cache_event_map[PERF_COUNT_HW_CACHE_MAX]
+ [PERF_COUNT_HW_CACHE_OP_MAX]
+ [PERF_COUNT_HW_CACHE_RESULT_MAX] = {
+ PERF_CACHE_MAP_ALL_UNSUPPORTED,
+ [C(DTLB)][C(OP_READ)][C(RESULT_MISS)] = {0x10019, 0xFFFFFFF8},
+ [C(DTLB)][C(OP_WRITE)][C(RESULT_MISS)] = {0x1001B, 0xFFFFFFF8},
+
+ [C(ITLB)][C(OP_READ)][C(RESULT_MISS)] = {0x10021, 0xFFFFFFF8},
+};
+
+RVPMU_EVENT_CMASK_ATTR(cycles, cycles, 0x01, 0xFFFFFFF8);
+RVPMU_EVENT_CMASK_ATTR(instructions, instructions, 0x02, 0xFFFFFFF8);
+RVPMU_EVENT_CMASK_ATTR(dTLB-load-misses, dTLB_load_miss, 0x10019, 0xFFFFFFF8);
+RVPMU_EVENT_CMASK_ATTR(dTLB-store-misses, dTLB_store_miss, 0x1001B, 0xFFFFFFF8);
+RVPMU_EVENT_CMASK_ATTR(iTLB-load-misses, iTLB_load_miss, 0x10021, 0xFFFFFFF8);
+
+static struct attribute *qemu_virt_event_group[] = {
+ RVPMU_EVENT_ATTR_PTR(cycles),
+ RVPMU_EVENT_ATTR_PTR(instructions),
+ RVPMU_EVENT_ATTR_PTR(dTLB_load_miss),
+ RVPMU_EVENT_ATTR_PTR(dTLB_store_miss),
+ RVPMU_EVENT_ATTR_PTR(iTLB_load_miss),
+ NULL,
+};
+
static struct riscv_vendor_pmu_events pmu_vendor_events_table[] = {
+ RISCV_VENDOR_PMU_EVENTS(QEMU_VIRT_VENDOR_ID, QEMU_VIRT_ARCH_ID, QEMU_VIRT_IMPL_ID,
+ qemu_virt_hw_event_map, qemu_virt_cache_event_map,
+ qemu_virt_event_group)
+ RISCV_VENDOR_PMU_EVENTS(QEMU_VIRT_VENDOR_ID, QEMU_VIRT_ARCH_ID_SPEC, QEMU_VIRT_IMPL_ID,
+ qemu_virt_hw_event_map, qemu_virt_cache_event_map,
+ qemu_virt_event_group)
};
static const struct riscv_pmu_event *current_pmu_hw_event_map;
--
2.53.0-Meta
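The two table entries registered by the patch above can be pictured with a
simplified userspace lookup. This is a sketch under assumptions: struct
vendor_entry and find_vendor_entry() are hypothetical, and matching on all
three ids is assumed, since only the archid comparison is visible in the
quoted hunks.

```c
#include <stddef.h>

/* Hypothetical, simplified model of a vendor event table entry. */
struct vendor_entry {
	unsigned long vendorid;
	unsigned long archid;
	unsigned long implid;
	const char *name;
};

static const struct vendor_entry table[] = {
	{ 0x0, 0x00, 0x0, "qemu-virt (older Qemu, all-zero ids)" },
	{ 0x0, 0x2a, 0x0, "qemu-virt (marchid 0x2a)" },
};

/* Return the first entry whose three ids all match, or NULL if none does. */
static const struct vendor_entry *find_vendor_entry(unsigned long vendorid,
						    unsigned long archid,
						    unsigned long implid)
{
	for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		if (table[i].vendorid == vendorid &&
		    table[i].archid == archid &&
		    table[i].implid == implid)
			return &table[i];
	}
	return NULL;
}
```

Both the all-zero ids and marchid 0x2a resolve to an entry, so the same
event maps apply across Qemu versions. Any other id triple returns NULL,
which in this model corresponds to leaving the event maps unset.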
* [PATCH v8 19/22] tools/perf: Support event code for arch standard events
2026-07-01 8:46 [PATCH v8 00/22] Add Counter delegation ISA extension support Atish Patra
` (17 preceding siblings ...)
2026-07-01 8:47 ` [PATCH v8 18/22] RISC-V: perf: Add Qemu virt machine events Atish Patra
@ 2026-07-01 8:47 ` Atish Patra
2026-07-01 17:44 ` Ian Rogers
2026-07-01 8:47 ` [PATCH v8 20/22] tools/perf: Add RISC-V CounterIDMask event field Atish Patra
` (2 subsequent siblings)
21 siblings, 1 reply; 34+ messages in thread
From: Atish Patra @ 2026-07-01 8:47 UTC (permalink / raw)
To: Jiri Olsa, Paul Walmsley, Mark Rutland, Rob Herring, Anup Patel,
Namhyung Kim, Arnaldo Carvalho de Melo, Krzysztof Kozlowski,
Atish Patra, Ian Rogers, Will Deacon, James Clark
Cc: linux-arm-kernel, linux-riscv, linux-kernel, devicetree,
linux-perf-users, Conor Dooley
From: Atish Patra <atishp@rivosinc.com>
RISC-V relies on the event encoding from the json file, including for
arch standard events. If an event code is present, the event has already
been updated with the correct encoding. Updating it again from the arch
standard event would lose that encoding, so skip the update in that case.
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
tools/perf/pmu-events/arch/riscv/arch-standard.json | 10 ++++++++++
tools/perf/pmu-events/jevents.py | 9 ++++++++-
2 files changed, 18 insertions(+), 1 deletion(-)
diff --git a/tools/perf/pmu-events/arch/riscv/arch-standard.json b/tools/perf/pmu-events/arch/riscv/arch-standard.json
new file mode 100644
index 000000000000..96e21f088558
--- /dev/null
+++ b/tools/perf/pmu-events/arch/riscv/arch-standard.json
@@ -0,0 +1,10 @@
+[
+ {
+ "EventName": "cycles",
+ "BriefDescription": "cycle executed"
+ },
+ {
+ "EventName": "instructions",
+ "BriefDescription": "instruction retired"
+ }
+]
diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index 3a1bcdcdc685..0cf9d26315b3 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -413,7 +413,14 @@ class JsonEvent:
self.long_desc = None
if arch_std:
if arch_std.lower() in _arch_std_events:
- event = _arch_std_events[arch_std.lower()].event
+ # Inherit the arch-standard encoding only if this event defines no
+ # explicit encoding of its own. Events with explicit EventCode,
+ # ConfigCode, etc. may carry alternate encodings and appended modifiers
+ # that must survive.
+ if ('EventCode' not in jd and 'ExtSel' not in jd and
+ configcode is None and eventidcode is None and
+ legacy_hw_config is None and legacy_cache_config is None):
+ event = _arch_std_events[arch_std.lower()].event
# Copy from the architecture standard event to self for undefined fields.
for attr, value in _arch_std_events[arch_std.lower()].__dict__.items():
if hasattr(self, attr) and not getattr(self, attr):
--
2.53.0-Meta
* Re: [PATCH v8 19/22] tools/perf: Support event code for arch standard events
2026-07-01 8:47 ` [PATCH v8 19/22] tools/perf: Support event code for arch standard events Atish Patra
@ 2026-07-01 17:44 ` Ian Rogers
0 siblings, 0 replies; 34+ messages in thread
From: Ian Rogers @ 2026-07-01 17:44 UTC (permalink / raw)
To: Atish Patra
Cc: Jiri Olsa, Paul Walmsley, Mark Rutland, Rob Herring, Anup Patel,
Namhyung Kim, Arnaldo Carvalho de Melo, Krzysztof Kozlowski,
Will Deacon, James Clark, linux-arm-kernel, linux-riscv,
linux-kernel, devicetree, linux-perf-users, Conor Dooley
On Wed, Jul 1, 2026 at 1:48 AM Atish Patra <atish.patra@linux.dev> wrote:
>
> From: Atish Patra <atishp@rivosinc.com>
>
> RISC-V relies on the event encoding from the json file. That includes
> arch standard events. If event code is present, event is already updated
> with correct encoding. No need to update it again which results in losing
> the event encoding.
>
> Signed-off-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Thanks,
Ian
> ---
> tools/perf/pmu-events/arch/riscv/arch-standard.json | 10 ++++++++++
> tools/perf/pmu-events/jevents.py | 9 ++++++++-
> 2 files changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/pmu-events/arch/riscv/arch-standard.json b/tools/perf/pmu-events/arch/riscv/arch-standard.json
> new file mode 100644
> index 000000000000..96e21f088558
> --- /dev/null
> +++ b/tools/perf/pmu-events/arch/riscv/arch-standard.json
> @@ -0,0 +1,10 @@
> +[
> + {
> + "EventName": "cycles",
> + "BriefDescription": "cycle executed"
> + },
> + {
> + "EventName": "instructions",
> + "BriefDescription": "instruction retired"
> + }
> +]
> diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
> index 3a1bcdcdc685..0cf9d26315b3 100755
> --- a/tools/perf/pmu-events/jevents.py
> +++ b/tools/perf/pmu-events/jevents.py
> @@ -413,7 +413,14 @@ class JsonEvent:
> self.long_desc = None
> if arch_std:
> if arch_std.lower() in _arch_std_events:
> - event = _arch_std_events[arch_std.lower()].event
> + # Inherit the arch-standard encoding only if this event defines no
> + # explicit encoding of its own. Events with explicit EventCode,
> + # ConfigCode, etc. may carry alternate encodings and appended modifiers
> + # that must survive.
> + if ('EventCode' not in jd and 'ExtSel' not in jd and
> + configcode is None and eventidcode is None and
> + legacy_hw_config is None and legacy_cache_config is None):
> + event = _arch_std_events[arch_std.lower()].event
> # Copy from the architecture standard event to self for undefined fields.
> for attr, value in _arch_std_events[arch_std.lower()].__dict__.items():
> if hasattr(self, attr) and not getattr(self, attr):
>
> --
> 2.53.0-Meta
>
* [PATCH v8 20/22] tools/perf: Add RISC-V CounterIDMask event field
2026-07-01 8:46 [PATCH v8 00/22] Add Counter delegation ISA extension support Atish Patra
` (18 preceding siblings ...)
2026-07-01 8:47 ` [PATCH v8 19/22] tools/perf: Support event code for arch standard events Atish Patra
@ 2026-07-01 8:47 ` Atish Patra
2026-07-01 17:44 ` Ian Rogers
2026-07-01 8:47 ` [PATCH v8 21/22] TEST(do-not-upstream): fake qemu-virt PMU events for cdeleg counter-mask testing Atish Patra
2026-07-01 8:47 ` [PATCH v8 22/22] TEST(do-not-upstream): fake qemu vendor JSON + mapfile entry for CounterIDMask path Atish Patra
21 siblings, 1 reply; 34+ messages in thread
From: Atish Patra @ 2026-07-01 8:47 UTC (permalink / raw)
To: Jiri Olsa, Paul Walmsley, Mark Rutland, Rob Herring, Anup Patel,
Namhyung Kim, Arnaldo Carvalho de Melo, Krzysztof Kozlowski,
Atish Patra, Ian Rogers, Will Deacon, James Clark
Cc: linux-arm-kernel, linux-riscv, linux-kernel, devicetree,
linux-perf-users, Conor Dooley
From: Atish Patra <atishp@rivosinc.com>
Counter delegation lets supervisor mode choose the hpmcounter for an event,
but the hardware may only allow a given event on a subset of counters. Add
a RISC-V specific "CounterIDMask" json event field, handled like the other
arch-specific entries in event_fields[], that carries the allowed-counter
bitmask through to the driver's existing counterid_mask (config2:0-31)
format.
The value is the bitmask directly so no counter-list to bitmask
conversion is needed, and because the field is RISC-V specific it is a
no-op for every other architecture's events (unlike the shared "Counter"
field).
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
tools/perf/pmu-events/jevents.py | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index 0cf9d26315b3..516fb73886ed 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -396,6 +396,7 @@ class JsonEvent:
('EnAllSlices', 'enallslices='),
('SliceId', 'sliceid='),
('ThreadMask', 'threadmask='),
+ ('CounterIDMask', 'counterid_mask='),
]
for key, value in event_fields:
if key in jd and not is_zero(jd[key]):
--
2.53.0-Meta
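Because CounterIDMask carries the bitmask itself, whoever writes a json
entry has to derive the value from the list of counters the event may use.
A minimal sketch of that arithmetic follows. counter_ids_to_mask() is a
hypothetical helper and is not part of perf or the driver. Ids of 32 and
above are skipped because the counterid_mask format covers config2:0-31.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: build an allowed-counter bitmask from counter ids. */
static uint32_t counter_ids_to_mask(const unsigned int *ids, size_t n)
{
	uint32_t mask = 0;

	for (size_t i = 0; i < n; i++) {
		if (ids[i] < 32)	/* only bits 0-31 fit in counterid_mask */
			mask |= UINT32_C(1) << ids[i];
	}
	return mask;
}
```

Counter 3 alone gives 0x8, counters 3 and 4 give 0x18, and counters 3
through 31 give 0xFFFFFFF8, which are the mask values used elsewhere in
this series.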
* Re: [PATCH v8 20/22] tools/perf: Add RISC-V CounterIDMask event field
2026-07-01 8:47 ` [PATCH v8 20/22] tools/perf: Add RISC-V CounterIDMask event field Atish Patra
@ 2026-07-01 17:44 ` Ian Rogers
0 siblings, 0 replies; 34+ messages in thread
From: Ian Rogers @ 2026-07-01 17:44 UTC (permalink / raw)
To: Atish Patra
Cc: Jiri Olsa, Paul Walmsley, Mark Rutland, Rob Herring, Anup Patel,
Namhyung Kim, Arnaldo Carvalho de Melo, Krzysztof Kozlowski,
Will Deacon, James Clark, linux-arm-kernel, linux-riscv,
linux-kernel, devicetree, linux-perf-users, Conor Dooley
On Wed, Jul 1, 2026 at 1:48 AM Atish Patra <atish.patra@linux.dev> wrote:
>
> From: Atish Patra <atishp@rivosinc.com>
>
> Counter delegation lets supervisor mode choose the hpmcounter for an event,
> but the hardware may only allow a given event on a subset of counters. Add
> a RISC-V specific "CounterIDMask" json event field, handled like the other
> arch-specific entries in event_fields[], that carries the allowed-counter
> bitmask through to the driver's existing counterid_mask (config2:0-31)
> format.
>
> The value is the bitmask directly so no counter-list to bitmask
> conversion is needed, and because the field is RISC-V specific it is a
> no-op for every other architecture's events (unlike the shared "Counter"
> field).
>
> Signed-off-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Thanks,
Ian
> ---
> tools/perf/pmu-events/jevents.py | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
> index 0cf9d26315b3..516fb73886ed 100755
> --- a/tools/perf/pmu-events/jevents.py
> +++ b/tools/perf/pmu-events/jevents.py
> @@ -396,6 +396,7 @@ class JsonEvent:
> ('EnAllSlices', 'enallslices='),
> ('SliceId', 'sliceid='),
> ('ThreadMask', 'threadmask='),
> + ('CounterIDMask', 'counterid_mask='),
> ]
> for key, value in event_fields:
> if key in jd and not is_zero(jd[key]):
>
> --
> 2.53.0-Meta
>
* [PATCH v8 21/22] TEST(do-not-upstream): fake qemu-virt PMU events for cdeleg counter-mask testing
2026-07-01 8:46 [PATCH v8 00/22] Add Counter delegation ISA extension support Atish Patra
` (19 preceding siblings ...)
2026-07-01 8:47 ` [PATCH v8 20/22] tools/perf: Add RISC-V CounterIDMask event field Atish Patra
@ 2026-07-01 8:47 ` Atish Patra
2026-07-01 9:38 ` sashiko-bot
2026-07-01 8:47 ` [PATCH v8 22/22] TEST(do-not-upstream): fake qemu vendor JSON + mapfile entry for CounterIDMask path Atish Patra
21 siblings, 1 reply; 34+ messages in thread
From: Atish Patra @ 2026-07-01 8:47 UTC (permalink / raw)
To: Jiri Olsa, Paul Walmsley, Mark Rutland, Rob Herring, Anup Patel,
Namhyung Kim, Arnaldo Carvalho de Melo, Krzysztof Kozlowski,
Atish Patra, Ian Rogers, Will Deacon, James Clark
Cc: linux-arm-kernel, linux-riscv, linux-kernel, devicetree,
linux-perf-users, Conor Dooley
From: Atish Patra <atishp@meta.com>
Add fake-any, fake-ctr3 and fake-ctr34 events (event codes 0xF00-0xF02, which
QEMU does not model) with counterid masks, to exercise counter-delegation
allocation and the counter-mask constraint in QEMU. An event that reads 0 was
allocated and programmed, as opposed to being reported as 'not supported'.
Signed-off-by: Atish Patra <atishp@meta.com>
---
drivers/perf/riscv_pmu_sbi.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c
index 6d528eafb525..725816c274e5 100644
--- a/drivers/perf/riscv_pmu_sbi.c
+++ b/drivers/perf/riscv_pmu_sbi.c
@@ -492,6 +492,12 @@ RVPMU_EVENT_CMASK_ATTR(instructions, instructions, 0x02, 0xFFFFFFF8);
RVPMU_EVENT_CMASK_ATTR(dTLB-load-misses, dTLB_load_miss, 0x10019, 0xFFFFFFF8);
RVPMU_EVENT_CMASK_ATTR(dTLB-store-misses, dTLB_store_miss, 0x1001B, 0xFFFFFFF8);
RVPMU_EVENT_CMASK_ATTR(iTLB-load-misses, iTLB_load_miss, 0x10021, 0xFFFFFFF8);
+/*
+ * FAKE events for cdeleg mechanism testing: event codes QEMU does NOT model.
+ */
+RVPMU_EVENT_CMASK_ATTR(fake-any, fake_any, 0xF00, 0xFFFFFFF8);
+RVPMU_EVENT_CMASK_ATTR(fake-ctr3, fake_ctr3, 0xF01, 0x8);
+RVPMU_EVENT_CMASK_ATTR(fake-ctr34, fake_ctr34, 0xF02, 0x18);
static struct attribute *qemu_virt_event_group[] = {
RVPMU_EVENT_ATTR_PTR(cycles),
@@ -499,6 +505,9 @@ static struct attribute *qemu_virt_event_group[] = {
RVPMU_EVENT_ATTR_PTR(dTLB_load_miss),
RVPMU_EVENT_ATTR_PTR(dTLB_store_miss),
RVPMU_EVENT_ATTR_PTR(iTLB_load_miss),
+ RVPMU_EVENT_ATTR_PTR(fake_any),
+ RVPMU_EVENT_ATTR_PTR(fake_ctr3),
+ RVPMU_EVENT_ATTR_PTR(fake_ctr34),
NULL,
};
--
2.53.0-Meta
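To read the masks used in the hunk above, a small sketch that decodes a counter mask into counter indices. It assumes bit n of the mask allows hpmcounter n, which matches the "hpmcounter 3-31" wording used for 0xFFFFFFF8 in the JSON descriptions of patch 22. The helper name allowed_counters is hypothetical and does not exist in the driver.

```python
from typing import List


def allowed_counters(mask: int) -> List[int]:
    """Return the counter indices whose bit is set in a 32-bit mask."""
    return [n for n in range(32) if mask & (1 << n)]


# Masks taken from the RVPMU_EVENT_CMASK_ATTR() lines in the patch.
for name, mask in [("fake-any", 0xFFFFFFF8),
                   ("fake-ctr3", 0x8),
                   ("fake-ctr34", 0x18)]:
    print(name, allowed_counters(mask))
```

Under that assumption, fake-ctr3 may only land on counter 3, fake-ctr34 on counter 3 or 4, and fake-any on any of counters 3 through 31.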
^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [PATCH v8 21/22] TEST(do-not-upstream): fake qemu-virt PMU events for cdeleg counter-mask testing
2026-07-01 8:47 ` [PATCH v8 21/22] TEST(do-not-upstream): fake qemu-virt PMU events for cdeleg counter-mask testing Atish Patra
@ 2026-07-01 9:38 ` sashiko-bot
0 siblings, 0 replies; 34+ messages in thread
From: sashiko-bot @ 2026-07-01 9:38 UTC (permalink / raw)
To: Atish Patra; +Cc: devicetree, conor+dt, linux-perf-users, robh
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Medium] The patch injects non-existent fake PMU events for testing directly into a production driver, polluting the sysfs UAPI and userspace ABI.
--
commit 4ecbb9bba705eb0cbe075ec61ec799a9e140a5fc
Author: Atish Patra <atishp@meta.com>
TEST(do-not-upstream): fake qemu-virt PMU events for cdeleg counter-mask testing
This commit adds fake PMU events to the RISC-V SBI PMU driver to exercise
counter-delegation allocation and counter-mask constraints in QEMU.
> diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c
> --- a/drivers/perf/riscv_pmu_sbi.c
> +++ b/drivers/perf/riscv_pmu_sbi.c
[ ... ]
> @@ -499,6 +505,9 @@ static struct attribute *qemu_virt_event_group[] = {
> RVPMU_EVENT_ATTR_PTR(dTLB_load_miss),
> RVPMU_EVENT_ATTR_PTR(dTLB_store_miss),
> RVPMU_EVENT_ATTR_PTR(iTLB_load_miss),
> + RVPMU_EVENT_ATTR_PTR(fake_any),
> + RVPMU_EVENT_ATTR_PTR(fake_ctr3),
> + RVPMU_EVENT_ATTR_PTR(fake_ctr34),
[Severity: Medium]
Even though this patch is explicitly marked as a test that is not meant for
upstream, does unconditionally appending these fake events to
qemu_virt_event_group pollute the sysfs UAPI?
Because they are added directly to the group array, tools interacting with
sysfs will see these non-existent hardware PMU events exposed as part of the
public userspace ABI. Could this logic be exercised using a dedicated test
framework rather than injecting fake test events into a production driver?
> NULL,
> };
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260701-counter_delegation-v8-0-7909f863a645@meta.com?part=21
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH v8 22/22] TEST(do-not-upstream): fake qemu vendor JSON + mapfile entry for CounterIDMask path
2026-07-01 8:46 [PATCH v8 00/22] Add Counter delegation ISA extension support Atish Patra
` (20 preceding siblings ...)
2026-07-01 8:47 ` [PATCH v8 21/22] TEST(do-not-upstream): fake qemu-virt PMU events for cdeleg counter-mask testing Atish Patra
@ 2026-07-01 8:47 ` Atish Patra
2026-07-01 9:34 ` sashiko-bot
21 siblings, 1 reply; 34+ messages in thread
From: Atish Patra @ 2026-07-01 8:47 UTC (permalink / raw)
To: Jiri Olsa, Paul Walmsley, Mark Rutland, Rob Herring, Anup Patel,
Namhyung Kim, Arnaldo Carvalho de Melo, Krzysztof Kozlowski,
Atish Patra, Ian Rogers, Will Deacon, James Clark
Cc: linux-arm-kernel, linux-riscv, linux-kernel, devicetree,
linux-perf-users, Conor Dooley
From: Atish Patra <atishp@meta.com>
Add arch/riscv/qemu/virt/events.json with fake-json-{any,ctr3,ctr34,ctr6}
events, each carrying an EventCode and a CounterIDMask, and map 0x0-0x0-0x0 to
qemu/virt in mapfile.csv. This exercises the full path: jevents CounterIDMask
-> counterid_mask= -> config2 -> cdeleg counter allocation.
Signed-off-by: Atish Patra <atishp@meta.com>
---
tools/perf/pmu-events/arch/riscv/mapfile.csv | 2 ++
.../pmu-events/arch/riscv/qemu/virt/events.json | 26 ++++++++++++++++++++++
2 files changed, 28 insertions(+)
diff --git a/tools/perf/pmu-events/arch/riscv/mapfile.csv b/tools/perf/pmu-events/arch/riscv/mapfile.csv
index 87cfb0e0849f..2fa3c3fd4663 100644
--- a/tools/perf/pmu-events/arch/riscv/mapfile.csv
+++ b/tools/perf/pmu-events/arch/riscv/mapfile.csv
@@ -24,3 +24,5 @@
0x602-0x3-0x0,v1,openhwgroup/cva6,core
0x67e-0x80000000db0000[89]0-0x[[:xdigit:]]+,v1,starfive/dubhe-80,core
0x31e-0x8000000000008a45-0x[[:xdigit:]]+,v1,andes/ax45,core
+0x0-0x0-0x0,v1,qemu/virt,core
+0x0-0x2a-0x0,v1,qemu/virt,core
diff --git a/tools/perf/pmu-events/arch/riscv/qemu/virt/events.json b/tools/perf/pmu-events/arch/riscv/qemu/virt/events.json
new file mode 100644
index 000000000000..294c4ed645f6
--- /dev/null
+++ b/tools/perf/pmu-events/arch/riscv/qemu/virt/events.json
@@ -0,0 +1,26 @@
+[
+ {
+ "EventName": "fake-json-any",
+ "EventCode": "0xF10",
+ "CounterIDMask": "0xFFFFFFF8",
+ "BriefDescription": "FAKE json event (any hpmcounter 3-31) - QEMU does not model 0xF10"
+ },
+ {
+ "EventName": "fake-json-ctr3",
+ "EventCode": "0xF11",
+ "CounterIDMask": "0x8",
+ "BriefDescription": "FAKE json event constrained to hpmcounter3"
+ },
+ {
+ "EventName": "fake-json-ctr34",
+ "EventCode": "0xF12",
+ "CounterIDMask": "0x18",
+ "BriefDescription": "FAKE json event constrained to hpmcounter3,4"
+ },
+ {
+ "EventName": "fake-json-ctr6",
+ "EventCode": "0xF13",
+ "CounterIDMask": "0x40",
+ "BriefDescription": "FAKE json event constrained to hpmcounter6 (out of a small pmu-mask)"
+ }
+]
--
2.53.0-Meta
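The last JSON entry is described as "out of a small pmu-mask". A hedged sketch of what that test case is meant to show: the mask travels in config2 bits 0-31 (per the patch 20 description), and an event can only be placed on a counter that is both allowed by its mask and present on the platform. The functions config2_from_mask and pick_counter, the example platform mask 0x38 and the lowest-free-bit policy are all assumptions for illustration, not the driver's actual allocator.

```python
from typing import Optional

COUNTERID_MASK_BITS = 0xFFFFFFFF  # config2:0-31, per the patch 20 description


def config2_from_mask(counterid_mask: int) -> int:
    """Place the allowed-counter bitmask in the low 32 bits of config2."""
    return counterid_mask & COUNTERID_MASK_BITS


def pick_counter(config2: int, available: int, in_use: int = 0) -> Optional[int]:
    """Pick the lowest counter that is allowed, present and free (assumed policy)."""
    candidates = config2 & COUNTERID_MASK_BITS & available & ~in_use
    if candidates == 0:
        return None  # nothing satisfies the constraint
    lowest = candidates & -candidates  # isolate the lowest set bit
    return lowest.bit_length() - 1


SMALL_PMU_MASK = 0x38  # hypothetical platform exposing only counters 3, 4, 5

print(pick_counter(config2_from_mask(0x8), SMALL_PMU_MASK))               # fake-json-ctr3
print(pick_counter(config2_from_mask(0x18), SMALL_PMU_MASK, in_use=0x8))  # fake-json-ctr34
print(pick_counter(config2_from_mask(0x40), SMALL_PMU_MASK))              # fake-json-ctr6
```

With these assumed inputs the first two calls find counters 3 and 4, while the fake-json-ctr6 call finds no candidate, which is the failure case that entry is meant to provoke.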
^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [PATCH v8 22/22] TEST(do-not-upstream): fake qemu vendor JSON + mapfile entry for CounterIDMask path
2026-07-01 8:47 ` [PATCH v8 22/22] TEST(do-not-upstream): fake qemu vendor JSON + mapfile entry for CounterIDMask path Atish Patra
@ 2026-07-01 9:34 ` sashiko-bot
0 siblings, 0 replies; 34+ messages in thread
From: sashiko-bot @ 2026-07-01 9:34 UTC (permalink / raw)
To: Atish Patra; +Cc: devicetree, conor+dt, linux-perf-users, robh
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [Medium] This patch contains fake test data and is explicitly marked as 'do-not-upstream'.
- [Medium] The patch adds a mapfile entry for '0x0-0x2a-0x0' which is not documented in the commit message.
--
commit 2c14c47c4b1f0c0aa09f9d1287cb457e27674f5b
Author: Atish Patra <atishp@meta.com>
Subject: TEST(do-not-upstream): fake qemu vendor JSON + mapfile entry for CounterIDMask path
This commit adds fake JSON events and mapfile entries for a QEMU virt
environment to test the CounterIDMask path in jevents.
> diff --git a/tools/perf/pmu-events/arch/riscv/mapfile.csv b/tools/perf/pmu-events/arch/riscv/mapfile.csv
> --- a/tools/perf/pmu-events/arch/riscv/mapfile.csv
> +++ b/tools/perf/pmu-events/arch/riscv/mapfile.csv
> @@ -24,3 +24,5 @@
> 0x602-0x3-0x0,v1,openhwgroup/cva6,core
> 0x67e-0x80000000db0000[89]0-0x[[:xdigit:]]+,v1,starfive/dubhe-80,core
> 0x31e-0x8000000000008a45-0x[[:xdigit:]]+,v1,andes/ax45,core
> +0x0-0x0-0x0,v1,qemu/virt,core
> +0x0-0x2a-0x0,v1,qemu/virt,core
[Severity: Medium]
The commit message explicitly states that it adds a mapfile entry for
0x0-0x0-0x0, but it looks like a second entry for 0x0-0x2a-0x0 was also
included here. Should this additional entry be documented in the commit
message, or was it included by mistake?
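For context on why the second row matters, a rough sketch of how the two added rows would select the qemu/virt event table. It assumes the first mapfile column is a regular expression matched against an "mvendorid-marchid-mimpid" string; the lookup function, the whole-string match and the POSIX-class translation are illustrative assumptions, not perf's implementation.

```python
import re
from typing import Optional

# (cpuid pattern, event directory) pairs copied from the quoted mapfile rows.
MAPFILE_ROWS = [
    ("0x602-0x3-0x0", "openhwgroup/cva6"),
    ("0x31e-0x8000000000008a45-0x[[:xdigit:]]+", "andes/ax45"),
    ("0x0-0x0-0x0", "qemu/virt"),
    ("0x0-0x2a-0x0", "qemu/virt"),
]


def lookup(cpuid: str) -> Optional[str]:
    """Return the event directory of the first row whose pattern matches."""
    for pattern, model in MAPFILE_ROWS:
        # Python's re module has no POSIX classes, so translate the one used here.
        py_pattern = pattern.replace("[[:xdigit:]]", "[0-9a-fA-F]")
        if re.fullmatch(py_pattern, cpuid):
            return model
    return None


print(lookup("0x0-0x0-0x0"))
print(lookup("0x0-0x2a-0x0"))
print(lookup("0x0-0x1-0x0"))
```

Under these assumptions a machine reporting marchid 0x2a only picks up the fake events because of the second, undocumented row.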
> diff --git a/tools/perf/pmu-events/arch/riscv/qemu/virt/events.json b/tools/perf/pmu-events/arch/riscv/qemu/virt/events.json
> new file mode 100644
> --- /dev/null
> +++ b/tools/perf/pmu-events/arch/riscv/qemu/virt/events.json
> @@ -0,0 +1,26 @@
> +[
> + {
> + "EventName": "fake-json-any",
[Severity: Medium]
Since the commit subject explicitly marks this as a test that should not be
upstreamed, and it populates the PMU event list with artificial hardware
events, was this patch included in the series by accident?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260701-counter_delegation-v8-0-7909f863a645@meta.com?part=22
^ permalink raw reply [flat|nested] 34+ messages in thread