* [PATCH v3 0/8] riscv: Add Ssqosid and initial CBQRI resctrl support
@ 2026-06-28 21:18 Drew Fustini
2026-06-28 21:18 ` [PATCH v3 1/8] dt-bindings: riscv: Add Ssqosid extension description Drew Fustini
` (7 more replies)
0 siblings, 8 replies; 14+ messages in thread
From: Drew Fustini @ 2026-06-28 21:18 UTC (permalink / raw)
To: Adrien Ricciardi, Alexandre Ghiti, Atish Kumar Patra, Atish Patra,
Babu Moger, Ben Horgan, Borislav Petkov, Chen Pei, Conor Dooley,
Conor Dooley, Dave Hansen, Dave Martin, Fenghua Yu, Gong Shuai,
Gong Shuai, guo.wenjia23, James Morse, Kornel Dulęba,
Krzysztof Kozlowski, liu.qingtao2, Liu Zhiwei, Palmer Dabbelt,
Paul Walmsley, Peter Newman, Radim Krčmář,
Reinette Chatre, Rob Herring, Samuel Holland,
Sebastian Andrzej Siewior, Tony Luck, Vasudevan Srinivasan,
Ved Shanbhogue, Weiwei Li, yunhui cui, Drew Fustini
Cc: linux-kernel, linux-riscv, x86, devicetree, linux-rt-devel,
linux-doc
This series adds initial RISC-V QoS support: the Ssqosid extension [1]
(srmcfg CSR), the CBQRI controller interface [2] integrated with resctrl
[3], and a DT-based platform driver for cache controllers. It has been
tested on both the Tenstorrent Ascalon Shared Cache controller and a QEMU
implementation [4].
qemu-system-riscv64 -M virt,aia=aplic-imsic -nographic -m 1G -smp 8 \
-kernel arch/riscv/boot/Image \
-append "root=/dev/vda ro console=ttyS0 rootwait" \
-drive if=none,file=rootfs.ext2,format=raw,id=hd0 \
-device virtio-blk-device,drive=hd0 \
-device riscv.cbqri.capacity,max_mcids=256,max_rcids=64,ncblks=16,mmio_base=0x04820000
Cache allocation can be exercised on the booted system. Mount resctrl
and read the default schemata. The L2 controller has 16 capacity
blocks, so the default capacity bitmask (CBM) is 0xffff:
# mount -t resctrl resctrl /sys/fs/resctrl
# cat /sys/fs/resctrl/schemata
L2:0=ffff
Write a narrower CBM to a new control group and read it back to confirm
the L2 controller applied it:
# mkdir /sys/fs/resctrl/group0
# echo "L2:0=ff" > /sys/fs/resctrl/group0/schemata
# cat /sys/fs/resctrl/group0/schemata
L2:0=ff
Note that this series only implements support for resctrl L2 and L3
cache resources using CBQRI capacity allocation control. cc_block_mask
maps onto resctrl's existing cbm schema. However, cc_cunits is not
supported as there is no existing equivalent for capacity units in the
resctrl schemata.
I had previously been iterating on an RFC series [5] that did a full
implementation of CBQRI including capacity monitoring, bandwidth
allocation and monitoring. The bandwidth controls for CBQRI do not fit
well into resctrl's existing throttle-based MB schemata. I believe that
the path forward is Reinette's generic schema description proof of
concept [6]. My plan is to rebase the full support of CBQRI onto the
generic schema once it is ready.
This series is based on the linux-next tag next-20260623.
[1] https://github.com/riscv/riscv-ssqosid/releases/tag/v1.0
[2] https://github.com/riscv-non-isa/riscv-cbqri/releases/tag/v1.0
[3] https://docs.kernel.org/filesystems/resctrl.html
[4] https://github.com/tt-fustini/qemu/tree/riscv-cbqri-cache
[5] https://lore.kernel.org/linux-riscv/20260601-ssqosid-cbqri-rqsc-v7-0-v6-16-baf00f50028a@kernel.org/
[6] https://lore.kernel.org/all/aab804b9-e8b5-40ad-a85b-af7033391243@intel.com/
Changes in v3:
--------------
- riscv,cbqri.yaml:
- Require a device-specific compatible so a bare generic compatible
is no longer valid on its own.
- Rename node name from cache-controller@ to qos-controller@
- Rename compatible from tenstorrent,ascalon-sc-cbqri to
tenstorrent,ascalon-shared-cache-controller
- Take cbqri_controllers_lock in cbqri_attach_cpu_to_all_ctrls() so a
controller probed after boot can't corrupt the cbqri_controllers list.
- Revise comment above riscv_srmcfg_reset_cache() to clarify that the
teardown callback is not relied on. The cpuhp startup callback re-arms
the per-cpu sentinel, which forces the csr write on a re-onlined CPU.
- Drop the memory fences around the srmcfg CSR write on context switch.
Ssqosid does not require the ordering. The brief tagging inaccuracy at
the switch boundary is acceptable for QoS.
- Access the CBQRI controller registers with 32-bit reads and writes.
The spec only guarantees single-copy atomicity for 4-byte accesses.
This also removes the dependency on native 64-bit MMIO.
- Program cc_cunits to 0 before a config limit operation on controllers
that support capacity units, so a stale unit limit does not constrain
block-mask allocation.
- Link to v2:
https://patch.msgid.link/20260624-dfustini-atl-sc-cbqri-dt-v2-0-2f8049fd902b@kernel.org
- Sashiko review of v2:
https://sashiko.dev/#/patchset/20260624-dfustini-atl-sc-cbqri-dt-v2-0-2f8049fd902b@kernel.org
Changes in v2:
--------------
The changes in this revision address the Sashiko review of v1.
- Restore the srmcfg CSR for the current task on CPU_PM_EXIT and
CPU_PM_ENTER_FAILED, so it is not left configured incorrectly until
the next context switch.
- Serialize the cbqri_controllers list insert and the boot time walk
with a mutex, so an asynchronous driver probe cannot corrupt the list.
- Skip a controller at an unsupported cache level instead of aborting
resctrl setup, so valid L2 and L3 controllers still register.
- RISCV_ISA_SSQOSID selects ARCH_HAS_CPU_RESCTRL and RISCV_CBQRI
together, so no intermediate commit enables RESCTRL_FS without the
CBQRI resctrl glue.
- Rename the RISCV_CBQRI_DRIVER to RISCV_CBQRI, since it builds the
CBQRI core ops and resctrl integration rather than a driver.
- Drop the RISCV_CBQRI_DRIVER_DEBUG Kconfig option and rely on dynamic
debug to control the pr_debug() output.
- Note: Sashiko flagged the lack of suspend/resume state restore. I will
not fix that as register state is only lost when the power domain is
gated, which offlines the harts sharing the cache. resctrl reprograms
the default capacity mask through the normal control domain online
path on resume.
- Link to v1:
https://lore.kernel.org/all/20260619-dfustini-atl-sc-cbqri-dt-v1-0-e79a7723fab0@kernel.org/
- Sashiko review of v1:
https://sashiko.dev/#/patchset/20260619-dfustini-atl-sc-cbqri-dt-v1-0-e79a7723fab0@kernel.org
---
Drew Fustini (8):
dt-bindings: riscv: Add Ssqosid extension description
riscv: Detect the Ssqosid extension
riscv: Add support for srmcfg CSR from Ssqosid extension
riscv_cbqri: Add capacity controller probe and allocation device ops
riscv_cbqri: resctrl: Add cache allocation via capacity block mask
riscv: Enable resctrl filesystem for Ssqosid
dt-bindings: riscv: Add binding for CBQRI controllers
riscv_cbqri: Add CBQRI capacity allocation platform driver
.../devicetree/bindings/riscv/extensions.yaml | 6 +
.../devicetree/bindings/riscv/riscv,cbqri.yaml | 97 +++
MAINTAINERS | 15 +
arch/riscv/Kconfig | 20 +
arch/riscv/include/asm/csr.h | 5 +
arch/riscv/include/asm/hwcap.h | 1 +
arch/riscv/include/asm/processor.h | 3 +
arch/riscv/include/asm/qos.h | 74 ++
arch/riscv/include/asm/resctrl.h | 147 ++++
arch/riscv/include/asm/switch_to.h | 3 +
arch/riscv/kernel/Makefile | 2 +
arch/riscv/kernel/cpufeature.c | 1 +
arch/riscv/kernel/qos.c | 99 +++
drivers/resctrl/Kconfig | 29 +
drivers/resctrl/Makefile | 5 +
drivers/resctrl/cbqri_capacity.c | 132 ++++
drivers/resctrl/cbqri_devices.c | 562 +++++++++++++++
drivers/resctrl/cbqri_internal.h | 124 ++++
drivers/resctrl/cbqri_resctrl.c | 787 +++++++++++++++++++++
include/linux/riscv_cbqri.h | 47 ++
20 files changed, 2159 insertions(+)
---
base-commit: 4e5dfb7c84012007c3c7061126491bbc92d71bf1
change-id: 20260610-dfustini-atl-sc-cbqri-dt-410c8e2711dd
Best regards,
--
Drew Fustini <fustini@kernel.org>
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v3 1/8] dt-bindings: riscv: Add Ssqosid extension description
2026-06-28 21:18 [PATCH v3 0/8] riscv: Add Ssqosid and initial CBQRI resctrl support Drew Fustini
@ 2026-06-28 21:18 ` Drew Fustini
2026-06-28 21:18 ` [PATCH v3 2/8] riscv: Detect the Ssqosid extension Drew Fustini
` (6 subsequent siblings)
7 siblings, 0 replies; 14+ messages in thread
From: Drew Fustini @ 2026-06-28 21:18 UTC (permalink / raw)
To: Adrien Ricciardi, Alexandre Ghiti, Atish Kumar Patra, Atish Patra,
Babu Moger, Ben Horgan, Borislav Petkov, Chen Pei, Conor Dooley,
Conor Dooley, Dave Hansen, Dave Martin, Fenghua Yu, Gong Shuai,
Gong Shuai, guo.wenjia23, James Morse, Kornel Dulęba,
Krzysztof Kozlowski, liu.qingtao2, Liu Zhiwei, Palmer Dabbelt,
Paul Walmsley, Peter Newman, Radim Krčmář,
Reinette Chatre, Rob Herring, Samuel Holland,
Sebastian Andrzej Siewior, Tony Luck, Vasudevan Srinivasan,
Ved Shanbhogue, Weiwei Li, yunhui cui, Drew Fustini
Cc: linux-kernel, linux-riscv, x86, devicetree, linux-rt-devel,
linux-doc
Document the ratified Supervisor-mode Quality of Service ID (Ssqosid)
extension v1.0.
Link: https://github.com/riscv/riscv-ssqosid/releases/tag/v1.0
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Drew Fustini <fustini@kernel.org>
---
Documentation/devicetree/bindings/riscv/extensions.yaml | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/Documentation/devicetree/bindings/riscv/extensions.yaml b/Documentation/devicetree/bindings/riscv/extensions.yaml
index 2b0a8a93bb21..1c6f091518d4 100644
--- a/Documentation/devicetree/bindings/riscv/extensions.yaml
+++ b/Documentation/devicetree/bindings/riscv/extensions.yaml
@@ -232,6 +232,12 @@ properties:
ratified at commit d70011dde6c2 ("Update to ratified state")
of riscv-j-extension.
+ - const: ssqosid
+ description: |
+ The standard Ssqosid extension for Quality of Service ID is
+ ratified as v1.0 in commit d9c616497fde ("Merge pull
+ request #7 from ved-rivos/Ratified") of riscv-ssqosid.
+
- const: ssstateen
description: |
The standard Ssstateen extension for supervisor-mode view of the
--
2.43.0
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v3 2/8] riscv: Detect the Ssqosid extension
2026-06-28 21:18 [PATCH v3 0/8] riscv: Add Ssqosid and initial CBQRI resctrl support Drew Fustini
2026-06-28 21:18 ` [PATCH v3 1/8] dt-bindings: riscv: Add Ssqosid extension description Drew Fustini
@ 2026-06-28 21:18 ` Drew Fustini
2026-06-28 21:18 ` [PATCH v3 3/8] riscv: Add support for srmcfg CSR from " Drew Fustini
` (5 subsequent siblings)
7 siblings, 0 replies; 14+ messages in thread
From: Drew Fustini @ 2026-06-28 21:18 UTC (permalink / raw)
To: Adrien Ricciardi, Alexandre Ghiti, Atish Kumar Patra, Atish Patra,
Babu Moger, Ben Horgan, Borislav Petkov, Chen Pei, Conor Dooley,
Conor Dooley, Dave Hansen, Dave Martin, Fenghua Yu, Gong Shuai,
Gong Shuai, guo.wenjia23, James Morse, Kornel Dulęba,
Krzysztof Kozlowski, liu.qingtao2, Liu Zhiwei, Palmer Dabbelt,
Paul Walmsley, Peter Newman, Radim Krčmář,
Reinette Chatre, Rob Herring, Samuel Holland,
Sebastian Andrzej Siewior, Tony Luck, Vasudevan Srinivasan,
Ved Shanbhogue, Weiwei Li, yunhui cui, Drew Fustini
Cc: linux-kernel, linux-riscv, x86, devicetree, linux-rt-devel,
linux-doc
Ssqosid is the RISC-V Quality-of-Service (QoS) Identifiers specification
which defines the Supervisor Resource Management Configuration (srmcfg)
register.
Link: https://github.com/riscv/riscv-ssqosid/releases/tag/v1.0
Co-developed-by: Kornel Dulęba <mindal@semihalf.com>
Signed-off-by: Kornel Dulęba <mindal@semihalf.com>
Signed-off-by: Drew Fustini <fustini@kernel.org>
---
arch/riscv/include/asm/hwcap.h | 1 +
arch/riscv/kernel/cpufeature.c | 1 +
2 files changed, 2 insertions(+)
diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
index 7ef8e5f55c8d..b83dae5cebb9 100644
--- a/arch/riscv/include/asm/hwcap.h
+++ b/arch/riscv/include/asm/hwcap.h
@@ -112,6 +112,7 @@
#define RISCV_ISA_EXT_ZCLSD 103
#define RISCV_ISA_EXT_ZICFILP 104
#define RISCV_ISA_EXT_ZICFISS 105
+#define RISCV_ISA_EXT_SSQOSID 106
#define RISCV_ISA_EXT_XLINUXENVCFG 127
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index f46aa5602d74..668a7e71ff1c 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -582,6 +582,7 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = {
__RISCV_ISA_EXT_DATA(ssaia, RISCV_ISA_EXT_SSAIA),
__RISCV_ISA_EXT_DATA(sscofpmf, RISCV_ISA_EXT_SSCOFPMF),
__RISCV_ISA_EXT_SUPERSET(ssnpm, RISCV_ISA_EXT_SSNPM, riscv_xlinuxenvcfg_exts),
+ __RISCV_ISA_EXT_DATA(ssqosid, RISCV_ISA_EXT_SSQOSID),
__RISCV_ISA_EXT_DATA(sstc, RISCV_ISA_EXT_SSTC),
__RISCV_ISA_EXT_DATA(svade, RISCV_ISA_EXT_SVADE),
__RISCV_ISA_EXT_DATA_VALIDATE(svadu, RISCV_ISA_EXT_SVADU, riscv_ext_svadu_validate),
--
2.43.0
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v3 3/8] riscv: Add support for srmcfg CSR from Ssqosid extension
2026-06-28 21:18 [PATCH v3 0/8] riscv: Add Ssqosid and initial CBQRI resctrl support Drew Fustini
2026-06-28 21:18 ` [PATCH v3 1/8] dt-bindings: riscv: Add Ssqosid extension description Drew Fustini
2026-06-28 21:18 ` [PATCH v3 2/8] riscv: Detect the Ssqosid extension Drew Fustini
@ 2026-06-28 21:18 ` Drew Fustini
2026-06-28 21:28 ` sashiko-bot
2026-06-28 21:18 ` [PATCH v3 4/8] riscv_cbqri: Add capacity controller probe and allocation device ops Drew Fustini
` (4 subsequent siblings)
7 siblings, 1 reply; 14+ messages in thread
From: Drew Fustini @ 2026-06-28 21:18 UTC (permalink / raw)
To: Adrien Ricciardi, Alexandre Ghiti, Atish Kumar Patra, Atish Patra,
Babu Moger, Ben Horgan, Borislav Petkov, Chen Pei, Conor Dooley,
Conor Dooley, Dave Hansen, Dave Martin, Fenghua Yu, Gong Shuai,
Gong Shuai, guo.wenjia23, James Morse, Kornel Dulęba,
Krzysztof Kozlowski, liu.qingtao2, Liu Zhiwei, Palmer Dabbelt,
Paul Walmsley, Peter Newman, Radim Krčmář,
Reinette Chatre, Rob Herring, Samuel Holland,
Sebastian Andrzej Siewior, Tony Luck, Vasudevan Srinivasan,
Ved Shanbhogue, Weiwei Li, yunhui cui, Drew Fustini
Cc: linux-kernel, linux-riscv, x86, devicetree, linux-rt-devel,
linux-doc
Add support for the srmcfg CSR defined in the Ssqosid ISA extension.
The CSR contains two fields:
- Resource Control ID (RCID) for resource allocation
- Monitoring Counter ID (MCID) for tracking resource usage
Requests from a hart to shared resources are tagged with these IDs,
allowing resource usage to be associated with the running task.
Add a srmcfg field to thread_struct with the same format as the CSR so
the scheduler can set the RCID and MCID for each task on context
switch. A per-cpu cpu_srmcfg variable mirrors the CSR state to avoid
redundant writes. L1D-hot memory access is faster than a CSR read and
avoids traps under virtualization.
A per-cpu cpu_srmcfg_default holds the default srmcfg for each CPU as
set by resctrl CPU group assignment. On context switch, RCID and MCID
inherit from the CPU default independently: a task whose thread RCID
field is zero takes the CPU default's RCID, and likewise for MCID.
Link: https://github.com/riscv/riscv-ssqosid/releases/tag/v1.0
Assisted-by: Claude:claude-opus-4-7
Co-developed-by: Kornel Dulęba <mindal@semihalf.com>
Signed-off-by: Kornel Dulęba <mindal@semihalf.com>
Signed-off-by: Drew Fustini <fustini@kernel.org>
---
MAINTAINERS | 8 +++
arch/riscv/Kconfig | 18 +++++++
arch/riscv/include/asm/csr.h | 5 ++
arch/riscv/include/asm/processor.h | 3 ++
arch/riscv/include/asm/qos.h | 74 ++++++++++++++++++++++++++++
arch/riscv/include/asm/switch_to.h | 3 ++
arch/riscv/kernel/Makefile | 2 +
arch/riscv/kernel/qos.c | 99 ++++++++++++++++++++++++++++++++++++++
8 files changed, 212 insertions(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index 0b9d7c8276ac..07109e1a8f84 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -23293,6 +23293,14 @@ F: drivers/perf/riscv_pmu.c
F: drivers/perf/riscv_pmu_legacy.c
F: drivers/perf/riscv_pmu_sbi.c
+RISC-V QOS RESCTRL SUPPORT
+M: Drew Fustini <fustini@kernel.org>
+R: yunhui cui <cuiyunhui@bytedance.com>
+L: linux-riscv@lists.infradead.org
+S: Supported
+F: arch/riscv/include/asm/qos.h
+F: arch/riscv/kernel/qos.c
+
RISC-V RPMI AND MPXY DRIVERS
M: Rahul Pathak <rahul@summations.net>
M: Anup Patel <anup@brainfault.org>
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 3f0a647218e4..ee586925f972 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -590,6 +590,24 @@ config RISCV_ISA_SVNAPOT
If you don't know what to do here, say Y.
+config RISCV_ISA_SSQOSID
+ bool "Ssqosid extension support for supervisor mode Quality of Service ID"
+ depends on 64BIT
+ default n
+ help
+ Adds support for the Ssqosid ISA extension (Supervisor-mode
+ Quality of Service ID).
+
+ Ssqosid defines the srmcfg CSR which allows the system to tag the
+ running process with an RCID (Resource Control ID) and MCID
+ (Monitoring Counter ID). The RCID is used to determine resource
+ allocation. The MCID is used to track resource usage in event
+ counters.
+
+ For example, a cache controller may use the RCID to apply a
+ cache partitioning scheme and use the MCID to track how much
+ cache a process, or a group of processes, is using.
+
config RISCV_ISA_SVPBMT
bool "Svpbmt extension support for supervisor mode page-based memory types"
depends on 64BIT && MMU
diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
index 31b8988f4488..7bce928e5daa 100644
--- a/arch/riscv/include/asm/csr.h
+++ b/arch/riscv/include/asm/csr.h
@@ -84,6 +84,10 @@
#define SATP_ASID_MASK _AC(0xFFFF, UL)
#endif
+/* SRMCFG fields */
+#define SRMCFG_RCID_MASK GENMASK(11, 0)
+#define SRMCFG_MCID_MASK GENMASK(27, 16)
+
/* Exception cause high bit - is an interrupt if set */
#define CAUSE_IRQ_FLAG (_AC(1, UL) << (__riscv_xlen - 1))
@@ -328,6 +332,7 @@
#define CSR_STVAL 0x143
#define CSR_SIP 0x144
#define CSR_SATP 0x180
+#define CSR_SRMCFG 0x181
#define CSR_STIMECMP 0x14D
#define CSR_STIMECMPH 0x15D
diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h
index 812517b2cec1..49a386d74cd3 100644
--- a/arch/riscv/include/asm/processor.h
+++ b/arch/riscv/include/asm/processor.h
@@ -123,6 +123,9 @@ struct thread_struct {
/* A forced icache flush is not needed if migrating to the previous cpu. */
unsigned int prev_cpu;
#endif
+#ifdef CONFIG_RISCV_ISA_SSQOSID
+ u32 srmcfg;
+#endif
};
/* Whitelist the fstate from the task_struct for hardened usercopy */
diff --git a/arch/riscv/include/asm/qos.h b/arch/riscv/include/asm/qos.h
new file mode 100644
index 000000000000..cf19e8438bb9
--- /dev/null
+++ b/arch/riscv/include/asm/qos.h
@@ -0,0 +1,74 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_RISCV_QOS_H
+#define _ASM_RISCV_QOS_H
+
+#include <linux/percpu-defs.h>
+
+#ifdef CONFIG_RISCV_ISA_SSQOSID
+
+#include <linux/bitfield.h>
+#include <linux/cpufeature.h>
+#include <linux/sched.h>
+
+#include <asm/csr.h>
+#include <asm/hwcap.h>
+
+/* cached value of srmcfg csr for each cpu */
+DECLARE_PER_CPU(u32, cpu_srmcfg);
+
+/* default srmcfg value for each cpu, set via resctrl cpu assignment */
+DECLARE_PER_CPU(u32, cpu_srmcfg_default);
+
+static inline void __switch_to_srmcfg(struct task_struct *next)
+{
+ u32 thread_srmcfg, default_srmcfg;
+
+ thread_srmcfg = READ_ONCE(next->thread.srmcfg);
+ default_srmcfg = __this_cpu_read(cpu_srmcfg_default);
+
+ /*
+ * RCID and MCID inherit from cpu_srmcfg_default independently.
+ * RESCTRL_RESERVED_CLOSID and RESCTRL_RESERVED_RMID are both 0, so a
+ * zero field means "unassigned" and takes the CPU default.
+ */
+ if (thread_srmcfg == 0) {
+ thread_srmcfg = default_srmcfg;
+ } else {
+ u32 rcid = FIELD_GET(SRMCFG_RCID_MASK, thread_srmcfg);
+ u32 mcid = FIELD_GET(SRMCFG_MCID_MASK, thread_srmcfg);
+
+ if (rcid == 0 || mcid == 0) {
+ if (rcid == 0)
+ rcid = FIELD_GET(SRMCFG_RCID_MASK, default_srmcfg);
+ if (mcid == 0)
+ mcid = FIELD_GET(SRMCFG_MCID_MASK, default_srmcfg);
+ thread_srmcfg = FIELD_PREP(SRMCFG_RCID_MASK, rcid) |
+ FIELD_PREP(SRMCFG_MCID_MASK, mcid);
+ }
+ }
+
+ if (thread_srmcfg != __this_cpu_read(cpu_srmcfg)) {
+ /*
+ * No fence around the csrw. Ssqosid is silent on srmcfg
+ * ordering versus memory accesses, so a few accesses at the
+ * switch boundary may carry the previous RCID/MCID. The
+ * tagging inaccuracy is bounded and acceptable for QoS.
+ */
+ __this_cpu_write(cpu_srmcfg, thread_srmcfg);
+ csr_write(CSR_SRMCFG, thread_srmcfg);
+ }
+}
+
+static __always_inline bool has_srmcfg(void)
+{
+ return riscv_has_extension_unlikely(RISCV_ISA_EXT_SSQOSID);
+}
+
+#else /* ! CONFIG_RISCV_ISA_SSQOSID */
+
+struct task_struct;
+static __always_inline bool has_srmcfg(void) { return false; }
+static inline void __switch_to_srmcfg(struct task_struct *next) { }
+
+#endif /* CONFIG_RISCV_ISA_SSQOSID */
+#endif /* _ASM_RISCV_QOS_H */
diff --git a/arch/riscv/include/asm/switch_to.h b/arch/riscv/include/asm/switch_to.h
index 0e71eb82f920..1c7ea53ec012 100644
--- a/arch/riscv/include/asm/switch_to.h
+++ b/arch/riscv/include/asm/switch_to.h
@@ -14,6 +14,7 @@
#include <asm/processor.h>
#include <asm/ptrace.h>
#include <asm/csr.h>
+#include <asm/qos.h>
#ifdef CONFIG_FPU
extern void __fstate_save(struct task_struct *save_to);
@@ -119,6 +120,8 @@ do { \
__switch_to_fpu(__prev, __next); \
if (has_vector() || has_xtheadvector()) \
__switch_to_vector(__prev, __next); \
+ if (has_srmcfg()) \
+ __switch_to_srmcfg(__next); \
if (switch_to_should_flush_icache(__next)) \
local_flush_icache_all(); \
__switch_to_envcfg(__next); \
diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index cabb99cadfb6..ebe1c3588177 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -128,3 +128,5 @@ obj-$(CONFIG_ACPI_NUMA) += acpi_numa.o
obj-$(CONFIG_GENERIC_CPU_VULNERABILITIES) += bugs.o
obj-$(CONFIG_RISCV_USER_CFI) += usercfi.o
+
+obj-$(CONFIG_RISCV_ISA_SSQOSID) += qos.o
diff --git a/arch/riscv/kernel/qos.c b/arch/riscv/kernel/qos.c
new file mode 100644
index 000000000000..c8900d91996f
--- /dev/null
+++ b/arch/riscv/kernel/qos.c
@@ -0,0 +1,99 @@
+// SPDX-License-Identifier: GPL-2.0-only
+#include <linux/cpu.h>
+#include <linux/cpu_pm.h>
+#include <linux/cpuhotplug.h>
+#include <linux/notifier.h>
+#include <linux/percpu-defs.h>
+#include <linux/types.h>
+
+#include <asm/cpufeature-macros.h>
+#include <asm/hwcap.h>
+#include <asm/qos.h>
+
+/*
+ * Cached value of srmcfg csr for each cpu. Seeded to U32_MAX so the next
+ * __switch_to_srmcfg() unconditionally writes the CSR. The encoding
+ * MCID << 16 | RCID with both fields well under 16 bits can never
+ * produce this sentinel. This covers early-boot context switches that
+ * happen before riscv_srmcfg_init() runs as an arch_initcall.
+ */
+DEFINE_PER_CPU(u32, cpu_srmcfg) = U32_MAX;
+
+/* default srmcfg value for each cpu, set via resctrl cpu assignment */
+DEFINE_PER_CPU(u32, cpu_srmcfg_default);
+
+/*
+ * Invalidate the per-CPU srmcfg cache. Used as both the cpuhp startup
+ * and teardown callback. U32_MAX is not a valid srmcfg value
+ * (MCID << 16 | RCID, both fields under 16 bits), so the next
+ * __switch_to_srmcfg() always writes the CSR.
+ *
+ * Ssqosid leaves the CSR implementation-defined across hart stop/start,
+ * so the cached value cannot be trusted after online. The startup
+ * callback runs at CPUHP_AP_ONLINE_DYN, before CPUHP_AP_ACTIVE makes
+ * the CPU schedulable, so the cache is invalidated before any normal
+ * task runs and the CSR is written on that task's first switch.
+ * The teardown callback is not relied on. Idle and per-CPU kthreads keep
+ * switching as the CPU goes down and overwrite the sentinel with the CPU
+ * default, so it does not survive the offline period.
+ */
+static int riscv_srmcfg_reset_cache(unsigned int cpu)
+{
+ per_cpu(cpu_srmcfg, cpu) = U32_MAX;
+ return 0;
+}
+
+/*
+ * CPU PM notifier: invalidate the cached srmcfg on resume from a deep
+ * idle / suspend. Ssqosid leaves CSR_SRMCFG state across low-power
+ * transitions implementation-defined, and the boot CPU never goes
+ * through the cpuhp online callback during system suspend, so without
+ * this hook __switch_to_srmcfg() would skip the CSR write when the
+ * outgoing task happens to share its srmcfg with the pre-suspend cache.
+ */
+static int riscv_srmcfg_pm_notify(struct notifier_block *nb,
+ unsigned long action, void *unused)
+{
+ switch (action) {
+ case CPU_PM_EXIT:
+ case CPU_PM_ENTER_FAILED:
+ /*
+ * The CSR is implementation-defined across the low-power
+ * transition. Invalidate the cache and eagerly rewrite the
+ * CSR for the current task so it does not run mis-tagged
+ * until the next context switch.
+ */
+ __this_cpu_write(cpu_srmcfg, U32_MAX);
+ __switch_to_srmcfg(current);
+ break;
+ }
+ return NOTIFY_OK;
+}
+
+static struct notifier_block riscv_srmcfg_pm_nb = {
+ .notifier_call = riscv_srmcfg_pm_notify,
+};
+
+static int __init riscv_srmcfg_init(void)
+{
+ int err;
+
+ if (!riscv_has_extension_unlikely(RISCV_ISA_EXT_SSQOSID))
+ return 0;
+
+ /*
+ * cpuhp_setup_state() invokes the startup callback locally on every
+ * already-online CPU, so no separate seed loop is needed here.
+ */
+ err = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "riscv/srmcfg:online",
+ riscv_srmcfg_reset_cache, riscv_srmcfg_reset_cache);
+ if (err < 0) {
+ pr_warn("srmcfg cpuhp registration failed (%d), cpus brought online after boot will not invalidate the CSR_SRMCFG cache\n",
+ err);
+ return err;
+ }
+
+ cpu_pm_register_notifier(&riscv_srmcfg_pm_nb);
+ return 0;
+}
+arch_initcall(riscv_srmcfg_init);
--
2.43.0
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v3 4/8] riscv_cbqri: Add capacity controller probe and allocation device ops
2026-06-28 21:18 [PATCH v3 0/8] riscv: Add Ssqosid and initial CBQRI resctrl support Drew Fustini
` (2 preceding siblings ...)
2026-06-28 21:18 ` [PATCH v3 3/8] riscv: Add support for srmcfg CSR from " Drew Fustini
@ 2026-06-28 21:18 ` Drew Fustini
2026-06-28 21:28 ` sashiko-bot
2026-06-28 21:18 ` [PATCH v3 5/8] riscv_cbqri: resctrl: Add cache allocation via capacity block mask Drew Fustini
` (3 subsequent siblings)
7 siblings, 1 reply; 14+ messages in thread
From: Drew Fustini @ 2026-06-28 21:18 UTC (permalink / raw)
To: Adrien Ricciardi, Alexandre Ghiti, Atish Kumar Patra, Atish Patra,
Babu Moger, Ben Horgan, Borislav Petkov, Chen Pei, Conor Dooley,
Conor Dooley, Dave Hansen, Dave Martin, Fenghua Yu, Gong Shuai,
Gong Shuai, guo.wenjia23, James Morse, Kornel Dulęba,
Krzysztof Kozlowski, liu.qingtao2, Liu Zhiwei, Palmer Dabbelt,
Paul Walmsley, Peter Newman, Radim Krčmář,
Reinette Chatre, Rob Herring, Samuel Holland,
Sebastian Andrzej Siewior, Tony Luck, Vasudevan Srinivasan,
Ved Shanbhogue, Weiwei Li, yunhui cui, Drew Fustini
Cc: linux-kernel, linux-riscv, x86, devicetree, linux-rt-devel,
linux-doc
Add support for the RISC-V CBQRI capacity controller. A platform driver
passes a cbqri_controller_info descriptor together with the cache level
to riscv_cbqri_register_cc_dt(), which probes the controller and adds it
to the controller list.
Assisted-by: Claude:claude-opus-4-7
Co-developed-by: Adrien Ricciardi <aricciardi@baylibre.com>
Signed-off-by: Adrien Ricciardi <aricciardi@baylibre.com>
Signed-off-by: Drew Fustini <fustini@kernel.org>
---
MAINTAINERS | 3 +
drivers/resctrl/Kconfig | 13 +
drivers/resctrl/Makefile | 3 +
drivers/resctrl/cbqri_devices.c | 562 +++++++++++++++++++++++++++++++++++++++
drivers/resctrl/cbqri_internal.h | 124 +++++++++
include/linux/riscv_cbqri.h | 47 ++++
6 files changed, 752 insertions(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index 07109e1a8f84..811c0c9b1fac 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -23300,6 +23300,9 @@ L: linux-riscv@lists.infradead.org
S: Supported
F: arch/riscv/include/asm/qos.h
F: arch/riscv/kernel/qos.c
+F: drivers/resctrl/cbqri_devices.c
+F: drivers/resctrl/cbqri_internal.h
+F: include/linux/riscv_cbqri.h
RISC-V RPMI AND MPXY DRIVERS
M: Rahul Pathak <rahul@summations.net>
diff --git a/drivers/resctrl/Kconfig b/drivers/resctrl/Kconfig
index 672abea3b03c..92b9c82cf9f3 100644
--- a/drivers/resctrl/Kconfig
+++ b/drivers/resctrl/Kconfig
@@ -29,3 +29,16 @@ config ARM64_MPAM_RESCTRL_FS
default y if ARM64_MPAM_DRIVER && RESCTRL_FS
select RESCTRL_RMID_DEPENDS_ON_CLOSID
select RESCTRL_ASSIGN_FIXED
+
+menuconfig RISCV_CBQRI
+ bool "RISC-V CBQRI support"
+ depends on RISCV && RISCV_ISA_SSQOSID
+ help
+ Capacity and Bandwidth QoS Register Interface (CBQRI) support for
+ RISC-V cache QoS resources. CBQRI exposes cache capacity
+ allocation through the resctrl filesystem at /sys/fs/resctrl when
+ RESCTRL_FS is also enabled.
+
+if RISCV_CBQRI
+
+endif
diff --git a/drivers/resctrl/Makefile b/drivers/resctrl/Makefile
index 4f6d0e81f9b8..4d8a2c4b5627 100644
--- a/drivers/resctrl/Makefile
+++ b/drivers/resctrl/Makefile
@@ -3,3 +3,6 @@ mpam-y += mpam_devices.o
mpam-$(CONFIG_ARM64_MPAM_RESCTRL_FS) += mpam_resctrl.o
ccflags-$(CONFIG_ARM64_MPAM_DRIVER_DEBUG) += -DDEBUG
+
+obj-$(CONFIG_RISCV_CBQRI) += cbqri.o
+cbqri-y += cbqri_devices.o
diff --git a/drivers/resctrl/cbqri_devices.c b/drivers/resctrl/cbqri_devices.c
new file mode 100644
index 000000000000..69df46e07df1
--- /dev/null
+++ b/drivers/resctrl/cbqri_devices.c
@@ -0,0 +1,562 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#define pr_fmt(fmt) "%s:%s: " fmt, KBUILD_MODNAME, __func__
+
+#include <linux/bitfield.h>
+#include <linux/riscv_cbqri.h>
+#include <linux/cpumask.h>
+#include <linux/err.h>
+#include <linux/io.h>
+#include <linux/iopoll.h>
+#include <linux/ioport.h>
+#include <linux/list.h>
+#include <linux/mutex.h>
+#include <linux/printk.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+
+#include <asm/csr.h>
+
+#include "cbqri_internal.h"
+
+LIST_HEAD(cbqri_controllers);
+
+/*
+ * Serializes cbqri_controllers mutations against a concurrent insert under
+ * asynchronous driver probing, and against the boot-time walk in the resctrl
+ * glue. Runtime cpuhp walks happen after registration has settled.
+ */
+DEFINE_MUTEX(cbqri_controllers_lock);
+
+/*
+ * CBQRI registers are 64-bit, but the spec only guarantees single-copy
+ * atomicity for naturally aligned 4-byte accesses. Read the two halves and
+ * reconstruct, so the driver does not rely on native 64-bit MMIO.
+ */
+static u64 cbqri_readq(void __iomem *addr)
+{
+ return (u64)readl(addr) | ((u64)readl(addr + 4) << 32);
+}
+
+/* Set capacity block mask (cc_block_mask) */
+static void cbqri_set_cbm(struct cbqri_controller *ctrl, u64 cbm)
+{
+ /*
+ * cbqri_probe_cc() rejects ncblks > 32, so the mask fits the low word.
+ * Per CBQRI 3.5 the cc_block_mask bits above NCBLKS are read-only zero,
+ * so the upper word needs no write.
+ */
+ writel(lower_32_bits(cbm), ctrl->base + CBQRI_CC_BLOCK_MASK_OFF);
+}
+
+/*
+ * Clear cc_cunits so a CONFIG_LIMIT on a CUNITS-capable controller imposes no
+ * capacity-unit limit. resctrl models only the block mask. On controllers
+ * with cc_capabilities.CUNITS == 0 the register is read-only zero, so there is
+ * nothing to do.
+ */
+static void cbqri_clear_cunits(struct cbqri_controller *ctrl)
+{
+ if (ctrl->cc.cunits) {
+ writel(0, ctrl->base + CBQRI_CC_CUNITS_OFF);
+ writel(0, ctrl->base + CBQRI_CC_CUNITS_OFF + 4);
+ }
+}
+
+static int cbqri_wait_busy_flag(struct cbqri_controller *ctrl, int reg_offset,
+ u64 *regp)
+{
+ u64 reg;
+ int ret;
+
+ /*
+ * Sleeping poll: caller holds ctrl->lock as a sleeping mutex, so
+ * 10us/1ms is safe under PREEMPT_RT.
+ */
+ ret = read_poll_timeout(cbqri_readq, reg,
+ !FIELD_GET(CBQRI_CONTROL_REGISTERS_BUSY_MASK, reg),
+ 10, 1000, false, ctrl->base + reg_offset);
+ if (ret)
+ return ret;
+ if (regp)
+ *regp = reg;
+ return 0;
+}
+
+/*
+ * Perform capacity allocation control operation on capacity controller.
+ * Caller must hold ctrl->lock.
+ */
+static int cbqri_cc_alloc_op(struct cbqri_controller *ctrl, int operation,
+ int rcid, u32 at)
+{
+ int reg_offset = CBQRI_CC_ALLOC_CTL_OFF;
+ int status;
+ u64 reg;
+
+ lockdep_assert_held(&ctrl->lock);
+
+ if (cbqri_wait_busy_flag(ctrl, reg_offset, ®) < 0) {
+ pr_err_ratelimited("BUSY timeout before starting operation\n");
+ return -EIO;
+ }
+ FIELD_MODIFY(CBQRI_CONTROL_REGISTERS_OP_MASK, ®, operation);
+ FIELD_MODIFY(CBQRI_CONTROL_REGISTERS_RCID_MASK, ®, rcid);
+
+ /*
+ * CBQRI Table 1: AT 0=Data, 1=Code. Program AT on controllers
+ * that report supports_alloc_at_code. On controllers that don't,
+ * AT is reserved-zero and the op acts on both halves.
+ */
+ reg &= ~CBQRI_CONTROL_REGISTERS_AT_MASK;
+ if (ctrl->cc.supports_alloc_at_code)
+ reg |= FIELD_PREP(CBQRI_CONTROL_REGISTERS_AT_MASK, at);
+
+ writel(lower_32_bits(reg), ctrl->base + reg_offset);
+
+ if (cbqri_wait_busy_flag(ctrl, reg_offset, ®) < 0) {
+ pr_err_ratelimited("BUSY timeout during operation\n");
+ return -EIO;
+ }
+
+ status = FIELD_GET(CBQRI_CONTROL_REGISTERS_STATUS_MASK, reg);
+ if (status != CBQRI_CC_ALLOC_CTL_STATUS_SUCCESS) {
+ pr_err_ratelimited("operation %d failed: status=%d\n", operation, status);
+ return -EIO;
+ }
+
+ return 0;
+}
+
+/*
+ * Apply a capacity block mask and verify via CONFIG_LIMIT + READ_LIMIT.
+ *
+ * AT-capable controllers with CDP off need a second CONFIG_LIMIT on the
+ * other AT half (the spec encodes AT only as 0=Data / 1=Code, there is
+ * no "both halves" value). CDP-on issues separate per-type writes from
+ * resctrl, so a single CONFIG_LIMIT per call is correct.
+ */
+int cbqri_apply_cache_config(struct cbqri_controller *ctrl, u32 closid,
+ const struct cbqri_cc_config *cfg)
+{
+ bool need_at_mirror;
+ u64 saved_cbm = 0;
+ int err = 0;
+ u64 reg;
+
+ mutex_lock(&ctrl->lock);
+
+ need_at_mirror = ctrl->cc.supports_alloc_at_code && !cfg->cdp_enabled;
+
+ /*
+ * Capture the cfg->at half CBM before any write so a partial
+ * AT-mirror failure can revert and keep the two halves consistent.
+ * Pre-clear cc_block_mask so a silent firmware no-op (status
+ * SUCCESS but staging not updated) shows as a zero readback
+ * rather than carrying stale data from a prior op.
+ */
+ if (need_at_mirror) {
+ cbqri_set_cbm(ctrl, 0);
+ err = cbqri_cc_alloc_op(ctrl, CBQRI_CC_ALLOC_CTL_OP_READ_LIMIT,
+ closid, cfg->at);
+ if (err < 0)
+ goto out;
+ saved_cbm = cbqri_readq(ctrl->base + CBQRI_CC_BLOCK_MASK_OFF);
+ }
+
+ /* Set capacity block mask (cc_block_mask) */
+ cbqri_set_cbm(ctrl, cfg->cbm);
+ cbqri_clear_cunits(ctrl);
+
+ /* Capacity config limit operation for the AT half implied by cfg->at */
+ err = cbqri_cc_alloc_op(ctrl, CBQRI_CC_ALLOC_CTL_OP_CONFIG_LIMIT,
+ closid, cfg->at);
+ if (err < 0)
+ goto out;
+
+ /*
+ * CDP-off mirror: on AT-capable controllers, also program the
+ * other AT half with the same mask so the two halves stay in sync.
+ */
+ if (need_at_mirror) {
+ u32 other = (cfg->at == CBQRI_CONTROL_REGISTERS_AT_CODE) ?
+ CBQRI_CONTROL_REGISTERS_AT_DATA :
+ CBQRI_CONTROL_REGISTERS_AT_CODE;
+
+ cbqri_set_cbm(ctrl, cfg->cbm);
+ cbqri_clear_cunits(ctrl);
+ err = cbqri_cc_alloc_op(ctrl,
+ CBQRI_CC_ALLOC_CTL_OP_CONFIG_LIMIT,
+ closid, other);
+ if (err < 0) {
+ int rerr;
+
+ /*
+ * Best-effort revert of the cfg->at half so the two
+ * halves stay in sync. A schemata read sees only one
+ * half, so silent divergence would otherwise report
+ * the new value as if the write had succeeded.
+ */
+ cbqri_set_cbm(ctrl, saved_cbm);
+ cbqri_clear_cunits(ctrl);
+ rerr = cbqri_cc_alloc_op(ctrl,
+ CBQRI_CC_ALLOC_CTL_OP_CONFIG_LIMIT,
+ closid, cfg->at);
+ if (rerr < 0)
+ pr_err_ratelimited("AT-mirror revert failed (err=%d), AT halves diverged\n",
+ rerr);
+ goto out;
+ }
+ }
+
+ /* Clear cc_block_mask before read limit to verify op works */
+ cbqri_set_cbm(ctrl, 0);
+
+ /* Perform a capacity read limit operation to verify blockmask */
+ err = cbqri_cc_alloc_op(ctrl, CBQRI_CC_ALLOC_CTL_OP_READ_LIMIT,
+ closid, cfg->at);
+ if (err < 0)
+ goto out;
+
+ /*
+ * Read capacity blockmask and narrow to u32 to match resctrl's CBM
+ * width. cbqri_probe_cc() rejects ncblks > 32 so the upper bits are
+ * reserved zero.
+ */
+ reg = cbqri_readq(ctrl->base + CBQRI_CC_BLOCK_MASK_OFF);
+ if (lower_32_bits(reg) != cfg->cbm) {
+ pr_err_ratelimited("CBM verify mismatch (reg=%llx != cbm=%llx)\n",
+ reg, cfg->cbm);
+ err = -EIO;
+ }
+
+out:
+ mutex_unlock(&ctrl->lock);
+ return err;
+}
+
+/*
+ * Read the configured CBM for closid on the at half via READ_LIMIT.
+ * Pre-clears cc_block_mask before the op so a silent firmware no-op
+ * (status SUCCESS but staging not updated) is detectable in cbm_out.
+ */
+int cbqri_read_cache_config(struct cbqri_controller *ctrl, u32 closid,
+ u32 at, u32 *cbm_out)
+{
+ int err;
+
+ mutex_lock(&ctrl->lock);
+ cbqri_set_cbm(ctrl, 0);
+ err = cbqri_cc_alloc_op(ctrl, CBQRI_CC_ALLOC_CTL_OP_READ_LIMIT, closid, at);
+ if (err == 0) {
+ /*
+ * resctrl exposes the CBM as a u32 and cbqri_probe_cc() rejects
+ * ncblks > 32, so the mask fits the low 32-bit word of the
+ * cc_block_mask register.
+ */
+ *cbm_out = readl(ctrl->base + CBQRI_CC_BLOCK_MASK_OFF);
+ }
+ mutex_unlock(&ctrl->lock);
+ return err;
+}
+
+static int cbqri_probe_feature(struct cbqri_controller *ctrl, int reg_offset,
+ int operation, int *status, bool *access_type_supported)
+{
+ const u64 active_mask = CBQRI_CONTROL_REGISTERS_OP_MASK |
+ CBQRI_CONTROL_REGISTERS_AT_MASK |
+ CBQRI_CONTROL_REGISTERS_RCID_MASK;
+ u64 reg, saved_reg;
+ int at;
+
+ /*
+ * Default the output to false so the status==0 (feature not
+ * implemented) path returns a deterministic value to the caller
+ * rather than leaving an uninitialized bool.
+ */
+ *access_type_supported = false;
+
+ /* Keep the initial register value to preserve the WPRI fields */
+ reg = cbqri_readq(ctrl->base + reg_offset);
+ saved_reg = reg;
+
+ /* Drain any in-flight firmware op before issuing our own write. */
+ if (cbqri_wait_busy_flag(ctrl, reg_offset, &saved_reg) < 0) {
+ pr_err("BUSY timeout before probe operation\n");
+ return -EIO;
+ }
+
+ /*
+ * Execute the requested operation with all active fields
+ * (OP/AT/RCID) zeroed except OP itself. Every bit not in
+ * active_mask is WPRI and gets carried over from saved_reg.
+ */
+ reg = (saved_reg & ~active_mask) |
+ FIELD_PREP(CBQRI_CONTROL_REGISTERS_OP_MASK, operation);
+ writel(lower_32_bits(reg), ctrl->base + reg_offset);
+ if (cbqri_wait_busy_flag(ctrl, reg_offset, ®) < 0) {
+ pr_err_ratelimited("BUSY timeout during operation\n");
+ return -EIO;
+ }
+
+ /* Get the operation status */
+ *status = FIELD_GET(CBQRI_CONTROL_REGISTERS_STATUS_MASK, reg);
+
+ /*
+ * Check for the AT support if the register is implemented
+ * (if not, the status value will remain 0)
+ */
+ if (*status != 0) {
+ /*
+ * Re-issue operation with AT=CODE so the controller
+ * latches AT=CODE on supported hardware (or resets it to 0
+ * on hardware that doesn't). OP must be a defined CBQRI op
+ * here. OP=0 is a no-op and would silently disable CDP.
+ */
+ reg = (saved_reg & ~active_mask) |
+ FIELD_PREP(CBQRI_CONTROL_REGISTERS_OP_MASK, operation) |
+ FIELD_PREP(CBQRI_CONTROL_REGISTERS_AT_MASK,
+ CBQRI_CONTROL_REGISTERS_AT_CODE);
+ writel(lower_32_bits(reg), ctrl->base + reg_offset);
+ if (cbqri_wait_busy_flag(ctrl, reg_offset, ®) < 0) {
+ pr_err("BUSY timeout setting AT field\n");
+ return -EIO;
+ }
+
+ /*
+ * If the AT field value has been reset to zero,
+ * then the AT support is not present
+ */
+ at = FIELD_GET(CBQRI_CONTROL_REGISTERS_AT_MASK, reg);
+ if (at == CBQRI_CONTROL_REGISTERS_AT_CODE)
+ *access_type_supported = true;
+ }
+
+ /*
+ * Restore the original register value.
+ * Clear OP to avoid re-triggering the probe op.
+ */
+ saved_reg &= ~CBQRI_CONTROL_REGISTERS_OP_MASK;
+ writel(lower_32_bits(saved_reg), ctrl->base + reg_offset);
+ if (cbqri_wait_busy_flag(ctrl, reg_offset, NULL) < 0) {
+ pr_err("BUSY timeout restoring register value\n");
+ return -EIO;
+ }
+
+ return 0;
+}
+
+static int cbqri_probe_cc(struct cbqri_controller *ctrl)
+{
+ int err, status;
+ int ver_major, ver_minor;
+ u64 reg;
+
+ reg = cbqri_readq(ctrl->base + CBQRI_CC_CAPABILITIES_OFF);
+ if (reg == 0)
+ return -ENODEV;
+
+ ver_minor = FIELD_GET(CBQRI_CC_CAPABILITIES_VER_MINOR_MASK, reg);
+ ver_major = FIELD_GET(CBQRI_CC_CAPABILITIES_VER_MAJOR_MASK, reg);
+ ctrl->cc.ncblks = FIELD_GET(CBQRI_CC_CAPABILITIES_NCBLKS_MASK, reg);
+
+ pr_debug("version=%d.%d ncblks=%d cache_level=%d\n",
+ ver_major, ver_minor,
+ ctrl->cc.ncblks, ctrl->cache.cache_level);
+
+ /*
+ * NCBLKS == 0 would divide-by-zero in the schemata math while
+ * ctrl->lock is held.
+ */
+ if (!ctrl->cc.ncblks) {
+ pr_warn("CC at %pa has 0 capacity blocks, skipping\n",
+ &ctrl->addr);
+ return -ENODEV;
+ }
+
+ if (ctrl->cc.ncblks > 32) {
+ pr_warn("CC at %pa has ncblks=%u > 32 (resctrl CBM is u32), skipping\n",
+ &ctrl->addr, ctrl->cc.ncblks);
+ return -ENODEV;
+ }
+
+ /*
+ * On a CUNITS-capable controller, CONFIG_LIMIT also consumes cc_cunits,
+ * whose reset value is unspecified. The driver clears it to 0 (no limit)
+ * on every CONFIG_LIMIT, so cc_cunits at 0x28 must be within the mapping.
+ */
+ ctrl->cc.cunits = FIELD_GET(CBQRI_CC_CAPABILITIES_CUNITS_MASK, reg);
+ if (ctrl->cc.cunits && ctrl->size < CBQRI_CC_CUNITS_OFF + 8) {
+ pr_warn("CC at %pa supports CUNITS but maps only %pa, skipping\n",
+ &ctrl->addr, &ctrl->size);
+ return -ENODEV;
+ }
+
+ /* Probe allocation features */
+ err = cbqri_probe_feature(ctrl, CBQRI_CC_ALLOC_CTL_OFF,
+ CBQRI_CC_ALLOC_CTL_OP_READ_LIMIT,
+ &status, &ctrl->cc.supports_alloc_at_code);
+ if (err)
+ return err;
+
+ if (status == CBQRI_CC_ALLOC_CTL_STATUS_SUCCESS)
+ ctrl->alloc_capable = true;
+
+ return 0;
+}
+
+static int cbqri_probe_controller(struct cbqri_controller *ctrl)
+{
+ int err;
+
+ pr_debug("controller info: type=%d addr=%pa size=%pa max-rcid=%u\n",
+ ctrl->type, &ctrl->addr, &ctrl->size, ctrl->rcid_count);
+
+ if (!ctrl->addr) {
+ pr_warn("controller has invalid addr=0x0, skipping\n");
+ return -EINVAL;
+ }
+
+ if (ctrl->size < CBQRI_CTRL_MIN_REG_SPAN) {
+ pr_warn("controller at %pa: size %pa < minimum 0x%x, skipping\n",
+ &ctrl->addr, &ctrl->size, CBQRI_CTRL_MIN_REG_SPAN);
+ return -EINVAL;
+ }
+
+ if (!request_mem_region(ctrl->addr, ctrl->size, "cbqri_controller")) {
+ pr_err("request_mem_region failed for %pa\n", &ctrl->addr);
+ return -EBUSY;
+ }
+
+ ctrl->base = ioremap(ctrl->addr, ctrl->size);
+ if (!ctrl->base) {
+ pr_err("ioremap failed for %pa\n", &ctrl->addr);
+ err = -ENOMEM;
+ goto err_release;
+ }
+
+ switch (ctrl->type) {
+ case CBQRI_CONTROLLER_TYPE_CAPACITY:
+ err = cbqri_probe_cc(ctrl);
+ break;
+ default:
+ pr_err("unknown controller type %d\n", ctrl->type);
+ err = -ENODEV;
+ break;
+ }
+
+ if (err)
+ goto err_iounmap;
+
+ return 0;
+
+err_iounmap:
+ iounmap(ctrl->base);
+ ctrl->base = NULL;
+err_release:
+ release_mem_region(ctrl->addr, ctrl->size);
+ return err;
+}
+
+void cbqri_controller_destroy(struct cbqri_controller *ctrl)
+{
+ /*
+ * cbqri_probe_controller() clears ctrl->base on its error paths and
+ * releases the mem region itself, so reach into both only when
+ * destroy is rolling back a successful probe.
+ */
+ if (ctrl->base) {
+ iounmap(ctrl->base);
+ release_mem_region(ctrl->addr, ctrl->size);
+ }
+ kfree(ctrl);
+}
+
+/**
+ * riscv_cbqri_register_cc_dt() - register a DT-described capacity controller
+ * @info: registration descriptor. info->cache_id is used as the
+ * resctrl domain id. info->type must be CAPACITY.
+ * @cache_level: cache level (2 or 3) the controller backs, mapped to the
+ * resctrl L2/L3 resource by the resctrl glue.
+ * @cpu_mask: CPUs that share this cache.
+ *
+ * The cache topology is supplied directly by the caller. A device-tree
+ * platform driver that already knows which CPUs share the cache and at what
+ * level passes that in. There is no firmware table to resolve it from.
+ *
+ * Return: 0 on success, or a negative errno on failure.
+ */
+int riscv_cbqri_register_cc_dt(const struct cbqri_controller_info *info,
+ u32 cache_level, const struct cpumask *cpu_mask)
+{
+ struct cbqri_controller *ctrl;
+ int err;
+
+ if (!info->addr) {
+ pr_warn("skipping controller with invalid addr=0x0\n");
+ return -EINVAL;
+ }
+
+ if (info->type != CBQRI_CONTROLLER_TYPE_CAPACITY) {
+ pr_warn("register_cc_dt called with non-capacity type %u\n",
+ info->type);
+ return -EINVAL;
+ }
+
+ if (!cpu_mask || cpumask_empty(cpu_mask)) {
+ pr_warn("register_cc_dt called with empty cpu_mask\n");
+ return -EINVAL;
+ }
+
+ ctrl = kzalloc(sizeof(*ctrl), GFP_KERNEL);
+ if (!ctrl)
+ return -ENOMEM;
+
+ mutex_init(&ctrl->lock);
+
+ ctrl->addr = info->addr;
+ ctrl->size = info->size;
+ ctrl->type = info->type;
+ ctrl->rcid_count = info->rcid_count;
+
+ /*
+ * SRMCFG encodes RCID in 12 bits. Reject an out-of-range count rather
+ * than silently truncating in every FIELD_PREP(SRMCFG_RCID_MASK, closid)
+ * on the schedule-in fast path.
+ */
+ if (ctrl->rcid_count > FIELD_MAX(SRMCFG_RCID_MASK) + 1) {
+ pr_warn("CC at %pa has RCID count %u beyond the 12-bit SRMCFG field, skipping\n",
+ &ctrl->addr, ctrl->rcid_count);
+ cbqri_controller_destroy(ctrl);
+ return -EINVAL;
+ }
+
+ ctrl->cache.cache_id = info->cache_id;
+ ctrl->cache.cache_level = cache_level;
+ cpumask_copy(&ctrl->cache.cpu_mask, cpu_mask);
+
+ err = cbqri_probe_controller(ctrl);
+ if (err) {
+ cbqri_controller_destroy(ctrl);
+ return err;
+ }
+
+ /*
+ * Allocation capability comes from the capabilities register probed
+ * above, not from device tree. rcid_count only bounds the RCID range,
+ * so a controller the hardware reports as alloc-capable but described
+ * with no RCID count cannot be driven. Reject that inconsistency. A
+ * monitoring-only controller (not alloc_capable) needs no RCID count.
+ */
+ if (ctrl->alloc_capable && !ctrl->rcid_count) {
+ pr_warn("CC at %pa is alloc-capable but has no RCID count, skipping\n",
+ &ctrl->addr);
+ cbqri_controller_destroy(ctrl);
+ return -EINVAL;
+ }
+
+ mutex_lock(&cbqri_controllers_lock);
+ list_add_tail(&ctrl->list, &cbqri_controllers);
+ mutex_unlock(&cbqri_controllers_lock);
+ return 0;
+}
diff --git a/drivers/resctrl/cbqri_internal.h b/drivers/resctrl/cbqri_internal.h
new file mode 100644
index 000000000000..cbc9166ad4c0
--- /dev/null
+++ b/drivers/resctrl/cbqri_internal.h
@@ -0,0 +1,124 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef _DRIVERS_RESCTRL_CBQRI_INTERNAL_H
+#define _DRIVERS_RESCTRL_CBQRI_INTERNAL_H
+
+#include <linux/bitfield.h>
+#include <linux/riscv_cbqri.h>
+#include <linux/cpumask.h>
+#include <linux/list.h>
+#include <linux/mutex.h>
+#include <linux/types.h>
+
+/* Capacity Controller (CC) MMIO register offsets. */
+#define CBQRI_CC_CAPABILITIES_OFF 0
+#define CBQRI_CC_ALLOC_CTL_OFF 24
+#define CBQRI_CC_BLOCK_MASK_OFF 32
+
+/*
+ * Per CBQRI 3.5 the block-mask width BMW is the smallest multiple of 64 bits
+ * that holds NCBLKS, so cc_cunits sits at 32 + BMW/8 bytes. This constant is
+ * valid only while NCBLKS <= 64. cbqri_probe_cc() rejects ncblks > 32 before
+ * it reads cc_capabilities.CUNITS, forcing BMW to 64 bits, one 8-byte block
+ * mask, and cc_cunits at 0x28. Raising that cap above 64 would require
+ * computing 32 + roundup(ncblks, 64) / 8 instead of a constant.
+ */
+#define CBQRI_CC_CUNITS_OFF 40
+
+/*
+ * Highest base register offset (cc_block_mask at 0x20) plus its 8-byte width.
+ * cbqri_probe_controller() rejects smaller mappings. cc_cunits at 0x28 is
+ * optional and only required when cc_capabilities.CUNITS is set, which
+ * cbqri_probe_cc() checks against the mapping size separately.
+ */
+#define CBQRI_CTRL_MIN_REG_SPAN 0x28u
+
+#define CBQRI_CC_CAPABILITIES_VER_MINOR_MASK GENMASK_ULL(3, 0)
+#define CBQRI_CC_CAPABILITIES_VER_MAJOR_MASK GENMASK_ULL(7, 4)
+#define CBQRI_CC_CAPABILITIES_NCBLKS_MASK GENMASK_ULL(23, 8)
+#define CBQRI_CC_CAPABILITIES_CUNITS_MASK BIT_ULL(25)
+
+/*
+ * CC control registers are 64-bit, but the CBQRI spec only guarantees
+ * single-copy atomicity for naturally aligned 4-byte accesses. They are read
+ * as two 32-bit halves (cbqri_readq) reconstructed into a u64 for field
+ * extraction, and written via the low 32-bit half, so the driver does not
+ * depend on native 64-bit MMIO. Keep every field mask GENMASK_ULL so
+ * FIELD_MODIFY() or ~mask on the reconstructed u64 never zero-extends a
+ * 32-bit mask and clobbers STATUS/BUSY/WPRI in bits 63:32.
+ */
+#define CBQRI_CONTROL_REGISTERS_OP_MASK GENMASK_ULL(4, 0)
+#define CBQRI_CONTROL_REGISTERS_AT_MASK GENMASK_ULL(7, 5)
+/* AT field values (CBQRI Table 1): data vs code half for CDP */
+#define CBQRI_CONTROL_REGISTERS_AT_DATA 0
+#define CBQRI_CONTROL_REGISTERS_AT_CODE 1
+#define CBQRI_CONTROL_REGISTERS_RCID_MASK GENMASK_ULL(19, 8)
+#define CBQRI_CONTROL_REGISTERS_STATUS_MASK GENMASK_ULL(38, 32)
+#define CBQRI_CONTROL_REGISTERS_BUSY_MASK GENMASK_ULL(39, 39)
+
+#define CBQRI_CC_ALLOC_CTL_OP_CONFIG_LIMIT 1
+#define CBQRI_CC_ALLOC_CTL_OP_READ_LIMIT 2
+#define CBQRI_CC_ALLOC_CTL_STATUS_SUCCESS 1
+
+/* Capacity Controller hardware capabilities */
+struct riscv_cbqri_capacity_caps {
+ u16 ncblks;
+ bool supports_alloc_at_code;
+ /* cc_capabilities.CUNITS: controller enforces a capacity-unit limit */
+ bool cunits;
+};
+
+/**
+ * struct cbqri_cc_config - desired capacity allocation state for one rcid
+ * @cbm: capacity block mask
+ * @at: AT half the @cbm applies to (CBQRI_CONTROL_REGISTERS_AT_DATA
+ * or CBQRI_CONTROL_REGISTERS_AT_CODE)
+ * @cdp_enabled: when false and the controller supports AT, mirror @cbm
+ * into the other AT half so both stay in sync
+ */
+struct cbqri_cc_config {
+ u64 cbm;
+ u32 at;
+ bool cdp_enabled;
+};
+
+struct cbqri_controller {
+ void __iomem *base;
+ /*
+ * Serializes the write-then-poll-busy MMIO sequences on this
+ * controller. Each CBQRI op may busy-wait up to 1 ms on slow
+ * firmware, so use a sleeping mutex to keep preemption enabled.
+ * All resctrl-arch entry points run in process context.
+ */
+ struct mutex lock;
+
+ struct riscv_cbqri_capacity_caps cc;
+
+ bool alloc_capable;
+
+ phys_addr_t addr;
+ phys_addr_t size;
+ enum cbqri_controller_type type;
+ u32 rcid_count;
+
+ struct list_head list;
+
+ struct cache_controller {
+ u32 cache_level;
+ struct cpumask cpu_mask;
+ /* Cache id used as the resctrl domain id */
+ u32 cache_id;
+ } cache;
+};
+
+extern struct list_head cbqri_controllers;
+extern struct mutex cbqri_controllers_lock;
+
+void cbqri_controller_destroy(struct cbqri_controller *ctrl);
+
+int cbqri_apply_cache_config(struct cbqri_controller *ctrl, u32 closid,
+ const struct cbqri_cc_config *cfg);
+
+int cbqri_read_cache_config(struct cbqri_controller *ctrl, u32 closid,
+ u32 at, u32 *cbm_out);
+
+#endif /* _DRIVERS_RESCTRL_CBQRI_INTERNAL_H */
diff --git a/include/linux/riscv_cbqri.h b/include/linux/riscv_cbqri.h
new file mode 100644
index 000000000000..58737224d2f2
--- /dev/null
+++ b/include/linux/riscv_cbqri.h
@@ -0,0 +1,47 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Public registration API for the RISC-V Capacity and Bandwidth QoS
+ * Register Interface (CBQRI) core. Discovery layers (device tree
+ * platform drivers) call riscv_cbqri_register_cc_dt() to hand a capacity
+ * controller descriptor to the core, which owns all subsequent state.
+ */
+#ifndef _LINUX_RISCV_CBQRI_H
+#define _LINUX_RISCV_CBQRI_H
+
+#include <linux/types.h>
+
+struct cpumask;
+
+enum cbqri_controller_type {
+ CBQRI_CONTROLLER_TYPE_CAPACITY,
+};
+
+/**
+ * struct cbqri_controller_info - registration descriptor
+ * @addr: MMIO base address of the controller's register interface
+ * @size: size of the MMIO region
+ * @type: controller type (capacity)
+ * @rcid_count: number of supported RCIDs
+ * @cache_id: cache id used as the resctrl domain id
+ */
+struct cbqri_controller_info {
+ phys_addr_t addr;
+ phys_addr_t size;
+ enum cbqri_controller_type type;
+ u32 rcid_count;
+ u32 cache_id;
+};
+
+#if IS_ENABLED(CONFIG_RISCV_CBQRI)
+int riscv_cbqri_register_cc_dt(const struct cbqri_controller_info *info,
+ u32 cache_level, const struct cpumask *cpu_mask);
+#else
+static inline int
+riscv_cbqri_register_cc_dt(const struct cbqri_controller_info *info,
+ u32 cache_level, const struct cpumask *cpu_mask)
+{
+ return -ENODEV;
+}
+#endif
+
+#endif /* _LINUX_RISCV_CBQRI_H */
--
2.43.0
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v3 5/8] riscv_cbqri: resctrl: Add cache allocation via capacity block mask
2026-06-28 21:18 [PATCH v3 0/8] riscv: Add Ssqosid and initial CBQRI resctrl support Drew Fustini
` (3 preceding siblings ...)
2026-06-28 21:18 ` [PATCH v3 4/8] riscv_cbqri: Add capacity controller probe and allocation device ops Drew Fustini
@ 2026-06-28 21:18 ` Drew Fustini
2026-06-28 21:34 ` sashiko-bot
2026-06-28 21:18 ` [PATCH v3 6/8] riscv: Enable resctrl filesystem for Ssqosid Drew Fustini
` (2 subsequent siblings)
7 siblings, 1 reply; 14+ messages in thread
From: Drew Fustini @ 2026-06-28 21:18 UTC (permalink / raw)
To: Adrien Ricciardi, Alexandre Ghiti, Atish Kumar Patra, Atish Patra,
Babu Moger, Ben Horgan, Borislav Petkov, Chen Pei, Conor Dooley,
Conor Dooley, Dave Hansen, Dave Martin, Fenghua Yu, Gong Shuai,
Gong Shuai, guo.wenjia23, James Morse, Kornel Dulęba,
Krzysztof Kozlowski, liu.qingtao2, Liu Zhiwei, Palmer Dabbelt,
Paul Walmsley, Peter Newman, Radim Krčmář,
Reinette Chatre, Rob Herring, Samuel Holland,
Sebastian Andrzej Siewior, Tony Luck, Vasudevan Srinivasan,
Ved Shanbhogue, Weiwei Li, yunhui cui, Drew Fustini
Cc: linux-kernel, linux-riscv, x86, devicetree, linux-rt-devel,
linux-doc
Wire CBQRI capacity controllers into resctrl as RDT_RESOURCE_L2 and
RDT_RESOURCE_L3 schemata.
Mismatched CC caps at the same cache level are treated as a fatal
configuration error since fs/resctrl exposes a single per-rid cap
set. Domains are created lazily in the cpuhp online callback so
cpu_mask reflects only currently online CPUs.
Assisted-by: Claude:claude-opus-4-7
Co-developed-by: Adrien Ricciardi <aricciardi@baylibre.com>
Signed-off-by: Adrien Ricciardi <aricciardi@baylibre.com>
Signed-off-by: Drew Fustini <fustini@kernel.org>
---
MAINTAINERS | 2 +
arch/riscv/include/asm/resctrl.h | 147 ++++++++
drivers/resctrl/Kconfig | 4 +
drivers/resctrl/Makefile | 1 +
drivers/resctrl/cbqri_resctrl.c | 787 +++++++++++++++++++++++++++++++++++++++
5 files changed, 941 insertions(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index 811c0c9b1fac..9e1092165046 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -23299,9 +23299,11 @@ R: yunhui cui <cuiyunhui@bytedance.com>
L: linux-riscv@lists.infradead.org
S: Supported
F: arch/riscv/include/asm/qos.h
+F: arch/riscv/include/asm/resctrl.h
F: arch/riscv/kernel/qos.c
F: drivers/resctrl/cbqri_devices.c
F: drivers/resctrl/cbqri_internal.h
+F: drivers/resctrl/cbqri_resctrl.c
F: include/linux/riscv_cbqri.h
RISC-V RPMI AND MPXY DRIVERS
diff --git a/arch/riscv/include/asm/resctrl.h b/arch/riscv/include/asm/resctrl.h
new file mode 100644
index 000000000000..b08f4e12f7aa
--- /dev/null
+++ b/arch/riscv/include/asm/resctrl.h
@@ -0,0 +1,147 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef _ASM_RISCV_RESCTRL_H
+#define _ASM_RISCV_RESCTRL_H
+
+#include <linux/resctrl_types.h>
+#include <linux/sched.h>
+#include <linux/types.h>
+
+#include <asm/qos.h>
+
+struct rdt_resource;
+
+/*
+ * Sentinel "no CLOSID assigned" used by resctrl_arch_rmid_idx_decode().
+ * fs/resctrl treats this opaquely. CBQRI uses MCID directly as the linear
+ * rmid index, so closid is unused on decode.
+ */
+#define RISCV_RESCTRL_EMPTY_CLOSID ((u32)~0)
+
+/*
+ * Terminology mapping between x86 (Intel RDT/AMD QoS) and RISC-V:
+ *
+ * CLOSID on x86 is RCID on RISC-V
+ * RMID on x86 is MCID on RISC-V
+ * CDP on x86 is AT (access type) on RISC-V
+ */
+
+/**
+ * resctrl_arch_alloc_capable() - any CBQRI controller exposes resctrl alloc
+ *
+ * Returns true once at least one CBQRI controller has successfully probed for
+ * a resctrl-exposed cache capacity allocation feature. Only meaningful after
+ * cbqri_resctrl_setup() runs at late_initcall.
+ */
+bool resctrl_arch_alloc_capable(void);
+
+/**
+ * resctrl_arch_mon_capable() - any CBQRI controller exposes resctrl monitoring
+ *
+ * The CBQRI driver implements capacity allocation only and wires up no
+ * monitoring events, so this always returns false. fs/resctrl references it
+ * unconditionally, hence the stub.
+ */
+bool resctrl_arch_mon_capable(void);
+
+/**
+ * resctrl_arch_rmid_idx_encode() - encode (RCID, MCID) into a linear index
+ * @closid: RCID (resource control id)
+ * @rmid: MCID (monitoring counter id)
+ *
+ * RISC-V uses MCID directly as the linear index into per-RMID arrays
+ * managed by fs/resctrl, since CBQRI controllers admit any MCID for any
+ * RCID. closid is unused here. CDP is encoded via the AT field on each
+ * CBQRI op rather than via the index.
+ */
+u32 resctrl_arch_rmid_idx_encode(u32 closid, u32 rmid);
+
+/**
+ * resctrl_arch_rmid_idx_decode() - inverse of resctrl_arch_rmid_idx_encode()
+ * @idx: linear index
+ * @closid: out: always RISCV_RESCTRL_EMPTY_CLOSID
+ * @rmid: out: the MCID that @idx encodes
+ */
+void resctrl_arch_rmid_idx_decode(u32 idx, u32 *closid, u32 *rmid);
+
+/**
+ * resctrl_arch_set_cpu_default_closid_rmid() - install per-CPU srmcfg default
+ * @cpu: CPU number
+ * @closid: RCID to use when no task is matched
+ * @rmid: MCID to use when no task is matched
+ *
+ * Sets the per-CPU cpu_srmcfg_default so __switch_to_srmcfg() can fall back
+ * to the CPU's default RCID/MCID for default-group tasks (those whose
+ * thread.srmcfg encodes to 0, i.e. closid == RESCTRL_RESERVED_CLOSID and
+ * rmid == RESCTRL_RESERVED_RMID). Implements resctrl allocation rule 2
+ * ("CPU default") on RISC-V.
+ */
+void resctrl_arch_set_cpu_default_closid_rmid(int cpu, u32 closid, u32 rmid);
+
+/**
+ * resctrl_arch_sched_in() - context-switch hook to install task RCID/MCID
+ * @tsk: the task being scheduled in
+ *
+ * Called from finish_task_switch() to write tsk->thread.srmcfg into the
+ * srmcfg CSR. Tasks tagged with RISCV_RESCTRL_EMPTY_CLOSID inherit the
+ * per-CPU default set via resctrl_arch_set_cpu_default_closid_rmid().
+ */
+void resctrl_arch_sched_in(struct task_struct *tsk);
+
+/**
+ * resctrl_arch_set_closid_rmid() - tag a task with an RCID/MCID
+ * @tsk: task to tag
+ * @closid: RCID to install
+ * @rmid: MCID to install
+ *
+ * Updates tsk->thread.srmcfg with the encoded (RCID, MCID) pair. The new
+ * value takes effect on the next resctrl_arch_sched_in() for this task.
+ */
+void resctrl_arch_set_closid_rmid(struct task_struct *tsk, u32 closid, u32 rmid);
+
+/**
+ * resctrl_arch_match_closid() - test whether a task carries a given RCID
+ * @tsk: task
+ * @closid: RCID
+ */
+bool resctrl_arch_match_closid(struct task_struct *tsk, u32 closid);
+
+/**
+ * resctrl_arch_match_rmid() - test whether a task carries a given (RCID, MCID)
+ * @tsk: task
+ * @closid: RCID
+ * @rmid: MCID
+ */
+bool resctrl_arch_match_rmid(struct task_struct *tsk, u32 closid, u32 rmid);
+
+/**
+ * resctrl_arch_mon_ctx_alloc() - allocate per-monitor-event arch context
+ * @r: resctrl resource being monitored
+ * @evtid: which monitor event needs context
+ *
+ * The CBQRI driver implements no monitoring events, so there is no per-event
+ * context to allocate and the stub returns NULL. fs/resctrl references it
+ * unconditionally before checking resctrl_arch_mon_capable().
+ */
+void *resctrl_arch_mon_ctx_alloc(struct rdt_resource *r, enum resctrl_event_id evtid);
+
+/**
+ * resctrl_arch_mon_ctx_free() - release context returned by mon_ctx_alloc()
+ * @r: resctrl resource
+ * @evtid: monitor event id
+ * @arch_mon_ctx: pointer returned by resctrl_arch_mon_ctx_alloc()
+ */
+void resctrl_arch_mon_ctx_free(struct rdt_resource *r, enum resctrl_event_id evtid,
+ void *arch_mon_ctx);
+
+static inline unsigned int resctrl_arch_round_mon_val(unsigned int val)
+{
+ return val;
+}
+
+/* Not needed for RISC-V */
+static inline void resctrl_arch_enable_mon(void) { }
+static inline void resctrl_arch_disable_mon(void) { }
+static inline void resctrl_arch_enable_alloc(void) { }
+static inline void resctrl_arch_disable_alloc(void) { }
+
+#endif /* _ASM_RISCV_RESCTRL_H */
diff --git a/drivers/resctrl/Kconfig b/drivers/resctrl/Kconfig
index 92b9c82cf9f3..f8566c003d49 100644
--- a/drivers/resctrl/Kconfig
+++ b/drivers/resctrl/Kconfig
@@ -42,3 +42,7 @@ menuconfig RISCV_CBQRI
if RISCV_CBQRI
endif
+
+config RISCV_CBQRI_RESCTRL_FS
+ bool
+ default y if RISCV_CBQRI && RESCTRL_FS
diff --git a/drivers/resctrl/Makefile b/drivers/resctrl/Makefile
index 4d8a2c4b5627..a7631712dba9 100644
--- a/drivers/resctrl/Makefile
+++ b/drivers/resctrl/Makefile
@@ -6,3 +6,4 @@ ccflags-$(CONFIG_ARM64_MPAM_DRIVER_DEBUG) += -DDEBUG
obj-$(CONFIG_RISCV_CBQRI) += cbqri.o
cbqri-y += cbqri_devices.o
+cbqri-$(CONFIG_RISCV_CBQRI_RESCTRL_FS) += cbqri_resctrl.o
diff --git a/drivers/resctrl/cbqri_resctrl.c b/drivers/resctrl/cbqri_resctrl.c
new file mode 100644
index 000000000000..1fb0fbe1b000
--- /dev/null
+++ b/drivers/resctrl/cbqri_resctrl.c
@@ -0,0 +1,787 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#define pr_fmt(fmt) "%s:%s: " fmt, KBUILD_MODNAME, __func__
+
+#include <linux/bitfield.h>
+#include <linux/cacheinfo.h>
+#include <linux/riscv_cbqri.h>
+#include <linux/cpu.h>
+#include <linux/cpufeature.h>
+#include <linux/cpuhotplug.h>
+#include <linux/err.h>
+#include <linux/init.h>
+#include <linux/resctrl.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+
+#include <asm/csr.h>
+#include <asm/qos.h>
+
+#include "cbqri_internal.h"
+
+struct cbqri_resctrl_res {
+ struct cbqri_controller *ctrl;
+ struct rdt_resource resctrl_res;
+ bool cdp_enabled;
+};
+
+struct cbqri_resctrl_dom {
+ struct rdt_ctrl_domain resctrl_ctrl_dom;
+ struct cbqri_controller *hw_ctrl;
+};
+
+static struct cbqri_resctrl_res cbqri_resctrl_resources[RDT_NUM_RESOURCES];
+
+static bool exposed_alloc_capable;
+
+/* Protects ctrl_domain list mutations across CPU hotplug. */
+static DEFINE_MUTEX(cbqri_domain_list_lock);
+
+static struct rdt_ctrl_domain *
+cbqri_find_ctrl_domain(struct list_head *h, int id)
+{
+ struct rdt_domain_hdr *hdr = resctrl_find_domain(h, id, NULL);
+
+ return hdr ? container_of(hdr, struct rdt_ctrl_domain, hdr) : NULL;
+}
+
+/* Map a hardware cache level to its resctrl resource id, or -ENODEV. */
+static int cbqri_cache_level_to_rid(u32 cache_level)
+{
+ switch (cache_level) {
+ case 2:
+ return RDT_RESOURCE_L2;
+ case 3:
+ return RDT_RESOURCE_L3;
+ default:
+ return -ENODEV;
+ }
+}
+
+static int cbqri_apply_cache_config_dom(struct cbqri_resctrl_dom *hw_dom,
+ struct rdt_resource *r,
+ u32 closid, enum resctrl_conf_type t,
+ u64 cbm)
+{
+ struct cbqri_resctrl_res *hw_res =
+ container_of(r, struct cbqri_resctrl_res, resctrl_res);
+ struct cbqri_cc_config cfg = {
+ .cbm = cbm,
+ .at = (t == CDP_CODE) ? CBQRI_CONTROL_REGISTERS_AT_CODE :
+ CBQRI_CONTROL_REGISTERS_AT_DATA,
+ .cdp_enabled = hw_res->cdp_enabled,
+ };
+
+ return cbqri_apply_cache_config(hw_dom->hw_ctrl, closid, &cfg);
+}
+
+bool resctrl_arch_alloc_capable(void)
+{
+ return exposed_alloc_capable;
+}
+
+bool resctrl_arch_mon_capable(void)
+{
+ return false;
+}
+
+bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level rid)
+{
+ if (rid != RDT_RESOURCE_L2 && rid != RDT_RESOURCE_L3)
+ return false;
+ return cbqri_resctrl_resources[rid].cdp_enabled;
+}
+
+int resctrl_arch_set_cdp_enabled(enum resctrl_res_level rid, bool enable)
+{
+ struct cbqri_resctrl_res *cbqri_res;
+
+ if (rid != RDT_RESOURCE_L2 && rid != RDT_RESOURCE_L3)
+ return -ENODEV;
+
+ cbqri_res = &cbqri_resctrl_resources[rid];
+ if (!cbqri_res->resctrl_res.cdp_capable)
+ return -ENODEV;
+
+ cbqri_res->cdp_enabled = enable;
+ return 0;
+}
+
+struct rdt_resource *resctrl_arch_get_resource(enum resctrl_res_level l)
+{
+ if (l >= RDT_NUM_RESOURCES)
+ return NULL;
+
+ return &cbqri_resctrl_resources[l].resctrl_res;
+}
+
+/*
+ * fs/resctrl unconditionally references the symbols below before checking
+ * mon_capable. They are stubs for features CBQRI does not yet support.
+ */
+bool resctrl_arch_is_evt_configurable(enum resctrl_event_id evt)
+{
+ return false;
+}
+
+void *resctrl_arch_mon_ctx_alloc(struct rdt_resource *r,
+ enum resctrl_event_id evtid)
+{
+ return NULL;
+}
+
+void resctrl_arch_mon_ctx_free(struct rdt_resource *r,
+ enum resctrl_event_id evtid, void *arch_mon_ctx)
+{
+}
+
+void resctrl_arch_config_cntr(struct rdt_resource *r, struct rdt_l3_mon_domain *d,
+ enum resctrl_event_id evtid, u32 rmid, u32 closid,
+ u32 cntr_id, bool assign)
+{
+}
+
+int resctrl_arch_cntr_read(struct rdt_resource *r, struct rdt_l3_mon_domain *d,
+ u32 unused, u32 rmid, int cntr_id,
+ enum resctrl_event_id eventid, u64 *val)
+{
+ return -EOPNOTSUPP;
+}
+
+bool resctrl_arch_mbm_cntr_assign_enabled(struct rdt_resource *r)
+{
+ return false;
+}
+
+int resctrl_arch_mbm_cntr_assign_set(struct rdt_resource *r, bool enable)
+{
+ return -EOPNOTSUPP;
+}
+
+void resctrl_arch_reset_cntr(struct rdt_resource *r, struct rdt_l3_mon_domain *d,
+ u32 unused, u32 rmid, int cntr_id,
+ enum resctrl_event_id eventid)
+{
+}
+
+bool resctrl_arch_get_io_alloc_enabled(struct rdt_resource *r)
+{
+ return false;
+}
+
+int resctrl_arch_io_alloc_enable(struct rdt_resource *r, bool enable)
+{
+ return -EOPNOTSUPP;
+}
+
+void resctrl_arch_mon_event_config_read(void *info)
+{
+}
+
+void resctrl_arch_mon_event_config_write(void *info)
+{
+}
+
+void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_l3_mon_domain *d)
+{
+}
+
+void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_l3_mon_domain *d,
+ u32 unused, u32 rmid, enum resctrl_event_id eventid)
+{
+}
+
+int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain_hdr *hdr,
+ u32 closid, u32 rmid, enum resctrl_event_id eventid,
+ void *arch_priv, u64 *val, void *arch_mon_ctx)
+{
+ return -ENODATA;
+}
+
+/*
+ * Note about terminology between x86 (Intel RDT/AMD QoS) and RISC-V:
+ * CLOSID on x86 is RCID on RISC-V
+ * RMID on x86 is MCID on RISC-V
+ */
+u32 resctrl_arch_get_num_closid(struct rdt_resource *res)
+{
+ struct cbqri_resctrl_res *hw_res;
+
+ hw_res = container_of(res, struct cbqri_resctrl_res, resctrl_res);
+
+ if (!hw_res->ctrl)
+ return 0;
+
+ return hw_res->ctrl->rcid_count;
+}
+
+u32 resctrl_arch_system_num_rmid_idx(void)
+{
+ return 1;
+}
+
+u32 resctrl_arch_rmid_idx_encode(u32 closid, u32 rmid)
+{
+ return rmid;
+}
+
+void resctrl_arch_rmid_idx_decode(u32 idx, u32 *closid, u32 *rmid)
+{
+ *closid = RISCV_RESCTRL_EMPTY_CLOSID;
+ *rmid = idx;
+}
+
+void resctrl_arch_set_cpu_default_closid_rmid(int cpu, u32 closid, u32 rmid)
+{
+ u32 srmcfg = FIELD_PREP(SRMCFG_RCID_MASK, closid) |
+ FIELD_PREP(SRMCFG_MCID_MASK, rmid);
+
+ WRITE_ONCE(per_cpu(cpu_srmcfg_default, cpu), srmcfg);
+}
+
+void resctrl_arch_sched_in(struct task_struct *tsk)
+{
+ __switch_to_srmcfg(tsk);
+}
+
+void resctrl_arch_set_closid_rmid(struct task_struct *tsk, u32 closid, u32 rmid)
+{
+ u32 srmcfg = FIELD_PREP(SRMCFG_RCID_MASK, closid) |
+ FIELD_PREP(SRMCFG_MCID_MASK, rmid);
+
+ WRITE_ONCE(tsk->thread.srmcfg, srmcfg);
+}
+
+void resctrl_arch_sync_cpu_closid_rmid(void *info)
+{
+ struct resctrl_cpu_defaults *r = info;
+
+ lockdep_assert_preemption_disabled();
+
+ if (r) {
+ resctrl_arch_set_cpu_default_closid_rmid(smp_processor_id(),
+ r->closid, r->rmid);
+ }
+
+ resctrl_arch_sched_in(current);
+}
+
+bool resctrl_arch_match_closid(struct task_struct *tsk, u32 closid)
+{
+ return FIELD_GET(SRMCFG_RCID_MASK, READ_ONCE(tsk->thread.srmcfg)) == closid;
+}
+
+bool resctrl_arch_match_rmid(struct task_struct *tsk, u32 closid, u32 rmid)
+{
+ return FIELD_GET(SRMCFG_MCID_MASK, READ_ONCE(tsk->thread.srmcfg)) == rmid;
+}
+
+void resctrl_arch_pre_mount(void)
+{
+ /* All controllers discovered at boot via late_initcall. Nothing to do. */
+}
+
+int resctrl_arch_update_one(struct rdt_resource *r, struct rdt_ctrl_domain *d,
+ u32 closid, enum resctrl_conf_type t, u32 cfg_val)
+{
+ struct cbqri_resctrl_dom *dom;
+
+ dom = container_of(d, struct cbqri_resctrl_dom, resctrl_ctrl_dom);
+
+ if (!r->alloc_capable)
+ return -EINVAL;
+
+ switch (r->rid) {
+ case RDT_RESOURCE_L2:
+ case RDT_RESOURCE_L3:
+ return cbqri_apply_cache_config_dom(dom, r, closid, t, cfg_val);
+ default:
+ return -EINVAL;
+ }
+}
+
+int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid)
+{
+ struct resctrl_staged_config *cfg;
+ enum resctrl_conf_type t;
+ struct rdt_ctrl_domain *d;
+ int err = 0;
+
+ /* Walking r->ctrl_domains, ensure it can't race with cpuhp */
+ lockdep_assert_cpus_held();
+
+ list_for_each_entry(d, &r->ctrl_domains, hdr.list) {
+ for (t = 0; t < CDP_NUM_TYPES; t++) {
+ cfg = &d->staged_config[t];
+ if (!cfg->have_new_ctrl)
+ continue;
+ err = resctrl_arch_update_one(r, d, closid, t, cfg->new_ctrl);
+ if (err)
+ return err;
+ }
+ }
+ return err;
+}
+
+u32 resctrl_arch_get_config(struct rdt_resource *r, struct rdt_ctrl_domain *d,
+ u32 closid, enum resctrl_conf_type type)
+{
+ struct cbqri_resctrl_dom *hw_dom;
+ struct cbqri_controller *ctrl;
+ u32 at;
+ u32 val;
+ int err;
+
+ hw_dom = container_of(d, struct cbqri_resctrl_dom, resctrl_ctrl_dom);
+ ctrl = hw_dom->hw_ctrl;
+ val = resctrl_get_default_ctrl(r);
+
+ if (!r->alloc_capable)
+ return val;
+
+ switch (r->rid) {
+ case RDT_RESOURCE_L2:
+ case RDT_RESOURCE_L3:
+ at = (type == CDP_CODE) ? CBQRI_CONTROL_REGISTERS_AT_CODE :
+ CBQRI_CONTROL_REGISTERS_AT_DATA;
+ err = cbqri_read_cache_config(ctrl, closid, at, &val);
+ if (err < 0)
+ val = resctrl_get_default_ctrl(r);
+ break;
+ default:
+ break;
+ }
+
+ return val;
+}
+
+void resctrl_arch_reset_all_ctrls(struct rdt_resource *r)
+{
+ struct cbqri_resctrl_res *hw_res;
+ struct rdt_ctrl_domain *d;
+ enum resctrl_conf_type t;
+ u32 default_ctrl;
+ int i;
+
+ lockdep_assert_cpus_held();
+
+ hw_res = container_of(r, struct cbqri_resctrl_res, resctrl_res);
+ default_ctrl = resctrl_get_default_ctrl(r);
+
+ if (!hw_res->ctrl)
+ return;
+
+ list_for_each_entry(d, &r->ctrl_domains, hdr.list) {
+ for (i = 0; i < hw_res->ctrl->rcid_count; i++) {
+ for (t = 0; t < CDP_NUM_TYPES; t++) {
+ int rerr;
+
+ rerr = resctrl_arch_update_one(r, d, i, t, default_ctrl);
+ if (rerr)
+ pr_err_ratelimited("rid=%d reset RCID %u type %u failed (%d)\n",
+ r->rid, i, t, rerr);
+ }
+ }
+ }
+}
+
+static struct rdt_ctrl_domain *cbqri_new_domain(struct cbqri_controller *ctrl)
+{
+ struct cbqri_resctrl_dom *hw_dom;
+ struct rdt_ctrl_domain *domain;
+
+ hw_dom = kzalloc_obj(*hw_dom, GFP_KERNEL);
+ if (!hw_dom)
+ return NULL;
+
+ hw_dom->hw_ctrl = ctrl;
+ domain = &hw_dom->resctrl_ctrl_dom;
+
+ INIT_LIST_HEAD(&domain->hdr.list);
+
+ return domain;
+}
+
+static int cbqri_init_domain_ctrlval(struct rdt_resource *r, struct rdt_ctrl_domain *d)
+{
+ struct cbqri_resctrl_res *hw_res;
+ enum resctrl_conf_type t;
+ int err = 0;
+ int i;
+
+ hw_res = container_of(r, struct cbqri_resctrl_res, resctrl_res);
+
+ for (i = 0; i < hw_res->ctrl->rcid_count; i++) {
+ /*
+ * Seed both DATA and CODE staged slots so a later mount
+ * with -o cdp does not see stale CODE values.
+ * On non-AT controllers cbqri_cc_alloc_op() masks AT to 0
+ * so all three iterations land on the same hardware state.
+ * The redundant writes are harmless.
+ */
+ for (t = 0; t < CDP_NUM_TYPES; t++) {
+ err = resctrl_arch_update_one(r, d, i, t,
+ resctrl_get_default_ctrl(r));
+ if (err)
+ return err;
+ }
+ }
+ return 0;
+}
+
+/*
+ * Walk cbqri_controllers and pick one capacity controller (CC) per cache
+ * level (L2/L3) to back the corresponding RDT_RESOURCE_L*. When more than
+ * one CC sits at the same level (e.g. one per socket), they must agree on
+ * rcid_count / ncblks / alloc_capable. A mismatch is fatal because resctrl
+ * exposes a single set of caps per rid. The first matching controller wins.
+ */
+static int cbqri_resctrl_pick_caches(void)
+{
+ struct cbqri_controller *ctrl;
+ int ret = 0;
+
+ mutex_lock(&cbqri_controllers_lock);
+
+ list_for_each_entry(ctrl, &cbqri_controllers, list) {
+ struct cbqri_resctrl_res *cbqri_res;
+ int rid;
+
+ if (ctrl->type != CBQRI_CONTROLLER_TYPE_CAPACITY)
+ continue;
+ if (!ctrl->alloc_capable)
+ continue;
+
+ rid = cbqri_cache_level_to_rid(ctrl->cache.cache_level);
+ if (rid < 0) {
+ pr_info("skipping controller at unsupported cache level %u\n",
+ ctrl->cache.cache_level);
+ continue;
+ }
+
+ cbqri_res = &cbqri_resctrl_resources[rid];
+ if (cbqri_res->ctrl) {
+ /*
+ * CCs at the same cache level must agree on every cap
+ * resctrl exposes globally. Reject mismatches at pick
+ * time so the inconsistency is visible at boot.
+ */
+ if (cbqri_res->ctrl->rcid_count != ctrl->rcid_count ||
+ cbqri_res->ctrl->cc.ncblks != ctrl->cc.ncblks ||
+ cbqri_res->ctrl->cc.supports_alloc_at_code !=
+ ctrl->cc.supports_alloc_at_code ||
+ cbqri_res->ctrl->alloc_capable != ctrl->alloc_capable) {
+ pr_err("L%d controllers have mismatched capabilities\n",
+ ctrl->cache.cache_level);
+ ret = -EINVAL;
+ break;
+ }
+ continue;
+ }
+
+ cbqri_res->ctrl = ctrl;
+ }
+
+ mutex_unlock(&cbqri_controllers_lock);
+ return ret;
+}
+
+/*
+ * Fill the rdt_resource fields for one picked rid. An rid with no picked
+ * controller is left untouched so it stays out of resctrl_arch_get_resource().
+ */
+static void cbqri_resctrl_control_init(struct cbqri_resctrl_res *cbqri_res)
+{
+ struct cbqri_controller *ctrl = cbqri_res->ctrl;
+ struct rdt_resource *res = &cbqri_res->resctrl_res;
+
+ if (!ctrl)
+ return;
+
+ switch (res->rid) {
+ case RDT_RESOURCE_L2:
+ case RDT_RESOURCE_L3:
+ res->name = (res->rid == RDT_RESOURCE_L2) ? "L2" : "L3";
+ res->schema_fmt = RESCTRL_SCHEMA_BITMAP;
+ res->ctrl_scope = (res->rid == RDT_RESOURCE_L2) ?
+ RESCTRL_L2_CACHE : RESCTRL_L3_CACHE;
+ res->cache.cbm_len = ctrl->cc.ncblks;
+ res->cache.shareable_bits = 0;
+ res->cache.min_cbm_bits = 1;
+ res->cache.arch_has_sparse_bitmasks = false;
+ res->cdp_capable = ctrl->cc.supports_alloc_at_code;
+ res->alloc_capable = ctrl->alloc_capable;
+ INIT_LIST_HEAD(&res->ctrl_domains);
+ INIT_LIST_HEAD(&res->mon_domains);
+ break;
+ default:
+ break;
+ }
+}
+
+static void cbqri_resctrl_accumulate_caps(void)
+{
+ int rid;
+
+ for (rid = 0; rid < RDT_NUM_RESOURCES; rid++) {
+ struct cbqri_resctrl_res *hw_res = &cbqri_resctrl_resources[rid];
+
+ if (!hw_res->ctrl)
+ continue;
+ if (hw_res->ctrl->alloc_capable)
+ exposed_alloc_capable = true;
+ }
+}
+
+/*
+ * Create, list-insert, and online a fresh ctrl_domain backing ctrl on
+ * resource res, seeded with cpu and identified by dom_id. Caller must
+ * hold cbqri_domain_list_lock and must have already verified that no
+ * existing ctrl_domain on res carries this id.
+ */
+static struct rdt_ctrl_domain *cbqri_create_ctrl_domain(struct cbqri_controller *ctrl,
+ struct rdt_resource *res,
+ unsigned int cpu, int dom_id)
+{
+ struct rdt_ctrl_domain *domain;
+ struct list_head *pos = NULL;
+ int err;
+
+ domain = cbqri_new_domain(ctrl);
+ if (!domain)
+ return ERR_PTR(-ENOMEM);
+
+ cpumask_set_cpu(cpu, &domain->hdr.cpu_mask);
+ domain->hdr.id = dom_id;
+ domain->hdr.type = RESCTRL_CTRL_DOMAIN;
+
+ err = cbqri_init_domain_ctrlval(res, domain);
+ if (err) {
+ kfree(container_of(domain, struct cbqri_resctrl_dom,
+ resctrl_ctrl_dom));
+ return ERR_PTR(err);
+ }
+
+ /* Insert sorted by id so user-visible ordering is deterministic. */
+ resctrl_find_domain(&res->ctrl_domains, dom_id, &pos);
+ list_add_tail(&domain->hdr.list, pos);
+
+ resctrl_online_ctrl_domain(res, domain);
+
+ return domain;
+}
+
+static int cbqri_attach_cpu_to_cap_ctrl(struct cbqri_controller *ctrl,
+ unsigned int cpu)
+{
+ struct cbqri_resctrl_res *hw_res;
+ struct rdt_ctrl_domain *domain;
+ struct rdt_resource *res;
+ int dom_id;
+ int rid;
+
+ rid = cbqri_cache_level_to_rid(ctrl->cache.cache_level);
+ if (rid < 0)
+ return 0;
+ hw_res = &cbqri_resctrl_resources[rid];
+
+ if (!hw_res->ctrl)
+ return 0;
+
+ res = &hw_res->resctrl_res;
+ dom_id = ctrl->cache.cache_id;
+
+ domain = cbqri_find_ctrl_domain(&res->ctrl_domains, dom_id);
+ if (domain) {
+ cpumask_set_cpu(cpu, &domain->hdr.cpu_mask);
+ return 0;
+ }
+
+ domain = cbqri_create_ctrl_domain(ctrl, res, cpu, dom_id);
+ if (IS_ERR(domain))
+ return PTR_ERR(domain);
+
+ return 0;
+}
+
+static void cbqri_detach_cpu_from_ctrl_domains(struct rdt_resource *res,
+ unsigned int cpu)
+{
+ struct rdt_ctrl_domain *domain, *tmp;
+
+ list_for_each_entry_safe(domain, tmp, &res->ctrl_domains, hdr.list) {
+ if (!cpumask_test_cpu(cpu, &domain->hdr.cpu_mask))
+ continue;
+ cpumask_clear_cpu(cpu, &domain->hdr.cpu_mask);
+ if (cpumask_empty(&domain->hdr.cpu_mask)) {
+ resctrl_offline_ctrl_domain(res, domain);
+ list_del(&domain->hdr.list);
+ kfree(container_of(domain, struct cbqri_resctrl_dom,
+ resctrl_ctrl_dom));
+ }
+ }
+}
+
+/*
+ * Remove a CPU from every domain it was attached to. The per-resource
+ * detach helpers act only when the CPU is set in a domain's mask, so this
+ * is idempotent and undoes a partial online attach as well as a full
+ * offline. Caller holds cbqri_domain_list_lock.
+ */
+static void cbqri_detach_cpu_from_all_ctrls(unsigned int cpu)
+{
+ int rid;
+
+ lockdep_assert_held(&cbqri_domain_list_lock);
+
+ for (rid = 0; rid < RDT_NUM_RESOURCES; rid++) {
+ struct cbqri_resctrl_res *hw_res = &cbqri_resctrl_resources[rid];
+
+ if (!hw_res->ctrl)
+ continue;
+ cbqri_detach_cpu_from_ctrl_domains(&hw_res->resctrl_res, cpu);
+ }
+}
+
+/*
+ * Attach a CPU to every controller that claims it. On failure, detach the
+ * CPU from everything attached so far: the cpuhp core does not run this
+ * state's offline teardown when its startup fails, so a partial attach
+ * would otherwise leak into the domain cpu_masks. Caller holds
+ * cbqri_domain_list_lock.
+ */
+static int cbqri_attach_cpu_to_all_ctrls(unsigned int cpu)
+{
+ struct cbqri_controller *ctrl;
+ int err = 0;
+
+ lockdep_assert_held(&cbqri_domain_list_lock);
+
+ /*
+ * Hold cbqri_controllers_lock across the walk so a controller
+ * registered after boot cannot corrupt it. The register path takes
+ * it as a leaf and never cbqri_domain_list_lock, so this nesting
+ * cannot invert.
+ */
+ mutex_lock(&cbqri_controllers_lock);
+ list_for_each_entry(ctrl, &cbqri_controllers, list) {
+ if (ctrl->type != CBQRI_CONTROLLER_TYPE_CAPACITY)
+ continue;
+ if (!cpumask_test_cpu(cpu, &ctrl->cache.cpu_mask))
+ continue;
+ if (!ctrl->alloc_capable)
+ continue;
+
+ err = cbqri_attach_cpu_to_cap_ctrl(ctrl, cpu);
+ if (err) {
+ cbqri_detach_cpu_from_all_ctrls(cpu);
+ break;
+ }
+ }
+ mutex_unlock(&cbqri_controllers_lock);
+
+ return err;
+}
+
+static bool cbqri_resctrl_inited;
+
+static void cbqri_resctrl_teardown(void)
+{
+ int rid;
+
+ if (!cbqri_resctrl_inited)
+ return;
+
+ resctrl_exit();
+
+ for (rid = 0; rid < RDT_NUM_RESOURCES; rid++) {
+ struct cbqri_resctrl_res *hw_res = &cbqri_resctrl_resources[rid];
+
+ hw_res->ctrl = NULL;
+ hw_res->cdp_enabled = false;
+ }
+ exposed_alloc_capable = false;
+ cbqri_resctrl_inited = false;
+}
+
+static int cbqri_resctrl_setup(void)
+{
+ int rid;
+ int err;
+
+ for (rid = 0; rid < RDT_NUM_RESOURCES; rid++)
+ cbqri_resctrl_resources[rid].resctrl_res.rid = rid;
+
+ err = cbqri_resctrl_pick_caches();
+ if (err)
+ return err;
+
+ for (rid = 0; rid < RDT_NUM_RESOURCES; rid++)
+ cbqri_resctrl_control_init(&cbqri_resctrl_resources[rid]);
+
+ cbqri_resctrl_accumulate_caps();
+
+ if (!exposed_alloc_capable) {
+ pr_debug("no resctrl-capable CBQRI controllers found\n");
+ return -ENODEV;
+ }
+
+ err = resctrl_init();
+ if (err)
+ return err;
+
+ cbqri_resctrl_inited = true;
+ return 0;
+}
+
+static int cbqri_resctrl_online_cpu(unsigned int cpu)
+{
+ int err;
+
+ mutex_lock(&cbqri_domain_list_lock);
+ err = cbqri_attach_cpu_to_all_ctrls(cpu);
+ mutex_unlock(&cbqri_domain_list_lock);
+ if (err)
+ return err;
+
+ /*
+ * Seed the per-CPU default RCID/MCID to the reserved (0, 0) pair and
+ * notify the resctrl core so it tracks this CPU in the default group.
+ */
+ resctrl_arch_set_cpu_default_closid_rmid(cpu, 0, 0);
+ resctrl_online_cpu(cpu);
+ return 0;
+}
+
+static int cbqri_resctrl_offline_cpu(unsigned int cpu)
+{
+ resctrl_offline_cpu(cpu);
+
+ mutex_lock(&cbqri_domain_list_lock);
+ cbqri_detach_cpu_from_all_ctrls(cpu);
+ mutex_unlock(&cbqri_domain_list_lock);
+ return 0;
+}
+
+static int __init cbqri_arch_late_init(void)
+{
+ int err;
+
+ if (!riscv_isa_extension_available(NULL, SSQOSID))
+ return -ENODEV;
+
+ err = cbqri_resctrl_setup();
+ if (err)
+ return err;
+
+ err = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "cbqri:online",
+ cbqri_resctrl_online_cpu,
+ cbqri_resctrl_offline_cpu);
+ if (err < 0) {
+ cbqri_resctrl_teardown();
+ return err;
+ }
+
+ return 0;
+}
+late_initcall(cbqri_arch_late_init);
--
2.43.0
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v3 6/8] riscv: Enable resctrl filesystem for Ssqosid
2026-06-28 21:18 [PATCH v3 0/8] riscv: Add Ssqosid and initial CBQRI resctrl support Drew Fustini
` (4 preceding siblings ...)
2026-06-28 21:18 ` [PATCH v3 5/8] riscv_cbqri: resctrl: Add cache allocation via capacity block mask Drew Fustini
@ 2026-06-28 21:18 ` Drew Fustini
2026-06-28 21:30 ` sashiko-bot
2026-06-28 21:18 ` [PATCH v3 7/8] dt-bindings: riscv: Add binding for CBQRI controllers Drew Fustini
2026-06-28 21:18 ` [PATCH v3 8/8] riscv_cbqri: Add CBQRI capacity allocation platform driver Drew Fustini
7 siblings, 1 reply; 14+ messages in thread
From: Drew Fustini @ 2026-06-28 21:18 UTC (permalink / raw)
To: Adrien Ricciardi, Alexandre Ghiti, Atish Kumar Patra, Atish Patra,
Babu Moger, Ben Horgan, Borislav Petkov, Chen Pei, Conor Dooley,
Conor Dooley, Dave Hansen, Dave Martin, Fenghua Yu, Gong Shuai,
Gong Shuai, guo.wenjia23, James Morse, Kornel Dulęba,
Krzysztof Kozlowski, liu.qingtao2, Liu Zhiwei, Palmer Dabbelt,
Paul Walmsley, Peter Newman, Radim Krčmář,
Reinette Chatre, Rob Herring, Samuel Holland,
Sebastian Andrzej Siewior, Tony Luck, Vasudevan Srinivasan,
Ved Shanbhogue, Weiwei Li, yunhui cui, Drew Fustini
Cc: linux-kernel, linux-riscv, x86, devicetree, linux-rt-devel,
linux-doc
RISCV_ISA_SSQOSID selects ARCH_HAS_CPU_RESCTRL and RISCV_CBQRI
unconditionally. ARCH_HAS_CPU_RESCTRL makes RESCTRL_FS available on
RISC-V, and RISCV_CBQRI builds the CBQRI core that backs it.
The resctrl filesystem integration is gated separately by
RISCV_CBQRI_RESCTRL_FS, a silent option that defaults to y when both
RISCV_CBQRI and RESCTRL_FS are enabled. Enabling the resctrl filesystem
itself stays a user choice via the standard fs/Kconfig MISC_FILESYSTEMS
menu.
Signed-off-by: Drew Fustini <fustini@kernel.org>
---
arch/riscv/Kconfig | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index ee586925f972..9c28bcbc29dc 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -594,6 +594,8 @@ config RISCV_ISA_SSQOSID
bool "Ssqosid extension support for supervisor mode Quality of Service ID"
depends on 64BIT
default n
+ select ARCH_HAS_CPU_RESCTRL
+ select RISCV_CBQRI
help
Adds support for the Ssqosid ISA extension (Supervisor-mode
Quality of Service ID).
--
2.43.0
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v3 7/8] dt-bindings: riscv: Add binding for CBQRI controllers
2026-06-28 21:18 [PATCH v3 0/8] riscv: Add Ssqosid and initial CBQRI resctrl support Drew Fustini
` (5 preceding siblings ...)
2026-06-28 21:18 ` [PATCH v3 6/8] riscv: Enable resctrl filesystem for Ssqosid Drew Fustini
@ 2026-06-28 21:18 ` Drew Fustini
2026-06-28 21:18 ` [PATCH v3 8/8] riscv_cbqri: Add CBQRI capacity allocation platform driver Drew Fustini
7 siblings, 0 replies; 14+ messages in thread
From: Drew Fustini @ 2026-06-28 21:18 UTC (permalink / raw)
To: Adrien Ricciardi, Alexandre Ghiti, Atish Kumar Patra, Atish Patra,
Babu Moger, Ben Horgan, Borislav Petkov, Chen Pei, Conor Dooley,
Conor Dooley, Dave Hansen, Dave Martin, Fenghua Yu, Gong Shuai,
Gong Shuai, guo.wenjia23, James Morse, Kornel Dulęba,
Krzysztof Kozlowski, liu.qingtao2, Liu Zhiwei, Palmer Dabbelt,
Paul Walmsley, Peter Newman, Radim Krčmář,
Reinette Chatre, Rob Herring, Samuel Holland,
Sebastian Andrzej Siewior, Tony Luck, Vasudevan Srinivasan,
Ved Shanbhogue, Weiwei Li, yunhui cui, Drew Fustini
Cc: linux-kernel, linux-riscv, x86, devicetree, linux-rt-devel,
linux-doc
Document the device tree binding for RISC-V CBQRI capacity and bandwidth
controllers. Each controller is named by a device-specific compatible
followed by the generic compatible. The binding also describes the
common riscv,cbqri-rcid and riscv,cbqri-mcid properties, and the
optional riscv,cbqri-cache phandle that links a capacity controller to
the cache whose capacity it allocates.
Assisted-by: Claude:claude-opus-4-8
Co-developed-by: Adrien Ricciardi <aricciardi@baylibre.com>
Signed-off-by: Adrien Ricciardi <aricciardi@baylibre.com>
Signed-off-by: Drew Fustini <fustini@kernel.org>
---
.../devicetree/bindings/riscv/riscv,cbqri.yaml | 97 ++++++++++++++++++++++
MAINTAINERS | 1 +
2 files changed, 98 insertions(+)
diff --git a/Documentation/devicetree/bindings/riscv/riscv,cbqri.yaml b/Documentation/devicetree/bindings/riscv/riscv,cbqri.yaml
new file mode 100644
index 000000000000..62d547a0cb96
--- /dev/null
+++ b/Documentation/devicetree/bindings/riscv/riscv,cbqri.yaml
@@ -0,0 +1,97 @@
+# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/riscv/riscv,cbqri.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: RISC-V Capacity and Bandwidth QoS Register Interface (CBQRI) controller
+
+description: |
+ The RISC-V CBQRI specification defines capacity-controller and
+ bandwidth-controller register blocks that allocate cache capacity and memory
+ bandwidth to resource-control IDs (RCIDs) and monitor usage per
+ monitoring-counter ID (MCID):
+ https://github.com/riscv-non-isa/riscv-cbqri/blob/main/riscv-cbqri.pdf
+
+ Allocation and monitoring share one register block, and a controller may
+ implement either or both. A driver discovers which at runtime from the
+ capabilities register, so the compatible names only the controller type. It
+ does not distinguish allocation-only, monitoring-only or combined
+ controllers, and no property declares monitoring support.
+
+maintainers:
+ - Drew Fustini <fustini@kernel.org>
+
+properties:
+ compatible:
+ oneOf:
+ - items:
+ - enum:
+ - tenstorrent,ascalon-shared-cache-controller
+ - const: riscv,cbqri-capacity-controller
+ - items:
+ - {}
+ - const: riscv,cbqri-bandwidth-controller
+
+ reg:
+ maxItems: 1
+ description:
+ The CBQRI controller register block.
+
+ riscv,cbqri-rcid:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ description:
+ The maximum number of RCIDs the controller supports. RCIDs are the
+ resource-control IDs that allocation operations target.
+
+ riscv,cbqri-mcid:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ description:
+ The maximum number of MCIDs the controller supports. MCIDs are the
+ monitoring-counter IDs that usage-monitoring operations target. Present
+ on controllers that implement monitoring.
+
+ riscv,cbqri-cache:
+ $ref: /schemas/types.yaml#/definitions/phandle
+ description:
+ Phandle to the cache node whose capacity this controller allocates.
+ Applies to capacity controllers that back a CPU cache. The cache level
+ and the harts sharing it are taken from that node's cache topology.
+
+required:
+ - compatible
+ - reg
+
+allOf:
+ - if:
+ properties:
+ compatible:
+ contains:
+ const: tenstorrent,ascalon-shared-cache-controller
+ then:
+ required:
+ - riscv,cbqri-rcid
+ - riscv,cbqri-cache
+
+additionalProperties: false
+
+examples:
+ - |
+ l2_cache: l2-cache {
+ compatible = "cache";
+ cache-level = <2>;
+ cache-unified;
+ cache-size = <0xc00000>;
+ cache-sets = <512>;
+ cache-block-size = <64>;
+ };
+
+ qos-controller@a21a00c0 {
+ compatible = "tenstorrent,ascalon-shared-cache-controller",
+ "riscv,cbqri-capacity-controller";
+ reg = <0xa21a00c0 0xf40>;
+ riscv,cbqri-rcid = <16>;
+ riscv,cbqri-cache = <&l2_cache>;
+ };
+
+...
diff --git a/MAINTAINERS b/MAINTAINERS
index 9e1092165046..64a95a4d795a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -23298,6 +23298,7 @@ M: Drew Fustini <fustini@kernel.org>
R: yunhui cui <cuiyunhui@bytedance.com>
L: linux-riscv@lists.infradead.org
S: Supported
+F: Documentation/devicetree/bindings/riscv/riscv,cbqri.yaml
F: arch/riscv/include/asm/qos.h
F: arch/riscv/include/asm/resctrl.h
F: arch/riscv/kernel/qos.c
--
2.43.0
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v3 8/8] riscv_cbqri: Add CBQRI capacity allocation platform driver
2026-06-28 21:18 [PATCH v3 0/8] riscv: Add Ssqosid and initial CBQRI resctrl support Drew Fustini
` (6 preceding siblings ...)
2026-06-28 21:18 ` [PATCH v3 7/8] dt-bindings: riscv: Add binding for CBQRI controllers Drew Fustini
@ 2026-06-28 21:18 ` Drew Fustini
2026-06-28 21:27 ` sashiko-bot
7 siblings, 1 reply; 14+ messages in thread
From: Drew Fustini @ 2026-06-28 21:18 UTC (permalink / raw)
To: Adrien Ricciardi, Alexandre Ghiti, Atish Kumar Patra, Atish Patra,
Babu Moger, Ben Horgan, Borislav Petkov, Chen Pei, Conor Dooley,
Conor Dooley, Dave Hansen, Dave Martin, Fenghua Yu, Gong Shuai,
Gong Shuai, guo.wenjia23, James Morse, Kornel Dulęba,
Krzysztof Kozlowski, liu.qingtao2, Liu Zhiwei, Palmer Dabbelt,
Paul Walmsley, Peter Newman, Radim Krčmář,
Reinette Chatre, Rob Herring, Samuel Holland,
Sebastian Andrzej Siewior, Tony Luck, Vasudevan Srinivasan,
Ved Shanbhogue, Weiwei Li, yunhui cui, Drew Fustini
Cc: linux-kernel, linux-riscv, x86, devicetree, linux-rt-devel,
linux-doc
Add a device-tree platform driver, bound to the generic
riscv,cbqri-capacity-controller compatible, that registers a CBQRI
capacity controller as the resctrl cache-allocation resource for the
cache it governs.
The driver follows the node's riscv,cbqri-cache phandle to that cache,
reads its level, and matches it against cacheinfo to get the resctrl
domain id and the harts sharing the cache. It then hands the controller
to riscv_cbqri_register_cc_dt() with the riscv,cbqri-rcid count from the
node.
Nothing is vendor-specific, and the DT "reg" is the CBQRI register block
itself, so any SoC that describes a CBQRI capacity controller in device
tree can reuse the driver unchanged.
Assisted-by: Claude:claude-opus-4-8
Signed-off-by: Drew Fustini <fustini@kernel.org>
---
MAINTAINERS | 1 +
drivers/resctrl/Kconfig | 12 ++++
drivers/resctrl/Makefile | 1 +
drivers/resctrl/cbqri_capacity.c | 132 +++++++++++++++++++++++++++++++++++++++
4 files changed, 146 insertions(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index 64a95a4d795a..e0ffccd9ed6e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -23302,6 +23302,7 @@ F: Documentation/devicetree/bindings/riscv/riscv,cbqri.yaml
F: arch/riscv/include/asm/qos.h
F: arch/riscv/include/asm/resctrl.h
F: arch/riscv/kernel/qos.c
+F: drivers/resctrl/cbqri_capacity.c
F: drivers/resctrl/cbqri_devices.c
F: drivers/resctrl/cbqri_internal.h
F: drivers/resctrl/cbqri_resctrl.c
diff --git a/drivers/resctrl/Kconfig b/drivers/resctrl/Kconfig
index f8566c003d49..3645dd643117 100644
--- a/drivers/resctrl/Kconfig
+++ b/drivers/resctrl/Kconfig
@@ -41,6 +41,18 @@ menuconfig RISCV_CBQRI
if RISCV_CBQRI
+config RISCV_CBQRI_CAPACITY
+ bool "RISC-V CBQRI cache capacity-allocation controller"
+ depends on OF
+ help
+ Enable driver for a RISC-V CBQRI capacity controller that
+ governs a CPU cache, matching the "riscv,cbqri-capacity-controller"
+ compatible. The controller's cache phandle gives the cache level and the
+ harts that share it, which the driver registers as a resctrl
+ cache-allocation resource.
+
+ Say N unless your device tree describes a CBQRI capacity controller.
+
endif
config RISCV_CBQRI_RESCTRL_FS
diff --git a/drivers/resctrl/Makefile b/drivers/resctrl/Makefile
index a7631712dba9..c8339113ef1f 100644
--- a/drivers/resctrl/Makefile
+++ b/drivers/resctrl/Makefile
@@ -7,3 +7,4 @@ ccflags-$(CONFIG_ARM64_MPAM_DRIVER_DEBUG) += -DDEBUG
obj-$(CONFIG_RISCV_CBQRI) += cbqri.o
cbqri-y += cbqri_devices.o
cbqri-$(CONFIG_RISCV_CBQRI_RESCTRL_FS) += cbqri_resctrl.o
+cbqri-$(CONFIG_RISCV_CBQRI_CAPACITY) += cbqri_capacity.o
diff --git a/drivers/resctrl/cbqri_capacity.c b/drivers/resctrl/cbqri_capacity.c
new file mode 100644
index 000000000000..2172432eb328
--- /dev/null
+++ b/drivers/resctrl/cbqri_capacity.c
@@ -0,0 +1,132 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Platform driver for a RISC-V CBQRI capacity controller that backs a CPU
+ * cache. The controller is described in device tree by the generic
+ * "riscv,cbqri-capacity-controller" compatible together with a phandle to the
+ * cache node it governs. The driver hands it to the CBQRI core, which probes
+ * the capabilities register and exposes a controller that supports allocation
+ * as the resctrl cache allocation resource for that cache.
+ */
+
+#define pr_fmt(fmt) "cbqri-capacity: " fmt
+
+#include <linux/cacheinfo.h>
+#include <linux/cpu.h>
+#include <linux/cpumask.h>
+#include <linux/ioport.h>
+#include <linux/mod_devicetable.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include <linux/printk.h>
+#include <linux/riscv_cbqri.h>
+#include <linux/types.h>
+
+static int cbqri_capacity_probe(struct platform_device *pdev)
+{
+ struct device *dev = &pdev->dev;
+ struct cbqri_controller_info info = {};
+ struct device_node *cache_np;
+ cpumask_var_t cpu_mask;
+ struct resource *res;
+ u32 rcid_count, cache_level;
+ int cache_id, cpu, ret;
+
+ res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+ if (!res)
+ return -EINVAL;
+
+ ret = of_property_read_u32(dev->of_node, "riscv,cbqri-rcid", &rcid_count);
+ if (ret) {
+ dev_err(dev, "missing riscv,cbqri-rcid\n");
+ return ret;
+ }
+
+ cache_np = of_parse_phandle(dev->of_node, "riscv,cbqri-cache", 0);
+ if (!cache_np) {
+ dev_err(dev, "missing riscv,cbqri-cache phandle\n");
+ return -EINVAL;
+ }
+
+ ret = of_property_read_u32(cache_np, "cache-level", &cache_level);
+ if (ret) {
+ dev_err(dev, "%pOF: missing cache-level\n", cache_np);
+ goto out_put;
+ }
+
+ if (!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL)) {
+ ret = -ENOMEM;
+ goto out_put;
+ }
+
+ /*
+ * Associate the controller with its cache instance via
+ * cacheinfo. The matching cache provides the cache id and the
+ * set of harts that share the cache.
+ */
+ cache_id = -1;
+ cpus_read_lock();
+ for_each_online_cpu(cpu) {
+ struct cacheinfo *ci = get_cpu_cacheinfo_level(cpu, cache_level);
+
+ if (ci && ci->fw_token == cache_np) {
+ cache_id = ci->id;
+ cpumask_copy(cpu_mask, &ci->shared_cpu_map);
+ break;
+ }
+ }
+ cpus_read_unlock();
+
+ if (cache_id < 0) {
+ dev_err(dev, "%pOF: no online hart reports an L%u cache for this node\n",
+ cache_np, cache_level);
+ ret = -ENODEV;
+ goto out_free;
+ }
+
+ info.type = CBQRI_CONTROLLER_TYPE_CAPACITY;
+ info.addr = res->start;
+ info.size = resource_size(res);
+ info.rcid_count = rcid_count;
+ info.cache_id = cache_id;
+
+ ret = riscv_cbqri_register_cc_dt(&info, cache_level, cpu_mask);
+ if (ret) {
+ dev_err(dev, "failed to register capacity controller: %d\n", ret);
+ goto out_free;
+ }
+
+ dev_info(dev, "registered L%u capacity controller at %pa (cache_id=%d, rcid=%u)\n",
+ cache_level, &info.addr, cache_id, rcid_count);
+
+out_free:
+ free_cpumask_var(cpu_mask);
+out_put:
+ of_node_put(cache_np);
+ return ret;
+}
+
+static const struct of_device_id cbqri_capacity_of_match[] = {
+ { .compatible = "riscv,cbqri-capacity-controller" },
+ {}
+};
+MODULE_DEVICE_TABLE(of, cbqri_capacity_of_match);
+
+static struct platform_driver cbqri_capacity_driver = {
+ .probe = cbqri_capacity_probe,
+ .driver = {
+ .name = "cbqri-capacity",
+ .of_match_table = cbqri_capacity_of_match,
+ /*
+ * The controller is registered permanently into the
+ * CBQRI core for the life of the system. Block unbind
+ * so userspace cannot leave a dangling controller.
+ */
+ .suppress_bind_attrs = true,
+ },
+};
+
+/*
+ * Register at device_initcall so probe runs before the CBQRI core's
+ * late_initcall which walks the cbqri_controllers list.
+ */
+builtin_platform_driver(cbqri_capacity_driver);
--
2.43.0
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v3 8/8] riscv_cbqri: Add CBQRI capacity allocation platform driver
2026-06-28 21:18 ` [PATCH v3 8/8] riscv_cbqri: Add CBQRI capacity allocation platform driver Drew Fustini
@ 2026-06-28 21:27 ` sashiko-bot
0 siblings, 0 replies; 14+ messages in thread
From: sashiko-bot @ 2026-06-28 21:27 UTC (permalink / raw)
To: Drew Fustini; +Cc: conor+dt, robh, devicetree, linux-kernel, linux-rt-devel
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Medium] cbqri_capacity_probe() incorrectly uses get_cpu_cacheinfo_level() to find the cacheinfo, which will fail for split caches (e.g., L1I/L1D).
--
commit 1a6b06fa1768e5d101b318b8e5167a959fc10696
Author: Drew Fustini <fustini@kernel.org>
riscv_cbqri: Add CBQRI capacity allocation platform driver
This commit adds a device-tree platform driver for the RISC-V CBQRI capacity
controller. It reads the cache level from the device tree and matches it
against the system's cacheinfo to get the resctrl domain ID and shared harts.
> diff --git a/drivers/resctrl/cbqri_capacity.c b/drivers/resctrl/cbqri_capacity.c
> new file mode 100644
> index 0000000000000..2172432eb3287
> --- /dev/null
> +++ b/drivers/resctrl/cbqri_capacity.c
[ ... ]
> +static int cbqri_capacity_probe(struct platform_device *pdev)
> +{
[ ... ]
> + /*
> + * Associate the controller with its cache instance via
> + * cacheinfo. The matching cache provides the cache id and the
> + * set of harts that share the cache.
> + */
> + cache_id = -1;
> + cpus_read_lock();
> + for_each_online_cpu(cpu) {
> + struct cacheinfo *ci = get_cpu_cacheinfo_level(cpu, cache_level);
[Severity: Medium]
Could this fail to match split caches, like an L1 data cache?
The get_cpu_cacheinfo_level() function unconditionally returns the first
cacheinfo leaf it finds at the requested level and stops searching. If the
given level has multiple leaves (for example, L1 instruction and L1 data),
and the device tree node refers to the second leaf, this loop will miss it.
This would cause the fw_token check to fail and the probe to return -ENODEV.
> +
> + if (ci && ci->fw_token == cache_np) {
> + cache_id = ci->id;
> + cpumask_copy(cpu_mask, &ci->shared_cpu_map);
> + break;
> + }
> + }
> + cpus_read_unlock();
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260628-dfustini-atl-sc-cbqri-dt-v3-0-c9c1342fe3cf@kernel.org?part=8
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 4/8] riscv_cbqri: Add capacity controller probe and allocation device ops
2026-06-28 21:18 ` [PATCH v3 4/8] riscv_cbqri: Add capacity controller probe and allocation device ops Drew Fustini
@ 2026-06-28 21:28 ` sashiko-bot
0 siblings, 0 replies; 14+ messages in thread
From: sashiko-bot @ 2026-06-28 21:28 UTC (permalink / raw)
To: Drew Fustini; +Cc: linux-rt-devel, robh, devicetree, conor+dt, linux-kernel
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Medium] Unspecified evaluation order in `cbqri_readq` causes a torn read during hardware polling, leading to false AT capability detection.
--
commit a52377aec62b1aa81e5439ad733c3839af6693c6
Author: Drew Fustini <fustini@kernel.org>
riscv_cbqri: Add capacity controller probe and allocation device ops
Add support for the RISC-V CBQRI capacity controller. A platform driver
passes a cbqri_controller_info descriptor together with the cache level
to riscv_cbqri_register_cc_dt(), which probes the controller and adds it
to the controller list.
> diff --git a/drivers/resctrl/cbqri_devices.c b/drivers/resctrl/cbqri_devices.c
> new file mode 100644
> index 0000000000000..69df46e07df17
> --- /dev/null
> +++ b/drivers/resctrl/cbqri_devices.c
[ ... ]
> +static u64 cbqri_readq(void __iomem *addr)
> +{
> + return (u64)readl(addr) | ((u64)readl(addr + 4) << 32);
> +}
[Severity: Medium]
Could the unspecified evaluation order here cause a torn read during hardware
polling?
Since the C standard does not specify the evaluation order of the operands
to the bitwise OR operator, the compiler may emit the lower 32-bit read
before the upper 32-bit read.
During the polling loop, if the final iteration reads the lower 32 bits
while the busy flag is still set, and then the hardware completes the
operation and clears the busy flag before the upper 32-bit read executes,
the loop will terminate and return this torn value.
[ ... ]
> +static int cbqri_probe_feature(struct cbqri_controller *ctrl, int reg_offset,
> + int operation, int *status, bool *access_type_supported)
> +{
[ ... ]
> + writel(lower_32_bits(reg), ctrl->base + reg_offset);
> + if (cbqri_wait_busy_flag(ctrl, reg_offset, ®) < 0) {
> + pr_err("BUSY timeout setting AT field\n");
> + return -EIO;
> + }
> +
> + /*
> + * If the AT field value has been reset to zero,
> + * then the AT support is not present
> + */
> + at = FIELD_GET(CBQRI_CONTROL_REGISTERS_AT_MASK, reg);
> + if (at == CBQRI_CONTROL_REGISTERS_AT_CODE)
> + *access_type_supported = true;
[Severity: Medium]
Does the torn read from cbqri_readq() result in misdetecting capability
support here?
If cbqri_wait_busy_flag() returns a torn value with a stale lower half, the
driver may incorrectly detect Access Type support on hardware that does not
support it.
This could break cache allocation isolation guarantees, as the driver will
split capacity blocks into CODE and DATA masks, but the hardware will silently
treat them as a single pool.
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260628-dfustini-atl-sc-cbqri-dt-v3-0-c9c1342fe3cf@kernel.org?part=4
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 3/8] riscv: Add support for srmcfg CSR from Ssqosid extension
2026-06-28 21:18 ` [PATCH v3 3/8] riscv: Add support for srmcfg CSR from " Drew Fustini
@ 2026-06-28 21:28 ` sashiko-bot
0 siblings, 0 replies; 14+ messages in thread
From: sashiko-bot @ 2026-06-28 21:28 UTC (permalink / raw)
To: Drew Fustini; +Cc: linux-kernel, conor+dt, devicetree, robh, linux-rt-devel
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Medium] Failing cpuhp_setup_state skips PM notifier registration, silently breaking QoS IDs across system suspend/resume.
--
commit e3ff463a672a85eb328bf3048323a464af88e408
Author: Drew Fustini <fustini@kernel.org>
riscv: Add support for srmcfg CSR from Ssqosid extension
This commit adds support for the srmcfg CSR defined in the Ssqosid ISA
extension. It introduces a per-task thread field and a per-CPU cache to track
Resource Control ID (RCID) and Monitoring Counter ID (MCID) fields, updating
the CSR during context switches.
Link: https://github.com/riscv/riscv-ssqosid/releases/tag/v1.0
[ ... ]
> diff --git a/arch/riscv/kernel/qos.c b/arch/riscv/kernel/qos.c
> new file mode 100644
> index 0000000000000..c8900d91996f2
> --- /dev/null
> +++ b/arch/riscv/kernel/qos.c
[ ... ]
> +static int __init riscv_srmcfg_init(void)
> +{
> + int err;
> +
> + if (!riscv_has_extension_unlikely(RISCV_ISA_EXT_SSQOSID))
> + return 0;
> +
> + /*
> + * cpuhp_setup_state() invokes the startup callback locally on every
> + * already-online CPU, so no separate seed loop is needed here.
> + */
> + err = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "riscv/srmcfg:online",
> + riscv_srmcfg_reset_cache, riscv_srmcfg_reset_cache);
> + if (err < 0) {
> + pr_warn("srmcfg cpuhp registration failed (%d), cpus brought online after boot will not invalidate the CSR_SRMCFG cache\n",
> + err);
> + return err;
[Severity: Medium]
If cpuhp_setup_state() fails and returns early here, the
cpu_pm_register_notifier() registration is skipped, but the hardware
capability RISCV_ISA_EXT_SSQOSID remains active.
Because returning an error from an arch_initcall does not halt the boot
process, the QoS feature will remain active in the kernel.
If the PM notifier is not registered to invalidate the software cache upon
exit from low-power states, will subsequent context switches find a cache
match, mistakenly bypass the csr_write(), and execute tasks with the wrong
hardware QoS ID?
> + }
> +
> + cpu_pm_register_notifier(&riscv_srmcfg_pm_nb);
> + return 0;
> +}
> +arch_initcall(riscv_srmcfg_init);
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260628-dfustini-atl-sc-cbqri-dt-v3-0-c9c1342fe3cf@kernel.org?part=3
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 6/8] riscv: Enable resctrl filesystem for Ssqosid
2026-06-28 21:18 ` [PATCH v3 6/8] riscv: Enable resctrl filesystem for Ssqosid Drew Fustini
@ 2026-06-28 21:30 ` sashiko-bot
0 siblings, 0 replies; 14+ messages in thread
From: sashiko-bot @ 2026-06-28 21:30 UTC (permalink / raw)
To: Drew Fustini; +Cc: linux-kernel, devicetree, conor+dt, linux-rt-devel, robh
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Medium] RISCV_ISA_SSQOSID unconditionally selects the user-visible menuconfig RISCV_CBQRI, defeating its prompt and violating Kconfig best practices.
--
commit 6a7bfcfac7becb358efe0bde0c988634f28e5326
Author: Drew Fustini <fustini@kernel.org>
riscv: Enable resctrl filesystem for Ssqosid
This commit enables the resctrl filesystem for the RISC-V Ssqosid extension.
It modifies the RISCV_ISA_SSQOSID config to select ARCH_HAS_CPU_RESCTRL and
RISCV_CBQRI.
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index ee586925f9722..9c28bcbc29dc9 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -594,6 +594,8 @@ config RISCV_ISA_SSQOSID
> bool "Ssqosid extension support for supervisor mode Quality of Service ID"
> depends on 64BIT
> default n
> + select ARCH_HAS_CPU_RESCTRL
> + select RISCV_CBQRI
[Severity: Medium]
Is it intentional to use select on RISCV_CBQRI here?
RISCV_CBQRI appears to be defined as a user-visible menuconfig:
drivers/resctrl/Kconfig:
menuconfig RISCV_CBQRI
bool "RISC-V CBQRI support"
depends on RISCV && RISCV_ISA_SSQOSID
Using select on a user-visible symbol forces it on and overrides the prompt,
which can create a confusing configuration experience where the option is
visible but cannot be toggled.
Could this be addressed by either using "depends on" instead of select,
or by making RISCV_CBQRI a hidden symbol without a prompt?
> help
> Adds support for the Ssqosid ISA extension (Supervisor-mode
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260628-dfustini-atl-sc-cbqri-dt-v3-0-c9c1342fe3cf@kernel.org?part=6
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 5/8] riscv_cbqri: resctrl: Add cache allocation via capacity block mask
2026-06-28 21:18 ` [PATCH v3 5/8] riscv_cbqri: resctrl: Add cache allocation via capacity block mask Drew Fustini
@ 2026-06-28 21:34 ` sashiko-bot
0 siblings, 0 replies; 14+ messages in thread
From: sashiko-bot @ 2026-06-28 21:34 UTC (permalink / raw)
To: Drew Fustini; +Cc: devicetree, linux-kernel, robh, linux-rt-devel, conor+dt
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [High] Dynamically registered capacity controllers bypass capability validation, leading to hardware errors that abort CPU hotplug.
--
commit 3d57a44029d899d70e2c81681585f914c01cc700
Author: Drew Fustini <fustini@kernel.org>
riscv_cbqri: resctrl: Add cache allocation via capacity block mask
Wire CBQRI capacity controllers into resctrl as RDT_RESOURCE_L2 and
RDT_RESOURCE_L3 schemata. Mismatched CC caps at the same cache level
are treated as a fatal configuration error since fs/resctrl exposes a
single per-rid cap set. Domains are created lazily in the cpuhp online
callback so cpu_mask reflects only currently online CPUs.
> diff --git a/drivers/resctrl/cbqri_resctrl.c b/drivers/resctrl/cbqri_resctrl.c
> new file mode 100644
> index 0000000000000..1fb0fbe1b000f
> --- /dev/null
> +++ b/drivers/resctrl/cbqri_resctrl.c
[ ... ]
> +static int cbqri_attach_cpu_to_cap_ctrl(struct cbqri_controller *ctrl,
> + unsigned int cpu)
> +{
> + struct cbqri_resctrl_res *hw_res;
> + struct rdt_ctrl_domain *domain;
> + struct rdt_resource *res;
> + int dom_id;
> + int rid;
> +
> + rid = cbqri_cache_level_to_rid(ctrl->cache.cache_level);
> + if (rid < 0)
> + return 0;
> + hw_res = &cbqri_resctrl_resources[rid];
> +
> + if (!hw_res->ctrl)
> + return 0;
> +
> + res = &hw_res->resctrl_res;
> + dom_id = ctrl->cache.cache_id;
> +
> + domain = cbqri_find_ctrl_domain(&res->ctrl_domains, dom_id);
> + if (domain) {
> + cpumask_set_cpu(cpu, &domain->hdr.cpu_mask);
> + return 0;
> + }
> +
> + domain = cbqri_create_ctrl_domain(ctrl, res, cpu, dom_id);
[Severity: High]
Does this regression cause CPU hotplug to abort if a capacity controller is
dynamically registered (e.g., via a device tree overlay) after boot?
Late-registered controllers bypass the capability validation in
cbqri_resctrl_pick_caches(). When a CPU sharing this new controller is
hotplugged, it reaches this point to create a new domain.
If the late controller supports fewer RCIDs than the global controller,
cbqri_init_domain_ctrlval() will issue out-of-bounds commands to the hardware
because it iterates up to the global hw_res->ctrl->rcid_count:
drivers/resctrl/cbqri_resctrl.c:cbqri_init_domain_ctrlval() {
...
for (i = 0; i < hw_res->ctrl->rcid_count; i++) {
for (t = 0; t < CDP_NUM_TYPES; t++) {
err = resctrl_arch_update_one(r, d, i, t, ...
...
}
The hardware will reject the out-of-bounds RCID commands, and the resulting
error will bubble up to cbqri_resctrl_online_cpu(), keeping the CPU
permanently offline.
Should the capabilities of dynamically registered controllers be validated
against hw_res->ctrl before attaching them?
> + if (IS_ERR(domain))
> + return PTR_ERR(domain);
> +
> + return 0;
> +}
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260628-dfustini-atl-sc-cbqri-dt-v3-0-c9c1342fe3cf@kernel.org?part=5
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2026-06-28 21:34 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-28 21:18 [PATCH v3 0/8] riscv: Add Ssqosid and initial CBQRI resctrl support Drew Fustini
2026-06-28 21:18 ` [PATCH v3 1/8] dt-bindings: riscv: Add Ssqosid extension description Drew Fustini
2026-06-28 21:18 ` [PATCH v3 2/8] riscv: Detect the Ssqosid extension Drew Fustini
2026-06-28 21:18 ` [PATCH v3 3/8] riscv: Add support for srmcfg CSR from " Drew Fustini
2026-06-28 21:28 ` sashiko-bot
2026-06-28 21:18 ` [PATCH v3 4/8] riscv_cbqri: Add capacity controller probe and allocation device ops Drew Fustini
2026-06-28 21:28 ` sashiko-bot
2026-06-28 21:18 ` [PATCH v3 5/8] riscv_cbqri: resctrl: Add cache allocation via capacity block mask Drew Fustini
2026-06-28 21:34 ` sashiko-bot
2026-06-28 21:18 ` [PATCH v3 6/8] riscv: Enable resctrl filesystem for Ssqosid Drew Fustini
2026-06-28 21:30 ` sashiko-bot
2026-06-28 21:18 ` [PATCH v3 7/8] dt-bindings: riscv: Add binding for CBQRI controllers Drew Fustini
2026-06-28 21:18 ` [PATCH v3 8/8] riscv_cbqri: Add CBQRI capacity allocation platform driver Drew Fustini
2026-06-28 21:27 ` sashiko-bot
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox