Subject: [PATCH RFC v5 00/18] riscv: add Ssqosid and CBQRI resctrl support
From: Drew Fustini
Date: 2026-05-24 23:55 UTC
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
	Radim Krčmář, Samuel Holland, Adrien Ricciardi,
	Nicolas Pitre, Kornel Dulęba, Atish Patra, Atish Kumar Patra,
	Vasudevan Srinivasan, Ved Shanbhogue, Conor Dooley, yunhui cui,
	Chen Pei, Liu Zhiwei, Weiwei Li, guo.wenjia23, Gong Shuai,
	Gong Shuai, liu.qingtao2, Reinette Chatre, Tony Luck, Babu Moger,
	Peter Newman, Fenghua Yu, James Morse, Ben Horgan, Dave Martin,
	Rob Herring, Conor Dooley, Krzysztof Kozlowski, Rafael J. Wysocki,
	Len Brown, Robert Moore, Sunil V L, Drew Fustini, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin,
	Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt,
	Jonathan Corbet
  Cc: linux-kernel, linux-riscv, x86, linux-acpi, acpica-devel,
	devicetree, Paul Walmsley, Conor Dooley, linux-rt-devel,
	linux-doc

This RFC series adds RISC-V QoS support: the Ssqosid extension [1]
(srmcfg CSR), the CBQRI controller interface [2] integrated with
resctrl [3], and ACPI RQSC [4] for controller discovery. DT support
is possible but no platform drivers are included. The series is
also available as a branch [5].

QEMU support for Ssqosid and CBQRI lives in [6], with ACPI RQSC as
a follow-on series [7]. There is also a combined branch [8].

Series organization
-------------------
01      DT binding for Ssqosid extension
02-03   Ssqosid ISA support (detection, srmcfg CSR, switch_to)
04-06   fs/resctrl helpers and resource type additions
07-10   CBQRI device ops (cbqri_devices.c): capacity probe +
        allocation, capacity monitoring, bandwidth probe +
        allocation, bandwidth monitoring
11-15   CBQRI resctrl integration (cbqri_resctrl.c): cache
        allocation, L3 cache occupancy monitoring, MB_MIN
        bandwidth allocation, MB_WGHT bandwidth allocation,
        mbm_total_bytes monitoring
16-17   ACPI RQSC parser and init
18      Enable resctrl filesystem for Ssqosid (Kconfig)

Refer to the v3 cover letter [9] for the test setup including the
reference SoC layout and the corresponding QEMU command line.

[1] https://github.com/riscv/riscv-ssqosid/releases/tag/v1.0
[2] https://github.com/riscv-non-isa/riscv-cbqri/releases/tag/v1.0
[3] https://docs.kernel.org/filesystems/resctrl.html
[4] https://github.com/riscv-non-isa/riscv-rqsc/blob/main/src/
[5] https://git.kernel.org/pub/scm/linux/kernel/git/fustini/linux.git/log/?h=b4/ssqosid-cbqri-rqsc
[6] https://lore.kernel.org/qemu-devel/20260105-riscv-ssqosid-cbqri-v4-0-9ad7671dde78@kernel.org/
[7] https://lore.kernel.org/qemu-devel/20260202-riscv-rqsc-v1-0-dcf448a3ed73@kernel.org/
[8] https://github.com/tt-fustini/qemu/tree/b4/riscv-rqsc
[9] https://lore.kernel.org/r/20260414-ssqosid-cbqri-rqsc-v7-0-v3-0-b3b2e7e9847a@kernel.org

Key design decisions
--------------------
- Create new resource types because RDT_RESOURCE_MBA cannot represent
  the semantics of the CBQRI bandwidth controllers:

  - RDT_RESOURCE_MB_MIN matches CBQRI Rbwb (reserved bandwidth
    blocks). The sum of Rbwb across all control groups must be
    <= MRBWB (maximum number of reserved bandwidth blocks).

  - RDT_RESOURCE_MB_WGHT matches CBQRI Mweight, the weighted share of
    the remaining bandwidth blocks. Values are in [0, 255]: 0 disables
    work-conserving sharing for the group, 1..255 compete for the
    leftover pool.

- mbm_total_bytes is supported only when the platform exposes exactly
  one mon-capable bandwidth controller and exactly one L3 domain.
  Pairing a single BC across multiple L3 domains would let standard
  userspace tools overcount system bandwidth by summing the same
  counter across domains.
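
The two bandwidth constraints above can be sketched in plain C. This is
a minimal userspace illustration, not code from the series: NR_RCID,
struct bc_state, mb_min_would_fit() and mb_wght_valid() are hypothetical
names. The only facts taken from the text are that the Rbwb sum must
stay <= MRBWB and that Mweight lies in [0, 255].

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_RCID     8      /* hypothetical number of control groups */
#define MWEIGHT_MAX 255u   /* Mweight range from the text: [0, 255] */

struct bc_state {
	uint32_t mrbwb;          /* maximum number of reserved bandwidth blocks */
	uint32_t rbwb[NR_RCID];  /* current reservation per control group */
};

/* Return true if setting rbwb[rcid] = new_rbwb keeps the total <= MRBWB. */
static bool mb_min_would_fit(const struct bc_state *bc, unsigned int rcid,
			     uint32_t new_rbwb)
{
	uint64_t sum = new_rbwb;
	unsigned int i;

	assert(rcid < NR_RCID);
	for (i = 0; i < NR_RCID; i++) {
		/* The group being changed contributes its new value only. */
		if (i != rcid)
			sum += bc->rbwb[i];
	}
	return sum <= bc->mrbwb;
}

/* MB_WGHT accepts 0 (no work-conserving share) through 255. */
static bool mb_wght_valid(uint32_t mweight)
{
	return mweight <= MWEIGHT_MAX;
}
```

With MRBWB = 100 and two groups already holding 40 and 30 blocks, a
third group can reserve at most 30 blocks under this check.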

Open issues
-----------
 - RDT_RESOURCE_MB_MIN and RDT_RESOURCE_MB_WGHT are intended to drive
   discussion, not as the final solution. I plan to rebase onto
   Reinette's proof of concept once it is posted.

 - resctrl monitoring scope limitations:
   - monitor-only L3 capacity controllers are not supported.
   - CBQRI capacity controllers can monitor any cache level, but resctrl
     only supports occupancy on L3.
   - resctrl needs to gain a non-CPU scope level for mbm_total_bytes
     to be supported on platforms with multiple bandwidth controllers
     or multiple L3 domains.

 - When a control group is freed, rbwb_cache[closid] is not reset,
   so the MB_MIN sum check can count the stale reservation against
   MRBWB. Fixing this requires a new resctrl_arch_* callback in
   fs/resctrl invoked on group destroy, which is out of scope for
   this arch-driver series.

 - cc_cunits is not supported. cc_block_mask maps well onto resctrl's
   existing CBM schema, but there is no existing equivalent for
   capacity units.

 - RQSC structs live in drivers/acpi/riscv/rqsc.h until the spec is
   ratified and the ACPICA upstream submission lands. They will then move
   to include/acpi/actbl2.h. The spec is in the final phase
   before ratification.
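
The stale rbwb_cache[closid] issue above can be reproduced with a small
userspace model. This is only a sketch under stated assumptions:
struct bc_cache, rbwb_fits(), group_free() and group_free_with_reset()
are hypothetical, and resetting the entry to 0 stands in for whatever
default a future resctrl_arch_* destroy callback would restore.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CLOSID 4   /* hypothetical number of control groups */

struct bc_cache {
	uint32_t mrbwb;                  /* reservation budget */
	uint32_t rbwb_cache[NR_CLOSID];  /* last committed Rbwb per closid */
	bool in_use[NR_CLOSID];          /* closid currently allocated? */
};

/* Sum check: every cached entry counts, whether the closid is in use or not. */
static bool rbwb_fits(const struct bc_cache *bc, unsigned int closid,
		      uint32_t want)
{
	uint64_t sum = want;
	unsigned int i;

	assert(closid < NR_CLOSID);
	for (i = 0; i < NR_CLOSID; i++) {
		if (i != closid)
			sum += bc->rbwb_cache[i];
	}
	return sum <= bc->mrbwb;
}

/* Models the current behaviour: the closid is released, the cache is not. */
static void group_free(struct bc_cache *bc, unsigned int closid)
{
	assert(closid < NR_CLOSID);
	bc->in_use[closid] = false;
}

/* Hypothetical destroy hook: release the closid and drop its reservation. */
static void group_free_with_reset(struct bc_cache *bc, unsigned int closid)
{
	group_free(bc, closid);
	bc->rbwb_cache[closid] = 0;   /* assumed default for illustration */
}
```

After group_free() on a group that held 80 of 100 blocks, a request for
50 blocks from another group is still refused. After
group_free_with_reset() the same request fits.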

Changes in v5:
--------------
This revision addresses the feedback from the Sashiko review of v4
(link below).

Ssqosid:
 - Seed cpu_srmcfg to U32_MAX in DEFINE_PER_CPU so early-boot context
   switches always write the CSR rather than matching a zero-initialised
   cache before riscv_srmcfg_init() runs.
 - __switch_to_srmcfg() evaluates RCID and MCID against
   cpu_srmcfg_default independently. A task in the default RCID group
   with a specific MCID previously bypassed the CPU default.
 - Register a CPU PM notifier that invalidates cpu_srmcfg on
   CPU_PM_EXIT / CPU_PM_ENTER_FAILED so resume-from-suspend on the boot
   CPU writes the CSR.
 - Drop the for_each_online_cpu pre-seed loop in riscv_srmcfg_init().
   cpuhp_setup_state() already runs the startup callback on every CPU
   that is online at registration time.
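
The cached-CSR scheme described in the Ssqosid items above can be
modelled in userspace as follows. This is a sketch, not the code from
the series: the CSR is a plain variable, a single CPU is modelled, the
RCID/MCID field positions are an assumption that should be checked
against the Ssqosid spec [1], and treating a zero field in the task
value as "use the CPU default" is also an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define SRMCFG_INVALID   UINT32_MAX    /* mirrors the U32_MAX seed */
#define SRMCFG_RCID_MASK 0x00000fffu   /* assumed field layout */
#define SRMCFG_MCID_MASK 0x0fff0000u   /* assumed field layout */

static uint32_t fake_csr;        /* stands in for the srmcfg CSR */
static unsigned int csr_writes;  /* counts simulated csrw instructions */

static uint32_t cpu_srmcfg = SRMCFG_INVALID;  /* per-CPU cache of the CSR */
static uint32_t cpu_srmcfg_default;           /* per-CPU default RCID/MCID */

/* Resolve each field independently: a zero field falls back to the default. */
static uint32_t resolve_srmcfg(uint32_t task_val)
{
	uint32_t rcid = task_val & SRMCFG_RCID_MASK;
	uint32_t mcid = task_val & SRMCFG_MCID_MASK;

	if (!rcid)
		rcid = cpu_srmcfg_default & SRMCFG_RCID_MASK;
	if (!mcid)
		mcid = cpu_srmcfg_default & SRMCFG_MCID_MASK;
	return rcid | mcid;
}

/* Write the CSR only when the resolved value differs from the cache. */
static void switch_to_srmcfg(uint32_t task_val)
{
	uint32_t val = resolve_srmcfg(task_val);

	/* A valid value never equals the sentinel, so the seed forces a write. */
	assert(val != SRMCFG_INVALID);
	if (val == cpu_srmcfg)
		return;
	fake_csr = val;
	csr_writes++;
	cpu_srmcfg = val;
}

/* What a CPU PM or hotplug callback would do: forget the cached value. */
static void srmcfg_invalidate(void)
{
	cpu_srmcfg = SRMCFG_INVALID;
}
```

Because the cache starts at the sentinel rather than 0, the first call
writes the CSR even when the resolved value is 0, and any call after
srmcfg_invalidate() writes it again.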

CBQRI:
 - Add mweight_cache. cbqri_apply_bc_field() seeds both fields of
   bc_bw_alloc from the software caches, so that stale data cannot leak
   into the unmodified field.
 - Seed mweight_cache to FIELD_MAX(MWEIGHT_MASK) at probe so the first
   MB_MIN domain init does not commit Mweight=0 to every RCID. A weight
   of 0 is a hard cap on opportunistic bandwidth, which would starve
   every RCID until the subsequent MB_WGHT domain init catches up.
 - cbqri_apply_mweight_config() rejects mweight > WEIGHT_MASK at entry
   rather than letting it truncate and trigger a verify mismatch.
 - cbqri_apply_bc_field() updates per-RCID cache only after verifying.
 - cbqri_controller_destroy() now iounmaps and releases the mem region
   from rollback paths, gated on ctrl->base.
 - cbqri_probe_feature() clears OP, AT, RCID and EVT_ID on every write,
   so the probe never writes stale bits into the register.
 - cbqri_apply_cache_config() clears cc_block_mask before the initial
   READ_LIMIT that captures saved_cbm.
 - Drop the ctrl->faulted early return from controller ops.
 - Reject a second bandwidth controller when sharing a proximity domain.
 - Reject ctrl->rcid_count > SRMCFG_RCID_MASK so the schedule-in
   fast path cannot silently truncate the RCID.
 - Widen CBQRI_MON_CTL_OP/MCID/EVT_ID masks to GENMASK_ULL so
   FIELD_MODIFY on a u64 register stays safe if RV32 support is added.
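
The bandwidth-field update flow described above (seed both fields from
the caches, range-check at entry, commit to the cache only after the
read-back verifies) can be sketched like this. It is a userspace model
under assumptions: struct fake_bc, bc_init() and bc_apply_field() are
hypothetical, the register is an array element rather than MMIO, and
the Rbwb/Mweight bit positions are chosen for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_RCID       4                          /* hypothetical */
#define RBWB_MASK     0x0000ffffull              /* assumed field layout */
#define MWEIGHT_SHIFT 20                         /* assumed field layout */
#define MWEIGHT_MASK  (0xffull << MWEIGHT_SHIFT)
#define MWEIGHT_MAX   0xffu

enum bc_field { BC_RBWB, BC_MWEIGHT };

struct fake_bc {
	uint64_t hw[NR_RCID];           /* simulated per-RCID hardware state */
	bool drop_writes;               /* simulate a controller ignoring writes */
	uint16_t rbwb_cache[NR_RCID];
	uint8_t mweight_cache[NR_RCID];
};

/* Seed Mweight to its maximum so an Rbwb-only write never commits weight 0. */
static void bc_init(struct fake_bc *bc)
{
	unsigned int i;

	bc->drop_writes = false;
	for (i = 0; i < NR_RCID; i++) {
		bc->hw[i] = 0;
		bc->rbwb_cache[i] = 0;
		bc->mweight_cache[i] = MWEIGHT_MAX;
	}
}

/* Returns 0 on success, -1 on a range error or a verify mismatch. */
static int bc_apply_field(struct fake_bc *bc, unsigned int rcid,
			  enum bc_field field, uint32_t val)
{
	uint16_t rbwb;
	uint8_t mweight;
	uint64_t want;

	assert(rcid < NR_RCID);
	rbwb = bc->rbwb_cache[rcid];       /* both fields start from the caches */
	mweight = bc->mweight_cache[rcid];

	if (field == BC_RBWB) {
		if (val > RBWB_MASK)
			return -1;         /* reject instead of truncating */
		rbwb = (uint16_t)val;
	} else {
		if (val > MWEIGHT_MAX)
			return -1;
		mweight = (uint8_t)val;
	}

	want = ((uint64_t)rbwb & RBWB_MASK) |
	       (((uint64_t)mweight << MWEIGHT_SHIFT) & MWEIGHT_MASK);
	if (!bc->drop_writes)
		bc->hw[rcid] = want;       /* models the write operation */
	if (bc->hw[rcid] != want)
		return -1;                 /* models read-back verification */

	bc->rbwb_cache[rcid] = rbwb;       /* commit only after verifying */
	bc->mweight_cache[rcid] = mweight;
	return 0;
}
```

In this model, writing Rbwb for a fresh RCID leaves Mweight at 255 in
the register, and a failed verify leaves both caches unchanged.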

resctrl:
 - Switch the L3 mon_domain teardown paths from cancel_delayed_work_sync
   to cancel_delayed_work to avoid potential deadlock.
 - Guard the mbm_over cancel on QOS_L3_MBM_TOTAL_EVENT_ID, so a system
   without a paired BC does not cancel a zeroed work struct.
 - cbqri_attach_cpu_to_cap_ctrl() rolls back cpumask_set_cpu and any
   freshly created ctrl_domain when cbqri_attach_cpu_to_l3_mon() fails.
 - Restrict mbm_total_bytes to platforms with exactly one L3 domain.
 - Pair the L3 mon domain with its BC and initialise the BC's
   per-MCID accumulators before resctrl_online_mon_domain() exposes
   the domain, so a concurrent mbm_total_bytes read cannot race with
   paired_bc init.
 - Hold cbqri_domain_list_lock across the MMIO paths in
   resctrl_arch_rmid_read() and resctrl_arch_reset_rmid() so a
   concurrent CPU hotplug detach cannot free hw_dom mid-read.
 - cbqri_resctrl_setup() rolls back exposed_alloc_capable /
   exposed_mon_capable on resctrl_init() failure so
   resctrl_arch_*_capable() does not report stale state to callers.
 - Drop the cacheinfo_ready wait queue in cbqri_resctrl_setup() and
   the RCU annotations on the ctrl_domain list. cacheinfo runs at
   device_initcall_sync, strictly before late_initcall, and the list
   is mutated only from cpuhp callbacks under cbqri_domain_list_lock.
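
The attach rollback described above for cbqri_attach_cpu_to_cap_ctrl()
follows a common pattern: undo the cpumask bit, and free the domain only
if this call created it. A minimal userspace sketch, assuming
hypothetical names (struct attach_state, attach_mon(), attach_cpu()) and
an unsigned long standing in for the cpumask:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

struct domain {
	unsigned long cpu_mask;   /* stands in for a struct cpumask */
};

struct attach_state {
	struct domain *ctrl_dom;  /* NULL until the first CPU attaches */
	bool fail_mon;            /* test knob: make the monitor attach fail */
};

/* Stand-in for the L3 monitor attach step, which may fail. */
static int attach_mon(const struct attach_state *s)
{
	return s->fail_mon ? -1 : 0;
}

/* Returns 0 on success, -1 on failure with all side effects undone. */
static int attach_cpu(struct attach_state *s, unsigned int cpu)
{
	bool created = false;

	assert(cpu < sizeof(unsigned long) * CHAR_BIT);
	if (!s->ctrl_dom) {
		s->ctrl_dom = calloc(1, sizeof(*s->ctrl_dom));
		if (!s->ctrl_dom)
			return -1;
		created = true;
	}
	s->ctrl_dom->cpu_mask |= 1ul << cpu;

	if (attach_mon(s)) {
		/* Roll back in reverse order of the steps above. */
		s->ctrl_dom->cpu_mask &= ~(1ul << cpu);
		if (created) {
			free(s->ctrl_dom);
			s->ctrl_dom = NULL;
		}
		return -1;
	}
	return 0;
}
```

A domain that existed before the failing call is kept, with only the
failing CPU's bit cleared.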

Kconfig:
 - RISCV_ISA_SSQOSID selects RISCV_CBQRI_DRIVER unconditionally. resctrl
   is gated separately by the silent RISCV_CBQRI_RESCTRL_FS option.

ACPI:
 - acpi_parse_rqsc() rejects tables with the wrong header.revision,
   validates res0->type and res0->id_type, and checks that node->length
   does not overrun the table end.
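
The node->length overrun check above is an instance of a
bounds-checked subtable walk. The sketch below shows that technique in
isolation. struct node_hdr and count_nodes() are hypothetical and do not
reflect the RQSC table layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct node_hdr {      /* hypothetical subtable header, not the RQSC layout */
	uint8_t type;
	uint8_t reserved;
	uint16_t length;   /* total node length in bytes, header included */
};

/* Count nodes in [buf, buf + table_len). Return -1 on a malformed table. */
static int count_nodes(const uint8_t *buf, size_t table_len)
{
	size_t off = 0;
	int n = 0;

	assert(buf != NULL || table_len == 0);
	while (off < table_len) {
		struct node_hdr hdr;

		if (table_len - off < sizeof(hdr))
			return -1;   /* truncated header */
		memcpy(&hdr, buf + off, sizeof(hdr));  /* no unaligned access */
		if (hdr.length < sizeof(hdr))
			return -1;   /* would not advance past the header */
		if (hdr.length > table_len - off)
			return -1;   /* node overruns the table end */
		off += hdr.length;
		n++;
	}
	return n;
}
```

Comparing hdr.length against the remaining bytes (table_len - off),
rather than computing off + hdr.length first, avoids an overflow in the
check itself.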

Sashiko review:
https://sashiko.dev/#/patchset/20260510-ssqosid-cbqri-rqsc-v7-0-v4-0-eb53831ef683%40kernel.org

Link to v4:
https://lore.kernel.org/all/20260510-ssqosid-cbqri-rqsc-v7-0-v4-0-eb53831ef683@kernel.org/

Changes in v4:
--------------
resctrl:
 - Add RDT_RESOURCE_MB_MIN and RDT_RESOURCE_MB_WGHT
 - Add default_to_min to resctrl_membw so MB_MIN defaults to min_bw
 - Add L3 cache occupancy monitoring for L3-scoped capacity controllers
 - Add mbm_total_bytes bandwidth monitoring when there is a single
   bandwidth controller
 - Move domain creation into cpuhp callbacks so that cpu_mask reflects
   only online CPUs
 - resctrl_arch_reset_rmid() returns early when called with IRQs
   disabled.

CBQRI:
 - Replace per-controller spinlock with mutex. Each CBQRI op is a
   write-then-poll-busy cycle of up to 1 ms. A sleeping mutex paired
   with readq_poll_timeout() keeps preemption enabled across the
   busy-wait. All resctrl-arch entry points run in process context.
 - Replace struct cbqri_config with direct params in helper functions.
 - max_rmid = min(max_rmid, ctrl->mcid_count) now gated on
   ctrl->mon_capable.
 - Validate that the sum of Rbwb does not exceed MRBWB.
 - Move CDP enable state from file-scope globals to per-resource
   cdp_enabled / cdp_capable.
 - Configure both AT_CODE and AT_DATA limits when CDP is supported but
   not enabled.

Ssqosid:
 - __switch_to_srmcfg() emits RISCV_FENCE(rw, o) before and (o, rw)
   after csrw to drain old-task stores and order new-task loads.
 - Invalidate per-cpu cpu_srmcfg on hart online via CPUHP_AP_ONLINE_DYN.
   Also seed already-online CPUs synchronously at init.

ACPI:
 - Drop the PPTT helper patch and resolve cache_size via cacheinfo at
   cbqri_resctrl_setup() time.
 - ACPI driver now calls riscv_cbqri_register_controller() and the
   cbqri_controller internals stay in cbqri_internal.h.

Refer to v3 for previous change logs:
https://lore.kernel.org/r/20260414-ssqosid-cbqri-rqsc-v7-0-v3-0-b3b2e7e9847a@kernel.org

---
Drew Fustini (18):
      dt-bindings: riscv: Add Ssqosid extension description
      riscv: detect the Ssqosid extension
      riscv: add support for srmcfg CSR from Ssqosid extension
      fs/resctrl: Add resctrl_is_membw() helper
      fs/resctrl: Add RDT_RESOURCE_MB_MIN and RDT_RESOURCE_MB_WGHT
      fs/resctrl: Let bandwidth resources default to min_bw at reset
      riscv_cbqri: Add capacity controller probe and allocation device ops
      riscv_cbqri: Add capacity controller monitoring device ops
      riscv_cbqri: Add bandwidth controller probe and allocation device ops
      riscv_cbqri: Add bandwidth controller monitoring device ops
      riscv_cbqri: resctrl: Add cache allocation via capacity block mask
      riscv_cbqri: resctrl: Add L3 cache occupancy monitoring
      riscv_cbqri: resctrl: Add MB_MIN bandwidth allocation via Rbwb
      riscv_cbqri: resctrl: Add MB_WGHT bandwidth allocation via Mweight
      riscv_cbqri: resctrl: Add mbm_total_bytes bandwidth monitoring
      ACPI: RISC-V: Parse RISC-V Quality of Service Controller (RQSC) table
      ACPI: RISC-V: Add support for RISC-V Quality of Service Controller (RQSC)
      riscv: enable resctrl filesystem for Ssqosid

 .../devicetree/bindings/riscv/extensions.yaml      |    6 +
 MAINTAINERS                                        |   15 +
 arch/riscv/Kconfig                                 |   20 +
 arch/riscv/include/asm/acpi.h                      |   10 +
 arch/riscv/include/asm/csr.h                       |    5 +
 arch/riscv/include/asm/hwcap.h                     |    1 +
 arch/riscv/include/asm/processor.h                 |    3 +
 arch/riscv/include/asm/qos.h                       |   87 ++
 arch/riscv/include/asm/resctrl.h                   |  152 ++
 arch/riscv/include/asm/switch_to.h                 |    3 +
 arch/riscv/kernel/Makefile                         |    2 +
 arch/riscv/kernel/cpufeature.c                     |    1 +
 arch/riscv/kernel/qos.c                            |   84 ++
 drivers/acpi/riscv/Makefile                        |    1 +
 drivers/acpi/riscv/init.c                          |   21 +
 drivers/acpi/riscv/rqsc.c                          |  194 +++
 drivers/acpi/riscv/rqsc.h                          |   63 +
 drivers/resctrl/Kconfig                            |   32 +
 drivers/resctrl/Makefile                           |    6 +
 drivers/resctrl/cbqri_devices.c                    | 1100 +++++++++++++++
 drivers/resctrl/cbqri_internal.h                   |  246 ++++
 drivers/resctrl/cbqri_resctrl.c                    | 1458 ++++++++++++++++++++
 fs/resctrl/ctrlmondata.c                           |    3 +-
 fs/resctrl/internal.h                              |    2 +
 fs/resctrl/rdtgroup.c                              |   16 +-
 include/linux/resctrl.h                            |   13 +-
 include/linux/riscv_cbqri.h                        |   66 +
 27 files changed, 3601 insertions(+), 9 deletions(-)
---
base-commit: 5200f5f493f79f14bbdc349e402a40dfb32f23c8
change-id: 20260329-ssqosid-cbqri-rqsc-v7-0-b0c788bab48a

Best regards,
--  
Drew Fustini <fustini@kernel.org>

