From: sashiko-bot@kernel.org
To: "Drew Fustini" <fustini@kernel.org>
Cc: robh@kernel.org, conor+dt@kernel.org, devicetree@vger.kernel.org,
krzk+dt@kernel.org
Subject: Re: [PATCH RFC v4 07/18] riscv_cbqri: Add capacity controller probe and allocation device ops
Date: Tue, 12 May 2026 01:26:34 +0000 [thread overview]
Message-ID: <20260512012635.84FCCC2BCB0@smtp.kernel.org> (raw)
In-Reply-To: <20260510-ssqosid-cbqri-rqsc-v7-0-v4-7-eb53831ef683@kernel.org>
Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:
- [High] Resource leak in `cbqri_controller_destroy()` leaves I/O mappings and memory regions permanently reserved.
- [High] Stale RCID mask in `cbqri_probe_feature()` causes the AT=CODE probe to execute against an uninitialized, arbitrary RCID.
- [High] Missing pre-clear of `BLOCK_MASK` register allows capturing stale MMIO garbage, corrupting controller state upon revert.
- [High] Cache size initialization relies on offline CPUs during early boot, permanently disabling occupancy monitoring.
- [Medium] Failure to validate firmware-provided RCID and MCID counts against defined safety caps allows boot-time soft lockups.
--
commit bb6b9ddd95867f863b5b7c53d6d572cc706cebd6
Author: Drew Fustini <fustini@kernel.org>
riscv_cbqri: Add capacity controller probe and allocation device ops
Add support for the RISC-V CBQRI capacity controller (CC). The firmware
discovery layer (ACPI or DT) is responsible for passing the
cbqri_controller_info descriptor to riscv_cbqri_register_controller().
The driver resolves the cpumask so callers do not need the cacheinfo
topology.
> diff --git a/drivers/resctrl/cbqri_devices.c b/drivers/resctrl/cbqri_devices.c
> new file mode 100644
> index 0000000000000..dc76a146e34df
> --- /dev/null
> +++ b/drivers/resctrl/cbqri_devices.c
[ ... ]
> +int cbqri_apply_cache_config(struct cbqri_controller *ctrl, u32 closid,
> + const struct cbqri_cc_config *cfg)
> +{
> + bool need_at_mirror;
> + u64 saved_cbm = 0;
> + int err = 0;
> + u64 reg;
> +
> + mutex_lock(&ctrl->lock);
> +
> + need_at_mirror = ctrl->cc.supports_alloc_at_code && !cfg->cdp_enabled;
> +
> + /*
> + * Capture the cfg->at half CBM before any write so a partial
> + * AT-mirror failure can revert and keep the two halves consistent.
> + */
> + if (need_at_mirror) {
> + err = cbqri_cc_alloc_op(ctrl, CBQRI_CC_ALLOC_CTL_OP_READ_LIMIT,
> + closid, cfg->at);
> + if (err < 0)
> + goto out;
> + saved_cbm = ioread64(ctrl->base + CBQRI_CC_BLOCK_MASK_OFF);
> + }
To mitigate silent firmware no-ops where the hardware returns success without
updating the staging register, cbqri_read_cache_config() pre-clears
the register.
If a silent no-op occurs here during the READ_LIMIT operation, could
saved_cbm capture stale MMIO data? If a revert is later triggered, does
this risk writing garbage data back and corrupting the controller's capacity
allocation state?
[ ... ]
> +static int cbqri_probe_feature(struct cbqri_controller *ctrl, int reg_offset,
> + int operation, int *status, bool *access_type_supported)
> +{
[ ... ]
> + /* Execute the requested operation to find if the register is implemented */
> + FIELD_MODIFY(CBQRI_CONTROL_REGISTERS_OP_MASK, ®, operation);
> + reg &= ~CBQRI_CONTROL_REGISTERS_RCID_MASK;
> + iowrite64(reg, ctrl->base + reg_offset);
[ ... ]
> + /* Get the operation status */
> + *status = FIELD_GET(CBQRI_CONTROL_REGISTERS_STATUS_MASK, reg);
> +
> + /*
> + * Check for the AT support if the register is implemented
> + * (if not, the status value will remain 0)
> + */
> + if (*status != 0) {
> + /*
> + * Re-issue operation with AT=CODE so the controller
> + * latches AT=CODE on supported hardware (or resets it to 0
> + * on hardware that doesn't). OP must be a defined CBQRI op
> + * here. OP=0 is a no-op and would silently disable CDP.
> + */
> + reg = saved_reg;
> + FIELD_MODIFY(CBQRI_CONTROL_REGISTERS_OP_MASK, ®, operation);
> + FIELD_MODIFY(CBQRI_CONTROL_REGISTERS_AT_MASK, ®,
> + CBQRI_CONTROL_REGISTERS_AT_CODE);
> + iowrite64(reg, ctrl->base + reg_offset);
When issuing the second probe to verify AT support, the code restores the raw
boot-time register state via reg = saved_reg.
Does this execute the command against whatever uninitialized RCID value the
firmware originally left in the register, since the RCID mask is not cleared
again? If the RCID is out of bounds, could the hardware reject the operation,
causing the driver to falsely report that supports_alloc_at_code is false?
[ ... ]
> +static int cbqri_probe_cc(struct cbqri_controller *ctrl)
> +{
[ ... ]
> + /*
> + * Resolve cache_size via cacheinfo. cpus_read_lock satisfies
> + * lockdep_assert_cpus_held() inside get_cpu_cacheinfo_level(). If
> + * every cpu_mask member is offline, cache_size stays 0 and the
> + * controller cannot back occupancy monitoring.
> + */
> + cpus_read_lock();
> + if (!ctrl->cache.cache_size) {
> + int cpu = cpumask_first_and(&ctrl->cache.cpu_mask, cpu_online_mask);
> +
> + if (cpu < nr_cpu_ids) {
> + struct cacheinfo *ci;
> +
> + ci = get_cpu_cacheinfo_level(cpu, ctrl->cache.cache_level);
> + if (ci)
> + ctrl->cache.cache_size = ci->size;
> + }
> + }
> + cpus_read_unlock();
A comment in riscv_cbqri_register_controller() states that cache_size is
filled lazily during probe.
Since cbqri_probe_cc() is invoked synchronously during early boot
registration, wouldn't CPUs in other clusters still be offline? Because there
is no lazy re-evaluation mechanism or CPU hotplug hook later on, does this
permanently leave ctrl->cache.cache_size at 0 and break occupancy monitoring
calculations for those controllers?
[ ... ]
> +void cbqri_controller_destroy(struct cbqri_controller *ctrl)
> +{
> + kfree(ctrl);
> +}
When riscv_cbqri_unregister_last() is used to roll back a successful
registration, it calls this function to free the controller.
Does this leak the I/O mappings and memory regions, since it fails to call
iounmap(ctrl->base) and release_mem_region() for the resources successfully
allocated during cbqri_probe_controller()?
[ ... ]
> +int riscv_cbqri_register_controller(const struct cbqri_controller_info *info)
> +{
> + struct cbqri_controller *ctrl;
> + int err;
> +
> + if (!info->addr) {
> + pr_warn("skipping controller with invalid addr=0x0\n");
> + return -EINVAL;
> + }
> +
> + ctrl = kzalloc(sizeof(*ctrl), GFP_KERNEL);
> + if (!ctrl)
> + return -ENOMEM;
> +
> + mutex_init(&ctrl->lock);
> +
> + ctrl->addr = info->addr;
> + ctrl->size = info->size;
> + ctrl->type = info->type;
> + ctrl->rcid_count = info->rcid_count;
> + ctrl->mcid_count = info->mcid_count;
The cbqri header explicitly defines CBQRI_MAX_RCID and CBQRI_MAX_MCID as 1024,
alongside documentation warning that a malformed firmware table claiming a
large number of IDs could trip the soft-lockup watchdog during per-id MMIO
init loops.
Is it safe to blindly assign these counts without validating them against the
safety caps? Could a malformed ACPI or DT table bypass this protection and
trigger a soft lockup during boot?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260510-ssqosid-cbqri-rqsc-v7-0-v4-0-eb53831ef683@kernel.org?part=7
next prev parent reply other threads:[~2026-05-12 1:26 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-11 5:10 [PATCH RFC v4 00/18] riscv: add Ssqosid and CBQRI resctrl support Drew Fustini
2026-05-11 5:10 ` [PATCH RFC v4 01/18] dt-bindings: riscv: Add Ssqosid extension description Drew Fustini
2026-05-11 5:10 ` [PATCH RFC v4 02/18] riscv: detect the Ssqosid extension Drew Fustini
2026-05-11 5:10 ` [PATCH RFC v4 03/18] riscv: add support for srmcfg CSR from " Drew Fustini
2026-05-11 23:52 ` sashiko-bot
2026-05-11 5:11 ` [PATCH RFC v4 04/18] fs/resctrl: Add resctrl_is_membw() helper Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 05/18] fs/resctrl: Add RDT_RESOURCE_MB_MIN and RDT_RESOURCE_MB_WGHT Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 06/18] fs/resctrl: Let bandwidth resources default to min_bw at reset Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 07/18] riscv_cbqri: Add capacity controller probe and allocation device ops Drew Fustini
2026-05-12 1:26 ` sashiko-bot [this message]
2026-05-11 5:11 ` [PATCH RFC v4 08/18] riscv_cbqri: Add capacity controller monitoring " Drew Fustini
2026-05-12 1:58 ` sashiko-bot
2026-05-11 5:11 ` [PATCH RFC v4 09/18] riscv_cbqri: Add bandwidth controller probe and allocation " Drew Fustini
2026-05-12 2:29 ` sashiko-bot
2026-05-11 5:11 ` [PATCH RFC v4 10/18] riscv_cbqri: Add bandwidth controller monitoring " Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 11/18] riscv_cbqri: resctrl: Add cache allocation via capacity block mask Drew Fustini
2026-05-12 4:01 ` sashiko-bot
2026-05-11 5:11 ` [PATCH RFC v4 12/18] riscv_cbqri: resctrl: Add L3 cache occupancy monitoring Drew Fustini
2026-05-12 5:00 ` sashiko-bot
2026-05-11 5:11 ` [PATCH RFC v4 13/18] riscv_cbqri: resctrl: Add MB_MIN bandwidth allocation via Rbwb Drew Fustini
2026-05-12 5:42 ` sashiko-bot
2026-05-11 5:11 ` [PATCH RFC v4 14/18] riscv_cbqri: resctrl: Add MB_WGHT bandwidth allocation via Mweight Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 15/18] riscv_cbqri: resctrl: Add mbm_total_bytes bandwidth monitoring Drew Fustini
2026-05-12 21:26 ` sashiko-bot
2026-05-11 5:11 ` [PATCH RFC v4 16/18] ACPI: RISC-V: Parse RISC-V Quality of Service Controller (RQSC) table Drew Fustini
2026-05-12 21:46 ` sashiko-bot
2026-05-11 5:11 ` [PATCH RFC v4 17/18] ACPI: RISC-V: Add support for RISC-V Quality of Service Controller (RQSC) Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 18/18] riscv: enable resctrl filesystem for Ssqosid Drew Fustini
2026-05-12 22:56 ` sashiko-bot
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260512012635.84FCCC2BCB0@smtp.kernel.org \
--to=sashiko-bot@kernel.org \
--cc=conor+dt@kernel.org \
--cc=devicetree@vger.kernel.org \
--cc=fustini@kernel.org \
--cc=krzk+dt@kernel.org \
--cc=robh@kernel.org \
--cc=sashiko@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox