From: sashiko-bot@kernel.org
To: "Drew Fustini" <fustini@kernel.org>
Cc: krzk+dt@kernel.org, robh@kernel.org, conor+dt@kernel.org,
devicetree@vger.kernel.org
Subject: Re: [PATCH RFC v4 09/18] riscv_cbqri: Add bandwidth controller probe and allocation device ops
Date: Tue, 12 May 2026 02:29:38 +0000 [thread overview]
Message-ID: <20260512022939.683A1C2BCB0@smtp.kernel.org> (raw)
In-Reply-To: <20260510-ssqosid-cbqri-rqsc-v7-0-v4-9-eb53831ef683@kernel.org>
Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:
- [High] State divergence on hardware verification failure breaks capacity invariants.
- [High] Unprotected initial `READ_LIMIT` is vulnerable to silent hardware failure and causes cross-RCID state corruption.
- [High] A single transient hardware timeout permanently disables the controller due to a latched `faulted` flag.
- [Medium] Feature probing fails to clear `EVT_ID`, risking false-negative feature detection.
- [Medium] Missing bounds check on `mweight` causes silent truncation and false I/O errors.
--
commit 40fea10bc5e942138ec126160ec0621edcc9673f
Author: Drew Fustini <fustini@kernel.org>
riscv_cbqri: Add bandwidth controller probe and allocation device ops
Add support for CBQRI bandwidth controller (BC) discovery and the two BC
allocation control knobs. Both fields share the bc_bw_alloc register, so
each write needs READ_LIMIT-modify-CONFIG_LIMIT-verify with a sentinel.
> diff --git a/drivers/resctrl/cbqri_devices.c b/drivers/resctrl/cbqri_devices.c
> index e46b02d2c50d7..7e5decd7a6b2e 100644
> --- a/drivers/resctrl/cbqri_devices.c
> +++ b/drivers/resctrl/cbqri_devices.c
> @@ -145,6 +184,44 @@ int cbqri_mon_op(struct cbqri_controller *ctrl, int reg_offset,
> return 0;
> }
>
> +/*
> + * Perform bandwidth allocation control operation on bandwidth controller.
> + * Caller must hold ctrl->lock.
> + */
> +static int cbqri_bc_alloc_op(struct cbqri_controller *ctrl, int operation, int rcid)
> +{
> + int reg_offset = CBQRI_BC_ALLOC_CTL_OFF;
> + int status;
> + u64 reg;
> +
> + lockdep_assert_held(&ctrl->lock);
> +
> + if (ctrl->faulted)
> + return -EIO;
> +
> + if (cbqri_wait_busy_flag(ctrl, reg_offset, ®) < 0) {
If a timeout in cbqri_wait_busy_flag() sets ctrl->faulted = true, will
runtime operations like cbqri_bc_alloc_op() ever execute again? Because
it returns early here when ctrl->faulted is true, the polling function
is never called again to clear the flag, which seems to permanently
disable the controller after a single timeout.
[ ... ]
> +static int cbqri_apply_bc_field(struct cbqri_controller *ctrl, u32 closid,
> + void (*set)(struct cbqri_controller *, u64),
> + u64 (*get)(struct cbqri_controller *),
> + u64 val, bool *committed)
> +{
> + int ret;
> + u64 reg;
> +
> + lockdep_assert_held(&ctrl->lock);
> +
> + /* Load current RCID state so the unmodified field is preserved */
> + ret = cbqri_bc_alloc_op(ctrl, CBQRI_BC_ALLOC_CTL_OP_READ_LIMIT, closid);
> + if (ret < 0)
> + return ret;
Does this initial read need the same sentinel protection as the
verification read? If this READ_LIMIT silently fails, the staging
register might retain data from a previous operation (like a different
RCID). Since set() only modifies the target field, CONFIG_LIMIT could
commit the stale data in the unmodified field, causing cross-RCID state
corruption.
> +
> + set(ctrl, val);
> +
> + ret = cbqri_bc_alloc_op(ctrl, CBQRI_BC_ALLOC_CTL_OP_CONFIG_LIMIT, closid);
> + if (ret < 0)
> + return ret;
> +
> + /*
> + * CONFIG_LIMIT committed. The per-CLOSID software cache must
> + * track hardware regardless of whether the verify below passes.
> + */
> + if (committed)
> + *committed = true;
[ ... ]
> + reg = get(ctrl);
> + if (reg != val) {
> + pr_err_ratelimited("BC field verify mismatch (reg=0x%llx != val=%llu)\n",
> + reg, val);
> + return -EIO;
> + }
> +
> + return 0;
> +}
If the verification fails here and returns -EIO, *committed remains
true. When called from cbqri_apply_rbwb(), this causes the rejected
value to be cached in ctrl->rbwb_cache[closid]. If the requested value
was smaller than the previous one, could this cause the software cache
to underestimate the reserved capacity and allow future allocations to
exceed the MRBWB hardware limit?
[ ... ]
> +int cbqri_apply_mweight_config(struct cbqri_controller *ctrl, u32 closid,
> + u64 mweight)
> +{
> + int ret;
> +
> + mutex_lock(&ctrl->lock);
> + ret = cbqri_apply_bc_field(ctrl, closid,
> + cbqri_set_mweight, cbqri_get_mweight,
> + mweight, NULL);
> + mutex_unlock(&ctrl->lock);
> + return ret;
> +}
What happens if an arbitrarily large mweight is passed in? FIELD_MODIFY
silently truncates values larger than the 8-bit MWEIGHT_MASK. The
verification step in cbqri_apply_bc_field() will then compare the
truncated hardware value against the original untruncated val and
always fail, returning -EIO and logging a hardware failure instead of
returning -EINVAL. Should this have bounds checking similar to
cbqri_apply_rbwb()?
[ ... ]
> @@ -437,6 +664,83 @@ static int cbqri_probe_cc(struct cbqri_controller *ctrl)
> return 0;
> }
>
> +static int cbqri_probe_bc(struct cbqri_controller *ctrl)
> +{
> + bool has_mon_at_code = false;
> + int err, status;
> + u64 reg;
[ ... ]
> + /* Probe monitoring features */
> + err = cbqri_probe_feature(ctrl, CBQRI_BC_MON_CTL_OFF,
> + CBQRI_BC_MON_CTL_OP_READ_COUNTER, &status,
> + &has_mon_at_code);
When probing features, cbqri_probe_feature() modifies the OP mask and
clears RCID_MASK, but it doesn't appear to clear EVT_ID_MASK:
drivers/resctrl/cbqri_devices.c:cbqri_probe_feature() {
...
FIELD_MODIFY(CBQRI_CONTROL_REGISTERS_OP_MASK, ®, operation);
reg &= ~CBQRI_CONTROL_REGISTERS_RCID_MASK;
iowrite64(reg, ctrl->base + reg_offset);
...
}
If the hardware or firmware left an unsupported Event ID in those bits,
could the hardware reject the READ_COUNTER operation, causing the probe
to see STATUS != SUCCESS and incorrectly assume monitoring is not
supported?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260510-ssqosid-cbqri-rqsc-v7-0-v4-0-eb53831ef683@kernel.org?part=9
next prev parent reply other threads:[~2026-05-12 2:29 UTC|newest]
Thread overview: 52+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-11 5:10 [PATCH RFC v4 00/18] riscv: add Ssqosid and CBQRI resctrl support Drew Fustini
2026-05-11 5:10 ` Drew Fustini
2026-05-11 5:10 ` [PATCH RFC v4 01/18] dt-bindings: riscv: Add Ssqosid extension description Drew Fustini
2026-05-11 5:10 ` Drew Fustini
2026-05-11 5:10 ` [PATCH RFC v4 02/18] riscv: detect the Ssqosid extension Drew Fustini
2026-05-11 5:10 ` Drew Fustini
2026-05-11 5:10 ` [PATCH RFC v4 03/18] riscv: add support for srmcfg CSR from " Drew Fustini
2026-05-11 5:10 ` Drew Fustini
2026-05-11 23:52 ` sashiko-bot
2026-05-14 22:24 ` Drew Fustini
2026-05-14 22:24 ` Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 04/18] fs/resctrl: Add resctrl_is_membw() helper Drew Fustini
2026-05-11 5:11 ` Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 05/18] fs/resctrl: Add RDT_RESOURCE_MB_MIN and RDT_RESOURCE_MB_WGHT Drew Fustini
2026-05-11 5:11 ` Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 06/18] fs/resctrl: Let bandwidth resources default to min_bw at reset Drew Fustini
2026-05-11 5:11 ` Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 07/18] riscv_cbqri: Add capacity controller probe and allocation device ops Drew Fustini
2026-05-11 5:11 ` Drew Fustini
2026-05-12 1:26 ` sashiko-bot
2026-05-15 13:46 ` Drew Fustini
2026-05-15 13:46 ` Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 08/18] riscv_cbqri: Add capacity controller monitoring " Drew Fustini
2026-05-11 5:11 ` Drew Fustini
2026-05-12 1:58 ` sashiko-bot
2026-05-11 5:11 ` [PATCH RFC v4 09/18] riscv_cbqri: Add bandwidth controller probe and allocation " Drew Fustini
2026-05-11 5:11 ` Drew Fustini
2026-05-12 2:29 ` sashiko-bot [this message]
2026-05-11 5:11 ` [PATCH RFC v4 10/18] riscv_cbqri: Add bandwidth controller monitoring " Drew Fustini
2026-05-11 5:11 ` Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 11/18] riscv_cbqri: resctrl: Add cache allocation via capacity block mask Drew Fustini
2026-05-11 5:11 ` Drew Fustini
2026-05-12 4:01 ` sashiko-bot
2026-05-11 5:11 ` [PATCH RFC v4 12/18] riscv_cbqri: resctrl: Add L3 cache occupancy monitoring Drew Fustini
2026-05-11 5:11 ` Drew Fustini
2026-05-12 5:00 ` sashiko-bot
2026-05-11 5:11 ` [PATCH RFC v4 13/18] riscv_cbqri: resctrl: Add MB_MIN bandwidth allocation via Rbwb Drew Fustini
2026-05-11 5:11 ` Drew Fustini
2026-05-12 5:42 ` sashiko-bot
2026-05-11 5:11 ` [PATCH RFC v4 14/18] riscv_cbqri: resctrl: Add MB_WGHT bandwidth allocation via Mweight Drew Fustini
2026-05-11 5:11 ` Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 15/18] riscv_cbqri: resctrl: Add mbm_total_bytes bandwidth monitoring Drew Fustini
2026-05-11 5:11 ` Drew Fustini
2026-05-12 21:26 ` sashiko-bot
2026-05-11 5:11 ` [PATCH RFC v4 16/18] ACPI: RISC-V: Parse RISC-V Quality of Service Controller (RQSC) table Drew Fustini
2026-05-11 5:11 ` Drew Fustini
2026-05-12 21:46 ` sashiko-bot
2026-05-11 5:11 ` [PATCH RFC v4 17/18] ACPI: RISC-V: Add support for RISC-V Quality of Service Controller (RQSC) Drew Fustini
2026-05-11 5:11 ` Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 18/18] riscv: enable resctrl filesystem for Ssqosid Drew Fustini
2026-05-11 5:11 ` Drew Fustini
2026-05-12 22:56 ` sashiko-bot
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260512022939.683A1C2BCB0@smtp.kernel.org \
--to=sashiko-bot@kernel.org \
--cc=conor+dt@kernel.org \
--cc=devicetree@vger.kernel.org \
--cc=fustini@kernel.org \
--cc=krzk+dt@kernel.org \
--cc=robh@kernel.org \
--cc=sashiko@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.