From: Drew Fustini <fustini@kernel.org>
To: sashiko@lists.linux.dev
Cc: krzk+dt@kernel.org, robh@kernel.org, conor+dt@kernel.org,
devicetree@vger.kernel.org, linux-riscv@lists.infradead.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC v4 09/18] riscv_cbqri: Add bandwidth controller probe and allocation device ops
Date: Fri, 15 May 2026 23:15:30 -0700 [thread overview]
Message-ID: <aggLggGOeuPTQV3p@thelio> (raw)
In-Reply-To: <20260512022939.683A1C2BCB0@smtp.kernel.org>
On Tue, May 12, 2026 at 02:29:38AM +0000, sashiko-bot@kernel.org wrote:
[..]
> > +static int cbqri_bc_alloc_op(struct cbqri_controller *ctrl, int operation, int rcid)
> > +{
> > + int reg_offset = CBQRI_BC_ALLOC_CTL_OFF;
> > + int status;
> > + u64 reg;
> > +
> > + lockdep_assert_held(&ctrl->lock);
> > +
> > + if (ctrl->faulted)
> > + return -EIO;
> > +
> > + if (cbqri_wait_busy_flag(ctrl, reg_offset, ®) < 0) {
>
> If a timeout in cbqri_wait_busy_flag() sets ctrl->faulted = true, will
> runtime operations like cbqri_bc_alloc_op() ever execute again? Because
> it returns early here when ctrl->faulted is true, the polling function
> is never called again to clear the flag, which seems to permanently
> disable the controller after a single timeout.
I'll drop the ctrl->faulted early return from cbqri_bc_alloc_op(),
cbqri_cc_alloc_op() and cbqri_mon_op() to avoid locking the controller
out for the lifetime of the driver.
> > +static int cbqri_apply_bc_field(struct cbqri_controller *ctrl, u32 closid,
> > + void (*set)(struct cbqri_controller *, u64),
> > + u64 (*get)(struct cbqri_controller *),
> > + u64 val, bool *committed)
> > +{
> > + int ret;
> > + u64 reg;
> > +
> > + lockdep_assert_held(&ctrl->lock);
> > +
> > + /* Load current RCID state so the unmodified field is preserved */
> > + ret = cbqri_bc_alloc_op(ctrl, CBQRI_BC_ALLOC_CTL_OP_READ_LIMIT, closid);
> > + if (ret < 0)
> > + return ret;
>
> Does this initial read need the same sentinel protection as the
> verification read? If this READ_LIMIT silently fails, the staging
> register might retain data from a previous operation (like a different
> RCID). Since set() only modifies the target field, CONFIG_LIMIT could
> commit the stale data in the unmodified field, causing cross-RCID state
> corruption.
I'll stop relying on the register to preserve the unmodified field. I
will add an mweight_cache alongside rbwb_cache.
> > + reg = get(ctrl);
> > + if (reg != val) {
> > + pr_err_ratelimited("BC field verify mismatch (reg=0x%llx != val=%llu)\n",
> > + reg, val);
> > + return -EIO;
> > + }
> > +
> > + return 0;
> > +}
>
> If the verification fails here and returns -EIO, *committed remains
> true. When called from cbqri_apply_rbwb(), this causes the rejected
> value to be cached in ctrl->rbwb_cache[closid]. If the requested value
> was smaller than the previous one, could this cause the software cache
> to underestimate the reserved capacity and allow future allocations to
> exceed the MRBWB hardware limit?
I'll change the order so that committed is only set to true after the
verify readback succeeds.
> > +int cbqri_apply_mweight_config(struct cbqri_controller *ctrl, u32 closid,
> > + u64 mweight)
> > +{
> > + int ret;
> > +
> > + mutex_lock(&ctrl->lock);
> > + ret = cbqri_apply_bc_field(ctrl, closid,
> > + cbqri_set_mweight, cbqri_get_mweight,
> > + mweight, NULL);
> > + mutex_unlock(&ctrl->lock);
> > + return ret;
> > +}
>
> What happens if an arbitrarily large mweight is passed in? FIELD_MODIFY
> silently truncates values larger than the 8-bit MWEIGHT_MASK. The
> verification step in cbqri_apply_bc_field() will then compare the
> truncated hardware value against the original untruncated val and
> always fail, returning -EIO and logging a hardware failure instead of
> returning -EINVAL. Should this have bounds checking similar to
> cbqri_apply_rbwb()?
I will bound mweight against FIELD_MAX(MWEIGHT_MASK) at the beginning.
>
> > @@ -437,6 +664,83 @@ static int cbqri_probe_cc(struct cbqri_controller *ctrl)
> > return 0;
> > }
> >
> > +static int cbqri_probe_bc(struct cbqri_controller *ctrl)
> > +{
> > + bool has_mon_at_code = false;
> > + int err, status;
> > + u64 reg;
>
> [ ... ]
>
> > + /* Probe monitoring features */
> > + err = cbqri_probe_feature(ctrl, CBQRI_BC_MON_CTL_OFF,
> > + CBQRI_BC_MON_CTL_OP_READ_COUNTER, &status,
> > + &has_mon_at_code);
>
> When probing features, cbqri_probe_feature() modifies the OP mask and
> clears RCID_MASK, but it doesn't appear to clear EVT_ID_MASK:
>
> drivers/resctrl/cbqri_devices.c:cbqri_probe_feature() {
> ...
> FIELD_MODIFY(CBQRI_CONTROL_REGISTERS_OP_MASK, ®, operation);
> reg &= ~CBQRI_CONTROL_REGISTERS_RCID_MASK;
> iowrite64(reg, ctrl->base + reg_offset);
> ...
> }
>
> If the hardware or firmware left an unsupported Event ID in those bits,
> could the hardware reject the READ_COUNTER operation, causing the probe
> to see STATUS != SUCCESS and incorrectly assume monitoring is not
> supported?
I will clear OP, AT, RCID and EVT_ID on every write in the probe.
-Drew
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
next prev parent reply other threads:[~2026-05-16 6:15 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-11 5:10 [PATCH RFC v4 00/18] riscv: add Ssqosid and CBQRI resctrl support Drew Fustini
2026-05-11 5:10 ` [PATCH RFC v4 01/18] dt-bindings: riscv: Add Ssqosid extension description Drew Fustini
2026-05-11 5:10 ` [PATCH RFC v4 02/18] riscv: detect the Ssqosid extension Drew Fustini
2026-05-11 5:10 ` [PATCH RFC v4 03/18] riscv: add support for srmcfg CSR from " Drew Fustini
[not found] ` <20260511235300.781C6C2BCB0@smtp.kernel.org>
2026-05-14 22:24 ` Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 04/18] fs/resctrl: Add resctrl_is_membw() helper Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 05/18] fs/resctrl: Add RDT_RESOURCE_MB_MIN and RDT_RESOURCE_MB_WGHT Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 06/18] fs/resctrl: Let bandwidth resources default to min_bw at reset Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 07/18] riscv_cbqri: Add capacity controller probe and allocation device ops Drew Fustini
[not found] ` <20260512012635.84FCCC2BCB0@smtp.kernel.org>
2026-05-15 13:46 ` Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 08/18] riscv_cbqri: Add capacity controller monitoring " Drew Fustini
[not found] ` <20260512015859.B33A7C2BCB0@smtp.kernel.org>
2026-05-16 3:05 ` Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 09/18] riscv_cbqri: Add bandwidth controller probe and allocation " Drew Fustini
[not found] ` <20260512022939.683A1C2BCB0@smtp.kernel.org>
2026-05-16 6:15 ` Drew Fustini [this message]
2026-05-11 5:11 ` [PATCH RFC v4 10/18] riscv_cbqri: Add bandwidth controller monitoring " Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 11/18] riscv_cbqri: resctrl: Add cache allocation via capacity block mask Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 12/18] riscv_cbqri: resctrl: Add L3 cache occupancy monitoring Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 13/18] riscv_cbqri: resctrl: Add MB_MIN bandwidth allocation via Rbwb Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 14/18] riscv_cbqri: resctrl: Add MB_WGHT bandwidth allocation via Mweight Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 15/18] riscv_cbqri: resctrl: Add mbm_total_bytes bandwidth monitoring Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 16/18] ACPI: RISC-V: Parse RISC-V Quality of Service Controller (RQSC) table Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 17/18] ACPI: RISC-V: Add support for RISC-V Quality of Service Controller (RQSC) Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 18/18] riscv: enable resctrl filesystem for Ssqosid Drew Fustini
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aggLggGOeuPTQV3p@thelio \
--to=fustini@kernel.org \
--cc=conor+dt@kernel.org \
--cc=devicetree@vger.kernel.org \
--cc=krzk+dt@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-riscv@lists.infradead.org \
--cc=robh@kernel.org \
--cc=sashiko@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox