The Linux Kernel Mailing List
 help / color / mirror / Atom feed
From: Drew Fustini <fustini@kernel.org>
To: sashiko@lists.linux.dev
Cc: krzk+dt@kernel.org, robh@kernel.org, conor+dt@kernel.org,
	devicetree@vger.kernel.org, linux-riscv@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC v4 09/18] riscv_cbqri: Add bandwidth controller probe and allocation device ops
Date: Fri, 15 May 2026 23:15:30 -0700	[thread overview]
Message-ID: <aggLggGOeuPTQV3p@thelio> (raw)
In-Reply-To: <20260512022939.683A1C2BCB0@smtp.kernel.org>

On Tue, May 12, 2026 at 02:29:38AM +0000, sashiko-bot@kernel.org wrote:
[..]
> > +static int cbqri_bc_alloc_op(struct cbqri_controller *ctrl, int operation, int rcid)
> > +{
> > +	int reg_offset = CBQRI_BC_ALLOC_CTL_OFF;
> > +	int status;
> > +	u64 reg;
> > +
> > +	lockdep_assert_held(&ctrl->lock);
> > +
> > +	if (ctrl->faulted)
> > +		return -EIO;
> > +
> > +	if (cbqri_wait_busy_flag(ctrl, reg_offset, &reg) < 0) {
> 
> If a timeout in cbqri_wait_busy_flag() sets ctrl->faulted = true, will
> runtime operations like cbqri_bc_alloc_op() ever execute again? Because
> it returns early here when ctrl->faulted is true, the polling function
> is never called again to clear the flag, which seems to permanently
> disable the controller after a single timeout.

I'll drop the ctrl->faulted early return from cbqri_bc_alloc_op(),
cbqri_cc_alloc_op() and cbqri_mon_op() to avoid locking the controller
out for the lifetime of the driver.

> > +static int cbqri_apply_bc_field(struct cbqri_controller *ctrl, u32 closid,
> > +				void (*set)(struct cbqri_controller *, u64),
> > +				u64 (*get)(struct cbqri_controller *),
> > +				u64 val, bool *committed)
> > +{
> > +	int ret;
> > +	u64 reg;
> > +
> > +	lockdep_assert_held(&ctrl->lock);
> > +
> > +	/* Load current RCID state so the unmodified field is preserved */
> > +	ret = cbqri_bc_alloc_op(ctrl, CBQRI_BC_ALLOC_CTL_OP_READ_LIMIT, closid);
> > +	if (ret < 0)
> > +		return ret;
> 
> Does this initial read need the same sentinel protection as the
> verification read? If this READ_LIMIT silently fails, the staging
> register might retain data from a previous operation (like a different
> RCID). Since set() only modifies the target field, CONFIG_LIMIT could
> commit the stale data in the unmodified field, causing cross-RCID state
> corruption.

I'll stop relying on the register to preserve the unmodified field. I
will add an mweight_cache alongside rbwb_cache.

> > +	reg = get(ctrl);
> > +	if (reg != val) {
> > +		pr_err_ratelimited("BC field verify mismatch (reg=0x%llx != val=%llu)\n",
> > +				   reg, val);
> > +		return -EIO;
> > +	}
> > +
> > +	return 0;
> > +}
> 
> If the verification fails here and returns -EIO, *committed remains
> true. When called from cbqri_apply_rbwb(), this causes the rejected
> value to be cached in ctrl->rbwb_cache[closid]. If the requested value
> was smaller than the previous one, could this cause the software cache
> to underestimate the reserved capacity and allow future allocations to
> exceed the MRBWB hardware limit?

I'll change the order so that committed is only set to true after the
verify readback succeeds.

> > +int cbqri_apply_mweight_config(struct cbqri_controller *ctrl, u32 closid,
> > +			       u64 mweight)
> > +{
> > +	int ret;
> > +
> > +	mutex_lock(&ctrl->lock);
> > +	ret = cbqri_apply_bc_field(ctrl, closid,
> > +				   cbqri_set_mweight, cbqri_get_mweight,
> > +				   mweight, NULL);
> > +	mutex_unlock(&ctrl->lock);
> > +	return ret;
> > +}
> 
> What happens if an arbitrarily large mweight is passed in? FIELD_MODIFY
> silently truncates values larger than the 8-bit MWEIGHT_MASK. The
> verification step in cbqri_apply_bc_field() will then compare the
> truncated hardware value against the original untruncated val and
> always fail, returning -EIO and logging a hardware failure instead of
> returning -EINVAL. Should this have bounds checking similar to
> cbqri_apply_rbwb()?

I will bound mweight against FIELD_MAX(MWEIGHT_MASK) at the beginning.

> 
> > @@ -437,6 +664,83 @@ static int cbqri_probe_cc(struct cbqri_controller *ctrl)
> >  	return 0;
> >  }
> >  
> > +static int cbqri_probe_bc(struct cbqri_controller *ctrl)
> > +{
> > +	bool has_mon_at_code = false;
> > +	int err, status;
> > +	u64 reg;
> 
> [ ... ]
> 
> > +	/* Probe monitoring features */
> > +	err = cbqri_probe_feature(ctrl, CBQRI_BC_MON_CTL_OFF,
> > +				  CBQRI_BC_MON_CTL_OP_READ_COUNTER, &status,
> > +				  &has_mon_at_code);
> 
> When probing features, cbqri_probe_feature() modifies the OP mask and
> clears RCID_MASK, but it doesn't appear to clear EVT_ID_MASK:
> 
> drivers/resctrl/cbqri_devices.c:cbqri_probe_feature() {
>     ...
>     FIELD_MODIFY(CBQRI_CONTROL_REGISTERS_OP_MASK, &reg, operation);
>     reg &= ~CBQRI_CONTROL_REGISTERS_RCID_MASK;
>     iowrite64(reg, ctrl->base + reg_offset);
>     ...
> }
> 
> If the hardware or firmware left an unsupported Event ID in those bits,
> could the hardware reject the READ_COUNTER operation, causing the probe
> to see STATUS != SUCCESS and incorrectly assume monitoring is not
> supported?

I will clear OP, AT, RCID and EVT_ID on every write in the probe.

-Drew

  parent reply	other threads:[~2026-05-16  6:15 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-11  5:10 [PATCH RFC v4 00/18] riscv: add Ssqosid and CBQRI resctrl support Drew Fustini
2026-05-11  5:10 ` [PATCH RFC v4 01/18] dt-bindings: riscv: Add Ssqosid extension description Drew Fustini
2026-05-11  5:10 ` [PATCH RFC v4 02/18] riscv: detect the Ssqosid extension Drew Fustini
2026-05-11  5:10 ` [PATCH RFC v4 03/18] riscv: add support for srmcfg CSR from " Drew Fustini
     [not found]   ` <20260511235300.781C6C2BCB0@smtp.kernel.org>
2026-05-14 22:24     ` Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 04/18] fs/resctrl: Add resctrl_is_membw() helper Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 05/18] fs/resctrl: Add RDT_RESOURCE_MB_MIN and RDT_RESOURCE_MB_WGHT Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 06/18] fs/resctrl: Let bandwidth resources default to min_bw at reset Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 07/18] riscv_cbqri: Add capacity controller probe and allocation device ops Drew Fustini
     [not found]   ` <20260512012635.84FCCC2BCB0@smtp.kernel.org>
2026-05-15 13:46     ` Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 08/18] riscv_cbqri: Add capacity controller monitoring " Drew Fustini
     [not found]   ` <20260512015859.B33A7C2BCB0@smtp.kernel.org>
2026-05-16  3:05     ` Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 09/18] riscv_cbqri: Add bandwidth controller probe and allocation " Drew Fustini
     [not found]   ` <20260512022939.683A1C2BCB0@smtp.kernel.org>
2026-05-16  6:15     ` Drew Fustini [this message]
2026-05-11  5:11 ` [PATCH RFC v4 10/18] riscv_cbqri: Add bandwidth controller monitoring " Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 11/18] riscv_cbqri: resctrl: Add cache allocation via capacity block mask Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 12/18] riscv_cbqri: resctrl: Add L3 cache occupancy monitoring Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 13/18] riscv_cbqri: resctrl: Add MB_MIN bandwidth allocation via Rbwb Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 14/18] riscv_cbqri: resctrl: Add MB_WGHT bandwidth allocation via Mweight Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 15/18] riscv_cbqri: resctrl: Add mbm_total_bytes bandwidth monitoring Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 16/18] ACPI: RISC-V: Parse RISC-V Quality of Service Controller (RQSC) table Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 17/18] ACPI: RISC-V: Add support for RISC-V Quality of Service Controller (RQSC) Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 18/18] riscv: enable resctrl filesystem for Ssqosid Drew Fustini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aggLggGOeuPTQV3p@thelio \
    --to=fustini@kernel.org \
    --cc=conor+dt@kernel.org \
    --cc=devicetree@vger.kernel.org \
    --cc=krzk+dt@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=robh@kernel.org \
    --cc=sashiko@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox