From: Drew Fustini <fustini@kernel.org>
To: sashiko@lists.linux.dev
Cc: conor+dt@kernel.org, krzk+dt@kernel.org, robh@kernel.org,
devicetree@vger.kernel.org, linux-riscv@lists.infradead.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC v4 13/18] riscv_cbqri: resctrl: Add MB_MIN bandwidth allocation via Rbwb
Date: Sat, 16 May 2026 17:33:10 -0700
Message-ID: <agkMxmmZOwLdSlWS@thelio>
In-Reply-To: <20260512054240.432A6C2BCB0@smtp.kernel.org>

On Tue, May 12, 2026 at 05:42:39AM +0000, sashiko-bot@kernel.org wrote:
> > diff --git a/drivers/resctrl/cbqri_resctrl.c b/drivers/resctrl/cbqri_resctrl.c
> > index d8fd9b06703fa..bcd9367e3555c 100644
> > --- a/drivers/resctrl/cbqri_resctrl.c
> > +++ b/drivers/resctrl/cbqri_resctrl.c
> > @@ -415,6 +415,9 @@ int resctrl_arch_update_one(struct rdt_resource *r, struct rdt_ctrl_domain *d,
> >  	case RDT_RESOURCE_L2:
> >  	case RDT_RESOURCE_L3:
> >  		return cbqri_apply_cache_config_dom(dom, r, closid, t, cfg_val);
> > +	case RDT_RESOURCE_MB_MIN:
> > +		/* sum(Rbwb) <= MRBWB validation runs inside cbqri_apply_rbwb(). */
> > +		return cbqri_apply_rbwb(dom->hw_ctrl, closid, cfg_val, true);
>
> When a resctrl group is removed, does the framework invoke an architecture
> hook to reset the hardware control values to a minimum default?
>
> If the CLOSID is freed but the previously configured bandwidth value remains
> pinned in the software cache (ctrl->rbwb_cache[closid]), will this abandoned
> value falsely inflate the sum during subsequent cbqri_apply_rbwb() calls?
>
> Could this cause new allocation attempts to fail with -EINVAL, effectively
> creating a permanent denial of service for bandwidth allocations until the
> leaked CLOSID is coincidentally re-allocated?

resctrl has no arch-side hook for group removal, so the cached Rbwb for
the freed closid stays counted in the sum and can block new allocations.

The fix is probably a new resctrl_arch_* callback that resctrl invokes
on group destroy, with a CBQRI implementation that resets
rbwb_cache[closid] to the minimum. However, that would touch fs/resctrl
and all arch backends, so I will address it in a future series.

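For illustration, here is a minimal userspace model of the problem and of
what such a callback would do. It is only a sketch: struct bc_model,
model_apply_rbwb() and model_free_rcid() are hypothetical names, NR_RCID
is an arbitrary size, and the -EINVAL return follows the failure mode
described in the review rather than the actual driver code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define NR_RCID 4	/* hypothetical RCID count for this model */

/* Hypothetical stand-in for the controller's software cache. */
struct bc_model {
	uint64_t mrbwb;			/* maximum reservable bandwidth blocks */
	uint64_t rbwb_cache[NR_RCID];	/* last Rbwb accepted per RCID */
};

/* Sum the cache as if @rcid already held @rbwb. */
static uint64_t model_sum_with(const struct bc_model *bc, unsigned int rcid,
			       uint64_t rbwb)
{
	uint64_t sum = 0;

	for (unsigned int i = 0; i < NR_RCID; i++)
		sum += (i == rcid) ? rbwb : bc->rbwb_cache[i];
	return sum;
}

/* Checked apply: reject Rbwb == 0 and any value pushing the sum past MRBWB. */
static int model_apply_rbwb(struct bc_model *bc, unsigned int rcid,
			    uint64_t rbwb)
{
	assert(rcid < NR_RCID);
	if (rbwb < 1 || model_sum_with(bc, rcid, rbwb) > bc->mrbwb)
		return -EINVAL;
	bc->rbwb_cache[rcid] = rbwb;
	return 0;
}

/* Hypothetical group-destroy hook: return the freed RCID to the minimum. */
static void model_free_rcid(struct bc_model *bc, unsigned int rcid)
{
	assert(rcid < NR_RCID);
	bc->rbwb_cache[rcid] = 1;
}
```

Without the model_free_rcid() call, a large value left behind by a
removed group keeps counting against MRBWB and later requests for other
RCIDs are rejected even though the budget is nominally free.
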
> > @@ -491,15 +503,42 @@ void resctrl_arch_reset_all_ctrls(struct rdt_resource *r)
> > return;
> >
> >  	list_for_each_entry(d, &r->ctrl_domains, hdr.list) {
> > -		for (i = 0; i < hw_res->ctrl->rcid_count; i++) {
> > -			for (t = 0; t < CDP_NUM_TYPES; t++) {
> > +		dom = container_of(d, struct cbqri_resctrl_dom,
> > +				   resctrl_ctrl_dom);
> > +
> > +		switch (r->rid) {
> > +		case RDT_RESOURCE_MB_MIN:
> > +			/*
> > +			 * CBQRI section 4.5: Rbwb >= 1, sum(Rbwb) <= MRBWB.
> > +			 * Walk N-1..1 first so the final sum lands at
> > +			 * MRBWB. Use the unchecked helper since the
> > +			 * intermediate sum may exceed MRBWB.
> > +			 */
>
> This isn't a bug, but is this comment accurate?
>
> Since the loop strictly reduces the allocations for RCIDs N-1 down to 1
> before finally increasing RCID 0 to the remaining budget, doesn't the
> running hardware sum strictly decrease until the final step?
>
> Does the intermediate sum ever actually transiently exceed MRBWB?

I will reword the comment. The hardware sum only decreases through the
walk and never exceeds MRBWB. It is the cache-based sum that transiently
exceeds MRBWB.

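The hardware-side claim can be checked with a small userspace model of
the walk order. This is a sketch under stated assumptions: rbwb_sum() and
reset_walk_peak() are hypothetical helpers, the array stands in for the
per-RCID hardware registers, and the cache-based sum is not modelled
here.

```c
#include <assert.h>
#include <stdint.h>

/* Sum of the first @n entries. */
static uint64_t rbwb_sum(const uint64_t *rbwb, unsigned int n)
{
	uint64_t sum = 0;

	for (unsigned int i = 0; i < n; i++)
		sum += rbwb[i];
	return sum;
}

/*
 * Replay the reset walk in the patch's order: RCIDs 1..n-1 drop to 1,
 * then RCID 0 takes the remaining budget. Returns the largest running
 * sum seen after any single write.
 *
 * Preconditions (assumed, as in the thread): every entry >= 1, the
 * starting sum <= mrbwb, and mrbwb >= n.
 */
static uint64_t reset_walk_peak(uint64_t *rbwb, unsigned int n,
				uint64_t mrbwb)
{
	uint64_t peak = 0;

	assert(n >= 1 && mrbwb >= n);
	for (unsigned int i = 0; i < n; i++) {
		unsigned int rcid = (i + 1) % n;
		uint64_t sum;

		rbwb[rcid] = (rcid == 0) ? mrbwb - (n - 1) : 1;
		sum = rbwb_sum(rbwb, n);
		if (sum > peak)
			peak = sum;
	}
	return peak;
}
```

Starting from any allocation that already satisfies the constraints, the
peak returned is MRBWB itself, reached only on the final write to RCID 0.
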
> > +			for (i = 0; i < hw_res->ctrl->rcid_count; i++) {
> > +				u32 rcid = (i + 1) % hw_res->ctrl->rcid_count;
> > +				u64 rbwb = (rcid == 0) ?
> > +					dom->hw_ctrl->bc.mrbwb - (hw_res->ctrl->rcid_count - 1) : 1;
>
> If the hardware capability reports mrbwb == rcid_count - 1, will this
> calculation evaluate exactly to 0 for RCID 0?
>
> If so, does cbqri_apply_rbwb() bypass the internal sum validation
> (which checks if rbwb > 0) and blindly write 0 to the hardware, violating
> the CBQRI section 4.5 requirement that Rbwb >= 1?

cbqri_probe_bc() already rejects mrbwb < rcid_count and refuses to
register the controller in that case. So by the time this reset runs,
mrbwb >= rcid_count is guaranteed and the subtraction yields at least 1.

I'll add a WARN_ON(mrbwb < rcid_count) at the top of the reset walk, or
factor the budget calculation into a helper that the probe and reset
paths share.

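A shared helper could look roughly like the following. This is a
userspace sketch only: rcid0_reset_budget() is a hypothetical name and
signature, and the -EINVAL return is an assumption about how the probe
path would report the rejection.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical shared helper: compute RCID 0's share of MRBWB at reset,
 * with every other RCID held at the minimum of 1.
 *
 * Returns 0 and stores the share in *out, or -EINVAL when MRBWB is too
 * small to give every RCID at least 1.
 */
static int rcid0_reset_budget(uint64_t mrbwb, uint32_t rcid_count,
			      uint64_t *out)
{
	assert(out);
	if (rcid_count == 0 || mrbwb < rcid_count)
		return -EINVAL;
	*out = mrbwb - (rcid_count - 1);
	return 0;
}
```

With one function enforcing the bound, the mrbwb == rcid_count - 1 case
raised above is rejected in the same place that computes the value, so
the reset path cannot produce 0.
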
> > +static int cbqri_attach_cpu_to_one_bw_res(struct cbqri_controller *ctrl,
> > +					  enum resctrl_res_level rid,
> > +					  unsigned int cpu)
> > +{
> > +	struct cbqri_resctrl_res *hw_res = &cbqri_resctrl_resources[rid];
> > +	struct rdt_resource *res = &hw_res->resctrl_res;
> > +	struct rdt_ctrl_domain *domain;
> > +	int dom_id = ctrl->mem.prox_dom;
> > +
> > +	if (!hw_res->ctrl)
> > +		return 0;
> > +
> > +	domain = cbqri_find_ctrl_domain(&res->ctrl_domains, dom_id);
> > +	if (domain) {
>
> If a system has multiple memory bandwidth controllers within the same NUMA
> node, will the second controller match the existing domain via
> cbqri_find_ctrl_domain() here?
>
> Since struct cbqri_resctrl_dom only stores a single hw_ctrl pointer,
> does this mean the secondary controller is never tracked by the domain and
> its hardware limits will be silently ignored during subsequent configurations?

Multiple memory bandwidth controllers are not supported. I will update
the driver to detect the collision and reject the second BC at
registration time.

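The check amounts to refusing a second controller whose proximity domain
is already taken. A minimal userspace sketch, with hypothetical types
and names throughout (struct bw_registry_model, model_register_bc(),
MAX_DOMAINS); the -EEXIST error code is my assumption, not something the
series has settled on:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_DOMAINS 8	/* hypothetical bound for this model */

struct bw_domain_model {
	int prox_dom;		/* proximity domain used as the domain id */
	const void *hw_ctrl;	/* the single controller backing the domain */
};

struct bw_registry_model {
	struct bw_domain_model dom[MAX_DOMAINS];
	size_t count;
};

/* Register a bandwidth controller, rejecting a duplicate proximity domain. */
static int model_register_bc(struct bw_registry_model *reg, int prox_dom,
			     const void *hw_ctrl)
{
	assert(reg && hw_ctrl);
	for (size_t i = 0; i < reg->count; i++)
		if (reg->dom[i].prox_dom == prox_dom)
			return -EEXIST;	/* second BC in the same domain */
	if (reg->count == MAX_DOMAINS)
		return -ENOSPC;
	reg->dom[reg->count].prox_dom = prox_dom;
	reg->dom[reg->count].hw_ctrl = hw_ctrl;
	reg->count++;
	return 0;
}
```

Failing at registration makes the unsupported topology visible up front
instead of leaving the second controller silently unconfigured.
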
-Drew