From: Drew Fustini <fustini@kernel.org>
To: sashiko-reviews@lists.linux.dev
Cc: linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC v6 13/18] riscv_cbqri: resctrl: Add MB_MIN bandwidth allocation via Rbwb
Date: Mon, 1 Jun 2026 22:00:49 -0700 [thread overview]
Message-ID: <ah5jgXEHzQUEREV0@gen8> (raw)
In-Reply-To: <20260601205717.7FDA51F00893@smtp.kernel.org>
On Mon, Jun 01, 2026 at 08:57:17PM +0000, sashiko-bot@kernel.org wrote:
> > diff --git a/drivers/resctrl/cbqri_resctrl.c b/drivers/resctrl/cbqri_resctrl.c
> > index f379058b0114b..1d312004b07d9 100644
> > --- a/drivers/resctrl/cbqri_resctrl.c
> > +++ b/drivers/resctrl/cbqri_resctrl.c
> > @@ -363,6 +363,9 @@ int resctrl_arch_update_one(struct rdt_resource *r, struct rdt_ctrl_domain *d,
> > case RDT_RESOURCE_L2:
> > case RDT_RESOURCE_L3:
> > return cbqri_apply_cache_config_dom(dom, r, closid, t, cfg_val);
> > + case RDT_RESOURCE_MB_MIN:
> > + /* sum(Rbwb) <= MRBWB validation runs inside cbqri_apply_rbwb(). */
> > + return cbqri_apply_rbwb(dom->hw_ctrl, closid, cfg_val, true);
>
> [Severity: High]
> Does this code leak memory bandwidth capacity when a resctrl group is removed?
>
> When a resctrl group is removed via rmdir, the CLOSID is freed by the core,
> but no architecture hook is invoked to reset the hardware or software cache.
> This means ctrl->rbwb_cache[closid] permanently retains the deleted group's
> bandwidth allocation.
>
> Later, when cbqri_apply_rbwb() is called to modify bandwidth for any active
> group, it sums across all ctrl->rcid_count CLOSIDs, including the unallocated
> ones. The stranded bandwidth from deleted groups is included in the sum,
> causing new allocations to be rejected with -EINVAL.
>
> Can an attacker with mkdir permissions repeatedly create groups, assign high
> bandwidth, and remove them, to permanently starve the default group and
> active processes?
Only root can create directories in /sys/fs/resctrl so I do not think
this is a practical concern.
> > @@ -422,9 +433,22 @@ u32 resctrl_arch_get_config(struct rdt_resource *r, struct rdt_ctrl_domain *d,
> > return val;
> > }
> >
> > +/*
> > + * RCID 0 carries the remaining MRBWB after every other RCID is seeded with
> > + * the minimum Rbwb of 1. cbqri_probe_bc() rejects a bandwidth controller
> > + * with mrbwb < rcid_count, so this subtraction cannot underflow.
> > + */
> > +static u64 cbqri_rcid0_rbwb(struct cbqri_controller *ctrl)
> > +{
> > + if (WARN_ON_ONCE(ctrl->bc.mrbwb < ctrl->rcid_count))
> > + return 1;
> > + return ctrl->bc.mrbwb - (ctrl->rcid_count - 1);
> > +}
>
> [Severity: High]
> Does this calculation cause unconditional initialization failure on systems
> where MRBWB exceeds U16_MAX?
>
> The maximum reserved bandwidth blocks (mrbwb) is a 32-bit field in hardware
> capabilities, but cbqri_apply_rbwb() enforces a strict hardware limitation
> of 16 bits per RCID (rejecting values > U16_MAX).
>
> If the global mrbwb exceeds 65535, this function returns a value exceeding
> U16_MAX, causing cbqri_apply_rbwb() to reject it with -EINVAL. This in turn
> causes domain initialization to fail, preventing CPUs from coming online
> with resctrl.
This is not a real issue. MRBWB can not exceeds U16_MAX.
> > @@ -439,15 +463,41 @@ void resctrl_arch_reset_all_ctrls(struct rdt_resource *r)
> > return;
> >
> > list_for_each_entry(d, &r->ctrl_domains, hdr.list) {
> > - for (i = 0; i < hw_res->ctrl->rcid_count; i++) {
> > - for (t = 0; t < CDP_NUM_TYPES; t++) {
> > + dom = container_of(d, struct cbqri_resctrl_dom,
> > + resctrl_ctrl_dom);
> > +
> > + switch (r->rid) {
> > + case RDT_RESOURCE_MB_MIN:
> > + /*
> > + * CBQRI section 4.5: Rbwb >= 1, sum(Rbwb) <= MRBWB.
> > + * Walk N-1..1 first so RCID 0 lands last with the
> > + * remaining budget.
> > + */
> > + for (i = 0; i < hw_res->ctrl->rcid_count; i++) {
> > + u32 rcid = (i + 1) % hw_res->ctrl->rcid_count;
> > + u64 rbwb = (rcid == 0) ?
> > + cbqri_rcid0_rbwb(dom->hw_ctrl) : 1;
> > int rerr;
> >
> > - rerr = resctrl_arch_update_one(r, d, i, t, default_ctrl);
> > + rerr = cbqri_apply_rbwb(dom->hw_ctrl, rcid, rbwb, false);
> > if (rerr)
> > - pr_err_ratelimited("rid=%d reset RCID %u type %u failed (%d)\n",
> > - r->rid, i, t, rerr);
> > + pr_err_ratelimited("RBWB reset RCID %u failed (%d)\n",
> > + rcid, rerr);
>
> [Severity: High]
> Does this error handling violate the hardware bandwidth sum invariant?
>
> If cbqri_apply_rbwb() fails for any RCID (such as an MMIO error), it logs
> the error but continues the loop, leaving that RCID at its previous
> (potentially large) bandwidth.
>
> When it later blindly writes MRBWB - (N-1) to RCID 0 with check_sum = false,
> the total sum of all RCIDs in the hardware will exceed the hardware's MRBWB
> capacity. Could this violate the CBQRI invariant and potentially lock up the
> controller?
This is not a problem. The reset uses check_sum = false because the
intermediate sums during the walk legitimately exceed MRBWB, but the
hardware enforces sum(Rbwb) <= MRBWB itself.
> > + case RDT_RESOURCE_MB_MIN:
> > + res->name = "MB_MIN";
> > + res->schema_fmt = RESCTRL_SCHEMA_RANGE;
> > + /*
> > + * resctrl requires a cache scope for MBA-style domains.
> > + * Use L3 as a proxy until the resctrl supports non-cache
> > + * scopes for bandwidth resources.
> > + */
> > + res->ctrl_scope = RESCTRL_L3_CACHE;
> > + /* Rbwb is an integer block count, not a percentage. No MBA delay_linear. */
> > + res->membw.throttle_mode = THREAD_THROTTLE_UNDEFINED;
> > + res->membw.min_bw = 1;
> > + res->membw.max_bw = ctrl->bc.mrbwb;
>
> Does setting max_bw to ctrl->bc.mrbwb expose an unclamped 32-bit maximum
> bandwidth to userspace?
>
> If ctrl->bc.mrbwb exceeds U16_MAX, exposing this value to userspace allows
> requests that bypass core validation but will ultimately fail at the hardware
> layer in cbqri_apply_rbwb().
I will address this issue in the next revision.
> > + res->membw.bw_gran = 1;
> > + /*
> > + * CBQRI section 4.5 caps sum(Rbwb) <= MRBWB. Default new
> > + * groups to min_bw so mkdir cannot overflow that sum.
> > + */
> > + res->membw.default_to_min = true;
> > + res->alloc_capable = ctrl->alloc_capable;
> > + INIT_LIST_HEAD(&res->ctrl_domains);
> > + INIT_LIST_HEAD(&res->mon_domains);
>
> [Severity: High]
> Does enabling MB_MIN cause massive memory allocation failures due to an
> unbounded max_rmid?
>
> In cbqri_resctrl_accumulate_caps(), exposed_mon_capable is set to true if
> any hardware controller has mon_capable == true, which now includes the
> bandwidth controller backing MB_MIN.
>
> However, max_rmid is conditionally bounded only by the L3 capacity
> controller. If the system has a monitor-capable bandwidth controller but no
> monitor-capable L3 capacity controller, max_rmid remains U32_MAX.
>
> Later, resctrl_init() invokes rmid_init(), which attempts to allocate
> structures for U32_MAX RMIDs. This will fail with -ENOMEM and completely
> abort resctrl initialization.
I'll address this in the next rev.
>
> > @@ -893,14 +1054,24 @@ static int cbqri_attach_cpu_to_all_ctrls(unsigned int cpu)
> > lockdep_assert_held(&cbqri_domain_list_lock);
> >
> > list_for_each_entry(ctrl, &cbqri_controllers, list) {
> > - if (ctrl->type != CBQRI_CONTROLLER_TYPE_CAPACITY)
> > - continue;
> > - if (!cpumask_test_cpu(cpu, &ctrl->cache.cpu_mask))
> > - continue;
> > - if (!ctrl->alloc_capable)
> > + switch (ctrl->type) {
> > + case CBQRI_CONTROLLER_TYPE_CAPACITY:
> > + if (!cpumask_test_cpu(cpu, &ctrl->cache.cpu_mask))
> > + continue;
> > + if (!ctrl->alloc_capable)
> > + continue;
> > + err = cbqri_attach_cpu_to_cap_ctrl(ctrl, cpu);
> > + break;
> > + case CBQRI_CONTROLLER_TYPE_BANDWIDTH:
> > + if (!cpumask_test_cpu(cpu, &ctrl->mem.cpu_mask))
> > + continue;
> > + if (!ctrl->alloc_capable)
> > + continue;
> > + err = cbqri_attach_cpu_to_bw_ctrl(ctrl, cpu);
> > + break;
> > + default:
> > continue;
> > -
> > - err = cbqri_attach_cpu_to_cap_ctrl(ctrl, cpu);
> > + }
> > if (err) {
> > cbqri_detach_cpu_from_all_ctrls(cpu);
>
> [Severity: High]
> This is a pre-existing issue, but does cbqri_detach_cpu_from_l3_mon() have a
> use-after-free bug due to an unsafe cancel_delayed_work()?
>
> If cbqri_detach_cpu_from_all_ctrls() eventually calls into
> cbqri_detach_cpu_from_l3_mon():
>
> drivers/resctrl/cbqri_resctrl.c:cbqri_detach_cpu_from_l3_mon() {
> ...
> cancel_delayed_work(&mon_dom->cqm_limbo);
> ...
> kfree(mon_dom);
> ...
> }
>
> The CPU offline callback runs under cpus_write_lock(). If the worker has
> already dequeued from the workqueue but is preempted right before acquiring
> cpus_read_lock(), cancel_delayed_work() returns false and does not wait.
>
> The callback then proceeds to kfree(mon_dom). When the hotplug callback
> finishes and releases the write lock, the worker acquires the read lock and
> dereferences the freed mon_dom pointer, resulting in a use-after-free.
I don't believe this a problem.
Drew
next prev parent reply other threads:[~2026-06-02 5:00 UTC|newest]
Thread overview: 58+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-01 20:35 [PATCH RFC v6 00/18] riscv: add Ssqosid and CBQRI resctrl support Drew Fustini
2026-06-01 20:35 ` Drew Fustini
2026-06-01 20:35 ` [PATCH RFC v6 01/18] dt-bindings: riscv: Add Ssqosid extension description Drew Fustini
2026-06-01 20:35 ` Drew Fustini
2026-06-01 20:35 ` [PATCH RFC v6 02/18] riscv: detect the Ssqosid extension Drew Fustini
2026-06-01 20:35 ` Drew Fustini
2026-06-01 20:35 ` [PATCH RFC v6 03/18] riscv: add support for srmcfg CSR from " Drew Fustini
2026-06-01 20:35 ` Drew Fustini
2026-06-01 20:49 ` sashiko-bot
2026-06-02 0:58 ` Drew Fustini
2026-06-01 20:35 ` [PATCH RFC v6 04/18] fs/resctrl: Add resctrl_is_membw() helper Drew Fustini
2026-06-01 20:35 ` Drew Fustini
2026-06-01 20:35 ` [PATCH RFC v6 05/18] fs/resctrl: Add RDT_RESOURCE_MB_MIN and RDT_RESOURCE_MB_WGHT Drew Fustini
2026-06-01 20:35 ` Drew Fustini
2026-06-01 20:36 ` [PATCH RFC v6 06/18] fs/resctrl: Let bandwidth resources default to min_bw at reset Drew Fustini
2026-06-01 20:36 ` Drew Fustini
2026-06-01 20:55 ` sashiko-bot
2026-06-02 1:11 ` Drew Fustini
2026-06-01 20:36 ` [PATCH RFC v6 07/18] riscv_cbqri: Add capacity controller probe and allocation device ops Drew Fustini
2026-06-01 20:36 ` Drew Fustini
2026-06-01 20:48 ` sashiko-bot
2026-06-02 2:17 ` Drew Fustini
2026-06-01 20:36 ` [PATCH RFC v6 08/18] riscv_cbqri: Add capacity controller monitoring " Drew Fustini
2026-06-01 20:36 ` Drew Fustini
2026-06-01 20:51 ` sashiko-bot
2026-06-02 2:20 ` Drew Fustini
2026-06-01 20:36 ` [PATCH RFC v6 09/18] riscv_cbqri: Add bandwidth controller probe and allocation " Drew Fustini
2026-06-01 20:36 ` Drew Fustini
2026-06-01 20:49 ` sashiko-bot
2026-06-02 4:27 ` Drew Fustini
2026-06-01 20:36 ` [PATCH RFC v6 10/18] riscv_cbqri: Add bandwidth controller monitoring " Drew Fustini
2026-06-01 20:36 ` Drew Fustini
2026-06-01 20:36 ` [PATCH RFC v6 11/18] riscv_cbqri: resctrl: Add cache allocation via capacity block mask Drew Fustini
2026-06-01 20:36 ` Drew Fustini
2026-06-01 20:56 ` sashiko-bot
2026-06-02 4:31 ` Drew Fustini
2026-06-01 20:36 ` [PATCH RFC v6 12/18] riscv_cbqri: resctrl: Add L3 cache occupancy monitoring Drew Fustini
2026-06-01 20:36 ` Drew Fustini
2026-06-01 20:58 ` sashiko-bot
2026-06-02 4:47 ` Drew Fustini
2026-06-01 20:36 ` [PATCH RFC v6 13/18] riscv_cbqri: resctrl: Add MB_MIN bandwidth allocation via Rbwb Drew Fustini
2026-06-01 20:36 ` Drew Fustini
2026-06-01 20:57 ` sashiko-bot
2026-06-02 5:00 ` Drew Fustini [this message]
2026-06-01 20:36 ` [PATCH RFC v6 14/18] riscv_cbqri: resctrl: Add MB_WGHT bandwidth allocation via Mweight Drew Fustini
2026-06-01 20:36 ` Drew Fustini
2026-06-01 20:57 ` sashiko-bot
2026-06-02 17:35 ` Drew Fustini
2026-06-01 20:36 ` [PATCH RFC v6 15/18] riscv_cbqri: resctrl: Add mbm_total_bytes bandwidth monitoring Drew Fustini
2026-06-01 20:36 ` Drew Fustini
2026-06-01 21:01 ` sashiko-bot
2026-06-02 21:09 ` Drew Fustini
2026-06-01 20:36 ` [PATCH RFC v6 16/18] ACPI: RISC-V: Parse RISC-V Quality of Service Controller (RQSC) table Drew Fustini
2026-06-01 20:36 ` Drew Fustini
2026-06-01 20:36 ` [PATCH RFC v6 17/18] ACPI: RISC-V: Add support for RISC-V Quality of Service Controller (RQSC) Drew Fustini
2026-06-01 20:36 ` Drew Fustini
2026-06-01 20:36 ` [PATCH RFC v6 18/18] riscv: enable resctrl filesystem for Ssqosid Drew Fustini
2026-06-01 20:36 ` Drew Fustini
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ah5jgXEHzQUEREV0@gen8 \
--to=fustini@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=sashiko-reviews@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.