From: sashiko-bot@kernel.org
To: "Drew Fustini" <fustini@kernel.org>
Cc: conor+dt@kernel.org, robh@kernel.org, devicetree@vger.kernel.org
Subject: Re: [PATCH RFC v6 12/18] riscv_cbqri: resctrl: Add L3 cache occupancy monitoring
Date: Mon, 01 Jun 2026 20:58:37 +0000 [thread overview]
Message-ID: <20260601205837.C1AC91F00893@smtp.kernel.org> (raw)
In-Reply-To: <20260601-ssqosid-cbqri-rqsc-v7-0-v6-12-baf00f50028a@kernel.org>
Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:
- [High] Non-sync cancellation of `cqm_limbo` and `mbm_over` delayed work during CPU offline and error paths leads to a use-after-free of the `rdt_l3_mon_domain` structure.
- [High] AB-BA deadlock between `cbqri_domain_list_lock` and `rdtgroup_mutex` during concurrent CPU hotplug operations and RMID reads.
- [High] Self-deadlock on `cbqri_domain_list_lock` during CPU offline domain teardown when flushing limbo RMIDs.
- [High] `max_rmid` remains at its uninitialized `U32_MAX` value if an L2 capacity controller is `mon_capable`, leading to a massive OOM allocation during filesystem mount.
- [Medium] Userspace reads of LLC occupancy will fail with `-EIO` when targeting `nohz_full` CPUs due to incorrectly registering the event with `any_cpu = false`.
--
commit e556b965bbc51629bbf2d48c342592c000877b63
Author: Drew Fustini <fustini@kernel.org>
Subject: riscv_cbqri: resctrl: Add L3 cache occupancy monitoring
Expose QOS_L3_OCCUP_EVENT_ID so userspace can read per-MCID
llc_occupancy. The result is converted from capacity blocks to bytes
using cache_size and ncblks. Each MCID is armed once with the Occupancy
event by cbqri_init_mon_counters() and free runs thereafter.
> diff --git a/drivers/resctrl/cbqri_resctrl.c b/drivers/resctrl/cbqri_resctrl.c
> --- a/drivers/resctrl/cbqri_resctrl.c
> +++ b/drivers/resctrl/cbqri_resctrl.c
[ ... ]
> @@ -181,6 +181,89 @@ void resctrl_arch_mon_event_config_write(void *info)
> +int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain_hdr *hdr,
> + u32 closid, u32 rmid, enum resctrl_event_id eventid,
> + void *arch_priv, u64 *val, void *arch_mon_ctx)
> +{
[ ... ]
> + if (irqs_disabled())
> + return -EIO;
> +
> + /*
> + * cbqri_domain_list_lock serialises the list walk against
> + * cbqri_detach_cpu_from_ctrl_domains().
> + */
> + mutex_lock(&cbqri_domain_list_lock);
[Severity: High]
During CPU hotplug, cbqri_resctrl_online_cpu() acquires
cbqri_domain_list_lock and eventually calls resctrl_online_mon_domain()
which acquires the core framework rdtgroup_mutex.
However, when reading monitoring data, the core framework acquires
rdtgroup_mutex and then invokes resctrl_arch_rmid_read() which acquires
cbqri_domain_list_lock here.
Does this establish an AB-BA lock inversion between cbqri_domain_list_lock
and rdtgroup_mutex?
[Severity: High]
Additionally, when a CPU goes offline, cbqri_resctrl_offline_cpu() acquires
cbqri_domain_list_lock before initiating domain teardown.
If the domain is torn down, the resctrl core flushes busy RMIDs by calling
__check_limbo(), which invokes this resctrl_arch_rmid_read() hook.
Since this hook unconditionally attempts to acquire cbqri_domain_list_lock,
does this cause a self-deadlock when offlining a CPU with active monitoring
traffic?
[ ... ]
> @@ -500,6 +582,13 @@ static int cbqri_resctrl_control_init(struct cbqri_resctrl_res *cbqri_res)
[ ... ]
> + if (ctrl->mon_capable && res->rid == RDT_RESOURCE_L3) {
> + res->mon_scope = RESCTRL_L3_CACHE;
> + resctrl_enable_mon_event(QOS_L3_OCCUP_EVENT_ID,
> + false, 0, NULL);
> + res->mon_capable = true;
> + }
[Severity: Medium]
The L3 occupancy event is registered with any_cpu set to false. This
requires the core to read the event from a specific CPU using
smp_call_function_any().
If that CPU is configured as nohz_full, the read executes in an IPI context
with interrupts disabled. Since resctrl_arch_rmid_read() correctly returns
-EIO when irqs_disabled(), will userspace reads fail on nohz_full CPUs?
Could any_cpu be true here to avoid cross-CPU IPIs, since the MMIO reads
can be safely executed from any CPU?
[ ... ]
> @@ -510,6 +599,7 @@ static int cbqri_resctrl_control_init(struct cbqri_resctrl_res *cbqri_res)
[ ... ]
> + l3_ctrl = cbqri_resctrl_resources[RDT_RESOURCE_L3].ctrl;
> + if (l3_ctrl && l3_ctrl->mon_capable)
> + max_rmid = min(max_rmid, l3_ctrl->mcid_count);
> +
> + if (!exposed_mon_capable) {
> + max_rmid = 1;
> + return;
> }
[Severity: High]
If the hardware has an L2 monitoring controller but no L3 monitoring,
exposed_mon_capable becomes true but max_rmid is never narrowed from its
initial U32_MAX value.
When the core framework mounts the filesystem, it will query
resctrl_arch_system_num_rmid_idx() which returns max_rmid. Will this
cause an immediate OOM panic when it attempts to allocate
U32_MAX * sizeof(struct rmid_entry)?
[ ... ]
> @@ -560,13 +673,89 @@ static struct rdt_ctrl_domain *cbqri_create_ctrl_domain(struct cbqri_controller
[ ... ]
> +err_offline:
> + /*
> + * cancel_delayed_work avoids deadlocking against the cqm_limbo
> + * worker which takes cpus_read_lock while this hotplug callback
> + * already holds cpus_write_lock. mbm_over is only
> + * INIT_DELAYED_WORK'd when MBM_TOTAL was enabled, so gate the
> + * cancel on the same condition to avoid touching a zeroed work
> + * struct.
> + */
> + cancel_delayed_work(&mon_dom->cqm_limbo);
> + if (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID))
> + cancel_delayed_work(&mon_dom->mbm_over);
> + resctrl_offline_mon_domain(res, &mon_dom->hdr);
> +err_listdel:
> + list_del(&mon_dom->hdr.list);
> +err_free:
> + kfree(mon_dom);
> + return err;
> +}
[Severity: High]
If a worker thread is currently executing but blocked waiting for
cpus_read_lock(), cancel_delayed_work() will return immediately because
the work is no longer pending in the queue.
After the hotplug thread drops cpus_write_lock(), the worker could unblock
and continue executing, accessing the mon_dom structure after it was freed.
Does this create a use-after-free window?
[ ... ]
> +static void cbqri_detach_cpu_from_l3_mon(struct rdt_resource *res,
> + unsigned int cpu)
> +{
[ ... ]
> + cancel_delayed_work(&mon_dom->cqm_limbo);
> + if (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID))
> + cancel_delayed_work(&mon_dom->mbm_over);
> + resctrl_offline_mon_domain(res, &mon_dom->hdr);
> + list_del(&mon_dom->hdr.list);
> + kfree(mon_dom);
> + }
> + }
> }
[Severity: High]
Similarly here, if a worker thread is already executing and waiting on
cpus_read_lock(), cancel_delayed_work() won't wait for it to finish.
Will this lead to a use-after-free on mon_dom once the lock is dropped
and the worker continues?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260601-ssqosid-cbqri-rqsc-v7-0-v6-0-baf00f50028a@kernel.org?part=12
next prev parent reply other threads:[~2026-06-01 20:58 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-01 20:35 [PATCH RFC v6 00/18] riscv: add Ssqosid and CBQRI resctrl support Drew Fustini
2026-06-01 20:35 ` [PATCH RFC v6 01/18] dt-bindings: riscv: Add Ssqosid extension description Drew Fustini
2026-06-01 20:35 ` [PATCH RFC v6 02/18] riscv: detect the Ssqosid extension Drew Fustini
2026-06-01 20:35 ` [PATCH RFC v6 03/18] riscv: add support for srmcfg CSR from " Drew Fustini
2026-06-01 20:49 ` sashiko-bot
2026-06-01 20:35 ` [PATCH RFC v6 04/18] fs/resctrl: Add resctrl_is_membw() helper Drew Fustini
2026-06-01 20:35 ` [PATCH RFC v6 05/18] fs/resctrl: Add RDT_RESOURCE_MB_MIN and RDT_RESOURCE_MB_WGHT Drew Fustini
2026-06-01 20:36 ` [PATCH RFC v6 06/18] fs/resctrl: Let bandwidth resources default to min_bw at reset Drew Fustini
2026-06-01 20:55 ` sashiko-bot
2026-06-01 20:36 ` [PATCH RFC v6 07/18] riscv_cbqri: Add capacity controller probe and allocation device ops Drew Fustini
2026-06-01 20:48 ` sashiko-bot
2026-06-01 20:36 ` [PATCH RFC v6 08/18] riscv_cbqri: Add capacity controller monitoring " Drew Fustini
2026-06-01 20:51 ` sashiko-bot
2026-06-01 20:36 ` [PATCH RFC v6 09/18] riscv_cbqri: Add bandwidth controller probe and allocation " Drew Fustini
2026-06-01 20:49 ` sashiko-bot
2026-06-01 20:36 ` [PATCH RFC v6 10/18] riscv_cbqri: Add bandwidth controller monitoring " Drew Fustini
2026-06-01 20:36 ` [PATCH RFC v6 11/18] riscv_cbqri: resctrl: Add cache allocation via capacity block mask Drew Fustini
2026-06-01 20:56 ` sashiko-bot
2026-06-01 20:36 ` [PATCH RFC v6 12/18] riscv_cbqri: resctrl: Add L3 cache occupancy monitoring Drew Fustini
2026-06-01 20:58 ` sashiko-bot [this message]
2026-06-01 20:36 ` [PATCH RFC v6 13/18] riscv_cbqri: resctrl: Add MB_MIN bandwidth allocation via Rbwb Drew Fustini
2026-06-01 20:57 ` sashiko-bot
2026-06-01 20:36 ` [PATCH RFC v6 14/18] riscv_cbqri: resctrl: Add MB_WGHT bandwidth allocation via Mweight Drew Fustini
2026-06-01 20:57 ` sashiko-bot
2026-06-01 20:36 ` [PATCH RFC v6 15/18] riscv_cbqri: resctrl: Add mbm_total_bytes bandwidth monitoring Drew Fustini
2026-06-01 21:01 ` sashiko-bot
2026-06-01 20:36 ` [PATCH RFC v6 16/18] ACPI: RISC-V: Parse RISC-V Quality of Service Controller (RQSC) table Drew Fustini
2026-06-01 20:36 ` [PATCH RFC v6 17/18] ACPI: RISC-V: Add support for RISC-V Quality of Service Controller (RQSC) Drew Fustini
2026-06-01 20:36 ` [PATCH RFC v6 18/18] riscv: enable resctrl filesystem for Ssqosid Drew Fustini
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260601205837.C1AC91F00893@smtp.kernel.org \
--to=sashiko-bot@kernel.org \
--cc=conor+dt@kernel.org \
--cc=devicetree@vger.kernel.org \
--cc=fustini@kernel.org \
--cc=robh@kernel.org \
--cc=sashiko-reviews@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox