From: sashiko-bot@kernel.org
To: "Drew Fustini" <fustini@kernel.org>
Cc: conor+dt@kernel.org, robh@kernel.org, krzk+dt@kernel.org,
devicetree@vger.kernel.org
Subject: Re: [PATCH RFC v4 15/18] riscv_cbqri: resctrl: Add mbm_total_bytes bandwidth monitoring
Date: Tue, 12 May 2026 21:26:53 +0000 [thread overview]
Message-ID: <20260512212653.B5715C2BCB8@smtp.kernel.org> (raw)
In-Reply-To: <20260510-ssqosid-cbqri-rqsc-v7-0-v4-15-eb53831ef683@kernel.org>
Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
- [High] Use-After-Free / NULL pointer dereference due to premature domain exposure and TOCTOU race
- [High] List traversal race leading to Use-After-Free during CPU hotplug
- [Medium] Bandwidth monitoring is silently disabled if the L3 capacity controller lacks occupancy monitoring
- [Medium] Severe bandwidth overcounting on platforms with multiple L3 domains due to lack of topology matching
--
commit 2e7334d7e901ad4eb3efafaf7a1b871bac0c83fe
Author: Drew Fustini <fustini@kernel.org>
riscv_cbqri: resctrl: Add mbm_total_bytes bandwidth monitoring
Expose CBQRI bandwidth controller's combined read+write counter as
the L3 mbm_total_bytes event. A software accumulator keeps the
64-bit byte total monotonic across the 62-bit hardware counter wrap.
> diff --git a/drivers/resctrl/cbqri_resctrl.c b/drivers/resctrl/cbqri_resctrl.c
> index 71ee8e6107571..ccd48cac7ccde 100644
> --- a/drivers/resctrl/cbqri_resctrl.c
> +++ b/drivers/resctrl/cbqri_resctrl.c
[ ... ]
> @@ -308,6 +348,76 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain_hdr *hdr,
> mutex_unlock(&ctrl->lock);
> return err;
>
> + case QOS_L3_MBM_TOTAL_EVENT_ID: {
> + struct cbqri_controller *bc;
> +
> + /*
> + * The L3 monitoring domain's id is the L3 cache id. The
> + * matching ctrl domain's hw_dom->paired_bc was cached at
> + * add time to avoid walking cbqri_controllers on every read.
> + */
> + d = cbqri_find_ctrl_domain(&r->ctrl_domains, hdr->id);
Is there a list traversal race here that could happen during CPU hotplug?
When offlining a CPU, the domain removal process uses list_del_rcu() and frees
domains after calling synchronize_rcu(). However, cbqri_find_ctrl_domain()
traverses the ctrl_domains list without holding an rcu_read_lock().
If synchronize_rcu() does not wait for this reader, could this lead to a
use-after-free when the freed domain struct is accessed?
[ ... ]
> @@ -730,6 +840,16 @@ static int cbqri_resctrl_control_init(struct cbqri_resctrl_res *cbqri_res)
> res->mon.num_rmid = ctrl->mcid_count;
> resctrl_enable_mon_event(QOS_L3_OCCUP_EVENT_ID,
> false, 0, NULL);
> +
> + /*
> + * Expose BC bandwidth monitoring as the L3's
> + * mbm_total_bytes when a BC shares topology with this
> + * L3 (MPAM "MB on L3" mapping).
> + */
> + if (cbqri_resctrl_counters[QOS_L3_MBM_TOTAL_EVENT_ID])
> + resctrl_enable_mon_event(QOS_L3_MBM_TOTAL_EVENT_ID,
> + false, 0, NULL);
> +
> res->mon_capable = true;
> }
> break;
Is bandwidth monitoring inadvertently disabled if the L3 capacity controller
lacks occupancy monitoring?
Because this block is inside an if condition checking ctrl->mon_capable where
ctrl is the L3 capacity controller, it appears the bandwidth monitoring won't
be exposed even if a valid and monitoring-capable bandwidth controller is
present on the system.
[ ... ]
> @@ -949,6 +1072,29 @@ static int cbqri_attach_cpu_to_l3_mon(struct cbqri_controller *ctrl,
> if (err)
> goto err_offline;
>
> + /*
> + * Pair this L3 domain with the system's mon-capable BC. The
> + * cached pointer is consulted by every rmid_read / reset_rmid.
> + * BC mon init is system-wide. Failure here means mbm_total_bytes
> + * is unusable for any domain, so fail the attach.
> + */
> + {
> + struct cbqri_resctrl_dom *hw_dom = container_of(ctrl_dom,
> + struct cbqri_resctrl_dom,
> + resctrl_ctrl_dom);
> +
> + hw_dom->paired_bc = cbqri_find_only_mon_bc();
> + if (hw_dom->paired_bc) {
> + err = cbqri_init_bc_mon_counters(hw_dom->paired_bc);
Could this sequence result in a use-after-free or a null pointer dereference?
By the time this block runs, resctrl_online_mon_domain() has already been
called earlier in the function, exposing the monitoring domain to user space.
If a concurrent sysfs read triggers resctrl_arch_rmid_read() at this moment,
it could pass the lockless check for bc->mbm_total_states and block waiting
for the hardware lock.
If cbqri_init_bc_mon_counters() fails, it drops the lock, frees
bc->mbm_total_states, and sets it to null.
When the sysfs read unblocks, will it dereference the freed pointer? Might it
be safer to defer domain onlining until after the hardware state is fully
initialized?
Furthermore, does this unconditional assignment of cbqri_find_only_mon_bc()
cause bandwidth overcounting on platforms with multiple L3 domains?
If the single system-wide bandwidth controller is paired with every L3
domain, standard user-space tools that calculate group bandwidth by summing
values across all L3 domains might overcount the true system bandwidth by
a factor equal to the number of active L3 domains.
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260510-ssqosid-cbqri-rqsc-v7-0-v4-0-eb53831ef683@kernel.org?part=15
next prev parent reply other threads:[~2026-05-12 21:26 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-11 5:10 [PATCH RFC v4 00/18] riscv: add Ssqosid and CBQRI resctrl support Drew Fustini
2026-05-11 5:10 ` [PATCH RFC v4 01/18] dt-bindings: riscv: Add Ssqosid extension description Drew Fustini
2026-05-11 5:10 ` [PATCH RFC v4 02/18] riscv: detect the Ssqosid extension Drew Fustini
2026-05-11 5:10 ` [PATCH RFC v4 03/18] riscv: add support for srmcfg CSR from " Drew Fustini
2026-05-11 23:52 ` sashiko-bot
2026-05-11 5:11 ` [PATCH RFC v4 04/18] fs/resctrl: Add resctrl_is_membw() helper Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 05/18] fs/resctrl: Add RDT_RESOURCE_MB_MIN and RDT_RESOURCE_MB_WGHT Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 06/18] fs/resctrl: Let bandwidth resources default to min_bw at reset Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 07/18] riscv_cbqri: Add capacity controller probe and allocation device ops Drew Fustini
2026-05-12 1:26 ` sashiko-bot
2026-05-11 5:11 ` [PATCH RFC v4 08/18] riscv_cbqri: Add capacity controller monitoring " Drew Fustini
2026-05-12 1:58 ` sashiko-bot
2026-05-11 5:11 ` [PATCH RFC v4 09/18] riscv_cbqri: Add bandwidth controller probe and allocation " Drew Fustini
2026-05-12 2:29 ` sashiko-bot
2026-05-11 5:11 ` [PATCH RFC v4 10/18] riscv_cbqri: Add bandwidth controller monitoring " Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 11/18] riscv_cbqri: resctrl: Add cache allocation via capacity block mask Drew Fustini
2026-05-12 4:01 ` sashiko-bot
2026-05-11 5:11 ` [PATCH RFC v4 12/18] riscv_cbqri: resctrl: Add L3 cache occupancy monitoring Drew Fustini
2026-05-12 5:00 ` sashiko-bot
2026-05-11 5:11 ` [PATCH RFC v4 13/18] riscv_cbqri: resctrl: Add MB_MIN bandwidth allocation via Rbwb Drew Fustini
2026-05-12 5:42 ` sashiko-bot
2026-05-11 5:11 ` [PATCH RFC v4 14/18] riscv_cbqri: resctrl: Add MB_WGHT bandwidth allocation via Mweight Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 15/18] riscv_cbqri: resctrl: Add mbm_total_bytes bandwidth monitoring Drew Fustini
2026-05-12 21:26 ` sashiko-bot [this message]
2026-05-11 5:11 ` [PATCH RFC v4 16/18] ACPI: RISC-V: Parse RISC-V Quality of Service Controller (RQSC) table Drew Fustini
2026-05-12 21:46 ` sashiko-bot
2026-05-11 5:11 ` [PATCH RFC v4 17/18] ACPI: RISC-V: Add support for RISC-V Quality of Service Controller (RQSC) Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 18/18] riscv: enable resctrl filesystem for Ssqosid Drew Fustini
2026-05-12 22:56 ` sashiko-bot
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260512212653.B5715C2BCB8@smtp.kernel.org \
--to=sashiko-bot@kernel.org \
--cc=conor+dt@kernel.org \
--cc=devicetree@vger.kernel.org \
--cc=fustini@kernel.org \
--cc=krzk+dt@kernel.org \
--cc=robh@kernel.org \
--cc=sashiko@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox