From: Drew Fustini <fustini@kernel.org>
To: sashiko-reviews@lists.linux.dev
Cc: devicetree@vger.kernel.org, robh@kernel.org, conor+dt@kernel.org
Subject: Re: [PATCH RFC v6 15/18] riscv_cbqri: resctrl: Add mbm_total_bytes bandwidth monitoring
Date: Tue, 2 Jun 2026 14:09:12 -0700 [thread overview]
Message-ID: <ah9GeBlzrrL/C3kL@x1> (raw)
In-Reply-To: <20260601210114.88F1B1F00893@smtp.kernel.org>
On Mon, Jun 01, 2026 at 09:01:14PM +0000, sashiko-bot@kernel.org wrote:
> > @@ -184,17 +191,67 @@ void resctrl_arch_mon_event_config_write(void *info)
> > void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_l3_mon_domain *d,
> > u32 unused, u32 rmid, enum resctrl_event_id eventid)
> > {
> > + struct cbqri_resctrl_dom *hw_dom;
> > + struct rdt_ctrl_domain *cd;
> > +
> > + if (irqs_disabled())
> > + return;
> > +
> > + mutex_lock(&cbqri_domain_list_lock);
>
> [Severity: High]
> This is a pre-existing issue, but does expanding cbqri_domain_list_lock
> usage here worsen an AB-BA deadlock with rdtgroup_mutex?
>
> A deadlock can occur during CPU hotplug and sysfs reads due to conflicting
> lock ordering:
>
> Thread A (CPU Hotplug) takes cbqri_domain_list_lock in
> cbqri_resctrl_online_cpu(), then calls resctrl_online_mon_domain() via
> cbqri_attach_cpu_to_l3_mon(), which blocks waiting for rdtgroup_mutex.
>
> Thread B (Sysfs read) takes rdtgroup_mutex in rdtgroup_mondata_show(), then
> delegates the read via smp_call_on_cpu() to a worker thread.
>
> Thread C (Worker) executes resctrl_arch_rmid_read(), which blocks waiting
> for cbqri_domain_list_lock.
This is not a problem. Every rdtgroup_mutex -> cbqri_domain_list_lock
path runs under cpus_read_lock, and the hotplug path runs under
cpus_write_lock.
> > +static void cbqri_resctrl_pick_counters(void)
> > +{
> [ ... ]
> > + /*
> > + * mbm_total_bytes is surfaced on the L3 monitoring domain, so it
> > + * needs a mon-capable L3 cache controller as well as a single
> > + * mon-capable bandwidth controller.
> > + */
> > + if (l3->ctrl && l3->ctrl->mon_capable && cbqri_find_only_mon_bc())
> > + resctrl_enable_mon_event(QOS_L3_MBM_TOTAL_EVENT_ID, false, 0, NULL);
>
> Does passing any_cpu = false here break monitor reads and RMID resets on
> nohz_full systems?
>
> When any_cpu is false, the resctrl core ensures the read executes on a CPU
> in the target domain. If all CPUs in that domain are nohz_full,
> mon_event_read() falls back from smp_call_on_cpu() (process context) to
> smp_call_function_any() (IPI hardirq context with interrupts disabled).
>
> Because drivers/resctrl/cbqri_resctrl.c checks for irqs_disabled() in both
> resctrl_arch_rmid_read() and resctrl_arch_reset_rmid() and silently aborts,
> monitor reads and resets will fail on nohz_full systems.
The CBQRI MMIO registers are cpu agnostic so I will register
QOS_L3_MBM_TOTAL with any_cpu=true so the core reads in process context.
> > @@ -854,6 +1054,7 @@ static int cbqri_attach_cpu_to_l3_mon(struct cbqri_controller *ctrl,
> [ ... ]
> > err_offline:
> > /*
> > * cancel_delayed_work avoids deadlocking against the cqm_limbo
> > * worker which takes cpus_read_lock while this hotplug callback
> > * already holds cpus_write_lock. mbm_over is only
> > * INIT_DELAYED_WORK'd when MBM_TOTAL was enabled, so gate the
> > * cancel on the same condition to avoid touching a zeroed work
> > * struct.
> > */
> > cancel_delayed_work(&mon_dom->cqm_limbo);
> > + if (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID))
> > + cancel_delayed_work(&mon_dom->mbm_over);
> > resctrl_offline_mon_domain(res, &mon_dom->hdr);
> > err_listdel:
> > list_del(&mon_dom->hdr.list);
> > err_free:
> > kfree(mon_dom);
> > return err;
> > }
>
> Does this non-blocking cancel_delayed_work() lead to a use-after-free of
> mon_dom?
>
> In drivers/resctrl/cbqri_resctrl.c:cbqri_attach_cpu_to_l3_mon(),
> resctrl_online_mon_domain() schedules the mbm_over delayed work. If hardware
> initialization fails later in the function, the error path executes
> cancel_delayed_work(), which is non-blocking.
>
> If the worker thread has already woken up and is actively blocked on
> cpus_read_lock() waiting for the hotplug thread to drop cpus_write_lock(),
> cancel_delayed_work() will return false.
>
> The hotplug thread will then proceed to free mon_dom and drop the lock.
> Once the lock is dropped, the worker thread resumes execution and will
> access the freed mon_dom.
I don't think that this is a problem. resctrl_offline_mon_domain()
already issues cancel_delayed_work() for mbm_over and cqm_limbo, and x86
frees the domain right after. The mbm_over and cqm_limbo workers take
cpus_read_lock() first, and both the error path and the offline teardown
run under cpus_write_lock().
Drew
next prev parent reply other threads:[~2026-06-02 21:09 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-01 20:35 [PATCH RFC v6 00/18] riscv: add Ssqosid and CBQRI resctrl support Drew Fustini
2026-06-01 20:35 ` [PATCH RFC v6 01/18] dt-bindings: riscv: Add Ssqosid extension description Drew Fustini
2026-06-01 20:35 ` [PATCH RFC v6 02/18] riscv: detect the Ssqosid extension Drew Fustini
2026-06-01 20:35 ` [PATCH RFC v6 03/18] riscv: add support for srmcfg CSR from " Drew Fustini
2026-06-01 20:49 ` sashiko-bot
2026-06-01 20:35 ` [PATCH RFC v6 04/18] fs/resctrl: Add resctrl_is_membw() helper Drew Fustini
2026-06-01 20:35 ` [PATCH RFC v6 05/18] fs/resctrl: Add RDT_RESOURCE_MB_MIN and RDT_RESOURCE_MB_WGHT Drew Fustini
2026-06-01 20:36 ` [PATCH RFC v6 06/18] fs/resctrl: Let bandwidth resources default to min_bw at reset Drew Fustini
2026-06-01 20:55 ` sashiko-bot
2026-06-01 20:36 ` [PATCH RFC v6 07/18] riscv_cbqri: Add capacity controller probe and allocation device ops Drew Fustini
2026-06-01 20:48 ` sashiko-bot
2026-06-01 20:36 ` [PATCH RFC v6 08/18] riscv_cbqri: Add capacity controller monitoring " Drew Fustini
2026-06-01 20:51 ` sashiko-bot
2026-06-01 20:36 ` [PATCH RFC v6 09/18] riscv_cbqri: Add bandwidth controller probe and allocation " Drew Fustini
2026-06-01 20:49 ` sashiko-bot
2026-06-01 20:36 ` [PATCH RFC v6 10/18] riscv_cbqri: Add bandwidth controller monitoring " Drew Fustini
2026-06-01 20:36 ` [PATCH RFC v6 11/18] riscv_cbqri: resctrl: Add cache allocation via capacity block mask Drew Fustini
2026-06-01 20:56 ` sashiko-bot
2026-06-01 20:36 ` [PATCH RFC v6 12/18] riscv_cbqri: resctrl: Add L3 cache occupancy monitoring Drew Fustini
2026-06-01 20:58 ` sashiko-bot
2026-06-01 20:36 ` [PATCH RFC v6 13/18] riscv_cbqri: resctrl: Add MB_MIN bandwidth allocation via Rbwb Drew Fustini
2026-06-01 20:57 ` sashiko-bot
2026-06-01 20:36 ` [PATCH RFC v6 14/18] riscv_cbqri: resctrl: Add MB_WGHT bandwidth allocation via Mweight Drew Fustini
2026-06-01 20:57 ` sashiko-bot
2026-06-01 20:36 ` [PATCH RFC v6 15/18] riscv_cbqri: resctrl: Add mbm_total_bytes bandwidth monitoring Drew Fustini
2026-06-01 21:01 ` sashiko-bot
2026-06-02 21:09 ` Drew Fustini [this message]
2026-06-01 20:36 ` [PATCH RFC v6 16/18] ACPI: RISC-V: Parse RISC-V Quality of Service Controller (RQSC) table Drew Fustini
2026-06-01 20:36 ` [PATCH RFC v6 17/18] ACPI: RISC-V: Add support for RISC-V Quality of Service Controller (RQSC) Drew Fustini
2026-06-01 20:36 ` [PATCH RFC v6 18/18] riscv: enable resctrl filesystem for Ssqosid Drew Fustini
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ah9GeBlzrrL/C3kL@x1 \
--to=fustini@kernel.org \
--cc=conor+dt@kernel.org \
--cc=devicetree@vger.kernel.org \
--cc=robh@kernel.org \
--cc=sashiko-reviews@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox