Re: [PATCH RFC v6 14/18] riscv_cbqri: resctrl: Add MB_WGHT bandwidth allocation via Mweight

All of lore.kernel.org
 help / color / mirror / Atom feed

From: sashiko-bot@kernel.org
To: "Drew Fustini" <fustini@kernel.org>
Cc: robh@kernel.org, conor+dt@kernel.org, devicetree@vger.kernel.org
Subject: Re: [PATCH RFC v6 14/18] riscv_cbqri: resctrl: Add MB_WGHT bandwidth allocation via Mweight
Date: Mon, 01 Jun 2026 20:57:34 +0000	[thread overview]
Message-ID: <20260601205734.76D021F00893@smtp.kernel.org> (raw)
In-Reply-To: <20260601-ssqosid-cbqri-rqsc-v7-0-v6-14-baf00f50028a@kernel.org>

Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:

Pre-existing issues:
- [High] AB-BA deadlock between rdtgroup_mutex and cbqri_domain_list_lock involving CPU hotplug and resctrl filesystem reads.
- [High] Unprotected list_del() and kfree() causes Use-After-Free on res->ctrl_domains.
- [High] Staging registers are read and modified before ensuring the hardware is not busy, leading to data corruption.
--

commit 77c2c065054c1b30b97ae2ddd37b439f110e553b
Author: Drew Fustini <fustini@kernel.org>
riscv_cbqri: resctrl: Add MB_WGHT bandwidth allocation via Mweight

Add bandwidth allocation through Mweight (shared weight for unreserved
bandwidth) exposed as the MB_WGHT resource. Mweight is an integer in
[0, 255]. A value of 0 disables work-conserving sharing for the group,
capping its bandwidth at the MB_MIN reservation.

> diff --git a/drivers/resctrl/cbqri_resctrl.c b/drivers/resctrl/cbqri_resctrl.c
> index 1d312004b07d9..14b955eb79491 100644
> --- a/drivers/resctrl/cbqri_resctrl.c
> +++ b/drivers/resctrl/cbqri_resctrl.c
[ ... ]
> @@ -426,6 +428,14 @@ u32 resctrl_arch_get_config(struct rdt_resource *r, struct rdt_ctrl_domain *d,
>  			val = (u32)rbwb;
>  		break;
>  	}
> +	case RDT_RESOURCE_MB_WGHT: {
> +		u64 mweight;
> +
> +		err = cbqri_read_mweight(ctrl, closid, &mweight);

[Severity: High]
This isn't a bug introduced by this patch, but the newly added
cbqri_read_mweight() called here duplicates a pre-existing flaw.

Can this corrupt data by writing to staging registers before ensuring the
hardware is not busy?

In cbqri_read_mweight(), the driver interacts with the hardware by writing to a
staging register, writing a control register, and waiting for the BUSY flag to
clear.

If a previous operation times out (such as the 1ms limit in
cbqri_wait_busy_flag()), the driver returns -EIO and releases ctrl->lock, but
the hardware might still be processing.

The next caller acquires the lock and immediately calls cbqri_set_bc_bw_alloc()
to read-modify-write the staging register before cbqri_bc_alloc_op() verifies
the BUSY flag is 0.

Would modifying the staging register of an in-flight hardware operation cause
undefined behavior, misconfiguration, or WPRI field corruption?

> +		if (err == 0)
> +			val = (u32)mweight;
> +		break;
> +	}
[ ... ]
> @@ -961,10 +1007,46 @@ static int cbqri_attach_cpu_to_one_bw_res(struct cbqri_controller *ctrl,
>  	return 0;
>  }
>  
> +static void cbqri_detach_cpu_from_one_bw_res(struct cbqri_controller *ctrl,
> +					     enum resctrl_res_level rid,
> +					     unsigned int cpu)
> +{
> +	struct cbqri_resctrl_res *hw_res = &cbqri_resctrl_resources[rid];
> +	struct rdt_resource *res = &hw_res->resctrl_res;
> +	struct rdt_ctrl_domain *domain;
> +	int dom_id = ctrl->mem.prox_dom;
> +
> +	lockdep_assert_held(&cbqri_domain_list_lock);
> +
> +	if (!hw_res->ctrl)
> +		return;
> +
> +	domain = cbqri_find_ctrl_domain(&res->ctrl_domains, dom_id);
> +	if (!domain || !cpumask_test_cpu(cpu, &domain->hdr.cpu_mask))
> +		return;
> +
> +	cpumask_clear_cpu(cpu, &domain->hdr.cpu_mask);
> +	if (cpumask_empty(&domain->hdr.cpu_mask)) {
> +		resctrl_offline_ctrl_domain(res, domain);

[Severity: High]
This isn't a bug introduced by this patch, but the newly added
cbqri_detach_cpu_from_one_bw_res() replicates a broken lock ordering.

Could this lead to an AB-BA deadlock between rdtgroup_mutex and
cbqri_domain_list_lock?

If a user reading a monitor data file triggers rdtgroup_mondata_show(), it
acquires rdtgroup_mutex, then calls mon_event_read() ->
resctrl_arch_rmid_read(), which acquires cbqri_domain_list_lock.

Concurrently, when a CPU is taken offline, cbqri_resctrl_offline_cpu()
acquires cbqri_domain_list_lock, then calls cbqri_detach_cpu_from_all_ctrls()
-> cbqri_detach_cpu_from_one_bw_res(). If the domain becomes empty, it calls
resctrl_offline_ctrl_domain(), which attempts to acquire rdtgroup_mutex.

Would these two paths race and cause a permanent AB-BA deadlock?

> +		list_del(&domain->hdr.list);
> +		kfree(container_of(domain, struct cbqri_resctrl_dom,
> +				   resctrl_ctrl_dom));

[Severity: High]
This is a pre-existing issue, but the new cbqri_detach_cpu_from_one_bw_res()
replicates it here.

Can this result in a Use-After-Free on the res->ctrl_domains list?

VFS read paths in fs/resctrl (like rdtgroup_schemata_show()) traverse the
res->ctrl_domains list using list_for_each_entry while holding only
rdtgroup_mutex.

When the CPU mask becomes empty, resctrl_offline_ctrl_domain() is called above,
which takes and releases rdtgroup_mutex. After it returns, list_del() and
kfree() are executed without holding rdtgroup_mutex or using RCU primitives.

Could a concurrent list traversal under rdtgroup_mutex dereference the freed
domain pointer and crash?

> +	}
> +}

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260601-ssqosid-cbqri-rqsc-v7-0-v6-0-baf00f50028a@kernel.org?part=14

next prev parent reply	other threads:[~2026-06-01 20:57 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-01 20:35 [PATCH RFC v6 00/18] riscv: add Ssqosid and CBQRI resctrl support Drew Fustini
2026-06-01 20:35 ` Drew Fustini
2026-06-01 20:35 ` [PATCH RFC v6 01/18] dt-bindings: riscv: Add Ssqosid extension description Drew Fustini
2026-06-01 20:35   ` Drew Fustini
2026-06-01 20:35 ` [PATCH RFC v6 02/18] riscv: detect the Ssqosid extension Drew Fustini
2026-06-01 20:35   ` Drew Fustini
2026-06-01 20:35 ` [PATCH RFC v6 03/18] riscv: add support for srmcfg CSR from " Drew Fustini
2026-06-01 20:35   ` Drew Fustini
2026-06-01 20:49   ` sashiko-bot
2026-06-02  0:58     ` Drew Fustini
2026-06-01 20:35 ` [PATCH RFC v6 04/18] fs/resctrl: Add resctrl_is_membw() helper Drew Fustini
2026-06-01 20:35   ` Drew Fustini
2026-06-01 20:35 ` [PATCH RFC v6 05/18] fs/resctrl: Add RDT_RESOURCE_MB_MIN and RDT_RESOURCE_MB_WGHT Drew Fustini
2026-06-01 20:35   ` Drew Fustini
2026-06-01 20:36 ` [PATCH RFC v6 06/18] fs/resctrl: Let bandwidth resources default to min_bw at reset Drew Fustini
2026-06-01 20:36   ` Drew Fustini
2026-06-01 20:55   ` sashiko-bot
2026-06-02  1:11     ` Drew Fustini
2026-06-01 20:36 ` [PATCH RFC v6 07/18] riscv_cbqri: Add capacity controller probe and allocation device ops Drew Fustini
2026-06-01 20:36   ` Drew Fustini
2026-06-01 20:48   ` sashiko-bot
2026-06-02  2:17     ` Drew Fustini
2026-06-01 20:36 ` [PATCH RFC v6 08/18] riscv_cbqri: Add capacity controller monitoring " Drew Fustini
2026-06-01 20:36   ` Drew Fustini
2026-06-01 20:51   ` sashiko-bot
2026-06-02  2:20     ` Drew Fustini
2026-06-01 20:36 ` [PATCH RFC v6 09/18] riscv_cbqri: Add bandwidth controller probe and allocation " Drew Fustini
2026-06-01 20:36   ` Drew Fustini
2026-06-01 20:49   ` sashiko-bot
2026-06-02  4:27     ` Drew Fustini
2026-06-01 20:36 ` [PATCH RFC v6 10/18] riscv_cbqri: Add bandwidth controller monitoring " Drew Fustini
2026-06-01 20:36   ` Drew Fustini
2026-06-01 20:36 ` [PATCH RFC v6 11/18] riscv_cbqri: resctrl: Add cache allocation via capacity block mask Drew Fustini
2026-06-01 20:36   ` Drew Fustini
2026-06-01 20:56   ` sashiko-bot
2026-06-02  4:31     ` Drew Fustini
2026-06-01 20:36 ` [PATCH RFC v6 12/18] riscv_cbqri: resctrl: Add L3 cache occupancy monitoring Drew Fustini
2026-06-01 20:36   ` Drew Fustini
2026-06-01 20:58   ` sashiko-bot
2026-06-02  4:47     ` Drew Fustini
2026-06-01 20:36 ` [PATCH RFC v6 13/18] riscv_cbqri: resctrl: Add MB_MIN bandwidth allocation via Rbwb Drew Fustini
2026-06-01 20:36   ` Drew Fustini
2026-06-01 20:57   ` sashiko-bot
2026-06-02  5:00     ` Drew Fustini
2026-06-01 20:36 ` [PATCH RFC v6 14/18] riscv_cbqri: resctrl: Add MB_WGHT bandwidth allocation via Mweight Drew Fustini
2026-06-01 20:36   ` Drew Fustini
2026-06-01 20:57   ` sashiko-bot [this message]
2026-06-02 17:35     ` Drew Fustini
2026-06-01 20:36 ` [PATCH RFC v6 15/18] riscv_cbqri: resctrl: Add mbm_total_bytes bandwidth monitoring Drew Fustini
2026-06-01 20:36   ` Drew Fustini
2026-06-01 21:01   ` sashiko-bot
2026-06-02 21:09     ` Drew Fustini
2026-06-01 20:36 ` [PATCH RFC v6 16/18] ACPI: RISC-V: Parse RISC-V Quality of Service Controller (RQSC) table Drew Fustini
2026-06-01 20:36   ` Drew Fustini
2026-06-01 20:36 ` [PATCH RFC v6 17/18] ACPI: RISC-V: Add support for RISC-V Quality of Service Controller (RQSC) Drew Fustini
2026-06-01 20:36   ` Drew Fustini
2026-06-01 20:36 ` [PATCH RFC v6 18/18] riscv: enable resctrl filesystem for Ssqosid Drew Fustini
2026-06-01 20:36   ` Drew Fustini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260601205734.76D021F00893@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=conor+dt@kernel.org \
    --cc=devicetree@vger.kernel.org \
    --cc=fustini@kernel.org \
    --cc=robh@kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.