Devicetree
 help / color / mirror / Atom feed
From: Drew Fustini <fustini@kernel.org>
To: sashiko@lists.linux.dev
Cc: conor+dt@kernel.org, robh@kernel.org, krzk+dt@kernel.org,
	devicetree@vger.kernel.org, linux-riscv@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC v4 15/18] riscv_cbqri: resctrl: Add mbm_total_bytes bandwidth monitoring
Date: Sat, 16 May 2026 18:38:59 -0700	[thread overview]
Message-ID: <agkcM6FCKbc9G9Df@thelio> (raw)
In-Reply-To: <20260512212653.B5715C2BCB8@smtp.kernel.org>

On Tue, May 12, 2026 at 09:26:53PM +0000, sashiko-bot@kernel.org wrote:
> > @@ -308,6 +348,76 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain_hdr *hdr,
> >  		mutex_unlock(&ctrl->lock);
> >  		return err;
> >  
> > +	case QOS_L3_MBM_TOTAL_EVENT_ID: {
> > +		struct cbqri_controller *bc;
> > +
> > +		/*
> > +		 * The L3 monitoring domain's id is the L3 cache id. The
> > +		 * matching ctrl domain's hw_dom->paired_bc was cached at
> > +		 * add time to avoid walking cbqri_controllers on every read.
> > +		 */
> > +		d = cbqri_find_ctrl_domain(&r->ctrl_domains, hdr->id);
> 
> Is there a list traversal race here that could happen during CPU hotplug?
> 
> When offlining a CPU, the domain removal process uses list_del_rcu() and frees 
> domains after calling synchronize_rcu(). However, cbqri_find_ctrl_domain() 
> traverses the ctrl_domains list without holding an rcu_read_lock(). 
> 
> If synchronize_rcu() does not wait for this reader, could this lead to a 
> use-after-free when the freed domain struct is accessed?

I'll change it to take cbqri_domain_list_lock in
resctrl_arch_rmid_read() and resctrl_arch_reset_rmid(), and drop the RCU
on the writer side in favor of plain list_del under the same mutex.

> > @@ -730,6 +840,16 @@ static int cbqri_resctrl_control_init(struct cbqri_resctrl_res *cbqri_res)
> >  			res->mon.num_rmid = ctrl->mcid_count;
> >  			resctrl_enable_mon_event(QOS_L3_OCCUP_EVENT_ID,
> >  						 false, 0, NULL);
> > +
> > +			/*
> > +			 * Expose BC bandwidth monitoring as the L3's
> > +			 * mbm_total_bytes when a BC shares topology with this
> > +			 * L3 (MPAM "MB on L3" mapping).
> > +			 */
> > +			if (cbqri_resctrl_counters[QOS_L3_MBM_TOTAL_EVENT_ID])
> > +				resctrl_enable_mon_event(QOS_L3_MBM_TOTAL_EVENT_ID,
> > +							 false, 0, NULL);
> > +
> >  			res->mon_capable = true;
> >  		}
> >  		break;
> 
> Is bandwidth monitoring inadvertently disabled if the L3 capacity controller 
> lacks occupancy monitoring? 
> 
> Because this block is inside an if condition checking ctrl->mon_capable where 
> ctrl is the L3 capacity controller, it appears the bandwidth monitoring won't 
> be exposed even if a valid and monitoring-capable bandwidth controller is 
> present on the system.

The L3 mon_domain is currently anchored on the L3 CC's mon_capable path,
so a system with a mon-capable BC but a mon-incapable L3 CC loses
mbm_total_bytes. That can be addressed in a future series.

> > @@ -949,6 +1072,29 @@ static int cbqri_attach_cpu_to_l3_mon(struct cbqri_controller *ctrl,
> >  	if (err)
> >  		goto err_offline;
> >  
> > +	/*
> > +	 * Pair this L3 domain with the system's mon-capable BC. The
> > +	 * cached pointer is consulted by every rmid_read / reset_rmid.
> > +	 * BC mon init is system-wide. Failure here means mbm_total_bytes
> > +	 * is unusable for any domain, so fail the attach.
> > +	 */
> > +	{
> > +		struct cbqri_resctrl_dom *hw_dom = container_of(ctrl_dom,
> > +								struct cbqri_resctrl_dom,
> > +								resctrl_ctrl_dom);
> > +
> > +		hw_dom->paired_bc = cbqri_find_only_mon_bc();
> > +		if (hw_dom->paired_bc) {
> > +			err = cbqri_init_bc_mon_counters(hw_dom->paired_bc);
> 
> Could this sequence result in a use-after-free or a null pointer dereference?
> 
> By the time this block runs, resctrl_online_mon_domain() has already been 
> called earlier in the function, exposing the monitoring domain to user space. 
> If a concurrent sysfs read triggers resctrl_arch_rmid_read() at this moment, 
> it could pass the lockless check for bc->mbm_total_states and block waiting 
> for the hardware lock.
> 
> If cbqri_init_bc_mon_counters() fails, it drops the lock, frees 
> bc->mbm_total_states, and sets it to null. 
> 
> When the sysfs read unblocks, will it dereference the freed pointer? Might it 
> be safer to defer domain onlining until after the hardware state is fully 
> initialized?

I will change it so cbqri_init_bc_mon_counters() and the paired_bc
assignment happen before resctrl_online_mon_domain(), so no reader can
observe the half-initialized state.

> 
> Furthermore, does this unconditional assignment of cbqri_find_only_mon_bc() 
> cause bandwidth overcounting on platforms with multiple L3 domains?
> 
> If the single system-wide bandwidth controller is paired with every L3 
> domain, standard user-space tools that calculate group bandwidth by summing 
> values across all L3 domains might overcount the true system bandwidth by 
> a factor equal to the number of active L3 domains.

Only a single bandwidth controller is supported with this implemenation.
cbqri_resctrl_pick_counters() counts distinct L3 cache_ids at setup and
refuses to expose mbm_total_bytes when more than one L3 domain is
present.

-Drew

  reply	other threads:[~2026-05-17  1:39 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-11  5:10 [PATCH RFC v4 00/18] riscv: add Ssqosid and CBQRI resctrl support Drew Fustini
2026-05-11  5:10 ` [PATCH RFC v4 01/18] dt-bindings: riscv: Add Ssqosid extension description Drew Fustini
2026-05-11  5:10 ` [PATCH RFC v4 02/18] riscv: detect the Ssqosid extension Drew Fustini
2026-05-11  5:10 ` [PATCH RFC v4 03/18] riscv: add support for srmcfg CSR from " Drew Fustini
2026-05-11 23:52   ` sashiko-bot
2026-05-14 22:24     ` Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 04/18] fs/resctrl: Add resctrl_is_membw() helper Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 05/18] fs/resctrl: Add RDT_RESOURCE_MB_MIN and RDT_RESOURCE_MB_WGHT Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 06/18] fs/resctrl: Let bandwidth resources default to min_bw at reset Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 07/18] riscv_cbqri: Add capacity controller probe and allocation device ops Drew Fustini
2026-05-12  1:26   ` sashiko-bot
2026-05-15 13:46     ` Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 08/18] riscv_cbqri: Add capacity controller monitoring " Drew Fustini
2026-05-12  1:58   ` sashiko-bot
2026-05-16  3:05     ` Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 09/18] riscv_cbqri: Add bandwidth controller probe and allocation " Drew Fustini
2026-05-12  2:29   ` sashiko-bot
2026-05-16  6:15     ` Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 10/18] riscv_cbqri: Add bandwidth controller monitoring " Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 11/18] riscv_cbqri: resctrl: Add cache allocation via capacity block mask Drew Fustini
2026-05-12  4:01   ` sashiko-bot
2026-05-16 15:45     ` Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 12/18] riscv_cbqri: resctrl: Add L3 cache occupancy monitoring Drew Fustini
2026-05-12  5:00   ` sashiko-bot
2026-05-16 22:39     ` Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 13/18] riscv_cbqri: resctrl: Add MB_MIN bandwidth allocation via Rbwb Drew Fustini
2026-05-12  5:42   ` sashiko-bot
2026-05-17  0:33     ` Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 14/18] riscv_cbqri: resctrl: Add MB_WGHT bandwidth allocation via Mweight Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 15/18] riscv_cbqri: resctrl: Add mbm_total_bytes bandwidth monitoring Drew Fustini
2026-05-12 21:26   ` sashiko-bot
2026-05-17  1:38     ` Drew Fustini [this message]
2026-05-11  5:11 ` [PATCH RFC v4 16/18] ACPI: RISC-V: Parse RISC-V Quality of Service Controller (RQSC) table Drew Fustini
2026-05-12 21:46   ` sashiko-bot
2026-05-11  5:11 ` [PATCH RFC v4 17/18] ACPI: RISC-V: Add support for RISC-V Quality of Service Controller (RQSC) Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 18/18] riscv: enable resctrl filesystem for Ssqosid Drew Fustini
2026-05-12 22:56   ` sashiko-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=agkcM6FCKbc9G9Df@thelio \
    --to=fustini@kernel.org \
    --cc=conor+dt@kernel.org \
    --cc=devicetree@vger.kernel.org \
    --cc=krzk+dt@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=robh@kernel.org \
    --cc=sashiko@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox