From mboxrd@z Thu Jan 1 00:00:00 1970
From: sashiko-bot@kernel.org
Subject: Re: [PATCH RFC v4 12/18] riscv_cbqri: resctrl: Add L3 cache occupancy monitoring
Reply-To: sashiko@lists.linux.dev
To: "Drew Fustini"
Cc: robh@kernel.org, conor+dt@kernel.org, devicetree@vger.kernel.org, krzk+dt@kernel.org
In-Reply-To: <20260510-ssqosid-cbqri-rqsc-v7-0-v4-12-eb53831ef683@kernel.org>
References: <20260510-ssqosid-cbqri-rqsc-v7-0-v4-12-eb53831ef683@kernel.org>
Content-Type: text/plain; charset=utf-8
Date: Tue, 12 May 2026 05:00:00 +0000
Message-Id: <20260512050001.38B24C2BCB0@smtp.kernel.org>
X-Mailing-List: devicetree@vger.kernel.org

Thank you for your contribution! Sashiko AI review found 8 potential
issues to consider:

- [High] AB-BA deadlock between the CPU hotplug lock and `cqm_limbo`
  workqueue synchronization.
- [High] Unconditional call to `cancel_delayed_work_sync()` on the
  uninitialized `mbm_over` work struct.
- [High] Partial CPU hotplug failures leak CPUs in `ctrl_domain` and
  cause permanent state corruption.
- [High] `cbqri_resctrl_online_cpu()` ignores controllers that are only
  capable of monitoring.
- [High] `resctrl_arch_reset_rmid()` is never invoked for L3 Occupancy
  during RMID recycle, leading to broken monitoring.
- [Medium] `resctrl_arch_rmid_read()` unconditionally fails with `-EIO`
  on `nohz_full` CPUs.
- [Medium] Duplicate hardware initialization loops introduce unnecessary
  MMIO overhead during L3 monitoring domain setup.
- [Low] The `cbqri_resctrl_counters` array is populated but never read.

--
commit 3c72f0947f0494c3b85bcbe44e05864be0ec064e
Author: Drew Fustini

    riscv_cbqri: resctrl: Add L3 cache occupancy monitoring

    Expose QOS_L3_OCCUP_EVENT_ID so userspace can read per-MCID
    llc_occupancy. The result is converted from capacity blocks to bytes
    using cache_size and ncblks.

> resctrl_arch_reset_rmid() re-arms CONFIG_EVENT with EVT_ID=Occupancy.
> CONFIG_EVENT both resets the counter to 0 and selects the event, so
> re-arming with the same event keeps the MCID counting after reset rather
> than relying on sticky-last-event semantics that the CBQRI register
> definition does not guarantee.

The resctrl core only calls mon_event_read() with first=true (which
triggers the arch reset hook) for Memory Bandwidth Monitoring events,
never for L3 Occupancy. Because resctrl_arch_reset_rmid() is never
invoked for L3 Occupancy during RMID recycle, will the counter fail to
re-arm when a new monitoring group is created, causing it to stop
counting or inherit dirty data from previous users?
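To make the semantics concrete, the CONFIG_EVENT re-arm the commit
message describes would look roughly like the sketch below. This is
only an illustration: cbqri_ctrl_from_mon_domain(), cbqri_mmio_op() and
the CBQRI_* encodings are placeholder names rather than the driver's
confirmed API, and the signature just mirrors the struct names visible
in this patch.

/*
 * Sketch only: re-arm CONFIG_EVENT for an occupancy MCID.
 * Placeholder helpers, not the driver's real API.
 */
void resctrl_arch_reset_rmid(struct rdt_resource *r,
                             struct rdt_l3_mon_domain *d,
                             u32 closid, u32 rmid,
                             enum resctrl_event_id eventid, void *arch_priv)
{
        struct cbqri_controller *ctrl = cbqri_ctrl_from_mon_domain(d);

        if (eventid != QOS_L3_OCCUP_EVENT_ID)
                return;

        /*
         * CONFIG_EVENT with EVT_ID=Occupancy zeroes the counter and
         * (re)selects the event, so the MCID keeps counting afterwards
         * rather than depending on sticky-last-event behaviour.
         */
        cbqri_mmio_op(ctrl, CBQRI_OP_CONFIG_EVENT,
                      CBQRI_EVT_OCCUPANCY | CBQRI_MCID(rmid));
}

If that matches the intent, the open question is who calls it: either
the core would need to pass first=true for occupancy reads, or the
driver would need to re-arm from a path it does control when an RMID
is recycled.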
> The L3 mon_domain is created lazily on the first CPU of a cache_id and
> linked to the paired ctrl_domain.
>
> Assisted-by: Claude:claude-opus-4-7
> Co-developed-by: Adrien Ricciardi
> Signed-off-by: Adrien Ricciardi
> Signed-off-by: Drew Fustini

> diff --git a/drivers/resctrl/cbqri_resctrl.c b/drivers/resctrl/cbqri_resctrl.c
> index 82b157d35576b..d8fd9b06703fa 100644
> --- a/drivers/resctrl/cbqri_resctrl.c
> +++ b/drivers/resctrl/cbqri_resctrl.c

[ ... ]

> @@ -33,6 +34,13 @@ struct cbqri_resctrl_dom {
>
>  static struct cbqri_resctrl_res cbqri_resctrl_resources[RDT_NUM_RESOURCES];
>
> +/*
> + * Per-event controller table. Only events CBQRI can back occupy a
> + * slot, so other events do not bloat the array.
> + */
> +#define CBQRI_MAX_EVENT QOS_L3_OCCUP_EVENT_ID
> +static struct cbqri_controller *cbqri_resctrl_counters[CBQRI_MAX_EVENT + 1];

This array is allocated and initialized in cbqri_resctrl_pick_counters()
and nulled out during teardown, but its contents never appear to be read
by any other logic in the driver. Is this array just dead code?

[ ... ]

> @@ -205,10 +225,101 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain_hdr *hdr,
>                            u32 closid, u32 rmid, enum resctrl_event_id eventid,
>                            void *arch_priv, u64 *val, void *arch_mon_ctx)
>  {
> -        return -ENODATA;
> +        struct cbqri_resctrl_dom *hw_dom;
> +        struct cbqri_controller *ctrl;
> +        struct rdt_ctrl_domain *d;
> +        u64 ctr_val;
> +        int err;
> +
> +        resctrl_arch_rmid_read_context_check();
> +
> +        /*
> +         * Each branch takes a sleeping mutex. Bail if called with IRQs
> +         * disabled (e.g. smp_call_function_any() from nohz_full CPUs).
> +         */
> +        if (irqs_disabled())
> +                return -EIO;

The resctrl core intentionally reads remote domains via
smp_call_function_any() when the target CPU is nohz_full, running the
callback in hardirq context. Will rejecting calls in atomic context
completely break cache occupancy monitoring for nohz_full CPUs, since
any attempt to read their statistics will then always fail with -EIO?
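One possible direction, sketched below rather than offered as a
definitive fix: make only the counter-read leg non-sleeping so it can
run from the IPI the core sends to nohz_full CPUs. This assumes the
controller's sleeping mutex could be narrowed to a raw_spinlock_t for
this path; CBQRI_CC_MON_CTL, CBQRI_CC_MON_CTR_VAL, CBQRI_BUSY,
CBQRI_OP_READ_COUNTER, CBQRI_MCID() and CBQRI_ATOMIC_POLL_LIMIT are all
placeholder names.

/*
 * Sketch: a bounded, non-sleeping counter read usable from hardirq
 * context. All CBQRI_* names below are placeholders.
 */
static int cbqri_read_counter_atomic(struct cbqri_controller *ctrl,
                                     u32 mcid, u64 *val)
{
        unsigned int spins = CBQRI_ATOMIC_POLL_LIMIT;
        unsigned long flags;

        raw_spin_lock_irqsave(&ctrl->mmio_lock, flags);

        /* Select the MCID and kick off a counter read */
        writeq(CBQRI_OP_READ_COUNTER | CBQRI_MCID(mcid),
               ctrl->base + CBQRI_CC_MON_CTL);

        /* Bounded busy-wait on the BUSY bit instead of sleeping */
        while ((readq(ctrl->base + CBQRI_CC_MON_CTL) & CBQRI_BUSY) && --spins)
                cpu_relax();

        if (readq(ctrl->base + CBQRI_CC_MON_CTL) & CBQRI_BUSY) {
                raw_spin_unlock_irqrestore(&ctrl->mmio_lock, flags);
                return -ETIMEDOUT;
        }

        *val = readq(ctrl->base + CBQRI_CC_MON_CTR_VAL);
        raw_spin_unlock_irqrestore(&ctrl->mmio_lock, flags);
        return 0;
}

resctrl_arch_rmid_read() could then route irqs_disabled() callers
through a path like this instead of returning -EIO outright.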
> @@ -577,6 +725,71 @@ static struct rdt_ctrl_domain *cbqri_create_ctrl_domain(struct cbqri_controller
>          return domain;
>  }
>
> +static int cbqri_attach_cpu_to_l3_mon(struct cbqri_controller *ctrl,
> +                                      struct rdt_resource *res, unsigned int cpu)
> +{

[ ... ]

> +        err = resctrl_online_mon_domain(res, &mon_dom->hdr);
> +        if (err)
> +                goto err_listdel;
> +
> +        err = cbqri_init_mon_counters(ctrl);

During domain initialization, resctrl_online_mon_domain() internally
calls resctrl_arch_reset_rmid_all(), which loops over all max_rmid RMIDs
to issue CONFIG_EVENT MMIO operations. Immediately after it returns,
cbqri_init_mon_counters() loops over ctrl->mcid_count to issue the exact
same CONFIG_EVENT MMIO operations again. Since each CBQRI MMIO operation
can sleep and take up to 1 ms, does doing this twice per domain
initialization add purely redundant, and potentially significant,
latency to CPU hotplug?

> +        if (err)
> +                goto err_offline;
> +
> +        return 0;
> +
> +err_offline:
> +        cancel_delayed_work_sync(&mon_dom->cqm_limbo);
> +        cancel_delayed_work_sync(&mon_dom->mbm_over);

The resctrl core only initializes mbm_over if resctrl_is_mbm_enabled()
is true. Since the CBQRI driver only enables the L3 Occupancy event and
not MBM, mbm_over remains zero-initialized from allocation. Can calling
cancel_delayed_work_sync() on a zeroed structure pass an uninitialized
timer to del_timer_sync() and trigger a kernel panic?

> +        resctrl_offline_mon_domain(res, &mon_dom->hdr);
> +err_listdel:
> +        list_del(&mon_dom->hdr.list);
> +err_free:
> +        kfree(mon_dom);
> +        return err;
> +}
> +
> +static int cbqri_attach_cpu_to_cap_ctrl(struct cbqri_controller *ctrl,
> +                                        unsigned int cpu)
> +{

The hotplug callback cbqri_resctrl_online_cpu() unconditionally skips
controllers when !ctrl->alloc_capable. Will this cause controllers that
are only capable of monitoring to be ignored entirely, so that no CPUs
are ever attached and no monitoring domains are created on
monitoring-only hardware?

>          struct cbqri_resctrl_res *hw_res;
>          struct rdt_ctrl_domain *domain;
>          struct rdt_resource *res;
>          int dom_id;
> +        int err;
>

[ ... ]

>          domain = cbqri_find_ctrl_domain(&res->ctrl_domains, dom_id);
>          if (domain) {
>                  cpumask_set_cpu(cpu, &domain->hdr.cpu_mask);
> -                return 0;
> +        } else {
> +                domain = cbqri_create_ctrl_domain(ctrl, res, cpu, dom_id);
> +                if (IS_ERR(domain))
> +                        return PTR_ERR(domain);
>          }
>
> -        domain = cbqri_create_ctrl_domain(ctrl, res, cpu, dom_id);
> -        if (IS_ERR(domain))
> -                return PTR_ERR(domain);
> +        if (ctrl->mon_capable && ctrl->cache.cache_level == 3) {
> +                err = cbqri_attach_cpu_to_l3_mon(ctrl, res, cpu);
> +                if (err)
> +                        return err;

If cbqri_attach_cpu_to_l3_mon() fails here, the error is returned, but
the CPU is not removed from the ctrl_domain mask and the domain
structure is not freed. Because the CPU hotplug core handles a startup
failure by rolling back from the previous state, the offline callback
for the current state (cbqri_resctrl_offline_cpu) is never invoked.
Does this leave the CPU permanently stranded in the ctrl_domain mask
and leak the newly allocated domain structure?

> +        }
>
>          return 0;
>  }
>
> +static void cbqri_detach_cpu_from_l3_mon(struct rdt_resource *res,
> +                                         unsigned int cpu)
> +{
> +        struct rdt_l3_mon_domain *mon_dom, *tmp;
> +
> +        lockdep_assert_held(&cbqri_domain_list_lock);
> +
> +        list_for_each_entry_safe(mon_dom, tmp, &res->mon_domains, hdr.list) {
> +                if (!cpumask_test_cpu(cpu, &mon_dom->hdr.cpu_mask))
> +                        continue;
> +                cpumask_clear_cpu(cpu, &mon_dom->hdr.cpu_mask);
> +                if (cpumask_empty(&mon_dom->hdr.cpu_mask)) {
> +                        cancel_delayed_work_sync(&mon_dom->cqm_limbo);

This function runs as a CPU hotplug callback, so the thread holds
cpus_write_lock(). Concurrently, the cqm_handle_limbo() worker in the
resctrl core begins by acquiring cpus_read_lock(). If the worker starts
executing while the hotplug thread is active, the worker blocks on
cpus_read_lock() while the hotplug thread blocks in
cancel_delayed_work_sync() waiting for the worker to finish. Could this
create an unresolvable AB-BA deadlock that hangs the system during CPU
offline operations?
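For comparison, the x86 resctrl hotplug path avoids this inversion by
using the non-waiting cancel_delayed_work() when it tears down a
domain, relying on a shared mutex (rdtgroup_mutex there) to serialize
an already-running worker against domain teardown. A sketch of the same
shape for this function, under that serialization assumption; it also
sidesteps the mbm_over problem flagged in the summary:

        if (cpumask_empty(&mon_dom->hdr.cpu_mask)) {
                /*
                 * Non-waiting cancel: never blocks on a worker that
                 * may itself be blocked on cpus_read_lock().
                 */
                cancel_delayed_work(&mon_dom->cqm_limbo);

                /*
                 * mbm_over is only initialized when MBM is enabled,
                 * so only touch it under the same condition.
                 */
                if (resctrl_is_mbm_enabled())
                        cancel_delayed_work(&mon_dom->mbm_over);

                resctrl_offline_mon_domain(res, &mon_dom->hdr);
                list_del(&mon_dom->hdr.list);
                kfree(mon_dom);
        }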
> +                        cancel_delayed_work_sync(&mon_dom->mbm_over);

(As noted above, mbm_over appears to be uninitialized here as well.)

> +                        resctrl_offline_mon_domain(res, &mon_dom->hdr);
> +                        list_del(&mon_dom->hdr.list);
> +                        kfree(mon_dom);
> +                }
> +        }
> +}

--
Sashiko AI review · https://sashiko.dev/#/patchset/20260510-ssqosid-cbqri-rqsc-v7-0-v4-0-eb53831ef683@kernel.org?part=12