From: Drew Fustini <fustini@kernel.org>
To: sashiko@lists.linux.dev
Cc: robh@kernel.org, krzk+dt@kernel.org, devicetree@vger.kernel.org,
conor+dt@kernel.org, linux-riscv@lists.infradead.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC v4 11/18] riscv_cbqri: resctrl: Add cache allocation via capacity block mask
Date: Sat, 16 May 2026 08:45:55 -0700 [thread overview]
Message-ID: <agiRMzX3nTvO0jPM@thelio> (raw)
In-Reply-To: <20260512040135.51715C2BCB0@smtp.kernel.org>
On Tue, May 12, 2026 at 04:01:34AM +0000, sashiko-bot@kernel.org wrote:
> > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> > --- a/arch/riscv/Kconfig
> > +++ b/arch/riscv/Kconfig
> > @@ -595,6 +595,7 @@ config RISCV_ISA_SSQOSID
> > bool "Ssqosid extension support for supervisor mode Quality of Service ID"
> > depends on 64BIT
> > default n
> > + select ARCH_HAS_CPU_RESCTRL
>
> If CONFIG_RISCV_CBQRI_DRIVER is disabled, does this cause a linker error
> when CONFIG_RESCTRL_FS is enabled? The riscv implementation of the
> resctrl_arch_* functions is in cbqri_resctrl.c, which seems to only be
> compiled when the driver is enabled.
I will change RISCV_ISA_SSQOSID to select RISCV_CBQRI_DRIVER
unconditionally. That makes the SSQOSID=y CBQRI_DRIVER=n combination
unreachable, so the resctrl_arch_* symbols are always available when
RESCTRL_FS is enabled.
> > +u32 resctrl_arch_get_num_closid(struct rdt_resource *res)
> > +{
> > + struct cbqri_resctrl_res *hw_res;
> > +
> > + hw_res = container_of(res, struct cbqri_resctrl_res, resctrl_res);
> > +
> > + /*
> > + * fs/resctrl calls this for resctrl-defined rids that CBQRI may not
> > + * back (e.g. RDT_RESOURCE_MBA from set_mba_sc() during unmount).
> > + * Unpicked rids have ctrl == NULL. Report no closids.
> > + */
> > + if (!hw_res->ctrl)
> > + return 0;
> > +
> > + return hw_res->ctrl->rcid_count;
> > +}
>
> Is there a bounds check needed for rcid_count against the 12-bit maximum?
>
> If the firmware reports an rcid_count greater than 4096, fs/resctrl would
> permit creating groups with closid >= 4096.
>
> This could lead to FIELD_PREP() silently truncating the upper bits in
> resctrl_arch_set_closid_rmid(), causing the task to run in a different group
> and bypassing the intended hardware isolation.
CBQRI_MAX_RCID is 1024, so there is no risk of exceeding the 12-bit
SRMCFG_RCID_MASK limit. However, I can add a WARN_ON_ONCE(rcid_count >
SRMCFG_RCID_MASK) at controller registration time so a future discovery
path like DT cannot silently bypass this.
> > +static struct rdt_ctrl_domain *cbqri_new_domain(struct cbqri_controller *ctrl)
> > +{
> > + struct cbqri_resctrl_dom *hw_dom;
> > + struct rdt_ctrl_domain *domain;
> > +
> > + hw_dom = kzalloc_obj(*hw_dom, GFP_KERNEL);
>
> Does kzalloc_obj exist in the kernel? It looks like this might cause a
> build failure due to an implicit function declaration.
kzalloc_obj() is defined in include/linux/slab.h
> > +static int cbqri_init_domain_ctrlval(struct rdt_resource *r, struct rdt_ctrl_domain *d)
> > +{
> > + struct cbqri_resctrl_res *hw_res;
> > + enum resctrl_conf_type t;
> > + int err = 0;
> > + int i;
> > +
> > + hw_res = container_of(r, struct cbqri_resctrl_res, resctrl_res);
> > +
> > + for (i = 0; i < hw_res->ctrl->rcid_count; i++) {
> > + /*
> > + * Seed both DATA and CODE staged slots so a later mount
> > + * with -o cdp does not see stale CODE values.
> > + * CDP_NUM_TYPES is 1 on non-CDP controllers.
> > + */
> > + for (t = 0; t < CDP_NUM_TYPES; t++) {
> > + err = resctrl_arch_update_one(r, d, i, t,
> > + resctrl_get_default_ctrl(r));
>
> Does this loop correctly handle non-CDP hardware? A comment mentions that
> CDP_NUM_TYPES is 1 on non-CDP controllers, but since it is a compile-time enum
> constant, the loop will always execute for t = CDP_CODE.
>
> This could cause resctrl_arch_update_one() to send CBQRI_AT_CODE to
> controllers that do not support it, potentially triggering a hardware fault and
> aborting initialization.
The comment is wrong. CDP_NUM_TYPES is the compile-time constant 3
regardless of hardware. I will reword the misleading comment and gate
the inner loop on cdp_capable so non-CDP controllers only see CDP_NONE.
> > +static void cbqri_detach_cpu_from_ctrl_domains(struct rdt_resource *res,
> > + unsigned int cpu)
> > +{
> > + struct rdt_ctrl_domain *domain, *tmp;
> > +
> > + list_for_each_entry_safe(domain, tmp, &res->ctrl_domains, hdr.list) {
> > + if (!cpumask_test_cpu(cpu, &domain->hdr.cpu_mask))
> > + continue;
> > + cpumask_clear_cpu(cpu, &domain->hdr.cpu_mask);
> > + if (cpumask_empty(&domain->hdr.cpu_mask)) {
> > + resctrl_offline_ctrl_domain(res, domain);
> > + list_del_rcu(&domain->hdr.list);
> > + synchronize_rcu();
>
> Could this synchronize_rcu() call cause severe latency spikes during CPU
> hotplug?
>
> This function is called synchronously from cbqri_resctrl_offline_cpu() while
> the core kernel holds the global cpu_hotplug_lock. It seems fs/resctrl manages
> these lists using standard locks, so the RCU primitives might not be necessary
> here.
This is the same list_del_rcu + synchronize_rcu pattern used by
arch/x86/kernel/cpu/resctrl/core.c for domain teardown. Readers walk
ctrl_domains under rcu_read_lock() (e.g. the schemata file walks), so
the grace period is needed before kfree.
The latency is bounded by one RCU grace period per detach, same as on
x86. I do not see a way to drop it without changing the reader side too.
> > +static int cbqri_resctrl_setup(void)
> > +{
> > + int rid;
> > + int err;
> > +
> > + /* Wait for cacheinfo so cbqri_probe_cc()'s lazy fill has data. */
> > + wait_event(wait_cacheinfo_ready, cacheinfo_ready);
>
> Could this wait_event() potentially put the init thread to sleep indefinitely?
>
> Because cbqri_resctrl_setup() and __cacheinfo_ready() both run as initcalls on
> the same kernel_init thread, if the previous initcall failed or the order
> changed, wait_event() would sleep forever since no other thread exists to
> wake it up.
device_initcall_sync always runs before late_initcall, so
cacheinfo_ready is already true on entry. The wait_event() is dead code.
I will drop it along with the waitqueue and just leave a comment noting
the initcall ordering CBQRI relies on.
> > + cbqri_resctrl_accumulate_caps();
> > +
> > + if (!exposed_alloc_capable) {
> > + pr_debug("no resctrl-capable CBQRI controllers found\n");
> > + return -ENODEV;
> > + }
> > +
> > + err = resctrl_init();
> > + if (err)
> > + return err;
>
> If resctrl_init() fails, should this function call cbqri_resctrl_teardown()
> or otherwise revert the global state?
>
> Since cbqri_resctrl_accumulate_caps() sets exposed_alloc_capable to true,
> leaving it set on failure would cause resctrl_arch_alloc_capable() to
> incorrectly return true.
If resctrl_init() fails, then the resctrl filesystem is not mountable
and nothing queries resctrl_arch_alloc_capable() afterwards. I will call
cbqri_resctrl_teardown() on the resctrl_init() error path so
exposed_alloc_capable does not stay set.
> > +static int cbqri_resctrl_online_cpu(unsigned int cpu)
> > +{
> > + struct cbqri_controller *ctrl;
> > + int err = 0;
> > +
> > + mutex_lock(&cbqri_domain_list_lock);
>
> Is the cbqri_domain_list_lock mutex necessary?
>
> Since cbqri_resctrl_online_cpu() and cbqri_resctrl_offline_cpu() are invoked
> exclusively by the CPU hotplug state machine, the core kernel already
> serializes them globally using the cpu_hotplug_lock.
I will drop cbqri_domain_list_lock and replace it with
lockdep_assert_cpus_held() in the helpers.
-Drew
WARNING: multiple messages have this Message-ID (diff)
From: Drew Fustini <fustini@kernel.org>
To: sashiko@lists.linux.dev
Cc: robh@kernel.org, krzk+dt@kernel.org, devicetree@vger.kernel.org,
conor+dt@kernel.org, linux-riscv@lists.infradead.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC v4 11/18] riscv_cbqri: resctrl: Add cache allocation via capacity block mask
Date: Sat, 16 May 2026 08:45:55 -0700 [thread overview]
Message-ID: <agiRMzX3nTvO0jPM@thelio> (raw)
In-Reply-To: <20260512040135.51715C2BCB0@smtp.kernel.org>
On Tue, May 12, 2026 at 04:01:34AM +0000, sashiko-bot@kernel.org wrote:
> > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> > --- a/arch/riscv/Kconfig
> > +++ b/arch/riscv/Kconfig
> > @@ -595,6 +595,7 @@ config RISCV_ISA_SSQOSID
> > bool "Ssqosid extension support for supervisor mode Quality of Service ID"
> > depends on 64BIT
> > default n
> > + select ARCH_HAS_CPU_RESCTRL
>
> If CONFIG_RISCV_CBQRI_DRIVER is disabled, does this cause a linker error
> when CONFIG_RESCTRL_FS is enabled? The riscv implementation of the
> resctrl_arch_* functions is in cbqri_resctrl.c, which seems to only be
> compiled when the driver is enabled.
I will change RISCV_ISA_SSQOSID to select RISCV_CBQRI_DRIVER
unconditionally. That makes the SSQOSID=y CBQRI_DRIVER=n combination
unreachable, so the resctrl_arch_* symbols are always available when
RESCTRL_FS is enabled.
> > +u32 resctrl_arch_get_num_closid(struct rdt_resource *res)
> > +{
> > + struct cbqri_resctrl_res *hw_res;
> > +
> > + hw_res = container_of(res, struct cbqri_resctrl_res, resctrl_res);
> > +
> > + /*
> > + * fs/resctrl calls this for resctrl-defined rids that CBQRI may not
> > + * back (e.g. RDT_RESOURCE_MBA from set_mba_sc() during unmount).
> > + * Unpicked rids have ctrl == NULL. Report no closids.
> > + */
> > + if (!hw_res->ctrl)
> > + return 0;
> > +
> > + return hw_res->ctrl->rcid_count;
> > +}
>
> Is there a bounds check needed for rcid_count against the 12-bit maximum?
>
> If the firmware reports an rcid_count greater than 4096, fs/resctrl would
> permit creating groups with closid >= 4096.
>
> This could lead to FIELD_PREP() silently truncating the upper bits in
> resctrl_arch_set_closid_rmid(), causing the task to run in a different group
> and bypassing the intended hardware isolation.
CBQRI_MAX_RCID is 1024, so there is no risk of exceeding the 12-bit
SRMCFG_RCID_MASK limit. However, I can add a WARN_ON_ONCE(rcid_count >
SRMCFG_RCID_MASK) at controller registration time so a future discovery
path like DT cannot silently bypass this.
> > +static struct rdt_ctrl_domain *cbqri_new_domain(struct cbqri_controller *ctrl)
> > +{
> > + struct cbqri_resctrl_dom *hw_dom;
> > + struct rdt_ctrl_domain *domain;
> > +
> > + hw_dom = kzalloc_obj(*hw_dom, GFP_KERNEL);
>
> Does kzalloc_obj exist in the kernel? It looks like this might cause a
> build failure due to an implicit function declaration.
kzalloc_obj() is defined in include/linux/slab.h
> > +static int cbqri_init_domain_ctrlval(struct rdt_resource *r, struct rdt_ctrl_domain *d)
> > +{
> > + struct cbqri_resctrl_res *hw_res;
> > + enum resctrl_conf_type t;
> > + int err = 0;
> > + int i;
> > +
> > + hw_res = container_of(r, struct cbqri_resctrl_res, resctrl_res);
> > +
> > + for (i = 0; i < hw_res->ctrl->rcid_count; i++) {
> > + /*
> > + * Seed both DATA and CODE staged slots so a later mount
> > + * with -o cdp does not see stale CODE values.
> > + * CDP_NUM_TYPES is 1 on non-CDP controllers.
> > + */
> > + for (t = 0; t < CDP_NUM_TYPES; t++) {
> > + err = resctrl_arch_update_one(r, d, i, t,
> > + resctrl_get_default_ctrl(r));
>
> Does this loop correctly handle non-CDP hardware? A comment mentions that
> CDP_NUM_TYPES is 1 on non-CDP controllers, but since it is a compile-time enum
> constant, the loop will always execute for t = CDP_CODE.
>
> This could cause resctrl_arch_update_one() to send CBQRI_AT_CODE to
> controllers that do not support it, potentially triggering a hardware fault and
> aborting initialization.
The comment is wrong. CDP_NUM_TYPES is the compile-time constant 3
regardless of hardware. I will reword the misleading comment and gate
the inner loop on cdp_capable so non-CDP controllers only see CDP_NONE.
> > +static void cbqri_detach_cpu_from_ctrl_domains(struct rdt_resource *res,
> > + unsigned int cpu)
> > +{
> > + struct rdt_ctrl_domain *domain, *tmp;
> > +
> > + list_for_each_entry_safe(domain, tmp, &res->ctrl_domains, hdr.list) {
> > + if (!cpumask_test_cpu(cpu, &domain->hdr.cpu_mask))
> > + continue;
> > + cpumask_clear_cpu(cpu, &domain->hdr.cpu_mask);
> > + if (cpumask_empty(&domain->hdr.cpu_mask)) {
> > + resctrl_offline_ctrl_domain(res, domain);
> > + list_del_rcu(&domain->hdr.list);
> > + synchronize_rcu();
>
> Could this synchronize_rcu() call cause severe latency spikes during CPU
> hotplug?
>
> This function is called synchronously from cbqri_resctrl_offline_cpu() while
> the core kernel holds the global cpu_hotplug_lock. It seems fs/resctrl manages
> these lists using standard locks, so the RCU primitives might not be necessary
> here.
This is the same list_del_rcu + synchronize_rcu pattern used by
arch/x86/kernel/cpu/resctrl/core.c for domain teardown. Readers walk
ctrl_domains under rcu_read_lock() (e.g. the schemata file walks), so
the grace period is needed before kfree.
The latency is bounded by one RCU grace period per detach, same as on
x86. I do not see a way to drop it without changing the reader side too.
> > +static int cbqri_resctrl_setup(void)
> > +{
> > + int rid;
> > + int err;
> > +
> > + /* Wait for cacheinfo so cbqri_probe_cc()'s lazy fill has data. */
> > + wait_event(wait_cacheinfo_ready, cacheinfo_ready);
>
> Could this wait_event() potentially put the init thread to sleep indefinitely?
>
> Because cbqri_resctrl_setup() and __cacheinfo_ready() both run as initcalls on
> the same kernel_init thread, if the previous initcall failed or the order
> changed, wait_event() would sleep forever since no other thread exists to
> wake it up.
device_initcall_sync always runs before late_initcall, so
cacheinfo_ready is already true on entry. The wait_event() is dead code.
I will drop it along with the waitqueue and just leave a comment noting
the initcall ordering CBQRI relies on.
> > + cbqri_resctrl_accumulate_caps();
> > +
> > + if (!exposed_alloc_capable) {
> > + pr_debug("no resctrl-capable CBQRI controllers found\n");
> > + return -ENODEV;
> > + }
> > +
> > + err = resctrl_init();
> > + if (err)
> > + return err;
>
> If resctrl_init() fails, should this function call cbqri_resctrl_teardown()
> or otherwise revert the global state?
>
> Since cbqri_resctrl_accumulate_caps() sets exposed_alloc_capable to true,
> leaving it set on failure would cause resctrl_arch_alloc_capable() to
> incorrectly return true.
If resctrl_init() fails, then the resctrl filesystem is not mountable
and nothing queries resctrl_arch_alloc_capable() afterwards. I will call
cbqri_resctrl_teardown() on the resctrl_init() error path so
exposed_alloc_capable does not stay set.
> > +static int cbqri_resctrl_online_cpu(unsigned int cpu)
> > +{
> > + struct cbqri_controller *ctrl;
> > + int err = 0;
> > +
> > + mutex_lock(&cbqri_domain_list_lock);
>
> Is the cbqri_domain_list_lock mutex necessary?
>
> Since cbqri_resctrl_online_cpu() and cbqri_resctrl_offline_cpu() are invoked
> exclusively by the CPU hotplug state machine, the core kernel already
> serializes them globally using the cpu_hotplug_lock.
I will drop cbqri_domain_list_lock and replace it with
lockdep_assert_cpus_held() in the helpers.
-Drew
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
next prev parent reply other threads:[~2026-05-16 15:45 UTC|newest]
Thread overview: 68+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-11 5:10 [PATCH RFC v4 00/18] riscv: add Ssqosid and CBQRI resctrl support Drew Fustini
2026-05-11 5:10 ` Drew Fustini
2026-05-11 5:10 ` [PATCH RFC v4 01/18] dt-bindings: riscv: Add Ssqosid extension description Drew Fustini
2026-05-11 5:10 ` Drew Fustini
2026-05-11 5:10 ` [PATCH RFC v4 02/18] riscv: detect the Ssqosid extension Drew Fustini
2026-05-11 5:10 ` Drew Fustini
2026-05-11 5:10 ` [PATCH RFC v4 03/18] riscv: add support for srmcfg CSR from " Drew Fustini
2026-05-11 5:10 ` Drew Fustini
2026-05-11 23:52 ` sashiko-bot
2026-05-14 22:24 ` Drew Fustini
2026-05-14 22:24 ` Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 04/18] fs/resctrl: Add resctrl_is_membw() helper Drew Fustini
2026-05-11 5:11 ` Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 05/18] fs/resctrl: Add RDT_RESOURCE_MB_MIN and RDT_RESOURCE_MB_WGHT Drew Fustini
2026-05-11 5:11 ` Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 06/18] fs/resctrl: Let bandwidth resources default to min_bw at reset Drew Fustini
2026-05-11 5:11 ` Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 07/18] riscv_cbqri: Add capacity controller probe and allocation device ops Drew Fustini
2026-05-11 5:11 ` Drew Fustini
2026-05-12 1:26 ` sashiko-bot
2026-05-15 13:46 ` Drew Fustini
2026-05-15 13:46 ` Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 08/18] riscv_cbqri: Add capacity controller monitoring " Drew Fustini
2026-05-11 5:11 ` Drew Fustini
2026-05-12 1:58 ` sashiko-bot
2026-05-16 3:05 ` Drew Fustini
2026-05-16 3:05 ` Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 09/18] riscv_cbqri: Add bandwidth controller probe and allocation " Drew Fustini
2026-05-11 5:11 ` Drew Fustini
2026-05-12 2:29 ` sashiko-bot
2026-05-16 6:15 ` Drew Fustini
2026-05-16 6:15 ` Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 10/18] riscv_cbqri: Add bandwidth controller monitoring " Drew Fustini
2026-05-11 5:11 ` Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 11/18] riscv_cbqri: resctrl: Add cache allocation via capacity block mask Drew Fustini
2026-05-11 5:11 ` Drew Fustini
2026-05-12 4:01 ` sashiko-bot
2026-05-16 15:45 ` Drew Fustini [this message]
2026-05-16 15:45 ` Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 12/18] riscv_cbqri: resctrl: Add L3 cache occupancy monitoring Drew Fustini
2026-05-11 5:11 ` Drew Fustini
2026-05-12 5:00 ` sashiko-bot
2026-05-16 22:39 ` Drew Fustini
2026-05-16 22:39 ` Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 13/18] riscv_cbqri: resctrl: Add MB_MIN bandwidth allocation via Rbwb Drew Fustini
2026-05-11 5:11 ` Drew Fustini
2026-05-12 5:42 ` sashiko-bot
2026-05-17 0:33 ` Drew Fustini
2026-05-17 0:33 ` Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 14/18] riscv_cbqri: resctrl: Add MB_WGHT bandwidth allocation via Mweight Drew Fustini
2026-05-11 5:11 ` Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 15/18] riscv_cbqri: resctrl: Add mbm_total_bytes bandwidth monitoring Drew Fustini
2026-05-11 5:11 ` Drew Fustini
2026-05-12 21:26 ` sashiko-bot
2026-05-17 1:38 ` Drew Fustini
2026-05-17 1:38 ` Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 16/18] ACPI: RISC-V: Parse RISC-V Quality of Service Controller (RQSC) table Drew Fustini
2026-05-11 5:11 ` Drew Fustini
2026-05-12 21:46 ` sashiko-bot
2026-05-17 5:01 ` Drew Fustini
2026-05-17 5:01 ` Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 17/18] ACPI: RISC-V: Add support for RISC-V Quality of Service Controller (RQSC) Drew Fustini
2026-05-11 5:11 ` Drew Fustini
2026-05-11 5:11 ` [PATCH RFC v4 18/18] riscv: enable resctrl filesystem for Ssqosid Drew Fustini
2026-05-11 5:11 ` Drew Fustini
2026-05-12 22:56 ` sashiko-bot
2026-05-17 6:56 ` Drew Fustini
2026-05-17 6:56 ` Drew Fustini
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=agiRMzX3nTvO0jPM@thelio \
--to=fustini@kernel.org \
--cc=conor+dt@kernel.org \
--cc=devicetree@vger.kernel.org \
--cc=krzk+dt@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-riscv@lists.infradead.org \
--cc=robh@kernel.org \
--cc=sashiko@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.