Devicetree
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Drew Fustini" <fustini@kernel.org>
Cc: devicetree@vger.kernel.org, robh@kernel.org, conor+dt@kernel.org
Subject: Re: [PATCH RFC v5 11/18] riscv_cbqri: resctrl: Add cache allocation via capacity block mask
Date: Mon, 25 May 2026 00:50:55 +0000	[thread overview]
Message-ID: <20260525005056.312281F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260524-ssqosid-cbqri-rqsc-v7-0-v5-11-78d3a7ba9dbe@kernel.org>

Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
- [High] The CPU hotplug callbacks do not notify the generic resctrl subsystem when CPUs come online or go offline, leaving the `rdtgroup` cpumasks out of sync.
- [High] The per-CPU default CLOSID and RMID are not reset to the reserved defaults when a CPU comes online.
- [High] CONFIG_RISCV_ISA_SSQOSID unconditionally selects ARCH_HAS_CPU_RESCTRL without requiring the implementation driver, leading to build failures.
- [High] Missing rollback on partial failure in CPU hotplug online callback leaves the system in an inconsistent state.
--

commit 31a7934518aa1237167aa6b7e0cb45ba9107610b
Author: Drew Fustini <fustini@kernel.org>

riscv_cbqri: resctrl: Add cache allocation via capacity block mask

This commit wires CBQRI capacity controllers into resctrl as RDT_RESOURCE_L2
and RDT_RESOURCE_L3 schemata. It treats mismatched controller capacities at
the same cache level as a fatal error and creates domains lazily during CPU
hotplug.

> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index 6abbb21f3a0dc..390353a6153a2 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -595,6 +595,7 @@ config RISCV_ISA_SSQOSID
>  	bool "Ssqosid extension support for supervisor mode Quality of Service ID"
>  	depends on 64BIT
>  	default n
> +	select ARCH_HAS_CPU_RESCTRL

[Severity: High]
Could this lead to build failures if the CBQRI driver is not enabled?

The Kconfig symbol RISCV_ISA_SSQOSID unconditionally selects
ARCH_HAS_CPU_RESCTRL, which signals to the core kernel that the architecture
provides the resctrl_arch_* interface. This permits CONFIG_RESCTRL_FS to be
enabled.

However, since the implementation resides in drivers/resctrl/cbqri_resctrl.c
which is only compiled when CONFIG_RISCV_CBQRI_DRIVER is set, enabling
RISCV_ISA_SSQOSID and RESCTRL_FS without RISCV_CBQRI_DRIVER might cause
linker errors due to undefined references to the resctrl_arch_* functions.
Would it be safer to add a dependency on the driver?

> diff --git a/drivers/resctrl/cbqri_resctrl.c b/drivers/resctrl/cbqri_resctrl.c
> new file mode 100644
> index 0000000000000..1e6198e25fd29
> --- /dev/null
> +++ b/drivers/resctrl/cbqri_resctrl.c
> @@ -0,0 +1,736 @@

[ ... ]

> +static int cbqri_resctrl_online_cpu(unsigned int cpu)
> +{
> +	struct cbqri_controller *ctrl;
> +	int err = 0;
> +
> +	mutex_lock(&cbqri_domain_list_lock);
> +
> +	list_for_each_entry(ctrl, &cbqri_controllers, list) {
> +		if (ctrl->type != CBQRI_CONTROLLER_TYPE_CAPACITY)
> +			continue;
> +		if (!cpumask_test_cpu(cpu, &ctrl->cache.cpu_mask))
> +			continue;
> +		if (!ctrl->alloc_capable)
> +			continue;
> +
> +		err = cbqri_attach_cpu_to_cap_ctrl(ctrl, cpu);
> +		if (err)
> +			break;
> +	}

[Severity: High]
Does this loop leave the system in an inconsistent state if it fails partway
through?

If cbqri_attach_cpu_to_cap_ctrl() succeeds for an earlier controller
but fails for a subsequent one, the loop breaks and returns an error. The CPU
hotplug framework will abort the bring-up without invoking the teardown
callback cbqri_resctrl_offline_cpu().

Will this leak the rdt_ctrl_domain structs for the controllers that
were successfully attached?

> +
> +	mutex_unlock(&cbqri_domain_list_lock);
> +	return err;
> +}

[Severity: High]
Should we reset the per-CPU default closid and rmid when a CPU comes online?

When a CPU is offlined while in a custom resctrl group, its cpu_srmcfg_default
might still hold that custom closid/rmid. When the CPU comes back online, it
conceptually returns to the default rdtgroup, so it might need its state
reset via resctrl_arch_set_cpu_default_closid_rmid() to prevent tasks from
running with stale assignments.

[Severity: High]
Does this missing notification leave the generic resctrl subsystem out of sync?

The generic fs/resctrl layer usually relies on the architecture to notify it
via resctrl_online_cpu() to update resource group masks. Without this, newly
onlined CPUs might not be added to the default resctrl group, potentially
breaking sysfs CPU tracking.

> +
> +static int cbqri_resctrl_offline_cpu(unsigned int cpu)
> +{
> +	int rid;
> +
> +	mutex_lock(&cbqri_domain_list_lock);
> +
> +	for (rid = 0; rid < RDT_NUM_RESOURCES; rid++) {
> +		struct cbqri_resctrl_res *hw_res = &cbqri_resctrl_resources[rid];
> +
> +		if (!hw_res->ctrl)
> +			continue;
> +		cbqri_detach_cpu_from_ctrl_domains(&hw_res->resctrl_res, cpu);
> +	}
> +
> +	mutex_unlock(&cbqri_domain_list_lock);
> +	return 0;
> +}

[Severity: High]
Does this also need to call resctrl_offline_cpu() to notify the core layer?

Without notifying the generic layer, offlined CPUs might remain tracked
permanently in the rdtgroup cpumasks.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260524-ssqosid-cbqri-rqsc-v7-0-v5-0-78d3a7ba9dbe@kernel.org?part=11

  reply	other threads:[~2026-05-25  0:50 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-24 23:55 [PATCH RFC v5 00/18] riscv: add Ssqosid and CBQRI resctrl support Drew Fustini
2026-05-24 23:55 ` [PATCH RFC v5 01/18] dt-bindings: riscv: Add Ssqosid extension description Drew Fustini
2026-05-24 23:55 ` [PATCH RFC v5 02/18] riscv: detect the Ssqosid extension Drew Fustini
2026-05-24 23:55 ` [PATCH RFC v5 03/18] riscv: add support for srmcfg CSR from " Drew Fustini
2026-05-25  0:30   ` sashiko-bot
2026-05-24 23:55 ` [PATCH RFC v5 04/18] fs/resctrl: Add resctrl_is_membw() helper Drew Fustini
2026-05-24 23:55 ` [PATCH RFC v5 05/18] fs/resctrl: Add RDT_RESOURCE_MB_MIN and RDT_RESOURCE_MB_WGHT Drew Fustini
2026-05-24 23:55 ` [PATCH RFC v5 06/18] fs/resctrl: Let bandwidth resources default to min_bw at reset Drew Fustini
2026-05-24 23:55 ` [PATCH RFC v5 07/18] riscv_cbqri: Add capacity controller probe and allocation device ops Drew Fustini
2026-05-25  0:30   ` sashiko-bot
2026-05-24 23:55 ` [PATCH RFC v5 08/18] riscv_cbqri: Add capacity controller monitoring " Drew Fustini
2026-05-25  0:29   ` sashiko-bot
2026-05-25  6:58     ` Drew Fustini
2026-05-24 23:55 ` [PATCH RFC v5 09/18] riscv_cbqri: Add bandwidth controller probe and allocation " Drew Fustini
2026-05-25  0:30   ` sashiko-bot
2026-05-25  7:21     ` Drew Fustini
2026-05-24 23:55 ` [PATCH RFC v5 10/18] riscv_cbqri: Add bandwidth controller monitoring " Drew Fustini
2026-05-25  0:36   ` sashiko-bot
2026-05-24 23:55 ` [PATCH RFC v5 11/18] riscv_cbqri: resctrl: Add cache allocation via capacity block mask Drew Fustini
2026-05-25  0:50   ` sashiko-bot [this message]
2026-05-24 23:55 ` [PATCH RFC v5 12/18] riscv_cbqri: resctrl: Add L3 cache occupancy monitoring Drew Fustini
2026-05-25  0:46   ` sashiko-bot
2026-05-24 23:55 ` [PATCH RFC v5 13/18] riscv_cbqri: resctrl: Add MB_MIN bandwidth allocation via Rbwb Drew Fustini
2026-05-25  0:55   ` sashiko-bot
2026-05-24 23:55 ` [PATCH RFC v5 14/18] riscv_cbqri: resctrl: Add MB_WGHT bandwidth allocation via Mweight Drew Fustini
2026-05-25  0:52   ` sashiko-bot
2026-05-24 23:55 ` [PATCH RFC v5 15/18] riscv_cbqri: resctrl: Add mbm_total_bytes bandwidth monitoring Drew Fustini
2026-05-25  1:27   ` sashiko-bot
2026-05-24 23:55 ` [PATCH RFC v5 16/18] ACPI: RISC-V: Parse RISC-V Quality of Service Controller (RQSC) table Drew Fustini
2026-05-25  8:23   ` Sunil V L
2026-05-24 23:55 ` [PATCH RFC v5 17/18] ACPI: RISC-V: Add support for RISC-V Quality of Service Controller (RQSC) Drew Fustini
2026-05-24 23:55 ` [PATCH RFC v5 18/18] riscv: enable resctrl filesystem for Ssqosid Drew Fustini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260525005056.312281F000E9@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=conor+dt@kernel.org \
    --cc=devicetree@vger.kernel.org \
    --cc=fustini@kernel.org \
    --cc=robh@kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox