From: sashiko-bot@kernel.org
To: "Drew Fustini" <fustini@kernel.org>
Cc: robh@kernel.org, conor+dt@kernel.org, devicetree@vger.kernel.org
Subject: Re: [PATCH RFC v6 03/18] riscv: add support for srmcfg CSR from Ssqosid extension
Date: Mon, 01 Jun 2026 20:49:30 +0000
Message-ID: <20260601204931.3A9481F00893@smtp.kernel.org>
In-Reply-To: <20260601-ssqosid-cbqri-rqsc-v7-0-v6-3-baf00f50028a@kernel.org>
Thank you for your contribution! Sashiko AI review found 3 potential issues to consider:
- [High] CPU hotplug cache invalidation in riscv_srmcfg_offline() is overwritten before the CPU halts.
- [High] PM exit notifier invalidates software cache but fails to eagerly restore the hardware CSR.
- [Medium] Failure in cpuhp_setup_state() skips PM notifier registration, silently breaking suspend/resume QoS tracking.
--
commit bc4841378eac220040a12e4f7c694dde12285eb1
Author: Drew Fustini <fustini@kernel.org>
riscv: add support for srmcfg CSR from Ssqosid extension
This patch adds support for the srmcfg CSR defined in the Ssqosid ISA
extension, tracking resource allocation (RCID) and usage (MCID). It adds
a per-task thread_struct field and per-CPU cached variables to update the
CSR during context switches.
Link: https://github.com/riscv/riscv-ssqosid/releases/tag/v1.0
[ ... ]
> diff --git a/arch/riscv/kernel/qos.c b/arch/riscv/kernel/qos.c
> new file mode 100644
> index 0000000000000..d18b99b195e79
> --- /dev/null
> +++ b/arch/riscv/kernel/qos.c
[ ... ]
> +/*
> + * Invalidate the cache on offline too. The sentinel persists across the
> + * offline period, so a CPU brought back online finds the cache already
> + * invalidated before it is schedulable. This closes the window where a
> + * task scheduled before riscv_srmcfg_online() runs could match a stale
> + * cache and skip the CSR write while the hardware CSR was reset across
> + * hart stop/start.
> + */
> +static int riscv_srmcfg_offline(unsigned int cpu)
> +{
> + per_cpu(cpu_srmcfg, cpu) = U32_MAX;
[Severity: High]
Will this U32_MAX sentinel be overwritten before the CPU actually halts?
riscv_srmcfg_offline() runs as the teardown callback for CPUHP_AP_ONLINE_DYN,
so the dying CPU keeps running after the callback returns: it continues
tearing down and eventually context switches to the idle thread, which takes
the hart offline.
That context switch invokes __switch_to_srmcfg(), which writes the hardware
CSR and overwrites the cpu_srmcfg cache with the idle task's value, so the
U32_MAX sentinel does not survive the offline period.
When the CPU comes back online, its hardware CSR is in an
implementation-defined reset state, but the software cache holds a
valid-looking value. Any early task whose srmcfg matches this stale value
might skip the CSR write and execute with incorrect QoS tags until
riscv_srmcfg_online() runs.
> + return 0;
> +}
> +
> +/*
> + * CPU PM notifier: invalidate the cached srmcfg on resume from a deep
> + * idle / suspend. Ssqosid v1.0 leaves CSR_SRMCFG state across low-power
> + * transitions implementation-defined, and the boot CPU never goes
> + * through the cpuhp online callback during system suspend, so without
> + * this hook __switch_to_srmcfg() would skip the CSR write when the
> + * outgoing task happens to share its srmcfg with the pre-suspend cache.
> + */
> +static int riscv_srmcfg_pm_notify(struct notifier_block *nb,
> + unsigned long action, void *unused)
> +{
> + switch (action) {
> + case CPU_PM_EXIT:
> + case CPU_PM_ENTER_FAILED:
> + __this_cpu_write(cpu_srmcfg, U32_MAX);
[Severity: High]
Does this notifier need to eagerly restore the hardware CSR for the currently
executing thread?
Setting cpu_srmcfg to U32_MAX only forces a CSR write on the next context
switch. However, on waking from a low-power state such as suspend-to-RAM, the
CPU does not context switch immediately.
It first unwinds the suspend path, resumes the interrupted thread, and
services pending interrupts.
Because the hardware CSR is not explicitly restored here, all code executing
between wakeup and the first context switch might run with the hardware's
implementation-defined reset value for CSR_SRMCFG instead of the task's
assigned QoS tags.
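The difference between invalidating the cache and restoring the CSR eagerly
can be shown with a small userspace model. This is a sketch, not kernel code:
struct hart_model and the model_pm_exit_lazy(), model_pm_exit_eager() and
model_resume() helpers are hypothetical, and the eager variant is only one
possible direction for a fix.

```c
#include <assert.h>
#include <stdint.h>

#define SRMCFG_INVALID UINT32_MAX /* stands in for the U32_MAX sentinel */

/* Toy model of one hart: the CSR itself and the per-CPU software cache. */
struct hart_model {
	uint32_t hw_csr; /* models CSR_SRMCFG */
	uint32_t cache;  /* models the per-CPU cpu_srmcfg variable */
};

/* Lazy variant: only invalidate the cache, as the quoted notifier does. */
static void model_pm_exit_lazy(struct hart_model *h)
{
	assert(h);
	h->cache = SRMCFG_INVALID;
}

/* Eager variant (hypothetical): also rewrite the CSR for the resumed task. */
static void model_pm_exit_eager(struct hart_model *h, uint32_t cur_srmcfg)
{
	assert(h);
	h->hw_csr = cur_srmcfg;
	h->cache = cur_srmcfg;
}

/*
 * Returns the CSR value seen by the resumed thread before its first context
 * switch. reset_value models whatever the hardware left in the CSR.
 */
static uint32_t model_resume(uint32_t cur_srmcfg, uint32_t reset_value,
			     int eager)
{
	struct hart_model h = { .hw_csr = reset_value, .cache = cur_srmcfg };

	if (eager)
		model_pm_exit_eager(&h, cur_srmcfg);
	else
		model_pm_exit_lazy(&h);
	return h.hw_csr;
}
```

With a task srmcfg of 4 and a reset value of 0, the lazy variant leaves the
resumed thread running with 0 until the next context switch, while the eager
variant yields 4 immediately.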
> + break;
> + }
> + return NOTIFY_OK;
> +}
> +
> +static struct notifier_block riscv_srmcfg_pm_nb = {
> + .notifier_call = riscv_srmcfg_pm_notify,
> +};
> +
> +static int __init riscv_srmcfg_init(void)
> +{
> + int err;
> +
> + if (!riscv_has_extension_unlikely(RISCV_ISA_EXT_SSQOSID))
> + return 0;
> +
> + /*
> + * cpuhp_setup_state() invokes the startup callback locally on every
> + * already-online CPU, so no separate seed loop is needed here.
> + */
> + err = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "riscv/srmcfg:online",
> + riscv_srmcfg_online, riscv_srmcfg_offline);
> + if (err < 0) {
> + pr_warn("srmcfg cpuhp registration failed (%d), cpus brought online after boot will not invalidate the CSR_SRMCFG cache\n",
> + err);
> + return err;
[Severity: Medium]
Is it intentional to skip PM notifier registration if CPU hotplug setup fails?
This early return skips cpu_pm_register_notifier() entirely.
cpuhp_setup_state() failures are rare, but the PM notifier does not depend on
the CPU hotplug registration. If this error path is taken, the cached srmcfg
is never invalidated on resume, so tasks waking from suspend could run with
incorrect QoS tags.
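The two orderings can be compared in a small userspace model. This is a sketch
with hypothetical names (struct init_result, model_init_as_posted(),
model_init_independent(), model_check()). It only tracks which registrations
happen for a given cpuhp_setup_state() return value and does not call any
kernel API.

```c
#include <assert.h>
#include <stdbool.h>

struct init_result {
	int err;            /* value the initcall would return */
	bool pm_registered; /* whether the PM notifier got registered */
};

/* Ordering in the quoted patch: a cpuhp failure returns before PM setup. */
static struct init_result model_init_as_posted(int cpuhp_ret)
{
	struct init_result r = { .err = 0, .pm_registered = false };

	if (cpuhp_ret < 0) {
		r.err = cpuhp_ret;
		return r;
	}
	r.pm_registered = true;
	return r;
}

/* Alternative (hypothetical): register the PM notifier regardless. */
static struct init_result model_init_independent(int cpuhp_ret)
{
	struct init_result r = { .err = 0, .pm_registered = true };

	if (cpuhp_ret < 0)
		r.err = cpuhp_ret;
	return r;
}

/* Both orderings report the same error; only PM registration differs. */
static void model_check(int cpuhp_ret)
{
	assert(model_init_as_posted(cpuhp_ret).err ==
	       model_init_independent(cpuhp_ret).err);
}
```

For a failing return such as -12, the posted ordering leaves pm_registered
false, while the independent ordering still registers the notifier and
propagates the same error.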
> + }
> +
> + cpu_pm_register_notifier(&riscv_srmcfg_pm_nb);
> + return 0;
> +}
> +arch_initcall(riscv_srmcfg_init);
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260601-ssqosid-cbqri-rqsc-v7-0-v6-0-baf00f50028a@kernel.org?part=3