From: Drew Fustini <fustini@kernel.org>
To: sashiko-reviews@lists.linux.dev
Cc: linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC v6 03/18] riscv: add support for srmcfg CSR from Ssqosid extension
Date: Mon, 1 Jun 2026 17:58:32 -0700
Message-ID: <ah4quNFhfVu2NpY7@x1>
In-Reply-To: <20260601204931.3A9481F00893@smtp.kernel.org>

On Mon, Jun 01, 2026 at 08:49:30PM +0000, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
> > +/*
> > + * Invalidate the cache on offline too. The sentinel persists across the
> > + * offline period, so a CPU brought back online finds the cache already
> > + * invalidated before it is schedulable. This closes the window where a
> > + * task scheduled before riscv_srmcfg_online() runs could match a stale
> > + * cache and skip the CSR write while the hardware CSR was reset across
> > + * hart stop/start.
> > + */
> > +static int riscv_srmcfg_offline(unsigned int cpu)
> > +{
> > +	per_cpu(cpu_srmcfg, cpu) = U32_MAX;
>
> [Severity: High]
> Will this U32_MAX sentinel be overwritten before the CPU actually halts?
>
> Because riscv_srmcfg_offline() executes at CPUHP_AP_ONLINE_DYN, the dying CPU
> continues tearing down and eventually context switches to the idle thread to
> physically offline.
>
> This context switch invokes __switch_to_srmcfg(), which writes the hardware
> CSR and overwrites the cpu_srmcfg cache with the idle task's value,
> permanently losing the U32_MAX sentinel.
>
> When the CPU comes back online, its hardware CSR is in an implementation
> defined reset state, but the software cache holds a valid value. Any early
> tasks sharing this stale value might skip the CSR write and execute with
> incorrect QoS tags until riscv_srmcfg_online() is reached.

This is not a problem. riscv_srmcfg_online() writes U32_MAX again on
bring-up, and it runs at CPUHP_AP_ONLINE_DYN before the CPU becomes
schedulable.
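
To make the cache behaviour under discussion concrete, here is a minimal
userspace sketch of the sentinel logic. It is only a model, not the patch
code: the model_* names, MODEL_NR_CPUS, and the plain arrays standing in
for the per-CPU cache and the hardware CSR are all hypothetical. It also
takes as an assumption that the online callback runs before any task that
relies on the cache, and that no real task ever carries the sentinel
value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_SRMCFG_INVALID UINT32_MAX	/* stands in for the U32_MAX sentinel */
#define MODEL_NR_CPUS 4			/* hypothetical, model only */

/* Userspace stand-ins for the per-CPU software cache and the hardware CSR. */
static uint32_t model_cache[MODEL_NR_CPUS];
static uint32_t model_csr[MODEL_NR_CPUS];

/* Models the cpuhp startup callback: invalidate the cached value. */
static void model_online(unsigned int cpu)
{
	assert(cpu < MODEL_NR_CPUS);
	model_cache[cpu] = MODEL_SRMCFG_INVALID;
}

/* Models the cpuhp teardown callback: invalidate the cached value. */
static void model_offline(unsigned int cpu)
{
	assert(cpu < MODEL_NR_CPUS);
	model_cache[cpu] = MODEL_SRMCFG_INVALID;
}

/* Models hart stop/start: the CSR comes back holding an arbitrary value. */
static void model_hart_reset(unsigned int cpu, uint32_t reset_val)
{
	assert(cpu < MODEL_NR_CPUS);
	model_csr[cpu] = reset_val;
}

/*
 * Models the context-switch path: skip the CSR write when the cache
 * already matches the next task's value. Returns true if the CSR was
 * written.
 */
static bool model_switch_to(unsigned int cpu, uint32_t next)
{
	assert(cpu < MODEL_NR_CPUS);
	if (model_cache[cpu] == next)
		return false;
	model_csr[cpu] = next;
	model_cache[cpu] = next;
	return true;
}
```

In this model, a switch to idle after model_offline() does replace the
sentinel, as the review describes, but a later model_online() restores it,
so the first model_switch_to() after bring-up always writes the CSR.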

> > +	return 0;
> > +}
> > +
> > +/*
> > + * CPU PM notifier: invalidate the cached srmcfg on resume from a deep
> > + * idle / suspend. Ssqosid v1.0 leaves CSR_SRMCFG state across low-power
> > + * transitions implementation-defined, and the boot CPU never goes
> > + * through the cpuhp online callback during system suspend, so without
> > + * this hook __switch_to_srmcfg() would skip the CSR write when the
> > + * outgoing task happens to share its srmcfg with the pre-suspend cache.
> > + */
> > +static int riscv_srmcfg_pm_notify(struct notifier_block *nb,
> > +				  unsigned long action, void *unused)
> > +{
> > +	switch (action) {
> > +	case CPU_PM_EXIT:
> > +	case CPU_PM_ENTER_FAILED:
> > +		__this_cpu_write(cpu_srmcfg, U32_MAX);
>
> [Severity: High]
> Does this notifier need to eagerly restore the hardware CSR for the currently
> executing thread?
>
> It sets cpu_srmcfg to U32_MAX to force a CSR write on the next context
> switch. However, upon waking from low-power states like suspend-to-RAM, the
> CPU does not immediately context switch.
>
> It unwinds the suspend stack, resumes executing the interrupted thread, and
> services pending interrupts.
>
> Because the hardware CSR is not explicitly restored here, all code executing
> between wakeup and the first context switch might run with the hardware's
> implementation defined reset value for CSR_SRMCFG instead of the task's
> assigned QoS tags.

This is not a problem. Before userspace runs on this CPU again,
__switch_to_srmcfg() writes the CSR from the next task's thread.srmcfg.
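
The lazy invalidation described here can be sketched in userspace as
follows. This is a model under stated assumptions rather than the patch
code: struct pm_model_cpu, the pm_model_* functions, and the PM_MODEL_*
action codes are hypothetical stand-ins (the action codes are not the
kernel's enum values), and the CSR is just a struct field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PM_MODEL_INVALID UINT32_MAX	/* stands in for the U32_MAX sentinel */

/* Hypothetical action codes for the model; not the kernel's values. */
enum pm_model_action {
	PM_MODEL_ENTER,
	PM_MODEL_ENTER_FAILED,
	PM_MODEL_EXIT,
};

/* One CPU's software cache plus a stand-in for its hardware CSR. */
struct pm_model_cpu {
	uint32_t cache;
	uint32_t csr;
};

/* Models the PM notifier: only invalidate the cache, never touch the CSR. */
static void pm_model_notify(struct pm_model_cpu *c, enum pm_model_action action)
{
	assert(c);
	switch (action) {
	case PM_MODEL_EXIT:
	case PM_MODEL_ENTER_FAILED:
		c->cache = PM_MODEL_INVALID;
		break;
	default:
		break;
	}
}

/*
 * Models the context-switch path: write the CSR only on a cache
 * mismatch. Returns true if the CSR was written.
 */
static bool pm_model_switch_to(struct pm_model_cpu *c, uint32_t next)
{
	assert(c);
	if (c->cache == next)
		return false;
	c->csr = next;
	c->cache = next;
	return true;
}
```

In the model, the CSR keeps whatever value the hardware came back with
until the first pm_model_switch_to() call after resume; that call always
performs the write because the cache holds the sentinel.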

> > +	/*
> > +	 * cpuhp_setup_state() invokes the startup callback locally on every
> > +	 * already-online CPU, so no separate seed loop is needed here.
> > +	 */
> > +	err = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "riscv/srmcfg:online",
> > +				riscv_srmcfg_online, riscv_srmcfg_offline);
> > +	if (err < 0) {
> > +		pr_warn("srmcfg cpuhp registration failed (%d), cpus brought online after boot will not invalidate the CSR_SRMCFG cache\n",
> > +			err);
> > +		return err;
>
> [Severity: Medium]
> Is it intentional to skip PM notifier registration if CPU hotplug setup fails?
>
> This early return skips cpu_pm_register_notifier() entirely.
>
> While cpuhp_setup_state() failures are rare, PM notifiers do not strictly
> depend on CPU hotplug registration. If this error path is taken, tasks waking
> from suspend could run with incorrect QoS tags because the PM notifier is never
> registered.

A cpuhp_setup_state() failure here is very unlikely, so I will not be
making any changes.

Drew