Devicetree
 help / color / mirror / Atom feed
From: Drew Fustini <fustini@kernel.org>
To: sashiko@lists.linux.dev
Cc: conor+dt@kernel.org, devicetree@vger.kernel.org,
	krzk+dt@kernel.org, robh@kernel.org,
	linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org,
	x86@kernel.org
Subject: Re: [PATCH RFC v4 03/18] riscv: add support for srmcfg CSR from Ssqosid extension
Date: Thu, 14 May 2026 15:24:56 -0700	[thread overview]
Message-ID: <agZLuHhsN5ezSmqg@tt-p14-dfustini> (raw)
In-Reply-To: <20260511235300.781C6C2BCB0@smtp.kernel.org>

On Mon, May 11, 2026 at 11:52:59PM +0000, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:
> - [High] Uninitialized Hardware State / Early Boot Bypass for CSR_SRMCFG
> - [High] Resource Control Bypass via Monitoring Group (MCID) evaluation flaw
> - [High] Hardware/Software State Desync on Boot CPU Resume from Suspend
> - [Medium] Redundant loop and Data Race in hotplug initialization
> - [Medium] Initcall returns positive hotplug state ID on success
> --
> 
> commit 0e2697a6f259670c211c47e45e2448410a9ac7b9
> Author: Drew Fustini <fustini@kernel.org>
> 
> riscv: add support for srmcfg CSR from Ssqosid extension
>     
> This commit adds support for the srmcfg CSR defined in the Ssqosid ISA
> extension. It introduces a per-task srmcfg field mirrored to a per-CPU
> cache to optimize context switching and configure resource management IDs.
>     
> Link: https://github.com/riscv/riscv-ssqosid/releases/tag/v1.0
> 
> > diff --git a/arch/riscv/include/asm/qos.h b/arch/riscv/include/asm/qos.h
> > new file mode 100644
> > index 0000000000000..6988fe37551ee
> > --- /dev/null
> > +++ b/arch/riscv/include/asm/qos.h
> > @@ -0,0 +1,64 @@
> [ ... ]
> > +static inline void __switch_to_srmcfg(struct task_struct *next)
> > +{
> > +	u32 thread_srmcfg;
> > +
> > +	thread_srmcfg = READ_ONCE(next->thread.srmcfg);
> > +
> > +	/* Default-group tasks (thread.srmcfg == 0) follow this CPU's default. */
> > +	if (thread_srmcfg == 0)
> > +		thread_srmcfg = __this_cpu_read(cpu_srmcfg_default);
> 
> Does this logic correctly handle tasks assigned to a specific monitoring
> group but the default resource group?
> 
> The srmcfg CSR encodes both the Resource Control ID (RCID) and Monitoring
> Counter ID (MCID).
> 
> If a task is placed in the default resource allocation group (RCID=0) but
> is assigned a specific monitoring group (MCID!=0), its thread.srmcfg will
> be non-zero.
> 
> The thread_srmcfg == 0 check will fail, bypassing the cpu_srmcfg_default
> override. This would allow the task to escape the CPU's restricted RCID
> allocation.
> 
> Could the RCID and MCID fields be extracted and evaluated independently?

Yes, good catch. v5 will rewrite __switch_to_srmcfg() to evaluate RCID
and MCID against cpu_srmcfg_default independently, matching x86 RDT's
__resctrl_sched_in() per-field logic. A task in the default RCID group
with a specific MCID (or vice versa) now correctly inherits the CPU
default for the unassigned dimension.

> 
> > diff --git a/arch/riscv/kernel/qos.c b/arch/riscv/kernel/qos.c
> > new file mode 100644
> > index 0000000000000..75bda2ed89e1f
> > --- /dev/null
> > +++ b/arch/riscv/kernel/qos.c
> > @@ -0,0 +1,49 @@
> [ ... ]
> > +/* cached value of srmcfg csr for each cpu */
> > +DEFINE_PER_CPU(u32, cpu_srmcfg);
> 
> Will this cause the CSR write to be bypassed during early boot?
> 
> DEFINE_PER_CPU places the cache in .bss or implicitly zero-initializes it.
> 
> During early boot, context switches occur to threads like idle or init
> before the arch_initcall executes, and these early threads have
> thread.srmcfg == 0.
> 
> When __switch_to_srmcfg() checks if thread_srmcfg != __this_cpu_read(cpu_srmcfg),
> it will evaluate to 0 != 0 and skip writing the CSR. This means the CPU
> would execute all early boot code with an implementation-defined,
> potentially random hardware CSR state.
> 
> Could we statically initialize the per-CPU variable, for example:
> 
> DEFINE_PER_CPU(u32, cpu_srmcfg) = U32_MAX;

I will fix in v5

> > +
> > +/* default srmcfg value for each cpu, set via resctrl cpu assignment */
> > +DEFINE_PER_CPU(u32, cpu_srmcfg_default);
> > +
> > +/*
> > + * Seed the per-CPU srmcfg cache to a sentinel that no real srmcfg encoding
> > + * can produce (MCID << 16 | RCID, both fields well under 16 bits) so the
> > + * next __switch_to_srmcfg() unconditionally writes the CSR. Ssqosid v1.0
> > + * leaves CSR state across hart stop/start implementation-defined, so the
> > + * cached value cannot be trusted after online.
> > + */
> > +static int riscv_srmcfg_online(unsigned int cpu)
> > +{
> > +	per_cpu(cpu_srmcfg, cpu) = U32_MAX;
> > +	return 0;
> > +}
> 
> How is the boot CPU's cache handled during system resume from suspend?

I will add a CPU PM notifier that invalidates cpu_srmcfg on CPU_PM_EXIT
and CPU_PM_ENTER_FAILED, so the boot CPU also forces a CSR write on
resume.
 
> > +static int __init riscv_srmcfg_init(void)
> > +{
> > +	unsigned int cpu;
> > +	int err;
> > +
> > +	if (!riscv_has_extension_unlikely(RISCV_ISA_EXT_SSQOSID))
> > +		return 0;
> > +
> > +	/* Seed already-online CPUs. The cpuhp callback covers later onlines. */
> > +	for_each_online_cpu(cpu)
> > +		per_cpu(cpu_srmcfg, cpu) = U32_MAX;
> 
> Is this manual iteration over online CPUs necessary?

No, I will drop it. cpuhp_setup_state() already invokes the startup
callback on every already-online CPU.

> > +	err = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "riscv/srmcfg:online",
> > +				riscv_srmcfg_online, NULL);
> > +	if (err < 0)
> > +		pr_warn("srmcfg cpuhp registration failed (%d), cpus brought online after boot will not invalidate the CSR_SRMCFG cache\n",
> > +			err);
> > +	return err;
> > +}
> > +arch_initcall(riscv_srmcfg_init);
> 
> Does this return a false failure during boot?

do_one_initcall() in init/main.c ignores positive return values from
initcalls. I will leave this as-is.

thanks,
drew

  reply	other threads:[~2026-05-14 22:24 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-11  5:10 [PATCH RFC v4 00/18] riscv: add Ssqosid and CBQRI resctrl support Drew Fustini
2026-05-11  5:10 ` [PATCH RFC v4 01/18] dt-bindings: riscv: Add Ssqosid extension description Drew Fustini
2026-05-11  5:10 ` [PATCH RFC v4 02/18] riscv: detect the Ssqosid extension Drew Fustini
2026-05-11  5:10 ` [PATCH RFC v4 03/18] riscv: add support for srmcfg CSR from " Drew Fustini
2026-05-11 23:52   ` sashiko-bot
2026-05-14 22:24     ` Drew Fustini [this message]
2026-05-11  5:11 ` [PATCH RFC v4 04/18] fs/resctrl: Add resctrl_is_membw() helper Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 05/18] fs/resctrl: Add RDT_RESOURCE_MB_MIN and RDT_RESOURCE_MB_WGHT Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 06/18] fs/resctrl: Let bandwidth resources default to min_bw at reset Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 07/18] riscv_cbqri: Add capacity controller probe and allocation device ops Drew Fustini
2026-05-12  1:26   ` sashiko-bot
2026-05-11  5:11 ` [PATCH RFC v4 08/18] riscv_cbqri: Add capacity controller monitoring " Drew Fustini
2026-05-12  1:58   ` sashiko-bot
2026-05-11  5:11 ` [PATCH RFC v4 09/18] riscv_cbqri: Add bandwidth controller probe and allocation " Drew Fustini
2026-05-12  2:29   ` sashiko-bot
2026-05-11  5:11 ` [PATCH RFC v4 10/18] riscv_cbqri: Add bandwidth controller monitoring " Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 11/18] riscv_cbqri: resctrl: Add cache allocation via capacity block mask Drew Fustini
2026-05-12  4:01   ` sashiko-bot
2026-05-11  5:11 ` [PATCH RFC v4 12/18] riscv_cbqri: resctrl: Add L3 cache occupancy monitoring Drew Fustini
2026-05-12  5:00   ` sashiko-bot
2026-05-11  5:11 ` [PATCH RFC v4 13/18] riscv_cbqri: resctrl: Add MB_MIN bandwidth allocation via Rbwb Drew Fustini
2026-05-12  5:42   ` sashiko-bot
2026-05-11  5:11 ` [PATCH RFC v4 14/18] riscv_cbqri: resctrl: Add MB_WGHT bandwidth allocation via Mweight Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 15/18] riscv_cbqri: resctrl: Add mbm_total_bytes bandwidth monitoring Drew Fustini
2026-05-12 21:26   ` sashiko-bot
2026-05-11  5:11 ` [PATCH RFC v4 16/18] ACPI: RISC-V: Parse RISC-V Quality of Service Controller (RQSC) table Drew Fustini
2026-05-12 21:46   ` sashiko-bot
2026-05-11  5:11 ` [PATCH RFC v4 17/18] ACPI: RISC-V: Add support for RISC-V Quality of Service Controller (RQSC) Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 18/18] riscv: enable resctrl filesystem for Ssqosid Drew Fustini
2026-05-12 22:56   ` sashiko-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=agZLuHhsN5ezSmqg@tt-p14-dfustini \
    --to=fustini@kernel.org \
    --cc=conor+dt@kernel.org \
    --cc=devicetree@vger.kernel.org \
    --cc=krzk+dt@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=robh@kernel.org \
    --cc=sashiko@lists.linux.dev \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox