All of lore.kernel.org
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Drew Fustini" <fustini@kernel.org>
Cc: conor+dt@kernel.org, devicetree@vger.kernel.org,
	krzk+dt@kernel.org, robh@kernel.org
Subject: Re: [PATCH RFC v4 03/18] riscv: add support for srmcfg CSR from Ssqosid extension
Date: Mon, 11 May 2026 23:52:59 +0000	[thread overview]
Message-ID: <20260511235300.781C6C2BCB0@smtp.kernel.org> (raw)
In-Reply-To: <20260510-ssqosid-cbqri-rqsc-v7-0-v4-3-eb53831ef683@kernel.org>

Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:
- [High] Uninitialized Hardware State / Early Boot Bypass for CSR_SRMCFG
- [High] Resource Control Bypass via Monitoring Group (MCID) evaluation flaw
- [High] Hardware/Software State Desync on Boot CPU Resume from Suspend
- [Medium] Redundant loop and Data Race in hotplug initialization
- [Medium] Initcall returns positive hotplug state ID on success
--

commit 0e2697a6f259670c211c47e45e2448410a9ac7b9
Author: Drew Fustini <fustini@kernel.org>

riscv: add support for srmcfg CSR from Ssqosid extension
    
This commit adds support for the srmcfg CSR defined in the Ssqosid ISA
extension. It introduces a per-task srmcfg field mirrored to a per-CPU
cache to optimize context switching and configure resource management IDs.
    
Link: https://github.com/riscv/riscv-ssqosid/releases/tag/v1.0

> diff --git a/arch/riscv/include/asm/qos.h b/arch/riscv/include/asm/qos.h
> new file mode 100644
> index 0000000000000..6988fe37551ee
> --- /dev/null
> +++ b/arch/riscv/include/asm/qos.h
> @@ -0,0 +1,64 @@
[ ... ]
> +static inline void __switch_to_srmcfg(struct task_struct *next)
> +{
> +	u32 thread_srmcfg;
> +
> +	thread_srmcfg = READ_ONCE(next->thread.srmcfg);
> +
> +	/* Default-group tasks (thread.srmcfg == 0) follow this CPU's default. */
> +	if (thread_srmcfg == 0)
> +		thread_srmcfg = __this_cpu_read(cpu_srmcfg_default);

Does this logic correctly handle tasks assigned to a specific monitoring
group but the default resource group?

The srmcfg CSR encodes both the Resource Control ID (RCID) and Monitoring
Counter ID (MCID).

If a task is placed in the default resource allocation group (RCID=0) but
is assigned a specific monitoring group (MCID!=0), its thread.srmcfg will
be non-zero.

The thread_srmcfg == 0 check will fail, bypassing the cpu_srmcfg_default
override. This would allow the task to escape the CPU's restricted RCID
allocation.

Could the RCID and MCID fields be extracted and evaluated independently?

> diff --git a/arch/riscv/kernel/qos.c b/arch/riscv/kernel/qos.c
> new file mode 100644
> index 0000000000000..75bda2ed89e1f
> --- /dev/null
> +++ b/arch/riscv/kernel/qos.c
> @@ -0,0 +1,49 @@
[ ... ]
> +/* cached value of srmcfg csr for each cpu */
> +DEFINE_PER_CPU(u32, cpu_srmcfg);

Will this cause the CSR write to be bypassed during early boot?

DEFINE_PER_CPU places the cache in .bss or implicitly zero-initializes it.

During early boot, context switches occur to threads like idle or init
before the arch_initcall executes, and these early threads have
thread.srmcfg == 0.

When __switch_to_srmcfg() checks if thread_srmcfg != __this_cpu_read(cpu_srmcfg),
it will evaluate to 0 != 0 and skip writing the CSR. This means the CPU
would execute all early boot code with an implementation-defined,
potentially random hardware CSR state.

Could we statically initialize the per-CPU variable, for example:

DEFINE_PER_CPU(u32, cpu_srmcfg) = U32_MAX;

> +
> +/* default srmcfg value for each cpu, set via resctrl cpu assignment */
> +DEFINE_PER_CPU(u32, cpu_srmcfg_default);
> +
> +/*
> + * Seed the per-CPU srmcfg cache to a sentinel that no real srmcfg encoding
> + * can produce (MCID << 16 | RCID, both fields well under 16 bits) so the
> + * next __switch_to_srmcfg() unconditionally writes the CSR. Ssqosid v1.0
> + * leaves CSR state across hart stop/start implementation-defined, so the
> + * cached value cannot be trusted after online.
> + */
> +static int riscv_srmcfg_online(unsigned int cpu)
> +{
> +	per_cpu(cpu_srmcfg, cpu) = U32_MAX;
> +	return 0;
> +}

How is the boot CPU's cache handled during system resume from suspend?

During system suspend, non-boot CPUs are hotplugged out, but the boot CPU
remains online and enters sleep. The hardware CSR register loses its state
during sleep, reverting to an undefined power-on state.

Because the boot CPU is never offlined, its cpu_srmcfg software cache is
not invalidated by the hotplug callback upon resume.

If a task is scheduled with an srmcfg matching the pre-suspend cache,
__switch_to_srmcfg() will falsely assume the hardware CSR is valid and
skip the write, leaving the boot CPU executing with an invalid hardware tag.

> +
> +static int __init riscv_srmcfg_init(void)
> +{
> +	unsigned int cpu;
> +	int err;
> +
> +	if (!riscv_has_extension_unlikely(RISCV_ISA_EXT_SSQOSID))
> +		return 0;
> +
> +	/* Seed already-online CPUs. The cpuhp callback covers later onlines. */
> +	for_each_online_cpu(cpu)
> +		per_cpu(cpu_srmcfg, cpu) = U32_MAX;

Is this manual iteration over online CPUs necessary?

The cpuhp_setup_state() function used below implicitly invokes the
callback locally on all already-online CPUs. Iterating here is redundant
and might cause a data race with concurrent context switches.

> +
> +	err = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "riscv/srmcfg:online",
> +				riscv_srmcfg_online, NULL);
> +	if (err < 0)
> +		pr_warn("srmcfg cpuhp registration failed (%d), cpus brought online after boot will not invalidate the CSR_SRMCFG cache\n",
> +			err);
> +	return err;
> +}
> +arch_initcall(riscv_srmcfg_init);

Does this return a false failure during boot?

When cpuhp_setup_state() dynamically allocates a hotplug state and succeeds,
it returns a positive integer ID.

Returning this positive value from an arch_initcall will be interpreted by
the kernel's initcall infrastructure as a failure, generating a misleading
boot warning like "initcall returned with error code".

Should this evaluate the return value and explicitly return 0 when err >= 0?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260510-ssqosid-cbqri-rqsc-v7-0-v4-0-eb53831ef683@kernel.org?part=3

  reply	other threads:[~2026-05-11 23:53 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-11  5:10 [PATCH RFC v4 00/18] riscv: add Ssqosid and CBQRI resctrl support Drew Fustini
2026-05-11  5:10 ` Drew Fustini
2026-05-11  5:10 ` [PATCH RFC v4 01/18] dt-bindings: riscv: Add Ssqosid extension description Drew Fustini
2026-05-11  5:10   ` Drew Fustini
2026-05-11  5:10 ` [PATCH RFC v4 02/18] riscv: detect the Ssqosid extension Drew Fustini
2026-05-11  5:10   ` Drew Fustini
2026-05-11  5:10 ` [PATCH RFC v4 03/18] riscv: add support for srmcfg CSR from " Drew Fustini
2026-05-11  5:10   ` Drew Fustini
2026-05-11 23:52   ` sashiko-bot [this message]
2026-05-14 22:24     ` Drew Fustini
2026-05-14 22:24       ` Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 04/18] fs/resctrl: Add resctrl_is_membw() helper Drew Fustini
2026-05-11  5:11   ` Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 05/18] fs/resctrl: Add RDT_RESOURCE_MB_MIN and RDT_RESOURCE_MB_WGHT Drew Fustini
2026-05-11  5:11   ` Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 06/18] fs/resctrl: Let bandwidth resources default to min_bw at reset Drew Fustini
2026-05-11  5:11   ` Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 07/18] riscv_cbqri: Add capacity controller probe and allocation device ops Drew Fustini
2026-05-11  5:11   ` Drew Fustini
2026-05-12  1:26   ` sashiko-bot
2026-05-15 13:46     ` Drew Fustini
2026-05-15 13:46       ` Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 08/18] riscv_cbqri: Add capacity controller monitoring " Drew Fustini
2026-05-11  5:11   ` Drew Fustini
2026-05-12  1:58   ` sashiko-bot
2026-05-11  5:11 ` [PATCH RFC v4 09/18] riscv_cbqri: Add bandwidth controller probe and allocation " Drew Fustini
2026-05-11  5:11   ` Drew Fustini
2026-05-12  2:29   ` sashiko-bot
2026-05-11  5:11 ` [PATCH RFC v4 10/18] riscv_cbqri: Add bandwidth controller monitoring " Drew Fustini
2026-05-11  5:11   ` Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 11/18] riscv_cbqri: resctrl: Add cache allocation via capacity block mask Drew Fustini
2026-05-11  5:11   ` Drew Fustini
2026-05-12  4:01   ` sashiko-bot
2026-05-11  5:11 ` [PATCH RFC v4 12/18] riscv_cbqri: resctrl: Add L3 cache occupancy monitoring Drew Fustini
2026-05-11  5:11   ` Drew Fustini
2026-05-12  5:00   ` sashiko-bot
2026-05-11  5:11 ` [PATCH RFC v4 13/18] riscv_cbqri: resctrl: Add MB_MIN bandwidth allocation via Rbwb Drew Fustini
2026-05-11  5:11   ` Drew Fustini
2026-05-12  5:42   ` sashiko-bot
2026-05-11  5:11 ` [PATCH RFC v4 14/18] riscv_cbqri: resctrl: Add MB_WGHT bandwidth allocation via Mweight Drew Fustini
2026-05-11  5:11   ` Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 15/18] riscv_cbqri: resctrl: Add mbm_total_bytes bandwidth monitoring Drew Fustini
2026-05-11  5:11   ` Drew Fustini
2026-05-12 21:26   ` sashiko-bot
2026-05-11  5:11 ` [PATCH RFC v4 16/18] ACPI: RISC-V: Parse RISC-V Quality of Service Controller (RQSC) table Drew Fustini
2026-05-11  5:11   ` Drew Fustini
2026-05-12 21:46   ` sashiko-bot
2026-05-11  5:11 ` [PATCH RFC v4 17/18] ACPI: RISC-V: Add support for RISC-V Quality of Service Controller (RQSC) Drew Fustini
2026-05-11  5:11   ` Drew Fustini
2026-05-11  5:11 ` [PATCH RFC v4 18/18] riscv: enable resctrl filesystem for Ssqosid Drew Fustini
2026-05-11  5:11   ` Drew Fustini
2026-05-12 22:56   ` sashiko-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260511235300.781C6C2BCB0@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=conor+dt@kernel.org \
    --cc=devicetree@vger.kernel.org \
    --cc=fustini@kernel.org \
    --cc=krzk+dt@kernel.org \
    --cc=robh@kernel.org \
    --cc=sashiko@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.