From: Andrea Righi <arighi@nvidia.com>
To: Cheng-Yang Chou <yphbchou0911@gmail.com>
Cc: sched-ext@lists.linux.dev, Tejun Heo <tj@kernel.org>,
	David Vernet <void@manifault.com>,
	Changwoo Min <changwoo@igalia.com>,
	Ching-Chun Huang <jserv@ccns.ncku.edu.tw>,
	Chia-Ping Tsai <chia7712@gmail.com>
Subject: Re: [PATCH] sched_ext: Fix inconsistent NUMA node lookup in scx_select_cpu_dfl()
Date: Sat, 21 Mar 2026 18:45:49 +0100
Message-ID: <ab7ZTcjlC1sAF7su@gpd4>
In-Reply-To: <20260321105503.869337-1-yphbchou0911@gmail.com>

Hi Cheng-Yang,

On Sat, Mar 21, 2026 at 06:54:58PM +0800, Cheng-Yang Chou wrote:
> In the WAKE_SYNC path of scx_select_cpu_dfl(), waker_node was computed
> with cpu_to_node(), while node (for prev_cpu) was computed with
> scx_cpu_node_if_enabled(). When scx_builtin_idle_per_node is disabled,
> node is NUMA_NO_NODE but waker_node would be the actual NUMA node,
> causing two issues:
> 
> 1. The (waker_node == node) check always fails when SCX_PICK_IDLE_IN_NODE
>    is set, preventing the waker CPU optimization from ever triggering.

When scx_builtin_idle_per_node is disabled, SCX_PICK_IDLE_IN_NODE won't be
set, which means !(flags & SCX_PICK_IDLE_IN_NODE) should always be true,
short-circuiting the ||, so the waker_node == node comparison is never
evaluated.
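
The short-circuit argument above can be checked with a small userspace
sketch. It is not kernel code: DEMO_PICK_IDLE_IN_NODE, DEMO_NUMA_NO_NODE,
same_node(), node_check() and demo_short_circuit() are hypothetical names,
and the flag's bit value is made up for illustration. Only the shape of the
condition is taken from the quoted patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; the values are illustrative, not the kernel's. */
#define DEMO_PICK_IDLE_IN_NODE	(1ULL << 0)
#define DEMO_NUMA_NO_NODE	(-1)

/* Number of times the node comparison was actually evaluated. */
static int node_cmp_evals;

/* The right-hand operand of the ||, instrumented with a counter. */
static bool same_node(int waker_node, int node)
{
	node_cmp_evals++;
	return waker_node == node;
}

/* Same shape as the quoted check: !(flags & F) || (waker_node == node). */
static bool node_check(unsigned long long flags, int waker_node, int node)
{
	return !(flags & DEMO_PICK_IDLE_IN_NODE) || same_node(waker_node, node);
}

/* Exercises the flag-clear and flag-set cases. */
static void demo_short_circuit(void)
{
	node_cmp_evals = 0;

	/* Flag clear: the left operand is true, so the comparison is
	 * skipped even though the two node values differ. */
	assert(node_check(0, 1, DEMO_NUMA_NO_NODE));
	assert(node_cmp_evals == 0);

	/* Flag set: the comparison is evaluated and decides the result. */
	assert(!node_check(DEMO_PICK_IDLE_IN_NODE, 1, 0));
	assert(node_check(DEMO_PICK_IDLE_IN_NODE, 1, 1));
	assert(node_cmp_evals == 2);
}
```

Calling demo_short_circuit() from a main() runs the checks. With the flag
clear, a mismatch between a real node ID and the no-node value cannot make
the condition fail, which is the point made about issue 1.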

However, ...

> 2. idle_cpumask(waker_node) is called with a real node ID even though
>    per-node idle tracking is disabled, resulting in undefined behavior.

...this looks like a legit bug. I'm wondering how this fix impacts
performance; I will do some testing. Nice catch!
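
To illustrate issue 2, here is a toy userspace model of the two lookups. It
assumes that the idle-mask lookup falls back to a single global mask when
given the no-node value and otherwise indexes a per-node array. That is an
assumption made for this sketch, not a copy of the kernel helpers. Every
demo_* name, the two-node topology and the mask type are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_NUMA_NO_NODE	(-1)
#define DEMO_NR_NODES		2

struct demo_mask { unsigned long bits; };

/* Models the scx_builtin_idle_per_node switch (assumption). */
static bool demo_per_node_enabled;
static struct demo_mask demo_global_mask;
static struct demo_mask demo_node_masks[DEMO_NR_NODES];

/* Toy topology: CPUs 0-3 on node 0, everything else on node 1. */
static int demo_cpu_to_node(int cpu)
{
	return cpu < 4 ? 0 : 1;
}

/* Returns a real node only when per-node tracking is enabled. */
static int demo_cpu_node_if_enabled(int cpu)
{
	return demo_per_node_enabled ? demo_cpu_to_node(cpu) : DEMO_NUMA_NO_NODE;
}

/* Picks the global mask for "no node", else the per-node mask. */
static struct demo_mask *demo_idle_mask(int node)
{
	if (node == DEMO_NUMA_NO_NODE)
		return &demo_global_mask;
	assert(node >= 0 && node < DEMO_NR_NODES);
	return &demo_node_masks[node];
}

static void demo_consistent_lookup(void)
{
	int cpu = 5;

	/* Per-node tracking off: the guarded lookup selects the global mask. */
	demo_per_node_enabled = false;
	assert(demo_idle_mask(demo_cpu_node_if_enabled(cpu)) == &demo_global_mask);

	/* The raw lookup selects a per-node mask that is not maintained
	 * in this mode; that is the mismatch the patch removes. */
	assert(demo_idle_mask(demo_cpu_to_node(cpu)) == &demo_node_masks[1]);

	/* Per-node tracking on: both lookups agree on the per-node mask. */
	demo_per_node_enabled = true;
	assert(demo_idle_mask(demo_cpu_node_if_enabled(cpu)) == &demo_node_masks[1]);
}
```

In this model the per-node array always exists, so the mismatched lookup is
merely wrong rather than undefined. In the kernel the consequence depends on
how the per-node masks are set up when the feature is disabled.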

> 
> Fix by using scx_cpu_node_if_enabled() for waker_node as well, ensuring
> both variables are computed consistently.
> 
> Fixes: 48849271e6611 ("sched_ext: idle: Per-node idle cpumasks")
> Signed-off-by: Cheng-Yang Chou <yphbchou0911@gmail.com>

Reviewed-by: Andrea Righi <arighi@nvidia.com>

We should also add:
Cc: stable@vger.kernel.org # v6.15+

Thanks!
-Andrea

> ---
>  kernel/sched/ext_idle.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/ext_idle.c b/kernel/sched/ext_idle.c
> index c7e405262697..8436c7df0a56 100644
> --- a/kernel/sched/ext_idle.c
> +++ b/kernel/sched/ext_idle.c
> @@ -543,7 +543,7 @@ s32 scx_select_cpu_dfl(struct task_struct *p, s32 prev_cpu, u64 wake_flags,
>  		 * piled up on it even if there is an idle core elsewhere on
>  		 * the system.
>  		 */
> -		waker_node = cpu_to_node(cpu);
> +		waker_node = scx_cpu_node_if_enabled(cpu);
>  		if (!(current->flags & PF_EXITING) &&
>  		    cpu_rq(cpu)->scx.local_dsq.nr == 0 &&
>  		    (!(flags & SCX_PICK_IDLE_IN_NODE) || (waker_node == node)) &&
> -- 
> 2.48.1
> 

Thread overview: 6+ messages
2026-03-21 10:54 [PATCH] sched_ext: Fix inconsistent NUMA node lookup in scx_select_cpu_dfl() Cheng-Yang Chou
2026-03-21 17:45 ` Andrea Righi [this message]
2026-03-21 18:42 ` Tejun Heo
2026-03-21 19:39   ` Cheng-Yang Chou
2026-03-21 19:38 ` [PATCH v2] " Cheng-Yang Chou
2026-03-22  0:26   ` Tejun Heo
