All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrea Righi <arighi@nvidia.com>
To: Joseph Salisbury <joseph.salisbury@oracle.com>
Cc: Tejun Heo <tj@kernel.org>, David Vernet <void@manifault.com>,
	Changwoo Min <changwoo@igalia.com>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Valentin Schneider <vschneid@redhat.com>,
	sched-ext@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] sched_ext: idle: honor built-in idle disablement in node kfuncs
Date: Wed, 25 Mar 2026 23:22:35 +0100	[thread overview]
Message-ID: <acRgKzZodHT2Xn1n@gpd4> (raw)
In-Reply-To: <20260324194235.942952-1-joseph.salisbury@oracle.com>

Hi Joe,

On Tue, Mar 24, 2026 at 03:42:35PM -0400, Joseph Salisbury wrote:
> The node-aware idle kfunc helpers validate per-node idle tracking, but they
> don't check whether built-in idle tracking itself is enabled.
> 
> As a result, when ops.update_idle() disables built-in idle tracking, the
> node helpers can still read per-node idle masks and attempt idle CPU
> selection.  This violates the documented behavior and can expose stale
> idle state to BPF schedulers.
> 
> Fix this by checking check_builtin_idle_enabled() in the node mask getters
> and in scx_bpf_pick_idle_cpu_node(), matching the behavior of the non-node
> helpers.
> 
> scx_bpf_pick_any_cpu_node() is different by: when built-in idle
> tracking is disabled, it should skip idle selection and fall back directly
> to the any-CPU path.  Make it do so and match scx_bpf_pick_any_cpu().
> 
> Fixes: 01059219b0cf ("sched_ext: idle: Introduce node-aware idle cpu kfunc helpers")
> Cc: stable@vger.kernel.org # v6.15+
> Assisted-by: Codex:GPT-5
> Signed-off-by: Joseph Salisbury <joseph.salisbury@oracle.com>

We are already validating this at load time, see validate_ops():
...
       /*
         * SCX_OPS_BUILTIN_IDLE_PER_NODE requires built-in CPU idle
         * selection policy to be enabled.
         */
        if ((ops->flags & SCX_OPS_BUILTIN_IDLE_PER_NODE) &&
            (ops->update_idle && !(ops->flags & SCX_OPS_KEEP_BUILTIN_IDLE))) {
                scx_error(sch, "SCX_OPS_BUILTIN_IDLE_PER_NODE requires CPU idle selection enabled");
                return -EINVAL;
        }
...

In practice you can't have SCX_OPS_BUILTIN_IDLE_PER_NODE set without
built-in idle enabled if a scheduler is running and we are checking for
SCX_OPS_BUILTIN_IDLE_PER_NODE in validate_node(). So I think these extra
checks are not needed.

Thanks,
-Andrea

> ---
>  kernel/sched/ext_idle.c | 17 ++++++++++++++---
>  1 file changed, 14 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/sched/ext_idle.c b/kernel/sched/ext_idle.c
> index ba298ac3ce6c..948f6b4f8ab5 100644
> --- a/kernel/sched/ext_idle.c
> +++ b/kernel/sched/ext_idle.c
> @@ -1082,6 +1082,9 @@ __bpf_kfunc const struct cpumask *scx_bpf_get_idle_cpumask_node(int node)
>  	if (node < 0)
>  		return cpu_none_mask;
>  
> +	if (!check_builtin_idle_enabled(sch))
> +		return cpu_none_mask;
> +
>  	return idle_cpumask(node)->cpu;
>  }
>  
> @@ -1137,6 +1140,9 @@ __bpf_kfunc const struct cpumask *scx_bpf_get_idle_smtmask_node(int node)
>  	if (node < 0)
>  		return cpu_none_mask;
>  
> +	if (!check_builtin_idle_enabled(sch))
> +		return cpu_none_mask;
> +
>  	if (sched_smt_active())
>  		return idle_cpumask(node)->smt;
>  	else
> @@ -1253,6 +1259,9 @@ __bpf_kfunc s32 scx_bpf_pick_idle_cpu_node(const struct cpumask *cpus_allowed,
>  	if (node < 0)
>  		return node;
>  
> +	if (!check_builtin_idle_enabled(sch))
> +		return -EBUSY;
> +
>  	return scx_pick_idle_cpu(cpus_allowed, node, flags);
>  }
>  
> @@ -1337,9 +1346,11 @@ __bpf_kfunc s32 scx_bpf_pick_any_cpu_node(const struct cpumask *cpus_allowed,
>  	if (node < 0)
>  		return node;
>  
> -	cpu = scx_pick_idle_cpu(cpus_allowed, node, flags);
> -	if (cpu >= 0)
> -		return cpu;
> +	if (static_branch_likely(&scx_builtin_idle_enabled)) {
> +		cpu = scx_pick_idle_cpu(cpus_allowed, node, flags);
> +		if (cpu >= 0)
> +			return cpu;
> +	}
>  
>  	if (flags & SCX_PICK_IDLE_IN_NODE)
>  		cpu = cpumask_any_and_distribute(cpumask_of_node(node), cpus_allowed);
> -- 
> 2.47.3
> 

  reply	other threads:[~2026-03-25 22:22 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-24 19:42 [PATCH] sched_ext: idle: honor built-in idle disablement in node kfuncs Joseph Salisbury
2026-03-25 22:22 ` Andrea Righi [this message]
2026-03-26 18:18   ` [External] : " Joseph Salisbury

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=acRgKzZodHT2Xn1n@gpd4 \
    --to=arighi@nvidia.com \
    --cc=bsegall@google.com \
    --cc=changwoo@igalia.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=joseph.salisbury@oracle.com \
    --cc=juri.lelli@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=sched-ext@lists.linux.dev \
    --cc=tj@kernel.org \
    --cc=vincent.guittot@linaro.org \
    --cc=void@manifault.com \
    --cc=vschneid@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.