* [PATCH] sched_ext: idle: honor built-in idle disablement in node kfuncs
@ 2026-03-24 19:42 Joseph Salisbury
2026-03-25 22:22 ` Andrea Righi
0 siblings, 1 reply; 3+ messages in thread
From: Joseph Salisbury @ 2026-03-24 19:42 UTC (permalink / raw)
To: Tejun Heo, David Vernet, Andrea Righi, Changwoo Min
Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
Valentin Schneider, sched-ext, linux-kernel
The node-aware idle kfunc helpers validate per-node idle tracking, but they
don't check whether built-in idle tracking itself is enabled.
As a result, when ops.update_idle() disables built-in idle tracking, the
node helpers can still read per-node idle masks and attempt idle CPU
selection. This violates the documented behavior and can expose stale
idle state to BPF schedulers.
Fix this by checking check_builtin_idle_enabled() in the node mask getters
and in scx_bpf_pick_idle_cpu_node(), matching the behavior of the non-node
helpers.
scx_bpf_pick_any_cpu_node() is handled differently: when built-in idle
tracking is disabled, it should skip idle selection and fall back directly
to the any-CPU path. Make it do so, matching scx_bpf_pick_any_cpu().
Fixes: 01059219b0cf ("sched_ext: idle: Introduce node-aware idle cpu kfunc helpers")
Cc: stable@vger.kernel.org # v6.15+
Assisted-by: Codex:GPT-5
Signed-off-by: Joseph Salisbury <joseph.salisbury@oracle.com>
---
kernel/sched/ext_idle.c | 17 ++++++++++++++---
1 file changed, 14 insertions(+), 3 deletions(-)
diff --git a/kernel/sched/ext_idle.c b/kernel/sched/ext_idle.c
index ba298ac3ce6c..948f6b4f8ab5 100644
--- a/kernel/sched/ext_idle.c
+++ b/kernel/sched/ext_idle.c
@@ -1082,6 +1082,9 @@ __bpf_kfunc const struct cpumask *scx_bpf_get_idle_cpumask_node(int node)
 	if (node < 0)
 		return cpu_none_mask;
 
+	if (!check_builtin_idle_enabled(sch))
+		return cpu_none_mask;
+
 	return idle_cpumask(node)->cpu;
 }
 
@@ -1137,6 +1140,9 @@ __bpf_kfunc const struct cpumask *scx_bpf_get_idle_smtmask_node(int node)
 	if (node < 0)
 		return cpu_none_mask;
 
+	if (!check_builtin_idle_enabled(sch))
+		return cpu_none_mask;
+
 	if (sched_smt_active())
 		return idle_cpumask(node)->smt;
 	else
 
@@ -1253,6 +1259,9 @@ __bpf_kfunc s32 scx_bpf_pick_idle_cpu_node(const struct cpumask *cpus_allowed,
 	if (node < 0)
 		return node;
 
+	if (!check_builtin_idle_enabled(sch))
+		return -EBUSY;
+
 	return scx_pick_idle_cpu(cpus_allowed, node, flags);
 }
 
@@ -1337,9 +1346,11 @@ __bpf_kfunc s32 scx_bpf_pick_any_cpu_node(const struct cpumask *cpus_allowed,
 	if (node < 0)
 		return node;
 
-	cpu = scx_pick_idle_cpu(cpus_allowed, node, flags);
-	if (cpu >= 0)
-		return cpu;
+	if (static_branch_likely(&scx_builtin_idle_enabled)) {
+		cpu = scx_pick_idle_cpu(cpus_allowed, node, flags);
+		if (cpu >= 0)
+			return cpu;
+	}
 
 	if (flags & SCX_PICK_IDLE_IN_NODE)
 		cpu = cpumask_any_and_distribute(cpumask_of_node(node), cpus_allowed);
--
2.47.3
^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] sched_ext: idle: honor built-in idle disablement in node kfuncs
  2026-03-24 19:42 [PATCH] sched_ext: idle: honor built-in idle disablement in node kfuncs Joseph Salisbury
@ 2026-03-25 22:22 ` Andrea Righi
  2026-03-26 18:18   ` [External] : " Joseph Salisbury
  0 siblings, 1 reply; 3+ messages in thread
From: Andrea Righi @ 2026-03-25 22:22 UTC (permalink / raw)
To: Joseph Salisbury
Cc: Tejun Heo, David Vernet, Changwoo Min, Ingo Molnar, Peter Zijlstra,
    Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
    Ben Segall, Mel Gorman, Valentin Schneider, sched-ext, linux-kernel

Hi Joe,

On Tue, Mar 24, 2026 at 03:42:35PM -0400, Joseph Salisbury wrote:
> The node-aware idle kfunc helpers validate per-node idle tracking, but they
> don't check whether built-in idle tracking itself is enabled.
>
> As a result, when ops.update_idle() disables built-in idle tracking, the
> node helpers can still read per-node idle masks and attempt idle CPU
> selection. This violates the documented behavior and can expose stale
> idle state to BPF schedulers.
>
> Fix this by checking check_builtin_idle_enabled() in the node mask getters
> and in scx_bpf_pick_idle_cpu_node(), matching the behavior of the non-node
> helpers.
>
> scx_bpf_pick_any_cpu_node() is handled differently: when built-in idle
> tracking is disabled, it should skip idle selection and fall back directly
> to the any-CPU path. Make it do so, matching scx_bpf_pick_any_cpu().
>
> Fixes: 01059219b0cf ("sched_ext: idle: Introduce node-aware idle cpu kfunc helpers")
> Cc: stable@vger.kernel.org # v6.15+
> Assisted-by: Codex:GPT-5
> Signed-off-by: Joseph Salisbury <joseph.salisbury@oracle.com>

We are already validating this at load time, see validate_ops():

...
	/*
	 * SCX_OPS_BUILTIN_IDLE_PER_NODE requires built-in CPU idle
	 * selection policy to be enabled.
	 */
	if ((ops->flags & SCX_OPS_BUILTIN_IDLE_PER_NODE) &&
	    (ops->update_idle && !(ops->flags & SCX_OPS_KEEP_BUILTIN_IDLE))) {
		scx_error(sch, "SCX_OPS_BUILTIN_IDLE_PER_NODE requires CPU idle selection enabled");
		return -EINVAL;
	}
...

In practice you can't have SCX_OPS_BUILTIN_IDLE_PER_NODE set without
built-in idle enabled if a scheduler is running, and we are checking for
SCX_OPS_BUILTIN_IDLE_PER_NODE in validate_node(). So I think these extra
checks are not needed.

Thanks,
-Andrea

> ---
>  kernel/sched/ext_idle.c | 17 ++++++++++++++---
>  1 file changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/sched/ext_idle.c b/kernel/sched/ext_idle.c
> index ba298ac3ce6c..948f6b4f8ab5 100644
> --- a/kernel/sched/ext_idle.c
> +++ b/kernel/sched/ext_idle.c
> @@ -1082,6 +1082,9 @@ __bpf_kfunc const struct cpumask *scx_bpf_get_idle_cpumask_node(int node)
>  	if (node < 0)
>  		return cpu_none_mask;
>
> +	if (!check_builtin_idle_enabled(sch))
> +		return cpu_none_mask;
> +
>  	return idle_cpumask(node)->cpu;
>  }
>
> @@ -1137,6 +1140,9 @@ __bpf_kfunc const struct cpumask *scx_bpf_get_idle_smtmask_node(int node)
>  	if (node < 0)
>  		return cpu_none_mask;
>
> +	if (!check_builtin_idle_enabled(sch))
> +		return cpu_none_mask;
> +
>  	if (sched_smt_active())
>  		return idle_cpumask(node)->smt;
>  	else
>
> @@ -1253,6 +1259,9 @@ __bpf_kfunc s32 scx_bpf_pick_idle_cpu_node(const struct cpumask *cpus_allowed,
>  	if (node < 0)
>  		return node;
>
> +	if (!check_builtin_idle_enabled(sch))
> +		return -EBUSY;
> +
>  	return scx_pick_idle_cpu(cpus_allowed, node, flags);
>  }
>
> @@ -1337,9 +1346,11 @@ __bpf_kfunc s32 scx_bpf_pick_any_cpu_node(const struct cpumask *cpus_allowed,
>  	if (node < 0)
>  		return node;
>
> -	cpu = scx_pick_idle_cpu(cpus_allowed, node, flags);
> -	if (cpu >= 0)
> -		return cpu;
> +	if (static_branch_likely(&scx_builtin_idle_enabled)) {
> +		cpu = scx_pick_idle_cpu(cpus_allowed, node, flags);
> +		if (cpu >= 0)
> +			return cpu;
> +	}
>
>  	if (flags & SCX_PICK_IDLE_IN_NODE)
>  		cpu = cpumask_any_and_distribute(cpumask_of_node(node), cpus_allowed);
> --
> 2.47.3

^ permalink raw reply	[flat|nested] 3+ messages in thread
* Re: [External] : Re: [PATCH] sched_ext: idle: honor built-in idle disablement in node kfuncs
  2026-03-25 22:22 ` Andrea Righi
@ 2026-03-26 18:18   ` Joseph Salisbury
  0 siblings, 0 replies; 3+ messages in thread
From: Joseph Salisbury @ 2026-03-26 18:18 UTC (permalink / raw)
To: Andrea Righi
Cc: Tejun Heo, David Vernet, Changwoo Min, Ingo Molnar, Peter Zijlstra,
    Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
    Ben Segall, Mel Gorman, Valentin Schneider, sched-ext, linux-kernel

On 3/25/26 6:22 PM, Andrea Righi wrote:
> Hi Joe,
>
> On Tue, Mar 24, 2026 at 03:42:35PM -0400, Joseph Salisbury wrote:
>> The node-aware idle kfunc helpers validate per-node idle tracking, but they
>> don't check whether built-in idle tracking itself is enabled.
>>
>> As a result, when ops.update_idle() disables built-in idle tracking, the
>> node helpers can still read per-node idle masks and attempt idle CPU
>> selection. This violates the documented behavior and can expose stale
>> idle state to BPF schedulers.
>>
>> Fix this by checking check_builtin_idle_enabled() in the node mask getters
>> and in scx_bpf_pick_idle_cpu_node(), matching the behavior of the non-node
>> helpers.
>>
>> scx_bpf_pick_any_cpu_node() is handled differently: when built-in idle
>> tracking is disabled, it should skip idle selection and fall back directly
>> to the any-CPU path. Make it do so, matching scx_bpf_pick_any_cpu().
>>
>> Fixes: 01059219b0cf ("sched_ext: idle: Introduce node-aware idle cpu kfunc helpers")
>> Cc: stable@vger.kernel.org # v6.15+
>> Assisted-by: Codex:GPT-5
>> Signed-off-by: Joseph Salisbury <joseph.salisbury@oracle.com>
>
> We are already validating this at load time, see validate_ops():
>
> ...
> 	/*
> 	 * SCX_OPS_BUILTIN_IDLE_PER_NODE requires built-in CPU idle
> 	 * selection policy to be enabled.
> 	 */
> 	if ((ops->flags & SCX_OPS_BUILTIN_IDLE_PER_NODE) &&
> 	    (ops->update_idle && !(ops->flags & SCX_OPS_KEEP_BUILTIN_IDLE))) {
> 		scx_error(sch, "SCX_OPS_BUILTIN_IDLE_PER_NODE requires CPU idle selection enabled");
> 		return -EINVAL;
> 	}
> ...
>
> In practice you can't have SCX_OPS_BUILTIN_IDLE_PER_NODE set without
> built-in idle enabled if a scheduler is running, and we are checking for
> SCX_OPS_BUILTIN_IDLE_PER_NODE in validate_node(). So I think these extra
> checks are not needed.
>
> Thanks,
> -Andrea

Hi Andrea,

Thanks for the review. I missed the validate_ops() check and focused on
the helper-side behavior. SCX_OPS_BUILTIN_IDLE_PER_NODE is rejected when
ops.update_idle() disables built-in idle unless SCX_OPS_KEEP_BUILTIN_IDLE
is set, so the state I was trying to guard is not reachable for a running
scheduler. That makes the added checks unnecessary.

I thought scx_bpf_pick_any_cpu_node() should mirror scx_bpf_pick_any_cpu()
and fall back when built-in idle is disabled. However, that only makes
sense for the non-node helper, where built-in idle disabled is a valid
configuration. For the per-node case, validate_ops() rejects that
combination, so there is no runtime case to handle.

While looking at this, I noticed the comment above
scx_bpf_pick_any_cpu_node() still describes that unreachable
built-in-idle-disabled fallback case (line 1364 in ext_idle.c in
mainline). I can create a comment-only cleanup to align the comment with
the current behavior. Do you think that is worth sending? Maybe something
like this:

- * If ops.update_idle() is implemented and %SCX_OPS_KEEP_BUILTIN_IDLE is not
- * set, this function can't tell which CPUs are idle and will always pick any
- * CPU.
+ * %SCX_OPS_BUILTIN_IDLE_PER_NODE requires built-in idle tracking, so
+ * this helper always attempts node-aware idle selection before falling
+ * back to picking any CPU.

Thanks for the explanation, and sorry for the noise.

Thanks,
Joe

>> ---
>>  kernel/sched/ext_idle.c | 17 ++++++++++++++---
>>  1 file changed, 14 insertions(+), 3 deletions(-)
>>
>> diff --git a/kernel/sched/ext_idle.c b/kernel/sched/ext_idle.c
>> index ba298ac3ce6c..948f6b4f8ab5 100644
>> --- a/kernel/sched/ext_idle.c
>> +++ b/kernel/sched/ext_idle.c
>> @@ -1082,6 +1082,9 @@ __bpf_kfunc const struct cpumask *scx_bpf_get_idle_cpumask_node(int node)
>>  	if (node < 0)
>>  		return cpu_none_mask;
>>
>> +	if (!check_builtin_idle_enabled(sch))
>> +		return cpu_none_mask;
>> +
>>  	return idle_cpumask(node)->cpu;
>>  }
>>
>> @@ -1137,6 +1140,9 @@ __bpf_kfunc const struct cpumask *scx_bpf_get_idle_smtmask_node(int node)
>>  	if (node < 0)
>>  		return cpu_none_mask;
>>
>> +	if (!check_builtin_idle_enabled(sch))
>> +		return cpu_none_mask;
>> +
>>  	if (sched_smt_active())
>>  		return idle_cpumask(node)->smt;
>>  	else
>>
>> @@ -1253,6 +1259,9 @@ __bpf_kfunc s32 scx_bpf_pick_idle_cpu_node(const struct cpumask *cpus_allowed,
>>  	if (node < 0)
>>  		return node;
>>
>> +	if (!check_builtin_idle_enabled(sch))
>> +		return -EBUSY;
>> +
>>  	return scx_pick_idle_cpu(cpus_allowed, node, flags);
>>  }
>>
>> @@ -1337,9 +1346,11 @@ __bpf_kfunc s32 scx_bpf_pick_any_cpu_node(const struct cpumask *cpus_allowed,
>>  	if (node < 0)
>>  		return node;
>>
>> -	cpu = scx_pick_idle_cpu(cpus_allowed, node, flags);
>> -	if (cpu >= 0)
>> -		return cpu;
>> +	if (static_branch_likely(&scx_builtin_idle_enabled)) {
>> +		cpu = scx_pick_idle_cpu(cpus_allowed, node, flags);
>> +		if (cpu >= 0)
>> +			return cpu;
>> +	}
>>
>>  	if (flags & SCX_PICK_IDLE_IN_NODE)
>>  		cpu = cpumask_any_and_distribute(cpumask_of_node(node), cpus_allowed);
>> --
>> 2.47.3

^ permalink raw reply	[flat|nested] 3+ messages in thread