* [PATCH] sched_ext: idle: Skip cross-node search with !CONFIG_NUMA @ 2025-06-03 8:22 Andrea Righi 2025-06-03 18:29 ` Tejun Heo 2025-06-04 14:05 ` Yury Norov 0 siblings, 2 replies; 4+ messages in thread From: Andrea Righi @ 2025-06-03 8:22 UTC (permalink / raw) To: Tejun Heo, David Vernet, Changwoo Min; +Cc: Yury Norov, linux-kernel In the idle CPU selection logic, attempting cross-node searches adds unnecessary complexity when CONFIG_NUMA is disabled. Since there's no meaningful concept of nodes in this case, simplify the logic by restricting the idle CPU search to the current node only. Fixes: 48849271e6611 ("sched_ext: idle: Per-node idle cpumasks") Signed-off-by: Andrea Righi <arighi@nvidia.com> --- kernel/sched/ext_idle.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/kernel/sched/ext_idle.c b/kernel/sched/ext_idle.c index 66da03cc0b338..8660d9ae40169 100644 --- a/kernel/sched/ext_idle.c +++ b/kernel/sched/ext_idle.c @@ -138,6 +138,7 @@ static s32 pick_idle_cpu_in_node(const struct cpumask *cpus_allowed, int node, u goto retry; } +#ifdef CONFIG_NUMA /* * Tracks nodes that have not yet been visited when searching for an idle * CPU across all available nodes. @@ -186,6 +187,13 @@ static s32 pick_idle_cpu_from_online_nodes(const struct cpumask *cpus_allowed, i return cpu; } +#else +static inline s32 +pick_idle_cpu_from_online_nodes(const struct cpumask *cpus_allowed, int node, u64 flags) +{ + return -EBUSY; +} +#endif /* * Find an idle CPU in the system, starting from @node. -- 2.49.0 ^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] sched_ext: idle: Skip cross-node search with !CONFIG_NUMA 2025-06-03 8:22 [PATCH] sched_ext: idle: Skip cross-node search with !CONFIG_NUMA Andrea Righi @ 2025-06-03 18:29 ` Tejun Heo 2025-06-04 14:05 ` Yury Norov 1 sibling, 0 replies; 4+ messages in thread From: Tejun Heo @ 2025-06-03 18:29 UTC (permalink / raw) To: Andrea Righi; +Cc: David Vernet, Changwoo Min, Yury Norov, linux-kernel On Tue, Jun 03, 2025 at 10:22:01AM +0200, Andrea Righi wrote: > In the idle CPU selection logic, attempting cross-node searches adds > unnecessary complexity when CONFIG_NUMA is disabled. > > Since there's no meaningful concept of nodes in this case, simplify the > logic by restricting the idle CPU search to the current node only. > > Fixes: 48849271e6611 ("sched_ext: idle: Per-node idle cpumasks") > Signed-off-by: Andrea Righi <arighi@nvidia.com> Applied to sched_ext/for-6.16-fixes. Thanks. -- tejun ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] sched_ext: idle: Skip cross-node search with !CONFIG_NUMA 2025-06-03 8:22 [PATCH] sched_ext: idle: Skip cross-node search with !CONFIG_NUMA Andrea Righi 2025-06-03 18:29 ` Tejun Heo @ 2025-06-04 14:05 ` Yury Norov 2025-06-04 15:07 ` Andrea Righi 1 sibling, 1 reply; 4+ messages in thread From: Yury Norov @ 2025-06-04 14:05 UTC (permalink / raw) To: Andrea Righi; +Cc: Tejun Heo, David Vernet, Changwoo Min, linux-kernel Hi Andrea! On Tue, Jun 03, 2025 at 10:22:01AM +0200, Andrea Righi wrote: > In the idle CPU selection logic, attempting cross-node searches adds > unnecessary complexity when CONFIG_NUMA is disabled. > > Since there's no meaningful concept of nodes in this case, simplify the > logic by restricting the idle CPU search to the current node only. > > Fixes: 48849271e6611 ("sched_ext: idle: Per-node idle cpumasks") > Signed-off-by: Andrea Righi <arighi@nvidia.com> > --- > kernel/sched/ext_idle.c | 8 ++++++++ > 1 file changed, 8 insertions(+) > > diff --git a/kernel/sched/ext_idle.c b/kernel/sched/ext_idle.c > index 66da03cc0b338..8660d9ae40169 100644 > --- a/kernel/sched/ext_idle.c > +++ b/kernel/sched/ext_idle.c > @@ -138,6 +138,7 @@ static s32 pick_idle_cpu_in_node(const struct cpumask *cpus_allowed, int node, u > goto retry; > } > > +#ifdef CONFIG_NUMA It would be more natural if you move this inside the function body, and not duplicate the function declaration. > /* > * Tracks nodes that have not yet been visited when searching for an idle > * CPU across all available nodes. > @@ -186,6 +187,13 @@ static s32 pick_idle_cpu_from_online_nodes(const struct cpumask *cpus_allowed, i > > return cpu; > } > +#else > +static inline s32 > +pick_idle_cpu_from_online_nodes(const struct cpumask *cpus_allowed, int node, u64 flags) > +{ > + return -EBUSY; > +} This is misleading errno. The system is nut busy, it is disabled. If it was a syscall, I would say you should return ENOSYS. ENODATA is another candidate. Or you have a special policy for the subsystem/ The above pick_idle_cpu_in_node() doesn't have CONFIG_NUMA protection as well. Is it safe against CONFIG_NUMA? > +#endif > > /* > * Find an idle CPU in the system, starting from @node. > -- > 2.49.0 ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] sched_ext: idle: Skip cross-node search with !CONFIG_NUMA 2025-06-04 14:05 ` Yury Norov @ 2025-06-04 15:07 ` Andrea Righi 0 siblings, 0 replies; 4+ messages in thread From: Andrea Righi @ 2025-06-04 15:07 UTC (permalink / raw) To: Yury Norov; +Cc: Tejun Heo, David Vernet, Changwoo Min, linux-kernel Hi Yuri, On Wed, Jun 04, 2025 at 10:05:15AM -0400, Yury Norov wrote: > Hi Andrea! > > On Tue, Jun 03, 2025 at 10:22:01AM +0200, Andrea Righi wrote: > > In the idle CPU selection logic, attempting cross-node searches adds > > unnecessary complexity when CONFIG_NUMA is disabled. > > > > Since there's no meaningful concept of nodes in this case, simplify the > > logic by restricting the idle CPU search to the current node only. > > > > Fixes: 48849271e6611 ("sched_ext: idle: Per-node idle cpumasks") > > Signed-off-by: Andrea Righi <arighi@nvidia.com> > > --- > > kernel/sched/ext_idle.c | 8 ++++++++ > > 1 file changed, 8 insertions(+) > > > > diff --git a/kernel/sched/ext_idle.c b/kernel/sched/ext_idle.c > > index 66da03cc0b338..8660d9ae40169 100644 > > --- a/kernel/sched/ext_idle.c > > +++ b/kernel/sched/ext_idle.c > > @@ -138,6 +138,7 @@ static s32 pick_idle_cpu_in_node(const struct cpumask *cpus_allowed, int node, u > > goto retry; > > } > > > > +#ifdef CONFIG_NUMA > > It would be more natural if you move this inside the function body, > and not duplicate the function declaration. I was trying to catch both the function and the per_cpu_unvisited with a single #ifdef, but I can definitely split that and add another #ifdef inside the function body. > > > /* > > * Tracks nodes that have not yet been visited when searching for an idle > > * CPU across all available nodes. > > @@ -186,6 +187,13 @@ static s32 pick_idle_cpu_from_online_nodes(const struct cpumask *cpus_allowed, i > > > > return cpu; > > } > > +#else > > +static inline s32 > > +pick_idle_cpu_from_online_nodes(const struct cpumask *cpus_allowed, int node, u64 flags) > > +{ > > + return -EBUSY; > > +} > > This is misleading errno. The system is nut busy, it is disabled. If > it was a syscall, I would say you should return ENOSYS. ENODATA is > another candidate. Or you have a special policy for the subsystem/ So, this function is called only from scx_pick_idle_cpu(), that can still call pick_idle_cpu_from_online_nodes() even on kernels with !CONFIG_NUMA, if the BPF scheduler enables the per-node idle cpumask (setting the flag SCX_OPS_BUILTIN_IDLE_PER_NODE). We can return -ENOSYS, but then we still need to return -EBUSY from scx_pick_idle_cpu(), since its logic is host-wide, so the choice of -EBUSY was to be consistent with that. However, I don't have a strong opinion, if you think it's clearer to return -ENOSYS/ENODATA from pick_idle_cpu_from_online_nodes() I can change that, but I'd still return -EBUSY from scx_pick_idle_cpu(). > > The above pick_idle_cpu_in_node() doesn't have CONFIG_NUMA protection > as well. Is it safe against CONFIG_NUMA? pick_idle_cpu_in_node() is always called with a validated node (when passed from BPF) or a node from the kernel and idle_cpumask() is handling the NUMA_NO_NODE case, so that should be fine in theory. Thanks, -Andrea PS Tejun already applied this patch to his tree, so I'll send all the changes as a followup patch, at least the original bug is fixed. :) ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2025-06-04 15:08 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-06-03 8:22 [PATCH] sched_ext: idle: Skip cross-node search with !CONFIG_NUMA Andrea Righi 2025-06-03 18:29 ` Tejun Heo 2025-06-04 14:05 ` Yury Norov 2025-06-04 15:07 ` Andrea Righi
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).