All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrea Righi <arighi@nvidia.com>
To: Yury Norov <yury.norov@gmail.com>
Cc: Tejun Heo <tj@kernel.org>, David Vernet <void@manifault.com>,
	Changwoo Min <changwoo@igalia.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] sched_ext: idle: Skip cross-node search with !CONFIG_NUMA
Date: Wed, 4 Jun 2025 17:07:55 +0200	[thread overview]
Message-ID: <aEBhS-WDH_kaXmVd@gpd4> (raw)
In-Reply-To: <aEBSm7Lm9Gx_anMo@yury>

Hi Yuri,

On Wed, Jun 04, 2025 at 10:05:15AM -0400, Yury Norov wrote:
> Hi Andrea!
> 
> On Tue, Jun 03, 2025 at 10:22:01AM +0200, Andrea Righi wrote:
> > In the idle CPU selection logic, attempting cross-node searches adds
> > unnecessary complexity when CONFIG_NUMA is disabled.
> > 
> > Since there's no meaningful concept of nodes in this case, simplify the
> > logic by restricting the idle CPU search to the current node only.
> > 
> > Fixes: 48849271e6611 ("sched_ext: idle: Per-node idle cpumasks")
> > Signed-off-by: Andrea Righi <arighi@nvidia.com>
> > ---
> >  kernel/sched/ext_idle.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/kernel/sched/ext_idle.c b/kernel/sched/ext_idle.c
> > index 66da03cc0b338..8660d9ae40169 100644
> > --- a/kernel/sched/ext_idle.c
> > +++ b/kernel/sched/ext_idle.c
> > @@ -138,6 +138,7 @@ static s32 pick_idle_cpu_in_node(const struct cpumask *cpus_allowed, int node, u
> >  		goto retry;
> >  }
> >  
> > +#ifdef CONFIG_NUMA
> 
> It would be more natural if you move this inside the function body,
> and not duplicate the function declaration.

I was trying to catch both the function and the per_cpu_unvisited with a
single #ifdef, but I can definitely split that and add another #ifdef
inside the function body.

> 
> >  /*
> >   * Tracks nodes that have not yet been visited when searching for an idle
> >   * CPU across all available nodes.
> > @@ -186,6 +187,13 @@ static s32 pick_idle_cpu_from_online_nodes(const struct cpumask *cpus_allowed, i
> >  
> >  	return cpu;
> >  }
> > +#else
> > +static inline s32
> > +pick_idle_cpu_from_online_nodes(const struct cpumask *cpus_allowed, int node, u64 flags)
> > +{
> > +	return -EBUSY;
> > +}
> 
> This is misleading errno. The system is nut busy, it is disabled. If
> it was a syscall, I would say you should return ENOSYS. ENODATA is
> another candidate. Or you have a special policy for the subsystem/

So, this function is called only from scx_pick_idle_cpu(), that can still
call pick_idle_cpu_from_online_nodes() even on kernels with !CONFIG_NUMA,
if the BPF scheduler enables the per-node idle cpumask (setting the flag
SCX_OPS_BUILTIN_IDLE_PER_NODE).

We can return -ENOSYS, but then we still need to return -EBUSY from
scx_pick_idle_cpu(), since its logic is host-wide, so the choice of -EBUSY
was to be consistent with that.

However, I don't have a strong opinion, if you think it's clearer to return
-ENOSYS/ENODATA from pick_idle_cpu_from_online_nodes() I can change that,
but I'd still return -EBUSY from scx_pick_idle_cpu().

> 
> The above pick_idle_cpu_in_node() doesn't have CONFIG_NUMA protection
> as well. Is it safe against CONFIG_NUMA?

pick_idle_cpu_in_node() is always called with a validated node (when passed
from BPF) or a node from the kernel and idle_cpumask() is handling the
NUMA_NO_NODE case, so that should be fine in theory.

Thanks,
-Andrea

PS Tejun already applied this patch to his tree, so I'll send all the
changes as a followup patch, at least the original bug is fixed. :)

      reply	other threads:[~2025-06-04 15:08 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-03  8:22 [PATCH] sched_ext: idle: Skip cross-node search with !CONFIG_NUMA Andrea Righi
2025-06-03 18:29 ` Tejun Heo
2025-06-04 14:05 ` Yury Norov
2025-06-04 15:07   ` Andrea Righi [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aEBhS-WDH_kaXmVd@gpd4 \
    --to=arighi@nvidia.com \
    --cc=changwoo@igalia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tj@kernel.org \
    --cc=void@manifault.com \
    --cc=yury.norov@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.