Re: [PATCH 2/3] sched_ext: Introduce per-NUMA idle cpumasks

From: Andrea Righi <arighi@nvidia.com>
To: Tejun Heo <tj@kernel.org>
Cc: David Vernet <void@manifault.com>,
	Yury Norov <yury.norov@gmail.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/3] sched_ext: Introduce per-NUMA idle cpumasks
Date: Wed, 4 Dec 2024 09:41:43 +0100
Message-ID: <Z1AVx-yUY_37uMCb@gpd3>
In-Reply-To: <Z0-cf7gUzV8jIWIX@slm.duckdns.org>

On Tue, Dec 03, 2024 at 02:04:15PM -1000, Tejun Heo wrote:
> Hello,
> 
> On Tue, Dec 03, 2024 at 04:36:11PM +0100, Andrea Righi wrote:
> ...
> > Probably a better way to solve this issue is to introduce new kfuncs
> > that explicitly select a specific per-NUMA cpumask, and to modify the
> > scx schedulers to transition to this new API, for example:
> >
> >   const struct cpumask *scx_bpf_get_idle_numa_cpumask(int node)
> >   const struct cpumask *scx_bpf_get_idle_numa_smtmask(int node)
> 
> Yeah, I don't think we want to break backward compat here. Can we introduce
> a flag to switch between node-aware and flattened logic and trigger an ops
> error if the wrong flavor is used? Then, we can deprecate and drop the old
> behavior after a few releases. Also, I think it can be named
> scx_bpf_get_idle_cpumask_node().

I like the idea of introducing a flag. The default should be the
flattened cpumask, so everything remains the same, and if a scheduler
explicitly enables SCX_OPS_NUMA_IDLE_MASK (suggestions for the name?)
we can switch to the NUMA-aware idle logic.
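
To make the flag idea concrete, here is a rough userspace model, not
kernel code. SCX_OPS_NUMA_IDLE_MASK is only the tentative name from this
mail and its bit value is arbitrary; the 64-bit toy masks, the ops_error
variable and building the flattened mask by OR-ing the per-node masks
are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Tentative flag name from this thread; the bit value is arbitrary. */
#define SCX_OPS_NUMA_IDLE_MASK	(1ULL << 0)

#define NR_NODES	2

/* Toy stand-ins: one 64-bit word per mask instead of struct cpumask. */
static uint64_t node_idle[NR_NODES];	/* idle CPUs of each node */
static uint64_t ops_flags;		/* flags of the loaded scheduler */
static bool ops_error;			/* models triggering an ops error */

static bool numa_idle_enabled(void)
{
	return (ops_flags & SCX_OPS_NUMA_IDLE_MASK) != 0;
}

/* Flattened flavor: only valid when the flag is NOT set. */
static uint64_t get_idle_cpumask(void)
{
	uint64_t mask = 0;

	if (numa_idle_enabled()) {
		ops_error = true;	/* wrong flavor for this scheduler */
		return 0;
	}
	/* Assumption: the flat view is the union of the per-node masks. */
	for (int node = 0; node < NR_NODES; node++)
		mask |= node_idle[node];
	return mask;
}

/* Node-aware flavor: only valid when the flag IS set. */
static uint64_t get_idle_cpumask_node(int node)
{
	assert(node >= 0 && node < NR_NODES);

	if (!numa_idle_enabled()) {
		ops_error = true;	/* wrong flavor for this scheduler */
		return 0;
	}
	return node_idle[node];
}
```

With the flag clear, only the flattened accessor succeeds, so existing
schedulers keep working unchanged; with the flag set, only the per-node
accessor succeeds and the other one records an error.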

> 
> > +static struct cpumask *get_idle_cpumask(int cpu)
> > +{
> > +     int node = cpu_to_node(cpu);
> > +
> > +     return idle_masks[node]->cpu;
> > +}
> > +
> > +static struct cpumask *get_idle_smtmask(int cpu)
> > +{
> > +     int node = cpu_to_node(cpu);
> > +
> > +     return idle_masks[node]->smt;
> > +}
> 
> Hmm... why are they keyed by cpu? Wouldn't it make more sense to key them by
> node?

I was trying to save some code, but it's definitely clearer to use the
node as the key and rename them to get_idle_cpumask_node() /
get_idle_smtmask_node(). Will change this.
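
Roughly, the node-keyed accessors could look like the sketch below. It
is a userspace model and not the actual patch: the 64-bit words standing
in for struct cpumask, the fixed NR_NODES / CPUS_PER_NODE layout and the
toy cpu_to_node() are assumptions made only so the example compiles on
its own.

```c
#include <assert.h>
#include <stdint.h>

#define NR_NODES	2
#define CPUS_PER_NODE	4

/* Toy per-node idle state: plain 64-bit words, not struct cpumask. */
struct idle_masks {
	uint64_t cpu;	/* idle CPUs of this node */
	uint64_t smt;	/* fully idle SMT cores of this node */
};

static struct idle_masks idle_masks[NR_NODES];

/* Toy topology: CPUs are numbered contiguously within each node. */
static int cpu_to_node(int cpu)
{
	return cpu / CPUS_PER_NODE;
}

/* Keyed by node: the caller decides which node it is interested in. */
static uint64_t *get_idle_cpumask_node(int node)
{
	assert(node >= 0 && node < NR_NODES);
	return &idle_masks[node].cpu;
}

static uint64_t *get_idle_smtmask_node(int node)
{
	assert(node >= 0 && node < NR_NODES);
	return &idle_masks[node].smt;
}
```

A caller that only has a CPU number does the conversion itself, e.g.
get_idle_cpumask_node(cpu_to_node(cpu)), which keeps the accessors
usable from code that already iterates over nodes.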

> 
> > +static s32 scx_pick_idle_cpu(const struct cpumask *cpus_allowed, u64 flags)
> > +{
> > +     int start = cpu_to_node(smp_processor_id());
> > +     int node, cpu;
> > +
> > +     for_each_node_state_wrap(node, N_ONLINE, start) {
> > +             /*
> > +              * scx_pick_idle_cpu_from_node() can be expensive and redundant
> > +              * if none of the CPUs in the NUMA node can be used (according
> > +              * to cpus_allowed).
> > +              *
> > +              * Therefore, check if the NUMA node is usable in advance to
> > +              * save some CPU cycles.
> > +              */
> > +             if (!cpumask_intersects(cpumask_of_node(node), cpus_allowed))
> > +                     continue;
> > +             cpu = scx_pick_idle_cpu_from_node(node, cpus_allowed, flags);
> > +             if (cpu >= 0)
> > +                     return cpu;
> 
> This is fine for now but it'd be ideal if the iteration were in inter-node
> distance order so that the search from each CPU radiates from the local
> node to the furthest ones.

Ok.
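
For reference, a distance-ordered walk could be modeled as in the sketch
below. It is plain userspace C and not a proposal for the actual
implementation: the toy_distance table (SLIT-style values, 10 meaning
local) and the nodes_by_distance() helper are hypothetical, and a real
version would take its distances from the kernel's NUMA topology
information instead.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_NODES	4

/* Hypothetical SLIT-style distance table: 10 means "local node". */
static const int toy_distance[NR_NODES][NR_NODES] = {
	{ 10, 20, 30, 40 },
	{ 20, 10, 40, 30 },
	{ 30, 40, 10, 20 },
	{ 40, 30, 20, 10 },
};

/*
 * Fill order[] with every node, nearest to @start first. Simple O(n^2)
 * selection; ties are resolved in favor of the lowest node number.
 */
static void nodes_by_distance(int start, int order[NR_NODES])
{
	bool visited[NR_NODES] = { false };

	assert(start >= 0 && start < NR_NODES);

	for (int i = 0; i < NR_NODES; i++) {
		int best = -1;

		for (int node = 0; node < NR_NODES; node++) {
			if (visited[node])
				continue;
			if (best < 0 ||
			    toy_distance[start][node] < toy_distance[start][best])
				best = node;
		}
		visited[best] = true;
		order[i] = best;
	}
}
```

The idle CPU search would then try the nodes in order[] one by one
instead of wrapping around by node number, so the local node is always
tried first and the furthest node last.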

Thanks,
-Andrea
