Re: [PATCH 2/3] sched_ext: Introduce per-NUMA idle cpumasks

From: Yury Norov <yury.norov@gmail.com>
To: Tejun Heo <tj@kernel.org>
Cc: Andrea Righi <arighi@nvidia.com>,
	David Vernet <void@manifault.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/3] sched_ext: Introduce per-NUMA idle cpumasks
Date: Tue, 3 Dec 2024 16:38:58 -0800
Message-ID: <Z0-kovS-Ba9CaP9J@yury-ThinkPad>
In-Reply-To: <Z0-cf7gUzV8jIWIX@slm.duckdns.org>

On Tue, Dec 03, 2024 at 02:04:15PM -1000, Tejun Heo wrote:
> Hello,
> 
> On Tue, Dec 03, 2024 at 04:36:11PM +0100, Andrea Righi wrote:
> ...
> > Probably a better way to solve this issue is to introduce new kfunc's to
> > explicitly select specific per-NUMA cpumask and modify the scx
> > schedulers to transition to this new API, for example:
> > 
> >   const struct cpumask *scx_bpf_get_idle_numa_cpumask(int node)
> >   const struct cpumask *scx_bpf_get_idle_numa_smtmask(int node)
> 
> Yeah, I don't think we want to break backward compat here. Can we introduce
> a flag to switch between node-aware and flattened logic and trigger ops
> error if the wrong flavor is used? Then, we can deprecate and drop the old
> behavior after a few releases. Also, I think it can be named
> scx_bpf_get_idle_cpumask_node().
> 
> > +static struct cpumask *get_idle_cpumask(int cpu)
> > +{
> > +	int node = cpu_to_node(cpu);
> > +
> > +	return idle_masks[node]->cpu;
> > +}
> > +
> > +static struct cpumask *get_idle_smtmask(int cpu)
> > +{
> > +	int node = cpu_to_node(cpu);
> > +
> > +	return idle_masks[node]->smt;
> > +}
> 
> Hmm... why are they keyed by cpu? Wouldn't it make more sense to key them by
> node?
> 
> > +static s32 scx_pick_idle_cpu(const struct cpumask *cpus_allowed, u64 flags)
> > +{
> > +	int start = cpu_to_node(smp_processor_id());
> > +	int node, cpu;
> > +
> > +	for_each_node_state_wrap(node, N_ONLINE, start) {
> > +		/*
> > +		 * scx_pick_idle_cpu_from_node() can be expensive and redundant
> > +		 * if none of the CPUs in the NUMA node can be used (according
> > +		 * to cpus_allowed).
> > +		 *
> > +		 * Therefore, check if the NUMA node is usable in advance to
> > +		 * save some CPU cycles.
> > +		 */
> > +		if (!cpumask_intersects(cpumask_of_node(node), cpus_allowed))
> > +			continue;
> > +		cpu = scx_pick_idle_cpu_from_node(node, cpus_allowed, flags);
> > +		if (cpu >= 0)
> > +			return cpu;
> 
> This is fine for now but it'd be ideal if the iteration is in inter-node
> distance order so that each CPU radiates from local node to the furthest
> ones.

cpumask_local_spread() does exactly that: it traverses CPUs in NUMA-aware
order. Alternatively, we can use the for_each_numa_hop_mask() iterator,
which does the same thing more efficiently.
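
For illustration only, here is a rough userspace C model of the
distance-ordered walk discussed above: start at the local node and move
outward to the furthest ones. It is not kernel code and does not call
cpumask_local_spread() or for_each_numa_hop_mask(). The distance table,
MODEL_NR_NODES and the model_* helper names are all made up for this
sketch, and each node's idle mask is modeled as bits in a 64-bit word.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NR_NODES 4	/* hypothetical node count, not from the patch */

/* Hypothetical SLIT-style distance table: model_distance[a][b]. */
static const int model_distance[MODEL_NR_NODES][MODEL_NR_NODES] = {
	{ 10, 20, 30, 40 },
	{ 20, 10, 40, 30 },
	{ 30, 40, 10, 20 },
	{ 40, 30, 20, 10 },
};

/* Fill order[] with node ids sorted by increasing distance from @start. */
static void model_nodes_by_distance(int start, int order[MODEL_NR_NODES])
{
	int i, j;

	assert(start >= 0 && start < MODEL_NR_NODES);

	for (i = 0; i < MODEL_NR_NODES; i++)
		order[i] = i;

	/* Insertion sort; stable, so equidistant nodes keep id order. */
	for (i = 1; i < MODEL_NR_NODES; i++) {
		int node = order[i];

		for (j = i; j > 0 &&
		     model_distance[start][order[j - 1]] >
		     model_distance[start][node]; j--)
			order[j] = order[j - 1];
		order[j] = node;
	}
}

/*
 * Return the lowest-numbered idle CPU that is also in @allowed, looking at
 * the nearest node first, or -1 if no node has a usable idle CPU.
 * idle[node] has bit n set when CPU n is idle on that node.
 */
static int model_pick_idle_cpu(int start,
			       const uint64_t idle[MODEL_NR_NODES],
			       uint64_t allowed)
{
	int order[MODEL_NR_NODES];
	int i, cpu;

	model_nodes_by_distance(start, order);

	for (i = 0; i < MODEL_NR_NODES; i++) {
		uint64_t cand = idle[order[i]] & allowed;

		/* Skip nodes with nothing usable, like the intersects check. */
		if (!cand)
			continue;
		for (cpu = 0; cpu < 64; cpu++)
			if (cand & (UINT64_C(1) << cpu))
				return cpu;
	}
	return -1;
}
```

With this table, a caller on node 2 visits nodes in the order 2, 3, 0, 1.
In the kernel the ordering would come from the existing NUMA distance
data rather than from a sort on every call.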

Thanks,
Yury

Thread overview: 10+ messages
2024-12-03 15:36 [PATCHSET v3 sched_ext/for-6.13] sched_ext: split global idle cpumask into per-NUMA cpumasks Andrea Righi
2024-12-03 15:36 ` [PATCH 1/3] nodemask: Introduce for_each_node_mask_wrap/for_each_node_state_wrap() Andrea Righi
2024-12-03 16:27   ` Yury Norov
2024-12-03 15:36 ` [PATCH 2/3] sched_ext: Introduce per-NUMA idle cpumasks Andrea Righi
2024-12-04  0:04   ` Tejun Heo
2024-12-04  0:38     ` Yury Norov [this message]
2024-12-04  8:47       ` Andrea Righi
2024-12-04  8:41     ` Andrea Righi
2024-12-04 18:53       ` Tejun Heo
2024-12-03 15:36 ` [PATCH 3/3] sched_ext: get rid of the scx_selcpu_topo_numa logic Andrea Righi