Re: [PATCH] sched/topology: optimize sched_numa_find_nth_cpu()

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Yury Norov <ynorov@nvidia.com>
To: Valentin Schneider <vschneid@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	linux-kernel@vger.kernel.org, Yury Norov <yury.norov@gmail.com>
Subject: Re: [PATCH] sched/topology: optimize sched_numa_find_nth_cpu()
Date: Sat, 28 Mar 2026 17:42:28 -0400	[thread overview]
Message-ID: <achLRH-MQbrxF4p1@yury> (raw)
In-Reply-To: <xhsmh5x6igysq.mognet@vschneid-thinkpadt14sgen2i.remote.csb>

On Thu, Mar 26, 2026 at 08:45:25PM +0100, Valentin Schneider wrote:
> On 26/03/26 20:09, Valentin Schneider wrote:
> > On 19/03/26 13:26, Yury Norov wrote:
> >> The binary search callback hop_cmp() uses cpumask_weight_and() on each
> >> iteration. Switch it to cpumask_nth_and() as it returns earlier, as
> >> soon as the required number of CPUs is found.
> >>
> >> Signed-off-by: Yury Norov <ynorov@nvidia.com>
> >
> > Woopsie, forget about the empty reply.
> >
> > Took me a little while to get back on track with how
> > sched_numa_find_nth_cpu() works. Doesn't help that it and the
> > cpumask_nth*() family have a @cpu parameter when it isn't a CPU but an
> > offset from a search start (i.e. an index, as coined for cpumask_local_spread()).
> >
> > Comments would help, I'll try to come up with something tomorrow if I wake
> > up from whatever's brewing in my lungs :(
> 
> I figured I'd give it a try before the fever kicks in:
> ---
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index 32dcddaead82d..7069179d5ee0c 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -2267,6 +2267,7 @@ static int hop_cmp(const void *a, const void *b)
>  	struct cpumask **prev_hop, **cur_hop = *(struct cpumask ***)b;
>  	struct __cmp_key *k = (struct __cmp_key *)a;
>  
> +	/* Not enough CPUs reachable in that many hops */
>  	if (cpumask_weight_and(k->cpus, cur_hop[k->node]) <= k->cpu)
>  		return 1;
>  
> @@ -2275,6 +2276,10 @@ static int hop_cmp(const void *a, const void *b)
>  		return 0;
>  	}
>  
> +	/*
> +	 * cur_hop spans enough CPUs to return an nth one, if the immediately
> +	 * preceding hop doesn't then we're done.
> +	 */
>  	prev_hop = *((struct cpumask ***)b - 1);
>  	k->w = cpumask_weight_and(k->cpus, prev_hop[k->node]);
>  	if (k->w <= k->cpu)

I'm not a fan of comments explaining how the code works, especially
if they pollute the code itself. Instead of increasing the function
vertical length, can you add a top comment like:

  Comparator for the binary search in sched_numa_find_nth_cpu().

  Returns positive value if the given hop doesn't contain enough CPUs.
  Returns zero if the current hop is a minimal hop containing enough CPUs.
  Returns negative value if the current hop is not a minimal hop containing
  enough CPUs.

If you think the comparator is too complicated, it's always better to
add self-explaining helpers:

        if (not_enough_hops())
                return 1;

        if (just_enough_hops())
                return 0;

        /* Otherwise too many hops */
        return -1;

> @@ -2284,16 +2289,16 @@ static int hop_cmp(const void *a, const void *b)
>  }
>  
>  /**
> - * sched_numa_find_nth_cpu() - given the NUMA topology, find the Nth closest CPU
> - *                             from @cpus to @cpu, taking into account distance
> - *                             from a given @node.
> + * sched_numa_find_nth_cpu() - given the NUMA topology, find the @nth_cpu in
> + *                             @cpus reachable from @node in the least amount
> + *                             of hops.
>   * @cpus: cpumask to find a cpu from
> - * @cpu: CPU to start searching
> - * @node: NUMA node to order CPUs by distance
> + * @nth_cpu: CPU offset to search for
> + * @node: NUMA node to start the search from
>   *
>   * Return: cpu, or nr_cpu_ids when nothing found.
>   */
> -int sched_numa_find_nth_cpu(const struct cpumask *cpus, int cpu, int node)
> +int sched_numa_find_nth_cpu(const struct cpumask *cpus, int nth_cpu, int node)

If you think that 'cpu' is confusing, then 'nth_cpu' would be even
more confusing: you're searching for the nth_cpu, and you pass it as a
parameter.

Let's name it just num or idx, with the reasoning: 

1. This is not CPU (contrary to cpumask_next(cpu), for example); and
2. This is just an index in a numa-based distance enumeration.

And the description should reflect that, like:

 @idx: index of a CPU to find in a node-based distance CPU enumeration

>  {
>  	struct __cmp_key k = { .cpus = cpus, .cpu = cpu };
>  	struct cpumask ***hop_masks;
> @@ -2315,8 +2320,21 @@ int sched_numa_find_nth_cpu(const struct cpumask *cpus, int cpu, int node)
>  	hop_masks = bsearch(&k, k.masks, sched_domains_numa_levels, sizeof(k.masks[0]), hop_cmp);
>  	if (!hop_masks)
>  		goto unlock;
> +	/*
> +	 * bsearch returned sched_domains_numa_masks[hop], with @hop being the
> +	 * smallest amount of hops it takes to reach an @nth_cpu from @node.
> +	 */

Please no kernel doc syntax out of kernel doc blocks.

>  	hop = hop_masks	- k.masks;
>  
> +	/*
> +	 * @hop is constructed by hop_cmp() such that sched_domains_numa_masks[hop][node]

hop_cmp() doesn't construct, it's a comparator for bsearch, which searches for
the nearest hop containing at least N CPUs.

> +	 * spans enough CPUs to return an @nth_cpu, and sched_domains_numa_masks[hop-1][node]
> +	 * doesn't.
> +	 *
> +	 * Get a cpumask without the CPUs from sched_domains_numa_masks[hop-1][node]
> +	 * subtract how many CPUs that contains (@k.w), and fetch our @nth_cpu from
> +	 * the resulting mask.

It explains only hop != NULL case. This sentence confuses more than
explain, to me. The below code is quite self-explaining. 

> +	 */
>  	ret = hop ?
>  		cpumask_nth_and_andnot(cpu - k.w, cpus, k.masks[hop][node], k.masks[hop-1][node]) :
>  		cpumask_nth_and(cpu, cpus, k.masks[0][node]);

next prev parent reply	other threads:[~2026-03-28 21:42 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-19 17:26 [PATCH] sched/topology: optimize sched_numa_find_nth_cpu() Yury Norov
2026-03-26 19:05 ` Valentin Schneider
2026-03-26 19:09 ` Valentin Schneider
2026-03-26 19:45   ` Valentin Schneider
2026-03-27 13:38     ` Shrikanth Hegde
2026-03-27 15:59       ` Valentin Schneider
2026-03-28 21:42     ` Yury Norov [this message]
2026-04-01 10:09       ` Valentin Schneider
2026-04-01 14:54         ` Yury Norov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=achLRH-MQbrxF4p1@yury \
    --to=ynorov@nvidia.com \
    --cc=bsegall@google.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=juri.lelli@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=vincent.guittot@linaro.org \
    --cc=vschneid@redhat.com \
    --cc=yury.norov@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.