public inbox for linux-kernel@vger.kernel.org
From: Yury Norov <ynorov@nvidia.com>
To: Valentin Schneider <vschneid@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	linux-kernel@vger.kernel.org, Yury Norov <yury.norov@gmail.com>
Subject: Re: [PATCH] sched/topology: optimize sched_numa_find_nth_cpu()
Date: Sat, 28 Mar 2026 17:42:28 -0400	[thread overview]
Message-ID: <achLRH-MQbrxF4p1@yury> (raw)
In-Reply-To: <xhsmh5x6igysq.mognet@vschneid-thinkpadt14sgen2i.remote.csb>

On Thu, Mar 26, 2026 at 08:45:25PM +0100, Valentin Schneider wrote:
> On 26/03/26 20:09, Valentin Schneider wrote:
> > On 19/03/26 13:26, Yury Norov wrote:
> >> The binary search callback hop_cmp() uses cpumask_weight_and() on each
> >> iteration. Switch it to cpumask_nth_and() as it returns earlier, as
> >> soon as the required number of CPUs is found.
> >>
> >> Signed-off-by: Yury Norov <ynorov@nvidia.com>
> >
> > Woopsie, forget about the empty reply.
> >
> > Took me a little while to get back on track with how
> > sched_numa_find_nth_cpu() works. Doesn't help that it and the
> > cpumask_nth*() family have a @cpu parameter when it isn't a CPU but an
> > offset from a search start (i.e. an index, as coined for cpumask_local_spread()).
> >
> > Comments would help, I'll try to come up with something tomorrow if I wake
> > up from whatever's brewing in my lungs :(
> 
> I figured I'd give it a try before the fever kicks in:
> ---
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index 32dcddaead82d..7069179d5ee0c 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -2267,6 +2267,7 @@ static int hop_cmp(const void *a, const void *b)
>  	struct cpumask **prev_hop, **cur_hop = *(struct cpumask ***)b;
>  	struct __cmp_key *k = (struct __cmp_key *)a;
>  
> +	/* Not enough CPUs reachable in that many hops */
>  	if (cpumask_weight_and(k->cpus, cur_hop[k->node]) <= k->cpu)
>  		return 1;
>  
> @@ -2275,6 +2276,10 @@ static int hop_cmp(const void *a, const void *b)
>  		return 0;
>  	}
>  
> +	/*
> +	 * cur_hop spans enough CPUs to return an nth one, if the immediately
> +	 * preceding hop doesn't then we're done.
> +	 */
>  	prev_hop = *((struct cpumask ***)b - 1);
>  	k->w = cpumask_weight_and(k->cpus, prev_hop[k->node]);
>  	if (k->w <= k->cpu)

I'm not a fan of comments explaining how the code works, especially
when they pollute the code itself. Instead of increasing the function's
vertical length, can you add a top comment like:

  Comparator for the binary search in sched_numa_find_nth_cpu().

  Returns a positive value if the given hop doesn't contain enough CPUs.
  Returns zero if the current hop is the minimal one containing enough CPUs.
  Returns a negative value if the current hop contains enough CPUs but is
  not the minimal one.

If you think the comparator is too complicated, it's always better to
add self-explanatory helpers:

        if (not_enough_hops())
                return 1;

        if (just_enough_hops())
                return 0;

        /* Otherwise too many hops */
        return -1;

> @@ -2284,16 +2289,16 @@ static int hop_cmp(const void *a, const void *b)
>  }
>  
>  /**
> - * sched_numa_find_nth_cpu() - given the NUMA topology, find the Nth closest CPU
> - *                             from @cpus to @cpu, taking into account distance
> - *                             from a given @node.
> + * sched_numa_find_nth_cpu() - given the NUMA topology, find the @nth_cpu in
> + *                             @cpus reachable from @node in the least amount
> + *                             of hops.
>   * @cpus: cpumask to find a cpu from
> - * @cpu: CPU to start searching
> - * @node: NUMA node to order CPUs by distance
> + * @nth_cpu: CPU offset to search for
> + * @node: NUMA node to start the search from
>   *
>   * Return: cpu, or nr_cpu_ids when nothing found.
>   */
> -int sched_numa_find_nth_cpu(const struct cpumask *cpus, int cpu, int node)
> +int sched_numa_find_nth_cpu(const struct cpumask *cpus, int nth_cpu, int node)

If you think that 'cpu' is confusing, then 'nth_cpu' would be even
more confusing: you're searching for the nth_cpu, and you pass it as a
parameter.

Let's name it just num or idx, with the reasoning:

1. This is not a CPU (contrary to cpumask_next(cpu), for example); and
2. This is just an index in a numa-based distance enumeration.

And the description should reflect that, like:

 @idx: index of a CPU to find in a node-based distance CPU enumeration

>  {
>  	struct __cmp_key k = { .cpus = cpus, .cpu = cpu };
>  	struct cpumask ***hop_masks;
> @@ -2315,8 +2320,21 @@ int sched_numa_find_nth_cpu(const struct cpumask *cpus, int cpu, int node)
>  	hop_masks = bsearch(&k, k.masks, sched_domains_numa_levels, sizeof(k.masks[0]), hop_cmp);
>  	if (!hop_masks)
>  		goto unlock;
> +	/*
> +	 * bsearch returned sched_domains_numa_masks[hop], with @hop being the
> +	 * smallest amount of hops it takes to reach an @nth_cpu from @node.
> +	 */

Please don't use kernel-doc syntax outside of kernel-doc blocks.

>  	hop = hop_masks	- k.masks;
>  
> +	/*
> +	 * @hop is constructed by hop_cmp() such that sched_domains_numa_masks[hop][node]

hop_cmp() doesn't construct anything; it's a comparator for bsearch(),
which searches for the nearest hop containing at least N CPUs.

> +	 * spans enough CPUs to return an @nth_cpu, and sched_domains_numa_masks[hop-1][node]
> +	 * doesn't.
> +	 *
> +	 * Get a cpumask without the CPUs from sched_domains_numa_masks[hop-1][node]
> +	 * subtract how many CPUs that contains (@k.w), and fetch our @nth_cpu from
> +	 * the resulting mask.

It only explains the hop != 0 case. That sentence confuses more than it
explains, to me. The code below is quite self-explanatory.

> +	 */
>  	ret = hop ?
>  		cpumask_nth_and_andnot(cpu - k.w, cpus, k.masks[hop][node], k.masks[hop-1][node]) :
>  		cpumask_nth_and(cpu, cpus, k.masks[0][node]);

Thread overview: 7+ messages
2026-03-19 17:26 [PATCH] sched/topology: optimize sched_numa_find_nth_cpu() Yury Norov
2026-03-26 19:05 ` Valentin Schneider
2026-03-26 19:09 ` Valentin Schneider
2026-03-26 19:45   ` Valentin Schneider
2026-03-27 13:38     ` Shrikanth Hegde
2026-03-27 15:59       ` Valentin Schneider
2026-03-28 21:42     ` Yury Norov [this message]
