[PATCH] sched/topology: optimize sched_numa_find_nth

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH] sched/topology: optimize sched_numa_find_nth_cpu()
@ 2026-03-19 17:26 Yury Norov
  2026-03-26 19:05 ` Valentin Schneider
  2026-03-26 19:09 ` Valentin Schneider
  0 siblings, 2 replies; 7+ messages in thread
From: Yury Norov @ 2026-03-19 17:26 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, linux-kernel
  Cc: Yury Norov, Yury Norov

The binary search callback hop_cmp() uses cpumask_weight_and() on each
iteration. Switch it to cpumask_nth_and() as it returns earlier, as
soon as the required number of CPUs is found.

Signed-off-by: Yury Norov <ynorov@nvidia.com>
---
 kernel/sched/topology.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 32dcddaead82..0437f5ded37b 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -2267,7 +2267,7 @@ static int hop_cmp(const void *a, const void *b)
 	struct cpumask **prev_hop, **cur_hop = *(struct cpumask ***)b;
 	struct __cmp_key *k = (struct __cmp_key *)a;
 
-	if (cpumask_weight_and(k->cpus, cur_hop[k->node]) <= k->cpu)
+	if (cpumask_nth_and(k->cpu, k->cpus, cur_hop[k->node]) >= nr_cpu_ids)
 		return 1;
 
 	if (b == k->masks) {
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] sched/topology: optimize sched_numa_find_nth_cpu()
  2026-03-19 17:26 [PATCH] sched/topology: optimize sched_numa_find_nth_cpu() Yury Norov
@ 2026-03-26 19:05 ` Valentin Schneider
  2026-03-26 19:09 ` Valentin Schneider
  1 sibling, 0 replies; 7+ messages in thread
From: Valentin Schneider @ 2026-03-26 19:05 UTC (permalink / raw)
  To: Yury Norov, Peter Zijlstra, Ingo Molnar, Juri Lelli,
	Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Mel Gorman, linux-kernel
  Cc: Yury Norov, Yury Norov

On 19/03/26 13:26, Yury Norov wrote:
> The binary search callback hop_cmp() uses cpumask_weight_and() on each
> iteration. Switch it to cpumask_nth_and() as it returns earlier, as
> soon as the required number of CPUs is found.
>
> Signed-off-by: Yury Norov <ynorov@nvidia.com>



> ---
>  kernel/sched/topology.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index 32dcddaead82..0437f5ded37b 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -2267,7 +2267,7 @@ static int hop_cmp(const void *a, const void *b)
>  	struct cpumask **prev_hop, **cur_hop = *(struct cpumask ***)b;
>  	struct __cmp_key *k = (struct __cmp_key *)a;
>  
> -	if (cpumask_weight_and(k->cpus, cur_hop[k->node]) <= k->cpu)
> +	if (cpumask_nth_and(k->cpu, k->cpus, cur_hop[k->node]) >= nr_cpu_ids)
>  		return 1;
>  
>  	if (b == k->masks) {
> -- 
> 2.43.0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] sched/topology: optimize sched_numa_find_nth_cpu()
  2026-03-19 17:26 [PATCH] sched/topology: optimize sched_numa_find_nth_cpu() Yury Norov
  2026-03-26 19:05 ` Valentin Schneider
@ 2026-03-26 19:09 ` Valentin Schneider
  2026-03-26 19:45   ` Valentin Schneider
  1 sibling, 1 reply; 7+ messages in thread
From: Valentin Schneider @ 2026-03-26 19:09 UTC (permalink / raw)
  To: Yury Norov, Peter Zijlstra, Ingo Molnar, Juri Lelli,
	Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Mel Gorman, linux-kernel
  Cc: Yury Norov, Yury Norov

On 19/03/26 13:26, Yury Norov wrote:
> The binary search callback hop_cmp() uses cpumask_weight_and() on each
> iteration. Switch it to cpumask_nth_and() as it returns earlier, as
> soon as the required number of CPUs is found.
>
> Signed-off-by: Yury Norov <ynorov@nvidia.com>

Woopsie, forget about the empty reply.

Took me a little while to get back on track with how
sched_numa_find_nth_cpu() works. Doesn't help that it and the
cpumask_nth*() family have a @cpu parameter when it isn't a CPU but an
offset from a search start (i.e. an index, as coined for cpumask_local_spread()).

Comments would help, I'll try to come up with something tomorrow if I wake
up from whatever's brewing in my lungs :(

Anywho, for the change itself:

Reviewed-by: Valentin Schneider <vschneid@redhat.com>

> ---
>  kernel/sched/topology.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index 32dcddaead82..0437f5ded37b 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -2267,7 +2267,7 @@ static int hop_cmp(const void *a, const void *b)
>       struct cpumask **prev_hop, **cur_hop = *(struct cpumask ***)b;
>       struct __cmp_key *k = (struct __cmp_key *)a;
>
> -	if (cpumask_weight_and(k->cpus, cur_hop[k->node]) <= k->cpu)
> +	if (cpumask_nth_and(k->cpu, k->cpus, cur_hop[k->node]) >= nr_cpu_ids)
>               return 1;
>
>       if (b == k->masks) {
> --
> 2.43.0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] sched/topology: optimize sched_numa_find_nth_cpu()
  2026-03-26 19:09 ` Valentin Schneider
@ 2026-03-26 19:45   ` Valentin Schneider
  2026-03-27 13:38     ` Shrikanth Hegde
  2026-03-28 21:42     ` Yury Norov
  0 siblings, 2 replies; 7+ messages in thread
From: Valentin Schneider @ 2026-03-26 19:45 UTC (permalink / raw)
  To: Yury Norov, Peter Zijlstra, Ingo Molnar, Juri Lelli,
	Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Mel Gorman, linux-kernel
  Cc: Yury Norov, Yury Norov

On 26/03/26 20:09, Valentin Schneider wrote:
> On 19/03/26 13:26, Yury Norov wrote:
>> The binary search callback hop_cmp() uses cpumask_weight_and() on each
>> iteration. Switch it to cpumask_nth_and() as it returns earlier, as
>> soon as the required number of CPUs is found.
>>
>> Signed-off-by: Yury Norov <ynorov@nvidia.com>
>
> Woopsie, forget about the empty reply.
>
> Took me a little while to get back on track with how
> sched_numa_find_nth_cpu() works. Doesn't help that it and the
> cpumask_nth*() family have a @cpu parameter when it isn't a CPU but an
> offset from a search start (i.e. an index, as coined for cpumask_local_spread()).
>
> Comments would help, I'll try to come up with something tomorrow if I wake
> up from whatever's brewing in my lungs :(

I figured I'd give it a try before the fever kicks in:
---
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 32dcddaead82d..7069179d5ee0c 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -2267,6 +2267,7 @@ static int hop_cmp(const void *a, const void *b)
 	struct cpumask **prev_hop, **cur_hop = *(struct cpumask ***)b;
 	struct __cmp_key *k = (struct __cmp_key *)a;
 
+	/* Not enough CPUs reachable in that many hops */
 	if (cpumask_weight_and(k->cpus, cur_hop[k->node]) <= k->cpu)
 		return 1;
 
@@ -2275,6 +2276,10 @@ static int hop_cmp(const void *a, const void *b)
 		return 0;
 	}
 
+	/*
+	 * cur_hop spans enough CPUs to return an nth one, if the immediately
+	 * preceding hop doesn't then we're done.
+	 */
 	prev_hop = *((struct cpumask ***)b - 1);
 	k->w = cpumask_weight_and(k->cpus, prev_hop[k->node]);
 	if (k->w <= k->cpu)
@@ -2284,16 +2289,16 @@ static int hop_cmp(const void *a, const void *b)
 }
 
 /**
- * sched_numa_find_nth_cpu() - given the NUMA topology, find the Nth closest CPU
- *                             from @cpus to @cpu, taking into account distance
- *                             from a given @node.
+ * sched_numa_find_nth_cpu() - given the NUMA topology, find the @nth_cpu in
+ *                             @cpus reachable from @node in the least amount
+ *                             of hops.
  * @cpus: cpumask to find a cpu from
- * @cpu: CPU to start searching
- * @node: NUMA node to order CPUs by distance
+ * @nth_cpu: CPU offset to search for
+ * @node: NUMA node to start the search from
  *
  * Return: cpu, or nr_cpu_ids when nothing found.
  */
-int sched_numa_find_nth_cpu(const struct cpumask *cpus, int cpu, int node)
+int sched_numa_find_nth_cpu(const struct cpumask *cpus, int nth_cpu, int node)
 {
 	struct __cmp_key k = { .cpus = cpus, .cpu = cpu };
 	struct cpumask ***hop_masks;
@@ -2315,8 +2320,21 @@ int sched_numa_find_nth_cpu(const struct cpumask *cpus, int cpu, int node)
 	hop_masks = bsearch(&k, k.masks, sched_domains_numa_levels, sizeof(k.masks[0]), hop_cmp);
 	if (!hop_masks)
 		goto unlock;
+	/*
+	 * bsearch returned sched_domains_numa_masks[hop], with @hop being the
+	 * smallest amount of hops it takes to reach an @nth_cpu from @node.
+	 */
 	hop = hop_masks	- k.masks;
 
+	/*
+	 * @hop is constructed by hop_cmp() such that sched_domains_numa_masks[hop][node]
+	 * spans enough CPUs to return an @nth_cpu, and sched_domains_numa_masks[hop-1][node]
+	 * doesn't.
+	 *
+	 * Get a cpumask without the CPUs from sched_domains_numa_masks[hop-1][node]
+	 * subtract how many CPUs that contains (@k.w), and fetch our @nth_cpu from
+	 * the resulting mask.
+	 */
 	ret = hop ?
 		cpumask_nth_and_andnot(cpu - k.w, cpus, k.masks[hop][node], k.masks[hop-1][node]) :
 		cpumask_nth_and(cpu, cpus, k.masks[0][node]);


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] sched/topology: optimize sched_numa_find_nth_cpu()
  2026-03-26 19:45   ` Valentin Schneider
@ 2026-03-27 13:38     ` Shrikanth Hegde
  2026-03-27 15:59       ` Valentin Schneider
  2026-03-28 21:42     ` Yury Norov
  1 sibling, 1 reply; 7+ messages in thread
From: Shrikanth Hegde @ 2026-03-27 13:38 UTC (permalink / raw)
  To: Valentin Schneider
  Cc: Yury Norov, Yury Norov, Peter Zijlstra, Ingo Molnar, Juri Lelli,
	Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Mel Gorman, linux-kernel

Hi Valentin.

On 3/27/26 1:15 AM, Valentin Schneider wrote:
> On 26/03/26 20:09, Valentin Schneider wrote:
>> On 19/03/26 13:26, Yury Norov wrote:
>>> The binary search callback hop_cmp() uses cpumask_weight_and() on each
>>> iteration. Switch it to cpumask_nth_and() as it returns earlier, as
>>> soon as the required number of CPUs is found.
>>>
>>> Signed-off-by: Yury Norov <ynorov@nvidia.com>
>>
>> Woopsie, forget about the empty reply.
>>
>> Took me a little while to get back on track with how
>> sched_numa_find_nth_cpu() works. Doesn't help that it and the
>> cpumask_nth*() family have a @cpu parameter when it isn't a CPU but an
>> offset from a search start (i.e. an index, as coined for cpumask_local_spread()).
>>
>> Comments would help,

Yes. It is quite difficult to get at first glance.

  I'll try to come up with something tomorrow if I wake
>> up from whatever's brewing in my lungs :(
> 
> I figured I'd give it a try before the fever kicks in:

Take care.

> ---
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index 32dcddaead82d..7069179d5ee0c 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -2267,6 +2267,7 @@ static int hop_cmp(const void *a, const void *b)
>   	struct cpumask **prev_hop, **cur_hop = *(struct cpumask ***)b;
>   	struct __cmp_key *k = (struct __cmp_key *)a;
>   
> +	/* Not enough CPUs reachable in that many hops */
>   	if (cpumask_weight_and(k->cpus, cur_hop[k->node]) <= k->cpu)
>   		return 1;
>   
> @@ -2275,6 +2276,10 @@ static int hop_cmp(const void *a, const void *b)
>   		return 0;
>   	}
>   
> +	/*
> +	 * cur_hop spans enough CPUs to return an nth one, if the immediately
> +	 * preceding hop doesn't then we're done.
> +	 */
>   	prev_hop = *((struct cpumask ***)b - 1);
>   	k->w = cpumask_weight_and(k->cpus, prev_hop[k->node]);
>   	if (k->w <= k->cpu)
> @@ -2284,16 +2289,16 @@ static int hop_cmp(const void *a, const void *b)
>   }
>   
>   /**
> - * sched_numa_find_nth_cpu() - given the NUMA topology, find the Nth closest CPU
> - *                             from @cpus to @cpu, taking into account distance
> - *                             from a given @node.
> + * sched_numa_find_nth_cpu() - given the NUMA topology, find the @nth_cpu in
> + *                             @cpus reachable from @node in the least amount
> + *                             of hops.
>    * @cpus: cpumask to find a cpu from
> - * @cpu: CPU to start searching
> - * @node: NUMA node to order CPUs by distance
> + * @nth_cpu: CPU offset to search for
> + * @node: NUMA node to start the search from
>    *
>    * Return: cpu, or nr_cpu_ids when nothing found.
>    */
> -int sched_numa_find_nth_cpu(const struct cpumask *cpus, int cpu, int node)
> +int sched_numa_find_nth_cpu(const struct cpumask *cpus, int nth_cpu, int node)
>   {
>   	struct __cmp_key k = { .cpus = cpus, .cpu = cpu };

nit: it should .cpu = nth_cpu? and one more place below should change to nth_cpu.

>   	struct cpumask ***hop_masks;
> @@ -2315,8 +2320,21 @@ int sched_numa_find_nth_cpu(const struct cpumask *cpus, int cpu, int node)
>   	hop_masks = bsearch(&k, k.masks, sched_domains_numa_levels, sizeof(k.masks[0]), hop_cmp);
>   	if (!hop_masks)
>   		goto unlock;
> +	/*
> +	 * bsearch returned sched_domains_numa_masks[hop], with @hop being the
> +	 * smallest amount of hops it takes to reach an @nth_cpu from @node.
> +	 */
>   	hop = hop_masks	- k.masks;
>   
> +	/*
> +	 * @hop is constructed by hop_cmp() such that sched_domains_numa_masks[hop][node]
> +	 * spans enough CPUs to return an @nth_cpu, and sched_domains_numa_masks[hop-1][node]
> +	 * doesn't.
> +	 *
> +	 * Get a cpumask without the CPUs from sched_domains_numa_masks[hop-1][node]
> +	 * subtract how many CPUs that contains (@k.w), and fetch our @nth_cpu from
> +	 * the resulting mask.
> +	 */
>   	ret = hop ?
>   		cpumask_nth_and_andnot(cpu - k.w, cpus, k.masks[hop][node], k.masks[hop-1][node]) :
>   		cpumask_nth_and(cpu, cpus, k.masks[0][node]);
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] sched/topology: optimize sched_numa_find_nth_cpu()
  2026-03-27 13:38     ` Shrikanth Hegde
@ 2026-03-27 15:59       ` Valentin Schneider
  0 siblings, 0 replies; 7+ messages in thread
From: Valentin Schneider @ 2026-03-27 15:59 UTC (permalink / raw)
  To: Shrikanth Hegde
  Cc: Yury Norov, Yury Norov, Peter Zijlstra, Ingo Molnar, Juri Lelli,
	Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Mel Gorman, linux-kernel

On 27/03/26 19:08, Shrikanth Hegde wrote:
> Hi Valentin.
>
> On 3/27/26 1:15 AM, Valentin Schneider wrote:
>> On 26/03/26 20:09, Valentin Schneider wrote:
>>> On 19/03/26 13:26, Yury Norov wrote:
>> @@ -2284,16 +2289,16 @@ static int hop_cmp(const void *a, const void *b)
>>   }
>>   
>>   /**
>> - * sched_numa_find_nth_cpu() - given the NUMA topology, find the Nth closest CPU
>> - *                             from @cpus to @cpu, taking into account distance
>> - *                             from a given @node.
>> + * sched_numa_find_nth_cpu() - given the NUMA topology, find the @nth_cpu in
>> + *                             @cpus reachable from @node in the least amount
>> + *                             of hops.
>>    * @cpus: cpumask to find a cpu from
>> - * @cpu: CPU to start searching
>> - * @node: NUMA node to order CPUs by distance
>> + * @nth_cpu: CPU offset to search for
>> + * @node: NUMA node to start the search from
>>    *
>>    * Return: cpu, or nr_cpu_ids when nothing found.
>>    */
>> -int sched_numa_find_nth_cpu(const struct cpumask *cpus, int cpu, int node)
>> +int sched_numa_find_nth_cpu(const struct cpumask *cpus, int nth_cpu, int node)
>>   {
>>   	struct __cmp_key k = { .cpus = cpus, .cpu = cpu };
>
> nit: it should .cpu = nth_cpu? and one more place below should change to nth_cpu.
>

Yeah, and the cpumask_nth*() family should get the same treatment.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] sched/topology: optimize sched_numa_find_nth_cpu()
  2026-03-26 19:45   ` Valentin Schneider
  2026-03-27 13:38     ` Shrikanth Hegde
@ 2026-03-28 21:42     ` Yury Norov
  1 sibling, 0 replies; 7+ messages in thread
From: Yury Norov @ 2026-03-28 21:42 UTC (permalink / raw)
  To: Valentin Schneider
  Cc: Peter Zijlstra, Ingo Molnar, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	linux-kernel, Yury Norov

On Thu, Mar 26, 2026 at 08:45:25PM +0100, Valentin Schneider wrote:
> On 26/03/26 20:09, Valentin Schneider wrote:
> > On 19/03/26 13:26, Yury Norov wrote:
> >> The binary search callback hop_cmp() uses cpumask_weight_and() on each
> >> iteration. Switch it to cpumask_nth_and() as it returns earlier, as
> >> soon as the required number of CPUs is found.
> >>
> >> Signed-off-by: Yury Norov <ynorov@nvidia.com>
> >
> > Woopsie, forget about the empty reply.
> >
> > Took me a little while to get back on track with how
> > sched_numa_find_nth_cpu() works. Doesn't help that it and the
> > cpumask_nth*() family have a @cpu parameter when it isn't a CPU but an
> > offset from a search start (i.e. an index, as coined for cpumask_local_spread()).
> >
> > Comments would help, I'll try to come up with something tomorrow if I wake
> > up from whatever's brewing in my lungs :(
> 
> I figured I'd give it a try before the fever kicks in:
> ---
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index 32dcddaead82d..7069179d5ee0c 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -2267,6 +2267,7 @@ static int hop_cmp(const void *a, const void *b)
>  	struct cpumask **prev_hop, **cur_hop = *(struct cpumask ***)b;
>  	struct __cmp_key *k = (struct __cmp_key *)a;
>  
> +	/* Not enough CPUs reachable in that many hops */
>  	if (cpumask_weight_and(k->cpus, cur_hop[k->node]) <= k->cpu)
>  		return 1;
>  
> @@ -2275,6 +2276,10 @@ static int hop_cmp(const void *a, const void *b)
>  		return 0;
>  	}
>  
> +	/*
> +	 * cur_hop spans enough CPUs to return an nth one, if the immediately
> +	 * preceding hop doesn't then we're done.
> +	 */
>  	prev_hop = *((struct cpumask ***)b - 1);
>  	k->w = cpumask_weight_and(k->cpus, prev_hop[k->node]);
>  	if (k->w <= k->cpu)

I'm not a fan of comments explaining how the code works, especially
if they pollute the code itself. Instead of increasing the function
vertical length, can you add a top comment like:

  Comparator for the binary search in sched_numa_find_nth_cpu().

  Returns positive value if the given hop doesn't contain enough CPUs.
  Returns zero if the current hop is a minimal hop containing enough CPUs.
  Returns negative value if the current hop is not a minimal hop containing
  enough CPUs.

If you think the comparator is too complicated, it's always better to
add self-explaining helpers:

        if (not_enough_hops())
                return 1;

        if (just_enough_hops())
                return 0;

        /* Otherwise too many hops */
        return -1;

> @@ -2284,16 +2289,16 @@ static int hop_cmp(const void *a, const void *b)
>  }
>  
>  /**
> - * sched_numa_find_nth_cpu() - given the NUMA topology, find the Nth closest CPU
> - *                             from @cpus to @cpu, taking into account distance
> - *                             from a given @node.
> + * sched_numa_find_nth_cpu() - given the NUMA topology, find the @nth_cpu in
> + *                             @cpus reachable from @node in the least amount
> + *                             of hops.
>   * @cpus: cpumask to find a cpu from
> - * @cpu: CPU to start searching
> - * @node: NUMA node to order CPUs by distance
> + * @nth_cpu: CPU offset to search for
> + * @node: NUMA node to start the search from
>   *
>   * Return: cpu, or nr_cpu_ids when nothing found.
>   */
> -int sched_numa_find_nth_cpu(const struct cpumask *cpus, int cpu, int node)
> +int sched_numa_find_nth_cpu(const struct cpumask *cpus, int nth_cpu, int node)

If you think that 'cpu' is confusing, then 'nth_cpu' would be even
more confusing: you're searching for the nth_cpu, and you pass it as a
parameter.

Let's name it just num or idx, with the reasoning: 

1. This is not CPU (contrary to cpumask_next(cpu), for example); and
2. This is just an index in a numa-based distance enumeration.

And the description should reflect that, like:

 @idx: index of a CPU to find in a node-based distance CPU enumeration

>  {
>  	struct __cmp_key k = { .cpus = cpus, .cpu = cpu };
>  	struct cpumask ***hop_masks;
> @@ -2315,8 +2320,21 @@ int sched_numa_find_nth_cpu(const struct cpumask *cpus, int cpu, int node)
>  	hop_masks = bsearch(&k, k.masks, sched_domains_numa_levels, sizeof(k.masks[0]), hop_cmp);
>  	if (!hop_masks)
>  		goto unlock;
> +	/*
> +	 * bsearch returned sched_domains_numa_masks[hop], with @hop being the
> +	 * smallest amount of hops it takes to reach an @nth_cpu from @node.
> +	 */

Please no kernel doc syntax out of kernel doc blocks.

>  	hop = hop_masks	- k.masks;
>  
> +	/*
> +	 * @hop is constructed by hop_cmp() such that sched_domains_numa_masks[hop][node]

hop_cmp() doesn't construct, it's a comparator for bsearch, which searches for
the nearest hop containing at least N CPUs.

> +	 * spans enough CPUs to return an @nth_cpu, and sched_domains_numa_masks[hop-1][node]
> +	 * doesn't.
> +	 *
> +	 * Get a cpumask without the CPUs from sched_domains_numa_masks[hop-1][node]
> +	 * subtract how many CPUs that contains (@k.w), and fetch our @nth_cpu from
> +	 * the resulting mask.

It explains only hop != NULL case. This sentence confuses more than
explain, to me. The below code is quite self-explaining. 

> +	 */
>  	ret = hop ?
>  		cpumask_nth_and_andnot(cpu - k.w, cpus, k.masks[hop][node], k.masks[hop-1][node]) :
>  		cpumask_nth_and(cpu, cpus, k.masks[0][node]);

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2026-03-28 21:42 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-19 17:26 [PATCH] sched/topology: optimize sched_numa_find_nth_cpu() Yury Norov
2026-03-26 19:05 ` Valentin Schneider
2026-03-26 19:09 ` Valentin Schneider
2026-03-26 19:45   ` Valentin Schneider
2026-03-27 13:38     ` Shrikanth Hegde
2026-03-27 15:59       ` Valentin Schneider
2026-03-28 21:42     ` Yury Norov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox