* [PATCH 0/4] smp: Improve on cpumasks handling
@ 2025-06-06 20:27 Yury Norov
  2025-06-06 20:27 ` [PATCH 1/4] smp: Improve locality in smp_call_function_any() Yury Norov
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Yury Norov @ 2025-06-06 20:27 UTC (permalink / raw)
  To: Paul E. McKenney, Neeraj Upadhyay, Yury Norov [NVIDIA],
	Thomas Gleixner, Thorsten Blum, Zqiang, Mathieu Desnoyers,
	linux-kernel

Switch smp_call_function_*() to more suitable cpumask APIs.

Yury Norov [NVIDIA] (4):
  smp: Improve locality in smp_call_function_any()
  smp: Use cpumask_any_but() in smp_call_function_many_cond()
  smp: Don't wait for remote work done if not needed in
    smp_call_function_many_cond()
  smp: Defer check for local execution in smp_call_function_many_cond()

 kernel/smp.c | 38 +++++++++-----------------------------
 1 file changed, 9 insertions(+), 29 deletions(-)

-- 
2.43.0



* [PATCH 1/4] smp: Improve locality in smp_call_function_any()
  2025-06-06 20:27 [PATCH 0/4] smp: Improve on cpumasks handling Yury Norov
@ 2025-06-06 20:27 ` Yury Norov
  2025-06-06 20:27 ` [PATCH 2/4] smp: Use cpumask_any_but() in smp_call_function_many_cond() Yury Norov
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: Yury Norov @ 2025-06-06 20:27 UTC (permalink / raw)
  To: Paul E. McKenney, Neeraj Upadhyay, Yury Norov [NVIDIA],
	Thomas Gleixner, Thorsten Blum, Zqiang, Mathieu Desnoyers,
	linux-kernel

From: "Yury Norov [NVIDIA]" <yury.norov@gmail.com>

smp_call_function_any() first tries a local call, as it's the cheapest
option, and otherwise picks a CPU in the same node. If neither is possible,
the algorithm gives up on locality and takes any CPU, in numerical order.

Instead, we can search for the best CPU based on NUMA locality, including
the 2nd nearest hop (a set of equidistant nodes) and beyond.

sched_numa_find_nth_cpu() does exactly that, and also lets us drop most
of the housekeeping code.

Signed-off-by: Yury Norov [NVIDIA] <yury.norov@gmail.com>
---
 kernel/smp.c | 19 +++----------------
 1 file changed, 3 insertions(+), 16 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index 974f3a3962e8..7c8cfab0ce55 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -741,32 +741,19 @@ EXPORT_SYMBOL_GPL(smp_call_function_single_async);
  *
  * Selection preference:
  *	1) current cpu if in @mask
- *	2) any cpu of current node if in @mask
- *	3) any other online cpu in @mask
+ *	2) nearest cpu in @mask, based on NUMA topology
  */
 int smp_call_function_any(const struct cpumask *mask,
 			  smp_call_func_t func, void *info, int wait)
 {
 	unsigned int cpu;
-	const struct cpumask *nodemask;
 	int ret;
 
 	/* Try for same CPU (cheapest) */
 	cpu = get_cpu();
-	if (cpumask_test_cpu(cpu, mask))
-		goto call;
-
-	/* Try for same node. */
-	nodemask = cpumask_of_node(cpu_to_node(cpu));
-	for (cpu = cpumask_first_and(nodemask, mask); cpu < nr_cpu_ids;
-	     cpu = cpumask_next_and(cpu, nodemask, mask)) {
-		if (cpu_online(cpu))
-			goto call;
-	}
+	if (!cpumask_test_cpu(cpu, mask))
+		cpu = sched_numa_find_nth_cpu(mask, 0, cpu_to_node(cpu));
 
-	/* Any online will do: smp_call_function_single handles nr_cpu_ids. */
-	cpu = cpumask_any_and(mask, cpu_online_mask);
-call:
 	ret = smp_call_function_single(cpu, func, info, wait);
 	put_cpu();
 	return ret;
-- 
2.43.0
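
The selection change described in the commit message above can be
illustrated with a small userspace model. This is only a sketch under
stated assumptions: the topology, the distance table, and the helpers
nearest_cpu() and old_pick() are hypothetical and are not the kernel
implementation of sched_numa_find_nth_cpu(). CPU online state is
ignored for brevity.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define NR_CPUS  8
#define NR_NODES 4

/* Hypothetical topology: two CPUs per node. */
static const int cpu_node[NR_CPUS] = { 0, 0, 1, 1, 2, 2, 3, 3 };

/* Hypothetical SLIT-style distance table (10 means local). */
static const int node_distance[NR_NODES][NR_NODES] = {
	{ 10, 20, 30, 30 },
	{ 20, 10, 30, 30 },
	{ 30, 30, 10, 20 },
	{ 30, 30, 20, 10 },
};

/*
 * Model of the new behaviour: return the CPU in @mask closest to @node,
 * or NR_CPUS if @mask is empty. Ties go to the lowest CPU number.
 */
int nearest_cpu(uint64_t mask, int node)
{
	int best = NR_CPUS, best_dist = INT_MAX;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!(mask & (UINT64_C(1) << cpu)))
			continue;
		int dist = node_distance[node][cpu_node[cpu]];
		if (dist < best_dist) {
			best_dist = dist;
			best = cpu;
		}
	}
	return best;
}

/* Model of the old fallback: same node first, then lowest CPU in @mask. */
int old_pick(uint64_t mask, int node)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if ((mask & (UINT64_C(1) << cpu)) && cpu_node[cpu] == node)
			return cpu;
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (mask & (UINT64_C(1) << cpu))
			return cpu;
	return NR_CPUS;
}

/* Caller on node 3, mask holds CPU 0 (node 0) and CPU 4 (node 2). */
void locality_example(void)
{
	assert(old_pick(0x11, 3) == 0);    /* distance 30 */
	assert(nearest_cpu(0x11, 3) == 4); /* distance 20 */
}
```

In this model the old fallback lands on CPU 0 simply because it has the
lowest number, while the distance-aware search prefers CPU 4 on the
neighbouring node.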



* [PATCH 2/4] smp: Use cpumask_any_but() in smp_call_function_many_cond()
  2025-06-06 20:27 [PATCH 0/4] smp: Improve on cpumasks handling Yury Norov
  2025-06-06 20:27 ` [PATCH 1/4] smp: Improve locality in smp_call_function_any() Yury Norov
@ 2025-06-06 20:27 ` Yury Norov
  2025-06-06 20:27 ` [PATCH 3/4] smp: Don't wait until remote work done if not needed " Yury Norov
  2025-06-06 20:27 ` [PATCH 4/4] smp: Defer check for local execution " Yury Norov
  3 siblings, 0 replies; 6+ messages in thread
From: Yury Norov @ 2025-06-06 20:27 UTC (permalink / raw)
  To: Paul E. McKenney, Neeraj Upadhyay, Yury Norov [NVIDIA],
	Thomas Gleixner, Thorsten Blum, Zqiang, Mathieu Desnoyers,
	linux-kernel

From: "Yury Norov [NVIDIA]" <yury.norov@gmail.com>

smp_call_function_many_cond() open-codes a cpumask_any_but()-style search
over mask & cpu_online_mask. Use cpumask_any_and_but() instead.

Signed-off-by: Yury Norov [NVIDIA] <yury.norov@gmail.com>
---
 kernel/smp.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index 7c8cfab0ce55..5871acf3cd45 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -807,13 +807,8 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 		run_local = true;
 
 	/* Check if we need remote execution, i.e., any CPU excluding this one. */
-	cpu = cpumask_first_and(mask, cpu_online_mask);
-	if (cpu == this_cpu)
-		cpu = cpumask_next_and(cpu, mask, cpu_online_mask);
-	if (cpu < nr_cpu_ids)
+	if (cpumask_any_and_but(mask, cpu_online_mask, this_cpu) < nr_cpu_ids) {
 		run_remote = true;
-
-	if (run_remote) {
 		cfd = this_cpu_ptr(&cfd_data);
 		cpumask_and(cfd->cpumask, mask, cpu_online_mask);
 		__cpumask_clear_cpu(this_cpu, cfd->cpumask);
-- 
2.43.0
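
The equivalence behind this replacement can be checked with a userspace
model that uses 64-bit words in place of struct cpumask. This is a
sketch only: next_bit(), open_coded() and any_and_but() are
hypothetical stand-ins written for this note, not the kernel helpers,
and NR_CPU_IDS plays the role of nr_cpu_ids.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPU_IDS 64

/* Index of the lowest set bit at or after @start, or NR_CPU_IDS if none. */
int next_bit(uint64_t bits, int start)
{
	for (int cpu = start; cpu < NR_CPU_IDS; cpu++)
		if (bits & (UINT64_C(1) << cpu))
			return cpu;
	return NR_CPU_IDS;
}

/* The removed open-coded form: first bit of (a & b), skipping @this_cpu. */
int open_coded(uint64_t a, uint64_t b, int this_cpu)
{
	int cpu = next_bit(a & b, 0);

	if (cpu == this_cpu)
		cpu = next_bit(a & b, cpu + 1);
	return cpu;
}

/* Model of the helper: any bit of (a & b) other than @this_cpu. */
int any_and_but(uint64_t a, uint64_t b, int this_cpu)
{
	return next_bit(a & b & ~(UINT64_C(1) << this_cpu), 0);
}

/* Both forms agree, including when only @this_cpu is set. */
void any_and_but_example(void)
{
	assert(open_coded(0x6, 0xF, 1) == 2);
	assert(any_and_but(0x6, 0xF, 1) == 2);
	assert(open_coded(0x2, 0xF, 1) == NR_CPU_IDS);
	assert(any_and_but(0x2, 0xF, 1) == NR_CPU_IDS);
}
```

A result below NR_CPU_IDS means at least one CPU other than the caller
needs the call, which is the condition the patch tests before setting
run_remote.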



* [PATCH 3/4] smp: Don't wait until remote work done if not needed in smp_call_function_many_cond()
  2025-06-06 20:27 [PATCH 0/4] smp: Improve on cpumasks handling Yury Norov
  2025-06-06 20:27 ` [PATCH 1/4] smp: Improve locality in smp_call_function_any() Yury Norov
  2025-06-06 20:27 ` [PATCH 2/4] smp: Use cpumask_any_but() in smp_call_function_many_cond() Yury Norov
@ 2025-06-06 20:27 ` Yury Norov
  2025-06-06 20:27 ` [PATCH 4/4] smp: Defer check for local execution " Yury Norov
  3 siblings, 0 replies; 6+ messages in thread
From: Yury Norov @ 2025-06-06 20:27 UTC (permalink / raw)
  To: Paul E. McKenney, Neeraj Upadhyay, Yury Norov [NVIDIA],
	Thomas Gleixner, Thorsten Blum, Zqiang, Mathieu Desnoyers,
	linux-kernel

From: "Yury Norov [NVIDIA]" <yury.norov@gmail.com>

If we don't actually send any IPIs, there's no need to wait for the
remote work to complete.

Signed-off-by: Yury Norov [NVIDIA] <yury.norov@gmail.com>
---
 kernel/smp.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/smp.c b/kernel/smp.c
index 5871acf3cd45..715190669e94 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -849,6 +849,8 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 			send_call_function_single_ipi(last_cpu);
 		else if (likely(nr_cpus > 1))
 			send_call_function_ipi_mask(cfd->cpumask_ipi);
+		else
+			run_remote = false;
 	}
 
 	if (run_local) {
-- 
2.43.0



* [PATCH 4/4] smp: Defer check for local execution in smp_call_function_many_cond()
  2025-06-06 20:27 [PATCH 0/4] smp: Improve on cpumasks handling Yury Norov
                   ` (2 preceding siblings ...)
  2025-06-06 20:27 ` [PATCH 3/4] smp: Don't wait until remote work done if not needed " Yury Norov
@ 2025-06-06 20:27 ` Yury Norov
  2025-06-10 18:59   ` Yury Norov
  3 siblings, 1 reply; 6+ messages in thread
From: Yury Norov @ 2025-06-06 20:27 UTC (permalink / raw)
  To: Paul E. McKenney, Neeraj Upadhyay, Yury Norov [NVIDIA],
	Thomas Gleixner, Thorsten Blum, Zqiang, Mathieu Desnoyers,
	linux-kernel

From: "Yury Norov [NVIDIA]" <yury.norov@gmail.com>

Defer the check for local execution to the place where it is actually
needed, and save some stack by dropping a now-redundant local variable.

Signed-off-by: Yury Norov [NVIDIA] <yury.norov@gmail.com>
---
 kernel/smp.c | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index 715190669e94..867f79689684 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -779,7 +779,6 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 	bool wait = scf_flags & SCF_WAIT;
 	int nr_cpus = 0;
 	bool run_remote = false;
-	bool run_local = false;
 
 	lockdep_assert_preemption_disabled();
 
@@ -801,11 +800,6 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 	 */
 	WARN_ON_ONCE(!in_task());
 
-	/* Check if we need local execution. */
-	if ((scf_flags & SCF_RUN_LOCAL) && cpumask_test_cpu(this_cpu, mask) &&
-	    (!cond_func || cond_func(this_cpu, info)))
-		run_local = true;
-
 	/* Check if we need remote execution, i.e., any CPU excluding this one. */
 	if (cpumask_any_and_but(mask, cpu_online_mask, this_cpu) < nr_cpu_ids) {
 		run_remote = true;
@@ -853,7 +847,9 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 			run_remote = false;
 	}
 
-	if (run_local) {
+	/* Check if we need local execution. */
+	if ((scf_flags & SCF_RUN_LOCAL) & cpumask_test_cpu(this_cpu, mask) &&
+	    (!cond_func || cond_func(this_cpu, info))) {
 		unsigned long flags;
 
 		local_irq_save(flags);
-- 
2.43.0



* Re: [PATCH 4/4] smp: Defer check for local execution in smp_call_function_many_cond()
  2025-06-06 20:27 ` [PATCH 4/4] smp: Defer check for local execution " Yury Norov
@ 2025-06-10 18:59   ` Yury Norov
  0 siblings, 0 replies; 6+ messages in thread
From: Yury Norov @ 2025-06-10 18:59 UTC (permalink / raw)
  To: Paul E. McKenney, Neeraj Upadhyay, Thomas Gleixner, Thorsten Blum,
	Zqiang, Mathieu Desnoyers, Dan Carpenter, linux-kernel

On Fri, Jun 06, 2025 at 04:27:31PM -0400, Yury Norov wrote:
> From: "Yury Norov [NVIDIA]" <yury.norov@gmail.com>
> 
> Defer check for local execution to the actual place where it is needed,
> and save some stack on a useless local variable.
> 
> Signed-off-by: Yury Norov [NVIDIA] <yury.norov@gmail.com>
> ---
>  kernel/smp.c | 10 +++-------
>  1 file changed, 3 insertions(+), 7 deletions(-)
> 
> diff --git a/kernel/smp.c b/kernel/smp.c
> index 715190669e94..867f79689684 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -779,7 +779,6 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
>  	bool wait = scf_flags & SCF_WAIT;
>  	int nr_cpus = 0;
>  	bool run_remote = false;
> -	bool run_local = false;
>  
>  	lockdep_assert_preemption_disabled();
>  
> @@ -801,11 +800,6 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
>  	 */
>  	WARN_ON_ONCE(!in_task());
>  
> -	/* Check if we need local execution. */
> -	if ((scf_flags & SCF_RUN_LOCAL) && cpumask_test_cpu(this_cpu, mask) &&
> -	    (!cond_func || cond_func(this_cpu, info)))
> -		run_local = true;
> -
>  	/* Check if we need remote execution, i.e., any CPU excluding this one. */
>  	if (cpumask_any_and_but(mask, cpu_online_mask, this_cpu) < nr_cpu_ids) {
>  		run_remote = true;
> @@ -853,7 +847,9 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
>  			run_remote = false;
>  	}
>  
> -	if (run_local) {
> +	/* Check if we need local execution. */
> +	if ((scf_flags & SCF_RUN_LOCAL) & cpumask_test_cpu(this_cpu, mask) &&
> +	    (!cond_func || cond_func(this_cpu, info))) {

Dan Carpenter's robot pointed out the bug here. It should be:

        (scf_flags & SCF_RUN_LOCAL) && cpumask_test_cpu(this_cpu, mask)

I'll resend it shortly.

Thanks, Dan!

>  		unsigned long flags;
>  
>  		local_irq_save(flags);
> -- 
> 2.43.0
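
Why the single '&' matters can be shown with a short standalone
sketch. The flag values below are an assumption, following the usual
one-bit-per-flag layout, and run_local_buggy() and run_local_fixed()
are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed flag layout: one bit per flag. */
#define SCF_WAIT      (1U << 0)
#define SCF_RUN_LOCAL (1U << 1)

/* As posted: bitwise AND of a flag value (0 or 2) with a bool (0 or 1). */
bool run_local_buggy(unsigned int scf_flags, bool cpu_in_mask)
{
	return (scf_flags & SCF_RUN_LOCAL) & cpu_in_mask;
}

/* As corrected in the reply: logical AND of two truth values. */
bool run_local_fixed(unsigned int scf_flags, bool cpu_in_mask)
{
	return (scf_flags & SCF_RUN_LOCAL) && cpu_in_mask;
}

/* With the flag set and the CPU in the mask, only the fixed form is true. */
void run_local_example(void)
{
	assert(!run_local_buggy(SCF_RUN_LOCAL, true)); /* 2 & 1 == 0 */
	assert(run_local_fixed(SCF_RUN_LOCAL, true));
}
```

Under that assumed layout the masked flag is 2 and the mask test is 1,
so their bitwise AND is always 0 and the local call would never run.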

