linux-kernel.vger.kernel.org archive mirror
* [PATCH v2] smp: Document preemption and stop_machine() mutual exclusion
@ 2025-07-05 17:23 Joel Fernandes
  2025-07-06 17:01 ` Paul E. McKenney
  2025-07-07  7:50 ` Peter Zijlstra
  0 siblings, 2 replies; 10+ messages in thread
From: Joel Fernandes @ 2025-07-05 17:23 UTC (permalink / raw)
  To: linux-kernel, Thomas Gleixner, Peter Zijlstra
  Cc: Joel Fernandes, Andrea Righi, Paul E . McKenney,
	Frederic Weisbecker, rcu

Recently while revising RCU's cpu online checks, there was some discussion
around how IPIs synchronize with hotplug.

Add comments explaining how preemption disable creates mutual exclusion with
CPU hotplug's stop_machine mechanism. The key insight is that stop_machine()
atomically updates CPU masks and flushes IPIs with interrupts disabled, and
cannot proceed while any CPU (including the IPI sender) has preemption
disabled.

Cc: Andrea Righi <arighi@nvidia.com>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: rcu@vger.kernel.org
Co-developed-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
---
v1->v2: Reworded a bit more (minor nit).

 kernel/cpu.c |  4 ++++
 kernel/smp.c | 12 ++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index a59e009e0be4..a8ce1395dd2c 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -1310,6 +1310,10 @@ static int takedown_cpu(unsigned int cpu)
 
 	/*
 	 * So now all preempt/rcu users must observe !cpu_active().
+	 *
+	 * stop_machine() waits for all CPUs to enable preemption. This lets
+	 * take_cpu_down() atomically update CPU masks and flush last IPI
+	 * before new IPIs can be attempted to be sent.
 	 */
 	err = stop_machine_cpuslocked(take_cpu_down, NULL, cpumask_of(cpu));
 	if (err) {
diff --git a/kernel/smp.c b/kernel/smp.c
index 974f3a3962e8..842691467f9e 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -93,6 +93,9 @@ int smpcfd_dying_cpu(unsigned int cpu)
 	 * explicitly (without waiting for the IPIs to arrive), to
 	 * ensure that the outgoing CPU doesn't go offline with work
 	 * still pending.
+	 *
+	 * This runs in stop_machine's atomic context with interrupts disabled,
+	 * thus CPU offlining and IPI flushing happen together atomically.
 	 */
 	__flush_smp_call_function_queue(false);
 	irq_work_run();
@@ -418,6 +421,10 @@ void __smp_call_single_queue(int cpu, struct llist_node *node)
  */
 static int generic_exec_single(int cpu, call_single_data_t *csd)
 {
+	/*
+	 * Preemption must be disabled by caller to mutually exclude with
+	 * stop_machine() atomically updating CPU masks and flushing IPIs.
+	 */
 	if (cpu == smp_processor_id()) {
 		smp_call_func_t func = csd->func;
 		void *info = csd->info;
@@ -640,6 +647,11 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
 	/*
 	 * prevent preemption and reschedule on another processor,
 	 * as well as CPU removal
+	 *
+	 * get_cpu() disables preemption, ensuring mutual exclusion with
+	 * stop_machine() where CPU offlining and last IPI flushing happen
+	 * atomically versus this code. This guarantees that the cpu_online()
+	 * check and IPI sending are safe without losing IPIs due to offlining.
 	 */
 	this_cpu = get_cpu();
 
-- 
2.43.0
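
The pattern the added comments document can be sketched as follows. This
is a simplified illustration, not actual kernel code: send_ipi_to() is a
made-up stand-in for the csd/IPI plumbing in generic_exec_single().

	preempt_disable();
	if (cpu_online(cpu))
		send_ipi_to(cpu);	/* hypothetical helper */
	/*
	 * While preemption is disabled on this CPU, the stopper task
	 * cannot run here, so stop_machine() cannot complete and
	 * take_cpu_down() cannot update the CPU masks or do the final
	 * IPI flush in smpcfd_dying_cpu().  Either the check above saw
	 * !cpu_online(cpu) and no IPI was sent, or the IPI was queued
	 * before the final flush and is not lost.
	 */
	preempt_enable();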



* Re: [PATCH v2] smp: Document preemption and stop_machine() mutual exclusion
  2025-07-05 17:23 [PATCH v2] smp: Document preemption and stop_machine() mutual exclusion Joel Fernandes
@ 2025-07-06 17:01 ` Paul E. McKenney
  2025-07-07  7:50 ` Peter Zijlstra
  1 sibling, 0 replies; 10+ messages in thread
From: Paul E. McKenney @ 2025-07-06 17:01 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: linux-kernel, Thomas Gleixner, Peter Zijlstra, Andrea Righi,
	Frederic Weisbecker, rcu

On Sat, Jul 05, 2025 at 01:23:27PM -0400, Joel Fernandes wrote:
> Recently while revising RCU's cpu online checks, there was some discussion
> around how IPIs synchronize with hotplug.
> 
> Add comments explaining how preemption disable creates mutual exclusion with
> CPU hotplug's stop_machine mechanism. The key insight is that stop_machine()
> atomically updates CPU masks and flushes IPIs with interrupts disabled, and
> cannot proceed while any CPU (including the IPI sender) has preemption
> disabled.
> 
> Cc: Andrea Righi <arighi@nvidia.com>
> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Frederic Weisbecker <frederic@kernel.org>
> Cc: rcu@vger.kernel.org
> Co-developed-by: Frederic Weisbecker <frederic@kernel.org>
> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>

Acked-by: Paul E. McKenney <paulmck@kernel.org>

> ---
> v1->v2: Reworded a bit more (minor nit).
> 
>  kernel/cpu.c |  4 ++++
>  kernel/smp.c | 12 ++++++++++++
>  2 files changed, 16 insertions(+)
> 
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index a59e009e0be4..a8ce1395dd2c 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -1310,6 +1310,10 @@ static int takedown_cpu(unsigned int cpu)
>  
>  	/*
>  	 * So now all preempt/rcu users must observe !cpu_active().
> +	 *
> +	 * stop_machine() waits for all CPUs to enable preemption. This lets
> +	 * take_cpu_down() atomically update CPU masks and flush last IPI
> +	 * before new IPIs can be attempted to be sent.
>  	 */
>  	err = stop_machine_cpuslocked(take_cpu_down, NULL, cpumask_of(cpu));
>  	if (err) {
> diff --git a/kernel/smp.c b/kernel/smp.c
> index 974f3a3962e8..842691467f9e 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -93,6 +93,9 @@ int smpcfd_dying_cpu(unsigned int cpu)
>  	 * explicitly (without waiting for the IPIs to arrive), to
>  	 * ensure that the outgoing CPU doesn't go offline with work
>  	 * still pending.
> +	 *
> +	 * This runs in stop_machine's atomic context with interrupts disabled,
> +	 * thus CPU offlining and IPI flushing happen together atomically.
>  	 */
>  	__flush_smp_call_function_queue(false);
>  	irq_work_run();
> @@ -418,6 +421,10 @@ void __smp_call_single_queue(int cpu, struct llist_node *node)
>   */
>  static int generic_exec_single(int cpu, call_single_data_t *csd)
>  {
> +	/*
> +	 * Preemption must be disabled by caller to mutually exclude with
> +	 * stop_machine() atomically updating CPU masks and flushing IPIs.
> +	 */
>  	if (cpu == smp_processor_id()) {
>  		smp_call_func_t func = csd->func;
>  		void *info = csd->info;
> @@ -640,6 +647,11 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
>  	/*
>  	 * prevent preemption and reschedule on another processor,
>  	 * as well as CPU removal
> +	 *
> +	 * get_cpu() disables preemption, ensuring mutual exclusion with
> +	 * stop_machine() where CPU offlining and last IPI flushing happen
> +	 * atomically versus this code. This guarantees that the cpu_online()
> +	 * check and IPI sending are safe without losing IPIs due to offlining.
>  	 */
>  	this_cpu = get_cpu();
>  
> -- 
> 2.43.0
> 


* Re: [PATCH v2] smp: Document preemption and stop_machine() mutual exclusion
  2025-07-05 17:23 [PATCH v2] smp: Document preemption and stop_machine() mutual exclusion Joel Fernandes
  2025-07-06 17:01 ` Paul E. McKenney
@ 2025-07-07  7:50 ` Peter Zijlstra
  2025-07-07 14:19   ` Joel Fernandes
  2025-07-07 15:56   ` Paul E. McKenney
  1 sibling, 2 replies; 10+ messages in thread
From: Peter Zijlstra @ 2025-07-07  7:50 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: linux-kernel, Thomas Gleixner, Andrea Righi, Paul E . McKenney,
	Frederic Weisbecker, rcu

On Sat, Jul 05, 2025 at 01:23:27PM -0400, Joel Fernandes wrote:
> Recently while revising RCU's cpu online checks, there was some discussion
> around how IPIs synchronize with hotplug.
> 
> Add comments explaining how preemption disable creates mutual exclusion with
> CPU hotplug's stop_machine mechanism. The key insight is that stop_machine()
> atomically updates CPU masks and flushes IPIs with interrupts disabled, and
> cannot proceed while any CPU (including the IPI sender) has preemption
> disabled.

I'm very conflicted on this. While the added comments aren't wrong,
they're not quite accurate either. Stop_machine doesn't wait for people
to enable preemption as such.

Fundamentally there seems to be a misconception around what stop machine
is and how it works, and I don't feel these comments make things better.

Basically, stop-machine (and stop_one_cpu(), stop_two_cpus()) use the
stopper task, a task running at the ultimate priority; if it is
runnable, it will run.

Stop-machine simply wakes all the stopper tasks and co-ordinates them to
literally stop the machine. All CPUs have the stopper task scheduled and
then they go sit in a spin-loop driven state machine with IRQs disabled.

There really isn't anything magical about any of this.
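
In code, that coordination looks roughly like the following. This is a
stripped-down sketch of multi_cpu_stop() from kernel/stop_machine.c;
is_active_cpu() is a made-up stand-in for the real cpumask test, and the
watchdog and RCU details are omitted.

	static int multi_cpu_stop(void *data)
	{
		struct multi_stop_data *msdata = data;
		enum multi_stop_state newstate, curstate = MULTI_STOP_NONE;
		int err = 0;

		/* Every participating CPU's stopper task spins here. */
		do {
			cpu_relax();
			newstate = READ_ONCE(msdata->state);
			if (newstate != curstate) {
				curstate = newstate;
				switch (curstate) {
				case MULTI_STOP_DISABLE_IRQ:
					local_irq_disable();	/* all CPUs go IRQs-off */
					break;
				case MULTI_STOP_RUN:
					if (is_active_cpu())	/* e.g. the dying CPU */
						err = msdata->fn(msdata->data);
					break;
				default:
					break;
				}
				/* Last CPU to ack advances the shared state. */
				ack_state(msdata);
			}
		} while (curstate != MULTI_STOP_EXIT);

		local_irq_enable();	/* the real code restores saved flags */
		return err;
	}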




* Re: [PATCH v2] smp: Document preemption and stop_machine() mutual exclusion
  2025-07-07  7:50 ` Peter Zijlstra
@ 2025-07-07 14:19   ` Joel Fernandes
  2025-07-08  7:21     ` Peter Zijlstra
  2025-07-07 15:56   ` Paul E. McKenney
  1 sibling, 1 reply; 10+ messages in thread
From: Joel Fernandes @ 2025-07-07 14:19 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Thomas Gleixner, Andrea Righi, Paul E . McKenney,
	Frederic Weisbecker, rcu

On Mon, Jul 07, 2025 at 09:50:50AM +0200, Peter Zijlstra wrote:
> On Sat, Jul 05, 2025 at 01:23:27PM -0400, Joel Fernandes wrote:
> > Recently while revising RCU's cpu online checks, there was some discussion
> > around how IPIs synchronize with hotplug.
> > 
> > Add comments explaining how preemption disable creates mutual exclusion with
> > CPU hotplug's stop_machine mechanism. The key insight is that stop_machine()
> > atomically updates CPU masks and flushes IPIs with interrupts disabled, and
> > cannot proceed while any CPU (including the IPI sender) has preemption
> > disabled.
> 
> I'm very conflicted on this. While the added comments aren't wrong,
> they're not quite accurate either. Stop_machine doesn't wait for people
> to enable preemption as such.

You're right. I actually did not mean to describe how stop_machine is
supposed to work. Indeed, this "trick" for IPI sending safety is more of a
dependency on stop machine I suppose.

> Fundamentally there seems to be a misconception around what stop machine
> is and how it works, and I don't feel these comments make things better.

Sure, but again I am not intending to discuss how stop machine works in this
patch. That's more ambitious.

> Basically, stop-machine (and stop_one_cpu(), stop_two_cpus()) use the
> stopper task, a task running at the ultimate priority; if it is
> runnable, it will run.
> 
> Stop-machine simply wakes all the stopper tasks and co-ordinates them to
> literally stop the machine. All CPUs have the stopper task scheduled and
> then they go sit in a spin-loop driven state machine with IRQs disabled.

Yep.

> There really isn't anything magical about any of this.

So I modified the original patch I sent mainly removing the comments in
stop-machine code and reducing the wordiness. Hope this looks good to you now!

---8<-----------------------

From: Joel Fernandes <joelagnelf@nvidia.com>
Subject: [PATCH] smp: Document preemption and stop_machine() mutual exclusion

Recently while revising RCU's cpu online checks, there was some discussion
around how IPIs synchronize with hotplug.

Add comments explaining how preemption disable creates mutual exclusion with
CPU hotplug's stop_machine mechanism. The key insight is that stop_machine()
atomically updates CPU masks and flushes IPIs with interrupts disabled, and
cannot proceed while any CPU (including the IPI sender) has preemption
disabled.

Cc: Andrea Righi <arighi@nvidia.com>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: rcu@vger.kernel.org
Acked-by: Paul E. McKenney <paulmck@kernel.org>
Co-developed-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
---
I am leaving in Paul's Ack but Paul please let me know if there is a concern!

 kernel/smp.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index 974f3a3962e8..957959031063 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -93,6 +93,9 @@ int smpcfd_dying_cpu(unsigned int cpu)
 	 * explicitly (without waiting for the IPIs to arrive), to
 	 * ensure that the outgoing CPU doesn't go offline with work
 	 * still pending.
+	 *
+	 * This runs with interrupts disabled inside the stopper task invoked
+	 * by stop_machine(), ensuring CPU offlining and IPI flushing are atomic.
 	 */
 	__flush_smp_call_function_queue(false);
 	irq_work_run();
@@ -418,6 +421,10 @@ void __smp_call_single_queue(int cpu, struct llist_node *node)
  */
 static int generic_exec_single(int cpu, call_single_data_t *csd)
 {
+	/*
+	 * Preemption already disabled here so stopper cannot run on this CPU,
+	 * ensuring mutual exclusion with CPU offlining and last IPI flush.
+	 */
 	if (cpu == smp_processor_id()) {
 		smp_call_func_t func = csd->func;
 		void *info = csd->info;
@@ -638,8 +645,10 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
 	int err;
 
 	/*
-	 * prevent preemption and reschedule on another processor,
-	 * as well as CPU removal
+	 * Prevent preemption and reschedule on another processor, as well as
+	 * CPU removal. Also preempt_disable() prevents stopper from running on
+	 * this CPU, thus providing atomicity between the cpu_online() check
+	 * and IPI sending ensuring IPI is not missed by CPU going offline.
 	 */
 	this_cpu = get_cpu();
 
-- 
2.34.1



* Re: [PATCH v2] smp: Document preemption and stop_machine() mutual exclusion
  2025-07-07  7:50 ` Peter Zijlstra
  2025-07-07 14:19   ` Joel Fernandes
@ 2025-07-07 15:56   ` Paul E. McKenney
  2025-07-08  7:23     ` Peter Zijlstra
  1 sibling, 1 reply; 10+ messages in thread
From: Paul E. McKenney @ 2025-07-07 15:56 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Joel Fernandes, linux-kernel, Thomas Gleixner, Andrea Righi,
	Frederic Weisbecker, rcu

On Mon, Jul 07, 2025 at 09:50:50AM +0200, Peter Zijlstra wrote:
> On Sat, Jul 05, 2025 at 01:23:27PM -0400, Joel Fernandes wrote:
> > Recently while revising RCU's cpu online checks, there was some discussion
> > around how IPIs synchronize with hotplug.
> > 
> > Add comments explaining how preemption disable creates mutual exclusion with
> > CPU hotplug's stop_machine mechanism. The key insight is that stop_machine()
> > atomically updates CPU masks and flushes IPIs with interrupts disabled, and
> > cannot proceed while any CPU (including the IPI sender) has preemption
> > disabled.
> 
> I'm very conflicted on this. While the added comments aren't wrong,
> they're not quite accurate either. Stop_machine doesn't wait for people
> to enable preemption as such.
> 
> Fundamentally there seems to be a misconception around what stop machine
> is and how it works, and I don't feel these comments make things better.
> 
> Basically, stop-machine (and stop_one_cpu(), stop_two_cpus()) use the
> stopper task, a task running at the ultimate priority; if it is
> runnable, it will run.
> 
> Stop-machine simply wakes all the stopper tasks and co-ordinates them to
> literally stop the machine. All CPUs have the stopper task scheduled and
> then they go sit in a spin-loop driven state machine with IRQs disabled.
> 
> There really isn't anything magical about any of this.

There is the mechanism (which you have described above), and then there
are the use cases.  Those of us maintaining a given mechanism might
argue that a detailed description of the mechanism suffices, but that
argument does not always win the day.

I do like the description in the stop_machine() kernel-doc header:

 * This can be thought of as a very heavy write lock, equivalent to
 * grabbing every spinlock in the kernel.

Though doesn't this need to upgrade "spinlock" to "raw spinlock"
now that PREEMPT_RT is in mainline?

Also, this function is more powerful than grabbing every write lock
in the kernel because it also excludes all regions of code that have
preemption disabled, which is one thing that CPU hotplug is relying on.
Any objection to calling out that additional semantic?
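
Concretely, that additional semantic reads something like this from a
user's point of view (a sketch only -- global_ptr, use(), and new_p are
made-up names, not real kernel APIs):

	static int noop_fn(void *arg) { return 0; }

	/* "Read side": any preempt-disabled region, e.g. raw spinlock held. */
	preempt_disable();
	p = READ_ONCE(global_ptr);
	use(p);
	preempt_enable();

	/* "Write side": heavier than taking every write lock in the kernel. */
	old = global_ptr;
	WRITE_ONCE(global_ptr, new_p);
	stop_machine(noop_fn, NULL, NULL);	/* waits out all such readers */
	kfree(old);	/* no preempt-disabled reader can still hold old */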

							Thanx, Paul


* Re: [PATCH v2] smp: Document preemption and stop_machine() mutual exclusion
  2025-07-07 14:19   ` Joel Fernandes
@ 2025-07-08  7:21     ` Peter Zijlstra
  2025-07-08 14:00       ` Joel Fernandes
  0 siblings, 1 reply; 10+ messages in thread
From: Peter Zijlstra @ 2025-07-08  7:21 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: linux-kernel, Thomas Gleixner, Andrea Righi, Paul E . McKenney,
	Frederic Weisbecker, rcu

On Mon, Jul 07, 2025 at 10:19:52AM -0400, Joel Fernandes wrote:

> From: Joel Fernandes <joelagnelf@nvidia.com>
> Subject: [PATCH] smp: Document preemption and stop_machine() mutual exclusion
> 
> Recently while revising RCU's cpu online checks, there was some discussion
> around how IPIs synchronize with hotplug.
> 
> Add comments explaining how preemption disable creates mutual exclusion with
> CPU hotplug's stop_machine mechanism. The key insight is that stop_machine()
> atomically updates CPU masks and flushes IPIs with interrupts disabled, and
> cannot proceed while any CPU (including the IPI sender) has preemption
> disabled.
> 
> Cc: Andrea Righi <arighi@nvidia.com>
> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Frederic Weisbecker <frederic@kernel.org>
> Cc: rcu@vger.kernel.org
> Acked-by: Paul E. McKenney <paulmck@kernel.org>
> Co-developed-by: Frederic Weisbecker <frederic@kernel.org>
> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
> ---
> I am leaving in Paul's Ack but Paul please let me know if there is a concern!
> 
>  kernel/smp.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/smp.c b/kernel/smp.c
> index 974f3a3962e8..957959031063 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -93,6 +93,9 @@ int smpcfd_dying_cpu(unsigned int cpu)
>  	 * explicitly (without waiting for the IPIs to arrive), to
>  	 * ensure that the outgoing CPU doesn't go offline with work
>  	 * still pending.
> +	 *
> +	 * This runs with interrupts disabled inside the stopper task invoked
> +	 * by stop_machine(), ensuring CPU offlining and IPI flushing are atomic.

So below you use 'mutual exclusion', which I prefer over 'atomic' as
used here.

>  	 */
>  	__flush_smp_call_function_queue(false);
>  	irq_work_run();
> @@ -418,6 +421,10 @@ void __smp_call_single_queue(int cpu, struct llist_node *node)
>   */
>  static int generic_exec_single(int cpu, call_single_data_t *csd)
>  {
> +	/*
> +	 * Preemption already disabled here so stopper cannot run on this CPU,
> +	 * ensuring mutual exclusion with CPU offlining and last IPI flush.
> +	 */
>  	if (cpu == smp_processor_id()) {
>  		smp_call_func_t func = csd->func;
>  		void *info = csd->info;
> @@ -638,8 +645,10 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
>  	int err;
>  
>  	/*
> -	 * prevent preemption and reschedule on another processor,
> -	 * as well as CPU removal
> +	 * Prevent preemption and reschedule on another processor, as well as
> +	 * CPU removal.

>          Also preempt_disable() prevents stopper from running on
> +	 * this CPU, thus providing atomicity between the cpu_online() check
> +	 * and IPI sending ensuring IPI is not missed by CPU going offline.

That first sentence already covers this, no? 'prevents preemption' ->
stopper task cannot run, 'CPU removal' -> no CPU_DYING (because no
stopper). Also that 'atomicity' vs 'mutual exclusion' thing.


>  	 */
>  	this_cpu = get_cpu();
>  
> -- 
> 2.34.1
> 


* Re: [PATCH v2] smp: Document preemption and stop_machine() mutual exclusion
  2025-07-07 15:56   ` Paul E. McKenney
@ 2025-07-08  7:23     ` Peter Zijlstra
  2025-07-08 22:52       ` Paul E. McKenney
  0 siblings, 1 reply; 10+ messages in thread
From: Peter Zijlstra @ 2025-07-08  7:23 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Joel Fernandes, linux-kernel, Thomas Gleixner, Andrea Righi,
	Frederic Weisbecker, rcu

On Mon, Jul 07, 2025 at 08:56:04AM -0700, Paul E. McKenney wrote:
> On Mon, Jul 07, 2025 at 09:50:50AM +0200, Peter Zijlstra wrote:
> > On Sat, Jul 05, 2025 at 01:23:27PM -0400, Joel Fernandes wrote:
> > > Recently while revising RCU's cpu online checks, there was some discussion
> > > around how IPIs synchronize with hotplug.
> > > 
> > > Add comments explaining how preemption disable creates mutual exclusion with
> > > CPU hotplug's stop_machine mechanism. The key insight is that stop_machine()
> > > atomically updates CPU masks and flushes IPIs with interrupts disabled, and
> > > cannot proceed while any CPU (including the IPI sender) has preemption
> > > disabled.
> > 
> > I'm very conflicted on this. While the added comments aren't wrong,
> > they're not quite accurate either. Stop_machine doesn't wait for people
> > to enable preemption as such.
> > 
> > Fundamentally there seems to be a misconception around what stop machine
> > is and how it works, and I don't feel these comments make things better.
> > 
> > Basically, stop-machine (and stop_one_cpu(), stop_two_cpus()) use the
> > stopper task, a task running at the ultimate priority; if it is
> > runnable, it will run.
> > 
> > Stop-machine simply wakes all the stopper tasks and co-ordinates them to
> > literally stop the machine. All CPUs have the stopper task scheduled and
> > then they go sit in a spin-loop driven state machine with IRQs disabled.
> > 
> > There really isn't anything magical about any of this.
> 
> There is the mechanism (which you have described above), and then there
> are the use cases.  Those of us maintaining a given mechanism might
> argue that a detailed description of the mechanism suffices, but that
> argument does not always win the day.
> 
> I do like the description in the stop_machine() kernel-doc header:
> 
>  * This can be thought of as a very heavy write lock, equivalent to
>  * grabbing every spinlock in the kernel.
> 
> Though doesn't this need to upgrade "spinlock" to "raw spinlock"
> now that PREEMPT_RT is in mainline?
> 
> Also, this function is more powerful than grabbing every write lock
> in the kernel because it also excludes all regions of code that have
> preemption disabled, which is one thing that CPU hotplug is relying on.
> Any objection to calling out that additional semantic?

Best to just re-formulate the entire comment I think. State it provides
exclusion vs all non-preemptible regions in the kernel -- at insane cost
and should not be used when humanly possible :-)


* Re: [PATCH v2] smp: Document preemption and stop_machine() mutual exclusion
  2025-07-08  7:21     ` Peter Zijlstra
@ 2025-07-08 14:00       ` Joel Fernandes
  0 siblings, 0 replies; 10+ messages in thread
From: Joel Fernandes @ 2025-07-08 14:00 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Thomas Gleixner, Andrea Righi, Paul E . McKenney,
	Frederic Weisbecker, rcu



On 7/8/2025 3:21 AM, Peter Zijlstra wrote:
> On Mon, Jul 07, 2025 at 10:19:52AM -0400, Joel Fernandes wrote:
> 
>> From: Joel Fernandes <joelagnelf@nvidia.com>
>> Subject: [PATCH] smp: Document preemption and stop_machine() mutual exclusion
>>
>> Recently while revising RCU's cpu online checks, there was some discussion
>> around how IPIs synchronize with hotplug.
>>
>> Add comments explaining how preemption disable creates mutual exclusion with
>> CPU hotplug's stop_machine mechanism. The key insight is that stop_machine()
>> atomically updates CPU masks and flushes IPIs with interrupts disabled, and
>> cannot proceed while any CPU (including the IPI sender) has preemption
>> disabled.
>>
>> Cc: Andrea Righi <arighi@nvidia.com>
>> Cc: Paul E. McKenney <paulmck@kernel.org>
>> Cc: Frederic Weisbecker <frederic@kernel.org>
>> Cc: rcu@vger.kernel.org
>> Acked-by: Paul E. McKenney <paulmck@kernel.org>
>> Co-developed-by: Frederic Weisbecker <frederic@kernel.org>
>> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
>> ---
>> I am leaving in Paul's Ack but Paul please let me know if there is a concern!
>>
>>  kernel/smp.c | 13 +++++++++++--
>>  1 file changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/smp.c b/kernel/smp.c
>> index 974f3a3962e8..957959031063 100644
>> --- a/kernel/smp.c
>> +++ b/kernel/smp.c
>> @@ -93,6 +93,9 @@ int smpcfd_dying_cpu(unsigned int cpu)
>>  	 * explicitly (without waiting for the IPIs to arrive), to
>>  	 * ensure that the outgoing CPU doesn't go offline with work
>>  	 * still pending.
>> +	 *
>> +	 * This runs with interrupts disabled inside the stopper task invoked
>> +	 * by stop_machine(), ensuring CPU offlining and IPI flushing are atomic.
> 
> So below you use 'mutual exclusion', which I prefer over 'atomic' as
> used here.

Sure, will fix.

> 
>>  	 */
>>  	__flush_smp_call_function_queue(false);
>>  	irq_work_run();
>> @@ -418,6 +421,10 @@ void __smp_call_single_queue(int cpu, struct llist_node *node)
>>   */
>>  static int generic_exec_single(int cpu, call_single_data_t *csd)
>>  {
>> +	/*
>> +	 * Preemption already disabled here so stopper cannot run on this CPU,
>> +	 * ensuring mutual exclusion with CPU offlining and last IPI flush.
>> +	 */
>>  	if (cpu == smp_processor_id()) {
>>  		smp_call_func_t func = csd->func;
>>  		void *info = csd->info;
>> @@ -638,8 +645,10 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
>>  	int err;
>>  
>>  	/*
>> -	 * prevent preemption and reschedule on another processor,
>> -	 * as well as CPU removal
>> +	 * Prevent preemption and reschedule on another processor, as well as
>> +	 * CPU removal.
> 
>>          Also preempt_disable() prevents stopper from running on
>> +	 * this CPU, thus providing atomicity between the cpu_online() check
>> +	 * and IPI sending ensuring IPI is not missed by CPU going offline.
> 
> That first sentence already covers this, no? 'prevents preemption' ->
> stopper task cannot run, 'CPU removal' -> no CPU_DYING (because no
> stopper).

Yeah I understand that's "implied" but I'd like to specifically call that out if
that's Ok :)

> Also that 'atomicity' vs 'mutual exclusion' thing.

Sure, will fix :)

Thanks!

 - Joel



* Re: [PATCH v2] smp: Document preemption and stop_machine() mutual exclusion
  2025-07-08  7:23     ` Peter Zijlstra
@ 2025-07-08 22:52       ` Paul E. McKenney
  2025-07-14 17:20         ` Paul E. McKenney
  0 siblings, 1 reply; 10+ messages in thread
From: Paul E. McKenney @ 2025-07-08 22:52 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Joel Fernandes, linux-kernel, Thomas Gleixner, Andrea Righi,
	Frederic Weisbecker, rcu

On Tue, Jul 08, 2025 at 09:23:21AM +0200, Peter Zijlstra wrote:
> On Mon, Jul 07, 2025 at 08:56:04AM -0700, Paul E. McKenney wrote:
> > On Mon, Jul 07, 2025 at 09:50:50AM +0200, Peter Zijlstra wrote:
> > > On Sat, Jul 05, 2025 at 01:23:27PM -0400, Joel Fernandes wrote:
> > > > Recently while revising RCU's cpu online checks, there was some discussion
> > > > around how IPIs synchronize with hotplug.
> > > > 
> > > > Add comments explaining how preemption disable creates mutual exclusion with
> > > > CPU hotplug's stop_machine mechanism. The key insight is that stop_machine()
> > > > atomically updates CPU masks and flushes IPIs with interrupts disabled, and
> > > > cannot proceed while any CPU (including the IPI sender) has preemption
> > > > disabled.
> > > 
> > > I'm very conflicted on this. While the added comments aren't wrong,
> > > they're not quite accurate either. Stop_machine doesn't wait for people
> > > to enable preemption as such.
> > > 
> > > Fundamentally there seems to be a misconception around what stop machine
> > > is and how it works, and I don't feel these comments make things better.
> > > 
> > > Basically, stop-machine (and stop_one_cpu(), stop_two_cpus()) use the
> > > stopper task, a task running at the ultimate priority; if it is
> > > runnable, it will run.
> > > 
> > > Stop-machine simply wakes all the stopper tasks and co-ordinates them to
> > > literally stop the machine. All CPUs have the stopper task scheduled and
> > > then they go sit in a spin-loop driven state machine with IRQs disabled.
> > > 
> > > There really isn't anything magical about any of this.
> > 
> > There is the mechanism (which you have described above), and then there
> > are the use cases.  Those of us maintaining a given mechanism might
> > argue that a detailed description of the mechanism suffices, but that
> > argument does not always win the day.
> > 
> > I do like the description in the stop_machine() kernel-doc header:
> > 
> >  * This can be thought of as a very heavy write lock, equivalent to
> >  * grabbing every spinlock in the kernel.
> > 
> > Though doesn't this need to upgrade "spinlock" to "raw spinlock"
> > now that PREEMPT_RT is in mainline?
> > 
> > Also, this function is more powerful than grabbing every write lock
> > in the kernel because it also excludes all regions of code that have
> > preemption disabled, which is one thing that CPU hotplug is relying on.
> > Any objection to calling out that additional semantic?
> 
> Best to just re-formulate the entire comment I think. State it provides
> exclusion vs all non-preemptible regions in the kernel -- at insane cost
> and should not be used when humanly possible :-)

How about like this?

							Thanx, Paul

------------------------------------------------------------------------

commit 3f0626b0514ccda56d15995e5bd1d1552f828705
Author: Paul E. McKenney <paulmck@kernel.org>
Date:   Tue Jul 8 15:47:29 2025 -0700

    stop_machine: Improve kernel-doc function-header comments
    
    Add more detail to the kernel-doc function-header comments for
    stop_machine(), stop_machine_cpuslocked(), and stop_core_cpuslocked().
    
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

diff --git a/include/linux/stop_machine.h b/include/linux/stop_machine.h
index 3132262a404dc..72820503514cc 100644
--- a/include/linux/stop_machine.h
+++ b/include/linux/stop_machine.h
@@ -88,55 +88,73 @@ static inline void print_stop_info(const char *log_lvl, struct task_struct *task
 #endif	/* CONFIG_SMP */
 
 /*
- * stop_machine "Bogolock": stop the entire machine, disable
- * interrupts.  This is a very heavy lock, which is equivalent to
- * grabbing every spinlock (and more).  So the "read" side to such a
- * lock is anything which disables preemption.
+ * stop_machine "Bogolock": stop the entire machine, disable interrupts.
+ * This is a very heavy lock, which is equivalent to grabbing every raw
+ * spinlock (and more).  So the "read" side to such a lock is anything
+ * which disables preemption.
  */
 #if defined(CONFIG_SMP) || defined(CONFIG_HOTPLUG_CPU)
 
 /**
  * stop_machine: freeze the machine on all CPUs and run this function
  * @fn: the function to run
- * @data: the data ptr for the @fn()
- * @cpus: the cpus to run the @fn() on (NULL = any online cpu)
+ * @data: the data ptr to pass to @fn()
+ * @cpus: the cpus to run @fn() on (NULL = run on each online CPU)
  *
- * Description: This causes a thread to be scheduled on every cpu,
- * each of which disables interrupts.  The result is that no one is
- * holding a spinlock or inside any other preempt-disabled region when
- * @fn() runs.
+ * Description: This causes a thread to be scheduled on every CPU, which
+ * will run with interrupts disabled.  Each CPU specified by @cpus will
+ * run @fn.  While @fn is executing, there will be no other CPUs holding
+ * a raw spinlock or running within any other type of preempt-disabled
+ * region of code.
  *
- * This can be thought of as a very heavy write lock, equivalent to
- * grabbing every spinlock in the kernel.
+ * When @cpus specifies only a single CPU, this can be thought of as
+ * a reader-writer lock where readers disable preemption (for example,
+ * by holding a raw spinlock) and where the insanely heavy writers run
+ * @fn while also preventing any other CPU from doing any useful work.
+ * These writers can also be thought of as having implicitly grabbed every
+ * raw spinlock in the kernel.
  *
- * Protects against CPU hotplug.
+ * When @fn is a no-op, this can be thought of as an RCU implementation
+ * where readers again disable preemption and writers use stop_machine()
+ * in place of synchronize_rcu(), albeit with orders of magnitude more
+ * disruption than even that of synchronize_rcu_expedited().
+ *
+ * Although only one stop_machine() operation can proceed at a time,
+ * the possibility of blocking in cpus_read_lock() means that the caller
+ * cannot usefully rely on this serialization.
+ *
+ * Return: 0 if all invocations of @fn return zero.  Otherwise, the
+ * value returned by an arbitrarily chosen member of the set of calls to
+ * @fn that returned non-zero.
  */
 int stop_machine(cpu_stop_fn_t fn, void *data, const struct cpumask *cpus);
 
 /**
  * stop_machine_cpuslocked: freeze the machine on all CPUs and run this function
  * @fn: the function to run
- * @data: the data ptr for the @fn()
- * @cpus: the cpus to run the @fn() on (NULL = any online cpu)
+ * @data: the data ptr to pass to @fn()
+ * @cpus: the cpus to run @fn() on (NULL = run on each online CPU)
+ *
+ * Same as above.  Avoids nested calls to cpus_read_lock().
  *
- * Same as above. Must be called from with in a cpus_read_lock() protected
- * region. Avoids nested calls to cpus_read_lock().
+ * Context: Must be called from within a cpus_read_lock() protected region.
  */
 int stop_machine_cpuslocked(cpu_stop_fn_t fn, void *data, const struct cpumask *cpus);
 
 /**
  * stop_core_cpuslocked: - stop all threads on just one core
  * @cpu: any cpu in the targeted core
- * @fn: the function to run
- * @data: the data ptr for @fn()
+ * @fn: the function to run on each CPU in the core containing @cpu
+ * @data: the data ptr to pass to @fn()
  *
- * Same as above, but instead of every CPU, only the logical CPUs of a
- * single core are affected.
+ * Same as above, but instead of every CPU, only the logical CPUs of the
+ * single core containing @cpu are affected.
  *
  * Context: Must be called from within a cpus_read_lock() protected region.
  *
- * Return: 0 if all executions of @fn returned 0, any non zero return
- * value if any returned non zero.
+ * Return: 0 if all invocations of @fn return zero.  Otherwise, the
+ * value returned by an arbitrarily chosen member of the set of calls to
+ * @fn that returned non-zero.
  */
 int stop_core_cpuslocked(unsigned int cpu, cpu_stop_fn_t fn, void *data);
 


* Re: [PATCH v2] smp: Document preemption and stop_machine() mutual exclusion
  2025-07-08 22:52       ` Paul E. McKenney
@ 2025-07-14 17:20         ` Paul E. McKenney
  0 siblings, 0 replies; 10+ messages in thread
From: Paul E. McKenney @ 2025-07-14 17:20 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Joel Fernandes, linux-kernel, Thomas Gleixner, Andrea Righi,
	Frederic Weisbecker, rcu

On Tue, Jul 08, 2025 at 03:52:44PM -0700, Paul E. McKenney wrote:
> On Tue, Jul 08, 2025 at 09:23:21AM +0200, Peter Zijlstra wrote:

[ . . . ]

> > Best to just re-formulate the entire comment I think. State it provides
> > exclusion vs all non-preemptible regions in the kernel -- at insane cost
> > and should not be used when humanly possible :-)
> 
> How about like this?

Hearing no objections...  ;-)

							Thanx, Paul

> ------------------------------------------------------------------------
> 
> commit 3f0626b0514ccda56d15995e5bd1d1552f828705
> Author: Paul E. McKenney <paulmck@kernel.org>
> Date:   Tue Jul 8 15:47:29 2025 -0700
> 
>     stop_machine: Improve kernel-doc function-header comments
>     
>     Add more detail to the kernel-doc function-header comments for
>     stop_machine(), stop_machine_cpuslocked(), and stop_core_cpuslocked().
>     
>     Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> 
> diff --git a/include/linux/stop_machine.h b/include/linux/stop_machine.h
> index 3132262a404dc..72820503514cc 100644
> --- a/include/linux/stop_machine.h
> +++ b/include/linux/stop_machine.h
> @@ -88,55 +88,73 @@ static inline void print_stop_info(const char *log_lvl, struct task_struct *task
>  #endif	/* CONFIG_SMP */
>  
>  /*
> - * stop_machine "Bogolock": stop the entire machine, disable
> - * interrupts.  This is a very heavy lock, which is equivalent to
> - * grabbing every spinlock (and more).  So the "read" side to such a
> - * lock is anything which disables preemption.
> + * stop_machine "Bogolock": stop the entire machine, disable interrupts.
> + * This is a very heavy lock, which is equivalent to grabbing every raw
> + * spinlock (and more).  So the "read" side to such a lock is anything
> + * which disables preemption.
>   */
>  #if defined(CONFIG_SMP) || defined(CONFIG_HOTPLUG_CPU)
>  
>  /**
>   * stop_machine: freeze the machine on all CPUs and run this function
>   * @fn: the function to run
> - * @data: the data ptr for the @fn()
> - * @cpus: the cpus to run the @fn() on (NULL = any online cpu)
> + * @data: the data ptr to pass to @fn()
> + * @cpus: the cpus to run @fn() on (NULL = run on each online CPU)
>   *
> - * Description: This causes a thread to be scheduled on every cpu,
> - * each of which disables interrupts.  The result is that no one is
> - * holding a spinlock or inside any other preempt-disabled region when
> - * @fn() runs.
> + * Description: This causes a thread to be scheduled on every CPU, which
> + * will run with interrupts disabled.  Each CPU specified by @cpus will
> + * run @fn.  While @fn is executing, there will be no other CPUs holding
> + * a raw spinlock or running within any other type of preempt-disabled
> + * region of code.
>   *
> - * This can be thought of as a very heavy write lock, equivalent to
> - * grabbing every spinlock in the kernel.
> + * When @cpus specifies only a single CPU, this can be thought of as
> + * a reader-writer lock where readers disable preemption (for example,
> + * by holding a raw spinlock) and where the insanely heavy writers run
> + * @fn while also preventing any other CPU from doing any useful work.
> + * These writers can also be thought of as having implicitly grabbed every
> + * raw spinlock in the kernel.
>   *
> - * Protects against CPU hotplug.
> + * When @fn is a no-op, this can be thought of as an RCU implementation
> + * where readers again disable preemption and writers use stop_machine()
> + * in place of synchronize_rcu(), albeit with orders of magnitude more
> + * disruption than even that of synchronize_rcu_expedited().
> + *
> + * Although only one stop_machine() operation can proceed at a time,
> + * the possibility of blocking in cpus_read_lock() means that the caller
> + * cannot usefully rely on this serialization.
> + *
> + * Return: 0 if all invocations of @fn return zero.  Otherwise, the
> + * value returned by an arbitrarily chosen member of the set of calls to
> + * @fn that returned non-zero.
>   */
>  int stop_machine(cpu_stop_fn_t fn, void *data, const struct cpumask *cpus);
>  
>  /**
>   * stop_machine_cpuslocked: freeze the machine on all CPUs and run this function
>   * @fn: the function to run
> - * @data: the data ptr for the @fn()
> - * @cpus: the cpus to run the @fn() on (NULL = any online cpu)
> + * @data: the data ptr to pass to @fn()
> + * @cpus: the cpus to run @fn() on (NULL = run on each online CPU)
> + *
> + * Same as above.  Avoids nested calls to cpus_read_lock().
>   *
> - * Same as above. Must be called from with in a cpus_read_lock() protected
> - * region. Avoids nested calls to cpus_read_lock().
> + * Context: Must be called from within a cpus_read_lock() protected region.
>   */
>  int stop_machine_cpuslocked(cpu_stop_fn_t fn, void *data, const struct cpumask *cpus);
>  
>  /**
>   * stop_core_cpuslocked: - stop all threads on just one core
>   * @cpu: any cpu in the targeted core
> - * @fn: the function to run
> - * @data: the data ptr for @fn()
> + * @fn: the function to run on each CPU in the core containing @cpu
> + * @data: the data ptr to pass to @fn()
>   *
> - * Same as above, but instead of every CPU, only the logical CPUs of a
> - * single core are affected.
> + * Same as above, but instead of every CPU, only the logical CPUs of the
> + * single core containing @cpu are affected.
>   *
>   * Context: Must be called from within a cpus_read_lock() protected region.
>   *
> - * Return: 0 if all executions of @fn returned 0, any non zero return
> - * value if any returned non zero.
> + * Return: 0 if all invocations of @fn return zero.  Otherwise, the
> + * value returned by an arbitrarily chosen member of the set of calls to
> + * @fn that returned non-zero.
>   */
>  int stop_core_cpuslocked(unsigned int cpu, cpu_stop_fn_t fn, void *data);
>  

