[PATCH 0/4] Fix scalability problem in workqueue watchdog touch caused by stop

public inbox for linux-kernel@vger.kernel.org

* [PATCH 0/4] Fix scalability problem in workqueue watchdog touch caused by stop_machine
@ 2024-06-25 11:42 Nicholas Piggin
  2024-06-25 11:42 ` [PATCH 1/4] workqueue: wq_watchdog_touch is always called with valid CPU Nicholas Piggin
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Nicholas Piggin @ 2024-06-25 11:42 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Nicholas Piggin, Paul E . McKenney, Peter Zijlstra, Lai Jiangshan,
	Srikar Dronamraju, linux-kernel

Here are a few patches to fix a lockup caused by very slow progress due
to a scalability problem in workqueue watchdog touch being hammered by
thousands of CPUs in multi_cpu_stop. Patch 2 is the fix.

While making a microbenchmark reproducer, I noticed that the RCU call
was also causing slowdowns. It was not nearly as bad as the workqueue
touch, but workqueue queueing of dummy jobs slowed down several-fold
when lots of other CPUs were making rcu_momentary_dyntick_idle() calls,
so I did the stop_machine patches to reduce that. Patches 3 and 4 are
independent of the first two and can go in any order.

Thanks,
Nick

Nicholas Piggin (4):
  workqueue: wq_watchdog_touch is always called with valid CPU
  workqueue: Improve scalability of workqueue watchdog touch
  stop_machine: Rearrange multi_cpu_stop state machine loop
  stop_machine: Add a delay between multi_cpu_stop touching watchdogs

 kernel/stop_machine.c | 31 +++++++++++++++++++++++--------
 kernel/workqueue.c    | 12 ++++++++++--
 2 files changed, 33 insertions(+), 10 deletions(-)

-- 
2.45.1


* [PATCH 1/4] workqueue: wq_watchdog_touch is always called with valid CPU
  2024-06-25 11:42 [PATCH 0/4] Fix scalability problem in workqueue watchdog touch caused by stop_machine Nicholas Piggin
@ 2024-06-25 11:42 ` Nicholas Piggin
  2024-06-25 11:42 ` [PATCH 2/4] workqueue: Improve scalability of workqueue watchdog touch Nicholas Piggin
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Nicholas Piggin @ 2024-06-25 11:42 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Nicholas Piggin, Paul E . McKenney, Peter Zijlstra, Lai Jiangshan,
	Srikar Dronamraju, linux-kernel

Warn in the case it is called with cpu == -1. This does not appear
to happen anywhere.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 kernel/workqueue.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 003474c9a77d..0954b778b315 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -7562,6 +7562,8 @@ notrace void wq_watchdog_touch(int cpu)
 {
 	if (cpu >= 0)
 		per_cpu(wq_watchdog_touched_cpu, cpu) = jiffies;
+	else
+		WARN_ONCE(1, "%s should be called with valid CPU", __func__);
 
 	wq_watchdog_touched = jiffies;
 }
-- 
2.45.1



* [PATCH 2/4] workqueue: Improve scalability of workqueue watchdog touch
  2024-06-25 11:42 [PATCH 0/4] Fix scalability problem in workqueue watchdog touch caused by stop_machine Nicholas Piggin
  2024-06-25 11:42 ` [PATCH 1/4] workqueue: wq_watchdog_touch is always called with valid CPU Nicholas Piggin
@ 2024-06-25 11:42 ` Nicholas Piggin
  2024-06-25 16:57   ` Tejun Heo
  2024-06-27 12:16   ` Hillf Danton
  2024-06-25 11:42 ` [PATCH 3/4] stop_machine: Rearrange multi_cpu_stop state machine loop Nicholas Piggin
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 13+ messages in thread
From: Nicholas Piggin @ 2024-06-25 11:42 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Nicholas Piggin, Paul E . McKenney, Peter Zijlstra, Lai Jiangshan,
	Srikar Dronamraju, linux-kernel

On a ~2000 CPU powerpc system, hard lockups have been observed in the
workqueue code when stop_machine runs (in this case due to CPU hotplug).
This is due to lots of CPUs spinning in multi_cpu_stop, calling
touch_nmi_watchdog() which ends up calling wq_watchdog_touch().
wq_watchdog_touch() writes to the global variable wq_watchdog_touched,
and that can find itself in the same cacheline as other important
workqueue data, which slows down operations to the point of lockups.

In the case of the following abridged trace, worker_pool_idr was in
the hot line, causing the lockups to always appear at idr_find.

  watchdog: CPU 1125 self-detected hard LOCKUP @ idr_find
  Call Trace:
  get_work_pool
  __queue_work
  call_timer_fn
  run_timer_softirq
  __do_softirq
  do_softirq_own_stack
  irq_exit
  timer_interrupt
  decrementer_common_virt
  * interrupt: 900 (timer) at multi_cpu_stop
  multi_cpu_stop
  cpu_stopper_thread
  smpboot_thread_fn
  kthread

Fix this by having wq_watchdog_touch() only write to the line if the
time since the last recorded touch exceeds 1/4 of the watchdog threshold.

Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 kernel/workqueue.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 0954b778b315..f60886782f31 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -7560,12 +7560,18 @@ static void wq_watchdog_timer_fn(struct timer_list *unused)
 
 notrace void wq_watchdog_touch(int cpu)
 {
+	unsigned long thresh = READ_ONCE(wq_watchdog_thresh) * HZ;
+	unsigned long touch_ts = READ_ONCE(wq_watchdog_touched);
+	unsigned long now = jiffies;
+
 	if (cpu >= 0)
-		per_cpu(wq_watchdog_touched_cpu, cpu) = jiffies;
+		per_cpu(wq_watchdog_touched_cpu, cpu) = now;
 	else
 		WARN_ONCE(1, "%s should be called with valid CPU", __func__);
 
-	wq_watchdog_touched = jiffies;
+	/* Don't unnecessarily store to global cacheline */
+	if (time_after(now, touch_ts + thresh / 4))
+		WRITE_ONCE(wq_watchdog_touched, jiffies);
 }
 
 static void wq_watchdog_set_thresh(unsigned long thresh)
-- 
2.45.1
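
The throttling idea in the patch above can be sketched outside the
kernel. This is a minimal user-space model, not the kernel code:
shared_touched, store_count, ticks_after() and throttled_touch() are
hypothetical names, a plain tick count stands in for jiffies, and C11
relaxed atomics stand in for READ_ONCE()/WRITE_ONCE().

```c
#include <stdbool.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the global wq_watchdog_touched timestamp. */
static _Atomic unsigned long shared_touched;
/* Counts stores for illustration only (not thread-safe, single-threaded demo). */
static unsigned long store_count;

/* Wrap-safe "a is after b", the same idea as the kernel's time_after(). */
static bool ticks_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/*
 * Record a touch at time `now`, but only store to the shared variable when
 * the recorded value is more than thresh/4 ticks old. Most callers then only
 * read the shared variable instead of writing to it.
 */
static void throttled_touch(unsigned long now, unsigned long thresh)
{
	unsigned long last = atomic_load_explicit(&shared_touched,
						  memory_order_relaxed);

	if (ticks_after(now, last + thresh / 4)) {
		atomic_store_explicit(&shared_touched, now,
				      memory_order_relaxed);
		store_count++;
	}
}
```

In this model, with thresh = 40 and one call per tick for ticks 1
through 100, only 9 calls store to the shared variable; the other 91
only read it.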



* [PATCH 3/4] stop_machine: Rearrange multi_cpu_stop state machine loop
  2024-06-25 11:42 [PATCH 0/4] Fix scalability problem in workqueue watchdog touch caused by stop_machine Nicholas Piggin
  2024-06-25 11:42 ` [PATCH 1/4] workqueue: wq_watchdog_touch is always called with valid CPU Nicholas Piggin
  2024-06-25 11:42 ` [PATCH 2/4] workqueue: Improve scalability of workqueue watchdog touch Nicholas Piggin
@ 2024-06-25 11:42 ` Nicholas Piggin
  2024-06-25 11:42 ` [PATCH 4/4] stop_machine: Add a delay between multi_cpu_stop touching watchdogs Nicholas Piggin
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Nicholas Piggin @ 2024-06-25 11:42 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Nicholas Piggin, Paul E . McKenney, Peter Zijlstra, Lai Jiangshan,
	Srikar Dronamraju, linux-kernel

More clearly separate the state-machine progress case from the
non-progress case.

Move stop_machine_yield() and rcu_momentary_dyntick_idle() calls
to the non-progress case like touch_nmi_watchdog(), rather than
always calling them, since there is no reason to yield or touch
watchdogs if the state machine progressed.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 kernel/stop_machine.c | 25 +++++++++++++++----------
 1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index cedb17ba158a..1e5c4702e36c 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -225,8 +225,6 @@ static int multi_cpu_stop(void *data)
 
 	/* Simple state machine */
 	do {
-		/* Chill out and ensure we re-read multi_stop_state. */
-		stop_machine_yield(cpumask);
 		newstate = READ_ONCE(msdata->state);
 		if (newstate != curstate) {
 			curstate = newstate;
@@ -243,15 +241,22 @@ static int multi_cpu_stop(void *data)
 				break;
 			}
 			ack_state(msdata);
-		} else if (curstate > MULTI_STOP_PREPARE) {
-			/*
-			 * At this stage all other CPUs we depend on must spin
-			 * in the same loop. Any reason for hard-lockup should
-			 * be detected and reported on their side.
-			 */
-			touch_nmi_watchdog();
+
+		} else {
+			/* No state change, chill out */
+			stop_machine_yield(cpumask);
+			if (curstate > MULTI_STOP_PREPARE) {
+				/*
+				 * At this stage all other CPUs we depend on
+				 * must spin in the same loop. Any reason for
+				 * hard-lockup should be detected and reported
+				 * on their side.
+				 */
+				touch_nmi_watchdog();
+			}
+			rcu_momentary_dyntick_idle();
 		}
-		rcu_momentary_dyntick_idle();
+
 	} while (curstate != MULTI_STOP_EXIT);
 
 	local_irq_restore(flags);
-- 
2.45.1



* [PATCH 4/4] stop_machine: Add a delay between multi_cpu_stop touching watchdogs
  2024-06-25 11:42 [PATCH 0/4] Fix scalability problem in workqueue watchdog touch caused by stop_machine Nicholas Piggin
                   ` (2 preceding siblings ...)
  2024-06-25 11:42 ` [PATCH 3/4] stop_machine: Rearrange multi_cpu_stop state machine loop Nicholas Piggin
@ 2024-06-25 11:42 ` Nicholas Piggin
  2024-06-25 14:53 ` [PATCH 0/4] Fix scalability problem in workqueue watchdog touch caused by stop_machine Paul E. McKenney
  2024-06-26 12:58 ` Michal Koutný
  5 siblings, 0 replies; 13+ messages in thread
From: Nicholas Piggin @ 2024-06-25 11:42 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Nicholas Piggin, Paul E . McKenney, Peter Zijlstra, Lai Jiangshan,
	Srikar Dronamraju, linux-kernel

If a lot of CPUs call rcu_momentary_dyntick_idle() in a tight loop,
this can cause contention that could slow other CPUs reaching
multi_cpu_stop. Add a 10ms delay between patting the various dogs.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 kernel/stop_machine.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 1e5c4702e36c..626199b572c6 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -243,8 +243,18 @@ static int multi_cpu_stop(void *data)
 			ack_state(msdata);
 
 		} else {
-			/* No state change, chill out */
-			stop_machine_yield(cpumask);
+			/*
+			 * No state change, chill out. Delay here to prevent
+			 * the watchdogs and RCU being hit too hard by lots
+			 * of CPUs, which can cause contention and slowdowns.
+			 */
+			unsigned long t = jiffies + msecs_to_jiffies(10);
+
+			while (time_before(jiffies, t)) {
+				if (READ_ONCE(msdata->state) != curstate)
+					break;
+				stop_machine_yield(cpumask);
+			}
 			if (curstate > MULTI_STOP_PREPARE) {
 				/*
 				 * At this stage all other CPUs we depend on
-- 
2.45.1
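
The delay loop added by the patch above polls for a state change until
a deadline passes, so a spinning CPU only reaches the watchdog and RCU
calls once per delay period. A minimal user-space sketch of that
pattern follows. It is a model under stated assumptions, not the kernel
code: fake_ticks, shared_state, model_yield() and wait_for_change() are
hypothetical names, and the fake clock advances by one tick per yield.

```c
#include <stdbool.h>

/* Hypothetical stand-ins: a tick counter and the state being watched. */
static unsigned long fake_ticks;
static int shared_state;

/* In this model each yield advances the fake clock by one tick. */
static void model_yield(void)
{
	fake_ticks++;
}

/*
 * Wait up to `delay` ticks for shared_state to differ from `cur`.
 * Returns true if a change was seen, false if the delay expired first.
 */
static bool wait_for_change(int cur, unsigned long delay)
{
	unsigned long deadline = fake_ticks + delay;

	/* Wrap-safe "fake_ticks is before deadline", like time_before(). */
	while ((long)(fake_ticks - deadline) < 0) {
		if (shared_state != cur)
			return true;
		model_yield();
	}
	return false;
}
```

A caller would touch its watchdogs only when wait_for_change() returns
false, and would go back to the state machine as soon as it returns
true.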



* Re: [PATCH 0/4] Fix scalability problem in workqueue watchdog touch caused by stop_machine
  2024-06-25 11:42 [PATCH 0/4] Fix scalability problem in workqueue watchdog touch caused by stop_machine Nicholas Piggin
                   ` (3 preceding siblings ...)
  2024-06-25 11:42 ` [PATCH 4/4] stop_machine: Add a delay between multi_cpu_stop touching watchdogs Nicholas Piggin
@ 2024-06-25 14:53 ` Paul E. McKenney
  2024-06-26  0:57   ` Nicholas Piggin
  2024-06-26 12:58 ` Michal Koutný
  5 siblings, 1 reply; 13+ messages in thread
From: Paul E. McKenney @ 2024-06-25 14:53 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: Tejun Heo, Peter Zijlstra, Lai Jiangshan, Srikar Dronamraju,
	linux-kernel

On Tue, Jun 25, 2024 at 09:42:43PM +1000, Nicholas Piggin wrote:
> Here are a few patches to fix a lockup caused by very slow progress due
> to a scalability problem in workqueue watchdog touch being hammered by
> thousands of CPUs in multi_cpu_stop. Patch 2 is the fix.
> 
> I did notice when making a microbenchmark reproducer that the RCU call
> was actually also causing slowdowns. Not nearly so bad as the workqueue
> touch, but workqueue queueing of dummy jobs slowed down by a factor of
> several times when lots of other CPUs were making
> rcu_momentary_dyntick_idle() calls. So I did the stop_machine patches to
> reduce that. So those patches 3,4 are independent of the first two and
> can go in any order.

For the series:

Reviewed-by: Paul E. McKenney <paulmck@kernel.org>

> Thanks,
> Nick
> 
> Nicholas Piggin (4):
>   workqueue: wq_watchdog_touch is always called with valid CPU
>   workqueue: Improve scalability of workqueue watchdog touch
>   stop_machine: Rearrange multi_cpu_stop state machine loop
>   stop_machine: Add a delay between multi_cpu_stop touching watchdogs
> 
>  kernel/stop_machine.c | 31 +++++++++++++++++++++++--------
>  kernel/workqueue.c    | 12 ++++++++++--
>  2 files changed, 33 insertions(+), 10 deletions(-)
> 
> -- 
> 2.45.1
> 


* Re: [PATCH 2/4] workqueue: Improve scalability of workqueue watchdog touch
  2024-06-25 11:42 ` [PATCH 2/4] workqueue: Improve scalability of workqueue watchdog touch Nicholas Piggin
@ 2024-06-25 16:57   ` Tejun Heo
  2024-06-26  0:52     ` Nicholas Piggin
  2024-06-27 12:16   ` Hillf Danton
  1 sibling, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2024-06-25 16:57 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: Paul E . McKenney, Peter Zijlstra, Lai Jiangshan,
	Srikar Dronamraju, linux-kernel

On Tue, Jun 25, 2024 at 09:42:45PM +1000, Nicholas Piggin wrote:
> On a ~2000 CPU powerpc system, hard lockups have been observed in the
> workqueue code when stop_machine runs (in this case due to CPU hotplug).
> This is due to lots of CPUs spinning in multi_cpu_stop, calling
> touch_nmi_watchdog() which ends up calling wq_watchdog_touch().
> wq_watchdog_touch() writes to the global variable wq_watchdog_touched,
> and that can find itself in the same cacheline as other important
> workqueue data, which slows down operations to the point of lockups.
> 
> In the case of the following abridged trace, worker_pool_idr was in
> the hot line, causing the lockups to always appear at idr_find.
> 
>   watchdog: CPU 1125 self-detected hard LOCKUP @ idr_find
>   Call Trace:
>   get_work_pool
>   __queue_work
>   call_timer_fn
>   run_timer_softirq
>   __do_softirq
>   do_softirq_own_stack
>   irq_exit
>   timer_interrupt
>   decrementer_common_virt
>   * interrupt: 900 (timer) at multi_cpu_stop
>   multi_cpu_stop
>   cpu_stopper_thread
>   smpboot_thread_fn
>   kthread
> 
> Fix this by having wq_watchdog_touch() only write to the line if the
> last time a touch was recorded exceeds 1/4 of the watchdog threshold.
> 
> Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

Applied 1-2 to wq/for-6.11. I think 3 and 4 should probably be routed
through either tip or Andrew?

Thanks.

-- 
tejun


* Re: [PATCH 2/4] workqueue: Improve scalability of workqueue watchdog touch
  2024-06-25 16:57   ` Tejun Heo
@ 2024-06-26  0:52     ` Nicholas Piggin
  0 siblings, 0 replies; 13+ messages in thread
From: Nicholas Piggin @ 2024-06-26  0:52 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Paul E . McKenney, Peter Zijlstra, Lai Jiangshan,
	Srikar Dronamraju, linux-kernel

On Wed Jun 26, 2024 at 2:57 AM AEST, Tejun Heo wrote:
> On Tue, Jun 25, 2024 at 09:42:45PM +1000, Nicholas Piggin wrote:
> > On a ~2000 CPU powerpc system, hard lockups have been observed in the
> > workqueue code when stop_machine runs (in this case due to CPU hotplug).
> > This is due to lots of CPUs spinning in multi_cpu_stop, calling
> > touch_nmi_watchdog() which ends up calling wq_watchdog_touch().
> > wq_watchdog_touch() writes to the global variable wq_watchdog_touched,
> > and that can find itself in the same cacheline as other important
> > workqueue data, which slows down operations to the point of lockups.
> > 
> > In the case of the following abridged trace, worker_pool_idr was in
> > the hot line, causing the lockups to always appear at idr_find.
> > 
> >   watchdog: CPU 1125 self-detected hard LOCKUP @ idr_find
> >   Call Trace:
> >   get_work_pool
> >   __queue_work
> >   call_timer_fn
> >   run_timer_softirq
> >   __do_softirq
> >   do_softirq_own_stack
> >   irq_exit
> >   timer_interrupt
> >   decrementer_common_virt
> >   * interrupt: 900 (timer) at multi_cpu_stop
> >   multi_cpu_stop
> >   cpu_stopper_thread
> >   smpboot_thread_fn
> >   kthread
> > 
> > Fix this by having wq_watchdog_touch() only write to the line if the
> > last time a touch was recorded exceeds 1/4 of the watchdog threshold.
> > 
> > Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> > Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>
> Applied 1-2 to wq/for-6.11.

Thanks Tejun.

> I think 3 and 4 should probably be routed
> through either tip or Andrew?

Yeah, let's see if it gets any comments.

Thanks,
Nick


* Re: [PATCH 0/4] Fix scalability problem in workqueue watchdog touch caused by stop_machine
  2024-06-25 14:53 ` [PATCH 0/4] Fix scalability problem in workqueue watchdog touch caused by stop_machine Paul E. McKenney
@ 2024-06-26  0:57   ` Nicholas Piggin
  2024-09-25  5:25     ` Srikar Dronamraju
  0 siblings, 1 reply; 13+ messages in thread
From: Nicholas Piggin @ 2024-06-26  0:57 UTC (permalink / raw)
  To: paulmck
  Cc: Tejun Heo, Peter Zijlstra, Lai Jiangshan, Srikar Dronamraju,
	linux-kernel

On Wed Jun 26, 2024 at 12:53 AM AEST, Paul E. McKenney wrote:
> On Tue, Jun 25, 2024 at 09:42:43PM +1000, Nicholas Piggin wrote:
> > Here are a few patches to fix a lockup caused by very slow progress due
> > to a scalability problem in workqueue watchdog touch being hammered by
> > thousands of CPUs in multi_cpu_stop. Patch 2 is the fix.
> > 
> > I did notice when making a microbenchmark reproducer that the RCU call
> > was actually also causing slowdowns. Not nearly so bad as the workqueue
> > touch, but workqueue queueing of dummy jobs slowed down by a factor of
> > several times when lots of other CPUs were making
> > rcu_momentary_dyntick_idle() calls. So I did the stop_machine patches to
> > reduce that. So those patches 3,4 are independent of the first two and
> > can go in any order.
>
> For the series:
>
> Reviewed-by: Paul E. McKenney <paulmck@kernel.org>

Oh, it did get a comment :) Thanks Paul. Not sure who owns the
multi_cpu_stop loop, Tejun and Peter I guess but that was 10+
years ago :P

I might ask Andrew if he would take patches 3-4, if there are
no objections.

Thanks,
Nick


* Re: [PATCH 0/4] Fix scalability problem in workqueue watchdog touch caused by stop_machine
  2024-06-25 11:42 [PATCH 0/4] Fix scalability problem in workqueue watchdog touch caused by stop_machine Nicholas Piggin
                   ` (4 preceding siblings ...)
  2024-06-25 14:53 ` [PATCH 0/4] Fix scalability problem in workqueue watchdog touch caused by stop_machine Paul E. McKenney
@ 2024-06-26 12:58 ` Michal Koutný
  5 siblings, 0 replies; 13+ messages in thread
From: Michal Koutný @ 2024-06-26 12:58 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: Tejun Heo, Paul E . McKenney, Peter Zijlstra, Lai Jiangshan,
	Srikar Dronamraju, linux-kernel


Hello Nicholas.

On Tue, Jun 25, 2024 at 09:42:43PM GMT, Nicholas Piggin <npiggin@gmail.com> wrote:
> Here are a few patches to fix a lockup caused by very slow progress due
> to a scalability problem in workqueue watchdog touch being hammered by
> thousands of CPUs in multi_cpu_stop. Patch 2 is the fix.

<del>Is this something you noticed with stop_machine alone or in some broader
context?</del> I see you mention CPU hotplug in patch 2. Was it a single
CPU or SMT offlining?
Good job tracking it down to touching the same cacheline.
I had a suspicion [1] back in the day that cpuhp_smt_disable() would
scale in O(nr_cpus^2) but I didn't dedicate time to verifying that. Has
something similar popped up in your examination?

Also, I think your change in patch 2 effectively reduces
wq_watchdog_thresh to 3/4 of the configured value. (Not sure if the
default should be scaled accordingly.)

Thanks,
Michal

[1]
cpuhp_smt_disable
  mutex_lock(&cpu_add_remove_lock); // cpu_maps_update_begin()

  for_each_online_cpu                      // <-- nr_cpus
    takedown_cpu(cpu)
      stop_machine_cpuslocked // cpu_hotplug_lock
        active_cpus = cpumask_of(cpu)
        for_each_cpu(cpu, cpu_online_mask) // <-- nr_cpus
          cpu_stop_queue_work(cpu, work) // this is serial
          multi_cpu_stop // this runs nr_cpus in parallel
            // runs on downed cpu only
            take_cpu_down
            // other cpus are spinning in multi_cpu_stop() until take_cpu_down
            // above is finished on downed cpu
            // with interrupts disabled
            do {cpu_relax} while(curstate != MULTI_STOP_EXIT)
    lockup_detector_cleanup() // not a touch

  mutex_unlock(&cpu_add_remove_lock);
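
The 3/4 figure mentioned in this message can be worked through with a
small calculation. The sketch below assumes that the stall check
compares the current time against the recorded touch plus the full
threshold, and that a real touch may go unrecorded for up to thresh/4
ticks after the last recorded one (as in patch 2). The function name
min_effective_thresh() is hypothetical and the result is in ticks.

```c
/*
 * Shortest possible gap, in ticks, between the last real touch and a
 * stall report when touches within thresh/4 of the recorded one are
 * skipped. Assumption: the report fires once the current time is more
 * than `thresh` ticks after the recorded touch.
 */
static unsigned long min_effective_thresh(unsigned long thresh)
{
	return thresh - thresh / 4;
}
```

For example, a threshold of 3000 ticks gives 2250 ticks, which is 3/4
of the configured value.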





* Re: [PATCH 2/4] workqueue: Improve scalability of workqueue watchdog touch
  2024-06-25 11:42 ` [PATCH 2/4] workqueue: Improve scalability of workqueue watchdog touch Nicholas Piggin
  2024-06-25 16:57   ` Tejun Heo
@ 2024-06-27 12:16   ` Hillf Danton
  2024-06-27 12:42     ` Waiman Long
  1 sibling, 1 reply; 13+ messages in thread
From: Hillf Danton @ 2024-06-27 12:16 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: Paul E . McKenney, Peter Zijlstra, Waiman Long, linux-kernel

On Tue, Jun 25, 2024 at 09:42:45PM +1000, Nicholas Piggin wrote:
> On a ~2000 CPU powerpc system, hard lockups have been observed in the
> workqueue code when stop_machine runs (in this case due to CPU hotplug).
> This is due to lots of CPUs spinning in multi_cpu_stop, calling
> touch_nmi_watchdog() which ends up calling wq_watchdog_touch().
> wq_watchdog_touch() writes to the global variable wq_watchdog_touched,
> and that can find itself in the same cacheline as other important
> workqueue data, which slows down operations to the point of lockups.
> 
> In the case of the following abridged trace, worker_pool_idr was in
> the hot line, causing the lockups to always appear at idr_find.
> 
Wonder if the MCS lock does not help in this case.

>   watchdog: CPU 1125 self-detected hard LOCKUP @ idr_find
>   Call Trace:
>   get_work_pool
>   __queue_work
>   call_timer_fn
>   run_timer_softirq
>   __do_softirq
>   do_softirq_own_stack
>   irq_exit
>   timer_interrupt
>   decrementer_common_virt
>   * interrupt: 900 (timer) at multi_cpu_stop
>   multi_cpu_stop
>   cpu_stopper_thread
>   smpboot_thread_fn
>   kthread
> 


* Re: [PATCH 2/4] workqueue: Improve scalability of workqueue watchdog touch
  2024-06-27 12:16   ` Hillf Danton
@ 2024-06-27 12:42     ` Waiman Long
  0 siblings, 0 replies; 13+ messages in thread
From: Waiman Long @ 2024-06-27 12:42 UTC (permalink / raw)
  To: Hillf Danton, Nicholas Piggin
  Cc: Paul E . McKenney, Peter Zijlstra, linux-kernel


On 6/27/24 08:16, Hillf Danton wrote:
> On Tue, Jun 25, 2024 at 09:42:45PM +1000, Nicholas Piggin wrote:
>> On a ~2000 CPU powerpc system, hard lockups have been observed in the
>> workqueue code when stop_machine runs (in this case due to CPU hotplug).
>> This is due to lots of CPUs spinning in multi_cpu_stop, calling
>> touch_nmi_watchdog() which ends up calling wq_watchdog_touch().
>> wq_watchdog_touch() writes to the global variable wq_watchdog_touched,
>> and that can find itself in the same cacheline as other important
>> workqueue data, which slows down operations to the point of lockups.
>>
>> In the case of the following abridged trace, worker_pool_idr was in
>> the hot line, causing the lockups to always appear at idr_find.
>>
> Wonder if the MCS lock does not help in this case.

This patch just tries to avoid polluting the shared cacheline leading to 
excessive cacheline bouncing. No locking is involved. I am not sure what 
you are thinking about using MCS lock for.

Regards,
Longman

>>    watchdog: CPU 1125 self-detected hard LOCKUP @ idr_find
>>    Call Trace:
>>    get_work_pool
>>    __queue_work
>>    call_timer_fn
>>    run_timer_softirq
>>    __do_softirq
>>    do_softirq_own_stack
>>    irq_exit
>>    timer_interrupt
>>    decrementer_common_virt
>>    * interrupt: 900 (timer) at multi_cpu_stop
>>    multi_cpu_stop
>>    cpu_stopper_thread
>>    smpboot_thread_fn
>>    kthread
>>



* Re: [PATCH 0/4] Fix scalability problem in workqueue watchdog touch caused by stop_machine
  2024-06-26  0:57   ` Nicholas Piggin
@ 2024-09-25  5:25     ` Srikar Dronamraju
  0 siblings, 0 replies; 13+ messages in thread
From: Srikar Dronamraju @ 2024-09-25  5:25 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: paulmck, Tejun Heo, Peter Zijlstra, Lai Jiangshan,
	Srikar Dronamraju, Valentin Schneider, linux-kernel

* Nicholas Piggin <npiggin@gmail.com> [2024-06-26 10:57:36]:

> On Wed Jun 26, 2024 at 12:53 AM AEST, Paul E. McKenney wrote:
> > On Tue, Jun 25, 2024 at 09:42:43PM +1000, Nicholas Piggin wrote:
> > > Here are a few patches to fix a lockup caused by very slow progress due
> > > to a scalability problem in workqueue watchdog touch being hammered by
> > > thousands of CPUs in multi_cpu_stop. Patch 2 is the fix.
> > > 
> > > I did notice when making a microbenchmark reproducer that the RCU call
> > > was actually also causing slowdowns. Not nearly so bad as the workqueue
> > > touch, but workqueue queueing of dummy jobs slowed down by a factor of
> > > several times when lots of other CPUs were making
> > > rcu_momentary_dyntick_idle() calls. So I did the stop_machine patches to
> > > reduce that. So those patches 3,4 are independent of the first two and
> > > can go in any order.
> >
> > For the series:
> >
> > Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
> 
> Oh, it did get a comment :) Thanks Paul. Not sure who owns the
> multi_cpu_stop loop, Tejun and Peter I guess but that was 10+
> years ago :P
> 
> I might ask Andrew if he would take patches 3-4, if there are
> no objections.
> 

Patches 3 and 4 are still not part of any tree.
Can we please include them, or are there any reservations about them?

The patches still seem to apply on top of Linus' tree, except for one line
where rcu_momentary_dyntick_idle() has been renamed to rcu_momentary_eqs().

Commit 32a9f26e5e26 ("rcu: Rename rcu_momentary_dyntick_idle() into
rcu_momentary_eqs()") 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/patch/?id=32a9f26e5e26

-- 
Thanks and Regards
Srikar Dronamraju

