From: Nicholas Piggin <npiggin@gmail.com>
To: Tejun Heo <tj@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>,
"Paul E . McKenney" <paulmck@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Lai Jiangshan <jiangshanlai@gmail.com>,
Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
linux-kernel@vger.kernel.org
Subject: [PATCH 0/4] Fix scalability problem in workqueue watchdog touch caused by stop_machine
Date: Tue, 25 Jun 2024 21:42:43 +1000
Message-ID: <20240625114249.289014-1-npiggin@gmail.com>

Here are a few patches to fix a lockup seen when thousands of CPUs in
multi_cpu_stop hammer the workqueue watchdog touch. The touch scales
poorly, so progress becomes slow enough to trigger the lockup. Patch 2
is the fix.
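
For readers unfamiliar with this kind of contention, here is a rough
userspace sketch of one common way to keep many CPUs from fighting over
a single shared timestamp: read it first, and store only when it has
gone stale. This is an illustration under assumptions, not the code in
patch 2. The names global_touched, tick_after() and touch_global(), and
the min_interval parameter, are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a timestamp shared by every CPU. */
static _Atomic unsigned long global_touched;

/* Wrap-safe "a is later than b" for a free-running tick counter. */
static bool tick_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/*
 * Record a touch at time `now`. The shared word is written only when the
 * stored value is more than `min_interval` ticks old, so most callers take
 * the read-only path and the cache line can stay shared between CPUs.
 * Returns true if the shared word was written.
 */
static bool touch_global(unsigned long now, unsigned long min_interval)
{
	unsigned long last = atomic_load_explicit(&global_touched,
						  memory_order_relaxed);

	if (!tick_after(now, last + min_interval))
		return false;	/* fresh enough: no store */

	atomic_store_explicit(&global_touched, now, memory_order_relaxed);
	return true;
}
```

The trade-off in a scheme like this is that the shared timestamp may lag
by up to min_interval ticks, so whoever reads it has to tolerate that
slack.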

While making a microbenchmark reproducer, I noticed that the RCU call
was also causing slowdowns. It is not nearly as bad as the workqueue
touch, but workqueue queueing of dummy jobs slowed down several-fold
when lots of other CPUs were making rcu_momentary_dyntick_idle() calls.
The stop_machine patches (3 and 4) reduce that. They are independent of
the first two, so they can go in any order.

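The general idea of spacing out an expensive call inside a spin loop can
be sketched as below. This is a userspace illustration only. The cover
letter does not say how patch 4 implements its delay, so the poll-count
approach, struct spin_ctx and spin_idle_step() are all assumptions made
for the example.

```c
/* Hypothetical per-CPU bookkeeping for a spinning waiter. */
struct spin_ctx {
	unsigned long polls_since_touch;	/* polls since the last touch */
	unsigned long touch_every;		/* polls between touches */
	unsigned long touches;			/* expensive calls made so far */
};

/*
 * One iteration of the wait loop when the state has not changed.
 * The expensive work (standing in for the watchdog touch and the RCU
 * call) runs once per `touch_every` polls instead of on every poll.
 */
static void spin_idle_step(struct spin_ctx *ctx)
{
	if (++ctx->polls_since_touch < ctx->touch_every)
		return;		/* cheap path: just keep spinning */

	ctx->polls_since_touch = 0;
	ctx->touches++;		/* the expensive touch would go here */
}
```

With touch_every set to 100, a waiter that polls 1000 times makes 10
expensive calls rather than 1000.
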
Thanks,
Nick

Nicholas Piggin (4):
  workqueue: wq_watchdog_touch is always called with valid CPU
  workqueue: Improve scalability of workqueue watchdog touch
  stop_machine: Rearrange multi_cpu_stop state machine loop
  stop_machine: Add a delay between multi_cpu_stop touching watchdogs

 kernel/stop_machine.c | 31 +++++++++++++++++++++++--------
 kernel/workqueue.c    | 12 ++++++++++--
 2 files changed, 33 insertions(+), 10 deletions(-)

--
2.45.1

Thread overview: 13+ messages
2024-06-25 11:42 Nicholas Piggin [this message]
2024-06-25 11:42 ` [PATCH 1/4] workqueue: wq_watchdog_touch is always called with valid CPU Nicholas Piggin
2024-06-25 11:42 ` [PATCH 2/4] workqueue: Improve scalability of workqueue watchdog touch Nicholas Piggin
2024-06-25 16:57 ` Tejun Heo
2024-06-26 0:52 ` Nicholas Piggin
2024-06-27 12:16 ` Hillf Danton
2024-06-27 12:42 ` Waiman Long
2024-06-25 11:42 ` [PATCH 3/4] stop_machine: Rearrange multi_cpu_stop state machine loop Nicholas Piggin
2024-06-25 11:42 ` [PATCH 4/4] stop_machine: Add a delay between multi_cpu_stop touching watchdogs Nicholas Piggin
2024-06-25 14:53 ` [PATCH 0/4] Fix scalability problem in workqueue watchdog touch caused by stop_machine Paul E. McKenney
2024-06-26 0:57 ` Nicholas Piggin
2024-09-25 5:25 ` Srikar Dronamraju
2024-06-26 12:58 ` Michal Koutný