From: Robin Holt <holt@sgi.com>
To: linux-kernel@vger.kernel.org
Subject: [Patch v2] stop_machine stalls for a considerable period on large cpu count machines.
Date: Sat, 27 Jun 2009 06:40:29 -0500
Message-ID: <20090627114029.GC6894@sgi.com>
I forgot to Cc linux-kernel again on the repost.
Sorry for the noise,
Robin
----- Forwarded message from Robin Holt <holt@sgi.com> -----
Date: Sat, 27 Jun 2009 06:34:10 -0500
From: Robin Holt <holt@sgi.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Travis <travis@sgi.com>, Rusty Russell <rusty@rustcorp.com.au>,
Stable Kernel Maintainers <stable@kernel.org>
Subject: [Patch v2] stop_machine stalls for a considerable period on large cpu count machines.
Mike Travis noted that a 2048 cpu machine would take hours to get
through its modprobes while booting. We would get numerous back traces
from stop_cpu indicating that cpus had not serviced interrupts.

A quick code review showed heavy cacheline contention because the
'state' (read-mostly) and 'thread_ack' (write-mostly) variables are
located in the same cacheline. Every update of thread_ack invalidates
the line that all other cpus are spinning on to read state. This patch
aligns each variable to its own cacheline.

Signed-off-by: Robin Holt <holt@sgi.com>
Cc: Mike Travis <travis@sgi.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Stable Kernel Maintainers <stable@kernel.org>
---
My first attempt missed a 'quilt refresh' and did not work.
kernel/stop_machine.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
Index: stop_machine_false_sharing/kernel/stop_machine.c
===================================================================
--- stop_machine_false_sharing.orig/kernel/stop_machine.c 2009-06-27 06:30:24.196637521 -0500
+++ stop_machine_false_sharing/kernel/stop_machine.c 2009-06-27 06:30:28.401164425 -0500
@@ -13,6 +13,13 @@
#include <asm/atomic.h>
#include <asm/uaccess.h>
+/*
+ * It is important to keep 'thread_ack' and 'state' in separate
+ * cachelines to prevent cacheline sharing between threads updating
+ * thread_ack and other threads spinning on state.
+ */
+static atomic_t thread_ack ____cacheline_aligned;
+
/* This controls the threads on each CPU. */
enum stopmachine_state {
/* Dummy starting state for thread. */
@@ -26,7 +33,7 @@ enum stopmachine_state {
/* Exit */
STOPMACHINE_EXIT,
};
-static enum stopmachine_state state;
+static enum stopmachine_state state ____cacheline_aligned;
struct stop_machine_data {
int (*fn)(void *);
@@ -36,7 +43,6 @@ struct stop_machine_data {
/* Like num_online_cpus(), but hotplug cpu uses us, so we need this. */
static unsigned int num_threads;
-static atomic_t thread_ack;
static DEFINE_MUTEX(lock);
/* setup_lock protects refcount, stop_machine_wq and stop_machine_work. */
static DEFINE_MUTEX(setup_lock);
----- End forwarded message -----
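
The alignment change in the patch can be illustrated outside the kernel.
The sketch below is not kernel code. It assumes a 64-byte cache line,
uses C11 alignas in place of the kernel's ____cacheline_aligned, and the
names ack_counter, current_state, cacheline_index and
on_separate_cachelines are made up for illustration.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines. The kernel uses a per-arch constant. */
#define CACHELINE_SIZE 64

/* Write-mostly counter, playing the role of thread_ack in the patch. */
static alignas(CACHELINE_SIZE) atomic_int ack_counter;

/* Read-mostly value, playing the role of state in the patch. */
static alignas(CACHELINE_SIZE) int current_state;

/* Hypothetical helper: index of the cache line containing an address. */
static uintptr_t cacheline_index(const void *p)
{
	return (uintptr_t)p / CACHELINE_SIZE;
}

/* Hypothetical helper: nonzero if two objects start in different lines. */
static int on_separate_cachelines(const void *a, const void *b)
{
	return cacheline_index(a) != cacheline_index(b);
}
```

Because both objects start on a cache line boundary and are smaller than
a line, a writer bumping ack_counter does not invalidate the line that
readers of current_state are polling. Alignment alone only fixes where
each variable starts, so the linker may still place unrelated data after
it in the same line.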