From: Robin Holt <holt@sgi.com>
To: linux-kernel@vger.kernel.org
Subject: [Patch v2] stop_machine stalls for a considerable period on large cpu count machines.
Date: Sat, 27 Jun 2009 06:40:29 -0500
Message-ID: <20090627114029.GC6894@sgi.com>
I forgot again on the repost.
Sorry for the noise,
Robin
----- Forwarded message from Robin Holt <holt@sgi.com> -----
Date: Sat, 27 Jun 2009 06:34:10 -0500
From: Robin Holt <holt@sgi.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Travis <travis@sgi.com>, Rusty Russell <rusty@rustcorp.com.au>,
Stable Kernel Maintainers <stable@kernel.org>
Subject: [Patch v2] stop_machine stalls for a considerable period on large
cpu count machines.
Mike Travis noted that a 2048 cpu machine booting would take hours
to get through its modprobes. We would get numerous back traces from
stop_cpu indicating they had not serviced interrupts.
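A quick code review indicated heavy cacheline contention because the
'state' (read-mostly) and 'thread_ack' (write-mostly) variables are
located in the same cacheline. Every update of thread_ack invalidates
the line that the other cpus are spinning on to read state. Align
each variable to its own cacheline.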
Signed-off-by: Robin Holt <holt@sgi.com>
Cc: Mike Travis <travis@sgi.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Stable Kernel Maintainers <stable@kernel.org>
---
My first attempt missed a 'quilt refresh' and did not work.
kernel/stop_machine.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
Index: stop_machine_false_sharing/kernel/stop_machine.c
===================================================================
--- stop_machine_false_sharing.orig/kernel/stop_machine.c 2009-06-27 06:30:24.196637521 -0500
+++ stop_machine_false_sharing/kernel/stop_machine.c 2009-06-27 06:30:28.401164425 -0500
@@ -13,6 +13,13 @@
#include <asm/atomic.h>
#include <asm/uaccess.h>
+/*
+ * It is important to keep 'thread_ack' and 'state' in separate
+ * cachelines to prevent cacheline sharing between threads updating
+ * thread_ack and other threads spinning on state.
+ */
+static atomic_t thread_ack ____cacheline_aligned;
+
/* This controls the threads on each CPU. */
enum stopmachine_state {
/* Dummy starting state for thread. */
@@ -26,7 +33,7 @@ enum stopmachine_state {
/* Exit */
STOPMACHINE_EXIT,
};
-static enum stopmachine_state state;
+static enum stopmachine_state state ____cacheline_aligned;
struct stop_machine_data {
int (*fn)(void *);
@@ -36,7 +43,6 @@ struct stop_machine_data {
/* Like num_online_cpus(), but hotplug cpu uses us, so we need this. */
static unsigned int num_threads;
-static atomic_t thread_ack;
static DEFINE_MUTEX(lock);
/* setup_lock protects refcount, stop_machine_wq and stop_machine_work. */
static DEFINE_MUTEX(setup_lock);
----- End forwarded message -----
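
The layout change in the patch above can be illustrated outside the
kernel. The following is only a user-space sketch, not the kernel code:
it assumes a 64-byte cacheline (CACHELINE_SIZE is a made-up name here;
the kernel's ____cacheline_aligned uses a per-architecture size) and
uses C11 alignas in place of the kernel macro. The helper names
cacheline_of() and variables_share_cacheline() are hypothetical.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stdint.h>

/* Assumption: 64-byte cachelines. Real hardware may differ. */
#define CACHELINE_SIZE 64

static_assert(sizeof(atomic_int) <= CACHELINE_SIZE,
	      "each variable must fit in a single cacheline");

/* Write-mostly: updated by every thread as it acknowledges a state. */
static alignas(CACHELINE_SIZE) atomic_int thread_ack;

/* Read-mostly: every thread spins reading this until it changes. */
static alignas(CACHELINE_SIZE) atomic_int state;

/* Index of the (assumed 64-byte) cacheline containing address p. */
static uintptr_t cacheline_of(const void *p)
{
	return (uintptr_t)p / CACHELINE_SIZE;
}

/* Returns nonzero if the two variables would contend for one line. */
static int variables_share_cacheline(void)
{
	return cacheline_of(&thread_ack) == cacheline_of(&state);
}
```

With both variables aligned to the cacheline size, each one starts on
its own line, so variables_share_cacheline() returns 0 and writers of
thread_ack no longer invalidate the line that readers of state are
spinning on. Without the alignas specifiers the two 4-byte objects may
be placed next to each other, which is the situation the patch removes.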