* [PATCH] Cpuset: fix ABBA deadlock with cpu hotplug lock
@ 2006-07-14 9:54 Paul Jackson
2006-07-23 18:12 ` Paul Jackson
0 siblings, 1 reply; 4+ messages in thread
From: Paul Jackson @ 2006-07-14 9:54 UTC (permalink / raw)
To: Andrew Morton; +Cc: Dinakar Guniguntala, Simon.Derr, Paul Jackson, linux-kernel
From: Paul Jackson <pj@sgi.com>
Fix ABBA deadlock between lock_cpu_hotplug() and the cpuset
callback_mutex lock.
It only happens on cpu_exclusive cpusets, due to the dynamic
sched domain code trying to take the cpu hotplug lock inside
the cpuset callback_mutex lock.
This bug has apparently been here for several months, but didn't
get hit until the right customer load on a large system.
This fix appears right from inspection, but it will take a few
more days running it on that customers workload to be confident
we nailed it. We don't have any other reproducible test case.
The cpu_hotplug_lock() tends to cover large runs of code.
The other places that hold both that lock and the cpuset callback
mutex lock always nest the cpuset lock inside the hotplug lock.
This place tries to do the reverse, risking an ABBA deadlock.
This is in the cpuset_rmdir() code, where we:
* take the callback_mutex lock
* mark the cpuset CS_REMOVED
* call update_cpu_domains for cpu_exclusive cpusets
* in that call, take the cpu_hotplug lock if the
cpuset is marked for removal.
Thanks to Jack Steiner for identifying this deadlock.
The fix is to tear down the dynamic sched domain before we grab
the cpuset callback_mutex lock. This way, the two locks are
serialized, with the hotplug lock taken and released before
trying for the cpuset lock.
I suspect that this bug was introduced when I changed the
cpuset locking from one lock to two. The dynamic sched domain
dependency on cpu_exclusive cpusets and its hotplug hooks were
added to this code earlier, when cpusets had only a single lock.
It may well have been fine then.
Signed-off-by: Paul Jackson <pj@sgi.com>
---
kernel/cpuset.c | 24 +++++++++++++++++++++---
1 file changed, 21 insertions(+), 3 deletions(-)
--- 2.6.18-rc1-mm1.orig/kernel/cpuset.c 2006-07-13 01:43:26.944147637 -0700
+++ 2.6.18-rc1-mm1/kernel/cpuset.c 2006-07-13 01:43:30.144185379 -0700
@@ -761,6 +761,8 @@ static int validate_change(const struct
*
* Call with manage_mutex held. May nest a call to the
* lock_cpu_hotplug()/unlock_cpu_hotplug() pair.
+ * Must not be called holding callback_mutex, because we must
+ * not call lock_cpu_hotplug() while holding callback_mutex.
*/
static void update_cpu_domains(struct cpuset *cur)
@@ -780,7 +782,7 @@ static void update_cpu_domains(struct cp
if (is_cpu_exclusive(c))
cpus_andnot(pspan, pspan, c->cpus_allowed);
}
- if (is_removed(cur) || !is_cpu_exclusive(cur)) {
+ if (!is_cpu_exclusive(cur)) {
cpus_or(pspan, pspan, cur->cpus_allowed);
if (cpus_equal(pspan, cur->cpus_allowed))
return;
@@ -1916,6 +1918,17 @@ static int cpuset_mkdir(struct inode *di
return cpuset_create(c_parent, dentry->d_name.name, mode | S_IFDIR);
}
+/*
+ * Locking note on the strange update_flag() call below:
+ *
+ * If the cpuset being removed is marked cpu_exclusive, then simulate
+ * turning cpu_exclusive off, which will call update_cpu_domains().
+ * The lock_cpu_hotplug() call in update_cpu_domains() must not be
+ * made while holding callback_mutex. Elsewhere the kernel nests
+ * callback_mutex inside lock_cpu_hotplug() calls. So the reverse
+ * nesting would risk an ABBA deadlock.
+ */
+
static int cpuset_rmdir(struct inode *unused_dir, struct dentry *dentry)
{
struct cpuset *cs = dentry->d_fsdata;
@@ -1935,11 +1948,16 @@ static int cpuset_rmdir(struct inode *un
mutex_unlock(&manage_mutex);
return -EBUSY;
}
+ if (is_cpu_exclusive(cs)) {
+ int retval = update_flag(CS_CPU_EXCLUSIVE, cs, "0");
+ if (retval < 0) {
+ mutex_unlock(&manage_mutex);
+ return retval;
+ }
+ }
parent = cs->parent;
mutex_lock(&callback_mutex);
set_bit(CS_REMOVED, &cs->flags);
- if (is_cpu_exclusive(cs))
- update_cpu_domains(cs);
list_del(&cs->sibling); /* delete my sibling from parent->children */
spin_lock(&cs->dentry->d_lock);
d = dget(cs->dentry);
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.650.933.1373
^ permalink raw reply [flat|nested] 4+ messages in thread* Re: [PATCH] Cpuset: fix ABBA deadlock with cpu hotplug lock
2006-07-14 9:54 [PATCH] Cpuset: fix ABBA deadlock with cpu hotplug lock Paul Jackson
@ 2006-07-23 18:12 ` Paul Jackson
2006-07-23 18:23 ` Linus Torvalds
0 siblings, 1 reply; 4+ messages in thread
From: Paul Jackson @ 2006-07-23 18:12 UTC (permalink / raw)
To: Paul Jackson
Cc: akpm, dino, Simon.Derr, linux-kernel, Ingo Molnar, Linus Torvalds,
davej, ashok.raj
[I added the CC list from the "remove cpu hotplug bustification" thread. -pj]
Nine days ago, I offered this patch to fix a particular instance of
ABBA deadlock we were seeing on a single customer machine, between the
cpu hotplug lock and one of the cpuset locks. In this case, the patch
was simple enough - reorder a couple of operations to avoid the risk of
taking these locks in the wrong order.
pj wrote:
> This fix appears right from inspection, but it will take a few
> more days running it on that customers workload to be confident
> we nailed it. We don't have any other reproducible test case.
We now have enough exposure running this fix on that customers workload
to be confident this fix works. We should have seen at least ten of
these crashes by now. We've seen zero.
If the upcoming 2.6.18 is still open for non-critical fixes please
consider this patch. In this case, it is non-critical because few
systems (exactly one system, world wide, so far) hit it. It is quite
fatal when hit.
On the other hand, if you decide to nuke the cpu hotplug lock for
2.6.18, then just forget this patch, as it would no longer apply.
There's no need of trimming the patients ingrown toenail if you
amputate the leg.
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.925.600.0401
^ permalink raw reply [flat|nested] 4+ messages in thread* Re: [PATCH] Cpuset: fix ABBA deadlock with cpu hotplug lock
2006-07-23 18:12 ` Paul Jackson
@ 2006-07-23 18:23 ` Linus Torvalds
2006-07-23 18:36 ` Paul Jackson
0 siblings, 1 reply; 4+ messages in thread
From: Linus Torvalds @ 2006-07-23 18:23 UTC (permalink / raw)
To: Paul Jackson
Cc: akpm, dino, Simon.Derr, linux-kernel, Ingo Molnar, davej,
ashok.raj
On Sun, 23 Jul 2006, Paul Jackson wrote:
>
> Nine days ago, I offered this patch to fix a particular instance of
> ABBA deadlock we were seeing on a single customer machine,
"this patch"?
Can you resend the actual patch?
Linus
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] Cpuset: fix ABBA deadlock with cpu hotplug lock
2006-07-23 18:23 ` Linus Torvalds
@ 2006-07-23 18:36 ` Paul Jackson
0 siblings, 0 replies; 4+ messages in thread
From: Paul Jackson @ 2006-07-23 18:36 UTC (permalink / raw)
To: Linus Torvalds
Cc: akpm, dino, Simon.Derr, linux-kernel, mingo, davej, ashok.raj
From: Paul Jackson <pj@sgi.com>
Resending ...
Fix ABBA deadlock between lock_cpu_hotplug() and the cpuset
callback_mutex lock.
It only happens on cpu_exclusive cpusets, due to the dynamic
sched domain code trying to take the cpu hotplug lock inside
the cpuset callback_mutex lock.
This bug has apparently been here for several months, but didn't
get hit until the right customer load on a large system.
This fix appears right from inspection, but it will take a few
more days running it on that customers workload to be confident
we nailed it. We don't have any other reproducible test case.
The cpu_hotplug_lock() tends to cover large runs of code.
The other places that hold both that lock and the cpuset callback
mutex lock always nest the cpuset lock inside the hotplug lock.
This place tries to do the reverse, risking an ABBA deadlock.
This is in the cpuset_rmdir() code, where we:
* take the callback_mutex lock
* mark the cpuset CS_REMOVED
* call update_cpu_domains for cpu_exclusive cpusets
* in that call, take the cpu_hotplug lock if the
cpuset is marked for removal.
Thanks to Jack Steiner for identifying this deadlock.
The fix is to tear down the dynamic sched domain before we grab
the cpuset callback_mutex lock. This way, the two locks are
serialized, with the hotplug lock taken and released before
trying for the cpuset lock.
I suspect that this bug was introduced when I changed the
cpuset locking from one lock to two. The dynamic sched domain
dependency on cpu_exclusive cpusets and its hotplug hooks were
added to this code earlier, when cpusets had only a single lock.
It may well have been fine then.
Signed-off-by: Paul Jackson <pj@sgi.com>
---
kernel/cpuset.c | 24 +++++++++++++++++++++---
1 file changed, 21 insertions(+), 3 deletions(-)
--- 2.6.18-rc1-mm1.orig/kernel/cpuset.c 2006-07-13 01:43:26.944147637 -0700
+++ 2.6.18-rc1-mm1/kernel/cpuset.c 2006-07-13 01:43:30.144185379 -0700
@@ -761,6 +761,8 @@ static int validate_change(const struct
*
* Call with manage_mutex held. May nest a call to the
* lock_cpu_hotplug()/unlock_cpu_hotplug() pair.
+ * Must not be called holding callback_mutex, because we must
+ * not call lock_cpu_hotplug() while holding callback_mutex.
*/
static void update_cpu_domains(struct cpuset *cur)
@@ -780,7 +782,7 @@ static void update_cpu_domains(struct cp
if (is_cpu_exclusive(c))
cpus_andnot(pspan, pspan, c->cpus_allowed);
}
- if (is_removed(cur) || !is_cpu_exclusive(cur)) {
+ if (!is_cpu_exclusive(cur)) {
cpus_or(pspan, pspan, cur->cpus_allowed);
if (cpus_equal(pspan, cur->cpus_allowed))
return;
@@ -1916,6 +1918,17 @@ static int cpuset_mkdir(struct inode *di
return cpuset_create(c_parent, dentry->d_name.name, mode | S_IFDIR);
}
+/*
+ * Locking note on the strange update_flag() call below:
+ *
+ * If the cpuset being removed is marked cpu_exclusive, then simulate
+ * turning cpu_exclusive off, which will call update_cpu_domains().
+ * The lock_cpu_hotplug() call in update_cpu_domains() must not be
+ * made while holding callback_mutex. Elsewhere the kernel nests
+ * callback_mutex inside lock_cpu_hotplug() calls. So the reverse
+ * nesting would risk an ABBA deadlock.
+ */
+
static int cpuset_rmdir(struct inode *unused_dir, struct dentry *dentry)
{
struct cpuset *cs = dentry->d_fsdata;
@@ -1935,11 +1948,16 @@ static int cpuset_rmdir(struct inode *un
mutex_unlock(&manage_mutex);
return -EBUSY;
}
+ if (is_cpu_exclusive(cs)) {
+ int retval = update_flag(CS_CPU_EXCLUSIVE, cs, "0");
+ if (retval < 0) {
+ mutex_unlock(&manage_mutex);
+ return retval;
+ }
+ }
parent = cs->parent;
mutex_lock(&callback_mutex);
set_bit(CS_REMOVED, &cs->flags);
- if (is_cpu_exclusive(cs))
- update_cpu_domains(cs);
list_del(&cs->sibling); /* delete my sibling from parent->children */
spin_lock(&cs->dentry->d_lock);
d = dget(cs->dentry);
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.925.600.0401
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2006-07-23 18:36 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-07-14 9:54 [PATCH] Cpuset: fix ABBA deadlock with cpu hotplug lock Paul Jackson
2006-07-23 18:12 ` Paul Jackson
2006-07-23 18:23 ` Linus Torvalds
2006-07-23 18:36 ` Paul Jackson
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox