From: Paul Menage <menage@google.com>
To: Vegard Nossum <vegard.nossum@gmail.com>,
Paul Jackson <pj@sgi.com>,
a.p.zijlstra@chello.nl, maxk@qualcomm.com
Cc: linux-kernel@vger.kernel.org
Subject: [RFC][PATCH] CPUSets: Move most calls to rebuild_sched_domains() to the workqueue
Date: Thu, 26 Jun 2008 00:56:49 -0700
Message-ID: <48634BC1.8@google.com>
CPUsets: Move most calls to rebuild_sched_domains() to the workqueue
In the current cpusets code the lock nesting between cgroup_mutex and
cpuhotplug.lock when calling rebuild_sched_domains() is inconsistent:
in the CPU hotplug path, cpuhotplug.lock nests outside cgroup_mutex,
while in all other paths that call rebuild_sched_domains() it nests
inside.
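To make the inconsistency concrete, here is a minimal sketch of the
two orderings, reduced to just the locking calls (illustration only,
not literal code from the tree):

/* CPU hotplug path: the notifier machinery already holds
 * cpuhotplug.lock when cpusets takes cgroup_mutex: */
get_online_cpus();		/* cpuhotplug.lock */
cgroup_lock();			/* cgroup_mutex */
rebuild_sched_domains();
cgroup_unlock();
put_online_cpus();

/* All other callers, e.g. update_cpumask(): cgroup_mutex is
 * already held, and rebuild_sched_domains() then takes
 * cpuhotplug.lock internally via get_online_cpus(): */
cgroup_lock();			/* cgroup_mutex */
rebuild_sched_domains();	/* -> get_online_cpus() */
cgroup_unlock();

The two opposite orders are the classic recipe for an ABBA deadlock,
which is what lockdep warns about here.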
This patch makes most calls to rebuild_sched_domains() asynchronous
via the workqueue, which removes the direct nesting of the two locks
in those paths. In the remaining case of an actual hotplug event,
cpuhotplug.lock nests outside cgroup_mutex, as it does now.
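The deferral relies only on standard workqueue semantics; a minimal
sketch of the pattern, with hypothetical names (the real worker is
delayed_rebuild_sched_domains() in the patch below):

#include <linux/workqueue.h>

/* Runs later in keventd's process context, free to take
 * cpuhotplug.lock before cgroup_mutex, matching the hotplug
 * path's ordering. */
static void example_rebuild_fn(struct work_struct *unused)
{
	/* take locks in hotplug-first order, rebuild domains */
}
static DECLARE_WORK(example_rebuild_work, example_rebuild_fn);

static int example_update(void)
{
	/* ... modify cpuset state under cgroup_mutex ... */

	/* Queue the work and return without blocking;
	 * schedule_work() on an already-pending item is a no-op,
	 * so back-to-back updates coalesce into one rebuild. */
	schedule_work(&example_rebuild_work);
	return 0;
}

The trade-off is that the rebuild is no longer synchronous with the
update that triggered it.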
Signed-off-by: Paul Menage <menage@google.com>
---
Note that all I've done with this patch is verify that it compiles
without warnings; I'm not sure how to trigger a hotplug event to test
the lock dependencies or verify that scheduler domain support is still
behaving correctly. Vegard, does this fix the problems that you were
seeing? Paul/Max, does this still seem sane with regard to scheduler domains?
kernel/cpuset.c | 35 +++++++++++++++++++++++------------
1 file changed, 23 insertions(+), 12 deletions(-)
Index: lockfix-2.6.26-rc5-mm3/kernel/cpuset.c
===================================================================
--- lockfix-2.6.26-rc5-mm3.orig/kernel/cpuset.c
+++ lockfix-2.6.26-rc5-mm3/kernel/cpuset.c
@@ -522,13 +522,9 @@ update_domain_attr(struct sched_domain_a
  * domains when operating in the severe memory shortage situations
  * that could cause allocation failures below.
  *
- * Call with cgroup_mutex held. May take callback_mutex during
- * call due to the kfifo_alloc() and kmalloc() calls. May nest
- * a call to the get_online_cpus()/put_online_cpus() pair.
- * Must not be called holding callback_mutex, because we must not
- * call get_online_cpus() while holding callback_mutex. Elsewhere
- * the kernel nests callback_mutex inside get_online_cpus() calls.
- * So the reverse nesting would risk an ABBA deadlock.
+ * Call with cgroup_mutex held, and inside get_online_cpus(). May
+ * take callback_mutex during call due to the kfifo_alloc() and
+ * kmalloc() calls.
  *
  * The three key local variables below are:
  *  q - a kfifo queue of cpuset pointers, used to implement a
@@ -689,9 +685,7 @@ restart:
 
 rebuild:
 	/* Have scheduler rebuild sched domains */
-	get_online_cpus();
 	partition_sched_domains(ndoms, doms, dattr);
-	put_online_cpus();
 
 done:
 	if (q && !IS_ERR(q))
@@ -701,6 +695,21 @@ done:
 	/* Don't kfree(dattr) -- partition_sched_domains() does that. */
 }
 
+/*
+ * Due to the need to nest cgroup_mutex inside cpuhotplug.lock, most
+ * of our invocations of rebuild_sched_domains() are done
+ * asynchronously via the workqueue
+ */
+static void delayed_rebuild_sched_domains(struct work_struct *work)
+{
+	get_online_cpus();
+	cgroup_lock();
+	rebuild_sched_domains();
+	cgroup_unlock();
+	put_online_cpus();
+}
+static DECLARE_WORK(rebuild_sched_domains_work, delayed_rebuild_sched_domains);
+
 static inline int started_after_time(struct task_struct *t1,
 				     struct timespec *time,
 				     struct task_struct *t2)
@@ -853,7 +862,7 @@ static int update_cpumask(struct cpuset
 		return retval;
 
 	if (is_load_balanced)
-		rebuild_sched_domains();
+		schedule_work(&rebuild_sched_domains_work);
 
 	return 0;
 }
@@ -1080,7 +1089,7 @@ static int update_relax_domain_level(str
 
 	if (val != cs->relax_domain_level) {
 		cs->relax_domain_level = val;
-		rebuild_sched_domains();
+		schedule_work(&rebuild_sched_domains_work);
 	}
 
 	return 0;
@@ -1121,7 +1130,7 @@ static int update_flag(cpuset_flagbits_t
 	mutex_unlock(&callback_mutex);
 
 	if (cpus_nonempty && balance_flag_changed)
-		rebuild_sched_domains();
+		schedule_work(&rebuild_sched_domains_work);
 
 	return 0;
 }
@@ -1929,6 +1938,7 @@ static void scan_for_empty_cpusets(const
 
 static void common_cpu_mem_hotplug_unplug(void)
 {
+	get_online_cpus();
 	cgroup_lock();
 
 	top_cpuset.cpus_allowed = cpu_online_map;
@@ -1942,6 +1952,7 @@ static void common_cpu_mem_hotplug_unplu
 	rebuild_sched_domains();
 
 	cgroup_unlock();
+	put_online_cpus();
 }
 
 /*