All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 6/6] make select_fallback_rq() cpuset friendly
@ 2010-03-15  9:10 Oleg Nesterov
  2010-04-02 19:13 ` [tip:sched/core] sched: Make " tip-bot for Oleg Nesterov
  0 siblings, 1 reply; 2+ messages in thread
From: Oleg Nesterov @ 2010-03-15  9:10 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar
  Cc: Ben Blum, Jiri Slaby, Lai Jiangshan, Li Zefan, Miao Xie,
	Paul Menage, Rafael J. Wysocki, Tejun Heo, linux-kernel

Introduce cpuset_cpus_allowed_fallback() helper to fix the cpuset problems
with select_fallback_rq(). It can be called from any context and can't use
any cpuset locks including task_lock(). It is called when the task doesn't
have online cpus in ->cpus_allowed but ttwu/etc must be able to find a
suitable cpu.

I am not proud of this patch. Everything which needs such a fact comment
can't be good even if correct. But I'd prefer to not change the locking
rules in the code I hardly understand, and in any case I believe this
simple change make the code much more correct compared to deadlocks we
currently have.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---

 include/linux/cpuset.h |    7 +++++++
 kernel/sched.c         |    4 +---
 kernel/cpuset.c        |   42 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 50 insertions(+), 3 deletions(-)

--- 34-rc1/include/linux/cpuset.h~6_FALLBACK_CPUSETS	2010-03-15 09:40:16.000000000 +0100
+++ 34-rc1/include/linux/cpuset.h	2010-03-15 09:42:08.000000000 +0100
@@ -21,6 +21,7 @@ extern int number_of_cpusets;	/* How man
 extern int cpuset_init(void);
 extern void cpuset_init_smp(void);
 extern void cpuset_cpus_allowed(struct task_struct *p, struct cpumask *mask);
+extern int cpuset_cpus_allowed_fallback(struct task_struct *p);
 extern nodemask_t cpuset_mems_allowed(struct task_struct *p);
 #define cpuset_current_mems_allowed (current->mems_allowed)
 void cpuset_init_current_mems_allowed(void);
@@ -101,6 +102,12 @@ static inline void cpuset_cpus_allowed(s
 	cpumask_copy(mask, cpu_possible_mask);
 }
 
+static inline int cpuset_cpus_allowed_fallback(struct task_struct *p)
+{
+	cpumask_copy(&p->cpus_allowed, cpu_possible_mask);
+	return cpumask_any(cpu_active_mask);
+}
+
 static inline nodemask_t cpuset_mems_allowed(struct task_struct *p)
 {
 	return node_possible_map;
--- 34-rc1/kernel/sched.c~6_FALLBACK_CPUSETS	2010-03-15 09:41:51.000000000 +0100
+++ 34-rc1/kernel/sched.c	2010-03-15 09:42:08.000000000 +0100
@@ -2292,9 +2292,7 @@ static int select_fallback_rq(int cpu, s
 
 	/* No more Mr. Nice Guy. */
 	if (unlikely(dest_cpu >= nr_cpu_ids)) {
-		cpumask_copy(&p->cpus_allowed, cpu_possible_mask);
-		dest_cpu = cpumask_any(cpu_active_mask);
-
+		dest_cpu = cpuset_cpus_allowed_fallback(p);
 		/*
 		 * Don't tell them about moving exiting tasks or
 		 * kernel threads (both mm NULL), since they never
--- 34-rc1/kernel/cpuset.c~6_FALLBACK_CPUSETS	2010-03-15 09:40:16.000000000 +0100
+++ 34-rc1/kernel/cpuset.c	2010-03-15 09:42:08.000000000 +0100
@@ -2146,6 +2146,48 @@ void cpuset_cpus_allowed(struct task_str
 	mutex_unlock(&callback_mutex);
 }
 
+int cpuset_cpus_allowed_fallback(struct task_struct *tsk)
+{
+	const struct cpuset *cs;
+	int cpu;
+
+	rcu_read_lock();
+	cs = task_cs(tsk);
+	if (cs)
+		cpumask_copy(&tsk->cpus_allowed, cs->cpus_allowed);
+	rcu_read_unlock();
+
+	/*
+	 * We own tsk->cpus_allowed, nobody can change it under us.
+	 *
+	 * But we used cs && cs->cpus_allowed lockless and thus can
+	 * race with cgroup_attach_task() or update_cpumask() and get
+	 * the wrong tsk->cpus_allowed. However, both cases imply the
+	 * subsequent cpuset_change_cpumask()->set_cpus_allowed_ptr()
+	 * which takes task_rq_lock().
+	 *
+	 * If we are called after it dropped the lock we must see all
+	 * changes in tsk_cs()->cpus_allowed. Otherwise we can temporary
+	 * set any mask even if it is not right from task_cs() pov,
+	 * the pending set_cpus_allowed_ptr() will fix things.
+	 */
+
+	cpu = cpumask_any_and(&tsk->cpus_allowed, cpu_active_mask);
+	if (cpu >= nr_cpu_ids) {
+		/*
+		 * Either tsk->cpus_allowed is wrong (see above) or it
+		 * is actually empty. The latter case is only possible
+		 * if we are racing with remove_tasks_in_empty_cpuset().
+		 * Like above we can temporary set any mask and rely on
+		 * set_cpus_allowed_ptr() as synchronization point.
+		 */
+		cpumask_copy(&tsk->cpus_allowed, cpu_possible_mask);
+		cpu = cpumask_any(cpu_active_mask);
+	}
+
+	return cpu;
+}
+
 void cpuset_init_current_mems_allowed(void)
 {
 	nodes_setall(current->mems_allowed);


^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2010-04-02 19:13 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-03-15  9:10 [PATCH 6/6] make select_fallback_rq() cpuset friendly Oleg Nesterov
2010-04-02 19:13 ` [tip:sched/core] sched: Make " tip-bot for Oleg Nesterov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.