From: Paul Jackson <pj@sgi.com>
To: Nathan Lynch <ntl@pobox.com>
Cc: dino@in.ibm.com, Simon.Derr@bull.net,
lse-tech@lists.sourceforge.net, akpm@osdl.org,
nickpiggin@yahoo.com.au, vatsa@in.ibm.com,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] cpusets+hotplug+preepmt broken
Date: Wed, 11 May 2005 13:42:35 -0700 [thread overview]
Message-ID: <20050511134235.5cecf85c.pj@sgi.com> (raw)
In-Reply-To: <20050511195156.GE3614@otto>
Nathan wrote:
> I'm not too familiar with the cpusets code but I
> would like to stay away from nesting these semaphores if at all
> possible.
I share you preference for not nesting these semaphores.
The other choice I am aware of would be for the hotplug code to be less
cpuset-friendly. In the move_task_off_dead_cpu() code, at the point it
says "No more Mr. Nice Guy", instead of looking for the nearest
enclosing cpuset that has something online, which is what the
cpuset_cpus_allowed() does, instead we could just take any damn cpu that
was online.
Something along the lines of the following fix:
--- pj/kernel.old/sched.c 2005-05-11 13:00:17.000000000 -0700
+++ pj/kernel.new/sched.c 2005-05-11 13:02:24.000000000 -0700
@@ -4229,7 +4229,7 @@ static void move_task_off_dead_cpu(int d
/* No more Mr. Nice Guy. */
if (dest_cpu == NR_CPUS) {
- tsk->cpus_allowed = cpuset_cpus_allowed(tsk);
+ tsk->cpus_allowed = cpu_online_map;
dest_cpu = any_online_cpu(tsk->cpus_allowed);
/*
We've already decided here that we had to violate the cpuset container -
as apparently someone hot unplugged every cpu in the current tasks
cpuset.
Hmmm ... that's not quite right ... we've decided that we had to violate
the current tasks cpus_allowed, as apparently someone hot unplugged
every cpu allowed there. This might be a proper subset of what cpus the
tasks cpuset allows, perhaps due to a sched_setaffinity() call to
restrict a task to just one or a few cpus that are allowed in its
cpuset.
So what we'd really like to do would be to first fallback to all the
cpus allowed in the specified tasks cpuset (no walking the cpuset
hierarchy), and see if any of those cpus are still online to receive
this orphan task. Unless someone has botched the system configuration,
and taken offline all the cpus in a cpuset, this should yield up a cpu
that is still both allowed and online. If that fails, then to heck with
honoring cpuset placement - just take the first online cpu we can find.
This is doable without holding cpuset_sem. We can look at a current
tasks cpuset without cpuset_sem, just with the task lock. And this
should almost always work (yield up an online cpu that the cpuset
allows). And when it doesn't work, we can reasonably blame the system
administrator for forcing us to blow out the cpuset confinement.
The following untested, uncompiled patch claims to do this:
--- 2.6.12-rc1-mm4/include/linux/cpuset.h 2005-04-02 15:43:28.000000000 -0800
+++ 2.6.12-rc1-mm4.new/include/linux/cpuset.h 2005-05-11 13:26:10.000000000 -0700
@@ -19,6 +19,7 @@ extern void cpuset_init_smp(void);
extern void cpuset_fork(struct task_struct *p);
extern void cpuset_exit(struct task_struct *p);
extern const cpumask_t cpuset_cpus_allowed(const struct task_struct *p);
+extern const cpumask_t cpuset_task_cpus_allowed(const struct task_struct *p);
void cpuset_init_current_mems_allowed(void);
void cpuset_update_current_mems_allowed(void);
void cpuset_restrict_to_mems_allowed(unsigned long *nodes);
@@ -38,6 +39,10 @@ static inline cpumask_t cpuset_cpus_allo
{
return cpu_possible_map;
}
+static inline cpumask_t cpuset_task_cpus_allowed(struct task_struct *p)
+{
+ return cpu_possible_map;
+}
static inline void cpuset_init_current_mems_allowed(void) {}
static inline void cpuset_update_current_mems_allowed(void) {}
--- 2.6.12-rc1-mm4/kernel/cpuset.c 2005-04-22 19:35:34.000000000 -0700
+++ 2.6.12-rc1-mm4.new/kernel/cpuset.c 2005-05-11 13:40:05.000000000 -0700
@@ -1570,6 +1570,27 @@ const cpumask_t cpuset_cpus_allowed(cons
return mask;
}
+/**
+ * cpuset_task_cpus_allowed - return cpus_allowed mask from a tasks cpuset.
+ * @tsk: pointer to task_struct from which to obtain cpuset->cpus_allowed.
+ *
+ * Description: Returns the cpumask_t cpus_allowed of the cpuset
+ * attached to the specified @tsk. Unlike cpuset_cpus_allowed(),
+ * is not guaranteed to return a non-empty subset of cpu_online_map.
+ * Does not walk up the cpuset hierarchy, and does not attempt to
+ * acquire the cpuset_sem. If called on a task about to exit,
+ * where tsk->cpuset is already NULL, return cpu_online_map.
+ *
+ * Call with task locked.
+ **/
+
+const cpumask_t cpuset_task_cpus_allowed(const struct task_struct *tsk)
+{
+ if (!tsk->cpuset)
+ return cpu_online_map;
+ return tsk->cpuset->cpus_allowed;
+}
+
void cpuset_init_current_mems_allowed(void)
{
current->mems_allowed = NODE_MASK_ALL;
--- 2.6.12-rc1-mm4/kernel/sched.c 2005-04-22 19:51:44.000000000 -0700
+++ 2.6.12-rc1-mm4.new/kernel/sched.c 2005-05-11 13:33:20.000000000 -0700
@@ -4303,7 +4303,7 @@ static void move_task_off_dead_cpu(int d
/* No more Mr. Nice Guy. */
if (dest_cpu == NR_CPUS) {
- tsk->cpus_allowed = cpuset_cpus_allowed(tsk);
+ tsk->cpus_allowed = cpuset_task_cpus_allowed(tsk);
dest_cpu = any_online_cpu(tsk->cpus_allowed);
/*
I retract my Ack of Dinakar's patch, in favor of further consideration
of this last patch, above.
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@engr.sgi.com> 1.650.933.1373, 1.925.600.0401
next prev parent reply other threads:[~2005-05-11 20:46 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2005-05-11 19:16 [PATCH] cpusets+hotplug+preepmt broken Dinakar Guniguntala
2005-05-11 19:25 ` [Lse-tech] " Paul Jackson
2005-05-11 19:55 ` Paul Jackson
2005-05-11 20:26 ` Nathan Lynch
2005-05-11 20:45 ` Paul Jackson
2005-05-11 19:51 ` Nathan Lynch
2005-05-11 20:42 ` Paul Jackson [this message]
2005-05-11 20:58 ` Paul Jackson
2005-05-14 2:23 ` Paul Jackson
2005-05-14 12:14 ` Nathan Lynch
2005-05-14 17:04 ` Paul Jackson
2005-05-14 17:44 ` Paul Jackson
2005-05-18 4:14 ` Paul Jackson
2005-05-18 9:29 ` [Lse-tech] " Dinakar Guniguntala
2005-05-18 14:48 ` Nathan Lynch
2005-05-18 21:16 ` Paul Jackson
2005-05-14 16:28 ` Srivatsa Vaddagiri
2005-05-12 15:10 ` Dinakar Guniguntala
2005-05-13 12:15 ` [Lse-tech] " Dinakar Guniguntala
2005-05-13 0:34 ` Nathan Lynch
2005-05-13 12:32 ` [Lse-tech] " Dinakar Guniguntala
2005-05-13 17:25 ` Srivatsa Vaddagiri
2005-05-13 19:59 ` Paul Jackson
2005-05-13 20:20 ` Dipankar Sarma
2005-05-13 20:46 ` Nathan Lynch
2005-05-13 21:05 ` Paul Jackson
2005-05-13 21:06 ` Dipankar Sarma
2005-05-13 20:52 ` Paul Jackson
2005-05-13 21:02 ` Dipankar Sarma
2005-05-14 2:58 ` Paul Jackson
2005-05-14 16:09 ` Srivatsa Vaddagiri
2005-05-14 17:50 ` Paul Jackson
2005-05-14 17:57 ` Paul Jackson
2005-05-14 16:30 ` Nathan Lynch
2005-05-14 17:23 ` Srivatsa Vaddagiri
2005-05-14 23:17 ` Nathan Lynch
2005-05-13 19:59 ` Paul Jackson
2005-05-13 21:27 ` Nathan Lynch
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20050511134235.5cecf85c.pj@sgi.com \
--to=pj@sgi.com \
--cc=Simon.Derr@bull.net \
--cc=akpm@osdl.org \
--cc=dino@in.ibm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=lse-tech@lists.sourceforge.net \
--cc=nickpiggin@yahoo.com.au \
--cc=ntl@pobox.com \
--cc=vatsa@in.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox