* [PATCH] sched: CPU hotplug race vs. set_cpus_allowed()
@ 2006-06-23 8:16 Kirill Korotaev
From: Kirill Korotaev @ 2006-06-23 8:16 UTC (permalink / raw)
To: Ingo Molnar, Linux Kernel Mailing List, Con Kolivas,
Andrew Morton
[-- Attachment #1: Type: text/plain, Size: 393 bytes --]
Looks like there is a race between set_cpus_allowed()
and move_task_off_dead_cpu().
__migrate_task() doesn't report any error code, so
the task can be left on its runqueue if its cpus_allowed mask
changed so that dest_cpu is no longer a possible target.
Also, changing the cpus_allowed mask requires rq->lock to be held.
Signed-Off-By: Kirill Korotaev <dev@openvz.org>
Kirill
P.S. against 2.6.17-mm1
[-- Attachment #2: diff-migrate-task --]
[-- Type: text/plain, Size: 1957 bytes --]
--- linux-2.6.17-mm1s.orig/kernel/sched.c 2006-06-21 18:53:17.000000000 +0400
+++ linux-2.6.17-mm1.dev/kernel/sched.c 2006-06-23 12:09:16.000000000 +0400
@@ -4858,12 +4858,13 @@ EXPORT_SYMBOL_GPL(set_cpus_allowed);
* So we race with normal scheduler movements, but that's OK, as long
* as the task is no longer on this CPU.
*/
-static void __migrate_task(struct task_struct *p, int src_cpu, int dest_cpu)
+static int __migrate_task(struct task_struct *p, int src_cpu, int dest_cpu)
{
runqueue_t *rq_dest, *rq_src;
+ int res = 0;
if (unlikely(cpu_is_offline(dest_cpu)))
- return;
+ return 0;
rq_src = cpu_rq(src_cpu);
rq_dest = cpu_rq(dest_cpu);
@@ -4891,9 +4892,10 @@ static void __migrate_task(struct task_s
if (TASK_PREEMPTS_CURR(p, rq_dest))
resched_task(rq_dest->curr);
}
-
+ res = 1;
out:
double_rq_unlock(rq_src, rq_dest);
+ return res;
}
/*
@@ -4964,10 +4966,13 @@ int sigstop_on_cpu_lost;
/* Figure out where task on dead CPU should go, use force if neccessary. */
static void move_task_off_dead_cpu(int dead_cpu, struct task_struct *tsk)
{
- int dest_cpu;
+ runqueue_t *rq;
+ unsigned long flags;
+ int dest_cpu, res;
cpumask_t mask;
int force = 0;
+restart:
/* On same node? */
mask = node_to_cpumask(cpu_to_node(dead_cpu));
cpus_and(mask, mask, tsk->cpus_allowed);
@@ -4979,8 +4984,10 @@ static void move_task_off_dead_cpu(int d
/* No more Mr. Nice Guy. */
if (dest_cpu == NR_CPUS) {
+ rq = task_rq_lock(tsk, &flags);
cpus_setall(tsk->cpus_allowed);
dest_cpu = any_online_cpu(tsk->cpus_allowed);
+ task_rq_unlock(rq, &flags);
/*
* Don't tell them about moving exiting tasks or
@@ -5000,7 +5007,9 @@ static void move_task_off_dead_cpu(int d
if (tsk->mm && sigstop_on_cpu_lost)
force = 1;
}
- __migrate_task(tsk, dead_cpu, dest_cpu);
+ res = __migrate_task(tsk, dead_cpu, dest_cpu);
+ if (!res)
+ goto restart;
if (force)
force_sig_specific(SIGSTOP, tsk);
* Re: [PATCH] sched: CPU hotplug race vs. set_cpus_allowed()
@ 2006-06-23 8:18 ` Ingo Molnar
From: Ingo Molnar @ 2006-06-23 8:18 UTC (permalink / raw)
To: Kirill Korotaev; +Cc: Linux Kernel Mailing List, Con Kolivas, Andrew Morton
* Kirill Korotaev <dev@openvz.org> wrote:
> Looks like there is a race between set_cpus_allowed() and
> move_task_off_dead_cpu(). __migrate_task() doesn't report any error
> code, so the task can be left on its runqueue if its cpus_allowed mask
> changed so that dest_cpu is no longer a possible target. Also,
> changing the cpus_allowed mask requires rq->lock to be held.
>
> Signed-Off-By: Kirill Korotaev <dev@openvz.org>
good one!
Acked-by: Ingo Molnar <mingo@elte.hu>
Ingo
* Re: [PATCH] sched: CPU hotplug race vs. set_cpus_allowed()
@ 2006-06-23 10:53 ` Con Kolivas
From: Con Kolivas @ 2006-06-23 10:53 UTC (permalink / raw)
To: Kirill Korotaev; +Cc: Ingo Molnar, Linux Kernel Mailing List, Andrew Morton
On Friday 23 June 2006 18:16, Kirill Korotaev wrote:
> Looks like there is a race between set_cpus_allowed()
> and move_task_off_dead_cpu().
> __migrate_task() doesn't report any error code, so
> the task can be left on its runqueue if its cpus_allowed mask
> changed so that dest_cpu is no longer a possible target.
> Also, changing the cpus_allowed mask requires rq->lock to be held.
>
> Signed-Off-By: Kirill Korotaev <dev@openvz.org>
>
> Kirill
> P.S. against 2.6.17-mm1
Hi!
Since you've got
-static void __migrate_task(struct task_struct *p, int src_cpu, int dest_cpu)
+static int __migrate_task(struct task_struct *p, int src_cpu, int dest_cpu)
{
runqueue_t *rq_dest, *rq_src;
+ int res = 0;
if (unlikely(cpu_is_offline(dest_cpu)))
- return;
+ return 0;
why not return res here?
oh and ret is a more commonly used name than res (your choice of course).
and an addition to the comment such as "returns non-zero only when it fails to
migrate the task" would be nice.
thanks!
--
-ck
* Re: [PATCH] sched: CPU hotplug race vs. set_cpus_allowed()
@ 2006-06-24 5:25 ` Andrew Morton
From: Andrew Morton @ 2006-06-24 5:25 UTC (permalink / raw)
To: Kirill Korotaev; +Cc: mingo, linux-kernel, kernel
On Fri, 23 Jun 2006 12:16:09 +0400
Kirill Korotaev <dev@openvz.org> wrote:
> Looks like there is a race between set_cpus_allowed()
> and move_task_off_dead_cpu().
> __migrate_task() doesn't report any error code, so
> the task can be left on its runqueue if its cpus_allowed mask
> changed so that dest_cpu is no longer a possible target.
> Also, changing the cpus_allowed mask requires rq->lock to be held.
>
> Signed-Off-By: Kirill Korotaev <dev@openvz.org>
>
> Kirill
> P.S. against 2.6.17-mm1
>
That's not against 2.6.17-mm1.
> static void move_task_off_dead_cpu(int dead_cpu, struct task_struct *tsk)
> {
> - int dest_cpu;
> + runqueue_t *rq;
> + unsigned long flags;
> + int dest_cpu, res;
> cpumask_t mask;
> int force = 0;
>
Your kernel has extra goodies: the `force' stuff.
Please check that the patch which I merged is correct.
* [PATCH] sched: CPU hotplug race vs. set_cpus_allowed()
@ 2006-06-26 7:58 Kirill Korotaev
From: Kirill Korotaev @ 2006-06-26 7:58 UTC (permalink / raw)
To: Andrew Morton, Con Kolivas, Linux Kernel Mailing List, devel,
Ingo Molnar
[-- Attachment #1: Type: text/plain, Size: 519 bytes --]
There is a race between set_cpus_allowed()
and move_task_off_dead_cpu().
__migrate_task() doesn't report any error code, so
the task can be left on its runqueue if its cpus_allowed mask
changed so that dest_cpu is no longer a possible target.
Also, changing the cpus_allowed mask requires rq->lock to be held.
Changes from the original post, per Con Kolivas' notes:
- added a comment
- cleaned up the code a bit
Signed-Off-By: Kirill Korotaev <dev@openvz.org>
Acked-By: Ingo Molnar <mingo@elte.hu>
Kirill
P.S. against 2.6.17-mm1
[-- Attachment #2: diff-migrate-task --]
[-- Type: text/plain, Size: 1989 bytes --]
--- linux-2.6.17-mm1s.orig/kernel/sched.c 2006-06-21 18:53:17.000000000 +0400
+++ linux-2.6.17-mm1.dev/kernel/sched.c 2006-06-26 11:54:19.000000000 +0400
@@ -4857,13 +4857,16 @@ EXPORT_SYMBOL_GPL(set_cpus_allowed);
*
* So we race with normal scheduler movements, but that's OK, as long
* as the task is no longer on this CPU.
+ *
+ * Returns non-zero if task was successfully migrated.
*/
-static void __migrate_task(struct task_struct *p, int src_cpu, int dest_cpu)
+static int __migrate_task(struct task_struct *p, int src_cpu, int dest_cpu)
{
runqueue_t *rq_dest, *rq_src;
+ int ret = 0;
if (unlikely(cpu_is_offline(dest_cpu)))
- return;
+ return ret;
rq_src = cpu_rq(src_cpu);
rq_dest = cpu_rq(dest_cpu);
@@ -4891,9 +4894,10 @@ static void __migrate_task(struct task_s
if (TASK_PREEMPTS_CURR(p, rq_dest))
resched_task(rq_dest->curr);
}
-
+ ret = 1;
out:
double_rq_unlock(rq_src, rq_dest);
+ return ret;
}
/*
@@ -4964,10 +4968,13 @@ int sigstop_on_cpu_lost;
/* Figure out where task on dead CPU should go, use force if neccessary. */
static void move_task_off_dead_cpu(int dead_cpu, struct task_struct *tsk)
{
+ runqueue_t *rq;
+ unsigned long flags;
int dest_cpu;
cpumask_t mask;
int force = 0;
+restart:
/* On same node? */
mask = node_to_cpumask(cpu_to_node(dead_cpu));
cpus_and(mask, mask, tsk->cpus_allowed);
@@ -4979,8 +4986,10 @@ static void move_task_off_dead_cpu(int d
/* No more Mr. Nice Guy. */
if (dest_cpu == NR_CPUS) {
+ rq = task_rq_lock(tsk, &flags);
cpus_setall(tsk->cpus_allowed);
dest_cpu = any_online_cpu(tsk->cpus_allowed);
+ task_rq_unlock(rq, &flags);
/*
* Don't tell them about moving exiting tasks or
@@ -5000,7 +5009,8 @@ static void move_task_off_dead_cpu(int d
if (tsk->mm && sigstop_on_cpu_lost)
force = 1;
}
- __migrate_task(tsk, dead_cpu, dest_cpu);
+ if (!__migrate_task(tsk, dead_cpu, dest_cpu))
+ goto restart;
if (force)
force_sig_specific(SIGSTOP, tsk);