* [PATCH] c/r: fix race of prepare_descendant() with an ongoing fork()
@ 2009-09-22 20:47 Oren Laadan
[not found] ` <1253652463-11529-1-git-send-email-orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org>
0 siblings, 1 reply; 6+ messages in thread
From: Oren Laadan @ 2009-09-22 20:47 UTC (permalink / raw)
To: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA; +Cc: Oren Laadan
From: Oren Laadan <orenl-RdfvBDnrOiyVc3sceRu5cw@public.gmane.org>
If prepare_descendants() is walking a tree and one of the tasks is
forking, one of two bads can happen. If the child doesn't inherit the
->ctx, it breaks the assumption that the entire subtree is prepared.
If the child inherits the ->ctx, it will have one without having taken
a reference.
This patch closed this race by explicitly getting and referencing the
->ctx for a child process should the parent have one, atomically under
the tasklist_lock.
Signed-off-by: Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
---
kernel/fork.c | 11 ++++++++---
1 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/kernel/fork.c b/kernel/fork.c
index 9f13d7b..57118e4 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -62,6 +62,7 @@
#include <linux/fs_struct.h>
#include <linux/magic.h>
#include <linux/perf_counter.h>
+#include <linux/checkpoint.h>
#include <asm/pgtable.h>
#include <asm/pgalloc.h>
@@ -1148,9 +1149,6 @@ static struct task_struct *copy_process(unsigned long clone_flags,
INIT_LIST_HEAD(&p->pi_state_list);
p->pi_state_cache = NULL;
#endif
-#ifdef CONFIG_CHECKPOINT
- p->checkpoint_ctx = NULL;
-#endif
/*
* sigaltstack should be cleared when sharing the same VM
*/
@@ -1188,6 +1186,13 @@ static struct task_struct *copy_process(unsigned long clone_flags,
/* Need tasklist lock for parent etc handling! */
write_lock_irq(&tasklist_lock);
+#ifdef CONFIG_CHECKPOINT
+ /* If parent is restarting, child should be too */
+ if (unlikely(current->checkpoint_ctx)) {
+ p->checkpoint_ctx = current->checkpoint_ctx;
+ ckpt_ctx_get(p->checkpoint_ctx);
+ }
+#endif
/*
* The task hasn't been attached yet, so its cpus_allowed mask will
* not be changed, nor will its assigned CPU.
--
1.6.0.4
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] c/r: fix race of prepare_descendant() with an ongoing fork()
[not found] ` <1253652463-11529-1-git-send-email-orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org>
@ 2009-09-22 21:38 ` Serge E. Hallyn
[not found] ` <20090922213858.GA30048-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 6+ messages in thread
From: Serge E. Hallyn @ 2009-09-22 21:38 UTC (permalink / raw)
To: Oren Laadan
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
Oren Laadan
Quoting Oren Laadan (orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org):
> From: Oren Laadan <orenl-RdfvBDnrOiyVc3sceRu5cw@public.gmane.org>
>
> If prepare_descendants() is walking a tree and one of the tasks is
> forking, one of two bads can happen. If the child doesn't inherit the
> ->ctx, it breaks the assumption that the entire subtree is prepared.
> If the child inherits the ->ctx, it will have one without having taken
> a reference.
>
> This patch closed this race by explicitly getting and referencing the
> ->ctx for a child process should the parent have one, atomically under
> the tasklist_lock.
>
> Signed-off-by: Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
> ---
> kernel/fork.c | 11 ++++++++---
> 1 files changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 9f13d7b..57118e4 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -62,6 +62,7 @@
> #include <linux/fs_struct.h>
> #include <linux/magic.h>
> #include <linux/perf_counter.h>
> +#include <linux/checkpoint.h>
>
> #include <asm/pgtable.h>
> #include <asm/pgalloc.h>
> @@ -1148,9 +1149,6 @@ static struct task_struct *copy_process(unsigned long clone_flags,
> INIT_LIST_HEAD(&p->pi_state_list);
> p->pi_state_cache = NULL;
> #endif
> -#ifdef CONFIG_CHECKPOINT
> - p->checkpoint_ctx = NULL;
> -#endif
> /*
> * sigaltstack should be cleared when sharing the same VM
> */
> @@ -1188,6 +1186,13 @@ static struct task_struct *copy_process(unsigned long clone_flags,
> /* Need tasklist lock for parent etc handling! */
> write_lock_irq(&tasklist_lock);
>
> +#ifdef CONFIG_CHECKPOINT
> + /* If parent is restarting, child should be too */
> + if (unlikely(current->checkpoint_ctx)) {
> + p->checkpoint_ctx = current->checkpoint_ctx;
Won't break anything, but technically p->checkpoint_ctx will
already be copied from current->checkpoint_ctx, so only the
ckpt_ctx_get() is necessary, so this could really read
if (p->checkpoint_ctx)
ckpt_ctx_get(p->checkpoint_ctx);
Right?
> + ckpt_ctx_get(p->checkpoint_ctx);
> + }
> +#endif
> /*
> * The task hasn't been attached yet, so its cpus_allowed mask will
> * not be changed, nor will its assigned CPU.
> --
> 1.6.0.4
>
> _______________________________________________
> Containers mailing list
> Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> https://lists.linux-foundation.org/mailman/listinfo/containers
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] c/r: fix race of prepare_descendant() with an ongoing fork()
[not found] ` <20090922213858.GA30048-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
@ 2009-09-22 21:46 ` Serge E. Hallyn
[not found] ` <20090922214619.GA30215-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-09-22 23:41 ` Oren Laadan
1 sibling, 1 reply; 6+ messages in thread
From: Serge E. Hallyn @ 2009-09-22 21:46 UTC (permalink / raw)
To: Oren Laadan
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
Oren Laadan
Quoting Serge E. Hallyn (serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org):
> Quoting Oren Laadan (orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org):
> > From: Oren Laadan <orenl-RdfvBDnrOiyVc3sceRu5cw@public.gmane.org>
> >
> > If prepare_descendants() is walking a tree and one of the tasks is
> > forking, one of two bads can happen. If the child doesn't inherit the
> > ->ctx, it breaks the assumption that the entire subtree is prepared.
> > If the child inherits the ->ctx, it will have one without having taken
> > a reference.
> >
> > This patch closed this race by explicitly getting and referencing the
> > ->ctx for a child process should the parent have one, atomically under
> > the tasklist_lock.
> >
> > Signed-off-by: Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
> > ---
> > kernel/fork.c | 11 ++++++++---
> > 1 files changed, 8 insertions(+), 3 deletions(-)
> >
> > diff --git a/kernel/fork.c b/kernel/fork.c
> > index 9f13d7b..57118e4 100644
> > --- a/kernel/fork.c
> > +++ b/kernel/fork.c
> > @@ -62,6 +62,7 @@
> > #include <linux/fs_struct.h>
> > #include <linux/magic.h>
> > #include <linux/perf_counter.h>
> > +#include <linux/checkpoint.h>
> >
> > #include <asm/pgtable.h>
> > #include <asm/pgalloc.h>
> > @@ -1148,9 +1149,6 @@ static struct task_struct *copy_process(unsigned long clone_flags,
> > INIT_LIST_HEAD(&p->pi_state_list);
> > p->pi_state_cache = NULL;
> > #endif
> > -#ifdef CONFIG_CHECKPOINT
> > - p->checkpoint_ctx = NULL;
> > -#endif
> > /*
> > * sigaltstack should be cleared when sharing the same VM
> > */
> > @@ -1188,6 +1186,13 @@ static struct task_struct *copy_process(unsigned long clone_flags,
> > /* Need tasklist lock for parent etc handling! */
> > write_lock_irq(&tasklist_lock);
> >
> > +#ifdef CONFIG_CHECKPOINT
> > + /* If parent is restarting, child should be too */
> > + if (unlikely(current->checkpoint_ctx)) {
> > + p->checkpoint_ctx = current->checkpoint_ctx;
>
> Won't break anything, but technically p->checkpoint_ctx will
> already be copied from current->checkpoint_ctx, so only the
> ckpt_ctx_get() is necessary, so this could really read
>
> if (p->checkpoint_ctx)
> ckpt_ctx_get(p->checkpoint_ctx);
>
> Right?
BTW since all I'm doing is nit-picking, I obviously agree with
the patch on the whole :)
There is no way for the task to be forked into a traced state
(without the parent being traced) right? And, is the fact that
ctx->nr_total may end up less than the total number of active
tasks a problem at all?
thanks,
-serge
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] c/r: fix race of prepare_descendant() with an ongoing fork()
[not found] ` <20090922213858.GA30048-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-09-22 21:46 ` Serge E. Hallyn
@ 2009-09-22 23:41 ` Oren Laadan
[not found] ` <4AB960A1.2090209-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org>
1 sibling, 1 reply; 6+ messages in thread
From: Oren Laadan @ 2009-09-22 23:41 UTC (permalink / raw)
To: Serge E. Hallyn
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
Oren Laadan
Serge E. Hallyn wrote:
> Quoting Oren Laadan (orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org):
>> From: Oren Laadan <orenl-RdfvBDnrOiyVc3sceRu5cw@public.gmane.org>
>>
>> If prepare_descendants() is walking a tree and one of the tasks is
>> forking, one of two bads can happen. If the child doesn't inherit the
>> ->ctx, it breaks the assumption that the entire subtree is prepared.
>> If the child inherits the ->ctx, it will have one without having taken
>> a reference.
>>
>> This patch closed this race by explicitly getting and referencing the
>> ->ctx for a child process should the parent have one, atomically under
>> the tasklist_lock.
>>
>> Signed-off-by: Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
>> ---
>> kernel/fork.c | 11 ++++++++---
>> 1 files changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/kernel/fork.c b/kernel/fork.c
>> index 9f13d7b..57118e4 100644
>> --- a/kernel/fork.c
>> +++ b/kernel/fork.c
>> @@ -62,6 +62,7 @@
>> #include <linux/fs_struct.h>
>> #include <linux/magic.h>
>> #include <linux/perf_counter.h>
>> +#include <linux/checkpoint.h>
>>
>> #include <asm/pgtable.h>
>> #include <asm/pgalloc.h>
>> @@ -1148,9 +1149,6 @@ static struct task_struct *copy_process(unsigned long clone_flags,
>> INIT_LIST_HEAD(&p->pi_state_list);
>> p->pi_state_cache = NULL;
>> #endif
>> -#ifdef CONFIG_CHECKPOINT
>> - p->checkpoint_ctx = NULL;
>> -#endif
>> /*
>> * sigaltstack should be cleared when sharing the same VM
>> */
>> @@ -1188,6 +1186,13 @@ static struct task_struct *copy_process(unsigned long clone_flags,
>> /* Need tasklist lock for parent etc handling! */
>> write_lock_irq(&tasklist_lock);
>>
>> +#ifdef CONFIG_CHECKPOINT
>> + /* If parent is restarting, child should be too */
>> + if (unlikely(current->checkpoint_ctx)) {
>> + p->checkpoint_ctx = current->checkpoint_ctx;
>
> Won't break anything, but technically p->checkpoint_ctx will
> already be copied from current->checkpoint_ctx, so only the
> ckpt_ctx_get() is necessary, so this could really read
>
> if (p->checkpoint_ctx)
> ckpt_ctx_get(p->checkpoint_ctx);
>
> Right?
That's what I initially thought. However, it _is_ possible that
the following occurs:
1) Task A starts a fork an does some of copy_process(): so
@p is NULL, but copy_process() isn't complete
2) An ancestor of task A calls prepare_descendants(), which
then places a valid checkpoint_ctx on A.
That is why I placed this test-and-set snippet while holding
tasklist_lock (write): it ensure that the child has whatever
the parent has, because prepare_descendants() uses the same
lock.
Oren.
>
>> + ckpt_ctx_get(p->checkpoint_ctx);
>> + }
>> +#endif
>> /*
>> * The task hasn't been attached yet, so its cpus_allowed mask will
>> * not be changed, nor will its assigned CPU.
>> --
>> 1.6.0.4
>>
>> _______________________________________________
>> Containers mailing list
>> Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
>> https://lists.linux-foundation.org/mailman/listinfo/containers
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] c/r: fix race of prepare_descendant() with an ongoing fork()
[not found] ` <20090922214619.GA30215-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
@ 2009-09-22 23:43 ` Oren Laadan
0 siblings, 0 replies; 6+ messages in thread
From: Oren Laadan @ 2009-09-22 23:43 UTC (permalink / raw)
To: Serge E. Hallyn; +Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
Serge E. Hallyn wrote:
> Quoting Serge E. Hallyn (serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org):
>> Quoting Oren Laadan (orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org):
>>> From: Oren Laadan <orenl-RdfvBDnrOiyVc3sceRu5cw@public.gmane.org>
>>>
>>> If prepare_descendants() is walking a tree and one of the tasks is
>>> forking, one of two bads can happen. If the child doesn't inherit the
>>> ->ctx, it breaks the assumption that the entire subtree is prepared.
>>> If the child inherits the ->ctx, it will have one without having taken
>>> a reference.
>>>
>>> This patch closed this race by explicitly getting and referencing the
>>> ->ctx for a child process should the parent have one, atomically under
>>> the tasklist_lock.
>>>
>>> Signed-off-by: Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
>>> ---
>>> kernel/fork.c | 11 ++++++++---
>>> 1 files changed, 8 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/kernel/fork.c b/kernel/fork.c
>>> index 9f13d7b..57118e4 100644
>>> --- a/kernel/fork.c
>>> +++ b/kernel/fork.c
>>> @@ -62,6 +62,7 @@
>>> #include <linux/fs_struct.h>
>>> #include <linux/magic.h>
>>> #include <linux/perf_counter.h>
>>> +#include <linux/checkpoint.h>
>>>
>>> #include <asm/pgtable.h>
>>> #include <asm/pgalloc.h>
>>> @@ -1148,9 +1149,6 @@ static struct task_struct *copy_process(unsigned long clone_flags,
>>> INIT_LIST_HEAD(&p->pi_state_list);
>>> p->pi_state_cache = NULL;
>>> #endif
>>> -#ifdef CONFIG_CHECKPOINT
>>> - p->checkpoint_ctx = NULL;
>>> -#endif
>>> /*
>>> * sigaltstack should be cleared when sharing the same VM
>>> */
>>> @@ -1188,6 +1186,13 @@ static struct task_struct *copy_process(unsigned long clone_flags,
>>> /* Need tasklist lock for parent etc handling! */
>>> write_lock_irq(&tasklist_lock);
>>>
>>> +#ifdef CONFIG_CHECKPOINT
>>> + /* If parent is restarting, child should be too */
>>> + if (unlikely(current->checkpoint_ctx)) {
>>> + p->checkpoint_ctx = current->checkpoint_ctx;
>> Won't break anything, but technically p->checkpoint_ctx will
>> already be copied from current->checkpoint_ctx, so only the
>> ckpt_ctx_get() is necessary, so this could really read
>>
>> if (p->checkpoint_ctx)
>> ckpt_ctx_get(p->checkpoint_ctx);
>>
>> Right?
>
> BTW since all I'm doing is nit-picking, I obviously agree with
> the patch on the whole :)
>
> There is no way for the task to be forked into a traced state
> (without the parent being traced) right? And, is the fact that
Not that I'm aware of.
> ctx->nr_total may end up less than the total number of active
> tasks a problem at all?
That's ok. A restart will fail if there are not enough tasks,
and will not-succeed (hang until interrupted) if there are
"too many" tasks.
The above is a protection against malicious use of sys_restart(),
not a fix to a correct use through (bug-less!) restart.c :p
Oren.
>
> thanks,
> -serge
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] c/r: fix race of prepare_descendant() with an ongoing fork()
[not found] ` <4AB960A1.2090209-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org>
@ 2009-09-23 0:05 ` Serge E. Hallyn
0 siblings, 0 replies; 6+ messages in thread
From: Serge E. Hallyn @ 2009-09-23 0:05 UTC (permalink / raw)
To: Oren Laadan
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
Oren Laadan
Quoting Oren Laadan (orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org):
>
>
> Serge E. Hallyn wrote:
> > Quoting Oren Laadan (orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org):
> >> From: Oren Laadan <orenl-RdfvBDnrOiyVc3sceRu5cw@public.gmane.org>
> >>
> >> If prepare_descendants() is walking a tree and one of the tasks is
> >> forking, one of two bads can happen. If the child doesn't inherit the
> >> ->ctx, it breaks the assumption that the entire subtree is prepared.
> >> If the child inherits the ->ctx, it will have one without having taken
> >> a reference.
> >>
> >> This patch closed this race by explicitly getting and referencing the
> >> ->ctx for a child process should the parent have one, atomically under
> >> the tasklist_lock.
> >>
> >> Signed-off-by: Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
> >> ---
> >> kernel/fork.c | 11 ++++++++---
> >> 1 files changed, 8 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/kernel/fork.c b/kernel/fork.c
> >> index 9f13d7b..57118e4 100644
> >> --- a/kernel/fork.c
> >> +++ b/kernel/fork.c
> >> @@ -62,6 +62,7 @@
> >> #include <linux/fs_struct.h>
> >> #include <linux/magic.h>
> >> #include <linux/perf_counter.h>
> >> +#include <linux/checkpoint.h>
> >>
> >> #include <asm/pgtable.h>
> >> #include <asm/pgalloc.h>
> >> @@ -1148,9 +1149,6 @@ static struct task_struct *copy_process(unsigned long clone_flags,
> >> INIT_LIST_HEAD(&p->pi_state_list);
> >> p->pi_state_cache = NULL;
> >> #endif
> >> -#ifdef CONFIG_CHECKPOINT
> >> - p->checkpoint_ctx = NULL;
> >> -#endif
> >> /*
> >> * sigaltstack should be cleared when sharing the same VM
> >> */
> >> @@ -1188,6 +1186,13 @@ static struct task_struct *copy_process(unsigned long clone_flags,
> >> /* Need tasklist lock for parent etc handling! */
> >> write_lock_irq(&tasklist_lock);
> >>
> >> +#ifdef CONFIG_CHECKPOINT
> >> + /* If parent is restarting, child should be too */
> >> + if (unlikely(current->checkpoint_ctx)) {
> >> + p->checkpoint_ctx = current->checkpoint_ctx;
> >
> > Won't break anything, but technically p->checkpoint_ctx will
> > already be copied from current->checkpoint_ctx, so only the
> > ckpt_ctx_get() is necessary, so this could really read
> >
> > if (p->checkpoint_ctx)
> > ckpt_ctx_get(p->checkpoint_ctx);
> >
> > Right?
>
> That's what I initially thought. However, it _is_ possible that
> the following occurs:
>
> 1) Task A starts a fork an does some of copy_process(): so
> @p is NULL, but copy_process() isn't complete
>
> 2) An ancestor of task A calls prepare_descendants(), which
> then places a valid checkpoint_ctx on A.
Huh. Guess so. Thanks.
Acked-by: Serge Hallyn <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
> That is why I placed this test-and-set snippet while holding
> tasklist_lock (write): it ensure that the child has whatever
> the parent has, because prepare_descendants() uses the same
> lock.
>
> Oren.
>
> >
> >> + ckpt_ctx_get(p->checkpoint_ctx);
> >> + }
> >> +#endif
> >> /*
> >> * The task hasn't been attached yet, so its cpus_allowed mask will
> >> * not be changed, nor will its assigned CPU.
> >> --
> >> 1.6.0.4
> >>
> >> _______________________________________________
> >> Containers mailing list
> >> Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> >> https://lists.linux-foundation.org/mailman/listinfo/containers
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2009-09-23 0:05 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-09-22 20:47 [PATCH] c/r: fix race of prepare_descendant() with an ongoing fork() Oren Laadan
[not found] ` <1253652463-11529-1-git-send-email-orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org>
2009-09-22 21:38 ` Serge E. Hallyn
[not found] ` <20090922213858.GA30048-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-09-22 21:46 ` Serge E. Hallyn
[not found] ` <20090922214619.GA30215-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-09-22 23:43 ` Oren Laadan
2009-09-22 23:41 ` Oren Laadan
[not found] ` <4AB960A1.2090209-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org>
2009-09-23 0:05 ` Serge E. Hallyn
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.