* [PATCH] c/r: fix race of prepare_descendant() with an ongoing fork()
@ 2009-09-22 20:47 Oren Laadan
[not found] ` <1253652463-11529-1-git-send-email-orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org>
0 siblings, 1 reply; 6+ messages in thread
From: Oren Laadan @ 2009-09-22 20:47 UTC (permalink / raw)
To: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA; +Cc: Oren Laadan
From: Oren Laadan <orenl-RdfvBDnrOiyVc3sceRu5cw@public.gmane.org>
If prepare_descendants() is walking a tree and one of the tasks is
forking, one of two bads can happen. If the child doesn't inherit the
->ctx, it breaks the assumption that the entire subtree is prepared.
If the child inherits the ->ctx, it will have one without having taken
a reference.
This patch closed this race by explicitly getting and referencing the
->ctx for a child process should the parent have one, atomically under
the tasklist_lock.
Signed-off-by: Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
---
kernel/fork.c | 11 ++++++++---
1 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/kernel/fork.c b/kernel/fork.c
index 9f13d7b..57118e4 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -62,6 +62,7 @@
#include <linux/fs_struct.h>
#include <linux/magic.h>
#include <linux/perf_counter.h>
+#include <linux/checkpoint.h>
#include <asm/pgtable.h>
#include <asm/pgalloc.h>
@@ -1148,9 +1149,6 @@ static struct task_struct *copy_process(unsigned long clone_flags,
INIT_LIST_HEAD(&p->pi_state_list);
p->pi_state_cache = NULL;
#endif
-#ifdef CONFIG_CHECKPOINT
- p->checkpoint_ctx = NULL;
-#endif
/*
* sigaltstack should be cleared when sharing the same VM
*/
@@ -1188,6 +1186,13 @@ static struct task_struct *copy_process(unsigned long clone_flags,
/* Need tasklist lock for parent etc handling! */
write_lock_irq(&tasklist_lock);
+#ifdef CONFIG_CHECKPOINT
+ /* If parent is restarting, child should be too */
+ if (unlikely(current->checkpoint_ctx)) {
+ p->checkpoint_ctx = current->checkpoint_ctx;
+ ckpt_ctx_get(p->checkpoint_ctx);
+ }
+#endif
/*
* The task hasn't been attached yet, so its cpus_allowed mask will
* not be changed, nor will its assigned CPU.
--
1.6.0.4
^ permalink raw reply related [flat|nested] 6+ messages in thread[parent not found: <1253652463-11529-1-git-send-email-orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org>]
* Re: [PATCH] c/r: fix race of prepare_descendant() with an ongoing fork() [not found] ` <1253652463-11529-1-git-send-email-orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org> @ 2009-09-22 21:38 ` Serge E. Hallyn [not found] ` <20090922213858.GA30048-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org> 0 siblings, 1 reply; 6+ messages in thread From: Serge E. Hallyn @ 2009-09-22 21:38 UTC (permalink / raw) To: Oren Laadan Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Oren Laadan Quoting Oren Laadan (orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org): > From: Oren Laadan <orenl-RdfvBDnrOiyVc3sceRu5cw@public.gmane.org> > > If prepare_descendants() is walking a tree and one of the tasks is > forking, one of two bads can happen. If the child doesn't inherit the > ->ctx, it breaks the assumption that the entire subtree is prepared. > If the child inherits the ->ctx, it will have one without having taken > a reference. > > This patch closed this race by explicitly getting and referencing the > ->ctx for a child process should the parent have one, atomically under > the tasklist_lock. > > Signed-off-by: Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org> > --- > kernel/fork.c | 11 ++++++++--- > 1 files changed, 8 insertions(+), 3 deletions(-) > > diff --git a/kernel/fork.c b/kernel/fork.c > index 9f13d7b..57118e4 100644 > --- a/kernel/fork.c > +++ b/kernel/fork.c > @@ -62,6 +62,7 @@ > #include <linux/fs_struct.h> > #include <linux/magic.h> > #include <linux/perf_counter.h> > +#include <linux/checkpoint.h> > > #include <asm/pgtable.h> > #include <asm/pgalloc.h> > @@ -1148,9 +1149,6 @@ static struct task_struct *copy_process(unsigned long clone_flags, > INIT_LIST_HEAD(&p->pi_state_list); > p->pi_state_cache = NULL; > #endif > -#ifdef CONFIG_CHECKPOINT > - p->checkpoint_ctx = NULL; > -#endif > /* > * sigaltstack should be cleared when sharing the same VM > */ > @@ -1188,6 +1186,13 @@ static struct task_struct *copy_process(unsigned long clone_flags, > /* Need tasklist lock for parent etc handling! */ > write_lock_irq(&tasklist_lock); > > +#ifdef CONFIG_CHECKPOINT > + /* If parent is restarting, child should be too */ > + if (unlikely(current->checkpoint_ctx)) { > + p->checkpoint_ctx = current->checkpoint_ctx; Won't break anything, but technically p->checkpoint_ctx will already be copied from current->checkpoint_ctx, so only the ckpt_ctx_get() is necessary, so this could really read if (p->checkpoint_ctx) ckpt_ctx_get(p->checkpoint_ctx); Right? > + ckpt_ctx_get(p->checkpoint_ctx); > + } > +#endif > /* > * The task hasn't been attached yet, so its cpus_allowed mask will > * not be changed, nor will its assigned CPU. > -- > 1.6.0.4 > > _______________________________________________ > Containers mailing list > Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org > https://lists.linux-foundation.org/mailman/listinfo/containers ^ permalink raw reply [flat|nested] 6+ messages in thread
[parent not found: <20090922213858.GA30048-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH] c/r: fix race of prepare_descendant() with an ongoing fork() [not found] ` <20090922213858.GA30048-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org> @ 2009-09-22 21:46 ` Serge E. Hallyn [not found] ` <20090922214619.GA30215-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org> 2009-09-22 23:41 ` Oren Laadan 1 sibling, 1 reply; 6+ messages in thread From: Serge E. Hallyn @ 2009-09-22 21:46 UTC (permalink / raw) To: Oren Laadan Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Oren Laadan Quoting Serge E. Hallyn (serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org): > Quoting Oren Laadan (orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org): > > From: Oren Laadan <orenl-RdfvBDnrOiyVc3sceRu5cw@public.gmane.org> > > > > If prepare_descendants() is walking a tree and one of the tasks is > > forking, one of two bads can happen. If the child doesn't inherit the > > ->ctx, it breaks the assumption that the entire subtree is prepared. > > If the child inherits the ->ctx, it will have one without having taken > > a reference. > > > > This patch closed this race by explicitly getting and referencing the > > ->ctx for a child process should the parent have one, atomically under > > the tasklist_lock. > > > > Signed-off-by: Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org> > > --- > > kernel/fork.c | 11 ++++++++--- > > 1 files changed, 8 insertions(+), 3 deletions(-) > > > > diff --git a/kernel/fork.c b/kernel/fork.c > > index 9f13d7b..57118e4 100644 > > --- a/kernel/fork.c > > +++ b/kernel/fork.c > > @@ -62,6 +62,7 @@ > > #include <linux/fs_struct.h> > > #include <linux/magic.h> > > #include <linux/perf_counter.h> > > +#include <linux/checkpoint.h> > > > > #include <asm/pgtable.h> > > #include <asm/pgalloc.h> > > @@ -1148,9 +1149,6 @@ static struct task_struct *copy_process(unsigned long clone_flags, > > INIT_LIST_HEAD(&p->pi_state_list); > > p->pi_state_cache = NULL; > > #endif > > -#ifdef CONFIG_CHECKPOINT > > - p->checkpoint_ctx = NULL; > > -#endif > > /* > > * sigaltstack should be cleared when sharing the same VM > > */ > > @@ -1188,6 +1186,13 @@ static struct task_struct *copy_process(unsigned long clone_flags, > > /* Need tasklist lock for parent etc handling! */ > > write_lock_irq(&tasklist_lock); > > > > +#ifdef CONFIG_CHECKPOINT > > + /* If parent is restarting, child should be too */ > > + if (unlikely(current->checkpoint_ctx)) { > > + p->checkpoint_ctx = current->checkpoint_ctx; > > Won't break anything, but technically p->checkpoint_ctx will > already be copied from current->checkpoint_ctx, so only the > ckpt_ctx_get() is necessary, so this could really read > > if (p->checkpoint_ctx) > ckpt_ctx_get(p->checkpoint_ctx); > > Right? BTW since all I'm doing is nit-picking, I obviously agree with the patch on the whole :) There is no way for the task to be forked into a traced state (without the parent being traced) right? And, is the fact that ctx->nr_total may end up less than the total number of active tasks a problem at all? thanks, -serge ^ permalink raw reply [flat|nested] 6+ messages in thread
[parent not found: <20090922214619.GA30215-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH] c/r: fix race of prepare_descendant() with an ongoing fork() [not found] ` <20090922214619.GA30215-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org> @ 2009-09-22 23:43 ` Oren Laadan 0 siblings, 0 replies; 6+ messages in thread From: Oren Laadan @ 2009-09-22 23:43 UTC (permalink / raw) To: Serge E. Hallyn; +Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA Serge E. Hallyn wrote: > Quoting Serge E. Hallyn (serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org): >> Quoting Oren Laadan (orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org): >>> From: Oren Laadan <orenl-RdfvBDnrOiyVc3sceRu5cw@public.gmane.org> >>> >>> If prepare_descendants() is walking a tree and one of the tasks is >>> forking, one of two bads can happen. If the child doesn't inherit the >>> ->ctx, it breaks the assumption that the entire subtree is prepared. >>> If the child inherits the ->ctx, it will have one without having taken >>> a reference. >>> >>> This patch closed this race by explicitly getting and referencing the >>> ->ctx for a child process should the parent have one, atomically under >>> the tasklist_lock. >>> >>> Signed-off-by: Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org> >>> --- >>> kernel/fork.c | 11 ++++++++--- >>> 1 files changed, 8 insertions(+), 3 deletions(-) >>> >>> diff --git a/kernel/fork.c b/kernel/fork.c >>> index 9f13d7b..57118e4 100644 >>> --- a/kernel/fork.c >>> +++ b/kernel/fork.c >>> @@ -62,6 +62,7 @@ >>> #include <linux/fs_struct.h> >>> #include <linux/magic.h> >>> #include <linux/perf_counter.h> >>> +#include <linux/checkpoint.h> >>> >>> #include <asm/pgtable.h> >>> #include <asm/pgalloc.h> >>> @@ -1148,9 +1149,6 @@ static struct task_struct *copy_process(unsigned long clone_flags, >>> INIT_LIST_HEAD(&p->pi_state_list); >>> p->pi_state_cache = NULL; >>> #endif >>> -#ifdef CONFIG_CHECKPOINT >>> - p->checkpoint_ctx = NULL; >>> -#endif >>> /* >>> * sigaltstack should be cleared when sharing the same VM >>> */ >>> @@ -1188,6 +1186,13 @@ static struct task_struct *copy_process(unsigned long clone_flags, >>> /* Need tasklist lock for parent etc handling! */ >>> write_lock_irq(&tasklist_lock); >>> >>> +#ifdef CONFIG_CHECKPOINT >>> + /* If parent is restarting, child should be too */ >>> + if (unlikely(current->checkpoint_ctx)) { >>> + p->checkpoint_ctx = current->checkpoint_ctx; >> Won't break anything, but technically p->checkpoint_ctx will >> already be copied from current->checkpoint_ctx, so only the >> ckpt_ctx_get() is necessary, so this could really read >> >> if (p->checkpoint_ctx) >> ckpt_ctx_get(p->checkpoint_ctx); >> >> Right? > > BTW since all I'm doing is nit-picking, I obviously agree with > the patch on the whole :) > > There is no way for the task to be forked into a traced state > (without the parent being traced) right? And, is the fact that Not that I'm aware of. > ctx->nr_total may end up less than the total number of active > tasks a problem at all? That's ok. A restart will fail if there are not enough tasks, and will not-succeed (hang until interrupted) if there are "too many" tasks. The above is a protection against malicious use of sys_restart(), not a fix to a correct use through (bug-less!) restart.c :p Oren. > > thanks, > -serge ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] c/r: fix race of prepare_descendant() with an ongoing fork() [not found] ` <20090922213858.GA30048-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org> 2009-09-22 21:46 ` Serge E. Hallyn @ 2009-09-22 23:41 ` Oren Laadan [not found] ` <4AB960A1.2090209-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org> 1 sibling, 1 reply; 6+ messages in thread From: Oren Laadan @ 2009-09-22 23:41 UTC (permalink / raw) To: Serge E. Hallyn Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Oren Laadan Serge E. Hallyn wrote: > Quoting Oren Laadan (orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org): >> From: Oren Laadan <orenl-RdfvBDnrOiyVc3sceRu5cw@public.gmane.org> >> >> If prepare_descendants() is walking a tree and one of the tasks is >> forking, one of two bads can happen. If the child doesn't inherit the >> ->ctx, it breaks the assumption that the entire subtree is prepared. >> If the child inherits the ->ctx, it will have one without having taken >> a reference. >> >> This patch closed this race by explicitly getting and referencing the >> ->ctx for a child process should the parent have one, atomically under >> the tasklist_lock. >> >> Signed-off-by: Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org> >> --- >> kernel/fork.c | 11 ++++++++--- >> 1 files changed, 8 insertions(+), 3 deletions(-) >> >> diff --git a/kernel/fork.c b/kernel/fork.c >> index 9f13d7b..57118e4 100644 >> --- a/kernel/fork.c >> +++ b/kernel/fork.c >> @@ -62,6 +62,7 @@ >> #include <linux/fs_struct.h> >> #include <linux/magic.h> >> #include <linux/perf_counter.h> >> +#include <linux/checkpoint.h> >> >> #include <asm/pgtable.h> >> #include <asm/pgalloc.h> >> @@ -1148,9 +1149,6 @@ static struct task_struct *copy_process(unsigned long clone_flags, >> INIT_LIST_HEAD(&p->pi_state_list); >> p->pi_state_cache = NULL; >> #endif >> -#ifdef CONFIG_CHECKPOINT >> - p->checkpoint_ctx = NULL; >> -#endif >> /* >> * sigaltstack should be cleared when sharing the same VM >> */ >> @@ -1188,6 +1186,13 @@ static struct task_struct *copy_process(unsigned long clone_flags, >> /* Need tasklist lock for parent etc handling! */ >> write_lock_irq(&tasklist_lock); >> >> +#ifdef CONFIG_CHECKPOINT >> + /* If parent is restarting, child should be too */ >> + if (unlikely(current->checkpoint_ctx)) { >> + p->checkpoint_ctx = current->checkpoint_ctx; > > Won't break anything, but technically p->checkpoint_ctx will > already be copied from current->checkpoint_ctx, so only the > ckpt_ctx_get() is necessary, so this could really read > > if (p->checkpoint_ctx) > ckpt_ctx_get(p->checkpoint_ctx); > > Right? That's what I initially thought. However, it _is_ possible that the following occurs: 1) Task A starts a fork an does some of copy_process(): so @p is NULL, but copy_process() isn't complete 2) An ancestor of task A calls prepare_descendants(), which then places a valid checkpoint_ctx on A. That is why I placed this test-and-set snippet while holding tasklist_lock (write): it ensure that the child has whatever the parent has, because prepare_descendants() uses the same lock. Oren. > >> + ckpt_ctx_get(p->checkpoint_ctx); >> + } >> +#endif >> /* >> * The task hasn't been attached yet, so its cpus_allowed mask will >> * not be changed, nor will its assigned CPU. >> -- >> 1.6.0.4 >> >> _______________________________________________ >> Containers mailing list >> Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org >> https://lists.linux-foundation.org/mailman/listinfo/containers ^ permalink raw reply [flat|nested] 6+ messages in thread
[parent not found: <4AB960A1.2090209-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org>]
* Re: [PATCH] c/r: fix race of prepare_descendant() with an ongoing fork() [not found] ` <4AB960A1.2090209-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org> @ 2009-09-23 0:05 ` Serge E. Hallyn 0 siblings, 0 replies; 6+ messages in thread From: Serge E. Hallyn @ 2009-09-23 0:05 UTC (permalink / raw) To: Oren Laadan Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Oren Laadan Quoting Oren Laadan (orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org): > > > Serge E. Hallyn wrote: > > Quoting Oren Laadan (orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org): > >> From: Oren Laadan <orenl-RdfvBDnrOiyVc3sceRu5cw@public.gmane.org> > >> > >> If prepare_descendants() is walking a tree and one of the tasks is > >> forking, one of two bads can happen. If the child doesn't inherit the > >> ->ctx, it breaks the assumption that the entire subtree is prepared. > >> If the child inherits the ->ctx, it will have one without having taken > >> a reference. > >> > >> This patch closed this race by explicitly getting and referencing the > >> ->ctx for a child process should the parent have one, atomically under > >> the tasklist_lock. > >> > >> Signed-off-by: Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org> > >> --- > >> kernel/fork.c | 11 ++++++++--- > >> 1 files changed, 8 insertions(+), 3 deletions(-) > >> > >> diff --git a/kernel/fork.c b/kernel/fork.c > >> index 9f13d7b..57118e4 100644 > >> --- a/kernel/fork.c > >> +++ b/kernel/fork.c > >> @@ -62,6 +62,7 @@ > >> #include <linux/fs_struct.h> > >> #include <linux/magic.h> > >> #include <linux/perf_counter.h> > >> +#include <linux/checkpoint.h> > >> > >> #include <asm/pgtable.h> > >> #include <asm/pgalloc.h> > >> @@ -1148,9 +1149,6 @@ static struct task_struct *copy_process(unsigned long clone_flags, > >> INIT_LIST_HEAD(&p->pi_state_list); > >> p->pi_state_cache = NULL; > >> #endif > >> -#ifdef CONFIG_CHECKPOINT > >> - p->checkpoint_ctx = NULL; > >> -#endif > >> /* > >> * sigaltstack should be cleared when sharing the same VM > >> */ > >> @@ -1188,6 +1186,13 @@ static struct task_struct *copy_process(unsigned long clone_flags, > >> /* Need tasklist lock for parent etc handling! */ > >> write_lock_irq(&tasklist_lock); > >> > >> +#ifdef CONFIG_CHECKPOINT > >> + /* If parent is restarting, child should be too */ > >> + if (unlikely(current->checkpoint_ctx)) { > >> + p->checkpoint_ctx = current->checkpoint_ctx; > > > > Won't break anything, but technically p->checkpoint_ctx will > > already be copied from current->checkpoint_ctx, so only the > > ckpt_ctx_get() is necessary, so this could really read > > > > if (p->checkpoint_ctx) > > ckpt_ctx_get(p->checkpoint_ctx); > > > > Right? > > That's what I initially thought. However, it _is_ possible that > the following occurs: > > 1) Task A starts a fork an does some of copy_process(): so > @p is NULL, but copy_process() isn't complete > > 2) An ancestor of task A calls prepare_descendants(), which > then places a valid checkpoint_ctx on A. Huh. Guess so. Thanks. Acked-by: Serge Hallyn <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org> > That is why I placed this test-and-set snippet while holding > tasklist_lock (write): it ensure that the child has whatever > the parent has, because prepare_descendants() uses the same > lock. > > Oren. > > > > >> + ckpt_ctx_get(p->checkpoint_ctx); > >> + } > >> +#endif > >> /* > >> * The task hasn't been attached yet, so its cpus_allowed mask will > >> * not be changed, nor will its assigned CPU. > >> -- > >> 1.6.0.4 > >> > >> _______________________________________________ > >> Containers mailing list > >> Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org > >> https://lists.linux-foundation.org/mailman/listinfo/containers ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2009-09-23 0:05 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-09-22 20:47 [PATCH] c/r: fix race of prepare_descendant() with an ongoing fork() Oren Laadan
[not found] ` <1253652463-11529-1-git-send-email-orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org>
2009-09-22 21:38 ` Serge E. Hallyn
[not found] ` <20090922213858.GA30048-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-09-22 21:46 ` Serge E. Hallyn
[not found] ` <20090922214619.GA30215-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-09-22 23:43 ` Oren Laadan
2009-09-22 23:41 ` Oren Laadan
[not found] ` <4AB960A1.2090209-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org>
2009-09-23 0:05 ` Serge E. Hallyn
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.