From mboxrd@z Thu Jan 1 00:00:00 1970 From: Oren Laadan Subject: Re: [PATCH] c/r: fix race of prepare_descendant() with an ongoing fork() Date: Tue, 22 Sep 2009 19:43:03 -0400 Message-ID: <4AB96107.70709@librato.com> References: <1253652463-11529-1-git-send-email-orenl@librato.com> <20090922213858.GA30048@us.ibm.com> <20090922214619.GA30215@us.ibm.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20090922214619.GA30215-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: containers-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org Errors-To: containers-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org To: "Serge E. Hallyn" Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org List-Id: containers.vger.kernel.org Serge E. Hallyn wrote: > Quoting Serge E. Hallyn (serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org): >> Quoting Oren Laadan (orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org): >>> From: Oren Laadan >>> >>> If prepare_descendants() is walking a tree and one of the tasks is >>> forking, one of two bads can happen. If the child doesn't inherit the >>> ->ctx, it breaks the assumption that the entire subtree is prepared. >>> If the child inherits the ->ctx, it will have one without having taken >>> a reference. >>> >>> This patch closed this race by explicitly getting and referencing the >>> ->ctx for a child process should the parent have one, atomically under >>> the tasklist_lock. >>> >>> Signed-off-by: Oren Laadan >>> --- >>> kernel/fork.c | 11 ++++++++--- >>> 1 files changed, 8 insertions(+), 3 deletions(-) >>> >>> diff --git a/kernel/fork.c b/kernel/fork.c >>> index 9f13d7b..57118e4 100644 >>> --- a/kernel/fork.c >>> +++ b/kernel/fork.c >>> @@ -62,6 +62,7 @@ >>> #include >>> #include >>> #include >>> +#include >>> >>> #include >>> #include >>> @@ -1148,9 +1149,6 @@ static struct task_struct *copy_process(unsigned long clone_flags, >>> INIT_LIST_HEAD(&p->pi_state_list); >>> p->pi_state_cache = NULL; >>> #endif >>> -#ifdef CONFIG_CHECKPOINT >>> - p->checkpoint_ctx = NULL; >>> -#endif >>> /* >>> * sigaltstack should be cleared when sharing the same VM >>> */ >>> @@ -1188,6 +1186,13 @@ static struct task_struct *copy_process(unsigned long clone_flags, >>> /* Need tasklist lock for parent etc handling! */ >>> write_lock_irq(&tasklist_lock); >>> >>> +#ifdef CONFIG_CHECKPOINT >>> + /* If parent is restarting, child should be too */ >>> + if (unlikely(current->checkpoint_ctx)) { >>> + p->checkpoint_ctx = current->checkpoint_ctx; >> Won't break anything, but technically p->checkpoint_ctx will >> already be copied from current->checkpoint_ctx, so only the >> ckpt_ctx_get() is necessary, so this could really read >> >> if (p->checkpoint_ctx) >> ckpt_ctx_get(p->checkpoint_ctx); >> >> Right? > > BTW since all I'm doing is nit-picking, I obviously agree with > the patch on the whole :) > > There is no way for the task to be forked into a traced state > (without the parent being traced) right? And, is the fact that Not that I'm aware of. > ctx->nr_total may end up less than the total number of active > tasks a problem at all? That's ok. A restart will fail if there are not enough tasks, and will not-succeed (hang until interrupted) if there are "too many" tasks. The above is a protection against malicious use of sys_restart(), not a fix to a correct use through (bug-less!) restart.c :p Oren. > > thanks, > -serge