Linux Container Development
 help / color / mirror / Atom feed
From: "Serge E. Hallyn" <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
To: Oren Laadan <orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org>
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
Subject: Re: [PATCH] user-cr: fix race in ckpt_coordinator_pidns and --no-wait
Date: Mon, 26 Oct 2009 11:19:54 -0500	[thread overview]
Message-ID: <20091026161954.GC23564@us.ibm.com> (raw)
In-Reply-To: <1256508767-2356-1-git-send-email-orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org>

Quoting Oren Laadan (orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org):
> In a restart with --pidns, if the root-task has pid != 1, then we fork
> a new process as the init of the new pid-ns, and that process becomes
> the coordinator (instead of 'restart').
> 
> The original 'restart' process will then wait for the new coordinator
> to exit (when all restarted tasks and their descendants terminate).
> With --no-wait, 'restart' doesn't wait for them to exit, but should
> simply return as soon as the restart completes.
> 
> However, it has no way to know when (and if) the restart completes,
> because the coordiantor calls sys_restart(). As a result, it may exit
> early, exiting the feeder (which is a thread) too, and fail the
> restart.
> 
> This patch adds a pipe through which the new coordinator reports the
> status of the restart back to its parent, so that the parent only
> exits once it knows that the operation has completed.
> 
> Signed-off-by: Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>

Well, it looks good.  It does make me worry that there are too many
toggles and switches to keep it all working under all error conditions
long-term, but what the heck.

Acked-by: Serge Hallyn <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>

-serge

> ---
>  restart.c |   50 ++++++++++++++++++++++++++++++++++++++++++++++----
>  1 files changed, 46 insertions(+), 4 deletions(-)
> 
> diff --git a/restart.c b/restart.c
> index fbaab88..14d8aec 100644
> --- a/restart.c
> +++ b/restart.c
> @@ -254,6 +254,7 @@ struct ckpt_ctx {
> 
>  	int pipe_child[2];	/* for children to report status */
>  	int pipe_feed[2];	/* for feeder to provide input */
> +	int pipe_coord[2];	/* for coord to report status (if needed) */
> 
>  	struct ckpt_pids *pids_arr;
>  	struct ckpt_pids *copy_arr;
> @@ -848,7 +849,32 @@ static int ckpt_probe_child(pid_t pid, char *str)
>  #ifdef CLONE_NEWPID
>  static int __ckpt_coordinator(void *arg)
>  {
> -	return ckpt_coordinator((struct ckpt_ctx *) arg);
> +	struct ckpt_ctx *ctx = (struct ckpt_ctx *) arg;
> +
> +	if (!ctx->args->wait)
> +		close(ctx->pipe_coord[0]);
> +
> +	return ckpt_coordinator(ctx);
> +}
> +
> +static int ckpt_coordinator_status(struct ckpt_ctx *ctx)
> +{
> +	int status = -1;
> +	int ret;
> +
> +	close(ctx->pipe_coord[1]);
> +
> +	ret = read(ctx->pipe_coord[0], &status, sizeof(status));
> +	if (ret < 0)
> +		perror("read coordinator status");
> +	else if (ret == 0) {
> +		/* coordinator failed to report */
> +		ckpt_dbg("Coordinator failed to report status.");
> +	} else
> +		status = 0;
> +
> +	close(ctx->pipe_coord[0]);
> +	return status;
>  }
> 
>  static int ckpt_coordinator_pidns(struct ckpt_ctx *ctx)
> @@ -859,6 +885,15 @@ static int ckpt_coordinator_pidns(struct ckpt_ctx *ctx)
> 
>  	ckpt_dbg("forking coordinator in new pidns\n");
> 
> +	/*
> +	 * We won't wait for (collect) the coordinator, so we use a
> +	 * pipe instead for the coordinator to report success/failure.
> +	 */
> +	if (!ctx->args->wait && pipe(ctx->pipe_coord)) {
> +		perror("pipe");
> +		return -1;
> +	}
> +
>  	stk = malloc(PTHREAD_STACK_MIN);
>  	if (!stk) {
>  		perror("coordinator stack malloc");
> @@ -893,7 +928,7 @@ static int ckpt_coordinator_pidns(struct ckpt_ctx *ctx)
>  	if (ctx->args->wait)
>  		return ckpt_collect_child(ctx);
>  	else
> -		return 0;
> +		return ckpt_coordinator_status(ctx);
>  }
>  #else
>  static int ckpt_coordinator_pidns(struct ckpt_ctx *ctx)
> @@ -941,17 +976,24 @@ static int ckpt_coordinator(struct ckpt_ctx *ctx)
>  	ckpt_verbose("Success\n");
>  	ckpt_dbg("restart succeeded\n");
> 
> +	ret = 0;
> +
>  	if (ctx->args->pidns && ctx->tasks_arr[0].pid != 1) {
>  		/*
>  		 * If root task isn't container init, we must stay
>  		 * around and be reaper until all tasks are gone.
>  		 * Otherwise, container will die as soon as we exit.
>  		 */
> +		if (!ctx->args->wait) {
> +			/* report status because parent won't wait for us */
> +			if (write(ctx->pipe_coord[1], &ret, sizeof(ret)) < 0) {
> +				perror("failed to report status");
> +				exit(1);
> +			}
> +		}
>  		ret = ckpt_pretend_reaper(ctx);
>  	} else if (ctx->args->wait) {
>  		ret = ckpt_collect_child(ctx);
> -	} else {
> -		ret = 0;
>  	}
> 
>  	if (ret < 0)
> -- 
> 1.6.0.4
> 
> _______________________________________________
> Containers mailing list
> Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> https://lists.linux-foundation.org/mailman/listinfo/containers

      parent reply	other threads:[~2009-10-26 16:19 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-10-25 22:12 [PATCH] user-cr: fix race in ckpt_coordinator_pidns and --no-wait Oren Laadan
     [not found] ` <1256508767-2356-1-git-send-email-orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org>
2009-10-26 16:19   ` Serge E. Hallyn [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20091026161954.GC23564@us.ibm.com \
    --to=serue-r/jw6+rmf7hqt0dzr+alfa@public.gmane.org \
    --cc=containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org \
    --cc=orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox