From: "Serge E. Hallyn" <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
To: Linux Containers <containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org>
Subject: [PATCH 1/3] restart: make sure all tasks are in sys_restart
Date: Tue, 29 Sep 2009 11:53:42 -0500 [thread overview]
Message-ID: <20090929165342.GA10076@us.ibm.com> (raw)
Make sure that all prepared tasks are in sys_restart before the
coordinator proceeds. Otherwise, it was possible for the last
task actually in sys_restart to sync before the straggler was in
sys_restart, sending that task a signal and causing that task to
exit. Then since there are not enough tasks completing restart,
the restart operation ends up hanging, waiting for the killed task.
Signed-off-by: Serge E. Hallyn <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
---
checkpoint/restart.c | 53 ++++++++++++++++++++++++++++++++++++++
checkpoint/sys.c | 1 +
include/linux/checkpoint_types.h | 2 +
3 files changed, 56 insertions(+), 0 deletions(-)
diff --git a/checkpoint/restart.c b/checkpoint/restart.c
index 543b380..849bda5 100644
--- a/checkpoint/restart.c
+++ b/checkpoint/restart.c
@@ -655,13 +655,35 @@ static struct ckpt_ctx *wait_checkpoint_ctx(void)
static int do_ghost_task(void)
{
struct ckpt_ctx *ctx;
+ int ret;
ctx = wait_checkpoint_ctx();
if (IS_ERR(ctx))
return PTR_ERR(ctx);
+ /*
+ * Wait until coordinator task has completed prepare_descendants().
+ * Note that prepare_descendants() wakes each task one a time
+ * finishing the wait_checkpoint_ctx() above. We may want to change
+ * that flow so tasks only sleep once, but this works for now.
+ */
+ ret = wait_event_interruptible(ctx->waitq,
+ atomic_read(&ctx->nr_running));
+ if (ret) {
+ ckpt_debug("wait for coordinator prep returned %d\n", ret);
+ return ret;
+ }
+ /*
+ * Now each prepared task decrements nr_running, and, when that
+ * hits zero, wakes the coordinator. That way the coordinator
+ * knows that all tasks are in sys_restart(0
+ */
+ if (atomic_dec_and_test(&ctx->nr_running))
+ complete(&ctx->all_ready);
+
current->flags |= PF_RESTARTING;
+ /* Finally wait for all ghosts to be woken up - to die */
wait_event_interruptible(ctx->ghostq,
all_tasks_activated(ctx) ||
ckpt_test_ctx_error(ctx));
@@ -682,6 +704,26 @@ static int do_restore_task(void)
if (IS_ERR(ctx))
return PTR_ERR(ctx);
+ /*
+ * Wait until coordinator task has completed prepare_descendants().
+ * Note that prepare_descendants() wakes each task one a time
+ * finishing the wait_checkpoint_ctx() above. We may want to change
+ * that flow so tasks only sleep once, but this works for now.
+ */
+ ret = wait_event_interruptible(ctx->waitq,
+ atomic_read(&ctx->nr_running));
+ if (ret) {
+ ckpt_debug("wait for coordinator prep returned %d\n", ret);
+ return ret;
+ }
+ /*
+ * Now each prepared task decrements nr_running, and, when that
+ * hits zero, wakes the coordinator. That way the coordinator
+ * knows that all tasks are in sys_restart(0
+ */
+ if (atomic_dec_and_test(&ctx->nr_running))
+ complete(&ctx->all_ready);
+
current->flags |= PF_RESTARTING;
/* wait for our turn, do the restore, and tell next task in line */
@@ -814,6 +856,7 @@ static int prepare_descendants(struct ckpt_ctx *ctx, struct task_struct *root)
ret = -ESRCH;
atomic_set(&ctx->nr_total, nr_pids);
+ atomic_set(&ctx->nr_running, nr_pids);
return ret;
}
@@ -948,6 +991,16 @@ static int do_restore_coord(struct ckpt_ctx *ctx, pid_t pid)
ret = prepare_descendants(ctx, ctx->root_task);
if (ret < 0)
goto out;
+
+ /* tell tasks we're ready for them to dec ctx->nr_running */
+ wake_up_all(&ctx->waitq);
+ /* ctx->nr_running hit 0, all our tasks are in sys_restart */
+ ret = wait_for_completion_interruptible(&ctx->all_ready);
+ if (ret) {
+ ckpt_debug("waiting on all_ready returned %d\n", ret);
+ goto out;
+ }
+
/* wait for all other tasks to complete do_restore_task() */
ret = wait_all_tasks_finish(ctx);
if (ret < 0)
diff --git a/checkpoint/sys.c b/checkpoint/sys.c
index 76a3fa9..d48e261 100644
--- a/checkpoint/sys.c
+++ b/checkpoint/sys.c
@@ -233,6 +233,7 @@ static struct ckpt_ctx *ckpt_ctx_alloc(int fd, unsigned long uflags,
ctx->uflags = uflags;
ctx->kflags = kflags;
ctx->ktime_begin = ktime_get();
+ init_completion(&ctx->all_ready);
atomic_set(&ctx->refcount, 0);
INIT_LIST_HEAD(&ctx->pgarr_list);
diff --git a/include/linux/checkpoint_types.h b/include/linux/checkpoint_types.h
index 9b7b4dd..2a854e0 100644
--- a/include/linux/checkpoint_types.h
+++ b/include/linux/checkpoint_types.h
@@ -69,8 +69,10 @@ struct ckpt_ctx {
struct ckpt_pids *pids_arr; /* array of all pids [restart] */
int nr_pids; /* size of pids array */
atomic_t nr_total; /* total tasks count (with ghosts) */
+ atomic_t nr_running; /* countdown of tasks in sys_restart */
int active_pid; /* (next) position in pids array */
struct completion complete; /* completion for container root */
+ struct completion all_ready; /* all tasks are in sys_restart */
wait_queue_head_t waitq; /* waitqueue for restarting tasks */
wait_queue_head_t ghostq; /* waitqueue for ghost tasks */
struct cred *realcred, *ecred; /* tmp storage for cred at restart */
--
1.6.1
next reply other threads:[~2009-09-29 16:53 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-09-29 16:53 Serge E. Hallyn [this message]
[not found] ` <20090929165342.GA10076-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-09-29 16:54 ` [PATCH 2/3] restart debug: add final process tree status Serge E. Hallyn
[not found] ` <20090929165402.GA10114-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-10-01 1:57 ` Oren Laadan
[not found] ` <4AC40CA0.8020305-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org>
2009-10-01 15:33 ` Serge E. Hallyn
[not found] ` <20091001153356.GA20565-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-10-01 23:29 ` Oren Laadan
2009-09-29 16:54 ` [PATCH 3/3] restart debug: splatter more ckpt_debugs about Serge E. Hallyn
[not found] ` <20090929165415.GB10114-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-10-01 1:54 ` Oren Laadan
2009-10-01 1:53 ` [PATCH 1/3] restart: make sure all tasks are in sys_restart Oren Laadan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20090929165342.GA10076@us.ibm.com \
--to=serue-r/jw6+rmf7hqt0dzr+alfa@public.gmane.org \
--cc=containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox