* [PATCH] c/r: coordinator to report correct error on restart failure
@ 2009-10-25 22:23 Oren Laadan
[not found] ` <1256509416-3897-1-git-send-email-orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org>
0 siblings, 1 reply; 2+ messages in thread
From: Oren Laadan @ 2009-10-25 22:23 UTC (permalink / raw)
To: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
If restart fails it is usually due to an error for a restoring task,
which is place in ctx->errno. Then the coordinator wakes up and sees
an -EINTR.
This patch changes the coordinator's behavior to report the error
value placed in ctx->errno (if an error occurred) rather than report
a confusing -EINTR.
Signed-off-by: Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
---
checkpoint/restart.c | 15 +++++++++++++--
1 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/checkpoint/restart.c b/checkpoint/restart.c
index 5daadc4..9b75de8 100644
--- a/checkpoint/restart.c
+++ b/checkpoint/restart.c
@@ -711,7 +711,7 @@ static inline int is_task_active(struct ckpt_ctx *ctx, pid_t pid)
}
/* should not be called under write_lock_irq(&tasklist_lock) */
-static inline void _restore_notify_error(struct ckpt_ctx *ctx, int errno)
+static void _restore_notify_error(struct ckpt_ctx *ctx, int errno)
{
/* first to fail: notify everyone (racy but harmless) */
if (!ckpt_test_ctx_error(ctx)) {
@@ -1263,9 +1263,20 @@ static int do_restore_coord(struct ckpt_ctx *ctx, pid_t pid)
post_restore_task();
restore_debug_error(ctx, ret);
- if (ret < 0) {
+
+ if (ret < 0)
ckpt_set_ctx_error(ctx, ret);
+
+ if (ckpt_test_ctx_error(ctx)) {
destroy_descendants(ctx);
+ /*
+ * If a restaring task (or we) reported an error, that set
+ * out return value to that error. (Need the unlikely loop
+ * because the error is recorded after the flag is set).
+ */
+ while (!ctx->errno)
+ yield();
+ ret = ctx->errno;
} else {
ckpt_set_ctx_success(ctx);
wake_up_all(&ctx->waitq);
--
1.6.0.4
^ permalink raw reply related [flat|nested] 2+ messages in thread
* Re: [PATCH] c/r: coordinator to report correct error on restart failure
[not found] ` <1256509416-3897-1-git-send-email-orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org>
@ 2009-10-26 18:35 ` Serge E. Hallyn
0 siblings, 0 replies; 2+ messages in thread
From: Serge E. Hallyn @ 2009-10-26 18:35 UTC (permalink / raw)
To: Oren Laadan; +Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
Quoting Oren Laadan (orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org):
> If restart fails it is usually due to an error for a restoring task,
> which is place in ctx->errno. Then the coordinator wakes up and sees
> an -EINTR.
>
> This patch changes the coordinator's behavior to report the error
> value placed in ctx->errno (if an error occurred) rather than report
> a confusing -EINTR.
>
> Signed-off-by: Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
Acked-by: Serge Hallyn <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
> ---
> checkpoint/restart.c | 15 +++++++++++++--
> 1 files changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/checkpoint/restart.c b/checkpoint/restart.c
> index 5daadc4..9b75de8 100644
> --- a/checkpoint/restart.c
> +++ b/checkpoint/restart.c
> @@ -711,7 +711,7 @@ static inline int is_task_active(struct ckpt_ctx *ctx, pid_t pid)
> }
>
> /* should not be called under write_lock_irq(&tasklist_lock) */
> -static inline void _restore_notify_error(struct ckpt_ctx *ctx, int errno)
> +static void _restore_notify_error(struct ckpt_ctx *ctx, int errno)
> {
> /* first to fail: notify everyone (racy but harmless) */
> if (!ckpt_test_ctx_error(ctx)) {
> @@ -1263,9 +1263,20 @@ static int do_restore_coord(struct ckpt_ctx *ctx, pid_t pid)
> post_restore_task();
>
> restore_debug_error(ctx, ret);
> - if (ret < 0) {
> +
> + if (ret < 0)
> ckpt_set_ctx_error(ctx, ret);
> +
> + if (ckpt_test_ctx_error(ctx)) {
> destroy_descendants(ctx);
> + /*
> + * If a restaring task (or we) reported an error, that set
> + * out return value to that error. (Need the unlikely loop
> + * because the error is recorded after the flag is set).
> + */
> + while (!ctx->errno)
> + yield();
> + ret = ctx->errno;
> } else {
> ckpt_set_ctx_success(ctx);
> wake_up_all(&ctx->waitq);
> --
> 1.6.0.4
>
> _______________________________________________
> Containers mailing list
> Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> https://lists.linux-foundation.org/mailman/listinfo/containers
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2009-10-26 18:35 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-10-25 22:23 [PATCH] c/r: coordinator to report correct error on restart failure Oren Laadan
[not found] ` <1256509416-3897-1-git-send-email-orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org>
2009-10-26 18:35 ` Serge E. Hallyn
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox