All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH][cr]: Fix ghost task bug
@ 2011-02-25  7:55 Sukadev Bhattiprolu
       [not found] ` <20110225075552.GB24361-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Sukadev Bhattiprolu @ 2011-02-25  7:55 UTC (permalink / raw)
  To: Oren Laadan; +Cc: Louis.Rilling-aw0BnHfMbSpBDgjK7y7TUQ, Containers


From b32a8c440687955975180e909620eba50cb146c3 Mon Sep 17 00:00:00 2001
From: Sukadev Bhattiprolu <sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
Date: Tue, 22 Feb 2011 10:44:55 -0800
Subject: [PATCH 1/1] Fix ghost task bug

Ghost tasks used to be marked as "detached" in do_ghost_task() but
this resulted in a crash which was fixed as discussed in:
https://lists.linux-foundation.org/pipermail/containers/2010-December/026076.html.

But that fix incorrectly attempted to mark a task as detached from user
space. It is not possible to "detach" a task from user space, which
means we must detach the thread in the kernel while still avoiding
the above crash.

The original crash occured because the container-init did not wait for
the detached children. When the detached children exited, they would
access a freed proc_mnt. Eric Biederman fixed this crash as discussed in
the above link.

While Eric's patch fixed the crash it could still leave the container
init hanging indefinitely due to the following race:

container-int:				ghost task
-----------------                      ------------
					do_ghost_task()

zap_pid_ns_processes()
- send SIGKILL
- do_wait()
   - at least one child exists
   - so wait for child to exit
   					wake up for the SIGKILL
					set ->exit_signal = -1
					exit without notifying parent

This leaves the container init waiting indefinitely.

To fix this hang, we have the children of container-init issue an extra wake
up call in exit_checkpoint(). Note that in exit_checkpoint() we do not know
if it is the ghost task that is exiting. Since this wake up applies to any
other task, we should further make sure that the parent is itself not exiting
(which could cause __wake_up_parent() to access invalid pointers in the
parent's task structure).

See this thread for more discussion:

https://lists.linux-foundation.org/pipermail/containers/2011-February/026459.html

Signed-off-by: Sukadev Bhattiprolu (sukadev-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org)
Cc: Louis Rilling <Louis.Rilling-aw0BnHfMbSpBDgjK7y7TUQ@public.gmane.org>
---
 kernel/checkpoint/restart.c |   16 ++++++++++++++++
 1 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/kernel/checkpoint/restart.c b/kernel/checkpoint/restart.c
index b0ea8ec..8ecc052 100644
--- a/kernel/checkpoint/restart.c
+++ b/kernel/checkpoint/restart.c
@@ -972,6 +972,7 @@ static int do_ghost_task(void)
 	if (ret < 0)
 		ckpt_err(ctx, ret, "ghost restart failed\n");
 
+	current->exit_signal = -1;
 	restore_debug_exit(ctx);
 	ckpt_ctx_put(ctx);
 	do_exit(0);
@@ -1465,7 +1466,22 @@ void exit_checkpoint(struct task_struct *tsk)
 	/* restarting zombies will activate next task in restart */
 	if (tsk->flags & PF_RESTARTING) {
 		BUG_ON(ctx->active_pid == -1);
+
+		/*
+		 * if we are a "ghost" task, that was terminated by the
+		 * container-init (from zap_pid_ns_processes()), we should
+		 * wake up the parent since we are now a detached process.
+		 */
+		read_lock_irq(&tasklist_lock);
+                if (tsk->exit_state == EXIT_DEAD && !tsk->parent->exit_state) {
+                        ckpt_debug("[%d, %s]: exit_checkpoint(): notifying "
+					"parent\n", tsk->pid, tsk->comm);
+                        __wake_up_parent(tsk, tsk->parent);
+                }
+		read_unlock_irq(&tasklist_lock);
+
 		restore_task_done(ctx);
+
 	}
 
 	ckpt_ctx_put(ctx);
-- 
1.6.6.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH][cr]: Fix ghost task bug
       [not found] ` <20110225075552.GB24361-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
@ 2011-02-25 11:55   ` Louis Rilling
       [not found]     ` <20110225115507.GD3915-Hu8+6S1rdjywhHL9vcZdMVaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Louis Rilling @ 2011-02-25 11:55 UTC (permalink / raw)
  To: Sukadev Bhattiprolu; +Cc: Containers


[-- Attachment #1.1: Type: text/plain, Size: 4869 bytes --]

On 24/02/11 23:55 -0800, Sukadev Bhattiprolu wrote:
> 
> From b32a8c440687955975180e909620eba50cb146c3 Mon Sep 17 00:00:00 2001
> From: Sukadev Bhattiprolu <sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
> Date: Tue, 22 Feb 2011 10:44:55 -0800
> Subject: [PATCH 1/1] Fix ghost task bug
> 
> Ghost tasks used to be marked as "detached" in do_ghost_task() but
> this resulted in a crash which was fixed as discussed in:
> https://lists.linux-foundation.org/pipermail/containers/2010-December/026076.html.
> 
> But that fix incorrectly attempted to mark a task as detached from user
> space. It is not possible to "detach" a task from user space, which
> means we must detach the thread in the kernel while still avoiding
> the above crash.
> 
> The original crash occured because the container-init did not wait for
> the detached children. When the detached children exited, they would
> access a freed proc_mnt. Eric Biederman fixed this crash as discussed in
> the above link.
> 
> While Eric's patch fixed the crash it could still leave the container
> init hanging indefinitely due to the following race:
> 
> container-int:				ghost task
> -----------------                      ------------
> 					do_ghost_task()
> 
> zap_pid_ns_processes()
> - send SIGKILL
> - do_wait()
>    - at least one child exists
>    - so wait for child to exit
>    					wake up for the SIGKILL
> 					set ->exit_signal = -1
> 					exit without notifying parent
> 
> This leaves the container init waiting indefinitely.
> 
> To fix this hang, we have the children of container-init issue an extra wake
> up call in exit_checkpoint(). Note that in exit_checkpoint() we do not know
> if it is the ghost task that is exiting. Since this wake up applies to any
> other task, we should further make sure that the parent is itself not exiting
> (which could cause __wake_up_parent() to access invalid pointers in the
> parent's task structure).
> 
> See this thread for more discussion:
> 
> https://lists.linux-foundation.org/pipermail/containers/2011-February/026459.html
> 
> Signed-off-by: Sukadev Bhattiprolu (sukadev-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org)
> Cc: Louis Rilling <Louis.Rilling-aw0BnHfMbSpBDgjK7y7TUQ@public.gmane.org>
> ---
>  kernel/checkpoint/restart.c |   16 ++++++++++++++++
>  1 files changed, 16 insertions(+), 0 deletions(-)
> 
> diff --git a/kernel/checkpoint/restart.c b/kernel/checkpoint/restart.c
> index b0ea8ec..8ecc052 100644
> --- a/kernel/checkpoint/restart.c
> +++ b/kernel/checkpoint/restart.c
> @@ -972,6 +972,7 @@ static int do_ghost_task(void)
>  	if (ret < 0)
>  		ckpt_err(ctx, ret, "ghost restart failed\n");
>  
> +	current->exit_signal = -1;

Setting ->exit_signal outside of tasklist_lock makes me nervous. All other
places that change ->exit_signal hold write_lock_irq(&tasklist_lock), and
eligibile_child() (for an instance of a reader being another task) holds
read_lock(&tasklist_lock). But maybe this does not matter for ghost tasks.

>  	restore_debug_exit(ctx);
>  	ckpt_ctx_put(ctx);
>  	do_exit(0);
> @@ -1465,7 +1466,22 @@ void exit_checkpoint(struct task_struct *tsk)
>  	/* restarting zombies will activate next task in restart */
>  	if (tsk->flags & PF_RESTARTING) {
>  		BUG_ON(ctx->active_pid == -1);
> +
> +		/*
> +		 * if we are a "ghost" task, that was terminated by the
> +		 * container-init (from zap_pid_ns_processes()), we should
> +		 * wake up the parent since we are now a detached process.
> +		 */
> +		read_lock_irq(&tasklist_lock);

read_lock() is enough. tasklist_lock is never taken for write from IRQs or
softIRQs.

> +                if (tsk->exit_state == EXIT_DEAD && !tsk->parent->exit_state) {
> +                        ckpt_debug("[%d, %s]: exit_checkpoint(): notifying "
> +					"parent\n", tsk->pid, tsk->comm);
> +                        __wake_up_parent(tsk, tsk->parent);
> +                }
> +		read_unlock_irq(&tasklist_lock);

Looking at this closer, I wonder if this wakeup logic should be called from
do_ghost_task(), right after setting ->exit_signal. This way there would be no
need for a tricky condition to recognize ghost tasks, and (I think) this is
closer to the other cases changing ->exit_signal (reparent_leader() and
exit_notify()).

Opinions?

Thanks,

Louis

> +
>  		restore_task_done(ctx);
> +
>  	}
>  
>  	ckpt_ctx_put(ctx);
> -- 
> 1.6.6.1
> 
> _______________________________________________
> Containers mailing list
> Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> https://lists.linux-foundation.org/mailman/listinfo/containers

-- 
Dr Louis Rilling			Kerlabs
Skype: louis.rilling			Batiment Germanium
Phone: (+33|0) 6 80 89 08 23		80 avenue des Buttes de Coesmes
http://www.kerlabs.com/			35700 Rennes

[-- Attachment #1.2: Digital signature --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

[-- Attachment #2: Type: text/plain, Size: 206 bytes --]

_______________________________________________
Containers mailing list
Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
https://lists.linux-foundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH][cr]: Fix ghost task bug
       [not found]     ` <20110225115507.GD3915-Hu8+6S1rdjywhHL9vcZdMVaTQe2KTcn/@public.gmane.org>
@ 2011-02-25 19:01       ` Sukadev Bhattiprolu
       [not found]         ` <20110225190132.GA31300-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Sukadev Bhattiprolu @ 2011-02-25 19:01 UTC (permalink / raw)
  To: Oren Laadan, Containers

Louis Rilling [Louis.Rilling-aw0BnHfMbSpBDgjK7y7TUQ@public.gmane.org] wrote:
| On 24/02/11 23:55 -0800, Sukadev Bhattiprolu wrote:
| > 
| > diff --git a/kernel/checkpoint/restart.c b/kernel/checkpoint/restart.c
| > index b0ea8ec..8ecc052 100644
| > --- a/kernel/checkpoint/restart.c
| > +++ b/kernel/checkpoint/restart.c
| > @@ -972,6 +972,7 @@ static int do_ghost_task(void)
| >  	if (ret < 0)
| >  		ckpt_err(ctx, ret, "ghost restart failed\n");
| >  
| > +	current->exit_signal = -1;
| 
| Setting ->exit_signal outside of tasklist_lock makes me nervous. All other
| places that change ->exit_signal hold write_lock_irq(&tasklist_lock), and
| eligibile_child() (for an instance of a reader being another task) holds
| read_lock(&tasklist_lock). But maybe this does not matter for ghost tasks.
| 

Yes, an earlier version had the write_lock(&tasklist_lock). Will add it
back.

| >  	restore_debug_exit(ctx);
| >  	ckpt_ctx_put(ctx);
| >  	do_exit(0);
| > @@ -1465,7 +1466,22 @@ void exit_checkpoint(struct task_struct *tsk)
| >  	/* restarting zombies will activate next task in restart */
| >  	if (tsk->flags & PF_RESTARTING) {
| >  		BUG_ON(ctx->active_pid == -1);
| > +
| > +		/*
| > +		 * if we are a "ghost" task, that was terminated by the
| > +		 * container-init (from zap_pid_ns_processes()), we should
| > +		 * wake up the parent since we are now a detached process.
| > +		 */
| > +		read_lock_irq(&tasklist_lock);
| 
| read_lock() is enough. tasklist_lock is never taken for write from IRQs or
| softIRQs.
| 
| > +                if (tsk->exit_state == EXIT_DEAD && !tsk->parent->exit_state) {
| > +                        ckpt_debug("[%d, %s]: exit_checkpoint(): notifying "
| > +					"parent\n", tsk->pid, tsk->comm);
| > +                        __wake_up_parent(tsk, tsk->parent);
| > +                }
| > +		read_unlock_irq(&tasklist_lock);
| 
| Looking at this closer, I wonder if this wakeup logic should be called from
| do_ghost_task(), right after setting ->exit_signal. This way there would be no
| need for a tricky condition to recognize ghost tasks, and (I think) this is
| closer to the other cases changing ->exit_signal (reparent_leader() and
| exit_notify()).

Yes, we tried the following in the earlier version. 

void ghost_auto_reapable()
{
        write_lock(&tasklist_lock);
        current->exit_signal = -1;
        __wake_up_parent(current, current->parent);
        write_unlock(&tasklist_lock);
}

And called this from do_ghost_task(). But with this, the parent could
wake up, find that it still has an eligible child (the ghost) to wait 
for, and go back to waiting before the ghost enters the EXIT_DEAD state.
And so we would lose the wake up.

(zap_pid_ns_processes() passes the __WALL so the ghost would be considered
an eligible child).

Sukadev

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH][cr]: Fix ghost task bug
       [not found]         ` <20110225190132.GA31300-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
@ 2011-02-26 13:54           ` Louis Rilling
       [not found]             ` <20110226135431.GA3251-aw0BnHfMbSpBDgjK7y7TUQ@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Louis Rilling @ 2011-02-26 13:54 UTC (permalink / raw)
  To: Sukadev Bhattiprolu; +Cc: Containers


[-- Attachment #1.1: Type: text/plain, Size: 4182 bytes --]

On Fri, Feb 25, 2011 at 11:01:32AM -0800, Sukadev Bhattiprolu wrote:
> Louis Rilling [Louis.Rilling-aw0BnHfMbSpBDgjK7y7TUQ@public.gmane.org] wrote:
> | On 24/02/11 23:55 -0800, Sukadev Bhattiprolu wrote:
> | > 
> | > diff --git a/kernel/checkpoint/restart.c b/kernel/checkpoint/restart.c
> | > index b0ea8ec..8ecc052 100644
> | > --- a/kernel/checkpoint/restart.c
> | > +++ b/kernel/checkpoint/restart.c
> | > @@ -972,6 +972,7 @@ static int do_ghost_task(void)
> | >  	if (ret < 0)
> | >  		ckpt_err(ctx, ret, "ghost restart failed\n");
> | >  
> | > +	current->exit_signal = -1;
> | 
> | Setting ->exit_signal outside of tasklist_lock makes me nervous. All other
> | places that change ->exit_signal hold write_lock_irq(&tasklist_lock), and
> | eligibile_child() (for an instance of a reader being another task) holds
> | read_lock(&tasklist_lock). But maybe this does not matter for ghost tasks.
> | 
> 
> Yes, an earlier version had the write_lock(&tasklist_lock). Will add it
> back.
> 
> | >  	restore_debug_exit(ctx);
> | >  	ckpt_ctx_put(ctx);
> | >  	do_exit(0);
> | > @@ -1465,7 +1466,22 @@ void exit_checkpoint(struct task_struct *tsk)
> | >  	/* restarting zombies will activate next task in restart */
> | >  	if (tsk->flags & PF_RESTARTING) {
> | >  		BUG_ON(ctx->active_pid == -1);
> | > +
> | > +		/*
> | > +		 * if we are a "ghost" task, that was terminated by the
> | > +		 * container-init (from zap_pid_ns_processes()), we should
> | > +		 * wake up the parent since we are now a detached process.
> | > +		 */
> | > +		read_lock_irq(&tasklist_lock);
> | 
> | read_lock() is enough. tasklist_lock is never taken for write from IRQs or
> | softIRQs.
> | 
> | > +                if (tsk->exit_state == EXIT_DEAD && !tsk->parent->exit_state) {
> | > +                        ckpt_debug("[%d, %s]: exit_checkpoint(): notifying "
> | > +					"parent\n", tsk->pid, tsk->comm);
> | > +                        __wake_up_parent(tsk, tsk->parent);
> | > +                }
> | > +		read_unlock_irq(&tasklist_lock);
> | 
> | Looking at this closer, I wonder if this wakeup logic should be called from
> | do_ghost_task(), right after setting ->exit_signal. This way there would be no
> | need for a tricky condition to recognize ghost tasks, and (I think) this is
> | closer to the other cases changing ->exit_signal (reparent_leader() and
> | exit_notify()).
> 
> Yes, we tried the following in the earlier version. 
> 
> void ghost_auto_reapable()
> {
>         write_lock(&tasklist_lock);
>         current->exit_signal = -1;
>         __wake_up_parent(current, current->parent);
>         write_unlock(&tasklist_lock);
> }
> 
> And called this from do_ghost_task(). But with this, the parent could
> wake up, find that it still has an eligible child (the ghost) to wait 
> for, and go back to waiting before the ghost enters the EXIT_DEAD state.
> And so we would lose the wake up.
> 
> (zap_pid_ns_processes() passes the __WALL so the ghost would be considered
> an eligible child).

I think I see now. The point is that ->exit_signal = -1 is only meant to work
correctly for sub-threads, which the parent does not need to wait for. IOW, the
notion of detached task is only implemented for sub-threads.

IIUC, setting ->exit_signal to -1 is only used here to let exit_notify() set
->exit_state to EXIT_DEAD, right? Otherwise, setting ->exit_signal to 0 and
letting do_notify_parent() proceed for ghost tasks would have be sufficient I
guess (provided that the confusion between ghost tasks and zombies could be
easily avoided in do_notify_parent()).

Then I agree that the proposed patch looks like a reasonably simple approach.

Thanks for the explanation,

Louis

> 
> Sukadev
> _______________________________________________
> Containers mailing list
> Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> https://lists.linux-foundation.org/mailman/listinfo/containers

-- 
Dr Louis Rilling			Kerlabs
Skype: louis.rilling			Batiment Germanium
Phone: (+33|0) 6 80 89 08 23		80 avenue des Buttes de Coesmes
http://www.kerlabs.com/			35700 Rennes

[-- Attachment #1.2: Digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

[-- Attachment #2: Type: text/plain, Size: 206 bytes --]

_______________________________________________
Containers mailing list
Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
https://lists.linux-foundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH][cr]: Fix ghost task bug
       [not found]             ` <20110226135431.GA3251-aw0BnHfMbSpBDgjK7y7TUQ@public.gmane.org>
@ 2011-02-28 14:43               ` Oren Laadan
       [not found]                 ` <4D6BB499.7050602-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Oren Laadan @ 2011-02-28 14:43 UTC (permalink / raw)
  To: Louis.Rilling-aw0BnHfMbSpBDgjK7y7TUQ; +Cc: Containers, Sukadev Bhattiprolu



On 02/26/2011 08:54 AM, Louis Rilling wrote:
> On Fri, Feb 25, 2011 at 11:01:32AM -0800, Sukadev Bhattiprolu wrote:
>> Louis Rilling [Louis.Rilling-aw0BnHfMbSpBDgjK7y7TUQ@public.gmane.org] wrote:
>> | On 24/02/11 23:55 -0800, Sukadev Bhattiprolu wrote:
>> | > 
>> | > diff --git a/kernel/checkpoint/restart.c b/kernel/checkpoint/restart.c
>> | > index b0ea8ec..8ecc052 100644
>> | > --- a/kernel/checkpoint/restart.c
>> | > +++ b/kernel/checkpoint/restart.c
>> | > @@ -972,6 +972,7 @@ static int do_ghost_task(void)
>> | >  	if (ret < 0)
>> | >  		ckpt_err(ctx, ret, "ghost restart failed\n");
>> | >  
>> | > +	current->exit_signal = -1;
>> | 
>> | Setting ->exit_signal outside of tasklist_lock makes me nervous. All other
>> | places that change ->exit_signal hold write_lock_irq(&tasklist_lock), and
>> | eligibile_child() (for an instance of a reader being another task) holds
>> | read_lock(&tasklist_lock). But maybe this does not matter for ghost tasks.
>> | 
>>
>> Yes, an earlier version had the write_lock(&tasklist_lock). Will add it
>> back.
>>
>> | >  	restore_debug_exit(ctx);
>> | >  	ckpt_ctx_put(ctx);
>> | >  	do_exit(0);
>> | > @@ -1465,7 +1466,22 @@ void exit_checkpoint(struct task_struct *tsk)
>> | >  	/* restarting zombies will activate next task in restart */
>> | >  	if (tsk->flags & PF_RESTARTING) {
>> | >  		BUG_ON(ctx->active_pid == -1);
>> | > +
>> | > +		/*
>> | > +		 * if we are a "ghost" task, that was terminated by the
>> | > +		 * container-init (from zap_pid_ns_processes()), we should
>> | > +		 * wake up the parent since we are now a detached process.
>> | > +		 */
>> | > +		read_lock_irq(&tasklist_lock);
>> | 
>> | read_lock() is enough. tasklist_lock is never taken for write from IRQs or
>> | softIRQs.
>> | 
>> | > +                if (tsk->exit_state == EXIT_DEAD && !tsk->parent->exit_state) {
>> | > +                        ckpt_debug("[%d, %s]: exit_checkpoint(): notifying "
>> | > +					"parent\n", tsk->pid, tsk->comm);
>> | > +                        __wake_up_parent(tsk, tsk->parent);
>> | > +                }
>> | > +		read_unlock_irq(&tasklist_lock);
>> | 
>> | Looking at this closer, I wonder if this wakeup logic should be called from
>> | do_ghost_task(), right after setting ->exit_signal. This way there would be no
>> | need for a tricky condition to recognize ghost tasks, and (I think) this is
>> | closer to the other cases changing ->exit_signal (reparent_leader() and
>> | exit_notify()).
>>
>> Yes, we tried the following in the earlier version. 
>>
>> void ghost_auto_reapable()
>> {
>>         write_lock(&tasklist_lock);
>>         current->exit_signal = -1;
>>         __wake_up_parent(current, current->parent);
>>         write_unlock(&tasklist_lock);
>> }
>>
>> And called this from do_ghost_task(). But with this, the parent could
>> wake up, find that it still has an eligible child (the ghost) to wait 
>> for, and go back to waiting before the ghost enters the EXIT_DEAD state.
>> And so we would lose the wake up.
>>
>> (zap_pid_ns_processes() passes the __WALL so the ghost would be considered
>> an eligible child).
> 
> I think I see now. The point is that ->exit_signal = -1 is only meant to work
> correctly for sub-threads, which the parent does not need to wait for. IOW, the
> notion of detached task is only implemented for sub-threads.
> 
> IIUC, setting ->exit_signal to -1 is only used here to let exit_notify() set
> ->exit_state to EXIT_DEAD, right? Otherwise, setting ->exit_signal to 0 and
> letting do_notify_parent() proceed for ghost tasks would have be sufficient I
> guess (provided that the confusion between ghost tasks and zombies could be
> easily avoided in do_notify_parent()).
> 
> Then I agree that the proposed patch looks like a reasonably simple approach.
> 
> Thanks for the explanation,
> 

Louis, Suka:

One subtlety with the method is that if a process get reparented
(for whatever reason) then the ->exit_signal field is reset to 
SIGCHLD. Fortunately, that should not affect us because our ghost
tasks never become orphaned.

However, I can't avoid thinking that maybe there is a better way 
to do this altogether ?

The requirement is simple: ghost tasks are temporary tasks whose
role is to keep certain pid's alive for the duration of the restart
and then exit without a trace before the restarted tasks resume
execution.

The reason I opted for the ->exit_signal = -1 is because it makes
sure that the parent need not explicitly collect the child ghost.

If I had set it to 0, then it would not send a signal, but still
would require a wait() to collect it (right ?).

Can you think of a way to achieve this functionality without the 
subtleties that we have observed so far ?  even at the cost of a 
minor change to, say, wait() logic or what not ?

Thanks,

Oren.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH][cr]: Fix ghost task bug
       [not found]                 ` <4D6BB499.7050602-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
@ 2011-02-28 15:09                   ` Louis Rilling
       [not found]                     ` <20110228150942.GD18970-Hu8+6S1rdjywhHL9vcZdMVaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Louis Rilling @ 2011-02-28 15:09 UTC (permalink / raw)
  To: Oren Laadan; +Cc: Containers, Sukadev Bhattiprolu


[-- Attachment #1.1: Type: text/plain, Size: 6162 bytes --]

On 28/02/11  9:43 -0500, Oren Laadan wrote:
> 
> 
> On 02/26/2011 08:54 AM, Louis Rilling wrote:
> > On Fri, Feb 25, 2011 at 11:01:32AM -0800, Sukadev Bhattiprolu wrote:
> >> Louis Rilling [Louis.Rilling-aw0BnHfMbSpBDgjK7y7TUQ@public.gmane.org] wrote:
> >> | On 24/02/11 23:55 -0800, Sukadev Bhattiprolu wrote:
> >> | > 
> >> | > diff --git a/kernel/checkpoint/restart.c b/kernel/checkpoint/restart.c
> >> | > index b0ea8ec..8ecc052 100644
> >> | > --- a/kernel/checkpoint/restart.c
> >> | > +++ b/kernel/checkpoint/restart.c
> >> | > @@ -972,6 +972,7 @@ static int do_ghost_task(void)
> >> | >  	if (ret < 0)
> >> | >  		ckpt_err(ctx, ret, "ghost restart failed\n");
> >> | >  
> >> | > +	current->exit_signal = -1;
> >> | 
> >> | Setting ->exit_signal outside of tasklist_lock makes me nervous. All other
> >> | places that change ->exit_signal hold write_lock_irq(&tasklist_lock), and
> >> | eligibile_child() (for an instance of a reader being another task) holds
> >> | read_lock(&tasklist_lock). But maybe this does not matter for ghost tasks.
> >> | 
> >>
> >> Yes, an earlier version had the write_lock(&tasklist_lock). Will add it
> >> back.
> >>
> >> | >  	restore_debug_exit(ctx);
> >> | >  	ckpt_ctx_put(ctx);
> >> | >  	do_exit(0);
> >> | > @@ -1465,7 +1466,22 @@ void exit_checkpoint(struct task_struct *tsk)
> >> | >  	/* restarting zombies will activate next task in restart */
> >> | >  	if (tsk->flags & PF_RESTARTING) {
> >> | >  		BUG_ON(ctx->active_pid == -1);
> >> | > +
> >> | > +		/*
> >> | > +		 * if we are a "ghost" task, that was terminated by the
> >> | > +		 * container-init (from zap_pid_ns_processes()), we should
> >> | > +		 * wake up the parent since we are now a detached process.
> >> | > +		 */
> >> | > +		read_lock_irq(&tasklist_lock);
> >> | 
> >> | read_lock() is enough. tasklist_lock is never taken for write from IRQs or
> >> | softIRQs.
> >> | 
> >> | > +                if (tsk->exit_state == EXIT_DEAD && !tsk->parent->exit_state) {
> >> | > +                        ckpt_debug("[%d, %s]: exit_checkpoint(): notifying "
> >> | > +					"parent\n", tsk->pid, tsk->comm);
> >> | > +                        __wake_up_parent(tsk, tsk->parent);
> >> | > +                }
> >> | > +		read_unlock_irq(&tasklist_lock);
> >> | 
> >> | Looking at this closer, I wonder if this wakeup logic should be called from
> >> | do_ghost_task(), right after setting ->exit_signal. This way there would be no
> >> | need for a tricky condition to recognize ghost tasks, and (I think) this is
> >> | closer to the other cases changing ->exit_signal (reparent_leader() and
> >> | exit_notify()).
> >>
> >> Yes, we tried the following in the earlier version. 
> >>
> >> void ghost_auto_reapable()
> >> {
> >>         write_lock(&tasklist_lock);
> >>         current->exit_signal = -1;
> >>         __wake_up_parent(current, current->parent);
> >>         write_unlock(&tasklist_lock);
> >> }
> >>
> >> And called this from do_ghost_task(). But with this, the parent could
> >> wake up, find that it still has an eligible child (the ghost) to wait 
> >> for, and go back to waiting before the ghost enters the EXIT_DEAD state.
> >> And so we would lose the wake up.
> >>
> >> (zap_pid_ns_processes() passes the __WALL so the ghost would be considered
> >> an eligible child).
> > 
> > I think I see now. The point is that ->exit_signal = -1 is only meant to work
> > correctly for sub-threads, which the parent does not need to wait for. IOW, the
> > notion of detached task is only implemented for sub-threads.
> > 
> > IIUC, setting ->exit_signal to -1 is only used here to let exit_notify() set
> > ->exit_state to EXIT_DEAD, right? Otherwise, setting ->exit_signal to 0 and
> > letting do_notify_parent() proceed for ghost tasks would have be sufficient I
> > guess (provided that the confusion between ghost tasks and zombies could be
> > easily avoided in do_notify_parent()).
> > 
> > Then I agree that the proposed patch looks like a reasonably simple approach.
> > 
> > Thanks for the explanation,
> > 
> 
> Louis, Suka:
> 
> One subtlety with the method is that if a process get reparented
> (for whatever reason) then the ->exit_signal field is reset to 
> SIGCHLD. Fortunately, that should not affect us because our ghost
> tasks never become orphaned.

AFAICS, this is not true for detached tasks: neither in reparent_leader(), nor
in exit_notify(). Unless I missed another place?

> 
> However, I can't avoid thinking that maybe there is a better way 
> to do this altogether ?
> 
> The requirement is simple: ghost tasks are temporary tasks whose
> role is to keep certain pid's alive for the duration of the restart
> and then exit without a trace before the restarted tasks resume
> execution.
> 
> The reason I opted for the ->exit_signal = -1 is because it makes
> sure that the parent need not explicitly collect the child ghost.
> 
> If I had set it to 0, then it would not send a signal, but still
> would require a wait() to collect it (right ?).

Right.

> 
> Can you think of a way to achieve this functionality without the 
> subtleties that we have observed so far ?  even at the cost of a 
> minor change to, say, wait() logic or what not ?

I wonder if things would be easier by providing a mean to distinguish ghost
tasks from truely restarted tasks in do_notify_parent().

Assuming this, there might be a kernel-patch-free way, although I can't say if
this fits userspace restart constraints:

Setting the signal handler of ghost tasks' parent to SIG_IGN, and ->exit_signal
of ghost tasks to SIGCHLD, will make them autoreap. This requires that the
parent does not need to synchronize with other children until all ghost tasks
have exited, and that the parent remains alive too.

So, why not adding some flag PF_RESTART_GHOST or TIF_RESTART_GHOST?

What do you think?

Thanks,

Louis

-- 
Dr Louis Rilling			Kerlabs
Skype: louis.rilling			Batiment Germanium
Phone: (+33|0) 6 80 89 08 23		80 avenue des Buttes de Coesmes
http://www.kerlabs.com/			35700 Rennes

[-- Attachment #1.2: Digital signature --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

[-- Attachment #2: Type: text/plain, Size: 206 bytes --]

_______________________________________________
Containers mailing list
Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
https://lists.linux-foundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH][cr]: Fix ghost task bug
       [not found]                     ` <20110228150942.GD18970-Hu8+6S1rdjywhHL9vcZdMVaTQe2KTcn/@public.gmane.org>
@ 2011-02-28 22:10                       ` Oren Laadan
       [not found]                         ` <4D6C1D58.5060400-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Oren Laadan @ 2011-02-28 22:10 UTC (permalink / raw)
  To: Sukadev Bhattiprolu, Containers



On 02/28/2011 10:09 AM, Louis Rilling wrote:
> On 28/02/11  9:43 -0500, Oren Laadan wrote:
>>
>>
>> On 02/26/2011 08:54 AM, Louis Rilling wrote:
>>> On Fri, Feb 25, 2011 at 11:01:32AM -0800, Sukadev Bhattiprolu wrote:
>>>> Louis Rilling [Louis.Rilling-aw0BnHfMbSpBDgjK7y7TUQ@public.gmane.org] wrote:
>>>> | On 24/02/11 23:55 -0800, Sukadev Bhattiprolu wrote:
>>>> | > 
>>>> | > diff --git a/kernel/checkpoint/restart.c b/kernel/checkpoint/restart.c
>>>> | > index b0ea8ec..8ecc052 100644
>>>> | > --- a/kernel/checkpoint/restart.c
>>>> | > +++ b/kernel/checkpoint/restart.c
>>>> | > @@ -972,6 +972,7 @@ static int do_ghost_task(void)
>>>> | >  	if (ret < 0)
>>>> | >  		ckpt_err(ctx, ret, "ghost restart failed\n");
>>>> | >  
>>>> | > +	current->exit_signal = -1;
>>>> | 
>>>> | Setting ->exit_signal outside of tasklist_lock makes me nervous. All other
>>>> | places that change ->exit_signal hold write_lock_irq(&tasklist_lock), and
>>>> | eligibile_child() (for an instance of a reader being another task) holds
>>>> | read_lock(&tasklist_lock). But maybe this does not matter for ghost tasks.
>>>> | 
>>>>
>>>> Yes, an earlier version had the write_lock(&tasklist_lock). Will add it
>>>> back.
>>>>
>>>> | >  	restore_debug_exit(ctx);
>>>> | >  	ckpt_ctx_put(ctx);
>>>> | >  	do_exit(0);
>>>> | > @@ -1465,7 +1466,22 @@ void exit_checkpoint(struct task_struct *tsk)
>>>> | >  	/* restarting zombies will activate next task in restart */
>>>> | >  	if (tsk->flags & PF_RESTARTING) {
>>>> | >  		BUG_ON(ctx->active_pid == -1);
>>>> | > +
>>>> | > +		/*
>>>> | > +		 * if we are a "ghost" task, that was terminated by the
>>>> | > +		 * container-init (from zap_pid_ns_processes()), we should
>>>> | > +		 * wake up the parent since we are now a detached process.
>>>> | > +		 */
>>>> | > +		read_lock_irq(&tasklist_lock);
>>>> | 
>>>> | read_lock() is enough. tasklist_lock is never taken for write from IRQs or
>>>> | softIRQs.
>>>> | 
>>>> | > +                if (tsk->exit_state == EXIT_DEAD && !tsk->parent->exit_state) {
>>>> | > +                        ckpt_debug("[%d, %s]: exit_checkpoint(): notifying "
>>>> | > +					"parent\n", tsk->pid, tsk->comm);
>>>> | > +                        __wake_up_parent(tsk, tsk->parent);
>>>> | > +                }
>>>> | > +		read_unlock_irq(&tasklist_lock);
>>>> | 
>>>> | Looking at this closer, I wonder if this wakeup logic should be called from
>>>> | do_ghost_task(), right after setting ->exit_signal. This way there would be no
>>>> | need for a tricky condition to recognize ghost tasks, and (I think) this is
>>>> | closer to the other cases changing ->exit_signal (reparent_leader() and
>>>> | exit_notify()).
>>>>
>>>> Yes, we tried the following in the earlier version. 
>>>>
>>>> void ghost_auto_reapable()
>>>> {
>>>>         write_lock(&tasklist_lock);
>>>>         current->exit_signal = -1;
>>>>         __wake_up_parent(current, current->parent);
>>>>         write_unlock(&tasklist_lock);
>>>> }
>>>>
>>>> And called this from do_ghost_task(). But with this, the parent could
>>>> wake up, find that it still has an eligible child (the ghost) to wait 
>>>> for, and go back to waiting before the ghost enters the EXIT_DEAD state.
>>>> And so we would lose the wake up.
>>>>
>>>> (zap_pid_ns_processes() passes the __WALL so the ghost would be considered
>>>> an eligible child).
>>>
>>> I think I see now. The point is that ->exit_signal = -1 is only meant to work
>>> correctly for sub-threads, which the parent does not need to wait for. IOW, the
>>> notion of detached task is only implemented for sub-threads.
>>>
>>> IIUC, setting ->exit_signal to -1 is only used here to let exit_notify() set
>>> ->exit_state to EXIT_DEAD, right? Otherwise, setting ->exit_signal to 0 and
>>> letting do_notify_parent() proceed for ghost tasks would have be sufficient I
>>> guess (provided that the confusion between ghost tasks and zombies could be
>>> easily avoided in do_notify_parent()).
>>>
>>> Then I agree that the proposed patch looks like a reasonably simple approach.
>>>
>>> Thanks for the explanation,
>>>
>>
>> Louis, Suka:
>>
>> One subtlety with the method is that if a process get reparented
>> (for whatever reason) then the ->exit_signal field is reset to 
>> SIGCHLD. Fortunately, that should not affect us because our ghost
>> tasks never become orphaned.
> 
> AFAICS, this is not true for detached tasks: neither in reparent_leader(), nor
> in exit_notify(). Unless I missed another place?

Doh.. you are totally right, I missed that.
Sometime it feels really good to be wrong :)

> 
>>
>> However, I can't avoid thinking that maybe there is a better way 
>> to do this altogether ?
>>
>> The requirement is simple: ghost tasks are temporary tasks whose
>> role is to keep certain pid's alive for the duration of the restart
>> and then exit without a trace before the restarted tasks resume
>> execution.
>>
>> The reason I opted for the ->exit_signal = -1 is because it makes
>> sure that the parent need not explicitly collect the child ghost.
>>
>> If I had set it to 0, then it would not send a signal, but still
>> would require a wait() to collect it (right ?).
> 
> Right.
> 
>>
>> Can you think of a way to achieve this functionality without the 
>> subtleties that we have observed so far ?  even at the cost of a 
>> minor change to, say, wait() logic or what not ?
> 
> I wonder if things would be easier by providing a mean to distinguish ghost
> tasks from truely restarted tasks in do_notify_parent().
> 
> Assuming this, there might be a kernel-patch-free way, although I can't say if
> this fits userspace restart constraints:
> 
> Setting the signal handler of ghost tasks' parent to SIG_IGN, and ->exit_signal

Yes, I thought about it. But those parent are part of the restart,
and changing their signal handlers will is tricky and will require
complex sync logic at the end of restart to ensure that all tasks
restore their handlers after the ghosts are gone and released, but
before any task gets to userspace ... I don't want to go there.

> of ghost tasks to SIGCHLD, will make them autoreap. This requires that the
> parent does not need to synchronize with other children until all ghost tasks
> have exited, and that the parent remains alive too.
> 
> So, why not adding some flag PF_RESTART_GHOST or TIF_RESTART_GHOST?

Actually, we already have a way to distinguish them:

	if ((tsk->flags & PF_RESTARTING) && task_detached(tsk))

One problem with this is that we only set exit_signal to -1 after
the ghost was born, so if the parent is already waiting I think 
that will be racy, as per suka's comment above.

So looking at the code again, we could add one condition in exit.c
at wait_consider_task(), after the test of p->exit_state == EXIT_DEAD,
to also test:

inline static bool is_ghost_task(p)
{
	return (p->flags & (PF_EXITING|PF_RESTARTING) ==
		PF_EXITING|PF_RESTARTING) && task_detached(p)
}

	if (p->flags & is_ghost_task(p))
		return 0;

Or something along the lines (e.g. used EXIT_ZOMBIE comparison instead
of PF_EXITING). While requiring a kernel patch, it is relatively short,
clean and easy to review.

Thoughts ?

Oren.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH][cr]: Fix ghost task bug
       [not found]                         ` <4D6C1D58.5060400-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
@ 2011-03-01 15:31                           ` Louis Rilling
       [not found]                             ` <20110301153105.GD4410-Hu8+6S1rdjywhHL9vcZdMVaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Louis Rilling @ 2011-03-01 15:31 UTC (permalink / raw)
  To: Oren Laadan; +Cc: Containers, Sukadev Bhattiprolu


[-- Attachment #1.1: Type: text/plain, Size: 8381 bytes --]

On 28/02/11 17:10 -0500, Oren Laadan wrote:
> 
> 
> On 02/28/2011 10:09 AM, Louis Rilling wrote:
> > On 28/02/11  9:43 -0500, Oren Laadan wrote:
> >>
> >>
> >> On 02/26/2011 08:54 AM, Louis Rilling wrote:
> >>> On Fri, Feb 25, 2011 at 11:01:32AM -0800, Sukadev Bhattiprolu wrote:
> >>>> Louis Rilling [Louis.Rilling-aw0BnHfMbSpBDgjK7y7TUQ@public.gmane.org] wrote:
> >>>> | On 24/02/11 23:55 -0800, Sukadev Bhattiprolu wrote:
> >>>> | > 
> >>>> | > diff --git a/kernel/checkpoint/restart.c b/kernel/checkpoint/restart.c
> >>>> | > index b0ea8ec..8ecc052 100644
> >>>> | > --- a/kernel/checkpoint/restart.c
> >>>> | > +++ b/kernel/checkpoint/restart.c
> >>>> | > @@ -972,6 +972,7 @@ static int do_ghost_task(void)
> >>>> | >  	if (ret < 0)
> >>>> | >  		ckpt_err(ctx, ret, "ghost restart failed\n");
> >>>> | >  
> >>>> | > +	current->exit_signal = -1;
> >>>> | 
> >>>> | Setting ->exit_signal outside of tasklist_lock makes me nervous. All other
> >>>> | places that change ->exit_signal hold write_lock_irq(&tasklist_lock), and
> >>>> | eligibile_child() (for an instance of a reader being another task) holds
> >>>> | read_lock(&tasklist_lock). But maybe this does not matter for ghost tasks.
> >>>> | 
> >>>>
> >>>> Yes, an earlier version had the write_lock(&tasklist_lock). Will add it
> >>>> back.
> >>>>
> >>>> | >  	restore_debug_exit(ctx);
> >>>> | >  	ckpt_ctx_put(ctx);
> >>>> | >  	do_exit(0);
> >>>> | > @@ -1465,7 +1466,22 @@ void exit_checkpoint(struct task_struct *tsk)
> >>>> | >  	/* restarting zombies will activate next task in restart */
> >>>> | >  	if (tsk->flags & PF_RESTARTING) {
> >>>> | >  		BUG_ON(ctx->active_pid == -1);
> >>>> | > +
> >>>> | > +		/*
> >>>> | > +		 * if we are a "ghost" task, that was terminated by the
> >>>> | > +		 * container-init (from zap_pid_ns_processes()), we should
> >>>> | > +		 * wake up the parent since we are now a detached process.
> >>>> | > +		 */
> >>>> | > +		read_lock_irq(&tasklist_lock);
> >>>> | 
> >>>> | read_lock() is enough. tasklist_lock is never taken for write from IRQs or
> >>>> | softIRQs.
> >>>> | 
> >>>> | > +                if (tsk->exit_state == EXIT_DEAD && !tsk->parent->exit_state) {
> >>>> | > +                        ckpt_debug("[%d, %s]: exit_checkpoint(): notifying "
> >>>> | > +					"parent\n", tsk->pid, tsk->comm);
> >>>> | > +                        __wake_up_parent(tsk, tsk->parent);
> >>>> | > +                }
> >>>> | > +		read_unlock_irq(&tasklist_lock);
> >>>> | 
> >>>> | Looking at this closer, I wonder if this wakeup logic should be called from
> >>>> | do_ghost_task(), right after setting ->exit_signal. This way there would be no
> >>>> | need for a tricky condition to recognize ghost tasks, and (I think) this is
> >>>> | closer to the other cases changing ->exit_signal (reparent_leader() and
> >>>> | exit_notify()).
> >>>>
> >>>> Yes, we tried the following in the earlier version. 
> >>>>
> >>>> void ghost_auto_reapable()
> >>>> {
> >>>>         write_lock(&tasklist_lock);
> >>>>         current->exit_signal = -1;
> >>>>         __wake_up_parent(current, current->parent);
> >>>>         write_unlock(&tasklist_lock);
> >>>> }
> >>>>
> >>>> And called this from do_ghost_task(). But with this, the parent could
> >>>> wake up, find that it still has an eligible child (the ghost) to wait 
> >>>> for, and go back to waiting before the ghost enters the EXIT_DEAD state.
> >>>> And so we would lose the wake up.
> >>>>
> >>>> (zap_pid_ns_processes() passes the __WALL so the ghost would be considered
> >>>> an eligible child).
> >>>
> >>> I think I see now. The point is that ->exit_signal = -1 is only meant to work
> >>> correctly for sub-threads, which the parent does not need to wait for. IOW, the
> >>> notion of detached task is only implemented for sub-threads.
> >>>
> >>> IIUC, setting ->exit_signal to -1 is only used here to let exit_notify() set
> >>> ->exit_state to EXIT_DEAD, right? Otherwise, setting ->exit_signal to 0 and
> >>> letting do_notify_parent() proceed for ghost tasks would have be sufficient I
> >>> guess (provided that the confusion between ghost tasks and zombies could be
> >>> easily avoided in do_notify_parent()).
> >>>
> >>> Then I agree that the proposed patch looks like a reasonably simple approach.
> >>>
> >>> Thanks for the explanation,
> >>>
> >>
> >> Louis, Suka:
> >>
> >> One subtlety with the method is that if a process get reparented
> >> (for whatever reason) then the ->exit_signal field is reset to 
> >> SIGCHLD. Fortunately, that should not affect us because our ghost
> >> tasks never become orphaned.
> > 
> > AFAICS, this is not true for detached tasks: neither in reparent_leader(), nor
> > in exit_notify(). Unless I missed another place?
> 
> Doh.. you are totally right, I missed that.
> Sometime it feels really good to be wrong :)

;)

> 
> > 
> >>
> >> However, I can't avoid thinking that maybe there is a better way 
> >> to do this altogether ?
> >>
> >> The requirement is simple: ghost tasks are temporary tasks whose
> >> role is to keep certain pid's alive for the duration of the restart
> >> and then exit without a trace before the restarted tasks resume
> >> execution.
> >>
> >> The reason I opted for the ->exit_signal = -1 is because it makes
> >> sure that the parent need not explicitly collect the child ghost.
> >>
> >> If I had set it to 0, then it would not send a signal, but still
> >> would require a wait() to collect it (right ?).
> > 
> > Right.
> > 
> >>
> >> Can you think of a way to achieve this functionality without the 
> >> subtleties that we have observed so far ?  even at the cost of a 
> >> minor change to, say, wait() logic or what not ?
> > 
> > I wonder if things would be easier by providing a mean to distinguish ghost
> > tasks from truely restarted tasks in do_notify_parent().
> > 
> > Assuming this, there might be a kernel-patch-free way, although I can't say if
> > this fits userspace restart constraints:
> > 
> > Setting the signal handler of ghost tasks' parent to SIG_IGN, and ->exit_signal
> 
> Yes, I thought about it. But those parent are part of the restart,
> and changing their signal handlers will is tricky and will require
> complex sync logic at the end of restart to ensure that all tasks
> restore their handlers after the ghosts are gone and released, but
> before any task gets to userspace ... I don't want to go there.

Yes, I can imagine well the added complexity in userspace.

> 
> > of ghost tasks to SIGCHLD, will make them autoreap. This requires that the
> > parent does not need to synchronize with other children until all ghost tasks
> > have exited, and that the parent remains alive too.
> > 
> > So, why not adding some flag PF_RESTART_GHOST or TIF_RESTART_GHOST?
> 
> Actually, we already have a way to distinguish them:
> 
> 	if ((tsk->flags & PF_RESTARTING) && task_detached(tsk))
> 
> One problem with this is that we only set exit_signal to -1 after
> the ghost was born, so if the parent is already waiting I think 
> that will be racy, as per suka's comment above.

Yes.

> 
> So looking at the code again, we could add one condition in exit.c
> at wait_consider_task(), after the test of p->exit_state == EXIT_DEAD,
> to also test:
> 
> inline static bool is_ghost_task(p)
> {
> 	return (p->flags & (PF_EXITING|PF_RESTARTING) ==
> 		PF_EXITING|PF_RESTARTING) && task_detached(p)
> }
> 
> 	if (p->flags & is_ghost_task(p))
> 		return 0;
> 
> Or something along the lines (e.g. used EXIT_ZOMBIE comparison instead
> of PF_EXITING). While requiring a kernel patch, it is relatively short,
> clean and easy to review.

Adding the check in do_wait() is racy IMHO. What does prevent do_ghost_task()
from setting ->exit_signal = -1 after the parent sleeps in do_wait()?

Otherwise, why not making ghost tasks simply sub-threads of the "parent"? They
would autoreap and nobody would wait or have to wait for them. I would feel so
much better if we had not to change ->exit_signal in do_ghost_task()...

Opinion?

Thanks,

Louis

-- 
Dr Louis Rilling			Kerlabs
Skype: louis.rilling			Batiment Germanium
Phone: (+33|0) 6 80 89 08 23		80 avenue des Buttes de Coesmes
http://www.kerlabs.com/			35700 Rennes

[-- Attachment #1.2: Digital signature --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

[-- Attachment #2: Type: text/plain, Size: 206 bytes --]

_______________________________________________
Containers mailing list
Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
https://lists.linux-foundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH][cr]: Fix ghost task bug
       [not found]                             ` <20110301153105.GD4410-Hu8+6S1rdjywhHL9vcZdMVaTQe2KTcn/@public.gmane.org>
@ 2011-03-03 15:38                               ` Oren Laadan
       [not found]                                 ` <4D6FB5D9.8010002-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Oren Laadan @ 2011-03-03 15:38 UTC (permalink / raw)
  To: Sukadev Bhattiprolu, Containers



On 03/01/2011 10:31 AM, Louis Rilling wrote:
> On 28/02/11 17:10 -0500, Oren Laadan wrote:
>>
>>
>> On 02/28/2011 10:09 AM, Louis Rilling wrote:
>>> On 28/02/11  9:43 -0500, Oren Laadan wrote:
>>>>
>>>>
>>>> On 02/26/2011 08:54 AM, Louis Rilling wrote:
>>>>> On Fri, Feb 25, 2011 at 11:01:32AM -0800, Sukadev Bhattiprolu wrote:
>>>>>> Louis Rilling [Louis.Rilling-aw0BnHfMbSpBDgjK7y7TUQ@public.gmane.org] wrote:
>>>>>> | On 24/02/11 23:55 -0800, Sukadev Bhattiprolu wrote:
>>>>>> | > 
>>>>>> | > diff --git a/kernel/checkpoint/restart.c b/kernel/checkpoint/restart.c
>>>>>> | > index b0ea8ec..8ecc052 100644
>>>>>> | > --- a/kernel/checkpoint/restart.c
>>>>>> | > +++ b/kernel/checkpoint/restart.c
>>>>>> | > @@ -972,6 +972,7 @@ static int do_ghost_task(void)
>>>>>> | >  	if (ret < 0)
>>>>>> | >  		ckpt_err(ctx, ret, "ghost restart failed\n");
>>>>>> | >  
>>>>>> | > +	current->exit_signal = -1;
>>>>>> | 
>>>>>> | Setting ->exit_signal outside of tasklist_lock makes me nervous. All other
>>>>>> | places that change ->exit_signal hold write_lock_irq(&tasklist_lock), and
>>>>>> | eligibile_child() (for an instance of a reader being another task) holds
>>>>>> | read_lock(&tasklist_lock). But maybe this does not matter for ghost tasks.
>>>>>> | 
>>>>>>
>>>>>> Yes, an earlier version had the write_lock(&tasklist_lock). Will add it
>>>>>> back.
>>>>>>
>>>>>> | >  	restore_debug_exit(ctx);
>>>>>> | >  	ckpt_ctx_put(ctx);
>>>>>> | >  	do_exit(0);
>>>>>> | > @@ -1465,7 +1466,22 @@ void exit_checkpoint(struct task_struct *tsk)
>>>>>> | >  	/* restarting zombies will activate next task in restart */
>>>>>> | >  	if (tsk->flags & PF_RESTARTING) {
>>>>>> | >  		BUG_ON(ctx->active_pid == -1);
>>>>>> | > +
>>>>>> | > +		/*
>>>>>> | > +		 * if we are a "ghost" task, that was terminated by the
>>>>>> | > +		 * container-init (from zap_pid_ns_processes()), we should
>>>>>> | > +		 * wake up the parent since we are now a detached process.
>>>>>> | > +		 */
>>>>>> | > +		read_lock_irq(&tasklist_lock);
>>>>>> | 
>>>>>> | read_lock() is enough. tasklist_lock is never taken for write from IRQs or
>>>>>> | softIRQs.
>>>>>> | 
>>>>>> | > +                if (tsk->exit_state == EXIT_DEAD && !tsk->parent->exit_state) {
>>>>>> | > +                        ckpt_debug("[%d, %s]: exit_checkpoint(): notifying "
>>>>>> | > +					"parent\n", tsk->pid, tsk->comm);
>>>>>> | > +                        __wake_up_parent(tsk, tsk->parent);
>>>>>> | > +                }
>>>>>> | > +		read_unlock_irq(&tasklist_lock);
>>>>>> | 
>>>>>> | Looking at this closer, I wonder if this wakeup logic should be called from
>>>>>> | do_ghost_task(), right after setting ->exit_signal. This way there would be no
>>>>>> | need for a tricky condition to recognize ghost tasks, and (I think) this is
>>>>>> | closer to the other cases changing ->exit_signal (reparent_leader() and
>>>>>> | exit_notify()).
>>>>>>
>>>>>> Yes, we tried the following in the earlier version. 
>>>>>>
>>>>>> void ghost_auto_reapable()
>>>>>> {
>>>>>>         write_lock(&tasklist_lock);
>>>>>>         current->exit_signal = -1;
>>>>>>         __wake_up_parent(current, current->parent);
>>>>>>         write_unlock(&tasklist_lock);
>>>>>> }
>>>>>>
>>>>>> And called this from do_ghost_task(). But with this, the parent could
>>>>>> wake up, find that it still has an eligible child (the ghost) to wait 
>>>>>> for, and go back to waiting before the ghost enters the EXIT_DEAD state.
>>>>>> And so we would lose the wake up.
>>>>>>
>>>>>> (zap_pid_ns_processes() passes the __WALL so the ghost would be considered
>>>>>> an eligible child).
>>>>>
>>>>> I think I see now. The point is that ->exit_signal = -1 is only meant to work
>>>>> correctly for sub-threads, which the parent does not need to wait for. IOW, the
>>>>> notion of detached task is only implemented for sub-threads.
>>>>>
>>>>> IIUC, setting ->exit_signal to -1 is only used here to let exit_notify() set
>>>>> ->exit_state to EXIT_DEAD, right? Otherwise, setting ->exit_signal to 0 and
>>>>> letting do_notify_parent() proceed for ghost tasks would have be sufficient I
>>>>> guess (provided that the confusion between ghost tasks and zombies could be
>>>>> easily avoided in do_notify_parent()).
>>>>>
>>>>> Then I agree that the proposed patch looks like a reasonably simple approach.
>>>>>
>>>>> Thanks for the explanation,
>>>>>
>>>>
>>>> Louis, Suka:
>>>>
>>>> One subtlety with the method is that if a process get reparented
>>>> (for whatever reason) then the ->exit_signal field is reset to 
>>>> SIGCHLD. Fortunately, that should not affect us because our ghost
>>>> tasks never become orphaned.
>>>
>>> AFAICS, this is not true for detached tasks: neither in reparent_leader(), nor
>>> in exit_notify(). Unless I missed another place?
>>
>> Doh.. you are totally right, I missed that.
>> Sometime it feels really good to be wrong :)
> 
> ;)
> 
>>
>>>
>>>>
>>>> However, I can't avoid thinking that maybe there is a better way 
>>>> to do this altogether ?
>>>>
>>>> The requirement is simple: ghost tasks are temporary tasks whose
>>>> role is to keep certain pid's alive for the duration of the restart
>>>> and then exit without a trace before the restarted tasks resume
>>>> execution.
>>>>
>>>> The reason I opted for the ->exit_signal = -1 is because it makes
>>>> sure that the parent need not explicitly collect the child ghost.
>>>>
>>>> If I had set it to 0, then it would not send a signal, but still
>>>> would require a wait() to collect it (right ?).
>>>
>>> Right.
>>>
>>>>
>>>> Can you think of a way to achieve this functionality without the 
>>>> subtleties that we have observed so far ?  even at the cost of a 
>>>> minor change to, say, wait() logic or what not ?
>>>
>>> I wonder if things would be easier by providing a mean to distinguish ghost
>>> tasks from truely restarted tasks in do_notify_parent().
>>>
>>> Assuming this, there might be a kernel-patch-free way, although I can't say if
>>> this fits userspace restart constraints:
>>>
>>> Setting the signal handler of ghost tasks' parent to SIG_IGN, and ->exit_signal
>>
>> Yes, I thought about it. But those parent are part of the restart,
>> and changing their signal handlers will is tricky and will require
>> complex sync logic at the end of restart to ensure that all tasks
>> restore their handlers after the ghosts are gone and released, but
>> before any task gets to userspace ... I don't want to go there.
> 
> Yes, I can imagine well the added complexity in userspace.
> 
>>
>>> of ghost tasks to SIGCHLD, will make them autoreap. This requires that the
>>> parent does not need to synchronize with other children until all ghost tasks
>>> have exited, and that the parent remains alive too.
>>>
>>> So, why not adding some flag PF_RESTART_GHOST or TIF_RESTART_GHOST?
>>
>> Actually, we already have a way to distinguish them:
>>
>> 	if ((tsk->flags & PF_RESTARTING) && task_detached(tsk))
>>
>> One problem with this is that we only set exit_signal to -1 after
>> the ghost was born, so if the parent is already waiting I think 
>> that will be racy, as per suka's comment above.
> 
> Yes.
> 
>>
>> So looking at the code again, we could add one condition in exit.c
>> at wait_consider_task(), after the test of p->exit_state == EXIT_DEAD,
>> to also test:
>>
>> inline static bool is_ghost_task(p)
>> {
>> 	return (p->flags & (PF_EXITING|PF_RESTARTING) ==
>> 		PF_EXITING|PF_RESTARTING) && task_detached(p)
>> }
>>
>> 	if (p->flags & is_ghost_task(p))
>> 		return 0;
>>
>> Or something along the lines (e.g. used EXIT_ZOMBIE comparison instead
>> of PF_EXITING). While requiring a kernel patch, it is relatively short,
>> clean and easy to review.
> 
> Adding the check in do_wait() is racy IMHO. What does prevent do_ghost_task()
> from setting ->exit_signal = -1 after the parent sleeps in do_wait()?

If we do it per your suggestion (and I also had suggested the
same originally):

>>>>>> void ghost_auto_reapable()
>>>>>> {
>>>>>>         write_lock(&tasklist_lock);
>>>>>>         current->exit_signal = -1;
>>>>>>         __wake_up_parent(current, current->parent);
>>>>>>         write_unlock(&tasklist_lock);
>>>>>> }

then it should be safe, no ?

> 
> Otherwise, why not making ghost tasks simply sub-threads of the "parent"? They
> would autoreap and nobody would wait or have to wait for them. I would feel so
> much better if we had not to change ->exit_signal in do_ghost_task()...

There are two types of ghosts:

1) A placeholder needed to ensure that some processes are orphaned
correctly (while retaining their sid)

2) A placeholder needed to ensure that a certain pid exists during
the restart (and can be, e.g. used by other processes as pgid).

in both cases, a thread of the parent won't do what I need: in the
first case, the child will be re-parented to another thread (leader?).
And in the second, the pgid will be be usable as a pgid.

But please keep throwing suggestions, we're making progress here...

Oren.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH][cr]: Fix ghost task bug
       [not found]                                 ` <4D6FB5D9.8010002-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
@ 2011-03-03 16:35                                   ` Louis Rilling
       [not found]                                     ` <20110303163531.GO21429-Hu8+6S1rdjywhHL9vcZdMVaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Louis Rilling @ 2011-03-03 16:35 UTC (permalink / raw)
  To: Oren Laadan; +Cc: Containers, Sukadev Bhattiprolu


[-- Attachment #1.1: Type: text/plain, Size: 11061 bytes --]

On 03/03/11 10:38 -0500, Oren Laadan wrote:
> 
> 
> On 03/01/2011 10:31 AM, Louis Rilling wrote:
> > On 28/02/11 17:10 -0500, Oren Laadan wrote:
> >>
> >>
> >> On 02/28/2011 10:09 AM, Louis Rilling wrote:
> >>> On 28/02/11  9:43 -0500, Oren Laadan wrote:
> >>>>
> >>>>
> >>>> On 02/26/2011 08:54 AM, Louis Rilling wrote:
> >>>>> On Fri, Feb 25, 2011 at 11:01:32AM -0800, Sukadev Bhattiprolu wrote:
> >>>>>> Louis Rilling [Louis.Rilling-aw0BnHfMbSpBDgjK7y7TUQ@public.gmane.org] wrote:
> >>>>>> | On 24/02/11 23:55 -0800, Sukadev Bhattiprolu wrote:
> >>>>>> | > 
> >>>>>> | > diff --git a/kernel/checkpoint/restart.c b/kernel/checkpoint/restart.c
> >>>>>> | > index b0ea8ec..8ecc052 100644
> >>>>>> | > --- a/kernel/checkpoint/restart.c
> >>>>>> | > +++ b/kernel/checkpoint/restart.c
> >>>>>> | > @@ -972,6 +972,7 @@ static int do_ghost_task(void)
> >>>>>> | >  	if (ret < 0)
> >>>>>> | >  		ckpt_err(ctx, ret, "ghost restart failed\n");
> >>>>>> | >  
> >>>>>> | > +	current->exit_signal = -1;
> >>>>>> | 
> >>>>>> | Setting ->exit_signal outside of tasklist_lock makes me nervous. All other
> >>>>>> | places that change ->exit_signal hold write_lock_irq(&tasklist_lock), and
> >>>>>> | eligibile_child() (for an instance of a reader being another task) holds
> >>>>>> | read_lock(&tasklist_lock). But maybe this does not matter for ghost tasks.
> >>>>>> | 
> >>>>>>
> >>>>>> Yes, an earlier version had the write_lock(&tasklist_lock). Will add it
> >>>>>> back.
> >>>>>>
> >>>>>> | >  	restore_debug_exit(ctx);
> >>>>>> | >  	ckpt_ctx_put(ctx);
> >>>>>> | >  	do_exit(0);
> >>>>>> | > @@ -1465,7 +1466,22 @@ void exit_checkpoint(struct task_struct *tsk)
> >>>>>> | >  	/* restarting zombies will activate next task in restart */
> >>>>>> | >  	if (tsk->flags & PF_RESTARTING) {
> >>>>>> | >  		BUG_ON(ctx->active_pid == -1);
> >>>>>> | > +
> >>>>>> | > +		/*
> >>>>>> | > +		 * if we are a "ghost" task, that was terminated by the
> >>>>>> | > +		 * container-init (from zap_pid_ns_processes()), we should
> >>>>>> | > +		 * wake up the parent since we are now a detached process.
> >>>>>> | > +		 */
> >>>>>> | > +		read_lock_irq(&tasklist_lock);
> >>>>>> | 
> >>>>>> | read_lock() is enough. tasklist_lock is never taken for write from IRQs or
> >>>>>> | softIRQs.
> >>>>>> | 
> >>>>>> | > +                if (tsk->exit_state == EXIT_DEAD && !tsk->parent->exit_state) {
> >>>>>> | > +                        ckpt_debug("[%d, %s]: exit_checkpoint(): notifying "
> >>>>>> | > +					"parent\n", tsk->pid, tsk->comm);
> >>>>>> | > +                        __wake_up_parent(tsk, tsk->parent);
> >>>>>> | > +                }
> >>>>>> | > +		read_unlock_irq(&tasklist_lock);
> >>>>>> | 
> >>>>>> | Looking at this closer, I wonder if this wakeup logic should be called from
> >>>>>> | do_ghost_task(), right after setting ->exit_signal. This way there would be no
> >>>>>> | need for a tricky condition to recognize ghost tasks, and (I think) this is
> >>>>>> | closer to the other cases changing ->exit_signal (reparent_leader() and
> >>>>>> | exit_notify()).
> >>>>>>
> >>>>>> Yes, we tried the following in the earlier version. 
> >>>>>>
> >>>>>> void ghost_auto_reapable()
> >>>>>> {
> >>>>>>         write_lock(&tasklist_lock);
> >>>>>>         current->exit_signal = -1;
> >>>>>>         __wake_up_parent(current, current->parent);
> >>>>>>         write_unlock(&tasklist_lock);
> >>>>>> }
> >>>>>>
> >>>>>> And called this from do_ghost_task(). But with this, the parent could
> >>>>>> wake up, find that it still has an eligible child (the ghost) to wait 
> >>>>>> for, and go back to waiting before the ghost enters the EXIT_DEAD state.
> >>>>>> And so we would lose the wake up.
> >>>>>>
> >>>>>> (zap_pid_ns_processes() passes the __WALL so the ghost would be considered
> >>>>>> an eligible child).
> >>>>>
> >>>>> I think I see now. The point is that ->exit_signal = -1 is only meant to work
> >>>>> correctly for sub-threads, which the parent does not need to wait for. IOW, the
> >>>>> notion of detached task is only implemented for sub-threads.
> >>>>>
> >>>>> IIUC, setting ->exit_signal to -1 is only used here to let exit_notify() set
> >>>>> ->exit_state to EXIT_DEAD, right? Otherwise, setting ->exit_signal to 0 and
> >>>>> letting do_notify_parent() proceed for ghost tasks would have be sufficient I
> >>>>> guess (provided that the confusion between ghost tasks and zombies could be
> >>>>> easily avoided in do_notify_parent()).
> >>>>>
> >>>>> Then I agree that the proposed patch looks like a reasonably simple approach.
> >>>>>
> >>>>> Thanks for the explanation,
> >>>>>
> >>>>
> >>>> Louis, Suka:
> >>>>
> >>>> One subtlety with the method is that if a process get reparented
> >>>> (for whatever reason) then the ->exit_signal field is reset to 
> >>>> SIGCHLD. Fortunately, that should not affect us because our ghost
> >>>> tasks never become orphaned.
> >>>
> >>> AFAICS, this is not true for detached tasks: neither in reparent_leader(), nor
> >>> in exit_notify(). Unless I missed another place?
> >>
> >> Doh.. you are totally right, I missed that.
> >> Sometime it feels really good to be wrong :)
> > 
> > ;)
> > 
> >>
> >>>
> >>>>
> >>>> However, I can't avoid thinking that maybe there is a better way 
> >>>> to do this altogether ?
> >>>>
> >>>> The requirement is simple: ghost tasks are temporary tasks whose
> >>>> role is to keep certain pid's alive for the duration of the restart
> >>>> and then exit without a trace before the restarted tasks resume
> >>>> execution.
> >>>>
> >>>> The reason I opted for the ->exit_signal = -1 is because it makes
> >>>> sure that the parent need not explicitly collect the child ghost.
> >>>>
> >>>> If I had set it to 0, then it would not send a signal, but still
> >>>> would require a wait() to collect it (right ?).
> >>>
> >>> Right.
> >>>
> >>>>
> >>>> Can you think of a way to achieve this functionality without the 
> >>>> subtleties that we have observed so far ?  even at the cost of a 
> >>>> minor change to, say, wait() logic or what not ?
> >>>
> >>> I wonder if things would be easier by providing a mean to distinguish ghost
> >>> tasks from truely restarted tasks in do_notify_parent().
> >>>
> >>> Assuming this, there might be a kernel-patch-free way, although I can't say if
> >>> this fits userspace restart constraints:
> >>>
> >>> Setting the signal handler of ghost tasks' parent to SIG_IGN, and ->exit_signal
> >>
> >> Yes, I thought about it. But those parent are part of the restart,
> >> and changing their signal handlers will is tricky and will require
> >> complex sync logic at the end of restart to ensure that all tasks
> >> restore their handlers after the ghosts are gone and released, but
> >> before any task gets to userspace ... I don't want to go there.
> > 
> > Yes, I can imagine well the added complexity in userspace.
> > 
> >>
> >>> of ghost tasks to SIGCHLD, will make them autoreap. This requires that the
> >>> parent does not need to synchronize with other children until all ghost tasks
> >>> have exited, and that the parent remains alive too.
> >>>
> >>> So, why not adding some flag PF_RESTART_GHOST or TIF_RESTART_GHOST?
> >>
> >> Actually, we already have a way to distinguish them:
> >>
> >> 	if ((tsk->flags & PF_RESTARTING) && task_detached(tsk))
> >>
> >> One problem with this is that we only set exit_signal to -1 after
> >> the ghost was born, so if the parent is already waiting I think 
> >> that will be racy, as per suka's comment above.
> > 
> > Yes.
> > 
> >>
> >> So looking at the code again, we could add one condition in exit.c
> >> at wait_consider_task(), after the test of p->exit_state == EXIT_DEAD,
> >> to also test:
> >>
> >> inline static bool is_ghost_task(p)
> >> {
> >> 	return (p->flags & (PF_EXITING|PF_RESTARTING) ==
> >> 		PF_EXITING|PF_RESTARTING) && task_detached(p)
> >> }
> >>
> >> 	if (p->flags & is_ghost_task(p))
> >> 		return 0;
> >>
> >> Or something along the lines (e.g. used EXIT_ZOMBIE comparison instead
> >> of PF_EXITING). While requiring a kernel patch, it is relatively short,
> >> clean and easy to review.

EXIT_ZOMBIE comparison would not optimize much imho, since p->flags must be
checked anyway.

Nit1: I don't think that checking p->flags saves anything before calling
is_ghost_task().

Nit2: why would you like to check that PF_EXITING and PF_RESTARTING come
together? Is it to make sure that no "real" restarted thread will be skipped
this way?

> > 
> > Adding the check in do_wait() is racy IMHO. What does prevent do_ghost_task()
> > from setting ->exit_signal = -1 after the parent sleeps in do_wait()?
> 
> If we do it per your suggestion (and I also had suggested the
> same originally):
> 
> >>>>>> void ghost_auto_reapable()
> >>>>>> {
> >>>>>>         write_lock(&tasklist_lock);
> >>>>>>         current->exit_signal = -1;
> >>>>>>         __wake_up_parent(current, current->parent);
> >>>>>>         write_unlock(&tasklist_lock);
> >>>>>> }
> 
> then it should be safe, no ?

s/write_lock/write_lock_irq/ and I'll call it safe :)

> 
> > 
> > Otherwise, why not making ghost tasks simply sub-threads of the "parent"? They
> > would autoreap and nobody would wait or have to wait for them. I would feel so
> > much better if we had not to change ->exit_signal in do_ghost_task()...
> 
> There are two types of ghosts:
> 
> 1) A placeholder needed to ensure that some processes are orphaned
> correctly (while retaining their sid)
> 
> 2) A placeholder needed to ensure that a certain pid exists during
> the restart (and can be, e.g. used by other processes as pgid).
> 
> in both cases, a thread of the parent won't do what I need: in the
> first case, the child will be re-parented to another thread (leader?).

any other thread, possibly the leader. Right.

> And in the second, the pgid will be be usable as a pgid.

You probably meant that the pid would not be usable as a pgid (because not group
leader pid, possible session mismatch, etc.).

> 
> But please keep throwing suggestions, we're making progress here...
> 

(Just had a closer look to how pgids are restored)

Well, given that it's useful to have a consistent pgid/session for ghost tasks,
and that it looks too difficult to rework the synchronization logic before
reaping ghost tasks (referring to the ->exit_signal = 0 + parent sighandler =
SIG_IGN suggestion), I'm falling short of cleaner suggestions :)

The ghost_auto_reapable() thing above + wait_consider_task() change looks like
the cleanest solution so far. My only fear is that people could be reluctant to
adding this (small?) overhead to do_wait() "only" for cr.

Thanks,

Louis

-- 
Dr Louis Rilling			Kerlabs
Skype: louis.rilling			Batiment Germanium
Phone: (+33|0) 6 80 89 08 23		80 avenue des Buttes de Coesmes
http://www.kerlabs.com/			35700 Rennes

[-- Attachment #1.2: Digital signature --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

[-- Attachment #2: Type: text/plain, Size: 206 bytes --]

_______________________________________________
Containers mailing list
Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
https://lists.linux-foundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH][cr]: Fix ghost task bug
       [not found]                                     ` <20110303163531.GO21429-Hu8+6S1rdjywhHL9vcZdMVaTQe2KTcn/@public.gmane.org>
@ 2011-03-04 16:07                                       ` Oren Laadan
       [not found]                                         ` <4D710E37.8000109-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Oren Laadan @ 2011-03-04 16:07 UTC (permalink / raw)
  To: Louis Rilling; +Cc: Containers, Sukadev Bhattiprolu



On 03/03/2011 11:35 AM, Louis Rilling wrote:
> On 03/03/11 10:38 -0500, Oren Laadan wrote:
>>
>>
>> On 03/01/2011 10:31 AM, Louis Rilling wrote:
>>> On 28/02/11 17:10 -0500, Oren Laadan wrote:
>>>>
>>>>
>>>> On 02/28/2011 10:09 AM, Louis Rilling wrote:
>>>>> On 28/02/11  9:43 -0500, Oren Laadan wrote:
>>>>>>
>>>>>>
>>>>>> On 02/26/2011 08:54 AM, Louis Rilling wrote:
>>>>>>> On Fri, Feb 25, 2011 at 11:01:32AM -0800, Sukadev Bhattiprolu wrote:
>>>>>>>> Louis Rilling [Louis.Rilling-aw0BnHfMbSpBDgjK7y7TUQ@public.gmane.org] wrote:
>>>>>>>> | On 24/02/11 23:55 -0800, Sukadev Bhattiprolu wrote:
>>>>>>>> | > 
>>>>>>>> | > diff --git a/kernel/checkpoint/restart.c b/kernel/checkpoint/restart.c
>>>>>>>> | > index b0ea8ec..8ecc052 100644
>>>>>>>> | > --- a/kernel/checkpoint/restart.c
>>>>>>>> | > +++ b/kernel/checkpoint/restart.c
>>>>>>>> | > @@ -972,6 +972,7 @@ static int do_ghost_task(void)
>>>>>>>> | >  	if (ret < 0)
>>>>>>>> | >  		ckpt_err(ctx, ret, "ghost restart failed\n");
>>>>>>>> | >  
>>>>>>>> | > +	current->exit_signal = -1;
>>>>>>>> | 
>>>>>>>> | Setting ->exit_signal outside of tasklist_lock makes me nervous. All other
>>>>>>>> | places that change ->exit_signal hold write_lock_irq(&tasklist_lock), and
>>>>>>>> | eligibile_child() (for an instance of a reader being another task) holds
>>>>>>>> | read_lock(&tasklist_lock). But maybe this does not matter for ghost tasks.
>>>>>>>> | 
>>>>>>>>
>>>>>>>> Yes, an earlier version had the write_lock(&tasklist_lock). Will add it
>>>>>>>> back.
>>>>>>>>
>>>>>>>> | >  	restore_debug_exit(ctx);
>>>>>>>> | >  	ckpt_ctx_put(ctx);
>>>>>>>> | >  	do_exit(0);
>>>>>>>> | > @@ -1465,7 +1466,22 @@ void exit_checkpoint(struct task_struct *tsk)
>>>>>>>> | >  	/* restarting zombies will activate next task in restart */
>>>>>>>> | >  	if (tsk->flags & PF_RESTARTING) {
>>>>>>>> | >  		BUG_ON(ctx->active_pid == -1);
>>>>>>>> | > +
>>>>>>>> | > +		/*
>>>>>>>> | > +		 * if we are a "ghost" task, that was terminated by the
>>>>>>>> | > +		 * container-init (from zap_pid_ns_processes()), we should
>>>>>>>> | > +		 * wake up the parent since we are now a detached process.
>>>>>>>> | > +		 */
>>>>>>>> | > +		read_lock_irq(&tasklist_lock);
>>>>>>>> | 
>>>>>>>> | read_lock() is enough. tasklist_lock is never taken for write from IRQs or
>>>>>>>> | softIRQs.
>>>>>>>> | 
>>>>>>>> | > +                if (tsk->exit_state == EXIT_DEAD && !tsk->parent->exit_state) {
>>>>>>>> | > +                        ckpt_debug("[%d, %s]: exit_checkpoint(): notifying "
>>>>>>>> | > +					"parent\n", tsk->pid, tsk->comm);
>>>>>>>> | > +                        __wake_up_parent(tsk, tsk->parent);
>>>>>>>> | > +                }
>>>>>>>> | > +		read_unlock_irq(&tasklist_lock);
>>>>>>>> | 
>>>>>>>> | Looking at this closer, I wonder if this wakeup logic should be called from
>>>>>>>> | do_ghost_task(), right after setting ->exit_signal. This way there would be no
>>>>>>>> | need for a tricky condition to recognize ghost tasks, and (I think) this is
>>>>>>>> | closer to the other cases changing ->exit_signal (reparent_leader() and
>>>>>>>> | exit_notify()).
>>>>>>>>
>>>>>>>> Yes, we tried the following in the earlier version. 
>>>>>>>>
>>>>>>>> void ghost_auto_reapable()
>>>>>>>> {
>>>>>>>>         write_lock(&tasklist_lock);
>>>>>>>>         current->exit_signal = -1;
>>>>>>>>         __wake_up_parent(current, current->parent);
>>>>>>>>         write_unlock(&tasklist_lock);
>>>>>>>> }
>>>>>>>>
>>>>>>>> And called this from do_ghost_task(). But with this, the parent could
>>>>>>>> wake up, find that it still has an eligible child (the ghost) to wait 
>>>>>>>> for, and go back to waiting before the ghost enters the EXIT_DEAD state.
>>>>>>>> And so we would lose the wake up.
>>>>>>>>
>>>>>>>> (zap_pid_ns_processes() passes the __WALL so the ghost would be considered
>>>>>>>> an eligible child).
>>>>>>>
>>>>>>> I think I see now. The point is that ->exit_signal = -1 is only meant to work
>>>>>>> correctly for sub-threads, which the parent does not need to wait for. IOW, the
>>>>>>> notion of detached task is only implemented for sub-threads.
>>>>>>>
>>>>>>> IIUC, setting ->exit_signal to -1 is only used here to let exit_notify() set
>>>>>>> ->exit_state to EXIT_DEAD, right? Otherwise, setting ->exit_signal to 0 and
>>>>>>> letting do_notify_parent() proceed for ghost tasks would have be sufficient I
>>>>>>> guess (provided that the confusion between ghost tasks and zombies could be
>>>>>>> easily avoided in do_notify_parent()).
>>>>>>>
>>>>>>> Then I agree that the proposed patch looks like a reasonably simple approach.
>>>>>>>
>>>>>>> Thanks for the explanation,
>>>>>>>
>>>>>>
>>>>>> Louis, Suka:
>>>>>>
>>>>>> One subtlety with the method is that if a process get reparented
>>>>>> (for whatever reason) then the ->exit_signal field is reset to 
>>>>>> SIGCHLD. Fortunately, that should not affect us because our ghost
>>>>>> tasks never become orphaned.
>>>>>
>>>>> AFAICS, this is not true for detached tasks: neither in reparent_leader(), nor
>>>>> in exit_notify(). Unless I missed another place?
>>>>
>>>> Doh.. you are totally right, I missed that.
>>>> Sometime it feels really good to be wrong :)
>>>
>>> ;)
>>>
>>>>
>>>>>
>>>>>>
>>>>>> However, I can't avoid thinking that maybe there is a better way 
>>>>>> to do this altogether ?
>>>>>>
>>>>>> The requirement is simple: ghost tasks are temporary tasks whose
>>>>>> role is to keep certain pid's alive for the duration of the restart
>>>>>> and then exit without a trace before the restarted tasks resume
>>>>>> execution.
>>>>>>
>>>>>> The reason I opted for the ->exit_signal = -1 is because it makes
>>>>>> sure that the parent need not explicitly collect the child ghost.
>>>>>>
>>>>>> If I had set it to 0, then it would not send a signal, but still
>>>>>> would require a wait() to collect it (right ?).
>>>>>
>>>>> Right.
>>>>>
>>>>>>
>>>>>> Can you think of a way to achieve this functionality without the 
>>>>>> subtleties that we have observed so far ?  even at the cost of a 
>>>>>> minor change to, say, wait() logic or what not ?
>>>>>
>>>>> I wonder if things would be easier by providing a mean to distinguish ghost
>>>>> tasks from truely restarted tasks in do_notify_parent().
>>>>>
>>>>> Assuming this, there might be a kernel-patch-free way, although I can't say if
>>>>> this fits userspace restart constraints:
>>>>>
>>>>> Setting the signal handler of ghost tasks' parent to SIG_IGN, and ->exit_signal
>>>>
>>>> Yes, I thought about it. But those parent are part of the restart,
>>>> and changing their signal handlers will is tricky and will require
>>>> complex sync logic at the end of restart to ensure that all tasks
>>>> restore their handlers after the ghosts are gone and released, but
>>>> before any task gets to userspace ... I don't want to go there.
>>>
>>> Yes, I can imagine well the added complexity in userspace.
>>>
>>>>
>>>>> of ghost tasks to SIGCHLD, will make them autoreap. This requires that the
>>>>> parent does not need to synchronize with other children until all ghost tasks
>>>>> have exited, and that the parent remains alive too.
>>>>>
>>>>> So, why not adding some flag PF_RESTART_GHOST or TIF_RESTART_GHOST?
>>>>
>>>> Actually, we already have a way to distinguish them:
>>>>
>>>> 	if ((tsk->flags & PF_RESTARTING) && task_detached(tsk))
>>>>
>>>> One problem with this is that we only set exit_signal to -1 after
>>>> the ghost was born, so if the parent is already waiting I think 
>>>> that will be racy, as per suka's comment above.
>>>
>>> Yes.
>>>
>>>>
>>>> So looking at the code again, we could add one condition in exit.c
>>>> at wait_consider_task(), after the test of p->exit_state == EXIT_DEAD,
>>>> to also test:
>>>>
>>>> inline static bool is_ghost_task(p)
>>>> {
>>>> 	return (p->flags & (PF_EXITING|PF_RESTARTING) ==
>>>> 		PF_EXITING|PF_RESTARTING) && task_detached(p)
>>>> }
>>>>
>>>> 	if (p->flags & is_ghost_task(p))
>>>> 		return 0;
>>>>
>>>> Or something along the lines (e.g. used EXIT_ZOMBIE comparison instead
>>>> of PF_EXITING). While requiring a kernel patch, it is relatively short,
>>>> clean and easy to review.
> 
> EXIT_ZOMBIE comparison would not optimize much imho, since p->flags must be
> checked anyway.
> 
> Nit1: I don't think that checking p->flags saves anything before calling
> is_ghost_task().

Hmm.. right -
That's a leftover from before I decided to introduce is_ghost_task()

> 
> Nit2: why would you like to check that PF_EXITING and PF_RESTARTING come
> together? Is it to make sure that no "real" restarted thread will be skipped
> this way?

If wait() is called to get the state of stopped children, and for
whatever reason the ghost is stopped or being ptraced (we should
probably prevent that... but ok) - testing for the exiting/zombie 
condition is an extra safety measure: only skip this task when it
is actually exiting.

Do you not think it's needed ?

> 
>>>
>>> Adding the check in do_wait() is racy IMHO. What does prevent do_ghost_task()
>>> from setting ->exit_signal = -1 after the parent sleeps in do_wait()?
>>
>> If we do it per your suggestion (and I also had suggested the
>> same originally):
>>
>>>>>>>> void ghost_auto_reapable()
>>>>>>>> {
>>>>>>>>         write_lock(&tasklist_lock);
>>>>>>>>         current->exit_signal = -1;
>>>>>>>>         __wake_up_parent(current, current->parent);
>>>>>>>>         write_unlock(&tasklist_lock);
>>>>>>>> }
>>
>> then it should be safe, no ?
> 
> s/write_lock/write_lock_irq/ and I'll call it safe :)

Sure. We don't want deadlocks :)

> 
>>
>>>
>>> Otherwise, why not making ghost tasks simply sub-threads of the "parent"? They
>>> would autoreap and nobody would wait or have to wait for them. I would feel so
>>> much better if we had not to change ->exit_signal in do_ghost_task()...
>>
>> There are two types of ghosts:
>>
>> 1) A placeholder needed to ensure that some processes are orphaned
>> correctly (while retaining their sid)
>>
>> 2) A placeholder needed to ensure that a certain pid exists during
>> the restart (and can be, e.g. used by other processes as pgid).
>>
>> in both cases, a thread of the parent won't do what I need: in the
>> first case, the child will be re-parented to another thread (leader?).
> 
> any other thread, possibly the leader. Right.
> 
>> And in the second, the pgid will be be usable as a pgid.
> 
> You probably meant that the pid would not be usable as a pgid (because not group
> leader pid, possible session mismatch, etc.).

Yes.

> 
>>
>> But please keep throwing suggestions, we're making progress here...
>>
> 
> (Just had a closer look to how pgids are restored)
> 
> Well, given that it's useful to have a consistent pgid/session for ghost tasks,
> and that it looks too difficult to rework the synchronization logic before
> reaping ghost tasks (referring to the ->exit_signal = 0 + parent sighandler =
> SIG_IGN suggestion), I'm falling short of cleaner suggestions :)
> 
> The ghost_auto_reapable() thing above + wait_consider_task() change looks like
> the cleanest solution so far. My only fear is that people could be reluctant to
> adding this (small?) overhead to do_wait() "only" for cr.

I think it's negligible compare to all the work done anyways
in there ;)

I'll try to post something to summarize this for review.

Thansk !

Oren.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH][cr]: Fix ghost task bug
       [not found]                                         ` <4D710E37.8000109-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
@ 2011-03-04 17:29                                           ` Louis Rilling
       [not found]                                             ` <20110304172914.GE3543-Hu8+6S1rdjywhHL9vcZdMVaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Louis Rilling @ 2011-03-04 17:29 UTC (permalink / raw)
  To: Oren Laadan; +Cc: Containers, Sukadev Bhattiprolu


[-- Attachment #1.1: Type: text/plain, Size: 3664 bytes --]

On 04/03/11 11:07 -0500, Oren Laadan wrote:
> On 03/03/2011 11:35 AM, Louis Rilling wrote:
> > On 03/03/11 10:38 -0500, Oren Laadan wrote:
> >> On 03/01/2011 10:31 AM, Louis Rilling wrote:
> >>> On 28/02/11 17:10 -0500, Oren Laadan wrote:
> >>>> So looking at the code again, we could add one condition in exit.c
> >>>> at wait_consider_task(), after the test of p->exit_state == EXIT_DEAD,
> >>>> to also test:
> >>>>
> >>>> inline static bool is_ghost_task(p)
> >>>> {
> >>>> 	return (p->flags & (PF_EXITING|PF_RESTARTING) ==
> >>>> 		PF_EXITING|PF_RESTARTING) && task_detached(p)
> >>>> }
> >>>>
> >>>> 	if (p->flags & is_ghost_task(p))
> >>>> 		return 0;
> >>>>
> >>>> Or something along the lines (e.g. used EXIT_ZOMBIE comparison instead
> >>>> of PF_EXITING). While requiring a kernel patch, it is relatively short,
> >>>> clean and easy to review.
> > 
> > EXIT_ZOMBIE comparison would not optimize much imho, since p->flags must be
> > checked anyway.
> > 
> > Nit1: I don't think that checking p->flags saves anything before calling
> > is_ghost_task().
> 
> Hmm.. right -
> That's a leftover from before I decided to introduce is_ghost_task()
> 
> > 
> > Nit2: why would you like to check that PF_EXITING and PF_RESTARTING come
> > together? Is it to make sure that no "real" restarted thread will be skipped
> > this way?
> 
> If wait() is called to get the state of stopped children, and for
> whatever reason the ghost is stopped or being ptraced (we should
> probably prevent that... but ok) - testing for the exiting/zombie 
> condition is an extra safety measure: only skip this task when it
> is actually exiting.

I don't see how a ghost task could be stopped or ptraced, since it calls
do_exit() right after becoming detached, and thus identifiable as a ghost.
Unless it gets ptraced right before calling sys_restart()? Even in that case,
it's not reapable by ptrace since it's not in stopped state. OTOH, it may still
be reaped in wait_task_continued() (see below).

> 
> Do you not think it's needed ?

Not sure. As far as I can see, other restarting (with PF_RESTARTING) and
detached tasks can only be sub-threads, and are mostly not reapable in any way
as long as PF_RESTARTING is set. They can surely be reaped neither by
wait_task_zombie(), nor by wait_task_stopped(). The only possibility I see is by
wait_task_continued(), because a previous "wakeup from stopped" has not been
consumed before the checkpoint.

But, and I think that this is a good reason to check PF_EXITING (or
->exit_state), if threads are skipped this way, then wait() might incorrectly
return -ECHILD instead of sleeping.

Wait. Even with this, after ->exit_signal is set to -1, and before PF_EXITING is
set, wait_consider_task() can still consider the ghost as potentially reapable
in the future. Deadlock again.

In fact, it's probably much saner to have something atomic, like:

	write_lock_irq(&tasklist_lock);
	p->flags |= PF_EXITING;
	p->exit_signal = -1;
	__wake_up_parent(p, p->parent);
	write_unlock_irq(&tasklist_lock);

Unfortunately this is not accepted by do_exit(). So two kinds of solutions:
either set a new flag à la PF_RESTART_GHOST, and only check for this flag in
wait_consider_task(),
or somewhere in do_exit() (latest in exit_notify()), have
another mean to recognize ghost tasks, and do the ->exit_signal = -1 +
__wake_up_parent() there.

What's your opinion?

Thanks,

Louis

-- 
Dr Louis Rilling			Kerlabs
Skype: louis.rilling			Batiment Germanium
Phone: (+33|0) 6 80 89 08 23		80 avenue des Buttes de Coesmes
http://www.kerlabs.com/			35700 Rennes

[-- Attachment #1.2: Digital signature --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

[-- Attachment #2: Type: text/plain, Size: 206 bytes --]

_______________________________________________
Containers mailing list
Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
https://lists.linux-foundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH][cr]: Fix ghost task bug
       [not found]                                             ` <20110304172914.GE3543-Hu8+6S1rdjywhHL9vcZdMVaTQe2KTcn/@public.gmane.org>
@ 2011-03-26  2:14                                               ` Oren Laadan
       [not found]                                                 ` <4D8D4BE9.8010605-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Oren Laadan @ 2011-03-26  2:14 UTC (permalink / raw)
  To: Sukadev Bhattiprolu, Luis Rilling; +Cc: Containers



On 03/04/2011 12:29 PM, Louis Rilling wrote:
> On 04/03/11 11:07 -0500, Oren Laadan wrote:
>> On 03/03/2011 11:35 AM, Louis Rilling wrote:
>>> On 03/03/11 10:38 -0500, Oren Laadan wrote:
>>>> On 03/01/2011 10:31 AM, Louis Rilling wrote:
>>>>> On 28/02/11 17:10 -0500, Oren Laadan wrote:
>>>>>> So looking at the code again, we could add one condition in exit.c
>>>>>> at wait_consider_task(), after the test of p->exit_state == EXIT_DEAD,
>>>>>> to also test:
>>>>>>
>>>>>> inline static bool is_ghost_task(p)
>>>>>> {
>>>>>> 	return (p->flags & (PF_EXITING|PF_RESTARTING) ==
>>>>>> 		PF_EXITING|PF_RESTARTING) && task_detached(p)
>>>>>> }
>>>>>>
>>>>>> 	if (p->flags & is_ghost_task(p))
>>>>>> 		return 0;
>>>>>>
>>>>>> Or something along the lines (e.g. used EXIT_ZOMBIE comparison instead
>>>>>> of PF_EXITING). While requiring a kernel patch, it is relatively short,
>>>>>> clean and easy to review.
>>>
>>> EXIT_ZOMBIE comparison would not optimize much imho, since p->flags must be
>>> checked anyway.
>>>
>>> Nit1: I don't think that checking p->flags saves anything before calling
>>> is_ghost_task().
>>
>> Hmm.. right -
>> That's a leftover from before I decided to introduce is_ghost_task()
>>
>>>
>>> Nit2: why would you like to check that PF_EXITING and PF_RESTARTING come
>>> together? Is it to make sure that no "real" restarted thread will be skipped
>>> this way?
>>
>> If wait() is called to get the state of stopped children, and for
>> whatever reason the ghost is stopped or being ptraced (we should
>> probably prevent that... but ok) - testing for the exiting/zombie 
>> condition is an extra safety measure: only skip this task when it
>> is actually exiting.
> 
> I don't see how a ghost task could be stopped or ptraced, since it calls
> do_exit() right after becoming detached, and thus identifiable as a ghost.
> Unless it gets ptraced right before calling sys_restart()? Even in that case,
> it's not reapable by ptrace since it's not in stopped state. OTOH, it may still
> be reaped in wait_task_continued() (see below).
> 
>>
>> Do you not think it's needed ?
> 
> Not sure. As far as I can see, other restarting (with PF_RESTARTING) and
> detached tasks can only be sub-threads, and are mostly not reapable in any way
> as long as PF_RESTARTING is set. They can surely be reaped neither by
> wait_task_zombie(), nor by wait_task_stopped(). The only possibility I see is by
> wait_task_continued(), because a previous "wakeup from stopped" has not been
> consumed before the checkpoint.
> 
> But, and I think that this is a good reason to check PF_EXITING (or
> ->exit_state), if threads are skipped this way, then wait() might incorrectly
> return -ECHILD instead of sleeping.
> 
> Wait. Even with this, after ->exit_signal is set to -1, and before PF_EXITING is
> set, wait_consider_task() can still consider the ghost as potentially reapable
> in the future. Deadlock again.
> 
> In fact, it's probably much saner to have something atomic, like:
> 
> 	write_lock_irq(&tasklist_lock);
> 	p->flags |= PF_EXITING;
> 	p->exit_signal = -1;
> 	__wake_up_parent(p, p->parent);
> 	write_unlock_irq(&tasklist_lock);
> 
> Unfortunately this is not accepted by do_exit(). So two kinds of solutions:
> either set a new flag à la PF_RESTART_GHOST, and only check for this flag in
> wait_consider_task(),
> or somewhere in do_exit() (latest in exit_notify()), have
> another mean to recognize ghost tasks, and do the ->exit_signal = -1 +
> __wake_up_parent() there.
> 
> What's your opinion?
> 

Doing it in wait_consider_task() may be a problem since we only mark
a task as ghost after it has lived for a while, so wait() would have
already considered it a valid child to wait for.

If I had to choose, then I'd do the snippet you suggest above - and
in particular where PF_EXITING is already set, which is exit_signals().

Adding a means to recognize ghost tasks is simple: we ran out of 
task->flags, but we can add a c/r related field to hold such a flag
(we already add one field to the task_struct).

Do you think that will do it ?

Thanks,

Oren.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH][cr]: Fix ghost task bug
       [not found]                                                 ` <4D8D4BE9.8010605-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
@ 2011-03-26 16:06                                                   ` Louis Rilling
       [not found]                                                     ` <20110326160607.GA18492-Hu8+6S1rdjywhHL9vcZdMVaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Louis Rilling @ 2011-03-26 16:06 UTC (permalink / raw)
  To: Oren Laadan; +Cc: Containers, Sukadev Bhattiprolu


[-- Attachment #1.1: Type: text/plain, Size: 4737 bytes --]

On 25/03/11 22:14 -0400, Oren Laadan wrote:
> 
> 
> On 03/04/2011 12:29 PM, Louis Rilling wrote:
> > On 04/03/11 11:07 -0500, Oren Laadan wrote:
> >> On 03/03/2011 11:35 AM, Louis Rilling wrote:
> >>> On 03/03/11 10:38 -0500, Oren Laadan wrote:
> >>>> On 03/01/2011 10:31 AM, Louis Rilling wrote:
> >>>>> On 28/02/11 17:10 -0500, Oren Laadan wrote:
> >>>>>> So looking at the code again, we could add one condition in exit.c
> >>>>>> at wait_consider_task(), after the test of p->exit_state == EXIT_DEAD,
> >>>>>> to also test:
> >>>>>>
> >>>>>> inline static bool is_ghost_task(p)
> >>>>>> {
> >>>>>> 	return (p->flags & (PF_EXITING|PF_RESTARTING) ==
> >>>>>> 		PF_EXITING|PF_RESTARTING) && task_detached(p)
> >>>>>> }
> >>>>>>
> >>>>>> 	if (p->flags & is_ghost_task(p))
> >>>>>> 		return 0;
> >>>>>>
> >>>>>> Or something along the lines (e.g. used EXIT_ZOMBIE comparison instead
> >>>>>> of PF_EXITING). While requiring a kernel patch, it is relatively short,
> >>>>>> clean and easy to review.
> >>>
> >>> EXIT_ZOMBIE comparison would not optimize much imho, since p->flags must be
> >>> checked anyway.
> >>>
> >>> Nit1: I don't think that checking p->flags saves anything before calling
> >>> is_ghost_task().
> >>
> >> Hmm.. right -
> >> That's a leftover from before I decided to introduce is_ghost_task()
> >>
> >>>
> >>> Nit2: why would you like to check that PF_EXITING and PF_RESTARTING come
> >>> together? Is it to make sure that no "real" restarted thread will be skipped
> >>> this way?
> >>
> >> If wait() is called to get the state of stopped children, and for
> >> whatever reason the ghost is stopped or being ptraced (we should
> >> probably prevent that... but ok) - testing for the exiting/zombie 
> >> condition is an extra safety measure: only skip this task when it
> >> is actually exiting.
> > 
> > I don't see how a ghost task could be stopped or ptraced, since it calls
> > do_exit() right after becoming detached, and thus identifiable as a ghost.
> > Unless it gets ptraced right before calling sys_restart()? Even in that case,
> > it's not reapable by ptrace since it's not in stopped state. OTOH, it may still
> > be reaped in wait_task_continued() (see below).
> > 
> >>
> >> Do you not think it's needed ?
> > 
> > Not sure. As far as I can see, other restarting (with PF_RESTARTING) and
> > detached tasks can only be sub-threads, and are mostly not reapable in any way
> > as long as PF_RESTARTING is set. They can surely be reaped neither by
> > wait_task_zombie(), nor by wait_task_stopped(). The only possibility I see is by
> > wait_task_continued(), because a previous "wakeup from stopped" has not been
> > consumed before the checkpoint.
> > 
> > But, and I think that this is a good reason to check PF_EXITING (or
> > ->exit_state), if threads are skipped this way, then wait() might incorrectly
> > return -ECHILD instead of sleeping.
> > 
> > Wait. Even with this, after ->exit_signal is set to -1, and before PF_EXITING is
> > set, wait_consider_task() can still consider the ghost as potentially reapable
> > in the future. Deadlock again.
> > 
> > In fact, it's probably much saner to have something atomic, like:
> > 
> > 	write_lock_irq(&tasklist_lock);
> > 	p->flags |= PF_EXITING;
> > 	p->exit_signal = -1;
> > 	__wake_up_parent(p, p->parent);
> > 	write_unlock_irq(&tasklist_lock);
> > 
> > Unfortunately this is not accepted by do_exit(). So two kinds of solutions:
> > either set a new flag à la PF_RESTART_GHOST, and only check for this flag in
> > wait_consider_task(),
> > or somewhere in do_exit() (latest in exit_notify()), have
> > another mean to recognize ghost tasks, and do the ->exit_signal = -1 +
> > __wake_up_parent() there.
> > 
> > What's your opinion?
> > 
> 
> Doing it in wait_consider_task() may be a problem since we only mark
> a task as ghost after it has lived for a while, so wait() would have
> already considered it a valid child to wait for.
> 
> If I had to choose, then I'd do the snippet you suggest above - and
> in particular where PF_EXITING is already set, which is exit_signals().
> 
> Adding a means to recognize ghost tasks is simple: we ran out of 
> task->flags, but we can add a c/r related field to hold such a flag
> (we already add one field to the task_struct).
> 
> Do you think that will do it ?

Yup, any way to have a flag protected by tasklist_lock would be ok. For
instance, use some bit near ->did_exec. IMHO of course :)

Thanks,

Louis

-- 
Dr Louis Rilling			Kerlabs
Skype: louis.rilling			Batiment Germanium
Phone: (+33|0) 6 80 89 08 23		80 avenue des Buttes de Coesmes
http://www.kerlabs.com/			35700 Rennes

[-- Attachment #1.2: Digital signature --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

[-- Attachment #2: Type: text/plain, Size: 206 bytes --]

_______________________________________________
Containers mailing list
Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
https://lists.linux-foundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH][cr]: Fix ghost task bug
       [not found]                                                     ` <20110326160607.GA18492-Hu8+6S1rdjywhHL9vcZdMVaTQe2KTcn/@public.gmane.org>
@ 2011-03-27  3:17                                                       ` Oren Laadan
       [not found]                                                         ` <4D8EAC5A.2080604-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Oren Laadan @ 2011-03-27  3:17 UTC (permalink / raw)
  To: Sukadev Bhattiprolu; +Cc: Louis Rilling, Containers


On 03/26/2011 12:06 PM, Louis Rilling wrote:
> On 25/03/11 22:14 -0400, Oren Laadan wrote:
>>
>>
>> On 03/04/2011 12:29 PM, Louis Rilling wrote:
>>> On 04/03/11 11:07 -0500, Oren Laadan wrote:
>>>> On 03/03/2011 11:35 AM, Louis Rilling wrote:
>>>>> On 03/03/11 10:38 -0500, Oren Laadan wrote:
>>>>>> On 03/01/2011 10:31 AM, Louis Rilling wrote:
>>>>>>> On 28/02/11 17:10 -0500, Oren Laadan wrote:
>>>>>>>> So looking at the code again, we could add one condition in exit.c
>>>>>>>> at wait_consider_task(), after the test of p->exit_state == EXIT_DEAD,
>>>>>>>> to also test:
>>>>>>>>
>>>>>>>> inline static bool is_ghost_task(p)
>>>>>>>> {
>>>>>>>> 	return (p->flags & (PF_EXITING|PF_RESTARTING) ==
>>>>>>>> 		PF_EXITING|PF_RESTARTING) && task_detached(p)
>>>>>>>> }
>>>>>>>>
>>>>>>>> 	if (p->flags & is_ghost_task(p))
>>>>>>>> 		return 0;
>>>>>>>>
>>>>>>>> Or something along the lines (e.g. used EXIT_ZOMBIE comparison instead
>>>>>>>> of PF_EXITING). While requiring a kernel patch, it is relatively short,
>>>>>>>> clean and easy to review.
>>>>>
>>>>> EXIT_ZOMBIE comparison would not optimize much imho, since p->flags must be
>>>>> checked anyway.
>>>>>
>>>>> Nit1: I don't think that checking p->flags saves anything before calling
>>>>> is_ghost_task().
>>>>
>>>> Hmm.. right -
>>>> That's a leftover from before I decided to introduce is_ghost_task()
>>>>
>>>>>
>>>>> Nit2: why would you like to check that PF_EXITING and PF_RESTARTING come
>>>>> together? Is it to make sure that no "real" restarted thread will be skipped
>>>>> this way?
>>>>
>>>> If wait() is called to get the state of stopped children, and for
>>>> whatever reason the ghost is stopped or being ptraced (we should
>>>> probably prevent that... but ok) - testing for the exiting/zombie 
>>>> condition is an extra safety measure: only skip this task when it
>>>> is actually exiting.
>>>
>>> I don't see how a ghost task could be stopped or ptraced, since it calls
>>> do_exit() right after becoming detached, and thus identifiable as a ghost.
>>> Unless it gets ptraced right before calling sys_restart()? Even in that case,
>>> it's not reapable by ptrace since it's not in stopped state. OTOH, it may still
>>> be reaped in wait_task_continued() (see below).
>>>
>>>>
>>>> Do you not think it's needed ?
>>>
>>> Not sure. As far as I can see, other restarting (with PF_RESTARTING) and
>>> detached tasks can only be sub-threads, and are mostly not reapable in any way
>>> as long as PF_RESTARTING is set. They can surely be reaped neither by
>>> wait_task_zombie(), nor by wait_task_stopped(). The only possibility I see is by
>>> wait_task_continued(), because a previous "wakeup from stopped" has not been
>>> consumed before the checkpoint.
>>>
>>> But, and I think that this is a good reason to check PF_EXITING (or
>>> ->exit_state), if threads are skipped this way, then wait() might incorrectly
>>> return -ECHILD instead of sleeping.
>>>
>>> Wait. Even with this, after ->exit_signal is set to -1, and before PF_EXITING is
>>> set, wait_consider_task() can still consider the ghost as potentially reapable
>>> in the future. Deadlock again.
>>>
>>> In fact, it's probably much saner to have something atomic, like:
>>>
>>> 	write_lock_irq(&tasklist_lock);
>>> 	p->flags |= PF_EXITING;
>>> 	p->exit_signal = -1;
>>> 	__wake_up_parent(p, p->parent);
>>> 	write_unlock_irq(&tasklist_lock);
>>>
>>> Unfortunately this is not accepted by do_exit(). So two kinds of solutions:
>>> either set a new flag à la PF_RESTART_GHOST, and only check for this flag in
>>> wait_consider_task(),
>>> or somewhere in do_exit() (latest in exit_notify()), have
>>> another mean to recognize ghost tasks, and do the ->exit_signal = -1 +
>>> __wake_up_parent() there.
>>>
>>> What's your opinion?
>>>
>>
>> Doing it in wait_consider_task() may be a problem since we only mark
>> a task as ghost after it has lived for a while, so wait() would have
>> already considered it a valid child to wait for.
>>
>> If I had to choose, then I'd do the snippet you suggest above - and
>> in particular where PF_EXITING is already set, which is exit_signals().
>>
>> Adding a means to recognize ghost tasks is simple: we ran out of 
>> task->flags, but we can add a c/r related field to hold such a flag
>> (we already add one field to the task_struct).
>>
>> Do you think that will do it ?
> 
> Yup, any way to have a flag protected by tasklist_lock would be ok. For
> instance, use some bit near ->did_exec. IMHO of course :)

Good idea.
How about this patch:

diff --git a/include/linux/sched.h b/include/linux/sched.h
index e42bf29..5a08d49 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1267,6 +1267,9 @@ struct task_struct {
 	unsigned in_execve:1;	/* Tell the LSMs that the process is doing an
 				 * execve */
 	unsigned in_iowait:1;
+#ifdef CONFIG_CHECKPOINT
+	unsigned ckpt_ghost:1;  /* ghost task in c/r - auto-reap */
+#endif
 
 
 	/* Revert to default priority/policy when forking */
diff --git a/kernel/checkpoint/restart.c b/kernel/checkpoint/restart.c
index 01da67f..880d456 100644
--- a/kernel/checkpoint/restart.c
+++ b/kernel/checkpoint/restart.c
@@ -971,6 +971,8 @@ static int do_ghost_task(void)
 	restore_debug_error(ctx, ret);
 	if (ret < 0)
 		ckpt_err(ctx, ret, "ghost restart failed\n");
+	else
+		current->ckpt_ghost = 1;
 
 	restore_debug_exit(ctx);
 	ckpt_ctx_put(ctx);
diff --git a/kernel/signal.c b/kernel/signal.c
index b1e6a31..8bc7c9e 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2017,6 +2017,13 @@ void exit_signals(struct task_struct *tsk)
 	struct task_struct *t;
 
 	if (thread_group_empty(tsk) || signal_group_exit(tsk->signal)) {
+		if (tsk->ckpt_ghost) {
+			write_lock_irq(&tasklist_lock);
+			p->flags |= PF_EXITING;
+			p->exit_signal = -1;
+			__wake_up_parent(p, p->parent);
+			write_unlock_irq(&tasklist_lock);
+		}
 		tsk->flags |= PF_EXITING;
 		return;
 	}

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH][cr]: Fix ghost task bug
       [not found]                                                         ` <4D8EAC5A.2080604-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
@ 2011-03-28 15:24                                                           ` Louis Rilling
  0 siblings, 0 replies; 16+ messages in thread
From: Louis Rilling @ 2011-03-28 15:24 UTC (permalink / raw)
  To: Oren Laadan; +Cc: Containers, Sukadev Bhattiprolu


[-- Attachment #1.1: Type: text/plain, Size: 8726 bytes --]

On 26/03/11 23:17 -0400, Oren Laadan wrote:
> 
> On 03/26/2011 12:06 PM, Louis Rilling wrote:
> > On 25/03/11 22:14 -0400, Oren Laadan wrote:
> >>
> >>
> >> On 03/04/2011 12:29 PM, Louis Rilling wrote:
> >>> On 04/03/11 11:07 -0500, Oren Laadan wrote:
> >>>> On 03/03/2011 11:35 AM, Louis Rilling wrote:
> >>>>> On 03/03/11 10:38 -0500, Oren Laadan wrote:
> >>>>>> On 03/01/2011 10:31 AM, Louis Rilling wrote:
> >>>>>>> On 28/02/11 17:10 -0500, Oren Laadan wrote:
> >>>>>>>> So looking at the code again, we could add one condition in exit.c
> >>>>>>>> at wait_consider_task(), after the test of p->exit_state == EXIT_DEAD,
> >>>>>>>> to also test:
> >>>>>>>>
> >>>>>>>> inline static bool is_ghost_task(p)
> >>>>>>>> {
> >>>>>>>> 	return (p->flags & (PF_EXITING|PF_RESTARTING) ==
> >>>>>>>> 		PF_EXITING|PF_RESTARTING) && task_detached(p)
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>> 	if (p->flags & is_ghost_task(p))
> >>>>>>>> 		return 0;
> >>>>>>>>
> >>>>>>>> Or something along the lines (e.g. used EXIT_ZOMBIE comparison instead
> >>>>>>>> of PF_EXITING). While requiring a kernel patch, it is relatively short,
> >>>>>>>> clean and easy to review.
> >>>>>
> >>>>> EXIT_ZOMBIE comparison would not optimize much imho, since p->flags must be
> >>>>> checked anyway.
> >>>>>
> >>>>> Nit1: I don't think that checking p->flags saves anything before calling
> >>>>> is_ghost_task().
> >>>>
> >>>> Hmm.. right -
> >>>> That's a leftover from before I decided to introduce is_ghost_task()
> >>>>
> >>>>>
> >>>>> Nit2: why would you like to check that PF_EXITING and PF_RESTARTING come
> >>>>> together? Is it to make sure that no "real" restarted thread will be skipped
> >>>>> this way?
> >>>>
> >>>> If wait() is called to get the state of stopped children, and for
> >>>> whatever reason the ghost is stopped or being ptraced (we should
> >>>> probably prevent that... but ok) - testing for the exiting/zombie 
> >>>> condition is an extra safety measure: only skip this task when it
> >>>> is actually exiting.
> >>>
> >>> I don't see how a ghost task could be stopped or ptraced, since it calls
> >>> do_exit() right after becoming detached, and thus identifiable as a ghost.
> >>> Unless it gets ptraced right before calling sys_restart()? Even in that case,
> >>> it's not reapable by ptrace since it's not in stopped state. OTOH, it may still
> >>> be reaped in wait_task_continued() (see below).
> >>>
> >>>>
> >>>> Do you not think it's needed ?
> >>>
> >>> Not sure. As far as I can see, other restarting (with PF_RESTARTING) and
> >>> detached tasks can only be sub-threads, and are mostly not reapable in any way
> >>> as long as PF_RESTARTING is set. They can surely be reaped neither by
> >>> wait_task_zombie(), nor by wait_task_stopped(). The only possibility I see is by
> >>> wait_task_continued(), because a previous "wakeup from stopped" has not been
> >>> consumed before the checkpoint.
> >>>
> >>> But, and I think that this is a good reason to check PF_EXITING (or
> >>> ->exit_state), if threads are skipped this way, then wait() might incorrectly
> >>> return -ECHILD instead of sleeping.
> >>>
> >>> Wait. Even with this, after ->exit_signal is set to -1, and before PF_EXITING is
> >>> set, wait_consider_task() can still consider the ghost as potentially reapable
> >>> in the future. Deadlock again.
> >>>
> >>> In fact, it's probably much saner to have something atomic, like:
> >>>
> >>> 	write_lock_irq(&tasklist_lock);
> >>> 	p->flags |= PF_EXITING;
> >>> 	p->exit_signal = -1;
> >>> 	__wake_up_parent(p, p->parent);
> >>> 	write_unlock_irq(&tasklist_lock);
> >>>
> >>> Unfortunately this is not accepted by do_exit(). So two kinds of solutions:
> >>> either set a new flag à la PF_RESTART_GHOST, and only check for this flag in
> >>> wait_consider_task(),
> >>> or somewhere in do_exit() (latest in exit_notify()), have
> >>> another mean to recognize ghost tasks, and do the ->exit_signal = -1 +
> >>> __wake_up_parent() there.
> >>>
> >>> What's your opinion?
> >>>
> >>
> >> Doing it in wait_consider_task() may be a problem since we only mark
> >> a task as ghost after it has lived for a while, so wait() would have
> >> already considered it a valid child to wait for.
> >>
> >> If I had to choose, then I'd do the snippet you suggest above - and
> >> in particular where PF_EXITING is already set, which is exit_signals().
> >>
> >> Adding a means to recognize ghost tasks is simple: we ran out of 
> >> task->flags, but we can add a c/r related field to hold such a flag
> >> (we already add one field to the task_struct).
> >>
> >> Do you think that will do it ?
> > 
> > Yup, any way to have a flag protected by tasklist_lock would be ok. For
> > instance, use some bit near ->did_exec. IMHO of course :)
> 
> Good idea.
> How about this patch:
> 
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index e42bf29..5a08d49 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1267,6 +1267,9 @@ struct task_struct {
>  	unsigned in_execve:1;	/* Tell the LSMs that the process is doing an
>  				 * execve */
>  	unsigned in_iowait:1;
> +#ifdef CONFIG_CHECKPOINT
> +	unsigned ckpt_ghost:1;  /* ghost task in c/r - auto-reap */
> +#endif
>  
>  
>  	/* Revert to default priority/policy when forking */
> diff --git a/kernel/checkpoint/restart.c b/kernel/checkpoint/restart.c
> index 01da67f..880d456 100644
> --- a/kernel/checkpoint/restart.c
> +++ b/kernel/checkpoint/restart.c
> @@ -971,6 +971,8 @@ static int do_ghost_task(void)
>  	restore_debug_error(ctx, ret);
>  	if (ret < 0)
>  		ckpt_err(ctx, ret, "ghost restart failed\n");
> +	else
> +		current->ckpt_ghost = 1;
>  
>  	restore_debug_exit(ctx);
>  	ckpt_ctx_put(ctx);
> diff --git a/kernel/signal.c b/kernel/signal.c
> index b1e6a31..8bc7c9e 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -2017,6 +2017,13 @@ void exit_signals(struct task_struct *tsk)
>  	struct task_struct *t;
>  
>  	if (thread_group_empty(tsk) || signal_group_exit(tsk->signal)) {
> +		if (tsk->ckpt_ghost) {
> +			write_lock_irq(&tasklist_lock);
> +			p->flags |= PF_EXITING;
> +			p->exit_signal = -1;
> +			__wake_up_parent(p, p->parent);
> +			write_unlock_irq(&tasklist_lock);
> +		}
>  		tsk->flags |= PF_EXITING;
>  		return;

This alone is not enough. I was not clear enough and did not pay enough
attention to your suggestion, I'm afraid :)
wait_consider_task() can still wait for such a child, with __WALL or __WCLONE.
So wait_consider_task() must check ->ckpt_ghost, and __wake_up_parent() must be
done when setting ->ckpt_ghost. Then PF_EXITING need no particular handling, and
all can be done in do_ghost_task().

In other words, something like this could do it (hope it's not an issue to set
->ckpt_ghost before calling restore_debug_error()):


diff --git a/include/linux/sched.h b/include/linux/sched.h
index e42bf29..5a08d49 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1267,6 +1267,9 @@ struct task_struct {
 	unsigned in_execve:1;	/* Tell the LSMs that the process is doing an
 				 * execve */
 	unsigned in_iowait:1;
+#ifdef CONFIG_CHECKPOINT
+	unsigned ckpt_ghost:1;  /* ghost task in c/r - auto-reap */
+#endif
 
 
 	/* Revert to default priority/policy when forking */
diff --git a/kernel/checkpoint/restart.c b/kernel/checkpoint/restart.c
index 25e3b6d..e7b66f5 100644
--- a/kernel/checkpoint/restart.c
+++ b/kernel/checkpoint/restart.c
@@ -1307,6 +1307,17 @@ static int do_ghost_task(void)
 	ret = wait_event_interruptible(ctx->ghostq,
 				       all_tasks_activated(ctx) ||
 				       ckpt_test_error(ctx));
+	if (ret < 0)
+		goto out;
+
+	write_lock_irq(&tasklist_lock);
+	/* Tell parent not to wait for us. It may be already waiting though. */
+	current->ckpt_ghost = 1;
+	__wake_up_parent(current, current->parent);
+	/* Re-use sub-threads' auto-reap logic */
+	current->exit_signal = -1;
+	write_unlock_irq(&tasklist_lock);
+
  out:
 	restore_debug_error(ctx, ret);
 	if (ret < 0)
diff --git a/kernel/exit.c b/kernel/exit.c
index 6d81911..75cab68 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -1554,6 +1554,11 @@ static int wait_consider_task(struct wait_opts *wo, int ptrace,
 	if (p->exit_state == EXIT_DEAD)
 		return 0;
 
+#ifdef CONFIG_CHECKPOINT
+	if (p->ckpt_ghost)
+		return 0;
+#endif
+
 	/*
 	 * We don't reap group leaders with subthreads.
 	 */

-- 
Dr Louis Rilling			Kerlabs
Skype: louis.rilling			Batiment Germanium
Phone: (+33|0) 6 80 89 08 23		80 avenue des Buttes de Coesmes
http://www.kerlabs.com/			35700 Rennes

[-- Attachment #1.2: Digital signature --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

[-- Attachment #2: Type: text/plain, Size: 206 bytes --]

_______________________________________________
Containers mailing list
Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
https://lists.linux-foundation.org/mailman/listinfo/containers

^ permalink raw reply related	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2011-03-28 15:24 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-02-25  7:55 [PATCH][cr]: Fix ghost task bug Sukadev Bhattiprolu
     [not found] ` <20110225075552.GB24361-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2011-02-25 11:55   ` Louis Rilling
     [not found]     ` <20110225115507.GD3915-Hu8+6S1rdjywhHL9vcZdMVaTQe2KTcn/@public.gmane.org>
2011-02-25 19:01       ` Sukadev Bhattiprolu
     [not found]         ` <20110225190132.GA31300-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2011-02-26 13:54           ` Louis Rilling
     [not found]             ` <20110226135431.GA3251-aw0BnHfMbSpBDgjK7y7TUQ@public.gmane.org>
2011-02-28 14:43               ` Oren Laadan
     [not found]                 ` <4D6BB499.7050602-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2011-02-28 15:09                   ` Louis Rilling
     [not found]                     ` <20110228150942.GD18970-Hu8+6S1rdjywhHL9vcZdMVaTQe2KTcn/@public.gmane.org>
2011-02-28 22:10                       ` Oren Laadan
     [not found]                         ` <4D6C1D58.5060400-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2011-03-01 15:31                           ` Louis Rilling
     [not found]                             ` <20110301153105.GD4410-Hu8+6S1rdjywhHL9vcZdMVaTQe2KTcn/@public.gmane.org>
2011-03-03 15:38                               ` Oren Laadan
     [not found]                                 ` <4D6FB5D9.8010002-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2011-03-03 16:35                                   ` Louis Rilling
     [not found]                                     ` <20110303163531.GO21429-Hu8+6S1rdjywhHL9vcZdMVaTQe2KTcn/@public.gmane.org>
2011-03-04 16:07                                       ` Oren Laadan
     [not found]                                         ` <4D710E37.8000109-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2011-03-04 17:29                                           ` Louis Rilling
     [not found]                                             ` <20110304172914.GE3543-Hu8+6S1rdjywhHL9vcZdMVaTQe2KTcn/@public.gmane.org>
2011-03-26  2:14                                               ` Oren Laadan
     [not found]                                                 ` <4D8D4BE9.8010605-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2011-03-26 16:06                                                   ` Louis Rilling
     [not found]                                                     ` <20110326160607.GA18492-Hu8+6S1rdjywhHL9vcZdMVaTQe2KTcn/@public.gmane.org>
2011-03-27  3:17                                                       ` Oren Laadan
     [not found]                                                         ` <4D8EAC5A.2080604-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2011-03-28 15:24                                                           ` Louis Rilling

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.