public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [RFC][PATCH] Disable CLONE_PARENT for init
@ 2009-07-01  7:31 Sukadev Bhattiprolu
  2009-07-01  7:46 ` Roland McGrath
  0 siblings, 1 reply; 11+ messages in thread
From: Sukadev Bhattiprolu @ 2009-07-01  7:31 UTC (permalink / raw)
  To: Oleg Nesterov, roland, Eric W. Biederman, Oren Laadan
  Cc: serue, Alexey Dobriyan, Containers, linux-kernel


Disable CLONE_PARENT for init

When global or container-init processes use CLONE_PARENT, they create a
multi-rooted process tree. Besides, if the siblings of init exit, the
SIGCHLD is not sent to init process resulting in the zombies sticking
around indefinitely. So disable CLONE_PARENT for init.

Lightly tested, RFC patch :-)

Changelog[v2]:
	- Simplify patch description based on comments from Eric Biederman
	  and Oleg Nesterov.
	- [Oleg Nesterov] Use SIGNAL_UNKILLABLE instead of is_global_init()

Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Acked-by: Roland McGrath <roland@redhat.com>
---
 kernel/fork.c |   11 +++++++++++
 1 file changed, 11 insertions(+)

Index: linux-mmotm/kernel/fork.c
===================================================================
--- linux-mmotm.orig/kernel/fork.c	2009-06-30 23:01:06.000000000 -0700
+++ linux-mmotm/kernel/fork.c	2009-06-30 23:13:53.000000000 -0700
@@ -974,6 +974,17 @@ static struct task_struct *copy_process(
 	if ((clone_flags & CLONE_SIGHAND) && !(clone_flags & CLONE_VM))
 		return ERR_PTR(-EINVAL);
 
+	/*
+	 * Swapper process sets the handler for SIGCHLD to SIG_DFL. If init
+	 * creates a sibling and the sibling exits, the SIGCHLD is sent to
+	 * the swapper (since the swapper's handler for SIGCHLD is SIG_DFL).
+	 * But since the swapper does not reap its children, the zombie will
+	 * remain forever. So prevent init from using CLONE_PARENT.
+	 */
+	if ((clone_flags & CLONE_PARENT) &&
+				current->signal->flags & SIGNAL_UNKILLABLE)
+		return ERR_PTR(-EINVAL);
+
 	retval = security_task_create(clone_flags);
 	if (retval)
 		goto fork_out;

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC][PATCH] Disable CLONE_PARENT for init
  2009-07-01  7:31 [RFC][PATCH] Disable CLONE_PARENT for init Sukadev Bhattiprolu
@ 2009-07-01  7:46 ` Roland McGrath
  2009-07-01  8:01   ` Sukadev Bhattiprolu
  2009-07-01  8:24   ` Oleg Nesterov
  0 siblings, 2 replies; 11+ messages in thread
From: Roland McGrath @ 2009-07-01  7:46 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: Oleg Nesterov, Eric W. Biederman, Oren Laadan, serue,
	Alexey Dobriyan, Containers, linux-kernel

> When global or container-init processes use CLONE_PARENT, they create a
> multi-rooted process tree. 

I take this to be the real motivation for your change.
But you don't mention it in the code comment.

> +	 * Swapper process sets the handler for SIGCHLD to SIG_DFL. If init
> +	 * creates a sibling and the sibling exits, the SIGCHLD is sent to
> +	 * the swapper (since the swapper's handler for SIGCHLD is SIG_DFL).
> +	 * But since the swapper does not reap its children, the zombie will
> +	 * remain forever. So prevent init from using CLONE_PARENT.

This would be fixed by having swapper set its SIGCHLD to SIG_IGN instead,
so such children self-reap.  That seems like the better fix for that.

If you want to make this change because of container-init issues, I think
you should just say so independent of this global-init case.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC][PATCH] Disable CLONE_PARENT for init
  2009-07-01  7:46 ` Roland McGrath
@ 2009-07-01  8:01   ` Sukadev Bhattiprolu
  2009-07-01  8:24   ` Oleg Nesterov
  1 sibling, 0 replies; 11+ messages in thread
From: Sukadev Bhattiprolu @ 2009-07-01  8:01 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Oleg Nesterov, Eric W. Biederman, Oren Laadan, serue,
	Alexey Dobriyan, Containers, linux-kernel

Roland McGrath [roland@redhat.com] wrote:
| > When global or container-init processes use CLONE_PARENT, they create a
| > multi-rooted process tree. 
| 
| I take this to be the real motivation for your change.
| But you don't mention it in the code comment.

Well, it was - when I started. But my understanding of the comments was that
the constraint could be extended to global init as well for the following
reason.

| 
| > +	 * Swapper process sets the handler for SIGCHLD to SIG_DFL. If init
| > +	 * creates a sibling and the sibling exits, the SIGCHLD is sent to
| > +	 * the swapper (since the swapper's handler for SIGCHLD is SIG_DFL).
| > +	 * But since the swapper does not reap its children, the zombie will
| > +	 * remain forever. So prevent init from using CLONE_PARENT.
| 
| This would be fixed by having swapper set its SIGCHLD to SIG_IGN instead,
| so such children self-reap.  That seems like the better fix for that.

Yes, that would fix the global init case.

| 
| If you want to make this change because of container-init issues, I think
| you should just say so independent of this global-init case.

So can I leave the check for SIGNAL_UNKILLABLE but simplify the comments
to refer to the multi-rooted process tree ?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC][PATCH] Disable CLONE_PARENT for init
  2009-07-01  7:46 ` Roland McGrath
  2009-07-01  8:01   ` Sukadev Bhattiprolu
@ 2009-07-01  8:24   ` Oleg Nesterov
  2009-07-01 21:48     ` Sukadev Bhattiprolu
  1 sibling, 1 reply; 11+ messages in thread
From: Oleg Nesterov @ 2009-07-01  8:24 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Sukadev Bhattiprolu, Eric W. Biederman, Oren Laadan, serue,
	Alexey Dobriyan, Containers, linux-kernel

On 07/01, Roland McGrath wrote:
>
> > When global or container-init processes use CLONE_PARENT, they create a
> > multi-rooted process tree. 
>
> I take this to be the real motivation for your change.
> But you don't mention it in the code comment.
>
> > +	 * Swapper process sets the handler for SIGCHLD to SIG_DFL. If init
> > +	 * creates a sibling and the sibling exits, the SIGCHLD is sent to
> > +	 * the swapper (since the swapper's handler for SIGCHLD is SIG_DFL).
> > +	 * But since the swapper does not reap its children, the zombie will
> > +	 * remain forever. So prevent init from using CLONE_PARENT.
>
> This would be fixed by having swapper set its SIGCHLD to SIG_IGN instead,
> so such children self-reap.  That seems like the better fix for that.

This won't fix the problem. The child won't autoreap itself if ->exit_signal
!= SIGCHLD.

> If you want to make this change because of container-init issues, I think
> you should just say so independent of this global-init case.

Yes, agreed, the comment looks confusing.

Oleg


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC][PATCH] Disable CLONE_PARENT for init
  2009-07-01  8:24   ` Oleg Nesterov
@ 2009-07-01 21:48     ` Sukadev Bhattiprolu
  2009-07-01 21:58       ` Roland McGrath
  2009-07-01 23:27       ` Eric W. Biederman
  0 siblings, 2 replies; 11+ messages in thread
From: Sukadev Bhattiprolu @ 2009-07-01 21:48 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Roland McGrath, Eric W. Biederman, Oren Laadan, serue,
	Alexey Dobriyan, Containers, linux-kernel

| This won't fix the problem. The child won't autoreap itself if ->exit_signal
| != SIGCHLD.
| 
| > If you want to make this change because of container-init issues, I think
| > you should just say so independent of this global-init case.
| 
| Yes, agreed, the comment looks confusing.
| 
| Oleg

Here is an updated patch with comments fixed.

Roland pls ack again if this is better.

---

Disable CLONE_PARENT for init.

When global or container-init processes use CLONE_PARENT, they create a
multi-rooted process tree. Besides if the siblings of init exit, the
SIGCHLD is not sent to init process resulting in the zombies sticking
around indefinitely.

Changelog[v3]:
	- [Roland, Oleg] Simplify comment describing the change
Changelog[v2]:
	- Simplify patch description based on comments from Eric Biederman
	  and Oleg Nesterov.
	- [Oleg Nesterov] Use SIGNAL_UNKILLABLE instead of is_global_init()

Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com>
---
 kernel/fork.c |    8 ++++++++
 1 file changed, 8 insertions(+)

Index: linux-mmotm/kernel/fork.c
===================================================================
--- linux-mmotm.orig/kernel/fork.c	2009-06-30 23:01:06.000000000 -0700
+++ linux-mmotm/kernel/fork.c	2009-07-01 14:43:10.000000000 -0700
@@ -974,6 +974,14 @@ static struct task_struct *copy_process(
 	if ((clone_flags & CLONE_SIGHAND) && !(clone_flags & CLONE_VM))
 		return ERR_PTR(-EINVAL);
 
+	/*
+	 * To avoid multi-rooted process-trees prevent global and container
+	 * inits from creating siblings.
+	 */
+	if ((clone_flags & CLONE_PARENT) &&
+				current->signal->flags & SIGNAL_UNKILLABLE)
+		return ERR_PTR(-EINVAL);
+
 	retval = security_task_create(clone_flags);
 	if (retval)
 		goto fork_out;

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC][PATCH] Disable CLONE_PARENT for init
  2009-07-01 21:48     ` Sukadev Bhattiprolu
@ 2009-07-01 21:58       ` Roland McGrath
  2009-07-02  0:35         ` Sukadev Bhattiprolu
  2009-07-01 23:27       ` Eric W. Biederman
  1 sibling, 1 reply; 11+ messages in thread
From: Roland McGrath @ 2009-07-01 21:58 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: Oleg Nesterov, Eric W. Biederman, Oren Laadan, serue,
	Alexey Dobriyan, Containers, linux-kernel

Yeah, that's fine.  Since Oleg's pointed out that there is indeed no way to
avoid the leak in some global-init uses, it is fine to have a comment that
says that global init problems are part of the reason to outlaw this usage.
I just objected to a change that was really for container init but said it
was only to fix something different.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC][PATCH] Disable CLONE_PARENT for init
  2009-07-01 21:48     ` Sukadev Bhattiprolu
  2009-07-01 21:58       ` Roland McGrath
@ 2009-07-01 23:27       ` Eric W. Biederman
  1 sibling, 0 replies; 11+ messages in thread
From: Eric W. Biederman @ 2009-07-01 23:27 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: Oleg Nesterov, Roland McGrath, Oren Laadan, serue,
	Alexey Dobriyan, Containers, linux-kernel

Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> writes:

> | This won't fix the problem. The child won't autoreap itself if ->exit_signal
> | != SIGCHLD.
> | 
> | > If you want to make this change because of container-init issues, I think
> | > you should just say so independent of this global-init case.
> | 
> | Yes, agreed, the comment looks confusing.
> | 
> | Oleg
>
> Here is an updated patch with comments fixed.
>
> Roland pls ack again if this is better.
>
> ---
>
> Disable CLONE_PARENT for init.
>
> When global or container-init processes use CLONE_PARENT, they create a
> multi-rooted process tree. Besides if the siblings of init exit, the
> SIGCHLD is not sent to init process resulting in the zombies sticking
> around indefinitely.
>
> Changelog[v3]:
> 	- [Roland, Oleg] Simplify comment describing the change
> Changelog[v2]:
> 	- Simplify patch description based on comments from Eric Biederman
> 	  and Oleg Nesterov.
> 	- [Oleg Nesterov] Use SIGNAL_UNKILLABLE instead of is_global_init()
>
> Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com>

Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>

> ---
>  kernel/fork.c |    8 ++++++++
>  1 file changed, 8 insertions(+)
>
> Index: linux-mmotm/kernel/fork.c
> ===================================================================
> --- linux-mmotm.orig/kernel/fork.c	2009-06-30 23:01:06.000000000 -0700
> +++ linux-mmotm/kernel/fork.c	2009-07-01 14:43:10.000000000 -0700
> @@ -974,6 +974,14 @@ static struct task_struct *copy_process(
>  	if ((clone_flags & CLONE_SIGHAND) && !(clone_flags & CLONE_VM))
>  		return ERR_PTR(-EINVAL);
>  
> +	/*
> +	 * To avoid multi-rooted process-trees prevent global and container
> +	 * inits from creating siblings.
> +	 */
> +	if ((clone_flags & CLONE_PARENT) &&
> +				current->signal->flags & SIGNAL_UNKILLABLE)
> +		return ERR_PTR(-EINVAL);
> +
>  	retval = security_task_create(clone_flags);
>  	if (retval)
>  		goto fork_out;

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC][PATCH] Disable CLONE_PARENT for init
  2009-07-01 21:58       ` Roland McGrath
@ 2009-07-02  0:35         ` Sukadev Bhattiprolu
  2009-07-02  0:49           ` Roland McGrath
  2009-07-02  7:58           ` Oleg Nesterov
  0 siblings, 2 replies; 11+ messages in thread
From: Sukadev Bhattiprolu @ 2009-07-02  0:35 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Oleg Nesterov, Eric W. Biederman, Oren Laadan, serue,
	Alexey Dobriyan, Containers, linux-kernel

Roland McGrath [roland@redhat.com] wrote:
| Yeah, that's fine.  Since Oleg's pointed out that there is indeed no way to
| avoid the leak in some global-init uses, it is fine to have a comment that
| says that global init problems are part of the reason to outlaw this usage.
| I just objected to a change that was really for container init but said it
| was only to fix something different.

Ok. How about this comment:

---

Disable CLONE_PARENT for init.

When global or container-init processes use CLONE_PARENT, they create a
multi-rooted process tree. Besides siblings of global init remain as
zombies on exit since they are not reaped by their parent (swapper). So
prevent global and container-inits from creating siblings.

Changelog[v3]:
	- [Roland, Oleg] Simplify comment describing the change
Changelog[v2]:
	- Simplify patch description based on comments from Eric Biederman
	  and Oleg Nesterov.
	- [Oleg Nesterov] Use SIGNAL_UNKILLABLE instead of is_global_init()

Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 kernel/fork.c |   10 ++++++++++
 1 file changed, 10 insertions(+)

Index: linux-mmotm/kernel/fork.c
===================================================================
--- linux-mmotm.orig/kernel/fork.c	2009-06-30 23:01:06.000000000 -0700
+++ linux-mmotm/kernel/fork.c	2009-07-01 17:29:09.000000000 -0700
@@ -974,6 +974,16 @@ static struct task_struct *copy_process(
 	if ((clone_flags & CLONE_SIGHAND) && !(clone_flags & CLONE_VM))
 		return ERR_PTR(-EINVAL);
 
+	/*
+	 * Siblings of global init remain as zombies on exit since they are
+	 * not reaped by their parent (swapper). To solve this and to avoid
+	 * multi-rooted process trees, prevent global and container-inits
+	 * from creating siblings.
+	 */
+	if ((clone_flags & CLONE_PARENT) &&
+				current->signal->flags & SIGNAL_UNKILLABLE)
+		return ERR_PTR(-EINVAL);
+
 	retval = security_task_create(clone_flags);
 	if (retval)
 		goto fork_out;

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC][PATCH] Disable CLONE_PARENT for init
  2009-07-02  0:35         ` Sukadev Bhattiprolu
@ 2009-07-02  0:49           ` Roland McGrath
  2009-07-02  7:58           ` Oleg Nesterov
  1 sibling, 0 replies; 11+ messages in thread
From: Roland McGrath @ 2009-07-02  0:49 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: Oleg Nesterov, Eric W. Biederman, Oren Laadan, serue,
	Alexey Dobriyan, Containers, linux-kernel

Looks good to me.

Acked-by: Roland McGrath <roland@redhat.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC][PATCH] Disable CLONE_PARENT for init
  2009-07-02  0:35         ` Sukadev Bhattiprolu
  2009-07-02  0:49           ` Roland McGrath
@ 2009-07-02  7:58           ` Oleg Nesterov
  2009-07-02 12:36             ` Oleg Nesterov
  1 sibling, 1 reply; 11+ messages in thread
From: Oleg Nesterov @ 2009-07-02  7:58 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: Roland McGrath, Eric W. Biederman, Oren Laadan, serue,
	Alexey Dobriyan, Containers, linux-kernel

On 07/01, Sukadev Bhattiprolu wrote:
>
> --- linux-mmotm.orig/kernel/fork.c	2009-06-30 23:01:06.000000000 -0700
> +++ linux-mmotm/kernel/fork.c	2009-07-01 17:29:09.000000000 -0700
> @@ -974,6 +974,16 @@ static struct task_struct *copy_process(
>  	if ((clone_flags & CLONE_SIGHAND) && !(clone_flags & CLONE_VM))
>  		return ERR_PTR(-EINVAL);
>
> +	/*
> +	 * Siblings of global init remain as zombies on exit since they are
> +	 * not reaped by their parent (swapper). To solve this and to avoid
> +	 * multi-rooted process trees, prevent global and container-inits
> +	 * from creating siblings.
> +	 */
> +	if ((clone_flags & CLONE_PARENT) &&
> +				current->signal->flags & SIGNAL_UNKILLABLE)
> +		return ERR_PTR(-EINVAL);

Acked-by: Oleg Nesterov <oleg@redhat.com>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC][PATCH] Disable CLONE_PARENT for init
  2009-07-02  7:58           ` Oleg Nesterov
@ 2009-07-02 12:36             ` Oleg Nesterov
  0 siblings, 0 replies; 11+ messages in thread
From: Oleg Nesterov @ 2009-07-02 12:36 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: Roland McGrath, Eric W. Biederman, Oren Laadan, serue,
	Alexey Dobriyan, Containers, linux-kernel

On 07/02, Oleg Nesterov wrote:
>
> On 07/01, Sukadev Bhattiprolu wrote:
> >
> > --- linux-mmotm.orig/kernel/fork.c	2009-06-30 23:01:06.000000000 -0700
> > +++ linux-mmotm/kernel/fork.c	2009-07-01 17:29:09.000000000 -0700
> > @@ -974,6 +974,16 @@ static struct task_struct *copy_process(
> >  	if ((clone_flags & CLONE_SIGHAND) && !(clone_flags & CLONE_VM))
> >  		return ERR_PTR(-EINVAL);
> >
> > +	/*
> > +	 * Siblings of global init remain as zombies on exit since they are
> > +	 * not reaped by their parent (swapper). To solve this and to avoid
> > +	 * multi-rooted process trees, prevent global and container-inits
> > +	 * from creating siblings.
> > +	 */
> > +	if ((clone_flags & CLONE_PARENT) &&
> > +				current->signal->flags & SIGNAL_UNKILLABLE)
> > +		return ERR_PTR(-EINVAL);
>
> Acked-by: Oleg Nesterov <oleg@redhat.com>

Thinking more, perhaps it makes sense to disallow CLONE_VM too.

If init forks CLONE_VM task, this task can be killed by
sig_kernel_coredump signal. In that case init will be killed too
and the kernel will crash.

But this is minor, we can trust the global init.

Oleg.


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2009-07-02 12:39 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-07-01  7:31 [RFC][PATCH] Disable CLONE_PARENT for init Sukadev Bhattiprolu
2009-07-01  7:46 ` Roland McGrath
2009-07-01  8:01   ` Sukadev Bhattiprolu
2009-07-01  8:24   ` Oleg Nesterov
2009-07-01 21:48     ` Sukadev Bhattiprolu
2009-07-01 21:58       ` Roland McGrath
2009-07-02  0:35         ` Sukadev Bhattiprolu
2009-07-02  0:49           ` Roland McGrath
2009-07-02  7:58           ` Oleg Nesterov
2009-07-02 12:36             ` Oleg Nesterov
2009-07-01 23:27       ` Eric W. Biederman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox