* [RFC][PATCH] Disable CLONE_PARENT for init
@ 2009-07-01 7:31 Sukadev Bhattiprolu
2009-07-01 7:46 ` Roland McGrath
0 siblings, 1 reply; 11+ messages in thread
From: Sukadev Bhattiprolu @ 2009-07-01 7:31 UTC (permalink / raw)
To: Oleg Nesterov, roland, Eric W. Biederman, Oren Laadan
Cc: serue, Alexey Dobriyan, Containers, linux-kernel
Disable CLONE_PARENT for init
When global or container-init processes use CLONE_PARENT, they create a
multi-rooted process tree. Besides, if the siblings of init exit, the
SIGCHLD is not sent to init process resulting in the zombies sticking
around indefinitely. So disable CLONE_PARENT for init.
Lightly tested, RFC patch :-)
Changelog[v2]:
- Simplify patch description based on comments from Eric Biederman
and Oleg Nesterov.
- [Oleg Nesterov] Use SIGNAL_UNKILLABLE instead of is_global_init()
Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Acked-by: Roland McGrath <roland@redhat.com>
---
kernel/fork.c | 11 +++++++++++
1 file changed, 11 insertions(+)
Index: linux-mmotm/kernel/fork.c
===================================================================
--- linux-mmotm.orig/kernel/fork.c 2009-06-30 23:01:06.000000000 -0700
+++ linux-mmotm/kernel/fork.c 2009-06-30 23:13:53.000000000 -0700
@@ -974,6 +974,17 @@ static struct task_struct *copy_process(
if ((clone_flags & CLONE_SIGHAND) && !(clone_flags & CLONE_VM))
return ERR_PTR(-EINVAL);
+ /*
+ * Swapper process sets the handler for SIGCHLD to SIG_DFL. If init
+ * creates a sibling and the sibling exits, the SIGCHLD is sent to
+ * the swapper (since the swapper's handler for SIGCHLD is SIG_DFL).
+ * But since the swapper does not reap its children, the zombie will
+ * remain forever. So prevent init from using CLONE_PARENT.
+ */
+ if ((clone_flags & CLONE_PARENT) &&
+ current->signal->flags & SIGNAL_UNKILLABLE)
+ return ERR_PTR(-EINVAL);
+
retval = security_task_create(clone_flags);
if (retval)
goto fork_out;
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC][PATCH] Disable CLONE_PARENT for init
2009-07-01 7:31 [RFC][PATCH] Disable CLONE_PARENT for init Sukadev Bhattiprolu
@ 2009-07-01 7:46 ` Roland McGrath
2009-07-01 8:01 ` Sukadev Bhattiprolu
2009-07-01 8:24 ` Oleg Nesterov
0 siblings, 2 replies; 11+ messages in thread
From: Roland McGrath @ 2009-07-01 7:46 UTC (permalink / raw)
To: Sukadev Bhattiprolu
Cc: Oleg Nesterov, Eric W. Biederman, Oren Laadan, serue,
Alexey Dobriyan, Containers, linux-kernel
> When global or container-init processes use CLONE_PARENT, they create a
> multi-rooted process tree.
I take this to be the real motivation for your change.
But you don't mention it in the code comment.
> + * Swapper process sets the handler for SIGCHLD to SIG_DFL. If init
> + * creates a sibling and the sibling exits, the SIGCHLD is sent to
> + * the swapper (since the swapper's handler for SIGCHLD is SIG_DFL).
> + * But since the swapper does not reap its children, the zombie will
> + * remain forever. So prevent init from using CLONE_PARENT.
This would be fixed by having swapper set its SIGCHLD to SIG_IGN instead,
so such children self-reap. That seems like the better fix for that.
If you want to make this change because of container-init issues, I think
you should just say so independent of this global-init case.
Thanks,
Roland
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC][PATCH] Disable CLONE_PARENT for init
2009-07-01 7:46 ` Roland McGrath
@ 2009-07-01 8:01 ` Sukadev Bhattiprolu
2009-07-01 8:24 ` Oleg Nesterov
1 sibling, 0 replies; 11+ messages in thread
From: Sukadev Bhattiprolu @ 2009-07-01 8:01 UTC (permalink / raw)
To: Roland McGrath
Cc: Oleg Nesterov, Eric W. Biederman, Oren Laadan, serue,
Alexey Dobriyan, Containers, linux-kernel
Roland McGrath [roland@redhat.com] wrote:
| > When global or container-init processes use CLONE_PARENT, they create a
| > multi-rooted process tree.
|
| I take this to be the real motivation for your change.
| But you don't mention it in the code comment.
Well, it was - when I started. But my understanding of the comments was that
the constraint could be extended to global init as well for the following
reason.
|
| > + * Swapper process sets the handler for SIGCHLD to SIG_DFL. If init
| > + * creates a sibling and the sibling exits, the SIGCHLD is sent to
| > + * the swapper (since the swapper's handler for SIGCHLD is SIG_DFL).
| > + * But since the swapper does not reap its children, the zombie will
| > + * remain forever. So prevent init from using CLONE_PARENT.
|
| This would be fixed by having swapper set its SIGCHLD to SIG_IGN instead,
| so such children self-reap. That seems like the better fix for that.
Yes, that would fix the global init case.
|
| If you want to make this change because of container-init issues, I think
| you should just say so independent of this global-init case.
So can I leave the check for SIGNAL_UNKILLABLE but simplify the comments
to refer to the multi-rooted process tree ?
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC][PATCH] Disable CLONE_PARENT for init
2009-07-01 7:46 ` Roland McGrath
2009-07-01 8:01 ` Sukadev Bhattiprolu
@ 2009-07-01 8:24 ` Oleg Nesterov
2009-07-01 21:48 ` Sukadev Bhattiprolu
1 sibling, 1 reply; 11+ messages in thread
From: Oleg Nesterov @ 2009-07-01 8:24 UTC (permalink / raw)
To: Roland McGrath
Cc: Sukadev Bhattiprolu, Eric W. Biederman, Oren Laadan, serue,
Alexey Dobriyan, Containers, linux-kernel
On 07/01, Roland McGrath wrote:
>
> > When global or container-init processes use CLONE_PARENT, they create a
> > multi-rooted process tree.
>
> I take this to be the real motivation for your change.
> But you don't mention it in the code comment.
>
> > + * Swapper process sets the handler for SIGCHLD to SIG_DFL. If init
> > + * creates a sibling and the sibling exits, the SIGCHLD is sent to
> > + * the swapper (since the swapper's handler for SIGCHLD is SIG_DFL).
> > + * But since the swapper does not reap its children, the zombie will
> > + * remain forever. So prevent init from using CLONE_PARENT.
>
> This would be fixed by having swapper set its SIGCHLD to SIG_IGN instead,
> so such children self-reap. That seems like the better fix for that.
This won't fix the problem. The child won't autoreap itself if ->exit_signal
!= SIGCHLD.
> If you want to make this change because of container-init issues, I think
> you should just say so independent of this global-init case.
Yes, agreed, the comment looks confusing.
Oleg
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC][PATCH] Disable CLONE_PARENT for init
2009-07-01 8:24 ` Oleg Nesterov
@ 2009-07-01 21:48 ` Sukadev Bhattiprolu
2009-07-01 21:58 ` Roland McGrath
2009-07-01 23:27 ` Eric W. Biederman
0 siblings, 2 replies; 11+ messages in thread
From: Sukadev Bhattiprolu @ 2009-07-01 21:48 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Roland McGrath, Eric W. Biederman, Oren Laadan, serue,
Alexey Dobriyan, Containers, linux-kernel
| This won't fix the problem. The child won't autoreap itself if ->exit_signal
| != SIGCHLD.
|
| > If you want to make this change because of container-init issues, I think
| > you should just say so independent of this global-init case.
|
| Yes, agreed, the comment looks confusing.
|
| Oleg
Here is an updated patch with comments fixed.
Roland pls ack again if this is better.
---
Disable CLONE_PARENT for init.
When global or container-init processes use CLONE_PARENT, they create a
multi-rooted process tree. Besides if the siblings of init exit, the
SIGCHLD is not sent to init process resulting in the zombies sticking
around indefinitely.
Changelog[v3]:
- [Roland, Oleg] Simplify comment describing the change
Changelog[v2]:
- Simplify patch description based on comments from Eric Biederman
and Oleg Nesterov.
- [Oleg Nesterov] Use SIGNAL_UNKILLABLE instead of is_global_init()
Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com>
---
kernel/fork.c | 8 ++++++++
1 file changed, 8 insertions(+)
Index: linux-mmotm/kernel/fork.c
===================================================================
--- linux-mmotm.orig/kernel/fork.c 2009-06-30 23:01:06.000000000 -0700
+++ linux-mmotm/kernel/fork.c 2009-07-01 14:43:10.000000000 -0700
@@ -974,6 +974,14 @@ static struct task_struct *copy_process(
if ((clone_flags & CLONE_SIGHAND) && !(clone_flags & CLONE_VM))
return ERR_PTR(-EINVAL);
+ /*
+ * To avoid multi-rooted process-trees prevent global and container
+ * inits from creating siblings.
+ */
+ if ((clone_flags & CLONE_PARENT) &&
+ current->signal->flags & SIGNAL_UNKILLABLE)
+ return ERR_PTR(-EINVAL);
+
retval = security_task_create(clone_flags);
if (retval)
goto fork_out;
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC][PATCH] Disable CLONE_PARENT for init
2009-07-01 21:48 ` Sukadev Bhattiprolu
@ 2009-07-01 21:58 ` Roland McGrath
2009-07-02 0:35 ` Sukadev Bhattiprolu
2009-07-01 23:27 ` Eric W. Biederman
1 sibling, 1 reply; 11+ messages in thread
From: Roland McGrath @ 2009-07-01 21:58 UTC (permalink / raw)
To: Sukadev Bhattiprolu
Cc: Oleg Nesterov, Eric W. Biederman, Oren Laadan, serue,
Alexey Dobriyan, Containers, linux-kernel
Yeah, that's fine. Since Oleg's pointed out that there is indeed no way to
avoid the leak in some global-init uses, it is fine to have a comment that
says that global init problems are part of the reason to outlaw this usage.
I just objected to a change that was really for container init but said it
was only to fix something different.
Thanks,
Roland
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC][PATCH] Disable CLONE_PARENT for init
2009-07-01 21:58 ` Roland McGrath
@ 2009-07-02 0:35 ` Sukadev Bhattiprolu
2009-07-02 0:49 ` Roland McGrath
2009-07-02 7:58 ` Oleg Nesterov
0 siblings, 2 replies; 11+ messages in thread
From: Sukadev Bhattiprolu @ 2009-07-02 0:35 UTC (permalink / raw)
To: Roland McGrath
Cc: Oleg Nesterov, Eric W. Biederman, Oren Laadan, serue,
Alexey Dobriyan, Containers, linux-kernel
Roland McGrath [roland@redhat.com] wrote:
| Yeah, that's fine. Since Oleg's pointed out that there is indeed no way to
| avoid the leak in some global-init uses, it is fine to have a comment that
| says that global init problems are part of the reason to outlaw this usage.
| I just objected to a change that was really for container init but said it
| was only to fix something different.
Ok. How about this comment:
---
Disable CLONE_PARENT for init.
When global or container-init processes use CLONE_PARENT, they create a
multi-rooted process tree. Besides siblings of global init remain as
zombies on exit since they are not reaped by their parent (swapper). So
prevent global and container-inits from creating siblings.
Changelog[v3]:
- [Roland, Oleg] Simplify comment describing the change
Changelog[v2]:
- Simplify patch description based on comments from Eric Biederman
and Oleg Nesterov.
- [Oleg Nesterov] Use SIGNAL_UNKILLABLE instead of is_global_init()
Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
kernel/fork.c | 10 ++++++++++
1 file changed, 10 insertions(+)
Index: linux-mmotm/kernel/fork.c
===================================================================
--- linux-mmotm.orig/kernel/fork.c 2009-06-30 23:01:06.000000000 -0700
+++ linux-mmotm/kernel/fork.c 2009-07-01 17:29:09.000000000 -0700
@@ -974,6 +974,16 @@ static struct task_struct *copy_process(
if ((clone_flags & CLONE_SIGHAND) && !(clone_flags & CLONE_VM))
return ERR_PTR(-EINVAL);
+ /*
+ * Siblings of global init remain as zombies on exit since they are
+ * not reaped by their parent (swapper). To solve this and to avoid
+ * multi-rooted process trees, prevent global and container-inits
+ * from creating siblings.
+ */
+ if ((clone_flags & CLONE_PARENT) &&
+ current->signal->flags & SIGNAL_UNKILLABLE)
+ return ERR_PTR(-EINVAL);
+
retval = security_task_create(clone_flags);
if (retval)
goto fork_out;
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC][PATCH] Disable CLONE_PARENT for init
2009-07-02 0:35 ` Sukadev Bhattiprolu
@ 2009-07-02 0:49 ` Roland McGrath
2009-07-02 7:58 ` Oleg Nesterov
1 sibling, 0 replies; 11+ messages in thread
From: Roland McGrath @ 2009-07-02 0:49 UTC (permalink / raw)
To: Sukadev Bhattiprolu
Cc: Oleg Nesterov, Eric W. Biederman, Oren Laadan, serue,
Alexey Dobriyan, Containers, linux-kernel
Looks good to me.
Acked-by: Roland McGrath <roland@redhat.com>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC][PATCH] Disable CLONE_PARENT for init
2009-07-02 0:35 ` Sukadev Bhattiprolu
2009-07-02 0:49 ` Roland McGrath
@ 2009-07-02 7:58 ` Oleg Nesterov
2009-07-02 12:36 ` Oleg Nesterov
1 sibling, 1 reply; 11+ messages in thread
From: Oleg Nesterov @ 2009-07-02 7:58 UTC (permalink / raw)
To: Sukadev Bhattiprolu
Cc: Roland McGrath, Eric W. Biederman, Oren Laadan, serue,
Alexey Dobriyan, Containers, linux-kernel
On 07/01, Sukadev Bhattiprolu wrote:
>
> --- linux-mmotm.orig/kernel/fork.c 2009-06-30 23:01:06.000000000 -0700
> +++ linux-mmotm/kernel/fork.c 2009-07-01 17:29:09.000000000 -0700
> @@ -974,6 +974,16 @@ static struct task_struct *copy_process(
> if ((clone_flags & CLONE_SIGHAND) && !(clone_flags & CLONE_VM))
> return ERR_PTR(-EINVAL);
>
> + /*
> + * Siblings of global init remain as zombies on exit since they are
> + * not reaped by their parent (swapper). To solve this and to avoid
> + * multi-rooted process trees, prevent global and container-inits
> + * from creating siblings.
> + */
> + if ((clone_flags & CLONE_PARENT) &&
> + current->signal->flags & SIGNAL_UNKILLABLE)
> + return ERR_PTR(-EINVAL);
Acked-by: Oleg Nesterov <oleg@redhat.com>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC][PATCH] Disable CLONE_PARENT for init
2009-07-02 7:58 ` Oleg Nesterov
@ 2009-07-02 12:36 ` Oleg Nesterov
0 siblings, 0 replies; 11+ messages in thread
From: Oleg Nesterov @ 2009-07-02 12:36 UTC (permalink / raw)
To: Sukadev Bhattiprolu
Cc: Roland McGrath, Eric W. Biederman, Oren Laadan, serue,
Alexey Dobriyan, Containers, linux-kernel
On 07/02, Oleg Nesterov wrote:
>
> On 07/01, Sukadev Bhattiprolu wrote:
> >
> > --- linux-mmotm.orig/kernel/fork.c 2009-06-30 23:01:06.000000000 -0700
> > +++ linux-mmotm/kernel/fork.c 2009-07-01 17:29:09.000000000 -0700
> > @@ -974,6 +974,16 @@ static struct task_struct *copy_process(
> > if ((clone_flags & CLONE_SIGHAND) && !(clone_flags & CLONE_VM))
> > return ERR_PTR(-EINVAL);
> >
> > + /*
> > + * Siblings of global init remain as zombies on exit since they are
> > + * not reaped by their parent (swapper). To solve this and to avoid
> > + * multi-rooted process trees, prevent global and container-inits
> > + * from creating siblings.
> > + */
> > + if ((clone_flags & CLONE_PARENT) &&
> > + current->signal->flags & SIGNAL_UNKILLABLE)
> > + return ERR_PTR(-EINVAL);
>
> Acked-by: Oleg Nesterov <oleg@redhat.com>
Thinking more, perhaps it makes sense to disallow CLONE_VM too.
If init forks CLONE_VM task, this task can be killed by
sig_kernel_coredump signal. In that case init will be killed too
and the kernel will crash.
But this is minor, we can trust the global init.
Oleg.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC][PATCH] Disable CLONE_PARENT for init
2009-07-01 21:48 ` Sukadev Bhattiprolu
2009-07-01 21:58 ` Roland McGrath
@ 2009-07-01 23:27 ` Eric W. Biederman
1 sibling, 0 replies; 11+ messages in thread
From: Eric W. Biederman @ 2009-07-01 23:27 UTC (permalink / raw)
To: Sukadev Bhattiprolu
Cc: Oleg Nesterov, Roland McGrath, Oren Laadan, serue,
Alexey Dobriyan, Containers, linux-kernel
Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> writes:
> | This won't fix the problem. The child won't autoreap itself if ->exit_signal
> | != SIGCHLD.
> |
> | > If you want to make this change because of container-init issues, I think
> | > you should just say so independent of this global-init case.
> |
> | Yes, agreed, the comment looks confusing.
> |
> | Oleg
>
> Here is an updated patch with comments fixed.
>
> Roland pls ack again if this is better.
>
> ---
>
> Disable CLONE_PARENT for init.
>
> When global or container-init processes use CLONE_PARENT, they create a
> multi-rooted process tree. Besides if the siblings of init exit, the
> SIGCHLD is not sent to init process resulting in the zombies sticking
> around indefinitely.
>
> Changelog[v3]:
> - [Roland, Oleg] Simplify comment describing the change
> Changelog[v2]:
> - Simplify patch description based on comments from Eric Biederman
> and Oleg Nesterov.
> - [Oleg Nesterov] Use SIGNAL_UNKILLABLE instead of is_global_init()
>
> Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
> kernel/fork.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> Index: linux-mmotm/kernel/fork.c
> ===================================================================
> --- linux-mmotm.orig/kernel/fork.c 2009-06-30 23:01:06.000000000 -0700
> +++ linux-mmotm/kernel/fork.c 2009-07-01 14:43:10.000000000 -0700
> @@ -974,6 +974,14 @@ static struct task_struct *copy_process(
> if ((clone_flags & CLONE_SIGHAND) && !(clone_flags & CLONE_VM))
> return ERR_PTR(-EINVAL);
>
> + /*
> + * To avoid multi-rooted process-trees prevent global and container
> + * inits from creating siblings.
> + */
> + if ((clone_flags & CLONE_PARENT) &&
> + current->signal->flags & SIGNAL_UNKILLABLE)
> + return ERR_PTR(-EINVAL);
> +
> retval = security_task_create(clone_flags);
> if (retval)
> goto fork_out;
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2009-07-02 12:39 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-07-01 7:31 [RFC][PATCH] Disable CLONE_PARENT for init Sukadev Bhattiprolu
2009-07-01 7:46 ` Roland McGrath
2009-07-01 8:01 ` Sukadev Bhattiprolu
2009-07-01 8:24 ` Oleg Nesterov
2009-07-01 21:48 ` Sukadev Bhattiprolu
2009-07-01 21:58 ` Roland McGrath
2009-07-02 0:35 ` Sukadev Bhattiprolu
2009-07-02 0:49 ` Roland McGrath
2009-07-02 7:58 ` Oleg Nesterov
2009-07-02 12:36 ` Oleg Nesterov
2009-07-01 23:27 ` Eric W. Biederman
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox