* [RFC][PATCH 0/2] CLONE_PARENT and pid namespaces
@ 2009-06-18 2:47 Sukadev Bhattiprolu
[not found] ` <20090618024743.GA31515-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 13+ messages in thread
From: Sukadev Bhattiprolu @ 2009-06-18 2:47 UTC (permalink / raw)
To: serue-r/Jw6+rmf7HQT0dZR+AlfA, Eric W. Biederman, Oleg Nesterov,
roland-H+wXaHxf7aLQT0dZR+AlfA, Pavel Emelyanov, Alexey
Cc: Containers, sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
David C. Hansen
It's not clear what the semantics should be when CLONE_PARENT is combined
with pid namespaces. Patches in this set prevent CLONE_PARENT with pid
namespaces, at least until we have a better understanding of what the
semantics should be.
The patches seem to compile/boot, but could use more testing...
^ permalink raw reply [flat|nested] 13+ messages in thread
* [RFC][PATCH 1/2] Deny CLONE_PARENT|CLONE_NEWPID combination
[not found] ` <20090618024743.GA31515-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
@ 2009-06-18 2:49 ` Sukadev Bhattiprolu
[not found] ` <20090618024934.GA31672-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-06-18 2:51 ` [RFC][PATCH 2/2] Prevent container-inits from using CLONE_PARENT Sukadev Bhattiprolu
1 sibling, 1 reply; 13+ messages in thread
From: Sukadev Bhattiprolu @ 2009-06-18 2:49 UTC (permalink / raw)
To: serue-r/Jw6+rmf7HQT0dZR+AlfA, Eric W. Biederman, Oleg Nesterov,
roland-H+wXaHxf7aLQT0dZR+AlfA, Pavel Emelyanov, Alexey
Cc: Containers, David C. Hansen
Deny CLONE_PARENT|CLONE_NEWPID combination.
CLONE_PARENT was probably used to implement an older threading model.
If so, for consistency with CLONE_THREAD, the CLONE_PARENT|CLONE_NEWPID
combination should also fail with -EINVAL.
Signed-off-by: Sukadev Bhattiprolu <sukadev-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
---
kernel/pid_namespace.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Index: linux-mmotm/kernel/pid_namespace.c
===================================================================
--- linux-mmotm.orig/kernel/pid_namespace.c 2009-06-17 18:19:42.000000000 -0700
+++ linux-mmotm/kernel/pid_namespace.c 2009-06-17 18:19:58.000000000 -0700
@@ -118,7 +118,7 @@ struct pid_namespace *copy_pid_ns(unsign
 {
 	if (!(flags & CLONE_NEWPID))
 		return get_pid_ns(old_ns);
-	if (flags & CLONE_THREAD)
+	if (flags & (CLONE_THREAD|CLONE_PARENT))
 		return ERR_PTR(-EINVAL);
 	return create_pid_namespace(old_ns);
 }
^ permalink raw reply [flat|nested] 13+ messages in thread
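[Editorial sketch] The flag test that patch 1/2 changes can be modelled in user space as below. This is only an illustration under stated assumptions: model_copy_pid_ns_check() is a hypothetical name, it returns 0 or -EINVAL instead of a namespace pointer, and the flag values are copied from the Linux UAPI headers.

```c
#include <assert.h>  /* assert(), used only by the usage checks */
#include <errno.h>

/* Flag values as defined by the Linux UAPI headers; guarded in case
 * <sched.h> has already supplied them. */
#ifndef CLONE_PARENT
#define CLONE_PARENT 0x00008000UL
#endif
#ifndef CLONE_THREAD
#define CLONE_THREAD 0x00010000UL
#endif
#ifndef CLONE_NEWPID
#define CLONE_NEWPID 0x20000000UL
#endif

/*
 * Hypothetical user-space model of the patched test in copy_pid_ns().
 * Returns 0 when the flag combination is accepted, -EINVAL when the
 * patched code would refuse to create a new pid namespace.
 */
int model_copy_pid_ns_check(unsigned long flags)
{
	/* No new pid namespace requested: nothing to reject here. */
	if (!(flags & CLONE_NEWPID))
		return 0;
	/* Patched rule: neither CLONE_THREAD nor CLONE_PARENT may be combined. */
	if (flags & (CLONE_THREAD | CLONE_PARENT))
		return -EINVAL;
	return 0;
}
```

With this model, CLONE_NEWPID on its own is accepted, CLONE_PARENT on its own is accepted, and either CLONE_NEWPID|CLONE_THREAD or CLONE_NEWPID|CLONE_PARENT yields -EINVAL.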
* [RFC][PATCH 2/2] Prevent container-inits from using CLONE_PARENT
[not found] ` <20090618024743.GA31515-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-06-18 2:49 ` [RFC][PATCH 1/2] Deny CLONE_PARENT|CLONE_NEWPID combination Sukadev Bhattiprolu
@ 2009-06-18 2:51 ` Sukadev Bhattiprolu
[not found] ` <20090618025103.GB31672-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
1 sibling, 1 reply; 13+ messages in thread
From: Sukadev Bhattiprolu @ 2009-06-18 2:51 UTC (permalink / raw)
To: serue-r/Jw6+rmf7HQT0dZR+AlfA, Eric W. Biederman, Oleg Nesterov,
roland-H+wXaHxf7aLQT0dZR+AlfA, Pavel Emelyanov, Alexey
Cc: Containers, David C. Hansen
Prevent container-inits from using CLONE_PARENT
If a container-init creates a sibling (using CLONE_PARENT), pid namespace
semantics become complicated:
- the "active pid namespace" of the sibling will be the descendant
container, but it's not obvious if that is correct.
- if container-init exits, it will terminate the sibling, but again
it's not clear if that is the correct behavior.
- the sibling exists in both parent and child containers while current
pid namespace semantics assume that only container-init can exist
in both parent/child containers.
- the parent of the sibling is not a descendant of container-init
(while pid namespaces assume that all processes in the container
are descendants of the container-init)
- When the sibling dies, the SIGCHLD is sent to its parent (if
alive), i.e. the signal escapes the container to a parent container.
(if the parent of the sibling exits, the container-init then becomes
the reaper of the sibling).
To keep pid namespace semantics simple, prevent container-inits from using
CLONE_PARENT at least until we have a better understanding of CLONE_PARENT
and pid-namespace interactions.
Untested, RFC patch :-)
Signed-off-by: Sukadev Bhattiprolu <sukadev-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
---
kernel/fork.c | 8 ++++++++
1 file changed, 8 insertions(+)
Index: linux-mmotm/kernel/fork.c
===================================================================
--- linux-mmotm.orig/kernel/fork.c 2009-06-17 18:23:23.000000000 -0700
+++ linux-mmotm/kernel/fork.c 2009-06-17 19:17:54.000000000 -0700
@@ -974,6 +974,14 @@ static struct task_struct *copy_process(
 	if ((clone_flags & CLONE_SIGHAND) && !(clone_flags & CLONE_VM))
 		return ERR_PTR(-EINVAL);
 
+	/*
+	 * To keep pid namespace semantics simple, prevent container-inits
+	 * from creating siblings.
+	 */
+	if ((clone_flags & CLONE_PARENT) &&
+	    is_container_init(current) && !is_global_init(current))
+		return ERR_PTR(-EINVAL);
+
 	retval = security_task_create(clone_flags);
 	if (retval)
 		goto fork_out;
^ permalink raw reply [flat|nested] 13+ messages in thread
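[Editorial sketch] The check added by patch 2/2 can be modelled in user space roughly as follows. Everything named model_* is hypothetical. The struct only captures the two pid numbers the check depends on, on the assumption that is_global_init() compares the global pid with 1 and is_container_init() compares the pid in the task's own namespace with 1.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#ifndef CLONE_PARENT
#define CLONE_PARENT 0x00008000UL  /* value from the Linux UAPI headers */
#endif

/* Hypothetical stand-in for the two pid numbers the check looks at. */
struct model_task {
	int global_pid;  /* pid as seen from the initial pid namespace */
	int ns_pid;      /* pid as seen from the task's own pid namespace */
};

int model_is_global_init(const struct model_task *t)
{
	return t->global_pid == 1;
}

int model_is_container_init(const struct model_task *t)
{
	return t->ns_pid == 1;
}

/* Model of the proposed test: 0 if the clone may proceed, else -EINVAL. */
int model_clone_parent_check(unsigned long clone_flags,
			     const struct model_task *cur)
{
	assert(cur != NULL);
	if ((clone_flags & CLONE_PARENT) &&
	    model_is_container_init(cur) && !model_is_global_init(cur))
		return -EINVAL;
	return 0;
}
```

In this model only a container-init (ns_pid 1, global_pid other than 1) is refused. The global init and ordinary tasks may still pass CLONE_PARENT, which is the asymmetry the replies in this thread question.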
* Re: [RFC][PATCH 2/2] Prevent container-inits from using CLONE_PARENT
[not found] ` <20090618025103.GB31672-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
@ 2009-06-18 3:20 ` Eric W. Biederman
[not found] ` <m18wjqkz2i.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
2009-06-18 3:28 ` Roland McGrath
2009-06-18 15:35 ` Oleg Nesterov
2 siblings, 1 reply; 13+ messages in thread
From: Eric W. Biederman @ 2009-06-18 3:20 UTC (permalink / raw)
To: Sukadev Bhattiprolu
Cc: Containers, David C. Hansen, Oleg Nesterov, Alexey Dobriyan,
roland-H+wXaHxf7aLQT0dZR+AlfA, Pavel Emelyanov
Sukadev Bhattiprolu <sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> writes:
> Prevent container-inits from using CLONE_PARENT
>
> If a container-init creates a sibling (using CLONE_PARENT), pid namespace
> semantics become complicated:
>
> - the "active pid namespace" of the sibling will be the descendant
> container, but its not obvious if that is correct.
It is correct: the sibling must not change pid namespaces. You are not
allowed to escape out of a pid namespace.
> - if container-init exits, it will terminate the sibling, but again
> its not clear if that is the correct behavior.
Again, correct, because the container-init is the child reaper for the pid namespace.
No reaper, no namespace.
> - the sibling exists in both parent and child containers while current
> pid namespace semantics assume that only container-init can exist
> in both parent/child containers.
All tasks in the container also exist in the parent container.
What assumption are you talking about?
> - the parent of the sibling is not a descendant of container-init
> (while pid namespaces assume that all processes in the container
> are descendants of the container-init)
User space assumes that certainly. What part of the pid namespace
code makes such an assumption?
> - When the sibling dies, the SIGCHLD is sent to its parent (if
> alive), i.e the signal escapes the container to a parent container.
> (if the parent of the sibling exits, the container-init then becomes
> the reaper of the sibling).
Yes.
> To keep pid namespace semantics simple, prevent container-inits from using
> CLONE_PARENT at least until we have a better understanding of CLONE_PARENT
> and pid-namespace interactions.
The only argument that I can see that carries any weight is that unix
semantics fundamentally assume a process tree. Allowing init to use
CLONE_PARENT creates a multi-rooted process tree.
At which point the is_global_init check is foolish.
Eric
> Untested, RFC patch :-)
>
> Signed-off-by: Sukadev Bhattiprolu <sukadev-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
> ---
> kernel/fork.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> Index: linux-mmotm/kernel/fork.c
> ===================================================================
> --- linux-mmotm.orig/kernel/fork.c 2009-06-17 18:23:23.000000000 -0700
> +++ linux-mmotm/kernel/fork.c 2009-06-17 19:17:54.000000000 -0700
> @@ -974,6 +974,14 @@ static struct task_struct *copy_process(
> if ((clone_flags & CLONE_SIGHAND) && !(clone_flags & CLONE_VM))
> return ERR_PTR(-EINVAL);
>
> + /*
> + * To keep pid namespace semantics simple, prevent container-inits
> + * from creating siblings.
> + */
> + if ((clone_flags & CLONE_PARENT) &&
> + is_container_init(current) && !is_global_init(current))
> + return ERR_PTR(-EINVAL);
> +
> retval = security_task_create(clone_flags);
> if (retval)
> goto fork_out;
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC][PATCH 1/2] Deny CLONE_PARENT|CLONE_NEWPID combination
[not found] ` <20090618024934.GA31672-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
@ 2009-06-18 3:26 ` Roland McGrath
2009-06-18 3:30 ` Eric W. Biederman
1 sibling, 0 replies; 13+ messages in thread
From: Roland McGrath @ 2009-06-18 3:26 UTC (permalink / raw)
To: Sukadev Bhattiprolu
Cc: Containers, David C. Hansen, Oleg Nesterov, Eric W. Biederman,
Alexey Dobriyan, Pavel Emelyanov
NPTL has never used CLONE_PARENT, so this is fine.
Acked-by: Roland McGrath <roland-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Thanks,
Roland
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC][PATCH 2/2] Prevent container-inits from using CLONE_PARENT
[not found] ` <20090618025103.GB31672-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-06-18 3:20 ` Eric W. Biederman
@ 2009-06-18 3:28 ` Roland McGrath
2009-06-18 15:35 ` Oleg Nesterov
2 siblings, 0 replies; 13+ messages in thread
From: Roland McGrath @ 2009-06-18 3:28 UTC (permalink / raw)
To: Sukadev Bhattiprolu
Cc: Containers, David C. Hansen, Oleg Nesterov, Eric W. Biederman,
Alexey Dobriyan, Pavel Emelyanov
Looks sensible to me.
Acked-by: Roland McGrath <roland-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Thanks,
Roland
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC][PATCH 1/2] Deny CLONE_PARENT|CLONE_NEWPID combination
[not found] ` <20090618024934.GA31672-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-06-18 3:26 ` Roland McGrath
@ 2009-06-18 3:30 ` Eric W. Biederman
[not found] ` <m1d492jk0k.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
1 sibling, 1 reply; 13+ messages in thread
From: Eric W. Biederman @ 2009-06-18 3:30 UTC (permalink / raw)
To: Sukadev Bhattiprolu
Cc: Containers, David C. Hansen, Oleg Nesterov, Alexey Dobriyan,
roland-H+wXaHxf7aLQT0dZR+AlfA, Pavel Emelyanov
Sukadev Bhattiprolu <sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> writes:
> Deny CLONE_PARENT|CLONE_NEWPID combination.
>
> CLONE_PARENT was probably used to implement an older threading model.
Yes it was.
> If so, for consistency with CLONE_THREAD, the CLONE_PARENT|CLONE_NEWPID
> combination should also fail with -EINVAL.
CLONE_THREAD can not work with CLONE_NEWPID because the processes share
a signal queue.
I can see a similar argument going for CLONE_SIGHAND even though there is not
as much sharing there. I don't see how CLONE_PARENT could cause any harm
without CLONE_SIGHAND.
Eric
> Signed-off-by: Sukadev Bhattiprolu <sukadev-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
> ---
> kernel/pid_namespace.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> Index: linux-mmotm/kernel/pid_namespace.c
> ===================================================================
> --- linux-mmotm.orig/kernel/pid_namespace.c 2009-06-17 18:19:42.000000000 -0700
> +++ linux-mmotm/kernel/pid_namespace.c 2009-06-17 18:19:58.000000000 -0700
> @@ -118,7 +118,7 @@ struct pid_namespace *copy_pid_ns(unsign
> {
> if (!(flags & CLONE_NEWPID))
> return get_pid_ns(old_ns);
> - if (flags & CLONE_THREAD)
> + if (flags & (CLONE_THREAD|CLONE_PARENT))
> return ERR_PTR(-EINVAL);
> return create_pid_namespace(old_ns);
> }
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC][PATCH 2/2] Prevent container-inits from using CLONE_PARENT
[not found] ` <20090618025103.GB31672-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-06-18 3:20 ` Eric W. Biederman
2009-06-18 3:28 ` Roland McGrath
@ 2009-06-18 15:35 ` Oleg Nesterov
[not found] ` <20090618153501.GA6404-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2 siblings, 1 reply; 13+ messages in thread
From: Oleg Nesterov @ 2009-06-18 15:35 UTC (permalink / raw)
To: Sukadev Bhattiprolu
Cc: Containers, David C. Hansen, Eric W. Biederman, Alexey Dobriyan,
roland-H+wXaHxf7aLQT0dZR+AlfA, Pavel Emelyanov
On 06/17, Sukadev Bhattiprolu wrote:
>
> Prevent container-inits from using CLONE_PARENT
>
> If a container-init creates a sibling (using CLONE_PARENT), pid namespace
> semantics become complicated:
>
> - the "active pid namespace" of the sibling will be the descendant
> container, but its not obvious if that is correct.
>
> - if container-init exits, it will terminate the sibling, but again
> its not clear if that is the correct behavior.
>
> - the sibling exists in both parent and child containers while current
> pid namespace semantics assume that only container-init can exist
> in both parent/child containers.
>
> - the parent of the sibling is not a descendant of container-init
> (while pid namespaces assume that all processes in the container
> are descendants of the container-init)
I agree, this is all a bit strange and perhaps should be fixed. But afaics,
nothing bad can happen? I mean, if the sub-namespace does stupid things,
it can't do any harm to the parent namespace? Or did I miss something?
> - When the sibling dies, the SIGCHLD is sent to its parent (if
> alive), i.e the signal escapes the container to a parent container.
The same happens if container-init exits: we send SIGCHLD up. But yes, I agree,
this is a bit strange.
> (if the parent of the sibling exits, the container-init then becomes
> the reaper of the sibling).
Again, strange but harmless.
> To keep pid namespace semantics simple, prevent container-inits from using
> CLONE_PARENT at least until we have a better understanding of CLONE_PARENT
> and pid-namespace interactions.
Yes, perhaps makes sense.
> --- linux-mmotm.orig/kernel/fork.c 2009-06-17 18:23:23.000000000 -0700
> +++ linux-mmotm/kernel/fork.c 2009-06-17 19:17:54.000000000 -0700
> @@ -974,6 +974,14 @@ static struct task_struct *copy_process(
> if ((clone_flags & CLONE_SIGHAND) && !(clone_flags & CLONE_VM))
> return ERR_PTR(-EINVAL);
>
> + /*
> + * To keep pid namespace semantics simple, prevent container-inits
> + * from creating siblings.
> + */
> + if ((clone_flags & CLONE_PARENT) &&
> + is_container_init(current) && !is_global_init(current))
Both is_ checks are not right afaics. They are per-thread. This means
that container-init can do clone(CLONE_THREAD), and then this thread
does CLONE_PARENT and fools copy_process().
As for !is_global_init(): I never understood what we should do if the
global init does CLONE_PARENT. This attaches another process to swapper,
which is not good.
Oleg.
^ permalink raw reply [flat|nested] 13+ messages in thread
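[Editorial sketch] The per-thread problem described above can be illustrated with a small user-space model. All names are hypothetical. The thread-group variant is an assumption about what a tgid-based check might look like (Sukadev's later reply in this thread suggests checking the tgid), not code from either patch. The global-init exception is left out for brevity.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#ifndef CLONE_PARENT
#define CLONE_PARENT 0x00008000UL  /* value from the Linux UAPI headers */
#endif

/* Hypothetical view of one thread inside a pid namespace. */
struct model_thread {
	int ns_pid;   /* this thread's own id in its pid namespace */
	int ns_tgid;  /* id of its thread-group leader in that namespace */
};

/* Per-thread test, as in the posted patch: only the leader has id 1. */
int model_per_thread_check(unsigned long flags, const struct model_thread *cur)
{
	assert(cur != NULL);
	if ((flags & CLONE_PARENT) && cur->ns_pid == 1)
		return -EINVAL;
	return 0;
}

/* Assumed alternative: test the thread group instead of the thread. */
int model_thread_group_check(unsigned long flags, const struct model_thread *cur)
{
	assert(cur != NULL);
	if ((flags & CLONE_PARENT) && cur->ns_tgid == 1)
		return -EINVAL;
	return 0;
}
```

For a helper thread created by container-init with CLONE_THREAD (for example ns_pid 2, ns_tgid 1), the per-thread model returns 0, so CLONE_PARENT slips through, while the thread-group model returns -EINVAL.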
* Re: [RFC][PATCH 2/2] Prevent container-inits from using CLONE_PARENT
2009-06-18 15:35 ` Oleg Nesterov
@ 2009-06-18 15:42 ` Oleg Nesterov
0 siblings, 0 replies; 13+ messages in thread
From: Oleg Nesterov @ 2009-06-18 15:42 UTC (permalink / raw)
To: Sukadev Bhattiprolu
Cc: Containers, David C. Hansen, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
Eric W. Biederman, Alexey Dobriyan, roland-H+wXaHxf7aLQT0dZR+AlfA,
Pavel Emelyanov
On 06/18, Oleg Nesterov wrote:
>
> On 06/17, Sukadev Bhattiprolu wrote:
> >
> > @@ -974,6 +974,14 @@ static struct task_struct *copy_process(
> > if ((clone_flags & CLONE_SIGHAND) && !(clone_flags & CLONE_VM))
> > return ERR_PTR(-EINVAL);
> >
> > + /*
> > + * To keep pid namespace semantics simple, prevent container-inits
> > + * from creating siblings.
> > + */
> > + if ((clone_flags & CLONE_PARENT) &&
> > + is_container_init(current) && !is_global_init(current))
>
> Both is_ checks are not right afaics. There are per-thread. This means
> that container-init can do clone(CLONE_THREAD), and then this thread
> does CLONE_PARENT and fools copy_process().
>
> As for !is_global_init(). I never understood what should we do if the
> global init does CLONE_PARENT, this attaches another process to swapper,
> not good.
Hmm. And idle threads run with ->action[SIGCHLD] == SIG_DFL, so this is
really wrong. Fortunately, we can trust the global init.
Oleg.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC][PATCH 1/2] Deny CLONE_PARENT|CLONE_NEWPID combination
[not found] ` <m1d492jk0k.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
@ 2009-06-18 22:28 ` Sukadev Bhattiprolu
0 siblings, 0 replies; 13+ messages in thread
From: Sukadev Bhattiprolu @ 2009-06-18 22:28 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Containers, David C. Hansen, Oleg Nesterov, Alexey Dobriyan,
roland-H+wXaHxf7aLQT0dZR+AlfA, Pavel Emelyanov
Eric W. Biederman [ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org] wrote:
| Sukadev Bhattiprolu <sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> writes:
|
| > Deny CLONE_PARENT|CLONE_NEWPID combination.
| >
| > CLONE_PARENT was probably used to implement an older threading model.
|
| Yes it was.
|
| > If so, for consistency with CLONE_THREAD, the CLONE_PARENT|CLONE_NEWPID
| > combination should also fail with -EINVAL.
|
| CLONE_THREAD can not work with CLONE_NEWPID because the processes share
| a signal queue.
|
| I can see a similar argument going for CLONE_SIGHAND even though there is not
| as much sharing there. I don't see how CLONE_PARENT could cause any harm.
| Without CLONE_SIGHAND.
It does not cause any harm. The only reasons to disable CLONE_PARENT, at least
for now, are the confusing semantics (from the user's point of view), the
process-tree model, and the questionable usefulness (if CLONE_PARENT is only
used in the old threading model, the need for such an application to act as
container-init is not clear).
Should we disable CLONE_SIGHAND in addition to CLONE_PARENT, or just
CLONE_SIGHAND?
^ permalink raw reply [flat|nested] 13+ messages in thread
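[Editorial sketch] One way to reason about the CLONE_SIGHAND question is to model the flag dependencies in user space. The names below are hypothetical and the CLONE_SIGHAND|CLONE_NEWPID rule is only the option being discussed, not something either patch implements. The model relies on two existing copy_process() rules: CLONE_THREAD requires CLONE_SIGHAND, and CLONE_SIGHAND requires CLONE_VM.

```c
#include <assert.h>
#include <errno.h>

/* Flag values from the Linux UAPI headers, guarded against redefinition. */
#ifndef CLONE_VM
#define CLONE_VM      0x00000100UL
#endif
#ifndef CLONE_SIGHAND
#define CLONE_SIGHAND 0x00000800UL
#endif
#ifndef CLONE_PARENT
#define CLONE_PARENT  0x00008000UL
#endif
#ifndef CLONE_THREAD
#define CLONE_THREAD  0x00010000UL
#endif
#ifndef CLONE_NEWPID
#define CLONE_NEWPID  0x20000000UL
#endif

/* Existing dependency rules: THREAD needs SIGHAND, SIGHAND needs VM. */
int model_existing_checks(unsigned long f)
{
	if ((f & CLONE_THREAD) && !(f & CLONE_SIGHAND))
		return -EINVAL;
	if ((f & CLONE_SIGHAND) && !(f & CLONE_VM))
		return -EINVAL;
	return 0;
}

/* Hypothetical rule: refuse CLONE_SIGHAND together with CLONE_NEWPID. */
int model_sighand_newpid_check(unsigned long f)
{
	int err = model_existing_checks(f);

	if (err)
		return err;
	if ((f & CLONE_NEWPID) && (f & CLONE_SIGHAND))
		return -EINVAL;
	return 0;
}
```

Because any valid CLONE_THREAD request already carries CLONE_SIGHAND, this hypothetical rule also rejects CLONE_THREAD|CLONE_NEWPID, while CLONE_PARENT|CLONE_NEWPID without CLONE_SIGHAND stays allowed.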
* Re: [RFC][PATCH 2/2] Prevent container-inits from using CLONE_PARENT
[not found] ` <m18wjqkz2i.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
@ 2009-06-18 22:40 ` Sukadev Bhattiprolu
0 siblings, 0 replies; 13+ messages in thread
From: Sukadev Bhattiprolu @ 2009-06-18 22:40 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Containers, David C. Hansen, Oleg Nesterov, Alexey Dobriyan,
roland-H+wXaHxf7aLQT0dZR+AlfA, Pavel Emelyanov
Eric W. Biederman [ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org] wrote:
| Sukadev Bhattiprolu <sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> writes:
|
| > Prevent container-inits from using CLONE_PARENT
| >
| > If a container-init creates a sibling (using CLONE_PARENT), pid namespace
| > semantics become complicated:
| >
| > - the "active pid namespace" of the sibling will be the descendant
| > container, but its not obvious if that is correct.
|
| It is correct the sibling must not change pid namespaces. You are not
| allowed to escape out of a pid namespace.
|
| > - if container-init exits, it will terminate the sibling, but again
| > its not clear if that is the correct behavior.
|
| Again correct because the container-init is the child reaper for the pid namespace.
| No reaper no namespace.
|
| > - the sibling exists in both parent and child containers while current
| > pid namespace semantics assume that only container-init can exist
| > in both parent/child containers.
|
| All tasks in the container also exist in the parent container.
| What assumption are you talking about?
You are right, that's not really different for CLONE_PARENT.
|
| > - the parent of the sibling is not a descendant of container-init
| > (while pid namespaces assume that all processes in the container
| > are descendants of the container-init)
|
| User space assumes that certainly. What part of the pid namespace
| code makes such an assumption?
I was referring only to user-space view.
|
| > - When the sibling dies, the SIGCHLD is sent to its parent (if
| > alive), i.e the signal escapes the container to a parent container.
| > (if the parent of the sibling exits, the container-init then becomes
| > the reaper of the sibling).
|
| Yes.
|
| > To keep pid namespace semantics simple, prevent container-inits from using
| > CLONE_PARENT at least until we have a better understanding of CLONE_PARENT
| > and pid-namespace interactions.
|
| The only argument that I can see that carries any weight is that unix
| semantics fundamentally assume a process tree. Allowing init to use
| CLONE_PARENT creates a multi-rooted process tree.
Right.
|
| At which point the is_global_init check is foolish.
Well, I was trying to disable CLONE_PARENT just with pid namespaces.
Disabling CLONE_PARENT for global init seemed independent of namespaces,
and there was recent talk of potential users of CLONE_PARENT, so I am
not sure whether there is an init that uses the old threading model!
I don't have a convincing reason besides "let's enable it when the
uses/semantics for CLONE_PARENT with pid namespaces are clear".
|
| Eric
|
|
| > Untested, RFC patch :-)
| >
| > Signed-off-by: Sukadev Bhattiprolu <sukadev-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
| > ---
| > kernel/fork.c | 8 ++++++++
| > 1 file changed, 8 insertions(+)
| >
| > Index: linux-mmotm/kernel/fork.c
| > ===================================================================
| > --- linux-mmotm.orig/kernel/fork.c 2009-06-17 18:23:23.000000000 -0700
| > +++ linux-mmotm/kernel/fork.c 2009-06-17 19:17:54.000000000 -0700
| > @@ -974,6 +974,14 @@ static struct task_struct *copy_process(
| > if ((clone_flags & CLONE_SIGHAND) && !(clone_flags & CLONE_VM))
| > return ERR_PTR(-EINVAL);
| >
| > + /*
| > + * To keep pid namespace semantics simple, prevent container-inits
| > + * from creating siblings.
| > + */
| > + if ((clone_flags & CLONE_PARENT) &&
| > + is_container_init(current) && !is_global_init(current))
| > + return ERR_PTR(-EINVAL);
| > +
| > retval = security_task_create(clone_flags);
| > if (retval)
| > goto fork_out;
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC][PATCH 2/2] Prevent container-inits from using CLONE_PARENT
[not found] ` <20090618153501.GA6404-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2009-06-18 15:42 ` Oleg Nesterov
@ 2009-06-18 22:52 ` Sukadev Bhattiprolu
1 sibling, 0 replies; 13+ messages in thread
From: Sukadev Bhattiprolu @ 2009-06-18 22:52 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Containers, David C. Hansen, Eric W. Biederman, Alexey Dobriyan,
roland-H+wXaHxf7aLQT0dZR+AlfA, Pavel Emelyanov
| Again, strange but harmless.
Right, nothing is broken; I am just trying to disable the strangeness until
the usage becomes clear.
|
| > To keep pid namespace semantics simple, prevent container-inits from using
| > CLONE_PARENT at least until we have a better understanding of CLONE_PARENT
| > and pid-namespace interactions.
|
| Yes, perhaps makes sense.
|
| > --- linux-mmotm.orig/kernel/fork.c 2009-06-17 18:23:23.000000000 -0700
| > +++ linux-mmotm/kernel/fork.c 2009-06-17 19:17:54.000000000 -0700
| > @@ -974,6 +974,14 @@ static struct task_struct *copy_process(
| > if ((clone_flags & CLONE_SIGHAND) && !(clone_flags & CLONE_VM))
| > return ERR_PTR(-EINVAL);
| >
| > + /*
| > + * To keep pid namespace semantics simple, prevent container-inits
| > + * from creating siblings.
| > + */
| > + if ((clone_flags & CLONE_PARENT) &&
| > + is_container_init(current) && !is_global_init(current))
|
| Both is_ checks are not right afaics. There are per-thread. This means
| that container-init can do clone(CLONE_THREAD), and then this thread
| does CLONE_PARENT and fools copy_process().
Good point. Should check the tgid.
|
| As for !is_global_init(). I never understood what should we do if the
| global init does CLONE_PARENT, this attaches another process to swapper,
| not good.
Agreed. As I replied to Eric, I just was not sure whether there were any
existing users of CLONE_PARENT :-(
I could make a separate patch in this set to just disable CLONE_PARENT for
global init.
^ permalink raw reply [flat|nested] 13+ messages in thread