* [PATCH 2/2] fix sys_unshare()+SEM_UNDO: perform an implicit CLONE_SYSVSEM in CLONE_NEWIPC
@ 2008-04-13 8:09 Manfred Spraul
2008-04-14 15:00 ` Serge E. Hallyn
0 siblings, 1 reply; 5+ messages in thread
From: Manfred Spraul @ 2008-04-13 8:09 UTC (permalink / raw)
To: Andrew Morton
Cc: Linux Kernel Mailing List, Serge E. Hallyn, Eric W. Biederman,
Pavel Emelyanov, Sukadev Bhattiprolu
sys_unshare(CLONE_NEWIPC) doesn't handle the undo lists properly, this can
cause a kernel memory corruption. CLONE_NEWIPC must detach from the existing
undo lists.
Fix, part 2: perform an implicit CLONE_SYSVSEM in CLONE_NEWIPC.
CLONE_NEWIPC creates a new IPC namespace, the task cannot access the
existing semaphore arrays after the unshare syscall. Thus the task
can/must detach from the existing undo list entries, too.
This fixes the kernel corruption, because it makes it impossible that
undo records from two different namespaces are in sysvsem.undo_list.
I'm not sure who's the right maintainer, Andrew, could you merge it?
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
---
kernel/fork.c | 6 +++++-
1 files changed, 5 insertions(+), 1 deletions(-)
diff --git a/kernel/fork.c b/kernel/fork.c
index 7f242b0..6f956a7 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1735,9 +1735,13 @@ asmlinkage long sys_unshare(unsigned long unshare_flags)
goto bad_unshare_cleanup_semundo;
if (new_fs || new_mm || new_fd || new_ulist || new_nsproxy) {
- if (unshare_flags & CLONE_SYSVSEM) {
+ if (unshare_flags & (CLONE_NEWIPC|CLONE_SYSVSEM)) {
/*
* CLONE_SYSVSEM is equivalent to sys_exit().
+ *
+ * CLONE_NEWIPC must also detach from the undolist: after switching
+ * to a new ipc namespace, the semaphore arrays from the old
+ * namespace are unreachable.
*/
exit_sem(current);
}
--
1.5.4.1
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH 2/2] fix sys_unshare()+SEM_UNDO: perform an implicit CLONE_SYSVSEM in CLONE_NEWIPC
2008-04-13 8:09 [PATCH 2/2] fix sys_unshare()+SEM_UNDO: perform an implicit CLONE_SYSVSEM in CLONE_NEWIPC Manfred Spraul
@ 2008-04-14 15:00 ` Serge E. Hallyn
2008-04-14 18:40 ` Manfred Spraul
0 siblings, 1 reply; 5+ messages in thread
From: Serge E. Hallyn @ 2008-04-14 15:00 UTC (permalink / raw)
To: Manfred Spraul
Cc: Andrew Morton, Linux Kernel Mailing List, Serge E. Hallyn,
Eric W. Biederman, Pavel Emelyanov, Sukadev Bhattiprolu
Quoting Manfred Spraul (manfred@colorfullife.com):
> sys_unshare(CLONE_NEWIPC) doesn't handle the undo lists properly, this can
> cause a kernel memory corruption. CLONE_NEWIPC must detach from the existing
> undo lists.
> Fix, part 2: perform an implicit CLONE_SYSVSEM in CLONE_NEWIPC.
> CLONE_NEWIPC creates a new IPC namespace, the task cannot access the
> existing semaphore arrays after the unshare syscall. Thus the task
> can/must detach from the existing undo list entries, too.
>
> This fixes the kernel corruption, because it makes it impossible that
> undo records from two different namespaces are in sysvsem.undo_list.
So this was never an issue with clone(CLONE_NEWIPC|CLONE_SYSVSEM), which
should have had the same result as unshare(CLONE_NEWIPC&~CLONE_SYSVSEM)?
> I'm not sure who's the right maintainer, Andrew, could you merge it?
>
> Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
> ---
> kernel/fork.c | 6 +++++-
> 1 files changed, 5 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 7f242b0..6f956a7 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -1735,9 +1735,13 @@ asmlinkage long sys_unshare(unsigned long unshare_flags)
> goto bad_unshare_cleanup_semundo;
>
> if (new_fs || new_mm || new_fd || new_ulist || new_nsproxy) {
> - if (unshare_flags & CLONE_SYSVSEM) {
> + if (unshare_flags & (CLONE_NEWIPC|CLONE_SYSVSEM)) {
> /*
> * CLONE_SYSVSEM is equivalent to sys_exit().
> + *
> + * CLONE_NEWIPC must also detach from the undolist: after switching
> + * to a new ipc namespace, the semaphore arrays from the old
> + * namespace are unreachable.
> */
> exit_sem(current);
> }
> --
> 1.5.4.1
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 2/2] fix sys_unshare()+SEM_UNDO: perform an implicit CLONE_SYSVSEM in CLONE_NEWIPC
2008-04-14 15:00 ` Serge E. Hallyn
@ 2008-04-14 18:40 ` Manfred Spraul
2008-04-14 19:23 ` Serge E. Hallyn
0 siblings, 1 reply; 5+ messages in thread
From: Manfred Spraul @ 2008-04-14 18:40 UTC (permalink / raw)
To: Serge E. Hallyn
Cc: Andrew Morton, Linux Kernel Mailing List, Eric W. Biederman,
Pavel Emelyanov, Sukadev Bhattiprolu
Serge E. Hallyn wrote:
> Quoting Manfred Spraul (manfred@colorfullife.com):
>
>> sys_unshare(CLONE_NEWIPC) doesn't handle the undo lists properly, this can
>> cause a kernel memory corruption. CLONE_NEWIPC must detach from the existing
>> undo lists.
>> Fix, part 2: perform an implicit CLONE_SYSVSEM in CLONE_NEWIPC.
>> CLONE_NEWIPC creates a new IPC namespace, the task cannot access the
>> existing semaphore arrays after the unshare syscall. Thus the task
>> can/must detach from the existing undo list entries, too.
>>
>> This fixes the kernel corruption, because it makes it impossible that
>> undo records from two different namespaces are in sysvsem.undo_list.
>>
>
> So this was never an issue with clone(CLONE_NEWIPC|CLONE_SYSVSEM), which
> should have had the same result as unshare(CLONE_NEWIPC&~CLONE_SYSVSEM)?
>
>
Actually, the story is slightly different:
unshare(CLONE_NEWIPC|CLONE_SYSVSEM) returns -EINVAL right now.
Thus all apps right now call unshare(CLONE_NEWIPC|&~CLONE_SYSVSEM).
This combination doesn't make much sense. Even worse - it easily causes
a kernel oops.
Thus my fix is twofold:
- add support for unshare(CLONE_SYSVSEM).
- implicitely add CLONE_SYSVSEM to all calls that set CLONE_NEWIPC.
It's not really pretty: If a pivot_namespace syscall is ever added, then
CLONE_NEWIPC&~CLONE_SYSVSEM would make sense again.
What do you think? Can we break backward compatibility and add
if ( (unshare_flags & CLONE_NEWIPC) && !(unshare_flags &
CLONE_SYSVSEM) )
return -EINVAL;
into sys_unshare()?
I have decided against that, it breaks the current ABI.
And we gain virtually nothing - most if not all unshare users will be
single threaded apps that do not use sysvsem at all, and even most
sysvsem users do not use SEM_UNDO.
--
Manfred
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 2/2] fix sys_unshare()+SEM_UNDO: perform an implicit CLONE_SYSVSEM in CLONE_NEWIPC
2008-04-14 18:40 ` Manfred Spraul
@ 2008-04-14 19:23 ` Serge E. Hallyn
2008-04-15 18:42 ` Manfred Spraul
0 siblings, 1 reply; 5+ messages in thread
From: Serge E. Hallyn @ 2008-04-14 19:23 UTC (permalink / raw)
To: Manfred Spraul
Cc: Serge E. Hallyn, Andrew Morton, Linux Kernel Mailing List,
Eric W. Biederman, Pavel Emelyanov, Sukadev Bhattiprolu
Quoting Manfred Spraul (manfred@colorfullife.com):
> Serge E. Hallyn wrote:
>> Quoting Manfred Spraul (manfred@colorfullife.com):
>>
>>> sys_unshare(CLONE_NEWIPC) doesn't handle the undo lists properly, this
>>> can
>>> cause a kernel memory corruption. CLONE_NEWIPC must detach from the
>>> existing
>>> undo lists.
>>> Fix, part 2: perform an implicit CLONE_SYSVSEM in CLONE_NEWIPC.
>>> CLONE_NEWIPC creates a new IPC namespace, the task cannot access the
>>> existing semaphore arrays after the unshare syscall. Thus the task
>>> can/must detach from the existing undo list entries, too.
>>>
>>> This fixes the kernel corruption, because it makes it impossible that
>>> undo records from two different namespaces are in sysvsem.undo_list.
>>>
>>
>> So this was never an issue with clone(CLONE_NEWIPC|CLONE_SYSVSEM), which
>> should have had the same result as unshare(CLONE_NEWIPC&~CLONE_SYSVSEM)?
>>
>>
> Actually, the story is slightly different:
> unshare(CLONE_NEWIPC|CLONE_SYSVSEM) returns -EINVAL right now.
Right, but it's feasible with clone (where CLONE_SYSVSEM has the
opposite meaning of with unshare) right? According to the
comment, CLONE_SYSVSEM was disabled mainly out of fear that it
would be confusing to enable it since it does have the inverse meaning
in unshare as in clone.
> Thus all apps right now call unshare(CLONE_NEWIPC|&~CLONE_SYSVSEM).
> This combination doesn't make much sense. Even worse - it easily causes a
> kernel oops.
> Thus my fix is twofold:
> - add support for unshare(CLONE_SYSVSEM).
> - implicitely add CLONE_SYSVSEM to all calls that set CLONE_NEWIPC.
Right but your fix ignores the fact that you can achieve the same
result as unshare(CLONE_NEWIPC&~CLONE_SYSVSEM) by doing
clone(CLONE_NEWIPC|CLONE_SYSVSEM).
> It's not really pretty: If a pivot_namespace syscall is ever added, then
> CLONE_NEWIPC&~CLONE_SYSVSEM would make sense again.
> What do you think? Can we break backward compatibility and add
> if ( (unshare_flags & CLONE_NEWIPC) && !(unshare_flags & CLONE_SYSVSEM)
> )
> return -EINVAL;
> into sys_unshare()?
I don't think we need to, I think what you're doing makes perfect sense.
If you asked for CLONE_NEWIPC, then you wanted to do CLONE_SYSVSEM (uh, or,
in clone parlance, !CLONE_SYSVSEM :)
> I have decided against that, it breaks the current ABI.
> And we gain virtually nothing - most if not all unshare users will be
> single threaded apps that do not use sysvsem at all, and even most sysvsem
> users do not use SEM_UNDO.
And most importantly sharing your semundo list but not your sems with
your parent is silly!
-serge
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 2/2] fix sys_unshare()+SEM_UNDO: perform an implicit CLONE_SYSVSEM in CLONE_NEWIPC
2008-04-14 19:23 ` Serge E. Hallyn
@ 2008-04-15 18:42 ` Manfred Spraul
0 siblings, 0 replies; 5+ messages in thread
From: Manfred Spraul @ 2008-04-15 18:42 UTC (permalink / raw)
To: Serge E. Hallyn
Cc: Andrew Morton, Linux Kernel Mailing List, Eric W. Biederman,
Pavel Emelyanov, Sukadev Bhattiprolu
Serge E. Hallyn wrote:
>
>> Thus all apps right now call unshare(CLONE_NEWIPC|&~CLONE_SYSVSEM).
>> This combination doesn't make much sense. Even worse - it easily causes a
>> kernel oops.
>> Thus my fix is twofold:
>> - add support for unshare(CLONE_SYSVSEM).
>> - implicitely add CLONE_SYSVSEM to all calls that set CLONE_NEWIPC.
>>
>
> Right but your fix ignores the fact that you can achieve the same
> result as unshare(CLONE_NEWIPC&~CLONE_SYSVSEM) by doing
> clone(CLONE_NEWIPC|CLONE_SYSVSEM).
>
Duh, I overlooked that part. You are right.
Hmm - what's the best fix? Implicitely add CLONE_SYSVSEM or return
-EINVAL if CLONE_SYSVSEM is not set?
I don't care.
Could you write a patch? I probably won't have enough time until the
next weekend.
>> I have decided against that, it breaks the current ABI.
>> And we gain virtually nothing - most if not all unshare users will be
>> single threaded apps that do not use sysvsem at all, and even most sysvsem
>> users do not use SEM_UNDO.
>>
>
> And most importantly sharing your semundo list but not your sems with
> your parent is silly!
It's only 99% silly: a task might want to have not-yet undone changes in its parent's namespace.
As soon as the task exit, they are undone. Probably the most complex way to implement wait4().
--
Manfred
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2008-04-15 18:42 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-04-13 8:09 [PATCH 2/2] fix sys_unshare()+SEM_UNDO: perform an implicit CLONE_SYSVSEM in CLONE_NEWIPC Manfred Spraul
2008-04-14 15:00 ` Serge E. Hallyn
2008-04-14 18:40 ` Manfred Spraul
2008-04-14 19:23 ` Serge E. Hallyn
2008-04-15 18:42 ` Manfred Spraul
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox