public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 2/2] fix sys_unshare()+SEM_UNDO: perform an implicit CLONE_SYSVSEM in CLONE_NEWIPC
@ 2008-04-13  8:09 Manfred Spraul
  2008-04-14 15:00 ` Serge E. Hallyn
  0 siblings, 1 reply; 5+ messages in thread
From: Manfred Spraul @ 2008-04-13  8:09 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Linux Kernel Mailing List, Serge E. Hallyn, Eric W. Biederman,
	Pavel Emelyanov, Sukadev Bhattiprolu

sys_unshare(CLONE_NEWIPC) doesn't handle the undo lists properly, this can
cause a kernel memory corruption. CLONE_NEWIPC must detach from the existing
undo lists.
Fix, part 2: perform an implicit CLONE_SYSVSEM in CLONE_NEWIPC.
CLONE_NEWIPC creates a new IPC namespace, the task cannot access the
existing semaphore arrays after the unshare syscall. Thus the task
can/must detach from the existing undo list entries, too.

This fixes the kernel corruption, because it makes it impossible that
undo records from two different namespaces are in sysvsem.undo_list.

I'm not sure who's the right maintainer, Andrew, could you merge it?

Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
---
 kernel/fork.c |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 7f242b0..6f956a7 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1735,9 +1735,13 @@ asmlinkage long sys_unshare(unsigned long unshare_flags)
 		goto bad_unshare_cleanup_semundo;
 
 	if (new_fs ||  new_mm || new_fd || new_ulist || new_nsproxy) {
-		if (unshare_flags & CLONE_SYSVSEM) {
+		if (unshare_flags & (CLONE_NEWIPC|CLONE_SYSVSEM)) {
 			/*
 			 * CLONE_SYSVSEM is equivalent to sys_exit().
+			 *
+			 * CLONE_NEWIPC must also detach from the undolist: after switching
+			 * to a new ipc namespace, the semaphore arrays from the old
+			 * namespace are unreachable.
 			 */
 			exit_sem(current);
 		}
-- 
1.5.4.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 2/2] fix sys_unshare()+SEM_UNDO: perform an implicit CLONE_SYSVSEM in CLONE_NEWIPC
  2008-04-13  8:09 [PATCH 2/2] fix sys_unshare()+SEM_UNDO: perform an implicit CLONE_SYSVSEM in CLONE_NEWIPC Manfred Spraul
@ 2008-04-14 15:00 ` Serge E. Hallyn
  2008-04-14 18:40   ` Manfred Spraul
  0 siblings, 1 reply; 5+ messages in thread
From: Serge E. Hallyn @ 2008-04-14 15:00 UTC (permalink / raw)
  To: Manfred Spraul
  Cc: Andrew Morton, Linux Kernel Mailing List, Serge E. Hallyn,
	Eric W. Biederman, Pavel Emelyanov, Sukadev Bhattiprolu

Quoting Manfred Spraul (manfred@colorfullife.com):
> sys_unshare(CLONE_NEWIPC) doesn't handle the undo lists properly, this can
> cause a kernel memory corruption. CLONE_NEWIPC must detach from the existing
> undo lists.
> Fix, part 2: perform an implicit CLONE_SYSVSEM in CLONE_NEWIPC.
> CLONE_NEWIPC creates a new IPC namespace, the task cannot access the
> existing semaphore arrays after the unshare syscall. Thus the task
> can/must detach from the existing undo list entries, too.
> 
> This fixes the kernel corruption, because it makes it impossible that
> undo records from two different namespaces are in sysvsem.undo_list.

So this was never an issue with clone(CLONE_NEWIPC|CLONE_SYSVSEM), which
should have had the same result as unshare(CLONE_NEWIPC&~CLONE_SYSVSEM)?

> I'm not sure who's the right maintainer, Andrew, could you merge it?
> 
> Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
> ---
>  kernel/fork.c |    6 +++++-
>  1 files changed, 5 insertions(+), 1 deletions(-)
> 
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 7f242b0..6f956a7 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -1735,9 +1735,13 @@ asmlinkage long sys_unshare(unsigned long unshare_flags)
>  		goto bad_unshare_cleanup_semundo;
> 
>  	if (new_fs ||  new_mm || new_fd || new_ulist || new_nsproxy) {
> -		if (unshare_flags & CLONE_SYSVSEM) {
> +		if (unshare_flags & (CLONE_NEWIPC|CLONE_SYSVSEM)) {
>  			/*
>  			 * CLONE_SYSVSEM is equivalent to sys_exit().
> +			 *
> +			 * CLONE_NEWIPC must also detach from the undolist: after switching
> +			 * to a new ipc namespace, the semaphore arrays from the old
> +			 * namespace are unreachable.
>  			 */
>  			exit_sem(current);
>  		}
> -- 
> 1.5.4.1

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 2/2] fix sys_unshare()+SEM_UNDO: perform an implicit CLONE_SYSVSEM in CLONE_NEWIPC
  2008-04-14 15:00 ` Serge E. Hallyn
@ 2008-04-14 18:40   ` Manfred Spraul
  2008-04-14 19:23     ` Serge E. Hallyn
  0 siblings, 1 reply; 5+ messages in thread
From: Manfred Spraul @ 2008-04-14 18:40 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Andrew Morton, Linux Kernel Mailing List, Eric W. Biederman,
	Pavel Emelyanov, Sukadev Bhattiprolu

Serge E. Hallyn wrote:
> Quoting Manfred Spraul (manfred@colorfullife.com):
>   
>> sys_unshare(CLONE_NEWIPC) doesn't handle the undo lists properly, this can
>> cause a kernel memory corruption. CLONE_NEWIPC must detach from the existing
>> undo lists.
>> Fix, part 2: perform an implicit CLONE_SYSVSEM in CLONE_NEWIPC.
>> CLONE_NEWIPC creates a new IPC namespace, the task cannot access the
>> existing semaphore arrays after the unshare syscall. Thus the task
>> can/must detach from the existing undo list entries, too.
>>
>> This fixes the kernel corruption, because it makes it impossible that
>> undo records from two different namespaces are in sysvsem.undo_list.
>>     
>
> So this was never an issue with clone(CLONE_NEWIPC|CLONE_SYSVSEM), which
> should have had the same result as unshare(CLONE_NEWIPC&~CLONE_SYSVSEM)?
>
>   
Actually, the story is slightly different:
unshare(CLONE_NEWIPC|CLONE_SYSVSEM) returns -EINVAL right now.

Thus all apps right now call unshare(CLONE_NEWIPC|&~CLONE_SYSVSEM).
This combination doesn't make much sense. Even worse - it easily causes 
a kernel oops.
Thus my fix is twofold:
- add support for unshare(CLONE_SYSVSEM).
- implicitely add CLONE_SYSVSEM to all calls that set CLONE_NEWIPC.

It's not really pretty: If a pivot_namespace syscall is ever added, then 
CLONE_NEWIPC&~CLONE_SYSVSEM would make sense again.
What do you think? Can we break backward compatibility and add
    if ( (unshare_flags & CLONE_NEWIPC) && !(unshare_flags & 
CLONE_SYSVSEM) )
       return -EINVAL;
into sys_unshare()?
I have decided against that, it breaks the current ABI.
And we gain virtually nothing - most if not all unshare users will be 
single threaded apps that do not use sysvsem at all, and even most 
sysvsem users do not use SEM_UNDO.

--
    Manfred

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 2/2] fix sys_unshare()+SEM_UNDO: perform an implicit CLONE_SYSVSEM in CLONE_NEWIPC
  2008-04-14 18:40   ` Manfred Spraul
@ 2008-04-14 19:23     ` Serge E. Hallyn
  2008-04-15 18:42       ` Manfred Spraul
  0 siblings, 1 reply; 5+ messages in thread
From: Serge E. Hallyn @ 2008-04-14 19:23 UTC (permalink / raw)
  To: Manfred Spraul
  Cc: Serge E. Hallyn, Andrew Morton, Linux Kernel Mailing List,
	Eric W. Biederman, Pavel Emelyanov, Sukadev Bhattiprolu

Quoting Manfred Spraul (manfred@colorfullife.com):
> Serge E. Hallyn wrote:
>> Quoting Manfred Spraul (manfred@colorfullife.com):
>>   
>>> sys_unshare(CLONE_NEWIPC) doesn't handle the undo lists properly, this 
>>> can
>>> cause a kernel memory corruption. CLONE_NEWIPC must detach from the 
>>> existing
>>> undo lists.
>>> Fix, part 2: perform an implicit CLONE_SYSVSEM in CLONE_NEWIPC.
>>> CLONE_NEWIPC creates a new IPC namespace, the task cannot access the
>>> existing semaphore arrays after the unshare syscall. Thus the task
>>> can/must detach from the existing undo list entries, too.
>>>
>>> This fixes the kernel corruption, because it makes it impossible that
>>> undo records from two different namespaces are in sysvsem.undo_list.
>>>     
>>
>> So this was never an issue with clone(CLONE_NEWIPC|CLONE_SYSVSEM), which
>> should have had the same result as unshare(CLONE_NEWIPC&~CLONE_SYSVSEM)?
>>
>>   
> Actually, the story is slightly different:
> unshare(CLONE_NEWIPC|CLONE_SYSVSEM) returns -EINVAL right now.

Right, but it's feasible with clone (where CLONE_SYSVSEM has the
opposite meaning of with unshare) right?  According to the
comment, CLONE_SYSVSEM was disabled mainly out of fear that it
would be confusing to enable it since it does have the inverse meaning
in unshare as in clone.

> Thus all apps right now call unshare(CLONE_NEWIPC|&~CLONE_SYSVSEM).
> This combination doesn't make much sense. Even worse - it easily causes a 
> kernel oops.
> Thus my fix is twofold:
> - add support for unshare(CLONE_SYSVSEM).
> - implicitely add CLONE_SYSVSEM to all calls that set CLONE_NEWIPC.

Right but your fix ignores the fact that you can achieve the same
result as unshare(CLONE_NEWIPC&~CLONE_SYSVSEM) by doing
clone(CLONE_NEWIPC|CLONE_SYSVSEM).

> It's not really pretty: If a pivot_namespace syscall is ever added, then 
> CLONE_NEWIPC&~CLONE_SYSVSEM would make sense again.
> What do you think? Can we break backward compatibility and add
>    if ( (unshare_flags & CLONE_NEWIPC) && !(unshare_flags & CLONE_SYSVSEM) 
> )
>       return -EINVAL;
> into sys_unshare()?

I don't think we need to, I think what you're doing makes perfect sense.
If you asked for CLONE_NEWIPC, then you wanted to do CLONE_SYSVSEM (uh, or,
in clone parlance, !CLONE_SYSVSEM :)

> I have decided against that, it breaks the current ABI.
> And we gain virtually nothing - most if not all unshare users will be 
> single threaded apps that do not use sysvsem at all, and even most sysvsem 
> users do not use SEM_UNDO.

And most importantly sharing your semundo list but not your sems with
your parent is silly!

-serge

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 2/2] fix sys_unshare()+SEM_UNDO: perform an implicit CLONE_SYSVSEM in CLONE_NEWIPC
  2008-04-14 19:23     ` Serge E. Hallyn
@ 2008-04-15 18:42       ` Manfred Spraul
  0 siblings, 0 replies; 5+ messages in thread
From: Manfred Spraul @ 2008-04-15 18:42 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Andrew Morton, Linux Kernel Mailing List, Eric W. Biederman,
	Pavel Emelyanov, Sukadev Bhattiprolu

Serge E. Hallyn wrote:
>
>> Thus all apps right now call unshare(CLONE_NEWIPC|&~CLONE_SYSVSEM).
>> This combination doesn't make much sense. Even worse - it easily causes a 
>> kernel oops.
>> Thus my fix is twofold:
>> - add support for unshare(CLONE_SYSVSEM).
>> - implicitely add CLONE_SYSVSEM to all calls that set CLONE_NEWIPC.
>>     
>
> Right but your fix ignores the fact that you can achieve the same
> result as unshare(CLONE_NEWIPC&~CLONE_SYSVSEM) by doing
> clone(CLONE_NEWIPC|CLONE_SYSVSEM).
>   
Duh, I overlooked that part. You are right.
Hmm - what's the best fix? Implicitely add CLONE_SYSVSEM or return 
-EINVAL if CLONE_SYSVSEM is not set?
I don't care.
Could you write a patch? I probably won't have enough time until the 
next weekend.

>> I have decided against that, it breaks the current ABI.
>> And we gain virtually nothing - most if not all unshare users will be 
>> single threaded apps that do not use sysvsem at all, and even most sysvsem 
>> users do not use SEM_UNDO.
>>     
>
> And most importantly sharing your semundo list but not your sems with
> your parent is silly!
It's only 99% silly: a task might want to have not-yet undone changes in its parent's namespace.
As soon as the task exit, they are undone. Probably the most complex way to implement wait4().


--
Manfred

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2008-04-15 18:42 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-04-13  8:09 [PATCH 2/2] fix sys_unshare()+SEM_UNDO: perform an implicit CLONE_SYSVSEM in CLONE_NEWIPC Manfred Spraul
2008-04-14 15:00 ` Serge E. Hallyn
2008-04-14 18:40   ` Manfred Spraul
2008-04-14 19:23     ` Serge E. Hallyn
2008-04-15 18:42       ` Manfred Spraul

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox