public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* 2.6.33-rc8 breaks UML with Restrict initial stack space expansion to rlimit
@ 2010-02-14 16:40 Jouni Malinen
  2010-02-14 22:03 ` Michael Neuling
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Jouni Malinen @ 2010-02-14 16:40 UTC (permalink / raw)
  To: Michael Neuling, KOSAKI Motohiro; +Cc: linux-kernel

It looks like the commit 803bf5ec259941936262d10ecc84511b76a20921
(fs/exec.c: restrict initial stack space expansion to rlimit) broke my
user mode Linux setup by somehow preventing system setup from running
properly (or killing some processes that try to mount things, etc.).
This commit turned up as the reason based on git bisect and reverting it
fixes my UML test setup (Ubuntu 9.10 on both host and in UML and AMD64
arch for both). I have no idea what exactly would be the main cause for
this issue, but this looks like a somewhat unfortunately timed
regression in 2.6.33-rc8.

The failed run shows like this (with current linux-2.6.git):

...
EXT3-fs (ubda): mounted filesystem with writeback data mode
VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
mountall: mount /sys/kernel/debug [218] killed by KILL signal
mountall: Filesystem could not be mounted: /sys/kernel/debug
mountall: mount /dev [219] killed by KILL signal
mountall: Filesystem could not be mounted: /dev
mountall: mount /tmp [220] killed by KILL signal
mountall: Filesystem could not be mounted: /tmp
mountall: mount /var/lock [222] killed by KILL signal
mountall: Filesystem could not be mounted: /var/lock
...


With 803bf5ec reverted, UML comes up and the output looks like this:

...
EXT3-fs (ubda): mounted filesystem with writeback data mode
VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
init: procps main process (226) terminated with status 255
fsck from util-linux-ng 2.16
...

-- 
Jouni Malinen                                            PGP id EFC895FA

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: 2.6.33-rc8 breaks UML with Restrict initial stack space expansion to rlimit
  2010-02-14 16:40 2.6.33-rc8 breaks UML with Restrict initial stack space expansion to rlimit Jouni Malinen
@ 2010-02-14 22:03 ` Michael Neuling
  2010-02-15  2:38   ` KOSAKI Motohiro
  2010-02-15  6:56   ` Jouni Malinen
  2010-02-14 23:23 ` Rafael J. Wysocki
  2010-02-15  6:30 ` Michael Neuling
  2 siblings, 2 replies; 15+ messages in thread
From: Michael Neuling @ 2010-02-14 22:03 UTC (permalink / raw)
  To: Jouni Malinen; +Cc: KOSAKI Motohiro, linux-kernel



In message <20100214164023.GA2726@jm.kir.nu> you wrote:
> It looks like the commit 803bf5ec259941936262d10ecc84511b76a20921
> (fs/exec.c: restrict initial stack space expansion to rlimit) broke my
> user mode Linux setup by somehow preventing system setup from running
> properly (or killing some processes that try to mount things, etc.).
> This commit turned up as the reason based on git bisect and reverting it
> fixes my UML test setup (Ubuntu 9.10 on both host and in UML and AMD64
> arch for both). I have no idea what exactly would be the main cause for
> this issue, but this looks like a somewhat unfortunately timed
> regression in 2.6.33-rc8.
> 
> The failed run shows like this (with current linux-2.6.git):
> 
> ...
> EXT3-fs (ubda): mounted filesystem with writeback data mode
> VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
> IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> mountall: mount /sys/kernel/debug [218] killed by KILL signal
> mountall: Filesystem could not be mounted: /sys/kernel/debug
> mountall: mount /dev [219] killed by KILL signal
> mountall: Filesystem could not be mounted: /dev
> mountall: mount /tmp [220] killed by KILL signal
> mountall: Filesystem could not be mounted: /tmp
> mountall: mount /var/lock [222] killed by KILL signal
> mountall: Filesystem could not be mounted: /var/lock
> ...
> 
> 
> With 803bf5ec reverted, UML comes up and the output looks like this:
> 
> ...
> EXT3-fs (ubda): mounted filesystem with writeback data mode
> VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
> IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> init: procps main process (226) terminated with status 255
> fsck from util-linux-ng 2.16
> ...

Crud, the "killed" is definitely something this patch could cause.

I'm not familiar with UML.  Is this the guest and the host booting rc8,
or just the host?  Does UML use stack protection at all?

Can you try booting the guest to init=/bin/sh and try running some tests
to see what you can set 'ulimit -s' to and still be able to run a simple
command like '/bin/ls'?

Mikey

> 
> -- 
> Jouni Malinen                                            PGP id EFC895FA
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: 2.6.33-rc8 breaks UML with Restrict initial stack space expansion to rlimit
  2010-02-14 16:40 2.6.33-rc8 breaks UML with Restrict initial stack space expansion to rlimit Jouni Malinen
  2010-02-14 22:03 ` Michael Neuling
@ 2010-02-14 23:23 ` Rafael J. Wysocki
  2010-02-15  6:30 ` Michael Neuling
  2 siblings, 0 replies; 15+ messages in thread
From: Rafael J. Wysocki @ 2010-02-14 23:23 UTC (permalink / raw)
  To: Jouni Malinen; +Cc: Michael Neuling, KOSAKI Motohiro, linux-kernel

On Sunday 14 February 2010, Jouni Malinen wrote:
> It looks like the commit 803bf5ec259941936262d10ecc84511b76a20921
> (fs/exec.c: restrict initial stack space expansion to rlimit) broke my
> user mode Linux setup by somehow preventing system setup from running
> properly (or killing some processes that try to mount things, etc.).
> This commit turned up as the reason based on git bisect and reverting it
> fixes my UML test setup (Ubuntu 9.10 on both host and in UML and AMD64
> arch for both). I have no idea what exactly would be the main cause for
> this issue, but this looks like a somewhat unfortunately timed
> regression in 2.6.33-rc8.
> 
> The failed run shows like this (with current linux-2.6.git):
> 
> ...
> EXT3-fs (ubda): mounted filesystem with writeback data mode
> VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
> IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> mountall: mount /sys/kernel/debug [218] killed by KILL signal
> mountall: Filesystem could not be mounted: /sys/kernel/debug
> mountall: mount /dev [219] killed by KILL signal
> mountall: Filesystem could not be mounted: /dev
> mountall: mount /tmp [220] killed by KILL signal
> mountall: Filesystem could not be mounted: /tmp
> mountall: mount /var/lock [222] killed by KILL signal
> mountall: Filesystem could not be mounted: /var/lock
> ...
> 
> 
> With 803bf5ec reverted, UML comes up and the output looks like this:
> 
> ...
> EXT3-fs (ubda): mounted filesystem with writeback data mode
> VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
> IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> init: procps main process (226) terminated with status 255
> fsck from util-linux-ng 2.16
> ...

I have created the Bugzilla entry at http://bugzilla.kernel.org/show_bug.cgi?id=15308
for your report.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: 2.6.33-rc8 breaks UML with Restrict initial stack space expansion to rlimit
  2010-02-14 22:03 ` Michael Neuling
@ 2010-02-15  2:38   ` KOSAKI Motohiro
  2010-02-15  7:02     ` Jouni Malinen
  2010-02-15  6:56   ` Jouni Malinen
  1 sibling, 1 reply; 15+ messages in thread
From: KOSAKI Motohiro @ 2010-02-15  2:38 UTC (permalink / raw)
  To: Michael Neuling; +Cc: kosaki.motohiro, Jouni Malinen, linux-kernel

> 
> 
> In message <20100214164023.GA2726@jm.kir.nu> you wrote:
> > It looks like the commit 803bf5ec259941936262d10ecc84511b76a20921
> > (fs/exec.c: restrict initial stack space expansion to rlimit) broke my
> > user mode Linux setup by somehow preventing system setup from running
> > properly (or killing some processes that try to mount things, etc.).
> > This commit turned up as the reason based on git bisect and reverting it
> > fixes my UML test setup (Ubuntu 9.10 on both host and in UML and AMD64
> > arch for both). I have no idea what exactly would be the main cause for
> > this issue, but this looks like a somewhat unfortunately timed
> > regression in 2.6.33-rc8.
> > 
> > The failed run shows like this (with current linux-2.6.git):
> > 
> > ...
> > EXT3-fs (ubda): mounted filesystem with writeback data mode
> > VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
> > IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
> > IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
> > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> > mountall: mount /sys/kernel/debug [218] killed by KILL signal
> > mountall: Filesystem could not be mounted: /sys/kernel/debug
> > mountall: mount /dev [219] killed by KILL signal
> > mountall: Filesystem could not be mounted: /dev
> > mountall: mount /tmp [220] killed by KILL signal
> > mountall: Filesystem could not be mounted: /tmp
> > mountall: mount /var/lock [222] killed by KILL signal
> > mountall: Filesystem could not be mounted: /var/lock
> > ...

Wow. It seems very strange. Usually stack overflow makes SIGSEGV, not SIGKILL.
plus, In my environment (x86_64 non-uml), mount command doesn't use
the stack so much.


% /usr/bin/time --format="mem %M" ls
/usr/bin/time --format="mem %M" ls
GPATH      INSTALL  README       configure    fdisk          lib      misc-utils   schedutils
mem 3232

% sudo /usr/bin/time --format="mem %M"  mount -a
sudo /usr/bin/time --format="mem %M"  mount -a
mem 2992


Hmmm...
I have no idea.



> > 
> > 
> > With 803bf5ec reverted, UML comes up and the output looks like this:
> > 
> > ...
> > EXT3-fs (ubda): mounted filesystem with writeback data mode
> > VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
> > IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
> > IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
> > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> > init: procps main process (226) terminated with status 255
> > fsck from util-linux-ng 2.16
> > ...
> 
> Crud, the "killed" is definitely something this patch could cause.
> 
> I'm not familiar with UML.  Is this the guest and the host booting rc8,
> or just the host?  Does UML use stack protection at all?
> 
> Can you try booting the guest to init=/bin/sh and try running some tests
> to see what you can set 'ulimit -s' to and still be able to run a simple
> command like '/bin/ls'?
> 
> Mikey
> 
> > 
> > -- 
> > Jouni Malinen                                            PGP id EFC895FA
> > 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/




^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: 2.6.33-rc8 breaks UML with Restrict initial stack space expansion to rlimit
  2010-02-14 16:40 2.6.33-rc8 breaks UML with Restrict initial stack space expansion to rlimit Jouni Malinen
  2010-02-14 22:03 ` Michael Neuling
  2010-02-14 23:23 ` Rafael J. Wysocki
@ 2010-02-15  6:30 ` Michael Neuling
  2010-02-15  6:59   ` KOSAKI Motohiro
  2010-02-15  7:12   ` Jouni Malinen
  2 siblings, 2 replies; 15+ messages in thread
From: Michael Neuling @ 2010-02-15  6:30 UTC (permalink / raw)
  To: Jouni Malinen; +Cc: KOSAKI Motohiro, linux-kernel, Andrew Morton, anton



In message <20100214164023.GA2726@jm.kir.nu> you wrote:
> It looks like the commit 803bf5ec259941936262d10ecc84511b76a20921
> (fs/exec.c: restrict initial stack space expansion to rlimit) broke my
> user mode Linux setup by somehow preventing system setup from running
> properly (or killing some processes that try to mount things, etc.).
> This commit turned up as the reason based on git bisect and reverting it
> fixes my UML test setup (Ubuntu 9.10 on both host and in UML and AMD64
> arch for both). I have no idea what exactly would be the main cause for
> this issue, but this looks like a somewhat unfortunately timed
> regression in 2.6.33-rc8.
> 
> The failed run shows like this (with current linux-2.6.git):
> 
> ...
> EXT3-fs (ubda): mounted filesystem with writeback data mode
> VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
> IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> mountall: mount /sys/kernel/debug [218] killed by KILL signal
> mountall: Filesystem could not be mounted: /sys/kernel/debug
> mountall: mount /dev [219] killed by KILL signal
> mountall: Filesystem could not be mounted: /dev
> mountall: mount /tmp [220] killed by KILL signal
> mountall: Filesystem could not be mounted: /tmp
> mountall: mount /var/lock [222] killed by KILL signal
> mountall: Filesystem could not be mounted: /var/lock
> ...
> 
> 
> With 803bf5ec reverted, UML comes up and the output looks like this:
> 
> ...
> EXT3-fs (ubda): mounted filesystem with writeback data mode
> VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
> IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> init: procps main process (226) terminated with status 255
> fsck from util-linux-ng 2.16
> ...

Jouni,

I can reproduce this now.  

We got the logic wrong in one of the cleanups and hence we aren't
actually changing the stack reservation ever, when we intended on
allocating up to 20 new pages.  

The:
	rlim_stack = min(rlim_stack, stack_size);
always chooses stack_size hence we end up not changing the stack at all.
This seems to cause fatal problems on UML, but is obviously not what was
intended for archs as well.  

The following works for me on PPC64 64k and 4k pages and UML on x86_64. 

Let me know if it fixes it for you also.

Mikey


exec/fs: fix initial stack reservation

803bf5ec259941936262d10ecc84511b76a20921 (fs/exec.c: restrict initial
stack space expansion to rlimit) attempts to limit the initial stack to
20*PAGE_SIZE.  Unfortunately, in also attempting ensure the stack is not
reduced in size, we ended up not changing the stack at all.  

This caused a regression in UML resulting in most guest processes to be
killed. 

Signed-off-by: Michael Neuling <mikey@neuling.org>
cc: <stable@kernel.org>

diff --git a/fs/exec.c b/fs/exec.c
index e95c692..e0e7b3c 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -637,15 +637,16 @@ int setup_arg_pages(struct linux_binprm *bprm,
 	 * will align it up.
 	 */
 	rlim_stack = rlimit(RLIMIT_STACK) & PAGE_MASK;
-	rlim_stack = min(rlim_stack, stack_size);
 #ifdef CONFIG_STACK_GROWSUP
 	if (stack_size + stack_expand > rlim_stack)
-		stack_base = vma->vm_start + rlim_stack;
+		/*  Expand only to rlimit, making sure not to shrink it */
+		stack_base = vma->vm_start + max(rlim_stack,stack_size);
 	else
 		stack_base = vma->vm_end + stack_expand;
 #else
 	if (stack_size + stack_expand > rlim_stack)
-		stack_base = vma->vm_end - rlim_stack;
+		/*  Expand only to rlimit, making sure not to shrink it */
+		stack_base = vma->vm_end - max(rlim_stack,stack_size);
 	else
 		stack_base = vma->vm_start - stack_expand;
 #endif



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: 2.6.33-rc8 breaks UML with Restrict initial stack space expansion to rlimit
  2010-02-14 22:03 ` Michael Neuling
  2010-02-15  2:38   ` KOSAKI Motohiro
@ 2010-02-15  6:56   ` Jouni Malinen
  1 sibling, 0 replies; 15+ messages in thread
From: Jouni Malinen @ 2010-02-15  6:56 UTC (permalink / raw)
  To: Michael Neuling; +Cc: KOSAKI Motohiro, linux-kernel

On Mon, Feb 15, 2010 at 09:03:39AM +1100, Michael Neuling wrote:
> Crud, the "killed" is definitely something this patch could cause.
> 
> I'm not familiar with UML.  Is this the guest and the host booting rc8,
> or just the host?  Does UML use stack protection at all?

Only the guest was booting rc8 (host is Ubuntu 2.6.31-19-generic), i.e.,
the issue shows up in the guest kernel after rc8 (and is removed by
reverting 803bf5). I do not know whether stack protection is used.

> Can you try booting the guest to init=/bin/sh and try running some tests
> to see what you can set 'ulimit -s' to and still be able to run a simple
> command like '/bin/ls'?

With 803bf5 included, init=/bin/sh seems to fail the boot quite
frequently, but not always. Once I get to shell on the UML guest,
/bin/ls fails about half of the time (e.g., running /bin/ls in /root
failed six times out of ten). This was before touching ulimit -s at all
("ulimit -s" shows 8192).

The behavior of 'ls' run with various ulimit -s values:

24-10000: OK or Killed
16-23: OK or Segmentation fault or Killed
1-15: Killed or Segmentation fault

I have no idea what is causing the random behavior in the results, but
anyway, it looks like 'ulimit -s 23' is the first point where
segmentation faults start showing up and 'ulimit -s 15' is the point
where ls fails every time.


If I start the guest (normal boot with multiple virtual consoles) with
803bf5 reverted, 'ulimit -s 84' has /bin/ls working every time and
'ulimit -s 83' makes it fail every time.

-- 
Jouni Malinen                                            PGP id EFC895FA

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: 2.6.33-rc8 breaks UML with Restrict initial stack space expansion to rlimit
  2010-02-15  6:30 ` Michael Neuling
@ 2010-02-15  6:59   ` KOSAKI Motohiro
  2010-02-15  7:17     ` Jouni Malinen
                       ` (2 more replies)
  2010-02-15  7:12   ` Jouni Malinen
  1 sibling, 3 replies; 15+ messages in thread
From: KOSAKI Motohiro @ 2010-02-15  6:59 UTC (permalink / raw)
  To: Michael Neuling
  Cc: kosaki.motohiro, Jouni Malinen, linux-kernel, Andrew Morton,
	anton

> 
> 
> In message <20100214164023.GA2726@jm.kir.nu> you wrote:
> > It looks like the commit 803bf5ec259941936262d10ecc84511b76a20921
> > (fs/exec.c: restrict initial stack space expansion to rlimit) broke my
> > user mode Linux setup by somehow preventing system setup from running
> > properly (or killing some processes that try to mount things, etc.).
> > This commit turned up as the reason based on git bisect and reverting it
> > fixes my UML test setup (Ubuntu 9.10 on both host and in UML and AMD64
> > arch for both). I have no idea what exactly would be the main cause for
> > this issue, but this looks like a somewhat unfortunately timed
> > regression in 2.6.33-rc8.
> > 
> > The failed run shows like this (with current linux-2.6.git):
> > 
> > ...
> > EXT3-fs (ubda): mounted filesystem with writeback data mode
> > VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
> > IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
> > IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
> > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> > mountall: mount /sys/kernel/debug [218] killed by KILL signal
> > mountall: Filesystem could not be mounted: /sys/kernel/debug
> > mountall: mount /dev [219] killed by KILL signal
> > mountall: Filesystem could not be mounted: /dev
> > mountall: mount /tmp [220] killed by KILL signal
> > mountall: Filesystem could not be mounted: /tmp
> > mountall: mount /var/lock [222] killed by KILL signal
> > mountall: Filesystem could not be mounted: /var/lock
> > ...
> > 
> > 
> > With 803bf5ec reverted, UML comes up and the output looks like this:
> > 
> > ...
> > EXT3-fs (ubda): mounted filesystem with writeback data mode
> > VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
> > IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
> > IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
> > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> > init: procps main process (226) terminated with status 255
> > fsck from util-linux-ng 2.16
> > ...
> 
> Jouni,
> 
> I can reproduce this now.  
> 
> We got the logic wrong in one of the cleanups and hence we aren't
> actually changing the stack reservation ever, when we intended on
> allocating up to 20 new pages.  
> 
> The:
> 	rlim_stack = min(rlim_stack, stack_size);
> always chooses stack_size hence we end up not changing the stack at all.
> This seems to cause fatal problems on UML, but is obviously not what was
> intended for archs as well.  
> 
> The following works for me on PPC64 64k and 4k pages and UML on x86_64. 
> 
> Let me know if it fixes it for you also.
> 
> Mikey
> 
> 
> exec/fs: fix initial stack reservation
> 
> 803bf5ec259941936262d10ecc84511b76a20921 (fs/exec.c: restrict initial
> stack space expansion to rlimit) attempts to limit the initial stack to
> 20*PAGE_SIZE.  Unfortunately, in also attempting ensure the stack is not
> reduced in size, we ended up not changing the stack at all.  
> 
> This caused a regression in UML resulting in most guest processes to be
> killed. 
> 
> Signed-off-by: Michael Neuling <mikey@neuling.org>
> cc: <stable@kernel.org>
> 
> diff --git a/fs/exec.c b/fs/exec.c
> index e95c692..e0e7b3c 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -637,15 +637,16 @@ int setup_arg_pages(struct linux_binprm *bprm,
>  	 * will align it up.
>  	 */
>  	rlim_stack = rlimit(RLIMIT_STACK) & PAGE_MASK;
> -	rlim_stack = min(rlim_stack, stack_size);
>  #ifdef CONFIG_STACK_GROWSUP
>  	if (stack_size + stack_expand > rlim_stack)
> -		stack_base = vma->vm_start + rlim_stack;
> +		/*  Expand only to rlimit, making sure not to shrink it */
> +		stack_base = vma->vm_start + max(rlim_stack,stack_size);
>  	else
>  		stack_base = vma->vm_end + stack_expand;
>  #else
>  	if (stack_size + stack_expand > rlim_stack)
> -		stack_base = vma->vm_end - rlim_stack;
> +		/*  Expand only to rlimit, making sure not to shrink it */
> +		stack_base = vma->vm_end - max(rlim_stack,stack_size);
>  	else
>  		stack_base = vma->vm_start - stack_expand;
>  #endif

-	rlim_stack = min(rlim_stack, stack_size);
+	/*  Expand only to rlimit, making sure not to shrink it */
+	rlim_stack = max(rlim_stack, stack_size);

is better fix?




^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: 2.6.33-rc8 breaks UML with Restrict initial stack space expansion to rlimit
  2010-02-15  2:38   ` KOSAKI Motohiro
@ 2010-02-15  7:02     ` Jouni Malinen
  0 siblings, 0 replies; 15+ messages in thread
From: Jouni Malinen @ 2010-02-15  7:02 UTC (permalink / raw)
  To: KOSAKI Motohiro; +Cc: Michael Neuling, linux-kernel

On Mon, Feb 15, 2010 at 11:38:52AM +0900, KOSAKI Motohiro wrote:
> Wow. It seems very strange. Usually stack overflow makes SIGSEGV, not SIGKILL.
> plus, In my environment (x86_64 non-uml), mount command doesn't use
> the stack so much.

The SIGKILL part is the odd one, I would assume.. It looks like about
half of the commands are failing with SIGKILL even with ulimit -s 8192
and the ulimit -s value did not really change the failure rate that
much if at all. SIGSEGV starts showing up only after having limited
ulimit -s to quite samll number (see my previous email).

> % /usr/bin/time --format="mem %M" ls
> /usr/bin/time --format="mem %M" ls
> GPATH      INSTALL  README       configure    fdisk          lib      misc-utils   schedutils
> mem 3232

mem 3296
on my UML setup

> % sudo /usr/bin/time --format="mem %M"  mount -a
> sudo /usr/bin/time --format="mem %M"  mount -a
> mem 2992

mem 2928

-- 
Jouni Malinen                                            PGP id EFC895FA

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: 2.6.33-rc8 breaks UML with Restrict initial stack space expansion to rlimit
  2010-02-15  6:30 ` Michael Neuling
  2010-02-15  6:59   ` KOSAKI Motohiro
@ 2010-02-15  7:12   ` Jouni Malinen
  1 sibling, 0 replies; 15+ messages in thread
From: Jouni Malinen @ 2010-02-15  7:12 UTC (permalink / raw)
  To: Michael Neuling; +Cc: KOSAKI Motohiro, linux-kernel, Andrew Morton, anton

On Mon, Feb 15, 2010 at 05:30:20PM +1100, Michael Neuling wrote:

> We got the logic wrong in one of the cleanups and hence we aren't
> actually changing the stack reservation ever, when we intended on
> allocating up to 20 new pages.  
> 
> The:
> 	rlim_stack = min(rlim_stack, stack_size);
> always chooses stack_size hence we end up not changing the stack at all.
> This seems to cause fatal problems on UML, but is obviously not what was
> intended for archs as well.  
> 
> The following works for me on PPC64 64k and 4k pages and UML on x86_64. 

That fixes my UML setup, too.

-- 
Jouni Malinen                                            PGP id EFC895FA

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: 2.6.33-rc8 breaks UML with Restrict initial stack space expansion to rlimit
  2010-02-15  6:59   ` KOSAKI Motohiro
@ 2010-02-15  7:17     ` Jouni Malinen
  2010-02-15  8:57     ` [PATCH] exec/fs: fix initial stack reservation Michael Neuling
  2010-02-15  8:57     ` 2.6.33-rc8 breaks UML with Restrict initial stack space expansion to rlimit Américo Wang
  2 siblings, 0 replies; 15+ messages in thread
From: Jouni Malinen @ 2010-02-15  7:17 UTC (permalink / raw)
  To: KOSAKI Motohiro; +Cc: Michael Neuling, linux-kernel, Andrew Morton, anton

On Mon, Feb 15, 2010 at 03:59:26PM +0900, KOSAKI Motohiro wrote:

> -	rlim_stack = min(rlim_stack, stack_size);
> +	/*  Expand only to rlimit, making sure not to shrink it */
> +	rlim_stack = max(rlim_stack, stack_size);
> 
> is better fix?

Assuming I understood correctly that that was to replace the patch from
Michael completely and not just a part of it (i.e., just this one-liner
on top of linux-2.6.git), I can confirm that this, too, resolves the
issue I was seeing with UML.

-- 
Jouni Malinen                                            PGP id EFC895FA

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH] exec/fs: fix initial stack reservation
  2010-02-15  6:59   ` KOSAKI Motohiro
  2010-02-15  7:17     ` Jouni Malinen
@ 2010-02-15  8:57     ` Michael Neuling
  2010-02-15  9:04       ` KOSAKI Motohiro
  2010-02-15  9:08       ` Américo Wang
  2010-02-15  8:57     ` 2.6.33-rc8 breaks UML with Restrict initial stack space expansion to rlimit Américo Wang
  2 siblings, 2 replies; 15+ messages in thread
From: Michael Neuling @ 2010-02-15  8:57 UTC (permalink / raw)
  To: KOSAKI Motohiro; +Cc: Jouni Malinen, linux-kernel, Andrew Morton, anton

In message <20100215155821.7298.A69D9226@jp.fujitsu.com> you wrote:
> > 
> > 
> > In message <20100214164023.GA2726@jm.kir.nu> you wrote:
> > > It looks like the commit 803bf5ec259941936262d10ecc84511b76a20921
> > > (fs/exec.c: restrict initial stack space expansion to rlimit) broke my
> > > user mode Linux setup by somehow preventing system setup from running
> > > properly (or killing some processes that try to mount things, etc.).
> > > This commit turned up as the reason based on git bisect and reverting it
> > > fixes my UML test setup (Ubuntu 9.10 on both host and in UML and AMD64
> > > arch for both). I have no idea what exactly would be the main cause for
> > > this issue, but this looks like a somewhat unfortunately timed
> > > regression in 2.6.33-rc8.
> > > 
> > > The failed run shows like this (with current linux-2.6.git):
> > > 
> > > ...
> > > EXT3-fs (ubda): mounted filesystem with writeback data mode
> > > VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
> > > IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
> > > IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
> > > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> > > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> > > mountall: mount /sys/kernel/debug [218] killed by KILL signal
> > > mountall: Filesystem could not be mounted: /sys/kernel/debug
> > > mountall: mount /dev [219] killed by KILL signal
> > > mountall: Filesystem could not be mounted: /dev
> > > mountall: mount /tmp [220] killed by KILL signal
> > > mountall: Filesystem could not be mounted: /tmp
> > > mountall: mount /var/lock [222] killed by KILL signal
> > > mountall: Filesystem could not be mounted: /var/lock
> > > ...
> > > 
> > > 
> > > With 803bf5ec reverted, UML comes up and the output looks like this:
> > > 
> > > ...
> > > EXT3-fs (ubda): mounted filesystem with writeback data mode
> > > VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
> > > IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
> > > IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
> > > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> > > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> > > init: procps main process (226) terminated with status 255
> > > fsck from util-linux-ng 2.16
> > > ...
> > 
> > Jouni,
> > 
> > I can reproduce this now.  
> > 
> > We got the logic wrong in one of the cleanups and hence we aren't
> > actually changing the stack reservation ever, when we intended on
> > allocating up to 20 new pages.  
> > 
> > The:
> > 	rlim_stack = min(rlim_stack, stack_size);
> > always chooses stack_size hence we end up not changing the stack at all.
> > This seems to cause fatal problems on UML, but is obviously not what was
> > intended for archs as well.  
> > 
> > The following works for me on PPC64 64k and 4k pages and UML on x86_64. 
> > 
> > Let me know if it fixes it for you also.
> > 
> > Mikey
> > 
> > 
> > exec/fs: fix initial stack reservation
> > 
> > 803bf5ec259941936262d10ecc84511b76a20921 (fs/exec.c: restrict initial
> > stack space expansion to rlimit) attempts to limit the initial stack to
> > 20*PAGE_SIZE.  Unfortunately, in also attempting ensure the stack is not
> > reduced in size, we ended up not changing the stack at all.  
> > 
> > This caused a regression in UML resulting in most guest processes to be
> > killed. 
> > 
> > Signed-off-by: Michael Neuling <mikey@neuling.org>
> > cc: <stable@kernel.org>
> > 
> > diff --git a/fs/exec.c b/fs/exec.c
> > index e95c692..e0e7b3c 100644
> > --- a/fs/exec.c
> > +++ b/fs/exec.c
> > @@ -637,15 +637,16 @@ int setup_arg_pages(struct linux_binprm *bprm,
> >  	 * will align it up.
> >  	 */
> >  	rlim_stack = rlimit(RLIMIT_STACK) & PAGE_MASK;
> > -	rlim_stack = min(rlim_stack, stack_size);
> >  #ifdef CONFIG_STACK_GROWSUP
> >  	if (stack_size + stack_expand > rlim_stack)
> > -		stack_base = vma->vm_start + rlim_stack;
> > +		/*  Expand only to rlimit, making sure not to shrink it */
> > +		stack_base = vma->vm_start + max(rlim_stack,stack_size);
> >  	else
> >  		stack_base = vma->vm_end + stack_expand;
> >  #else
> >  	if (stack_size + stack_expand > rlim_stack)
> > -		stack_base = vma->vm_end - rlim_stack;
> > +		/*  Expand only to rlimit, making sure not to shrink it */
> > +		stack_base = vma->vm_end - max(rlim_stack,stack_size);
> >  	else
> >  		stack_base = vma->vm_start - stack_expand;
> >  #endif
> 
> -	rlim_stack = min(rlim_stack, stack_size);
> +	/*  Expand only to rlimit, making sure not to shrink it */
> +	rlim_stack = max(rlim_stack, stack_size);
> 
> is better fix?

Actually, I think we can just get rid of min() line altogether.
expand_stack checks to make sure the stack is getting bigger, otherwise
it does nothing.  We don't need to bother with this check.

The below works for me on UML x86_64 and ppc64 64k and 4k pages.

Mikey

exec/fs: fix initial stack reservation

803bf5ec259941936262d10ecc84511b76a20921 (fs/exec.c: restrict initial
stack space expansion to rlimit) attempts to limit the initial stack to
20*PAGE_SIZE.  Unfortunately, in attempting ensure the stack is not
reduced in size, we ended up not changing the stack at all.

This size reduction check is not necessary as the expand_stack call does
this already.

This caused a regression in UML resulting in most guest processes being
killed.

Signed-off-by: Michael Neuling <mikey@neuling.org>
cc: <stable@kernel.org>
---
 fs/exec.c |    1 -
 1 file changed, 1 deletion(-)

Index: linux-2.6-ozlabs/fs/exec.c
===================================================================
--- linux-2.6-ozlabs.orig/fs/exec.c
+++ linux-2.6-ozlabs/fs/exec.c
@@ -637,7 +637,6 @@ int setup_arg_pages(struct linux_binprm 
 	 * will align it up.
 	 */
 	rlim_stack = rlimit(RLIMIT_STACK) & PAGE_MASK;
-	rlim_stack = min(rlim_stack, stack_size);
 #ifdef CONFIG_STACK_GROWSUP
 	if (stack_size + stack_expand > rlim_stack)
 		stack_base = vma->vm_start + rlim_stack;

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: 2.6.33-rc8 breaks UML with Restrict initial stack space expansion to rlimit
  2010-02-15  6:59   ` KOSAKI Motohiro
  2010-02-15  7:17     ` Jouni Malinen
  2010-02-15  8:57     ` [PATCH] exec/fs: fix initial stack reservation Michael Neuling
@ 2010-02-15  8:57     ` Américo Wang
  2010-02-15  9:03       ` KOSAKI Motohiro
  2 siblings, 1 reply; 15+ messages in thread
From: Américo Wang @ 2010-02-15  8:57 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Michael Neuling, Jouni Malinen, linux-kernel, Andrew Morton,
	anton

On Mon, Feb 15, 2010 at 03:59:26PM +0900, KOSAKI Motohiro wrote:
>> 
>> 
>> In message <20100214164023.GA2726@jm.kir.nu> you wrote:
>> > It looks like the commit 803bf5ec259941936262d10ecc84511b76a20921
>> > (fs/exec.c: restrict initial stack space expansion to rlimit) broke my
>> > user mode Linux setup by somehow preventing system setup from running
>> > properly (or killing some processes that try to mount things, etc.).
>> > This commit turned up as the reason based on git bisect and reverting it
>> > fixes my UML test setup (Ubuntu 9.10 on both host and in UML and AMD64
>> > arch for both). I have no idea what exactly would be the main cause for
>> > this issue, but this looks like a somewhat unfortunately timed
>> > regression in 2.6.33-rc8.
>> > 
>> > The failed run shows like this (with current linux-2.6.git):
>> > 
>> > ...
>> > EXT3-fs (ubda): mounted filesystem with writeback data mode
>> > VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
>> > IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
>> > IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
>> > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
>> > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
>> > mountall: mount /sys/kernel/debug [218] killed by KILL signal
>> > mountall: Filesystem could not be mounted: /sys/kernel/debug
>> > mountall: mount /dev [219] killed by KILL signal
>> > mountall: Filesystem could not be mounted: /dev
>> > mountall: mount /tmp [220] killed by KILL signal
>> > mountall: Filesystem could not be mounted: /tmp
>> > mountall: mount /var/lock [222] killed by KILL signal
>> > mountall: Filesystem could not be mounted: /var/lock
>> > ...
>> > 
>> > 
>> > With 803bf5ec reverted, UML comes up and the output looks like this:
>> > 
>> > ...
>> > EXT3-fs (ubda): mounted filesystem with writeback data mode
>> > VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
>> > IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
>> > IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
>> > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
>> > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
>> > init: procps main process (226) terminated with status 255
>> > fsck from util-linux-ng 2.16
>> > ...
>> 
>> Jouni,
>> 
>> I can reproduce this now.  
>> 
>> We got the logic wrong in one of the cleanups and hence we aren't
>> actually changing the stack reservation ever, when we intended on
>> allocating up to 20 new pages.  
>> 
>> The:
>> 	rlim_stack = min(rlim_stack, stack_size);
>> always chooses stack_size hence we end up not changing the stack at all.
>> This seems to cause fatal problems on UML, but is obviously not what was
>> intended for archs as well.  
>> 
>> The following works for me on PPC64 64k and 4k pages and UML on x86_64. 
>> 
>> Let me know if it fixes it for you also.
>> 
>> Mikey
>> 
>> 
>> exec/fs: fix initial stack reservation
>> 
>> 803bf5ec259941936262d10ecc84511b76a20921 (fs/exec.c: restrict initial
>> stack space expansion to rlimit) attempts to limit the initial stack to
>> 20*PAGE_SIZE.  Unfortunately, in also attempting ensure the stack is not
>> reduced in size, we ended up not changing the stack at all.  
>> 
>> This caused a regression in UML resulting in most guest processes to be
>> killed. 
>> 
>> Signed-off-by: Michael Neuling <mikey@neuling.org>
>> cc: <stable@kernel.org>
>> 
>> diff --git a/fs/exec.c b/fs/exec.c
>> index e95c692..e0e7b3c 100644
>> --- a/fs/exec.c
>> +++ b/fs/exec.c
>> @@ -637,15 +637,16 @@ int setup_arg_pages(struct linux_binprm *bprm,
>>  	 * will align it up.
>>  	 */
>>  	rlim_stack = rlimit(RLIMIT_STACK) & PAGE_MASK;
>> -	rlim_stack = min(rlim_stack, stack_size);
>>  #ifdef CONFIG_STACK_GROWSUP
>>  	if (stack_size + stack_expand > rlim_stack)
>> -		stack_base = vma->vm_start + rlim_stack;
>> +		/*  Expand only to rlimit, making sure not to shrink it */
>> +		stack_base = vma->vm_start + max(rlim_stack,stack_size);
>>  	else
>>  		stack_base = vma->vm_end + stack_expand;
>>  #else
>>  	if (stack_size + stack_expand > rlim_stack)
>> -		stack_base = vma->vm_end - rlim_stack;
>> +		/*  Expand only to rlimit, making sure not to shrink it */
>> +		stack_base = vma->vm_end - max(rlim_stack,stack_size);
>>  	else
>>  		stack_base = vma->vm_start - stack_expand;
>>  #endif
>
>-	rlim_stack = min(rlim_stack, stack_size);
>+	/*  Expand only to rlimit, making sure not to shrink it */
>+	rlim_stack = max(rlim_stack, stack_size);
>
>is better fix?
>

Odd. If this is the right fix, 'stack_size" will be able to exceed
stack rlimit, then Michael's previous rlimit patch will be useless.
Am I missing something?


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: 2.6.33-rc8 breaks UML with Restrict initial stack space expansion to rlimit
  2010-02-15  8:57     ` 2.6.33-rc8 breaks UML with Restrict initial stack space expansion to rlimit Américo Wang
@ 2010-02-15  9:03       ` KOSAKI Motohiro
  0 siblings, 0 replies; 15+ messages in thread
From: KOSAKI Motohiro @ 2010-02-15  9:03 UTC (permalink / raw)
  To: Americo Wang
  Cc: kosaki.motohiro, Michael Neuling, Jouni Malinen, linux-kernel,
	Andrew Morton, anton

> On Mon, Feb 15, 2010 at 03:59:26PM +0900, KOSAKI Motohiro wrote:
> >> 
> >> 
> >> In message <20100214164023.GA2726@jm.kir.nu> you wrote:
> >> > It looks like the commit 803bf5ec259941936262d10ecc84511b76a20921
> >> > (fs/exec.c: restrict initial stack space expansion to rlimit) broke my
> >> > user mode Linux setup by somehow preventing system setup from running
> >> > properly (or killing some processes that try to mount things, etc.).
> >> > This commit turned up as the reason based on git bisect and reverting it
> >> > fixes my UML test setup (Ubuntu 9.10 on both host and in UML and AMD64
> >> > arch for both). I have no idea what exactly would be the main cause for
> >> > this issue, but this looks like a somewhat unfortunately timed
> >> > regression in 2.6.33-rc8.
> >> > 
> >> > The failed run shows like this (with current linux-2.6.git):
> >> > 
> >> > ...
> >> > EXT3-fs (ubda): mounted filesystem with writeback data mode
> >> > VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
> >> > IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
> >> > IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
> >> > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> >> > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> >> > mountall: mount /sys/kernel/debug [218] killed by KILL signal
> >> > mountall: Filesystem could not be mounted: /sys/kernel/debug
> >> > mountall: mount /dev [219] killed by KILL signal
> >> > mountall: Filesystem could not be mounted: /dev
> >> > mountall: mount /tmp [220] killed by KILL signal
> >> > mountall: Filesystem could not be mounted: /tmp
> >> > mountall: mount /var/lock [222] killed by KILL signal
> >> > mountall: Filesystem could not be mounted: /var/lock
> >> > ...
> >> > 
> >> > 
> >> > With 803bf5ec reverted, UML comes up and the output looks like this:
> >> > 
> >> > ...
> >> > EXT3-fs (ubda): mounted filesystem with writeback data mode
> >> > VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
> >> > IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
> >> > IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
> >> > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> >> > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> >> > init: procps main process (226) terminated with status 255
> >> > fsck from util-linux-ng 2.16
> >> > ...
> >> 
> >> Jouni,
> >> 
> >> I can reproduce this now.  
> >> 
> >> We got the logic wrong in one of the cleanups and hence we aren't
> >> actually changing the stack reservation ever, when we intended on
> >> allocating up to 20 new pages.  
> >> 
> >> The:
> >> 	rlim_stack = min(rlim_stack, stack_size);
> >> always chooses stack_size hence we end up not changing the stack at all.
> >> This seems to cause fatal problems on UML, but is obviously not what was
> >> intended for archs as well.  
> >> 
> >> The following works for me on PPC64 64k and 4k pages and UML on x86_64. 
> >> 
> >> Let me know if it fixes it for you also.
> >> 
> >> Mikey
> >> 
> >> 
> >> exec/fs: fix initial stack reservation
> >> 
> >> 803bf5ec259941936262d10ecc84511b76a20921 (fs/exec.c: restrict initial
> >> stack space expansion to rlimit) attempts to limit the initial stack to
> >> 20*PAGE_SIZE.  Unfortunately, in also attempting ensure the stack is not
> >> reduced in size, we ended up not changing the stack at all.  
> >> 
> >> This caused a regression in UML resulting in most guest processes to be
> >> killed. 
> >> 
> >> Signed-off-by: Michael Neuling <mikey@neuling.org>
> >> cc: <stable@kernel.org>
> >> 
> >> diff --git a/fs/exec.c b/fs/exec.c
> >> index e95c692..e0e7b3c 100644
> >> --- a/fs/exec.c
> >> +++ b/fs/exec.c
> >> @@ -637,15 +637,16 @@ int setup_arg_pages(struct linux_binprm *bprm,
> >>  	 * will align it up.
> >>  	 */
> >>  	rlim_stack = rlimit(RLIMIT_STACK) & PAGE_MASK;
> >> -	rlim_stack = min(rlim_stack, stack_size);
> >>  #ifdef CONFIG_STACK_GROWSUP
> >>  	if (stack_size + stack_expand > rlim_stack)
> >> -		stack_base = vma->vm_start + rlim_stack;
> >> +		/*  Expand only to rlimit, making sure not to shrink it */
> >> +		stack_base = vma->vm_start + max(rlim_stack,stack_size);
> >>  	else
> >>  		stack_base = vma->vm_end + stack_expand;
> >>  #else
> >>  	if (stack_size + stack_expand > rlim_stack)
> >> -		stack_base = vma->vm_end - rlim_stack;
> >> +		/*  Expand only to rlimit, making sure not to shrink it */
> >> +		stack_base = vma->vm_end - max(rlim_stack,stack_size);
> >>  	else
> >>  		stack_base = vma->vm_start - stack_expand;
> >>  #endif
> >
> >-	rlim_stack = min(rlim_stack, stack_size);
> >+	/*  Expand only to rlimit, making sure not to shrink it */
> >+	rlim_stack = max(rlim_stack, stack_size);
> >
> >is better fix?
> >
> 
> Odd. If this is the right fix, 'stack_size" will be able to exceed
> stack rlimit, then Michael's previous rlimit patch will be useless.
> Am I missing something?
> 

This function is in exec processing, IOW user process doesn't start yet,
and stack_size is always PAGE_SIZE. 

No problem.

This expression only mean we parse "ulimit -s 1" as "ulimit -s 4".
(round up to one-page)




^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] exec/fs: fix initial stack reservation
  2010-02-15  8:57     ` [PATCH] exec/fs: fix initial stack reservation Michael Neuling
@ 2010-02-15  9:04       ` KOSAKI Motohiro
  2010-02-15  9:08       ` Américo Wang
  1 sibling, 0 replies; 15+ messages in thread
From: KOSAKI Motohiro @ 2010-02-15  9:04 UTC (permalink / raw)
  To: Michael Neuling
  Cc: kosaki.motohiro, Jouni Malinen, linux-kernel, Andrew Morton,
	anton

> In message <20100215155821.7298.A69D9226@jp.fujitsu.com> you wrote:
> > > 
> > > 
> > > In message <20100214164023.GA2726@jm.kir.nu> you wrote:
> > > > It looks like the commit 803bf5ec259941936262d10ecc84511b76a20921
> > > > (fs/exec.c: restrict initial stack space expansion to rlimit) broke my
> > > > user mode Linux setup by somehow preventing system setup from running
> > > > properly (or killing some processes that try to mount things, etc.).
> > > > This commit turned up as the reason based on git bisect and reverting it
> > > > fixes my UML test setup (Ubuntu 9.10 on both host and in UML and AMD64
> > > > arch for both). I have no idea what exactly would be the main cause for
> > > > this issue, but this looks like a somewhat unfortunately timed
> > > > regression in 2.6.33-rc8.
> > > > 
> > > > The failed run shows like this (with current linux-2.6.git):
> > > > 
> > > > ...
> > > > EXT3-fs (ubda): mounted filesystem with writeback data mode
> > > > VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
> > > > IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
> > > > IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
> > > > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> > > > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> > > > mountall: mount /sys/kernel/debug [218] killed by KILL signal
> > > > mountall: Filesystem could not be mounted: /sys/kernel/debug
> > > > mountall: mount /dev [219] killed by KILL signal
> > > > mountall: Filesystem could not be mounted: /dev
> > > > mountall: mount /tmp [220] killed by KILL signal
> > > > mountall: Filesystem could not be mounted: /tmp
> > > > mountall: mount /var/lock [222] killed by KILL signal
> > > > mountall: Filesystem could not be mounted: /var/lock
> > > > ...
> > > > 
> > > > 
> > > > With 803bf5ec reverted, UML comes up and the output looks like this:
> > > > 
> > > > ...
> > > > EXT3-fs (ubda): mounted filesystem with writeback data mode
> > > > VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
> > > > IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
> > > > IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
> > > > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> > > > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> > > > init: procps main process (226) terminated with status 255
> > > > fsck from util-linux-ng 2.16
> > > > ...
> > > 
> > > Jouni,
> > > 
> > > I can reproduce this now.  
> > > 
> > > We got the logic wrong in one of the cleanups and hence we aren't
> > > actually changing the stack reservation ever, when we intended on
> > > allocating up to 20 new pages.  
> > > 
> > > The:
> > > 	rlim_stack = min(rlim_stack, stack_size);
> > > always chooses stack_size hence we end up not changing the stack at all.
> > > This seems to cause fatal problems on UML, but is obviously not what was
> > > intended for archs as well.  
> > > 
> > > The following works for me on PPC64 64k and 4k pages and UML on x86_64. 
> > > 
> > > Let me know if it fixes it for you also.
> > > 
> > > Mikey
> > > 
> > > 
> > > exec/fs: fix initial stack reservation
> > > 
> > > 803bf5ec259941936262d10ecc84511b76a20921 (fs/exec.c: restrict initial
> > > stack space expansion to rlimit) attempts to limit the initial stack to
> > > 20*PAGE_SIZE.  Unfortunately, in also attempting ensure the stack is not
> > > reduced in size, we ended up not changing the stack at all.  
> > > 
> > > This caused a regression in UML resulting in most guest processes to be
> > > killed. 
> > > 
> > > Signed-off-by: Michael Neuling <mikey@neuling.org>
> > > cc: <stable@kernel.org>
> > > 
> > > diff --git a/fs/exec.c b/fs/exec.c
> > > index e95c692..e0e7b3c 100644
> > > --- a/fs/exec.c
> > > +++ b/fs/exec.c
> > > @@ -637,15 +637,16 @@ int setup_arg_pages(struct linux_binprm *bprm,
> > >  	 * will align it up.
> > >  	 */
> > >  	rlim_stack = rlimit(RLIMIT_STACK) & PAGE_MASK;
> > > -	rlim_stack = min(rlim_stack, stack_size);
> > >  #ifdef CONFIG_STACK_GROWSUP
> > >  	if (stack_size + stack_expand > rlim_stack)
> > > -		stack_base = vma->vm_start + rlim_stack;
> > > +		/*  Expand only to rlimit, making sure not to shrink it */
> > > +		stack_base = vma->vm_start + max(rlim_stack,stack_size);
> > >  	else
> > >  		stack_base = vma->vm_end + stack_expand;
> > >  #else
> > >  	if (stack_size + stack_expand > rlim_stack)
> > > -		stack_base = vma->vm_end - rlim_stack;
> > > +		/*  Expand only to rlimit, making sure not to shrink it */
> > > +		stack_base = vma->vm_end - max(rlim_stack,stack_size);
> > >  	else
> > >  		stack_base = vma->vm_start - stack_expand;
> > >  #endif
> > 
> > -	rlim_stack = min(rlim_stack, stack_size);
> > +	/*  Expand only to rlimit, making sure not to shrink it */
> > +	rlim_stack = max(rlim_stack, stack_size);
> > 
> > is better fix?
> 
> Actually, I think we can just get rid of min() line altogether.
> expand_stack checks to make sure the stack is getting bigger, otherwise
> it does nothing.  We don't need to bother with this check.
> 
> The below works for me on UML x86_64 and ppc64 64k and 4k pages.

OK, Right you are.
	Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>


> 
> Mikey
> 
> exec/fs: fix initial stack reservation
> 
> 803bf5ec259941936262d10ecc84511b76a20921 (fs/exec.c: restrict initial
> stack space expansion to rlimit) attempts to limit the initial stack to
> 20*PAGE_SIZE.  Unfortunately, in attempting ensure the stack is not
> reduced in size, we ended up not changing the stack at all.
> 
> This size reduction check is not necessary as the expand_stack call does
> this already.
> 
> This caused a regression in UML resulting in most guest processes being
> killed.
> 
> Signed-off-by: Michael Neuling <mikey@neuling.org>
> cc: <stable@kernel.org>
> ---
>  fs/exec.c |    1 -
>  1 file changed, 1 deletion(-)
> 
> Index: linux-2.6-ozlabs/fs/exec.c
> ===================================================================
> --- linux-2.6-ozlabs.orig/fs/exec.c
> +++ linux-2.6-ozlabs/fs/exec.c
> @@ -637,7 +637,6 @@ int setup_arg_pages(struct linux_binprm 
>  	 * will align it up.
>  	 */
>  	rlim_stack = rlimit(RLIMIT_STACK) & PAGE_MASK;
> -	rlim_stack = min(rlim_stack, stack_size);
>  #ifdef CONFIG_STACK_GROWSUP
>  	if (stack_size + stack_expand > rlim_stack)
>  		stack_base = vma->vm_start + rlim_stack;




^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] exec/fs: fix initial stack reservation
  2010-02-15  8:57     ` [PATCH] exec/fs: fix initial stack reservation Michael Neuling
  2010-02-15  9:04       ` KOSAKI Motohiro
@ 2010-02-15  9:08       ` Américo Wang
  1 sibling, 0 replies; 15+ messages in thread
From: Américo Wang @ 2010-02-15  9:08 UTC (permalink / raw)
  To: Michael Neuling
  Cc: KOSAKI Motohiro, Jouni Malinen, linux-kernel, Andrew Morton,
	anton

On Mon, Feb 15, 2010 at 07:57:11PM +1100, Michael Neuling wrote:
>In message <20100215155821.7298.A69D9226@jp.fujitsu.com> you wrote:
>> > 
>> > 
>> > In message <20100214164023.GA2726@jm.kir.nu> you wrote:
>> > > It looks like the commit 803bf5ec259941936262d10ecc84511b76a20921
>> > > (fs/exec.c: restrict initial stack space expansion to rlimit) broke my
>> > > user mode Linux setup by somehow preventing system setup from running
>> > > properly (or killing some processes that try to mount things, etc.).
>> > > This commit turned up as the reason based on git bisect and reverting it
>> > > fixes my UML test setup (Ubuntu 9.10 on both host and in UML and AMD64
>> > > arch for both). I have no idea what exactly would be the main cause for
>> > > this issue, but this looks like a somewhat unfortunately timed
>> > > regression in 2.6.33-rc8.
>> > > 
>> > > The failed run shows like this (with current linux-2.6.git):
>> > > 
>> > > ...
>> > > EXT3-fs (ubda): mounted filesystem with writeback data mode
>> > > VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
>> > > IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
>> > > IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
>> > > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
>> > > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
>> > > mountall: mount /sys/kernel/debug [218] killed by KILL signal
>> > > mountall: Filesystem could not be mounted: /sys/kernel/debug
>> > > mountall: mount /dev [219] killed by KILL signal
>> > > mountall: Filesystem could not be mounted: /dev
>> > > mountall: mount /tmp [220] killed by KILL signal
>> > > mountall: Filesystem could not be mounted: /tmp
>> > > mountall: mount /var/lock [222] killed by KILL signal
>> > > mountall: Filesystem could not be mounted: /var/lock
>> > > ...
>> > > 
>> > > 
>> > > With 803bf5ec reverted, UML comes up and the output looks like this:
>> > > 
>> > > ...
>> > > EXT3-fs (ubda): mounted filesystem with writeback data mode
>> > > VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
>> > > IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
>> > > IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
>> > > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
>> > > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
>> > > init: procps main process (226) terminated with status 255
>> > > fsck from util-linux-ng 2.16
>> > > ...
>> > 
>> > Jouni,
>> > 
>> > I can reproduce this now.  
>> > 
>> > We got the logic wrong in one of the cleanups and hence we aren't
>> > actually changing the stack reservation ever, when we intended on
>> > allocating up to 20 new pages.  
>> > 
>> > The:
>> > 	rlim_stack = min(rlim_stack, stack_size);
>> > always chooses stack_size hence we end up not changing the stack at all.
>> > This seems to cause fatal problems on UML, but is obviously not what was
>> > intended for archs as well.  
>> > 
>> > The following works for me on PPC64 64k and 4k pages and UML on x86_64. 
>> > 
>> > Let me know if it fixes it for you also.
>> > 
>> > Mikey
>> > 
>> > 
>> > exec/fs: fix initial stack reservation
>> > 
>> > 803bf5ec259941936262d10ecc84511b76a20921 (fs/exec.c: restrict initial
>> > stack space expansion to rlimit) attempts to limit the initial stack to
>> > 20*PAGE_SIZE.  Unfortunately, in also attempting ensure the stack is not
>> > reduced in size, we ended up not changing the stack at all.  
>> > 
>> > This caused a regression in UML resulting in most guest processes to be
>> > killed. 
>> > 
>> > Signed-off-by: Michael Neuling <mikey@neuling.org>
>> > cc: <stable@kernel.org>
>> > 
>> > diff --git a/fs/exec.c b/fs/exec.c
>> > index e95c692..e0e7b3c 100644
>> > --- a/fs/exec.c
>> > +++ b/fs/exec.c
>> > @@ -637,15 +637,16 @@ int setup_arg_pages(struct linux_binprm *bprm,
>> >  	 * will align it up.
>> >  	 */
>> >  	rlim_stack = rlimit(RLIMIT_STACK) & PAGE_MASK;
>> > -	rlim_stack = min(rlim_stack, stack_size);
>> >  #ifdef CONFIG_STACK_GROWSUP
>> >  	if (stack_size + stack_expand > rlim_stack)
>> > -		stack_base = vma->vm_start + rlim_stack;
>> > +		/*  Expand only to rlimit, making sure not to shrink it */
>> > +		stack_base = vma->vm_start + max(rlim_stack,stack_size);
>> >  	else
>> >  		stack_base = vma->vm_end + stack_expand;
>> >  #else
>> >  	if (stack_size + stack_expand > rlim_stack)
>> > -		stack_base = vma->vm_end - rlim_stack;
>> > +		/*  Expand only to rlimit, making sure not to shrink it */
>> > +		stack_base = vma->vm_end - max(rlim_stack,stack_size);
>> >  	else
>> >  		stack_base = vma->vm_start - stack_expand;
>> >  #endif
>> 
>> -	rlim_stack = min(rlim_stack, stack_size);
>> +	/*  Expand only to rlimit, making sure not to shrink it */
>> +	rlim_stack = max(rlim_stack, stack_size);
>> 
>> is better fix?
>
>Actually, I think we can just get rid of min() line altogether.
>expand_stack checks to make sure the stack is getting bigger, otherwise
>it does nothing.  We don't need to bother with this check.
>

Right...

Above change makes me confused. :-( But now, everything is clear.


>The below works for me on UML x86_64 and ppc64 64k and 4k pages.
>
>Mikey
>
>exec/fs: fix initial stack reservation
>
>803bf5ec259941936262d10ecc84511b76a20921 (fs/exec.c: restrict initial
>stack space expansion to rlimit) attempts to limit the initial stack to
>20*PAGE_SIZE.  Unfortunately, in attempting ensure the stack is not
>reduced in size, we ended up not changing the stack at all.
>
>This size reduction check is not necessary as the expand_stack call does
>this already.
>
>This caused a regression in UML resulting in most guest processes being
>killed.
>
>Signed-off-by: Michael Neuling <mikey@neuling.org>
>cc: <stable@kernel.org>


This one definitely better.

Acked-by: WANG Cong <xiyou.wangcong@gmail.com>

>---
> fs/exec.c |    1 -
> 1 file changed, 1 deletion(-)
>
>Index: linux-2.6-ozlabs/fs/exec.c
>===================================================================
>--- linux-2.6-ozlabs.orig/fs/exec.c
>+++ linux-2.6-ozlabs/fs/exec.c
>@@ -637,7 +637,6 @@ int setup_arg_pages(struct linux_binprm 
> 	 * will align it up.
> 	 */
> 	rlim_stack = rlimit(RLIMIT_STACK) & PAGE_MASK;
>-	rlim_stack = min(rlim_stack, stack_size);
> #ifdef CONFIG_STACK_GROWSUP
> 	if (stack_size + stack_expand > rlim_stack)
> 		stack_base = vma->vm_start + rlim_stack;
>--
>To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html
>Please read the FAQ at  http://www.tux.org/lkml/

-- 
Live like a child, think like the god.
 

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2010-02-15  9:05 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-02-14 16:40 2.6.33-rc8 breaks UML with Restrict initial stack space expansion to rlimit Jouni Malinen
2010-02-14 22:03 ` Michael Neuling
2010-02-15  2:38   ` KOSAKI Motohiro
2010-02-15  7:02     ` Jouni Malinen
2010-02-15  6:56   ` Jouni Malinen
2010-02-14 23:23 ` Rafael J. Wysocki
2010-02-15  6:30 ` Michael Neuling
2010-02-15  6:59   ` KOSAKI Motohiro
2010-02-15  7:17     ` Jouni Malinen
2010-02-15  8:57     ` [PATCH] exec/fs: fix initial stack reservation Michael Neuling
2010-02-15  9:04       ` KOSAKI Motohiro
2010-02-15  9:08       ` Américo Wang
2010-02-15  8:57     ` 2.6.33-rc8 breaks UML with Restrict initial stack space expansion to rlimit Américo Wang
2010-02-15  9:03       ` KOSAKI Motohiro
2010-02-15  7:12   ` Jouni Malinen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox