public inbox for linux-kernel@vger.kernel.org
* 3.9-rc5: Encountedred INFO: rcu_sched self-detected stall on CPU due to 09a9f1d27
@ 2013-04-12 18:13 Vivek Goyal
  2013-04-13  2:12 ` Michel Lespinasse
  2013-04-13  2:31 ` Michel Lespinasse
  0 siblings, 2 replies; 13+ messages in thread
From: Vivek Goyal @ 2013-04-12 18:13 UTC (permalink / raw)
  To: linux kernel mailing list, Michel Lespinasse
  Cc: Hugh Dickins, Rik van Riel, Paul E. McKenney

Hi, 

I am writing some code where I lock down a process's memory at exec() time.
My patches were working fine till 3.9-rc4 and suddenly things broke down
in 3.9-rc5.

Whenever I try to exec() a process with memory locked down, my bash
session hangs and after a while I get the following warning.

login: [  174.669002] INFO: rcu_sched self-detected stall on CPU { 2}  (t=60000 jiffies g=2580 c=2579 q=1085)
[  174.669002] Pid: 4894, comm: kexec Not tainted 3.9.0-rc6+ #243
[  174.669002] Call Trace:
[  174.669002]  <IRQ>  [<ffffffff810c413a>] rcu_check_callbacks+0x21a/0x760
[  174.669002]  [<ffffffff810c7c0c>] ? acct_account_cputime+0x1c/0x20
[  174.669002]  [<ffffffff8104fd08>] update_process_times+0x48/0x80
[  174.669002]  [<ffffffff810913dd>] tick_sched_handle+0x3d/0x50
[  174.669002]  [<ffffffff810915e5>] tick_sched_timer+0x45/0x70
[  174.669002]  [<ffffffff81066951>] __run_hrtimer+0x81/0x220
[  174.669002]  [<ffffffff810915a0>] ? tick_nohz_handler+0xa0/0xa0
[  174.669002]  [<ffffffff8108ae0c>] ? ktime_get_update_offsets+0x4c/0xd0
[  174.669002]  [<ffffffff81067297>] hrtimer_interrupt+0xf7/0x250
[  174.669002]  [<ffffffff81886739>] smp_apic_timer_interrupt+0x69/0x99
[  174.669002]  [<ffffffff818859ca>] apic_timer_interrupt+0x6a/0x70
[  174.669002]  <EOI>  [<ffffffff8111e557>] ?  __mlock_vma_pages_range+0x57/0x70
[  174.669002]  [<ffffffff8111e568>] ? __mlock_vma_pages_range+0x68/0x70
[  174.669002]  [<ffffffff8111ea01>] __mm_populate+0x71/0x140
[  174.669002]  [<ffffffff81121b5f>] vm_brk+0x7f/0xa0
[  174.669002]  [<ffffffff81199633>] load_elf_binary+0x1a73/0x1b10
[  174.669002]  [<ffffffff812d25a5>] ? ima_bprm_check+0x55/0x70
[  174.669002]  [<ffffffff8114890a>] search_binary_handler+0x12a/0x3b0
[  174.669002]  [<ffffffff81197bc0>] ? load_elf_library+0x210/0x210
[  174.669002]  [<ffffffff8114aa00>] do_execve_common+0x500/0x5c0
[  174.669002]  [<ffffffff8114aaf7>] do_execve+0x37/0x40
[  174.669002]  [<ffffffff8114ad9d>] sys_execve+0x3d/0x60
[  174.669002]  [<ffffffff81885379>] stub_execve+0x69/0xa0

I did a git bisect and bisection says that the following is the first bad
commit.

commit 09a9f1d27892255cfb9c91203f19476765e2d8d1
Author: Michel Lespinasse <walken@google.com>
Date:   Thu Mar 28 16:26:23 2013 -0700

    Revert "mm: introduce VM_POPULATE flag to better deal with racy userspace pr
    
    This reverts commit 186930500985 ("mm: introduce VM_POPULATE flag to
    better deal with racy userspace programs").

I reverted the above commit and the problem went away.

Following is my simple patch to lock down a selected process's memory.

Index: linux-2.6/fs/binfmt_elf.c
===================================================================
--- linux-2.6.orig/fs/binfmt_elf.c      2013-04-13 01:50:26.380184101 -0400
+++ linux-2.6/fs/binfmt_elf.c   2013-04-13 01:50:49.827184821 -0400
@@ -721,6 +721,10 @@ static int load_elf_binary(struct linux_

        /* OK, This is the point of no return */
        current->mm->def_flags = def_flags;
+       if (!strcmp(bprm->filename, "/sbin/kexec")) {
+               printk("Memlocking /sbin/kexec\n");
+               current->mm->def_flags |= VM_LOCKED;
+       }

        /* Do this immediately, since STACK_TOP as used in setup_arg_pages
           may depend on the personality.  */


Do you have any thoughts on what's going on? I am wondering if it indicates
a bigger problem which can then be triggered from other paths too.

Thanks
Vivek


* Re: 3.9-rc5: Encountedred INFO: rcu_sched self-detected stall on CPU due to 09a9f1d27
  2013-04-12 18:13 3.9-rc5: Encountedred INFO: rcu_sched self-detected stall on CPU due to 09a9f1d27 Vivek Goyal
@ 2013-04-13  2:12 ` Michel Lespinasse
  2013-04-13  2:31 ` Michel Lespinasse
  1 sibling, 0 replies; 13+ messages in thread
From: Michel Lespinasse @ 2013-04-13  2:12 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: linux kernel mailing list, Hugh Dickins, Rik van Riel,
	Paul E. McKenney

On Fri, Apr 12, 2013 at 11:13 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> [..]

Based on your patch, it looks like from 3.9-rc1 to 3.9-rc5 your change
wouldn't actually cause pages to get mlocked during exec - for this
range of kernels, mlockall would need to set both VM_LOCKED and

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.


* Re: 3.9-rc5: Encountedred INFO: rcu_sched self-detected stall on CPU due to 09a9f1d27
  2013-04-12 18:13 3.9-rc5: Encountedred INFO: rcu_sched self-detected stall on CPU due to 09a9f1d27 Vivek Goyal
  2013-04-13  2:12 ` Michel Lespinasse
@ 2013-04-13  2:31 ` Michel Lespinasse
  2013-04-15 16:35   ` Vivek Goyal
  1 sibling, 1 reply; 13+ messages in thread
From: Michel Lespinasse @ 2013-04-13  2:31 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: linux kernel mailing list, Hugh Dickins, Rik van Riel,
	Paul E. McKenney

Sorry for the earlier message getting sent before I was done typing it.

On Fri, Apr 12, 2013 at 11:13 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> [..]

Based on your patch, it looks like from 3.9-rc1 to 3.9-rc5 your change
wouldn't actually cause pages to get mlocked during exec - for this
range of kernels, mlockall would need to set both VM_LOCKED and
VM_POPULATE. I suspect you would see the same crash if you included
VM_POPULATE in your change, too.

That said, I am not sure immediately what's wrong. It looks like a
deadlock situation; does CONFIG_LOCKDEP help here?

My first guess would be that mmap_sem is held during exec, so you
can't have __mm_populate() try holding it recursively.

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.
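
One way to check from userspace whether a process's pages actually ended up
mlocked is to look at the VmLck field of /proc/<pid>/status. The helper below
is only a sketch: the name parse_vmlck_kb() is made up for illustration, and
it works on the text of a status file that the caller has already read.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse the "VmLck:" field (locked memory, in kB) out of the text of a
 * /proc/<pid>/status file. Returns -1 if the field is not present.
 * parse_vmlck_kb() is a hypothetical helper, not an existing API.
 */
static long parse_vmlck_kb(const char *status_text)
{
	const char *p = status_text;
	long kb;

	while (p && *p) {
		/* sscanf only succeeds on a line that starts with "VmLck:" */
		if (sscanf(p, "VmLck: %ld kB", &kb) == 1)
			return kb;
		p = strchr(p, '\n');
		if (p)
			p++;	/* step past the newline to the next line */
	}
	return -1;
}
```

A non-zero value for the exec'd process would suggest that the VM_LOCKED
default flag really did result in locked pages.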


* Re: 3.9-rc5: Encountedred INFO: rcu_sched self-detected stall on CPU due to 09a9f1d27
  2013-04-13  2:31 ` Michel Lespinasse
@ 2013-04-15 16:35   ` Vivek Goyal
  2013-04-15 17:34     ` Vivek Goyal
  0 siblings, 1 reply; 13+ messages in thread
From: Vivek Goyal @ 2013-04-15 16:35 UTC (permalink / raw)
  To: Michel Lespinasse
  Cc: linux kernel mailing list, Hugh Dickins, Rik van Riel,
	Paul E. McKenney

On Fri, Apr 12, 2013 at 07:31:32PM -0700, Michel Lespinasse wrote:
> Sorry for the earlier message getting sent before I was done typing it.
> 
> On Fri, Apr 12, 2013 at 11:13 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > [..]
> 
> Based on your patch, it looks like from 3.9-rc1 to 3.9-rc5 your change
> wouldn't actually cause pages to get mlocked during exec - for this
> range of kernels, mlockall would need to set both VM_LOCKED and
> VM_POPULATE. I suspect you would see the same crash if you included
> VM_POPULATE in your change, too.

Ok. I specified VM_POPULATE and I see the same problem.
> 
> That said, I am not sure immediately what's wrong. It looks like a
> deadlock situation, does CONFIG_LOCKDEP help here ?

I turned on CONFIG_LOCKDEP and nothing shows up.

> 
> My first guess would be that mmap_sem is held during exec, so you
> can't have __mm_populate() try holding it recursively.

I think it is not mmap_sem, as even with VM_LOCKED we take mmap_sem
and things are fine.

So things work till 3.8 and break in 3.9-rc1 (with both VM_LOCKED and
VM_POPULATE specified). I will do a git bisect and try to figure out which
is the first commit that has the issue.

Thanks
Vivek


* Re: 3.9-rc5: Encountedred INFO: rcu_sched self-detected stall on CPU due to 09a9f1d27
  2013-04-15 16:35   ` Vivek Goyal
@ 2013-04-15 17:34     ` Vivek Goyal
  2013-04-15 17:59       ` Vivek Goyal
  0 siblings, 1 reply; 13+ messages in thread
From: Vivek Goyal @ 2013-04-15 17:34 UTC (permalink / raw)
  To: Michel Lespinasse
  Cc: linux kernel mailing list, Hugh Dickins, Rik van Riel,
	Paul E. McKenney

On Mon, Apr 15, 2013 at 12:35:52PM -0400, Vivek Goyal wrote:

[..]
> > My first guess would be that mmap_sem is held during exec, so you
> > can't have __mm_populate() try holding it recursively.
> 
> I think it is not mmap_sem as even with VM_LOCKED, we take mmap_sem
> and things are fine. 
> 
> So things work till 3.8 and break in 3.9-rc1 (with both VM_LOCKED and
> VM_POPULATE specified). I will do a git bisect and try to figure out which
> is the first commit that has the issue.

Ok, the following seems to be the first bad commit.

commit bebeb3d68b24bb4132d452c5707fe321208bcbcd
Author: Michel Lespinasse <walken@google.com>
Date:   Fri Feb 22 16:32:37 2013 -0800

    mm: introduce mm_populate() for populating new vmas

I have locked down /sbin/kexec, and I get the following traceback after a while.

Thanks
Vivek

[   93.130001] INFO: rcu_sched self-detected stall on CPU[   93.131007] INFO: rcu_sched detected stalls on CPUs/tasks: { 2} (detected by 3, t=60002 jiffies, g=2786, c=2785, q=10480)
[   93.131016] Task dump for CPU 2:
[   93.131018] kexec           R  running task        0  4861   2982 0x00000088
[   93.131022]  ffff88013252bc68 ffffffff8109ab12 00000000ffffffff 0000000000000000
[   93.131026]  ffff88013252bc58 ffff88013a43c180 ffffffff81881edc 00007feed583221a
[   93.131028]  00007feed5832000 0000000000000000 ffff88013a43c828 ffff88013a43c180
[   93.131032] Call Trace:
[   93.131032]  [<ffffffff8109ab12>] ? mark_held_locks+0x82/0x130
[   93.131040]  [<ffffffff81881edc>] ? retint_restore_args+0xe/0xe
[   93.131045]  [<ffffffff8109ab12>] ? mark_held_locks+0x82/0x130
[   93.131047]  [<ffffffff81881edc>] ? retint_restore_args+0xe/0xe
[   93.131049]  [<ffffffff8109acc5>] ? trace_hardirqs_on_caller+0x105/0x190
[   93.131051]  [<ffffffff8132913e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[   93.131055]  [<ffffffff81881edc>] ? retint_restore_args+0xe/0xe
[   93.131058]  [<ffffffff81127898>] ? __get_user_pages+0x1d8/0x540
[   93.131061]  [<ffffffff81129fb7>] ? __mlock_vma_pages_range+0x67/0x70
[   93.131064]  [<ffffffff8112a721>] ? __mm_populate+0x71/0x150
[   93.131066]  [<ffffffff8111d01f>] ? vm_mmap_pgoff+0xdf/0xf0
[   93.131071]  [<ffffffff8112c839>] ? sys_mmap_pgoff+0xa9/0x180
[   93.131073]  [<ffffffff8109acc5>] ? trace_hardirqs_on_caller+0x105/0x190
[   93.131076]  [<ffffffff8132913e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[   93.131078]  [<ffffffff81006972>] ? sys_mmap+0x22/0x30
[   93.131081]  [<ffffffff8188a342>] ? system_call_fastpath+0x16/0x1b

[   93.130001]  { 2}  (t=60153 jiffies g=2786 c=2785 q=10480)
[   93.130001] Pid: 4861, comm: kexec Not tainted 3.8.0+ #270
[   93.130001] Call Trace:
[   93.130001]  <IRQ>  [<ffffffff810ce69a>] rcu_check_callbacks+0x21a/0x760
[   93.130001]  [<ffffffff810d22ec>] ? acct_account_cputime+0x1c/0x20
[   93.130001]  [<ffffffff81077570>] ? account_system_time+0x160/0x230
[   93.130001]  [<ffffffff8104f388>] update_process_times+0x48/0x80
[   93.130001]  [<ffffffff810934ad>] tick_sched_handle+0x3d/0x50
[   93.130001]  [<ffffffff810936b5>] tick_sched_timer+0x45/0x70
[   93.130001]  [<ffffffff81067661>] __run_hrtimer+0x81/0x220
[   93.130001]  [<ffffffff81093670>] ? tick_nohz_handler+0xa0/0xa0
[   93.130001]  [<ffffffff8108ce7c>] ? ktime_get_update_offsets+0x4c/0xd0
[   93.130001]  [<ffffffff81067fb7>] hrtimer_interrupt+0xf7/0x250
[   93.130001]  [<ffffffff8188bdb9>] smp_apic_timer_interrupt+0x69/0x99
[   93.130001]  [<ffffffff8188af6f>] apic_timer_interrupt+0x6f/0x80
[   93.130001]  <EOI>  [<ffffffff81881edc>] ? retint_restore_args+0xe/0xe
[   93.130001]  [<ffffffff81127898>] ? __get_user_pages+0x1d8/0x540
[   93.130001]  [<ffffffff81129fb7>] __mlock_vma_pages_range+0x67/0x70
[   93.130001]  [<ffffffff8112a721>] __mm_populate+0x71/0x150
[   93.130001]  [<ffffffff8111d01f>] vm_mmap_pgoff+0xdf/0xf0
[   93.130001]  [<ffffffff8112c839>] sys_mmap_pgoff+0xa9/0x180
[   93.130001]  [<ffffffff8109acc5>] ? trace_hardirqs_on_caller+0x105/0x190
[   93.130001]  [<ffffffff8132913e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[   93.130001]  [<ffffffff81006972>] sys_mmap+0x22/0x30
[   93.130001]  [<ffffffff8188a342>] system_call_fastpath+0x16/0x1b


* Re: 3.9-rc5: Encountedred INFO: rcu_sched self-detected stall on CPU due to 09a9f1d27
  2013-04-15 17:34     ` Vivek Goyal
@ 2013-04-15 17:59       ` Vivek Goyal
  2013-04-15 19:13         ` Vivek Goyal
  0 siblings, 1 reply; 13+ messages in thread
From: Vivek Goyal @ 2013-04-15 17:59 UTC (permalink / raw)
  To: Michel Lespinasse
  Cc: linux kernel mailing list, Hugh Dickins, Rik van Riel,
	Paul E. McKenney, Andrew Morton

CCing akpm.

Vivek

On Mon, Apr 15, 2013 at 01:34:24PM -0400, Vivek Goyal wrote:
> [..]


* Re: 3.9-rc5: Encountedred INFO: rcu_sched self-detected stall on CPU due to 09a9f1d27
  2013-04-15 17:59       ` Vivek Goyal
@ 2013-04-15 19:13         ` Vivek Goyal
  2013-04-15 21:47           ` Hugh Dickins
  0 siblings, 1 reply; 13+ messages in thread
From: Vivek Goyal @ 2013-04-15 19:13 UTC (permalink / raw)
  To: Michel Lespinasse
  Cc: linux kernel mailing list, Hugh Dickins, Rik van Riel,
	Paul E. McKenney, Andrew Morton

On Mon, Apr 15, 2013 at 01:59:29PM -0400, Vivek Goyal wrote:
> CCing akpm.
> 
> Vivek
> 
> On Mon, Apr 15, 2013 at 01:34:24PM -0400, Vivek Goyal wrote:
> > On Mon, Apr 15, 2013 at 12:35:52PM -0400, Vivek Goyal wrote:
> > 
> > [..]
> > > > My first guess would be that mmap_sem is held during exec, so you
> > > > can't have __mm_populate() try holding it recursively.
> > > 
> > > I think it is not mmap_sem as even with VM_LOCKED, we take mmap_sem
> > > and things are fine. 
> > > 
> > > > So things work till 3.8 and break in 3.9-rc1 (with both VM_LOCKED and
> > > > VM_POPULATE specified). I will do a git bisect and try to figure out which
> > > > is the first commit that has the issue.
> > 
> > Ok, following seems to be first bad commit.
> > 
> > commit bebeb3d68b24bb4132d452c5707fe321208bcbcd
> > Author: Michel Lespinasse <walken@google.com>
> > Date:   Fri Feb 22 16:32:37 2013 -0800
> > 
> >     mm: introduce mm_populate() for populating new vmas
> > 

Michel,

An interesting observation: after this commit, it looks like a simple
mmap(MAP_LOCKED) of a file was broken; it would hang and give an RCU stall
warning similar to the one I get with my patch locking /sbin/kexec.

But in the latest kernel mmap(MAP_LOCKED) does not hang. So it looks like
this problem got fixed by a patch after this first bad commit. But the
issue with locking /sbin/kexec still remains.

I used the following test program to map an arbitrary file.

Thanks
Vivek


#define _GNU_SOURCE
#include <stdio.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map the file named on the command line with MAP_LOCKED. */
int main(int argc, char *argv[])
{
	const char *filename;
	int fd, ret;
	void *file_addr;
	struct stat stats;

	if (argc != 2) {
		fprintf(stderr, "Enter file name\n");
		exit(1);
	}
	filename = argv[1];

	fd = open(filename, O_RDONLY);
	if (fd == -1) {
		fprintf(stderr, "Open of file %s failed:%s\n", filename,
					strerror(errno));
		exit(1);
	}

	/* The file size determines the length of the mapping. */
	ret = fstat(fd, &stats);
	if (ret == -1) {
		fprintf(stderr, "fstat of file %s failed:%s\n", filename,
				strerror(errno));
		exit(1);
	}

	/* MAP_LOCKED asks the kernel to lock the mapping's pages in memory. */
	file_addr = mmap(NULL, stats.st_size, PROT_READ, MAP_PRIVATE | MAP_LOCKED, fd, 0);

	if (file_addr == MAP_FAILED) {
		fprintf(stderr, "mmap() failed:%s\n", strerror(errno));
		exit(1);
	}

	return 0;
}
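
To see whether a mapping like the one above was actually populated, a test
could ask the kernel which pages are resident using mincore(). This is only a
sketch and not part of the program above; count_resident_pages() is a made-up
helper name, and it assumes addr is page-aligned (as mmap() return values are).

```c
#define _DEFAULT_SOURCE
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Count how many pages of [addr, addr + len) are resident in memory.
 * addr must be page-aligned. Returns -1 on error.
 * count_resident_pages() is a hypothetical helper for illustration.
 */
static long count_resident_pages(void *addr, size_t len)
{
	long page_size = sysconf(_SC_PAGESIZE);
	size_t pages = (len + page_size - 1) / page_size;
	unsigned char *vec = malloc(pages);
	long resident = 0;
	size_t i;

	if (!vec)
		return -1;
	if (mincore(addr, len, vec) == -1) {
		free(vec);
		return -1;
	}
	/* Bit 0 of each vector byte is set if that page is resident. */
	for (i = 0; i < pages; i++)
		if (vec[i] & 1)
			resident++;
	free(vec);
	return resident;
}
```

After a successful mmap(MAP_LOCKED), one would expect the count to match the
number of pages in the mapping.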


* Re: 3.9-rc5: Encountedred INFO: rcu_sched self-detected stall on CPU due to 09a9f1d27
  2013-04-15 19:13         ` Vivek Goyal
@ 2013-04-15 21:47           ` Hugh Dickins
  2013-04-16  0:21             ` Michel Lespinasse
  0 siblings, 1 reply; 13+ messages in thread
From: Hugh Dickins @ 2013-04-15 21:47 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Michel Lespinasse, linux kernel mailing list, Rik van Riel,
	Paul E. McKenney, Andrew Morton

On Mon, 15 Apr 2013, Vivek Goyal wrote:
> On Mon, Apr 15, 2013 at 01:59:29PM -0400, Vivek Goyal wrote:
> > On Mon, Apr 15, 2013 at 01:34:24PM -0400, Vivek Goyal wrote:
> > > On Mon, Apr 15, 2013 at 12:35:52PM -0400, Vivek Goyal wrote:
> > > 
> > > [..]
> > > > > My first guess would be that mmap_sem is held during exec, so you
> > > > > can't have __mm_populate() try holding it recursively.
> > > > 
> > > > I think it is not mmap_sem as even with VM_LOCKED, we take mmap_sem
> > > > and things are fine. 
> > > > 
> > > > > So things work till 3.8 and break in 3.9-rc1 (with both VM_LOCKED and
> > > > > VM_POPULATE specified). I will do a git bisect and try to figure out which
> > > > > is the first commit that has the issue.
> > > 
> > > Ok, following seems to be first bad commit.
> > > 
> > > commit bebeb3d68b24bb4132d452c5707fe321208bcbcd
> > > Author: Michel Lespinasse <walken@google.com>
> > > Date:   Fri Feb 22 16:32:37 2013 -0800
> > > 
> > >     mm: introduce mm_populate() for populating new vmas
> > > 
> 
> Michel,
> 
> An interesting observation. After this commit looks like simple
> mmap(MAP_LOCKED) of a file was broken and it would hang and give RCU stall
> warning similar to my patch of locking /sbin/kexec.
> 
> But in latest kernel mmap(MAP_LOCKED) does not hang. So looks like
> this problem got fixed in a patch after this first bad commit. But
> locking /sbin/kexec issue still remains.

I haven't tried to understand that.  But I did just try your
def_flags |= VM_LOCKED hack to fs/binfmt_elf.c, and CONFIG_DEBUG_VM=y
quickly suggested the patch below - without the BUG, yes, __mm_populate
might well loop forever trying to populate 0 pages.

Whether a fix is actually needed, and whether it should be fixed here
or elsewhere, I'll leave to Michel.

Hugh

--- 3.9-rc7/mm/mlock.c	2013-04-01 09:08:05.736012852 -0700
+++ linux/mm/mlock.c	2013-04-15 14:20:24.454773245 -0700
@@ -397,8 +397,7 @@ int __mm_populate(unsigned long start, u
 	long ret = 0;
 
 	VM_BUG_ON(start & ~PAGE_MASK);
-	VM_BUG_ON(len != PAGE_ALIGN(len));
-	end = start + len;
+	end = start + PAGE_ALIGN(len);
 
 	for (nstart = start; nstart < end; nstart = nend) {
 		/*
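
The "populate 0 pages" case can be reproduced with a small userspace model. This is only a sketch and not kernel code: it assumes that each pass of the loop advances by the number of whole pages it managed to fault in, and every model_* / MODEL_* name is hypothetical. With an unaligned len the last pass sees less than one page, advances by zero, and the walk never reaches end.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical constants for the model; the real PAGE_SIZE is per-arch. */
#define MODEL_PAGE_SIZE 4096UL
#define MODEL_PAGE_MASK (~(MODEL_PAGE_SIZE - 1))
#define MODEL_PAGE_ALIGN(x) (((x) + MODEL_PAGE_SIZE - 1) & MODEL_PAGE_MASK)

/* Assumed stand-in for the fault-in step: only whole pages are counted. */
static unsigned long model_fault_in(unsigned long start, unsigned long end)
{
	return (end - start) / MODEL_PAGE_SIZE;
}

/*
 * Walk [start, start + len) the way a populate loop might.
 * Returns true if the walk reaches the end, false if a pass makes no
 * progress (where an unguarded loop would spin forever).
 */
static bool model_populate(unsigned long start, unsigned long len,
			   bool align_len)
{
	unsigned long end, nstart, nend;

	/* Same precondition as the VM_BUG_ON(start & ~PAGE_MASK) check. */
	assert((start & ~MODEL_PAGE_MASK) == 0);

	end = start + (align_len ? MODEL_PAGE_ALIGN(len) : len);

	for (nstart = start; nstart < end; nstart = nend) {
		nend = nstart + model_fault_in(nstart, end) * MODEL_PAGE_SIZE;
		if (nend == nstart)
			return false;	/* zero pages populated: no progress */
	}
	return true;
}
```

Calling model_populate(0x10000, 4196, false) stops making progress on the second pass, while the same call with align_len set to true completes, which is the effect of rounding len up before computing end.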


* Re: 3.9-rc5: Encountedred INFO: rcu_sched self-detected stall on CPU due to 09a9f1d27
  2013-04-15 21:47           ` Hugh Dickins
@ 2013-04-16  0:21             ` Michel Lespinasse
  2013-04-16  1:27               ` Hugh Dickins
  0 siblings, 1 reply; 13+ messages in thread
From: Michel Lespinasse @ 2013-04-16  0:21 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Vivek Goyal, linux kernel mailing list, Rik van Riel,
	Paul E. McKenney, Andrew Morton

On Mon, Apr 15, 2013 at 2:47 PM, Hugh Dickins <hughd@google.com> wrote:
> --- 3.9-rc7/mm/mlock.c  2013-04-01 09:08:05.736012852 -0700
> +++ linux/mm/mlock.c    2013-04-15 14:20:24.454773245 -0700
> @@ -397,8 +397,7 @@ int __mm_populate(unsigned long start, u
>         long ret = 0;
>
>         VM_BUG_ON(start & ~PAGE_MASK);
> -       VM_BUG_ON(len != PAGE_ALIGN(len));
> -       end = start + len;
> +       end = start + PAGE_ALIGN(len);
>
>         for (nstart = start; nstart < end; nstart = nend) {
>                 /*

Yes, there was originally an issue with page alignment in mmap as well, which was fixed by making sure mmap now passes the page aligned length to mm_populate().

sys_brk() passes the length as the difference of two page aligned addresses, so it's fine. But vm_brk() doesn't - it calls do_brk() which page aligns the length, but then vm_brk passes the unaligned length to mm_populate().

What do you think of the following ? len is already a PAGE_SIZE multiple when called from sys_brk()...

diff --git a/mm/mmap.c b/mm/mmap.c
index 0db0de1c2fbe..6af8b0d1c7db 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2557,10 +2557,6 @@ static unsigned long do_brk(unsigned long addr, unsigned long len)
 	pgoff_t pgoff = addr >> PAGE_SHIFT;
 	int error;
 
-	len = PAGE_ALIGN(len);
-	if (!len)
-		return addr;
-
 	flags = VM_DATA_DEFAULT_FLAGS | VM_ACCOUNT | mm->def_flags;
 
 	error = get_unmapped_area(NULL, addr, len, 0, MAP_FIXED);
@@ -2643,6 +2639,10 @@ unsigned long vm_brk(unsigned long addr, unsigned long len)
 	unsigned long ret;
 	bool populate;
 
+	len = PAGE_ALIGN(len);
+	if (!len)
+		return addr;
+
 	down_write(&mm->mmap_sem);
 	ret = do_brk(addr, len);
 	populate = ((mm->def_flags & VM_LOCKED) != 0);
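
To make the effect of moving the PAGE_ALIGN() concrete, here is a small userspace sketch. It is not the kernel code: the sketch_* names and the 4096-byte page size are assumptions for illustration, and the zero-length early return is left out. It only shows which length each step ends up seeing before and after the change.

```c
#include <assert.h>

/* Hypothetical page size for the sketch. */
#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_PAGE_ALIGN(x) \
	(((x) + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1))

/* Lengths seen by the mapping step and by the populate step. */
struct sketch_lens {
	unsigned long mapped;
	unsigned long populated;
};

/* Models a helper that aligns only its own private copy of len. */
static unsigned long sketch_map_aligning(unsigned long len)
{
	return SKETCH_PAGE_ALIGN(len);
}

/* Before: the helper aligns internally, the caller keeps the raw len. */
static struct sketch_lens sketch_brk_before(unsigned long len)
{
	struct sketch_lens r;

	r.mapped = sketch_map_aligning(len);
	r.populated = len;		/* still unaligned */
	return r;
}

/* After: the caller aligns once, so both steps see the same value. */
static struct sketch_lens sketch_brk_after(unsigned long len)
{
	struct sketch_lens r;

	len = SKETCH_PAGE_ALIGN(len);
	assert(len % SKETCH_PAGE_SIZE == 0);
	r.mapped = len;
	r.populated = len;
	return r;
}
```

For a request of 5000 bytes, sketch_brk_before() maps 8192 bytes but hands 5000 to the populate step, whereas sketch_brk_after() hands 8192 to both.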

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.

* Re: 3.9-rc5: Encountedred INFO: rcu_sched self-detected stall on CPU due to 09a9f1d27
  2013-04-16  0:21             ` Michel Lespinasse
@ 2013-04-16  1:27               ` Hugh Dickins
  2013-04-29  8:57                 ` Michel Lespinasse
  0 siblings, 1 reply; 13+ messages in thread
From: Hugh Dickins @ 2013-04-16  1:27 UTC (permalink / raw)
  To: Michel Lespinasse
  Cc: Vivek Goyal, linux kernel mailing list, Rik van Riel,
	Paul E. McKenney, Andrew Morton

On Mon, 15 Apr 2013, Michel Lespinasse wrote:
> On Mon, Apr 15, 2013 at 2:47 PM, Hugh Dickins <hughd@google.com> wrote:
> > --- 3.9-rc7/mm/mlock.c  2013-04-01 09:08:05.736012852 -0700
> > +++ linux/mm/mlock.c    2013-04-15 14:20:24.454773245 -0700
> > @@ -397,8 +397,7 @@ int __mm_populate(unsigned long start, u
> >         long ret = 0;
> >
> >         VM_BUG_ON(start & ~PAGE_MASK);
> > -       VM_BUG_ON(len != PAGE_ALIGN(len));
> > -       end = start + len;
> > +       end = start + PAGE_ALIGN(len);
> >
> >         for (nstart = start; nstart < end; nstart = nend) {
> >                 /*
> 
> Yes, there was originally an issue with page alignment in mmap as well, which was fixed by making sure mmap now passes the page aligned length to mm_populate().
> 
> sys_brk() passes the length as the difference of two page aligned addresses, so it's fine. But vm_brk() doesn't - it calls do_brk() which page aligns the length, but then vm_brk passes the unaligned length to mm_populate().
> 
> What do you think of the following ? len is already a PAGE_SIZE multiple when called from sys_brk()...

Yes, that's good.

> 
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 0db0de1c2fbe..6af8b0d1c7db 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -2557,10 +2557,6 @@ static unsigned long do_brk(unsigned long addr, unsigned long len)
>  	pgoff_t pgoff = addr >> PAGE_SHIFT;
>  	int error;
>  
> -	len = PAGE_ALIGN(len);
> -	if (!len)
> -		return addr;
> -
>  	flags = VM_DATA_DEFAULT_FLAGS | VM_ACCOUNT | mm->def_flags;
>  
>  	error = get_unmapped_area(NULL, addr, len, 0, MAP_FIXED);
> @@ -2643,6 +2639,10 @@ unsigned long vm_brk(unsigned long addr, unsigned long len)
>  	unsigned long ret;
>  	bool populate;
>  
> +	len = PAGE_ALIGN(len);
> +	if (!len)
> +		return addr;
> +
>  	down_write(&mm->mmap_sem);
>  	ret = do_brk(addr, len);
>  	populate = ((mm->def_flags & VM_LOCKED) != 0);
> 
> -- 
> Michel "Walken" Lespinasse
> A program is never fully debugged until the last user dies.

* Re: 3.9-rc5: Encountedred INFO: rcu_sched self-detected stall on CPU due to 09a9f1d27
  2013-04-16  1:27               ` Hugh Dickins
@ 2013-04-29  8:57                 ` Michel Lespinasse
  2013-04-29 13:29                   ` Vivek Goyal
  0 siblings, 1 reply; 13+ messages in thread
From: Michel Lespinasse @ 2013-04-29  8:57 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Vivek Goyal, linux kernel mailing list, Rik van Riel,
	Paul E. McKenney, Andrew Morton

On Mon, Apr 15, 2013 at 6:27 PM, Hugh Dickins <hughd@google.com> wrote:
> On Mon, 15 Apr 2013, Michel Lespinasse wrote:
>> sys_brk() passes the length as the difference of two page aligned addresses, so it's fine. But vm_brk() doesn't - it calls do_brk() which page aligns the length, but then vm_brk passes the unaligned length to mm_populate().
>>
>> What do you think of the following ? len is already a PAGE_SIZE multiple when called from sys_brk()...
>
> Yes, that's good.
>>
>> diff --git a/mm/mmap.c b/mm/mmap.c
>> index 0db0de1c2fbe..6af8b0d1c7db 100644
>> --- a/mm/mmap.c
>> +++ b/mm/mmap.c
>> @@ -2557,10 +2557,6 @@ static unsigned long do_brk(unsigned long addr, unsigned long len)
>>       pgoff_t pgoff = addr >> PAGE_SHIFT;
>>       int error;
>>
>> -     len = PAGE_ALIGN(len);
>> -     if (!len)
>> -             return addr;
>> -
>>       flags = VM_DATA_DEFAULT_FLAGS | VM_ACCOUNT | mm->def_flags;
>>
>>       error = get_unmapped_area(NULL, addr, len, 0, MAP_FIXED);
>> @@ -2643,6 +2639,10 @@ unsigned long vm_brk(unsigned long addr, unsigned long len)
>>       unsigned long ret;
>>       bool populate;
>>
>> +     len = PAGE_ALIGN(len);
>> +     if (!len)
>> +             return addr;
>> +
>>       down_write(&mm->mmap_sem);
>>       ret = do_brk(addr, len);
>>       populate = ((mm->def_flags & VM_LOCKED) != 0);

Vivek - what should I do with this change ? Do you want to take it as
part of your series, or should I push to Linus/Andrew ?

All vm_brk calls are in elf and aout loaders, so I don't think we can
hit the issue without your changes. Still, it may be good to make
vm_brk robust anyway.

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.

* Re: 3.9-rc5: Encountedred INFO: rcu_sched self-detected stall on CPU due to 09a9f1d27
  2013-04-29  8:57                 ` Michel Lespinasse
@ 2013-04-29 13:29                   ` Vivek Goyal
  2013-05-20 20:23                     ` Vivek Goyal
  0 siblings, 1 reply; 13+ messages in thread
From: Vivek Goyal @ 2013-04-29 13:29 UTC (permalink / raw)
  To: Michel Lespinasse
  Cc: Hugh Dickins, linux kernel mailing list, Rik van Riel,
	Paul E. McKenney, Andrew Morton

On Mon, Apr 29, 2013 at 01:57:18AM -0700, Michel Lespinasse wrote:
> On Mon, Apr 15, 2013 at 6:27 PM, Hugh Dickins <hughd@google.com> wrote:
> > On Mon, 15 Apr 2013, Michel Lespinasse wrote:
> >> sys_brk() passes the length as the difference of two page aligned addresses, so it's fine. But vm_brk() doesn't - it calls do_brk() which page aligns the length, but then vm_brk passes the unaligned length to mm_populate().
> >>
> >> What do you think of the following ? len is already a PAGE_SIZE multiple when called from sys_brk()...
> >
> > Yes, that's good.
> >>
> >> diff --git a/mm/mmap.c b/mm/mmap.c
> >> index 0db0de1c2fbe..6af8b0d1c7db 100644
> >> --- a/mm/mmap.c
> >> +++ b/mm/mmap.c
> >> @@ -2557,10 +2557,6 @@ static unsigned long do_brk(unsigned long addr, unsigned long len)
> >>       pgoff_t pgoff = addr >> PAGE_SHIFT;
> >>       int error;
> >>
> >> -     len = PAGE_ALIGN(len);
> >> -     if (!len)
> >> -             return addr;
> >> -
> >>       flags = VM_DATA_DEFAULT_FLAGS | VM_ACCOUNT | mm->def_flags;
> >>
> >>       error = get_unmapped_area(NULL, addr, len, 0, MAP_FIXED);
> >> @@ -2643,6 +2639,10 @@ unsigned long vm_brk(unsigned long addr, unsigned long len)
> >>       unsigned long ret;
> >>       bool populate;
> >>
> >> +     len = PAGE_ALIGN(len);
> >> +     if (!len)
> >> +             return addr;
> >> +
> >>       down_write(&mm->mmap_sem);
> >>       ret = do_brk(addr, len);
> >>       populate = ((mm->def_flags & VM_LOCKED) != 0);
> 
> Vivek - what should I do with this change ? Do you want to take it as
> part of your series, or should I push to Linus/Andrew ?
> 
> All vm_brk calls are in elf and aout loaders, so I don't think we can
> hit the issue without your changes. Still, it may be good to make
> vm_brk robust anyway.

Hi Michel,

Thanks for the fix. Sorry, could not respond earlier as I was away from
work.

This patch did fix the issue for me. I think it is a good idea to push
this fix to Andrew irrespective of my changes. My series is in RFC phase
and there are no guarantees whether patches will be accepted or not.

So please go ahead and send this fix to Andrew for inclusion.

Thanks
Vivek

* Re: 3.9-rc5: Encountedred INFO: rcu_sched self-detected stall on CPU due to 09a9f1d27
  2013-04-29 13:29                   ` Vivek Goyal
@ 2013-05-20 20:23                     ` Vivek Goyal
  0 siblings, 0 replies; 13+ messages in thread
From: Vivek Goyal @ 2013-05-20 20:23 UTC (permalink / raw)
  To: Michel Lespinasse
  Cc: Hugh Dickins, linux kernel mailing list, Rik van Riel,
	Paul E. McKenney, Andrew Morton

On Mon, Apr 29, 2013 at 09:29:52AM -0400, Vivek Goyal wrote:
> On Mon, Apr 29, 2013 at 01:57:18AM -0700, Michel Lespinasse wrote:
> > On Mon, Apr 15, 2013 at 6:27 PM, Hugh Dickins <hughd@google.com> wrote:
> > > On Mon, 15 Apr 2013, Michel Lespinasse wrote:
> > >> sys_brk() passes the length as the difference of two page aligned addresses, so it's fine. But vm_brk() doesn't - it calls do_brk() which page aligns the length, but then vm_brk passes the unaligned length to mm_populate().
> > >>
> > >> What do you think of the following ? len is already a PAGE_SIZE multiple when called from sys_brk()...
> > >
> > > Yes, that's good.
> > >>
> > >> diff --git a/mm/mmap.c b/mm/mmap.c
> > >> index 0db0de1c2fbe..6af8b0d1c7db 100644
> > >> --- a/mm/mmap.c
> > >> +++ b/mm/mmap.c
> > >> @@ -2557,10 +2557,6 @@ static unsigned long do_brk(unsigned long addr, unsigned long len)
> > >>       pgoff_t pgoff = addr >> PAGE_SHIFT;
> > >>       int error;
> > >>
> > >> -     len = PAGE_ALIGN(len);
> > >> -     if (!len)
> > >> -             return addr;
> > >> -
> > >>       flags = VM_DATA_DEFAULT_FLAGS | VM_ACCOUNT | mm->def_flags;
> > >>
> > >>       error = get_unmapped_area(NULL, addr, len, 0, MAP_FIXED);
> > >> @@ -2643,6 +2639,10 @@ unsigned long vm_brk(unsigned long addr, unsigned long len)
> > >>       unsigned long ret;
> > >>       bool populate;
> > >>
> > >> +     len = PAGE_ALIGN(len);
> > >> +     if (!len)
> > >> +             return addr;
> > >> +
> > >>       down_write(&mm->mmap_sem);
> > >>       ret = do_brk(addr, len);
> > >>       populate = ((mm->def_flags & VM_LOCKED) != 0);
> > 
> > Vivek - what should I do with this change ? Do you want to take it as
> > part of your series, or should I push to Linus/Andrew ?
> > 
> > All vm_brk calls are in elf and aout loaders, so I don't think we can
> > hit the issue without your changes. Still, it may be good to make
> > vm_brk robust anyway.
> 
> Hi Michel,
> 
> Thanks for the fix. Sorry, could not respond earlier as I was away from
> work.
> 
> This patch did fix the issue for me. I think it is a good idea to push
> this fix to Andrew irrespective of my changes. My series is in RFC phase
> and there are no guarantees whether patches will be accepted or not.
> 
> So please go ahead and send this fix to Andrew for inclusion.

Hi Michel,

So are you planning to push this patch upstream?

Vivek

Thread overview: 13+ messages
2013-04-12 18:13 3.9-rc5: Encountedred INFO: rcu_sched self-detected stall on CPU due to 09a9f1d27 Vivek Goyal
2013-04-13  2:12 ` Michel Lespinasse
2013-04-13  2:31 ` Michel Lespinasse
2013-04-15 16:35   ` Vivek Goyal
2013-04-15 17:34     ` Vivek Goyal
2013-04-15 17:59       ` Vivek Goyal
2013-04-15 19:13         ` Vivek Goyal
2013-04-15 21:47           ` Hugh Dickins
2013-04-16  0:21             ` Michel Lespinasse
2013-04-16  1:27               ` Hugh Dickins
2013-04-29  8:57                 ` Michel Lespinasse
2013-04-29 13:29                   ` Vivek Goyal
2013-05-20 20:23                     ` Vivek Goyal
