public inbox for linux-kernel@vger.kernel.org
From: Vivek Goyal <vgoyal@redhat.com>
To: Michel Lespinasse <walken@google.com>
Cc: linux kernel mailing list <linux-kernel@vger.kernel.org>,
	Hugh Dickins <hughd@google.com>, Rik van Riel <riel@redhat.com>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Subject: Re: 3.9-rc5: Encountered INFO: rcu_sched self-detected stall on CPU due to 09a9f1d27
Date: Mon, 15 Apr 2013 12:35:52 -0400
Message-ID: <20130415163552.GA31868@redhat.com>
In-Reply-To: <CANN689GKfxM3R96wgY8-+FftpVT76zMFcHdgKccNKnjHGfYq1Q@mail.gmail.com>

On Fri, Apr 12, 2013 at 07:31:32PM -0700, Michel Lespinasse wrote:
> Sorry for the earlier message getting sent before I was done typing it.
> 
> On Fri, Apr 12, 2013 at 11:13 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > Hi,
> >
> > I am writing some code where I lock down a process's memory at exec() time.
> > My patches were working fine until 3.9-rc4 and suddenly things broke
> > in 3.9-rc5.
> >
> > Whenever I try to exec() a process with memory locked down, my bash
> > session hangs and after a while I get the following warning.
> >
> > login: [  174.669002] INFO: rcu_sched self-detected stall on CPU { 2}  (t=60000 jiffies g=2580 c=2579 q=1085)
> > [  174.669002] Pid: 4894, comm: kexec Not tainted 3.9.0-rc6+ #243
> > [  174.669002] Call Trace:
> > [  174.669002]  <IRQ>  [<ffffffff810c413a>] rcu_check_callbacks+0x21a/0x760
> > [  174.669002]  [<ffffffff810c7c0c>] ? acct_account_cputime+0x1c/0x20
> > [  174.669002]  [<ffffffff8104fd08>] update_process_times+0x48/0x80
> > [  174.669002]  [<ffffffff810913dd>] tick_sched_handle+0x3d/0x50
> > [  174.669002]  [<ffffffff810915e5>] tick_sched_timer+0x45/0x70
> > [  174.669002]  [<ffffffff81066951>] __run_hrtimer+0x81/0x220
> > [  174.669002]  [<ffffffff810915a0>] ? tick_nohz_handler+0xa0/0xa0
> > [  174.669002]  [<ffffffff8108ae0c>] ? ktime_get_update_offsets+0x4c/0xd0
> > [  174.669002]  [<ffffffff81067297>] hrtimer_interrupt+0xf7/0x250
> > [  174.669002]  [<ffffffff81886739>] smp_apic_timer_interrupt+0x69/0x99
> > [  174.669002]  [<ffffffff818859ca>] apic_timer_interrupt+0x6a/0x70
> > [  174.669002]  <EOI>  [<ffffffff8111e557>] ?  __mlock_vma_pages_range+0x57/0x70
> > [  174.669002]  [<ffffffff8111e568>] ? __mlock_vma_pages_range+0x68/0x70
> > [  174.669002]  [<ffffffff8111ea01>] __mm_populate+0x71/0x140
> > [  174.669002]  [<ffffffff81121b5f>] vm_brk+0x7f/0xa0
> > [  174.669002]  [<ffffffff81199633>] load_elf_binary+0x1a73/0x1b10
> > [  174.669002]  [<ffffffff812d25a5>] ? ima_bprm_check+0x55/0x70
> > [  174.669002]  [<ffffffff8114890a>] search_binary_handler+0x12a/0x3b0
> > [  174.669002]  [<ffffffff81197bc0>] ? load_elf_library+0x210/0x210
> > [  174.669002]  [<ffffffff8114aa00>] do_execve_common+0x500/0x5c0
> > [  174.669002]  [<ffffffff8114aaf7>] do_execve+0x37/0x40
> > [  174.669002]  [<ffffffff8114ad9d>] sys_execve+0x3d/0x60
> > [  174.669002]  [<ffffffff81885379>] stub_execve+0x69/0xa0
> >
> > I did a git bisect, and it says that the following is the first bad
> > commit.
> >
> > commit 09a9f1d27892255cfb9c91203f19476765e2d8d1
> > Author: Michel Lespinasse <walken@google.com>
> > Date:   Thu Mar 28 16:26:23 2013 -0700
> >
> >     Revert "mm: introduce VM_POPULATE flag to better deal with racy userspace pr
> >
> >     This reverts commit 186930500985 ("mm: introduce VM_POPULATE flag to
> >     better deal with racy userspace programs").
> >
> > I reverted the above commit and the problem went away.
> >
> > The following is my simple patch to lock down a selected process's memory.
> >
> > Index: linux-2.6/fs/binfmt_elf.c
> > ===================================================================
> > --- linux-2.6.orig/fs/binfmt_elf.c      2013-04-13 01:50:26.380184101 -0400
> > +++ linux-2.6/fs/binfmt_elf.c   2013-04-13 01:50:49.827184821 -0400
> > @@ -721,6 +721,10 @@ static int load_elf_binary(struct linux_
> >
> >         /* OK, This is the point of no return */
> >         current->mm->def_flags = def_flags;
> > +       if (!strcmp(bprm->filename, "/sbin/kexec")) {
> > +               printk("Memlocking /sbin/kexec\n");
> > +               current->mm->def_flags |= VM_LOCKED;
> > +       }
> >
> >         /* Do this immediately, since STACK_TOP as used in setup_arg_pages
> >            may depend on the personality.  */
> >
> >
> > Do you have any thoughts on what's going on? I am wondering if it indicates
> > a bigger problem which can then be triggered from other paths too.
> >
> > Thanks
> > Vivek
> 
> Based on your patch, it looks like from 3.9-rc1 to 3.9-rc5 your change
> wouldn't actually cause pages to get mlocked during exec - for this
> range of kernels, mlockall would need to set both VM_LOCKED and
> VM_POPULATE. I suspect you would see the same crash if you included
> VM_POPULATE in your change, too.

Ok. I specified VM_POPULATE and I see the same problem.
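
As a rough illustration of the point above, here is a small userspace
sketch in plain C (not kernel code) of when exec-time population would be
triggered by mm->def_flags. The bit values, the enum and the function name
are hypothetical and exist only for this sketch. The "VM_LOCKED alone is
enough" branch is an assumption about kernels that do not have
VM_POPULATE, inferred from the stack trace above rather than taken from
the source tree.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values for this sketch only; they are not meant to
 * match the definitions in any particular kernel tree. */
#define SKETCH_VM_POPULATE 0x1000UL
#define SKETCH_VM_LOCKED   0x2000UL

/* Hypothetical labels for the two kernel ranges discussed in this thread. */
enum sketch_era {
	ERA_WITH_VM_POPULATE,    /* kernels carrying the VM_POPULATE flag */
	ERA_WITHOUT_VM_POPULATE, /* assumed: VM_LOCKED alone triggers populate */
};

/* Return true if, in this simplified model, the given default flags
 * would cause newly created mappings to be populated during exec. */
bool sketch_exec_populates(enum sketch_era era, unsigned long def_flags)
{
	unsigned long need = SKETCH_VM_LOCKED;

	/* With VM_POPULATE present, both bits must be set. */
	if (era == ERA_WITH_VM_POPULATE)
		need |= SKETCH_VM_POPULATE;

	return (def_flags & need) == need;
}
```

In this model, setting only the locked bit does nothing in the
VM_POPULATE range of kernels, which would match the patch appearing to
work there without actually locking any pages.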
> 
> That said, I am not sure immediately what's wrong. It looks like a
> deadlock situation; does CONFIG_LOCKDEP help here?

I turned on CONFIG_LOCKDEP and nothing shows up.

> 
> My first guess would be that mmap_sem is held during exec, so you
> can't have __mm_populate() try holding it recursively.

I don't think it is mmap_sem, since even with VM_LOCKED we take mmap_sem
and things are fine.

So things work up to 3.8 and break in 3.9-rc1 (with both VM_LOCKED and
VM_POPULATE specified). I will do a git bisect and try to figure out which
is the first commit that has the issue.

Thanks
Vivek
