From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S1756580Ab3DOSFg (ORCPT ); Mon, 15 Apr 2013 14:05:36 -0400
Received: from mx1.redhat.com ([209.132.183.28]:15704 "EHLO mx1.redhat.com"
    rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
    id S1756057Ab3DOSFf (ORCPT ); Mon, 15 Apr 2013 14:05:35 -0400
Date: Mon, 15 Apr 2013 13:59:29 -0400
From: Vivek Goyal
To: Michel Lespinasse
Cc: linux kernel mailing list, Hugh Dickins, Rik van Riel, "Paul E. McKenney", Andrew Morton
Subject: Re: 3.9-rc5: Encountedred INFO: rcu_sched self-detected stall on CPU due to 09a9f1d27
Message-ID: <20130415175929.GB30583@redhat.com>
References: <20130412181348.GA2253@redhat.com> <20130415163552.GA31868@redhat.com> <20130415173424.GB31868@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20130415173424.GB31868@redhat.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

CCing akpm.

Vivek

On Mon, Apr 15, 2013 at 01:34:24PM -0400, Vivek Goyal wrote:
> On Mon, Apr 15, 2013 at 12:35:52PM -0400, Vivek Goyal wrote:
> > [..]
> > > My first guess would be that mmap_sem is held during exec, so you
> > > can't have __mm_populate() try holding it recursively.
> >
> > I think it is not mmap_sem as even with VM_LOCKED, we take mmap_sem
> > and things are fine.
> >
> > So things work till 3.8 and break in 3.8-rc1 (with both VM_LOCKED and
> > VM_POPULATE specified). I will do git bisect and try to figure out which
> > is first commit which has the issue.
>
> Ok, following seems to be first bad commit.
>
> commit bebeb3d68b24bb4132d452c5707fe321208bcbcd
> Author: Michel Lespinasse
> Date:   Fri Feb 22 16:32:37 2013 -0800
>
>     mm: introduce mm_populate() for populating new vmas
>
> I have locked down /sbin/kexec. And I get following traceback after a while.
>
> Thanks
> Vivek
>
> [ 93.130001] INFO: rcu_sched self-detected stall on CPU[ 93.131007] INFO: rcu_sched detected stalls on CPUs/tasks: { 2} (detected by 3, t=60002 jiffies, g=2786, c=2785, q=10480)
> [ 93.131016] Task dump for CPU 2:
> [ 93.131018] kexec R running task 0 4861 2982 0x00000088
> [ 93.131022] ffff88013252bc68 ffffffff8109ab12 00000000ffffffff 0000000000000000
> [ 93.131026] ffff88013252bc58 ffff88013a43c180 ffffffff81881edc 00007feed583221a
> [ 93.131028] 00007feed5832000 0000000000000000 ffff88013a43c828 ffff88013a43c180
> [ 93.131032] Call Trace:
> [ 93.131032] [] ? mark_held_locks+0x82/0x130
> [ 93.131040] [] ? retint_restore_args+0xe/0xe
> [ 93.131045] [] ? mark_held_locks+0x82/0x130
> [ 93.131047] [] ? retint_restore_args+0xe/0xe
> [ 93.131049] [] ? trace_hardirqs_on_caller+0x105/0x190
> [ 93.131051] [] ? trace_hardirqs_on_thunk+0x3a/0x3f
> [ 93.131055] [] ? retint_restore_args+0xe/0xe
> [ 93.131058] [] ? __get_user_pages+0x1d8/0x540
> [ 93.131061] [] ? __mlock_vma_pages_range+0x67/0x70
> [ 93.131064] [] ? __mm_populate+0x71/0x150
> [ 93.131066] [] ? vm_mmap_pgoff+0xdf/0xf0
> [ 93.131071] [] ? sys_mmap_pgoff+0xa9/0x180
> [ 93.131073] [] ? trace_hardirqs_on_caller+0x105/0x190
> [ 93.131076] [] ? trace_hardirqs_on_thunk+0x3a/0x3f
> [ 93.131078] [] ? sys_mmap+0x22/0x30
> [ 93.131081] [] ? system_call_fastpath+0x16/0x1b
>
> [ 93.130001] { 2} (t=60153 jiffies g=2786 c=2785 q=10480)
> [ 93.130001] Pid: 4861, comm: kexec Not tainted 3.8.0+ #270
> [ 93.130001] Call Trace:
> [ 93.130001] [] rcu_check_callbacks+0x21a/0x760
> [ 93.130001] [] ? acct_account_cputime+0x1c/0x20
> [ 93.130001] [] ? account_system_time+0x160/0x230
> [ 93.130001] [] update_process_times+0x48/0x80
> [ 93.130001] [] tick_sched_handle+0x3d/0x50
> [ 93.130001] [] tick_sched_timer+0x45/0x70
> [ 93.130001] [] __run_hrtimer+0x81/0x220
> [ 93.130001] [] ? tick_nohz_handler+0xa0/0xa0
> [ 93.130001] [] ? ktime_get_update_offsets+0x4c/0xd0
> [ 93.130001] [] hrtimer_interrupt+0xf7/0x250
> [ 93.130001] [] smp_apic_timer_interrupt+0x69/0x99
> [ 93.130001] [] apic_timer_interrupt+0x6f/0x80
> [ 93.130001] [] ? retint_restore_args+0xe/0xe
> [ 93.130001] [] ? __get_user_pages+0x1d8/0x540
> [ 93.130001] [] __mlock_vma_pages_range+0x67/0x70
> [ 93.130001] [] __mm_populate+0x71/0x150
> [ 93.130001] [] vm_mmap_pgoff+0xdf/0xf0
> [ 93.130001] [] sys_mmap_pgoff+0xa9/0x180
> [ 93.130001] [] ? trace_hardirqs_on_caller+0x105/0x190
> [ 93.130001] [] ? trace_hardirqs_on_thunk+0x3a/0x3f
> [ 93.130001] [] sys_mmap+0x22/0x30
> [ 93.130001] [] system_call_fastpath+0x16/0x1b
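
For reference, the flag combination discussed above (VM_LOCKED together with VM_POPULATE) is what userspace requests with MAP_LOCKED | MAP_POPULATE on mmap(). The sketch below is only an illustration of such a call, not the reproducer from this thread: the helper name map_locked_populated() is hypothetical, the mapping is anonymous rather than whatever kexec actually maps, and whether it would trigger the stall on an affected kernel is untested.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): map `len` bytes of anonymous
 * memory with MAP_LOCKED | MAP_POPULATE, read one byte from every page,
 * then unmap. Returns 0 on success or a positive errno value on failure.
 */
int map_locked_populated(size_t len)
{
    assert(len > 0);

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS |
                            MAP_LOCKED | MAP_POPULATE, -1, 0);
    if (p == MAP_FAILED)
        return errno; /* e.g. EAGAIN or EPERM if RLIMIT_MEMLOCK is too low */

    long page = sysconf(_SC_PAGESIZE);
    unsigned long sum = 0;

    /* Fresh anonymous pages are zero-filled, so the sum should stay 0. */
    for (size_t off = 0; off < len; off += (size_t)page)
        sum += p[off];

    int rc = (sum == 0) ? 0 : EIO;
    if (munmap(p, len) != 0 && rc == 0)
        rc = errno;
    return rc;
}
```

Calling map_locked_populated(4096) from a small main() should return 0 when the memlock limit allows one locked page; a nonzero return is the errno reported by mmap() or munmap().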