From mboxrd@z Thu Jan 1 00:00:00 1970
From: tony@atomide.com (Tony Lindgren)
Date: Mon, 12 May 2014 14:21:03 -0700
Subject: RCU stall on panda
In-Reply-To: <20140505180617.GM8754@linux.vnet.ibm.com>
References: <53675C5F.10509@linaro.org> <20140505180617.GM8754@linux.vnet.ibm.com>
Message-ID: <20140512212102.GF5668@atomide.com>
To: linux-arm-kernel@lists.infradead.org
List-Id: linux-arm-kernel.lists.infradead.org

* Paul E. McKenney [140505 11:11]:
> On Mon, May 05, 2014 at 05:39:43PM +0800, Alex Shi wrote:
> > I keep seeing the RCU stall problem on panda board from 3.10 kernel to latest upstream kernel
> > and google find some one report it before: https://lkml.org/lkml/2012/9/20/519
> >
> > Is it the hardware issue or a real software problem?
>
> I cannot distinguish between hardware and software from the trace below,
> but given that you are also seeing a soft lockup, either way you do
> appear to have a real problem as opposed to an RCU CPU stall warning
> false positive.

Looks like you have CPU_IDLE enabled on panda. Hangs with current
linux-next with CPU_IDLE are currently being discussed on the
linux-omap list in the thread "omap4-panda-es boot issues with v3.15-rc4".

I've seen occasional system hangs, and I've also noticed that doing
ctrl-a-f h or ctrl-a-f l for a sysrq backtrace can unlock the system,
producing errors similar to the ones below.

Regards,

Tony

> > [ 95.519653] INFO: rcu_sched self-detected stall on CPU
> > [ 95.519866] 1: (1 GPs behind) idle=2e7/1/0 softirq=4404/4405
> > [ 95.526489] INFO: rcu_sched detected stalls on CPUs/tasks:
> > [ 95.526489] 1: (1 GPs behind) idle=2e7/1/0 softirq=4404/4405
> > [ 95.526489] (detected by 0, t=4229 jiffies, g=800, c=799, q=440)
> > [ 95.526519] Task dump for CPU 1:
> > [ 95.526519] swapper/1 R running 0 0 1 0x00000000
> > [ 95.559844] (t=4229 jiffies g=800 c=799 q=440)
> > [ 95.564727] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.15.0-rc4 #93
> > [ 95.571502] [] (unwind_backtrace) from [] (show_stack+0x11/0x14)
> > [ 95.579711] [] (show_stack) from [] (dump_stack+0x75/0x88)
> > [ 95.587371] [] (dump_stack) from [] (rcu_check_callbacks+0x353/0x79c)
> > [ 95.596038] [] (rcu_check_callbacks) from [] (update_process_times+0x33/0x4c)
> > [ 95.605438] [] (update_process_times) from [] (tick_sched_handle.isra.18+0x1f/0x48)
> > [ 95.615386] [] (tick_sched_handle.isra.18) from [] (tick_sched_timer+0x3d/0x5c)
> > [ 95.624969] [] (tick_sched_timer) from [] (__run_hrtimer+0x67/0x310)
> > [ 95.633544] [] (__run_hrtimer) from [] (hrtimer_interrupt+0xe1/0x214)
> > [ 95.642211] [] (hrtimer_interrupt) from [] (tick_receive_broadcast+0x1f/0x30)
> > [ 95.651611] [] (tick_receive_broadcast) from [] (handle_IPI+0xb3/0x120)
> > [ 95.660461] [] (handle_IPI) from [] (gic_handle_irq+0x51/0x54)
> > [ 95.668487] [] (gic_handle_irq) from [] (__irq_svc+0x3f/0x64)
> > [ 95.676391] Exception stack(0xee0dbf10 to 0xee0dbf58)
> > [ 95.681762] bf00: 00000001 00000001 00000000 ee0d8c40
> > [ 95.690429] bf20: 3c6bd296 00000016 3c6f8c43 00000016 eefab540 c08e0c84 00000000 c0fc7114
> > [ 95.699066] bf40: 00000010 ee0dbf58 c006ef4d c0443890 40000033 ffffffff
> > [ 95.706085] [] (__irq_svc) from [] (cpuidle_enter_state+0xc0/0xc4)
> > [ 95.714477] [] (cpuidle_enter_state) from [] (cpuidle_enter_state_coupled+0xe1/0x290)
> > [ 95.724639] [] (cpuidle_enter_state_coupled) from [] (cpu_startup_entry+0x1a5/0x494)
> > [ 95.734680] [] (cpu_startup_entry) from [<80008685>] (0x80008685)
> > [ 95.742095] BUG: soft lockup - CPU#1 stuck for 40s! [swapper/1:0]
> > [ 95.748535] Modules linked in:
> > [ 95.751770] irq event stamp: 128730
> > [ 95.755462] hardirqs last enabled at (128727): [] cpuidle_enter_state+0xbf/0xc4
> > [ 95.764221] hardirqs last disabled at (128728): [] __irq_svc+0x33/0x64
> > [ 95.772064] softirqs last enabled at (128730): [] irq_enter+0x59/0x60
> > [ 95.779907] softirqs last disabled at (128729): [] irq_enter+0x46/0x60
> > [ 95.787750]
> >
> >
> > my RCU and IDLE related kernel config as blow:
> >
> > CONFIG_TREE_RCU=y
> > CONFIG_RCU_STALL_COMMON=y
> > CONFIG_RCU_FANOUT=32
> > CONFIG_RCU_FANOUT_LEAF=16
> > CONFIG_TREE_RCU_TRACE=y
> > CONFIG_PROVE_RCU=y
> > CONFIG_PROVE_RCU_REPEATEDLY=y
> > CONFIG_SPARSE_RCU_POINTER=y
> > CONFIG_RCU_CPU_STALL_TIMEOUT=21
> > CONFIG_RCU_CPU_STALL_INFO=y
> > CONFIG_RCU_TRACE=y
> > alexs@alex-panda:~$ cat /proc/config.gz | gunzip | grep IDLE
> > CONFIG_NO_HZ_IDLE=y
> > CONFIG_GENERIC_SMP_IDLE_THREAD=y
> > CONFIG_GENERIC_IDLE_POLL_SETUP=y
> > CONFIG_CPU_IDLE=y
> > CONFIG_CPU_IDLE_GOV_LADDER=y
> > CONFIG_CPU_IDLE_GOV_MENU=y
> > CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED=y
> >
> > --
> > Thanks
> > Alex