From mboxrd@z Thu Jan 1 00:00:00 1970
From: Antonio Barbalace
Subject: Re: 3.0.10-rt27 arch/arm/kernel/smp.c bug
Date: Wed, 18 Jan 2012 09:29:12 -0500
Message-ID: <20120118092912.77420k7p4wwjg37c@webmail.vt.edu>
References: <20120113141757.12301lfyl3g6enad@webmail.vt.edu> <1326861125.17534.74.camel@gandalf.stny.rr.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; DelSp="Yes"; format="flowed"
Content-Transfer-Encoding: 7bit
Cc: linux-rt-users@vger.kernel.org
To: Steven Rostedt
Return-path:
Received: from lennier.cc.vt.edu ([198.82.162.213]:47955 "EHLO lennier.cc.vt.edu" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757609Ab2ARO3O (ORCPT ); Wed, 18 Jan 2012 09:29:14 -0500
In-Reply-To: <1326861125.17534.74.camel@gandalf.stny.rr.com>
Content-Disposition: inline
Sender: linux-rt-users-owner@vger.kernel.org
List-ID:

Hi Steve,

the problem always happens when putting a CPU to sleep on the ARM OMAP
platform (I am currently using a Pandaboard), i.e. after

    echo 0 > /sys/devices/system/cpu/cpu1/online

Thanks a lot for your help,
Antonio

---
Quoting Steven Rostedt:

> Hi Antonio,
>
> On Fri, 2012-01-13 at 14:17 -0500, Antonio Barbalace wrote:
>> I would like to report the following bug that is still not solved in
>> the current 3.0.14 version.
>>
>> [ 300.459960] BUG: sleeping function called from invalid context at kernel/rtm5
>> [ 300.459991] in_atomic(): 1, irqs_disabled(): 128, pid: 9, name: migration/1
>> [ 300.459991] 1 lock held by migration/1/9:
>> [ 300.459991] #0: (tasklist_lock){++++..}, at: [] __cpu_disable+0x0
>> [ 300.460021] irq event stamp: 1887
>> [ 300.460052] hardirqs last enabled at (1886): [] _raw_spin_unlock_8
>> [ 300.460052] hardirqs last disabled at (1887): [] stop_machine_cpu_4
>> [ 300.460083] softirqs last enabled at (0): [] copy_process+0x3b4/00
>> [ 300.460113] softirqs last disabled at (0): [< (null)>] (null)
>> [ 300.460144] [] (unwind_backtrace+0x0/0xf4) from [] (__rt)
>> [ 300.460174] [] (__rt_spin_lock+0x18/0x2c) from [] (rt_re)
>> [ 300.460174] [] (rt_read_lock+0x54/0x68) from [] (__cpu_d)
>> [ 300.460235] [] (__cpu_disable+0xdc/0x170) from [] (take_)
>> [ 300.460235] [] (take_cpu_down+0xc/0x30) from [] (stop_ma)
>> [ 300.460235] [] (stop_machine_cpu_stop+0xd8/0x114) from [)
>> [ 300.460266] [] (cpu_stopper_thread+0xb8/0x1ac) from [] ()
>> [ 300.460327] [] (kthread+0x88/0x90) from [] (kernel_threa)
>> [ 300.464385] CPU1: shutdown
>>
>> This is due to the following code in __cpu_disable() in
>> arch/arm/kernel/smp.c:
>>
>>         read_lock(&tasklist_lock);
>>         for_each_process(p) {
>>                 if (p->mm)
>>                         cpumask_clear_cpu(cpu, mm_cpumask(p->mm));
>>         }
>>         read_unlock(&tasklist_lock);
>>
>> I am not an RT expert; do you have any clue on how to solve this problem?
>
> Hmm, I'll need to look at this code deeper. The read_lock() in -rt can
> sleep, and this is being called to shut down a CPU, which I'm sure
> disables interrupts along the way.
>
> What did you do to cause this? Does this happen when you take CPU 1
> offline?
>
> -- Steve