From mboxrd@z Thu Jan 1 00:00:00 1970
From: mbohan@codeaurora.org (Michael Bohan)
Date: Mon, 25 Apr 2011 16:33:27 -0700
Subject: console_cpu_notify can cause scheduling BUG during CPU hotplug
Message-ID: <4DB604C7.8090305@codeaurora.org>
To: linux-arm-kernel@lists.infradead.org
List-Id: linux-arm-kernel.lists.infradead.org

Hi,

I've run into a crash scenario during CPU hotplug on ARM/MSM where we
BUG() due to a schedule while atomic in v2.6.38-rc6. The issue appears
to be that the console cpu notifier can block on a semaphore during
cpu_stopper_thread's atomic code path. Preemption is explicitly disabled
in cpu_stopper_thread.

The suspected path was added with this commit:

commit 034260d6779087431a8b2f67589c68b919299e5c
Author: Kevin Cernekee
Date:   Thu Jun 3 22:11:25 2010 -0700

    printk: fix delayed messages from CPU hotplug events

I was curious if this scenario was accounted for in the design of the
console CPU notifier.

One workaround for this problem is to remove CPU_DEAD from the possible
actions in console_cpu_notify(). In fact, v1-v4 of the patch above did
not have CPU_DEAD, CPU_DYING or CPU_DOWN_FAILED in the list of actions.
I wasn't able to track down why the other cases were added in the final
patch.
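
To illustrate the workaround, here is a rough userspace sketch (not kernel code) of a notifier that takes a sleeping lock only for the actions in its handled set. Everything in it is a hypothetical stand-in: the `HP_*` action codes, `mock_console_lock()`, `mock_console_cpu_notify()`, the `handled_mask` parameter and `deliver()` do not exist in the kernel, and the sketch does not model which action actually arrives on the atomic path; the caller chooses that.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's CPU hotplug action codes. */
enum hp_action {
    HP_ONLINE,
    HP_DEAD,
    HP_DYING,
    HP_DOWN_FAILED,
    HP_UP_CANCELED
};

static bool in_atomic_ctx;   /* models a preemption-disabled section */
static int sleep_in_atomic;  /* counts would-be "scheduling while atomic" hits */
static int flushes;          /* counts completed lock/unlock pairs */

/* Models a lock that may sleep: taking it in atomic context is a bug. */
static void mock_console_lock(void)
{
    if (in_atomic_ctx)
        sleep_in_atomic++;
}

static void mock_console_unlock(void)
{
    flushes++;
}

/* Models the notifier: lock and unlock only for actions in handled_mask. */
static void mock_console_cpu_notify(enum hp_action action,
                                    unsigned int handled_mask)
{
    assert(action <= HP_UP_CANCELED);
    if (handled_mask & (1u << action)) {
        mock_console_lock();
        mock_console_unlock();
    }
}

/*
 * Deliver one action, optionally from atomic context, and return how many
 * times the notifier tried to sleep while atomic.
 */
static int deliver(enum hp_action action, unsigned int handled_mask,
                   bool atomic)
{
    sleep_in_atomic = 0;
    in_atomic_ctx = atomic;
    mock_console_cpu_notify(action, handled_mask);
    in_atomic_ctx = false;
    return sleep_in_atomic;
}
```

With all five actions handled, `deliver(HP_DEAD, 0x1f, true)` reports one sleep-while-atomic hit. Clearing that action's bit from the mask makes the same call report none, which is the effect the workaround relies on. The cost is that console output is no longer flushed on that event.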

Crash log:

<3>[ 21.408237] BUG: scheduling while atomic: migration/1/371/0x00000002
<4>[ 21.408247] Modules linked in:
<4>[ 21.408286] [] (unwind_backtrace+0x0/0x128) from [] (schedule+0x9c/0x6c4)
<4>[ 21.408303] [] (schedule+0x9c/0x6c4) from [] (schedule_timeout+0x1c/0x208)
<4>[ 21.408319] [] (schedule_timeout+0x1c/0x208) from [] (__down+0x68/0x98)
<4>[ 21.408337] [] (__down+0x68/0x98) from [] (down+0x2c/0x3c)
<4>[ 21.408354] [] (down+0x2c/0x3c) from [] (console_lock+0x38/0x60)
<4>[ 21.408377] [] (console_lock+0x38/0x60) from [] (console_cpu_notify+0x20/0x2c)
<4>[ 21.408394] [] (console_cpu_notify+0x20/0x2c) from [] (notifier_call_chain+0x2c/0x70)
<4>[ 21.408410] [] (notifier_call_chain+0x2c/0x70) from [] (__cpu_notify+0x24/0x3c)
<4>[ 21.408425] [] (__cpu_notify+0x24/0x3c) from [] (take_cpu_down+0x2c/0x34)
<4>[ 21.408444] [] (take_cpu_down+0x2c/0x34) from [] (stop_machine_cpu_stop+0xc0/0x11c)
<4>[ 21.408462] [] (stop_machine_cpu_stop+0xc0/0x11c) from [] (cpu_stopper_thread+0xc8/0x160)
<4>[ 21.408482] [] (cpu_stopper_thread+0xc8/0x160) from [] (kthread+0x80/0x88)
<4>[ 21.408498] [] (kthread+0x80/0x88) from [] (kernel_thread_exit+0x0/0x8)

Thanks,
Mike

--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum