On 04/04/2014 03:52 PM, Viresh Kumar wrote:
> On 4 April 2014 13:16, Jet Chen wrote:
>> Hi Viresh,
>>
>> I changed your print message as you suggested.
>>
>> diff --git a/kernel/timer.c b/kernel/timer.c
>> index 6c3a371..193101d 100644
>> --- a/kernel/timer.c
>> +++ b/kernel/timer.c
>> @@ -1617,8 +1617,8 @@ static void migrate_timer_list(struct tvec_base *new_base, struct list_head *hea
>>
>>                 /* Check if CPU still has pinned timers */
>>                 if (unlikely(WARN(is_pinned,
>> -                                 "%s: can't migrate pinned timer: %p, deactivating it\n",
>> -                                 __func__, timer)))
>> +                                 "%s: can't migrate pinned timer: %p, timer->function: %p,deactivating it\n",
>> +                                 __func__, timer, timer->function)))
>>                         continue;
>>
>> Then I reproduced the issue, and got the dmesg output,
>>
>> [ 37.918406] migrate_timer_list: can't migrate pinned timer: ffffffff81f06a60, timer->function: ffffffff810d7010,deactivating it
>>
>> We reproduced this issue for several times in our LKP system. The address of
>> timer ffffffff81f06a60 is very constant. So is timer->function, I believe.
>>
>> Hope this information will help you. Please feel free to tell me what else I
>> can do to help you.
>
> Hi Jet,
>
> Thanks a lot. Yes that's pretty helpful.. But I need some more help from you..
> I don't have any idea which function has this address in your kernel:
> ffffffff810d7010 :)
>
> Can you please debug that a bit more? You need to find which function
> this address belongs to. You can try that using objdump on your vmlinux.
>
> Some help can be found here: Documentation/BUG-HUNTING
>
> Thanks in Advance.
>

The vmlinuz from our build system doesn't have debug information, so it
is hard to use objdump to identify which routine timer->function points
to. But after several trials, I got the dmesg messages below. The address
of timer->function is clearly 0xffffffff810d7010, and the call trace
contains " [] ? clocksource_watchdog_kthread+0x40/0x40 ". So I guess
timer->function is clocksource_watchdog_kthread.
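
As a side note, an address can usually be resolved without debug info by looking it up in the System.map produced by the same build (or in /proc/kallsyms on the running kernel), since both list one "address type name" entry per symbol. The `+0x40/0x40` suffix in the trace is offset/size, so that entry sits exactly at the end of clocksource_watchdog_kthread and may really be the start of the next symbol; a lookup like the one below would settle it. This is only a minimal sketch: the symbol names and addresses in SAMPLE are hypothetical placeholders, not taken from the kernel in this report.

```python
import bisect


def load_symbols(lines):
    """Parse System.map-style lines ("<hex addr> <type> <name>") into a sorted list."""
    syms = []
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        syms.append((int(parts[0], 16), parts[2]))
    syms.sort()
    return syms


def resolve(syms, addr):
    """Return (name, offset) for the last symbol starting at or before addr, else None."""
    addrs = [a for a, _ in syms]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None
    base, name = syms[i]
    return name, addr - base


# Hypothetical symbol table, for illustration only.
SAMPLE = """\
ffffffff81000000 T func_a
ffffffff81000040 t func_b
ffffffff810000c0 t func_c
""".splitlines()

if __name__ == "__main__":
    syms = load_symbols(SAMPLE)
    # An address equal to func_a + its size resolves to the start of func_b.
    print(resolve(syms, 0xFFFFFFFF81000040))
```

With a real kernel, the lines would come from something like `open("System.map")`, and the file has to match the booted image exactly, otherwise the result is meaningless.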

I manually disabled CONFIG_CLOCKSOURCE_WATCHDOG, and after that I never
saw this oops again (but I do see other oopses for other reasons :( ).

[ 37.918345] WARNING: CPU: 0 PID: 1932 at kernel/timer.c:1621 migrate_timer_list+0xdb/0xf0()
[ 37.918406] migrate_timer_list: can't migrate pinned timer: ffffffff81f06a60, timer->function: ffffffff810d7010,deactivating it
[ 37.918406] Modules linked in:
[ 37.918406] CPU: 0 PID: 1932 Comm: 01-cpu-hotplug Not tainted 3.14.0-rc1-00088-gab3c4fd #4
[ 37.918406] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 37.918406]  0000000000000009 ffff88001d407c38 ffffffff817237bd ffff88001d407c80
[ 37.918406]  ffff88001d407c70 ffffffff8106a1dd 0000000000000010 ffffffff81f06a60
[ 37.918406]  ffff88001e04d040 ffffffff81e3d4c0 ffff88001e04d030 ffff88001d407cd0
[ 37.918406] Call Trace:
[ 37.918406]  [] dump_stack+0x4d/0x66
[ 37.918406]  [] warn_slowpath_common+0x7d/0xa0
[ 37.918406]  [] warn_slowpath_fmt+0x4c/0x50
[ 37.918406]  [] ? __internal_add_timer+0x113/0x130
[ 37.918406]  [] ? clocksource_watchdog_kthread+0x40/0x40
[ 37.918406]  [] migrate_timer_list+0xdb/0xf0
[ 37.918406]  [] timer_cpu_notify+0xfc/0x1f0
[ 37.918406]  [] notifier_call_chain+0x4c/0x70
[ 37.918406]  [] __raw_notifier_call_chain+0xe/0x10
[ 37.918406]  [] cpu_notify+0x23/0x50
[ 37.918406]  [] cpu_notify_nofail+0xe/0x20
[ 37.918406]  [] _cpu_down+0x1ad/0x2e0
[ 37.918406]  [] cpu_down+0x34/0x50
[ 37.918406]  [] cpu_subsys_offline+0x14/0x20
[ 37.918406]  [] device_offline+0x95/0xc0
[ 37.918406]  [] online_store+0x40/0x90
[ 37.918406]  [] dev_attr_store+0x18/0x30
[ 37.918406]  [] sysfs_kf_write+0x3d/0x50