* 18rc1 soft lockup
From: Dave Jones @ 2006-07-11 19:03 UTC
To: Linux Kernel

Just saw this during boot of a HT P4 box.

BUG: soft lockup detected on CPU#0!
[<c04051af>] show_trace_log_lvl+0x54/0xfd
[<c0405766>] show_trace+0xd/0x10
[<c0405885>] dump_stack+0x19/0x1b
[<c0450ec7>] softlockup_tick+0xa5/0xb9
[<c042d496>] run_local_timers+0x12/0x14
[<c042d81b>] update_process_times+0x3c/0x61
[<c04179e0>] smp_apic_timer_interrupt+0x6d/0x75
[<c0404ada>] apic_timer_interrupt+0x2a/0x30
BUG: soft lockup detected on CPU#1!
[<c04051af>] show_trace_log_lvl+0x54/0xfd
[<c0405766>] show_trace+0xd/0x10
[<c0405885>] dump_stack+0x19/0x1b
[<c0450ec7>] softlockup_tick+0xa5/0xb9
[<c042d496>] run_local_timers+0x12/0x14
[<c042d81b>] update_process_times+0x3c/0x61
[<c04179e0>] smp_apic_timer_interrupt+0x6d/0x75
[<c0404ada>] apic_timer_interrupt+0x2a/0x30
BUG: soft lockup detected on CPU#0!
[<c04051af>] show_trace_log_lvl+0x54/0xfd
[<c0405766>] show_trace+0xd/0x10
[<c0405885>] dump_stack+0x19/0x1b
[<c0450ec7>] softlockup_tick+0xa5/0xb9
[<c042d496>] run_local_timers+0x12/0x14
[<c042d81b>] update_process_times+0x3c/0x61
[<c04179e0>] smp_apic_timer_interrupt+0x6d/0x75
[<c0404ada>] apic_timer_interrupt+0x2a/0x30

It then continued booting just fine..

Dave

--
http://www.codemonkey.org.uk
* Re: 18rc1 soft lockup
From: john stultz @ 2006-07-11 19:13 UTC
To: Dave Jones; +Cc: Linux Kernel

On Tue, 2006-07-11 at 15:03 -0400, Dave Jones wrote:
> Just saw this during boot of a HT P4 box.
>
> BUG: soft lockup detected on CPU#0!
> [<c04051af>] show_trace_log_lvl+0x54/0xfd
> [<c0405766>] show_trace+0xd/0x10
> [<c0405885>] dump_stack+0x19/0x1b
> [<c0450ec7>] softlockup_tick+0xa5/0xb9
> [<c042d496>] run_local_timers+0x12/0x14
> [<c042d81b>] update_process_times+0x3c/0x61
> [<c04179e0>] smp_apic_timer_interrupt+0x6d/0x75
> [<c0404ada>] apic_timer_interrupt+0x2a/0x30

That's clocksource_adjust/lost tick bug. Roman's fix landed in Linus'
-git yesterday.

thanks
-john
* Re: 18rc1 soft lockup
From: Dave Jones @ 2006-07-11 19:16 UTC
To: john stultz; +Cc: Linux Kernel

On Tue, Jul 11, 2006 at 12:13:47PM -0700, john stultz wrote:
> On Tue, 2006-07-11 at 15:03 -0400, Dave Jones wrote:
> > Just saw this during boot of a HT P4 box.
> >
> > BUG: soft lockup detected on CPU#0!
> > [<c04051af>] show_trace_log_lvl+0x54/0xfd
> > [<c0405766>] show_trace+0xd/0x10
> > [<c0405885>] dump_stack+0x19/0x1b
> > [<c0450ec7>] softlockup_tick+0xa5/0xb9
> > [<c042d496>] run_local_timers+0x12/0x14
> > [<c042d81b>] update_process_times+0x3c/0x61
> > [<c04179e0>] smp_apic_timer_interrupt+0x6d/0x75
> > [<c0404ada>] apic_timer_interrupt+0x2a/0x30
>
> That's clocksource_adjust/lost tick bug. Roman's fix landed in Linus'
> -git yesterday.

Ah, that was actually a .18rc1-git3 tree. I notice a git4 just appeared,
I'll try and reproduce with that.

Dave

--
http://www.codemonkey.org.uk
* Re: 18rc1 soft lockup
From: Dave Jones @ 2006-07-13 22:07 UTC
To: john stultz, Linux Kernel

On Tue, Jul 11, 2006 at 03:16:58PM -0400, Dave Jones wrote:
> On Tue, Jul 11, 2006 at 12:13:47PM -0700, john stultz wrote:
> > On Tue, 2006-07-11 at 15:03 -0400, Dave Jones wrote:
> > > Just saw this during boot of a HT P4 box.
> > >
> > > BUG: soft lockup detected on CPU#0!
> > > [<c04051af>] show_trace_log_lvl+0x54/0xfd
> > > [<c0405766>] show_trace+0xd/0x10
> > > [<c0405885>] dump_stack+0x19/0x1b
> > > [<c0450ec7>] softlockup_tick+0xa5/0xb9
> > > [<c042d496>] run_local_timers+0x12/0x14
> > > [<c042d81b>] update_process_times+0x3c/0x61
> > > [<c04179e0>] smp_apic_timer_interrupt+0x6d/0x75
> > > [<c0404ada>] apic_timer_interrupt+0x2a/0x30
> >
> > That's clocksource_adjust/lost tick bug. Roman's fix landed in Linus'
> > -git yesterday.
>
> Ah, that was actually a .18rc1-git3 tree. I notice a git4 just appeared,
> I'll try and reproduce with that.

Just when I thought it had gotten fixed..
2.6.18rc1-git6 this time on x86-64..

BUG: soft lockup detected on CPU#3!

Call Trace:
[<ffffffff80270865>] show_trace+0xaa/0x23d
[<ffffffff80270a0d>] dump_stack+0x15/0x17
[<ffffffff802c44e6>] softlockup_tick+0xd5/0xea
[<ffffffff80250bea>] run_local_timers+0x13/0x15
[<ffffffff8029cc1d>] update_process_times+0x4c/0x79
[<ffffffff8027bfeb>] smp_local_timer_interrupt+0x2b/0x50
[<ffffffff8027c766>] smp_apic_timer_interrupt+0x58/0x62
[<ffffffff802628ae>] apic_timer_interrupt+0x6a/0x70

Dave

--
http://www.codemonkey.org.uk
* Re: 18rc1 soft lockup
From: john stultz @ 2006-07-13 22:15 UTC
To: Dave Jones; +Cc: Linux Kernel, Roman Zippel

On Thu, 2006-07-13 at 18:07 -0400, Dave Jones wrote:
> On Tue, Jul 11, 2006 at 03:16:58PM -0400, Dave Jones wrote:
> > On Tue, Jul 11, 2006 at 12:13:47PM -0700, john stultz wrote:
> > > On Tue, 2006-07-11 at 15:03 -0400, Dave Jones wrote:
> > > > Just saw this during boot of a HT P4 box.
> > > >
> > > > BUG: soft lockup detected on CPU#0!
> > > > [<c04051af>] show_trace_log_lvl+0x54/0xfd
> > > > [<c0405766>] show_trace+0xd/0x10
> > > > [<c0405885>] dump_stack+0x19/0x1b
> > > > [<c0450ec7>] softlockup_tick+0xa5/0xb9
> > > > [<c042d496>] run_local_timers+0x12/0x14
> > > > [<c042d81b>] update_process_times+0x3c/0x61
> > > > [<c04179e0>] smp_apic_timer_interrupt+0x6d/0x75
> > > > [<c0404ada>] apic_timer_interrupt+0x2a/0x30
> > >
> > > That's clocksource_adjust/lost tick bug. Roman's fix landed in Linus'
> > > -git yesterday.
> >
> > Ah, that was actually a .18rc1-git3 tree. I notice a git4 just appeared,
> > I'll try and reproduce with that.
>
> Just when I thought it had gotten fixed..
> 2.6.18rc1-git6 this time on x86-64..
>
> BUG: soft lockup detected on CPU#3!
>
> Call Trace:
> [<ffffffff80270865>] show_trace+0xaa/0x23d
> [<ffffffff80270a0d>] dump_stack+0x15/0x17
> [<ffffffff802c44e6>] softlockup_tick+0xd5/0xea
> [<ffffffff80250bea>] run_local_timers+0x13/0x15
> [<ffffffff8029cc1d>] update_process_times+0x4c/0x79
> [<ffffffff8027bfeb>] smp_local_timer_interrupt+0x2b/0x50
> [<ffffffff8027c766>] smp_apic_timer_interrupt+0x58/0x62
> [<ffffffff802628ae>] apic_timer_interrupt+0x6a/0x70

Hmmm.. grumble. Was this on bootup, or after some time period?

I'm looking into it.

thanks for the report
-john
* Re: 18rc1 soft lockup
From: Dave Jones @ 2006-07-13 22:28 UTC
To: john stultz; +Cc: Linux Kernel, Roman Zippel

On Thu, Jul 13, 2006 at 03:15:43PM -0700, john stultz wrote:

> > Just when I thought it had gotten fixed..
> > 2.6.18rc1-git6 this time on x86-64..
> >
> > BUG: soft lockup detected on CPU#3!
> >
> > Call Trace:
> > [<ffffffff80270865>] show_trace+0xaa/0x23d
> > [<ffffffff80270a0d>] dump_stack+0x15/0x17
> > [<ffffffff802c44e6>] softlockup_tick+0xd5/0xea
> > [<ffffffff80250bea>] run_local_timers+0x13/0x15
> > [<ffffffff8029cc1d>] update_process_times+0x4c/0x79
> > [<ffffffff8027bfeb>] smp_local_timer_interrupt+0x2b/0x50
> > [<ffffffff8027c766>] smp_apic_timer_interrupt+0x58/0x62
> > [<ffffffff802628ae>] apic_timer_interrupt+0x6a/0x70
>
> Hmmm.. grumble. Was this on bootup, or after some time period?

Right at the end of boot up, between the switch from runlevel 3 to 5.

Dave

--
http://www.codemonkey.org.uk
* Re: 18rc1 soft lockup
From: Roman Zippel @ 2006-07-14 9:22 UTC
To: Dave Jones; +Cc: john stultz, Linux Kernel

Hi,

On Thu, 13 Jul 2006, Dave Jones wrote:

> On Thu, Jul 13, 2006 at 03:15:43PM -0700, john stultz wrote:
>
> > > Just when I thought it had gotten fixed..
> > > 2.6.18rc1-git6 this time on x86-64..
> > >
> > > BUG: soft lockup detected on CPU#3!
> > >
> > > Call Trace:
> > > [<ffffffff80270865>] show_trace+0xaa/0x23d
> > > [<ffffffff80270a0d>] dump_stack+0x15/0x17
> > > [<ffffffff802c44e6>] softlockup_tick+0xd5/0xea
> > > [<ffffffff80250bea>] run_local_timers+0x13/0x15
> > > [<ffffffff8029cc1d>] update_process_times+0x4c/0x79
> > > [<ffffffff8027bfeb>] smp_local_timer_interrupt+0x2b/0x50
> > > [<ffffffff8027c766>] smp_apic_timer_interrupt+0x58/0x62
> > > [<ffffffff802628ae>] apic_timer_interrupt+0x6a/0x70
> >
> > Hmmm.. grumble. Was this on bootup, or after some time period?
>
> Right at the end of boot up, between the switch from runlevel 3 to 5.

When it waits, a SysRq+T might be useful.

bye, Roman
* Re: 18rc1 soft lockup
From: Dave Jones @ 2006-07-14 14:09 UTC
To: Roman Zippel; +Cc: john stultz, Linux Kernel

On Fri, Jul 14, 2006 at 11:22:47AM +0200, Roman Zippel wrote:
> Hi,
>
> On Thu, 13 Jul 2006, Dave Jones wrote:
>
> > On Thu, Jul 13, 2006 at 03:15:43PM -0700, john stultz wrote:
> >
> > > > Just when I thought it had gotten fixed..
> > > > 2.6.18rc1-git6 this time on x86-64..
> > > >
> > > > BUG: soft lockup detected on CPU#3!
> > > >
> > > > Call Trace:
> > > > [<ffffffff80270865>] show_trace+0xaa/0x23d
> > > > [<ffffffff80270a0d>] dump_stack+0x15/0x17
> > > > [<ffffffff802c44e6>] softlockup_tick+0xd5/0xea
> > > > [<ffffffff80250bea>] run_local_timers+0x13/0x15
> > > > [<ffffffff8029cc1d>] update_process_times+0x4c/0x79
> > > > [<ffffffff8027bfeb>] smp_local_timer_interrupt+0x2b/0x50
> > > > [<ffffffff8027c766>] smp_apic_timer_interrupt+0x58/0x62
> > > > [<ffffffff802628ae>] apic_timer_interrupt+0x6a/0x70
> > >
> > > Hmmm.. grumble. Was this on bootup, or after some time period?
> >
> > Right at the end of boot up, between the switch from runlevel 3 to 5.
>
> When it waits, a SysRq+T might be useful.

it doesn't wait. this scrolls by within a split second.

Dave

--
http://www.codemonkey.org.uk
* Re: 18rc1 soft lockup
From: john stultz @ 2006-07-14 23:30 UTC
To: Dave Jones; +Cc: Roman Zippel, Linux Kernel

On Fri, 2006-07-14 at 10:09 -0400, Dave Jones wrote:
> On Fri, Jul 14, 2006 at 11:22:47AM +0200, Roman Zippel wrote:
> > Hi,
> >
> > On Thu, 13 Jul 2006, Dave Jones wrote:
> >
> > > On Thu, Jul 13, 2006 at 03:15:43PM -0700, john stultz wrote:
> > >
> > > > > Just when I thought it had gotten fixed..
> > > > > 2.6.18rc1-git6 this time on x86-64..
> > > > >
> > > > > BUG: soft lockup detected on CPU#3!
> > > > >
> > > > > Call Trace:
> > > > > [<ffffffff80270865>] show_trace+0xaa/0x23d
> > > > > [<ffffffff80270a0d>] dump_stack+0x15/0x17
> > > > > [<ffffffff802c44e6>] softlockup_tick+0xd5/0xea
> > > > > [<ffffffff80250bea>] run_local_timers+0x13/0x15
> > > > > [<ffffffff8029cc1d>] update_process_times+0x4c/0x79
> > > > > [<ffffffff8027bfeb>] smp_local_timer_interrupt+0x2b/0x50
> > > > > [<ffffffff8027c766>] smp_apic_timer_interrupt+0x58/0x62
> > > > > [<ffffffff802628ae>] apic_timer_interrupt+0x6a/0x70
> > > >
> > > > Hmmm.. grumble. Was this on bootup, or after some time period?
> > >
> > > Right at the end of boot up, between the switch from runlevel 3 to 5.
> >
> > When it waits, a SysRq+T might be useful.
>
> it doesn't wait. this scrolls by within a split second.

Huh. I wonder if the x86-64 lost-tick compensation code is being
triggered. Could you boot w/ "report_lost_ticks" and see if it spits
anything out right before this?

thanks
-john
* Re: 18rc1 soft lockup
From: Roman Zippel @ 2006-07-13 23:05 UTC
To: john stultz; +Cc: Dave Jones, Linux Kernel

Hi,

On Thu, 13 Jul 2006, john stultz wrote:

> > Just when I thought it had gotten fixed..
> > 2.6.18rc1-git6 this time on x86-64..
> >
> > BUG: soft lockup detected on CPU#3!
> >
> > Call Trace:
> > [<ffffffff80270865>] show_trace+0xaa/0x23d
> > [<ffffffff80270a0d>] dump_stack+0x15/0x17
> > [<ffffffff802c44e6>] softlockup_tick+0xd5/0xea
> > [<ffffffff80250bea>] run_local_timers+0x13/0x15
> > [<ffffffff8029cc1d>] update_process_times+0x4c/0x79
> > [<ffffffff8027bfeb>] smp_local_timer_interrupt+0x2b/0x50
> > [<ffffffff8027c766>] smp_apic_timer_interrupt+0x58/0x62
> > [<ffffffff802628ae>] apic_timer_interrupt+0x6a/0x70
>
> Hmmm.. grumble. Was this on bootup, or after some time period?
>
> I'm looking into it.

I don't quite understand how this is clock related, soft lockup uses
jiffies and there is nothing clock related in the trace???

bye, Roman
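Roman's remark that the soft lockup check uses jiffies can be illustrated with a small user-space model. This is a simplified sketch, not the kernel's actual softlockup code: the struct, the function names, the HZ value and the 10-second threshold are all assumptions made for illustration. It models a detector that compares the current tick count against the last time a per-CPU watchdog was touched, so whether it fires depends only on how far the tick counter has advanced.

```c
#include <stdbool.h>

/* Assumed values for illustration; the real tick rate depends on kernel config. */
#define HZ 250UL
#define LOCKUP_THRESHOLD (10UL * HZ)

/* Hypothetical per-CPU watchdog state for this model. */
struct cpu_watchdog {
    unsigned long touch_timestamp; /* tick count when the watchdog last ran */
    bool reported;                 /* true once a stall has been reported */
};

/* Wraparound-safe "a is later than b", in the style of the kernel's time_after(). */
bool ticks_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}

/* Called whenever the watchdog thread gets to run on this CPU. */
void watchdog_touch(struct cpu_watchdog *wd, unsigned long jiffies)
{
    wd->touch_timestamp = jiffies;
    wd->reported = false;
}

/* Called from the timer tick; returns true when a lockup report would be printed. */
bool watchdog_tick(struct cpu_watchdog *wd, unsigned long jiffies)
{
    if (wd->reported)
        return false; /* avoid repeating the same report */
    if (!ticks_after(jiffies, wd->touch_timestamp + LOCKUP_THRESHOLD))
        return false; /* watchdog ran recently enough */
    wd->reported = true;
    return true;
}
```

In this model a report can be triggered either by the watchdog not being touched for a long time or by the tick counter advancing by many ticks at once. Which of these, if either, happened on Dave's machines is not established in the thread.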
* Re: 18rc1 soft lockup
From: john stultz @ 2006-07-14 0:02 UTC
To: Roman Zippel; +Cc: Dave Jones, Linux Kernel

On Fri, 2006-07-14 at 01:05 +0200, Roman Zippel wrote:
> Hi,
>
> On Thu, 13 Jul 2006, john stultz wrote:
>
> > > Just when I thought it had gotten fixed..
> > > 2.6.18rc1-git6 this time on x86-64..
> > >
> > > BUG: soft lockup detected on CPU#3!
> > >
> > > Call Trace:
> > > [<ffffffff80270865>] show_trace+0xaa/0x23d
> > > [<ffffffff80270a0d>] dump_stack+0x15/0x17
> > > [<ffffffff802c44e6>] softlockup_tick+0xd5/0xea
> > > [<ffffffff80250bea>] run_local_timers+0x13/0x15
> > > [<ffffffff8029cc1d>] update_process_times+0x4c/0x79
> > > [<ffffffff8027bfeb>] smp_local_timer_interrupt+0x2b/0x50
> > > [<ffffffff8027c766>] smp_apic_timer_interrupt+0x58/0x62
> > > [<ffffffff802628ae>] apic_timer_interrupt+0x6a/0x70
> >
> > Hmmm.. grumble. Was this on bootup, or after some time period?
> >
> > I'm looking into it.
>
> I don't quite understand how this is clock related, soft lockup uses
> jiffies and there is nothing clock related in the trace???

Hmmm. Well, it's easy to check:

Dave, could you comment out the "clocksource_adjust(...)" line in
kernel/timer.c::update_wall_time() just to check if it's the same issue?

thanks
-john
* Re: 18rc1 soft lockup
From: Dave Jones @ 2006-07-14 0:12 UTC
To: john stultz; +Cc: Roman Zippel, Linux Kernel

On Thu, Jul 13, 2006 at 05:02:38PM -0700, john stultz wrote:

> > I don't quite understand how this is clock related, soft lockup uses
> > jiffies and there is nothing clock related in the trace???
>
> Hmmm. Well, it's easy to check:
>
> Dave, could you comment out the "clocksource_adjust(...)" line in
> kernel/timer.c::update_wall_time() just to check if it's the same issue?

I'll try, but just like every other bug I've hit together, it's non-deterministic.
I'll do a half dozen boots to see if it turns up again.

Whatever happened to the good old days of reproducible bugs? :)

Dave

--
http://www.codemonkey.org.uk