* 2.6.32.21 - uptime related crashes?
@ 2011-04-28  8:26 Nikola Ciprich
  2011-04-28 18:34 ` [stable] " Willy Tarreau
  0 siblings, 1 reply; 58+ messages in thread
From: Nikola Ciprich @ 2011-04-28  8:26 UTC (permalink / raw)
  To: linux-kernel mlist; +Cc: linux-stable mlist

[-- Attachment #1: Type: text/plain, Size: 3798 bytes --]

Hello everybody,

I'm trying to solve a strange issue: today my fourth machine running
2.6.32.21 crashed. What makes the cases similar, apart from the same
kernel version, is that all the boxes had very similar uptimes: 214, 216,
216, and 224 days. This might just be a coincidence, but I think it might
be important.
Unfortunately I only have backtraces of two of the crashes (and those are
trimmed, sorry), and they do not look as similar as I'd like, but maybe
there is still something in common:

[<ffffffff81120cc7>] pollwake+0x57/0x60
[<ffffffff81046720>] ? default_wake_function+0x0/0x10
[<ffffffff8103683a>] __wake_up_common+0x5a/0x90
[<ffffffff8103a313>] __wake_up+0x43/0x70
[<ffffffffa0321573>] process_masterspan+0x643/0x670 [dahdi]
[<ffffffffa0326595>] coretimer_func+0x135/0x1d0 [dahdi]
[<ffffffff8105d74d>] run_timer_softirq+0x15d/0x320
[<ffffffffa0326460>] ? coretimer_func+0x0/0x1d0 [dahdi]
[<ffffffff8105690c>] __do_softirq+0xcc/0x220
[<ffffffff8100c40c>] call_softirq+0x1c/0x30
[<ffffffff8100e3ba>] do_softirq+0x4a/0x80
[<ffffffff810567c7>] irq_exit+0x87/0x90
[<ffffffff8100d7b7>] do_IRQ+0x77/0xf0
[<ffffffff8100bc53>] ret_from_intr+0x0/0xa
<EOI> [<ffffffffa019e556>] ? acpi_idle_enter_bm+0x273/0x2a1 [processor]
[<ffffffffa019e54c>] ? acpi_idle_enter_bm+0x269/0x2a1 [processor]
[<ffffffff81280095>] ? cpuidle_idle_call+0xa5/0x150
[<ffffffff8100a18f>] ? cpu_idle+0x4f/0x90
[<ffffffff81323c95>] ? rest_init+0x75/0x80
[<ffffffff81582d7f>] ? start_kernel+0x2ef/0x390
[<ffffffff81582271>] ? x86_64_start_reservations+0x81/0xc0
[<ffffffff81582386>] ? x86_64_start_kernel+0xd6/0x100

this box (actually two of the crashed ones) is using the dahdi_dummy
module to generate timing for an asterisk SW pbx, so maybe it's related
to that.

[<ffffffff810a5063>] handle_IRQ_event+0x63/0x1c0
[<ffffffff810a71ae>] handle_edge_irq+0xce/0x160
[<ffffffff8100e1bf>] handle_irq+0x1f/0x30
[<ffffffff8100d7ae>] do_IRQ+0x6e/0xf0
[<ffffffff8100bc53>] ret_from_intr+0x0/0xa
<EOI> [<ffffffff8133?f?f>] ? _spin_unlock_irq+0xf/0x40
[<ffffffff81337f79>] ? _spin_unlock_irq+0x9/0x40
[<ffffffff81064b9a>] ? exit_signals+0x8a/0x130
[<ffffffff8105372e>] ? do_exit+0x7e/0x7d0
[<ffffffff8100f8a7>] ? oops_end+0xa7/0xb0
[<ffffffff8100faa6>] ? die+0x56/0x90
[<ffffffff8100c810>] ? do_trap+0x130/0x150
[<ffffffff8100ccca>] ? do_divide_error+0x8a/0xa0
[<ffffffff8103d227>] ? find_busiest_group+0x3d7/0xa00
[<ffffffff8104400b>] ? cpuacct_charge+0x6b/0x90
[<ffffffff8100c045>] ? divide_error+0x15/0x20
[<ffffffff8103d227>] ? find_busiest_group+0x3d7/0xa00
[<ffffffff8103cfff>] ? find_busiest_group+0x1af/0xa00
[<ffffffff81335483>] ? thread_return+0x4ce/0x7bb
[<ffffffff8133bec5>] ? do_nanosleep+0x75/0x30
[<ffffffff810?1?4e>] ? hrtimer_nanosleep+0x9e/0x120
[<ffffffff810?08f0>] ? hrtimer_wakeup+0x0/0x30
[<ffffffff810?183f>] ? sys_nanosleep+0x6f/0x80

the other two boxes don't use it. The only similarity I see is that both
traces look IRQ-handling related; otherwise they don't have much in
common.
Does anybody have an idea where I should look? Of course I should update
all those boxes to (at least) the latest 2.6.32.x, and I'll do that for
sure, but I'd first like to know where the problem was, whether it has
been fixed, or how to fix it...
I'd be grateful for any help...

BR

nik

-- 
-------------------------------------
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28. rijna 168, 709 01 Ostrava

tel.:  +420 596 603 142
fax:   +420 596 621 273
mobil: +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: servis@linuxbox.cz
-------------------------------------

[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 58+ messages in thread
* Re: [stable] 2.6.32.21 - uptime related crashes?
  2011-04-28  8:26 2.6.32.21 - uptime related crashes? Nikola Ciprich
@ 2011-04-28 18:34 ` Willy Tarreau
  2011-04-29 10:02   ` Nikola Ciprich
  2011-05-13 22:08   ` Nicolas Carlier
  0 siblings, 2 replies; 58+ messages in thread
From: Willy Tarreau @ 2011-04-28 18:34 UTC (permalink / raw)
  To: Nikola Ciprich
  Cc: linux-kernel mlist, linux-stable mlist, Hervé Commowick

Hello Nikola,

On Thu, Apr 28, 2011 at 10:26:25AM +0200, Nikola Ciprich wrote:
> Hello everybody,
>
> I'm trying to solve a strange issue: today my fourth machine running
> 2.6.32.21 crashed. What makes the cases similar, apart from the same
> kernel version, is that all the boxes had very similar uptimes: 214,
> 216, 216, and 224 days. This might just be a coincidence, but I think
> it might be important.

Interestingly, one of our customers just had two machines that crashed
yesterday after 212 days and 212 days + 20h respectively. They were
running Debian's 2.6.32-bpo.5-amd64, which is based on 2.6.32.23 AIUI.

The crash looks very similar to the following bug which we have updated:

  https://bugzilla.kernel.org/show_bug.cgi?id=16991

(bugzilla doesn't appear to respond as I'm posting this mail).

The top of your output is missing. In our case, as in the reports on the
bug above, there was a divide-by-zero error. Did you happen to spot this
one too, or do you just not know? I observe "divide_error+0x15/0x20" in
one of your reports, so it's possible that it matches the same pattern,
at least for one trace. Just in case, it would be nice to feed the
bugzilla entry above.

> Unfortunately I only have backtraces of two of the crashes (and those
> are trimmed, sorry), and they do not look as similar as I'd like, but
> maybe there is still something in common:
>
> [<ffffffff81120cc7>] pollwake+0x57/0x60
> [<ffffffff81046720>] ? default_wake_function+0x0/0x10
> [<ffffffff8103683a>] __wake_up_common+0x5a/0x90
> [<ffffffff8103a313>] __wake_up+0x43/0x70
> [<ffffffffa0321573>] process_masterspan+0x643/0x670 [dahdi]
> [<ffffffffa0326595>] coretimer_func+0x135/0x1d0 [dahdi]
> [<ffffffff8105d74d>] run_timer_softirq+0x15d/0x320
> [<ffffffffa0326460>] ? coretimer_func+0x0/0x1d0 [dahdi]
> [<ffffffff8105690c>] __do_softirq+0xcc/0x220
> [<ffffffff8100c40c>] call_softirq+0x1c/0x30
> [<ffffffff8100e3ba>] do_softirq+0x4a/0x80
> [<ffffffff810567c7>] irq_exit+0x87/0x90
> [<ffffffff8100d7b7>] do_IRQ+0x77/0xf0
> [<ffffffff8100bc53>] ret_from_intr+0x0/0xa
> <EOI> [<ffffffffa019e556>] ? acpi_idle_enter_bm+0x273/0x2a1 [processor]
> [<ffffffffa019e54c>] ? acpi_idle_enter_bm+0x269/0x2a1 [processor]
> [<ffffffff81280095>] ? cpuidle_idle_call+0xa5/0x150
> [<ffffffff8100a18f>] ? cpu_idle+0x4f/0x90
> [<ffffffff81323c95>] ? rest_init+0x75/0x80
> [<ffffffff81582d7f>] ? start_kernel+0x2ef/0x390
> [<ffffffff81582271>] ? x86_64_start_reservations+0x81/0xc0
> [<ffffffff81582386>] ? x86_64_start_kernel+0xd6/0x100
>
> this box (actually two of the crashed ones) is using the dahdi_dummy
> module to generate timing for an asterisk SW pbx, so maybe it's related
> to that.
>
> [<ffffffff810a5063>] handle_IRQ_event+0x63/0x1c0
> [<ffffffff810a71ae>] handle_edge_irq+0xce/0x160
> [<ffffffff8100e1bf>] handle_irq+0x1f/0x30
> [<ffffffff8100d7ae>] do_IRQ+0x6e/0xf0
> [<ffffffff8100bc53>] ret_from_intr+0x0/0xa
> <EOI> [<ffffffff8133?f?f>] ? _spin_unlock_irq+0xf/0x40
> [<ffffffff81337f79>] ? _spin_unlock_irq+0x9/0x40
> [<ffffffff81064b9a>] ? exit_signals+0x8a/0x130
> [<ffffffff8105372e>] ? do_exit+0x7e/0x7d0
> [<ffffffff8100f8a7>] ? oops_end+0xa7/0xb0
> [<ffffffff8100faa6>] ? die+0x56/0x90
> [<ffffffff8100c810>] ? do_trap+0x130/0x150
> [<ffffffff8100ccca>] ? do_divide_error+0x8a/0xa0
> [<ffffffff8103d227>] ? find_busiest_group+0x3d7/0xa00
> [<ffffffff8104400b>] ? cpuacct_charge+0x6b/0x90
> [<ffffffff8100c045>] ? divide_error+0x15/0x20
> [<ffffffff8103d227>] ? find_busiest_group+0x3d7/0xa00
> [<ffffffff8103cfff>] ? find_busiest_group+0x1af/0xa00
> [<ffffffff81335483>] ? thread_return+0x4ce/0x7bb
> [<ffffffff8133bec5>] ? do_nanosleep+0x75/0x30
> [<ffffffff810?1?4e>] ? hrtimer_nanosleep+0x9e/0x120
> [<ffffffff810?08f0>] ? hrtimer_wakeup+0x0/0x30
> [<ffffffff810?183f>] ? sys_nanosleep+0x6f/0x80
>
> the other two boxes don't use it. The only similarity I see is that
> both traces look IRQ-handling related; otherwise they don't have much
> in common.
> Does anybody have an idea where I should look? Of course I should
> update all those boxes to (at least) the latest 2.6.32.x, and I'll do
> that for sure, but I'd first like to know where the problem was,
> whether it has been fixed, or how to fix it...
> I'd be grateful for any help...

There were quite a bunch of scheduler updates recently. We may be lucky
and hope for the bug to have vanished with the changes, but we may as
well see the same crash in 7 months :-/

My coworker Hervé (CC'd), who worked on the issue, suggests that we
might have something which goes wrong past a certain uptime (e.g. 212
days), which needs a special event to be triggered (I/O, process
exiting, etc...). I think this makes quite some sense.

Could you check your CONFIG_HZ so that we could convert those uptimes to
jiffies? Maybe this will ring a bell in someone's head :-/

Best regards,
Willy

^ permalink raw reply	[flat|nested] 58+ messages in thread
* Re: [stable] 2.6.32.21 - uptime related crashes?
  2011-04-28 18:34 ` [stable] " Willy Tarreau
@ 2011-04-29 10:02   ` Nikola Ciprich
  2011-04-30  9:36     ` Willy Tarreau
  2011-05-06  3:12     ` [stable] " Hidetoshi Seto
  1 sibling, 2 replies; 58+ messages in thread
From: Nikola Ciprich @ 2011-04-29 10:02 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: linux-kernel mlist, linux-stable mlist, Hervé Commowick, seto.hidetoshi

[-- Attachment #1: Type: text/plain, Size: 8125 bytes --]

(another CC added)

Hello Willy!

I made some statistics of our servers regarding kernel version and
uptime. Here are some of my thoughts:
- I'm 100% sure this problem wasn't present in kernels <= 2.6.30.x
  (we've got a lot of boxes with uptimes >600 days)
- I'm 90% sure this problem also wasn't present in 2.6.32.16 (we've got
  6 boxes running for 235 to 280 days)

What I'm not sure about is whether this is present in 2.6.32.19; I have
2 boxes running 2.6.32.19 for 238 days and one 2.6.32.20 for 216 days.
I also have a bunch of 2.6.32.23 boxes, which are now getting close to
200 days of uptime. But I suspect this really is the first problematic
version; more on that later.

First, regarding your question about CONFIG_HZ: we use the 250 HZ
setting, which leads me to the following:
250 * 60 * 60 * 24 * 199 = 4298400000, which is a value a little over
2**32! So maybe some unsigned long variable might overflow? Does this
make sense?

And as to my suspicion about 2.6.32.19, there is one commit which may be
related:

commit 0cf55e1ec08bb5a22e068309e2d8ba1180ab4239
Author: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Date:   Wed Dec 2 17:28:07 2009 +0900

    sched, cputime: Introduce thread_group_times()

    This is a real fix for problem of utime/stime values decreasing
    described in the thread:

        http://lkml.org/lkml/2009/11/3/522

    Now cputime is accounted in the following way:

    - {u,s}time in task_struct are increased every time when the thread
      is interrupted by a tick (timer interrupt).

    - When a thread exits, its {u,s}time are added to signal->{u,s}time,
      after adjusted by task_times().

    - When all threads in a thread_group exits, accumulated {u,s}time
      (and also c{u,s}time) in signal struct are added to c{u,s}time
      in signal struct of the group's parent.
    .
    .
    .

I haven't studied this in detail yet, but it seems to me it might really
be related. Hidetoshi-san, do you have an opinion about this? Could this
somehow either create or invoke the problem with overflow of some
variable which would lead to division by zero or similar problems?

Any other thoughts?

best regards

nik

On Thu, Apr 28, 2011 at 08:34:34PM +0200, Willy Tarreau wrote:
> Hello Nikola,
>
> On Thu, Apr 28, 2011 at 10:26:25AM +0200, Nikola Ciprich wrote:
> > Hello everybody,
> >
> > I'm trying to solve a strange issue: today my fourth machine running
> > 2.6.32.21 crashed. What makes the cases similar, apart from the same
> > kernel version, is that all the boxes had very similar uptimes: 214,
> > 216, 216, and 224 days. This might just be a coincidence, but I
> > think it might be important.
>
> Interestingly, one of our customers just had two machines that crashed
> yesterday after 212 days and 212 days + 20h respectively. They were
> running Debian's 2.6.32-bpo.5-amd64, which is based on 2.6.32.23 AIUI.
>
> The crash looks very similar to the following bug which we have updated:
>
>   https://bugzilla.kernel.org/show_bug.cgi?id=16991
>
> (bugzilla doesn't appear to respond as I'm posting this mail).
>
> The top of your output is missing. In our case, as in the reports on
> the bug above, there was a divide-by-zero error. Did you happen to
> spot this one too, or do you just not know? I observe
> "divide_error+0x15/0x20" in one of your reports, so it's possible that
> it matches the same pattern, at least for one trace. Just in case, it
> would be nice to feed the bugzilla entry above.
>
> > Unfortunately I only have backtraces of two of the crashes (and
> > those are trimmed, sorry), and they do not look as similar as I'd
> > like, but maybe there is still something in common:
> >
> > [<ffffffff81120cc7>] pollwake+0x57/0x60
> > [<ffffffff81046720>] ? default_wake_function+0x0/0x10
> > [<ffffffff8103683a>] __wake_up_common+0x5a/0x90
> > [<ffffffff8103a313>] __wake_up+0x43/0x70
> > [<ffffffffa0321573>] process_masterspan+0x643/0x670 [dahdi]
> > [<ffffffffa0326595>] coretimer_func+0x135/0x1d0 [dahdi]
> > [<ffffffff8105d74d>] run_timer_softirq+0x15d/0x320
> > [<ffffffffa0326460>] ? coretimer_func+0x0/0x1d0 [dahdi]
> > [<ffffffff8105690c>] __do_softirq+0xcc/0x220
> > [<ffffffff8100c40c>] call_softirq+0x1c/0x30
> > [<ffffffff8100e3ba>] do_softirq+0x4a/0x80
> > [<ffffffff810567c7>] irq_exit+0x87/0x90
> > [<ffffffff8100d7b7>] do_IRQ+0x77/0xf0
> > [<ffffffff8100bc53>] ret_from_intr+0x0/0xa
> > <EOI> [<ffffffffa019e556>] ? acpi_idle_enter_bm+0x273/0x2a1 [processor]
> > [<ffffffffa019e54c>] ? acpi_idle_enter_bm+0x269/0x2a1 [processor]
> > [<ffffffff81280095>] ? cpuidle_idle_call+0xa5/0x150
> > [<ffffffff8100a18f>] ? cpu_idle+0x4f/0x90
> > [<ffffffff81323c95>] ? rest_init+0x75/0x80
> > [<ffffffff81582d7f>] ? start_kernel+0x2ef/0x390
> > [<ffffffff81582271>] ? x86_64_start_reservations+0x81/0xc0
> > [<ffffffff81582386>] ? x86_64_start_kernel+0xd6/0x100
> >
> > this box (actually two of the crashed ones) is using the dahdi_dummy
> > module to generate timing for an asterisk SW pbx, so maybe it's
> > related to that.
> >
> > [<ffffffff810a5063>] handle_IRQ_event+0x63/0x1c0
> > [<ffffffff810a71ae>] handle_edge_irq+0xce/0x160
> > [<ffffffff8100e1bf>] handle_irq+0x1f/0x30
> > [<ffffffff8100d7ae>] do_IRQ+0x6e/0xf0
> > [<ffffffff8100bc53>] ret_from_intr+0x0/0xa
> > <EOI> [<ffffffff8133?f?f>] ? _spin_unlock_irq+0xf/0x40
> > [<ffffffff81337f79>] ? _spin_unlock_irq+0x9/0x40
> > [<ffffffff81064b9a>] ? exit_signals+0x8a/0x130
> > [<ffffffff8105372e>] ? do_exit+0x7e/0x7d0
> > [<ffffffff8100f8a7>] ? oops_end+0xa7/0xb0
> > [<ffffffff8100faa6>] ? die+0x56/0x90
> > [<ffffffff8100c810>] ? do_trap+0x130/0x150
> > [<ffffffff8100ccca>] ? do_divide_error+0x8a/0xa0
> > [<ffffffff8103d227>] ? find_busiest_group+0x3d7/0xa00
> > [<ffffffff8104400b>] ? cpuacct_charge+0x6b/0x90
> > [<ffffffff8100c045>] ? divide_error+0x15/0x20
> > [<ffffffff8103d227>] ? find_busiest_group+0x3d7/0xa00
> > [<ffffffff8103cfff>] ? find_busiest_group+0x1af/0xa00
> > [<ffffffff81335483>] ? thread_return+0x4ce/0x7bb
> > [<ffffffff8133bec5>] ? do_nanosleep+0x75/0x30
> > [<ffffffff810?1?4e>] ? hrtimer_nanosleep+0x9e/0x120
> > [<ffffffff810?08f0>] ? hrtimer_wakeup+0x0/0x30
> > [<ffffffff810?183f>] ? sys_nanosleep+0x6f/0x80
> >
> > the other two boxes don't use it. The only similarity I see is that
> > both traces look IRQ-handling related; otherwise they don't have
> > much in common.
> > Does anybody have an idea where I should look? Of course I should
> > update all those boxes to (at least) the latest 2.6.32.x, and I'll
> > do that for sure, but I'd first like to know where the problem was,
> > whether it has been fixed, or how to fix it...
> > I'd be grateful for any help...
>
> There were quite a bunch of scheduler updates recently. We may be
> lucky and hope for the bug to have vanished with the changes, but we
> may as well see the same crash in 7 months :-/
>
> My coworker Hervé (CC'd), who worked on the issue, suggests that we
> might have something which goes wrong past a certain uptime (e.g. 212
> days), which needs a special event to be triggered (I/O, process
> exiting, etc...). I think this makes quite some sense.
>
> Could you check your CONFIG_HZ so that we could convert those uptimes
> to jiffies? Maybe this will ring a bell in someone's head :-/
>
> Best regards,
> Willy
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>

-- 
-------------------------------------
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28. rijna 168, 709 01 Ostrava

tel.:  +420 596 603 142
fax:   +420 596 621 273
mobil: +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: servis@linuxbox.cz
-------------------------------------

[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 58+ messages in thread
* Re: [stable] 2.6.32.21 - uptime related crashes?
  2011-04-29 10:02 ` Nikola Ciprich
@ 2011-04-30  9:36   ` Willy Tarreau
  2011-04-30 11:22     ` Henrique de Moraes Holschuh
  ` (2 more replies)
  2011-05-06  3:12   ` [stable] " Hidetoshi Seto
  1 sibling, 3 replies; 58+ messages in thread
From: Willy Tarreau @ 2011-04-30  9:36 UTC (permalink / raw)
  To: Nikola Ciprich
  Cc: linux-kernel mlist, linux-stable mlist, Hervé Commowick, seto.hidetoshi

Hello Nikola,

On Fri, Apr 29, 2011 at 12:02:00PM +0200, Nikola Ciprich wrote:
> (another CC added)
>
> Hello Willy!
>
> I made some statistics of our servers regarding kernel version and
> uptime. Here are some of my thoughts:
> - I'm 100% sure this problem wasn't present in kernels <= 2.6.30.x
>   (we've got a lot of boxes with uptimes >600 days)
> - I'm 90% sure this problem also wasn't present in 2.6.32.16 (we've
>   got 6 boxes running for 235 to 280 days)

OK, those are all precious pieces of information.

> What I'm not sure about is whether this is present in 2.6.32.19; I
> have 2 boxes running 2.6.32.19 for 238 days and one 2.6.32.20 for 216
> days.
> I also have a bunch of 2.6.32.23 boxes, which are now getting close to
> 200 days of uptime. But I suspect this really is the first problematic
> version; more on that later.
> First, regarding your question about CONFIG_HZ: we use the 250 HZ
> setting, which leads me to the following:
> 250 * 60 * 60 * 24 * 199 = 4298400000, which is a value a little over
> 2**32! So maybe some unsigned long variable might overflow? Does this
> make sense?

Yes, of course it makes sense; that was also my worry. 2^32 jiffies at
250 Hz is slightly less than 199 days. Maybe an overflow somewhere keeps
propagating wrong results through some computations. I remember having
encountered a lot of funny things when trying to get 2.4 past the
497-day limit using the jiffies64 patch, so I would not be surprised at
all if we're in a similar situation here.

Also, I've checked the Debian kernel config where we had the divide
overflow and it was running at 250 Hz too.

> And as to my suspicion about 2.6.32.19, there is one commit which may
> be related:
>
> commit 0cf55e1ec08bb5a22e068309e2d8ba1180ab4239
> Author: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
> Date:   Wed Dec 2 17:28:07 2009 +0900
>
>     sched, cputime: Introduce thread_group_times()
>
>     This is a real fix for problem of utime/stime values decreasing
>     described in the thread:
>
>         http://lkml.org/lkml/2009/11/3/522
>
>     Now cputime is accounted in the following way:
>
>     - {u,s}time in task_struct are increased every time when the thread
>       is interrupted by a tick (timer interrupt).
>
>     - When a thread exits, its {u,s}time are added to signal->{u,s}time,
>       after adjusted by task_times().
>
>     - When all threads in a thread_group exits, accumulated {u,s}time
>       (and also c{u,s}time) in signal struct are added to c{u,s}time
>       in signal struct of the group's parent.
>     .
>     .
>     .
>
> I haven't studied this in detail yet, but it seems to me it might
> really be related. Hidetoshi-san, do you have an opinion about this?
> Could this somehow either create or invoke the problem with overflow
> of some variable which would lead to division by zero or similar
> problems?
>
> Any other thoughts?

There was a kernel parameter in the past that was used to make jiffies
wrap a few minutes after boot; maybe we should revive it to try to
reproduce without waiting 7 new months :-/

Last, the "advantage" with a suspected regression in a stable series is
that there are a lot fewer patches to test.

Regards,
Willy

^ permalink raw reply	[flat|nested] 58+ messages in thread
* Re: [stable] 2.6.32.21 - uptime related crashes?
  2011-04-30  9:36 ` Willy Tarreau
@ 2011-04-30 11:22   ` Henrique de Moraes Holschuh
  2011-04-30 11:54     ` Willy Tarreau
  2011-04-30 12:02   ` Nikola Ciprich
  2011-04-30 17:39   ` Faidon Liambotis
  2 siblings, 1 reply; 58+ messages in thread
From: Henrique de Moraes Holschuh @ 2011-04-30 11:22 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Nikola Ciprich, linux-kernel mlist, linux-stable mlist, Hervé Commowick, seto.hidetoshi

On Sat, 30 Apr 2011, Willy Tarreau wrote:
> There was a kernel parameter in the past that was used to make jiffies
> wrap a few minutes after boot; maybe we should revive it to try to
> reproduce without waiting 7 new months :-/

IMHO, such an option should be given a permanent home in KCONFIG, and it
should be enabled by default.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

^ permalink raw reply	[flat|nested] 58+ messages in thread
* Re: [stable] 2.6.32.21 - uptime related crashes?
  2011-04-30 11:22 ` Henrique de Moraes Holschuh
@ 2011-04-30 11:54   ` Willy Tarreau
  2011-04-30 12:32     ` Henrique de Moraes Holschuh
  0 siblings, 1 reply; 58+ messages in thread
From: Willy Tarreau @ 2011-04-30 11:54 UTC (permalink / raw)
  To: Henrique de Moraes Holschuh
  Cc: seto.hidetoshi, Hervé Commowick, linux-kernel mlist, linux-stable mlist

On Sat, Apr 30, 2011 at 08:22:44AM -0300, Henrique de Moraes Holschuh wrote:
> On Sat, 30 Apr 2011, Willy Tarreau wrote:
> > There was a kernel parameter in the past that was used to make
> > jiffies wrap a few minutes after boot; maybe we should revive it to
> > try to reproduce without waiting 7 new months :-/
>
> IMHO, such an option should be given a permanent home in KCONFIG, and
> it should be enabled by default.

Not exactly, as it affects uptime.

Willy

^ permalink raw reply	[flat|nested] 58+ messages in thread
* Re: [stable] 2.6.32.21 - uptime related crashes?
  2011-04-30 11:54 ` Willy Tarreau
@ 2011-04-30 12:32   ` Henrique de Moraes Holschuh
  0 siblings, 0 replies; 58+ messages in thread
From: Henrique de Moraes Holschuh @ 2011-04-30 12:32 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: seto.hidetoshi, Hervé Commowick, linux-kernel mlist, linux-stable mlist

On Sat, 30 Apr 2011, Willy Tarreau wrote:
> On Sat, Apr 30, 2011 at 08:22:44AM -0300, Henrique de Moraes Holschuh wrote:
> > On Sat, 30 Apr 2011, Willy Tarreau wrote:
> > > There was a kernel parameter in the past that was used to make
> > > jiffies wrap a few minutes after boot; maybe we should revive it
> > > to try to reproduce without waiting 7 new months :-/
> >
> > IMHO, such an option should be given a permanent home in KCONFIG,
> > and it should be enabled by default.
>
> Not exactly, as it affects uptime.

Well, the offset to the real uptime is known; is there a way to fudge it
back so that it affects only the uptime reporting?

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

^ permalink raw reply	[flat|nested] 58+ messages in thread
* Re: [stable] 2.6.32.21 - uptime related crashes?
  2011-04-30  9:36 ` Willy Tarreau
@ 2011-04-30 12:02   ` Nikola Ciprich
  2011-04-30 15:57     ` Greg KH
  0 siblings, 1 reply; 58+ messages in thread
From: Nikola Ciprich @ 2011-04-30 12:02 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: linux-kernel mlist, linux-stable mlist, Hervé Commowick, seto.hidetoshi

[-- Attachment #1: Type: text/plain, Size: 1098 bytes --]

> There was a kernel parameter in the past that was used to make jiffies
> wrap a few minutes after boot; maybe we should revive it to try to
> reproduce without waiting 7 new months :-/

hmm, I've googled this up, and it seems to have been a 2.5.x patch, so
it will certainly need some porting...
I'll try to have a look at it tonight and report back...

> Last, the "advantage" with a suspected regression in a stable series
> is that there are a lot fewer patches to test.
>
> Regards,
> Willy
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>

-- 
-------------------------------------
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28. rijna 168, 709 01 Ostrava

tel.:  +420 596 603 142
fax:   +420 596 621 273
mobil: +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: servis@linuxbox.cz
-------------------------------------

[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 58+ messages in thread
* Re: [stable] 2.6.32.21 - uptime related crashes?
  2011-04-30 12:02 ` Nikola Ciprich
@ 2011-04-30 15:57   ` Greg KH
  2011-04-30 16:08     ` Randy Dunlap
  0 siblings, 1 reply; 58+ messages in thread
From: Greg KH @ 2011-04-30 15:57 UTC (permalink / raw)
  To: Nikola Ciprich
  Cc: Willy Tarreau, seto.hidetoshi, Hervé Commowick, linux-kernel mlist, linux-stable mlist

On Sat, Apr 30, 2011 at 02:02:05PM +0200, Nikola Ciprich wrote:
> > There was a kernel parameter in the past that was used to make
> > jiffies wrap a few minutes after boot; maybe we should revive it to
> > try to reproduce without waiting 7 new months :-/
> hmm, I've googled this up, and it seems to have been a 2.5.x patch, so
> it will certainly need some porting...
> I'll try to have a look at it tonight and report back...

I don't think any patch is needed, I thought we did that by default now,
but I can't seem to find the code where it happens...

odd.

greg k-h

^ permalink raw reply	[flat|nested] 58+ messages in thread
* Re: [stable] 2.6.32.21 - uptime related crashes?
  2011-04-30 15:57 ` Greg KH
@ 2011-04-30 16:08   ` Randy Dunlap
  2011-04-30 16:49     ` Willy Tarreau
  0 siblings, 1 reply; 58+ messages in thread
From: Randy Dunlap @ 2011-04-30 16:08 UTC (permalink / raw)
  To: Greg KH
  Cc: Nikola Ciprich, Willy Tarreau, seto.hidetoshi, Hervé Commowick, linux-kernel mlist, linux-stable mlist

On Sat, 30 Apr 2011 08:57:07 -0700 Greg KH wrote:

> On Sat, Apr 30, 2011 at 02:02:05PM +0200, Nikola Ciprich wrote:
> > > There was a kernel parameter in the past that was used to make
> > > jiffies wrap a few minutes after boot; maybe we should revive it
> > > to try to reproduce without waiting 7 new months :-/
> > hmm, I've googled this up, and it seems to have been a 2.5.x patch,
> > so it will certainly need some porting...
> > I'll try to have a look at it tonight and report back...
>
> I don't think any patch is needed, I thought we did that by default
> now, but I can't seem to find the code where it happens...
>
> odd.

linux/jiffies.h:

/*
 * Have the 32 bit jiffies value wrap 5 minutes after boot
 * so jiffies wrap bugs show up earlier.
 */
#define INITIAL_JIFFIES ((unsigned long)(unsigned int) (-300*HZ))

and kernel/timer.c:

u64 jiffies_64 __cacheline_aligned_in_smp = INITIAL_JIFFIES;

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

^ permalink raw reply	[flat|nested] 58+ messages in thread
* Re: [stable] 2.6.32.21 - uptime related crashes?
  2011-04-30 16:08 ` Randy Dunlap
@ 2011-04-30 16:49   ` Willy Tarreau
  2011-04-30 18:14     ` Henrique de Moraes Holschuh
  0 siblings, 1 reply; 58+ messages in thread
From: Willy Tarreau @ 2011-04-30 16:49 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: Greg KH, Nikola Ciprich, seto.hidetoshi, Hervé Commowick, linux-kernel mlist, linux-stable mlist

On Sat, Apr 30, 2011 at 09:08:05AM -0700, Randy Dunlap wrote:
> On Sat, 30 Apr 2011 08:57:07 -0700 Greg KH wrote:
> > I don't think any patch is needed, I thought we did that by default
> > now, but I can't seem to find the code where it happens...
> >
> > odd.
>
> linux/jiffies.h:
>
> /*
>  * Have the 32 bit jiffies value wrap 5 minutes after boot
>  * so jiffies wrap bugs show up earlier.
>  */
> #define INITIAL_JIFFIES ((unsigned long)(unsigned int) (-300*HZ))
>
> and kernel/timer.c:
>
> u64 jiffies_64 __cacheline_aligned_in_smp = INITIAL_JIFFIES;

Thanks Randy.

So that would mean that wrapping jiffies should be unrelated to the
reported panics. Let's wait for Hidetoshi-san's analysis then.

Regards,
Willy

^ permalink raw reply	[flat|nested] 58+ messages in thread
* Re: [stable] 2.6.32.21 - uptime related crashes?
  2011-04-30 16:49 ` Willy Tarreau
@ 2011-04-30 18:14   ` Henrique de Moraes Holschuh
  0 siblings, 0 replies; 58+ messages in thread
From: Henrique de Moraes Holschuh @ 2011-04-30 18:14 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Randy Dunlap, Greg KH, Nikola Ciprich, seto.hidetoshi, Hervé Commowick, linux-kernel mlist, linux-stable mlist

On Sat, 30 Apr 2011, Willy Tarreau wrote:
> So that would mean that wrapping jiffies should be unrelated to the
> reported panics. Let's wait for Hidetoshi-san's analysis then.

Err, it actually could be a problem when it wraps twice, OR it could be
related to conditions that didn't happen yet right after boot. But
that's less likely, of course.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

^ permalink raw reply	[flat|nested] 58+ messages in thread
* Re: 2.6.32.21 - uptime related crashes? 2011-04-30 9:36 ` Willy Tarreau 2011-04-30 11:22 ` Henrique de Moraes Holschuh 2011-04-30 12:02 ` Nikola Ciprich @ 2011-04-30 17:39 ` Faidon Liambotis 2011-04-30 20:14 ` Willy Tarreau 2011-06-28 2:25 ` john stultz 2 siblings, 2 replies; 58+ messages in thread From: Faidon Liambotis @ 2011-04-30 17:39 UTC (permalink / raw) To: linux-kernel, stable Cc: Nikola Ciprich, seto.hidetoshi, Hervé Commowick, Willy Tarreau, Randy Dunlap, Greg KH, Ben Hutchings, Apollon Oikonomopoulos If I may add some, hopefully useful, information to the thread: At work we have an HP c7000 blade enclosure. It's populated with 8 ProLiant BL460c G1 (Xeon E5405, constant_tsc but not nonstop_tsc) and 4 ProLiant BL460c G6 (Xeon E5504, constant_tsc + nonstop_tsc). All were booted at the same day with Debian's 2.6.32-23~bpo50+1 kernel, i.e. upstream 2.6.32.21. We too experienced problems with just the G6 blades at near 215 days uptime (on the 19th of April), all at the same time. From our investigation, it seems that their cpu_clocks jumped suddenly far in the future and then almost immediately rolled over due to wrapping around 64-bits. Although all of their (G6s) clocks wrapped around *at the same time*, only one of them actually crashed at the time, with a second one crashing just a few days later, on the 28th. Three of them had the following on their logs: Apr 18 20:56:07 hn-05 kernel: [17966378.581971] tap0: no IPv6 routers present Apr 19 10:15:42 hn-05 kernel: [18446743935.365550] BUG: soft lockup - CPU#4 stuck for 17163091968s! [kvm:25913] [...] Apr 19 10:15:42 hn-05 kernel: [18446743935.447275] [<ffffffff81010b42>] ? system_call_fastpath+0x16/0x1b Apr 19 10:18:32 hn-05 kernel: [ 31.587025] bond0.13: received packet with own address as source address (full oops at the end of this mail) Note that the 17163091968s time stuck was *the same* (+/- 1s) in *all three of them*. 
What's also very strange, although we're not very sure if it's related, is that when we took the first line of the above log entry and subtracted 17966378.581971 from Apr 18 20:56:07, this resulted in a boot time that differed several hours from the actual boot time (the latter being verified both with syslog and /proc/stat's btime, which were both in agreement). This was verified post-mortem too, with the date checked to be correct. IOW, we have some serious clock drift (calculated at runtime to be ~0.1s/min) on these machines that hasn't been made apparent, probably since they all run NTP. Moreover, we also saw that drift on other machines (1U, different vendor, different data center, E5520 CPUs) but not with the G1 blades. Our investigation showed that the drift is there constantly (it's not a one-time event) but we're not really sure if it's related to the Apr 18th jump event. Note that the drift is there even if we boot with "clocksource=hpet" but disappears when booted with "notsc". Also note that we've verified with 2.6.38 and the drift is still there. Regards, Faidon 1: The full backtrace is: Apr 18 20:56:07 hn-05 kernel: [17966378.581971] tap0: no IPv6 routers present Apr 19 06:25:02 hn-05 kernel: imklog 3.18.6, log source = /proc/kmsg started. Apr 19 10:15:42 hn-05 kernel: [18446743935.365550] BUG: soft lockup - CPU#4 stuck for 17163091968s! 
[kvm:25913] Apr 19 10:15:42 hn-05 kernel: [18446743935.447056] Modules linked in: xt_pkttype ext2 tun kvm_intel kvm nf_conntrack_ipv6 ip6table_filter ip6_tables xt_tcpudp nf_conntrack_ipv4 nf_ defrag_ipv4 xt_state nf_conntrack iptable_filter ip_tables x_tables 8021q garp bridge stp bonding dm_round_robin dm_multipath scsi_dh ipmi_poweroff ipmi_devintf loop snd_pcm snd_ti mer snd soundcore bnx2x psmouse snd_page_alloc serio_raw sd_mod crc_t10dif hpilo ipmi_si ipmi_msghandler container evdev crc32c pcspkr shpchp pci_hotplug libcrc32c power_meter mdio button processor ext3 jbd mbcache dm_mirror dm_region_hash dm_log dm_snapshot dm_mod usbhid hid cciss ehci_hcd uhci_hcd qla2xxx scsi_transport_fc usbcore nls_base tg3 libphy scsi_ tgt scsi_mod thermal fan thermal_sys [last unloaded: scsi_wait_scan] Apr 19 10:15:42 hn-05 kernel: [18446743935.447111] CPU 4: Apr 19 10:15:42 hn-05 kernel: [18446743935.447112] Modules linked in: xt_pkttype ext2 tun kvm_intel kvm nf_conntrack_ipv6 ip6table_filter ip6_tables xt_tcpudp nf_conntrack_ipv4 nf_ defrag_ipv4 xt_state nf_conntrack iptable_filter ip_tables x_tables 8021q garp bridge stp bonding dm_round_robin dm_multipath scsi_dh ipmi_poweroff ipmi_devintf loop snd_pcm snd_timer snd soundcore bnx2x psmouse snd_page_alloc serio_raw sd_mod crc_t10dif hpilo ipmi_si ipmi_msghandler container evdev crc32c pcspkr shpchp pci_hotplug libcrc32c power_meter mdio button processor ext3 jbd mbcache dm_mirror dm_region_hash dm_log dm_snapshot dm_mod usbhid hid cciss ehci_hcd uhci_hcd qla2xxx scsi_transport_fc usbcore nls_base tg3 libphy scsi_tgt scsi_mod thermal fan thermal_sys [last unloaded: scsi_wait_scan] Apr 19 10:15:42 hn-05 kernel: [18446743935.447154] Pid: 25913, comm: kvm Not tainted 2.6.32-bpo.5-amd64 #1 ProLiant BL460c G6 Apr 19 10:15:42 hn-05 kernel: [18446743935.447157] RIP: 0010:[<ffffffffa02e209a>] [<ffffffffa02e209a>] kvm_arch_vcpu_ioctl_run+0x785/0xa44 [kvm] Apr 19 10:15:42 hn-05 kernel: [18446743935.447177] RSP: 
0018:ffff88070065fd38 EFLAGS: 00000202 Apr 19 10:15:42 hn-05 kernel: [18446743935.447179] RAX: ffff88070065ffd8 RBX: ffff8804154ba860 RCX: ffff8804154ba860 Apr 19 10:15:42 hn-05 kernel: [18446743935.447182] RDX: ffff8804154ba8b9 RSI: ffff81004c818b10 RDI: ffff8100291a7d48 Apr 19 10:15:42 hn-05 kernel: [18446743935.447184] RBP: ffffffff8101166e R08: 0000000000000000 R09: 0000000000000000 Apr 19 10:15:42 hn-05 kernel: [18446743935.447187] R10: 00007f1b2a2af078 R11: ffffffff802f3405 R12: 0000000000000001 Apr 19 10:15:42 hn-05 kernel: [18446743935.447189] R13: 00000000154ba8b8 R14: ffff8804136ac000 R15: ffff8804154bbd48 Apr 19 10:15:42 hn-05 kernel: [18446743935.447192] FS: 0000000042cbb950(0000) GS:ffff88042e440000(0000) knlGS:0000000000000000 Apr 19 10:15:42 hn-05 kernel: [18446743935.447195] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033 Apr 19 10:15:42 hn-05 kernel: [18446743935.447197] CR2: 0000000000000008 CR3: 0000000260c41000 CR4: 00000000000026e0 Apr 19 10:15:42 hn-05 kernel: [18446743935.447200] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Apr 19 10:15:42 hn-05 kernel: [18446743935.447202] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Apr 19 10:15:42 hn-05 kernel: [18446743935.447205] Call Trace: Apr 19 10:15:42 hn-05 kernel: [18446743935.447215] [<ffffffffa02e2004>] ? kvm_arch_vcpu_ioctl_run+0x6ef/0xa44 [kvm] Apr 19 10:15:42 hn-05 kernel: [18446743935.447224] [<ffffffff8100f79c>] ? __switch_to+0x285/0x297 Apr 19 10:15:42 hn-05 kernel: [18446743935.447231] [<ffffffffa02d49d1>] ? kvm_vcpu_ioctl+0xf1/0x4e6 [kvm] Apr 19 10:15:42 hn-05 kernel: [18446743935.447237] [<ffffffff810240da>] ? lapic_next_event+0x18/0x1d Apr 19 10:15:42 hn-05 kernel: [18446743935.447245] [<ffffffff8106fb77>] ? tick_dev_program_event+0x2d/0x95 Apr 19 10:15:42 hn-05 kernel: [18446743935.447251] [<ffffffff81047e29>] ? finish_task_switch+0x3a/0xaf Apr 19 10:15:42 hn-05 kernel: [18446743935.447258] [<ffffffff810f9f5a>] ? 
vfs_ioctl+0x21/0x6c Apr 19 10:15:42 hn-05 kernel: [18446743935.447261] [<ffffffff810fa4a8>] ? do_vfs_ioctl+0x48d/0x4cb Apr 19 10:15:42 hn-05 kernel: [18446743935.447268] [<ffffffff81063fcd>] ? sys_timer_settime+0x233/0x283 Apr 19 10:15:42 hn-05 kernel: [18446743935.447272] [<ffffffff810fa537>] ? sys_ioctl+0x51/0x70 Apr 19 10:15:42 hn-05 kernel: [18446743935.447275] [<ffffffff81010b42>] ? system_call_fastpath+0x16/0x1b Apr 19 10:18:32 hn-05 kernel: [ 31.587025] bond0.13: received packet with own address as source address ^ permalink raw reply [flat|nested] 58+ messages in thread
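Faidon's ~0.1 s/min drift figure is worth expressing in the units NTP daemons use (a back-of-the-envelope sketch; the 500 ppm figure is ntpd's conventional maximum slew rate, not something measured on these boxes):

```python
# Express the observed drift (~0.1 s per minute) in parts per million.
drift_ppm = 0.1 / 60 * 1e6
print(round(drift_ppm))  # 1667 ppm

# ntpd will normally only slew out up to ~500 ppm of frequency error,
# so a clock this far off would have to be stepped rather than slewed.
print(drift_ppm > 500)  # True
```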
* Re: 2.6.32.21 - uptime related crashes? 2011-04-30 17:39 ` Faidon Liambotis @ 2011-04-30 20:14 ` Willy Tarreau 2011-05-14 19:04 ` Nikola Ciprich 2011-06-28 2:25 ` john stultz 1 sibling, 1 reply; 58+ messages in thread From: Willy Tarreau @ 2011-04-30 20:14 UTC (permalink / raw) To: Faidon Liambotis Cc: linux-kernel, stable, Nikola Ciprich, seto.hidetoshi, Hervé Commowick, Randy Dunlap, Greg KH, Ben Hutchings, Apollon Oikonomopoulos Hi Faidon, On Sat, Apr 30, 2011 at 08:39:05PM +0300, Faidon Liambotis wrote: > If I may add some, hopefully useful, information to the thread: > > At work we have an HP c7000 blade enclosure. It's populated with 8 ProLiant > BL460c G1 (Xeon E5405, constant_tsc but not nonstop_tsc) and 4 ProLiant > BL460c > G6 (Xeon E5504, constant_tsc + nonstop_tsc). All were booted at the same day > with Debian's 2.6.32-23~bpo50+1 kernel, i.e. upstream 2.6.32.21. > > We too experienced problems with just the G6 blades at near 215 days > uptime (on the 19th of April), all at the same time. From our > investigation, it seems that their cpu_clocks jumped suddenly far in the > future and then almost immediately rolled over due to wrapping around > 64-bits. > > Although all of their (G6s) clocks wrapped around *at the same time*, only > one > of them actually crashed at the time, with a second one crashing just a few > days later, on the 28th. > > Three of them had the following on their logs: > Apr 18 20:56:07 hn-05 kernel: [17966378.581971] tap0: no IPv6 routers > present > Apr 19 10:15:42 hn-05 kernel: [18446743935.365550] BUG: soft lockup - CPU#4 > stuck for 17163091968s! [kvm:25913] > [...] > Apr 19 10:15:42 hn-05 kernel: [18446743935.447275] [<ffffffff81010b42>] ? > system_call_fastpath+0x16/0x1b > Apr 19 10:18:32 hn-05 kernel: [ 31.587025] bond0.13: received packet with > own address as source address > (full oops at the end of this mail) > > Note that the 17163091968s time stuck was *the same* (+/- 1s) in *all > three of them*. 
> > What's also very strange, although we're not very sure if related, is > that when we took the first line of the above log entry and substracted > 17966378.581971 from Apr 18 20:56:07, this resulted in a boot-time that > differed several hours from the actual boot time (the latter being > verified both with syslog and /proc/bstat's btime, which were both in > agreement). This was verified post-mortem too, with the date checked to > be correct. Well, your report is by far the most complete of our 3 since you managed to capture this trace of the wrapping time. I don't know if it means anything to anyone, but 17163091968 is exactly 0x3FF000000, or 1023*2^24. Too round to be a coincidence. This number of seconds corresponds to 1000 * 2^32 jiffies at 250 Hz. > IOW, we have some serious clock drift (calcuated it at runtime to > ~0.1s/min) on these machines that hasn't been made apparent probably > since they all run NTP. Moreover, we also saw that drift on other > machines (1U, different vendor, different data center, E5520 CPUs) but > not with the G1 blades. Our investigation showed that the drift is there > constantly (it's not a one-time event) but we're not really sure if it's > related with the Apr 18th jump event. Note that the drift is there even > if we boot with "clocksource=hpet" but disappears when booted with > "notsc". Also note that we've verified with 2.6.38 and the drift is > still there. > > Regards, > Faidon > > 1: The full backtrace is: > Apr 18 20:56:07 hn-05 kernel: [17966378.581971] tap0: no IPv6 routers > present > Apr 19 06:25:02 hn-05 kernel: imklog 3.18.6, log source = /proc/kmsg > started. > Apr 19 10:15:42 hn-05 kernel: [18446743935.365550] BUG: soft lockup - CPU#4 > stuck for 17163091968s! 
[kvm:25913] > Apr 19 10:15:42 hn-05 kernel: [18446743935.447056] Modules linked in: > xt_pkttype ext2 tun kvm_intel kvm nf_conntrack_ipv6 ip6table_filter > ip6_tables xt_tcpudp nf_conntrack_ipv4 nf_ > defrag_ipv4 xt_state nf_conntrack iptable_filter ip_tables x_tables 8021q > garp bridge stp bonding dm_round_robin dm_multipath scsi_dh ipmi_poweroff > ipmi_devintf loop snd_pcm snd_ti > mer snd soundcore bnx2x psmouse snd_page_alloc serio_raw sd_mod crc_t10dif > hpilo ipmi_si ipmi_msghandler container evdev crc32c pcspkr shpchp > pci_hotplug libcrc32c power_meter mdio > button processor ext3 jbd mbcache dm_mirror dm_region_hash dm_log > dm_snapshot dm_mod usbhid hid cciss ehci_hcd uhci_hcd qla2xxx > scsi_transport_fc usbcore nls_base tg3 libphy scsi_ > tgt scsi_mod thermal fan thermal_sys [last unloaded: scsi_wait_scan] > Apr 19 10:15:42 hn-05 kernel: [18446743935.447111] CPU 4: > Apr 19 10:15:42 hn-05 kernel: [18446743935.447112] Modules linked in: > xt_pkttype ext2 tun kvm_intel kvm nf_conntrack_ipv6 ip6table_filter > ip6_tables xt_tcpudp nf_conntrack_ipv4 nf_ > defrag_ipv4 xt_state nf_conntrack iptable_filter ip_tables x_tables 8021q > garp bridge stp bonding dm_round_robin dm_multipath scsi_dh ipmi_poweroff > ipmi_devintf loop snd_pcm snd_timer snd soundcore bnx2x psmouse > snd_page_alloc serio_raw sd_mod crc_t10dif hpilo ipmi_si ipmi_msghandler > container evdev crc32c pcspkr shpchp pci_hotplug libcrc32c power_meter mdio > button processor ext3 jbd mbcache dm_mirror dm_region_hash dm_log > dm_snapshot dm_mod usbhid hid cciss ehci_hcd uhci_hcd qla2xxx > scsi_transport_fc usbcore nls_base tg3 libphy scsi_tgt scsi_mod thermal fan > thermal_sys [last unloaded: scsi_wait_scan] > Apr 19 10:15:42 hn-05 kernel: [18446743935.447154] Pid: 25913, comm: kvm > Not tainted 2.6.32-bpo.5-amd64 #1 ProLiant BL460c G6 > Apr 19 10:15:42 hn-05 kernel: [18446743935.447157] RIP: > 0010:[<ffffffffa02e209a>] [<ffffffffa02e209a>] > kvm_arch_vcpu_ioctl_run+0x785/0xa44 [kvm] > Apr 19 
10:15:42 hn-05 kernel: [18446743935.447177] RSP: > 0018:ffff88070065fd38 EFLAGS: 00000202 > Apr 19 10:15:42 hn-05 kernel: [18446743935.447179] RAX: ffff88070065ffd8 > RBX: ffff8804154ba860 RCX: ffff8804154ba860 > Apr 19 10:15:42 hn-05 kernel: [18446743935.447182] RDX: ffff8804154ba8b9 > RSI: ffff81004c818b10 RDI: ffff8100291a7d48 > Apr 19 10:15:42 hn-05 kernel: [18446743935.447184] RBP: ffffffff8101166e > R08: 0000000000000000 R09: 0000000000000000 > Apr 19 10:15:42 hn-05 kernel: [18446743935.447187] R10: 00007f1b2a2af078 > R11: ffffffff802f3405 R12: 0000000000000001 > Apr 19 10:15:42 hn-05 kernel: [18446743935.447189] R13: 00000000154ba8b8 > R14: ffff8804136ac000 R15: ffff8804154bbd48 > Apr 19 10:15:42 hn-05 kernel: [18446743935.447192] FS: > 0000000042cbb950(0000) GS:ffff88042e440000(0000) knlGS:0000000000000000 > Apr 19 10:15:42 hn-05 kernel: [18446743935.447195] CS: 0010 DS: 002b ES: > 002b CR0: 0000000080050033 > Apr 19 10:15:42 hn-05 kernel: [18446743935.447197] CR2: 0000000000000008 > CR3: 0000000260c41000 CR4: 00000000000026e0 > Apr 19 10:15:42 hn-05 kernel: [18446743935.447200] DR0: 0000000000000000 > DR1: 0000000000000000 DR2: 0000000000000000 > Apr 19 10:15:42 hn-05 kernel: [18446743935.447202] DR3: 0000000000000000 > DR6: 00000000ffff0ff0 DR7: 0000000000000400 > Apr 19 10:15:42 hn-05 kernel: [18446743935.447205] Call Trace: > Apr 19 10:15:42 hn-05 kernel: [18446743935.447215] [<ffffffffa02e2004>] ? > kvm_arch_vcpu_ioctl_run+0x6ef/0xa44 [kvm] > Apr 19 10:15:42 hn-05 kernel: [18446743935.447224] [<ffffffff8100f79c>] ? > __switch_to+0x285/0x297 > Apr 19 10:15:42 hn-05 kernel: [18446743935.447231] [<ffffffffa02d49d1>] ? > kvm_vcpu_ioctl+0xf1/0x4e6 [kvm] > Apr 19 10:15:42 hn-05 kernel: [18446743935.447237] [<ffffffff810240da>] ? > lapic_next_event+0x18/0x1d > Apr 19 10:15:42 hn-05 kernel: [18446743935.447245] [<ffffffff8106fb77>] ? > tick_dev_program_event+0x2d/0x95 > Apr 19 10:15:42 hn-05 kernel: [18446743935.447251] [<ffffffff81047e29>] ? 
> finish_task_switch+0x3a/0xaf > Apr 19 10:15:42 hn-05 kernel: [18446743935.447258] [<ffffffff810f9f5a>] ? > vfs_ioctl+0x21/0x6c > Apr 19 10:15:42 hn-05 kernel: [18446743935.447261] [<ffffffff810fa4a8>] ? > do_vfs_ioctl+0x48d/0x4cb > Apr 19 10:15:42 hn-05 kernel: [18446743935.447268] [<ffffffff81063fcd>] ? > sys_timer_settime+0x233/0x283 > Apr 19 10:15:42 hn-05 kernel: [18446743935.447272] [<ffffffff810fa537>] ? > sys_ioctl+0x51/0x70 > Apr 19 10:15:42 hn-05 kernel: [18446743935.447275] [<ffffffff81010b42>] ? > system_call_fastpath+0x16/0x1b > Apr 19 10:18:32 hn-05 kernel: [ 31.587025] bond0.13: received packet with > own address as source address Regards, Willy ^ permalink raw reply [flat|nested] 58+ messages in thread
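One mechanism consistent with a cpu_clock that jumps far into the future at roughly this uptime is a 64-bit multiply overflow in the x86 TSC sched_clock path, which converts cycles to nanoseconds as ns = (cycles * cyc2ns_scale) >> 10 with cyc2ns_scale = (10^6 << 10) / tsc_khz. A sketch of the arithmetic, using a hypothetical 2.4 GHz TSC (the exact frequency of the affected Xeons may differ):

```python
# When does cycles * scale first exceed 2^64 in the cyc2ns conversion?
# tsc_khz = 2_400_000 (2.4 GHz) is a hypothetical example frequency.
CYC2NS_SCALE_FACTOR = 10
tsc_khz = 2_400_000
scale = (1_000_000 << CYC2NS_SCALE_FACTOR) // tsc_khz  # ns per cycle, pre-shifted

overflow_cycles = 2**64 // scale                 # first product too big for 64 bits
overflow_days = overflow_cycles / (tsc_khz * 1000) / 86400
print(round(overflow_days, 1))  # ~208.8 days, close to the ~215-day uptimes reported
```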
* Re: 2.6.32.21 - uptime related crashes? 2011-04-30 20:14 ` Willy Tarreau @ 2011-05-14 19:04 ` Nikola Ciprich 2011-05-14 20:45 ` Willy Tarreau 2011-05-15 22:56 ` Faidon Liambotis 0 siblings, 2 replies; 58+ messages in thread From: Nikola Ciprich @ 2011-05-14 19:04 UTC (permalink / raw) To: Willy Tarreau Cc: Faidon Liambotis, linux-kernel, stable, seto.hidetoshi, Hervé Commowick, Randy Dunlap, Greg KH, Ben Hutchings, Apollon Oikonomopoulos, chronidev [-- Attachment #1: Type: text/plain, Size: 10222 bytes --] Hello gentlemen, Nicolas, thanks for the further report; it contradicts my theory that the problem appeared somewhere around 2.6.32.16. Now I think I know why several of my other machines running 2.6.32.x for a long time didn't crash: I checked the bugzilla entry for (I believe) the same problem here: https://bugzilla.kernel.org/show_bug.cgi?id=16991 and Peter Zijlstra asked there whether the reporters' systems were running some RT tasks. Then I realised that all four of my crashed boxes were pacemaker/corosync clusters, and pacemaker uses lots of RT priority tasks. So I believe this is important, and might be the reason why the other machines seem to be running rock solid - they are not running any RT tasks. It also might help with hunting this bug. Is anybody of you also running RT priority tasks on the afflicted systems, or did the problem also occur without them? Cheers! n. PS: Hidetoshi-san - btw, (late) thanks for your reply and confirmation that your patch should not be the cause of this problem. I'm now sure it must have emerged sooner, and I'm sorry for the false accusation :) On Sat, Apr 30, 2011 at 10:14:36PM +0200, Willy Tarreau wrote: > Hi Faidon, > > On Sat, Apr 30, 2011 at 08:39:05PM +0300, Faidon Liambotis wrote: > > If I may add some, hopefully useful, information to the thread: > > > > At work we have an HP c7000 blade enclosure. 
It's populated with 8 ProLiant > > BL460c G1 (Xeon E5405, constant_tsc but not nonstop_tsc) and 4 ProLiant > > BL460c > > G6 (Xeon E5504, constant_tsc + nonstop_tsc). All were booted at the same day > > with Debian's 2.6.32-23~bpo50+1 kernel, i.e. upstream 2.6.32.21. > > > > We too experienced problems with just the G6 blades at near 215 days > > uptime (on the 19th of April), all at the same time. From our > > investigation, it seems that their cpu_clocks jumped suddenly far in the > > future and then almost immediately rolled over due to wrapping around > > 64-bits. > > > > Although all of their (G6s) clocks wrapped around *at the same time*, only > > one > > of them actually crashed at the time, with a second one crashing just a few > > days later, on the 28th. > > > > Three of them had the following on their logs: > > Apr 18 20:56:07 hn-05 kernel: [17966378.581971] tap0: no IPv6 routers > > present > > Apr 19 10:15:42 hn-05 kernel: [18446743935.365550] BUG: soft lockup - CPU#4 > > stuck for 17163091968s! [kvm:25913] > > [...] > > Apr 19 10:15:42 hn-05 kernel: [18446743935.447275] [<ffffffff81010b42>] ? > > system_call_fastpath+0x16/0x1b > > Apr 19 10:18:32 hn-05 kernel: [ 31.587025] bond0.13: received packet with > > own address as source address > > (full oops at the end of this mail) > > > > Note that the 17163091968s time stuck was *the same* (+/- 1s) in *all > > three of them*. > > > > What's also very strange, although we're not very sure if related, is > > that when we took the first line of the above log entry and substracted > > 17966378.581971 from Apr 18 20:56:07, this resulted in a boot-time that > > differed several hours from the actual boot time (the latter being > > verified both with syslog and /proc/bstat's btime, which were both in > > agreement). This was verified post-mortem too, with the date checked to > > be correct. > > Well, your report is by far the most complete of our 3 since you managed > to get this trace of wraping time. 
> > I don't know if it means anything to anyone, but 17163091968 is exactly > 0x3FF000000, or 1023*2^24. Too round to be a coincidence. This number > of seconds corresponds to 1000 * 2^32 jiffies at 250 Hz. > > > IOW, we have some serious clock drift (calcuated it at runtime to > > ~0.1s/min) on these machines that hasn't been made apparent probably > > since they all run NTP. Moreover, we also saw that drift on other > > machines (1U, different vendor, different data center, E5520 CPUs) but > > not with the G1 blades. Our investigation showed that the drift is there > > constantly (it's not a one-time event) but we're not really sure if it's > > related with the Apr 18th jump event. Note that the drift is there even > > if we boot with "clocksource=hpet" but disappears when booted with > > "notsc". Also note that we've verified with 2.6.38 and the drift is > > still there. > > > > Regards, > > Faidon > > > > 1: The full backtrace is: > > Apr 18 20:56:07 hn-05 kernel: [17966378.581971] tap0: no IPv6 routers > > present > > Apr 19 06:25:02 hn-05 kernel: imklog 3.18.6, log source = /proc/kmsg > > started. > > Apr 19 10:15:42 hn-05 kernel: [18446743935.365550] BUG: soft lockup - CPU#4 > > stuck for 17163091968s! 
[kvm:25913] > > Apr 19 10:15:42 hn-05 kernel: [18446743935.447056] Modules linked in: > > xt_pkttype ext2 tun kvm_intel kvm nf_conntrack_ipv6 ip6table_filter > > ip6_tables xt_tcpudp nf_conntrack_ipv4 nf_ > > defrag_ipv4 xt_state nf_conntrack iptable_filter ip_tables x_tables 8021q > > garp bridge stp bonding dm_round_robin dm_multipath scsi_dh ipmi_poweroff > > ipmi_devintf loop snd_pcm snd_ti > > mer snd soundcore bnx2x psmouse snd_page_alloc serio_raw sd_mod crc_t10dif > > hpilo ipmi_si ipmi_msghandler container evdev crc32c pcspkr shpchp > > pci_hotplug libcrc32c power_meter mdio > > button processor ext3 jbd mbcache dm_mirror dm_region_hash dm_log > > dm_snapshot dm_mod usbhid hid cciss ehci_hcd uhci_hcd qla2xxx > > scsi_transport_fc usbcore nls_base tg3 libphy scsi_ > > tgt scsi_mod thermal fan thermal_sys [last unloaded: scsi_wait_scan] > > Apr 19 10:15:42 hn-05 kernel: [18446743935.447111] CPU 4: > > Apr 19 10:15:42 hn-05 kernel: [18446743935.447112] Modules linked in: > > xt_pkttype ext2 tun kvm_intel kvm nf_conntrack_ipv6 ip6table_filter > > ip6_tables xt_tcpudp nf_conntrack_ipv4 nf_ > > defrag_ipv4 xt_state nf_conntrack iptable_filter ip_tables x_tables 8021q > > garp bridge stp bonding dm_round_robin dm_multipath scsi_dh ipmi_poweroff > > ipmi_devintf loop snd_pcm snd_timer snd soundcore bnx2x psmouse > > snd_page_alloc serio_raw sd_mod crc_t10dif hpilo ipmi_si ipmi_msghandler > > container evdev crc32c pcspkr shpchp pci_hotplug libcrc32c power_meter mdio > > button processor ext3 jbd mbcache dm_mirror dm_region_hash dm_log > > dm_snapshot dm_mod usbhid hid cciss ehci_hcd uhci_hcd qla2xxx > > scsi_transport_fc usbcore nls_base tg3 libphy scsi_tgt scsi_mod thermal fan > > thermal_sys [last unloaded: scsi_wait_scan] > > Apr 19 10:15:42 hn-05 kernel: [18446743935.447154] Pid: 25913, comm: kvm > > Not tainted 2.6.32-bpo.5-amd64 #1 ProLiant BL460c G6 > > Apr 19 10:15:42 hn-05 kernel: [18446743935.447157] RIP: > > 0010:[<ffffffffa02e209a>] 
[<ffffffffa02e209a>] > > kvm_arch_vcpu_ioctl_run+0x785/0xa44 [kvm] > > Apr 19 10:15:42 hn-05 kernel: [18446743935.447177] RSP: > > 0018:ffff88070065fd38 EFLAGS: 00000202 > > Apr 19 10:15:42 hn-05 kernel: [18446743935.447179] RAX: ffff88070065ffd8 > > RBX: ffff8804154ba860 RCX: ffff8804154ba860 > > Apr 19 10:15:42 hn-05 kernel: [18446743935.447182] RDX: ffff8804154ba8b9 > > RSI: ffff81004c818b10 RDI: ffff8100291a7d48 > > Apr 19 10:15:42 hn-05 kernel: [18446743935.447184] RBP: ffffffff8101166e > > R08: 0000000000000000 R09: 0000000000000000 > > Apr 19 10:15:42 hn-05 kernel: [18446743935.447187] R10: 00007f1b2a2af078 > > R11: ffffffff802f3405 R12: 0000000000000001 > > Apr 19 10:15:42 hn-05 kernel: [18446743935.447189] R13: 00000000154ba8b8 > > R14: ffff8804136ac000 R15: ffff8804154bbd48 > > Apr 19 10:15:42 hn-05 kernel: [18446743935.447192] FS: > > 0000000042cbb950(0000) GS:ffff88042e440000(0000) knlGS:0000000000000000 > > Apr 19 10:15:42 hn-05 kernel: [18446743935.447195] CS: 0010 DS: 002b ES: > > 002b CR0: 0000000080050033 > > Apr 19 10:15:42 hn-05 kernel: [18446743935.447197] CR2: 0000000000000008 > > CR3: 0000000260c41000 CR4: 00000000000026e0 > > Apr 19 10:15:42 hn-05 kernel: [18446743935.447200] DR0: 0000000000000000 > > DR1: 0000000000000000 DR2: 0000000000000000 > > Apr 19 10:15:42 hn-05 kernel: [18446743935.447202] DR3: 0000000000000000 > > DR6: 00000000ffff0ff0 DR7: 0000000000000400 > > Apr 19 10:15:42 hn-05 kernel: [18446743935.447205] Call Trace: > > Apr 19 10:15:42 hn-05 kernel: [18446743935.447215] [<ffffffffa02e2004>] ? > > kvm_arch_vcpu_ioctl_run+0x6ef/0xa44 [kvm] > > Apr 19 10:15:42 hn-05 kernel: [18446743935.447224] [<ffffffff8100f79c>] ? > > __switch_to+0x285/0x297 > > Apr 19 10:15:42 hn-05 kernel: [18446743935.447231] [<ffffffffa02d49d1>] ? > > kvm_vcpu_ioctl+0xf1/0x4e6 [kvm] > > Apr 19 10:15:42 hn-05 kernel: [18446743935.447237] [<ffffffff810240da>] ? 
> > lapic_next_event+0x18/0x1d > > Apr 19 10:15:42 hn-05 kernel: [18446743935.447245] [<ffffffff8106fb77>] ? > > tick_dev_program_event+0x2d/0x95 > > Apr 19 10:15:42 hn-05 kernel: [18446743935.447251] [<ffffffff81047e29>] ? > > finish_task_switch+0x3a/0xaf > > Apr 19 10:15:42 hn-05 kernel: [18446743935.447258] [<ffffffff810f9f5a>] ? > > vfs_ioctl+0x21/0x6c > > Apr 19 10:15:42 hn-05 kernel: [18446743935.447261] [<ffffffff810fa4a8>] ? > > do_vfs_ioctl+0x48d/0x4cb > > Apr 19 10:15:42 hn-05 kernel: [18446743935.447268] [<ffffffff81063fcd>] ? > > sys_timer_settime+0x233/0x283 > > Apr 19 10:15:42 hn-05 kernel: [18446743935.447272] [<ffffffff810fa537>] ? > > sys_ioctl+0x51/0x70 > > Apr 19 10:15:42 hn-05 kernel: [18446743935.447275] [<ffffffff81010b42>] ? > > system_call_fastpath+0x16/0x1b > > Apr 19 10:18:32 hn-05 kernel: [ 31.587025] bond0.13: received packet with > > own address as source address > > Regards, > Willy > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > -- ------------------------------------- Ing. Nikola CIPRICH LinuxBox.cz, s.r.o. 28. rijna 168, 709 01 Ostrava tel.: +420 596 603 142 fax: +420 596 621 273 mobil: +420 777 093 799 www.linuxbox.cz mobil servis: +420 737 238 656 email servis: servis@linuxbox.cz ------------------------------------- [-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 58+ messages in thread
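For anyone wanting to answer Nikola's question on their own boxes, real-time tasks can be listed by reading /proc directly (a sketch; per proc(5), field 40 of /proc/&lt;pid&gt;/stat is rt_priority, which is 0 for non-RT tasks):

```python
# Sketch: list tasks that currently have a nonzero rt_priority,
# by parsing /proc/<pid>/stat (field 40 is rt_priority, per proc(5)).
import os

def rt_tasks():
    tasks = []
    for pid in filter(str.isdigit, os.listdir('/proc')):
        try:
            with open(f'/proc/{pid}/stat') as f:
                data = f.read()
        except OSError:      # task exited while we were scanning
            continue
        # comm may itself contain spaces or parens; split after the last ')'.
        rest = data.rsplit(')', 1)[1].split()
        rt_priority = int(rest[37])   # field 40 overall; fields 3+ live in `rest`
        if rt_priority > 0:
            comm = data[data.index('(') + 1:data.rindex(')')]
            tasks.append((int(pid), comm, rt_priority))
    return tasks

print(rt_tasks())
```

On a typical box this picks up kernel threads such as the migration/N threads, plus anything (like corosync or multipathd) that sets SCHED_FIFO/SCHED_RR priorities.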
* Re: 2.6.32.21 - uptime related crashes? 2011-05-14 19:04 ` Nikola Ciprich @ 2011-05-14 20:45 ` Willy Tarreau 2011-05-14 20:59 ` Ben Hutchings 2011-05-14 23:13 ` Nicolas Carlier 2011-05-15 22:56 ` Faidon Liambotis 1 sibling, 2 replies; 58+ messages in thread From: Willy Tarreau @ 2011-05-14 20:45 UTC (permalink / raw) To: Nikola Ciprich Cc: Faidon Liambotis, linux-kernel, stable, seto.hidetoshi, Hervé Commowick, Randy Dunlap, Greg KH, Ben Hutchings, Apollon Oikonomopoulos, chronidev Hi, On Sat, May 14, 2011 at 09:04:23PM +0200, Nikola Ciprich wrote: > Hello gentlemans, > Nicolas, thanks for further report, it contradicts my theory that problem occured somewhere during 2.6.32.16. Well, I'd like to be sure what kernel we're talking about. Nicolas said "2.6.32.8 Debian Kernel", but I suspect it's "2.6.32-8something" instead. Nicolas, could you please report the exact version as indicated by "uname -a" ? > Now I think I know why several of my other machines running 2.6.32.x for long time didn't crashed: > > I checked bugzilla entry for (I believe the same) problem here: > https://bugzilla.kernel.org/show_bug.cgi?id=16991 > and Peter Zijlstra asked there, whether reporters systems were running some RT tasks. Then I realised that all of my four crashed boxes were pacemaker/corosync clusters and pacemaker uses lots of RT priority tasks. So I believe this is important, and might be reason why other machines seem to be running rock solid - they are not running any RT tasks. > It also might help with hunting this bug. Is somebody of You also running some RT priority tasks on inflicted systems, or problem also occured without it? No, our customer who had two of these boxes crash at the same time was not running any RT task to the best of my knowledge. Cheers, Willy ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: 2.6.32.21 - uptime related crashes? 2011-05-14 20:45 ` Willy Tarreau @ 2011-05-14 20:59 ` Ben Hutchings 2011-05-14 23:13 ` Nicolas Carlier 1 sibling, 0 replies; 58+ messages in thread From: Ben Hutchings @ 2011-05-14 20:59 UTC (permalink / raw) To: Willy Tarreau Cc: Nikola Ciprich, Faidon Liambotis, linux-kernel, stable, seto.hidetoshi, Hervé Commowick, Randy Dunlap, Greg KH, Apollon Oikonomopoulos, chronidev [-- Attachment #1: Type: text/plain, Size: 703 bytes --] On Sat, 2011-05-14 at 22:45 +0200, Willy Tarreau wrote: > Hi, > > On Sat, May 14, 2011 at 09:04:23PM +0200, Nikola Ciprich wrote: > > Hello gentlemans, > > Nicolas, thanks for further report, it contradicts my theory that problem occured somewhere during 2.6.32.16. > > Well, I'd like to be sure what kernel we're talking about. Nicolas said > "2.6.32.8 Debian Kernel", but I suspect it's "2.6.32-8something" instead. > Nicolas, could you please report the exact version as indicated by "uname -a" ? [...] Actually you need to use 'cat /proc/version' (or dpkg) to get the full version. Ben. -- Ben Hutchings Once a job is fouled up, anything done to improve it makes it worse. [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 828 bytes --] ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: 2.6.32.21 - uptime related crashes? 2011-05-14 20:45 ` Willy Tarreau 2011-05-14 20:59 ` Ben Hutchings @ 2011-05-14 23:13 ` Nicolas Carlier 1 sibling, 0 replies; 58+ messages in thread From: Nicolas Carlier @ 2011-05-14 23:13 UTC (permalink / raw) To: Willy Tarreau Cc: Nikola Ciprich, Faidon Liambotis, linux-kernel, stable, seto.hidetoshi, Hervé Commowick, Randy Dunlap, Greg KH, Ben Hutchings, Apollon Oikonomopoulos Hi, On Sat, May 14, 2011 at 10:45 PM, Willy Tarreau <w@1wt.eu> wrote: > Hi, > > On Sat, May 14, 2011 at 09:04:23PM +0200, Nikola Ciprich wrote: >> Hello gentlemans, >> Nicolas, thanks for further report, it contradicts my theory that problem occured somewhere during 2.6.32.16. > > Well, I'd like to be sure what kernel we're talking about. Nicolas said > "2.6.32.8 Debian Kernel", but I suspect it's "2.6.32-8something" instead. > Nicolas, could you please report the exact version as indicated by "uname -a" ? Sorry, I can't provide more information on this version because I don't use it anymore; I can only correct myself: it was not a 2.6.32.8 kernel but a 2.6.32.7 backported Debian kernel, which had been recompiled. Because of this problem I took the opportunity to move to a 2.6.32.26 kernel; however, as there was nothing in the changelog or bugzilla about the resolution of this issue, we applied the patch from the bugzilla entry that revealed this problem: https://bugzilla.kernel.org/show_bug.cgi?id=16991#c17 > >> Now I think I know why several of my other machines running 2.6.32.x for long time didn't crashed: >> >> I checked bugzilla entry for (I believe the same) problem here: >> https://bugzilla.kernel.org/show_bug.cgi?id=16991 >> and Peter Zijlstra asked there, whether reporters systems were running some RT tasks. Then I realised that all of my four crashed boxes were pacemaker/corosync clusters and pacemaker uses lots of RT priority tasks. 
So I believe this is important, and might be reason why other machines seem to be running rock solid - they are not running any RT tasks. >> It also might help with hunting this bug. Is somebody of You also running some RT priority tasks on inflicted systems, or problem also occured without it? > > No, our customer who had two of these boxes crash at the same time was > not running any RT task to the best of my knowledge. > Regards, -- Nicolas Carlier ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: 2.6.32.21 - uptime related crashes? 2011-05-14 19:04 ` Nikola Ciprich 2011-05-14 20:45 ` Willy Tarreau @ 2011-05-15 22:56 ` Faidon Liambotis 2011-05-16 6:49 ` Apollon Oikonomopoulos 1 sibling, 1 reply; 58+ messages in thread From: Faidon Liambotis @ 2011-05-15 22:56 UTC (permalink / raw) To: Nikola Ciprich Cc: Willy Tarreau, linux-kernel, stable, seto.hidetoshi, Hervé Commowick, Randy Dunlap, Greg KH, Ben Hutchings, Apollon Oikonomopoulos, chronidev On Sat, May 14, 2011 at 09:04:23PM +0200, Nikola Ciprich wrote: > Nicolas, thanks for further report, it contradicts my theory that > problem occured somewhere during 2.6.32.16. Now I think I know why > several of my other machines running 2.6.32.x for long time didn't > crashed: > > I checked bugzilla entry for (I believe the same) problem here: > https://bugzilla.kernel.org/show_bug.cgi?id=16991 I don't think that bug is related; I, for one, haven't seen any backtrace that is similar to the above or that involves a divide by zero. > and Peter Zijlstra asked there, whether reporters systems were running > some RT tasks. Then I realised that all of my four crashed boxes were > pacemaker/corosync clusters and pacemaker uses lots of RT priority > tasks. So I believe this is important, and might be reason why other > machines seem to be running rock solid - they are not running any RT > tasks. It also might help with hunting this bug. Is somebody of You > also running some RT priority tasks on inflicted systems, or problem > also occured without it? No, no RT tasks here. The boxes in my case were just running a lot of kvm processes. Regards, Faidon ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: 2.6.32.21 - uptime related crashes? 2011-05-15 22:56 ` Faidon Liambotis @ 2011-05-16 6:49 ` Apollon Oikonomopoulos 0 siblings, 0 replies; 58+ messages in thread From: Apollon Oikonomopoulos @ 2011-05-16 6:49 UTC (permalink / raw) To: Faidon Liambotis Cc: Nikola Ciprich, Willy Tarreau, linux-kernel, stable, seto.hidetoshi, Hervé Commowick, Randy Dunlap, Greg KH, Ben Hutchings, chronidev Hello all, On 01:56 Mon 16 May , Faidon Liambotis wrote: > > and Peter Zijlstra asked there, whether reporters systems were running > > some RT tasks. Then I realised that all of my four crashed boxes were > > pacemaker/corosync clusters and pacemaker uses lots of RT priority > > tasks. So I believe this is important, and might be reason why other > > machines seem to be running rock solid - they are not running any RT > > tasks. It also might help with hunting this bug. Is somebody of You > > also running some RT priority tasks on inflicted systems, or problem > > also occured without it? > > No, no RT tasks here. The boxes in my case were just running a lot of > kvm processes. Actually we are running multipathd, which is an RT process and heavily loaded on these particular systems. Regards, Apollon ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: 2.6.32.21 - uptime related crashes? 2011-04-30 17:39 ` Faidon Liambotis 2011-04-30 20:14 ` Willy Tarreau @ 2011-06-28 2:25 ` john stultz 2011-06-28 5:17 ` Willy Tarreau 2011-07-06 6:15 ` Andrew Morton 1 sibling, 2 replies; 58+ messages in thread From: john stultz @ 2011-06-28 2:25 UTC (permalink / raw) To: Faidon Liambotis Cc: linux-kernel, stable, Nikola Ciprich, seto.hidetoshi, Hervé Commowick, Willy Tarreau, Randy Dunlap, Greg KH, Ben Hutchings, Apollon Oikonomopoulos On Sat, Apr 30, 2011 at 10:39 AM, Faidon Liambotis <paravoid@debian.org> wrote: > We too experienced problems with just the G6 blades at near 215 days uptime > (on the 19th of April), all at the same time. From our investigation, it > seems that their cpu_clocks jumped suddenly far in the future and then > almost immediately rolled over due to wrapping around 64-bits. > > Although all of their (G6s) clocks wrapped around *at the same time*, only > one > of them actually crashed at the time, with a second one crashing just a few > days later, on the 28th. > > Three of them had the following on their logs: > Apr 18 20:56:07 hn-05 kernel: [17966378.581971] tap0: no IPv6 routers > present > Apr 19 10:15:42 hn-05 kernel: [18446743935.365550] BUG: soft lockup - CPU#4 > stuck for 17163091968s! [kvm:25913] So, did this issue ever get any traction or get resolved? From the softlockup message, I suspect we hit a multiply overflow in the underlying sched_clock() implementation. Because the goal of sched_clock is to be very fast, lightweight and safe from locking issues (so it can be called anywhere) handling transient corner cases internally has been avoided as they would require costly locking and extra overhead. Because of this, sched_clock users should be cautious to be robust in the face of transient errors. Peter: I wonder if the soft lockup code should be using the (hopefully) more robust timekeeping code (ie: get_seconds) for its get_timestamp function? 
I'd worry that you might have issues catching cases where the system was locked up so the timekeeping accounting code didn't get to run, but you have the same problem in the jiffies based sched_clock code as well (since timekeeping increments jiffies in most cases). That said, I didn't see from any of the backtraces in this thread why the system actually crashed. The softlockup message on its own shouldn't do that, so I suspect there's still a related issue somewhere else here. thanks -john ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: 2.6.32.21 - uptime related crashes? 2011-06-28 2:25 ` john stultz @ 2011-06-28 5:17 ` Willy Tarreau 2011-06-28 6:19 ` Apollon Oikonomopoulos 2011-07-06 6:15 ` Andrew Morton 1 sibling, 1 reply; 58+ messages in thread From: Willy Tarreau @ 2011-06-28 5:17 UTC (permalink / raw) To: john stultz Cc: Faidon Liambotis, linux-kernel, stable, Nikola Ciprich, seto.hidetoshi, Hervé Commowick, Randy Dunlap, Greg KH, Ben Hutchings, Apollon Oikonomopoulos On Mon, Jun 27, 2011 at 07:25:31PM -0700, john stultz wrote: > On Sat, Apr 30, 2011 at 10:39 AM, Faidon Liambotis <paravoid@debian.org> wrote: > > We too experienced problems with just the G6 blades at near 215 days uptime > > (on the 19th of April), all at the same time. From our investigation, it > > seems that their cpu_clocks jumped suddenly far in the future and then > > almost immediately rolled over due to wrapping around 64-bits. > > > > Although all of their (G6s) clocks wrapped around *at the same time*, only > > one > > of them actually crashed at the time, with a second one crashing just a few > > days later, on the 28th. > > > > Three of them had the following on their logs: > > Apr 18 20:56:07 hn-05 kernel: [17966378.581971] tap0: no IPv6 routers > > present > > Apr 19 10:15:42 hn-05 kernel: [18446743935.365550] BUG: soft lockup - CPU#4 > > stuck for 17163091968s! [kvm:25913] > > So, did this issue ever get any traction or get resolved? I'm not aware of any news on the subject unfortunately. We asked our customer to reboot both machines one week apart so that in 6 months they don't crash at the same time :-/ (...) > That said, I didn't see from any of the backtraces in this thread why > the system actually crashed. The softlockup message on its own > shouldn't do that, so I suspect there's still a related issue > somewhere else here. One of the traces clearly showed that the kernel's uptime had wrapped or jumped, because the uptime suddenly jumped forwards to something like 2^32/HZ seconds IIRC. 
Thus it is possible that we have two bugs, one on the clock making it jump forwards and one somewhere else causing an overflow when the clock jumps too far. Regards, Willy ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: 2.6.32.21 - uptime related crashes? 2011-06-28 5:17 ` Willy Tarreau @ 2011-06-28 6:19 ` Apollon Oikonomopoulos 0 siblings, 0 replies; 58+ messages in thread From: Apollon Oikonomopoulos @ 2011-06-28 6:19 UTC (permalink / raw) To: Willy Tarreau Cc: john stultz, Faidon Liambotis, linux-kernel, stable, Nikola Ciprich, seto.hidetoshi, Hervé Commowick, Randy Dunlap, Greg KH, Ben Hutchings On 07:17 Tue 28 Jun , Willy Tarreau wrote: > On Mon, Jun 27, 2011 at 07:25:31PM -0700, john stultz wrote: > > That said, I didn't see from any of the backtraces in this thread why > > the system actually crashed. The softlockup message on its own > > shouldn't do that, so I suspect there's still a related issue > > somewhere else here. > > One of the traces clearly showed that the kernel's uptime had wrapped > or jumped, because the uptime suddenly jumped forwards to something > like 2^32/HZ seconds IIRC. > > Thus it is possible that we have two bugs, one on the clock making it > jump forwards and one somewhere else causing an overflow when the clock > jumps too far. Our last machine with wrapped time crashed 1 month ago, almost 1 month after the time wrap. One thing I noticed, was that although the machine seemed healthy apart from the time-wrap, there seemed to be random scheduling glitches, which were mostly visible as high ping times to the KVM guests running on the machine. Unfortunately I don't have any exact numbers, so I suppose the best I can do is describe what we saw. All scheduler statistics under /proc/sched_debug on the host seemed normal, however pinging a VM from outside would give random spikes in the order of hundreds of ms among the usual 1-2 ms times. Moving the VM to another host would restore sane ping times and any other VM moved to this host would exhibit the same behaviour. Ping times to the host itself from outside were stable. 
This was also accompanied by bad I/O performance in the KVM guests themselves and the strange effect that the total CPU time on the VM's munin graphs would add to less than 100% * #CPUs. Neither the host nor the guests were experiencing heavy load. As a side note, this was similar to the behaviour we had experienced once when some of multipathd's path checkers (which are RT tasks IIRC) had crashed, although this time restarting multipathd didn't help. Regards, Apollon ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: 2.6.32.21 - uptime related crashes? 2011-06-28 2:25 ` john stultz 2011-06-28 5:17 ` Willy Tarreau @ 2011-07-06 6:15 ` Andrew Morton 2011-07-12 1:18 ` MINOURA Makoto / 箕浦 真 1 sibling, 1 reply; 58+ messages in thread From: Andrew Morton @ 2011-07-06 6:15 UTC (permalink / raw) To: john stultz Cc: Faidon Liambotis, linux-kernel, stable, Nikola Ciprich, seto.hidetoshi, Hervé Commowick, Willy Tarreau, Randy Dunlap, Greg KH, Ben Hutchings, Apollon Oikonomopoulos On Mon, 27 Jun 2011 19:25:31 -0700 john stultz <johnstul@us.ibm.com> wrote: > On Sat, Apr 30, 2011 at 10:39 AM, Faidon Liambotis <paravoid@debian.org> wrote: > > We too experienced problems with just the G6 blades at near 215 days uptime > > (on the 19th of April), all at the same time. From our investigation, it > > seems that their cpu_clocks jumped suddenly far in the future and then > > almost immediately rolled over due to wrapping around 64-bits. > > > > Although all of their (G6s) clocks wrapped around *at the same time*, only > > one > > of them actually crashed at the time, with a second one crashing just a few > > days later, on the 28th. > > > > Three of them had the following on their logs: > > Apr 18 20:56:07 hn-05 kernel: [17966378.581971] tap0: no IPv6 routers > > present > > Apr 19 10:15:42 hn-05 kernel: [18446743935.365550] BUG: soft lockup - CPU#4 > > stuck for 17163091968s! [kvm:25913] > > So, did this issue ever get any traction or get resolved? > https://bugzilla.kernel.org/show_bug.cgi?id=37382 is similar - a divide-by-zero in update_sg_lb_stats() after 209 days uptime. Can we change this stuff so that the timers wrap after 10 minutes uptime, like INITIAL_JIFFIES? ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: 2.6.32.21 - uptime related crashes? 2011-07-06 6:15 ` Andrew Morton @ 2011-07-12 1:18 ` MINOURA Makoto / 箕浦 真 2011-07-12 1:40 ` john stultz 0 siblings, 1 reply; 58+ messages in thread From: MINOURA Makoto / 箕浦 真 @ 2011-07-12 1:18 UTC (permalink / raw) To: Andrew Morton Cc: john stultz, Faidon Liambotis, linux-kernel, stable, Nikola Ciprich, seto.hidetoshi, Hervé Commowick, Willy Tarreau, Rand We're experiencing similar but slightly different problems. Some KVM hosts crash after 210-220 days of uptime. Some of them hit divide-by-zero, but one of them shows: [671528.8780080] BUG: soft lockup - CPU#4 stuck for 61s! [kvm:11131] (sorry we have no full crash message including the backtrace) The host kernel is 2.6.32.11-based (ubuntu 2.6.32-22-server, 2.6.32-22.36). I'm not sure but probably the task scheduler is confused by the sched_clock overflow? Thanks, -- Minoura Makoto <minoura@valinux.co.jp> ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: 2.6.32.21 - uptime related crashes? 2011-07-12 1:18 ` MINOURA Makoto / 箕浦 真 @ 2011-07-12 1:40 ` john stultz 2011-07-12 2:49 ` MINOURA Makoto / 箕浦 真 0 siblings, 1 reply; 58+ messages in thread From: john stultz @ 2011-07-12 1:40 UTC (permalink / raw) To: MINOURA Makoto / 箕浦 真 Cc: Andrew Morton, Faidon Liambotis, linux-kernel, stable, Nikola Ciprich, seto.hidetoshi, Hervé Commowick, Willy Tarreau, Rand On Tue, 2011-07-12 at 10:18 +0900, MINOURA Makoto / 箕浦 真 wrote: > We're experiencing similar but slightly different > problems. Some KVM hosts crash after 210-220 uptime. > Some of them hits divide-by-zero, but one of them shows: > > [671528.8780080] BUG: soft lockup - CPU#4 stuck for 61s! [kvm:11131] > > (sorry we have no full crash message including the backtrace) > > The host kernel is 2.6.32.11-based (ubuntu 2.6.32-22-server, > 2.6.32-22.36). > > I'm not sure but probably the task scheduler is confusing by > the sched_clock overflow? I'm working on a debug patch that will hopefully trip sched_clock overflows very early to see if we can't shake these issues out. thanks -john ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: 2.6.32.21 - uptime related crashes? 2011-07-12 1:40 ` john stultz @ 2011-07-12 2:49 ` MINOURA Makoto / 箕浦 真 2011-07-12 4:19 ` Willy Tarreau 0 siblings, 1 reply; 58+ messages in thread From: MINOURA Makoto / 箕浦 真 @ 2011-07-12 2:49 UTC (permalink / raw) To: john stultz Cc: Andrew Morton, Faidon Liambotis, linux-kernel, stable, Nikola Ciprich, seto.hidetoshi, Hervé Commowick, Willy Tarreau, Rand |> In <1310434819.30337.21.camel@work-vm> |> john stultz <johnstul@us.ibm.com> wrote: > I'm working on a debug patch that will hopefully trip sched_clock > overflows very early to see if we can't shake these issues out. Thanks. I hope that'll help. -- Minoura Makoto <minoura@valinux.co.jp> ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: 2.6.32.21 - uptime related crashes? 2011-07-12 2:49 ` MINOURA Makoto / 箕浦 真 @ 2011-07-12 4:19 ` Willy Tarreau 2011-07-15 0:35 ` john stultz 0 siblings, 1 reply; 58+ messages in thread From: Willy Tarreau @ 2011-07-12 4:19 UTC (permalink / raw) To: MINOURA Makoto / 箕浦 真 Cc: john stultz, Andrew Morton, Faidon Liambotis, linux-kernel, stable, Nikola Ciprich, seto.hidetoshi, Hervé Commowick, Rand On Tue, Jul 12, 2011 at 11:49:57AM +0900, MINOURA Makoto / 箕浦 真 wrote: > > |> In <1310434819.30337.21.camel@work-vm> > |> john stultz <johnstul@us.ibm.com> wrote: > > > I'm working on a debug patch that will hopefully trip sched_clock > > overflows very early to see if we can't shake these issues out. > > Thanks. I hope that'll help. That certainly will. What currently makes this bug hard to fix is that it takes around 7 months to test a possible fix. We don't even know if more recent kernels are affected, it's possible that 2.6.32-stable is the only one that people don't reboot for an update in 7 months :-/ We need to make this bug more visible, so John's patch will be very welcome ! Regards, Willy ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: 2.6.32.21 - uptime related crashes? 2011-07-12 4:19 ` Willy Tarreau @ 2011-07-15 0:35 ` john stultz 2011-07-15 8:30 ` Peter Zijlstra 2011-07-15 10:01 ` Peter Zijlstra 0 siblings, 2 replies; 58+ messages in thread From: john stultz @ 2011-07-15 0:35 UTC (permalink / raw) To: Willy Tarreau, Peter Zijlstra, Ingo Molnar Cc: MINOURA Makoto / 箕浦 真, Andrew Morton, Faidon Liambotis, linux-kernel, stable, Nikola Ciprich, seto.hidetoshi, Hervé Commowick, Rand On Tue, 2011-07-12 at 06:19 +0200, Willy Tarreau wrote: > On Tue, Jul 12, 2011 at 11:49:57AM +0900, MINOURA Makoto / 箕浦 真 wrote: > > > > |> In <1310434819.30337.21.camel@work-vm> > > |> john stultz <johnstul@us.ibm.com> wrote: > > > > > I'm working on a debug patch that will hopefully trip sched_clock > > > overflows very early to see if we can't shake these issues out. > > > > Thanks. I hope that'll help. > > That certainly will. What currently makes this bug hard to fix is that > it takes around 7 months to test a possible fix. We don't even know if > more recent kernels are affected, it's possible that 2.6.32-stable is > the only one that people don't reboot for an update in 7 months :-/ > > We need to make this bug more visible, so John's patch will be very > welcome ! Ok. So this might be long. So I scratched out a hack that trips the multiplication overflows to happen after 5 minutes, but found that for my systems, it didn't cause any strange behavior at all. The reason being that on my system sched_clock_stable is not set, so there are extra corrective layers going on. This likely shows why this issue crops up on some boxes but not others. It's likely only x86 systems with both X86_FEATURE_TSC_RELIABLE and X86_FEATURE_CONSTANT_TSC are affected. Even forcing sched_clock_stable on, I still didn't see any crashes or oopses with mainline. So I then back ported to 2.6.31, and there I could only trigger softlockup watchdog warnings (but not crashes). 
None the less, adding further debug messages, I did see some odd behavior from sched_clock around the time the multiplication overflows happen. Earlier in this thread, one reporter had the following in their logs when they hit a softlockup warning: Apr 18 20:56:07 hn-05 kernel: [17966378.581971] tap0: no IPv6 routers present Apr 19 10:15:42 hn-05 kernel: [18446743935.365550] BUG: soft lockup - CPU#4 stuck for 17163091968s! [kvm:25913] ... Apr 19 10:18:32 hn-05 kernel: [ 31.587025] bond0.13: received packet with own address as source address This was a 2ghz box, so working backward, we know the cyc2ns scale value is 512, so the cyc2ns equation is (cyc*512)>>10 Well, 64bits divided by 512 is ~36028797018963967 cycles, which is ~18014398 seconds or ~208 days (which lines up closely to the 17966378 printk time above from 20:56:07 - the day before the overflow). So on this machine we should expect to see the sched_clock values get close to 18014398 seconds, and then drop down to a small number and grow again. Which happens as we see at 10:18:32 timestamp above. The oddball bit is why the large [18446743935.365550] printk time in-between when the softlockup occurred? Since we're unsigned and shifting down by 10, we shouldn't be seeing such large numbers! Well, I reproduced this as well periodically, and it ends up sched_clock is doing more than (cyc*512)>>10. Because the freq might change, there's also a normalization done by adding a cyc2ns_offset value. Well, at boot, this value is miscalculated on the first cpu, and we end up with a negative value in cyc2ns_offset. So what we're seeing is the overflow, which is expected, but then the subtraction rolls the value under and back into a very large number. So yea, the multiplication overflow is probably hard enough for sched_clock users to handle, however the extra spike caused by the subtraction makes it even more complicated. 
You can find my current forced-overflow patch and my cyc2ns_offset initialization fix here (Caution, I may re-base these branches!): http://git.linaro.org/gitweb?p=people/jstultz/linux.git;a=shortlog;h=refs/heads/dev/uptime-crash However, as I mentioned in the cyc2ns_offset fix, there's still the chance for the same thing to occur in other cases where the cpufreq does change. And this suggests we really need a deeper fix here. One thought I've had is to rework the sched_clock implementation to be a bit more like the timekeeping code. However, this requires keeping track of a bit more data, which then requires fancy atomic updates to structures using rcu (since we can't do locking with sched_clock), and probably has some extra overhead. That said, it avoids having to do the funky offset based normalization and if we add a very rare periodic timer to do accumulation, we can avoid the multiplication overflow all together. I've put together a first pitch at this here (boots, but not robustly tested): http://git.linaro.org/gitweb?p=people/jstultz/linux.git;a=shortlog;h=refs/heads/dev/sched_clock-rework Peter/Ingo: Can you take a look at the above and let me know if you find it too disagreeable? thanks -john ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: 2.6.32.21 - uptime related crashes? 2011-07-15 0:35 ` john stultz @ 2011-07-15 8:30 ` Peter Zijlstra 2011-07-15 10:02 ` Peter Zijlstra 2011-07-15 10:01 ` Peter Zijlstra 1 sibling, 1 reply; 58+ messages in thread From: Peter Zijlstra @ 2011-07-15 8:30 UTC (permalink / raw) To: john stultz Cc: Willy Tarreau, Ingo Molnar, MINOURA Makoto / 箕浦 真, Andrew Morton, Faidon Liambotis, linux-kernel, stable, Nikola Ciprich, seto.hidetoshi, Hervé Commowick, Rand On Thu, 2011-07-14 at 17:35 -0700, john stultz wrote: > > Peter/Ingo: Can you take a look at the above and let me know if you find > it too disagreeable? I'm not quite sure of the calling conditions of set_cyc2ns_scale(), but there's two sites in there that do:

+	local_irq_save(flags);

+	data = kmalloc(sizeof(*data), GFP_KERNEL);

+	local_irq_restore(flags);

Clearly that's not going to work well at all. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: 2.6.32.21 - uptime related crashes? 2011-07-15 8:30 ` Peter Zijlstra @ 2011-07-15 10:02 ` Peter Zijlstra 2011-07-15 18:03 ` john stultz 0 siblings, 1 reply; 58+ messages in thread From: Peter Zijlstra @ 2011-07-15 10:02 UTC (permalink / raw) To: john stultz Cc: Willy Tarreau, Ingo Molnar, MINOURA Makoto / 箕浦 真, Andrew Morton, Faidon Liambotis, linux-kernel, stable, Nikola Ciprich, seto.hidetoshi, Hervé Commowick, Rand On Fri, 2011-07-15 at 10:30 +0200, Peter Zijlstra wrote: > On Thu, 2011-07-14 at 17:35 -0700, john stultz wrote: > > > > Peter/Ingo: Can you take a look at the above and let me know if you find > > it too disagreeable? > > I'm not quite sure of the calling conditions of set_cyc2ns_scale(), but > there's two sites in there that do:

> +	local_irq_save(flags);
>
> +	data = kmalloc(sizeof(*data), GFP_KERNEL);
>
> +	local_irq_restore(flags);

> Clearly that's not going to work well at all. Furthermore there is a distinct lack of error handling there, it assumes the allocation always succeeds. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: 2.6.32.21 - uptime related crashes? 2011-07-15 10:02 ` Peter Zijlstra @ 2011-07-15 18:03 ` john stultz 0 siblings, 0 replies; 58+ messages in thread From: john stultz @ 2011-07-15 18:03 UTC (permalink / raw) To: Peter Zijlstra Cc: Willy Tarreau, Ingo Molnar, MINOURA Makoto / 箕浦 真, Andrew Morton, Faidon Liambotis, linux-kernel, stable, Nikola Ciprich, seto.hidetoshi, Hervé Commowick, Rand On Fri, 2011-07-15 at 12:02 +0200, Peter Zijlstra wrote: > On Fri, 2011-07-15 at 10:30 +0200, Peter Zijlstra wrote: > > On Thu, 2011-07-14 at 17:35 -0700, john stultz wrote: > > > > > > Peter/Ingo: Can you take a look at the above and let me know if you find > > > it too disagreeable? > > > > I'm not quite sure of the calling conditions of set_cyc2ns_scale(), but > > there's two sites in there that do:

> > +	local_irq_save(flags);
> >
> > +	data = kmalloc(sizeof(*data), GFP_KERNEL);
> >
> > +	local_irq_restore(flags);

> > Clearly that's not going to work well at all. > > Furthermore there is a distinct lack of error handling there, it assumes > the allocation always succeeds. Yes, both of those issues are embarrassing. Thanks for pointing them out. I'll take another swing at it. thanks -john ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: 2.6.32.21 - uptime related crashes? 2011-07-15 0:35 ` john stultz 2011-07-15 8:30 ` Peter Zijlstra @ 2011-07-15 10:01 ` Peter Zijlstra 2011-07-15 17:59 ` john stultz 1 sibling, 1 reply; 58+ messages in thread From: Peter Zijlstra @ 2011-07-15 10:01 UTC (permalink / raw) To: john stultz Cc: Willy Tarreau, Ingo Molnar, MINOURA Makoto / 箕浦 真, Andrew Morton, Faidon Liambotis, linux-kernel, stable, Nikola Ciprich, seto.hidetoshi, Hervé Commowick, Rand On Thu, 2011-07-14 at 17:35 -0700, john stultz wrote: > > Peter/Ingo: Can you take a look at the above and let me know if you find > it too disagreeable?

+static unsigned long long __cycles_2_ns(unsigned long long cyc)
+{
+	unsigned long long ns = 0;
+	struct x86_sched_clock_data *data;
+	int cpu = smp_processor_id();
+
+	rcu_read_lock();
+	data = rcu_dereference(per_cpu(cpu_sched_clock_data, cpu));
+
+	if (unlikely(!data))
+		goto out;
+
+	ns = ((cyc - data->base_cycles) * data->mult) >> CYC2NS_SCALE_FACTOR;
+	ns += data->accumulated_ns;
+out:
+	rcu_read_unlock();
+	return ns;
+}

The way I read that we're still not wrapping properly if freq scaling 'never' happens. Because then we're wrapping on accumulated_ns + 2^54. Something like resetting base, and adding ns to accumulated_ns and returning the latter would make more sense. ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: 2.6.32.21 - uptime related crashes? 2011-07-15 10:01 ` Peter Zijlstra @ 2011-07-15 17:59 ` john stultz 2011-07-21 7:22 ` Ingo Molnar 0 siblings, 1 reply; 58+ messages in thread From: john stultz @ 2011-07-15 17:59 UTC (permalink / raw) To: Peter Zijlstra Cc: Willy Tarreau, Ingo Molnar, MINOURA Makoto / 箕浦 真, Andrew Morton, Faidon Liambotis, linux-kernel, stable, Nikola Ciprich, seto.hidetoshi, Hervé Commowick, Rand On Fri, 2011-07-15 at 12:01 +0200, Peter Zijlstra wrote: > On Thu, 2011-07-14 at 17:35 -0700, john stultz wrote: > > > > Peter/Ingo: Can you take a look at the above and let me know if you find > > it too disagreeable?

> +static unsigned long long __cycles_2_ns(unsigned long long cyc)
> +{
> +	unsigned long long ns = 0;
> +	struct x86_sched_clock_data *data;
> +	int cpu = smp_processor_id();
> +
> +	rcu_read_lock();
> +	data = rcu_dereference(per_cpu(cpu_sched_clock_data, cpu));
> +
> +	if (unlikely(!data))
> +		goto out;
> +
> +	ns = ((cyc - data->base_cycles) * data->mult) >> CYC2NS_SCALE_FACTOR;
> +	ns += data->accumulated_ns;
> +out:
> +	rcu_read_unlock();
> +	return ns;
> +}

> > The way I read that we're still not wrapping properly if freq scaling > 'never' happens. Right, this doesn't address the mult overflow behavior. As I mentioned in the patch, the rework allows for solving that in the future using a (possibly very rare) timer that would accumulate cycles to ns. This rework just really addresses the multiplication overflow->negative roll under that currently occurs with the cyc2ns_offset value. > Because then we're wrapping on accumulated_ns + 2^54. > > Something like resetting base, and adding ns to accumulated_ns and > returning the latter would make more sense. Although we have to update the base_cycles and accumulated_ns atomically, so it's probably not something to do in the sched_clock path. thanks -john ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: 2.6.32.21 - uptime related crashes? 2011-07-15 17:59 ` john stultz @ 2011-07-21 7:22 ` Ingo Molnar 2011-07-21 12:24 ` Peter Zijlstra 2011-07-21 19:53 ` john stultz 0 siblings, 2 replies; 58+ messages in thread From: Ingo Molnar @ 2011-07-21 7:22 UTC (permalink / raw) To: john stultz Cc: Peter Zijlstra, Willy Tarreau, MINOURA Makoto / 箕浦 真, Andrew Morton, Faidon Liambotis, linux-kernel, stable, Nikola Ciprich, seto.hidetoshi, Hervé Commowick, Rand * john stultz <johnstul@us.ibm.com> wrote: > On Fri, 2011-07-15 at 12:01 +0200, Peter Zijlstra wrote: > > On Thu, 2011-07-14 at 17:35 -0700, john stultz wrote: > > > > > > Peter/Ingo: Can you take a look at the above and let me know if you find > > > it too disagreeable?

> > +static unsigned long long __cycles_2_ns(unsigned long long cyc)
> > +{
> > +	unsigned long long ns = 0;
> > +	struct x86_sched_clock_data *data;
> > +	int cpu = smp_processor_id();
> > +
> > +	rcu_read_lock();
> > +	data = rcu_dereference(per_cpu(cpu_sched_clock_data, cpu));
> > +
> > +	if (unlikely(!data))
> > +		goto out;
> > +
> > +	ns = ((cyc - data->base_cycles) * data->mult) >> CYC2NS_SCALE_FACTOR;
> > +	ns += data->accumulated_ns;
> > +out:
> > +	rcu_read_unlock();
> > +	return ns;
> > +}

> > The way I read that we're still not wrapping properly if freq scaling > > 'never' happens. > > Right, this doesn't address the mult overflow behavior. As I mentioned > in the patch that the rework allows for solving that in the future using > a (possibly very rare) timer that would accumulate cycles to ns. > > This rework just really addresses the multiplication overflow->negative > roll under that currently occurs with the cyc2ns_offset value. > > > Because then we're wrapping on accumulated_ns + 2^54. > > > > Something like resetting base, and adding ns to accumulated_ns and > > returning the latter would make more sense. 
> > Although we have to update the base_cycles and accumulated_ns > atomically, so its probably not something to do in the sched_clock path. Ping, what's going on with this bug? Systems are crashing so we need a quick fix ASAP ... Thanks, Ingo ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: 2.6.32.21 - uptime related crashes? 2011-07-21 7:22 ` Ingo Molnar @ 2011-07-21 12:24 ` Peter Zijlstra 2011-07-21 12:50 ` Nikola Ciprich 1 sibling, 1 reply; 58+ messages in thread From: Peter Zijlstra @ 2011-07-21 12:24 UTC (permalink / raw) To: Ingo Molnar Cc: john stultz, Willy Tarreau, MINOURA Makoto, Andrew Morton, Faidon Liambotis, linux-kernel, stable, Nikola Ciprich, seto.hidetoshi, Hervé Commowick, Rand On Thu, 2011-07-21 at 09:22 +0200, Ingo Molnar wrote: > > Ping, what's going on with this bug? Systems are crashing so we need > a quick fix ASAP ... Something as simple as the below ought to cure things for now. Once we get __cycles_2_ns() fixed up we can enable it again. (patch against -tip, .32 code is different but equally simple to fix)

---
Subject: x86, intel: Don't mark sched_clock() as stable

Because the x86 sched_clock() implementation wraps at 54 bits and the scheduler code assumes it wraps at the full 64bits we can get into trouble after 208 days (~7 months) of uptime.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 arch/x86/kernel/cpu/intel.c | 7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index ed6086e..c8dc48b 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -91,8 +91,15 @@ static void __cpuinit early_init_intel(struct cpuinfo_x86 *c)
 	if (c->x86_power & (1 << 8)) {
 		set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC);
 		set_cpu_cap(c, X86_FEATURE_NONSTOP_TSC);
+		/*
+		 * Unfortunately our __cycles_2_ns() implementation makes
+		 * the raw sched_clock() interface wrap at 54-bits, which
+		 * makes it unsuitable for direct use, so disable this
+		 * for now.
+		 *
 		if (!check_tsc_unstable())
 			sched_clock_stable = 1;
+		 */
 	}

 	/*

^ permalink raw reply related [flat|nested] 58+ messages in thread
* Re: 2.6.32.21 - uptime related crashes? 2011-07-21 12:24 ` Peter Zijlstra @ 2011-07-21 12:50 ` Nikola Ciprich 2011-07-21 12:53 ` Peter Zijlstra 0 siblings, 1 reply; 58+ messages in thread From: Nikola Ciprich @ 2011-07-21 12:50 UTC (permalink / raw) To: Peter Zijlstra Cc: Ingo Molnar, john stultz, Willy Tarreau, MINOURA Makoto, Andrew Morton, Faidon Liambotis, linux-kernel, stable, seto.hidetoshi, Hervé Commowick, Rand, Nikola Ciprich [-- Attachment #1: Type: text/plain, Size: 2130 bytes --] Hi, thanks for the patch! I'll put this on our testing boxes... Are You going to push this upstream so we can ask Greg to push this to -stable? or do You plan to wait for more complex patch? n. On Thu, Jul 21, 2011 at 02:24:58PM +0200, Peter Zijlstra wrote: > On Thu, 2011-07-21 at 09:22 +0200, Ingo Molnar wrote: > > > > Ping, what's going on with this bug? Systems are crashing so we need > > a quick fix ASAP ... > > Something as simple as the below ought to cure things for now. Once we > get __cycles_2_ns() fixed up we can enable it again. > > (patch against -tip, .32 code is different but equally simple to fix) > > --- > Subject: x86, intel: Don't mark sched_clock() as stable > > Because the x86 sched_clock() implementation wraps at 54 bits and the > scheduler code assumes it wraps at the full 64bits we can get into > trouble after 208 days (~7 months) of uptime. 
> > Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> > --- > arch/x86/kernel/cpu/intel.c | 7 +++++++ > 1 files changed, 7 insertions(+), 0 deletions(-) > > diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c > index ed6086e..c8dc48b 100644 > --- a/arch/x86/kernel/cpu/intel.c > +++ b/arch/x86/kernel/cpu/intel.c > @@ -91,8 +91,15 @@ static void __cpuinit early_init_intel(struct cpuinfo_x86 *c) > if (c->x86_power & (1 << 8)) { > set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC); > set_cpu_cap(c, X86_FEATURE_NONSTOP_TSC); > + /* > + * Unfortunately our __cycles_2_ns() implementation makes > + * the raw sched_clock() interface wrap at 54-bits, which > + * makes it unsuitable for direct use, so disable this > + * for now. > + * > if (!check_tsc_unstable()) > sched_clock_stable = 1; > + */ > } > > /* > > -- ------------------------------------- Ing. Nikola CIPRICH LinuxBox.cz, s.r.o. 28. rijna 168, 709 01 Ostrava tel.: +420 596 603 142 fax: +420 596 621 273 mobil: +420 777 093 799 www.linuxbox.cz mobil servis: +420 737 238 656 email servis: servis@linuxbox.cz ------------------------------------- [-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: 2.6.32.21 - uptime related crashes? 2011-07-21 12:50 ` Nikola Ciprich @ 2011-07-21 12:53 ` Peter Zijlstra 2011-07-21 18:45 ` Ingo Molnar 2011-07-21 19:25 ` Nikola Ciprich 0 siblings, 2 replies; 58+ messages in thread From: Peter Zijlstra @ 2011-07-21 12:53 UTC (permalink / raw) To: Nikola Ciprich Cc: Ingo Molnar, john stultz, Willy Tarreau, MINOURA Makoto, Andrew Morton, Faidon Liambotis, linux-kernel, stable, seto.hidetoshi, Hervé Commowick, Rand On Thu, 2011-07-21 at 14:50 +0200, Nikola Ciprich wrote: > thanks for the patch! I'll put this on our testing boxes... With a patch that frobs the starting value close to overflowing I hope, otherwise we'll not hear from you in like 7 months ;-) > Are You going to push this upstream so we can ask Greg to push this to > -stable? Yeah, I think we want to commit this with a -stable tag, Ingo? ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: 2.6.32.21 - uptime related crashes? 2011-07-21 12:53 ` Peter Zijlstra @ 2011-07-21 18:45 ` Ingo Molnar 2011-07-21 19:32 ` Nikola Ciprich 2011-08-25 18:56 ` Faidon Liambotis 2011-07-21 19:25 ` Nikola Ciprich 1 sibling, 2 replies; 58+ messages in thread From: Ingo Molnar @ 2011-07-21 18:45 UTC (permalink / raw) To: Peter Zijlstra Cc: Nikola Ciprich, john stultz, Willy Tarreau, MINOURA Makoto, Andrew Morton, Faidon Liambotis, linux-kernel, stable, seto.hidetoshi, Hervé Commowick, Rand * Peter Zijlstra <peterz@infradead.org> wrote: > On Thu, 2011-07-21 at 14:50 +0200, Nikola Ciprich wrote: > > thanks for the patch! I'll put this on our testing boxes... > > With a patch that frobs the starting value close to overflowing I hope, > otherwise we'll not hear from you in like 7 months ;-) > > > Are You going to push this upstream so we can ask Greg to push this to > > -stable? > > Yeah, I think we want to commit this with a -stable tag, Ingo? yeah - and we also want a Reported-by tag and an explanation of how it can crash and why it matters in practice. I can then stick it into the urgent branch for Linus. (probably will only hit upstream in the merge window though.) Thanks, Ingo ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: 2.6.32.21 - uptime related crashes? 2011-07-21 18:45 ` Ingo Molnar @ 2011-07-21 19:32 ` Nikola Ciprich 2011-08-25 18:56 ` Faidon Liambotis 1 sibling, 0 replies; 58+ messages in thread From: Nikola Ciprich @ 2011-07-21 19:32 UTC (permalink / raw) To: Ingo Molnar Cc: Peter Zijlstra, john stultz, Willy Tarreau, MINOURA Makoto, Andrew Morton, Faidon Liambotis, linux-kernel, stable, seto.hidetoshi, Hervé Commowick, Rand, Nikola Ciprich, Petr Kopecký [-- Attachment #1: Type: text/plain, Size: 1443 bytes --] > yeah - and we also want a Reported-by tag and an explanation of how > it can crash and why it matters in practice. I can then stick it into > the urgent branch for Linus. (probably will only hit upstream in the > merge window though.) Hello Ingo, well, I guess You can add me as reporter, but this has been independently reported by others as well, as the bug got hit by quite a lot of people... I'm afraid I won't add much to technical description of how this crashes the machine apart from what has been discussed in this thread. But the reason why this hurts us a lot is that it seems systems running RT tasks are affected in particular, and many of our crashed machines were failover clusters running pacemaker/corosync (which runs a lot of RT processes). And it really sucks, when both nodes of "high-availability" system crash in the same time :( So we were then forced to plan preventive restarts of some of those critical systems just to be sure they don't end up badly.. thanks to You all for taking a look at this! cheers! nik > > Thanks, > > Ingo > -- ------------------------------------- Ing. Nikola CIPRICH LinuxBox.cz, s.r.o. 28. 
rijna 168, 709 01 Ostrava tel.: +420 596 603 142 fax: +420 596 621 273 mobil: +420 777 093 799 www.linuxbox.cz mobil servis: +420 737 238 656 email servis: servis@linuxbox.cz ------------------------------------- [-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: 2.6.32.21 - uptime related crashes? 2011-07-21 18:45 ` Ingo Molnar 2011-07-21 19:32 ` Nikola Ciprich @ 2011-08-25 18:56 ` Faidon Liambotis 2011-08-30 22:38 ` [stable] " Greg KH 1 sibling, 1 reply; 58+ messages in thread From: Faidon Liambotis @ 2011-08-25 18:56 UTC (permalink / raw) To: linux-kernel Cc: Peter Zijlstra, Nikola Ciprich, john stultz, Willy Tarreau, MINOURA Makoto, Andrew Morton, Ingo Molnar, stable, seto.hidetoshi, Hervé Commowick, Rand On Thu, Jul 21, 2011 at 08:45:25PM +0200, Ingo Molnar wrote: > * Peter Zijlstra <peterz@infradead.org> wrote: > > > On Thu, 2011-07-21 at 14:50 +0200, Nikola Ciprich wrote: > > > thanks for the patch! I'll put this on our testing boxes... > > > > With a patch that frobs the starting value close to overflowing I hope, > > otherwise we'll not hear from you in like 7 months ;-) > > > > > Are You going to push this upstream so we can ask Greg to push this to > > > -stable? > > > > Yeah, I think we want to commit this with a -stable tag, Ingo? > > yeah - and we also want a Reported-by tag and an explanation of how > it can crash and why it matters in practice. I can then stick it into > the urgent branch for Linus. (probably will only hit upstream in the > merge window though.) Has this been pushed or has the problem been solved somehow? Time is against us on this bug as more boxes will crash as they reach 200 days of uptime... In any case, feel free to use me as a Reported-by, my full report of the problem being <20110430173905.GA25641@tty.gr>. FWIW and if I understand correctly, my symptoms were caused by *two* different bugs: a) the 54 bits wraparound at 208 days that Peter fixed above, b) a kernel crash at ~215 days related to RT tasks, fixed by 305e6835e05513406fa12820e40e4a8ecb63743c (already in -stable). Regards, Faidon ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [stable] 2.6.32.21 - uptime related crashes? 2011-08-25 18:56 ` Faidon Liambotis @ 2011-08-30 22:38 ` Greg KH 2011-09-04 23:26 ` Faidon Liambotis 0 siblings, 1 reply; 58+ messages in thread From: Greg KH @ 2011-08-30 22:38 UTC (permalink / raw) To: Faidon Liambotis Cc: linux-kernel, seto.hidetoshi, Peter Zijlstra, MINOURA Makoto, Ingo Molnar, stable, Hervé Commowick, john stultz, Rand, Andrew Morton, Willy Tarreau On Thu, Aug 25, 2011 at 09:56:16PM +0300, Faidon Liambotis wrote: > On Thu, Jul 21, 2011 at 08:45:25PM +0200, Ingo Molnar wrote: > > * Peter Zijlstra <peterz@infradead.org> wrote: > > > > > On Thu, 2011-07-21 at 14:50 +0200, Nikola Ciprich wrote: > > > > thanks for the patch! I'll put this on our testing boxes... > > > > > > With a patch that frobs the starting value close to overflowing I hope, > > > otherwise we'll not hear from you in like 7 months ;-) > > > > > > > Are You going to push this upstream so we can ask Greg to push this to > > > > -stable? > > > > > > Yeah, I think we want to commit this with a -stable tag, Ingo? > > > > yeah - and we also want a Reported-by tag and an explanation of how > > it can crash and why it matters in practice. I can then stick it into > > the urgent branch for Linus. (probably will only hit upstream in the > > merge window though.) > > Has this been pushed or has the problem been solved somehow? Time is > against us on this bug as more boxes will crash as they reach 200 days > of uptime... > > In any case, feel free to use me as a Reported-by, my full report of the > problem being <20110430173905.GA25641@tty.gr>. > > FWIW and if I understand correctly, my symptoms were caused by *two* > different bugs: > a) the 54 bits wraparound at 208 days that Peter fixed above, > b) a kernel crash at ~215 days related to RT tasks, fixed by > 305e6835e05513406fa12820e40e4a8ecb63743c (already in -stable). So, what do I do here as part of the .32-longterm kernel? 
Is there a fix that is in Linus's tree that I need to apply here? confused, greg k-h ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [stable] 2.6.32.21 - uptime related crashes? 2011-08-30 22:38 ` [stable] " Greg KH @ 2011-09-04 23:26 ` Faidon Liambotis 2011-10-23 18:31 ` Ruben Kerkhof 0 siblings, 1 reply; 58+ messages in thread From: Faidon Liambotis @ 2011-09-04 23:26 UTC (permalink / raw) To: Greg KH Cc: linux-kernel, seto.hidetoshi, Peter Zijlstra, MINOURA Makoto, Ingo Molnar, stable, Hervé Commowick, john stultz, Rand, Andrew Morton, Willy Tarreau On Tue, Aug 30, 2011 at 03:38:29PM -0700, Greg KH wrote: > On Thu, Aug 25, 2011 at 09:56:16PM +0300, Faidon Liambotis wrote: > > On Thu, Jul 21, 2011 at 08:45:25PM +0200, Ingo Molnar wrote: > > > * Peter Zijlstra <peterz@infradead.org> wrote: > > > > > > > On Thu, 2011-07-21 at 14:50 +0200, Nikola Ciprich wrote: > > > > > thanks for the patch! I'll put this on our testing boxes... > > > > > > > > With a patch that frobs the starting value close to overflowing I hope, > > > > otherwise we'll not hear from you in like 7 months ;-) > > > > > > > > > Are You going to push this upstream so we can ask Greg to push this to > > > > > -stable? > > > > > > > > Yeah, I think we want to commit this with a -stable tag, Ingo? > > > > > > yeah - and we also want a Reported-by tag and an explanation of how > > > it can crash and why it matters in practice. I can then stick it into > > > the urgent branch for Linus. (probably will only hit upstream in the > > > merge window though.) > > > > Has this been pushed or has the problem been solved somehow? Time is > > against us on this bug as more boxes will crash as they reach 200 days > > of uptime... > > > > In any case, feel free to use me as a Reported-by, my full report of the > > problem being <20110430173905.GA25641@tty.gr>. 
> > > > FWIW and if I understand correctly, my symptoms were caused by *two* > > different bugs: > > a) the 54 bits wraparound at 208 days that Peter fixed above, > > b) a kernel crash at ~215 days related to RT tasks, fixed by > > 305e6835e05513406fa12820e40e4a8ecb63743c (already in -stable). > > So, what do I do here as part of the .32-longterm kernel? Is there a > fix that is in Linus's tree that I need to apply here? > > confused, Is this even pushed upstream? I checked Linus' tree and the proposed patch is *not* merged there. I'm not really sure if it was fixed some other way, though. I thought this was intended to be an "urgent" fix or something? Regards, Faidon ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [stable] 2.6.32.21 - uptime related crashes? 2011-09-04 23:26 ` Faidon Liambotis @ 2011-10-23 18:31 ` Ruben Kerkhof 2011-10-23 22:07 ` Greg KH 2011-10-25 22:44 ` john stultz 0 siblings, 2 replies; 58+ messages in thread From: Ruben Kerkhof @ 2011-10-23 18:31 UTC (permalink / raw) To: Greg KH Cc: linux-kernel, seto.hidetoshi, Peter Zijlstra, MINOURA Makoto, Ingo Molnar, stable, Hervé Commowick, john stultz, Rand, Andrew Morton, Willy Tarreau, Faidon Liambotis On Mon, Sep 5, 2011 at 01:26, Faidon Liambotis <paravoid@debian.org> wrote: > On Tue, Aug 30, 2011 at 03:38:29PM -0700, Greg KH wrote: >> On Thu, Aug 25, 2011 at 09:56:16PM +0300, Faidon Liambotis wrote: >> > On Thu, Jul 21, 2011 at 08:45:25PM +0200, Ingo Molnar wrote: >> > > * Peter Zijlstra <peterz@infradead.org> wrote: >> > > >> > > > On Thu, 2011-07-21 at 14:50 +0200, Nikola Ciprich wrote: >> > > > > thanks for the patch! I'll put this on our testing boxes... >> > > > >> > > > With a patch that frobs the starting value close to overflowing I hope, >> > > > otherwise we'll not hear from you in like 7 months ;-) >> > > > >> > > > > Are You going to push this upstream so we can ask Greg to push this to >> > > > > -stable? >> > > > >> > > > Yeah, I think we want to commit this with a -stable tag, Ingo? >> > > >> > > yeah - and we also want a Reported-by tag and an explanation of how >> > > it can crash and why it matters in practice. I can then stick it into >> > > the urgent branch for Linus. (probably will only hit upstream in the >> > > merge window though.) >> > >> > Has this been pushed or has the problem been solved somehow? Time is >> > against us on this bug as more boxes will crash as they reach 200 days >> > of uptime... >> > >> > In any case, feel free to use me as a Reported-by, my full report of the >> > problem being <20110430173905.GA25641@tty.gr>. 
>> > >> > FWIW and if I understand correctly, my symptoms were caused by *two* >> > different bugs: >> > a) the 54 bits wraparound at 208 days that Peter fixed above, >> > b) a kernel crash at ~215 days related to RT tasks, fixed by >> > 305e6835e05513406fa12820e40e4a8ecb63743c (already in -stable). >> >> So, what do I do here as part of the .32-longterm kernel? Is there a >> fix that is in Linus's tree that I need to apply here? >> >> confused, > > Is this even pushed upstream? I checked Linus' tree and the proposed > patch is *not* merged there. I'm not really sure if it was fixed some > other way, though. I thought this was intended to be an "urgent" fix or > something? > > Regards, > Faidon I just had two crashes on two different machines, both with an uptime of 208 days. Both were 5520's running 2.6.34.8, but with a CONFIG_HZ of 1000 2011-10-23T16:49:18.618029+02:00 phy001 kernel: BUG: soft lockup - CPU#0 stuck for 17163091968s! [qemu-kvm:16949] 2011-10-23T16:49:18.618054+02:00 phy001 kernel: Modules linked in: xt_limit ebt_log ebt_limit ebt_arp ebtable_filter ebtable_nat ebtables ufs nls_utf8 tun ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q garp stp llc bonding xt_comment xt_recent ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 kvm_intel kvm ioatdma i2c_i801 igb iTCO_wdt dca iTCO_vendor_support serio_raw i2c_core 3w_9xxx [last unloaded: scsi_wait_scan] 2011-10-23T16:49:18.618060+02:00 phy001 kernel: CPU 0 2011-10-23T16:49:18.618068+02:00 phy001 kernel: Modules linked in: xt_limit ebt_log ebt_limit ebt_arp ebtable_filter ebtable_nat ebtables ufs nls_utf8 tun ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q garp stp llc bonding xt_comment xt_recent ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 kvm_intel kvm ioatdma i2c_i801 igb iTCO_wdt dca iTCO_vendor_support serio_raw i2c_core 3w_9xxx [last unloaded: scsi_wait_scan] 2011-10-23T16:49:18.618072+02:00 phy001 kernel: 2011-10-23T16:49:18.618077+02:00 phy001 kernel: Pid: 16949, 
comm: qemu-kvm Tainted: G M 2.6.34.8-68.local.fc13.x86_64 #1 X8DTU/X8DTU 2011-10-23T16:49:18.618083+02:00 phy001 kernel: RIP: 0010:[<ffffffffa007f92f>] [<ffffffffa007f92f>] kvm_arch_vcpu_ioctl_run+0x764/0xa74 [kvm] 2011-10-23T16:49:18.618086+02:00 phy001 kernel: RSP: 0018:ffff880bafa29d18 EFLAGS: 00000202 2011-10-23T16:49:18.618088+02:00 phy001 kernel: RAX: ffff880002000000 RBX: ffff880bafa29dc8 RCX: ffff8805e45128a0 2011-10-23T16:49:18.618091+02:00 phy001 kernel: RDX: 000000000000cb80 RSI: 0000000004b2a3a0 RDI: 000000000b630000 2011-10-23T16:49:18.618093+02:00 phy001 kernel: RBP: ffffffff8100a60e R08: 000000000000002b R09: 00000000760d0735 2011-10-23T16:49:18.618095+02:00 phy001 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001 2011-10-23T16:49:18.618097+02:00 phy001 kernel: R13: ffff880bafa29cc8 R14: ffffffffa007b536 R15: ffff880bafa29ca8 2011-10-23T16:49:18.618100+02:00 phy001 kernel: FS: 00007fe92cd38700(0000) GS:ffff880002000000(0000) knlGS:fffff880009b8000 2011-10-23T16:49:18.618102+02:00 phy001 kernel: CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033 2011-10-23T16:49:18.618104+02:00 phy001 kernel: CR2: 00000000c1a00044 CR3: 00000006b3f2e000 CR4: 00000000000026e0 2011-10-23T16:49:18.618107+02:00 phy001 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 2011-10-23T16:49:18.618109+02:00 phy001 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 2011-10-23T16:49:18.618112+02:00 phy001 kernel: Process qemu-kvm (pid: 16949, threadinfo ffff880bafa28000, task ffff880c242e0000) 2011-10-23T16:49:18.618114+02:00 phy001 kernel: Stack: 2011-10-23T16:49:18.618116+02:00 phy001 kernel: ffff88077b1a3ca8 ffffffff81d3cf38 ffff8805e4513f00 ffff880c242e0000 2011-10-23T16:49:18.618119+02:00 phy001 kernel: <0> ffff880c242e0000 ffff880bafa29fd8 ffff8805e4513ef8 0000000000015fd0 2011-10-23T16:49:18.618121+02:00 phy001 kernel: <0> 000000000000cb80 ffff880c242e0000 ffff880bafa28000 ffff880ab43f4038 
2011-10-23T16:49:18.618123+02:00 phy001 kernel: Call Trace: 2011-10-23T16:49:18.618126+02:00 phy001 kernel: [<ffffffffa006e5ba>] ? kvm_vcpu_ioctl+0xfd/0x56e [kvm] 2011-10-23T16:49:18.618129+02:00 phy001 kernel: [<ffffffff81011252>] ? __switch_to_xtra+0x121/0x141 2011-10-23T16:49:18.618131+02:00 phy001 kernel: [<ffffffff8111ad5f>] ? vfs_ioctl+0x32/0xa6 2011-10-23T16:49:18.618134+02:00 phy001 kernel: [<ffffffff8111b2d2>] ? do_vfs_ioctl+0x483/0x4c9 2011-10-23T16:49:18.618137+02:00 phy001 kernel: [<ffffffff8111b36e>] ? sys_ioctl+0x56/0x79 2011-10-23T16:49:18.618139+02:00 phy001 kernel: [<ffffffff81009c72>] ? system_call_fastpath+0x16/0x1b 2011-10-23T16:49:18.618142+02:00 phy001 kernel: Code: df ff 90 48 01 00 00 48 8b 55 90 65 48 8b 04 25 90 e8 00 00 f6 04 10 aa 74 05 e8 05 06 f9 e0 f0 41 80 0f 02 fb 66 0f 1f 44 00 00 <ff> 83 b0 00 00 00 48 8b b5 68 ff ff ff 83 66 14 ef 48 8b 3b 48 Can the necessary fix please be pushed upstream? Kind regards, Ruben Kerkhof ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [stable] 2.6.32.21 - uptime related crashes? 2011-10-23 18:31 ` Ruben Kerkhof @ 2011-10-23 22:07 ` Greg KH 2011-10-25 22:44 ` john stultz 1 sibling, 0 replies; 58+ messages in thread From: Greg KH @ 2011-10-23 22:07 UTC (permalink / raw) To: Ruben Kerkhof Cc: linux-kernel, seto.hidetoshi, Peter Zijlstra, MINOURA Makoto, Ingo Molnar, stable, Hervé Commowick, john stultz, Rand, Andrew Morton, Willy Tarreau, Faidon Liambotis On Sun, Oct 23, 2011 at 08:31:32PM +0200, Ruben Kerkhof wrote: > On Mon, Sep 5, 2011 at 01:26, Faidon Liambotis <paravoid@debian.org> wrote: > > On Tue, Aug 30, 2011 at 03:38:29PM -0700, Greg KH wrote: > >> On Thu, Aug 25, 2011 at 09:56:16PM +0300, Faidon Liambotis wrote: > >> > On Thu, Jul 21, 2011 at 08:45:25PM +0200, Ingo Molnar wrote: > >> > > * Peter Zijlstra <peterz@infradead.org> wrote: > >> > > > >> > > > On Thu, 2011-07-21 at 14:50 +0200, Nikola Ciprich wrote: > >> > > > > thanks for the patch! I'll put this on our testing boxes... > >> > > > > >> > > > With a patch that frobs the starting value close to overflowing I hope, > >> > > > otherwise we'll not hear from you in like 7 months ;-) > >> > > > > >> > > > > Are You going to push this upstream so we can ask Greg to push this to > >> > > > > -stable? > >> > > > > >> > > > Yeah, I think we want to commit this with a -stable tag, Ingo? > >> > > > >> > > yeah - and we also want a Reported-by tag and an explanation of how > >> > > it can crash and why it matters in practice. I can then stick it into > >> > > the urgent branch for Linus. (probably will only hit upstream in the > >> > > merge window though.) > >> > > >> > Has this been pushed or has the problem been solved somehow? Time is > >> > against us on this bug as more boxes will crash as they reach 200 days > >> > of uptime... > >> > > >> > In any case, feel free to use me as a Reported-by, my full report of the > >> > problem being <20110430173905.GA25641@tty.gr>. 
> >> > > >> > FWIW and if I understand correctly, my symptoms were caused by *two* > >> > different bugs: > >> > a) the 54 bits wraparound at 208 days that Peter fixed above, > >> > b) a kernel crash at ~215 days related to RT tasks, fixed by > >> > 305e6835e05513406fa12820e40e4a8ecb63743c (already in -stable). > >> > >> So, what do I do here as part of the .32-longterm kernel? Is there a > >> fix that is in Linus's tree that I need to apply here? > >> > >> confused, > > > > Is this even pushed upstream? I checked Linus' tree and the proposed > > patch is *not* merged there. I'm not really sure if it was fixed some > > other way, though. I thought this was intended to be an "urgent" fix or > > something? > > > > Regards, > > Faidon > > I just had two crashes on two different machines, both with an uptime > of 208 days. > Both were 5520's running 2.6.34.8, but with a CONFIG_HZ of 1000 > > 2011-10-23T16:49:18.618029+02:00 phy001 kernel: BUG: soft lockup - > CPU#0 stuck for 17163091968s! [qemu-kvm:16949] > 2011-10-23T16:49:18.618054+02:00 phy001 kernel: Modules linked in: > xt_limit ebt_log ebt_limit ebt_arp ebtable_filter ebtable_nat ebtables > ufs nls_utf8 tun ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q > garp stp llc bonding xt_comment xt_recent ip6t_REJECT > nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 kvm_intel kvm > ioatdma i2c_i801 igb iTCO_wdt dca iTCO_vendor_support serio_raw > i2c_core 3w_9xxx [last unloaded: scsi_wait_scan] > 2011-10-23T16:49:18.618060+02:00 phy001 kernel: CPU 0 > 2011-10-23T16:49:18.618068+02:00 phy001 kernel: Modules linked in: > xt_limit ebt_log ebt_limit ebt_arp ebtable_filter ebtable_nat ebtables > ufs nls_utf8 tun ipmi_devintf ipmi_si ipmi_msghandler bridge 8021q > garp stp llc bonding xt_comment xt_recent ip6t_REJECT > nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 kvm_intel kvm > ioatdma i2c_i801 igb iTCO_wdt dca iTCO_vendor_support serio_raw > i2c_core 3w_9xxx [last unloaded: scsi_wait_scan] > 
2011-10-23T16:49:18.618072+02:00 phy001 kernel: > 2011-10-23T16:49:18.618077+02:00 phy001 kernel: Pid: 16949, comm: > qemu-kvm Tainted: G M 2.6.34.8-68.local.fc13.x86_64 #1 > X8DTU/X8DTU > 2011-10-23T16:49:18.618083+02:00 phy001 kernel: RIP: > 0010:[<ffffffffa007f92f>] [<ffffffffa007f92f>] > kvm_arch_vcpu_ioctl_run+0x764/0xa74 [kvm] > 2011-10-23T16:49:18.618086+02:00 phy001 kernel: RSP: > 0018:ffff880bafa29d18 EFLAGS: 00000202 > 2011-10-23T16:49:18.618088+02:00 phy001 kernel: RAX: ffff880002000000 > RBX: ffff880bafa29dc8 RCX: ffff8805e45128a0 > 2011-10-23T16:49:18.618091+02:00 phy001 kernel: RDX: 000000000000cb80 > RSI: 0000000004b2a3a0 RDI: 000000000b630000 > 2011-10-23T16:49:18.618093+02:00 phy001 kernel: RBP: ffffffff8100a60e > R08: 000000000000002b R09: 00000000760d0735 > 2011-10-23T16:49:18.618095+02:00 phy001 kernel: R10: 0000000000000000 > R11: 0000000000000000 R12: 0000000000000001 > 2011-10-23T16:49:18.618097+02:00 phy001 kernel: R13: ffff880bafa29cc8 > R14: ffffffffa007b536 R15: ffff880bafa29ca8 > 2011-10-23T16:49:18.618100+02:00 phy001 kernel: FS: > 00007fe92cd38700(0000) GS:ffff880002000000(0000) > knlGS:fffff880009b8000 > 2011-10-23T16:49:18.618102+02:00 phy001 kernel: CS: 0010 DS: 002b ES: > 002b CR0: 0000000080050033 > 2011-10-23T16:49:18.618104+02:00 phy001 kernel: CR2: 00000000c1a00044 > CR3: 00000006b3f2e000 CR4: 00000000000026e0 > 2011-10-23T16:49:18.618107+02:00 phy001 kernel: DR0: 0000000000000000 > DR1: 0000000000000000 DR2: 0000000000000000 > 2011-10-23T16:49:18.618109+02:00 phy001 kernel: DR3: 0000000000000000 > DR6: 00000000ffff0ff0 DR7: 0000000000000400 > 2011-10-23T16:49:18.618112+02:00 phy001 kernel: Process qemu-kvm (pid: > 16949, threadinfo ffff880bafa28000, task ffff880c242e0000) > 2011-10-23T16:49:18.618114+02:00 phy001 kernel: Stack: > 2011-10-23T16:49:18.618116+02:00 phy001 kernel: ffff88077b1a3ca8 > ffffffff81d3cf38 ffff8805e4513f00 ffff880c242e0000 > 2011-10-23T16:49:18.618119+02:00 phy001 kernel: <0> ffff880c242e0000 > 
ffff880bafa29fd8 ffff8805e4513ef8 0000000000015fd0 > 2011-10-23T16:49:18.618121+02:00 phy001 kernel: <0> 000000000000cb80 > ffff880c242e0000 ffff880bafa28000 ffff880ab43f4038 > 2011-10-23T16:49:18.618123+02:00 phy001 kernel: Call Trace: > 2011-10-23T16:49:18.618126+02:00 phy001 kernel: [<ffffffffa006e5ba>] ? > kvm_vcpu_ioctl+0xfd/0x56e [kvm] > 2011-10-23T16:49:18.618129+02:00 phy001 kernel: [<ffffffff81011252>] ? > __switch_to_xtra+0x121/0x141 > 2011-10-23T16:49:18.618131+02:00 phy001 kernel: [<ffffffff8111ad5f>] ? > vfs_ioctl+0x32/0xa6 > 2011-10-23T16:49:18.618134+02:00 phy001 kernel: [<ffffffff8111b2d2>] ? > do_vfs_ioctl+0x483/0x4c9 > 2011-10-23T16:49:18.618137+02:00 phy001 kernel: [<ffffffff8111b36e>] ? > sys_ioctl+0x56/0x79 > 2011-10-23T16:49:18.618139+02:00 phy001 kernel: [<ffffffff81009c72>] ? > system_call_fastpath+0x16/0x1b > 2011-10-23T16:49:18.618142+02:00 phy001 kernel: Code: df ff 90 48 01 > 00 00 48 8b 55 90 65 48 8b 04 25 90 e8 00 00 f6 04 10 aa 74 05 e8 05 > 06 f9 e0 f0 41 80 0f 02 fb 66 0f 1f 44 00 00 <ff> 83 b0 00 00 00 48 8b > b5 68 ff ff ff 83 66 14 ef 48 8b 3b 48 > > Can the necessary fix please be pushed upstream? I agree, again, can someone please do this? greg k-h ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [stable] 2.6.32.21 - uptime related crashes? 2011-10-23 18:31 ` Ruben Kerkhof 2011-10-23 22:07 ` Greg KH @ 2011-10-25 22:44 ` john stultz 2011-10-25 23:25 ` Willy Tarreau 2011-10-26 18:21 ` Ruben Kerkhof 1 sibling, 2 replies; 58+ messages in thread From: john stultz @ 2011-10-25 22:44 UTC (permalink / raw) To: Ruben Kerkhof Cc: Greg KH, linux-kernel, seto.hidetoshi, Peter Zijlstra, MINOURA Makoto, Ingo Molnar, stable, Hervé Commowick, Rand, Andrew Morton, Willy Tarreau, Faidon Liambotis On Sun, 2011-10-23 at 20:31 +0200, Ruben Kerkhof wrote: > On Mon, Sep 5, 2011 at 01:26, Faidon Liambotis <paravoid@debian.org> wrote: > > On Tue, Aug 30, 2011 at 03:38:29PM -0700, Greg KH wrote: > >> On Thu, Aug 25, 2011 at 09:56:16PM +0300, Faidon Liambotis wrote: > >> > On Thu, Jul 21, 2011 at 08:45:25PM +0200, Ingo Molnar wrote: > >> > > * Peter Zijlstra <peterz@infradead.org> wrote: > >> > > > >> > > > On Thu, 2011-07-21 at 14:50 +0200, Nikola Ciprich wrote: > >> > > > > thanks for the patch! I'll put this on our testing boxes... > >> > > > > >> > > > With a patch that frobs the starting value close to overflowing I hope, > >> > > > otherwise we'll not hear from you in like 7 months ;-) > >> > > > > >> > > > > Are You going to push this upstream so we can ask Greg to push this to > >> > > > > -stable? > >> > > > > >> > > > Yeah, I think we want to commit this with a -stable tag, Ingo? > >> > > > >> > > yeah - and we also want a Reported-by tag and an explanation of how > >> > > it can crash and why it matters in practice. I can then stick it into > >> > > the urgent branch for Linus. (probably will only hit upstream in the > >> > > merge window though.) > >> > > >> > Has this been pushed or has the problem been solved somehow? Time is > >> > against us on this bug as more boxes will crash as they reach 200 days > >> > of uptime... > >> > > >> > In any case, feel free to use me as a Reported-by, my full report of the > >> > problem being <20110430173905.GA25641@tty.gr>. 
> >> > > >> > FWIW and if I understand correctly, my symptoms were caused by *two* > >> > different bugs: > >> > a) the 54 bits wraparound at 208 days that Peter fixed above, > >> > b) a kernel crash at ~215 days related to RT tasks, fixed by > >> > 305e6835e05513406fa12820e40e4a8ecb63743c (already in -stable). > >> > >> So, what do I do here as part of the .32-longterm kernel? Is there a > >> fix that is in Linus's tree that I need to apply here? > >> > >> confused, > > > > Is this even pushed upstream? I checked Linus' tree and the proposed > > patch is *not* merged there. I'm not really sure if it was fixed some > > other way, though. I thought this was intended to be an "urgent" fix or > > something? > > > > Regards, > > Faidon > > I just had two crashes on two different machines, both with an uptime > of 208 days. > Both were 5520's running 2.6.34.8, but with a CONFIG_HZ of 1000 > > 2011-10-23T16:49:18.618029+02:00 phy001 kernel: BUG: soft lockup - > CPU#0 stuck for 17163091968s! [qemu-kvm:16949] So were these actual crashes, or just softlockup false positives? I had thought the earlier crash issue (div by zero) fix from PeterZ had been already pushed upstream, but maybe that was just against 2.6.32 and not 2.6.33? The softlockup false positive issue should have been fixed by Peter's "x86, intel: Don't mark sched_clock() as stable" below. But I'm not seeing it upstream. Peter, is this still the right fix? thanks -john From: Peter Zijlstra <a.p.zijlstra@chello.nl> Subject: x86, intel: Don't mark sched_clock() as stable Because the x86 sched_clock() implementation wraps at 54 bits and the scheduler code assumes it wraps at the full 64bits we can get into trouble after 208 days (~7 months) of uptime. 
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 arch/x86/kernel/cpu/intel.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index ed6086e..c8dc48b 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -91,8 +91,15 @@ static void __cpuinit early_init_intel(struct cpuinfo_x86 *c)
 	if (c->x86_power & (1 << 8)) {
 		set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC);
 		set_cpu_cap(c, X86_FEATURE_NONSTOP_TSC);
+		/*
+		 * Unfortunately our __cycles_2_ns() implementation makes
+		 * the raw sched_clock() interface wrap at 54-bits, which
+		 * makes it unsuitable for direct use, so disable this
+		 * for now.
+		 *
 		if (!check_tsc_unstable())
 			sched_clock_stable = 1;
+		 */
 	}

 	/*

^ permalink raw reply related	[flat|nested] 58+ messages in thread
* Re: [stable] 2.6.32.21 - uptime related crashes? 2011-10-25 22:44 ` john stultz @ 2011-10-25 23:25 ` Willy Tarreau 2011-12-02 23:45 ` Greg KH 2011-10-26 18:21 ` Ruben Kerkhof 1 sibling, 1 reply; 58+ messages in thread From: Willy Tarreau @ 2011-10-25 23:25 UTC (permalink / raw) To: john stultz Cc: Ruben Kerkhof, Greg KH, linux-kernel, seto.hidetoshi, Peter Zijlstra, MINOURA Makoto, Ingo Molnar, stable, Hervé Commowick, Rand, Andrew Morton, Faidon Liambotis Hi John, On Tue, Oct 25, 2011 at 03:44:30PM -0700, john stultz wrote: > The softlockup false positive issue should have been fixed by Peter's > "x86, intel: Don't mark sched_clock() as stable" below. But I'm not > seeing it upstream. Peter, is this still the right fix? I've not seen any other one proposed, and both you and Peter appeared to like it. I understood that Ingo was waiting for the merge window to submit it and I think that it simply got lost. Ingo, can you confirm ? Thanks, Willy ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [stable] 2.6.32.21 - uptime related crashes? 2011-10-25 23:25 ` Willy Tarreau @ 2011-12-02 23:45 ` Greg KH 2011-12-03 0:02 ` john stultz 0 siblings, 1 reply; 58+ messages in thread From: Greg KH @ 2011-12-02 23:45 UTC (permalink / raw) To: Willy Tarreau Cc: john stultz, Ruben Kerkhof, linux-kernel, seto.hidetoshi, Peter Zijlstra, MINOURA Makoto, Ingo Molnar, stable, Hervé Commowick, Rand, Andrew Morton, Faidon Liambotis On Wed, Oct 26, 2011 at 01:25:45AM +0200, Willy Tarreau wrote: > Hi John, > > On Tue, Oct 25, 2011 at 03:44:30PM -0700, john stultz wrote: > > The softlockup false positive issue should have been fixed by Peter's > > "x86, intel: Don't mark sched_clock() as stable" below. But I'm not > > seeing it upstream. Peter, is this still the right fix? > > I've not seen any other one proposed, and both you and Peter appeared > to like it. I understood that Ingo was waiting for the merge window to > submit it and I think that it simply got lost. > > Ingo, can you confirm ? I'm totally confused here, what's the status of this, and what exactly is the patch? greg k-h ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [stable] 2.6.32.21 - uptime related crashes? 2011-12-02 23:45 ` Greg KH @ 2011-12-03 0:02 ` john stultz 2011-12-03 1:02 ` Greg KH 0 siblings, 1 reply; 58+ messages in thread From: john stultz @ 2011-12-03 0:02 UTC (permalink / raw) To: Greg KH Cc: Willy Tarreau, Ruben Kerkhof, linux-kernel, seto.hidetoshi, Peter Zijlstra, MINOURA Makoto, Ingo Molnar, stable, Hervé Commowick, Rand, Andrew Morton, Faidon Liambotis On Fri, 2011-12-02 at 15:45 -0800, Greg KH wrote: > On Wed, Oct 26, 2011 at 01:25:45AM +0200, Willy Tarreau wrote: > > Hi John, > > > > On Tue, Oct 25, 2011 at 03:44:30PM -0700, john stultz wrote: > > > The softlockup false positive issue should have been fixed by Peter's > > > "x86, intel: Don't mark sched_clock() as stable" below. But I'm not > > > seeing it upstream. Peter, is this still the right fix? > > > > I've not seen any other one proposed, and both you and Peter appeared > > to like it. I understood that Ingo was waiting for the merge window to > > submit it and I think that it simply got lost. > > > > Ingo, can you confirm ? > > I'm totally confused here, what's the status of this, and what exactly > is the patch? Ingo has the fix from Salman queued in -tip, but I'm not sure why its not been pushed to Linus yet. http://git.kernel.org/?p=linux/kernel/git/tip/tip.git;a=commit;h=4cecf6d401a01d054afc1e5f605bcbfe553cb9b9 thanks -john ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [stable] 2.6.32.21 - uptime related crashes? 2011-12-03 0:02 ` john stultz @ 2011-12-03 1:02 ` Greg KH 2011-12-03 7:00 ` Willy Tarreau 2011-12-05 16:53 ` Ingo Molnar 0 siblings, 2 replies; 58+ messages in thread From: Greg KH @ 2011-12-03 1:02 UTC (permalink / raw) To: john stultz, Ingo Molnar Cc: Willy Tarreau, Ruben Kerkhof, linux-kernel, seto.hidetoshi, Peter Zijlstra, MINOURA Makoto, stable, Hervé Commowick, Rand, Andrew Morton, Faidon Liambotis On Fri, Dec 02, 2011 at 04:02:23PM -0800, john stultz wrote: > On Fri, 2011-12-02 at 15:45 -0800, Greg KH wrote: > > On Wed, Oct 26, 2011 at 01:25:45AM +0200, Willy Tarreau wrote: > > > Hi John, > > > > > > On Tue, Oct 25, 2011 at 03:44:30PM -0700, john stultz wrote: > > > > The softlockup false positive issue should have been fixed by Peter's > > > > "x86, intel: Don't mark sched_clock() as stable" below. But I'm not > > > > seeing it upstream. Peter, is this still the right fix? > > > > > > I've not seen any other one proposed, and both you and Peter appeared > > > to like it. I understood that Ingo was waiting for the merge window to > > > submit it and I think that it simply got lost. > > > > > > Ingo, can you confirm ? > > > > I'm totally confused here, what's the status of this, and what exactly > > is the patch? > > Ingo has the fix from Salman queued in -tip, but I'm not sure why its > not been pushed to Linus yet. > > http://git.kernel.org/?p=linux/kernel/git/tip/tip.git;a=commit;h=4cecf6d401a01d054afc1e5f605bcbfe553cb9b9 Wonderful, thanks for pointing this out to me. Ingo, any idea when this will go to Linus's tree? thanks, greg k-h ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [stable] 2.6.32.21 - uptime related crashes? 2011-12-03 1:02 ` Greg KH @ 2011-12-03 7:00 ` Willy Tarreau 2011-12-05 16:53 ` Ingo Molnar 1 sibling, 0 replies; 58+ messages in thread From: Willy Tarreau @ 2011-12-03 7:00 UTC (permalink / raw) To: Ingo Molnar Cc: john stultz, Greg KH, Ruben Kerkhof, linux-kernel, seto.hidetoshi, Peter Zijlstra, MINOURA Makoto, stable, Rand, Andrew Morton, Faidon Liambotis On Fri, Dec 02, 2011 at 05:02:32PM -0800, Greg KH wrote: > On Fri, Dec 02, 2011 at 04:02:23PM -0800, john stultz wrote: > > On Fri, 2011-12-02 at 15:45 -0800, Greg KH wrote: > > > On Wed, Oct 26, 2011 at 01:25:45AM +0200, Willy Tarreau wrote: > > > > Hi John, > > > > > > > > On Tue, Oct 25, 2011 at 03:44:30PM -0700, john stultz wrote: > > > > > The softlockup false positive issue should have been fixed by Peter's > > > > > "x86, intel: Don't mark sched_clock() as stable" below. But I'm not > > > > > seeing it upstream. Peter, is this still the right fix? > > > > > > > > I've not seen any other one proposed, and both you and Peter appeared > > > > to like it. I understood that Ingo was waiting for the merge window to > > > > submit it and I think that it simply got lost. > > > > > > > > Ingo, can you confirm ? > > > > > > I'm totally confused here, what's the status of this, and what exactly > > > is the patch? > > > > Ingo has the fix from Salman queued in -tip, but I'm not sure why its > > not been pushed to Linus yet. > > > > http://git.kernel.org/?p=linux/kernel/git/tip/tip.git;a=commit;h=4cecf6d401a01d054afc1e5f605bcbfe553cb9b9 > > Wonderful, thanks for pointing this out to me. > > Ingo, any idea when this will go to Linus's tree? Yes please Ingo, do not delay it any further, this is becoming a real problem, there are people who monitor their uptime to plan a reboot before 200 days. We shouldn't need to wait for the next merge window, the patch is already 15 days old and is a fix for a real-world stability issue ! 
Thanks, Willy ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [stable] 2.6.32.21 - uptime related crashes? 2011-12-03 1:02 ` Greg KH 2011-12-03 7:00 ` Willy Tarreau @ 2011-12-05 16:53 ` Ingo Molnar 1 sibling, 0 replies; 58+ messages in thread From: Ingo Molnar @ 2011-12-05 16:53 UTC (permalink / raw) To: Greg KH Cc: john stultz, Willy Tarreau, Ruben Kerkhof, linux-kernel, seto.hidetoshi, Peter Zijlstra, MINOURA Makoto, stable, Hervé Commowick, Rand, Andrew Morton, Faidon Liambotis * Greg KH <greg@kroah.com> wrote: > On Fri, Dec 02, 2011 at 04:02:23PM -0800, john stultz wrote: > > On Fri, 2011-12-02 at 15:45 -0800, Greg KH wrote: > > > On Wed, Oct 26, 2011 at 01:25:45AM +0200, Willy Tarreau wrote: > > > > Hi John, > > > > > > > > On Tue, Oct 25, 2011 at 03:44:30PM -0700, john stultz wrote: > > > > > The softlockup false positive issue should have been fixed by Peter's > > > > > "x86, intel: Don't mark sched_clock() as stable" below. But I'm not > > > > > seeing it upstream. Peter, is this still the right fix? > > > > > > > > I've not seen any other one proposed, and both you and Peter appeared > > > > to like it. I understood that Ingo was waiting for the merge window to > > > > submit it and I think that it simply got lost. > > > > > > > > Ingo, can you confirm ? > > > > > > I'm totally confused here, what's the status of this, and what exactly > > > is the patch? > > > > Ingo has the fix from Salman queued in -tip, but I'm not sure why its > > not been pushed to Linus yet. > > > > http://git.kernel.org/?p=linux/kernel/git/tip/tip.git;a=commit;h=4cecf6d401a01d054afc1e5f605bcbfe553cb9b9 > > Wonderful, thanks for pointing this out to me. > > Ingo, any idea when this will go to Linus's tree? today if everything goes fine. Thanks, Ingo ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [stable] 2.6.32.21 - uptime related crashes? 2011-10-25 22:44 ` john stultz 2011-10-25 23:25 ` Willy Tarreau @ 2011-10-26 18:21 ` Ruben Kerkhof 1 sibling, 0 replies; 58+ messages in thread From: Ruben Kerkhof @ 2011-10-26 18:21 UTC (permalink / raw) To: john stultz Cc: Greg KH, linux-kernel, seto.hidetoshi, Peter Zijlstra, MINOURA Makoto, Ingo Molnar, stable, Hervé Commowick, Rand, Andrew Morton, Willy Tarreau, Faidon Liambotis On Wed, Oct 26, 2011 at 00:44, john stultz <johnstul@us.ibm.com> wrote: > On Sun, 2011-10-23 at 20:31 +0200, Ruben Kerkhof wrote: >> I just had two crashes on two different machines, both with an uptime >> of 208 days. >> Both were 5520's running 2.6.34.8, but with a CONFIG_HZ of 1000 >> >> 2011-10-23T16:49:18.618029+02:00 phy001 kernel: BUG: soft lockup - >> CPU#0 stuck for 17163091968s! [qemu-kvm:16949] > > So were these actual crashes, or just softlockup false positives? Just softlockups, I haven't seen the divide_by_zero crash. Thanks, Ruben ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: 2.6.32.21 - uptime related crashes? 2011-07-21 12:53 ` Peter Zijlstra 2011-07-21 18:45 ` Ingo Molnar @ 2011-07-21 19:25 ` Nikola Ciprich 2011-07-21 19:37 ` john stultz 1 sibling, 1 reply; 58+ messages in thread From: Nikola Ciprich @ 2011-07-21 19:25 UTC (permalink / raw) To: Peter Zijlstra Cc: Ingo Molnar, john stultz, Willy Tarreau, MINOURA Makoto, Andrew Morton, Faidon Liambotis, linux-kernel, stable, seto.hidetoshi, Hervé Commowick, Rand, Nikola Ciprich [-- Attachment #1: Type: text/plain, Size: 709 bytes --] Hello Peter, > With a patch that frobs the starting value close to overflowing I hope, > otherwise we'll not hear from you in like 7 months ;-) Sure. Which is the best patch to use for testing? You mean john's one? (http://www.gossamer-threads.com/lists/linux/kernel/1406779#1406779) or some other? nik > Yeah, I think we want to commit this with a -stable tag, Ingo? > -- ------------------------------------- Ing. Nikola CIPRICH LinuxBox.cz, s.r.o. 28. rijna 168, 709 01 Ostrava tel.: +420 596 603 142 fax: +420 596 621 273 mobil: +420 777 093 799 www.linuxbox.cz mobil servis: +420 737 238 656 email servis: servis@linuxbox.cz ------------------------------------- [-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: 2.6.32.21 - uptime related crashes? 2011-07-21 19:25 ` Nikola Ciprich @ 2011-07-21 19:37 ` john stultz 0 siblings, 0 replies; 58+ messages in thread From: john stultz @ 2011-07-21 19:37 UTC (permalink / raw) To: Nikola Ciprich Cc: Peter Zijlstra, Ingo Molnar, Willy Tarreau, MINOURA Makoto, Andrew Morton, Faidon Liambotis, linux-kernel, stable, seto.hidetoshi, Hervé Commowick, Rand On Thu, 2011-07-21 at 21:25 +0200, Nikola Ciprich wrote: > Hello Peter, > > > With a patch that frobs the starting value close to overflowing I hope, > > otherwise we'll not hear from you in like 7 months ;-) > sure. Which is the best patch to use for testing, You mean john's one? > (http://www.gossamer-threads.com/lists/linux/kernel/1406779#1406779) > or some other? http://git.linaro.org/gitweb?p=people/jstultz/linux.git;a=commitdiff;h=3eedec0ad10a856cf48ada066e88871d8ab427b3 Should do it. thanks -john ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: 2.6.32.21 - uptime related crashes? 2011-07-21 7:22 ` Ingo Molnar 2011-07-21 12:24 ` Peter Zijlstra @ 2011-07-21 19:53 ` john stultz 1 sibling, 0 replies; 58+ messages in thread From: john stultz @ 2011-07-21 19:53 UTC (permalink / raw) To: Ingo Molnar Cc: Peter Zijlstra, Willy Tarreau, MINOURA Makoto / 箕浦 真, Andrew Morton, Faidon Liambotis, linux-kernel, stable, Nikola Ciprich, seto.hidetoshi, Hervé Commowick, Rand On Thu, 2011-07-21 at 09:22 +0200, Ingo Molnar wrote: > * john stultz <johnstul@us.ibm.com> wrote: > > > On Fri, 2011-07-15 at 12:01 +0200, Peter Zijlstra wrote: > > > On Thu, 2011-07-14 at 17:35 -0700, john stultz wrote: > > > > > > > > Peter/Ingo: Can you take a look at the above and let me know if you find > > > > it too disagreeable? > > > > > > +static unsigned long long __cycles_2_ns(unsigned long long cyc) > > > +{ > > > + unsigned long long ns = 0; > > > + struct x86_sched_clock_data *data; > > > + int cpu = smp_processor_id(); > > > + > > > + rcu_read_lock(); > > > + data = rcu_dereference(per_cpu(cpu_sched_clock_data, cpu)); > > > + > > > + if (unlikely(!data)) > > > + goto out; > > > + > > > + ns = ((cyc - data->base_cycles) * data->mult) >> CYC2NS_SCALE_FACTOR; > > > + ns += data->accumulated_ns; > > > +out: > > > + rcu_read_unlock(); > > > + return ns; > > > +} > > > > > > The way I read that we're still not wrapping properly if freq scaling > > > 'never' happens. > > > > Right, this doesn't address the mult overflow behavior. As I mentioned > > in the patch that the rework allows for solving that in the future using > > a (possibly very rare) timer that would accumulate cycles to ns. > > > > This rework just really addresses the multiplication overflow->negative > > roll under that currently occurs with the cyc2ns_offset value. > > > > > Because then we're wrapping on accumulated_ns + 2^54. > > > > > > Something like resetting base, and adding ns to accumulated_ns and > > > returning the latter would make more sense.
> > > > Although we have to update the base_cycles and accumulated_ns > > atomically, so it's probably not something to do in the sched_clock path. > > Ping, what's going on with this bug? Systems are crashing so we need > a quick fix ASAP ... I think Peter's patch disabling sched_clock_stable is a good approach for now. And just to clarify a bit here, while there was a related scheduler division-by-zero issue which to my understanding has already been fixed post-2.6.32.21, I have not actually seen any other crash logs connected to the overflow. There have been posted softlockup watchdog false-positive messages (which I have also reproduced), but I've not seen any details on actual crashes nor have I been able to reproduce them using my forced-overflow patch. This isn't to say that the overflow isn't causing crashes, but that the reports have not been clear that there have been crashes by something other than the div-by-zero issue. thanks -john ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [stable] 2.6.32.21 - uptime related crashes? 2011-04-29 10:02 ` Nikola Ciprich 2011-04-30 9:36 ` Willy Tarreau @ 2011-05-06 3:12 ` Hidetoshi Seto 1 sibling, 0 replies; 58+ messages in thread From: Hidetoshi Seto @ 2011-05-06 3:12 UTC (permalink / raw) To: Nikola Ciprich Cc: Willy Tarreau, linux-kernel mlist, linux-stable mlist, Hervé Commowick Hi Nikola, Sorry for not replying sooner. (2011/04/29 19:02), Nikola Ciprich wrote: > (another CC added) > > Hello Willy! > > I made some statistics of our servers regarding kernel version and uptime. > Here are some of my thoughts: > - I'm 100% sure this problem wasn't present in kernels <= 2.6.30.x (we've got a lot of boxes with uptimes >600days) > - I'm 90% sure this problem also wasn't present in 2.6.32.16 (we've got 6 boxes running for 235 to 280days) > > What I'm not sure about is whether this is present in 2.6.32.19; I have: > 2 boxes running 2.6.32.19 for 238days and one 2.6.32.20 for 216days. > I also have a bunch of 2.6.32.23 boxes, which are now getting close to 200days uptime. > But I suspect this really is the first problematic version, more on it later. > First, regarding Your question about CONFIG_HZ - we use the 250HZ setting, which leads me to the following: > 250 * 60 * 60 * 24 * 199 = 4298400000, which is a value a little over 2**32! So maybe some unsigned long variable > might overflow? Does this make sense? > > And as to my suspicion about 2.6.32.19, there is one commit which maybe is related: > > commit 0cf55e1ec08bb5a22e068309e2d8ba1180ab4239 > Author: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> > Date: Wed Dec 2 17:28:07 2009 +0900 > > sched, cputime: Introduce thread_group_times() > > This is a real fix for problem of utime/stime values decreasing > described in the thread: > > http://lkml.org/lkml/2009/11/3/522 > > Now cputime is accounted in the following way: > > - {u,s}time in task_struct are increased every time when the thread > is interrupted by a tick (timer interrupt).
> > - When a thread exits, its {u,s}time are added to signal->{u,s}time, > after adjusted by task_times(). > > - When all threads in a thread_group exits, accumulated {u,s}time > (and also c{u,s}time) in signal struct are added to c{u,s}time > in signal struct of the group's parent. > . > . > . > > I haven't studied this in detail yet, but it seems to me it might really be related. Hidetoshi-san - do You have some opinion about this? > Could this somehow either create or invoke the problem with overflow of some variable which would lead to division by zero or similar problems? No. The commit you pointed to is a change for runtimes (cputime_t) accounted for threads, not for uptime/jiffies/tick. And I suppose any overflow/zero-div cannot be there: if (total) { : do_div(temp, total); : } : p->prev_utime = max(p->prev_utime, utime); > > Any other thoughts? > > best regards > > nik From a glance of diff v2.6.32.16..v2.6.32.23, tick_nohz_* could be another suspect. Humm... Thanks, H.Seto ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: [stable] 2.6.32.21 - uptime related crashes? 2011-04-28 18:34 ` [stable] " Willy Tarreau 2011-04-29 10:02 ` Nikola Ciprich @ 2011-05-13 22:08 ` Nicolas Carlier 1 sibling, 0 replies; 58+ messages in thread From: Nicolas Carlier @ 2011-05-13 22:08 UTC (permalink / raw) To: Willy Tarreau Cc: Nikola Ciprich, linux-kernel mlist, linux-stable mlist, Hervé Commowick Hi Willy, On Thu, Apr 28, 2011 at 8:34 PM, Willy Tarreau <w@1wt.eu> wrote: > Hello Nikola, > > On Thu, Apr 28, 2011 at 10:26:25AM +0200, Nikola Ciprich wrote: >> Hello everybody, >> >> I'm trying to solve a strange issue: today, my fourth machine running 2.6.32.21 just crashed. What makes the cases similar, apart from the same kernel version, is that all boxes had very similar uptimes: 214, 216, 216, and 224 days. This might just be a coincidence, but I think this might be important. > > Interestingly, one of our customers just had two machines who crashed > yesterday after 212 days and 212+20h respectively. They were running > debian's 2.6.32-bpo.5-amd64 which is based on 2.6.32.23 AIUI. > > The crash looks very similar to the following bug which we have updated : > > https://bugzilla.kernel.org/show_bug.cgi?id=16991 > > (bugzilla doesn't appear to respond as I'm posting this mail). > > The top of your output is missing. In our case as in the reports on the bug > above, there was a divide by zero error. Did you happen to spot this one > too, or do you just not know ? I observe "divide_error+0x15/0x20" in one > of your reports, so it's possible that it matches the same pattern at least > for one trace. Just in case, it would be nice to feed the bugzilla entry > above. > >> Unfortunately I only have backtraces of two crashes (and those are trimmed, sorry), and they do not look as similar as I'd like, but still maybe there is something in common: >> >> [<ffffffff81120cc7>] pollwake+0x57/0x60 >> [<ffffffff81046720>] ?
default_wake_function+0x0/0x10 >> [<ffffffff8103683a>] __wake_up_common+0x5a/0x90 >> [<ffffffff8103a313>] __wake_up+0x43/0x70 >> [<ffffffffa0321573>] process_masterspan+0x643/0x670 [dahdi] >> [<ffffffffa0326595>] coretimer_func+0x135/0x1d0 [dahdi] >> [<ffffffff8105d74d>] run_timer_softirq+0x15d/0x320 >> [<ffffffffa0326460>] ? coretimer_func+0x0/0x1d0 [dahdi] >> [<ffffffff8105690c>] __do_softirq+0xcc/0x220 >> [<ffffffff8100c40c>] call_softirq+0x1c/0x30 >> [<ffffffff8100e3ba>] do_softirq+0x4a/0x80 >> [<ffffffff810567c7>] irq_exit+0x87/0x90 >> [<ffffffff8100d7b7>] do_IRQ+0x77/0xf0 >> [<ffffffff8100bc53>] ret_from_intr+0x0/0xa >> <EOI> [<ffffffffa019e556>] ? acpi_idle_enter_bm+0x273/0x2a1 [processor] >> [<ffffffffa019e54c>] ? acpi_idle_enter_bm+0x269/0x2a1 [processor] >> [<ffffffff81280095>] ? cpuidle_idle_call+0xa5/0x150 >> [<ffffffff8100a18f>] ? cpu_idle+0x4f/0x90 >> [<ffffffff81323c95>] ? rest_init+0x75/0x80 >> [<ffffffff81582d7f>] ? start_kernel+0x2ef/0x390 >> [<ffffffff81582271>] ? x86_64_start_reservations+0x81/0xc0 >> [<ffffffff81582386>] ? x86_64_start_kernel+0xd6/0x100 >> >> this box (actually two of the crashed ones) is using dahdi_dummy module to generate timing for asterisk SW pbx, so maybe it's related to it. >> >> >> [<ffffffff810a5063>] handle_IRQ_event+0x63/0x1c0 >> [<ffffffff810a71ae>] handle_edge_irq+0xce/0x160 >> [<ffffffff8100e1bf>] handle_irq+0x1f/0x30 >> [<ffffffff8100d7ae>] do_IRQ+0x6e/0xf0 >> [<ffffffff8100bc53>] ret_from_intr+0x0/0xa >> <EOI> [<ffffffff8133?f?f>] ? _spin_unlock_irq+0xf/0x40 >> [<ffffffff81337f79>] ? _spin_unlock_irq+0x9/0x40 >> [<ffffffff81064b9a>] ? exit_signals+0x8a/0x130 >> [<ffffffff8105372e>] ? do_exit+0x7e/0x7d0 >> [<ffffffff8100f8a7>] ? oops_end+0xa7/0xb0 >> [<ffffffff8100faa6>] ? die+0x56/0x90 >> [<ffffffff8100c810>] ? do_trap+0x130/0x150 >> [<ffffffff8100ccca>] ? do_divide_error+0x8a/0xa0 >> [<ffffffff8103d227>] ? find_busiest_group+0x3d7/0xa00 >> [<ffffffff8104400b>] ?
cpuacct_charge+0x6b/0x90 >> [<ffffffff8100c045>] ? divide_error+0x15/0x20 >> [<ffffffff8103d227>] ? find_busiest_group+0x3d7/0xa00 >> [<ffffffff8103cfff>] ? find_busiest_group+0x1af/0xa00 >> [<ffffffff81335483>] ? thread_return+0x4ce/0x7bb >> [<ffffffff8133bec5>] ? do_nanosleep+0x75/0x30 >> [<ffffffff810?1?4e>] ? hrtimer_nanosleep+0x9e/0x120 >> [<ffffffff810?08f0>] ? hrtimer_wakeup+0x0/0x30 >> [<ffffffff810?183f>] ? sys_nanosleep+0x6f/0x80 >> >> another two don't use it. only similarity I see here is that it seems to be IRQ handling related, but both issues don't have anything in common. >> Does anybody have an idea on where I should look? Of course I should update all those boxes to (at least) latest 2.6.32.x, and I'll do it for sure, but still I'd first like to know where the problem was, and if it has been fixed, or how to fix it... >> I'd be grateful for any help... > > There were quite a bunch of scheduler updates recently. We may be lucky and > hope for the bug to have vanished with the changes, but we may as well see > the same crash in 7 months :-/ > > My coworker Hervé (CC'd) who worked on the issue suggests that we might have > something which goes wrong past a certain uptime (eg: 212 days), which needs > a special event to be triggered (I/O, process exiting, etc...). I think this > makes quite some sense. > > Could you check your CONFIG_HZ so that we could convert those uptimes to > jiffies ? Maybe this will ring a bell in someone's head :-/ > We had encountered the same issue on many nodes of our cluster, which ran a 2.6.32.8 Debian kernel. All the servers which had crashed had almost the same uptime, more than 200 days, but those which didn't crash had the same uptime too. Each time, we had the "divide by zero" in "find_busiest_group". One explanation could be the difference in the number of tasks since boot. As the servers fell one by one, and as we were not able to reproduce the problem quickly, we used the patch provided by Andrew Dickinson.
Regards, -- Nicolas Carlier ^ permalink raw reply [flat|nested] 58+ messages in thread
end of thread, other threads:[~2011-12-05 16:56 UTC | newest] Thread overview: 58+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2011-04-28 8:26 2.6.32.21 - uptime related crashes? Nikola Ciprich 2011-04-28 18:34 ` [stable] " Willy Tarreau 2011-04-29 10:02 ` Nikola Ciprich 2011-04-30 9:36 ` Willy Tarreau 2011-04-30 11:22 ` Henrique de Moraes Holschuh 2011-04-30 11:54 ` Willy Tarreau 2011-04-30 12:32 ` Henrique de Moraes Holschuh 2011-04-30 12:02 ` Nikola Ciprich 2011-04-30 15:57 ` Greg KH 2011-04-30 16:08 ` Randy Dunlap 2011-04-30 16:49 ` Willy Tarreau 2011-04-30 18:14 ` Henrique de Moraes Holschuh 2011-04-30 17:39 ` Faidon Liambotis 2011-04-30 20:14 ` Willy Tarreau 2011-05-14 19:04 ` Nikola Ciprich 2011-05-14 20:45 ` Willy Tarreau 2011-05-14 20:59 ` Ben Hutchings 2011-05-14 23:13 ` Nicolas Carlier 2011-05-15 22:56 ` Faidon Liambotis 2011-05-16 6:49 ` Apollon Oikonomopoulos 2011-06-28 2:25 ` john stultz 2011-06-28 5:17 ` Willy Tarreau 2011-06-28 6:19 ` Apollon Oikonomopoulos 2011-07-06 6:15 ` Andrew Morton 2011-07-12 1:18 ` MINOURA Makoto / 箕浦 真 2011-07-12 1:40 ` john stultz 2011-07-12 2:49 ` MINOURA Makoto / 箕浦 真 2011-07-12 4:19 ` Willy Tarreau 2011-07-15 0:35 ` john stultz 2011-07-15 8:30 ` Peter Zijlstra 2011-07-15 10:02 ` Peter Zijlstra 2011-07-15 18:03 ` john stultz 2011-07-15 10:01 ` Peter Zijlstra 2011-07-15 17:59 ` john stultz 2011-07-21 7:22 ` Ingo Molnar 2011-07-21 12:24 ` Peter Zijlstra 2011-07-21 12:50 ` Nikola Ciprich 2011-07-21 12:53 ` Peter Zijlstra 2011-07-21 18:45 ` Ingo Molnar 2011-07-21 19:32 ` Nikola Ciprich 2011-08-25 18:56 ` Faidon Liambotis 2011-08-30 22:38 ` [stable] " Greg KH 2011-09-04 23:26 ` Faidon Liambotis 2011-10-23 18:31 ` Ruben Kerkhof 2011-10-23 22:07 ` Greg KH 2011-10-25 22:44 ` john stultz 2011-10-25 23:25 ` Willy Tarreau 2011-12-02 23:45 ` Greg KH 2011-12-03 0:02 ` john stultz 2011-12-03 1:02 ` Greg KH 2011-12-03 7:00 ` Willy Tarreau 2011-12-05 16:53 ` Ingo Molnar 2011-10-26 
18:21 ` Ruben Kerkhof 2011-07-21 19:25 ` Nikola Ciprich 2011-07-21 19:37 ` john stultz 2011-07-21 19:53 ` john stultz 2011-05-06 3:12 ` [stable] " Hidetoshi Seto 2011-05-13 22:08 ` Nicolas Carlier
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox