* [patch] softlockup watchdog: fix Xen bogosity
@ 2007-07-17 11:44 Ingo Molnar
2007-07-17 14:17 ` Jeremy Fitzhardinge
0 siblings, 1 reply; 40+ messages in thread
From: Ingo Molnar @ 2007-07-17 11:44 UTC (permalink / raw)
To: linux-kernel
Cc: Andrew Morton, Linus Torvalds, Jeremy Fitzhardinge, stable,
Greg KH, Chris Wright
Subject: softlockup: fix Xen bogosity
From: Ingo Molnar <mingo@elte.hu>
this Xen related commit:
commit 966812dc98e6a7fcdf759cbfa0efab77500a8868
Author: Jeremy Fitzhardinge <jeremy@goop.org>
Date: Tue May 8 00:28:02 2007 -0700
Ignore stolen time in the softlockup watchdog
broke the softlockup watchdog to never report any lockups. (!)
print_timestamp defaults to 0, this makes the following condition
always true:
if (print_timestamp < (touch_timestamp + 1) ||
and we'll never report soft lockups.
apparently the functionality of the soft lockup watchdog was never
actually tested with that patch applied ...
[ this is -stable material too. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
kernel/softlockup.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
Index: linux/kernel/softlockup.c
===================================================================
--- linux.orig/kernel/softlockup.c
+++ linux/kernel/softlockup.c
@@ -79,10 +79,11 @@ void softlockup_tick(void)
print_timestamp = per_cpu(print_timestamp, this_cpu);
/* report at most once a second */
- if (print_timestamp < (touch_timestamp + 1) ||
- did_panic ||
- !per_cpu(watchdog_task, this_cpu))
+ if ((print_timestamp >= touch_timestamp &&
+ print_timestamp < (touch_timestamp + 1)) ||
+ did_panic || !per_cpu(watchdog_task, this_cpu)) {
return;
+ }
/* do not print during early bootup: */
if (unlikely(system_state != SYSTEM_RUNNING)) {
^ permalink raw reply [flat|nested] 40+ messages in thread* Re: [patch] softlockup watchdog: fix Xen bogosity 2007-07-17 11:44 [patch] softlockup watchdog: fix Xen bogosity Ingo Molnar @ 2007-07-17 14:17 ` Jeremy Fitzhardinge 2007-07-17 15:49 ` [patch] fix the softlockup watchdog to actually work Ingo Molnar 0 siblings, 1 reply; 40+ messages in thread From: Jeremy Fitzhardinge @ 2007-07-17 14:17 UTC (permalink / raw) To: Ingo Molnar Cc: linux-kernel, Andrew Morton, Linus Torvalds, stable, Greg KH, Chris Wright Ingo Molnar wrote: > Subject: softlockup: fix Xen bogosity > From: Ingo Molnar <mingo@elte.hu> > > this Xen related commit: > Well, not just Xen. It relates to any virtual environment: kvm, lguest, vmi, xen... (Not that they all implement a measure of unstolen time.) How about a more descriptive patch title, along the lines of "softlockup watchdog: fix rate limiting"? > commit 966812dc98e6a7fcdf759cbfa0efab77500a8868 > Author: Jeremy Fitzhardinge <jeremy@goop.org> > Date: Tue May 8 00:28:02 2007 -0700 > > Ignore stolen time in the softlockup watchdog > > broke the softlockup watchdog to never report any lockups. (!) > > print_timestamp defaults to 0, this makes the following condition > always true: > > if (print_timestamp < (touch_timestamp + 1) || > > and we'll never report soft lockups. > > apparently the functionality of the soft lockup watchdog was never > actually tested with that patch applied ... > > [ this is -stable material too. ] > > Signed-off-by: Ingo Molnar <mingo@elte.hu> > --- > kernel/softlockup.c | 7 ++++--- > 1 file changed, 4 insertions(+), 3 deletions(-) > > Index: linux/kernel/softlockup.c > =================================================================== > --- linux.orig/kernel/softlockup.c > +++ linux/kernel/softlockup.c > @@ -79,10 +79,11 @@ void softlockup_tick(void) > print_timestamp = per_cpu(print_timestamp, this_cpu); > > /* report at most once a second */ > - if (print_timestamp < (touch_timestamp + 1) || > - did_panic || > - !per_cpu(watchdog_task, this_cpu)) > + if ((print_timestamp >= touch_timestamp && > + print_timestamp < (touch_timestamp + 1)) || > + did_panic || !per_cpu(watchdog_task, this_cpu)) { > return; > + } > OK, thanks. Acked-by: Jeremy Fitzhardinge <jeremy@xensource.com> J ^ permalink raw reply [flat|nested] 40+ messages in thread
* [patch] fix the softlockup watchdog to actually work 2007-07-17 14:17 ` Jeremy Fitzhardinge @ 2007-07-17 15:49 ` Ingo Molnar 2007-07-17 17:03 ` Randy Dunlap ` (3 more replies) 0 siblings, 4 replies; 40+ messages in thread From: Ingo Molnar @ 2007-07-17 15:49 UTC (permalink / raw) To: Jeremy Fitzhardinge Cc: linux-kernel, Andrew Morton, Linus Torvalds, stable, Greg KH, Chris Wright * Jeremy Fitzhardinge <jeremy@goop.org> wrote: > Ingo Molnar wrote: > > Subject: softlockup: fix Xen bogosity > > From: Ingo Molnar <mingo@elte.hu> > > > > this Xen related commit: > > > > Well, not just Xen. It relates to any virtual environment: kvm, > lguest, vmi, xen... (Not that they all implement a measure of > unstolen time.) > > How about a more descriptive patch title, along the lines of > "softlockup watchdog: fix rate limiting"? uhm, the problem was that it did not work _at all_, not something about 'rate limiting'. Yes, i got quite a bit grumpy when i found this, because you completely broke the softlockup watchdog via a pretty intrusive commit and you apparently didnt even do a minimal check whether its functionality was preserved! Updated patch for Andrew/Linus and for -stable attached. Ingo -----------------------------> Subject: fix the softlockup watchdog to actually work From: Ingo Molnar <mingo@elte.hu> this Xen related commit: commit 966812dc98e6a7fcdf759cbfa0efab77500a8868 Author: Jeremy Fitzhardinge <jeremy@goop.org> Date: Tue May 8 00:28:02 2007 -0700 Ignore stolen time in the softlockup watchdog broke the softlockup watchdog to never report any lockups. (!) print_timestamp defaults to 0, this makes the following condition always true: if (print_timestamp < (touch_timestamp + 1) || and we'll in essence never report soft lockups. apparently the functionality of the soft lockup watchdog was never actually tested with that patch applied ... [this is -stable material too.] Signed-off-by: Ingo Molnar <mingo@elte.hu> --- kernel/softlockup.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) Index: linux/kernel/softlockup.c =================================================================== --- linux.orig/kernel/softlockup.c +++ linux/kernel/softlockup.c @@ -79,10 +79,11 @@ void softlockup_tick(void) print_timestamp = per_cpu(print_timestamp, this_cpu); /* report at most once a second */ - if (print_timestamp < (touch_timestamp + 1) || - did_panic || - !per_cpu(watchdog_task, this_cpu)) + if ((print_timestamp >= touch_timestamp && + print_timestamp < (touch_timestamp + 1)) || + did_panic || !per_cpu(watchdog_task, this_cpu)) { return; + } /* do not print during early bootup: */ if (unlikely(system_state != SYSTEM_RUNNING)) { ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [patch] fix the softlockup watchdog to actually work 2007-07-17 15:49 ` [patch] fix the softlockup watchdog to actually work Ingo Molnar @ 2007-07-17 17:03 ` Randy Dunlap 2007-07-17 17:25 ` Ingo Molnar 2007-07-17 18:14 ` Linus Torvalds 2007-07-19 7:22 ` Andrew Morton ` (2 subsequent siblings) 3 siblings, 2 replies; 40+ messages in thread From: Randy Dunlap @ 2007-07-17 17:03 UTC (permalink / raw) To: Ingo Molnar Cc: Jeremy Fitzhardinge, linux-kernel, Andrew Morton, Linus Torvalds, stable, Greg KH, Chris Wright On Tue, 17 Jul 2007 17:49:34 +0200 Ingo Molnar wrote: > > * Jeremy Fitzhardinge <jeremy@goop.org> wrote: > > > Ingo Molnar wrote: > > > Subject: softlockup: fix Xen bogosity > > > From: Ingo Molnar <mingo@elte.hu> > > > > > > this Xen related commit: > > > > > > > Well, not just Xen. It relates to any virtual environment: kvm, > > lguest, vmi, xen... (Not that they all implement a measure of > > unstolen time.) > > > > How about a more descriptive patch title, along the lines of > > "softlockup watchdog: fix rate limiting"? > > uhm, the problem was that it did not work _at all_, not something about > 'rate limiting'. Yes, i got quite a bit grumpy when i found this, > because you completely broke the softlockup watchdog via a pretty > intrusive commit and you apparently didnt even do a minimal check > whether its functionality was preserved! Updated patch for Andrew/Linus > and for -stable attached. > > Ingo > > -----------------------------> > Subject: fix the softlockup watchdog to actually work > From: Ingo Molnar <mingo@elte.hu> > > this Xen related commit: > > commit 966812dc98e6a7fcdf759cbfa0efab77500a8868 > Author: Jeremy Fitzhardinge <jeremy@goop.org> > Date: Tue May 8 00:28:02 2007 -0700 > > Ignore stolen time in the softlockup watchdog > > broke the softlockup watchdog to never report any lockups. (!) > > print_timestamp defaults to 0, this makes the following condition > always true: > > if (print_timestamp < (touch_timestamp + 1) || > > and we'll in essence never report soft lockups. > > apparently the functionality of the soft lockup watchdog was never > actually tested with that patch applied ... > > [this is -stable material too.] > > Signed-off-by: Ingo Molnar <mingo@elte.hu> > --- > kernel/softlockup.c | 7 ++++--- > 1 file changed, 4 insertions(+), 3 deletions(-) > > Index: linux/kernel/softlockup.c > =================================================================== > --- linux.orig/kernel/softlockup.c > +++ linux/kernel/softlockup.c > @@ -79,10 +79,11 @@ void softlockup_tick(void) > print_timestamp = per_cpu(print_timestamp, this_cpu); > > /* report at most once a second */ > - if (print_timestamp < (touch_timestamp + 1) || > - did_panic || > - !per_cpu(watchdog_task, this_cpu)) > + if ((print_timestamp >= touch_timestamp && > + print_timestamp < (touch_timestamp + 1)) || > + did_panic || !per_cpu(watchdog_task, this_cpu)) { > return; > + } > > /* do not print during early bootup: */ > if (unlikely(system_state != SYSTEM_RUNNING)) { patch contains unneeded braces { }. --- ~Randy *** Remember to use Documentation/SubmitChecklist when testing your code *** ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [patch] fix the softlockup watchdog to actually work 2007-07-17 17:03 ` Randy Dunlap @ 2007-07-17 17:25 ` Ingo Molnar 2007-07-17 18:14 ` Linus Torvalds 1 sibling, 0 replies; 40+ messages in thread From: Ingo Molnar @ 2007-07-17 17:25 UTC (permalink / raw) To: Randy Dunlap Cc: Jeremy Fitzhardinge, linux-kernel, Andrew Morton, Linus Torvalds, stable, Greg KH, Chris Wright * Randy Dunlap <randy.dunlap@oracle.com> wrote: > > + if ((print_timestamp >= touch_timestamp && > > + print_timestamp < (touch_timestamp + 1)) || > > + did_panic || !per_cpu(watchdog_task, this_cpu)) { > > return; > > + } > > > > /* do not print during early bootup: */ > > if (unlikely(system_state != SYSTEM_RUNNING)) { > > patch contains unneeded braces { }. for multi-line conditions like the above they can be used. Ingo ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [patch] fix the softlockup watchdog to actually work 2007-07-17 17:03 ` Randy Dunlap 2007-07-17 17:25 ` Ingo Molnar @ 2007-07-17 18:14 ` Linus Torvalds 2007-07-17 21:38 ` Randy Dunlap 1 sibling, 1 reply; 40+ messages in thread From: Linus Torvalds @ 2007-07-17 18:14 UTC (permalink / raw) To: Randy Dunlap Cc: Ingo Molnar, Jeremy Fitzhardinge, linux-kernel, Andrew Morton, stable, Greg KH, Chris Wright On Tue, 17 Jul 2007, Randy Dunlap wrote: > > > + if ((print_timestamp >= touch_timestamp && > > + print_timestamp < (touch_timestamp + 1)) || > > + did_panic || !per_cpu(watchdog_task, this_cpu)) { > > return; > > + } > > > > /* do not print during early bootup: */ > > if (unlikely(system_state != SYSTEM_RUNNING)) { > > patch contains unneeded braces { }. When there are issues with indentation, those braces are actually not unneeded any more, except for the compiler. Just _look_ at the code. The indentation is not obvious, because the if-conditional itself is multiple lines, and indented (arguably wrongly so too, but that's another issue). So it's no longer a trivial one-liner statement, it's a "multi-statement" spread out over multiple lines, and I think the braces are actually a good idea for things like that. I also encourage people do do braces when you have nested indentation, ie if (something) if (somethingelse) return; is actively *wrong*, while if (something) { if (somethingelse) return; } is right, even though the braces are "unnecessary". Again, it's about the visual representation, not about whether the compiler needs them or not. Linus ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [patch] fix the softlockup watchdog to actually work 2007-07-17 18:14 ` Linus Torvalds @ 2007-07-17 21:38 ` Randy Dunlap 0 siblings, 0 replies; 40+ messages in thread From: Randy Dunlap @ 2007-07-17 21:38 UTC (permalink / raw) To: Linus Torvalds Cc: Ingo Molnar, Jeremy Fitzhardinge, linux-kernel, Andrew Morton, stable, Greg KH, Chris Wright On Tue, 17 Jul 2007 11:14:51 -0700 (PDT) Linus Torvalds wrote: > On Tue, 17 Jul 2007, Randy Dunlap wrote: > > > > > + if ((print_timestamp >= touch_timestamp && > > > + print_timestamp < (touch_timestamp + 1)) || > > > + did_panic || !per_cpu(watchdog_task, this_cpu)) { > > > return; > > > + } > > > > > > /* do not print during early bootup: */ > > > if (unlikely(system_state != SYSTEM_RUNNING)) { > > > > patch contains unneeded braces { }. > > When there are issues with indentation, those braces are actually not > unneeded any more, except for the compiler. > > Just _look_ at the code. The indentation is not obvious, because the > if-conditional itself is multiple lines, and indented (arguably wrongly so > too, but that's another issue). I strongly agree with your parenthetical remark and think that is the real problem. > So it's no longer a trivial one-liner statement, it's a "multi-statement" > spread out over multiple lines, and I think the braces are actually a good > idea for things like that. > > I also encourage people do do braces when you have nested indentation, ie > > if (something) > if (somethingelse) > return; I think that this is wrong only when there is an "else" branch following this. > is actively *wrong*, while > > if (something) { > if (somethingelse) > return; > } > > is right, even though the braces are "unnecessary". Again, it's about the > visual representation, not about whether the compiler needs them or not. I agree with your last statement. And thankfully none of this is in CodingStyle. (oops) --- ~Randy *** Remember to use Documentation/SubmitChecklist when testing your code *** ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [patch] fix the softlockup watchdog to actually work 2007-07-17 15:49 ` [patch] fix the softlockup watchdog to actually work Ingo Molnar 2007-07-17 17:03 ` Randy Dunlap @ 2007-07-19 7:22 ` Andrew Morton 2007-07-19 7:46 ` Ingo Molnar 2007-07-19 7:51 ` Ingo Molnar 2007-07-25 8:49 ` [patch] fix the softlockup watchdog to actually work Andrew Morton 2007-10-03 23:49 ` Yinghai Lu 3 siblings, 2 replies; 40+ messages in thread From: Andrew Morton @ 2007-07-19 7:22 UTC (permalink / raw) To: Ingo Molnar Cc: Jeremy Fitzhardinge, linux-kernel, Linus Torvalds, stable, Greg KH, Chris Wright On Tue, 17 Jul 2007 17:49:34 +0200 Ingo Molnar <mingo@elte.hu> wrote: > Subject: fix the softlockup watchdog to actually work > From: Ingo Molnar <mingo@elte.hu> > > this Xen related commit: > > commit 966812dc98e6a7fcdf759cbfa0efab77500a8868 > Author: Jeremy Fitzhardinge <jeremy@goop.org> > Date: Tue May 8 00:28:02 2007 -0700 > > Ignore stolen time in the softlockup watchdog > > broke the softlockup watchdog to never report any lockups. (!) > > print_timestamp defaults to 0, this makes the following condition > always true: > > if (print_timestamp < (touch_timestamp + 1) || > > and we'll in essence never report soft lockups. > > apparently the functionality of the soft lockup watchdog was never > actually tested with that patch applied ... > > [this is -stable material too.] This seems terribly sensitive. Someone has broken the Vaio (shock, horror). It now has mysterious jerkiness: when leaning on autorepeat it stalls for maybe 0.25 seconds every 1.5 seconds. The stalls are far less than a second. Yet this is enough to trigger random softlockup warnings. Some of those warnings are below. Note that the traces are all pretty useless, as softlockup warnings so often seem to be. Of course, it could be that whatever is causing these pauses really _is_ stalling for a whole second occasionally, dunno. But I didn't notice any long stalls in the console output when a particular storm of softlockup warnings came out. But I'll sit on this patch for a while until this gets sorted out. Meanwhile, please double-check the elapsed-time arithmetic in there, maybe do a bit of runtime testing? [ 78.820961] BUG: soft lockup detected on CPU#0! [ 78.821083] [<c0122475>] update_process_times+0x32/0x54 [ 78.821216] [<c012fe7a>] tick_sched_timer+0x61/0x9c [ 78.821340] [<c012c2e7>] hrtimer_interrupt+0x142/0x1d4 [ 78.821463] [<c012fe19>] tick_sched_timer+0x0/0x9c [ 78.821587] [<c012f74a>] tick_do_broadcast+0x1f/0x3f [ 78.821707] [<c012fa01>] tick_handle_oneshot_broadcast+0x47/0x72 [ 78.821852] [<c01067ca>] timer_interrupt+0x1a/0x20 [ 78.821968] [<c014291e>] handle_IRQ_event+0x1a/0x3f [ 78.822089] [<c0143521>] handle_edge_irq+0x9d/0xcc [ 78.822206] [<c0105d7b>] do_IRQ+0x53/0x6c [ 78.822307] [<c012f4f0>] tick_notify+0x15c/0x208 [ 78.822422] [<c01044cf>] common_interrupt+0x23/0x28 [ 78.822539] [<c012f1d4>] clockevents_notify+0x8/0x36 [ 78.822663] [<c020d199>] acpi_processor_idle+0x1d2/0x36d [ 78.822798] [<c0102345>] cpu_idle+0x44/0x5e [ 78.822900] [<c03baa8d>] start_kernel+0x26d/0x275 [ 78.823017] [<c03ba3fe>] unknown_bootoption+0x0/0x202 [ 78.823142] ======================= [ 106.282830] BUG: soft lockup detected on CPU#0! [ 106.282967] [<c0122475>] update_process_times+0x32/0x54 [ 106.283116] [<c012fe7a>] tick_sched_timer+0x61/0x9c [ 106.283255] [<c012c2e7>] hrtimer_interrupt+0x142/0x1d4 [ 106.283391] [<c012fe19>] tick_sched_timer+0x0/0x9c [ 106.283530] [<c012f74a>] tick_do_broadcast+0x1f/0x3f [ 106.283663] [<c012fa01>] tick_handle_oneshot_broadcast+0x47/0x72 [ 106.283821] [<c01067ca>] timer_interrupt+0x1a/0x20 [ 106.283949] [<c014291e>] handle_IRQ_event+0x1a/0x3f [ 106.284084] [<c0143521>] handle_edge_irq+0x9d/0xcc [ 106.284215] [<c0105d7b>] do_IRQ+0x53/0x6c [ 106.284326] [<c012f4f0>] tick_notify+0x15c/0x208 [ 106.284455] [<c01044cf>] common_interrupt+0x23/0x28 [ 106.284587] [<c012f1d4>] clockevents_notify+0x8/0x36 [ 106.284725] [<c020d199>] acpi_processor_idle+0x1d2/0x36d [ 106.284875] [<c0102345>] cpu_idle+0x44/0x5e [ 106.284988] [<c03baa8d>] start_kernel+0x26d/0x275 [ 106.285117] [<c03ba3fe>] unknown_bootoption+0x0/0x202 [ 106.285257] ======================= [ 109.266423] BUG: soft lockup detected on CPU#0! [ 109.266558] [<c0122475>] update_process_times+0x32/0x54 [ 109.266703] [<c012fe7a>] tick_sched_timer+0x61/0x9c [ 109.270745] [<c012c2e7>] hrtimer_interrupt+0x142/0x1d4 [ 109.274790] [<c012fe19>] tick_sched_timer+0x0/0x9c [ 109.278865] [<c012f74a>] tick_do_broadcast+0x1f/0x3f [ 109.282950] [<c012fa01>] tick_handle_oneshot_broadcast+0x47/0x72 [ 109.287026] [<c01067ca>] timer_interrupt+0x1a/0x20 [ 109.291012] [<c014291e>] handle_IRQ_event+0x1a/0x3f [ 109.294950] [<c0143521>] handle_edge_irq+0x9d/0xcc [ 109.298864] [<c0105d7b>] do_IRQ+0x53/0x6c [ 109.302818] [<c012f4f0>] tick_notify+0x15c/0x208 [ 109.306740] [<c01044cf>] common_interrupt+0x23/0x28 [ 109.310641] [<c012f1d4>] clockevents_notify+0x8/0x36 [ 109.314543] [<c020d199>] acpi_processor_idle+0x1d2/0x36d [ 109.318461] [<c0102345>] cpu_idle+0x44/0x5e [ 109.322348] [<c03baa8d>] start_kernel+0x26d/0x275 [ 109.326267] [<c03ba3fe>] unknown_bootoption+0x0/0x202 [ 109.330188] ======================= (ah, the Vaio breakage seems to be -mm-only, whew) ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [patch] fix the softlockup watchdog to actually work 2007-07-19 7:22 ` Andrew Morton @ 2007-07-19 7:46 ` Ingo Molnar 2007-07-19 7:51 ` Ingo Molnar 1 sibling, 0 replies; 40+ messages in thread From: Ingo Molnar @ 2007-07-19 7:46 UTC (permalink / raw) To: Andrew Morton Cc: Jeremy Fitzhardinge, linux-kernel, Linus Torvalds, stable, Greg KH, Chris Wright * Andrew Morton <akpm@linux-foundation.org> wrote: > > [this is -stable material too.] > > This seems terribly sensitive. > > Someone has broken the Vaio (shock, horror). It now has mysterious > jerkiness: when leaning on autorepeat it stalls for maybe 0.25 seconds > every 1.5 seconds. The stalls are far less than a second. Yet this > is enough to trigger random softlockup warnings. > > Some of those warnings are below. Note that the traces are all pretty > useless, as softlockup warnings so often seem to be. hm, you havent picked up the other softlockup enhancements i did, which make the warnings more useful. Ingo ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [patch] fix the softlockup watchdog to actually work 2007-07-19 7:22 ` Andrew Morton 2007-07-19 7:46 ` Ingo Molnar @ 2007-07-19 7:51 ` Ingo Molnar 2007-07-19 14:31 ` Jeremy Fitzhardinge 1 sibling, 1 reply; 40+ messages in thread From: Ingo Molnar @ 2007-07-19 7:51 UTC (permalink / raw) To: Andrew Morton Cc: Jeremy Fitzhardinge, linux-kernel, Linus Torvalds, stable, Greg KH, Chris Wright * Andrew Morton <akpm@linux-foundation.org> wrote: > But I'll sit on this patch for a while until this gets sorted out. > Meanwhile, please double-check the elapsed-time arithmetic in there, > maybe do a bit of runtime testing? btw., could you apply the patch below as well? Maybe sched_clock() is misbehaving on your box? (with this i have 5 softlockup patches in my tree - and they are working fine so far.) Ingo ----------------> Subject: [patch] softlockup: use a reliable global time source From: Ingo Molnar <mingo@elte.hu> using sched_clock() for the soft-lockups was a bad idea, sched_clock() is not a reliable global time-source. Signed-off-by: Ingo Molnar <mingo@elte.hu> --- kernel/softlockup.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) Index: linux/kernel/softlockup.c =================================================================== --- linux.orig/kernel/softlockup.c +++ linux/kernel/softlockup.c @@ -37,13 +37,11 @@ static struct notifier_block panic_block }; /* - * Returns seconds, approximately. We don't need nanosecond - * resolution, and we don't need to waste time with a big divide when - * 2^30ns == 1.074s. + * Returns seconds. */ static unsigned long get_timestamp(void) { - return sched_clock() >> 30; /* 2^30 ~= 10^9 */ + return jiffies / HZ; } void touch_softlockup_watchdog(void) ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [patch] fix the softlockup watchdog to actually work 2007-07-19 7:51 ` Ingo Molnar @ 2007-07-19 14:31 ` Jeremy Fitzhardinge 2007-07-19 14:35 ` Ingo Molnar 0 siblings, 1 reply; 40+ messages in thread From: Jeremy Fitzhardinge @ 2007-07-19 14:31 UTC (permalink / raw) To: Ingo Molnar Cc: Andrew Morton, linux-kernel, Linus Torvalds, stable, Greg KH, Chris Wright Ingo Molnar wrote: > btw., could you apply the patch below as well? Maybe sched_clock() is > misbehaving on your box? (with this i have 5 softlockup patches in my > tree - and they are working fine so far.) > > Ingo > > ----------------> > Subject: [patch] softlockup: use a reliable global time source > From: Ingo Molnar <mingo@elte.hu> > > using sched_clock() for the soft-lockups was a bad idea, sched_clock() > is not a reliable global time-source. > How reliable does it need to be? All we need is to measure "about 10 seconds"; if we can't get that out of it, how can it be good for anything else? J ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [patch] fix the softlockup watchdog to actually work 2007-07-19 14:31 ` Jeremy Fitzhardinge @ 2007-07-19 14:35 ` Ingo Molnar 2007-07-19 14:40 ` Jeremy Fitzhardinge 2007-07-19 14:46 ` Jeremy Fitzhardinge 0 siblings, 2 replies; 40+ messages in thread From: Ingo Molnar @ 2007-07-19 14:35 UTC (permalink / raw) To: Jeremy Fitzhardinge Cc: Andrew Morton, linux-kernel, Linus Torvalds, stable, Greg KH, Chris Wright * Jeremy Fitzhardinge <jeremy@goop.org> wrote: > How reliable does it need to be? All we need is to measure "about 10 > seconds"; if we can't get that out of it, how can it be good for > anything else? sched_clock(), as its name suggests it, is meant for the scheduler's use. The scheduler generally only needs to measure time when the CPU is busy - not across idle periods. So sched_clock() can (and will) break across certain types of ACPI idle methods. Ingo ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [patch] fix the softlockup watchdog to actually work 2007-07-19 14:35 ` Ingo Molnar @ 2007-07-19 14:40 ` Jeremy Fitzhardinge 2007-07-19 14:46 ` Jeremy Fitzhardinge 1 sibling, 0 replies; 40+ messages in thread From: Jeremy Fitzhardinge @ 2007-07-19 14:40 UTC (permalink / raw) To: Ingo Molnar Cc: Andrew Morton, linux-kernel, Linus Torvalds, stable, Greg KH, Chris Wright Ingo Molnar wrote: >> How reliable does it need to be? All we need is to measure "about 10 >> seconds"; if we can't get that out of it, how can it be good for >> anything else? >> > > sched_clock(), as its name suggests it, is meant for the scheduler's > use. The scheduler generally only needs to measure time when the CPU is > busy - not across idle periods. So sched_clock() can (and will) break > across certain types of ACPI idle methods. > Doesn't that mean it will mis-measure process idle times? Is that a problem? J ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [patch] fix the softlockup watchdog to actually work 2007-07-19 14:35 ` Ingo Molnar 2007-07-19 14:40 ` Jeremy Fitzhardinge @ 2007-07-19 14:46 ` Jeremy Fitzhardinge 2007-07-19 14:50 ` Ingo Molnar 1 sibling, 1 reply; 40+ messages in thread From: Jeremy Fitzhardinge @ 2007-07-19 14:46 UTC (permalink / raw) To: Ingo Molnar Cc: Andrew Morton, linux-kernel, Linus Torvalds, stable, Greg KH, Chris Wright Ingo Molnar wrote: > sched_clock(), as its name suggests it, is meant for the scheduler's > use. The scheduler generally only needs to measure time when the CPU is > busy - not across idle periods. So sched_clock() can (and will) break > across certain types of ACPI idle methods. > Hm, or more specifically, why would that be a problem for softlockup? Do you mean it doesn't measure time during ACPI idle? That would just make it trigger later than it would otherwise. Andrew is reporting that it is triggering very early, and making the machine feel jerky. How can that be related to sched_clock's quality, unless its too broken to be useful to anyone for anything? J ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [patch] fix the softlockup watchdog to actually work 2007-07-19 14:46 ` Jeremy Fitzhardinge @ 2007-07-19 14:50 ` Ingo Molnar 2007-07-19 15:04 ` Jeremy Fitzhardinge 0 siblings, 1 reply; 40+ messages in thread From: Ingo Molnar @ 2007-07-19 14:50 UTC (permalink / raw) To: Jeremy Fitzhardinge Cc: Andrew Morton, linux-kernel, Linus Torvalds, stable, Greg KH, Chris Wright * Jeremy Fitzhardinge <jeremy@goop.org> wrote: > Hm, or more specifically, why would that be a problem for softlockup? > Do you mean it doesn't measure time during ACPI idle? That would just > make it trigger later than it would otherwise. no, the return value after idling can be completely random on some boxes, on a 64-bit scale - triggering the softlockup watchdog randomly. (some boxes return random TSC values, etc.) Again, it's fine for the scheduler's purpose, that's why i named it sched_clock(). the proper clocksource use within the kernel is ktime_get() [or ktime_get_ts()]. Do not abuse sched_clock() for such things. Ingo ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [patch] fix the softlockup watchdog to actually work 2007-07-19 14:50 ` Ingo Molnar @ 2007-07-19 15:04 ` Jeremy Fitzhardinge 2007-07-19 15:09 ` Ingo Molnar 0 siblings, 1 reply; 40+ messages in thread From: Jeremy Fitzhardinge @ 2007-07-19 15:04 UTC (permalink / raw) To: Ingo Molnar Cc: Andrew Morton, linux-kernel, Linus Torvalds, stable, Greg KH, Chris Wright Ingo Molnar wrote: > no, the return value after idling can be completely random on some > boxes, on a 64-bit scale - triggering the softlockup watchdog randomly. > (some boxes return random TSC values, etc.) Again, it's fine for the > scheduler's purpose, that's why i named it sched_clock(). > > the proper clocksource use within the kernel is ktime_get() [or > ktime_get_ts()]. Do not abuse sched_clock() for such things. Well, my observation is that both softlockup and the scheduler really want to measure unstolen time, so it seemed to me that sched_clock was a nice common place to implement that, rather than implementing a whole new time interface. At the time that seemed OK, and nobody had any objections. But it's a bit beside the point unless it does turn out to be making Andrew's Vaio sad. J ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [patch] fix the softlockup watchdog to actually work 2007-07-19 15:04 ` Jeremy Fitzhardinge @ 2007-07-19 15:09 ` Ingo Molnar 2007-07-19 15:21 ` Jeremy Fitzhardinge 0 siblings, 1 reply; 40+ messages in thread From: Ingo Molnar @ 2007-07-19 15:09 UTC (permalink / raw) To: Jeremy Fitzhardinge Cc: Andrew Morton, linux-kernel, Linus Torvalds, stable, Greg KH, Chris Wright * Jeremy Fitzhardinge <jeremy@goop.org> wrote: > Ingo Molnar wrote: > > no, the return value after idling can be completely random on some > > boxes, on a 64-bit scale - triggering the softlockup watchdog randomly. > > (some boxes return random TSC values, etc.) Again, it's fine for the > > scheduler's purpose, that's why i named it sched_clock(). > > > > the proper clocksource use within the kernel is ktime_get() [or > > ktime_get_ts()]. Do not abuse sched_clock() for such things. > > Well, my observation is that both softlockup and the scheduler really > want to measure unstolen time, so it seemed to me that sched_clock was > a nice common place to implement that, rather than implementing a > whole new time interface. At the time that seemed OK, and nobody had > any objections. yeah. But then it should not be using sched_clock() but CFS's new rq_clock() method - which does try to construct a globally valid timesource out of sched_clock(). [that fix is not backportable though] Ingo ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [patch] fix the softlockup watchdog to actually work 2007-07-19 15:09 ` Ingo Molnar @ 2007-07-19 15:21 ` Jeremy Fitzhardinge 2007-07-19 15:42 ` [patch] sched: implement cpu_clock(cpu) high-speed time source Ingo Molnar 0 siblings, 1 reply; 40+ messages in thread From: Jeremy Fitzhardinge @ 2007-07-19 15:21 UTC (permalink / raw) To: Ingo Molnar Cc: Andrew Morton, linux-kernel, Linus Torvalds, stable, Greg KH, Chris Wright Ingo Molnar wrote: >> Well, my observation is that both softlockup and the scheduler really >> want to measure unstolen time, so it seemed to me that sched_clock was >> a nice common place to implement that, rather than implementing a >> whole new time interface. At the time that seemed OK, and nobody had >> any objections. >> > > yeah. But then it should not be using sched_clock() but CFS's new > rq_clock() method - which does try to construct a globally valid > timesource out of sched_clock(). [that fix is not backportable though] > Hm, that doesn't look quite right. Doesn't rq_clock measure time spent running? Unstolen time includes idle time too (it just excludes time in which a VCPU is runnable but not actually running). J ^ permalink raw reply [flat|nested] 40+ messages in thread
* [patch] sched: implement cpu_clock(cpu) high-speed time source 2007-07-19 15:21 ` Jeremy Fitzhardinge @ 2007-07-19 15:42 ` Ingo Molnar 2007-07-19 15:44 ` [patch] sched: implement cpu_clock(cpu) high-speed time source, take #2 Ingo Molnar 0 siblings, 1 reply; 40+ messages in thread From: Ingo Molnar @ 2007-07-19 15:42 UTC (permalink / raw) To: Jeremy Fitzhardinge Cc: Andrew Morton, linux-kernel, Linus Torvalds, stable, Greg KH, Chris Wright, Jens Axboe * Jeremy Fitzhardinge <jeremy@goop.org> wrote: > > yeah. But then it should not be using sched_clock() but CFS's new > > rq_clock() method - which does try to construct a globally valid > > timesource out of sched_clock(). [that fix is not backportable > > though] > > Hm, that doesn't look quite right. Doesn't rq_clock measure time > spent running? Unstolen time includes idle time too (it just excludes > time in which a VCPU is runnable but not actually running). generally rq_clock() also includes idle time, so it should work fine for this purpose. So, what do you think about the patch below - does it suit Xen's purposes? Ingo --------------------> Subject: sched: implement cpu_clock(cpu) high-speed time source From: Ingo Molnar <mingo@elte.hu> Implement the cpu_clock(cpu) interface for kernel-internal use: high-speed (but slightly incorrect) per-cpu clock constructed from sched_clock(). fix up blktrace and softlockup-watchdog to use this new interface. Signed-off-by: Ingo Molnar <mingo@elte.hu> --- block/blktrace.c | 20 ++++++++++---------- include/linux/sched.h | 7 +++++++ kernel/sched.c | 17 +++++++++++++++++ kernel/softlockup.c | 10 ++++++---- 4 files changed, 40 insertions(+), 14 deletions(-) Index: linux/block/blktrace.c =================================================================== --- linux.orig/block/blktrace.c +++ linux/block/blktrace.c @@ -41,7 +41,7 @@ static void trace_note(struct blk_trace const int cpu = smp_processor_id(); t->magic = BLK_IO_TRACE_MAGIC | BLK_IO_TRACE_VERSION; - t->time = sched_clock() - per_cpu(blk_trace_cpu_offset, cpu); + t->time = cpu_clock(cpu) - per_cpu(blk_trace_cpu_offset, cpu); t->device = bt->dev; t->action = action; t->pid = pid; @@ -159,7 +159,7 @@ void __blk_add_trace(struct blk_trace *b t->magic = BLK_IO_TRACE_MAGIC | BLK_IO_TRACE_VERSION; t->sequence = ++(*sequence); - t->time = sched_clock() - per_cpu(blk_trace_cpu_offset, cpu); + t->time = cpu_clock(cpu) - per_cpu(blk_trace_cpu_offset, cpu); t->sector = sector; t->bytes = bytes; t->action = what; @@ -488,17 +488,17 @@ void blk_trace_shutdown(request_queue_t } /* - * Average offset over two calls to sched_clock() with a gettimeofday() + * Average offset over two calls to cpu_clock() with a gettimeofday() * in the middle */ -static void blk_check_time(unsigned long long *t) +static void blk_check_time(unsigned long long *t, int this_cpu) { unsigned long long a, b; struct timeval tv; - a = sched_clock(); + a = cpu_clock(this_cpu); do_gettimeofday(&tv); - b = sched_clock(); + b = cpu_clock(this_cpu); *t = tv.tv_sec * 1000000000 + tv.tv_usec * 1000; *t -= (a + b) / 2; @@ -510,16 +510,16 @@ static void blk_check_time(unsigned long static void blk_trace_check_cpu_time(void *data) { unsigned long long *t; - int cpu = get_cpu(); + int this_cpu = get_cpu(); - t = &per_cpu(blk_trace_cpu_offset, cpu); + t = &per_cpu(blk_trace_cpu_offset, this_cpu); /* * Just call it twice, hopefully the second call will be cache hot * and a little more precise */ - blk_check_time(t); - blk_check_time(t); + blk_check_time(t, this_cpu); + blk_check_time(t, this_cpu); put_cpu(); } Index: linux/include/linux/sched.h =================================================================== --- linux.orig/include/linux/sched.h +++ linux/include/linux/sched.h @@ -1327,6 +1327,13 @@ static inline int set_cpus_allowed(struc #endif extern unsigned long long sched_clock(void); + +/* + * For kernel-internal use: high-speed (but slightly incorrect) per-cpu + * clock constructed from sched_clock(): + */ +extern unsigned long long cpu_clock(int cpu); + extern unsigned long long task_sched_runtime(struct task_struct *task); Index: linux/kernel/sched.c =================================================================== --- linux.orig/kernel/sched.c +++ linux/kernel/sched.c @@ -365,6 +365,23 @@ static inline unsigned long long rq_cloc } /* + * For kernel-internal use: high-speed (but slightly incorrect) per-cpu + * clock constructed from sched_clock(): + */ +unsigned long long cpu_clock(int cpu) +{ + struct rq *rq = cpu_rq(cpu); + unsigned long long now; + unsigned long flags; + + spin_lock_irqsave(&rq->lock, flags); + now = rq_clock(rq); + spin_unlock_irqrestore(&rq->lock, flags); + + return now; +} + +/* * The domain tree (rq->sd) is protected by RCU's quiescent state transition. * See detach_destroy_domains: synchronize_sched for details. * Index: linux/kernel/softlockup.c =================================================================== --- linux.orig/kernel/softlockup.c +++ linux/kernel/softlockup.c @@ -41,14 +41,16 @@ static struct notifier_block panic_block * resolution, and we don't need to waste time with a big divide when * 2^30ns == 1.074s. */ -static unsigned long get_timestamp(void) +static unsigned long get_timestamp(int this_cpu) { - return sched_clock() >> 30; /* 2^30 ~= 10^9 */ + return cpu_clock(this_cpu) >> 30; /* 2^30 ~= 10^9 */ } void touch_softlockup_watchdog(void) { - __raw_get_cpu_var(touch_timestamp) = get_timestamp(); + int this_cpu = raw_smp_processor_id(); + + per_cpu(touch_timestamp, this_cpu) = get_timestamp(this_cpu); } EXPORT_SYMBOL(touch_softlockup_watchdog); @@ -94,7 +96,7 @@ void softlockup_tick(void) return; } - now = get_timestamp(); + now = get_timestamp(this_cpu); /* Wake up the high-prio watchdog task every second: */ if (now > (touch_timestamp + 1)) ^ permalink raw reply [flat|nested] 40+ messages in thread
* [patch] sched: implement cpu_clock(cpu) high-speed time source, take #2 2007-07-19 15:42 ` [patch] sched: implement cpu_clock(cpu) high-speed time source Ingo Molnar @ 2007-07-19 15:44 ` Ingo Molnar 2007-07-19 16:11 ` Jeremy Fitzhardinge 0 siblings, 1 reply; 40+ messages in thread From: Ingo Molnar @ 2007-07-19 15:44 UTC (permalink / raw) To: Jeremy Fitzhardinge Cc: Andrew Morton, linux-kernel, Linus Torvalds, stable, Greg KH, Chris Wright, Jens Axboe * Ingo Molnar <mingo@elte.hu> wrote: > > Hm, that doesn't look quite right. Doesn't rq_clock measure time > > spent running? Unstolen time includes idle time too (it just > > excludes time in which a VCPU is runnable but not actually running). > > generally rq_clock() also includes idle time, so it should work fine > for this purpose. So, what do you think about the patch below - does > it suit Xen's purposes? how about the patch below instead? (which, unlike the first one, happens to build and boot ;-) Ingo --------------> Subject: sched: implement cpu_clock(cpu) high-speed time source From: Ingo Molnar <mingo@elte.hu> Implement the cpu_clock(cpu) interface for kernel-internal use: high-speed (but slightly incorrect) per-cpu clock constructed from sched_clock(). update blktrace and the softlockup-watchdog to use this new interface. Signed-off-by: Ingo Molnar <mingo@elte.hu> --- block/blktrace.c | 20 ++++++++++---------- include/linux/sched.h | 7 +++++++ kernel/sched.c | 17 +++++++++++++++++ kernel/softlockup.c | 10 ++++++---- 4 files changed, 40 insertions(+), 14 deletions(-) Index: linux/block/blktrace.c =================================================================== --- linux.orig/block/blktrace.c +++ linux/block/blktrace.c @@ -41,7 +41,7 @@ static void trace_note(struct blk_trace const int cpu = smp_processor_id(); t->magic = BLK_IO_TRACE_MAGIC | BLK_IO_TRACE_VERSION; - t->time = sched_clock() - per_cpu(blk_trace_cpu_offset, cpu); + t->time = cpu_clock(cpu) - per_cpu(blk_trace_cpu_offset, cpu); t->device = bt->dev; t->action = action; t->pid = pid; @@ -159,7 +159,7 @@ void __blk_add_trace(struct blk_trace *b t->magic = BLK_IO_TRACE_MAGIC | BLK_IO_TRACE_VERSION; t->sequence = ++(*sequence); - t->time = sched_clock() - per_cpu(blk_trace_cpu_offset, cpu); + t->time = cpu_clock(cpu) - per_cpu(blk_trace_cpu_offset, cpu); t->sector = sector; t->bytes = bytes; t->action = what; @@ -488,17 +488,17 @@ void blk_trace_shutdown(request_queue_t } /* - * Average offset over two calls to sched_clock() with a gettimeofday() + * Average offset over two calls to cpu_clock() with a gettimeofday() * in the middle */ -static void blk_check_time(unsigned long long *t) +static void blk_check_time(unsigned long long *t, int this_cpu) { unsigned long long a, b; struct timeval tv; - a = sched_clock(); + a = cpu_clock(this_cpu); do_gettimeofday(&tv); - b = sched_clock(); + b = cpu_clock(this_cpu); *t = tv.tv_sec * 1000000000 + tv.tv_usec * 1000; *t -= (a + b) / 2; @@ -510,16 +510,16 @@ static void blk_check_time(unsigned long static void blk_trace_check_cpu_time(void *data) { unsigned long long *t; - int cpu = get_cpu(); + int this_cpu = get_cpu(); - t = &per_cpu(blk_trace_cpu_offset, cpu); + t = &per_cpu(blk_trace_cpu_offset, this_cpu); /* * Just call it twice, hopefully the second call will be cache hot * and a little more precise */ - blk_check_time(t); - blk_check_time(t); + blk_check_time(t, this_cpu); + blk_check_time(t, this_cpu); put_cpu(); } Index: linux/include/linux/sched.h =================================================================== --- linux.orig/include/linux/sched.h +++ linux/include/linux/sched.h @@ -1327,6 +1327,13 @@ static inline int set_cpus_allowed(struc #endif extern unsigned long long sched_clock(void); + +/* + * For kernel-internal use: high-speed (but slightly incorrect) per-cpu + * clock constructed from sched_clock(): + */ +extern unsigned long long cpu_clock(int cpu); + extern unsigned long long task_sched_runtime(struct task_struct *task); Index: linux/kernel/sched.c =================================================================== --- linux.orig/kernel/sched.c +++ linux/kernel/sched.c @@ -379,6 +379,23 @@ static inline unsigned long long rq_cloc #define task_rq(p) cpu_rq(task_cpu(p)) #define cpu_curr(cpu) (cpu_rq(cpu)->curr) +/* + * For kernel-internal use: high-speed (but slightly incorrect) per-cpu + * clock constructed from sched_clock(): + */ +unsigned long long cpu_clock(int cpu) +{ + struct rq *rq = cpu_rq(cpu); + unsigned long long now; + unsigned long flags; + + spin_lock_irqsave(&rq->lock, flags); + now = rq_clock(rq); + spin_unlock_irqrestore(&rq->lock, flags); + + return now; +} + #ifdef CONFIG_FAIR_GROUP_SCHED /* Change a task's ->cfs_rq if it moves across CPUs */ static inline void set_task_cfs_rq(struct task_struct *p) Index: linux/kernel/softlockup.c =================================================================== --- linux.orig/kernel/softlockup.c +++ linux/kernel/softlockup.c @@ -41,14 +41,16 @@ static struct notifier_block panic_block * resolution, and we don't need to waste time with a big divide when * 2^30ns == 1.074s. */ -static unsigned long get_timestamp(void) +static unsigned long get_timestamp(int this_cpu) { - return sched_clock() >> 30; /* 2^30 ~= 10^9 */ + return cpu_clock(this_cpu) >> 30; /* 2^30 ~= 10^9 */ } void touch_softlockup_watchdog(void) { - __raw_get_cpu_var(touch_timestamp) = get_timestamp(); + int this_cpu = raw_smp_processor_id(); + + per_cpu(touch_timestamp, this_cpu) = get_timestamp(this_cpu); } EXPORT_SYMBOL(touch_softlockup_watchdog); @@ -94,7 +96,7 @@ void softlockup_tick(void) return; } - now = get_timestamp(); + now = get_timestamp(this_cpu); /* Wake up the high-prio watchdog task every second: */ if (now > (touch_timestamp + 1)) ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [patch] sched: implement cpu_clock(cpu) high-speed time source, take #2 2007-07-19 15:44 ` [patch] sched: implement cpu_clock(cpu) high-speed time source, take #2 Ingo Molnar @ 2007-07-19 16:11 ` Jeremy Fitzhardinge 2007-07-19 16:16 ` Ingo Molnar 2007-07-19 17:24 ` Jens Axboe 0 siblings, 2 replies; 40+ messages in thread From: Jeremy Fitzhardinge @ 2007-07-19 16:11 UTC (permalink / raw) To: Ingo Molnar Cc: Andrew Morton, linux-kernel, Linus Torvalds, stable, Greg KH, Chris Wright, Jens Axboe, Srivatsa Vaddagiri, Jan Glauber, Andi Kleen Ingo Molnar wrote: > * Ingo Molnar <mingo@elte.hu> wrote: > > >>> Hm, that doesn't look quite right. Doesn't rq_clock measure time >>> spent running? Unstolen time includes idle time too (it just >>> excludes time in which a VCPU is runnable but not actually running). >>> >> generally rq_clock() also includes idle time, so it should work fine >> for this purpose. So, what do you think about the patch below - does >> it suit Xen's purposes? >> > > how about the patch below instead? (which, unlike the first one, happens > to build and boot ;-) > Yes, that should be fine if its just based on sched_clock. Presumably that means that any architecture (eg, s390) which chooses to implement sched_clock as unstolen time will get good behaviour from softlockup as well as the scheduler. How does this interact with the sched_clock changes Andi just posted? (Couple of comments below.) > Ingo > > --------------> > Subject: sched: implement cpu_clock(cpu) high-speed time source > From: Ingo Molnar <mingo@elte.hu> > > Implement the cpu_clock(cpu) interface for kernel-internal use: > high-speed (but slightly incorrect) per-cpu clock constructed from > sched_clock(). > > update blktrace and the softlockup-watchdog to use this new interface. > > Signed-off-by: Ingo Molnar <mingo@elte.hu> > Acked-by: Jeremy Fitzhardinge <jeremy@xensource.com> > --- > block/blktrace.c | 20 ++++++++++---------- > include/linux/sched.h | 7 +++++++ > kernel/sched.c | 17 +++++++++++++++++ > kernel/softlockup.c | 10 ++++++---- > 4 files changed, 40 insertions(+), 14 deletions(-) > > Index: linux/block/blktrace.c > =================================================================== > --- linux.orig/block/blktrace.c > +++ linux/block/blktrace.c > @@ -41,7 +41,7 @@ static void trace_note(struct blk_trace > const int cpu = smp_processor_id(); > > t->magic = BLK_IO_TRACE_MAGIC | BLK_IO_TRACE_VERSION; > - t->time = sched_clock() - per_cpu(blk_trace_cpu_offset, cpu); > + t->time = cpu_clock(cpu) - per_cpu(blk_trace_cpu_offset, cpu); > t->device = bt->dev; > t->action = action; > t->pid = pid; > @@ -159,7 +159,7 @@ void __blk_add_trace(struct blk_trace *b > > t->magic = BLK_IO_TRACE_MAGIC | BLK_IO_TRACE_VERSION; > t->sequence = ++(*sequence); > - t->time = sched_clock() - per_cpu(blk_trace_cpu_offset, cpu); > + t->time = cpu_clock(cpu) - per_cpu(blk_trace_cpu_offset, cpu); > What's this measuring here? Time spend in IO? Wouldn't it be better off with a measurement of real monotonic time? > t->sector = sector; > t->bytes = bytes; > t->action = what; > @@ -488,17 +488,17 @@ void blk_trace_shutdown(request_queue_t > } > > /* > - * Average offset over two calls to sched_clock() with a gettimeofday() > + * Average offset over two calls to cpu_clock() with a gettimeofday() > * in the middle > */ > -static void blk_check_time(unsigned long long *t) > +static void blk_check_time(unsigned long long *t, int this_cpu) > { > unsigned long long a, b; > struct timeval tv; > > - a = sched_clock(); > + a = cpu_clock(this_cpu); > do_gettimeofday(&tv); > - b = sched_clock(); > + b = cpu_clock(this_cpu); > Is this measuring what it thinks its measuring? > *t = tv.tv_sec * 1000000000 + tv.tv_usec * 1000; > *t -= (a + b) / 2; > @@ -510,16 +510,16 @@ static void blk_check_time(unsigned long > static void blk_trace_check_cpu_time(void *data) > { > unsigned long long *t; > - int cpu = get_cpu(); > + int this_cpu = get_cpu(); > > - t = &per_cpu(blk_trace_cpu_offset, cpu); > + t = &per_cpu(blk_trace_cpu_offset, this_cpu); > > /* > * Just call it twice, hopefully the second call will be cache hot > * and a little more precise > */ > - blk_check_time(t); > - blk_check_time(t); > + blk_check_time(t, this_cpu); > + blk_check_time(t, this_cpu); > > put_cpu(); > } > Index: linux/include/linux/sched.h > =================================================================== > --- linux.orig/include/linux/sched.h > +++ linux/include/linux/sched.h > @@ -1327,6 +1327,13 @@ static inline int set_cpus_allowed(struc > #endif > > extern unsigned long long sched_clock(void); > + > +/* > + * For kernel-internal use: high-speed (but slightly incorrect) per-cpu > + * clock constructed from sched_clock(): > + */ > +extern unsigned long long cpu_clock(int cpu); > + > extern unsigned long long > task_sched_runtime(struct task_struct *task); > > Index: linux/kernel/sched.c > =================================================================== > --- linux.orig/kernel/sched.c > +++ linux/kernel/sched.c > @@ -379,6 +379,23 @@ static inline unsigned long long rq_cloc > #define task_rq(p) cpu_rq(task_cpu(p)) > #define cpu_curr(cpu) (cpu_rq(cpu)->curr) > > +/* > + * For kernel-internal use: high-speed (but slightly incorrect) per-cpu > + * clock constructed from sched_clock(): > + */ > +unsigned long long cpu_clock(int cpu) > +{ > + struct rq *rq = cpu_rq(cpu); > + unsigned long long now; > + unsigned long flags; > + > + spin_lock_irqsave(&rq->lock, flags); > + now = rq_clock(rq); > + spin_unlock_irqrestore(&rq->lock, flags); > + > + return now; > +} > + > #ifdef CONFIG_FAIR_GROUP_SCHED > /* Change a task's ->cfs_rq if it moves across CPUs */ > static inline void set_task_cfs_rq(struct task_struct *p) > Index: linux/kernel/softlockup.c > =================================================================== > --- linux.orig/kernel/softlockup.c > +++ linux/kernel/softlockup.c > @@ -41,14 +41,16 @@ static struct notifier_block panic_block > * resolution, and we don't need to waste time with a big divide when > * 2^30ns == 1.074s. > */ > -static unsigned long get_timestamp(void) > +static unsigned long get_timestamp(int this_cpu) > { > - return sched_clock() >> 30; /* 2^30 ~= 10^9 */ > + return cpu_clock(this_cpu) >> 30; /* 2^30 ~= 10^9 */ > } > > void touch_softlockup_watchdog(void) > { > - __raw_get_cpu_var(touch_timestamp) = get_timestamp(); > + int this_cpu = raw_smp_processor_id(); > + > + per_cpu(touch_timestamp, this_cpu) = get_timestamp(this_cpu); > } > EXPORT_SYMBOL(touch_softlockup_watchdog); > > @@ -94,7 +96,7 @@ void softlockup_tick(void) > return; > } > > - now = get_timestamp(); > + now = get_timestamp(this_cpu); > > /* Wake up the high-prio watchdog task every second: */ > if (now > (touch_timestamp + 1)) > J ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [patch] sched: implement cpu_clock(cpu) high-speed time source, take #2 2007-07-19 16:11 ` Jeremy Fitzhardinge @ 2007-07-19 16:16 ` Ingo Molnar 2007-07-19 16:18 ` Jeremy Fitzhardinge 2007-07-19 17:24 ` Jens Axboe 1 sibling, 1 reply; 40+ messages in thread From: Ingo Molnar @ 2007-07-19 16:16 UTC (permalink / raw) To: Jeremy Fitzhardinge Cc: Andrew Morton, linux-kernel, Linus Torvalds, stable, Greg KH, Chris Wright, Jens Axboe, Srivatsa Vaddagiri, Jan Glauber, Andi Kleen * Jeremy Fitzhardinge <jeremy@goop.org> wrote: > > how about the patch below instead? (which, unlike the first one, > > happens to build and boot ;-) > > Yes, that should be fine if its just based on sched_clock. Presumably > that means that any architecture (eg, s390) which chooses to implement > sched_clock as unstolen time will get good behaviour from softlockup > as well as the scheduler. yeah, that's the idea. > How does this interact with the sched_clock changes Andi just posted? those changes pose no problem, and they are largely orthogonal. Andi's changes should improve the quality of sched_clock() on some boxes that encounter a cpu frequency transition that makes the TSC readout unreliable. Ingo ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [patch] sched: implement cpu_clock(cpu) high-speed time source, take #2 2007-07-19 16:16 ` Ingo Molnar @ 2007-07-19 16:18 ` Jeremy Fitzhardinge 2007-07-19 16:21 ` Ingo Molnar 0 siblings, 1 reply; 40+ messages in thread From: Jeremy Fitzhardinge @ 2007-07-19 16:18 UTC (permalink / raw) To: Ingo Molnar Cc: Andrew Morton, linux-kernel, Linus Torvalds, stable, Greg KH, Chris Wright, Jens Axboe, Srivatsa Vaddagiri, Jan Glauber, Andi Kleen Ingo Molnar wrote: > yeah, that's the idea. > Good. But the real question: does it help Andrew? J ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [patch] sched: implement cpu_clock(cpu) high-speed time source, take #2 2007-07-19 16:18 ` Jeremy Fitzhardinge @ 2007-07-19 16:21 ` Ingo Molnar 2007-07-19 16:29 ` Jeremy Fitzhardinge 0 siblings, 1 reply; 40+ messages in thread From: Ingo Molnar @ 2007-07-19 16:21 UTC (permalink / raw) To: Jeremy Fitzhardinge Cc: Andrew Morton, linux-kernel, Linus Torvalds, stable, Greg KH, Chris Wright, Jens Axboe, Srivatsa Vaddagiri, Jan Glauber, Andi Kleen * Jeremy Fitzhardinge <jeremy@goop.org> wrote: > Ingo Molnar wrote: > > yeah, that's the idea. > > Good. But the real question: does it help Andrew? that's still an important question, but these changes are still needed nevertheless, to unbreak softlockup.c ... I havent been watching while doing CFS :-/ Ingo ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [patch] sched: implement cpu_clock(cpu) high-speed time source, take #2 2007-07-19 16:21 ` Ingo Molnar @ 2007-07-19 16:29 ` Jeremy Fitzhardinge 0 siblings, 0 replies; 40+ messages in thread From: Jeremy Fitzhardinge @ 2007-07-19 16:29 UTC (permalink / raw) To: Ingo Molnar Cc: Andrew Morton, linux-kernel, Linus Torvalds, stable, Greg KH, Chris Wright, Jens Axboe, Srivatsa Vaddagiri, Jan Glauber, Andi Kleen, Zachary Amsden Ingo Molnar wrote: > that's still an important question, but these changes are still needed > nevertheless, to unbreak softlockup.c Well, I'm happy with using jiffies as the backport fix for softlockup (if sched_clock is indeed a problem there), but mainly because it won't affect Xen. I don't know what effect it will have on vmi though. cpu_clock seems like a good approach for .23 onwards. I guess I should test it at some point... J ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [patch] sched: implement cpu_clock(cpu) high-speed time source, take #2 2007-07-19 16:11 ` Jeremy Fitzhardinge 2007-07-19 16:16 ` Ingo Molnar @ 2007-07-19 17:24 ` Jens Axboe 2007-07-19 18:10 ` Jeremy Fitzhardinge 1 sibling, 1 reply; 40+ messages in thread From: Jens Axboe @ 2007-07-19 17:24 UTC (permalink / raw) To: Jeremy Fitzhardinge Cc: Ingo Molnar, Andrew Morton, linux-kernel, Linus Torvalds, stable, Greg KH, Chris Wright, Srivatsa Vaddagiri, Jan Glauber, Andi Kleen On Thu, Jul 19 2007, Jeremy Fitzhardinge wrote: > > Index: linux/block/blktrace.c > > =================================================================== > > --- linux.orig/block/blktrace.c > > +++ linux/block/blktrace.c > > @@ -41,7 +41,7 @@ static void trace_note(struct blk_trace > > const int cpu = smp_processor_id(); > > > > t->magic = BLK_IO_TRACE_MAGIC | BLK_IO_TRACE_VERSION; > > - t->time = sched_clock() - per_cpu(blk_trace_cpu_offset, cpu); > > + t->time = cpu_clock(cpu) - per_cpu(blk_trace_cpu_offset, cpu); > > t->device = bt->dev; > > t->action = action; > > t->pid = pid; > > @@ -159,7 +159,7 @@ void __blk_add_trace(struct blk_trace *b > > > > t->magic = BLK_IO_TRACE_MAGIC | BLK_IO_TRACE_VERSION; > > t->sequence = ++(*sequence); > > - t->time = sched_clock() - per_cpu(blk_trace_cpu_offset, cpu); > > + t->time = cpu_clock(cpu) - per_cpu(blk_trace_cpu_offset, cpu); > > > > What's this measuring here? Time spend in IO? Wouldn't it be better > off with a measurement of real monotonic time? It's not time spent in IO, it wants a nanosecond timestamp. -- Jens Axboe ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [patch] sched: implement cpu_clock(cpu) high-speed time source, take #2 2007-07-19 17:24 ` Jens Axboe @ 2007-07-19 18:10 ` Jeremy Fitzhardinge 2007-07-19 18:20 ` Jens Axboe 0 siblings, 1 reply; 40+ messages in thread From: Jeremy Fitzhardinge @ 2007-07-19 18:10 UTC (permalink / raw) To: Jens Axboe Cc: Ingo Molnar, Andrew Morton, linux-kernel, Linus Torvalds, stable, Greg KH, Chris Wright, Srivatsa Vaddagiri, Jan Glauber, Andi Kleen Jens Axboe wrote: > On Thu, Jul 19 2007, Jeremy Fitzhardinge wrote: > >>> Index: linux/block/blktrace.c >>> =================================================================== >>> --- linux.orig/block/blktrace.c >>> +++ linux/block/blktrace.c >>> @@ -41,7 +41,7 @@ static void trace_note(struct blk_trace >>> const int cpu = smp_processor_id(); >>> >>> t->magic = BLK_IO_TRACE_MAGIC | BLK_IO_TRACE_VERSION; >>> - t->time = sched_clock() - per_cpu(blk_trace_cpu_offset, cpu); >>> + t->time = cpu_clock(cpu) - per_cpu(blk_trace_cpu_offset, cpu); >>> t->device = bt->dev; >>> t->action = action; >>> t->pid = pid; >>> @@ -159,7 +159,7 @@ void __blk_add_trace(struct blk_trace *b >>> >>> t->magic = BLK_IO_TRACE_MAGIC | BLK_IO_TRACE_VERSION; >>> t->sequence = ++(*sequence); >>> - t->time = sched_clock() - per_cpu(blk_trace_cpu_offset, cpu); >>> + t->time = cpu_clock(cpu) - per_cpu(blk_trace_cpu_offset, cpu); >>> >>> >> What's this measuring here? Time spend in IO? Wouldn't it be better >> off with a measurement of real monotonic time? >> > > It's not time spent in IO, it wants a nanosecond timestamp. > What kind of nanoseconds? Real? Virtual? sched_clock() (and now cpu_clock()) measure the number of nanoseconds available for running CPU instructions (regardless of whether it chooses to use them or not), and so in a virtual environment doesn't include time stolen by the hypervisor for other VCPUs. J ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [patch] sched: implement cpu_clock(cpu) high-speed time source, take #2 2007-07-19 18:10 ` Jeremy Fitzhardinge @ 2007-07-19 18:20 ` Jens Axboe 0 siblings, 0 replies; 40+ messages in thread From: Jens Axboe @ 2007-07-19 18:20 UTC (permalink / raw) To: Jeremy Fitzhardinge Cc: Ingo Molnar, Andrew Morton, linux-kernel, Linus Torvalds, stable, Greg KH, Chris Wright, Srivatsa Vaddagiri, Jan Glauber, Andi Kleen On Thu, Jul 19 2007, Jeremy Fitzhardinge wrote: > Jens Axboe wrote: > > On Thu, Jul 19 2007, Jeremy Fitzhardinge wrote: > > > >>> Index: linux/block/blktrace.c > >>> =================================================================== > >>> --- linux.orig/block/blktrace.c > >>> +++ linux/block/blktrace.c > >>> @@ -41,7 +41,7 @@ static void trace_note(struct blk_trace > >>> const int cpu = smp_processor_id(); > >>> > >>> t->magic = BLK_IO_TRACE_MAGIC | BLK_IO_TRACE_VERSION; > >>> - t->time = sched_clock() - per_cpu(blk_trace_cpu_offset, cpu); > >>> + t->time = cpu_clock(cpu) - per_cpu(blk_trace_cpu_offset, cpu); > >>> t->device = bt->dev; > >>> t->action = action; > >>> t->pid = pid; > >>> @@ -159,7 +159,7 @@ void __blk_add_trace(struct blk_trace *b > >>> > >>> t->magic = BLK_IO_TRACE_MAGIC | BLK_IO_TRACE_VERSION; > >>> t->sequence = ++(*sequence); > >>> - t->time = sched_clock() - per_cpu(blk_trace_cpu_offset, cpu); > >>> + t->time = cpu_clock(cpu) - per_cpu(blk_trace_cpu_offset, cpu); > >>> > >>> > >> What's this measuring here? Time spend in IO? Wouldn't it be better > >> off with a measurement of real monotonic time? > >> > > > > It's not time spent in IO, it wants a nanosecond timestamp. > > > > What kind of nanoseconds? Real? Virtual? > > sched_clock() (and now cpu_clock()) measure the number of nanoseconds > available for running CPU instructions (regardless of whether it chooses > to use them or not), and so in a virtual environment doesn't include > time stolen by the hypervisor for other VCPUs. Real time, it's a timestamp that the user tools can use to see the time delta between A and B. So I guess blktrace currently doesn't work so well inside a gues... -- Jens Axboe ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [patch] fix the softlockup watchdog to actually work 2007-07-17 15:49 ` [patch] fix the softlockup watchdog to actually work Ingo Molnar 2007-07-17 17:03 ` Randy Dunlap 2007-07-19 7:22 ` Andrew Morton @ 2007-07-25 8:49 ` Andrew Morton 2007-07-25 8:52 ` Ingo Molnar 2007-07-25 16:34 ` Andi Kleen 2007-10-03 23:49 ` Yinghai Lu 3 siblings, 2 replies; 40+ messages in thread From: Andrew Morton @ 2007-07-25 8:49 UTC (permalink / raw) To: Ingo Molnar Cc: Jeremy Fitzhardinge, linux-kernel, Linus Torvalds, stable, Greg KH, Chris Wright On Tue, 17 Jul 2007 17:49:34 +0200 Ingo Molnar <mingo@elte.hu> wrote: > this Xen related commit: > > commit 966812dc98e6a7fcdf759cbfa0efab77500a8868 > Author: Jeremy Fitzhardinge <jeremy@goop.org> > Date: Tue May 8 00:28:02 2007 -0700 > > Ignore stolen time in the softlockup watchdog > > broke the softlockup watchdog to never report any lockups. (!) > > print_timestamp defaults to 0, this makes the following condition > always true: > > if (print_timestamp < (touch_timestamp + 1) || > > and we'll in essence never report soft lockups. > > apparently the functionality of the soft lockup watchdog was never > actually tested with that patch applied ... > > [this is -stable material too.] Still isn't working. I'm getting random meaningless softlockup trippings coming out for no apparent reason. I guess softlockup is otherwise busted and this patch enables that bustedness to be seen. One possibility is that sched_clock() is bollixed and (say) it's returning a 32-bit value. That'll cause the softlockup logic to get a bit sick when time wraps. This machine (yes it's the Vaio) has marked the TSC unstable but afaict that's OK. So I'll shelve this patch for now. I'm pretty heartily tired of the softlockup thing btw - it's been way more trouble than benefit. Which is strange, for such a simple thing. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [patch] fix the softlockup watchdog to actually work 2007-07-25 8:49 ` [patch] fix the softlockup watchdog to actually work Andrew Morton @ 2007-07-25 8:52 ` Ingo Molnar 2007-07-25 8:55 ` Ingo Molnar 2007-07-25 9:00 ` Andrew Morton 2007-07-25 16:34 ` Andi Kleen 1 sibling, 2 replies; 40+ messages in thread From: Ingo Molnar @ 2007-07-25 8:52 UTC (permalink / raw) To: Andrew Morton Cc: Jeremy Fitzhardinge, linux-kernel, Linus Torvalds, stable, Greg KH, Chris Wright * Andrew Morton <akpm@linux-foundation.org> wrote: > > apparently the functionality of the soft lockup watchdog was never > > actually tested with that patch applied ... > > > > [this is -stable material too.] > > Still isn't working. I'm getting random meaningless softlockup > trippings coming out for no apparent reason. hm, you still havent applied the other 4 patches i sent: softlockup-fix.patch softlockup-add-irq-regs-h.patch softlockup-better-printout.patch softlockup-cleanups.patch softlockup-use-cpu-clock.patch they are all necessary. softlockup-use-cpu-clock.patch could easily solve the present problem you have: as i pointed it out it is _wrong_ to use sched_clock(), because sched_clock() is not a reliable clocksource. Especially on your VAIO. Ingo ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [patch] fix the softlockup watchdog to actually work 2007-07-25 8:52 ` Ingo Molnar @ 2007-07-25 8:55 ` Ingo Molnar 2007-07-25 9:00 ` Andrew Morton 1 sibling, 0 replies; 40+ messages in thread From: Ingo Molnar @ 2007-07-25 8:55 UTC (permalink / raw) To: Andrew Morton Cc: Jeremy Fitzhardinge, linux-kernel, Linus Torvalds, stable, Greg KH, Chris Wright * Ingo Molnar <mingo@elte.hu> wrote: > > Still isn't working. I'm getting random meaningless softlockup > > trippings coming out for no apparent reason. > > hm, you still havent applied the other 4 patches i sent: > > softlockup-fix.patch > > softlockup-add-irq-regs-h.patch > softlockup-better-printout.patch > softlockup-cleanups.patch > softlockup-use-cpu-clock.patch > > they are all necessary. > > softlockup-use-cpu-clock.patch could easily solve the present problem > you have: as i pointed it out it is _wrong_ to use sched_clock(), > because sched_clock() is not a reliable clocksource. Especially on > your VAIO. note: the cpu_clock() API is in v2.6.23-rc1 already so the above patches work fine as-is. [ for -stable, my other patch should be used that restores jiffies use in softirq.c. ] Ingo ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [patch] fix the softlockup watchdog to actually work 2007-07-25 8:52 ` Ingo Molnar 2007-07-25 8:55 ` Ingo Molnar @ 2007-07-25 9:00 ` Andrew Morton 2007-07-25 9:04 ` Ingo Molnar 1 sibling, 1 reply; 40+ messages in thread From: Andrew Morton @ 2007-07-25 9:00 UTC (permalink / raw) To: Ingo Molnar Cc: Jeremy Fitzhardinge, linux-kernel, Linus Torvalds, stable, Greg KH, Chris Wright On Wed, 25 Jul 2007 10:52:04 +0200 Ingo Molnar <mingo@elte.hu> wrote: > > * Andrew Morton <akpm@linux-foundation.org> wrote: > > > > apparently the functionality of the soft lockup watchdog was never > > > actually tested with that patch applied ... > > > > > > [this is -stable material too.] > > > > Still isn't working. I'm getting random meaningless softlockup > > trippings coming out for no apparent reason. > > hm, you still havent applied the other 4 patches i sent: > > softlockup-fix.patch > > softlockup-add-irq-regs-h.patch > softlockup-better-printout.patch > softlockup-cleanups.patch > softlockup-use-cpu-clock.patch > > they are all necessary. I think I have. Seems that someone hasn't been naming their patches consistently (which is quite irksome). I have: fix-the-softlockup-watchdog-to-actually-work.patch softlockup-make-asm-irq_regsh-available-on-every-platform.patch softlockup-improve-debug-output.patch softlockup-watchdog-style-cleanups.patch softlockup-add-a-proc-tuning-parameter.patch softlockup-add-a-proc-tuning-parameter-fix.patch > softlockup-use-cpu-clock.patch could easily solve the present problem > you have: as i pointed it out it is _wrong_ to use sched_clock(), > because sched_clock() is not a reliable clocksource. Especially on your > VAIO. It fails with them all applied too: fix-leak-on-proc-lockdep_stats.patch OK fix-the-softlockup-watchdog-to-actually-work.patch BAD softlockup-make-asm-irq_regsh-available-on-every-platform.patch softlockup-improve-debug-output.patch BAD softlockup-watchdog-style-cleanups.patch softlockup-add-a-proc-tuning-parameter.patch softlockup-add-a-proc-tuning-parameter-fix.patch BAD ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [patch] fix the softlockup watchdog to actually work 2007-07-25 9:00 ` Andrew Morton @ 2007-07-25 9:04 ` Ingo Molnar 2007-07-25 9:17 ` Andrew Morton 0 siblings, 1 reply; 40+ messages in thread From: Ingo Molnar @ 2007-07-25 9:04 UTC (permalink / raw) To: Andrew Morton Cc: Jeremy Fitzhardinge, linux-kernel, Linus Torvalds, stable, Greg KH, Chris Wright * Andrew Morton <akpm@linux-foundation.org> wrote: > > softlockup-add-irq-regs-h.patch > > softlockup-better-printout.patch > > softlockup-cleanups.patch > > softlockup-use-cpu-clock.patch > > > > they are all necessary. > > I think I have. Seems that someone hasn't been naming their patches > consistently (which is quite irksome). I have: > > fix-the-softlockup-watchdog-to-actually-work.patch > softlockup-make-asm-irq_regsh-available-on-every-platform.patch > softlockup-improve-debug-output.patch > softlockup-watchdog-style-cleanups.patch > softlockup-add-a-proc-tuning-parameter.patch > softlockup-add-a-proc-tuning-parameter-fix.patch you have the key one missing i think - attached below. Ingo ------------------------------------> Subject: softlockup: use cpu_clock() instead of sched_clock() From: Ingo Molnar <mingo@elte.hu> sched_clock() is not a reliable time-source, use cpu_clock() instead. Signed-off-by: Ingo Molnar <mingo@elte.hu> --- kernel/softlockup.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) Index: linux/kernel/softlockup.c =================================================================== --- linux.orig/kernel/softlockup.c +++ linux/kernel/softlockup.c @@ -42,14 +42,16 @@ static struct notifier_block panic_block * resolution, and we don't need to waste time with a big divide when * 2^30ns == 1.074s. */ -static unsigned long get_timestamp(void) +static unsigned long get_timestamp(int this_cpu) { - return sched_clock() >> 30; /* 2^30 ~= 10^9 */ + return cpu_clock(this_cpu) >> 30; /* 2^30 ~= 10^9 */ } void touch_softlockup_watchdog(void) { - __raw_get_cpu_var(touch_timestamp) = get_timestamp(); + int this_cpu = raw_smp_processor_id(); + + per_cpu(touch_timestamp, this_cpu) = get_timestamp(this_cpu); } EXPORT_SYMBOL(touch_softlockup_watchdog); @@ -95,7 +97,7 @@ void softlockup_tick(void) return; } - now = get_timestamp(); + now = get_timestamp(this_cpu); /* Wake up the high-prio watchdog task every second: */ if (now > (touch_timestamp + 1)) ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [patch] fix the softlockup watchdog to actually work 2007-07-25 9:04 ` Ingo Molnar @ 2007-07-25 9:17 ` Andrew Morton 2007-07-25 9:23 ` Ingo Molnar 0 siblings, 1 reply; 40+ messages in thread From: Andrew Morton @ 2007-07-25 9:17 UTC (permalink / raw) To: Ingo Molnar Cc: Jeremy Fitzhardinge, linux-kernel, Linus Torvalds, stable, Greg KH, Chris Wright On Wed, 25 Jul 2007 11:04:39 +0200 Ingo Molnar <mingo@elte.hu> wrote: > > fix-the-softlockup-watchdog-to-actually-work.patch > > softlockup-make-asm-irq_regsh-available-on-every-platform.patch > > softlockup-improve-debug-output.patch > > softlockup-watchdog-style-cleanups.patch > > softlockup-add-a-proc-tuning-parameter.patch > > softlockup-add-a-proc-tuning-parameter-fix.patch > > you have the key one missing i think - attached below. > > Ingo > > ------------------------------------> > Subject: softlockup: use cpu_clock() instead of sched_clock() > From: Ingo Molnar <mingo@elte.hu> > > sched_clock() is not a reliable time-source, use cpu_clock() instead. > > Signed-off-by: Ingo Molnar <mingo@elte.hu> > --- > kernel/softlockup.c | 10 ++++++---- > 1 file changed, 6 insertions(+), 4 deletions(-) > > Index: linux/kernel/softlockup.c > =================================================================== > --- linux.orig/kernel/softlockup.c > +++ linux/kernel/softlockup.c > @@ -42,14 +42,16 @@ static struct notifier_block panic_block > * resolution, and we don't need to waste time with a big divide when > * 2^30ns == 1.074s. > */ > -static unsigned long get_timestamp(void) > +static unsigned long get_timestamp(int this_cpu) > { > - return sched_clock() >> 30; /* 2^30 ~= 10^9 */ > + return cpu_clock(this_cpu) >> 30; /* 2^30 ~= 10^9 */ > } > > void touch_softlockup_watchdog(void) > { > - __raw_get_cpu_var(touch_timestamp) = get_timestamp(); > + int this_cpu = raw_smp_processor_id(); > + > + per_cpu(touch_timestamp, this_cpu) = get_timestamp(this_cpu); > } > EXPORT_SYMBOL(touch_softlockup_watchdog); > > @@ -95,7 +97,7 @@ void softlockup_tick(void) > return; argh. afacit this was never sent, except as part of some jumbopatch called "sched: implement cpu_clock(cpu) high-speed time source". That patch helped. It's all a plot. ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [patch] fix the softlockup watchdog to actually work 2007-07-25 9:17 ` Andrew Morton @ 2007-07-25 9:23 ` Ingo Molnar 2007-07-25 9:59 ` Jens Axboe 0 siblings, 1 reply; 40+ messages in thread From: Ingo Molnar @ 2007-07-25 9:23 UTC (permalink / raw) To: Andrew Morton Cc: Jeremy Fitzhardinge, linux-kernel, Linus Torvalds, stable, Greg KH, Chris Wright, Jens Axboe * Andrew Morton <akpm@linux-foundation.org> wrote: > > * 2^30ns == 1.074s. > > */ > > -static unsigned long get_timestamp(void) > > +static unsigned long get_timestamp(int this_cpu) > > { > > - return sched_clock() >> 30; /* 2^30 ~= 10^9 */ > > + return cpu_clock(this_cpu) >> 30; /* 2^30 ~= 10^9 */ > > } > > > > void touch_softlockup_watchdog(void) > > { > > - __raw_get_cpu_var(touch_timestamp) = get_timestamp(); > > + int this_cpu = raw_smp_processor_id(); > > + > > + per_cpu(touch_timestamp, this_cpu) = get_timestamp(this_cpu); > > } > > EXPORT_SYMBOL(touch_softlockup_watchdog); > > > > @@ -95,7 +97,7 @@ void softlockup_tick(void) > > return; > > argh. afacit this was never sent, except as part of some jumbopatch > called "sched: implement cpu_clock(cpu) high-speed time source". > > That patch helped. > > It's all a plot. sorry, it's really my fault: i decoupled it from the jumbopatch (so that the new API could go in first) but forgot to re-send that crutial bit. There's also the patch below (Jens Cc:-ed) to update blktrace. I guess i should do a softlockup.git tree to avoid such foul-ups in the future. Ingo -----------------------> Subject: blktrace: use cpu_clock() instead of sched_clock() From: Ingo Molnar <mingo@elte.hu> use cpu_clock() instead of sched_clock(). (the latter is not a proper clock-source) Signed-off-by: Ingo Molnar <mingo@elte.hu> --- block/blktrace.c | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) Index: linux/block/blktrace.c =================================================================== --- linux.orig/block/blktrace.c +++ linux/block/blktrace.c @@ -41,7 +41,7 @@ static void trace_note(struct blk_trace const int cpu = smp_processor_id(); t->magic = BLK_IO_TRACE_MAGIC | BLK_IO_TRACE_VERSION; - t->time = sched_clock() - per_cpu(blk_trace_cpu_offset, cpu); + t->time = cpu_clock(cpu) - per_cpu(blk_trace_cpu_offset, cpu); t->device = bt->dev; t->action = action; t->pid = pid; @@ -159,7 +159,7 @@ void __blk_add_trace(struct blk_trace *b t->magic = BLK_IO_TRACE_MAGIC | BLK_IO_TRACE_VERSION; t->sequence = ++(*sequence); - t->time = sched_clock() - per_cpu(blk_trace_cpu_offset, cpu); + t->time = cpu_clock(cpu) - per_cpu(blk_trace_cpu_offset, cpu); t->sector = sector; t->bytes = bytes; t->action = what; @@ -488,17 +488,17 @@ void blk_trace_shutdown(request_queue_t } /* - * Average offset over two calls to sched_clock() with a gettimeofday() + * Average offset over two calls to cpu_clock() with a gettimeofday() * in the middle */ -static void blk_check_time(unsigned long long *t) +static void blk_check_time(unsigned long long *t, int this_cpu) { unsigned long long a, b; struct timeval tv; - a = sched_clock(); + a = cpu_clock(this_cpu); do_gettimeofday(&tv); - b = sched_clock(); + b = cpu_clock(this_cpu); *t = tv.tv_sec * 1000000000 + tv.tv_usec * 1000; *t -= (a + b) / 2; @@ -510,16 +510,16 @@ static void blk_check_time(unsigned long static void blk_trace_check_cpu_time(void *data) { unsigned long long *t; - int cpu = get_cpu(); + int this_cpu = get_cpu(); - t = &per_cpu(blk_trace_cpu_offset, cpu); + t = &per_cpu(blk_trace_cpu_offset, this_cpu); /* * Just call it twice, hopefully the second call will be cache hot * and a little more precise */ - blk_check_time(t); - blk_check_time(t); + blk_check_time(t, this_cpu); + blk_check_time(t, this_cpu); put_cpu(); } ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [patch] fix the softlockup watchdog to actually work 2007-07-25 9:23 ` Ingo Molnar @ 2007-07-25 9:59 ` Jens Axboe 2007-07-25 11:04 ` Ingo Molnar 0 siblings, 1 reply; 40+ messages in thread From: Jens Axboe @ 2007-07-25 9:59 UTC (permalink / raw) To: Ingo Molnar Cc: Andrew Morton, Jeremy Fitzhardinge, linux-kernel, Linus Torvalds, stable, Greg KH, Chris Wright On Wed, Jul 25 2007, Ingo Molnar wrote: > > * Andrew Morton <akpm@linux-foundation.org> wrote: > > > > * 2^30ns == 1.074s. > > > */ > > > -static unsigned long get_timestamp(void) > > > +static unsigned long get_timestamp(int this_cpu) > > > { > > > - return sched_clock() >> 30; /* 2^30 ~= 10^9 */ > > > + return cpu_clock(this_cpu) >> 30; /* 2^30 ~= 10^9 */ > > > } > > > > > > void touch_softlockup_watchdog(void) > > > { > > > - __raw_get_cpu_var(touch_timestamp) = get_timestamp(); > > > + int this_cpu = raw_smp_processor_id(); > > > + > > > + per_cpu(touch_timestamp, this_cpu) = get_timestamp(this_cpu); > > > } > > > EXPORT_SYMBOL(touch_softlockup_watchdog); > > > > > > @@ -95,7 +97,7 @@ void softlockup_tick(void) > > > return; > > > > argh. afacit this was never sent, except as part of some jumbopatch > > called "sched: implement cpu_clock(cpu) high-speed time source". > > > > That patch helped. > > > > It's all a plot. > > sorry, it's really my fault: i decoupled it from the jumbopatch (so that > the new API could go in first) but forgot to re-send that crutial bit. > There's also the patch below (Jens Cc:-ed) to update blktrace. I guess i > should do a softlockup.git tree to avoid such foul-ups in the future. > > Ingo > > -----------------------> > Subject: blktrace: use cpu_clock() instead of sched_clock() > From: Ingo Molnar <mingo@elte.hu> > > use cpu_clock() instead of sched_clock(). (the latter is not a proper > clock-source) > > Signed-off-by: Ingo Molnar <mingo@elte.hu> I tested it, seems to work fine for me. Acked-by: Jens Axboe <jens.axboe@oracle.com> -- Jens Axboe ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [patch] fix the softlockup watchdog to actually work 2007-07-25 9:59 ` Jens Axboe @ 2007-07-25 11:04 ` Ingo Molnar 2007-07-25 11:06 ` Jens Axboe 0 siblings, 1 reply; 40+ messages in thread From: Ingo Molnar @ 2007-07-25 11:04 UTC (permalink / raw) To: Jens Axboe Cc: Andrew Morton, Jeremy Fitzhardinge, linux-kernel, Linus Torvalds, stable, Greg KH, Chris Wright * Jens Axboe <jens.axboe@oracle.com> wrote: > > Subject: blktrace: use cpu_clock() instead of sched_clock() > > From: Ingo Molnar <mingo@elte.hu> > > > > use cpu_clock() instead of sched_clock(). (the latter is not a proper > > clock-source) > > > > Signed-off-by: Ingo Molnar <mingo@elte.hu> > > I tested it, seems to work fine for me. > > Acked-by: Jens Axboe <jens.axboe@oracle.com> great! Now that cpu_clock() is in Linus' tree, could you put this patch into your block.git tree and merge it (once you think it should go upstream) together with other IO related fixes? Ingo ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [patch] fix the softlockup watchdog to actually work 2007-07-25 11:04 ` Ingo Molnar @ 2007-07-25 11:06 ` Jens Axboe 0 siblings, 0 replies; 40+ messages in thread From: Jens Axboe @ 2007-07-25 11:06 UTC (permalink / raw) To: Ingo Molnar Cc: Andrew Morton, Jeremy Fitzhardinge, linux-kernel, Linus Torvalds, stable, Greg KH, Chris Wright On Wed, Jul 25 2007, Ingo Molnar wrote: > > * Jens Axboe <jens.axboe@oracle.com> wrote: > > > > Subject: blktrace: use cpu_clock() instead of sched_clock() > > > From: Ingo Molnar <mingo@elte.hu> > > > > > > use cpu_clock() instead of sched_clock(). (the latter is not a proper > > > clock-source) > > > > > > Signed-off-by: Ingo Molnar <mingo@elte.hu> > > > > I tested it, seems to work fine for me. > > > > Acked-by: Jens Axboe <jens.axboe@oracle.com> > > great! Now that cpu_clock() is in Linus' tree, could you put this patch > into your block.git tree and merge it (once you think it should go > upstream) together with other IO related fixes? Sure, I'll pull it into for-linus for merge for 2.6.23. -- Jens Axboe ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [patch] fix the softlockup watchdog to actually work 2007-07-25 8:49 ` [patch] fix the softlockup watchdog to actually work Andrew Morton 2007-07-25 8:52 ` Ingo Molnar @ 2007-07-25 16:34 ` Andi Kleen 1 sibling, 0 replies; 40+ messages in thread From: Andi Kleen @ 2007-07-25 16:34 UTC (permalink / raw) To: Andrew Morton Cc: Ingo Molnar, Jeremy Fitzhardinge, linux-kernel, Linus Torvalds, stable, Greg KH, Chris Wright Andrew Morton <akpm@linux-foundation.org> writes: > > One possibility is that sched_clock() is bollixed and (say) it's returning > a 32-bit value. That'll cause the softlockup logic to get a bit sick when > time wraps. FYI, The current ff x86 sched_clock() [which you likely have, it's not mainline] tends to warp backwards by 80-90ms during cpu freq changes on AMD on one of my test systems. I haven't tracked down why it does that yet. Intel or systems with no cpufreq or cpufreq not changing frequencies regularly are not affected -Andi ^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [patch] fix the softlockup watchdog to actually work 2007-07-17 15:49 ` [patch] fix the softlockup watchdog to actually work Ingo Molnar ` (2 preceding siblings ...) 2007-07-25 8:49 ` [patch] fix the softlockup watchdog to actually work Andrew Morton @ 2007-10-03 23:49 ` Yinghai Lu 3 siblings, 0 replies; 40+ messages in thread From: Yinghai Lu @ 2007-10-03 23:49 UTC (permalink / raw) To: Ingo Molnar Cc: Jeremy Fitzhardinge, linux-kernel, Andrew Morton, Linus Torvalds, stable, Greg KH, Chris Wright On 7/17/07, Ingo Molnar <mingo@elte.hu> wrote: > > * Jeremy Fitzhardinge <jeremy@goop.org> wrote: > > > Ingo Molnar wrote: > > > Subject: softlockup: fix Xen bogosity > > > From: Ingo Molnar <mingo@elte.hu> > > > > > > this Xen related commit: > > > > > > > Well, not just Xen. It relates to any virtual environment: kvm, > > lguest, vmi, xen... (Not that they all implement a measure of > > unstolen time.) > > > > How about a more descriptive patch title, along the lines of > > "softlockup watchdog: fix rate limiting"? > > uhm, the problem was that it did not work _at all_, not something about > 'rate limiting'. Yes, i got quite a bit grumpy when i found this, > because you completely broke the softlockup watchdog via a pretty > intrusive commit and you apparently didnt even do a minimal check > whether its functionality was preserved! Updated patch for Andrew/Linus > and for -stable attached. > > Ingo > > -----------------------------> > Subject: fix the softlockup watchdog to actually work > From: Ingo Molnar <mingo@elte.hu> > > this Xen related commit: > > commit 966812dc98e6a7fcdf759cbfa0efab77500a8868 > Author: Jeremy Fitzhardinge <jeremy@goop.org> > Date: Tue May 8 00:28:02 2007 -0700 > > Ignore stolen time in the softlockup watchdog > > broke the softlockup watchdog to never report any lockups. (!) > > print_timestamp defaults to 0, this makes the following condition > always true: > > if (print_timestamp < (touch_timestamp + 1) || > > and we'll in essence never report soft lockups. > > apparently the functionality of the soft lockup watchdog was never > actually tested with that patch applied ... > > [this is -stable material too.] > > Signed-off-by: Ingo Molnar <mingo@elte.hu> > --- > kernel/softlockup.c | 7 ++++--- > 1 file changed, 4 insertions(+), 3 deletions(-) > > Index: linux/kernel/softlockup.c > =================================================================== > --- linux.orig/kernel/softlockup.c > +++ linux/kernel/softlockup.c > @@ -79,10 +79,11 @@ void softlockup_tick(void) > print_timestamp = per_cpu(print_timestamp, this_cpu); > > /* report at most once a second */ > - if (print_timestamp < (touch_timestamp + 1) || > - did_panic || > - !per_cpu(watchdog_task, this_cpu)) > + if ((print_timestamp >= touch_timestamp && > + print_timestamp < (touch_timestamp + 1)) || > + did_panic || !per_cpu(watchdog_task, this_cpu)) { > return; > + } > > /* do not print during early bootup: */ > if (unlikely(system_state != SYSTEM_RUNNING)) { > - how about diff --git a/kernel/softlockup.c b/kernel/softlockup.c index 708d488..bbc0292 100644 --- a/kernel/softlockup.c +++ b/kernel/softlockup.c @@ -80,7 +80,7 @@ void softlockup_tick(void) print_timestamp = per_cpu(print_timestamp, this_cpu); /* report at most once a second */ - if (print_timestamp < (touch_timestamp + 1) || + if (((touch_timestamp - print_timestamp) < 1) || did_panic || !per_cpu(watchdog_task, this_cpu)) YH ^ permalink raw reply related [flat|nested] 40+ messages in thread
end of thread, other threads:[~2007-10-03 23:49 UTC | newest] Thread overview: 40+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2007-07-17 11:44 [patch] softlockup watchdog: fix Xen bogosity Ingo Molnar 2007-07-17 14:17 ` Jeremy Fitzhardinge 2007-07-17 15:49 ` [patch] fix the softlockup watchdog to actually work Ingo Molnar 2007-07-17 17:03 ` Randy Dunlap 2007-07-17 17:25 ` Ingo Molnar 2007-07-17 18:14 ` Linus Torvalds 2007-07-17 21:38 ` Randy Dunlap 2007-07-19 7:22 ` Andrew Morton 2007-07-19 7:46 ` Ingo Molnar 2007-07-19 7:51 ` Ingo Molnar 2007-07-19 14:31 ` Jeremy Fitzhardinge 2007-07-19 14:35 ` Ingo Molnar 2007-07-19 14:40 ` Jeremy Fitzhardinge 2007-07-19 14:46 ` Jeremy Fitzhardinge 2007-07-19 14:50 ` Ingo Molnar 2007-07-19 15:04 ` Jeremy Fitzhardinge 2007-07-19 15:09 ` Ingo Molnar 2007-07-19 15:21 ` Jeremy Fitzhardinge 2007-07-19 15:42 ` [patch] sched: implement cpu_clock(cpu) high-speed time source Ingo Molnar 2007-07-19 15:44 ` [patch] sched: implement cpu_clock(cpu) high-speed time source, take #2 Ingo Molnar 2007-07-19 16:11 ` Jeremy Fitzhardinge 2007-07-19 16:16 ` Ingo Molnar 2007-07-19 16:18 ` Jeremy Fitzhardinge 2007-07-19 16:21 ` Ingo Molnar 2007-07-19 16:29 ` Jeremy Fitzhardinge 2007-07-19 17:24 ` Jens Axboe 2007-07-19 18:10 ` Jeremy Fitzhardinge 2007-07-19 18:20 ` Jens Axboe 2007-07-25 8:49 ` [patch] fix the softlockup watchdog to actually work Andrew Morton 2007-07-25 8:52 ` Ingo Molnar 2007-07-25 8:55 ` Ingo Molnar 2007-07-25 9:00 ` Andrew Morton 2007-07-25 9:04 ` Ingo Molnar 2007-07-25 9:17 ` Andrew Morton 2007-07-25 9:23 ` Ingo Molnar 2007-07-25 9:59 ` Jens Axboe 2007-07-25 11:04 ` Ingo Molnar 2007-07-25 11:06 ` Jens Axboe 2007-07-25 16:34 ` Andi Kleen 2007-10-03 23:49 ` Yinghai Lu
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).