All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@linux-foundation.org>
To: Ingo Molnar <mingo@elte.hu>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>,
	linux-kernel@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	stable@kernel.org, Greg KH <greg@kroah.com>,
	Chris Wright <chrisw@sous-sol.org>
Subject: Re: [patch] fix the softlockup watchdog to actually work
Date: Thu, 19 Jul 2007 00:22:31 -0700	[thread overview]
Message-ID: <20070719002231.069ebbdd.akpm@linux-foundation.org> (raw)
In-Reply-To: <20070717154934.GA24231@elte.hu>

On Tue, 17 Jul 2007 17:49:34 +0200 Ingo Molnar <mingo@elte.hu> wrote:

> Subject: fix the softlockup watchdog to actually work
> From: Ingo Molnar <mingo@elte.hu>
> 
> this Xen related commit:
> 
>    commit 966812dc98e6a7fcdf759cbfa0efab77500a8868
>    Author: Jeremy Fitzhardinge <jeremy@goop.org>
>    Date:   Tue May 8 00:28:02 2007 -0700
> 
>        Ignore stolen time in the softlockup watchdog
> 
> broke the softlockup watchdog to never report any lockups. (!)
> 
> print_timestamp defaults to 0, this makes the following condition
> always true:
> 
> 	if (print_timestamp < (touch_timestamp + 1) ||
> 
> and we'll in essence never report soft lockups.
> 
> apparently the functionality of the soft lockup watchdog was never
> actually tested with that patch applied ...
> 
> [this is -stable material too.]

This seems terribly sensitive.

Someone has broken the Vaio (shock, horror).  It now has mysterious
jerkiness: when leaning on autorepeat it stalls for maybe 0.25 seconds
every 1.5 seconds.  The stalls are far less than a second.  Yet this
is enough to trigger random softlockup warnings.

Some of those warnings are below.  Note that the traces are all pretty
useless, as softlockup warnings so often seem to be.

Of course, it could be that whatever is causing these pauses really _is_
stalling for a whole second occasionally, dunno.  But I didn't notice any
long stalls in the console output when a particular storm of softlockup
warnings came out.

But I'll sit on this patch for a while until this gets sorted out. 
Meanwhile, please double-check the elapsed-time arithmetic in there,
maybe do a bit of runtime testing?



[   78.820961] BUG: soft lockup detected on CPU#0!
[   78.821083]  [<c0122475>] update_process_times+0x32/0x54
[   78.821216]  [<c012fe7a>] tick_sched_timer+0x61/0x9c
[   78.821340]  [<c012c2e7>] hrtimer_interrupt+0x142/0x1d4
[   78.821463]  [<c012fe19>] tick_sched_timer+0x0/0x9c
[   78.821587]  [<c012f74a>] tick_do_broadcast+0x1f/0x3f
[   78.821707]  [<c012fa01>] tick_handle_oneshot_broadcast+0x47/0x72
[   78.821852]  [<c01067ca>] timer_interrupt+0x1a/0x20
[   78.821968]  [<c014291e>] handle_IRQ_event+0x1a/0x3f
[   78.822089]  [<c0143521>] handle_edge_irq+0x9d/0xcc
[   78.822206]  [<c0105d7b>] do_IRQ+0x53/0x6c
[   78.822307]  [<c012f4f0>] tick_notify+0x15c/0x208
[   78.822422]  [<c01044cf>] common_interrupt+0x23/0x28
[   78.822539]  [<c012f1d4>] clockevents_notify+0x8/0x36
[   78.822663]  [<c020d199>] acpi_processor_idle+0x1d2/0x36d
[   78.822798]  [<c0102345>] cpu_idle+0x44/0x5e
[   78.822900]  [<c03baa8d>] start_kernel+0x26d/0x275
[   78.823017]  [<c03ba3fe>] unknown_bootoption+0x0/0x202
[   78.823142]  =======================
[  106.282830] BUG: soft lockup detected on CPU#0!
[  106.282967]  [<c0122475>] update_process_times+0x32/0x54
[  106.283116]  [<c012fe7a>] tick_sched_timer+0x61/0x9c
[  106.283255]  [<c012c2e7>] hrtimer_interrupt+0x142/0x1d4
[  106.283391]  [<c012fe19>] tick_sched_timer+0x0/0x9c
[  106.283530]  [<c012f74a>] tick_do_broadcast+0x1f/0x3f
[  106.283663]  [<c012fa01>] tick_handle_oneshot_broadcast+0x47/0x72
[  106.283821]  [<c01067ca>] timer_interrupt+0x1a/0x20
[  106.283949]  [<c014291e>] handle_IRQ_event+0x1a/0x3f
[  106.284084]  [<c0143521>] handle_edge_irq+0x9d/0xcc
[  106.284215]  [<c0105d7b>] do_IRQ+0x53/0x6c
[  106.284326]  [<c012f4f0>] tick_notify+0x15c/0x208
[  106.284455]  [<c01044cf>] common_interrupt+0x23/0x28
[  106.284587]  [<c012f1d4>] clockevents_notify+0x8/0x36
[  106.284725]  [<c020d199>] acpi_processor_idle+0x1d2/0x36d
[  106.284875]  [<c0102345>] cpu_idle+0x44/0x5e
[  106.284988]  [<c03baa8d>] start_kernel+0x26d/0x275
[  106.285117]  [<c03ba3fe>] unknown_bootoption+0x0/0x202
[  106.285257]  =======================
[  109.266423] BUG: soft lockup detected on CPU#0!
[  109.266558]  [<c0122475>] update_process_times+0x32/0x54
[  109.266703]  [<c012fe7a>] tick_sched_timer+0x61/0x9c
[  109.270745]  [<c012c2e7>] hrtimer_interrupt+0x142/0x1d4
[  109.274790]  [<c012fe19>] tick_sched_timer+0x0/0x9c
[  109.278865]  [<c012f74a>] tick_do_broadcast+0x1f/0x3f
[  109.282950]  [<c012fa01>] tick_handle_oneshot_broadcast+0x47/0x72
[  109.287026]  [<c01067ca>] timer_interrupt+0x1a/0x20
[  109.291012]  [<c014291e>] handle_IRQ_event+0x1a/0x3f
[  109.294950]  [<c0143521>] handle_edge_irq+0x9d/0xcc
[  109.298864]  [<c0105d7b>] do_IRQ+0x53/0x6c
[  109.302818]  [<c012f4f0>] tick_notify+0x15c/0x208
[  109.306740]  [<c01044cf>] common_interrupt+0x23/0x28
[  109.310641]  [<c012f1d4>] clockevents_notify+0x8/0x36
[  109.314543]  [<c020d199>] acpi_processor_idle+0x1d2/0x36d
[  109.318461]  [<c0102345>] cpu_idle+0x44/0x5e
[  109.322348]  [<c03baa8d>] start_kernel+0x26d/0x275
[  109.326267]  [<c03ba3fe>] unknown_bootoption+0x0/0x202
[  109.330188]  =======================

(ah, the Vaio breakage seems to be -mm-only, whew)


  parent reply	other threads:[~2007-07-19  7:24 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-07-17 11:44 [patch] softlockup watchdog: fix Xen bogosity Ingo Molnar
2007-07-17 14:17 ` Jeremy Fitzhardinge
2007-07-17 15:49   ` [patch] fix the softlockup watchdog to actually work Ingo Molnar
2007-07-17 17:03     ` Randy Dunlap
2007-07-17 17:25       ` Ingo Molnar
2007-07-17 18:14       ` Linus Torvalds
2007-07-17 21:38         ` Randy Dunlap
2007-07-19  7:22     ` Andrew Morton [this message]
2007-07-19  7:46       ` Ingo Molnar
2007-07-19  7:51       ` Ingo Molnar
2007-07-19 14:31         ` Jeremy Fitzhardinge
2007-07-19 14:35           ` Ingo Molnar
2007-07-19 14:40             ` Jeremy Fitzhardinge
2007-07-19 14:46             ` Jeremy Fitzhardinge
2007-07-19 14:50               ` Ingo Molnar
2007-07-19 15:04                 ` Jeremy Fitzhardinge
2007-07-19 15:09                   ` Ingo Molnar
2007-07-19 15:21                     ` Jeremy Fitzhardinge
2007-07-19 15:42                       ` [patch] sched: implement cpu_clock(cpu) high-speed time source Ingo Molnar
2007-07-19 15:44                         ` [patch] sched: implement cpu_clock(cpu) high-speed time source, take #2 Ingo Molnar
2007-07-19 16:11                           ` Jeremy Fitzhardinge
2007-07-19 16:16                             ` Ingo Molnar
2007-07-19 16:18                               ` Jeremy Fitzhardinge
2007-07-19 16:21                                 ` Ingo Molnar
2007-07-19 16:29                                   ` Jeremy Fitzhardinge
2007-07-19 17:24                             ` Jens Axboe
2007-07-19 18:10                               ` Jeremy Fitzhardinge
2007-07-19 18:20                                 ` Jens Axboe
2007-07-25  8:49     ` [patch] fix the softlockup watchdog to actually work Andrew Morton
2007-07-25  8:52       ` Ingo Molnar
2007-07-25  8:55         ` Ingo Molnar
2007-07-25  9:00         ` Andrew Morton
2007-07-25  9:04           ` Ingo Molnar
2007-07-25  9:17             ` Andrew Morton
2007-07-25  9:23               ` Ingo Molnar
2007-07-25  9:59                 ` Jens Axboe
2007-07-25 11:04                   ` Ingo Molnar
2007-07-25 11:06                     ` Jens Axboe
2007-07-25 16:34       ` Andi Kleen
2007-10-03 23:49     ` Yinghai Lu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070719002231.069ebbdd.akpm@linux-foundation.org \
    --to=akpm@linux-foundation.org \
    --cc=chrisw@sous-sol.org \
    --cc=greg@kroah.com \
    --cc=jeremy@goop.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=stable@kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.