From: Andrew Morton <akpm@linux-foundation.org>
To: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Bhargava <prarit@redhat.com>,
Thomas@smtp2.linux-foundation.org, Lindsley <ricklind@us.ibm.com>,
Prarit@smtp2.linux-foundation.org,
john stultz <johnstul@us.ibm.com>,
Zachary@smtp2.linux-foundation.org,
Linux Kernel <linux-kernel@vger.kernel.org>,
Eric Dumazet <dada1@cosmosbay.com>,
virtualization@lists.osdl.org,
Chris Lalancette <clalance@redhat.com>,
Paul Mackerras <paulus@samba.org>,
Rick@smtp2.linux-foundation.org,
Martin Schwidefsky <schwidefsky@de.ibm.com>,
Ingo Molnar <mingo@elte.hu>, Gleixner <tglx@linutronix.de>
Subject: Re: [patch 1/4] Ignore stolen time in the softlockup watchdog
Date: Mon, 23 Apr 2007 23:49:10 -0700 [thread overview]
Message-ID: <20070423234910.50149faf.akpm@linux-foundation.org> (raw)
In-Reply-To: <20070327215827.871954359@goop.org>
On Tue, 27 Mar 2007 14:49:20 -0700 Jeremy Fitzhardinge <jeremy@goop.org> wrote:
> The softlockup watchdog is currently a nuisance in a virtual machine,
> since the whole system could have the CPU stolen from it for a long
> period of time. While it would be unlikely for a guest domain to be
> denied timer interrupts for over 10s, it could happen and any softlockup
> message would be completely spurious.
>
> Earlier I proposed that sched_clock() return time in unstolen
> nanoseconds, which is how Xen and VMI currently implement it. If the
> softlockup watchdog uses sched_clock() to measure time, it would
> automatically ignore stolen time, and therefore only report when the
> guest itself locked up. When running native, sched_clock() returns
> real-time nanoseconds, so the behaviour would be unchanged.
>
> Note that sched_clock() used this way is inherently per-cpu, so this
> patch makes sure that the per-processor watchdog thread initialized
> its own timestamp.
This patch
(ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc6/2.6.21-rc6-mm1/broken-out/ignore-stolen-time-in-the-softlockup-watchdog.patch)
causes six failures in the locking self-tests, which I must say is rather
clever of it.
Here's the first one:
[17179569.184000] Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
[17179569.184000] ... MAX_LOCKDEP_SUBCLASSES: 8
[17179569.184000] ... MAX_LOCK_DEPTH: 30
[17179569.184000] ... MAX_LOCKDEP_KEYS: 2048
[17179569.184000] ... CLASSHASH_SIZE: 1024
[17179569.184000] ... MAX_LOCKDEP_ENTRIES: 8192
[17179569.184000] ... MAX_LOCKDEP_CHAINS: 16384
[17179569.184000] ... CHAINHASH_SIZE: 8192
[17179569.184000] memory used by lock dependency info: 992 kB
[17179569.184000] per task-struct memory footprint: 1200 bytes
[17179569.184000] ------------------------
[17179569.184000] | Locking API testsuite:
[17179569.184000] ----------------------------------------------------------------------------
[17179569.184000] | spin |wlock |rlock |mutex | wsem | rsem |
[17179569.184000] --------------------------------------------------------------------------
[17179569.184000] A-A deadlock: ok | ok | ok | ok | ok | ok |
[17179569.184000] A-B-B-A deadlock: ok | ok | ok | ok | ok | ok |
[17179569.184000] A-B-B-C-C-A deadlock: ok | ok | ok | ok | ok | ok |
[17179569.184001] A-B-C-A-B-C deadlock: ok | ok | ok | ok | ok | ok |
[17179569.184002] A-B-B-C-C-D-D-A deadlock: ok | ok | ok | ok | ok | ok |
[17179569.184003] A-B-C-D-B-D-D-A deadlock: ok | ok | ok | ok | ok | ok |
[17179569.184004] A-B-C-D-B-C-D-A deadlock: ok | ok | ok | ok | ok | ok |
[17179569.184005] double unlock: ok | ok | ok | ok | ok | ok |
[17179569.184006] initialize held: ok | ok | ok | ok | ok | ok |
[17179569.184006] bad unlock order: ok | ok | ok | ok | ok | ok |
[17179569.184006] --------------------------------------------------------------------------
[17179569.184006] recursive read-lock: | ok | | ok |
[17179569.184006] recursive read-lock #2: | ok | | ok |
[17179569.184007] mixed read-write-lock: | ok | | ok |
[17179569.184007] mixed write-read-lock: | ok | | ok |
[17179569.184007] --------------------------------------------------------------------------
[17179569.184007] hard-irqs-on + irq-safe-A/12: ok | ok | ok |
[17179569.184007] soft-irqs-on + irq-safe-A/12: ok | ok | ok |
[17179569.184007] hard-irqs-on + irq-safe-A/21: ok | ok | ok |
[17179569.184007] soft-irqs-on + irq-safe-A/21: ok | ok | ok |
[17179569.184007] sirq-safe-A => hirqs-on/12: ok | ok |irq event stamp: 458
[17179569.184007] hardirqs last enabled at (458): [<c01e4116>] irqsafe2A_rlock_12+0x96/0xa3
[17179569.184007] hardirqs last disabled at (457): [<c01095b9>] sched_clock+0x5e/0xe9
[17179569.184007] softirqs last enabled at (454): [<c01e4101>] irqsafe2A_rlock_12+0x81/0xa3
[17179569.184007] softirqs last disabled at (450): [<c01e408b>] irqsafe2A_rlock_12+0xb/0xa3
[17179569.184007] FAILED| [<c0104cf0>] dump_trace+0x63/0x1ec
[17179569.184007] [<c0104e93>] show_trace_log_lvl+0x1a/0x30
[17179569.184007] [<c01059ec>] show_trace+0x12/0x14
[17179569.184007] [<c0105a45>] dump_stack+0x16/0x18
[17179569.184007] [<c01e1eb5>] dotest+0x6b/0x3d0
[17179569.184007] [<c01eb249>] locking_selftest+0x915/0x1a58
[17179569.184007] [<c048c979>] start_kernel+0x1d0/0x2a2
[17179569.184007] =======================
[17179569.184007]
[17179569.184007] sirq-safe-A => hirqs-on/21:irq event stamp: 462
[17179569.184007] hardirqs last enabled at (462): [<c01e3eb2>] irqsafe2A_spin_21+0x1b/0xa3
[17179569.184007] hardirqs last disabled at (461): [<c01095b9>] sched_clock+0x5e/0xe9
[17179569.184007] softirqs last enabled at (454): [<c01e4101>] irqsafe2A_rlock_12+0x81/0xa3
[17179569.184007] softirqs last disabled at (450): [<c01e408b>] irqsafe2A_rlock_12+0xb/0xa3
[17179569.184007] ok |irq event stamp: 466
It's a challenge to even find the code which corresponds with this failure,
so good luck.
It seems fairly sensitive to .config settings. See
http://userweb.kernel.org/~akpm/config-sony.txt
Unfortunately this causes lockdep to disable itself, thus hiding all the
other bugs which people have contributed, so I'll drop your patch for now.
next prev parent reply other threads:[~2007-04-24 6:49 UTC|newest]
Thread overview: 33+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-03-27 21:49 [patch 0/4] Revised softlockup watchdog improvement patches Jeremy Fitzhardinge
2007-03-27 21:49 ` [patch 1/4] Ignore stolen time in the softlockup watchdog Jeremy Fitzhardinge
2007-04-24 6:49 ` Andrew Morton [this message]
2007-04-24 6:58 ` Jeremy Fitzhardinge
2007-04-24 7:09 ` Andrew Morton
2007-04-24 17:51 ` Jeremy Fitzhardinge
2007-04-24 17:57 ` Andrew Morton
2007-04-24 18:16 ` Jeremy Fitzhardinge
2007-04-24 18:32 ` Andrew Morton
2007-04-24 20:00 ` Jeremy Fitzhardinge
2007-04-24 20:14 ` Andrew Morton
2007-04-24 20:46 ` Jeremy Fitzhardinge
2007-04-24 20:24 ` Jeremy Fitzhardinge
2007-04-24 20:33 ` Andrew Morton
2007-04-24 20:48 ` Jeremy Fitzhardinge
2007-04-24 20:52 ` Daniel Walker
2007-04-24 20:59 ` Ingo Molnar
2007-04-24 21:01 ` Daniel Walker
2007-04-24 21:14 ` Andrew Morton
2007-04-24 21:20 ` Andi Kleen
2007-04-24 21:33 ` Daniel Walker
2007-03-27 21:49 ` [patch 2/4] percpu enable flag for " Jeremy Fitzhardinge
2007-03-27 21:49 ` [patch 3/4] Locally disable the softlockup watchdog rather than touching it Jeremy Fitzhardinge
2007-03-28 13:33 ` Prarit Bhargava
2007-03-28 13:50 ` Andi Kleen
2007-03-28 14:00 ` Prarit Bhargava
2007-03-28 14:09 ` Andi Kleen
2007-03-28 14:13 ` Prarit Bhargava
2007-03-28 14:44 ` Jeremy Fitzhardinge
2007-03-28 14:51 ` Prarit Bhargava
2007-03-28 15:22 ` Jeremy Fitzhardinge
2007-03-28 15:27 ` Prarit Bhargava
2007-03-27 21:49 ` [patch 4/4] Add global disable/enable for softlockup watchdog Jeremy Fitzhardinge
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20070423234910.50149faf.akpm@linux-foundation.org \
--to=akpm@linux-foundation.org \
--cc=Prarit@smtp2.linux-foundation.org \
--cc=Rick@smtp2.linux-foundation.org \
--cc=Thomas@smtp2.linux-foundation.org \
--cc=Zachary@smtp2.linux-foundation.org \
--cc=clalance@redhat.com \
--cc=dada1@cosmosbay.com \
--cc=jeremy@goop.org \
--cc=johnstul@us.ibm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=paulus@samba.org \
--cc=prarit@redhat.com \
--cc=ricklind@us.ibm.com \
--cc=schwidefsky@de.ibm.com \
--cc=tglx@linutronix.de \
--cc=virtualization@lists.osdl.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).