From: Ingo Molnar <mingo@elte.hu>
To: Andrew Morton <akpm@osdl.org>
Cc: ccb@acm.org, linux-kernel@vger.kernel.org
Subject: [patch] fix spinlock-debug looping
Date: Tue, 20 Jun 2006 10:40:01 +0200 [thread overview]
Message-ID: <20060620084001.GC7899@elte.hu> (raw)
In-Reply-To: <20060619125531.4c72b8cc.akpm@osdl.org>
* Andrew Morton <akpm@osdl.org> wrote:
> > > > The write_trylock + __delay in the loop is not a problem or a bug, as
> > > > the trylock will at most _increase_ the delay - and our goal is to not
> > > > have a false positive, not to be absolutely accurate about the
> > > > measurement here.
> > >
> > > Precisely. We have delays of over a second (but we don't know how
> > > much more than a second). Let's say two seconds. The NMI watchdog
> > > timeout is, what? Five seconds?
> >
> > i dont see the problem.
>
> It's taking over a second to acquire a write_lock. A lock which is
> unlikely to be held for more than a microsecond anywhere. That's
> really bad, isn't it? Being on the edge of an NMI watchdog induced
> system crash is bad, too.
i obviously agree that any such crash is a serious problem, but is it
caused by the spinlock-debugging code? I doubt it is, unless __delay()
is seriously buggered.
in any case, to move this problem forward i'd suggest we go with the
patch below in -mm and wait for feedback. It fixes a potential overflow
in the comparison (if HZ*lpj overflows 32-bits) That needs a really fast
box to run the 32-bit kernel though, so i doubt this is the cause of the
problems. In any case, this change makes it easier to increase the
looping timeout from 1 second to 10 seconds later on or so - at which
point the overflow can happen for real and must be handled .
Ingo
------------
Subject: fix spinlock-debug looping
From: Ingo Molnar <mingo@elte.hu>
make sure the right hand side of the comparison does not overflow
on 32-bits. Also print out more info when detecting a lockup, so
that we see how many times the code tried (and failed) to get the
lock.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
lib/spinlock_debug.c | 24 ++++++++++++------------
1 file changed, 12 insertions(+), 12 deletions(-)
Index: linux/lib/spinlock_debug.c
===================================================================
--- linux.orig/lib/spinlock_debug.c
+++ linux/lib/spinlock_debug.c
@@ -104,7 +104,7 @@ static void __spin_lock_debug(spinlock_t
u64 i;
for (;;) {
- for (i = 0; i < loops_per_jiffy * HZ; i++) {
+ for (i = 0; i < (u64)loops_per_jiffy * HZ; i++) {
if (__raw_spin_trylock(&lock->raw_lock))
return;
__delay(1);
@@ -112,10 +112,10 @@ static void __spin_lock_debug(spinlock_t
/* lockup suspected: */
if (print_once) {
print_once = 0;
- printk(KERN_EMERG "BUG: spinlock lockup on CPU#%d, "
- "%s/%d, %p\n",
+ printk(KERN_EMERG "BUG: possible spinlock lockup on CPU#%d, "
+ "%s/%d, %p [%Ld/%ld]\n",
raw_smp_processor_id(), current->comm,
- current->pid, lock);
+ current->pid, lock, i, loops_per_jiffy);
dump_stack();
}
}
@@ -169,7 +169,7 @@ static void __read_lock_debug(rwlock_t *
u64 i;
for (;;) {
- for (i = 0; i < loops_per_jiffy * HZ; i++) {
+ for (i = 0; i < (u64)loops_per_jiffy * HZ; i++) {
if (__raw_read_trylock(&lock->raw_lock))
return;
__delay(1);
@@ -177,10 +177,10 @@ static void __read_lock_debug(rwlock_t *
/* lockup suspected: */
if (print_once) {
print_once = 0;
- printk(KERN_EMERG "BUG: read-lock lockup on CPU#%d, "
- "%s/%d, %p\n",
+ printk(KERN_EMERG "BUG: possible read-lock lockup on CPU#%d, "
+ "%s/%d, %p [%Ld/%ld]\n",
raw_smp_processor_id(), current->comm,
- current->pid, lock);
+ current->pid, lock, i, loops_per_jiffy);
dump_stack();
}
}
@@ -242,7 +242,7 @@ static void __write_lock_debug(rwlock_t
u64 i;
for (;;) {
- for (i = 0; i < loops_per_jiffy * HZ; i++) {
+ for (i = 0; i < (u64)loops_per_jiffy * HZ; i++) {
if (__raw_write_trylock(&lock->raw_lock))
return;
__delay(1);
@@ -250,10 +250,10 @@ static void __write_lock_debug(rwlock_t
/* lockup suspected: */
if (print_once) {
print_once = 0;
- printk(KERN_EMERG "BUG: write-lock lockup on CPU#%d, "
- "%s/%d, %p\n",
+ printk(KERN_EMERG "BUG: possible write-lock lockup on CPU#%d, "
+ "%s/%d, %p [%Ld/%ld]\n",
raw_smp_processor_id(), current->comm,
- current->pid, lock);
+ current->pid, lock, i, loops_per_jiffy);
dump_stack();
}
}
next prev parent reply other threads:[~2006-06-20 8:45 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-06-12 19:53 BUG: write-lock lockup Charles C. Bennett, Jr.
2006-06-17 17:07 ` Andrew Morton
2006-06-19 7:02 ` [patch] increase spinlock-debug looping timeouts from 1 sec to 1 min Ingo Molnar
2006-06-19 7:59 ` Andrew Morton
2006-06-19 8:12 ` Ingo Molnar
2006-06-19 8:21 ` Ingo Molnar
2006-06-19 8:32 ` Andrew Morton
2006-06-19 8:35 ` Ingo Molnar
2006-06-19 9:13 ` Andrew Morton
2006-06-19 11:39 ` Ingo Molnar
2006-06-19 19:55 ` Andrew Morton
2006-06-20 8:06 ` Arjan van de Ven
2006-06-20 8:40 ` Ingo Molnar [this message]
2006-06-20 8:52 ` [patch] fix spinlock-debug looping Andrew Morton
2006-06-20 9:15 ` Ingo Molnar
2006-06-20 9:32 ` Andrew Morton
2006-06-20 9:34 ` Ingo Molnar
2006-06-20 16:02 ` Dave Olson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20060620084001.GC7899@elte.hu \
--to=mingo@elte.hu \
--cc=akpm@osdl.org \
--cc=ccb@acm.org \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox