public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@elte.hu>
To: Andrew Morton <akpm@osdl.org>
Cc: ccb@acm.org, linux-kernel@vger.kernel.org
Subject: [patch] fix spinlock-debug looping
Date: Tue, 20 Jun 2006 10:40:01 +0200	[thread overview]
Message-ID: <20060620084001.GC7899@elte.hu> (raw)
In-Reply-To: <20060619125531.4c72b8cc.akpm@osdl.org>


* Andrew Morton <akpm@osdl.org> wrote:

> > > > The write_trylock + __delay in the loop is not a problem or a bug, as 
> > > > the trylock will at most _increase_ the delay - and our goal is to not 
> > > > have a false positive, not to be absolutely accurate about the 
> > > > measurement here.
> > > 
> > > Precisely.  We have delays of over a second (but we don't know how 
> > > much more than a second).  Let's say two seconds.  The NMI watchdog 
> > > timeout is, what?  Five seconds?
> > 
> > i dont see the problem.
> 
> It's taking over a second to acquire a write_lock.  A lock which is 
> unlikely to be held for more than a microsecond anywhere.  That's 
> really bad, isn't it?  Being on the edge of an NMI watchdog induced 
> system crash is bad, too.

i obviously agree that any such crash is a serious problem, but is it 
caused by the spinlock-debugging code? I doubt it is, unless __delay() 
is seriously buggered.

in any case, to move this problem forward i'd suggest we go with the 
patch below in -mm and wait for feedback. It fixes a potential overflow 
in the comparison (if HZ*lpj overflows 32-bits) That needs a really fast 
box to run the 32-bit kernel though, so i doubt this is the cause of the 
problems. In any case, this change makes it easier to increase the 
looping timeout from 1 second to 10 seconds later on or so - at which 
point the overflow can happen for real and must be handled .

	Ingo

------------
Subject: fix spinlock-debug looping
From: Ingo Molnar <mingo@elte.hu>

make sure the right hand side of the comparison does not overflow
on 32-bits. Also print out more info when detecting a lockup, so
that we see how many times the code tried (and failed) to get the
lock.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 lib/spinlock_debug.c |   24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

Index: linux/lib/spinlock_debug.c
===================================================================
--- linux.orig/lib/spinlock_debug.c
+++ linux/lib/spinlock_debug.c
@@ -104,7 +104,7 @@ static void __spin_lock_debug(spinlock_t
 	u64 i;
 
 	for (;;) {
-		for (i = 0; i < loops_per_jiffy * HZ; i++) {
+		for (i = 0; i < (u64)loops_per_jiffy * HZ; i++) {
 			if (__raw_spin_trylock(&lock->raw_lock))
 				return;
 			__delay(1);
@@ -112,10 +112,10 @@ static void __spin_lock_debug(spinlock_t
 		/* lockup suspected: */
 		if (print_once) {
 			print_once = 0;
-			printk(KERN_EMERG "BUG: spinlock lockup on CPU#%d, "
-					"%s/%d, %p\n",
+			printk(KERN_EMERG "BUG: possible spinlock lockup on CPU#%d, "
+					"%s/%d, %p [%Ld/%ld]\n",
 				raw_smp_processor_id(), current->comm,
-				current->pid, lock);
+				current->pid, lock, i, loops_per_jiffy);
 			dump_stack();
 		}
 	}
@@ -169,7 +169,7 @@ static void __read_lock_debug(rwlock_t *
 	u64 i;
 
 	for (;;) {
-		for (i = 0; i < loops_per_jiffy * HZ; i++) {
+		for (i = 0; i < (u64)loops_per_jiffy * HZ; i++) {
 			if (__raw_read_trylock(&lock->raw_lock))
 				return;
 			__delay(1);
@@ -177,10 +177,10 @@ static void __read_lock_debug(rwlock_t *
 		/* lockup suspected: */
 		if (print_once) {
 			print_once = 0;
-			printk(KERN_EMERG "BUG: read-lock lockup on CPU#%d, "
-					"%s/%d, %p\n",
+			printk(KERN_EMERG "BUG: possible read-lock lockup on CPU#%d, "
+					"%s/%d, %p [%Ld/%ld]\n",
 				raw_smp_processor_id(), current->comm,
-				current->pid, lock);
+				current->pid, lock, i, loops_per_jiffy);
 			dump_stack();
 		}
 	}
@@ -242,7 +242,7 @@ static void __write_lock_debug(rwlock_t 
 	u64 i;
 
 	for (;;) {
-		for (i = 0; i < loops_per_jiffy * HZ; i++) {
+		for (i = 0; i < (u64)loops_per_jiffy * HZ; i++) {
 			if (__raw_write_trylock(&lock->raw_lock))
 				return;
 			__delay(1);
@@ -250,10 +250,10 @@ static void __write_lock_debug(rwlock_t 
 		/* lockup suspected: */
 		if (print_once) {
 			print_once = 0;
-			printk(KERN_EMERG "BUG: write-lock lockup on CPU#%d, "
-					"%s/%d, %p\n",
+			printk(KERN_EMERG "BUG: possible write-lock lockup on CPU#%d, "
+					"%s/%d, %p [%Ld/%ld]\n",
 				raw_smp_processor_id(), current->comm,
-				current->pid, lock);
+				current->pid, lock, i, loops_per_jiffy);
 			dump_stack();
 		}
 	}

  parent reply	other threads:[~2006-06-20  8:45 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-06-12 19:53 BUG: write-lock lockup Charles C. Bennett, Jr.
2006-06-17 17:07 ` Andrew Morton
2006-06-19  7:02   ` [patch] increase spinlock-debug looping timeouts from 1 sec to 1 min Ingo Molnar
2006-06-19  7:59     ` Andrew Morton
2006-06-19  8:12       ` Ingo Molnar
2006-06-19  8:21         ` Ingo Molnar
2006-06-19  8:32         ` Andrew Morton
2006-06-19  8:35           ` Ingo Molnar
2006-06-19  9:13             ` Andrew Morton
2006-06-19 11:39               ` Ingo Molnar
2006-06-19 19:55                 ` Andrew Morton
2006-06-20  8:06                   ` Arjan van de Ven
2006-06-20  8:40                   ` Ingo Molnar [this message]
2006-06-20  8:52                     ` [patch] fix spinlock-debug looping Andrew Morton
2006-06-20  9:15                       ` Ingo Molnar
2006-06-20  9:32                         ` Andrew Morton
2006-06-20  9:34                           ` Ingo Molnar
2006-06-20 16:02                           ` Dave Olson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20060620084001.GC7899@elte.hu \
    --to=mingo@elte.hu \
    --cc=akpm@osdl.org \
    --cc=ccb@acm.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox