All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@elte.hu>
To: Andrew Morton <akpm@osdl.org>
Cc: ccb@acm.org, linux-kernel@vger.kernel.org
Subject: [patch] fix spinlock-debug looping
Date: Tue, 20 Jun 2006 10:40:01 +0200	[thread overview]
Message-ID: <20060620084001.GC7899@elte.hu> (raw)
In-Reply-To: <20060619125531.4c72b8cc.akpm@osdl.org>


* Andrew Morton <akpm@osdl.org> wrote:

> > > > The write_trylock + __delay in the loop is not a problem or a bug, as 
> > > > the trylock will at most _increase_ the delay - and our goal is to not 
> > > > have a false positive, not to be absolutely accurate about the 
> > > > measurement here.
> > > 
> > > Precisely.  We have delays of over a second (but we don't know how 
> > > much more than a second).  Let's say two seconds.  The NMI watchdog 
> > > timeout is, what?  Five seconds?
> > 
> > i dont see the problem.
> 
> It's taking over a second to acquire a write_lock.  A lock which is 
> unlikely to be held for more than a microsecond anywhere.  That's 
> really bad, isn't it?  Being on the edge of an NMI watchdog induced 
> system crash is bad, too.

i obviously agree that any such crash is a serious problem, but is it 
caused by the spinlock-debugging code? I doubt it is, unless __delay() 
is seriously buggered.

in any case, to move this problem forward i'd suggest we go with the 
patch below in -mm and wait for feedback. It fixes a potential overflow 
in the comparison (if HZ*lpj overflows 32-bits) That needs a really fast 
box to run the 32-bit kernel though, so i doubt this is the cause of the 
problems. In any case, this change makes it easier to increase the 
looping timeout from 1 second to 10 seconds later on or so - at which 
point the overflow can happen for real and must be handled .

	Ingo

------------
Subject: fix spinlock-debug looping
From: Ingo Molnar <mingo@elte.hu>

make sure the right hand side of the comparison does not overflow
on 32-bits. Also print out more info when detecting a lockup, so
that we see how many times the code tried (and failed) to get the
lock.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 lib/spinlock_debug.c |   24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

Index: linux/lib/spinlock_debug.c
===================================================================
--- linux.orig/lib/spinlock_debug.c
+++ linux/lib/spinlock_debug.c
@@ -104,7 +104,7 @@ static void __spin_lock_debug(spinlock_t
 	u64 i;
 
 	for (;;) {
-		for (i = 0; i < loops_per_jiffy * HZ; i++) {
+		for (i = 0; i < (u64)loops_per_jiffy * HZ; i++) {
 			if (__raw_spin_trylock(&lock->raw_lock))
 				return;
 			__delay(1);
@@ -112,10 +112,10 @@ static void __spin_lock_debug(spinlock_t
 		/* lockup suspected: */
 		if (print_once) {
 			print_once = 0;
-			printk(KERN_EMERG "BUG: spinlock lockup on CPU#%d, "
-					"%s/%d, %p\n",
+			printk(KERN_EMERG "BUG: possible spinlock lockup on CPU#%d, "
+					"%s/%d, %p [%Ld/%ld]\n",
 				raw_smp_processor_id(), current->comm,
-				current->pid, lock);
+				current->pid, lock, i, loops_per_jiffy);
 			dump_stack();
 		}
 	}
@@ -169,7 +169,7 @@ static void __read_lock_debug(rwlock_t *
 	u64 i;
 
 	for (;;) {
-		for (i = 0; i < loops_per_jiffy * HZ; i++) {
+		for (i = 0; i < (u64)loops_per_jiffy * HZ; i++) {
 			if (__raw_read_trylock(&lock->raw_lock))
 				return;
 			__delay(1);
@@ -177,10 +177,10 @@ static void __read_lock_debug(rwlock_t *
 		/* lockup suspected: */
 		if (print_once) {
 			print_once = 0;
-			printk(KERN_EMERG "BUG: read-lock lockup on CPU#%d, "
-					"%s/%d, %p\n",
+			printk(KERN_EMERG "BUG: possible read-lock lockup on CPU#%d, "
+					"%s/%d, %p [%Ld/%ld]\n",
 				raw_smp_processor_id(), current->comm,
-				current->pid, lock);
+				current->pid, lock, i, loops_per_jiffy);
 			dump_stack();
 		}
 	}
@@ -242,7 +242,7 @@ static void __write_lock_debug(rwlock_t 
 	u64 i;
 
 	for (;;) {
-		for (i = 0; i < loops_per_jiffy * HZ; i++) {
+		for (i = 0; i < (u64)loops_per_jiffy * HZ; i++) {
 			if (__raw_write_trylock(&lock->raw_lock))
 				return;
 			__delay(1);
@@ -250,10 +250,10 @@ static void __write_lock_debug(rwlock_t 
 		/* lockup suspected: */
 		if (print_once) {
 			print_once = 0;
-			printk(KERN_EMERG "BUG: write-lock lockup on CPU#%d, "
-					"%s/%d, %p\n",
+			printk(KERN_EMERG "BUG: possible write-lock lockup on CPU#%d, "
+					"%s/%d, %p [%Ld/%ld]\n",
 				raw_smp_processor_id(), current->comm,
-				current->pid, lock);
+				current->pid, lock, i, loops_per_jiffy);
 			dump_stack();
 		}
 	}

  parent reply	other threads:[~2006-06-20  8:45 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-06-12 19:53 BUG: write-lock lockup Charles C. Bennett, Jr.
2006-06-17 17:07 ` Andrew Morton
2006-06-19  7:02   ` [patch] increase spinlock-debug looping timeouts from 1 sec to 1 min Ingo Molnar
2006-06-19  7:59     ` Andrew Morton
2006-06-19  8:12       ` Ingo Molnar
2006-06-19  8:21         ` Ingo Molnar
2006-06-19  8:32         ` Andrew Morton
2006-06-19  8:35           ` Ingo Molnar
2006-06-19  9:13             ` Andrew Morton
2006-06-19 11:39               ` Ingo Molnar
2006-06-19 19:55                 ` Andrew Morton
2006-06-20  8:06                   ` Arjan van de Ven
2006-06-20  8:40                   ` Ingo Molnar [this message]
2006-06-20  8:52                     ` [patch] fix spinlock-debug looping Andrew Morton
2006-06-20  9:15                       ` Ingo Molnar
2006-06-20  9:32                         ` Andrew Morton
2006-06-20  9:34                           ` Ingo Molnar
2006-06-20 16:02                           ` Dave Olson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20060620084001.GC7899@elte.hu \
    --to=mingo@elte.hu \
    --cc=akpm@osdl.org \
    --cc=ccb@acm.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.