From: Christoph Hellwig <hch@infradead.org>
To: Arkadiusz Miśkiewicz <arekm@maven.pl>
Cc: Christoph Hellwig <hch@infradead.org>, xfs@oss.sgi.com
Subject: Re: [PATCH] repair: avoid ABBA deadlocks on prefetched buffers
Date: Tue, 22 Nov 2011 17:46:20 -0500
Message-ID: <20111122224620.GA20107@infradead.org>
In-Reply-To: <201111180944.10048.arekm@maven.pl>

On Fri, Nov 18, 2011 at 09:44:09AM +0100, Arkadiusz Miśkiewicz wrote:
> On Tuesday 15 of November 2011, Christoph Hellwig wrote:
> > Both the prefetch threads and actual repair processing threads can have
> > multiple buffers locked at a time, but they do not use a common lock
> > order, which can lead to ABBA deadlocks while trying to lock the buffers.
> 
> There is still some issue with deadlocking.
> 
> The last printed messages:
> błędna liczba magiczna 0x41425443 w bloku inobt 2/1438099
> błędna liczba magiczna 0x41425443 w bloku inobt 2/1438196
> błędna liczba magiczna 0x41425443 w bloku inobt 2/1438732
> (invalid magic number ... in block inobt ...)

It looks like you have a loop in the inobt tree, and repair deadlocks
trying to read the same node again.  Below is a patch that works around
this by allowing recursive locking of the buffer lock and then letting
the normal two-strikes-and-out policy apply.  I'm not overly proud of
the patch, but in the short term I can't think of anything better.
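
As a standalone illustration of the holder-plus-counter idea, here is a
minimal sketch that is separate from the patch below.  All names in it
(demo_buf, demo_buf_lock, demo_buf_unlock, demo_recursive_depth) are
hypothetical.  It assumes a default (non-recursive) mutex, adds a separate
"held" flag because pthread_t has no portable null value, and checks for
EBUSY, which is the value POSIX documents for pthread_mutex_trylock() on
an already locked mutex.  Like the patch, it reads the holder field without
holding the lock, so treat it as a sketch rather than a hardened primitive.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the lock-related fields of a buffer. */
struct demo_buf {
	pthread_mutex_t	lock;
	pthread_t	holder;	/* meaningful only while held is true */
	bool		held;
	unsigned int	recur;	/* extra acquisitions by the holder */
};

static void demo_buf_init(struct demo_buf *bp)
{
	pthread_mutex_init(&bp->lock, NULL);
	bp->held = false;
	bp->recur = 0;
}

/* Returns 0 on success, EBUSY if trylock was requested and the lock is taken. */
static int demo_buf_lock(struct demo_buf *bp, bool trylock)
{
	int ret = pthread_mutex_trylock(&bp->lock);

	if (ret == EBUSY) {
		if (trylock)
			return EBUSY;
		if (bp->held && pthread_equal(bp->holder, pthread_self())) {
			/* We already own the mutex: count instead of blocking. */
			bp->recur++;
			return 0;
		}
		/* Someone else owns it: wait as usual. */
		ret = pthread_mutex_lock(&bp->lock);
	}
	if (ret)
		return ret;

	bp->holder = pthread_self();
	bp->held = true;
	return 0;
}

static void demo_buf_unlock(struct demo_buf *bp)
{
	if (bp->recur) {
		/* Undo one recursive acquisition; the mutex stays locked. */
		bp->recur--;
		return;
	}
	bp->held = false;
	pthread_mutex_unlock(&bp->lock);
}

/* Locks the same buffer twice from one thread and reports the depth seen. */
static unsigned int demo_recursive_depth(void)
{
	struct demo_buf buf;
	unsigned int depth;
	int ret;

	demo_buf_init(&buf);
	ret = demo_buf_lock(&buf, false);
	assert(ret == 0);
	ret = demo_buf_lock(&buf, false);	/* a plain lock would hang here */
	assert(ret == 0);
	depth = buf.recur;
	demo_buf_unlock(&buf);
	demo_buf_unlock(&buf);
	pthread_mutex_destroy(&buf.lock);
	return depth;
}
```

Calling demo_recursive_depth() from a small main() linked with -pthread
exercises the second acquisition path.  A PTHREAD_MUTEX_RECURSIVE mutex
would give similar semantics, but would not provide a hook for reporting
the recursion the way the patch does.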


Index: xfsprogs-dev/include/libxfs.h
===================================================================
--- xfsprogs-dev.orig/include/libxfs.h	2011-11-22 22:28:23.000000000 +0000
+++ xfsprogs-dev/include/libxfs.h	2011-11-22 22:34:27.000000000 +0000
@@ -226,6 +226,8 @@ typedef struct xfs_buf {
 	unsigned		b_bcount;
 	dev_t			b_dev;
 	pthread_mutex_t		b_lock;
+	pthread_t		b_holder;
+	unsigned int		b_recur;
 	void			*b_fsprivate;
 	void			*b_fsprivate2;
 	void			*b_fsprivate3;
Index: xfsprogs-dev/libxfs/rdwr.c
===================================================================
--- xfsprogs-dev.orig/libxfs/rdwr.c	2011-11-22 22:28:23.000000000 +0000
+++ xfsprogs-dev/libxfs/rdwr.c	2011-11-22 22:40:01.000000000 +0000
@@ -342,6 +342,8 @@ libxfs_initbuf(xfs_buf_t *bp, dev_t devi
 	list_head_init(&bp->b_lock_list);
 #endif
 	pthread_mutex_init(&bp->b_lock, NULL);
+	bp->b_holder = 0;
+	bp->b_recur = 0;
 }
 
 xfs_buf_t *
@@ -410,18 +412,24 @@ libxfs_getbuf_flags(dev_t device, xfs_da
 		return NULL;
 
 	if (use_xfs_buf_lock) {
-		if (flags & LIBXFS_GETBUF_TRYLOCK) {
-			int ret;
+		int ret;
 
-			ret = pthread_mutex_trylock(&bp->b_lock);
-			if (ret) {
-				ASSERT(ret == EAGAIN);
-				cache_node_put(libxfs_bcache, (struct cache_node *)bp);
-				return NULL;
+		ret = pthread_mutex_trylock(&bp->b_lock);
+		if (ret) {
+			ASSERT(ret == EAGAIN);
+			if (flags & LIBXFS_GETBUF_TRYLOCK)
+				goto out_put;
+
+			if (pthread_equal(bp->b_holder, pthread_self())) {
+				fprintf(stderr,
+	_("recursive buffer locking detected\n"));
+				bp->b_recur++;
+			} else {
+				pthread_mutex_lock(&bp->b_lock);
 			}
-		} else {
-			pthread_mutex_lock(&bp->b_lock);
 		}
+
+		bp->b_holder = pthread_self();
 	}
 
 	cache_node_set_priority(libxfs_bcache, (struct cache_node *)bp,
@@ -440,6 +448,9 @@ libxfs_getbuf_flags(dev_t device, xfs_da
 #endif
 
 	return bp;
+out_put:
+	cache_node_put(libxfs_bcache, (struct cache_node *)bp);
+	return NULL;
 }
 
 struct xfs_buf *
@@ -458,8 +469,14 @@ libxfs_putbuf(xfs_buf_t *bp)
 	list_del_init(&bp->b_lock_list);
 	pthread_mutex_unlock(&libxfs_bcache->c_mutex);
 #endif
-	if (use_xfs_buf_lock)
-		pthread_mutex_unlock(&bp->b_lock);
+	if (use_xfs_buf_lock) {
+		if (bp->b_recur) {
+			bp->b_recur--;
+		} else {
+			bp->b_holder = 0;
+			pthread_mutex_unlock(&bp->b_lock);
+		}
+	}
 	cache_node_put(libxfs_bcache, (struct cache_node *)bp);
 }
 

Thread overview: 6+ messages
2011-11-15 21:09 [PATCH] repair: avoid ABBA deadlocks on prefetched buffers Christoph Hellwig
2011-11-17  4:25 ` Dave Chinner
2011-11-18  8:44 ` Arkadiusz Miśkiewicz
2011-11-22 22:46   ` Christoph Hellwig [this message]
2011-11-23 17:27     ` Arkadiusz Miśkiewicz
2012-01-13 20:09 ` Mark Tinguely
