From: Eric Sandeen <sandeen@redhat.com>
To: xfs-oss <xfs@oss.sgi.com>
Subject: [PATCH] xfs_repair: don't unlock prefetch tree to read discontig buffers
Date: Wed, 07 May 2014 18:29:15 -0500
Message-ID: <536AC1CB.8060601@redhat.com>

The way discontiguous buffers are currently handled in
prefetch is by unlocking the prefetch tree and reading
them one at a time in pf_read_discontig(), inside the
normal loop of searching for buffers to read in a more
optimized fashion.

But by unlocking the tree, we allow other threads to come
in and find buffers which we've already stashed locally
on our bplist[].  If two threads think they own the same
set of buffers, they may both try to delete them from
the prefetch btree, and the second one to arrive will not
find them, resulting in:

	fatal error -- prefetch corruption

Fix this by maintaining two lists: the original bplist[],
and a new one containing only discontiguous buffers.

The original list can be seek-optimized as before,
and the discontiguous list can be read one by one
before we do the seek-optimized reads, after all of the
tree manipulation has been completed and the lock
has been dropped.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
---
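
For readers who want the locking pattern without the surrounding
prefetch code, the two-list approach described above can be
sketched roughly as follows. This is an illustration only, not
code from xfs_repair: struct demo_buf, struct demo_queue,
demo_delete() and demo_batch_read() are hypothetical names, a
fixed array stands in for the prefetch btree, and setting
read_done stands in for the actual buffer read.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_BUFS	4	/* hypothetical batch size for this sketch */
#define QUEUE_LEN	8	/* hypothetical size of the shared queue */

/* Hypothetical stand-in for a queued buffer; not the libxfs xfs_buf. */
struct demo_buf {
	bool	queued;		/* still present in the shared queue */
	bool	discontig;	/* must be read on its own */
	bool	read_done;	/* stands in for the completed read */
};

/* Hypothetical stand-in for the prefetch tree and its lock. */
struct demo_queue {
	pthread_mutex_t	lock;
	struct demo_buf	bufs[QUEUE_LEN];
};

/* Remove a buffer from the queue; false means someone else already did. */
static bool demo_delete(struct demo_buf *bp)
{
	if (!bp->queued)
		return false;
	bp->queued = false;
	return true;
}

/* Returns 0 on success, -1 if a claimed buffer had already been removed. */
static int demo_batch_read(struct demo_queue *q, int *nr_contig, int *nr_disc)
{
	struct demo_buf	*bplist[MAX_BUFS];
	struct demo_buf	*bplist_disc[MAX_BUFS];
	int		num = 0, num_disc = 0;
	int		i, ret = 0;

	pthread_mutex_lock(&q->lock);

	/* Sort queued buffers onto the two private lists; no I/O here. */
	for (i = 0; i < QUEUE_LEN; i++) {
		struct demo_buf *bp = &q->bufs[i];

		if (num == MAX_BUFS || num_disc == MAX_BUFS)
			break;
		if (!bp->queued)
			continue;
		if (bp->discontig)
			bplist_disc[num_disc++] = bp;
		else
			bplist[num++] = bp;
	}

	/* Take everything we found out of the queue before unlocking. */
	for (i = 0; i < num; i++)
		if (!demo_delete(bplist[i]))
			ret = -1;
	for (i = 0; i < num_disc; i++)
		if (!demo_delete(bplist_disc[i]))
			ret = -1;

	pthread_mutex_unlock(&q->lock);

	/* Lock dropped: no other thread can find these buffers any more. */
	for (i = 0; i < num_disc; i++)
		bplist_disc[i]->read_done = true;
	for (i = 0; i < num; i++)
		bplist[i]->read_done = true;

	*nr_contig = num;
	*nr_disc = num_disc;
	return ret;
}
```

The point of the sketch is the ordering: every buffer placed on
either private list is removed from the shared structure while
the lock is still held, so a second thread calling
demo_batch_read() on the same queue cannot claim the same
buffers, and the slow one-at-a-time reads only start once the
lock is released.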

diff --git a/repair/prefetch.c b/repair/prefetch.c
index 65fedf5..2a8008f 100644
--- a/repair/prefetch.c
+++ b/repair/prefetch.c
@@ -444,28 +444,7 @@ pf_read_inode_dirs(
 }
 
 /*
- * Discontiguous buffers require multiple IOs to fill, so we can't use any
- * linearising, hole filling algorithms on them to avoid seeks. Just remove them
- * for the prefetch queue and read them straight into the cache and release
- * them.
- */
-static void
-pf_read_discontig(
-	struct prefetch_args	*args,
-	struct xfs_buf		*bp)
-{
-	if (!btree_delete(args->io_queue, XFS_DADDR_TO_FSB(mp, bp->b_bn)))
-		do_error(_("prefetch corruption\n"));
-
-	pthread_mutex_unlock(&args->lock);
-	libxfs_readbufr_map(mp->m_ddev_targp, bp, 0);
-	bp->b_flags |= LIBXFS_B_UNCHECKED;
-	libxfs_putbuf(bp);
-	pthread_mutex_lock(&args->lock);
-}
-
-/*
- * pf_batch_read must be called with the lock locked.
+ * pf_batch_read must be called with the args->lock mutex locked.
  */
 static void
 pf_batch_read(
@@ -474,7 +453,8 @@ pf_batch_read(
 	void			*buf)
 {
 	xfs_buf_t		*bplist[MAX_BUFS];
-	unsigned int		num;
+	xfs_buf_t		*bplist_disc[MAX_BUFS];
+	unsigned int		num, num_disc;
 	off64_t			first_off, last_off, next_off;
 	int			len, size;
 	int			i;
@@ -484,7 +464,7 @@ pf_batch_read(
 	char			*pbuf;
 
 	for (;;) {
-		num = 0;
+		num = num_disc = 0;
 		if (which == PF_SECONDARY) {
 			bplist[0] = btree_find(args->io_queue, 0, &fsbno);
 			max_fsbno = MIN(fsbno + pf_max_fsbs,
@@ -494,18 +474,22 @@ pf_batch_read(
 						args->last_bno_read, &fsbno);
 			max_fsbno = fsbno + pf_max_fsbs;
 		}
+
 		while (bplist[num] && num < MAX_BUFS && fsbno < max_fsbno) {
 			/*
-			 * Handle discontiguous buffers outside the seek
-			 * optimised IO loop below.
+			 * Discontiguous buffers require multiple IOs to fill,
+			 * so we can't use any linearising, hole filling
+			 * algorithms on them to avoid seeks. Just move them
+			 * to their own list and read them individually later.
 			 */
 			if ((bplist[num]->b_flags & LIBXFS_B_DISCONTIG)) {
-				pf_read_discontig(args, bplist[num]);
-				bplist[num] = NULL;
+				bplist_disc[num_disc] = bplist[num];
+				num_disc++;
 			} else if (which != PF_META_ONLY ||
 				   !B_IS_INODE(XFS_BUF_PRIORITY(bplist[num])))
 				num++;
-			if (num == MAX_BUFS)
+
+			if (num == MAX_BUFS || num_disc == MAX_BUFS)
 				break;
 			bplist[num] = btree_lookup_next(args->io_queue, &fsbno);
 		}
@@ -541,12 +525,19 @@ pf_batch_read(
 			num = i;
 		}
 
+		/* Take everything we found out of the tree */
 		for (i = 0; i < num; i++) {
 			if (btree_delete(args->io_queue, XFS_DADDR_TO_FSB(mp,
 					XFS_BUF_ADDR(bplist[i]))) == NULL)
 				do_error(_("prefetch corruption\n"));
 		}
 
+		for (i = 0; i < num_disc; i++) {
+			if (btree_delete(args->io_queue, XFS_DADDR_TO_FSB(mp,
+					XFS_BUF_ADDR(bplist_disc[i]))) == NULL)
+				do_error(_("prefetch corruption\n"));
+		}
+
 		if (which == PF_PRIMARY) {
 			for (inode_bufs = 0, i = 0; i < num; i++) {
 				if (B_IS_INODE(XFS_BUF_PRIORITY(bplist[i])))
@@ -566,6 +557,12 @@ pf_batch_read(
 #endif
 		pthread_mutex_unlock(&args->lock);
 
+		/* Read discontig buffers individually, if any */
+		for (i = 0; i < num_disc; i++) {
+			libxfs_readbufr_map(mp->m_ddev_targp, bplist_disc[i], 0);
+			bplist_disc[i]->b_flags |= LIBXFS_B_UNCHECKED;
+			libxfs_putbuf(bplist_disc[i]);
+		}
 		/*
 		 * now read the data and put into the xfs_but_t's
 		 */


