public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: xfs@oss.sgi.com
Subject: [PATCH 1/5] xfs: unpin stale inodes directly in IOP_COMMITTED
Date: Mon,  4 Jul 2011 15:27:36 +1000	[thread overview]
Message-ID: <1309757260-5484-2-git-send-email-david@fromorbit.com> (raw)
In-Reply-To: <1309757260-5484-1-git-send-email-david@fromorbit.com>

From: Dave Chinner <dchinner@redhat.com>

When inodes are marked stale in a transaction, they are treated
specially when the iinode log item is being inserted into the AIL.
It trieѕ to avoid moving the log item forward in the AIL due to a
race condition with the writing the underlying buffer back to disk.
The was "fixed" in commit de25c18 ("xfs: avoid moving stale inodes in
the AIL").

To avoid moving the item forward, we return a LSN smaller than the
commit_lsn of the completing transaction, thereby trying to trick
the commit code into not moving the inode forward at all. I'm not
sure this ever worked as intended - it assumes the inode is already
in the AIL, but I don't think the returned LSN would have been small
enough to prevent moving the inode. It appears that the reason it
worked is that the lower LSN of the inodes meant they were inserted
into the AIL and flushed before the inode buffer (which was moved to
the commit_lsn of the transaction).

The big problem is that with delayed logging, the returning of the
different LSN means insertion takes the slow, non-bulk path.  Worse
yet is that insertion is to a position -before- the commit_lsn so it
is doing a AIL traversal on every insertion, and has to walk over
all the items that have already been inserted into the AIL. It's
expensive.

To compound the matter further, with delayed logging inodes are
likely to go from clean to stale in a single checkpoint, which means
they aren't even in the AIL at all when we come across them at AIL
insertion time. Hence these were all getting inserted into the AIL
when they simply do not need to be as inodes marked XFS_ISTALE are
never written back.

Transactional/recovery integrity is maintained in this case by the
other items in the unlink transaction that were modified (e.g. the
AGI btree blocks) and committed in the same checkpoint.

So to fix this, simply unpin the stale inodes directly in
xfs_inode_item_committed() and return -1 to indicate that the AIL
insertion code does not need to do any further processing of these
inodes.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_inode_item.c |   14 ++++++++------
 fs/xfs/xfs_trans.c      |    2 +-
 2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c
index 09983a3..b1e88d5 100644
--- a/fs/xfs/xfs_inode_item.c
+++ b/fs/xfs/xfs_inode_item.c
@@ -681,15 +681,15 @@ xfs_inode_item_unlock(
  * where the cluster buffer may be unpinned before the inode is inserted into
  * the AIL during transaction committed processing. If the buffer is unpinned
  * before the inode item has been committed and inserted, then it is possible
- * for the buffer to be written and IO completions before the inode is inserted
+ * for the buffer to be written and IO completes before the inode is inserted
  * into the AIL. In that case, we'd be inserting a clean, stale inode into the
  * AIL which will never get removed. It will, however, get reclaimed which
  * triggers an assert in xfs_inode_free() complaining about freein an inode
  * still in the AIL.
  *
- * To avoid this, return a lower LSN than the one passed in so that the
- * transaction committed code will not move the inode forward in the AIL but
- * will still unpin it properly.
+ * To avoid this, just unpin the inode directly and return a LSN of -1 so the
+ * transaction committed code knows that it does not need to do any further
+ * processing on the item.
  */
 STATIC xfs_lsn_t
 xfs_inode_item_committed(
@@ -699,8 +699,10 @@ xfs_inode_item_committed(
 	struct xfs_inode_log_item *iip = INODE_ITEM(lip);
 	struct xfs_inode	*ip = iip->ili_inode;
 
-	if (xfs_iflags_test(ip, XFS_ISTALE))
-		return lsn - 1;
+	if (xfs_iflags_test(ip, XFS_ISTALE)) {
+		xfs_inode_item_unpin(lip, 0);
+		return -1;
+	}
 	return lsn;
 }
 
diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index 7c7bc2b..3744337 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -1474,7 +1474,7 @@ xfs_trans_committed_bulk(
 			lip->li_flags |= XFS_LI_ABORTED;
 		item_lsn = IOP_COMMITTED(lip, commit_lsn);
 
-		/* item_lsn of -1 means the item was freed */
+		/* item_lsn of -1 means the item needs no further processing */
 		if (XFS_LSN_CMP(item_lsn, (xfs_lsn_t)-1) == 0)
 			continue;
 
-- 
1.7.5.1

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

  reply	other threads:[~2011-07-04  5:27 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-07-04  5:27 [PATCH 0/5] xfs: fix AIL bulk insert issues and cleanups Dave Chinner
2011-07-04  5:27 ` Dave Chinner [this message]
2011-07-04  8:13   ` [PATCH 1/5] xfs: unpin stale inodes directly in IOP_COMMITTED Christoph Hellwig
2011-07-06 20:45   ` Alex Elder
2011-07-04  5:27 ` [PATCH 2/5] xfs: use a cursor for bulk AIL insertion Dave Chinner
2011-07-04  8:32   ` Christoph Hellwig
2011-07-04 11:16   ` Christoph Hellwig
2011-07-04 21:20   ` Christoph Hellwig
2011-07-07 21:26     ` Alex Elder
2011-07-08  1:04       ` Dave Chinner
2011-07-07 20:29   ` Alex Elder
2011-07-04  5:27 ` [PATCH 3/5] xfs: remove confusing ail cursor wrapper Dave Chinner
2011-07-04  8:16   ` Christoph Hellwig
2011-07-07 20:33   ` Alex Elder
2011-07-04  5:27 ` [PATCH 4/5] xfs: convert AIL cursors to use struct list_head Dave Chinner
2011-07-04  8:43   ` Christoph Hellwig
2011-07-07 21:15   ` Alex Elder
2011-07-08  1:54     ` Dave Chinner
2011-07-04  5:27 ` [PATCH 5/5] xfs: add size update tracepoint to IO completion Dave Chinner
2011-07-04  8:16   ` Christoph Hellwig
2011-07-07 21:18   ` Alex Elder
2011-07-04  8:13 ` [PATCH 0/5] xfs: fix AIL bulk insert issues and cleanups Christoph Hellwig
2011-07-04 11:26   ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1309757260-5484-2-git-send-email-david@fromorbit.com \
    --to=david@fromorbit.com \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox