public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: xfs@oss.sgi.com
Subject: [PATCH 06/37] xfs: do not write the buffer from xfs_iflush
Date: Mon, 23 Apr 2012 15:58:36 +1000	[thread overview]
Message-ID: <1335160747-17254-7-git-send-email-david@fromorbit.com> (raw)
In-Reply-To: <1335160747-17254-1-git-send-email-david@fromorbit.com>

From: Christoph Hellwig <hch@infradead.org>

Instead of writing the buffer directly from inside xfs_iflush return it to
the caller and let the caller decide what to do with the buffer.  Also
remove the pincount check in xfs_iflush that all non-blocking callers already
implement and the now unused flags parameter.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
---
 fs/xfs/xfs_inode.c      |   54 ++++++++++++++---------------------------------
 fs/xfs/xfs_inode.h      |    2 +-
 fs/xfs/xfs_inode_item.c |   17 ++++++++++++++-
 fs/xfs/xfs_sync.c       |   29 +++++++++++++------------
 4 files changed, 48 insertions(+), 54 deletions(-)

diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 00f9c2f..0fa987d 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -2384,22 +2384,22 @@ cluster_corrupt_out:
 }
 
 /*
- * xfs_iflush() will write a modified inode's changes out to the
- * inode's on disk home.  The caller must have the inode lock held
- * in at least shared mode and the inode flush completion must be
- * active as well.  The inode lock will still be held upon return from
- * the call and the caller is free to unlock it.
- * The inode flush will be completed when the inode reaches the disk.
- * The flags indicate how the inode's buffer should be written out.
+ * Flush dirty inode metadata into the backing buffer.
+ *
+ * The caller must have the inode lock and the inode flush lock held.  The
+ * inode lock will still be held upon return to the caller, and the inode
+ * flush lock will be released after the inode has reached the disk.
+ *
+ * The caller must write out the buffer returned in *bpp and release it.
  */
 int
 xfs_iflush(
-	xfs_inode_t		*ip,
-	uint			flags)
+	struct xfs_inode	*ip,
+	struct xfs_buf		**bpp)
 {
-	xfs_buf_t		*bp;
-	xfs_dinode_t		*dip;
-	xfs_mount_t		*mp;
+	struct xfs_mount	*mp = ip->i_mount;
+	struct xfs_buf		*bp;
+	struct xfs_dinode	*dip;
 	int			error;
 
 	XFS_STATS_INC(xs_iflush_count);
@@ -2409,24 +2409,8 @@ xfs_iflush(
 	ASSERT(ip->i_d.di_format != XFS_DINODE_FMT_BTREE ||
 	       ip->i_d.di_nextents > XFS_IFORK_MAXEXT(ip, XFS_DATA_FORK));
 
-	mp = ip->i_mount;
+	*bpp = NULL;
 
-	/*
-	 * We can't flush the inode until it is unpinned, so wait for it if we
-	 * are allowed to block.  We know no one new can pin it, because we are
-	 * holding the inode lock shared and you need to hold it exclusively to
-	 * pin the inode.
-	 *
-	 * If we are not allowed to block, force the log out asynchronously so
-	 * that when we come back the inode will be unpinned. If other inodes
-	 * in the same cluster are dirty, they will probably write the inode
-	 * out for us if they occur after the log force completes.
-	 */
-	if (!(flags & SYNC_WAIT) && xfs_ipincount(ip)) {
-		xfs_iunpin(ip);
-		xfs_ifunlock(ip);
-		return EAGAIN;
-	}
 	xfs_iunpin_wait(ip);
 
 	/*
@@ -2458,8 +2442,7 @@ xfs_iflush(
 	/*
 	 * Get the buffer containing the on-disk inode.
 	 */
-	error = xfs_itobp(mp, NULL, ip, &dip, &bp,
-				(flags & SYNC_TRYLOCK) ? XBF_TRYLOCK : XBF_LOCK);
+	error = xfs_itobp(mp, NULL, ip, &dip, &bp, XBF_TRYLOCK);
 	if (error || !bp) {
 		xfs_ifunlock(ip);
 		return error;
@@ -2487,13 +2470,8 @@ xfs_iflush(
 	if (error)
 		goto cluster_corrupt_out;
 
-	if (flags & SYNC_WAIT)
-		error = xfs_bwrite(bp);
-	else
-		xfs_buf_delwri_queue(bp);
-
-	xfs_buf_relse(bp);
-	return error;
+	*bpp = bp;
+	return 0;
 
 corrupt_out:
 	xfs_buf_relse(bp);
diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index 7fee338..a2fa79a 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -529,7 +529,7 @@ int		xfs_iunlink(struct xfs_trans *, xfs_inode_t *);
 
 void		xfs_iext_realloc(xfs_inode_t *, int, int);
 void		xfs_iunpin_wait(xfs_inode_t *);
-int		xfs_iflush(xfs_inode_t *, uint);
+int		xfs_iflush(struct xfs_inode *, struct xfs_buf **);
 void		xfs_promote_inode(struct xfs_inode *);
 void		xfs_lock_inodes(xfs_inode_t **, int, uint);
 void		xfs_lock_two_inodes(xfs_inode_t *, xfs_inode_t *, uint);
diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c
index 05d924e..d3601ab 100644
--- a/fs/xfs/xfs_inode_item.c
+++ b/fs/xfs/xfs_inode_item.c
@@ -506,6 +506,15 @@ xfs_inode_item_trylock(
 	if (!xfs_ilock_nowait(ip, XFS_ILOCK_SHARED))
 		return XFS_ITEM_LOCKED;
 
+	/*
+	 * Re-check the pincount now that we stabilized the value by
+	 * taking the ilock.
+	 */
+	if (xfs_ipincount(ip) > 0) {
+		xfs_iunlock(ip, XFS_ILOCK_SHARED);
+		return XFS_ITEM_PINNED;
+	}
+
 	if (!xfs_iflock_nowait(ip)) {
 		/*
 		 * inode has already been flushed to the backing buffer,
@@ -666,6 +675,8 @@ xfs_inode_item_push(
 {
 	struct xfs_inode_log_item *iip = INODE_ITEM(lip);
 	struct xfs_inode	*ip = iip->ili_inode;
+	struct xfs_buf		*bp = NULL;
+	int			error;
 
 	ASSERT(xfs_isilocked(ip, XFS_ILOCK_SHARED));
 	ASSERT(xfs_isiflocked(ip));
@@ -689,7 +700,11 @@ xfs_inode_item_push(
 	 * will pull the inode from the AIL, mark it clean and unlock the flush
 	 * lock.
 	 */
-	(void) xfs_iflush(ip, SYNC_TRYLOCK);
+	error = xfs_iflush(ip, &bp);
+	if (!error) {
+		xfs_buf_delwri_queue(bp);
+		xfs_buf_relse(bp);
+	}
 	xfs_iunlock(ip, XFS_ILOCK_SHARED);
 }
 
diff --git a/fs/xfs/xfs_sync.c b/fs/xfs/xfs_sync.c
index 89f14e5..8f11e1f 100644
--- a/fs/xfs/xfs_sync.c
+++ b/fs/xfs/xfs_sync.c
@@ -648,10 +648,6 @@ xfs_reclaim_inode_grab(
  * (*) dgc: I don't think the clean, pinned state is possible but it gets
  * handled anyway given the order of checks implemented.
  *
- * As can be seen from the table, the return value of xfs_iflush() is not
- * sufficient to correctly decide the reclaim action here. The checks in
- * xfs_iflush() might look like duplicates, but they are not.
- *
  * Also, because we get the flush lock first, we know that any inode that has
  * been flushed delwri has had the flush completed by the time we check that
  * the inode is clean.
@@ -679,7 +675,8 @@ xfs_reclaim_inode(
 	struct xfs_perag	*pag,
 	int			sync_mode)
 {
-	int	error;
+	struct xfs_buf		*bp = NULL;
+	int			error;
 
 restart:
 	error = 0;
@@ -728,29 +725,33 @@ restart:
 	/*
 	 * Now we have an inode that needs flushing.
 	 *
-	 * We do a nonblocking flush here even if we are doing a SYNC_WAIT
-	 * reclaim as we can deadlock with inode cluster removal.
+	 * Note that xfs_iflush will never block on the inode buffer lock, as
 	 * xfs_ifree_cluster() can lock the inode buffer before it locks the
-	 * ip->i_lock, and we are doing the exact opposite here. As a result,
-	 * doing a blocking xfs_itobp() to get the cluster buffer will result
+	 * ip->i_lock, and we are doing the exact opposite here.  As a result,
+	 * doing a blocking xfs_itobp() to get the cluster buffer would result
 	 * in an ABBA deadlock with xfs_ifree_cluster().
 	 *
 	 * As xfs_ifree_cluser() must gather all inodes that are active in the
 	 * cache to mark them stale, if we hit this case we don't actually want
 	 * to do IO here - we want the inode marked stale so we can simply
-	 * reclaim it. Hence if we get an EAGAIN error on a SYNC_WAIT flush,
-	 * just unlock the inode, back off and try again. Hopefully the next
-	 * pass through will see the stale flag set on the inode.
+	 * reclaim it.  Hence if we get an EAGAIN error here,  just unlock the
+	 * inode, back off and try again.  Hopefully the next pass through will
+	 * see the stale flag set on the inode.
 	 */
-	error = xfs_iflush(ip, SYNC_TRYLOCK | sync_mode);
+	error = xfs_iflush(ip, &bp);
 	if (error == EAGAIN) {
 		xfs_iunlock(ip, XFS_ILOCK_EXCL);
 		/* backoff longer than in xfs_ifree_cluster */
 		delay(2);
 		goto restart;
 	}
-	xfs_iflock(ip);
 
+	if (!error) {
+		error = xfs_bwrite(bp);
+		xfs_buf_relse(bp);
+	}
+
+	xfs_iflock(ip);
 reclaim:
 	xfs_ifunlock(ip);
 	xfs_iunlock(ip, XFS_ILOCK_EXCL);
-- 
1.7.9.5

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

  parent reply	other threads:[~2012-04-23  5:59 UTC|newest]

Thread overview: 96+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-04-23  5:58 [PATCH 00/37] xfs: current 3.4 patch queue Dave Chinner
2012-04-23  5:58 ` [PATCH 01/37] xfs: remove log item from AIL in xfs_qm_dqflush after a shutdown Dave Chinner
2012-04-23  5:58 ` [PATCH 02/37] xfs: remove log item from AIL in xfs_iflush " Dave Chinner
2012-04-23 15:39   ` Mark Tinguely
2012-04-23  5:58 ` [PATCH 03/37] xfs: allow assigning the tail lsn with the AIL lock held Dave Chinner
2012-04-23  5:58 ` [PATCH 04/37] xfs: implement freezing by emptying the AIL Dave Chinner
2012-04-23 15:40   ` Mark Tinguely
2012-04-29 21:43   ` Christoph Hellwig
2012-04-23  5:58 ` [PATCH 05/37] xfs: don't flush inodes from background inode reclaim Dave Chinner
2012-04-23  5:58 ` Dave Chinner [this message]
2012-04-23  5:58 ` [PATCH 07/37] xfs: do not write the buffer from xfs_qm_dqflush Dave Chinner
2012-04-23  5:58 ` [PATCH 08/37] xfs: do not add buffers to the delwri queue until pushed Dave Chinner
2012-04-23  5:58 ` [PATCH 09/37] xfs: on-stack delayed write buffer lists Dave Chinner
2012-04-25 18:34   ` Mark Tinguely
2012-04-29 21:44   ` Christoph Hellwig
2012-04-23  5:58 ` [PATCH 10/37] xfs: remove some obsolete comments in xfs_trans_ail.c Dave Chinner
2012-04-23 15:41   ` Mark Tinguely
2012-04-23  5:58 ` [PATCH 11/37] xfs: pass shutdown method into xfs_trans_ail_delete_bulk Dave Chinner
2012-04-23  5:58 ` [PATCH 12/37] xfs: Do background CIL flushes via a workqueue Dave Chinner
2012-04-23  7:54   ` [PATCH 12/37 V2] " Dave Chinner
2012-04-29 21:46     ` Christoph Hellwig
2012-04-23  5:58 ` [PATCH 13/37] xfs: page type check in writeback only checks last buffer Dave Chinner
2012-04-23  5:58 ` [PATCH 14/37] xfs: Use preallocation for inodes with extsz hints Dave Chinner
2012-04-29 21:47   ` Christoph Hellwig
2012-04-23  5:58 ` [PATCH 15/37] xfs: fix buffer lookup race on allocation failure Dave Chinner
2012-04-23  5:58 ` [PATCH 16/37] xfs: check for buffer errors before waiting Dave Chinner
2012-04-23  5:58 ` [PATCH 17/37] xfs: fix incorrect b_offset initialisation Dave Chinner
2012-04-23  5:58 ` [PATCH 18/37] xfs: use kmem_zone_zalloc for buffers Dave Chinner
2012-04-23  5:58 ` [PATCH 19/37] xfs: clean up buffer get/read call API Dave Chinner
2012-04-23  5:58 ` [PATCH 20/37] xfs: kill b_file_offset Dave Chinner
2012-04-23  5:58 ` [PATCH 21/37] xfs: use blocks for counting length of buffers Dave Chinner
2012-04-23  5:58 ` [PATCH 22/37] xfs: use blocks for storing the desired IO size Dave Chinner
2012-04-23  5:58 ` [PATCH 23/37] xfs: kill xfs_buf_btoc Dave Chinner
2012-04-23  5:58 ` [PATCH 24/37] xfs: kill XBF_LOCK Dave Chinner
2012-04-23  5:58 ` [PATCH 25/37] xfs: kill xfs_read_buf() Dave Chinner
2012-04-23  5:58 ` [PATCH 26/37] xfs: kill XBF_DONTBLOCK Dave Chinner
2012-04-23  5:58 ` [PATCH 27/37] xfs: use iolock on XFS_IOC_ALLOCSP calls Dave Chinner
2012-04-23  5:58 ` [PATCH 28/37] xfs: move xfsagino_t to xfs_types.h Dave Chinner
2012-04-23 15:43   ` Mark Tinguely
2012-04-24 15:10   ` Mark Tinguely
2012-04-29 21:49   ` Christoph Hellwig
2012-04-30  0:32     ` Dave Chinner
2012-04-23  5:58 ` [PATCH 29/37] xfs: move busy extent handling to it's own file Dave Chinner
2012-04-23 17:57   ` Ben Myers
2012-04-24  0:25     ` [PATCH 29/37 V2] " Dave Chinner
2012-04-24 15:56       ` Mark Tinguely
2012-04-24 18:10         ` Mark Tinguely
2012-04-29 10:39           ` [PATCH 29/37 V3] " Dave Chinner
2012-04-29 21:50             ` Christoph Hellwig
2012-04-30  0:36               ` Dave Chinner
2012-04-30  2:17                 ` Dave Chinner
2012-04-23  5:59 ` [PATCH 30/37] xfs: clean up busy extent naming Dave Chinner
2012-04-24 18:11   ` Mark Tinguely
2012-04-29 10:41     ` [PATCH 30/37 V2] " Dave Chinner
2012-04-29 21:50       ` Christoph Hellwig
2012-04-23  5:59 ` [PATCH 31/37] xfs: move xfs_fsb_to_db to xfs_bmap.h Dave Chinner
2012-04-24 19:24   ` Mark Tinguely
2012-04-29 21:53   ` Christoph Hellwig
2012-04-30  2:31     ` Dave Chinner
2012-04-23  5:59 ` [PATCH 32/37] xfs: move xfs_get_extsz_hint() and kill xfs_rw.h Dave Chinner
2012-04-24 19:30   ` Mark Tinguely
2012-04-29 21:53   ` Christoph Hellwig
2012-04-23  5:59 ` [PATCH 33/37] xfs: move xfs_do_force_shutdown() and kill xfs_rw.c Dave Chinner
2012-04-24 19:37   ` Mark Tinguely
2012-04-29 21:54   ` Christoph Hellwig
2012-04-30  2:38     ` Dave Chinner
2012-04-23  5:59 ` [PATCH 34/37] xfs: clean up xfs_bit.h includes Dave Chinner
2012-04-24 19:44   ` Mark Tinguely
2012-04-29 21:55   ` Christoph Hellwig
2012-04-30  2:40     ` Dave Chinner
2012-04-23  5:59 ` [PATCH 35/37] xfs: Properly exclude IO type flags from buffer flags Dave Chinner
2012-04-24 20:02   ` Mark Tinguely
2012-04-29 21:55   ` Christoph Hellwig
2012-04-23  5:59 ` [PATCH 36/37] xfs: flush outstanding buffers on log mount failure Dave Chinner
2012-04-23 15:47   ` Mark Tinguely
2012-04-29 21:55   ` Christoph Hellwig
2012-04-23  5:59 ` [PATCH 37/37] xfs: make XBF_MAPPED the default behaviour Dave Chinner
2012-04-25 18:35   ` Mark Tinguely
2012-04-25 20:09   ` Mark Tinguely
2012-04-25 22:33     ` Dave Chinner
2012-04-29 21:57   ` Christoph Hellwig
2012-04-30  2:45     ` Dave Chinner
2012-04-23 18:01 ` [PATCH 00/37] xfs: current 3.4 patch queue Ben Myers
2012-04-23 23:29   ` Dave Chinner
2012-04-30 14:24     ` Ben Myers
2012-04-28  2:15 ` Ben Myers
2012-04-28 21:28   ` Ben Myers
2012-04-29  0:21     ` Dave Chinner
2012-04-29  0:14   ` Dave Chinner
2012-04-30 14:44     ` Ben Myers
2012-04-30 23:04       ` Dave Chinner
2012-04-30 14:32 ` Assertion failed: RB_EMPTY_NODE(&bp->b_rbnode) Ben Myers
2012-04-30 23:12   ` Dave Chinner
2012-04-30 14:34 ` [PATCH 00/37] xfs: current 3.4 patch queue Ben Myers
2012-04-30 23:20   ` Dave Chinner
2012-04-30 19:25 ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1335160747-17254-7-git-send-email-david@fromorbit.com \
    --to=david@fromorbit.com \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox