public inbox for linux-xfs@vger.kernel.org
* [PATCH] xfs: force buffer writeback before blocking on the ilock in inode reclaim
@ 2011-11-20  7:23 Christoph Hellwig
  2011-11-20  7:59 ` Christoph Hellwig
  2011-11-20  8:33 ` Dave Chinner
  0 siblings, 2 replies; 4+ messages in thread
From: Christoph Hellwig @ 2011-11-20  7:23 UTC (permalink / raw)
  To: xfs

If we are doing synchronous inode reclaim we block the VM from making
progress in memory reclaim.  So if we encounter a flush-locked inode
make sure we force out all delayed buffers ASAP to speed up the wait
for it to be unlocked.  Without this we can get hangs of up to 30
seconds during workloads hitting synchronous inode reclaim.

Reported-by: Simon Kirby <sim@hostway.ca>
Tested-by: Simon Kirby <sim@hostway.ca>
Signed-off-by: Christoph Hellwig <hch@lst.de>

Index: xfs/fs/xfs/xfs_sync.c
===================================================================
--- xfs.orig/fs/xfs/xfs_sync.c	2011-11-19 20:14:52.110141228 +0100
+++ xfs/fs/xfs/xfs_sync.c	2011-11-19 20:40:17.381878121 +0100
@@ -762,7 +762,8 @@ xfs_reclaim_inode(
 	struct xfs_perag	*pag,
 	int			sync_mode)
 {
-	int	error;
+	struct xfs_mount	*mp = ip->i_mount;
+	int			error;
 
 restart:
 	error = 0;
@@ -770,12 +771,25 @@ restart:
 	if (!xfs_iflock_nowait(ip)) {
 		if (!(sync_mode & SYNC_WAIT))
 			goto out;
+
+		/*
+		 * If we only have a single dirty inode in a cluster there is
+		 * a fair chance that the AIL push may have pushed it into
+		 * the buffer, but xfsbufd won't touch it until 30 seconds
+		 * from now, and thus we will lock up here.
+		 *
+		 * Wakeup xfsbufd now, and force it to write back even
+		 * recently dirtied buffers.
+		 */
+		set_bit(XBT_FORCE_FLUSH, &mp->m_ddev_targp->bt_flags);
+		wake_up_process(mp->m_ddev_targp->bt_task);
+
 		xfs_iflock(ip);
 	}
 
 	if (is_bad_inode(VFS_I(ip)))
 		goto reclaim;
-	if (XFS_FORCED_SHUTDOWN(ip->i_mount)) {
+	if (XFS_FORCED_SHUTDOWN(mp)) {
 		xfs_iunpin_wait(ip);
 		goto reclaim;
 	}
@@ -829,8 +843,8 @@ restart:
 	 * is permanent then the next sync reclaim will reclaim the inode and
 	 * pass on the error.
 	 */
-	if (error && error != EAGAIN && !XFS_FORCED_SHUTDOWN(ip->i_mount)) {
-		xfs_warn(ip->i_mount,
+	if (error && error != EAGAIN && !XFS_FORCED_SHUTDOWN(mp)) {
+		xfs_warn(mp,
 			"inode 0x%llx background reclaim flush failed with %d",
 			(long long)ip->i_ino, error);
 	}
@@ -860,7 +874,7 @@ reclaim:
 	 */
 	spin_lock(&pag->pag_ici_lock);
 	if (!radix_tree_delete(&pag->pag_ici_root,
-				XFS_INO_TO_AGINO(ip->i_mount, ip->i_ino)))
+				XFS_INO_TO_AGINO(mp, ip->i_ino)))
 		ASSERT(0);
 	__xfs_inode_clear_reclaim(pag, ip);
 	spin_unlock(&pag->pag_ici_lock);

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


* Re: [PATCH] xfs: force buffer writeback before blocking on the ilock in inode reclaim
  2011-11-20  7:23 [PATCH] xfs: force buffer writeback before blocking on the ilock in inode reclaim Christoph Hellwig
@ 2011-11-20  7:59 ` Christoph Hellwig
  2011-11-20  8:33 ` Dave Chinner
  1 sibling, 0 replies; 4+ messages in thread
From: Christoph Hellwig @ 2011-11-20  7:59 UTC (permalink / raw)
  To: xfs; +Cc: Simon Kirby

I should probably keep Simon on Cc for this one...

On Sun, Nov 20, 2011 at 02:23:34AM -0500, Christoph Hellwig wrote:
> If we are doing synchronous inode reclaim we block the VM from making
> progress in memory reclaim.  So if we encounter a flush-locked inode
> make sure we force out all delayed buffers ASAP to speed up the wait
> for it to be unlocked.  Without this we can get hangs of up to 30
> seconds during workloads hitting synchronous inode reclaim.
> 
> Reported-by: Simon Kirby <sim@hostway.ca>
> Tested-by: Simon Kirby <sim@hostway.ca>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> 
> Index: xfs/fs/xfs/xfs_sync.c
> ===================================================================
> --- xfs.orig/fs/xfs/xfs_sync.c	2011-11-19 20:14:52.110141228 +0100
> +++ xfs/fs/xfs/xfs_sync.c	2011-11-19 20:40:17.381878121 +0100
> @@ -762,7 +762,8 @@ xfs_reclaim_inode(
>  	struct xfs_perag	*pag,
>  	int			sync_mode)
>  {
> -	int	error;
> +	struct xfs_mount	*mp = ip->i_mount;
> +	int			error;
>  
>  restart:
>  	error = 0;
> @@ -770,12 +771,25 @@ restart:
>  	if (!xfs_iflock_nowait(ip)) {
>  		if (!(sync_mode & SYNC_WAIT))
>  			goto out;
> +
> +		/*
> +		 * If we only have a single dirty inode in a cluster there is
> +		 * a fair chance that the AIL push may have pushed it into
> +		 * the buffer, but xfsbufd won't touch it until 30 seconds
> +		 * from now, and thus we will lock up here.
> +		 *
> +		 * Wakeup xfsbufd now, and force it to write back even
> +		 * recently dirtied buffers.
> +		 */
> +		set_bit(XBT_FORCE_FLUSH, &mp->m_ddev_targp->bt_flags);
> +		wake_up_process(mp->m_ddev_targp->bt_task);
> +
>  		xfs_iflock(ip);
>  	}
>  
>  	if (is_bad_inode(VFS_I(ip)))
>  		goto reclaim;
> -	if (XFS_FORCED_SHUTDOWN(ip->i_mount)) {
> +	if (XFS_FORCED_SHUTDOWN(mp)) {
>  		xfs_iunpin_wait(ip);
>  		goto reclaim;
>  	}
> @@ -829,8 +843,8 @@ restart:
>  	 * is permanent then the next sync reclaim will reclaim the inode and
>  	 * pass on the error.
>  	 */
> -	if (error && error != EAGAIN && !XFS_FORCED_SHUTDOWN(ip->i_mount)) {
> -		xfs_warn(ip->i_mount,
> +	if (error && error != EAGAIN && !XFS_FORCED_SHUTDOWN(mp)) {
> +		xfs_warn(mp,
>  			"inode 0x%llx background reclaim flush failed with %d",
>  			(long long)ip->i_ino, error);
>  	}
> @@ -860,7 +874,7 @@ reclaim:
>  	 */
>  	spin_lock(&pag->pag_ici_lock);
>  	if (!radix_tree_delete(&pag->pag_ici_root,
> -				XFS_INO_TO_AGINO(ip->i_mount, ip->i_ino)))
> +				XFS_INO_TO_AGINO(mp, ip->i_ino)))
>  		ASSERT(0);
>  	__xfs_inode_clear_reclaim(pag, ip);
>  	spin_unlock(&pag->pag_ici_lock);
> 
---end quoted text---



* Re: [PATCH] xfs: force buffer writeback before blocking on the ilock in inode reclaim
  2011-11-20  7:23 [PATCH] xfs: force buffer writeback before blocking on the ilock in inode reclaim Christoph Hellwig
  2011-11-20  7:59 ` Christoph Hellwig
@ 2011-11-20  8:33 ` Dave Chinner
  2011-11-20 11:46   ` Christoph Hellwig
  1 sibling, 1 reply; 4+ messages in thread
From: Dave Chinner @ 2011-11-20  8:33 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

On Sun, Nov 20, 2011 at 02:23:34AM -0500, Christoph Hellwig wrote:
> If we are doing synchronous inode reclaim we block the VM from making
> progress in memory reclaim.  So if we encounter a flush-locked inode
> make sure we force out all delayed buffers ASAP to speed up the wait
> for it to be unlocked.  Without this we can get hangs of up to 30
> seconds during workloads hitting synchronous inode reclaim.

I don't think we need to push out all delayed buffers - that's an
awfully big sledgehammer to get a single buffer moving.  Indeed, we
already have a mechanism for dealing with this exact problem -
xfs_buf_delwri_promote() - which we use when we hit a flush-locked
inode during AIL pushing.

IOWs, we only need to promote the buffer the inode sits in and kick
xfsbufd.  That is, something like:

	bp = xfs_incore(ip->i_mount->m_ddev_targp, iip->ili_format.ilf_blkno,
			iip->ili_format.ilf_len, XBF_TRYLOCK);
	if (bp && XFS_BUF_ISDELAYWRITE(bp)) {
		xfs_buf_delwri_promote(bp);
		wake_up_process(mp->m_ddev_targp->bt_task);
	}
	if (bp)
		xfs_buf_relse(bp);


Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com



* Re: [PATCH] xfs: force buffer writeback before blocking on the ilock in inode reclaim
  2011-11-20  8:33 ` Dave Chinner
@ 2011-11-20 11:46   ` Christoph Hellwig
  0 siblings, 0 replies; 4+ messages in thread
From: Christoph Hellwig @ 2011-11-20 11:46 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Christoph Hellwig, xfs

On Sun, Nov 20, 2011 at 07:33:45PM +1100, Dave Chinner wrote:
> On Sun, Nov 20, 2011 at 02:23:34AM -0500, Christoph Hellwig wrote:
> > If we are doing synchronous inode reclaim we block the VM from making
> > progress in memory reclaim.  So if we encounter a flush-locked inode
> > make sure we force out all delayed buffers ASAP to speed up the wait
> > for it to be unlocked.  Without this we can get hangs of up to 30
> > seconds during workloads hitting synchronous inode reclaim.
> 
> I don't think we need to push out all delayed buffers - that's an
> awfully big sledgehammer to get a single buffer moving. Indeed, we
> already have a mechanism for dealing with this problem -
> xfs_buf_delwri_promote() - when we hit it during AIL flushing.
> 
> IOWs, we only need to promote the buffer the inode sits in and kick
> xfsbufd. that is, something like:
> 
>         bp = xfs_incore(ip->i_mount->m_ddev_targp, iip->ili_format.ilf_blkno,
>                         iip->ili_format.ilf_len, XBF_TRYLOCK);
>         if (bp && XFS_BUF_ISDELAYWRITE(bp)) {
>                 xfs_buf_delwri_promote(bp);
> 		wake_up_process(mp->m_ddev_targp->bt_task);
> 	}
> 	if (bp)
> 		xfs_buf_relse(bp);

Yes, we could try a variant of that.  Note that an inode in synchronous
reclaim isn't guaranteed to actually have a log item attached, but we
can get the same information from ip->i_imap.
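
[Editor's note: to make the i_imap variant concrete, here is an untested
sketch along the lines Dave suggested, but keyed off the imap cached in
the inode rather than the log item.  It assumes ip->i_imap has been
filled in (true once the inode has been read off disk) and that the
surrounding function already holds a reference to ip; names match the
2011-era code base discussed in this thread.]

	/*
	 * Sketch only: look up the inode cluster buffer from the imap
	 * cached in the inode, which is available even when no inode
	 * log item is attached.  If it is sitting on the delwri queue,
	 * promote it to the head and wake xfsbufd so it gets written
	 * on the next pass instead of after the 30 second age limit.
	 */
	struct xfs_mount	*mp = ip->i_mount;
	struct xfs_buf		*bp;

	bp = xfs_incore(mp->m_ddev_targp, ip->i_imap.im_blkno,
			ip->i_imap.im_len, XBF_TRYLOCK);
	if (bp) {
		if (XFS_BUF_ISDELAYWRITE(bp)) {
			xfs_buf_delwri_promote(bp);
			wake_up_process(mp->m_ddev_targp->bt_task);
		}
		xfs_buf_relse(bp);
	}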



end of thread, other threads:[~2011-11-20 11:46 UTC | newest]

