From: Simon Kirby <sim@hostway.ca>
To: Christoph Hellwig <hch@infradead.org>
Cc: linux-kernel@vger.kernel.org, xfs@oss.sgi.com
Subject: Re: XFS read hangs in 3.1-rc10
Date: Tue, 25 Oct 2011 13:07:48 -0700
Message-ID: <20111025200748.GA25043@hostway.ca>
In-Reply-To: <20111024082219.GA19941@infradead.org>

On Mon, Oct 24, 2011 at 04:22:19AM -0400, Christoph Hellwig wrote:

> On Fri, Oct 21, 2011 at 01:28:57PM -0700, Simon Kirby wrote:
> > > So we're waiting for the inode to be flushed, aka I/O again.
> > 
> > But I don't seem to see any queued I/O, hmm.
> 
> Well, as far as XFS is concerned the inode is being flushed and
> the buffer is locked.  It could be stuck in the XFS internal delwri
> list because, for example, a buffer is pinned.
> 
> If that is the case, try the big hammer patch attached below. It is
> probably not the final fix, but it should resolve the hang.
> 
> > > If this doesn't help I'll probably need to come up with some tracing
> > > patches for you.
> > 
> > It seems 3.0.7 + gregkh's stable-queue queue-3.0 patches are
> > running fine without blocking at all on this SSD box, so that should
> > narrow it down significantly.
> > 
> > Hmm, looking at git diff --stat v3.0.7..v3.1-rc10 fs/xfs , maybe not.. :)
> > 
> > Maybe 3.1 fs/xfs would transplant into 3.0 or vice-versa?
> 
> If the patch above doesn't work I'll prepare a backport for you.
> 
> Index: linux-2.6/fs/xfs/xfs_sync.c
> ===================================================================
> --- linux-2.6.orig/fs/xfs/xfs_sync.c	2011-10-24 10:02:27.361971264 +0200
> +++ linux-2.6/fs/xfs/xfs_sync.c	2011-10-24 10:11:03.301036954 +0200
> @@ -764,7 +764,8 @@ xfs_reclaim_inode(
>  	struct xfs_perag	*pag,
>  	int			sync_mode)
>  {
> -	int	error;
> +	struct xfs_mount	*mp = ip->i_mount;
> +	int			error;
>  
>  restart:
>  	error = 0;
> @@ -772,6 +773,18 @@ restart:
>  	if (!xfs_iflock_nowait(ip)) {
>  		if (!(sync_mode & SYNC_WAIT))
>  			goto out;
> +
> +		/*
> +		 * If the inode is flush locked we probably had someone else
> +		 * push it to the buffer and the buffer is now sitting in
> +		 * the delwri list.
> +		 *
> +		 * Use the big hammer to force it.
> +		 */
> +		xfs_log_force(mp, XFS_LOG_SYNC);
> +		set_bit(XBT_FORCE_FLUSH, &mp->m_ddev_targp->bt_flags);
> +		wake_up_process(mp->m_ddev_targp->bt_task);
> +
>  		xfs_iflock(ip);
>  	}
>  

This patch seems to work, at least on an SSD box. No more hung task
warnings, and everything appears normal.
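
To make the control flow of the quoted patch concrete, here is a minimal
single-threaded userspace sketch in C. It is a toy model under stated
assumptions, not kernel code: every toy_* name, the TOY_* return codes and
the big_hammer switch are hypothetical and exist only for illustration.
The model also assumes the buffer daemon writes delwri buffers only when
forced, and that a log force is what unpins the buffer.

```c
#include <assert.h>
#include <stdbool.h>

#define SYNC_WAIT 0x1   /* stand-in for the kernel flag of the same name */

enum { TOY_RECLAIMED = 0, TOY_SKIPPED = -1, TOY_WOULD_HANG = -2 };

/* Toy stand-ins; none of these are the real XFS structures. */
struct toy_buftarg {
    bool force_flush;   /* models XBT_FORCE_FLUSH in bt_flags */
    int  delwri_queued; /* buffers parked on the delwri list */
    bool pinned;        /* buffer still pinned by the log */
};

struct toy_inode {
    bool flush_locked;  /* models the inode flush lock */
    struct toy_buftarg *target;
};

/* Models xfs_iflock_nowait(): take the flush lock only if it is free. */
static bool toy_iflock_nowait(struct toy_inode *ip)
{
    if (ip->flush_locked)
        return false;
    ip->flush_locked = true;
    return true;
}

/* Models xfs_log_force(mp, XFS_LOG_SYNC): assumed here to unpin the buffer. */
static void toy_log_force(struct toy_buftarg *bt)
{
    bt->pinned = false;
}

/*
 * Models one pass of the buffer daemon. Assumption: it writes the delwri
 * list only when forced, and I/O completion drops the flush lock.
 */
static void toy_bufd_run(struct toy_inode *ip)
{
    struct toy_buftarg *bt = ip->target;

    if (!bt->force_flush)
        return;
    bt->force_flush = false;
    if (bt->delwri_queued > 0 && !bt->pinned) {
        bt->delwri_queued = 0;
        ip->flush_locked = false;
    }
}

/* Mirrors the shape of the patched branch in xfs_reclaim_inode(). */
static int toy_reclaim_inode(struct toy_inode *ip, int sync_mode,
                             bool big_hammer)
{
    assert(ip && ip->target);

    if (!toy_iflock_nowait(ip)) {
        if (!(sync_mode & SYNC_WAIT))
            return TOY_SKIPPED;

        if (big_hammer) {
            toy_log_force(ip->target);        /* xfs_log_force()       */
            ip->target->force_flush = true;   /* set_bit(XBT_FORCE...) */
            toy_bufd_run(ip);                 /* wake_up_process()     */
        }

        /* Blocking xfs_iflock(): succeeds only once the lock was dropped. */
        if (!toy_iflock_nowait(ip))
            return TOY_WOULD_HANG;
    }
    return TOY_RECLAIMED;
}
```

In this model, an inode whose flush lock is held while its buffer sits
pinned on the delwri list is skipped without SYNC_WAIT, reports
TOY_WOULD_HANG with SYNC_WAIT and no hammer, and is reclaimed once the
hammer path runs. It says nothing about why the buffer got stuck in the
first place.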

Do we know what caused this regression and/or how to fix it without the
big hammer, or do we need to break it down further?

Thanks!

Simon-

