public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: Peter Leckie <pleckie@sgi.com>, xfs@oss.sgi.com
Subject: Re: crash with latest code drop.
Date: Wed, 15 Oct 2008 14:16:46 +1100	[thread overview]
Message-ID: <20081015031645.GA25906@disturbed> (raw)
In-Reply-To: <20081015022948.GA20966@infradead.org>

On Tue, Oct 14, 2008 at 10:29:49PM -0400, Christoph Hellwig wrote:
> > +		 * in reclaim. Leave it for the reclaim code to flush.
> >  		 */
> >  		if (igrab(inode)) {
> >  			read_unlock(&pag->pag_ici_lock);
> >  			xfs_ilock(ip, lock_flags);
> >  		} else {
> > +			/* leave it to reclaim */
> >  			read_unlock(&pag->pag_ici_lock);
> > +			continue;
> >  		}
> 
> Might be betters as
> 
> 		if (!igrab(inode)) {
> 			/* leave it to reclaim */
> 			read_unlock(&pag->pag_ici_lock);
> 			continue;
> 		}
> 
> 		read_unlock(&pag->pag_ici_lock);
> 		xfs_ilock(ip, lock_flags);

Yes, saner and more consistent with other code to do it that way.

> which then also shows that we could stop doing the ilock at all for
> the DELWRI case, thas is after fixing the last caller doing
> SYNC_ATTR|SYNC_DELWRI.  Well yeah, lots of things still to sort out in
> this area.

Here's an updated patch that takes into account all this. it removes
a fair chunk of code now....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


XFS: avoid all reclaimable inodes in xfs_sync_inodes_ag

If we are syncing data in xfs_sync_inodes_ag(), the VFS
inode must still be referencable as the dirty data state
is carried on the VFS inode. hence if we can't get a
reference via igrab(), the inode must be in reclaim which
implies that it has no dirty data attached.

Leave such inodes to the reclaim code to flush the dirty
inode state to disk and so avoid attempting to access the
VFS inode when it may not exist in xfs_sync_inodes_ag().

Version 2:
o change igrab logic to be more linear
o remove initial reclaimable inode check now that we are using
  igrab() failure to detect inodes in reclaim
o assert that igrab failure occurs only on reclaimable inodes
o clean up inode locking - only grab the iolock if we are doing
  a SYNC_DELWRI call and we have a dirty inode.

Signed-off-by: Dave Chinner <david@fromorbit.com>
---
 fs/xfs/linux-2.6/xfs_sync.c |   63 +++++++++---------------------------------
 1 files changed, 14 insertions(+), 49 deletions(-)

diff --git a/fs/xfs/linux-2.6/xfs_sync.c b/fs/xfs/linux-2.6/xfs_sync.c
index 08b2acf..d371423 100644
--- a/fs/xfs/linux-2.6/xfs_sync.c
+++ b/fs/xfs/linux-2.6/xfs_sync.c
@@ -63,25 +63,16 @@ xfs_sync_inodes_ag(
 	int		error = 0;
 	int		last_error = 0;
 	int		fflag = XFS_B_ASYNC;
-	int		lock_flags = XFS_ILOCK_SHARED;
 
 	if (flags & SYNC_DELWRI)
 		fflag = XFS_B_DELWRI;
 	if (flags & SYNC_WAIT)
 		fflag = 0;		/* synchronous overrides all */
 
-	if (flags & SYNC_DELWRI) {
-		/*
-		 * We need the I/O lock if we're going to call any of
-		 * the flush/inval routines.
-		 */
-		lock_flags |= XFS_IOLOCK_SHARED;
-	}
-
 	do {
 		struct inode	*inode;
-		boolean_t	inode_refed;
 		xfs_inode_t	*ip = NULL;
+		int		lock_flags = XFS_ILOCK_SHARED;
 
 		/*
 		 * use a gang lookup to find the next inode in the tree
@@ -109,15 +100,6 @@ xfs_sync_inodes_ag(
 			break;
 		}
 
-		/*
-		 * skip inodes in reclaim. Let xfs_syncsub do that for
-		 * us so we don't need to worry.
-		 */
-		if (xfs_iflags_test(ip, (XFS_IRECLAIM|XFS_IRECLAIMABLE))) {
-			read_unlock(&pag->pag_ici_lock);
-			continue;
-		}
-
 		/* bad inodes are dealt with elsewhere */
 		inode = VFS_I(ip);
 		if (is_bad_inode(inode)) {
@@ -132,42 +114,29 @@ xfs_sync_inodes_ag(
 		}
 
 		/*
-		 * If we can't get a reference on the VFS_I, the inode must be
-		 * in reclaim. If we can get the inode lock without blocking,
-		 * it is safe to flush the inode because we hold the tree lock
-		 * and xfs_iextract will block right now. Hence if we lock the
-		 * inode while holding the tree lock, xfs_ireclaim() is
-		 * guaranteed to block on the inode lock we now hold and hence
-		 * it is safe to reference the inode until we drop the inode
-		 * locks completely.
+		 * If we can't get a reference on the inode, it must be
+		 * in reclaim. Leave it for the reclaim code to flush.
 		 */
-		inode_refed = B_FALSE;
-		if (igrab(inode)) {
-			read_unlock(&pag->pag_ici_lock);
-			xfs_ilock(ip, lock_flags);
-			inode_refed = B_TRUE;
-		} else {
-			if (!xfs_ilock_nowait(ip, lock_flags)) {
-				/* leave it to reclaim */
-				read_unlock(&pag->pag_ici_lock);
-				continue;
-			}
+		if (!igrab(inode)) {
+			ASSERT(xfs_iflags_test(ip,
+					(XFS_IRECLAIM|XFS_IRECLAIMABLE)));
 			read_unlock(&pag->pag_ici_lock);
+			continue;
 		}
+		read_unlock(&pag->pag_ici_lock);
 
 		/*
 		 * If we have to flush data or wait for I/O completion
-		 * we need to drop the ilock that we currently hold.
-		 * If we need to drop the lock, insert a marker if we
-		 * have not already done so.
+		 * we need to hold the iolock.
 		 */
 		if ((flags & SYNC_DELWRI) && VN_DIRTY(inode)) {
-			xfs_iunlock(ip, XFS_ILOCK_SHARED);
+			xfs_ilock(ip, XFS_IOLOCK_SHARED);
+			lock_flags |= XFS_IOLOCK_SHARED;
 			error = xfs_flush_pages(ip, 0, -1, fflag, FI_NONE);
 			if (flags & SYNC_IOWAIT)
 				vn_iowait(ip);
-			xfs_ilock(ip, XFS_ILOCK_SHARED);
 		}
+		xfs_ilock(ip, XFS_ILOCK_SHARED);
 
 		if ((flags & SYNC_ATTR) && !xfs_inode_clean(ip)) {
 			if (flags & SYNC_WAIT) {
@@ -184,12 +153,8 @@ xfs_sync_inodes_ag(
 			}
 		}
 
-		if (lock_flags)
-			xfs_iunlock(ip, lock_flags);
-
-		if (inode_refed) {
-			IRELE(ip);
-		}
+		xfs_iunlock(ip, lock_flags);
+		IRELE(ip);
 
 		if (error)
 			last_error = error;

  reply	other threads:[~2008-10-15  3:15 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-10-15  1:49 crash with latest code drop Peter Leckie
2008-10-15  1:18 ` Dave Chinner
2008-10-15  2:29   ` Christoph Hellwig
2008-10-15  3:16     ` Dave Chinner [this message]
2008-10-15  3:24       ` Christoph Hellwig
2008-10-15  3:51         ` Dave Chinner
2008-10-15  5:50           ` Peter Leckie
2008-10-15  6:19             ` Dave Chinner
2008-10-15  7:51               ` Peter Leckie
2008-10-16  2:43                 ` Peter Leckie
2008-10-16  5:55                   ` Dave Chinner
2008-10-16  9:00                   ` Christoph Hellwig
2008-10-17  1:01                     ` Lachlan McIlroy
2008-10-22  6:19               ` Dave Chinner
2008-10-23  4:23                 ` Peter Leckie
2008-10-23  5:39                   ` Dave Chinner
2008-10-15  3:47   ` Peter Leckie

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20081015031645.GA25906@disturbed \
    --to=david@fromorbit.com \
    --cc=hch@infradead.org \
    --cc=pleckie@sgi.com \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox