From: Dave Chinner <david@fromorbit.com>
To: xfs@oss.sgi.com
Subject: [PATCH 09/10] xfs: xfs_fs_write_inode() can fail to write inodes synchronously V2
Date: Wed, 3 Feb 2010 10:25:03 +1100
Message-ID: <1265153104-29680-10-git-send-email-david@fromorbit.com>
In-Reply-To: <1265153104-29680-1-git-send-email-david@fromorbit.com>
When an inode has already been flushed as a delayed write,
xfs_inode_clean() returns true and hence xfs_fs_write_inode() can
return on a synchronous inode write without having written the
inode. Currently these synchronous writes only come from sync(1),
unmount, a synchronous NFS export and cachefiles, so they should be
relatively rare and out of common performance paths.
Realistically, a synchronous inode write is not necessary here; we
can treat this like fsync where we either force the log if there are
no unlogged changes, or do a sync transaction if there are unlogged
changes. This will result in real synchronous semantics as the fsync
will issue barriers, but may slow down the above two configurations
as a result. However, if the inode is not pinned and has no unlogged
changes, then the fsync code is a no-op and hence it may be faster
than the existing code.
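
As a rough illustration of the fsync-style decision described above, here
is a small user-space model. It is only a sketch: the struct, the enum and
the function name are hypothetical and are not the kernel's xfs_fsync()
implementation. It assumes the caller already holds the inode lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model only; none of these names exist in the kernel. */
enum fsync_model_action {
	FSYNC_MODEL_NOOP,        /* not pinned and nothing unlogged */
	FSYNC_MODEL_FORCE_LOG,   /* changes already logged: force the log */
	FSYNC_MODEL_SYNC_TRANS   /* unlogged changes: log them synchronously */
};

struct fsync_model_inode {
	bool ilocked;   /* caller holds the inode lock */
	bool unlogged;  /* has changes that are not in the log yet */
	bool pinned;    /* has logged changes that are not on disk yet */
};

/* Pick the action the commit message describes for a sync inode write. */
static enum fsync_model_action
fsync_model_choose(const struct fsync_model_inode *ip)
{
	assert(ip->ilocked);

	if (ip->unlogged)
		return FSYNC_MODEL_SYNC_TRANS;
	if (ip->pinned)
		return FSYNC_MODEL_FORCE_LOG;
	return FSYNC_MODEL_NOOP;
}
```

The last case is the one where the new code may be faster than a forced
synchronous inode write, since nothing needs to be issued at all.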
The only thing we need to be careful of here is that if the inode has
not yet been flushed to the inode buffer, bulkstat scans will fail to
see it. Hence even on a sync write, we should try to flush the inode to
the backing buffer. This is only a best-effort delwri flush - it
doesn't guarantee that the flush happens, but races with other inode
operations should not happen as the inode lock is not dropped
between the fsync operation and the flush.
For the asynchronous write, move the clean check to after we have
locked up the inode. With the inode locked and the flush lock held,
we know that if the inode is clean there are no pending changes in the
log and there are no current outstanding delayed writes or IO in
progress, so if it reports clean now it really is clean. This matches
the order of locking and checks in xfs_sync_inode_attr().
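
The ordering argument above can be sketched as a user-space model. This is
an illustration under stated assumptions, not the patched function: the
struct, its fields and the result names are hypothetical, the fsync step of
the sync path is left out, and the flush is treated as completing
immediately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the check ordering; this is not kernel code. */
enum wi_model_result { WI_MODEL_BUSY, WI_MODEL_CLEAN, WI_MODEL_FLUSHED };

struct wi_model_inode {
	bool ilock_contended;  /* another thread holds the inode lock */
	bool pinned;           /* logged changes not yet on disk */
	bool flush_locked;     /* a flush is queued or in progress */
	bool dirty;            /* in-core inode differs from backing buffer */
};

static enum wi_model_result
wi_model_write_inode(struct wi_model_inode *ip, bool sync)
{
	assert(ip != NULL);

	/* Async callers must not block on the inode lock; sync callers wait. */
	if (!sync && ip->ilock_contended)
		return WI_MODEL_BUSY;

	/* Skip pinned inodes and never block on the flush lock. */
	if (ip->pinned || ip->flush_locked)
		return WI_MODEL_BUSY;
	ip->flush_locked = true;

	/* Both locks held and not pinned: the clean check can be trusted. */
	if (!ip->dirty) {
		ip->flush_locked = false;
		return WI_MODEL_CLEAN;
	}

	/* Model of the delwri flush; it "completes" at once here. */
	ip->dirty = false;
	ip->flush_locked = false;
	return WI_MODEL_FLUSHED;
}
```

The point of the model is only the order: the clean check runs after the
pin and flush lock checks, so a clean result cannot hide a pending change.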
Version 2:
- ilock is now held external to the fsync call to close race windows
between the fsync and delwri flush to the backing buffer.
Signed-off-by: Dave Chinner <david@fromorbit.com>
---
fs/xfs/linux-2.6/xfs_super.c | 54 ++++++++++++++++++++++++-----------------
1 files changed, 32 insertions(+), 22 deletions(-)
diff --git a/fs/xfs/linux-2.6/xfs_super.c b/fs/xfs/linux-2.6/xfs_super.c
index 226fe20..1257a5f 100644
--- a/fs/xfs/linux-2.6/xfs_super.c
+++ b/fs/xfs/linux-2.6/xfs_super.c
@@ -1045,35 +1045,45 @@ xfs_fs_write_inode(
error = xfs_wait_on_pages(ip, 0, -1);
if (error)
goto out;
- }
-
- /*
- * Bypass inodes which have already been cleaned by
- * the inode flush clustering code inside xfs_iflush
- */
- if (xfs_inode_clean(ip))
- goto out;
-
- /*
- * We make this non-blocking if the inode is contended, return
- * EAGAIN to indicate to the caller that they did not succeed.
- * This prevents the flush path from blocking on inodes inside
- * another operation right now, they get caught later by xfs_sync.
- */
- if (sync) {
+ /*
+ * The fsync operation makes inode changes stable and it
+ * reduces the IOs we have to do here from two (log and inode)
+ * to just the log. We still need to do a delwri write of the
+ * inode after this to flush it to the backing buffer so that
+ * bulkstat works properly.
+ */
xfs_ilock(ip, XFS_ILOCK_SHARED);
- xfs_iflock(ip);
-
- error = xfs_iflush(ip, SYNC_WAIT);
+ error = xfs_fsync(ip, XFS_ILOCK_SHARED);
+ if (error)
+ goto out_unlock;
+ error = EAGAIN;
} else {
+ /*
+ * We make this non-blocking if the inode is contended, return
+ * EAGAIN to indicate to the caller that they did not succeed.
+ * This prevents the flush path from blocking on inodes inside
+ * another operation right now, they get caught later by xfs_sync.
+ */
error = EAGAIN;
if (!xfs_ilock_nowait(ip, XFS_ILOCK_SHARED))
goto out;
- if (xfs_ipincount(ip) || !xfs_iflock_nowait(ip))
- goto out_unlock;
+ }
+
+ if (xfs_ipincount(ip) || !xfs_iflock_nowait(ip))
+ goto out_unlock;
- error = xfs_iflush(ip, 0);
+ /*
+ * Now we have the flush lock and the inode is not pinned, we can check
+ * if the inode is really clean as we know that there are no pending
+ * transaction completions, it is not waiting on the delayed write
+ * queue and there is no IO in progress.
+ */
+ error = 0;
+ if (xfs_inode_clean(ip)) {
+ xfs_ifunlock(ip);
+ goto out_unlock;
}
+ error = xfs_iflush(ip, 0);
out_unlock:
xfs_iunlock(ip, XFS_ILOCK_SHARED);
--
1.6.5