From: Dave Chinner <david@fromorbit.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH 3/4] [PATCH 3/4] xfs: remove wrapper for the fsync file operation
Date: Wed, 17 Feb 2010 15:09:44 +1100
Message-ID: <20100217040944.GK28392@discord.disaster>
In-Reply-To: <20100215094604.640912740@bombadil.infradead.org>
On Mon, Feb 15, 2010 at 04:44:48AM -0500, Christoph Hellwig wrote:
> Currently the fsync file operation is divided into a low-level routine doing
> all the work and one that implements the Linux file operation and does minimal
> argument wrapping. This is a leftover from the days of the vnode operations
> layer and can be removed to simplify the code a bit, as well as preparing for
> the implementation of an optimized fdatasync which needs to look at the
> Linux inode state.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Looks good, one minor thing:
>
> Index: xfs/fs/xfs/linux-2.6/xfs_file.c
> ===================================================================
> --- xfs.orig/fs/xfs/linux-2.6/xfs_file.c 2010-02-15 10:18:58.640023657 +0100
> +++ xfs/fs/xfs/linux-2.6/xfs_file.c 2010-02-15 10:28:07.311260422 +0100
> @@ -35,6 +35,7 @@
> #include "xfs_dir2_sf.h"
> #include "xfs_dinode.h"
> #include "xfs_inode.h"
> +#include "xfs_inode_item.h"
> #include "xfs_bmap.h"
> #include "xfs_error.h"
> #include "xfs_rw.h"
> @@ -96,6 +97,120 @@ xfs_iozero(
> return (-status);
> }
>
> +/*
> + * We ignore the datasync flag here because a datasync is effectively
> + * identical to an fsync. That is, datasync implies that we need to write
> + * only the metadata needed to be able to access the data that is written
> + * if we crash after the call completes. Hence if we are writing beyond
> + * EOF we have to log the inode size change as well, which makes it a
> + * full fsync. If we don't write beyond EOF, the inode core will be
> + * clean in memory and so we don't need to log the inode, just like
> + * fsync.
> + */
> +STATIC int
> +xfs_file_fsync(
> + struct file *file,
> + struct dentry *dentry,
> + int datasync)
> +{
> + struct xfs_inode *ip = XFS_I(dentry->d_inode);
> + struct xfs_trans *tp;
> + int error = 0;
> + int log_flushed = 0;
> +
> + xfs_itrace_entry(ip);
> +
> + if (XFS_FORCED_SHUTDOWN(ip->i_mount))
> + return -XFS_ERROR(EIO);
> +
> + xfs_iflags_clear(ip, XFS_ITRUNCATED);
> +
> + /*
> + * We always need to make sure that the required inode state is safe on
> + * disk. The inode might be clean but we still might need to force the
> + * log because of committed transactions that haven't hit the disk yet.
> + * Likewise, there could be unflushed non-transactional changes to the
> + * inode core that have to go to disk and this requires us to issue
> + * a synchronous transaction to capture these changes correctly.
> + *
> + * This code relies on the assumption that if the i_update_core field
> + * of the inode is clear and the inode is unpinned then it is clean
> + * and no action is required.
> + */
> + xfs_ilock(ip, XFS_ILOCK_SHARED);
> +
> + if (ip->i_update_core) {
> + /*
> + * Kick off a transaction to log the inode core to get the
> + * updates. The sync transaction will also force the log.
> + */
> + xfs_iunlock(ip, XFS_ILOCK_SHARED);
> + tp = xfs_trans_alloc(ip->i_mount, XFS_TRANS_FSYNC_TS);
> + error = xfs_trans_reserve(tp, 0,
> + XFS_FSYNC_TS_LOG_RES(ip->i_mount), 0, 0, 0);
> + if (error) {
> + xfs_trans_cancel(tp, 0);
> + return -error;
> + }
> + xfs_ilock(ip, XFS_ILOCK_EXCL);
> +
> + /*
> + * Note - it's possible that we might have pushed ourselves out
> + * of the way during trans_reserve which would flush the inode.
> + * But there's no guarantee that the inode buffer has actually
> + * gone out yet (it's delwri). Plus the buffer could be pinned
> + * anyway if it's part of an inode in another recent
> + * transaction. So we play it safe and fire off the
> + * transaction anyway.
> + */
> + xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL);
> + xfs_trans_ihold(tp, ip);
> + xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
> + xfs_trans_set_sync(tp);
> + error = _xfs_trans_commit(tp, 0, &log_flushed);
> +
> + xfs_iunlock(ip, XFS_ILOCK_EXCL);
> + } else {
> + /*
> + * Timestamps/size haven't changed since last inode flush or
> + * inode transaction commit. That means either nothing got
> + * written or a transaction committed which caught the updates.
> + * If the latter happened and the transaction hasn't hit the
> + * disk yet, the inode will be still be pinned. If it is,
> + * force the log.
> + */
> + xfs_iunlock(ip, XFS_ILOCK_SHARED);
> + if (xfs_ipincount(ip)) {
> + if (ip->i_itemp->ili_last_lsn) {
> + error = _xfs_log_force_lsn(ip->i_mount,
> + ip->i_itemp->ili_last_lsn,
> + XFS_LOG_SYNC, &log_flushed);
> + } else {
> + error = _xfs_log_force(ip->i_mount,
> + XFS_LOG_SYNC, &log_flushed);
> + }
> + }
To be technically correct, the ilock should be held over the
pincount check and log force, as is done in xfs_iunpin_wait().
That way we can guarantee the inode was correctly forced and not
unpinned between the unlock/check/log force being issued. I know
this is just a copy of the existing fsync code, but I think that
the existing code is wrong, too. ;)
Also, if the inode is pinned while we have it locked, then
ip->i_itemp->ili_last_lsn is guaranteed to be set as it is updated
in IOP_COMMITTING() which is called during transaction commit.
As it is, ili_last_lsn is never reset to zero after a transaction,
so I think the _xfs_log_force() branch will never be executed,
either.
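A minimal userspace sketch of the ordering described in the two points
above: take the lock, check the pin count, force the log up to the
inode's last LSN, and only then unlock. Everything here is a
hypothetical stand-in, not XFS code: struct mock_inode and the mock_*
helpers are invented for illustration, the lock is modelled as a plain
flag, and the log force unpins the inode directly, which simplifies
what the real log I/O completion does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the bits of the inode that matter here. */
struct mock_inode {
	bool		ilock_held;	/* models XFS_ILOCK_SHARED being held */
	int		pincount;	/* models xfs_ipincount(ip) */
	uint64_t	last_lsn;	/* models ip->i_itemp->ili_last_lsn */
	uint64_t	forced_lsn;	/* highest LSN forced out so far */
	int		nr_forces;	/* number of log forces issued */
};

static void mock_ilock(struct mock_inode *ip)
{
	assert(!ip->ilock_held);
	ip->ilock_held = true;
}

static void mock_iunlock(struct mock_inode *ip)
{
	assert(ip->ilock_held);
	ip->ilock_held = false;
}

/* Models a transaction commit: pins the inode and records the commit LSN. */
static void mock_commit(struct mock_inode *ip, uint64_t lsn)
{
	ip->pincount++;
	ip->last_lsn = lsn;
}

/* Models a synchronous log force up to lsn (simplified: unpins at once). */
static int mock_log_force_lsn(struct mock_inode *ip, uint64_t lsn)
{
	if (lsn > ip->forced_lsn)
		ip->forced_lsn = lsn;
	ip->pincount = 0;
	ip->nr_forces++;
	return 0;
}

/* The "inode core is clean" path, with the lock held over check and force. */
static int mock_fsync_clean_path(struct mock_inode *ip)
{
	int error = 0;

	mock_ilock(ip);
	if (ip->pincount) {
		/* Pinned under the lock implies a commit set last_lsn. */
		assert(ip->last_lsn != 0);
		error = mock_log_force_lsn(ip, ip->last_lsn);
	}
	mock_iunlock(ip);
	return error;
}
```

In this model the separate fallback to a full log force disappears,
because a pinned inode always has a non-zero last LSN by the time the
check runs.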
Other than that, the change looks ok.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com