From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from cuda.sgi.com (cuda1.sgi.com [192.48.157.11]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id o1H48VtH203576 for ; Tue, 16 Feb 2010 22:08:31 -0600 Received: from mail.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 00B3013B0837 for ; Tue, 16 Feb 2010 20:09:47 -0800 (PST) Received: from mail.internode.on.net (bld-mail18.adl2.internode.on.net [150.101.137.103]) by cuda.sgi.com with ESMTP id ill065K79ETxcKIq for ; Tue, 16 Feb 2010 20:09:47 -0800 (PST) Date: Wed, 17 Feb 2010 15:09:44 +1100 From: Dave Chinner Subject: Re: [PATCH 3/4] [PATCH 3/4] xfs: remove wrapper for the fsync file operation Message-ID: <20100217040944.GK28392@discord.disaster> References: <20100215094445.528696829@bombadil.infradead.org> <20100215094604.640912740@bombadil.infradead.org> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20100215094604.640912740@bombadil.infradead.org> List-Id: XFS Filesystem from SGI List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: xfs-bounces@oss.sgi.com Errors-To: xfs-bounces@oss.sgi.com To: Christoph Hellwig Cc: xfs@oss.sgi.com On Mon, Feb 15, 2010 at 04:44:48AM -0500, Christoph Hellwig wrote: > Currently the fsync file operation is divided into a low-level routine doing > all the work and one that implements the Linux file operation and does minimal > argument wrapping. This is a leftover from the days of the vnode operations > layer and can be removed to simplify the code a bit, as well as preparing for > the implementation of an optimized fdatasync which needs to look at the > Linux inode state. > > Signed-off-by: Christoph Hellwig Looks good, one minor thing: > > Index: xfs/fs/xfs/linux-2.6/xfs_file.c > =================================================================== > --- xfs.orig/fs/xfs/linux-2.6/xfs_file.c 2010-02-15 10:18:58.640023657 +0100 > +++ xfs/fs/xfs/linux-2.6/xfs_file.c 2010-02-15 10:28:07.311260422 +0100 > @@ -35,6 +35,7 @@ > #include "xfs_dir2_sf.h" > #include "xfs_dinode.h" > #include "xfs_inode.h" > +#include "xfs_inode_item.h" > #include "xfs_bmap.h" > #include "xfs_error.h" > #include "xfs_rw.h" > @@ -96,6 +97,120 @@ xfs_iozero( > return (-status); > } > > +/* > + * We ignore the datasync flag here because a datasync is effectively > + * identical to an fsync. That is, datasync implies that we need to write > + * only the metadata needed to be able to access the data that is written > + * if we crash after the call completes. Hence if we are writing beyond > + * EOF we have to log the inode size change as well, which makes it a > + * full fsync. If we don't write beyond EOF, the inode core will be > + * clean in memory and so we don't need to log the inode, just like > + * fsync. > + */ > +STATIC int > +xfs_file_fsync( > + struct file *file, > + struct dentry *dentry, > + int datasync) > +{ > + struct xfs_inode *ip = XFS_I(dentry->d_inode); > + struct xfs_trans *tp; > + int error = 0; > + int log_flushed = 0; > + > + xfs_itrace_entry(ip); > + > + if (XFS_FORCED_SHUTDOWN(ip->i_mount)) > + return -XFS_ERROR(EIO); > + > + xfs_iflags_clear(ip, XFS_ITRUNCATED); > + > + /* > + * We always need to make sure that the required inode state is safe on > + * disk. The inode might be clean but we still might need to force the > + * log because of committed transactions that haven't hit the disk yet. > + * Likewise, there could be unflushed non-transactional changes to the > + * inode core that have to go to disk and this requires us to issue > + * a synchronous transaction to capture these changes correctly. > + * > + * This code relies on the assumption that if the i_update_core field > + * of the inode is clear and the inode is unpinned then it is clean > + * and no action is required. > + */ > + xfs_ilock(ip, XFS_ILOCK_SHARED); > + > + if (ip->i_update_core) { > + /* > + * Kick off a transaction to log the inode core to get the > + * updates. The sync transaction will also force the log. > + */ > + xfs_iunlock(ip, XFS_ILOCK_SHARED); > + tp = xfs_trans_alloc(ip->i_mount, XFS_TRANS_FSYNC_TS); > + error = xfs_trans_reserve(tp, 0, > + XFS_FSYNC_TS_LOG_RES(ip->i_mount), 0, 0, 0); > + if (error) { > + xfs_trans_cancel(tp, 0); > + return -error; > + } > + xfs_ilock(ip, XFS_ILOCK_EXCL); > + > + /* > + * Note - it's possible that we might have pushed ourselves out > + * of the way during trans_reserve which would flush the inode. > + * But there's no guarantee that the inode buffer has actually > + * gone out yet (it's delwri). Plus the buffer could be pinned > + * anyway if it's part of an inode in another recent > + * transaction. So we play it safe and fire off the > + * transaction anyway. > + */ > + xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL); > + xfs_trans_ihold(tp, ip); > + xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE); > + xfs_trans_set_sync(tp); > + error = _xfs_trans_commit(tp, 0, &log_flushed); > + > + xfs_iunlock(ip, XFS_ILOCK_EXCL); > + } else { > + /* > + * Timestamps/size haven't changed since last inode flush or > + * inode transaction commit. That means either nothing got > + * written or a transaction committed which caught the updates. > + * If the latter happened and the transaction hasn't hit the > + * disk yet, the inode will be still be pinned. If it is, > + * force the log. > + */ > + xfs_iunlock(ip, XFS_ILOCK_SHARED); > + if (xfs_ipincount(ip)) { > + if (ip->i_itemp->ili_last_lsn) { > + error = _xfs_log_force_lsn(ip->i_mount, > + ip->i_itemp->ili_last_lsn, > + XFS_LOG_SYNC, &log_flushed); > + } else { > + error = _xfs_log_force(ip->i_mount, > + XFS_LOG_SYNC, &log_flushed); > + } > + } To be technically correct, the ilock should be held over the pincount check and log force, as is done in xfs_iunpin_wait(). That way we can guarantee the inode was correctly forced and not unpinned between the unlock/check/log force being issued. I know this is just a copy of the existing fsync code, but I think that the existing code is wrong, too. ;) Also, if the inode is pinned while we have it locked, then ip->i_itemp->ili_last_lsn is guaranteed to be set as it is updated in IOP_COMMITTING() which is called during transaction commit. As it is, ili_last_lsn is never reset to zero after a transaction, so i think the _xfs_log_force() branch will never be executed, either. Other than that, the change looks ok. Cheers, Dave. -- Dave Chinner david@fromorbit.com _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs