* Re: [RFC PATCH 0/4] wsync export option
[not found] <20100203233755.17677.96582.stgit@case>
@ 2010-02-04 15:30 ` Christoph Hellwig
2010-02-04 18:15 ` bpm
0 siblings, 1 reply; 3+ messages in thread
From: Christoph Hellwig @ 2010-02-04 15:30 UTC (permalink / raw)
To: Ben Myers; +Cc: linux-nfs, xfs
On Wed, Feb 03, 2010 at 05:44:24PM -0600, Ben Myers wrote:
> The following series is adds a 'wsync' export option to nfsd. It is intended
> to be used on XFS with the wsync mount option. When you already have a
> synchronous log there is no need to sync metadata separately.
You don't need the xfs wsync option, as the existing write_inode
calls or your new fsync calls are doing the same as the wsync mount
option, just from a higher layer. The wsync option causes the log
to be synchronously forced up to the log sequence number of the commit
for the metadata operation, that is make all the operations affected
by it synchronous. That's exactly what we'll do using fsync (actually
right now we force the whole log, but I have a patch to optimize it
to only force nuntil the last commit lsn), and approqimately the
same as we do using write_inode, just a lot less efficiently.
> This is barely tested, YMMV, I could have this all wrong, etc, etc. Here are
> some very unscientific measurements taken over gigabit ethernet.
>
> # time tar -xvf /mnt2/quilt-0.47.tar > /dev/null
>
> No XFS wsync, no NFS wsync:
> 0m13.177s 0m13.301s 0m13.528s
>
> XFS wsync set, no NFS wsync:
> 0m13.019s 0m13.232s 0m13.094s
>
> XFS wsync set, NFS wsync set:
> 0m8.361s 0m8.400s 0m8.301s
>
> Curious to hear if this is a reasonable thing to do. Suggestions welcome.
I think it's reasonable. What might be even better it to have an
export operation call out into the filesystem so that we can force wsync
and not let nfsd deal with it at all. There is a fair chance that the
filesystem can do the sync more efficiently.
And btw, not directly related to your patch, but getting rid of this
write_inode call defintively makes the usage of ->write_inode more
regular. From generic code we mostly use it for full inode writeback
in writeback_single_inode, and one other special case in
generic_detach_inode if a filesystem is unmounting. Only having
writeback_single_inode and fs code use it will make this call much
more predictable for the filesystem.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [RFC PATCH 0/4] wsync export option
2010-02-04 15:30 ` [RFC PATCH 0/4] wsync export option Christoph Hellwig
@ 2010-02-04 18:15 ` bpm
2010-02-04 18:39 ` Christoph Hellwig
0 siblings, 1 reply; 3+ messages in thread
From: bpm @ 2010-02-04 18:15 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: linux-nfs, xfs
On Thu, Feb 04, 2010 at 10:30:06AM -0500, Christoph Hellwig wrote:
> On Wed, Feb 03, 2010 at 05:44:24PM -0600, Ben Myers wrote:
> > The following series is adds a 'wsync' export option to nfsd. It is intended
> > to be used on XFS with the wsync mount option. When you already have a
> > synchronous log there is no need to sync metadata separately.
>
> You don't need the xfs wsync option, as the existing write_inode calls
> or your new fsync calls are doing the same as the wsync mount option,
> just from a higher layer.
Ok, I see that now. write_inode_now is basically a superset of what we
need. The important thing is to force the log (which
xfs_file_operations.xfs_file_fsync does do) but not to call
xfs_super_operations.write_inode (via write_inode_now) which eventually
gets into xfs_iflush and takes forever. We can take forever later-- when
the log is full. ;)
> The wsync option causes the log to be synchronously forced up to the
> log sequence number of the commit for the metadata operation, that is
> make all the operations affected by it synchronous. That's exactly
> what we'll do using fsync (actually right now we force the whole log,
> but I have a patch to optimize it to only force nuntil the last commit
> lsn), and approqimately the same as we do using write_inode, just a
> lot less efficiently.
Rather than having X number of synchronous log transactions written
separately as with wsync you have X number of log transactions written
out together in one go by vfs_fsync (if datasync==0). That should be
faster than wsync.
> > Curious to hear if this is a reasonable thing to do. Suggestions
> > welcome.
>
> I think it's reasonable. What might be even better it to have an
> export operation call out into the filesystem so that we can force
> wsync and not let nfsd deal with it at all. There is a fair chance
> that the filesystem can do the sync more efficiently.
Trond also suggested an export_operation and I think it's a good idea.
I'll explore that and repost.
Thanks,
Ben
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [RFC PATCH 0/4] wsync export option
2010-02-04 18:15 ` bpm
@ 2010-02-04 18:39 ` Christoph Hellwig
0 siblings, 0 replies; 3+ messages in thread
From: Christoph Hellwig @ 2010-02-04 18:39 UTC (permalink / raw)
To: bpm; +Cc: linux-nfs, xfs
On Thu, Feb 04, 2010 at 12:15:29PM -0600, bpm@sgi.com wrote:
> Rather than having X number of synchronous log transactions written
> separately as with wsync you have X number of log transactions written
> out together in one go by vfs_fsync (if datasync==0). That should be
> faster than wsync.
Indeed, except that there aren't a lot of different transactions
usually.
- nfsd_setattr is one SETATTR transaction
- nfsd_create might be multiple transactions, indeed - especially
the nfsv3 variant that also adds a setattr transaction
- nfsd_link should be a single one
but yes, doing the log force from nfsd should be a benefit for
the create side at least. The additional benefit is that we can
just drive it from NFSD and don't need to force mount options on
the fs. So yes, let's do it from nfsd.
> Trond also suggested an export_operation and I think it's a good idea.
> I'll explore that and repost.
Indeed. For XFS that export_operation could probably be a lot
simpler than xfs_fsync. I don't think we need to catch the
non-transactional timestamp and size updates at all, and we're
guaranteed the transaction has already commited. So the
method might be as simple as:
static int xfs_nfs_force_inode(struct inode *inode)
{
struct xfs_inode *ip = XFS_I(inode);
xfs_ilock(ip, XFS_ILOCK_SHARED);
if (xfs_ipincount(ip)) {
xfs_lsn_t force_lsn = ip->i_itemp->ili_last_lsn;
ASSERT(force_lsn);
xfs_log_force_lsn(ip->i_mount, force_lsn, XFS_LOG_SYNC);
}
xfs_iunlock(ip, XFS_ILOCK_SHARED);
return 0;
}
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2010-02-04 18:38 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20100203233755.17677.96582.stgit@case>
2010-02-04 15:30 ` [RFC PATCH 0/4] wsync export option Christoph Hellwig
2010-02-04 18:15 ` bpm
2010-02-04 18:39 ` Christoph Hellwig
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox