From: "J. Bruce Fields" <bfields@fieldses.org>
To: Christoph Hellwig <hch@infradead.org>
Cc: Trond Myklebust <trond.myklebust@primarydata.com>,
Benjamin Coddington <bcodding@redhat.com>,
Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: Client never uses DATA_SYNC
Date: Wed, 5 Nov 2014 09:41:33 -0500 [thread overview]
Message-ID: <20141105144133.GA3139@fieldses.org> (raw)
In-Reply-To: <20141105085317.GA18658@infradead.org>
On Wed, Nov 05, 2014 at 12:53:17AM -0800, Christoph Hellwig wrote:
> On Tue, Nov 04, 2014 at 03:38:28PM -0500, Trond Myklebust wrote:
> > Hi Ben,
> >
> > On Tue, Nov 4, 2014 at 10:47 AM, Benjamin Coddington
> > <bcodding@redhat.com> wrote:
> > > I have a customer that would like the client to use DATA_SYNC instead of
> > > FILE_SYNC when writing a file with O_DSYNC. It looks like the client will
> > > only use FILE_SYNC since:
> > >
> > > 87ed5eb44ad9338b1617 NFS: Don't use DATA_SYNC writes
> > > http://marc.info/?l=git-commits-head&m=131180398113265&w=2
> > >
> > > I've been unable to dig up any other discussion on this, so I think it
> > > has just been an overlooked point until now. I'm only starting to
> > > figure out what would need to change for this, and I thought that while
> > > I do that I'd ask the list if anyone thinks that serious implementation
> > > issues might emerge if this were attempted.
> >
> > I'm not aware of any servers that make a real distinction between
> > FILE_SYNC and DATA_SYNC. Has anybody done any performance measurements
> > to show that it is worth the effort?
>
> For a fully allocated file the difference between fdatasync and fsync
> or O_DSYNC / O_SYNC can make a difference, this is especially notiable
> on data base workloads that do lots of small I/O on fully preallocated
> files.
>
> Implementing this difference for the Linux NFS server also seem fairly
> trivial, patch attached.
Makes sense to me.--b.
>
> >From 61fee1aacf2f31b6fb1bb90d90f5e625994970f6 Mon Sep 17 00:00:00 2001
> From: Christoph Hellwig <hch@lst.de>
> Date: Wed, 5 Nov 2014 09:50:35 +0100
> Subject: nfsd: implement DATA_SYNC4 support
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
> fs/nfsd/nfs3proc.c | 4 ++--
> fs/nfsd/nfs4proc.c | 7 +++----
> fs/nfsd/nfsproc.c | 7 ++-----
> fs/nfsd/vfs.c | 14 +++++++-------
> fs/nfsd/vfs.h | 3 ++-
> 5 files changed, 16 insertions(+), 19 deletions(-)
>
> diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
> index 12f2aab..cc9622f 100644
> --- a/fs/nfsd/nfs3proc.c
> +++ b/fs/nfsd/nfs3proc.c
> @@ -191,12 +191,12 @@ nfsd3_proc_write(struct svc_rqst *rqstp, struct nfsd3_writeargs *argp,
> argp->stable? " stable" : "");
>
> fh_copy(&resp->fh, &argp->fh);
> - resp->committed = argp->stable;
> nfserr = nfsd_write(rqstp, &resp->fh, NULL,
> argp->offset,
> rqstp->rq_vec, argp->vlen,
> &cnt,
> - &resp->committed);
> + argp->stable);
> + resp->committed = argp->stable;
> resp->count = cnt;
> RETURN_STATUS(nfserr);
> }
> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> index c4a2adc..b11fd3b 100644
> --- a/fs/nfsd/nfs4proc.c
> +++ b/fs/nfsd/nfs4proc.c
> @@ -998,15 +998,14 @@ nfsd4_write(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> }
>
> cnt = write->wr_buflen;
> - write->wr_how_written = write->wr_stable_how;
> gen_boot_verifier(&write->wr_verifier, SVC_NET(rqstp));
>
> nvecs = fill_in_write_vector(rqstp->rq_vec, write);
> WARN_ON_ONCE(nvecs > ARRAY_SIZE(rqstp->rq_vec));
>
> - status = nfsd_write(rqstp, &cstate->current_fh, filp,
> - write->wr_offset, rqstp->rq_vec, nvecs,
> - &cnt, &write->wr_how_written);
> + status = nfsd_write(rqstp, &cstate->current_fh, filp, write->wr_offset,
> + rqstp->rq_vec, nvecs, &cnt, write->wr_stable_how);
> + write->wr_how_written = write->wr_stable_how;
> if (filp)
> fput(filp);
>
> diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c
> index b868073..90caa00 100644
> --- a/fs/nfsd/nfsproc.c
> +++ b/fs/nfsd/nfsproc.c
> @@ -158,7 +158,6 @@ nfsd_proc_write(struct svc_rqst *rqstp, struct nfsd_writeargs *argp,
> struct nfsd_attrstat *resp)
> {
> __be32 nfserr;
> - int stable = 1;
> unsigned long cnt = argp->len;
>
> dprintk("nfsd: WRITE %s %d bytes at %d\n",
> @@ -166,10 +165,8 @@ nfsd_proc_write(struct svc_rqst *rqstp, struct nfsd_writeargs *argp,
> argp->len, argp->offset);
>
> nfserr = nfsd_write(rqstp, fh_copy(&resp->fh, &argp->fh), NULL,
> - argp->offset,
> - rqstp->rq_vec, argp->vlen,
> - &cnt,
> - &stable);
> + argp->offset, rqstp->rq_vec, argp->vlen,
> + &cnt, NFS_FILE_SYNC);
> return nfsd_return_attrs(nfserr, resp);
> }
>
> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> index 989129e..2ea8ff6 100644
> --- a/fs/nfsd/vfs.c
> +++ b/fs/nfsd/vfs.c
> @@ -927,7 +927,7 @@ static int wait_for_concurrent_writes(struct file *file)
> static __be32
> nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
> loff_t offset, struct kvec *vec, int vlen,
> - unsigned long *cnt, int *stablep)
> + unsigned long *cnt, enum nfs3_stable_how stable)
> {
> struct svc_export *exp;
> struct dentry *dentry;
> @@ -935,7 +935,6 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
> mm_segment_t oldfs;
> __be32 err = 0;
> int host_err;
> - int stable = *stablep;
> int use_wgather;
> loff_t pos = offset;
> unsigned int pflags = current->flags;
> @@ -956,7 +955,7 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
> use_wgather = (rqstp->rq_vers == 2) && EX_WGATHER(exp);
>
> if (!EX_ISSYNC(exp))
> - stable = 0;
> + stable = NFS_UNSTABLE;
>
> /* Write the data. */
> oldfs = get_fs(); set_fs(KERNEL_DS);
> @@ -972,7 +971,8 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
> if (use_wgather)
> host_err = wait_for_concurrent_writes(file);
> else
> - host_err = vfs_fsync_range(file, offset, offset+*cnt, 0);
> + host_err = vfs_fsync_range(file, offset, offset+*cnt,
> + stable == NFS_DATA_SYNC);
> }
>
> out_nfserr:
> @@ -1051,7 +1051,7 @@ __be32 nfsd_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
> __be32
> nfsd_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
> loff_t offset, struct kvec *vec, int vlen, unsigned long *cnt,
> - int *stablep)
> + enum nfs3_stable_how stable)
> {
> __be32 err = 0;
>
> @@ -1061,7 +1061,7 @@ nfsd_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
> if (err)
> goto out;
> err = nfsd_vfs_write(rqstp, fhp, file, offset, vec, vlen, cnt,
> - stablep);
> + stable);
> } else {
> err = nfsd_open(rqstp, fhp, S_IFREG, NFSD_MAY_WRITE, &file);
> if (err)
> @@ -1069,7 +1069,7 @@ nfsd_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
>
> if (cnt)
> err = nfsd_vfs_write(rqstp, fhp, file, offset, vec, vlen,
> - cnt, stablep);
> + cnt, stable);
> nfsd_close(file);
> }
> out:
> diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h
> index c2ff3f1..3cb8e55 100644
> --- a/fs/nfsd/vfs.h
> +++ b/fs/nfsd/vfs.h
> @@ -81,7 +81,8 @@ __be32 nfsd_readv(struct file *, loff_t, struct kvec *, int,
> __be32 nfsd_read(struct svc_rqst *, struct svc_fh *,
> loff_t, struct kvec *, int, unsigned long *);
> __be32 nfsd_write(struct svc_rqst *, struct svc_fh *,struct file *,
> - loff_t, struct kvec *,int, unsigned long *, int *);
> + loff_t, struct kvec *,int, unsigned long *,
> + enum nfs3_stable_how);
> __be32 nfsd_readlink(struct svc_rqst *, struct svc_fh *,
> char *, int *);
> __be32 nfsd_symlink(struct svc_rqst *, struct svc_fh *,
> --
> 1.9.1
>
next prev parent reply other threads:[~2014-11-05 14:41 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-11-04 15:47 Client never uses DATA_SYNC Benjamin Coddington
2014-11-04 20:38 ` Trond Myklebust
2014-11-05 8:53 ` Christoph Hellwig
2014-11-05 14:41 ` J. Bruce Fields [this message]
2014-11-06 20:13 ` J. Bruce Fields
2014-11-07 7:26 ` Christoph Hellwig
2014-11-07 15:53 ` J. Bruce Fields
2014-11-08 7:06 ` Christoph Hellwig
2014-11-19 20:55 ` J. Bruce Fields
2014-11-18 17:02 ` J. Bruce Fields
2014-11-20 5:48 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20141105144133.GA3139@fieldses.org \
--to=bfields@fieldses.org \
--cc=bcodding@redhat.com \
--cc=hch@infradead.org \
--cc=linux-nfs@vger.kernel.org \
--cc=trond.myklebust@primarydata.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.