Linux NFS development
From: Trond Myklebust <trondmy@hammerspace.com>
To: "chuck.lever@oracle.com" <chuck.lever@oracle.com>
Cc: "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH] nfsd: Don't reset the write verifier on a commit EAGAIN
Date: Mon, 11 Sep 2023 20:54:37 +0000
Message-ID: <f754d8a170b967d1523d103837eaeeb5e9a6c85b.camel@hammerspace.com>
In-Reply-To: <ZP91EwHCt0/c0jvJ@tissot.1015granger.net>

On Mon, 2023-09-11 at 16:14 -0400, Chuck Lever wrote:
> On Mon, Sep 11, 2023 at 02:43:57PM -0400, trondmy@gmail.com wrote:
> > From: Trond Myklebust <trond.myklebust@hammerspace.com>
> > 
> > If fsync() is returning EAGAIN, then we can assume that the filesystem
> > being exported is something like NFS with the 'softerr' mount option
> > enabled, and that it is just asking us to replay the fsync() operation
> > at a later date.
> > If we see an ESTALE, then ditto: the file is gone, so there is no danger
> > of losing the error.
> > For those cases, do not reset the write verifier.
> 
> Out of interest, what's the hazard in a write verifier change in
> these cases? There could be a slight performance penalty, I imagine,
> but how frequently does this happen?

When re-exporting to NFSv4 clients, it should be less of a problem,
since any REMOVE will result in a sillyrenamed file that only
disappears once the file is closed. However, with NFSv3 clients, that
protection is circumvented because the filecache closes files when they
are inactive. We've seen this occur frequently with VMware vmdks: their
lock files appear to generate a lot of these phantom ESTALE writes.

As for EAGAIN, I just pushed out a two-patch client series that makes it
a lot more frequent when re-exporting NFSv4 with 'softerr'.

Finally, it is worth noting that a write verifier change has a global
effect, causing retransmission by all clients of all uncommitted
unstable writes for all files, so it is worth mitigating where possible.
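
To make the global effect concrete, here is a minimal user-space sketch,
not nfsd code. All names (model_server, model_client, model_commit and so
on) are hypothetical, and the model assumes one server-wide verifier and
that a client resends every uncommitted unstable write when the verifier
it sees at COMMIT time differs from the one it recorded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NCLIENTS 3

/* Hypothetical model: the server has a single, global write verifier. */
struct model_server {
	uint64_t verifier;
};

/* Hypothetical model: a client holding unstable writes awaiting COMMIT. */
struct model_client {
	uint64_t seen_verifier;   /* verifier recorded from the WRITE replies */
	unsigned int uncommitted; /* unstable writes not yet committed */
	unsigned int resent;      /* writes that had to be retransmitted */
};

/* Client sends n unstable writes and records the verifier it was given. */
static void model_write_unstable(struct model_client *c,
				 const struct model_server *s, unsigned int n)
{
	c->seen_verifier = s->verifier;
	c->uncommitted += n;
}

/*
 * Client sends COMMIT. On a verifier mismatch it must resend everything
 * that was uncommitted. Simplification: the resend is assumed to succeed.
 */
static void model_commit(struct model_client *c, const struct model_server *s)
{
	if (c->seen_verifier != s->verifier) {
		c->resent += c->uncommitted;
		c->seen_verifier = s->verifier;
	}
	c->uncommitted = 0;
}

/* Total writes resent by all clients, with or without one verifier reset. */
static unsigned int model_total_resent(bool reset_verifier)
{
	struct model_server srv = { .verifier = 1 };
	struct model_client clients[NCLIENTS] = { 0 };
	unsigned int i, total = 0;

	/* Clients 0, 1 and 2 queue 10, 20 and 30 unstable writes. */
	for (i = 0; i < NCLIENTS; i++)
		model_write_unstable(&clients[i], &srv, 10 * (i + 1));

	/* One failure on one file changes the verifier for everyone. */
	if (reset_verifier)
		srv.verifier++;

	for (i = 0; i < NCLIENTS; i++) {
		model_commit(&clients[i], &srv);
		total += clients[i].resent;
	}
	return total;
}
```

In this model, a single reset makes all three clients resend everything
they had outstanding, while no reset means no retransmission at all.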

> 
> One more below.
> 
> 
> > Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> > ---
> >  fs/nfsd/vfs.c | 29 +++++++++++++++++++----------
> >  1 file changed, 19 insertions(+), 10 deletions(-)
> > 
> > diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> > index 98fa4fd0556d..31daf9f63572 100644
> > --- a/fs/nfsd/vfs.c
> > +++ b/fs/nfsd/vfs.c
> > @@ -337,6 +337,20 @@ nfsd_lookup(struct svc_rqst *rqstp, struct svc_fh *fhp, const char *name,
> >         return err;
> >  }
> >  
> > +static void
> > +commit_reset_write_verifier(struct nfsd_net *nn, struct svc_rqst *rqstp,
> > +                           int err)
> > +{
> > +       switch (err) {
> > +       case -EAGAIN:
> > +       case -ESTALE:
> > +               break;
> > +       default:
> > +               nfsd_reset_write_verifier(nn);
> > +               trace_nfsd_writeverf_reset(nn, rqstp, err);
> > +       }
> > +}
> > +
> >  /*
> >   * Commit metadata changes to stable storage.
> >   */
> > @@ -647,8 +661,7 @@ __be32 nfsd4_clone_file_range(struct svc_rqst *rqstp,
> >                                         &nfsd4_get_cstate(rqstp)->current_fh,
> >                                         dst_pos,
> >                                         count, status);
> > -                       nfsd_reset_write_verifier(nn);
> > -                       trace_nfsd_writeverf_reset(nn, rqstp, status);
> > +                       commit_reset_write_verifier(nn, rqstp, status);
> >                         ret = nfserrno(status);
> >                 }
> >         }
> > @@ -1170,8 +1183,7 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct nfsd_file *nf,
> >         host_err = vfs_iter_write(file, &iter, &pos, flags);
> >         file_end_write(file);
> >         if (host_err < 0) {
> > -               nfsd_reset_write_verifier(nn);
> > -               trace_nfsd_writeverf_reset(nn, rqstp, host_err);
> > +               commit_reset_write_verifier(nn, rqstp, host_err);
> 
> Can generic_file_write_iter() or its brethren return ESTALE or EAGAIN
> before they get to the generic_write_sync() call?

The call to nfs_revalidate_file_size(), which can occur when you are
appending to the file (whether or not O_APPEND is set), could indeed
return ESTALE.
With the new patchset mentioned above, it could also return EAGAIN.
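
As a rough user-space illustration of how the helper in the patch treats
those two errors, wherever in the write path they come from, consider the
sketch below. It is a model, not the kernel code: model_nn,
model_should_reset() and model_count_resets() are hypothetical names, and
a counter stands in for nfsd_reset_write_verifier().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-net state: counts verifier resets. */
struct model_nn {
	unsigned int verifier_resets;
};

/* Mirrors the switch in the patch: EAGAIN and ESTALE do not reset. */
static bool model_should_reset(int err)
{
	switch (err) {
	case -EAGAIN:
	case -ESTALE:
		return false;
	default:
		return true;
	}
}

/* Model of the helper: reset (here, count) only for the other errors. */
static void model_commit_reset_write_verifier(struct model_nn *nn, int err)
{
	if (model_should_reset(err))
		nn->verifier_resets++;
}

/* Feed a sequence of negative errnos through the helper and count resets. */
static unsigned int model_count_resets(const int *errs, unsigned int n)
{
	struct model_nn nn = { 0 };
	unsigned int i;

	for (i = 0; i < n; i++)
		model_commit_reset_write_verifier(&nn, errs[i]);
	return nn.verifier_resets;
}
```

Feeding -EAGAIN, -ESTALE, -EIO and -ENOSPC through this model resets the
verifier twice, once for each of the last two errors.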

> 
> 
> >                 goto out_nfserr;
> >         }
> >         *cnt = host_err;
> > @@ -1183,10 +1195,8 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct nfsd_file *nf,
> >  
> >         if (stable && use_wgather) {
> >                 host_err = wait_for_concurrent_writes(file);
> > -               if (host_err < 0) {
> > -                       nfsd_reset_write_verifier(nn);
> > -                       trace_nfsd_writeverf_reset(nn, rqstp, host_err);
> > -               }
> > +               if (host_err < 0)
> > +                       commit_reset_write_verifier(nn, rqstp, host_err);
> >         }
> >  
> >  out_nfserr:
> > @@ -1329,8 +1339,7 @@ nfsd_commit(struct svc_rqst *rqstp, struct svc_fh *fhp, struct nfsd_file *nf,
> >                         err = nfserr_notsupp;
> >                         break;
> >                 default:
> > -                       nfsd_reset_write_verifier(nn);
> > -                       trace_nfsd_writeverf_reset(nn, rqstp, err2);
> > +                       commit_reset_write_verifier(nn, rqstp, err2);
> >                         err = nfserrno(err2);
> >                 }
> >         } else
> > -- 
> > 2.41.0
> > 
> 

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



