From: Jeff Layton <jlayton@kernel.org>
To: Chuck Lever III <chuck.lever@oracle.com>
Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
Neil Brown <neilb@suse.de>
Subject: Re: [PATCH v2 3/3] nfsd: start non-blocking writeback after adding nfsd_file to the LRU
Date: Fri, 28 Oct 2022 11:05:08 -0400 [thread overview]
Message-ID: <ae07f54d107cf1848c0a36dd16e437185a0304c3.camel@kernel.org> (raw)
In-Reply-To: <D32F829C-434C-4BA4-9057-C9769C2F4655@oracle.com>
On Fri, 2022-10-28 at 13:16 +0000, Chuck Lever III wrote:
>
> > On Oct 27, 2022, at 5:52 PM, Jeff Layton <jlayton@kernel.org> wrote:
> >
> > When a GC entry gets added to the LRU, kick off SYNC_NONE writeback
> > so that we can be ready to close it out when the time comes.
>
> For a large file, a background flush still has to walk the file's
> pages to see if they are dirty, and that consumes time, CPU, and
> memory bandwidth. We're talking hundreds of microseconds for a
> large file.
>
> Then the final flush does all that again.
>
> Basically, two (or more!) passes through the file for exactly the
> same amount of work. Is there any measured improvement in latency
> or throughput?
>
> And then... for a GC file, no-one is waiting on data persistence
> during nfsd_file_put() so I'm not sure what is gained by taking
> control of the flushing process away from the underlying filesystem.
>
>
> Remind me why the filecache is flushing? Shouldn't NFSD rely on
> COMMIT operations for that? (It's not obvious reading the code,
> maybe there should be a documenting comment somewhere that
> explains this arrangement).
>
Fair point. I was trying to replicate the behaviors introduced in these
patches:
b6669305d35a nfsd: Reduce the number of calls to nfsd_file_gc()
6b8a94332ee4 nfsd: Fix a write performance regression
AFAICT, the fsync is there to catch writeback errors so that we can
reset the write verifiers (AFAICT). The rationale for that is described
here:
055b24a8f230 nfsd: Don't garbage collect files that might contain write errors
The problem with not calling vfs_fsync is that we might miss writeback
errors. The nfsd_file could get reaped before a v3 COMMIT ever comes in.
nfsd would eventually reopen the file but it could miss seeing the error
if it got opened locally in the interim.
I'm not sure we need to worry about that so much for v4 though. Maybe we
should just do this for GC files?
>
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > ---
> > fs/nfsd/filecache.c | 37 +++++++++++++++++++++++++++++++------
> > 1 file changed, 31 insertions(+), 6 deletions(-)
> >
> > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> > index d2bbded805d4..491d3d9a1870 100644
> > --- a/fs/nfsd/filecache.c
> > +++ b/fs/nfsd/filecache.c
> > @@ -316,7 +316,7 @@ nfsd_file_alloc(struct nfsd_file_lookup_key *key, unsigned int may)
> > }
> >
> > static void
> > -nfsd_file_flush(struct nfsd_file *nf)
> > +nfsd_file_fsync(struct nfsd_file *nf)
> > {
> > struct file *file = nf->nf_file;
> >
> > @@ -327,6 +327,22 @@ nfsd_file_flush(struct nfsd_file *nf)
> > nfsd_reset_write_verifier(net_generic(nf->nf_net, nfsd_net_id));
> > }
> >
> > +static void
> > +nfsd_file_flush(struct nfsd_file *nf)
> > +{
> > + struct file *file = nf->nf_file;
> > + unsigned long nrpages;
> > +
> > + if (!file || !(file->f_mode & FMODE_WRITE))
> > + return;
> > +
> > + nrpages = file->f_mapping->nrpages;
> > + if (nrpages) {
> > + this_cpu_add(nfsd_file_pages_flushed, nrpages);
> > + filemap_flush(file->f_mapping);
> > + }
> > +}
> > +
> > static void
> > nfsd_file_free(struct nfsd_file *nf)
> > {
> > @@ -337,7 +353,7 @@ nfsd_file_free(struct nfsd_file *nf)
> > this_cpu_inc(nfsd_file_releases);
> > this_cpu_add(nfsd_file_total_age, age);
> >
> > - nfsd_file_flush(nf);
> > + nfsd_file_fsync(nf);
> >
> > if (nf->nf_mark)
> > nfsd_file_mark_put(nf->nf_mark);
> > @@ -500,12 +516,21 @@ nfsd_file_put(struct nfsd_file *nf)
> >
> > if (test_bit(NFSD_FILE_GC, &nf->nf_flags)) {
> > /*
> > - * If this is the last reference (nf_ref == 1), then transfer
> > - * it to the LRU. If the add to the LRU fails, just put it as
> > - * usual.
> > + * If this is the last reference (nf_ref == 1), then try
> > + * to transfer it to the LRU.
> > + */
> > + if (refcount_dec_not_one(&nf->nf_ref))
> > + return;
> > +
> > + /*
> > + * If the add to the list succeeds, try to kick off SYNC_NONE
> > + * writeback. If the add fails, then just fall through to
> > + * decrement as usual.
>
> These comments simply repeat what the code does, so they seem
> redundant to me. Could they instead explain why?
>
>
> > */
> > - if (refcount_dec_not_one(&nf->nf_ref) || nfsd_file_lru_add(nf))
> > + if (nfsd_file_lru_add(nf)) {
> > + nfsd_file_flush(nf);
> > return;
> > + }
> > }
> > __nfsd_file_put(nf);
> > }
> > --
> > 2.37.3
> >
>
> --
> Chuck Lever
>
>
>
--
Jeff Layton <jlayton@kernel.org>
next prev parent reply other threads:[~2022-10-28 15:05 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-27 21:52 [PATCH v2 0/3] nfsd: clean up refcounting in filecache Jeff Layton
2022-10-27 21:52 ` [PATCH v2 1/3] nfsd: rework " Jeff Layton
2022-10-27 22:51 ` NeilBrown
2022-10-28 9:57 ` Jeff Layton
2022-10-28 22:14 ` NeilBrown
2022-10-30 21:29 ` NeilBrown
2022-10-31 9:58 ` Jeff Layton
2022-10-31 9:54 ` Jeff Layton
2022-10-27 21:52 ` [PATCH v2 2/3] nfsd: only keep unused entries on the LRU Jeff Layton
2022-10-27 22:20 ` NeilBrown
2022-10-27 22:47 ` Jeff Layton
2022-10-27 22:55 ` NeilBrown
2022-10-27 23:52 ` Chuck Lever III
2022-10-27 21:52 ` [PATCH v2 3/3] nfsd: start non-blocking writeback after adding nfsd_file to " Jeff Layton
2022-10-27 22:25 ` NeilBrown
2022-10-28 13:01 ` Jeff Layton
2022-10-28 13:16 ` Chuck Lever III
2022-10-28 15:05 ` Jeff Layton [this message]
2022-10-28 15:29 ` Chuck Lever III
2022-10-28 15:51 ` Jeff Layton
2022-10-28 17:21 ` Chuck Lever III
2022-10-28 17:43 ` Jeff Layton
2022-10-28 18:53 ` Chuck Lever III
2022-10-28 20:04 ` Jeff Layton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ae07f54d107cf1848c0a36dd16e437185a0304c3.camel@kernel.org \
--to=jlayton@kernel.org \
--cc=chuck.lever@oracle.com \
--cc=linux-nfs@vger.kernel.org \
--cc=neilb@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox