From: Trond Myklebust <trondmy@hammerspace.com>
To: "dwysocha@redhat.com" <dwysocha@redhat.com>
Cc: "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
"anna.schumaker@netapp.com" <anna.schumaker@netapp.com>
Subject: Re: [PATCH 4/4] NFS: Fix fscache read from NFS after cache error
Date: Tue, 29 Jun 2021 15:54:52 +0000 [thread overview]
Message-ID: <7923d155bbf8d7c13d031971203dbfa9fd0f6051.camel@hammerspace.com> (raw)
In-Reply-To: <815639d7a8eff037fa3fabcb8e94f4053c2498d0.camel@hammerspace.com>
On Tue, 2021-06-29 at 11:50 -0400, Trond Myklebust wrote:
> On Tue, 2021-06-29 at 11:29 -0400, David Wysochanski wrote:
> > On Tue, Jun 29, 2021 at 10:54 AM Trond Myklebust
> > <trondmy@hammerspace.com> wrote:
> > >
> > > On Tue, 2021-06-29 at 09:20 -0400, David Wysochanski wrote:
> > > > On Tue, Jun 29, 2021 at 8:46 AM Trond Myklebust
> > > > <trondmy@hammerspace.com> wrote:
> > > > >
> > > > > On Tue, 2021-06-29 at 05:17 -0400, David Wysochanski wrote:
> > > > > > On Mon, Jun 28, 2021 at 8:39 PM Trond Myklebust
> > > > > > <trondmy@hammerspace.com> wrote:
> > > > > > >
> > > > > > > On Mon, 2021-06-28 at 19:46 -0400, David Wysochanski
> > > > > > > wrote:
> > > > > > > > On Mon, Jun 28, 2021 at 5:59 PM Trond Myklebust
> > > > > > > > <trondmy@hammerspace.com> wrote:
> > > > > > > > >
> > > > > > > > > On Mon, 2021-06-28 at 17:12 -0400, David Wysochanski
> > > > > > > > > wrote:
> > > > > > > > > > On Mon, Jun 28, 2021 at 3:09 PM Trond Myklebust
> > > > > > > > > > <trondmy@hammerspace.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Mon, 2021-06-28 at 13:39 -0400, Dave Wysochanski wrote:
> > > > > > > > > > > > Earlier commits refactored some NFS read code and removed
> > > > > > > > > > > > nfs_readpage_async(), but neglected to properly fixup
> > > > > > > > > > > > nfs_readpage_from_fscache_complete(). The code path is
> > > > > > > > > > > > only hit when something unusual occurs with the cachefiles
> > > > > > > > > > > > backing filesystem, such as an IO error or while a cookie
> > > > > > > > > > > > is being invalidated.
> > > > > > > > > > > > 
> > > > > > > > > > > > Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
> > > > > > > > > > > > ---
> > > > > > > > > > > >  fs/nfs/fscache.c | 14 ++++++++++++--
> > > > > > > > > > > >  1 file changed, 12 insertions(+), 2 deletions(-)
> > > > > > > > > > > > 
> > > > > > > > > > > > diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
> > > > > > > > > > > > index c4c021c6ebbd..d308cb7e1dd4 100644
> > > > > > > > > > > > --- a/fs/nfs/fscache.c
> > > > > > > > > > > > +++ b/fs/nfs/fscache.c
> > > > > > > > > > > > @@ -381,15 +381,25 @@ static void nfs_readpage_from_fscache_complete(struct page *page,
> > > > > > > > > > > >  						   void *context,
> > > > > > > > > > > >  						   int error)
> > > > > > > > > > > >  {
> > > > > > > > > > > > +	struct nfs_readdesc desc;
> > > > > > > > > > > > +	struct inode *inode = page->mapping->host;
> > > > > > > > > > > > +
> > > > > > > > > > > >  	dfprintk(FSCACHE,
> > > > > > > > > > > >  		 "NFS: readpage_from_fscache_complete (0x%p/0x%p/%d)\n",
> > > > > > > > > > > >  		 page, context, error);
> > > > > > > > > > > > 
> > > > > > > > > > > > -	/* if the read completes with an error, we just unlock the page and let
> > > > > > > > > > > > -	 * the VM reissue the readpage */
> > > > > > > > > > > >  	if (!error) {
> > > > > > > > > > > >  		SetPageUptodate(page);
> > > > > > > > > > > >  		unlock_page(page);
> > > > > > > > > > > > +	} else {
> > > > > > > > > > > > +		desc.ctx = context;
> > > > > > > > > > > > +		nfs_pageio_init_read(&desc.pgio, inode, false,
> > > > > > > > > > > > +				     &nfs_async_read_completion_ops);
> > > > > > > > > > > > +		error = readpage_async_filler(&desc, page);
> > > > > > > > > > > > +		if (error)
> > > > > > > > > > > > +			return;
> > > > > > > > > > >
> > > > > > > > > > > This code path can clearly fail too. Why can we not fix this
> > > > > > > > > > > code to allow it to return that reported error so that we can
> > > > > > > > > > > handle the failure case in nfs_readpage() instead of
> > > > > > > > > > > dead-ending here?
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Maybe the below patch is what you had in mind? That way if
> > > > > > > > > > fscache is enabled, nfs_readpage() should behave the same way
> > > > > > > > > > as if it's not, for the case where an IO error occurs in the
> > > > > > > > > > NFS read completion path.
> > > > > > > > > > 
> > > > > > > > > > If we call into fscache and we get back that the IO has been
> > > > > > > > > > submitted, wait until it is completed, so we'll catch any IO
> > > > > > > > > > errors in the read completion path. This does not solve the
> > > > > > > > > > "catch the internal errors" case, IOW, the ones that show up
> > > > > > > > > > as pg_error; that will probably require copying pg_error into
> > > > > > > > > > the nfs_open_context.error field.
> > > > > > > > > >
> > > > > > > > > > diff --git a/fs/nfs/read.c b/fs/nfs/read.c
> > > > > > > > > > index 78b9181e94ba..28e3318080e0 100644
> > > > > > > > > > --- a/fs/nfs/read.c
> > > > > > > > > > +++ b/fs/nfs/read.c
> > > > > > > > > > @@ -357,13 +357,13 @@ int nfs_readpage(struct file *file, struct page *page)
> > > > > > > > > >  	} else
> > > > > > > > > >  		desc.ctx = get_nfs_open_context(nfs_file_open_context(file));
> > > > > > > > > > 
> > > > > > > > > > +	xchg(&desc.ctx->error, 0);
> > > > > > > > > >  	if (!IS_SYNC(inode)) {
> > > > > > > > > >  		ret = nfs_readpage_from_fscache(desc.ctx, inode, page);
> > > > > > > > > >  		if (ret == 0)
> > > > > > > > > > -			goto out;
> > > > > > > > > > +			goto out_wait;
> > > > > > > > > >  	}
> > > > > > > > > > 
> > > > > > > > > > -	xchg(&desc.ctx->error, 0);
> > > > > > > > > >  	nfs_pageio_init_read(&desc.pgio, inode, false,
> > > > > > > > > >  			     &nfs_async_read_completion_ops);
> > > > > > > > > > 
> > > > > > > > > > @@ -373,6 +373,7 @@ int nfs_readpage(struct file *file, struct page *page)
> > > > > > > > > > 
> > > > > > > > > >  	nfs_pageio_complete_read(&desc.pgio);
> > > > > > > > > >  	ret = desc.pgio.pg_error < 0 ? desc.pgio.pg_error : 0;
> > > > > > > > > > +out_wait:
> > > > > > > > > >  	if (!ret) {
> > > > > > > > > >  		ret = wait_on_page_locked_killable(page);
> > > > > > > > > >  		if (!PageUptodate(page) && !ret)
> > > > > > > > > > 
> > > > > > > > > > > > +
> > > > > > > > > > > > +		nfs_pageio_complete_read(&desc.pgio);
> > > > > > > > > > > >  	}
> > > > > > > > > > > >  }
> > > > > > > > > > > > 
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > Trond Myklebust
> > > > > > > > > > > Linux NFS client maintainer, Hammerspace
> > > > > > > > > > > trond.myklebust@hammerspace.com
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Yes, please. This avoids that duplication of NFS read code in
> > > > > > > > > the fscache layer.
> > > > > > > > >
> > > > > > > >
> > > > > > > > If you mean patch 4, we still need that - I don't see any way to
> > > > > > > > avoid it. The above just makes the fscache-enabled path wait for
> > > > > > > > the IO to complete, same as the non-fscache case.
> > > > > > > >
> > > > > > >
> > > > > > > With the above, you can simplify patch 4/4 to just make the page
> > > > > > > unlock unconditional on the error, no?
> > > > > > >
> > > > > > > i.e.
> > > > > > > 	if (!error)
> > > > > > > 		SetPageUptodate(page);
> > > > > > > 	unlock_page(page);
> > > > > > > 
> > > > > > > End result: the client just does the same check as before and lets
> > > > > > > the vfs/mm decide, based on the status of the PG_uptodate flag,
> > > > > > > what to do next. I'm assuming that a retry won't cause fscache to
> > > > > > > do another bio attempt?
> > > > > > >
> > > > > >
> > > > > > Yes, I think you're right and I'm following - let me test it and
> > > > > > I'll send a v2. Then we can drop patch #3, right?
> > > > > >
> > > > > Sounds good. Thanks Dave!
> > > > >
> > > >
> > > > This approach works, but it differs from the original when an fscache
> > > > error occurs. The original (see below) would call back into NFS to
> > > > read from the server, but now we just let the VM handle it. The VM
> > > > will re-issue the read, but will go back into fscache again (because
> > > > it's enabled), which may fail again.
> > >
> > > How about marking the page on failure, then? I don't believe we
> > > currently use PG_owner_priv_1 (a.k.a. PageOwnerPriv1, PageChecked,
> > > PagePinned, PageForeign, PageSwapCache, PageXenRemapped) for anything,
> > > and according to legend it is supposed to be usable by the fs for page
> > > cache pages.
> > > 
> > > So what say we use SetPageChecked() to mark the page as having failed
> > > retrieval from fscache?
> > >
> >
> > So this? I confirm this patch on top of the one I just sent works.
> > Want me to merge them together and send a v3?
> >
> > Author: Dave Wysochanski <dwysocha@redhat.com>
> > Date: Tue Jun 29 11:10:15 2021 -0400
> >
> > NFS: Mark page with PG_checked if fscache IO completes in error
> >
> > If fscache is enabled and we try to read from fscache, but the
> > IO fails, mark the page with PG_checked. Then when the VM
> > re-issues the IO, skip over fscache and just read from the
> > server.
> >
> > Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
> >
> > diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
> > index 0966e147e973..687e98b08994 100644
> > --- a/fs/nfs/fscache.c
> > +++ b/fs/nfs/fscache.c
> > @@ -404,10 +404,12 @@ static void nfs_readpage_from_fscache_complete(struct page *page,
> >  		 "NFS: readpage_from_fscache_complete (0x%p/0x%p/%d)\n",
> >  		 page, context, error);
> > 
> > -	/* if the read completes with an error, unlock the page and let
> > -	 * the VM reissue the readpage */
> > +	/* if the read completes with an error, mark the page with PG_checked,
> > +	 * unlock the page, and let the VM reissue the readpage */
> >  	if (!error)
> >  		SetPageUptodate(page);
> > +	else
> > +		SetPageChecked(page);
> >  	unlock_page(page);
> >  }
> > 
> > @@ -423,6 +425,11 @@ int __nfs_readpage_from_fscache(struct nfs_open_context *ctx,
> >  		 "NFS: readpage_from_fscache(fsc:%p/p:%p(i:%lx f:%lx)/0x%p)\n",
> >  		 nfs_i_fscache(inode), page, page->index, page->flags, inode);
> > 
> > +	if (PageChecked(page)) {
> > +		ClearPageChecked(page);
> > +		return 1;
> > +	}
> > +
> >  	ret = fscache_read_or_alloc_page(nfs_i_fscache(inode),
> >  					 page,
> >  					 nfs_readpage_from_fscache_complete,
>
> Yes, but how about just changing the above to:
> 
> 	if (PageChecked(page))
> 		return 1;
> 	SetPageChecked(page);
> 
> Then you can short-circuit all further checks in
> __nfs_readpage_from_fscache() if they've already failed once.
> 
> Note that I don't think it is useful to clear PageChecked() once it has
> been set. Once a call to nfs_readpage() succeeds, the page will need to
> be evicted from the page cache before we can call nfs_readpage() on it
> again.
>
>
Oops. Never mind. The advantage of doing it as you do above is that
nfs_readpage_from_fscache_complete() is also called from
__nfs_readpages_from_fscache(). So let's stick with this patch...
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com