From: Trond Myklebust <trondmy@hammerspace.com>
To: "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
"calum.mackay@oracle.com" <calum.mackay@oracle.com>
Subject: Re: nfs_page_async_flush returning 0 for fatal errors on writeback
Date: Wed, 7 Jul 2021 23:50:23 +0000 [thread overview]
Message-ID: <421b63750c5d80bdda1184c854d7ab3489c7ff01.camel@hammerspace.com> (raw)
In-Reply-To: <f270f45b-b5f9-5241-83c8-97c5f5482c56@oracle.com>
On Thu, 2021-07-08 at 00:13 +0100, Calum Mackay wrote:
> On 07/07/2021 11:01 pm, Trond Myklebust wrote:
> > On Wed, 2021-07-07 at 19:51 +0100, Calum Mackay wrote:
> > > hi Trond,
> > >
> > > I had a question about these two old commits of yours, from v5.0
> > > &
> > > v5.2:
> > >
> > > 14bebe3c90b3 NFS: Don't interrupt file writeout due to fatal
> > > errors
> > > (2
> > > years, 2 months ago)
> > >
> > > 8fc75bed96bb NFS: Fix up return value on fatal errors in
> > > nfs_page_async_flush() (2 years, 5 months ago)
> > >
> > >
> > > I am looking at a crash dump, with a kernel based on an older-
> > > still
> > > v4.14 stable which did not have either of the above commits.
> > >
> > > PANIC: "BUG: unable to handle kernel NULL pointer
> > > dereference
> > > at
> > > 0000000000000080"
> > >
> > > [exception RIP: _raw_spin_lock+20]
> > >
> > > #10 [ffffb1493d78fcb8] nfs_updatepage at ffffffffc08f1791 [nfs]
> > > #11 [ffffb1493d78fd10] nfs_write_end at ffffffffc08e094e [nfs]
> > > #12 [ffffb1493d78fd58] generic_perform_write at ffffffffa71d458b
> > > #13 [ffffb1493d78fde0] nfs_file_write at ffffffffc08dfdb4 [nfs]
> > > #14 [ffffb1493d78fe18] __vfs_write at ffffffffa72848bc
> > > #15 [ffffb1493d78fea0] vfs_write at ffffffffa7284ad2
> > > #16 [ffffb1493d78fee0] sys_write at ffffffffa7284d35
> > > #17 [ffffb1493d78ff28] do_syscall_64 at ffffffffa7003949
> > >
> > > the real sequence, obscured by compiler inlining, is:
> > >
> > > nfs_updatepage
> > > nfs_writepage_setup
> > > nfs_setup_write_request
> > > nfs_inode_add_request
> > > spin_lock(&mapping->private_lock);
> > >
> > > and we crash since the as mapping pointer is NULL.
> > >
> > >
> > > I thought I was able to construct a possible sequence that would
> > > explain
> > > the above, if we are in (from above):
> > >
> > > nfs_setup_write_request
> > > nfs_try_to_update_request
> > > nfs_wb_page
> > > nfs_writepage_locked
> > > nfs_do_writepage
> > >
> > > and nfs_page_async_flush detects a fatal server error, and calls
> > > nfs_write_error_remove_page, which results in the page->mapping
> > > set
> > > to NULL.
> > >
> > > In that version of the code, without your commits above,
> > > nfs_page_async_flush returns 0 in this case, which I thought
> > > might
> > > result in nfs_setup_write_request going ahead and calling
> > > nfs_inode_add_request with that page, resulting in the crash
> > > seen.
> > >
> > >
> > > I then discovered your v5.0 commit:
> > >
> > > 8fc75bed96bb NFS: Fix up return value on fatal errors in
> > > nfs_page_async_flush() (2 years, 5 months ago)
> > >
> > > which appeared to correct that, having nfs_page_async_flush
> > > return
> > > the
> > > error in this case, so we would not end up in
> > > nfs_inode_add_request.
> > >
> > >
> > > But I then spotted your later v5.2 commit:
> > >
> > > 14bebe3c90b3 NFS: Don't interrupt file writeout due to fatal
> > > errors
> > > (2
> > > years, 2 months ago)
> > >
> > > which changes things back, so that nfs_page_async_flush now again
> > > returns 0, in the "launder" case, and that's how that code
> > > remains
> > > today.
> > >
> > >
> > > If so, is there anything to stop the possible crash path that I
> > > describe
> > > above?
> > >
> > >
> > > path I suggest above? Or perhaps I'm missing another commit that
> > > stops
> > > it happening, even after your second commit above?
> > >
> >
> > In order for page->mapping to get set to NULL, we'd have to be
> > removing
> > the page from the page cache altogether. I'm not seeing where we'd
> > be
> > doing that here. It certainly isn't possible for some third party
> > to do
> > so, since our thread is holding the page lock and I'm not seeing
> > where
> > the call to nfs_write_error() might be doing so.
> >
> > We do call nfs_inode_remove_request(), which removes the struct
> > nfs_page that is tracking the page dirtiness, however it shouldn't
> > ever
> > result in the removal of the pagecache page itself.
> >
> > Am I misreading your email?
>
> thanks very much Trond; much more likely I am misreading the code :)
>
>
> My theory was that we have nfs_page_async_flush detecting
> nfs_error_is_fatal_on_server, so calling nfs_write_error_remove_page
> (this is an older v4.14.72-ish kernel).
>
> That would then generic_error_remove_page -> truncate_inode_page ->
> truncate_complete_page -> delete_from_page_cache
>
> thus, as you say, removing the page from the page cache, with
> __delete_from_page_cache clearing page->mapping.
>
>
> Without your v5.0 commit, nfs_page_async_flush will then return 0,
> via
> nfs_do_writepage, nfs_writepage_locked, nfs_wb_page to
> nfs_try_to_update_request, which then returns NULL to
> nfs_setup_write_request.
>
> nfs_inode_add_request, which itself then dereferences the mapping:
>
> spin_lock(&mapping->private_lock);
>
> which is where we crash.
>
>
> Obviously, there are a number of assumptions in the above, so I
> thought
> it must just be a possible path the code could take?
>
> Does that sound plausible (given that v4.14.72-ish code)?
>
>
>
> However, I note that in a subsequent v5.2 commit:
>
> 22876f540bdf NFS: Don't call generic_error_remove_page() while
> holding locks
>
> you remove the call to generic_error_remove_page from
> nfs_write_error_remove_page(), and that is itself then renamed
> nfs_write_error(), as part of a later v5.2 commit:
>
> 6fbda89b257f NFS: Replace custom error reporting mechanism with
> generic one
>
>
> Without those commits, and also without your adjustments to
> nfs_page_async_flush I mentioned earlier, is it possible that the
> code
> path I present above, where the page /is/ removed from the page
> cache,
> could result in the crash we saw?
>
>
OK, yes that is plausible. The call to generic_error_remove_page()
would, as you said, remove the page from the page cache, and thus could
result in the crash you describe.
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com
next prev parent reply other threads:[~2021-07-07 23:50 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-07-07 18:51 nfs_page_async_flush returning 0 for fatal errors on writeback Calum Mackay
2021-07-07 22:01 ` Trond Myklebust
2021-07-07 23:13 ` Calum Mackay
2021-07-07 23:50 ` Trond Myklebust [this message]
2021-07-07 23:57 ` Calum Mackay
2021-08-06 21:25 ` Calum Mackay
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=421b63750c5d80bdda1184c854d7ab3489c7ff01.camel@hammerspace.com \
--to=trondmy@hammerspace.com \
--cc=calum.mackay@oracle.com \
--cc=linux-nfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox