From: Dave Chinner <david@fromorbit.com>
To: Brian Foster <bfoster@redhat.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH v2] xfs: trim writepage mapping to within eof
Date: Sun, 15 Oct 2017 09:08:56 +1100
Message-ID: <20171014220856.GA15067@dastard>
In-Reply-To: <20171013123826.46207-1-bfoster@redhat.com>
On Fri, Oct 13, 2017 at 08:38:26AM -0400, Brian Foster wrote:
> The writeback rework in commit fbcc02561359 ("xfs: Introduce
> writeback context for writepages") introduced a subtle change in
> behavior with regard to the block mapping used across the
> ->writepages() sequence. The previous xfs_cluster_write() code would
> only flush pages up to EOF at the time of the writepage, thus
> ensuring that any pages due to file-extending writes would be
> handled on a separate cycle and with a new, updated block mapping.
>
> The updated code establishes a block mapping in xfs_writepage_map()
> that could extend beyond EOF if the file has post-eof preallocation.
> Because we now use the generic writeback infrastructure and pass the
> cached mapping to each writepage call, there is no implicit EOF
> limit in place. If eofblocks trimming occurs during ->writepages(),
> any post-eof portion of the cached mapping becomes invalid. The
> eofblocks code has no means to serialize against writeback because
> there are no pages associated with post-eof blocks. Therefore if an
> eofblocks trim occurs and is followed by a file-extending buffered
> write, not only has the mapping become invalid, but we could end up
> writing a page to disk based on the invalid mapping.
>
> Consider the following sequence of events:
>
> - A buffered write creates a delalloc extent and post-eof
>   speculative preallocation.
> - Writeback starts and on the first writepage cycle, the delalloc
>   extent is converted to real blocks (including the post-eof blocks)
>   and the mapping is cached.
> - The file is closed and xfs_release() trims post-eof blocks. The
>   cached writeback mapping is now invalid.
> - Another buffered write appends the file with a delalloc extent.
> - The concurrent writeback cycle picks up the just written page
>   because the writeback range end is LLONG_MAX. xfs_writepage_map()
>   attributes it to the (now invalid) cached mapping and writes the
>   data to an incorrect location on disk (and where the file offset is
>   still backed by a delalloc extent).
>
> This problem is reproduced by xfstests test generic/464, which
> triggers racing writes, appends, open/closes and writeback requests.
>
> To address this problem, trim the mapping used during writeback to
> within EOF when the mapping is validated. This ensures the mapping
> is revalidated for any pages encountered beyond EOF as of the time
> the current mapping was cached or last validated.
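
The trimming idea described above can be illustrated with a small
userspace sketch. This is not the code from the patch: struct mapping,
trim_mapping_to_eof() and mapping_covers() are hypothetical names made
up for illustration, and offsets are plain block numbers rather than
XFS types. It only shows that once a cached mapping is clamped to EOF,
a page at or beyond EOF no longer matches it and so forces a fresh
lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a cached writeback mapping. */
struct mapping {
	uint64_t start_off;	/* first file block covered */
	uint64_t block_count;	/* number of blocks covered */
};

/*
 * Clamp the mapping so it covers no block at or beyond eof_block.
 * A mapping that starts at or beyond EOF becomes empty.
 */
static void trim_mapping_to_eof(struct mapping *map, uint64_t eof_block)
{
	assert(map != NULL);

	if (map->start_off >= eof_block) {
		map->block_count = 0;
		return;
	}
	if (map->start_off + map->block_count > eof_block)
		map->block_count = eof_block - map->start_off;
}

/* Return true if the (possibly trimmed) mapping still covers block. */
static bool mapping_covers(const struct mapping *map, uint64_t block)
{
	return block >= map->start_off &&
	       block < map->start_off + map->block_count;
}
```

With a mapping of 16 blocks starting at block 0 and EOF at block 4,
trimming leaves 4 blocks: block 3 is still covered, while block 4 (a
later appending write) is not and would have to be mapped again.
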
>
> Reported-by: Eryu Guan <eguan@redhat.com>
> Diagnosed-by: Eryu Guan <eguan@redhat.com>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
Looks good to me.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com