From: Brian Foster <bfoster@redhat.com>
To: Christoph Hellwig <hch@lst.de>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 2/7] xfs: always allocate blocks as unwritten for file data
Date: Mon, 1 Oct 2018 10:19:56 -0400
Message-ID: <20181001141956.GD53694@bfoster>
In-Reply-To: <20181001123741.32005-3-hch@lst.de>
On Mon, Oct 01, 2018 at 05:37:36AM -0700, Christoph Hellwig wrote:
> XFS historically had a small race that could lead to exposing
> uninitialized data in case of a crash. If we are filling holes using
> buffered I/O we convert the delayed allocation to a real allocation
> before writing out the data. If we crash after the blocks were
> allocated, but before the data was written this could lead to reading
> uninitialized blocks (or leaked data from a previous allocation that was
> reused). Now that we have the CIL, logging extent format changes is
> cheap, so we can switch to always allocating blocks as unwritten.
> Note that this is not strictly necessary for writes that append
> beyond i_size, but given that we have to log a transaction in that
> case anyway we might as well give all block allocations a uniform
> treatment.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
It's great that we can finally fix this, particularly with such a simple
change. IIRC, the only real thing standing in the way was the buffer
head delalloc state management mess.
> fs/xfs/xfs_aops.c | 3 +--
> fs/xfs/xfs_aops.h | 2 --
> fs/xfs/xfs_iomap.c | 4 ++--
> 3 files changed, 3 insertions(+), 6 deletions(-)
>
...
> diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> index 6320aca39f39..10fc93cebc42 100644
> --- a/fs/xfs/xfs_iomap.c
> +++ b/fs/xfs/xfs_iomap.c
> @@ -662,11 +662,11 @@ xfs_iomap_write_allocate(
> xfs_trans_t *tp;
> int nimaps;
> int error = 0;
> - int flags = XFS_BMAPI_DELALLOC;
> + int flags = XFS_BMAPI_DELALLOC | XFS_BMAPI_PREALLOC;
... though I don't quite think this is sufficient. xfs_bmapi_allocate()
has this snippet of code:

	if ((!bma->wasdel || (bma->flags & XFS_BMAPI_COWFORK)) &&
	    (bma->flags & XFS_BMAPI_PREALLOC) &&
	    xfs_sb_version_hasextflgbit(&mp->m_sb))
		bma->got.br_state = XFS_EXT_UNWRITTEN;

... which looks like it explicitly bypasses the PREALLOC flag for
delalloc extents. I figured this would just be an inefficiency since
prealloc conversion comes later, but if you look at
xfs_bmapi_convert_unwritten():

	/* check if we need to do real->unwritten conversion */
	if (mval->br_state == XFS_EXT_NORM &&
	    (flags & (XFS_BMAPI_PREALLOC | XFS_BMAPI_CONVERT)) !=
			(XFS_BMAPI_PREALLOC | XFS_BMAPI_CONVERT))
		return 0;

... it sees this as an existing extent and so doesn't change the state
unless the CONVERT flag is also passed. A quick test that shuts down the
filesystem immediately after the xfs_iomap_write_allocate() transaction
commits seems to confirm this behavior:

# xfs_io -c "fiemap -v" /mnt/file
/mnt/file:
 EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
   0: [0..7]:          72..79               8   0x1

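For anyone reading the FLAGS column above: assuming xfs_io prints the raw fe_flags value from the FIEMAP ioctl, 0x1 is only FIEMAP_EXTENT_LAST, and an unwritten extent would also carry the 0x800 bit. A minimal sketch of that decoding follows. The constants mirror <linux/fiemap.h> and are repeated locally so the snippet builds without kernel headers; the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Flag values as defined in <linux/fiemap.h>, duplicated here so the
 * sketch does not depend on kernel headers being installed.
 */
#define FIEMAP_EXTENT_LAST	0x00000001u	/* last extent in the file */
#define FIEMAP_EXTENT_UNWRITTEN	0x00000800u	/* allocated, reads as zeroes */

/* Hypothetical helper: true if the extent is reported as unwritten. */
static bool extent_is_unwritten(uint32_t fe_flags)
{
	return (fe_flags & FIEMAP_EXTENT_UNWRITTEN) != 0;
}

/* Hypothetical helper: true if this is the last extent of the file. */
static bool extent_is_last(uint32_t fe_flags)
{
	return (fe_flags & FIEMAP_EXTENT_LAST) != 0;
}
```

With the reported value, extent_is_unwritten(0x1) is false, i.e. the extent came back as a normal written extent after the shutdown, whereas a flags value of 0x801 would indicate the last extent in the unwritten state.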
I think the right fix here is to remove the referenced logic from
xfs_bmapi_allocate(). I also think this demonstrates the need for an
xfstest. ;) Expected behavior should be easy to confirm with a new error
tag, for example.
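To make the interaction of the two checks easier to follow, here is a rough userspace model of just the quoted conditions. It is a sketch under assumptions, not the kernel code: the M_* flag values and all function names are made up for illustration, the extent-flag feature bit is assumed to be set, and the unwritten->real conversion path is ignored entirely. The skip_wasdel_check argument models the proposed removal of the wasdel test from xfs_bmapi_allocate().

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the XFS_BMAPI_* flags; values are arbitrary. */
#define M_PREALLOC	0x1u
#define M_CONVERT	0x2u
#define M_COWFORK	0x4u
#define M_DELALLOC	0x8u

enum ext_state { EXT_NORM, EXT_UNWRITTEN };

/*
 * Models the quoted xfs_bmapi_allocate() snippet. When skip_wasdel_check
 * is true, the wasdel/COWFORK condition is dropped, as the fix suggests.
 */
static enum ext_state model_allocate(bool wasdel, unsigned int flags,
				     bool skip_wasdel_check)
{
	bool eligible = skip_wasdel_check || !wasdel || (flags & M_COWFORK);

	if (eligible && (flags & M_PREALLOC))
		return EXT_UNWRITTEN;
	return EXT_NORM;
}

/*
 * Models only the quoted early return in xfs_bmapi_convert_unwritten():
 * a normal extent becomes unwritten only if both PREALLOC and CONVERT
 * are passed. An already unwritten extent is left alone in this model.
 */
static enum ext_state model_convert(enum ext_state state, unsigned int flags)
{
	if (state == EXT_NORM &&
	    (flags & (M_PREALLOC | M_CONVERT)) != (M_PREALLOC | M_CONVERT))
		return state;
	return EXT_UNWRITTEN;
}

/* Allocation followed by the conversion check, as described above. */
static enum ext_state model_final_state(bool wasdel, unsigned int flags,
					bool skip_wasdel_check)
{
	return model_convert(model_allocate(wasdel, flags, skip_wasdel_check),
			     flags);
}
```

In this model, a delalloc conversion with M_DELALLOC | M_PREALLOC ends up EXT_NORM with the current check and EXT_UNWRITTEN once the wasdel test is skipped, which matches the fiemap result and the behavior the patch is after.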
Brian
> int nres;
>
> if (whichfork == XFS_COW_FORK)
> - flags |= XFS_BMAPI_COWFORK | XFS_BMAPI_PREALLOC;
> + flags |= XFS_BMAPI_COWFORK;
>
> /*
> * Make sure that the dquots are there.
> --
> 2.19.0
>