From: Brian Foster <bfoster@redhat.com>
To: Christoph Hellwig <hch@lst.de>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 2/7] xfs: always allocate blocks as unwritten for file data
Date: Mon, 1 Oct 2018 10:19:56 -0400
Message-ID: <20181001141956.GD53694@bfoster>
In-Reply-To: <20181001123741.32005-3-hch@lst.de>

On Mon, Oct 01, 2018 at 05:37:36AM -0700, Christoph Hellwig wrote:
> XFS historically had a small race that could lead to exposing
> uninitialized data in case of a crash.  If we are filling holes using
> buffered I/O we convert the delayed allocation to a real allocation
> before writing out the data.  If we crash after the blocks were
> allocated, but before the data was written this could lead to reading
> uninitialized blocks (or leaked data from a previous allocation that was
> reused).  Now that we have the CIL, logging extent format changes is
> cheap, so we can switch to always allocating blocks as unwritten.
> Note that this is not strictly necessary for writes that append
> beyond i_size, but given that we have to log a transaction in that
> case anyway we might as well give all block allocations a uniform
> treatment.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---

It's great that we can finally fix this, particularly with such a simple
change. IIRC, the only real thing standing in the way was the buffer
head delalloc state management mess.

>  fs/xfs/xfs_aops.c  | 3 +--
>  fs/xfs/xfs_aops.h  | 2 --
>  fs/xfs/xfs_iomap.c | 4 ++--
>  3 files changed, 3 insertions(+), 6 deletions(-)
> 
...
> diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> index 6320aca39f39..10fc93cebc42 100644
> --- a/fs/xfs/xfs_iomap.c
> +++ b/fs/xfs/xfs_iomap.c
> @@ -662,11 +662,11 @@ xfs_iomap_write_allocate(
>  	xfs_trans_t	*tp;
>  	int		nimaps;
>  	int		error = 0;
> -	int		flags = XFS_BMAPI_DELALLOC;
> +	int		flags = XFS_BMAPI_DELALLOC | XFS_BMAPI_PREALLOC;

... though I don't quite think this is sufficient. xfs_bmapi_allocate()
has this snippet of code:

        if ((!bma->wasdel || (bma->flags & XFS_BMAPI_COWFORK)) &&
            (bma->flags & XFS_BMAPI_PREALLOC) &&
            xfs_sb_version_hasextflgbit(&mp->m_sb))
                bma->got.br_state = XFS_EXT_UNWRITTEN;

... which looks like it explicitly bypasses the PREALLOC flag for
delalloc extents. I figured this would just be an inefficiency since
prealloc conversion comes later, but if you look at
xfs_bmapi_convert_unwritten():

        /* check if we need to do real->unwritten conversion */
        if (mval->br_state == XFS_EXT_NORM &&
            (flags & (XFS_BMAPI_PREALLOC | XFS_BMAPI_CONVERT)) !=
                        (XFS_BMAPI_PREALLOC | XFS_BMAPI_CONVERT))
                return 0;

... it sees this as an existing extent and so doesn't change the state
unless the CONVERT flag is also passed. A quick test that shuts down the
filesystem immediately after the xfs_iomap_write_allocate() transaction
commits seems to confirm this behavior:

# xfs_io -c "fiemap -v" /mnt/file
/mnt/file:
 EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
   0: [0..7]:          72..79               8   0x1
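
For reference, the FLAGS column can be read against the fiemap extent
flag bits. The sketch below is only an illustration: the FE_* names and
both helpers are hypothetical, and the two values are assumed to mirror
FIEMAP_EXTENT_LAST and FIEMAP_EXTENT_UNWRITTEN from <linux/fiemap.h>.
Under that assumption, 0x1 carries only the last-extent bit, so the
extent was left in the normal (written) state rather than unwritten.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed to mirror FIEMAP_EXTENT_LAST and FIEMAP_EXTENT_UNWRITTEN in
 * <linux/fiemap.h>; redefined locally so the sketch builds anywhere.
 */
#define FE_LAST       0x0001u
#define FE_UNWRITTEN  0x0800u

/* Hypothetical helper: does the flags word mark the extent unwritten? */
bool fe_is_unwritten(uint32_t fe_flags)
{
	return (fe_flags & FE_UNWRITTEN) != 0;
}

/* Hypothetical helper: is this the last extent of the file? */
bool fe_is_last(uint32_t fe_flags)
{
	return (fe_flags & FE_LAST) != 0;
}
```

With these helpers, fe_is_unwritten(0x1) is false, while a flags value
of 0x801 would indicate an unwritten last extent.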

I think the right fix here is to remove the referenced logic from
xfs_bmapi_allocate(). I also think this demonstrates the need for an
xfstest. ;) Expected behavior should be easy to confirm with a new error
tag, for example.
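
To make the interaction between the two checks concrete, here is a
rough userspace model of them. It is not kernel code: the flag values,
the model_* function names and the honor_wasdel switch are all made up
for illustration, the extflgbit feature is assumed to be present, and
only the two snippets quoted above are modelled. Passing
honor_wasdel = false approximates dropping the wasdel test from
xfs_bmapi_allocate().

```c
#include <stdbool.h>

/* Illustrative values only; the real XFS_BMAPI_* flags differ. */
enum {
	BMAPI_PREALLOC = 1 << 0,
	BMAPI_CONVERT  = 1 << 1,
	BMAPI_COWFORK  = 1 << 2,
};

enum ext_state { EXT_NORM, EXT_UNWRITTEN };

/* Model of the quoted xfs_bmapi_allocate() check. */
enum ext_state model_allocate(int flags, bool wasdel, bool honor_wasdel)
{
	/* Data fork delalloc extents bypass PREALLOC in the current code. */
	bool bypass = honor_wasdel && wasdel && !(flags & BMAPI_COWFORK);

	if (!bypass && (flags & BMAPI_PREALLOC))
		return EXT_UNWRITTEN;
	return EXT_NORM;
}

/* Model of the early return in xfs_bmapi_convert_unwritten(). */
bool model_convert_skips(enum ext_state state, int flags)
{
	int both = BMAPI_PREALLOC | BMAPI_CONVERT;

	return state == EXT_NORM && (flags & both) != both;
}

/* Extent state after both steps, for the real->unwritten direction only. */
enum ext_state model_final_state(int flags, bool wasdel, bool honor_wasdel)
{
	enum ext_state st = model_allocate(flags, wasdel, honor_wasdel);

	if (st == EXT_NORM && !model_convert_skips(st, flags))
		st = EXT_UNWRITTEN;
	return st;
}
```

In this model, model_final_state(BMAPI_PREALLOC, true, true) stays
EXT_NORM, which matches the fiemap output above, while the same call
with honor_wasdel = false yields EXT_UNWRITTEN.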

Brian

>  	int		nres;
>  
>  	if (whichfork == XFS_COW_FORK)
> -		flags |= XFS_BMAPI_COWFORK | XFS_BMAPI_PREALLOC;
> +		flags |= XFS_BMAPI_COWFORK;
>  
>  	/*
>  	 * Make sure that the dquots are there.
> -- 
> 2.19.0
> 

