From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 1/2] xfs: fix data corruption w/ unaligned dedupe ranges
Date: Fri, 5 Oct 2018 16:58:21 -0700 [thread overview]
Message-ID: <20181005235821.GA28243@magnolia> (raw)
In-Reply-To: <20181005012336.1418-2-david@fromorbit.com>
On Fri, Oct 05, 2018 at 11:23:35AM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
>
> A deduplication data corruption is Exposed by fstests generic/505 on
> XFS. It is caused by extending the block match range to include the
> partial EOF block, but then allowing unknown data beyond EOF to be
> considered a "match" to data in the destination file because the
> comparison is only made to the end of the source file. This corrupts
> the destination file when the source extent is shared with it.
>
> XFS only supports whole block dedupe, but we still need to appear to
> support whole file dedupe correctly. Hence if the dedupe request
> includes the last block of the souce file, don't include it in the
> actual XFS dedupe operation. If the rest of the range dedupes
> successfully, then report the partial last block as deduped, too, so
> that userspace sees it as a successful dedupe rather than return
> EINVAL because we can't dedupe unaligned blocks.
>
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
> fs/xfs/xfs_reflink.c | 21 +++++++++++++++++++++
> 1 file changed, 21 insertions(+)
>
> diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
> index 5289e22cb081..6b0da1b80103 100644
> --- a/fs/xfs/xfs_reflink.c
> +++ b/fs/xfs/xfs_reflink.c
> @@ -1222,6 +1222,19 @@ xfs_iolock_two_inodes_and_break_layout(
>
> /*
> * Link a range of blocks from one file to another.
> + *
> + * The VFS allows partial EOF blocks to "match" for dedupe even though it hasn't
> + * checked that the bytes beyond EOF physically match. Hence we cannot use the
> + * EOF block in the source dedupe range because it's not a complete block match,
> + * hence can introduce a corruption into the file that has it's
> + * block replaced.
> + *
> + * Despite this issue, we still need to report that range as successfully
> + * deduped to avoid confusing userspace with EINVAL errors on completely
> + * matching file data. The only time that an unaligned length will be passed to
> + * us is when it spans the EOF block of the source file, so if we simply mask it
> + * down to be block aligned here the we will dedupe everything but that partial
> + * EOF block.
> */
> int
> xfs_reflink_remap_range(
> @@ -1274,6 +1287,14 @@ xfs_reflink_remap_range(
> if (ret <= 0)
> goto out_unlock;
>
> + /*
> + * If the dedupe data matches, chop off the partial EOF block
> + * from the source file so we don't try to dedupe the partial
> + * EOF block.
> + */
> + if (is_dedupe)
> + len &= ~((u64)i_blocksize(inode_in) - 1);
I think that truncating the length like this is going to cause a mess
since we don't have the plumbing to report the shorter dedupe length to
userspace. Granted, this also causes stale data exposure and I don't
want to hold this up for my big long clonerange cleanup to land.
I'll probably end up cleaning up all this into a generic "check these
clone args for block alignment" later anyway, so you might as well go
ahead:
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
--D
> +
> /* Attach dquots to dest inode before changing block map */
> ret = xfs_qm_dqattach(dest);
> if (ret)
> --
> 2.17.0
>
next prev parent reply other threads:[~2018-10-06 6:59 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-10-05 1:23 [PATCH 0/2] xfs: fix another couple of reflink data corruptions Dave Chinner
2018-10-05 1:23 ` [PATCH 1/2] xfs: fix data corruption w/ unaligned dedupe ranges Dave Chinner
2018-10-05 23:58 ` Darrick J. Wong [this message]
2018-10-06 10:47 ` Christoph Hellwig
2018-10-05 1:23 ` [PATCH 2/2] xfs: fix data corruption w/ unaligned reflink ranges Dave Chinner
2018-10-05 1:40 ` Darrick J. Wong
2018-10-05 5:21 ` Dave Chinner
2018-10-05 17:11 ` Darrick J. Wong
2018-10-06 0:00 ` Darrick J. Wong
2018-10-06 10:54 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20181005235821.GA28243@magnolia \
--to=darrick.wong@oracle.com \
--cc=david@fromorbit.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.