From: "Darrick J. Wong" <djwong@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org, chandan.babu@oracle.com
Subject: Re: [PATCH] xfs: fix SEEK_HOLE/DATA for regions with active COW extents
Date: Tue, 20 Feb 2024 18:16:25 -0800
Message-ID: <20240221021625.GC616564@frogsfrogsfrogs>
In-Reply-To: <20240220224928.3356-1-david@fromorbit.com>

On Wed, Feb 21, 2024 at 09:49:28AM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> A data corruption problem was reported by CoreOS image builders
> when using reflink based disk image copies and then converting
> them to qcow2 images. The converted images failed the conversion
> verification step, and it was isolated down to the fact that
> qemu-img uses SEEK_HOLE/SEEK_DATA to find the data it is supposed to
> copy.
> 
> The reproducer allowed me to isolate the issue down to a region of
> the file that had overlapping data and COW fork extents, and the
> problem was that the COW fork extent was being reported in its
> entirety by xfs_seek_iomap_begin() and so skipping over the real
> data fork extents in that range.
> 
> This was somewhat hidden by the fact that 'xfs_bmap -vvp' reported
> all the extents correctly, and reading the file completely (i.e. not
> using seek to skip holes) would map the file correctly and all the
> correct data extents are read. Hence the problem is isolated to just
> the xfs_seek_iomap_begin() implementation.
> 
> Instrumentation with trace_printk made the problem obvious: we are
> passing the wrong length to xfs_trim_extent() in
> xfs_seek_iomap_begin(). We are passing the end_fsb, not the
> maximum length of the extent we want to trim the map to. Hence the
> COW extent map never gets trimmed to the start of the next data fork
> extent, and so the seek code treats the entire COW fork extent as
> unwritten and skips entirely over the data fork extents in that
> range.
> 
> Link: https://github.com/coreos/coreos-assembler/issues/3728
> Fixes: 60271ab79d40 ("xfs: fix SEEK_DATA for speculative COW fork preallocation")
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
>  fs/xfs/xfs_iomap.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> index 18c8f168b153..055cdec2e9ad 100644
> --- a/fs/xfs/xfs_iomap.c
> +++ b/fs/xfs/xfs_iomap.c
> @@ -1323,7 +1323,7 @@ xfs_seek_iomap_begin(
>  	if (cow_fsb != NULLFILEOFF && cow_fsb <= offset_fsb) {
>  		if (data_fsb < cow_fsb + cmap.br_blockcount)
>  			end_fsb = min(end_fsb, data_fsb);
> -		xfs_trim_extent(&cmap, offset_fsb, end_fsb);
> +		xfs_trim_extent(&cmap, offset_fsb, end_fsb - offset_fsb);

Doh.  Is there a reproducer we can hammer into a fstests regression test?
Sure would be nice if the type system actually caught things like this
for us.

Anyway thanks for fixing this,
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D
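
For illustration only, here is one way the "type system" wish above could be
approximated in plain C: wrap offsets and lengths in distinct single-member
structs so that passing an end offset where a block count is expected fails
to compile. The names fileoff_t, filblks_t, blocks_between() and window_end()
are hypothetical and are not the kernel's types or helpers; this is a sketch
of the idea, not a proposal for XFS.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical wrapper types; distinct structs are not interchangeable. */
typedef struct { uint64_t v; } fileoff_t;	/* a file block offset */
typedef struct { uint64_t v; } filblks_t;	/* a count of file blocks */

/* Convert a [start, end) pair of offsets into a block count. */
static inline filblks_t blocks_between(fileoff_t start, fileoff_t end)
{
	assert(end.v >= start.v);
	return (filblks_t){ .v = end.v - start.v };
}

/* Compute the exclusive end offset of a window given as offset + length. */
static inline fileoff_t window_end(fileoff_t bno, filblks_t len)
{
	return (fileoff_t){ .v = bno.v + len.v };
}
```

With types like these, a trim helper declared as taking (fileoff_t, filblks_t)
would reject a call that passes two offsets, and the caller would have to
write blocks_between(offset_fsb, end_fsb) explicitly.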

>  		seq = xfs_iomap_inode_sequence(ip, IOMAP_F_SHARED);
>  		error = xfs_bmbt_to_iomap(ip, iomap, &cmap, flags,
>  				IOMAP_F_SHARED, seq);
> @@ -1348,7 +1348,7 @@ xfs_seek_iomap_begin(
>  	imap.br_state = XFS_EXT_NORM;
>  done:
>  	seq = xfs_iomap_inode_sequence(ip, 0);
> -	xfs_trim_extent(&imap, offset_fsb, end_fsb);
> +	xfs_trim_extent(&imap, offset_fsb, end_fsb - offset_fsb);
>  	error = xfs_bmbt_to_iomap(ip, iomap, &imap, flags, 0, seq);
>  out_unlock:
>  	xfs_iunlock(ip, lockmode);
> -- 
> 2.43.0
> 
> 

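To make the arithmetic in the patch concrete, below is a small userspace
model of the trimming rule described in the commit message. It is a sketch
under stated assumptions, not the kernel code: struct extent, trim_extent()
and the two trimmed_count_*() helpers are simplified stand-ins invented for
this example, and the block numbers used afterwards are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for an extent record: only the fields needed here. */
struct extent {
	uint64_t startoff;	/* first file block covered by the extent */
	uint64_t blockcount;	/* number of blocks in the extent */
};

/*
 * Model of the trimming rule: clamp *ext to the window [bno, bno + len).
 * The third argument is a length, not an end offset.
 */
static void trim_extent(struct extent *ext, uint64_t bno, uint64_t len)
{
	uint64_t end = bno + len;
	uint64_t ext_end = ext->startoff + ext->blockcount;

	/* No overlap with the window: the result is an empty extent. */
	if (ext_end <= bno || ext->startoff >= end) {
		ext->blockcount = 0;
		return;
	}
	/* Trim the front of the extent up to the window start. */
	if (ext->startoff < bno) {
		ext->blockcount -= bno - ext->startoff;
		ext->startoff = bno;
	}
	/* Trim the tail of the extent back to the window end. */
	if (end < ext->startoff + ext->blockcount)
		ext->blockcount = end - ext->startoff;
}

/* Buggy call shape: end_fsb is passed where a length is expected. */
static uint64_t trimmed_count_buggy(struct extent ext, uint64_t offset_fsb,
				    uint64_t end_fsb)
{
	trim_extent(&ext, offset_fsb, end_fsb);
	return ext.blockcount;
}

/* Fixed call shape: the window length is end_fsb - offset_fsb. */
static uint64_t trimmed_count_fixed(struct extent ext, uint64_t offset_fsb,
				    uint64_t end_fsb)
{
	assert(end_fsb >= offset_fsb);
	trim_extent(&ext, offset_fsb, end_fsb - offset_fsb);
	return ext.blockcount;
}
```

Take a COW extent covering blocks 100 to 199, a seek starting at block 100,
and the next data fork extent at block 120 (so end_fsb is 120). The buggy
call trims against the window [100, 220) and leaves all 100 blocks in the
mapping, while the fixed call trims against [100, 120) and leaves 20 blocks,
so the mapping stops where the data fork extent begins.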