* [PATCH] Btrfs: incremental send, fix emission of invalid clone operations
@ 2017-08-10 21:54 fdmanana
2017-08-15 16:45 ` Liu Bo
0 siblings, 1 reply; 4+ messages in thread
From: fdmanana @ 2017-08-10 21:54 UTC (permalink / raw)
To: linux-btrfs
From: Filipe Manana <fdmanana@suse.com>
When doing an incremental send it's possible that the computed send stream
contains clone operations that will fail on the receiver if the receiver
has compression enabled and the clone operations target a sector sized
extent that starts at a zero file offset, is not compressed on the source
filesystem but ends up being compressed and inlined at the destination
filesystem.
Example scenario:
$ mkfs.btrfs -f /dev/sdb
$ mount -o compress /dev/sdb /mnt
# By doing a direct IO write, the data is not compressed.
$ xfs_io -f -d -c "pwrite -S 0xab 0 4K" /mnt/foobar
$ btrfs subvolume snapshot -r /mnt /mnt/mysnap1
$ xfs_io -c "reflink /mnt/foobar 0 8K 4K" /mnt/foobar
$ btrfs subvolume snapshot -r /mnt /mnt/mysnap2
$ btrfs send -f /tmp/1.snap /mnt/mysnap1
$ btrfs send -f /tmp/2.snap -p /mnt/mysnap1 /mnt/mysnap2
$ umount /mnt
$ mkfs.btrfs -f /dev/sdc
$ mount -o compress /dev/sdc /mnt
$ btrfs receive -f /tmp/1.snap /mnt
$ btrfs receive -f /tmp/2.snap /mnt
ERROR: failed to clone extents to foobar
Operation not supported
The same could be achieved by mounting the source filesystem without
compression and doing a buffered IO write instead of a direct IO one,
and mounting the destination filesystem with compression enabled.
So fix this by issuing regular write operations in the send stream
instead of clone operations when the source offset is zero and the
range has a length matching the sector size.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
fs/btrfs/send.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index b082210df9c8..460be72ab78b 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -4992,6 +4992,25 @@ static int clone_range(struct send_ctx *sctx,
struct btrfs_key key;
int ret;
+ /*
+ * Prevent cloning from a zero offset with a length matching the sector
+ * size because in some scenarios this will make the receiver fail.
+ *
+ * For example, if in the source filesystem the extent at offset 0
+ * has a length of sectorsize and it was written using direct IO, then
+ * it can never be an inline extent (even if compression is enabled).
+ * Then this extent can be cloned in the original filesystem to a non
+ * zero file offset, but it may not be possible to clone in the
+ * destination filesystem because it can be inlined due to compression
+ * on the destination filesystem (as the receiver's write operations are
+ * always done using buffered IO). The same happens when the original
+ * filesystem does not have compression enabled but the destination
+ * filesystem has.
+ */
+ if (clone_root->offset == 0 &&
+ len == sctx->send_root->fs_info->sectorsize)
+ return send_extent_data(sctx, offset, len);
+
path = alloc_path_for_send();
if (!path)
return -ENOMEM;
--
2.11.0
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] Btrfs: incremental send, fix emission of invalid clone operations
2017-08-10 21:54 [PATCH] Btrfs: incremental send, fix emission of invalid clone operations fdmanana
@ 2017-08-15 16:45 ` Liu Bo
0 siblings, 0 replies; 4+ messages in thread
From: Liu Bo @ 2017-08-15 16:45 UTC (permalink / raw)
To: fdmanana; +Cc: linux-btrfs
On Thu, Aug 10, 2017 at 10:54:51PM +0100, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
>
> When doing an incremental send it's possible that the computed send stream
> contains clone operations that will fail on the receiver if the receiver
> has compression enabled and the clone operations target a sector sized
> extent that starts at a zero file offset, is not compressed on the source
> filesystem but ends up being compressed and inlined at the destination
> filesystem.
>
> Example scenario:
>
> $ mkfs.btrfs -f /dev/sdb
> $ mount -o compress /dev/sdb /mnt
>
> # By doing a direct IO write, the data is not compressed.
> $ xfs_io -f -d -c "pwrite -S 0xab 0 4K" /mnt/foobar
> $ btrfs subvolume snapshot -r /mnt /mnt/mysnap1
>
> $ xfs_io -c "reflink /mnt/foobar 0 8K 4K" /mnt/foobar
> $ btrfs subvolume snapshot -r /mnt /mnt/mysnap2
>
> $ btrfs send -f /tmp/1.snap /mnt/mysnap1
> $ btrfs send -f /tmp/2.snap -p /mnt/mysnap1 /mnt/mysnap2
> $ umount /mnt
>
> $ mkfs.btrfs -f /dev/sdc
> $ mount -o compress /dev/sdc /mnt
> $ btrfs receive -f /tmp/1.snap /mnt
> $ btrfs receive -f /tmp/2.snap /mnt
> ERROR: failed to clone extents to foobar
> Operation not supported
>
> The same could be achieved by mounting the source filesystem without
> compression and doing a buffered IO write instead of a direct IO one,
> and mounting the destination filesystem with compression enabled.
>
> So fix this by issuing regular write operations in the send stream
> instead of clone operations when the source offset is zero and the
> range has a length matching the sector size.
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Thanks,
-liubo
>
> Signed-off-by: Filipe Manana <fdmanana@suse.com>
> ---
> fs/btrfs/send.c | 19 +++++++++++++++++++
> 1 file changed, 19 insertions(+)
>
> diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
> index b082210df9c8..460be72ab78b 100644
> --- a/fs/btrfs/send.c
> +++ b/fs/btrfs/send.c
> @@ -4992,6 +4992,25 @@ static int clone_range(struct send_ctx *sctx,
> struct btrfs_key key;
> int ret;
>
> + /*
> + * Prevent cloning from a zero offset with a length matching the sector
> + * size because in some scenarios this will make the receiver fail.
> + *
> + * For example, if in the source filesystem the extent at offset 0
> + * has a length of sectorsize and it was written using direct IO, then
> + * it can never be an inline extent (even if compression is enabled).
> + * Then this extent can be cloned in the original filesystem to a non
> + * zero file offset, but it may not be possible to clone in the
> + * destination filesystem because it can be inlined due to compression
> + * on the destination filesystem (as the receiver's write operations are
> + * always done using buffered IO). The same happens when the original
> + * filesystem does not have compression enabled but the destination
> + * filesystem has.
> + */
> + if (clone_root->offset == 0 &&
> + len == sctx->send_root->fs_info->sectorsize)
> + return send_extent_data(sctx, offset, len);
> +
> path = alloc_path_for_send();
> if (!path)
> return -ENOMEM;
> --
> 2.11.0
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH] Btrfs: incremental send, fix emission of invalid clone operations
@ 2019-05-20 8:57 fdmanana
2019-05-28 17:28 ` David Sterba
0 siblings, 1 reply; 4+ messages in thread
From: fdmanana @ 2019-05-20 8:57 UTC (permalink / raw)
To: linux-btrfs
From: Filipe Manana <fdmanana@suse.com>
When doing an incremental send we can now issue clone operations with a
source range that ends at the source's file eof and with a destination
range that ends at an offset smaller then the destination's file eof.
If the eof of the source file is not aligned to the sector size of the
filesystem, the receiver will get a -EINVAL error when trying to do the
operation or, on older kernels, silently corrupt the destination file.
The corruption happens on kernels without commit ac765f83f1397646
("Btrfs: fix data corruption due to cloning of eof block"), while the
failure to clone happens on kernels with that commit.
Example reproducer:
$ mkfs.btrfs -f /dev/sdb
$ mount /dev/sdb /mnt/sdb
$ xfs_io -f -c "pwrite -S 0xb1 0 2M" /mnt/sdb/foo
$ xfs_io -f -c "pwrite -S 0xc7 0 2M" /mnt/sdb/bar
$ xfs_io -f -c "pwrite -S 0x4d 0 2M" /mnt/sdb/baz
$ xfs_io -f -c "pwrite -S 0xe2 0 2M" /mnt/sdb/zoo
$ btrfs subvolume snapshot -r /mnt/sdb /mnt/sdb/base
$ btrfs send -f /tmp/base.send /mnt/sdb/base
$ xfs_io -c "reflink /mnt/sdb/bar 1560K 500K 100K" /mnt/sdb/bar
$ xfs_io -c "reflink /mnt/sdb/bar 1560K 0 100K" /mnt/sdb/zoo
$ xfs_io -c "truncate 550K" /mnt/sdb/bar
$ btrfs subvolume snapshot -r /mnt/sdb /mnt/sdb/incr
$ btrfs send -f /tmp/incr.send -p /mnt/sdb/base /mnt/sdb/incr
$ mkfs.btrfs -f /dev/sdc
$ mount /dev/sdc /mnt/sdc
$ btrfs receive -f /tmp/base.send /mnt/sdc
$ btrfs receive -vv -f /tmp/incr.send /mnt/sdc
(...)
truncate bar size=563200
utimes bar
clone zoo - source=bar source offset=512000 offset=0 length=51200
ERROR: failed to clone extents to zoo
Invalid argument
The failure happens because the clone source range ends at the eof of file
bar, 563200, which is not aligned to the filesystems sector size (4Kb in
this case), and the destination range ends at offset 0 + 51200, which is
less then the size of the file zoo (2Mb).
So fix this by detecting such case and instead of issuing a clone
operation for the whole range, do a clone operation for smaller range
that is sector size aligned followed by a write operation for the block
containing the eof. Here we will always be pessimistic and assume the
destination filesystem of the send stream has the largest possible sector
size (64Kb), since we have no way of determining it.
This fixes a recent regression introduced in kernel 5.2-rc1.
Fixes: 040ee6120cb6706 ("Btrfs: send, improve clone range")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
fs/btrfs/send.c | 46 +++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 43 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 1549d0639b57..66db1271a3cb 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -5224,10 +5224,50 @@ static int clone_range(struct send_ctx *sctx,
clone_len = min_t(u64, ext_len, len);
if (btrfs_file_extent_disk_bytenr(leaf, ei) == disk_byte &&
- clone_data_offset == data_offset)
- ret = send_clone(sctx, offset, clone_len, clone_root);
- else
+ clone_data_offset == data_offset) {
+ const u64 src_end = clone_root->offset + clone_len;
+ const u64 sectorsize = SZ_64K;
+
+ /*
+ * We can't clone the last block, when its size is not
+ * sector size aligned, into the middle of a file. If we
+ * do so, the receiver will get a failure (-EINVAL) when
+ * trying to clone or will silently corrupt the data in
+ * the destination file if it's on a kernel without the
+ * fix introduced by commit ac765f83f1397646
+ * ("Btrfs: fix data corruption due to cloning of eof
+ * block).
+ *
+ * So issue a clone of the aligned down range plus a
+ * regular write for the eof block, if we hit that case.
+ *
+ * Also, we use the maximum possible sector size, 64K,
+ * because we don't know what's the sector size of the
+ * filesystem that receives the stream, so we have to
+ * assume the largest possible sector size.
+ */
+ if (src_end == clone_src_i_size &&
+ !IS_ALIGNED(src_end, sectorsize) &&
+ offset + clone_len < sctx->cur_inode_size) {
+ u64 slen;
+
+ slen = ALIGN_DOWN(src_end - clone_root->offset,
+ sectorsize);
+ if (slen > 0) {
+ ret = send_clone(sctx, offset, slen,
+ clone_root);
+ if (ret < 0)
+ goto out;
+ }
+ ret = send_extent_data(sctx, offset + slen,
+ clone_len - slen);
+ } else {
+ ret = send_clone(sctx, offset, clone_len,
+ clone_root);
+ }
+ } else {
ret = send_extent_data(sctx, offset, clone_len);
+ }
if (ret < 0)
goto out;
--
2.11.0
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] Btrfs: incremental send, fix emission of invalid clone operations
2019-05-20 8:57 fdmanana
@ 2019-05-28 17:28 ` David Sterba
0 siblings, 0 replies; 4+ messages in thread
From: David Sterba @ 2019-05-28 17:28 UTC (permalink / raw)
To: fdmanana; +Cc: linux-btrfs
On Mon, May 20, 2019 at 09:57:00AM +0100, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
>
> When doing an incremental send we can now issue clone operations with a
> source range that ends at the source's file eof and with a destination
> range that ends at an offset smaller then the destination's file eof.
> If the eof of the source file is not aligned to the sector size of the
> filesystem, the receiver will get a -EINVAL error when trying to do the
> operation or, on older kernels, silently corrupt the destination file.
> The corruption happens on kernels without commit ac765f83f1397646
> ("Btrfs: fix data corruption due to cloning of eof block"), while the
> failure to clone happens on kernels with that commit.
>
> Example reproducer:
>
> $ mkfs.btrfs -f /dev/sdb
> $ mount /dev/sdb /mnt/sdb
>
> $ xfs_io -f -c "pwrite -S 0xb1 0 2M" /mnt/sdb/foo
> $ xfs_io -f -c "pwrite -S 0xc7 0 2M" /mnt/sdb/bar
> $ xfs_io -f -c "pwrite -S 0x4d 0 2M" /mnt/sdb/baz
> $ xfs_io -f -c "pwrite -S 0xe2 0 2M" /mnt/sdb/zoo
>
> $ btrfs subvolume snapshot -r /mnt/sdb /mnt/sdb/base
>
> $ btrfs send -f /tmp/base.send /mnt/sdb/base
>
> $ xfs_io -c "reflink /mnt/sdb/bar 1560K 500K 100K" /mnt/sdb/bar
> $ xfs_io -c "reflink /mnt/sdb/bar 1560K 0 100K" /mnt/sdb/zoo
> $ xfs_io -c "truncate 550K" /mnt/sdb/bar
>
> $ btrfs subvolume snapshot -r /mnt/sdb /mnt/sdb/incr
>
> $ btrfs send -f /tmp/incr.send -p /mnt/sdb/base /mnt/sdb/incr
>
> $ mkfs.btrfs -f /dev/sdc
> $ mount /dev/sdc /mnt/sdc
>
> $ btrfs receive -f /tmp/base.send /mnt/sdc
> $ btrfs receive -vv -f /tmp/incr.send /mnt/sdc
> (...)
> truncate bar size=563200
> utimes bar
> clone zoo - source=bar source offset=512000 offset=0 length=51200
> ERROR: failed to clone extents to zoo
> Invalid argument
>
> The failure happens because the clone source range ends at the eof of file
> bar, 563200, which is not aligned to the filesystems sector size (4Kb in
> this case), and the destination range ends at offset 0 + 51200, which is
> less then the size of the file zoo (2Mb).
>
> So fix this by detecting such case and instead of issuing a clone
> operation for the whole range, do a clone operation for smaller range
> that is sector size aligned followed by a write operation for the block
> containing the eof. Here we will always be pessimistic and assume the
> destination filesystem of the send stream has the largest possible sector
> size (64Kb), since we have no way of determining it.
>
> This fixes a recent regression introduced in kernel 5.2-rc1.
>
> Fixes: 040ee6120cb6706 ("Btrfs: send, improve clone range")
> Signed-off-by: Filipe Manana <fdmanana@suse.com>
Added to 5.2-rc queue, thanks.
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2019-05-28 17:28 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-08-10 21:54 [PATCH] Btrfs: incremental send, fix emission of invalid clone operations fdmanana
2017-08-15 16:45 ` Liu Bo
-- strict thread matches above, loose matches on Subject: below --
2019-05-20 8:57 fdmanana
2019-05-28 17:28 ` David Sterba
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).