Linux Btrfs filesystem development
 help / color / mirror / Atom feed
* [PATCH 0/2] btrfs: fix btrfs_file_extent_item::ram_bytes of btrfs_split_ordered_extent()
@ 2024-04-15 23:24 Qu Wenruo
  2024-04-15 23:24 ` [PATCH 1/2] btrfs: set correct ram_bytes when splitting ordered extent Qu Wenruo
  2024-04-15 23:24 ` [PATCH 2/2] btrfs: tree-checker: add one extra file extent item ram_bytes check Qu Wenruo
  0 siblings, 2 replies; 5+ messages in thread
From: Qu Wenruo @ 2024-04-15 23:24 UTC (permalink / raw)
  To: linux-btrfs

During my extent_map members rework, I added a sanity check to make sure
regular non-compressed extent_map would have its disk_num_bytes to match
ram_bytes.

But that extent_map sanity check always fail as we have on-disk file
extent items which has its ram_bytes much larger than the corresponding
disk_num_bytes, even if it's not compressed.

It turns out that, the ram_bytes > disk_num_bytes is caused by
btrfs_split_ordered_extent(), where it doesn't properly update
ram_bytes, resulting it larger than disk_num_bytes.

Thankfully everything is fine, as our code doesn't really bother
ram_bytes for non-compressed regular file extents, so no real damage.

Still I'd like to catch such problem in the future, so add another
tree-checker patch for this case.

And since the invalid ram_bytes is already in the wild for a while, we
do not want to bother the end users to fix their fs for nothing.
So the check is only behind DEBUG builds.

Furthermore, the tree-checker is only to make sure @ram_bytes <
@disk_num_bytes for non-compressed file extents.
As we still have other locations to make @ram_bytes < @disk_num_bytes.

And for btrfs-progs, I'm going to add extra check and repair support
soon.

Qu Wenruo (2):
  btrfs: set correct ram_bytes when splitting ordered extent
  btrfs: tree-checker: add one extra file extent item ram_bytes check

 fs/btrfs/ordered-data.c |  1 +
 fs/btrfs/tree-checker.c | 35 +++++++++++++++++++++++++++++------
 2 files changed, 30 insertions(+), 6 deletions(-)

-- 
2.44.0


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/2] btrfs: set correct ram_bytes when splitting ordered extent
  2024-04-15 23:24 [PATCH 0/2] btrfs: fix btrfs_file_extent_item::ram_bytes of btrfs_split_ordered_extent() Qu Wenruo
@ 2024-04-15 23:24 ` Qu Wenruo
  2024-04-16 11:38   ` Filipe Manana
  2024-04-15 23:24 ` [PATCH 2/2] btrfs: tree-checker: add one extra file extent item ram_bytes check Qu Wenruo
  1 sibling, 1 reply; 5+ messages in thread
From: Qu Wenruo @ 2024-04-15 23:24 UTC (permalink / raw)
  To: linux-btrfs

[BUG]
When running generic/287, the following file extent items can be
generated:

        item 16 key (258 EXTENT_DATA 2682880) itemoff 15305 itemsize 53
                generation 9 type 1 (regular)
                extent data disk byte 1378414592 nr 462848
                extent data offset 0 nr 462848 ram 2097152
                extent compression 0 (none)

Note that file extent item is not a compressed one, but its ram_bytes is
way larger than its disk_num_bytes.

According to btrfs on-disk scheme, ram_bytes should match disk_num_bytes
if it's not a compressed one.

[CAUSE]
Since commit b73a6fd1b1ef ("btrfs: split partial dio bios before
submit"), for partial dio writes, we would split the ordered extent.

However the function btrfs_split_ordered_extent() doesn't update the
ram_bytes even it has already shrunk the disk_num_bytes.

Originally the function btrfs_split_ordered_extent() is only introduced
for zoned devices in commit d22002fd37bd ("btrfs: zoned: split ordered
extent when bio is sent"), but later commit b73a6fd1b1ef ("btrfs: split
partial dio bios before submit") makes non-zoned btrfs affected.

Thankfully for un-compressed file extent, we do not really utilize the
ram_bytes member, thus it won't cause any real problem.

[FIX]
Also update btrfs_ordered_extent::ram_bytes inside
btrfs_split_ordered_extent().

Fixes: b73a6fd1b1ef ("btrfs: split partial dio bios before submit")
Fixes: d22002fd37bd ("btrfs: zoned: split ordered extent when bio is sent")
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/ordered-data.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index b749ba45da2b..c2a42bcde98e 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -1188,6 +1188,7 @@ struct btrfs_ordered_extent *btrfs_split_ordered_extent(
 	ordered->disk_bytenr += len;
 	ordered->num_bytes -= len;
 	ordered->disk_num_bytes -= len;
+	ordered->ram_bytes -= len;
 
 	if (test_bit(BTRFS_ORDERED_IO_DONE, &ordered->flags)) {
 		ASSERT(ordered->bytes_left == 0);
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/2] btrfs: tree-checker: add one extra file extent item ram_bytes check
  2024-04-15 23:24 [PATCH 0/2] btrfs: fix btrfs_file_extent_item::ram_bytes of btrfs_split_ordered_extent() Qu Wenruo
  2024-04-15 23:24 ` [PATCH 1/2] btrfs: set correct ram_bytes when splitting ordered extent Qu Wenruo
@ 2024-04-15 23:24 ` Qu Wenruo
  2024-04-16 11:41   ` Filipe Manana
  1 sibling, 1 reply; 5+ messages in thread
From: Qu Wenruo @ 2024-04-15 23:24 UTC (permalink / raw)
  To: linux-btrfs

During my development on extent map cleanups, I hit a case where we can
create a file extent item that has ram_bytes double the size of
num_bytes but it's not compressed.

Later it turns out to be a bug in btrfs_split_ordered_extent(), and
thankfully it doesn't cause any real corruption, just a drift from
on-disk format.

Here we add an extra check on ram_bytes for btrfs_file_extent_item to
catch such problem.

However considering the incorrect ram_bytes are already in the wild, and
no real data corruption, we do not want end users to be bothered as their
data is still consistent.

So this patch would only hide the check behind DEBUG builds for us
developers to catch future problem.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/tree-checker.c | 35 +++++++++++++++++++++++++++++------
 1 file changed, 29 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index c8fbcae4e88e..8dfbec3e6ba2 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -212,6 +212,7 @@ static int check_extent_data_item(struct extent_buffer *leaf,
 	u32 sectorsize = fs_info->sectorsize;
 	u32 item_size = btrfs_item_size(leaf, slot);
 	u64 extent_end;
+	u8 compression;
 
 	if (unlikely(!IS_ALIGNED(key->offset, sectorsize))) {
 		file_extent_err(leaf, slot,
@@ -251,16 +252,15 @@ static int check_extent_data_item(struct extent_buffer *leaf,
 		return -EUCLEAN;
 	}
 
+	compression = btrfs_file_extent_compression(leaf, fi);
 	/*
 	 * Support for new compression/encryption must introduce incompat flag,
 	 * and must be caught in open_ctree().
 	 */
-	if (unlikely(btrfs_file_extent_compression(leaf, fi) >=
-		     BTRFS_NR_COMPRESS_TYPES)) {
+	if (unlikely(compression >= BTRFS_NR_COMPRESS_TYPES)) {
 		file_extent_err(leaf, slot,
 	"invalid compression for file extent, have %u expect range [0, %u]",
-			btrfs_file_extent_compression(leaf, fi),
-			BTRFS_NR_COMPRESS_TYPES - 1);
+			compression, BTRFS_NR_COMPRESS_TYPES - 1);
 		return -EUCLEAN;
 	}
 	if (unlikely(btrfs_file_extent_encryption(leaf, fi))) {
@@ -279,8 +279,7 @@ static int check_extent_data_item(struct extent_buffer *leaf,
 		}
 
 		/* Compressed inline extent has no on-disk size, skip it */
-		if (btrfs_file_extent_compression(leaf, fi) !=
-		    BTRFS_COMPRESS_NONE)
+		if (compression != BTRFS_COMPRESS_NONE)
 			return 0;
 
 		/* Uncompressed inline extent size must match item size */
@@ -319,6 +318,30 @@ static int check_extent_data_item(struct extent_buffer *leaf,
 		return -EUCLEAN;
 	}
 
+	/*
+	 * If it's a uncompressed regular extents, its ram size should match
+	 * disk_num_bytes. But for now we have several call sites that doesn't
+	 * properly update @ram_bytes, so at least make sure
+	 * @ram_bytes <= @disk_num_bytes.
+	 *
+	 * However we have a recent bug related to @ram_bytes update, causing
+	 * all zoned btrfs and regular btrfs DIO to be affected.
+	 * Thankfully the ram_bytes is not critical for non-compressed file extents.
+	 * So here we hide the check behind DEBUG builds for developers only.
+	 */
+#ifdef CONFIG_BTRFS_DEBUG
+	if (unlikely(compression == BTRFS_COMPRESS_NONE &&
+		     btrfs_file_extent_disk_bytenr(leaf, fi) &&
+		     btrfs_file_extent_ram_bytes(leaf, fi) >
+		     btrfs_file_extent_disk_num_bytes(leaf, fi))) {
+		file_extent_err(leaf, slot,
+				"invalid ram_bytes, have %llu expect <=%llu",
+				btrfs_file_extent_ram_bytes(leaf, fi),
+				btrfs_file_extent_disk_num_bytes(leaf, fi));
+		return -EUCLEAN;
+	}
+#endif
+
 	/*
 	 * Check that no two consecutive file extent items, in the same leaf,
 	 * present ranges that overlap each other.
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/2] btrfs: set correct ram_bytes when splitting ordered extent
  2024-04-15 23:24 ` [PATCH 1/2] btrfs: set correct ram_bytes when splitting ordered extent Qu Wenruo
@ 2024-04-16 11:38   ` Filipe Manana
  0 siblings, 0 replies; 5+ messages in thread
From: Filipe Manana @ 2024-04-16 11:38 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Tue, Apr 16, 2024 at 12:24 AM Qu Wenruo <wqu@suse.com> wrote:
>
> [BUG]
> When running generic/287, the following file extent items can be
> generated:
>
>         item 16 key (258 EXTENT_DATA 2682880) itemoff 15305 itemsize 53
>                 generation 9 type 1 (regular)
>                 extent data disk byte 1378414592 nr 462848
>                 extent data offset 0 nr 462848 ram 2097152
>                 extent compression 0 (none)
>
> Note that file extent item is not a compressed one, but its ram_bytes is
> way larger than its disk_num_bytes.
>
> According to btrfs on-disk scheme, ram_bytes should match disk_num_bytes
> if it's not a compressed one.
>
> [CAUSE]
> Since commit b73a6fd1b1ef ("btrfs: split partial dio bios before
> submit"), for partial dio writes, we would split the ordered extent.
>
> However the function btrfs_split_ordered_extent() doesn't update the
> ram_bytes even it has already shrunk the disk_num_bytes.
>
> Originally the function btrfs_split_ordered_extent() is only introduced
> for zoned devices in commit d22002fd37bd ("btrfs: zoned: split ordered
> extent when bio is sent"), but later commit b73a6fd1b1ef ("btrfs: split
> partial dio bios before submit") makes non-zoned btrfs affected.
>
> Thankfully for un-compressed file extent, we do not really utilize the
> ram_bytes member, thus it won't cause any real problem.
>
> [FIX]
> Also update btrfs_ordered_extent::ram_bytes inside
> btrfs_split_ordered_extent().
>
> Fixes: b73a6fd1b1ef ("btrfs: split partial dio bios before submit")
> Fixes: d22002fd37bd ("btrfs: zoned: split ordered extent when bio is sent")

It's confusing to have two commits here.

Given that the second one is the one that introduced the ordered
extent splitting function and that the first one just uses that
function, leaving only the second Fixes tag makes it clear and avoids
confusion. I.e. it's not the first commit that introduced the bug.

Otherwise it looks good, thanks.

Reviewed-by: Filipe Manana <fdmanana@suse.com>

> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  fs/btrfs/ordered-data.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
> index b749ba45da2b..c2a42bcde98e 100644
> --- a/fs/btrfs/ordered-data.c
> +++ b/fs/btrfs/ordered-data.c
> @@ -1188,6 +1188,7 @@ struct btrfs_ordered_extent *btrfs_split_ordered_extent(
>         ordered->disk_bytenr += len;
>         ordered->num_bytes -= len;
>         ordered->disk_num_bytes -= len;
> +       ordered->ram_bytes -= len;
>
>         if (test_bit(BTRFS_ORDERED_IO_DONE, &ordered->flags)) {
>                 ASSERT(ordered->bytes_left == 0);
> --
> 2.44.0
>
>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 2/2] btrfs: tree-checker: add one extra file extent item ram_bytes check
  2024-04-15 23:24 ` [PATCH 2/2] btrfs: tree-checker: add one extra file extent item ram_bytes check Qu Wenruo
@ 2024-04-16 11:41   ` Filipe Manana
  0 siblings, 0 replies; 5+ messages in thread
From: Filipe Manana @ 2024-04-16 11:41 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Tue, Apr 16, 2024 at 12:24 AM Qu Wenruo <wqu@suse.com> wrote:
>
> During my development on extent map cleanups, I hit a case where we can
> create a file extent item that has ram_bytes double the size of
> num_bytes but it's not compressed.
>
> Later it turns out to be a bug in btrfs_split_ordered_extent(), and
> thankfully it doesn't cause any real corruption, just a drift from
> on-disk format.
>
> Here we add an extra check on ram_bytes for btrfs_file_extent_item to
> catch such problem.
>
> However considering the incorrect ram_bytes are already in the wild, and
> no real data corruption, we do not want end users to be bothered as their
> data is still consistent.
>
> So this patch would only hide the check behind DEBUG builds for us
> developers to catch future problem.
>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  fs/btrfs/tree-checker.c | 35 +++++++++++++++++++++++++++++------
>  1 file changed, 29 insertions(+), 6 deletions(-)
>
> diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
> index c8fbcae4e88e..8dfbec3e6ba2 100644
> --- a/fs/btrfs/tree-checker.c
> +++ b/fs/btrfs/tree-checker.c
> @@ -212,6 +212,7 @@ static int check_extent_data_item(struct extent_buffer *leaf,
>         u32 sectorsize = fs_info->sectorsize;
>         u32 item_size = btrfs_item_size(leaf, slot);
>         u64 extent_end;
> +       u8 compression;
>
>         if (unlikely(!IS_ALIGNED(key->offset, sectorsize))) {
>                 file_extent_err(leaf, slot,
> @@ -251,16 +252,15 @@ static int check_extent_data_item(struct extent_buffer *leaf,
>                 return -EUCLEAN;
>         }
>
> +       compression = btrfs_file_extent_compression(leaf, fi);
>         /*
>          * Support for new compression/encryption must introduce incompat flag,
>          * and must be caught in open_ctree().
>          */
> -       if (unlikely(btrfs_file_extent_compression(leaf, fi) >=
> -                    BTRFS_NR_COMPRESS_TYPES)) {
> +       if (unlikely(compression >= BTRFS_NR_COMPRESS_TYPES)) {
>                 file_extent_err(leaf, slot,
>         "invalid compression for file extent, have %u expect range [0, %u]",
> -                       btrfs_file_extent_compression(leaf, fi),
> -                       BTRFS_NR_COMPRESS_TYPES - 1);
> +                       compression, BTRFS_NR_COMPRESS_TYPES - 1);
>                 return -EUCLEAN;
>         }
>         if (unlikely(btrfs_file_extent_encryption(leaf, fi))) {
> @@ -279,8 +279,7 @@ static int check_extent_data_item(struct extent_buffer *leaf,
>                 }
>
>                 /* Compressed inline extent has no on-disk size, skip it */
> -               if (btrfs_file_extent_compression(leaf, fi) !=
> -                   BTRFS_COMPRESS_NONE)
> +               if (compression != BTRFS_COMPRESS_NONE)
>                         return 0;
>
>                 /* Uncompressed inline extent size must match item size */
> @@ -319,6 +318,30 @@ static int check_extent_data_item(struct extent_buffer *leaf,
>                 return -EUCLEAN;
>         }
>
> +       /*
> +        * If it's a uncompressed regular extents, its ram size should match
> +        * disk_num_bytes. But for now we have several call sites that doesn't
> +        * properly update @ram_bytes, so at least make sure
> +        * @ram_bytes <= @disk_num_bytes.
> +        *
> +        * However we have a recent bug related to @ram_bytes update, causing

Reading this in one year or more will be confusing.
Why not just say:

"However we had a bug related to ..."

> +        * all zoned btrfs and regular btrfs DIO to be affected.
> +        * Thankfully the ram_bytes is not critical for non-compressed file extents.
> +        * So here we hide the check behind DEBUG builds for developers only.
> +        */
> +#ifdef CONFIG_BTRFS_DEBUG
> +       if (unlikely(compression == BTRFS_COMPRESS_NONE &&
> +                    btrfs_file_extent_disk_bytenr(leaf, fi) &&
> +                    btrfs_file_extent_ram_bytes(leaf, fi) >
> +                    btrfs_file_extent_disk_num_bytes(leaf, fi))) {
> +               file_extent_err(leaf, slot,
> +                               "invalid ram_bytes, have %llu expect <=%llu",

Please leave a space between <= and %llu, that makes it more readable,
consistent with code and consistent with an error message at
check_block_group_item().

With that change:

Reviewed-by: Filipe Manana <fdmanana@suse.com>

Thanks.

> +                               btrfs_file_extent_ram_bytes(leaf, fi),
> +                               btrfs_file_extent_disk_num_bytes(leaf, fi));
> +               return -EUCLEAN;
> +       }
> +#endif
> +
>         /*
>          * Check that no two consecutive file extent items, in the same leaf,
>          * present ranges that overlap each other.
> --
> 2.44.0
>
>

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2024-04-16 11:42 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-04-15 23:24 [PATCH 0/2] btrfs: fix btrfs_file_extent_item::ram_bytes of btrfs_split_ordered_extent() Qu Wenruo
2024-04-15 23:24 ` [PATCH 1/2] btrfs: set correct ram_bytes when splitting ordered extent Qu Wenruo
2024-04-16 11:38   ` Filipe Manana
2024-04-15 23:24 ` [PATCH 2/2] btrfs: tree-checker: add one extra file extent item ram_bytes check Qu Wenruo
2024-04-16 11:41   ` Filipe Manana

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox