public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/2] btrfs: verify cached extent buffers against tree parent checks
@ 2026-03-13  9:19 ZhengYuan Huang
  2026-03-13  9:19 ` [PATCH v2 1/2] btrfs: add tree parent check to btrfs_buffer_uptodate() ZhengYuan Huang
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: ZhengYuan Huang @ 2026-03-13  9:19 UTC (permalink / raw)
  To: dsterba, clm
  Cc: wqu, linux-btrfs, linux-kernel, baijiaju1990, r33s3n6, zzzccc427,
	ZhengYuan Huang

This series fixes a btrfs crash caused by reusing a cached extent buffer
without re-running the caller supplied tree-parent verification.

The problem happens when a tree block is first read and validated with one
expected level, then later looked up again through a path that derives a
different expected level from corrupted metadata. If the extent buffer is
already marked EXTENT_BUFFER_UPTODATE, the cached-hit path returns it
without re-validating the supplied btrfs_tree_parent_check. This can allow
an inconsistent btrfs_root to be constructed and later lead to a
null-ptr-deref during backref walking.

Patch 1/2 is a preparatory change that extends
btrfs_buffer_uptodate() to support tree-parent verification on cached
buffers. Patch 2/2 uses that support on the cached-hit path and contains
the actual fix.

Together, these changes make cache hits and fresh reads follow the same
tree-parent verification rules, turning the corruption into a read failure
instead of constructing an inconsistent root object and crashing later.

For reference, a more detailed analysis of the trigger path is available at:
https://lore.kernel.org/all/CAOmEq9U14a=pwN_dw2M70gfujhMKki434cfmegoxcyUpkYs5bQ@mail.gmail.com/

Changes since v1:
  - drop the adhoc root-specific consistency check in read_tree_root_path()
  - move the validation into the cached-hit path as suggested by Qu Wenruo
  - extend btrfs_buffer_uptodate() with an optional tree-parent check
  - make read_tree_root_path() pass its check when validating a cached root

ZhengYuan Huang (2):
  btrfs: add tree parent check to btrfs_buffer_uptodate()
  btrfs: revalidate cached tree blocks on the uptodate path

 fs/btrfs/ctree.c       |  2 +-
 fs/btrfs/disk-io.c     | 18 ++++++++++++++----
 fs/btrfs/disk-io.h     |  3 ++-
 fs/btrfs/extent-tree.c |  2 +-
 fs/btrfs/extent_io.c   | 12 ++++++++++--
 fs/btrfs/tree-log.c    |  2 +-
 6 files changed, 29 insertions(+), 10 deletions(-)

-- 
2.43.0


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH v2 1/2] btrfs: add tree parent check to btrfs_buffer_uptodate()
  2026-03-13  9:19 [PATCH v2 0/2] btrfs: verify cached extent buffers against tree parent checks ZhengYuan Huang
@ 2026-03-13  9:19 ` ZhengYuan Huang
  2026-03-13  9:19 ` [PATCH v2 2/2] btrfs: revalidate cached tree blocks on the uptodate path ZhengYuan Huang
  2026-03-13 23:49 ` [PATCH v2 0/2] btrfs: verify cached extent buffers against tree parent checks Qu Wenruo
  2 siblings, 0 replies; 5+ messages in thread
From: ZhengYuan Huang @ 2026-03-13  9:19 UTC (permalink / raw)
  To: dsterba, clm
  Cc: wqu, linux-btrfs, linux-kernel, baijiaju1990, r33s3n6, zzzccc427,
	ZhengYuan Huang

btrfs_buffer_uptodate() only checks whether an extent buffer is uptodate
and whether its generation matches the expected parent transid.

For callers that also need tree-parent verification, this is not enough on
a cache hit, because a cached extent buffer can be reported uptodate
without being checked against the expected level/first_key constraints.

Extend btrfs_buffer_uptodate() to take an optional
btrfs_tree_parent_check so cached buffers can be verified before being
reported uptodate.

For now all callers pass NULL, so there is no functional change. A
follow-up patch will use the new argument on the relevant cached-hit path.

Signed-off-by: ZhengYuan Huang <gality369@gmail.com>
---
 fs/btrfs/ctree.c       |  2 +-
 fs/btrfs/disk-io.c     | 14 +++++++++++---
 fs/btrfs/disk-io.h     |  3 ++-
 fs/btrfs/extent-tree.c |  2 +-
 fs/btrfs/extent_io.c   |  2 +-
 fs/btrfs/tree-log.c    |  2 +-
 6 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index 561658aca018..c008b847200a 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -1493,7 +1493,7 @@ read_block_for_search(struct btrfs_root *root, struct btrfs_path *p,
 			reada_for_search(fs_info, p, parent_level, slot, key->objectid);
 
 		/* first we do an atomic uptodate check */
-		if (btrfs_buffer_uptodate(tmp, check.transid, true) > 0) {
+		if (btrfs_buffer_uptodate(tmp, check.transid, true, NULL) > 0) {
 			/*
 			 * Do extra check for first_key, eb can be stale due to
 			 * being cached, read from scrub, or have multiple
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 900e462d8ea1..8773f1f7ea46 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -116,13 +116,21 @@ static void csum_tree_block(struct extent_buffer *buf, u8 *result)
  * detect blocks that either didn't get written at all or got written
  * in the wrong place.
  */
-int btrfs_buffer_uptodate(struct extent_buffer *eb, u64 parent_transid, bool atomic)
+int btrfs_buffer_uptodate(struct extent_buffer *eb, u64 parent_transid, bool atomic,
+			  const struct btrfs_tree_parent_check *check)
 {
 	if (!extent_buffer_uptodate(eb))
 		return 0;
 
-	if (!parent_transid || btrfs_header_generation(eb) == parent_transid)
+	if (!parent_transid || btrfs_header_generation(eb) == parent_transid) {
+		/*
+		 * On a cache hit, the caller may still need tree parent
+		 * verification before reusing the buffer.
+		 */
+		if (check && btrfs_verify_level_key(eb, check))
+			return -EUCLEAN;
 		return 1;
+	}
 
 	if (atomic)
 		return -EAGAIN;
@@ -1046,7 +1054,7 @@ static struct btrfs_root *read_tree_root_path(struct btrfs_root *tree_root,
 		root->node = NULL;
 		goto fail;
 	}
-	if (unlikely(!btrfs_buffer_uptodate(root->node, generation, false))) {
+	if (unlikely(!btrfs_buffer_uptodate(root->node, generation, false, NULL))) {
 		ret = -EIO;
 		goto fail;
 	}
diff --git a/fs/btrfs/disk-io.h b/fs/btrfs/disk-io.h
index 57920f2c6fe4..aef106484dbe 100644
--- a/fs/btrfs/disk-io.h
+++ b/fs/btrfs/disk-io.h
@@ -106,7 +106,8 @@ static inline struct btrfs_root *btrfs_grab_root(struct btrfs_root *root)
 void btrfs_put_root(struct btrfs_root *root);
 void btrfs_mark_buffer_dirty(struct btrfs_trans_handle *trans,
 			     struct extent_buffer *buf);
-int btrfs_buffer_uptodate(struct extent_buffer *buf, u64 parent_transid, bool atomic);
+int btrfs_buffer_uptodate(struct extent_buffer *buf, u64 parent_transid, bool atomic,
+			  const struct btrfs_tree_parent_check *check);
 int btrfs_read_extent_buffer(struct extent_buffer *buf,
 			     const struct btrfs_tree_parent_check *check);
 
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index dc4ca98c3780..54daa8672272 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -5584,7 +5584,7 @@ static int check_next_block_uptodate(struct btrfs_trans_handle *trans,
 
 	generation = btrfs_node_ptr_generation(path->nodes[level], path->slots[level]);
 
-	if (btrfs_buffer_uptodate(next, generation, false))
+	if (btrfs_buffer_uptodate(next, generation, false, NULL))
 		return 0;
 
 	check.level = level - 1;
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 14da72a9a950..93eed1d3716c 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4574,7 +4574,7 @@ void btrfs_readahead_tree_block(struct btrfs_fs_info *fs_info,
 	if (IS_ERR(eb))
 		return;
 
-	if (btrfs_buffer_uptodate(eb, gen, true)) {
+	if (btrfs_buffer_uptodate(eb, gen, true, NULL)) {
 		free_extent_buffer(eb);
 		return;
 	}
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 30f3c3b849c1..ff1a83d26598 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -456,7 +456,7 @@ static int process_one_buffer(struct extent_buffer *eb,
 			return ret;
 		}
 
-		if (btrfs_buffer_uptodate(eb, gen, false) && level == 0) {
+		if (btrfs_buffer_uptodate(eb, gen, false, NULL) && level == 0) {
 			ret = btrfs_exclude_logged_extents(eb);
 			if (ret)
 				btrfs_abort_transaction(trans, ret);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v2 2/2] btrfs: revalidate cached tree blocks on the uptodate path
  2026-03-13  9:19 [PATCH v2 0/2] btrfs: verify cached extent buffers against tree parent checks ZhengYuan Huang
  2026-03-13  9:19 ` [PATCH v2 1/2] btrfs: add tree parent check to btrfs_buffer_uptodate() ZhengYuan Huang
@ 2026-03-13  9:19 ` ZhengYuan Huang
  2026-03-15 21:17   ` Qu Wenruo
  2026-03-13 23:49 ` [PATCH v2 0/2] btrfs: verify cached extent buffers against tree parent checks Qu Wenruo
  2 siblings, 1 reply; 5+ messages in thread
From: ZhengYuan Huang @ 2026-03-13  9:19 UTC (permalink / raw)
  To: dsterba, clm
  Cc: wqu, linux-btrfs, linux-kernel, baijiaju1990, r33s3n6, zzzccc427,
	ZhengYuan Huang

read_extent_buffer_pages_nowait() returns immediately when an extent
buffer is already marked EXTENT_BUFFER_UPTODATE. On that cache-hit path,
the caller supplied btrfs_tree_parent_check is not re-run.

This can let read_tree_root_path() accept a cached tree block whose
actual header level does not match the expected level derived from the
root item. In particular, if root_item.level is corrupted while the
actual root block was already cached and validated earlier with a
different expected level, the later read hits the cached uptodate path,
skips re-validation, and builds an inconsistent btrfs_root.

That inconsistent root can later lead to a null-ptr-deref in
handle_indirect_tree_backref(), because backref walking uses
root->root_item.level while btrfs_search_slot() fills path->nodes[]
according to the cached commit_root's actual level.

Fix this by re-validating cached extent buffers against the supplied
btrfs_tree_parent_check on the EXTENT_BUFFER_UPTODATE path, and make
read_tree_root_path() pass its check to btrfs_buffer_uptodate().

This makes cache hits and fresh reads follow the same tree-parent
verification rules, and turns the corruption into a read failure instead
of constructing an inconsistent root object.

Signed-off-by: ZhengYuan Huang <gality369@gmail.com>
---
 fs/btrfs/disk-io.c   |  6 ++++--
 fs/btrfs/extent_io.c | 12 +++++++++++-
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 8773f1f7ea46..9a8c06c0adc2 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1054,8 +1054,10 @@ static struct btrfs_root *read_tree_root_path(struct btrfs_root *tree_root,
 		root->node = NULL;
 		goto fail;
 	}
-	if (unlikely(!btrfs_buffer_uptodate(root->node, generation, false, NULL))) {
-		ret = -EIO;
+	ret = btrfs_buffer_uptodate(root->node, generation, false, &check);
+	if (unlikely(ret <= 0)) {
+		if (ret == 0)
+			ret = -EIO;
 		goto fail;
 	}
 
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 93eed1d3716c..1324449e892d 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -3828,8 +3828,13 @@ int read_extent_buffer_pages_nowait(struct extent_buffer *eb, int mirror_num,
 {
 	struct btrfs_bio *bbio;
 
-	if (test_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags))
+	if (test_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags)) {
+		int ret = btrfs_buffer_uptodate(eb, 0, true, check);
+
+		if (unlikely(ret < 0))
+			return ret;
 		return 0;
+	}
 
 	/*
 	 * We could have had EXTENT_BUFFER_UPTODATE cleared by the write
@@ -3850,7 +3855,12 @@ int read_extent_buffer_pages_nowait(struct extent_buffer *eb, int mirror_num,
 	 * will now be set, and we shouldn't read it in again.
 	 */
 	if (unlikely(test_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags))) {
+		int ret;
+
 		clear_extent_buffer_reading(eb);
+		ret = btrfs_buffer_uptodate(eb, 0, true, check);
+		if (unlikely(ret < 0))
+			return ret;
 		return 0;
 	}
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v2 0/2] btrfs: verify cached extent buffers against tree parent checks
  2026-03-13  9:19 [PATCH v2 0/2] btrfs: verify cached extent buffers against tree parent checks ZhengYuan Huang
  2026-03-13  9:19 ` [PATCH v2 1/2] btrfs: add tree parent check to btrfs_buffer_uptodate() ZhengYuan Huang
  2026-03-13  9:19 ` [PATCH v2 2/2] btrfs: revalidate cached tree blocks on the uptodate path ZhengYuan Huang
@ 2026-03-13 23:49 ` Qu Wenruo
  2 siblings, 0 replies; 5+ messages in thread
From: Qu Wenruo @ 2026-03-13 23:49 UTC (permalink / raw)
  To: ZhengYuan Huang, dsterba, clm
  Cc: linux-btrfs, linux-kernel, baijiaju1990, r33s3n6, zzzccc427



在 2026/3/13 19:49, ZhengYuan Huang 写道:
> This series fixes a btrfs crash caused by reusing a cached extent buffer
> without re-running the caller supplied tree-parent verification.
> 
> The problem happens when a tree block is first read and validated with one
> expected level, then later looked up again through a path that derives a
> different expected level from corrupted metadata. If the extent buffer is
> already marked EXTENT_BUFFER_UPTODATE, the cached-hit path returns it
> without re-validating the supplied btrfs_tree_parent_check. This can allow
> an inconsistent btrfs_root to be constructed and later lead to a
> null-ptr-deref during backref walking.
> 
> Patch 1/2 is a preparatory change that extends
> btrfs_buffer_uptodate() to support tree-parent verification on cached
> buffers. Patch 2/2 uses that support on the cached-hit path and contains
> the actual fix.
> 
> Together, these changes make cache hits and fresh reads follow the same
> tree-parent verification rules, turning the corruption into a read failure
> instead of constructing an inconsistent root object and crashing later.
> 
> For reference, a more detailed analysis of the trigger path is available at:
> https://lore.kernel.org/all/CAOmEq9U14a=pwN_dw2M70gfujhMKki434cfmegoxcyUpkYs5bQ@mail.gmail.com/
> 
> Changes since v1:
>    - drop the adhoc root-specific consistency check in read_tree_root_path()
>    - move the validation into the cached-hit path as suggested by Qu Wenruo
>    - extend btrfs_buffer_uptodate() with an optional tree-parent check
>    - make read_tree_root_path() pass its check when validating a cached root
> 
> ZhengYuan Huang (2):
>    btrfs: add tree parent check to btrfs_buffer_uptodate()
>    btrfs: revalidate cached tree blocks on the uptodate path

The code looks good to me, but considering both patches are small, and 
without the second patch the first one doesn't make much sense.

So I'll merge both into a single patch at merging.

Thanks,
Qu

> 
>   fs/btrfs/ctree.c       |  2 +-
>   fs/btrfs/disk-io.c     | 18 ++++++++++++++----
>   fs/btrfs/disk-io.h     |  3 ++-
>   fs/btrfs/extent-tree.c |  2 +-
>   fs/btrfs/extent_io.c   | 12 ++++++++++--
>   fs/btrfs/tree-log.c    |  2 +-
>   6 files changed, 29 insertions(+), 10 deletions(-)
> 


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2 2/2] btrfs: revalidate cached tree blocks on the uptodate path
  2026-03-13  9:19 ` [PATCH v2 2/2] btrfs: revalidate cached tree blocks on the uptodate path ZhengYuan Huang
@ 2026-03-15 21:17   ` Qu Wenruo
  0 siblings, 0 replies; 5+ messages in thread
From: Qu Wenruo @ 2026-03-15 21:17 UTC (permalink / raw)
  To: ZhengYuan Huang, dsterba, clm
  Cc: wqu, linux-btrfs, linux-kernel, baijiaju1990, r33s3n6, zzzccc427



在 2026/3/13 19:49, ZhengYuan Huang 写道:
> read_extent_buffer_pages_nowait() returns immediately when an extent
> buffer is already marked EXTENT_BUFFER_UPTODATE. On that cache-hit path,
> the caller supplied btrfs_tree_parent_check is not re-run.
> 
> This can let read_tree_root_path() accept a cached tree block whose
> actual header level does not match the expected level derived from the
> root item. In particular, if root_item.level is corrupted while the
> actual root block was already cached and validated earlier with a
> different expected level, the later read hits the cached uptodate path,
> skips re-validation, and builds an inconsistent btrfs_root.
> 
> That inconsistent root can later lead to a null-ptr-deref in
> handle_indirect_tree_backref(), because backref walking uses
> root->root_item.level while btrfs_search_slot() fills path->nodes[]
> according to the cached commit_root's actual level.
> 
> Fix this by re-validating cached extent buffers against the supplied
> btrfs_tree_parent_check on the EXTENT_BUFFER_UPTODATE path, and make
> read_tree_root_path() pass its check to btrfs_buffer_uptodate().
> 
> This makes cache hits and fresh reads follow the same tree-parent
> verification rules, and turns the corruption into a read failure instead
> of constructing an inconsistent root object.
> 
> Signed-off-by: ZhengYuan Huang <gality369@gmail.com>
> ---
>   fs/btrfs/disk-io.c   |  6 ++++--
>   fs/btrfs/extent_io.c | 12 +++++++++++-
>   2 files changed, 15 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 8773f1f7ea46..9a8c06c0adc2 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -1054,8 +1054,10 @@ static struct btrfs_root *read_tree_root_path(struct btrfs_root *tree_root,
>   		root->node = NULL;
>   		goto fail;
>   	}
> -	if (unlikely(!btrfs_buffer_uptodate(root->node, generation, false, NULL))) {
> -		ret = -EIO;
> +	ret = btrfs_buffer_uptodate(root->node, generation, false, &check);
> +	if (unlikely(ret <= 0)) {
> +		if (ret == 0)
> +			ret = -EIO;
>   		goto fail;
>   	}
>   
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index 93eed1d3716c..1324449e892d 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -3828,8 +3828,13 @@ int read_extent_buffer_pages_nowait(struct extent_buffer *eb, int mirror_num,
>   {
>   	struct btrfs_bio *bbio;
>   
> -	if (test_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags))
> +	if (test_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags)) {

This has a conflict with the latest for-next branch.

It has already been replaced with extent_buffer_uptodate() helper by 
commit "btrfs: use the helper extent_buffer_uptodate() everywhere", 
which is introduced over one month ago.

I have solved the conflicts this time, but please always base your 
patches on the latest for-next branch:

  https://github.com/btrfs/linux/tree/for-next

> +		int ret = btrfs_buffer_uptodate(eb, 0, true, check);
> +
> +		if (unlikely(ret < 0))
> +			return ret;

You didn't check (ret == 0) case, where it's transid mismatch.

>   		return 0;
> +	}
>   
>   	/*
>   	 * We could have had EXTENT_BUFFER_UPTODATE cleared by the write
> @@ -3850,7 +3855,12 @@ int read_extent_buffer_pages_nowait(struct extent_buffer *eb, int mirror_num,
>   	 * will now be set, and we shouldn't read it in again.
>   	 */
>   	if (unlikely(test_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags))) {
> +		int ret;
> +
>   		clear_extent_buffer_reading(eb);
> +		ret = btrfs_buffer_uptodate(eb, 0, true, check);
> +		if (unlikely(ret < 0))
> +			return ret;

The same, I have fixed both call sites during merge.

Thanks,
Qu

>   		return 0;
>   	}
>   


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2026-03-15 21:18 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-13  9:19 [PATCH v2 0/2] btrfs: verify cached extent buffers against tree parent checks ZhengYuan Huang
2026-03-13  9:19 ` [PATCH v2 1/2] btrfs: add tree parent check to btrfs_buffer_uptodate() ZhengYuan Huang
2026-03-13  9:19 ` [PATCH v2 2/2] btrfs: revalidate cached tree blocks on the uptodate path ZhengYuan Huang
2026-03-15 21:17   ` Qu Wenruo
2026-03-13 23:49 ` [PATCH v2 0/2] btrfs: verify cached extent buffers against tree parent checks Qu Wenruo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox