From: ZhengYuan Huang <gality369@gmail.com>
To: dsterba@suse.com, clm@fb.com
Cc: linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org,
	baijiaju1990@gmail.com, r33s3n6@gmail.com, zzzccc427@gmail.com,
	ZhengYuan Huang <gality369@gmail.com>,
	stable@vger.kernel.org
Subject: [PATCH] btrfs: reject root with mismatched level between root_item and node header
Date: Thu, 12 Mar 2026 18:22:29 +0800
Message-ID: <20260312102229.220570-1-gality369@gmail.com>

[BUG]
A KASAN null-ptr-deref is triggered when running balance on a filesystem
with a corrupted root item:

  KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077]
  CPU: 1 UID: 0 PID: 347 ... Tainted: G OE  6.18.0+ #17 PREEMPT(voluntary)
  Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
  Hardware name: QEMU Ubuntu 24.04 BIOS 1.16.3-debian-1.16.3-2 04/01/2014
  RIP: 0010:get_eb_folio_index fs/btrfs/extent_io.h:180 [inline]
  RIP: 0010:btrfs_get_64+0x91/0x590 fs/btrfs/accessors.c:117
  Code: 400400f3 f3f36548 8b056324 31054889
  Call Trace:
   btrfs_key_blockptr fs/btrfs/accessors.h:368 [inline]
   btrfs_node_blockptr fs/btrfs/accessors.h:380 [inline]
   handle_indirect_tree_backref fs/btrfs/backref.c:3324 [inline]
   btrfs_backref_add_tree_node+0x7a5/0x26a0 fs/btrfs/backref.c:3538
   build_backref_tree+0x11c/0xb00 fs/btrfs/relocation.c:437
   relocate_tree_blocks+0x583/0x1a30 fs/btrfs/relocation.c:2649
   relocate_block_group+0x521/0xf60 fs/btrfs/relocation.c:3584
   btrfs_relocate_block_group+0x4d8/0xde0 fs/btrfs/relocation.c:3984
   btrfs_relocate_chunk+0x133/0x620 fs/btrfs/volumes.c:3451
   __btrfs_balance fs/btrfs/volumes.c:4227 [inline]
   btrfs_balance+0x1e8b/0x42b0 fs/btrfs/volumes.c:4604
   btrfs_ioctl_balance fs/btrfs/ioctl.c:3577 [inline]
   btrfs_ioctl+0x25cf/0x5b90 fs/btrfs/ioctl.c:5313
   ...
  RIP: 0033:0x7bbaa73a75ad
  Code: ffc3662e 0f1f8400 00000000 90f30f1e fa4889f8

The bug is reproducible on 7.0.0-rc2-next-20260311 with our dynamic
metadata fuzzing tool that corrupts btrfs metadata at runtime.

[CAUSE]
The corruption is confined to a single field in a root tree leaf:
the btrfs_root_item for the affected tree has its .level field set to 1,
while the actual root block on disk has header.level = 0. The root block
itself is completely intact; only the field value stored inside the root
tree leaf is wrong. The existing tree-checker validation in
check_root_item() accepts this because it only verifies that
root_item.level < BTRFS_MAX_LEVEL, and does not cross-check the value
against the root block's own header.
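
To illustrate the gap, here is a minimal userspace sketch, not kernel
code. The model_* types and MODEL_MAX_LEVEL are hypothetical stand-ins for
btrfs_root_item, the tree block header and BTRFS_MAX_LEVEL. It only shows
that a range check accepts level = 1 while a cross-check against the
block header would reject it:

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_LEVEL 8	/* mirrors BTRFS_MAX_LEVEL */

/* Hypothetical stand-ins: only the level field matters here. */
struct model_root_item { uint8_t level; };
struct model_header { uint8_t level; };

/* Range-only check, in the spirit of the level test in check_root_item(). */
static bool range_check_ok(const struct model_root_item *ri)
{
	return ri->level < MODEL_MAX_LEVEL;
}

/* Cross-block check: needs both the root item and the root block header. */
static bool cross_check_ok(const struct model_root_item *ri,
			   const struct model_header *hdr)
{
	return ri->level == hdr->level;
}
```

With root_item.level = 1 and header.level = 0, range_check_ok() passes
and cross_check_ok() fails, which is the case described above.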

The inconsistency becomes dangerous when balance calls
relocate_tree_blocks() to move a level-0 block belonging to that tree.
relocate_tree_blocks() has two sequential phases that together set the
trap:

Phase 1 -- get_tree_block_key(): reads the root block to retrieve its first
key before building the backref tree. The check level passed to
read_tree_block() here comes from the EXTENT_ITEM in the extent tree, which
correctly records level 0. The disk I/O completes,
btrfs_validate_extent_buffer() sees found_level(0) == check->level(0), and
marks the extent_buffer EXTENT_BUFFER_UPTODATE.

Phase 2 -- build_backref_tree() calls handle_indirect_tree_backref(), which
calls btrfs_get_fs_root() to open the affected tree. Inside
read_tree_root_path(), level is set from btrfs_root_level(&root->root_item),
yielding the corrupted value 1. read_tree_block() is then called with
check.level = 1 for the same bytenr. Because EXTENT_BUFFER_UPTODATE is
already set from Phase 1, read_extent_buffer_pages_nowait() returns
immediately via the cache fast path, skipping
btrfs_validate_extent_buffer() entirely. read_tree_root_path() has no
cross-check between btrfs_header_level(root->node) and the level read from
root_item, so it silently builds a btrfs_root whose root_item.level is 1
but whose commit_root has btrfs_header_level() == 0, and installs it in
the fs_roots radix tree.
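
The interaction between the two phases can be modelled in a few lines of
userspace C. This is a simplified sketch under stated assumptions:
struct model_eb and model_read_tree_block() are hypothetical, and the
uptodate flag stands in for EXTENT_BUFFER_UPTODATE. The real read path is
considerably more involved.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a cached tree block. */
struct model_eb {
	uint8_t header_level;	/* level stored in the block's own header */
	bool uptodate;		/* stands in for EXTENT_BUFFER_UPTODATE */
};

/*
 * Returns 0 on success, -1 when validation rejects the block.
 * The level check only runs on the first (uncached) read.
 */
static int model_read_tree_block(struct model_eb *eb, uint8_t check_level)
{
	if (eb->uptodate)
		return 0;	/* cache fast path: validation is skipped */

	if (eb->header_level != check_level)
		return -1;	/* what the validation step would catch */

	eb->uptodate = true;
	return 0;
}
```

In this model, a first read with check_level = 0 succeeds and marks the
buffer up to date; a second read of the same buffer with check_level = 1
also succeeds, because the level comparison is never reached.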

Back in handle_indirect_tree_backref(), btrfs_root_level(&root->root_item)
returns 1, which does not equal cur->level(0), so the tree-root early-exit
is skipped and path->lowest_level is set to 1.
btrfs_search_slot_get_root() starts at commit_root (level 0), records it in
p->nodes[0], and returns immediately because it is already a leaf --
p->nodes[1] is never assigned and keeps its zero-initialized NULL value.
handle_indirect_tree_backref() then loads eb = path->nodes[1], which is
NULL, and passes it directly to btrfs_node_blockptr(). That calls
btrfs_get_64() and then get_eb_folio_index(), where eb->folio_shift is
dereferenced through the NULL pointer, causing the crash.

Note that the subsequent for() loop in handle_indirect_tree_backref()
already checks for a NULL path->nodes[level] correctly; the initial
blockptr comparison just above it was never given the same guard.
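
As a rough userspace illustration of why path->nodes[1] stays NULL and
what a guarded lookup looks like, consider the sketch below. All model_*
names are hypothetical, and model_search() only records the root block; it
does not attempt to reproduce btrfs_search_slot().

```c
#include <stddef.h>

#define MODEL_MAX_LEVEL 8	/* mirrors BTRFS_MAX_LEVEL */

/* Hypothetical stand-ins for an extent buffer and a btrfs_path. */
struct model_node { int level; };
struct model_path {
	struct model_node *nodes[MODEL_MAX_LEVEL];
	int lowest_level;
};

/*
 * Simplified search: record the root at its real level and stop.
 * A leaf root has no children, so every slot above level 0 keeps
 * whatever value the path was initialized with (NULL here).
 */
static void model_search(struct model_path *p, struct model_node *root)
{
	if (root->level < 0 || root->level >= MODEL_MAX_LEVEL)
		return;
	p->nodes[root->level] = root;
}

/* Guarded lookup: callers must test the result before dereferencing it. */
static struct model_node *model_node_at(const struct model_path *p, int level)
{
	if (level < 0 || level >= MODEL_MAX_LEVEL)
		return NULL;
	return p->nodes[level];
}
```

With lowest_level = 1 and a level-0 root, model_node_at(p, 1) returns
NULL, which is the value the unguarded blockptr comparison dereferences.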

[FIX]
Catch the inconsistency in read_tree_root_path(), right after read_tree_block()
returns root->node and the generation and owner checks have passed. At that
point level = btrfs_root_level(&root->root_item) is already known, so
comparing it against btrfs_header_level(root->node) costs nothing. If they
differ, emit a btrfs_crit() message and return -EUCLEAN to prevent the
inconsistent btrfs_root object from being installed in the radix-tree cache
and reaching any caller. read_tree_root_path() is the only place that sees
both root_item.level and the actual root node simultaneously, making it the
correct and minimal location for this cross-block consistency check.
Returning -EUCLEAN is consistent with the existing owner-mismatch check
directly above and with the general btrfs policy of converting detectable
corruption into -EUCLEAN rather than crashing later.

After the fix, btrfs detects the level mismatch at root load time and
fails with -EUCLEAN instead of crashing later in
handle_indirect_tree_backref().
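
The intended behaviour can be summarised with a small userspace model.
This is a sketch, not the patch itself: struct model_root and
model_load_root() are hypothetical, and the EUCLEAN fallback definition is
only there so the example builds on systems whose errno.h lacks it.

```c
#include <errno.h>
#include <stdint.h>

#ifndef EUCLEAN
#define EUCLEAN 117	/* Linux value; fallback for other platforms */
#endif

/* Hypothetical model of the state seen while loading a root. */
struct model_root {
	uint8_t item_level;	/* models root_item.level */
	uint8_t node_level;	/* models btrfs_header_level(root->node) */
	int installed;		/* 1 once the root would be cached */
};

/*
 * Reject the root before it becomes visible to any caller if the two
 * level values disagree; otherwise mark it as installed.
 */
static int model_load_root(struct model_root *root)
{
	if (root->node_level != root->item_level)
		return -EUCLEAN;

	root->installed = 1;
	return 0;
}
```

A root with item_level = 1 and node_level = 0 is rejected with -EUCLEAN
and never marked installed, so no later code path can observe it.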

Cc: stable@vger.kernel.org
Signed-off-by: ZhengYuan Huang <gality369@gmail.com>
---
 fs/btrfs/disk-io.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 900e462d8ea1..06a8689cbf62 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1067,6 +1067,26 @@ static struct btrfs_root *read_tree_root_path(struct btrfs_root *tree_root,
 		ret = -EUCLEAN;
 		goto fail;
 	}
+	/*
+	 * Verify that the root node's on-disk level matches root_item.level.
+	 * These can diverge when the root item in the root tree was corrupted
+	 * (e.g. a bit flip changing level) while the actual tree block is
+	 * already cached in memory at its real level. In that case
+	 * read_tree_block() returns the cached buffer without re-running
+	 * btrfs_validate_extent_buffer(), silently bypassing the level check.
+	 * The mismatch would later cause a null-ptr-deref in backref walking
+	 * (handle_indirect_tree_backref) when the commit root's real height is
+	 * lower than what root_item.level claims.
+	 */
+	if (unlikely(btrfs_header_level(root->node) != level)) {
+		btrfs_crit(fs_info,
+			   "root=%llu block=%llu, root item level mismatch: "
+			   "root_item.level=%d block.level=%u",
+			   btrfs_root_id(root), root->node->start,
+			   level, btrfs_header_level(root->node));
+		ret = -EUCLEAN;
+		goto fail;
+	}
 	root->commit_root = btrfs_root_node(root);
 	return root;
 fail:
-- 
2.43.0

