From: ZhengYuan Huang <gality369@gmail.com>
To: dsterba@suse.com, clm@fb.com
Cc: linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org,
baijiaju1990@gmail.com, r33s3n6@gmail.com, zzzccc427@gmail.com,
ZhengYuan Huang <gality369@gmail.com>,
stable@vger.kernel.org
Subject: [PATCH] btrfs: reject root with mismatched level between root_item and node header
Date: Thu, 12 Mar 2026 18:22:29 +0800
Message-ID: <20260312102229.220570-1-gality369@gmail.com>

[BUG]
A KASAN null-ptr-deref is triggered when running balance on a filesystem
with a corrupted root item:

  KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077]
  CPU: 1 UID: 0 PID: 347 ... Tainted: G OE 6.18.0+ #17 PREEMPT(voluntary)
  Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
  Hardware name: QEMU Ubuntu 24.04 BIOS 1.16.3-debian-1.16.3-2 04/01/2014
  RIP: 0010:get_eb_folio_index fs/btrfs/extent_io.h:180 [inline]
  RIP: 0010:btrfs_get_64+0x91/0x590 fs/btrfs/accessors.c:117
  Code: 400400f3 f3f36548 8b056324 31054889
  Call Trace:
   btrfs_key_blockptr fs/btrfs/accessors.h:368 [inline]
   btrfs_node_blockptr fs/btrfs/accessors.h:380 [inline]
   handle_indirect_tree_backref fs/btrfs/backref.c:3324 [inline]
   btrfs_backref_add_tree_node+0x7a5/0x26a0 fs/btrfs/backref.c:3538
   build_backref_tree+0x11c/0xb00 fs/btrfs/relocation.c:437
   relocate_tree_blocks+0x583/0x1a30 fs/btrfs/relocation.c:2649
   relocate_block_group+0x521/0xf60 fs/btrfs/relocation.c:3584
   btrfs_relocate_block_group+0x4d8/0xde0 fs/btrfs/relocation.c:3984
   btrfs_relocate_chunk+0x133/0x620 fs/btrfs/volumes.c:3451
   __btrfs_balance fs/btrfs/volumes.c:4227 [inline]
   btrfs_balance+0x1e8b/0x42b0 fs/btrfs/volumes.c:4604
   btrfs_ioctl_balance fs/btrfs/ioctl.c:3577 [inline]
   btrfs_ioctl+0x25cf/0x5b90 fs/btrfs/ioctl.c:5313
   ...
  RIP: 0033:0x7bbaa73a75ad
  Code: ffc3662e 0f1f8400 00000000 90f30f1e fa4889f8

The bug is reproducible on 7.0.0-rc2-next-20260311 with our dynamic
metadata fuzzing tool that corrupts btrfs metadata at runtime.

[CAUSE]
The corruption is confined to a single field in a root tree leaf: the
btrfs_root_item for the affected tree has its .level field set to 1,
while the actual root block on disk has header.level = 0. The root block
itself is intact; only the value stored in the root tree leaf is wrong.
The existing tree-checker validation in check_root_item() accepts this
because it only verifies that root_item.level < BTRFS_MAX_LEVEL and does
not cross-check the value against the root block's own header.
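
As a rough userspace illustration of the gap described above (not kernel
code), the sketch below contrasts a range-only check with a cross-check
against the block header. The model_* types and both helper functions are
hypothetical stand-ins invented for this example; only the idea of
"level < BTRFS_MAX_LEVEL" comes from the description above.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_LEVEL 8 /* assumed to mirror BTRFS_MAX_LEVEL */

/* Hypothetical, heavily simplified stand-ins for the on-disk structures. */
struct model_root_item {
	uint8_t level;	/* level recorded in the root tree leaf */
};

struct model_header {
	uint8_t level;	/* level recorded in the root block's own header */
};

/* Range-only check, similar in spirit to the one in check_root_item(). */
static bool model_level_in_range(const struct model_root_item *ri)
{
	return ri->level < MODEL_MAX_LEVEL;
}

/* Cross-block check that needs both the root item and the root block. */
static bool model_level_matches_header(const struct model_root_item *ri,
				       const struct model_header *hdr)
{
	return ri->level == hdr->level;
}
```

With root_item.level = 1 and header.level = 0, the range-only check passes
while the cross-check fails, which is the case reported here.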
The inconsistency becomes dangerous when balance calls
relocate_tree_blocks() to move a level-0 block belonging to that tree.
relocate_tree_blocks() has two sequential phases that together set the
trap:
Phase 1 -- get_tree_block_key(): reads the root block to retrieve its first
key before building the backref tree. The check level passed to
read_tree_block() here comes from the EXTENT_ITEM in the extent tree, which
correctly records level 0. The disk I/O completes,
btrfs_validate_extent_buffer() sees found_level(0) == check->level(0), and
marks the extent_buffer EXTENT_BUFFER_UPTODATE.
Phase 2 -- build_backref_tree() calls handle_indirect_tree_backref(), which
calls btrfs_get_fs_root() to open the affected tree. Inside
read_tree_root_path(), level is set from btrfs_root_level(&root->root_item),
yielding the corrupted value 1. read_tree_block() is then called with
check.level = 1 for the same bytenr. Because EXTENT_BUFFER_UPTODATE is
already set from Phase 1, read_extent_buffer_pages_nowait() returns
immediately via the cache fast path, skipping
btrfs_validate_extent_buffer() entirely. read_tree_root_path() has no
cross-check between btrfs_header_level(root->node) and the level read from
root_item, so it silently builds a btrfs_root whose root_item.level is 1
but whose commit_root has btrfs_header_level() == 0, and that root is then
installed in the fs_roots radix tree.
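
The interaction between the two phases can be modeled in a few lines of
userspace C. This is only a sketch of the behaviour described above:
struct model_eb and model_read_tree_block() are hypothetical names, the
uptodate flag stands in for EXTENT_BUFFER_UPTODATE, and the return value
-1 stands in for a validation failure.

```c
#include <stdbool.h>

/* Hypothetical model of a cached tree block. */
struct model_eb {
	int header_level;	/* level stored in the block's own header */
	bool uptodate;		/* stands in for EXTENT_BUFFER_UPTODATE */
};

/*
 * Model of a read with a level check. Returns 0 on success and -1 when
 * validation rejects the block. Validation only runs on the first read;
 * once the buffer is marked uptodate, later reads return early and the
 * expected level passed by the caller is never compared.
 */
static int model_read_tree_block(struct model_eb *eb, int check_level)
{
	if (eb->uptodate)
		return 0;	/* cache fast path: no validation */

	if (eb->header_level != check_level)
		return -1;	/* level mismatch caught on the slow path */

	eb->uptodate = true;
	return 0;
}
```

In this model, a first read with check_level = 0 succeeds and marks the
buffer uptodate, after which a second read with check_level = 1 also
succeeds even though the header level is still 0.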
Back in handle_indirect_tree_backref(), btrfs_root_level(&root->root_item)
returns 1, which does not equal cur->level(0), so the tree-root early-exit
is skipped and path->lowest_level is set to 1.
btrfs_search_slot_get_root() starts at commit_root (level 0), records it in
p->nodes[0], and returns immediately because it is already a leaf --
p->nodes[1] is never assigned and retains its kzalloc-zeroed NULL value.
eb = path->nodes[1], which is NULL, is then passed directly to
btrfs_node_blockptr(), which calls btrfs_get_64() and then
get_eb_folio_index(), where eb->folio_shift is dereferenced through the
NULL pointer, causing the crash.
Note that the subsequent for() loop in handle_indirect_tree_backref()
already checks for a NULL path->nodes[level] correctly; the initial
blockptr comparison just above it was never given the same guard.
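
To make the NULL slot concrete, here is a small userspace model of a
search path whose root is already a leaf. It is a sketch under stated
assumptions, not the kernel data structures: struct model_path,
model_search() and model_node_at() are hypothetical, and the real
btrfs_path carries more state than shown.

```c
#include <stddef.h>

#define MODEL_MAX_LEVEL 8 /* assumed to mirror BTRFS_MAX_LEVEL */

/* Hypothetical model of a tree block; only its level matters here. */
struct model_eb {
	int level;
};

/* Hypothetical model of a search path with one slot per tree level. */
struct model_path {
	struct model_eb *nodes[MODEL_MAX_LEVEL];
	int lowest_level;
};

/*
 * Model of a search that starts at the root and stops there because the
 * root is already a leaf: only the slot for the root's real level is
 * filled, every other slot keeps its zero-initialized NULL value.
 */
static void model_search(struct model_path *p, struct model_eb *root)
{
	p->nodes[root->level] = root;
}

/* Guarded lookup: callers must test the result before dereferencing it. */
static struct model_eb *model_node_at(const struct model_path *p, int level)
{
	if (level < 0 || level >= MODEL_MAX_LEVEL)
		return NULL;
	return p->nodes[level];
}
```

With a level-0 root and lowest_level = 1, model_node_at(&path, 1) returns
NULL, which corresponds to the path->nodes[1] value that the unguarded
blockptr comparison dereferences.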
[FIX]
Catch the inconsistency in read_tree_root_path(), right after read_tree_block()
returns root->node and the generation and owner checks have passed. At that
point level = btrfs_root_level(&root->root_item) is already known, so
comparing it against btrfs_header_level(root->node) costs nothing. If they
differ, emit a btrfs_crit() message and return -EUCLEAN to prevent the
inconsistent btrfs_root object from being installed in the radix-tree cache
and reaching any caller. read_tree_root_path() is the only place that sees
both root_item.level and the actual root node simultaneously, making it the
correct and minimal location for this cross-block consistency check.
Returning -EUCLEAN is consistent with the existing owner-mismatch check
directly above and with the general btrfs policy of converting detectable
corruption into -EUCLEAN rather than crashing later.
After the fix, btrfs detects the level mismatch at root load time and
fails with -EUCLEAN instead of crashing later in
handle_indirect_tree_backref().
Cc: stable@vger.kernel.org
Signed-off-by: ZhengYuan Huang <gality369@gmail.com>
---
fs/btrfs/disk-io.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 900e462d8ea1..06a8689cbf62 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1067,6 +1067,26 @@ static struct btrfs_root *read_tree_root_path(struct btrfs_root *tree_root,
 		ret = -EUCLEAN;
 		goto fail;
 	}
+	/*
+	 * Verify that the root node's on-disk level matches root_item.level.
+	 * These can diverge when the root item in the root tree was corrupted
+	 * (e.g. a bit flip changing level) while the actual tree block is
+	 * already cached in memory at its real level. In that case
+	 * read_tree_block() returns the cached buffer without re-running
+	 * btrfs_validate_extent_buffer(), silently bypassing the level check.
+	 * The mismatch would later cause a null-ptr-deref in backref walking
+	 * (handle_indirect_tree_backref) when the commit root's real height is
+	 * lower than what root_item.level claims.
+	 */
+	if (unlikely(btrfs_header_level(root->node) != level)) {
+		btrfs_crit(fs_info,
+			   "root=%llu block=%llu, root item level mismatch: "
+			   "root_item.level=%d block.level=%u",
+			   btrfs_root_id(root), root->node->start,
+			   level, btrfs_header_level(root->node));
+		ret = -EUCLEAN;
+		goto fail;
+	}
 	root->commit_root = btrfs_root_node(root);
 	return root;
 fail:
--
2.43.0