From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: ZhengYuan Huang <gality369@gmail.com>,
dsterba@suse.com, clm@fb.com, wqu@suse.com
Cc: linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org,
baijiaju1990@gmail.com, r33s3n6@gmail.com, zzzccc427@gmail.com,
stable@vger.kernel.org
Subject: Re: [PATCH] btrfs: reject root items with drop_progress and zero drop_level
Date: Thu, 12 Mar 2026 07:38:29 +1030 [thread overview]
Message-ID: <849ac4be-b10d-4eb7-892f-4b9ee2ef5cb2@gmx.com> (raw)
In-Reply-To: <20260311111632.2836293-1-gality369@gmail.com>
在 2026/3/11 21:46, ZhengYuan Huang 写道:
> [BUG]
> When recovering relocation at mount time, merge_reloc_root() and
> btrfs_drop_snapshot() both use BUG_ON(level == 0) to guard against
> an impossible state: a non-zero drop_progress combined with a zero
> drop_level in a root_item, which can be triggered:
>
> ------------[ cut here ]------------
> kernel BUG at fs/btrfs/relocation.c:1545!
> Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI
> CPU: 1 UID: 0 PID: 283 ... Tainted: 6.18.0+ #16 PREEMPT(voluntary)
> Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
> Hardware name: QEMU Ubuntu 24.04 PC v2, BIOS 1.16.3-debian-1.16.3-2
> RIP: 0010:merge_reloc_root+0x1266/0x1650 fs/btrfs/relocation.c:1545
> Code: ffff0000 00004589 d7e9acfa ffffe8a1 79bafebe 02000000
> Call Trace:
> merge_reloc_roots+0x295/0x890 fs/btrfs/relocation.c:1861
> btrfs_recover_relocation+0xd6e/0x11d0 fs/btrfs/relocation.c:4195
> btrfs_start_pre_rw_mount+0xa4d/0x1810 fs/btrfs/disk-io.c:3130
> open_ctree+0x5824/0x5fe0 fs/btrfs/disk-io.c:3640
> btrfs_fill_super fs/btrfs/super.c:987 [inline]
> btrfs_get_tree_super fs/btrfs/super.c:1951 [inline]
> btrfs_get_tree_subvol fs/btrfs/super.c:2094 [inline]
> btrfs_get_tree+0x111c/0x2190 fs/btrfs/super.c:2128
> vfs_get_tree+0x9a/0x370 fs/super.c:1758
> fc_mount fs/namespace.c:1199 [inline]
> do_new_mount_fc fs/namespace.c:3642 [inline]
> do_new_mount fs/namespace.c:3718 [inline]
> path_mount+0x5b8/0x1ea0 fs/namespace.c:4028
> do_mount fs/namespace.c:4041 [inline]
> __do_sys_mount fs/namespace.c:4229 [inline]
> __se_sys_mount fs/namespace.c:4206 [inline]
> __x64_sys_mount+0x282/0x320 fs/namespace.c:4206
> ...
> RIP: 0033:0x7f969c9a8fde
> Code: 0f1f4000 48c7c2b0 fffffff7 d8648902 b8ffffff ffc3660f
> ---[ end trace 0000000000000000 ]---
>
> [CAUSE]
> A non-zero drop_progress.objectid means an interrupted
> btrfs_drop_snapshot() left a resume point on disk, and in that case
> drop_level must be greater than 0 because the checkpoint is only
> saved at internal node levels.
>
> Although this invariant is enforced when the kernel writes the root
> item, it is not validated when the root item is read back from disk.
> That allows on-disk corruption to provide an invalid state with
> drop_progress.objectid != 0 and drop_level == 0.
>
> When relocation recovery later processes such a root item,
> merge_reloc_root() reads drop_level and hits BUG_ON(level == 0). The
> same invalid metadata can also trigger the corresponding BUG_ON() in
> btrfs_drop_snapshot().
>
> [FIX]
> Fix this by validating the root_item invariant in tree-checker when
> reading root items from disk: if drop_progress.objectid is non-zero,
> drop_level must also be non-zero. Reject such malformed metadata with
> -EUCLEAN before it reaches merge_reloc_root() or btrfs_drop_snapshot()
> and triggers the BUG_ON.
>
> Also fix the related tree-checker error message to report
> "invalid root drop_level" instead of the misleading "invalid root level".
>
> The bug is reproducible on 7.0.0-rc2-next-20260310 with our dynamic
> metadata fuzzing tool that corrupts btrfs metadata at runtime. After
> the fix, the same corruption is correctly rejected by tree-checker
> and the BUG_ON is no longer triggered.
>
> Fixes: 259ee7754b67 ("btrfs: tree-checker: Add ROOT_ITEM check")
The only "fix" part I can see is the fix of the message from drop_level.
If you really want to do that, please send out a fix dedicated for that
single line.
Otherwise you're adding a new check. Please do not mix fix and new check
into one patch.
Thanks,
Qu
> Cc: stable@vger.kernel.org # 5.3+
> Signed-off-by: ZhengYuan Huang <gality369@gmail.com>
> ---
> Reproduction (v6.18, x86_64, KASAN)
> ===================================
> The PoC is relatively large, so it is provided separately through google drive:
> https://drive.google.com/drive/folders/1Rto3DUtjUTOg3bjFeH5G2Kl8Li6OuzsG
>
> To reproduce the issue:
> 1. Build the ublk helper program from the ublk codebase, which is
> used to provide the runtime corruption capability:
> g++ -std=c++20 -fcoroutines -O2 -o standalone_replay \
> standalone_replay_btrfs.cpp targets/ublksrv_tgt.cpp \
> -I. -Iinclude -Itargets/include \
> -L./lib/.libs -lublksrv -luring -lpthread
> 2. Attach the crafted image through ublk:
> ./standalone_replay add -t loop -f /path/to/image
> 3. Mount the image:
> mount -o loop /path/to/image /mnt
> This reliably reproduces the bug.
> ---
> fs/btrfs/tree-checker.c | 19 ++++++++++++++++++-
> 1 file changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
> index dd274f67ad7f..a8c568b10432 100644
> --- a/fs/btrfs/tree-checker.c
> +++ b/fs/btrfs/tree-checker.c
> @@ -1256,10 +1256,27 @@ static int check_root_item(struct extent_buffer *leaf, struct btrfs_key *key,
> }
> if (unlikely(btrfs_root_drop_level(&ri) >= BTRFS_MAX_LEVEL)) {
> generic_err(leaf, slot,
> - "invalid root level, have %u expect [0, %u]",
> + "invalid root drop_level, have %u expect [0, %u]",
> btrfs_root_drop_level(&ri), BTRFS_MAX_LEVEL - 1);
> return -EUCLEAN;
> }
> + /*
> + * If drop_progress.objectid is non-zero, a btrfs_drop_snapshot() was
> + * interrupted and the resume point was recorded in drop_progress and
> + * drop_level. In that case drop_level must be >= 1: level 0 is the
> + * leaf level and drop_snapshot never saves a checkpoint there (it
> + * only records checkpoints at internal node levels in DROP_REFERENCE
> + * stage). A zero drop_level combined with a non-zero drop_progress
> + * objectid indicates on-disk corruption and would cause a BUG_ON in
> + * merge_reloc_root() and btrfs_drop_snapshot() at mount time.
> + */
> + if (unlikely(btrfs_disk_key_objectid(&ri.drop_progress) != 0 &&
> + btrfs_root_drop_level(&ri) == 0)) {
> + generic_err(leaf, slot,
> + "invalid root drop_level 0 with non-zero drop_progress objectid %llu",
> + btrfs_disk_key_objectid(&ri.drop_progress));
> + return -EUCLEAN;
> + }
>
> /* Flags check */
> if (unlikely(btrfs_root_flags(&ri) & ~valid_root_flags)) {
next prev parent reply other threads:[~2026-03-11 21:08 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-11 11:16 [PATCH] btrfs: reject root items with drop_progress and zero drop_level ZhengYuan Huang
2026-03-11 21:08 ` Qu Wenruo [this message]
2026-03-12 0:02 ` ZhengYuan Huang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=849ac4be-b10d-4eb7-892f-4b9ee2ef5cb2@gmx.com \
--to=quwenruo.btrfs@gmx.com \
--cc=baijiaju1990@gmail.com \
--cc=clm@fb.com \
--cc=dsterba@suse.com \
--cc=gality369@gmail.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=r33s3n6@gmail.com \
--cc=stable@vger.kernel.org \
--cc=wqu@suse.com \
--cc=zzzccc427@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox