public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: ZhengYuan Huang <gality369@gmail.com>,
	dsterba@suse.com, clm@fb.com, wqu@suse.com
Cc: linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org,
	baijiaju1990@gmail.com, r33s3n6@gmail.com, zzzccc427@gmail.com,
	stable@vger.kernel.org
Subject: Re: [PATCH] btrfs: reject root items with drop_progress and zero drop_level
Date: Thu, 12 Mar 2026 07:38:29 +1030	[thread overview]
Message-ID: <849ac4be-b10d-4eb7-892f-4b9ee2ef5cb2@gmx.com> (raw)
In-Reply-To: <20260311111632.2836293-1-gality369@gmail.com>



在 2026/3/11 21:46, ZhengYuan Huang 写道:
> [BUG]
> When recovering relocation at mount time, merge_reloc_root() and
> btrfs_drop_snapshot() both use BUG_ON(level == 0) to guard against
> an impossible state: a non-zero drop_progress combined with a zero
> drop_level in a root_item, which can be triggered:
> 
> ------------[ cut here ]------------
> kernel BUG at fs/btrfs/relocation.c:1545!
> Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI
> CPU: 1 UID: 0 PID: 283 ... Tainted: 6.18.0+ #16 PREEMPT(voluntary)
> Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
> Hardware name: QEMU Ubuntu 24.04 PC v2, BIOS 1.16.3-debian-1.16.3-2
> RIP: 0010:merge_reloc_root+0x1266/0x1650 fs/btrfs/relocation.c:1545
> Code: ffff0000 00004589 d7e9acfa ffffe8a1 79bafebe 02000000
> Call Trace:
>   merge_reloc_roots+0x295/0x890 fs/btrfs/relocation.c:1861
>   btrfs_recover_relocation+0xd6e/0x11d0 fs/btrfs/relocation.c:4195
>   btrfs_start_pre_rw_mount+0xa4d/0x1810 fs/btrfs/disk-io.c:3130
>   open_ctree+0x5824/0x5fe0 fs/btrfs/disk-io.c:3640
>   btrfs_fill_super fs/btrfs/super.c:987 [inline]
>   btrfs_get_tree_super fs/btrfs/super.c:1951 [inline]
>   btrfs_get_tree_subvol fs/btrfs/super.c:2094 [inline]
>   btrfs_get_tree+0x111c/0x2190 fs/btrfs/super.c:2128
>   vfs_get_tree+0x9a/0x370 fs/super.c:1758
>   fc_mount fs/namespace.c:1199 [inline]
>   do_new_mount_fc fs/namespace.c:3642 [inline]
>   do_new_mount fs/namespace.c:3718 [inline]
>   path_mount+0x5b8/0x1ea0 fs/namespace.c:4028
>   do_mount fs/namespace.c:4041 [inline]
>   __do_sys_mount fs/namespace.c:4229 [inline]
>   __se_sys_mount fs/namespace.c:4206 [inline]
>   __x64_sys_mount+0x282/0x320 fs/namespace.c:4206
>   ...
> RIP: 0033:0x7f969c9a8fde
> Code: 0f1f4000 48c7c2b0 fffffff7 d8648902 b8ffffff ffc3660f
> ---[ end trace 0000000000000000 ]---
> 
> [CAUSE]
> A non-zero drop_progress.objectid means an interrupted
> btrfs_drop_snapshot() left a resume point on disk, and in that case
> drop_level must be greater than 0 because the checkpoint is only
> saved at internal node levels.
> 
> Although this invariant is enforced when the kernel writes the root
> item, it is not validated when the root item is read back from disk.
> That allows on-disk corruption to provide an invalid state with
> drop_progress.objectid != 0 and drop_level == 0.
> 
> When relocation recovery later processes such a root item,
> merge_reloc_root() reads drop_level and hits BUG_ON(level == 0). The
> same invalid metadata can also trigger the corresponding BUG_ON() in
> btrfs_drop_snapshot().
> 
> [FIX]
> Fix this by validating the root_item invariant in tree-checker when
> reading root items from disk: if drop_progress.objectid is non-zero,
> drop_level must also be non-zero. Reject such malformed metadata with
> -EUCLEAN before it reaches merge_reloc_root() or btrfs_drop_snapshot()
> and triggers the BUG_ON.
> 
> Also fix the related tree-checker error message to report
> "invalid root drop_level" instead of the misleading "invalid root level".
> 
> The bug is reproducible on 7.0.0-rc2-next-20260310 with our dynamic
> metadata fuzzing tool that corrupts btrfs metadata at runtime. After
> the fix, the same corruption is correctly rejected by tree-checker
> and the BUG_ON is no longer triggered.
> 
> Fixes: 259ee7754b67 ("btrfs: tree-checker: Add ROOT_ITEM check")

The only "fix" part I can see is the fix of the message from drop_level.

If you really want to do that, please send out a fix dedicated for that 
single line.

Otherwise you're adding a new check. Please do not mix fix and new check 
into one patch.

Thanks,
Qu

> Cc: stable@vger.kernel.org # 5.3+
> Signed-off-by: ZhengYuan Huang <gality369@gmail.com>
> ---
> Reproduction (v6.18, x86_64, KASAN)
> ===================================
> The PoC is relatively large, so it is provided separately through google drive:
> https://drive.google.com/drive/folders/1Rto3DUtjUTOg3bjFeH5G2Kl8Li6OuzsG
> 
> To reproduce the issue:
>    1. Build the ublk helper program from the ublk codebase, which is
> 	 used to provide the runtime corruption capability:
> 	  g++ -std=c++20 -fcoroutines -O2 -o standalone_replay \
>        standalone_replay_btrfs.cpp targets/ublksrv_tgt.cpp \
>        -I. -Iinclude -Itargets/include \
>        -L./lib/.libs -lublksrv -luring -lpthread
>    2. Attach the crafted image through ublk:
>        ./standalone_replay add -t loop -f /path/to/image
>    3. Mount the image:
> 	  mount -o loop /path/to/image /mnt
> This reliably reproduces the bug.
> ---
>   fs/btrfs/tree-checker.c | 19 ++++++++++++++++++-
>   1 file changed, 18 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
> index dd274f67ad7f..a8c568b10432 100644
> --- a/fs/btrfs/tree-checker.c
> +++ b/fs/btrfs/tree-checker.c
> @@ -1256,10 +1256,27 @@ static int check_root_item(struct extent_buffer *leaf, struct btrfs_key *key,
>   	}
>   	if (unlikely(btrfs_root_drop_level(&ri) >= BTRFS_MAX_LEVEL)) {
>   		generic_err(leaf, slot,
> -			    "invalid root level, have %u expect [0, %u]",
> +			    "invalid root drop_level, have %u expect [0, %u]",
>   			    btrfs_root_drop_level(&ri), BTRFS_MAX_LEVEL - 1);
>   		return -EUCLEAN;
>   	}
> +	/*
> +	 * If drop_progress.objectid is non-zero, a btrfs_drop_snapshot() was
> +	 * interrupted and the resume point was recorded in drop_progress and
> +	 * drop_level.  In that case drop_level must be >= 1: level 0 is the
> +	 * leaf level and drop_snapshot never saves a checkpoint there (it
> +	 * only records checkpoints at internal node levels in DROP_REFERENCE
> +	 * stage).  A zero drop_level combined with a non-zero drop_progress
> +	 * objectid indicates on-disk corruption and would cause a BUG_ON in
> +	 * merge_reloc_root() and btrfs_drop_snapshot() at mount time.
> +	 */
> +	if (unlikely(btrfs_disk_key_objectid(&ri.drop_progress) != 0 &&
> +		     btrfs_root_drop_level(&ri) == 0)) {
> +		generic_err(leaf, slot,
> +			    "invalid root drop_level 0 with non-zero drop_progress objectid %llu",
> +			    btrfs_disk_key_objectid(&ri.drop_progress));
> +		return -EUCLEAN;
> +	}
>   
>   	/* Flags check */
>   	if (unlikely(btrfs_root_flags(&ri) & ~valid_root_flags)) {


  reply	other threads:[~2026-03-11 21:08 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-11 11:16 [PATCH] btrfs: reject root items with drop_progress and zero drop_level ZhengYuan Huang
2026-03-11 21:08 ` Qu Wenruo [this message]
2026-03-12  0:02   ` ZhengYuan Huang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=849ac4be-b10d-4eb7-892f-4b9ee2ef5cb2@gmx.com \
    --to=quwenruo.btrfs@gmx.com \
    --cc=baijiaju1990@gmail.com \
    --cc=clm@fb.com \
    --cc=dsterba@suse.com \
    --cc=gality369@gmail.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=r33s3n6@gmail.com \
    --cc=stable@vger.kernel.org \
    --cc=wqu@suse.com \
    --cc=zzzccc427@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox