Linux Btrfs filesystem development
From: Qu Wenruo <wqu@suse.com>
To: Teng Liu <27rabbitlt@gmail.com>, linux-btrfs@vger.kernel.org
Cc: dsterba@suse.com, clm@fb.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] btrfs: wait for in-flight readahead BIOs on open_ctree() error
Date: Sun, 29 Mar 2026 17:33:19 +1030
Message-ID: <11c944aa-7745-4720-9f40-af99bf7bb727@suse.com>
In-Reply-To: <20260329063417.642647-1-27rabbitlt@gmail.com>



On 2026/3/29 17:01, Teng Liu wrote:
> When open_ctree() fails during btrfs_read_chunk_tree(), readahead BIOs
> submitted by readahead_tree_node_children() may still be in flight. The
> error path frees fs_info without waiting for these BIOs to complete.
> When a readahead BIO later completes, btrfs_simple_end_io() calls
> btrfs_bio_counter_dec() which accesses the already-freed
> fs_info->dev_replace.bio_counter, causing a use-after-free.
> 
> This can be triggered by connecting a USB drive with a corrupted btrfs
> filesystem (e.g. chunk tree destroyed by a partial format), where the
> slow USB device keeps readahead BIOs in flight long enough for the
> error path to free fs_info before they complete. It can be reproduced
> on qemu with a properly corrupted btrfs img.
> 
>    BTRFS error (device sda): failed to read chunk tree: -2
>    BTRFS error (device sda): open_ctree failed: -2
>    BUG: unable to handle page fault for address: ffff89322ceb3000
>    RIP: 0010:percpu_counter_add_batch+0xe/0xb0
>     btrfs_bio_counter_sub+0x22/0x60
>     btrfs_simple_end_io+0x32/0x90
>     blk_update_request+0x12b/0x480
>     scsi_end_request+0x26/0x1b0
>     scsi_io_completion+0x50/0x790
> 
> Fix this by waiting for the bio_counter to reach zero in the error path
> before stopping workers, so all in-flight BIOs have completed their
> callbacks before fs_info is freed. The bio_counter is already
> initialized in init_mount_fs_info() so this wait is safe for all error
> paths reaching the fail_sb_buffer label.
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=221270
> Reported-by: AHN SEOK-YOUNG
> Signed-off-by: Teng Liu <27rabbitlt@gmail.com>
> ---
>   fs/btrfs/disk-io.c | 12 ++++++++++++
>   1 file changed, 12 insertions(+)
> 
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 01f2dbb69..61e6b8dca 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -3723,6 +3723,18 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
>   	invalidate_inode_pages2(fs_info->btree_inode->i_mapping);
>   
>   fail_sb_buffer:
> +	/*
> +	 * Wait for in-flight readahead BIOs before stopping workers.
> +	 * Readahead BIOs from btrfs_read_chunk_tree() (via
> +	 * readahead_tree_node_children) may still be in flight on slow
> +	 * devices (e.g. USB). Their completion callbacks
> +	 * (btrfs_simple_end_io) access fs_info->dev_replace.bio_counter
> +	 * which would be destroyed later, causing a use-after-free.
> +	 * The bio_counter was already initialized in init_mount_fs_info()
> +	 * so this wait is safe for all error paths reaching this label.
> +	 */
> +	wait_event(fs_info->dev_replace.replace_wait,
> +		   percpu_counter_sum(&fs_info->dev_replace.bio_counter) == 0);

This doesn't make any sense to me.

The wait queue and counter are both for dev-replace, which does not
match your description of generic metadata readahead.

If you want to wait for all in-flight metadata reads, I could not find a
suitable helper, so you will need to go through all extent buffers and
wait for the EXTENT_BUFFER_READING flag to clear on each of them.
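
To illustrate the shape of that approach, here is a minimal userspace
sketch, not btrfs code. It assumes a hypothetical struct model_buffer
with a BUF_READING bit standing in for EXTENT_BUFFER_READING, and uses
pthreads in place of BIO completion and kernel wait primitives. All
names in it are made up for the illustration.

```c
#include <pthread.h>
#include <stddef.h>

#define NR_BUFFERS  4
#define BUF_READING (1u << 0)	/* hypothetical stand-in for EXTENT_BUFFER_READING */

/* Hypothetical model of a buffer with a read in flight; not struct extent_buffer. */
struct model_buffer {
	pthread_mutex_t lock;
	pthread_cond_t read_done;
	unsigned int flags;
};

static struct model_buffer buffers[NR_BUFFERS];

/* Plays the role of the read completion callback: clear the flag, wake waiters. */
static void *model_end_read(void *arg)
{
	struct model_buffer *buf = arg;

	pthread_mutex_lock(&buf->lock);
	buf->flags &= ~BUF_READING;
	pthread_cond_broadcast(&buf->read_done);
	pthread_mutex_unlock(&buf->lock);
	return NULL;
}

/* Block until the READING flag on one buffer has been cleared. */
static void model_wait_read(struct model_buffer *buf)
{
	pthread_mutex_lock(&buf->lock);
	while (buf->flags & BUF_READING)
		pthread_cond_wait(&buf->read_done, &buf->lock);
	pthread_mutex_unlock(&buf->lock);
}

/*
 * Start one "read" per buffer, then walk every buffer and wait for it,
 * as an error path would have to do before freeing shared state.
 * Returns the number of buffers still marked READING, or -1 if a
 * thread could not be created.
 */
static int model_teardown(void)
{
	pthread_t threads[NR_BUFFERS];
	int still_reading = 0;
	size_t i;

	for (i = 0; i < NR_BUFFERS; i++) {
		pthread_mutex_init(&buffers[i].lock, NULL);
		pthread_cond_init(&buffers[i].read_done, NULL);
		buffers[i].flags = BUF_READING;
	}
	for (i = 0; i < NR_BUFFERS; i++)
		if (pthread_create(&threads[i], NULL, model_end_read, &buffers[i]))
			return -1;

	/* Wait on each buffer in turn; completion order does not matter. */
	for (i = 0; i < NR_BUFFERS; i++)
		model_wait_read(&buffers[i]);

	/* Only now would it be safe to free what the callbacks touch. */
	for (i = 0; i < NR_BUFFERS; i++) {
		pthread_join(threads[i], NULL);
		still_reading += !!(buffers[i].flags & BUF_READING);
		pthread_cond_destroy(&buffers[i].read_done);
		pthread_mutex_destroy(&buffers[i].lock);
	}
	return still_reading;
}
```

Calling model_teardown() from a main() should return 0 once every
simulated read has finished. The point of the model is only that the
wait is keyed on the per-buffer read state rather than on an unrelated
counter.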

>   	btrfs_stop_all_workers(fs_info);
>   	btrfs_free_block_groups(fs_info);
>   fail_alloc:



