From: Qu Wenruo <wqu@suse.com>
To: Teng Liu <27rabbitlt@gmail.com>, linux-btrfs@vger.kernel.org
Cc: dsterba@suse.com, clm@fb.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] btrfs: wait for in-flight readahead BIOs on open_ctree() error
Date: Mon, 30 Mar 2026 08:36:15 +1030
Message-ID: <4a129696-0352-427f-9e0e-7962e789df57@suse.com>
In-Reply-To: <aclYf4R2XxlUkxAQ@rabbitArch>
On 2026/3/30 03:53, Teng Liu wrote:
> Thanks for your review!
> On 2026-03-29 17:33, Qu Wenruo wrote:
>>
>>
>> This doesn't make any sense to me.
> It confused me as well when I tried to reproduce the bug. The report
> claimed that btrfs_bio_counter_sub() triggered a use-after-free, but this
> function lives in `dev-replace.c`, which, judging from the name, should
> have nothing to do with this setup.
>
> However when I checked the function call chain:
>
> open_ctree()
>   → btrfs_read_sys_array()          # OK: sys_chunk_array in superblock is intact
>   → load_super_root(chunk_root)     # OK: reads root node, passes validation
>   → btrfs_read_chunk_tree()
>     → btrfs_for_each_slot()
>       → readahead_tree_node_children(node)
>         → for each child pointer in the internal node:
>           btrfs_readahead_node_child()
>           → btrfs_readahead_tree_block()
>             → read_extent_buffer_pages_nowait()
>               → btrfs_submit_bbio()
>                 → btrfs_submit_chunk()
>                   → btrfs_bio_counter_inc_blocked()   ← bio_counter++
>                   → btrfs_map_block()
>                   → submit_bio()                      ← sent to USB drive
Even if you wait for all bios, it can still cause problems.

As the bio counter only covers the btrfs bio layer, btrfs_bio::end_io is
still called after btrfs_bio_counter_dec().

If the whole fs_info has been freed by then, end_bbio_meta_read() can
still hit problems, as btrfs_validate_extent_buffer() will access the eb
(bbio->private) and fs_info (eb->fs_info), which triggers a
use-after-free.

So using that bio counter is not going to solve all the problems. It only
narrows the race window and thus masks the problem.
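
The ordering above can be replayed in a small userspace model. This is only a sketch under assumptions: `model_fs_info`, `model_bio` and all `model_*()` helpers are hypothetical names, not btrfs code, and the two-step completion merely mimics the order described above (counter dropped first, end_io called afterwards).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for fs_info: only the fields the model needs. */
struct model_fs_info {
	int bio_counter;	/* in-flight bios as seen by the bio layer */
	bool freed;		/* set once the error path has torn it down */
};

/* Hypothetical stand-in for a metadata read bio. */
struct model_bio {
	struct model_fs_info *fs_info;
	bool end_io_done;
};

/* Submission: the bio layer accounts for the bio. */
static void model_submit(struct model_fs_info *fs_info, struct model_bio *bio)
{
	bio->fs_info = fs_info;
	bio->end_io_done = false;
	fs_info->bio_counter++;
}

/* Completion, step 1: the bio layer drops its counter. */
static void model_complete_counter(struct model_bio *bio)
{
	assert(bio->fs_info->bio_counter > 0);
	bio->fs_info->bio_counter--;
}

/*
 * Completion, step 2: the end_io callback still dereferences fs_info.
 * Returns false if that access would hit an already freed fs_info.
 */
static bool model_complete_end_io(struct model_bio *bio)
{
	bio->end_io_done = true;
	return !bio->fs_info->freed;
}

/* Error path that only waits for the counter before freeing. */
static bool model_error_path_may_free(const struct model_fs_info *fs_info)
{
	return fs_info->bio_counter == 0;
}

/* Replays the racy interleaving; returns true if end_io was safe. */
static bool model_replay_race(void)
{
	struct model_fs_info fs_info = { .bio_counter = 0, .freed = false };
	struct model_bio bio;

	model_submit(&fs_info, &bio);
	model_complete_counter(&bio);
	/* The waiter sees counter == 0 and frees before end_io has run. */
	if (model_error_path_may_free(&fs_info))
		fs_info.freed = true;	/* models the free; memory stays valid here */
	return model_complete_end_io(&bio);
}
```

In this model `model_replay_race()` reports an unsafe end_io, because the counter reaches zero one step before the callback touches fs_info.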
>
> After submit_bio() sends the BIO to the USB drive, we continue to
> read_one_dev():
>
> open_ctree()
>   → btrfs_read_sys_array()          # OK: sys_chunk_array in superblock is intact
>   → load_super_root(chunk_root)     # OK: reads root node, passes validation
>   → btrfs_read_chunk_tree()
>     → btrfs_for_each_slot()
>       → readahead_tree_node_children(node)
>         → bio_counter++ and submit_bio() sends the BIO to the USB drive
>       → read_one_dev()
>
> This read_one_dev() will return an error since the leaf block is actually
> corrupted. Then open_ctree() gets into the error path and tries to free
> fs_info.
>
> After the USB device finishes the BIO, it will try to decrement the
> counter, but fs_info is already freed.
>
> Any suggestions on this?
The following ideas come to mind, but none seems as simple as your
current one:

1) Introduce a dedicated counter for metadata readahead/reads
   This seems to be the simplest of them all.
   But its only user would be this error handling, so it may not be
   worth it.

2) Disable metadata readahead during open_ctree()
   This will slow down the mount, especially for a large extent tree
   without the bgt feature.

3) Use the buffer_tree xarray to iterate through all ebs
   Since this is only for the error handling of open_ctree(), we're fine
   to do the full xarray iteration and wait for any eb that has the
   EXTENT_BUFFER_READING flag.
   The problem is, we do not have a dedicated tag for reads, the way
   PAGECACHE_TAG_(TOWRITE|DIRTY) lets us easily catch all
   dirty/writeback ebs.
   So the only option is to go through each eb and check its flags.
   I think this is the one with minimal impact, but it may make this
   error handling path run much longer.

My personal preference is option 3).
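
For option 3), the error path would look roughly like the userspace model below. It is a sketch under assumptions: a plain array stands in for the buffer_tree xarray, `MODEL_EB_READING` stands in for EXTENT_BUFFER_READING, and `model_wait_all_reads()` plus the `poll` callback are hypothetical. Real kernel code would sleep on the flag bit instead of polling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_EB_READING (1UL << 0)	/* stand-in for EXTENT_BUFFER_READING */

/* Hypothetical stand-in for an extent buffer. */
struct model_eb {
	unsigned long bflags;
};

/* Hypothetical hook that lets pending I/O make progress. */
typedef void (*model_poll_fn)(struct model_eb *ebs, size_t nr);

/*
 * Walk every eb, since no tag selects only the reading ones, and wait
 * until the READING flag is cleared on each of them. Returns how many
 * ebs were found still under read.
 */
static size_t model_wait_all_reads(struct model_eb *ebs, size_t nr,
				   model_poll_fn poll)
{
	size_t waited = 0;

	assert(poll != NULL);
	for (size_t i = 0; i < nr; i++) {
		if (!(ebs[i].bflags & MODEL_EB_READING))
			continue;
		waited++;
		while (ebs[i].bflags & MODEL_EB_READING)
			poll(ebs, nr);
	}
	return waited;
}

/* Example poll hook: completes the first eb that is still reading. */
static void model_complete_one(struct model_eb *ebs, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		if (ebs[i].bflags & MODEL_EB_READING) {
			ebs[i].bflags &= ~MODEL_EB_READING;
			return;
		}
	}
}

/* Returns true if any eb is still under read. */
static bool model_any_reading(const struct model_eb *ebs, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		if (ebs[i].bflags & MODEL_EB_READING)
			return true;
	return false;
}
```

The cost mentioned above shows up in the outer loop: every eb is visited, even though only the few with the flag set need any waiting.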
>
>
>>
>> The wait and counter are all for dev-replace, not matching your description
>> of the generic metadata readahead.
>>
>> If you want to wait for all existing metadata reads, I didn't find a good
>> helper, thus you will need to go through all extent buffers and wait for
>> EXTENT_BUFFER_READING flags.
>>
>>
>
>