public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Qu Wenruo <wqu@suse.com>
To: Filipe Manana <fdmanana@kernel.org>
Cc: linux-btrfs@vger.kernel.org, AHN SEOK-YOUNG <iamsyahn@gmail.com>,
	Teng Liu <27rabbitlt@gmail.com>
Subject: Re: [PATCH] btrfs: warn about extent buffer that can not be released
Date: Thu, 16 Apr 2026 08:32:40 +0930	[thread overview]
Message-ID: <ec6a7a95-28d2-41bc-85a7-ef36e58ff9bd@suse.com> (raw)
In-Reply-To: <CAL3q7H45KF=ptRmAEE32J07oeewV3h-Zq8k_-LhO15PnReL+2Q@mail.gmail.com>



在 2026/4/15 20:55, Filipe Manana 写道:
> On Sun, Apr 5, 2026 at 5:56 AM Qu Wenruo <wqu@suse.com> wrote:
>>
>> When we unmount the fs or during mount failures, btrfs will call
>> invalidate_inode_pages() to release all btree inode folios.
>>
>> However that function can return -EBUSY if any folios can not be
>> invalidated.
>> This can be caused by:
>>
>> - Some extent buffers are still held by btrfs
>>    This is a logic error, as we should release all tree root nodes
>>    during unmount and mount failure handling.
>>
>> - Some extent buffers are under readahead and haven't yet finished
>>    This is much rarer but valid cases.
>>    In that case we should wait for those extent buffers.
>>
>> Introduce a new helper invalidate_btree_folios() which will:
>>
>> - Call invalidate_inode_pages2() and catch its return value
>>    If it returned 0 as expected, that's great and we can call it a day.
>>
>> - Otherwise go through each extent buffer in buffer_tree
>>    For eb that still has EXTENT_BUFFER_READING flag, wait for them to
>>    finish first.
>>
>>    Then check if the refs of the eb is greater than 1, if so it means
>>    it's still held by someone, give it a warning.
>>
>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=221270
>> Reported-by: AHN SEOK-YOUNG <iamsyahn@gmail.com>
>> Cc: Teng Liu <27rabbitlt@gmail.com>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>> This new debug feature has exposed some problems, e.g. generic/648,
>> after an injected failure for device barrier, transaction is aborted but
>> we have a lot of dirty tree blocks unreleased and triggered the new
>> warning.
>>
>> We still need to fix these corner cases.
>>
>> Meanwhile if Ahn or Teng can give this patch a try for the existing
>> use-after-free problem, it would be appreciated a lot.
>> ---
>>   fs/btrfs/disk-io.c   | 52 ++++++++++++++++++++++++++++++++++++++++++--
>>   fs/btrfs/extent_io.c |  6 -----
>>   fs/btrfs/extent_io.h |  6 +++++
>>   3 files changed, 56 insertions(+), 8 deletions(-)
>>
>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
>> index b8ac3275d8f8..d450bfc6d89b 100644
>> --- a/fs/btrfs/disk-io.c
>> +++ b/fs/btrfs/disk-io.c
>> @@ -3276,6 +3276,54 @@ static bool fs_is_full_ro(const struct btrfs_fs_info *fs_info)
>>          return false;
>>   }
>>
>> +static void invalidate_btree_folios(struct btrfs_fs_info *fs_info)
>> +{
>> +       XA_STATE(xas, &fs_info->buffer_tree, 0);
>> +       unsigned int checked = 0;
>> +       void *tmp;
>> +       int ret;
>> +
>> +       ret = invalidate_inode_pages2(fs_info->btree_inode->i_mapping);
>> +       if (likely(ret == 0))
>> +               return;
>> +
>> +       /*
>> +        * Some btree pages can not be invalidated, this happens when some
>> +        * tree blocks are still held (either by some pointer or readahead).
>> +        */
>> +       xas_lock_irq(&xas);
>> +       xas_for_each(&xas, tmp, ULONG_MAX) {
>> +               struct extent_buffer *eb = tmp;
> 
> Why use the tmp variable? We can pass 'eb' to xas_for_each().
> 
>> +
>> +               /* Wait for any readahead first. */
>> +               if (test_bit(EXTENT_BUFFER_READING, &eb->bflags)) {
>> +                       xas_pause(&xas);
>> +                       xas_unlock_irq(&xas);
>> +                       wait_on_bit_io(&eb->bflags, EXTENT_BUFFER_READING,
>> +                                      TASK_UNINTERRUPTIBLE);
>> +                       xas_lock_irq(&xas);
>> +               }
>> +
>> +               spin_lock(&eb->refs_lock);
>> +               if (refcount_read(&eb->refs) > 1 || extent_buffer_under_io(eb)) {
>> +                       btrfs_warn(fs_info,
>> +                       "unbale to release extent buffer %llu owner %llu flags 0x%lx",
> 
> unbale -> unable
> 
>> +                                  eb->start, btrfs_header_owner(eb), eb->bflags);
>> +                       if (IS_ENABLED(CONFIG_BTRFS_DEBUG))
>> +                               WARN_ON_ONCE(true);
> 
> Pointless if, just do DEBUG_WARN();

The problem is we do not have DEBUG_WARN_ON_ONCE().

We already have cases triggering this during fstests, and there were a 
lot of ebs that are dirty in that case.

DEBUG_WARN() will flood the dmesg with the same call trace for each 
dirty eb.

I can add a new DEBUG_WARN_ON_ONCE() for this particular call site though.

Thanks,
Qu


  reply	other threads:[~2026-04-15 23:02 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-05  4:56 [PATCH] btrfs: warn about extent buffer that can not be released Qu Wenruo
2026-04-05 10:05 ` 말란말야
2026-04-08 22:18 ` Teng Liu
2026-04-15 11:25 ` Filipe Manana
2026-04-15 23:02   ` Qu Wenruo [this message]
2026-04-16 10:22     ` Filipe Manana

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ec6a7a95-28d2-41bc-85a7-ef36e58ff9bd@suse.com \
    --to=wqu@suse.com \
    --cc=27rabbitlt@gmail.com \
    --cc=fdmanana@kernel.org \
    --cc=iamsyahn@gmail.com \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox