From: Qu Wenruo <wqu@suse.com>
To: linux-btrfs@vger.kernel.org
Cc: AHN SEOK-YOUNG <iamsyahn@gmail.com>, Teng Liu <27rabbitlt@gmail.com>
Subject: [PATCH v4 2/2] btrfs: warn about extent buffer that can not be released
Date: Thu, 30 Apr 2026 10:37:23 +0930
Message-ID: <ecc473cee5a79c8c8b1006febbfeb24963c46714.1777510825.git.wqu@suse.com>
In-Reply-To: <cover.1777510825.git.wqu@suse.com>

When we unmount the fs or during mount failures, btrfs will call
invalidate_inode_pages2() to release all btree inode folios.

However that function can return -EBUSY if any folios can not be
invalidated.
This can be caused by:

- Some extent buffers are still held by btrfs
  This is a logic error, as we should release all tree root nodes
  during unmount and mount failure handling.

- Some extent buffers are under readahead and haven't yet finished
  These are much rarer but valid cases.
  In that case we should wait for those extent buffers.

Introduce a new helper invalidate_and_check_btree_folios() which will:

- Call invalidate_inode_pages2() and catch its return value
  If it returned 0 as expected, that's great and we can call it a day.

- Otherwise go through each extent buffer in buffer_tree
  Increase the ref by one first for the eb we're checking.
  This is to ensure the eb won't be freed after the readahead is
  finished.

  For ebs that still have the EXTENT_BUFFER_READING flag, wait for them
  to finish first.

  After waiting for the readahead, check the refs of the eb and whether
  it is still dirty.

  If the eb ref count is greater than 2 (one for the buffer tree, one
  held by us), it means we are still holding the extent buffer somewhere
  else, which is a code bug.

  If the eb is still dirty, it means a bug in transaction handling, e.g.
  the bug fixed by patch "btrfs: only release the dirty pages io tree
  after successful writes".

  For either case, show a warning message about the eb, including its
  bytenr, owner, generation, refs and flags.
  And if it's a debug build, also trigger WARN_ON_ONCE() so that fstests
  can properly catch such a situation.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=221270
Reported-by: AHN SEOK-YOUNG <iamsyahn@gmail.com>
Cc: Teng Liu <27rabbitlt@gmail.com>
Tested-by: Teng Liu <27rabbitlt@gmail.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
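Reviewer note (not part of the patch): the reporting condition described in
the commit message can be modelled in user space as below. This is only a
sketch under stated assumptions: struct mock_eb, the MOCK_EB_* flags and both
helper names are hypothetical stand-ins for struct extent_buffer, eb->bflags
and extent_buffer_under_io(), and the caller is assumed to have already taken
its own temporary reference, as the new helper does at the top of its loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for EXTENT_BUFFER_DIRTY / EXTENT_BUFFER_WRITEBACK. */
enum mock_eb_flag {
	MOCK_EB_DIRTY     = 1u << 0,
	MOCK_EB_WRITEBACK = 1u << 1,
};

/* Hypothetical user-space model of the fields the check looks at. */
struct mock_eb {
	unsigned int refs;	/* models eb->refs */
	unsigned int flags;	/* models eb->bflags */
};

/* Models extent_buffer_under_io(): the eb is dirty or under writeback. */
static inline bool mock_eb_under_io(const struct mock_eb *eb)
{
	return (eb->flags & (MOCK_EB_DIRTY | MOCK_EB_WRITEBACK)) != 0;
}

/*
 * Return true if the eb would be reported.
 *
 * Assumes the caller holds one temporary ref, so a releasable eb has
 * exactly 2 refs: one for the buffer tree and one for the caller.
 */
static inline bool mock_eb_needs_warning(const struct mock_eb *eb)
{
	assert(eb != NULL);
	return eb->refs > 2 || mock_eb_under_io(eb);
}
```

With this model, an eb with 2 refs and no flags is considered releasable,
while a third ref or a set dirty/writeback flag triggers the report.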
 fs/btrfs/disk-io.c   | 54 ++++++++++++++++++++++++++++++++++++++++++--
 fs/btrfs/extent_io.c |  6 -----
 fs/btrfs/extent_io.h |  6 +++++
 3 files changed, 58 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index f28cef8217de..97299394515b 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3272,6 +3272,56 @@ static bool fs_is_full_ro(const struct btrfs_fs_info *fs_info)
 	return false;
 }
 
+/*
+ * Try to wait for any metadata readahead, and invalidate all btree folios.
+ *
+ * If the invalidation failed, report any dirty/held extent buffers.
+ */
+static void invalidate_and_check_btree_folios(struct btrfs_fs_info *fs_info)
+{
+	unsigned long index = 0;
+	struct extent_buffer *eb;
+	int ret;
+
+	ret = invalidate_inode_pages2(fs_info->btree_inode->i_mapping);
+	if (likely(ret == 0))
+		return;
+
+	/*
+	 * Some btree pages can not be invalidated, this happens when some
+	 * tree blocks are still held (either by some pointer or readahead).
+	 */
+	rcu_read_lock();
+	xa_for_each(&fs_info->buffer_tree, index, eb) {
+		/* Increase the ref so that the eb won't disappear. */
+		if (!refcount_inc_not_zero(&eb->refs))
+			continue;
+		rcu_read_unlock();
+
+		/* Wait for any readahead first. */
+		if (test_bit(EXTENT_BUFFER_READING, &eb->bflags))
+			wait_on_bit_io(&eb->bflags, EXTENT_BUFFER_READING,
+				       TASK_UNINTERRUPTIBLE);
+		/*
+		 * The refs threshold is 2, one held by us at the beginning
+		 * of the loop, one for the ownership in the buffer tree.
+		 */
+		if (unlikely(refcount_read(&eb->refs) > 2 ||
+			     extent_buffer_under_io(eb))) {
+			WARN_ON_ONCE(IS_ENABLED(CONFIG_BTRFS_DEBUG));
+			btrfs_warn(fs_info,
+"unable to release extent buffer %llu owner %llu gen %llu refs %u flags 0x%lx",
+				   eb->start, btrfs_header_owner(eb),
+				   btrfs_header_generation(eb),
+				   refcount_read(&eb->refs), eb->bflags);
+		}
+		free_extent_buffer(eb);
+		rcu_read_lock();
+	}
+	rcu_read_unlock();
+	invalidate_inode_pages2(fs_info->btree_inode->i_mapping);
+}
+
 int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_devices)
 {
 	u32 sectorsize;
@@ -3709,7 +3759,7 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
 	if (fs_info->data_reloc_root)
 		btrfs_drop_and_free_fs_root(fs_info, fs_info->data_reloc_root);
 	free_root_pointers(fs_info, true);
-	invalidate_inode_pages2(fs_info->btree_inode->i_mapping);
+	invalidate_and_check_btree_folios(fs_info);
 
 fail_sb_buffer:
 	btrfs_stop_all_workers(fs_info);
@@ -4438,7 +4488,7 @@ void __cold close_ctree(struct btrfs_fs_info *fs_info)
 	 * We must make sure there is not any read request to
 	 * submit after we stop all workers.
 	 */
-	invalidate_inode_pages2(fs_info->btree_inode->i_mapping);
+	invalidate_and_check_btree_folios(fs_info);
 	btrfs_stop_all_workers(fs_info);
 
 	/*
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 8800faa8b4be..9c9dffda2b07 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2882,12 +2882,6 @@ bool try_release_extent_mapping(struct folio *folio, gfp_t mask)
 	return try_release_extent_state(io_tree, folio);
 }
 
-static int extent_buffer_under_io(const struct extent_buffer *eb)
-{
-	return (test_bit(EXTENT_BUFFER_WRITEBACK, &eb->bflags) ||
-		test_bit(EXTENT_BUFFER_DIRTY, &eb->bflags));
-}
-
 static bool folio_range_has_eb(struct folio *folio)
 {
 	struct btrfs_folio_state *bfs;
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index b310a5145cf6..7b4152387d88 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -327,6 +327,12 @@ static inline bool extent_buffer_uptodate(const struct extent_buffer *eb)
 	return test_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags);
 }
 
+static inline bool extent_buffer_under_io(const struct extent_buffer *eb)
+{
+	return (test_bit(EXTENT_BUFFER_WRITEBACK, &eb->bflags) ||
+		test_bit(EXTENT_BUFFER_DIRTY, &eb->bflags));
+}
+
 int memcmp_extent_buffer(const struct extent_buffer *eb, const void *ptrv,
 			 unsigned long start, unsigned long len);
 void read_extent_buffer(const struct extent_buffer *eb, void *dst,
--
2.54.0