* [PATCH v5 1/3] btrfs: convert the buffer_radix to an xarray
2025-04-28 14:52 [PATCH v5 0/3] btrfs: simplify extent buffer writeback Josef Bacik
@ 2025-04-28 14:52 ` Josef Bacik
2025-05-07 9:31 ` Qu Wenruo
2025-04-28 14:52 ` [PATCH v5 2/3] btrfs: set DIRTY and WRITEBACK tags on the buffer_tree Josef Bacik
2025-04-28 14:52 ` [PATCH v5 3/3] btrfs: use buffer xarray for extent buffer writeback operations Josef Bacik
2 siblings, 1 reply; 9+ messages in thread
From: Josef Bacik @ 2025-04-28 14:52 UTC
To: linux-btrfs, kernel-team; +Cc: Filipe Manana
In order to fully utilize xarray tagging to improve writeback we need to
convert the buffer_radix to a proper xarray. This conversion is
relatively straightforward as the radix code uses the xarray underneath.
Using xarray directly allows for quite a lot less code.
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
fs/btrfs/disk-io.c | 15 ++-
fs/btrfs/extent_io.c | 199 ++++++++++++++++-------------------
fs/btrfs/fs.h | 4 +-
fs/btrfs/tests/btrfs-tests.c | 28 ++---
fs/btrfs/zoned.c | 16 +--
5 files changed, 112 insertions(+), 150 deletions(-)
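
The insertion hunks in extent_io.c replace "preload, insert, look up again on -EEXIST" with a single compare-and-exchange against NULL, followed by an inc-not-zero on whatever entry was already there. As a rough userspace sketch of that pattern only (C11 atomics instead of the xarray; `struct buf`, `slots`, `ref_get_unless_zero()` and `insert_or_get()` are hypothetical names that do not exist in btrfs, and the locking the kernel code does with xa_lock_irq() is left out entirely):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct extent_buffer: only what the sketch needs. */
struct buf {
	uint64_t start;
	atomic_int refs;
};

#define NSLOTS 64
/* Hypothetical stand-in for fs_info->buffer_tree: a small fixed slot array. */
static _Atomic(struct buf *) slots[NSLOTS];

/* Same idea as the kernel's atomic_inc_not_zero(): take a ref unless refs is 0. */
static int ref_get_unless_zero(struct buf *b)
{
	int old = atomic_load(&b->refs);

	while (old != 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&b->refs, &old, old + 1))
			return 1;
	}
	return 0;
}

/*
 * Store @b at index (start >> sectorsize_bits) if that slot is empty.
 * Returns @b on success, or the entry already present with one extra
 * reference taken. An entry whose refcount already hit zero is treated as
 * "being freed" and the insert is retried, so this spins until the owner
 * of that entry removes it from the slot.
 */
static struct buf *insert_or_get(struct buf *b, unsigned int sectorsize_bits)
{
	uint64_t index = b->start >> sectorsize_bits;

	if (index >= NSLOTS)
		return NULL;
	for (;;) {
		struct buf *expected = NULL;

		if (atomic_compare_exchange_strong(&slots[index], &expected, b))
			return b;
		if (ref_get_unless_zero(expected))
			return expected;
	}
}

/* Usage: the second insert at the same index gets the first buffer back. */
static void demo_insert(void)
{
	struct buf a = { .start = 16384, .refs = 1 };
	struct buf dup = { .start = 16384, .refs = 1 };

	assert(insert_or_get(&a, 12) == &a);
	assert(insert_or_get(&dup, 12) == &a);
	assert(atomic_load(&a.refs) == 2);
}
```

The sketch is only meant to show why a separate lookup after a failed insert is no longer needed: the compare-and-exchange itself hands back the conflicting entry.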
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 59da809b7d57..24c08eb86b7b 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2762,10 +2762,22 @@ static int __cold init_tree_roots(struct btrfs_fs_info *fs_info)
return ret;
}
+/*
+ * lockdep gets confused between our buffer_tree which requires IRQ locking
+ * because we modify marks in the IRQ context, and our delayed inode xarray
+ * which doesn't have these requirements. Use a class key so lockdep doesn't get
+ * them mixed up.
+ */
+static struct lock_class_key buffer_xa_class;
+
void btrfs_init_fs_info(struct btrfs_fs_info *fs_info)
{
INIT_RADIX_TREE(&fs_info->fs_roots_radix, GFP_ATOMIC);
- INIT_RADIX_TREE(&fs_info->buffer_radix, GFP_ATOMIC);
+
+ /* Use the same flags as mapping->i_pages. */
+ xa_init_flags(&fs_info->buffer_tree, XA_FLAGS_LOCK_IRQ | XA_FLAGS_ACCOUNT);
+ lockdep_set_class(&fs_info->buffer_tree.xa_lock, &buffer_xa_class);
+
INIT_LIST_HEAD(&fs_info->trans_list);
INIT_LIST_HEAD(&fs_info->dead_roots);
INIT_LIST_HEAD(&fs_info->delayed_iputs);
@@ -2777,7 +2789,6 @@ void btrfs_init_fs_info(struct btrfs_fs_info *fs_info)
spin_lock_init(&fs_info->delayed_iput_lock);
spin_lock_init(&fs_info->defrag_inodes_lock);
spin_lock_init(&fs_info->super_lock);
- spin_lock_init(&fs_info->buffer_lock);
spin_lock_init(&fs_info->unused_bgs_lock);
spin_lock_init(&fs_info->treelog_bg_lock);
spin_lock_init(&fs_info->zone_active_bgs_lock);
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 6cfd286b8bbc..bedcacaf809f 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -1893,19 +1893,17 @@ static void set_btree_ioerr(struct extent_buffer *eb)
* context.
*/
static struct extent_buffer *find_extent_buffer_nolock(
- const struct btrfs_fs_info *fs_info, u64 start)
+ struct btrfs_fs_info *fs_info, u64 start)
{
struct extent_buffer *eb;
+ unsigned long index = start >> fs_info->sectorsize_bits;
rcu_read_lock();
- eb = radix_tree_lookup(&fs_info->buffer_radix,
- start >> fs_info->sectorsize_bits);
- if (eb && atomic_inc_not_zero(&eb->refs)) {
- rcu_read_unlock();
- return eb;
- }
+ eb = xa_load(&fs_info->buffer_tree, index);
+ if (eb && !atomic_inc_not_zero(&eb->refs))
+ eb = NULL;
rcu_read_unlock();
- return NULL;
+ return eb;
}
static void end_bbio_meta_write(struct btrfs_bio *bbio)
@@ -2769,11 +2767,10 @@ static void detach_extent_buffer_folio(const struct extent_buffer *eb, struct fo
if (!btrfs_meta_is_subpage(fs_info)) {
/*
- * We do this since we'll remove the pages after we've
- * removed the eb from the radix tree, so we could race
- * and have this page now attached to the new eb. So
- * only clear folio if it's still connected to
- * this eb.
+ * We do this since we'll remove the pages after we've removed
+ * the eb from the xarray, so we could race and have this page
+ * now attached to the new eb. So only clear folio if it's
+ * still connected to this eb.
*/
if (folio_test_private(folio) && folio_get_private(folio) == eb) {
BUG_ON(test_bit(EXTENT_BUFFER_DIRTY, &eb->bflags));
@@ -2938,9 +2935,9 @@ static void check_buffer_tree_ref(struct extent_buffer *eb)
{
int refs;
/*
- * The TREE_REF bit is first set when the extent_buffer is added
- * to the radix tree. It is also reset, if unset, when a new reference
- * is created by find_extent_buffer.
+ * The TREE_REF bit is first set when the extent_buffer is added to the
+ * xarray. It is also reset, if unset, when a new reference is created
+ * by find_extent_buffer.
*
* It is only cleared in two cases: freeing the last non-tree
* reference to the extent_buffer when its STALE bit is set or
@@ -2952,13 +2949,12 @@ static void check_buffer_tree_ref(struct extent_buffer *eb)
* conditions between the calls to check_buffer_tree_ref in those
* codepaths and clearing TREE_REF in try_release_extent_buffer.
*
- * The actual lifetime of the extent_buffer in the radix tree is
- * adequately protected by the refcount, but the TREE_REF bit and
- * its corresponding reference are not. To protect against this
- * class of races, we call check_buffer_tree_ref from the codepaths
- * which trigger io. Note that once io is initiated, TREE_REF can no
- * longer be cleared, so that is the moment at which any such race is
- * best fixed.
+ * The actual lifetime of the extent_buffer in the xarray is adequately
+ * protected by the refcount, but the TREE_REF bit and its corresponding
+ * reference are not. To protect against this class of races, we call
+ * check_buffer_tree_ref from the codepaths which trigger io. Note that
+ * once io is initiated, TREE_REF can no longer be cleared, so that is
+ * the moment at which any such race is best fixed.
*/
refs = atomic_read(&eb->refs);
if (refs >= 2 && test_bit(EXTENT_BUFFER_TREE_REF, &eb->bflags))
@@ -3022,23 +3018,26 @@ struct extent_buffer *alloc_test_extent_buffer(struct btrfs_fs_info *fs_info,
return ERR_PTR(-ENOMEM);
eb->fs_info = fs_info;
again:
- ret = radix_tree_preload(GFP_NOFS);
- if (ret) {
- exists = ERR_PTR(ret);
+ xa_lock_irq(&fs_info->buffer_tree);
+ exists = __xa_cmpxchg(&fs_info->buffer_tree,
+ start >> fs_info->sectorsize_bits, NULL, eb,
+ GFP_NOFS);
+ if (xa_is_err(exists)) {
+ ret = xa_err(exists);
+ xa_unlock_irq(&fs_info->buffer_tree);
+ btrfs_release_extent_buffer(eb);
+ return ERR_PTR(ret);
+ }
+ if (exists) {
+ if (!atomic_inc_not_zero(&exists->refs)) {
+ /* The extent buffer is being freed, retry. */
+ xa_unlock_irq(&fs_info->buffer_tree);
+ goto again;
+ }
+ xa_unlock_irq(&fs_info->buffer_tree);
goto free_eb;
}
- spin_lock(&fs_info->buffer_lock);
- ret = radix_tree_insert(&fs_info->buffer_radix,
- start >> fs_info->sectorsize_bits, eb);
- spin_unlock(&fs_info->buffer_lock);
- radix_tree_preload_end();
- if (ret == -EEXIST) {
- exists = find_extent_buffer(fs_info, start);
- if (exists)
- goto free_eb;
- else
- goto again;
- }
+ xa_unlock_irq(&fs_info->buffer_tree);
check_buffer_tree_ref(eb);
return eb;
@@ -3059,9 +3058,9 @@ static struct extent_buffer *grab_extent_buffer(struct btrfs_fs_info *fs_info,
lockdep_assert_held(&folio->mapping->i_private_lock);
/*
- * For subpage case, we completely rely on radix tree to ensure we
- * don't try to insert two ebs for the same bytenr. So here we always
- * return NULL and just continue.
+ * For subpage case, we completely rely on xarray to ensure we don't try
+ * to insert two ebs for the same bytenr. So here we always return NULL
+ * and just continue.
*/
if (btrfs_meta_is_subpage(fs_info))
return NULL;
@@ -3194,7 +3193,7 @@ static int attach_eb_folio_to_filemap(struct extent_buffer *eb, int i,
/*
* To inform we have an extra eb under allocation, so that
* detach_extent_buffer_page() won't release the folio private when the
- * eb hasn't been inserted into radix tree yet.
+ * eb hasn't been inserted into the xarray yet.
*
* The ref will be decreased when the eb releases the page, in
* detach_extent_buffer_page(). Thus needs no special handling in the
@@ -3328,10 +3327,10 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
/*
* We can't unlock the pages just yet since the extent buffer
- * hasn't been properly inserted in the radix tree, this
- * opens a race with btree_release_folio which can free a page
- * while we are still filling in all pages for the buffer and
- * we could crash.
+ * hasn't been properly inserted in the xarray, this opens a
+ * race with btree_release_folio which can free a page while we
+ * are still filling in all pages for the buffer and we could
+ * crash.
*/
}
if (uptodate)
@@ -3340,23 +3339,25 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
if (page_contig)
eb->addr = folio_address(eb->folios[0]) + offset_in_page(eb->start);
again:
- ret = radix_tree_preload(GFP_NOFS);
- if (ret)
+ xa_lock_irq(&fs_info->buffer_tree);
+ existing_eb = __xa_cmpxchg(&fs_info->buffer_tree,
+ start >> fs_info->sectorsize_bits, NULL, eb,
+ GFP_NOFS);
+ if (xa_is_err(existing_eb)) {
+ ret = xa_err(existing_eb);
+ xa_unlock_irq(&fs_info->buffer_tree);
goto out;
-
- spin_lock(&fs_info->buffer_lock);
- ret = radix_tree_insert(&fs_info->buffer_radix,
- start >> fs_info->sectorsize_bits, eb);
- spin_unlock(&fs_info->buffer_lock);
- radix_tree_preload_end();
- if (ret == -EEXIST) {
- ret = 0;
- existing_eb = find_extent_buffer(fs_info, start);
- if (existing_eb)
- goto out;
- else
- goto again;
}
+ if (existing_eb) {
+ if (!atomic_inc_not_zero(&existing_eb->refs)) {
+ xa_unlock_irq(&fs_info->buffer_tree);
+ goto again;
+ }
+ xa_unlock_irq(&fs_info->buffer_tree);
+ goto out;
+ }
+ xa_unlock_irq(&fs_info->buffer_tree);
+
/* add one reference for the tree */
check_buffer_tree_ref(eb);
@@ -3426,10 +3427,23 @@ static int release_extent_buffer(struct extent_buffer *eb)
spin_unlock(&eb->refs_lock);
- spin_lock(&fs_info->buffer_lock);
- radix_tree_delete_item(&fs_info->buffer_radix,
- eb->start >> fs_info->sectorsize_bits, eb);
- spin_unlock(&fs_info->buffer_lock);
+ /*
+ * We're erasing, theoretically there will be no allocations, so
+ * just use GFP_ATOMIC.
+ *
+ * We use cmpxchg instead of erase because we do not know if
+ * this eb is actually in the tree or not, we could be cleaning
+ * up an eb that we allocated but never inserted into the tree.
+ * Thus use cmpxchg to remove it from the tree if it is there,
+ * or leave the other entry if this isn't in the tree.
+ *
+ * The documentation says that putting a NULL value is the same
+ * as erase as long as XA_FLAGS_ALLOC is not set, which it isn't
+ * in this case.
+ */
+ xa_cmpxchg_irq(&fs_info->buffer_tree,
+ eb->start >> fs_info->sectorsize_bits, eb, NULL,
+ GFP_ATOMIC);
btrfs_leak_debug_del_eb(eb);
/* Should be safe to release folios at this point. */
@@ -4260,44 +4274,6 @@ void memmove_extent_buffer(const struct extent_buffer *dst,
}
}
-#define GANG_LOOKUP_SIZE 16
-static struct extent_buffer *get_next_extent_buffer(
- const struct btrfs_fs_info *fs_info, struct folio *folio, u64 bytenr)
-{
- struct extent_buffer *gang[GANG_LOOKUP_SIZE];
- struct extent_buffer *found = NULL;
- u64 folio_start = folio_pos(folio);
- u64 cur = folio_start;
-
- ASSERT(in_range(bytenr, folio_start, PAGE_SIZE));
- lockdep_assert_held(&fs_info->buffer_lock);
-
- while (cur < folio_start + PAGE_SIZE) {
- int ret;
- int i;
-
- ret = radix_tree_gang_lookup(&fs_info->buffer_radix,
- (void **)gang, cur >> fs_info->sectorsize_bits,
- min_t(unsigned int, GANG_LOOKUP_SIZE,
- PAGE_SIZE / fs_info->nodesize));
- if (ret == 0)
- goto out;
- for (i = 0; i < ret; i++) {
- /* Already beyond page end */
- if (gang[i]->start >= folio_start + PAGE_SIZE)
- goto out;
- /* Found one */
- if (gang[i]->start >= bytenr) {
- found = gang[i];
- goto out;
- }
- }
- cur = gang[ret - 1]->start + gang[ret - 1]->len;
- }
-out:
- return found;
-}
-
static int try_release_subpage_extent_buffer(struct folio *folio)
{
struct btrfs_fs_info *fs_info = folio_to_fs_info(folio);
@@ -4306,21 +4282,22 @@ static int try_release_subpage_extent_buffer(struct folio *folio)
int ret;
while (cur < end) {
+ unsigned long index = cur >> fs_info->sectorsize_bits;
struct extent_buffer *eb = NULL;
/*
* Unlike try_release_extent_buffer() which uses folio private
- * to grab buffer, for subpage case we rely on radix tree, thus
- * we need to ensure radix tree consistency.
+ * to grab buffer, for subpage case we rely on xarray, thus we
+ * need to ensure xarray tree consistency.
*
- * We also want an atomic snapshot of the radix tree, thus go
+ * We also want an atomic snapshot of the xarray tree, thus go
* with spinlock rather than RCU.
*/
- spin_lock(&fs_info->buffer_lock);
- eb = get_next_extent_buffer(fs_info, folio, cur);
+ xa_lock_irq(&fs_info->buffer_tree);
+ eb = xa_load(&fs_info->buffer_tree, index);
if (!eb) {
/* No more eb in the page range after or at cur */
- spin_unlock(&fs_info->buffer_lock);
+ xa_unlock_irq(&fs_info->buffer_tree);
break;
}
cur = eb->start + eb->len;
@@ -4332,10 +4309,10 @@ static int try_release_subpage_extent_buffer(struct folio *folio)
spin_lock(&eb->refs_lock);
if (atomic_read(&eb->refs) != 1 || extent_buffer_under_io(eb)) {
spin_unlock(&eb->refs_lock);
- spin_unlock(&fs_info->buffer_lock);
+ xa_unlock_irq(&fs_info->buffer_tree);
break;
}
- spin_unlock(&fs_info->buffer_lock);
+ xa_unlock_irq(&fs_info->buffer_tree);
/*
* If tree ref isn't set then we know the ref on this eb is a
diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h
index bcca43046064..ed02d276d908 100644
--- a/fs/btrfs/fs.h
+++ b/fs/btrfs/fs.h
@@ -776,10 +776,8 @@ struct btrfs_fs_info {
struct btrfs_delayed_root *delayed_root;
- /* Extent buffer radix tree */
- spinlock_t buffer_lock;
/* Entries are eb->start / sectorsize */
- struct radix_tree_root buffer_radix;
+ struct xarray buffer_tree;
/* Next backup root to be overwritten */
int backup_root_index;
diff --git a/fs/btrfs/tests/btrfs-tests.c b/fs/btrfs/tests/btrfs-tests.c
index 02a915eb51fb..b576897d71cc 100644
--- a/fs/btrfs/tests/btrfs-tests.c
+++ b/fs/btrfs/tests/btrfs-tests.c
@@ -157,9 +157,9 @@ struct btrfs_fs_info *btrfs_alloc_dummy_fs_info(u32 nodesize, u32 sectorsize)
void btrfs_free_dummy_fs_info(struct btrfs_fs_info *fs_info)
{
- struct radix_tree_iter iter;
- void **slot;
struct btrfs_device *dev, *tmp;
+ struct extent_buffer *eb;
+ unsigned long index;
if (!fs_info)
return;
@@ -169,25 +169,13 @@ void btrfs_free_dummy_fs_info(struct btrfs_fs_info *fs_info)
test_mnt->mnt_sb->s_fs_info = NULL;
- spin_lock(&fs_info->buffer_lock);
- radix_tree_for_each_slot(slot, &fs_info->buffer_radix, &iter, 0) {
- struct extent_buffer *eb;
-
- eb = radix_tree_deref_slot_protected(slot, &fs_info->buffer_lock);
- if (!eb)
- continue;
- /* Shouldn't happen but that kind of thinking creates CVE's */
- if (radix_tree_exception(eb)) {
- if (radix_tree_deref_retry(eb))
- slot = radix_tree_iter_retry(&iter);
- continue;
- }
- slot = radix_tree_iter_resume(slot, &iter);
- spin_unlock(&fs_info->buffer_lock);
- free_extent_buffer_stale(eb);
- spin_lock(&fs_info->buffer_lock);
+ xa_lock_irq(&fs_info->buffer_tree);
+ xa_for_each(&fs_info->buffer_tree, index, eb) {
+ xa_unlock_irq(&fs_info->buffer_tree);
+ free_extent_buffer(eb);
+ xa_lock_irq(&fs_info->buffer_tree);
}
- spin_unlock(&fs_info->buffer_lock);
+ xa_unlock_irq(&fs_info->buffer_tree);
btrfs_mapping_tree_free(fs_info);
list_for_each_entry_safe(dev, tmp, &fs_info->fs_devices->devices,
diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index 7b30700ec930..4b59bc480663 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -2171,27 +2171,15 @@ static void wait_eb_writebacks(struct btrfs_block_group *block_group)
{
struct btrfs_fs_info *fs_info = block_group->fs_info;
const u64 end = block_group->start + block_group->length;
- struct radix_tree_iter iter;
struct extent_buffer *eb;
- void __rcu **slot;
+ unsigned long index, start = block_group->start >> fs_info->sectorsize_bits;
rcu_read_lock();
- radix_tree_for_each_slot(slot, &fs_info->buffer_radix, &iter,
- block_group->start >> fs_info->sectorsize_bits) {
- eb = radix_tree_deref_slot(slot);
- if (!eb)
- continue;
- if (radix_tree_deref_retry(eb)) {
- slot = radix_tree_iter_retry(&iter);
- continue;
- }
-
+ xa_for_each_start(&fs_info->buffer_tree, index, eb, start) {
if (eb->start < block_group->start)
continue;
if (eb->start >= end)
break;
-
- slot = radix_tree_iter_resume(slot, &iter);
rcu_read_unlock();
wait_on_extent_buffer_writeback(eb);
rcu_read_lock();
--
2.48.1
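
The comment added in release_extent_buffer() explains that the eb is removed with a compare-and-exchange rather than an unconditional erase, because the eb being released may never have been inserted and the slot may hold a different eb. A minimal userspace sketch of that "remove only if the slot still holds this entry" idea, assuming C11 atomics in place of xa_cmpxchg_irq() (`struct buf` and `remove_if_present()` are hypothetical names, not btrfs or xarray API):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct extent_buffer. */
struct buf {
	int id;
};

/*
 * Clear @slot only if it currently holds @b. Returns 1 if @b was removed,
 * 0 if the slot held something else (or nothing) and was left untouched.
 */
static int remove_if_present(_Atomic(struct buf *) *slot, struct buf *b)
{
	struct buf *expected = b;
	struct buf *none = NULL;

	return atomic_compare_exchange_strong(slot, &expected, none);
}

/* Usage: releasing a buffer that was never inserted leaves the slot alone. */
static void demo_remove(void)
{
	struct buf inserted = { .id = 1 };
	struct buf never_inserted = { .id = 2 };
	_Atomic(struct buf *) slot = &inserted;

	assert(!remove_if_present(&slot, &never_inserted));
	assert(atomic_load(&slot) == &inserted);
	assert(remove_if_present(&slot, &inserted));
	assert(atomic_load(&slot) == NULL);
}
```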
* Re: [PATCH v5 1/3] btrfs: convert the buffer_radix to an xarray
2025-04-28 14:52 ` [PATCH v5 1/3] btrfs: convert the buffer_radix to an xarray Josef Bacik
@ 2025-05-07 9:31 ` Qu Wenruo
0 siblings, 0 replies; 9+ messages in thread
From: Qu Wenruo @ 2025-05-07 9:31 UTC
To: Josef Bacik, linux-btrfs, kernel-team; +Cc: Filipe Manana
On 2025/4/29 00:22, Josef Bacik wrote:
> In order to fully utilize xarray tagging to improve writeback we need to
> convert the buffer_radix to a proper xarray. This conversion is
> relatively straightforward as the radix code uses the xarray underneath.
> Using xarray directly allows for quite a lot less code.
>
> Reviewed-by: Filipe Manana <fdmanana@suse.com>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Unfortunately, this is causing problems for aarch64 subpage cases, where
the page size is 64K and the fs block size is 4K.
It's causing the extent buffer leak detector to go crazy for every test case:
[ 1766.425913] BTRFS info (device dm-2): last unmount of filesystem
b15f2893-4948-4c0c-a380-03d8b88234a2
[ 1766.430207] BTRFS warning (device dm-2): folio private not zero on
folio 22020096
[ 1766.430224] BTRFS warning (device dm-2): folio private not zero on
folio 30408704
[ 1766.430241] BTRFS warning (device dm-2): folio private not zero on
folio 30539776
[ 1766.432235] ------------[ cut here ]------------
[ 1766.432298] WARNING: CPU: 3 PID: 27284 at extent_io.c:73
btrfs_extent_buffer_leak_debug_check+0x80/0x150 [btrfs]
[ 1766.437876] Modules linked in: btrfs(OE) nls_ascii nls_cp437 vfat fat
polyval_ce polyval_generic ghash_ce rtc_efi xor xor_neon raid6_pq
processor zstd_compress fuse loop nfnetlink qemu_fw_cfg ext4 crc16
mbcache jbd2 dm_mod xhci_pci xhci_hcd virtio_scsi virtio_blk
virtio_balloon virtio_net virtio_console net_failover failover virtio_mmio
[ 1766.437919] Unloaded tainted modules: btrfs(OE):14 [last unloaded:
btrfs(OE)]
[ 1766.443832] CPU: 3 UID: 0 PID: 27284 Comm: umount Tainted: G W
OE 6.15.0-rc5-custom+ #119 PREEMPT(voluntary)
[ 1766.445617] Tainted: [W]=WARN, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[ 1766.446621] Hardware name: QEMU KVM Virtual Machine, BIOS unknown
2/2/2022
[ 1766.447717] pstate: 20400005 (nzCv daif +PAN -UAO -TCO -DIT -SSBS
BTYPE=--)
[ 1766.448827] pc : btrfs_extent_buffer_leak_debug_check+0x80/0x150 [btrfs]
[ 1766.450093] lr : btrfs_free_fs_info+0x134/0x1a0 [btrfs]
[ 1766.451101] sp : ffff80008588fcc0
[ 1766.451654] x29: ffff80008588fcc0 x28: ffff03e6ffe31d80 x27:
0000000000000000
[ 1766.452827] x26: 0000000000000000 x25: 0000000000000000 x24:
0000000000000000
[ 1766.454001] x23: ffff03e6ffe3285c x22: ffff03e7c1446e00 x21:
ffff03e7c1446ef0
[ 1766.455174] x20: ffff03e7c1446000 x19: ffff03e7c1446000 x18:
0000000000000228
[ 1766.456354] x17: ffffffdfc0000000 x16: ffffcedfe8577f48 x15:
ffff03e70a3dc4c8
[ 1766.457522] x14: 0000000000000000 x13: ffff03e6e0075c10 x12:
ffff03e70a3dc430
[ 1766.458690] x11: 0000000000000012 x10: ffff03e6e0075c18 x9 :
ffffcedfcfd0babc
[ 1766.459860] x8 : ffff80008588fcd0 x7 : fefefefefefefefe x6 :
0000000000000000
[ 1766.461025] x5 : 0000000000000000 x4 : ffffffe0b9bb8a20 x3 :
000000008020001d
[ 1766.462189] x2 : 0000000000000000 x1 : 0000000000000000 x0 :
ffff03e6edc665b8
[ 1766.463360] Call trace:
[ 1766.463763] btrfs_extent_buffer_leak_debug_check+0x80/0x150 [btrfs] (P)
[ 1766.465002] btrfs_free_fs_info+0x134/0x1a0 [btrfs]
[ 1766.465938] btrfs_kill_super+0x28/0x38 [btrfs]
[ 1766.466823] deactivate_locked_super+0x4c/0x100
[ 1766.467577] deactivate_super+0x8c/0xb0
[ 1766.468209] cleanup_mnt+0xb4/0x150
[ 1766.468796] __cleanup_mnt+0x1c/0x30
[ 1766.469391] task_work_run+0x80/0x110
[ 1766.469997] do_notify_resume+0x158/0x190
[ 1766.470665] el0_svc+0xf4/0x148
[ 1766.471183] el0t_64_sync_handler+0x10c/0x140
[ 1766.471898] el0t_64_sync+0x198/0x1a0
[ 1766.472505] ---[ end trace 0000000000000000 ]---
[ 1766.473363] BTRFS: buffer leak start 30588928 len 16384 refs 1 bflags
5 owner 10
Thanks,
Qu
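
One behavioural difference visible in the quoted diff, which may or may not be related to the leak reported above (the report does not name a cause): the removed get_next_extent_buffer() returned the first eb at or after a bytenr within the folio, while the new loop in try_release_subpage_extent_buffer() calls xa_load() on one exact index and stops when that slot is empty. The userspace sketch below only illustrates that difference; the slot array, `load_exact()`, `find_from()` and `count_visited()` are hypothetical names, the folio is assumed to start at byte 0, and the sizes are taken from the report (64K page, 4K block, len 16384).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_BITS 12u                         /* 4K fs block size */
#define PAGE_BYTES  (64u * 1024u)               /* 64K page size */
#define NODE_BYTES  (16u * 1024u)               /* eb length from the leak line */
#define NSLOTS      (PAGE_BYTES >> SECTOR_BITS) /* 16 indices per folio */

/* Hypothetical stand-in for struct extent_buffer. */
struct buf {
	uint64_t start;
	uint32_t len;
};

typedef struct buf *(*lookup_fn)(struct buf *slots[], size_t index);

/* Exact lookup: returns only an entry stored at @index. */
static struct buf *load_exact(struct buf *slots[], size_t index)
{
	return index < NSLOTS ? slots[index] : NULL;
}

/* "Next" lookup: returns the first entry stored at or after @index. */
static struct buf *find_from(struct buf *slots[], size_t index)
{
	for (; index < NSLOTS; index++) {
		if (slots[index])
			return slots[index];
	}
	return NULL;
}

/* Walk one folio in the same shape as the release loop, counting buffers seen. */
static int count_visited(struct buf *slots[], lookup_fn lookup)
{
	uint64_t cur = 0;
	int visited = 0;

	while (cur < PAGE_BYTES) {
		struct buf *b = lookup(slots, (size_t)(cur >> SECTOR_BITS));

		if (!b)
			break;
		cur = b->start + b->len;
		visited++;
	}
	return visited;
}

/* Usage: a folio whose first 16K holds no buffer, with one buffer after it. */
static void demo_lookup(void)
{
	struct buf b = { .start = NODE_BYTES, .len = NODE_BYTES };
	struct buf *slots[NSLOTS] = { 0 };

	slots[b.start >> SECTOR_BITS] = &b;
	assert(count_visited(slots, load_exact) == 0);
	assert(count_visited(slots, find_from) == 1);
}
```

With the exact lookup the walk ends at the first empty index, so a buffer that does not start at the beginning of the folio is never visited.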
> ---
> fs/btrfs/disk-io.c | 15 ++-
> fs/btrfs/extent_io.c | 199 ++++++++++++++++-------------------
> fs/btrfs/fs.h | 4 +-
> fs/btrfs/tests/btrfs-tests.c | 28 ++---
> fs/btrfs/zoned.c | 16 +--
> 5 files changed, 112 insertions(+), 150 deletions(-)
>
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 59da809b7d57..24c08eb86b7b 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -2762,10 +2762,22 @@ static int __cold init_tree_roots(struct btrfs_fs_info *fs_info)
> return ret;
> }
>
> +/*
> + * lockdep gets confused between our buffer_tree which requires IRQ locking
> + * because we modify marks in the IRQ context, and our delayed inode xarray
> + * which doesn't have these requirements. Use a class key so lockdep doesn't get
> + * them mixed up.
> + */
> +static struct lock_class_key buffer_xa_class;
> +
> void btrfs_init_fs_info(struct btrfs_fs_info *fs_info)
> {
> INIT_RADIX_TREE(&fs_info->fs_roots_radix, GFP_ATOMIC);
> - INIT_RADIX_TREE(&fs_info->buffer_radix, GFP_ATOMIC);
> +
> + /* Use the same flags as mapping->i_pages. */
> + xa_init_flags(&fs_info->buffer_tree, XA_FLAGS_LOCK_IRQ | XA_FLAGS_ACCOUNT);
> + lockdep_set_class(&fs_info->buffer_tree.xa_lock, &buffer_xa_class);
> +
> INIT_LIST_HEAD(&fs_info->trans_list);
> INIT_LIST_HEAD(&fs_info->dead_roots);
> INIT_LIST_HEAD(&fs_info->delayed_iputs);
> @@ -2777,7 +2789,6 @@ void btrfs_init_fs_info(struct btrfs_fs_info *fs_info)
> spin_lock_init(&fs_info->delayed_iput_lock);
> spin_lock_init(&fs_info->defrag_inodes_lock);
> spin_lock_init(&fs_info->super_lock);
> - spin_lock_init(&fs_info->buffer_lock);
> spin_lock_init(&fs_info->unused_bgs_lock);
> spin_lock_init(&fs_info->treelog_bg_lock);
> spin_lock_init(&fs_info->zone_active_bgs_lock);
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index 6cfd286b8bbc..bedcacaf809f 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -1893,19 +1893,17 @@ static void set_btree_ioerr(struct extent_buffer *eb)
> * context.
> */
> static struct extent_buffer *find_extent_buffer_nolock(
> - const struct btrfs_fs_info *fs_info, u64 start)
> + struct btrfs_fs_info *fs_info, u64 start)
> {
> struct extent_buffer *eb;
> + unsigned long index = start >> fs_info->sectorsize_bits;
>
> rcu_read_lock();
> - eb = radix_tree_lookup(&fs_info->buffer_radix,
> - start >> fs_info->sectorsize_bits);
> - if (eb && atomic_inc_not_zero(&eb->refs)) {
> - rcu_read_unlock();
> - return eb;
> - }
> + eb = xa_load(&fs_info->buffer_tree, index);
> + if (eb && !atomic_inc_not_zero(&eb->refs))
> + eb = NULL;
> rcu_read_unlock();
> - return NULL;
> + return eb;
> }
>
> static void end_bbio_meta_write(struct btrfs_bio *bbio)
> @@ -2769,11 +2767,10 @@ static void detach_extent_buffer_folio(const struct extent_buffer *eb, struct fo
>
> if (!btrfs_meta_is_subpage(fs_info)) {
> /*
> - * We do this since we'll remove the pages after we've
> - * removed the eb from the radix tree, so we could race
> - * and have this page now attached to the new eb. So
> - * only clear folio if it's still connected to
> - * this eb.
> + * We do this since we'll remove the pages after we've removed
> + * the eb from the xarray, so we could race and have this page
> + * now attached to the new eb. So only clear folio if it's
> + * still connected to this eb.
> */
> if (folio_test_private(folio) && folio_get_private(folio) == eb) {
> BUG_ON(test_bit(EXTENT_BUFFER_DIRTY, &eb->bflags));
> @@ -2938,9 +2935,9 @@ static void check_buffer_tree_ref(struct extent_buffer *eb)
> {
> int refs;
> /*
> - * The TREE_REF bit is first set when the extent_buffer is added
> - * to the radix tree. It is also reset, if unset, when a new reference
> - * is created by find_extent_buffer.
> + * The TREE_REF bit is first set when the extent_buffer is added to the
> + * xarray. It is also reset, if unset, when a new reference is created
> + * by find_extent_buffer.
> *
> * It is only cleared in two cases: freeing the last non-tree
> * reference to the extent_buffer when its STALE bit is set or
> @@ -2952,13 +2949,12 @@ static void check_buffer_tree_ref(struct extent_buffer *eb)
> * conditions between the calls to check_buffer_tree_ref in those
> * codepaths and clearing TREE_REF in try_release_extent_buffer.
> *
> - * The actual lifetime of the extent_buffer in the radix tree is
> - * adequately protected by the refcount, but the TREE_REF bit and
> - * its corresponding reference are not. To protect against this
> - * class of races, we call check_buffer_tree_ref from the codepaths
> - * which trigger io. Note that once io is initiated, TREE_REF can no
> - * longer be cleared, so that is the moment at which any such race is
> - * best fixed.
> + * The actual lifetime of the extent_buffer in the xarray is adequately
> + * protected by the refcount, but the TREE_REF bit and its corresponding
> + * reference are not. To protect against this class of races, we call
> + * check_buffer_tree_ref from the codepaths which trigger io. Note that
> + * once io is initiated, TREE_REF can no longer be cleared, so that is
> + * the moment at which any such race is best fixed.
> */
> refs = atomic_read(&eb->refs);
> if (refs >= 2 && test_bit(EXTENT_BUFFER_TREE_REF, &eb->bflags))
> @@ -3022,23 +3018,26 @@ struct extent_buffer *alloc_test_extent_buffer(struct btrfs_fs_info *fs_info,
> return ERR_PTR(-ENOMEM);
> eb->fs_info = fs_info;
> again:
> - ret = radix_tree_preload(GFP_NOFS);
> - if (ret) {
> - exists = ERR_PTR(ret);
> + xa_lock_irq(&fs_info->buffer_tree);
> + exists = __xa_cmpxchg(&fs_info->buffer_tree,
> + start >> fs_info->sectorsize_bits, NULL, eb,
> + GFP_NOFS);
> + if (xa_is_err(exists)) {
> + ret = xa_err(exists);
> + xa_unlock_irq(&fs_info->buffer_tree);
> + btrfs_release_extent_buffer(eb);
> + return ERR_PTR(ret);
> + }
> + if (exists) {
> + if (!atomic_inc_not_zero(&exists->refs)) {
> + /* The extent buffer is being freed, retry. */
> + xa_unlock_irq(&fs_info->buffer_tree);
> + goto again;
> + }
> + xa_unlock_irq(&fs_info->buffer_tree);
> goto free_eb;
> }
> - spin_lock(&fs_info->buffer_lock);
> - ret = radix_tree_insert(&fs_info->buffer_radix,
> - start >> fs_info->sectorsize_bits, eb);
> - spin_unlock(&fs_info->buffer_lock);
> - radix_tree_preload_end();
> - if (ret == -EEXIST) {
> - exists = find_extent_buffer(fs_info, start);
> - if (exists)
> - goto free_eb;
> - else
> - goto again;
> - }
> + xa_unlock_irq(&fs_info->buffer_tree);
> check_buffer_tree_ref(eb);
>
> return eb;
> @@ -3059,9 +3058,9 @@ static struct extent_buffer *grab_extent_buffer(struct btrfs_fs_info *fs_info,
> lockdep_assert_held(&folio->mapping->i_private_lock);
>
> /*
> - * For subpage case, we completely rely on radix tree to ensure we
> - * don't try to insert two ebs for the same bytenr. So here we always
> - * return NULL and just continue.
> + * For subpage case, we completely rely on xarray to ensure we don't try
> + * to insert two ebs for the same bytenr. So here we always return NULL
> + * and just continue.
> */
> if (btrfs_meta_is_subpage(fs_info))
> return NULL;
> @@ -3194,7 +3193,7 @@ static int attach_eb_folio_to_filemap(struct extent_buffer *eb, int i,
> /*
> * To inform we have an extra eb under allocation, so that
> * detach_extent_buffer_page() won't release the folio private when the
> - * eb hasn't been inserted into radix tree yet.
> + * eb hasn't been inserted into the xarray yet.
> *
> * The ref will be decreased when the eb releases the page, in
> * detach_extent_buffer_page(). Thus needs no special handling in the
> @@ -3328,10 +3327,10 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
>
> /*
> * We can't unlock the pages just yet since the extent buffer
> - * hasn't been properly inserted in the radix tree, this
> - * opens a race with btree_release_folio which can free a page
> - * while we are still filling in all pages for the buffer and
> - * we could crash.
> + * hasn't been properly inserted in the xarray, this opens a
> + * race with btree_release_folio which can free a page while we
> + * are still filling in all pages for the buffer and we could
> + * crash.
> */
> }
> if (uptodate)
> @@ -3340,23 +3339,25 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
> if (page_contig)
> eb->addr = folio_address(eb->folios[0]) + offset_in_page(eb->start);
> again:
> - ret = radix_tree_preload(GFP_NOFS);
> - if (ret)
> + xa_lock_irq(&fs_info->buffer_tree);
> + existing_eb = __xa_cmpxchg(&fs_info->buffer_tree,
> + start >> fs_info->sectorsize_bits, NULL, eb,
> + GFP_NOFS);
> + if (xa_is_err(existing_eb)) {
> + ret = xa_err(existing_eb);
> + xa_unlock_irq(&fs_info->buffer_tree);
> goto out;
> -
> - spin_lock(&fs_info->buffer_lock);
> - ret = radix_tree_insert(&fs_info->buffer_radix,
> - start >> fs_info->sectorsize_bits, eb);
> - spin_unlock(&fs_info->buffer_lock);
> - radix_tree_preload_end();
> - if (ret == -EEXIST) {
> - ret = 0;
> - existing_eb = find_extent_buffer(fs_info, start);
> - if (existing_eb)
> - goto out;
> - else
> - goto again;
> }
> + if (existing_eb) {
> + if (!atomic_inc_not_zero(&existing_eb->refs)) {
> + xa_unlock_irq(&fs_info->buffer_tree);
> + goto again;
> + }
> + xa_unlock_irq(&fs_info->buffer_tree);
> + goto out;
> + }
> + xa_unlock_irq(&fs_info->buffer_tree);
> +
> /* add one reference for the tree */
> check_buffer_tree_ref(eb);
>
> @@ -3426,10 +3427,23 @@ static int release_extent_buffer(struct extent_buffer *eb)
>
> spin_unlock(&eb->refs_lock);
>
> - spin_lock(&fs_info->buffer_lock);
> - radix_tree_delete_item(&fs_info->buffer_radix,
> - eb->start >> fs_info->sectorsize_bits, eb);
> - spin_unlock(&fs_info->buffer_lock);
> + /*
> + * We're erasing, theoretically there will be no allocations, so
> + * just use GFP_ATOMIC.
> + *
> + * We use cmpxchg instead of erase because we do not know if
> + * this eb is actually in the tree or not, we could be cleaning
> + * up an eb that we allocated but never inserted into the tree.
> + * Thus use cmpxchg to remove it from the tree if it is there,
> + * or leave the other entry if this isn't in the tree.
> + *
> + * The documentation says that putting a NULL value is the same
> + * as erase as long as XA_FLAGS_ALLOC is not set, which it isn't
> + * in this case.
> + */
> + xa_cmpxchg_irq(&fs_info->buffer_tree,
> + eb->start >> fs_info->sectorsize_bits, eb, NULL,
> + GFP_ATOMIC);
>
> btrfs_leak_debug_del_eb(eb);
> /* Should be safe to release folios at this point. */
> @@ -4260,44 +4274,6 @@ void memmove_extent_buffer(const struct extent_buffer *dst,
> }
> }
>
> -#define GANG_LOOKUP_SIZE 16
> -static struct extent_buffer *get_next_extent_buffer(
> - const struct btrfs_fs_info *fs_info, struct folio *folio, u64 bytenr)
> -{
> - struct extent_buffer *gang[GANG_LOOKUP_SIZE];
> - struct extent_buffer *found = NULL;
> - u64 folio_start = folio_pos(folio);
> - u64 cur = folio_start;
> -
> - ASSERT(in_range(bytenr, folio_start, PAGE_SIZE));
> - lockdep_assert_held(&fs_info->buffer_lock);
> -
> - while (cur < folio_start + PAGE_SIZE) {
> - int ret;
> - int i;
> -
> - ret = radix_tree_gang_lookup(&fs_info->buffer_radix,
> - (void **)gang, cur >> fs_info->sectorsize_bits,
> - min_t(unsigned int, GANG_LOOKUP_SIZE,
> - PAGE_SIZE / fs_info->nodesize));
> - if (ret == 0)
> - goto out;
> - for (i = 0; i < ret; i++) {
> - /* Already beyond page end */
> - if (gang[i]->start >= folio_start + PAGE_SIZE)
> - goto out;
> - /* Found one */
> - if (gang[i]->start >= bytenr) {
> - found = gang[i];
> - goto out;
> - }
> - }
> - cur = gang[ret - 1]->start + gang[ret - 1]->len;
> - }
> -out:
> - return found;
> -}
> -
> static int try_release_subpage_extent_buffer(struct folio *folio)
> {
> struct btrfs_fs_info *fs_info = folio_to_fs_info(folio);
> @@ -4306,21 +4282,22 @@ static int try_release_subpage_extent_buffer(struct folio *folio)
> int ret;
>
> while (cur < end) {
> + unsigned long index = cur >> fs_info->sectorsize_bits;
> struct extent_buffer *eb = NULL;
>
> /*
> * Unlike try_release_extent_buffer() which uses folio private
> - * to grab buffer, for subpage case we rely on radix tree, thus
> - * we need to ensure radix tree consistency.
> + * to grab buffer, for subpage case we rely on xarray, thus we
> + * need to ensure xarray tree consistency.
> *
> - * We also want an atomic snapshot of the radix tree, thus go
> + * We also want an atomic snapshot of the xarray tree, thus go
> * with spinlock rather than RCU.
> */
> - spin_lock(&fs_info->buffer_lock);
> - eb = get_next_extent_buffer(fs_info, folio, cur);
> + xa_lock_irq(&fs_info->buffer_tree);
> + eb = xa_load(&fs_info->buffer_tree, index);
> if (!eb) {
> /* No more eb in the page range after or at cur */
> - spin_unlock(&fs_info->buffer_lock);
> + xa_unlock_irq(&fs_info->buffer_tree);
> break;
> }
> cur = eb->start + eb->len;
> @@ -4332,10 +4309,10 @@ static int try_release_subpage_extent_buffer(struct folio *folio)
> spin_lock(&eb->refs_lock);
> if (atomic_read(&eb->refs) != 1 || extent_buffer_under_io(eb)) {
> spin_unlock(&eb->refs_lock);
> - spin_unlock(&fs_info->buffer_lock);
> + xa_unlock_irq(&fs_info->buffer_tree);
> break;
> }
> - spin_unlock(&fs_info->buffer_lock);
> + xa_unlock_irq(&fs_info->buffer_tree);
>
> /*
> * If tree ref isn't set then we know the ref on this eb is a
> diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h
> index bcca43046064..ed02d276d908 100644
> --- a/fs/btrfs/fs.h
> +++ b/fs/btrfs/fs.h
> @@ -776,10 +776,8 @@ struct btrfs_fs_info {
>
> struct btrfs_delayed_root *delayed_root;
>
> - /* Extent buffer radix tree */
> - spinlock_t buffer_lock;
> /* Entries are eb->start / sectorsize */
> - struct radix_tree_root buffer_radix;
> + struct xarray buffer_tree;
>
> /* Next backup root to be overwritten */
> int backup_root_index;
> diff --git a/fs/btrfs/tests/btrfs-tests.c b/fs/btrfs/tests/btrfs-tests.c
> index 02a915eb51fb..b576897d71cc 100644
> --- a/fs/btrfs/tests/btrfs-tests.c
> +++ b/fs/btrfs/tests/btrfs-tests.c
> @@ -157,9 +157,9 @@ struct btrfs_fs_info *btrfs_alloc_dummy_fs_info(u32 nodesize, u32 sectorsize)
>
> void btrfs_free_dummy_fs_info(struct btrfs_fs_info *fs_info)
> {
> - struct radix_tree_iter iter;
> - void **slot;
> struct btrfs_device *dev, *tmp;
> + struct extent_buffer *eb;
> + unsigned long index;
>
> if (!fs_info)
> return;
> @@ -169,25 +169,13 @@ void btrfs_free_dummy_fs_info(struct btrfs_fs_info *fs_info)
>
> test_mnt->mnt_sb->s_fs_info = NULL;
>
> - spin_lock(&fs_info->buffer_lock);
> - radix_tree_for_each_slot(slot, &fs_info->buffer_radix, &iter, 0) {
> - struct extent_buffer *eb;
> -
> - eb = radix_tree_deref_slot_protected(slot, &fs_info->buffer_lock);
> - if (!eb)
> - continue;
> - /* Shouldn't happen but that kind of thinking creates CVE's */
> - if (radix_tree_exception(eb)) {
> - if (radix_tree_deref_retry(eb))
> - slot = radix_tree_iter_retry(&iter);
> - continue;
> - }
> - slot = radix_tree_iter_resume(slot, &iter);
> - spin_unlock(&fs_info->buffer_lock);
> - free_extent_buffer_stale(eb);
> - spin_lock(&fs_info->buffer_lock);
> + xa_lock_irq(&fs_info->buffer_tree);
> + xa_for_each(&fs_info->buffer_tree, index, eb) {
> + xa_unlock_irq(&fs_info->buffer_tree);
> + free_extent_buffer(eb);
> + xa_lock_irq(&fs_info->buffer_tree);
> }
> - spin_unlock(&fs_info->buffer_lock);
> + xa_unlock_irq(&fs_info->buffer_tree);
>
> btrfs_mapping_tree_free(fs_info);
> list_for_each_entry_safe(dev, tmp, &fs_info->fs_devices->devices,
> diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
> index 7b30700ec930..4b59bc480663 100644
> --- a/fs/btrfs/zoned.c
> +++ b/fs/btrfs/zoned.c
> @@ -2171,27 +2171,15 @@ static void wait_eb_writebacks(struct btrfs_block_group *block_group)
> {
> struct btrfs_fs_info *fs_info = block_group->fs_info;
> const u64 end = block_group->start + block_group->length;
> - struct radix_tree_iter iter;
> struct extent_buffer *eb;
> - void __rcu **slot;
> + unsigned long index, start = block_group->start >> fs_info->sectorsize_bits;
>
> rcu_read_lock();
> - radix_tree_for_each_slot(slot, &fs_info->buffer_radix, &iter,
> - block_group->start >> fs_info->sectorsize_bits) {
> - eb = radix_tree_deref_slot(slot);
> - if (!eb)
> - continue;
> - if (radix_tree_deref_retry(eb)) {
> - slot = radix_tree_iter_retry(&iter);
> - continue;
> - }
> -
> + xa_for_each_start(&fs_info->buffer_tree, index, eb, start) {
> if (eb->start < block_group->start)
> continue;
> if (eb->start >= end)
> break;
> -
> - slot = radix_tree_iter_resume(slot, &iter);
> rcu_read_unlock();
> wait_on_extent_buffer_writeback(eb);
> rcu_read_lock();
* [PATCH v5 2/3] btrfs: set DIRTY and WRITEBACK tags on the buffer_tree
2025-04-28 14:52 [PATCH v5 0/3] btrfs: simplify extent buffer writeback Josef Bacik
2025-04-28 14:52 ` [PATCH v5 1/3] btrfs: convert the buffer_radix to an xarray Josef Bacik
@ 2025-04-28 14:52 ` Josef Bacik
2025-04-28 14:52 ` [PATCH v5 3/3] btrfs: use buffer xarray for extent buffer writeback operations Josef Bacik
2 siblings, 0 replies; 9+ messages in thread
From: Josef Bacik @ 2025-04-28 14:52 UTC (permalink / raw)
To: linux-btrfs, kernel-team; +Cc: Filipe Manana
In preparation for changing how we do writeout of extent buffers, start
tagging the extent buffer xarray with DIRTY and WRITEBACK to make it
easier to find extent buffers that are in either state.
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
fs/btrfs/extent_io.c | 38 ++++++++++++++++++++++++++++++++++++++
1 file changed, 38 insertions(+)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index bedcacaf809f..daab6373c6a4 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -1801,8 +1801,19 @@ static noinline_for_stack bool lock_extent_buffer_for_io(struct extent_buffer *e
*/
spin_lock(&eb->refs_lock);
if (test_and_clear_bit(EXTENT_BUFFER_DIRTY, &eb->bflags)) {
+ XA_STATE(xas, &fs_info->buffer_tree,
+ eb->start >> fs_info->sectorsize_bits);
+ unsigned long flags;
+
set_bit(EXTENT_BUFFER_WRITEBACK, &eb->bflags);
spin_unlock(&eb->refs_lock);
+
+ xas_lock_irqsave(&xas, flags);
+ xas_load(&xas);
+ xas_set_mark(&xas, PAGECACHE_TAG_WRITEBACK);
+ xas_clear_mark(&xas, PAGECACHE_TAG_DIRTY);
+ xas_unlock_irqrestore(&xas, flags);
+
btrfs_set_header_flag(eb, BTRFS_HEADER_FLAG_WRITTEN);
percpu_counter_add_batch(&fs_info->dirty_metadata_bytes,
-eb->len,
@@ -1888,6 +1899,30 @@ static void set_btree_ioerr(struct extent_buffer *eb)
}
}
+static void buffer_tree_set_mark(const struct extent_buffer *eb, xa_mark_t mark)
+{
+ struct btrfs_fs_info *fs_info = eb->fs_info;
+ XA_STATE(xas, &fs_info->buffer_tree, eb->start >> fs_info->sectorsize_bits);
+ unsigned long flags;
+
+ xas_lock_irqsave(&xas, flags);
+ xas_load(&xas);
+ xas_set_mark(&xas, mark);
+ xas_unlock_irqrestore(&xas, flags);
+}
+
+static void buffer_tree_clear_mark(const struct extent_buffer *eb, xa_mark_t mark)
+{
+ struct btrfs_fs_info *fs_info = eb->fs_info;
+ XA_STATE(xas, &fs_info->buffer_tree, eb->start >> fs_info->sectorsize_bits);
+ unsigned long flags;
+
+ xas_lock_irqsave(&xas, flags);
+ xas_load(&xas);
+ xas_clear_mark(&xas, mark);
+ xas_unlock_irqrestore(&xas, flags);
+}
+
/*
* The endio specific version which won't touch any unsafe spinlock in endio
* context.
@@ -1918,6 +1953,7 @@ static void end_bbio_meta_write(struct btrfs_bio *bbio)
btrfs_meta_folio_clear_writeback(fi.folio, eb);
}
+ buffer_tree_clear_mark(eb, PAGECACHE_TAG_WRITEBACK);
clear_bit(EXTENT_BUFFER_WRITEBACK, &eb->bflags);
smp_mb__after_atomic();
wake_up_bit(&eb->bflags, EXTENT_BUFFER_WRITEBACK);
@@ -3544,6 +3580,7 @@ void btrfs_clear_buffer_dirty(struct btrfs_trans_handle *trans,
if (!test_and_clear_bit(EXTENT_BUFFER_DIRTY, &eb->bflags))
return;
+ buffer_tree_clear_mark(eb, PAGECACHE_TAG_DIRTY);
percpu_counter_add_batch(&fs_info->dirty_metadata_bytes, -eb->len,
fs_info->dirty_metadata_batch);
@@ -3592,6 +3629,7 @@ void set_extent_buffer_dirty(struct extent_buffer *eb)
folio_lock(eb->folios[0]);
for (int i = 0; i < num_extent_folios(eb); i++)
btrfs_meta_folio_set_dirty(eb->folios[i], eb);
+ buffer_tree_set_mark(eb, PAGECACHE_TAG_DIRTY);
if (subpage)
folio_unlock(eb->folios[0]);
percpu_counter_add_batch(&eb->fs_info->dirty_metadata_bytes,
--
2.48.1
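
The mark transitions added by this patch can be modelled in userspace as
below. This is only a sketch: `struct toy_tree`, `MARK_DIRTY`,
`MARK_WRITEBACK` and all `toy_*` helpers are hypothetical stand-ins for the
buffer_tree xarray, PAGECACHE_TAG_DIRTY / PAGECACHE_TAG_WRITEBACK and the
buffer_tree_*_mark() helpers, and no locking or IRQ handling is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for PAGECACHE_TAG_DIRTY / PAGECACHE_TAG_WRITEBACK. */
enum { MARK_DIRTY = 1u << 0, MARK_WRITEBACK = 1u << 1 };

#define TOY_SLOTS 8

/* Toy replacement for the buffer_tree: one mark bitmask per index. */
struct toy_tree {
	unsigned int marks[TOY_SLOTS];
};

static void toy_set_mark(struct toy_tree *t, size_t index, unsigned int mark)
{
	assert(index < TOY_SLOTS);
	t->marks[index] |= mark;
}

static void toy_clear_mark(struct toy_tree *t, size_t index, unsigned int mark)
{
	assert(index < TOY_SLOTS);
	t->marks[index] &= ~mark;
}

static bool toy_get_mark(const struct toy_tree *t, size_t index, unsigned int mark)
{
	assert(index < TOY_SLOTS);
	return (t->marks[index] & mark) != 0;
}

/* Mirrors set_extent_buffer_dirty(): the entry becomes DIRTY. */
static void toy_mark_dirty(struct toy_tree *t, size_t index)
{
	toy_set_mark(t, index, MARK_DIRTY);
}

/* Mirrors lock_extent_buffer_for_io(): DIRTY is traded for WRITEBACK. */
static void toy_start_writeback(struct toy_tree *t, size_t index)
{
	toy_set_mark(t, index, MARK_WRITEBACK);
	toy_clear_mark(t, index, MARK_DIRTY);
}

/* Mirrors end_bbio_meta_write(): WRITEBACK is dropped when the write ends. */
static void toy_end_writeback(struct toy_tree *t, size_t index)
{
	toy_clear_mark(t, index, MARK_WRITEBACK);
}
```

In this model an entry is never left with both marks set once writeback has
started, which is the property the later lookup by tag relies on.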
* [PATCH v5 3/3] btrfs: use buffer xarray for extent buffer writeback operations
2025-04-28 14:52 [PATCH v5 0/3] btrfs: simplify extent buffer writeback Josef Bacik
2025-04-28 14:52 ` [PATCH v5 1/3] btrfs: convert the buffer_radix to an xarray Josef Bacik
2025-04-28 14:52 ` [PATCH v5 2/3] btrfs: set DIRTY and WRITEBACK tags on the buffer_tree Josef Bacik
@ 2025-04-28 14:52 ` Josef Bacik
2025-05-26 1:17 ` Shinichiro Kawasaki
2 siblings, 1 reply; 9+ messages in thread
From: Josef Bacik @ 2025-04-28 14:52 UTC (permalink / raw)
To: linux-btrfs, kernel-team; +Cc: Filipe Manana
Currently we have this ugly back and forth with the btree writeback
where we find the folio, find the eb associated with that folio, and
then attempt to writeback. This results in two different paths for
subpage eb's and >= pagesize eb's.
Clean this up by adding our own infrastructure around looking up tag'ed
eb's and writing the eb's out directly. This allows us to unify the
subpage and >= pagesize IO paths, resulting in a much cleaner writeback
path for extent buffers.
I ran this through fsperf on a VM with 8 CPUs and 16GiB of RAM. I used
smallfiles100k, but reduced the files to 1k to make it run faster. The
results are as follows, with the statistically significant improvements
marked with *; there were no regressions. fsperf was run with -n 10 for
both runs, so the baseline is the average of 10 runs and the test is the
average of 10 runs.
smallfiles100k results
metric                    baseline       current         stdev      diff
================================================================================
avg_commit_ms                68.58         58.44          3.35   -14.79% *
commits                     270.60        254.70         16.24    -5.88%
dev_read_iops                   48            48             0     0.00%
dev_read_kbytes               1044          1044             0     0.00%
dev_write_iops           866117.90     850028.10      14292.20    -1.86%
dev_write_kbytes       10939976.40   10605701.20     351330.32    -3.06%
elapsed                      49.30            33          1.64   -33.06% *
end_state_mount_ns     41251498.80   35773220.70    2531205.32   -13.28% *
end_state_umount_ns       1.90e+09      1.50e+09   14186226.85   -21.38% *
max_commit_ms                  139        111.60          9.72   -19.71% *
sys_cpu                       4.90          3.86          0.88   -21.29%
write_bw_bytes         42935768.20   64318451.10    1609415.05    49.80% *
write_clat_ns_mean       366431.69     243202.60      14161.98   -33.63% *
write_clat_ns_p50         49203.20         20992        264.40   -57.34% *
write_clat_ns_p99           827392     653721.60      65904.74   -20.99% *
write_io_kbytes            2035940       2035940             0     0.00%
write_iops                10482.37      15702.75        392.92    49.80% *
write_lat_ns_max          1.01e+08      90516129    3910102.06   -10.29% *
write_lat_ns_mean        366556.19     243308.48      14154.51   -33.62% *
As you can see we get about a 33% decrease in runtime, with a 50%
throughput increase, which is pretty significant.
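
The headline numbers can be cross-checked against the table. The sketch
below assumes the diff column is the plain relative change of "current"
against "baseline"; that is an assumption about how fsperf reports it, and
`pct_diff` is a hypothetical helper name.

```c
/*
 * Relative change of `current` against `baseline`, in percent.
 * Assumed (not confirmed) to match the "diff" column in the table.
 */
static double pct_diff(double baseline, double current)
{
	return (current - baseline) / baseline * 100.0;
}
```

Under that assumption, the elapsed row (49.30 to 33) and the write_iops row
(10482.37 to 15702.75) reproduce the roughly 33% and 50% figures quoted.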
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
fs/btrfs/extent_io.c | 339 ++++++++++++++++++++---------------------
fs/btrfs/extent_io.h | 2 +
fs/btrfs/transaction.c | 5 +-
3 files changed, 169 insertions(+), 177 deletions(-)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index daab6373c6a4..a8f6d1530b53 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -1923,6 +1923,111 @@ static void buffer_tree_clear_mark(const struct extent_buffer *eb, xa_mark_t mar
xas_unlock_irqrestore(&xas, flags);
}
+static void buffer_tree_tag_for_writeback(struct btrfs_fs_info *fs_info,
+ unsigned long start, unsigned long end)
+{
+ XA_STATE(xas, &fs_info->buffer_tree, start);
+ unsigned int tagged = 0;
+ void *eb;
+
+ xas_lock_irq(&xas);
+ xas_for_each_marked(&xas, eb, end, PAGECACHE_TAG_DIRTY) {
+ xas_set_mark(&xas, PAGECACHE_TAG_TOWRITE);
+ if (++tagged % XA_CHECK_SCHED)
+ continue;
+ xas_pause(&xas);
+ xas_unlock_irq(&xas);
+ cond_resched();
+ xas_lock_irq(&xas);
+ }
+ xas_unlock_irq(&xas);
+}
+
+struct eb_batch {
+ unsigned int nr;
+ unsigned int cur;
+ struct extent_buffer *ebs[PAGEVEC_SIZE];
+};
+
+static inline bool eb_batch_add(struct eb_batch *batch, struct extent_buffer *eb)
+{
+ batch->ebs[batch->nr++] = eb;
+ return (batch->nr < PAGEVEC_SIZE);
+}
+
+static inline void eb_batch_init(struct eb_batch *batch)
+{
+ batch->nr = 0;
+ batch->cur = 0;
+}
+
+static inline struct extent_buffer *eb_batch_next(struct eb_batch *batch)
+{
+ if (batch->cur >= batch->nr)
+ return NULL;
+ return batch->ebs[batch->cur++];
+}
+
+static inline void eb_batch_release(struct eb_batch *batch)
+{
+ for (unsigned int i = 0; i < batch->nr; i++)
+ free_extent_buffer(batch->ebs[i]);
+ eb_batch_init(batch);
+}
+
+static inline struct extent_buffer *find_get_eb(struct xa_state *xas, unsigned long max,
+ xa_mark_t mark)
+{
+ struct extent_buffer *eb;
+
+retry:
+ eb = xas_find_marked(xas, max, mark);
+
+ if (xas_retry(xas, eb))
+ goto retry;
+
+ if (!eb)
+ return NULL;
+
+ if (!atomic_inc_not_zero(&eb->refs))
+ goto reset;
+
+ if (unlikely(eb != xas_reload(xas))) {
+ free_extent_buffer(eb);
+ goto reset;
+ }
+
+ return eb;
+reset:
+ xas_reset(xas);
+ goto retry;
+}
+
+static unsigned int buffer_tree_get_ebs_tag(struct btrfs_fs_info *fs_info,
+ unsigned long *start,
+ unsigned long end, xa_mark_t tag,
+ struct eb_batch *batch)
+{
+ XA_STATE(xas, &fs_info->buffer_tree, *start);
+ struct extent_buffer *eb;
+
+ rcu_read_lock();
+ while ((eb = find_get_eb(&xas, end, tag)) != NULL) {
+ if (!eb_batch_add(batch, eb)) {
+ *start = (eb->start + eb->len) >> fs_info->sectorsize_bits;
+ goto out;
+ }
+ }
+ if (end == ULONG_MAX)
+ *start = ULONG_MAX;
+ else
+ *start = end + 1;
+out:
+ rcu_read_unlock();
+
+ return batch->nr;
+}
+
/*
* The endio specific version which won't touch any unsafe spinlock in endio
* context.
@@ -2025,163 +2130,38 @@ static noinline_for_stack void write_one_eb(struct extent_buffer *eb,
}
/*
- * Submit one subpage btree page.
+ * Wait for all eb writeback in the given range to finish.
*
- * The main difference to submit_eb_page() is:
- * - Page locking
- * For subpage, we don't rely on page locking at all.
- *
- * - Flush write bio
- * We only flush bio if we may be unable to fit current extent buffers into
- * current bio.
- *
- * Return >=0 for the number of submitted extent buffers.
- * Return <0 for fatal error.
+ * @fs_info: The fs_info for this file system.
+ * @start: The offset of the range to start waiting on writeback.
+ * @end: The end of the range, inclusive. This is meant to be used in
+ * conjunction with wait_marked_extents, so this will usually be
+ * the_next_eb->start - 1.
*/
-static int submit_eb_subpage(struct folio *folio, struct writeback_control *wbc)
+void btrfs_btree_wait_writeback_range(struct btrfs_fs_info *fs_info, u64 start,
+ u64 end)
{
- struct btrfs_fs_info *fs_info = folio_to_fs_info(folio);
- int submitted = 0;
- u64 folio_start = folio_pos(folio);
- int bit_start = 0;
- int sectors_per_node = fs_info->nodesize >> fs_info->sectorsize_bits;
- const unsigned int blocks_per_folio = btrfs_blocks_per_folio(fs_info, folio);
+ struct eb_batch batch;
+ unsigned long start_index = start >> fs_info->sectorsize_bits;
+ unsigned long end_index = end >> fs_info->sectorsize_bits;
- /* Lock and write each dirty extent buffers in the range */
- while (bit_start < blocks_per_folio) {
- struct btrfs_subpage *subpage = folio_get_private(folio);
+ eb_batch_init(&batch);
+ while (start_index <= end_index) {
struct extent_buffer *eb;
- unsigned long flags;
- u64 start;
+ unsigned int nr_ebs;
- /*
- * Take private lock to ensure the subpage won't be detached
- * in the meantime.
- */
- spin_lock(&folio->mapping->i_private_lock);
- if (!folio_test_private(folio)) {
- spin_unlock(&folio->mapping->i_private_lock);
+ nr_ebs = buffer_tree_get_ebs_tag(fs_info, &start_index,
+ end_index,
+ PAGECACHE_TAG_WRITEBACK,
+ &batch);
+ if (!nr_ebs)
break;
- }
- spin_lock_irqsave(&subpage->lock, flags);
- if (!test_bit(bit_start + btrfs_bitmap_nr_dirty * blocks_per_folio,
- subpage->bitmaps)) {
- spin_unlock_irqrestore(&subpage->lock, flags);
- spin_unlock(&folio->mapping->i_private_lock);
- bit_start += sectors_per_node;
- continue;
- }
- start = folio_start + bit_start * fs_info->sectorsize;
- bit_start += sectors_per_node;
-
- /*
- * Here we just want to grab the eb without touching extra
- * spin locks, so call find_extent_buffer_nolock().
- */
- eb = find_extent_buffer_nolock(fs_info, start);
- spin_unlock_irqrestore(&subpage->lock, flags);
- spin_unlock(&folio->mapping->i_private_lock);
-
- /*
- * The eb has already reached 0 refs thus find_extent_buffer()
- * doesn't return it. We don't need to write back such eb
- * anyway.
- */
- if (!eb)
- continue;
-
- if (lock_extent_buffer_for_io(eb, wbc)) {
- write_one_eb(eb, wbc);
- submitted++;
- }
- free_extent_buffer(eb);
+ while ((eb = eb_batch_next(&batch)) != NULL)
+ wait_on_extent_buffer_writeback(eb);
+ eb_batch_release(&batch);
+ cond_resched();
}
- return submitted;
-}
-
-/*
- * Submit all page(s) of one extent buffer.
- *
- * @page: the page of one extent buffer
- * @eb_context: to determine if we need to submit this page, if current page
- * belongs to this eb, we don't need to submit
- *
- * The caller should pass each page in their bytenr order, and here we use
- * @eb_context to determine if we have submitted pages of one extent buffer.
- *
- * If we have, we just skip until we hit a new page that doesn't belong to
- * current @eb_context.
- *
- * If not, we submit all the page(s) of the extent buffer.
- *
- * Return >0 if we have submitted the extent buffer successfully.
- * Return 0 if we don't need to submit the page, as it's already submitted by
- * previous call.
- * Return <0 for fatal error.
- */
-static int submit_eb_page(struct folio *folio, struct btrfs_eb_write_context *ctx)
-{
- struct writeback_control *wbc = ctx->wbc;
- struct address_space *mapping = folio->mapping;
- struct extent_buffer *eb;
- int ret;
-
- if (!folio_test_private(folio))
- return 0;
-
- if (btrfs_meta_is_subpage(folio_to_fs_info(folio)))
- return submit_eb_subpage(folio, wbc);
-
- spin_lock(&mapping->i_private_lock);
- if (!folio_test_private(folio)) {
- spin_unlock(&mapping->i_private_lock);
- return 0;
- }
-
- eb = folio_get_private(folio);
-
- /*
- * Shouldn't happen and normally this would be a BUG_ON but no point
- * crashing the machine for something we can survive anyway.
- */
- if (WARN_ON(!eb)) {
- spin_unlock(&mapping->i_private_lock);
- return 0;
- }
-
- if (eb == ctx->eb) {
- spin_unlock(&mapping->i_private_lock);
- return 0;
- }
- ret = atomic_inc_not_zero(&eb->refs);
- spin_unlock(&mapping->i_private_lock);
- if (!ret)
- return 0;
-
- ctx->eb = eb;
-
- ret = btrfs_check_meta_write_pointer(eb->fs_info, ctx);
- if (ret) {
- if (ret == -EBUSY)
- ret = 0;
- free_extent_buffer(eb);
- return ret;
- }
-
- if (!lock_extent_buffer_for_io(eb, wbc)) {
- free_extent_buffer(eb);
- return 0;
- }
- /* Implies write in zoned mode. */
- if (ctx->zoned_bg) {
- /* Mark the last eb in the block group. */
- btrfs_schedule_zone_finish_bg(ctx->zoned_bg, eb);
- ctx->zoned_bg->meta_write_pointer += eb->len;
- }
- write_one_eb(eb, wbc);
- free_extent_buffer(eb);
- return 1;
}
int btree_write_cache_pages(struct address_space *mapping,
@@ -2192,25 +2172,27 @@ int btree_write_cache_pages(struct address_space *mapping,
int ret = 0;
int done = 0;
int nr_to_write_done = 0;
- struct folio_batch fbatch;
- unsigned int nr_folios;
- pgoff_t index;
- pgoff_t end; /* Inclusive */
+ struct eb_batch batch;
+ unsigned int nr_ebs;
+ unsigned long index;
+ unsigned long end;
int scanned = 0;
xa_mark_t tag;
- folio_batch_init(&fbatch);
+ eb_batch_init(&batch);
if (wbc->range_cyclic) {
- index = mapping->writeback_index; /* Start from prev offset */
+ index = (mapping->writeback_index << PAGE_SHIFT) >> fs_info->sectorsize_bits;
end = -1;
+
/*
* Start from the beginning does not need to cycle over the
* range, mark it as scanned.
*/
scanned = (index == 0);
} else {
- index = wbc->range_start >> PAGE_SHIFT;
- end = wbc->range_end >> PAGE_SHIFT;
+ index = wbc->range_start >> fs_info->sectorsize_bits;
+ end = wbc->range_end >> fs_info->sectorsize_bits;
+
scanned = 1;
}
if (wbc->sync_mode == WB_SYNC_ALL)
@@ -2220,31 +2202,40 @@ int btree_write_cache_pages(struct address_space *mapping,
btrfs_zoned_meta_io_lock(fs_info);
retry:
if (wbc->sync_mode == WB_SYNC_ALL)
- tag_pages_for_writeback(mapping, index, end);
+ buffer_tree_tag_for_writeback(fs_info, index, end);
while (!done && !nr_to_write_done && (index <= end) &&
- (nr_folios = filemap_get_folios_tag(mapping, &index, end,
- tag, &fbatch))) {
- unsigned i;
+ (nr_ebs = buffer_tree_get_ebs_tag(fs_info, &index, end, tag,
+ &batch))) {
+ struct extent_buffer *eb;
- for (i = 0; i < nr_folios; i++) {
- struct folio *folio = fbatch.folios[i];
+ while ((eb = eb_batch_next(&batch)) != NULL) {
+ ctx.eb = eb;
- ret = submit_eb_page(folio, &ctx);
- if (ret == 0)
+ ret = btrfs_check_meta_write_pointer(eb->fs_info, &ctx);
+ if (ret) {
+ if (ret == -EBUSY)
+ ret = 0;
+ if (ret) {
+ done = 1;
+ break;
+ }
+ free_extent_buffer(eb);
continue;
- if (ret < 0) {
- done = 1;
- break;
}
- /*
- * the filesystem may choose to bump up nr_to_write.
- * We have to make sure to honor the new nr_to_write
- * at any time
- */
- nr_to_write_done = wbc->nr_to_write <= 0;
+ if (!lock_extent_buffer_for_io(eb, wbc))
+ continue;
+
+ /* Implies write in zoned mode. */
+ if (ctx.zoned_bg) {
+ /* Mark the last eb in the block group. */
+ btrfs_schedule_zone_finish_bg(ctx.zoned_bg, eb);
+ ctx.zoned_bg->meta_write_pointer += eb->len;
+ }
+ write_one_eb(eb, wbc);
}
- folio_batch_release(&fbatch);
+ nr_to_write_done = wbc->nr_to_write <= 0;
+ eb_batch_release(&batch);
cond_resched();
}
if (!scanned && !done) {
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index b344162f790c..b7c1cd0b3c20 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -240,6 +240,8 @@ void extent_write_locked_range(struct inode *inode, const struct folio *locked_f
int btrfs_writepages(struct address_space *mapping, struct writeback_control *wbc);
int btree_write_cache_pages(struct address_space *mapping,
struct writeback_control *wbc);
+void btrfs_btree_wait_writeback_range(struct btrfs_fs_info *fs_info, u64 start,
+ u64 end);
void btrfs_readahead(struct readahead_control *rac);
int set_folio_extent_mapped(struct folio *folio);
void clear_folio_extent_mapped(struct folio *folio);
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 39e48bf610a1..a538a85ac2bd 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -1155,7 +1155,7 @@ int btrfs_write_marked_extents(struct btrfs_fs_info *fs_info,
if (!ret)
ret = filemap_fdatawrite_range(mapping, start, end);
if (!ret && wait_writeback)
- ret = filemap_fdatawait_range(mapping, start, end);
+ btrfs_btree_wait_writeback_range(fs_info, start, end);
btrfs_free_extent_state(cached_state);
if (ret)
break;
@@ -1175,7 +1175,6 @@ int btrfs_write_marked_extents(struct btrfs_fs_info *fs_info,
static int __btrfs_wait_marked_extents(struct btrfs_fs_info *fs_info,
struct extent_io_tree *dirty_pages)
{
- struct address_space *mapping = fs_info->btree_inode->i_mapping;
struct extent_state *cached_state = NULL;
u64 start = 0;
u64 end;
@@ -1196,7 +1195,7 @@ static int __btrfs_wait_marked_extents(struct btrfs_fs_info *fs_info,
if (ret == -ENOMEM)
ret = 0;
if (!ret)
- ret = filemap_fdatawait_range(mapping, start, end);
+ btrfs_btree_wait_writeback_range(fs_info, start, end);
btrfs_free_extent_state(cached_state);
if (ret)
break;
--
2.48.1
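
The batched lookup used by the new writeback loop follows a fill, drain,
release cycle. A userspace sketch of that pattern is below. All `toy_*`
names and `TOY_BATCH_SIZE` are hypothetical; the patch itself uses
struct eb_batch with PAGEVEC_SIZE slots, holds a reference on each extent
buffer, and walks the xarray by mark rather than a plain array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_BATCH_SIZE 4	/* hypothetical; the patch uses PAGEVEC_SIZE */

struct toy_batch {
	unsigned int nr;	/* number of filled slots */
	unsigned int cur;	/* next slot to hand out */
	int items[TOY_BATCH_SIZE];
};

/* Also serves as "release" here, since the toy holds no references. */
static void toy_batch_init(struct toy_batch *b)
{
	b->nr = 0;
	b->cur = 0;
}

/* Returns false once the batch is full, telling the producer to stop. */
static bool toy_batch_add(struct toy_batch *b, int item)
{
	assert(b->nr < TOY_BATCH_SIZE);
	b->items[b->nr++] = item;
	return b->nr < TOY_BATCH_SIZE;
}

/* Hands out items in insertion order; -1 marks the end (items are >= 0). */
static int toy_batch_next(struct toy_batch *b)
{
	if (b->cur >= b->nr)
		return -1;
	return b->items[b->cur++];
}

/*
 * Fill the batch from src[*pos..len) and advance *pos past what was taken,
 * so the next call resumes where this one stopped. Returns the batch size.
 */
static unsigned int toy_fill(struct toy_batch *b, const int *src, size_t len,
			     size_t *pos)
{
	while (*pos < len) {
		if (!toy_batch_add(b, src[(*pos)++]))
			break;
	}
	return b->nr;
}
```

The caller loops until a fill returns 0, draining and resetting the batch in
between, which is the same shape as the while loop in
btree_write_cache_pages() above.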
* Re: [PATCH v5 3/3] btrfs: use buffer xarray for extent buffer writeback operations
2025-04-28 14:52 ` [PATCH v5 3/3] btrfs: use buffer xarray for extent buffer writeback operations Josef Bacik
@ 2025-05-26 1:17 ` Shinichiro Kawasaki
2025-05-26 4:20 ` Qu Wenruo
0 siblings, 1 reply; 9+ messages in thread
From: Shinichiro Kawasaki @ 2025-05-26 1:17 UTC (permalink / raw)
To: Josef Bacik
Cc: linux-btrfs@vger.kernel.org, kernel-team@fb.com, Filipe Manana,
Johannes Thumshirn, Naohiro Aota, Damien Le Moal
On Apr 28, 2025 / 10:52, Josef Bacik wrote:
> Currently we have this ugly back and forth with the btree writeback
> where we find the folio, find the eb associated with that folio, and
> then attempt to writeback. This results in two different paths for
> subpage eb's and >= pagesize eb's.
>
> Clean this up by adding our own infrastructure around looking up tag'ed
> eb's and writing the eb's out directly. This allows us to unify the
> subpage and >= pagesize IO paths, resulting in a much cleaner writeback
> path for extent buffers.
[...]
When I ran blktests on the for-next kernel with the tag next-20250521, I
observed that the test case zbd/009 failed with repeated WARNs at
release_extent_buffer() [1]. I bisected and found that this patch, as commit
5e121ae687b8, is the trigger. When I revert the commit from the tag
next-20250521, the WARNs disappear and the test case passes.
The test case creates a zoned btrfs on scsi_debug and runs fio. I guess this
problem might be unique to zoned btrfs. Actions for a fix will be appreciated.
[1]
[ 2415.130602][T19872] run blktests zbd/009 at 2025-05-22 05:43:18
[ 2415.336100][T19925] sd 11:0:0:0: [sdg] Synchronizing SCSI cache
[ 2415.656624][T19927] scsi_debug:sdebug_driver_probe: scsi_debug: trim poll_queues to 0. poll_q/nr_hw = (0/1)
[ 2415.666978][T19927] scsi host11: scsi_debug: version 0191 [20210520]
[ 2415.666978][T19927] dev_size_mb=1024, opts=0x0, submit_queues=1, statistics=0
[ 2415.688478][T19927] scsi 11:0:0:0: Direct-Access-ZBC Linux scsi_debug 0191 PQ: 0 ANSI: 7
[ 2415.701080][ C3] scsi 11:0:0:0: Power-on or device reset occurred
[ 2415.711696][T19927] sd 11:0:0:0: Attached scsi generic sg7 type 20
[ 2415.711931][T14481] sd 11:0:0:0: [sdg] Host-managed zoned block device
[ 2415.729457][T14481] sd 11:0:0:0: [sdg] 262144 4096-byte logical blocks: (1.07 GB/1.00 GiB)
[ 2415.741314][T14481] sd 11:0:0:0: [sdg] Write Protect is off
[ 2415.749640][T14481] sd 11:0:0:0: [sdg] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 2415.764470][T14481] sd 11:0:0:0: [sdg] permanent stream count = 5
[ 2415.772745][T14481] sd 11:0:0:0: [sdg] Preferred minimum I/O size 4096 bytes
[ 2415.781284][T14481] sd 11:0:0:0: [sdg] Optimal transfer size 4194304 bytes
[ 2415.791238][T14481] sd 11:0:0:0: [sdg] 256 zones of 1024 logical blocks
[ 2415.844595][T14481] sd 11:0:0:0: [sdg] Attached SCSI disk
[ 2416.180138][T19955] BTRFS: device fsid c63c1228-a6d2-4e6e-8c38-81ccbc43ddf9 devid 1 transid 8 /dev/sdg (8:96) scanned by mount (19955)
[ 2416.224609][T19955] BTRFS info (device sdg): first mount of filesystem c63c1228-a6d2-4e6e-8c38-81ccbc43ddf9
[ 2416.236163][T19955] BTRFS info (device sdg): using crc32c (crc32c-x86) checksum algorithm
[ 2416.246181][T19955] BTRFS info (device sdg): using free-space-tree
[ 2416.261378][T19955] BTRFS info (device sdg): host-managed zoned block device /dev/sdg, 256 zones of 4194304 bytes
[ 2416.273530][T19955] BTRFS info (device sdg): zoned mode enabled with zone size 4194304
[ 2416.287809][T19955] BTRFS info (device sdg): checking UUID tree
[ 2427.128617][T20014] ------------[ cut here ]------------
[ 2427.134580][T20014] WARNING: CPU: 13 PID: 20014 at fs/btrfs/extent_io.c:3441 release_extent_buffer+0x22f/0x2a0 [btrfs]
[ 2427.146076][T20014] Modules linked in: scsi_debug btrfs xor raid6_pq xfs target_core_user target_core_mod nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack rfkill nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set nf_tables ip6table_filter ip6_tables iptable_filter ip_tables qrtr irdma ice gnss ib_uverbs sunrpc intel_rapl_msr intel_rapl_common intel_uncore_frequency ib_core intel_uncore_frequency_common skx_edac skx_edac_common nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm spi_nor i40e irqbypass mtd rapl iTCO_wdt intel_pmc_bxt intel_cstate iTCO_vendor_support vfat ses fat intel_uncore enclosure libie mei_me i2c_i801 spi_intel_pci ioatdma i2c_smbus lpc_ich spi_intel mei intel_pch_thermal wmi dca joydev acpi_power_meter acpi_pad fuse loop dm_multipath nfnetlink zram lz4hc_compress lz4_compress
[ 2427.146286][T20014] zstd_compress ast drm_client_lib i2c_algo_bit drm_shmem_helper drm_kms_helper drm nvme mpi3mr nvme_core polyval_clmulni ghash_clmulni_intel sha512_ssse3 nvme_keyring sha1_ssse3 scsi_transport_sas nvme_auth scsi_dh_rdac scsi_dh_emc scsi_dh_alua pkcs8_key_parser [last unloaded: scsi_debug]
[ 2427.271554][T20014] CPU: 13 UID: 0 PID: 20014 Comm: umount Tainted: G B 6.15.0-rc7-next-20250521-kts #1 PREEMPT(lazy)
[ 2427.285246][T20014] Tainted: [B]=BAD_PAGE
[ 2427.290012][T20014] Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 3.5 05/18/2021
[ 2427.298958][T20014] RIP: 0010:release_extent_buffer+0x22f/0x2a0 [btrfs]
[ 2427.306531][T20014] Code: 08 5b 5d 41 5c 41 5d e9 8f 08 06 e6 49 8d 7c 24 40 be ff ff ff ff e8 10 0e 03 e6 85 c0 0f 85 26 fe ff ff 0f 0b e9 1f fe ff ff <0f> 0b e9 61 fe ff ff 48 c7 c7 84 cb 48 ab e8 6e 99 cd e3 e9 f9 fd
[ 2427.327541][T20014] RSP: 0018:ffff888348297068 EFLAGS: 00010246
[ 2427.334290][T20014] RAX: 0000000000000000 RBX: ffff8882cd2d34e8 RCX: 0000000000000001
[ 2427.342954][T20014] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff8882cd2d34e8
[ 2427.351616][T20014] RBP: ffff8882cd2d34e8 R08: ffffffffc332b4e0 R09: ffffed1059a5a69d
[ 2427.360272][T20014] R10: ffffed1059a5a69e R11: 0000000000000000 R12: ffff8882cd2d3480
[ 2427.368916][T20014] R13: dffffc0000000000 R14: 000000000000001f R15: ffffed1059a5a692
[ 2427.377567][T20014] FS: 00007f2e60decb80(0000) GS:ffff889055845000(0000) knlGS:0000000000000000
[ 2427.387171][T20014] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2427.394426][T20014] CR2: 00007ffcad825fe0 CR3: 00000001b3fa9003 CR4: 00000000007726f0
[ 2427.403076][T20014] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2427.411723][T20014] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2427.420367][T20014] PKRU: 55555554
[ 2427.424579][T20014] Call Trace:
[ 2427.428523][T20014] <TASK>
[ 2427.432114][T20014] free_extent_buffer+0x1e6/0x2b0 [btrfs]
[ 2427.438667][T20014] ? __pfx_free_extent_buffer+0x10/0x10 [btrfs]
[ 2427.445722][T20014] ? btrfs_check_meta_write_pointer+0x243/0x5a0 [btrfs]
[ 2427.453465][T20014] btree_write_cache_pages+0x40f/0x950 [btrfs]
[ 2427.460435][T20014] ? __pfx_btree_write_cache_pages+0x10/0x10 [btrfs]
[ 2427.467904][T20014] ? unwind_get_return_address+0x6b/0xe0
[ 2427.474184][T20014] ? kasan_save_stack+0x3f/0x50
[ 2427.479655][T20014] ? kasan_save_stack+0x30/0x50
[ 2427.485121][T20014] ? kasan_save_track+0x14/0x30
[ 2427.490572][T20014] ? kasan_save_free_info+0x3b/0x70
[ 2427.496370][T20014] ? __kasan_slab_free+0x52/0x70
[ 2427.501920][T20014] ? kmem_cache_free+0x1a1/0x580
[ 2427.507452][T20014] ? btrfs_convert_extent_bit+0x97e/0xfd0 [btrfs]
[ 2427.514600][T20014] ? btrfs_write_marked_extents+0x17b/0x230 [btrfs]
[ 2427.521903][T20014] ? btrfs_write_and_wait_transaction+0xdb/0x1d0 [btrfs]
[ 2427.529630][T20014] ? btrfs_commit_transaction+0x163a/0x30b0 [btrfs]
[ 2427.536903][T20014] do_writepages+0x21e/0x560
[ 2427.542029][T20014] ? __pfx_do_writepages+0x10/0x10
[ 2427.547667][T20014] ? _raw_spin_unlock+0x23/0x40
[ 2427.553017][T20014] ? wbc_attach_and_unlock_inode.part.0+0x388/0x730
[ 2427.560110][T20014] filemap_fdatawrite_wbc+0xd2/0x120
[ 2427.565892][T20014] __filemap_fdatawrite_range+0xa7/0xe0
[ 2427.571900][T20014] ? __pfx___filemap_fdatawrite_range+0x10/0x10
[ 2427.578604][T20014] btrfs_write_marked_extents+0xf7/0x230 [btrfs]
[ 2427.585514][T20014] ? __pfx_btrfs_write_marked_extents+0x10/0x10 [btrfs]
[ 2427.593011][T20014] ? __pfx___mutex_lock+0x10/0x10
[ 2427.598452][T20014] btrfs_write_and_wait_transaction+0xdb/0x1d0 [btrfs]
[ 2427.605839][T20014] ? do_raw_spin_lock+0x128/0x270
[ 2427.611255][T20014] ? __pfx_btrfs_write_and_wait_transaction+0x10/0x10 [btrfs]
[ 2427.619249][T20014] ? trace_hardirqs_on+0x18/0x150
[ 2427.624658][T20014] ? _raw_spin_unlock_irqrestore+0x44/0x60
[ 2427.630852][T20014] btrfs_commit_transaction+0x163a/0x30b0 [btrfs]
[ 2427.637777][T20014] ? start_transaction+0x520/0x1520 [btrfs]
[ 2427.644167][T20014] ? __pfx_btrfs_commit_transaction+0x10/0x10 [btrfs]
[ 2427.651428][T20014] ? btrfs_attach_transaction_barrier+0x25/0xa0 [btrfs]
[ 2427.658843][T20014] sync_filesystem+0x177/0x220
[ 2427.663954][T20014] generic_shutdown_super+0x79/0x320
[ 2427.669583][T20014] kill_anon_super+0x3a/0x60
[ 2427.674514][T20014] btrfs_kill_super+0x3e/0x60 [btrfs]
[ 2427.680359][T20014] deactivate_locked_super+0xa8/0x160
[ 2427.686052][T20014] cleanup_mnt+0x1da/0x410
[ 2427.690792][T20014] task_work_run+0x116/0x200
[ 2427.695694][T20014] ? __pfx_task_work_run+0x10/0x10
[ 2427.701122][T20014] ? __x64_sys_umount+0x10c/0x140
[ 2427.706461][T20014] ? __pfx___x64_sys_umount+0x10/0x10
[ 2427.712141][T20014] exit_to_user_mode_loop+0x135/0x160
[ 2427.717825][T20014] do_syscall_64+0x223/0x380
[ 2427.722740][T20014] ? trace_hardirqs_on+0x18/0x150
[ 2427.728083][T20014] ? kasan_save_track+0x14/0x30
[ 2427.733247][T20014] ? kasan_quarantine_put+0xf5/0x240
[ 2427.738835][T20014] ? kmem_cache_free+0x1a1/0x580
[ 2427.744081][T20014] ? __x64_sys_statx+0x141/0x1b0
[ 2427.749321][T20014] ? __x64_sys_statx+0x141/0x1b0
[ 2427.754546][T20014] ? trace_hardirqs_on_prepare+0x101/0x150
[ 2427.760637][T20014] ? do_syscall_64+0x158/0x380
[ 2427.765699][T20014] ? trace_hardirqs_on+0x18/0x150
[ 2427.771018][T20014] ? kasan_save_track+0x14/0x30
[ 2427.776173][T20014] ? kasan_quarantine_put+0xf5/0x240
[ 2427.781760][T20014] ? kmem_cache_free+0x1a1/0x580
[ 2427.786995][T20014] ? __x64_sys_statx+0x141/0x1b0
[ 2427.792231][T20014] ? __x64_sys_statx+0x141/0x1b0
[ 2427.797465][T20014] ? trace_hardirqs_on_prepare+0x101/0x150
[ 2427.803561][T20014] ? do_syscall_64+0x158/0x380
[ 2427.808610][T20014] ? clear_bhb_loop+0x30/0x80
[ 2427.813571][T20014] ? clear_bhb_loop+0x30/0x80
[ 2427.818529][T20014] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 2427.824705][T20014] RIP: 0033:0x7f2e60ee280b
[ 2427.829399][T20014] Code: c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 f3 0f 1e fa 31 f6 e9 05 00 00 00 0f 1f 44 00 00 f3 0f 1e fa b8 a6 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 c9 35 0f 00 f7 d8
[ 2427.849737][T20014] RSP: 002b:00007ffcad827e98 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
[ 2427.858513][T20014] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007f2e60ee280b
[ 2427.866805][T20014] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000015800980
[ 2427.875102][T20014] RBP: 00007f2e610bdfd4 R08: 0000000000000002 R09: 0000000000000000
[ 2427.883395][T20014] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000015800648
[ 2427.891679][T20014] R13: 0000000015800980 R14: 0000000015800540 R15: 0000000000000000
[ 2427.899968][T20014] </TASK>
[ 2427.903304][T20014] irq event stamp: 0
[ 2427.907498][T20014] hardirqs last enabled at (0): [<0000000000000000>] 0x0
[ 2427.914911][T20014] hardirqs last disabled at (0): [<ffffffffa6557892>] copy_process+0x1862/0x5730
[ 2427.924331][T20014] softirqs last enabled at (0): [<ffffffffa65578ea>] copy_process+0x18ba/0x5730
[ 2427.933742][T20014] softirqs last disabled at (0): [<0000000000000000>] 0x0
[ 2427.941153][T20014] ---[ end trace 0000000000000000 ]---
[ 2427.947036][T20014] ------------[ cut here ]------------
[ 2427.953580][T20014] WARNING: CPU: 3 PID: 20014 at fs/btrfs/extent_io.c:3441 release_extent_buffer+0x22f/0x2a0 [btrfs]
[ 2427.965056][T20014] Modules linked in: scsi_debug btrfs xor raid6_pq xfs target_core_user target_core_mod nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack rfkill nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set nf_tables ip6table_filter ip6_tables iptable_filter ip_tables qrtr irdma ice gnss ib_uverbs sunrpc intel_rapl_msr intel_rapl_common intel_uncore_frequency ib_core intel_uncore_frequency_common skx_edac skx_edac_common nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm spi_nor i40e irqbypass mtd rapl iTCO_wdt intel_pmc_bxt intel_cstate iTCO_vendor_support vfat ses fat intel_uncore enclosure libie mei_me i2c_i801 spi_intel_pci ioatdma i2c_smbus lpc_ich spi_intel mei intel_pch_thermal wmi dca joydev acpi_power_meter acpi_pad fuse loop dm_multipath nfnetlink zram lz4hc_compress lz4_compress
[ 2427.965264][T20014] zstd_compress ast drm_client_lib i2c_algo_bit drm_shmem_helper drm_kms_helper drm nvme mpi3mr nvme_core polyval_clmulni ghash_clmulni_intel sha512_ssse3 nvme_keyring sha1_ssse3 scsi_transport_sas nvme_auth scsi_dh_rdac scsi_dh_emc scsi_dh_alua pkcs8_key_parser [last unloaded: scsi_debug]
[ 2428.088867][T20014] CPU: 3 UID: 0 PID: 20014 Comm: umount Tainted: G B W 6.15.0-rc7-next-20250521-kts #1 PREEMPT(lazy)
[ 2428.102283][T20014] Tainted: [B]=BAD_PAGE, [W]=WARN
[ 2428.107834][T20014] Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 3.5 05/18/2021
[ 2428.116715][T20014] RIP: 0010:release_extent_buffer+0x22f/0x2a0 [btrfs]
[ 2428.124193][T20014] Code: 08 5b 5d 41 5c 41 5d e9 8f 08 06 e6 49 8d 7c 24 40 be ff ff ff ff e8 10 0e 03 e6 85 c0 0f 85 26 fe ff ff 0f 0b e9 1f fe ff ff <0f> 0b e9 61 fe ff ff 48 c7 c7 84 cb 48 ab e8 6e 99 cd e3 e9 f9 fd
[ 2428.145115][T20014] RSP: 0018:ffff888348297068 EFLAGS: 00010246
[ 2428.151793][T20014] RAX: 0000000000000000 RBX: ffff8881765d4ba8 RCX: 0000000000000001
[ 2428.160383][T20014] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff8881765d4ba8
[ 2428.168974][T20014] RBP: ffff8881765d4ba8 R08: ffffffffc332b4e0 R09: ffffed102ecba975
[ 2428.177572][T20014] R10: ffffed102ecba976 R11: 0000000000000000 R12: ffff8881765d4b40
[ 2428.186179][T20014] R13: dffffc0000000000 R14: 000000000000001f R15: ffffed102ecba96a
[ 2428.194782][T20014] FS: 00007f2e60decb80(0000) GS:ffff889055345000(0000) knlGS:0000000000000000
[ 2428.204347][T20014] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2428.211586][T20014] CR2: 0000000031894360 CR3: 00000001b3fa9002 CR4: 00000000007726f0
[ 2428.220219][T20014] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2428.228853][T20014] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2428.237481][T20014] PKRU: 55555554
[ 2428.241671][T20014] Call Trace:
[ 2428.245578][T20014] <TASK>
[ 2428.249132][T20014] free_extent_buffer+0x1e6/0x2b0 [btrfs]
[ 2428.255616][T20014] ? __pfx_free_extent_buffer+0x10/0x10 [btrfs]
[ 2428.262619][T20014] ? btrfs_check_meta_write_pointer+0x243/0x5a0 [btrfs]
[ 2428.270315][T20014] btree_write_cache_pages+0x40f/0x950 [btrfs]
[ 2428.277241][T20014] ? __pfx_btree_write_cache_pages+0x10/0x10 [btrfs]
[ 2428.284682][T20014] ? unwind_get_return_address+0x6b/0xe0
[ 2428.290928][T20014] ? kasan_save_stack+0x3f/0x50
[ 2428.296394][T20014] ? kasan_save_stack+0x30/0x50
[ 2428.301841][T20014] ? kasan_save_track+0x14/0x30
[ 2428.307293][T20014] ? kasan_save_free_info+0x3b/0x70
[ 2428.313092][T20014] ? __kasan_slab_free+0x52/0x70
[ 2428.318622][T20014] ? kmem_cache_free+0x1a1/0x580
[ 2428.324156][T20014] ? btrfs_convert_extent_bit+0x97e/0xfd0 [btrfs]
[ 2428.331314][T20014] ? btrfs_write_marked_extents+0x17b/0x230 [btrfs]
[ 2428.338626][T20014] ? btrfs_write_and_wait_transaction+0xdb/0x1d0 [btrfs]
[ 2428.346364][T20014] ? btrfs_commit_transaction+0x163a/0x30b0 [btrfs]
[ 2428.353658][T20014] do_writepages+0x21e/0x560
[ 2428.358785][T20014] ? __pfx_do_writepages+0x10/0x10
[ 2428.364431][T20014] ? _raw_spin_unlock+0x23/0x40
[ 2428.369784][T20014] ? wbc_attach_and_unlock_inode.part.0+0x388/0x730
[ 2428.376917][T20014] filemap_fdatawrite_wbc+0xd2/0x120
[ 2428.382733][T20014] __filemap_fdatawrite_range+0xa7/0xe0
[ 2428.388789][T20014] ? __pfx___filemap_fdatawrite_range+0x10/0x10
[ 2428.395550][T20014] btrfs_write_marked_extents+0xf7/0x230 [btrfs]
[ 2428.402529][T20014] ? __pfx_btrfs_write_marked_extents+0x10/0x10 [btrfs]
[ 2428.410091][T20014] ? __pfx___mutex_lock+0x10/0x10
[ 2428.415586][T20014] btrfs_write_and_wait_transaction+0xdb/0x1d0 [btrfs]
[ 2428.423028][T20014] ? do_raw_spin_lock+0x128/0x270
[ 2428.428491][T20014] ? __pfx_btrfs_write_and_wait_transaction+0x10/0x10 [btrfs]
[ 2428.436543][T20014] ? trace_hardirqs_on+0x18/0x150
[ 2428.442000][T20014] ? _raw_spin_unlock_irqrestore+0x44/0x60
[ 2428.448238][T20014] btrfs_commit_transaction+0x163a/0x30b0 [btrfs]
[ 2428.455225][T20014] ? start_transaction+0x520/0x1520 [btrfs]
[ 2428.461655][T20014] ? __pfx_btrfs_commit_transaction+0x10/0x10 [btrfs]
[ 2428.468948][T20014] ? btrfs_attach_transaction_barrier+0x25/0xa0 [btrfs]
[ 2428.476440][T20014] sync_filesystem+0x177/0x220
[ 2428.481589][T20014] generic_shutdown_super+0x79/0x320
[ 2428.487264][T20014] kill_anon_super+0x3a/0x60
[ 2428.492238][T20014] btrfs_kill_super+0x3e/0x60 [btrfs]
[ 2428.498133][T20014] deactivate_locked_super+0xa8/0x160
[ 2428.503869][T20014] cleanup_mnt+0x1da/0x410
[ 2428.508648][T20014] task_work_run+0x116/0x200
[ 2428.513592][T20014] ? __pfx_task_work_run+0x10/0x10
[ 2428.519052][T20014] ? __x64_sys_umount+0x10c/0x140
[ 2428.524436][T20014] ? __pfx___x64_sys_umount+0x10/0x10
[ 2428.530159][T20014] exit_to_user_mode_loop+0x135/0x160
[ 2428.535880][T20014] do_syscall_64+0x223/0x380
[ 2428.540827][T20014] ? trace_hardirqs_on+0x18/0x150
[ 2428.546208][T20014] ? kasan_save_track+0x14/0x30
[ 2428.551416][T20014] ? kasan_quarantine_put+0xf5/0x240
[ 2428.557040][T20014] ? kmem_cache_free+0x1a1/0x580
[ 2428.562323][T20014] ? __x64_sys_statx+0x141/0x1b0
[ 2428.567597][T20014] ? __x64_sys_statx+0x141/0x1b0
[ 2428.572859][T20014] ? trace_hardirqs_on_prepare+0x101/0x150
[ 2428.578986][T20014] ? do_syscall_64+0x158/0x380
[ 2428.584094][T20014] ? trace_hardirqs_on+0x18/0x150
[ 2428.589468][T20014] ? kasan_save_track+0x14/0x30
[ 2428.594652][T20014] ? kasan_quarantine_put+0xf5/0x240
[ 2428.600282][T20014] ? kmem_cache_free+0x1a1/0x580
[ 2428.605557][T20014] ? __x64_sys_statx+0x141/0x1b0
[ 2428.610827][T20014] ? __x64_sys_statx+0x141/0x1b0
[ 2428.616095][T20014] ? trace_hardirqs_on_prepare+0x101/0x150
[ 2428.622232][T20014] ? do_syscall_64+0x158/0x380
[ 2428.627328][T20014] ? clear_bhb_loop+0x30/0x80
[ 2428.632328][T20014] ? clear_bhb_loop+0x30/0x80
[ 2428.637329][T20014] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 2428.643530][T20014] RIP: 0033:0x7f2e60ee280b
[ 2428.648266][T20014] Code: c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 f3 0f 1e fa 31 f6 e9 05 00 00 00 0f 1f 44 00 00 f3 0f 1e fa b8 a6 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 c9 35 0f 00 f7 d8
[ 2428.668680][T20014] RSP: 002b:00007ffcad827e98 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
[ 2428.677460][T20014] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007f2e60ee280b
[ 2428.685785][T20014] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000015800980
[ 2428.694123][T20014] RBP: 00007f2e610bdfd4 R08: 0000000000000002 R09: 0000000000000000
[ 2428.702459][T20014] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000015800648
[ 2428.710783][T20014] R13: 0000000015800980 R14: 0000000015800540 R15: 0000000000000000
[ 2428.719116][T20014] </TASK>
[ 2428.722494][T20014] irq event stamp: 0
[ 2428.726720][T20014] hardirqs last enabled at (0): [<0000000000000000>] 0x0
[ 2428.734179][T20014] hardirqs last disabled at (0): [<ffffffffa6557892>] copy_process+0x1862/0x5730
[ 2428.743627][T20014] softirqs last enabled at (0): [<ffffffffa65578ea>] copy_process+0x18ba/0x5730
[ 2428.753082][T20014] softirqs last disabled at (0): [<0000000000000000>] 0x0
[ 2428.760539][T20014] ---[ end trace 0000000000000000 ]---
[ 2428.766495][T20014] ------------[ cut here ]------------
[ 2428.772550][T20014] WARNING: CPU: 0 PID: 20014 at fs/btrfs/extent_io.c:3441 release_extent_buffer+0x22f/0x2a0 [btrfs]
[ 2428.784046][T20014] Modules linked in: scsi_debug btrfs xor raid6_pq xfs target_core_user target_core_mod nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack rfkill nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set nf_tables ip6table_filter ip6_tables iptable_filter ip_tables qrtr irdma ice gnss ib_uverbs sunrpc intel_rapl_msr intel_rapl_common intel_uncore_frequency ib_core intel_uncore_frequency_common skx_edac skx_edac_common nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm spi_nor i40e irqbypass mtd rapl iTCO_wdt intel_pmc_bxt intel_cstate iTCO_vendor_support vfat ses fat intel_uncore enclosure libie mei_me i2c_i801 spi_intel_pci ioatdma i2c_smbus lpc_ich spi_intel mei intel_pch_thermal wmi dca joydev acpi_power_meter acpi_pad fuse loop dm_multipath nfnetlink zram lz4hc_compress lz4_compress
[ 2428.784308][T20014] zstd_compress ast drm_client_lib i2c_algo_bit drm_shmem_helper drm_kms_helper drm nvme mpi3mr nvme_core polyval_clmulni ghash_clmulni_intel sha512_ssse3 nvme_keyring sha1_ssse3 scsi_transport_sas nvme_auth scsi_dh_rdac scsi_dh_emc scsi_dh_alua pkcs8_key_parser [last unloaded: scsi_debug]
[ 2428.910058][T20014] CPU: 0 UID: 0 PID: 20014 Comm: umount Tainted: G B W 6.15.0-rc7-next-20250521-kts #1 PREEMPT(lazy)
[ 2428.923575][T20014] Tainted: [B]=BAD_PAGE, [W]=WARN
[ 2428.929199][T20014] Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 3.5 05/18/2021
[ 2428.938139][T20014] RIP: 0010:release_extent_buffer+0x22f/0x2a0 [btrfs]
[ 2428.945688][T20014] Code: 08 5b 5d 41 5c 41 5d e9 8f 08 06 e6 49 8d 7c 24 40 be ff ff ff ff e8 10 0e 03 e6 85 c0 0f 85 26 fe ff ff 0f 0b e9 1f fe ff ff <0f> 0b e9 61 fe ff ff 48 c7 c7 84 cb 48 ab e8 6e 99 cd e3 e9 f9 fd
[ 2428.966691][T20014] RSP: 0018:ffff888348297068 EFLAGS: 00010246
[ 2428.973501][T20014] RAX: 0000000000000000 RBX: ffff8881765d47e8 RCX: 0000000000000001
[ 2428.982152][T20014] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff8881765d47e8
[ 2428.990816][T20014] RBP: ffff8881765d47e8 R08: ffffffffc332b4e0 R09: ffffed102ecba8fd
[ 2428.999472][T20014] R10: ffffed102ecba8fe R11: 0000000000000000 R12: ffff8881765d4780
[ 2429.008137][T20014] R13: dffffc0000000000 R14: 000000000000001f R15: ffffed102ecba8f2
[ 2429.016801][T20014] FS: 00007f2e60decb80(0000) GS:ffff8890551c5000(0000) knlGS:0000000000000000
[ 2429.026431][T20014] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2429.033725][T20014] CR2: 0000000031d2e004 CR3: 00000001b3fa9005 CR4: 00000000007726f0
[ 2429.042418][T20014] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2429.051112][T20014] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2429.059798][T20014] PKRU: 55555554
[ 2429.064035][T20014] Call Trace:
[ 2429.067989][T20014] <TASK>
[ 2429.071603][T20014] free_extent_buffer+0x1e6/0x2b0 [btrfs]
[ 2429.078167][T20014] ? __pfx_free_extent_buffer+0x10/0x10 [btrfs]
[ 2429.085241][T20014] ? btrfs_check_meta_write_pointer+0x243/0x5a0 [btrfs]
[ 2429.092991][T20014] btree_write_cache_pages+0x40f/0x950 [btrfs]
[ 2429.099983][T20014] ? __pfx_btree_write_cache_pages+0x10/0x10 [btrfs]
[ 2429.107513][T20014] ? unwind_get_return_address+0x6b/0xe0
[ 2429.113833][T20014] ? kasan_save_stack+0x3f/0x50
[ 2429.119359][T20014] ? kasan_save_stack+0x30/0x50
[ 2429.124860][T20014] ? kasan_save_track+0x14/0x30
[ 2429.130368][T20014] ? kasan_save_free_info+0x3b/0x70
[ 2429.136228][T20014] ? __kasan_slab_free+0x52/0x70
[ 2429.141820][T20014] ? kmem_cache_free+0x1a1/0x580
[ 2429.147405][T20014] ? btrfs_convert_extent_bit+0x97e/0xfd0 [btrfs]
[ 2429.154630][T20014] ? btrfs_write_marked_extents+0x17b/0x230 [btrfs]
[ 2429.161997][T20014] ? btrfs_write_and_wait_transaction+0xdb/0x1d0 [btrfs]
[ 2429.169809][T20014] ? btrfs_commit_transaction+0x163a/0x30b0 [btrfs]
[ 2429.177167][T20014] do_writepages+0x21e/0x560
[ 2429.182366][T20014] ? __pfx_do_writepages+0x10/0x10
[ 2429.188047][T20014] ? _raw_spin_unlock+0x23/0x40
[ 2429.193468][T20014] ? wbc_attach_and_unlock_inode.part.0+0x388/0x730
[ 2429.200618][T20014] filemap_fdatawrite_wbc+0xd2/0x120
[ 2429.206451][T20014] __filemap_fdatawrite_range+0xa7/0xe0
[ 2429.212519][T20014] ? __pfx___filemap_fdatawrite_range+0x10/0x10
[ 2429.219276][T20014] btrfs_write_marked_extents+0xf7/0x230 [btrfs]
[ 2429.226248][T20014] ? __pfx_btrfs_write_marked_extents+0x10/0x10 [btrfs]
[ 2429.233805][T20014] ? __pfx___mutex_lock+0x10/0x10
[ 2429.239296][T20014] btrfs_write_and_wait_transaction+0xdb/0x1d0 [btrfs]
[ 2429.246736][T20014] ? do_raw_spin_lock+0x128/0x270
[ 2429.252202][T20014] ? __pfx_btrfs_write_and_wait_transaction+0x10/0x10 [btrfs]
[ 2429.260249][T20014] ? trace_hardirqs_on+0x18/0x150
[ 2429.265705][T20014] ? _raw_spin_unlock_irqrestore+0x44/0x60
[ 2429.271931][T20014] btrfs_commit_transaction+0x163a/0x30b0 [btrfs]
[ 2429.278924][T20014] ? start_transaction+0x520/0x1520 [btrfs]
[ 2429.285377][T20014] ? __pfx_btrfs_commit_transaction+0x10/0x10 [btrfs]
[ 2429.292675][T20014] ? btrfs_attach_transaction_barrier+0x25/0xa0 [btrfs]
[ 2429.300156][T20014] sync_filesystem+0x177/0x220
[ 2429.305312][T20014] generic_shutdown_super+0x79/0x320
[ 2429.310974][T20014] kill_anon_super+0x3a/0x60
[ 2429.315937][T20014] btrfs_kill_super+0x3e/0x60 [btrfs]
[ 2429.321835][T20014] deactivate_locked_super+0xa8/0x160
[ 2429.327575][T20014] cleanup_mnt+0x1da/0x410
[ 2429.332368][T20014] task_work_run+0x116/0x200
[ 2429.337317][T20014] ? __pfx_task_work_run+0x10/0x10
[ 2429.342774][T20014] ? __x64_sys_umount+0x10c/0x140
[ 2429.348165][T20014] ? __pfx___x64_sys_umount+0x10/0x10
[ 2429.353881][T20014] exit_to_user_mode_loop+0x135/0x160
[ 2429.359599][T20014] do_syscall_64+0x223/0x380
[ 2429.364547][T20014] ? trace_hardirqs_on+0x18/0x150
[ 2429.369920][T20014] ? kasan_save_track+0x14/0x30
[ 2429.375545][T20014] ? kasan_quarantine_put+0xf5/0x240
[ 2429.381614][T20014] ? kmem_cache_free+0x1a1/0x580
[ 2429.386910][T20014] ? __x64_sys_statx+0x141/0x1b0
[ 2429.392206][T20014] ? __x64_sys_statx+0x141/0x1b0
[ 2429.397479][T20014] ? trace_hardirqs_on_prepare+0x101/0x150
[ 2429.403609][T20014] ? do_syscall_64+0x158/0x380
[ 2429.408709][T20014] ? trace_hardirqs_on+0x18/0x150
[ 2429.414072][T20014] ? kasan_save_track+0x14/0x30
[ 2429.419265][T20014] ? kasan_quarantine_put+0xf5/0x240
[ 2429.424881][T20014] ? kmem_cache_free+0x1a1/0x580
[ 2429.430168][T20014] ? __x64_sys_statx+0x141/0x1b0
[ 2429.435450][T20014] ? __x64_sys_statx+0x141/0x1b0
[ 2429.440710][T20014] ? trace_hardirqs_on_prepare+0x101/0x150
[ 2429.446839][T20014] ? do_syscall_64+0x158/0x380
[ 2429.451922][T20014] ? clear_bhb_loop+0x30/0x80
[ 2429.456919][T20014] ? clear_bhb_loop+0x30/0x80
[ 2429.461908][T20014] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 2429.468130][T20014] RIP: 0033:0x7f2e60ee280b
[ 2429.472858][T20014] Code: c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 f3 0f 1e fa 31 f6 e9 05 00 00 00 0f 1f 44 00 00 f3 0f 1e fa b8 a6 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 c9 35 0f 00 f7 d8
[ 2429.493303][T20014] RSP: 002b:00007ffcad827e98 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
[ 2429.502091][T20014] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007f2e60ee280b
[ 2429.510437][T20014] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000015800980
[ 2429.518761][T20014] RBP: 00007f2e610bdfd4 R08: 0000000000000002 R09: 0000000000000000
[ 2429.527098][T20014] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000015800648
[ 2429.535424][T20014] R13: 0000000015800980 R14: 0000000015800540 R15: 0000000000000000
[ 2429.543751][T20014] </TASK>
[ 2429.547133][T20014] irq event stamp: 0
[ 2429.551395][T20014] hardirqs last enabled at (0): [<0000000000000000>] 0x0
[ 2429.558844][T20014] hardirqs last disabled at (0): [<ffffffffa6557892>] copy_process+0x1862/0x5730
[ 2429.568313][T20014] softirqs last enabled at (0): [<ffffffffa65578ea>] copy_process+0x18ba/0x5730
[ 2429.577761][T20014] softirqs last disabled at (0): [<0000000000000000>] 0x0
[ 2429.585219][T20014] ---[ end trace 0000000000000000 ]---
...
* Re: [PATCH v5 3/3] btrfs: use buffer xarray for extent buffer writeback operations
2025-05-26 1:17 ` Shinichiro Kawasaki
@ 2025-05-26 4:20 ` Qu Wenruo
2025-05-26 6:53 ` Shinichiro Kawasaki
0 siblings, 1 reply; 9+ messages in thread
From: Qu Wenruo @ 2025-05-26 4:20 UTC (permalink / raw)
To: Shinichiro Kawasaki, Josef Bacik
Cc: linux-btrfs@vger.kernel.org, kernel-team@fb.com, Filipe Manana,
Johannes Thumshirn, Naohiro Aota, Damien Le Moal
On 2025/5/26 10:47, Shinichiro Kawasaki wrote:
> On Apr 28, 2025 / 10:52, Josef Bacik wrote:
>> Currently we have this ugly back and forth with the btree writeback
>> where we find the folio, find the eb associated with that folio, and
>> then attempt to writeback. This results in two different paths for
>> subpage eb's and >= pagesize eb's.
>>
>> Clean this up by adding our own infrastructure around looking up tag'ed
>> eb's and writing the eb's out directly. This allows us to unify the
>> subpage and >= pagesize IO paths, resulting in a much cleaner writeback
>> path for extent buffers.
>
> [...]
>
> When I ran blktests on the for-next kernel with the tag next-20250521, I
> observed the test case zbd/009 failed with repeated WARNs at
> release_extent_buffer() [1].
Unfortunately that's a known bug, fixed by this patch:
https://lore.kernel.org/linux-btrfs/b964b92f482453cbd122743995ff23aa7158b2cb.1747677774.git.josef@toxicpanda.com/
Thanks,
Qu
> I bisected and found that this patch, as commit
> 5e121ae687b8, is the trigger. When I revert the commit from the tag
> next-20250521, the WARNs disappear and the test case passes.
>
> The test case creates a zoned btrfs on scsi_debug and runs fio. I guess this
> problem might be unique to zoned btrfs. A fix would be appreciated.
>
> [1]
>
> [ 2415.130602][T19872] run blktests zbd/009 at 2025-05-22 05:43:18
> [ 2415.336100][T19925] sd 11:0:0:0: [sdg] Synchronizing SCSI cache
> [ 2415.656624][T19927] scsi_debug:sdebug_driver_probe: scsi_debug: trim poll_queues to 0. poll_q/nr_hw = (0/1)
> [ 2415.666978][T19927] scsi host11: scsi_debug: version 0191 [20210520]
> [ 2415.666978][T19927] dev_size_mb=1024, opts=0x0, submit_queues=1, statistics=0
> [ 2415.688478][T19927] scsi 11:0:0:0: Direct-Access-ZBC Linux scsi_debug 0191 PQ: 0 ANSI: 7
> [ 2415.701080][ C3] scsi 11:0:0:0: Power-on or device reset occurred
> [ 2415.711696][T19927] sd 11:0:0:0: Attached scsi generic sg7 type 20
> [ 2415.711931][T14481] sd 11:0:0:0: [sdg] Host-managed zoned block device
> [ 2415.729457][T14481] sd 11:0:0:0: [sdg] 262144 4096-byte logical blocks: (1.07 GB/1.00 GiB)
> [ 2415.741314][T14481] sd 11:0:0:0: [sdg] Write Protect is off
> [ 2415.749640][T14481] sd 11:0:0:0: [sdg] Write cache: enabled, read cache: enabled, supports DPO and FUA
> [ 2415.764470][T14481] sd 11:0:0:0: [sdg] permanent stream count = 5
> [ 2415.772745][T14481] sd 11:0:0:0: [sdg] Preferred minimum I/O size 4096 bytes
> [ 2415.781284][T14481] sd 11:0:0:0: [sdg] Optimal transfer size 4194304 bytes
> [ 2415.791238][T14481] sd 11:0:0:0: [sdg] 256 zones of 1024 logical blocks
> [ 2415.844595][T14481] sd 11:0:0:0: [sdg] Attached SCSI disk
> [ 2416.180138][T19955] BTRFS: device fsid c63c1228-a6d2-4e6e-8c38-81ccbc43ddf9 devid 1 transid 8 /dev/sdg (8:96) scanned by mount (19955)
> [ 2416.224609][T19955] BTRFS info (device sdg): first mount of filesystem c63c1228-a6d2-4e6e-8c38-81ccbc43ddf9
> [ 2416.236163][T19955] BTRFS info (device sdg): using crc32c (crc32c-x86) checksum algorithm
> [ 2416.246181][T19955] BTRFS info (device sdg): using free-space-tree
> [ 2416.261378][T19955] BTRFS info (device sdg): host-managed zoned block device /dev/sdg, 256 zones of 4194304 bytes
> [ 2416.273530][T19955] BTRFS info (device sdg): zoned mode enabled with zone size 4194304
> [ 2416.287809][T19955] BTRFS info (device sdg): checking UUID tree
> [ 2427.128617][T20014] ------------[ cut here ]------------
> [ 2427.134580][T20014] WARNING: CPU: 13 PID: 20014 at fs/btrfs/extent_io.c:3441 release_extent_buffer+0x22f/0x2a0 [btrfs]
> [ 2427.146076][T20014] Modules linked in: scsi_debug btrfs xor raid6_pq xfs target_core_user target_core_mod nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack rfkill nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set nf_tables ip6table_filter ip6_tables iptable_filter ip_tables qrtr irdma ice gnss ib_uverbs sunrpc intel_rapl_msr intel_rapl_common intel_uncore_frequency ib_core intel_uncore_frequency_common skx_edac skx_edac_common nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm spi_nor i40e irqbypass mtd rapl iTCO_wdt intel_pmc_bxt intel_cstate iTCO_vendor_support vfat ses fat intel_uncore enclosure libie mei_me i2c_i801 spi_intel_pci ioatdma i2c_smbus lpc_ich spi_intel mei intel_pch_thermal wmi dca joydev acpi_power_meter acpi_pad fuse loop dm_multipath nfnetlink zram lz4hc_compress lz4_compress
> [ 2427.146286][T20014] zstd_compress ast drm_client_lib i2c_algo_bit drm_shmem_helper drm_kms_helper drm nvme mpi3mr nvme_core polyval_clmulni ghash_clmulni_intel sha512_ssse3 nvme_keyring sha1_ssse3 scsi_transport_sas nvme_auth scsi_dh_rdac scsi_dh_emc scsi_dh_alua pkcs8_key_parser [last unloaded: scsi_debug]
> [ 2427.271554][T20014] CPU: 13 UID: 0 PID: 20014 Comm: umount Tainted: G B 6.15.0-rc7-next-20250521-kts #1 PREEMPT(lazy)
> [ 2427.285246][T20014] Tainted: [B]=BAD_PAGE
> [ 2427.290012][T20014] Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 3.5 05/18/2021
> [ 2427.298958][T20014] RIP: 0010:release_extent_buffer+0x22f/0x2a0 [btrfs]
> [ 2427.306531][T20014] Code: 08 5b 5d 41 5c 41 5d e9 8f 08 06 e6 49 8d 7c 24 40 be ff ff ff ff e8 10 0e 03 e6 85 c0 0f 85 26 fe ff ff 0f 0b e9 1f fe ff ff <0f> 0b e9 61 fe ff ff 48 c7 c7 84 cb 48 ab e8 6e 99 cd e3 e9 f9 fd
> [ 2427.327541][T20014] RSP: 0018:ffff888348297068 EFLAGS: 00010246
> [ 2427.334290][T20014] RAX: 0000000000000000 RBX: ffff8882cd2d34e8 RCX: 0000000000000001
> [ 2427.342954][T20014] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff8882cd2d34e8
> [ 2427.351616][T20014] RBP: ffff8882cd2d34e8 R08: ffffffffc332b4e0 R09: ffffed1059a5a69d
> [ 2427.360272][T20014] R10: ffffed1059a5a69e R11: 0000000000000000 R12: ffff8882cd2d3480
> [ 2427.368916][T20014] R13: dffffc0000000000 R14: 000000000000001f R15: ffffed1059a5a692
> [ 2427.377567][T20014] FS: 00007f2e60decb80(0000) GS:ffff889055845000(0000) knlGS:0000000000000000
> [ 2427.387171][T20014] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 2427.394426][T20014] CR2: 00007ffcad825fe0 CR3: 00000001b3fa9003 CR4: 00000000007726f0
> [ 2427.403076][T20014] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 2427.411723][T20014] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 2427.420367][T20014] PKRU: 55555554
> [ 2427.424579][T20014] Call Trace:
> [ 2427.428523][T20014] <TASK>
> [ 2427.432114][T20014] free_extent_buffer+0x1e6/0x2b0 [btrfs]
> [ 2427.438667][T20014] ? __pfx_free_extent_buffer+0x10/0x10 [btrfs]
> [ 2427.445722][T20014] ? btrfs_check_meta_write_pointer+0x243/0x5a0 [btrfs]
> [ 2427.453465][T20014] btree_write_cache_pages+0x40f/0x950 [btrfs]
> [ 2427.460435][T20014] ? __pfx_btree_write_cache_pages+0x10/0x10 [btrfs]
> [ 2427.467904][T20014] ? unwind_get_return_address+0x6b/0xe0
> [ 2427.474184][T20014] ? kasan_save_stack+0x3f/0x50
> [ 2427.479655][T20014] ? kasan_save_stack+0x30/0x50
> [ 2427.485121][T20014] ? kasan_save_track+0x14/0x30
> [ 2427.490572][T20014] ? kasan_save_free_info+0x3b/0x70
> [ 2427.496370][T20014] ? __kasan_slab_free+0x52/0x70
> [ 2427.501920][T20014] ? kmem_cache_free+0x1a1/0x580
> [ 2427.507452][T20014] ? btrfs_convert_extent_bit+0x97e/0xfd0 [btrfs]
> [ 2427.514600][T20014] ? btrfs_write_marked_extents+0x17b/0x230 [btrfs]
> [ 2427.521903][T20014] ? btrfs_write_and_wait_transaction+0xdb/0x1d0 [btrfs]
> [ 2427.529630][T20014] ? btrfs_commit_transaction+0x163a/0x30b0 [btrfs]
> [ 2427.536903][T20014] do_writepages+0x21e/0x560
> [ 2427.542029][T20014] ? __pfx_do_writepages+0x10/0x10
> [ 2427.547667][T20014] ? _raw_spin_unlock+0x23/0x40
> [ 2427.553017][T20014] ? wbc_attach_and_unlock_inode.part.0+0x388/0x730
> [ 2427.560110][T20014] filemap_fdatawrite_wbc+0xd2/0x120
> [ 2427.565892][T20014] __filemap_fdatawrite_range+0xa7/0xe0
> [ 2427.571900][T20014] ? __pfx___filemap_fdatawrite_range+0x10/0x10
> [ 2427.578604][T20014] btrfs_write_marked_extents+0xf7/0x230 [btrfs]
> [ 2427.585514][T20014] ? __pfx_btrfs_write_marked_extents+0x10/0x10 [btrfs]
> [ 2427.593011][T20014] ? __pfx___mutex_lock+0x10/0x10
> [ 2427.598452][T20014] btrfs_write_and_wait_transaction+0xdb/0x1d0 [btrfs]
> [ 2427.605839][T20014] ? do_raw_spin_lock+0x128/0x270
> [ 2427.611255][T20014] ? __pfx_btrfs_write_and_wait_transaction+0x10/0x10 [btrfs]
> [ 2427.619249][T20014] ? trace_hardirqs_on+0x18/0x150
> [ 2427.624658][T20014] ? _raw_spin_unlock_irqrestore+0x44/0x60
> [ 2427.630852][T20014] btrfs_commit_transaction+0x163a/0x30b0 [btrfs]
> [ 2427.637777][T20014] ? start_transaction+0x520/0x1520 [btrfs]
> [ 2427.644167][T20014] ? __pfx_btrfs_commit_transaction+0x10/0x10 [btrfs]
> [ 2427.651428][T20014] ? btrfs_attach_transaction_barrier+0x25/0xa0 [btrfs]
> [ 2427.658843][T20014] sync_filesystem+0x177/0x220
> [ 2427.663954][T20014] generic_shutdown_super+0x79/0x320
> [ 2427.669583][T20014] kill_anon_super+0x3a/0x60
> [ 2427.674514][T20014] btrfs_kill_super+0x3e/0x60 [btrfs]
> [ 2427.680359][T20014] deactivate_locked_super+0xa8/0x160
> [ 2427.686052][T20014] cleanup_mnt+0x1da/0x410
> [ 2427.690792][T20014] task_work_run+0x116/0x200
> [ 2427.695694][T20014] ? __pfx_task_work_run+0x10/0x10
> [ 2427.701122][T20014] ? __x64_sys_umount+0x10c/0x140
> [ 2427.706461][T20014] ? __pfx___x64_sys_umount+0x10/0x10
> [ 2427.712141][T20014] exit_to_user_mode_loop+0x135/0x160
> [ 2427.717825][T20014] do_syscall_64+0x223/0x380
> [ 2427.722740][T20014] ? trace_hardirqs_on+0x18/0x150
> [ 2427.728083][T20014] ? kasan_save_track+0x14/0x30
> [ 2427.733247][T20014] ? kasan_quarantine_put+0xf5/0x240
> [ 2427.738835][T20014] ? kmem_cache_free+0x1a1/0x580
> [ 2427.744081][T20014] ? __x64_sys_statx+0x141/0x1b0
> [ 2427.749321][T20014] ? __x64_sys_statx+0x141/0x1b0
> [ 2427.754546][T20014] ? trace_hardirqs_on_prepare+0x101/0x150
> [ 2427.760637][T20014] ? do_syscall_64+0x158/0x380
> [ 2427.765699][T20014] ? trace_hardirqs_on+0x18/0x150
> [ 2427.771018][T20014] ? kasan_save_track+0x14/0x30
> [ 2427.776173][T20014] ? kasan_quarantine_put+0xf5/0x240
> [ 2427.781760][T20014] ? kmem_cache_free+0x1a1/0x580
> [ 2427.786995][T20014] ? __x64_sys_statx+0x141/0x1b0
> [ 2427.792231][T20014] ? __x64_sys_statx+0x141/0x1b0
> [ 2427.797465][T20014] ? trace_hardirqs_on_prepare+0x101/0x150
> [ 2427.803561][T20014] ? do_syscall_64+0x158/0x380
> [ 2427.808610][T20014] ? clear_bhb_loop+0x30/0x80
> [ 2427.813571][T20014] ? clear_bhb_loop+0x30/0x80
> [ 2427.818529][T20014] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [ 2427.824705][T20014] RIP: 0033:0x7f2e60ee280b
> [ 2427.829399][T20014] Code: c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 f3 0f 1e fa 31 f6 e9 05 00 00 00 0f 1f 44 00 00 f3 0f 1e fa b8 a6 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 c9 35 0f 00 f7 d8
> [ 2427.849737][T20014] RSP: 002b:00007ffcad827e98 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
> [ 2427.858513][T20014] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007f2e60ee280b
> [ 2427.866805][T20014] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000015800980
> [ 2427.875102][T20014] RBP: 00007f2e610bdfd4 R08: 0000000000000002 R09: 0000000000000000
> [ 2427.883395][T20014] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000015800648
> [ 2427.891679][T20014] R13: 0000000015800980 R14: 0000000015800540 R15: 0000000000000000
> [ 2427.899968][T20014] </TASK>
> [ 2427.903304][T20014] irq event stamp: 0
> [ 2427.907498][T20014] hardirqs last enabled at (0): [<0000000000000000>] 0x0
> [ 2427.914911][T20014] hardirqs last disabled at (0): [<ffffffffa6557892>] copy_process+0x1862/0x5730
> [ 2427.924331][T20014] softirqs last enabled at (0): [<ffffffffa65578ea>] copy_process+0x18ba/0x5730
> [ 2427.933742][T20014] softirqs last disabled at (0): [<0000000000000000>] 0x0
> [ 2427.941153][T20014] ---[ end trace 0000000000000000 ]---
> [ 2427.947036][T20014] ------------[ cut here ]------------
> [ 2427.953580][T20014] WARNING: CPU: 3 PID: 20014 at fs/btrfs/extent_io.c:3441 release_extent_buffer+0x22f/0x2a0 [btrfs]
> [ 2427.965056][T20014] Modules linked in: scsi_debug btrfs xor raid6_pq xfs target_core_user target_core_mod nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack rfkill nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set nf_tables ip6table_filter ip6_tables iptable_filter ip_tables qrtr irdma ice gnss ib_uverbs sunrpc intel_rapl_msr intel_rapl_common intel_uncore_frequency ib_core intel_uncore_frequency_common skx_edac skx_edac_common nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm spi_nor i40e irqbypass mtd rapl iTCO_wdt intel_pmc_bxt intel_cstate iTCO_vendor_support vfat ses fat intel_uncore enclosure libie mei_me i2c_i801 spi_intel_pci ioatdma i2c_smbus lpc_ich spi_intel mei intel_pch_thermal wmi dca joydev acpi_power_meter acpi_pad fuse loop dm_multipath nfnetlink zram lz4hc_compress lz4_compress
> [ 2427.965264][T20014] zstd_compress ast drm_client_lib i2c_algo_bit drm_shmem_helper drm_kms_helper drm nvme mpi3mr nvme_core polyval_clmulni ghash_clmulni_intel sha512_ssse3 nvme_keyring sha1_ssse3 scsi_transport_sas nvme_auth scsi_dh_rdac scsi_dh_emc scsi_dh_alua pkcs8_key_parser [last unloaded: scsi_debug]
> [ 2428.088867][T20014] CPU: 3 UID: 0 PID: 20014 Comm: umount Tainted: G B W 6.15.0-rc7-next-20250521-kts #1 PREEMPT(lazy)
> [ 2428.102283][T20014] Tainted: [B]=BAD_PAGE, [W]=WARN
> [ 2428.107834][T20014] Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 3.5 05/18/2021
> [ 2428.116715][T20014] RIP: 0010:release_extent_buffer+0x22f/0x2a0 [btrfs]
> [ 2428.124193][T20014] Code: 08 5b 5d 41 5c 41 5d e9 8f 08 06 e6 49 8d 7c 24 40 be ff ff ff ff e8 10 0e 03 e6 85 c0 0f 85 26 fe ff ff 0f 0b e9 1f fe ff ff <0f> 0b e9 61 fe ff ff 48 c7 c7 84 cb 48 ab e8 6e 99 cd e3 e9 f9 fd
> [ 2428.145115][T20014] RSP: 0018:ffff888348297068 EFLAGS: 00010246
> [ 2428.151793][T20014] RAX: 0000000000000000 RBX: ffff8881765d4ba8 RCX: 0000000000000001
> [ 2428.160383][T20014] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff8881765d4ba8
> [ 2428.168974][T20014] RBP: ffff8881765d4ba8 R08: ffffffffc332b4e0 R09: ffffed102ecba975
> [ 2428.177572][T20014] R10: ffffed102ecba976 R11: 0000000000000000 R12: ffff8881765d4b40
> [ 2428.186179][T20014] R13: dffffc0000000000 R14: 000000000000001f R15: ffffed102ecba96a
> [ 2428.194782][T20014] FS: 00007f2e60decb80(0000) GS:ffff889055345000(0000) knlGS:0000000000000000
> [ 2428.204347][T20014] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 2428.211586][T20014] CR2: 0000000031894360 CR3: 00000001b3fa9002 CR4: 00000000007726f0
> [ 2428.220219][T20014] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 2428.228853][T20014] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 2428.237481][T20014] PKRU: 55555554
> [ 2428.241671][T20014] Call Trace:
> [ 2428.245578][T20014] <TASK>
> [ 2428.249132][T20014] free_extent_buffer+0x1e6/0x2b0 [btrfs]
> [ 2428.255616][T20014] ? __pfx_free_extent_buffer+0x10/0x10 [btrfs]
> [ 2428.262619][T20014] ? btrfs_check_meta_write_pointer+0x243/0x5a0 [btrfs]
> [ 2428.270315][T20014] btree_write_cache_pages+0x40f/0x950 [btrfs]
> [ 2428.277241][T20014] ? __pfx_btree_write_cache_pages+0x10/0x10 [btrfs]
> [ 2428.284682][T20014] ? unwind_get_return_address+0x6b/0xe0
> [ 2428.290928][T20014] ? kasan_save_stack+0x3f/0x50
> [ 2428.296394][T20014] ? kasan_save_stack+0x30/0x50
> [ 2428.301841][T20014] ? kasan_save_track+0x14/0x30
> [ 2428.307293][T20014] ? kasan_save_free_info+0x3b/0x70
> [ 2428.313092][T20014] ? __kasan_slab_free+0x52/0x70
> [ 2428.318622][T20014] ? kmem_cache_free+0x1a1/0x580
> [ 2428.324156][T20014] ? btrfs_convert_extent_bit+0x97e/0xfd0 [btrfs]
> [ 2428.331314][T20014] ? btrfs_write_marked_extents+0x17b/0x230 [btrfs]
> [ 2428.338626][T20014] ? btrfs_write_and_wait_transaction+0xdb/0x1d0 [btrfs]
> [ 2428.346364][T20014] ? btrfs_commit_transaction+0x163a/0x30b0 [btrfs]
> [ 2428.353658][T20014] do_writepages+0x21e/0x560
> [ 2428.358785][T20014] ? __pfx_do_writepages+0x10/0x10
> [ 2428.364431][T20014] ? _raw_spin_unlock+0x23/0x40
> [ 2428.369784][T20014] ? wbc_attach_and_unlock_inode.part.0+0x388/0x730
> [ 2428.376917][T20014] filemap_fdatawrite_wbc+0xd2/0x120
> [ 2428.382733][T20014] __filemap_fdatawrite_range+0xa7/0xe0
> [ 2428.388789][T20014] ? __pfx___filemap_fdatawrite_range+0x10/0x10
> [ 2428.395550][T20014] btrfs_write_marked_extents+0xf7/0x230 [btrfs]
> [ 2428.402529][T20014] ? __pfx_btrfs_write_marked_extents+0x10/0x10 [btrfs]
> [ 2428.410091][T20014] ? __pfx___mutex_lock+0x10/0x10
> [ 2428.415586][T20014] btrfs_write_and_wait_transaction+0xdb/0x1d0 [btrfs]
> [ 2428.423028][T20014] ? do_raw_spin_lock+0x128/0x270
> [ 2428.428491][T20014] ? __pfx_btrfs_write_and_wait_transaction+0x10/0x10 [btrfs]
> [ 2428.436543][T20014] ? trace_hardirqs_on+0x18/0x150
> [ 2428.442000][T20014] ? _raw_spin_unlock_irqrestore+0x44/0x60
> [ 2428.448238][T20014] btrfs_commit_transaction+0x163a/0x30b0 [btrfs]
> [ 2428.455225][T20014] ? start_transaction+0x520/0x1520 [btrfs]
> [ 2428.461655][T20014] ? __pfx_btrfs_commit_transaction+0x10/0x10 [btrfs]
> [ 2428.468948][T20014] ? btrfs_attach_transaction_barrier+0x25/0xa0 [btrfs]
> [ 2428.476440][T20014] sync_filesystem+0x177/0x220
> [ 2428.481589][T20014] generic_shutdown_super+0x79/0x320
> [ 2428.487264][T20014] kill_anon_super+0x3a/0x60
> [ 2428.492238][T20014] btrfs_kill_super+0x3e/0x60 [btrfs]
> [ 2428.498133][T20014] deactivate_locked_super+0xa8/0x160
> [ 2428.503869][T20014] cleanup_mnt+0x1da/0x410
> [ 2428.508648][T20014] task_work_run+0x116/0x200
> [ 2428.513592][T20014] ? __pfx_task_work_run+0x10/0x10
> [ 2428.519052][T20014] ? __x64_sys_umount+0x10c/0x140
> [ 2428.524436][T20014] ? __pfx___x64_sys_umount+0x10/0x10
> [ 2428.530159][T20014] exit_to_user_mode_loop+0x135/0x160
> [ 2428.535880][T20014] do_syscall_64+0x223/0x380
> [ 2428.540827][T20014] ? trace_hardirqs_on+0x18/0x150
> [ 2428.546208][T20014] ? kasan_save_track+0x14/0x30
> [ 2428.551416][T20014] ? kasan_quarantine_put+0xf5/0x240
> [ 2428.557040][T20014] ? kmem_cache_free+0x1a1/0x580
> [ 2428.562323][T20014] ? __x64_sys_statx+0x141/0x1b0
> [ 2428.567597][T20014] ? __x64_sys_statx+0x141/0x1b0
> [ 2428.572859][T20014] ? trace_hardirqs_on_prepare+0x101/0x150
> [ 2428.578986][T20014] ? do_syscall_64+0x158/0x380
> [ 2428.584094][T20014] ? trace_hardirqs_on+0x18/0x150
> [ 2428.589468][T20014] ? kasan_save_track+0x14/0x30
> [ 2428.594652][T20014] ? kasan_quarantine_put+0xf5/0x240
> [ 2428.600282][T20014] ? kmem_cache_free+0x1a1/0x580
> [ 2428.605557][T20014] ? __x64_sys_statx+0x141/0x1b0
> [ 2428.610827][T20014] ? __x64_sys_statx+0x141/0x1b0
> [ 2428.616095][T20014] ? trace_hardirqs_on_prepare+0x101/0x150
> [ 2428.622232][T20014] ? do_syscall_64+0x158/0x380
> [ 2428.627328][T20014] ? clear_bhb_loop+0x30/0x80
> [ 2428.632328][T20014] ? clear_bhb_loop+0x30/0x80
> [ 2428.637329][T20014] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [ 2428.643530][T20014] RIP: 0033:0x7f2e60ee280b
> [ 2428.648266][T20014] Code: c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 f3 0f 1e fa 31 f6 e9 05 00 00 00 0f 1f 44 00 00 f3 0f 1e fa b8 a6 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 c9 35 0f 00 f7 d8
> [ 2428.668680][T20014] RSP: 002b:00007ffcad827e98 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
> [ 2428.677460][T20014] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007f2e60ee280b
> [ 2428.685785][T20014] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000015800980
> [ 2428.694123][T20014] RBP: 00007f2e610bdfd4 R08: 0000000000000002 R09: 0000000000000000
> [ 2428.702459][T20014] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000015800648
> [ 2428.710783][T20014] R13: 0000000015800980 R14: 0000000015800540 R15: 0000000000000000
> [ 2428.719116][T20014] </TASK>
> [ 2428.722494][T20014] irq event stamp: 0
> [ 2428.726720][T20014] hardirqs last enabled at (0): [<0000000000000000>] 0x0
> [ 2428.734179][T20014] hardirqs last disabled at (0): [<ffffffffa6557892>] copy_process+0x1862/0x5730
> [ 2428.743627][T20014] softirqs last enabled at (0): [<ffffffffa65578ea>] copy_process+0x18ba/0x5730
> [ 2428.753082][T20014] softirqs last disabled at (0): [<0000000000000000>] 0x0
> [ 2428.760539][T20014] ---[ end trace 0000000000000000 ]---
> [ 2428.766495][T20014] ------------[ cut here ]------------
> [ 2428.772550][T20014] WARNING: CPU: 0 PID: 20014 at fs/btrfs/extent_io.c:3441 release_extent_buffer+0x22f/0x2a0 [btrfs]
> [ 2428.784046][T20014] Modules linked in: scsi_debug btrfs xor raid6_pq xfs target_core_user target_core_mod nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack rfkill nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set nf_tables ip6table_filter ip6_tables iptable_filter ip_tables qrtr irdma ice gnss ib_uverbs sunrpc intel_rapl_msr intel_rapl_common intel_uncore_frequency ib_core intel_uncore_frequency_common skx_edac skx_edac_common nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm spi_nor i40e irqbypass mtd rapl iTCO_wdt intel_pmc_bxt intel_cstate iTCO_vendor_support vfat ses fat intel_uncore enclosure libie mei_me i2c_i801 spi_intel_pci ioatdma i2c_smbus lpc_ich spi_intel mei intel_pch_thermal wmi dca joydev acpi_power_meter acpi_pad fuse loop dm_multipath nfnetlink zram lz4hc_compress lz4_compress
> [ 2428.784308][T20014] zstd_compress ast drm_client_lib i2c_algo_bit drm_shmem_helper drm_kms_helper drm nvme mpi3mr nvme_core polyval_clmulni ghash_clmulni_intel sha512_ssse3 nvme_keyring sha1_ssse3 scsi_transport_sas nvme_auth scsi_dh_rdac scsi_dh_emc scsi_dh_alua pkcs8_key_parser [last unloaded: scsi_debug]
> [ 2428.910058][T20014] CPU: 0 UID: 0 PID: 20014 Comm: umount Tainted: G B W 6.15.0-rc7-next-20250521-kts #1 PREEMPT(lazy)
> [ 2428.923575][T20014] Tainted: [B]=BAD_PAGE, [W]=WARN
> [ 2428.929199][T20014] Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 3.5 05/18/2021
> [ 2428.938139][T20014] RIP: 0010:release_extent_buffer+0x22f/0x2a0 [btrfs]
> [ 2428.945688][T20014] Code: 08 5b 5d 41 5c 41 5d e9 8f 08 06 e6 49 8d 7c 24 40 be ff ff ff ff e8 10 0e 03 e6 85 c0 0f 85 26 fe ff ff 0f 0b e9 1f fe ff ff <0f> 0b e9 61 fe ff ff 48 c7 c7 84 cb 48 ab e8 6e 99 cd e3 e9 f9 fd
> [ 2428.966691][T20014] RSP: 0018:ffff888348297068 EFLAGS: 00010246
> [ 2428.973501][T20014] RAX: 0000000000000000 RBX: ffff8881765d47e8 RCX: 0000000000000001
> [ 2428.982152][T20014] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff8881765d47e8
> [ 2428.990816][T20014] RBP: ffff8881765d47e8 R08: ffffffffc332b4e0 R09: ffffed102ecba8fd
> [ 2428.999472][T20014] R10: ffffed102ecba8fe R11: 0000000000000000 R12: ffff8881765d4780
> [ 2429.008137][T20014] R13: dffffc0000000000 R14: 000000000000001f R15: ffffed102ecba8f2
> [ 2429.016801][T20014] FS: 00007f2e60decb80(0000) GS:ffff8890551c5000(0000) knlGS:0000000000000000
> [ 2429.026431][T20014] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 2429.033725][T20014] CR2: 0000000031d2e004 CR3: 00000001b3fa9005 CR4: 00000000007726f0
> [ 2429.042418][T20014] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 2429.051112][T20014] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 2429.059798][T20014] PKRU: 55555554
> [ 2429.064035][T20014] Call Trace:
> [ 2429.067989][T20014] <TASK>
> [ 2429.071603][T20014] free_extent_buffer+0x1e6/0x2b0 [btrfs]
> [ 2429.078167][T20014] ? __pfx_free_extent_buffer+0x10/0x10 [btrfs]
> [ 2429.085241][T20014] ? btrfs_check_meta_write_pointer+0x243/0x5a0 [btrfs]
> [ 2429.092991][T20014] btree_write_cache_pages+0x40f/0x950 [btrfs]
> [ 2429.099983][T20014] ? __pfx_btree_write_cache_pages+0x10/0x10 [btrfs]
> [ 2429.107513][T20014] ? unwind_get_return_address+0x6b/0xe0
> [ 2429.113833][T20014] ? kasan_save_stack+0x3f/0x50
> [ 2429.119359][T20014] ? kasan_save_stack+0x30/0x50
> [ 2429.124860][T20014] ? kasan_save_track+0x14/0x30
> [ 2429.130368][T20014] ? kasan_save_free_info+0x3b/0x70
> [ 2429.136228][T20014] ? __kasan_slab_free+0x52/0x70
> [ 2429.141820][T20014] ? kmem_cache_free+0x1a1/0x580
> [ 2429.147405][T20014] ? btrfs_convert_extent_bit+0x97e/0xfd0 [btrfs]
> [ 2429.154630][T20014] ? btrfs_write_marked_extents+0x17b/0x230 [btrfs]
> [ 2429.161997][T20014] ? btrfs_write_and_wait_transaction+0xdb/0x1d0 [btrfs]
> [ 2429.169809][T20014] ? btrfs_commit_transaction+0x163a/0x30b0 [btrfs]
> [ 2429.177167][T20014] do_writepages+0x21e/0x560
> [ 2429.182366][T20014] ? __pfx_do_writepages+0x10/0x10
> [ 2429.188047][T20014] ? _raw_spin_unlock+0x23/0x40
> [ 2429.193468][T20014] ? wbc_attach_and_unlock_inode.part.0+0x388/0x730
> [ 2429.200618][T20014] filemap_fdatawrite_wbc+0xd2/0x120
> [ 2429.206451][T20014] __filemap_fdatawrite_range+0xa7/0xe0
> [ 2429.212519][T20014] ? __pfx___filemap_fdatawrite_range+0x10/0x10
> [ 2429.219276][T20014] btrfs_write_marked_extents+0xf7/0x230 [btrfs]
> [ 2429.226248][T20014] ? __pfx_btrfs_write_marked_extents+0x10/0x10 [btrfs]
> [ 2429.233805][T20014] ? __pfx___mutex_lock+0x10/0x10
> [ 2429.239296][T20014] btrfs_write_and_wait_transaction+0xdb/0x1d0 [btrfs]
> [ 2429.246736][T20014] ? do_raw_spin_lock+0x128/0x270
> [ 2429.252202][T20014] ? __pfx_btrfs_write_and_wait_transaction+0x10/0x10 [btrfs]
> [ 2429.260249][T20014] ? trace_hardirqs_on+0x18/0x150
> [ 2429.265705][T20014] ? _raw_spin_unlock_irqrestore+0x44/0x60
> [ 2429.271931][T20014] btrfs_commit_transaction+0x163a/0x30b0 [btrfs]
> [ 2429.278924][T20014] ? start_transaction+0x520/0x1520 [btrfs]
> [ 2429.285377][T20014] ? __pfx_btrfs_commit_transaction+0x10/0x10 [btrfs]
> [ 2429.292675][T20014] ? btrfs_attach_transaction_barrier+0x25/0xa0 [btrfs]
> [ 2429.300156][T20014] sync_filesystem+0x177/0x220
> [ 2429.305312][T20014] generic_shutdown_super+0x79/0x320
> [ 2429.310974][T20014] kill_anon_super+0x3a/0x60
> [ 2429.315937][T20014] btrfs_kill_super+0x3e/0x60 [btrfs]
> [ 2429.321835][T20014] deactivate_locked_super+0xa8/0x160
> [ 2429.327575][T20014] cleanup_mnt+0x1da/0x410
> [ 2429.332368][T20014] task_work_run+0x116/0x200
> [ 2429.337317][T20014] ? __pfx_task_work_run+0x10/0x10
> [ 2429.342774][T20014] ? __x64_sys_umount+0x10c/0x140
> [ 2429.348165][T20014] ? __pfx___x64_sys_umount+0x10/0x10
> [ 2429.353881][T20014] exit_to_user_mode_loop+0x135/0x160
> [ 2429.359599][T20014] do_syscall_64+0x223/0x380
> [ 2429.364547][T20014] ? trace_hardirqs_on+0x18/0x150
> [ 2429.369920][T20014] ? kasan_save_track+0x14/0x30
> [ 2429.375545][T20014] ? kasan_quarantine_put+0xf5/0x240
> [ 2429.381614][T20014] ? kmem_cache_free+0x1a1/0x580
> [ 2429.386910][T20014] ? __x64_sys_statx+0x141/0x1b0
> [ 2429.392206][T20014] ? __x64_sys_statx+0x141/0x1b0
> [ 2429.397479][T20014] ? trace_hardirqs_on_prepare+0x101/0x150
> [ 2429.403609][T20014] ? do_syscall_64+0x158/0x380
> [ 2429.408709][T20014] ? trace_hardirqs_on+0x18/0x150
> [ 2429.414072][T20014] ? kasan_save_track+0x14/0x30
> [ 2429.419265][T20014] ? kasan_quarantine_put+0xf5/0x240
> [ 2429.424881][T20014] ? kmem_cache_free+0x1a1/0x580
> [ 2429.430168][T20014] ? __x64_sys_statx+0x141/0x1b0
> [ 2429.435450][T20014] ? __x64_sys_statx+0x141/0x1b0
> [ 2429.440710][T20014] ? trace_hardirqs_on_prepare+0x101/0x150
> [ 2429.446839][T20014] ? do_syscall_64+0x158/0x380
> [ 2429.451922][T20014] ? clear_bhb_loop+0x30/0x80
> [ 2429.456919][T20014] ? clear_bhb_loop+0x30/0x80
> [ 2429.461908][T20014] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [ 2429.468130][T20014] RIP: 0033:0x7f2e60ee280b
> [ 2429.472858][T20014] Code: c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 f3 0f 1e fa 31 f6 e9 05 00 00 00 0f 1f 44 00 00 f3 0f 1e fa b8 a6 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 c9 35 0f 00 f7 d8
> [ 2429.493303][T20014] RSP: 002b:00007ffcad827e98 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
> [ 2429.502091][T20014] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007f2e60ee280b
> [ 2429.510437][T20014] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000015800980
> [ 2429.518761][T20014] RBP: 00007f2e610bdfd4 R08: 0000000000000002 R09: 0000000000000000
> [ 2429.527098][T20014] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000015800648
> [ 2429.535424][T20014] R13: 0000000015800980 R14: 0000000015800540 R15: 0000000000000000
> [ 2429.543751][T20014] </TASK>
> [ 2429.547133][T20014] irq event stamp: 0
> [ 2429.551395][T20014] hardirqs last enabled at (0): [<0000000000000000>] 0x0
> [ 2429.558844][T20014] hardirqs last disabled at (0): [<ffffffffa6557892>] copy_process+0x1862/0x5730
> [ 2429.568313][T20014] softirqs last enabled at (0): [<ffffffffa65578ea>] copy_process+0x18ba/0x5730
> [ 2429.577761][T20014] softirqs last disabled at (0): [<0000000000000000>] 0x0
> [ 2429.585219][T20014] ---[ end trace 0000000000000000 ]---
> ...
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v5 3/3] btrfs: use buffer xarray for extent buffer writeback operations
2025-05-26 4:20 ` Qu Wenruo
@ 2025-05-26 6:53 ` Shinichiro Kawasaki
2025-05-28 23:25 ` David Sterba
0 siblings, 1 reply; 9+ messages in thread
From: Shinichiro Kawasaki @ 2025-05-26 6:53 UTC (permalink / raw)
To: Qu Wenruo
Cc: Josef Bacik, linux-btrfs@vger.kernel.org, kernel-team@fb.com,
Filipe Manana, Johannes Thumshirn, Naohiro Aota, Damien Le Moal
On May 26, 2025 / 13:50, Qu Wenruo wrote:
>
>
> On 2025/5/26 10:47, Shinichiro Kawasaki wrote:
> > On Apr 28, 2025 / 10:52, Josef Bacik wrote:
> > > Currently we have this ugly back and forth with the btree writeback
> > > where we find the folio, find the eb associated with that folio, and
> > > then attempt to writeback. This results in two different paths for
> > > subpage eb's and >= pagesize eb's.
> > >
> > > Clean this up by adding our own infrastructure around looking up tag'ed
> > > eb's and writing the eb's out directly. This allows us to unify the
> > > subpage and >= pagesize IO paths, resulting in a much cleaner writeback
> > > path for extent buffers.
> >
> > [...]
> >
> > When I ran blktests on the for-next kernel with the tag next-20250521, I
> > observed the test case zbd/009 failed with repeated WARNs at
> > release_extent_buffer() [1].
>
> Unfortunately that's a known bug, fixed by this patch:
>
> https://lore.kernel.org/linux-btrfs/b964b92f482453cbd122743995ff23aa7158b2cb.1747677774.git.josef@toxicpanda.com/
>
Ah, thank you for letting me know. I confirmed that the fix avoids the failure
on my test system.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v5 3/3] btrfs: use buffer xarray for extent buffer writeback operations
2025-05-26 6:53 ` Shinichiro Kawasaki
@ 2025-05-28 23:25 ` David Sterba
0 siblings, 0 replies; 9+ messages in thread
From: David Sterba @ 2025-05-28 23:25 UTC (permalink / raw)
To: Shinichiro Kawasaki
Cc: Qu Wenruo, Josef Bacik, linux-btrfs@vger.kernel.org,
kernel-team@fb.com, Filipe Manana, Johannes Thumshirn,
Naohiro Aota, Damien Le Moal
On Mon, May 26, 2025 at 06:53:02AM +0000, Shinichiro Kawasaki wrote:
> On May 26, 2025 / 13:50, Qu Wenruo wrote:
> >
> >
> > On 2025/5/26 10:47, Shinichiro Kawasaki wrote:
> > > On Apr 28, 2025 / 10:52, Josef Bacik wrote:
> > > > Currently we have this ugly back and forth with the btree writeback
> > > > where we find the folio, find the eb associated with that folio, and
> > > > then attempt to writeback. This results in two different paths for
> > > > subpage eb's and >= pagesize eb's.
> > > >
> > > > Clean this up by adding our own infrastructure around looking up tag'ed
> > > > eb's and writing the eb's out directly. This allows us to unify the
> > > > subpage and >= pagesize IO paths, resulting in a much cleaner writeback
> > > > path for extent buffers.
> > >
> > > [...]
> > >
> > > When I ran blktests on the for-next kernel with the tag next-20250521, I
> > > observed the test case zbd/009 failed with repeated WARNs at
> > > release_extent_buffer() [1].
> >
> > Unfortunately that's a known bug, fixed by this patch:
> >
> > https://lore.kernel.org/linux-btrfs/b964b92f482453cbd122743995ff23aa7158b2cb.1747677774.git.josef@toxicpanda.com/
> >
>
> Ah, thank you for letting me know. I confirmed that the fix avoids the failure
> on my test system.
FYI, the fix is now in master.
^ permalink raw reply [flat|nested] 9+ messages in thread