* btrfs/071 is unhappy on 6.18-rc2
From: Christoph Hellwig @ 2025-10-20 7:22 UTC
To: linux-btrfs
I just kicked off a baseline run with the xfstests volume group and
a SCRATCH_DEV_POOL with 5 virtual nvme devices to test a VFS change that
affects it a little, and it does not seem too happy.
btrfs/071 gets into slab poisoning:
[ 279.241695] BTRFS info (device nvme1n1 state M): use zlib compression, level 3
[ 279.247651] Oops: general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6d73:I
[ 279.250656] CPU: 1 UID: 0 PID: 82037 Comm: btrfs-cleaner Tainted: GN 6.18.0-rc2
[ 279.250656] Tainted: [N]=TEST
[ 279.250656] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/4
[ 279.250656] RIP: 0010:btrfs_kill_all_delayed_nodes+0x145/0x1e0
[ 279.250656] Code: 08 48 c1 e5 03 4b 8b 5c 3d 00 48 89 df e8 23 d0 ff ff 48 85 db 74 0f 4a 8d 54 3c0
[ 279.250656] RSP: 0018:ffffc9000138bdc0 EFLAGS: 00010246
[ 279.250656] RAX: 6b6b6b6b6b6b6b6b RBX: ffff88810dad3d58 RCX: 0000000000000000
[ 279.250656] RDX: 0000000000000001 RSI: 0000000000000286 RDI: 00000000ffffffff
[ 279.250656] RBP: 0000000000000008 R08: ffff88810dad3f60 R09: ffff88810dad3f10
[ 279.250656] R10: 000000000000000d R11: ffff88810dad3d58 R12: ffff88811bdfbc18
[ 279.250656] R13: ffffc9000138bdc8 R14: ffff88811bdfb800 R15: 0000000000000000
[ 279.250656] FS: 0000000000000000(0000) GS:ffff8883ef66a000(0000) knlGS:0000000000000000
[ 279.250656] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 279.250656] CR2: 00007fcba10194c8 CR3: 00000001217a3002 CR4: 0000000000772ef0
[ 279.250656] PKRU: 55555554
[ 279.250656] Call Trace:
[ 279.250656] <TASK>
[ 279.250656] ? __schedule+0x52c/0xb60
[ 279.250656] btrfs_clean_one_deleted_snapshot+0x72/0x100
[ 279.250656] cleaner_kthread+0xd3/0x150
[ 279.250656] ? __pfx_cleaner_kthread+0x10/0x10
[ 279.250656] kthread+0x109/0x220
[ 279.250656] ? __pfx_kthread+0x10/0x10
[ 279.250656] ? __pfx_kthread+0x10/0x10
[ 279.250656] ret_from_fork+0x120/0x160
[ 279.250656] ? __pfx_kthread+0x10/0x10
[ 279.250656] ret_from_fork_asm+0x1a/0x30
[ 279.250656] </TASK>
[ 279.250656] Modules linked in: kvm_intel kvm irqbypass
[ 279.277534] ---[ end trace 0000000000000000 ]---
Similar things repeat a few times, and then it loops basically forever
doing device replacements. I waited for 30 minutes before killing it.
* Re: btrfs/071 is unhappy on 6.18-rc2
From: Qu Wenruo @ 2025-10-20 9:11 UTC
To: Christoph Hellwig, linux-btrfs
On 2025/10/20 17:52, Christoph Hellwig wrote:
> I just kicked off a baseline run with the xfstests volume group and
> a SCRATCH_DEV_POOL with 5 virtual nvme devices to test a VFS change that
> affects it a little, and it does not seem too happy.
>
> btrfs/071 gets into slab poisoning:
>
> [ 279.241695] BTRFS info (device nvme1n1 state M): use zlib compression, level 3
> [ 279.247651] Oops: general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6d73:I
> [ 279.250656] CPU: 1 UID: 0 PID: 82037 Comm: btrfs-cleaner Tainted: GN 6.18.0-rc2
> [ 279.250656] Tainted: [N]=TEST
> [ 279.250656] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/4
> [ 279.250656] RIP: 0010:btrfs_kill_all_delayed_nodes+0x145/0x1e0
Any line number/context and reproducibility?
> [ 279.250656] Code: 08 48 c1 e5 03 4b 8b 5c 3d 00 48 89 df e8 23 d0 ff ff 48 85 db 74 0f 4a 8d 54 3c0
> [ 279.250656] RSP: 0018:ffffc9000138bdc0 EFLAGS: 00010246
> [ 279.250656] RAX: 6b6b6b6b6b6b6b6b
This looks like POISON_FREE, so some use-after-free bug?
If you're able to reproduce it, would you mind trying KASAN?
I just checked my logs, and no btrfs/071 failures are recorded yet (but
those runs were not on upstream rc2).
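The POISON_FREE reading can be checked by hand. The sketch below is plain userspace C, not kernel code, and the helper names are hypothetical. It assumes the slab poison byte 0x6b (POISON_FREE in include/linux/poison.h) and computes how far the faulting address in the oops lies from a fully poisoned pointer value.

```c
#include <stdint.h>

/* Byte the slab allocator writes over freed objects when poisoning is on. */
#define POISON_FREE 0x6b

/* Value a pointer-sized load returns when it reads poisoned memory. */
static uint64_t poison_word(void)
{
	uint64_t w = 0;

	for (int i = 0; i < 8; i++)
		w = (w << 8) | POISON_FREE;
	return w;
}

/* Distance of a faulting address from the poisoned pointer value. */
static uint64_t poison_offset(uint64_t fault_addr)
{
	return fault_addr - poison_word();
}
```

With the values from the oops, poison_word() equals the RAX value 0x6b6b6b6b6b6b6b6b, and poison_offset(0x6b6b6b6b6b6b6d73) is 0x208. That is consistent with a pointer loaded from freed memory being dereferenced at a member offset of 0x208, although the oops alone does not say which member.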
Thanks,
Qu
> RBX: ffff88810dad3d58 RCX: 0000000000000000
> [ 279.250656] RDX: 0000000000000001 RSI: 0000000000000286 RDI: 00000000ffffffff
> [ 279.250656] RBP: 0000000000000008 R08: ffff88810dad3f60 R09: ffff88810dad3f10
> [ 279.250656] R10: 000000000000000d R11: ffff88810dad3d58 R12: ffff88811bdfbc18
> [ 279.250656] R13: ffffc9000138bdc8 R14: ffff88811bdfb800 R15: 0000000000000000
> [ 279.250656] FS: 0000000000000000(0000) GS:ffff8883ef66a000(0000) knlGS:0000000000000000
> [ 279.250656] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 279.250656] CR2: 00007fcba10194c8 CR3: 00000001217a3002 CR4: 0000000000772ef0
> [ 279.250656] PKRU: 55555554
> [ 279.250656] Call Trace:
> [ 279.250656] <TASK>
> [ 279.250656] ? __schedule+0x52c/0xb60
> [ 279.250656] btrfs_clean_one_deleted_snapshot+0x72/0x100
> [ 279.250656] cleaner_kthread+0xd3/0x150
> [ 279.250656] ? __pfx_cleaner_kthread+0x10/0x10
> [ 279.250656] kthread+0x109/0x220
> [ 279.250656] ? __pfx_kthread+0x10/0x10
> [ 279.250656] ? __pfx_kthread+0x10/0x10
> [ 279.250656] ret_from_fork+0x120/0x160
> [ 279.250656] ? __pfx_kthread+0x10/0x10
> [ 279.250656] ret_from_fork_asm+0x1a/0x30
> [ 279.250656] </TASK>
> [ 279.250656] Modules linked in: kvm_intel kvm irqbypass
> [ 279.277534] ---[ end trace 0000000000000000 ]---
>
> similar things repeat a few times, and then it loops basically forever
> doing device replacements, I waited for 30 minutes before killing it.
>
* Re: btrfs/071 is unhappy on 6.18-rc2
From: Christoph Hellwig @ 2025-10-20 9:46 UTC
To: Qu Wenruo; +Cc: Christoph Hellwig, linux-btrfs
On Mon, Oct 20, 2025 at 07:41:03PM +1030, Qu Wenruo wrote:
> > [ 279.247651] Oops: general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6d73:I
> > [ 279.250656] CPU: 1 UID: 0 PID: 82037 Comm: btrfs-cleaner Tainted: GN 6.18.0-rc2
> > [ 279.250656] Tainted: [N]=TEST
> > [ 279.250656] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/4
> > [ 279.250656] RIP: 0010:btrfs_kill_all_delayed_nodes+0x145/0x1e0
>
> Any line number/context
Nope, that's it. Last lines before are:
[ 62.492209] BTRFS info (device nvme1n1): first mount of filesystem 975f6fd4-b50f-4f3d-8112-319c
[ 62.492520] BTRFS info (device nvme1n1): using crc32c (crc32c-lib) checksum algorithm
[ 62.510951] BTRFS info (device nvme1n1): checking UUID tree
[ 62.511230] BTRFS info (device nvme1n1): enabling ssd optimizations
[ 62.511452] BTRFS info (device nvme1n1): turning on async discard
[ 62.511728] BTRFS info (device nvme1n1): enabling free space tree
[ 62.642011] BTRFS info (device nvme1n1 state M): use zlib compression, level 3
> and reproducibility?
100% over a few runs.
> If you're able to reproduce, mind to try KASAN?
> As I just checked my logs, no failures on btrfs/071 recorded yet (but not on
> upstream rc2 yet)
A bit busy right now, but I'll try to do a KASAN run later.
* Re: btrfs/071 is unhappy on 6.18-rc2
From: Qu Wenruo @ 2025-10-20 10:26 UTC
To: Christoph Hellwig; +Cc: linux-btrfs
On 2025/10/20 20:16, Christoph Hellwig wrote:
> On Mon, Oct 20, 2025 at 07:41:03PM +1030, Qu Wenruo wrote:
>>> [ 279.247651] Oops: general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6d73:I
>>> [ 279.250656] CPU: 1 UID: 0 PID: 82037 Comm: btrfs-cleaner Tainted: GN 6.18.0-rc2
>>> [ 279.250656] Tainted: [N]=TEST
>>> [ 279.250656] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/4
>>> [ 279.250656] RIP: 0010:btrfs_kill_all_delayed_nodes+0x145/0x1e0
>>
>> Any line number/context
>
> Nope, that's it. Last lines before are:
I mean the code line number extracted from that RIP.
But I'll try to reproduce it once the recent direct IO problem is solved.
Thanks,
Qu
>
> [ 62.492209] BTRFS info (device nvme1n1): first mount of filesystem 975f6fd4-b50f-4f3d-8112-319c
> [ 62.492520] BTRFS info (device nvme1n1): using crc32c (crc32c-lib) checksum algorithm
> [ 62.510951] BTRFS info (device nvme1n1): checking UUID tree
> [ 62.511230] BTRFS info (device nvme1n1): enabling ssd optimizations
> [ 62.511452] BTRFS info (device nvme1n1): turning on async discard
> [ 62.511728] BTRFS info (device nvme1n1): enabling free space tree
> [ 62.642011] BTRFS info (device nvme1n1 state M): use zlib compression, level 3
>
>> and reproducibility?
>
> 100% over a few runs.
>
>> If you're able to reproduce, mind to try KASAN?
>> As I just checked my logs, no failures on btrfs/071 recorded yet (but not on
>> upstream rc2 yet)
>
> A bit busy right now, but I'll try to do a KASAN run later.
>
>
* Re: btrfs/071 is unhappy on 6.18-rc2
From: Christoph Hellwig @ 2025-10-20 14:19 UTC
To: Qu Wenruo; +Cc: Christoph Hellwig, linux-btrfs
KASAN output:
[ 75.341543] ==================================================================
[ 75.341824] BUG: KASAN: slab-use-after-free in btrfs_kill_all_delayed_nodes+0x46f/0x4c0
[ 75.342082] Read of size 8 at addr ffff88812389f380 by task btrfs-cleaner/4493
[ 75.342310]
[ 75.342369] CPU: 1 UID: 0 PID: 4493 Comm: btrfs-cleaner Tainted: G N 6.18.0-rc2+ #4115 PREEMPT(f
[ 75.342372] Tainted: [N]=TEST
[ 75.342373] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 75.342374] Call Trace:
[ 75.342375] <TASK>
[ 75.342376] dump_stack_lvl+0x4b/0x70
[ 75.342379] print_report+0x174/0x4e7
[ 75.342382] ? __virt_addr_valid+0x1bb/0x2f0
[ 75.342384] ? btrfs_kill_all_delayed_nodes+0x46f/0x4c0
[ 75.342385] kasan_report+0xd2/0x100
[ 75.342387] ? btrfs_kill_all_delayed_nodes+0x46f/0x4c0
[ 75.342388] btrfs_kill_all_delayed_nodes+0x46f/0x4c0
[ 75.342389] ? _raw_spin_unlock+0x13/0x30
[ 75.342392] ? __pfx_btrfs_kill_all_delayed_nodes+0x10/0x10
[ 75.342393] ? do_raw_spin_lock+0x128/0x260
[ 75.342395] ? __pfx_do_raw_spin_lock+0x10/0x10
[ 75.342397] ? list_lru_add_obj+0xfb/0x1a0
[ 75.342399] ? do_raw_spin_lock+0x128/0x260
[ 75.342401] ? __pfx_do_raw_spin_lock+0x10/0x10
[ 75.342402] btrfs_clean_one_deleted_snapshot+0x143/0x370
[ 75.342405] cleaner_kthread+0x1ee/0x300
[ 75.342406] ? __pfx_cleaner_kthread+0x10/0x10
[ 75.342407] kthread+0x37f/0x6f0
[ 75.342409] ? __pfx_kthread+0x10/0x10
[ 75.342411] ? __pfx_kthread+0x10/0x10
[ 75.342412] ? __pfx_kthread+0x10/0x10
[ 75.342413] ret_from_fork+0x17d/0x240
[ 75.342415] ? __pfx_kthread+0x10/0x10
[ 75.342416] ret_from_fork_asm+0x1a/0x30
[ 75.342419] </TASK>
[ 75.342419]
[ 75.345517] Allocated by task 4527:
[ 75.345517] kasan_save_stack+0x22/0x40
[ 75.345517] kasan_save_track+0x14/0x30
[ 75.345517] __kasan_slab_alloc+0x6e/0x70
[ 75.345517] kmem_cache_alloc_noprof+0x14c/0x400
[ 75.345517] btrfs_get_or_create_delayed_node+0x9e/0x9e0
[ 75.345517] btrfs_insert_delayed_dir_index+0xe4/0x8a0
[ 75.345517] btrfs_insert_dir_item+0x4c1/0x720
[ 75.345517] btrfs_add_link+0x173/0xa30
[ 75.345517] btrfs_create_new_inode+0x1551/0x2650
[ 75.345517] btrfs_create_common+0x17b/0x200
[ 75.345517] vfs_mknod+0x3a7/0x600
[ 75.345517] do_mknodat+0x34e/0x520
[ 75.345517] __x64_sys_mknodat+0xaa/0xe0
[ 75.345517] do_syscall_64+0x50/0xfa0
[ 75.345517] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 75.345517]
[ 75.345517] Freed by task 4493:
[ 75.345517] kasan_save_stack+0x22/0x40
[ 75.345517] kasan_save_track+0x14/0x30
[ 75.345517] __kasan_save_free_info+0x3b/0x70
[ 75.345517] __kasan_slab_free+0x43/0x70
[ 75.345517] kmem_cache_free+0x172/0x610
[ 75.345517] btrfs_kill_all_delayed_nodes+0x2db/0x4c0
[ 75.345517] btrfs_clean_one_deleted_snapshot+0x143/0x370
[ 75.345517] cleaner_kthread+0x1ee/0x300
[ 75.345517] kthread+0x37f/0x6f0
[ 75.345517] ret_from_fork+0x17d/0x240
[ 75.345517] ret_from_fork_asm+0x1a/0x30
[ 75.345517]
[ 75.345517] The buggy address belongs to the object at ffff88812389f370
[ 75.345517] which belongs to the cache btrfs_delayed_node of size 440
[ 75.345517] The buggy address is located 16 bytes inside of
[ 75.345517] freed 440-byte region [ffff88812389f370, ffff88812389f528)
[ 75.345517]
[ 75.345517] The buggy address belongs to the physical page:
[ 75.345517] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x12389e
[ 75.345517] head: order:1 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
[ 75.345517] flags: 0x4000000000000040(head|zone=2)
[ 75.345517] page_type: f5(slab)
[ 75.345517] raw: 4000000000000040 ffff88810bcaadc0 ffffea0004487a10 ffff88810c6e6d80
[ 75.345517] raw: 0000000000000000 00000000000e000e 00000000f5000000 0000000000000000
[ 75.345517] head: 4000000000000040 ffff88810bcaadc0 ffffea0004487a10 ffff88810c6e6d80
[ 75.345517] head: 0000000000000000 00000000000e000e 00000000f5000000 0000000000000000
[ 75.345517] head: 4000000000000001 ffffea00048e2781 00000000ffffffff 00000000ffffffff
[ 75.345517] head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000002
[ 75.345517] page dumped because: kasan: bad access detected
[ 75.345517]
[ 75.345517] Memory state around the buggy address:
[ 75.345517] ffff88812389f280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 75.345517] ffff88812389f300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fa fb
[ 75.345517] >ffff88812389f380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 75.345517] ^
[ 75.345517] ffff88812389f400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 75.345517] ffff88812389f480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 75.345517] ==================================================================
[ 75.501545] Disabling lock debugging due to kernel taint
(gdb) l *(btrfs_kill_all_delayed_nodes+0x46f)
0xffffffff82f2422f is in btrfs_kill_all_delayed_nodes (fs/btrfs/delayed-inode.h:219).
214 ref_tracker_dir_exit(&node->ref_dir.dir);
215 }
216
217 static inline void btrfs_delayed_node_ref_tracker_dir_print(struct btrfs_delayed_node *node)
218 {
219 if (!btrfs_test_opt(node->root->fs_info, REF_TRACKER))
220 return;
221
222 ref_tracker_dir_print(&node->ref_dir.dir,
223 BTRFS_DELAYED_NODE_REF_TRACKER_DISPLAY_LIMIT);
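The numbers in the KASAN report can be cross-checked with a little pointer arithmetic. This is a userspace sketch with hypothetical helper names. It only re-derives the "16 bytes inside of freed 440-byte region" statement from the addresses printed above.

```c
#include <stdint.h>

/* Half-open address range [start, end) as printed by KASAN. */
struct kasan_region {
	uint64_t start;
	uint64_t end;
};

/* Size of the slab object in bytes. */
static uint64_t region_size(struct kasan_region r)
{
	return r.end - r.start;
}

/* Offset of the bad access from the start of the object. */
static uint64_t offset_in_region(struct kasan_region r, uint64_t addr)
{
	return addr - r.start;
}

/* Nonzero if addr falls inside the object. */
static int in_region(struct kasan_region r, uint64_t addr)
{
	return addr >= r.start && addr < r.end;
}
```

For the region [ffff88812389f370, ffff88812389f528) and the access at ffff88812389f380, this gives a size of 440 and an offset of 16. An 8-byte read at that offset is consistent with loading a pointer member such as node->root at line 219, though the struct layout itself is not shown in this thread.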
* Re: btrfs/071 is unhappy on 6.18-rc2
From: Leo Martins @ 2025-10-20 16:55 UTC
To: Christoph Hellwig; +Cc: Qu Wenruo, linux-btrfs
On Mon, 20 Oct 2025 07:19:39 -0700 Christoph Hellwig <hch@infradead.org> wrote:
> (gdb) l *(btrfs_kill_all_delayed_nodes+0x46f)
> 0xffffffff82f2422f is in btrfs_kill_all_delayed_nodes (fs/btrfs/delayed-inode.h:219).
> 214 ref_tracker_dir_exit(&node->ref_dir.dir);
> 215 }
> 216
> 217 static inline void btrfs_delayed_node_ref_tracker_dir_print(struct btrfs_delayed_node *node)
> 218 {
> 219 if (!btrfs_test_opt(node->root->fs_info, REF_TRACKER))
> 220 return;
> 221
> 222 ref_tracker_dir_print(&node->ref_dir.dir,
> 223 BTRFS_DELAYED_NODE_REF_TRACKER_DISPLAY_LIMIT);
This is a use-after-free bug in my ref_tracker patch: it tries to print the delayed_node ref_tracker
stats after the delayed node has been freed. I'll send a fix in a second.
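The ordering problem can be illustrated with a small userspace sketch. This is not the btrfs code and not the fix itself. The struct and helper names are hypothetical stand-ins, and the only point is that anything that dereferences the node has to run before the reference that may free it is dropped.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted node with optional tracking. */
struct tracked_node {
	int refs;
	int tracker_enabled;
	unsigned long id;
};

static struct tracked_node *node_alloc(unsigned long id)
{
	struct tracked_node *node = calloc(1, sizeof(*node));

	if (!node)
		return NULL;
	node->refs = 1;
	node->tracker_enabled = 1;
	node->id = id;
	return node;
}

/* Stand-in for the tracker print: it reads fields of the node. */
static void node_print_tracker(const struct tracked_node *node)
{
	if (!node->tracker_enabled)
		return;
	printf("node %lu: %d reference(s) held\n", node->id, node->refs);
}

/* Drop one reference; returns 1 if this was the last one and the node was freed. */
static int node_put(struct tracked_node *node)
{
	if (--node->refs == 0) {
		free(node);
		return 1;
	}
	return 0;
}

/*
 * Buggy order (not compiled here): node_put(node) followed by
 * node_print_tracker(node) reads freed memory when the put was the last one.
 *
 * Safe order: print while our reference still pins the node, then drop it.
 */
static int node_release_safe(struct tracked_node *node)
{
	node_print_tracker(node);
	return node_put(node);
}
```

Calling node_release_safe() on a node with a single reference prints the stats and then frees it, so no access happens after the free.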
* Re: btrfs/071 is unhappy on 6.18-rc2
From: Leo Martins @ 2025-10-20 23:25 UTC
To: Leo Martins; +Cc: Christoph Hellwig, Qu Wenruo, linux-btrfs
On Mon, 20 Oct 2025 09:55:10 -0700 Leo Martins <loemra.dev@gmail.com> wrote:
> This is a use after free bug with my ref_tracker patch, it's trying to print delayed_node ref_tracker
> stats after the delayed node has been freed. Will send a fix in a second.
I wasn't able to reproduce the crash by running btrfs/071, but I sent out a fix.
If you have time, it would be great if you could check it against your reproducer.
Link: https://lore.kernel.org/linux-btrfs/e5d6dd45f720f2543ca4ea7ee3e66454ef55f639.1761001854.git.loemra.dev@gmail.com/T/#u