btrfs/071 is unhappy on 6.18-rc2

public inbox for linux-btrfs@vger.kernel.org

* btrfs/071 is unhappy on 6.18-rc2
From: Christoph Hellwig @ 2025-10-20  7:22 UTC
  To: linux-btrfs

I just kicked off a baseline run with the xfstests volume group and
a SCRATCH_DEV_POOL with 5 virtual nvme devices to test a VFS change that
affects it a little, and it does not seem too happy.

btrfs/071 gets into slab poisoning:

[  279.241695] BTRFS info (device nvme1n1 state M): use zlib compression, level 3
[  279.247651] Oops: general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6d73:I
[  279.250656] CPU: 1 UID: 0 PID: 82037 Comm: btrfs-cleaner Tainted: GN  6.18.0-rc2  
[  279.250656] Tainted: [N]=TEST
[  279.250656] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/4
[  279.250656] RIP: 0010:btrfs_kill_all_delayed_nodes+0x145/0x1e0
[  279.250656] Code: 08 48 c1 e5 03 4b 8b 5c 3d 00 48 89 df e8 23 d0 ff ff 48 85 db 74 0f 4a 8d 54 3c0
[  279.250656] RSP: 0018:ffffc9000138bdc0 EFLAGS: 00010246
[  279.250656] RAX: 6b6b6b6b6b6b6b6b RBX: ffff88810dad3d58 RCX: 0000000000000000
[  279.250656] RDX: 0000000000000001 RSI: 0000000000000286 RDI: 00000000ffffffff
[  279.250656] RBP: 0000000000000008 R08: ffff88810dad3f60 R09: ffff88810dad3f10
[  279.250656] R10: 000000000000000d R11: ffff88810dad3d58 R12: ffff88811bdfbc18
[  279.250656] R13: ffffc9000138bdc8 R14: ffff88811bdfb800 R15: 0000000000000000
[  279.250656] FS:  0000000000000000(0000) GS:ffff8883ef66a000(0000) knlGS:0000000000000000
[  279.250656] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  279.250656] CR2: 00007fcba10194c8 CR3: 00000001217a3002 CR4: 0000000000772ef0
[  279.250656] PKRU: 55555554
[  279.250656] Call Trace:
[  279.250656]  <TASK>
[  279.250656]  ? __schedule+0x52c/0xb60
[  279.250656]  btrfs_clean_one_deleted_snapshot+0x72/0x100
[  279.250656]  cleaner_kthread+0xd3/0x150
[  279.250656]  ? __pfx_cleaner_kthread+0x10/0x10
[  279.250656]  kthread+0x109/0x220
[  279.250656]  ? __pfx_kthread+0x10/0x10
[  279.250656]  ? __pfx_kthread+0x10/0x10
[  279.250656]  ret_from_fork+0x120/0x160
[  279.250656]  ? __pfx_kthread+0x10/0x10
[  279.250656]  ret_from_fork_asm+0x1a/0x30
[  279.250656]  </TASK>
[  279.250656] Modules linked in: kvm_intel kvm irqbypass
[  279.277534] ---[ end trace 0000000000000000 ]---

Similar things repeat a few times, and then it loops basically forever
doing device replacements; I waited for 30 minutes before killing it.


* Re: btrfs/071 is unhappy on 6.18-rc2
From: Qu Wenruo @ 2025-10-20  9:11 UTC
  To: Christoph Hellwig, linux-btrfs



On 2025/10/20 17:52, Christoph Hellwig wrote:
> I just kicked off a baseline run with the xfstests volume group and
> a SCRATCH_DEV_POOL with 5 virtual nvme devices to test a VFS change that
> affects it a little, and it does not seem too happy.
> 
> btrfs/071 gets into slab poisoning:
> 
> [  279.241695] BTRFS info (device nvme1n1 state M): use zlib compression, level 3
> [  279.247651] Oops: general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6d73:I
> [  279.250656] CPU: 1 UID: 0 PID: 82037 Comm: btrfs-cleaner Tainted: GN  6.18.0-rc2
> [  279.250656] Tainted: [N]=TEST
> [  279.250656] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/4
> [  279.250656] RIP: 0010:btrfs_kill_all_delayed_nodes+0x145/0x1e0

Any line number/context and reproducibility?


> [  279.250656] Code: 08 48 c1 e5 03 4b 8b 5c 3d 00 48 89 df e8 23 d0 ff ff 48 85 db 74 0f 4a 8d 54 3c0
> [  279.250656] RSP: 0018:ffffc9000138bdc0 EFLAGS: 00010246
> [  279.250656] RAX: 6b6b6b6b6b6b6b6b

This looks like POISON_FREE, so some use-after-free bug?

If you're able to reproduce, mind to try KASAN?
As I just checked my logs, no failures on btrfs/071 recorded yet (but 
not on upstream rc2 yet)

Thanks,
Qu




* Re: btrfs/071 is unhappy on 6.18-rc2
From: Christoph Hellwig @ 2025-10-20  9:46 UTC
  To: Qu Wenruo; +Cc: Christoph Hellwig, linux-btrfs

On Mon, Oct 20, 2025 at 07:41:03PM +1030, Qu Wenruo wrote:
> > [  279.247651] Oops: general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6d73:I
> > [  279.250656] CPU: 1 UID: 0 PID: 82037 Comm: btrfs-cleaner Tainted: GN  6.18.0-rc2
> > [  279.250656] Tainted: [N]=TEST
> > [  279.250656] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/4
> > [  279.250656] RIP: 0010:btrfs_kill_all_delayed_nodes+0x145/0x1e0
> 
> Any line number/context

Nope, that's it.  Last lines before are:

[   62.492209] BTRFS info (device nvme1n1): first mount of filesystem 975f6fd4-b50f-4f3d-8112-319c
[   62.492520] BTRFS info (device nvme1n1): using crc32c (crc32c-lib) checksum algorithm
[   62.510951] BTRFS info (device nvme1n1): checking UUID tree
[   62.511230] BTRFS info (device nvme1n1): enabling ssd optimizations
[   62.511452] BTRFS info (device nvme1n1): turning on async discard
[   62.511728] BTRFS info (device nvme1n1): enabling free space tree
[   62.642011] BTRFS info (device nvme1n1 state M): use zlib compression, level 3

> and reproducibility?

100% over a few runs.

> If you're able to reproduce, mind to try KASAN?
> As I just checked my logs, no failures on btrfs/071 recorded yet (but not on
> upstream rc2 yet)

A bit busy right now, but I'll try to do a KASAN run later.



* Re: btrfs/071 is unhappy on 6.18-rc2
From: Qu Wenruo @ 2025-10-20 10:26 UTC
  To: Christoph Hellwig; +Cc: linux-btrfs



On 2025/10/20 20:16, Christoph Hellwig wrote:
> On Mon, Oct 20, 2025 at 07:41:03PM +1030, Qu Wenruo wrote:
>>> [  279.247651] Oops: general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6d73:I
>>> [  279.250656] CPU: 1 UID: 0 PID: 82037 Comm: btrfs-cleaner Tainted: GN  6.18.0-rc2
>>> [  279.250656] Tainted: [N]=TEST
>>> [  279.250656] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/4
>>> [  279.250656] RIP: 0010:btrfs_kill_all_delayed_nodes+0x145/0x1e0
>>
>> Any line number/context
> 
> Nope, that's it.  Last lines before are:

I mean the code line number extracted from that RIP.

But I'll try to reproduce it once the recent direct IO problem is solved.

Thanks,
Qu




* Re: btrfs/071 is unhappy on 6.18-rc2
From: Christoph Hellwig @ 2025-10-20 14:19 UTC
  To: Qu Wenruo; +Cc: Christoph Hellwig, linux-btrfs

KASAN output:

[   75.341543] ==================================================================
[   75.341824] BUG: KASAN: slab-use-after-free in btrfs_kill_all_delayed_nodes+0x46f/0x4c0
[   75.342082] Read of size 8 at addr ffff88812389f380 by task btrfs-cleaner/4493
[   75.342310] 
[   75.342369] CPU: 1 UID: 0 PID: 4493 Comm: btrfs-cleaner Tainted: G                 N  6.18.0-rc2+ #4115 PREEMPT(f 
[   75.342372] Tainted: [N]=TEST
[   75.342373] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[   75.342374] Call Trace:
[   75.342375]  <TASK>
[   75.342376]  dump_stack_lvl+0x4b/0x70
[   75.342379]  print_report+0x174/0x4e7
[   75.342382]  ? __virt_addr_valid+0x1bb/0x2f0
[   75.342384]  ? btrfs_kill_all_delayed_nodes+0x46f/0x4c0
[   75.342385]  kasan_report+0xd2/0x100
[   75.342387]  ? btrfs_kill_all_delayed_nodes+0x46f/0x4c0
[   75.342388]  btrfs_kill_all_delayed_nodes+0x46f/0x4c0
[   75.342389]  ? _raw_spin_unlock+0x13/0x30
[   75.342392]  ? __pfx_btrfs_kill_all_delayed_nodes+0x10/0x10
[   75.342393]  ? do_raw_spin_lock+0x128/0x260
[   75.342395]  ? __pfx_do_raw_spin_lock+0x10/0x10
[   75.342397]  ? list_lru_add_obj+0xfb/0x1a0
[   75.342399]  ? do_raw_spin_lock+0x128/0x260
[   75.342401]  ? __pfx_do_raw_spin_lock+0x10/0x10
[   75.342402]  btrfs_clean_one_deleted_snapshot+0x143/0x370
[   75.342405]  cleaner_kthread+0x1ee/0x300
[   75.342406]  ? __pfx_cleaner_kthread+0x10/0x10
[   75.342407]  kthread+0x37f/0x6f0
[   75.342409]  ? __pfx_kthread+0x10/0x10
[   75.342411]  ? __pfx_kthread+0x10/0x10
[   75.342412]  ? __pfx_kthread+0x10/0x10
[   75.342413]  ret_from_fork+0x17d/0x240
[   75.342415]  ? __pfx_kthread+0x10/0x10
[   75.342416]  ret_from_fork_asm+0x1a/0x30
[   75.342419]  </TASK>
[   75.342419] 
[   75.345517] Allocated by task 4527:
[   75.345517]  kasan_save_stack+0x22/0x40
[   75.345517]  kasan_save_track+0x14/0x30
[   75.345517]  __kasan_slab_alloc+0x6e/0x70
[   75.345517]  kmem_cache_alloc_noprof+0x14c/0x400
[   75.345517]  btrfs_get_or_create_delayed_node+0x9e/0x9e0
[   75.345517]  btrfs_insert_delayed_dir_index+0xe4/0x8a0
[   75.345517]  btrfs_insert_dir_item+0x4c1/0x720
[   75.345517]  btrfs_add_link+0x173/0xa30
[   75.345517]  btrfs_create_new_inode+0x1551/0x2650
[   75.345517]  btrfs_create_common+0x17b/0x200
[   75.345517]  vfs_mknod+0x3a7/0x600
[   75.345517]  do_mknodat+0x34e/0x520
[   75.345517]  __x64_sys_mknodat+0xaa/0xe0
[   75.345517]  do_syscall_64+0x50/0xfa0
[   75.345517]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[   75.345517] 
[   75.345517] Freed by task 4493:
[   75.345517]  kasan_save_stack+0x22/0x40
[   75.345517]  kasan_save_track+0x14/0x30
[   75.345517]  __kasan_save_free_info+0x3b/0x70
[   75.345517]  __kasan_slab_free+0x43/0x70
[   75.345517]  kmem_cache_free+0x172/0x610
[   75.345517]  btrfs_kill_all_delayed_nodes+0x2db/0x4c0
[   75.345517]  btrfs_clean_one_deleted_snapshot+0x143/0x370
[   75.345517]  cleaner_kthread+0x1ee/0x300
[   75.345517]  kthread+0x37f/0x6f0
[   75.345517]  ret_from_fork+0x17d/0x240
[   75.345517]  ret_from_fork_asm+0x1a/0x30
[   75.345517] 
[   75.345517] The buggy address belongs to the object at ffff88812389f370
[   75.345517]  which belongs to the cache btrfs_delayed_node of size 440
[   75.345517] The buggy address is located 16 bytes inside of
[   75.345517]  freed 440-byte region [ffff88812389f370, ffff88812389f528)
[   75.345517] 
[   75.345517] The buggy address belongs to the physical page:
[   75.345517] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x12389e
[   75.345517] head: order:1 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
[   75.345517] flags: 0x4000000000000040(head|zone=2)
[   75.345517] page_type: f5(slab)
[   75.345517] raw: 4000000000000040 ffff88810bcaadc0 ffffea0004487a10 ffff88810c6e6d80
[   75.345517] raw: 0000000000000000 00000000000e000e 00000000f5000000 0000000000000000
[   75.345517] head: 4000000000000040 ffff88810bcaadc0 ffffea0004487a10 ffff88810c6e6d80
[   75.345517] head: 0000000000000000 00000000000e000e 00000000f5000000 0000000000000000
[   75.345517] head: 4000000000000001 ffffea00048e2781 00000000ffffffff 00000000ffffffff
[   75.345517] head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000002
[   75.345517] page dumped because: kasan: bad access detected
[   75.345517] 
[   75.345517] Memory state around the buggy address:
[   75.345517]  ffff88812389f280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   75.345517]  ffff88812389f300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fa fb
[   75.345517] >ffff88812389f380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   75.345517]                    ^
[   75.345517]  ffff88812389f400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   75.345517]  ffff88812389f480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   75.345517] ==================================================================
[   75.501545] Disabling lock debugging due to kernel taint


(gdb) l *(btrfs_kill_all_delayed_nodes+0x46f)
0xffffffff82f2422f is in btrfs_kill_all_delayed_nodes (fs/btrfs/delayed-inode.h:219).
214		ref_tracker_dir_exit(&node->ref_dir.dir);
215	}
216	
217	static inline void btrfs_delayed_node_ref_tracker_dir_print(struct btrfs_delayed_node *node)
218	{
219		if (!btrfs_test_opt(node->root->fs_info, REF_TRACKER))
220			return;
221	
222		ref_tracker_dir_print(&node->ref_dir.dir,
223				      BTRFS_DELAYED_NODE_REF_TRACKER_DISPLAY_LIMIT);



* Re: btrfs/071 is unhappy on 6.18-rc2
From: Leo Martins @ 2025-10-20 16:55 UTC
  To: Christoph Hellwig; +Cc: Qu Wenruo, linux-btrfs

On Mon, 20 Oct 2025 07:19:39 -0700 Christoph Hellwig <hch@infradead.org> wrote:

> KASAN output:
> 
> [...]
> 
> 
> gdb) l *(btrfs_kill_all_delayed_nodes+0x46f)
> 0xffffffff82f2422f is in btrfs_kill_all_delayed_nodes (fs/btrfs/delayed-inode.h:219).
> 214		ref_tracker_dir_exit(&node->ref_dir.dir);
> 215	}
> 216	
> 217	static inline void btrfs_delayed_node_ref_tracker_dir_print(struct btrfs_delayed_node *node)
> 218	{
> 219		if (!btrfs_test_opt(node->root->fs_info, REF_TRACKER))
> 220			return;
> 221	
> 222		ref_tracker_dir_print(&node->ref_dir.dir,
> 223				      BTRFS_DELAYED_NODE_REF_TRACKER_DISPLAY_LIMIT);

This is a use-after-free bug in my ref_tracker patch: it tries to print the delayed_node ref_tracker
stats after the delayed node has been freed. I will send a fix shortly.



* Re: btrfs/071 is unhappy on 6.18-rc2
From: Leo Martins @ 2025-10-20 23:25 UTC

On Mon, 20 Oct 2025 09:55:10 -0700 Leo Martins <loemra.dev@gmail.com> wrote:

> [...]
>
> This is a use after free bug with my ref_tracker patch, it's trying to print delayed_node ref_tracker
> stats after the delayed node has been freed. Will send a fix in a second.

I wasn't able to reproduce the crash by running btrfs/071. I sent out a fix;
if you have time, it would be great if you could check it against your reproducer.

Link: https://lore.kernel.org/linux-btrfs/e5d6dd45f720f2543ca4ea7ee3e66454ef55f639.1761001854.git.loemra.dev@gmail.com/T/#u
