New netfs crash in last month or so

* New netfs crash in last month or so
@ 2025-11-07 23:07 Steve French
  2025-11-10  5:18 ` Shyam Prasad N
  2025-11-14 16:39 ` David Howells
  0 siblings, 2 replies; 3+ messages in thread
From: Steve French @ 2025-11-07 23:07 UTC
  To: CIFS; +Cc: linux-fsdevel, David Howells

I have been seeing this netfs crash over the last month or so
(presumably a recent regression), for example when running generic/215.
Ideas welcome.

[Fri Nov 7 10:03:14 2025] run fstests generic/215 at 2025-11-07 10:03:15
==================================================================
[Fri Nov 7 10:03:15 2025] BUG: KASAN: slab-use-after-free in netfs_limit_iter+0x50f/0x770 [netfs]
[Fri Nov 7 10:03:15 2025] Read of size 1 at addr ff1100011b65d910 by task kworker/u36:2/69285
[Fri Nov 7 10:03:15 2025] CPU: 3 UID: 0 PID: 69285 Comm: kworker/u36:2 Tainted: G E 6.18.0-rc4 #1 PREEMPT(voluntary)
[Fri Nov 7 10:03:15 2025] Tainted: [E]=UNSIGNED_MODULE
[Fri Nov 7 10:03:15 2025] Hardware name: Red Hat KVM, BIOS 1.16.3-4.el9 04/01/2014
[Fri Nov 7 10:03:15 2025] Workqueue: events_unbound netfs_write_collection_worker [netfs]
[Fri Nov 7 10:03:15 2025] Call Trace:
[Fri Nov 7 10:03:15 2025] <TASK>
[Fri Nov 7 10:03:15 2025] dump_stack_lvl+0x79/0xb0
[Fri Nov 7 10:03:15 2025] print_report+0xcb/0x610
[Fri Nov 7 10:03:15 2025] ? __virt_addr_valid+0x19a/0x300
[Fri Nov 7 10:03:15 2025] ? netfs_limit_iter+0x50f/0x770 [netfs]
[Fri Nov 7 10:03:15 2025] ? netfs_limit_iter+0x50f/0x770 [netfs]
[Fri Nov 7 10:03:15 2025] kasan_report+0xca/0x100
[Fri Nov 7 10:03:15 2025] ? netfs_limit_iter+0x50f/0x770 [netfs]
[Fri Nov 7 10:03:15 2025] netfs_limit_iter+0x50f/0x770 [netfs]
[Fri Nov 7 10:03:15 2025] ? __pfx_netfs_limit_iter+0x10/0x10 [netfs]
[Fri Nov 7 10:03:15 2025] ? cifs_prepare_write+0x28e/0x490 [cifs]
[Fri Nov 7 10:03:15 2025] netfs_retry_writes+0x94d/0xcf0 [netfs]
[Fri Nov 7 10:03:15 2025] ? __pfx_netfs_retry_writes+0x10/0x10 [netfs]
[Fri Nov 7 10:03:15 2025] ? folio_end_writeback+0x9b/0xf0
[Fri Nov 7 10:03:15 2025] ? netfs_folio_written_back+0x1af/0x3e0 [netfs]
[Fri Nov 7 10:03:15 2025] netfs_write_collection+0x936/0x1bb0 [netfs]
[Fri Nov 7 10:03:15 2025] netfs_write_collection_worker+0x13d/0x2b0 [netfs]
[Fri Nov 7 10:03:15 2025] process_one_work+0x4bf/0xb40
[Fri Nov 7 10:03:15 2025] ? __pfx_process_one_work+0x10/0x10
[Fri Nov 7 10:03:15 2025] ? assign_work+0xd6/0x110
[Fri Nov 7 10:03:15 2025] worker_thread+0x2c9/0x550
[Fri Nov 7 10:03:15 2025] ? __pfx_worker_thread+0x10/0x10
[Fri Nov 7 10:03:15 2025] kthread+0x216/0x3e0
[Fri Nov 7 10:03:15 2025] ? __pfx_kthread+0x10/0x10
[Fri Nov 7 10:03:15 2025] ? __pfx_kthread+0x10/0x10
[Fri Nov 7 10:03:15 2025] ? lock_release+0xc4/0x270
[Fri Nov 7 10:03:15 2025] ? rcu_is_watching+0x20/0x50
[Fri Nov 7 10:03:15 2025] ? __pfx_kthread+0x10/0x10
[Fri Nov 7 10:03:15 2025] ret_from_fork+0x2a8/0x350
[Fri Nov 7 10:03:15 2025] ? __pfx_kthread+0x10/0x10
[Fri Nov 7 10:03:15 2025] ret_from_fork_asm+0x1a/0x30
[Fri Nov 7 10:03:15 2025] </TASK>
[Fri Nov 7 10:03:15 2025] Allocated by task 74971:
[Fri Nov 7 10:03:15 2025] kasan_save_stack+0x24/0x50
[Fri Nov 7 10:03:15 2025] kasan_save_track+0x14/0x30
[Fri Nov 7 10:03:15 2025] __kasan_kmalloc+0x7f/0x90
[Fri Nov 7 10:03:15 2025] netfs_folioq_alloc+0x56/0x1b0 [netfs]
[Fri Nov 7 10:03:15 2025] rolling_buffer_init+0x23/0x70 [netfs]
[Fri Nov 7 10:03:15 2025] netfs_create_write_req+0x85/0x360 [netfs]
[Fri Nov 7 10:03:15 2025] netfs_writepages+0x110/0x520 [netfs]
[Fri Nov 7 10:03:15 2025] do_writepages+0x123/0x260
[Fri Nov 7 10:03:15 2025] filemap_fdatawrite_wbc+0x74/0x90
[Fri Nov 7 10:03:15 2025] __filemap_fdatawrite_range+0x9a/0xc0
[Fri Nov 7 10:03:15 2025] filemap_write_and_wait_range+0x56/0xc0
[Fri Nov 7 10:03:15 2025] cifs_flush+0x10c/0x1f0 [cifs]
[Fri Nov 7 10:03:15 2025] filp_flush+0x97/0xd0
[Fri Nov 7 10:03:15 2025] __x64_sys_close+0x4a/0x90
[Fri Nov 7 10:03:15 2025] do_syscall_64+0x75/0x9c0
[Fri Nov 7 10:03:15 2025] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[Fri Nov 7 10:03:15 2025] Freed by task 69285:
[Fri Nov 7 10:03:15 2025] kasan_save_stack+0x24/0x50
[Fri Nov 7 10:03:15 2025] kasan_save_track+0x14/0x30
[Fri Nov 7 10:03:15 2025] __kasan_save_free_info+0x3b/0x60
[Fri Nov 7 10:03:15 2025] __kasan_slab_free+0x43/0x70
[Fri Nov 7 10:03:15 2025] kfree+0x11a/0x630
[Fri Nov 7 10:03:15 2025] rolling_buffer_delete_spent+0x80/0xa0 [netfs]
[Fri Nov 7 10:03:15 2025] netfs_write_collection+0x119c/0x1bb0 [netfs]
[Fri Nov 7 10:03:15 2025] netfs_write_collection_worker+0x13d/0x2b0 [netfs]
[Fri Nov 7 10:03:15 2025] process_one_work+0x4bf/0xb40
[Fri Nov 7 10:03:15 2025] worker_thread+0x2c9/0x550
[Fri Nov 7 10:03:15 2025] kthread+0x216/0x3e0
[Fri Nov 7 10:03:15 2025] ret_from_fork+0x2a8/0x350
[Fri Nov 7 10:03:15 2025] ret_from_fork_asm+0x1a/0x30
[Fri Nov 7 10:03:15 2025] The buggy address belongs to the object at ff1100011b65d800
which belongs to the cache kmalloc-512 of size 512
[Fri Nov 7 10:03:15 2025] The buggy address is located 272 bytes inside of
freed 512-byte region [ff1100011b65d800, ff1100011b65da00)
[Fri Nov 7 10:03:15 2025] The buggy address belongs to the physical page:
[Fri Nov 7 10:03:15 2025] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11b658
[Fri Nov 7 10:03:15 2025] head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
[Fri Nov 7 10:03:15 2025] anon flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff)
[Fri Nov 7 10:03:15 2025] page_type: f5(slab)
[Fri Nov 7 10:03:15 2025] raw: 0017ffffc0000040 ff11000100038c80 0000000000000000 dead000000000001
[Fri Nov 7 10:03:15 2025] raw: 0000000000000000 0000000000200020 00000000f5000000 0000000000000000
[Fri Nov 7 10:03:15 2025] head: 0017ffffc0000040 ff11000100038c80 0000000000000000 dead000000000001
[Fri Nov 7 10:03:15 2025] head: 0000000000000000 0000000000200020 00000000f5000000 0000000000000000
[Fri Nov 7 10:03:15 2025] head: 0017ffffc0000003 ffd40000046d9601 00000000ffffffff 00000000ffffffff
[Fri Nov 7 10:03:15 2025] head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000008
[Fri Nov 7 10:03:15 2025] page dumped because: kasan: bad access detected
[Fri Nov 7 10:03:15 2025] Memory state around the buggy address:
[Fri Nov 7 10:03:15 2025]  ff1100011b65d800: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[Fri Nov 7 10:03:15 2025]  ff1100011b65d880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[Fri Nov 7 10:03:15 2025] >ff1100011b65d900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[Fri Nov 7 10:03:15 2025]                          ^
[Fri Nov 7 10:03:15 2025]  ff1100011b65d980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[Fri Nov 7 10:03:15 2025]  ff1100011b65da00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[Fri Nov 7 10:03:15 2025]
==================================================================
[Fri Nov 7 10:03:15 2025] Disabling lock debugging due to kernel taint

http://smb311-linux-testing.southcentralus.cloudapp.azure.com/#/builders/8/builds/152/steps/78/logs/stdio

-- 
Thanks,

Steve
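
The address arithmetic in the report above can be checked with a short
Python sketch. The addresses and the 512-byte object size are copied from
the KASAN output. The 8-byte shadow granule and the 16 shadow bytes per
dump row are assumptions that hold for generic KASAN, not values stated
in the report.

```python
# Values copied from the KASAN report.
fault_addr = 0xff1100011b65d910   # "Read of size 1 at addr ..."
obj_start = 0xff1100011b65d800    # "belongs to the object at ..."
obj_size = 512                    # kmalloc-512

# Offset of the bad read inside the freed object, and the object's end.
offset = fault_addr - obj_start
obj_end = obj_start + obj_size
print(f"offset into object: {offset} bytes")
print(f"object range: [{obj_start:#x}, {obj_end:#x})")

# Assumption: generic KASAN, one shadow byte per 8 bytes of memory,
# 16 shadow bytes (128 bytes of memory) per "Memory state" row.
GRANULE = 8
ROW_BYTES = 16 * GRANULE
row_start = fault_addr - (fault_addr % ROW_BYTES)
shadow_index = (fault_addr - row_start) // GRANULE
print(f"dump row: {row_start:#x}, shadow byte index: {shadow_index}")
```

Under those assumptions the faulting byte falls in the row marked with
">" and the caret belongs under the third shadow byte of that row.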

* Re: New netfs crash in last month or so
From: Shyam Prasad N @ 2025-11-10  5:18 UTC
  To: Steve French, David Howells, Paulo Alcantara, Paulo Alcantara
  Cc: CIFS, linux-fsdevel

On Sat, Nov 8, 2025 at 4:37 AM Steve French <smfrench@gmail.com> wrote:
>
> I have been seeing this netfs crash over the last month or so
> (presumably a recent regression), for example when running generic/215.
> Ideas welcome.
>
> [...]
>
> http://smb311-linux-testing.southcentralus.cloudapp.azure.com/#/builders/8/builds/152/steps/78/logs/stdio
>
> --
> Thanks,
>
> Steve
>

This looks like a missing initialization in the netfs write retry code.
The initialization of sreq_max_segs here seems different from all the
other places that set it:
https://elixir.bootlin.com/linux/v6.18-rc4/source/fs/netfs/write_retry.c#L162
David / Paulo: Is sreq_max_segs expected to be set to a non-zero value
here? If it were 0, we would not have called netfs_limit_iter() in this
codepath.
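
The gating described above can be illustrated with a toy Python model.
This is only a sketch under stated assumptions, not the kernel code:
limit_span() and next_part() are hypothetical stand-ins, and the segment
sizes are made up. The one behaviour taken from this thread is that the
limiter is skipped when the segment cap is 0.

```python
INT_MAX = 2**31 - 1

def limit_span(seg_lens, max_size, max_segs):
    """Toy stand-in for a span limiter (not netfs_limit_iter() itself).

    Walks the segments from the front and returns how many bytes fit
    without exceeding max_size bytes or max_segs segments.
    """
    span = 0
    used = 0
    for seg in seg_lens:
        if used >= max_segs or span >= max_size:
            break
        span += min(seg, max_size - span)
        used += 1
    return span

def next_part(seg_lens, remaining, max_len, max_segs):
    """Pick the size of the next chunk to retry (hypothetical helper)."""
    part = min(remaining, max_len)
    if max_segs:  # a cap of 0 means the limiter is never consulted
        part = limit_span(seg_lens, part, max_segs)
    return part

segs = [4096, 4096, 4096]  # made-up segment sizes
print(next_part(segs, 12288, 8192, 0))        # no cap: limiter skipped
print(next_part(segs, 12288, 8192, INT_MAX))  # huge cap: limiter runs
print(next_part(segs, 12288, 8192, 1))        # cap of one segment
```

In this model a cap of 0 and a very large cap give the same length. The
only difference is whether the limiter walks the segment list at all.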

-- 
Regards,
Shyam

* Re: New netfs crash in last month or so
From: David Howells @ 2025-11-14 16:39 UTC
  To: Shyam Prasad N
  Cc: dhowells, Steve French, Paulo Alcantara, Paulo Alcantara, CIFS,
	linux-fsdevel

Shyam Prasad N <nspmangalore@gmail.com> wrote:

> This looks like a missing initialization in the netfs write retry code.
> The initialization of sreq_max_segs here seems different from all the
> other places that set it:
> https://elixir.bootlin.com/linux/v6.18-rc4/source/fs/netfs/write_retry.c#L162
> David / Paulo: Is sreq_max_segs expected to be set to a non-zero value
> here? If it were 0, we would not have called netfs_limit_iter() in this
> codepath.

That shouldn't matter.  netfs_limit_iter() should still work, even if max_segs
is INT_MAX.

> [Fri Nov 7 10:03:15 2025] netfs_limit_iter+0x50f/0x770 [netfs]

Can you resolve this to a source line, Steve?

David
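
One way to turn a frame like the one quoted above into a source line is
the kernel tree's scripts/faddr2line, which takes an object file and a
func+offset/size argument. The Python sketch below only builds that
command line from the frame text. The module path fs/netfs/netfs.ko is
an assumption about the build tree, and the module would need to be
built with debug info for the lookup to return a file and line.

```python
import re
import shlex

# Matches frames such as "netfs_limit_iter+0x50f/0x770 [netfs]".
FRAME_RE = re.compile(
    r"(?P<sym>[\w.]+)\+(?P<off>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
    r"(?:\s+\[(?P<mod>\w+)\])?"
)

def parse_frame(frame):
    """Split a stack frame into symbol, offset, size and module name."""
    m = FRAME_RE.search(frame)
    if m is None:
        raise ValueError(f"not a symbol+offset/size frame: {frame!r}")
    return {
        "symbol": m.group("sym"),
        "offset": int(m.group("off"), 16),
        "size": int(m.group("size"), 16),
        "module": m.group("mod"),  # None for built-in code
    }

def faddr2line_cmd(frame, object_path):
    """Build the argv for scripts/faddr2line (run from the kernel tree)."""
    f = parse_frame(frame)
    target = f"{f['symbol']}+{f['offset']:#x}/{f['size']:#x}"
    return ["./scripts/faddr2line", object_path, target]

frame = "netfs_limit_iter+0x50f/0x770 [netfs]"
# Assumed location of the module object in the build tree.
cmd = faddr2line_cmd(frame, "fs/netfs/netfs.ko")
print(shlex.join(cmd))
```

The printed command is meant to be run from the top of the tree that
produced the crashing kernel, so that offsets match the loaded module.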

