* 6.12 WARNING in netfs_consume_read_data()
@ 2024-11-18 23:04 Max Kellermann
From: Max Kellermann @ 2024-11-18 23:04 UTC (permalink / raw)
To: David Howells, Jeff Layton, netfs, linux-fsdevel, linux-kernel

Hi David & netfs developers,

I tried the new Linux kernel 6.12 today and it quickly reported a bug:

------------[ cut here ]------------
WARNING: CPU: 13 PID: 0 at kernel/softirq.c:361 __local_bh_enable_ip+0x37/0x50
Modules linked in:
CPU: 13 UID: 0 PID: 0 Comm: swapper/13 Not tainted 6.12.0-cm4all1-hp+ #236
Hardware name: HPE ProLiant DL380 Gen10/ProLiant DL380 Gen10, BIOS U30 09/05/2019
RIP: 0010:__local_bh_enable_ip+0x37/0x50
Code: 00 0f 00 75 25 83 ee 01 f7 de 65 01 35 2a 93 d8 70 65 f7 05 1f 93 d8 70 00 ff ff 00 74 10 65 ff 0d 16 93 d8 70 c3 cc cc cc cc <0f> 0b eb d7 65 66 83 3d 24 93 d8 70 00 74 e5 e8 45 ff ff ff eb de
RSP: 0018:ffff979300464d30 EFLAGS: 00010006
RAX: dead000000000122 RBX: ffff8c5cd1045b00 RCX: ffff8c5cdeff7800
RDX: ffff8c5d022f91f0 RSI: 0000000000000200 RDI: ffffffff8f5ee1cc
RBP: 0000000000000000 R08: 0000000000000002 R09: ffff8cba7cf6eb68
R10: ffffffff910621e0 R11: 0000000000000001 R12: ffff8c5d022f9368
R13: 0000000000001000 R14: ffff8c5d022f9368 R15: 0000000000001000
FS: 0000000000000000(0000) GS:ffff8cba7cf40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055ea9986d368 CR3: 000000603ee2e005 CR4: 00000000007706f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
<IRQ>
? __warn+0x81/0x110
? __local_bh_enable_ip+0x37/0x50
? report_bug+0x14c/0x170
? handle_bug+0x53/0x90
? exc_invalid_op+0x13/0x60
? asm_exc_invalid_op+0x16/0x20
? netfs_consume_read_data.isra.0+0x3fc/0xa70
? __local_bh_enable_ip+0x37/0x50
netfs_consume_read_data.isra.0+0x3fc/0xa70
? __pfx_cachefiles_read_complete+0x10/0x10
netfs_read_subreq_terminated+0x265/0x360
cachefiles_read_complete+0x45/0xf0
iomap_dio_bio_end_io+0x122/0x160
blk_update_request+0xf1/0x3e0
scsi_end_request+0x27/0x190
scsi_io_completion+0x43/0x6c0
pqi_irq_handler+0x108/0xcd0
__handle_irq_event_percpu+0x43/0x160
handle_irq_event+0x27/0x70
handle_edge_irq+0x82/0x220
__common_interrupt+0x37/0xb0
common_interrupt+0x74/0xa0
</IRQ>
<TASK>
asm_common_interrupt+0x22/0x40
RIP: 0010:cpuidle_enter_state+0xba/0x3c0
Code: 00 e8 7a 06 1d ff e8 45 f7 ff ff 8b 53 04 49 89 c5 0f 1f 44 00 00 31 ff e8 83 38 1c ff 45 84 ff 0f 85 16 01 00 00 fb 45 85 f6 <0f> 88 76 01 00 00 48 8b 04 24 49 63 ce 48 6b d1 68 49 29 c5 48 89
RSP: 0018:ffff97930015fe98 EFLAGS: 00000206
RAX: ffff8cba7cf40000 RBX: ffffb792fe94f250 RCX: 000000000000001f
RDX: 000000000000000d RSI: 0000000037c86db9 RDI: 0000000000000000
RBP: 0000000000000003 R08: 0000000000000002 R09: 0000000000000020
R10: 0000000000000003 R11: 0000000000000015 R12: ffffffff9125b220
R13: 000006c7b80e3e03 R14: 0000000000000003 R15: 0000000000000000
? cpuidle_enter_state+0xad/0x3c0
cpuidle_enter+0x29/0x40
do_idle+0x19f/0x200
cpu_startup_entry+0x25/0x30
start_secondary+0xf3/0x100
common_startup_64+0x13e/0x148
</TASK>
---[ end trace 0000000000000000 ]---
Apparently, the netfs code doesn't want to be called from hardirq
context, but the cachefiles read completion callback may run in a
hardirq handler.

This source file is new in 6.12, so this regression may have been
caused by David Howells's commit ee4cdf7ba857 ("netfs: Speed up
buffered reading").
Max

* Re: 6.12 WARNING in netfs_consume_read_data()
From: David Howells @ 2024-11-29 12:41 UTC (permalink / raw)
To: Max Kellermann; +Cc: dhowells, Jeff Layton, netfs, linux-fsdevel, linux-kernel

Hi Max,

Could you try:

    https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=netfs-writeback

I think the patches there should fix it. However, as these aren't
making it till the next merge window, I'll try and work out an interim
fix.

David

* Re: 6.12 WARNING in netfs_consume_read_data()
From: Max Kellermann @ 2024-11-29 21:06 UTC (permalink / raw)
To: David Howells; +Cc: Jeff Layton, netfs, linux-fsdevel, linux-kernel

On Fri, Nov 29, 2024 at 1:41 PM David Howells <dhowells@redhat.com> wrote:
> Could you try:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=netfs-writeback
>
> I think the patches there should fix it.

Confirmed. I have merged this branch on top of 6.12.1, and the problem
no longer occurs.

Thanks!

* Re: 6.12 WARNING in netfs_consume_read_data()
From: Max Kellermann @ 2024-11-29 22:29 UTC (permalink / raw)
To: David Howells; +Cc: Jeff Layton, netfs, linux-fsdevel, linux-kernel

On Fri, Nov 29, 2024 at 1:41 PM David Howells <dhowells@redhat.com> wrote:
> Could you try:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=netfs-writeback
>
> I think the patches there should fix it.

I have encountered multiple hangs with that branch; for example:

[<0>] folio_wait_private_2_killable+0xfb/0x180
[<0>] netfs_write_begin+0xe0/0x450
[<0>] ceph_write_begin+0x27/0x50
[<0>] generic_perform_write+0xcd/0x280
[<0>] ceph_write_iter+0x4d9/0x640
[<0>] iter_file_splice_write+0x308/0x550
[<0>] splice_file_range_actor+0x29/0x40
[<0>] splice_direct_to_actor+0xee/0x270
[<0>] splice_file_range+0x80/0xc0
[<0>] ceph_copy_file_range+0xb5/0x5b0
[<0>] vfs_copy_file_range+0x320/0x5b0
[<0>] __x64_sys_copy_file_range+0xef/0x200
[<0>] do_syscall_64+0x64/0x100
[<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e

I can't be 100% sure if this was caused by your branch or if this is
just another 6.12 regression, but I haven't seen this with 6.11.

* Re: 6.12 WARNING in netfs_consume_read_data()
From: Max Kellermann @ 2024-12-06 15:08 UTC (permalink / raw)
To: David Howells; +Cc: Jeff Layton, netfs, linux-fsdevel, linux-kernel

On Fri, Nov 29, 2024 at 11:29 PM Max Kellermann <max.kellermann@ionos.com> wrote:
> I have encountered multiple hangs with that branch; for example:

Similar hangs with 6.12.2 (vanilla, without your "netfs-writeback" branch):

[<0>] folio_wait_bit_common+0x23a/0x4f0
[<0>] folio_wait_private_2+0x37/0x70
[<0>] netfs_invalidate_folio+0x168/0x520
[<0>] truncate_cleanup_folio+0x281/0x340
[<0>] truncate_inode_pages_range+0x1bb/0x780
[<0>] ceph_evict_inode+0x17e/0x6b0
[<0>] evict+0x331/0x780
[<0>] __dentry_kill+0x17b/0x4f0
[<0>] dput+0x2a6/0x4a0
[<0>] __fput+0x36d/0x910
[<0>] __x64_sys_close+0x78/0xd0
[<0>] do_syscall_64+0x64/0x100
[<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e

netfs in 6.12 remains pretty broken, both with and without your fixes.
The sad thing is that 6.11 has just been EOLed. Linux 6.6 is the only
sane kernel.

* Re: 6.12 WARNING in netfs_consume_read_data()
From: Max Kellermann @ 2024-12-06 15:13 UTC (permalink / raw)
To: David Howells; +Cc: Jeff Layton, netfs, linux-fsdevel, linux-kernel

On Fri, Dec 6, 2024 at 4:08 PM Max Kellermann <max.kellermann@ionos.com> wrote:
> Similar hangs with 6.12.2 (vanilla, without your "netfs-writeback" branch):
>
> [<0>] folio_wait_bit_common+0x23a/0x4f0
> [<0>] folio_wait_private_2+0x37/0x70
> [<0>] netfs_invalidate_folio+0x168/0x520
> [<0>] truncate_cleanup_folio+0x281/0x340
> [<0>] truncate_inode_pages_range+0x1bb/0x780
> [<0>] ceph_evict_inode+0x17e/0x6b0
> [<0>] evict+0x331/0x780
> [<0>] __dentry_kill+0x17b/0x4f0
> [<0>] dput+0x2a6/0x4a0
> [<0>] __fput+0x36d/0x910
> [<0>] __x64_sys_close+0x78/0xd0
> [<0>] do_syscall_64+0x64/0x100
> [<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e

This might help you understand the problem:

INFO: task cp:3345 blocked for more than 122 seconds.
      Not tainted 6.12.3-cm4all0-hp+ #297
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:cp state:D stack:0 pid:3345 tgid:3345 ppid:3331 flags:0x00000002
Call Trace:
 <TASK>
 __schedule+0xc34/0x4df0
 ? __pfx___schedule+0x10/0x10
 ? lock_release+0x206/0x660
 ? schedule+0x283/0x340
 ? __pfx_lock_release+0x10/0x10
 ? schedule+0x1e8/0x340
 schedule+0xdc/0x340
 io_schedule+0xc0/0x130
 folio_wait_bit_common+0x23a/0x4f0
 ? __pfx_folio_wait_bit_common+0x10/0x10
 ? __pfx_wake_page_function+0x10/0x10
 ? __pfx_truncate_folio_batch_exceptionals.part.0+0x10/0x10
 folio_wait_private_2+0x37/0x70
 netfs_invalidate_folio+0x168/0x520
 ? ceph_invalidate_folio+0x114/0x2a0
 truncate_cleanup_folio+0x281/0x340
 truncate_inode_pages_range+0x1bb/0x780
 ? __pfx_truncate_inode_pages_range+0x10/0x10
 ? __pfx_do_raw_spin_lock+0x10/0x10
 ? find_held_lock+0x2d/0x110
 ? do_raw_spin_unlock+0x54/0x220
 ceph_evict_inode+0x17e/0x6b0
 ? lock_release+0x206/0x660
 ? __pfx_ceph_evict_inode+0x10/0x10
 ? __pfx_lock_release+0x10/0x10
 ? do_raw_spin_lock+0x12d/0x270
 ? __pfx_do_raw_spin_lock+0x10/0x10
 evict+0x331/0x780
 ? __pfx_evict+0x10/0x10
 ? do_raw_spin_unlock+0x54/0x220
 ? _raw_spin_unlock+0x1f/0x30
 ? iput.part.0+0x3d0/0x670
 __dentry_kill+0x17b/0x4f0
 dput+0x2a6/0x4a0
 __fput+0x36d/0x910
 __x64_sys_close+0x78/0xd0
 do_syscall_64+0x64/0x100
 ? syscall_exit_to_user_mode+0x57/0x120
 ? do_syscall_64+0x70/0x100
 ? do_raw_spin_unlock+0x54/0x220
 ? _raw_spin_unlock+0x1f/0x30
 ? generic_fadvise+0x210/0x590
 ? __pfx_generic_fadvise+0x10/0x10
 ? syscall_exit_to_user_mode+0x57/0x120
 ? __pfx___seccomp_filter+0x10/0x10
 ? fdget+0x53/0x430
 ? __pfx_do_sys_openat2+0x10/0x10
 ? __x64_sys_fadvise64+0x139/0x180
 ? syscall_exit_to_user_mode+0x57/0x120
 ? do_syscall_64+0x70/0x100
 ? __x64_sys_openat+0x135/0x1d0
 ? __pfx___x64_sys_openat+0x10/0x10
 ? syscall_exit_to_user_mode+0x57/0x120
 ? do_syscall_64+0x70/0x100
 ? irqentry_exit_to_user_mode+0x3d/0x100
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f66375bd960
RSP: 002b:00007ffd5bcd65a8 EFLAGS: 00000202 ORIG_RAX: 0000000000000003
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f66375bd960
RDX: 0000000000000004 RSI: 0000000000000000 RDI: 0000000000000003
RBP: 00007ffd5bcd6990 R08: 7fffffffc0000000 R09: 0000000000000000
R10: 00007f66374dc498 R11: 0000000000000202 R12: 0000000000000001
R13: 0000000000000000 R14: 0000000000008000 R15: 0000000000000001
 </TASK>

Showing all locks held in the system:
1 lock held by khungtaskd/163:
 #0: ffffffffb5829b80 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x64/0x280
2 locks held by kworker/14:1/476:
 #0: ffff88815cb2b548 ((wq_completion)ceph-cap){....}-{0:0}, at: process_one_work+0xdea/0x14f0
 #1: ffff88810ca97da0 ((work_completion)(&mdsc->cap_reclaim_work)){....}-{0:0}, at: process_one_work+0x743/0x14f0

=============================================

INFO: task kworker/14:1:476 blocked for more than 122 seconds.
      Not tainted 6.12.3-cm4all0-hp+ #297
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/14:1 state:D stack:0 pid:476 tgid:476 ppid:2 flags:0x00004000
Workqueue: ceph-cap ceph_cap_reclaim_work
Call Trace:
 <TASK>
 __schedule+0xc34/0x4df0
 ? __pfx___schedule+0x10/0x10
 ? lock_release+0x206/0x660
 ? schedule+0x283/0x340
 ? __pfx_lock_release+0x10/0x10
 ? kick_pool+0x1da/0x530
 schedule+0xdc/0x340
 io_schedule+0xc0/0x130
 folio_wait_bit_common+0x23a/0x4f0
 ? __pfx_folio_wait_bit_common+0x10/0x10
 ? __pfx_wake_page_function+0x10/0x10
 ? __pfx_truncate_folio_batch_exceptionals.part.0+0x10/0x10
 folio_wait_private_2+0x37/0x70
 netfs_invalidate_folio+0x168/0x520
 ? ceph_invalidate_folio+0x114/0x2a0
 truncate_cleanup_folio+0x281/0x340
 truncate_inode_pages_range+0x1bb/0x780
 ? __pfx_truncate_inode_pages_range+0x10/0x10
 ? __lock_acquire.constprop.0+0x598/0x13e0
 ? release_sock+0x1b/0x180
 ? reacquire_held_locks+0x1e9/0x460
 ? release_sock+0x1b/0x180
 ? find_held_lock+0x2d/0x110
 ? lock_release+0x206/0x660
 ? truncate_inode_pages_final+0x59/0x80
 ? __pfx_lock_release+0x10/0x10
 ? do_raw_spin_lock+0x12d/0x270
 ? __pfx_do_raw_spin_lock+0x10/0x10
 ? find_held_lock+0x2d/0x110
 ? do_raw_spin_unlock+0x54/0x220
 ceph_evict_inode+0x17e/0x6b0
 ? lock_release+0x206/0x660
 ? __pfx_ceph_evict_inode+0x10/0x10
 ? __pfx_lock_release+0x10/0x10
 ? lock_is_held_type+0xdb/0x110
 evict+0x331/0x780
 ? __pfx_evict+0x10/0x10
 ? do_raw_spin_unlock+0x54/0x220
 ? _raw_spin_unlock+0x1f/0x30
 ? iput.part.0+0x3d0/0x670
 __dentry_kill+0x17b/0x4f0
 dput+0x2a6/0x4a0
 __dentry_leases_walk+0x6c6/0x10d0
 ? do_raw_spin_lock+0x12d/0x270
 ? __pfx___dentry_leases_walk+0x10/0x10
 ? __pfx_do_raw_spin_lock+0x10/0x10
 ceph_trim_dentries+0x1b1/0x260
 ? __pfx_ceph_trim_dentries+0x10/0x10
 ? lock_acquire+0x11f/0x290
 ? process_one_work+0x743/0x14f0
 ceph_cap_reclaim_work+0x19/0xc0
 process_one_work+0x7b4/0x14f0
 ? __pfx_process_one_work+0x10/0x10
 ? assign_work+0x16c/0x240
 ? lock_is_held_type+0x9a/0x110
 worker_thread+0x52b/0xe40
 ? do_raw_spin_unlock+0x54/0x220
 ? __kthread_parkme+0x95/0x120
 ? __pfx_worker_thread+0x10/0x10
 kthread+0x28a/0x350
 ? __pfx_kthread+0x10/0x10
 ret_from_fork+0x2d/0x70
 ? __pfx_kthread+0x10/0x10
 ret_from_fork_asm+0x1a/0x30
 </TASK>

* Re: 6.12 WARNING in netfs_consume_read_data()
From: Max Kellermann @ 2024-12-06 16:29 UTC (permalink / raw)
To: David Howells; +Cc: Jeff Layton, netfs, linux-fsdevel, linux-kernel

On Fri, Dec 6, 2024 at 4:08 PM Max Kellermann <max.kellermann@ionos.com> wrote:
> Similar hangs with 6.12.2 (vanilla, without your "netfs-writeback" branch):

(Correction: this was 6.12.3, not 6.12.2.)

I tried with 6.12.3 + dhowells/netfs-writeback; David's branch solved
many problems and it took much longer to trigger the hang, but it
eventually occurred:

INFO: task bash:6599 blocked for more than 122 seconds.
      Not tainted 6.12.3-cm4all0-hp+ #298
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:bash state:D stack:0 pid:6599 tgid:6599 ppid:6598 flags:0x00000006
Call Trace:
 <TASK>
 __schedule+0xc34/0x4df0
 ? is_dynamic_key+0x120/0x150
 ? __pfx___schedule+0x10/0x10
 ? lock_release+0x206/0x660
 ? schedule+0x283/0x340
 ? __pfx_lock_release+0x10/0x10
 ? schedule+0x1e8/0x340
 schedule+0xdc/0x340
 schedule_preempt_disabled+0xa/0x10
 rwsem_down_read_slowpath+0x6ba/0xd00
 ? __pfx_rwsem_down_read_slowpath+0x10/0x10
 ? kernel_text_address+0xb8/0x150
 ? lock_acquire+0x11f/0x290
 ? ceph_start_io_read+0x19/0x80
 down_read+0xcd/0x220
 ? __pfx_down_read+0x10/0x10
 ? do_sys_openat2+0x106/0x160
 ? stack_trace_save+0x96/0xd0
 ? __pfx_stack_trace_save+0x10/0x10
 ceph_start_io_read+0x19/0x80
 ceph_read_iter+0x2e2/0xe70
 ? __pfx_ceph_read_iter+0x10/0x10
 ? psi_task_switch+0x256/0x810
 ? find_held_lock+0x2d/0x110
 ? lock_release+0x206/0x660
 ? finish_task_switch.isra.0+0x1db/0xa40
 vfs_read+0x6e1/0xc40
 ? lock_acquire+0x11f/0x290
 ? finish_task_switch.isra.0+0x129/0xa40
 ? __pfx_vfs_read+0x10/0x10
 ? finish_task_switch.isra.0+0x225/0xa40
 ? fdget_pos+0x1b3/0x540
 ? __pfx___seccomp_filter+0x10/0x10
 ksys_read+0xee/0x1c0
 ? __pfx_ksys_read+0x10/0x10
 ? lock_release+0x206/0x660
 ? __ceph_do_getattr+0xe8/0x380
 do_syscall_64+0x64/0x100
 ? fdget_raw+0x53/0x390
 ? __do_sys_newfstatat+0x86/0xd0
 ? __pfx___do_sys_newfstatat+0x10/0x10
 ? syscall_exit_to_user_mode+0x57/0x120
 ? do_syscall_64+0x70/0x100
 ? irqentry_exit_to_user_mode+0x3d/0x100
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f71fb04f21d
RSP: 002b:00007ffdd516a918 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f71fb04f21d
RDX: 0000000000003003 RSI: 00005597dc55d4d0 RDI: 0000000000000003
RBP: 0000000000003003 R08: 00007f71fb12a020 R09: 00007f71fb12a020
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000000001f4
R13: 00005597dc485340 R14: 00005597dc55d4d0 R15: 00005597be8d7524
 </TASK>

INFO: task bash:6614 blocked for more than 122 seconds.
      Not tainted 6.12.3-cm4all0-hp+ #298
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:bash state:D stack:0 pid:6614 tgid:6614 ppid:6613 flags:0x00000002
Call Trace:
 <TASK>
 __schedule+0xc34/0x4df0
 ? __free_insn_slot+0x370/0x3d0
 ? __pfx___schedule+0x10/0x10
 ? lock_release+0x206/0x660
 ? schedule+0x283/0x340
 ? __pfx_lock_release+0x10/0x10
 ? schedule+0x1e8/0x340
 schedule+0xdc/0x340
 schedule_preempt_disabled+0xa/0x10
 rwsem_down_read_slowpath+0x6ba/0xd00
 ? __pfx_rwsem_down_read_slowpath+0x10/0x10
 ? kasan_save_stack+0x1c/0x40
 ? kasan_save_track+0x10/0x30
 ? lock_acquire+0x11f/0x290
 ? ceph_start_io_read+0x19/0x80
 ? find_held_lock+0x2d/0x110
 down_read+0xcd/0x220
 ? __ceph_caps_issued_mask+0x416/0xa10
 ? __pfx_down_read+0x10/0x10
 ceph_start_io_read+0x19/0x80
 ceph_read_iter+0x2e2/0xe70
 ? _copy_to_user+0x50/0x70
 ? __pfx_ceph_read_iter+0x10/0x10
 ? fdget_raw+0x53/0x390
 vfs_read+0x6e1/0xc40
 ? __do_sys_newfstatat+0x86/0xd0
 ? __pfx___do_sys_newfstatat+0x10/0x10
 ? __pfx_vfs_read+0x10/0x10
 ? fdget_pos+0x1b3/0x540
 ? __pfx___seccomp_filter+0x10/0x10
 ksys_read+0xee/0x1c0
 ? __pfx_ksys_read+0x10/0x10
 do_syscall_64+0x64/0x100
 ? do_user_addr_fault+0x401/0x8f0
 ? find_held_lock+0x59/0x110
 ? find_held_lock+0x2d/0x110
 ? lock_release+0x206/0x660
 ? do_user_addr_fault+0x45e/0x8f0
 ? __pfx_lock_release+0x10/0x10
 ? do_user_addr_fault+0x401/0x8f0
 ? do_user_addr_fault+0x463/0x8f0
 ? irqentry_exit_to_user_mode+0x3d/0x100
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7fa84a43a21d
RSP: 002b:00007ffec8720278 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fa84a43a21d
RDX: 0000000000003003 RSI: 0000565007f224d0 RDI: 0000000000000003
RBP: 0000000000003003 R08: 00007fa84a515020 R09: 00007fa84a515020
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000000001f4
R13: 0000565007e4a340 R14: 0000565007f224d0 R15: 0000565005ba8524
 </TASK>

Showing all locks held in the system:
1 lock held by khungtaskd/163:
 #0: ffffffffae629b80 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x64/0x280
2 locks held by bash/3365:
 #0: ffff8881661803e0 (sb_writers#19){....}-{0:0}, at: ksys_write+0xee/0x1c0
 #1: ffff888192604b18 (&sb->s_type->i_mutex_key#19){....}-{3:3}, at: ceph_start_io_write+0x15/0x30
1 lock held by bash/6599:
 #0: ffff888192604b18 (&sb->s_type->i_mutex_key#19){....}-{3:3}, at: ceph_start_io_read+0x19/0x80
1 lock held by bash/6614:
 #0: ffff888192604b18 (&sb->s_type->i_mutex_key#19){....}-{3:3}, at: ceph_start_io_read+0x19/0x80

=============================================

* Re: 6.12 WARNING in netfs_consume_read_data()
From: David Howells @ 2024-12-06 21:13 UTC (permalink / raw)
To: Max Kellermann; +Cc: dhowells, Jeff Layton, netfs, linux-fsdevel, linux-kernel

Max Kellermann <max.kellermann@ionos.com> wrote:

> I tried with 6.12.3 + dhowells/netfs-writeback; David's branch solved
> many problems and it took much longer to trigger the hang, but it
> eventually occurred:

Can you get some tracing for me?

echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_failure/enable
echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_rreq/enable
echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_sreq/enable
echo 1 >/sys/kernel/debug/tracing/events/netfs/netfs_folio/enable
echo 1 >/sys/kernel/debug/tracing/events/cachefiles/cachefiles_read/enable
echo 1 >/sys/kernel/debug/tracing/events/cachefiles/cachefiles_write/enable

(And compress the result before posting.)

Thanks,
David
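The six enables above can also be driven from a small loop; a sketch
(the tracefs mount point is an assumption -- adjust if yours is
mounted elsewhere, and run the actual writes as root):

```shell
# Build the list of tracepoint control files requested above.
list_trace_controls() {
	tracing=${1:-/sys/kernel/debug/tracing}
	for ev in netfs/netfs_failure netfs/netfs_rreq netfs/netfs_sreq \
		  netfs/netfs_folio cachefiles/cachefiles_read \
		  cachefiles/cachefiles_write; do
		printf '%s/events/%s/enable\n' "$tracing" "$ev"
	done
}

# As root, enable them all:
#   for f in $(list_trace_controls); do echo 1 > "$f"; done
list_trace_controls
```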

* Re: 6.12 WARNING in netfs_consume_read_data()
From: Max Kellermann @ 2024-12-07 18:39 UTC (permalink / raw)
To: David Howells; +Cc: Jeff Layton, netfs, linux-fsdevel, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 997 bytes --]

On Fri, Dec 6, 2024 at 10:13 PM David Howells <dhowells@redhat.com> wrote:
> Can you get some tracing for me?

Here's a Brotli-compressed trace (of 6.12.3 + dhowells/netfs-writeback).
I can reproduce it this way:

1. log in via SSH (cm4all-lukko is our SSH server)
2. "cp" a file on the Ceph home directory
3. log out - bash hangs inside write():

[<0>] folio_wait_bit_common+0x23a/0x4f0
[<0>] folio_wait_private_2_killable+0x3b/0x80
[<0>] netfs_write_begin+0x5a5/0xb90
[<0>] ceph_write_begin+0x6f/0x150
[<0>] generic_perform_write+0x253/0x7d0
[<0>] ceph_write_iter+0xafd/0x1010
[<0>] vfs_write+0x5bc/0x1030
[<0>] ksys_write+0xee/0x1c0
[<0>] do_syscall_64+0x64/0x100
[<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e

bash tries to write the history file when logging out, and that hangs.
This apparently always happens after somebody does copy_file_range on
Ceph - "cp" in my example. Triggering this is trivial for me, so if
you need any more data, I can get it.

Max

[-- Attachment #2: trace.br --]
[-- Type: application/octet-stream, Size: 178978 bytes --]

* Re: 6.12 WARNING in netfs_consume_read_data()
From: Max Kellermann @ 2024-12-08 10:20 UTC (permalink / raw)
To: David Howells; +Cc: Jeff Layton, netfs, linux-fsdevel, linux-kernel

On Sat, Dec 7, 2024 at 7:39 PM Max Kellermann <max.kellermann@ionos.com> wrote:
> Here's a Brotli-compressed trace (of 6.12.3 +
> dhowells/netfs-writeback). I can reproduce it this way:
>
> 1. log in via SSH (cm4all-lukko is our SSH server)
> 2. "cp" a file on the Ceph home directory
> 3. log out - bash hangs inside write():

Today, I took the time to bisect it, and the bisect arrived at the
first bad commit: 796a4049640b54cb1daf9e7fe543292c5ca02c74 ("netfs: In
readahead, put the folio refs as soon extracted").

* [PATCH] netfs: Fix ceph copy to cache on write-begin
From: David Howells @ 2024-12-09 13:14 UTC (permalink / raw)
To: Max Kellermann; +Cc: dhowells, Jeff Layton, netfs, linux-fsdevel, linux-kernel

Hi Max,

Could you try this?

David
---
netfs: Fix ceph copy to cache on write-begin

At the end of netfs_unlock_read_folio(), in which folios are marked
appropriately for copying to the cache (either by being marked dirty
and having their private data set, or by having PG_private_2 set) and
then unlocked, the folio_queue struct has the entry pointing to the
folio cleared.  This presents a problem for
netfs_pgpriv2_write_to_the_cache(), which is used to write folios
marked with PG_private_2 to the cache, as it expects to be able to
trawl the folio_queue list thereafter to find the relevant folios,
leading to a hang.

Fix this by not clearing the folio_queue entry if we're going to do the
deprecated copy-to-cache.  The clearance will be done instead as the
folios are written to the cache.

This can be reproduced by starting cachefiles, mounting a ceph
filesystem with "-o fsc" and writing to it.

Reported-by: Max Kellermann <max.kellermann@ionos.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: Ilya Dryomov <idryomov@gmail.com>
cc: Xiubo Li <xiubli@redhat.com>
cc: netfs@lists.linux.dev
cc: ceph-devel@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
---
 fs/netfs/read_collect.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/netfs/read_collect.c b/fs/netfs/read_collect.c
index 849f40f64443..72a16222b63b 100644
--- a/fs/netfs/read_collect.c
+++ b/fs/netfs/read_collect.c
@@ -62,10 +62,14 @@ static void netfs_unlock_read_folio(struct netfs_io_subrequest *subreq,
 		} else {
 			trace_netfs_folio(folio, netfs_folio_trace_read_done);
 		}
+
+		folioq_clear(folioq, slot);
 	} else {
 		// TODO: Use of PG_private_2 is deprecated.
 		if (test_bit(NETFS_SREQ_COPY_TO_CACHE, &subreq->flags))
 			netfs_pgpriv2_mark_copy_to_cache(subreq, rreq, folioq, slot);
+		else
+			folioq_clear(folioq, slot);
 	}
 
 	if (!test_bit(NETFS_RREQ_DONT_UNLOCK_FOLIOS, &rreq->flags)) {
@@ -77,8 +81,6 @@ static void netfs_unlock_read_folio(struct netfs_io_subrequest *subreq,
 			folio_unlock(folio);
 		}
 	}
-
-	folioq_clear(folioq, slot);
 }
 
 /*

* Re: [PATCH] netfs: Fix ceph copy to cache on write-begin
From: Max Kellermann @ 2024-12-09 13:31 UTC (permalink / raw)
To: David Howells; +Cc: Jeff Layton, netfs, linux-fsdevel, linux-kernel

On Mon, Dec 9, 2024 at 2:14 PM David Howells <dhowells@redhat.com> wrote:
> Could you try this?

No change, still hangs immediately on the first try.

* Re: [PATCH] netfs: Fix ceph copy to cache on write-begin
From: David Howells @ 2024-12-12 16:00 UTC (permalink / raw)
To: Max Kellermann; +Cc: dhowells, Jeff Layton, netfs, linux-fsdevel, linux-kernel

How about if you add the attached?

For convenience, I've put the outstanding fix patches I have here:

    https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=netfs-fixes

David
---
commit d0bc2ecca996105f55da22e8867905ca1dad7c8f
Author: David Howells <dhowells@redhat.com>
Date:   Thu Dec 12 15:26:24 2024 +0000

    netfs: Fix the (non-)cancellation of copy when cache is temporarily disabled

    When the caching for a cookie is temporarily disabled (e.g. due to a
    DIO write on that file), future copying to the cache for that file is
    disabled until all fds open on that file are closed.

    However, if netfslib is using the deprecated PG_private_2 method (such
    as is currently used by ceph), and decides it wants to copy to the
    cache, netfs_advance_write() will just bail at the first check, seeing
    that the cache stream is unavailable, and indicate that it dealt with
    all the content.

    This means that we have no subrequests to provide notifications to
    drive the state machine or even to pin the request, and the request
    just gets discarded, leaving the folios with PG_private_2 set.

    Fix this by jumping directly to cancel the request if the cache is not
    available.  That way, we don't remove mark3 from the folio_queue list
    and netfs_pgpriv2_cancel() will clean up the folios.

    This was found by running the generic/013 xfstest against ceph with an
    active cache and the "-o fsc" option passed to ceph.  That would
    usually hang.

    Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")
    Reported-by: Max Kellermann <max.kellermann@ionos.com>
    Signed-off-by: David Howells <dhowells@redhat.com>
    cc: Jeff Layton <jlayton@kernel.org>
    cc: Ilya Dryomov <idryomov@gmail.com>
    cc: Xiubo Li <xiubli@redhat.com>
    cc: netfs@lists.linux.dev
    cc: ceph-devel@vger.kernel.org
    cc: linux-fsdevel@vger.kernel.org

diff --git a/fs/netfs/read_pgpriv2.c b/fs/netfs/read_pgpriv2.c
index ba5af89d37fa..54d5004fec18 100644
--- a/fs/netfs/read_pgpriv2.c
+++ b/fs/netfs/read_pgpriv2.c
@@ -170,6 +170,10 @@ void netfs_pgpriv2_write_to_the_cache(struct netfs_io_request *rreq)
 	trace_netfs_write(wreq, netfs_write_trace_copy_to_cache);
 	netfs_stat(&netfs_n_wh_copy_to_cache);
+	if (!wreq->io_streams[1].avail) {
+		netfs_put_request(wreq, false, netfs_rreq_trace_put_return);
+		goto couldnt_start;
+	}
 
 	for (;;) {
 		error = netfs_pgpriv2_copy_folio(wreq, folio);

* Re: [PATCH] netfs: Fix ceph copy to cache on write-begin
From: Max Kellermann @ 2025-01-07 10:58 UTC (permalink / raw)
To: David Howells; +Cc: Jeff Layton, netfs, linux-fsdevel, linux-kernel

On Thu, Dec 12, 2024 at 5:00 PM David Howells <dhowells@redhat.com> wrote:
> How about if you add the attached?

Still no change; the first try led to a hang, as always.

(Sorry for the late reply, I was offline on vacation.)