* [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter
@ 2025-03-22 15:54 syzbot
  2025-03-23 18:48 ` Oleg Nesterov
  ` (2 more replies)
  0 siblings, 3 replies; 48+ messages in thread
From: syzbot @ 2025-03-22 15:54 UTC (permalink / raw)
  To: brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel,
      netfs, oleg, syzkaller-bugs, viro

Hello,

syzbot found the following issue on:

HEAD commit:    fc444ada1310 Merge tag 'soc-fixes-6.14-2' of git://git.ker..
git tree:       upstream
console+strace: https://syzkaller.appspot.com/x/log.txt?x=1397319b980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=2e330e9768b5b8ff
dashboard link: https://syzkaller.appspot.com/bug?extid=62262fdc0e01d99573fc
compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1057319b980000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=10d6a44c580000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/924e6055daef/disk-fc444ada.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/0cd40093a53e/vmlinux-fc444ada.xz
kernel image: https://storage.googleapis.com/syzbot-assets/7370bbe4e1b8/bzImage-fc444ada.xz

The issue was bisected to:

commit aaec5a95d59615523db03dd53c2052f0a87beea7
Author: Oleg Nesterov <oleg@redhat.com>
Date:   Thu Jan 2 14:07:15 2025 +0000

    pipe_read: don't wake up the writer if the pipe is still full

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=14b6b19b980000
final oops:     https://syzkaller.appspot.com/x/report.txt?x=16b6b19b980000
console output: https://syzkaller.appspot.com/x/log.txt?x=12b6b19b980000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+62262fdc0e01d99573fc@syzkaller.appspotmail.com
Fixes: aaec5a95d596 ("pipe_read: don't wake up the writer if the pipe is still full")

INFO: task syz-executor309:5842 blocked for more than 143 seconds.
      Not tainted 6.14.0-rc7-syzkaller-00050-gfc444ada1310 #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor309 state:D stack:24944 pid:5842 tgid:5842 ppid:5838 task_flags:0x400140 flags:0x00004006
Call Trace:
 <TASK>
 context_switch kernel/sched/core.c:5378 [inline]
 __schedule+0xf43/0x5890 kernel/sched/core.c:6765
 __schedule_loop kernel/sched/core.c:6842 [inline]
 schedule+0xe7/0x350 kernel/sched/core.c:6857
 bit_wait+0x15/0xe0 kernel/sched/wait_bit.c:237
 __wait_on_bit+0x62/0x180 kernel/sched/wait_bit.c:49
 out_of_line_wait_on_bit+0xda/0x110 kernel/sched/wait_bit.c:64
 wait_on_bit include/linux/wait_bit.h:77 [inline]
 netfs_unbuffered_write_iter_locked+0xaa1/0xd30 fs/netfs/direct_write.c:108
 netfs_unbuffered_write_iter+0x413/0x6d0 fs/netfs/direct_write.c:195
 v9fs_file_write_iter+0xbf/0x100 fs/9p/vfs_file.c:404
 new_sync_write fs/read_write.c:586 [inline]
 vfs_write+0x5ae/0x1150 fs/read_write.c:679
 ksys_write+0x12b/0x250 fs/read_write.c:731
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7ff138d69f79
RSP: 002b:00007ffe82a84918 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0030656c69662f2e RCX: 00007ff138d69f79
RDX: 0000000000007fec RSI: 0000400000000540 RDI: 0000000000000007
RBP: 0000400000000000 R08: 0000000000000006 R09: 0000000000000006
R10: 0000000000000006 R11: 0000000000000246 R12: 0000400000000280
R13: 00007ff138db304e R14: 0000000000000001 R15: 0000000000000001
 </TASK>

Showing all locks held in the system:
1 lock held by khungtaskd/31:
 #0: ffffffff8e1bd140 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:337 [inline]
 #0: ffffffff8e1bd140 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:849 [inline]
 #0: ffffffff8e1bd140 (rcu_read_lock){....}-{1:3}, at: debug_show_all_locks+0x7f/0x390 kernel/locking/lockdep.c:6746
2 locks held by kworker/u8:3/58:
 #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3213
 #1: ffffc9000123fd18 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3214
2 locks held by getty/5585:
 #0: ffff88803117c0a0 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x24/0x80 drivers/tty/tty_ldisc.c:243
 #1: ffffc90002fde2f0 (&ldata->atomic_read_lock){+.+.}-{4:4}, at: n_tty_read+0xfba/0x1480 drivers/tty/n_tty.c:2211
2 locks held by syz-executor309/5842:
 #0: ffff888030b1a420 (sb_writers#10){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731
 #1: ffff88807ae98148 (&sb->s_type->i_mutex_key#15){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188

=============================================

NMI backtrace for cpu 0
CPU: 0 UID: 0 PID: 31 Comm: khungtaskd Not tainted 6.14.0-rc7-syzkaller-00050-gfc444ada1310 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120
 nmi_cpu_backtrace+0x27b/0x390 lib/nmi_backtrace.c:113
 nmi_trigger_cpumask_backtrace+0x29c/0x300 lib/nmi_backtrace.c:62
 trigger_all_cpu_backtrace include/linux/nmi.h:162 [inline]
 check_hung_uninterruptible_tasks kernel/hung_task.c:236 [inline]
 watchdog+0xf62/0x12b0 kernel/hung_task.c:399
 kthread+0x3af/0x750 kernel/kthread.c:464
 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:148
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
 </TASK>
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1 skipped: idling at native_safe_halt arch/x86/include/asm/irqflags.h:48 [inline]
NMI backtrace for cpu 1 skipped: idling at arch_safe_halt arch/x86/include/asm/irqflags.h:106 [inline]
NMI backtrace for cpu 1 skipped: idling at acpi_safe_halt+0x1a/0x20 drivers/acpi/processor_idle.c:111

---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

^ permalink raw reply	[flat|nested] 48+ messages in thread
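For context on the mechanics of the hang above: the blocked task is parked in
wait_on_bit() on the NETFS_RREQ_IN_PROGRESS flag of its write request, so it
can only resume once some other context clears that bit and issues the paired
wakeup. A minimal sketch of that generic kernel pattern (not the exact netfs
code) is:

    /* Waiter side: sleep in D state until BIT clears in *flags. */
    wait_on_bit(&flags, BIT, TASK_UNINTERRUPTIBLE);

    /* Waker side: clear the bit, then wake any bit-waiters. The barrier
     * orders the clear before the waitqueue check inside wake_up_bit(). */
    clear_bit_unlock(BIT, &flags);
    smp_mb__after_atomic();
    wake_up_bit(&flags, BIT);

If the waker side never runs, for instance because an I/O completion is lost,
the waiter sleeps uninterruptibly forever, which is exactly what the
hung-task watchdog reports here.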
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter
  2025-03-22 15:54 [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter syzbot @ 2025-03-23 18:48 ` Oleg Nesterov 2025-03-23 19:17 ` syzbot 2025-03-28 18:14 ` David Howells 2025-08-03 12:09 ` syzbot
  2 siblings, 1 reply; 48+ messages in thread
From: Oleg Nesterov @ 2025-03-23 18:48 UTC (permalink / raw)
  To: syzbot
  Cc: brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel,
      netfs, syzkaller-bugs, viro, K Prateek Nayak, Sapkal, Swapnil,
      Mateusz Guzik

On 03/22, syzbot wrote:
>
> HEAD commit:    fc444ada1310 Merge tag 'soc-fixes-6.14-2' of git://git.ker..
> git tree:       upstream
> console+strace: https://syzkaller.appspot.com/x/log.txt?x=1397319b980000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=2e330e9768b5b8ff
> dashboard link: https://syzkaller.appspot.com/bug?extid=62262fdc0e01d99573fc
> compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1057319b980000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=10d6a44c580000
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/924e6055daef/disk-fc444ada.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/0cd40093a53e/vmlinux-fc444ada.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/7370bbe4e1b8/bzImage-fc444ada.xz
>
> The issue was bisected to:
>
> commit aaec5a95d59615523db03dd53c2052f0a87beea7
> Author: Oleg Nesterov <oleg@redhat.com>
> Date:   Thu Jan 2 14:07:15 2025 +0000
>
>     pipe_read: don't wake up the writer if the pipe is still full

OMG :/

Just to ensure it does not help,

#syz test: upstream aaec5a95d59615523db03dd53c2052f0a87beea7

diff --git a/fs/pipe.c b/fs/pipe.c
index 82fede0f2111..7e36f54d21a5 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -417,8 +417,8 @@ static inline int is_packetized(struct file *file)
 /* Done while waiting without holding the pipe lock - thus the READ_ONCE() */
 static inline bool pipe_writable(const struct pipe_inode_info *pipe)
 {
-	unsigned int head = READ_ONCE(pipe->head);
 	unsigned int tail = READ_ONCE(pipe->tail);
+	unsigned int head = READ_ONCE(pipe->head);
 	unsigned int max_usage = READ_ONCE(pipe->max_usage);
 
 	return !pipe_full(head, tail, max_usage) ||

^ permalink raw reply related	[flat|nested] 48+ messages in thread
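For reference, pipe_writable() computes fullness from two independently
loaded cursors; pipe_full() is approximately the following, per
include/linux/pipe_fs_i.h of this era:

    static inline unsigned int pipe_occupancy(unsigned int head, unsigned int tail)
    {
            return head - tail;
    }

    static inline bool pipe_full(unsigned int head, unsigned int tail,
                                 unsigned int limit)
    {
            return pipe_occupancy(head, tail) >= limit;
    }

Because head and tail are loaded without the pipe lock, the order of the two
READ_ONCE() loads determines which cursor can be stale relative to the other;
the swap in the test patch probes whether that staleness can make a waiting
writer keep seeing the pipe as full.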
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-23 18:48 ` Oleg Nesterov @ 2025-03-23 19:17 ` syzbot 2025-03-23 19:47 ` Oleg Nesterov 2025-03-23 20:03 ` Mateusz Guzik 0 siblings, 2 replies; 48+ messages in thread From: syzbot @ 2025-03-23 19:17 UTC (permalink / raw) To: brauner, dhowells, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, mjguzik, netfs, oleg, swapnil.sapkal, syzkaller-bugs, viro Hello, syzbot has tested the proposed patch but the reproducer is still triggering an issue: INFO: task hung in netfs_unbuffered_write_iter INFO: task syz.0.17:6724 blocked for more than 143 seconds. Not tainted 6.13.0-rc1-syzkaller-00017-gaaec5a95d596-dirty #0 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:syz.0.17 state:D stack:24768 pid:6724 tgid:6722 ppid:6593 flags:0x00004004 Call Trace: <TASK> context_switch kernel/sched/core.c:5369 [inline] __schedule+0xe58/0x5ad0 kernel/sched/core.c:6756 __schedule_loop kernel/sched/core.c:6833 [inline] schedule+0xe7/0x350 kernel/sched/core.c:6848 bit_wait+0x15/0xe0 kernel/sched/wait_bit.c:237 __wait_on_bit+0x62/0x180 kernel/sched/wait_bit.c:49 out_of_line_wait_on_bit+0xda/0x110 kernel/sched/wait_bit.c:64 wait_on_bit include/linux/wait_bit.h:77 [inline] netfs_unbuffered_write_iter_locked+0xba8/0xe70 fs/netfs/direct_write.c:105 netfs_unbuffered_write_iter+0x413/0x6d0 fs/netfs/direct_write.c:193 v9fs_file_write_iter+0xbf/0x100 fs/9p/vfs_file.c:404 new_sync_write fs/read_write.c:586 [inline] vfs_write+0x5ae/0x1150 fs/read_write.c:679 ksys_write+0x12b/0x250 fs/read_write.c:731 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fce81f8d169 RSP: 002b:00007fce82d75038 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00007fce821a5fa0 RCX: 00007fce81f8d169 RDX: 0000000000007fec RSI: 0000400000000540 RDI: 0000000000000007 RBP: 00007fce8200e2a0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 00007fce821a5fa0 R15: 00007ffc98f4b948 </TASK> Showing all locks held in the system: 2 locks held by kworker/u8:0/11: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc90000107d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:1/12: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc90000117d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 1 lock held by khungtaskd/30: #0: ffffffff8e1bac80 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:337 [inline] #0: ffffffff8e1bac80 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:849 [inline] #0: ffffffff8e1bac80 (rcu_read_lock){....}-{1:3}, at: debug_show_all_locks+0x7f/0x390 kernel/locking/lockdep.c:6744 2 locks held by kworker/u8:2/35: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc90000ab7d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:3/56: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: 
process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc9000121fd80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:4/81: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc9000217fd80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:5/1317: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc90004897d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:6/3513: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc9000cadfd80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:7/3805: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc9000d66fd80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by getty/5580: #0: ffff88803216c0a0 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x24/0x80 drivers/tty/tty_ldisc.c:243 #1: ffffc90002fde2f0 (&ldata->atomic_read_lock){+.+.}-{4:4}, at: n_tty_read+0xfba/0x1480 drivers/tty/n_tty.c:2211 3 locks held by syz.0.17/6724: #0: ffff8880227fd438 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff88807965c420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff8880749e0148 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.1.19/6772: #0: ffff888028adf438 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff88802984c420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff888074838148 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.2.21/6792: #0: ffff888034b1bcf8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff88805ffda420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff8880748387b8 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 2 locks held by kworker/u8:8/6807: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc90003157d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 3 locks held by syz.3.23/6812: #0: ffff88807c3c70b8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff888026990420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff888074838e28 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.4.25/6838: #0: ffff8880794df438 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff88807c934420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff8880749e07b8 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: 
netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.5.27/6864: #0: ffff8880248b07f8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff88805ccc4420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff888074839498 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 2 locks held by kworker/u8:11/6865: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc90003117d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 3 locks held by syz.6.29/6891: #0: ffff8880355f7eb8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff888078eb2420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff888074839b08 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.7.31/6917: #0: ffff88807f33f0b8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff88807613e420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff88807483a178 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.8.34/6953: #0: ffff88806f5b4478 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff888078710420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff88807483ae58 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.9.36/6980: #0: ffff888029c18ef8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff888033c44420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff88807483b4c8 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 ============================================= NMI backtrace for cpu 0 CPU: 0 UID: 0 PID: 30 Comm: khungtaskd Not tainted 6.13.0-rc1-syzkaller-00017-gaaec5a95d596-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120 nmi_cpu_backtrace+0x27b/0x390 lib/nmi_backtrace.c:113 nmi_trigger_cpumask_backtrace+0x29c/0x300 lib/nmi_backtrace.c:62 trigger_all_cpu_backtrace include/linux/nmi.h:162 [inline] check_hung_uninterruptible_tasks kernel/hung_task.c:234 [inline] watchdog+0xf14/0x1240 kernel/hung_task.c:397 kthread+0x2c1/0x3a0 kernel/kthread.c:389 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> Sending NMI from CPU 0 to CPUs 1: NMI backtrace for cpu 1 CPU: 1 UID: 0 PID: 6813 Comm: kworker/u8:9 Not tainted 6.13.0-rc1-syzkaller-00017-gaaec5a95d596-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025 Workqueue: events_unbound nsim_dev_trap_report_work RIP: 0010:orc_ip arch/x86/kernel/unwind_orc.c:80 [inline] RIP: 0010:__orc_find+0x70/0xf0 arch/x86/kernel/unwind_orc.c:102 Code: ec 72 4e 4c 89 e2 48 29 ea 48 89 d6 48 c1 ea 3f 48 c1 fe 02 48 01 f2 48 d1 fa 48 8d 5c 95 00 48 89 da 48 c1 ea 03 0f b6 34 0a <48> 89 da 83 e2 07 83 c2 03 40 38 f2 7c 05 40 84 f6 75 4b 48 63 13 RSP: 0018:ffffc900031476d8 EFLAGS: 
00000a06 RAX: ffffffff90f0214a RBX: ffffffff906e9fd0 RCX: dffffc0000000000 RDX: 1ffffffff20dd3fa RSI: 0000000000000000 RDI: ffffffff906e9fa8 RBP: ffffffff906e9fd0 R08: ffffffff90f0218c R09: ffffffff90f117cc R10: ffffc900031477d8 R11: 0000000000004070 R12: ffffffff906e9fd0 R13: ffffffff814070f3 R14: ffffffff906e9fa8 R15: ffffffff906e9fcc FS: 0000000000000000(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000c008cfd000 CR3: 000000000df7e000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <NMI> </NMI> <TASK> orc_find arch/x86/kernel/unwind_orc.c:227 [inline] unwind_next_frame+0x2be/0x20c0 arch/x86/kernel/unwind_orc.c:494 __unwind_start+0x45f/0x7f0 arch/x86/kernel/unwind_orc.c:760 unwind_start arch/x86/include/asm/unwind.h:64 [inline] arch_stack_walk+0x74/0x100 arch/x86/kernel/stacktrace.c:24 stack_trace_save+0x95/0xd0 kernel/stacktrace.c:122 kasan_save_stack+0x33/0x60 mm/kasan/common.c:47 kasan_save_track+0x14/0x30 mm/kasan/common.c:68 kasan_save_free_info+0x3b/0x60 mm/kasan/generic.c:582 poison_slab_object mm/kasan/common.c:247 [inline] __kasan_slab_free+0x51/0x70 mm/kasan/common.c:264 kasan_slab_free include/linux/kasan.h:233 [inline] slab_free_hook mm/slub.c:2338 [inline] slab_free mm/slub.c:4598 [inline] kmem_cache_free+0x152/0x4c0 mm/slub.c:4700 kfree_skbmem+0x1a4/0x1f0 net/core/skbuff.c:1148 __kfree_skb net/core/skbuff.c:1205 [inline] consume_skb net/core/skbuff.c:1436 [inline] consume_skb+0xcc/0x100 net/core/skbuff.c:1430 nsim_dev_trap_report drivers/net/netdevsim/dev.c:821 [inline] nsim_dev_trap_report_work+0x8cf/0xd00 drivers/net/netdevsim/dev.c:851 process_one_work+0x9c5/0x1ba0 kernel/workqueue.c:3229 process_scheduled_works kernel/workqueue.c:3310 [inline] worker_thread+0x6c8/0xf00 kernel/workqueue.c:3391 kthread+0x2c1/0x3a0 kernel/kthread.c:389 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> Tested on: commit: aaec5a95 pipe_read: don't wake up the writer if the pi.. git tree: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git console output: https://syzkaller.appspot.com/x/log.txt?x=169ac43f980000 kernel config: https://syzkaller.appspot.com/x/.config?x=8d5a2956e94d7972 dashboard link: https://syzkaller.appspot.com/bug?extid=62262fdc0e01d99573fc compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40 patch: https://syzkaller.appspot.com/x/patch.diff?x=17803c4c580000 ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter
  2025-03-23 19:17 ` syzbot @ 2025-03-23 19:47 ` Oleg Nesterov 2025-03-23 19:50 ` K Prateek Nayak 2025-03-23 19:52 ` Mateusz Guzik
  0 siblings, 2 replies; 48+ messages in thread
From: Oleg Nesterov @ 2025-03-23 19:47 UTC (permalink / raw)
  To: syzbot
  Cc: brauner, dhowells, jack, jlayton, kprateek.nayak, linux-fsdevel,
      linux-kernel, mjguzik, netfs, swapnil.sapkal, syzkaller-bugs, viro

On 03/23, syzbot wrote:
>
> Hello,
>
> syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> INFO: task hung in netfs_unbuffered_write_iter

OK, as expected.

Dear syzbot, thank you.

So far I think this is another problem revealed by aaec5a95d59615523db03dd5
("pipe_read: don't wake up the writer if the pipe is still full").

I am going to forget about this report for now and return to it later, when
all the pending pipe-related changes in vfs.git are merged.

Oleg.

^ permalink raw reply	[flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter
  2025-03-23 19:47 ` Oleg Nesterov @ 2025-03-23 19:50 ` K Prateek Nayak
  0 siblings, 0 replies; 48+ messages in thread
From: K Prateek Nayak @ 2025-03-23 19:50 UTC (permalink / raw)
  To: Oleg Nesterov, syzbot
  Cc: brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel,
      mjguzik, netfs, swapnil.sapkal, syzkaller-bugs, viro

Hello Oleg,

On 3/24/2025 1:17 AM, Oleg Nesterov wrote:
> On 03/23, syzbot wrote:
>>
>> Hello,
>>
>> syzbot has tested the proposed patch but the reproducer is still triggering an issue:
>> INFO: task hung in netfs_unbuffered_write_iter
>
> OK, as expected.
>
> Dear syzbot, thank you.
>
> So far I think this is another problem revealed by aaec5a95d59615523db03dd5
> ("pipe_read: don't wake up the writer if the pipe is still full").
>
> I am going to forget about this report for now and return to it later, when
> all the pending pipe-related changes in vfs.git are merged.

P.S. I'm trying to repro this on my machine on latest upstream. Haven't
gotten lucky yet with the C reproducer with my config. I'm moving to
the syzbot config to see if it reveals something. If I have something
I'll report back here.

>
> Oleg.
>

--
Thanks and Regards,
Prateek

^ permalink raw reply	[flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter
  2025-03-23 19:47 ` Oleg Nesterov 2025-03-23 19:50 ` K Prateek Nayak @ 2025-03-23 19:52 ` Mateusz Guzik 2025-03-23 21:02 ` Oleg Nesterov
  1 sibling, 1 reply; 48+ messages in thread
From: Mateusz Guzik @ 2025-03-23 19:52 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: syzbot, brauner, dhowells, jack, jlayton, kprateek.nayak,
      linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro

On Sun, Mar 23, 2025 at 8:47 PM Oleg Nesterov <oleg@redhat.com> wrote:
>
> On 03/23, syzbot wrote:
> >
> > Hello,
> >
> > syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> > INFO: task hung in netfs_unbuffered_write_iter
>
> OK, as expected.
>
> Dear syzbot, thank you.
>
> So far I think this is another problem revealed by aaec5a95d59615523db03dd5
> ("pipe_read: don't wake up the writer if the pipe is still full").
>
> I am going to forget about this report for now and return to it later, when
> all the pending pipe-related changes in vfs.git are merged.
>

How do you ask syzbot for all stacks?

The reproducer *does* use pipes, but it is unclear to me if they play
any role here -- and notably we don't know if there is someone stuck
in pipe code, resulting in not waking up the reported thread.

--
Mateusz Guzik <mjguzik gmail.com>

^ permalink raw reply	[flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter
  2025-03-23 19:52 ` Mateusz Guzik @ 2025-03-23 21:02 ` Oleg Nesterov 2025-03-24 10:47 ` K Prateek Nayak
  0 siblings, 1 reply; 48+ messages in thread
From: Oleg Nesterov @ 2025-03-23 21:02 UTC (permalink / raw)
  To: Mateusz Guzik
  Cc: syzbot, brauner, dhowells, jack, jlayton, kprateek.nayak,
      linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro

Prateek, Mateusz, thanks for your participation!

On 03/23, Mateusz Guzik wrote:
>
> On Sun, Mar 23, 2025 at 8:47 PM Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > OK, as expected.
> >
> > Dear syzbot, thank you.
> >
> > So far I think this is another problem revealed by aaec5a95d59615523db03dd5
> > ("pipe_read: don't wake up the writer if the pipe is still full").
> >
> > I am going to forget about this report for now and return to it later, when
> > all the pending pipe-related changes in vfs.git are merged.
> >
>
> How do you ask syzbot for all stacks?

Heh, I don't know.

> The reproducer *does* use pipes, but it is unclear to me if they play
> any role here

please see the reproducer,

	https://syzkaller.appspot.com/x/repro.c?x=10d6a44c580000

	res = syscall(__NR_pipe2, /*pipefd=*/0x400000001900ul, /*flags=*/0ul);
	if (res != -1) {
		r[2] = *(uint32_t*)0x400000001900;
		r[3] = *(uint32_t*)0x400000001904;
	}

then

	res = syscall(__NR_dup, /*oldfd=*/r[3]);
	if (res != -1)
		r[4] = res;

so r[2] and r[4] are the read/write fd's.

then later

	memcpy((void*)0x400000000280, "trans=fd,", 9);
	...
	memcpy((void*)0x400000000289, "rfdno", 5);
	...
	sprintf((char*)0x40000000028f, "0x%016llx", (long long)r[2]);
	...
	memcpy((void*)0x4000000002a2, "wfdno", 5);
	...
	sprintf((char*)0x4000000002a8, "0x%016llx", (long long)r[4]);
	...
	syscall(__NR_mount, /*src=*/0ul, /*dst=*/0x400000000000ul,
		/*type=*/0x400000000040ul, /*flags=*/0ul, /*opts=*/0x400000000280ul);

so this pipe is actually used as "trans=fd".

> -- and notably we don't know if there is someone stuck
> in pipe code, resulting in not waking up the reported thread.

Yes, I am not familiar with 9p or netfs, so I don't know either.

Oleg.

^ permalink raw reply	[flat|nested] 48+ messages in thread
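Stripped of the raw syscall plumbing, the reproducer's setup is equivalent to
something like the following plain C (a sketch; the mount target and the
missing error handling are illustrative, not taken from the reproducer):

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/mount.h>

    int main(void)
    {
            int pfd[2];
            char opts[64];

            pipe(pfd);              /* pfd[0]: read end, pfd[1]: write end */
            int wfd = dup(pfd[1]);  /* duplicate of the write end */

            /* 9p over the fd transport: the client reads replies from
             * rfdno and sends requests to wfdno -- here both fds refer
             * to the same pipe, so the client effectively talks to
             * itself. */
            snprintf(opts, sizeof(opts), "trans=fd,rfdno=%d,wfdno=%d",
                     pfd[0], wfd);
            mount(NULL, "/mnt", "9p", 0, opts);
            return 0;
    }

So there is no real 9p server behind the mount: the transport is both ends of
one ordinary pipe, and any stall on the pipe side translates directly into
stuck 9p/netfs I/O.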
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter
  2025-03-23 21:02 ` Oleg Nesterov @ 2025-03-24 10:47 ` K Prateek Nayak 2025-03-24 11:15 ` syzbot 2025-03-24 13:17 ` Mateusz Guzik
  0 siblings, 2 replies; 48+ messages in thread
From: K Prateek Nayak @ 2025-03-24 10:47 UTC (permalink / raw)
  To: Oleg Nesterov, Mateusz Guzik
  Cc: syzbot, brauner, dhowells, jack, jlayton, linux-fsdevel,
      linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro

Hello Oleg, Mateusz,

On 3/24/2025 2:32 AM, Oleg Nesterov wrote:
> Prateek, Mateusz, thanks for your participation!
>
> On 03/23, Mateusz Guzik wrote:
>>
>> On Sun, Mar 23, 2025 at 8:47 PM Oleg Nesterov <oleg@redhat.com> wrote:
>>>
>>> OK, as expected.
>>>
>>> Dear syzbot, thank you.
>>>
>>> So far I think this is another problem revealed by aaec5a95d59615523db03dd5
>>> ("pipe_read: don't wake up the writer if the pipe is still full").
>>>
>>> I am going to forget about this report for now and return to it later, when
>>> all the pending pipe-related changes in vfs.git are merged.
>>>
>>
>> How do you ask syzbot for all stacks?
>
> Heh, I don't know.
>
>> The reproducer *does* use pipes, but it is unclear to me if they play
>> any role here
>
> please see the reproducer,
>
> https://syzkaller.appspot.com/x/repro.c?x=10d6a44c580000
>
> res = syscall(__NR_pipe2, /*pipefd=*/0x400000001900ul, /*flags=*/0ul);
> if (res != -1) {
> r[2] = *(uint32_t*)0x400000001900;
> r[3] = *(uint32_t*)0x400000001904;
> }
>
> then
>
> res = syscall(__NR_dup, /*oldfd=*/r[3]);
> if (res != -1)
> r[4] = res;
>
> so r[2] and r[4] are the read/write fd's.
>
> then later
>
> memcpy((void*)0x400000000280, "trans=fd,", 9);
> ...
> memcpy((void*)0x400000000289, "rfdno", 5);
> ...
> sprintf((char*)0x40000000028f, "0x%016llx", (long long)r[2]);
> ...
> memcpy((void*)0x4000000002a2, "wfdno", 5);
> ...
> sprintf((char*)0x4000000002a8, "0x%016llx", (long long)r[4]);
> ...
> syscall(__NR_mount, /*src=*/0ul, /*dst=*/0x400000000000ul,
> /*type=*/0x400000000040ul, /*flags=*/0ul, /*opts=*/0x400000000280ul);
>
> so this pipe is actually used as "trans=fd".
>
>> -- and notably we don't know if there is someone stuck
>> in pipe code, resulting in not waking up the reported thread.
>
> Yes, I am not familiar with 9p or netfs, so I don't know either.

Didn't have any luck reproducing this yet but I'm looking at
https://syzkaller.appspot.com/x/log.txt?x=1397319b980000
which is the trimmed log from the original report, and I see ...

[pid 5842] creat("./file0", 000) = 7
[ 137.753309][ T30] audit: type=1400 audit(1742312362.045:90): avc: denied { mount } for pid=5842 comm="syz-executor309" name="/" dev="9p" ino=2 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=filesystem permissive=1
[ 137.775741][ T30] audit: type=1400 audit(1742312362.065:91): avc: denied { setattr } for pid=5842 comm="syz-executor309" name="/" dev="9p" ino=2 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=file permissive=1
[ 137.798215][ T30] audit: type=1400 audit(1742312362.075:92): avc: denied { write } for pid=5842 comm="syz-executor309" dev="9p" ino=2 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=file permissive=1
[ 137.819189][ T30] audit: type=1400 audit(1742312362.075:93): avc: denied { open } for pid=5842 comm="syz-executor309" path="/file0" dev="9p" ino=2 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=file permissive=1
[pid 5842] write(7, "\x08\x00\x00\x00\x1a\x17\x92\x4a\xb2\x18\xea\xcb\x15\xa3\xfc\xcf\x92\x9e\x2d\xd2\x49\x79\x03\xc1\xf8\x53\xd9\x5b\x99\x5c\x65\xe9\x94\x49\xff\x95\x3f\xa1\x1c\x77\x23\xb2\x14\x9e\xcd\xaa\x7f\x83\x3f\x60\xe1\x3b\x19\xa6\x6e\x96\x3f\x7e\x8d\xa4\x29\x7e\xbb\xfd\xda\x5b\x36\xfb\x4d\x01\xbd\x02\xe6\xc6\x52\xdc\x4d\x99\xe2\xcb\x82\xc2\xa1\xd4\xa4\x5e\x4c\x89\xba\x99\x94\xe8\x2f\x85\x4b\xbc\x34\xa4\x0b\x3a"..., 32748 <unfinished ...>

So we have a "write" denied for pid 5842, and then it tries a write that
seems to hang. In all the hang traces, I see a denial for a task
followed by a hang for a task in the same tgid.

But since this is a "permissive" policy, it should not cause a hang,
only report that the program is in violation. Also I have no clue how a
spurious wakeup of a writer could then lead to progress.

Since in all cases the thread info flags "flags:0x00004006" has the
TIF_NOTIFY_SIGNAL bit set, I'm wondering if it has something to do with
the fact that pipe_read() directly returns -ERESTARTSYS in case of a
pending signal without any wakeups?

Well, here goes nothing I guess; totally a shot in the dark:

P.S. I think it should be wake_next_reader but if this does not hang,
perhaps it could point to some interaction with signal pending and
wakeup. There are some EPOLL semantics in pipe_write() which will cause
readers to wake up if the writer has signal_pending() and
pipe->poll_usage is set.

#syz test: upstream aaec5a95d59615523db03dd53c2052f0a87beea7

diff --git a/fs/pipe.c b/fs/pipe.c
index 82fede0f2111..9efeb86eaac5 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -359,6 +359,8 @@ pipe_read(struct kiocb *iocb, struct iov_iter *to)
 			ret = -EAGAIN;
 			break;
 		}
+		if (signal_pending(current) && pipe_full(pipe->head, pipe->tail, pipe->max_usage))
+			wake_writer = true;
 		mutex_unlock(&pipe->mutex);
 
 		/*

^ permalink raw reply related	[flat|nested] 48+ messages in thread
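The hypothesis rests on the classic interruptible-wait shape, in which a
pending signal aborts the read without waking the other side; schematically
(a simplified sketch, not the exact pipe_read() of this tree):

    for (;;) {
            /* ... copy data out of the pipe, advancing pipe->tail ... */

            if (signal_pending(current)) {
                    if (!ret)
                            ret = -ERESTARTSYS;
                    break;  /* bail out; no writer wakeup on this path */
            }

            /* ... unlock and sleep until the pipe becomes readable ... */
    }

The test diff above forces a writer wakeup whenever a signal is pending while
the pipe is still full, to see whether a wakeup missing on that path is what
starves the 9p transport.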
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-24 10:47 ` K Prateek Nayak @ 2025-03-24 11:15 ` syzbot 2025-03-24 13:17 ` Mateusz Guzik 1 sibling, 0 replies; 48+ messages in thread From: syzbot @ 2025-03-24 11:15 UTC (permalink / raw) To: brauner, dhowells, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, mjguzik, netfs, oleg, swapnil.sapkal, syzkaller-bugs, viro Hello, syzbot has tested the proposed patch but the reproducer is still triggering an issue: INFO: task hung in netfs_unbuffered_write_iter INFO: task syz.0.18:6758 blocked for more than 143 seconds. Not tainted 6.13.0-rc1-syzkaller-00017-gaaec5a95d596-dirty #0 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:syz.0.18 state:D stack:26608 pid:6758 tgid:6757 ppid:6540 flags:0x00004004 Call Trace: <TASK> context_switch kernel/sched/core.c:5369 [inline] __schedule+0xe58/0x5ad0 kernel/sched/core.c:6756 __schedule_loop kernel/sched/core.c:6833 [inline] schedule+0xe7/0x350 kernel/sched/core.c:6848 bit_wait+0x15/0xe0 kernel/sched/wait_bit.c:237 __wait_on_bit+0x62/0x180 kernel/sched/wait_bit.c:49 out_of_line_wait_on_bit+0xda/0x110 kernel/sched/wait_bit.c:64 wait_on_bit include/linux/wait_bit.h:77 [inline] netfs_unbuffered_write_iter_locked+0xba8/0xe70 fs/netfs/direct_write.c:105 netfs_unbuffered_write_iter+0x413/0x6d0 fs/netfs/direct_write.c:193 v9fs_file_write_iter+0xbf/0x100 fs/9p/vfs_file.c:404 new_sync_write fs/read_write.c:586 [inline] vfs_write+0x5ae/0x1150 fs/read_write.c:679 ksys_write+0x12b/0x250 fs/read_write.c:731 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fae42f8d169 RSP: 002b:00007fae43eb2038 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00007fae431a5fa0 RCX: 00007fae42f8d169 RDX: 0000000000007fec RSI: 0000400000000540 RDI: 0000000000000007 RBP: 00007fae4300e2a0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 00007fae431a5fa0 R15: 00007ffef04451c8 </TASK> Showing all locks held in the system: 2 locks held by kworker/u8:1/12: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc90000117d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 1 lock held by khungtaskd/30: #0: ffffffff8e1bac80 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:337 [inline] #0: ffffffff8e1bac80 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:849 [inline] #0: ffffffff8e1bac80 (rcu_read_lock){....}-{1:3}, at: debug_show_all_locks+0x7f/0x390 kernel/locking/lockdep.c:6744 2 locks held by kworker/u8:3/52: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc90000bd7d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:4/63: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc9000214fd80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:5/1034: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 
kernel/workqueue.c:3204 #1: ffffc9000403fd80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:6/2116: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc90005937d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:7/4367: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc9000f717d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:8/4487: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc9000fd07d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:9/4559: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc9000ffe7d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by getty/5582: #0: ffff8880313180a0 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x24/0x80 drivers/tty/tty_ldisc.c:243 #1: ffffc90002fde2f0 (&ldata->atomic_read_lock){+.+.}-{4:4}, at: n_tty_read+0xfba/0x1480 drivers/tty/n_tty.c:2211 3 locks held by syz.0.18/6758: #0: ffff888026eba478 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff888024f5a420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff8880755987b8 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.1.23/6787: #0: ffff88803111c9b8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff88807d6da420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff888075720148 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.2.25/6807: #0: ffff88803562d438 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff888027a9e420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff88807559a178 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.3.28/6829: #0: ffff888026d3e9b8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff888025ffa420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff8880757207b8 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.4.30/6854: #0: ffff888033e600f8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff888036adc420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff888075720e28 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 2 locks held by kworker/u8:10/6856: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc90002ee7d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 
3 locks held by syz.5.32/6881: #0: ffff88803162deb8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff88802aa02420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff88807559a7e8 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by kworker/u8:11/6882: 3 locks held by syz.6.34/6908: #0: ffff88804f312d38 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff888062c8c420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff888075721498 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 2 locks held by kworker/u8:12/6909: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc900021dfd80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 3 locks held by syz.7.36/6935: #0: ffff8880369530b8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff888063c48420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff888075721b08 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.8.38/6968: #0: ffff88801e2adeb8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff88807b25e420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff88807559ae58 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.9.40/6997: #0: ffff88802a74a7f8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff888053afa420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff888075722178 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 ============================================= NMI backtrace for cpu 1 CPU: 1 UID: 0 PID: 30 Comm: khungtaskd Not tainted 6.13.0-rc1-syzkaller-00017-gaaec5a95d596-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120 nmi_cpu_backtrace+0x27b/0x390 lib/nmi_backtrace.c:113 nmi_trigger_cpumask_backtrace+0x29c/0x300 lib/nmi_backtrace.c:62 trigger_all_cpu_backtrace include/linux/nmi.h:162 [inline] check_hung_uninterruptible_tasks kernel/hung_task.c:234 [inline] watchdog+0xf14/0x1240 kernel/hung_task.c:397 kthread+0x2c1/0x3a0 kernel/kthread.c:389 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> Sending NMI from CPU 1 to CPUs 0: NMI backtrace for cpu 0 CPU: 0 UID: 0 PID: 5182 Comm: klogd Not tainted 6.13.0-rc1-syzkaller-00017-gaaec5a95d596-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025 RIP: 0010:__raw_callee_save___pv_queued_spin_unlock+0x12/0x18 Code: 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 52 b8 01 00 00 00 31 d2 f0 0f b0 17 75 06 <5a> c3 cc cc cc cc 56 0f b6 f0 e8 9f ff ff ff 5e 5a c3 cc cc cc cc RSP: 0018:ffffc90005667618 EFLAGS: 00000046 RAX: 0000000000000001 RBX: ffffffff9a9f6bc8 RCX: ffffffff81770803 RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff9a9f6bc8 RBP: 
ffffffff9a9f6bd0 R08: 0000000000000000 R09: fffffbfff353ed79 R10: ffffffff9a9f6bcb R11: 0000000000000001 R12: ffffffff9a9f6bd8 R13: 0000000000000012 R14: 0000000000000000 R15: ffff888067f30000 FS: 00007fb083fb7500(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00005612bebd5240 CR3: 00000000347be000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <NMI> </NMI> <TASK> pv_queued_spin_unlock arch/x86/include/asm/paravirt.h:589 [inline] queued_spin_unlock arch/x86/include/asm/qspinlock.h:57 [inline] do_raw_spin_unlock+0x172/0x230 kernel/locking/spinlock_debug.c:142 __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:150 [inline] _raw_spin_unlock_irqrestore+0x22/0x80 kernel/locking/spinlock.c:194 __debug_check_no_obj_freed lib/debugobjects.c:1108 [inline] debug_check_no_obj_freed+0x327/0x600 lib/debugobjects.c:1129 free_pages_prepare mm/page_alloc.c:1134 [inline] free_unref_page+0x276/0x1080 mm/page_alloc.c:2657 __put_partials+0x14c/0x170 mm/slub.c:3142 qlink_free mm/kasan/quarantine.c:163 [inline] qlist_free_all+0x4e/0x120 mm/kasan/quarantine.c:179 kasan_quarantine_reduce+0x195/0x1e0 mm/kasan/quarantine.c:286 __kasan_slab_alloc+0x69/0x90 mm/kasan/common.c:329 kasan_slab_alloc include/linux/kasan.h:250 [inline] slab_post_alloc_hook mm/slub.c:4104 [inline] slab_alloc_node mm/slub.c:4153 [inline] kmem_cache_alloc_node_noprof+0x223/0x3c0 mm/slub.c:4205 __alloc_skb+0x2b1/0x380 net/core/skbuff.c:668 alloc_skb include/linux/skbuff.h:1323 [inline] alloc_skb_with_frags+0xe4/0x850 net/core/skbuff.c:6612 sock_alloc_send_pskb+0x7f1/0x980 net/core/sock.c:2881 unix_dgram_sendmsg+0x4b8/0x19e0 net/unix/af_unix.c:2027 sock_sendmsg_nosec net/socket.c:711 [inline] __sock_sendmsg net/socket.c:726 [inline] __sys_sendto+0x488/0x4f0 net/socket.c:2197 __do_sys_sendto net/socket.c:2204 [inline] __se_sys_sendto net/socket.c:2200 [inline] __x64_sys_sendto+0xe0/0x1c0 net/socket.c:2200 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fb0841199b5 Code: 8b 44 24 08 48 83 c4 28 48 98 c3 48 98 c3 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 26 45 31 c9 45 31 c0 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 76 7a 48 8b 15 44 c4 0c 00 f7 d8 64 89 02 48 83 RSP: 002b:00007ffd368d15b8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fb0841199b5 RDX: 0000000000000049 RSI: 0000558a1e31e270 RDI: 0000000000000003 RBP: 0000558a1e3172c0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000004000 R11: 0000000000000246 R12: 0000000000000013 R13: 00007fb0842a7212 R14: 00007ffd368d16b8 R15: 0000000000000000 </TASK> Tested on: commit: aaec5a95 pipe_read: don't wake up the writer if the pi.. git tree: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git console output: https://syzkaller.appspot.com/x/log.txt?x=12debc4c580000 kernel config: https://syzkaller.appspot.com/x/.config?x=8d5a2956e94d7972 dashboard link: https://syzkaller.appspot.com/bug?extid=62262fdc0e01d99573fc compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40 patch: https://syzkaller.appspot.com/x/patch.diff?x=10750198580000 ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter
  2025-03-24 10:47 ` K Prateek Nayak 2025-03-24 11:15 ` syzbot @ 2025-03-24 13:17 ` Mateusz Guzik 2025-03-24 13:19 ` K Prateek Nayak
  1 sibling, 1 reply; 48+ messages in thread
From: Mateusz Guzik @ 2025-03-24 13:17 UTC (permalink / raw)
  To: K Prateek Nayak
  Cc: Oleg Nesterov, syzbot, brauner, dhowells, jack, jlayton,
      linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro

On Mon, Mar 24, 2025 at 11:48 AM K Prateek Nayak <kprateek.nayak@amd.com> wrote:
>
> Hello Oleg, Mateusz,
>
> On 3/24/2025 2:32 AM, Oleg Nesterov wrote:
> > Prateek, Mateusz, thanks for your participation!
> >
> > On 03/23, Mateusz Guzik wrote:
> >>
> >> On Sun, Mar 23, 2025 at 8:47 PM Oleg Nesterov <oleg@redhat.com> wrote:
> >>>
> >>> OK, as expected.
> >>>
> >>> Dear syzbot, thank you.
> >>>
> >>> So far I think this is another problem revealed by aaec5a95d59615523db03dd5
> >>> ("pipe_read: don't wake up the writer if the pipe is still full").
> >>>
> >>> I am going to forget about this report for now and return to it later, when
> >>> all the pending pipe-related changes in vfs.git are merged.
> >>>
> >>
> >> How do you ask syzbot for all stacks?
> >
> > Heh, I don't know.
> >
> >> The reproducer *does* use pipes, but it is unclear to me if they play
> >> any role here
> >
> > please see the reproducer,
> >
> > https://syzkaller.appspot.com/x/repro.c?x=10d6a44c580000
> >
> > res = syscall(__NR_pipe2, /*pipefd=*/0x400000001900ul, /*flags=*/0ul);
> > if (res != -1) {
> > r[2] = *(uint32_t*)0x400000001900;
> > r[3] = *(uint32_t*)0x400000001904;
> > }
> >
> > then
> >
> > res = syscall(__NR_dup, /*oldfd=*/r[3]);
> > if (res != -1)
> > r[4] = res;
> >
> > so r[2] and r[4] are the read/write fd's.
> >
> > then later
> >
> > memcpy((void*)0x400000000280, "trans=fd,", 9);
> > ...
> > memcpy((void*)0x400000000289, "rfdno", 5);
> > ...
> > sprintf((char*)0x40000000028f, "0x%016llx", (long long)r[2]);
> > ...
> > memcpy((void*)0x4000000002a2, "wfdno", 5);
> > ...
> > sprintf((char*)0x4000000002a8, "0x%016llx", (long long)r[4]);
> > ...
> > syscall(__NR_mount, /*src=*/0ul, /*dst=*/0x400000000000ul,
> > /*type=*/0x400000000040ul, /*flags=*/0ul, /*opts=*/0x400000000280ul);
> >
> > so this pipe is actually used as "trans=fd".
> >
> >> -- and notably we don't know if there is someone stuck
> >> in pipe code, resulting in not waking up the reported thread.
> >
> > Yes, I am not familiar with 9p or netfs, so I don't know either.
>
> Didn't have any luck reproducing this yet but I'm looking at
> https://syzkaller.appspot.com/x/log.txt?x=1397319b980000
> which is the trimmed log from the original report, and I see ...
>
> [pid 5842] creat("./file0", 000) = 7
> [ 137.753309][ T30] audit: type=1400 audit(1742312362.045:90): avc: denied { mount } for pid=5842 comm="syz-executor309" name="/" dev="9p" ino=2 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=filesystem permissive=1
> [ 137.775741][ T30] audit: type=1400 audit(1742312362.065:91): avc: denied { setattr } for pid=5842 comm="syz-executor309" name="/" dev="9p" ino=2 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=file permissive=1
> [ 137.798215][ T30] audit: type=1400 audit(1742312362.075:92): avc: denied { write } for pid=5842 comm="syz-executor309" dev="9p" ino=2 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=file permissive=1
> [ 137.819189][ T30] audit: type=1400 audit(1742312362.075:93): avc: denied { open } for pid=5842 comm="syz-executor309" path="/file0" dev="9p" ino=2 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=file permissive=1
> [pid 5842] write(7, "\x08\x00\x00\x00\x1a\x17\x92\x4a\xb2\x18\xea\xcb\x15\xa3\xfc\xcf\x92\x9e\x2d\xd2\x49\x79\x03\xc1\xf8\x53\xd9\x5b\x99\x5c\x65\xe9\x94\x49\xff\x95\x3f\xa1\x1c\x77\x23\xb2\x14\x9e\xcd\xaa\x7f\x83\x3f\x60\xe1\x3b\x19\xa6\x6e\x96\x3f\x7e\x8d\xa4\x29\x7e\xbb\xfd\xda\x5b\x36\xfb\x4d\x01\xbd\x02\xe6\xc6\x52\xdc\x4d\x99\xe2\xcb\x82\xc2\xa1\xd4\xa4\x5e\x4c\x89\xba\x99\x94\xe8\x2f\x85\x4b\xbc\x34\xa4\x0b\x3a"..., 32748 <unfinished ...>
>
> So we have a "write" denied for pid 5842, and then it tries a write that
> seems to hang. In all the hang traces, I see a denial for a task
> followed by a hang for a task in the same tgid.
>
> But since this is a "permissive" policy, it should not cause a hang,
> only report that the program is in violation. Also I have no clue how a
> spurious wakeup of a writer could then lead to progress.
>
> Since in all cases the thread info flags "flags:0x00004006" has the
> TIF_NOTIFY_SIGNAL bit set, I'm wondering if it has something to do with
> the fact that pipe_read() directly returns -ERESTARTSYS in case of a
> pending signal without any wakeups?
>

Per syzbot this attempt did not work out either.

I think the blind stabs taken by everyone here are enough.

The report does not provide the crucial bit: what are the other
threads doing. Presumably someone else is stuck somewhere, possibly
not even in pipe code, and that stuck thread was supposed to wake up
the one which trips over the hung task detector. Figuring out what
that thread is is imo the next step.

I failed to find a relevant command in
https://github.com/google/syzkaller/blob/master/docs/syzbot.md

So if you guys know someone on the syzkaller side, maybe you can ask
them to tweak the report *or* maybe syzbot can test a "fix" which
makes the hung task detector also report all backtraces? I don't know
if that can work, the output may be long enough that it will get
trimmed by something.

I don't have time to work on this for now, just throwing ideas.

--
Mateusz Guzik <mjguzik gmail.com>

^ permalink raw reply	[flat|nested] 48+ messages in thread
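(For a local reproducer the traditional answer is SysRq-t, e.g.
"echo t > /proc/sysrq-trigger", which dumps the stack of every task on the
system. Note that the NMI backtraces syzbot already appends only cover CPUs
that are actively running something, so a sleeping culprit thread would never
show up in them.)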
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-24 13:17 ` Mateusz Guzik @ 2025-03-24 13:19 ` K Prateek Nayak 2025-03-24 14:52 ` K Prateek Nayak 0 siblings, 1 reply; 48+ messages in thread From: K Prateek Nayak @ 2025-03-24 13:19 UTC (permalink / raw) To: Mateusz Guzik Cc: Oleg Nesterov, syzbot, brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro Hello Mateusz, On 3/24/2025 6:47 PM, Mateusz Guzik wrote: > On Mon, Mar 24, 2025 at 11:48 AM K Prateek Nayak <kprateek.nayak@amd.com> wrote: >> >> Hello Oleg, Mateusz, >> >> On 3/24/2025 2:32 AM, Oleg Nesterov wrote: >>> Prateek, Mateusz, thanks for your participation! >>> >>> On 03/23, Mateusz Guzik wrote: >>>> >>>> On Sun, Mar 23, 2025 at 8:47 PM Oleg Nesterov <oleg@redhat.com> wrote: >>>>> >>>>> OK, as expected. >>>>> >>>>> Dear syzbot, thank you. >>>>> >>>>> So far I think this is another problem revealed by aaec5a95d59615523db03dd5 >>>>> ("pipe_read: don't wake up the writer if the pipe is still full"). >>>>> >>>>> I am going to forget about this report for now and return to it later, when >>>>> all the pending pipe-related changes in vfs.git are merged. >>>>> >>>> >>>> How do you ask syzbot for all stacks? >>> >>> Heh, I don't know. >>> >>>> The reproducer *does* use pipes, but it is unclear to me if they play >>>> any role here >>> >>> please see the reproducer, >>> >>> https://syzkaller.appspot.com/x/repro.c?x=10d6a44c580000 >>> >>> res = syscall(__NR_pipe2, /*pipefd=*/0x400000001900ul, /*flags=*/0ul); >>> if (res != -1) { >>> r[2] = *(uint32_t*)0x400000001900; >>> r[3] = *(uint32_t*)0x400000001904; >>> } >>> >>> then >>> >>> res = syscall(__NR_dup, /*oldfd=*/r[3]); >>> if (res != -1) >>> r[4] = res; >>> >>> so r[2] and r[4] are the read/write fd's. >>> >>> then later >>> >>> memcpy((void*)0x400000000280, "trans=fd,", 9); >>> ... >>> memcpy((void*)0x400000000289, "rfdno", 5); >>> ... >>> sprintf((char*)0x40000000028f, "0x%016llx", (long long)r[2]); >>> ... >>> memcpy((void*)0x4000000002a2, "wfdno", 5); >>> ... >>> sprintf((char*)0x4000000002a8, "0x%016llx", (long long)r[4]); >>> ... >>> syscall(__NR_mount, /*src=*/0ul, /*dst=*/0x400000000000ul, >>> /*type=*/0x400000000040ul, /*flags=*/0ul, /*opts=*/0x400000000280ul); >>> >>> so this pipe is actually used as "trans=fd". >>> >>>> -- and notably we don't know if there is someone stuck >>>> in pipe code, resulting in not waking up the reported thread. >>> >>> Yes, I am not familiar with 9p or netfs, so I don't know either. >> >> Didn't have any luck reproducing this yet but I'm looking at >> https://syzkaller.appspot.com/x/log.txt?x=1397319b980000 >> which is the trimmed log from original report and I see ... 
>> >> [pid 5842] creat("./file0", 000) = 7 >> [ 137.753309][ T30] audit: type=1400 audit(1742312362.045:90): avc: denied { mount } for pid=5842 comm="syz-executor309" name="/" dev="9p" ino=2 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=filesystem permissive=1 >> [ 137.775741][ T30] audit: type=1400 audit(1742312362.065:91): avc: denied { setattr } for pid=5842 comm="syz-executor309" name="/" dev="9p" ino=2 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=file permissive=1 >> [ 137.798215][ T30] audit: type=1400 audit(1742312362.075:92): avc: denied { write } for pid=5842 comm="syz-executor309" dev="9p" ino=2 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=file permissive=1 >> [ 137.819189][ T30] audit: type=1400 audit(1742312362.075:93): avc: denied { open } for pid=5842 comm="syz-executor309" path="/file0" dev="9p" ino=2 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=file permissive=1 >> [pid 5842] write(7, "\x08\x00\x00\x00\x1a\x17\x92\x4a\xb2\x18\xea\xcb\x15\xa3\xfc\xcf\x92\x9e\x2d\xd2\x49\x79\x03\xc1\xf8\x53\xd9\x5b\x99\x5c\x65\xe9\x94\x49\xff\x95\x3f\xa1\x1c\x77\x23\xb2\x14\x9e\xcd\xaa\x7f\x83\x3f\x60\xe1\x3b\x19\xa6\x6e\x96\x3f\x7e\x8d\xa4\x29\x7e\xbb\xfd\xda\x5b\x36\xfb\x4d\x01\xbd\x02\xe6\xc6\x52\xdc\x4d\x99\xe2\xcb\x82\xc2\xa1\xd4\xa4\x5e\x4c\x89\xba\x99\x94\xe8\x2f\x85\x4b\xbc\x34\xa4\x0b\x3a"..., 32748 <unfinished ...> >> >> So we have a "write" denied for pid 5842 and then it tries a write that >> seems to hangs. In all the traces for hang, I see a denied for a task >> followed by a hang for the task in the same tgid. >> >> But since this is a "permissive" policy, it should not cause a hang, >> only report that the program is in violation. Also I have no clue how a >> spurious wakeup of a writer could then lead to progress. >> >> Since in all cases the thread info flags "flags:0x00004006" has the >> TIF_NOTIFY_SIGNAL bit set, I'm wondering if it has something to do with >> the fact that pipe_read() directly return -ERESTARTSYS in case of a >> pending signal without any wakeups? >> > > Per syzbot this attempt did not work out either. > > I think the blind stabs taken by everyone here are enough. > > The report does not provide the crucial bit: what are the other > threads doing. Presumably someone else is stuck somewhere, possibly > not even in pipe code and that stuck thread was supposed to wake up > the one which trips over hung task detector. Figuring out what that > thread is imo the next step. > > I failed to find a relevant command in > https://github.com/google/syzkaller/blob/master/docs/syzbot.md > > So if you guys know someone on syzkaller side, maybe you can ask them > to tweak the report *or* maybe syzbot can test a "fix" which makes > hung task detector also report all backtraces? I don't know if that > can work, the output may be long enough that it will get trimmed by > something. > > I don't have to time work on this for now, just throwing ideas. I got the reproducer running locally. Tracing stuff currently to see what is tripping. Will report back once I find something interesting. Might take a while since the 9p bits are so far spread out. -- Thanks and Regards, Prateek ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-24 13:19 ` K Prateek Nayak @ 2025-03-24 14:52 ` K Prateek Nayak 2025-03-24 16:03 ` Mateusz Guzik 2025-03-25 2:52 ` K Prateek Nayak 0 siblings, 2 replies; 48+ messages in thread From: K Prateek Nayak @ 2025-03-24 14:52 UTC (permalink / raw) To: Oleg Nesterov, Mateusz Guzik Cc: syzbot, brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro Hello folks, Some updates. On 3/24/2025 6:49 PM, K Prateek Nayak wrote: >> >> Per syzbot this attempt did not work out either. >> >> I think the blind stabs taken by everyone here are enough. >> >> The report does not provide the crucial bit: what are the other >> threads doing. Presumably someone else is stuck somewhere, possibly >> not even in pipe code and that stuck thread was supposed to wake up >> the one which trips over hung task detector. Figuring out what that >> thread is imo the next step. >> >> I failed to find a relevant command in >> https://github.com/google/syzkaller/blob/master/docs/syzbot.md >> >> So if you guys know someone on syzkaller side, maybe you can ask them >> to tweak the report *or* maybe syzbot can test a "fix" which makes >> hung task detector also report all backtraces? I don't know if that >> can work, the output may be long enough that it will get trimmed by >> something. >> >> I don't have to time work on this for now, just throwing ideas. > > I got the reproducer running locally. Tracing stuff currently to see > what is tripping. Will report back once I find something interesting. > Might take a while since the 9p bits are so far spread out. > So far, with tracing, this is where I'm: o Mainline + Oleg's optimization reverted: ... kworker/43:1-1723 [043] ..... 115.309065: p9_read_work: Data read wait 55 kworker/43:1-1723 [043] ..... 115.309066: p9_read_work: Data read 55 kworker/43:1-1723 [043] ..... 115.309067: p9_read_work: Data read wait 7 kworker/43:1-1723 [043] ..... 115.309068: p9_read_work: Data read 7 repro-4138 [043] ..... 115.309084: netfs_wake_write_collector: Wake collector repro-4138 [043] ..... 115.309085: netfs_wake_write_collector: Queuing collector work repro-4138 [043] ..... 115.309088: netfs_unbuffered_write: netfs_unbuffered_write repro-4138 [043] ..... 115.309088: netfs_end_issue_write: netfs_end_issue_write repro-4138 [043] ..... 115.309089: netfs_end_issue_write: Write collector need poke 0 repro-4138 [043] ..... 115.309091: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! kworker/u1030:1-1951 [168] ..... 115.309096: netfs_wake_write_collector: Wake collector kworker/u1030:1-1951 [168] ..... 115.309097: netfs_wake_write_collector: Queuing collector work kworker/u1030:1-1951 [168] ..... 115.309102: netfs_write_collection_worker: Write collect clearing and waking up! ... (syzbot reproducer continues) o Mainline: kworker/185:1-1767 [185] ..... 109.485961: p9_read_work: Data read wait 7 kworker/185:1-1767 [185] ..... 109.485962: p9_read_work: Data read 7 kworker/185:1-1767 [185] ..... 109.485962: p9_read_work: Data read wait 55 kworker/185:1-1767 [185] ..... 109.485963: p9_read_work: Data read 55 repro-4038 [185] ..... 114.225717: netfs_wake_write_collector: Wake collector repro-4038 [185] ..... 114.225723: netfs_wake_write_collector: Queuing collector work repro-4038 [185] ..... 114.225727: netfs_unbuffered_write: netfs_unbuffered_write repro-4038 [185] ..... 114.225727: netfs_end_issue_write: netfs_end_issue_write repro-4038 [185] ..... 
114.225728: netfs_end_issue_write: Write collector need poke 0 repro-4038 [185] ..... 114.225728: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! ... (syzbot reproducer hangs) There is a third "kworker/u1030" component that never gets woken up for reasons currently unknown to me with Oleg's optimization. I'll keep digging. -- Thanks and Regards, Prateek ^ permalink raw reply [flat|nested] 48+ messages in thread
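The "Waiting on NETFS_RREQ_IN_PROGRESS!" line in these traces corresponds to an uninterruptible bit-wait that only the write collector can terminate; if the collector work item is never queued, the writer sleeps in D state forever, which is exactly what the hung-task report shows. A minimal sketch of the handoff, paraphrased from fs/netfs/direct_write.c and fs/netfs/write_collect.c (the same two calls are visible in the debug diff attached to a later message):

	/* Writer side, netfs_unbuffered_write_iter_locked(): sleep until the
	 * request completes; TASK_UNINTERRUPTIBLE means D state, immune to
	 * signals including SIGKILL. */
	wait_on_bit(&wreq->flags, NETFS_RREQ_IN_PROGRESS, TASK_UNINTERRUPTIBLE);

	/* Collector side, netfs_write_collection_worker(): the only place
	 * the bit is cleared and the waiter woken. */
	clear_and_wake_up_bit(NETFS_RREQ_IN_PROGRESS, &wreq->flags);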
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-24 14:52 ` K Prateek Nayak @ 2025-03-24 16:03 ` Mateusz Guzik 2025-03-24 16:25 ` K Prateek Nayak 2025-03-25 2:52 ` K Prateek Nayak 1 sibling, 1 reply; 48+ messages in thread From: Mateusz Guzik @ 2025-03-24 16:03 UTC (permalink / raw) To: K Prateek Nayak Cc: Oleg Nesterov, syzbot, brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro On Mon, Mar 24, 2025 at 3:52 PM K Prateek Nayak <kprateek.nayak@amd.com> wrote: > So far, with tracing, this is where I'm: > > o Mainline + Oleg's optimization reverted: > > ... > kworker/43:1-1723 [043] ..... 115.309065: p9_read_work: Data read wait 55 > kworker/43:1-1723 [043] ..... 115.309066: p9_read_work: Data read 55 > kworker/43:1-1723 [043] ..... 115.309067: p9_read_work: Data read wait 7 > kworker/43:1-1723 [043] ..... 115.309068: p9_read_work: Data read 7 > repro-4138 [043] ..... 115.309084: netfs_wake_write_collector: Wake collector > repro-4138 [043] ..... 115.309085: netfs_wake_write_collector: Queuing collector work > repro-4138 [043] ..... 115.309088: netfs_unbuffered_write: netfs_unbuffered_write > repro-4138 [043] ..... 115.309088: netfs_end_issue_write: netfs_end_issue_write > repro-4138 [043] ..... 115.309089: netfs_end_issue_write: Write collector need poke 0 > repro-4138 [043] ..... 115.309091: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! > kworker/u1030:1-1951 [168] ..... 115.309096: netfs_wake_write_collector: Wake collector > kworker/u1030:1-1951 [168] ..... 115.309097: netfs_wake_write_collector: Queuing collector work > kworker/u1030:1-1951 [168] ..... 115.309102: netfs_write_collection_worker: Write collect clearing and waking up! > ... (syzbot reproducer continues) > > o Mainline: > > kworker/185:1-1767 [185] ..... 109.485961: p9_read_work: Data read wait 7 > kworker/185:1-1767 [185] ..... 109.485962: p9_read_work: Data read 7 > kworker/185:1-1767 [185] ..... 109.485962: p9_read_work: Data read wait 55 > kworker/185:1-1767 [185] ..... 109.485963: p9_read_work: Data read 55 > repro-4038 [185] ..... 114.225717: netfs_wake_write_collector: Wake collector > repro-4038 [185] ..... 114.225723: netfs_wake_write_collector: Queuing collector work > repro-4038 [185] ..... 114.225727: netfs_unbuffered_write: netfs_unbuffered_write > repro-4038 [185] ..... 114.225727: netfs_end_issue_write: netfs_end_issue_write > repro-4038 [185] ..... 114.225728: netfs_end_issue_write: Write collector need poke 0 > repro-4038 [185] ..... 114.225728: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! > ... (syzbot reproducer hangs) > > There is a third "kworker/u1030" component that never gets woken up for > reasons currently unknown to me with Oleg's optimization. I'll keep > digging. > Thanks for the update. It is unclear to me if you checked, so I'm going to have to ask just in case: when there is a hang, is there *anyone* stuck in pipe code (and if so, where)? You can get the kernel to print stacks for all threads with sysrq: echo t > /proc/sysrq-trigger -- Mateusz Guzik <mjguzik gmail.com> ^ permalink raw reply [flat|nested] 48+ messages in thread
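Internally, sysrq-t ends up in show_state_filter(), and the debug "fix" Mateusz floated earlier (teaching the hung-task detector to report all backtraces) could call the same helper. A hypothetical sketch, assuming kernel/hung_task.c internals; check_hung_task(), sched_show_task() and show_state_filter() are real kernel functions, the extra call is the hack:

	#include <linux/sched/debug.h>

	/* Hypothetical debug hack: when a hung task is found, also dump the
	 * stacks of every TASK_UNINTERRUPTIBLE task, not just the hung one. */
	static void check_hung_task(struct task_struct *t, unsigned long timeout)
	{
		/* ... existing hung-task detection ... */
		sched_show_task(t);			 /* existing: the hung task   */
		show_state_filter(TASK_UNINTERRUPTIBLE); /* added: all D-state tasks  */
	}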
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-24 16:03 ` Mateusz Guzik @ 2025-03-24 16:25 ` K Prateek Nayak 2025-03-24 16:36 ` K Prateek Nayak 0 siblings, 1 reply; 48+ messages in thread From: K Prateek Nayak @ 2025-03-24 16:25 UTC (permalink / raw) To: Mateusz Guzik Cc: Oleg Nesterov, syzbot, brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro Hello Mateusz, On 3/24/2025 9:33 PM, Mateusz Guzik wrote: > On Mon, Mar 24, 2025 at 3:52 PM K Prateek Nayak <kprateek.nayak@amd.com> wrote: >> So far, with tracing, this is where I'm: >> >> o Mainline + Oleg's optimization reverted: >> >> ... >> kworker/43:1-1723 [043] ..... 115.309065: p9_read_work: Data read wait 55 >> kworker/43:1-1723 [043] ..... 115.309066: p9_read_work: Data read 55 >> kworker/43:1-1723 [043] ..... 115.309067: p9_read_work: Data read wait 7 >> kworker/43:1-1723 [043] ..... 115.309068: p9_read_work: Data read 7 >> repro-4138 [043] ..... 115.309084: netfs_wake_write_collector: Wake collector >> repro-4138 [043] ..... 115.309085: netfs_wake_write_collector: Queuing collector work >> repro-4138 [043] ..... 115.309088: netfs_unbuffered_write: netfs_unbuffered_write >> repro-4138 [043] ..... 115.309088: netfs_end_issue_write: netfs_end_issue_write >> repro-4138 [043] ..... 115.309089: netfs_end_issue_write: Write collector need poke 0 >> repro-4138 [043] ..... 115.309091: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! >> kworker/u1030:1-1951 [168] ..... 115.309096: netfs_wake_write_collector: Wake collector >> kworker/u1030:1-1951 [168] ..... 115.309097: netfs_wake_write_collector: Queuing collector work >> kworker/u1030:1-1951 [168] ..... 115.309102: netfs_write_collection_worker: Write collect clearing and waking up! >> ... (syzbot reproducer continues) >> >> o Mainline: >> >> kworker/185:1-1767 [185] ..... 109.485961: p9_read_work: Data read wait 7 >> kworker/185:1-1767 [185] ..... 109.485962: p9_read_work: Data read 7 >> kworker/185:1-1767 [185] ..... 109.485962: p9_read_work: Data read wait 55 >> kworker/185:1-1767 [185] ..... 109.485963: p9_read_work: Data read 55 >> repro-4038 [185] ..... 114.225717: netfs_wake_write_collector: Wake collector >> repro-4038 [185] ..... 114.225723: netfs_wake_write_collector: Queuing collector work >> repro-4038 [185] ..... 114.225727: netfs_unbuffered_write: netfs_unbuffered_write >> repro-4038 [185] ..... 114.225727: netfs_end_issue_write: netfs_end_issue_write >> repro-4038 [185] ..... 114.225728: netfs_end_issue_write: Write collector need poke 0 >> repro-4038 [185] ..... 114.225728: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! >> ... (syzbot reproducer hangs) >> >> There is a third "kworker/u1030" component that never gets woken up for >> reasons currently unknown to me with Oleg's optimization. I'll keep >> digging. >> > > Thanks for the update. > > It is unclear to me if you checked, so I'm going to have to ask just > in case: when there is a hang, is there *anyone* stuck in pipe code > (and if so, where)? > > You can get the kernel to print stacks for all threads with sysrq: > echo t > /proc/sysrq-trigger This dumps an insane amount of stuff on my console. Let me search if there is any reference to pipe somewhere in there. Meanwhile, for the reproducer threads themself, they are at: # ps aux | grep repro root 4245 0.0 0.0 19040 0 ? 
D 16:10 0:00 ./repro root 4306 0.0 0.0 19172 0 pts/5 S+ 16:13 0:00 ./repro root 4307 0.0 0.0 19040 0 pts/5 D 16:13 0:00 ./repro # cat /proc/4306/stack [<0>] do_wait+0xa9/0x100 [<0>] kernel_wait4+0xa9/0x150 [<0>] do_syscall_64+0x6f/0x110 [<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e # cat /proc/4307/stack [<0>] do_lock_mount+0x37/0x200 [<0>] path_mount+0x876/0xae0 [<0>] __x64_sys_mount+0x103/0x130 [<0>] do_syscall_64+0x6f/0x110 [<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e # cat /proc/4245/stack [<0>] netfs_unbuffered_write_iter_locked+0x222/0x2b0 [netfs] [<0>] netfs_unbuffered_write_iter+0x12d/0x1f0 [netfs] [<0>] vfs_write+0x307/0x420 [<0>] ksys_write+0x66/0xe0 [<0>] do_syscall_64+0x6f/0x110 [<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e For kworkers, I did: # for i in $(ps aux | grep kworker | tr -s ' ' | cut -d ' ' -f2); do cat /proc/$i/stack | grep pipe; done and I got no output. Same for the string "wait". > -- Thanks and Regards, Prateek ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-24 16:25 ` K Prateek Nayak @ 2025-03-24 16:36 ` K Prateek Nayak 0 siblings, 0 replies; 48+ messages in thread From: K Prateek Nayak @ 2025-03-24 16:36 UTC (permalink / raw) To: Mateusz Guzik Cc: Oleg Nesterov, syzbot, brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro On 3/24/2025 9:55 PM, K Prateek Nayak wrote: > Hello Mateusz, > > On 3/24/2025 9:33 PM, Mateusz Guzik wrote: >> On Mon, Mar 24, 2025 at 3:52 PM K Prateek Nayak <kprateek.nayak@amd.com> wrote: >>> So far, with tracing, this is where I'm: >>> >>> o Mainline + Oleg's optimization reverted: >>> >>> ... >>> kworker/43:1-1723 [043] ..... 115.309065: p9_read_work: Data read wait 55 >>> kworker/43:1-1723 [043] ..... 115.309066: p9_read_work: Data read 55 >>> kworker/43:1-1723 [043] ..... 115.309067: p9_read_work: Data read wait 7 >>> kworker/43:1-1723 [043] ..... 115.309068: p9_read_work: Data read 7 >>> repro-4138 [043] ..... 115.309084: netfs_wake_write_collector: Wake collector >>> repro-4138 [043] ..... 115.309085: netfs_wake_write_collector: Queuing collector work >>> repro-4138 [043] ..... 115.309088: netfs_unbuffered_write: netfs_unbuffered_write >>> repro-4138 [043] ..... 115.309088: netfs_end_issue_write: netfs_end_issue_write >>> repro-4138 [043] ..... 115.309089: netfs_end_issue_write: Write collector need poke 0 >>> repro-4138 [043] ..... 115.309091: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! >>> kworker/u1030:1-1951 [168] ..... 115.309096: netfs_wake_write_collector: Wake collector >>> kworker/u1030:1-1951 [168] ..... 115.309097: netfs_wake_write_collector: Queuing collector work >>> kworker/u1030:1-1951 [168] ..... 115.309102: netfs_write_collection_worker: Write collect clearing and waking up! >>> ... (syzbot reproducer continues) >>> >>> o Mainline: >>> >>> kworker/185:1-1767 [185] ..... 109.485961: p9_read_work: Data read wait 7 >>> kworker/185:1-1767 [185] ..... 109.485962: p9_read_work: Data read 7 >>> kworker/185:1-1767 [185] ..... 109.485962: p9_read_work: Data read wait 55 >>> kworker/185:1-1767 [185] ..... 109.485963: p9_read_work: Data read 55 >>> repro-4038 [185] ..... 114.225717: netfs_wake_write_collector: Wake collector >>> repro-4038 [185] ..... 114.225723: netfs_wake_write_collector: Queuing collector work >>> repro-4038 [185] ..... 114.225727: netfs_unbuffered_write: netfs_unbuffered_write >>> repro-4038 [185] ..... 114.225727: netfs_end_issue_write: netfs_end_issue_write >>> repro-4038 [185] ..... 114.225728: netfs_end_issue_write: Write collector need poke 0 >>> repro-4038 [185] ..... 114.225728: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! >>> ... (syzbot reproducer hangs) >>> >>> There is a third "kworker/u1030" component that never gets woken up for >>> reasons currently unknown to me with Oleg's optimization. I'll keep >>> digging. >>> >> >> Thanks for the update. >> >> It is unclear to me if you checked, so I'm going to have to ask just >> in case: when there is a hang, is there *anyone* stuck in pipe code >> (and if so, where)? >> >> You can get the kernel to print stacks for all threads with sysrq: >> echo t > /proc/sysrq-trigger > > This dumps an insane amount of stuff on my console. Let me search if there > is any reference to pipe somewhere in there. 
Only pipe_read() and pipe_write() pairs I found were for: [ 1043.618621] task:containerd state:S stack:0 pid:3567 tgid:3475 ppid:1 task_flags:0x400040 flags:0x00000002 [ 1043.629673] Call Trace: [ 1043.632133] <TASK> [ 1043.634252] __schedule+0x436/0x1620 [ 1043.637832] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1043.642638] ? syscall_exit_to_user_mode+0x51/0x1a0 [ 1043.647538] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1043.652347] ? __smp_call_single_queue+0xc7/0x150 [ 1043.657072] schedule+0x28/0x110 [ 1043.660320] futex_wait_queue+0x69/0xa0 [ 1043.664176] __futex_wait+0x13c/0x1b0 [ 1043.667863] ? __pfx_futex_wake_mark+0x10/0x10 [ 1043.672325] futex_wait+0x68/0x110 [ 1043.675747] ? __x64_sys_futex+0x77/0x1d0 [ 1043.679775] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1043.684585] ? pipe_write+0x42c/0x630 <----------- here [ 1043.688269] ? copy_fpstate_to_sigframe+0x330/0x3d0 [ 1043.693167] do_futex+0x13c/0x1d0 [ 1043.696501] __x64_sys_futex+0x77/0x1d0 [ 1043.700359] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1043.705170] ? vfs_write+0x376/0x420 [ 1043.708767] do_syscall_64+0x6f/0x110 [ 1043.712449] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1043.717259] ? ksys_write+0x90/0xe0 [ 1043.720768] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1043.725579] ? syscall_exit_to_user_mode+0x51/0x1a0 [ 1043.730475] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1043.735288] ? do_syscall_64+0x7b/0x110 [ 1043.739144] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1043.743953] ? irqentry_exit_to_user_mode+0x2e/0x160 [ 1043.748935] entry_SYSCALL_64_after_hwframe+0x76/0x7e and [ 1048.980518] task:dockerd state:S stack:0 pid:3591 tgid:3582 ppid:1 task_flags:0x400040 flags:0x00000002 [ 1048.991569] Call Trace: [ 1048.994024] <TASK> [ 1048.996143] __schedule+0x436/0x1620 [ 1048.999741] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.004554] ? get_nohz_timer_target+0x2a/0x180 [ 1049.009101] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.013911] ? timerqueue_add+0x6a/0xc0 [ 1049.017768] schedule+0x28/0x110 [ 1049.021017] schedule_hrtimeout_range_clock+0x78/0xd0 [ 1049.026088] ? __pfx_hrtimer_wakeup+0x10/0x10 [ 1049.030470] do_epoll_wait+0x666/0x7d0 [ 1049.034245] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.039053] ? __pfx_ep_autoremove_wake_function+0x10/0x10 [ 1049.044556] do_compat_epoll_pwait.part.0+0xc/0x70 [ 1049.049368] __x64_sys_epoll_pwait+0x83/0x140 [ 1049.053746] do_syscall_64+0x6f/0x110 [ 1049.057425] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.062235] ? do_syscall_64+0x7b/0x110 [ 1049.066079] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.070887] ? do_futex+0xc2/0x1d0 [ 1049.074309] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.079120] ? __x64_sys_futex+0x77/0x1d0 [ 1049.083151] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.087958] ? current_time+0x31/0x130 [ 1049.091731] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.096539] ? inode_update_timestamps+0xc8/0x110 [ 1049.101264] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.106074] ? generic_update_time+0x51/0x60 [ 1049.110365] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.115173] ? touch_atime+0xb5/0x100 [ 1049.118857] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.123671] ? pipe_read+0x3fe/0x480 <----------- here [ 1049.127265] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.132073] ? ep_done_scan+0xab/0xf0 [ 1049.135757] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.140567] ? do_epoll_wait+0xe1/0x7d0 [ 1049.144425] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.149234] ? do_compat_epoll_pwait.part.0+0xc/0x70 [ 1049.154218] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.159027] ? 
__x64_sys_epoll_pwait+0x83/0x140 [ 1049.163577] ? do_syscall_64+0x7b/0x110 [ 1049.167433] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.172243] ? syscall_exit_to_user_mode+0x51/0x1a0 [ 1049.177141] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.181951] ? do_syscall_64+0x7b/0x110 [ 1049.185811] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.190619] ? ksys_read+0x90/0xe0 [ 1049.194025] ? do_syscall_64+0x7b/0x110 [ 1049.197877] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.202682] ? syscall_exit_to_user_mode+0x51/0x1a0 [ 1049.207579] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.212391] ? do_syscall_64+0x7b/0x110 [ 1049.216243] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.221053] ? do_syscall_64+0x7b/0x110 [ 1049.224900] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.229703] ? do_syscall_64+0x7b/0x110 [ 1049.233561] entry_SYSCALL_64_after_hwframe+0x76/0x7e -- Thanks and Regards, Prateek ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-24 14:52 ` K Prateek Nayak 2025-03-24 16:03 ` Mateusz Guzik @ 2025-03-25 2:52 ` K Prateek Nayak 2025-03-25 12:15 ` Oleg Nesterov 1 sibling, 1 reply; 48+ messages in thread From: K Prateek Nayak @ 2025-03-25 2:52 UTC (permalink / raw) To: Oleg Nesterov, Mateusz Guzik Cc: syzbot, brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro On 3/24/2025 8:22 PM, K Prateek Nayak wrote: > Hello folks, > > Some updates. > > On 3/24/2025 6:49 PM, K Prateek Nayak wrote: >>> >>> Per syzbot this attempt did not work out either. >>> >>> I think the blind stabs taken by everyone here are enough. >>> >>> The report does not provide the crucial bit: what are the other >>> threads doing. Presumably someone else is stuck somewhere, possibly >>> not even in pipe code and that stuck thread was supposed to wake up >>> the one which trips over hung task detector. Figuring out what that >>> thread is imo the next step. >>> >>> I failed to find a relevant command in >>> https://github.com/google/syzkaller/blob/master/docs/syzbot.md >>> >>> So if you guys know someone on syzkaller side, maybe you can ask them >>> to tweak the report *or* maybe syzbot can test a "fix" which makes >>> hung task detector also report all backtraces? I don't know if that >>> can work, the output may be long enough that it will get trimmed by >>> something. >>> >>> I don't have to time work on this for now, just throwing ideas. >> >> I got the reproducer running locally. Tracing stuff currently to see >> what is tripping. Will report back once I find something interesting. >> Might take a while since the 9p bits are so far spread out. >> > > So far, with tracing, this is where I'm: > > o Mainline + Oleg's optimization reverted: > > ... > kworker/43:1-1723 [043] ..... 115.309065: p9_read_work: Data read wait 55 > kworker/43:1-1723 [043] ..... 115.309066: p9_read_work: Data read 55 > kworker/43:1-1723 [043] ..... 115.309067: p9_read_work: Data read wait 7 > kworker/43:1-1723 [043] ..... 115.309068: p9_read_work: Data read 7 > repro-4138 [043] ..... 115.309084: netfs_wake_write_collector: Wake collector > repro-4138 [043] ..... 115.309085: netfs_wake_write_collector: Queuing collector work > repro-4138 [043] ..... 115.309088: netfs_unbuffered_write: netfs_unbuffered_write > repro-4138 [043] ..... 115.309088: netfs_end_issue_write: netfs_end_issue_write > repro-4138 [043] ..... 115.309089: netfs_end_issue_write: Write collector need poke 0 > repro-4138 [043] ..... 115.309091: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! > kworker/u1030:1-1951 [168] ..... 115.309096: netfs_wake_write_collector: Wake collector > kworker/u1030:1-1951 [168] ..... 115.309097: netfs_wake_write_collector: Queuing collector work > kworker/u1030:1-1951 [168] ..... 115.309102: netfs_write_collection_worker: Write collect clearing and waking up! > ... (syzbot reproducer continues) > > o Mainline: > > kworker/185:1-1767 [185] ..... 109.485961: p9_read_work: Data read wait 7 > kworker/185:1-1767 [185] ..... 109.485962: p9_read_work: Data read 7 > kworker/185:1-1767 [185] ..... 109.485962: p9_read_work: Data read wait 55 > kworker/185:1-1767 [185] ..... 109.485963: p9_read_work: Data read 55 > repro-4038 [185] ..... 114.225717: netfs_wake_write_collector: Wake collector > repro-4038 [185] ..... 114.225723: netfs_wake_write_collector: Queuing collector work > repro-4038 [185] ..... 
114.225727: netfs_unbuffered_write: netfs_unbuffered_write > repro-4038 [185] ..... 114.225727: netfs_end_issue_write: netfs_end_issue_write > repro-4038 [185] ..... 114.225728: netfs_end_issue_write: Write collector need poke 0 > repro-4038 [185] ..... 114.225728: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! > ... (syzbot reproducer hangs) > > There is a third "kworker/u1030" component that never gets woken up for > reasons currently unknown to me with Oleg's optimization. I'll keep > digging. More data ... I chased this down to p9_client_rpc() net/9p/client.c specifically: err = c->trans_mod->request(c, req); if (err < 0) { /* write won't happen */ p9_req_put(c, req); if (err != -ERESTARTSYS && err != -EFAULT) c->status = Disconnected; goto recalc_sigpending; } c->trans_mod->request() calls p9_fd_request() in net/9p/trans_fd.c which basically does a p9_fd_poll(). Previously, the above would fail with err as -EIO which would cause the client to "Disconnect" and the retry logic would make progress. Now however, the err returned is -ERESTARTSYS which will not cause a disconnect and the retry logic will hang somewhere in p9_client_rpc() later. I'll chase it a little more but if 9p folks can chime in it would be great since I'm out of my depths here. P.S. There are more interactions at play and I'm trying to still piece them together. Relevant traces: o Mainline + Oleg's optimization reverted: repro-4161 [239] ..... 107.785644: p9_client_write: p9_client_rpc done repro-4161 [239] ..... 107.785644: p9_client_write: p9_pdup repro-4161 [239] ..... 107.785644: p9_client_write: iter revert repro-4161 [239] ..... 107.785644: p9_client_write: p9_client_rpc repro-4161 [239] ..... 107.785653: p9_fd_request: p9_fd_request repro-4161 [239] ...1. 107.785653: p9_fd_request: p9_fd_request error repro-4161 [239] ..... 107.785653: p9_client_rpc: Client disconnected (no write) <------------- "write won't happen" case repro-4161 [239] ..... 107.785655: p9_client_write: p9_client_rpc done repro-4161 [239] ..... 107.785655: p9_client_write: p9_client_rpc error (-5) <------------- -EIO repro-4161 [239] ..... 107.785656: v9fs_issue_write: Issue write done 2 err(-5) repro-4161 [239] ..... 107.785657: netfs_write_subrequest_terminated: Collector woken up from netfs_write_subrequest_terminated repro-4161 [239] ..... 107.785657: netfs_wake_write_collector: Wake collector repro-4161 [239] ..... 107.785658: netfs_wake_write_collector: Queuing collector work repro-4161 [239] ..... 107.785660: v9fs_issue_write: Issue write subrequest terminated 2 repro-4161 [239] ..... 107.785661: netfs_unbuffered_write: netfs_unbuffered_write repro-4161 [239] ..... 107.785661: netfs_end_issue_write: netfs_end_issue_write repro-4161 [239] ..... 107.785662: netfs_end_issue_write: Write collector need poke 0 repro-4161 [239] ..... 107.785662: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! kworker/u1038:0-1583 [105] ..... 107.785667: netfs_retry_writes: netfs_reissue_write 1 kworker/u1038:0-1583 [105] ..... 107.785668: v9fs_issue_write: v9fs_issue_write kworker/u1038:0-1583 [105] ..... 107.785669: p9_client_write: p9_client_rpc kworker/u1038:0-1583 [105] ..... 107.785669: p9_client_prepare_req: p9_client_prepare_req eio 1 kworker/u1038:0-1583 [105] ..... 107.785670: p9_client_rpc: p9_client_rpc early err return kworker/u1038:0-1583 [105] ..... 107.785670: p9_client_write: p9_client_rpc done kworker/u1038:0-1583 [105] ..... 
107.785671: p9_client_write: p9_client_rpc error (-5) kworker/u1038:0-1583 [105] ..... 107.785672: v9fs_issue_write: Issue write done 0 err(-5) kworker/u1038:0-1583 [105] ..... 107.785672: netfs_write_subrequest_terminated: Collector woken up kworker/u1038:0-1583 [105] ..... 107.785672: netfs_wake_write_collector: Wake collector kworker/u1038:0-1583 [105] ..... 107.785672: netfs_wake_write_collector: Queuing collector work kworker/u1038:0-1583 [105] ..... 107.785677: v9fs_issue_write: Issue write subrequest terminated 0 kworker/u1038:0-1583 [105] ..... 107.785684: netfs_write_collection_worker: Write collect clearing and waking up! repro-4161 [239] ..... 107.785883: p9_client_prepare_req: p9_client_prepare_req eio 1 ... (continues) o Mainline: repro-4161 [087] ..... 123.474660: p9_client_write: p9_client_rpc done repro-4161 [087] ..... 123.474661: p9_client_write: p9_pdup repro-4161 [087] ..... 123.474661: p9_client_write: iter revert repro-4161 [087] ..... 123.474661: p9_client_write: p9_client_rpc repro-4161 [087] ..... 123.474672: p9_fd_request: p9_fd_request repro-4161 [087] ..... 123.474673: p9_fd_request: p9_fd_request EPOLL repro-4161 [087] ..... 123.474673: p9_fd_poll: p9_fd_poll rd poll repro-4161 [087] ..... 123.474674: p9_fd_poll: p9_fd_request wr poll repro-4161 [087] ..... 128.233025: p9_client_write: p9_client_rpc done repro-4161 [087] ..... 128.233033: p9_client_write: p9_client_rpc error (-512) <------------- -ERESTARTSYS repro-4161 [087] ..... 128.233034: v9fs_issue_write: Issue write done 2 err(-512) repro-4161 [087] ..... 128.233035: netfs_write_subrequest_terminated: Collector woken repro-4161 [087] ..... 128.233036: netfs_wake_write_collector: Wake collector repro-4161 [087] ..... 128.233036: netfs_wake_write_collector: Queuing collector work repro-4161 [087] ..... 128.233040: v9fs_issue_write: Issue write subrequest terminated 2 repro-4161 [087] ..... 128.233040: netfs_unbuffered_write: netfs_unbuffered_write repro-4161 [087] ..... 128.233040: netfs_end_issue_write: netfs_end_issue_write repro-4161 [087] ..... 128.233041: netfs_end_issue_write: Write collector need poke 0 repro-4161 [087] ..... 128.233041: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! kworker/u1035:0-1580 [080] ..... 128.233077: netfs_retry_writes: netfs_reissue_write 1 kworker/u1035:0-1580 [080] ..... 128.233078: v9fs_issue_write: v9fs_issue_write kworker/u1035:0-1580 [080] ..... 128.233079: p9_client_write: p9_client_rpc kworker/u1035:0-1580 [080] ..... 128.233099: p9_fd_request: p9_fd_request kworker/u1035:0-1580 [080] ..... 128.233101: p9_fd_request: p9_fd_request EPOLL kworker/u1035:0-1580 [080] ..... 128.233101: p9_fd_poll: p9_fd_poll rd poll kworker/u1035:0-1580 [080] ..... 128.233101: p9_fd_poll: p9_fd_request wr poll (Hangs) -- Thanks and Regards, Prateek ^ permalink raw reply [flat|nested] 48+ messages in thread
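For context on the two trace sets above: p9_fd_request() (net/9p/trans_fd.c) fails up front only when the connection has already latched an error in m->err; otherwise it queues the request, polls the fds, and kicks the write worker, after which p9_client_rpc() sleeps waiting for a reply. A rough paraphrase from memory, with details such as the Wpending/Wworksched bit handling omitted:

	static int p9_fd_request(struct p9_client *client, struct p9_req_t *req)
	{
		struct p9_trans_fd *ts = client->trans;
		struct p9_conn *m = &ts->conn;

		if (m->err < 0)		/* the "p9_fd_request error" trace; the
					 * caller only disconnects for errors
					 * other than -ERESTARTSYS/-EFAULT */
			return m->err;

		spin_lock(&m->req_lock);
		req->status = REQ_STATUS_UNSENT;
		list_add_tail(&req->req_list, &m->unsent_req_list);
		spin_unlock(&m->req_lock);

		/* the "EPOLL"/"rd poll"/"wr poll" traces: */
		if (p9_fd_poll(m->client, NULL, NULL) & EPOLLOUT)
			schedule_work(&m->wq);	/* wake the write worker */
		return 0;
	}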
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-25 2:52 ` K Prateek Nayak @ 2025-03-25 12:15 ` Oleg Nesterov 2025-03-25 12:36 ` Dominique Martinet 2025-03-25 13:04 ` Oleg Nesterov 0 siblings, 2 replies; 48+ messages in thread From: Oleg Nesterov @ 2025-03-25 12:15 UTC (permalink / raw) To: K Prateek Nayak, Eric Van Hensbergen, Latchesar Ionkov, Dominique Martinet, Christian Schoenebeck Cc: Mateusz Guzik, syzbot, brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro, v9fs Prateek, thanks again for your hard work! Yes, I think we need the help from 9p folks. I know nothing about it, but so far this still looks as a 9p problem to me... All I can say right now is that the "sigpending" logic in p9_client_rpc() looks wrong. If nothing else: - clear_thread_flag(TIF_SIGPENDING) is not enough, it won't make signal_pending() false if TIF_NOTIFY_SIGNAL is set. - otoh, if signal_pending() was true because of pending SIGKILL, then after clear_thread_flag(TIF_SIGPENDING) wait_event_killable() will act as uninterruptible wait_event(). Oleg. On 03/25, K Prateek Nayak wrote: > > On 3/24/2025 8:22 PM, K Prateek Nayak wrote: > >Hello folks, > > > >Some updates. > > > >On 3/24/2025 6:49 PM, K Prateek Nayak wrote: > >>> > >>>Per syzbot this attempt did not work out either. > >>> > >>>I think the blind stabs taken by everyone here are enough. > >>> > >>>The report does not provide the crucial bit: what are the other > >>>threads doing. Presumably someone else is stuck somewhere, possibly > >>>not even in pipe code and that stuck thread was supposed to wake up > >>>the one which trips over hung task detector. Figuring out what that > >>>thread is imo the next step. > >>> > >>>I failed to find a relevant command in > >>>https://github.com/google/syzkaller/blob/master/docs/syzbot.md > >>> > >>>So if you guys know someone on syzkaller side, maybe you can ask them > >>>to tweak the report *or* maybe syzbot can test a "fix" which makes > >>>hung task detector also report all backtraces? I don't know if that > >>>can work, the output may be long enough that it will get trimmed by > >>>something. > >>> > >>>I don't have to time work on this for now, just throwing ideas. > >> > >>I got the reproducer running locally. Tracing stuff currently to see > >>what is tripping. Will report back once I find something interesting. > >>Might take a while since the 9p bits are so far spread out. > >> > > > >So far, with tracing, this is where I'm: > > > >o Mainline + Oleg's optimization reverted: > > > > ... > > kworker/43:1-1723 [043] ..... 115.309065: p9_read_work: Data read wait 55 > > kworker/43:1-1723 [043] ..... 115.309066: p9_read_work: Data read 55 > > kworker/43:1-1723 [043] ..... 115.309067: p9_read_work: Data read wait 7 > > kworker/43:1-1723 [043] ..... 115.309068: p9_read_work: Data read 7 > > repro-4138 [043] ..... 115.309084: netfs_wake_write_collector: Wake collector > > repro-4138 [043] ..... 115.309085: netfs_wake_write_collector: Queuing collector work > > repro-4138 [043] ..... 115.309088: netfs_unbuffered_write: netfs_unbuffered_write > > repro-4138 [043] ..... 115.309088: netfs_end_issue_write: netfs_end_issue_write > > repro-4138 [043] ..... 115.309089: netfs_end_issue_write: Write collector need poke 0 > > repro-4138 [043] ..... 115.309091: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! > > kworker/u1030:1-1951 [168] ..... 
115.309096: netfs_wake_write_collector: Wake collector > > kworker/u1030:1-1951 [168] ..... 115.309097: netfs_wake_write_collector: Queuing collector work > > kworker/u1030:1-1951 [168] ..... 115.309102: netfs_write_collection_worker: Write collect clearing and waking up! > > ... (syzbot reproducer continues) > > > >o Mainline: > > > > kworker/185:1-1767 [185] ..... 109.485961: p9_read_work: Data read wait 7 > > kworker/185:1-1767 [185] ..... 109.485962: p9_read_work: Data read 7 > > kworker/185:1-1767 [185] ..... 109.485962: p9_read_work: Data read wait 55 > > kworker/185:1-1767 [185] ..... 109.485963: p9_read_work: Data read 55 > > repro-4038 [185] ..... 114.225717: netfs_wake_write_collector: Wake collector > > repro-4038 [185] ..... 114.225723: netfs_wake_write_collector: Queuing collector work > > repro-4038 [185] ..... 114.225727: netfs_unbuffered_write: netfs_unbuffered_write > > repro-4038 [185] ..... 114.225727: netfs_end_issue_write: netfs_end_issue_write > > repro-4038 [185] ..... 114.225728: netfs_end_issue_write: Write collector need poke 0 > > repro-4038 [185] ..... 114.225728: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! > > ... (syzbot reproducer hangs) > > > >There is a third "kworker/u1030" component that never gets woken up for > >reasons currently unknown to me with Oleg's optimization. I'll keep > >digging. > > More data ... > > I chased this down to p9_client_rpc() net/9p/client.c specifically: > > err = c->trans_mod->request(c, req); > if (err < 0) { > /* write won't happen */ > p9_req_put(c, req); > if (err != -ERESTARTSYS && err != -EFAULT) > c->status = Disconnected; > goto recalc_sigpending; > } > > c->trans_mod->request() calls p9_fd_request() in net/9p/trans_fd.c > which basically does a p9_fd_poll(). > > Previously, the above would fail with err as -EIO which would > cause the client to "Disconnect" and the retry logic would make > progress. Now however, the err returned is -ERESTARTSYS which > will not cause a disconnect and the retry logic will hang > somewhere in p9_client_rpc() later. > > I'll chase it a little more but if 9p folks can chime in it would > be great since I'm out of my depths here. > > P.S. There are more interactions at play and I'm trying to still > piece them together. > > Relevant traces: > > o Mainline + Oleg's optimization reverted: > > repro-4161 [239] ..... 107.785644: p9_client_write: p9_client_rpc done > repro-4161 [239] ..... 107.785644: p9_client_write: p9_pdup > repro-4161 [239] ..... 107.785644: p9_client_write: iter revert > repro-4161 [239] ..... 107.785644: p9_client_write: p9_client_rpc > repro-4161 [239] ..... 107.785653: p9_fd_request: p9_fd_request > repro-4161 [239] ...1. 107.785653: p9_fd_request: p9_fd_request error > repro-4161 [239] ..... 107.785653: p9_client_rpc: Client disconnected (no write) <------------- "write won't happen" case > repro-4161 [239] ..... 107.785655: p9_client_write: p9_client_rpc done > repro-4161 [239] ..... 107.785655: p9_client_write: p9_client_rpc error (-5) <------------- -EIO > repro-4161 [239] ..... 107.785656: v9fs_issue_write: Issue write done 2 err(-5) > repro-4161 [239] ..... 107.785657: netfs_write_subrequest_terminated: Collector woken up from netfs_write_subrequest_terminated > repro-4161 [239] ..... 107.785657: netfs_wake_write_collector: Wake collector > repro-4161 [239] ..... 107.785658: netfs_wake_write_collector: Queuing collector work > repro-4161 [239] ..... 107.785660: v9fs_issue_write: Issue write subrequest terminated 2 > repro-4161 [239] ..... 
107.785661: netfs_unbuffered_write: netfs_unbuffered_write > repro-4161 [239] ..... 107.785661: netfs_end_issue_write: netfs_end_issue_write > repro-4161 [239] ..... 107.785662: netfs_end_issue_write: Write collector need poke 0 > repro-4161 [239] ..... 107.785662: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! > kworker/u1038:0-1583 [105] ..... 107.785667: netfs_retry_writes: netfs_reissue_write 1 > kworker/u1038:0-1583 [105] ..... 107.785668: v9fs_issue_write: v9fs_issue_write > kworker/u1038:0-1583 [105] ..... 107.785669: p9_client_write: p9_client_rpc > kworker/u1038:0-1583 [105] ..... 107.785669: p9_client_prepare_req: p9_client_prepare_req eio 1 > kworker/u1038:0-1583 [105] ..... 107.785670: p9_client_rpc: p9_client_rpc early err return > kworker/u1038:0-1583 [105] ..... 107.785670: p9_client_write: p9_client_rpc done > kworker/u1038:0-1583 [105] ..... 107.785671: p9_client_write: p9_client_rpc error (-5) > kworker/u1038:0-1583 [105] ..... 107.785672: v9fs_issue_write: Issue write done 0 err(-5) > kworker/u1038:0-1583 [105] ..... 107.785672: netfs_write_subrequest_terminated: Collector woken up > kworker/u1038:0-1583 [105] ..... 107.785672: netfs_wake_write_collector: Wake collector > kworker/u1038:0-1583 [105] ..... 107.785672: netfs_wake_write_collector: Queuing collector work > kworker/u1038:0-1583 [105] ..... 107.785677: v9fs_issue_write: Issue write subrequest terminated 0 > kworker/u1038:0-1583 [105] ..... 107.785684: netfs_write_collection_worker: Write collect clearing and waking up! > repro-4161 [239] ..... 107.785883: p9_client_prepare_req: p9_client_prepare_req eio 1 > ... (continues) > > o Mainline: > > repro-4161 [087] ..... 123.474660: p9_client_write: p9_client_rpc done > repro-4161 [087] ..... 123.474661: p9_client_write: p9_pdup > repro-4161 [087] ..... 123.474661: p9_client_write: iter revert > repro-4161 [087] ..... 123.474661: p9_client_write: p9_client_rpc > repro-4161 [087] ..... 123.474672: p9_fd_request: p9_fd_request > repro-4161 [087] ..... 123.474673: p9_fd_request: p9_fd_request EPOLL > repro-4161 [087] ..... 123.474673: p9_fd_poll: p9_fd_poll rd poll > repro-4161 [087] ..... 123.474674: p9_fd_poll: p9_fd_request wr poll > repro-4161 [087] ..... 128.233025: p9_client_write: p9_client_rpc done > repro-4161 [087] ..... 128.233033: p9_client_write: p9_client_rpc error (-512) <------------- -ERESTARTSYS > repro-4161 [087] ..... 128.233034: v9fs_issue_write: Issue write done 2 err(-512) > repro-4161 [087] ..... 128.233035: netfs_write_subrequest_terminated: Collector woken > repro-4161 [087] ..... 128.233036: netfs_wake_write_collector: Wake collector > repro-4161 [087] ..... 128.233036: netfs_wake_write_collector: Queuing collector work > repro-4161 [087] ..... 128.233040: v9fs_issue_write: Issue write subrequest terminated 2 > repro-4161 [087] ..... 128.233040: netfs_unbuffered_write: netfs_unbuffered_write > repro-4161 [087] ..... 128.233040: netfs_end_issue_write: netfs_end_issue_write > repro-4161 [087] ..... 128.233041: netfs_end_issue_write: Write collector need poke 0 > repro-4161 [087] ..... 128.233041: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! > kworker/u1035:0-1580 [080] ..... 128.233077: netfs_retry_writes: netfs_reissue_write 1 > kworker/u1035:0-1580 [080] ..... 128.233078: v9fs_issue_write: v9fs_issue_write > kworker/u1035:0-1580 [080] ..... 128.233079: p9_client_write: p9_client_rpc > kworker/u1035:0-1580 [080] ..... 128.233099: p9_fd_request: p9_fd_request > kworker/u1035:0-1580 [080] ..... 
128.233101: p9_fd_request: p9_fd_request EPOLL > kworker/u1035:0-1580 [080] ..... 128.233101: p9_fd_poll: p9_fd_poll rd poll > kworker/u1035:0-1580 [080] ..... 128.233101: p9_fd_poll: p9_fd_request wr poll > (Hangs) > > -- > Thanks and Regards, > Prateek > ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-25 12:15 ` Oleg Nesterov @ 2025-03-25 12:36 ` Dominique Martinet 2025-03-25 13:04 ` Oleg Nesterov 1 sibling, 0 replies; 48+ messages in thread From: Dominique Martinet @ 2025-03-25 12:36 UTC (permalink / raw) To: Oleg Nesterov Cc: K Prateek Nayak, Eric Van Hensbergen, Latchesar Ionkov, Christian Schoenebeck, Mateusz Guzik, syzbot, brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro, v9fs Thanks for the Cc. Just replying quickly without looking at anything. Oleg Nesterov wrote on Tue, Mar 25, 2025 at 01:15:26PM +0100: > All I can say right now is that the "sigpending" logic in p9_client_rpc() > looks wrong. If nothing else: > > - clear_thread_flag(TIF_SIGPENDING) is not enough, it won't make > signal_pending() false if TIF_NOTIFY_SIGNAL is set. > > - otoh, if signal_pending() was true because of pending SIGKILL, > then after clear_thread_flag(TIF_SIGPENDING) wait_event_killable() > will act as uninterruptible wait_event(). Yeah, this is effectively an unkillable event loop once a flush has been sent; this is a known issue. I've tried to address this with async rpc (so we could send the flush and forget about it), but that caused other regressions and I never had time to dig into these... The patches date back to 2018 and probably won't even apply cleanly anymore, but if anyone cares they are here: https://lore.kernel.org/all/1544532108-21689-3-git-send-email-asmadeus@codewreck.org/T/#u (the hard work of refcounting was done just before that in order to kill this pattern; I just pretty much ran out of free time at that point, hobbies are hard...) So: sorry, it's probably possible to improve this, but it won't be easy or immediate. > > c->trans_mod->request() calls p9_fd_request() in net/9p/trans_fd.c > > which basically does a p9_fd_poll(). > > > > Previously, the above would fail with err as -EIO which would > > cause the client to "Disconnect" and the retry logic would make > > progress. Now however, the err returned is -ERESTARTSYS which > > will not cause a disconnect and the retry logic will hang > > somewhere in p9_client_rpc() later. Now, if you got this far, I think it'll be easier to make whatever changed error out with EIO again instead; I'll try to check the rest of the thread later this week as I didn't follow this thread at all. Thanks, -- Dominique ^ permalink raw reply [flat|nested] 48+ messages in thread
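The flush loop Dominique refers to is the tail of p9_client_rpc(); the truncated debug diff at the end of this section touches the same lines. Roughly, as a paraphrase:

	/* Paraphrased from p9_client_rpc(), net/9p/client.c: once
	 * interrupted, the client synchronously sends a TFLUSH and then
	 * waits again for the original reply -- with TIF_SIGPENDING already
	 * cleared, this second wait is effectively unkillable. */
	if (err == -ERESTARTSYS && c->status == Connected) {
		p9_debug(P9_DEBUG_MUX, "flushing\n");

		if (c->trans_mod->cancel(c, req))
			p9_client_flush(c, req);

		/* if the reply arrived anyway, don't report the error */
		if (READ_ONCE(req->status) == REQ_STATUS_RCVD)
			err = 0;
	}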
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-25 12:15 ` Oleg Nesterov 2025-03-25 12:36 ` Dominique Martinet @ 2025-03-25 13:04 ` Oleg Nesterov 2025-03-25 14:49 ` K Prateek Nayak 1 sibling, 1 reply; 48+ messages in thread From: Oleg Nesterov @ 2025-03-25 13:04 UTC (permalink / raw) To: K Prateek Nayak, Eric Van Hensbergen, Latchesar Ionkov, Dominique Martinet, Christian Schoenebeck Cc: Mateusz Guzik, syzbot, brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro, v9fs On 03/25, K Prateek Nayak wrote: > > I chased this down to p9_client_rpc() net/9p/client.c specifically: > > err = c->trans_mod->request(c, req); > if (err < 0) { > /* write won't happen */ > p9_req_put(c, req); > if (err != -ERESTARTSYS && err != -EFAULT) > c->status = Disconnected; > goto recalc_sigpending; > } > > c->trans_mod->request() calls p9_fd_request() in net/9p/trans_fd.c > which basically does a p9_fd_poll(). Again, I know nothing about 9p... but if p9_fd_request() returns an err < 0, then it comes from p9_conn->err and p9_fd_request() does nothing else. > Previously, the above would fail with err as -EIO which would > cause the client to "Disconnect" and the retry logic would make > progress. Now however, the err returned is -ERESTARTSYS OK... So p9_conn->err = -ERESTARTSYS was set by p9_conn_cancel() called by p9_write_work() because pipe_write() returns ERESTARTSYS? But I don't understand -EIO with the reverted commit aaec5a95d59615 Oleg. ^ permalink raw reply [flat|nested] 48+ messages in thread
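The propagation chain Oleg is asking about, as a sketch paraphrased from memory of net/9p/trans_fd.c; note that pipe_write() being the source of the -ERESTARTSYS is only a hypothesis at this point (Prateek's traces in the next reply pin the -512 on wait_event_killable() instead):

	/* p9_write_work(): the worker pushing request bytes into the pipe fd. */
	err = p9_fd_write(m->client, m->wbuf + m->wpos, m->wsize - m->wpos);
	if (err < 0 && err != -EAGAIN)
		p9_conn_cancel(m, err);	/* a pipe_write() -ERESTARTSYS would land here */

	/* p9_conn_cancel(): latches the error for the whole connection, so
	 * every subsequent p9_fd_request() returns it without touching the
	 * wire. */
	m->err = err;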
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-25 13:04 ` Oleg Nesterov @ 2025-03-25 14:49 ` K Prateek Nayak 2025-03-25 14:58 ` Dominique Martinet 0 siblings, 1 reply; 48+ messages in thread From: K Prateek Nayak @ 2025-03-25 14:49 UTC (permalink / raw) To: Oleg Nesterov, Eric Van Hensbergen, Latchesar Ionkov, Dominique Martinet, Christian Schoenebeck Cc: Mateusz Guzik, syzbot, brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro, v9fs [-- Attachment #1: Type: text/plain, Size: 12265 bytes --] Hello Oleg, Dominique, On 3/25/2025 6:34 PM, Oleg Nesterov wrote: > On 03/25, K Prateek Nayak wrote: >> >> I chased this down to p9_client_rpc() net/9p/client.c specifically: >> >> err = c->trans_mod->request(c, req); >> if (err < 0) { >> /* write won't happen */ >> p9_req_put(c, req); >> if (err != -ERESTARTSYS && err != -EFAULT) >> c->status = Disconnected; >> goto recalc_sigpending; >> } >> >> c->trans_mod->request() calls p9_fd_request() in net/9p/trans_fd.c >> which basically does a p9_fd_poll(). > > Again, I know nothing about 9p... but if p9_fd_request() returns > an err < 0, then it comes from p9_conn->err and p9_fd_request() > does nothing else. > >> Previously, the above would fail with err as -EIO which would >> cause the client to "Disconnect" and the retry logic would make >> progress. Now however, the err returned is -ERESTARTSYS > > OK... So p9_conn->err = -ERESTARTSYS was set by p9_conn_cancel() > called by p9_write_work() because pipe_write() returns ERESTARTSYS? > > But I don't understand -EIO with the reverted commit aaec5a95d59615 Okay it is a long explanation. These are the relevant bits of traces with aaec5a95d59615 reverted on mainline and I'll break it down to the best of my abilities: kworker/100:1-1803 [100] ..... 286.618784: p9_write_work: Data write wait 32770 kworker/100:1-1803 [100] ..... 286.618792: p9_write_work: Data written 28672 kworker/100:1-1803 [100] ..... 286.618793: p9_fd_poll: p9_fd_poll rd poll kworker/100:1-1803 [100] ..... 286.618793: p9_fd_poll: p9_fd_request wr poll kworker/100:1-1803 [100] ..... 286.618793: p9_fd_poll: p9_fd_poll rd poll kworker/100:1-1803 [100] ..... 286.618793: p9_fd_poll: p9_fd_request wr poll kworker/100:1-1803 [100] ..... 286.618795: p9_read_work: Data read wait 7 kworker/100:1-1803 [100] ..... 286.618796: p9_read_work: Data read 7 kworker/100:1-1803 [100] ..... 286.618796: p9_fd_poll: p9_fd_poll rd poll kworker/100:1-1803 [100] ..... 286.618796: p9_fd_poll: p9_fd_request wr poll kworker/100:1-1803 [100] ..... 286.618797: p9_fd_poll: p9_fd_poll rd poll kworker/100:1-1803 [100] ..... 286.618797: p9_fd_poll: p9_fd_request wr poll kworker/100:1-1803 [100] ..... 286.618797: p9_read_work: Data read wait 55 kworker/100:1-1803 [100] ..... 286.618798: p9_read_work: Data read 55 repro-4102 [100] ..... 286.618802: p9_client_write: p9_client_rpc done repro-4102 [100] ..... 286.618804: p9_client_write: p9_pdup repro-4102 [100] ..... 286.618804: p9_client_write: iter revert ==== the above bits are common between the two === repro-4102 [100] ..... 286.618805: p9_client_write: p9_client_rpc repro-4102 [100] ..... 286.618817: p9_fd_request: p9_fd_request repro-4102 [100] ..... 286.618818: p9_fd_request: p9_fd_request EPOLL repro-4102 [100] ..... 286.618818: p9_fd_poll: p9_fd_poll rd poll repro-4102 [100] ..... 286.618818: p9_fd_poll: p9_fd_request wr poll kworker/100:1-1803 [100] ..... 286.618822: p9_fd_poll: p9_fd_poll rd poll kworker/100:1-1803 [100] ..... 
286.618822: p9_fd_poll: p9_fd_request wr poll kworker/100:1-1803 [100] ..... 286.618823: p9_read_work: Data read wait 7 kworker/100:1-1803 [100] ..... 286.618825: p9_read_work: Data read 7 kworker/100:1-1803 [100] ..... 286.618825: p9_read_work: p9_tag_lookup error ^^^^^^^^^^^^^^^^^^^ ==== In this case the read succeeds but p9_tag_lookup() later will error out ==== ==== Connection is cancelled with m->err as -EIO ==== kworker/100:1-1803 [100] ..... 286.618825: p9_read_work: Connection cancel kworker/100:1-1803 [100] ...1. 286.618826: p9_conn_cancel: Connection cancelled (-5) repro-4102 [100] ..... 286.618831: p9_client_write: p9_client_rpc done repro-4102 [100] ..... 286.618832: p9_client_write: p9_client_rpc error (-5) ^^^ ==== This error from kworker/100 is received by the reproducer ==== repro-4102 [100] ..... 286.618833: v9fs_issue_write: Issue write done 2 err(-5) repro-4102 [100] ..... 286.618833: netfs_write_subrequest_terminated: Collector woken up from netfs_write_subrequest_terminated repro-4102 [100] ..... 286.618834: netfs_wake_write_collector: Wake collector repro-4102 [100] ..... 286.618834: netfs_wake_write_collector: Queuing collector work repro-4102 [100] ..... 286.618837: v9fs_issue_write: Issue write subrequest terminated 2 repro-4102 [100] ..... 286.618838: netfs_unbuffered_write: netfs_unbuffered_write repro-4102 [100] ..... 286.618838: netfs_end_issue_write: netfs_end_issue_write repro-4102 [100] ..... 286.618838: netfs_end_issue_write: Write collector need poke 0 repro-4102 [100] ..... 286.618839: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! ==== A reissue mechanism triggers below ==== kworker/u1037:2-3232 [096] ..... 286.618845: netfs_retry_writes: netfs_reissue_write 1 kworker/u1037:2-3232 [096] ..... 286.618846: v9fs_issue_write: v9fs_issue_write kworker/u1037:2-3232 [096] ..... 286.618847: p9_client_write: p9_client_rpc kworker/u1037:2-3232 [096] ..... 286.618873: p9_fd_request: p9_fd_request kworker/u1037:2-3232 [096] ...1. 286.618874: p9_fd_request: p9_fd_request error kworker/u1037:2-3232 [096] ..... 286.618874: p9_client_rpc: Client disconnected (no write) ==== Connection with client is disconnected ==== kworker/u1037:2-3232 [096] ..... 286.618877: p9_client_write: p9_client_rpc done kworker/u1037:2-3232 [096] ..... 286.618880: p9_client_write: p9_client_rpc error (-5) kworker/u1037:2-3232 [096] ..... 286.618881: v9fs_issue_write: Issue write done 0 err(-5) kworker/u1037:2-3232 [096] ..... 286.618882: netfs_write_subrequest_terminated: Collector woken up from netfs_write_subrequest_terminated kworker/u1037:2-3232 [096] ..... 286.618882: netfs_wake_write_collector: Wake collector kworker/u1037:2-3232 [096] ..... 286.618882: netfs_wake_write_collector: Queuing collector work kworker/u1037:2-3232 [096] ..... 286.618887: v9fs_issue_write: Issue write subrequest terminated 0 kworker/u1037:2-3232 [096] ..... 286.618890: netfs_write_collection_worker: Write collect clearing and waking up! ==== The retry worker will clear the NETFS_RREQ_IN_PROGRESS bit and issue a wakeup ==== repro-4102 [100] ..... 286.619051: p9_client_prepare_req: p9_client_prepare_req eio 1 repro-4102 [100] ..... 286.619052: p9_client_rpc: p9_client_rpc early err return repro-4108 [240] ..... 286.621325: p9_fd_create: Client connected ==== The reproducer will then re-establish the connection and the pattern repeats ==== --- So the pattern is with the optimization reverted is: o The RPC thread reads "7" bytes. o p9_tag_lookup() fails giving -EIO. 
o connection is cancelled from rpc thread. o The reproducer thread receives the packet and finds -EIO. o The reproducer starts a Disconnect. o The reproducer wakes up collector and sleeps on bit NETFS_RREQ_IN_PROGRESS o A retry worker wakes up. o Finds connection has been disconnected o Wakes up the reproducer thread. o Reproducer wakes up and establishes connection. I don't know 9p bits enough to understand the intermediates steps but this is the rough flow. --- Now for the traces on vanilla mainline: kworker/31:1-1723 [031] ..... 95.011362: p9_write_work: Data write wait 32770 kworker/31:1-1723 [031] ..... 95.011372: p9_write_work: Data written 28672 kworker/31:1-1723 [031] ..... 95.011373: p9_fd_poll: p9_fd_poll rd poll kworker/31:1-1723 [031] ..... 95.011373: p9_fd_poll: p9_fd_request wr poll kworker/31:1-1723 [031] ..... 95.011374: p9_fd_poll: p9_fd_poll rd poll kworker/31:1-1723 [031] ..... 95.011374: p9_fd_poll: p9_fd_request wr poll kworker/31:1-1723 [031] ..... 95.011374: p9_read_work: Data read wait 7 kworker/31:1-1723 [031] ..... 95.011375: p9_read_work: Data read 7 kworker/31:1-1723 [031] ..... 95.011375: p9_fd_poll: p9_fd_poll rd poll kworker/31:1-1723 [031] ..... 95.011375: p9_fd_poll: p9_fd_request wr poll kworker/31:1-1723 [031] ..... 95.011376: p9_read_work: Data read wait 55 kworker/31:1-1723 [031] ..... 95.011376: p9_read_work: Data read 55 repro-4076 [031] ..... 95.011381: p9_client_rpc: Wait event killable (0) repro-4076 [031] ..... 95.011382: p9_client_rpc: Check error (0) repro-4076 [031] ..... 95.011382: p9_client_write: p9_client_rpc done repro-4076 [031] ..... 95.011382: p9_client_write: p9_pdup repro-4076 [031] ..... 95.011382: p9_client_write: iter revert ==== Above bits are same as the previous traces ==== repro-4076 [031] ..... 95.011383: p9_client_write: p9_client_rpc repro-4076 [031] ..... 95.011393: p9_fd_request: p9_fd_request repro-4076 [031] ..... 95.011394: p9_fd_request: p9_fd_request EPOLL repro-4076 [031] ..... 95.011394: p9_fd_poll: p9_fd_poll rd poll repro-4076 [031] ..... 95.011394: p9_fd_poll: p9_fd_request wr poll repro-4076 [031] ..... 99.731970: p9_client_rpc: Wait event killable (-512) ==== The wait event here fails with -ERESTARTSYS ==== repro-4076 [031] ..... 99.731979: p9_client_rpc: Flushed (-512) repro-4076 [031] ..... 99.731983: p9_client_rpc: Final error (-512) repro-4076 [031] ..... 99.731983: p9_client_write: p9_client_rpc done repro-4076 [031] ..... 99.731984: p9_client_write: p9_client_rpc error (-512) repro-4076 [031] ..... 99.731985: v9fs_issue_write: Issue write done 2 err(-512) repro-4076 [031] ..... 99.731987: netfs_write_subrequest_terminated: Collector woken up from netfs_write_subrequest_terminated repro-4076 [031] ..... 99.731987: netfs_wake_write_collector: Wake collector repro-4076 [031] ..... 99.731988: netfs_wake_write_collector: Queuing collector work repro-4076 [031] ..... 99.731991: v9fs_issue_write: Issue write subrequest terminated 2 repro-4076 [031] ..... 99.731991: netfs_unbuffered_write: netfs_unbuffered_write repro-4076 [031] ..... 99.731991: netfs_end_issue_write: netfs_end_issue_write repro-4076 [031] ..... 99.731992: netfs_end_issue_write: Write collector need poke 0 repro-4076 [031] ..... 99.731993: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! ==== The reproducer waits on bit NETFS_RREQ_IN_PROGRESS ==== kworker/u1028:3-3860 [030] ..... 99.732028: netfs_retry_writes: netfs_reissue_write 1 kworker/u1028:3-3860 [030] ..... 
99.732030: v9fs_issue_write: v9fs_issue_write kworker/u1028:3-3860 [030] ..... 99.732031: p9_client_write: p9_client_rpc kworker/u1028:3-3860 [030] ..... 99.732051: p9_fd_request: p9_fd_request kworker/u1028:3-3860 [030] ..... 99.732052: p9_fd_request: p9_fd_request EPOLL kworker/u1028:3-3860 [030] ..... 99.732052: p9_fd_poll: p9_fd_poll rd poll kworker/u1028:3-3860 [030] ..... 99.732053: p9_fd_poll: p9_fd_request wr poll ==== The retry worker seemingly gets stuck post p9_fd_poll() waiting for wakeup ==== --- That is my analysis so far. > > Oleg. > -- Thanks and Regards, Prateek [-- Attachment #2: debug.diff --] [-- Type: text/plain, Size: 19614 bytes --] diff --git a/fs/9p/vfs_addr.c b/fs/9p/vfs_addr.c index 32619d146cbc..3fbfc6882c35 100644 --- a/fs/9p/vfs_addr.c +++ b/fs/9p/vfs_addr.c @@ -56,10 +56,13 @@ static void v9fs_issue_write(struct netfs_io_subrequest *subreq) struct p9_fid *fid = subreq->rreq->netfs_priv; int err, len; + trace_printk("v9fs_issue_write\n"); len = p9_client_write(fid, subreq->start, &subreq->io_iter, &err); + trace_printk("Issue write done %d err(%d)\n", len, err); if (len > 0) __set_bit(NETFS_SREQ_MADE_PROGRESS, &subreq->flags); netfs_write_subrequest_terminated(subreq, len ?: err, false); + trace_printk("Issue write subrequest terminated %d\n", len); } /** @@ -74,6 +77,7 @@ static void v9fs_issue_read(struct netfs_io_subrequest *subreq) int total, err; total = p9_client_read(fid, pos, &subreq->io_iter, &err); + trace_printk("v9fs_issue_read %d err(%d)\n", total, err); /* if we just extended the file size, any portion not in * cache won't be on server and is zeroes */ diff --git a/fs/netfs/direct_write.c b/fs/netfs/direct_write.c index 42ce53cc216e..6333b8bb4008 100644 --- a/fs/netfs/direct_write.c +++ b/fs/netfs/direct_write.c @@ -105,6 +105,7 @@ ssize_t netfs_unbuffered_write_iter_locked(struct kiocb *iocb, struct iov_iter * if (!async) { trace_netfs_rreq(wreq, netfs_rreq_trace_wait_ip); + trace_printk("Waiting on NETFS_RREQ_IN_PROGRESS!\n"); wait_on_bit(&wreq->flags, NETFS_RREQ_IN_PROGRESS, TASK_UNINTERRUPTIBLE); ret = wreq->error; diff --git a/fs/netfs/read_collect.c b/fs/netfs/read_collect.c index 23c75755ad4e..a609063db8a7 100644 --- a/fs/netfs/read_collect.c +++ b/fs/netfs/read_collect.c @@ -446,6 +446,7 @@ static void netfs_read_collection(struct netfs_io_request *rreq) task_io_account_read(rreq->transferred); trace_netfs_rreq(rreq, netfs_rreq_trace_wake_ip); + trace_printk("Read collect clearing and waking up!\n"); clear_and_wake_up_bit(NETFS_RREQ_IN_PROGRESS, &rreq->flags); trace_netfs_rreq(rreq, netfs_rreq_trace_done); diff --git a/fs/netfs/write_collect.c b/fs/netfs/write_collect.c index 3fca59e6475d..a54aa2bcc786 100644 --- a/fs/netfs/write_collect.c +++ b/fs/netfs/write_collect.c @@ -429,6 +429,7 @@ void netfs_write_collection_worker(struct work_struct *work) _debug("finished"); trace_netfs_rreq(wreq, netfs_rreq_trace_wake_ip); + trace_printk("Write collect clearing and waking up!\n"); clear_and_wake_up_bit(NETFS_RREQ_IN_PROGRESS, &wreq->flags); if (wreq->iocb) { @@ -449,10 +450,14 @@ void netfs_write_collection_worker(struct work_struct *work) */ void netfs_wake_write_collector(struct netfs_io_request *wreq, bool was_async) { + trace_printk("Wake collector\n"); if (!work_pending(&wreq->work)) { netfs_get_request(wreq, netfs_rreq_trace_get_work); - if (!queue_work(system_unbound_wq, &wreq->work)) + trace_printk("Queuing collector work\n"); + if (!queue_work(system_unbound_wq, &wreq->work)) { + trace_printk("Put request\n"); netfs_put_request(wreq, 
was_async, netfs_rreq_trace_put_work_nq); + } } } @@ -542,8 +547,10 @@ void netfs_write_subrequest_terminated(void *_op, ssize_t transferred_or_error, /* If we are at the head of the queue, wake up the collector, * transferring a ref to it if we were the ones to do so. */ - if (list_is_first(&subreq->rreq_link, &stream->subrequests)) + if (list_is_first(&subreq->rreq_link, &stream->subrequests)) { + trace_printk("Collector woken up from netfs_write_subrequest_terminated\n"); netfs_wake_write_collector(wreq, was_async); + } netfs_put_subrequest(subreq, was_async, netfs_sreq_trace_put_terminated); } diff --git a/fs/netfs/write_issue.c b/fs/netfs/write_issue.c index 77279fc5b5a7..22a4b793e789 100644 --- a/fs/netfs/write_issue.c +++ b/fs/netfs/write_issue.c @@ -232,8 +232,10 @@ static void netfs_do_issue_write(struct netfs_io_stream *stream, _enter("R=%x[%x],%zx", wreq->debug_id, subreq->debug_index, subreq->len); - if (test_bit(NETFS_SREQ_FAILED, &subreq->flags)) + if (test_bit(NETFS_SREQ_FAILED, &subreq->flags)) { + trace_printk("netfs_do_issue_write failed!\n"); return netfs_write_subrequest_terminated(subreq, subreq->error, false); + } trace_netfs_sreq(subreq, netfs_sreq_trace_submit); stream->issue_write(subreq); @@ -264,6 +266,7 @@ void netfs_issue_write(struct netfs_io_request *wreq, if (!subreq) return; + trace_printk("netfs_issue_write!\n"); stream->construct = NULL; subreq->io_iter.count = subreq->len; netfs_do_issue_write(stream, subreq); @@ -290,6 +293,7 @@ size_t netfs_advance_write(struct netfs_io_request *wreq, _enter("R=%x[%x]", wreq->debug_id, subreq ? subreq->debug_index : 0); if (subreq && start != subreq->start + subreq->len) { + trace_printk("netfs_issue_write from advance write!\n"); netfs_issue_write(wreq, stream); subreq = NULL; } @@ -307,6 +311,7 @@ size_t netfs_advance_write(struct netfs_io_request *wreq, if (subreq->len >= stream->sreq_max_len || subreq->nr_segs >= stream->sreq_max_segs || to_eof) { + trace_printk("netfs_issue_write from advance write! 2\n"); netfs_issue_write(wreq, stream); subreq = NULL; } @@ -397,6 +402,7 @@ static int netfs_write_folio(struct netfs_io_request *wreq, * when they were read. Note that these appear as a special * write-back group. */ + trace_printk("netfs_issue_write in folio copy\n"); if (fgroup == NETFS_FOLIO_COPY_TO_CACHE) { netfs_issue_write(wreq, upload); } else if (fgroup != wreq->group) { @@ -526,6 +532,7 @@ static int netfs_write_folio(struct netfs_io_request *wreq, */ static void netfs_end_issue_write(struct netfs_io_request *wreq) { + trace_printk("netfs_end_issue_write\n"); bool needs_poke = true; smp_wmb(); /* Write subreq lists before ALL_QUEUED. */ @@ -541,6 +548,7 @@ static void netfs_end_issue_write(struct netfs_io_request *wreq) netfs_issue_write(wreq, stream); } + trace_printk("Write collector need poke %d\n", needs_poke); if (needs_poke) netfs_wake_write_collector(wreq, false); } @@ -730,6 +738,7 @@ int netfs_unbuffered_write(struct netfs_io_request *wreq, bool may_wait, size_t break; } + trace_printk("netfs_unbuffered_write\n"); netfs_end_issue_write(wreq); _leave(" = %d", error); return error; @@ -908,6 +917,7 @@ int netfs_writeback_single(struct address_space *mapping, } stop: + trace_printk("netfs_issue_write writeback single\n"); for (int s = 0; s < NR_IO_STREAMS; s++) netfs_issue_write(wreq, &wreq->io_streams[s]); smp_wmb(); /* Write lists before ALL_QUEUED. 
*/ diff --git a/fs/netfs/write_retry.c b/fs/netfs/write_retry.c index 545d33079a77..60a03079766b 100644 --- a/fs/netfs/write_retry.c +++ b/fs/netfs/write_retry.c @@ -43,6 +43,7 @@ static void netfs_retry_write_stream(struct netfs_io_request *wreq, iov_iter_revert(&source, subreq->len - source.count); netfs_get_subrequest(subreq, netfs_sreq_trace_get_resubmit); + trace_printk("netfs_reissue_write 1\n"); netfs_reissue_write(stream, subreq, &source); } } @@ -116,6 +117,7 @@ static void netfs_retry_write_stream(struct netfs_io_request *wreq, boundary = true; netfs_get_subrequest(subreq, netfs_sreq_trace_get_resubmit); + trace_printk("netfs_reissue_write 2\n"); netfs_reissue_write(stream, subreq, &source); if (subreq == to) break; @@ -183,6 +185,7 @@ static void netfs_retry_write_stream(struct netfs_io_request *wreq, boundary = false; } + trace_printk("netfs_reissue_write 3\n"); netfs_reissue_write(stream, subreq, &source); if (!len) break; diff --git a/net/9p/client.c b/net/9p/client.c index 09f8ced9f8bb..909c83b0e015 100644 --- a/net/9p/client.c +++ b/net/9p/client.c @@ -633,12 +633,16 @@ static struct p9_req_t *p9_client_prepare_req(struct p9_client *c, p9_debug(P9_DEBUG_MUX, "client %p op %d\n", c, type); /* we allow for any status other than disconnected */ - if (c->status == Disconnected) + if (c->status == Disconnected) { + trace_printk("p9_client_prepare_req eio 1\n"); return ERR_PTR(-EIO); + } /* if status is begin_disconnected we allow only clunk request */ - if (c->status == BeginDisconnect && type != P9_TCLUNK) + if (c->status == BeginDisconnect && type != P9_TCLUNK) { + trace_printk("p9_client_prepare_req eio 2\n"); return ERR_PTR(-EIO); + } va_copy(apc, ap); req = p9_tag_alloc(c, type, t_size, r_size, fmt, apc); @@ -690,8 +694,10 @@ p9_client_rpc(struct p9_client *c, int8_t type, const char *fmt, ...) va_start(ap, fmt); req = p9_client_prepare_req(c, type, tsize, rsize, fmt, ap); va_end(ap); - if (IS_ERR(req)) + if (IS_ERR(req)) { + trace_printk("p9_client_rpc early err return\n"); return req; + } req->tc.zc = false; req->rc.zc = false; @@ -706,9 +712,12 @@ p9_client_rpc(struct p9_client *c, int8_t type, const char *fmt, ...) err = c->trans_mod->request(c, req); if (err < 0) { /* write won't happen */ + trace_printk("write won't happnen (%d)\n", err); p9_req_put(c, req); - if (err != -ERESTARTSYS && err != -EFAULT) + if (err != -ERESTARTSYS && err != -EFAULT) { + trace_printk("Client disconnected (no write)\n"); c->status = Disconnected; + } goto recalc_sigpending; } again: @@ -721,6 +730,7 @@ p9_client_rpc(struct p9_client *c, int8_t type, const char *fmt, ...) */ smp_rmb(); + trace_printk("Wait event killable (%d) (%d) (%d) (%d)\n", err, READ_ONCE(req->status), c->status, type); if (err == -ERESTARTSYS && c->status == Connected && type == P9_TFLUSH) { sigpending = 1; @@ -731,6 +741,7 @@ p9_client_rpc(struct p9_client *c, int8_t type, const char *fmt, ...) if (READ_ONCE(req->status) == REQ_STATUS_ERROR) { p9_debug(P9_DEBUG_ERROR, "req_status error %d\n", req->t_err); err = req->t_err; + trace_printk("REQ_STATUS_ERROR (%d)\n", err); } if (err == -ERESTARTSYS && c->status == Connected) { p9_debug(P9_DEBUG_MUX, "flushing\n"); @@ -743,6 +754,8 @@ p9_client_rpc(struct p9_client *c, int8_t type, const char *fmt, ...) 
/* if we received the response anyway, don't signal error */ if (READ_ONCE(req->status) == REQ_STATUS_RCVD) err = 0; + + trace_printk("Flushed (%d)\n", err); } recalc_sigpending: if (sigpending) { @@ -754,11 +767,13 @@ p9_client_rpc(struct p9_client *c, int8_t type, const char *fmt, ...) goto reterr; err = p9_check_errors(c, req); + trace_printk("Check error (%d)\n", err); trace_9p_client_res(c, type, req->rc.tag, err); if (!err) return req; reterr: p9_req_put(c, req); + trace_printk("Final error (%d)\n", err); return ERR_PTR(safe_errno(err)); } @@ -808,8 +823,10 @@ static struct p9_req_t *p9_client_zc_rpc(struct p9_client *c, int8_t type, err = c->trans_mod->zc_request(c, req, uidata, uodata, inlen, olen, in_hdrlen); if (err < 0) { - if (err == -EIO) + if (err == -EIO) { + trace_printk("Client disconnected (zero_copy)\n"); c->status = Disconnected; + } if (err != -ERESTARTSYS) goto recalc_sigpending; } @@ -1093,6 +1110,7 @@ EXPORT_SYMBOL(p9_client_destroy); void p9_client_disconnect(struct p9_client *clnt) { p9_debug(P9_DEBUG_9P, "clnt %p\n", clnt); + trace_printk("Client disconnected\n"); clnt->status = Disconnected; } EXPORT_SYMBOL(p9_client_disconnect); @@ -1100,6 +1118,7 @@ EXPORT_SYMBOL(p9_client_disconnect); void p9_client_begin_disconnect(struct p9_client *clnt) { p9_debug(P9_DEBUG_9P, "clnt %p\n", clnt); + trace_printk("Client begin disconnect\n"); clnt->status = BeginDisconnect; } EXPORT_SYMBOL(p9_client_begin_disconnect); @@ -1572,11 +1591,13 @@ p9_client_read_once(struct p9_fid *fid, u64 offset, struct iov_iter *to, offset, rsize); } else { non_zc = 1; + trace_printk("RPC issue\n"); req = p9_client_rpc(clnt, P9_TREAD, "dqd", fid->fid, offset, rsize); } if (IS_ERR(req)) { *err = PTR_ERR(req); + trace_printk("read RPC issue error(%d)\n", *err); if (!non_zc) iov_iter_revert(to, count - iov_iter_count(to)); return 0; @@ -1587,12 +1608,14 @@ p9_client_read_once(struct p9_fid *fid, u64 offset, struct iov_iter *to, if (*err) { if (!non_zc) iov_iter_revert(to, count - iov_iter_count(to)); + trace_printk("p9pdu_readf error(%d)\n", *err); trace_9p_protocol_dump(clnt, &req->rc); p9_req_put(clnt, req); return 0; } if (rsize < received) { pr_err("bogus RREAD count (%d > %d)\n", received, rsize); + trace_printk("bogus read\n"); received = rsize; } @@ -1602,6 +1625,7 @@ p9_client_read_once(struct p9_fid *fid, u64 offset, struct iov_iter *to, int n = copy_to_iter(dataptr, received, to); if (n != received) { + trace_printk("copy_to_iter fault\n"); *err = -EFAULT; p9_req_put(clnt, req); return n; @@ -1642,23 +1666,29 @@ p9_client_write(struct p9_fid *fid, u64 offset, struct iov_iter *from, int *err) rsize, P9_ZC_HDR_SZ, "dqd", fid->fid, offset, rsize); } else { + trace_printk("p9_client_rpc\n"); req = p9_client_rpc(clnt, P9_TWRITE, "dqV", fid->fid, offset, rsize, from); + trace_printk("p9_client_rpc done\n"); } if (IS_ERR(req)) { iov_iter_revert(from, count - iov_iter_count(from)); *err = PTR_ERR(req); + trace_printk("p9_client_rpc error (%d)\n", *err); break; } + trace_printk("p9_pdup\n"); *err = p9pdu_readf(&req->rc, clnt->proto_version, "d", &written); if (*err) { + trace_printk("p9_pdup err(%d)\n", *err); iov_iter_revert(from, count - iov_iter_count(from)); trace_9p_protocol_dump(clnt, &req->rc); p9_req_put(clnt, req); break; } if (rsize < written) { + trace_printk("p9_pdup bogus %d %d\n", written, rsize); pr_err("bogus RWRITE count (%d > %d)\n", written, rsize); written = rsize; } @@ -1666,6 +1696,7 @@ p9_client_write(struct p9_fid *fid, u64 offset, struct iov_iter *from, int *err) 
p9_debug(P9_DEBUG_9P, "<<< RWRITE count %d\n", written); p9_req_put(clnt, req); + trace_printk("iter revert\n"); iov_iter_revert(from, count - written - iov_iter_count(from)); total += written; offset += written; diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c index 196060dc6138..3c83b3855e82 100644 --- a/net/9p/trans_fd.c +++ b/net/9p/trans_fd.c @@ -196,6 +196,7 @@ static void p9_conn_cancel(struct p9_conn *m, int err) return; } + trace_printk("Connection cancelled (%d)\n", err); m->err = err; list_for_each_entry_safe(req, rtmp, &m->req_list, req_list) { @@ -233,9 +234,12 @@ p9_fd_poll(struct p9_client *client, struct poll_table_struct *pt, int *err) return EPOLLERR; } + trace_printk("p9_fd_poll rd poll\n"); ret = vfs_poll(ts->rd, pt); - if (ts->rd != ts->wr) + if (ts->rd != ts->wr) { + trace_printk("p9_fd_request wr poll\n"); ret = (ret & ~EPOLLOUT) | (vfs_poll(ts->wr, pt) & ~EPOLLIN); + } return ret; } @@ -263,9 +267,13 @@ static int p9_fd_read(struct p9_client *client, void *v, int len) p9_debug(P9_DEBUG_ERROR, "blocking read ...\n"); pos = ts->rd->f_pos; + trace_printk("Data read wait %d\n", len); ret = kernel_read(ts->rd, v, len, &pos); - if (ret <= 0 && ret != -ERESTARTSYS && ret != -EAGAIN) + trace_printk("Data read %d\n", ret); + if (ret <= 0 && ret != -ERESTARTSYS && ret != -EAGAIN) { + trace_printk("Client disconnected (fd_read)\n"); client->status = Disconnected; + } return ret; } @@ -330,6 +338,7 @@ static void p9_read_work(struct work_struct *work) if (!m->rreq || (m->rreq->status != REQ_STATUS_SENT)) { p9_debug(P9_DEBUG_ERROR, "Unexpected packet tag %d\n", m->rc.tag); + trace_printk("p9_tag_lookup error\n"); err = -EIO; goto error; } @@ -338,6 +347,7 @@ static void p9_read_work(struct work_struct *work) p9_debug(P9_DEBUG_ERROR, "requested packet size too big: %d for tag %d with capacity %zd\n", m->rc.size, m->rc.tag, m->rreq->rc.capacity); + trace_printk("capacity error\n"); err = -EIO; goto error; } @@ -348,6 +358,7 @@ static void p9_read_work(struct work_struct *work) m->rc.tag, m->rreq); p9_req_put(m->client, m->rreq); m->rreq = NULL; + trace_printk("sdata error\n"); err = -EIO; goto error; } @@ -375,6 +386,7 @@ static void p9_read_work(struct work_struct *work) p9_debug(P9_DEBUG_ERROR, "Request tag %d errored out while we were reading the reply\n", m->rc.tag); + trace_printk("request tag error %d status(%d)\n", m->rc.tag, m->rreq->status); err = -EIO; goto error; } @@ -403,6 +415,7 @@ static void p9_read_work(struct work_struct *work) return; error: + trace_printk("Connection cancel\n"); p9_conn_cancel(m, err); clear_bit(Rworksched, &m->wsched); } @@ -429,9 +442,13 @@ static int p9_fd_write(struct p9_client *client, void *v, int len) if (!(ts->wr->f_flags & O_NONBLOCK)) p9_debug(P9_DEBUG_ERROR, "blocking write ...\n"); + trace_printk("Data write wait %d\n",len); ret = kernel_write(ts->wr, v, len, &ts->wr->f_pos); - if (ret <= 0 && ret != -ERESTARTSYS && ret != -EAGAIN) + trace_printk("Data written %ld\n", ret); + if (ret <= 0 && ret != -ERESTARTSYS && ret != -EAGAIN) { + trace_printk("Client disconnected (fd_write)\n"); client->status = Disconnected; + } return ret; } @@ -519,6 +536,7 @@ static void p9_write_work(struct work_struct *work) return; error: + trace_printk("Connection cancel\n"); p9_conn_cancel(m, err); clear_bit(Wworksched, &m->wsched); } @@ -628,6 +646,7 @@ static void p9_poll_mux(struct p9_conn *m) n = p9_fd_poll(m->client, NULL, &err); if (n & (EPOLLERR | EPOLLHUP | EPOLLNVAL)) { p9_debug(P9_DEBUG_TRANS, "error mux %p err %d\n", m, n); + 
trace_printk("Connection cancel\n"); p9_conn_cancel(m, err); } @@ -668,12 +687,14 @@ static int p9_fd_request(struct p9_client *client, struct p9_req_t *req) struct p9_trans_fd *ts = client->trans; struct p9_conn *m = &ts->conn; + trace_printk("p9_fd_request\n"); p9_debug(P9_DEBUG_TRANS, "mux %p task %p tcall %p id %d\n", m, current, &req->tc, req->tc.id); spin_lock(&m->req_lock); if (m->err < 0) { + trace_printk("p9_fd_request error\n"); spin_unlock(&m->req_lock); return m->err; } @@ -682,13 +703,16 @@ static int p9_fd_request(struct p9_client *client, struct p9_req_t *req) list_add_tail(&req->req_list, &m->unsent_req_list); spin_unlock(&m->req_lock); + trace_printk("p9_fd_request EPOLL\n"); if (test_and_clear_bit(Wpending, &m->wsched)) n = EPOLLOUT; else n = p9_fd_poll(m->client, NULL, NULL); - if (n & EPOLLOUT && !test_and_set_bit(Wworksched, &m->wsched)) + if (n & EPOLLOUT && !test_and_set_bit(Wworksched, &m->wsched)) { + trace_printk("p9_fd_request schedule work\n"); schedule_work(&m->wq); + } return 0; } @@ -852,6 +876,7 @@ static int p9_fd_open(struct p9_client *client, int rfd, int wfd) data_race(ts->wr->f_flags |= O_NONBLOCK); client->trans = ts; + trace_printk("Client connected\n"); client->status = Connected; return 0; @@ -889,6 +914,7 @@ static int p9_socket_open(struct p9_client *client, struct socket *csocket) get_file(file); p->wr = p->rd = file; client->trans = p; + trace_printk("Client connected\n"); client->status = Connected; p->rd->f_flags |= O_NONBLOCK; @@ -920,6 +946,7 @@ static void p9_conn_destroy(struct p9_conn *m) m->wreq = NULL; } + trace_printk("Connection destroy\n"); p9_conn_cancel(m, -ECONNRESET); m->client = NULL; @@ -942,6 +969,7 @@ static void p9_fd_close(struct p9_client *client) if (!ts) return; + trace_printk("Client disconnected (fd_close)\n"); client->status = Disconnected; p9_conn_destroy(&ts->conn); ^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-25 14:49 ` K Prateek Nayak @ 2025-03-25 14:58 ` Dominique Martinet 2025-03-26 12:19 ` Oleg Nesterov 2025-03-27 17:46 ` K Prateek Nayak 0 siblings, 2 replies; 48+ messages in thread From: Dominique Martinet @ 2025-03-25 14:58 UTC (permalink / raw) To: K Prateek Nayak Cc: Oleg Nesterov, Eric Van Hensbergen, Latchesar Ionkov, Christian Schoenebeck, Mateusz Guzik, syzbot, brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro, v9fs Thanks for the traces. w/ revert K Prateek Nayak wrote on Tue, Mar 25, 2025 at 08:19:26PM +0530: > kworker/100:1-1803 [100] ..... 286.618822: p9_fd_poll: p9_fd_poll rd poll > kworker/100:1-1803 [100] ..... 286.618822: p9_fd_poll: p9_fd_request wr poll > kworker/100:1-1803 [100] ..... 286.618823: p9_read_work: Data read wait 7 new behavior > repro-4076 [031] ..... 95.011394: p9_fd_poll: p9_fd_poll rd poll > repro-4076 [031] ..... 95.011394: p9_fd_poll: p9_fd_request wr poll > repro-4076 [031] ..... 99.731970: p9_client_rpc: Wait event killable (-512) For me the problem isn't so much that this gets ERESTARTSYS but that it never gets to read the 7 bytes that are available? If the repro has already written the bytes in both cases then there's no reason to wait 5 seconds here... OTOH syzbot 9p code is silly and might have been depending on something that's not supposed to work, e.g. they might be missing a flush or equivalent for all I know (I still haven't looked at the repro) -- Dominique ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-25 14:58 ` Dominique Martinet @ 2025-03-26 12:19 ` Oleg Nesterov 2025-03-26 12:44 ` Oleg Nesterov 1 sibling, 1 reply; 48+ messages in thread From: Oleg Nesterov @ 2025-03-26 12:19 UTC (permalink / raw) To: Dominique Martinet Cc: K Prateek Nayak, Eric Van Hensbergen, Latchesar Ionkov, Christian Schoenebeck, Mateusz Guzik, syzbot, brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro, v9fs On 03/25, Dominique Martinet wrote: > > Thanks for the traces. > > w/ revert > K Prateek Nayak wrote on Tue, Mar 25, 2025 at 08:19:26PM +0530: > > kworker/100:1-1803 [100] ..... 286.618822: p9_fd_poll: p9_fd_poll rd poll > > kworker/100:1-1803 [100] ..... 286.618822: p9_fd_poll: p9_fd_request wr poll > > kworker/100:1-1803 [100] ..... 286.618823: p9_read_work: Data read wait 7 > > new behavior > > repro-4076 [031] ..... 95.011394: p9_fd_poll: p9_fd_poll rd poll > > repro-4076 [031] ..... 95.011394: p9_fd_poll: p9_fd_request wr poll > > repro-4076 [031] ..... 99.731970: p9_client_rpc: Wait event killable (-512) > > For me the problem isn't so much that this gets ERESTARTSYS but that it > never gets to read the 7 bytes that are available? Yes... OK, let's first recall what the commit aaec5a95d59615523 ("pipe_read: don't wake up the writer if the pipe is still full") does. It simply removes the unnecessary/spurious wakeups when the writer can't add more data to the pipe. See the "stupid test-case" in https://lore.kernel.org/all/20250120144338.GC7432@redhat.com/ In particular this note: As you can see, without this patch pipe_read() wakes the writer up 4095 times for no reason, the writer burns a bit of CPU and blocks again after wakeup until the last read(fd[0], &c, 1). In this test-case the writer sleeps in pipe_write(), but the same is true for the task sleeping in poll( { .fd = pipe_fd, .events = POLLOUT}, ...). Now, after some grepping I have found static void p9_conn_create(struct p9_client *client) { ... init_poll_funcptr(&m->pt, p9_pollwait); n = p9_fd_poll(client, &m->pt, NULL); ... } So, iiuc, in this case p9_fd_poll(&m->pt /* != NULL */) -> p9_pollwait() paths will add the "dummy" pwait->wait entries with ->func = p9_pollwake to pipe_inode_info.rd_wait and pipe_inode_info.wr_wait. Hmm... I don't understand why the 2nd vfs_poll(ts->wr) depends on the ret from vfs_poll(ts->rd), but I assume this is correct. This means that every time pipe_read() does wake_up(&pipe->wr_wait) p9_pollwake() is called. This function kicks p9_poll_workfn() which calls p9_poll_mux() which calls p9_fd_poll() again with pt == NULL. In this case the conditional vfs_poll(ts->wr) looks more understandable... So. Without the commit above, p9_poll_mux()->p9_fd_poll() can be called much more often and, in particular, can report the "additional" EPOLLIN. Can this somehow explain the problem? Oleg. ^ permalink raw reply [flat|nested] 48+ messages in thread
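To make the quoted note concrete, here is a minimal userspace sketch of that kind of test-case (reconstructed for illustration; it is not the actual program from the linked message):

#include <unistd.h>
#include <fcntl.h>

int main(void)
{
	int fd[2];
	char c = 'x';

	if (pipe(fd))
		return 1;
	fcntl(fd[1], F_SETFL, O_NONBLOCK);	/* so the fill loop terminates */
	while (write(fd[1], &c, 1) == 1)
		;				/* fill the pipe completely */
	/*
	 * A blocked writer (or a POLLOUT poller, as with p9_pollwait())
	 * now sleeps.  Before commit aaec5a95d596, each read() below woke
	 * it spuriously even though the pipe was still full; after the
	 * commit, the wakeup only comes once a pipe buffer is actually
	 * freed and there is room to write again.
	 */
	read(fd[0], &c, 1);
	return 0;
}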
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-26 12:19 ` Oleg Nesterov @ 2025-03-26 12:44 ` Oleg Nesterov 2025-03-26 13:05 ` Oleg Nesterov 0 siblings, 1 reply; 48+ messages in thread From: Oleg Nesterov @ 2025-03-26 12:44 UTC (permalink / raw) To: Dominique Martinet, K Prateek Nayak Cc: Eric Van Hensbergen, Latchesar Ionkov, Christian Schoenebeck, Mateusz Guzik, syzbot, brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro, v9fs On 03/26, Oleg Nesterov wrote: > > Hmm... I don't understand why the 2nd vfs_poll(ts->wr) depends on the > ret from vfs_poll(ts->rd), but I assume this is correct. I meant, if pt != NULL and ts->rd != ts->wr we need both vfs_poll(ts->rd) and vfs_poll(ts->wr) ? and the reproducer writes to the pipe before it mounts 9p... Prateek, this is just a shot in the dark but since you can reproduce, can you check if the patch below makes any difference? Oleg. --- x/net/9p/trans_fd.c +++ x/net/9p/trans_fd.c @@ -234,8 +234,10 @@ p9_fd_poll(struct p9_client *client, str } ret = vfs_poll(ts->rd, pt); - if (ts->rd != ts->wr) + if (ts->rd != ts->wr) { + if (pt != NULL) vfs_poll(ts->wr, pt); ret = (ret & ~EPOLLOUT) | (vfs_poll(ts->wr, pt) & ~EPOLLIN); + } return ret; } ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-26 12:44 ` Oleg Nesterov @ 2025-03-26 13:05 ` Oleg Nesterov 0 siblings, 0 replies; 48+ messages in thread From: Oleg Nesterov @ 2025-03-26 13:05 UTC (permalink / raw) To: Dominique Martinet, K Prateek Nayak Cc: Eric Van Hensbergen, Latchesar Ionkov, Christian Schoenebeck, Mateusz Guzik, syzbot, brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro, v9fs Damn, sorry for the noise, please ignore ;) On 03/26, Oleg Nesterov wrote: > > On 03/26, Oleg Nesterov wrote: > > > > Hmm... I don't understand why the 2nd vfs_poll(ts->wr) depends on the > > ret from vfs_poll(ts->rd), but I assume this is correct. > > I meant, if pt != NULL and ts->rd != ts->wr we need both > vfs_poll(ts->rd) and vfs_poll(ts->wr) ? I am stupid. After a lot of reading I can't distinguish "|" and "||". Oleg. > and the reproducer writes to the pipe before it mounts 9p... > > Prateek, this is just a shot in the dark but since you can reproduce, > can you check if the patch below makes any difference? > > Oleg. > > --- x/net/9p/trans_fd.c > +++ x/net/9p/trans_fd.c > @@ -234,8 +234,10 @@ p9_fd_poll(struct p9_client *client, str > } > > ret = vfs_poll(ts->rd, pt); > - if (ts->rd != ts->wr) > + if (ts->rd != ts->wr) { > + if (pt != NULL) vfs_poll(ts->wr, pt); > ret = (ret & ~EPOLLOUT) | (vfs_poll(ts->wr, pt) & ~EPOLLIN); > + } > return ret; > } > ^ permalink raw reply [flat|nested] 48+ messages in thread
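For the record, the subtlety behind the retraction: unlike "||", the bitwise "|" in that expression does not short-circuit, so the second vfs_poll() always runs when ts->rd != ts->wr. A trivial standalone illustration:

#include <stdio.h>

static int f(void) { puts("f"); return 1; }
static int g(void) { puts("g"); return 2; }

int main(void)
{
	int a = f() | g();	/* prints "f" then "g": both sides are evaluated */
	int b = f() || g();	/* prints only "f": "||" short-circuits on nonzero */

	printf("a=%d b=%d\n", a, b);	/* a=3 b=1 */
	return 0;
}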
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-25 14:58 ` Dominique Martinet 2025-03-26 12:19 ` Oleg Nesterov @ 2025-03-27 17:46 ` K Prateek Nayak 2025-03-27 21:19 ` syzbot 1 sibling, 1 reply; 48+ messages in thread From: K Prateek Nayak @ 2025-03-27 17:46 UTC (permalink / raw) To: Dominique Martinet, Oleg Nesterov Cc: Eric Van Hensbergen, Latchesar Ionkov, Christian Schoenebeck, Mateusz Guzik, syzbot, brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro, v9fs Hello all, On 3/25/2025 8:28 PM, Dominique Martinet wrote: > Thanks for the traces. > > w/ revert > K Prateek Nayak wrote on Tue, Mar 25, 2025 at 08:19:26PM +0530: >> kworker/100:1-1803 [100] ..... 286.618822: p9_fd_poll: p9_fd_poll rd poll >> kworker/100:1-1803 [100] ..... 286.618822: p9_fd_poll: p9_fd_request wr poll >> kworker/100:1-1803 [100] ..... 286.618823: p9_read_work: Data read wait 7 > > new behavior >> repro-4076 [031] ..... 95.011394: p9_fd_poll: p9_fd_poll rd poll >> repro-4076 [031] ..... 95.011394: p9_fd_poll: p9_fd_request wr poll >> repro-4076 [031] ..... 99.731970: p9_client_rpc: Wait event killable (-512) > > For me the problem isn't so much that this gets ERESTARTSYS but that it > nevers gets to read the 7 bytes that are available? I'm back after getting distracted for a bit. So here it goes: On the mainline, the signal is actually SIGKILL from parent thread due to a timeout: repro-4084 [112] d..2. 233.654264: signal_generate: sig=9 errno=0 code=0 comm=repro pid=4085 grp=1 res=0 repro-4084 [112] d..1. 233.654272: signal_generate: sig=9 errno=0 code=0 comm=repro pid=4085 grp=1 res=1 repro-4085 [125] ..... 233.654304: p9_client_rpc: Wait event killable (-512) (1) (0) (118) repro-4085 [125] ..... 233.654308: p9_client_rpc: Flushed (-512) repro-4085 [125] ..... 233.654312: p9_client_rpc: Final error (-512) repro-4085 [125] ..... 233.654313: p9_client_write: p9_client_rpc done repro-4085 [125] ..... 233.654313: p9_client_write: p9_client_rpc error (-512) repro-4085 [125] ..... 233.654315: v9fs_issue_write: Issue write done 2 err(-512) Specifically, this part: for (;;) { sleep_ms(10); if (waitpid(-1, &status, WNOHANG | WAIT_FLAGS) == pid) break; if (current_time_ms() - start < 5000) continue; kill_and_wait(pid, &status); /* <------- here */ break; } Now for the 7 bytes that were not read - If you look at the traces, every time there is a valid trans_mod->request(), there is a kworker wakeup: repro-4102 [100] ..... 286.618772: p9_client_write: p9_client_rpc repro-4102 [100] ..... 286.618780: p9_fd_request: p9_fd_request repro-4102 [100] ..... 286.618781: p9_fd_request: p9_fd_request EPOLL repro-4102 [100] ..... 286.618781: p9_fd_request: p9_fd_request schedule work <----- this kworker/100:1-1803 [100] ..... 286.618784: p9_write_work: Data write wait 32770 However, for that last 7 byte read, there is in fact no wakeup: repro-4102 [100] ..... 286.618805: p9_client_write: p9_client_rpc repro-4102 [100] ..... 286.618817: p9_fd_request: p9_fd_request repro-4102 [100] ..... 286.618818: p9_fd_request: p9_fd_request EPOLL repro-4102 [100] ..... 286.618818: p9_fd_poll: p9_fd_poll rd poll repro-4102 [100] ..... 286.618818: p9_fd_poll: p9_fd_request wr poll <--- No wakeup after kworker/100:1-1803 [100] ..... 286.618822: p9_fd_poll: p9_fd_poll rd poll kworker/100:1-1803 [100] ..... 286.618822: p9_fd_poll: p9_fd_request wr poll kworker/100:1-1803 [100] ..... 286.618823: p9_read_work: Data read wait 7 kworker/100:1-1803 [100] ..... 
286.618825: p9_read_work: Data read 7 The kworker would spuriously wake up, do something, realize it wasn't supposed to read it, and then disconnect the connection, waking up all waiters. But if we haven't sent a request, why are we waiting at all? > > If the repro has already written the bytes in both cases then there's no > reason to wait 5 seconds here... > > OTOH syzbot 9p code is silly and might have been depending on something > that's not supposed to work e.g. they might be missing a flush or > equivalent for all I know (I still haven't looked at the repro) > I believe if c->trans_mod->request(c, req) does not dispatch the request to the worker for RPC, the caller must not wait. So here goes something that got the benchmark going in my case which might be totally wrong but worth a look: #syz test: upstream aaec5a95d59615523db03dd53c2052f0a87beea7 diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c index 196060dc6138..148533c993f1 100644 --- a/net/9p/trans_fd.c +++ b/net/9p/trans_fd.c @@ -687,7 +687,11 @@ static int p9_fd_request(struct p9_client *client, struct p9_req_t *req) else n = p9_fd_poll(m->client, NULL, NULL); - if (n & EPOLLOUT && !test_and_set_bit(Wworksched, &m->wsched)) + /* Request was not sent */ + if (!(n & EPOLLOUT)) + return -EIO; + + if (!test_and_set_bit(Wworksched, &m->wsched)) schedule_work(&m->wq); return 0; ^ permalink raw reply related [flat|nested] 48+ messages in thread
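The failure mode described above -- queue a request, skip the worker kick, then wait for a reply that can never arrive -- can be modeled in userspace. A deliberately-hanging toy sketch (pthreads stand in for the workqueue; the names are illustrative only, none of this is 9p code; build with cc -pthread):

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t kick = PTHREAD_COND_INITIALIZER;
static bool work_scheduled;	/* stands in for schedule_work(&m->wq) */

static void *tx_worker(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&lock);
	while (!work_scheduled)
		pthread_cond_wait(&kick, &lock);	/* never signalled: sleeps forever */
	pthread_mutex_unlock(&lock);
	return NULL;
}

int main(void)
{
	pthread_t t;
	bool room_to_send = false;	/* i.e. the poll never reported EPOLLOUT */

	pthread_create(&t, NULL, tx_worker, NULL);

	/* the request is queued ("list_add_tail(...)") ... */
	pthread_mutex_lock(&lock);
	if (room_to_send) {		/* ... but the worker kick is skipped */
		work_scheduled = true;
		pthread_cond_signal(&kick);
	}
	pthread_mutex_unlock(&lock);

	/* ... yet the caller still waits for a reply: hangs forever,
	 * mirroring the wait_on_bit() in the original hung-task report. */
	pthread_join(t, NULL);
	puts("never reached");
	return 0;
}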
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-27 17:46 ` K Prateek Nayak @ 2025-03-27 21:19 ` syzbot 2025-03-27 22:18 ` asmadeus 0 siblings, 1 reply; 48+ messages in thread From: syzbot @ 2025-03-27 21:19 UTC (permalink / raw) To: asmadeus, brauner, dhowells, ericvh, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, oleg, swapnil.sapkal, syzkaller-bugs, v9fs, viro Hello, syzbot has tested the proposed patch but the reproducer is still triggering an issue: KASAN: slab-use-after-free Read in p9_conn_cancel ================================================================== BUG: KASAN: slab-use-after-free in p9_conn_cancel+0x900/0x910 net/9p/trans_fd.c:205 Read of size 8 at addr ffff88807b19ea50 by task syz-executor/6595 CPU: 0 UID: 0 PID: 6595 Comm: syz-executor Not tainted 6.13.0-rc1-syzkaller-00017-gaaec5a95d596-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0xc3/0x620 mm/kasan/report.c:489 kasan_report+0xd9/0x110 mm/kasan/report.c:602 p9_conn_cancel+0x900/0x910 net/9p/trans_fd.c:205 p9_conn_destroy net/9p/trans_fd.c:927 [inline] p9_fd_close+0x3c5/0x590 net/9p/trans_fd.c:951 p9_client_destroy+0xce/0x480 net/9p/client.c:1077 v9fs_session_close+0x49/0x2d0 fs/9p/v9fs.c:506 v9fs_kill_super+0x4d/0xa0 fs/9p/vfs_super.c:196 deactivate_locked_super+0xbe/0x1a0 fs/super.c:473 deactivate_super+0xde/0x100 fs/super.c:506 cleanup_mnt+0x222/0x450 fs/namespace.c:1373 task_work_run+0x14e/0x250 kernel/task_work.c:239 exit_task_work include/linux/task_work.h:43 [inline] do_exit+0xadd/0x2d70 kernel/exit.c:938 do_group_exit+0xd3/0x2a0 kernel/exit.c:1087 get_signal+0x24ed/0x26c0 kernel/signal.c:3017 arch_do_signal_or_restart+0x90/0x7e0 arch/x86/kernel/signal.c:337 exit_to_user_mode_loop kernel/entry/common.c:111 [inline] exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] syscall_exit_to_user_mode+0x150/0x2a0 kernel/entry/common.c:218 do_syscall_64+0xda/0x250 arch/x86/entry/common.c:89 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fa1f978bb7c Code: Unable to access opcode bytes at 0x7fa1f978bb52. 
RSP: 002b:00007ffd5c5893b0 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 RAX: fffffffffffffe00 RBX: 0000000000000003 RCX: 00007fa1f978bb7c RDX: 0000000000000030 RSI: 00007ffd5c589470 RDI: 00000000000000f9 RBP: 00007ffd5c58941c R08: 0000000000000000 R09: 0079746972756365 R10: 00007ffd5c588d70 R11: 0000000000000246 R12: 0000000000000258 R13: 00000000000927c0 R14: 0000000000019c1d R15: 00007ffd5c589470 </TASK> Allocated by task 52: kasan_save_stack+0x33/0x60 mm/kasan/common.c:47 kasan_save_track+0x14/0x30 mm/kasan/common.c:68 unpoison_slab_object mm/kasan/common.c:319 [inline] __kasan_slab_alloc+0x89/0x90 mm/kasan/common.c:345 kasan_slab_alloc include/linux/kasan.h:250 [inline] slab_post_alloc_hook mm/slub.c:4104 [inline] slab_alloc_node mm/slub.c:4153 [inline] kmem_cache_alloc_noprof+0x226/0x3d0 mm/slub.c:4160 p9_tag_alloc+0x9c/0x870 net/9p/client.c:281 p9_client_prepare_req+0x19f/0x4d0 net/9p/client.c:644 p9_client_rpc+0x1c3/0xc10 net/9p/client.c:691 p9_client_write+0x31f/0x680 net/9p/client.c:1645 v9fs_issue_write+0xe2/0x180 fs/9p/vfs_addr.c:59 netfs_do_issue_write+0x92/0x110 fs/netfs/write_issue.c:233 netfs_retry_write_stream fs/netfs/write_collect.c:184 [inline] netfs_retry_writes fs/netfs/write_collect.c:361 [inline] netfs_collect_write_results fs/netfs/write_collect.c:529 [inline] netfs_write_collection_worker+0x3e10/0x47d0 fs/netfs/write_collect.c:551 process_one_work+0x9c5/0x1ba0 kernel/workqueue.c:3229 process_scheduled_works kernel/workqueue.c:3310 [inline] worker_thread+0x6c8/0xf00 kernel/workqueue.c:3391 kthread+0x2c1/0x3a0 kernel/kthread.c:389 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 Freed by task 5192: kasan_save_stack+0x33/0x60 mm/kasan/common.c:47 kasan_save_track+0x14/0x30 mm/kasan/common.c:68 kasan_save_free_info+0x3b/0x60 mm/kasan/generic.c:582 poison_slab_object mm/kasan/common.c:247 [inline] __kasan_slab_free+0x51/0x70 mm/kasan/common.c:264 kasan_slab_free include/linux/kasan.h:233 [inline] slab_free_hook mm/slub.c:2338 [inline] slab_free_after_rcu_debug+0x115/0x340 mm/slub.c:4648 rcu_do_batch kernel/rcu/tree.c:2567 [inline] rcu_core+0x79d/0x14d0 kernel/rcu/tree.c:2823 handle_softirqs+0x213/0x8f0 kernel/softirq.c:554 __do_softirq kernel/softirq.c:588 [inline] invoke_softirq kernel/softirq.c:428 [inline] __irq_exit_rcu+0x109/0x170 kernel/softirq.c:655 irq_exit_rcu+0x9/0x30 kernel/softirq.c:671 instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1049 [inline] sysvec_apic_timer_interrupt+0xa4/0xc0 arch/x86/kernel/apic/apic.c:1049 asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702 Last potentially related work creation: kasan_save_stack+0x33/0x60 mm/kasan/common.c:47 __kasan_record_aux_stack+0xba/0xd0 mm/kasan/generic.c:544 slab_free_hook mm/slub.c:2299 [inline] slab_free mm/slub.c:4598 [inline] kmem_cache_free+0x305/0x4c0 mm/slub.c:4700 p9_req_put net/9p/client.c:409 [inline] p9_req_put+0x1c6/0x250 net/9p/client.c:402 p9_client_rpc+0x591/0xc10 net/9p/client.c:761 p9_client_write+0x31f/0x680 net/9p/client.c:1645 v9fs_issue_write+0xe2/0x180 fs/9p/vfs_addr.c:59 netfs_do_issue_write+0x92/0x110 fs/netfs/write_issue.c:233 netfs_retry_write_stream fs/netfs/write_collect.c:184 [inline] netfs_retry_writes fs/netfs/write_collect.c:361 [inline] netfs_collect_write_results fs/netfs/write_collect.c:529 [inline] netfs_write_collection_worker+0x3e10/0x47d0 fs/netfs/write_collect.c:551 process_one_work+0x9c5/0x1ba0 kernel/workqueue.c:3229 process_scheduled_works kernel/workqueue.c:3310 
[inline] worker_thread+0x6c8/0xf00 kernel/workqueue.c:3391 kthread+0x2c1/0x3a0 kernel/kthread.c:389 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 The buggy address belongs to the object at ffff88807b19e990 which belongs to the cache p9_req_t of size 208 The buggy address is located 192 bytes inside of freed 208-byte region [ffff88807b19e990, ffff88807b19ea60) The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x7b19e flags: 0xfff00000000000(node=0|zone=1|lastcpupid=0x7ff) page_type: f5(slab) raw: 00fff00000000000 ffff88814c7a0140 ffffea0001dd70c0 0000000000000006 raw: 0000000000000000 00000000000f000f 00000001f5000000 0000000000000000 page dumped because: kasan: bad access detected page_owner tracks the page as allocated page last allocated via order 0, migratetype Unmovable, gfp_mask 0x52c40(GFP_NOFS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP), pid 6752, tgid 6751 (syz.0.17), ts 91802565462, free_ts 91788017536 set_page_owner include/linux/page_owner.h:32 [inline] post_alloc_hook+0x2d1/0x350 mm/page_alloc.c:1556 prep_new_page mm/page_alloc.c:1564 [inline] get_page_from_freelist+0xfce/0x2f80 mm/page_alloc.c:3474 __alloc_pages_noprof+0x221/0x2470 mm/page_alloc.c:4751 alloc_pages_mpol_noprof+0x2c9/0x610 mm/mempolicy.c:2265 alloc_slab_page mm/slub.c:2408 [inline] allocate_slab mm/slub.c:2574 [inline] new_slab+0x2c9/0x410 mm/slub.c:2627 ___slab_alloc+0xda4/0x1880 mm/slub.c:3815 __slab_alloc.constprop.0+0x56/0xb0 mm/slub.c:3905 __slab_alloc_node mm/slub.c:3980 [inline] slab_alloc_node mm/slub.c:4141 [inline] kmem_cache_alloc_noprof+0xfa/0x3d0 mm/slub.c:4160 p9_tag_alloc+0x9c/0x870 net/9p/client.c:281 p9_client_prepare_req+0x19f/0x4d0 net/9p/client.c:644 p9_client_rpc+0x1c3/0xc10 net/9p/client.c:691 p9_client_version net/9p/client.c:930 [inline] p9_client_create+0xc65/0x1200 net/9p/client.c:1036 v9fs_session_init+0x1f8/0x1a80 fs/9p/v9fs.c:410 v9fs_mount+0xc6/0xa30 fs/9p/vfs_super.c:122 legacy_get_tree+0x109/0x220 fs/fs_context.c:662 vfs_get_tree+0x8f/0x380 fs/super.c:1814 page last free pid 6750 tgid 6748 stack trace: reset_page_owner include/linux/page_owner.h:25 [inline] free_pages_prepare mm/page_alloc.c:1127 [inline] free_unref_folios+0xa7b/0x14f0 mm/page_alloc.c:2704 folios_put_refs+0x587/0x7b0 mm/swap.c:962 free_pages_and_swap_cache+0x45f/0x510 mm/swap_state.c:335 __tlb_batch_free_encoded_pages+0xf9/0x290 mm/mmu_gather.c:136 tlb_batch_pages_flush mm/mmu_gather.c:149 [inline] tlb_flush_mmu_free mm/mmu_gather.c:366 [inline] tlb_flush_mmu mm/mmu_gather.c:373 [inline] tlb_finish_mmu+0x168/0x7b0 mm/mmu_gather.c:465 exit_mmap+0x3df/0xb20 mm/mmap.c:1680 __mmput+0x12a/0x4c0 kernel/fork.c:1353 mmput+0x62/0x70 kernel/fork.c:1375 exit_mm kernel/exit.c:570 [inline] do_exit+0x9bf/0x2d70 kernel/exit.c:925 do_group_exit+0xd3/0x2a0 kernel/exit.c:1087 get_signal+0x24ed/0x26c0 kernel/signal.c:3017 arch_do_signal_or_restart+0x90/0x7e0 arch/x86/kernel/signal.c:337 exit_to_user_mode_loop kernel/entry/common.c:111 [inline] exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] syscall_exit_to_user_mode+0x150/0x2a0 kernel/entry/common.c:218 do_syscall_64+0xda/0x250 arch/x86/entry/common.c:89 entry_SYSCALL_64_after_hwframe+0x77/0x7f Memory state around the buggy address: ffff88807b19e900: 00 00 00 00 00 00 00 00 00 00 fc fc fc fc fc fc ffff88807b19e980: fc fc fa fb fb fb fb fb fb fb fb fb fb fb fb fb 
>ffff88807b19ea00: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc ^ ffff88807b19ea80: fc fc fc fc 00 00 00 00 00 00 00 00 00 00 00 00 ffff88807b19eb00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc ================================================================== Tested on: commit: aaec5a95 pipe_read: don't wake up the writer if the pi.. git tree: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git console output: https://syzkaller.appspot.com/x/log.txt?x=15fafde4580000 kernel config: https://syzkaller.appspot.com/x/.config?x=8d5a2956e94d7972 dashboard link: https://syzkaller.appspot.com/bug?extid=62262fdc0e01d99573fc compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40 patch: https://syzkaller.appspot.com/x/patch.diff?x=110a3804580000 ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-27 21:19 ` syzbot @ 2025-03-27 22:18 ` asmadeus 2025-03-28 4:01 ` K Prateek Nayak 0 siblings, 1 reply; 48+ messages in thread From: asmadeus @ 2025-03-27 22:18 UTC (permalink / raw) To: syzbot Cc: brauner, dhowells, ericvh, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, oleg, swapnil.sapkal, syzkaller-bugs, v9fs, viro syzbot wrote on Thu, Mar 27, 2025 at 02:19:03PM -0700: > BUG: KASAN: slab-use-after-free in p9_conn_cancel+0x900/0x910 net/9p/trans_fd.c:205 > Read of size 8 at addr ffff88807b19ea50 by task syz-executor/6595 Ugh, why... Ah, if ->request() fails p9_client_rpc assumes the request was not written (e.g. write error), so you can't return an error after the list_add_tail call in p9_fd_request. I think you can call p9_conn_cancel with the error and return 0 anyway, and this particular workaround will probably work, regardless of whether it's the correct thing to do here (still haven't had time to look at the patch here) Sorry for this mess (even if most of it predates me...) -- Dominique ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-27 22:18 ` asmadeus @ 2025-03-28 4:01 ` K Prateek Nayak 2025-03-28 4:43 ` syzbot 2025-03-28 13:06 ` Oleg Nesterov 0 siblings, 2 replies; 48+ messages in thread From: K Prateek Nayak @ 2025-03-28 4:01 UTC (permalink / raw) To: asmadeus, syzbot Cc: brauner, dhowells, ericvh, jack, jlayton, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, oleg, swapnil.sapkal, syzkaller-bugs, v9fs, viro Hello Dominique, On 3/28/2025 3:48 AM, asmadeus@codewreck.org wrote: > syzbot wrote on Thu, Mar 27, 2025 at 02:19:03PM -0700: >> BUG: KASAN: slab-use-after-free in p9_conn_cancel+0x900/0x910 net/9p/trans_fd.c:205 >> Read of size 8 at addr ffff88807b19ea50 by task syz-executor/6595 Woops! Should have tested with KASAN enabled. This time I did. > > Ugh, why... > Ah, if ->request() fails p9_client_rpc assumes the request was not > written (e.g. write error), so you can't return an error after the > list_add_tail call in p9_fd_request. > > I think you can call p9_conn_cancel with the error and return 0 anyway, > and this paticular workaround will probably work, regardless of whether > it's the correct thing to do here (still haven't had time to look at the > patch here) That works too! Thank you for the pointer. > > Sorry for this mess (even if most of it predates me...) And I'm sorry for the noise from hacking on stuff that I don't fully understand. Fingers crossed (and thank you syzbot) #syz test: upstream aaec5a95d59615523db03dd53c2052f0a87beea7 diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c index 196060dc6138..dab7f02dc243 100644 --- a/net/9p/trans_fd.c +++ b/net/9p/trans_fd.c @@ -687,7 +687,13 @@ static int p9_fd_request(struct p9_client *client, struct p9_req_t *req) else n = p9_fd_poll(m->client, NULL, NULL); - if (n & EPOLLOUT && !test_and_set_bit(Wworksched, &m->wsched)) + /* Failed to send request */ + if (!(n & EPOLLOUT)) { + p9_conn_cancel(m, -EIO); + return 0; + } + + if (!test_and_set_bit(Wworksched, &m->wsched)) schedule_work(&m->wq); return 0; ^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-28 4:01 ` K Prateek Nayak @ 2025-03-28 4:43 ` syzbot 2025-03-28 13:06 ` Oleg Nesterov 1 sibling, 0 replies; 48+ messages in thread From: syzbot @ 2025-03-28 4:43 UTC (permalink / raw) To: asmadeus, brauner, dhowells, ericvh, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, oleg, swapnil.sapkal, syzkaller-bugs, v9fs, viro Hello, syzbot has tested the proposed patch and the reproducer did not trigger any issue: Reported-by: syzbot+62262fdc0e01d99573fc@syzkaller.appspotmail.com Tested-by: syzbot+62262fdc0e01d99573fc@syzkaller.appspotmail.com Tested on: commit: aaec5a95 pipe_read: don't wake up the writer if the pi.. git tree: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git console output: https://syzkaller.appspot.com/x/log.txt?x=1285fa4c580000 kernel config: https://syzkaller.appspot.com/x/.config?x=8d5a2956e94d7972 dashboard link: https://syzkaller.appspot.com/bug?extid=62262fdc0e01d99573fc compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40 patch: https://syzkaller.appspot.com/x/patch.diff?x=121eede4580000 Note: testing is done by a robot and is best-effort only. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-28 4:01 ` K Prateek Nayak 2025-03-28 4:43 ` syzbot @ 2025-03-28 13:06 ` Oleg Nesterov 2025-03-28 13:07 ` syzbot 1 sibling, 1 reply; 48+ messages in thread From: Oleg Nesterov @ 2025-03-28 13:06 UTC (permalink / raw) To: K Prateek Nayak, Dominique Martinet Cc: syzbot, brauner, dhowells, ericvh, jack, jlayton, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, swapnil.sapkal, syzkaller-bugs, v9fs, viro On 03/28, K Prateek Nayak wrote: > > --- a/net/9p/trans_fd.c > +++ b/net/9p/trans_fd.c > @@ -687,7 +687,13 @@ static int p9_fd_request(struct p9_client *client, struct p9_req_t *req) > else > n = p9_fd_poll(m->client, NULL, NULL); > - if (n & EPOLLOUT && !test_and_set_bit(Wworksched, &m->wsched)) > + /* Failed to send request */ > + if (!(n & EPOLLOUT)) { > + p9_conn_cancel(m, -EIO); Thanks a lot Prateek! Can't really judge the changes in net/9p, but I am not sure. !(n & EPOLLOUT) means that the pipe is full, nothing wrong. We need to kick m->rq in this case. Dominique says: For me the problem isn't so much that this gets ERESTARTSYS but that it never gets to read the 7 bytes that are available? Yes. Of course I do not pretend I fully understand the problem, but it seems this is because p9_read_work() doesn't set Rworksched and doesn't do schedule_work(&m->rq) if list_empty(&m->req_list). However, if the pipe is full, before the commit aaec5a95d59615523db0 ("pipe_read: don't wake up the writer if the pipe is still full"), p9_read_work() -> p9_fd_read() -> kernel_read(ts->rd) triggered the unnecessary wakeup. This wakeup calls p9_pollwake() which kicks p9_poll_workfn() -> p9_poll_mux(), and p9_poll_mux() will notice EPOLLIN and schedule_work(&m->rq). Maybe the patch below makes more sense? Oleg. #syz test: upstream diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c index 56e62978e502..aa9cd248a243 100644 --- a/net/9p/trans_fd.c +++ b/net/9p/trans_fd.c @@ -669,7 +669,6 @@ static void p9_poll_mux(struct p9_conn *m) static int p9_fd_request(struct p9_client *client, struct p9_req_t *req) { - __poll_t n; struct p9_trans_fd *ts = client->trans; struct p9_conn *m = &ts->conn; @@ -687,13 +686,7 @@ static int p9_fd_request(struct p9_client *client, struct p9_req_t *req) list_add_tail(&req->req_list, &m->unsent_req_list); spin_unlock(&m->req_lock); - if (test_and_clear_bit(Wpending, &m->wsched)) - n = EPOLLOUT; - else - n = p9_fd_poll(m->client, NULL, NULL); - - if (n & EPOLLOUT && !test_and_set_bit(Wworksched, &m->wsched)) - schedule_work(&m->wq); + p9_poll_mux(m); return 0; } ^ permalink raw reply related [flat|nested] 48+ messages in thread
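For readers without the source handy: p9_poll_mux() re-polls both directions and schedules whichever worker has something to do, which is why calling it here also picks up a pending reply when the pipe is write-full. Roughly, as a simplified paraphrase (error handling and the Rpending/Wpending bookkeeping elided; see net/9p/trans_fd.c for the real body):

static void p9_poll_mux(struct p9_conn *m)
{
	int err;
	__poll_t n = p9_fd_poll(m->client, NULL, &err);

	if ((n & EPOLLIN) && !test_and_set_bit(Rworksched, &m->wsched))
		schedule_work(&m->rq);		/* data to read: kick the rx worker */
	if ((n & EPOLLOUT) && !test_and_set_bit(Wworksched, &m->wsched))
		schedule_work(&m->wq);		/* room to write: kick the tx worker */
}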
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-28 13:06 ` Oleg Nesterov @ 2025-03-28 13:07 ` syzbot 2025-03-28 13:25 ` Oleg Nesterov 0 siblings, 1 reply; 48+ messages in thread From: syzbot @ 2025-03-28 13:07 UTC (permalink / raw) To: oleg Cc: asmadeus, brauner, dhowells, ericvh, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, oleg, swapnil.sapkal, syzkaller-bugs, v9fs, viro > On 03/28, K Prateek Nayak wrote: >> >> --- a/net/9p/trans_fd.c >> +++ b/net/9p/trans_fd.c >> @@ -687,7 +687,13 @@ static int p9_fd_request(struct p9_client *client, struct p9_req_t *req) >> else >> n = p9_fd_poll(m->client, NULL, NULL); >> - if (n & EPOLLOUT && !test_and_set_bit(Wworksched, &m->wsched)) >> + /* Failed to send request */ >> + if (!(n & EPOLLOUT)) { >> + p9_conn_cancel(m, -EIO); > > Thanks a lot Prateek! > > Can't really the changes in net/9p, but I am not sure. !(n & EPOLLOUT) > means that pipe is full, nothing wrong. We need to kick m->rq in this > case. > > Dominique says: > > For me the problem isn't so much that this gets ERESTARTSYS but that it > nevers gets to read the 7 bytes that are available? > > Yes. Of course I do not pretend I fully understand the problem, but it > seems this is because p9_read_work() doesn't set Rworksched and doesn't > do schedule_work(&m->rq) if list_empty(&m->req_list). > > However, if the pipe is full, before the commit aaec5a95d59615523db0 > ("pipe_read: don't wake up the writer if the pipe is still full"), > p9_read_work() -> p9_fd_read() -> kernel_read(ts->rd) triggered the > unnecessary wakeup. This wakeup calls p9_pollwake() shich kicks > p9_poll_workfn() -> p9_poll_mux(), and p9_poll_mux() will notice > EPOLLIN and schedule_work(&m->rq). > > May be the patch below makes more sense? > > Oleg. > > #syz test: upstream want either no args or 2 args (repo, branch), got 5 > > diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c > index 56e62978e502..aa9cd248a243 100644 > --- a/net/9p/trans_fd.c > +++ b/net/9p/trans_fd.c > @@ -669,7 +669,6 @@ static void p9_poll_mux(struct p9_conn *m) > > static int p9_fd_request(struct p9_client *client, struct p9_req_t *req) > { > - __poll_t n; > struct p9_trans_fd *ts = client->trans; > struct p9_conn *m = &ts->conn; > > @@ -687,13 +686,7 @@ static int p9_fd_request(struct p9_client *client, struct p9_req_t *req) > list_add_tail(&req->req_list, &m->unsent_req_list); > spin_unlock(&m->req_lock); > > - if (test_and_clear_bit(Wpending, &m->wsched)) > - n = EPOLLOUT; > - else > - n = p9_fd_poll(m->client, NULL, NULL); > - > - if (n & EPOLLOUT && !test_and_set_bit(Wworksched, &m->wsched)) > - schedule_work(&m->wq); > + p9_poll_mux(m); > > return 0; > } > ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-28 13:07 ` syzbot @ 2025-03-28 13:25 ` Oleg Nesterov 2025-03-28 13:49 ` syzbot 0 siblings, 1 reply; 48+ messages in thread From: Oleg Nesterov @ 2025-03-28 13:25 UTC (permalink / raw) To: syzbot Cc: asmadeus, brauner, dhowells, ericvh, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, swapnil.sapkal, syzkaller-bugs, v9fs, viro On 03/28, syzbot wrote: > > > #syz test: upstream > > want either no args or 2 args (repo, branch), got 5 #syz test diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c index 56e62978e502..aa9cd248a243 100644 --- a/net/9p/trans_fd.c +++ b/net/9p/trans_fd.c @@ -669,7 +669,6 @@ static void p9_poll_mux(struct p9_conn *m) static int p9_fd_request(struct p9_client *client, struct p9_req_t *req) { - __poll_t n; struct p9_trans_fd *ts = client->trans; struct p9_conn *m = &ts->conn; @@ -687,13 +686,7 @@ static int p9_fd_request(struct p9_client *client, struct p9_req_t *req) list_add_tail(&req->req_list, &m->unsent_req_list); spin_unlock(&m->req_lock); - if (test_and_clear_bit(Wpending, &m->wsched)) - n = EPOLLOUT; - else - n = p9_fd_poll(m->client, NULL, NULL); - - if (n & EPOLLOUT && !test_and_set_bit(Wworksched, &m->wsched)) - schedule_work(&m->wq); + p9_poll_mux(m); return 0; } ^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-28 13:25 ` Oleg Nesterov @ 2025-03-28 13:49 ` syzbot 2025-03-28 14:49 ` Oleg Nesterov 0 siblings, 1 reply; 48+ messages in thread From: syzbot @ 2025-03-28 13:49 UTC (permalink / raw) To: asmadeus, brauner, dhowells, ericvh, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, oleg, swapnil.sapkal, syzkaller-bugs, v9fs, viro Hello, syzbot has tested the proposed patch but the reproducer is still triggering an issue: unregister_netdevice: waiting for DEV to become free unregister_netdevice: waiting for batadv0 to become free. Usage count = 3 Tested on: commit: acb4f337 Merge tag 'm68knommu-for-v6.15' of git://git... git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=16bab804580000 kernel config: https://syzkaller.appspot.com/x/.config?x=95c3bbe7ce8436a7 dashboard link: https://syzkaller.appspot.com/bug?extid=62262fdc0e01d99573fc compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40 patch: https://syzkaller.appspot.com/x/patch.diff?x=146b9bb0580000 ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-28 13:49 ` syzbot @ 2025-03-28 14:49 ` Oleg Nesterov 2025-03-28 15:22 ` syzbot 0 siblings, 1 reply; 48+ messages in thread From: Oleg Nesterov @ 2025-03-28 14:49 UTC (permalink / raw) To: syzbot Cc: asmadeus, brauner, dhowells, ericvh, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, swapnil.sapkal, syzkaller-bugs, v9fs, viro On 03/28, syzbot wrote: > > Hello, > > syzbot has tested the proposed patch but the reproducer is still triggering an issue: > unregister_netdevice: waiting for DEV to become free > > unregister_netdevice: waiting for batadv0 to become free. Usage count = 3 Looks "obviously unrelated", but I'm not sure about anything anymore. In any case I leave this to 9p maintainers. Okay, just out of curiosity: #syz test: upstream aaec5a95d59615523db03dd53c2052f0a87beea7 diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c index 56e62978e502..aa9cd248a243 100644 --- a/net/9p/trans_fd.c +++ b/net/9p/trans_fd.c @@ -669,7 +669,6 @@ static void p9_poll_mux(struct p9_conn *m) static int p9_fd_request(struct p9_client *client, struct p9_req_t *req) { - __poll_t n; struct p9_trans_fd *ts = client->trans; struct p9_conn *m = &ts->conn; @@ -687,13 +686,7 @@ static int p9_fd_request(struct p9_client *client, struct p9_req_t *req) list_add_tail(&req->req_list, &m->unsent_req_list); spin_unlock(&m->req_lock); - if (test_and_clear_bit(Wpending, &m->wsched)) - n = EPOLLOUT; - else - n = p9_fd_poll(m->client, NULL, NULL); - - if (n & EPOLLOUT && !test_and_set_bit(Wworksched, &m->wsched)) - schedule_work(&m->wq); + p9_poll_mux(m); return 0; } ^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-28 14:49 ` Oleg Nesterov @ 2025-03-28 15:22 ` syzbot 2025-03-28 17:00 ` Oleg Nesterov 0 siblings, 1 reply; 48+ messages in thread From: syzbot @ 2025-03-28 15:22 UTC (permalink / raw) To: asmadeus, brauner, dhowells, ericvh, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, oleg, swapnil.sapkal, syzkaller-bugs, v9fs, viro Hello, syzbot has tested the proposed patch and the reproducer did not trigger any issue: Reported-by: syzbot+62262fdc0e01d99573fc@syzkaller.appspotmail.com Tested-by: syzbot+62262fdc0e01d99573fc@syzkaller.appspotmail.com Tested on: commit: aaec5a95 pipe_read: don't wake up the writer if the pi.. git tree: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git console output: https://syzkaller.appspot.com/x/log.txt?x=1598064c580000 kernel config: https://syzkaller.appspot.com/x/.config?x=8d5a2956e94d7972 dashboard link: https://syzkaller.appspot.com/bug?extid=62262fdc0e01d99573fc compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40 patch: https://syzkaller.appspot.com/x/patch.diff?x=16682a4c580000 Note: testing is done by a robot and is best-effort only. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-28 15:22 ` syzbot @ 2025-03-28 17:00 ` Oleg Nesterov 2025-03-28 17:56 ` K Prateek Nayak 2025-03-29 0:00 ` asmadeus 0 siblings, 2 replies; 48+ messages in thread From: Oleg Nesterov @ 2025-03-28 17:00 UTC (permalink / raw) To: syzbot Cc: asmadeus, brauner, dhowells, ericvh, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, swapnil.sapkal, syzkaller-bugs, v9fs, viro Dear syzbot, On 03/28, syzbot wrote: > > syzbot has tested the proposed patch and the reproducer did not trigger any issue: > > Reported-by: syzbot+62262fdc0e01d99573fc@syzkaller.appspotmail.com > Tested-by: syzbot+62262fdc0e01d99573fc@syzkaller.appspotmail.com Thanks. So the previous syzbot has tested the proposed patch but the reproducer is still triggering an issue: unregister_netdevice: waiting for DEV to become free unregister_netdevice: waiting for batadv0 to become free. Usage count = 3 report suggests that upstream/master has another/unrelated issue(s). As for the patches from me or Prateek (thanks again!), I think that the maintainers should take a look. But at this point I am mostly confident that the bisected commit aaec5a95d5961 ("pipe_read: don't wake up the writer if the pipe is still full") is innocent, it just reveals yet another problem. I guess (I hope ;) Prateek agrees. Oleg. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-28 17:00 ` Oleg Nesterov @ 2025-03-28 17:56 ` K Prateek Nayak 2025-03-28 18:20 ` Oleg Nesterov 2025-03-29 0:00 ` asmadeus 1 sibling, 1 reply; 48+ messages in thread From: K Prateek Nayak @ 2025-03-28 17:56 UTC (permalink / raw) To: Oleg Nesterov, syzbot Cc: asmadeus, brauner, dhowells, ericvh, jack, jlayton, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, swapnil.sapkal, syzkaller-bugs, v9fs, viro Hello Oleg, On 3/28/2025 10:30 PM, Oleg Nesterov wrote: > Dear syzbot, > > On 03/28, syzbot wrote: >> >> syzbot has tested the proposed patch and the reproducer did not trigger any issue: >> >> Reported-by: syzbot+62262fdc0e01d99573fc@syzkaller.appspotmail.com >> Tested-by: syzbot+62262fdc0e01d99573fc@syzkaller.appspotmail.com > Yours is the right approach. I was searching for p9_poll_mux() which would wake the read collector to free pipe buffers but I didn't dig enough and I was too lazy to open code it (guilty!) If this gets picked, feel free to add: Reviewed-and-tested-by: K Prateek Nayak <kprateek.nayak@amd.com> > Thanks. so the previous > > syzbot has tested the proposed patch but the reproducer is still triggering an issue: > unregister_netdevice: waiting for DEV to become free > > unregister_netdevice: waiting for batadv0 to become free. Usage count = 3 > > report suggests that upstream/master has another/unrelated issue(s). I could not reproduce this bit on mainline at commit acb4f33713b9 ("Merge tag 'm68knommu-for-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu") so I too second that it is an unrelated issue. > > As for the patches from me or Prateek (thanks again!), I think that > the maintainers should take a look. > > But at this point I am mostly confident that the bisected commit aaec5a95d5961 > ("pipe_read: don't wake up the writer if the pipe is still full") is innocent, > it just reveals yet another problem. > > I guess (I hope ;) Prateek agrees. I agree aaec5a95d5961 is not guilty! (unlike me) :) > > Oleg. > -- Thanks and Regards, Prateek ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-28 17:56 ` K Prateek Nayak @ 2025-03-28 18:20 ` Oleg Nesterov 0 siblings, 0 replies; 48+ messages in thread From: Oleg Nesterov @ 2025-03-28 18:20 UTC (permalink / raw) To: K Prateek Nayak Cc: syzbot, asmadeus, brauner, dhowells, ericvh, jack, jlayton, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, swapnil.sapkal, syzkaller-bugs, v9fs, viro Hi Prateek, On 03/28, K Prateek Nayak wrote: > > Yours is the right approach. OK, thank you, but let's wait for the 9p maintainers. But... > If this gets picked, feel free to add: > > Reviewed-and-tested-by: K Prateek Nayak <kprateek.nayak@amd.com> No! My version is just a slightly updated version of your initial patch. It was you who (unless we both are wrong) actually pinpointed the problem. So, if this is acked by the maintainers, please-please send the updated patch with a changelog, and feel free to add my Reviewed-by or Acked-by. Oleg. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-28 17:00 ` Oleg Nesterov 2025-03-28 17:56 ` K Prateek Nayak @ 2025-03-29 0:00 ` asmadeus 2025-03-29 14:21 ` Oleg Nesterov 1 sibling, 1 reply; 48+ messages in thread From: asmadeus @ 2025-03-29 0:00 UTC (permalink / raw) To: Oleg Nesterov Cc: syzbot, brauner, dhowells, ericvh, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, swapnil.sapkal, syzkaller-bugs, v9fs, viro Oleg Nesterov wrote on Fri, Mar 28, 2025 at 06:00:12PM +0100: > As for the patches from me or Prateek (thanks again!), I think that > the maintainers should take a look. > > But at this point I am mostly confident that the bisected commit aaec5a95d5961 > ("pipe_read: don't wake up the writer if the pipe is still full") is innocent; > it just reveals yet another problem. > > I guess (I hope ;) Prateek agrees. Right, so your patch sounds better than Prateek's initial error-out workaround, but it's a bit weird anyway so let me recap:
- that syz repro has this unnatural pattern where the replies are all written before the requests are sent
- p9_read_work() (the read worker) has an optimization that if there is no in-flight request then there obviously must be nothing to read (9p is 100% client-initiated, there's no way the server should send something first), so at this point the reader task is idle (OTOH, we're checking for rx right at p9_conn_create() before anything is sent, so it's not like we're consistent on this...)
- p9_fd_request() (sending a new request) has another optimization that only checks for tx: at this point if another request was already in flight then the rx task should have a poll going on for rx, and if there were no in-flight request yet then there should be no point in checking for rx, so p9_fd_request() only kicks the tx worker, and only if there is room to send
- at this point I don't really get the logic that'll wake the rx thread up either... p9_pollwake() will trigger p9_poll_workfn() (through p9_poll_work) which will call p9_poll_mux() and get the reader kicking again, but I don't know how the waitqueue mechanism is supposed to work (see p9_pollwait()). I'd have expected the tx task to somehow nudge this on, but it doesn't?...
- due to the new optimization (aaec5a95d59615 "pipe_read: don't wake up the writer if the pipe is still full"), that 'if there is room to send' check started failing and the tx thread doesn't start? Because syzbot passed us a pipe that was already full, or they messed with it after mount? I'm not clear about this point, but I think it's the key here -- the 9p "mount by fd" is great for local pseudo filesystems and things like that, but it's been abused by syzbot too much, and I don't want to spend too much time making sure that the unholy things they do with these fds work. If possible I'd like to mark that fd unusable by userspace, but I'm honestly doubtful it's possible (if it was dup'd or something before mount, for example...)
So, yeah, well, okay, I don't mind the patch even if it doesn't make sense with a regular server. We don't really care about trans fd performance either, so it's fine if it's a bit slower, and the error Prateek added might happen in a real case if the tx queue is full of real requests, so I think your approach is good enough.
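To spell out the tx-only check I mean, p9_fd_request() does roughly the following (a paraphrased sketch of net/9p/trans_fd.c from memory, not the exact code); the last call is the p9_fd_request() -> p9_poll_mux() hack under discussion:

	static int p9_fd_request(struct p9_client *client, struct p9_req_t *req)
	{
		struct p9_trans_fd *ts = client->trans;
		struct p9_conn *m = &ts->conn;
		__poll_t n;

		spin_lock(&m->req_lock);
		if (m->err < 0) {
			spin_unlock(&m->req_lock);
			return m->err;
		}
		req->status = REQ_STATUS_UNSENT;
		list_add_tail(&req->req_list, &m->unsent_req_list);
		spin_unlock(&m->req_lock);

		/* tx-only optimization: only kick the write worker, and only
		 * if there is room to send */
		if (test_and_clear_bit(Wpending, &m->wsched))
			n = EPOLLOUT;
		else
			n = p9_fd_poll(m->client, NULL, NULL);

		if (n & EPOLLOUT && !test_and_set_bit(Wworksched, &m->wsched))
			schedule_work(&m->wq);

		/* the proposed hack: also re-check rx here, so a reply already
		 * sitting in a full pipe gets the read worker going again */
		p9_poll_mux(m);

		return 0;
	}
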
If we understand what's happening here, I think it's as good as anything else, but I'd just like the commit message to make clear what syzbot is doing and why the regression happened. Thank you both for the thorough analysis and follow-ups! -- Dominique Martinet | Asmadeus ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-29 0:00 ` asmadeus @ 2025-03-29 14:21 ` Oleg Nesterov 2025-03-29 23:27 ` asmadeus 0 siblings, 1 reply; 48+ messages in thread From: Oleg Nesterov @ 2025-03-29 14:21 UTC (permalink / raw) To: asmadeus Cc: syzbot, brauner, dhowells, ericvh, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, swapnil.sapkal, syzkaller-bugs, v9fs, viro First of all, let me remind you that I know nothing about 9p or netfs ;) And I am not sure that my patch is the right solution. I am not even sure we need the fix: according to syzbot testing, the problem goes away with the fixes from David https://web.git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=netfs-fixes but I didn't even try to read them; this is not my area. Now, I'll try to answer some of your questions, but I can be easily wrong. On 03/29, asmadeus@codewreck.org wrote: > > Right, so your patch sounds better than Prateek's initial error-out > workaround, but it's a bit weird anyway so let me recap: > - that syz repro has this unnatural pattern where the replies are all > written before the requests are sent Yes, > - p9_read_work() (the read worker) has an optimization that if there is no > in-flight request then there obviously must be nothing to read (9p is 100% > client-initiated, there's no way the server should send something > first), so at this point the reader task is idle Yes. But note that it does kernel_read() -> pipe_read() before it becomes idle. See below. > - p9_fd_request() (sending a new request) has another optimization that > only checks for tx: at this point if another request was already in > flight then the rx task should have a poll going on for rx, and if there > were no in-flight request yet then there should be no point in checking > for rx, so p9_fd_request() only kicks the tx worker, and only if there is room > to send Can't comment, but > - at this point I don't really get the logic that'll wake the rx thread > up either... p9_pollwake() will trigger p9_poll_workfn() ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Yes, but where can this p9_pollwake() come from? See below. > - due to the new optimization (aaec5a95d59615 "pipe_read: don't wake up > the writer if the pipe is still full"), that 'if there is room to send' > check started failing and the tx thread doesn't start? Again, I can be easily wrong, but no. With or without the optimization above, it doesn't make sense to start the tx thread when the pipe is full: p9_fd_poll() can't report EPOLLOUT. Let's recall that the idle read worker did kernel_read() -> pipe_read(). Before this optimization, pipe_read() did the unnecessary wake_up_interruptible_sync_poll(&pipe->wr_wait); when the pipe was full before the reading _and_ is still full after the reading. This wakeup calls p9_pollwake() which kicks p9_poll_workfn(). p9_poll_workfn() calls p9_poll_mux(). p9_poll_mux() does n = p9_fd_poll(). "n & EPOLLOUT" is false, exactly because this wakeup was unnecessary, so p9_poll_mux() won't do schedule_work(&m->wq); this is fine. But "n & EPOLLIN" is true, so p9_poll_mux() does schedule_work(&m->rq) and wakes the rx thread. p9_read_work() is called again. It reads more data and (I guess) notices some problem and does p9_conn_cancel(EIO). This no longer happens after the optimization. So in some sense the p9_fd_request() -> p9_poll_mux() hack (which wakes the rx thread in this case) restores the old behaviour.
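To make the difference concrete, the tail of the pipe_read() loop went roughly from the first form to the second (a simplified sketch from memory; see the commit itself for the real diff):

	/* before aaec5a95d596: fires whenever the pipe *was* full at entry,
	 * even if we could not consume a single buffer */
	if (was_full)
		wake_up_interruptible_sync_poll(&pipe->wr_wait,
						EPOLLOUT | EPOLLWRNORM);

	/* after aaec5a95d596: fires only if releasing a buffer actually
	 * turned a full pipe into a non-full one */
	if (wake_writer)
		wake_up_interruptible_sync_poll(&pipe->wr_wait,
						EPOLLOUT | EPOLLWRNORM);

It is exactly that first, "spurious" wakeup that p9_pollwake() used to piggyback on in this scenario.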
But again, again, quite possibly I completely misread this (nontrivial) code. Oleg. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-29 14:21 ` Oleg Nesterov @ 2025-03-29 23:27 ` asmadeus 2025-03-30 10:21 ` Oleg Nesterov 0 siblings, 1 reply; 48+ messages in thread From: asmadeus @ 2025-03-29 23:27 UTC (permalink / raw) To: Oleg Nesterov Cc: syzbot, brauner, dhowells, ericvh, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, swapnil.sapkal, syzkaller-bugs, v9fs, viro Oleg Nesterov wrote on Sat, Mar 29, 2025 at 03:21:39PM +0100: > First of all, let me remind you that I know nothing about 9p or netfs ;) > And I am not sure that my patch is the right solution. > > I am not even sure we need the fix: according to syzbot testing, the > problem goes away with the fixes from David > https://web.git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=netfs-fixes > but I didn't even try to read them; this is not my area. (gah, I hate emails when one gets added to a thread later.. I've just now opened the thread on lore and seen David's test :/) > > - due to the new optimization (aaec5a95d59615 "pipe_read: don't wake up > > the writer if the pipe is still full"), that 'if there is room to send' > > check started failing and the tx thread doesn't start? > > Again, I can be easily wrong, but no. > > With or without the optimization above, it doesn't make sense to start > the tx thread when the pipe is full: p9_fd_poll() can't report EPOLLOUT. > > Let's recall that the idle read worker did kernel_read() -> pipe_read(). > Before this optimization, pipe_read() did the unnecessary > > wake_up_interruptible_sync_poll(&pipe->wr_wait); > > when the pipe was full before the reading _and_ is still full after the > reading. > > This wakeup calls p9_pollwake() which kicks p9_poll_workfn(). Aah, that's the bit I didn't get, thank you! > This no longer happens after the optimization. So in some sense the > p9_fd_request() -> p9_poll_mux() hack (which wakes the rx thread in this > case) restores the old behaviour. > > But again, again, quite possibly I completely misread this (nontrivial) > code. Yes, this totally makes sense; I agree with your analysis. So basically 9p was optimizing on the assumption that this behaviour is impossible -- and with a normal server it is (it makes no sense for the tx pipe to be full with 0 in-flight requests, for the tx pipe to never drain, or for the reply to already be there before the actual write happened!!) -- but the old pipe behaviour made it work anyway... So part of me wants to just leave it there and, if anything, try to make this kind of usage impossible by adding more checks to mount -o trans=fd, but I don't think it's possible to lock down all the kinds of weird behaviour root users (=syzbot) can engage in... OTOH syzbot does find some useful bugs, so I guess it might be worth fixing, I don't know. If David's patch also happens to fix it I guess we can also just wait for that? Thanks, -- Dominique Martinet | Asmadeus ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-29 23:27 ` asmadeus @ 2025-03-30 10:21 ` Oleg Nesterov 0 siblings, 0 replies; 48+ messages in thread From: Oleg Nesterov @ 2025-03-30 10:21 UTC (permalink / raw) To: asmadeus Cc: syzbot, brauner, dhowells, ericvh, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, swapnil.sapkal, syzkaller-bugs, v9fs, viro On 03/30, asmadeus@codewreck.org wrote: > > If David's patch also happens to fix it I guess we can also just wait > for that? Sure, whatever you prefer. Thanks! Oleg. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-23 19:17 ` syzbot 2025-03-23 19:47 ` Oleg Nesterov @ 2025-03-23 20:03 ` Mateusz Guzik 2025-03-23 20:43 ` syzbot 1 sibling, 1 reply; 48+ messages in thread From: Mateusz Guzik @ 2025-03-23 20:03 UTC (permalink / raw) To: syzbot Cc: brauner, dhowells, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, netfs, oleg, swapnil.sapkal, syzkaller-bugs, viro > Tested on: > > commit: aaec5a95 pipe_read: don't wake up the writer if the pi.. > git tree: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git > console output: https://syzkaller.appspot.com/x/log.txt?x=169ac43f980000 > kernel config: https://syzkaller.appspot.com/x/.config?x=8d5a2956e94d7972 > dashboard link: https://syzkaller.appspot.com/bug?extid=62262fdc0e01d99573fc > compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40 > patch: https://syzkaller.appspot.com/x/patch.diff?x=17803c4c580000 > Here is a "just in case" patch by me, which makes sure that head + tail are only looked at with the lock held.

#syz test: upstream aaec5a95d59615523db03dd53c2052f0a87beea7

diff --git a/fs/pipe.c b/fs/pipe.c
index 82fede0f2111..7eedcef2811e 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -210,11 +210,20 @@ static const struct pipe_buf_operations anon_pipe_buf_ops = {
 /* Done while waiting without holding the pipe lock - thus the READ_ONCE() */
 static inline bool pipe_readable(const struct pipe_inode_info *pipe)
 {
-	unsigned int head = READ_ONCE(pipe->head);
-	unsigned int tail = READ_ONCE(pipe->tail);
-	unsigned int writers = READ_ONCE(pipe->writers);
+	return !READ_ONCE(pipe->isempty) || !READ_ONCE(pipe->writers);
+}
+
+static inline void pipe_recalc_state(struct pipe_inode_info *pipe)
+{
+	pipe->isempty = pipe_empty(pipe->head, pipe->tail);
+	pipe->isfull = pipe_full(pipe->head, pipe->tail, pipe->max_usage);
+}
 
-	return !pipe_empty(head, tail) || !writers;
+static inline void pipe_update_head(struct pipe_inode_info *pipe,
+				    unsigned int head)
+{
+	pipe->head = ++head;
+	pipe_recalc_state(pipe);
 }
 
 static inline unsigned int pipe_update_tail(struct pipe_inode_info *pipe,
@@ -244,6 +253,7 @@ static inline unsigned int pipe_update_tail(struct pipe_inode_info *pipe,
 	 * without the spinlock - the mutex is enough.
 	 */
 	pipe->tail = ++tail;
+	pipe_recalc_state(pipe);
 	return tail;
 }
 
@@ -417,12 +427,7 @@ static inline int is_packetized(struct file *file)
 /* Done while waiting without holding the pipe lock - thus the READ_ONCE() */
 static inline bool pipe_writable(const struct pipe_inode_info *pipe)
 {
-	unsigned int head = READ_ONCE(pipe->head);
-	unsigned int tail = READ_ONCE(pipe->tail);
-	unsigned int max_usage = READ_ONCE(pipe->max_usage);
-
-	return !pipe_full(head, tail, max_usage) ||
-	       !READ_ONCE(pipe->readers);
+	return !READ_ONCE(pipe->isfull) || !READ_ONCE(pipe->readers);
 }
 
 static ssize_t
@@ -524,7 +529,7 @@ pipe_write(struct kiocb *iocb, struct iov_iter *from)
 		 * it, either the reader will consume it or it'll still
 		 * be there for the next write.
 		 */
-		pipe->head = head + 1;
+		pipe_update_head(pipe, head);
 
 		/* Insert it into the buffer array */
 		buf = &pipe->bufs[head & mask];
@@ -549,10 +554,9 @@ pipe_write(struct kiocb *iocb, struct iov_iter *from)
 
 			if (!iov_iter_count(from))
 				break;
-		}
 
-		if (!pipe_full(head, pipe->tail, pipe->max_usage))
 			continue;
+		}
 
 		/* Wait for buffer space to become available. */
 		if ((filp->f_flags & O_NONBLOCK) ||
diff --git a/include/linux/pipe_fs_i.h b/include/linux/pipe_fs_i.h
index 8ff23bf5a819..d4b7539399b5 100644
--- a/include/linux/pipe_fs_i.h
+++ b/include/linux/pipe_fs_i.h
@@ -69,6 +69,8 @@ struct pipe_inode_info {
 	unsigned int r_counter;
 	unsigned int w_counter;
 	bool poll_usage;
+	bool isempty;
+	bool isfull;
 #ifdef CONFIG_WATCH_QUEUE
 	bool note_loss;
 #endif

^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-23 20:03 ` Mateusz Guzik @ 2025-03-23 20:43 ` syzbot 0 siblings, 0 replies; 48+ messages in thread From: syzbot @ 2025-03-23 20:43 UTC (permalink / raw) To: brauner, dhowells, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, mjguzik, netfs, oleg, swapnil.sapkal, syzkaller-bugs, viro Hello, syzbot has tested the proposed patch but the reproducer is still triggering an issue: INFO: task hung in netfs_unbuffered_write_iter INFO: task syz.0.17:6690 blocked for more than 143 seconds. Not tainted 6.13.0-rc1-syzkaller-00017-gaaec5a95d596-dirty #0 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:syz.0.17 state:D stack:25184 pid:6690 tgid:6689 ppid:6491 flags:0x00004004 Call Trace: <TASK> context_switch kernel/sched/core.c:5369 [inline] __schedule+0xe58/0x5ad0 kernel/sched/core.c:6756 __schedule_loop kernel/sched/core.c:6833 [inline] schedule+0xe7/0x350 kernel/sched/core.c:6848 bit_wait+0x15/0xe0 kernel/sched/wait_bit.c:237 __wait_on_bit+0x62/0x180 kernel/sched/wait_bit.c:49 out_of_line_wait_on_bit+0xda/0x110 kernel/sched/wait_bit.c:64 wait_on_bit include/linux/wait_bit.h:77 [inline] netfs_unbuffered_write_iter_locked+0xba8/0xe70 fs/netfs/direct_write.c:105 netfs_unbuffered_write_iter+0x413/0x6d0 fs/netfs/direct_write.c:193 v9fs_file_write_iter+0xbf/0x100 fs/9p/vfs_file.c:404 new_sync_write fs/read_write.c:586 [inline] vfs_write+0x5ae/0x1150 fs/read_write.c:679 ksys_write+0x12b/0x250 fs/read_write.c:731 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fbba998d169 RSP: 002b:00007fbba97f9038 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00007fbba9ba5fa0 RCX: 00007fbba998d169 RDX: 0000000000007fec RSI: 0000400000000540 RDI: 0000000000000007 RBP: 00007fbba9a0e2a0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 00007fbba9ba5fa0 R15: 00007ffea301fe68 </TASK> Showing all locks held in the system: 2 locks held by kworker/u8:0/11: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc90000107d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:1/12: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc90000117d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 1 lock held by khungtaskd/30: #0: ffffffff8e1bac80 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:337 [inline] #0: ffffffff8e1bac80 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:849 [inline] #0: ffffffff8e1bac80 (rcu_read_lock){....}-{1:3}, at: debug_show_all_locks+0x7f/0x390 kernel/locking/lockdep.c:6744 2 locks held by kworker/u8:2/35: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc90000ab7d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:4/80: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc9000213fd80 
((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 3 locks held by kworker/u8:6/2887: 2 locks held by kworker/u8:7/2936: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc9000bc87d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:8/3460: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc9000c917d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:9/3519: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc9000c757d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:10/3600: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc9000cbe7d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:11/3645: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc9000d047d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:12/3693: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc9000d437d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by getty/5579: #0: ffff8880315940a0 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x24/0x80 drivers/tty/tty_ldisc.c:243 #1: ffffc90002fde2f0 (&ldata->atomic_read_lock){+.+.}-{4:4}, at: n_tty_read+0xfba/0x1480 drivers/tty/n_tty.c:2211 3 locks held by syz.0.17/6690: #0: ffff88807c837eb8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff88807341a420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff8880739f0148 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.1.19/6790: #0: ffff88802beef7b8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff888055054420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff8880739f07b8 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.2.22/6813: #0: ffff888034a757b8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff8880736a2420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff8880739f0e28 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.3.24/6832: #0: ffff88802fbce9b8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff888030d96420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff888073af0148 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.4.26/6857: #0: ffff8880355be638 
(&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff8880331c0420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff888073af07b8 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.5.28/6882: #0: ffff8880635d1eb8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff88807347c420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff8880739f1498 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.6.30/6910: #0: ffff88807b32e2b8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff888029cb0420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff8880739f1b08 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.7.32/6938: #0: ffff88802fba9278 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff88806225e420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff888073af0e28 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.8.34/6969: #0: ffff8880336ea478 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff88802a7a4420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff888073af1498 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.9.36/6996: #0: ffff88801e2c27f8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff88802d796420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff8880739f2178 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 ============================================= NMI backtrace for cpu 1 CPU: 1 UID: 0 PID: 30 Comm: khungtaskd Not tainted 6.13.0-rc1-syzkaller-00017-gaaec5a95d596-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120 nmi_cpu_backtrace+0x27b/0x390 lib/nmi_backtrace.c:113 nmi_trigger_cpumask_backtrace+0x29c/0x300 lib/nmi_backtrace.c:62 trigger_all_cpu_backtrace include/linux/nmi.h:162 [inline] check_hung_uninterruptible_tasks kernel/hung_task.c:234 [inline] watchdog+0xf14/0x1240 kernel/hung_task.c:397 kthread+0x2c1/0x3a0 kernel/kthread.c:389 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> Sending NMI from CPU 1 to CPUs 0: NMI backtrace for cpu 0 CPU: 0 UID: 0 PID: 5871 Comm: kworker/0:4 Not tainted 6.13.0-rc1-syzkaller-00017-gaaec5a95d596-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025 Workqueue: events_power_efficient gc_worker RIP: 0010:__lock_acquire+0x2fe/0x3c40 kernel/locking/lockdep.c:5162 Code: 47 10 48 89 c2 48 89 44 24 50 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 80 3c 02 00 0f 85 70 2b 00 00 49 8d 47 18 4d 89 67 10 <48> 89 c2 48 89 44 24 48 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 RSP: 0018:ffffc90003ad7990 EFLAGS: 00000046 RAX: ffff8880315f53c8 RBX: 0000000000000000 RCX: 0000000000000002 RDX: 1ffff110062bea78 RSI: 1ffff110062bea75 
RDI: ffff8880315f53b8 RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000001 R10: 000000000000000a R11: 0000000000000003 R12: ffffffff8e1bac80 R13: ffff8880315f4880 R14: 0000000000000021 R15: ffff8880315f53b0 FS: 0000000000000000(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000c008d0a000 CR3: 000000000df7e000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <NMI> </NMI> <TASK> lock_acquire.part.0+0x11b/0x380 kernel/locking/lockdep.c:5849 rcu_lock_acquire include/linux/rcupdate.h:337 [inline] rcu_read_lock include/linux/rcupdate.h:849 [inline] gc_worker+0x24d/0x1760 net/netfilter/nf_conntrack_core.c:1532 process_one_work+0x9c5/0x1ba0 kernel/workqueue.c:3229 process_scheduled_works kernel/workqueue.c:3310 [inline] worker_thread+0x6c8/0xf00 kernel/workqueue.c:3391 kthread+0x2c1/0x3a0 kernel/kthread.c:389 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> Tested on: commit: aaec5a95 pipe_read: don't wake up the writer if the pi.. git tree: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git console output: https://syzkaller.appspot.com/x/log.txt?x=11550804580000 kernel config: https://syzkaller.appspot.com/x/.config?x=8d5a2956e94d7972 dashboard link: https://syzkaller.appspot.com/bug?extid=62262fdc0e01d99573fc compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40 patch: https://syzkaller.appspot.com/x/patch.diff?x=10e90804580000 ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-22 15:54 [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter syzbot 2025-03-23 18:48 ` Oleg Nesterov @ 2025-03-28 18:14 ` David Howells 2025-03-28 18:44 ` syzbot 2025-03-28 19:01 ` Oleg Nesterov 2025-08-03 12:09 ` syzbot 2 siblings, 2 replies; 48+ messages in thread From: David Howells @ 2025-03-28 18:14 UTC (permalink / raw) To: syzbot Cc: dhowells, Oleg Nesterov, brauner, jack, jlayton, linux-fsdevel, linux-kernel, netfs, syzkaller-bugs, viro, K Prateek Nayak, Sapkal, Swapnil, Mateusz Guzik #syz test: git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git netfs-fixes There's an inconsistency in some wakeups between using rreq->waitq (a private waitqueue) and using clear_and_wake_up_bit() (a shared global waitqueue). The reason I didn't get round to posting the patches on this branch yet is that they *ought* to fix the problem, but Steve French still sees an occasional hang with cifs - but only in his automated testing thing, and we haven't tracked it down yet.
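Roughly, the two styles being mixed look like this (an illustrative sketch only, using the NETFS_RREQ_IN_PROGRESS bit that the hung task above is sleeping on; not the actual patches):

	/* waiter (cf. netfs_unbuffered_write_iter_locked() in the trace):
	 * sleeps on the shared, global bit-waitqueue keyed by the flag word */
	wait_on_bit(&rreq->flags, NETFS_RREQ_IN_PROGRESS, TASK_UNINTERRUPTIBLE);

	/* waker, style A: clears the bit and wakes that same shared queue */
	clear_and_wake_up_bit(NETFS_RREQ_IN_PROGRESS, &rreq->flags);

	/* waker, style B: wakes only the request's private queue, which a
	 * wait_on_bit() sleeper is not on -- mixing A-style waiters with
	 * B-style wakers (or vice versa) can miss the wakeup entirely */
	clear_bit(NETFS_RREQ_IN_PROGRESS, &rreq->flags);
	wake_up(&rreq->waitq);

David ^ permalink raw reply [flat|nested] 48+ messages in thread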
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-28 18:14 ` David Howells @ 2025-03-28 18:44 ` syzbot 2025-03-28 19:01 ` Oleg Nesterov 1 sibling, 0 replies; 48+ messages in thread From: syzbot @ 2025-03-28 18:44 UTC (permalink / raw) To: brauner, dhowells, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, mjguzik, netfs, oleg, swapnil.sapkal, syzkaller-bugs, viro Hello, syzbot has tested the proposed patch but the reproducer is still triggering an issue: unregister_netdevice: waiting for DEV to become free unregister_netdevice: waiting for batadv0 to become free. Usage count = 3 Tested on: commit: c7fffb8c netfs: Fix wait/wake to be consistent about t.. git tree: git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git netfs-fixes console output: https://syzkaller.appspot.com/x/log.txt?x=15722a4c580000 kernel config: https://syzkaller.appspot.com/x/.config?x=95c3bbe7ce8436a7 dashboard link: https://syzkaller.appspot.com/bug?extid=62262fdc0e01d99573fc compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40 Note: no patches were applied. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-28 18:14 ` David Howells 2025-03-28 18:44 ` syzbot @ 2025-03-28 19:01 ` Oleg Nesterov 1 sibling, 0 replies; 48+ messages in thread From: Oleg Nesterov @ 2025-03-28 19:01 UTC (permalink / raw) To: David Howells Cc: syzbot, brauner, jack, jlayton, linux-fsdevel, linux-kernel, netfs, syzkaller-bugs, viro, K Prateek Nayak, Sapkal, Swapnil, Mateusz Guzik David, I know that I will regret my email tomorrow, but On 03/28, David Howells wrote: > > #syz test: git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git netfs-fixes > > There's an inconsistency in some wakeups between using rreq->waitq (a private > waitqueue) and using clear_and_wake_up_bit() (a shared global waitqueue). Too late for me, I didn't even try to read the "netfs-fixes" changes. And even if I tried, I probably wouldn't understand them ;) But, afaics (I can be easily wrong), the current logic in net/9p/ doesn't look right regardless of any other fixes, no? Oleg. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-22 15:54 [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter syzbot 2025-03-23 18:48 ` Oleg Nesterov 2025-03-28 18:14 ` David Howells @ 2025-08-03 12:09 ` syzbot 2 siblings, 0 replies; 48+ messages in thread From: syzbot @ 2025-08-03 12:09 UTC (permalink / raw) To: asmadeus, brauner, dhowells, eadavis, ericvh, hdanton, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, oleg, pc, sfrench, swapnil.sapkal, syzkaller-bugs, v9fs, viro syzbot suspects this issue was fixed by commit: commit da8cf4bd458722d090a788c6e581eeb72695c62f Author: David Howells <dhowells@redhat.com> Date: Tue Jul 1 16:38:36 2025 +0000 netfs: Fix hang due to missing case in final DIO read result collection bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=14cfbcf0580000 start commit: 66701750d556 Merge tag 'io_uring-6.16-20250630' of git://g.. git tree: upstream kernel config: https://syzkaller.appspot.com/x/.config?x=e29b8115bf337f53 dashboard link: https://syzkaller.appspot.com/bug?extid=62262fdc0e01d99573fc syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1400348c580000 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=17b2048c580000 If the result looks correct, please mark the issue as fixed by replying with: #syz fix: netfs: Fix hang due to missing case in final DIO read result collection For information about bisection process see: https://goo.gl/tpsmEJ#bisection ^ permalink raw reply [flat|nested] 48+ messages in thread
end of thread, other threads:[~2025-08-03 12:09 UTC | newest]

Thread overview: 48+ messages
2025-03-22 15:54 [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter syzbot
2025-03-23 18:48 ` Oleg Nesterov
2025-03-23 19:17 ` syzbot
2025-03-23 19:47 ` Oleg Nesterov
2025-03-23 19:50 ` K Prateek Nayak
2025-03-23 19:52 ` Mateusz Guzik
2025-03-23 21:02 ` Oleg Nesterov
2025-03-24 10:47 ` K Prateek Nayak
2025-03-24 11:15 ` syzbot
2025-03-24 13:17 ` Mateusz Guzik
2025-03-24 13:19 ` K Prateek Nayak
2025-03-24 14:52 ` K Prateek Nayak
2025-03-24 16:03 ` Mateusz Guzik
2025-03-24 16:25 ` K Prateek Nayak
2025-03-24 16:36 ` K Prateek Nayak
2025-03-25  2:52 ` K Prateek Nayak
2025-03-25 12:15 ` Oleg Nesterov
2025-03-25 12:36 ` Dominique Martinet
2025-03-25 13:04 ` Oleg Nesterov
2025-03-25 14:49 ` K Prateek Nayak
2025-03-25 14:58 ` Dominique Martinet
2025-03-26 12:19 ` Oleg Nesterov
2025-03-26 12:44 ` Oleg Nesterov
2025-03-26 13:05 ` Oleg Nesterov
2025-03-27 17:46 ` K Prateek Nayak
2025-03-27 21:19 ` syzbot
2025-03-27 22:18 ` asmadeus
2025-03-28  4:01 ` K Prateek Nayak
2025-03-28  4:43 ` syzbot
2025-03-28 13:06 ` Oleg Nesterov
2025-03-28 13:07 ` syzbot
2025-03-28 13:25 ` Oleg Nesterov
2025-03-28 13:49 ` syzbot
2025-03-28 14:49 ` Oleg Nesterov
2025-03-28 15:22 ` syzbot
2025-03-28 17:00 ` Oleg Nesterov
2025-03-28 17:56 ` K Prateek Nayak
2025-03-28 18:20 ` Oleg Nesterov
2025-03-29  0:00 ` asmadeus
2025-03-29 14:21 ` Oleg Nesterov
2025-03-29 23:27 ` asmadeus
2025-03-30 10:21 ` Oleg Nesterov
2025-03-23 20:03 ` Mateusz Guzik
2025-03-23 20:43 ` syzbot
2025-03-28 18:14 ` David Howells
2025-03-28 18:44 ` syzbot
2025-03-28 19:01 ` Oleg Nesterov
2025-08-03 12:09 ` syzbot