* [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter
@ 2025-03-22 15:54 syzbot
  2025-03-23 18:48 ` Oleg Nesterov
  ` (2 more replies)
  0 siblings, 3 replies; 48+ messages in thread
From: syzbot @ 2025-03-22 15:54 UTC (permalink / raw)
  To: brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel,
      netfs, oleg, syzkaller-bugs, viro

Hello,

syzbot found the following issue on:

HEAD commit:    fc444ada1310 Merge tag 'soc-fixes-6.14-2' of git://git.ker..
git tree:       upstream
console+strace: https://syzkaller.appspot.com/x/log.txt?x=1397319b980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=2e330e9768b5b8ff
dashboard link: https://syzkaller.appspot.com/bug?extid=62262fdc0e01d99573fc
compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1057319b980000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=10d6a44c580000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/924e6055daef/disk-fc444ada.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/0cd40093a53e/vmlinux-fc444ada.xz
kernel image: https://storage.googleapis.com/syzbot-assets/7370bbe4e1b8/bzImage-fc444ada.xz

The issue was bisected to:

commit aaec5a95d59615523db03dd53c2052f0a87beea7
Author: Oleg Nesterov <oleg@redhat.com>
Date:   Thu Jan 2 14:07:15 2025 +0000

    pipe_read: don't wake up the writer if the pipe is still full

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=14b6b19b980000
final oops:     https://syzkaller.appspot.com/x/report.txt?x=16b6b19b980000
console output: https://syzkaller.appspot.com/x/log.txt?x=12b6b19b980000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+62262fdc0e01d99573fc@syzkaller.appspotmail.com
Fixes: aaec5a95d596 ("pipe_read: don't wake up the writer if the pipe is still full")

INFO: task syz-executor309:5842 blocked for more than 143 seconds.
      Not tainted 6.14.0-rc7-syzkaller-00050-gfc444ada1310 #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor309 state:D stack:24944 pid:5842 tgid:5842 ppid:5838 task_flags:0x400140 flags:0x00004006
Call Trace:
 <TASK>
 context_switch kernel/sched/core.c:5378 [inline]
 __schedule+0xf43/0x5890 kernel/sched/core.c:6765
 __schedule_loop kernel/sched/core.c:6842 [inline]
 schedule+0xe7/0x350 kernel/sched/core.c:6857
 bit_wait+0x15/0xe0 kernel/sched/wait_bit.c:237
 __wait_on_bit+0x62/0x180 kernel/sched/wait_bit.c:49
 out_of_line_wait_on_bit+0xda/0x110 kernel/sched/wait_bit.c:64
 wait_on_bit include/linux/wait_bit.h:77 [inline]
 netfs_unbuffered_write_iter_locked+0xaa1/0xd30 fs/netfs/direct_write.c:108
 netfs_unbuffered_write_iter+0x413/0x6d0 fs/netfs/direct_write.c:195
 v9fs_file_write_iter+0xbf/0x100 fs/9p/vfs_file.c:404
 new_sync_write fs/read_write.c:586 [inline]
 vfs_write+0x5ae/0x1150 fs/read_write.c:679
 ksys_write+0x12b/0x250 fs/read_write.c:731
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7ff138d69f79
RSP: 002b:00007ffe82a84918 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0030656c69662f2e RCX: 00007ff138d69f79
RDX: 0000000000007fec RSI: 0000400000000540 RDI: 0000000000000007
RBP: 0000400000000000 R08: 0000000000000006 R09: 0000000000000006
R10: 0000000000000006 R11: 0000000000000246 R12: 0000400000000280
R13: 00007ff138db304e R14: 0000000000000001 R15: 0000000000000001
 </TASK>

Showing all locks held in the system:
1 lock held by khungtaskd/31:
 #0: ffffffff8e1bd140 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:337 [inline]
 #0: ffffffff8e1bd140 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:849 [inline]
 #0: ffffffff8e1bd140 (rcu_read_lock){....}-{1:3}, at: debug_show_all_locks+0x7f/0x390 kernel/locking/lockdep.c:6746
2 locks held by kworker/u8:3/58:
 #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3213
 #1: ffffc9000123fd18 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3214
2 locks held by getty/5585:
 #0: ffff88803117c0a0 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x24/0x80 drivers/tty/tty_ldisc.c:243
 #1: ffffc90002fde2f0 (&ldata->atomic_read_lock){+.+.}-{4:4}, at: n_tty_read+0xfba/0x1480 drivers/tty/n_tty.c:2211
2 locks held by syz-executor309/5842:
 #0: ffff888030b1a420 (sb_writers#10){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731
 #1: ffff88807ae98148 (&sb->s_type->i_mutex_key#15){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188

=============================================

NMI backtrace for cpu 0
CPU: 0 UID: 0 PID: 31 Comm: khungtaskd Not tainted 6.14.0-rc7-syzkaller-00050-gfc444ada1310 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120
 nmi_cpu_backtrace+0x27b/0x390 lib/nmi_backtrace.c:113
 nmi_trigger_cpumask_backtrace+0x29c/0x300 lib/nmi_backtrace.c:62
 trigger_all_cpu_backtrace include/linux/nmi.h:162 [inline]
 check_hung_uninterruptible_tasks kernel/hung_task.c:236 [inline]
 watchdog+0xf62/0x12b0 kernel/hung_task.c:399
 kthread+0x3af/0x750 kernel/kthread.c:464
 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:148
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
 </TASK>
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1 skipped: idling at native_safe_halt arch/x86/include/asm/irqflags.h:48 [inline]
NMI backtrace for cpu 1 skipped: idling at arch_safe_halt arch/x86/include/asm/irqflags.h:106 [inline]
NMI backtrace for cpu 1 skipped: idling at acpi_safe_halt+0x1a/0x20 drivers/acpi/processor_idle.c:111

---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

^ permalink raw reply	[flat|nested] 48+ messages in thread
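For context on the mechanics of the hang above: the blocked task is parked in
wait_on_bit() on the NETFS_RREQ_IN_PROGRESS flag of its write request, so it
can only resume once some other context clears that bit and issues the paired
wakeup. A minimal sketch of that generic kernel pattern (not the exact netfs
code) is:

    /* Waiter side: sleep in D state until BIT clears in *flags. */
    wait_on_bit(&flags, BIT, TASK_UNINTERRUPTIBLE);

    /* Waker side: clear the bit, then wake any bit-waiters. The barrier
     * orders the clear before the waitqueue check inside wake_up_bit(). */
    clear_bit_unlock(BIT, &flags);
    smp_mb__after_atomic();
    wake_up_bit(&flags, BIT);

If the waker side never runs, for instance because an I/O completion is lost,
the waiter sleeps uninterruptibly forever, which is exactly what the
hung-task watchdog reports here.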
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter
  2025-03-22 15:54 [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter syzbot @ 2025-03-23 18:48 ` Oleg Nesterov 2025-03-23 19:17 ` syzbot 2025-03-28 18:14 ` David Howells 2025-08-03 12:09 ` syzbot
  2 siblings, 1 reply; 48+ messages in thread
From: Oleg Nesterov @ 2025-03-23 18:48 UTC (permalink / raw)
  To: syzbot
  Cc: brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel,
      netfs, syzkaller-bugs, viro, K Prateek Nayak, Sapkal, Swapnil,
      Mateusz Guzik

On 03/22, syzbot wrote:
>
> HEAD commit:    fc444ada1310 Merge tag 'soc-fixes-6.14-2' of git://git.ker..
> git tree:       upstream
> console+strace: https://syzkaller.appspot.com/x/log.txt?x=1397319b980000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=2e330e9768b5b8ff
> dashboard link: https://syzkaller.appspot.com/bug?extid=62262fdc0e01d99573fc
> compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1057319b980000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=10d6a44c580000
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/924e6055daef/disk-fc444ada.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/0cd40093a53e/vmlinux-fc444ada.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/7370bbe4e1b8/bzImage-fc444ada.xz
>
> The issue was bisected to:
>
> commit aaec5a95d59615523db03dd53c2052f0a87beea7
> Author: Oleg Nesterov <oleg@redhat.com>
> Date:   Thu Jan 2 14:07:15 2025 +0000
>
>     pipe_read: don't wake up the writer if the pipe is still full

OMG :/

Just to ensure it does not help,

#syz test: upstream aaec5a95d59615523db03dd53c2052f0a87beea7

diff --git a/fs/pipe.c b/fs/pipe.c
index 82fede0f2111..7e36f54d21a5 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -417,8 +417,8 @@ static inline int is_packetized(struct file *file)
 /* Done while waiting without holding the pipe lock - thus the READ_ONCE() */
 static inline bool pipe_writable(const struct pipe_inode_info *pipe)
 {
-	unsigned int head = READ_ONCE(pipe->head);
 	unsigned int tail = READ_ONCE(pipe->tail);
+	unsigned int head = READ_ONCE(pipe->head);
 	unsigned int max_usage = READ_ONCE(pipe->max_usage);
 
 	return !pipe_full(head, tail, max_usage) ||

^ permalink raw reply related	[flat|nested] 48+ messages in thread
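For reference, pipe_writable() computes fullness from two independently
loaded cursors; pipe_full() is approximately the following, per
include/linux/pipe_fs_i.h of this era:

    static inline unsigned int pipe_occupancy(unsigned int head, unsigned int tail)
    {
            return head - tail;
    }

    static inline bool pipe_full(unsigned int head, unsigned int tail,
                                 unsigned int limit)
    {
            return pipe_occupancy(head, tail) >= limit;
    }

Because head and tail are loaded without the pipe lock, the order of the two
READ_ONCE() loads determines which cursor can be stale relative to the other;
the swap in the test patch probes whether that staleness can make a waiting
writer keep seeing the pipe as full.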
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-23 18:48 ` Oleg Nesterov @ 2025-03-23 19:17 ` syzbot 2025-03-23 19:47 ` Oleg Nesterov 2025-03-23 20:03 ` Mateusz Guzik 0 siblings, 2 replies; 48+ messages in thread From: syzbot @ 2025-03-23 19:17 UTC (permalink / raw) To: brauner, dhowells, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, mjguzik, netfs, oleg, swapnil.sapkal, syzkaller-bugs, viro Hello, syzbot has tested the proposed patch but the reproducer is still triggering an issue: INFO: task hung in netfs_unbuffered_write_iter INFO: task syz.0.17:6724 blocked for more than 143 seconds. Not tainted 6.13.0-rc1-syzkaller-00017-gaaec5a95d596-dirty #0 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:syz.0.17 state:D stack:24768 pid:6724 tgid:6722 ppid:6593 flags:0x00004004 Call Trace: <TASK> context_switch kernel/sched/core.c:5369 [inline] __schedule+0xe58/0x5ad0 kernel/sched/core.c:6756 __schedule_loop kernel/sched/core.c:6833 [inline] schedule+0xe7/0x350 kernel/sched/core.c:6848 bit_wait+0x15/0xe0 kernel/sched/wait_bit.c:237 __wait_on_bit+0x62/0x180 kernel/sched/wait_bit.c:49 out_of_line_wait_on_bit+0xda/0x110 kernel/sched/wait_bit.c:64 wait_on_bit include/linux/wait_bit.h:77 [inline] netfs_unbuffered_write_iter_locked+0xba8/0xe70 fs/netfs/direct_write.c:105 netfs_unbuffered_write_iter+0x413/0x6d0 fs/netfs/direct_write.c:193 v9fs_file_write_iter+0xbf/0x100 fs/9p/vfs_file.c:404 new_sync_write fs/read_write.c:586 [inline] vfs_write+0x5ae/0x1150 fs/read_write.c:679 ksys_write+0x12b/0x250 fs/read_write.c:731 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fce81f8d169 RSP: 002b:00007fce82d75038 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00007fce821a5fa0 RCX: 00007fce81f8d169 RDX: 0000000000007fec RSI: 0000400000000540 RDI: 0000000000000007 RBP: 00007fce8200e2a0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 00007fce821a5fa0 R15: 00007ffc98f4b948 </TASK> Showing all locks held in the system: 2 locks held by kworker/u8:0/11: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc90000107d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:1/12: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc90000117d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 1 lock held by khungtaskd/30: #0: ffffffff8e1bac80 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:337 [inline] #0: ffffffff8e1bac80 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:849 [inline] #0: ffffffff8e1bac80 (rcu_read_lock){....}-{1:3}, at: debug_show_all_locks+0x7f/0x390 kernel/locking/lockdep.c:6744 2 locks held by kworker/u8:2/35: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc90000ab7d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:3/56: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: 
process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc9000121fd80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:4/81: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc9000217fd80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:5/1317: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc90004897d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:6/3513: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc9000cadfd80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:7/3805: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc9000d66fd80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by getty/5580: #0: ffff88803216c0a0 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x24/0x80 drivers/tty/tty_ldisc.c:243 #1: ffffc90002fde2f0 (&ldata->atomic_read_lock){+.+.}-{4:4}, at: n_tty_read+0xfba/0x1480 drivers/tty/n_tty.c:2211 3 locks held by syz.0.17/6724: #0: ffff8880227fd438 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff88807965c420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff8880749e0148 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.1.19/6772: #0: ffff888028adf438 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff88802984c420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff888074838148 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.2.21/6792: #0: ffff888034b1bcf8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff88805ffda420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff8880748387b8 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 2 locks held by kworker/u8:8/6807: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc90003157d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 3 locks held by syz.3.23/6812: #0: ffff88807c3c70b8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff888026990420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff888074838e28 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.4.25/6838: #0: ffff8880794df438 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff88807c934420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff8880749e07b8 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: 
netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.5.27/6864: #0: ffff8880248b07f8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff88805ccc4420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff888074839498 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 2 locks held by kworker/u8:11/6865: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc90003117d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 3 locks held by syz.6.29/6891: #0: ffff8880355f7eb8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff888078eb2420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff888074839b08 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.7.31/6917: #0: ffff88807f33f0b8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff88807613e420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff88807483a178 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.8.34/6953: #0: ffff88806f5b4478 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff888078710420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff88807483ae58 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.9.36/6980: #0: ffff888029c18ef8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff888033c44420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff88807483b4c8 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 ============================================= NMI backtrace for cpu 0 CPU: 0 UID: 0 PID: 30 Comm: khungtaskd Not tainted 6.13.0-rc1-syzkaller-00017-gaaec5a95d596-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120 nmi_cpu_backtrace+0x27b/0x390 lib/nmi_backtrace.c:113 nmi_trigger_cpumask_backtrace+0x29c/0x300 lib/nmi_backtrace.c:62 trigger_all_cpu_backtrace include/linux/nmi.h:162 [inline] check_hung_uninterruptible_tasks kernel/hung_task.c:234 [inline] watchdog+0xf14/0x1240 kernel/hung_task.c:397 kthread+0x2c1/0x3a0 kernel/kthread.c:389 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> Sending NMI from CPU 0 to CPUs 1: NMI backtrace for cpu 1 CPU: 1 UID: 0 PID: 6813 Comm: kworker/u8:9 Not tainted 6.13.0-rc1-syzkaller-00017-gaaec5a95d596-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025 Workqueue: events_unbound nsim_dev_trap_report_work RIP: 0010:orc_ip arch/x86/kernel/unwind_orc.c:80 [inline] RIP: 0010:__orc_find+0x70/0xf0 arch/x86/kernel/unwind_orc.c:102 Code: ec 72 4e 4c 89 e2 48 29 ea 48 89 d6 48 c1 ea 3f 48 c1 fe 02 48 01 f2 48 d1 fa 48 8d 5c 95 00 48 89 da 48 c1 ea 03 0f b6 34 0a <48> 89 da 83 e2 07 83 c2 03 40 38 f2 7c 05 40 84 f6 75 4b 48 63 13 RSP: 0018:ffffc900031476d8 EFLAGS: 
00000a06 RAX: ffffffff90f0214a RBX: ffffffff906e9fd0 RCX: dffffc0000000000 RDX: 1ffffffff20dd3fa RSI: 0000000000000000 RDI: ffffffff906e9fa8 RBP: ffffffff906e9fd0 R08: ffffffff90f0218c R09: ffffffff90f117cc R10: ffffc900031477d8 R11: 0000000000004070 R12: ffffffff906e9fd0 R13: ffffffff814070f3 R14: ffffffff906e9fa8 R15: ffffffff906e9fcc FS: 0000000000000000(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000c008cfd000 CR3: 000000000df7e000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <NMI> </NMI> <TASK> orc_find arch/x86/kernel/unwind_orc.c:227 [inline] unwind_next_frame+0x2be/0x20c0 arch/x86/kernel/unwind_orc.c:494 __unwind_start+0x45f/0x7f0 arch/x86/kernel/unwind_orc.c:760 unwind_start arch/x86/include/asm/unwind.h:64 [inline] arch_stack_walk+0x74/0x100 arch/x86/kernel/stacktrace.c:24 stack_trace_save+0x95/0xd0 kernel/stacktrace.c:122 kasan_save_stack+0x33/0x60 mm/kasan/common.c:47 kasan_save_track+0x14/0x30 mm/kasan/common.c:68 kasan_save_free_info+0x3b/0x60 mm/kasan/generic.c:582 poison_slab_object mm/kasan/common.c:247 [inline] __kasan_slab_free+0x51/0x70 mm/kasan/common.c:264 kasan_slab_free include/linux/kasan.h:233 [inline] slab_free_hook mm/slub.c:2338 [inline] slab_free mm/slub.c:4598 [inline] kmem_cache_free+0x152/0x4c0 mm/slub.c:4700 kfree_skbmem+0x1a4/0x1f0 net/core/skbuff.c:1148 __kfree_skb net/core/skbuff.c:1205 [inline] consume_skb net/core/skbuff.c:1436 [inline] consume_skb+0xcc/0x100 net/core/skbuff.c:1430 nsim_dev_trap_report drivers/net/netdevsim/dev.c:821 [inline] nsim_dev_trap_report_work+0x8cf/0xd00 drivers/net/netdevsim/dev.c:851 process_one_work+0x9c5/0x1ba0 kernel/workqueue.c:3229 process_scheduled_works kernel/workqueue.c:3310 [inline] worker_thread+0x6c8/0xf00 kernel/workqueue.c:3391 kthread+0x2c1/0x3a0 kernel/kthread.c:389 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> Tested on: commit: aaec5a95 pipe_read: don't wake up the writer if the pi.. git tree: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git console output: https://syzkaller.appspot.com/x/log.txt?x=169ac43f980000 kernel config: https://syzkaller.appspot.com/x/.config?x=8d5a2956e94d7972 dashboard link: https://syzkaller.appspot.com/bug?extid=62262fdc0e01d99573fc compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40 patch: https://syzkaller.appspot.com/x/patch.diff?x=17803c4c580000 ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter
  2025-03-23 19:17 ` syzbot @ 2025-03-23 19:47 ` Oleg Nesterov 2025-03-23 19:50 ` K Prateek Nayak 2025-03-23 19:52 ` Mateusz Guzik
  0 siblings, 2 replies; 48+ messages in thread
From: Oleg Nesterov @ 2025-03-23 19:47 UTC (permalink / raw)
  To: syzbot
  Cc: brauner, dhowells, jack, jlayton, kprateek.nayak, linux-fsdevel,
      linux-kernel, mjguzik, netfs, swapnil.sapkal, syzkaller-bugs, viro

On 03/23, syzbot wrote:
>
> Hello,
>
> syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> INFO: task hung in netfs_unbuffered_write_iter

OK, as expected.

Dear syzbot, thank you.

So far I think this is another problem revealed by aaec5a95d59615523db03dd5
("pipe_read: don't wake up the writer if the pipe is still full").

I am going to forget about this report for now and return to it later, when
all the pending pipe-related changes in vfs.git are merged.

Oleg.

^ permalink raw reply	[flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter
  2025-03-23 19:47 ` Oleg Nesterov @ 2025-03-23 19:50 ` K Prateek Nayak
  0 siblings, 0 replies; 48+ messages in thread
From: K Prateek Nayak @ 2025-03-23 19:50 UTC (permalink / raw)
  To: Oleg Nesterov, syzbot
  Cc: brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel,
      mjguzik, netfs, swapnil.sapkal, syzkaller-bugs, viro

Hello Oleg,

On 3/24/2025 1:17 AM, Oleg Nesterov wrote:
> On 03/23, syzbot wrote:
>>
>> Hello,
>>
>> syzbot has tested the proposed patch but the reproducer is still triggering an issue:
>> INFO: task hung in netfs_unbuffered_write_iter
>
> OK, as expected.
>
> Dear syzbot, thank you.
>
> So far I think this is another problem revealed by aaec5a95d59615523db03dd5
> ("pipe_read: don't wake up the writer if the pipe is still full").
>
> I am going to forget about this report for now and return to it later, when
> all the pending pipe-related changes in vfs.git are merged.

P.S. I'm trying to repro this on my machine on latest upstream. Haven't
gotten lucky yet with the C reproducer with my config. I'm moving to
the syzbot config to see if it reveals something. If I have something
I'll report back here.

>
> Oleg.
>

--
Thanks and Regards,
Prateek

^ permalink raw reply	[flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter
  2025-03-23 19:47 ` Oleg Nesterov 2025-03-23 19:50 ` K Prateek Nayak @ 2025-03-23 19:52 ` Mateusz Guzik 2025-03-23 21:02 ` Oleg Nesterov
  1 sibling, 1 reply; 48+ messages in thread
From: Mateusz Guzik @ 2025-03-23 19:52 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: syzbot, brauner, dhowells, jack, jlayton, kprateek.nayak,
      linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro

On Sun, Mar 23, 2025 at 8:47 PM Oleg Nesterov <oleg@redhat.com> wrote:
>
> On 03/23, syzbot wrote:
> >
> > Hello,
> >
> > syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> > INFO: task hung in netfs_unbuffered_write_iter
>
> OK, as expected.
>
> Dear syzbot, thank you.
>
> So far I think this is another problem revealed by aaec5a95d59615523db03dd5
> ("pipe_read: don't wake up the writer if the pipe is still full").
>
> I am going to forget about this report for now and return to it later, when
> all the pending pipe-related changes in vfs.git are merged.
>

How do you ask syzbot for all stacks?

The reproducer *does* use pipes, but it is unclear to me if they play
any role here -- and notably we don't know if there is someone stuck
in pipe code, resulting in not waking up the reported thread.

--
Mateusz Guzik <mjguzik gmail.com>

^ permalink raw reply	[flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter
  2025-03-23 19:52 ` Mateusz Guzik @ 2025-03-23 21:02 ` Oleg Nesterov 2025-03-24 10:47 ` K Prateek Nayak
  0 siblings, 1 reply; 48+ messages in thread
From: Oleg Nesterov @ 2025-03-23 21:02 UTC (permalink / raw)
  To: Mateusz Guzik
  Cc: syzbot, brauner, dhowells, jack, jlayton, kprateek.nayak,
      linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro

Prateek, Mateusz, thanks for your participation!

On 03/23, Mateusz Guzik wrote:
>
> On Sun, Mar 23, 2025 at 8:47 PM Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > OK, as expected.
> >
> > Dear syzbot, thank you.
> >
> > So far I think this is another problem revealed by aaec5a95d59615523db03dd5
> > ("pipe_read: don't wake up the writer if the pipe is still full").
> >
> > I am going to forget about this report for now and return to it later, when
> > all the pending pipe-related changes in vfs.git are merged.
> >
>
> How do you ask syzbot for all stacks?

Heh, I don't know.

> The reproducer *does* use pipes, but it is unclear to me if they play
> any role here

please see the reproducer,

	https://syzkaller.appspot.com/x/repro.c?x=10d6a44c580000

	res = syscall(__NR_pipe2, /*pipefd=*/0x400000001900ul, /*flags=*/0ul);
	if (res != -1) {
		r[2] = *(uint32_t*)0x400000001900;
		r[3] = *(uint32_t*)0x400000001904;
	}

then

	res = syscall(__NR_dup, /*oldfd=*/r[3]);
	if (res != -1)
		r[4] = res;

so r[2] and r[4] are the read/write fd's.

then later

	memcpy((void*)0x400000000280, "trans=fd,", 9);
	...
	memcpy((void*)0x400000000289, "rfdno", 5);
	...
	sprintf((char*)0x40000000028f, "0x%016llx", (long long)r[2]);
	...
	memcpy((void*)0x4000000002a2, "wfdno", 5);
	...
	sprintf((char*)0x4000000002a8, "0x%016llx", (long long)r[4]);
	...
	syscall(__NR_mount, /*src=*/0ul, /*dst=*/0x400000000000ul,
		/*type=*/0x400000000040ul, /*flags=*/0ul, /*opts=*/0x400000000280ul);

so this pipe is actually used as "trans=fd".

> -- and notably we don't know if there is someone stuck
> in pipe code, resulting in not waking up the reported thread.

Yes, I am not familiar with 9p or netfs, so I don't know either.

Oleg.

^ permalink raw reply	[flat|nested] 48+ messages in thread
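Stripped of the raw syscall plumbing, the reproducer's setup is equivalent to
something like the following plain C (a sketch; the mount target and the
missing error handling are illustrative, not taken from the reproducer):

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/mount.h>

    int main(void)
    {
            int pfd[2];
            char opts[64];

            pipe(pfd);              /* pfd[0]: read end, pfd[1]: write end */
            int wfd = dup(pfd[1]);  /* duplicate of the write end */

            /* 9p over the fd transport: the client reads replies from
             * rfdno and sends requests to wfdno -- here both fds refer
             * to the same pipe, so the client effectively talks to
             * itself. */
            snprintf(opts, sizeof(opts), "trans=fd,rfdno=%d,wfdno=%d",
                     pfd[0], wfd);
            mount(NULL, "/mnt", "9p", 0, opts);
            return 0;
    }

So there is no real 9p server behind the mount: the transport is both ends of
one ordinary pipe, and any stall on the pipe side translates directly into
stuck 9p/netfs I/O.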
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter
  2025-03-23 21:02 ` Oleg Nesterov @ 2025-03-24 10:47 ` K Prateek Nayak 2025-03-24 11:15 ` syzbot 2025-03-24 13:17 ` Mateusz Guzik
  0 siblings, 2 replies; 48+ messages in thread
From: K Prateek Nayak @ 2025-03-24 10:47 UTC (permalink / raw)
  To: Oleg Nesterov, Mateusz Guzik
  Cc: syzbot, brauner, dhowells, jack, jlayton, linux-fsdevel,
      linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro

Hello Oleg, Mateusz,

On 3/24/2025 2:32 AM, Oleg Nesterov wrote:
> Prateek, Mateusz, thanks for your participation!
>
> On 03/23, Mateusz Guzik wrote:
>>
>> On Sun, Mar 23, 2025 at 8:47 PM Oleg Nesterov <oleg@redhat.com> wrote:
>>>
>>> OK, as expected.
>>>
>>> Dear syzbot, thank you.
>>>
>>> So far I think this is another problem revealed by aaec5a95d59615523db03dd5
>>> ("pipe_read: don't wake up the writer if the pipe is still full").
>>>
>>> I am going to forget about this report for now and return to it later, when
>>> all the pending pipe-related changes in vfs.git are merged.
>>>
>>
>> How do you ask syzbot for all stacks?
>
> Heh, I don't know.
>
>> The reproducer *does* use pipes, but it is unclear to me if they play
>> any role here
>
> please see the reproducer,
>
> https://syzkaller.appspot.com/x/repro.c?x=10d6a44c580000
>
> res = syscall(__NR_pipe2, /*pipefd=*/0x400000001900ul, /*flags=*/0ul);
> if (res != -1) {
> r[2] = *(uint32_t*)0x400000001900;
> r[3] = *(uint32_t*)0x400000001904;
> }
>
> then
>
> res = syscall(__NR_dup, /*oldfd=*/r[3]);
> if (res != -1)
> r[4] = res;
>
> so r[2] and r[4] are the read/write fd's.
>
> then later
>
> memcpy((void*)0x400000000280, "trans=fd,", 9);
> ...
> memcpy((void*)0x400000000289, "rfdno", 5);
> ...
> sprintf((char*)0x40000000028f, "0x%016llx", (long long)r[2]);
> ...
> memcpy((void*)0x4000000002a2, "wfdno", 5);
> ...
> sprintf((char*)0x4000000002a8, "0x%016llx", (long long)r[4]);
> ...
> syscall(__NR_mount, /*src=*/0ul, /*dst=*/0x400000000000ul,
> /*type=*/0x400000000040ul, /*flags=*/0ul, /*opts=*/0x400000000280ul);
>
> so this pipe is actually used as "trans=fd".
>
>> -- and notably we don't know if there is someone stuck
>> in pipe code, resulting in not waking up the reported thread.
>
> Yes, I am not familiar with 9p or netfs, so I don't know either.

Didn't have any luck reproducing this yet but I'm looking at
https://syzkaller.appspot.com/x/log.txt?x=1397319b980000
which is the trimmed log from the original report, and I see ...

[pid 5842] creat("./file0", 000) = 7
[ 137.753309][ T30] audit: type=1400 audit(1742312362.045:90): avc: denied { mount } for pid=5842 comm="syz-executor309" name="/" dev="9p" ino=2 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=filesystem permissive=1
[ 137.775741][ T30] audit: type=1400 audit(1742312362.065:91): avc: denied { setattr } for pid=5842 comm="syz-executor309" name="/" dev="9p" ino=2 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=file permissive=1
[ 137.798215][ T30] audit: type=1400 audit(1742312362.075:92): avc: denied { write } for pid=5842 comm="syz-executor309" dev="9p" ino=2 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=file permissive=1
[ 137.819189][ T30] audit: type=1400 audit(1742312362.075:93): avc: denied { open } for pid=5842 comm="syz-executor309" path="/file0" dev="9p" ino=2 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=file permissive=1
[pid 5842] write(7, "\x08\x00\x00\x00\x1a\x17\x92\x4a\xb2\x18\xea\xcb\x15\xa3\xfc\xcf\x92\x9e\x2d\xd2\x49\x79\x03\xc1\xf8\x53\xd9\x5b\x99\x5c\x65\xe9\x94\x49\xff\x95\x3f\xa1\x1c\x77\x23\xb2\x14\x9e\xcd\xaa\x7f\x83\x3f\x60\xe1\x3b\x19\xa6\x6e\x96\x3f\x7e\x8d\xa4\x29\x7e\xbb\xfd\xda\x5b\x36\xfb\x4d\x01\xbd\x02\xe6\xc6\x52\xdc\x4d\x99\xe2\xcb\x82\xc2\xa1\xd4\xa4\x5e\x4c\x89\xba\x99\x94\xe8\x2f\x85\x4b\xbc\x34\xa4\x0b\x3a"..., 32748 <unfinished ...>

So we have a "write" denied for pid 5842, and then it tries a write that
seems to hang. In all the hang traces, I see a denial for a task
followed by a hang for a task in the same tgid.

But since this is a "permissive" policy, it should not cause a hang,
only report that the program is in violation. Also I have no clue how a
spurious wakeup of a writer could then lead to progress.

Since in all cases the thread info flags "flags:0x00004006" has the
TIF_NOTIFY_SIGNAL bit set, I'm wondering if it has something to do with
the fact that pipe_read() directly returns -ERESTARTSYS in case of a
pending signal without any wakeups?

Well, here goes nothing I guess; totally a shot in the dark:

P.S. I think it should be wake_next_reader but if this does not hang,
perhaps it could point to some interaction with signal pending and
wakeup. There are some EPOLL semantics in pipe_write() which will cause
readers to wake up if the writer has signal_pending() and
pipe->poll_usage is set.

#syz test: upstream aaec5a95d59615523db03dd53c2052f0a87beea7

diff --git a/fs/pipe.c b/fs/pipe.c
index 82fede0f2111..9efeb86eaac5 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -359,6 +359,8 @@ pipe_read(struct kiocb *iocb, struct iov_iter *to)
 			ret = -EAGAIN;
 			break;
 		}
+		if (signal_pending(current) && pipe_full(pipe->head, pipe->tail, pipe->max_usage))
+			wake_writer = true;
 		mutex_unlock(&pipe->mutex);
 
 		/*

^ permalink raw reply related	[flat|nested] 48+ messages in thread
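The hypothesis rests on the classic interruptible-wait shape, in which a
pending signal aborts the read without waking the other side; schematically
(a simplified sketch, not the exact pipe_read() of this tree):

    for (;;) {
            /* ... copy data out of the pipe, advancing pipe->tail ... */

            if (signal_pending(current)) {
                    if (!ret)
                            ret = -ERESTARTSYS;
                    break;  /* bail out; no writer wakeup on this path */
            }

            /* ... unlock and sleep until the pipe becomes readable ... */
    }

The test diff above forces a writer wakeup whenever a signal is pending while
the pipe is still full, to see whether a wakeup missing on that path is what
starves the 9p transport.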
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-24 10:47 ` K Prateek Nayak @ 2025-03-24 11:15 ` syzbot 2025-03-24 13:17 ` Mateusz Guzik 1 sibling, 0 replies; 48+ messages in thread From: syzbot @ 2025-03-24 11:15 UTC (permalink / raw) To: brauner, dhowells, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, mjguzik, netfs, oleg, swapnil.sapkal, syzkaller-bugs, viro Hello, syzbot has tested the proposed patch but the reproducer is still triggering an issue: INFO: task hung in netfs_unbuffered_write_iter INFO: task syz.0.18:6758 blocked for more than 143 seconds. Not tainted 6.13.0-rc1-syzkaller-00017-gaaec5a95d596-dirty #0 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:syz.0.18 state:D stack:26608 pid:6758 tgid:6757 ppid:6540 flags:0x00004004 Call Trace: <TASK> context_switch kernel/sched/core.c:5369 [inline] __schedule+0xe58/0x5ad0 kernel/sched/core.c:6756 __schedule_loop kernel/sched/core.c:6833 [inline] schedule+0xe7/0x350 kernel/sched/core.c:6848 bit_wait+0x15/0xe0 kernel/sched/wait_bit.c:237 __wait_on_bit+0x62/0x180 kernel/sched/wait_bit.c:49 out_of_line_wait_on_bit+0xda/0x110 kernel/sched/wait_bit.c:64 wait_on_bit include/linux/wait_bit.h:77 [inline] netfs_unbuffered_write_iter_locked+0xba8/0xe70 fs/netfs/direct_write.c:105 netfs_unbuffered_write_iter+0x413/0x6d0 fs/netfs/direct_write.c:193 v9fs_file_write_iter+0xbf/0x100 fs/9p/vfs_file.c:404 new_sync_write fs/read_write.c:586 [inline] vfs_write+0x5ae/0x1150 fs/read_write.c:679 ksys_write+0x12b/0x250 fs/read_write.c:731 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fae42f8d169 RSP: 002b:00007fae43eb2038 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00007fae431a5fa0 RCX: 00007fae42f8d169 RDX: 0000000000007fec RSI: 0000400000000540 RDI: 0000000000000007 RBP: 00007fae4300e2a0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 00007fae431a5fa0 R15: 00007ffef04451c8 </TASK> Showing all locks held in the system: 2 locks held by kworker/u8:1/12: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc90000117d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 1 lock held by khungtaskd/30: #0: ffffffff8e1bac80 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:337 [inline] #0: ffffffff8e1bac80 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:849 [inline] #0: ffffffff8e1bac80 (rcu_read_lock){....}-{1:3}, at: debug_show_all_locks+0x7f/0x390 kernel/locking/lockdep.c:6744 2 locks held by kworker/u8:3/52: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc90000bd7d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:4/63: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc9000214fd80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:5/1034: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 
kernel/workqueue.c:3204 #1: ffffc9000403fd80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:6/2116: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc90005937d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:7/4367: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc9000f717d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:8/4487: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc9000fd07d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:9/4559: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc9000ffe7d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by getty/5582: #0: ffff8880313180a0 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x24/0x80 drivers/tty/tty_ldisc.c:243 #1: ffffc90002fde2f0 (&ldata->atomic_read_lock){+.+.}-{4:4}, at: n_tty_read+0xfba/0x1480 drivers/tty/n_tty.c:2211 3 locks held by syz.0.18/6758: #0: ffff888026eba478 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff888024f5a420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff8880755987b8 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.1.23/6787: #0: ffff88803111c9b8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff88807d6da420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff888075720148 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.2.25/6807: #0: ffff88803562d438 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff888027a9e420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff88807559a178 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.3.28/6829: #0: ffff888026d3e9b8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff888025ffa420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff8880757207b8 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.4.30/6854: #0: ffff888033e600f8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff888036adc420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff888075720e28 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 2 locks held by kworker/u8:10/6856: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc90002ee7d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 
3 locks held by syz.5.32/6881: #0: ffff88803162deb8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff88802aa02420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff88807559a7e8 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by kworker/u8:11/6882: 3 locks held by syz.6.34/6908: #0: ffff88804f312d38 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff888062c8c420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff888075721498 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 2 locks held by kworker/u8:12/6909: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc900021dfd80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 3 locks held by syz.7.36/6935: #0: ffff8880369530b8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff888063c48420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff888075721b08 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.8.38/6968: #0: ffff88801e2adeb8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff88807b25e420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff88807559ae58 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.9.40/6997: #0: ffff88802a74a7f8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff888053afa420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff888075722178 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 ============================================= NMI backtrace for cpu 1 CPU: 1 UID: 0 PID: 30 Comm: khungtaskd Not tainted 6.13.0-rc1-syzkaller-00017-gaaec5a95d596-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120 nmi_cpu_backtrace+0x27b/0x390 lib/nmi_backtrace.c:113 nmi_trigger_cpumask_backtrace+0x29c/0x300 lib/nmi_backtrace.c:62 trigger_all_cpu_backtrace include/linux/nmi.h:162 [inline] check_hung_uninterruptible_tasks kernel/hung_task.c:234 [inline] watchdog+0xf14/0x1240 kernel/hung_task.c:397 kthread+0x2c1/0x3a0 kernel/kthread.c:389 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> Sending NMI from CPU 1 to CPUs 0: NMI backtrace for cpu 0 CPU: 0 UID: 0 PID: 5182 Comm: klogd Not tainted 6.13.0-rc1-syzkaller-00017-gaaec5a95d596-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025 RIP: 0010:__raw_callee_save___pv_queued_spin_unlock+0x12/0x18 Code: 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 52 b8 01 00 00 00 31 d2 f0 0f b0 17 75 06 <5a> c3 cc cc cc cc 56 0f b6 f0 e8 9f ff ff ff 5e 5a c3 cc cc cc cc RSP: 0018:ffffc90005667618 EFLAGS: 00000046 RAX: 0000000000000001 RBX: ffffffff9a9f6bc8 RCX: ffffffff81770803 RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff9a9f6bc8 RBP: 
ffffffff9a9f6bd0 R08: 0000000000000000 R09: fffffbfff353ed79 R10: ffffffff9a9f6bcb R11: 0000000000000001 R12: ffffffff9a9f6bd8 R13: 0000000000000012 R14: 0000000000000000 R15: ffff888067f30000 FS: 00007fb083fb7500(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00005612bebd5240 CR3: 00000000347be000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <NMI> </NMI> <TASK> pv_queued_spin_unlock arch/x86/include/asm/paravirt.h:589 [inline] queued_spin_unlock arch/x86/include/asm/qspinlock.h:57 [inline] do_raw_spin_unlock+0x172/0x230 kernel/locking/spinlock_debug.c:142 __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:150 [inline] _raw_spin_unlock_irqrestore+0x22/0x80 kernel/locking/spinlock.c:194 __debug_check_no_obj_freed lib/debugobjects.c:1108 [inline] debug_check_no_obj_freed+0x327/0x600 lib/debugobjects.c:1129 free_pages_prepare mm/page_alloc.c:1134 [inline] free_unref_page+0x276/0x1080 mm/page_alloc.c:2657 __put_partials+0x14c/0x170 mm/slub.c:3142 qlink_free mm/kasan/quarantine.c:163 [inline] qlist_free_all+0x4e/0x120 mm/kasan/quarantine.c:179 kasan_quarantine_reduce+0x195/0x1e0 mm/kasan/quarantine.c:286 __kasan_slab_alloc+0x69/0x90 mm/kasan/common.c:329 kasan_slab_alloc include/linux/kasan.h:250 [inline] slab_post_alloc_hook mm/slub.c:4104 [inline] slab_alloc_node mm/slub.c:4153 [inline] kmem_cache_alloc_node_noprof+0x223/0x3c0 mm/slub.c:4205 __alloc_skb+0x2b1/0x380 net/core/skbuff.c:668 alloc_skb include/linux/skbuff.h:1323 [inline] alloc_skb_with_frags+0xe4/0x850 net/core/skbuff.c:6612 sock_alloc_send_pskb+0x7f1/0x980 net/core/sock.c:2881 unix_dgram_sendmsg+0x4b8/0x19e0 net/unix/af_unix.c:2027 sock_sendmsg_nosec net/socket.c:711 [inline] __sock_sendmsg net/socket.c:726 [inline] __sys_sendto+0x488/0x4f0 net/socket.c:2197 __do_sys_sendto net/socket.c:2204 [inline] __se_sys_sendto net/socket.c:2200 [inline] __x64_sys_sendto+0xe0/0x1c0 net/socket.c:2200 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fb0841199b5 Code: 8b 44 24 08 48 83 c4 28 48 98 c3 48 98 c3 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 26 45 31 c9 45 31 c0 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 76 7a 48 8b 15 44 c4 0c 00 f7 d8 64 89 02 48 83 RSP: 002b:00007ffd368d15b8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fb0841199b5 RDX: 0000000000000049 RSI: 0000558a1e31e270 RDI: 0000000000000003 RBP: 0000558a1e3172c0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000004000 R11: 0000000000000246 R12: 0000000000000013 R13: 00007fb0842a7212 R14: 00007ffd368d16b8 R15: 0000000000000000 </TASK> Tested on: commit: aaec5a95 pipe_read: don't wake up the writer if the pi.. git tree: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git console output: https://syzkaller.appspot.com/x/log.txt?x=12debc4c580000 kernel config: https://syzkaller.appspot.com/x/.config?x=8d5a2956e94d7972 dashboard link: https://syzkaller.appspot.com/bug?extid=62262fdc0e01d99573fc compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40 patch: https://syzkaller.appspot.com/x/patch.diff?x=10750198580000 ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter
  2025-03-24 10:47 ` K Prateek Nayak 2025-03-24 11:15 ` syzbot @ 2025-03-24 13:17 ` Mateusz Guzik 2025-03-24 13:19 ` K Prateek Nayak
  1 sibling, 1 reply; 48+ messages in thread
From: Mateusz Guzik @ 2025-03-24 13:17 UTC (permalink / raw)
  To: K Prateek Nayak
  Cc: Oleg Nesterov, syzbot, brauner, dhowells, jack, jlayton,
      linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro

On Mon, Mar 24, 2025 at 11:48 AM K Prateek Nayak <kprateek.nayak@amd.com> wrote:
>
> Hello Oleg, Mateusz,
>
> On 3/24/2025 2:32 AM, Oleg Nesterov wrote:
> > Prateek, Mateusz, thanks for your participation!
> >
> > On 03/23, Mateusz Guzik wrote:
> >>
> >> On Sun, Mar 23, 2025 at 8:47 PM Oleg Nesterov <oleg@redhat.com> wrote:
> >>>
> >>> OK, as expected.
> >>>
> >>> Dear syzbot, thank you.
> >>>
> >>> So far I think this is another problem revealed by aaec5a95d59615523db03dd5
> >>> ("pipe_read: don't wake up the writer if the pipe is still full").
> >>>
> >>> I am going to forget about this report for now and return to it later, when
> >>> all the pending pipe-related changes in vfs.git are merged.
> >>>
> >>
> >> How do you ask syzbot for all stacks?
> >
> > Heh, I don't know.
> >
> >> The reproducer *does* use pipes, but it is unclear to me if they play
> >> any role here
> >
> > please see the reproducer,
> >
> > https://syzkaller.appspot.com/x/repro.c?x=10d6a44c580000
> >
> > res = syscall(__NR_pipe2, /*pipefd=*/0x400000001900ul, /*flags=*/0ul);
> > if (res != -1) {
> > r[2] = *(uint32_t*)0x400000001900;
> > r[3] = *(uint32_t*)0x400000001904;
> > }
> >
> > then
> >
> > res = syscall(__NR_dup, /*oldfd=*/r[3]);
> > if (res != -1)
> > r[4] = res;
> >
> > so r[2] and r[4] are the read/write fd's.
> >
> > then later
> >
> > memcpy((void*)0x400000000280, "trans=fd,", 9);
> > ...
> > memcpy((void*)0x400000000289, "rfdno", 5);
> > ...
> > sprintf((char*)0x40000000028f, "0x%016llx", (long long)r[2]);
> > ...
> > memcpy((void*)0x4000000002a2, "wfdno", 5);
> > ...
> > sprintf((char*)0x4000000002a8, "0x%016llx", (long long)r[4]);
> > ...
> > syscall(__NR_mount, /*src=*/0ul, /*dst=*/0x400000000000ul,
> > /*type=*/0x400000000040ul, /*flags=*/0ul, /*opts=*/0x400000000280ul);
> >
> > so this pipe is actually used as "trans=fd".
> >
> >> -- and notably we don't know if there is someone stuck
> >> in pipe code, resulting in not waking up the reported thread.
> >
> > Yes, I am not familiar with 9p or netfs, so I don't know either.
>
> Didn't have any luck reproducing this yet but I'm looking at
> https://syzkaller.appspot.com/x/log.txt?x=1397319b980000
> which is the trimmed log from the original report, and I see ...
>
> [pid 5842] creat("./file0", 000) = 7
> [ 137.753309][ T30] audit: type=1400 audit(1742312362.045:90): avc: denied { mount } for pid=5842 comm="syz-executor309" name="/" dev="9p" ino=2 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=filesystem permissive=1
> [ 137.775741][ T30] audit: type=1400 audit(1742312362.065:91): avc: denied { setattr } for pid=5842 comm="syz-executor309" name="/" dev="9p" ino=2 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=file permissive=1
> [ 137.798215][ T30] audit: type=1400 audit(1742312362.075:92): avc: denied { write } for pid=5842 comm="syz-executor309" dev="9p" ino=2 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=file permissive=1
> [ 137.819189][ T30] audit: type=1400 audit(1742312362.075:93): avc: denied { open } for pid=5842 comm="syz-executor309" path="/file0" dev="9p" ino=2 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=file permissive=1
> [pid 5842] write(7, "\x08\x00\x00\x00\x1a\x17\x92\x4a\xb2\x18\xea\xcb\x15\xa3\xfc\xcf\x92\x9e\x2d\xd2\x49\x79\x03\xc1\xf8\x53\xd9\x5b\x99\x5c\x65\xe9\x94\x49\xff\x95\x3f\xa1\x1c\x77\x23\xb2\x14\x9e\xcd\xaa\x7f\x83\x3f\x60\xe1\x3b\x19\xa6\x6e\x96\x3f\x7e\x8d\xa4\x29\x7e\xbb\xfd\xda\x5b\x36\xfb\x4d\x01\xbd\x02\xe6\xc6\x52\xdc\x4d\x99\xe2\xcb\x82\xc2\xa1\xd4\xa4\x5e\x4c\x89\xba\x99\x94\xe8\x2f\x85\x4b\xbc\x34\xa4\x0b\x3a"..., 32748 <unfinished ...>
>
> So we have a "write" denied for pid 5842, and then it tries a write that
> seems to hang. In all the hang traces, I see a denial for a task
> followed by a hang for a task in the same tgid.
>
> But since this is a "permissive" policy, it should not cause a hang,
> only report that the program is in violation. Also I have no clue how a
> spurious wakeup of a writer could then lead to progress.
>
> Since in all cases the thread info flags "flags:0x00004006" has the
> TIF_NOTIFY_SIGNAL bit set, I'm wondering if it has something to do with
> the fact that pipe_read() directly returns -ERESTARTSYS in case of a
> pending signal without any wakeups?
>

Per syzbot this attempt did not work out either.

I think the blind stabs taken by everyone here are enough.

The report does not provide the crucial bit: what are the other
threads doing. Presumably someone else is stuck somewhere, possibly
not even in pipe code, and that stuck thread was supposed to wake up
the one which trips over the hung task detector. Figuring out what
that thread is is imo the next step.

I failed to find a relevant command in
https://github.com/google/syzkaller/blob/master/docs/syzbot.md

So if you guys know someone on the syzkaller side, maybe you can ask
them to tweak the report *or* maybe syzbot can test a "fix" which
makes the hung task detector also report all backtraces? I don't know
if that can work, the output may be long enough that it will get
trimmed by something.

I don't have time to work on this for now, just throwing ideas.

--
Mateusz Guzik <mjguzik gmail.com>

^ permalink raw reply	[flat|nested] 48+ messages in thread
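(For a local reproducer the traditional answer is SysRq-t, e.g.
"echo t > /proc/sysrq-trigger", which dumps the stack of every task on the
system. Note that the NMI backtraces syzbot already appends only cover CPUs
that are actively running something, so a sleeping culprit thread would never
show up in them.)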
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-24 13:17 ` Mateusz Guzik @ 2025-03-24 13:19 ` K Prateek Nayak 2025-03-24 14:52 ` K Prateek Nayak 0 siblings, 1 reply; 48+ messages in thread From: K Prateek Nayak @ 2025-03-24 13:19 UTC (permalink / raw) To: Mateusz Guzik Cc: Oleg Nesterov, syzbot, brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro Hello Mateusz, On 3/24/2025 6:47 PM, Mateusz Guzik wrote: > On Mon, Mar 24, 2025 at 11:48 AM K Prateek Nayak <kprateek.nayak@amd.com> wrote: >> >> Hello Oleg, Mateusz, >> >> On 3/24/2025 2:32 AM, Oleg Nesterov wrote: >>> Prateek, Mateusz, thanks for your participation! >>> >>> On 03/23, Mateusz Guzik wrote: >>>> >>>> On Sun, Mar 23, 2025 at 8:47 PM Oleg Nesterov <oleg@redhat.com> wrote: >>>>> >>>>> OK, as expected. >>>>> >>>>> Dear syzbot, thank you. >>>>> >>>>> So far I think this is another problem revealed by aaec5a95d59615523db03dd5 >>>>> ("pipe_read: don't wake up the writer if the pipe is still full"). >>>>> >>>>> I am going to forget about this report for now and return to it later, when >>>>> all the pending pipe-related changes in vfs.git are merged. >>>>> >>>> >>>> How do you ask syzbot for all stacks? >>> >>> Heh, I don't know. >>> >>>> The reproducer *does* use pipes, but it is unclear to me if they play >>>> any role here >>> >>> please see the reproducer, >>> >>> https://syzkaller.appspot.com/x/repro.c?x=10d6a44c580000 >>> >>> res = syscall(__NR_pipe2, /*pipefd=*/0x400000001900ul, /*flags=*/0ul); >>> if (res != -1) { >>> r[2] = *(uint32_t*)0x400000001900; >>> r[3] = *(uint32_t*)0x400000001904; >>> } >>> >>> then >>> >>> res = syscall(__NR_dup, /*oldfd=*/r[3]); >>> if (res != -1) >>> r[4] = res; >>> >>> so r[2] and r[4] are the read/write fd's. >>> >>> then later >>> >>> memcpy((void*)0x400000000280, "trans=fd,", 9); >>> ... >>> memcpy((void*)0x400000000289, "rfdno", 5); >>> ... >>> sprintf((char*)0x40000000028f, "0x%016llx", (long long)r[2]); >>> ... >>> memcpy((void*)0x4000000002a2, "wfdno", 5); >>> ... >>> sprintf((char*)0x4000000002a8, "0x%016llx", (long long)r[4]); >>> ... >>> syscall(__NR_mount, /*src=*/0ul, /*dst=*/0x400000000000ul, >>> /*type=*/0x400000000040ul, /*flags=*/0ul, /*opts=*/0x400000000280ul); >>> >>> so this pipe is actually used as "trans=fd". >>> >>>> -- and notably we don't know if there is someone stuck >>>> in pipe code, resulting in not waking up the reported thread. >>> >>> Yes, I am not familiar with 9p or netfs, so I don't know either. >> >> Didn't have any luck reproducing this yet but I'm looking at >> https://syzkaller.appspot.com/x/log.txt?x=1397319b980000 >> which is the trimmed log from original report and I see ... 
>> >> [pid 5842] creat("./file0", 000) = 7 >> [ 137.753309][ T30] audit: type=1400 audit(1742312362.045:90): avc: denied { mount } for pid=5842 comm="syz-executor309" name="/" dev="9p" ino=2 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=filesystem permissive=1 >> [ 137.775741][ T30] audit: type=1400 audit(1742312362.065:91): avc: denied { setattr } for pid=5842 comm="syz-executor309" name="/" dev="9p" ino=2 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=file permissive=1 >> [ 137.798215][ T30] audit: type=1400 audit(1742312362.075:92): avc: denied { write } for pid=5842 comm="syz-executor309" dev="9p" ino=2 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=file permissive=1 >> [ 137.819189][ T30] audit: type=1400 audit(1742312362.075:93): avc: denied { open } for pid=5842 comm="syz-executor309" path="/file0" dev="9p" ino=2 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=file permissive=1 >> [pid 5842] write(7, "\x08\x00\x00\x00\x1a\x17\x92\x4a\xb2\x18\xea\xcb\x15\xa3\xfc\xcf\x92\x9e\x2d\xd2\x49\x79\x03\xc1\xf8\x53\xd9\x5b\x99\x5c\x65\xe9\x94\x49\xff\x95\x3f\xa1\x1c\x77\x23\xb2\x14\x9e\xcd\xaa\x7f\x83\x3f\x60\xe1\x3b\x19\xa6\x6e\x96\x3f\x7e\x8d\xa4\x29\x7e\xbb\xfd\xda\x5b\x36\xfb\x4d\x01\xbd\x02\xe6\xc6\x52\xdc\x4d\x99\xe2\xcb\x82\xc2\xa1\xd4\xa4\x5e\x4c\x89\xba\x99\x94\xe8\x2f\x85\x4b\xbc\x34\xa4\x0b\x3a"..., 32748 <unfinished ...> >> >> So we have a "write" denied for pid 5842 and then it tries a write that >> seems to hangs. In all the traces for hang, I see a denied for a task >> followed by a hang for the task in the same tgid. >> >> But since this is a "permissive" policy, it should not cause a hang, >> only report that the program is in violation. Also I have no clue how a >> spurious wakeup of a writer could then lead to progress. >> >> Since in all cases the thread info flags "flags:0x00004006" has the >> TIF_NOTIFY_SIGNAL bit set, I'm wondering if it has something to do with >> the fact that pipe_read() directly return -ERESTARTSYS in case of a >> pending signal without any wakeups? >> > > Per syzbot this attempt did not work out either. > > I think the blind stabs taken by everyone here are enough. > > The report does not provide the crucial bit: what are the other > threads doing. Presumably someone else is stuck somewhere, possibly > not even in pipe code and that stuck thread was supposed to wake up > the one which trips over hung task detector. Figuring out what that > thread is imo the next step. > > I failed to find a relevant command in > https://github.com/google/syzkaller/blob/master/docs/syzbot.md > > So if you guys know someone on syzkaller side, maybe you can ask them > to tweak the report *or* maybe syzbot can test a "fix" which makes > hung task detector also report all backtraces? I don't know if that > can work, the output may be long enough that it will get trimmed by > something. > > I don't have to time work on this for now, just throwing ideas. I got the reproducer running locally. Tracing stuff currently to see what is tripping. Will report back once I find something interesting. Might take a while since the 9p bits are so far spread out. -- Thanks and Regards, Prateek ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-24 13:19 ` K Prateek Nayak @ 2025-03-24 14:52 ` K Prateek Nayak 2025-03-24 16:03 ` Mateusz Guzik 2025-03-25 2:52 ` K Prateek Nayak 0 siblings, 2 replies; 48+ messages in thread From: K Prateek Nayak @ 2025-03-24 14:52 UTC (permalink / raw) To: Oleg Nesterov, Mateusz Guzik Cc: syzbot, brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro Hello folks, Some updates. On 3/24/2025 6:49 PM, K Prateek Nayak wrote: >> >> Per syzbot this attempt did not work out either. >> >> I think the blind stabs taken by everyone here are enough. >> >> The report does not provide the crucial bit: what are the other >> threads doing. Presumably someone else is stuck somewhere, possibly >> not even in pipe code and that stuck thread was supposed to wake up >> the one which trips over hung task detector. Figuring out what that >> thread is imo the next step. >> >> I failed to find a relevant command in >> https://github.com/google/syzkaller/blob/master/docs/syzbot.md >> >> So if you guys know someone on syzkaller side, maybe you can ask them >> to tweak the report *or* maybe syzbot can test a "fix" which makes >> hung task detector also report all backtraces? I don't know if that >> can work, the output may be long enough that it will get trimmed by >> something. >> >> I don't have to time work on this for now, just throwing ideas. > > I got the reproducer running locally. Tracing stuff currently to see > what is tripping. Will report back once I find something interesting. > Might take a while since the 9p bits are so far spread out. > So far, with tracing, this is where I'm: o Mainline + Oleg's optimization reverted: ... kworker/43:1-1723 [043] ..... 115.309065: p9_read_work: Data read wait 55 kworker/43:1-1723 [043] ..... 115.309066: p9_read_work: Data read 55 kworker/43:1-1723 [043] ..... 115.309067: p9_read_work: Data read wait 7 kworker/43:1-1723 [043] ..... 115.309068: p9_read_work: Data read 7 repro-4138 [043] ..... 115.309084: netfs_wake_write_collector: Wake collector repro-4138 [043] ..... 115.309085: netfs_wake_write_collector: Queuing collector work repro-4138 [043] ..... 115.309088: netfs_unbuffered_write: netfs_unbuffered_write repro-4138 [043] ..... 115.309088: netfs_end_issue_write: netfs_end_issue_write repro-4138 [043] ..... 115.309089: netfs_end_issue_write: Write collector need poke 0 repro-4138 [043] ..... 115.309091: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! kworker/u1030:1-1951 [168] ..... 115.309096: netfs_wake_write_collector: Wake collector kworker/u1030:1-1951 [168] ..... 115.309097: netfs_wake_write_collector: Queuing collector work kworker/u1030:1-1951 [168] ..... 115.309102: netfs_write_collection_worker: Write collect clearing and waking up! ... (syzbot reproducer continues) o Mainline: kworker/185:1-1767 [185] ..... 109.485961: p9_read_work: Data read wait 7 kworker/185:1-1767 [185] ..... 109.485962: p9_read_work: Data read 7 kworker/185:1-1767 [185] ..... 109.485962: p9_read_work: Data read wait 55 kworker/185:1-1767 [185] ..... 109.485963: p9_read_work: Data read 55 repro-4038 [185] ..... 114.225717: netfs_wake_write_collector: Wake collector repro-4038 [185] ..... 114.225723: netfs_wake_write_collector: Queuing collector work repro-4038 [185] ..... 114.225727: netfs_unbuffered_write: netfs_unbuffered_write repro-4038 [185] ..... 114.225727: netfs_end_issue_write: netfs_end_issue_write repro-4038 [185] ..... 
114.225728: netfs_end_issue_write: Write collector need poke 0 repro-4038 [185] ..... 114.225728: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! ... (syzbot reproducer hangs) There is a third "kworker/u1030" component that never gets woken up for reasons currently unknown to me with Oleg's optimization. I'll keep digging. -- Thanks and Regards, Prateek ^ permalink raw reply [flat|nested] 48+ messages in thread
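The "Waiting on NETFS_RREQ_IN_PROGRESS!" line in these traces corresponds to an uninterruptible bit-wait that only the write collector can terminate; if the collector work item is never queued, the writer sleeps in D state forever, which is exactly what the hung-task report shows. A minimal sketch of the handoff, paraphrased from fs/netfs/direct_write.c and fs/netfs/write_collect.c (the same two calls are visible in the debug diff attached to a later message):

	/* Writer side, netfs_unbuffered_write_iter_locked(): sleep until the
	 * request completes; TASK_UNINTERRUPTIBLE means D state, immune to
	 * signals including SIGKILL. */
	wait_on_bit(&wreq->flags, NETFS_RREQ_IN_PROGRESS, TASK_UNINTERRUPTIBLE);

	/* Collector side, netfs_write_collection_worker(): the only place
	 * the bit is cleared and the waiter woken. */
	clear_and_wake_up_bit(NETFS_RREQ_IN_PROGRESS, &wreq->flags);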
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-24 14:52 ` K Prateek Nayak @ 2025-03-24 16:03 ` Mateusz Guzik 2025-03-24 16:25 ` K Prateek Nayak 2025-03-25 2:52 ` K Prateek Nayak 1 sibling, 1 reply; 48+ messages in thread From: Mateusz Guzik @ 2025-03-24 16:03 UTC (permalink / raw) To: K Prateek Nayak Cc: Oleg Nesterov, syzbot, brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro On Mon, Mar 24, 2025 at 3:52 PM K Prateek Nayak <kprateek.nayak@amd.com> wrote: > So far, with tracing, this is where I'm: > > o Mainline + Oleg's optimization reverted: > > ... > kworker/43:1-1723 [043] ..... 115.309065: p9_read_work: Data read wait 55 > kworker/43:1-1723 [043] ..... 115.309066: p9_read_work: Data read 55 > kworker/43:1-1723 [043] ..... 115.309067: p9_read_work: Data read wait 7 > kworker/43:1-1723 [043] ..... 115.309068: p9_read_work: Data read 7 > repro-4138 [043] ..... 115.309084: netfs_wake_write_collector: Wake collector > repro-4138 [043] ..... 115.309085: netfs_wake_write_collector: Queuing collector work > repro-4138 [043] ..... 115.309088: netfs_unbuffered_write: netfs_unbuffered_write > repro-4138 [043] ..... 115.309088: netfs_end_issue_write: netfs_end_issue_write > repro-4138 [043] ..... 115.309089: netfs_end_issue_write: Write collector need poke 0 > repro-4138 [043] ..... 115.309091: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! > kworker/u1030:1-1951 [168] ..... 115.309096: netfs_wake_write_collector: Wake collector > kworker/u1030:1-1951 [168] ..... 115.309097: netfs_wake_write_collector: Queuing collector work > kworker/u1030:1-1951 [168] ..... 115.309102: netfs_write_collection_worker: Write collect clearing and waking up! > ... (syzbot reproducer continues) > > o Mainline: > > kworker/185:1-1767 [185] ..... 109.485961: p9_read_work: Data read wait 7 > kworker/185:1-1767 [185] ..... 109.485962: p9_read_work: Data read 7 > kworker/185:1-1767 [185] ..... 109.485962: p9_read_work: Data read wait 55 > kworker/185:1-1767 [185] ..... 109.485963: p9_read_work: Data read 55 > repro-4038 [185] ..... 114.225717: netfs_wake_write_collector: Wake collector > repro-4038 [185] ..... 114.225723: netfs_wake_write_collector: Queuing collector work > repro-4038 [185] ..... 114.225727: netfs_unbuffered_write: netfs_unbuffered_write > repro-4038 [185] ..... 114.225727: netfs_end_issue_write: netfs_end_issue_write > repro-4038 [185] ..... 114.225728: netfs_end_issue_write: Write collector need poke 0 > repro-4038 [185] ..... 114.225728: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! > ... (syzbot reproducer hangs) > > There is a third "kworker/u1030" component that never gets woken up for > reasons currently unknown to me with Oleg's optimization. I'll keep > digging. > Thanks for the update. It is unclear to me if you checked, so I'm going to have to ask just in case: when there is a hang, is there *anyone* stuck in pipe code (and if so, where)? You can get the kernel to print stacks for all threads with sysrq: echo t > /proc/sysrq-trigger -- Mateusz Guzik <mjguzik gmail.com> ^ permalink raw reply [flat|nested] 48+ messages in thread
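Internally, sysrq-t ends up in show_state_filter(), and the debug "fix" Mateusz floated earlier (teaching the hung-task detector to report all backtraces) could call the same helper. A hypothetical sketch, assuming kernel/hung_task.c internals; check_hung_task(), sched_show_task() and show_state_filter() are real kernel functions, the extra call is the hack:

	#include <linux/sched/debug.h>

	/* Hypothetical debug hack: when a hung task is found, also dump the
	 * stacks of every TASK_UNINTERRUPTIBLE task, not just the hung one. */
	static void check_hung_task(struct task_struct *t, unsigned long timeout)
	{
		/* ... existing hung-task detection ... */
		sched_show_task(t);			 /* existing: the hung task   */
		show_state_filter(TASK_UNINTERRUPTIBLE); /* added: all D-state tasks  */
	}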
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-24 16:03 ` Mateusz Guzik @ 2025-03-24 16:25 ` K Prateek Nayak 2025-03-24 16:36 ` K Prateek Nayak 0 siblings, 1 reply; 48+ messages in thread From: K Prateek Nayak @ 2025-03-24 16:25 UTC (permalink / raw) To: Mateusz Guzik Cc: Oleg Nesterov, syzbot, brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro Hello Mateusz, On 3/24/2025 9:33 PM, Mateusz Guzik wrote: > On Mon, Mar 24, 2025 at 3:52 PM K Prateek Nayak <kprateek.nayak@amd.com> wrote: >> So far, with tracing, this is where I'm: >> >> o Mainline + Oleg's optimization reverted: >> >> ... >> kworker/43:1-1723 [043] ..... 115.309065: p9_read_work: Data read wait 55 >> kworker/43:1-1723 [043] ..... 115.309066: p9_read_work: Data read 55 >> kworker/43:1-1723 [043] ..... 115.309067: p9_read_work: Data read wait 7 >> kworker/43:1-1723 [043] ..... 115.309068: p9_read_work: Data read 7 >> repro-4138 [043] ..... 115.309084: netfs_wake_write_collector: Wake collector >> repro-4138 [043] ..... 115.309085: netfs_wake_write_collector: Queuing collector work >> repro-4138 [043] ..... 115.309088: netfs_unbuffered_write: netfs_unbuffered_write >> repro-4138 [043] ..... 115.309088: netfs_end_issue_write: netfs_end_issue_write >> repro-4138 [043] ..... 115.309089: netfs_end_issue_write: Write collector need poke 0 >> repro-4138 [043] ..... 115.309091: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! >> kworker/u1030:1-1951 [168] ..... 115.309096: netfs_wake_write_collector: Wake collector >> kworker/u1030:1-1951 [168] ..... 115.309097: netfs_wake_write_collector: Queuing collector work >> kworker/u1030:1-1951 [168] ..... 115.309102: netfs_write_collection_worker: Write collect clearing and waking up! >> ... (syzbot reproducer continues) >> >> o Mainline: >> >> kworker/185:1-1767 [185] ..... 109.485961: p9_read_work: Data read wait 7 >> kworker/185:1-1767 [185] ..... 109.485962: p9_read_work: Data read 7 >> kworker/185:1-1767 [185] ..... 109.485962: p9_read_work: Data read wait 55 >> kworker/185:1-1767 [185] ..... 109.485963: p9_read_work: Data read 55 >> repro-4038 [185] ..... 114.225717: netfs_wake_write_collector: Wake collector >> repro-4038 [185] ..... 114.225723: netfs_wake_write_collector: Queuing collector work >> repro-4038 [185] ..... 114.225727: netfs_unbuffered_write: netfs_unbuffered_write >> repro-4038 [185] ..... 114.225727: netfs_end_issue_write: netfs_end_issue_write >> repro-4038 [185] ..... 114.225728: netfs_end_issue_write: Write collector need poke 0 >> repro-4038 [185] ..... 114.225728: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! >> ... (syzbot reproducer hangs) >> >> There is a third "kworker/u1030" component that never gets woken up for >> reasons currently unknown to me with Oleg's optimization. I'll keep >> digging. >> > > Thanks for the update. > > It is unclear to me if you checked, so I'm going to have to ask just > in case: when there is a hang, is there *anyone* stuck in pipe code > (and if so, where)? > > You can get the kernel to print stacks for all threads with sysrq: > echo t > /proc/sysrq-trigger This dumps an insane amount of stuff on my console. Let me search if there is any reference to pipe somewhere in there. Meanwhile, for the reproducer threads themself, they are at: # ps aux | grep repro root 4245 0.0 0.0 19040 0 ? 
D 16:10 0:00 ./repro root 4306 0.0 0.0 19172 0 pts/5 S+ 16:13 0:00 ./repro root 4307 0.0 0.0 19040 0 pts/5 D 16:13 0:00 ./repro # cat /proc/4306/stack [<0>] do_wait+0xa9/0x100 [<0>] kernel_wait4+0xa9/0x150 [<0>] do_syscall_64+0x6f/0x110 [<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e # cat /proc/4307/stack [<0>] do_lock_mount+0x37/0x200 [<0>] path_mount+0x876/0xae0 [<0>] __x64_sys_mount+0x103/0x130 [<0>] do_syscall_64+0x6f/0x110 [<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e # cat /proc/4245/stack [<0>] netfs_unbuffered_write_iter_locked+0x222/0x2b0 [netfs] [<0>] netfs_unbuffered_write_iter+0x12d/0x1f0 [netfs] [<0>] vfs_write+0x307/0x420 [<0>] ksys_write+0x66/0xe0 [<0>] do_syscall_64+0x6f/0x110 [<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e For kworkers, I did: # for i in $(ps aux | grep kworker | tr -s ' ' | cut -d ' ' -f2); do cat /proc/$i/stack | grep pipe; done and I got no output. Same for the string "wait". > -- Thanks and Regards, Prateek ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-24 16:25 ` K Prateek Nayak @ 2025-03-24 16:36 ` K Prateek Nayak 0 siblings, 0 replies; 48+ messages in thread From: K Prateek Nayak @ 2025-03-24 16:36 UTC (permalink / raw) To: Mateusz Guzik Cc: Oleg Nesterov, syzbot, brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro On 3/24/2025 9:55 PM, K Prateek Nayak wrote: > Hello Mateusz, > > On 3/24/2025 9:33 PM, Mateusz Guzik wrote: >> On Mon, Mar 24, 2025 at 3:52 PM K Prateek Nayak <kprateek.nayak@amd.com> wrote: >>> So far, with tracing, this is where I'm: >>> >>> o Mainline + Oleg's optimization reverted: >>> >>> ... >>> kworker/43:1-1723 [043] ..... 115.309065: p9_read_work: Data read wait 55 >>> kworker/43:1-1723 [043] ..... 115.309066: p9_read_work: Data read 55 >>> kworker/43:1-1723 [043] ..... 115.309067: p9_read_work: Data read wait 7 >>> kworker/43:1-1723 [043] ..... 115.309068: p9_read_work: Data read 7 >>> repro-4138 [043] ..... 115.309084: netfs_wake_write_collector: Wake collector >>> repro-4138 [043] ..... 115.309085: netfs_wake_write_collector: Queuing collector work >>> repro-4138 [043] ..... 115.309088: netfs_unbuffered_write: netfs_unbuffered_write >>> repro-4138 [043] ..... 115.309088: netfs_end_issue_write: netfs_end_issue_write >>> repro-4138 [043] ..... 115.309089: netfs_end_issue_write: Write collector need poke 0 >>> repro-4138 [043] ..... 115.309091: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! >>> kworker/u1030:1-1951 [168] ..... 115.309096: netfs_wake_write_collector: Wake collector >>> kworker/u1030:1-1951 [168] ..... 115.309097: netfs_wake_write_collector: Queuing collector work >>> kworker/u1030:1-1951 [168] ..... 115.309102: netfs_write_collection_worker: Write collect clearing and waking up! >>> ... (syzbot reproducer continues) >>> >>> o Mainline: >>> >>> kworker/185:1-1767 [185] ..... 109.485961: p9_read_work: Data read wait 7 >>> kworker/185:1-1767 [185] ..... 109.485962: p9_read_work: Data read 7 >>> kworker/185:1-1767 [185] ..... 109.485962: p9_read_work: Data read wait 55 >>> kworker/185:1-1767 [185] ..... 109.485963: p9_read_work: Data read 55 >>> repro-4038 [185] ..... 114.225717: netfs_wake_write_collector: Wake collector >>> repro-4038 [185] ..... 114.225723: netfs_wake_write_collector: Queuing collector work >>> repro-4038 [185] ..... 114.225727: netfs_unbuffered_write: netfs_unbuffered_write >>> repro-4038 [185] ..... 114.225727: netfs_end_issue_write: netfs_end_issue_write >>> repro-4038 [185] ..... 114.225728: netfs_end_issue_write: Write collector need poke 0 >>> repro-4038 [185] ..... 114.225728: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! >>> ... (syzbot reproducer hangs) >>> >>> There is a third "kworker/u1030" component that never gets woken up for >>> reasons currently unknown to me with Oleg's optimization. I'll keep >>> digging. >>> >> >> Thanks for the update. >> >> It is unclear to me if you checked, so I'm going to have to ask just >> in case: when there is a hang, is there *anyone* stuck in pipe code >> (and if so, where)? >> >> You can get the kernel to print stacks for all threads with sysrq: >> echo t > /proc/sysrq-trigger > > This dumps an insane amount of stuff on my console. Let me search if there > is any reference to pipe somewhere in there. 
Only pipe_read() and pipe_write() pairs I found were for: [ 1043.618621] task:containerd state:S stack:0 pid:3567 tgid:3475 ppid:1 task_flags:0x400040 flags:0x00000002 [ 1043.629673] Call Trace: [ 1043.632133] <TASK> [ 1043.634252] __schedule+0x436/0x1620 [ 1043.637832] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1043.642638] ? syscall_exit_to_user_mode+0x51/0x1a0 [ 1043.647538] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1043.652347] ? __smp_call_single_queue+0xc7/0x150 [ 1043.657072] schedule+0x28/0x110 [ 1043.660320] futex_wait_queue+0x69/0xa0 [ 1043.664176] __futex_wait+0x13c/0x1b0 [ 1043.667863] ? __pfx_futex_wake_mark+0x10/0x10 [ 1043.672325] futex_wait+0x68/0x110 [ 1043.675747] ? __x64_sys_futex+0x77/0x1d0 [ 1043.679775] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1043.684585] ? pipe_write+0x42c/0x630 <----------- here [ 1043.688269] ? copy_fpstate_to_sigframe+0x330/0x3d0 [ 1043.693167] do_futex+0x13c/0x1d0 [ 1043.696501] __x64_sys_futex+0x77/0x1d0 [ 1043.700359] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1043.705170] ? vfs_write+0x376/0x420 [ 1043.708767] do_syscall_64+0x6f/0x110 [ 1043.712449] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1043.717259] ? ksys_write+0x90/0xe0 [ 1043.720768] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1043.725579] ? syscall_exit_to_user_mode+0x51/0x1a0 [ 1043.730475] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1043.735288] ? do_syscall_64+0x7b/0x110 [ 1043.739144] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1043.743953] ? irqentry_exit_to_user_mode+0x2e/0x160 [ 1043.748935] entry_SYSCALL_64_after_hwframe+0x76/0x7e and [ 1048.980518] task:dockerd state:S stack:0 pid:3591 tgid:3582 ppid:1 task_flags:0x400040 flags:0x00000002 [ 1048.991569] Call Trace: [ 1048.994024] <TASK> [ 1048.996143] __schedule+0x436/0x1620 [ 1048.999741] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.004554] ? get_nohz_timer_target+0x2a/0x180 [ 1049.009101] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.013911] ? timerqueue_add+0x6a/0xc0 [ 1049.017768] schedule+0x28/0x110 [ 1049.021017] schedule_hrtimeout_range_clock+0x78/0xd0 [ 1049.026088] ? __pfx_hrtimer_wakeup+0x10/0x10 [ 1049.030470] do_epoll_wait+0x666/0x7d0 [ 1049.034245] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.039053] ? __pfx_ep_autoremove_wake_function+0x10/0x10 [ 1049.044556] do_compat_epoll_pwait.part.0+0xc/0x70 [ 1049.049368] __x64_sys_epoll_pwait+0x83/0x140 [ 1049.053746] do_syscall_64+0x6f/0x110 [ 1049.057425] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.062235] ? do_syscall_64+0x7b/0x110 [ 1049.066079] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.070887] ? do_futex+0xc2/0x1d0 [ 1049.074309] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.079120] ? __x64_sys_futex+0x77/0x1d0 [ 1049.083151] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.087958] ? current_time+0x31/0x130 [ 1049.091731] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.096539] ? inode_update_timestamps+0xc8/0x110 [ 1049.101264] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.106074] ? generic_update_time+0x51/0x60 [ 1049.110365] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.115173] ? touch_atime+0xb5/0x100 [ 1049.118857] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.123671] ? pipe_read+0x3fe/0x480 <----------- here [ 1049.127265] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.132073] ? ep_done_scan+0xab/0xf0 [ 1049.135757] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.140567] ? do_epoll_wait+0xe1/0x7d0 [ 1049.144425] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.149234] ? do_compat_epoll_pwait.part.0+0xc/0x70 [ 1049.154218] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.159027] ? 
__x64_sys_epoll_pwait+0x83/0x140 [ 1049.163577] ? do_syscall_64+0x7b/0x110 [ 1049.167433] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.172243] ? syscall_exit_to_user_mode+0x51/0x1a0 [ 1049.177141] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.181951] ? do_syscall_64+0x7b/0x110 [ 1049.185811] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.190619] ? ksys_read+0x90/0xe0 [ 1049.194025] ? do_syscall_64+0x7b/0x110 [ 1049.197877] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.202682] ? syscall_exit_to_user_mode+0x51/0x1a0 [ 1049.207579] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.212391] ? do_syscall_64+0x7b/0x110 [ 1049.216243] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.221053] ? do_syscall_64+0x7b/0x110 [ 1049.224900] ? srso_alias_return_thunk+0x5/0xfbef5 [ 1049.229703] ? do_syscall_64+0x7b/0x110 [ 1049.233561] entry_SYSCALL_64_after_hwframe+0x76/0x7e -- Thanks and Regards, Prateek ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-24 14:52 ` K Prateek Nayak 2025-03-24 16:03 ` Mateusz Guzik @ 2025-03-25 2:52 ` K Prateek Nayak 2025-03-25 12:15 ` Oleg Nesterov 1 sibling, 1 reply; 48+ messages in thread From: K Prateek Nayak @ 2025-03-25 2:52 UTC (permalink / raw) To: Oleg Nesterov, Mateusz Guzik Cc: syzbot, brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro On 3/24/2025 8:22 PM, K Prateek Nayak wrote: > Hello folks, > > Some updates. > > On 3/24/2025 6:49 PM, K Prateek Nayak wrote: >>> >>> Per syzbot this attempt did not work out either. >>> >>> I think the blind stabs taken by everyone here are enough. >>> >>> The report does not provide the crucial bit: what are the other >>> threads doing. Presumably someone else is stuck somewhere, possibly >>> not even in pipe code and that stuck thread was supposed to wake up >>> the one which trips over hung task detector. Figuring out what that >>> thread is imo the next step. >>> >>> I failed to find a relevant command in >>> https://github.com/google/syzkaller/blob/master/docs/syzbot.md >>> >>> So if you guys know someone on syzkaller side, maybe you can ask them >>> to tweak the report *or* maybe syzbot can test a "fix" which makes >>> hung task detector also report all backtraces? I don't know if that >>> can work, the output may be long enough that it will get trimmed by >>> something. >>> >>> I don't have to time work on this for now, just throwing ideas. >> >> I got the reproducer running locally. Tracing stuff currently to see >> what is tripping. Will report back once I find something interesting. >> Might take a while since the 9p bits are so far spread out. >> > > So far, with tracing, this is where I'm: > > o Mainline + Oleg's optimization reverted: > > ... > kworker/43:1-1723 [043] ..... 115.309065: p9_read_work: Data read wait 55 > kworker/43:1-1723 [043] ..... 115.309066: p9_read_work: Data read 55 > kworker/43:1-1723 [043] ..... 115.309067: p9_read_work: Data read wait 7 > kworker/43:1-1723 [043] ..... 115.309068: p9_read_work: Data read 7 > repro-4138 [043] ..... 115.309084: netfs_wake_write_collector: Wake collector > repro-4138 [043] ..... 115.309085: netfs_wake_write_collector: Queuing collector work > repro-4138 [043] ..... 115.309088: netfs_unbuffered_write: netfs_unbuffered_write > repro-4138 [043] ..... 115.309088: netfs_end_issue_write: netfs_end_issue_write > repro-4138 [043] ..... 115.309089: netfs_end_issue_write: Write collector need poke 0 > repro-4138 [043] ..... 115.309091: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! > kworker/u1030:1-1951 [168] ..... 115.309096: netfs_wake_write_collector: Wake collector > kworker/u1030:1-1951 [168] ..... 115.309097: netfs_wake_write_collector: Queuing collector work > kworker/u1030:1-1951 [168] ..... 115.309102: netfs_write_collection_worker: Write collect clearing and waking up! > ... (syzbot reproducer continues) > > o Mainline: > > kworker/185:1-1767 [185] ..... 109.485961: p9_read_work: Data read wait 7 > kworker/185:1-1767 [185] ..... 109.485962: p9_read_work: Data read 7 > kworker/185:1-1767 [185] ..... 109.485962: p9_read_work: Data read wait 55 > kworker/185:1-1767 [185] ..... 109.485963: p9_read_work: Data read 55 > repro-4038 [185] ..... 114.225717: netfs_wake_write_collector: Wake collector > repro-4038 [185] ..... 114.225723: netfs_wake_write_collector: Queuing collector work > repro-4038 [185] ..... 
114.225727: netfs_unbuffered_write: netfs_unbuffered_write > repro-4038 [185] ..... 114.225727: netfs_end_issue_write: netfs_end_issue_write > repro-4038 [185] ..... 114.225728: netfs_end_issue_write: Write collector need poke 0 > repro-4038 [185] ..... 114.225728: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! > ... (syzbot reproducer hangs) > > There is a third "kworker/u1030" component that never gets woken up for > reasons currently unknown to me with Oleg's optimization. I'll keep > digging. More data ... I chased this down to p9_client_rpc() net/9p/client.c specifically: err = c->trans_mod->request(c, req); if (err < 0) { /* write won't happen */ p9_req_put(c, req); if (err != -ERESTARTSYS && err != -EFAULT) c->status = Disconnected; goto recalc_sigpending; } c->trans_mod->request() calls p9_fd_request() in net/9p/trans_fd.c which basically does a p9_fd_poll(). Previously, the above would fail with err as -EIO which would cause the client to "Disconnect" and the retry logic would make progress. Now however, the err returned is -ERESTARTSYS which will not cause a disconnect and the retry logic will hang somewhere in p9_client_rpc() later. I'll chase it a little more but if 9p folks can chime in it would be great since I'm out of my depths here. P.S. There are more interactions at play and I'm trying to still piece them together. Relevant traces: o Mainline + Oleg's optimization reverted: repro-4161 [239] ..... 107.785644: p9_client_write: p9_client_rpc done repro-4161 [239] ..... 107.785644: p9_client_write: p9_pdup repro-4161 [239] ..... 107.785644: p9_client_write: iter revert repro-4161 [239] ..... 107.785644: p9_client_write: p9_client_rpc repro-4161 [239] ..... 107.785653: p9_fd_request: p9_fd_request repro-4161 [239] ...1. 107.785653: p9_fd_request: p9_fd_request error repro-4161 [239] ..... 107.785653: p9_client_rpc: Client disconnected (no write) <------------- "write won't happen" case repro-4161 [239] ..... 107.785655: p9_client_write: p9_client_rpc done repro-4161 [239] ..... 107.785655: p9_client_write: p9_client_rpc error (-5) <------------- -EIO repro-4161 [239] ..... 107.785656: v9fs_issue_write: Issue write done 2 err(-5) repro-4161 [239] ..... 107.785657: netfs_write_subrequest_terminated: Collector woken up from netfs_write_subrequest_terminated repro-4161 [239] ..... 107.785657: netfs_wake_write_collector: Wake collector repro-4161 [239] ..... 107.785658: netfs_wake_write_collector: Queuing collector work repro-4161 [239] ..... 107.785660: v9fs_issue_write: Issue write subrequest terminated 2 repro-4161 [239] ..... 107.785661: netfs_unbuffered_write: netfs_unbuffered_write repro-4161 [239] ..... 107.785661: netfs_end_issue_write: netfs_end_issue_write repro-4161 [239] ..... 107.785662: netfs_end_issue_write: Write collector need poke 0 repro-4161 [239] ..... 107.785662: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! kworker/u1038:0-1583 [105] ..... 107.785667: netfs_retry_writes: netfs_reissue_write 1 kworker/u1038:0-1583 [105] ..... 107.785668: v9fs_issue_write: v9fs_issue_write kworker/u1038:0-1583 [105] ..... 107.785669: p9_client_write: p9_client_rpc kworker/u1038:0-1583 [105] ..... 107.785669: p9_client_prepare_req: p9_client_prepare_req eio 1 kworker/u1038:0-1583 [105] ..... 107.785670: p9_client_rpc: p9_client_rpc early err return kworker/u1038:0-1583 [105] ..... 107.785670: p9_client_write: p9_client_rpc done kworker/u1038:0-1583 [105] ..... 
107.785671: p9_client_write: p9_client_rpc error (-5) kworker/u1038:0-1583 [105] ..... 107.785672: v9fs_issue_write: Issue write done 0 err(-5) kworker/u1038:0-1583 [105] ..... 107.785672: netfs_write_subrequest_terminated: Collector woken up kworker/u1038:0-1583 [105] ..... 107.785672: netfs_wake_write_collector: Wake collector kworker/u1038:0-1583 [105] ..... 107.785672: netfs_wake_write_collector: Queuing collector work kworker/u1038:0-1583 [105] ..... 107.785677: v9fs_issue_write: Issue write subrequest terminated 0 kworker/u1038:0-1583 [105] ..... 107.785684: netfs_write_collection_worker: Write collect clearing and waking up! repro-4161 [239] ..... 107.785883: p9_client_prepare_req: p9_client_prepare_req eio 1 ... (continues) o Mainline: repro-4161 [087] ..... 123.474660: p9_client_write: p9_client_rpc done repro-4161 [087] ..... 123.474661: p9_client_write: p9_pdup repro-4161 [087] ..... 123.474661: p9_client_write: iter revert repro-4161 [087] ..... 123.474661: p9_client_write: p9_client_rpc repro-4161 [087] ..... 123.474672: p9_fd_request: p9_fd_request repro-4161 [087] ..... 123.474673: p9_fd_request: p9_fd_request EPOLL repro-4161 [087] ..... 123.474673: p9_fd_poll: p9_fd_poll rd poll repro-4161 [087] ..... 123.474674: p9_fd_poll: p9_fd_request wr poll repro-4161 [087] ..... 128.233025: p9_client_write: p9_client_rpc done repro-4161 [087] ..... 128.233033: p9_client_write: p9_client_rpc error (-512) <------------- -ERESTARTSYS repro-4161 [087] ..... 128.233034: v9fs_issue_write: Issue write done 2 err(-512) repro-4161 [087] ..... 128.233035: netfs_write_subrequest_terminated: Collector woken repro-4161 [087] ..... 128.233036: netfs_wake_write_collector: Wake collector repro-4161 [087] ..... 128.233036: netfs_wake_write_collector: Queuing collector work repro-4161 [087] ..... 128.233040: v9fs_issue_write: Issue write subrequest terminated 2 repro-4161 [087] ..... 128.233040: netfs_unbuffered_write: netfs_unbuffered_write repro-4161 [087] ..... 128.233040: netfs_end_issue_write: netfs_end_issue_write repro-4161 [087] ..... 128.233041: netfs_end_issue_write: Write collector need poke 0 repro-4161 [087] ..... 128.233041: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! kworker/u1035:0-1580 [080] ..... 128.233077: netfs_retry_writes: netfs_reissue_write 1 kworker/u1035:0-1580 [080] ..... 128.233078: v9fs_issue_write: v9fs_issue_write kworker/u1035:0-1580 [080] ..... 128.233079: p9_client_write: p9_client_rpc kworker/u1035:0-1580 [080] ..... 128.233099: p9_fd_request: p9_fd_request kworker/u1035:0-1580 [080] ..... 128.233101: p9_fd_request: p9_fd_request EPOLL kworker/u1035:0-1580 [080] ..... 128.233101: p9_fd_poll: p9_fd_poll rd poll kworker/u1035:0-1580 [080] ..... 128.233101: p9_fd_poll: p9_fd_request wr poll (Hangs) -- Thanks and Regards, Prateek ^ permalink raw reply [flat|nested] 48+ messages in thread
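For context on the two trace sets above: p9_fd_request() (net/9p/trans_fd.c) fails up front only when the connection has already latched an error in m->err; otherwise it queues the request, polls the fds, and kicks the write worker, after which p9_client_rpc() sleeps waiting for a reply. A rough paraphrase from memory, with details such as the Wpending/Wworksched bit handling omitted:

	static int p9_fd_request(struct p9_client *client, struct p9_req_t *req)
	{
		struct p9_trans_fd *ts = client->trans;
		struct p9_conn *m = &ts->conn;

		if (m->err < 0)		/* the "p9_fd_request error" trace; the
					 * caller only disconnects for errors
					 * other than -ERESTARTSYS/-EFAULT */
			return m->err;

		spin_lock(&m->req_lock);
		req->status = REQ_STATUS_UNSENT;
		list_add_tail(&req->req_list, &m->unsent_req_list);
		spin_unlock(&m->req_lock);

		/* the "EPOLL"/"rd poll"/"wr poll" traces: */
		if (p9_fd_poll(m->client, NULL, NULL) & EPOLLOUT)
			schedule_work(&m->wq);	/* wake the write worker */
		return 0;
	}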
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-25 2:52 ` K Prateek Nayak @ 2025-03-25 12:15 ` Oleg Nesterov 2025-03-25 12:36 ` Dominique Martinet 2025-03-25 13:04 ` Oleg Nesterov 0 siblings, 2 replies; 48+ messages in thread From: Oleg Nesterov @ 2025-03-25 12:15 UTC (permalink / raw) To: K Prateek Nayak, Eric Van Hensbergen, Latchesar Ionkov, Dominique Martinet, Christian Schoenebeck Cc: Mateusz Guzik, syzbot, brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro, v9fs Prateek, thanks again for your hard work! Yes, I think we need the help from 9p folks. I know nothing about it, but so far this still looks as a 9p problem to me... All I can say right now is that the "sigpending" logic in p9_client_rpc() looks wrong. If nothing else: - clear_thread_flag(TIF_SIGPENDING) is not enough, it won't make signal_pending() false if TIF_NOTIFY_SIGNAL is set. - otoh, if signal_pending() was true because of pending SIGKILL, then after clear_thread_flag(TIF_SIGPENDING) wait_event_killable() will act as uninterruptible wait_event(). Oleg. On 03/25, K Prateek Nayak wrote: > > On 3/24/2025 8:22 PM, K Prateek Nayak wrote: > >Hello folks, > > > >Some updates. > > > >On 3/24/2025 6:49 PM, K Prateek Nayak wrote: > >>> > >>>Per syzbot this attempt did not work out either. > >>> > >>>I think the blind stabs taken by everyone here are enough. > >>> > >>>The report does not provide the crucial bit: what are the other > >>>threads doing. Presumably someone else is stuck somewhere, possibly > >>>not even in pipe code and that stuck thread was supposed to wake up > >>>the one which trips over hung task detector. Figuring out what that > >>>thread is imo the next step. > >>> > >>>I failed to find a relevant command in > >>>https://github.com/google/syzkaller/blob/master/docs/syzbot.md > >>> > >>>So if you guys know someone on syzkaller side, maybe you can ask them > >>>to tweak the report *or* maybe syzbot can test a "fix" which makes > >>>hung task detector also report all backtraces? I don't know if that > >>>can work, the output may be long enough that it will get trimmed by > >>>something. > >>> > >>>I don't have to time work on this for now, just throwing ideas. > >> > >>I got the reproducer running locally. Tracing stuff currently to see > >>what is tripping. Will report back once I find something interesting. > >>Might take a while since the 9p bits are so far spread out. > >> > > > >So far, with tracing, this is where I'm: > > > >o Mainline + Oleg's optimization reverted: > > > > ... > > kworker/43:1-1723 [043] ..... 115.309065: p9_read_work: Data read wait 55 > > kworker/43:1-1723 [043] ..... 115.309066: p9_read_work: Data read 55 > > kworker/43:1-1723 [043] ..... 115.309067: p9_read_work: Data read wait 7 > > kworker/43:1-1723 [043] ..... 115.309068: p9_read_work: Data read 7 > > repro-4138 [043] ..... 115.309084: netfs_wake_write_collector: Wake collector > > repro-4138 [043] ..... 115.309085: netfs_wake_write_collector: Queuing collector work > > repro-4138 [043] ..... 115.309088: netfs_unbuffered_write: netfs_unbuffered_write > > repro-4138 [043] ..... 115.309088: netfs_end_issue_write: netfs_end_issue_write > > repro-4138 [043] ..... 115.309089: netfs_end_issue_write: Write collector need poke 0 > > repro-4138 [043] ..... 115.309091: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! > > kworker/u1030:1-1951 [168] ..... 
115.309096: netfs_wake_write_collector: Wake collector > > kworker/u1030:1-1951 [168] ..... 115.309097: netfs_wake_write_collector: Queuing collector work > > kworker/u1030:1-1951 [168] ..... 115.309102: netfs_write_collection_worker: Write collect clearing and waking up! > > ... (syzbot reproducer continues) > > > >o Mainline: > > > > kworker/185:1-1767 [185] ..... 109.485961: p9_read_work: Data read wait 7 > > kworker/185:1-1767 [185] ..... 109.485962: p9_read_work: Data read 7 > > kworker/185:1-1767 [185] ..... 109.485962: p9_read_work: Data read wait 55 > > kworker/185:1-1767 [185] ..... 109.485963: p9_read_work: Data read 55 > > repro-4038 [185] ..... 114.225717: netfs_wake_write_collector: Wake collector > > repro-4038 [185] ..... 114.225723: netfs_wake_write_collector: Queuing collector work > > repro-4038 [185] ..... 114.225727: netfs_unbuffered_write: netfs_unbuffered_write > > repro-4038 [185] ..... 114.225727: netfs_end_issue_write: netfs_end_issue_write > > repro-4038 [185] ..... 114.225728: netfs_end_issue_write: Write collector need poke 0 > > repro-4038 [185] ..... 114.225728: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! > > ... (syzbot reproducer hangs) > > > >There is a third "kworker/u1030" component that never gets woken up for > >reasons currently unknown to me with Oleg's optimization. I'll keep > >digging. > > More data ... > > I chased this down to p9_client_rpc() net/9p/client.c specifically: > > err = c->trans_mod->request(c, req); > if (err < 0) { > /* write won't happen */ > p9_req_put(c, req); > if (err != -ERESTARTSYS && err != -EFAULT) > c->status = Disconnected; > goto recalc_sigpending; > } > > c->trans_mod->request() calls p9_fd_request() in net/9p/trans_fd.c > which basically does a p9_fd_poll(). > > Previously, the above would fail with err as -EIO which would > cause the client to "Disconnect" and the retry logic would make > progress. Now however, the err returned is -ERESTARTSYS which > will not cause a disconnect and the retry logic will hang > somewhere in p9_client_rpc() later. > > I'll chase it a little more but if 9p folks can chime in it would > be great since I'm out of my depths here. > > P.S. There are more interactions at play and I'm trying to still > piece them together. > > Relevant traces: > > o Mainline + Oleg's optimization reverted: > > repro-4161 [239] ..... 107.785644: p9_client_write: p9_client_rpc done > repro-4161 [239] ..... 107.785644: p9_client_write: p9_pdup > repro-4161 [239] ..... 107.785644: p9_client_write: iter revert > repro-4161 [239] ..... 107.785644: p9_client_write: p9_client_rpc > repro-4161 [239] ..... 107.785653: p9_fd_request: p9_fd_request > repro-4161 [239] ...1. 107.785653: p9_fd_request: p9_fd_request error > repro-4161 [239] ..... 107.785653: p9_client_rpc: Client disconnected (no write) <------------- "write won't happen" case > repro-4161 [239] ..... 107.785655: p9_client_write: p9_client_rpc done > repro-4161 [239] ..... 107.785655: p9_client_write: p9_client_rpc error (-5) <------------- -EIO > repro-4161 [239] ..... 107.785656: v9fs_issue_write: Issue write done 2 err(-5) > repro-4161 [239] ..... 107.785657: netfs_write_subrequest_terminated: Collector woken up from netfs_write_subrequest_terminated > repro-4161 [239] ..... 107.785657: netfs_wake_write_collector: Wake collector > repro-4161 [239] ..... 107.785658: netfs_wake_write_collector: Queuing collector work > repro-4161 [239] ..... 107.785660: v9fs_issue_write: Issue write subrequest terminated 2 > repro-4161 [239] ..... 
107.785661: netfs_unbuffered_write: netfs_unbuffered_write > repro-4161 [239] ..... 107.785661: netfs_end_issue_write: netfs_end_issue_write > repro-4161 [239] ..... 107.785662: netfs_end_issue_write: Write collector need poke 0 > repro-4161 [239] ..... 107.785662: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! > kworker/u1038:0-1583 [105] ..... 107.785667: netfs_retry_writes: netfs_reissue_write 1 > kworker/u1038:0-1583 [105] ..... 107.785668: v9fs_issue_write: v9fs_issue_write > kworker/u1038:0-1583 [105] ..... 107.785669: p9_client_write: p9_client_rpc > kworker/u1038:0-1583 [105] ..... 107.785669: p9_client_prepare_req: p9_client_prepare_req eio 1 > kworker/u1038:0-1583 [105] ..... 107.785670: p9_client_rpc: p9_client_rpc early err return > kworker/u1038:0-1583 [105] ..... 107.785670: p9_client_write: p9_client_rpc done > kworker/u1038:0-1583 [105] ..... 107.785671: p9_client_write: p9_client_rpc error (-5) > kworker/u1038:0-1583 [105] ..... 107.785672: v9fs_issue_write: Issue write done 0 err(-5) > kworker/u1038:0-1583 [105] ..... 107.785672: netfs_write_subrequest_terminated: Collector woken up > kworker/u1038:0-1583 [105] ..... 107.785672: netfs_wake_write_collector: Wake collector > kworker/u1038:0-1583 [105] ..... 107.785672: netfs_wake_write_collector: Queuing collector work > kworker/u1038:0-1583 [105] ..... 107.785677: v9fs_issue_write: Issue write subrequest terminated 0 > kworker/u1038:0-1583 [105] ..... 107.785684: netfs_write_collection_worker: Write collect clearing and waking up! > repro-4161 [239] ..... 107.785883: p9_client_prepare_req: p9_client_prepare_req eio 1 > ... (continues) > > o Mainline: > > repro-4161 [087] ..... 123.474660: p9_client_write: p9_client_rpc done > repro-4161 [087] ..... 123.474661: p9_client_write: p9_pdup > repro-4161 [087] ..... 123.474661: p9_client_write: iter revert > repro-4161 [087] ..... 123.474661: p9_client_write: p9_client_rpc > repro-4161 [087] ..... 123.474672: p9_fd_request: p9_fd_request > repro-4161 [087] ..... 123.474673: p9_fd_request: p9_fd_request EPOLL > repro-4161 [087] ..... 123.474673: p9_fd_poll: p9_fd_poll rd poll > repro-4161 [087] ..... 123.474674: p9_fd_poll: p9_fd_request wr poll > repro-4161 [087] ..... 128.233025: p9_client_write: p9_client_rpc done > repro-4161 [087] ..... 128.233033: p9_client_write: p9_client_rpc error (-512) <------------- -ERESTARTSYS > repro-4161 [087] ..... 128.233034: v9fs_issue_write: Issue write done 2 err(-512) > repro-4161 [087] ..... 128.233035: netfs_write_subrequest_terminated: Collector woken > repro-4161 [087] ..... 128.233036: netfs_wake_write_collector: Wake collector > repro-4161 [087] ..... 128.233036: netfs_wake_write_collector: Queuing collector work > repro-4161 [087] ..... 128.233040: v9fs_issue_write: Issue write subrequest terminated 2 > repro-4161 [087] ..... 128.233040: netfs_unbuffered_write: netfs_unbuffered_write > repro-4161 [087] ..... 128.233040: netfs_end_issue_write: netfs_end_issue_write > repro-4161 [087] ..... 128.233041: netfs_end_issue_write: Write collector need poke 0 > repro-4161 [087] ..... 128.233041: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! > kworker/u1035:0-1580 [080] ..... 128.233077: netfs_retry_writes: netfs_reissue_write 1 > kworker/u1035:0-1580 [080] ..... 128.233078: v9fs_issue_write: v9fs_issue_write > kworker/u1035:0-1580 [080] ..... 128.233079: p9_client_write: p9_client_rpc > kworker/u1035:0-1580 [080] ..... 128.233099: p9_fd_request: p9_fd_request > kworker/u1035:0-1580 [080] ..... 
128.233101: p9_fd_request: p9_fd_request EPOLL > kworker/u1035:0-1580 [080] ..... 128.233101: p9_fd_poll: p9_fd_poll rd poll > kworker/u1035:0-1580 [080] ..... 128.233101: p9_fd_poll: p9_fd_request wr poll > (Hangs) > > -- > Thanks and Regards, > Prateek > ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-25 12:15 ` Oleg Nesterov @ 2025-03-25 12:36 ` Dominique Martinet 2025-03-25 13:04 ` Oleg Nesterov 1 sibling, 0 replies; 48+ messages in thread From: Dominique Martinet @ 2025-03-25 12:36 UTC (permalink / raw) To: Oleg Nesterov Cc: K Prateek Nayak, Eric Van Hensbergen, Latchesar Ionkov, Christian Schoenebeck, Mateusz Guzik, syzbot, brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro, v9fs Thanks for the Cc. Just replying quickly without looking at anything. Oleg Nesterov wrote on Tue, Mar 25, 2025 at 01:15:26PM +0100: > All I can say right now is that the "sigpending" logic in p9_client_rpc() > looks wrong. If nothing else: > > - clear_thread_flag(TIF_SIGPENDING) is not enough, it won't make > signal_pending() false if TIF_NOTIFY_SIGNAL is set. > > - otoh, if signal_pending() was true because of pending SIGKILL, > then after clear_thread_flag(TIF_SIGPENDING) wait_event_killable() > will act as uninterruptible wait_event(). Yeah, this is effectively an unkillable event loop once a flush has been sent; this is a known issue. I've tried to address this with async rpc (so we could send the flush and forget about it), but that caused other regressions and I never had time to dig into these... The patches date back to 2018 and probably won't even apply cleanly anymore, but if anyone cares they are here: https://lore.kernel.org/all/1544532108-21689-3-git-send-email-asmadeus@codewreck.org/T/#u (the hard work of refcounting was done just before that in order to kill this pattern; I just pretty much ran out of free time at that point, hobbies are hard...) So: sorry, it's probably possible to improve this, but it won't be easy or immediate. > > c->trans_mod->request() calls p9_fd_request() in net/9p/trans_fd.c > > which basically does a p9_fd_poll(). > > > > Previously, the above would fail with err as -EIO which would > > cause the client to "Disconnect" and the retry logic would make > > progress. Now however, the err returned is -ERESTARTSYS which > > will not cause a disconnect and the retry logic will hang > > somewhere in p9_client_rpc() later. Now, if you got this far, I think it'll be easier to make whatever changed error out with EIO again instead; I'll try to check the rest of the thread later this week as I didn't follow this thread at all. Thanks, -- Dominique ^ permalink raw reply [flat|nested] 48+ messages in thread
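The flush loop Dominique refers to is the tail of p9_client_rpc(); the truncated debug diff at the end of this section touches the same lines. Roughly, as a paraphrase:

	/* Paraphrased from p9_client_rpc(), net/9p/client.c: once
	 * interrupted, the client synchronously sends a TFLUSH and then
	 * waits again for the original reply -- with TIF_SIGPENDING already
	 * cleared, this second wait is effectively unkillable. */
	if (err == -ERESTARTSYS && c->status == Connected) {
		p9_debug(P9_DEBUG_MUX, "flushing\n");

		if (c->trans_mod->cancel(c, req))
			p9_client_flush(c, req);

		/* if the reply arrived anyway, don't report the error */
		if (READ_ONCE(req->status) == REQ_STATUS_RCVD)
			err = 0;
	}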
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-25 12:15 ` Oleg Nesterov 2025-03-25 12:36 ` Dominique Martinet @ 2025-03-25 13:04 ` Oleg Nesterov 2025-03-25 14:49 ` K Prateek Nayak 1 sibling, 1 reply; 48+ messages in thread From: Oleg Nesterov @ 2025-03-25 13:04 UTC (permalink / raw) To: K Prateek Nayak, Eric Van Hensbergen, Latchesar Ionkov, Dominique Martinet, Christian Schoenebeck Cc: Mateusz Guzik, syzbot, brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro, v9fs On 03/25, K Prateek Nayak wrote: > > I chased this down to p9_client_rpc() net/9p/client.c specifically: > > err = c->trans_mod->request(c, req); > if (err < 0) { > /* write won't happen */ > p9_req_put(c, req); > if (err != -ERESTARTSYS && err != -EFAULT) > c->status = Disconnected; > goto recalc_sigpending; > } > > c->trans_mod->request() calls p9_fd_request() in net/9p/trans_fd.c > which basically does a p9_fd_poll(). Again, I know nothing about 9p... but if p9_fd_request() returns an err < 0, then it comes from p9_conn->err and p9_fd_request() does nothing else. > Previously, the above would fail with err as -EIO which would > cause the client to "Disconnect" and the retry logic would make > progress. Now however, the err returned is -ERESTARTSYS OK... So p9_conn->err = -ERESTARTSYS was set by p9_conn_cancel() called by p9_write_work() because pipe_write() returns ERESTARTSYS? But I don't understand -EIO with the reverted commit aaec5a95d59615 Oleg. ^ permalink raw reply [flat|nested] 48+ messages in thread
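The propagation chain Oleg is asking about, as a sketch paraphrased from memory of net/9p/trans_fd.c; note that pipe_write() being the source of the -ERESTARTSYS is only a hypothesis at this point (Prateek's traces in the next reply pin the -512 on wait_event_killable() instead):

	/* p9_write_work(): the worker pushing request bytes into the pipe fd. */
	err = p9_fd_write(m->client, m->wbuf + m->wpos, m->wsize - m->wpos);
	if (err < 0 && err != -EAGAIN)
		p9_conn_cancel(m, err);	/* a pipe_write() -ERESTARTSYS would land here */

	/* p9_conn_cancel(): latches the error for the whole connection, so
	 * every subsequent p9_fd_request() returns it without touching the
	 * wire. */
	m->err = err;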
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-25 13:04 ` Oleg Nesterov @ 2025-03-25 14:49 ` K Prateek Nayak 2025-03-25 14:58 ` Dominique Martinet 0 siblings, 1 reply; 48+ messages in thread From: K Prateek Nayak @ 2025-03-25 14:49 UTC (permalink / raw) To: Oleg Nesterov, Eric Van Hensbergen, Latchesar Ionkov, Dominique Martinet, Christian Schoenebeck Cc: Mateusz Guzik, syzbot, brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro, v9fs [-- Attachment #1: Type: text/plain, Size: 12265 bytes --] Hello Oleg, Dominique, On 3/25/2025 6:34 PM, Oleg Nesterov wrote: > On 03/25, K Prateek Nayak wrote: >> >> I chased this down to p9_client_rpc() net/9p/client.c specifically: >> >> err = c->trans_mod->request(c, req); >> if (err < 0) { >> /* write won't happen */ >> p9_req_put(c, req); >> if (err != -ERESTARTSYS && err != -EFAULT) >> c->status = Disconnected; >> goto recalc_sigpending; >> } >> >> c->trans_mod->request() calls p9_fd_request() in net/9p/trans_fd.c >> which basically does a p9_fd_poll(). > > Again, I know nothing about 9p... but if p9_fd_request() returns > an err < 0, then it comes from p9_conn->err and p9_fd_request() > does nothing else. > >> Previously, the above would fail with err as -EIO which would >> cause the client to "Disconnect" and the retry logic would make >> progress. Now however, the err returned is -ERESTARTSYS > > OK... So p9_conn->err = -ERESTARTSYS was set by p9_conn_cancel() > called by p9_write_work() because pipe_write() returns ERESTARTSYS? > > But I don't understand -EIO with the reverted commit aaec5a95d59615 Okay it is a long explanation. These are the relevant bits of traces with aaec5a95d59615 reverted on mainline and I'll break it down to the best of my abilities: kworker/100:1-1803 [100] ..... 286.618784: p9_write_work: Data write wait 32770 kworker/100:1-1803 [100] ..... 286.618792: p9_write_work: Data written 28672 kworker/100:1-1803 [100] ..... 286.618793: p9_fd_poll: p9_fd_poll rd poll kworker/100:1-1803 [100] ..... 286.618793: p9_fd_poll: p9_fd_request wr poll kworker/100:1-1803 [100] ..... 286.618793: p9_fd_poll: p9_fd_poll rd poll kworker/100:1-1803 [100] ..... 286.618793: p9_fd_poll: p9_fd_request wr poll kworker/100:1-1803 [100] ..... 286.618795: p9_read_work: Data read wait 7 kworker/100:1-1803 [100] ..... 286.618796: p9_read_work: Data read 7 kworker/100:1-1803 [100] ..... 286.618796: p9_fd_poll: p9_fd_poll rd poll kworker/100:1-1803 [100] ..... 286.618796: p9_fd_poll: p9_fd_request wr poll kworker/100:1-1803 [100] ..... 286.618797: p9_fd_poll: p9_fd_poll rd poll kworker/100:1-1803 [100] ..... 286.618797: p9_fd_poll: p9_fd_request wr poll kworker/100:1-1803 [100] ..... 286.618797: p9_read_work: Data read wait 55 kworker/100:1-1803 [100] ..... 286.618798: p9_read_work: Data read 55 repro-4102 [100] ..... 286.618802: p9_client_write: p9_client_rpc done repro-4102 [100] ..... 286.618804: p9_client_write: p9_pdup repro-4102 [100] ..... 286.618804: p9_client_write: iter revert ==== the above bits are common between the two === repro-4102 [100] ..... 286.618805: p9_client_write: p9_client_rpc repro-4102 [100] ..... 286.618817: p9_fd_request: p9_fd_request repro-4102 [100] ..... 286.618818: p9_fd_request: p9_fd_request EPOLL repro-4102 [100] ..... 286.618818: p9_fd_poll: p9_fd_poll rd poll repro-4102 [100] ..... 286.618818: p9_fd_poll: p9_fd_request wr poll kworker/100:1-1803 [100] ..... 286.618822: p9_fd_poll: p9_fd_poll rd poll kworker/100:1-1803 [100] ..... 
286.618822: p9_fd_poll: p9_fd_request wr poll kworker/100:1-1803 [100] ..... 286.618823: p9_read_work: Data read wait 7 kworker/100:1-1803 [100] ..... 286.618825: p9_read_work: Data read 7 kworker/100:1-1803 [100] ..... 286.618825: p9_read_work: p9_tag_lookup error ^^^^^^^^^^^^^^^^^^^ ==== In this case the read succeeds but p9_tag_lookup() later will error out ==== ==== Connection is cancelled with m->err as -EIO ==== kworker/100:1-1803 [100] ..... 286.618825: p9_read_work: Connection cancel kworker/100:1-1803 [100] ...1. 286.618826: p9_conn_cancel: Connection cancelled (-5) repro-4102 [100] ..... 286.618831: p9_client_write: p9_client_rpc done repro-4102 [100] ..... 286.618832: p9_client_write: p9_client_rpc error (-5) ^^^ ==== This error from kworker/100 is received by the reproducer ==== repro-4102 [100] ..... 286.618833: v9fs_issue_write: Issue write done 2 err(-5) repro-4102 [100] ..... 286.618833: netfs_write_subrequest_terminated: Collector woken up from netfs_write_subrequest_terminated repro-4102 [100] ..... 286.618834: netfs_wake_write_collector: Wake collector repro-4102 [100] ..... 286.618834: netfs_wake_write_collector: Queuing collector work repro-4102 [100] ..... 286.618837: v9fs_issue_write: Issue write subrequest terminated 2 repro-4102 [100] ..... 286.618838: netfs_unbuffered_write: netfs_unbuffered_write repro-4102 [100] ..... 286.618838: netfs_end_issue_write: netfs_end_issue_write repro-4102 [100] ..... 286.618838: netfs_end_issue_write: Write collector need poke 0 repro-4102 [100] ..... 286.618839: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! ==== A reissue mechanism triggers below ==== kworker/u1037:2-3232 [096] ..... 286.618845: netfs_retry_writes: netfs_reissue_write 1 kworker/u1037:2-3232 [096] ..... 286.618846: v9fs_issue_write: v9fs_issue_write kworker/u1037:2-3232 [096] ..... 286.618847: p9_client_write: p9_client_rpc kworker/u1037:2-3232 [096] ..... 286.618873: p9_fd_request: p9_fd_request kworker/u1037:2-3232 [096] ...1. 286.618874: p9_fd_request: p9_fd_request error kworker/u1037:2-3232 [096] ..... 286.618874: p9_client_rpc: Client disconnected (no write) ==== Connection with client is disconnected ==== kworker/u1037:2-3232 [096] ..... 286.618877: p9_client_write: p9_client_rpc done kworker/u1037:2-3232 [096] ..... 286.618880: p9_client_write: p9_client_rpc error (-5) kworker/u1037:2-3232 [096] ..... 286.618881: v9fs_issue_write: Issue write done 0 err(-5) kworker/u1037:2-3232 [096] ..... 286.618882: netfs_write_subrequest_terminated: Collector woken up from netfs_write_subrequest_terminated kworker/u1037:2-3232 [096] ..... 286.618882: netfs_wake_write_collector: Wake collector kworker/u1037:2-3232 [096] ..... 286.618882: netfs_wake_write_collector: Queuing collector work kworker/u1037:2-3232 [096] ..... 286.618887: v9fs_issue_write: Issue write subrequest terminated 0 kworker/u1037:2-3232 [096] ..... 286.618890: netfs_write_collection_worker: Write collect clearing and waking up! ==== The retry worker will clear the NETFS_RREQ_IN_PROGRESS bit and issue a wakeup ==== repro-4102 [100] ..... 286.619051: p9_client_prepare_req: p9_client_prepare_req eio 1 repro-4102 [100] ..... 286.619052: p9_client_rpc: p9_client_rpc early err return repro-4108 [240] ..... 286.621325: p9_fd_create: Client connected ==== The reproducer will then re-establish the connection and the pattern repeats ==== --- So the pattern is with the optimization reverted is: o The RPC thread reads "7" bytes. o p9_tag_lookup() fails giving -EIO. 
o connection is cancelled from rpc thread. o The reproducer thread receives the packet and finds -EIO. o The reproducer starts a Disconnect. o The reproducer wakes up collector and sleeps on bit NETFS_RREQ_IN_PROGRESS o A retry worker wakes up. o Finds connection has been disconnected o Wakes up the reproducer thread. o Reproducer wakes up and establishes connection. I don't know 9p bits enough to understand the intermediates steps but this is the rough flow. --- Now for the traces on vanilla mainline: kworker/31:1-1723 [031] ..... 95.011362: p9_write_work: Data write wait 32770 kworker/31:1-1723 [031] ..... 95.011372: p9_write_work: Data written 28672 kworker/31:1-1723 [031] ..... 95.011373: p9_fd_poll: p9_fd_poll rd poll kworker/31:1-1723 [031] ..... 95.011373: p9_fd_poll: p9_fd_request wr poll kworker/31:1-1723 [031] ..... 95.011374: p9_fd_poll: p9_fd_poll rd poll kworker/31:1-1723 [031] ..... 95.011374: p9_fd_poll: p9_fd_request wr poll kworker/31:1-1723 [031] ..... 95.011374: p9_read_work: Data read wait 7 kworker/31:1-1723 [031] ..... 95.011375: p9_read_work: Data read 7 kworker/31:1-1723 [031] ..... 95.011375: p9_fd_poll: p9_fd_poll rd poll kworker/31:1-1723 [031] ..... 95.011375: p9_fd_poll: p9_fd_request wr poll kworker/31:1-1723 [031] ..... 95.011376: p9_read_work: Data read wait 55 kworker/31:1-1723 [031] ..... 95.011376: p9_read_work: Data read 55 repro-4076 [031] ..... 95.011381: p9_client_rpc: Wait event killable (0) repro-4076 [031] ..... 95.011382: p9_client_rpc: Check error (0) repro-4076 [031] ..... 95.011382: p9_client_write: p9_client_rpc done repro-4076 [031] ..... 95.011382: p9_client_write: p9_pdup repro-4076 [031] ..... 95.011382: p9_client_write: iter revert ==== Above bits are same as the previous traces ==== repro-4076 [031] ..... 95.011383: p9_client_write: p9_client_rpc repro-4076 [031] ..... 95.011393: p9_fd_request: p9_fd_request repro-4076 [031] ..... 95.011394: p9_fd_request: p9_fd_request EPOLL repro-4076 [031] ..... 95.011394: p9_fd_poll: p9_fd_poll rd poll repro-4076 [031] ..... 95.011394: p9_fd_poll: p9_fd_request wr poll repro-4076 [031] ..... 99.731970: p9_client_rpc: Wait event killable (-512) ==== The wait event here fails with -ERESTARTSYS ==== repro-4076 [031] ..... 99.731979: p9_client_rpc: Flushed (-512) repro-4076 [031] ..... 99.731983: p9_client_rpc: Final error (-512) repro-4076 [031] ..... 99.731983: p9_client_write: p9_client_rpc done repro-4076 [031] ..... 99.731984: p9_client_write: p9_client_rpc error (-512) repro-4076 [031] ..... 99.731985: v9fs_issue_write: Issue write done 2 err(-512) repro-4076 [031] ..... 99.731987: netfs_write_subrequest_terminated: Collector woken up from netfs_write_subrequest_terminated repro-4076 [031] ..... 99.731987: netfs_wake_write_collector: Wake collector repro-4076 [031] ..... 99.731988: netfs_wake_write_collector: Queuing collector work repro-4076 [031] ..... 99.731991: v9fs_issue_write: Issue write subrequest terminated 2 repro-4076 [031] ..... 99.731991: netfs_unbuffered_write: netfs_unbuffered_write repro-4076 [031] ..... 99.731991: netfs_end_issue_write: netfs_end_issue_write repro-4076 [031] ..... 99.731992: netfs_end_issue_write: Write collector need poke 0 repro-4076 [031] ..... 99.731993: netfs_unbuffered_write_iter_locked: Waiting on NETFS_RREQ_IN_PROGRESS! ==== The reproducer waits on bit NETFS_RREQ_IN_PROGRESS ==== kworker/u1028:3-3860 [030] ..... 99.732028: netfs_retry_writes: netfs_reissue_write 1 kworker/u1028:3-3860 [030] ..... 
99.732030: v9fs_issue_write: v9fs_issue_write kworker/u1028:3-3860 [030] ..... 99.732031: p9_client_write: p9_client_rpc kworker/u1028:3-3860 [030] ..... 99.732051: p9_fd_request: p9_fd_request kworker/u1028:3-3860 [030] ..... 99.732052: p9_fd_request: p9_fd_request EPOLL kworker/u1028:3-3860 [030] ..... 99.732052: p9_fd_poll: p9_fd_poll rd poll kworker/u1028:3-3860 [030] ..... 99.732053: p9_fd_poll: p9_fd_request wr poll ==== The retry worker seemingly gets stuck post p9_fd_poll() waiting for wakeup ==== --- That is my analysis so far. > > Oleg. > -- Thanks and Regards, Prateek [-- Attachment #2: debug.diff --] [-- Type: text/plain, Size: 19614 bytes --] diff --git a/fs/9p/vfs_addr.c b/fs/9p/vfs_addr.c index 32619d146cbc..3fbfc6882c35 100644 --- a/fs/9p/vfs_addr.c +++ b/fs/9p/vfs_addr.c @@ -56,10 +56,13 @@ static void v9fs_issue_write(struct netfs_io_subrequest *subreq) struct p9_fid *fid = subreq->rreq->netfs_priv; int err, len; + trace_printk("v9fs_issue_write\n"); len = p9_client_write(fid, subreq->start, &subreq->io_iter, &err); + trace_printk("Issue write done %d err(%d)\n", len, err); if (len > 0) __set_bit(NETFS_SREQ_MADE_PROGRESS, &subreq->flags); netfs_write_subrequest_terminated(subreq, len ?: err, false); + trace_printk("Issue write subrequest terminated %d\n", len); } /** @@ -74,6 +77,7 @@ static void v9fs_issue_read(struct netfs_io_subrequest *subreq) int total, err; total = p9_client_read(fid, pos, &subreq->io_iter, &err); + trace_printk("v9fs_issue_read %d err(%d)\n", total, err); /* if we just extended the file size, any portion not in * cache won't be on server and is zeroes */ diff --git a/fs/netfs/direct_write.c b/fs/netfs/direct_write.c index 42ce53cc216e..6333b8bb4008 100644 --- a/fs/netfs/direct_write.c +++ b/fs/netfs/direct_write.c @@ -105,6 +105,7 @@ ssize_t netfs_unbuffered_write_iter_locked(struct kiocb *iocb, struct iov_iter * if (!async) { trace_netfs_rreq(wreq, netfs_rreq_trace_wait_ip); + trace_printk("Waiting on NETFS_RREQ_IN_PROGRESS!\n"); wait_on_bit(&wreq->flags, NETFS_RREQ_IN_PROGRESS, TASK_UNINTERRUPTIBLE); ret = wreq->error; diff --git a/fs/netfs/read_collect.c b/fs/netfs/read_collect.c index 23c75755ad4e..a609063db8a7 100644 --- a/fs/netfs/read_collect.c +++ b/fs/netfs/read_collect.c @@ -446,6 +446,7 @@ static void netfs_read_collection(struct netfs_io_request *rreq) task_io_account_read(rreq->transferred); trace_netfs_rreq(rreq, netfs_rreq_trace_wake_ip); + trace_printk("Read collect clearing and waking up!\n"); clear_and_wake_up_bit(NETFS_RREQ_IN_PROGRESS, &rreq->flags); trace_netfs_rreq(rreq, netfs_rreq_trace_done); diff --git a/fs/netfs/write_collect.c b/fs/netfs/write_collect.c index 3fca59e6475d..a54aa2bcc786 100644 --- a/fs/netfs/write_collect.c +++ b/fs/netfs/write_collect.c @@ -429,6 +429,7 @@ void netfs_write_collection_worker(struct work_struct *work) _debug("finished"); trace_netfs_rreq(wreq, netfs_rreq_trace_wake_ip); + trace_printk("Write collect clearing and waking up!\n"); clear_and_wake_up_bit(NETFS_RREQ_IN_PROGRESS, &wreq->flags); if (wreq->iocb) { @@ -449,10 +450,14 @@ void netfs_write_collection_worker(struct work_struct *work) */ void netfs_wake_write_collector(struct netfs_io_request *wreq, bool was_async) { + trace_printk("Wake collector\n"); if (!work_pending(&wreq->work)) { netfs_get_request(wreq, netfs_rreq_trace_get_work); - if (!queue_work(system_unbound_wq, &wreq->work)) + trace_printk("Queuing collector work\n"); + if (!queue_work(system_unbound_wq, &wreq->work)) { + trace_printk("Put request\n"); netfs_put_request(wreq, 
was_async, netfs_rreq_trace_put_work_nq); + } } } @@ -542,8 +547,10 @@ void netfs_write_subrequest_terminated(void *_op, ssize_t transferred_or_error, /* If we are at the head of the queue, wake up the collector, * transferring a ref to it if we were the ones to do so. */ - if (list_is_first(&subreq->rreq_link, &stream->subrequests)) + if (list_is_first(&subreq->rreq_link, &stream->subrequests)) { + trace_printk("Collector woken up from netfs_write_subrequest_terminated\n"); netfs_wake_write_collector(wreq, was_async); + } netfs_put_subrequest(subreq, was_async, netfs_sreq_trace_put_terminated); } diff --git a/fs/netfs/write_issue.c b/fs/netfs/write_issue.c index 77279fc5b5a7..22a4b793e789 100644 --- a/fs/netfs/write_issue.c +++ b/fs/netfs/write_issue.c @@ -232,8 +232,10 @@ static void netfs_do_issue_write(struct netfs_io_stream *stream, _enter("R=%x[%x],%zx", wreq->debug_id, subreq->debug_index, subreq->len); - if (test_bit(NETFS_SREQ_FAILED, &subreq->flags)) + if (test_bit(NETFS_SREQ_FAILED, &subreq->flags)) { + trace_printk("netfs_do_issue_write failed!\n"); return netfs_write_subrequest_terminated(subreq, subreq->error, false); + } trace_netfs_sreq(subreq, netfs_sreq_trace_submit); stream->issue_write(subreq); @@ -264,6 +266,7 @@ void netfs_issue_write(struct netfs_io_request *wreq, if (!subreq) return; + trace_printk("netfs_issue_write!\n"); stream->construct = NULL; subreq->io_iter.count = subreq->len; netfs_do_issue_write(stream, subreq); @@ -290,6 +293,7 @@ size_t netfs_advance_write(struct netfs_io_request *wreq, _enter("R=%x[%x]", wreq->debug_id, subreq ? subreq->debug_index : 0); if (subreq && start != subreq->start + subreq->len) { + trace_printk("netfs_issue_write from advance write!\n"); netfs_issue_write(wreq, stream); subreq = NULL; } @@ -307,6 +311,7 @@ size_t netfs_advance_write(struct netfs_io_request *wreq, if (subreq->len >= stream->sreq_max_len || subreq->nr_segs >= stream->sreq_max_segs || to_eof) { + trace_printk("netfs_issue_write from advance write! 2\n"); netfs_issue_write(wreq, stream); subreq = NULL; } @@ -397,6 +402,7 @@ static int netfs_write_folio(struct netfs_io_request *wreq, * when they were read. Note that these appear as a special * write-back group. */ + trace_printk("netfs_issue_write in folio copy\n"); if (fgroup == NETFS_FOLIO_COPY_TO_CACHE) { netfs_issue_write(wreq, upload); } else if (fgroup != wreq->group) { @@ -526,6 +532,7 @@ static int netfs_write_folio(struct netfs_io_request *wreq, */ static void netfs_end_issue_write(struct netfs_io_request *wreq) { + trace_printk("netfs_end_issue_write\n"); bool needs_poke = true; smp_wmb(); /* Write subreq lists before ALL_QUEUED. */ @@ -541,6 +548,7 @@ static void netfs_end_issue_write(struct netfs_io_request *wreq) netfs_issue_write(wreq, stream); } + trace_printk("Write collector need poke %d\n", needs_poke); if (needs_poke) netfs_wake_write_collector(wreq, false); } @@ -730,6 +738,7 @@ int netfs_unbuffered_write(struct netfs_io_request *wreq, bool may_wait, size_t break; } + trace_printk("netfs_unbuffered_write\n"); netfs_end_issue_write(wreq); _leave(" = %d", error); return error; @@ -908,6 +917,7 @@ int netfs_writeback_single(struct address_space *mapping, } stop: + trace_printk("netfs_issue_write writeback single\n"); for (int s = 0; s < NR_IO_STREAMS; s++) netfs_issue_write(wreq, &wreq->io_streams[s]); smp_wmb(); /* Write lists before ALL_QUEUED. 
*/ diff --git a/fs/netfs/write_retry.c b/fs/netfs/write_retry.c index 545d33079a77..60a03079766b 100644 --- a/fs/netfs/write_retry.c +++ b/fs/netfs/write_retry.c @@ -43,6 +43,7 @@ static void netfs_retry_write_stream(struct netfs_io_request *wreq, iov_iter_revert(&source, subreq->len - source.count); netfs_get_subrequest(subreq, netfs_sreq_trace_get_resubmit); + trace_printk("netfs_reissue_write 1\n"); netfs_reissue_write(stream, subreq, &source); } } @@ -116,6 +117,7 @@ static void netfs_retry_write_stream(struct netfs_io_request *wreq, boundary = true; netfs_get_subrequest(subreq, netfs_sreq_trace_get_resubmit); + trace_printk("netfs_reissue_write 2\n"); netfs_reissue_write(stream, subreq, &source); if (subreq == to) break; @@ -183,6 +185,7 @@ static void netfs_retry_write_stream(struct netfs_io_request *wreq, boundary = false; } + trace_printk("netfs_reissue_write 3\n"); netfs_reissue_write(stream, subreq, &source); if (!len) break; diff --git a/net/9p/client.c b/net/9p/client.c index 09f8ced9f8bb..909c83b0e015 100644 --- a/net/9p/client.c +++ b/net/9p/client.c @@ -633,12 +633,16 @@ static struct p9_req_t *p9_client_prepare_req(struct p9_client *c, p9_debug(P9_DEBUG_MUX, "client %p op %d\n", c, type); /* we allow for any status other than disconnected */ - if (c->status == Disconnected) + if (c->status == Disconnected) { + trace_printk("p9_client_prepare_req eio 1\n"); return ERR_PTR(-EIO); + } /* if status is begin_disconnected we allow only clunk request */ - if (c->status == BeginDisconnect && type != P9_TCLUNK) + if (c->status == BeginDisconnect && type != P9_TCLUNK) { + trace_printk("p9_client_prepare_req eio 2\n"); return ERR_PTR(-EIO); + } va_copy(apc, ap); req = p9_tag_alloc(c, type, t_size, r_size, fmt, apc); @@ -690,8 +694,10 @@ p9_client_rpc(struct p9_client *c, int8_t type, const char *fmt, ...) va_start(ap, fmt); req = p9_client_prepare_req(c, type, tsize, rsize, fmt, ap); va_end(ap); - if (IS_ERR(req)) + if (IS_ERR(req)) { + trace_printk("p9_client_rpc early err return\n"); return req; + } req->tc.zc = false; req->rc.zc = false; @@ -706,9 +712,12 @@ p9_client_rpc(struct p9_client *c, int8_t type, const char *fmt, ...) err = c->trans_mod->request(c, req); if (err < 0) { /* write won't happen */ + trace_printk("write won't happnen (%d)\n", err); p9_req_put(c, req); - if (err != -ERESTARTSYS && err != -EFAULT) + if (err != -ERESTARTSYS && err != -EFAULT) { + trace_printk("Client disconnected (no write)\n"); c->status = Disconnected; + } goto recalc_sigpending; } again: @@ -721,6 +730,7 @@ p9_client_rpc(struct p9_client *c, int8_t type, const char *fmt, ...) */ smp_rmb(); + trace_printk("Wait event killable (%d) (%d) (%d) (%d)\n", err, READ_ONCE(req->status), c->status, type); if (err == -ERESTARTSYS && c->status == Connected && type == P9_TFLUSH) { sigpending = 1; @@ -731,6 +741,7 @@ p9_client_rpc(struct p9_client *c, int8_t type, const char *fmt, ...) if (READ_ONCE(req->status) == REQ_STATUS_ERROR) { p9_debug(P9_DEBUG_ERROR, "req_status error %d\n", req->t_err); err = req->t_err; + trace_printk("REQ_STATUS_ERROR (%d)\n", err); } if (err == -ERESTARTSYS && c->status == Connected) { p9_debug(P9_DEBUG_MUX, "flushing\n"); @@ -743,6 +754,8 @@ p9_client_rpc(struct p9_client *c, int8_t type, const char *fmt, ...) 
/* if we received the response anyway, don't signal error */ if (READ_ONCE(req->status) == REQ_STATUS_RCVD) err = 0; + + trace_printk("Flushed (%d)\n", err); } recalc_sigpending: if (sigpending) { @@ -754,11 +767,13 @@ p9_client_rpc(struct p9_client *c, int8_t type, const char *fmt, ...) goto reterr; err = p9_check_errors(c, req); + trace_printk("Check error (%d)\n", err); trace_9p_client_res(c, type, req->rc.tag, err); if (!err) return req; reterr: p9_req_put(c, req); + trace_printk("Final error (%d)\n", err); return ERR_PTR(safe_errno(err)); } @@ -808,8 +823,10 @@ static struct p9_req_t *p9_client_zc_rpc(struct p9_client *c, int8_t type, err = c->trans_mod->zc_request(c, req, uidata, uodata, inlen, olen, in_hdrlen); if (err < 0) { - if (err == -EIO) + if (err == -EIO) { + trace_printk("Client disconnected (zero_copy)\n"); c->status = Disconnected; + } if (err != -ERESTARTSYS) goto recalc_sigpending; } @@ -1093,6 +1110,7 @@ EXPORT_SYMBOL(p9_client_destroy); void p9_client_disconnect(struct p9_client *clnt) { p9_debug(P9_DEBUG_9P, "clnt %p\n", clnt); + trace_printk("Client disconnected\n"); clnt->status = Disconnected; } EXPORT_SYMBOL(p9_client_disconnect); @@ -1100,6 +1118,7 @@ EXPORT_SYMBOL(p9_client_disconnect); void p9_client_begin_disconnect(struct p9_client *clnt) { p9_debug(P9_DEBUG_9P, "clnt %p\n", clnt); + trace_printk("Client begin disconnect\n"); clnt->status = BeginDisconnect; } EXPORT_SYMBOL(p9_client_begin_disconnect); @@ -1572,11 +1591,13 @@ p9_client_read_once(struct p9_fid *fid, u64 offset, struct iov_iter *to, offset, rsize); } else { non_zc = 1; + trace_printk("RPC issue\n"); req = p9_client_rpc(clnt, P9_TREAD, "dqd", fid->fid, offset, rsize); } if (IS_ERR(req)) { *err = PTR_ERR(req); + trace_printk("read RPC issue error(%d)\n", *err); if (!non_zc) iov_iter_revert(to, count - iov_iter_count(to)); return 0; @@ -1587,12 +1608,14 @@ p9_client_read_once(struct p9_fid *fid, u64 offset, struct iov_iter *to, if (*err) { if (!non_zc) iov_iter_revert(to, count - iov_iter_count(to)); + trace_printk("p9pdu_readf error(%d)\n", *err); trace_9p_protocol_dump(clnt, &req->rc); p9_req_put(clnt, req); return 0; } if (rsize < received) { pr_err("bogus RREAD count (%d > %d)\n", received, rsize); + trace_printk("bogus read\n"); received = rsize; } @@ -1602,6 +1625,7 @@ p9_client_read_once(struct p9_fid *fid, u64 offset, struct iov_iter *to, int n = copy_to_iter(dataptr, received, to); if (n != received) { + trace_printk("copy_to_iter fault\n"); *err = -EFAULT; p9_req_put(clnt, req); return n; @@ -1642,23 +1666,29 @@ p9_client_write(struct p9_fid *fid, u64 offset, struct iov_iter *from, int *err) rsize, P9_ZC_HDR_SZ, "dqd", fid->fid, offset, rsize); } else { + trace_printk("p9_client_rpc\n"); req = p9_client_rpc(clnt, P9_TWRITE, "dqV", fid->fid, offset, rsize, from); + trace_printk("p9_client_rpc done\n"); } if (IS_ERR(req)) { iov_iter_revert(from, count - iov_iter_count(from)); *err = PTR_ERR(req); + trace_printk("p9_client_rpc error (%d)\n", *err); break; } + trace_printk("p9_pdup\n"); *err = p9pdu_readf(&req->rc, clnt->proto_version, "d", &written); if (*err) { + trace_printk("p9_pdup err(%d)\n", *err); iov_iter_revert(from, count - iov_iter_count(from)); trace_9p_protocol_dump(clnt, &req->rc); p9_req_put(clnt, req); break; } if (rsize < written) { + trace_printk("p9_pdup bogus %d %d\n", written, rsize); pr_err("bogus RWRITE count (%d > %d)\n", written, rsize); written = rsize; } @@ -1666,6 +1696,7 @@ p9_client_write(struct p9_fid *fid, u64 offset, struct iov_iter *from, int *err) 
p9_debug(P9_DEBUG_9P, "<<< RWRITE count %d\n", written); p9_req_put(clnt, req); + trace_printk("iter revert\n"); iov_iter_revert(from, count - written - iov_iter_count(from)); total += written; offset += written; diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c index 196060dc6138..3c83b3855e82 100644 --- a/net/9p/trans_fd.c +++ b/net/9p/trans_fd.c @@ -196,6 +196,7 @@ static void p9_conn_cancel(struct p9_conn *m, int err) return; } + trace_printk("Connection cancelled (%d)\n", err); m->err = err; list_for_each_entry_safe(req, rtmp, &m->req_list, req_list) { @@ -233,9 +234,12 @@ p9_fd_poll(struct p9_client *client, struct poll_table_struct *pt, int *err) return EPOLLERR; } + trace_printk("p9_fd_poll rd poll\n"); ret = vfs_poll(ts->rd, pt); - if (ts->rd != ts->wr) + if (ts->rd != ts->wr) { + trace_printk("p9_fd_request wr poll\n"); ret = (ret & ~EPOLLOUT) | (vfs_poll(ts->wr, pt) & ~EPOLLIN); + } return ret; } @@ -263,9 +267,13 @@ static int p9_fd_read(struct p9_client *client, void *v, int len) p9_debug(P9_DEBUG_ERROR, "blocking read ...\n"); pos = ts->rd->f_pos; + trace_printk("Data read wait %d\n", len); ret = kernel_read(ts->rd, v, len, &pos); - if (ret <= 0 && ret != -ERESTARTSYS && ret != -EAGAIN) + trace_printk("Data read %d\n", ret); + if (ret <= 0 && ret != -ERESTARTSYS && ret != -EAGAIN) { + trace_printk("Client disconnected (fd_read)\n"); client->status = Disconnected; + } return ret; } @@ -330,6 +338,7 @@ static void p9_read_work(struct work_struct *work) if (!m->rreq || (m->rreq->status != REQ_STATUS_SENT)) { p9_debug(P9_DEBUG_ERROR, "Unexpected packet tag %d\n", m->rc.tag); + trace_printk("p9_tag_lookup error\n"); err = -EIO; goto error; } @@ -338,6 +347,7 @@ static void p9_read_work(struct work_struct *work) p9_debug(P9_DEBUG_ERROR, "requested packet size too big: %d for tag %d with capacity %zd\n", m->rc.size, m->rc.tag, m->rreq->rc.capacity); + trace_printk("capacity error\n"); err = -EIO; goto error; } @@ -348,6 +358,7 @@ static void p9_read_work(struct work_struct *work) m->rc.tag, m->rreq); p9_req_put(m->client, m->rreq); m->rreq = NULL; + trace_printk("sdata error\n"); err = -EIO; goto error; } @@ -375,6 +386,7 @@ static void p9_read_work(struct work_struct *work) p9_debug(P9_DEBUG_ERROR, "Request tag %d errored out while we were reading the reply\n", m->rc.tag); + trace_printk("request tag error %d status(%d)\n", m->rc.tag, m->rreq->status); err = -EIO; goto error; } @@ -403,6 +415,7 @@ static void p9_read_work(struct work_struct *work) return; error: + trace_printk("Connection cancel\n"); p9_conn_cancel(m, err); clear_bit(Rworksched, &m->wsched); } @@ -429,9 +442,13 @@ static int p9_fd_write(struct p9_client *client, void *v, int len) if (!(ts->wr->f_flags & O_NONBLOCK)) p9_debug(P9_DEBUG_ERROR, "blocking write ...\n"); + trace_printk("Data write wait %d\n",len); ret = kernel_write(ts->wr, v, len, &ts->wr->f_pos); - if (ret <= 0 && ret != -ERESTARTSYS && ret != -EAGAIN) + trace_printk("Data written %ld\n", ret); + if (ret <= 0 && ret != -ERESTARTSYS && ret != -EAGAIN) { + trace_printk("Client disconnected (fd_write)\n"); client->status = Disconnected; + } return ret; } @@ -519,6 +536,7 @@ static void p9_write_work(struct work_struct *work) return; error: + trace_printk("Connection cancel\n"); p9_conn_cancel(m, err); clear_bit(Wworksched, &m->wsched); } @@ -628,6 +646,7 @@ static void p9_poll_mux(struct p9_conn *m) n = p9_fd_poll(m->client, NULL, &err); if (n & (EPOLLERR | EPOLLHUP | EPOLLNVAL)) { p9_debug(P9_DEBUG_TRANS, "error mux %p err %d\n", m, n); + 
trace_printk("Connection cancel\n"); p9_conn_cancel(m, err); } @@ -668,12 +687,14 @@ static int p9_fd_request(struct p9_client *client, struct p9_req_t *req) struct p9_trans_fd *ts = client->trans; struct p9_conn *m = &ts->conn; + trace_printk("p9_fd_request\n"); p9_debug(P9_DEBUG_TRANS, "mux %p task %p tcall %p id %d\n", m, current, &req->tc, req->tc.id); spin_lock(&m->req_lock); if (m->err < 0) { + trace_printk("p9_fd_request error\n"); spin_unlock(&m->req_lock); return m->err; } @@ -682,13 +703,16 @@ static int p9_fd_request(struct p9_client *client, struct p9_req_t *req) list_add_tail(&req->req_list, &m->unsent_req_list); spin_unlock(&m->req_lock); + trace_printk("p9_fd_request EPOLL\n"); if (test_and_clear_bit(Wpending, &m->wsched)) n = EPOLLOUT; else n = p9_fd_poll(m->client, NULL, NULL); - if (n & EPOLLOUT && !test_and_set_bit(Wworksched, &m->wsched)) + if (n & EPOLLOUT && !test_and_set_bit(Wworksched, &m->wsched)) { + trace_printk("p9_fd_request schedule work\n"); schedule_work(&m->wq); + } return 0; } @@ -852,6 +876,7 @@ static int p9_fd_open(struct p9_client *client, int rfd, int wfd) data_race(ts->wr->f_flags |= O_NONBLOCK); client->trans = ts; + trace_printk("Client connected\n"); client->status = Connected; return 0; @@ -889,6 +914,7 @@ static int p9_socket_open(struct p9_client *client, struct socket *csocket) get_file(file); p->wr = p->rd = file; client->trans = p; + trace_printk("Client connected\n"); client->status = Connected; p->rd->f_flags |= O_NONBLOCK; @@ -920,6 +946,7 @@ static void p9_conn_destroy(struct p9_conn *m) m->wreq = NULL; } + trace_printk("Connection destroy\n"); p9_conn_cancel(m, -ECONNRESET); m->client = NULL; @@ -942,6 +969,7 @@ static void p9_fd_close(struct p9_client *client) if (!ts) return; + trace_printk("Client disconnected (fd_close)\n"); client->status = Disconnected; p9_conn_destroy(&ts->conn); ^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-25 14:49 ` K Prateek Nayak @ 2025-03-25 14:58 ` Dominique Martinet 2025-03-26 12:19 ` Oleg Nesterov 2025-03-27 17:46 ` K Prateek Nayak 0 siblings, 2 replies; 48+ messages in thread From: Dominique Martinet @ 2025-03-25 14:58 UTC (permalink / raw) To: K Prateek Nayak Cc: Oleg Nesterov, Eric Van Hensbergen, Latchesar Ionkov, Christian Schoenebeck, Mateusz Guzik, syzbot, brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro, v9fs Thanks for the traces. w/ revert K Prateek Nayak wrote on Tue, Mar 25, 2025 at 08:19:26PM +0530: > kworker/100:1-1803 [100] ..... 286.618822: p9_fd_poll: p9_fd_poll rd poll > kworker/100:1-1803 [100] ..... 286.618822: p9_fd_poll: p9_fd_request wr poll > kworker/100:1-1803 [100] ..... 286.618823: p9_read_work: Data read wait 7 new behavior > repro-4076 [031] ..... 95.011394: p9_fd_poll: p9_fd_poll rd poll > repro-4076 [031] ..... 95.011394: p9_fd_poll: p9_fd_request wr poll > repro-4076 [031] ..... 99.731970: p9_client_rpc: Wait event killable (-512) For me the problem isn't so much that this gets ERESTARTSYS but that it never gets to read the 7 bytes that are available? If the repro has already written the bytes in both cases then there's no reason to wait 5 seconds here... OTOH syzbot 9p code is silly and might have been depending on something that's not supposed to work, e.g. they might be missing a flush or equivalent for all I know (I still haven't looked at the repro) -- Dominique ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-25 14:58 ` Dominique Martinet @ 2025-03-26 12:19 ` Oleg Nesterov 2025-03-26 12:44 ` Oleg Nesterov 1 sibling, 1 reply; 48+ messages in thread From: Oleg Nesterov @ 2025-03-26 12:19 UTC (permalink / raw) To: Dominique Martinet Cc: K Prateek Nayak, Eric Van Hensbergen, Latchesar Ionkov, Christian Schoenebeck, Mateusz Guzik, syzbot, brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro, v9fs On 03/25, Dominique Martinet wrote: > > Thanks for the traces. > > w/ revert > K Prateek Nayak wrote on Tue, Mar 25, 2025 at 08:19:26PM +0530: > > kworker/100:1-1803 [100] ..... 286.618822: p9_fd_poll: p9_fd_poll rd poll > > kworker/100:1-1803 [100] ..... 286.618822: p9_fd_poll: p9_fd_request wr poll > > kworker/100:1-1803 [100] ..... 286.618823: p9_read_work: Data read wait 7 > > new behavior > > repro-4076 [031] ..... 95.011394: p9_fd_poll: p9_fd_poll rd poll > > repro-4076 [031] ..... 95.011394: p9_fd_poll: p9_fd_request wr poll > > repro-4076 [031] ..... 99.731970: p9_client_rpc: Wait event killable (-512) > > For me the problem isn't so much that this gets ERESTARTSYS but that it > never gets to read the 7 bytes that are available? Yes... OK, let's first recall what the commit aaec5a95d59615523 ("pipe_read: don't wake up the writer if the pipe is still full") does. It simply removes the unnecessary/spurious wakeups when the writer can't add more data to the pipe. See the "stupid test-case" in https://lore.kernel.org/all/20250120144338.GC7432@redhat.com/ In particular this note: As you can see, without this patch pipe_read() wakes the writer up 4095 times for no reason, the writer burns a bit of CPU and blocks again after wakeup until the last read(fd[0], &c, 1). In this test-case the writer sleeps in pipe_write(), but the same is true for the task sleeping in poll( { .fd = pipe_fd, .events = POLLOUT}, ...). Now, after some grepping I have found static void p9_conn_create(struct p9_client *client) { ... init_poll_funcptr(&m->pt, p9_pollwait); n = p9_fd_poll(client, &m->pt, NULL); ... } So, iiuc, in this case p9_fd_poll(&m->pt /* != NULL */) -> p9_pollwait() paths will add the "dummy" pwait->wait entries with ->func = p9_pollwake to pipe_inode_info.rd_wait and pipe_inode_info.wr_wait. Hmm... I don't understand why the 2nd vfs_poll(ts->wr) depends on the ret from vfs_poll(ts->rd), but I assume this is correct. This means that every time pipe_read() does wake_up(&pipe->wr_wait) p9_pollwake() is called. This function kicks p9_poll_workfn() which calls p9_poll_mux() which calls p9_fd_poll() again with pt == NULL. In this case the conditional vfs_poll(ts->wr) looks more understandable... So. Without the commit above, p9_poll_mux()->p9_fd_poll() can be called much more often and, in particular, can report the "additional" EPOLLIN. Can this somehow explain the problem? Oleg. ^ permalink raw reply [flat|nested] 48+ messages in thread
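To make the quoted note concrete, here is a minimal userspace sketch of that kind of test-case (reconstructed for illustration; it is not the actual program from the linked message):

#include <unistd.h>
#include <fcntl.h>

int main(void)
{
	int fd[2];
	char c = 'x';

	if (pipe(fd))
		return 1;
	fcntl(fd[1], F_SETFL, O_NONBLOCK);	/* so the fill loop terminates */
	while (write(fd[1], &c, 1) == 1)
		;				/* fill the pipe completely */
	/*
	 * A blocked writer (or a POLLOUT poller, as with p9_pollwait())
	 * now sleeps.  Before commit aaec5a95d596, each read() below woke
	 * it spuriously even though the pipe was still full; after the
	 * commit, the wakeup only comes once a pipe buffer is actually
	 * freed and there is room to write again.
	 */
	read(fd[0], &c, 1);
	return 0;
}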
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-26 12:19 ` Oleg Nesterov @ 2025-03-26 12:44 ` Oleg Nesterov 2025-03-26 13:05 ` Oleg Nesterov 0 siblings, 1 reply; 48+ messages in thread From: Oleg Nesterov @ 2025-03-26 12:44 UTC (permalink / raw) To: Dominique Martinet, K Prateek Nayak Cc: Eric Van Hensbergen, Latchesar Ionkov, Christian Schoenebeck, Mateusz Guzik, syzbot, brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro, v9fs On 03/26, Oleg Nesterov wrote: > > Hmm... I don't understand why the 2nd vfs_poll(ts->wr) depends on the > ret from vfs_poll(ts->rd), but I assume this is correct. I meant, if pt != NULL and ts->rd != ts->wr we need both vfs_poll(ts->rd) and vfs_poll(ts->wr) ? and the reproducer writes to the pipe before it mounts 9p... Prateek, this is just a shot in the dark but since you can reproduce, can you check if the patch below makes any difference? Oleg. --- x/net/9p/trans_fd.c +++ x/net/9p/trans_fd.c @@ -234,8 +234,10 @@ p9_fd_poll(struct p9_client *client, str } ret = vfs_poll(ts->rd, pt); - if (ts->rd != ts->wr) + if (ts->rd != ts->wr) { + if (pt != NULL) vfs_poll(ts->wr, pt); ret = (ret & ~EPOLLOUT) | (vfs_poll(ts->wr, pt) & ~EPOLLIN); + } return ret; } ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-26 12:44 ` Oleg Nesterov @ 2025-03-26 13:05 ` Oleg Nesterov 0 siblings, 0 replies; 48+ messages in thread From: Oleg Nesterov @ 2025-03-26 13:05 UTC (permalink / raw) To: Dominique Martinet, K Prateek Nayak Cc: Eric Van Hensbergen, Latchesar Ionkov, Christian Schoenebeck, Mateusz Guzik, syzbot, brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro, v9fs Damn, sorry for the noise, please ignore ;) On 03/26, Oleg Nesterov wrote: > > On 03/26, Oleg Nesterov wrote: > > > > Hmm... I don't understand why the 2nd vfs_poll(ts->wr) depends on the > > ret from vfs_poll(ts->rd), but I assume this is correct. > > I meant, if pt != NULL and ts->rd != ts->wr we need both > vfs_poll(ts->rd) and vfs_poll(ts->wr) ? I am stupid. After a lot of reading I can't distinguish "|" and "||". Oleg. > and the reproducer writes to the pipe before it mounts 9p... > > Prateek, this is just a shot in the dark but since you can reproduce, > can you check if the patch below makes any difference? > > Oleg. > > --- x/net/9p/trans_fd.c > +++ x/net/9p/trans_fd.c > @@ -234,8 +234,10 @@ p9_fd_poll(struct p9_client *client, str > } > > ret = vfs_poll(ts->rd, pt); > - if (ts->rd != ts->wr) > + if (ts->rd != ts->wr) { > + if (pt != NULL) vfs_poll(ts->wr, pt); > ret = (ret & ~EPOLLOUT) | (vfs_poll(ts->wr, pt) & ~EPOLLIN); > + } > return ret; > } > ^ permalink raw reply [flat|nested] 48+ messages in thread
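For the record, the subtlety behind the retraction: unlike "||", the bitwise "|" in that expression does not short-circuit, so the second vfs_poll() always runs when ts->rd != ts->wr. A trivial standalone illustration:

#include <stdio.h>

static int f(void) { puts("f"); return 1; }
static int g(void) { puts("g"); return 2; }

int main(void)
{
	int a = f() | g();	/* prints "f" then "g": both sides are evaluated */
	int b = f() || g();	/* prints only "f": "||" short-circuits on nonzero */

	printf("a=%d b=%d\n", a, b);	/* a=3 b=1 */
	return 0;
}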
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-25 14:58 ` Dominique Martinet 2025-03-26 12:19 ` Oleg Nesterov @ 2025-03-27 17:46 ` K Prateek Nayak 2025-03-27 21:19 ` syzbot 1 sibling, 1 reply; 48+ messages in thread From: K Prateek Nayak @ 2025-03-27 17:46 UTC (permalink / raw) To: Dominique Martinet, Oleg Nesterov Cc: Eric Van Hensbergen, Latchesar Ionkov, Christian Schoenebeck, Mateusz Guzik, syzbot, brauner, dhowells, jack, jlayton, linux-fsdevel, linux-kernel, netfs, swapnil.sapkal, syzkaller-bugs, viro, v9fs Hello all, On 3/25/2025 8:28 PM, Dominique Martinet wrote: > Thanks for the traces. > > w/ revert > K Prateek Nayak wrote on Tue, Mar 25, 2025 at 08:19:26PM +0530: >> kworker/100:1-1803 [100] ..... 286.618822: p9_fd_poll: p9_fd_poll rd poll >> kworker/100:1-1803 [100] ..... 286.618822: p9_fd_poll: p9_fd_request wr poll >> kworker/100:1-1803 [100] ..... 286.618823: p9_read_work: Data read wait 7 > > new behavior >> repro-4076 [031] ..... 95.011394: p9_fd_poll: p9_fd_poll rd poll >> repro-4076 [031] ..... 95.011394: p9_fd_poll: p9_fd_request wr poll >> repro-4076 [031] ..... 99.731970: p9_client_rpc: Wait event killable (-512) > > For me the problem isn't so much that this gets ERESTARTSYS but that it > nevers gets to read the 7 bytes that are available? I'm back after getting distracted for a bit. So here it goes: On the mainline, the signal is actually SIGKILL from parent thread due to a timeout: repro-4084 [112] d..2. 233.654264: signal_generate: sig=9 errno=0 code=0 comm=repro pid=4085 grp=1 res=0 repro-4084 [112] d..1. 233.654272: signal_generate: sig=9 errno=0 code=0 comm=repro pid=4085 grp=1 res=1 repro-4085 [125] ..... 233.654304: p9_client_rpc: Wait event killable (-512) (1) (0) (118) repro-4085 [125] ..... 233.654308: p9_client_rpc: Flushed (-512) repro-4085 [125] ..... 233.654312: p9_client_rpc: Final error (-512) repro-4085 [125] ..... 233.654313: p9_client_write: p9_client_rpc done repro-4085 [125] ..... 233.654313: p9_client_write: p9_client_rpc error (-512) repro-4085 [125] ..... 233.654315: v9fs_issue_write: Issue write done 2 err(-512) Specifically, this part: for (;;) { sleep_ms(10); if (waitpid(-1, &status, WNOHANG | WAIT_FLAGS) == pid) break; if (current_time_ms() - start < 5000) continue; kill_and_wait(pid, &status); /* <------- here */ break; } Now for the 7 bytes that were not read - If you look at the traces, every time there is a valid trans_mod->request(), there is a kworker wakeup: repro-4102 [100] ..... 286.618772: p9_client_write: p9_client_rpc repro-4102 [100] ..... 286.618780: p9_fd_request: p9_fd_request repro-4102 [100] ..... 286.618781: p9_fd_request: p9_fd_request EPOLL repro-4102 [100] ..... 286.618781: p9_fd_request: p9_fd_request schedule work <----- this kworker/100:1-1803 [100] ..... 286.618784: p9_write_work: Data write wait 32770 However, for that last 7 byte read, there is in fact no wakeup: repro-4102 [100] ..... 286.618805: p9_client_write: p9_client_rpc repro-4102 [100] ..... 286.618817: p9_fd_request: p9_fd_request repro-4102 [100] ..... 286.618818: p9_fd_request: p9_fd_request EPOLL repro-4102 [100] ..... 286.618818: p9_fd_poll: p9_fd_poll rd poll repro-4102 [100] ..... 286.618818: p9_fd_poll: p9_fd_request wr poll <--- No wakeup after kworker/100:1-1803 [100] ..... 286.618822: p9_fd_poll: p9_fd_poll rd poll kworker/100:1-1803 [100] ..... 286.618822: p9_fd_poll: p9_fd_request wr poll kworker/100:1-1803 [100] ..... 286.618823: p9_read_work: Data read wait 7 kworker/100:1-1803 [100] ..... 
286.618825: p9_read_work: Data read 7 The kworker would spuriously wake up, do something, realize it wasn't supposed to read it, and then disconnect the connection, waking up all waiters. But if we haven't sent a request, why are we waiting at all? > > If the repro has already written the bytes in both cases then there's no > reason to wait 5 seconds here... > > OTOH syzbot 9p code is silly and might have been depending on something > that's not supposed to work e.g. they might be missing a flush or > equivalent for all I know (I still haven't looked at the repro) > I believe if c->trans_mod->request(c, req) does not dispatch the request to the worker for RPC, the caller must not wait. So here goes something that got the benchmark going in my case which might be totally wrong but worth a look: #syz test: upstream aaec5a95d59615523db03dd53c2052f0a87beea7 diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c index 196060dc6138..148533c993f1 100644 --- a/net/9p/trans_fd.c +++ b/net/9p/trans_fd.c @@ -687,7 +687,11 @@ static int p9_fd_request(struct p9_client *client, struct p9_req_t *req) else n = p9_fd_poll(m->client, NULL, NULL); - if (n & EPOLLOUT && !test_and_set_bit(Wworksched, &m->wsched)) + /* Request was not sent */ + if (!(n & EPOLLOUT)) + return -EIO; + + if (!test_and_set_bit(Wworksched, &m->wsched)) schedule_work(&m->wq); return 0; ^ permalink raw reply related [flat|nested] 48+ messages in thread
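The failure mode described above -- queue a request, skip the worker kick, then wait for a reply that can never arrive -- can be modeled in userspace. A deliberately-hanging toy sketch (pthreads stand in for the workqueue; the names are illustrative only, none of this is 9p code; build with cc -pthread):

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t kick = PTHREAD_COND_INITIALIZER;
static bool work_scheduled;	/* stands in for schedule_work(&m->wq) */

static void *tx_worker(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&lock);
	while (!work_scheduled)
		pthread_cond_wait(&kick, &lock);	/* never signalled: sleeps forever */
	pthread_mutex_unlock(&lock);
	return NULL;
}

int main(void)
{
	pthread_t t;
	bool room_to_send = false;	/* i.e. the poll never reported EPOLLOUT */

	pthread_create(&t, NULL, tx_worker, NULL);

	/* the request is queued ("list_add_tail(...)") ... */
	pthread_mutex_lock(&lock);
	if (room_to_send) {		/* ... but the worker kick is skipped */
		work_scheduled = true;
		pthread_cond_signal(&kick);
	}
	pthread_mutex_unlock(&lock);

	/* ... yet the caller still waits for a reply: hangs forever,
	 * mirroring the wait_on_bit() in the original hung-task report. */
	pthread_join(t, NULL);
	puts("never reached");
	return 0;
}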
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-27 17:46 ` K Prateek Nayak @ 2025-03-27 21:19 ` syzbot 2025-03-27 22:18 ` asmadeus 0 siblings, 1 reply; 48+ messages in thread From: syzbot @ 2025-03-27 21:19 UTC (permalink / raw) To: asmadeus, brauner, dhowells, ericvh, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, oleg, swapnil.sapkal, syzkaller-bugs, v9fs, viro Hello, syzbot has tested the proposed patch but the reproducer is still triggering an issue: KASAN: slab-use-after-free Read in p9_conn_cancel ================================================================== BUG: KASAN: slab-use-after-free in p9_conn_cancel+0x900/0x910 net/9p/trans_fd.c:205 Read of size 8 at addr ffff88807b19ea50 by task syz-executor/6595 CPU: 0 UID: 0 PID: 6595 Comm: syz-executor Not tainted 6.13.0-rc1-syzkaller-00017-gaaec5a95d596-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0xc3/0x620 mm/kasan/report.c:489 kasan_report+0xd9/0x110 mm/kasan/report.c:602 p9_conn_cancel+0x900/0x910 net/9p/trans_fd.c:205 p9_conn_destroy net/9p/trans_fd.c:927 [inline] p9_fd_close+0x3c5/0x590 net/9p/trans_fd.c:951 p9_client_destroy+0xce/0x480 net/9p/client.c:1077 v9fs_session_close+0x49/0x2d0 fs/9p/v9fs.c:506 v9fs_kill_super+0x4d/0xa0 fs/9p/vfs_super.c:196 deactivate_locked_super+0xbe/0x1a0 fs/super.c:473 deactivate_super+0xde/0x100 fs/super.c:506 cleanup_mnt+0x222/0x450 fs/namespace.c:1373 task_work_run+0x14e/0x250 kernel/task_work.c:239 exit_task_work include/linux/task_work.h:43 [inline] do_exit+0xadd/0x2d70 kernel/exit.c:938 do_group_exit+0xd3/0x2a0 kernel/exit.c:1087 get_signal+0x24ed/0x26c0 kernel/signal.c:3017 arch_do_signal_or_restart+0x90/0x7e0 arch/x86/kernel/signal.c:337 exit_to_user_mode_loop kernel/entry/common.c:111 [inline] exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] syscall_exit_to_user_mode+0x150/0x2a0 kernel/entry/common.c:218 do_syscall_64+0xda/0x250 arch/x86/entry/common.c:89 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fa1f978bb7c Code: Unable to access opcode bytes at 0x7fa1f978bb52. 
RSP: 002b:00007ffd5c5893b0 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 RAX: fffffffffffffe00 RBX: 0000000000000003 RCX: 00007fa1f978bb7c RDX: 0000000000000030 RSI: 00007ffd5c589470 RDI: 00000000000000f9 RBP: 00007ffd5c58941c R08: 0000000000000000 R09: 0079746972756365 R10: 00007ffd5c588d70 R11: 0000000000000246 R12: 0000000000000258 R13: 00000000000927c0 R14: 0000000000019c1d R15: 00007ffd5c589470 </TASK> Allocated by task 52: kasan_save_stack+0x33/0x60 mm/kasan/common.c:47 kasan_save_track+0x14/0x30 mm/kasan/common.c:68 unpoison_slab_object mm/kasan/common.c:319 [inline] __kasan_slab_alloc+0x89/0x90 mm/kasan/common.c:345 kasan_slab_alloc include/linux/kasan.h:250 [inline] slab_post_alloc_hook mm/slub.c:4104 [inline] slab_alloc_node mm/slub.c:4153 [inline] kmem_cache_alloc_noprof+0x226/0x3d0 mm/slub.c:4160 p9_tag_alloc+0x9c/0x870 net/9p/client.c:281 p9_client_prepare_req+0x19f/0x4d0 net/9p/client.c:644 p9_client_rpc+0x1c3/0xc10 net/9p/client.c:691 p9_client_write+0x31f/0x680 net/9p/client.c:1645 v9fs_issue_write+0xe2/0x180 fs/9p/vfs_addr.c:59 netfs_do_issue_write+0x92/0x110 fs/netfs/write_issue.c:233 netfs_retry_write_stream fs/netfs/write_collect.c:184 [inline] netfs_retry_writes fs/netfs/write_collect.c:361 [inline] netfs_collect_write_results fs/netfs/write_collect.c:529 [inline] netfs_write_collection_worker+0x3e10/0x47d0 fs/netfs/write_collect.c:551 process_one_work+0x9c5/0x1ba0 kernel/workqueue.c:3229 process_scheduled_works kernel/workqueue.c:3310 [inline] worker_thread+0x6c8/0xf00 kernel/workqueue.c:3391 kthread+0x2c1/0x3a0 kernel/kthread.c:389 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 Freed by task 5192: kasan_save_stack+0x33/0x60 mm/kasan/common.c:47 kasan_save_track+0x14/0x30 mm/kasan/common.c:68 kasan_save_free_info+0x3b/0x60 mm/kasan/generic.c:582 poison_slab_object mm/kasan/common.c:247 [inline] __kasan_slab_free+0x51/0x70 mm/kasan/common.c:264 kasan_slab_free include/linux/kasan.h:233 [inline] slab_free_hook mm/slub.c:2338 [inline] slab_free_after_rcu_debug+0x115/0x340 mm/slub.c:4648 rcu_do_batch kernel/rcu/tree.c:2567 [inline] rcu_core+0x79d/0x14d0 kernel/rcu/tree.c:2823 handle_softirqs+0x213/0x8f0 kernel/softirq.c:554 __do_softirq kernel/softirq.c:588 [inline] invoke_softirq kernel/softirq.c:428 [inline] __irq_exit_rcu+0x109/0x170 kernel/softirq.c:655 irq_exit_rcu+0x9/0x30 kernel/softirq.c:671 instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1049 [inline] sysvec_apic_timer_interrupt+0xa4/0xc0 arch/x86/kernel/apic/apic.c:1049 asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702 Last potentially related work creation: kasan_save_stack+0x33/0x60 mm/kasan/common.c:47 __kasan_record_aux_stack+0xba/0xd0 mm/kasan/generic.c:544 slab_free_hook mm/slub.c:2299 [inline] slab_free mm/slub.c:4598 [inline] kmem_cache_free+0x305/0x4c0 mm/slub.c:4700 p9_req_put net/9p/client.c:409 [inline] p9_req_put+0x1c6/0x250 net/9p/client.c:402 p9_client_rpc+0x591/0xc10 net/9p/client.c:761 p9_client_write+0x31f/0x680 net/9p/client.c:1645 v9fs_issue_write+0xe2/0x180 fs/9p/vfs_addr.c:59 netfs_do_issue_write+0x92/0x110 fs/netfs/write_issue.c:233 netfs_retry_write_stream fs/netfs/write_collect.c:184 [inline] netfs_retry_writes fs/netfs/write_collect.c:361 [inline] netfs_collect_write_results fs/netfs/write_collect.c:529 [inline] netfs_write_collection_worker+0x3e10/0x47d0 fs/netfs/write_collect.c:551 process_one_work+0x9c5/0x1ba0 kernel/workqueue.c:3229 process_scheduled_works kernel/workqueue.c:3310 
[inline] worker_thread+0x6c8/0xf00 kernel/workqueue.c:3391 kthread+0x2c1/0x3a0 kernel/kthread.c:389 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 The buggy address belongs to the object at ffff88807b19e990 which belongs to the cache p9_req_t of size 208 The buggy address is located 192 bytes inside of freed 208-byte region [ffff88807b19e990, ffff88807b19ea60) The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x7b19e flags: 0xfff00000000000(node=0|zone=1|lastcpupid=0x7ff) page_type: f5(slab) raw: 00fff00000000000 ffff88814c7a0140 ffffea0001dd70c0 0000000000000006 raw: 0000000000000000 00000000000f000f 00000001f5000000 0000000000000000 page dumped because: kasan: bad access detected page_owner tracks the page as allocated page last allocated via order 0, migratetype Unmovable, gfp_mask 0x52c40(GFP_NOFS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP), pid 6752, tgid 6751 (syz.0.17), ts 91802565462, free_ts 91788017536 set_page_owner include/linux/page_owner.h:32 [inline] post_alloc_hook+0x2d1/0x350 mm/page_alloc.c:1556 prep_new_page mm/page_alloc.c:1564 [inline] get_page_from_freelist+0xfce/0x2f80 mm/page_alloc.c:3474 __alloc_pages_noprof+0x221/0x2470 mm/page_alloc.c:4751 alloc_pages_mpol_noprof+0x2c9/0x610 mm/mempolicy.c:2265 alloc_slab_page mm/slub.c:2408 [inline] allocate_slab mm/slub.c:2574 [inline] new_slab+0x2c9/0x410 mm/slub.c:2627 ___slab_alloc+0xda4/0x1880 mm/slub.c:3815 __slab_alloc.constprop.0+0x56/0xb0 mm/slub.c:3905 __slab_alloc_node mm/slub.c:3980 [inline] slab_alloc_node mm/slub.c:4141 [inline] kmem_cache_alloc_noprof+0xfa/0x3d0 mm/slub.c:4160 p9_tag_alloc+0x9c/0x870 net/9p/client.c:281 p9_client_prepare_req+0x19f/0x4d0 net/9p/client.c:644 p9_client_rpc+0x1c3/0xc10 net/9p/client.c:691 p9_client_version net/9p/client.c:930 [inline] p9_client_create+0xc65/0x1200 net/9p/client.c:1036 v9fs_session_init+0x1f8/0x1a80 fs/9p/v9fs.c:410 v9fs_mount+0xc6/0xa30 fs/9p/vfs_super.c:122 legacy_get_tree+0x109/0x220 fs/fs_context.c:662 vfs_get_tree+0x8f/0x380 fs/super.c:1814 page last free pid 6750 tgid 6748 stack trace: reset_page_owner include/linux/page_owner.h:25 [inline] free_pages_prepare mm/page_alloc.c:1127 [inline] free_unref_folios+0xa7b/0x14f0 mm/page_alloc.c:2704 folios_put_refs+0x587/0x7b0 mm/swap.c:962 free_pages_and_swap_cache+0x45f/0x510 mm/swap_state.c:335 __tlb_batch_free_encoded_pages+0xf9/0x290 mm/mmu_gather.c:136 tlb_batch_pages_flush mm/mmu_gather.c:149 [inline] tlb_flush_mmu_free mm/mmu_gather.c:366 [inline] tlb_flush_mmu mm/mmu_gather.c:373 [inline] tlb_finish_mmu+0x168/0x7b0 mm/mmu_gather.c:465 exit_mmap+0x3df/0xb20 mm/mmap.c:1680 __mmput+0x12a/0x4c0 kernel/fork.c:1353 mmput+0x62/0x70 kernel/fork.c:1375 exit_mm kernel/exit.c:570 [inline] do_exit+0x9bf/0x2d70 kernel/exit.c:925 do_group_exit+0xd3/0x2a0 kernel/exit.c:1087 get_signal+0x24ed/0x26c0 kernel/signal.c:3017 arch_do_signal_or_restart+0x90/0x7e0 arch/x86/kernel/signal.c:337 exit_to_user_mode_loop kernel/entry/common.c:111 [inline] exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] syscall_exit_to_user_mode+0x150/0x2a0 kernel/entry/common.c:218 do_syscall_64+0xda/0x250 arch/x86/entry/common.c:89 entry_SYSCALL_64_after_hwframe+0x77/0x7f Memory state around the buggy address: ffff88807b19e900: 00 00 00 00 00 00 00 00 00 00 fc fc fc fc fc fc ffff88807b19e980: fc fc fa fb fb fb fb fb fb fb fb fb fb fb fb fb 
>ffff88807b19ea00: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc ^ ffff88807b19ea80: fc fc fc fc 00 00 00 00 00 00 00 00 00 00 00 00 ffff88807b19eb00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc ================================================================== Tested on: commit: aaec5a95 pipe_read: don't wake up the writer if the pi.. git tree: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git console output: https://syzkaller.appspot.com/x/log.txt?x=15fafde4580000 kernel config: https://syzkaller.appspot.com/x/.config?x=8d5a2956e94d7972 dashboard link: https://syzkaller.appspot.com/bug?extid=62262fdc0e01d99573fc compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40 patch: https://syzkaller.appspot.com/x/patch.diff?x=110a3804580000 ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-27 21:19 ` syzbot @ 2025-03-27 22:18 ` asmadeus 2025-03-28 4:01 ` K Prateek Nayak 0 siblings, 1 reply; 48+ messages in thread From: asmadeus @ 2025-03-27 22:18 UTC (permalink / raw) To: syzbot Cc: brauner, dhowells, ericvh, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, oleg, swapnil.sapkal, syzkaller-bugs, v9fs, viro syzbot wrote on Thu, Mar 27, 2025 at 02:19:03PM -0700: > BUG: KASAN: slab-use-after-free in p9_conn_cancel+0x900/0x910 net/9p/trans_fd.c:205 > Read of size 8 at addr ffff88807b19ea50 by task syz-executor/6595 Ugh, why... Ah, if ->request() fails p9_client_rpc assumes the request was not written (e.g. write error), so you can't return an error after the list_add_tail call in p9_fd_request. I think you can call p9_conn_cancel with the error and return 0 anyway, and this particular workaround will probably work, regardless of whether it's the correct thing to do here (still haven't had time to look at the patch here) Sorry for this mess (even if most of it predates me...) -- Dominique ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-27 22:18 ` asmadeus @ 2025-03-28 4:01 ` K Prateek Nayak 2025-03-28 4:43 ` syzbot 2025-03-28 13:06 ` Oleg Nesterov 0 siblings, 2 replies; 48+ messages in thread From: K Prateek Nayak @ 2025-03-28 4:01 UTC (permalink / raw) To: asmadeus, syzbot Cc: brauner, dhowells, ericvh, jack, jlayton, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, oleg, swapnil.sapkal, syzkaller-bugs, v9fs, viro Hello Dominique, On 3/28/2025 3:48 AM, asmadeus@codewreck.org wrote: > syzbot wrote on Thu, Mar 27, 2025 at 02:19:03PM -0700: >> BUG: KASAN: slab-use-after-free in p9_conn_cancel+0x900/0x910 net/9p/trans_fd.c:205 >> Read of size 8 at addr ffff88807b19ea50 by task syz-executor/6595 Woops! Should have tested with KASAN enabled. This time I did. > > Ugh, why... > Ah, if ->request() fails p9_client_rpc assumes the request was not > written (e.g. write error), so you can't return an error after the > list_add_tail call in p9_fd_request. > > I think you can call p9_conn_cancel with the error and return 0 anyway, > and this paticular workaround will probably work, regardless of whether > it's the correct thing to do here (still haven't had time to look at the > patch here) That works too! Thank you for the pointer. > > Sorry for this mess (even if most of it predates me...) And I'm sorry for the noise from hacking on stuff that I don't fully understand. Fingers crossed (and thank you syzbot) #syz test: upstream aaec5a95d59615523db03dd53c2052f0a87beea7 diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c index 196060dc6138..dab7f02dc243 100644 --- a/net/9p/trans_fd.c +++ b/net/9p/trans_fd.c @@ -687,7 +687,13 @@ static int p9_fd_request(struct p9_client *client, struct p9_req_t *req) else n = p9_fd_poll(m->client, NULL, NULL); - if (n & EPOLLOUT && !test_and_set_bit(Wworksched, &m->wsched)) + /* Failed to send request */ + if (!(n & EPOLLOUT)) { + p9_conn_cancel(m, -EIO); + return 0; + } + + if (!test_and_set_bit(Wworksched, &m->wsched)) schedule_work(&m->wq); return 0; ^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-28 4:01 ` K Prateek Nayak @ 2025-03-28 4:43 ` syzbot 2025-03-28 13:06 ` Oleg Nesterov 1 sibling, 0 replies; 48+ messages in thread From: syzbot @ 2025-03-28 4:43 UTC (permalink / raw) To: asmadeus, brauner, dhowells, ericvh, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, oleg, swapnil.sapkal, syzkaller-bugs, v9fs, viro Hello, syzbot has tested the proposed patch and the reproducer did not trigger any issue: Reported-by: syzbot+62262fdc0e01d99573fc@syzkaller.appspotmail.com Tested-by: syzbot+62262fdc0e01d99573fc@syzkaller.appspotmail.com Tested on: commit: aaec5a95 pipe_read: don't wake up the writer if the pi.. git tree: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git console output: https://syzkaller.appspot.com/x/log.txt?x=1285fa4c580000 kernel config: https://syzkaller.appspot.com/x/.config?x=8d5a2956e94d7972 dashboard link: https://syzkaller.appspot.com/bug?extid=62262fdc0e01d99573fc compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40 patch: https://syzkaller.appspot.com/x/patch.diff?x=121eede4580000 Note: testing is done by a robot and is best-effort only. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-28 4:01 ` K Prateek Nayak 2025-03-28 4:43 ` syzbot @ 2025-03-28 13:06 ` Oleg Nesterov 2025-03-28 13:07 ` syzbot 1 sibling, 1 reply; 48+ messages in thread From: Oleg Nesterov @ 2025-03-28 13:06 UTC (permalink / raw) To: K Prateek Nayak, Dominique Martinet Cc: syzbot, brauner, dhowells, ericvh, jack, jlayton, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, swapnil.sapkal, syzkaller-bugs, v9fs, viro On 03/28, K Prateek Nayak wrote: > > --- a/net/9p/trans_fd.c > +++ b/net/9p/trans_fd.c > @@ -687,7 +687,13 @@ static int p9_fd_request(struct p9_client *client, struct p9_req_t *req) > else > n = p9_fd_poll(m->client, NULL, NULL); > - if (n & EPOLLOUT && !test_and_set_bit(Wworksched, &m->wsched)) > + /* Failed to send request */ > + if (!(n & EPOLLOUT)) { > + p9_conn_cancel(m, -EIO); Thanks a lot Prateek! Can't really judge the changes in net/9p, but I am not sure. !(n & EPOLLOUT) means that the pipe is full, nothing wrong. We need to kick m->rq in this case. Dominique says: For me the problem isn't so much that this gets ERESTARTSYS but that it never gets to read the 7 bytes that are available? Yes. Of course I do not pretend I fully understand the problem, but it seems this is because p9_read_work() doesn't set Rworksched and doesn't do schedule_work(&m->rq) if list_empty(&m->req_list). However, if the pipe is full, before the commit aaec5a95d59615523db0 ("pipe_read: don't wake up the writer if the pipe is still full"), p9_read_work() -> p9_fd_read() -> kernel_read(ts->rd) triggered the unnecessary wakeup. This wakeup calls p9_pollwake() which kicks p9_poll_workfn() -> p9_poll_mux(), and p9_poll_mux() will notice EPOLLIN and schedule_work(&m->rq). Maybe the patch below makes more sense? Oleg. #syz test: upstream diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c index 56e62978e502..aa9cd248a243 100644 --- a/net/9p/trans_fd.c +++ b/net/9p/trans_fd.c @@ -669,7 +669,6 @@ static void p9_poll_mux(struct p9_conn *m) static int p9_fd_request(struct p9_client *client, struct p9_req_t *req) { - __poll_t n; struct p9_trans_fd *ts = client->trans; struct p9_conn *m = &ts->conn; @@ -687,13 +686,7 @@ static int p9_fd_request(struct p9_client *client, struct p9_req_t *req) list_add_tail(&req->req_list, &m->unsent_req_list); spin_unlock(&m->req_lock); - if (test_and_clear_bit(Wpending, &m->wsched)) - n = EPOLLOUT; - else - n = p9_fd_poll(m->client, NULL, NULL); - - if (n & EPOLLOUT && !test_and_set_bit(Wworksched, &m->wsched)) - schedule_work(&m->wq); + p9_poll_mux(m); return 0; } ^ permalink raw reply related [flat|nested] 48+ messages in thread
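For readers without the source handy: p9_poll_mux() re-polls both directions and schedules whichever worker has something to do, which is why calling it here also picks up a pending reply when the pipe is write-full. Roughly, as a simplified paraphrase (error handling and the Rpending/Wpending bookkeeping elided; see net/9p/trans_fd.c for the real body):

static void p9_poll_mux(struct p9_conn *m)
{
	int err;
	__poll_t n = p9_fd_poll(m->client, NULL, &err);

	if ((n & EPOLLIN) && !test_and_set_bit(Rworksched, &m->wsched))
		schedule_work(&m->rq);		/* data to read: kick the rx worker */
	if ((n & EPOLLOUT) && !test_and_set_bit(Wworksched, &m->wsched))
		schedule_work(&m->wq);		/* room to write: kick the tx worker */
}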
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-28 13:06 ` Oleg Nesterov @ 2025-03-28 13:07 ` syzbot 2025-03-28 13:25 ` Oleg Nesterov 0 siblings, 1 reply; 48+ messages in thread From: syzbot @ 2025-03-28 13:07 UTC (permalink / raw) To: oleg Cc: asmadeus, brauner, dhowells, ericvh, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, oleg, swapnil.sapkal, syzkaller-bugs, v9fs, viro > On 03/28, K Prateek Nayak wrote: >> >> --- a/net/9p/trans_fd.c >> +++ b/net/9p/trans_fd.c >> @@ -687,7 +687,13 @@ static int p9_fd_request(struct p9_client *client, struct p9_req_t *req) >> else >> n = p9_fd_poll(m->client, NULL, NULL); >> - if (n & EPOLLOUT && !test_and_set_bit(Wworksched, &m->wsched)) >> + /* Failed to send request */ >> + if (!(n & EPOLLOUT)) { >> + p9_conn_cancel(m, -EIO); > > Thanks a lot Prateek! > > Can't really the changes in net/9p, but I am not sure. !(n & EPOLLOUT) > means that pipe is full, nothing wrong. We need to kick m->rq in this > case. > > Dominique says: > > For me the problem isn't so much that this gets ERESTARTSYS but that it > nevers gets to read the 7 bytes that are available? > > Yes. Of course I do not pretend I fully understand the problem, but it > seems this is because p9_read_work() doesn't set Rworksched and doesn't > do schedule_work(&m->rq) if list_empty(&m->req_list). > > However, if the pipe is full, before the commit aaec5a95d59615523db0 > ("pipe_read: don't wake up the writer if the pipe is still full"), > p9_read_work() -> p9_fd_read() -> kernel_read(ts->rd) triggered the > unnecessary wakeup. This wakeup calls p9_pollwake() shich kicks > p9_poll_workfn() -> p9_poll_mux(), and p9_poll_mux() will notice > EPOLLIN and schedule_work(&m->rq). > > May be the patch below makes more sense? > > Oleg. > > #syz test: upstream want either no args or 2 args (repo, branch), got 5 > > diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c > index 56e62978e502..aa9cd248a243 100644 > --- a/net/9p/trans_fd.c > +++ b/net/9p/trans_fd.c > @@ -669,7 +669,6 @@ static void p9_poll_mux(struct p9_conn *m) > > static int p9_fd_request(struct p9_client *client, struct p9_req_t *req) > { > - __poll_t n; > struct p9_trans_fd *ts = client->trans; > struct p9_conn *m = &ts->conn; > > @@ -687,13 +686,7 @@ static int p9_fd_request(struct p9_client *client, struct p9_req_t *req) > list_add_tail(&req->req_list, &m->unsent_req_list); > spin_unlock(&m->req_lock); > > - if (test_and_clear_bit(Wpending, &m->wsched)) > - n = EPOLLOUT; > - else > - n = p9_fd_poll(m->client, NULL, NULL); > - > - if (n & EPOLLOUT && !test_and_set_bit(Wworksched, &m->wsched)) > - schedule_work(&m->wq); > + p9_poll_mux(m); > > return 0; > } > ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-28 13:07 ` syzbot @ 2025-03-28 13:25 ` Oleg Nesterov 2025-03-28 13:49 ` syzbot 0 siblings, 1 reply; 48+ messages in thread From: Oleg Nesterov @ 2025-03-28 13:25 UTC (permalink / raw) To: syzbot Cc: asmadeus, brauner, dhowells, ericvh, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, swapnil.sapkal, syzkaller-bugs, v9fs, viro On 03/28, syzbot wrote: > > > #syz test: upstream > > want either no args or 2 args (repo, branch), got 5 #syz test diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c index 56e62978e502..aa9cd248a243 100644 --- a/net/9p/trans_fd.c +++ b/net/9p/trans_fd.c @@ -669,7 +669,6 @@ static void p9_poll_mux(struct p9_conn *m) static int p9_fd_request(struct p9_client *client, struct p9_req_t *req) { - __poll_t n; struct p9_trans_fd *ts = client->trans; struct p9_conn *m = &ts->conn; @@ -687,13 +686,7 @@ static int p9_fd_request(struct p9_client *client, struct p9_req_t *req) list_add_tail(&req->req_list, &m->unsent_req_list); spin_unlock(&m->req_lock); - if (test_and_clear_bit(Wpending, &m->wsched)) - n = EPOLLOUT; - else - n = p9_fd_poll(m->client, NULL, NULL); - - if (n & EPOLLOUT && !test_and_set_bit(Wworksched, &m->wsched)) - schedule_work(&m->wq); + p9_poll_mux(m); return 0; } ^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-28 13:25 ` Oleg Nesterov @ 2025-03-28 13:49 ` syzbot 2025-03-28 14:49 ` Oleg Nesterov 0 siblings, 1 reply; 48+ messages in thread From: syzbot @ 2025-03-28 13:49 UTC (permalink / raw) To: asmadeus, brauner, dhowells, ericvh, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, oleg, swapnil.sapkal, syzkaller-bugs, v9fs, viro Hello, syzbot has tested the proposed patch but the reproducer is still triggering an issue: unregister_netdevice: waiting for DEV to become free unregister_netdevice: waiting for batadv0 to become free. Usage count = 3 Tested on: commit: acb4f337 Merge tag 'm68knommu-for-v6.15' of git://git... git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=16bab804580000 kernel config: https://syzkaller.appspot.com/x/.config?x=95c3bbe7ce8436a7 dashboard link: https://syzkaller.appspot.com/bug?extid=62262fdc0e01d99573fc compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40 patch: https://syzkaller.appspot.com/x/patch.diff?x=146b9bb0580000 ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-28 13:49 ` syzbot @ 2025-03-28 14:49 ` Oleg Nesterov 2025-03-28 15:22 ` syzbot 0 siblings, 1 reply; 48+ messages in thread From: Oleg Nesterov @ 2025-03-28 14:49 UTC (permalink / raw) To: syzbot Cc: asmadeus, brauner, dhowells, ericvh, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, swapnil.sapkal, syzkaller-bugs, v9fs, viro On 03/28, syzbot wrote: > > Hello, > > syzbot has tested the proposed patch but the reproducer is still triggering an issue: > unregister_netdevice: waiting for DEV to become free > > unregister_netdevice: waiting for batadv0 to become free. Usage count = 3 Looks "obviously unrelated", but I'm not sure about anything anymore. In any case I leave this to 9p maintainers. Okay, just out of curiosity: #syz test: upstream aaec5a95d59615523db03dd53c2052f0a87beea7 diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c index 56e62978e502..aa9cd248a243 100644 --- a/net/9p/trans_fd.c +++ b/net/9p/trans_fd.c @@ -669,7 +669,6 @@ static void p9_poll_mux(struct p9_conn *m) static int p9_fd_request(struct p9_client *client, struct p9_req_t *req) { - __poll_t n; struct p9_trans_fd *ts = client->trans; struct p9_conn *m = &ts->conn; @@ -687,13 +686,7 @@ static int p9_fd_request(struct p9_client *client, struct p9_req_t *req) list_add_tail(&req->req_list, &m->unsent_req_list); spin_unlock(&m->req_lock); - if (test_and_clear_bit(Wpending, &m->wsched)) - n = EPOLLOUT; - else - n = p9_fd_poll(m->client, NULL, NULL); - - if (n & EPOLLOUT && !test_and_set_bit(Wworksched, &m->wsched)) - schedule_work(&m->wq); + p9_poll_mux(m); return 0; } ^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-28 14:49 ` Oleg Nesterov @ 2025-03-28 15:22 ` syzbot 2025-03-28 17:00 ` Oleg Nesterov 0 siblings, 1 reply; 48+ messages in thread From: syzbot @ 2025-03-28 15:22 UTC (permalink / raw) To: asmadeus, brauner, dhowells, ericvh, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, oleg, swapnil.sapkal, syzkaller-bugs, v9fs, viro Hello, syzbot has tested the proposed patch and the reproducer did not trigger any issue: Reported-by: syzbot+62262fdc0e01d99573fc@syzkaller.appspotmail.com Tested-by: syzbot+62262fdc0e01d99573fc@syzkaller.appspotmail.com Tested on: commit: aaec5a95 pipe_read: don't wake up the writer if the pi.. git tree: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git console output: https://syzkaller.appspot.com/x/log.txt?x=1598064c580000 kernel config: https://syzkaller.appspot.com/x/.config?x=8d5a2956e94d7972 dashboard link: https://syzkaller.appspot.com/bug?extid=62262fdc0e01d99573fc compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40 patch: https://syzkaller.appspot.com/x/patch.diff?x=16682a4c580000 Note: testing is done by a robot and is best-effort only. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-28 15:22 ` syzbot @ 2025-03-28 17:00 ` Oleg Nesterov 2025-03-28 17:56 ` K Prateek Nayak 2025-03-29 0:00 ` asmadeus 0 siblings, 2 replies; 48+ messages in thread From: Oleg Nesterov @ 2025-03-28 17:00 UTC (permalink / raw) To: syzbot Cc: asmadeus, brauner, dhowells, ericvh, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, swapnil.sapkal, syzkaller-bugs, v9fs, viro Dear syzbot, On 03/28, syzbot wrote: > > syzbot has tested the proposed patch and the reproducer did not trigger any issue: > > Reported-by: syzbot+62262fdc0e01d99573fc@syzkaller.appspotmail.com > Tested-by: syzbot+62262fdc0e01d99573fc@syzkaller.appspotmail.com Thanks. So the previous syzbot has tested the proposed patch but the reproducer is still triggering an issue: unregister_netdevice: waiting for DEV to become free unregister_netdevice: waiting for batadv0 to become free. Usage count = 3 report suggests that upstream/master has another/unrelated issue(s). As for the patches from me or Prateek (thanks again!), I think that the maintainers should take a look. But at this point I am mostly confident that the bisected commit aaec5a95d5961 ("pipe_read: don't wake up the writer if the pipe is still full") is innocent, it just reveals yet another problem. I guess (I hope ;) Prateek agrees. Oleg. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-28 17:00 ` Oleg Nesterov @ 2025-03-28 17:56 ` K Prateek Nayak 2025-03-28 18:20 ` Oleg Nesterov 2025-03-29 0:00 ` asmadeus 1 sibling, 1 reply; 48+ messages in thread From: K Prateek Nayak @ 2025-03-28 17:56 UTC (permalink / raw) To: Oleg Nesterov, syzbot Cc: asmadeus, brauner, dhowells, ericvh, jack, jlayton, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, swapnil.sapkal, syzkaller-bugs, v9fs, viro Hello Oleg, On 3/28/2025 10:30 PM, Oleg Nesterov wrote: > Dear syzbot, > > On 03/28, syzbot wrote: >> >> syzbot has tested the proposed patch and the reproducer did not trigger any issue: >> >> Reported-by: syzbot+62262fdc0e01d99573fc@syzkaller.appspotmail.com >> Tested-by: syzbot+62262fdc0e01d99573fc@syzkaller.appspotmail.com > Yours is the right approach. I was searching for p9_poll_mux() which would wake the read collector to free pipe buffers but I didn't dig enough and I was too lazy to open code it (guilty!) If this gets picked, feel free to add: Reviewed-and-tested-by: K Prateek Nayak <kprateek.nayak@amd.com> > Thanks. so the previous > > syzbot has tested the proposed patch but the reproducer is still triggering an issue: > unregister_netdevice: waiting for DEV to become free > > unregister_netdevice: waiting for batadv0 to become free. Usage count = 3 > > report suggests that upstream/master has another/unrelated issue(s). I could not reproduce this bit on mainline at commit acb4f33713b9 ("Merge tag 'm68knommu-for-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu") so I too second that it is an unrelated issue. > > As for the patches from me or Prateek (thanks again!), I think that > the maintainers should take a look. > > But at this point I am mostly confident that the bisected commit aaec5a95d5961 > ("pipe_read: don't wake up the writer if the pipe is still full") is innocent, > it just reveals yet another problem. > > I guess (I hope ;) Prateek agrees. I agree aaec5a95d5961 is not guilty! (unlike me) :) > > Oleg. > -- Thanks and Regards, Prateek ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-28 17:56 ` K Prateek Nayak @ 2025-03-28 18:20 ` Oleg Nesterov 0 siblings, 0 replies; 48+ messages in thread From: Oleg Nesterov @ 2025-03-28 18:20 UTC (permalink / raw) To: K Prateek Nayak Cc: syzbot, asmadeus, brauner, dhowells, ericvh, jack, jlayton, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, swapnil.sapkal, syzkaller-bugs, v9fs, viro Hi Prateek, On 03/28, K Prateek Nayak wrote: > > Yours is the right approach. OK, thank you, but let's wait for the 9p maintainers. But... > If this gets picked, feel free to add: > > Reviewed-and-tested-by: K Prateek Nayak <kprateek.nayak@amd.com> No! My version is just a slightly updated version of your initial patch. It was you who (unless we both are wrong) actually pinpointed the problem. So, if this is acked by the maintainers, please-please send the updated patch with a changelog, and feel free to add my Reviewed-by or Acked-by. Oleg. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-28 17:00 ` Oleg Nesterov 2025-03-28 17:56 ` K Prateek Nayak @ 2025-03-29 0:00 ` asmadeus 2025-03-29 14:21 ` Oleg Nesterov 1 sibling, 1 reply; 48+ messages in thread From: asmadeus @ 2025-03-29 0:00 UTC (permalink / raw) To: Oleg Nesterov Cc: syzbot, brauner, dhowells, ericvh, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, swapnil.sapkal, syzkaller-bugs, v9fs, viro Oleg Nesterov wrote on Fri, Mar 28, 2025 at 06:00:12PM +0100: > As for the patches from me or Prateek (thanks again!), I think that > the maintainers should take a look. > > But at this point I am mostly confident that the bisected commit aaec5a95d5961 > ("pipe_read: don't wake up the writer if the pipe is still full") is innocent; > it just reveals yet another problem. > > I guess (I hope ;) Prateek agrees. Right, so your patch sounds better than Prateek's initial error-out workaround, but it's a bit weird anyway so let me recap:
- that syz repro has this unnatural pattern where the replies are all written before the requests are sent
- p9_read_work() (the read worker) has an optimization that if there is no in-flight request then there obviously must be nothing to read (9p is 100% client-initiated, there's no way the server should send something first), so at this point the reader task is idle (OTOH, we're checking for rx right at p9_conn_create() before anything is sent, so it's not like we're consistent on this...)
- p9_fd_request() (sending a new request) has another optimization that only checks for tx: at this point if another request was already in flight then the rx task should have a poll going on for rx, and if there were no in-flight request yet then there should be no point in checking for rx, so p9_fd_request() only kicks the tx worker, and only if there is room to send
- at this point I don't really get the logic that'll wake the rx thread up either... p9_pollwake() will trigger p9_poll_workfn() (through p9_poll_work) which will call p9_poll_mux() and get the reader kicking again, but I don't know how the waitqueue mechanism is supposed to work (see p9_pollwait()). I'd have expected the tx task to somehow nudge this on, but it doesn't?...
- due to the new optimization (aaec5a95d59615 "pipe_read: don't wake up the writer if the pipe is still full"), that 'if there is room to send' check started failing and the tx thread doesn't start? Because syzbot passed us a pipe that was already full, or they messed with it after mount? I'm not clear about this point, but I think it's the key here -- the 9p "mount by fd" is great for local pseudo filesystems and things like that, but it's been abused by syzbot too much, and I don't want to spend too much time making sure that the unholy things they do with these fds work. If possible I'd like to mark that fd unusable by userspace, but I'm honestly doubtful it's possible (if it was dup'd or something before mount, for example...)
So, yeah, well, okay, I don't mind the patch even if it doesn't make sense with a regular server. We don't really care about trans fd performance either, so it's fine if it's a bit slower, and the error Prateek added might happen in a real case if the tx queue is full of real requests, so I think your approach is good enough.
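To spell out the tx-only check I mean, p9_fd_request() does roughly the following (a paraphrased sketch of net/9p/trans_fd.c from memory, not the exact code); the last call is the p9_fd_request() -> p9_poll_mux() hack under discussion:

	static int p9_fd_request(struct p9_client *client, struct p9_req_t *req)
	{
		struct p9_trans_fd *ts = client->trans;
		struct p9_conn *m = &ts->conn;
		__poll_t n;

		spin_lock(&m->req_lock);
		if (m->err < 0) {
			spin_unlock(&m->req_lock);
			return m->err;
		}
		req->status = REQ_STATUS_UNSENT;
		list_add_tail(&req->req_list, &m->unsent_req_list);
		spin_unlock(&m->req_lock);

		/* tx-only optimization: only kick the write worker, and only
		 * if there is room to send */
		if (test_and_clear_bit(Wpending, &m->wsched))
			n = EPOLLOUT;
		else
			n = p9_fd_poll(m->client, NULL, NULL);

		if (n & EPOLLOUT && !test_and_set_bit(Wworksched, &m->wsched))
			schedule_work(&m->wq);

		/* the proposed hack: also re-check rx here, so a reply already
		 * sitting in a full pipe gets the read worker going again */
		p9_poll_mux(m);

		return 0;
	}
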
If we understand what's happening here, I think it's as good as anything else, but I'd just like the commit message to make clear what syzbot is doing and why the regression happened. Thank you both for the thorough analysis and follow-ups! -- Dominique Martinet | Asmadeus ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-29 0:00 ` asmadeus @ 2025-03-29 14:21 ` Oleg Nesterov 2025-03-29 23:27 ` asmadeus 0 siblings, 1 reply; 48+ messages in thread From: Oleg Nesterov @ 2025-03-29 14:21 UTC (permalink / raw) To: asmadeus Cc: syzbot, brauner, dhowells, ericvh, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, swapnil.sapkal, syzkaller-bugs, v9fs, viro First of all, let me remind you that I know nothing about 9p or netfs ;) And I am not sure that my patch is the right solution. I am not even sure we need the fix: according to syzbot testing, the problem goes away with the fixes from David https://web.git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=netfs-fixes but I didn't even try to read them; this is not my area. Now, I'll try to answer some of your questions, but I can be easily wrong. On 03/29, asmadeus@codewreck.org wrote: > > Right, so your patch sounds better than Prateek's initial error-out > workaround, but it's a bit weird anyway so let me recap: > - that syz repro has this unnatural pattern where the replies are all > written before the requests are sent Yes, > - p9_read_work() (the read worker) has an optimization that if there is no > in-flight request then there obviously must be nothing to read (9p is 100% > client-initiated, there's no way the server should send something > first), so at this point the reader task is idle Yes. But note that it does kernel_read() -> pipe_read() before it becomes idle. See below. > - p9_fd_request() (sending a new request) has another optimization that > only checks for tx: at this point if another request was already in > flight then the rx task should have a poll going on for rx, and if there > were no in-flight request yet then there should be no point in checking > for rx, so p9_fd_request() only kicks the tx worker, and only if there is room > to send Can't comment, but > - at this point I don't really get the logic that'll wake the rx thread > up either... p9_pollwake() will trigger p9_poll_workfn() ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Yes, but where can this p9_pollwake() come from? See below. > - due to the new optimization (aaec5a95d59615 "pipe_read: don't wake up > the writer if the pipe is still full"), that 'if there is room to send' > check started failing and the tx thread doesn't start? Again, I can be easily wrong, but no. With or without the optimization above, it doesn't make sense to start the tx thread when the pipe is full: p9_fd_poll() can't report EPOLLOUT. Let's recall that the idle read worker did kernel_read() -> pipe_read(). Before this optimization, pipe_read() did the unnecessary wake_up_interruptible_sync_poll(&pipe->wr_wait); when the pipe was full before the reading _and_ is still full after the reading. This wakeup calls p9_pollwake() which kicks p9_poll_workfn(). p9_poll_workfn() calls p9_poll_mux(). p9_poll_mux() does n = p9_fd_poll(). "n & EPOLLOUT" is false, exactly because this wakeup was unnecessary, so p9_poll_mux() won't do schedule_work(&m->wq); this is fine. But "n & EPOLLIN" is true, so p9_poll_mux() does schedule_work(&m->rq) and wakes the rx thread. p9_read_work() is called again. It reads more data and (I guess) notices some problem and does p9_conn_cancel(EIO). This no longer happens after the optimization. So in some sense the p9_fd_request() -> p9_poll_mux() hack (which wakes the rx thread in this case) restores the old behaviour.
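To make the difference concrete, the tail of the pipe_read() loop went roughly from the first form to the second (a simplified sketch from memory; see the commit itself for the real diff):

	/* before aaec5a95d596: fires whenever the pipe *was* full at entry,
	 * even if we could not consume a single buffer */
	if (was_full)
		wake_up_interruptible_sync_poll(&pipe->wr_wait,
						EPOLLOUT | EPOLLWRNORM);

	/* after aaec5a95d596: fires only if releasing a buffer actually
	 * turned a full pipe into a non-full one */
	if (wake_writer)
		wake_up_interruptible_sync_poll(&pipe->wr_wait,
						EPOLLOUT | EPOLLWRNORM);

It is exactly that first, "spurious" wakeup that p9_pollwake() used to piggyback on in this scenario.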
But again, again, quite possibly I completely misread this (nontrivial) code. Oleg. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-29 14:21 ` Oleg Nesterov @ 2025-03-29 23:27 ` asmadeus 2025-03-30 10:21 ` Oleg Nesterov 0 siblings, 1 reply; 48+ messages in thread From: asmadeus @ 2025-03-29 23:27 UTC (permalink / raw) To: Oleg Nesterov Cc: syzbot, brauner, dhowells, ericvh, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, swapnil.sapkal, syzkaller-bugs, v9fs, viro Oleg Nesterov wrote on Sat, Mar 29, 2025 at 03:21:39PM +0100: > First of all, let me remind you that I know nothing about 9p or netfs ;) > And I am not sure that my patch is the right solution. > > I am not even sure we need the fix: according to syzbot testing, the > problem goes away with the fixes from David > https://web.git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=netfs-fixes > but I didn't even try to read them; this is not my area. (gah, I hate emails when one gets added to a thread later.. I've just now opened the thread on lore and seen David's test :/) > > - due to the new optimization (aaec5a95d59615 "pipe_read: don't wake up > > the writer if the pipe is still full"), that 'if there is room to send' > > check started failing and the tx thread doesn't start? > > Again, I can be easily wrong, but no. > > With or without the optimization above, it doesn't make sense to start > the tx thread when the pipe is full: p9_fd_poll() can't report EPOLLOUT. > > Let's recall that the idle read worker did kernel_read() -> pipe_read(). > Before this optimization, pipe_read() did the unnecessary > > wake_up_interruptible_sync_poll(&pipe->wr_wait); > > when the pipe was full before the reading _and_ is still full after the > reading. > > This wakeup calls p9_pollwake() which kicks p9_poll_workfn(). Aah, that's the bit I didn't get, thank you! > This no longer happens after the optimization. So in some sense the > p9_fd_request() -> p9_poll_mux() hack (which wakes the rx thread in this > case) restores the old behaviour. > > But again, again, quite possibly I completely misread this (nontrivial) > code. Yes, this totally makes sense; I agree with your analysis. So basically 9p was optimizing on the assumption that this behaviour is impossible -- and with a normal server it is (it makes no sense for the tx pipe to be full with 0 in-flight requests, for the tx pipe to never drain, or for the reply to already be there before the actual write happened!!) -- but the old pipe behaviour made it work anyway... So part of me wants to just leave it there and, if anything, try to make this kind of usage impossible by adding more checks to mount -o trans=fd, but I don't think it's possible to lock down all the kinds of weird behaviour root users (=syzbot) can engage in... OTOH syzbot does find some useful bugs, so I guess it might be worth fixing, I don't know. If David's patch also happens to fix it I guess we can also just wait for that? Thanks, -- Dominique Martinet | Asmadeus ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-29 23:27 ` asmadeus @ 2025-03-30 10:21 ` Oleg Nesterov 0 siblings, 0 replies; 48+ messages in thread From: Oleg Nesterov @ 2025-03-30 10:21 UTC (permalink / raw) To: asmadeus Cc: syzbot, brauner, dhowells, ericvh, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, swapnil.sapkal, syzkaller-bugs, v9fs, viro On 03/30, asmadeus@codewreck.org wrote: > > If David's patch also happens to fix it I guess we can also just wait > for that? Sure, whatever you prefer. Thanks! Oleg. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-23 19:17 ` syzbot 2025-03-23 19:47 ` Oleg Nesterov @ 2025-03-23 20:03 ` Mateusz Guzik 2025-03-23 20:43 ` syzbot 1 sibling, 1 reply; 48+ messages in thread From: Mateusz Guzik @ 2025-03-23 20:03 UTC (permalink / raw) To: syzbot Cc: brauner, dhowells, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, netfs, oleg, swapnil.sapkal, syzkaller-bugs, viro > Tested on: > > commit: aaec5a95 pipe_read: don't wake up the writer if the pi.. > git tree: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git > console output: https://syzkaller.appspot.com/x/log.txt?x=169ac43f980000 > kernel config: https://syzkaller.appspot.com/x/.config?x=8d5a2956e94d7972 > dashboard link: https://syzkaller.appspot.com/bug?extid=62262fdc0e01d99573fc > compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40 > patch: https://syzkaller.appspot.com/x/patch.diff?x=17803c4c580000 > Here is a "just in case" patch by me, which makes sure that head + tail are only looked at with the lock held.

#syz test: upstream aaec5a95d59615523db03dd53c2052f0a87beea7

diff --git a/fs/pipe.c b/fs/pipe.c
index 82fede0f2111..7eedcef2811e 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -210,11 +210,20 @@ static const struct pipe_buf_operations anon_pipe_buf_ops = {
 /* Done while waiting without holding the pipe lock - thus the READ_ONCE() */
 static inline bool pipe_readable(const struct pipe_inode_info *pipe)
 {
-	unsigned int head = READ_ONCE(pipe->head);
-	unsigned int tail = READ_ONCE(pipe->tail);
-	unsigned int writers = READ_ONCE(pipe->writers);
+	return !READ_ONCE(pipe->isempty) || !READ_ONCE(pipe->writers);
+}
+
+static inline void pipe_recalc_state(struct pipe_inode_info *pipe)
+{
+	pipe->isempty = pipe_empty(pipe->head, pipe->tail);
+	pipe->isfull = pipe_full(pipe->head, pipe->tail, pipe->max_usage);
+}
 
-	return !pipe_empty(head, tail) || !writers;
+static inline void pipe_update_head(struct pipe_inode_info *pipe,
+				    unsigned int head)
+{
+	pipe->head = ++head;
+	pipe_recalc_state(pipe);
 }
 
 static inline unsigned int pipe_update_tail(struct pipe_inode_info *pipe,
@@ -244,6 +253,7 @@ static inline unsigned int pipe_update_tail(struct pipe_inode_info *pipe,
 	 * without the spinlock - the mutex is enough.
 	 */
 	pipe->tail = ++tail;
+	pipe_recalc_state(pipe);
 	return tail;
 }
 
@@ -417,12 +427,7 @@ static inline int is_packetized(struct file *file)
 /* Done while waiting without holding the pipe lock - thus the READ_ONCE() */
 static inline bool pipe_writable(const struct pipe_inode_info *pipe)
 {
-	unsigned int head = READ_ONCE(pipe->head);
-	unsigned int tail = READ_ONCE(pipe->tail);
-	unsigned int max_usage = READ_ONCE(pipe->max_usage);
-
-	return !pipe_full(head, tail, max_usage) ||
-	       !READ_ONCE(pipe->readers);
+	return !READ_ONCE(pipe->isfull) || !READ_ONCE(pipe->readers);
 }
 
 static ssize_t
@@ -524,7 +529,7 @@ pipe_write(struct kiocb *iocb, struct iov_iter *from)
 		 * it, either the reader will consume it or it'll still
 		 * be there for the next write.
 		 */
-		pipe->head = head + 1;
+		pipe_update_head(pipe, head);
 
 		/* Insert it into the buffer array */
 		buf = &pipe->bufs[head & mask];
@@ -549,10 +554,9 @@ pipe_write(struct kiocb *iocb, struct iov_iter *from)
 
 			if (!iov_iter_count(from))
 				break;
-		}
 
-		if (!pipe_full(head, pipe->tail, pipe->max_usage))
 			continue;
+		}
 
 		/* Wait for buffer space to become available. */
 		if ((filp->f_flags & O_NONBLOCK) ||
diff --git a/include/linux/pipe_fs_i.h b/include/linux/pipe_fs_i.h
index 8ff23bf5a819..d4b7539399b5 100644
--- a/include/linux/pipe_fs_i.h
+++ b/include/linux/pipe_fs_i.h
@@ -69,6 +69,8 @@ struct pipe_inode_info {
 	unsigned int r_counter;
 	unsigned int w_counter;
 	bool poll_usage;
+	bool isempty;
+	bool isfull;
 #ifdef CONFIG_WATCH_QUEUE
 	bool note_loss;
 #endif

^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-23 20:03 ` Mateusz Guzik @ 2025-03-23 20:43 ` syzbot 0 siblings, 0 replies; 48+ messages in thread From: syzbot @ 2025-03-23 20:43 UTC (permalink / raw) To: brauner, dhowells, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, mjguzik, netfs, oleg, swapnil.sapkal, syzkaller-bugs, viro Hello, syzbot has tested the proposed patch but the reproducer is still triggering an issue: INFO: task hung in netfs_unbuffered_write_iter INFO: task syz.0.17:6690 blocked for more than 143 seconds. Not tainted 6.13.0-rc1-syzkaller-00017-gaaec5a95d596-dirty #0 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:syz.0.17 state:D stack:25184 pid:6690 tgid:6689 ppid:6491 flags:0x00004004 Call Trace: <TASK> context_switch kernel/sched/core.c:5369 [inline] __schedule+0xe58/0x5ad0 kernel/sched/core.c:6756 __schedule_loop kernel/sched/core.c:6833 [inline] schedule+0xe7/0x350 kernel/sched/core.c:6848 bit_wait+0x15/0xe0 kernel/sched/wait_bit.c:237 __wait_on_bit+0x62/0x180 kernel/sched/wait_bit.c:49 out_of_line_wait_on_bit+0xda/0x110 kernel/sched/wait_bit.c:64 wait_on_bit include/linux/wait_bit.h:77 [inline] netfs_unbuffered_write_iter_locked+0xba8/0xe70 fs/netfs/direct_write.c:105 netfs_unbuffered_write_iter+0x413/0x6d0 fs/netfs/direct_write.c:193 v9fs_file_write_iter+0xbf/0x100 fs/9p/vfs_file.c:404 new_sync_write fs/read_write.c:586 [inline] vfs_write+0x5ae/0x1150 fs/read_write.c:679 ksys_write+0x12b/0x250 fs/read_write.c:731 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fbba998d169 RSP: 002b:00007fbba97f9038 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00007fbba9ba5fa0 RCX: 00007fbba998d169 RDX: 0000000000007fec RSI: 0000400000000540 RDI: 0000000000000007 RBP: 00007fbba9a0e2a0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 00007fbba9ba5fa0 R15: 00007ffea301fe68 </TASK> Showing all locks held in the system: 2 locks held by kworker/u8:0/11: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc90000107d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:1/12: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc90000117d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 1 lock held by khungtaskd/30: #0: ffffffff8e1bac80 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:337 [inline] #0: ffffffff8e1bac80 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:849 [inline] #0: ffffffff8e1bac80 (rcu_read_lock){....}-{1:3}, at: debug_show_all_locks+0x7f/0x390 kernel/locking/lockdep.c:6744 2 locks held by kworker/u8:2/35: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc90000ab7d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:4/80: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc9000213fd80 
((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 3 locks held by kworker/u8:6/2887: 2 locks held by kworker/u8:7/2936: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc9000bc87d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:8/3460: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc9000c917d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:9/3519: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc9000c757d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:10/3600: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc9000cbe7d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:11/3645: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc9000d047d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by kworker/u8:12/3693: #0: ffff88801b081148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1293/0x1ba0 kernel/workqueue.c:3204 #1: ffffc9000d437d80 ((work_completion)(&rreq->work)){+.+.}-{0:0}, at: process_one_work+0x921/0x1ba0 kernel/workqueue.c:3205 2 locks held by getty/5579: #0: ffff8880315940a0 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x24/0x80 drivers/tty/tty_ldisc.c:243 #1: ffffc90002fde2f0 (&ldata->atomic_read_lock){+.+.}-{4:4}, at: n_tty_read+0xfba/0x1480 drivers/tty/n_tty.c:2211 3 locks held by syz.0.17/6690: #0: ffff88807c837eb8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff88807341a420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff8880739f0148 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.1.19/6790: #0: ffff88802beef7b8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff888055054420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff8880739f07b8 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.2.22/6813: #0: ffff888034a757b8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff8880736a2420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff8880739f0e28 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.3.24/6832: #0: ffff88802fbce9b8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff888030d96420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff888073af0148 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.4.26/6857: #0: ffff8880355be638 
(&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff8880331c0420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff888073af07b8 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.5.28/6882: #0: ffff8880635d1eb8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff88807347c420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff8880739f1498 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.6.30/6910: #0: ffff88807b32e2b8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff888029cb0420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff8880739f1b08 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.7.32/6938: #0: ffff88802fba9278 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff88806225e420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff888073af0e28 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.8.34/6969: #0: ffff8880336ea478 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff88802a7a4420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff888073af1498 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 3 locks held by syz.9.36/6996: #0: ffff88801e2c27f8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x267/0x390 fs/file.c:1191 #1: ffff88802d796420 (sb_writers#14){.+.+}-{0:0}, at: ksys_write+0x12b/0x250 fs/read_write.c:731 #2: ffff8880739f2178 (&sb->s_type->i_mutex_key#20){++++}-{4:4}, at: netfs_start_io_direct+0x116/0x260 fs/netfs/locking.c:188 ============================================= NMI backtrace for cpu 1 CPU: 1 UID: 0 PID: 30 Comm: khungtaskd Not tainted 6.13.0-rc1-syzkaller-00017-gaaec5a95d596-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120 nmi_cpu_backtrace+0x27b/0x390 lib/nmi_backtrace.c:113 nmi_trigger_cpumask_backtrace+0x29c/0x300 lib/nmi_backtrace.c:62 trigger_all_cpu_backtrace include/linux/nmi.h:162 [inline] check_hung_uninterruptible_tasks kernel/hung_task.c:234 [inline] watchdog+0xf14/0x1240 kernel/hung_task.c:397 kthread+0x2c1/0x3a0 kernel/kthread.c:389 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> Sending NMI from CPU 1 to CPUs 0: NMI backtrace for cpu 0 CPU: 0 UID: 0 PID: 5871 Comm: kworker/0:4 Not tainted 6.13.0-rc1-syzkaller-00017-gaaec5a95d596-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025 Workqueue: events_power_efficient gc_worker RIP: 0010:__lock_acquire+0x2fe/0x3c40 kernel/locking/lockdep.c:5162 Code: 47 10 48 89 c2 48 89 44 24 50 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 80 3c 02 00 0f 85 70 2b 00 00 49 8d 47 18 4d 89 67 10 <48> 89 c2 48 89 44 24 48 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 RSP: 0018:ffffc90003ad7990 EFLAGS: 00000046 RAX: ffff8880315f53c8 RBX: 0000000000000000 RCX: 0000000000000002 RDX: 1ffff110062bea78 RSI: 1ffff110062bea75 
RDI: ffff8880315f53b8 RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000001 R10: 000000000000000a R11: 0000000000000003 R12: ffffffff8e1bac80 R13: ffff8880315f4880 R14: 0000000000000021 R15: ffff8880315f53b0 FS: 0000000000000000(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000c008d0a000 CR3: 000000000df7e000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <NMI> </NMI> <TASK> lock_acquire.part.0+0x11b/0x380 kernel/locking/lockdep.c:5849 rcu_lock_acquire include/linux/rcupdate.h:337 [inline] rcu_read_lock include/linux/rcupdate.h:849 [inline] gc_worker+0x24d/0x1760 net/netfilter/nf_conntrack_core.c:1532 process_one_work+0x9c5/0x1ba0 kernel/workqueue.c:3229 process_scheduled_works kernel/workqueue.c:3310 [inline] worker_thread+0x6c8/0xf00 kernel/workqueue.c:3391 kthread+0x2c1/0x3a0 kernel/kthread.c:389 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> Tested on: commit: aaec5a95 pipe_read: don't wake up the writer if the pi.. git tree: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git console output: https://syzkaller.appspot.com/x/log.txt?x=11550804580000 kernel config: https://syzkaller.appspot.com/x/.config?x=8d5a2956e94d7972 dashboard link: https://syzkaller.appspot.com/bug?extid=62262fdc0e01d99573fc compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40 patch: https://syzkaller.appspot.com/x/patch.diff?x=10e90804580000 ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-22 15:54 [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter syzbot 2025-03-23 18:48 ` Oleg Nesterov @ 2025-03-28 18:14 ` David Howells 2025-03-28 18:44 ` syzbot 2025-03-28 19:01 ` Oleg Nesterov 2025-08-03 12:09 ` syzbot 2 siblings, 2 replies; 48+ messages in thread From: David Howells @ 2025-03-28 18:14 UTC (permalink / raw) To: syzbot Cc: dhowells, Oleg Nesterov, brauner, jack, jlayton, linux-fsdevel, linux-kernel, netfs, syzkaller-bugs, viro, K Prateek Nayak, Sapkal, Swapnil, Mateusz Guzik #syz test: git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git netfs-fixes There's an inconsistency in some wakeups between using rreq->waitq (a private waitqueue) and using clear_and_wake_up_bit() (a shared global waitqueue). The reason I didn't get round to posting the patches on this branch yet is that they *ought* to fix the problem, but Steve French still sees an occasional hang with cifs - but only in his automated testing thing, and we haven't tracked it down yet.
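Roughly, the two styles being mixed look like this (an illustrative sketch only, using the NETFS_RREQ_IN_PROGRESS bit that the hung task above is sleeping on; not the actual patches):

	/* waiter (cf. netfs_unbuffered_write_iter_locked() in the trace):
	 * sleeps on the shared, global bit-waitqueue keyed by the flag word */
	wait_on_bit(&rreq->flags, NETFS_RREQ_IN_PROGRESS, TASK_UNINTERRUPTIBLE);

	/* waker, style A: clears the bit and wakes that same shared queue */
	clear_and_wake_up_bit(NETFS_RREQ_IN_PROGRESS, &rreq->flags);

	/* waker, style B: wakes only the request's private queue, which a
	 * wait_on_bit() sleeper is not on -- mixing A-style waiters with
	 * B-style wakers (or vice versa) can miss the wakeup entirely */
	clear_bit(NETFS_RREQ_IN_PROGRESS, &rreq->flags);
	wake_up(&rreq->waitq);

David ^ permalink raw reply [flat|nested] 48+ messages in thread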
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-28 18:14 ` David Howells @ 2025-03-28 18:44 ` syzbot 2025-03-28 19:01 ` Oleg Nesterov 1 sibling, 0 replies; 48+ messages in thread From: syzbot @ 2025-03-28 18:44 UTC (permalink / raw) To: brauner, dhowells, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, mjguzik, netfs, oleg, swapnil.sapkal, syzkaller-bugs, viro Hello, syzbot has tested the proposed patch but the reproducer is still triggering an issue: unregister_netdevice: waiting for DEV to become free unregister_netdevice: waiting for batadv0 to become free. Usage count = 3 Tested on: commit: c7fffb8c netfs: Fix wait/wake to be consistent about t.. git tree: git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git netfs-fixes console output: https://syzkaller.appspot.com/x/log.txt?x=15722a4c580000 kernel config: https://syzkaller.appspot.com/x/.config?x=95c3bbe7ce8436a7 dashboard link: https://syzkaller.appspot.com/bug?extid=62262fdc0e01d99573fc compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40 Note: no patches were applied. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-28 18:14 ` David Howells 2025-03-28 18:44 ` syzbot @ 2025-03-28 19:01 ` Oleg Nesterov 1 sibling, 0 replies; 48+ messages in thread From: Oleg Nesterov @ 2025-03-28 19:01 UTC (permalink / raw) To: David Howells Cc: syzbot, brauner, jack, jlayton, linux-fsdevel, linux-kernel, netfs, syzkaller-bugs, viro, K Prateek Nayak, Sapkal, Swapnil, Mateusz Guzik David, I know that I will regret my email tomorrow, but On 03/28, David Howells wrote: > > #syz test: git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git netfs-fixes > > There's an inconsistency in some wakeups between using rreq->waitq (a private > waitqueue) and using clear_and_wake_up_bit() (a shared global waitqueue). Too late for me, I didn't even try to read the "netfs-fixes" changes. And even if I tried, I probably wouldn't understand them ;) But, afaics (I can be easily wrong), the current logic in net/9p/ doesn't look right regardless of any other fixes, no? Oleg. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter 2025-03-22 15:54 [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter syzbot 2025-03-23 18:48 ` Oleg Nesterov 2025-03-28 18:14 ` David Howells @ 2025-08-03 12:09 ` syzbot 2 siblings, 0 replies; 48+ messages in thread From: syzbot @ 2025-08-03 12:09 UTC (permalink / raw) To: asmadeus, brauner, dhowells, eadavis, ericvh, hdanton, jack, jlayton, kprateek.nayak, linux-fsdevel, linux-kernel, linux_oss, lucho, mjguzik, netfs, oleg, pc, sfrench, swapnil.sapkal, syzkaller-bugs, v9fs, viro syzbot suspects this issue was fixed by commit: commit da8cf4bd458722d090a788c6e581eeb72695c62f Author: David Howells <dhowells@redhat.com> Date: Tue Jul 1 16:38:36 2025 +0000 netfs: Fix hang due to missing case in final DIO read result collection bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=14cfbcf0580000 start commit: 66701750d556 Merge tag 'io_uring-6.16-20250630' of git://g.. git tree: upstream kernel config: https://syzkaller.appspot.com/x/.config?x=e29b8115bf337f53 dashboard link: https://syzkaller.appspot.com/bug?extid=62262fdc0e01d99573fc syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1400348c580000 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=17b2048c580000 If the result looks correct, please mark the issue as fixed by replying with: #syz fix: netfs: Fix hang due to missing case in final DIO read result collection For information about bisection process see: https://goo.gl/tpsmEJ#bisection ^ permalink raw reply [flat|nested] 48+ messages in thread
end of thread, other threads:[~2025-08-03 12:09 UTC | newest]

Thread overview: 48+ messages
2025-03-22 15:54 [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter syzbot
2025-03-23 18:48 ` Oleg Nesterov
2025-03-23 19:17 ` syzbot
2025-03-23 19:47 ` Oleg Nesterov
2025-03-23 19:50 ` K Prateek Nayak
2025-03-23 19:52 ` Mateusz Guzik
2025-03-23 21:02 ` Oleg Nesterov
2025-03-24 10:47 ` K Prateek Nayak
2025-03-24 11:15 ` syzbot
2025-03-24 13:17 ` Mateusz Guzik
2025-03-24 13:19 ` K Prateek Nayak
2025-03-24 14:52 ` K Prateek Nayak
2025-03-24 16:03 ` Mateusz Guzik
2025-03-24 16:25 ` K Prateek Nayak
2025-03-24 16:36 ` K Prateek Nayak
2025-03-25  2:52 ` K Prateek Nayak
2025-03-25 12:15 ` Oleg Nesterov
2025-03-25 12:36 ` Dominique Martinet
2025-03-25 13:04 ` Oleg Nesterov
2025-03-25 14:49 ` K Prateek Nayak
2025-03-25 14:58 ` Dominique Martinet
2025-03-26 12:19 ` Oleg Nesterov
2025-03-26 12:44 ` Oleg Nesterov
2025-03-26 13:05 ` Oleg Nesterov
2025-03-27 17:46 ` K Prateek Nayak
2025-03-27 21:19 ` syzbot
2025-03-27 22:18 ` asmadeus
2025-03-28  4:01 ` K Prateek Nayak
2025-03-28  4:43 ` syzbot
2025-03-28 13:06 ` Oleg Nesterov
2025-03-28 13:07 ` syzbot
2025-03-28 13:25 ` Oleg Nesterov
2025-03-28 13:49 ` syzbot
2025-03-28 14:49 ` Oleg Nesterov
2025-03-28 15:22 ` syzbot
2025-03-28 17:00 ` Oleg Nesterov
2025-03-28 17:56 ` K Prateek Nayak
2025-03-28 18:20 ` Oleg Nesterov
2025-03-29  0:00 ` asmadeus
2025-03-29 14:21 ` Oleg Nesterov
2025-03-29 23:27 ` asmadeus
2025-03-30 10:21 ` Oleg Nesterov
2025-03-23 20:03 ` Mateusz Guzik
2025-03-23 20:43 ` syzbot
2025-03-28 18:14 ` David Howells
2025-03-28 18:44 ` syzbot
2025-03-28 19:01 ` Oleg Nesterov
2025-08-03 12:09 ` syzbot