* Re: [syzbot] [nfs?] INFO: task hung in nfsd_nl_listener_set_doit
[not found] <0000000000004385ec06198753f8@google.com>
@ 2024-08-31 18:22 ` syzbot
2024-09-02 1:57 ` NeilBrown
0 siblings, 1 reply; 9+ messages in thread
From: syzbot @ 2024-08-31 18:22 UTC (permalink / raw)
To: Dai.Ngo, chuck.lever, dai.ngo, jlayton, kolga, linux-kernel,
linux-nfs, lorenzo, neilb, netdev, okorniev, syzkaller-bugs, tom
syzbot has found a reproducer for the following issue on:
HEAD commit: fe1910f9337b tcp_bpf: fix return value of tcp_bpf_sendmsg()
git tree: net
console+strace: https://syzkaller.appspot.com/x/log.txt?x=100e272b980000
kernel config: https://syzkaller.appspot.com/x/.config?x=996585887acdadb3
dashboard link: https://syzkaller.appspot.com/bug?extid=d1e76d963f757db40f91
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=113234fb980000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11140943980000
Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/cb37d16e2860/disk-fe1910f9.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/315198aa296e/vmlinux-fe1910f9.xz
kernel image: https://storage.googleapis.com/syzbot-assets/b3e6fb9fa8a4/bzImage-fe1910f9.xz
The issue was bisected to:
commit 16a471177496c8e04a9793812c187a2c1a2192fa
Author: Lorenzo Bianconi <lorenzo@kernel.org>
Date: Tue Apr 23 13:25:44 2024 +0000
NFSD: add listener-{set,get} netlink command
bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=16af38d3980000
final oops: https://syzkaller.appspot.com/x/report.txt?x=15af38d3980000
console output: https://syzkaller.appspot.com/x/log.txt?x=11af38d3980000
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+d1e76d963f757db40f91@syzkaller.appspotmail.com
Fixes: 16a471177496 ("NFSD: add listener-{set,get} netlink command")
INFO: task syz-executor388:5254 blocked for more than 143 seconds.
Not tainted 6.11.0-rc5-syzkaller-00151-gfe1910f9337b #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor388 state:D stack:23736 pid:5254 tgid:5254 ppid:5252 flags:0x00004006
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5188 [inline]
__schedule+0x1800/0x4a60 kernel/sched/core.c:6529
__schedule_loop kernel/sched/core.c:6606 [inline]
schedule+0x14b/0x320 kernel/sched/core.c:6621
schedule_preempt_disabled+0x13/0x30 kernel/sched/core.c:6678
__mutex_lock_common kernel/locking/mutex.c:684 [inline]
__mutex_lock+0x6a4/0xd70 kernel/locking/mutex.c:752
nfsd_nl_listener_set_doit+0x12d/0x1a90 fs/nfsd/nfsctl.c:1956
genl_family_rcv_msg_doit net/netlink/genetlink.c:1115 [inline]
genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
genl_rcv_msg+0xb14/0xec0 net/netlink/genetlink.c:1210
netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2550
genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219
netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
netlink_unicast+0x7f6/0x990 net/netlink/af_netlink.c:1357
netlink_sendmsg+0x8e4/0xcb0 net/netlink/af_netlink.c:1901
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0x221/0x270 net/socket.c:745
____sys_sendmsg+0x525/0x7d0 net/socket.c:2597
___sys_sendmsg net/socket.c:2651 [inline]
__sys_sendmsg+0x2b0/0x3a0 net/socket.c:2680
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f4204435ce9
RSP: 002b:00007ffea2feb278 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f4204435ce9
RDX: 0000000000000000 RSI: 0000000020000100 RDI: 0000000000000003
RBP: 0000000000000000 R08: 00000000000000a0 R09: 00000000000000a0
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007ffea2feb2a0 R15: 00007ffea2feb290
</TASK>
INFO: task syz-executor388:5258 blocked for more than 143 seconds.
Not tainted 6.11.0-rc5-syzkaller-00151-gfe1910f9337b #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor388 state:D stack:26576 pid:5258 tgid:5258 ppid:5255 flags:0x00004006
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5188 [inline]
__schedule+0x1800/0x4a60 kernel/sched/core.c:6529
__schedule_loop kernel/sched/core.c:6606 [inline]
schedule+0x14b/0x320 kernel/sched/core.c:6621
schedule_preempt_disabled+0x13/0x30 kernel/sched/core.c:6678
__mutex_lock_common kernel/locking/mutex.c:684 [inline]
__mutex_lock+0x6a4/0xd70 kernel/locking/mutex.c:752
nfsd_nl_listener_set_doit+0x12d/0x1a90 fs/nfsd/nfsctl.c:1956
genl_family_rcv_msg_doit net/netlink/genetlink.c:1115 [inline]
genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
genl_rcv_msg+0xb14/0xec0 net/netlink/genetlink.c:1210
netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2550
genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219
netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
netlink_unicast+0x7f6/0x990 net/netlink/af_netlink.c:1357
netlink_sendmsg+0x8e4/0xcb0 net/netlink/af_netlink.c:1901
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0x221/0x270 net/socket.c:745
____sys_sendmsg+0x525/0x7d0 net/socket.c:2597
___sys_sendmsg net/socket.c:2651 [inline]
__sys_sendmsg+0x2b0/0x3a0 net/socket.c:2680
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f4204435ce9
RSP: 002b:00007ffea2feb278 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f4204435ce9
RDX: 0000000000000000 RSI: 0000000020000100 RDI: 0000000000000003
RBP: 0000000000000000 R08: 00000000000000a0 R09: 00000000000000a0
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007ffea2feb2a0 R15: 00007ffea2feb290
</TASK>
INFO: task syz-executor388:5259 blocked for more than 144 seconds.
Not tainted 6.11.0-rc5-syzkaller-00151-gfe1910f9337b #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor388 state:D stack:26576 pid:5259 tgid:5259 ppid:5257 flags:0x00004006
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5188 [inline]
__schedule+0x1800/0x4a60 kernel/sched/core.c:6529
__schedule_loop kernel/sched/core.c:6606 [inline]
schedule+0x14b/0x320 kernel/sched/core.c:6621
schedule_preempt_disabled+0x13/0x30 kernel/sched/core.c:6678
__mutex_lock_common kernel/locking/mutex.c:684 [inline]
__mutex_lock+0x6a4/0xd70 kernel/locking/mutex.c:752
nfsd_nl_listener_set_doit+0x12d/0x1a90 fs/nfsd/nfsctl.c:1956
genl_family_rcv_msg_doit net/netlink/genetlink.c:1115 [inline]
genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
genl_rcv_msg+0xb14/0xec0 net/netlink/genetlink.c:1210
netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2550
genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219
netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
netlink_unicast+0x7f6/0x990 net/netlink/af_netlink.c:1357
netlink_sendmsg+0x8e4/0xcb0 net/netlink/af_netlink.c:1901
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0x221/0x270 net/socket.c:745
____sys_sendmsg+0x525/0x7d0 net/socket.c:2597
___sys_sendmsg net/socket.c:2651 [inline]
__sys_sendmsg+0x2b0/0x3a0 net/socket.c:2680
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f4204435ce9
RSP: 002b:00007ffea2feb278 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f4204435ce9
RDX: 0000000000000000 RSI: 0000000020000100 RDI: 0000000000000003
RBP: 0000000000000000 R08: 00000000000000a0 R09: 00000000000000a0
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007ffea2feb2a0 R15: 00007ffea2feb290
</TASK>
INFO: task syz-executor388:5270 blocked for more than 144 seconds.
Not tainted 6.11.0-rc5-syzkaller-00151-gfe1910f9337b #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor388 state:D stack:26576 pid:5270 tgid:5270 ppid:5253 flags:0x00004006
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5188 [inline]
__schedule+0x1800/0x4a60 kernel/sched/core.c:6529
__schedule_loop kernel/sched/core.c:6606 [inline]
schedule+0x14b/0x320 kernel/sched/core.c:6621
schedule_preempt_disabled+0x13/0x30 kernel/sched/core.c:6678
__mutex_lock_common kernel/locking/mutex.c:684 [inline]
__mutex_lock+0x6a4/0xd70 kernel/locking/mutex.c:752
nfsd_nl_listener_set_doit+0x12d/0x1a90 fs/nfsd/nfsctl.c:1956
genl_family_rcv_msg_doit net/netlink/genetlink.c:1115 [inline]
genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
genl_rcv_msg+0xb14/0xec0 net/netlink/genetlink.c:1210
netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2550
genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219
netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
netlink_unicast+0x7f6/0x990 net/netlink/af_netlink.c:1357
netlink_sendmsg+0x8e4/0xcb0 net/netlink/af_netlink.c:1901
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0x221/0x270 net/socket.c:745
____sys_sendmsg+0x525/0x7d0 net/socket.c:2597
___sys_sendmsg net/socket.c:2651 [inline]
__sys_sendmsg+0x2b0/0x3a0 net/socket.c:2680
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f4204435ce9
RSP: 002b:00007ffea2feb278 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f4204435ce9
RDX: 0000000000000000 RSI: 0000000020000100 RDI: 0000000000000003
RBP: 0000000000000000 R08: 00000000000000a0 R09: 00000000000000a0
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007ffea2feb2a0 R15: 00007ffea2feb290
</TASK>
Showing all locks held in the system:
1 lock held by khungtaskd/30:
#0: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:326 [inline]
#0: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:838 [inline]
#0: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x55/0x2a0 kernel/locking/lockdep.c:6626
2 locks held by kworker/u8:5/1073:
4 locks held by klogd/4679:
#0: ffff8880b893e9d8 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x2a/0x140 kernel/sched/core.c:560
#1: ffff8880b8928948 (&per_cpu_ptr(group->pcpu, cpu)->seq){-.-.}-{0:0}, at: psi_task_switch+0x441/0x770 kernel/sched/psi.c:989
#2: ffff88807cdc6418 (&p->pi_lock){-.-.}-{2:2}, at: class_raw_spinlock_irqsave_constructor include/linux/spinlock.h:551 [inline]
#2: ffff88807cdc6418 (&p->pi_lock){-.-.}-{2:2}, at: try_to_wake_up+0xb0/0x1470 kernel/sched/core.c:4051
#3: ffff8880b893e9d8 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x2a/0x140 kernel/sched/core.c:560
2 locks held by getty/4991:
#0: ffff8880309080a0 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x25/0x70 drivers/tty/tty_ldisc.c:243
#1: ffffc900031332f0 (&ldata->atomic_read_lock){+.+.}-{3:3}, at: n_tty_read+0x6ac/0x1e00 drivers/tty/n_tty.c:2211
2 locks held by syz-executor388/5254:
#0: ffffffff8fcf11b0 (cb_lock){++++}-{3:3}, at: genl_rcv+0x19/0x40 net/netlink/genetlink.c:1218
#1: ffffffff8ec0b628 (nfsd_mutex){+.+.}-{3:3}, at: nfsd_nl_listener_set_doit+0x12d/0x1a90 fs/nfsd/nfsctl.c:1956
2 locks held by syz-executor388/5258:
#0: ffffffff8fcf11b0 (cb_lock){++++}-{3:3}, at: genl_rcv+0x19/0x40 net/netlink/genetlink.c:1218
#1: ffffffff8ec0b628 (nfsd_mutex){+.+.}-{3:3}, at: nfsd_nl_listener_set_doit+0x12d/0x1a90 fs/nfsd/nfsctl.c:1956
2 locks held by syz-executor388/5259:
#0: ffffffff8fcf11b0 (cb_lock){++++}-{3:3}, at: genl_rcv+0x19/0x40 net/netlink/genetlink.c:1218
#1: ffffffff8ec0b628 (nfsd_mutex){+.+.}-{3:3}, at: nfsd_nl_listener_set_doit+0x12d/0x1a90 fs/nfsd/nfsctl.c:1956
2 locks held by syz-executor388/5261:
#0: ffffffff8fcf11b0 (cb_lock){++++}-{3:3}, at: genl_rcv+0x19/0x40 net/netlink/genetlink.c:1218
#1: ffffffff8ec0b628 (nfsd_mutex){+.+.}-{3:3}, at: nfsd_nl_listener_set_doit+0x12d/0x1a90 fs/nfsd/nfsctl.c:1956
2 locks held by syz-executor388/5270:
#0: ffffffff8fcf11b0 (cb_lock){++++}-{3:3}, at: genl_rcv+0x19/0x40 net/netlink/genetlink.c:1218
#1: ffffffff8ec0b628 (nfsd_mutex){+.+.}-{3:3}, at: nfsd_nl_listener_set_doit+0x12d/0x1a90 fs/nfsd/nfsctl.c:1956
=============================================
NMI backtrace for cpu 0
CPU: 0 UID: 0 PID: 30 Comm: khungtaskd Not tainted 6.11.0-rc5-syzkaller-00151-gfe1910f9337b #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:93 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
nmi_cpu_backtrace+0x49c/0x4d0 lib/nmi_backtrace.c:113
nmi_trigger_cpumask_backtrace+0x198/0x320 lib/nmi_backtrace.c:62
trigger_all_cpu_backtrace include/linux/nmi.h:162 [inline]
check_hung_uninterruptible_tasks kernel/hung_task.c:223 [inline]
watchdog+0xff4/0x1040 kernel/hung_task.c:379
kthread+0x2f0/0x390 kernel/kthread.c:389
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1
CPU: 1 UID: 0 PID: 4679 Comm: klogd Not tainted 6.11.0-rc5-syzkaller-00151-gfe1910f9337b #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
RIP: 0010:unwind_next_frame+0x690/0x2a00 arch/x86/kernel/unwind_orc.c:505
Code: 3c 5b 4d 01 ff 49 01 cf 0f 84 82 00 00 00 49 89 ee e8 64 5e 52 00 49 8d 6f 04 49 8d 5f 05 48 89 e8 48 c1 e8 03 42 0f b6 04 28 <84> c0 0f 85 88 1b 00 00 48 89 d8 48 c1 e8 03 42 0f b6 04 28 84 c0
RSP: 0018:ffffc90004187268 EFLAGS: 00000a02
RAX: 0000000000000000 RBX: ffffffff90a9f4f1 RCX: ffff88807c3c8000
RDX: 0000000000000000 RSI: ffffffff81faa2e5 RDI: ffffffff81faa2ea
RBP: ffffffff90a9f4f0 R08: ffffffff81412c60 R09: ffffc90004187430
R10: 0000000000000003 R11: ffffffff817f2f00 R12: ffffffff90310524
R13: dffffc0000000000 R14: 1ffff92000830e68 R15: ffffffff90a9f4ec
FS: 00007fb93ee6f380(0000) GS:ffff8880b8900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000555557e676f8 CR3: 000000007c060000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<NMI>
</NMI>
<TASK>
arch_stack_walk+0x151/0x1b0 arch/x86/kernel/stacktrace.c:25
stack_trace_save+0x118/0x1d0 kernel/stacktrace.c:122
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
unpoison_slab_object mm/kasan/common.c:312 [inline]
__kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:338
kasan_slab_alloc include/linux/kasan.h:201 [inline]
slab_post_alloc_hook mm/slub.c:3988 [inline]
slab_alloc_node mm/slub.c:4037 [inline]
kmem_cache_alloc_node_noprof+0x16b/0x320 mm/slub.c:4080
__alloc_skb+0x1c3/0x440 net/core/skbuff.c:664
alloc_skb include/linux/skbuff.h:1320 [inline]
alloc_skb_with_frags+0xc3/0x770 net/core/skbuff.c:6526
sock_alloc_send_pskb+0x91a/0xa60 net/core/sock.c:2815
unix_dgram_sendmsg+0x6d3/0x1f80 net/unix/af_unix.c:2030
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0x221/0x270 net/socket.c:745
__sys_sendto+0x3a4/0x4f0 net/socket.c:2204
__do_sys_sendto net/socket.c:2216 [inline]
__se_sys_sendto net/socket.c:2212 [inline]
__x64_sys_sendto+0xde/0x100 net/socket.c:2212
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fb93efd19b5
Code: 8b 44 24 08 48 83 c4 28 48 98 c3 48 98 c3 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 26 45 31 c9 45 31 c0 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 76 7a 48 8b 15 44 c4 0c 00 f7 d8 64 89 02 48 83
RSP: 002b:00007ffecbd3b868 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fb93efd19b5
RDX: 000000000000008b RSI: 000056134a140d80 RDI: 0000000000000003
RBP: 000056134a13c910 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000004000 R11: 0000000000000246 R12: 0000000000000013
R13: 00007fb93f15f212 R14: 00007ffecbd3b968 R15: 0000000000000000
</TASK>
INFO: NMI handler (nmi_cpu_backtrace_handler) took too long to run: 1.486 msecs
---
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.
* Re: [syzbot] [nfs?] INFO: task hung in nfsd_nl_listener_set_doit
2024-08-31 18:22 ` [syzbot] [nfs?] INFO: task hung in nfsd_nl_listener_set_doit syzbot
@ 2024-09-02 1:57 ` NeilBrown
2024-09-04 14:23 ` Chuck Lever
0 siblings, 1 reply; 9+ messages in thread
From: NeilBrown @ 2024-09-02 1:57 UTC (permalink / raw)
To: syzbot
Cc: Dai.Ngo, chuck.lever, dai.ngo, jlayton, kolga, linux-kernel,
linux-nfs, lorenzo, netdev, okorniev, syzkaller-bugs, tom
On Sun, 01 Sep 2024, syzbot wrote:
> syzbot has found a reproducer for the following issue on:
I had a poke around using the provided disk image and kernel for
exploring.
I think the problem is demonstrated by this stack :
[<0>] rpc_wait_bit_killable+0x1b/0x160
[<0>] __rpc_execute+0x723/0x1460
[<0>] rpc_execute+0x1ec/0x3f0
[<0>] rpc_run_task+0x562/0x6c0
[<0>] rpc_call_sync+0x197/0x2e0
[<0>] rpcb_register+0x36b/0x670
[<0>] svc_unregister+0x208/0x730
[<0>] svc_bind+0x1bb/0x1e0
[<0>] nfsd_create_serv+0x3f0/0x760
[<0>] nfsd_nl_listener_set_doit+0x135/0x1a90
[<0>] genl_rcv_msg+0xb16/0xec0
[<0>] netlink_rcv_skb+0x1e5/0x430
No rpcbind is running on this host so that "svc_unregister" takes a
long time. Maybe not forever but if a few of these get queued up all
blocking some other thread, then maybe that pushed it over the limit.
The fact that rpcbind is not running might not be relevant as the test
messes up the network. "ping 127.0.0.1" stops working.
So this bug comes down to "we try to contact rpcbind while holding a
mutex and if that gets no response and no error, then we can hold the
mutex for a long time".
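In outline, the blocked path looks like this (a condensed sketch based on
the stack above, not verbatim kernel source; most of the doit body and all
error handling omitted):

int nfsd_nl_listener_set_doit(struct sk_buff *skb, struct genl_info *info)
{
	int err;

	mutex_lock(&nfsd_mutex);	/* held across everything below */

	/*
	 * nfsd_create_serv() -> svc_bind() -> svc_unregister() ->
	 * rpcb_register() is a synchronous RPC to rpcbind.  With no
	 * rpcbind answering and loopback broken it can block for a
	 * long time, and every other task wanting nfsd_mutex (the
	 * hung tasks above) queues up behind it.
	 */
	err = nfsd_create_serv(genl_info_net(info));

	mutex_unlock(&nfsd_mutex);
	return err;
}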
Are we surprised? Do we want to fix this? Any suggestions how?
NeilBrown
* Re: [syzbot] [nfs?] INFO: task hung in nfsd_nl_listener_set_doit
2024-09-02 1:57 ` NeilBrown
@ 2024-09-04 14:23 ` Chuck Lever
2024-09-04 14:36 ` Jeff Layton
2024-10-09 20:26 ` Jeff Layton
0 siblings, 2 replies; 9+ messages in thread
From: Chuck Lever @ 2024-09-04 14:23 UTC (permalink / raw)
To: NeilBrown
Cc: syzbot, Dai.Ngo, jlayton, kolga, linux-kernel, linux-nfs, lorenzo,
netdev, okorniev, syzkaller-bugs, tom
On Mon, Sep 02, 2024 at 11:57:55AM +1000, NeilBrown wrote:
> On Sun, 01 Sep 2024, syzbot wrote:
> > syzbot has found a reproducer for the following issue on:
>
> I had a poke around using the provided disk image and kernel for
> exploring.
>
> I think the problem is demonstrated by this stack :
>
> [<0>] rpc_wait_bit_killable+0x1b/0x160
> [<0>] __rpc_execute+0x723/0x1460
> [<0>] rpc_execute+0x1ec/0x3f0
> [<0>] rpc_run_task+0x562/0x6c0
> [<0>] rpc_call_sync+0x197/0x2e0
> [<0>] rpcb_register+0x36b/0x670
> [<0>] svc_unregister+0x208/0x730
> [<0>] svc_bind+0x1bb/0x1e0
> [<0>] nfsd_create_serv+0x3f0/0x760
> [<0>] nfsd_nl_listener_set_doit+0x135/0x1a90
> [<0>] genl_rcv_msg+0xb16/0xec0
> [<0>] netlink_rcv_skb+0x1e5/0x430
>
> No rpcbind is running on this host so that "svc_unregister" takes a
> long time. Maybe not forever but if a few of these get queued up all
> blocking some other thread, then maybe that pushed it over the limit.
>
> The fact that rpcbind is not running might not be relevant as the test
> messes up the network. "ping 127.0.0.1" stops working.
>
> So this bug comes down to "we try to contact rpcbind while holding a
> mutex and if that gets no response and no error, then we can hold the
> mutex for a long time".
>
> Are we surprised? Do we want to fix this? Any suggestions how?
In the past, we've tried to address "hanging upcall" issues where
the kernel part of an administrative command needs a user space
service that isn't working or present. (eg mount needing a running
gssd)
If NFSD is using the kernel RPC client for the upcall, then maybe
adding the RPC_TASK_SOFTCONN flag might turn the hang into an
immediate failure.
IMO this should be addressed.
--
Chuck Lever
* Re: [syzbot] [nfs?] INFO: task hung in nfsd_nl_listener_set_doit
2024-09-04 14:23 ` Chuck Lever
@ 2024-09-04 14:36 ` Jeff Layton
2024-10-09 20:26 ` Jeff Layton
1 sibling, 0 replies; 9+ messages in thread
From: Jeff Layton @ 2024-09-04 14:36 UTC (permalink / raw)
To: Chuck Lever, NeilBrown
Cc: syzbot, Dai.Ngo, kolga, linux-kernel, linux-nfs, lorenzo, netdev,
okorniev, syzkaller-bugs, tom
On Wed, 2024-09-04 at 10:23 -0400, Chuck Lever wrote:
> On Mon, Sep 02, 2024 at 11:57:55AM +1000, NeilBrown wrote:
> > On Sun, 01 Sep 2024, syzbot wrote:
> > > syzbot has found a reproducer for the following issue on:
> >
> > I had a poke around using the provided disk image and kernel for
> > exploring.
> >
> > I think the problem is demonstrated by this stack :
> >
> > [<0>] rpc_wait_bit_killable+0x1b/0x160
> > [<0>] __rpc_execute+0x723/0x1460
> > [<0>] rpc_execute+0x1ec/0x3f0
> > [<0>] rpc_run_task+0x562/0x6c0
> > [<0>] rpc_call_sync+0x197/0x2e0
> > [<0>] rpcb_register+0x36b/0x670
> > [<0>] svc_unregister+0x208/0x730
> > [<0>] svc_bind+0x1bb/0x1e0
> > [<0>] nfsd_create_serv+0x3f0/0x760
> > [<0>] nfsd_nl_listener_set_doit+0x135/0x1a90
> > [<0>] genl_rcv_msg+0xb16/0xec0
> > [<0>] netlink_rcv_skb+0x1e5/0x430
> >
> > No rpcbind is running on this host so that "svc_unregister" takes a
> > long time. Maybe not forever but if a few of these get queued up all
> > blocking some other thread, then maybe that pushed it over the limit.
> >
> > The fact that rpcbind is not running might not be relevant as the test
> > messes up the network. "ping 127.0.0.1" stops working.
> >
> > So this bug comes down to "we try to contact rpcbind while holding a
> > mutex and if that gets no response and no error, then we can hold the
> > mutex for a long time".
> >
> > Are we surprised? Do we want to fix this? Any suggestions how?
>
> In the past, we've tried to address "hanging upcall" issues where
> the kernel part of an administrative command needs a user space
> service that isn't working or present. (eg mount needing a running
> gssd)
>
> If NFSD is using the kernel RPC client for the upcall, then maybe
> adding the RPC_TASK_SOFTCONN flag might turn the hang into an
> immediate failure.
>
> IMO this should be addressed.
>
Looking at rpcb_register_call, it looks like we already set SOFTCONN if
is_set is true. We probably did that assuming that we only call
svc_unregister on shutdown. svc_rpcb_setup does this though:
	/* Remove any stale portmap registrations */
	svc_unregister(serv, net);

	return 0;
What would be the risk in just setting SOFTCONN unconditionally?
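For concreteness, the unconditional version would look something like this
(a sketch with the tail of rpcb_register_call() condensed; not a posted
patch):

static int rpcb_register_call(struct sunrpc_net *sn, struct rpc_clnt *clnt,
			      struct rpc_message *msg, bool is_set)
{
	/* sketch: always SOFTCONN, no longer conditional on is_set */
	int flags = RPC_TASK_SOFTCONN;
	int error, result = 0;

	msg->rpc_resp = &result;

	/* fails fast instead of hanging when rpcbind is unreachable */
	error = rpc_call_sync(clnt, msg, flags);
	if (error < 0)
		return error;
	return result ? 0 : -EACCES;
}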
--
Jeff Layton <jlayton@kernel.org>
* Re: [syzbot] [nfs?] INFO: task hung in nfsd_nl_listener_set_doit
2024-09-04 14:23 ` Chuck Lever
2024-09-04 14:36 ` Jeff Layton
@ 2024-10-09 20:26 ` Jeff Layton
2024-10-11 18:18 ` Chuck Lever III
1 sibling, 1 reply; 9+ messages in thread
From: Jeff Layton @ 2024-10-09 20:26 UTC (permalink / raw)
To: Chuck Lever, NeilBrown
Cc: syzbot, Dai.Ngo, kolga, linux-kernel, linux-nfs, lorenzo, netdev,
okorniev, syzkaller-bugs, tom
On Wed, 2024-09-04 at 10:23 -0400, Chuck Lever wrote:
> On Mon, Sep 02, 2024 at 11:57:55AM +1000, NeilBrown wrote:
> > On Sun, 01 Sep 2024, syzbot wrote:
> > > syzbot has found a reproducer for the following issue on:
> >
> > I had a poke around using the provided disk image and kernel for
> > exploring.
> >
> > I think the problem is demonstrated by this stack :
> >
> > [<0>] rpc_wait_bit_killable+0x1b/0x160
> > [<0>] __rpc_execute+0x723/0x1460
> > [<0>] rpc_execute+0x1ec/0x3f0
> > [<0>] rpc_run_task+0x562/0x6c0
> > [<0>] rpc_call_sync+0x197/0x2e0
> > [<0>] rpcb_register+0x36b/0x670
> > [<0>] svc_unregister+0x208/0x730
> > [<0>] svc_bind+0x1bb/0x1e0
> > [<0>] nfsd_create_serv+0x3f0/0x760
> > [<0>] nfsd_nl_listener_set_doit+0x135/0x1a90
> > [<0>] genl_rcv_msg+0xb16/0xec0
> > [<0>] netlink_rcv_skb+0x1e5/0x430
> >
> > No rpcbind is running on this host so that "svc_unregister" takes a
> > long time. Maybe not forever but if a few of these get queued up all
> > blocking some other thread, then maybe that pushed it over the limit.
> >
> > The fact that rpcbind is not running might not be relevant as the test
> > messes up the network. "ping 127.0.0.1" stops working.
> >
> > So this bug comes down to "we try to contact rpcbind while holding a
> > mutex and if that gets no response and no error, then we can hold the
> > mutex for a long time".
> >
> > Are we surprised? Do we want to fix this? Any suggestions how?
>
> In the past, we've tried to address "hanging upcall" issues where
> the kernel part of an administrative command needs a user space
> service that isn't working or present. (eg mount needing a running
> gssd)
>
> If NFSD is using the kernel RPC client for the upcall, then maybe
> adding the RPC_TASK_SOFTCONN flag might turn the hang into an
> immediate failure.
>
> IMO this should be addressed.
>
>
I sent a patch that does the above, but now I'm wondering if we ought
to take another approach. The listener array can be pretty long. What
if we instead were to just drop and reacquire the mutex in the loop at
strategic points? Then we wouldn't squat on the mutex for so long.
Something like this maybe? It's ugly but it might prevent hung task
warnings, and listener setup isn't a fastpath anyway.
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index 3adbc05ebaac..5de01fb4c557 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -2042,7 +2042,9 @@ int nfsd_nl_listener_set_doit(struct sk_buff *skb, struct genl_info *info)

 		set_bit(XPT_CLOSE, &xprt->xpt_flags);
 		spin_unlock_bh(&serv->sv_lock);

 		svc_xprt_close(xprt);
+
+		/* ensure we don't squat on the mutex for too long */
+		mutex_unlock(&nfsd_mutex);
+		mutex_lock(&nfsd_mutex);
 		spin_lock_bh(&serv->sv_lock);
 	}

@@ -2082,6 +2084,10 @@ int nfsd_nl_listener_set_doit(struct sk_buff *skb, struct genl_info *info)
 		/* always save the latest error */
 		if (ret < 0)
 			err = ret;
+
+		/* ensure we don't squat on the mutex for too long */
+		mutex_unlock(&nfsd_mutex);
+		mutex_lock(&nfsd_mutex);
 	}

 	if (!serv->sv_nrthreads && list_empty(&nn->nfsd_serv->sv_permsocks))
* Re: [syzbot] [nfs?] INFO: task hung in nfsd_nl_listener_set_doit
2024-10-09 20:26 ` Jeff Layton
@ 2024-10-11 18:18 ` Chuck Lever III
2024-10-11 19:15 ` Jeff Layton
2024-10-11 21:08 ` NeilBrown
0 siblings, 2 replies; 9+ messages in thread
From: Chuck Lever III @ 2024-10-11 18:18 UTC (permalink / raw)
To: Jeff Layton
Cc: Neil Brown, syzbot, Dai Ngo, Olga Kornievskaia,
Linux Kernel Mailing List, Linux NFS Mailing List,
Lorenzo Bianconi, netdev, Olga Kornievskaia,
syzkaller-bugs@googlegroups.com, Tom Talpey
> On Oct 9, 2024, at 4:26 PM, Jeff Layton <jlayton@kernel.org> wrote:
>
> On Wed, 2024-09-04 at 10:23 -0400, Chuck Lever wrote:
>> On Mon, Sep 02, 2024 at 11:57:55AM +1000, NeilBrown wrote:
>>> On Sun, 01 Sep 2024, syzbot wrote:
>>>> syzbot has found a reproducer for the following issue on:
>>>
>>> I had a poke around using the provided disk image and kernel for
>>> exploring.
>>>
>>> I think the problem is demonstrated by this stack :
>>>
>>> [<0>] rpc_wait_bit_killable+0x1b/0x160
>>> [<0>] __rpc_execute+0x723/0x1460
>>> [<0>] rpc_execute+0x1ec/0x3f0
>>> [<0>] rpc_run_task+0x562/0x6c0
>>> [<0>] rpc_call_sync+0x197/0x2e0
>>> [<0>] rpcb_register+0x36b/0x670
>>> [<0>] svc_unregister+0x208/0x730
>>> [<0>] svc_bind+0x1bb/0x1e0
>>> [<0>] nfsd_create_serv+0x3f0/0x760
>>> [<0>] nfsd_nl_listener_set_doit+0x135/0x1a90
>>> [<0>] genl_rcv_msg+0xb16/0xec0
>>> [<0>] netlink_rcv_skb+0x1e5/0x430
>>>
>>> No rpcbind is running on this host so that "svc_unregister" takes a
>>> long time. Maybe not forever but if a few of these get queued up all
>>> blocking some other thread, then maybe that pushed it over the limit.
>>>
>>> The fact that rpcbind is not running might not be relevant as the test
>>> messes up the network. "ping 127.0.0.1" stops working.
>>>
>>> So this bug comes down to "we try to contact rpcbind while holding a
>>> mutex and if that gets no response and no error, then we can hold the
>>> mutex for a long time".
>>>
>>> Are we surprised? Do we want to fix this? Any suggestions how?
>>
>> In the past, we've tried to address "hanging upcall" issues where
>> the kernel part of an administrative command needs a user space
>> service that isn't working or present. (eg mount needing a running
>> gssd)
>>
>> If NFSD is using the kernel RPC client for the upcall, then maybe
>> adding the RPC_TASK_SOFTCONN flag might turn the hang into an
>> immediate failure.
>>
>> IMO this should be addressed.
>>
>>
>
> I sent a patch that does the above, but now I'm wondering if we ought
> to take another approach. The listener array can be pretty long. What
> if we instead were to just drop and reacquire the mutex in the loop at
> strategic points? Then we wouldn't squat on the mutex for so long.
>
> Something like this maybe? It's ugly but it might prevent hung task
> warnings, and listener setup isn't a fastpath anyway.
>
>
> diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> index 3adbc05ebaac..5de01fb4c557 100644
> --- a/fs/nfsd/nfsctl.c
> +++ b/fs/nfsd/nfsctl.c
> @@ -2042,7 +2042,9 @@ int nfsd_nl_listener_set_doit(struct sk_buff *skb, struct genl_info *info)
>
>  		set_bit(XPT_CLOSE, &xprt->xpt_flags);
>  		spin_unlock_bh(&serv->sv_lock);
>
>  		svc_xprt_close(xprt);
> +
> +		/* ensure we don't squat on the mutex for too long */
> +		mutex_unlock(&nfsd_mutex);
> +		mutex_lock(&nfsd_mutex);
>  		spin_lock_bh(&serv->sv_lock);
>  	}
>
> @@ -2082,6 +2084,10 @@ int nfsd_nl_listener_set_doit(struct sk_buff *skb, struct genl_info *info)
>  		/* always save the latest error */
>  		if (ret < 0)
>  			err = ret;
> +
> +		/* ensure we don't squat on the mutex for too long */
> +		mutex_unlock(&nfsd_mutex);
> +		mutex_lock(&nfsd_mutex);
>  	}
>
>  	if (!serv->sv_nrthreads && list_empty(&nn->nfsd_serv->sv_permsocks))
I had a look at the rpcb upcall code a couple of weeks ago.
I'm not convinced that setting SOFTCONN in all cases will
help but unfortunately the reasons for my skepticism have
all but leaked out of my head.
Releasing and re-acquiring the mutex is often a sign of
a deeper problem. I think you're in the right vicinity
but I'd like to better understand the actual cause of
the delay. The listener list shouldn't be all that long,
but maybe it has an unintentional loop in it?
I wish we had a reproducer for these issues.
--
Chuck Lever
* Re: [syzbot] [nfs?] INFO: task hung in nfsd_nl_listener_set_doit
2024-10-11 18:18 ` Chuck Lever III
@ 2024-10-11 19:15 ` Jeff Layton
2024-10-11 21:08 ` NeilBrown
1 sibling, 0 replies; 9+ messages in thread
From: Jeff Layton @ 2024-10-11 19:15 UTC (permalink / raw)
To: Chuck Lever III
Cc: Neil Brown, syzbot, Dai Ngo, Olga Kornievskaia,
Linux Kernel Mailing List, Linux NFS Mailing List,
Lorenzo Bianconi, netdev, Olga Kornievskaia,
syzkaller-bugs@googlegroups.com, Tom Talpey
On Fri, 2024-10-11 at 18:18 +0000, Chuck Lever III wrote:
>
> > On Oct 9, 2024, at 4:26 PM, Jeff Layton <jlayton@kernel.org> wrote:
> >
> > On Wed, 2024-09-04 at 10:23 -0400, Chuck Lever wrote:
> > > On Mon, Sep 02, 2024 at 11:57:55AM +1000, NeilBrown wrote:
> > > > On Sun, 01 Sep 2024, syzbot wrote:
> > > > > syzbot has found a reproducer for the following issue on:
> > > >
> > > > I had a poke around using the provided disk image and kernel for
> > > > exploring.
> > > >
> > > > I think the problem is demonstrated by this stack :
> > > >
> > > > [<0>] rpc_wait_bit_killable+0x1b/0x160
> > > > [<0>] __rpc_execute+0x723/0x1460
> > > > [<0>] rpc_execute+0x1ec/0x3f0
> > > > [<0>] rpc_run_task+0x562/0x6c0
> > > > [<0>] rpc_call_sync+0x197/0x2e0
> > > > [<0>] rpcb_register+0x36b/0x670
> > > > [<0>] svc_unregister+0x208/0x730
> > > > [<0>] svc_bind+0x1bb/0x1e0
> > > > [<0>] nfsd_create_serv+0x3f0/0x760
> > > > [<0>] nfsd_nl_listener_set_doit+0x135/0x1a90
> > > > [<0>] genl_rcv_msg+0xb16/0xec0
> > > > [<0>] netlink_rcv_skb+0x1e5/0x430
> > > >
> > > > No rpcbind is running on this host so that "svc_unregister" takes a
> > > > long time. Maybe not forever but if a few of these get queued up all
> > > > blocking some other thread, then maybe that pushed it over the limit.
> > > >
> > > > The fact that rpcbind is not running might not be relevant as the test
> > > > messes up the network. "ping 127.0.0.1" stops working.
> > > >
> > > > So this bug comes down to "we try to contact rpcbind while holding a
> > > > mutex and if that gets no response and no error, then we can hold the
> > > > mutex for a long time".
> > > >
> > > > Are we surprised? Do we want to fix this? Any suggestions how?
> > >
> > > In the past, we've tried to address "hanging upcall" issues where
> > > the kernel part of an administrative command needs a user space
> > > service that isn't working or present. (eg mount needing a running
> > > gssd)
> > >
> > > If NFSD is using the kernel RPC client for the upcall, then maybe
> > > adding the RPC_TASK_SOFTCONN flag might turn the hang into an
> > > immediate failure.
> > >
> > > IMO this should be addressed.
> > >
> > >
> >
> > I sent a patch that does the above, but now I'm wondering if we ought
> > to take another approach. The listener array can be pretty long. What
> > if we instead were to just drop and reacquire the mutex in the loop at
> > strategic points? Then we wouldn't squat on the mutex for so long.
> >
> > Something like this maybe? It's ugly but it might prevent hung task
> > warnings, and listener setup isn't a fastpath anyway.
> >
> >
> > diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> > index 3adbc05ebaac..5de01fb4c557 100644
> > --- a/fs/nfsd/nfsctl.c
> > +++ b/fs/nfsd/nfsctl.c
> > @@ -2042,7 +2042,9 @@ int nfsd_nl_listener_set_doit(struct sk_buff *skb, struct genl_info *info)
> >
> >  		set_bit(XPT_CLOSE, &xprt->xpt_flags);
> >  		spin_unlock_bh(&serv->sv_lock);
> >
> >  		svc_xprt_close(xprt);
> > +
> > +		/* ensure we don't squat on the mutex for too long */
> > +		mutex_unlock(&nfsd_mutex);
> > +		mutex_lock(&nfsd_mutex);
> >  		spin_lock_bh(&serv->sv_lock);
> >  	}
> >
> > @@ -2082,6 +2084,10 @@ int nfsd_nl_listener_set_doit(struct sk_buff *skb, struct genl_info *info)
> >  		/* always save the latest error */
> >  		if (ret < 0)
> >  			err = ret;
> > +
> > +		/* ensure we don't squat on the mutex for too long */
> > +		mutex_unlock(&nfsd_mutex);
> > +		mutex_lock(&nfsd_mutex);
> >  	}
> >
> >  	if (!serv->sv_nrthreads && list_empty(&nn->nfsd_serv->sv_permsocks))
>
> I had a look at the rpcb upcall code a couple of weeks ago.
> I'm not convinced that setting SOFTCONN in all cases will
> help but unfortunately the reasons for my skepticism have
> all but leaked out of my head.
>
> Releasing and re-acquiring the mutex is often a sign of
> a deeper problem.
It might look that way, but in this case we're iterating over the list
in the netlink call. There's no reason we need to hold the nfsd_mutex
over that entire process. Probably we can reduce the scope a bit
further and make this look a little less sketchy. I'll send a revised
patch.
>
> I think you're in the right vicinity
> but I'd like to better understand the actual cause of
> the delay. The listener list shouldn't be all that long,
> but maybe it has an unintentional loop in it?
>
I don't think that's possible given that this is in a netlink request,
but I'm no expert there.
> I wish we had a reproducer for these issues.
The old interface only allowed one listener to be set at a time. The
new one allows a list, and I don't think it's bounded in any way. You
could send down hundreds of listeners at once, and if they all end up
hanging for a little while trying to talk to rpcbind, then that could
easily cause the hung task warning.
--
Jeff Layton <jlayton@kernel.org>
* Re: [syzbot] [nfs?] INFO: task hung in nfsd_nl_listener_set_doit
2024-10-11 18:18 ` Chuck Lever III
2024-10-11 19:15 ` Jeff Layton
@ 2024-10-11 21:08 ` NeilBrown
2024-10-11 21:13 ` Chuck Lever III
1 sibling, 1 reply; 9+ messages in thread
From: NeilBrown @ 2024-10-11 21:08 UTC (permalink / raw)
To: Chuck Lever III
Cc: Jeff Layton, syzbot, Dai Ngo, Olga Kornievskaia,
Linux Kernel Mailing List, Linux NFS Mailing List,
Lorenzo Bianconi, netdev, Olga Kornievskaia,
syzkaller-bugs@googlegroups.com, Tom Talpey
On Sat, 12 Oct 2024, Chuck Lever III wrote:
>
>
> > On Oct 9, 2024, at 4:26 PM, Jeff Layton <jlayton@kernel.org> wrote:
> >
> > On Wed, 2024-09-04 at 10:23 -0400, Chuck Lever wrote:
> >> On Mon, Sep 02, 2024 at 11:57:55AM +1000, NeilBrown wrote:
> >>> On Sun, 01 Sep 2024, syzbot wrote:
> >>>> syzbot has found a reproducer for the following issue on:
> >>>
> >>> I had a poke around using the provided disk image and kernel for
> >>> exploring.
> >>>
> >>> I think the problem is demonstrated by this stack :
> >>>
> >>> [<0>] rpc_wait_bit_killable+0x1b/0x160
> >>> [<0>] __rpc_execute+0x723/0x1460
> >>> [<0>] rpc_execute+0x1ec/0x3f0
> >>> [<0>] rpc_run_task+0x562/0x6c0
> >>> [<0>] rpc_call_sync+0x197/0x2e0
> >>> [<0>] rpcb_register+0x36b/0x670
> >>> [<0>] svc_unregister+0x208/0x730
> >>> [<0>] svc_bind+0x1bb/0x1e0
> >>> [<0>] nfsd_create_serv+0x3f0/0x760
> >>> [<0>] nfsd_nl_listener_set_doit+0x135/0x1a90
> >>> [<0>] genl_rcv_msg+0xb16/0xec0
> >>> [<0>] netlink_rcv_skb+0x1e5/0x430
> >>>
> >>> No rpcbind is running on this host so that "svc_unregister" takes a
> >>> long time. Maybe not forever but if a few of these get queued up all
> >>> blocking some other thread, then maybe that pushed it over the limit.
> >>>
> >>> The fact that rpcbind is not running might not be relevant as the test
> >>> messes up the network. "ping 127.0.0.1" stops working.
> >>>
> >>> So this bug comes down to "we try to contact rpcbind while holding a
> >>> mutex and if that gets no response and no error, then we can hold the
> >>> mutex for a long time".
> >>>
> >>> Are we surprised? Do we want to fix this? Any suggestions how?
> >>
> >> In the past, we've tried to address "hanging upcall" issues where
> >> the kernel part of an administrative command needs a user space
> >> service that isn't working or present. (eg mount needing a running
> >> gssd)
> >>
> >> If NFSD is using the kernel RPC client for the upcall, then maybe
> >> adding the RPC_TASK_SOFTCONN flag might turn the hang into an
> >> immediate failure.
> >>
> >> IMO this should be addressed.
> >>
> >>
> >
> > I sent a patch that does the above, but now I'm wondering if we ought
> > to take another approach. The listener array can be pretty long. What
> > if we instead were to just drop and reacquire the mutex in the loop at
> > strategic points? Then we wouldn't squat on the mutex for so long.
> >
> > Something like this maybe? It's ugly but it might prevent hung task
> > warnings, and listener setup isn't a fastpath anyway.
> >
> >
> > diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> > index 3adbc05ebaac..5de01fb4c557 100644
> > --- a/fs/nfsd/nfsctl.c
> > +++ b/fs/nfsd/nfsctl.c
> > @@ -2042,7 +2042,9 @@ int nfsd_nl_listener_set_doit(struct sk_buff *skb, struct genl_info *info)
> >
> >  		set_bit(XPT_CLOSE, &xprt->xpt_flags);
> >  		spin_unlock_bh(&serv->sv_lock);
> >
> >  		svc_xprt_close(xprt);
> > +
> > +		/* ensure we don't squat on the mutex for too long */
> > +		mutex_unlock(&nfsd_mutex);
> > +		mutex_lock(&nfsd_mutex);
> >  		spin_lock_bh(&serv->sv_lock);
> >  	}
> >
> > @@ -2082,6 +2084,10 @@ int nfsd_nl_listener_set_doit(struct sk_buff *skb, struct genl_info *info)
> >  		/* always save the latest error */
> >  		if (ret < 0)
> >  			err = ret;
> > +
> > +		/* ensure we don't squat on the mutex for too long */
> > +		mutex_unlock(&nfsd_mutex);
> > +		mutex_lock(&nfsd_mutex);
> >  	}
> >
> >  	if (!serv->sv_nrthreads && list_empty(&nn->nfsd_serv->sv_permsocks))
>
> I had a look at the rpcb upcall code a couple of weeks ago.
> I'm not convinced that setting SOFTCONN in all cases will
> help but unfortunately the reasons for my skepticism have
> all but leaked out of my head.
>
> Releasing and re-acquiring the mutex is often a sign of
> a deeper problem. I think you're in the right vicinity
> but I'd like to better understand the actual cause of
> the delay. The listener list shouldn't be all that long,
> but maybe it has an unintentional loop in it?
I think it is wrong to register with rpcbind while holding a mutex.
Registering with rpcbind doesn't need to be synchronous, does it? Could
we punt that to a workqueue?
Do we need to get a failure status back somehow??
wait_for_completion_killable() somewhere??
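Roughly this shape, maybe (every name below is made up, and getting the
status back when the waiter is killed is the awkward part):

struct svc_rpcb_work {				/* hypothetical */
	struct work_struct work;
	struct svc_serv *serv;
	struct net *net;
	struct completion done;
};

static void svc_rpcb_workfn(struct work_struct *work)
{
	struct svc_rpcb_work *w = container_of(work, struct svc_rpcb_work,
					       work);

	/* the rpcbind upcall now runs without nfsd_mutex held */
	svc_unregister(w->serv, w->net);
	complete(&w->done);
}

static int svc_unregister_async(struct svc_serv *serv, struct net *net)
{
	struct svc_rpcb_work *w = kzalloc(sizeof(*w), GFP_KERNEL);

	if (!w)
		return -ENOMEM;
	w->serv = serv;
	w->net = net;
	init_completion(&w->done);
	INIT_WORK(&w->work, svc_rpcb_workfn);
	schedule_work(&w->work);

	/* a killable wait is not flagged by the hung-task detector,
	 * and the admin can give up on a stuck rpcbind */
	if (wait_for_completion_killable(&w->done) < 0)
		/* can't kfree(w) here: the work may still run; a real
		 * version needs cancellation or a refcount */
		return -ERESTARTSYS;
	kfree(w);
	return 0;
}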
>
> I wish we had a reproducer for these issues.
We do, I think. I downloaded the provided kernel and root image and ran
the reproducer supplied (or maybe an earlier version) and it triggered
quite easily.
NeilBrown
>
>
> --
> Chuck Lever
>
>
>
* Re: [syzbot] [nfs?] INFO: task hung in nfsd_nl_listener_set_doit
2024-10-11 21:08 ` NeilBrown
@ 2024-10-11 21:13 ` Chuck Lever III
0 siblings, 0 replies; 9+ messages in thread
From: Chuck Lever III @ 2024-10-11 21:13 UTC (permalink / raw)
To: Neil Brown
Cc: Jeff Layton, syzbot, Dai Ngo, Olga Kornievskaia,
Linux Kernel Mailing List, Linux NFS Mailing List,
Lorenzo Bianconi, netdev, Olga Kornievskaia,
syzkaller-bugs@googlegroups.com, Tom Talpey
> On Oct 11, 2024, at 5:08 PM, NeilBrown <neilb@suse.de> wrote:
>
> On Sat, 12 Oct 2024, Chuck Lever III wrote:
>>
>>
>>> On Oct 9, 2024, at 4:26 PM, Jeff Layton <jlayton@kernel.org> wrote:
>>>
>>> On Wed, 2024-09-04 at 10:23 -0400, Chuck Lever wrote:
>>>> On Mon, Sep 02, 2024 at 11:57:55AM +1000, NeilBrown wrote:
>>>>> On Sun, 01 Sep 2024, syzbot wrote:
>>>>>> syzbot has found a reproducer for the following issue on:
>>>>>
>>>>> I had a poke around using the provided disk image and kernel for
>>>>> exploring.
>>>>>
>>>>> I think the problem is demonstrated by this stack :
>>>>>
>>>>> [<0>] rpc_wait_bit_killable+0x1b/0x160
>>>>> [<0>] __rpc_execute+0x723/0x1460
>>>>> [<0>] rpc_execute+0x1ec/0x3f0
>>>>> [<0>] rpc_run_task+0x562/0x6c0
>>>>> [<0>] rpc_call_sync+0x197/0x2e0
>>>>> [<0>] rpcb_register+0x36b/0x670
>>>>> [<0>] svc_unregister+0x208/0x730
>>>>> [<0>] svc_bind+0x1bb/0x1e0
>>>>> [<0>] nfsd_create_serv+0x3f0/0x760
>>>>> [<0>] nfsd_nl_listener_set_doit+0x135/0x1a90
>>>>> [<0>] genl_rcv_msg+0xb16/0xec0
>>>>> [<0>] netlink_rcv_skb+0x1e5/0x430
>>>>>
>>>>> No rpcbind is running on this host so that "svc_unregister" takes a
>>>>> long time. Maybe not forever but if a few of these get queued up all
>>>>> blocking some other thread, then maybe that pushed it over the limit.
>>>>>
>>>>> The fact that rpcbind is not running might not be relevant as the test
>>>>> messes up the network. "ping 127.0.0.1" stops working.
>>>>>
>>>>> So this bug comes down to "we try to contact rpcbind while holding a
>>>>> mutex and if that gets no response and no error, then we can hold the
>>>>> mutex for a long time".
>>>>>
>>>>> Are we surprised? Do we want to fix this? Any suggestions how?
>>>>
>>>> In the past, we've tried to address "hanging upcall" issues where
>>>> the kernel part of an administrative command needs a user space
>>>> service that isn't working or present. (eg mount needing a running
>>>> gssd)
>>>>
>>>> If NFSD is using the kernel RPC client for the upcall, then maybe
>>>> adding the RPC_TASK_SOFTCONN flag might turn the hang into an
>>>> immediate failure.
>>>>
>>>> IMO this should be addressed.
>>>>
>>>>
>>>
>>> I sent a patch that does the above, but now I'm wondering if we ought
>>> to take another approach. The listener array can be pretty long. What
>>> if we instead were to just drop and reacquire the mutex in the loop at
>>> strategic points? Then we wouldn't squat on the mutex for so long.
>>>
>>> Something like this maybe? It's ugly but it might prevent hung task
>>> warnings, and listener setup isn't a fastpath anyway.
>>>
>>>
>>> diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
>>> index 3adbc05ebaac..5de01fb4c557 100644
>>> --- a/fs/nfsd/nfsctl.c
>>> +++ b/fs/nfsd/nfsctl.c
>>> @@ -2042,7 +2042,9 @@ int nfsd_nl_listener_set_doit(struct sk_buff *skb, struct genl_info *info)
>>>
>>>  		set_bit(XPT_CLOSE, &xprt->xpt_flags);
>>>  		spin_unlock_bh(&serv->sv_lock);
>>>
>>>  		svc_xprt_close(xprt);
>>> +
>>> +		/* ensure we don't squat on the mutex for too long */
>>> +		mutex_unlock(&nfsd_mutex);
>>> +		mutex_lock(&nfsd_mutex);
>>>  		spin_lock_bh(&serv->sv_lock);
>>>  	}
>>>
>>> @@ -2082,6 +2084,10 @@ int nfsd_nl_listener_set_doit(struct sk_buff *skb, struct genl_info *info)
>>>  		/* always save the latest error */
>>>  		if (ret < 0)
>>>  			err = ret;
>>> +
>>> +		/* ensure we don't squat on the mutex for too long */
>>> +		mutex_unlock(&nfsd_mutex);
>>> +		mutex_lock(&nfsd_mutex);
>>>  	}
>>>
>>>  	if (!serv->sv_nrthreads && list_empty(&nn->nfsd_serv->sv_permsocks))
>>
>> I had a look at the rpcb upcall code a couple of weeks ago.
>> I'm not convinced that setting SOFTCONN in all cases will
>> help but unfortunately the reasons for my skepticism have
>> all but leaked out of my head.
>>
>> Releasing and re-acquiring the mutex is often a sign of
>> a deeper problem. I think you're in the right vicinity
>> but I'd like to better understand the actual cause of
>> the delay. The listener list shouldn't be all that long,
>> but maybe it has an unintentional loop in it?
>
> I think it is wrong to register with rpcbind while holding a mutex.
> Registering with rpcbind doesn't need to be synchronous, does it? Could
> we punt that to a workqueue?
> Do we need to get a failure status back somehow??
> wait_for_completion_killable() somewhere??
I think kernel RPC service start-up needs to fail immediately
if rpcbind registration doesn't work.
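For instance, if svc_unregister() returned the status of the upcall
instead of void (it doesn't today; this is hypothetical), svc_rpcb_setup()
could propagate the failure:

static int svc_rpcb_setup(struct svc_serv *serv, struct net *net)
{
	int err;

	err = rpcb_create_local(net);
	if (err)
		return err;

	/* Remove any stale portmap registrations */
	err = svc_unregister(serv, net);	/* hypothetical int return */
	if (err)
		rpcb_put_local(net);
	return err;
}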
--
Chuck Lever