linux-kernel.vger.kernel.org archive mirror
* [syzbot] [nbd?] possible deadlock in nbd_queue_rq
@ 2025-07-06 16:33 syzbot
  2025-07-07  0:59 ` Hillf Danton
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: syzbot @ 2025-07-06 16:33 UTC (permalink / raw)
  To: axboe, josef, linux-block, linux-kernel, nbd, syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    26ffb3d6f02c Add linux-next specific files for 20250704
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=154e6582580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=1e4f88512ae53408
dashboard link: https://syzkaller.appspot.com/bug?extid=3dbc6142c85cc77eaf04
compiler:       Debian clang version 20.1.7 (++20250616065708+6146a88f6049-1~exp1~20250616065826.132), Debian LLD 20.1.7

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/fd5569903143/disk-26ffb3d6.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/1b0c9505c543/vmlinux-26ffb3d6.xz
kernel image: https://storage.googleapis.com/syzbot-assets/9d864c72bed1/bzImage-26ffb3d6.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+3dbc6142c85cc77eaf04@syzkaller.appspotmail.com

======================================================
WARNING: possible circular locking dependency detected
6.16.0-rc4-next-20250704-syzkaller #0 Not tainted
------------------------------------------------------
udevd/6083 is trying to acquire lock:
ffff88807b837870 (&nsock->tx_lock){+.+.}-{4:4}, at: nbd_handle_cmd drivers/block/nbd.c:1140 [inline]
ffff88807b837870 (&nsock->tx_lock){+.+.}-{4:4}, at: nbd_queue_rq+0x257/0xf10 drivers/block/nbd.c:1204

but task is already holding lock:
ffff8880597ee178 (&cmd->lock){+.+.}-{4:4}, at: nbd_queue_rq+0xc8/0xf10 drivers/block/nbd.c:1196

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #6 (&cmd->lock){+.+.}-{4:4}:
       lock_acquire+0x120/0x360 kernel/locking/lockdep.c:5871
       __mutex_lock_common kernel/locking/mutex.c:602 [inline]
       __mutex_lock+0x182/0xe80 kernel/locking/mutex.c:747
       nbd_queue_rq+0xc8/0xf10 drivers/block/nbd.c:1196
       blk_mq_dispatch_rq_list+0x4c0/0x1900 block/blk-mq.c:2118
       __blk_mq_do_dispatch_sched block/blk-mq-sched.c:168 [inline]
       blk_mq_do_dispatch_sched block/blk-mq-sched.c:182 [inline]
       __blk_mq_sched_dispatch_requests+0xda4/0x1570 block/blk-mq-sched.c:307
       blk_mq_sched_dispatch_requests+0xd7/0x190 block/blk-mq-sched.c:329
       blk_mq_run_hw_queue+0x348/0x4f0 block/blk-mq.c:2356
       blk_mq_dispatch_list+0xd0c/0xe00 include/linux/spinlock.h:-1
       blk_mq_flush_plug_list+0x469/0x550 block/blk-mq.c:2965
       __blk_flush_plug+0x3d3/0x4b0 block/blk-core.c:1220
       blk_finish_plug block/blk-core.c:1247 [inline]
       __submit_bio+0x2d3/0x5a0 block/blk-core.c:649
       __submit_bio_noacct_mq block/blk-core.c:722 [inline]
       submit_bio_noacct_nocheck+0x4ab/0xb50 block/blk-core.c:751
       submit_bh fs/buffer.c:2829 [inline]
       block_read_full_folio+0x7b7/0x830 fs/buffer.c:2461
       filemap_read_folio+0x117/0x380 mm/filemap.c:2413
       do_read_cache_folio+0x350/0x590 mm/filemap.c:3957
       read_mapping_folio include/linux/pagemap.h:972 [inline]
       read_part_sector+0xb6/0x2b0 block/partitions/core.c:722
       adfspart_check_ICS+0xa4/0xa50 block/partitions/acorn.c:360
       check_partition block/partitions/core.c:141 [inline]
       blk_add_partitions block/partitions/core.c:589 [inline]
       bdev_disk_changed+0x75c/0x14b0 block/partitions/core.c:693
       blkdev_get_whole+0x380/0x510 block/bdev.c:748
       bdev_open+0x31e/0xd30 block/bdev.c:957
       blkdev_open+0x3a8/0x510 block/fops.c:676
       do_dentry_open+0xdf0/0x1970 fs/open.c:964
       vfs_open+0x3b/0x340 fs/open.c:1094
       do_open fs/namei.c:3887 [inline]
       path_openat+0x2ee5/0x3830 fs/namei.c:4046
       do_filp_open+0x1fa/0x410 fs/namei.c:4073
       do_sys_openat2+0x121/0x1c0 fs/open.c:1434
       do_sys_open fs/open.c:1449 [inline]
       __do_sys_openat fs/open.c:1465 [inline]
       __se_sys_openat fs/open.c:1460 [inline]
       __x64_sys_openat+0x138/0x170 fs/open.c:1460
       do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
       do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

-> #5 (set->srcu){.+.+}-{0:0}:
       lock_sync+0xba/0x160 kernel/locking/lockdep.c:5919
       srcu_lock_sync include/linux/srcu.h:173 [inline]
       __synchronize_srcu+0x96/0x3a0 kernel/rcu/srcutree.c:1429
       elevator_switch+0x12b/0x5f0 block/elevator.c:587
       elevator_change+0x21b/0x320 block/elevator.c:679
       elevator_set_default+0x144/0x210 block/elevator.c:737
       blk_register_queue+0x35d/0x400 block/blk-sysfs.c:879
       __add_disk+0x677/0xd50 block/genhd.c:528
       add_disk_fwnode+0xfc/0x480 block/genhd.c:597
       add_disk include/linux/blkdev.h:765 [inline]
       nbd_dev_add+0x70e/0xb00 drivers/block/nbd.c:1963
       nbd_init+0x21a/0x2d0 drivers/block/nbd.c:2670
       do_one_initcall+0x233/0x820 init/main.c:1269
       do_initcall_level+0x137/0x1f0 init/main.c:1331
       do_initcalls+0x69/0xd0 init/main.c:1347
       kernel_init_freeable+0x3d9/0x570 init/main.c:1579
       kernel_init+0x1d/0x1d0 init/main.c:1469
       ret_from_fork+0x3fc/0x770 arch/x86/kernel/process.c:148
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245

-> #4 (&q->elevator_lock){+.+.}-{4:4}:
       lock_acquire+0x120/0x360 kernel/locking/lockdep.c:5871
       __mutex_lock_common kernel/locking/mutex.c:602 [inline]
       __mutex_lock+0x182/0xe80 kernel/locking/mutex.c:747
       elv_update_nr_hw_queues+0x87/0x2a0 block/elevator.c:699
       __blk_mq_update_nr_hw_queues block/blk-mq.c:5024 [inline]
       blk_mq_update_nr_hw_queues+0xd54/0x14c0 block/blk-mq.c:5045
       nbd_start_device+0x16c/0xac0 drivers/block/nbd.c:1476
       nbd_genl_connect+0x1250/0x1930 drivers/block/nbd.c:2201
       genl_family_rcv_msg_doit+0x212/0x300 net/netlink/genetlink.c:1115
       genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
       genl_rcv_msg+0x60e/0x790 net/netlink/genetlink.c:1210
       netlink_rcv_skb+0x208/0x470 net/netlink/af_netlink.c:2534
       genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219
       netlink_unicast_kernel net/netlink/af_netlink.c:1313 [inline]
       netlink_unicast+0x75b/0x8d0 net/netlink/af_netlink.c:1339
       netlink_sendmsg+0x805/0xb30 net/netlink/af_netlink.c:1883
       sock_sendmsg_nosec net/socket.c:714 [inline]
       __sock_sendmsg+0x219/0x270 net/socket.c:729
       ____sys_sendmsg+0x505/0x830 net/socket.c:2614
       ___sys_sendmsg+0x21f/0x2a0 net/socket.c:2668
       __sys_sendmsg net/socket.c:2700 [inline]
       __do_sys_sendmsg net/socket.c:2705 [inline]
       __se_sys_sendmsg net/socket.c:2703 [inline]
       __x64_sys_sendmsg+0x19b/0x260 net/socket.c:2703
       do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
       do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

-> #3 (&q->q_usage_counter(io)#49){++++}-{0:0}:
       lock_acquire+0x120/0x360 kernel/locking/lockdep.c:5871
       blk_alloc_queue+0x538/0x620 block/blk-core.c:461
       blk_mq_alloc_queue block/blk-mq.c:4398 [inline]
       __blk_mq_alloc_disk+0x162/0x340 block/blk-mq.c:4445
       nbd_dev_add+0x476/0xb00 drivers/block/nbd.c:1933
       nbd_init+0x21a/0x2d0 drivers/block/nbd.c:2670
       do_one_initcall+0x233/0x820 init/main.c:1269
       do_initcall_level+0x137/0x1f0 init/main.c:1331
       do_initcalls+0x69/0xd0 init/main.c:1347
       kernel_init_freeable+0x3d9/0x570 init/main.c:1579
       kernel_init+0x1d/0x1d0 init/main.c:1469
       ret_from_fork+0x3fc/0x770 arch/x86/kernel/process.c:148
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245

-> #2 (fs_reclaim){+.+.}-{0:0}:
       lock_acquire+0x120/0x360 kernel/locking/lockdep.c:5871
       __fs_reclaim_acquire mm/page_alloc.c:4231 [inline]
       fs_reclaim_acquire+0x72/0x100 mm/page_alloc.c:4245
       might_alloc include/linux/sched/mm.h:318 [inline]
       slab_pre_alloc_hook mm/slub.c:4131 [inline]
       slab_alloc_node mm/slub.c:4209 [inline]
       kmem_cache_alloc_node_noprof+0x47/0x3c0 mm/slub.c:4281
       __alloc_skb+0x112/0x2d0 net/core/skbuff.c:660
       alloc_skb_fclone include/linux/skbuff.h:1386 [inline]
       tcp_stream_alloc_skb+0x3d/0x340 net/ipv4/tcp.c:892
       tcp_sendmsg_locked+0x1f46/0x5630 net/ipv4/tcp.c:1198
       tcp_sendmsg+0x2f/0x50 net/ipv4/tcp.c:1394
       sock_sendmsg_nosec net/socket.c:714 [inline]
       __sock_sendmsg+0x19c/0x270 net/socket.c:729
       sock_write_iter+0x258/0x330 net/socket.c:1179
       new_sync_write fs/read_write.c:593 [inline]
       vfs_write+0x548/0xa90 fs/read_write.c:686
       ksys_write+0x145/0x250 fs/read_write.c:738
       do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
       do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

-> #1 (sk_lock-AF_INET){+.+.}-{0:0}:
       lock_acquire+0x120/0x360 kernel/locking/lockdep.c:5871
       lock_sock_nested+0x48/0x100 net/core/sock.c:3727
       lock_sock include/net/sock.h:1667 [inline]
       inet_shutdown+0x6a/0x390 net/ipv4/af_inet.c:905
       nbd_mark_nsock_dead+0x2e9/0x560 drivers/block/nbd.c:318
       recv_work+0x2138/0x24f0 drivers/block/nbd.c:1018
       process_one_work kernel/workqueue.c:3239 [inline]
       process_scheduled_works+0xae1/0x17b0 kernel/workqueue.c:3322
       worker_thread+0x8a0/0xda0 kernel/workqueue.c:3403
       kthread+0x70e/0x8a0 kernel/kthread.c:463
       ret_from_fork+0x3fc/0x770 arch/x86/kernel/process.c:148
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245

-> #0 (&nsock->tx_lock){+.+.}-{4:4}:
       check_prev_add kernel/locking/lockdep.c:3168 [inline]
       check_prevs_add kernel/locking/lockdep.c:3287 [inline]
       validate_chain+0xb9b/0x2140 kernel/locking/lockdep.c:3911
       __lock_acquire+0xab9/0xd20 kernel/locking/lockdep.c:5240
       lock_acquire+0x120/0x360 kernel/locking/lockdep.c:5871
       __mutex_lock_common kernel/locking/mutex.c:602 [inline]
       __mutex_lock+0x182/0xe80 kernel/locking/mutex.c:747
       nbd_handle_cmd drivers/block/nbd.c:1140 [inline]
       nbd_queue_rq+0x257/0xf10 drivers/block/nbd.c:1204
       blk_mq_dispatch_rq_list+0x4c0/0x1900 block/blk-mq.c:2118
       __blk_mq_do_dispatch_sched block/blk-mq-sched.c:168 [inline]
       blk_mq_do_dispatch_sched block/blk-mq-sched.c:182 [inline]
       __blk_mq_sched_dispatch_requests+0xda4/0x1570 block/blk-mq-sched.c:307
       blk_mq_sched_dispatch_requests+0xd7/0x190 block/blk-mq-sched.c:329
       blk_mq_run_hw_queue+0x348/0x4f0 block/blk-mq.c:2356
       blk_mq_dispatch_list+0xd0c/0xe00 include/linux/spinlock.h:-1
       blk_mq_flush_plug_list+0x469/0x550 block/blk-mq.c:2965
       __blk_flush_plug+0x3d3/0x4b0 block/blk-core.c:1220
       blk_finish_plug block/blk-core.c:1247 [inline]
       __submit_bio+0x2d3/0x5a0 block/blk-core.c:649
       __submit_bio_noacct_mq block/blk-core.c:722 [inline]
       submit_bio_noacct_nocheck+0x4ab/0xb50 block/blk-core.c:751
       submit_bh fs/buffer.c:2829 [inline]
       block_read_full_folio+0x7b7/0x830 fs/buffer.c:2461
       filemap_read_folio+0x117/0x380 mm/filemap.c:2413
       do_read_cache_folio+0x350/0x590 mm/filemap.c:3957
       read_mapping_folio include/linux/pagemap.h:972 [inline]
       read_part_sector+0xb6/0x2b0 block/partitions/core.c:722
       adfspart_check_ICS+0xa4/0xa50 block/partitions/acorn.c:360
       check_partition block/partitions/core.c:141 [inline]
       blk_add_partitions block/partitions/core.c:589 [inline]
       bdev_disk_changed+0x75c/0x14b0 block/partitions/core.c:693
       blkdev_get_whole+0x380/0x510 block/bdev.c:748
       bdev_open+0x31e/0xd30 block/bdev.c:957
       blkdev_open+0x3a8/0x510 block/fops.c:676
       do_dentry_open+0xdf0/0x1970 fs/open.c:964
       vfs_open+0x3b/0x340 fs/open.c:1094
       do_open fs/namei.c:3887 [inline]
       path_openat+0x2ee5/0x3830 fs/namei.c:4046
       do_filp_open+0x1fa/0x410 fs/namei.c:4073
       do_sys_openat2+0x121/0x1c0 fs/open.c:1434
       do_sys_open fs/open.c:1449 [inline]
       __do_sys_openat fs/open.c:1465 [inline]
       __se_sys_openat fs/open.c:1460 [inline]
       __x64_sys_openat+0x138/0x170 fs/open.c:1460
       do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
       do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

other info that might help us debug this:

Chain exists of:
  &nsock->tx_lock --> set->srcu --> &cmd->lock

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&cmd->lock);
                               lock(set->srcu);
                               lock(&cmd->lock);
  lock(&nsock->tx_lock);

 *** DEADLOCK ***
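For readers unfamiliar with lockdep output: the scenario above is flagged because the new acquisition would close a cycle in lockdep's dependency graph. A toy model (not lockdep itself; lock names condensed from the chain printed above) of that check:

```python
def has_path(edges, src, dst, seen=None):
    """Depth-first search: can `src` reach `dst` in the dependency graph?"""
    if seen is None:
        seen = set()
    if src == dst:
        return True
    seen.add(src)
    return any(has_path(edges, n, dst, seen)
               for n in edges.get(src, ()) if n not in seen)

# "A -> B" means "B was taken while A was held" (condensed from the
# reported chain: &nsock->tx_lock --> set->srcu --> &cmd->lock).
edges = {
    "nsock->tx_lock": ["set->srcu"],
    "set->srcu": ["cmd->lock"],
}

held, wanted = "cmd->lock", "nsock->tx_lock"
# Taking `wanted` while holding `held` would add the edge held -> wanted;
# that creates a cycle iff `wanted` already reaches `held`.
print(has_path(edges, wanted, held))  # prints True: circular dependency
```

This is why lockdep reports the deadlock as "possible": it warns on the ordering cycle itself, whether or not the two CPUs ever race in practice.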

3 locks held by udevd/6083:
 #0: ffff888024ed8358 (&disk->open_mutex){+.+.}-{4:4}, at: bdev_open+0xe0/0xd30 block/bdev.c:945
 #1: ffff888024d36f90 (set->srcu){.+.+}-{0:0}, at: srcu_lock_acquire include/linux/srcu.h:161 [inline]
 #1: ffff888024d36f90 (set->srcu){.+.+}-{0:0}, at: srcu_read_lock include/linux/srcu.h:253 [inline]
 #1: ffff888024d36f90 (set->srcu){.+.+}-{0:0}, at: blk_mq_run_hw_queue+0x31f/0x4f0 block/blk-mq.c:2356
 #2: ffff8880597ee178 (&cmd->lock){+.+.}-{4:4}, at: nbd_queue_rq+0xc8/0xf10 drivers/block/nbd.c:1196

stack backtrace:
CPU: 0 UID: 0 PID: 6083 Comm: udevd Not tainted 6.16.0-rc4-next-20250704-syzkaller #0 PREEMPT(full) 
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025
Call Trace:
 <TASK>
 dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120
 print_circular_bug+0x2ee/0x310 kernel/locking/lockdep.c:2046
 check_noncircular+0x134/0x160 kernel/locking/lockdep.c:2178
 check_prev_add kernel/locking/lockdep.c:3168 [inline]
 check_prevs_add kernel/locking/lockdep.c:3287 [inline]
 validate_chain+0xb9b/0x2140 kernel/locking/lockdep.c:3911
 __lock_acquire+0xab9/0xd20 kernel/locking/lockdep.c:5240
 lock_acquire+0x120/0x360 kernel/locking/lockdep.c:5871
 __mutex_lock_common kernel/locking/mutex.c:602 [inline]
 __mutex_lock+0x182/0xe80 kernel/locking/mutex.c:747
 nbd_handle_cmd drivers/block/nbd.c:1140 [inline]
 nbd_queue_rq+0x257/0xf10 drivers/block/nbd.c:1204
 blk_mq_dispatch_rq_list+0x4c0/0x1900 block/blk-mq.c:2118
 __blk_mq_do_dispatch_sched block/blk-mq-sched.c:168 [inline]
 blk_mq_do_dispatch_sched block/blk-mq-sched.c:182 [inline]
 __blk_mq_sched_dispatch_requests+0xda4/0x1570 block/blk-mq-sched.c:307
 blk_mq_sched_dispatch_requests+0xd7/0x190 block/blk-mq-sched.c:329
 blk_mq_run_hw_queue+0x348/0x4f0 block/blk-mq.c:2356
 blk_mq_dispatch_list+0xd0c/0xe00 include/linux/spinlock.h:-1
 blk_mq_flush_plug_list+0x469/0x550 block/blk-mq.c:2965
 __blk_flush_plug+0x3d3/0x4b0 block/blk-core.c:1220
 blk_finish_plug block/blk-core.c:1247 [inline]
 __submit_bio+0x2d3/0x5a0 block/blk-core.c:649
 __submit_bio_noacct_mq block/blk-core.c:722 [inline]
 submit_bio_noacct_nocheck+0x4ab/0xb50 block/blk-core.c:751
 submit_bh fs/buffer.c:2829 [inline]
 block_read_full_folio+0x7b7/0x830 fs/buffer.c:2461
 filemap_read_folio+0x117/0x380 mm/filemap.c:2413
 do_read_cache_folio+0x350/0x590 mm/filemap.c:3957
 read_mapping_folio include/linux/pagemap.h:972 [inline]
 read_part_sector+0xb6/0x2b0 block/partitions/core.c:722
 adfspart_check_ICS+0xa4/0xa50 block/partitions/acorn.c:360
 check_partition block/partitions/core.c:141 [inline]
 blk_add_partitions block/partitions/core.c:589 [inline]
 bdev_disk_changed+0x75c/0x14b0 block/partitions/core.c:693
 blkdev_get_whole+0x380/0x510 block/bdev.c:748
 bdev_open+0x31e/0xd30 block/bdev.c:957
 blkdev_open+0x3a8/0x510 block/fops.c:676
 do_dentry_open+0xdf0/0x1970 fs/open.c:964
 vfs_open+0x3b/0x340 fs/open.c:1094
 do_open fs/namei.c:3887 [inline]
 path_openat+0x2ee5/0x3830 fs/namei.c:4046
 do_filp_open+0x1fa/0x410 fs/namei.c:4073
 do_sys_openat2+0x121/0x1c0 fs/open.c:1434
 do_sys_open fs/open.c:1449 [inline]
 __do_sys_openat fs/open.c:1465 [inline]
 __se_sys_openat fs/open.c:1460 [inline]
 __x64_sys_openat+0x138/0x170 fs/open.c:1460
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f674caa7407
Code: 48 89 fa 4c 89 df e8 38 aa 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00 83 e2 39 83 fa 08 75 de e8 23 ff ff ff
RSP: 002b:00007fff79271070 EFLAGS: 00000202 ORIG_RAX: 0000000000000101
RAX: ffffffffffffffda RBX: 00007f674d25e880 RCX: 00007f674caa7407
RDX: 00000000000a0800 RSI: 000055ad17c52fe0 RDI: ffffffffffffff9c
RBP: 000055ad17c52910 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000202 R12: 000055ad17c53000
R13: 000055ad17c6a430 R14: 0000000000000000 R15: 000055ad17c53000
 </TASK>
block nbd0: Dead connection, failed to find a fallback
block nbd0: shutting down sockets
I/O error, dev nbd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
Buffer I/O error on dev nbd0, logical block 0, async page read
I/O error, dev nbd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
Buffer I/O error on dev nbd0, logical block 0, async page read
I/O error, dev nbd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
Buffer I/O error on dev nbd0, logical block 0, async page read
I/O error, dev nbd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
Buffer I/O error on dev nbd0, logical block 0, async page read
I/O error, dev nbd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
Buffer I/O error on dev nbd0, logical block 0, async page read
I/O error, dev nbd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
Buffer I/O error on dev nbd0, logical block 0, async page read
I/O error, dev nbd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
Buffer I/O error on dev nbd0, logical block 0, async page read
I/O error, dev nbd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
Buffer I/O error on dev nbd0, logical block 0, async page read
ldm_validate_partition_table(): Disk read failed.
I/O error, dev nbd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
Buffer I/O error on dev nbd0, logical block 0, async page read
I/O error, dev nbd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
Buffer I/O error on dev nbd0, logical block 0, async page read
Dev nbd0: unable to read RDB block 0
 nbd0: unable to read partition table
ldm_validate_partition_table(): Disk read failed.
Dev nbd0: unable to read RDB block 0
 nbd0: unable to read partition table


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite the report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup


* Re: [syzbot] [nbd?] possible deadlock in nbd_queue_rq
  2025-07-06 16:33 [syzbot] [nbd?] possible deadlock in nbd_queue_rq syzbot
@ 2025-07-07  0:59 ` Hillf Danton
  2025-07-07 17:39   ` Bart Van Assche
  2025-09-15  2:16 ` syzbot
  2025-09-16 18:18 ` syzbot
  2 siblings, 1 reply; 10+ messages in thread
From: Hillf Danton @ 2025-07-07  0:59 UTC (permalink / raw)
  To: syzbot
  Cc: axboe, josef, linux-block, Ming Lei, Tetsuo Handa, linux-kernel,
	nbd, syzkaller-bugs

> Date: Sun, 06 Jul 2025 09:33:27 -0700
> ======================================================
> WARNING: possible circular locking dependency detected
> 6.16.0-rc4-next-20250704-syzkaller #0 Not tainted
> ------------------------------------------------------
> udevd/6083 is trying to acquire lock:
> ffff88807b837870 (&nsock->tx_lock){+.+.}-{4:4}, at: nbd_handle_cmd drivers/block/nbd.c:1140 [inline]
> ffff88807b837870 (&nsock->tx_lock){+.+.}-{4:4}, at: nbd_queue_rq+0x257/0xf10 drivers/block/nbd.c:1204
> 
> but task is already holding lock:
> ffff8880597ee178 (&cmd->lock){+.+.}-{4:4}, at: nbd_queue_rq+0xc8/0xf10 drivers/block/nbd.c:1196
> 
> which lock already depends on the new lock.
> 
> 
> the existing dependency chain (in reverse order) is:
> 
> -> #6 (&cmd->lock){+.+.}-{4:4}:
>        lock_acquire+0x120/0x360 kernel/locking/lockdep.c:5871
>        __mutex_lock_common kernel/locking/mutex.c:602 [inline]
>        __mutex_lock+0x182/0xe80 kernel/locking/mutex.c:747
>        nbd_queue_rq+0xc8/0xf10 drivers/block/nbd.c:1196
>        blk_mq_dispatch_rq_list+0x4c0/0x1900 block/blk-mq.c:2118
>        __blk_mq_do_dispatch_sched block/blk-mq-sched.c:168 [inline]
>        blk_mq_do_dispatch_sched block/blk-mq-sched.c:182 [inline]
>        __blk_mq_sched_dispatch_requests+0xda4/0x1570 block/blk-mq-sched.c:307
>        blk_mq_sched_dispatch_requests+0xd7/0x190 block/blk-mq-sched.c:329
>        blk_mq_run_hw_queue+0x348/0x4f0 block/blk-mq.c:2356
>        blk_mq_dispatch_list+0xd0c/0xe00 include/linux/spinlock.h:-1
>        blk_mq_flush_plug_list+0x469/0x550 block/blk-mq.c:2965
>        __blk_flush_plug+0x3d3/0x4b0 block/blk-core.c:1220
>        blk_finish_plug block/blk-core.c:1247 [inline]
>        __submit_bio+0x2d3/0x5a0 block/blk-core.c:649
>        __submit_bio_noacct_mq block/blk-core.c:722 [inline]
>        submit_bio_noacct_nocheck+0x4ab/0xb50 block/blk-core.c:751
>        submit_bh fs/buffer.c:2829 [inline]
>        block_read_full_folio+0x7b7/0x830 fs/buffer.c:2461
>        filemap_read_folio+0x117/0x380 mm/filemap.c:2413
>        do_read_cache_folio+0x350/0x590 mm/filemap.c:3957
>        read_mapping_folio include/linux/pagemap.h:972 [inline]
>        read_part_sector+0xb6/0x2b0 block/partitions/core.c:722
>        adfspart_check_ICS+0xa4/0xa50 block/partitions/acorn.c:360
>        check_partition block/partitions/core.c:141 [inline]
>        blk_add_partitions block/partitions/core.c:589 [inline]
>        bdev_disk_changed+0x75c/0x14b0 block/partitions/core.c:693
>        blkdev_get_whole+0x380/0x510 block/bdev.c:748
>        bdev_open+0x31e/0xd30 block/bdev.c:957
>        blkdev_open+0x3a8/0x510 block/fops.c:676
>        do_dentry_open+0xdf0/0x1970 fs/open.c:964
>        vfs_open+0x3b/0x340 fs/open.c:1094
>        do_open fs/namei.c:3887 [inline]
>        path_openat+0x2ee5/0x3830 fs/namei.c:4046
>        do_filp_open+0x1fa/0x410 fs/namei.c:4073
>        do_sys_openat2+0x121/0x1c0 fs/open.c:1434
>        do_sys_open fs/open.c:1449 [inline]
>        __do_sys_openat fs/open.c:1465 [inline]
>        __se_sys_openat fs/open.c:1460 [inline]
>        __x64_sys_openat+0x138/0x170 fs/open.c:1460
>        do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>        do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
>        entry_SYSCALL_64_after_hwframe+0x77/0x7f
> 
> -> #5 (set->srcu){.+.+}-{0:0}:
>        lock_sync+0xba/0x160 kernel/locking/lockdep.c:5919
>        srcu_lock_sync include/linux/srcu.h:173 [inline]
>        __synchronize_srcu+0x96/0x3a0 kernel/rcu/srcutree.c:1429
>        elevator_switch+0x12b/0x5f0 block/elevator.c:587
>        elevator_change+0x21b/0x320 block/elevator.c:679
>        elevator_set_default+0x144/0x210 block/elevator.c:737
>        blk_register_queue+0x35d/0x400 block/blk-sysfs.c:879
>        __add_disk+0x677/0xd50 block/genhd.c:528
>        add_disk_fwnode+0xfc/0x480 block/genhd.c:597
>        add_disk include/linux/blkdev.h:765 [inline]
>        nbd_dev_add+0x70e/0xb00 drivers/block/nbd.c:1963
>        nbd_init+0x21a/0x2d0 drivers/block/nbd.c:2670

This is the first occurrence of nbd_init in the lock chain,

>        do_one_initcall+0x233/0x820 init/main.c:1269
>        do_initcall_level+0x137/0x1f0 init/main.c:1331
>        do_initcalls+0x69/0xd0 init/main.c:1347
>        kernel_init_freeable+0x3d9/0x570 init/main.c:1579
>        kernel_init+0x1d/0x1d0 init/main.c:1469
>        ret_from_fork+0x3fc/0x770 arch/x86/kernel/process.c:148
>        ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
> 
> -> #4 (&q->elevator_lock){+.+.}-{4:4}:
>        lock_acquire+0x120/0x360 kernel/locking/lockdep.c:5871
>        __mutex_lock_common kernel/locking/mutex.c:602 [inline]
>        __mutex_lock+0x182/0xe80 kernel/locking/mutex.c:747
>        elv_update_nr_hw_queues+0x87/0x2a0 block/elevator.c:699
>        __blk_mq_update_nr_hw_queues block/blk-mq.c:5024 [inline]
>        blk_mq_update_nr_hw_queues+0xd54/0x14c0 block/blk-mq.c:5045
>        nbd_start_device+0x16c/0xac0 drivers/block/nbd.c:1476
>        nbd_genl_connect+0x1250/0x1930 drivers/block/nbd.c:2201
>        genl_family_rcv_msg_doit+0x212/0x300 net/netlink/genetlink.c:1115
>        genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
>        genl_rcv_msg+0x60e/0x790 net/netlink/genetlink.c:1210
>        netlink_rcv_skb+0x208/0x470 net/netlink/af_netlink.c:2534
>        genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219
>        netlink_unicast_kernel net/netlink/af_netlink.c:1313 [inline]
>        netlink_unicast+0x75b/0x8d0 net/netlink/af_netlink.c:1339
>        netlink_sendmsg+0x805/0xb30 net/netlink/af_netlink.c:1883
>        sock_sendmsg_nosec net/socket.c:714 [inline]
>        __sock_sendmsg+0x219/0x270 net/socket.c:729
>        ____sys_sendmsg+0x505/0x830 net/socket.c:2614
>        ___sys_sendmsg+0x21f/0x2a0 net/socket.c:2668
>        __sys_sendmsg net/socket.c:2700 [inline]
>        __do_sys_sendmsg net/socket.c:2705 [inline]
>        __se_sys_sendmsg net/socket.c:2703 [inline]
>        __x64_sys_sendmsg+0x19b/0x260 net/socket.c:2703
>        do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>        do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
>        entry_SYSCALL_64_after_hwframe+0x77/0x7f
> 
> -> #3 (&q->q_usage_counter(io)#49){++++}-{0:0}:
>        lock_acquire+0x120/0x360 kernel/locking/lockdep.c:5871
>        blk_alloc_queue+0x538/0x620 block/blk-core.c:461
>        blk_mq_alloc_queue block/blk-mq.c:4398 [inline]
>        __blk_mq_alloc_disk+0x162/0x340 block/blk-mq.c:4445
>        nbd_dev_add+0x476/0xb00 drivers/block/nbd.c:1933
>        nbd_init+0x21a/0x2d0 drivers/block/nbd.c:2670

and given this second one, also recorded in the nbd_init path, the report is a false positive.

>  #2: ffff8880597ee178 (&cmd->lock){+.+.}-{4:4}, at: nbd_queue_rq+0xc8/0xf10 drivers/block/nbd.c:1196
> 
> stack backtrace:
> CPU: 0 UID: 0 PID: 6083 Comm: udevd Not tainted 6.16.0-rc4-next-20250704-syzkaller #0 PREEMPT(full) 
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025
> Call Trace:
>  <TASK>
>  dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120
>  print_circular_bug+0x2ee/0x310 kernel/locking/lockdep.c:2046
>  check_noncircular+0x134/0x160 kernel/locking/lockdep.c:2178
>  check_prev_add kernel/locking/lockdep.c:3168 [inline]
>  check_prevs_add kernel/locking/lockdep.c:3287 [inline]
>  validate_chain+0xb9b/0x2140 kernel/locking/lockdep.c:3911
>  __lock_acquire+0xab9/0xd20 kernel/locking/lockdep.c:5240
>  lock_acquire+0x120/0x360 kernel/locking/lockdep.c:5871
>  __mutex_lock_common kernel/locking/mutex.c:602 [inline]
>  __mutex_lock+0x182/0xe80 kernel/locking/mutex.c:747
>  nbd_handle_cmd drivers/block/nbd.c:1140 [inline]
>  nbd_queue_rq+0x257/0xf10 drivers/block/nbd.c:1204
>  blk_mq_dispatch_rq_list+0x4c0/0x1900 block/blk-mq.c:2118
>  __blk_mq_do_dispatch_sched block/blk-mq-sched.c:168 [inline]
>  blk_mq_do_dispatch_sched block/blk-mq-sched.c:182 [inline]
>  __blk_mq_sched_dispatch_requests+0xda4/0x1570 block/blk-mq-sched.c:307
>  blk_mq_sched_dispatch_requests+0xd7/0x190 block/blk-mq-sched.c:329
>  blk_mq_run_hw_queue+0x348/0x4f0 block/blk-mq.c:2356
>  blk_mq_dispatch_list+0xd0c/0xe00 include/linux/spinlock.h:-1
>  blk_mq_flush_plug_list+0x469/0x550 block/blk-mq.c:2965
>  __blk_flush_plug+0x3d3/0x4b0 block/blk-core.c:1220
>  blk_finish_plug block/blk-core.c:1247 [inline]
>  __submit_bio+0x2d3/0x5a0 block/blk-core.c:649
>  __submit_bio_noacct_mq block/blk-core.c:722 [inline]
>  submit_bio_noacct_nocheck+0x4ab/0xb50 block/blk-core.c:751
>  submit_bh fs/buffer.c:2829 [inline]
>  block_read_full_folio+0x7b7/0x830 fs/buffer.c:2461
>  filemap_read_folio+0x117/0x380 mm/filemap.c:2413
>  do_read_cache_folio+0x350/0x590 mm/filemap.c:3957
>  read_mapping_folio include/linux/pagemap.h:972 [inline]
>  read_part_sector+0xb6/0x2b0 block/partitions/core.c:722
>  adfspart_check_ICS+0xa4/0xa50 block/partitions/acorn.c:360
>  check_partition block/partitions/core.c:141 [inline]
>  blk_add_partitions block/partitions/core.c:589 [inline]
>  bdev_disk_changed+0x75c/0x14b0 block/partitions/core.c:693
>  blkdev_get_whole+0x380/0x510 block/bdev.c:748
>  bdev_open+0x31e/0xd30 block/bdev.c:957
>  blkdev_open+0x3a8/0x510 block/fops.c:676
>  do_dentry_open+0xdf0/0x1970 fs/open.c:964
>  vfs_open+0x3b/0x340 fs/open.c:1094
>  do_open fs/namei.c:3887 [inline]
>  path_openat+0x2ee5/0x3830 fs/namei.c:4046
>  do_filp_open+0x1fa/0x410 fs/namei.c:4073
>  do_sys_openat2+0x121/0x1c0 fs/open.c:1434
>  do_sys_open fs/open.c:1449 [inline]
>  __do_sys_openat fs/open.c:1465 [inline]
>  __se_sys_openat fs/open.c:1460 [inline]
>  __x64_sys_openat+0x138/0x170 fs/open.c:1460
>  do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>  do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
>  entry_SYSCALL_64_after_hwframe+0x77/0x7f
> RIP: 0033:0x7f674caa7407
> Code: 48 89 fa 4c 89 df e8 38 aa 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00 83 e2 39 83 fa 08 75 de e8 23 ff ff ff
> RSP: 002b:00007fff79271070 EFLAGS: 00000202 ORIG_RAX: 0000000000000101
> RAX: ffffffffffffffda RBX: 00007f674d25e880 RCX: 00007f674caa7407
> RDX: 00000000000a0800 RSI: 000055ad17c52fe0 RDI: ffffffffffffff9c
> RBP: 000055ad17c52910 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000202 R12: 000055ad17c53000
> R13: 000055ad17c6a430 R14: 0000000000000000 R15: 000055ad17c53000
>  </TASK>
> block nbd0: Dead connection, failed to find a fallback
> block nbd0: shutting down sockets
> I/O error, dev nbd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
> Buffer I/O error on dev nbd0, logical block 0, async page read
> I/O error, dev nbd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
> Buffer I/O error on dev nbd0, logical block 0, async page read
> I/O error, dev nbd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
> Buffer I/O error on dev nbd0, logical block 0, async page read
> I/O error, dev nbd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
> Buffer I/O error on dev nbd0, logical block 0, async page read
> I/O error, dev nbd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
> Buffer I/O error on dev nbd0, logical block 0, async page read
> I/O error, dev nbd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
> Buffer I/O error on dev nbd0, logical block 0, async page read
> I/O error, dev nbd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
> Buffer I/O error on dev nbd0, logical block 0, async page read
> I/O error, dev nbd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
> Buffer I/O error on dev nbd0, logical block 0, async page read
> ldm_validate_partition_table(): Disk read failed.
> I/O error, dev nbd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
> Buffer I/O error on dev nbd0, logical block 0, async page read
> I/O error, dev nbd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
> Buffer I/O error on dev nbd0, logical block 0, async page read
> Dev nbd0: unable to read RDB block 0
>  nbd0: unable to read partition table
> ldm_validate_partition_table(): Disk read failed.
> Dev nbd0: unable to read RDB block 0
>  nbd0: unable to read partition table
> 
> 
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
> 
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> 
> If the report is already addressed, let syzbot know by replying with:
> #syz fix: exact-commit-title
> 
> If you want to overwrite report's subsystems, reply with:
> #syz set subsystems: new-subsystem
> (See the list of subsystem names on the web dashboard)
> 
> If the report is a duplicate of another one, reply with:
> #syz dup: exact-subject-of-another-report
> 
> If you want to undo deduplication, reply with:
> #syz undup
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [syzbot] [nbd?] possible deadlock in nbd_queue_rq
  2025-07-07  0:59 ` Hillf Danton
@ 2025-07-07 17:39   ` Bart Van Assche
  2025-07-08  0:18     ` Hillf Danton
  0 siblings, 1 reply; 10+ messages in thread
From: Bart Van Assche @ 2025-07-07 17:39 UTC (permalink / raw)
  To: Hillf Danton, syzbot
  Cc: axboe, josef, linux-block, Ming Lei, Tetsuo Handa, linux-kernel,
	nbd, syzkaller-bugs

On 7/6/25 5:59 PM, Hillf Danton wrote:
> and given the second one, the report is false positive.

Whether or not this report is a false positive, the root cause should be
fixed because lockdep disables itself after the first circular locking
complaint. From print_usage_bug() in kernel/locking/lockdep.c:

	if (!debug_locks_off() || debug_locks_silent)
		return;

Bart.


* Re: [syzbot] [nbd?] possible deadlock in nbd_queue_rq
  2025-07-07 17:39   ` Bart Van Assche
@ 2025-07-08  0:18     ` Hillf Danton
  2025-07-08  0:52       ` Tetsuo Handa
  0 siblings, 1 reply; 10+ messages in thread
From: Hillf Danton @ 2025-07-08  0:18 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: axboe, josef, linux-block, syzbot, Ming Lei, Tetsuo Handa,
	linux-kernel, nbd, syzkaller-bugs

On Mon, 7 Jul 2025 10:39:44 -0700 Bart Van Assche wrote:
> On 7/6/25 5:59 PM, Hillf Danton wrote:
> > and given the second one, the report is false positive.
> 
> Whether or not this report is a false positive, the root cause should be
> fixed because lockdep disables itself after the first circular locking
> complaint. From print_usage_bug() in kernel/locking/lockdep.c:
> 
> 	if (!debug_locks_off() || debug_locks_silent)
> 		return;
> 
The root cause could be worked around, for example, by trying not to init
nbd more than once.

--- x/drivers/block/nbd.c
+++ y/drivers/block/nbd.c
@@ -2620,8 +2620,12 @@ static void nbd_dead_link_work(struct wo
 
 static int __init nbd_init(void)
 {
+	static int inited = 0;
 	int i;
 
+	if (inited)
+		return 0;
+	inited++;
 	BUILD_BUG_ON(sizeof(struct nbd_request) != 28);
 
 	if (max_part < 0) {


* Re: [syzbot] [nbd?] possible deadlock in nbd_queue_rq
  2025-07-08  0:18     ` Hillf Danton
@ 2025-07-08  0:52       ` Tetsuo Handa
  2025-07-08  1:24         ` Hillf Danton
  0 siblings, 1 reply; 10+ messages in thread
From: Tetsuo Handa @ 2025-07-08  0:52 UTC (permalink / raw)
  To: Hillf Danton, Bart Van Assche
  Cc: axboe, josef, linux-block, syzbot, Ming Lei, linux-kernel, nbd,
	syzkaller-bugs

On 2025/07/08 9:18, Hillf Danton wrote:
> On Mon, 7 Jul 2025 10:39:44 -0700 Bart Van Assche wrote:
>> On 7/6/25 5:59 PM, Hillf Danton wrote:
>>> and given the second one, the report is false positive.
>>
>> Whether or not this report is a false positive, the root cause should be
>> fixed because lockdep disables itself after the first circular locking
>> complaint. From print_usage_bug() in kernel/locking/lockdep.c:
>>
>> 	if (!debug_locks_off() || debug_locks_silent)
>> 		return;
>>
> The root cause could be worked around, for example, by trying not to init
> nbd more than once.

How did you come to think so?

nbd_init() is already called only once because of module_init(nbd_init).



* Re: [syzbot] [nbd?] possible deadlock in nbd_queue_rq
  2025-07-08  0:52       ` Tetsuo Handa
@ 2025-07-08  1:24         ` Hillf Danton
  2025-07-08  2:19           ` Tetsuo Handa
  2025-07-08 16:23           ` Bart Van Assche
  0 siblings, 2 replies; 10+ messages in thread
From: Hillf Danton @ 2025-07-08  1:24 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Bart Van Assche, axboe, josef, linux-block, syzbot, Ming Lei,
	linux-kernel, nbd, syzkaller-bugs

On Tue, 8 Jul 2025 09:52:18 +0900 Tetsuo Handa wrote:
> On 2025/07/08 9:18, Hillf Danton wrote:
> > On Mon, 7 Jul 2025 10:39:44 -0700 Bart Van Assche wrote:
> >> On 7/6/25 5:59 PM, Hillf Danton wrote:
> >>> and given the second one, the report is false positive.
> >>
> >> Whether or not this report is a false positive, the root cause should be
> >> fixed because lockdep disables itself after the first circular locking
> >> complaint. From print_usage_bug() in kernel/locking/lockdep.c:
> >>
> >> 	if (!debug_locks_off() || debug_locks_silent)
> >> 		return;
> >>
> > The root cause could be worked around, for example, by trying not to init
> > nbd more than once.
> 
> How did you come to think so?
> 
Based on the fact that nbd_init appears twice in the lock chain syzbot reported.

> nbd_init() is already called only once because of module_init(nbd_init).
> 
Ok, Bart is being misleading.


* Re: [syzbot] [nbd?] possible deadlock in nbd_queue_rq
  2025-07-08  1:24         ` Hillf Danton
@ 2025-07-08  2:19           ` Tetsuo Handa
  2025-07-08 16:23           ` Bart Van Assche
  1 sibling, 0 replies; 10+ messages in thread
From: Tetsuo Handa @ 2025-07-08  2:19 UTC (permalink / raw)
  To: Hillf Danton
  Cc: Bart Van Assche, axboe, josef, linux-block, syzbot, Ming Lei,
	linux-kernel, nbd, syzkaller-bugs

On 2025/07/08 10:24, Hillf Danton wrote:
> On Tue, 8 Jul 2025 09:52:18 +0900 Tetsuo Handa wrote:
>> On 2025/07/08 9:18, Hillf Danton wrote:
>>> On Mon, 7 Jul 2025 10:39:44 -0700 Bart Van Assche wrote:
>>>> On 7/6/25 5:59 PM, Hillf Danton wrote:
>>>>> and given the second one, the report is false positive.
>>>>
>>>> Whether or not this report is a false positive, the root cause should be
>>>> fixed because lockdep disables itself after the first circular locking
>>>> complaint. From print_usage_bug() in kernel/locking/lockdep.c:
>>>>
>>>> 	if (!debug_locks_off() || debug_locks_silent)
>>>> 		return;
>>>>
>>> The root cause could be worked around, for example, by trying not to init
>>> nbd more than once.
>>
>> How did you come to think so?
>>
> Based on the fact that nbd_init appears twice in the lock chain syzbot reported.
> 

You might be misunderstanding what the lock chain is reporting.

The stack backtrace of a lock is recorded only when that lock is taken
for the first time. That is, two stack backtraces from two locks might
share one or more functions. Also, the stack backtrace of a lock that
is printed when lockdep fires might not be the backtrace of that lock
when an actual deadlock happens.

You need to understand all possible locking patterns (because lockdep
can associate only one backtrace with one lock) before you conclude
that the report is a false positive.

>> nbd_init() is already called only once because of module_init(nbd_init).
>>
> Ok, Bart is being misleading.



* Re: [syzbot] [nbd?] possible deadlock in nbd_queue_rq
  2025-07-08  1:24         ` Hillf Danton
  2025-07-08  2:19           ` Tetsuo Handa
@ 2025-07-08 16:23           ` Bart Van Assche
  1 sibling, 0 replies; 10+ messages in thread
From: Bart Van Assche @ 2025-07-08 16:23 UTC (permalink / raw)
  To: Hillf Danton, Tetsuo Handa
  Cc: axboe, josef, linux-block, syzbot, Ming Lei, linux-kernel, nbd,
	syzkaller-bugs

On 7/7/25 6:24 PM, Hillf Danton wrote:
>> nbd_init() is already called only once because of module_init(nbd_init).
>>
> Ok, Bart is being misleading.

No, I'm not. I didn't write anything about nbd_init().


* Re: [syzbot] [nbd?] possible deadlock in nbd_queue_rq
  2025-07-06 16:33 [syzbot] [nbd?] possible deadlock in nbd_queue_rq syzbot
  2025-07-07  0:59 ` Hillf Danton
@ 2025-09-15  2:16 ` syzbot
  2025-09-16 18:18 ` syzbot
  2 siblings, 0 replies; 10+ messages in thread
From: syzbot @ 2025-09-15  2:16 UTC (permalink / raw)
  To: axboe, bvanassche, hdanton, josef, linux-block, linux-kernel,
	ming.lei, nbd, penguin-kernel, syzkaller-bugs

syzbot has found a reproducer for the following issue on:

HEAD commit:    8736259279a3 Merge branch 'for-next/core' into for-kernelci
git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci
console output: https://syzkaller.appspot.com/x/log.txt?x=11e2bb62580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=9ed0a6d7c80843e9
dashboard link: https://syzkaller.appspot.com/bug?extid=3dbc6142c85cc77eaf04
compiler:       Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
userspace arch: arm64
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=178df642580000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=148e7934580000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/025c082a7762/disk-87362592.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/606f903fe4d2/vmlinux-87362592.xz
kernel image: https://storage.googleapis.com/syzbot-assets/23ea2634f398/Image-87362592.gz.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+3dbc6142c85cc77eaf04@syzkaller.appspotmail.com

======================================================
WARNING: possible circular locking dependency detected
syzkaller #0 Not tainted
------------------------------------------------------
udevd/6531 is trying to acquire lock:
ffff0001ffb32070 (&nsock->tx_lock){+.+.}-{4:4}, at: nbd_handle_cmd drivers/block/nbd.c:1140 [inline]
ffff0001ffb32070 (&nsock->tx_lock){+.+.}-{4:4}, at: nbd_queue_rq+0x20c/0xc48 drivers/block/nbd.c:1204

but task is already holding lock:
ffff0000e184e178 (&cmd->lock){+.+.}-{4:4}, at: nbd_queue_rq+0xb4/0xc48 drivers/block/nbd.c:1196

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #6 (&cmd->lock){+.+.}-{4:4}:
       __mutex_lock_common+0x1d0/0x2678 kernel/locking/mutex.c:598
       __mutex_lock kernel/locking/mutex.c:760 [inline]
       mutex_lock_nested+0x2c/0x38 kernel/locking/mutex.c:812
       nbd_queue_rq+0xb4/0xc48 drivers/block/nbd.c:1196
       blk_mq_dispatch_rq_list+0x890/0x1548 block/blk-mq.c:2120
       __blk_mq_do_dispatch_sched block/blk-mq-sched.c:168 [inline]
       blk_mq_do_dispatch_sched block/blk-mq-sched.c:182 [inline]
       __blk_mq_sched_dispatch_requests+0xa7c/0x10e4 block/blk-mq-sched.c:307
       blk_mq_sched_dispatch_requests+0xa4/0x154 block/blk-mq-sched.c:329
       blk_mq_run_hw_queue+0x2d0/0x4a4 block/blk-mq.c:2358
       blk_mq_dispatch_list+0xa00/0xaf8 block/blk-mq.c:-1
       blk_mq_flush_plug_list+0x3a4/0x488 block/blk-mq.c:2967
       __blk_flush_plug+0x330/0x408 block/blk-core.c:1220
       blk_finish_plug block/blk-core.c:1247 [inline]
       __submit_bio+0x3f4/0x4d8 block/blk-core.c:649
       __submit_bio_noacct_mq block/blk-core.c:722 [inline]
       submit_bio_noacct_nocheck+0x390/0xaac block/blk-core.c:751
       submit_bio_noacct+0xc94/0x177c block/blk-core.c:874
       submit_bio+0x3b4/0x550 block/blk-core.c:916
       submit_bh_wbc+0x3ec/0x4bc fs/buffer.c:2824
       submit_bh fs/buffer.c:2829 [inline]
       block_read_full_folio+0x734/0x824 fs/buffer.c:2461
       blkdev_read_folio+0x28/0x38 block/fops.c:491
       filemap_read_folio+0xec/0x2f8 mm/filemap.c:2413
       do_read_cache_folio+0x364/0x5bc mm/filemap.c:3957
       read_cache_folio+0x68/0x88 mm/filemap.c:3989
       read_mapping_folio include/linux/pagemap.h:991 [inline]
       read_part_sector+0xcc/0x6fc block/partitions/core.c:722
       adfspart_check_ICS+0xa0/0x83c block/partitions/acorn.c:360
       check_partition block/partitions/core.c:141 [inline]
       blk_add_partitions block/partitions/core.c:589 [inline]
       bdev_disk_changed+0x674/0x11fc block/partitions/core.c:693
       blkdev_get_whole+0x2b0/0x4a4 block/bdev.c:748
       bdev_open+0x3b0/0xc20 block/bdev.c:957
       blkdev_open+0x300/0x440 block/fops.c:691
       do_dentry_open+0x7a4/0x10bc fs/open.c:965
       vfs_open+0x44/0x2d4 fs/open.c:1095
       do_open fs/namei.c:3887 [inline]
       path_openat+0x2424/0x2c40 fs/namei.c:4046
       do_filp_open+0x18c/0x36c fs/namei.c:4073
       do_sys_openat2+0x11c/0x1b4 fs/open.c:1435
       do_sys_open fs/open.c:1450 [inline]
       __do_sys_openat fs/open.c:1466 [inline]
       __se_sys_openat fs/open.c:1461 [inline]
       __arm64_sys_openat+0x120/0x158 fs/open.c:1461
       __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
       invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
       el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
       do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
       el0_svc+0x5c/0x254 arch/arm64/kernel/entry-common.c:744
       el0t_64_sync_handler+0x84/0x12c arch/arm64/kernel/entry-common.c:763
       el0t_64_sync+0x198/0x19c arch/arm64/kernel/entry.S:596

-> #5 (set->srcu){.+.+}-{0:0}:
       srcu_lock_sync+0x2c/0x38 include/linux/srcu.h:173
       __synchronize_srcu+0xa0/0x348 kernel/rcu/srcutree.c:1429
       synchronize_srcu+0x2cc/0x338 kernel/rcu/srcutree.c:-1
       blk_mq_wait_quiesce_done block/blk-mq.c:283 [inline]
       blk_mq_quiesce_queue+0x118/0x16c block/blk-mq.c:303
       elevator_switch+0x12c/0x410 block/elevator.c:588
       elevator_change+0x264/0x3cc block/elevator.c:690
       elevator_set_default+0x138/0x21c block/elevator.c:766
       blk_register_queue+0x2b4/0x338 block/blk-sysfs.c:904
       __add_disk+0x560/0xb90 block/genhd.c:528
       add_disk_fwnode+0xdc/0x438 block/genhd.c:597
       device_add_disk+0x38/0x4c block/genhd.c:627
       add_disk include/linux/blkdev.h:774 [inline]
       nbd_dev_add+0x560/0x820 drivers/block/nbd.c:1973
       nbd_init+0x15c/0x174 drivers/block/nbd.c:2680
       do_one_initcall+0x250/0x990 init/main.c:1269
       do_initcall_level+0x128/0x1c4 init/main.c:1331
       do_initcalls+0x70/0xd0 init/main.c:1347
       do_basic_setup+0x78/0x8c init/main.c:1366
       kernel_init_freeable+0x268/0x39c init/main.c:1579
       kernel_init+0x24/0x1dc init/main.c:1469
       ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:844

-> #4 (&q->elevator_lock){+.+.}-{4:4}:
       __mutex_lock_common+0x1d0/0x2678 kernel/locking/mutex.c:598
       __mutex_lock kernel/locking/mutex.c:760 [inline]
       mutex_lock_nested+0x2c/0x38 kernel/locking/mutex.c:812
       elevator_change+0x16c/0x3cc block/elevator.c:688
       elevator_set_none+0x48/0xac block/elevator.c:781
       blk_mq_elv_switch_none block/blk-mq.c:5023 [inline]
       __blk_mq_update_nr_hw_queues block/blk-mq.c:5066 [inline]
       blk_mq_update_nr_hw_queues+0x4c8/0x15f4 block/blk-mq.c:5124
       nbd_start_device+0x158/0xa48 drivers/block/nbd.c:1478
       nbd_genl_connect+0xf88/0x158c drivers/block/nbd.c:2228
       genl_family_rcv_msg_doit+0x1d8/0x2bc net/netlink/genetlink.c:1115
       genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
       genl_rcv_msg+0x450/0x624 net/netlink/genetlink.c:1210
       netlink_rcv_skb+0x220/0x3fc net/netlink/af_netlink.c:2552
       genl_rcv+0x38/0x50 net/netlink/genetlink.c:1219
       netlink_unicast_kernel net/netlink/af_netlink.c:1320 [inline]
       netlink_unicast+0x694/0x8c4 net/netlink/af_netlink.c:1346
       netlink_sendmsg+0x648/0x930 net/netlink/af_netlink.c:1896
       sock_sendmsg_nosec net/socket.c:714 [inline]
       __sock_sendmsg net/socket.c:729 [inline]
       ____sys_sendmsg+0x490/0x7b8 net/socket.c:2614
       ___sys_sendmsg+0x204/0x278 net/socket.c:2668
       __sys_sendmsg net/socket.c:2700 [inline]
       __do_sys_sendmsg net/socket.c:2705 [inline]
       __se_sys_sendmsg net/socket.c:2703 [inline]
       __arm64_sys_sendmsg+0x184/0x238 net/socket.c:2703
       __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
       invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
       el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
       do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
       el0_svc+0x5c/0x254 arch/arm64/kernel/entry-common.c:744
       el0t_64_sync_handler+0x84/0x12c arch/arm64/kernel/entry-common.c:763
       el0t_64_sync+0x198/0x19c arch/arm64/kernel/entry.S:596

-> #3 (&q->q_usage_counter(io)#33){++++}-{0:0}:
       blk_alloc_queue+0x48c/0x54c block/blk-core.c:461
       blk_mq_alloc_queue block/blk-mq.c:4400 [inline]
       __blk_mq_alloc_disk+0x124/0x304 block/blk-mq.c:4447
       nbd_dev_add+0x398/0x820 drivers/block/nbd.c:1943
       nbd_init+0x15c/0x174 drivers/block/nbd.c:2680
       do_one_initcall+0x250/0x990 init/main.c:1269
       do_initcall_level+0x128/0x1c4 init/main.c:1331
       do_initcalls+0x70/0xd0 init/main.c:1347
       do_basic_setup+0x78/0x8c init/main.c:1366
       kernel_init_freeable+0x268/0x39c init/main.c:1579
       kernel_init+0x24/0x1dc init/main.c:1469
       ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:844

-> #2 (fs_reclaim){+.+.}-{0:0}:
       __fs_reclaim_acquire mm/page_alloc.c:4234 [inline]
       fs_reclaim_acquire+0x8c/0x118 mm/page_alloc.c:4248
       might_alloc include/linux/sched/mm.h:318 [inline]
       slab_pre_alloc_hook mm/slub.c:4142 [inline]
       slab_alloc_node mm/slub.c:4220 [inline]
       __kmalloc_cache_noprof+0x58/0x3fc mm/slub.c:4402
       kmalloc_noprof include/linux/slab.h:905 [inline]
       kzalloc_noprof include/linux/slab.h:1039 [inline]
       virtio_transport_do_socket_init+0x60/0x2b8 net/vmw_vsock/virtio_transport_common.c:910
       vsock_assign_transport+0x514/0x65c net/vmw_vsock/af_vsock.c:537
       vsock_connect+0x4a8/0xb94 net/vmw_vsock/af_vsock.c:1583
       __sys_connect_file net/socket.c:2086 [inline]
       __sys_connect+0x2a0/0x3ac net/socket.c:2105
       __do_sys_connect net/socket.c:2111 [inline]
       __se_sys_connect net/socket.c:2108 [inline]
       __arm64_sys_connect+0x7c/0x94 net/socket.c:2108
       __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
       invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
       el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
       do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
       el0_svc+0x5c/0x254 arch/arm64/kernel/entry-common.c:744
       el0t_64_sync_handler+0x84/0x12c arch/arm64/kernel/entry-common.c:763
       el0t_64_sync+0x198/0x19c arch/arm64/kernel/entry.S:596

-> #1 (sk_lock-AF_VSOCK){+.+.}-{0:0}:
       lock_sock_nested+0x58/0x118 net/core/sock.c:3711
       lock_sock include/net/sock.h:1669 [inline]
       vsock_shutdown+0x70/0x280 net/vmw_vsock/af_vsock.c:1103
       kernel_sock_shutdown+0x6c/0x80 net/socket.c:3701
       nbd_mark_nsock_dead+0x2a4/0x534 drivers/block/nbd.c:318
       recv_work+0x1cf8/0x2044 drivers/block/nbd.c:1018
       process_one_work+0x7e8/0x155c kernel/workqueue.c:3236
       process_scheduled_works kernel/workqueue.c:3319 [inline]
       worker_thread+0x958/0xed8 kernel/workqueue.c:3400
       kthread+0x5fc/0x75c kernel/kthread.c:463
       ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:844

-> #0 (&nsock->tx_lock){+.+.}-{4:4}:
       check_prev_add kernel/locking/lockdep.c:3165 [inline]
       check_prevs_add kernel/locking/lockdep.c:3284 [inline]
       validate_chain kernel/locking/lockdep.c:3908 [inline]
       __lock_acquire+0x1774/0x30a4 kernel/locking/lockdep.c:5237
       lock_acquire+0x14c/0x2e0 kernel/locking/lockdep.c:5868
       __mutex_lock_common+0x1d0/0x2678 kernel/locking/mutex.c:598
       __mutex_lock kernel/locking/mutex.c:760 [inline]
       mutex_lock_nested+0x2c/0x38 kernel/locking/mutex.c:812
       nbd_handle_cmd drivers/block/nbd.c:1140 [inline]
       nbd_queue_rq+0x20c/0xc48 drivers/block/nbd.c:1204
       blk_mq_dispatch_rq_list+0x890/0x1548 block/blk-mq.c:2120
       __blk_mq_do_dispatch_sched block/blk-mq-sched.c:168 [inline]
       blk_mq_do_dispatch_sched block/blk-mq-sched.c:182 [inline]
       __blk_mq_sched_dispatch_requests+0xa7c/0x10e4 block/blk-mq-sched.c:307
       blk_mq_sched_dispatch_requests+0xa4/0x154 block/blk-mq-sched.c:329
       blk_mq_run_hw_queue+0x2d0/0x4a4 block/blk-mq.c:2358
       blk_mq_dispatch_list+0xa00/0xaf8 block/blk-mq.c:-1
       blk_mq_flush_plug_list+0x3a4/0x488 block/blk-mq.c:2967
       __blk_flush_plug+0x330/0x408 block/blk-core.c:1220
       blk_finish_plug block/blk-core.c:1247 [inline]
       __submit_bio+0x3f4/0x4d8 block/blk-core.c:649
       __submit_bio_noacct_mq block/blk-core.c:722 [inline]
       submit_bio_noacct_nocheck+0x390/0xaac block/blk-core.c:751
       submit_bio_noacct+0xc94/0x177c block/blk-core.c:874
       submit_bio+0x3b4/0x550 block/blk-core.c:916
       submit_bh_wbc+0x3ec/0x4bc fs/buffer.c:2824
       submit_bh fs/buffer.c:2829 [inline]
       block_read_full_folio+0x734/0x824 fs/buffer.c:2461
       blkdev_read_folio+0x28/0x38 block/fops.c:491
       filemap_read_folio+0xec/0x2f8 mm/filemap.c:2413
       do_read_cache_folio+0x364/0x5bc mm/filemap.c:3957
       read_cache_folio+0x68/0x88 mm/filemap.c:3989
       read_mapping_folio include/linux/pagemap.h:991 [inline]
       read_part_sector+0xcc/0x6fc block/partitions/core.c:722
       adfspart_check_ICS+0xa0/0x83c block/partitions/acorn.c:360
       check_partition block/partitions/core.c:141 [inline]
       blk_add_partitions block/partitions/core.c:589 [inline]
       bdev_disk_changed+0x674/0x11fc block/partitions/core.c:693
       blkdev_get_whole+0x2b0/0x4a4 block/bdev.c:748
       bdev_open+0x3b0/0xc20 block/bdev.c:957
       blkdev_open+0x300/0x440 block/fops.c:691
       do_dentry_open+0x7a4/0x10bc fs/open.c:965
       vfs_open+0x44/0x2d4 fs/open.c:1095
       do_open fs/namei.c:3887 [inline]
       path_openat+0x2424/0x2c40 fs/namei.c:4046
       do_filp_open+0x18c/0x36c fs/namei.c:4073
       do_sys_openat2+0x11c/0x1b4 fs/open.c:1435
       do_sys_open fs/open.c:1450 [inline]
       __do_sys_openat fs/open.c:1466 [inline]
       __se_sys_openat fs/open.c:1461 [inline]
       __arm64_sys_openat+0x120/0x158 fs/open.c:1461
       __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
       invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
       el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
       do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
       el0_svc+0x5c/0x254 arch/arm64/kernel/entry-common.c:744
       el0t_64_sync_handler+0x84/0x12c arch/arm64/kernel/entry-common.c:763
       el0t_64_sync+0x198/0x19c arch/arm64/kernel/entry.S:596

other info that might help us debug this:

Chain exists of:
  &nsock->tx_lock --> set->srcu --> &cmd->lock

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&cmd->lock);
                               lock(set->srcu);
                               lock(&cmd->lock);
  lock(&nsock->tx_lock);

 *** DEADLOCK ***

3 locks held by udevd/6531:
 #0: ffff0000cb97e358 (&disk->open_mutex){+.+.}-{4:4}, at: bdev_open+0xcc/0xc20 block/bdev.c:945
 #1: ffff0000cad37e90 (set->srcu){.+.+}-{0:0}, at: srcu_lock_acquire+0x18/0x54 include/linux/srcu.h:160
 #2: ffff0000e184e178 (&cmd->lock){+.+.}-{4:4}, at: nbd_queue_rq+0xb4/0xc48 drivers/block/nbd.c:1196

stack backtrace:
CPU: 1 UID: 0 PID: 6531 Comm: udevd Not tainted syzkaller #0 PREEMPT 
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/30/2025
Call trace:
 show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:499 (C)
 __dump_stack+0x30/0x40 lib/dump_stack.c:94
 dump_stack_lvl+0xd8/0x12c lib/dump_stack.c:120
 dump_stack+0x1c/0x28 lib/dump_stack.c:129
 print_circular_bug+0x324/0x32c kernel/locking/lockdep.c:2043
 check_noncircular+0x154/0x174 kernel/locking/lockdep.c:2175
 check_prev_add kernel/locking/lockdep.c:3165 [inline]
 check_prevs_add kernel/locking/lockdep.c:3284 [inline]
 validate_chain kernel/locking/lockdep.c:3908 [inline]
 __lock_acquire+0x1774/0x30a4 kernel/locking/lockdep.c:5237
 lock_acquire+0x14c/0x2e0 kernel/locking/lockdep.c:5868
 __mutex_lock_common+0x1d0/0x2678 kernel/locking/mutex.c:598
 __mutex_lock kernel/locking/mutex.c:760 [inline]
 mutex_lock_nested+0x2c/0x38 kernel/locking/mutex.c:812
 nbd_handle_cmd drivers/block/nbd.c:1140 [inline]
 nbd_queue_rq+0x20c/0xc48 drivers/block/nbd.c:1204
 blk_mq_dispatch_rq_list+0x890/0x1548 block/blk-mq.c:2120
 __blk_mq_do_dispatch_sched block/blk-mq-sched.c:168 [inline]
 blk_mq_do_dispatch_sched block/blk-mq-sched.c:182 [inline]
 __blk_mq_sched_dispatch_requests+0xa7c/0x10e4 block/blk-mq-sched.c:307
 blk_mq_sched_dispatch_requests+0xa4/0x154 block/blk-mq-sched.c:329
 blk_mq_run_hw_queue+0x2d0/0x4a4 block/blk-mq.c:2358
 blk_mq_dispatch_list+0xa00/0xaf8 block/blk-mq.c:-1
 blk_mq_flush_plug_list+0x3a4/0x488 block/blk-mq.c:2967
 __blk_flush_plug+0x330/0x408 block/blk-core.c:1220
 blk_finish_plug block/blk-core.c:1247 [inline]
 __submit_bio+0x3f4/0x4d8 block/blk-core.c:649
 __submit_bio_noacct_mq block/blk-core.c:722 [inline]
 submit_bio_noacct_nocheck+0x390/0xaac block/blk-core.c:751
 submit_bio_noacct+0xc94/0x177c block/blk-core.c:874
 submit_bio+0x3b4/0x550 block/blk-core.c:916
 submit_bh_wbc+0x3ec/0x4bc fs/buffer.c:2824
 submit_bh fs/buffer.c:2829 [inline]
 block_read_full_folio+0x734/0x824 fs/buffer.c:2461
 blkdev_read_folio+0x28/0x38 block/fops.c:491
 filemap_read_folio+0xec/0x2f8 mm/filemap.c:2413
 do_read_cache_folio+0x364/0x5bc mm/filemap.c:3957
 read_cache_folio+0x68/0x88 mm/filemap.c:3989
 read_mapping_folio include/linux/pagemap.h:991 [inline]
 read_part_sector+0xcc/0x6fc block/partitions/core.c:722
 adfspart_check_ICS+0xa0/0x83c block/partitions/acorn.c:360
 check_partition block/partitions/core.c:141 [inline]
 blk_add_partitions block/partitions/core.c:589 [inline]
 bdev_disk_changed+0x674/0x11fc block/partitions/core.c:693
 blkdev_get_whole+0x2b0/0x4a4 block/bdev.c:748
 bdev_open+0x3b0/0xc20 block/bdev.c:957
 blkdev_open+0x300/0x440 block/fops.c:691
 do_dentry_open+0x7a4/0x10bc fs/open.c:965
 vfs_open+0x44/0x2d4 fs/open.c:1095
 do_open fs/namei.c:3887 [inline]
 path_openat+0x2424/0x2c40 fs/namei.c:4046
 do_filp_open+0x18c/0x36c fs/namei.c:4073
 do_sys_openat2+0x11c/0x1b4 fs/open.c:1435
 do_sys_open fs/open.c:1450 [inline]
 __do_sys_openat fs/open.c:1466 [inline]
 __se_sys_openat fs/open.c:1461 [inline]
 __arm64_sys_openat+0x120/0x158 fs/open.c:1461
 __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
 invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
 el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
 do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
 el0_svc+0x5c/0x254 arch/arm64/kernel/entry-common.c:744
 el0t_64_sync_handler+0x84/0x12c arch/arm64/kernel/entry-common.c:763
 el0t_64_sync+0x198/0x19c arch/arm64/kernel/entry.S:596
block nbd0: Dead connection, failed to find a fallback
block nbd0: shutting down sockets
I/O error, dev nbd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
Buffer I/O error on dev nbd0, logical block 0, async page read
I/O error, dev nbd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
Buffer I/O error on dev nbd0, logical block 0, async page read
I/O error, dev nbd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
Buffer I/O error on dev nbd0, logical block 0, async page read
I/O error, dev nbd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
Buffer I/O error on dev nbd0, logical block 0, async page read
I/O error, dev nbd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
Buffer I/O error on dev nbd0, logical block 0, async page read
I/O error, dev nbd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
Buffer I/O error on dev nbd0, logical block 0, async page read
I/O error, dev nbd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
Buffer I/O error on dev nbd0, logical block 0, async page read
I/O error, dev nbd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
Buffer I/O error on dev nbd0, logical block 0, async page read
ldm_validate_partition_table(): Disk read failed.
I/O error, dev nbd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
Buffer I/O error on dev nbd0, logical block 0, async page read
I/O error, dev nbd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
Buffer I/O error on dev nbd0, logical block 0, async page read
Dev nbd0: unable to read RDB block 0
 nbd0: unable to read partition table
ldm_validate_partition_table(): Disk read failed.
Dev nbd0: unable to read RDB block 0
 nbd0: unable to read partition table


---
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.


* Re: [syzbot] [nbd?] possible deadlock in nbd_queue_rq
  2025-07-06 16:33 [syzbot] [nbd?] possible deadlock in nbd_queue_rq syzbot
  2025-07-07  0:59 ` Hillf Danton
  2025-09-15  2:16 ` syzbot
@ 2025-09-16 18:18 ` syzbot
  2 siblings, 0 replies; 10+ messages in thread
From: syzbot @ 2025-09-16 18:18 UTC (permalink / raw)
  To: axboe, bvanassche, hdanton, josef, linux-block, linux-kernel,
	ming.lei, nbd, penguin-kernel, syzkaller-bugs, thomas.hellstrom

syzbot has bisected this issue to:

commit ffa1e7ada456087c2402b37cd6b2863ced29aff0
Author: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Date:   Tue Mar 18 09:55:48 2025 +0000

    block: Make request_queue lockdep splats show up earlier

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=11eef762580000
start commit:   f83ec76bf285 Linux 6.17-rc6
git tree:       upstream
final oops:     https://syzkaller.appspot.com/x/report.txt?x=13eef762580000
console output: https://syzkaller.appspot.com/x/log.txt?x=15eef762580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=8f01d8629880e620
dashboard link: https://syzkaller.appspot.com/bug?extid=3dbc6142c85cc77eaf04
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1009bb62580000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=14833b12580000

Reported-by: syzbot+3dbc6142c85cc77eaf04@syzkaller.appspotmail.com
Fixes: ffa1e7ada456 ("block: Make request_queue lockdep splats show up earlier")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection



Thread overview: 10+ messages
2025-07-06 16:33 [syzbot] [nbd?] possible deadlock in nbd_queue_rq syzbot
2025-07-07  0:59 ` Hillf Danton
2025-07-07 17:39   ` Bart Van Assche
2025-07-08  0:18     ` Hillf Danton
2025-07-08  0:52       ` Tetsuo Handa
2025-07-08  1:24         ` Hillf Danton
2025-07-08  2:19           ` Tetsuo Handa
2025-07-08 16:23           ` Bart Van Assche
2025-09-15  2:16 ` syzbot
2025-09-16 18:18 ` syzbot
