linux-block.vger.kernel.org archive mirror
* [PATCH V2] nbd: fix lockdep deadlock warning
@ 2025-07-09 11:17 Ming Lei
  2025-07-09 12:28 ` Nilay Shroff
  2025-07-09 22:51 ` Jens Axboe
  0 siblings, 2 replies; 3+ messages in thread
From: Ming Lei @ 2025-07-09 11:17 UTC
  To: Jens Axboe, linux-block
  Cc: Ming Lei, syzbot+2bcecf3c38cb3e8fdc8d, Yu Kuai, Nilay Shroff

nbd holds the device lock nbd->config_lock while updating nr_hw_queues,
which causes the following lock dependency:

-> #2 (&disk->open_mutex){+.+.}-{4:4}:
       lock_acquire kernel/locking/lockdep.c:5871 [inline]
       lock_acquire+0x1ac/0x448 kernel/locking/lockdep.c:5828
       __mutex_lock_common kernel/locking/mutex.c:602 [inline]
       __mutex_lock+0x166/0x1292 kernel/locking/mutex.c:747
       mutex_lock_nested+0x14/0x1c kernel/locking/mutex.c:799
       __del_gendisk+0x132/0xac6 block/genhd.c:706
       del_gendisk+0xf6/0x19a block/genhd.c:819
       nbd_dev_remove+0x3c/0xf2 drivers/block/nbd.c:268
       nbd_dev_remove_work+0x1c/0x26 drivers/block/nbd.c:284
       process_one_work+0x96a/0x1f32 kernel/workqueue.c:3238
       process_scheduled_works kernel/workqueue.c:3321 [inline]
       worker_thread+0x5ce/0xde8 kernel/workqueue.c:3402
       kthread+0x39c/0x7d4 kernel/kthread.c:464
       ret_from_fork_kernel+0x2a/0xbb2 arch/riscv/kernel/process.c:214
       ret_from_fork_kernel_asm+0x16/0x18 arch/riscv/kernel/entry.S:327

-> #1 (&set->update_nr_hwq_lock){++++}-{4:4}:
       lock_acquire kernel/locking/lockdep.c:5871 [inline]
       lock_acquire+0x1ac/0x448 kernel/locking/lockdep.c:5828
       down_write+0x9c/0x19a kernel/locking/rwsem.c:1577
       blk_mq_update_nr_hw_queues+0x3e/0xb86 block/blk-mq.c:5041
       nbd_start_device+0x140/0xb2c drivers/block/nbd.c:1476
       nbd_genl_connect+0xae0/0x1b24 drivers/block/nbd.c:2201
       genl_family_rcv_msg_doit+0x206/0x2e6 net/netlink/genetlink.c:1115
       genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
       genl_rcv_msg+0x514/0x78e net/netlink/genetlink.c:1210
       netlink_rcv_skb+0x206/0x3be net/netlink/af_netlink.c:2534
       genl_rcv+0x36/0x4c net/netlink/genetlink.c:1219
       netlink_unicast_kernel net/netlink/af_netlink.c:1313 [inline]
       netlink_unicast+0x4f0/0x82c net/netlink/af_netlink.c:1339
       netlink_sendmsg+0x85e/0xdd6 net/netlink/af_netlink.c:1883
       sock_sendmsg_nosec net/socket.c:712 [inline]
       __sock_sendmsg+0xcc/0x160 net/socket.c:727
       ____sys_sendmsg+0x63e/0x79c net/socket.c:2566
       ___sys_sendmsg+0x144/0x1e6 net/socket.c:2620
       __sys_sendmsg+0x188/0x246 net/socket.c:2652
       __do_sys_sendmsg net/socket.c:2657 [inline]
       __se_sys_sendmsg net/socket.c:2655 [inline]
       __riscv_sys_sendmsg+0x70/0xa2 net/socket.c:2655
       syscall_handler+0x94/0x118 arch/riscv/include/asm/syscall.h:112
       do_trap_ecall_u+0x396/0x530 arch/riscv/kernel/traps.c:341
       handle_exception+0x146/0x152 arch/riscv/kernel/entry.S:197

-> #0 (&nbd->config_lock){+.+.}-{4:4}:
       check_noncircular+0x132/0x146 kernel/locking/lockdep.c:2178
       check_prev_add kernel/locking/lockdep.c:3168 [inline]
       check_prevs_add kernel/locking/lockdep.c:3287 [inline]
       validate_chain kernel/locking/lockdep.c:3911 [inline]
       __lock_acquire+0x12b2/0x24ea kernel/locking/lockdep.c:5240
       lock_acquire kernel/locking/lockdep.c:5871 [inline]
       lock_acquire+0x1ac/0x448 kernel/locking/lockdep.c:5828
       __mutex_lock_common kernel/locking/mutex.c:602 [inline]
       __mutex_lock+0x166/0x1292 kernel/locking/mutex.c:747
       mutex_lock_nested+0x14/0x1c kernel/locking/mutex.c:799
       refcount_dec_and_mutex_lock+0x60/0xd8 lib/refcount.c:118
       nbd_config_put+0x3a/0x610 drivers/block/nbd.c:1423
       nbd_release+0x94/0x15c drivers/block/nbd.c:1735
       blkdev_put_whole+0xac/0xee block/bdev.c:721
       bdev_release+0x3fe/0x600 block/bdev.c:1144
       blkdev_release+0x1a/0x26 block/fops.c:684
       __fput+0x382/0xa8c fs/file_table.c:465
       ____fput+0x1c/0x26 fs/file_table.c:493
       task_work_run+0x16a/0x25e kernel/task_work.c:227
       resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
       exit_to_user_mode_loop+0x118/0x134 kernel/entry/common.c:114
       exit_to_user_mode_prepare include/linux/entry-common.h:330 [inline]
       syscall_exit_to_user_mode_work include/linux/entry-common.h:414 [inline]
       syscall_exit_to_user_mode include/linux/entry-common.h:449 [inline]
       do_trap_ecall_u+0x3f0/0x530 arch/riscv/kernel/traps.c:355
       handle_exception+0x146/0x152 arch/riscv/kernel/entry.S:197

Also, it isn't necessary to hold nbd->config_lock here, because
blk_mq_update_nr_hw_queues() grabs the tagset lock to synchronize
everything.

Fix the issue by releasing ->config_lock around the call and retrying
if num_connections was updated concurrently.

Fixes: 98e68f67020c ("block: prevent adding/deleting disk during updating nr_hw_queues")
Reported-by: syzbot+2bcecf3c38cb3e8fdc8d@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/6855034f.a00a0220.137b3.0031.GAE@google.com
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Cc: Nilay Shroff <nilay@linux.ibm.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
V2:
	- update 'num_connections' in case a retry is needed (Nilay)

 drivers/block/nbd.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 7bdc7eb808ea..709bad75c47b 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -1473,7 +1473,17 @@ static int nbd_start_device(struct nbd_device *nbd)
 		return -EINVAL;
 	}
 
-	blk_mq_update_nr_hw_queues(&nbd->tag_set, config->num_connections);
+retry:
+	mutex_unlock(&nbd->config_lock);
+	blk_mq_update_nr_hw_queues(&nbd->tag_set, num_connections);
+	mutex_lock(&nbd->config_lock);
+
+	/* if another code path updated nr_hw_queues, retry until succeed */
+	if (num_connections != config->num_connections) {
+		num_connections = config->num_connections;
+		goto retry;
+	}
+
 	nbd->pid = task_pid_nr(current);
 
 	nbd_parse_flags(nbd);
-- 
2.47.0
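
The unlock/update/relock/retry pattern in the hunk above can be tried out
in user space. What follows is only a minimal sketch with pthread mutexes,
not the kernel code: struct demo_dev, demo_update_nr_hw_queues(),
demo_start_device(), demo_change_to_four() and demo_run() are all
hypothetical names, and the on_unlocked hook exists purely to simulate a
concurrent writer in a deterministic way. It illustrates that the inner
lock is never taken while config_lock is held, and that a change made
while config_lock was dropped triggers another pass.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the nbd device; not the real kernel structure. */
struct demo_dev {
	pthread_mutex_t config_lock;  /* plays the role of nbd->config_lock */
	pthread_mutex_t update_lock;  /* plays the role of set->update_nr_hwq_lock */
	int num_connections;          /* protected by config_lock */
	int nr_hw_queues;             /* protected by update_lock */
};

/* Stand-in for blk_mq_update_nr_hw_queues(): takes its own lock internally. */
static void demo_update_nr_hw_queues(struct demo_dev *dev, int nr)
{
	assert(nr > 0);
	pthread_mutex_lock(&dev->update_lock);
	dev->nr_hw_queues = nr;
	pthread_mutex_unlock(&dev->update_lock);
}

/*
 * Called with config_lock held, returns with config_lock held.
 * on_unlocked (may be NULL) runs while config_lock is dropped and only
 * simulates a concurrent writer. Returns the number of passes taken.
 */
static int demo_start_device(struct demo_dev *dev,
			     void (*on_unlocked)(struct demo_dev *))
{
	int num_connections = dev->num_connections;
	int passes = 0;

	for (;;) {
		/* Drop config_lock so update_lock is never nested inside it. */
		pthread_mutex_unlock(&dev->config_lock);
		demo_update_nr_hw_queues(dev, num_connections);
		if (on_unlocked)
			on_unlocked(dev);
		pthread_mutex_lock(&dev->config_lock);
		passes++;

		/* Nothing changed while the lock was dropped: done. */
		if (num_connections == dev->num_connections)
			return passes;

		/* Someone changed the value: pick it up and go again. */
		num_connections = dev->num_connections;
	}
}

/* Simulated concurrent writer: sets num_connections to 4 under config_lock. */
static void demo_change_to_four(struct demo_dev *dev)
{
	pthread_mutex_lock(&dev->config_lock);
	dev->num_connections = 4;
	pthread_mutex_unlock(&dev->config_lock);
}

/* Runs one scenario; stores the final queue count and returns the passes. */
static int demo_run(int initial, void (*hook)(struct demo_dev *),
		    int *nr_hw_queues_out)
{
	struct demo_dev dev = { .num_connections = initial };
	int passes;

	pthread_mutex_init(&dev.config_lock, NULL);
	pthread_mutex_init(&dev.update_lock, NULL);

	pthread_mutex_lock(&dev.config_lock);
	passes = demo_start_device(&dev, hook);
	pthread_mutex_unlock(&dev.config_lock);

	*nr_hw_queues_out = dev.nr_hw_queues;
	pthread_mutex_destroy(&dev.update_lock);
	pthread_mutex_destroy(&dev.config_lock);
	return passes;
}
```

Calling demo_run(2, NULL, &q) finishes in a single pass, while
demo_run(2, demo_change_to_four, &q) needs a second pass because the
simulated writer changed num_connections while config_lock was released.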



* Re: [PATCH V2] nbd: fix lockdep deadlock warning
  2025-07-09 11:17 [PATCH V2] nbd: fix lockdep deadlock warning Ming Lei
@ 2025-07-09 12:28 ` Nilay Shroff
  2025-07-09 22:51 ` Jens Axboe
  1 sibling, 0 replies; 3+ messages in thread
From: Nilay Shroff @ 2025-07-09 12:28 UTC
  To: Ming Lei, Jens Axboe, linux-block; +Cc: syzbot+2bcecf3c38cb3e8fdc8d, Yu Kuai



On 7/9/25 4:47 PM, Ming Lei wrote:
> nbd holds the device lock nbd->config_lock while updating nr_hw_queues,
> which causes the following lock dependency:
> 
> -> #2 (&disk->open_mutex){+.+.}-{4:4}:
>        lock_acquire kernel/locking/lockdep.c:5871 [inline]
>        lock_acquire+0x1ac/0x448 kernel/locking/lockdep.c:5828
>        __mutex_lock_common kernel/locking/mutex.c:602 [inline]
>        __mutex_lock+0x166/0x1292 kernel/locking/mutex.c:747
>        mutex_lock_nested+0x14/0x1c kernel/locking/mutex.c:799
>        __del_gendisk+0x132/0xac6 block/genhd.c:706
>        del_gendisk+0xf6/0x19a block/genhd.c:819
>        nbd_dev_remove+0x3c/0xf2 drivers/block/nbd.c:268
>        nbd_dev_remove_work+0x1c/0x26 drivers/block/nbd.c:284
>        process_one_work+0x96a/0x1f32 kernel/workqueue.c:3238
>        process_scheduled_works kernel/workqueue.c:3321 [inline]
>        worker_thread+0x5ce/0xde8 kernel/workqueue.c:3402
>        kthread+0x39c/0x7d4 kernel/kthread.c:464
>        ret_from_fork_kernel+0x2a/0xbb2 arch/riscv/kernel/process.c:214
>        ret_from_fork_kernel_asm+0x16/0x18 arch/riscv/kernel/entry.S:327
> 
> -> #1 (&set->update_nr_hwq_lock){++++}-{4:4}:
>        lock_acquire kernel/locking/lockdep.c:5871 [inline]
>        lock_acquire+0x1ac/0x448 kernel/locking/lockdep.c:5828
>        down_write+0x9c/0x19a kernel/locking/rwsem.c:1577
>        blk_mq_update_nr_hw_queues+0x3e/0xb86 block/blk-mq.c:5041
>        nbd_start_device+0x140/0xb2c drivers/block/nbd.c:1476
>        nbd_genl_connect+0xae0/0x1b24 drivers/block/nbd.c:2201
>        genl_family_rcv_msg_doit+0x206/0x2e6 net/netlink/genetlink.c:1115
>        genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
>        genl_rcv_msg+0x514/0x78e net/netlink/genetlink.c:1210
>        netlink_rcv_skb+0x206/0x3be net/netlink/af_netlink.c:2534
>        genl_rcv+0x36/0x4c net/netlink/genetlink.c:1219
>        netlink_unicast_kernel net/netlink/af_netlink.c:1313 [inline]
>        netlink_unicast+0x4f0/0x82c net/netlink/af_netlink.c:1339
>        netlink_sendmsg+0x85e/0xdd6 net/netlink/af_netlink.c:1883
>        sock_sendmsg_nosec net/socket.c:712 [inline]
>        __sock_sendmsg+0xcc/0x160 net/socket.c:727
>        ____sys_sendmsg+0x63e/0x79c net/socket.c:2566
>        ___sys_sendmsg+0x144/0x1e6 net/socket.c:2620
>        __sys_sendmsg+0x188/0x246 net/socket.c:2652
>        __do_sys_sendmsg net/socket.c:2657 [inline]
>        __se_sys_sendmsg net/socket.c:2655 [inline]
>        __riscv_sys_sendmsg+0x70/0xa2 net/socket.c:2655
>        syscall_handler+0x94/0x118 arch/riscv/include/asm/syscall.h:112
>        do_trap_ecall_u+0x396/0x530 arch/riscv/kernel/traps.c:341
>        handle_exception+0x146/0x152 arch/riscv/kernel/entry.S:197
> 
> -> #0 (&nbd->config_lock){+.+.}-{4:4}:
>        check_noncircular+0x132/0x146 kernel/locking/lockdep.c:2178
>        check_prev_add kernel/locking/lockdep.c:3168 [inline]
>        check_prevs_add kernel/locking/lockdep.c:3287 [inline]
>        validate_chain kernel/locking/lockdep.c:3911 [inline]
>        __lock_acquire+0x12b2/0x24ea kernel/locking/lockdep.c:5240
>        lock_acquire kernel/locking/lockdep.c:5871 [inline]
>        lock_acquire+0x1ac/0x448 kernel/locking/lockdep.c:5828
>        __mutex_lock_common kernel/locking/mutex.c:602 [inline]
>        __mutex_lock+0x166/0x1292 kernel/locking/mutex.c:747
>        mutex_lock_nested+0x14/0x1c kernel/locking/mutex.c:799
>        refcount_dec_and_mutex_lock+0x60/0xd8 lib/refcount.c:118
>        nbd_config_put+0x3a/0x610 drivers/block/nbd.c:1423
>        nbd_release+0x94/0x15c drivers/block/nbd.c:1735
>        blkdev_put_whole+0xac/0xee block/bdev.c:721
>        bdev_release+0x3fe/0x600 block/bdev.c:1144
>        blkdev_release+0x1a/0x26 block/fops.c:684
>        __fput+0x382/0xa8c fs/file_table.c:465
>        ____fput+0x1c/0x26 fs/file_table.c:493
>        task_work_run+0x16a/0x25e kernel/task_work.c:227
>        resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
>        exit_to_user_mode_loop+0x118/0x134 kernel/entry/common.c:114
>        exit_to_user_mode_prepare include/linux/entry-common.h:330 [inline]
>        syscall_exit_to_user_mode_work include/linux/entry-common.h:414 [inline]
>        syscall_exit_to_user_mode include/linux/entry-common.h:449 [inline]
>        do_trap_ecall_u+0x3f0/0x530 arch/riscv/kernel/traps.c:355
>        handle_exception+0x146/0x152 arch/riscv/kernel/entry.S:197
> 
> Also, it isn't necessary to hold nbd->config_lock here, because
> blk_mq_update_nr_hw_queues() grabs the tagset lock to synchronize
> everything.
> 
> Fix the issue by releasing ->config_lock around the call and retrying
> if num_connections was updated concurrently.
> 
> Fixes: 98e68f67020c ("block: prevent adding/deleting disk during updating nr_hw_queues")
> Reported-by: syzbot+2bcecf3c38cb3e8fdc8d@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/6855034f.a00a0220.137b3.0031.GAE@google.com
> Reviewed-by: Yu Kuai <yukuai3@huawei.com>
> Cc: Nilay Shroff <nilay@linux.ibm.com>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>

Looks good to me:
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>



* Re: [PATCH V2] nbd: fix lockdep deadlock warning
  2025-07-09 11:17 [PATCH V2] nbd: fix lockdep deadlock warning Ming Lei
  2025-07-09 12:28 ` Nilay Shroff
@ 2025-07-09 22:51 ` Jens Axboe
  1 sibling, 0 replies; 3+ messages in thread
From: Jens Axboe @ 2025-07-09 22:51 UTC
  To: linux-block, Ming Lei; +Cc: syzbot+2bcecf3c38cb3e8fdc8d, Yu Kuai, Nilay Shroff


On Wed, 09 Jul 2025 19:17:44 +0800, Ming Lei wrote:
> nbd holds the device lock nbd->config_lock while updating nr_hw_queues,
> which causes the following lock dependency:
> 
> -> #2 (&disk->open_mutex){+.+.}-{4:4}:
>        lock_acquire kernel/locking/lockdep.c:5871 [inline]
>        lock_acquire+0x1ac/0x448 kernel/locking/lockdep.c:5828
>        __mutex_lock_common kernel/locking/mutex.c:602 [inline]
>        __mutex_lock+0x166/0x1292 kernel/locking/mutex.c:747
>        mutex_lock_nested+0x14/0x1c kernel/locking/mutex.c:799
>        __del_gendisk+0x132/0xac6 block/genhd.c:706
>        del_gendisk+0xf6/0x19a block/genhd.c:819
>        nbd_dev_remove+0x3c/0xf2 drivers/block/nbd.c:268
>        nbd_dev_remove_work+0x1c/0x26 drivers/block/nbd.c:284
>        process_one_work+0x96a/0x1f32 kernel/workqueue.c:3238
>        process_scheduled_works kernel/workqueue.c:3321 [inline]
>        worker_thread+0x5ce/0xde8 kernel/workqueue.c:3402
>        kthread+0x39c/0x7d4 kernel/kthread.c:464
>        ret_from_fork_kernel+0x2a/0xbb2 arch/riscv/kernel/process.c:214
>        ret_from_fork_kernel_asm+0x16/0x18 arch/riscv/kernel/entry.S:327
> 
> [...]

Applied, thanks!

[1/1] nbd: fix lockdep deadlock warning
      commit: 8b428f42f3edfd62422aa7ad87049ab232a2eaa9

Best regards,
-- 
Jens Axboe



