linux-block.vger.kernel.org archive mirror
* [PATCH] nbd: fix false lockdep deadlock warning
@ 2025-06-27  9:23 Yu Kuai
  2025-06-27 11:04 ` Ming Lei
  0 siblings, 1 reply; 12+ messages in thread
From: Yu Kuai @ 2025-06-27  9:23 UTC (permalink / raw)
  To: josef, axboe, ming.lei, hch, nilay, hare
  Cc: linux-block, nbd, linux-kernel, yukuai3, yukuai1, yi.zhang,
	yangerkun, johnny.chenyi

From: Yu Kuai <yukuai3@huawei.com>

Lockdep reports a deadlock because of the following circular lock
dependency:

t1: disk->open_mutex -> nbd->config_lock

 blkdev_release
  bdev_release
   //lock disk->open_mutex
   blkdev_put_whole
    nbd_release
     nbd_config_put
      refcount_dec_and_mutex_lock
      //lock nbd->config_lock

t2: nbd->config_lock -> set->update_nr_hwq_lock

 nbd_genl_connect
  //lock nbd->config_lock
  nbd_start_device
   blk_mq_update_nr_hw_queues
   //lock set->update_nr_hwq_lock

t3: set->update_nr_hwq_lock -> disk->open_mutex

 nbd_dev_remove_work
  nbd_dev_remove
   del_gendisk
    down_read(&set->update_nr_hwq_lock);
    __del_gendisk
    mutex_lock(&disk->open_mutex);

This is a false warning because t1 and t3 are in fact serialized by
nbd->refs: t1 still holds a reference, while t3 is only triggered after
the reference count drops to 0, so the two can never run concurrently.
However, lockdep only sees the broken lock order.

Fix the problem by breaking the dependency in t2: call
blk_mq_update_nr_hw_queues() outside of the nbd internal config_lock.
Since other contexts can now run concurrently with nbd_start_device(),
make sure they still return -EBUSY; the only difference is that they no
longer wait for nbd_start_device() to finish.

Fixes: 98e68f67020c ("block: prevent adding/deleting disk during updating nr_hw_queues")
Reported-by: syzbot+2bcecf3c38cb3e8fdc8d@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/6855034f.a00a0220.137b3.0031.GAE@google.com/
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/block/nbd.c | 28 ++++++++++++++++++++++------
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 7bdc7eb808ea..d43e8e73aeb3 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -1457,10 +1457,13 @@ static void nbd_config_put(struct nbd_device *nbd)
 	}
 }
 
-static int nbd_start_device(struct nbd_device *nbd)
+static int nbd_start_device(struct nbd_device *nbd, bool netlink)
+	__releases(&nbd->config_lock)
+	__acquires(&nbd->config_lock)
 {
 	struct nbd_config *config = nbd->config;
 	int num_connections = config->num_connections;
+	struct task_struct *old;
 	int error = 0, i;
 
 	if (nbd->pid)
@@ -1473,8 +1476,21 @@ static int nbd_start_device(struct nbd_device *nbd)
 		return -EINVAL;
 	}
 
-	blk_mq_update_nr_hw_queues(&nbd->tag_set, config->num_connections);
+	/*
+	 * synchronize with concurrent nbd_start_device() and
+	 * nbd_add_socket()
+	 */
 	nbd->pid = task_pid_nr(current);
+	if (!netlink) {
+		old = nbd->task_setup;
+		nbd->task_setup = current;
+	}
+
+	mutex_unlock(&nbd->config_lock);
+	blk_mq_update_nr_hw_queues(&nbd->tag_set, config->num_connections);
+	mutex_lock(&nbd->config_lock);
+	if (!netlink)
+		nbd->task_setup = old;
 
 	nbd_parse_flags(nbd);
 
@@ -1524,7 +1540,7 @@ static int nbd_start_device_ioctl(struct nbd_device *nbd)
 	struct nbd_config *config = nbd->config;
 	int ret;
 
-	ret = nbd_start_device(nbd);
+	ret = nbd_start_device(nbd, false);
 	if (ret)
 		return ret;
 
@@ -1995,7 +2011,7 @@ static struct nbd_device *nbd_find_get_unused(void)
 	lockdep_assert_held(&nbd_index_mutex);
 
 	idr_for_each_entry(&nbd_index_idr, nbd, id) {
-		if (refcount_read(&nbd->config_refs) ||
+		if (refcount_read(&nbd->config_refs) || nbd->pid ||
 		    test_bit(NBD_DESTROY_ON_DISCONNECT, &nbd->flags))
 			continue;
 		if (refcount_inc_not_zero(&nbd->refs))
@@ -2109,7 +2125,7 @@ static int nbd_genl_connect(struct sk_buff *skb, struct genl_info *info)
 	}
 
 	mutex_lock(&nbd->config_lock);
-	if (refcount_read(&nbd->config_refs)) {
+	if (refcount_read(&nbd->config_refs) || nbd->pid) {
 		mutex_unlock(&nbd->config_lock);
 		nbd_put(nbd);
 		if (index == -1)
@@ -2198,7 +2214,7 @@ static int nbd_genl_connect(struct sk_buff *skb, struct genl_info *info)
 				goto out;
 		}
 	}
-	ret = nbd_start_device(nbd);
+	ret = nbd_start_device(nbd, true);
 	if (ret)
 		goto out;
 	if (info->attrs[NBD_ATTR_BACKEND_IDENTIFIER]) {
-- 
2.39.2



* Re: [PATCH] nbd: fix false lockdep deadlock warning
  2025-06-27  9:23 [PATCH] nbd: fix false lockdep deadlock warning Yu Kuai
@ 2025-06-27 11:04 ` Ming Lei
  2025-06-28  0:48   ` Yu Kuai
  0 siblings, 1 reply; 12+ messages in thread
From: Ming Lei @ 2025-06-27 11:04 UTC (permalink / raw)
  To: Yu Kuai
  Cc: josef, axboe, hch, nilay, hare, linux-block, nbd, linux-kernel,
	yukuai3, yi.zhang, yangerkun, johnny.chenyi

On Fri, Jun 27, 2025 at 05:23:48PM +0800, Yu Kuai wrote:
> From: Yu Kuai <yukuai3@huawei.com>
> 
> The deadlock is reported because there are circular dependency:
> 
> t1: disk->open_mutex -> nbd->config_lock
> 
>  blkdev_release
>   bdev_release
>    //lock disk->open_mutex)
>    blkdev_put_whole
>     nbd_release
>      nbd_config_put
>         refcount_dec_and_mutex_lock
>         //lock nbd->config_lock
> 
> t2: nbd->config_lock -> set->update_nr_hwq_lock
> 
>  nbd_genl_connect
>   //lock nbd->config_lock
>   nbd_start_device
>    blk_mq_update_nr_hw_queues
>    //lock set->update_nr_hwq_lock
> 
> t3: set->update_nr_hwq_lock -> disk->open_mutex
> 
>  nbd_dev_remove_work
>   nbd_dev_remove
>    del_gendisk
>     down_read(&set->update_nr_hwq_lock);
>     __del_gendisk
>     mutex_lock(&disk->open_mutex);
> 
> This is false warning because t1 and t2 should be synchronized by
> nbd->refs, and t1 is still holding the reference while t2 is triggered
> when the reference is decreased to 0. However the lock order is broken.
> 
> Fix the problem by breaking the dependency from t2, by calling
> blk_mq_update_nr_hw_queues() outside of nbd internal config_lock, since
> now other context can concurrent with nbd_start_device(), also make sure
> they will still return -EBUSY, the difference is that they will not wait
> for nbd_start_device() to be done.
> 
> Fixes: 98e68f67020c ("block: prevent adding/deleting disk during updating nr_hw_queues")
> Reported-by: syzbot+2bcecf3c38cb3e8fdc8d@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/6855034f.a00a0220.137b3.0031.GAE@google.com/
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
>  drivers/block/nbd.c | 28 ++++++++++++++++++++++------
>  1 file changed, 22 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index 7bdc7eb808ea..d43e8e73aeb3 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -1457,10 +1457,13 @@ static void nbd_config_put(struct nbd_device *nbd)
>  	}
>  }
>  
> -static int nbd_start_device(struct nbd_device *nbd)
> +static int nbd_start_device(struct nbd_device *nbd, bool netlink)
> +	__releases(&nbd->config_lock)
> +	__acquires(&nbd->config_lock)
>  {
>  	struct nbd_config *config = nbd->config;
>  	int num_connections = config->num_connections;
> +	struct task_struct *old;
>  	int error = 0, i;
>  
>  	if (nbd->pid)
> @@ -1473,8 +1476,21 @@ static int nbd_start_device(struct nbd_device *nbd)
>  		return -EINVAL;
>  	}
>  
> -	blk_mq_update_nr_hw_queues(&nbd->tag_set, config->num_connections);
> +	/*
> +	 * synchronize with concurrent nbd_start_device() and
> +	 * nbd_add_socket()
> +	 */
>  	nbd->pid = task_pid_nr(current);
> +	if (!netlink) {
> +		old = nbd->task_setup;
> +		nbd->task_setup = current;
> +	}
> +
> +	mutex_unlock(&nbd->config_lock);
> +	blk_mq_update_nr_hw_queues(&nbd->tag_set, config->num_connections);
> +	mutex_lock(&nbd->config_lock);
> +	if (!netlink)
> +		nbd->task_setup = old;

I guess the patch in the following link may be simpler; both take a
similar approach:

https://lore.kernel.org/linux-block/aFjbavzLAFO0Q7n1@fedora/


thanks,
Ming



* Re: [PATCH] nbd: fix false lockdep deadlock warning
  2025-06-27 11:04 ` Ming Lei
@ 2025-06-28  0:48   ` Yu Kuai
  2025-07-01 13:28     ` Nilay Shroff
  0 siblings, 1 reply; 12+ messages in thread
From: Yu Kuai @ 2025-06-28  0:48 UTC (permalink / raw)
  To: Ming Lei, Yu Kuai
  Cc: josef, axboe, hch, nilay, hare, linux-block, nbd, linux-kernel,
	yi.zhang, yangerkun, johnny.chenyi, yukuai (C)

Hi,

On 2025/06/27 19:04, Ming Lei wrote:
> I guess the patch in the following link may be simper, both two take
> similar approach:
> 
> https://lore.kernel.org/linux-block/aFjbavzLAFO0Q7n1@fedora/

I think the above approach has concurrency problems if
nbd_start_device() runs concurrently with nbd_add_socket() and another
nbd_start_device():

t1:
nbd_start_device
lock
num_connections = 1
unlock
	t2:
	nbd_add_socket
	lock
	config->num_connections++
	unlock
		t3:
		nbd_start_device
		lock
		num_connections = 2
		unlock
		blk_mq_update_nr_hw_queues

blk_mq_update_nr_hw_queues
//nr_hw_queues updated to 1 before failure
return -EINVAL
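
If I read the linked patch right, the structure is roughly the
following (a rough sketch from my reading, not the exact code; names
are approximate):

	/* in nbd_start_device(), with config = nbd->config */
	mutex_lock(&nbd->config_lock);
	num_connections = config->num_connections;	/* t1 snapshots 1 */
	mutex_unlock(&nbd->config_lock);

	/* window: t2 adds a socket, then t3 snapshots 2 and updates */
	blk_mq_update_nr_hw_queues(&nbd->tag_set, num_connections);

	mutex_lock(&nbd->config_lock);
	if (num_connections != config->num_connections) {
		/* t1 fails here, but nr_hw_queues already went back to 1 */
		mutex_unlock(&nbd->config_lock);
		return -EINVAL;
	}
	mutex_unlock(&nbd->config_lock);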

Thanks,
Kuai



* Re: [PATCH] nbd: fix false lockdep deadlock warning
  2025-06-28  0:48   ` Yu Kuai
@ 2025-07-01 13:28     ` Nilay Shroff
  2025-07-02  1:12       ` Yu Kuai
  0 siblings, 1 reply; 12+ messages in thread
From: Nilay Shroff @ 2025-07-01 13:28 UTC (permalink / raw)
  To: Yu Kuai, Ming Lei
  Cc: josef, axboe, hch, hare, linux-block, nbd, linux-kernel, yi.zhang,
	yangerkun, johnny.chenyi, yukuai (C)



On 6/28/25 6:18 AM, Yu Kuai wrote:
> Hi,
> 
> On 2025/06/27 19:04, Ming Lei wrote:
>> I guess the patch in the following link may be simper, both two take
>> similar approach:
>>
>> https://lore.kernel.org/linux-block/aFjbavzLAFO0Q7n1@fedora/
> 
> I this the above approach has concurrent problems if nbd_start_device
> concurrent with nbd_start_device:
> 
> t1:
> nbd_start_device
> lock
> num_connections = 1
> unlock
>     t2:
>     nbd_add_socket
>     lock
>     config->num_connections++
>     unlock
>         t3:
>         nbd_start_device
>         lock
>         num_connections = 2
>         unlock
>         blk_mq_update_nr_hw_queues
> 
> blk_mq_update_nr_hw_queues
> //nr_hw_queues updated to 1 before failure
> return -EINVAL
> 

In the above case, yes, I see that t1 would return -EINVAL (as
config->num_connections doesn't match num_connections), but then t3
would succeed in updating nr_hw_queues (as both config->num_connections
and num_connections are set to 2 this time). Isn't it? If yes, then the
above patch (from Ming) seems good.

Thanks,
--Nilay


* Re: [PATCH] nbd: fix false lockdep deadlock warning
  2025-07-01 13:28     ` Nilay Shroff
@ 2025-07-02  1:12       ` Yu Kuai
  2025-07-02  2:32         ` Ming Lei
  0 siblings, 1 reply; 12+ messages in thread
From: Yu Kuai @ 2025-07-02  1:12 UTC (permalink / raw)
  To: Nilay Shroff, Yu Kuai, Ming Lei
  Cc: josef, axboe, hch, hare, linux-block, nbd, linux-kernel, yi.zhang,
	yangerkun, johnny.chenyi, yukuai (C)

Hi,

On 2025/07/01 21:28, Nilay Shroff wrote:
> 
> 
> On 6/28/25 6:18 AM, Yu Kuai wrote:
>> Hi,
>>
>> On 2025/06/27 19:04, Ming Lei wrote:
>>> I guess the patch in the following link may be simper, both two take
>>> similar approach:
>>>
>>> https://lore.kernel.org/linux-block/aFjbavzLAFO0Q7n1@fedora/
>>
>> I this the above approach has concurrent problems if nbd_start_device
>> concurrent with nbd_start_device:
>>
>> t1:
>> nbd_start_device
>> lock
>> num_connections = 1
>> unlock
>>      t2:
>>      nbd_add_socket
>>      lock
>>      config->num_connections++
>>      unlock
>>          t3:
>>          nbd_start_device
>>          lock
>>          num_connections = 2
>>          unlock
>>          blk_mq_update_nr_hw_queues
>>
>> blk_mq_update_nr_hw_queues
>> //nr_hw_queues updated to 1 before failure
>> return -EINVAL
>>
> 
> In the above case, yes I see that t1 would return -EINVAL (as
> config->num_connections doesn't match with num_connections)
> but then t3 would succeed to update nr_hw_queue (as both
> config->num_connections and num_connections set to 2 this
> time). Isn't it? If yes, then the above patch (from Ming)
> seems good.

Emm, I'm confused. If you agree with the concurrent sequence above,
then t3 updates nr_hw_queues to 2 first and returns success; later t1
updates nr_hw_queues back to 1 and returns failure.

Thanks,
Kuai

> 
> Thanks,
> --Nilay
> .
> 



* Re: [PATCH] nbd: fix false lockdep deadlock warning
  2025-07-02  1:12       ` Yu Kuai
@ 2025-07-02  2:32         ` Ming Lei
  2025-07-02  6:22           ` Nilay Shroff
  0 siblings, 1 reply; 12+ messages in thread
From: Ming Lei @ 2025-07-02  2:32 UTC (permalink / raw)
  To: Yu Kuai
  Cc: Nilay Shroff, josef, axboe, hch, hare, linux-block, nbd,
	linux-kernel, yi.zhang, yangerkun, johnny.chenyi, yukuai (C)

On Wed, Jul 02, 2025 at 09:12:09AM +0800, Yu Kuai wrote:
> Hi,
> 
> On 2025/07/01 21:28, Nilay Shroff wrote:
> > 
> > 
> > On 6/28/25 6:18 AM, Yu Kuai wrote:
> > > Hi,
> > > 
> > > On 2025/06/27 19:04, Ming Lei wrote:
> > > > I guess the patch in the following link may be simper, both two take
> > > > similar approach:
> > > > 
> > > > https://lore.kernel.org/linux-block/aFjbavzLAFO0Q7n1@fedora/
> > > 
> > > I this the above approach has concurrent problems if nbd_start_device
> > > concurrent with nbd_start_device:
> > > 
> > > t1:
> > > nbd_start_device
> > > lock
> > > num_connections = 1
> > > unlock
> > >      t2:
> > >      nbd_add_socket
> > >      lock
> > >      config->num_connections++
> > >      unlock
> > >          t3:
> > >          nbd_start_device
> > >          lock
> > >          num_connections = 2
> > >          unlock
> > >          blk_mq_update_nr_hw_queues
> > > 
> > > blk_mq_update_nr_hw_queues
> > > //nr_hw_queues updated to 1 before failure
> > > return -EINVAL
> > > 
> > 
> > In the above case, yes I see that t1 would return -EINVAL (as
> > config->num_connections doesn't match with num_connections)
> > but then t3 would succeed to update nr_hw_queue (as both
> > config->num_connections and num_connections set to 2 this
> > time). Isn't it? If yes, then the above patch (from Ming)
> > seems good.
> 
> Emm, I'm confused, If you agree with the concurrent process, then
> t3 update nr_hw_queues to 2 first and return sucess, later t1 update
> nr_hw_queues back to 1 and return failure.

It should be easy to avoid the failure by simply retrying.
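
Something like this, just to show the idea (an untested sketch, on top
of the snapshot/compare structure of that patch):

	int num_connections;

again:
	mutex_lock(&nbd->config_lock);
	num_connections = config->num_connections;
	mutex_unlock(&nbd->config_lock);

	blk_mq_update_nr_hw_queues(&nbd->tag_set, num_connections);

	mutex_lock(&nbd->config_lock);
	if (num_connections != config->num_connections) {
		/* a socket was added meanwhile; redo the update */
		mutex_unlock(&nbd->config_lock);
		goto again;
	}
	mutex_unlock(&nbd->config_lock);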


Thanks,
Ming



* Re: [PATCH] nbd: fix false lockdep deadlock warning
  2025-07-02  2:32         ` Ming Lei
@ 2025-07-02  6:22           ` Nilay Shroff
  2025-07-02  7:30             ` Yu Kuai
  0 siblings, 1 reply; 12+ messages in thread
From: Nilay Shroff @ 2025-07-02  6:22 UTC (permalink / raw)
  To: Ming Lei, Yu Kuai
  Cc: josef, axboe, hch, hare, linux-block, nbd, linux-kernel, yi.zhang,
	yangerkun, johnny.chenyi, yukuai (C)



On 7/2/25 8:02 AM, Ming Lei wrote:
> On Wed, Jul 02, 2025 at 09:12:09AM +0800, Yu Kuai wrote:
>> Hi,
>>
>> On 2025/07/01 21:28, Nilay Shroff wrote:
>>>
>>>
>>> On 6/28/25 6:18 AM, Yu Kuai wrote:
>>>> Hi,
>>>>
>>>> On 2025/06/27 19:04, Ming Lei wrote:
>>>>> I guess the patch in the following link may be simper, both two take
>>>>> similar approach:
>>>>>
>>>>> https://lore.kernel.org/linux-block/aFjbavzLAFO0Q7n1@fedora/
>>>>
>>>> I this the above approach has concurrent problems if nbd_start_device
>>>> concurrent with nbd_start_device:
>>>>
>>>> t1:
>>>> nbd_start_device
>>>> lock
>>>> num_connections = 1
>>>> unlock
>>>>      t2:
>>>>      nbd_add_socket
>>>>      lock
>>>>      config->num_connections++
>>>>      unlock
>>>>          t3:
>>>>          nbd_start_device
>>>>          lock
>>>>          num_connections = 2
>>>>          unlock
>>>>          blk_mq_update_nr_hw_queues
>>>>
>>>> blk_mq_update_nr_hw_queues
>>>> //nr_hw_queues updated to 1 before failure
>>>> return -EINVAL
>>>>
>>>
>>> In the above case, yes I see that t1 would return -EINVAL (as
>>> config->num_connections doesn't match with num_connections)
>>> but then t3 would succeed to update nr_hw_queue (as both
>>> config->num_connections and num_connections set to 2 this
>>> time). Isn't it? If yes, then the above patch (from Ming)
>>> seems good.
>>
>> Emm, I'm confused, If you agree with the concurrent process, then
>> t3 update nr_hw_queues to 2 first and return sucess, later t1 update
>> nr_hw_queues back to 1 and return failure.
> 
> It should be easy to avoid failure by simple retrying.
> 
Yeah, I think retry should be a safe bet here.

On another note, synchronizing nbd_start_device and nbd_add_socket
using nbd->task_setup looks more complex; instead we may use nbd->pid
to synchronize both. We need to move the setting of nbd->pid to before
we invoke blk_mq_update_nr_hw_queues in nbd_start_device. Then in
nbd_add_socket we can check nbd->pid, and if it's non-NULL we can
assume that either an nr_hw_queues update is in progress or the device
has already been set up, and so return -EBUSY. Updating the number of
connections once the device is configured is not possible anyway, so
once nbd_start_device is initiated, we shall prevent the user from
adding more connections. If we follow this approach then IMO we don't
need the retry discussed above.
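
For example, something along these lines (an illustrative sketch only,
not actual driver code; the existing socket setup and error paths are
elided):

/* nbd_add_socket() is called with nbd->config_lock already held */
static int nbd_add_socket(struct nbd_device *nbd, unsigned long arg,
			  bool netlink)
{
	/*
	 * nbd->pid would now be set in nbd_start_device() before
	 * config_lock is dropped around blk_mq_update_nr_hw_queues(),
	 * so a non-NULL pid means setup is in progress or finished.
	 */
	if (nbd->pid)
		return -EBUSY;

	/* ... existing socket setup ... */
	return 0;
}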

Thanks,
--Nilay


* Re: [PATCH] nbd: fix false lockdep deadlock warning
  2025-07-02  6:22           ` Nilay Shroff
@ 2025-07-02  7:30             ` Yu Kuai
  2025-07-05  1:15               ` Yu Kuai
  0 siblings, 1 reply; 12+ messages in thread
From: Yu Kuai @ 2025-07-02  7:30 UTC (permalink / raw)
  To: Nilay Shroff, Ming Lei, Yu Kuai
  Cc: josef, axboe, hch, hare, linux-block, nbd, linux-kernel, yi.zhang,
	yangerkun, johnny.chenyi, yukuai (C)

Hi,

On 2025/07/02 14:22, Nilay Shroff wrote:
> 
> 
> On 7/2/25 8:02 AM, Ming Lei wrote:
>> On Wed, Jul 02, 2025 at 09:12:09AM +0800, Yu Kuai wrote:
>>> Hi,
>>>
>>> On 2025/07/01 21:28, Nilay Shroff wrote:
>>>>
>>>>
>>>> On 6/28/25 6:18 AM, Yu Kuai wrote:
>>>>> Hi,
>>>>>
>>>>> On 2025/06/27 19:04, Ming Lei wrote:
>>>>>> I guess the patch in the following link may be simper, both two take
>>>>>> similar approach:
>>>>>>
>>>>>> https://lore.kernel.org/linux-block/aFjbavzLAFO0Q7n1@fedora/
>>>>>
>>>>> I this the above approach has concurrent problems if nbd_start_device
>>>>> concurrent with nbd_start_device:
>>>>>
>>>>> t1:
>>>>> nbd_start_device
>>>>> lock
>>>>> num_connections = 1
>>>>> unlock
>>>>>       t2:
>>>>>       nbd_add_socket
>>>>>       lock
>>>>>       config->num_connections++
>>>>>       unlock
>>>>>           t3:
>>>>>           nbd_start_device
>>>>>           lock
>>>>>           num_connections = 2
>>>>>           unlock
>>>>>           blk_mq_update_nr_hw_queues
>>>>>
>>>>> blk_mq_update_nr_hw_queues
>>>>> //nr_hw_queues updated to 1 before failure
>>>>> return -EINVAL
>>>>>
>>>>
>>>> In the above case, yes I see that t1 would return -EINVAL (as
>>>> config->num_connections doesn't match with num_connections)
>>>> but then t3 would succeed to update nr_hw_queue (as both
>>>> config->num_connections and num_connections set to 2 this
>>>> time). Isn't it? If yes, then the above patch (from Ming)
>>>> seems good.
>>>
>>> Emm, I'm confused, If you agree with the concurrent process, then
>>> t3 update nr_hw_queues to 2 first and return sucess, later t1 update
>>> nr_hw_queues back to 1 and return failure.
>>
>> It should be easy to avoid failure by simple retrying.
>>
> Yeah I think retry should be a safe bet here.
> 

I'm really not sure about the retry; the above is just one scenario
that I thought of with a quick review, and there are still many
concurrent scenarios that need to be checked. I'm kind of lost here.

Apart from nbd_start_device() and nbd_add_socket(), I'm not confident
that the other contexts synchronized by config_lock are not broken.
However, I'm ok with the bet.

> On another note, synchronizing nbd_start_device and nbd_add_socket
> using nbd->task_setup looks more complex and rather we may use
> nbd->pid to synchronize both. We need to move setting of nbd->pid
> before we invoke blk_mq_update_nr_hw_queues in nbd_start_device.
> Then in nbd_add_socket we can evaluate nbd->pid and if it's
> non-NULL then we could assume that either nr_hw_queues update is in
> progress or device has been setup and so return -EBUSY. I think
> anyways updating number of connections once device is configured
> would not be possible, so once nbd_start_device is initiated, we
> shall prevent user adding more connections. If we follow this
> approach then IMO we don't need to add retry discussed above.

It's ok for me to forbid nbd_add_socket() after nbd is configured;
there is nowhere to use the added socket. And if there really are other
contexts that need to be synchronized, I think nbd->pid can be used as
well.

> 
> Thanks,
> --Nilay
> .
> 



* Re: [PATCH] nbd: fix false lockdep deadlock warning
  2025-07-02  7:30             ` Yu Kuai
@ 2025-07-05  1:15               ` Yu Kuai
  2025-07-08  5:12                 ` Nilay Shroff
  0 siblings, 1 reply; 12+ messages in thread
From: Yu Kuai @ 2025-07-05  1:15 UTC (permalink / raw)
  To: Yu Kuai, Nilay Shroff, Ming Lei
  Cc: josef, axboe, hch, hare, linux-block, nbd, linux-kernel, yi.zhang,
	yangerkun, johnny.chenyi, yukuai (C)

Hi,

On 2025/07/02 15:30, Yu Kuai wrote:
> Hi,
> 
> On 2025/07/02 14:22, Nilay Shroff wrote:
>>
>>
>> On 7/2/25 8:02 AM, Ming Lei wrote:
>>> On Wed, Jul 02, 2025 at 09:12:09AM +0800, Yu Kuai wrote:
>>>> Hi,
>>>>
>>>> On 2025/07/01 21:28, Nilay Shroff wrote:
>>>>>
>>>>>
>>>>> On 6/28/25 6:18 AM, Yu Kuai wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 2025/06/27 19:04, Ming Lei wrote:
>>>>>>> I guess the patch in the following link may be simper, both two take
>>>>>>> similar approach:
>>>>>>>
>>>>>>> https://lore.kernel.org/linux-block/aFjbavzLAFO0Q7n1@fedora/
>>>>>>
>>>>>> I this the above approach has concurrent problems if nbd_start_device
>>>>>> concurrent with nbd_start_device:
>>>>>>
>>>>>> t1:
>>>>>> nbd_start_device
>>>>>> lock
>>>>>> num_connections = 1
>>>>>> unlock
>>>>>>       t2:
>>>>>>       nbd_add_socket
>>>>>>       lock
>>>>>>       config->num_connections++
>>>>>>       unlock
>>>>>>           t3:
>>>>>>           nbd_start_device
>>>>>>           lock
>>>>>>           num_connections = 2
>>>>>>           unlock
>>>>>>           blk_mq_update_nr_hw_queues
>>>>>>
>>>>>> blk_mq_update_nr_hw_queues
>>>>>> //nr_hw_queues updated to 1 before failure
>>>>>> return -EINVAL
>>>>>>
>>>>>
>>>>> In the above case, yes I see that t1 would return -EINVAL (as
>>>>> config->num_connections doesn't match with num_connections)
>>>>> but then t3 would succeed to update nr_hw_queue (as both
>>>>> config->num_connections and num_connections set to 2 this
>>>>> time). Isn't it? If yes, then the above patch (from Ming)
>>>>> seems good.
>>>>
>>>> Emm, I'm confused, If you agree with the concurrent process, then
>>>> t3 update nr_hw_queues to 2 first and return sucess, later t1 update
>>>> nr_hw_queues back to 1 and return failure.
>>>
>>> It should be easy to avoid failure by simple retrying.
>>>
>> Yeah I think retry should be a safe bet here.
>>
> 
> I really not sure about the retry, the above is just a scenario that I
> think of with a quick review, and there are still many concurrent
> scenarios that need to be checked, I'm kind of lost here.
> 
> Except nbd_start_device() and nbd_add_socked(), I'm not confident
> other context that is synchronized with config_lock is not broken.
> However, I'm ok with the bet.
> 
>> On another note, synchronizing nbd_start_device and nbd_add_socket
>> using nbd->task_setup looks more complex and rather we may use
>> nbd->pid to synchronize both. We need to move setting of nbd->pid
>> before we invoke blk_mq_update_nr_hw_queues in nbd_start_device.
>> Then in nbd_add_socket we can evaluate nbd->pid and if it's
>> non-NULL then we could assume that either nr_hw_queues update is in
>> progress or device has been setup and so return -EBUSY. I think
>> anyways updating number of connections once device is configured
>> would not be possible, so once nbd_start_device is initiated, we
>> shall prevent user adding more connections. If we follow this
>> approach then IMO we don't need to add retry discussed above.
> 
> It's ok for me to forbit nbd_add_socked after nbd is configured, there
> is nowhere to use the added sock. And if there really are other contexts
> need to be synchronized, I think nbd->pid can be used as well.
> 

Do we have a conclusion now? Feel free to send the retry version, or
let me know if I should send a new synchronization version.

Thanks,
Kuai

>>
>> Thanks,
>> --Nilay
>> .
>>
> 
> .
> 



* Re: [PATCH] nbd: fix false lockdep deadlock warning
  2025-07-05  1:15               ` Yu Kuai
@ 2025-07-08  5:12                 ` Nilay Shroff
  2025-07-08  7:34                   ` Ming Lei
  0 siblings, 1 reply; 12+ messages in thread
From: Nilay Shroff @ 2025-07-08  5:12 UTC (permalink / raw)
  To: Yu Kuai, Ming Lei
  Cc: josef, axboe, hch, hare, linux-block, nbd, linux-kernel, yi.zhang,
	yangerkun, johnny.chenyi, yukuai (C)



On 7/5/25 6:45 AM, Yu Kuai wrote:
> Hi,
> 
> On 2025/07/02 15:30, Yu Kuai wrote:
>> Hi,
>>
>> On 2025/07/02 14:22, Nilay Shroff wrote:
>>>
>>>
>>> On 7/2/25 8:02 AM, Ming Lei wrote:
>>>> On Wed, Jul 02, 2025 at 09:12:09AM +0800, Yu Kuai wrote:
>>>>> Hi,
>>>>>
>>>>> On 2025/07/01 21:28, Nilay Shroff wrote:
>>>>>>
>>>>>>
>>>>>> On 6/28/25 6:18 AM, Yu Kuai wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> On 2025/06/27 19:04, Ming Lei wrote:
>>>>>>>> I guess the patch in the following link may be simper, both two take
>>>>>>>> similar approach:
>>>>>>>>
>>>>>>>> https://lore.kernel.org/linux-block/aFjbavzLAFO0Q7n1@fedora/
>>>>>>>
>>>>>>> I this the above approach has concurrent problems if nbd_start_device
>>>>>>> concurrent with nbd_start_device:
>>>>>>>
>>>>>>> t1:
>>>>>>> nbd_start_device
>>>>>>> lock
>>>>>>> num_connections = 1
>>>>>>> unlock
>>>>>>>       t2:
>>>>>>>       nbd_add_socket
>>>>>>>       lock
>>>>>>>       config->num_connections++
>>>>>>>       unlock
>>>>>>>           t3:
>>>>>>>           nbd_start_device
>>>>>>>           lock
>>>>>>>           num_connections = 2
>>>>>>>           unlock
>>>>>>>           blk_mq_update_nr_hw_queues
>>>>>>>
>>>>>>> blk_mq_update_nr_hw_queues
>>>>>>> //nr_hw_queues updated to 1 before failure
>>>>>>> return -EINVAL
>>>>>>>
>>>>>>
>>>>>> In the above case, yes I see that t1 would return -EINVAL (as
>>>>>> config->num_connections doesn't match with num_connections)
>>>>>> but then t3 would succeed to update nr_hw_queue (as both
>>>>>> config->num_connections and num_connections set to 2 this
>>>>>> time). Isn't it? If yes, then the above patch (from Ming)
>>>>>> seems good.
>>>>>
>>>>> Emm, I'm confused, If you agree with the concurrent process, then
>>>>> t3 update nr_hw_queues to 2 first and return sucess, later t1 update
>>>>> nr_hw_queues back to 1 and return failure.
>>>>
>>>> It should be easy to avoid failure by simple retrying.
>>>>
>>> Yeah I think retry should be a safe bet here.
>>>
>>
>> I really not sure about the retry, the above is just a scenario that I
>> think of with a quick review, and there are still many concurrent
>> scenarios that need to be checked, I'm kind of lost here.
>>
>> Except nbd_start_device() and nbd_add_socked(), I'm not confident
>> other context that is synchronized with config_lock is not broken.
>> However, I'm ok with the bet.
>>
>>> On another note, synchronizing nbd_start_device and nbd_add_socket
>>> using nbd->task_setup looks more complex and rather we may use
>>> nbd->pid to synchronize both. We need to move setting of nbd->pid
>>> before we invoke blk_mq_update_nr_hw_queues in nbd_start_device.
>>> Then in nbd_add_socket we can evaluate nbd->pid and if it's
>>> non-NULL then we could assume that either nr_hw_queues update is in
>>> progress or device has been setup and so return -EBUSY. I think
>>> anyways updating number of connections once device is configured
>>> would not be possible, so once nbd_start_device is initiated, we
>>> shall prevent user adding more connections. If we follow this
>>> approach then IMO we don't need to add retry discussed above.
>>
>> It's ok for me to forbit nbd_add_socked after nbd is configured, there
>> is nowhere to use the added sock. And if there really are other contexts
>> need to be synchronized, I think nbd->pid can be used as well.
>>
> 
> Do we have a conclusion now? Feel free to send the retry version, or let
> me know if I should send a new synchronize version.
> 
Personally, I prefer synchronizing nbd_start_device and nbd_add_socket
using nbd->pid, but I do agree the retry version would also work.
Having said that, let's wait for Ming's feedback as well.

Thanks,
--Nilay


* Re: [PATCH] nbd: fix false lockdep deadlock warning
  2025-07-08  5:12                 ` Nilay Shroff
@ 2025-07-08  7:34                   ` Ming Lei
  2025-07-08 11:13                     ` Nilay Shroff
  0 siblings, 1 reply; 12+ messages in thread
From: Ming Lei @ 2025-07-08  7:34 UTC (permalink / raw)
  To: Nilay Shroff
  Cc: Yu Kuai, josef, axboe, hch, hare, linux-block, nbd, linux-kernel,
	yi.zhang, yangerkun, johnny.chenyi, yukuai (C)

On Tue, Jul 08, 2025 at 10:42:00AM +0530, Nilay Shroff wrote:
> 
> 
> On 7/5/25 6:45 AM, Yu Kuai wrote:
> > Hi,
> > 
> > On 2025/07/02 15:30, Yu Kuai wrote:
> >> Hi,
> >>
> >> On 2025/07/02 14:22, Nilay Shroff wrote:
> >>>
> >>>
> >>> On 7/2/25 8:02 AM, Ming Lei wrote:
> >>>> On Wed, Jul 02, 2025 at 09:12:09AM +0800, Yu Kuai wrote:
> >>>>> Hi,
> >>>>>
> >>>>> On 2025/07/01 21:28, Nilay Shroff wrote:
> >>>>>>
> >>>>>>
> >>>>>> On 6/28/25 6:18 AM, Yu Kuai wrote:
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> On 2025/06/27 19:04, Ming Lei wrote:
> >>>>>>>> I guess the patch in the following link may be simper, both two take
> >>>>>>>> similar approach:
> >>>>>>>>
> >>>>>>>> https://lore.kernel.org/linux-block/aFjbavzLAFO0Q7n1@fedora/
> >>>>>>>
> >>>>>>> I this the above approach has concurrent problems if nbd_start_device
> >>>>>>> concurrent with nbd_start_device:
> >>>>>>>
> >>>>>>> t1:
> >>>>>>> nbd_start_device
> >>>>>>> lock
> >>>>>>> num_connections = 1
> >>>>>>> unlock
> >>>>>>>       t2:
> >>>>>>>       nbd_add_socket
> >>>>>>>       lock
> >>>>>>>       config->num_connections++
> >>>>>>>       unlock
> >>>>>>>           t3:
> >>>>>>>           nbd_start_device
> >>>>>>>           lock
> >>>>>>>           num_connections = 2
> >>>>>>>           unlock
> >>>>>>>           blk_mq_update_nr_hw_queues
> >>>>>>>
> >>>>>>> blk_mq_update_nr_hw_queues
> >>>>>>> //nr_hw_queues updated to 1 before failure
> >>>>>>> return -EINVAL
> >>>>>>>
> >>>>>>
> >>>>>> In the above case, yes I see that t1 would return -EINVAL (as
> >>>>>> config->num_connections doesn't match with num_connections)
> >>>>>> but then t3 would succeed to update nr_hw_queue (as both
> >>>>>> config->num_connections and num_connections set to 2 this
> >>>>>> time). Isn't it? If yes, then the above patch (from Ming)
> >>>>>> seems good.
> >>>>>
> >>>>> Emm, I'm confused, If you agree with the concurrent process, then
> >>>>> t3 update nr_hw_queues to 2 first and return sucess, later t1 update
> >>>>> nr_hw_queues back to 1 and return failure.
> >>>>
> >>>> It should be easy to avoid failure by simple retrying.
> >>>>
> >>> Yeah I think retry should be a safe bet here.
> >>>
> >>
> >> I really not sure about the retry, the above is just a scenario that I
> >> think of with a quick review, and there are still many concurrent
> >> scenarios that need to be checked, I'm kind of lost here.
> >>
> >> Except nbd_start_device() and nbd_add_socked(), I'm not confident
> >> other context that is synchronized with config_lock is not broken.
> >> However, I'm ok with the bet.
> >>
> >>> On another note, synchronizing nbd_start_device and nbd_add_socket
> >>> using nbd->task_setup looks more complex and rather we may use
> >>> nbd->pid to synchronize both. We need to move setting of nbd->pid
> >>> before we invoke blk_mq_update_nr_hw_queues in nbd_start_device.
> >>> Then in nbd_add_socket we can evaluate nbd->pid and if it's
> >>> non-NULL then we could assume that either nr_hw_queues update is in
> >>> progress or device has been setup and so return -EBUSY. I think
> >>> anyways updating number of connections once device is configured
> >>> would not be possible, so once nbd_start_device is initiated, we
> >>> shall prevent user adding more connections. If we follow this
> >>> approach then IMO we don't need to add retry discussed above.
> >>
> >> It's ok for me to forbit nbd_add_socked after nbd is configured, there
> >> is nowhere to use the added sock. And if there really are other contexts
> >> need to be synchronized, I think nbd->pid can be used as well.
> >>
> > 
> > Do we have a conclusion now? Feel free to send the retry version, or let
> > me know if I should send a new synchronize version.
> > 
> Personally, I prefer synchronizing nbd_start_device and nbd_add_socket
> using nbd->pid but I do agree retry version would also work. Having
> said that, lets wait for Ming's feedback as well.

IMO the simpler fix, especially in concept, is more effective for
addressing this kind of issue, with less chance of introducing a
regression.

If no one objects, I may send the retry version.


Thanks,
Ming



* Re: [PATCH] nbd: fix false lockdep deadlock warning
  2025-07-08  7:34                   ` Ming Lei
@ 2025-07-08 11:13                     ` Nilay Shroff
  0 siblings, 0 replies; 12+ messages in thread
From: Nilay Shroff @ 2025-07-08 11:13 UTC (permalink / raw)
  To: Ming Lei
  Cc: Yu Kuai, josef, axboe, hch, hare, linux-block, nbd, linux-kernel,
	yi.zhang, yangerkun, johnny.chenyi, yukuai (C)



On 7/8/25 1:04 PM, Ming Lei wrote:
> On Tue, Jul 08, 2025 at 10:42:00AM +0530, Nilay Shroff wrote:
>>
>>
>> On 7/5/25 6:45 AM, Yu Kuai wrote:
>>> Hi,
>>>
>>> On 2025/07/02 15:30, Yu Kuai wrote:
>>>> Hi,
>>>>
>>>> On 2025/07/02 14:22, Nilay Shroff wrote:
>>>>>
>>>>>
>>>>> On 7/2/25 8:02 AM, Ming Lei wrote:
>>>>>> On Wed, Jul 02, 2025 at 09:12:09AM +0800, Yu Kuai wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> On 2025/07/01 21:28, Nilay Shroff wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 6/28/25 6:18 AM, Yu Kuai wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> On 2025/06/27 19:04, Ming Lei wrote:
>>>>>>>>>> I guess the patch in the following link may be simper, both two take
>>>>>>>>>> similar approach:
>>>>>>>>>>
>>>>>>>>>> https://lore.kernel.org/linux-block/aFjbavzLAFO0Q7n1@fedora/
>>>>>>>>>
>>>>>>>>> I this the above approach has concurrent problems if nbd_start_device
>>>>>>>>> concurrent with nbd_start_device:
>>>>>>>>>
>>>>>>>>> t1:
>>>>>>>>> nbd_start_device
>>>>>>>>> lock
>>>>>>>>> num_connections = 1
>>>>>>>>> unlock
>>>>>>>>>       t2:
>>>>>>>>>       nbd_add_socket
>>>>>>>>>       lock
>>>>>>>>>       config->num_connections++
>>>>>>>>>       unlock
>>>>>>>>>           t3:
>>>>>>>>>           nbd_start_device
>>>>>>>>>           lock
>>>>>>>>>           num_connections = 2
>>>>>>>>>           unlock
>>>>>>>>>           blk_mq_update_nr_hw_queues
>>>>>>>>>
>>>>>>>>> blk_mq_update_nr_hw_queues
>>>>>>>>> //nr_hw_queues updated to 1 before failure
>>>>>>>>> return -EINVAL
>>>>>>>>>
>>>>>>>>
>>>>>>>> In the above case, yes I see that t1 would return -EINVAL (as
>>>>>>>> config->num_connections doesn't match with num_connections)
>>>>>>>> but then t3 would succeed to update nr_hw_queue (as both
>>>>>>>> config->num_connections and num_connections set to 2 this
>>>>>>>> time). Isn't it? If yes, then the above patch (from Ming)
>>>>>>>> seems good.
>>>>>>>
>>>>>>> Emm, I'm confused, If you agree with the concurrent process, then
>>>>>>> t3 update nr_hw_queues to 2 first and return sucess, later t1 update
>>>>>>> nr_hw_queues back to 1 and return failure.
>>>>>>
>>>>>> It should be easy to avoid failure by simple retrying.
>>>>>>
>>>>> Yeah I think retry should be a safe bet here.
>>>>>
>>>>
>>>> I really not sure about the retry, the above is just a scenario that I
>>>> think of with a quick review, and there are still many concurrent
>>>> scenarios that need to be checked, I'm kind of lost here.
>>>>
>>>> Except nbd_start_device() and nbd_add_socked(), I'm not confident
>>>> other context that is synchronized with config_lock is not broken.
>>>> However, I'm ok with the bet.
>>>>
>>>>> On another note, synchronizing nbd_start_device and nbd_add_socket
>>>>> using nbd->task_setup looks more complex and rather we may use
>>>>> nbd->pid to synchronize both. We need to move setting of nbd->pid
>>>>> before we invoke blk_mq_update_nr_hw_queues in nbd_start_device.
>>>>> Then in nbd_add_socket we can evaluate nbd->pid and if it's
>>>>> non-NULL then we could assume that either nr_hw_queues update is in
>>>>> progress or device has been setup and so return -EBUSY. I think
>>>>> anyways updating number of connections once device is configured
>>>>> would not be possible, so once nbd_start_device is initiated, we
>>>>> shall prevent user adding more connections. If we follow this
>>>>> approach then IMO we don't need to add retry discussed above.
>>>>
>>>> It's ok for me to forbit nbd_add_socked after nbd is configured, there
>>>> is nowhere to use the added sock. And if there really are other contexts
>>>> need to be synchronized, I think nbd->pid can be used as well.
>>>>
>>>
>>> Do we have a conclusion now? Feel free to send the retry version, or let
>>> me know if I should send a new synchronize version.
>>>
>> Personally, I prefer synchronizing nbd_start_device and nbd_add_socket
>> using nbd->pid but I do agree retry version would also work. Having
>> said that, lets wait for Ming's feedback as well.
> 
> IMO simpler fix, especially in concept, is more effective for addressing
> this kind of issue, with less chance to introduce regression.
> 
> If no one objects, I may send the retry version.
> 
Adding a new socket/connection after the nr_hw_queues update has been
initiated, or after the device is set up, is of no use. But I am also
good with the retry version. So please send your patch.

Thanks,
--Nilay


