* [PATCH] nvmet-tcp: avoid circular locking dependency on install_queue()
From: Hannes Reinecke @ 2023-08-10 13:20 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Sagi Grimberg, Keith Busch, linux-nvme, Hannes Reinecke
nvmet_tcp_install_queue() is driven from the ->io_work workqueue
function, but calls flush_workqueue(), which might trigger
->release_work(), which itself calls flush_work() on ->io_work,
completing the circle.
To avoid this, check for queues pending in the disconnecting state
instead, and return 'controller busy' until all disconnects have
completed.
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
drivers/nvme/target/tcp.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index f131b504aa01..97d07488072d 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -1845,8 +1845,18 @@ static u16 nvmet_tcp_install_queue(struct nvmet_sq *sq)
container_of(sq, struct nvmet_tcp_queue, nvme_sq);
if (sq->qid == 0) {
- /* Let inflight controller teardown complete */
- flush_workqueue(nvmet_wq);
+ struct nvmet_tcp_queue *q;
+ int pending = 0;
+
+ mutex_lock(&nvmet_tcp_queue_mutex);
+ list_for_each_entry(q, &nvmet_tcp_queue_list, queue_list) {
+ if (q->state == NVMET_TCP_Q_DISCONNECTING)
+ pending++;
+ }
+ mutex_unlock(&nvmet_tcp_queue_mutex);
+ /* Retry for pending controller teardown */
+ if (pending)
+ return NVME_SC_CONNECT_CTRL_BUSY;
}
queue->nr_cmds = sq->size * 2;
--
2.35.3
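The shape of the fix can be illustrated with a user-space analogue (Python; the names mirror the kernel code, but everything here is a sketch for illustration, not the actual implementation):

```python
import threading
from enum import Enum, auto

class QueueState(Enum):
    CONNECTING = auto()
    LIVE = auto()
    DISCONNECTING = auto()

# Analogues of nvmet_tcp_queue_list / nvmet_tcp_queue_mutex.
queue_list = []
queue_mutex = threading.Lock()

def install_admin_queue():
    """Mirror of the patched nvmet_tcp_install_queue() logic for qid 0:
    rather than flushing the workqueue (which can deadlock, since we are
    running on that very workqueue), count queues still tearing down and
    ask the host to retry the connect if any are found."""
    with queue_mutex:
        pending = sum(1 for state in queue_list
                      if state is QueueState.DISCONNECTING)
    if pending:
        return "NVME_SC_CONNECT_CTRL_BUSY"
    return "NVME_SC_SUCCESS"
```

The key property is that the check only takes the list mutex and never waits on teardown work, so it cannot participate in the flush cycle described in the commit message.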
* Re: [PATCH] nvmet-tcp: avoid circular locking dependency on install_queue()
From: Sagi Grimberg @ 2023-08-17 8:54 UTC (permalink / raw)
To: Hannes Reinecke, Christoph Hellwig; +Cc: Keith Busch, linux-nvme
On 8/10/23 16:20, Hannes Reinecke wrote:
> nvmet_tcp_install_queue() is driven from the ->io_work workqueue
> function, but will call flush_workqueue() which might trigger
> ->release_work() which in itself calls flush_work on ->io_work.
>
> To avoid that check for pending queue in disconnecting status,
> and return 'controller busy' until all disconnects are completed.
>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
> drivers/nvme/target/tcp.c | 14 ++++++++++++--
> 1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
> index f131b504aa01..97d07488072d 100644
> --- a/drivers/nvme/target/tcp.c
> +++ b/drivers/nvme/target/tcp.c
> @@ -1845,8 +1845,18 @@ static u16 nvmet_tcp_install_queue(struct nvmet_sq *sq)
> container_of(sq, struct nvmet_tcp_queue, nvme_sq);
>
> if (sq->qid == 0) {
> - /* Let inflight controller teardown complete */
> - flush_workqueue(nvmet_wq);
> + struct nvmet_tcp_queue *q;
> + int pending = 0;
> +
> + mutex_lock(&nvmet_tcp_queue_mutex);
> + list_for_each_entry(q, &nvmet_tcp_queue_list, queue_list) {
> + if (q->state == NVMET_TCP_Q_DISCONNECTING)
> + pending++;
> + }
> + mutex_unlock(&nvmet_tcp_queue_mutex);
> + /* Retry for pending controller teardown */
> + if (pending)
> + return NVME_SC_CONNECT_CTRL_BUSY;
Overall I'm fine with the approach.
That will most likely be the case during a controller reset or a "reset
storm". Maybe it is better to allow some reasonable number of pending
teardowns, say 10 or so... The whole point is to not let the removals
accumulate while we continue creating new ones...
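That threshold idea could look something like this user-space analogue (Python; the cap of 10 and all names are illustrative only, not kernel code):

```python
MAX_PENDING_TEARDOWNS = 10  # illustrative cap, per the "say 10" suggestion

def admission_check(queue_states):
    """Tolerate a small backlog of disconnecting queues, but return a
    busy status once teardowns pile up, so that new connects cannot
    outrun removals indefinitely."""
    pending = sum(1 for s in queue_states if s == "DISCONNECTING")
    if pending > MAX_PENDING_TEARDOWNS:
        return "NVME_SC_CONNECT_CTRL_BUSY"
    return "NVME_SC_SUCCESS"
```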
Can you please test this with a host that does a loop like:
while true; do nvme reset /dev/nvme0; done
And let it run for some time...
Or even better, create a test that does this for 100 iterations,
or long/short variants of it?
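The requested test could be scripted along these lines (a sketch; it assumes nvme-cli and a connected controller at /dev/nvme0, and the `cmd` parameter exists only so the loop can be dry-run without hardware):

```python
import subprocess

def reset_storm(dev, iterations, cmd="nvme"):
    """Sketch of the requested stress test: issue `iterations` controller
    resets back to back, failing fast if any reset errors out. `cmd`
    defaults to the nvme-cli binary and can be stubbed for a dry run."""
    for i in range(1, iterations + 1):
        result = subprocess.run([cmd, "reset", dev])
        if result.returncode != 0:
            raise RuntimeError(f"reset {i} of {dev} failed")
    return f"completed {iterations} resets of {dev}"

# Real-hardware invocation would be: reset_storm("/dev/nvme0", 100)
```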
The same would be needed by nvmet-rdma btw.
* Re: [PATCH] nvmet-tcp: avoid circular locking dependency on install_queue()
From: Hannes Reinecke @ 2023-08-17 11:11 UTC (permalink / raw)
To: Sagi Grimberg, Christoph Hellwig; +Cc: Keith Busch, linux-nvme
On 8/17/23 10:54, Sagi Grimberg wrote:
>
>
> On 8/10/23 16:20, Hannes Reinecke wrote:
>> nvmet_tcp_install_queue() is driven from the ->io_work workqueue
>> function, but will call flush_workqueue() which might trigger
>> ->release_work() which in itself calls flush_work on ->io_work.
>>
>> To avoid that check for pending queue in disconnecting status,
>> and return 'controller busy' until all disconnects are completed.
>>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> ---
>> drivers/nvme/target/tcp.c | 14 ++++++++++++--
>> 1 file changed, 12 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
>> index f131b504aa01..97d07488072d 100644
>> --- a/drivers/nvme/target/tcp.c
>> +++ b/drivers/nvme/target/tcp.c
>> @@ -1845,8 +1845,18 @@ static u16 nvmet_tcp_install_queue(struct
>> nvmet_sq *sq)
>> container_of(sq, struct nvmet_tcp_queue, nvme_sq);
>> if (sq->qid == 0) {
>> - /* Let inflight controller teardown complete */
>> - flush_workqueue(nvmet_wq);
>> + struct nvmet_tcp_queue *q;
>> + int pending = 0;
>> +
>> + mutex_lock(&nvmet_tcp_queue_mutex);
>> + list_for_each_entry(q, &nvmet_tcp_queue_list, queue_list) {
>> + if (q->state == NVMET_TCP_Q_DISCONNECTING)
>> + pending++;
>> + }
>> + mutex_unlock(&nvmet_tcp_queue_mutex);
>> + /* Retry for pending controller teardown */
>> + if (pending)
>> + return NVME_SC_CONNECT_CTRL_BUSY;
>
> Overall I'm fine with the approach.
>
> That will most likely be the case for a controller reset or a "reset
> storm". Maybe it is better to keep it at some reasonably allowed pending
> to say 10 or something... The whole point is to not let the removals
> accumulate and continue creating new ones...
>
> Can you please test this with a host that does a loop like:
> while true; do nvme reset /dev/nvme0; done
>
> And let it run for some time...
> Or even better create a test that does this for 100 iterations
> or a log/short variants of this?
>
> The same would be needed by nvmet-rdma btw.
Retested with the reset loop; no issues found after 327 iterations.
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman
* Re: [PATCH] nvmet-tcp: avoid circular locking dependency on install_queue()
From: Sagi Grimberg @ 2023-08-17 11:20 UTC (permalink / raw)
To: Hannes Reinecke, Christoph Hellwig; +Cc: Keith Busch, linux-nvme
>> On 8/10/23 16:20, Hannes Reinecke wrote:
>>> nvmet_tcp_install_queue() is driven from the ->io_work workqueue
>>> function, but will call flush_workqueue() which might trigger
>>> ->release_work() which in itself calls flush_work on ->io_work.
>>>
>>> To avoid that check for pending queue in disconnecting status,
>>> and return 'controller busy' until all disconnects are completed.
>>>
>>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>>> ---
>>> drivers/nvme/target/tcp.c | 14 ++++++++++++--
>>> 1 file changed, 12 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
>>> index f131b504aa01..97d07488072d 100644
>>> --- a/drivers/nvme/target/tcp.c
>>> +++ b/drivers/nvme/target/tcp.c
>>> @@ -1845,8 +1845,18 @@ static u16 nvmet_tcp_install_queue(struct
>>> nvmet_sq *sq)
>>> container_of(sq, struct nvmet_tcp_queue, nvme_sq);
>>> if (sq->qid == 0) {
>>> - /* Let inflight controller teardown complete */
>>> - flush_workqueue(nvmet_wq);
>>> + struct nvmet_tcp_queue *q;
>>> + int pending = 0;
>>> +
>>> + mutex_lock(&nvmet_tcp_queue_mutex);
>>> + list_for_each_entry(q, &nvmet_tcp_queue_list, queue_list) {
>>> + if (q->state == NVMET_TCP_Q_DISCONNECTING)
>>> + pending++;
>>> + }
>>> + mutex_unlock(&nvmet_tcp_queue_mutex);
>>> + /* Retry for pending controller teardown */
>>> + if (pending)
>>> + return NVME_SC_CONNECT_CTRL_BUSY;
>>
>> Overall I'm fine with the approach.
>>
>> That will most likely be the case for a controller reset or a "reset
>> storm". Maybe it is better to keep it at some reasonably allowed pending
>> to say 10 or something... The whole point is to not let the removals
>> accumulate and continue creating new ones...
>>
>> Can you please test this with a host that does a loop like:
>> while true; do nvme reset /dev/nvme0; done
>>
>> And let it run for some time...
>> Or even better create a test that does this for 100 iterations
>> or a log/short variants of this?
>>
>> The same would be needed by nvmet-rdma btw.
>
> Retested with the reset loop; no issues found after 327 iterations.
Interesting, did the host see an NVME_SC_CONNECT_CTRL_BUSY status during
the test?
* Re: [PATCH] nvmet-tcp: avoid circular locking dependency on install_queue()
From: Hannes Reinecke @ 2023-08-17 11:36 UTC (permalink / raw)
To: Sagi Grimberg, Christoph Hellwig; +Cc: Keith Busch, linux-nvme
On 8/17/23 13:20, Sagi Grimberg wrote:
>
>>> On 8/10/23 16:20, Hannes Reinecke wrote:
>>>> nvmet_tcp_install_queue() is driven from the ->io_work workqueue
>>>> function, but will call flush_workqueue() which might trigger
>>>> ->release_work() which in itself calls flush_work on ->io_work.
>>>>
>>>> To avoid that check for pending queue in disconnecting status,
>>>> and return 'controller busy' until all disconnects are completed.
>>>>
>>>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>>>> ---
>>>> drivers/nvme/target/tcp.c | 14 ++++++++++++--
>>>> 1 file changed, 12 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
>>>> index f131b504aa01..97d07488072d 100644
>>>> --- a/drivers/nvme/target/tcp.c
>>>> +++ b/drivers/nvme/target/tcp.c
>>>> @@ -1845,8 +1845,18 @@ static u16 nvmet_tcp_install_queue(struct
>>>> nvmet_sq *sq)
>>>> container_of(sq, struct nvmet_tcp_queue, nvme_sq);
>>>> if (sq->qid == 0) {
>>>> - /* Let inflight controller teardown complete */
>>>> - flush_workqueue(nvmet_wq);
>>>> + struct nvmet_tcp_queue *q;
>>>> + int pending = 0;
>>>> +
>>>> + mutex_lock(&nvmet_tcp_queue_mutex);
>>>> + list_for_each_entry(q, &nvmet_tcp_queue_list, queue_list) {
>>>> + if (q->state == NVMET_TCP_Q_DISCONNECTING)
>>>> + pending++;
>>>> + }
>>>> + mutex_unlock(&nvmet_tcp_queue_mutex);
>>>> + /* Retry for pending controller teardown */
>>>> + if (pending)
>>>> + return NVME_SC_CONNECT_CTRL_BUSY;
>>>
>>> Overall I'm fine with the approach.
>>>
>>> That will most likely be the case for a controller reset or a "reset
>>> storm". Maybe it is better to keep it at some reasonably allowed pending
>>> to say 10 or something... The whole point is to not let the removals
>>> accumulate and continue creating new ones...
>>>
>>> Can you please test this with a host that does a loop like:
>>> while true; do nvme reset /dev/nvme0; done
>>>
>>> And let it run for some time...
>>> Or even better create a test that does this for 100 iterations
>>> or a log/short variants of this?
>>>
>>> The same would be needed by nvmet-rdma btw.
>>
>> Retested with the reset loop; no issues found after 327 iterations.
>
> Interesting, did the host see a NVME_SC_CONNECT_CTRL_BUSY during the
> test?
Nope.
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman
* Re: [PATCH] nvmet-tcp: avoid circular locking dependency on install_queue()
From: Sagi Grimberg @ 2023-08-17 11:43 UTC (permalink / raw)
To: Hannes Reinecke, Christoph Hellwig; +Cc: Keith Busch, linux-nvme
>>>>> nvmet_tcp_install_queue() is driven from the ->io_work workqueue
>>>>> function, but will call flush_workqueue() which might trigger
>>>>> ->release_work() which in itself calls flush_work on ->io_work.
>>>>>
>>>>> To avoid that check for pending queue in disconnecting status,
>>>>> and return 'controller busy' until all disconnects are completed.
>>>>>
>>>>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>>>>> ---
>>>>> drivers/nvme/target/tcp.c | 14 ++++++++++++--
>>>>> 1 file changed, 12 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
>>>>> index f131b504aa01..97d07488072d 100644
>>>>> --- a/drivers/nvme/target/tcp.c
>>>>> +++ b/drivers/nvme/target/tcp.c
>>>>> @@ -1845,8 +1845,18 @@ static u16 nvmet_tcp_install_queue(struct
>>>>> nvmet_sq *sq)
>>>>> container_of(sq, struct nvmet_tcp_queue, nvme_sq);
>>>>> if (sq->qid == 0) {
>>>>> - /* Let inflight controller teardown complete */
>>>>> - flush_workqueue(nvmet_wq);
>>>>> + struct nvmet_tcp_queue *q;
>>>>> + int pending = 0;
>>>>> +
>>>>> + mutex_lock(&nvmet_tcp_queue_mutex);
>>>>> + list_for_each_entry(q, &nvmet_tcp_queue_list, queue_list) {
>>>>> + if (q->state == NVMET_TCP_Q_DISCONNECTING)
>>>>> + pending++;
>>>>> + }
>>>>> + mutex_unlock(&nvmet_tcp_queue_mutex);
>>>>> + /* Retry for pending controller teardown */
>>>>> + if (pending)
>>>>> + return NVME_SC_CONNECT_CTRL_BUSY;
>>>>
>>>> Overall I'm fine with the approach.
>>>>
>>>> That will most likely be the case for a controller reset or a "reset
>>>> storm". Maybe it is better to keep it at some reasonably allowed
>>>> pending
>>>> to say 10 or something... The whole point is to not let the removals
>>>> accumulate and continue creating new ones...
>>>>
>>>> Can you please test this with a host that does a loop like:
>>>> while true; do nvme reset /dev/nvme0; done
>>>>
>>>> And let it run for some time...
>>>> Or even better create a test that does this for 100 iterations
>>>> or a log/short variants of this?
>>>>
>>>> The same would be needed by nvmet-rdma btw.
>>>
>>> Retested with the reset loop; no issues found after 327 iterations.
>>
>> Interesting, did the host see a NVME_SC_CONNECT_CTRL_BUSY during the
>> test?
>
> Nope.
So you are not stressing it enough?
Before (5+ years ago), multiple reporters, myself included, ran a
similar test and hit OOM after enough time.
If you are not exercising the busy return, you are not getting to the
point where back-pressure is needed...
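One way to push the target harder, so that teardowns actually overlap and the busy path is exercised, might be to storm several controllers in parallel (a sketch; assumes nvme-cli, and the `cmd` parameter exists only so the loops can be dry-run without hardware):

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def parallel_reset_storm(iterations, devices, cmd="nvme"):
    """Reset several controllers concurrently so that controller
    teardowns pile up on the target side. `cmd` defaults to the
    nvme-cli binary and can be stubbed for a dry run."""
    def storm_one(dev):
        for _ in range(iterations):
            # Individual resets may transiently fail under load;
            # keep hammering rather than aborting the storm.
            subprocess.run([cmd, "reset", dev], check=False)
        return dev

    with ThreadPoolExecutor(max_workers=len(devices)) as pool:
        done = list(pool.map(storm_one, devices))
    return f"storm done: {iterations} resets per device on {len(done)} devices"
```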