* [PATCH 1/2] nvmet-tcp: avoid circular locking dependency on install_queue()
2023-12-08 12:53 [PATCHv3 0/2] nvmet: avoid circular locking warning hare
@ 2023-12-08 12:53 ` hare
2023-12-11 14:10 ` Sagi Grimberg
2023-12-08 12:53 ` [PATCH 2/2] nvmet-rdma: " hare
` (2 subsequent siblings)
3 siblings, 1 reply; 9+ messages in thread
From: hare @ 2023-12-08 12:53 UTC (permalink / raw)
To: Sagi Grimberg
Cc: Christoph Hellwig, Keith Busch, linux-nvme, Hannes Reinecke,
Shin'ichiro Kawasaki
From: Hannes Reinecke <hare@suse.de>
nvmet_tcp_install_queue() is driven from the ->io_work workqueue
function, but calls flush_workqueue(), which might trigger
->release_work(), which in turn calls flush_work() on ->io_work.
To avoid that, check for pending queues in the disconnecting state,
and return 'controller busy' once a certain threshold is reached.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
---
drivers/nvme/target/tcp.c | 18 +++++++++++++++---
1 file changed, 15 insertions(+), 3 deletions(-)
diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index 4cc27856aa8f..2be0eb6cdb23 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -24,6 +24,7 @@
#include "nvmet.h"
#define NVMET_TCP_DEF_INLINE_DATA_SIZE (4 * PAGE_SIZE)
+#define NVMET_TCP_BACKLOG 128
static int param_store_val(const char *str, int *val, int min, int max)
{
@@ -2053,7 +2054,7 @@ static int nvmet_tcp_add_port(struct nvmet_port *nport)
goto err_sock;
}
- ret = kernel_listen(port->sock, 128);
+ ret = kernel_listen(port->sock, NVMET_TCP_BACKLOG);
if (ret) {
pr_err("failed to listen %d on port sock\n", ret);
goto err_sock;
@@ -2119,8 +2120,19 @@ static u16 nvmet_tcp_install_queue(struct nvmet_sq *sq)
container_of(sq, struct nvmet_tcp_queue, nvme_sq);
if (sq->qid == 0) {
- /* Let inflight controller teardown complete */
- flush_workqueue(nvmet_wq);
+ struct nvmet_tcp_queue *q;
+ int pending = 0;
+
+ /* Check for pending controller teardown */
+ mutex_lock(&nvmet_tcp_queue_mutex);
+ list_for_each_entry(q, &nvmet_tcp_queue_list, queue_list) {
+ if (q->nvme_sq.ctrl == sq->ctrl &&
+ q->state == NVMET_TCP_Q_DISCONNECTING)
+ pending++;
+ }
+ mutex_unlock(&nvmet_tcp_queue_mutex);
+ if (pending > NVMET_TCP_BACKLOG)
+ return NVME_SC_CONNECT_CTRL_BUSY;
}
queue->nr_cmds = sq->size * 2;
--
2.35.3
^ permalink raw reply related [flat|nested] 9+ messages in thread

* [PATCH 2/2] nvmet-rdma: avoid circular locking dependency on install_queue()
2023-12-08 12:53 [PATCHv3 0/2] nvmet: avoid circular locking warning hare
2023-12-08 12:53 ` [PATCH 1/2] nvmet-tcp: avoid circular locking dependency on install_queue() hare
@ 2023-12-08 12:53 ` hare
2023-12-11 14:10 ` Sagi Grimberg
[not found] ` <3eefcbfd-a93f-4c1c-957a-66058e5cdb54@nvidia.com>
2023-12-11 12:44 ` [PATCHv3 0/2] nvmet: avoid circular locking warning Shinichiro Kawasaki
2024-01-10 4:12 ` Shinichiro Kawasaki
3 siblings, 2 replies; 9+ messages in thread
From: hare @ 2023-12-08 12:53 UTC (permalink / raw)
To: Sagi Grimberg
Cc: Christoph Hellwig, Keith Busch, linux-nvme, Hannes Reinecke,
Shin'ichiro Kawasaki
From: Hannes Reinecke <hare@suse.de>
nvmet_rdma_install_queue() is driven from the ->io_work workqueue
function, but calls flush_workqueue(), which might trigger
->release_work(), which in turn calls flush_work() on ->io_work.
To avoid that, check for pending queues in the disconnecting state,
and return 'controller busy' once a certain threshold is reached.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
---
drivers/nvme/target/rdma.c | 19 ++++++++++++++++---
1 file changed, 16 insertions(+), 3 deletions(-)
diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
index 4597bca43a6d..667f9c04f35d 100644
--- a/drivers/nvme/target/rdma.c
+++ b/drivers/nvme/target/rdma.c
@@ -37,6 +37,8 @@
#define NVMET_RDMA_MAX_MDTS 8
#define NVMET_RDMA_MAX_METADATA_MDTS 5
+#define NVMET_RDMA_BACKLOG 128
+
struct nvmet_rdma_srq;
struct nvmet_rdma_cmd {
@@ -1583,8 +1585,19 @@ static int nvmet_rdma_queue_connect(struct rdma_cm_id *cm_id,
}
if (queue->host_qid == 0) {
- /* Let inflight controller teardown complete */
- flush_workqueue(nvmet_wq);
+ struct nvmet_rdma_queue *q;
+ int pending = 0;
+
+ /* Check for pending controller teardown */
+ mutex_lock(&nvmet_rdma_queue_mutex);
+ list_for_each_entry(q, &nvmet_rdma_queue_list, queue_list) {
+ if (q->nvme_sq.ctrl == queue->nvme_sq.ctrl &&
+ q->state == NVMET_RDMA_Q_DISCONNECTING)
+ pending++;
+ }
+ mutex_unlock(&nvmet_rdma_queue_mutex);
+ if (pending > NVMET_RDMA_BACKLOG)
+ return NVME_SC_CONNECT_CTRL_BUSY;
}
ret = nvmet_rdma_cm_accept(cm_id, queue, &event->param.conn);
@@ -1880,7 +1893,7 @@ static int nvmet_rdma_enable_port(struct nvmet_rdma_port *port)
goto out_destroy_id;
}
- ret = rdma_listen(cm_id, 128);
+ ret = rdma_listen(cm_id, NVMET_RDMA_BACKLOG);
if (ret) {
pr_err("listening to %pISpcs failed (%d)\n", addr, ret);
goto out_destroy_id;
--
2.35.3
^ permalink raw reply related [flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] nvmet-rdma: avoid circular locking dependency on install_queue()
2023-12-08 12:53 ` [PATCH 2/2] nvmet-rdma: " hare
@ 2023-12-11 14:10 ` Sagi Grimberg
[not found] ` <3eefcbfd-a93f-4c1c-957a-66058e5cdb54@nvidia.com>
1 sibling, 0 replies; 9+ messages in thread
From: Sagi Grimberg @ 2023-12-11 14:10 UTC (permalink / raw)
To: hare
Cc: Christoph Hellwig, Keith Busch, linux-nvme, Hannes Reinecke,
Shin'ichiro Kawasaki
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
^ permalink raw reply [flat|nested] 9+ messages in thread

[parent not found: <3eefcbfd-a93f-4c1c-957a-66058e5cdb54@nvidia.com>]
* Re: [PATCH 2/2] nvmet-rdma: avoid circular locking dependency on install_queue()
[not found] ` <3eefcbfd-a93f-4c1c-957a-66058e5cdb54@nvidia.com>
@ 2024-06-03 6:43 ` Sagi Grimberg
0 siblings, 0 replies; 9+ messages in thread
From: Sagi Grimberg @ 2024-06-03 6:43 UTC (permalink / raw)
To: Max Gurtovoy, hare
Cc: Christoph Hellwig, Keith Busch, linux-nvme, Hannes Reinecke,
Shin'ichiro Kawasaki
On 02/06/2024 16:09, Max Gurtovoy wrote:
> Hi Hannes/Sagi,
>
> sorry for the late response on this one.
>
> On 08/12/2023 14:53, hare@kernel.org wrote:
>> From: Hannes Reinecke<hare@suse.de>
>>
>> nvmet_rdma_install_queue() is driven from the ->io_work workqueue
>
> nvmet_rdma_install_queue callback is not implemented in the driver.
> And RDMA doesn't use io_work workqueue..
>
>
>> function, but will call flush_workqueue() which might trigger
>> ->release_work() which in itself calls flush_work on ->io_work.
>>
>> To avoid that check for pending queue in disconnecting status,
>> and return 'controller busy' when we reached a certain threshold.
>>
>> Signed-off-by: Hannes Reinecke<hare@suse.de>
>> Tested-by: Shin'ichiro Kawasaki<shinichiro.kawasaki@wdc.com>
>> ---
>> drivers/nvme/target/rdma.c | 19 ++++++++++++++++---
>> 1 file changed, 16 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
>> index 4597bca43a6d..667f9c04f35d 100644
>> --- a/drivers/nvme/target/rdma.c
>> +++ b/drivers/nvme/target/rdma.c
>> @@ -37,6 +37,8 @@
>> #define NVMET_RDMA_MAX_MDTS 8
>> #define NVMET_RDMA_MAX_METADATA_MDTS 5
>> +#define NVMET_RDMA_BACKLOG 128
>> +
>> struct nvmet_rdma_srq;
>> struct nvmet_rdma_cmd {
>> @@ -1583,8 +1585,19 @@ static int nvmet_rdma_queue_connect(struct
>> rdma_cm_id *cm_id,
>> }
>> if (queue->host_qid == 0) {
>> - /* Let inflight controller teardown complete */
>> - flush_workqueue(nvmet_wq);
>> + struct nvmet_rdma_queue *q;
>> + int pending = 0;
>> +
>> + /* Check for pending controller teardown */
>> + mutex_lock(&nvmet_rdma_queue_mutex);
>> + list_for_each_entry(q, &nvmet_rdma_queue_list, queue_list) {
>> + if (q->nvme_sq.ctrl == queue->nvme_sq.ctrl &&
>
> nvme_sq->ctrl pointer is set upon FABRICS_CONNECT during
> nvmet_install_queue().
>
> obviously this check is not relevant since queue->nvme_sq.ctrl is
> always NULL, isn't it ?
I think that the goal was to look at all the queues that were scheduled
for teardown (queue->release_work), so sq->ctrl should be valid for
those queues afaict.
>
> So, I wonder what is this check doing?
>
> Was the intention to check that port->cm_id (listener) is not
> overloaded ?
>
> Also the return value is "FABRICS" return code and not the value that
> rdma_cm is looking for.
>
> We probably should introduce nvmet_rdma_install_queue() and check the
> status of pending disconnections associated with port->cm_id, which is
> the one with the configured backlog.
>
> I don't understand the solution also for the nvmet_tcp.
>
> I think this series only removed the "flush_workqueue(nvmet_wq);"
> actually.
>
> Is there something I'm missing in this commit?
IIRC the goal was to get rid of the circular locking lockdep complaint.
nvmet needs a back-pressure mechanism for incoming connections because
queue teardown is async, and nvmet may hit OOM in a connect-disconnect
storm scenario. The goal here was to detect such a scenario by counting
the number of queues that are queued for teardown, and if they exceed a
certain threshold, return a BUSY status to the incoming fabrics connect.
Before, nvmet flushed the pending queue teardowns, but that triggered a
lockdep complaint, and Hannes attempted to address it.
>
>> + q->state == NVMET_RDMA_Q_DISCONNECTING)
>> + pending++;
>> + }
>> + mutex_unlock(&nvmet_rdma_queue_mutex);
>> + if (pending > NVMET_RDMA_BACKLOG)
>> + return NVME_SC_CONNECT_CTRL_BUSY;
>> }
>> ret = nvmet_rdma_cm_accept(cm_id, queue, &event->param.conn);
>> @@ -1880,7 +1893,7 @@ static int nvmet_rdma_enable_port(struct
>> nvmet_rdma_port *port)
>> goto out_destroy_id;
>> }
>> - ret = rdma_listen(cm_id, 128);
>> + ret = rdma_listen(cm_id, NVMET_RDMA_BACKLOG);
>> if (ret) {
>> pr_err("listening to %pISpcs failed (%d)\n", addr, ret);
>> goto out_destroy_id;
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCHv3 0/2] nvmet: avoid circular locking warning
2023-12-08 12:53 [PATCHv3 0/2] nvmet: avoid circular locking warning hare
2023-12-08 12:53 ` [PATCH 1/2] nvmet-tcp: avoid circular locking dependency on install_queue() hare
2023-12-08 12:53 ` [PATCH 2/2] nvmet-rdma: " hare
@ 2023-12-11 12:44 ` Shinichiro Kawasaki
2024-01-10 4:12 ` Shinichiro Kawasaki
3 siblings, 0 replies; 9+ messages in thread
From: Shinichiro Kawasaki @ 2023-12-11 12:44 UTC (permalink / raw)
To: hare@kernel.org
Cc: Sagi Grimberg, Christoph Hellwig, Keith Busch,
linux-nvme@lists.infradead.org, Hannes Reinecke
On Dec 08, 2023 / 13:53, hare@kernel.org wrote:
> From: Hannes Reinecke <hare@suse.de>
>
> nvmet-rdma and nvmet-tcp trigger a circular locking warning when
> tearing down; reason is a call to 'flush_workqueue' when creating
> a new controller which tries to cover for the fact that old controller
> instances might be in the process of tearing down.
> However, this is pure speculation as we don't know (and don't check)
> if there really _are_ controllers in shutdown.
> And even if there were, that should be short-lived, and would have been
> resolved by connecting just a tad later.
> So this patch returns 'controller busy' if we really find ourselves in this
> situation, allowing the caller to reconnect later.
Thanks Hannes. I reconfirmed the patches avoid the circular lockdep WARN with
rdma and tcp transports using v6.7-rc5 kernel. And I observed no regression in
my blktests runs. Looks good.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCHv3 0/2] nvmet: avoid circular locking warning
2023-12-08 12:53 [PATCHv3 0/2] nvmet: avoid circular locking warning hare
` (2 preceding siblings ...)
2023-12-11 12:44 ` [PATCHv3 0/2] nvmet: avoid circular locking warning Shinichiro Kawasaki
@ 2024-01-10 4:12 ` Shinichiro Kawasaki
2024-01-10 21:38 ` Keith Busch
3 siblings, 1 reply; 9+ messages in thread
From: Shinichiro Kawasaki @ 2024-01-10 4:12 UTC (permalink / raw)
To: Keith Busch
Cc: Sagi Grimberg, Christoph Hellwig, linux-nvme@lists.infradead.org,
Hannes Reinecke, hare@kernel.org
On Dec 08, 2023 / 13:53, hare@kernel.org wrote:
> From: Hannes Reinecke <hare@suse.de>
>
> nvmet-rdma and nvmet-tcp trigger a circular locking warning when
> tearing down; reason is a call to 'flush_workqueue' when creating
> a new controller which tries to cover for the fact that old controller
> instances might be in the process of tearing down.
> However, this is pure speculation as we don't know (and don't check)
> if there really _are_ controllers in shutdown.
> And even if there were, that should be short-lived, and would have been
> resolved by connecting just a tad later.
> So this patch returns 'controller busy' if we really find ourselves in this
> situation, allowing the caller to reconnect later.
Keith,
Could you consider upstreaming these patches for kernel v6.8-rcX? They fix
lockdep WARNs observed with the blktests nvme test groups and the rdma/tcp
transports.
^ permalink raw reply [flat|nested] 9+ messages in thread

* Re: [PATCHv3 0/2] nvmet: avoid circular locking warning
2024-01-10 4:12 ` Shinichiro Kawasaki
@ 2024-01-10 21:38 ` Keith Busch
0 siblings, 0 replies; 9+ messages in thread
From: Keith Busch @ 2024-01-10 21:38 UTC (permalink / raw)
To: Shinichiro Kawasaki
Cc: Sagi Grimberg, Christoph Hellwig, linux-nvme@lists.infradead.org,
Hannes Reinecke, hare@kernel.org
On Wed, Jan 10, 2024 at 04:12:16AM +0000, Shinichiro Kawasaki wrote:
> On Dec 08, 2023 / 13:53, hare@kernel.org wrote:
> > From: Hannes Reinecke <hare@suse.de>
> >
> > nvmet-rdma and nvmet-tcp trigger a circular locking warning when
> > tearing down; reason is a call to 'flush_workqueue' when creating
> > a new controller which tries to cover for the fact that old controller
> > instances might be in the process of tearing down.
> > However, this is pure speculation as we don't know (and don't check)
> > if there really _are_ controllers in shutdown.
> > And even if there were, that should be short-lived, and would have been
> > resolved by connecting just a tad later.
> > So this patch returns 'controller busy' if we really find ourselves in this
> > situation, allowing the caller to reconnect later.
>
> Keith,
>
> Could you consider upstreaming these patches for kernel v6.8-rcX? They fix
> lockdep WARNs observed with the blktests nvme test groups and the rdma/tcp
> transports.
Definitely, I knew I was forgetting something. Patches applied now for
6.8 and our 2nd merge window pull request will be issued today.
^ permalink raw reply [flat|nested] 9+ messages in thread