* [PATCH 0/2] nvmet: avoid circular locking warning
@ 2023-11-01 10:32 Hannes Reinecke
2023-11-01 10:32 ` [PATCH 1/2] nvmet-rdma: avoid circular locking dependency on install_queue() Hannes Reinecke
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Hannes Reinecke @ 2023-11-01 10:32 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Sagi Grimberg, Keith Busch, linux-nvme, Hannes Reinecke
nvmet-rdma and nvmet-tcp trigger a circular locking warning when
tearing down; reason is a call to 'flush_workqueue' when creating
a new controller which tries to cover for the fact that old controller
instances might be in the process of tearing down.
However, this is pure speculation as we don't know (and don't check)
if there really _are_ controllers in shutdown.
And even if there were, that should be short-lived, and would have been
resolved by connecting just a tad later.
So this patch returns 'controller busy' if we really find ourselves in this
situation, allowing the caller to reconnect later.
Hannes Reinecke (2):
nvmet-rdma: avoid circular locking dependency on install_queue()
nvmet-tcp: avoid circular locking dependency on install_queue()
drivers/nvme/target/rdma.c | 14 ++++++++++++--
drivers/nvme/target/tcp.c | 14 ++++++++++++--
2 files changed, 24 insertions(+), 4 deletions(-)
--
2.35.3
^ permalink raw reply [flat|nested] 7+ messages in thread* [PATCH 1/2] nvmet-rdma: avoid circular locking dependency on install_queue()
2023-11-01 10:32 [PATCH 0/2] nvmet: avoid circular locking warning Hannes Reinecke
@ 2023-11-01 10:32 ` Hannes Reinecke
2023-11-01 16:21 ` Jens Axboe
2023-11-01 10:32 ` [PATCH 2/2] nvmet-tcp: " Hannes Reinecke
2023-11-01 12:40 ` [PATCH 0/2] nvmet: avoid circular locking warning Shinichiro Kawasaki
2 siblings, 1 reply; 7+ messages in thread
From: Hannes Reinecke @ 2023-11-01 10:32 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Sagi Grimberg, Keith Busch, linux-nvme, Hannes Reinecke
nvmet_rdma_install_queue() is driven from the ->io_work workqueue
function, but will call flush_workqueue() which might trigger
->release_work() which in itself calls flush_work on ->io_work.
To avoid that check for pending queue in disconnecting status,
and return 'controller busy' until all disconnects are completed.
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
drivers/nvme/target/rdma.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
index 4597bca43a6d..eaeb94a9e863 100644
--- a/drivers/nvme/target/rdma.c
+++ b/drivers/nvme/target/rdma.c
@@ -1583,8 +1583,18 @@ static int nvmet_rdma_queue_connect(struct rdma_cm_id *cm_id,
}
if (queue->host_qid == 0) {
- /* Let inflight controller teardown complete */
- flush_workqueue(nvmet_wq);
+ struct nvmet_rdma_queue *q;
+ int pending = 0;
+
+ mutex_lock(&nvmet_rdma_queue_mutex);
+ list_for_each_entry(q, &nvmet_rdma_queue_list, queue_list) {
+ if (q->state == NVMET_RDMA_Q_DISCONNECTING)
+ pending++;
+ }
+ mutex_unlock(&nvmet_rdma_queue_mutex);
+ /* Retry for pending controller teardown */
+ if (pending)
+ return NVME_SC_CONNECT_CTRL_BUSY;
}
ret = nvmet_rdma_cm_accept(cm_id, queue, &event->param.conn);
--
2.35.3
^ permalink raw reply related [flat|nested] 7+ messages in thread* Re: [PATCH 1/2] nvmet-rdma: avoid circular locking dependency on install_queue()
2023-11-01 10:32 ` [PATCH 1/2] nvmet-rdma: avoid circular locking dependency on install_queue() Hannes Reinecke
@ 2023-11-01 16:21 ` Jens Axboe
2023-11-01 17:28 ` Hannes Reinecke
0 siblings, 1 reply; 7+ messages in thread
From: Jens Axboe @ 2023-11-01 16:21 UTC (permalink / raw)
To: Hannes Reinecke, Christoph Hellwig; +Cc: Sagi Grimberg, Keith Busch, linux-nvme
On 11/1/23 4:32 AM, Hannes Reinecke wrote:
> nvmet_rdma_install_queue() is driven from the ->io_work workqueue
> function, but will call flush_workqueue() which might trigger
> ->release_work() which in itself calls flush_work on ->io_work.
>
> To avoid that check for pending queue in disconnecting status,
> and return 'controller busy' until all disconnects are completed.
>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
> drivers/nvme/target/rdma.c | 14 ++++++++++++--
> 1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
> index 4597bca43a6d..eaeb94a9e863 100644
> --- a/drivers/nvme/target/rdma.c
> +++ b/drivers/nvme/target/rdma.c
> @@ -1583,8 +1583,18 @@ static int nvmet_rdma_queue_connect(struct rdma_cm_id *cm_id,
> }
>
> if (queue->host_qid == 0) {
> - /* Let inflight controller teardown complete */
> - flush_workqueue(nvmet_wq);
> + struct nvmet_rdma_queue *q;
> + int pending = 0;
> +
> + mutex_lock(&nvmet_rdma_queue_mutex);
> + list_for_each_entry(q, &nvmet_rdma_queue_list, queue_list) {
> + if (q->state == NVMET_RDMA_Q_DISCONNECTING)
> + pending++;
> + }
> + mutex_unlock(&nvmet_rdma_queue_mutex);
> + /* Retry for pending controller teardown */
> + if (pending)
> + return NVME_SC_CONNECT_CTRL_BUSY;
Not sure if it's worth turning this into a helper since both patches do
the same thing. Probably not, since you'd need to pass in the mutex and
state too. In any case, why not just break if you hit
NVMET_RDMA_Q_DISCONNECTING rather than keep looping? You don't care
about the exact count, jsut whether it's non-zero or not.
--
Jens Axboe
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH 1/2] nvmet-rdma: avoid circular locking dependency on install_queue()
2023-11-01 16:21 ` Jens Axboe
@ 2023-11-01 17:28 ` Hannes Reinecke
0 siblings, 0 replies; 7+ messages in thread
From: Hannes Reinecke @ 2023-11-01 17:28 UTC (permalink / raw)
To: Jens Axboe, Christoph Hellwig; +Cc: Sagi Grimberg, Keith Busch, linux-nvme
On 11/1/23 17:21, Jens Axboe wrote:
> On 11/1/23 4:32 AM, Hannes Reinecke wrote:
>> nvmet_rdma_install_queue() is driven from the ->io_work workqueue
>> function, but will call flush_workqueue() which might trigger
>> ->release_work() which in itself calls flush_work on ->io_work.
>>
>> To avoid that check for pending queue in disconnecting status,
>> and return 'controller busy' until all disconnects are completed.
>>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> ---
>> drivers/nvme/target/rdma.c | 14 ++++++++++++--
>> 1 file changed, 12 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
>> index 4597bca43a6d..eaeb94a9e863 100644
>> --- a/drivers/nvme/target/rdma.c
>> +++ b/drivers/nvme/target/rdma.c
>> @@ -1583,8 +1583,18 @@ static int nvmet_rdma_queue_connect(struct rdma_cm_id *cm_id,
>> }
>>
>> if (queue->host_qid == 0) {
>> - /* Let inflight controller teardown complete */
>> - flush_workqueue(nvmet_wq);
>> + struct nvmet_rdma_queue *q;
>> + int pending = 0;
>> +
>> + mutex_lock(&nvmet_rdma_queue_mutex);
>> + list_for_each_entry(q, &nvmet_rdma_queue_list, queue_list) {
>> + if (q->state == NVMET_RDMA_Q_DISCONNECTING)
>> + pending++;
>> + }
>> + mutex_unlock(&nvmet_rdma_queue_mutex);
>> + /* Retry for pending controller teardown */
>> + if (pending)
>> + return NVME_SC_CONNECT_CTRL_BUSY;
>
> Not sure if it's worth turning this into a helper since both patches do
> the same thing. Probably not, since you'd need to pass in the mutex and
> state too. In any case, why not just break if you hit
> NVMET_RDMA_Q_DISCONNECTING rather than keep looping? You don't care
> about the exact count, jsut whether it's non-zero or not.
>
True, that's easier. Will be updating the patches.
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 2/2] nvmet-tcp: avoid circular locking dependency on install_queue()
2023-11-01 10:32 [PATCH 0/2] nvmet: avoid circular locking warning Hannes Reinecke
2023-11-01 10:32 ` [PATCH 1/2] nvmet-rdma: avoid circular locking dependency on install_queue() Hannes Reinecke
@ 2023-11-01 10:32 ` Hannes Reinecke
2023-11-01 12:40 ` [PATCH 0/2] nvmet: avoid circular locking warning Shinichiro Kawasaki
2 siblings, 0 replies; 7+ messages in thread
From: Hannes Reinecke @ 2023-11-01 10:32 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Sagi Grimberg, Keith Busch, linux-nvme, Hannes Reinecke
nvmet_tcp_install_queue() is driven from the ->io_work workqueue
function, but will call flush_workqueue() which might trigger
->release_work() which in itself calls flush_work on ->io_work.
To avoid that check for pending queue in disconnecting status,
and return 'controller busy' until all disconnects are completed.
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
drivers/nvme/target/tcp.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index b3c1cc6690f6..ecc9ccfac823 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -2121,8 +2121,18 @@ static u16 nvmet_tcp_install_queue(struct nvmet_sq *sq)
container_of(sq, struct nvmet_tcp_queue, nvme_sq);
if (sq->qid == 0) {
- /* Let inflight controller teardown complete */
- flush_workqueue(nvmet_wq);
+ struct nvmet_tcp_queue *q;
+ int pending = 0;
+
+ mutex_lock(&nvmet_tcp_queue_mutex);
+ list_for_each_entry(q, &nvmet_tcp_queue_list, queue_list) {
+ if (q->state == NVMET_TCP_Q_DISCONNECTING)
+ pending++;
+ }
+ mutex_unlock(&nvmet_tcp_queue_mutex);
+ /* Retry for pending controller teardown */
+ if (pending)
+ return NVME_SC_CONNECT_CTRL_BUSY;
}
queue->nr_cmds = sq->size * 2;
--
2.35.3
^ permalink raw reply related [flat|nested] 7+ messages in thread* Re: [PATCH 0/2] nvmet: avoid circular locking warning
2023-11-01 10:32 [PATCH 0/2] nvmet: avoid circular locking warning Hannes Reinecke
2023-11-01 10:32 ` [PATCH 1/2] nvmet-rdma: avoid circular locking dependency on install_queue() Hannes Reinecke
2023-11-01 10:32 ` [PATCH 2/2] nvmet-tcp: " Hannes Reinecke
@ 2023-11-01 12:40 ` Shinichiro Kawasaki
2 siblings, 0 replies; 7+ messages in thread
From: Shinichiro Kawasaki @ 2023-11-01 12:40 UTC (permalink / raw)
To: Hannes Reinecke
Cc: Christoph Hellwig, Sagi Grimberg, Keith Busch,
linux-nvme@lists.infradead.org
On Nov 01, 2023 / 11:32, Hannes Reinecke wrote:
> nvmet-rdma and nvmet-tcp trigger a circular locking warning when
> tearing down; reason is a call to 'flush_workqueue' when creating
> a new controller which tries to cover for the fact that old controller
> instances might be in the process of tearing down.
> However, this is pure speculation as we don't know (and don't check)
> if there really _are_ controllers in shutdown.
> And even if there were, that should be short-lived, and would have been
> resolved by connecting just a tad later.
> So this patch returns 'controller busy' if we really find ourselves in this
> situation, allowing the caller to reconnect later.
I confirmed this series avoids the failure that I reported for blktests
test case nvme/003 with trtype=rdma or tcp. For the series,
Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCHv2 0/2] nvmet: avoid circular locking warning
@ 2023-11-02 14:19 Hannes Reinecke
2023-11-02 14:19 ` [PATCH 2/2] nvmet-tcp: avoid circular locking dependency on install_queue() Hannes Reinecke
0 siblings, 1 reply; 7+ messages in thread
From: Hannes Reinecke @ 2023-11-02 14:19 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Sagi Grimberg, Keith Busch, linux-nvme, Hannes Reinecke
nvmet-rdma and nvmet-tcp trigger a circular locking warning when
tearing down; reason is a call to 'flush_workqueue' when creating
a new controller which tries to cover for the fact that old controller
instances might be in the process of tearing down.
However, this is pure speculation as we don't know (and don't check)
if there really _are_ controllers in shutdown.
And even if there were, that should be short-lived, and would have been
resolved by connecting just a tad later.
So this patch returns 'controller busy' if we really find ourselves in this
situation, allowing the caller to reconnect later.
Changes to the original version:
- Update the rdma patch to implement 'install_queue()'
- Include suggestions from Jens Axboe
Hannes Reinecke (2):
nvmet-rdma: avoid circular locking dependency on install_queue()
nvmet-tcp: avoid circular locking dependency on install_queue()
drivers/nvme/target/rdma.c | 24 +++++++++++++++++++-----
drivers/nvme/target/tcp.c | 13 +++++++++++--
2 files changed, 30 insertions(+), 7 deletions(-)
--
2.35.3
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 2/2] nvmet-tcp: avoid circular locking dependency on install_queue()
2023-11-02 14:19 [PATCHv2 " Hannes Reinecke
@ 2023-11-02 14:19 ` Hannes Reinecke
0 siblings, 0 replies; 7+ messages in thread
From: Hannes Reinecke @ 2023-11-02 14:19 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Sagi Grimberg, Keith Busch, linux-nvme, Hannes Reinecke,
Shin'ichiro Kawasaki
nvmet_tcp_install_queue() is driven from the ->io_work workqueue
function, but will call flush_workqueue() which might trigger
->release_work() which in itself calls flush_work on ->io_work.
To avoid that check for pending queue in disconnecting status,
and return 'controller busy' until all disconnects are completed.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
---
drivers/nvme/target/tcp.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index b3c1cc6690f6..850e40eb70a0 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -2121,8 +2121,17 @@ static u16 nvmet_tcp_install_queue(struct nvmet_sq *sq)
container_of(sq, struct nvmet_tcp_queue, nvme_sq);
if (sq->qid == 0) {
- /* Let inflight controller teardown complete */
- flush_workqueue(nvmet_wq);
+ struct nvmet_tcp_queue *q;
+
+ mutex_lock(&nvmet_tcp_queue_mutex);
+ list_for_each_entry(q, &nvmet_tcp_queue_list, queue_list) {
+ if (q->state == NVMET_TCP_Q_DISCONNECTING) {
+ /* Retry for pending controller teardown */
+ mutex_unlock(&nvmet_tcp_queue_mutex);
+ return NVME_SC_CONNECT_CTRL_BUSY;
+ }
+ }
+ mutex_unlock(&nvmet_tcp_queue_mutex);
}
queue->nr_cmds = sq->size * 2;
--
2.35.3
^ permalink raw reply related [flat|nested] 7+ messages in thread
end of thread, other threads:[~2023-11-02 14:19 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-11-01 10:32 [PATCH 0/2] nvmet: avoid circular locking warning Hannes Reinecke
2023-11-01 10:32 ` [PATCH 1/2] nvmet-rdma: avoid circular locking dependency on install_queue() Hannes Reinecke
2023-11-01 16:21 ` Jens Axboe
2023-11-01 17:28 ` Hannes Reinecke
2023-11-01 10:32 ` [PATCH 2/2] nvmet-tcp: " Hannes Reinecke
2023-11-01 12:40 ` [PATCH 0/2] nvmet: avoid circular locking warning Shinichiro Kawasaki
-- strict thread matches above, loose matches on Subject: below --
2023-11-02 14:19 [PATCHv2 " Hannes Reinecke
2023-11-02 14:19 ` [PATCH 2/2] nvmet-tcp: avoid circular locking dependency on install_queue() Hannes Reinecke
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox