* [PATCH 0/4] nvme: start keep-alive after admin queue setup
@ 2023-10-20 14:25 Hannes Reinecke
From: Hannes Reinecke @ 2023-10-20 14:25 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Keith Busch, Sagi Grimberg, linux-nvme, Hannes Reinecke
Hi all,
Setting up I/O queues might take quite some time on larger and/or
busy setups, so KATO might expire on the admin queue before all
I/O queues can be set up.
This patchset fixes the issue by moving the calls that start
and stop KATO into the quiesce/unquiesce functions.
That requires some refactoring/alignment with transport
drivers which do not always quiesce and unquiesce the admin
queue.
As usual, comments and reviews are welcome.
Hannes Reinecke (4):
nvme-fc: add missing quiesce in nvme_fc_create_association()
nvme-pci: unquiesce admin queue before nvme_init_ctrl_finish()
nvme-loop: quiesce admin queue on failure
nvme: start keep-alive after admin queue setup
drivers/nvme/host/apple.c | 4 ++--
drivers/nvme/host/core.c | 7 ++++---
drivers/nvme/host/fc.c | 16 +++++++++-------
drivers/nvme/host/nvme.h | 2 +-
drivers/nvme/host/pci.c | 9 ++++++---
drivers/nvme/host/rdma.c | 6 +++---
drivers/nvme/host/tcp.c | 6 +++---
drivers/nvme/target/loop.c | 6 ++++--
8 files changed, 32 insertions(+), 24 deletions(-)
--
2.35.3
* [PATCH 1/4] nvme-fc: add missing quiesce in nvme_fc_create_association()
From: Hannes Reinecke @ 2023-10-20 14:25 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Keith Busch, Sagi Grimberg, linux-nvme, Hannes Reinecke
nvme_fc_create_association() calls nvme_unquiesce_admin_queue(),
but does not quiesce the admin queue again on failure.
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
drivers/nvme/host/fc.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index a15b37750d6e..17b6c9238d68 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -3135,7 +3135,7 @@ nvme_fc_create_association(struct nvme_fc_ctrl *ctrl)
if (!ret && test_bit(ASSOC_FAILED, &ctrl->flags))
ret = -EIO;
if (ret)
- goto out_disconnect_admin_queue;
+ goto out_quiesce_admin_queue;
/* sanity checks */
@@ -3144,7 +3144,7 @@ nvme_fc_create_association(struct nvme_fc_ctrl *ctrl)
dev_err(ctrl->ctrl.device, "icdoff %d is not supported!\n",
ctrl->ctrl.icdoff);
ret = NVME_SC_INVALID_FIELD | NVME_SC_DNR;
- goto out_disconnect_admin_queue;
+ goto out_quiesce_admin_queue;
}
/* FC-NVME supports normal SGL Data Block Descriptors */
@@ -3152,7 +3152,7 @@ nvme_fc_create_association(struct nvme_fc_ctrl *ctrl)
dev_err(ctrl->ctrl.device,
"Mandatory sgls are not supported!\n");
ret = NVME_SC_INVALID_FIELD | NVME_SC_DNR;
- goto out_disconnect_admin_queue;
+ goto out_quiesce_admin_queue;
}
if (opts->queue_size > ctrl->ctrl.maxcmd) {
@@ -3199,6 +3199,8 @@ nvme_fc_create_association(struct nvme_fc_ctrl *ctrl)
out_term_aen_ops:
nvme_fc_term_aen_ops(ctrl);
+out_quiesce_admin_queue:
+ nvme_quiesce_admin_queue(&ctrl->ctrl);
out_disconnect_admin_queue:
dev_warn(ctrl->ctrl.device,
"NVME-FC{%d}: create_assoc failed, assoc_id %llx ret %d\n",
--
2.35.3
* [PATCH 2/4] nvme-pci: unquiesce admin queue before nvme_init_ctrl_finish()
From: Hannes Reinecke @ 2023-10-20 14:25 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Keith Busch, Sagi Grimberg, linux-nvme, Hannes Reinecke
We should unquiesce the admin queue after setting the state to
'CONNECTING', but before calling nvme_init_ctrl_finish().
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
drivers/nvme/host/pci.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 60a08dfe8d75..5b6dec052dfe 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2709,7 +2709,6 @@ static void nvme_reset_work(struct work_struct *work)
result = nvme_pci_enable(dev);
if (result)
goto out_unlock;
- nvme_unquiesce_admin_queue(&dev->ctrl);
mutex_unlock(&dev->shutdown_lock);
/*
@@ -2723,6 +2722,8 @@ static void nvme_reset_work(struct work_struct *work)
goto out;
}
+ nvme_unquiesce_admin_queue(&dev->ctrl);
+
result = nvme_init_ctrl_finish(&dev->ctrl, was_suspend);
if (result)
goto out;
@@ -3019,6 +3020,8 @@ static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
goto out_disable;
}
+ nvme_unquiesce_admin_queue(&dev->ctrl);
+
result = nvme_init_ctrl_finish(&dev->ctrl, false);
if (result)
goto out_disable;
--
2.35.3
* [PATCH 3/4] nvme-loop: quiesce admin queue on failure
From: Hannes Reinecke @ 2023-10-20 14:25 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Keith Busch, Sagi Grimberg, linux-nvme, Hannes Reinecke
We are calling nvme_unquiesce_admin_queue() in
nvme_loop_configure_admin_queue(), but don't call
nvme_quiesce_admin_queue() on failure.
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
drivers/nvme/target/loop.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/target/loop.c b/drivers/nvme/target/loop.c
index 48d5df054cd0..e1b8ead94575 100644
--- a/drivers/nvme/target/loop.c
+++ b/drivers/nvme/target/loop.c
@@ -379,10 +379,12 @@ static int nvme_loop_configure_admin_queue(struct nvme_loop_ctrl *ctrl)
error = nvme_init_ctrl_finish(&ctrl->ctrl, false);
if (error)
- goto out_cleanup_tagset;
+ goto out_quiesce_tagset;
return 0;
+out_quiesce_tagset:
+ nvme_quiesce_admin_queue(&ctrl->ctrl);
out_cleanup_tagset:
clear_bit(NVME_LOOP_Q_LIVE, &ctrl->queues[0].flags);
nvme_remove_admin_tag_set(&ctrl->ctrl);
--
2.35.3
* [PATCH 4/4] nvme: start keep-alive after admin queue setup
From: Hannes Reinecke @ 2023-10-20 14:26 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Keith Busch, Sagi Grimberg, linux-nvme, Hannes Reinecke
Setting up I/O queues might take quite some time on larger and/or
busy setups, so KATO might expire before all I/O queues could be
set up.
Fix this by moving the keep-alive start and stop calls into
nvme_unquiesce_admin_queue() and nvme_quiesce_admin_queue().
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
drivers/nvme/host/apple.c | 4 ++--
drivers/nvme/host/core.c | 7 ++++---
drivers/nvme/host/fc.c | 8 ++++----
drivers/nvme/host/nvme.h | 2 +-
drivers/nvme/host/pci.c | 8 ++++----
drivers/nvme/host/rdma.c | 6 +++---
drivers/nvme/host/tcp.c | 6 +++---
drivers/nvme/target/loop.c | 2 +-
8 files changed, 22 insertions(+), 21 deletions(-)
diff --git a/drivers/nvme/host/apple.c b/drivers/nvme/host/apple.c
index 596bb11eeba5..91d3b1341723 100644
--- a/drivers/nvme/host/apple.c
+++ b/drivers/nvme/host/apple.c
@@ -869,7 +869,7 @@ static void apple_nvme_disable(struct apple_nvme *anv, bool shutdown)
*/
if (shutdown) {
nvme_unquiesce_io_queues(&anv->ctrl);
- nvme_unquiesce_admin_queue(&anv->ctrl);
+ nvme_unquiesce_admin_queue(&anv->ctrl, false);
}
}
@@ -1107,7 +1107,7 @@ static void apple_nvme_reset_work(struct work_struct *work)
dev_dbg(anv->dev, "Starting admin queue");
apple_nvme_init_queue(&anv->adminq);
- nvme_unquiesce_admin_queue(&anv->ctrl);
+ nvme_unquiesce_admin_queue(&anv->ctrl, true);
if (!nvme_change_ctrl_state(&anv->ctrl, NVME_CTRL_CONNECTING)) {
dev_warn(anv->ctrl.device,
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 62612f87aafa..070912e1601a 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4344,8 +4344,6 @@ EXPORT_SYMBOL_GPL(nvme_stop_ctrl);
void nvme_start_ctrl(struct nvme_ctrl *ctrl)
{
- nvme_start_keep_alive(ctrl);
-
nvme_enable_aen(ctrl);
/*
@@ -4602,6 +4600,7 @@ EXPORT_SYMBOL_GPL(nvme_unquiesce_io_queues);
void nvme_quiesce_admin_queue(struct nvme_ctrl *ctrl)
{
+ nvme_stop_keep_alive(ctrl);
if (!test_and_set_bit(NVME_CTRL_ADMIN_Q_STOPPED, &ctrl->flags))
blk_mq_quiesce_queue(ctrl->admin_q);
else
@@ -4609,10 +4608,12 @@ void nvme_quiesce_admin_queue(struct nvme_ctrl *ctrl)
}
EXPORT_SYMBOL_GPL(nvme_quiesce_admin_queue);
-void nvme_unquiesce_admin_queue(struct nvme_ctrl *ctrl)
+void nvme_unquiesce_admin_queue(struct nvme_ctrl *ctrl, bool start_ka)
{
if (test_and_clear_bit(NVME_CTRL_ADMIN_Q_STOPPED, &ctrl->flags))
blk_mq_unquiesce_queue(ctrl->admin_q);
+ if (start_ka)
+ nvme_start_keep_alive(ctrl);
}
EXPORT_SYMBOL_GPL(nvme_unquiesce_admin_queue);
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 17b6c9238d68..3ac749bf34de 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -2404,7 +2404,7 @@ nvme_fc_ctrl_free(struct kref *ref)
list_del(&ctrl->ctrl_list);
spin_unlock_irqrestore(&ctrl->rport->lock, flags);
- nvme_unquiesce_admin_queue(&ctrl->ctrl);
+ nvme_unquiesce_admin_queue(&ctrl->ctrl, false);
nvme_remove_admin_tag_set(&ctrl->ctrl);
kfree(ctrl->queues);
@@ -2535,7 +2535,7 @@ __nvme_fc_abort_outstanding_ios(struct nvme_fc_ctrl *ctrl, bool start_queues)
nvme_fc_terminate_exchange, &ctrl->ctrl);
blk_mq_tagset_wait_completed_request(&ctrl->admin_tag_set);
if (start_queues)
- nvme_unquiesce_admin_queue(&ctrl->ctrl);
+ nvme_unquiesce_admin_queue(&ctrl->ctrl, true);
}
static void
@@ -3129,7 +3129,7 @@ nvme_fc_create_association(struct nvme_fc_ctrl *ctrl)
ctrl->ctrl.max_hw_sectors = ctrl->ctrl.max_segments <<
(ilog2(SZ_4K) - 9);
- nvme_unquiesce_admin_queue(&ctrl->ctrl);
+ nvme_unquiesce_admin_queue(&ctrl->ctrl, true);
ret = nvme_init_ctrl_finish(&ctrl->ctrl, false);
if (!ret && test_bit(ASSOC_FAILED, &ctrl->flags))
@@ -3288,7 +3288,7 @@ nvme_fc_delete_association(struct nvme_fc_ctrl *ctrl)
nvme_fc_free_queue(&ctrl->queues[0]);
/* re-enable the admin_q so anything new can fast fail */
- nvme_unquiesce_admin_queue(&ctrl->ctrl);
+ nvme_unquiesce_admin_queue(&ctrl->ctrl, false);
/* resume the io queues so that things will fast fail */
nvme_unquiesce_io_queues(&ctrl->ctrl);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 39a90b7cb125..1aba30600b4a 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -770,7 +770,7 @@ void nvme_complete_async_event(struct nvme_ctrl *ctrl, __le16 status,
void nvme_quiesce_io_queues(struct nvme_ctrl *ctrl);
void nvme_unquiesce_io_queues(struct nvme_ctrl *ctrl);
void nvme_quiesce_admin_queue(struct nvme_ctrl *ctrl);
-void nvme_unquiesce_admin_queue(struct nvme_ctrl *ctrl);
+void nvme_unquiesce_admin_queue(struct nvme_ctrl *ctrl, bool start_ka);
void nvme_mark_namespaces_dead(struct nvme_ctrl *ctrl);
void nvme_sync_queues(struct nvme_ctrl *ctrl);
void nvme_sync_io_queues(struct nvme_ctrl *ctrl);
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 5b6dec052dfe..b9a6abe3fd33 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1681,7 +1681,7 @@ static void nvme_dev_remove_admin(struct nvme_dev *dev)
* user requests may be waiting on a stopped queue. Start the
* queue to flush these to completion.
*/
- nvme_unquiesce_admin_queue(&dev->ctrl);
+ nvme_unquiesce_admin_queue(&dev->ctrl, false);
nvme_remove_admin_tag_set(&dev->ctrl);
}
}
@@ -2615,7 +2615,7 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
if (shutdown) {
nvme_unquiesce_io_queues(&dev->ctrl);
if (dev->ctrl.admin_q && !blk_queue_dying(dev->ctrl.admin_q))
- nvme_unquiesce_admin_queue(&dev->ctrl);
+ nvme_unquiesce_admin_queue(&dev->ctrl, false);
}
mutex_unlock(&dev->shutdown_lock);
}
@@ -2722,7 +2722,7 @@ static void nvme_reset_work(struct work_struct *work)
goto out;
}
- nvme_unquiesce_admin_queue(&dev->ctrl);
+ nvme_unquiesce_admin_queue(&dev->ctrl, true);
result = nvme_init_ctrl_finish(&dev->ctrl, was_suspend);
if (result)
@@ -3020,7 +3020,7 @@ static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
goto out_disable;
}
- nvme_unquiesce_admin_queue(&dev->ctrl);
+ nvme_unquiesce_admin_queue(&dev->ctrl, true);
result = nvme_init_ctrl_finish(&dev->ctrl, false);
if (result)
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 337a624a537c..a9368767560f 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -830,7 +830,7 @@ static int nvme_rdma_configure_admin_queue(struct nvme_rdma_ctrl *ctrl,
else
ctrl->ctrl.max_integrity_segments = 0;
- nvme_unquiesce_admin_queue(&ctrl->ctrl);
+ nvme_unquiesce_admin_queue(&ctrl->ctrl, true);
error = nvme_init_ctrl_finish(&ctrl->ctrl, false);
if (error)
@@ -932,7 +932,7 @@ static void nvme_rdma_teardown_admin_queue(struct nvme_rdma_ctrl *ctrl,
nvme_rdma_stop_queue(&ctrl->queues[0]);
nvme_cancel_admin_tagset(&ctrl->ctrl);
if (remove) {
- nvme_unquiesce_admin_queue(&ctrl->ctrl);
+ nvme_unquiesce_admin_queue(&ctrl->ctrl, false);
nvme_remove_admin_tag_set(&ctrl->ctrl);
}
nvme_rdma_destroy_admin_queue(ctrl);
@@ -1120,7 +1120,7 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work)
nvme_rdma_teardown_io_queues(ctrl, false);
nvme_unquiesce_io_queues(&ctrl->ctrl);
nvme_rdma_teardown_admin_queue(ctrl, false);
- nvme_unquiesce_admin_queue(&ctrl->ctrl);
+ nvme_unquiesce_admin_queue(&ctrl->ctrl, false);
nvme_auth_stop(&ctrl->ctrl);
if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING)) {
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 4714a902f4ca..a3c3ef843dca 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2103,7 +2103,7 @@ static int nvme_tcp_configure_admin_queue(struct nvme_ctrl *ctrl, bool new)
if (error)
goto out_stop_queue;
- nvme_unquiesce_admin_queue(ctrl);
+ nvme_unquiesce_admin_queue(ctrl, true);
error = nvme_init_ctrl_finish(ctrl, false);
if (error)
@@ -2133,7 +2133,7 @@ static void nvme_tcp_teardown_admin_queue(struct nvme_ctrl *ctrl,
nvme_tcp_stop_queue(ctrl, 0);
nvme_cancel_admin_tagset(ctrl);
if (remove)
- nvme_unquiesce_admin_queue(ctrl);
+ nvme_unquiesce_admin_queue(ctrl, false);
nvme_tcp_destroy_admin_queue(ctrl, remove);
}
@@ -2280,7 +2280,7 @@ static void nvme_tcp_error_recovery_work(struct work_struct *work)
/* unquiesce to fail fast pending requests */
nvme_unquiesce_io_queues(ctrl);
nvme_tcp_teardown_admin_queue(ctrl, false);
- nvme_unquiesce_admin_queue(ctrl);
+ nvme_unquiesce_admin_queue(ctrl, false);
nvme_auth_stop(ctrl);
if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_CONNECTING)) {
diff --git a/drivers/nvme/target/loop.c b/drivers/nvme/target/loop.c
index e1b8ead94575..6237f6baba4f 100644
--- a/drivers/nvme/target/loop.c
+++ b/drivers/nvme/target/loop.c
@@ -375,7 +375,7 @@ static int nvme_loop_configure_admin_queue(struct nvme_loop_ctrl *ctrl)
ctrl->ctrl.max_hw_sectors =
(NVME_LOOP_MAX_SEGMENTS - 1) << PAGE_SECTORS_SHIFT;
- nvme_unquiesce_admin_queue(&ctrl->ctrl);
+ nvme_unquiesce_admin_queue(&ctrl->ctrl, true);
error = nvme_init_ctrl_finish(&ctrl->ctrl, false);
if (error)
--
2.35.3
* Re: [PATCH 0/4] nvme: start keep-alive after admin queue setup
From: Keith Busch @ 2023-10-20 15:30 UTC (permalink / raw)
To: Hannes Reinecke; +Cc: Christoph Hellwig, Sagi Grimberg, linux-nvme
On Fri, Oct 20, 2023 at 04:25:56PM +0200, Hannes Reinecke wrote:
> Hi all,
>
> Setting up I/O queues might take quite some time on larger and/or
> busy setups, so KATO might expire on the admin queue before all
> I/O queues can be set up.
> This patchset fixes the issue by moving the calls that start
> and stop KATO into the quiesce/unquiesce functions.
> That requires some refactoring/alignment with transport
> drivers which do not always quiesce and unquiesce the admin
> queue.
>
> As usual, comments and reviews are welcome.
The pci transport doesn't care about keep-alive, so I don't think we
should force everyone to have to consider it.
Wouldn't you want to start the keep-alive on the connecting state
transition? Something like this:
---
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 62612f87aafa2..a44b1206b20ad 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -583,6 +583,8 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
} else if (ctrl->state == NVME_CTRL_CONNECTING &&
old_state == NVME_CTRL_RESETTING) {
nvme_start_failfast_work(ctrl);
+ if (ctrl->ops->flags & NVME_F_FABRICS)
+ nvme_start_keep_alive(ctrl);
}
return changed;
}
--
* Re: [PATCH 0/4] nvme: start keep-alive after admin queue setup
From: Hannes Reinecke @ 2023-10-23 9:54 UTC (permalink / raw)
To: Keith Busch; +Cc: Christoph Hellwig, Sagi Grimberg, linux-nvme
On 10/20/23 17:30, Keith Busch wrote:
> On Fri, Oct 20, 2023 at 04:25:56PM +0200, Hannes Reinecke wrote:
>> Hi all,
>>
>> Setting up I/O queues might take quite some time on larger and/or
>> busy setups, so KATO might expire on the admin queue before all
>> I/O queues can be set up.
>> This patchset fixes the issue by moving the calls that start
>> and stop KATO into the quiesce/unquiesce functions.
>> That requires some refactoring/alignment with transport
>> drivers which do not always quiesce and unquiesce the admin
>> queue.
>>
>> As usual, comments and reviews are welcome.
>
> The pci transport doesn't care about keep-alive, so I don't think we
> should force everyone to have to consider it.
>
> Wouldn't you want to start the keep-alive on the connecting state
> transition? Something like this:
>
> ---
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 62612f87aafa2..a44b1206b20ad 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -583,6 +583,8 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
> } else if (ctrl->state == NVME_CTRL_CONNECTING &&
> old_state == NVME_CTRL_RESETTING) {
> nvme_start_failfast_work(ctrl);
> + if (ctrl->ops->flags & NVME_F_FABRICS)
> + nvme_start_keep_alive(ctrl);
> }
> return changed;
> }
> --
>
But we start keep alive unconditionally nowadays; each and every
host driver (including PCI) is calling into nvme_start_ctrl(),
which calls nvme_start_keep_alive() without any checks.
Be it as it may, tying it into the state transition won't work
as the 'CONNECTING' state is entered _before_ any queues are set up,
so starting keep-alive here would be too early.
Cheers,
Hannes
* Re: [PATCH 0/4] nvme: start keep-alive after admin queue setup
From: Keith Busch @ 2023-10-23 23:16 UTC (permalink / raw)
To: Hannes Reinecke; +Cc: Christoph Hellwig, Sagi Grimberg, linux-nvme
On Mon, Oct 23, 2023 at 11:54:37AM +0200, Hannes Reinecke wrote:
> But we start keep alive unconditionally nowadays; each and every
> host driver (including PCI) is calling into nvme_start_ctrl(),
> which calls nvme_start_keep_alive() without any checks.
I see nvme_start_keep_alive() is always called, but that checks
'ctrl->kato' which will always be 0 for PCI.
> Be it as it may, tying it into the state transition won't work
> as the 'CONNECTING' state is entered _before_ any queues are setup,
> so starting keep-alive here would be too early.
That's just I/O queues, right? Keep-alive is an admin command, and the
admin queue should be up in the CONNECTING state.
* Re: [PATCH 0/4] nvme: start keep-alive after admin queue setup
From: Hannes Reinecke @ 2023-10-24 5:30 UTC (permalink / raw)
To: Keith Busch; +Cc: Christoph Hellwig, Sagi Grimberg, linux-nvme
On 10/24/23 01:16, Keith Busch wrote:
> On Mon, Oct 23, 2023 at 11:54:37AM +0200, Hannes Reinecke wrote:
>> But we start keep alive unconditionally nowadays; each and every
>> host driver (including PCI) is calling into nvme_start_ctrl(),
>> which calls nvme_start_keep_alive() without any checks.
>
> I see nvme_start_keep_alive() is always called, but that checks
> 'ctrl->kato' which will always be 0 for PCI.
>
But it also means we don't have to check for fabrics here...
>> Be it as it may, tying it into the state transition won't work
>> as the 'CONNECTING' state is entered _before_ any queues are setup,
>> so starting keep-alive here would be too early.
>
> That's just IO eueues, right? Keep-alive is an admin command, and the
> admin queue should be up in the CONNECTING state.
Nope. 'loop' creates the admin queue after setting the status to
'CONNECTING'. So there is no clear rule when 'CONNECTING' is entered.
There is a valid point to be made to enter 'CONNECTING' only when the
admin queue is set up, but currently that's not the case.
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman