* [PATCH v3 0/3] nvme-fc: fix race with connectivity loss and nvme_fc_create_association
@ 2024-11-29 9:28 Daniel Wagner
2024-11-29 9:28 ` [PATCH v3 1/3] nvme-fc: go straight to connecting state when initializing Daniel Wagner
` (2 more replies)
0 siblings, 3 replies; 18+ messages in thread
From: Daniel Wagner @ 2024-11-29 9:28 UTC (permalink / raw)
To: James Smart, Keith Busch, Christoph Hellwig, Sagi Grimberg,
Hannes Reinecke, Paul Ely
Cc: linux-nvme, linux-kernel, Daniel Wagner
After a long, hard stare at the keep alive machinery I am convinced we
need to trigger a reset when nvme_keep_alive_end_io is called with status
!= 0. There is a lengthy explanation in patch #3.
I've also tested this version with blktests and some manual tests, though
it's not easy to reproduce the exact sequence reported by Paul.
Daniel
previous cover letter:
We got a bug report that a controller was stuck in the connected state
after an association dropped.
It turns out that nvme_fc_create_association can succeed even though some
operations fail. This is on purpose, to handle the degraded controller
case where the admin queue is up and running but the I/O queues are not.
In this case the controller will still reach the LIVE state.
Unfortunately, this also ignores a full connectivity loss for fabric
controllers. Let's address this by no longer filtering out all errors in
nvme_set_queue_count.
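To illustrate the distinction the series relies on, here is a minimal,
self-contained sketch (not kernel code; the helper name is made up and the
status value is only illustrative — the real definitions live in
include/linux/nvme.h) of how a caller can tell a kernel error, a
connectivity loss, and an ordinary NVMe status apart:

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative value only; NVME_SC_HOST_PATH_ERROR is a host-side
 * fabrics status, not one returned by the controller itself. */
#define NVME_SC_HOST_PATH_ERROR 0x370

/*
 * Sketch of the decision nvme_set_queue_count has to make: a negative
 * status is a kernel error, a positive one is an NVMe status code.
 * NVMe status codes are normally tolerated so a degraded controller
 * (admin queue alive, zero I/O queues) can still reach LIVE, but a
 * host path error means even the admin queue is unreachable, so it
 * must be propagated to the caller.
 */
static bool set_queue_count_must_fail(int status)
{
	if (status < 0)				/* kernel error */
		return true;
	if (status == NVME_SC_HOST_PATH_ERROR)	/* connectivity lost */
		return true;
	return false;		/* degraded controller, carry on */
}
```

Everything else (any other non-zero NVMe status) falls through and the
function reports success with zero I/O queues, which is exactly the
behavior the cover letter describes.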
---
Changes in v3:
- collected reviewed tags
- added nvme_ctrl_reset to keep alive end io handler
- Link to v2: https://lore.kernel.org/r/20241029-nvme-fc-handle-com-lost-v2-0-5b0d137e2a0a@kernel.org
Changes in v2:
- handle connection lost in nvme_set_queue_count directly
- collected reviewed tags
- Link to v1: https://lore.kernel.org/r/20240611190647.11856-1-dwagner@suse.de
---
Daniel Wagner (3):
nvme-fc: go straight to connecting state when initializing
nvme: trigger reset when keep alive fails
nvme: handle connectivity loss in nvme_set_queue_count
drivers/nvme/host/core.c | 13 ++++++++++++-
drivers/nvme/host/fc.c | 3 +--
2 files changed, 13 insertions(+), 3 deletions(-)
---
base-commit: 029cc98dec2eadb5d0978b5fea9ae6c427f2a020
change-id: 20241029-nvme-fc-handle-com-lost-9b241936809a
Best regards,
--
Daniel Wagner <wagi@kernel.org>
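The first patch in the series hinges on which controller state transitions
are valid. A toy model of the relevant slice of the state machine (the real
table lives in nvme_change_ctrl_state in drivers/nvme/host/core.c; this is
an illustration only, with hypothetical helper names) can be sketched as:

```c
#include <assert.h>
#include <stdbool.h>

enum nvme_ctrl_state {
	NVME_CTRL_NEW,
	NVME_CTRL_LIVE,
	NVME_CTRL_RESETTING,
	NVME_CTRL_CONNECTING,
	NVME_CTRL_DELETING,
};

/* Toy transition check covering only the transitions the series talks
 * about: NEW -> CONNECTING is valid on its own, so FC does not need to
 * hop through RESETTING first (TCP and RDMA already go straight to
 * CONNECTING). Not the kernel's actual table. */
static bool ctrl_state_change_ok(enum nvme_ctrl_state old,
				 enum nvme_ctrl_state new)
{
	switch (new) {
	case NVME_CTRL_CONNECTING:
		return old == NVME_CTRL_NEW || old == NVME_CTRL_RESETTING;
	case NVME_CTRL_RESETTING:
		return old == NVME_CTRL_NEW || old == NVME_CTRL_LIVE;
	case NVME_CTRL_LIVE:
		return old == NVME_CTRL_NEW || old == NVME_CTRL_CONNECTING;
	default:
		return false;
	}
}
```

In this model both NEW -> RESETTING -> CONNECTING (the old FC path) and
NEW -> CONNECTING (the path patch 1/3 switches to) succeed; the direct
transition simply makes the intermediate RESETTING hop unnecessary.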
^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v3 1/3] nvme-fc: go straight to connecting state when initializing
  2024-11-29 9:28 [PATCH v3 0/3] nvme-fc: fix race with connectivity loss and nvme_fc_create_association Daniel Wagner
@ 2024-11-29 9:28 ` Daniel Wagner
  2024-11-29 9:28 ` [PATCH v3 2/3] nvme: trigger reset when keep alive fails Daniel Wagner
  2024-11-29 9:28 ` [PATCH v3 3/3] nvme: handle connectivity loss in nvme_set_queue_count Daniel Wagner
  2 siblings, 0 replies; 18+ messages in thread
From: Daniel Wagner @ 2024-11-29 9:28 UTC (permalink / raw)
  To: James Smart, Keith Busch, Christoph Hellwig, Sagi Grimberg,
	Hannes Reinecke, Paul Ely
  Cc: linux-nvme, linux-kernel, Daniel Wagner

The initial controller initialization mimics the reconnect loop behavior
by switching from NEW to RESETTING and then to CONNECTING. The
transition from NEW to CONNECTING is a valid transition, so there is no
point entering the RESETTING state. TCP and RDMA also transition
directly to the CONNECTING state.
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Daniel Wagner <wagi@kernel.org>
---
 drivers/nvme/host/fc.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index b81af7919e94c421387033bf8361a9cf8a867486..d45ab530ff9b7bd03bc311474278fc840f8786d5 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -3579,8 +3579,7 @@ nvme_fc_init_ctrl(struct device *dev, struct nvmf_ctrl_options *opts,
 	list_add_tail(&ctrl->ctrl_list, &rport->ctrl_list);
 	spin_unlock_irqrestore(&rport->lock, flags);
 
-	if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RESETTING) ||
-	    !nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING)) {
+	if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING)) {
 		dev_err(ctrl->ctrl.device,
 			"NVME-FC{%d}: failed to init ctrl state\n", ctrl->cnum);
 		goto fail_ctrl;
-- 
2.47.0

^ permalink raw reply related	[flat|nested] 18+ messages in thread
* [PATCH v3 2/3] nvme: trigger reset when keep alive fails
  2024-11-29 9:28 [PATCH v3 0/3] nvme-fc: fix race with connectivity loss and nvme_fc_create_association Daniel Wagner
  2024-11-29 9:28 ` [PATCH v3 1/3] nvme-fc: go straight to connecting state when initializing Daniel Wagner
@ 2024-11-29 9:28 ` Daniel Wagner
  2024-11-29 11:09 ` Hannes Reinecke
  ` (2 more replies)
  2024-11-29 9:28 ` [PATCH v3 3/3] nvme: handle connectivity loss in nvme_set_queue_count Daniel Wagner
  2 siblings, 3 replies; 18+ messages in thread
From: Daniel Wagner @ 2024-11-29 9:28 UTC (permalink / raw)
  To: James Smart, Keith Busch, Christoph Hellwig, Sagi Grimberg,
	Hannes Reinecke, Paul Ely
  Cc: linux-nvme, linux-kernel, Daniel Wagner

nvme_keep_alive_work sets up a keep alive command and uses
blk_execute_rq_nowait to send the command out in an asynchronous manner.
Eventually, nvme_keep_alive_end_io is called. If the status argument is
0, a new keep alive is sent out. When the status argument is not 0, only
an error is logged. The keep alive machinery does not trigger the error
recovery.

The FC driver relies on the keep alive machinery to trigger recovery
when an error is detected. Whenever an error happens during the creation
of the association, the idea is that the operation is aborted and
retried later. Though there is a window where an error happens and
nvme_fc_create_association can't detect it.

1) nvme nvme10: NVME-FC{10}: create association : ...
2) nvme nvme10: NVME-FC{10}: controller connectivity lost. Awaiting Reconnect
   nvme nvme10: queue_size 128 > ctrl maxcmd 32, reducing to maxcmd
3) nvme nvme10: Could not set queue count (880)
   nvme nvme10: Failed to configure AEN (cfg 900)
4) nvme nvme10: NVME-FC{10}: controller connect complete
5) nvme nvme10: failed nvme_keep_alive_end_io error=4

A connection attempt starts 1) and the ctrl is in the CONNECTING state.
Shortly after, the LLDD detects a connection lost event and calls
nvme_fc_ctrl_connectivity_loss 2). Because we are still in the
CONNECTING state, this event is ignored.

nvme_fc_create_association continues to run in parallel and tries to
communicate with the controller, and those commands fail. Though these
errors are filtered out, e.g. in 3) setting the I/O queue count fails,
which leads to an early exit in nvme_fc_create_io_queues. Because the
number of I/O queues is 0 at this point, there is nothing left in
nvme_fc_create_association which could detect the connection drop. Thus
the ctrl enters the LIVE state 4).

The keep alive timer fires and a keep alive command is sent off but gets
rejected by nvme_fc_queue_rq, and the rq status is set to
NVME_SC_HOST_PATH_ERROR. The nvme status is then mapped to the block
layer status BLK_STS_TRANSPORT/4 in nvme_end_req. Eventually,
nvme_keep_alive_end_io sees the status != 0 and just logs an error 5).

We should obviously detect the problem in 3) and abort there (will
address this later), but that still leaves a race window open in
nvme_fc_create_association, between starting the I/O queues and setting
the ctrl state to LIVE.

Thus trigger a reset from the keep alive handler when an error is
reported.

Signed-off-by: Daniel Wagner <wagi@kernel.org>
---
 drivers/nvme/host/core.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index bfd71511c85f8b1a9508c6ea062475ff51bf27fe..2a07c2c540b26c8cbe886711abaf6f0afbe6c4df 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1320,6 +1320,12 @@ static enum rq_end_io_ret nvme_keep_alive_end_io(struct request *rq,
 		dev_err(ctrl->device,
 			"failed nvme_keep_alive_end_io error=%d\n",
 			status);
+		/*
+		 * The driver reports that we lost the connection,
+		 * trigger a recovery.
+		 */
+		if (status == BLK_STS_TRANSPORT)
+			nvme_reset_ctrl(ctrl);
 		return RQ_END_IO_NONE;
 	}
 
-- 
2.47.0

^ permalink raw reply related	[flat|nested] 18+ messages in thread
* Re: [PATCH v3 2/3] nvme: trigger reset when keep alive fails 2024-11-29 9:28 ` [PATCH v3 2/3] nvme: trigger reset when keep alive fails Daniel Wagner @ 2024-11-29 11:09 ` Hannes Reinecke 2024-12-09 13:36 ` Christoph Hellwig 2024-12-24 10:31 ` Sagi Grimberg 2 siblings, 0 replies; 18+ messages in thread From: Hannes Reinecke @ 2024-11-29 11:09 UTC (permalink / raw) To: Daniel Wagner, James Smart, Keith Busch, Christoph Hellwig, Sagi Grimberg, Paul Ely Cc: linux-nvme, linux-kernel On 11/29/24 10:28, Daniel Wagner wrote: > nvme_keep_alive_work setups a keep alive command and uses > blk_execute_rq_nowait to send out the command in an asynchronously > manner. Eventually, nvme_keep_alive_end_io is called. If the status > argument is 0, a new keep alive is send out. When the status argument is > not 0, only an error is logged. The keep alive machinery does not > trigger the error recovery. > > The FC driver is relying on the keep alive machinery to trigger recovery > when an error is detected. Whenever an error happens during the creation > of the association the idea is that the operation is aborted and retried > later. Though there is a window where an error happens and > nvme_fc_create_assocation can't detect the error. > > 1) nvme nvme10: NVME-FC{10}: create association : ... > 2) nvme nvme10: NVME-FC{10}: controller connectivity lost. Awaiting Reconnect > nvme nvme10: queue_size 128 > ctrl maxcmd 32, reducing to maxcmd > 3) nvme nvme10: Could not set queue count (880) > nvme nvme10: Failed to configure AEN (cfg 900) > 4) nvme nvme10: NVME-FC{10}: controller connect complete > 5) nvme nvme10: failed nvme_keep_alive_end_io error=4 > > A connection attempt starts 1) and the ctrl is in state CONNECTING. > Shortly after the LLDD driver detects a connection lost event and calls > nvme_fc_ctrl_connectivity_loss 2). Because we are still in CONNECTING > state, this event is ignored. 
> > nvme_fc_create_association continues to run in parallel and tries to > communicate with the controller and those commands fail. Though these > errors are filtered out, e.g in 3) setting the I/O queues numbers fails > which leads to an early exit in nvme_fc_create_io_queues. Because the > number of IO queues is 0 at this point, there is nothing left in > nvme_fc_create_association which could detected the connection drop. > Thus the ctrl enters LIVE state 4). > > The keep alive timer fires and a keep alive command is send off but > gets rejected by nvme_fc_queue_rq and the rq status is set to > NVME_SC_HOST_PATH_ERROR. The nvme status is then mapped to a block layer > status BLK_STS_TRANSPORT/4 in nvme_end_req. Eventually, > nvme_keep_alive_end_io sees the status != 0 and just logs an error 5). > > We should obviously detect the problem in 3) and abort there (will > address this later), but that still leaves a race window open. There is > a race window open in nvme_fc_create_association after starting the IO > queues and setting the ctrl state to LIVE. > > Thus trigger a reset from the keep alive handler when an error is > reported. > > Signed-off-by: Daniel Wagner <wagi@kernel.org> > --- > drivers/nvme/host/core.c | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c > index bfd71511c85f8b1a9508c6ea062475ff51bf27fe..2a07c2c540b26c8cbe886711abaf6f0afbe6c4df 100644 > --- a/drivers/nvme/host/core.c > +++ b/drivers/nvme/host/core.c > @@ -1320,6 +1320,12 @@ static enum rq_end_io_ret nvme_keep_alive_end_io(struct request *rq, > dev_err(ctrl->device, > "failed nvme_keep_alive_end_io error=%d\n", > status); > + /* > + * The driver reports that we lost the connection, > + * trigger a recovery. > + */ > + if (status == BLK_STS_TRANSPORT) > + nvme_reset_ctrl(ctrl); > return RQ_END_IO_NONE; > } > > Reviewed-by: Hannes Reinecke <hare@suse.de> Cheers, Hannes -- Dr. 
Hannes Reinecke Kernel Storage Architect hare@suse.de +49 911 74053 688 SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 2/3] nvme: trigger reset when keep alive fails
  2024-11-29 9:28 ` [PATCH v3 2/3] nvme: trigger reset when keep alive fails Daniel Wagner
  2024-11-29 11:09 ` Hannes Reinecke
@ 2024-12-09 13:36 ` Christoph Hellwig
  2024-12-24 10:31 ` Sagi Grimberg
  2 siblings, 0 replies; 18+ messages in thread
From: Christoph Hellwig @ 2024-12-09 13:36 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: James Smart, Keith Busch, Christoph Hellwig, Sagi Grimberg,
	Hannes Reinecke, Paul Ely, linux-nvme, linux-kernel

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 18+ messages in thread
* Re: [PATCH v3 2/3] nvme: trigger reset when keep alive fails 2024-11-29 9:28 ` [PATCH v3 2/3] nvme: trigger reset when keep alive fails Daniel Wagner 2024-11-29 11:09 ` Hannes Reinecke 2024-12-09 13:36 ` Christoph Hellwig @ 2024-12-24 10:31 ` Sagi Grimberg 2025-01-07 14:38 ` Daniel Wagner 2 siblings, 1 reply; 18+ messages in thread From: Sagi Grimberg @ 2024-12-24 10:31 UTC (permalink / raw) To: Daniel Wagner, James Smart, Keith Busch, Christoph Hellwig, Hannes Reinecke, Paul Ely Cc: linux-nvme, linux-kernel On 29/11/2024 11:28, Daniel Wagner wrote: > nvme_keep_alive_work setups a keep alive command and uses > blk_execute_rq_nowait to send out the command in an asynchronously > manner. Eventually, nvme_keep_alive_end_io is called. If the status > argument is 0, a new keep alive is send out. When the status argument is > not 0, only an error is logged. The keep alive machinery does not > trigger the error recovery. > > The FC driver is relying on the keep alive machinery to trigger recovery > when an error is detected. Whenever an error happens during the creation > of the association the idea is that the operation is aborted and retried > later. Though there is a window where an error happens and > nvme_fc_create_assocation can't detect the error. > > 1) nvme nvme10: NVME-FC{10}: create association : ... > 2) nvme nvme10: NVME-FC{10}: controller connectivity lost. Awaiting Reconnect > nvme nvme10: queue_size 128 > ctrl maxcmd 32, reducing to maxcmd > 3) nvme nvme10: Could not set queue count (880) > nvme nvme10: Failed to configure AEN (cfg 900) > 4) nvme nvme10: NVME-FC{10}: controller connect complete > 5) nvme nvme10: failed nvme_keep_alive_end_io error=4 > > A connection attempt starts 1) and the ctrl is in state CONNECTING. > Shortly after the LLDD driver detects a connection lost event and calls > nvme_fc_ctrl_connectivity_loss 2). Because we are still in CONNECTING > state, this event is ignored. 
> > nvme_fc_create_association continues to run in parallel and tries to > communicate with the controller and those commands fail. Though these > errors are filtered out, e.g in 3) setting the I/O queues numbers fails > which leads to an early exit in nvme_fc_create_io_queues. Because the > number of IO queues is 0 at this point, there is nothing left in > nvme_fc_create_association which could detected the connection drop. > Thus the ctrl enters LIVE state 4). > > The keep alive timer fires and a keep alive command is send off but > gets rejected by nvme_fc_queue_rq and the rq status is set to > NVME_SC_HOST_PATH_ERROR. The nvme status is then mapped to a block layer > status BLK_STS_TRANSPORT/4 in nvme_end_req. Eventually, > nvme_keep_alive_end_io sees the status != 0 and just logs an error 5). > > We should obviously detect the problem in 3) and abort there (will > address this later), but that still leaves a race window open. There is > a race window open in nvme_fc_create_association after starting the IO > queues and setting the ctrl state to LIVE. > > Thus trigger a reset from the keep alive handler when an error is > reported. > > Signed-off-by: Daniel Wagner <wagi@kernel.org> > --- > drivers/nvme/host/core.c | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c > index bfd71511c85f8b1a9508c6ea062475ff51bf27fe..2a07c2c540b26c8cbe886711abaf6f0afbe6c4df 100644 > --- a/drivers/nvme/host/core.c > +++ b/drivers/nvme/host/core.c > @@ -1320,6 +1320,12 @@ static enum rq_end_io_ret nvme_keep_alive_end_io(struct request *rq, > dev_err(ctrl->device, > "failed nvme_keep_alive_end_io error=%d\n", > status); > + /* > + * The driver reports that we lost the connection, > + * trigger a recovery. > + */ > + if (status == BLK_STS_TRANSPORT) > + nvme_reset_ctrl(ctrl); > return RQ_END_IO_NONE; > } > > A lengthy explanation that results in nvme core behavior that assumes a very specific driver behavior. 
Isn't the root of the problem that FC is willing to live peacefully with a controller without any queues/connectivity to it without periodically reconnecting? ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 2/3] nvme: trigger reset when keep alive fails
  2024-12-24 10:31 ` Sagi Grimberg
@ 2025-01-07 14:38 ` Daniel Wagner
  2025-01-08 10:50 ` Sagi Grimberg
  0 siblings, 1 reply; 18+ messages in thread
From: Daniel Wagner @ 2025-01-07 14:38 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Daniel Wagner, James Smart, Keith Busch, Christoph Hellwig,
	Hannes Reinecke, Paul Ely, linux-nvme, linux-kernel

On Tue, Dec 24, 2024 at 12:31:35PM +0200, Sagi Grimberg wrote:
> > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> > index bfd71511c85f8b1a9508c6ea062475ff51bf27fe..2a07c2c540b26c8cbe886711abaf6f0afbe6c4df 100644
> > --- a/drivers/nvme/host/core.c
> > +++ b/drivers/nvme/host/core.c
> > @@ -1320,6 +1320,12 @@ static enum rq_end_io_ret nvme_keep_alive_end_io(struct request *rq,
> >  		dev_err(ctrl->device,
> >  			"failed nvme_keep_alive_end_io error=%d\n",
> >  			status);
> > +		/*
> > +		 * The driver reports that we lost the connection,
> > +		 * trigger a recovery.
> > +		 */
> > +		if (status == BLK_STS_TRANSPORT)
> > +			nvme_reset_ctrl(ctrl);
> >  		return RQ_END_IO_NONE;
> >  	}
> >
> A lengthy explanation that results in nvme core behavior that assumes a very
> specific driver behavior.

I tried to explain exactly what's going on, so we can discuss possible
solutions without communicating past each other.

In the meantime I started on a patch set for the TP4129 related changes
in the spec (KATO Corrections and Clarifications). These changes would
also depend on the KATO timeout handler triggering a reset.

I am fine with dropping this change for now and discussing it in the
light of TP4129 if this is what you prefer?

> Isn't the root of the problem that FC is willing to live
> peacefully with a controller
> without any queues/connectivity to it without periodically reconnecting?

The root problem is that the connectivity loss event gets ignored in the
CONNECTING state for the first connection attempt. All will work fine
for the RECONNECTING state.

Maybe something like this instead? (untested)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index c4cbe3ce81f7..1f1d1d62a978 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -148,6 +148,7 @@ struct nvme_fc_rport {
 #define ASSOC_ACTIVE		0
 #define ASSOC_FAILED		1
 #define FCCTRL_TERMIO		2
+#define CONNECTIVITY_LOST	3
 
 struct nvme_fc_ctrl {
 	spinlock_t		lock;
@@ -785,6 +786,8 @@ nvme_fc_ctrl_connectivity_loss(struct nvme_fc_ctrl *ctrl)
 		"NVME-FC{%d}: controller connectivity lost. Awaiting "
 		"Reconnect", ctrl->cnum);
 
+	set_bit(CONNECTIVITY_LOST, &ctrl->flags);
+
 	switch (nvme_ctrl_state(&ctrl->ctrl)) {
 	case NVME_CTRL_NEW:
 	case NVME_CTRL_LIVE:
@@ -3071,6 +3074,8 @@ nvme_fc_create_association(struct nvme_fc_ctrl *ctrl)
 	if (nvme_fc_ctlr_active_on_rport(ctrl))
 		return -ENOTUNIQ;
 
+	clear_bit(CONNECTIVITY_LOST, &ctrl->flags);
+
 	dev_info(ctrl->ctrl.device,
 		"NVME-FC{%d}: create association : host wwpn 0x%016llx "
 		" rport wwpn 0x%016llx: NQN \"%s\"\n",
@@ -3174,6 +3179,11 @@ nvme_fc_create_association(struct nvme_fc_ctrl *ctrl)
 
 	changed = nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_LIVE);
 
+	if (test_bit(CONNECTIVITY_LOST, &ctrl->flags)) {
+		ret = -EIO;
+		goto out_term_aeo_ops;
+	}
+
 	ctrl->ctrl.nr_reconnects = 0;
 
 	if (changed)

^ permalink raw reply related	[flat|nested] 18+ messages in thread
* Re: [PATCH v3 2/3] nvme: trigger reset when keep alive fails 2025-01-07 14:38 ` Daniel Wagner @ 2025-01-08 10:50 ` Sagi Grimberg 0 siblings, 0 replies; 18+ messages in thread From: Sagi Grimberg @ 2025-01-08 10:50 UTC (permalink / raw) To: Daniel Wagner Cc: Daniel Wagner, James Smart, Keith Busch, Christoph Hellwig, Hannes Reinecke, Paul Ely, linux-nvme, linux-kernel On 07/01/2025 16:38, Daniel Wagner wrote: > On Tue, Dec 24, 2024 at 12:31:35PM +0200, Sagi Grimberg wrote: >>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c >>> index bfd71511c85f8b1a9508c6ea062475ff51bf27fe..2a07c2c540b26c8cbe886711abaf6f0afbe6c4df 100644 >>> --- a/drivers/nvme/host/core.c >>> +++ b/drivers/nvme/host/core.c >>> @@ -1320,6 +1320,12 @@ static enum rq_end_io_ret nvme_keep_alive_end_io(struct request *rq, >>> dev_err(ctrl->device, >>> "failed nvme_keep_alive_end_io error=%d\n", >>> status); >>> + /* >>> + * The driver reports that we lost the connection, >>> + * trigger a recovery. >>> + */ >>> + if (status == BLK_STS_TRANSPORT) >>> + nvme_reset_ctrl(ctrl); >>> return RQ_END_IO_NONE; >>> } >>> >> A lengthy explanation that results in nvme core behavior that assumes a very >> specific driver behavior. > I tried to explain exactly what's going on, so we can discuss possible > solutions without communicating past each other. > > In the meantime I started on a patch set for the TP4129 related changes > in the spec (KATO Corrections and Clarifications). These changes would > also depend on the kato timeout handler triggering a reset. > > I am fine with dropping this change for now and discuss it in the light > of TP4129 if this is what you prefer? > >> Isn't the root of the problem that FC is willing to live >> peacefully with a controller >> without any queues/connectivity to it without periodically reconnecting? > The root problem is that the connect lost event gets ignored in the > CONNECTING state for the first connection attempt. 
All will work fine > for RECONNECTING state. > > Maybe something like this instead? (untested) > > diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c > index c4cbe3ce81f7..1f1d1d62a978 100644 > --- a/drivers/nvme/host/fc.c > +++ b/drivers/nvme/host/fc.c > @@ -148,6 +148,7 @@ struct nvme_fc_rport { > #define ASSOC_ACTIVE 0 > #define ASSOC_FAILED 1 > #define FCCTRL_TERMIO 2 > +#define CONNECTIVITY_LOST 3 > > struct nvme_fc_ctrl { > spinlock_t lock; > @@ -785,6 +786,8 @@ nvme_fc_ctrl_connectivity_loss(struct nvme_fc_ctrl *ctrl) > "NVME-FC{%d}: controller connectivity lost. Awaiting " > "Reconnect", ctrl->cnum); > > + set_bit(CONNECTIVITY_LOST, &ctrl->flags); > + > switch (nvme_ctrl_state(&ctrl->ctrl)) { > case NVME_CTRL_NEW: > case NVME_CTRL_LIVE: > @@ -3071,6 +3074,8 @@ nvme_fc_create_association(struct nvme_fc_ctrl *ctrl) > if (nvme_fc_ctlr_active_on_rport(ctrl)) > return -ENOTUNIQ; > > + clear_bit(CONNECTIVITY_LOST, &ctrl->flags); > + > dev_info(ctrl->ctrl.device, > "NVME-FC{%d}: create association : host wwpn 0x%016llx " > " rport wwpn 0x%016llx: NQN \"%s\"\n", > @@ -3174,6 +3179,11 @@ nvme_fc_create_association(struct nvme_fc_ctrl *ctrl) > > changed = nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_LIVE); > > + if (test_bit(CONNECTIVITY_LOST, &ctrl->flags)) { > + ret = -EIO; > + goto out_term_aeo_ops; > + } > + > ctrl->ctrl.nr_reconnects = 0; > > if (changed) This looks a lot better to me. ^ permalink raw reply [flat|nested] 18+ messages in thread
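The flag-based pattern being agreed on here can be sketched outside the
kernel (hypothetical names; C11 atomics standing in for the kernel's
set_bit/clear_bit/test_bit, and plain -1 standing in for -EIO):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Connectivity-loss flag, set asynchronously by the transport and
 * checked by the connect path; mirrors the CONNECTIVITY_LOST bit in
 * the proposed fc.c change. */
static atomic_bool connectivity_lost;

static void on_connectivity_loss(void)	/* stands in for the LLDD callback */
{
	atomic_store(&connectivity_lost, true);
}

static int create_association(bool lose_link_midway)
{
	/* Forget stale events from before this attempt. */
	atomic_store(&connectivity_lost, false);

	/* ... admin queue and I/O queue setup would happen here ... */
	if (lose_link_midway)
		on_connectivity_loss();	/* event arrives mid-connect */

	/* Re-check after the last setup step: if the link dropped while
	 * we were connecting, fail this attempt so it gets retried,
	 * instead of declaring the controller LIVE. */
	if (atomic_load(&connectivity_lost))
		return -1;

	return 0;
}
```

The point of the pattern is that the loss event no longer has to be acted
on at the moment it arrives; it only has to be recorded, and the connect
path observes it before committing to the LIVE state.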
* [PATCH v3 3/3] nvme: handle connectivity loss in nvme_set_queue_count
  2024-11-29 9:28 [PATCH v3 0/3] nvme-fc: fix race with connectivity loss and nvme_fc_create_association Daniel Wagner
  2024-11-29 9:28 ` [PATCH v3 1/3] nvme-fc: go straight to connecting state when initializing Daniel Wagner
  2024-11-29 9:28 ` [PATCH v3 2/3] nvme: trigger reset when keep alive fails Daniel Wagner
@ 2024-11-29 9:28 ` Daniel Wagner
  2024-11-29 11:10 ` Hannes Reinecke
  2024-12-24 10:39 ` Sagi Grimberg
  2 siblings, 2 replies; 18+ messages in thread
From: Daniel Wagner @ 2024-11-29 9:28 UTC (permalink / raw)
  To: James Smart, Keith Busch, Christoph Hellwig, Sagi Grimberg,
	Hannes Reinecke, Paul Ely
  Cc: linux-nvme, linux-kernel, Daniel Wagner

When the set features attempt fails with any NVMe status code in
nvme_set_queue_count, the function still reports success, though the
number of queues is set to 0. This is done to support controllers in a
degraded state (the admin queue is still up and running but there are no
I/O queues).

Though there is an exception: when nvme_set_features reports a host path
error, nvme_set_queue_count should propagate this error, as connectivity
is lost, which means the admin queue is not working anymore either.

Fixes: 9a0be7abb62f ("nvme: refactor set_queue_count")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Daniel Wagner <wagi@kernel.org>
---
 drivers/nvme/host/core.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 2a07c2c540b26c8cbe886711abaf6f0afbe6c4df..aa2a7c7d4d0ae7bd1f7fb704e55c0a8d724b9d56 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1677,7 +1677,12 @@ int nvme_set_queue_count(struct nvme_ctrl *ctrl, int *count)
 
 	status = nvme_set_features(ctrl, NVME_FEAT_NUM_QUEUES, q_count, NULL, 0,
 			&result);
-	if (status < 0)
+	/*
+	 * It's either a kernel error or the host observed a connection
+	 * lost. In either case it's not possible to communicate with the
+	 * controller and thus enter the error code path.
+	 */
+	if (status < 0 || status == NVME_SC_HOST_PATH_ERROR)
 		return status;
 
 	/*
-- 
2.47.0

^ permalink raw reply related	[flat|nested] 18+ messages in thread
* Re: [PATCH v3 3/3] nvme: handle connectivity loss in nvme_set_queue_count 2024-11-29 9:28 ` [PATCH v3 3/3] nvme: handle connectivity loss in nvme_set_queue_count Daniel Wagner @ 2024-11-29 11:10 ` Hannes Reinecke 2024-12-17 8:35 ` Daniel Wagner 2024-12-24 10:39 ` Sagi Grimberg 1 sibling, 1 reply; 18+ messages in thread From: Hannes Reinecke @ 2024-11-29 11:10 UTC (permalink / raw) To: Daniel Wagner, James Smart, Keith Busch, Christoph Hellwig, Sagi Grimberg, Paul Ely Cc: linux-nvme, linux-kernel On 11/29/24 10:28, Daniel Wagner wrote: > When the set feature attempts fails with any NVME status code set in > nvme_set_queue_count, the function still report success. Though the > numbers of queues set to 0. This is done to support controllers in > degraded state (the admin queue is still up and running but no IO > queues). > > Though there is an exception. When nvme_set_features reports an host > path error, nvme_set_queue_count should propagate this error as the > connectivity is lost, which means also the admin queue is not working > anymore. > > Fixes: 9a0be7abb62f ("nvme: refactor set_queue_count") > Reviewed-by: Christoph Hellwig <hch@lst.de> > Signed-off-by: Daniel Wagner <wagi@kernel.org> > --- > drivers/nvme/host/core.c | 7 ++++++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c > index 2a07c2c540b26c8cbe886711abaf6f0afbe6c4df..aa2a7c7d4d0ae7bd1f7fb704e55c0a8d724b9d56 100644 > --- a/drivers/nvme/host/core.c > +++ b/drivers/nvme/host/core.c > @@ -1677,7 +1677,12 @@ int nvme_set_queue_count(struct nvme_ctrl *ctrl, int *count) > > status = nvme_set_features(ctrl, NVME_FEAT_NUM_QUEUES, q_count, NULL, 0, > &result); > - if (status < 0) > + /* > + * It's either a kernel error or the host observed a connection > + * lost. In either case it's not possible communicate with the > + * controller and thus enter the error code path. 
> + */ > + if (status < 0 || status == NVME_SC_HOST_PATH_ERROR) > return status; > > /* > Hmm. Maybe checking for NVME_SC_DNR, too? Otherwise: Reviewed-by: Hannes Reinecke <hare@suse.de> Cheers, Hannes -- Dr. Hannes Reinecke Kernel Storage Architect hare@suse.de +49 911 74053 688 SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 3/3] nvme: handle connectivity loss in nvme_set_queue_count
  2024-11-29 11:10 ` Hannes Reinecke
@ 2024-12-17 8:35 ` Daniel Wagner
  2024-12-17 9:45 ` Hannes Reinecke
  2024-12-24 10:35 ` Sagi Grimberg
  0 siblings, 2 replies; 18+ messages in thread
From: Daniel Wagner @ 2024-12-17 8:35 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Daniel Wagner, James Smart, Keith Busch, Christoph Hellwig,
	Sagi Grimberg, Paul Ely, linux-nvme, linux-kernel

On Fri, Nov 29, 2024 at 12:10:33PM +0100, Hannes Reinecke wrote:
> > +	/*
> > +	 * It's either a kernel error or the host observed a connection
> > +	 * lost. In either case it's not possible communicate with the
> > +	 * controller and thus enter the error code path.
> > +	 */
> > +	if (status < 0 || status == NVME_SC_HOST_PATH_ERROR)
> >  		return status;
> >  	/*
> >
> Hmm. Maybe checking for NVME_SC_DNR, too?

If no one complains I'll update the check to:

	if (status < 0 || (status > 0 && (status & NVME_STATUS_DNR)) ||
	    status == NVME_SC_HOST_PATH_ERROR)
		return status;

Okay?

^ permalink raw reply	[flat|nested] 18+ messages in thread
* Re: [PATCH v3 3/3] nvme: handle connectivity loss in nvme_set_queue_count
  2024-12-17 8:35 ` Daniel Wagner
@ 2024-12-17 9:45 ` Hannes Reinecke
  2024-12-17 14:01 ` Daniel Wagner
  0 siblings, 1 reply; 18+ messages in thread
From: Hannes Reinecke @ 2024-12-17 9:45 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: Daniel Wagner, James Smart, Keith Busch, Christoph Hellwig,
	Sagi Grimberg, Paul Ely, linux-nvme, linux-kernel

On 12/17/24 09:35, Daniel Wagner wrote:
> On Fri, Nov 29, 2024 at 12:10:33PM +0100, Hannes Reinecke wrote:
>>> +	/*
>>> +	 * It's either a kernel error or the host observed a connection
>>> +	 * lost. In either case it's not possible communicate with the
>>> +	 * controller and thus enter the error code path.
>>> +	 */
>>> +	if (status < 0 || status == NVME_SC_HOST_PATH_ERROR)
>>>  		return status;
>>>  	/*
>>>
>> Hmm. Maybe checking for NVME_SC_DNR, too?
>
> if no one complains I'll update the check to:
>
>     if (status < 0 || (status > 0 && (status & NVME_STATUS_DNR)) ||
>         status == NVME_SC_HOST_PATH_ERROR)
>         return status;
>
> okay?

Which really cries out for a helper. But otherwise okay.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

^ permalink raw reply	[flat|nested] 18+ messages in thread
* Re: [PATCH v3 3/3] nvme: handle connectivity loss in nvme_set_queue_count
  2024-12-17 9:45 ` Hannes Reinecke
@ 2024-12-17 14:01 ` Daniel Wagner
  2024-12-20 8:32 ` Hannes Reinecke
  0 siblings, 1 reply; 18+ messages in thread
From: Daniel Wagner @ 2024-12-17 14:01 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Daniel Wagner, James Smart, Keith Busch, Christoph Hellwig,
	Sagi Grimberg, Paul Ely, linux-nvme, linux-kernel

On Tue, Dec 17, 2024 at 10:45:45AM +0100, Hannes Reinecke wrote:
> On 12/17/24 09:35, Daniel Wagner wrote:
> > On Fri, Nov 29, 2024 at 12:10:33PM +0100, Hannes Reinecke wrote:
> > > > +	/*
> > > > +	 * It's either a kernel error or the host observed a connection
> > > > +	 * lost. In either case it's not possible communicate with the
> > > > +	 * controller and thus enter the error code path.
> > > > +	 */
> > > > +	if (status < 0 || status == NVME_SC_HOST_PATH_ERROR)
> > > >  		return status;
> > > >  	/*
> > > >
> > > Hmm. Maybe checking for NVME_SC_DNR, too?
> >
> > if no one complains I'll update the check to:
> >
> >     if (status < 0 || (status > 0 && (status & NVME_STATUS_DNR)) ||
> >         status == NVME_SC_HOST_PATH_ERROR)
> >         return status;
> >
> > okay?
>
> Which really cries out for a helper. But otherwise okay.

As far as I remember, Christoph is not a big fan of single users of tiny
helper functions. Anyway, what about this:

+static bool nvme_num_queue_updated(int status)
+{
+	/*
+	 * It's either a kernel error or the host observed a connection
+	 * lost. In either case it's not possible to communicate with the
+	 * controller and thus enter the error code path.
+	 */
+	if (status < 0 || status & NVME_STATUS_DNR ||
+	    status == NVME_SC_HOST_PATH_ERROR)
+		return false;
+
+	return true;
+}
+
 int nvme_set_queue_count(struct nvme_ctrl *ctrl, int *count)
 {
 	u32 q_count = (*count - 1) | ((*count - 1) << 16);
@@ -1701,13 +1716,8 @@ int nvme_set_queue_count(struct nvme_ctrl *ctrl, int *count)
 
 	status = nvme_set_features(ctrl, NVME_FEAT_NUM_QUEUES, q_count, NULL, 0,
 			&result);
-	/*
-	 * It's either a kernel error or the host observed a connection
-	 * lost. In either case it's not possible to communicate with the
-	 * controller and thus enter the error code path.
-	 */
-	if (status < 0 || (status > 0 && (status & NVME_STATUS_DNR)) ||
-	    status == NVME_SC_HOST_PATH_ERROR)
+
+	if (!nvme_num_queue_updated(status))
 		return status;
 
 	/*

^ permalink raw reply	[flat|nested] 18+ messages in thread
* Re: [PATCH v3 3/3] nvme: handle connectivity loss in nvme_set_queue_count
  2024-12-17 14:01         ` Daniel Wagner
@ 2024-12-20  8:32           ` Hannes Reinecke
  0 siblings, 0 replies; 18+ messages in thread
From: Hannes Reinecke @ 2024-12-20  8:32 UTC (permalink / raw)
To: Daniel Wagner
Cc: Daniel Wagner, James Smart, Keith Busch, Christoph Hellwig,
	Sagi Grimberg, Paul Ely, linux-nvme, linux-kernel

On 12/17/24 15:01, Daniel Wagner wrote:
> On Tue, Dec 17, 2024 at 10:45:45AM +0100, Hannes Reinecke wrote:
> > On 12/17/24 09:35, Daniel Wagner wrote:
> > > On Fri, Nov 29, 2024 at 12:10:33PM +0100, Hannes Reinecke wrote:
> > > > > +	/*
> > > > > +	 * It's either a kernel error or the host observed a connection
> > > > > +	 * lost. In either case it's not possible communicate with the
> > > > > +	 * controller and thus enter the error code path.
> > > > > +	 */
> > > > > +	if (status < 0 || status == NVME_SC_HOST_PATH_ERROR)
> > > > > 		return status;
> > > > > 	/*
> > > > >
> > > > Hmm. Maybe checking for NVME_SC_DNR, too?
> > >
> > > if no one complains I'll update the check to:
> > >
> > > 	if (status < 0 || (status > 0 && (status & NVME_STATUS_DNR)) ||
> > > 	    status == NVME_SC_HOST_PATH_ERROR)
> > > 		return status;
> > >
> > > okay?
> >
> > Which really cries out for a helper. But otherwise okay.
>
> As far as I remember, Christoph is not a big fan of tiny helper
> functions with a single user. Anyway, what about this:
>
> +static bool nvme_num_queue_updated(int status)
> +{
> +	/*
> +	 * It's either a kernel error or the host observed a connection
> +	 * loss. In either case it's not possible to communicate with the
> +	 * controller and thus we enter the error code path.
> +	 */
> +	if (status < 0 || (status & NVME_STATUS_DNR) ||
> +	    status == NVME_SC_HOST_PATH_ERROR)
> +		return false;
> +
> +	return true;
> +}
> +
>  int nvme_set_queue_count(struct nvme_ctrl *ctrl, int *count)
>  {
>  	u32 q_count = (*count - 1) | ((*count - 1) << 16);
> @@ -1701,13 +1716,8 @@ int nvme_set_queue_count(struct nvme_ctrl *ctrl, int *count)
>
>  	status = nvme_set_features(ctrl, NVME_FEAT_NUM_QUEUES, q_count, NULL, 0,
>  			&result);
> -	/*
> -	 * It's either a kernel error or the host observed a connection
> -	 * lost. In either case it's not possible communicate with the
> -	 * controller and thus enter the error code path.
> -	 */
> -	if (status < 0 || (status > 0 && (status & NVME_STATUS_DNR)) ||
> -	    status == NVME_SC_HOST_PATH_ERROR)
> +
> +	if (!nvme_num_queue_updated(status))
> 		return status;
>
>  	/*

Naming is terrible. The helper does not check whether the number of
queues was updated, so why that name? Better keep it inline, then.

Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
--
Dr. Hannes Reinecke                   Kernel Storage Architect
hare@suse.de                                 +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

^ permalink raw reply	[flat|nested] 18+ messages in thread
* Re: [PATCH v3 3/3] nvme: handle connectivity loss in nvme_set_queue_count
  2024-12-17  8:35     ` Daniel Wagner
  2024-12-17  9:45       ` Hannes Reinecke
@ 2024-12-24 10:35       ` Sagi Grimberg
  2025-01-07 14:40         ` Daniel Wagner
  1 sibling, 1 reply; 18+ messages in thread
From: Sagi Grimberg @ 2024-12-24 10:35 UTC (permalink / raw)
To: Daniel Wagner, Hannes Reinecke
Cc: Daniel Wagner, James Smart, Keith Busch, Christoph Hellwig,
	Paul Ely, linux-nvme, linux-kernel

On 17/12/2024 10:35, Daniel Wagner wrote:
> On Fri, Nov 29, 2024 at 12:10:33PM +0100, Hannes Reinecke wrote:
>>> +	/*
>>> +	 * It's either a kernel error or the host observed a connection
>>> +	 * lost. In either case it's not possible communicate with the
>>> +	 * controller and thus enter the error code path.
>>> +	 */
>>> +	if (status < 0 || status == NVME_SC_HOST_PATH_ERROR)
>>> 		return status;
>>> 	/*
>>>
>> Hmm. Maybe checking for NVME_SC_DNR, too?
> if no one complains I'll update the check to:
>
> 	if (status < 0 || (status > 0 && (status & NVME_STATUS_DNR)) ||
> 	    status == NVME_SC_HOST_PATH_ERROR)
> 		return status;
>
> okay?

Why do we care about the DNR? Are you going to retry based on it?

^ permalink raw reply	[flat|nested] 18+ messages in thread
* Re: [PATCH v3 3/3] nvme: handle connectivity loss in nvme_set_queue_count
  2024-12-24 10:35         ` Sagi Grimberg
@ 2025-01-07 14:40           ` Daniel Wagner
  2025-01-08 10:51             ` Sagi Grimberg
  0 siblings, 1 reply; 18+ messages in thread
From: Daniel Wagner @ 2025-01-07 14:40 UTC (permalink / raw)
To: Sagi Grimberg
Cc: Hannes Reinecke, James Smart, Keith Busch, Christoph Hellwig,
	Paul Ely, linux-nvme, linux-kernel

On Tue, Dec 24, 2024 at 12:35:23PM +0200, Sagi Grimberg wrote:
> On 17/12/2024 10:35, Daniel Wagner wrote:
> > On Fri, Nov 29, 2024 at 12:10:33PM +0100, Hannes Reinecke wrote:
> > > > +	/*
> > > > +	 * It's either a kernel error or the host observed a connection
> > > > +	 * lost. In either case it's not possible communicate with the
> > > > +	 * controller and thus enter the error code path.
> > > > +	 */
> > > > +	if (status < 0 || status == NVME_SC_HOST_PATH_ERROR)
> > > > 		return status;
> > > > 	/*
> > > >
> > > Hmm. Maybe checking for NVME_SC_DNR, too?
> > if no one complains I'll update the check to:
> >
> > 	if (status < 0 || (status > 0 && (status & NVME_STATUS_DNR)) ||
> > 	    status == NVME_SC_HOST_PATH_ERROR)
> > 		return status;
> >
> > okay?
> Why do we care about the DNR? Are you going to retry based on it?

I don't know if we should care. Hannes brought this up.

^ permalink raw reply	[flat|nested] 18+ messages in thread
* Re: [PATCH v3 3/3] nvme: handle connectivity loss in nvme_set_queue_count
  2025-01-07 14:40           ` Daniel Wagner
@ 2025-01-08 10:51             ` Sagi Grimberg
  0 siblings, 0 replies; 18+ messages in thread
From: Sagi Grimberg @ 2025-01-08 10:51 UTC (permalink / raw)
To: Daniel Wagner
Cc: Hannes Reinecke, James Smart, Keith Busch, Christoph Hellwig,
	Paul Ely, linux-nvme, linux-kernel

On 07/01/2025 16:40, Daniel Wagner wrote:
> On Tue, Dec 24, 2024 at 12:35:23PM +0200, Sagi Grimberg wrote:
>> On 17/12/2024 10:35, Daniel Wagner wrote:
>>> On Fri, Nov 29, 2024 at 12:10:33PM +0100, Hannes Reinecke wrote:
>>>>> +	/*
>>>>> +	 * It's either a kernel error or the host observed a connection
>>>>> +	 * lost. In either case it's not possible communicate with the
>>>>> +	 * controller and thus enter the error code path.
>>>>> +	 */
>>>>> +	if (status < 0 || status == NVME_SC_HOST_PATH_ERROR)
>>>>> 		return status;
>>>>> 	/*
>>>>>
>>>> Hmm. Maybe checking for NVME_SC_DNR, too?
>>> if no one complains I'll update the check to:
>>>
>>> 	if (status < 0 || (status > 0 && (status & NVME_STATUS_DNR)) ||
>>> 	    status == NVME_SC_HOST_PATH_ERROR)
>>> 		return status;
>>>
>>> okay?
>> Why do we care about the DNR? Are you going to retry based on it?
> I don't know if we should care. Hannes brought this up.

I don't see why we should care.

^ permalink raw reply	[flat|nested] 18+ messages in thread
* Re: [PATCH v3 3/3] nvme: handle connectivity loss in nvme_set_queue_count
  2024-11-29  9:28 ` [PATCH v3 3/3] nvme: handle connectivity loss in nvme_set_queue_count Daniel Wagner
  2024-11-29 11:10   ` Hannes Reinecke
@ 2024-12-24 10:39   ` Sagi Grimberg
  1 sibling, 0 replies; 18+ messages in thread
From: Sagi Grimberg @ 2024-12-24 10:39 UTC (permalink / raw)
To: Daniel Wagner, James Smart, Keith Busch, Christoph Hellwig,
	Hannes Reinecke, Paul Ely
Cc: linux-nvme, linux-kernel

Makes sense to me,

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>

^ permalink raw reply	[flat|nested] 18+ messages in thread
end of thread, other threads:[~2025-01-08 10:59 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed
-- links below jump to the message on this page --
2024-11-29  9:28 [PATCH v3 0/3] nvme-fc: fix race with connectivity loss and nvme_fc_create_association Daniel Wagner
2024-11-29  9:28 ` [PATCH v3 1/3] nvme-fc: go straight to connecting state when initializing Daniel Wagner
2024-11-29  9:28 ` [PATCH v3 2/3] nvme: trigger reset when keep alive fails Daniel Wagner
2024-11-29 11:09   ` Hannes Reinecke
2024-12-09 13:36   ` Christoph Hellwig
2024-12-24 10:31   ` Sagi Grimberg
2025-01-07 14:38     ` Daniel Wagner
2025-01-08 10:50       ` Sagi Grimberg
2024-11-29  9:28 ` [PATCH v3 3/3] nvme: handle connectivity loss in nvme_set_queue_count Daniel Wagner
2024-11-29 11:10   ` Hannes Reinecke
2024-12-17  8:35     ` Daniel Wagner
2024-12-17  9:45       ` Hannes Reinecke
2024-12-17 14:01         ` Daniel Wagner
2024-12-20  8:32           ` Hannes Reinecke
2024-12-24 10:35         ` Sagi Grimberg
2025-01-07 14:40           ` Daniel Wagner
2025-01-08 10:51             ` Sagi Grimberg
2024-12-24 10:39   ` Sagi Grimberg