* [PATCH 0/2] nvme-fc: fix schedule in atomic context
@ 2025-02-14 8:02 Daniel Wagner
2025-02-14 8:02 ` [PATCH 1/2] nvme: only allow entering LIVE from CONNECTING state Daniel Wagner
` (4 more replies)
0 siblings, 5 replies; 27+ messages in thread
From: Daniel Wagner @ 2025-02-14 8:02 UTC (permalink / raw)
To: Keith Busch, Jens Axboe, Christoph Hellwig, Sagi Grimberg,
James Smart, Hannes Reinecke, Shinichiro Kawasaki
Cc: linux-nvme, linux-kernel, Daniel Wagner
Shinichiro reported [1] the recent change in the nvme-fc setup path [2]
introduced a bug. I didn't spot the schedule call in
nvme_change_ctrl_state.
It turns out the locking is not necessary if we make the state machine a
bit more restrictive and only allow entering the LIVE state from
CONNECTING. If we do this, it's possible to ensure we either enter LIVE
only if there was no connection loss event. Also the connection loss
event handler should always trigger the reset handler to avoid a
read-write race on the state machine state variable.
I've tried to replicate the original problem once again and wrote a new
blktest which tries to trigger the race condition. I let it run a for a
while and nothing broke, but I can't be sure it is really gone. The rest
of the blktests also passed. Unfortunatly, the test box with FC hardware
is currently not working, so I can't test this with real hardware.
[1] https://lore.kernel.org/all/denqwui6sl5erqmz2gvrwueyxakl5txzbbiu3fgebryzrfxunm@iwxuthct377m/
[2] https://lore.kernel.org/all/20250109-nvme-fc-handle-com-lost-v4-3-fe5cae17b492@kernel.org/
Signed-off-by: Daniel Wagner <wagi@kernel.org>
---
Daniel Wagner (2):
nvme: only allow entering LIVE from CONNECTING state
nvme-fc: rely on state transitions to handle connectivity loss
drivers/nvme/host/core.c | 2 --
drivers/nvme/host/fc.c | 67 +++++-------------------------------------------
2 files changed, 6 insertions(+), 63 deletions(-)
---
base-commit: a64dcfb451e254085a7daee5fe51bf22959d52d3
change-id: 20250213-nvme-fc-fixes-eda1a10def35
Best regards,
--
Daniel Wagner <wagi@kernel.org>
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH 1/2] nvme: only allow entering LIVE from CONNECTING state
2025-02-14 8:02 [PATCH 0/2] nvme-fc: fix schedule in atomic context Daniel Wagner
@ 2025-02-14 8:02 ` Daniel Wagner
2025-02-20 10:34 ` Sagi Grimberg
` (2 more replies)
2025-02-14 8:02 ` [PATCH 2/2] nvme-fc: rely on state transitions to handle connectivity loss Daniel Wagner
` (3 subsequent siblings)
4 siblings, 3 replies; 27+ messages in thread
From: Daniel Wagner @ 2025-02-14 8:02 UTC (permalink / raw)
To: Keith Busch, Jens Axboe, Christoph Hellwig, Sagi Grimberg,
James Smart, Hannes Reinecke, Shinichiro Kawasaki
Cc: linux-nvme, linux-kernel, Daniel Wagner
The fabric transports and also the PCI transport are not entering the
LIVE state from NEW or RESETTING. This makes the state machine more
restrictive and allows to catch not supported state transitions, e.g.
directly switching from RESETTING to LIVE.
Signed-off-by: Daniel Wagner <wagi@kernel.org>
---
drivers/nvme/host/core.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 818d4e49aab51c388af9a48bf9d466fea9cef51b..f028913e2e622ee348e88879c6e6b7e8f8a1cc82 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -564,8 +564,6 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
switch (new_state) {
case NVME_CTRL_LIVE:
switch (old_state) {
- case NVME_CTRL_NEW:
- case NVME_CTRL_RESETTING:
case NVME_CTRL_CONNECTING:
changed = true;
fallthrough;
--
2.48.1
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 2/2] nvme-fc: rely on state transitions to handle connectivity loss
2025-02-14 8:02 [PATCH 0/2] nvme-fc: fix schedule in atomic context Daniel Wagner
2025-02-14 8:02 ` [PATCH 1/2] nvme: only allow entering LIVE from CONNECTING state Daniel Wagner
@ 2025-02-14 8:02 ` Daniel Wagner
2025-02-20 10:36 ` Sagi Grimberg
2025-02-20 8:00 ` [PATCH 0/2] nvme-fc: fix schedule in atomic context Daniel Wagner
` (2 subsequent siblings)
4 siblings, 1 reply; 27+ messages in thread
From: Daniel Wagner @ 2025-02-14 8:02 UTC (permalink / raw)
To: Keith Busch, Jens Axboe, Christoph Hellwig, Sagi Grimberg,
James Smart, Hannes Reinecke, Shinichiro Kawasaki
Cc: linux-nvme, linux-kernel, Daniel Wagner
It's not possible to call nvme_state_ctrl_state with holding a spin
lock, because nvme_state_ctrl_state calls cancel_delayed_work_sync
when fastfail is enabled.
Instead syncing the ASSOC_FLAG and state transitions using a lock, it's
possible to only rely on the state machine transitions. That means
nvme_fc_ctrl_connectivity_loss should unconditionally call
nvme_reset_ctrl which avoids the read race on the ctrl state variable.
Actually, it's not necessary to test in which state the ctrl is, the
reset work will only scheduled when the state machine is in LIVE state.
In nvme_fc_create_association, the LIVE state can only be entered if it
was previously CONNECTING. If this is not possible then the reset
handler got triggered. Thus just error out here.
Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Closes: https://lore.kernel.org/all/denqwui6sl5erqmz2gvrwueyxakl5txzbbiu3fgebryzrfxunm@iwxuthct377m/
Fixes: ee59e3820ca9 ("nvme-fc: do not ignore connectivity loss during connecting")
Signed-off-by: Daniel Wagner <wagi@kernel.org>
---
drivers/nvme/host/fc.c | 67 +++++---------------------------------------------
1 file changed, 6 insertions(+), 61 deletions(-)
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index f4f1866fbd5b8b05730a785c7d256108c9344e62..b9929a5a7f4e3f3a03953379aceb90f0c1a0b561 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -781,61 +781,12 @@ nvme_fc_abort_lsops(struct nvme_fc_rport *rport)
static void
nvme_fc_ctrl_connectivity_loss(struct nvme_fc_ctrl *ctrl)
{
- enum nvme_ctrl_state state;
- unsigned long flags;
-
dev_info(ctrl->ctrl.device,
"NVME-FC{%d}: controller connectivity lost. Awaiting "
"Reconnect", ctrl->cnum);
- spin_lock_irqsave(&ctrl->lock, flags);
set_bit(ASSOC_FAILED, &ctrl->flags);
- state = nvme_ctrl_state(&ctrl->ctrl);
- spin_unlock_irqrestore(&ctrl->lock, flags);
-
- switch (state) {
- case NVME_CTRL_NEW:
- case NVME_CTRL_LIVE:
- /*
- * Schedule a controller reset. The reset will terminate the
- * association and schedule the reconnect timer. Reconnects
- * will be attempted until either the ctlr_loss_tmo
- * (max_retries * connect_delay) expires or the remoteport's
- * dev_loss_tmo expires.
- */
- if (nvme_reset_ctrl(&ctrl->ctrl)) {
- dev_warn(ctrl->ctrl.device,
- "NVME-FC{%d}: Couldn't schedule reset.\n",
- ctrl->cnum);
- nvme_delete_ctrl(&ctrl->ctrl);
- }
- break;
-
- case NVME_CTRL_CONNECTING:
- /*
- * The association has already been terminated and the
- * controller is attempting reconnects. No need to do anything
- * futher. Reconnects will be attempted until either the
- * ctlr_loss_tmo (max_retries * connect_delay) expires or the
- * remoteport's dev_loss_tmo expires.
- */
- break;
-
- case NVME_CTRL_RESETTING:
- /*
- * Controller is already in the process of terminating the
- * association. No need to do anything further. The reconnect
- * step will kick in naturally after the association is
- * terminated.
- */
- break;
-
- case NVME_CTRL_DELETING:
- case NVME_CTRL_DELETING_NOIO:
- default:
- /* no action to take - let it delete */
- break;
- }
+ nvme_reset_ctrl(&ctrl->ctrl);
}
/**
@@ -3071,7 +3022,6 @@ nvme_fc_create_association(struct nvme_fc_ctrl *ctrl)
struct nvmefc_ls_rcv_op *disls = NULL;
unsigned long flags;
int ret;
- bool changed;
++ctrl->ctrl.nr_reconnects;
@@ -3177,23 +3127,18 @@ nvme_fc_create_association(struct nvme_fc_ctrl *ctrl)
else
ret = nvme_fc_recreate_io_queues(ctrl);
}
+ if (!ret && test_bit(ASSOC_FAILED, &ctrl->flags))
+ ret = -EIO;
if (ret)
goto out_term_aen_ops;
- spin_lock_irqsave(&ctrl->lock, flags);
- if (!test_bit(ASSOC_FAILED, &ctrl->flags))
- changed = nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_LIVE);
- else
+ if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_LIVE)) {
ret = -EIO;
- spin_unlock_irqrestore(&ctrl->lock, flags);
-
- if (ret)
goto out_term_aen_ops;
+ }
ctrl->ctrl.nr_reconnects = 0;
-
- if (changed)
- nvme_start_ctrl(&ctrl->ctrl);
+ nvme_start_ctrl(&ctrl->ctrl);
return 0; /* Success */
--
2.48.1
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH 0/2] nvme-fc: fix schedule in atomic context
2025-02-14 8:02 [PATCH 0/2] nvme-fc: fix schedule in atomic context Daniel Wagner
2025-02-14 8:02 ` [PATCH 1/2] nvme: only allow entering LIVE from CONNECTING state Daniel Wagner
2025-02-14 8:02 ` [PATCH 2/2] nvme-fc: rely on state transitions to handle connectivity loss Daniel Wagner
@ 2025-02-20 8:00 ` Daniel Wagner
2025-02-20 12:50 ` Shinichiro Kawasaki
2025-02-20 17:23 ` Keith Busch
4 siblings, 0 replies; 27+ messages in thread
From: Daniel Wagner @ 2025-02-20 8:00 UTC (permalink / raw)
To: Keith Busch
Cc: Jens Axboe, Christoph Hellwig, Sagi Grimberg, James Smart,
Hannes Reinecke, Shinichiro Kawasaki, linux-nvme, linux-kernel
On Fri, Feb 14, 2025 at 09:02:02AM +0100, Daniel Wagner wrote:
> Shinichiro reported [1] the recent change in the nvme-fc setup path [2]
> introduced a bug. I didn't spot the schedule call in
> nvme_change_ctrl_state.
>
> It turns out the locking is not necessary if we make the state machine a
> bit more restrictive and only allow entering the LIVE state from
> CONNECTING. If we do this, it's possible to ensure we either enter LIVE
> only if there was no connection loss event. Also the connection loss
> event handler should always trigger the reset handler to avoid a
> read-write race on the state machine state variable.
>
> I've tried to replicate the original problem once again and wrote a new
> blktest which tries to trigger the race condition. I let it run a for a
> while and nothing broke, but I can't be sure it is really gone. The rest
> of the blktests also passed. Unfortunatly, the test box with FC hardware
> is currently not working, so I can't test this with real hardware.
>
> [1] https://lore.kernel.org/all/denqwui6sl5erqmz2gvrwueyxakl5txzbbiu3fgebryzrfxunm@iwxuthct377m/
> [2] https://lore.kernel.org/all/20250109-nvme-fc-handle-com-lost-v4-3-fe5cae17b492@kernel.org/
ping
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] nvme: only allow entering LIVE from CONNECTING state
2025-02-14 8:02 ` [PATCH 1/2] nvme: only allow entering LIVE from CONNECTING state Daniel Wagner
@ 2025-02-20 10:34 ` Sagi Grimberg
2025-04-27 15:59 ` Guenter Roeck
2026-01-09 19:18 ` John Meneghini
2 siblings, 0 replies; 27+ messages in thread
From: Sagi Grimberg @ 2025-02-20 10:34 UTC (permalink / raw)
To: Daniel Wagner, Keith Busch, Jens Axboe, Christoph Hellwig,
James Smart, Hannes Reinecke, Shinichiro Kawasaki
Cc: linux-nvme, linux-kernel
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 2/2] nvme-fc: rely on state transitions to handle connectivity loss
2025-02-14 8:02 ` [PATCH 2/2] nvme-fc: rely on state transitions to handle connectivity loss Daniel Wagner
@ 2025-02-20 10:36 ` Sagi Grimberg
0 siblings, 0 replies; 27+ messages in thread
From: Sagi Grimberg @ 2025-02-20 10:36 UTC (permalink / raw)
To: Daniel Wagner, Keith Busch, Jens Axboe, Christoph Hellwig,
James Smart, Hannes Reinecke, Shinichiro Kawasaki
Cc: linux-nvme, linux-kernel
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 0/2] nvme-fc: fix schedule in atomic context
2025-02-14 8:02 [PATCH 0/2] nvme-fc: fix schedule in atomic context Daniel Wagner
` (2 preceding siblings ...)
2025-02-20 8:00 ` [PATCH 0/2] nvme-fc: fix schedule in atomic context Daniel Wagner
@ 2025-02-20 12:50 ` Shinichiro Kawasaki
2025-02-20 17:23 ` Keith Busch
4 siblings, 0 replies; 27+ messages in thread
From: Shinichiro Kawasaki @ 2025-02-20 12:50 UTC (permalink / raw)
To: Daniel Wagner
Cc: Keith Busch, Jens Axboe, hch, Sagi Grimberg, James Smart,
Hannes Reinecke, linux-nvme@lists.infradead.org,
linux-kernel@vger.kernel.org
On Feb 14, 2025 / 09:02, Daniel Wagner wrote:
> Shinichiro reported [1] the recent change in the nvme-fc setup path [2]
> introduced a bug. I didn't spot the schedule call in
> nvme_change_ctrl_state.
>
> It turns out the locking is not necessary if we make the state machine a
> bit more restrictive and only allow entering the LIVE state from
> CONNECTING. If we do this, it's possible to ensure we either enter LIVE
> only if there was no connection loss event. Also the connection loss
> event handler should always trigger the reset handler to avoid a
> read-write race on the state machine state variable.
>
> I've tried to replicate the original problem once again and wrote a new
> blktest which tries to trigger the race condition. I let it run a for a
> while and nothing broke, but I can't be sure it is really gone. The rest
> of the blktests also passed. Unfortunatly, the test box with FC hardware
> is currently not working, so I can't test this with real hardware.
>
> [1] https://lore.kernel.org/all/denqwui6sl5erqmz2gvrwueyxakl5txzbbiu3fgebryzrfxunm@iwxuthct377m/
> [2] https://lore.kernel.org/all/20250109-nvme-fc-handle-com-lost-v4-3-fe5cae17b492@kernel.org/
>
> Signed-off-by: Daniel Wagner <wagi@kernel.org>
Thanks. I reconfirmed that this series avoids the failure I reported [1]. Also I
ran all nvme test cases with various transports and observed no regression.
Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 0/2] nvme-fc: fix schedule in atomic context
2025-02-14 8:02 [PATCH 0/2] nvme-fc: fix schedule in atomic context Daniel Wagner
` (3 preceding siblings ...)
2025-02-20 12:50 ` Shinichiro Kawasaki
@ 2025-02-20 17:23 ` Keith Busch
4 siblings, 0 replies; 27+ messages in thread
From: Keith Busch @ 2025-02-20 17:23 UTC (permalink / raw)
To: Daniel Wagner
Cc: Jens Axboe, Christoph Hellwig, Sagi Grimberg, James Smart,
Hannes Reinecke, Shinichiro Kawasaki, linux-nvme, linux-kernel
On Fri, Feb 14, 2025 at 09:02:02AM +0100, Daniel Wagner wrote:
> Shinichiro reported [1] the recent change in the nvme-fc setup path [2]
> introduced a bug. I didn't spot the schedule call in
> nvme_change_ctrl_state.
Thanks, applied to nvme-6.14.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] nvme: only allow entering LIVE from CONNECTING state
2025-02-14 8:02 ` [PATCH 1/2] nvme: only allow entering LIVE from CONNECTING state Daniel Wagner
2025-02-20 10:34 ` Sagi Grimberg
@ 2025-04-27 15:59 ` Guenter Roeck
2025-04-28 12:44 ` Daniel Wagner
2026-01-09 19:18 ` John Meneghini
2 siblings, 1 reply; 27+ messages in thread
From: Guenter Roeck @ 2025-04-27 15:59 UTC (permalink / raw)
To: Daniel Wagner
Cc: Keith Busch, Jens Axboe, Christoph Hellwig, Sagi Grimberg,
James Smart, Hannes Reinecke, Shinichiro Kawasaki, linux-nvme,
linux-kernel
Hi,
On Fri, Feb 14, 2025 at 09:02:03AM +0100, Daniel Wagner wrote:
> The fabric transports and also the PCI transport are not entering the
> LIVE state from NEW or RESETTING. This makes the state machine more
> restrictive and allows to catch not supported state transitions, e.g.
> directly switching from RESETTING to LIVE.
>
> Signed-off-by: Daniel Wagner <wagi@kernel.org>
nvme_handle_aen_notice(), when handling NVME_AER_NOTICE_FW_ACT_STARTING,
sets the state to RESETTING and and triggers a worker. This worker
waits for firmware activation to complete and then tries to set the
state back to LIVE. This step now fails.
Possibly the handling of NVME_AER_NOTICE_FW_ACT_STARTING needs to be
improved. However, leaving the NVME in RESETTING state after an
NVME_AER_NOTICE_FW_ACT_STARTING event is worse.
I think this patch should be reverted at least for the time being until
the handling of NVME_AER_NOTICE_FW_ACT_STARTING no longer relies on a
direct state change from RESETTING to LIVE.
Please correct me if my analysis is wrong.
Thanks,
Guenter
> ---
> drivers/nvme/host/core.c | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 818d4e49aab51c388af9a48bf9d466fea9cef51b..f028913e2e622ee348e88879c6e6b7e8f8a1cc82 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -564,8 +564,6 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
> switch (new_state) {
> case NVME_CTRL_LIVE:
> switch (old_state) {
> - case NVME_CTRL_NEW:
> - case NVME_CTRL_RESETTING:
> case NVME_CTRL_CONNECTING:
> changed = true;
> fallthrough;
>
> --
> 2.48.1
>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] nvme: only allow entering LIVE from CONNECTING state
2025-04-27 15:59 ` Guenter Roeck
@ 2025-04-28 12:44 ` Daniel Wagner
2025-04-28 13:21 ` Hannes Reinecke
0 siblings, 1 reply; 27+ messages in thread
From: Daniel Wagner @ 2025-04-28 12:44 UTC (permalink / raw)
To: Guenter Roeck
Cc: Daniel Wagner, Keith Busch, Jens Axboe, Christoph Hellwig,
Sagi Grimberg, James Smart, Hannes Reinecke, Shinichiro Kawasaki,
linux-nvme, linux-kernel
On Sun, Apr 27, 2025 at 08:59:13AM -0700, Guenter Roeck wrote:
> Hi,
>
> On Fri, Feb 14, 2025 at 09:02:03AM +0100, Daniel Wagner wrote:
> > The fabric transports and also the PCI transport are not entering the
> > LIVE state from NEW or RESETTING. This makes the state machine more
> > restrictive and allows to catch not supported state transitions, e.g.
> > directly switching from RESETTING to LIVE.
> >
> > Signed-off-by: Daniel Wagner <wagi@kernel.org>
>
> nvme_handle_aen_notice(), when handling NVME_AER_NOTICE_FW_ACT_STARTING,
> sets the state to RESETTING and and triggers a worker. This worker
> waits for firmware activation to complete and then tries to set the
> state back to LIVE. This step now fails.
>
> Possibly the handling of NVME_AER_NOTICE_FW_ACT_STARTING needs to be
> improved. However, leaving the NVME in RESETTING state after an
> NVME_AER_NOTICE_FW_ACT_STARTING event is worse.
>
> I think this patch should be reverted at least for the time being until
> the handling of NVME_AER_NOTICE_FW_ACT_STARTING no longer relies on a
> direct state change from RESETTING to LIVE.
ee59e3820ca9 ("nvme-fc: do not ignore connectivity loss during connecting")
f13409bb3f91 ("nvme-fc: rely on state transitions to handle connectivity loss")
are depending on the fact that is not possible to switch from
NEW/RESETTING directly into LIVE.
I think it would be better to fix the worker instead dropping this patch
and the above fix for the fc transport.
What about:
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index b502ac07483b..d3c4eacf607f 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4493,7 +4493,8 @@ static void nvme_fw_act_work(struct work_struct *work)
msleep(100);
}
- if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE))
+ if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_CONNECTING) ||
+ !nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE))
return;
nvme_unquiesce_io_queues(ctrl);
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] nvme: only allow entering LIVE from CONNECTING state
2025-04-28 12:44 ` Daniel Wagner
@ 2025-04-28 13:21 ` Hannes Reinecke
2025-04-29 13:55 ` Daniel Wagner
2025-04-29 18:13 ` Keith Busch
0 siblings, 2 replies; 27+ messages in thread
From: Hannes Reinecke @ 2025-04-28 13:21 UTC (permalink / raw)
To: Daniel Wagner, Guenter Roeck
Cc: Daniel Wagner, Keith Busch, Jens Axboe, Christoph Hellwig,
Sagi Grimberg, James Smart, Shinichiro Kawasaki, linux-nvme,
linux-kernel
On 4/28/25 14:44, Daniel Wagner wrote:
> On Sun, Apr 27, 2025 at 08:59:13AM -0700, Guenter Roeck wrote:
>> Hi,
>>
>> On Fri, Feb 14, 2025 at 09:02:03AM +0100, Daniel Wagner wrote:
>>> The fabric transports and also the PCI transport are not entering the
>>> LIVE state from NEW or RESETTING. This makes the state machine more
>>> restrictive and allows to catch not supported state transitions, e.g.
>>> directly switching from RESETTING to LIVE.
>>>
>>> Signed-off-by: Daniel Wagner <wagi@kernel.org>
>>
>> nvme_handle_aen_notice(), when handling NVME_AER_NOTICE_FW_ACT_STARTING,
>> sets the state to RESETTING and and triggers a worker. This worker
>> waits for firmware activation to complete and then tries to set the
>> state back to LIVE. This step now fails.
>>
>> Possibly the handling of NVME_AER_NOTICE_FW_ACT_STARTING needs to be
>> improved. However, leaving the NVME in RESETTING state after an
>> NVME_AER_NOTICE_FW_ACT_STARTING event is worse.
>>
>> I think this patch should be reverted at least for the time being until
>> the handling of NVME_AER_NOTICE_FW_ACT_STARTING no longer relies on a
>> direct state change from RESETTING to LIVE.
>
> ee59e3820ca9 ("nvme-fc: do not ignore connectivity loss during connecting")
> f13409bb3f91 ("nvme-fc: rely on state transitions to handle connectivity loss")
>
> are depending on the fact that is not possible to switch from
> NEW/RESETTING directly into LIVE.
>
> I think it would be better to fix the worker instead dropping this patch
> and the above fix for the fc transport.
>
> What about:
>
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index b502ac07483b..d3c4eacf607f 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -4493,7 +4493,8 @@ static void nvme_fw_act_work(struct work_struct *work)
> msleep(100);
> }
>
> - if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE))
> + if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_CONNECTING) ||
> + !nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE))
> return;
>
> nvme_unquiesce_io_queues(ctrl);
I would rather have a separate state for firmware activation.
(Ab-)using the 'RESETTING' state here has direct implications
with the error handler, as for the error handler 'RESETTING'
means that the error handler has been scheduled.
Which is not true for firmware activation.
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] nvme: only allow entering LIVE from CONNECTING state
2025-04-28 13:21 ` Hannes Reinecke
@ 2025-04-29 13:55 ` Daniel Wagner
2025-04-29 17:54 ` Hannes Reinecke
2025-04-29 18:13 ` Keith Busch
1 sibling, 1 reply; 27+ messages in thread
From: Daniel Wagner @ 2025-04-29 13:55 UTC (permalink / raw)
To: Hannes Reinecke
Cc: Guenter Roeck, Daniel Wagner, Keith Busch, Jens Axboe,
Christoph Hellwig, Sagi Grimberg, James Smart,
Shinichiro Kawasaki, linux-nvme, linux-kernel
On Mon, Apr 28, 2025 at 03:21:18PM +0200, Hannes Reinecke wrote:
> On 4/28/25 14:44, Daniel Wagner wrote:
> > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> > index b502ac07483b..d3c4eacf607f 100644
> > --- a/drivers/nvme/host/core.c
> > +++ b/drivers/nvme/host/core.c
> > @@ -4493,7 +4493,8 @@ static void nvme_fw_act_work(struct work_struct *work)
> > msleep(100);
> > }
> >
> > - if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE))
> > + if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_CONNECTING) ||
> > + !nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE))
> > return;
> >
> > nvme_unquiesce_io_queues(ctrl);
>
> I would rather have a separate state for firmware activation.
> (Ab-)using the 'RESETTING' state here has direct implications
> with the error handler, as for the error handler 'RESETTING'
> means that the error handler has been scheduled.
> Which is not true for firmware activation.
Okay, so something like this here (untested, working on it)?
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index b502ac07483b..32482712d0f2 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -565,6 +565,7 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
case NVME_CTRL_LIVE:
switch (old_state) {
case NVME_CTRL_CONNECTING:
+ case NVME_CTRL_FW_ACTIVATION:
changed = true;
fallthrough;
default:
@@ -575,6 +576,7 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
switch (old_state) {
case NVME_CTRL_NEW:
case NVME_CTRL_LIVE:
+ case NVME_CTRL_FW_ACTIVATION:
changed = true;
fallthrough;
default:
@@ -596,6 +598,7 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
case NVME_CTRL_LIVE:
case NVME_CTRL_RESETTING:
case NVME_CTRL_CONNECTING:
+ case NVME_CTRL_FW_ACTIVATION:
changed = true;
fallthrough;
default:
@@ -621,6 +624,15 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
break;
}
break;
+ case NVME_CTRL_FW_ACTIVATION:
+ switch (old_state) {
+ case NVME_CTRL_LIVE:
+ changed = true;
+ fallthrough;
+ default:
+ break;
+ }
+ break;
default:
break;
}
@@ -4529,7 +4541,7 @@ static bool nvme_handle_aen_notice(struct nvme_ctrl *ctrl, u32 result)
* recovery actions from interfering with the controller's
* firmware activation.
*/
- if (nvme_change_ctrl_state(ctrl, NVME_CTRL_RESETTING)) {
+ if (nvme_change_ctrl_state(ctrl, NVME_CTRL_FW_ACTIVATION)) {
requeue = false;
queue_work(nvme_wq, &ctrl->fw_act_work);
}
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 51e078642127..3a383225afed 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -247,6 +247,7 @@ static inline u16 nvme_req_qid(struct request *req)
* shutdown or removal. In this case we forcibly
* kill all inflight I/O as they have no chance to
* complete
+ * @NVME_CTRL_FW_ACTIVATION: Controller is in firmware activation state.
*/
enum nvme_ctrl_state {
NVME_CTRL_NEW,
@@ -256,6 +257,7 @@ enum nvme_ctrl_state {
NVME_CTRL_DELETING,
NVME_CTRL_DELETING_NOIO,
NVME_CTRL_DEAD,
+ NVME_CTRL_FW_ACTIVATION,
};
struct nvme_fault_inject {
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] nvme: only allow entering LIVE from CONNECTING state
2025-04-29 13:55 ` Daniel Wagner
@ 2025-04-29 17:54 ` Hannes Reinecke
0 siblings, 0 replies; 27+ messages in thread
From: Hannes Reinecke @ 2025-04-29 17:54 UTC (permalink / raw)
To: Daniel Wagner
Cc: Guenter Roeck, Daniel Wagner, Keith Busch, Jens Axboe,
Christoph Hellwig, Sagi Grimberg, James Smart,
Shinichiro Kawasaki, linux-nvme, linux-kernel
On 4/29/25 15:55, Daniel Wagner wrote:
> On Mon, Apr 28, 2025 at 03:21:18PM +0200, Hannes Reinecke wrote:
>> On 4/28/25 14:44, Daniel Wagner wrote:
>>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>>> index b502ac07483b..d3c4eacf607f 100644
>>> --- a/drivers/nvme/host/core.c
>>> +++ b/drivers/nvme/host/core.c
>>> @@ -4493,7 +4493,8 @@ static void nvme_fw_act_work(struct work_struct *work)
>>> msleep(100);
>>> }
>>>
>>> - if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE))
>>> + if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_CONNECTING) ||
>>> + !nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE))
>>> return;
>>>
>>> nvme_unquiesce_io_queues(ctrl);
>>
>> I would rather have a separate state for firmware activation.
>> (Ab-)using the 'RESETTING' state here has direct implications
>> with the error handler, as for the error handler 'RESETTING'
>> means that the error handler has been scheduled.
>> Which is not true for firmware activation.
>
> Okay, so something like this here (untested, working on it)?
>
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index b502ac07483b..32482712d0f2 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -565,6 +565,7 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
> case NVME_CTRL_LIVE:
> switch (old_state) {
> case NVME_CTRL_CONNECTING:
> + case NVME_CTRL_FW_ACTIVATION:
> changed = true;
> fallthrough;
> default:
> @@ -575,6 +576,7 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
> switch (old_state) {
> case NVME_CTRL_NEW:
> case NVME_CTRL_LIVE:
> + case NVME_CTRL_FW_ACTIVATION:
> changed = true;
> fallthrough;
> default:
> @@ -596,6 +598,7 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
> case NVME_CTRL_LIVE:
> case NVME_CTRL_RESETTING:
> case NVME_CTRL_CONNECTING:
> + case NVME_CTRL_FW_ACTIVATION:
> changed = true;
> fallthrough;
> default:
> @@ -621,6 +624,15 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
> break;
> }
> break;
> + case NVME_CTRL_FW_ACTIVATION:
> + switch (old_state) {
> + case NVME_CTRL_LIVE:
> + changed = true;
> + fallthrough;
> + default:
> + break;
> + }
> + break;
> default:
> break;
> }
> @@ -4529,7 +4541,7 @@ static bool nvme_handle_aen_notice(struct nvme_ctrl *ctrl, u32 result)
> * recovery actions from interfering with the controller's
> * firmware activation.
> */
> - if (nvme_change_ctrl_state(ctrl, NVME_CTRL_RESETTING)) {
> + if (nvme_change_ctrl_state(ctrl, NVME_CTRL_FW_ACTIVATION)) {
> requeue = false;
> queue_work(nvme_wq, &ctrl->fw_act_work);
> }
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index 51e078642127..3a383225afed 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -247,6 +247,7 @@ static inline u16 nvme_req_qid(struct request *req)
> * shutdown or removal. In this case we forcibly
> * kill all inflight I/O as they have no chance to
> * complete
> + * @NVME_CTRL_FW_ACTIVATION: Controller is in firmware activation state.
> */
> enum nvme_ctrl_state {
> NVME_CTRL_NEW,
> @@ -256,6 +257,7 @@ enum nvme_ctrl_state {
> NVME_CTRL_DELETING,
> NVME_CTRL_DELETING_NOIO,
> NVME_CTRL_DEAD,
> + NVME_CTRL_FW_ACTIVATION,
> };
>
> struct nvme_fault_inject {
>
Indeed, something like it.
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] nvme: only allow entering LIVE from CONNECTING state
2025-04-28 13:21 ` Hannes Reinecke
2025-04-29 13:55 ` Daniel Wagner
@ 2025-04-29 18:13 ` Keith Busch
2025-04-29 18:23 ` Guenter Roeck
2025-04-30 6:08 ` Hannes Reinecke
1 sibling, 2 replies; 27+ messages in thread
From: Keith Busch @ 2025-04-29 18:13 UTC (permalink / raw)
To: Hannes Reinecke
Cc: Daniel Wagner, Guenter Roeck, Daniel Wagner, Jens Axboe,
Christoph Hellwig, Sagi Grimberg, James Smart,
Shinichiro Kawasaki, linux-nvme, linux-kernel
On Mon, Apr 28, 2025 at 03:21:18PM +0200, Hannes Reinecke wrote:
> > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> > index b502ac07483b..d3c4eacf607f 100644
> > --- a/drivers/nvme/host/core.c
> > +++ b/drivers/nvme/host/core.c
> > @@ -4493,7 +4493,8 @@ static void nvme_fw_act_work(struct work_struct *work)
> > msleep(100);
> > }
> >
> > - if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE))
> > + if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_CONNECTING) ||
> > + !nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE))
> > return;
> >
> > nvme_unquiesce_io_queues(ctrl);
>
> I would rather have a separate state for firmware activation.
> (Ab-)using the 'RESETTING' state here has direct implications
> with the error handler, as for the error handler 'RESETTING'
> means that the error handler has been scheduled.
> Which is not true for firmware activation.
But the point of having firmware activation set the state to RESETTING
was to fence off error handling from trying to schedule a real reset.
The fw activation work schedules its own recovery if it times out, but
we don't want any other recovery action or user requested resets to
proceed while an activation is still pending.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] nvme: only allow entering LIVE from CONNECTING state
2025-04-29 18:13 ` Keith Busch
@ 2025-04-29 18:23 ` Guenter Roeck
2025-04-29 18:42 ` Keith Busch
2025-04-30 6:08 ` Hannes Reinecke
1 sibling, 1 reply; 27+ messages in thread
From: Guenter Roeck @ 2025-04-29 18:23 UTC (permalink / raw)
To: Keith Busch, Hannes Reinecke
Cc: Daniel Wagner, Daniel Wagner, Jens Axboe, Christoph Hellwig,
Sagi Grimberg, James Smart, Shinichiro Kawasaki, linux-nvme,
linux-kernel
On 4/29/25 11:13, Keith Busch wrote:
> On Mon, Apr 28, 2025 at 03:21:18PM +0200, Hannes Reinecke wrote:
>>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>>> index b502ac07483b..d3c4eacf607f 100644
>>> --- a/drivers/nvme/host/core.c
>>> +++ b/drivers/nvme/host/core.c
>>> @@ -4493,7 +4493,8 @@ static void nvme_fw_act_work(struct work_struct *work)
>>> msleep(100);
>>> }
>>>
>>> - if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE))
>>> + if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_CONNECTING) ||
>>> + !nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE))
>>> return;
>>>
>>> nvme_unquiesce_io_queues(ctrl);
>>
>> I would rather have a separate state for firmware activation.
>> (Ab-)using the 'RESETTING' state here has direct implications
>> with the error handler, as for the error handler 'RESETTING'
>> means that the error handler has been scheduled.
>> Which is not true for firmware activation.
>
> But the point of having firmware activation set the state to RESETTING
> was to fence off error handling from trying to schedule a real reset.
> The fw activation work schedules its own recovery if it times out, but
> we don't want any other recovery action or user requested resets to
> proceed while an activation is still pending.
Not only that; there are various checks against NVME_CTRL_RESETTING
sprinkled through the code. What is the impact of introducing a new state
without handling all those checks ?
Guenter
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] nvme: only allow entering LIVE from CONNECTING state
2025-04-29 18:23 ` Guenter Roeck
@ 2025-04-29 18:42 ` Keith Busch
2025-04-30 6:43 ` Daniel Wagner
0 siblings, 1 reply; 27+ messages in thread
From: Keith Busch @ 2025-04-29 18:42 UTC (permalink / raw)
To: Guenter Roeck
Cc: Hannes Reinecke, Daniel Wagner, Daniel Wagner, Jens Axboe,
Christoph Hellwig, Sagi Grimberg, James Smart,
Shinichiro Kawasaki, linux-nvme, linux-kernel
On Tue, Apr 29, 2025 at 11:23:25AM -0700, Guenter Roeck wrote:
> On 4/29/25 11:13, Keith Busch wrote:
> > On Mon, Apr 28, 2025 at 03:21:18PM +0200, Hannes Reinecke wrote:
> > > > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> > > > index b502ac07483b..d3c4eacf607f 100644
> > > > --- a/drivers/nvme/host/core.c
> > > > +++ b/drivers/nvme/host/core.c
> > > > @@ -4493,7 +4493,8 @@ static void nvme_fw_act_work(struct work_struct *work)
> > > > msleep(100);
> > > > }
> > > >
> > > > - if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE))
> > > > + if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_CONNECTING) ||
> > > > + !nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE))
> > > > return;
> > > >
> > > > nvme_unquiesce_io_queues(ctrl);
> > >
> > > I would rather have a separate state for firmware activation.
> > > (Ab-)using the 'RESETTING' state here has direct implications
> > > with the error handler, as for the error handler 'RESETTING'
> > > means that the error handler has been scheduled.
> > > Which is not true for firmware activation.
> >
> > But the point of having firmware activation set the state to RESETTING
> > was to fence off error handling from trying to schedule a real reset.
> > The fw activation work schedules its own recovery if it times out, but
> > we don't want any other recovery action or user requested resets to
> > proceed while an activation is still pending.
>
> Not only that; there are various checks against NVME_CTRL_RESETTING
> sprinkled through the code. What is the impact of introducing a new state
> without handling all those checks ?
Good point, bad things will happen if these checks are not updated to
know about the new state. For example, nvme-pci will attempt aborting IO
or disabling the controller on a timeout instead of restarting the timer
as desired.
Can we just revert the commit that prevented the RESETTING -> LIVE
transtion for now?
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] nvme: only allow entering LIVE from CONNECTING state
2025-04-29 18:13 ` Keith Busch
2025-04-29 18:23 ` Guenter Roeck
@ 2025-04-30 6:08 ` Hannes Reinecke
1 sibling, 0 replies; 27+ messages in thread
From: Hannes Reinecke @ 2025-04-30 6:08 UTC (permalink / raw)
To: Keith Busch
Cc: Daniel Wagner, Guenter Roeck, Daniel Wagner, Jens Axboe,
Christoph Hellwig, Sagi Grimberg, James Smart,
Shinichiro Kawasaki, linux-nvme, linux-kernel
On 4/29/25 20:13, Keith Busch wrote:
> On Mon, Apr 28, 2025 at 03:21:18PM +0200, Hannes Reinecke wrote:
>>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>>> index b502ac07483b..d3c4eacf607f 100644
>>> --- a/drivers/nvme/host/core.c
>>> +++ b/drivers/nvme/host/core.c
>>> @@ -4493,7 +4493,8 @@ static void nvme_fw_act_work(struct work_struct *work)
>>> msleep(100);
>>> }
>>>
>>> - if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE))
>>> + if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_CONNECTING) ||
>>> + !nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE))
>>> return;
>>>
>>> nvme_unquiesce_io_queues(ctrl);
>>
>> I would rather have a separate state for firmware activation.
>> (Ab-)using the 'RESETTING' state here has direct implications
>> with the error handler, as for the error handler 'RESETTING'
>> means that the error handler has been scheduled.
>> Which is not true for firmware activation.
>
> But the point of having firmware activation set the state to RESETTING
> was to fence off error handling from trying to schedule a real reset.
> The fw activation work schedules its own recovery if it times out, but
> we don't want any other recovery action or user requested resets to
> proceed while an activation is still pending.
I know; that was precisely my point. We are overloading the 'RESETTTING'
state to mean either 'reset has started' or 'fw activation is ongoing'.
Which are two _vastly_ different situations, and we should differentiate
them eg by introducing a new state. That new state can (and should) have
the same effects as the RESETTING state, true.
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] nvme: only allow entering LIVE from CONNECTING state
2025-04-29 18:42 ` Keith Busch
@ 2025-04-30 6:43 ` Daniel Wagner
2025-04-30 16:01 ` Keith Busch
2025-04-30 16:11 ` Guenter Roeck
0 siblings, 2 replies; 27+ messages in thread
From: Daniel Wagner @ 2025-04-30 6:43 UTC (permalink / raw)
To: Keith Busch
Cc: Guenter Roeck, Hannes Reinecke, Daniel Wagner, Jens Axboe,
Christoph Hellwig, Sagi Grimberg, James Smart,
Shinichiro Kawasaki, linux-nvme, linux-kernel
On Tue, Apr 29, 2025 at 11:42:25AM -0700, Keith Busch wrote:
> On Tue, Apr 29, 2025 at 11:23:25AM -0700, Guenter Roeck wrote:
> > On 4/29/25 11:13, Keith Busch wrote:
> > > On Mon, Apr 28, 2025 at 03:21:18PM +0200, Hannes Reinecke wrote:
> > > > > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> > > > > index b502ac07483b..d3c4eacf607f 100644
> > > > > --- a/drivers/nvme/host/core.c
> > > > > +++ b/drivers/nvme/host/core.c
> > > > > @@ -4493,7 +4493,8 @@ static void nvme_fw_act_work(struct work_struct *work)
> > > > > msleep(100);
> > > > > }
> > > > >
> > > > > - if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE))
> > > > > + if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_CONNECTING) ||
> > > > > + !nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE))
> > > > > return;
> > > > >
> > > > > nvme_unquiesce_io_queues(ctrl);
> > > >
> > > > I would rather have a separate state for firmware activation.
> > > > (Ab-)using the 'RESETTING' state here has direct implications
> > > > with the error handler, as for the error handler 'RESETTING'
> > > > means that the error handler has been scheduled.
> > > > Which is not true for firmware activation.
> > >
> > > But the point of having firmware activation set the state to RESETTING
> > > was to fence off error handling from trying to schedule a real reset.
> > > The fw activation work schedules its own recovery if it times out, but
> > > we don't want any other recovery action or user requested resets to
> > > proceed while an activation is still pending.
> >
> > Not only that; there are various checks against NVME_CTRL_RESETTING
> > sprinkled through the code. What is the impact of introducing a new state
> > without handling all those checks ?
>
> Good point, bad things will happen if these checks are not updated to
> know about the new state. For example, nvme-pci will attempt aborting IO
> or disabling the controller on a timeout instead of restarting the timer
> as desired.
>
> Can we just revert the commit that prevented the RESETTING -> LIVE
> transtion for now?
Unfortunately, it will break the FC error handling again(*). The
simplest fix is right above, add the transition from RESETTING to
CONNECTING and then to LIVE, IMO.
(*) ee59e3820ca9 ("nvme-fc: do not ignore connectivity loss during connecting")
f13409bb3f91 ("nvme-fc: rely on state transitions to handle connectivity loss")
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] nvme: only allow entering LIVE from CONNECTING state
2025-04-30 6:43 ` Daniel Wagner
@ 2025-04-30 16:01 ` Keith Busch
2025-04-30 16:12 ` Guenter Roeck
2025-04-30 16:11 ` Guenter Roeck
1 sibling, 1 reply; 27+ messages in thread
From: Keith Busch @ 2025-04-30 16:01 UTC (permalink / raw)
To: Daniel Wagner
Cc: Guenter Roeck, Hannes Reinecke, Daniel Wagner, Jens Axboe,
Christoph Hellwig, Sagi Grimberg, James Smart,
Shinichiro Kawasaki, linux-nvme, linux-kernel
On Wed, Apr 30, 2025 at 08:43:04AM +0200, Daniel Wagner wrote:
> On Tue, Apr 29, 2025 at 11:42:25AM -0700, Keith Busch wrote:
> > On Tue, Apr 29, 2025 at 11:23:25AM -0700, Guenter Roeck wrote:
> > > On 4/29/25 11:13, Keith Busch wrote:
> > > > On Mon, Apr 28, 2025 at 03:21:18PM +0200, Hannes Reinecke wrote:
> > > > > > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> > > > > > index b502ac07483b..d3c4eacf607f 100644
> > > > > > --- a/drivers/nvme/host/core.c
> > > > > > +++ b/drivers/nvme/host/core.c
> > > > > > @@ -4493,7 +4493,8 @@ static void nvme_fw_act_work(struct work_struct *work)
> > > > > > msleep(100);
> > > > > > }
> > > > > >
> > > > > > - if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE))
> > > > > > + if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_CONNECTING) ||
> > > > > > + !nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE))
> > > > > > return;
> > > > > >
> > > > > > nvme_unquiesce_io_queues(ctrl);
> > > > >
> > > > > I would rather have a separate state for firmware activation.
> > > > > (Ab-)using the 'RESETTING' state here has direct implications
> > > > > with the error handler, as for the error handler 'RESETTING'
> > > > > means that the error handler has been scheduled.
> > > > > Which is not true for firmware activation.
> > > >
> > > > But the point of having firmware activation set the state to RESETTING
> > > > was to fence off error handling from trying to schedule a real reset.
> > > > The fw activation work schedules its own recovery if it times out, but
> > > > we don't want any other recovery action or user requested resets to
> > > > proceed while an activation is still pending.
> > >
> > > Not only that; there are various checks against NVME_CTRL_RESETTING
> > > sprinkled through the code. What is the impact of introducing a new state
> > > without handling all those checks ?
> >
> > Good point, bad things will happen if these checks are not updated to
> > know about the new state. For example, nvme-pci will attempt aborting IO
> > or disabling the controller on a timeout instead of restarting the timer
> > as desired.
> >
> > Can we just revert the commit that prevented the RESETTING -> LIVE
> > transtion for now?
>
> Unfortunately, it will break the FC error handling again(*). The
> simplest fix is right above, add the transition from RESETTING to
> CONNECTING and then to LIVE, IMO.
Gotcha, yes, that looks like the simplest fix for the current release
then. We need to be careful with adding new states, so we can revisit
Hannes' suggestion for 6.16 if we really want to split this.
If you send the simple fix as a formal patch, please add my review and
the "Fixes:" tag.
Reviewed-by: Keith Busch <kbusch@kernel.org>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] nvme: only allow entering LIVE from CONNECTING state
2025-04-30 6:43 ` Daniel Wagner
2025-04-30 16:01 ` Keith Busch
@ 2025-04-30 16:11 ` Guenter Roeck
1 sibling, 0 replies; 27+ messages in thread
From: Guenter Roeck @ 2025-04-30 16:11 UTC (permalink / raw)
To: Daniel Wagner, Keith Busch
Cc: Hannes Reinecke, Daniel Wagner, Jens Axboe, Christoph Hellwig,
Sagi Grimberg, James Smart, Shinichiro Kawasaki, linux-nvme,
linux-kernel
On 4/29/25 23:43, Daniel Wagner wrote:
> On Tue, Apr 29, 2025 at 11:42:25AM -0700, Keith Busch wrote:
>> On Tue, Apr 29, 2025 at 11:23:25AM -0700, Guenter Roeck wrote:
>>> On 4/29/25 11:13, Keith Busch wrote:
>>>> On Mon, Apr 28, 2025 at 03:21:18PM +0200, Hannes Reinecke wrote:
>>>>>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>>>>>> index b502ac07483b..d3c4eacf607f 100644
>>>>>> --- a/drivers/nvme/host/core.c
>>>>>> +++ b/drivers/nvme/host/core.c
>>>>>> @@ -4493,7 +4493,8 @@ static void nvme_fw_act_work(struct work_struct *work)
>>>>>> msleep(100);
>>>>>> }
>>>>>>
>>>>>> - if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE))
>>>>>> + if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_CONNECTING) ||
>>>>>> + !nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE))
>>>>>> return;
>>>>>>
>>>>>> nvme_unquiesce_io_queues(ctrl);
>>>>>
>>>>> I would rather have a separate state for firmware activation.
>>>>> (Ab-)using the 'RESETTING' state here has direct implications
>>>>> with the error handler, as for the error handler 'RESETTING'
>>>>> means that the error handler has been scheduled.
>>>>> Which is not true for firmware activation.
>>>>
>>>> But the point of having firmware activation set the state to RESETTING
>>>> was to fence off error handling from trying to schedule a real reset.
>>>> The fw activation work schedules its own recovery if it times out, but
>>>> we don't want any other recovery action or user requested resets to
>>>> proceed while an activation is still pending.
>>>
>>> Not only that; there are various checks against NVME_CTRL_RESETTING
>>> sprinkled through the code. What is the impact of introducing a new state
>>> without handling all those checks ?
>>
>> Good point, bad things will happen if these checks are not updated to
>> know about the new state. For example, nvme-pci will attempt aborting IO
>> or disabling the controller on a timeout instead of restarting the timer
>> as desired.
>>
>> Can we just revert the commit that prevented the RESETTING -> LIVE
>> transtion for now?
>
> Unfortunately, it will break the FC error handling again(*). The
> simplest fix is right above, add the transition from RESETTING to
> CONNECTING and then to LIVE, IMO.
>
> (*) ee59e3820ca9 ("nvme-fc: do not ignore connectivity loss during connecting")
> f13409bb3f91 ("nvme-fc: rely on state transitions to handle connectivity loss")
I don't know if having nvme-fc _depend_ on state transition restrictions is good
or bad, and I don't know anything about the NVME state machine.
However, given that there _is_ a real regression in the code right now, I agree
that the solution (or call it workaround) suggested by Daniel (or something
similar) should be the way to go, and it would be easy to backport. After all,
firmware update is now broken in all stable release branches, and that is not
a good situation to be in.
Introducing a new state for firmware updates should be done carefully and not be
part of a regression fix which needs to be backported to all stable release
branches.
Thanks,
Guenter
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] nvme: only allow entering LIVE from CONNECTING state
2025-04-30 16:01 ` Keith Busch
@ 2025-04-30 16:12 ` Guenter Roeck
2025-05-02 9:02 ` Daniel Wagner
0 siblings, 1 reply; 27+ messages in thread
From: Guenter Roeck @ 2025-04-30 16:12 UTC (permalink / raw)
To: Keith Busch, Daniel Wagner
Cc: Hannes Reinecke, Daniel Wagner, Jens Axboe, Christoph Hellwig,
Sagi Grimberg, James Smart, Shinichiro Kawasaki, linux-nvme,
linux-kernel
On 4/30/25 09:01, Keith Busch wrote:
> On Wed, Apr 30, 2025 at 08:43:04AM +0200, Daniel Wagner wrote:
>> On Tue, Apr 29, 2025 at 11:42:25AM -0700, Keith Busch wrote:
>>> On Tue, Apr 29, 2025 at 11:23:25AM -0700, Guenter Roeck wrote:
>>>> On 4/29/25 11:13, Keith Busch wrote:
>>>>> On Mon, Apr 28, 2025 at 03:21:18PM +0200, Hannes Reinecke wrote:
>>>>>>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>>>>>>> index b502ac07483b..d3c4eacf607f 100644
>>>>>>> --- a/drivers/nvme/host/core.c
>>>>>>> +++ b/drivers/nvme/host/core.c
>>>>>>> @@ -4493,7 +4493,8 @@ static void nvme_fw_act_work(struct work_struct *work)
>>>>>>> msleep(100);
>>>>>>> }
>>>>>>>
>>>>>>> - if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE))
>>>>>>> + if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_CONNECTING) ||
>>>>>>> + !nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE))
>>>>>>> return;
>>>>>>>
>>>>>>> nvme_unquiesce_io_queues(ctrl);
>>>>>>
>>>>>> I would rather have a separate state for firmware activation.
>>>>>> (Ab-)using the 'RESETTING' state here has direct implications
>>>>>> with the error handler, as for the error handler 'RESETTING'
>>>>>> means that the error handler has been scheduled.
>>>>>> Which is not true for firmware activation.
>>>>>
>>>>> But the point of having firmware activation set the state to RESETTING
>>>>> was to fence off error handling from trying to schedule a real reset.
>>>>> The fw activation work schedules its own recovery if it times out, but
>>>>> we don't want any other recovery action or user requested resets to
>>>>> proceed while an activation is still pending.
>>>>
>>>> Not only that; there are various checks against NVME_CTRL_RESETTING
>>>> sprinkled through the code. What is the impact of introducing a new state
>>>> without handling all those checks ?
>>>
>>> Good point, bad things will happen if these checks are not updated to
>>> know about the new state. For example, nvme-pci will attempt aborting IO
>>> or disabling the controller on a timeout instead of restarting the timer
>>> as desired.
>>>
>>> Can we just revert the commit that prevented the RESETTING -> LIVE
>>> transtion for now?
>>
>> Unfortunately, it will break the FC error handling again(*). The
>> simplest fix is right above, add the transition from RESETTING to
>> CONNECTING and then to LIVE, IMO.
>
> Gotcha, yes, that looks like the simplest fix for the current release
> then. We need to be careful with adding new states, so we can revisit
> Hannes' suggestion for 6.16 if we really want to split this.
>
> If you send the simple fix as a formal patch, please add my review and
> the "Fixes:" tag.
>
> Reviewed-by: Keith Busch <kbusch@kernel.org>
Please feel free to add mine as well:
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] nvme: only allow entering LIVE from CONNECTING state
2025-04-30 16:12 ` Guenter Roeck
@ 2025-05-02 9:02 ` Daniel Wagner
0 siblings, 0 replies; 27+ messages in thread
From: Daniel Wagner @ 2025-05-02 9:02 UTC (permalink / raw)
To: Guenter Roeck
Cc: Keith Busch, Hannes Reinecke, Daniel Wagner, Jens Axboe,
Christoph Hellwig, Sagi Grimberg, James Smart,
Shinichiro Kawasaki, linux-nvme, linux-kernel
On Wed, Apr 30, 2025 at 09:12:30AM -0700, Guenter Roeck wrote:
> On 4/30/25 09:01, Keith Busch wrote:
> > If you send the simple fix as a formal patch, please add my review and
> > the "Fixes:" tag.
> >
> > Reviewed-by: Keith Busch <kbusch@kernel.org>
>
> Please feel free to add mine as well:
>
> Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Sorry for the delay, yesterday was a public holiday and way to nice
weather to do any work.
Just sent out the fix. Thanks for the review tags.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] nvme: only allow entering LIVE from CONNECTING state
2025-02-14 8:02 ` [PATCH 1/2] nvme: only allow entering LIVE from CONNECTING state Daniel Wagner
2025-02-20 10:34 ` Sagi Grimberg
2025-04-27 15:59 ` Guenter Roeck
@ 2026-01-09 19:18 ` John Meneghini
2026-01-11 9:33 ` Nilay Shroff
2 siblings, 1 reply; 27+ messages in thread
From: John Meneghini @ 2026-01-09 19:18 UTC (permalink / raw)
To: Daniel Wagner, Keith Busch, Jens Axboe, Christoph Hellwig,
Sagi Grimberg, James Smart, Hannes Reinecke, Shinichiro Kawasaki,
Nilay Shroff, Wen Xiong, Narayana Murty N
Cc: linux-nvme, linux-kernel, Ewan Milne, Maurizio Lombardi
Unfortunately, it has been discovered that this patch causes a serious regression on powerpc platforms.
If anyone has a powerpc platform with an NVMe/PCIe device installed, please run this simple test and see if it works.
# uname -av
Linux rdma-cert-03-lp10.rdma.lab.eng.rdu2.redhat.com 6.19.0-rc4+ #1 SMP Wed Jan 7 21:42:54 EST 2026 ppc64le GNU/Linux
# nvme list-subsys /dev/nvme0n1
nvme-subsys0 - NQN=nqn.1994-11.com.samsung:nvme:PM1735:HHHL:S4WANA0R400032
hostnqn=nqn.2014-08.org.nvmexpress:uuid:1654a627-93b6-4650-ba90-f4dc7a2fd3ee
iopolicy=numa
\
+- nvme0 pcie 0018:01:00.0 live optimized
# nvme subsystem-reset /dev/nvme0; nvme list-subsys /dev/nvme0n1; sleep 1; nvme list-subsys /dev/nvme0n1; nvme list-subsys /dev/nvme0n1;
nvme-subsys0 - NQN=nqn.1994-11.com.samsung:nvme:PM1735:HHHL:S4WANA0R400032
hostnqn=nqn.2014-08.org.nvmexpress:uuid:1654a627-93b6-4650-ba90-f4dc7a2fd3ee
iopolicy=numa
\
+- nvme0 pcie 0018:01:00.0 resetting optimized
[Wed Jan 7 21:59:51 2026] block nvme0n1: no usable path - requeuing I/O
[Wed Jan 7 21:59:51 2026] block nvme0n1: no usable path - requeuing I/O
[Wed Jan 7 21:59:51 2026] block nvme0n1: no usable path - requeuing I/O
[Wed Jan 7 21:59:51 2026] block nvme0n1: no usable path - requeuing I/O
[Wed Jan 7 21:59:51 2026] block nvme0n1: no usable path - requeuing I/O
# nvme list-subsys /dev/nvme0n1;
# nvme list-subsys /dev/nvme0n1;
nvme-subsys0 - NQN=nqn.1994-11.com.samsung:nvme:PM1735:HHHL:S4WANA0R400032
hostnqn=nqn.2014-08.org.nvmexpress:uuid:1654a627-93b6-4650-ba90-f4dc7a2fd3ee
iopolicy=numa
\
+- nvme0 pcie 0018:01:00.0 resetting optimized
At this point the machine is HUNG. It's stuck in the resetting state forever.
Because /dev/nvme0n1 is the root device, I need to power-cycle/reboot the host to recover.
/John
On 2/14/25 3:02 AM, Daniel Wagner wrote:
> The fabric transports and also the PCI transport are not entering the
> LIVE state from NEW or RESETTING. This makes the state machine more
> restrictive and allows to catch not supported state transitions, e.g.
> directly switching from RESETTING to LIVE.
>
> Signed-off-by: Daniel Wagner <wagi@kernel.org>
> ---
> drivers/nvme/host/core.c | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 818d4e49aab51c388af9a48bf9d466fea9cef51b..f028913e2e622ee348e88879c6e6b7e8f8a1cc82 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -564,8 +564,6 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
> switch (new_state) {
> case NVME_CTRL_LIVE:
> switch (old_state) {
> - case NVME_CTRL_NEW:
> - case NVME_CTRL_RESETTING:
> case NVME_CTRL_CONNECTING:
> changed = true;
> fallthrough;
>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] nvme: only allow entering LIVE from CONNECTING state
2026-01-09 19:18 ` John Meneghini
@ 2026-01-11 9:33 ` Nilay Shroff
2026-01-12 8:14 ` Daniel Wagner
0 siblings, 1 reply; 27+ messages in thread
From: Nilay Shroff @ 2026-01-11 9:33 UTC (permalink / raw)
To: John Meneghini, Daniel Wagner, Keith Busch, Jens Axboe,
Christoph Hellwig, Sagi Grimberg, James Smart, Hannes Reinecke,
Shinichiro Kawasaki, Wen Xiong, Narayana Murty N
Cc: linux-nvme, linux-kernel, Ewan Milne, Maurizio Lombardi
On 1/10/26 12:48 AM, John Meneghini wrote:
> Unfortunately, it has been discovered that this patch causes a serious regression on powerpc platforms.
>
> If anyone has a powerpc platform with an NVMe/PCIe device installed, please run this simple test and see if it works.
>
> # uname -av
> Linux rdma-cert-03-lp10.rdma.lab.eng.rdu2.redhat.com 6.19.0-rc4+ #1 SMP Wed Jan 7 21:42:54 EST 2026 ppc64le GNU/Linux
>
> # nvme list-subsys /dev/nvme0n1
> nvme-subsys0 - NQN=nqn.1994-11.com.samsung:nvme:PM1735:HHHL:S4WANA0R400032
> hostnqn=nqn.2014-08.org.nvmexpress:uuid:1654a627-93b6-4650-ba90-f4dc7a2fd3ee
> iopolicy=numa
> \
> +- nvme0 pcie 0018:01:00.0 live optimized
>
> # nvme subsystem-reset /dev/nvme0; nvme list-subsys /dev/nvme0n1; sleep 1; nvme list-subsys /dev/nvme0n1; nvme list-subsys /dev/nvme0n1;
> nvme-subsys0 - NQN=nqn.1994-11.com.samsung:nvme:PM1735:HHHL:S4WANA0R400032
> hostnqn=nqn.2014-08.org.nvmexpress:uuid:1654a627-93b6-4650-ba90-f4dc7a2fd3ee
> iopolicy=numa
> \
> +- nvme0 pcie 0018:01:00.0 resetting optimized
> [Wed Jan 7 21:59:51 2026] block nvme0n1: no usable path - requeuing I/O
> [Wed Jan 7 21:59:51 2026] block nvme0n1: no usable path - requeuing I/O
> [Wed Jan 7 21:59:51 2026] block nvme0n1: no usable path - requeuing I/O
> [Wed Jan 7 21:59:51 2026] block nvme0n1: no usable path - requeuing I/O
> [Wed Jan 7 21:59:51 2026] block nvme0n1: no usable path - requeuing I/O
>
> # nvme list-subsys /dev/nvme0n1;
>
> # nvme list-subsys /dev/nvme0n1;
> nvme-subsys0 - NQN=nqn.1994-11.com.samsung:nvme:PM1735:HHHL:S4WANA0R400032
> hostnqn=nqn.2014-08.org.nvmexpress:uuid:1654a627-93b6-4650-ba90-f4dc7a2fd3ee
> iopolicy=numa
> \
> +- nvme0 pcie 0018:01:00.0 resetting optimized
>
> At this point the machine is HUNG. It's stuck in the resetting state forever.
>
> Because /dev/nvme0n1 is the root device, I need to power-cycle/reboot the host to recover.
> /John
>
> On 2/14/25 3:02 AM, Daniel Wagner wrote:
>> The fabric transports and also the PCI transport are not entering the
>> LIVE state from NEW or RESETTING. This makes the state machine more
>> restrictive and allows to catch not supported state transitions, e.g.
>> directly switching from RESETTING to LIVE.
>>
>> Signed-off-by: Daniel Wagner <wagi@kernel.org>
>> ---
>> drivers/nvme/host/core.c | 2 --
>> 1 file changed, 2 deletions(-)
>>
>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> index 818d4e49aab51c388af9a48bf9d466fea9cef51b..f028913e2e622ee348e88879c6e6b7e8f8a1cc82 100644
>> --- a/drivers/nvme/host/core.c
>> +++ b/drivers/nvme/host/core.c
>> @@ -564,8 +564,6 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
>> switch (new_state) {
>> case NVME_CTRL_LIVE:
>> switch (old_state) {
>> - case NVME_CTRL_NEW:
>> - case NVME_CTRL_RESETTING:
>> case NVME_CTRL_CONNECTING:
>> changed = true;
>> fallthrough;
>>
>
This was broken because with this commit d2fe192348f9 (“nvme: only allow entering LIVE
from CONNECTING state”) now we don't allow changing controller state from
RESETTING -> LIVE. I saw we also had similar state change issue with firmware activation
code which was fixed by explicitly transitioning the controller state through RESETTING ->
CONNECTING -> LIVE. We may employ the similar solution here for subsystem reset case as well.
Currently, the NVMe PCIe subsystem reset code performs the following steps:
1. Sets the controller state to RESETTING
2. Writes the subsystem reset command to the NSSR register
3. Attempts to transition the controller state directly to LIVE
This effectively bypasses the CONNECTING state. The transition to LIVE is artificial but
intentional, since writing to the NSSR register causes the loss of communication with the
NVMe adapter and the controller must be marked LIVE so that any in-flight I/O at the time the
subsystem reset is issued, or an explicit MMIO read, can trigger EEH recovery and ultimately
restore communication link between the NVMe adapter and the system.
With the stricter state transition rules introduced by commit d2fe192348f9 (“nvme: only allow
entering LIVE from CONNECTING state”), the direct transition from RESETTING -> LIVE is no longer
permitted, rendering the current logic ineffective.
Taking a cue from the firmware activation fix, it seems reasonable to explicitly transition
the controller state through CONNECTING in the subsystem reset path as well. So how about making
the following change to fix this?
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 0e4caeab739c..3027bba232de 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1532,7 +1532,10 @@ static int nvme_pci_subsystem_reset(struct nvme_ctrl *ctrl)
}
writel(NVME_SUBSYS_RESET, dev->bar + NVME_REG_NSSR);
- nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE);
+
+ if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_CONNECTING) ||
+ !nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE))
+ goto unlock;
/*
* Read controller status to flush the previous write and trigger a
Thanks,
--Nilay
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] nvme: only allow entering LIVE from CONNECTING state
2026-01-11 9:33 ` Nilay Shroff
@ 2026-01-12 8:14 ` Daniel Wagner
2026-01-13 6:10 ` Nilay Shroff
0 siblings, 1 reply; 27+ messages in thread
From: Daniel Wagner @ 2026-01-12 8:14 UTC (permalink / raw)
To: Nilay Shroff
Cc: John Meneghini, Daniel Wagner, Keith Busch, Jens Axboe,
Christoph Hellwig, Sagi Grimberg, James Smart, Hannes Reinecke,
Shinichiro Kawasaki, Wen Xiong, Narayana Murty N, linux-nvme,
linux-kernel, Ewan Milne, Maurizio Lombardi
Hi Nilay,
On Sun, Jan 11, 2026 at 03:03:43PM +0530, Nilay Shroff wrote:
> This was broken because with this commit d2fe192348f9 (“nvme: only allow entering LIVE
> from CONNECTING state”) now we don't allow changing controller state from
> RESETTING -> LIVE. I saw we also had similar state change issue with firmware activation
> code which was fixed by explicitly transitioning the controller state through RESETTING ->
> CONNECTING -> LIVE. We may employ the similar solution here for subsystem reset case as well.
>
> Currently, the NVMe PCIe subsystem reset code performs the following steps:
>
> 1. Sets the controller state to RESETTING
> 2. Writes the subsystem reset command to the NSSR register
> 3. Attempts to transition the controller state directly to LIVE
>
> This effectively bypasses the CONNECTING state. The transition to LIVE is artificial but
> intentional, since writing to the NSSR register causes the loss of communication with the
> NVMe adapter and the controller must be marked LIVE so that any in-flight I/O at the time the
> subsystem reset is issued, or an explicit MMIO read, can trigger EEH recovery and ultimately
> restore communication link between the NVMe adapter and the system.
>
> With the stricter state transition rules introduced by commit d2fe192348f9 (“nvme: only allow
> entering LIVE from CONNECTING state”), the direct transition from RESETTING -> LIVE is no longer
> permitted, rendering the current logic ineffective.
>
> Taking a cue from the firmware activation fix, it seems reasonable to explicitly transition
> the controller state through CONNECTING in the subsystem reset path as well. So how about making
> the following change to fix this?
>
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 0e4caeab739c..3027bba232de 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -1532,7 +1532,10 @@ static int nvme_pci_subsystem_reset(struct nvme_ctrl *ctrl)
> }
>
> writel(NVME_SUBSYS_RESET, dev->bar + NVME_REG_NSSR);
> - nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE);
> +
> + if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_CONNECTING) ||
> + !nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE))
> + goto unlock;
>
> /*
> * Read controller status to flush the previous write and trigger a
This seems to be similar to case where the firmware update got stuck [1].
650415fca0a9 ("nvme: unblock ctrl state transition for firmware update")
From my understanding this should work, but it's probably better to
double check that this doesn't violate any assumptions.
[1] https://lore.kernel.org/all/aBJJQoOBhaXj7P36@kbusch-mbp/
Thanks,
Daniel
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] nvme: only allow entering LIVE from CONNECTING state
2026-01-12 8:14 ` Daniel Wagner
@ 2026-01-13 6:10 ` Nilay Shroff
2026-01-13 13:55 ` John Meneghini
0 siblings, 1 reply; 27+ messages in thread
From: Nilay Shroff @ 2026-01-13 6:10 UTC (permalink / raw)
To: Daniel Wagner
Cc: John Meneghini, Daniel Wagner, Keith Busch, Jens Axboe,
Christoph Hellwig, Sagi Grimberg, James Smart, Hannes Reinecke,
Shinichiro Kawasaki, Wen Xiong, Narayana Murty N, linux-nvme,
linux-kernel, Ewan Milne, Maurizio Lombardi
On 1/12/26 1:44 PM, Daniel Wagner wrote:
> Hi Nilay,
>
> On Sun, Jan 11, 2026 at 03:03:43PM +0530, Nilay Shroff wrote:
>> This was broken because with this commit d2fe192348f9 (“nvme: only allow entering LIVE
>> from CONNECTING state”) now we don't allow changing controller state from
>> RESETTING -> LIVE. I saw we also had similar state change issue with firmware activation
>> code which was fixed by explicitly transitioning the controller state through RESETTING ->
>> CONNECTING -> LIVE. We may employ the similar solution here for subsystem reset case as well.
>>
>> Currently, the NVMe PCIe subsystem reset code performs the following steps:
>>
>> 1. Sets the controller state to RESETTING
>> 2. Writes the subsystem reset command to the NSSR register
>> 3. Attempts to transition the controller state directly to LIVE
>>
>> This effectively bypasses the CONNECTING state. The transition to LIVE is artificial but
>> intentional, since writing to the NSSR register causes the loss of communication with the
>> NVMe adapter and the controller must be marked LIVE so that any in-flight I/O at the time the
>> subsystem reset is issued, or an explicit MMIO read, can trigger EEH recovery and ultimately
>> restore communication link between the NVMe adapter and the system.
>>
>> With the stricter state transition rules introduced by commit d2fe192348f9 (“nvme: only allow
>> entering LIVE from CONNECTING state”), the direct transition from RESETTING -> LIVE is no longer
>> permitted, rendering the current logic ineffective.
>>
>> Taking a cue from the firmware activation fix, it seems reasonable to explicitly transition
>> the controller state through CONNECTING in the subsystem reset path as well. So how about making
>> the following change to fix this?
>>
>> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
>> index 0e4caeab739c..3027bba232de 100644
>> --- a/drivers/nvme/host/pci.c
>> +++ b/drivers/nvme/host/pci.c
>> @@ -1532,7 +1532,10 @@ static int nvme_pci_subsystem_reset(struct nvme_ctrl *ctrl)
>> }
>>
>> writel(NVME_SUBSYS_RESET, dev->bar + NVME_REG_NSSR);
>> - nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE);
>> +
>> + if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_CONNECTING) ||
>> + !nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE))
>> + goto unlock;
>>
>> /*
>> * Read controller status to flush the previous write and trigger a
>
> This seems to be similar to case where the firmware update got stuck [1].
>
> 650415fca0a9 ("nvme: unblock ctrl state transition for firmware update")
>
> From my understanding this should work, but it's probably better to
> double check that this doesn't violate any assumptions.
>
> [1] https://lore.kernel.org/all/aBJJQoOBhaXj7P36@kbusch-mbp/
>
I have tested this (and I believe John as well) and this seems tobe working good. From my code walk through, the controller state
transition RESETTING -> CONNECTING -> LIVE looks good to me, so I'll
send out a formal patch for review.
Thanks,
--Nilay
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] nvme: only allow entering LIVE from CONNECTING state
2026-01-13 6:10 ` Nilay Shroff
@ 2026-01-13 13:55 ` John Meneghini
0 siblings, 0 replies; 27+ messages in thread
From: John Meneghini @ 2026-01-13 13:55 UTC (permalink / raw)
To: Nilay Shroff, Daniel Wagner
Cc: Daniel Wagner, Keith Busch, Jens Axboe, Christoph Hellwig,
Sagi Grimberg, James Smart, Hannes Reinecke, Shinichiro Kawasaki,
Wen Xiong, Narayana Murty N, linux-nvme, linux-kernel, Ewan Milne,
Maurizio Lombardi
On 1/13/26 1:10 AM, Nilay Shroff wrote:
> I have tested this (and I believe John as well) and this seems tobe working good. From my code walk through, the controller state
> transition RESETTING -> CONNECTING -> LIVE looks good to me, so I'll
> send out a formal patch for review.
Sounds good Nilay. Please send your patch for review and I will test it.
/John
^ permalink raw reply [flat|nested] 27+ messages in thread
end of thread, other threads:[~2026-01-13 13:56 UTC | newest]
Thread overview: 27+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-14 8:02 [PATCH 0/2] nvme-fc: fix schedule in atomic context Daniel Wagner
2025-02-14 8:02 ` [PATCH 1/2] nvme: only allow entering LIVE from CONNECTING state Daniel Wagner
2025-02-20 10:34 ` Sagi Grimberg
2025-04-27 15:59 ` Guenter Roeck
2025-04-28 12:44 ` Daniel Wagner
2025-04-28 13:21 ` Hannes Reinecke
2025-04-29 13:55 ` Daniel Wagner
2025-04-29 17:54 ` Hannes Reinecke
2025-04-29 18:13 ` Keith Busch
2025-04-29 18:23 ` Guenter Roeck
2025-04-29 18:42 ` Keith Busch
2025-04-30 6:43 ` Daniel Wagner
2025-04-30 16:01 ` Keith Busch
2025-04-30 16:12 ` Guenter Roeck
2025-05-02 9:02 ` Daniel Wagner
2025-04-30 16:11 ` Guenter Roeck
2025-04-30 6:08 ` Hannes Reinecke
2026-01-09 19:18 ` John Meneghini
2026-01-11 9:33 ` Nilay Shroff
2026-01-12 8:14 ` Daniel Wagner
2026-01-13 6:10 ` Nilay Shroff
2026-01-13 13:55 ` John Meneghini
2025-02-14 8:02 ` [PATCH 2/2] nvme-fc: rely on state transitions to handle connectivity loss Daniel Wagner
2025-02-20 10:36 ` Sagi Grimberg
2025-02-20 8:00 ` [PATCH 0/2] nvme-fc: fix schedule in atomic context Daniel Wagner
2025-02-20 12:50 ` Shinichiro Kawasaki
2025-02-20 17:23 ` Keith Busch
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox