All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] nvme_fc: correct hang in nvme_ns_remove()
@ 2018-01-11 23:21 James Smart
  2018-01-11 23:34 ` James Smart
  2018-01-14 10:44 ` Sagi Grimberg
  0 siblings, 2 replies; 5+ messages in thread
From: James Smart @ 2018-01-11 23:21 UTC (permalink / raw)


When connectivity is lost to a device, the association is terminated
and the blk-mq queues are quiesced/stopped. When connectivity is
re-established, they are resumed.

If connectivity is lost for a sufficient amount of time that the
controller is then deleted, the delete path starts tearing down queues,
and eventually calling nvme_ns_remove(). It appears that pending
commands may cause blk_cleanup_queue() to never complete and the
teardown stalls.

Correct by starting the ns queues after transitioning to a DELETING
state, allowing pending commands to be flushed with io failures. Thus
the delete path is clear when reached.

Signed-off-by: James Smart <james.smart at broadcom.com>
---
 drivers/nvme/host/fc.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index fe98e994498c..99bf51c7e513 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -2938,6 +2938,9 @@ nvme_fc_delete_ctrl(struct nvme_ctrl *nctrl)
 	 * waiting for io to terminate
 	 */
 	nvme_fc_delete_association(ctrl);
+
+	/* resume the io queues so that things will fast fail */
+	nvme_start_queues(nctrl);
 }
 
 static void
-- 
2.13.1

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH] nvme_fc: correct hang in nvme_ns_remove()
  2018-01-11 23:21 [PATCH] nvme_fc: correct hang in nvme_ns_remove() James Smart
@ 2018-01-11 23:34 ` James Smart
  2018-01-11 23:46   ` Keith Busch
  2018-01-14 10:44 ` Sagi Grimberg
  1 sibling, 1 reply; 5+ messages in thread
From: James Smart @ 2018-01-11 23:34 UTC (permalink / raw)


If you compare behavior of FC with rdma, rdma starts the queues at the 
tail end of losing connectivity to the device - meaning any pending io 
and any future io issued while connectivity has yet to
be re-established (e.g. in RECONNECTING state) will fail with an io
error. This is good, if there is a multipathing config, as it's a
near-immediate fast fail scenario. But... if there is no multipath,
it means applications and filesystems are now seeing io errors while
connectivity is pending and that can be disastrous.  FC currently
leaves the queues quiesced while connectivity is pending so io errors 
are not seen. But this means FC won't fastfail the ios to the
multipath'er.

For now I want to fix this keeping the existing FC behavior. From there, 
I'd like the transports to block like FC does so no errors. However, a 
new timer would be introduced for a "fast failure timeout" - which 
starts at loss of connectivity and when expires, starts the queues and 
fails any pending and future io.

Thoughts ?

-- james


On 1/11/2018 3:21 PM, James Smart wrote:
> When connectivity is lost to a device, the association is terminated
> and the blk-mq queues are quiesced/stopped. When connectivity is
> re-established, they are resumed.
> 
> If connectivity is lost for a sufficient amount of time that the
> controller is then deleted, the delete path starts tearing down queues,
> and eventually calling nvme_ns_remove(). It appears that pending
> commands may cause blk_cleanup_queue() to never complete and the
> teardown stalls.
> 
> Correct by starting the ns queues after transitioning to a DELETING
> state, allowing pending commands to be flushed with io failures. Thus
> the delete path is clear when reached.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH] nvme_fc: correct hang in nvme_ns_remove()
  2018-01-11 23:34 ` James Smart
@ 2018-01-11 23:46   ` Keith Busch
  2018-01-14 10:44     ` Sagi Grimberg
  0 siblings, 1 reply; 5+ messages in thread
From: Keith Busch @ 2018-01-11 23:46 UTC (permalink / raw)


On Thu, Jan 11, 2018@03:34:58PM -0800, James Smart wrote:
> If you compare behavior of FC with rdma, rdma starts the queues at the tail
> end of losing connectivity to the device - meaning any pending io and any
> future io issued while connectivity has yet to
> be re-established (e.g. in RECONNECTING state) will fail with an io
> error. This is good, if there is a multipathing config, as it's a
> near-immediate fast fail scenario. But... if there is no multipath,
> it means applications and filesystems are now seeing io errors while
> connectivity is pending and that can be disastrous.  FC currently
> leaves the queues quiesced while connectivity is pending so io errors are
> not seen. But this means FC won't fastfail the ios to the
> multipath'er.
> 
> For now I want to fix this keeping the existing FC behavior. From there, I'd
> like the transports to block like FC does so no errors. However, a new timer
> would be introduced for a "fast failure timeout" - which starts at loss of
> connectivity and when expires, starts the queues and fails any pending and
> future io.
> 
> Thoughts ?

Yes, I think that sounds ok.

Longer term, I think it's a bit tacky that we rely on queue_rq to check
for early termination states. Since we can quiece blk-mq, it'd be better
if we introduce another tag iterator to end unstarted requests directly
when we need to give up on the request, rather than rely on queue_rq. I
was going to post a patch that does just that, but I still haven't gotten
a chance to test it... :(

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH] nvme_fc: correct hang in nvme_ns_remove()
  2018-01-11 23:46   ` Keith Busch
@ 2018-01-14 10:44     ` Sagi Grimberg
  0 siblings, 0 replies; 5+ messages in thread
From: Sagi Grimberg @ 2018-01-14 10:44 UTC (permalink / raw)



>> If you compare behavior of FC with rdma, rdma starts the queues at the tail
>> end of losing connectivity to the device - meaning any pending io and any
>> future io issued while connectivity has yet to
>> be re-established (e.g. in RECONNECTING state) will fail with an io
>> error. This is good, if there is a multipathing config, as it's a
>> near-immediate fast fail scenario. But... if there is no multipath,
>> it means applications and filesystems are now seeing io errors while
>> connectivity is pending and that can be disastrous.  FC currently
>> leaves the queues quiesced while connectivity is pending so io errors are
>> not seen. But this means FC won't fastfail the ios to the
>> multipath'er.
>>
>> For now I want to fix this keeping the existing FC behavior. From there, I'd
>> like the transports to block like FC does so no errors. However, a new timer
>> would be introduced for a "fast failure timeout" - which starts at loss of
>> connectivity and when expires, starts the queues and fails any pending and
>> future io.
>>
>> Thoughts ?
> 
> Yes, I think that sounds ok.
> 
> Longer term, I think it's a bit tacky that we rely on queue_rq to check
> for early termination states. Since we can quiece blk-mq, it'd be better
> if we introduce another tag iterator to end unstarted requests directly
> when we need to give up on the request, rather than rely on queue_rq. I
> was going to post a patch that does just that, but I still haven't gotten
> a chance to test it... :(

I agree with the fast_fail_tmo, we also need it for the multipath code
which currently blocks forever until either we reconnect successfully
or delete the controller.

James, are you looking into this? Note that this should be done in
nvme-core.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH] nvme_fc: correct hang in nvme_ns_remove()
  2018-01-11 23:21 [PATCH] nvme_fc: correct hang in nvme_ns_remove() James Smart
  2018-01-11 23:34 ` James Smart
@ 2018-01-14 10:44 ` Sagi Grimberg
  1 sibling, 0 replies; 5+ messages in thread
From: Sagi Grimberg @ 2018-01-14 10:44 UTC (permalink / raw)



> diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
> index fe98e994498c..99bf51c7e513 100644
> --- a/drivers/nvme/host/fc.c
> +++ b/drivers/nvme/host/fc.c
> @@ -2938,6 +2938,9 @@ nvme_fc_delete_ctrl(struct nvme_ctrl *nctrl)
>   	 * waiting for io to terminate
>   	 */
>   	nvme_fc_delete_association(ctrl);
> +
> +	/* resume the io queues so that things will fast fail */
> +	nvme_start_queues(nctrl);

Is fc also unquiescing the admin queue here?

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2018-01-14 10:44 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-01-11 23:21 [PATCH] nvme_fc: correct hang in nvme_ns_remove() James Smart
2018-01-11 23:34 ` James Smart
2018-01-11 23:46   ` Keith Busch
2018-01-14 10:44     ` Sagi Grimberg
2018-01-14 10:44 ` Sagi Grimberg

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.