* [RFC] nvme-rdma: Stop queues when starting with error recovery
@ 2022-05-23 15:21 Daniel Wagner
2022-05-24 13:12 ` Sagi Grimberg
0 siblings, 1 reply; 6+ messages in thread
From: Daniel Wagner @ 2022-05-23 15:21 UTC (permalink / raw)
To: linux-nvme; +Cc: Daniel Wagner
When we enter error recovery we should stop all queue activities and
all armed timers.
For example, we could arming an ANATT timer right before we enter
error recovery but do not successfully recover before the timer
fires. The timer is supposed only be active when the controller is in
LIVE state hence we should call nvme_stop_ctrl when starting with the
recover activites.
Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
The nvme_stop_ctrl() does cancel pending ANATT timers. But so far I
don't got hold of logs when the two controllers get back live. So this
might not work as expected.
My question is do we just want to cancel the timer or is
nvme_stop_ctrl() the right function here. Obviously, the same problem
exists for nvme-tcp.
[ 889.241541] nvme nvme0: creating 4 I/O queues.
[ 892.341152] nvme nvme0: mapped 4/0/0 default/read/poll queues.
[ 892.350942] nvme nvme0: new ctrl: NQN "XXX", addr 192.20.93.101:4420
[ 892.402493] nvme nvme1: creating 4 I/O queues.
[ 895.392810] nvme nvme1: mapped 4/0/0 default/read/poll queues.
[ 895.402029] nvme nvme1: new ctrl: NQN "XXX", addr 192.20.93.102:4420
[ 895.471730] nvme nvme2: creating 4 I/O queues.
[ 898.509195] nvme nvme2: mapped 4/0/0 default/read/poll queues.
[ 898.519015] nvme nvme2: new ctrl: NQN "XXX", addr 192.20.193.101:4420
[ 898.571169] nvme nvme3: creating 4 I/O queues.
[ 901.592283] nvme nvme3: mapped 4/0/0 default/read/poll queues.
[ 901.601832] nvme nvme3: new ctrl: NQN "XXX", addr 192.20.193.102:4420
[ 983.429977] nvme nvme3: I/O 0 QID 0 timeout
[ 983.434472] nvme nvme3: starting error recovery
[ 984.549958] nvme nvme0: I/O 0 QID 0 timeout
[ 984.554452] nvme nvme0: starting error recovery
[ 986.962375] nvme nvme3: failed nvme_keep_alive_end_io error=10
[ 986.986898] nvme nvme3: Reconnecting in 10 seconds...
[ 1226.486740] nvme nvme3: Reconnecting in 10 seconds...
[ 1227.749980] nvme nvme0: rdma connection establishment failed (-110)
[ 1227.761593] nvme nvme0: Failed reconnect attempt 18
[ 1227.766848] nvme nvme0: Reconnecting in 10 seconds...
[ 1235.685958] nvme nvme0: ANATT timeout, resetting controller.
[ 1235.692107] nvme nvme3: ANATT timeout, resetting controller.
drivers/nvme/host/rdma.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index b87c8ae41d9b..209dd1becd6c 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -1197,8 +1197,7 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work)
struct nvme_rdma_ctrl *ctrl = container_of(work,
struct nvme_rdma_ctrl, err_work);
- nvme_stop_keep_alive(&ctrl->ctrl);
- flush_work(&ctrl->ctrl.async_event_work);
+ nvme_stop_ctrl(&ctrl->ctrl);
nvme_rdma_teardown_io_queues(ctrl, false);
nvme_start_queues(&ctrl->ctrl);
nvme_rdma_teardown_admin_queue(ctrl, false);
--
2.29.2
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [RFC] nvme-rdma: Stop queues when starting with error recovery
2022-05-23 15:21 [RFC] nvme-rdma: Stop queues when starting with error recovery Daniel Wagner
@ 2022-05-24 13:12 ` Sagi Grimberg
2022-05-24 13:13 ` Sagi Grimberg
0 siblings, 1 reply; 6+ messages in thread
From: Sagi Grimberg @ 2022-05-24 13:12 UTC (permalink / raw)
To: Daniel Wagner, linux-nvme
On 5/23/22 18:21, Daniel Wagner wrote:
> When we enter error recovery we should stop all queue activities and
> all armed timers.
>
> For example, we could arming an ANATT timer right before we enter
> error recovery but do not successfully recover before the timer
> fires. The timer is supposed only be active when the controller is in
> LIVE state hence we should call nvme_stop_ctrl when starting with the
> recover activites.
This takes me back... IIRC the main reason why the error recovery
handler in nvme-rdma (and nvme-tcp) does not start with nvme_stop_ctrl()
is because back then, nvme_stop_ctrl() could block until an
admin_timeout expired on an inflight admin command (I have a vague
memory that if cancelled/flushed the scan_work), which delayed I/O
failover, hence we just cherry-picked what needed to be stopped to make
sure I/O failover will not be blocked by anything and progress quickly.
If I look at nvme_stop_queue it does not touch the scan_work, but it
does flush ana_work which includes I/O, so it can still block on it
to fail after admin_timeout expires... also flush failfast_work is
taking the namespaces_rwsem (although read access). The fw_act_work
is probably meaningless for fabrics...
Maybe we need to have something like nvme_stop_ctrl_fast, that will just
stop the non-blocking stuff, so failover can work quickly, and then
after we have failover, we can do the full nvme_stop_ctrl() or the
reminder of nvme_stop_ctrl_fast...
thoughts?
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC] nvme-rdma: Stop queues when starting with error recovery
2022-05-24 13:12 ` Sagi Grimberg
@ 2022-05-24 13:13 ` Sagi Grimberg
2022-05-24 13:38 ` Daniel Wagner
0 siblings, 1 reply; 6+ messages in thread
From: Sagi Grimberg @ 2022-05-24 13:13 UTC (permalink / raw)
To: Daniel Wagner, linux-nvme
On 5/24/22 16:12, Sagi Grimberg wrote:
>
>
> On 5/23/22 18:21, Daniel Wagner wrote:
>> When we enter error recovery we should stop all queue activities and
>> all armed timers.
>>
>> For example, we could arming an ANATT timer right before we enter
>> error recovery but do not successfully recover before the timer
>> fires. The timer is supposed only be active when the controller is in
>> LIVE state hence we should call nvme_stop_ctrl when starting with the
>> recover activites.
>
> This takes me back... IIRC the main reason why the error recovery
> handler in nvme-rdma (and nvme-tcp) does not start with nvme_stop_ctrl()
> is because back then, nvme_stop_ctrl() could block until an
> admin_timeout expired on an inflight admin command (I have a vague
> memory that if cancelled/flushed the scan_work), which delayed I/O
> failover, hence we just cherry-picked what needed to be stopped to make
> sure I/O failover will not be blocked by anything and progress quickly.
>
> If I look at nvme_stop_queue
nvme_stop_ctrl obviously...
it does not touch the scan_work, but it
> does flush ana_work which includes I/O, so it can still block on it
> to fail after admin_timeout expires... also flush failfast_work is
> taking the namespaces_rwsem (although read access). The fw_act_work
> is probably meaningless for fabrics...
>
> Maybe we need to have something like nvme_stop_ctrl_fast, that will just
> stop the non-blocking stuff, so failover can work quickly, and then
> after we have failover, we can do the full nvme_stop_ctrl() or the
> reminder of nvme_stop_ctrl_fast...
>
> thoughts?
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC] nvme-rdma: Stop queues when starting with error recovery
2022-05-24 13:13 ` Sagi Grimberg
@ 2022-05-24 13:38 ` Daniel Wagner
2022-05-24 13:48 ` Sagi Grimberg
0 siblings, 1 reply; 6+ messages in thread
From: Daniel Wagner @ 2022-05-24 13:38 UTC (permalink / raw)
To: Sagi Grimberg; +Cc: linux-nvme
On Tue, May 24, 2022 at 04:13:28PM +0300, Sagi Grimberg wrote:
>
>
> On 5/24/22 16:12, Sagi Grimberg wrote:
> >
> >
> > On 5/23/22 18:21, Daniel Wagner wrote:
> > > When we enter error recovery we should stop all queue activities and
> > > all armed timers.
> > >
> > > For example, we could arming an ANATT timer right before we enter
> > > error recovery but do not successfully recover before the timer
> > > fires. The timer is supposed only be active when the controller is in
> > > LIVE state hence we should call nvme_stop_ctrl when starting with the
> > > recover activites.
> >
> > This takes me back... IIRC the main reason why the error recovery
> > handler in nvme-rdma (and nvme-tcp) does not start with nvme_stop_ctrl()
> > is because back then, nvme_stop_ctrl() could block until an
> > admin_timeout expired on an inflight admin command (I have a vague
> > memory that if cancelled/flushed the scan_work), which delayed I/O
> > failover, hence we just cherry-picked what needed to be stopped to make
> > sure I/O failover will not be blocked by anything and progress quickly.
> >
> > If I look at nvme_stop_queue
>
> nvme_stop_ctrl obviously...
>
> it does not touch the scan_work, but it
> > does flush ana_work which includes I/O, so it can still block on it
> > to fail after admin_timeout expires... also flush failfast_work is
> > taking the namespaces_rwsem (although read access). The fw_act_work
> > is probably meaningless for fabrics...
> >
> > Maybe we need to have something like nvme_stop_ctrl_fast, that will just
> > stop the non-blocking stuff, so failover can work quickly, and then
> > after we have failover, we can do the full nvme_stop_ctrl() or the
> > reminder of nvme_stop_ctrl_fast...
> >
> > thoughts?
I suspected that nvme_stop_ctrl() might have unwanted side
effects. Maybe factoring out the common bits from nvme_stop_ctrl() into
nvme_stop_ctrl_fast() and call it from here sounds like a reasonable
thing to do. The function name could be better, but naming is hard.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC] nvme-rdma: Stop queues when starting with error recovery
2022-05-24 13:38 ` Daniel Wagner
@ 2022-05-24 13:48 ` Sagi Grimberg
2022-05-24 15:07 ` Daniel Wagner
0 siblings, 1 reply; 6+ messages in thread
From: Sagi Grimberg @ 2022-05-24 13:48 UTC (permalink / raw)
To: Daniel Wagner; +Cc: linux-nvme
>>>> When we enter error recovery we should stop all queue activities and
>>>> all armed timers.
>>>>
>>>> For example, we could arming an ANATT timer right before we enter
>>>> error recovery but do not successfully recover before the timer
>>>> fires. The timer is supposed only be active when the controller is in
>>>> LIVE state hence we should call nvme_stop_ctrl when starting with the
>>>> recover activites.
>>>
>>> This takes me back... IIRC the main reason why the error recovery
>>> handler in nvme-rdma (and nvme-tcp) does not start with nvme_stop_ctrl()
>>> is because back then, nvme_stop_ctrl() could block until an
>>> admin_timeout expired on an inflight admin command (I have a vague
>>> memory that if cancelled/flushed the scan_work), which delayed I/O
>>> failover, hence we just cherry-picked what needed to be stopped to make
>>> sure I/O failover will not be blocked by anything and progress quickly.
>>>
>>> If I look at nvme_stop_queue
>>
>> nvme_stop_ctrl obviously...
>>
>> it does not touch the scan_work, but it
>>> does flush ana_work which includes I/O, so it can still block on it
>>> to fail after admin_timeout expires... also flush failfast_work is
>>> taking the namespaces_rwsem (although read access). The fw_act_work
>>> is probably meaningless for fabrics...
>>>
>>> Maybe we need to have something like nvme_stop_ctrl_fast, that will just
>>> stop the non-blocking stuff, so failover can work quickly, and then
>>> after we have failover, we can do the full nvme_stop_ctrl() or the
>>> reminder of nvme_stop_ctrl_fast...
>>>
>>> thoughts?
>
> I suspected that nvme_stop_ctrl() might have unwanted side
> effects. Maybe factoring out the common bits from nvme_stop_ctrl() into
> nvme_stop_ctrl_fast() and call it from here sounds like a reasonable
> thing to do. The function name could be better, but naming is hard.
The only name that comes to mind is nvme_stop_ctrl_noio, which is
arguably worse... Or maybe just __nvme_stop_ctrl.
Maybe others have better names.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC] nvme-rdma: Stop queues when starting with error recovery
2022-05-24 13:48 ` Sagi Grimberg
@ 2022-05-24 15:07 ` Daniel Wagner
0 siblings, 0 replies; 6+ messages in thread
From: Daniel Wagner @ 2022-05-24 15:07 UTC (permalink / raw)
To: Sagi Grimberg; +Cc: linux-nvme
On Tue, May 24, 2022 at 04:48:27PM +0300, Sagi Grimberg wrote:
> > I suspected that nvme_stop_ctrl() might have unwanted side
> > effects. Maybe factoring out the common bits from nvme_stop_ctrl() into
> > nvme_stop_ctrl_fast() and call it from here sounds like a reasonable
> > thing to do. The function name could be better, but naming is hard.
>
> The only name that comes to mind is nvme_stop_ctrl_noio, which is
> arguably worse... Or maybe just __nvme_stop_ctrl.
I pondered for a while and no good idea. I'd prefer __nvme_stop_ctrl()
clearly over nvme_stop_ctrl_foo() though, because I read
__nvme_stop_ctrl() as to be called by nvme_stop_ctrl().
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2022-05-24 15:07 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-05-23 15:21 [RFC] nvme-rdma: Stop queues when starting with error recovery Daniel Wagner
2022-05-24 13:12 ` Sagi Grimberg
2022-05-24 13:13 ` Sagi Grimberg
2022-05-24 13:38 ` Daniel Wagner
2022-05-24 13:48 ` Sagi Grimberg
2022-05-24 15:07 ` Daniel Wagner
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox