* [PATCH 0/2] nvme-fc: io termination cleanup
@ 2018-02-06 14:48 James Smart
2018-02-06 14:48 ` [PATCH 1/2] nvme_fc: correct abort race condition on resets James Smart
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: James Smart @ 2018-02-06 14:48 UTC (permalink / raw)
The patches address io termination issues in nvme-fc:
The first patch addresses a race condition between io termination and
the reset/delete abort paths that allowed an io to complete even though
the association teardown had counted it and was still waiting for its
completion.
The second patch cleans up the code that was synchronizing io
termination between the lldd done path and the blk-mq completion path.
The completion path existed because, at the time, the eh_handler
could return a successful status, allowing the complete routine
to be called while the io was still active in the lldd. Since then,
the eh_handler has been corrected to reschedule the io timer after
initiating the abort, so this condition no longer occurs.
James Smart (2):
nvme_fc: correct abort race condition on resets
nvme_fc: cleanup io completion
drivers/nvme/host/fc.c | 143 ++++++++++---------------------------------------
1 file changed, 29 insertions(+), 114 deletions(-)
--
2.13.1
* [PATCH 1/2] nvme_fc: correct abort race condition on resets
2018-02-06 14:48 [PATCH 0/2] nvme-fc: io termination cleanup James Smart
@ 2018-02-06 14:48 ` James Smart
2018-02-06 15:04 ` Johannes Thumshirn
2018-02-06 14:48 ` [PATCH 2/2] nvme_fc: cleanup io completion James Smart
2018-02-11 8:48 ` [PATCH 0/2] nvme-fc: io termination cleanup Sagi Grimberg
2 siblings, 1 reply; 7+ messages in thread
From: James Smart @ 2018-02-06 14:48 UTC (permalink / raw)
During reset handling, live io may still be completing while the reset
is taking place. The reset path attempts to abort all outstanding io,
counting the number of ios that were aborted. It then waits for those
ios to be reclaimed from the lldd before continuing.
The transport's logic for io state and flag setting was weak, allowing
ios to complete simultaneously with the abort request. The completed ios
were counted, but as the completion had already occurred, the
completion never decremented the count. As the count never reached
zero, the reset/delete never completed.
Tighten this up by unconditionally changing the op state to completed
when the io done handler is called. The reset/abort path now changes
the op state to aborted, but the abort only continues if the op
state was previously live. If complete, the abort is backed out.
Thus proper counting of io aborts and their completions works
again.
Also removed the TERMIO state on the op as it's redundant with the
op's aborted state.
Signed-off-by: James Smart <james.smart@broadcom.com>
---
drivers/nvme/host/fc.c | 98 ++++++++++++++------------------------------------
1 file changed, 26 insertions(+), 72 deletions(-)
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 99bf51c7e513..a8437a7adea6 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -1514,13 +1514,19 @@ nvme_fc_exit_request(struct blk_mq_tag_set *set, struct request *rq,
static int
__nvme_fc_abort_op(struct nvme_fc_ctrl *ctrl, struct nvme_fc_fcp_op *op)
{
- int state;
+ unsigned long flags;
+ int opstate;
+
+ spin_lock_irqsave(&ctrl->lock, flags);
+ opstate = atomic_xchg(&op->state, FCPOP_STATE_ABORTED);
+ if (opstate != FCPOP_STATE_ACTIVE)
+ atomic_set(&op->state, opstate);
+ else if (ctrl->flags & FCCTRL_TERMIO)
+ ctrl->iocnt++;
+ spin_unlock_irqrestore(&ctrl->lock, flags);
- state = atomic_xchg(&op->state, FCPOP_STATE_ABORTED);
- if (state != FCPOP_STATE_ACTIVE) {
- atomic_set(&op->state, state);
+ if (opstate != FCPOP_STATE_ACTIVE)
return -ECANCELED;
- }
ctrl->lport->ops->fcp_abort(&ctrl->lport->localport,
&ctrl->rport->remoteport,
@@ -1534,52 +1540,23 @@ static void
nvme_fc_abort_aen_ops(struct nvme_fc_ctrl *ctrl)
{
struct nvme_fc_fcp_op *aen_op = ctrl->aen_ops;
- unsigned long flags;
- int i, ret;
-
- for (i = 0; i < NVME_NR_AEN_COMMANDS; i++, aen_op++) {
- if (atomic_read(&aen_op->state) != FCPOP_STATE_ACTIVE)
- continue;
-
- spin_lock_irqsave(&ctrl->lock, flags);
- if (ctrl->flags & FCCTRL_TERMIO) {
- ctrl->iocnt++;
- aen_op->flags |= FCOP_FLAGS_TERMIO;
- }
- spin_unlock_irqrestore(&ctrl->lock, flags);
-
- ret = __nvme_fc_abort_op(ctrl, aen_op);
- if (ret) {
- /*
- * if __nvme_fc_abort_op failed the io wasn't
- * active. Thus this call path is running in
- * parallel to the io complete. Treat as non-error.
- */
+ int i;
- /* back out the flags/counters */
- spin_lock_irqsave(&ctrl->lock, flags);
- if (ctrl->flags & FCCTRL_TERMIO)
- ctrl->iocnt--;
- aen_op->flags &= ~FCOP_FLAGS_TERMIO;
- spin_unlock_irqrestore(&ctrl->lock, flags);
- return;
- }
- }
+ for (i = 0; i < NVME_NR_AEN_COMMANDS; i++, aen_op++)
+ __nvme_fc_abort_op(ctrl, aen_op);
}
static inline int
__nvme_fc_fcpop_chk_teardowns(struct nvme_fc_ctrl *ctrl,
- struct nvme_fc_fcp_op *op)
+ struct nvme_fc_fcp_op *op, int opstate)
{
unsigned long flags;
bool complete_rq = false;
spin_lock_irqsave(&ctrl->lock, flags);
- if (unlikely(op->flags & FCOP_FLAGS_TERMIO)) {
- if (ctrl->flags & FCCTRL_TERMIO) {
- if (!--ctrl->iocnt)
- wake_up(&ctrl->ioabort_wait);
- }
+ if (opstate == FCPOP_STATE_ABORTED && ctrl->flags & FCCTRL_TERMIO) {
+ if (!--ctrl->iocnt)
+ wake_up(&ctrl->ioabort_wait);
}
if (op->flags & FCOP_FLAGS_RELEASED)
complete_rq = true;
@@ -1603,6 +1580,7 @@ nvme_fc_fcpio_done(struct nvmefc_fcp_req *req)
__le16 status = cpu_to_le16(NVME_SC_SUCCESS << 1);
union nvme_result result;
bool terminate_assoc = true;
+ int opstate;
/*
* WARNING:
@@ -1641,11 +1619,12 @@ nvme_fc_fcpio_done(struct nvmefc_fcp_req *req)
* association to be terminated.
*/
+ opstate = atomic_xchg(&op->state, FCPOP_STATE_COMPLETE);
+
fc_dma_sync_single_for_cpu(ctrl->lport->dev, op->fcp_req.rspdma,
sizeof(op->rsp_iu), DMA_FROM_DEVICE);
- if (atomic_read(&op->state) == FCPOP_STATE_ABORTED ||
- op->flags & FCOP_FLAGS_TERMIO)
+ if (opstate == FCPOP_STATE_ABORTED)
status = cpu_to_le16(NVME_SC_ABORT_REQ << 1);
else if (freq->status)
status = cpu_to_le16(NVME_SC_INTERNAL << 1);
@@ -1710,7 +1689,7 @@ nvme_fc_fcpio_done(struct nvmefc_fcp_req *req)
done:
if (op->flags & FCOP_FLAGS_AEN) {
nvme_complete_async_event(&queue->ctrl->ctrl, status, &result);
- __nvme_fc_fcpop_chk_teardowns(ctrl, op);
+ __nvme_fc_fcpop_chk_teardowns(ctrl, op, opstate);
atomic_set(&op->state, FCPOP_STATE_IDLE);
op->flags = FCOP_FLAGS_AEN; /* clear other flags */
nvme_fc_ctrl_put(ctrl);
@@ -1727,7 +1706,7 @@ nvme_fc_fcpio_done(struct nvmefc_fcp_req *req)
ctrl->ctrl.state == NVME_CTRL_RECONNECTING))
status |= cpu_to_le16(NVME_SC_DNR << 1);
- if (__nvme_fc_fcpop_chk_teardowns(ctrl, op))
+ if (__nvme_fc_fcpop_chk_teardowns(ctrl, op, opstate))
__nvme_fc_final_op_cleanup(rq);
else
nvme_end_request(rq, status, result);
@@ -2429,8 +2408,7 @@ __nvme_fc_final_op_cleanup(struct request *rq)
struct nvme_fc_ctrl *ctrl = op->ctrl;
atomic_set(&op->state, FCPOP_STATE_IDLE);
- op->flags &= ~(FCOP_FLAGS_TERMIO | FCOP_FLAGS_RELEASED |
- FCOP_FLAGS_COMPLETE);
+ op->flags &= ~(FCOP_FLAGS_RELEASED | FCOP_FLAGS_COMPLETE);
nvme_fc_unmap_data(ctrl, rq, op);
nvme_complete_rq(rq);
@@ -2484,35 +2462,11 @@ nvme_fc_terminate_exchange(struct request *req, void *data, bool reserved)
struct nvme_ctrl *nctrl = data;
struct nvme_fc_ctrl *ctrl = to_fc_ctrl(nctrl);
struct nvme_fc_fcp_op *op = blk_mq_rq_to_pdu(req);
- unsigned long flags;
- int status;
if (!blk_mq_request_started(req))
return;
- spin_lock_irqsave(&ctrl->lock, flags);
- if (ctrl->flags & FCCTRL_TERMIO) {
- ctrl->iocnt++;
- op->flags |= FCOP_FLAGS_TERMIO;
- }
- spin_unlock_irqrestore(&ctrl->lock, flags);
-
- status = __nvme_fc_abort_op(ctrl, op);
- if (status) {
- /*
- * if __nvme_fc_abort_op failed the io wasn't
- * active. Thus this call path is running in
- * parallel to the io complete. Treat as non-error.
- */
-
- /* back out the flags/counters */
- spin_lock_irqsave(&ctrl->lock, flags);
- if (ctrl->flags & FCCTRL_TERMIO)
- ctrl->iocnt--;
- op->flags &= ~FCOP_FLAGS_TERMIO;
- spin_unlock_irqrestore(&ctrl->lock, flags);
- return;
- }
+ __nvme_fc_abort_op(ctrl, op);
}
--
2.13.1
* [PATCH 2/2] nvme_fc: cleanup io completion
2018-02-06 14:48 [PATCH 0/2] nvme-fc: io termination cleanup James Smart
2018-02-06 14:48 ` [PATCH 1/2] nvme_fc: correct abort race condition on resets James Smart
@ 2018-02-06 14:48 ` James Smart
2018-02-06 15:25 ` Johannes Thumshirn
2018-02-11 8:48 ` [PATCH 0/2] nvme-fc: io termination cleanup Sagi Grimberg
2 siblings, 1 reply; 7+ messages in thread
From: James Smart @ 2018-02-06 14:48 UTC (permalink / raw)
There was some old cold that dealt with complete_rq being called
prior to the lldd returning the io completion. This is garbage code.
The complete_rq routine was being called after eh_timeouts fired,
due to eh_timeouts not being handled properly.
The timeouts were fixed in prior patches so that, in general, a
timeout initiates an abort and the request timer is restarted, as
the abort operation will take care of completing things. With the
timer restarted, the erroneous complete_rq calls were eliminated.
So remove the work that was synchronizing complete_rq with io
completion.
Signed-off-by: James Smart <james.smart@broadcom.com>
---
drivers/nvme/host/fc.c | 63 ++++++++++----------------------------------------
1 file changed, 12 insertions(+), 51 deletions(-)
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index a8437a7adea6..bd72e8e4dfb7 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -57,9 +57,7 @@ struct nvme_fc_queue {
enum nvme_fcop_flags {
FCOP_FLAGS_TERMIO = (1 << 0),
- FCOP_FLAGS_RELEASED = (1 << 1),
- FCOP_FLAGS_COMPLETE = (1 << 2),
- FCOP_FLAGS_AEN = (1 << 3),
+ FCOP_FLAGS_AEN = (1 << 1),
};
struct nvmefc_ls_req_op {
@@ -1472,7 +1470,6 @@ nvme_fc_xmt_disconnect_assoc(struct nvme_fc_ctrl *ctrl)
/* *********************** NVME Ctrl Routines **************************** */
-static void __nvme_fc_final_op_cleanup(struct request *rq);
static void nvme_fc_error_recovery(struct nvme_fc_ctrl *ctrl, char *errmsg);
static int
@@ -1546,25 +1543,20 @@ nvme_fc_abort_aen_ops(struct nvme_fc_ctrl *ctrl)
__nvme_fc_abort_op(ctrl, aen_op);
}
-static inline int
+static inline void
__nvme_fc_fcpop_chk_teardowns(struct nvme_fc_ctrl *ctrl,
struct nvme_fc_fcp_op *op, int opstate)
{
unsigned long flags;
- bool complete_rq = false;
- spin_lock_irqsave(&ctrl->lock, flags);
- if (opstate == FCPOP_STATE_ABORTED && ctrl->flags & FCCTRL_TERMIO) {
- if (!--ctrl->iocnt)
- wake_up(&ctrl->ioabort_wait);
+ if (opstate == FCPOP_STATE_ABORTED) {
+ spin_lock_irqsave(&ctrl->lock, flags);
+ if (ctrl->flags & FCCTRL_TERMIO) {
+ if (!--ctrl->iocnt)
+ wake_up(&ctrl->ioabort_wait);
+ }
+ spin_unlock_irqrestore(&ctrl->lock, flags);
}
- if (op->flags & FCOP_FLAGS_RELEASED)
- complete_rq = true;
- else
- op->flags |= FCOP_FLAGS_COMPLETE;
- spin_unlock_irqrestore(&ctrl->lock, flags);
-
- return complete_rq;
}
static void
@@ -1706,10 +1698,8 @@ nvme_fc_fcpio_done(struct nvmefc_fcp_req *req)
ctrl->ctrl.state == NVME_CTRL_RECONNECTING))
status |= cpu_to_le16(NVME_SC_DNR << 1);
- if (__nvme_fc_fcpop_chk_teardowns(ctrl, op, opstate))
- __nvme_fc_final_op_cleanup(rq);
- else
- nvme_end_request(rq, status, result);
+ __nvme_fc_fcpop_chk_teardowns(ctrl, op, opstate);
+ nvme_end_request(rq, status, result);
check_error:
if (terminate_assoc)
@@ -2402,45 +2392,16 @@ nvme_fc_submit_async_event(struct nvme_ctrl *arg)
}
static void
-__nvme_fc_final_op_cleanup(struct request *rq)
+nvme_fc_complete_rq(struct request *rq)
{
struct nvme_fc_fcp_op *op = blk_mq_rq_to_pdu(rq);
struct nvme_fc_ctrl *ctrl = op->ctrl;
atomic_set(&op->state, FCPOP_STATE_IDLE);
- op->flags &= ~(FCOP_FLAGS_RELEASED | FCOP_FLAGS_COMPLETE);
nvme_fc_unmap_data(ctrl, rq, op);
nvme_complete_rq(rq);
nvme_fc_ctrl_put(ctrl);
-
-}
-
-static void
-nvme_fc_complete_rq(struct request *rq)
-{
- struct nvme_fc_fcp_op *op = blk_mq_rq_to_pdu(rq);
- struct nvme_fc_ctrl *ctrl = op->ctrl;
- unsigned long flags;
- bool completed = false;
-
- /*
- * the core layer, on controller resets after calling
- * nvme_shutdown_ctrl(), calls complete_rq without our
- * calling blk_mq_complete_request(), thus there may still
- * be live i/o outstanding with the LLDD. Means transport has
- * to track complete calls vs fcpio_done calls to know what
- * path to take on completes and dones.
- */
- spin_lock_irqsave(&ctrl->lock, flags);
- if (op->flags & FCOP_FLAGS_COMPLETE)
- completed = true;
- else
- op->flags |= FCOP_FLAGS_RELEASED;
- spin_unlock_irqrestore(&ctrl->lock, flags);
-
- if (completed)
- __nvme_fc_final_op_cleanup(rq);
}
/*
--
2.13.1
* [PATCH 1/2] nvme_fc: correct abort race condition on resets
2018-02-06 14:48 ` [PATCH 1/2] nvme_fc: correct abort race condition on resets James Smart
@ 2018-02-06 15:04 ` Johannes Thumshirn
0 siblings, 0 replies; 7+ messages in thread
From: Johannes Thumshirn @ 2018-02-06 15:04 UTC (permalink / raw)
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
--
Johannes Thumshirn Storage
jthumshirn@suse.de +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
* [PATCH 2/2] nvme_fc: cleanup io completion
2018-02-06 14:48 ` [PATCH 2/2] nvme_fc: cleanup io completion James Smart
@ 2018-02-06 15:25 ` Johannes Thumshirn
0 siblings, 0 replies; 7+ messages in thread
From: Johannes Thumshirn @ 2018-02-06 15:25 UTC (permalink / raw)
On Tue, Feb 06, 2018 at 06:48:30AM -0800, James Smart wrote:
> There was some old cold that dealt with complete_rq being called
code? ^
Otherwise,
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
* [PATCH 0/2] nvme-fc: io termination cleanup
2018-02-06 14:48 [PATCH 0/2] nvme-fc: io termination cleanup James Smart
2018-02-06 14:48 ` [PATCH 1/2] nvme_fc: correct abort race condition on resets James Smart
2018-02-06 14:48 ` [PATCH 2/2] nvme_fc: cleanup io completion James Smart
@ 2018-02-11 8:48 ` Sagi Grimberg
2018-02-12 16:28 ` James Smart
2 siblings, 1 reply; 7+ messages in thread
From: Sagi Grimberg @ 2018-02-11 8:48 UTC (permalink / raw)
> The patches address io termination issues in nvme-fc:
>
> The first patch addresses a race condition between io termination and
> the reset/delete abort paths that allowed an io to complete even though
> the association teardown had counted it and was still waiting for its
> completion.
>
> The second patch cleans up the code that was synchronizing io
> termination between the lldd done path and the blk-mq completion path.
> The completion path existed because, at the time, the eh_handler
> could return a successful status, allowing the complete routine
> to be called while the io was still active in the lldd. Since then,
> the eh_handler has been corrected to reschedule the io timer after
> initiating the abort, so this condition no longer occurs.
Hi James,
Picked up for 4.16-rc; there was a slight conflict due to Max's ctrl
state change. Please verify it's ok.
* [PATCH 0/2] nvme-fc: io termination cleanup
2018-02-11 8:48 ` [PATCH 0/2] nvme-fc: io termination cleanup Sagi Grimberg
@ 2018-02-12 16:28 ` James Smart
0 siblings, 0 replies; 7+ messages in thread
From: James Smart @ 2018-02-12 16:28 UTC (permalink / raw)
Looks good. I was more interested in the blk-mq resource busy addition
that was merged as well. It looks fine too, but I'll need to test it a bit.
Thanks
-- james
On 2/11/2018 12:48 AM, Sagi Grimberg wrote:
>
>> The patches address io termination issues in nvme-fc:
>>
>> The first patch addresses a race condition between io termination and
>> the reset/delete abort paths that allowed an io to complete even though
>> the association teardown had counted it and was still waiting for its
>> completion.
>>
>> The second patch cleans up the code that was synchronizing io
>> termination between the lldd done path and the blk-mq completion path.
>> The completion path existed because, at the time, the eh_handler
>> could return a successful status, allowing the complete routine
>> to be called while the io was still active in the lldd. Since then,
>> the eh_handler has been corrected to reschedule the io timer after
>> initiating the abort, so this condition no longer occurs.
>
> Hi James,
>
> Picked up for 4.16-rc; there was a slight conflict due to Max's ctrl
> state change. Please verify it's ok.
>