* [PATCH v5 1/3] blk-mq: introduce blk_mq_set_request_complete
2021-02-01 3:49 [PATCH v5 0/3] avoid double request completion and IO error Chao Leng
@ 2021-02-01 3:49 ` Chao Leng
2021-02-04 15:27 ` Jens Axboe
2021-02-01 3:49 ` [PATCH v5 2/3] nvme-fabrics: avoid double request completion for nvmf_fail_nonready_command Chao Leng
` (2 subsequent siblings)
3 siblings, 1 reply; 14+ messages in thread
From: Chao Leng @ 2021-02-01 3:49 UTC (permalink / raw)
To: linux-nvme; +Cc: kbusch, axboe, hch, sagi, linux-block, axboe
nvme drivers need to set the state of a request to MQ_RQ_COMPLETE when
completing the request directly in queue_rq.
So add blk_mq_set_request_complete.
Signed-off-by: Chao Leng <lengchao@huawei.com>
---
include/linux/blk-mq.h | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 47b021952ac7..38c632ce2270 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -494,6 +494,15 @@ static inline int blk_mq_request_completed(struct request *rq)
return blk_mq_rq_state(rq) == MQ_RQ_COMPLETE;
}
+/*
+ * If an nvme driver completes a request directly, it needs to set the
+ * state to complete to avoid canceling the request again.
+ */
+static inline void blk_mq_set_request_complete(struct request *rq)
+{
+ WRITE_ONCE(rq->state, MQ_RQ_COMPLETE);
+}
+
void blk_mq_start_request(struct request *rq);
void blk_mq_end_request(struct request *rq, blk_status_t error);
void __blk_mq_end_request(struct request *rq, blk_status_t error);
--
2.16.4
^ permalink raw reply related [flat|nested] 14+ messages in thread

* Re: [PATCH v5 1/3] blk-mq: introduce blk_mq_set_request_complete
2021-02-01 3:49 ` [PATCH v5 1/3] blk-mq: introduce blk_mq_set_request_complete Chao Leng
@ 2021-02-04 15:27 ` Jens Axboe
2021-02-04 15:30 ` Christoph Hellwig
0 siblings, 1 reply; 14+ messages in thread
From: Jens Axboe @ 2021-02-04 15:27 UTC (permalink / raw)
To: Chao Leng, linux-nvme; +Cc: kbusch, hch, sagi, linux-block, axboe
On 1/31/21 8:49 PM, Chao Leng wrote:
> nvme drivers need to set the state of a request to MQ_RQ_COMPLETE when
> completing the request directly in queue_rq.
> So add blk_mq_set_request_complete.
This is a bit iffy, but looks ok for the intended use case. We just have
to be careful NOT to add frivolous users of it...
--
Jens Axboe
* Re: [PATCH v5 1/3] blk-mq: introduce blk_mq_set_request_complete
2021-02-04 15:27 ` Jens Axboe
@ 2021-02-04 15:30 ` Christoph Hellwig
2021-02-04 15:32 ` Jens Axboe
0 siblings, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2021-02-04 15:30 UTC (permalink / raw)
To: Jens Axboe; +Cc: Chao Leng, linux-nvme, kbusch, hch, sagi, linux-block, axboe
On Thu, Feb 04, 2021 at 08:27:46AM -0700, Jens Axboe wrote:
> On 1/31/21 8:49 PM, Chao Leng wrote:
> > nvme drivers need to set the state of a request to MQ_RQ_COMPLETE when
> > completing the request directly in queue_rq.
> > So add blk_mq_set_request_complete.
>
> This is a bit iffy, but looks ok for the intended use case. We just have
> to be careful NOT to add frivolous users of it...
Yes, that is my main issue with it. The current use case looks fine,
but I suspect every time someone else uses it it's probably going to be
a misuse. I've applied this to nvme-5.12 with the rest of the patches;
it should be on its way to you soon.
* Re: [PATCH v5 1/3] blk-mq: introduce blk_mq_set_request_complete
2021-02-04 15:30 ` Christoph Hellwig
@ 2021-02-04 15:32 ` Jens Axboe
2021-02-04 15:36 ` Christoph Hellwig
0 siblings, 1 reply; 14+ messages in thread
From: Jens Axboe @ 2021-02-04 15:32 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Chao Leng, linux-nvme, kbusch, sagi, linux-block, axboe
On 2/4/21 8:30 AM, Christoph Hellwig wrote:
> On Thu, Feb 04, 2021 at 08:27:46AM -0700, Jens Axboe wrote:
>> On 1/31/21 8:49 PM, Chao Leng wrote:
>>> nvme drivers need to set the state of a request to MQ_RQ_COMPLETE when
>>> completing the request directly in queue_rq.
>>> So add blk_mq_set_request_complete.
>>
>> This is a bit iffy, but looks ok for the intended use case. We just have
>> to be careful NOT to add frivolous users of it...
>
> Yes, that is my main issue with it. The current use case looks fine,
> but I suspect every time someone else uses it it's probably going to be
> a misuse. I've applied this to nvme-5.12 with the rest of the patches,
> it should be on its way to you soon.
Might benefit from a big fat comment on why you probably shouldn't
be using it...
--
Jens Axboe
* Re: [PATCH v5 1/3] blk-mq: introduce blk_mq_set_request_complete
2021-02-04 15:32 ` Jens Axboe
@ 2021-02-04 15:36 ` Christoph Hellwig
2021-02-04 15:41 ` Jens Axboe
0 siblings, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2021-02-04 15:36 UTC (permalink / raw)
To: Jens Axboe
Cc: Christoph Hellwig, Chao Leng, linux-nvme, kbusch, sagi,
linux-block, axboe
On Thu, Feb 04, 2021 at 08:32:17AM -0700, Jens Axboe wrote:
> On 2/4/21 8:30 AM, Christoph Hellwig wrote:
> > On Thu, Feb 04, 2021 at 08:27:46AM -0700, Jens Axboe wrote:
> >> On 1/31/21 8:49 PM, Chao Leng wrote:
> >>> nvme drivers need to set the state of a request to MQ_RQ_COMPLETE when
> >>> completing the request directly in queue_rq.
> >>> So add blk_mq_set_request_complete.
> >>
> >> This is a bit iffy, but looks ok for the intended use case. We just have
> >> to be careful NOT to add frivolous users of it...
> >
> > Yes, that is my main issue with it. The current use case looks fine,
> > but I suspect every time someone else uses it it's probably going to be
> > a misuse. I've applied this to nvme-5.12 with the rest of the patches,
> > it should be on its way to you soon.
>
> Might benefit from a big fat comment on why you probably shouldn't
> be using it...
I actually reworded the comment a bit, this is the current version:
http://git.infradead.org/nvme.git/commitdiff/bdd2a4a61460a341c1f8462e5caf63195e960623
* Re: [PATCH v5 1/3] blk-mq: introduce blk_mq_set_request_complete
2021-02-04 15:36 ` Christoph Hellwig
@ 2021-02-04 15:41 ` Jens Axboe
0 siblings, 0 replies; 14+ messages in thread
From: Jens Axboe @ 2021-02-04 15:41 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Chao Leng, linux-nvme, kbusch, sagi, linux-block, axboe
On 2/4/21 8:36 AM, Christoph Hellwig wrote:
> On Thu, Feb 04, 2021 at 08:32:17AM -0700, Jens Axboe wrote:
>> On 2/4/21 8:30 AM, Christoph Hellwig wrote:
>>> On Thu, Feb 04, 2021 at 08:27:46AM -0700, Jens Axboe wrote:
>>>> On 1/31/21 8:49 PM, Chao Leng wrote:
>>>>> nvme drivers need to set the state of a request to MQ_RQ_COMPLETE when
>>>>> completing the request directly in queue_rq.
>>>>> So add blk_mq_set_request_complete.
>>>>
>>>> This is a bit iffy, but looks ok for the intended use case. We just have
>>>> to be careful NOT to add frivolous users of it...
>>>
>>> Yes, that is my main issue with it. The current use case looks fine,
>>> but I suspect every time someone else uses it it's probably going to be
>>> a misuse. I've applied this to nvme-5.12 with the rest of the patches,
>>> it should be on its way to you soon.
>>
>> Might benefit from a big fat comment on why you probably shouldn't
>> be using it...
>
> I actually reworded the comment a bit, this is the current version:
>
> http://git.infradead.org/nvme.git/commitdiff/bdd2a4a61460a341c1f8462e5caf63195e960623
Alright, looks fine, as it clearly states it should be used from
->queue_rq().
--
Jens Axboe
* [PATCH v5 2/3] nvme-fabrics: avoid double request completion for nvmf_fail_nonready_command
2021-02-01 3:49 [PATCH v5 0/3] avoid double request completion and IO error Chao Leng
2021-02-01 3:49 ` [PATCH v5 1/3] blk-mq: introduce blk_mq_set_request_complete Chao Leng
@ 2021-02-01 3:49 ` Chao Leng
2021-02-01 3:49 ` [PATCH v5 3/3] nvme-rdma: avoid IO error for nvme native multipath Chao Leng
2021-02-03 16:14 ` [PATCH v5 0/3] avoid double request completion and IO error Christoph Hellwig
3 siblings, 0 replies; 14+ messages in thread
From: Chao Leng @ 2021-02-01 3:49 UTC (permalink / raw)
To: linux-nvme; +Cc: kbusch, axboe, hch, sagi, linux-block, axboe
During reconnect, a request may be completed with NVME_SC_HOST_PATH_ERROR
in nvmf_fail_nonready_command. The state of the request is changed to
MQ_RQ_IN_FLIGHT before nvme_complete_rq is called. If the request is freed
asynchronously, such as in nvme_submit_user_cmd, then in extreme scenarios
the request may be completed again in the teardown process.
nvmf_fail_nonready_command does not need to call blk_mq_start_request
before completing the request; instead, it should set the state of the
request to MQ_RQ_COMPLETE before completing it.
Signed-off-by: Chao Leng <lengchao@huawei.com>
---
drivers/nvme/host/fabrics.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
index 72ac00173500..cedf9b318986 100644
--- a/drivers/nvme/host/fabrics.c
+++ b/drivers/nvme/host/fabrics.c
@@ -554,7 +554,7 @@ blk_status_t nvmf_fail_nonready_command(struct nvme_ctrl *ctrl,
return BLK_STS_RESOURCE;
nvme_req(rq)->status = NVME_SC_HOST_PATH_ERROR;
- blk_mq_start_request(rq);
+ blk_mq_set_request_complete(rq);
nvme_complete_rq(rq);
return BLK_STS_OK;
}
--
2.16.4
* [PATCH v5 3/3] nvme-rdma: avoid IO error for nvme native multipath
2021-02-01 3:49 [PATCH v5 0/3] avoid double request completion and IO error Chao Leng
2021-02-01 3:49 ` [PATCH v5 1/3] blk-mq: introduce blk_mq_set_request_complete Chao Leng
2021-02-01 3:49 ` [PATCH v5 2/3] nvme-fabrics: avoid double request completion for nvmf_fail_nonready_command Chao Leng
@ 2021-02-01 3:49 ` Chao Leng
2021-02-03 16:14 ` [PATCH v5 0/3] avoid double request completion and IO error Christoph Hellwig
3 siblings, 0 replies; 14+ messages in thread
From: Chao Leng @ 2021-02-01 3:49 UTC (permalink / raw)
To: linux-nvme; +Cc: kbusch, axboe, hch, sagi, linux-block, axboe
With nvme native multipath, if a path-related error occurs when queue_rq
calls the HBA driver to send a request, queue_rq returns BLK_STS_IOERR to
blk-mq. The request is then completed with BLK_STS_IOERR instead of
failing over to a retry.
Instead, queue_rq should complete the request with NVME_SC_HOST_PATH_ERROR
and set the state of the request to MQ_RQ_COMPLETE; the request will then
fail over to a retry if needed.
Signed-off-by: Chao Leng <lengchao@huawei.com>
---
drivers/nvme/host/rdma.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index b7ce4f221d99..5fc113dd3302 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -2084,8 +2084,19 @@ static blk_status_t nvme_rdma_queue_rq(struct blk_mq_hw_ctx *hctx,
err = nvme_rdma_post_send(queue, sqe, req->sge, req->num_sge,
req->mr ? &req->reg_wr.wr : NULL);
- if (unlikely(err))
+ if (unlikely(err)) {
+ if (err == -EIO) {
+ /*
+ * Fail the request so upper layer can failover I/O
+ * if another path is available
+ */
+ req->status = NVME_SC_HOST_PATH_ERROR;
+ blk_mq_set_request_complete(rq);
+ nvme_rdma_complete_rq(rq);
+ return BLK_STS_OK;
+ }
goto err_unmap;
+ }
return BLK_STS_OK;
--
2.16.4
* Re: [PATCH v5 0/3] avoid double request completion and IO error
2021-02-01 3:49 [PATCH v5 0/3] avoid double request completion and IO error Chao Leng
` (2 preceding siblings ...)
2021-02-01 3:49 ` [PATCH v5 3/3] nvme-rdma: avoid IO error for nvme native multipath Chao Leng
@ 2021-02-03 16:14 ` Christoph Hellwig
2021-02-03 22:22 ` Sagi Grimberg
2021-02-04 5:56 ` Chao Leng
3 siblings, 2 replies; 14+ messages in thread
From: Christoph Hellwig @ 2021-02-03 16:14 UTC (permalink / raw)
To: Chao Leng; +Cc: linux-nvme, kbusch, axboe, hch, sagi, linux-block, axboe
So I think this is conceptually fine, but I still find the API a little
arcane. What do you think about the following incremental patch?
If that looks good and tests good for you I can apply the series with
the modifications:
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 0befaad788a094..02579f4f776c7d 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -355,6 +355,21 @@ void nvme_complete_rq(struct request *req)
}
EXPORT_SYMBOL_GPL(nvme_complete_rq);
+/*
+ * Called to unwind from ->queue_rq on a failed command submission so that the
+ * multipathing code gets called to potentially failover to another path.
+ * The caller needs to unwind all transport specific resource allocations and
+ * must propagate the return value.
+ */
+blk_status_t nvme_host_path_error(struct request *req)
+{
+ nvme_req(req)->status = NVME_SC_HOST_PATH_ERROR;
+ blk_mq_set_request_complete(req);
+ nvme_complete_rq(req);
+ return BLK_STS_OK;
+}
+EXPORT_SYMBOL_GPL(nvme_host_path_error);
+
bool nvme_cancel_request(struct request *req, void *data, bool reserved)
{
dev_dbg_ratelimited(((struct nvme_ctrl *) data)->device,
diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
index cedf9b31898673..5dfd806fc2d28c 100644
--- a/drivers/nvme/host/fabrics.c
+++ b/drivers/nvme/host/fabrics.c
@@ -552,11 +552,7 @@ blk_status_t nvmf_fail_nonready_command(struct nvme_ctrl *ctrl,
!test_bit(NVME_CTRL_FAILFAST_EXPIRED, &ctrl->flags) &&
!blk_noretry_request(rq) && !(rq->cmd_flags & REQ_NVME_MPATH))
return BLK_STS_RESOURCE;
-
- nvme_req(rq)->status = NVME_SC_HOST_PATH_ERROR;
- blk_mq_set_request_complete(rq);
- nvme_complete_rq(rq);
- return BLK_STS_OK;
+ return nvme_host_path_error(rq);
}
EXPORT_SYMBOL_GPL(nvmf_fail_nonready_command);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index a72f0718109100..5819f038104149 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -575,6 +575,7 @@ static inline bool nvme_is_aen_req(u16 qid, __u16 command_id)
}
void nvme_complete_rq(struct request *req);
+blk_status_t nvme_host_path_error(struct request *req);
bool nvme_cancel_request(struct request *req, void *data, bool reserved);
void nvme_cancel_tagset(struct nvme_ctrl *ctrl);
void nvme_cancel_admin_tagset(struct nvme_ctrl *ctrl);
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 6993efb27b39f0..f51af5e4970a2b 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -2091,16 +2091,6 @@ static blk_status_t nvme_rdma_queue_rq(struct blk_mq_hw_ctx *hctx,
err = nvme_rdma_post_send(queue, sqe, req->sge, req->num_sge,
req->mr ? &req->reg_wr.wr : NULL);
if (unlikely(err)) {
- if (err == -EIO) {
- /*
- * Fail the request so upper layer can failover I/O
- * if another path is available
- */
- req->status = NVME_SC_HOST_PATH_ERROR;
- blk_mq_set_request_complete(rq);
- nvme_rdma_complete_rq(rq);
- return BLK_STS_OK;
- }
goto err_unmap;
}
@@ -2109,7 +2099,9 @@ static blk_status_t nvme_rdma_queue_rq(struct blk_mq_hw_ctx *hctx,
err_unmap:
nvme_rdma_unmap_data(queue, rq);
err:
- if (err == -ENOMEM || err == -EAGAIN)
+ if (err == -EIO)
+ ret = nvme_host_path_error(rq);
+ else if (err == -ENOMEM || err == -EAGAIN)
ret = BLK_STS_RESOURCE;
else
ret = BLK_STS_IOERR;
* Re: [PATCH v5 0/3] avoid double request completion and IO error
2021-02-03 16:14 ` [PATCH v5 0/3] avoid double request completion and IO error Christoph Hellwig
@ 2021-02-03 22:22 ` Sagi Grimberg
2021-02-04 5:56 ` Chao Leng
1 sibling, 0 replies; 14+ messages in thread
From: Sagi Grimberg @ 2021-02-03 22:22 UTC (permalink / raw)
To: Christoph Hellwig, Chao Leng
Cc: linux-nvme, kbusch, axboe, linux-block, axboe
On 2/3/21 8:14 AM, Christoph Hellwig wrote:
> So I think this is conceptually fine, but I still find the API a little
> arcane. What do you think about the following incremental patch?
> If that looks good and tests good for you I can apply the series with
> the modifications:
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 0befaad788a094..02579f4f776c7d 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -355,6 +355,21 @@ void nvme_complete_rq(struct request *req)
> }
> EXPORT_SYMBOL_GPL(nvme_complete_rq);
>
> +/*
> + * Called to unwind from ->queue_rq on a failed command submission so that the
> + * multipathing code gets called to potentially failover to another path.
> + * The caller needs to unwind all transport specific resource allocations and
> + * must propagate the return value.
> + */
> +blk_status_t nvme_host_path_error(struct request *req)
> +{
> + nvme_req(req)->status = NVME_SC_HOST_PATH_ERROR;
> + blk_mq_set_request_complete(req);
> + nvme_complete_rq(req);
> + return BLK_STS_OK;
> +}
> +EXPORT_SYMBOL_GPL(nvme_host_path_error);
> +
> bool nvme_cancel_request(struct request *req, void *data, bool reserved)
> {
> dev_dbg_ratelimited(((struct nvme_ctrl *) data)->device,
> diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
> index cedf9b31898673..5dfd806fc2d28c 100644
> --- a/drivers/nvme/host/fabrics.c
> +++ b/drivers/nvme/host/fabrics.c
> @@ -552,11 +552,7 @@ blk_status_t nvmf_fail_nonready_command(struct nvme_ctrl *ctrl,
> !test_bit(NVME_CTRL_FAILFAST_EXPIRED, &ctrl->flags) &&
> !blk_noretry_request(rq) && !(rq->cmd_flags & REQ_NVME_MPATH))
> return BLK_STS_RESOURCE;
> -
> - nvme_req(rq)->status = NVME_SC_HOST_PATH_ERROR;
> - blk_mq_set_request_complete(rq);
> - nvme_complete_rq(rq);
> - return BLK_STS_OK;
> + return nvme_host_path_error(rq);
> }
> EXPORT_SYMBOL_GPL(nvmf_fail_nonready_command);
>
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index a72f0718109100..5819f038104149 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -575,6 +575,7 @@ static inline bool nvme_is_aen_req(u16 qid, __u16 command_id)
> }
>
> void nvme_complete_rq(struct request *req);
> +blk_status_t nvme_host_path_error(struct request *req);
> bool nvme_cancel_request(struct request *req, void *data, bool reserved);
> void nvme_cancel_tagset(struct nvme_ctrl *ctrl);
> void nvme_cancel_admin_tagset(struct nvme_ctrl *ctrl);
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index 6993efb27b39f0..f51af5e4970a2b 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -2091,16 +2091,6 @@ static blk_status_t nvme_rdma_queue_rq(struct blk_mq_hw_ctx *hctx,
> err = nvme_rdma_post_send(queue, sqe, req->sge, req->num_sge,
> req->mr ? &req->reg_wr.wr : NULL);
> if (unlikely(err)) {
> - if (err == -EIO) {
> - /*
> - * Fail the request so upper layer can failover I/O
> - * if another path is available
> - */
> - req->status = NVME_SC_HOST_PATH_ERROR;
> - blk_mq_set_request_complete(rq);
> - nvme_rdma_complete_rq(rq);
> - return BLK_STS_OK;
> - }
> goto err_unmap;
> }
>
> @@ -2109,7 +2099,9 @@ static blk_status_t nvme_rdma_queue_rq(struct blk_mq_hw_ctx *hctx,
> err_unmap:
> nvme_rdma_unmap_data(queue, rq);
> err:
> - if (err == -ENOMEM || err == -EAGAIN)
> + if (err == -EIO)
> + ret = nvme_host_path_error(rq);
> + else if (err == -ENOMEM || err == -EAGAIN)
> ret = BLK_STS_RESOURCE;
> else
> ret = BLK_STS_IOERR;
>
This looks good to me.
* Re: [PATCH v5 0/3] avoid double request completion and IO error
2021-02-03 16:14 ` [PATCH v5 0/3] avoid double request completion and IO error Christoph Hellwig
2021-02-03 22:22 ` Sagi Grimberg
@ 2021-02-04 5:56 ` Chao Leng
2021-02-04 8:04 ` Christoph Hellwig
1 sibling, 1 reply; 14+ messages in thread
From: Chao Leng @ 2021-02-04 5:56 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: linux-nvme, kbusch, axboe, sagi, linux-block, axboe
This looks good to me.
Thank you very much.
On 2021/2/4 0:14, Christoph Hellwig wrote:
> So I think this is conceptually fine, but I still find the API a little
> arcane. What do you think about the following incremental patch?
> If that looks good and tests good for you I can apply the series with
> the modifications:
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 0befaad788a094..02579f4f776c7d 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -355,6 +355,21 @@ void nvme_complete_rq(struct request *req)
> }
> EXPORT_SYMBOL_GPL(nvme_complete_rq);
>
> +/*
> + * Called to unwind from ->queue_rq on a failed command submission so that the
> + * multipathing code gets called to potentially failover to another path.
> + * The caller needs to unwind all transport specific resource allocations and
> + * must propagate the return value.
> + */
> +blk_status_t nvme_host_path_error(struct request *req)
> +{
> + nvme_req(req)->status = NVME_SC_HOST_PATH_ERROR;
> + blk_mq_set_request_complete(req);
> + nvme_complete_rq(req);
> + return BLK_STS_OK;
> +}
> +EXPORT_SYMBOL_GPL(nvme_host_path_error);
> +
> bool nvme_cancel_request(struct request *req, void *data, bool reserved)
> {
> dev_dbg_ratelimited(((struct nvme_ctrl *) data)->device,
> diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
> index cedf9b31898673..5dfd806fc2d28c 100644
> --- a/drivers/nvme/host/fabrics.c
> +++ b/drivers/nvme/host/fabrics.c
> @@ -552,11 +552,7 @@ blk_status_t nvmf_fail_nonready_command(struct nvme_ctrl *ctrl,
> !test_bit(NVME_CTRL_FAILFAST_EXPIRED, &ctrl->flags) &&
> !blk_noretry_request(rq) && !(rq->cmd_flags & REQ_NVME_MPATH))
> return BLK_STS_RESOURCE;
> -
> - nvme_req(rq)->status = NVME_SC_HOST_PATH_ERROR;
> - blk_mq_set_request_complete(rq);
> - nvme_complete_rq(rq);
> - return BLK_STS_OK;
> + return nvme_host_path_error(rq);
> }
> EXPORT_SYMBOL_GPL(nvmf_fail_nonready_command);
>
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index a72f0718109100..5819f038104149 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -575,6 +575,7 @@ static inline bool nvme_is_aen_req(u16 qid, __u16 command_id)
> }
>
> void nvme_complete_rq(struct request *req);
> +blk_status_t nvme_host_path_error(struct request *req);
> bool nvme_cancel_request(struct request *req, void *data, bool reserved);
> void nvme_cancel_tagset(struct nvme_ctrl *ctrl);
> void nvme_cancel_admin_tagset(struct nvme_ctrl *ctrl);
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index 6993efb27b39f0..f51af5e4970a2b 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -2091,16 +2091,6 @@ static blk_status_t nvme_rdma_queue_rq(struct blk_mq_hw_ctx *hctx,
> err = nvme_rdma_post_send(queue, sqe, req->sge, req->num_sge,
> req->mr ? &req->reg_wr.wr : NULL);
> if (unlikely(err)) {
> - if (err == -EIO) {
> - /*
> - * Fail the request so upper layer can failover I/O
> - * if another path is available
> - */
> - req->status = NVME_SC_HOST_PATH_ERROR;
> - blk_mq_set_request_complete(rq);
> - nvme_rdma_complete_rq(rq);
> - return BLK_STS_OK;
> - }
> goto err_unmap;
> }
>
> @@ -2109,7 +2099,9 @@ static blk_status_t nvme_rdma_queue_rq(struct blk_mq_hw_ctx *hctx,
> err_unmap:
> nvme_rdma_unmap_data(queue, rq);
> err:
> - if (err == -ENOMEM || err == -EAGAIN)
> + if (err == -EIO)
> + ret = nvme_host_path_error(rq);
> + else if (err == -ENOMEM || err == -EAGAIN)
> ret = BLK_STS_RESOURCE;
> else
> ret = BLK_STS_IOERR;
> .
>
* Re: [PATCH v5 0/3] avoid double request completion and IO error
2021-02-04 5:56 ` Chao Leng
@ 2021-02-04 8:04 ` Christoph Hellwig
2021-02-05 2:02 ` Chao Leng
0 siblings, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2021-02-04 8:04 UTC (permalink / raw)
To: Chao Leng
Cc: Christoph Hellwig, linux-nvme, kbusch, axboe, sagi, linux-block,
axboe
On Thu, Feb 04, 2021 at 01:56:34PM +0800, Chao Leng wrote:
> This looks good to me.
> Thank you very much.
Thanks a lot to you!
Please double check the version here:
http://git.infradead.org/nvme.git/shortlog/refs/heads/nvme-5.12
* Re: [PATCH v5 0/3] avoid double request completion and IO error
2021-02-04 8:04 ` Christoph Hellwig
@ 2021-02-05 2:02 ` Chao Leng
0 siblings, 0 replies; 14+ messages in thread
From: Chao Leng @ 2021-02-05 2:02 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: linux-nvme, kbusch, axboe, sagi, linux-block, axboe
On 2021/2/4 16:04, Christoph Hellwig wrote:
> On Thu, Feb 04, 2021 at 01:56:34PM +0800, Chao Leng wrote:
>> This looks good to me.
>> Thank you very much.
>
> Thanks a lot to you!
>
> Please double check there version here:
>
> http://git.infradead.org/nvme.git/shortlog/refs/heads/nvme-5.12
Looks good.
> .
>