Linux-NVME Archive on lore.kernel.org
From: Chao Leng <lengchao@huawei.com>
To: Sagi Grimberg <sagi@grimberg.me>, <linux-nvme@lists.infradead.org>
Cc: kbusch@kernel.org, axboe@fb.com, linux-block@vger.kernel.org,
	hch@lst.de, axboe@kernel.dk
Subject: Re: [PATCH v2 4/6] nvme-rdma: avoid IO error and repeated request completion
Date: Thu, 14 Jan 2021 14:55:21 +0800
Message-ID: <7b12be41-0fcd-5a22-0e01-8cd4ac9cde5b@huawei.com>
In-Reply-To: <07e41b4f-914a-11e8-5638-e2d6408feb3f@grimberg.me>



On 2021/1/14 8:19, Sagi Grimberg wrote:
> 
>> When queueing a request fails, the blk_status_t is returned directly
>> to blk-mq. If the blk_status_t is not BLK_STS_RESOURCE,
>> BLK_STS_DEV_RESOURCE or BLK_STS_ZONE_RESOURCE, blk-mq calls
>> blk_mq_end_request to complete the request with BLK_STS_IOERR.
>> In two scenarios the request should be retried and may succeed.
>> First, with nvme multipath, the request may be retried
>> successfully on another path, because the error is probably related to
>> the path. Second, without multipath software, the request may
>> be retried successfully after error recovery.
>> If the request is completed with BLK_STS_IOERR in
>> blk_mq_dispatch_rq_list, the state of the request may already have
>> been changed to MQ_RQ_IN_FLIGHT. If the request is then freed
>> asynchronously, such as in nvme_submit_user_cmd, in an extreme
>> scenario the request will be freed a second time during teardown.
>> If a non-resource error occurs in queue_rq, the driver should directly
>> call nvme_complete_rq to complete the request and set the state of the
>> request to MQ_RQ_COMPLETE. nvme_complete_rq will decide whether to
>> retry, fail over or end the request.
>>
>> Signed-off-by: Chao Leng <lengchao@huawei.com>
>> ---
>>   drivers/nvme/host/rdma.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
>> index df9f6f4549f1..4a89bf44ecdc 100644
>> --- a/drivers/nvme/host/rdma.c
>> +++ b/drivers/nvme/host/rdma.c
>> @@ -2093,7 +2093,7 @@ static blk_status_t nvme_rdma_queue_rq(struct blk_mq_hw_ctx *hctx,
>>   unmap_qe:
>>       ib_dma_unmap_single(dev, req->sqe.dma, sizeof(struct nvme_command),
>>                   DMA_TO_DEVICE);
>> -    return ret;
>> +    return nvme_try_complete_failed_req(rq, ret);
> 
> I don't understand this. There are errors that may not be related to
> anything that is pathing related (sw bug, memory leak, mapping error,
> etc, etc) why should we return this one-shot error?
Although failover retry is not required here, if we return the error to
blk-mq, a low-probability crash may happen. Because blk-mq does not set
the state of the request to MQ_RQ_COMPLETE before completing the request,
the request may be freed asynchronously, such as in nvme_submit_user_cmd.
If this races with error recovery, the request may be completed twice.

So we cannot return the error to blk-mq if the blk_status_t is not
BLK_STS_RESOURCE, BLK_STS_DEV_RESOURCE or BLK_STS_ZONE_RESOURCE.
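
To make the race described above concrete, here is a minimal user-space
sketch. It is not kernel code: every model_* name is hypothetical, the
enums only stand in for blk_status_t and the MQ_RQ_* states, and the
behaviour of the helper (complete the request in the driver and report
success to blk-mq) is an assumption, since the real
nvme_try_complete_failed_req comes from patch 2/6 and is not shown in
this message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's blk_status_t values. */
enum model_status {
	MODEL_STS_OK,
	MODEL_STS_RESOURCE,
	MODEL_STS_DEV_RESOURCE,
	MODEL_STS_ZONE_RESOURCE,
	MODEL_STS_IOERR,
};

/* Hypothetical stand-ins for the MQ_RQ_* request states. */
enum model_state {
	MODEL_RQ_IDLE,
	MODEL_RQ_IN_FLIGHT,
	MODEL_RQ_COMPLETE,
};

struct model_request {
	enum model_state state;
	int completions;	/* number of times the request was completed */
};

/* Resource errors are the ones blk-mq requeues instead of failing. */
static bool model_is_resource_error(enum model_status sts)
{
	return sts == MODEL_STS_RESOURCE ||
	       sts == MODEL_STS_DEV_RESOURCE ||
	       sts == MODEL_STS_ZONE_RESOURCE;
}

/* blk-mq ends the request on error but leaves its state untouched. */
static void model_blk_mq_end_request(struct model_request *rq)
{
	rq->completions++;
}

/* Error recovery completes every request that still looks in flight. */
static void model_cancel_request(struct model_request *rq)
{
	if (rq->state != MODEL_RQ_IN_FLIGHT)
		return;
	rq->state = MODEL_RQ_COMPLETE;
	rq->completions++;
}

/*
 * Assumed shape of the proposed helper: resource errors go back to
 * blk-mq, any other error is completed by the driver, which marks the
 * request COMPLETE and reports OK so blk-mq does not end it again.
 */
static enum model_status
model_try_complete_failed_req(struct model_request *rq, enum model_status sts)
{
	if (sts == MODEL_STS_OK || model_is_resource_error(sts))
		return sts;
	rq->state = MODEL_RQ_COMPLETE;
	rq->completions++;
	return MODEL_STS_OK;
}

/* Current behaviour: queue_rq returns the error, recovery races in. */
int model_scenario_return_error(void)
{
	struct model_request rq = { MODEL_RQ_IN_FLIGHT, 0 };

	model_blk_mq_end_request(&rq);	/* state stays IN_FLIGHT */
	model_cancel_request(&rq);	/* completes it a second time */
	return rq.completions;
}

/* Proposed behaviour: the driver completes the failed request itself. */
int model_scenario_driver_completes(void)
{
	struct model_request rq = { MODEL_RQ_IN_FLIGHT, 0 };

	if (model_try_complete_failed_req(&rq, MODEL_STS_IOERR) != MODEL_STS_OK)
		model_blk_mq_end_request(&rq);
	model_cancel_request(&rq);	/* skipped: state is COMPLETE */
	return rq.completions;
}
```

In this model the first scenario completes the request twice, while the
second completes it once, because the state is already COMPLETE when
error recovery looks at the request.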
> .

