Subject: Re: [PATCH v2 4/6] nvme-rdma: avoid IO error and repeated request completion
From: Chao Leng
To: Sagi Grimberg
Date: Thu, 21 Jan 2021 09:34:30 +0800
Message-ID: <6bfca033-8fda-4ace-c05f-285fccb070fd@huawei.com>
In-Reply-To: <2ed5391c-fe43-f512-adf0-214effd5d599@grimberg.me>
References: <20210107033149.15701-1-lengchao@huawei.com> <20210107033149.15701-5-lengchao@huawei.com> <07e41b4f-914a-11e8-5638-e2d6408feb3f@grimberg.me> <7b12be41-0fcd-5a22-0e01-8cd4ac9cde5b@huawei.com> <695b6839-5333-c342-2189-d7aaeba797a7@huawei.com> <4ff22d33-12fa-1f70-3606-54821f314c45@grimberg.me> <0b5c8e31-8dc2-994a-1710-1b1be07549c9@huawei.com> <2ed5391c-fe43-f512-adf0-214effd5d599@grimberg.me>
Cc: kbusch@kernel.org, axboe@fb.com, linux-block@vger.kernel.org, hch@lst.de, axboe@kernel.dk

On 2021/1/21 5:35, Sagi Grimberg wrote:
>
>>>>> is not something we should be handling in nvme. block drivers
>>>>> should be able to fail queue_rq, and this all should live in the
>>>>> block layer.
>>>> Of course, it is also an idea to repair the block drivers directly.
>>>> However, block layer is unaware of nvme native multipathing,
>>>
>>> Nor it should be
>>>
>>>> will cause the request return error which should be avoided.
>>>
>>> Not sure I understand..
>>> requests should failover for path related errors,
>>> what queue_rq errors are expected to be failed over from your
>>> perspective?
>> Although failing over for only path-related errors is the best choice,
>> it's almost impossible to achieve.
>> The probability of non-path-related errors is very low. Although these
>> errors do not require failover retry, the cost of failover retry is
>> completing the request with an error after a fairly long delay (retrying
>> several times). It's not the best choice, but I think it's acceptable,
>> because the HBA driver does not have path-related error codes but only
>> general error codes. It is difficult to identify whether the general
>> error codes are path-related.
>
> If we have a SW bug or breakage that can happen occasionally, this can
> result in a constant failover rather than a simple failure. This is just
> not a good approach IMO.
>
>>>> The scenario: use two HBAs for nvme native multipath, and then one HBA
>>>> fault,
>>>
>>> What is the specific error the driver sees?
>> The path-related error code is closely related to the HBA driver
>> implementation. In general it is EIO. I don't think it's a good idea to
>> assume what general error code the driver returns in the event of a path
>> error.
>
> But assuming every error is a path error a good idea?

Of course not. According to the old code logic, treating everything other
than ENOMEM and EAGAIN from HBA drivers as a path error. I think it might
be reasonable.

>
>>>> the blk_status_t of queue_rq is BLK_STS_IOERR, blk-mq will call
>>>> blk_mq_end_request to complete the request, which bypasses nvme native
>>>> multipath. We expect the request to fail over to the normal HBA, but
>>>> the request is directly completed with BLK_STS_IOERR.
>>>> The two scenarios can be fixed by directly completing the request in
>>>> queue_rq.
>>> Well, certainly this one-shot always return 0 and complete the command
>>> with HOST_PATH error is not a good approach IMO
>> So what's the better option? Just complete the request with host path
>> error for non-ENOMEM and EAGAIN returned by the HBA driver?
>
> Well, the correct thing to do here would be to clone the bio and
> failover if the end_io error status is BLK_STS_IOERR. That sucks
> because it adds overhead, but this proposal doesn't sit well. it
> looks wrong to me.
>
> Alternatively, a more creative idea would be to encode the error
> status somehow in the cookie returned from submit_bio, but that
> also feels like a small(er) hack.

If HBA drivers return an error other than ENOMEM or EAGAIN, queue_rq can
directly call nvme_complete_rq with NVME_SC_HOST_PATH_ERROR, like
nvmf_fail_nonready_command does. nvme_complete_rq will then decide whether
to retry, fail over, or end the request. This may not be the best, but
there seems to be no better choice. I will try to send the patch v2.
_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme