From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail.linuxfoundation.org ([140.211.169.12]:58336 "EHLO mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752244AbeBANP1 (ORCPT ); Thu, 1 Feb 2018 08:15:27 -0500
Subject: Patch "nvme-rdma: don't complete requests before a send work request has completed" has been added to the 4.14-stable tree
To: sagi@grimberg.me, alexander.levin@verizon.com, gregkh@linuxfoundation.org, hch@lst.de, maxg@mellanox.com
Cc: ,
From:
Date: Thu, 01 Feb 2018 14:13:40 +0100
Message-ID: <151749082022839@kroah.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=ANSI_X3.4-1968
Content-Transfer-Encoding: 8bit
Sender: stable-owner@vger.kernel.org
List-ID:

This is a note to let you know that I've just added the patch titled

    nvme-rdma: don't complete requests before a send work request has completed

to the 4.14-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     nvme-rdma-don-t-complete-requests-before-a-send-work-request-has-completed.patch
and it can be found in the queue-4.14 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let know about it.


>>From foo@baz Thu Feb 1 13:45:42 CET 2018
From: Sagi Grimberg
Date: Thu, 23 Nov 2017 17:35:22 +0200
Subject: nvme-rdma: don't complete requests before a send work request has completed

From: Sagi Grimberg

[ Upstream commit 4af7f7ff92a42b6c713293c99e7982bcfcf51a70 ]

In order to guarantee that the HCA will never get an access violation
(either from invalidated rkey or from iommu) when retrying a send
operation we must complete a request only when both send completion and
the nvme cqe has arrived. We need to set the send/recv completions flags
atomically because we might have more than a single context accessing the
request concurrently (one is cq irq-poll context and the other is
user-polling used in IOCB_HIPRI).

Only then we are safe to invalidate the rkey (if needed), unmap the host
buffers, and complete the IO.

Signed-off-by: Sagi Grimberg
Reviewed-by: Max Gurtovoy
Signed-off-by: Christoph Hellwig
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman
---
 drivers/nvme/host/rdma.c | 28 ++++++++++++++++++++++++----
 1 file changed, 24 insertions(+), 4 deletions(-)

--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -67,6 +67,9 @@ struct nvme_rdma_request {
 	struct nvme_request	req;
 	struct ib_mr		*mr;
 	struct nvme_rdma_qe	sqe;
+	union nvme_result	result;
+	__le16			status;
+	refcount_t		ref;
 	struct ib_sge		sge[1 + NVME_RDMA_MAX_INLINE_SEGMENTS];
 	u32			num_sge;
 	int			nents;
@@ -1177,6 +1180,7 @@ static int nvme_rdma_map_data(struct nvm
 	req->num_sge = 1;
 	req->inline_data = false;
 	req->mr->need_inval = false;
+	refcount_set(&req->ref, 2); /* send and recv completions */
 
 	c->common.flags |= NVME_CMD_SGL_METABUF;
 
@@ -1213,8 +1217,19 @@ static int nvme_rdma_map_data(struct nvm
 
 static void nvme_rdma_send_done(struct ib_cq *cq, struct ib_wc *wc)
 {
-	if (unlikely(wc->status != IB_WC_SUCCESS))
+	struct nvme_rdma_qe *qe =
+		container_of(wc->wr_cqe, struct nvme_rdma_qe, cqe);
+	struct nvme_rdma_request *req =
+		container_of(qe, struct nvme_rdma_request, sqe);
+	struct request *rq = blk_mq_rq_from_pdu(req);
+
+	if (unlikely(wc->status != IB_WC_SUCCESS)) {
 		nvme_rdma_wr_error(cq, wc, "SEND");
+		return;
+	}
+
+	if (refcount_dec_and_test(&req->ref))
+		nvme_end_request(rq, req->status, req->result);
 }
 
 /*
@@ -1359,14 +1374,19 @@ static int nvme_rdma_process_nvme_rsp(st
 	}
 	req = blk_mq_rq_to_pdu(rq);
 
-	if (rq->tag == tag)
-		ret = 1;
+	req->status = cqe->status;
+	req->result = cqe->result;
 
 	if ((wc->wc_flags & IB_WC_WITH_INVALIDATE) &&
 	    wc->ex.invalidate_rkey == req->mr->rkey)
 		req->mr->need_inval = false;
 
-	nvme_end_request(rq, cqe->status, cqe->result);
+	if (refcount_dec_and_test(&req->ref)) {
+		if (rq->tag == tag)
+			ret = 1;
+		nvme_end_request(rq, req->status, req->result);
+	}
+
 	return ret;
 }
 


Patches currently in stable-queue which might be from sagi@grimberg.me are

queue-4.14/nvmet-fc-correct-ref-counting-error-when-deferred-rcv-used.patch
queue-4.14/nvme-fabrics-introduce-init-command-check-for-a-queue-that-is-not-alive.patch
queue-4.14/nvme-fc-check-if-queue-is-ready-in-queue_rq.patch
queue-4.14/nvme-loop-check-if-queue-is-ready-in-queue_rq.patch
queue-4.14/nvme-rdma-don-t-complete-requests-before-a-send-work-request-has-completed.patch
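
For readers who want to experiment with the completion rule from the commit
message outside the kernel, here is a rough userspace sketch of the same idea:
a counter armed to 2, decremented atomically by the send-completion and
response handlers, with the request ended only by whichever handler drops the
last reference. It is not the driver code. C11 atomic_int stands in for the
kernel's refcount_t, and struct demo_request and every demo_* function are
hypothetical names made up for this illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct nvme_rdma_request; not the kernel type. */
struct demo_request {
	atomic_int ref;     /* outstanding completions: send + recv */
	uint16_t status;    /* saved from the response for the last finisher */
	uint64_t result;
	int completed;      /* times the request was ended; must end up as 1 */
};

/* Arm the request before posting it: two completions are expected. */
static void demo_request_init(struct demo_request *req)
{
	atomic_init(&req->ref, 2);
	req->status = 0;
	req->result = 0;
	req->completed = 0;
}

/* Drop one reference; returns true only to the caller dropping the last one. */
static bool demo_put(struct demo_request *req)
{
	return atomic_fetch_sub(&req->ref, 1) == 1;
}

/* Stand-in for ending the I/O; asserts it never happens twice. */
static void demo_end_request(struct demo_request *req)
{
	assert(req->completed == 0);
	req->completed++;
}

/* Send completion handler: may run before or after the response arrives. */
static void demo_send_done(struct demo_request *req)
{
	if (demo_put(req))
		demo_end_request(req);
}

/* Response handler: save status/result first, then drop the reference. */
static void demo_recv_done(struct demo_request *req, uint16_t status,
			   uint64_t result)
{
	req->status = status;
	req->result = result;
	if (demo_put(req))
		demo_end_request(req);
}
```

Calling demo_recv_done() and demo_send_done() in either order on a freshly
initialised request ends it exactly once, and only after both have run. The
status and result are stored before the decrement so that the send handler can
use them if it happens to be the last one to finish.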