public inbox for linux-rdma@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 00/11] NFS/RDMA client patches for 4.5
@ 2015-12-14 21:17 Chuck Lever
       [not found] ` <20151214211317.16295.70115.stgit-FYjufvaPoItvLzlybtyyYzGyq/o6K9yX@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Chuck Lever @ 2015-12-14 21:17 UTC (permalink / raw)
  To: anna.schumaker-HgOvQuBEEgTQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA

For 4.5, I'd like to address the send queue accounting and
invalidation/unmap ordering issues Jason brought up a couple of
months ago.

In preparation for Doug's final topic branch, Anna, I've rebased
these on Christoph's ib_device_attr branch, but there were no merge
conflicts or other changes needed. Could you begin preparing these
for linux-next and other final testing and review?

Also available in the "nfs-rdma-for-4.5" topic branch of this git repo:

git://git.linux-nfs.org/projects/cel/cel-2.6.git

Or for browsing:

http://git.linux-nfs.org/?p=cel/cel-2.6.git;a=log;h=refs/heads/nfs-rdma-for-4.5


Changes since v2:
- Rebased on Christoph's ib_device_attr branch


Changes since v1:

- Rebased on v4.4-rc3
- Receive buffer safety margin patch dropped
- Backchannel pr_err and pr_info converted to dprintk
- Backchannel spin locks converted to work queue-safe locks
- Fixed premature release of backchannel request buffer
- NFSv4.1 callbacks tested with for-4.5 server

---

Chuck Lever (11):
      xprtrdma: Fix additional uses of spin_lock_irqsave(rb_lock)
      xprtrdma: xprt_rdma_free() must not release backchannel reqs
      xprtrdma: Disable RPC/RDMA backchannel debugging messages
      xprtrdma: Move struct ib_send_wr off the stack
      xprtrdma: Introduce ro_unmap_sync method
      xprtrdma: Add ro_unmap_sync method for FRWR
      xprtrdma: Add ro_unmap_sync method for FMR
      xprtrdma: Add ro_unmap_sync method for all-physical registration
      SUNRPC: Introduce xprt_commit_rqst()
      xprtrdma: Invalidate in the RPC reply handler
      xprtrdma: Revert commit e7104a2a9606 ('xprtrdma: Cap req_cqinit').


 include/linux/sunrpc/xprt.h        |    1 
 net/sunrpc/xprt.c                  |   14 +++
 net/sunrpc/xprtrdma/backchannel.c  |   22 ++---
 net/sunrpc/xprtrdma/fmr_ops.c      |   64 +++++++++++++
 net/sunrpc/xprtrdma/frwr_ops.c     |  175 +++++++++++++++++++++++++++++++-----
 net/sunrpc/xprtrdma/physical_ops.c |   13 +++
 net/sunrpc/xprtrdma/rpc_rdma.c     |   14 +++
 net/sunrpc/xprtrdma/transport.c    |    3 +
 net/sunrpc/xprtrdma/verbs.c        |   13 +--
 net/sunrpc/xprtrdma/xprt_rdma.h    |   12 +-
 10 files changed, 283 insertions(+), 48 deletions(-)

--
Chuck Lever
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v3 01/11] xprtrdma: Fix additional uses of spin_lock_irqsave(rb_lock)
       [not found] ` <20151214211317.16295.70115.stgit-FYjufvaPoItvLzlybtyyYzGyq/o6K9yX@public.gmane.org>
@ 2015-12-14 21:17   ` Chuck Lever
  2015-12-14 21:17   ` [PATCH v3 02/11] xprtrdma: xprt_rdma_free() must not release backchannel reqs Chuck Lever
                     ` (10 subsequent siblings)
  11 siblings, 0 replies; 23+ messages in thread
From: Chuck Lever @ 2015-12-14 21:17 UTC (permalink / raw)
  To: anna.schumaker-HgOvQuBEEgTQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA

Clean up.

rb_lock critical sections added in rpcrdma_ep_post_extra_recv()
should have first been converted to use normal spin_lock now that
the reply handler is a work queue.

The backchannel set up code should use the appropriate helper
instead of open-coding a rb_recv_bufs list add.

Problem introduced by glib patch re-ordering on my part.

Fixes: f531a5dbc451 ('xprtrdma: Pre-allocate backward rpc_rqst')
Signed-off-by: Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
---
 net/sunrpc/xprtrdma/backchannel.c |    6 +-----
 net/sunrpc/xprtrdma/verbs.c       |    7 +++----
 2 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/net/sunrpc/xprtrdma/backchannel.c b/net/sunrpc/xprtrdma/backchannel.c
index 2dcb44f..11d2cfb 100644
--- a/net/sunrpc/xprtrdma/backchannel.c
+++ b/net/sunrpc/xprtrdma/backchannel.c
@@ -84,9 +84,7 @@ out_fail:
 static int rpcrdma_bc_setup_reps(struct rpcrdma_xprt *r_xprt,
 				 unsigned int count)
 {
-	struct rpcrdma_buffer *buffers = &r_xprt->rx_buf;
 	struct rpcrdma_rep *rep;
-	unsigned long flags;
 	int rc = 0;
 
 	while (count--) {
@@ -98,9 +96,7 @@ static int rpcrdma_bc_setup_reps(struct rpcrdma_xprt *r_xprt,
 			break;
 		}
 
-		spin_lock_irqsave(&buffers->rb_lock, flags);
-		list_add(&rep->rr_list, &buffers->rb_recv_bufs);
-		spin_unlock_irqrestore(&buffers->rb_lock, flags);
+		rpcrdma_recv_buffer_put(rep);
 	}
 
 	return rc;
diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
index 650034b..f23f3d6 100644
--- a/net/sunrpc/xprtrdma/verbs.c
+++ b/net/sunrpc/xprtrdma/verbs.c
@@ -1329,15 +1329,14 @@ rpcrdma_ep_post_extra_recv(struct rpcrdma_xprt *r_xprt, unsigned int count)
 	struct rpcrdma_ia *ia = &r_xprt->rx_ia;
 	struct rpcrdma_ep *ep = &r_xprt->rx_ep;
 	struct rpcrdma_rep *rep;
-	unsigned long flags;
 	int rc;
 
 	while (count--) {
-		spin_lock_irqsave(&buffers->rb_lock, flags);
+		spin_lock(&buffers->rb_lock);
 		if (list_empty(&buffers->rb_recv_bufs))
 			goto out_reqbuf;
 		rep = rpcrdma_buffer_get_rep_locked(buffers);
-		spin_unlock_irqrestore(&buffers->rb_lock, flags);
+		spin_unlock(&buffers->rb_lock);
 
 		rc = rpcrdma_ep_post_recv(ia, ep, rep);
 		if (rc)
@@ -1347,7 +1346,7 @@ rpcrdma_ep_post_extra_recv(struct rpcrdma_xprt *r_xprt, unsigned int count)
 	return 0;
 
 out_reqbuf:
-	spin_unlock_irqrestore(&buffers->rb_lock, flags);
+	spin_unlock(&buffers->rb_lock);
 	pr_warn("%s: no extra receive buffers\n", __func__);
 	return -ENOMEM;
 

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 02/11] xprtrdma: xprt_rdma_free() must not release backchannel reqs
       [not found] ` <20151214211317.16295.70115.stgit-FYjufvaPoItvLzlybtyyYzGyq/o6K9yX@public.gmane.org>
  2015-12-14 21:17   ` [PATCH v3 01/11] xprtrdma: Fix additional uses of spin_lock_irqsave(rb_lock) Chuck Lever
@ 2015-12-14 21:17   ` Chuck Lever
  2015-12-14 21:18   ` [PATCH v3 03/11] xprtrdma: Disable RPC/RDMA backchannel debugging messages Chuck Lever
                     ` (9 subsequent siblings)
  11 siblings, 0 replies; 23+ messages in thread
From: Chuck Lever @ 2015-12-14 21:17 UTC (permalink / raw)
  To: anna.schumaker-HgOvQuBEEgTQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA

Preserve any rpcrdma_req that is attached to rpc_rqst's allocated
for the backchannel. Otherwise, after all the pre-allocated
backchannel req's are consumed, incoming backward calls start
writing on freed memory.

Somehow this hunk got lost.

Fixes: f531a5dbc451 ('xprtrdma: Pre-allocate backward rpc_rqst')
Signed-off-by: Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
---
 net/sunrpc/xprtrdma/transport.c |    3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/sunrpc/xprtrdma/transport.c b/net/sunrpc/xprtrdma/transport.c
index 8c545f7..740bddc 100644
--- a/net/sunrpc/xprtrdma/transport.c
+++ b/net/sunrpc/xprtrdma/transport.c
@@ -576,6 +576,9 @@ xprt_rdma_free(void *buffer)
 
 	rb = container_of(buffer, struct rpcrdma_regbuf, rg_base[0]);
 	req = rb->rg_owner;
+	if (req->rl_backchannel)
+		return;
+
 	r_xprt = container_of(req->rl_buffer, struct rpcrdma_xprt, rx_buf);
 
 	dprintk("RPC:       %s: called on 0x%p\n", __func__, req->rl_reply);

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 03/11] xprtrdma: Disable RPC/RDMA backchannel debugging messages
       [not found] ` <20151214211317.16295.70115.stgit-FYjufvaPoItvLzlybtyyYzGyq/o6K9yX@public.gmane.org>
  2015-12-14 21:17   ` [PATCH v3 01/11] xprtrdma: Fix additional uses of spin_lock_irqsave(rb_lock) Chuck Lever
  2015-12-14 21:17   ` [PATCH v3 02/11] xprtrdma: xprt_rdma_free() must not release backchannel reqs Chuck Lever
@ 2015-12-14 21:18   ` Chuck Lever
  2015-12-14 21:18   ` [PATCH v3 04/11] xprtrdma: Move struct ib_send_wr off the stack Chuck Lever
                     ` (8 subsequent siblings)
  11 siblings, 0 replies; 23+ messages in thread
From: Chuck Lever @ 2015-12-14 21:18 UTC (permalink / raw)
  To: anna.schumaker-HgOvQuBEEgTQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA

Clean up.

Fixes: 63cae47005af ('xprtrdma: Handle incoming backward direction')
Signed-off-by: Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
---
 net/sunrpc/xprtrdma/backchannel.c |   16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/net/sunrpc/xprtrdma/backchannel.c b/net/sunrpc/xprtrdma/backchannel.c
index 11d2cfb..cd31181 100644
--- a/net/sunrpc/xprtrdma/backchannel.c
+++ b/net/sunrpc/xprtrdma/backchannel.c
@@ -15,7 +15,7 @@
 # define RPCDBG_FACILITY	RPCDBG_TRANS
 #endif
 
-#define RPCRDMA_BACKCHANNEL_DEBUG
+#undef RPCRDMA_BACKCHANNEL_DEBUG
 
 static void rpcrdma_bc_free_rqst(struct rpcrdma_xprt *r_xprt,
 				 struct rpc_rqst *rqst)
@@ -136,6 +136,7 @@ int xprt_rdma_bc_setup(struct rpc_xprt *xprt, unsigned int reqs)
 			       __func__);
 			goto out_free;
 		}
+		dprintk("RPC:       %s: new rqst %p\n", __func__, rqst);
 
 		rqst->rq_xprt = &r_xprt->rx_xprt;
 		INIT_LIST_HEAD(&rqst->rq_list);
@@ -216,12 +217,14 @@ int rpcrdma_bc_marshal_reply(struct rpc_rqst *rqst)
 
 	rpclen = rqst->rq_svec[0].iov_len;
 
+#ifdef RPCRDMA_BACKCHANNEL_DEBUG
 	pr_info("RPC:       %s: rpclen %zd headerp 0x%p lkey 0x%x\n",
 		__func__, rpclen, headerp, rdmab_lkey(req->rl_rdmabuf));
 	pr_info("RPC:       %s: RPC/RDMA: %*ph\n",
 		__func__, (int)RPCRDMA_HDRLEN_MIN, headerp);
 	pr_info("RPC:       %s:      RPC: %*ph\n",
 		__func__, (int)rpclen, rqst->rq_svec[0].iov_base);
+#endif
 
 	req->rl_send_iov[0].addr = rdmab_addr(req->rl_rdmabuf);
 	req->rl_send_iov[0].length = RPCRDMA_HDRLEN_MIN;
@@ -265,6 +268,9 @@ void xprt_rdma_bc_free_rqst(struct rpc_rqst *rqst)
 {
 	struct rpc_xprt *xprt = rqst->rq_xprt;
 
+	dprintk("RPC:       %s: freeing rqst %p (req %p)\n",
+		__func__, rqst, rpcr_to_rdmar(rqst));
+
 	smp_mb__before_atomic();
 	WARN_ON_ONCE(!test_bit(RPC_BC_PA_IN_USE, &rqst->rq_bc_pa_state));
 	clear_bit(RPC_BC_PA_IN_USE, &rqst->rq_bc_pa_state);
@@ -329,9 +335,7 @@ void rpcrdma_bc_receive_call(struct rpcrdma_xprt *r_xprt,
 				struct rpc_rqst, rq_bc_pa_list);
 	list_del(&rqst->rq_bc_pa_list);
 	spin_unlock(&xprt->bc_pa_lock);
-#ifdef RPCRDMA_BACKCHANNEL_DEBUG
-	pr_info("RPC:       %s: using rqst %p\n", __func__, rqst);
-#endif
+	dprintk("RPC:       %s: using rqst %p\n", __func__, rqst);
 
 	/* Prepare rqst */
 	rqst->rq_reply_bytes_recvd = 0;
@@ -351,10 +355,8 @@ void rpcrdma_bc_receive_call(struct rpcrdma_xprt *r_xprt,
 	 * direction reply.
 	 */
 	req = rpcr_to_rdmar(rqst);
-#ifdef RPCRDMA_BACKCHANNEL_DEBUG
-	pr_info("RPC:       %s: attaching rep %p to req %p\n",
+	dprintk("RPC:       %s: attaching rep %p to req %p\n",
 		__func__, rep, req);
-#endif
 	req->rl_reply = rep;
 
 	/* Defeat the retransmit detection logic in send_request */

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 04/11] xprtrdma: Move struct ib_send_wr off the stack
       [not found] ` <20151214211317.16295.70115.stgit-FYjufvaPoItvLzlybtyyYzGyq/o6K9yX@public.gmane.org>
                     ` (2 preceding siblings ...)
  2015-12-14 21:18   ` [PATCH v3 03/11] xprtrdma: Disable RPC/RDMA backchannel debugging messages Chuck Lever
@ 2015-12-14 21:18   ` Chuck Lever
       [not found]     ` <20151214211811.16295.47695.stgit-FYjufvaPoItvLzlybtyyYzGyq/o6K9yX@public.gmane.org>
  2015-12-14 21:18   ` [PATCH v3 05/11] xprtrdma: Introduce ro_unmap_sync method Chuck Lever
                     ` (7 subsequent siblings)
  11 siblings, 1 reply; 23+ messages in thread
From: Chuck Lever @ 2015-12-14 21:18 UTC (permalink / raw)
  To: anna.schumaker-HgOvQuBEEgTQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA

For FRWR FASTREG and LOCAL_INV, move the ib_*_wr structure off
the stack. This allows frwr_op_map and frwr_op_unmap to chain
WRs together without limit to register or invalidate a set of MRs
with a single ib_post_send().

(This will be for chaining LOCAL_INV requests).

Signed-off-by: Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
---
 net/sunrpc/xprtrdma/frwr_ops.c  |   38 ++++++++++++++++++++------------------
 net/sunrpc/xprtrdma/xprt_rdma.h |    2 ++
 2 files changed, 22 insertions(+), 18 deletions(-)

diff --git a/net/sunrpc/xprtrdma/frwr_ops.c b/net/sunrpc/xprtrdma/frwr_ops.c
index ae2a241..660d0b6 100644
--- a/net/sunrpc/xprtrdma/frwr_ops.c
+++ b/net/sunrpc/xprtrdma/frwr_ops.c
@@ -318,7 +318,7 @@ frwr_op_map(struct rpcrdma_xprt *r_xprt, struct rpcrdma_mr_seg *seg,
 	struct rpcrdma_mw *mw;
 	struct rpcrdma_frmr *frmr;
 	struct ib_mr *mr;
-	struct ib_reg_wr reg_wr;
+	struct ib_reg_wr *reg_wr;
 	struct ib_send_wr *bad_wr;
 	int rc, i, n, dma_nents;
 	u8 key;
@@ -335,6 +335,7 @@ frwr_op_map(struct rpcrdma_xprt *r_xprt, struct rpcrdma_mr_seg *seg,
 	frmr = &mw->r.frmr;
 	frmr->fr_state = FRMR_IS_VALID;
 	mr = frmr->fr_mr;
+	reg_wr = &frmr->fr_regwr;
 
 	if (nsegs > ia->ri_max_frmr_depth)
 		nsegs = ia->ri_max_frmr_depth;
@@ -380,19 +381,19 @@ frwr_op_map(struct rpcrdma_xprt *r_xprt, struct rpcrdma_mr_seg *seg,
 	key = (u8)(mr->rkey & 0x000000FF);
 	ib_update_fast_reg_key(mr, ++key);
 
-	reg_wr.wr.next = NULL;
-	reg_wr.wr.opcode = IB_WR_REG_MR;
-	reg_wr.wr.wr_id = (uintptr_t)mw;
-	reg_wr.wr.num_sge = 0;
-	reg_wr.wr.send_flags = 0;
-	reg_wr.mr = mr;
-	reg_wr.key = mr->rkey;
-	reg_wr.access = writing ?
-			IB_ACCESS_REMOTE_WRITE | IB_ACCESS_LOCAL_WRITE :
-			IB_ACCESS_REMOTE_READ;
+	reg_wr->wr.next = NULL;
+	reg_wr->wr.opcode = IB_WR_REG_MR;
+	reg_wr->wr.wr_id = (uintptr_t)mw;
+	reg_wr->wr.num_sge = 0;
+	reg_wr->wr.send_flags = 0;
+	reg_wr->mr = mr;
+	reg_wr->key = mr->rkey;
+	reg_wr->access = writing ?
+			 IB_ACCESS_REMOTE_WRITE | IB_ACCESS_LOCAL_WRITE :
+			 IB_ACCESS_REMOTE_READ;
 
 	DECR_CQCOUNT(&r_xprt->rx_ep);
-	rc = ib_post_send(ia->ri_id->qp, &reg_wr.wr, &bad_wr);
+	rc = ib_post_send(ia->ri_id->qp, &reg_wr->wr, &bad_wr);
 	if (rc)
 		goto out_senderr;
 
@@ -422,23 +423,24 @@ frwr_op_unmap(struct rpcrdma_xprt *r_xprt, struct rpcrdma_mr_seg *seg)
 	struct rpcrdma_ia *ia = &r_xprt->rx_ia;
 	struct rpcrdma_mw *mw = seg1->rl_mw;
 	struct rpcrdma_frmr *frmr = &mw->r.frmr;
-	struct ib_send_wr invalidate_wr, *bad_wr;
+	struct ib_send_wr *invalidate_wr, *bad_wr;
 	int rc, nsegs = seg->mr_nsegs;
 
 	dprintk("RPC:       %s: FRMR %p\n", __func__, mw);
 
 	seg1->rl_mw = NULL;
 	frmr->fr_state = FRMR_IS_INVALID;
+	invalidate_wr = &mw->r.frmr.fr_invwr;
 
-	memset(&invalidate_wr, 0, sizeof(invalidate_wr));
-	invalidate_wr.wr_id = (unsigned long)(void *)mw;
-	invalidate_wr.opcode = IB_WR_LOCAL_INV;
-	invalidate_wr.ex.invalidate_rkey = frmr->fr_mr->rkey;
+	memset(invalidate_wr, 0, sizeof(*invalidate_wr));
+	invalidate_wr->wr_id = (uintptr_t)mw;
+	invalidate_wr->opcode = IB_WR_LOCAL_INV;
+	invalidate_wr->ex.invalidate_rkey = frmr->fr_mr->rkey;
 	DECR_CQCOUNT(&r_xprt->rx_ep);
 
 	ib_dma_unmap_sg(ia->ri_device, frmr->sg, frmr->sg_nents, seg1->mr_dir);
 	read_lock(&ia->ri_qplock);
-	rc = ib_post_send(ia->ri_id->qp, &invalidate_wr, &bad_wr);
+	rc = ib_post_send(ia->ri_id->qp, invalidate_wr, &bad_wr);
 	read_unlock(&ia->ri_qplock);
 	if (rc)
 		goto out_err;
diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
index 4197191..e60d817 100644
--- a/net/sunrpc/xprtrdma/xprt_rdma.h
+++ b/net/sunrpc/xprtrdma/xprt_rdma.h
@@ -206,6 +206,8 @@ struct rpcrdma_frmr {
 	enum rpcrdma_frmr_state		fr_state;
 	struct work_struct		fr_work;
 	struct rpcrdma_xprt		*fr_xprt;
+	struct ib_reg_wr		fr_regwr;
+	struct ib_send_wr		fr_invwr;
 };
 
 struct rpcrdma_fmr {

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 05/11] xprtrdma: Introduce ro_unmap_sync method
       [not found] ` <20151214211317.16295.70115.stgit-FYjufvaPoItvLzlybtyyYzGyq/o6K9yX@public.gmane.org>
                     ` (3 preceding siblings ...)
  2015-12-14 21:18   ` [PATCH v3 04/11] xprtrdma: Move struct ib_send_wr off the stack Chuck Lever
@ 2015-12-14 21:18   ` Chuck Lever
  2015-12-14 21:18   ` [PATCH v3 06/11] xprtrdma: Add ro_unmap_sync method for FRWR Chuck Lever
                     ` (6 subsequent siblings)
  11 siblings, 0 replies; 23+ messages in thread
From: Chuck Lever @ 2015-12-14 21:18 UTC (permalink / raw)
  To: anna.schumaker-HgOvQuBEEgTQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA

In the current xprtrdma implementation, some memreg strategies
implement ro_unmap synchronously (the MR is knocked down before the
method returns) and some asynchonously (the MR will be knocked down
and returned to the pool in the background).

To guarantee the MR is truly invalid before the RPC consumer is
allowed to resume execution, we need an unmap method that is
always synchronous, invoked from the RPC/RDMA reply handler.

The new method unmaps all MRs for an RPC. The existing ro_unmap
method unmaps only one MR at a time.

Signed-off-by: Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
---
 net/sunrpc/xprtrdma/xprt_rdma.h |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
index e60d817..d9f2f65 100644
--- a/net/sunrpc/xprtrdma/xprt_rdma.h
+++ b/net/sunrpc/xprtrdma/xprt_rdma.h
@@ -365,6 +365,8 @@ struct rpcrdma_xprt;
 struct rpcrdma_memreg_ops {
 	int		(*ro_map)(struct rpcrdma_xprt *,
 				  struct rpcrdma_mr_seg *, int, bool);
+	void		(*ro_unmap_sync)(struct rpcrdma_xprt *,
+					 struct rpcrdma_req *);
 	int		(*ro_unmap)(struct rpcrdma_xprt *,
 				    struct rpcrdma_mr_seg *);
 	int		(*ro_open)(struct rpcrdma_ia *,

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 06/11] xprtrdma: Add ro_unmap_sync method for FRWR
       [not found] ` <20151214211317.16295.70115.stgit-FYjufvaPoItvLzlybtyyYzGyq/o6K9yX@public.gmane.org>
                     ` (4 preceding siblings ...)
  2015-12-14 21:18   ` [PATCH v3 05/11] xprtrdma: Introduce ro_unmap_sync method Chuck Lever
@ 2015-12-14 21:18   ` Chuck Lever
       [not found]     ` <20151214211827.16295.76338.stgit-FYjufvaPoItvLzlybtyyYzGyq/o6K9yX@public.gmane.org>
  2015-12-14 21:18   ` [PATCH v3 07/11] xprtrdma: Add ro_unmap_sync method for FMR Chuck Lever
                     ` (5 subsequent siblings)
  11 siblings, 1 reply; 23+ messages in thread
From: Chuck Lever @ 2015-12-14 21:18 UTC (permalink / raw)
  To: anna.schumaker-HgOvQuBEEgTQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA

FRWR's ro_unmap is asynchronous. The new ro_unmap_sync posts
LOCAL_INV Work Requests and waits for them to complete before
returning.

Note also, DMA unmapping is now done _after_ invalidation.

Signed-off-by: Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
---
 net/sunrpc/xprtrdma/frwr_ops.c  |  137 ++++++++++++++++++++++++++++++++++++++-
 net/sunrpc/xprtrdma/xprt_rdma.h |    2 +
 2 files changed, 135 insertions(+), 4 deletions(-)

diff --git a/net/sunrpc/xprtrdma/frwr_ops.c b/net/sunrpc/xprtrdma/frwr_ops.c
index 660d0b6..5b9e41d 100644
--- a/net/sunrpc/xprtrdma/frwr_ops.c
+++ b/net/sunrpc/xprtrdma/frwr_ops.c
@@ -244,12 +244,14 @@ frwr_op_maxpages(struct rpcrdma_xprt *r_xprt)
 		     rpcrdma_max_segments(r_xprt) * ia->ri_max_frmr_depth);
 }
 
-/* If FAST_REG or LOCAL_INV failed, indicate the frmr needs to be reset. */
+/* If FAST_REG or LOCAL_INV failed, indicate the frmr needs
+ * to be reset.
+ *
+ * WARNING: Only wr_id and status are reliable at this point
+ */
 static void
-frwr_sendcompletion(struct ib_wc *wc)
+__frwr_sendcompletion_flush(struct ib_wc *wc, struct rpcrdma_mw *r)
 {
-	struct rpcrdma_mw *r;
-
 	if (likely(wc->status == IB_WC_SUCCESS))
 		return;
 
@@ -260,9 +262,23 @@ frwr_sendcompletion(struct ib_wc *wc)
 	else
 		pr_warn("RPC:       %s: frmr %p error, status %s (%d)\n",
 			__func__, r, ib_wc_status_msg(wc->status), wc->status);
+
 	r->r.frmr.fr_state = FRMR_IS_STALE;
 }
 
+static void
+frwr_sendcompletion(struct ib_wc *wc)
+{
+	struct rpcrdma_mw *r = (struct rpcrdma_mw *)(unsigned long)wc->wr_id;
+	struct rpcrdma_frmr *f = &r->r.frmr;
+
+	if (unlikely(wc->status != IB_WC_SUCCESS))
+		__frwr_sendcompletion_flush(wc, r);
+
+	if (f->fr_waiter)
+		complete(&f->fr_linv_done);
+}
+
 static int
 frwr_op_init(struct rpcrdma_xprt *r_xprt)
 {
@@ -334,6 +350,7 @@ frwr_op_map(struct rpcrdma_xprt *r_xprt, struct rpcrdma_mr_seg *seg,
 	} while (mw->r.frmr.fr_state != FRMR_IS_INVALID);
 	frmr = &mw->r.frmr;
 	frmr->fr_state = FRMR_IS_VALID;
+	frmr->fr_waiter = false;
 	mr = frmr->fr_mr;
 	reg_wr = &frmr->fr_regwr;
 
@@ -413,6 +430,117 @@ out_senderr:
 	return rc;
 }
 
+static struct ib_send_wr *
+__frwr_prepare_linv_wr(struct rpcrdma_mr_seg *seg)
+{
+	struct rpcrdma_mw *mw = seg->rl_mw;
+	struct rpcrdma_frmr *f = &mw->r.frmr;
+	struct ib_send_wr *invalidate_wr;
+
+	f->fr_waiter = false;
+	f->fr_state = FRMR_IS_INVALID;
+	invalidate_wr = &f->fr_invwr;
+
+	memset(invalidate_wr, 0, sizeof(*invalidate_wr));
+	invalidate_wr->wr_id = (unsigned long)(void *)mw;
+	invalidate_wr->opcode = IB_WR_LOCAL_INV;
+	invalidate_wr->ex.invalidate_rkey = f->fr_mr->rkey;
+
+	return invalidate_wr;
+}
+
+static void
+__frwr_dma_unmap(struct rpcrdma_xprt *r_xprt, struct rpcrdma_mr_seg *seg,
+		 int rc)
+{
+	struct ib_device *device = r_xprt->rx_ia.ri_device;
+	struct rpcrdma_mw *mw = seg->rl_mw;
+	int nsegs = seg->mr_nsegs;
+
+	seg->rl_mw = NULL;
+
+	while (nsegs--)
+		rpcrdma_unmap_one(device, seg++);
+
+	if (!rc)
+		rpcrdma_put_mw(r_xprt, mw);
+	else
+		__frwr_queue_recovery(mw);
+}
+
+/* Invalidate all memory regions that were registered for "req".
+ *
+ * Sleeps until it is safe for the host CPU to access the
+ * previously mapped memory regions.
+ */
+static void
+frwr_op_unmap_sync(struct rpcrdma_xprt *r_xprt, struct rpcrdma_req *req)
+{
+	struct ib_send_wr *invalidate_wrs, *pos, *prev, *bad_wr;
+	struct rpcrdma_ia *ia = &r_xprt->rx_ia;
+	struct rpcrdma_mr_seg *seg;
+	unsigned int i, nchunks;
+	struct rpcrdma_frmr *f;
+	int rc;
+
+	dprintk("RPC:       %s: req %p\n", __func__, req);
+
+	/* ORDER: Invalidate all of the req's MRs first
+	 *
+	 * Chain the LOCAL_INV Work Requests and post them with
+	 * a single ib_post_send() call.
+	 */
+	invalidate_wrs = pos = prev = NULL;
+	seg = NULL;
+	for (i = 0, nchunks = req->rl_nchunks; nchunks; nchunks--) {
+		seg = &req->rl_segments[i];
+
+		pos = __frwr_prepare_linv_wr(seg);
+
+		if (!invalidate_wrs)
+			invalidate_wrs = pos;
+		else
+			prev->next = pos;
+		prev = pos;
+
+		i += seg->mr_nsegs;
+	}
+	f = &seg->rl_mw->r.frmr;
+
+	/* Strong send queue ordering guarantees that when the
+	 * last WR in the chain completes, all WRs in the chain
+	 * are complete.
+	 */
+	f->fr_invwr.send_flags = IB_SEND_SIGNALED;
+	f->fr_waiter = true;
+	init_completion(&f->fr_linv_done);
+	INIT_CQCOUNT(&r_xprt->rx_ep);
+
+	/* Transport disconnect drains the receive CQ before it
+	 * replaces the QP. The RPC reply handler won't call us
+	 * unless ri_id->qp is a valid pointer.
+	 */
+	rc = ib_post_send(ia->ri_id->qp, invalidate_wrs, &bad_wr);
+	if (rc)
+		pr_warn("%s: ib_post_send failed %i\n", __func__, rc);
+
+	wait_for_completion(&f->fr_linv_done);
+
+	/* ORDER: Now DMA unmap all of the req's MRs, and return
+	 * them to the free MW list.
+	 */
+	for (i = 0, nchunks = req->rl_nchunks; nchunks; nchunks--) {
+		seg = &req->rl_segments[i];
+
+		__frwr_dma_unmap(r_xprt, seg, rc);
+
+		i += seg->mr_nsegs;
+		seg->mr_nsegs = 0;
+	}
+
+	req->rl_nchunks = 0;
+}
+
 /* Post a LOCAL_INV Work Request to prevent further remote access
  * via RDMA READ or RDMA WRITE.
  */
@@ -472,6 +600,7 @@ frwr_op_destroy(struct rpcrdma_buffer *buf)
 
 const struct rpcrdma_memreg_ops rpcrdma_frwr_memreg_ops = {
 	.ro_map				= frwr_op_map,
+	.ro_unmap_sync			= frwr_op_unmap_sync,
 	.ro_unmap			= frwr_op_unmap,
 	.ro_open			= frwr_op_open,
 	.ro_maxpages			= frwr_op_maxpages,
diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
index d9f2f65..089a7db 100644
--- a/net/sunrpc/xprtrdma/xprt_rdma.h
+++ b/net/sunrpc/xprtrdma/xprt_rdma.h
@@ -206,6 +206,8 @@ struct rpcrdma_frmr {
 	enum rpcrdma_frmr_state		fr_state;
 	struct work_struct		fr_work;
 	struct rpcrdma_xprt		*fr_xprt;
+	bool				fr_waiter;
+	struct completion		fr_linv_done;;
 	struct ib_reg_wr		fr_regwr;
 	struct ib_send_wr		fr_invwr;
 };

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 07/11] xprtrdma: Add ro_unmap_sync method for FMR
       [not found] ` <20151214211317.16295.70115.stgit-FYjufvaPoItvLzlybtyyYzGyq/o6K9yX@public.gmane.org>
                     ` (5 preceding siblings ...)
  2015-12-14 21:18   ` [PATCH v3 06/11] xprtrdma: Add ro_unmap_sync method for FRWR Chuck Lever
@ 2015-12-14 21:18   ` Chuck Lever
  2015-12-14 21:19   ` [PATCH v3 08/11] xprtrdma: Add ro_unmap_sync method for all-physical registration Chuck Lever
                     ` (4 subsequent siblings)
  11 siblings, 0 replies; 23+ messages in thread
From: Chuck Lever @ 2015-12-14 21:18 UTC (permalink / raw)
  To: anna.schumaker-HgOvQuBEEgTQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA

FMR's ro_unmap method is already synchronous because ib_unmap_fmr()
is a synchronous verb. However, some improvements can be made here.

1. Gather all the MRs for the RPC request onto a list, and invoke
   ib_unmap_fmr() once with that list. This reduces the number of
   doorbells when there is more than one MR to invalidate

2. Perform the DMA unmap _after_ the MRs are unmapped, not before.
   This is critical after invalidating a Write chunk.

Signed-off-by: Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
---
 net/sunrpc/xprtrdma/fmr_ops.c |   64 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 64 insertions(+)

diff --git a/net/sunrpc/xprtrdma/fmr_ops.c b/net/sunrpc/xprtrdma/fmr_ops.c
index f1e8daf..c14f3a4 100644
--- a/net/sunrpc/xprtrdma/fmr_ops.c
+++ b/net/sunrpc/xprtrdma/fmr_ops.c
@@ -179,6 +179,69 @@ out_maperr:
 	return rc;
 }
 
+static void
+__fmr_dma_unmap(struct rpcrdma_xprt *r_xprt, struct rpcrdma_mr_seg *seg)
+{
+	struct ib_device *device = r_xprt->rx_ia.ri_device;
+	struct rpcrdma_mw *mw = seg->rl_mw;
+	int nsegs = seg->mr_nsegs;
+
+	seg->rl_mw = NULL;
+
+	while (nsegs--)
+		rpcrdma_unmap_one(device, seg++);
+
+	rpcrdma_put_mw(r_xprt, mw);
+}
+
+/* Invalidate all memory regions that were registered for "req".
+ *
+ * Sleeps until it is safe for the host CPU to access the
+ * previously mapped memory regions.
+ */
+static void
+fmr_op_unmap_sync(struct rpcrdma_xprt *r_xprt, struct rpcrdma_req *req)
+{
+	struct rpcrdma_mr_seg *seg;
+	unsigned int i, nchunks;
+	struct rpcrdma_mw *mw;
+	LIST_HEAD(unmap_list);
+	int rc;
+
+	dprintk("RPC:       %s: req %p\n", __func__, req);
+
+	/* ORDER: Invalidate all of the req's MRs first
+	 *
+	 * ib_unmap_fmr() is slow, so use a single call instead
+	 * of one call per mapped MR.
+	 */
+	for (i = 0, nchunks = req->rl_nchunks; nchunks; nchunks--) {
+		seg = &req->rl_segments[i];
+		mw = seg->rl_mw;
+
+		list_add(&mw->r.fmr.fmr->list, &unmap_list);
+
+		i += seg->mr_nsegs;
+	}
+	rc = ib_unmap_fmr(&unmap_list);
+	if (rc)
+		pr_warn("%s: ib_unmap_fmr failed (%i)\n", __func__, rc);
+
+	/* ORDER: Now DMA unmap all of the req's MRs, and return
+	 * them to the free MW list.
+	 */
+	for (i = 0, nchunks = req->rl_nchunks; nchunks; nchunks--) {
+		seg = &req->rl_segments[i];
+
+		__fmr_dma_unmap(r_xprt, seg);
+
+		i += seg->mr_nsegs;
+		seg->mr_nsegs = 0;
+	}
+
+	req->rl_nchunks = 0;
+}
+
 /* Use the ib_unmap_fmr() verb to prevent further remote
  * access via RDMA READ or RDMA WRITE.
  */
@@ -231,6 +294,7 @@ fmr_op_destroy(struct rpcrdma_buffer *buf)
 
 const struct rpcrdma_memreg_ops rpcrdma_fmr_memreg_ops = {
 	.ro_map				= fmr_op_map,
+	.ro_unmap_sync			= fmr_op_unmap_sync,
 	.ro_unmap			= fmr_op_unmap,
 	.ro_open			= fmr_op_open,
 	.ro_maxpages			= fmr_op_maxpages,

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 08/11] xprtrdma: Add ro_unmap_sync method for all-physical registration
       [not found] ` <20151214211317.16295.70115.stgit-FYjufvaPoItvLzlybtyyYzGyq/o6K9yX@public.gmane.org>
                     ` (6 preceding siblings ...)
  2015-12-14 21:18   ` [PATCH v3 07/11] xprtrdma: Add ro_unmap_sync method for FMR Chuck Lever
@ 2015-12-14 21:19   ` Chuck Lever
  2015-12-14 21:19   ` [PATCH v3 09/11] SUNRPC: Introduce xprt_commit_rqst() Chuck Lever
                     ` (3 subsequent siblings)
  11 siblings, 0 replies; 23+ messages in thread
From: Chuck Lever @ 2015-12-14 21:19 UTC (permalink / raw)
  To: anna.schumaker-HgOvQuBEEgTQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA

physical's ro_unmap is synchronous already. The new ro_unmap_sync
method just has to DMA unmap all MRs associated with the RPC
request.

Signed-off-by: Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
---
 net/sunrpc/xprtrdma/physical_ops.c |   13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/net/sunrpc/xprtrdma/physical_ops.c b/net/sunrpc/xprtrdma/physical_ops.c
index 617b76f..dbb302e 100644
--- a/net/sunrpc/xprtrdma/physical_ops.c
+++ b/net/sunrpc/xprtrdma/physical_ops.c
@@ -83,6 +83,18 @@ physical_op_unmap(struct rpcrdma_xprt *r_xprt, struct rpcrdma_mr_seg *seg)
 	return 1;
 }
 
+/* DMA unmap all memory regions that were mapped for "req".
+ */
+static void
+physical_op_unmap_sync(struct rpcrdma_xprt *r_xprt, struct rpcrdma_req *req)
+{
+	struct ib_device *device = r_xprt->rx_ia.ri_device;
+	unsigned int i;
+
+	for (i = 0; req->rl_nchunks; --req->rl_nchunks)
+		rpcrdma_unmap_one(device, &req->rl_segments[i++]);
+}
+
 static void
 physical_op_destroy(struct rpcrdma_buffer *buf)
 {
@@ -90,6 +102,7 @@ physical_op_destroy(struct rpcrdma_buffer *buf)
 
 const struct rpcrdma_memreg_ops rpcrdma_physical_memreg_ops = {
 	.ro_map				= physical_op_map,
+	.ro_unmap_sync			= physical_op_unmap_sync,
 	.ro_unmap			= physical_op_unmap,
 	.ro_open			= physical_op_open,
 	.ro_maxpages			= physical_op_maxpages,

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 09/11] SUNRPC: Introduce xprt_commit_rqst()
       [not found] ` <20151214211317.16295.70115.stgit-FYjufvaPoItvLzlybtyyYzGyq/o6K9yX@public.gmane.org>
                     ` (7 preceding siblings ...)
  2015-12-14 21:19   ` [PATCH v3 08/11] xprtrdma: Add ro_unmap_sync method for all-physical registration Chuck Lever
@ 2015-12-14 21:19   ` Chuck Lever
       [not found]     ` <20151214211915.16295.30339.stgit-FYjufvaPoItvLzlybtyyYzGyq/o6K9yX@public.gmane.org>
  2015-12-14 21:19   ` [PATCH v3 10/11] xprtrdma: Invalidate in the RPC reply handler Chuck Lever
                     ` (2 subsequent siblings)
  11 siblings, 1 reply; 23+ messages in thread
From: Chuck Lever @ 2015-12-14 21:19 UTC (permalink / raw)
  To: anna.schumaker-HgOvQuBEEgTQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA

I'm about to add code in the RPC/RDMA reply handler between the
xprt_lookup_rqst() and xprt_complete_rqst() call site that needs
to execute outside of spinlock critical sections.

Add a hook to remove an rpc_rqst from the pending list once
the transport knows its going to invoke xprt_complete_rqst().

Signed-off-by: Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
---
 include/linux/sunrpc/xprt.h    |    1 +
 net/sunrpc/xprt.c              |   14 ++++++++++++++
 net/sunrpc/xprtrdma/rpc_rdma.c |    4 ++++
 3 files changed, 19 insertions(+)

diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
index 69ef5b3..ab6c3a5 100644
--- a/include/linux/sunrpc/xprt.h
+++ b/include/linux/sunrpc/xprt.h
@@ -366,6 +366,7 @@ void			xprt_wait_for_buffer_space(struct rpc_task *task, rpc_action action);
 void			xprt_write_space(struct rpc_xprt *xprt);
 void			xprt_adjust_cwnd(struct rpc_xprt *xprt, struct rpc_task *task, int result);
 struct rpc_rqst *	xprt_lookup_rqst(struct rpc_xprt *xprt, __be32 xid);
+void			xprt_commit_rqst(struct rpc_task *task);
 void			xprt_complete_rqst(struct rpc_task *task, int copied);
 void			xprt_release_rqst_cong(struct rpc_task *task);
 void			xprt_disconnect_done(struct rpc_xprt *xprt);
diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index 2e98f4a..a5be4ab 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -837,6 +837,20 @@ static void xprt_update_rtt(struct rpc_task *task)
 }
 
 /**
+ * xprt_commit_rqst - remove rqst from pending list early
+ * @task: RPC request to remove
+ *
+ * Caller holds transport lock.
+ */
+void xprt_commit_rqst(struct rpc_task *task)
+{
+	struct rpc_rqst *req = task->tk_rqstp;
+
+	list_del_init(&req->rq_list);
+}
+EXPORT_SYMBOL_GPL(xprt_commit_rqst);
+
+/**
  * xprt_complete_rqst - called when reply processing is complete
  * @task: RPC request that recently completed
  * @copied: actual number of bytes received from the transport
diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
index c10d969..0bc8c39 100644
--- a/net/sunrpc/xprtrdma/rpc_rdma.c
+++ b/net/sunrpc/xprtrdma/rpc_rdma.c
@@ -804,6 +804,9 @@ rpcrdma_reply_handler(struct rpcrdma_rep *rep)
 	if (req->rl_reply)
 		goto out_duplicate;
 
+	xprt_commit_rqst(rqst->rq_task);
+	spin_unlock_bh(&xprt->transport_lock);
+
 	dprintk("RPC:       %s: reply 0x%p completes request 0x%p\n"
 		"                   RPC request 0x%p xid 0x%08x\n",
 			__func__, rep, req, rqst,
@@ -894,6 +897,7 @@ badheader:
 	else if (credits > r_xprt->rx_buf.rb_max_requests)
 		credits = r_xprt->rx_buf.rb_max_requests;
 
+	spin_lock_bh(&xprt->transport_lock);
 	cwnd = xprt->cwnd;
 	xprt->cwnd = credits << RPC_CWNDSHIFT;
 	if (xprt->cwnd > cwnd)

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 10/11] xprtrdma: Invalidate in the RPC reply handler
       [not found] ` <20151214211317.16295.70115.stgit-FYjufvaPoItvLzlybtyyYzGyq/o6K9yX@public.gmane.org>
                     ` (8 preceding siblings ...)
  2015-12-14 21:19   ` [PATCH v3 09/11] SUNRPC: Introduce xprt_commit_rqst() Chuck Lever
@ 2015-12-14 21:19   ` Chuck Lever
  2015-12-14 21:19   ` [PATCH v3 11/11] xprtrdma: Revert commit e7104a2a9606 ('xprtrdma: Cap req_cqinit') Chuck Lever
  2015-12-15 19:37   ` [PATCH v3 00/11] NFS/RDMA client patches for 4.5 Anna Schumaker
  11 siblings, 0 replies; 23+ messages in thread
From: Chuck Lever @ 2015-12-14 21:19 UTC (permalink / raw)
  To: anna.schumaker-HgOvQuBEEgTQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA

There is a window between the time the RPC reply handler wakes the
waiting RPC task and when xprt_release() invokes ops->buf_free.
During this time, memory regions containing the data payload may
still be accessed by a broken or malicious server, but the RPC
application has already been allowed access to the memory containing
the RPC request's data payloads.

The server should be fenced from client memory containing RPC data
payloads _before_ the RPC application is allowed to continue.

This change also more strongly enforces send queue accounting. There
is a maximum number of RPC calls allowed to be outstanding. When an
RPC/RDMA transport is set up, just enough send queue resources are
allocated to handle registration, Send, and invalidation WRs for
each those RPCs at the same time.

Before, additional RPC calls could be dispatched while invalidation
WRs were still consuming send WQEs. When invalidation WRs backed
up, dispatching additional RPCs resulted in a send queue overrun.

Now, the reply handler prevents RPC dispatch until invalidation is
complete. This prevents RPC call dispatch until there are enough
send queue resources to proceed.

Still to do: If an RPC exits early (say, ^C), the reply handler has
no opportunity to perform invalidation. Currently, xprt_rdma_free()
still frees remaining RDMA resources, which could deadlock.
Additional changes are needed to handle invalidation properly in this
case.

Reported-by: Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
Signed-off-by: Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
---
 net/sunrpc/xprtrdma/rpc_rdma.c |   10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
index 0bc8c39..3d00c5d 100644
--- a/net/sunrpc/xprtrdma/rpc_rdma.c
+++ b/net/sunrpc/xprtrdma/rpc_rdma.c
@@ -891,6 +891,16 @@ badheader:
 		break;
 	}
 
+	/* Invalidate and flush the data payloads before waking the
+	 * waiting application. This guarantees the memory region is
+	 * properly fenced from the server before the application
+	 * accesses the data. It also ensures proper send flow
+	 * control: waking the next RPC waits until this RPC has
+	 * relinquished all its Send Queue entries.
+	 */
+	if (req->rl_nchunks)
+		r_xprt->rx_ia.ri_ops->ro_unmap_sync(r_xprt, req);
+
 	credits = be32_to_cpu(headerp->rm_credit);
 	if (credits == 0)
 		credits = 1;	/* don't deadlock */

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 11/11] xprtrdma: Revert commit e7104a2a9606 ('xprtrdma: Cap req_cqinit').
       [not found] ` <20151214211317.16295.70115.stgit-FYjufvaPoItvLzlybtyyYzGyq/o6K9yX@public.gmane.org>
                     ` (9 preceding siblings ...)
  2015-12-14 21:19   ` [PATCH v3 10/11] xprtrdma: Invalidate in the RPC reply handler Chuck Lever
@ 2015-12-14 21:19   ` Chuck Lever
  2015-12-15 19:37   ` [PATCH v3 00/11] NFS/RDMA client patches for 4.5 Anna Schumaker
  11 siblings, 0 replies; 23+ messages in thread
From: Chuck Lever @ 2015-12-14 21:19 UTC (permalink / raw)
  To: anna.schumaker-HgOvQuBEEgTQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA

The root of the problem was that sends (especially unsignalled
FASTREG and LOCAL_INV Work Requests) were not properly flow-
controlled, which allowed a send queue overrun.

Now that the RPC/RDMA reply handler waits for invalidation to
complete, the send queue is properly flow-controlled. Thus this
limit is no longer necessary.

Signed-off-by: Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
---
 net/sunrpc/xprtrdma/verbs.c     |    6 ++----
 net/sunrpc/xprtrdma/xprt_rdma.h |    6 ------
 2 files changed, 2 insertions(+), 10 deletions(-)

diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
index f23f3d6..1867e3a 100644
--- a/net/sunrpc/xprtrdma/verbs.c
+++ b/net/sunrpc/xprtrdma/verbs.c
@@ -608,10 +608,8 @@ rpcrdma_ep_create(struct rpcrdma_ep *ep, struct rpcrdma_ia *ia,
 
 	/* set trigger for requesting send completion */
 	ep->rep_cqinit = ep->rep_attr.cap.max_send_wr/2 - 1;
-	if (ep->rep_cqinit > RPCRDMA_MAX_UNSIGNALED_SENDS)
-		ep->rep_cqinit = RPCRDMA_MAX_UNSIGNALED_SENDS;
-	else if (ep->rep_cqinit <= 2)
-		ep->rep_cqinit = 0;
+	if (ep->rep_cqinit <= 2)
+		ep->rep_cqinit = 0;	/* always signal? */
 	INIT_CQCOUNT(ep);
 	init_waitqueue_head(&ep->rep_connect_wait);
 	INIT_DELAYED_WORK(&ep->rep_connect_worker, rpcrdma_connect_worker);
diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
index 089a7db..ba3bc3f 100644
--- a/net/sunrpc/xprtrdma/xprt_rdma.h
+++ b/net/sunrpc/xprtrdma/xprt_rdma.h
@@ -87,12 +87,6 @@ struct rpcrdma_ep {
 	struct delayed_work	rep_connect_worker;
 };
 
-/*
- * Force a signaled SEND Work Request every so often,
- * in case the provider needs to do some housekeeping.
- */
-#define RPCRDMA_MAX_UNSIGNALED_SENDS	(32)
-
 #define INIT_CQCOUNT(ep) atomic_set(&(ep)->rep_cqcount, (ep)->rep_cqinit)
 #define DECR_CQCOUNT(ep) atomic_sub_return(1, &(ep)->rep_cqcount)
 

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 00/11] NFS/RDMA client patches for 4.5
       [not found] ` <20151214211317.16295.70115.stgit-FYjufvaPoItvLzlybtyyYzGyq/o6K9yX@public.gmane.org>
                     ` (10 preceding siblings ...)
  2015-12-14 21:19   ` [PATCH v3 11/11] xprtrdma: Revert commit e7104a2a9606 ('xprtrdma: Cap req_cqinit') Chuck Lever
@ 2015-12-15 19:37   ` Anna Schumaker
       [not found]     ` <56706BE0.7010609-ZwjVKphTwtPQT0dZR+AlfA@public.gmane.org>
  11 siblings, 1 reply; 23+ messages in thread
From: Anna Schumaker @ 2015-12-15 19:37 UTC (permalink / raw)
  To: Chuck Lever
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA

Thanks, Chuck!

Everything looks okay to me, so I'll apply these patches and send them to Trond before the holiday.

On 12/14/2015 04:17 PM, Chuck Lever wrote:
> For 4.5, I'd like to address the send queue accounting and
> invalidation/unmap ordering issues Jason brought up a couple of
> months ago.
> 
> In preparation for Doug's final topic branch, Anna, I've rebased
> these on Christoph's ib_device_attr branch, but there were no merge
> conflicts or other changes needed. Could you begin preparing these
> for linux-next and other final testing and review?

No merge conflicts is nice, and we might not need to worry about ordering the pull request.

Thanks,
Anna

> 
> Also available in the "nfs-rdma-for-4.5" topic branch of this git repo:
> 
> git://git.linux-nfs.org/projects/cel/cel-2.6.git
> 
> Or for browsing:
> 
> http://git.linux-nfs.org/?p=cel/cel-2.6.git;a=log;h=refs/heads/nfs-rdma-for-4.5
> 
> 
> Changes since v2:
> - Rebased on Christoph's ib_device_attr branch
> 
> 
> Changes since v1:
> 
> - Rebased on v4.4-rc3
> - Receive buffer safety margin patch dropped
> - Backchannel pr_err and pr_info converted to dprintk
> - Backchannel spin locks converted to work queue-safe locks
> - Fixed premature release of backchannel request buffer
> - NFSv4.1 callbacks tested with for-4.5 server
> 
> ---
> 
> Chuck Lever (11):
>       xprtrdma: Fix additional uses of spin_lock_irqsave(rb_lock)
>       xprtrdma: xprt_rdma_free() must not release backchannel reqs
>       xprtrdma: Disable RPC/RDMA backchannel debugging messages
>       xprtrdma: Move struct ib_send_wr off the stack
>       xprtrdma: Introduce ro_unmap_sync method
>       xprtrdma: Add ro_unmap_sync method for FRWR
>       xprtrdma: Add ro_unmap_sync method for FMR
>       xprtrdma: Add ro_unmap_sync method for all-physical registration
>       SUNRPC: Introduce xprt_commit_rqst()
>       xprtrdma: Invalidate in the RPC reply handler
>       xprtrdma: Revert commit e7104a2a9606 ('xprtrdma: Cap req_cqinit').
> 
> 
>  include/linux/sunrpc/xprt.h        |    1 
>  net/sunrpc/xprt.c                  |   14 +++
>  net/sunrpc/xprtrdma/backchannel.c  |   22 ++---
>  net/sunrpc/xprtrdma/fmr_ops.c      |   64 +++++++++++++
>  net/sunrpc/xprtrdma/frwr_ops.c     |  175 +++++++++++++++++++++++++++++++-----
>  net/sunrpc/xprtrdma/physical_ops.c |   13 +++
>  net/sunrpc/xprtrdma/rpc_rdma.c     |   14 +++
>  net/sunrpc/xprtrdma/transport.c    |    3 +
>  net/sunrpc/xprtrdma/verbs.c        |   13 +--
>  net/sunrpc/xprtrdma/xprt_rdma.h    |   12 +-
>  10 files changed, 283 insertions(+), 48 deletions(-)
> 
> --
> Chuck Lever
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 00/11] NFS/RDMA client patches for 4.5
       [not found]     ` <56706BE0.7010609-ZwjVKphTwtPQT0dZR+AlfA@public.gmane.org>
@ 2015-12-16 12:11       ` Devesh Sharma
  0 siblings, 0 replies; 23+ messages in thread
From: Devesh Sharma @ 2015-12-16 12:11 UTC (permalink / raw)
  To: Anna Schumaker
  Cc: Chuck Lever, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Linux NFS Mailing List

Hi Chuck,

iozone passed on ocrdma device. Link bounce fails to recover iozone
traffic, however failure is not related to this patch series. I am in
processes of finding out the patch which broke it.

Tested-By: Devesh Sharma <devesh.sharma-1wcpHE2jlwO1Z/+hSey0Gg@public.gmane.org>

On Wed, Dec 16, 2015 at 1:07 AM, Anna Schumaker
<Anna.Schumaker-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org> wrote:
> Thanks, Chuck!
>
> Everything looks okay to me, so I'll apply these patches and send them to Trond before the holiday.
>
> On 12/14/2015 04:17 PM, Chuck Lever wrote:
>> For 4.5, I'd like to address the send queue accounting and
>> invalidation/unmap ordering issues Jason brought up a couple of
>> months ago.
>>
>> In preparation for Doug's final topic branch, Anna, I've rebased
>> these on Christoph's ib_device_attr branch, but there were no merge
>> conflicts or other changes needed. Could you begin preparing these
>> for linux-next and other final testing and review?
>
> No merge conflicts is nice, and we might not need to worry about ordering the pull request.
>
> Thanks,
> Anna
>
>>
>> Also available in the "nfs-rdma-for-4.5" topic branch of this git repo:
>>
>> git://git.linux-nfs.org/projects/cel/cel-2.6.git
>>
>> Or for browsing:
>>
>> http://git.linux-nfs.org/?p=cel/cel-2.6.git;a=log;h=refs/heads/nfs-rdma-for-4.5
>>
>>
>> Changes since v2:
>> - Rebased on Christoph's ib_device_attr branch
>>
>>
>> Changes since v1:
>>
>> - Rebased on v4.4-rc3
>> - Receive buffer safety margin patch dropped
>> - Backchannel pr_err and pr_info converted to dprintk
>> - Backchannel spin locks converted to work queue-safe locks
>> - Fixed premature release of backchannel request buffer
>> - NFSv4.1 callbacks tested with for-4.5 server
>>
>> ---
>>
>> Chuck Lever (11):
>>       xprtrdma: Fix additional uses of spin_lock_irqsave(rb_lock)
>>       xprtrdma: xprt_rdma_free() must not release backchannel reqs
>>       xprtrdma: Disable RPC/RDMA backchannel debugging messages
>>       xprtrdma: Move struct ib_send_wr off the stack
>>       xprtrdma: Introduce ro_unmap_sync method
>>       xprtrdma: Add ro_unmap_sync method for FRWR
>>       xprtrdma: Add ro_unmap_sync method for FMR
>>       xprtrdma: Add ro_unmap_sync method for all-physical registration
>>       SUNRPC: Introduce xprt_commit_rqst()
>>       xprtrdma: Invalidate in the RPC reply handler
>>       xprtrdma: Revert commit e7104a2a9606 ('xprtrdma: Cap req_cqinit').
>>
>>
>>  include/linux/sunrpc/xprt.h        |    1
>>  net/sunrpc/xprt.c                  |   14 +++
>>  net/sunrpc/xprtrdma/backchannel.c  |   22 ++---
>>  net/sunrpc/xprtrdma/fmr_ops.c      |   64 +++++++++++++
>>  net/sunrpc/xprtrdma/frwr_ops.c     |  175 +++++++++++++++++++++++++++++++-----
>>  net/sunrpc/xprtrdma/physical_ops.c |   13 +++
>>  net/sunrpc/xprtrdma/rpc_rdma.c     |   14 +++
>>  net/sunrpc/xprtrdma/transport.c    |    3 +
>>  net/sunrpc/xprtrdma/verbs.c        |   13 +--
>>  net/sunrpc/xprtrdma/xprt_rdma.h    |   12 +-
>>  10 files changed, 283 insertions(+), 48 deletions(-)
>>
>> --
>> Chuck Lever
>>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 09/11] SUNRPC: Introduce xprt_commit_rqst()
       [not found]     ` <20151214211915.16295.30339.stgit-FYjufvaPoItvLzlybtyyYzGyq/o6K9yX@public.gmane.org>
@ 2015-12-16 13:48       ` Anna Schumaker
       [not found]         ` <56716BB4.8080500-ZwjVKphTwtPQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Anna Schumaker @ 2015-12-16 13:48 UTC (permalink / raw)
  To: Chuck Lever
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA

Hi Chuck,

Sorry for the last minute comment.

On 12/14/2015 04:19 PM, Chuck Lever wrote:
> I'm about to add code in the RPC/RDMA reply handler between the
> xprt_lookup_rqst() and xprt_complete_rqst() call site that needs
> to execute outside of spinlock critical sections.
> 
> Add a hook to remove an rpc_rqst from the pending list once
> the transport knows its going to invoke xprt_complete_rqst().
> 
> Signed-off-by: Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
> ---
>  include/linux/sunrpc/xprt.h    |    1 +
>  net/sunrpc/xprt.c              |   14 ++++++++++++++
>  net/sunrpc/xprtrdma/rpc_rdma.c |    4 ++++
>  3 files changed, 19 insertions(+)
> 
> diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
> index 69ef5b3..ab6c3a5 100644
> --- a/include/linux/sunrpc/xprt.h
> +++ b/include/linux/sunrpc/xprt.h
> @@ -366,6 +366,7 @@ void			xprt_wait_for_buffer_space(struct rpc_task *task, rpc_action action);
>  void			xprt_write_space(struct rpc_xprt *xprt);
>  void			xprt_adjust_cwnd(struct rpc_xprt *xprt, struct rpc_task *task, int result);
>  struct rpc_rqst *	xprt_lookup_rqst(struct rpc_xprt *xprt, __be32 xid);
> +void			xprt_commit_rqst(struct rpc_task *task);
>  void			xprt_complete_rqst(struct rpc_task *task, int copied);
>  void			xprt_release_rqst_cong(struct rpc_task *task);
>  void			xprt_disconnect_done(struct rpc_xprt *xprt);
> diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
> index 2e98f4a..a5be4ab 100644
> --- a/net/sunrpc/xprt.c
> +++ b/net/sunrpc/xprt.c
> @@ -837,6 +837,20 @@ static void xprt_update_rtt(struct rpc_task *task)
>  }
>  
>  /**
> + * xprt_commit_rqst - remove rqst from pending list early
> + * @task: RPC request to remove
> + *
> + * Caller holds transport lock.
> + */
> +void xprt_commit_rqst(struct rpc_task *task)
> +{
> +	struct rpc_rqst *req = task->tk_rqstp;
> +
> +	list_del_init(&req->rq_list);
> +}
> +EXPORT_SYMBOL_GPL(xprt_commit_rqst);

Can you move this function into the xprtrdma code, since it's not called outside of there?

Thanks,
Anna

> +
> +/**
>   * xprt_complete_rqst - called when reply processing is complete
>   * @task: RPC request that recently completed
>   * @copied: actual number of bytes received from the transport
> diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
> index c10d969..0bc8c39 100644
> --- a/net/sunrpc/xprtrdma/rpc_rdma.c
> +++ b/net/sunrpc/xprtrdma/rpc_rdma.c
> @@ -804,6 +804,9 @@ rpcrdma_reply_handler(struct rpcrdma_rep *rep)
>  	if (req->rl_reply)
>  		goto out_duplicate;
>  
> +	xprt_commit_rqst(rqst->rq_task);
> +	spin_unlock_bh(&xprt->transport_lock);
> +
>  	dprintk("RPC:       %s: reply 0x%p completes request 0x%p\n"
>  		"                   RPC request 0x%p xid 0x%08x\n",
>  			__func__, rep, req, rqst,
> @@ -894,6 +897,7 @@ badheader:
>  	else if (credits > r_xprt->rx_buf.rb_max_requests)
>  		credits = r_xprt->rx_buf.rb_max_requests;
>  
> +	spin_lock_bh(&xprt->transport_lock);
>  	cwnd = xprt->cwnd;
>  	xprt->cwnd = credits << RPC_CWNDSHIFT;
>  	if (xprt->cwnd > cwnd)
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 06/11] xprtrdma: Add ro_unmap_sync method for FRWR
       [not found]     ` <20151214211827.16295.76338.stgit-FYjufvaPoItvLzlybtyyYzGyq/o6K9yX@public.gmane.org>
@ 2015-12-16 13:57       ` Sagi Grimberg
       [not found]         ` <56716DE4.8080403-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Sagi Grimberg @ 2015-12-16 13:57 UTC (permalink / raw)
  To: Chuck Lever, anna.schumaker-HgOvQuBEEgTQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA


> +static void
> +__frwr_dma_unmap(struct rpcrdma_xprt *r_xprt, struct rpcrdma_mr_seg *seg,
> +		 int rc)
> +{
> +	struct ib_device *device = r_xprt->rx_ia.ri_device;
> +	struct rpcrdma_mw *mw = seg->rl_mw;
> +	int nsegs = seg->mr_nsegs;
> +
> +	seg->rl_mw = NULL;
> +
> +	while (nsegs--)
> +		rpcrdma_unmap_one(device, seg++);

Chuck, shouldn't this be replaced with ib_dma_unmap_sg?

Sorry for the late comment (Didn't find enough time to properly
review this...)
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 04/11] xprtrdma: Move struct ib_send_wr off the stack
       [not found]     ` <20151214211811.16295.47695.stgit-FYjufvaPoItvLzlybtyyYzGyq/o6K9yX@public.gmane.org>
@ 2015-12-16 14:00       ` Sagi Grimberg
       [not found]         ` <56716E6C.4020604-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Sagi Grimberg @ 2015-12-16 14:00 UTC (permalink / raw)
  To: Chuck Lever, anna.schumaker-HgOvQuBEEgTQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA


> diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
> index 4197191..e60d817 100644
> --- a/net/sunrpc/xprtrdma/xprt_rdma.h
> +++ b/net/sunrpc/xprtrdma/xprt_rdma.h
> @@ -206,6 +206,8 @@ struct rpcrdma_frmr {
>   	enum rpcrdma_frmr_state		fr_state;
>   	struct work_struct		fr_work;
>   	struct rpcrdma_xprt		*fr_xprt;
> +	struct ib_reg_wr		fr_regwr;
> +	struct ib_send_wr		fr_invwr;

Would it make sense to unionize these as they are guaranteed not to
execute together? Some people don't like this sort of savings.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 04/11] xprtrdma: Move struct ib_send_wr off the stack
       [not found]         ` <56716E6C.4020604-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2015-12-16 15:06           ` Chuck Lever
       [not found]             ` <F2E992ED-2698-43DE-B7C5-4F02DA75AE42-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Chuck Lever @ 2015-12-16 15:06 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: anna.schumaker-HgOvQuBEEgTQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Linux NFS Mailing List


> On Dec 16, 2015, at 9:00 AM, Sagi Grimberg <sagig-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
> 
> 
>> diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
>> index 4197191..e60d817 100644
>> --- a/net/sunrpc/xprtrdma/xprt_rdma.h
>> +++ b/net/sunrpc/xprtrdma/xprt_rdma.h
>> @@ -206,6 +206,8 @@ struct rpcrdma_frmr {
>>  	enum rpcrdma_frmr_state		fr_state;
>>  	struct work_struct		fr_work;
>>  	struct rpcrdma_xprt		*fr_xprt;
>> +	struct ib_reg_wr		fr_regwr;
>> +	struct ib_send_wr		fr_invwr;
> 
> Would it make sense to unionize these as they are guaranteed not to
> execute together? Some people don't like this sort of savings.

I dislike unions because they make the code that uses
them less readable. I can define macros to help that,
but sigh! OK.


--
Chuck Lever




--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 06/11] xprtrdma: Add ro_unmap_sync method for FRWR
       [not found]         ` <56716DE4.8080403-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2015-12-16 15:09           ` Chuck Lever
  0 siblings, 0 replies; 23+ messages in thread
From: Chuck Lever @ 2015-12-16 15:09 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: anna.schumaker-HgOvQuBEEgTQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Linux NFS Mailing List


> On Dec 16, 2015, at 8:57 AM, Sagi Grimberg <sagig-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
> 
> 
>> +static void
>> +__frwr_dma_unmap(struct rpcrdma_xprt *r_xprt, struct rpcrdma_mr_seg *seg,
>> +		 int rc)
>> +{
>> +	struct ib_device *device = r_xprt->rx_ia.ri_device;
>> +	struct rpcrdma_mw *mw = seg->rl_mw;
>> +	int nsegs = seg->mr_nsegs;
>> +
>> +	seg->rl_mw = NULL;
>> +
>> +	while (nsegs--)
>> +		rpcrdma_unmap_one(device, seg++);
> 
> Chuck, shouldn't this be replaced with ib_dma_unmap_sg?

Looks like this was left over from before the conversion
to use ib_dma_unmap_sg. I'll have a look.

> Sorry for the late comment (Didn't find enough time to properly
> review this...)

--
Chuck Lever




--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 04/11] xprtrdma: Move struct ib_send_wr off the stack
       [not found]             ` <F2E992ED-2698-43DE-B7C5-4F02DA75AE42-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
@ 2015-12-16 15:11               ` Christoph Hellwig
       [not found]                 ` <20151216151115.GA16905-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2015-12-16 15:11 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Sagi Grimberg, anna.schumaker-HgOvQuBEEgTQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Linux NFS Mailing List

On Wed, Dec 16, 2015 at 10:06:33AM -0500, Chuck Lever wrote:
> > Would it make sense to unionize these as they are guaranteed not to
> > execute together? Some people don't like this sort of savings.
> 
> I dislike unions because they make the code that uses
> them less readable. I can define macros to help that,
> but sigh! OK.

Shouldn't be an issue with transparent unions these days:

	union {
		struct ib_reg_wr                fr_regwr;
		struct ib_send_wr               fr_invwr;
	};
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 09/11] SUNRPC: Introduce xprt_commit_rqst()
       [not found]         ` <56716BB4.8080500-ZwjVKphTwtPQT0dZR+AlfA@public.gmane.org>
@ 2015-12-16 15:12           ` Chuck Lever
  0 siblings, 0 replies; 23+ messages in thread
From: Chuck Lever @ 2015-12-16 15:12 UTC (permalink / raw)
  To: Anna Schumaker; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Linux NFS Mailing List

Hi Anna-


> On Dec 16, 2015, at 8:48 AM, Anna Schumaker <Anna.Schumaker-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org> wrote:
> 
> Hi Chuck,
> 
> Sorry for the last minute comment.
> 
> On 12/14/2015 04:19 PM, Chuck Lever wrote:
>> I'm about to add code in the RPC/RDMA reply handler between the
>> xprt_lookup_rqst() and xprt_complete_rqst() call site that needs
>> to execute outside of spinlock critical sections.
>> 
>> Add a hook to remove an rpc_rqst from the pending list once
>> the transport knows its going to invoke xprt_complete_rqst().
>> 
>> Signed-off-by: Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
>> ---
>> include/linux/sunrpc/xprt.h    |    1 +
>> net/sunrpc/xprt.c              |   14 ++++++++++++++
>> net/sunrpc/xprtrdma/rpc_rdma.c |    4 ++++
>> 3 files changed, 19 insertions(+)
>> 
>> diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
>> index 69ef5b3..ab6c3a5 100644
>> --- a/include/linux/sunrpc/xprt.h
>> +++ b/include/linux/sunrpc/xprt.h
>> @@ -366,6 +366,7 @@ void			xprt_wait_for_buffer_space(struct rpc_task *task, rpc_action action);
>> void			xprt_write_space(struct rpc_xprt *xprt);
>> void			xprt_adjust_cwnd(struct rpc_xprt *xprt, struct rpc_task *task, int result);
>> struct rpc_rqst *	xprt_lookup_rqst(struct rpc_xprt *xprt, __be32 xid);
>> +void			xprt_commit_rqst(struct rpc_task *task);
>> void			xprt_complete_rqst(struct rpc_task *task, int copied);
>> void			xprt_release_rqst_cong(struct rpc_task *task);
>> void			xprt_disconnect_done(struct rpc_xprt *xprt);
>> diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
>> index 2e98f4a..a5be4ab 100644
>> --- a/net/sunrpc/xprt.c
>> +++ b/net/sunrpc/xprt.c
>> @@ -837,6 +837,20 @@ static void xprt_update_rtt(struct rpc_task *task)
>> }
>> 
>> /**
>> + * xprt_commit_rqst - remove rqst from pending list early
>> + * @task: RPC request to remove
>> + *
>> + * Caller holds transport lock.
>> + */
>> +void xprt_commit_rqst(struct rpc_task *task)
>> +{
>> +	struct rpc_rqst *req = task->tk_rqstp;
>> +
>> +	list_del_init(&req->rq_list);
>> +}
>> +EXPORT_SYMBOL_GPL(xprt_commit_rqst);
> 
> Can you move this function into the xprtrdma code, since it's not called outside of there?

I think that's a layering violation, and the idea is
to allow other transports to use this API eventually.

But I'll include this change in the next version of
the series.


> Thanks,
> Anna
> 
>> +
>> +/**
>>  * xprt_complete_rqst - called when reply processing is complete
>>  * @task: RPC request that recently completed
>>  * @copied: actual number of bytes received from the transport
>> diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
>> index c10d969..0bc8c39 100644
>> --- a/net/sunrpc/xprtrdma/rpc_rdma.c
>> +++ b/net/sunrpc/xprtrdma/rpc_rdma.c
>> @@ -804,6 +804,9 @@ rpcrdma_reply_handler(struct rpcrdma_rep *rep)
>> 	if (req->rl_reply)
>> 		goto out_duplicate;
>> 
>> +	xprt_commit_rqst(rqst->rq_task);
>> +	spin_unlock_bh(&xprt->transport_lock);
>> +
>> 	dprintk("RPC:       %s: reply 0x%p completes request 0x%p\n"
>> 		"                   RPC request 0x%p xid 0x%08x\n",
>> 			__func__, rep, req, rqst,
>> @@ -894,6 +897,7 @@ badheader:
>> 	else if (credits > r_xprt->rx_buf.rb_max_requests)
>> 		credits = r_xprt->rx_buf.rb_max_requests;
>> 
>> +	spin_lock_bh(&xprt->transport_lock);
>> 	cwnd = xprt->cwnd;
>> 	xprt->cwnd = credits << RPC_CWNDSHIFT;
>> 	if (xprt->cwnd > cwnd)
>> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
Chuck Lever




--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 04/11] xprtrdma: Move struct ib_send_wr off the stack
       [not found]                 ` <20151216151115.GA16905-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
@ 2015-12-16 15:13                   ` Chuck Lever
       [not found]                     ` <D5B6F9FD-FF8F-4F37-A3E5-B287B91B1EA4-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Chuck Lever @ 2015-12-16 15:13 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Sagi Grimberg, anna.schumaker-HgOvQuBEEgTQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Linux NFS Mailing List


> On Dec 16, 2015, at 10:11 AM, Christoph Hellwig <hch-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org> wrote:
> 
> On Wed, Dec 16, 2015 at 10:06:33AM -0500, Chuck Lever wrote:
>>> Would it make sense to unionize these as they are guaranteed not to
>>> execute together? Some people don't like this sort of savings.
>> 
>> I dislike unions because they make the code that uses
>> them less readable. I can define macros to help that,
>> but sigh! OK.
> 
> Shouldn't be an issue with transparent unions these days:
> 
> 	union {
> 		struct ib_reg_wr                fr_regwr;
> 		struct ib_send_wr               fr_invwr;
> 	};

Right, but isn't that a gcc-ism that Al hates? If
everyone is OK with that construction, I will use it.

--
Chuck Lever




--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 04/11] xprtrdma: Move struct ib_send_wr off the stack
       [not found]                     ` <D5B6F9FD-FF8F-4F37-A3E5-B287B91B1EA4-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
@ 2015-12-16 15:17                       ` Christoph Hellwig
  0 siblings, 0 replies; 23+ messages in thread
From: Christoph Hellwig @ 2015-12-16 15:17 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Sagi Grimberg, anna.schumaker-HgOvQuBEEgTQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Linux NFS Mailing List

On Wed, Dec 16, 2015 at 10:13:31AM -0500, Chuck Lever wrote:
> > Shouldn't be an issue with transparent unions these days:
> > 
> > 	union {
> > 		struct ib_reg_wr                fr_regwr;
> > 		struct ib_send_wr               fr_invwr;
> > 	};
> 
> Right, but isn't that a gcc-ism that Al hates? If
> everyone is OK with that construction, I will use it.

I started out as a GNUism, but now is supported in C11.  We use it
a lot all over the kernel.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2015-12-16 15:17 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-12-14 21:17 [PATCH v3 00/11] NFS/RDMA client patches for 4.5 Chuck Lever
     [not found] ` <20151214211317.16295.70115.stgit-FYjufvaPoItvLzlybtyyYzGyq/o6K9yX@public.gmane.org>
2015-12-14 21:17   ` [PATCH v3 01/11] xprtrdma: Fix additional uses of spin_lock_irqsave(rb_lock) Chuck Lever
2015-12-14 21:17   ` [PATCH v3 02/11] xprtrdma: xprt_rdma_free() must not release backchannel reqs Chuck Lever
2015-12-14 21:18   ` [PATCH v3 03/11] xprtrdma: Disable RPC/RDMA backchannel debugging messages Chuck Lever
2015-12-14 21:18   ` [PATCH v3 04/11] xprtrdma: Move struct ib_send_wr off the stack Chuck Lever
     [not found]     ` <20151214211811.16295.47695.stgit-FYjufvaPoItvLzlybtyyYzGyq/o6K9yX@public.gmane.org>
2015-12-16 14:00       ` Sagi Grimberg
     [not found]         ` <56716E6C.4020604-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2015-12-16 15:06           ` Chuck Lever
     [not found]             ` <F2E992ED-2698-43DE-B7C5-4F02DA75AE42-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2015-12-16 15:11               ` Christoph Hellwig
     [not found]                 ` <20151216151115.GA16905-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
2015-12-16 15:13                   ` Chuck Lever
     [not found]                     ` <D5B6F9FD-FF8F-4F37-A3E5-B287B91B1EA4-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2015-12-16 15:17                       ` Christoph Hellwig
2015-12-14 21:18   ` [PATCH v3 05/11] xprtrdma: Introduce ro_unmap_sync method Chuck Lever
2015-12-14 21:18   ` [PATCH v3 06/11] xprtrdma: Add ro_unmap_sync method for FRWR Chuck Lever
     [not found]     ` <20151214211827.16295.76338.stgit-FYjufvaPoItvLzlybtyyYzGyq/o6K9yX@public.gmane.org>
2015-12-16 13:57       ` Sagi Grimberg
     [not found]         ` <56716DE4.8080403-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2015-12-16 15:09           ` Chuck Lever
2015-12-14 21:18   ` [PATCH v3 07/11] xprtrdma: Add ro_unmap_sync method for FMR Chuck Lever
2015-12-14 21:19   ` [PATCH v3 08/11] xprtrdma: Add ro_unmap_sync method for all-physical registration Chuck Lever
2015-12-14 21:19   ` [PATCH v3 09/11] SUNRPC: Introduce xprt_commit_rqst() Chuck Lever
     [not found]     ` <20151214211915.16295.30339.stgit-FYjufvaPoItvLzlybtyyYzGyq/o6K9yX@public.gmane.org>
2015-12-16 13:48       ` Anna Schumaker
     [not found]         ` <56716BB4.8080500-ZwjVKphTwtPQT0dZR+AlfA@public.gmane.org>
2015-12-16 15:12           ` Chuck Lever
2015-12-14 21:19   ` [PATCH v3 10/11] xprtrdma: Invalidate in the RPC reply handler Chuck Lever
2015-12-14 21:19   ` [PATCH v3 11/11] xprtrdma: Revert commit e7104a2a9606 ('xprtrdma: Cap req_cqinit') Chuck Lever
2015-12-15 19:37   ` [PATCH v3 00/11] NFS/RDMA client patches for 4.5 Anna Schumaker
     [not found]     ` <56706BE0.7010609-ZwjVKphTwtPQT0dZR+AlfA@public.gmane.org>
2015-12-16 12:11       ` Devesh Sharma

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox