From: Chuck Lever <chuck.lever@oracle.com>
To: linux-nfs@vger.kernel.org, linux-rdma@vger.kernel.org
Subject: [PATCH v4 01/24] xprtrdma: mind the device's max fast register page list depth
Date: Wed, 21 May 2014 20:54:24 -0400
Message-ID: <20140522005424.27190.1625.stgit@manet.1015granger.net>
In-Reply-To: <20140522004505.27190.58897.stgit@manet.1015granger.net>
From: Steve Wise <swise@opengridcomputing.com>
Some RDMA devices don't support a fast register page list depth of
at least RPCRDMA_MAX_DATA_SEGS, so xprtrdma needs to chunk its fast
register regions according to the smaller of the device's maximum
supported depth and RPCRDMA_MAX_DATA_SEGS.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
---
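Note: the chunking described in the commit message can be pictured with
a small user-space sketch. It is only an illustration under stated
assumptions: clamp_depth() and split_into_chunks() are hypothetical
helpers that do not exist in xprtrdma, and the sketch ignores any other
reason the real registration path might end a chunk early.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the effective FRMR depth is the smaller of the
 * transport's own limit and the depth the device reports. */
unsigned int clamp_depth(unsigned int max_data_segs,
			 unsigned int dev_max_depth)
{
	return dev_max_depth < max_data_segs ? dev_max_depth : max_data_segs;
}

/* Hypothetical helper: split nsegs segments into registrations of at
 * most depth segments each. Stores each chunk size in out[] (up to cap
 * entries) and returns the number of chunks needed. */
size_t split_into_chunks(unsigned int nsegs, unsigned int depth,
			 unsigned int *out, size_t cap)
{
	size_t n = 0;

	assert(depth > 0);
	while (nsegs > 0) {
		unsigned int take = nsegs < depth ? nsegs : depth;

		if (n < cap)
			out[n] = take;
		n++;
		nsegs -= take;
	}
	return n;
}
```

With a depth of 16, a 64-segment region becomes four registrations of
16 segments each, and a 20-segment region becomes one registration of
16 segments followed by one of 4.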
net/sunrpc/xprtrdma/rpc_rdma.c | 4 ---
net/sunrpc/xprtrdma/verbs.c | 47 +++++++++++++++++++++++++++++----------
net/sunrpc/xprtrdma/xprt_rdma.h | 1 +
3 files changed, 36 insertions(+), 16 deletions(-)
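Note: the send queue sizing in the rpcrdma_ep_create() hunk can be
checked in isolation with the sketch below. frmr_wr_depth() mirrors the
loop from the patch; frmr_wr_depth_closed() and fit_send_wr() are
hypothetical names introduced here, and the closed form is an
observation of this sketch rather than something the patch states. The
value 64 used in the follow-up example is purely illustrative and is not
taken from this patch.

```c
#include <assert.h>

/* Mirrors the loop in the patch: start from the 7 WRs needed when a
 * single FRMR covers the whole page list, then add one reg + invalidate
 * pair for every additional FRMR the page list needs. */
unsigned int frmr_wr_depth(unsigned int max_data_segs,
			   unsigned int frmr_depth)
{
	unsigned int depth = 7;

	assert(frmr_depth > 0);
	if (frmr_depth < max_data_segs) {
		int delta = (int)(max_data_segs - frmr_depth);

		do {
			depth += 2;	/* FRMR reg + invalidate */
			delta -= (int)frmr_depth;
		} while (delta > 0);
	}
	return depth;
}

/* Equivalent closed form for frmr_depth <= max_data_segs:
 * 7 + 2 * (ceil(max_data_segs / frmr_depth) - 1). */
unsigned int frmr_wr_depth_closed(unsigned int max_data_segs,
				  unsigned int frmr_depth)
{
	unsigned int chunks;

	assert(frmr_depth > 0 && frmr_depth <= max_data_segs);
	chunks = (max_data_segs + frmr_depth - 1) / frmr_depth;
	return 7 + 2 * (chunks - 1);
}

/* Scale the request count by depth. If that exceeds the device's WR
 * limit, shrink *max_requests to fit. Returns the resulting send WR
 * count, or 0 when not even one request fits (the patch returns
 * -EINVAL in that case). */
unsigned int fit_send_wr(unsigned int *max_requests, unsigned int depth,
			 unsigned int max_qp_wr)
{
	unsigned int send_wr = *max_requests * depth;

	if (send_wr > max_qp_wr) {
		*max_requests = max_qp_wr / depth;
		send_wr = *max_requests * depth;
	}
	return send_wr;
}
```

For example, with a maximum of 64 data segments and a device depth of
16, the page list needs four FRMRs instead of one, so each request
needs 7 + 2 * 3 = 13 send WRs. With 32 requests and a device limit of
256 WRs, the request count drops to 256 / 13 = 19.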
diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
index 96ead52..400aa1b 100644
--- a/net/sunrpc/xprtrdma/rpc_rdma.c
+++ b/net/sunrpc/xprtrdma/rpc_rdma.c
@@ -248,10 +248,6 @@ rpcrdma_create_chunks(struct rpc_rqst *rqst, struct xdr_buf *target,
/* success. all failures return above */
req->rl_nchunks = nchunks;
- BUG_ON(nchunks == 0);
- BUG_ON((r_xprt->rx_ia.ri_memreg_strategy == RPCRDMA_FRMR)
- && (nchunks > 3));
-
/*
* finish off header. If write, marshal discrim and nchunks.
*/
diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
index 9372656..55fb09a 100644
--- a/net/sunrpc/xprtrdma/verbs.c
+++ b/net/sunrpc/xprtrdma/verbs.c
@@ -539,6 +539,11 @@ rpcrdma_ia_open(struct rpcrdma_xprt *xprt, struct sockaddr *addr, int memreg)
__func__);
memreg = RPCRDMA_REGISTER;
#endif
+ } else {
+ /* Mind the ia limit on FRMR page list depth */
+ ia->ri_max_frmr_depth = min_t(unsigned int,
+ RPCRDMA_MAX_DATA_SEGS,
+ devattr.max_fast_reg_page_list_len);
}
break;
}
@@ -659,24 +664,42 @@ rpcrdma_ep_create(struct rpcrdma_ep *ep, struct rpcrdma_ia *ia,
ep->rep_attr.srq = NULL;
ep->rep_attr.cap.max_send_wr = cdata->max_requests;
switch (ia->ri_memreg_strategy) {
- case RPCRDMA_FRMR:
+ case RPCRDMA_FRMR: {
+ int depth = 7;
+
/* Add room for frmr register and invalidate WRs.
* 1. FRMR reg WR for head
* 2. FRMR invalidate WR for head
- * 3. FRMR reg WR for pagelist
- * 4. FRMR invalidate WR for pagelist
+ * 3. N FRMR reg WRs for pagelist
+ * 4. N FRMR invalidate WRs for pagelist
* 5. FRMR reg WR for tail
* 6. FRMR invalidate WR for tail
* 7. The RDMA_SEND WR
*/
- ep->rep_attr.cap.max_send_wr *= 7;
+
+ /* Calculate N if the device max FRMR depth is smaller than
+ * RPCRDMA_MAX_DATA_SEGS.
+ */
+ if (ia->ri_max_frmr_depth < RPCRDMA_MAX_DATA_SEGS) {
+ int delta = RPCRDMA_MAX_DATA_SEGS -
+ ia->ri_max_frmr_depth;
+
+ do {
+ depth += 2; /* FRMR reg + invalidate */
+ delta -= ia->ri_max_frmr_depth;
+ } while (delta > 0);
+
+ }
+ ep->rep_attr.cap.max_send_wr *= depth;
if (ep->rep_attr.cap.max_send_wr > devattr.max_qp_wr) {
- cdata->max_requests = devattr.max_qp_wr / 7;
+ cdata->max_requests = devattr.max_qp_wr / depth;
if (!cdata->max_requests)
return -EINVAL;
- ep->rep_attr.cap.max_send_wr = cdata->max_requests * 7;
+ ep->rep_attr.cap.max_send_wr = cdata->max_requests *
+ depth;
}
break;
+ }
case RPCRDMA_MEMWINDOWS_ASYNC:
case RPCRDMA_MEMWINDOWS:
/* Add room for mw_binds+unbinds - overkill! */
@@ -1043,16 +1066,16 @@ rpcrdma_buffer_create(struct rpcrdma_buffer *buf, struct rpcrdma_ep *ep,
case RPCRDMA_FRMR:
for (i = buf->rb_max_requests * RPCRDMA_MAX_SEGS; i; i--) {
r->r.frmr.fr_mr = ib_alloc_fast_reg_mr(ia->ri_pd,
- RPCRDMA_MAX_SEGS);
+ ia->ri_max_frmr_depth);
if (IS_ERR(r->r.frmr.fr_mr)) {
rc = PTR_ERR(r->r.frmr.fr_mr);
dprintk("RPC: %s: ib_alloc_fast_reg_mr"
" failed %i\n", __func__, rc);
goto out;
}
- r->r.frmr.fr_pgl =
- ib_alloc_fast_reg_page_list(ia->ri_id->device,
- RPCRDMA_MAX_SEGS);
+ r->r.frmr.fr_pgl = ib_alloc_fast_reg_page_list(
+ ia->ri_id->device,
+ ia->ri_max_frmr_depth);
if (IS_ERR(r->r.frmr.fr_pgl)) {
rc = PTR_ERR(r->r.frmr.fr_pgl);
dprintk("RPC: %s: "
@@ -1498,8 +1521,8 @@ rpcrdma_register_frmr_external(struct rpcrdma_mr_seg *seg,
seg1->mr_offset -= pageoff; /* start of page */
seg1->mr_len += pageoff;
len = -pageoff;
- if (*nsegs > RPCRDMA_MAX_DATA_SEGS)
- *nsegs = RPCRDMA_MAX_DATA_SEGS;
+ if (*nsegs > ia->ri_max_frmr_depth)
+ *nsegs = ia->ri_max_frmr_depth;
for (page_no = i = 0; i < *nsegs;) {
rpcrdma_map_one(ia, seg, writing);
pa = seg->mr_dma;
diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
index cc1445d..98340a3 100644
--- a/net/sunrpc/xprtrdma/xprt_rdma.h
+++ b/net/sunrpc/xprtrdma/xprt_rdma.h
@@ -66,6 +66,7 @@ struct rpcrdma_ia {
struct completion ri_done;
int ri_async_rc;
enum rpcrdma_memreg ri_memreg_strategy;
+ unsigned int ri_max_frmr_depth;
};
/*