From: Chuck Lever <chuck.lever@oracle.com>
To: linux-rdma@vger.kernel.org, linux-nfs@vger.kernel.org
Subject: [PATCH v4 12/16] xprtrdma: Fix large NFS SYMLINK calls
Date: Mon, 03 Aug 2015 13:04:26 -0400
Message-ID: <20150803170426.9115.31004.stgit@manet.1015granger.net>
In-Reply-To: <20150803165807.9115.23842.stgit@manet.1015granger.net>

Repair how rpcrdma_marshal_req() chooses which RDMA message type
to use for large non-WRITE operations so that it picks RDMA_NOMSG
in the correct situations, and sets up the marshaling logic to
SEND only the RPC/RDMA header.

Large NFSv2 SYMLINK requests now use RDMA_NOMSG calls. The Linux NFS
server XDR decoder for NFSv2 SYMLINK does not handle having the
pathname argument arrive in a separate buffer. The decoder could be
fixed, but this is simpler and RDMA_NOMSG can be used in a variety
of other situations.

Ensure that the Linux client continues to use "RDMA_MSG + read
list" when sending large NFSv3 SYMLINK requests, which is more
efficient than using RDMA_NOMSG.

Large NFSv4 CREATE(NF4LNK) requests are changed to use "RDMA_MSG +
read list" just like NFSv3 (see Section 5 of RFC 5667). Before,
these did not work at all.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Devesh Sharma <devesh.sharma@avagotech.com>
---
fs/nfs/nfs3xdr.c | 1 +
fs/nfs/nfs4xdr.c | 4 +++-
net/sunrpc/xprtrdma/rpc_rdma.c | 25 ++++++++++++++++---------
3 files changed, 20 insertions(+), 10 deletions(-)
diff --git a/fs/nfs/nfs3xdr.c b/fs/nfs/nfs3xdr.c
index 9b04c2e..267126d 100644
--- a/fs/nfs/nfs3xdr.c
+++ b/fs/nfs/nfs3xdr.c
@@ -1103,6 +1103,7 @@ static void nfs3_xdr_enc_symlink3args(struct rpc_rqst *req,
{
encode_diropargs3(xdr, args->fromfh, args->fromname, args->fromlen);
encode_symlinkdata3(xdr, args);
+ xdr->buf->flags |= XDRBUF_WRITE;
}
/*
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index 558cd65d..c42459e 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -1154,7 +1154,9 @@ static void encode_create(struct xdr_stream *xdr, const struct nfs4_create_arg *
case NF4LNK:
p = reserve_space(xdr, 4);
*p = cpu_to_be32(create->u.symlink.len);
- xdr_write_pages(xdr, create->u.symlink.pages, 0, create->u.symlink.len);
+ xdr_write_pages(xdr, create->u.symlink.pages, 0,
+ create->u.symlink.len);
+ xdr->buf->flags |= XDRBUF_WRITE;
break;
case NF4BLK: case NF4CHR:
diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
index 1dd48f2..2721586 100644
--- a/net/sunrpc/xprtrdma/rpc_rdma.c
+++ b/net/sunrpc/xprtrdma/rpc_rdma.c
@@ -475,21 +475,24 @@ rpcrdma_marshal_req(struct rpc_rqst *rqst)
*
* o If the total request is under the inline threshold, all ops
* are sent as inline.
- * o Large non-write ops are sent with the entire message as a
- * single read chunk (protocol 0-position special case).
* o Large write ops transmit data as read chunk(s), header as
* inline.
+ * o Large non-write ops are sent with the entire message as a
+ * single read chunk (protocol 0-position special case).
*
- * Note: the NFS code sending down multiple argument segments
- * implies the op is a write.
- * TBD check NFSv4 setacl
+ * This assumes that the upper layer does not present a request
+ * that both has a data payload, and whose non-data arguments
+ * by themselves are larger than the inline threshold.
*/
- if (rpcrdma_args_inline(rqst))
+ if (rpcrdma_args_inline(rqst)) {
rtype = rpcrdma_noch;
- else if (rqst->rq_snd_buf.page_len == 0)
- rtype = rpcrdma_areadch;
- else
+ } else if (rqst->rq_snd_buf.flags & XDRBUF_WRITE) {
rtype = rpcrdma_readch;
+ } else {
+ headerp->rm_type = htonl(RDMA_NOMSG);
+ rtype = rpcrdma_areadch;
+ rpclen = 0;
+ }
/* The following simplification is not true forever */
if (rtype != rpcrdma_noch && wtype == rpcrdma_replych)
@@ -546,6 +549,10 @@ rpcrdma_marshal_req(struct rpc_rqst *rqst)
req->rl_send_iov[0].length = hdrlen;
req->rl_send_iov[0].lkey = rdmab_lkey(req->rl_rdmabuf);
+ req->rl_niovs = 1;
+ if (rtype == rpcrdma_areadch)
+ return 0;
+
req->rl_send_iov[1].addr = rdmab_addr(req->rl_sendbuf);
req->rl_send_iov[1].length = rpclen;
req->rl_send_iov[1].lkey = rdmab_lkey(req->rl_sendbuf);
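
For readers skimming the hunks above, the three-way choice made in rpcrdma_marshal_req() can be modeled in user space roughly as follows. This is a minimal sketch, not the kernel code: every `sketch_`/`SKETCH_` name and the flag value are hypothetical stand-ins, the inline test is reduced to a plain length comparison (the real rpcrdma_args_inline() also accounts for the RPC/RDMA header), and the count of two SEND iovecs for the non-NOMSG cases is an assumption made for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's chunk and message types. */
enum sketch_chunk { SKETCH_NOCH, SKETCH_READCH, SKETCH_AREADCH };
enum sketch_msg   { SKETCH_RDMA_MSG, SKETCH_RDMA_NOMSG };

/* Illustrative flag value only; the real XDRBUF_WRITE lives in the kernel. */
#define SKETCH_XDRBUF_WRITE 0x02u

struct sketch_req {
	size_t       args_len; /* total size of the marshaled RPC call */
	unsigned int flags;    /* hints set by the upper-layer XDR encoder */
};

struct sketch_plan {
	enum sketch_msg   mtype; /* RPC/RDMA message type to advertise */
	enum sketch_chunk rtype; /* how the call arguments are conveyed */
	int               niovs; /* SEND iovecs: 1 = RPC/RDMA header only */
};

/*
 * Model of the decision in the patch:
 *  - small call: everything inline, no chunks
 *  - large call flagged XDRBUF_WRITE: RDMA_MSG + read list
 *  - any other large call: RDMA_NOMSG, whole call as a position-zero
 *    read chunk, and only the RPC/RDMA header is sent
 */
static struct sketch_plan
sketch_choose(const struct sketch_req *req, size_t inline_threshold)
{
	struct sketch_plan plan = { SKETCH_RDMA_MSG, SKETCH_NOCH, 2 };

	if (req->args_len <= inline_threshold) {
		plan.rtype = SKETCH_NOCH;
	} else if (req->flags & SKETCH_XDRBUF_WRITE) {
		plan.rtype = SKETCH_READCH;
	} else {
		plan.mtype = SKETCH_RDMA_NOMSG;
		plan.rtype = SKETCH_AREADCH;
		plan.niovs = 1;
	}
	return plan;
}
```

Under this model, a large NFSv3 SYMLINK or NFSv4 CREATE(NF4LNK), whose encoders now set XDRBUF_WRITE, takes the middle branch, while a large NFSv2 SYMLINK, whose encoder is left untouched, falls through to the RDMA_NOMSG branch.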