From: "J. Bruce Fields" <bfields@fieldses.org>
To: Chuck Lever <chuck.lever@oracle.com>
Cc: Linux RDMA Mailing List <linux-rdma@vger.kernel.org>,
Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH v4 01/11] svcrdma: Do not send XDR roundup bytes for a write chunk
Date: Wed, 23 Dec 2015 14:59:15 -0500
Message-ID: <20151223195915.GB27432@fieldses.org>
In-Reply-To: <DF5B7D29-0C6C-47EF-8E3E-74BF137D7F95@oracle.com>
On Mon, Dec 21, 2015 at 05:11:56PM -0500, Chuck Lever wrote:
>
> > On Dec 21, 2015, at 4:29 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> >
> > On Mon, Dec 21, 2015 at 04:15:23PM -0500, Chuck Lever wrote:
> >>
> >>> On Dec 21, 2015, at 4:07 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> >>>
> >>> On Mon, Dec 14, 2015 at 04:30:09PM -0500, Chuck Lever wrote:
> >>>> Minor optimization: when dealing with write chunk XDR roundup, do
> >>>> not post a Write WR for the zero bytes in the pad. Simply update
> >>>> the write segment in the RPC-over-RDMA header to reflect the extra
> >>>> pad bytes.
> >>>>
> >>>> The Reply chunk is also a write chunk, but the server does not use
> >>>> send_write_chunks() to send the Reply chunk. That's OK in this case:
> >>>> the server Upper Layer typically marshals the Reply chunk contents
> >>>> in a single contiguous buffer, without a separate tail for the XDR
> >>>> pad.
> >>>>
> >>>> The comments and the variable naming refer to "chunks" but what is
> >>>> really meant is "segments." The existing code sends only one
> >>>> xdr_write_chunk per RPC reply.
> >>>>
> >>>> The fix assumes this as well. When the XDR pad in the first write
> >>>> chunk is reached, the assumption is the Write list is complete and
> >>>> send_write_chunks() returns.
> >>>>
> >>>> That will remain a valid assumption until the server Upper Layer can
> >>>> support multiple bulk payload results per RPC.
> >>>>
> >>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> >>>> ---
> >>>> net/sunrpc/xprtrdma/svc_rdma_sendto.c | 7 +++++++
> >>>> 1 file changed, 7 insertions(+)
> >>>>
> >>>> diff --git a/net/sunrpc/xprtrdma/svc_rdma_sendto.c b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
> >>>> index 969a1ab..bad5eaa 100644
> >>>> --- a/net/sunrpc/xprtrdma/svc_rdma_sendto.c
> >>>> +++ b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
> >>>> @@ -342,6 +342,13 @@ static int send_write_chunks(struct svcxprt_rdma *xprt,
> >>>> arg_ch->rs_handle,
> >>>> arg_ch->rs_offset,
> >>>> write_len);
> >>>> +
> >>>> + /* Do not send XDR pad bytes */
> >>>> + if (chunk_no && write_len < 4) {
> >>>> + chunk_no++;
> >>>> + break;
> >>>
> >>> I'm pretty lost in this code. Why does (chunk_no && write_len < 4) mean
> >>> this is xdr padding?
> >>
> >> Chunk zero is always data. Padding is always going to be
> >> after the first chunk. Any chunk after chunk zero that is
> >> shorter than XDR quad alignment is going to be a pad.
> >
> > I don't really know what a chunk is.... Looking at the code:
> >
> > write_len = min(xfer_len, be32_to_cpu(arg_ch->rs_length));
> >
> > so I guess the assumption is just that those rs_length's are always a
> > multiple of four?
>
> The example you recently gave was a two-byte NFS READ
> that crosses a page boundary.
>
> In that case, the NFSD would pass down an xdr_buf that
> has one byte in a page, one byte in another page, and
> a two-byte XDR pad. The logic introduced by this
> optimization would be fooled, and neither the second
> byte nor the XDR pad would be written to the client.
>
> Unless you can think of a way to recognize an XDR pad
> in the xdr_buf 100% of the time, you should drop this
> patch.
It might be best to make this explicit: mark the bytes as padding at
encode time, rather than trying to infer them from segment lengths
afterward.
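
The roundup itself is trivial to compute when encoding; the problem is
only recognizing it after the fact in the xdr_buf. A minimal sketch of
the two calculations involved (not actual svcrdma code; both helper
names are made up for illustration):

```c
#include <assert.h>
#include <stddef.h>

/* XDR (RFC 4506) rounds every variable-length opaque body up to a
 * 4-byte boundary, so the roundup pad is always 0-3 zero bytes. */
static size_t xdr_pad_len(size_t len)
{
	return (4 - (len & 3)) & 3;
}

/* The heuristic from the patch: treat any non-first segment shorter
 * than 4 bytes as the pad.  The 2-byte READ that crosses a page
 * boundary defeats it: the second 1-byte *data* segment is also
 * "non-first and shorter than 4 bytes". */
static int looks_like_pad(int chunk_no, size_t write_len)
{
	return chunk_no && write_len < 4;
}
```

For that READ, the payload is 1 byte + 1 byte of data followed by a
2-byte pad (xdr_pad_len(2) == 2), and looks_like_pad(1, 1) already
fires on the second data byte, before the real pad is reached -- which
is exactly the failure mode described above.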
> As far as I know, none of the other patches in this
> series depend on this optimization, so please merge
> them if you can.
OK, I'll take a look at the rest, thanks.
--b.
> >> Probably too clever. Is there a better way to detect
> >> the XDR pad?
> >>
> >>
> >>>> + }
> >>>> +
> >>>> chunk_off = 0;
> >>>> while (write_len) {
> >>>> ret = send_write(xprt, rqstp,
> >>
> >> --
> >> Chuck Lever
> >>
> >>
> >>
>
> --
> Chuck Lever
>
>
>
Thread overview: 20+ messages
2015-12-14 21:30 [PATCH v4 00/11] NFS/RDMA server patches for v4.5 Chuck Lever
2015-12-14 21:30 ` [PATCH v4 01/11] svcrdma: Do not send XDR roundup bytes for a write chunk Chuck Lever
2015-12-21 21:07 ` J. Bruce Fields
2015-12-21 21:15 ` Chuck Lever
2015-12-21 21:29 ` J. Bruce Fields
2015-12-21 22:11 ` Chuck Lever
2015-12-23 19:59 ` J. Bruce Fields [this message]
2015-12-14 21:30 ` [PATCH v4 02/11] svcrdma: Clean up rdma_create_xprt() Chuck Lever
2015-12-14 21:30 ` [PATCH v4 03/11] svcrdma: Clean up process_context() Chuck Lever
2015-12-14 21:30 ` [PATCH v4 04/11] svcrdma: Improve allocation of struct svc_rdma_op_ctxt Chuck Lever
2015-12-14 21:30 ` [PATCH v4 05/11] svcrdma: Improve allocation of struct svc_rdma_req_map Chuck Lever
2015-12-14 21:30 ` [PATCH v4 06/11] svcrdma: Remove unused req_map and ctxt kmem_caches Chuck Lever
2015-12-14 21:30 ` [PATCH v4 07/11] svcrdma: Add gfp flags to svc_rdma_post_recv() Chuck Lever
2015-12-14 21:31 ` [PATCH v4 08/11] svcrdma: Remove last two __GFP_NOFAIL call sites Chuck Lever
2015-12-14 21:31 ` [PATCH v4 09/11] svcrdma: Make map_xdr non-static Chuck Lever
2015-12-14 21:31 ` [PATCH v4 10/11] svcrdma: Define maximum number of backchannel requests Chuck Lever
2015-12-14 21:31 ` [PATCH v4 11/11] svcrdma: Add class for RDMA backwards direction transport Chuck Lever
2015-12-16 12:10 ` [PATCH v4 00/11] NFS/RDMA server patches for v4.5 Devesh Sharma
2015-12-23 21:00 ` J. Bruce Fields
2015-12-24 9:57 ` Chuck Lever