Linux NFS development
From: "J. Bruce Fields" <bfields@fieldses.org>
To: Chuck Lever <chuck.lever@oracle.com>
Cc: Linux RDMA Mailing List <linux-rdma@vger.kernel.org>,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH v4 01/11] svcrdma: Do not send XDR roundup bytes for a write chunk
Date: Wed, 23 Dec 2015 14:59:15 -0500	[thread overview]
Message-ID: <20151223195915.GB27432@fieldses.org> (raw)
In-Reply-To: <DF5B7D29-0C6C-47EF-8E3E-74BF137D7F95@oracle.com>

On Mon, Dec 21, 2015 at 05:11:56PM -0500, Chuck Lever wrote:
> 
> > On Dec 21, 2015, at 4:29 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> > 
> > On Mon, Dec 21, 2015 at 04:15:23PM -0500, Chuck Lever wrote:
> >> 
> >>> On Dec 21, 2015, at 4:07 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> >>> 
> >>> On Mon, Dec 14, 2015 at 04:30:09PM -0500, Chuck Lever wrote:
> >>>> Minor optimization: when dealing with write chunk XDR roundup, do
> >>>> not post a Write WR for the zero bytes in the pad. Simply update
> >>>> the write segment in the RPC-over-RDMA header to reflect the extra
> >>>> pad bytes.
> >>>> 
> >>>> The Reply chunk is also a write chunk, but the server does not use
> >>>> send_write_chunks() to send the Reply chunk. That's OK in this case:
> >>>> the server Upper Layer typically marshals the Reply chunk contents
> >>>> in a single contiguous buffer, without a separate tail for the XDR
> >>>> pad.
> >>>> 
> >>>> The comments and the variable naming refer to "chunks" but what is
> >>>> really meant is "segments." The existing code sends only one
> >>>> xdr_write_chunk per RPC reply.
> >>>> 
> >>>> The fix assumes this as well. When the XDR pad in the first write
> >>>> chunk is reached, the assumption is the Write list is complete and
> >>>> send_write_chunks() returns.
> >>>> 
> >>>> That will remain a valid assumption until the server Upper Layer can
> >>>> support multiple bulk payload results per RPC.
> >>>> 
> >>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> >>>> ---
> >>>> net/sunrpc/xprtrdma/svc_rdma_sendto.c |    7 +++++++
> >>>> 1 file changed, 7 insertions(+)
> >>>> 
> >>>> diff --git a/net/sunrpc/xprtrdma/svc_rdma_sendto.c b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
> >>>> index 969a1ab..bad5eaa 100644
> >>>> --- a/net/sunrpc/xprtrdma/svc_rdma_sendto.c
> >>>> +++ b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
> >>>> @@ -342,6 +342,13 @@ static int send_write_chunks(struct svcxprt_rdma *xprt,
> >>>> 						arg_ch->rs_handle,
> >>>> 						arg_ch->rs_offset,
> >>>> 						write_len);
> >>>> +
> >>>> +		/* Do not send XDR pad bytes */
> >>>> +		if (chunk_no && write_len < 4) {
> >>>> +			chunk_no++;
> >>>> +			break;
> >>> 
> >>> I'm pretty lost in this code.  Why does (chunk_no && write_len < 4) mean
> >>> this is xdr padding?
> >> 
> >> Chunk zero is always data. Padding is always going to be
> >> after the first chunk. Any chunk after chunk zero that is
> >> shorter than XDR quad alignment is going to be a pad.
> > 
> > I don't really know what a chunk is....  Looking at the code:
> > 
> > 	write_len = min(xfer_len, be32_to_cpu(arg_ch->rs_length));
> > 
> > so I guess the assumption is just that those rs_length's are always a
> > multiple of four?
> 
> The example you recently gave was a two-byte NFS READ
> that crosses a page boundary.
> 
> In that case, the NFSD would pass down an xdr_buf that
> has one byte in a page, one byte in another page, and
> a two-byte XDR pad. The logic introduced by this
> optimization would be fooled, and neither the second
> byte nor the XDR pad would be written to the client.
> 
> Unless you can think of a way to recognize an XDR pad
> in the xdr_buf 100% of the time, you should drop this
> patch.

It might be best to make this explicit and mark bytes as padding somehow
while encoding.

> As far as I know, none of the other patches in this
> series depend on this optimization, so please merge
> them if you can.

OK, I'll take a look at the rest, thanks.

--b.

> >> Probably too clever. Is there a better way to detect
> >> the XDR pad?
> >> 
> >> 
> >>>> +		}
> >>>> +
> >>>> 		chunk_off = 0;
> >>>> 		while (write_len) {
> >>>> 			ret = send_write(xprt, rqstp,
> >> 
> >> --
> >> Chuck Lever
> >> 
> >> 
> >> 
> 
> --
> Chuck Lever
> 
> 
> 


Thread overview: 20+ messages
2015-12-14 21:30 [PATCH v4 00/11] NFS/RDMA server patches for v4.5 Chuck Lever
2015-12-14 21:30 ` [PATCH v4 01/11] svcrdma: Do not send XDR roundup bytes for a write chunk Chuck Lever
2015-12-21 21:07   ` J. Bruce Fields
2015-12-21 21:15     ` Chuck Lever
2015-12-21 21:29       ` J. Bruce Fields
2015-12-21 22:11         ` Chuck Lever
2015-12-23 19:59           ` J. Bruce Fields [this message]
2015-12-14 21:30 ` [PATCH v4 02/11] svcrdma: Clean up rdma_create_xprt() Chuck Lever
2015-12-14 21:30 ` [PATCH v4 03/11] svcrdma: Clean up process_context() Chuck Lever
2015-12-14 21:30 ` [PATCH v4 04/11] svcrdma: Improve allocation of struct svc_rdma_op_ctxt Chuck Lever
2015-12-14 21:30 ` [PATCH v4 05/11] svcrdma: Improve allocation of struct svc_rdma_req_map Chuck Lever
2015-12-14 21:30 ` [PATCH v4 06/11] svcrdma: Remove unused req_map and ctxt kmem_caches Chuck Lever
2015-12-14 21:30 ` [PATCH v4 07/11] svcrdma: Add gfp flags to svc_rdma_post_recv() Chuck Lever
2015-12-14 21:31 ` [PATCH v4 08/11] svcrdma: Remove last two __GFP_NOFAIL call sites Chuck Lever
2015-12-14 21:31 ` [PATCH v4 09/11] svcrdma: Make map_xdr non-static Chuck Lever
2015-12-14 21:31 ` [PATCH v4 10/11] svcrdma: Define maximum number of backchannel requests Chuck Lever
2015-12-14 21:31 ` [PATCH v4 11/11] svcrdma: Add class for RDMA backwards direction transport Chuck Lever
2015-12-16 12:10 ` [PATCH v4 00/11] NFS/RDMA server patches for v4.5 Devesh Sharma
2015-12-23 21:00   ` J. Bruce Fields
2015-12-24  9:57     ` Chuck Lever
