All of lore.kernel.org
 help / color / mirror / Atom feed
From: "J. Bruce Fields" <bfields@fieldses.org>
To: Chuck Lever <chuck.lever@oracle.com>
Cc: linux-rdma@vger.kernel.org, linux-nfs@vger.kernel.org
Subject: Re: [PATCH v3 09/14] svcrdma: Report Write/Reply chunk overruns
Date: Fri, 14 Apr 2017 11:56:34 -0400	[thread overview]
Message-ID: <20170414155634.GC5362@fieldses.org> (raw)
In-Reply-To: <20170409170641.15073.82788.stgit@klimt.1015granger.net>

On Sun, Apr 09, 2017 at 01:06:41PM -0400, Chuck Lever wrote:
> Observed at Connectathon 2017.
> 
> If a client has underestimated the size of a Write or Reply chunk,
> the Linux server writes as much payload data as it can, then it
> recognizes there was a problem and closes the connection without
> sending the transport header.

Why would the client underestimate?  Is this a client-side bug?

--b.

> 
> This creates a couple of problems:
> 
> <> The client never receives indication of the server-side failure,
>    so it continues to retransmit the bad RPC. Forward progress on
>    the transport is blocked.
> 
> <> The reply payload pages are not moved out of the svc_rqst, thus
>    they can be released by the RPC server before the RDMA Writes
>    have completed.
> 
> The new rdma_rw-ized helpers return a distinct error code when a
> Write/Reply chunk overrun occurs, so it's now easy for the caller
> (svc_rdma_sendto) to recognize this case.
> 
> Instead of dropping the connection, post an RDMA_ERROR message. The
> client now sees an RDMA_ERROR and can properly terminate the RPC
> transaction.
> 
> As part of the new logic, set up the same delayed release for these
> payload pages as would have occurred in the normal case.
> 
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
> ---
>  net/sunrpc/xprtrdma/svc_rdma_sendto.c |   58 ++++++++++++++++++++++++++++++++-
>  1 file changed, 56 insertions(+), 2 deletions(-)
> 
> diff --git a/net/sunrpc/xprtrdma/svc_rdma_sendto.c b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
> index 0b646e8..e514f68 100644
> --- a/net/sunrpc/xprtrdma/svc_rdma_sendto.c
> +++ b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
> @@ -621,6 +621,48 @@ static int svc_rdma_send_reply_msg(struct svcxprt_rdma *rdma,
>  	return ret;
>  }
>  
> +/* Given the client-provided Write and Reply chunks, the server was not
> + * able to form a complete reply. Return an RDMA_ERROR message so the
> + * client can retire this RPC transaction. As above, the Send completion
> + * routine releases payload pages that were part of a previous RDMA Write.
> + *
> + * Remote Invalidation is skipped for simplicity.
> + */
> +static int svc_rdma_send_error_msg(struct svcxprt_rdma *rdma,
> +				   __be32 *rdma_resp, struct svc_rqst *rqstp)
> +{
> +	struct svc_rdma_op_ctxt *ctxt;
> +	__be32 *p;
> +	int ret;
> +
> +	ctxt = svc_rdma_get_context(rdma);
> +
> +	/* Replace the original transport header with an
> +	 * RDMA_ERROR response. XID etc are preserved.
> +	 */
> +	p = rdma_resp + 3;
> +	*p++ = rdma_error;
> +	*p   = err_chunk;
> +
> +	ret = svc_rdma_map_reply_hdr(rdma, ctxt, rdma_resp, 20);
> +	if (ret < 0)
> +		goto err;
> +
> +	svc_rdma_save_io_pages(rqstp, ctxt);
> +
> +	ret = svc_rdma_post_send_wr(rdma, ctxt, 1 + ret, 0);
> +	if (ret)
> +		goto err;
> +
> +	return 0;
> +
> +err:
> +	pr_err("svcrdma: failed to post Send WR (%d)\n", ret);
> +	svc_rdma_unmap_dma(ctxt);
> +	svc_rdma_put_context(ctxt, 1);
> +	return ret;
> +}
> +
>  void svc_rdma_prep_reply_hdr(struct svc_rqst *rqstp)
>  {
>  }
> @@ -683,13 +725,13 @@ int svc_rdma_sendto(struct svc_rqst *rqstp)
>  		/* XXX: Presume the client sent only one Write chunk */
>  		ret = svc_rdma_send_write_chunk(rdma, wr_lst, xdr);
>  		if (ret < 0)
> -			goto err1;
> +			goto err2;
>  		svc_rdma_xdr_encode_write_list(rdma_resp, wr_lst, ret);
>  	}
>  	if (rp_ch) {
>  		ret = svc_rdma_send_reply_chunk(rdma, rp_ch, wr_lst, xdr);
>  		if (ret < 0)
> -			goto err1;
> +			goto err2;
>  		svc_rdma_xdr_encode_reply_chunk(rdma_resp, rp_ch, ret);
>  	}
>  
> @@ -702,6 +744,18 @@ int svc_rdma_sendto(struct svc_rqst *rqstp)
>  		goto err0;
>  	return 0;
>  
> + err2:
> +	if (ret != -E2BIG)
> +		goto err1;
> +
> +	ret = svc_rdma_post_recv(rdma, GFP_KERNEL);
> +	if (ret)
> +		goto err1;
> +	ret = svc_rdma_send_error_msg(rdma, rdma_resp, rqstp);
> +	if (ret < 0)
> +		goto err0;
> +	return 0;
> +
>   err1:
>  	put_page(res_page);
>   err0:

WARNING: multiple messages have this Message-ID (diff)
From: "J. Bruce Fields" <bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
To: Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [PATCH v3 09/14] svcrdma: Report Write/Reply chunk overruns
Date: Fri, 14 Apr 2017 11:56:34 -0400	[thread overview]
Message-ID: <20170414155634.GC5362@fieldses.org> (raw)
In-Reply-To: <20170409170641.15073.82788.stgit-Hs+gFlyCn65vLzlybtyyYzGyq/o6K9yX@public.gmane.org>

On Sun, Apr 09, 2017 at 01:06:41PM -0400, Chuck Lever wrote:
> Observed at Connectathon 2017.
> 
> If a client has underestimated the size of a Write or Reply chunk,
> the Linux server writes as much payload data as it can, then it
> recognizes there was a problem and closes the connection without
> sending the transport header.

Why would the client underestimate?  Is this a client-side bug?

--b.

> 
> This creates a couple of problems:
> 
> <> The client never receives indication of the server-side failure,
>    so it continues to retransmit the bad RPC. Forward progress on
>    the transport is blocked.
> 
> <> The reply payload pages are not moved out of the svc_rqst, thus
>    they can be released by the RPC server before the RDMA Writes
>    have completed.
> 
> The new rdma_rw-ized helpers return a distinct error code when a
> Write/Reply chunk overrun occurs, so it's now easy for the caller
> (svc_rdma_sendto) to recognize this case.
> 
> Instead of dropping the connection, post an RDMA_ERROR message. The
> client now sees an RDMA_ERROR and can properly terminate the RPC
> transaction.
> 
> As part of the new logic, set up the same delayed release for these
> payload pages as would have occurred in the normal case.
> 
> Signed-off-by: Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
> Reviewed-by: Sagi Grimberg <sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
> ---
>  net/sunrpc/xprtrdma/svc_rdma_sendto.c |   58 ++++++++++++++++++++++++++++++++-
>  1 file changed, 56 insertions(+), 2 deletions(-)
> 
> diff --git a/net/sunrpc/xprtrdma/svc_rdma_sendto.c b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
> index 0b646e8..e514f68 100644
> --- a/net/sunrpc/xprtrdma/svc_rdma_sendto.c
> +++ b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
> @@ -621,6 +621,48 @@ static int svc_rdma_send_reply_msg(struct svcxprt_rdma *rdma,
>  	return ret;
>  }
>  
> +/* Given the client-provided Write and Reply chunks, the server was not
> + * able to form a complete reply. Return an RDMA_ERROR message so the
> + * client can retire this RPC transaction. As above, the Send completion
> + * routine releases payload pages that were part of a previous RDMA Write.
> + *
> + * Remote Invalidation is skipped for simplicity.
> + */
> +static int svc_rdma_send_error_msg(struct svcxprt_rdma *rdma,
> +				   __be32 *rdma_resp, struct svc_rqst *rqstp)
> +{
> +	struct svc_rdma_op_ctxt *ctxt;
> +	__be32 *p;
> +	int ret;
> +
> +	ctxt = svc_rdma_get_context(rdma);
> +
> +	/* Replace the original transport header with an
> +	 * RDMA_ERROR response. XID etc are preserved.
> +	 */
> +	p = rdma_resp + 3;
> +	*p++ = rdma_error;
> +	*p   = err_chunk;
> +
> +	ret = svc_rdma_map_reply_hdr(rdma, ctxt, rdma_resp, 20);
> +	if (ret < 0)
> +		goto err;
> +
> +	svc_rdma_save_io_pages(rqstp, ctxt);
> +
> +	ret = svc_rdma_post_send_wr(rdma, ctxt, 1 + ret, 0);
> +	if (ret)
> +		goto err;
> +
> +	return 0;
> +
> +err:
> +	pr_err("svcrdma: failed to post Send WR (%d)\n", ret);
> +	svc_rdma_unmap_dma(ctxt);
> +	svc_rdma_put_context(ctxt, 1);
> +	return ret;
> +}
> +
>  void svc_rdma_prep_reply_hdr(struct svc_rqst *rqstp)
>  {
>  }
> @@ -683,13 +725,13 @@ int svc_rdma_sendto(struct svc_rqst *rqstp)
>  		/* XXX: Presume the client sent only one Write chunk */
>  		ret = svc_rdma_send_write_chunk(rdma, wr_lst, xdr);
>  		if (ret < 0)
> -			goto err1;
> +			goto err2;
>  		svc_rdma_xdr_encode_write_list(rdma_resp, wr_lst, ret);
>  	}
>  	if (rp_ch) {
>  		ret = svc_rdma_send_reply_chunk(rdma, rp_ch, wr_lst, xdr);
>  		if (ret < 0)
> -			goto err1;
> +			goto err2;
>  		svc_rdma_xdr_encode_reply_chunk(rdma_resp, rp_ch, ret);
>  	}
>  
> @@ -702,6 +744,18 @@ int svc_rdma_sendto(struct svc_rqst *rqstp)
>  		goto err0;
>  	return 0;
>  
> + err2:
> +	if (ret != -E2BIG)
> +		goto err1;
> +
> +	ret = svc_rdma_post_recv(rdma, GFP_KERNEL);
> +	if (ret)
> +		goto err1;
> +	ret = svc_rdma_send_error_msg(rdma, rdma_resp, rqstp);
> +	if (ret < 0)
> +		goto err0;
> +	return 0;
> +
>   err1:
>  	put_page(res_page);
>   err0:
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

  reply	other threads:[~2017-04-14 15:56 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-09 17:05 [PATCH v3 00/14] Server-side NFS/RDMA changes proposed for v4.12 Chuck Lever
2017-04-09 17:05 ` Chuck Lever
2017-04-09 17:05 ` [PATCH v3 01/14] svcrdma: Move send_wr to svc_rdma_op_ctxt Chuck Lever
2017-04-09 17:05   ` Chuck Lever
2017-04-09 17:05 ` [PATCH v3 02/14] svcrdma: Add svc_rdma_map_reply_hdr() Chuck Lever
2017-04-09 17:05   ` Chuck Lever
2017-04-09 17:05 ` [PATCH v3 03/14] svcrdma: Eliminate RPCRDMA_SQ_DEPTH_MULT Chuck Lever
2017-04-09 17:05   ` Chuck Lever
2017-04-09 17:06 ` [PATCH v3 04/14] svcrdma: Add helper to save pages under I/O Chuck Lever
2017-04-09 17:06   ` Chuck Lever
2017-04-09 17:06 ` [PATCH v3 05/14] svcrdma: Clean up svc_rdma_get_inv_rkey() Chuck Lever
2017-04-09 17:06   ` Chuck Lever
2017-04-09 17:06 ` [PATCH v3 06/14] svcrdma: Introduce local rdma_rw API helpers Chuck Lever
2017-04-09 17:06   ` Chuck Lever
2017-04-09 17:06 ` [PATCH v3 07/14] svcrdma: Use rdma_rw API in RPC reply path Chuck Lever
2017-04-09 17:06   ` Chuck Lever
2017-04-09 17:06 ` [PATCH v3 08/14] svcrdma: Clean up RDMA_ERROR path Chuck Lever
2017-04-09 17:06   ` Chuck Lever
2017-04-09 17:06 ` [PATCH v3 09/14] svcrdma: Report Write/Reply chunk overruns Chuck Lever
2017-04-09 17:06   ` Chuck Lever
2017-04-14 15:56   ` J. Bruce Fields [this message]
2017-04-14 15:56     ` J. Bruce Fields
2017-04-14 16:10     ` Chuck Lever
2017-04-14 16:10       ` Chuck Lever
2017-04-14 17:52       ` J. Bruce Fields
2017-04-14 17:52         ` J. Bruce Fields
2017-04-14 19:07         ` Chuck Lever
2017-04-14 19:07           ` Chuck Lever
2017-04-14 19:33           ` J. Bruce Fields
2017-04-14 19:33             ` J. Bruce Fields
2017-04-09 17:06 ` [PATCH v3 10/14] svcrdma: Clean up RPC-over-RDMA backchannel reply processing Chuck Lever
2017-04-09 17:06   ` Chuck Lever
2017-04-09 17:06 ` [PATCH v3 11/14] svcrdma: Reduce size of sge array in struct svc_rdma_op_ctxt Chuck Lever
2017-04-09 17:06   ` Chuck Lever
2017-04-09 17:07 ` [PATCH v3 12/14] svcrdma: Remove unused RDMA Write completion handler Chuck Lever
2017-04-09 17:07   ` Chuck Lever
2017-04-09 17:07 ` [PATCH v3 13/14] svcrdma: Remove the req_map cache Chuck Lever
2017-04-09 17:07   ` Chuck Lever
2017-04-09 17:07 ` [PATCH v3 14/14] svcrdma: Clean out old XDR encoders Chuck Lever
2017-04-09 17:07   ` Chuck Lever
2017-04-14 17:54 ` [PATCH v3 00/14] Server-side NFS/RDMA changes proposed for v4.12 J. Bruce Fields
2017-04-14 17:54   ` J. Bruce Fields

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170414155634.GC5362@fieldses.org \
    --to=bfields@fieldses.org \
    --cc=chuck.lever@oracle.com \
    --cc=linux-nfs@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.