public inbox for linux-rdma@vger.kernel.org
 help / color / mirror / Atom feed
From: Trond Myklebust <trondmy-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
To: "chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org"
	<chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
Cc: "anna.schumaker-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org"
	<anna.schumaker-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org>,
	"linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	"linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: [PATCH v3 12/12] sunrpc: Allow keepalive ping on a credit-full transport
Date: Thu, 9 Feb 2017 20:13:58 +0000	[thread overview]
Message-ID: <1486671236.5570.4.camel@primarydata.com> (raw)
In-Reply-To: <4E4245D4-8F9C-4CF3-8B2D-E4528B9E791F-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>

On Thu, 2017-02-09 at 14:42 -0500, Chuck Lever wrote:
> > On Feb 9, 2017, at 10:37 AM, Chuck Lever <chuck.lever@oracle.com>
> > wrote:
> > 
> > > 
> > > On Feb 8, 2017, at 7:48 PM, Trond Myklebust <trondmy@primarydata.
> > > com> wrote:
> > > 
> > > On Wed, 2017-02-08 at 19:19 -0500, Chuck Lever wrote:
> > > > > On Feb 8, 2017, at 7:05 PM, Trond Myklebust <trondmy@primaryd
> > > > > ata.co
> > > > > m> wrote:
> > > > > 
> > > > > On Wed, 2017-02-08 at 17:01 -0500, Chuck Lever wrote:
> > > > > > Allow RPC-over-RDMA to send NULL pings even when the
> > > > > > transport
> > > > > > has
> > > > > > hit its credit limit. One RPC-over-RDMA credit is reserved
> > > > > > for
> > > > > > operations like keepalive.
> > > > > > 
> > > > > > For transports that convey NFSv4, it seems like lease
> > > > > > renewal
> > > > > > would
> > > > > > also be a candidate for using a priority transport slot.
> > > > > > I'd like
> > > > > > to
> > > > > > see a mechanism better than RPCRDMA_PRIORITY that can
> > > > > > ensure only
> > > > > > one priority operation is in use at a time.
> > > > > > 
> > > > > > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> > > > > > ---
> > > > > > include/linux/sunrpc/sched.h    |    2 ++
> > > > > > net/sunrpc/xprt.c               |    4 ++++
> > > > > > net/sunrpc/xprtrdma/transport.c |    3 ++-
> > > > > > net/sunrpc/xprtrdma/verbs.c     |   13 ++++++++-----
> > > > > > 4 files changed, 16 insertions(+), 6 deletions(-)
> > > > > > 
> > > > > > diff --git a/include/linux/sunrpc/sched.h
> > > > > > b/include/linux/sunrpc/sched.h
> > > > > > index 13822e6..fcea158 100644
> > > > > > --- a/include/linux/sunrpc/sched.h
> > > > > > +++ b/include/linux/sunrpc/sched.h
> > > > > > @@ -127,6 +127,7 @@ struct rpc_task_setup {
> > > > > > #define RPC_TASK_TIMEOUT	0x1000		/*
> > > > > > fail
> > > > > > with
> > > > > > ETIMEDOUT on timeout */
> > > > > > #define RPC_TASK_NOCONNECT	0x2000		/*
> > > > > > return
> > > > > > ENOTCONN if not connected */
> > > > > > #define RPC_TASK_NO_RETRANS_TIMEOUT	0x4000		
> > > > > > /*
> > > > > > wait forever for a reply */
> > > > > > +#define RPC_TASK_NO_CONG	0x8000		/*
> > > > > > skip
> > > > > > congestion control */
> > > > > > 
> > > > > > #define RPC_TASK_SOFTPING	(RPC_TASK_SOFT |
> > > > > > RPC_TASK_SOFTCONN)
> > > > > > 
> > > > > > @@ -137,6 +138,7 @@ struct rpc_task_setup {
> > > > > > #define RPC_IS_SOFT(t)		((t)->tk_flags &
> > > > > > (RPC_TASK_SOFT|RPC_TASK_TIMEOUT))
> > > > > > #define RPC_IS_SOFTCONN(t)	((t)->tk_flags &
> > > > > > RPC_TASK_SOFTCONN)
> > > > > > #define RPC_WAS_SENT(t)		((t)->tk_flags &
> > > > > > RPC_TASK_SENT)
> > > > > > +#define RPC_SKIP_CONG(t)	((t)->tk_flags &
> > > > > > RPC_TASK_NO_CONG)
> > > > > > 
> > > > > > #define RPC_TASK_RUNNING	0
> > > > > > #define RPC_TASK_QUEUED		1
> > > > > > diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
> > > > > > index b530a28..a477ee6 100644
> > > > > > --- a/net/sunrpc/xprt.c
> > > > > > +++ b/net/sunrpc/xprt.c
> > > > > > @@ -392,6 +392,10 @@ static inline void
> > > > > > xprt_release_write(struct
> > > > > > rpc_xprt *xprt, struct rpc_task *ta
> > > > > > {
> > > > > > 	struct rpc_rqst *req = task->tk_rqstp;
> > > > > > 
> > > > > > +	if (RPC_SKIP_CONG(task)) {
> > > > > > +		req->rq_cong = 0;
> > > > > > +		return 1;
> > > > > > +	}
> > > > > 
> > > > > Why not just have the RDMA layer call xprt_reserve_xprt()
> > > > > (and
> > > > > xprt_release_xprt()) if this flag is set? It seems to me that
> > > > > you
> > > > > will
> > > > > need some kind of extra congestion control in the RDMA layer
> > > > > anyway
> > > > > since you only have one reserved credit for these privileged
> > > > > tasks
> > > > > (or
> > > > > did I miss where that is being gated?).
> > > > 
> > > > Thanks for the review.
> > > > 
> > > > See RPCRDMA_IA_RSVD_CREDIT in 11/12. It's a hack I'm not
> > > > terribly happy with.
> > > > 
> > > > So, I think you are suggesting replacing xprtrdma's
> > > > ->reserve_xprt with something like:
> > > > 
> > > > int xprt_rdma_reserve_xprt(xprt, task)
> > > > {
> > > >      if (RPC_SKIP_CONG(task))
> > > >           return xprt_reserve_xprt(xprt, task);
> > > >      return xprt_reserve_xprt_cong(xprt, task);
> > > > }
> > > > 
> > > > and likewise for ->release_xprt ?
> > > 
> > > Right.
> 
> This seems to work fine for the normal cases.
> 
> I'm confused about how to construct xprt_rdma_release_xprt()
> so it never releases a normal RPC task when a SKIP_CONG
> task completes and the credit limit is still full.
> 
> If it should send a normal task using the reserved credit
> and that task hangs too, we're in exactly the position
> we wanted to avoid.
> 
> My original solution might have had a similar problem,
> come to think of it.
> 
> 

That's true... You may need to set up a separate waitqueue that is
reserved for SKIP_CONG tasks. Again, it makes sense to keep that in the
RDMA code.

-- 


	
	


Trond Myklebust
Principal System Architect
4300 El Camino Real | Suite 100
Los Altos, CA  94022
W: 650-422-3800
C: 801-921-4583 
www.primarydata.com




  parent reply	other threads:[~2017-02-09 20:13 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-02-08 21:59 [PATCH v3 00/12] NFS/RDMA client-side patches for 4.11 Chuck Lever
     [not found] ` <20170208214854.7152.83331.stgit-FYjufvaPoItvLzlybtyyYzGyq/o6K9yX@public.gmane.org>
2017-02-08 21:59   ` [PATCH v3 01/12] xprtrdma: Fix Read chunk padding Chuck Lever
2017-02-08 21:59   ` [PATCH v3 02/12] xprtrdma: Per-connection pad optimization Chuck Lever
2017-02-08 22:00   ` [PATCH v3 03/12] xprtrdma: Disable pad optimization by default Chuck Lever
2017-02-08 22:00   ` [PATCH v3 04/12] xprtrdma: Reduce required number of send SGEs Chuck Lever
2017-02-08 22:00   ` [PATCH v3 05/12] xprtrdma: Shrink send SGEs array Chuck Lever
2017-02-08 22:00   ` [PATCH v3 06/12] xprtrdma: Properly recover FRWRs with in-flight FASTREG WRs Chuck Lever
2017-02-08 22:00   ` [PATCH v3 07/12] xprtrdma: Handle stale connection rejection Chuck Lever
2017-02-08 22:00   ` [PATCH v3 08/12] xprtrdma: Refactor management of mw_list field Chuck Lever
2017-02-08 22:00   ` [PATCH v3 09/12] sunrpc: Allow xprt->ops->timer method to sleep Chuck Lever
     [not found]     ` <20170208220051.7152.67740.stgit-FYjufvaPoItvLzlybtyyYzGyq/o6K9yX@public.gmane.org>
2017-02-08 23:48       ` Trond Myklebust
2017-02-08 22:00   ` [PATCH v3 10/12] sunrpc: Enable calls to rpc_call_null_helper() from other modules Chuck Lever
2017-02-08 22:01   ` [PATCH v3 11/12] xprtrdma: Detect unreachable NFS/RDMA servers more reliably Chuck Lever
2017-02-08 22:01   ` [PATCH v3 12/12] sunrpc: Allow keepalive ping on a credit-full transport Chuck Lever
     [not found]     ` <20170208220116.7152.87626.stgit-FYjufvaPoItvLzlybtyyYzGyq/o6K9yX@public.gmane.org>
2017-02-09  0:05       ` Trond Myklebust
     [not found]         ` <1486598713.11028.3.camel-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
2017-02-09  0:19           ` Chuck Lever
     [not found]             ` <9D6B8B44-9C23-427C-9E06-7C92302EB04D-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2017-02-09  0:48               ` Trond Myklebust
     [not found]                 ` <1486601331.11028.5.camel-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
2017-02-09 15:37                   ` Chuck Lever
     [not found]                     ` <2AFD96A3-8D49-4E2E-B1F1-9F5C46D0C9C8-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2017-02-09 19:42                       ` Chuck Lever
     [not found]                         ` <4E4245D4-8F9C-4CF3-8B2D-E4528B9E791F-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2017-02-09 20:13                           ` Trond Myklebust [this message]
     [not found]                             ` <1486671236.5570.4.camel-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
2017-02-09 20:39                               ` Chuck Lever

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1486671236.5570.4.camel@primarydata.com \
    --to=trondmy-7i+n7zu2hftekmmhf/gkza@public.gmane.org \
    --cc=anna.schumaker-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org \
    --cc=chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org \
    --cc=linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox