From: Trond Myklebust <trondmy-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
To: "chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org"
<chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
Cc: "anna.schumaker-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org"
<anna.schumaker-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org>,
"linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
"linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
<linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: [PATCH v3 12/12] sunrpc: Allow keepalive ping on a credit-full transport
Date: Thu, 9 Feb 2017 20:13:58 +0000
Message-ID: <1486671236.5570.4.camel@primarydata.com>
In-Reply-To: <4E4245D4-8F9C-4CF3-8B2D-E4528B9E791F-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
On Thu, 2017-02-09 at 14:42 -0500, Chuck Lever wrote:
> > On Feb 9, 2017, at 10:37 AM, Chuck Lever <chuck.lever@oracle.com> wrote:
> >
> > >
> > > > On Feb 8, 2017, at 7:48 PM, Trond Myklebust <trondmy@primarydata.com> wrote:
> > >
> > > On Wed, 2017-02-08 at 19:19 -0500, Chuck Lever wrote:
> > > > > > On Feb 8, 2017, at 7:05 PM, Trond Myklebust <trondmy@primarydata.com> wrote:
> > > > >
> > > > > On Wed, 2017-02-08 at 17:01 -0500, Chuck Lever wrote:
> > > > > > > Allow RPC-over-RDMA to send NULL pings even when the transport has
> > > > > > > hit its credit limit. One RPC-over-RDMA credit is reserved for
> > > > > > > operations like keepalive.
> > > > > > >
> > > > > > > For transports that convey NFSv4, it seems like lease renewal would
> > > > > > > also be a candidate for using a priority transport slot. I'd like to
> > > > > > > see a mechanism better than RPCRDMA_PRIORITY that can ensure only
> > > > > > > one priority operation is in use at a time.
> > > > > >
> > > > > > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> > > > > > ---
> > > > > > include/linux/sunrpc/sched.h | 2 ++
> > > > > > net/sunrpc/xprt.c | 4 ++++
> > > > > > net/sunrpc/xprtrdma/transport.c | 3 ++-
> > > > > > net/sunrpc/xprtrdma/verbs.c | 13 ++++++++-----
> > > > > > 4 files changed, 16 insertions(+), 6 deletions(-)
> > > > > >
> > > > > > > diff --git a/include/linux/sunrpc/sched.h b/include/linux/sunrpc/sched.h
> > > > > > > index 13822e6..fcea158 100644
> > > > > > > --- a/include/linux/sunrpc/sched.h
> > > > > > > +++ b/include/linux/sunrpc/sched.h
> > > > > > > @@ -127,6 +127,7 @@ struct rpc_task_setup {
> > > > > > > #define RPC_TASK_TIMEOUT 0x1000 /* fail with ETIMEDOUT on timeout */
> > > > > > > #define RPC_TASK_NOCONNECT 0x2000 /* return ENOTCONN if not connected */
> > > > > > > #define RPC_TASK_NO_RETRANS_TIMEOUT 0x4000 /* wait forever for a reply */
> > > > > > > +#define RPC_TASK_NO_CONG 0x8000 /* skip congestion control */
> > > > > > >
> > > > > > > #define RPC_TASK_SOFTPING (RPC_TASK_SOFT | RPC_TASK_SOFTCONN)
> > > > > > >
> > > > > > > @@ -137,6 +138,7 @@ struct rpc_task_setup {
> > > > > > > #define RPC_IS_SOFT(t) ((t)->tk_flags & (RPC_TASK_SOFT|RPC_TASK_TIMEOUT))
> > > > > > > #define RPC_IS_SOFTCONN(t) ((t)->tk_flags & RPC_TASK_SOFTCONN)
> > > > > > > #define RPC_WAS_SENT(t) ((t)->tk_flags & RPC_TASK_SENT)
> > > > > > > +#define RPC_SKIP_CONG(t) ((t)->tk_flags & RPC_TASK_NO_CONG)
> > > > > >
> > > > > > #define RPC_TASK_RUNNING 0
> > > > > > #define RPC_TASK_QUEUED 1
> > > > > > diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
> > > > > > index b530a28..a477ee6 100644
> > > > > > --- a/net/sunrpc/xprt.c
> > > > > > +++ b/net/sunrpc/xprt.c
> > > > > > @@ -392,6 +392,10 @@ static inline void
> > > > > > xprt_release_write(struct
> > > > > > rpc_xprt *xprt, struct rpc_task *ta
> > > > > > {
> > > > > > struct rpc_rqst *req = task->tk_rqstp;
> > > > > >
> > > > > > + if (RPC_SKIP_CONG(task)) {
> > > > > > + req->rq_cong = 0;
> > > > > > + return 1;
> > > > > > + }
> > > > >
> > > > > > Why not just have the RDMA layer call xprt_reserve_xprt() (and
> > > > > > xprt_release_xprt()) if this flag is set? It seems to me that you will
> > > > > > need some kind of extra congestion control in the RDMA layer anyway
> > > > > > since you only have one reserved credit for these privileged tasks (or
> > > > > > did I miss where that is being gated?).
> > > >
> > > > Thanks for the review.
> > > >
> > > > See RPCRDMA_IA_RSVD_CREDIT in 11/12. It's a hack I'm not
> > > > terribly happy with.
> > > >
> > > > So, I think you are suggesting replacing xprtrdma's
> > > > ->reserve_xprt with something like:
> > > >
> > > > > int xprt_rdma_reserve_xprt(xprt, task)
> > > > > {
> > > > > 	if (RPC_SKIP_CONG(task))
> > > > > 		return xprt_reserve_xprt(xprt, task);
> > > > > 	return xprt_reserve_xprt_cong(xprt, task);
> > > > > }
> > > >
> > > > and likewise for ->release_xprt ?
> > >
> > > Right.
>
> This seems to work fine for the normal cases.
>
> I'm confused about how to construct xprt_rdma_release_xprt()
> so it never releases a normal RPC task when a SKIP_CONG
> task completes and the credit limit is still full.
>
> If it should send a normal task using the reserved credit
> and that task hangs too, we're in exactly the position
> we wanted to avoid.
>
> My original solution might have had a similar problem,
> come to think of it.
>
>
That's true... You may need to set up a separate waitqueue that is
reserved for SKIP_CONG tasks. Again, it makes sense to keep that in the
RDMA code.
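
To make the separate-waitqueue idea concrete, here is a minimal userspace
sketch, not kernel code. It assumes a single reserved credit and models the
two queues as plain FIFOs. Every name in it (struct xprt_model,
model_reserve(), model_release(), and so on) is hypothetical and exists only
for illustration; the real xprtrdma code would use rpc_wait_queue and the
sunrpc locking rules instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an RPC task; skip_cong stands in for RPC_TASK_NO_CONG. */
struct task {
	bool skip_cong;
	struct task *next;
};

/* Simple singly linked FIFO standing in for a wait queue. */
struct fifo {
	struct task *head, *tail;
};

void fifo_push(struct fifo *q, struct task *t)
{
	t->next = NULL;
	if (q->tail)
		q->tail->next = t;
	else
		q->head = t;
	q->tail = t;
}

struct task *fifo_pop(struct fifo *q)
{
	struct task *t = q->head;

	if (t) {
		q->head = t->next;
		if (!q->head)
			q->tail = NULL;
	}
	return t;
}

/* Hypothetical transport state: normal credits plus one reserved credit. */
struct xprt_model {
	unsigned int credits;     /* limit for normal tasks */
	unsigned int in_flight;   /* normal tasks currently holding a credit */
	bool rsvd_busy;           /* the single reserved credit is in use */
	struct fifo normal_wait;  /* normal tasks waiting for a credit */
	struct fifo priv_wait;    /* SKIP_CONG tasks waiting for the reserved credit */
};

/* Returns true if the task may send now, false if it was queued. */
bool model_reserve(struct xprt_model *x, struct task *t)
{
	if (t->skip_cong) {
		if (!x->rsvd_busy) {
			x->rsvd_busy = true;
			return true;
		}
		fifo_push(&x->priv_wait, t);
		return false;
	}
	if (x->in_flight < x->credits) {
		x->in_flight++;
		return true;
	}
	fifo_push(&x->normal_wait, t);
	return false;
}

/*
 * Releases the credit held by t and returns the task that inherits it,
 * or NULL. A SKIP_CONG release only ever wakes another SKIP_CONG task,
 * so the reserved credit never leaks to a normal task.
 */
struct task *model_release(struct xprt_model *x, struct task *t)
{
	struct task *next;

	if (t->skip_cong) {
		assert(x->rsvd_busy);
		next = fifo_pop(&x->priv_wait);
		if (!next)
			x->rsvd_busy = false;
		return next;
	}
	assert(x->in_flight > 0);
	next = fifo_pop(&x->normal_wait);
	if (!next)
		x->in_flight--;
	return next;
}
```

The point of the sketch is the release path: because the two classes of
task wait on different queues, completing a SKIP_CONG task cannot hand the
reserved credit to a normal task while the credit limit is still full.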
--
Trond Myklebust
Principal System Architect
4300 El Camino Real | Suite 100
Los Altos, CA 94022
W: 650-422-3800
C: 801-921-4583
www.primarydata.com