From: Trond Myklebust <trondmy@hammerspace.com>
To: "chuck.lever@oracle.com" <chuck.lever@oracle.com>
Cc: "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH RFC] SUNRPC: Backchannel RPCs don't fail when the transport disconnects
Date: Fri, 3 Apr 2020 22:43:35 +0000 [thread overview]
Message-ID: <063db847f7f2129504463919978dede3d328d0b6.camel@hammerspace.com> (raw)
In-Reply-To: <B385D770-D511-4E72-B0D7-90ED66892C2D@oracle.com>
On Fri, 2020-04-03 at 17:46 -0400, Chuck Lever wrote:
> > On Apr 3, 2020, at 4:05 PM, Chuck Lever <chuck.lever@oracle.com>
> > wrote:
> >
> > Hi Trond, thanks for the look!
> >
> > > On Apr 3, 2020, at 4:00 PM, Trond Myklebust <
> > > trondmy@hammerspace.com> wrote:
> > >
> > > On Fri, 2020-04-03 at 15:42 -0400, Chuck Lever wrote:
> > > > Commit 3832591e6fa5 ("SUNRPC: Handle connection issues
> > > > correctly on
> > > > the back channel") intended to make backchannel RPCs fail
> > > > immediately when there is no forward channel connection. What's
> > > > currently happening is, when the forward channel conneciton
> > > > goes
> > > > away, backchannel operations are causing hard loops because
> > > > call_transmit_status's SOFTCONN logic ignores ENOTCONN.
> > >
> > > Won't RPC_TASK_NOCONNECT do the right thing? It should cause the
> > > request to exit with an ENOTCONN error when it hits
> > > call_connect().
> >
> > OK, so does that mean SOFTCONN is entirely the wrong semantic here?
> >
> > Was RPC_TASK_NOCONNECT available when 3832591e6fa5 was merged?
>
> It turns out 3832591e6fa5 is not related. It's 58255a4e3ce5 that
> added
> RPC_TASK_SOFTCONN on NFSv4 callback Calls.
>
> However, the server uses nfsd4_run_cb_work() for both NFSv4.0 and
> NFSv4.1 callbacks. IMO a fix for this will have to take care that
> RPC_TASK_NOCONNECT is not set on NFSv4.0 callback tasks.
Possibly, but don't we really want to let such a NFSv4.0 request fail
and send another CB_NULL? There is already version-specific code in
nfsd4_process_cb_update() to set up the callback client.
> Unfortunately the looping behavior appears also related to the
> RPC_IS_SIGNALLED bug I reported earlier this week. The CB request is
> signalled as it is firing up, and then drops into a hard loop.
>
Hmm... Is there any reason why xprt_rdma_alloc_slot() should not be
checking if the transport is being shut down? After all, if the
transport is being destroyed here, then there is no point in allocating
a slot for that specific RDMA transport (it's not as if it can be
reused elsewhere).
> nfsd-1986 [001] 123.028240:
> svc_drop: addr=192.168.2.51:52077 xid=0x489a88f6
> flags=RQ_SECURE|RQ_USEDEFERRAL|RQ_SPLICE_OK|RQ_BUSY
> kworker/u8:12-442 [003] 123.028242:
> rpc_task_signalled: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|ACTIVE status=0 action=rpc_prepare_task
> kworker/u8:10-440 [000] 123.028289:
> rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=rpc_prepare_task
> kworker/u8:10-440 [000] 123.028289:
> rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=call_start
> kworker/u8:10-440 [000] 123.028291:
> rpc_request: task:47@3 nfs4_cbv1 CB_RECALL (async)
> kworker/u8:10-440 [000] 123.028291:
> rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=call_reserve
> kworker/u8:10-440 [000] 123.028298:
> rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=call_reserveresult
> kworker/u8:10-440 [000] 123.028299:
> rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=call_refresh
> kworker/u8:10-440 [000] 123.028324:
> rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=call_refreshresult
> kworker/u8:10-440 [000] 123.028324:
> rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=call_allocate
> kworker/u8:10-440 [000] 123.028340:
> rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=call_encode
> kworker/u8:10-440 [000] 123.028356:
> xprt_enq_xmit: task:47@3 xid=0x8b95e935 seqno=0 stage=4
> kworker/u8:10-440 [000] 123.028357:
> rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|ACTIVE|NEED_XMIT|NEED_RECV|SIGNALLED status=0
> action=call_transmit
> kworker/u8:10-440 [000] 123.028363:
> xprt_reserve_cong: task:47@3 snd_task:47 cong=0 cwnd=256
> kworker/u8:10-440 [000] 123.028365:
> xprt_transmit: task:47@3 xid=0x8b95e935 seqno=0 status=-512
> kworker/u8:10-440 [000] 123.028368:
> xprt_release_cong: task:47@3 snd_task:4294967295 cong=0 cwnd=256
> kworker/u8:10-440 [000] 123.028368:
> rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|ACTIVE|NEED_RECV|SIGNALLED status=0
> action=call_transmit_status
> kworker/u8:10-440 [000] 123.028371:
> rpc_task_sleep: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|ACTIVE|NEED_RECV|SIGNALLED status=0 timeout=0
> queue=xprt_pending
> kworker/u8:10-440 [000] 123.028376:
> rpc_task_wakeup: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|QUEUED|ACTIVE|NEED_RECV|SIGNALLED status=-107
> timeout=9000 queue=xprt_pending
> kworker/u8:10-440 [000] 123.028377:
> rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|ACTIVE|NEED_RECV|SIGNALLED status=-107
> action=xprt_timer
> kworker/u8:10-440 [000] 123.028377:
> rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|ACTIVE|NEED_RECV|SIGNALLED status=-107
> action=call_status
> kworker/u8:10-440 [000] 123.028378:
> rpc_call_status: task:47@3 status=-107
> kworker/u8:10-440 [000] 123.028378:
> rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|ACTIVE|NEED_RECV|SIGNALLED status=0
> action=call_encode
> kworker/u8:10-440 [000] 123.028380:
> xprt_enq_xmit: task:47@3 xid=0x8b95e935 seqno=0 stage=4
> kworker/u8:10-440 [000] 123.028380:
> rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|ACTIVE|NEED_XMIT|NEED_RECV|SIGNALLED status=0
> action=call_transmit
> kworker/u8:10-440 [000] 123.028381:
> xprt_reserve_cong: task:47@3 snd_task:47 cong=0 cwnd=256
> kworker/u8:10-440 [000] 123.028381:
> xprt_transmit: task:47@3 xid=0x8b95e935 seqno=0 status=-512
> kworker/u8:10-440 [000] 123.028382:
> xprt_release_cong: task:47@3 snd_task:4294967295 cong=0 cwnd=256
> kworker/u8:10-440 [000] 123.028382:
> rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
> runstate=RUNNING|ACTIVE|NEED_RECV|SIGNALLED status=0
> action=call_transmit_status
>
> If we want to avoid the early RPC_IS_SIGNALLED check in the
> scheduler,
> I guess an alternative would be to have call_transmit_status() check
> for
> RPC_TASK_SIGNALLED. That's not as broad a fix, but it would address
> both the NULL and the CB looping cases, IIUC.
>
>
> > > > To avoid changing the behavior of call_transmit_status in the
> > > > forward direction, make backchannel RPCs return with a
> > > > different
> > > > error than ENOTCONN when they fail.
> > > >
> > > > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> > > > ---
> > > > net/sunrpc/xprtrdma/svc_rdma_backchannel.c | 15 ++++++++++---
> > > > --
> > > > net/sunrpc/xprtsock.c | 6 ++++--
> > > > 2 files changed, 14 insertions(+), 7 deletions(-)
> > > >
> > > > I'm playing with this fix. It seems to be required in order to
> > > > get
> > > > Kerberos mounts to work under load with NFSv4.1 and later on
> > > > RDMA.
> > > >
> > > > If there are no objections, I can carry this for v5.7-rc ...
> > > >
> > > >
> > > > diff --git a/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
> > > > b/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
> > > > index d510a3a15d4b..b8a72d7fbcc2 100644
> > > > --- a/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
> > > > +++ b/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
> > > > @@ -207,11 +207,16 @@ rpcrdma_bc_send_request(struct
> > > > svcxprt_rdma
> > > > *rdma, struct rpc_rqst *rqst)
> > > >
> > > > drop_connection:
> > > > dprintk("svcrdma: failed to send bc call\n");
> > > > - return -ENOTCONN;
> > > > + return -EHOSTUNREACH;
> > > > }
> > > >
> > > > -/* Send an RPC call on the passive end of a transport
> > > > - * connection.
> > > > +/**
> > > > + * xprt_rdma_bc_send_request - send an RPC backchannel Call
> > > > + * @rqst: RPC Call in rq_snd_buf
> > > > + *
> > > > + * Returns:
> > > > + * %0 if the RPC message has been sent
> > > > + * %-EHOSTUNREACH if the Call could not be sent
> > > > */
> > > > static int
> > > > xprt_rdma_bc_send_request(struct rpc_rqst *rqst)
> > > > @@ -225,11 +230,11 @@ xprt_rdma_bc_send_request(struct rpc_rqst
> > > > *rqst)
> > > >
> > > > mutex_lock(&sxprt->xpt_mutex);
> > > >
> > > > - ret = -ENOTCONN;
> > > > + ret = -EHOSTUNREACH;
> > > > rdma = container_of(sxprt, struct svcxprt_rdma,
> > > > sc_xprt);
> > > > if (!test_bit(XPT_DEAD, &sxprt->xpt_flags)) {
> > > > ret = rpcrdma_bc_send_request(rdma, rqst);
> > > > - if (ret == -ENOTCONN)
> > > > + if (ret < 0)
> > > > svc_close_xprt(sxprt);
> > > > }
> > > >
> > > > diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> > > > index 17cb902e5153..92a358fd2ff0 100644
> > > > --- a/net/sunrpc/xprtsock.c
> > > > +++ b/net/sunrpc/xprtsock.c
> > > > @@ -2543,7 +2543,9 @@ static int bc_sendto(struct rpc_rqst
> > > > *req)
> > > > req->rq_xtime = ktime_get();
> > > > err = xprt_sock_sendmsg(transport->sock, &msg, xdr, 0,
> > > > marker,
> > > > &sent);
> > > > xdr_free_bvec(xdr);
> > > > - if (err < 0 || sent != (xdr->len + sizeof(marker)))
> > > > + if (err < 0)
> > > > + return -EHOSTUNREACH;
> > > > + if (sent != (xdr->len + sizeof(marker)))
> > > > return -EAGAIN;
> > > > return sent;
> > > > }
> > > > @@ -2567,7 +2569,7 @@ static int bc_send_request(struct
> > > > rpc_rqst
> > > > *req)
> > > > */
> > > > mutex_lock(&xprt->xpt_mutex);
> > > > if (test_bit(XPT_DEAD, &xprt->xpt_flags))
> > > > - len = -ENOTCONN;
> > > > + len = -EHOSTUNREACH;
> > > > else
> > > > len = bc_sendto(req);
> > > > mutex_unlock(&xprt->xpt_mutex);
> > > >
> > > --
> > > Trond Myklebust
> > > Linux NFS client maintainer, Hammerspace
> > > trond.myklebust@hammerspace.com
> >
> > --
> > Chuck Lever
>
> --
> Chuck Lever
>
>
>
--
Trond Myklebust
CTO, Hammerspace Inc
4300 El Camino Real, Suite 105
Los Altos, CA 94022
www.hammer.space
next prev parent reply other threads:[~2020-04-03 22:43 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-04-03 19:42 [PATCH RFC] SUNRPC: Backchannel RPCs don't fail when the transport disconnects Chuck Lever
2020-04-03 20:00 ` Trond Myklebust
2020-04-03 20:05 ` Chuck Lever
2020-04-03 21:46 ` Chuck Lever
2020-04-03 22:43 ` Trond Myklebust [this message]
2020-04-03 23:11 ` Chuck Lever
2020-04-06 20:10 ` Bruce Fields
2020-04-07 14:04 ` Chuck Lever
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=063db847f7f2129504463919978dede3d328d0b6.camel@hammerspace.com \
--to=trondmy@hammerspace.com \
--cc=chuck.lever@oracle.com \
--cc=linux-nfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox