From: Chuck Lever <chuck.lever@oracle.com>
To: Trond Myklebust <trondmy@hammerspace.com>
Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH RFC] SUNRPC: Backchannel RPCs don't fail when the transport disconnects
Date: Fri, 3 Apr 2020 17:46:02 -0400 [thread overview]
Message-ID: <B385D770-D511-4E72-B0D7-90ED66892C2D@oracle.com> (raw)
In-Reply-To: <E472C4D7-313C-48F2-8D4E-8D1F81357979@oracle.com>
> On Apr 3, 2020, at 4:05 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
>
> Hi Trond, thanks for the look!
>
>> On Apr 3, 2020, at 4:00 PM, Trond Myklebust <trondmy@hammerspace.com> wrote:
>>
>> On Fri, 2020-04-03 at 15:42 -0400, Chuck Lever wrote:
>>> Commit 3832591e6fa5 ("SUNRPC: Handle connection issues correctly on
>>> the back channel") intended to make backchannel RPCs fail
>>> immediately when there is no forward channel connection. What's
>>> currently happening is, when the forward channel conneciton goes
>>> away, backchannel operations are causing hard loops because
>>> call_transmit_status's SOFTCONN logic ignores ENOTCONN.
>>
>> Won't RPC_TASK_NOCONNECT do the right thing? It should cause the
>> request to exit with an ENOTCONN error when it hits call_connect().
>
> OK, so does that mean SOFTCONN is entirely the wrong semantic here?
>
> Was RPC_TASK_NOCONNECT available when 3832591e6fa5 was merged?
It turns out 3832591e6fa5 is not related. It's 58255a4e3ce5 that added
RPC_TASK_SOFTCONN on NFSv4 callback Calls.
However, the server uses nfsd4_run_cb_work() for both NFSv4.0 and
NFSv4.1 callbacks. IMO a fix for this will have to take care that
RPC_TASK_NOCONNECT is not set on NFSv4.0 callback tasks.
Unfortunately the looping behavior appears also related to the
RPC_IS_SIGNALLED bug I reported earlier this week. The CB request is
signalled as it is firing up, and then drops into a hard loop.
nfsd-1986 [001] 123.028240: svc_drop: addr=192.168.2.51:52077 xid=0x489a88f6 flags=RQ_SECURE|RQ_USEDEFERRAL|RQ_SPLICE_OK|RQ_BUSY
kworker/u8:12-442 [003] 123.028242: rpc_task_signalled: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE status=0 action=rpc_prepare_task
kworker/u8:10-440 [000] 123.028289: rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=rpc_prepare_task
kworker/u8:10-440 [000] 123.028289: rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=call_start
kworker/u8:10-440 [000] 123.028291: rpc_request: task:47@3 nfs4_cbv1 CB_RECALL (async)
kworker/u8:10-440 [000] 123.028291: rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=call_reserve
kworker/u8:10-440 [000] 123.028298: rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=call_reserveresult
kworker/u8:10-440 [000] 123.028299: rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=call_refresh
kworker/u8:10-440 [000] 123.028324: rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=call_refreshresult
kworker/u8:10-440 [000] 123.028324: rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=call_allocate
kworker/u8:10-440 [000] 123.028340: rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|SIGNALLED status=0 action=call_encode
kworker/u8:10-440 [000] 123.028356: xprt_enq_xmit: task:47@3 xid=0x8b95e935 seqno=0 stage=4
kworker/u8:10-440 [000] 123.028357: rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|NEED_XMIT|NEED_RECV|SIGNALLED status=0 action=call_transmit
kworker/u8:10-440 [000] 123.028363: xprt_reserve_cong: task:47@3 snd_task:47 cong=0 cwnd=256
kworker/u8:10-440 [000] 123.028365: xprt_transmit: task:47@3 xid=0x8b95e935 seqno=0 status=-512
kworker/u8:10-440 [000] 123.028368: xprt_release_cong: task:47@3 snd_task:4294967295 cong=0 cwnd=256
kworker/u8:10-440 [000] 123.028368: rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|NEED_RECV|SIGNALLED status=0 action=call_transmit_status
kworker/u8:10-440 [000] 123.028371: rpc_task_sleep: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|NEED_RECV|SIGNALLED status=0 timeout=0 queue=xprt_pending
kworker/u8:10-440 [000] 123.028376: rpc_task_wakeup: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|QUEUED|ACTIVE|NEED_RECV|SIGNALLED status=-107 timeout=9000 queue=xprt_pending
kworker/u8:10-440 [000] 123.028377: rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|NEED_RECV|SIGNALLED status=-107 action=xprt_timer
kworker/u8:10-440 [000] 123.028377: rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|NEED_RECV|SIGNALLED status=-107 action=call_status
kworker/u8:10-440 [000] 123.028378: rpc_call_status: task:47@3 status=-107
kworker/u8:10-440 [000] 123.028378: rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|NEED_RECV|SIGNALLED status=0 action=call_encode
kworker/u8:10-440 [000] 123.028380: xprt_enq_xmit: task:47@3 xid=0x8b95e935 seqno=0 stage=4
kworker/u8:10-440 [000] 123.028380: rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|NEED_XMIT|NEED_RECV|SIGNALLED status=0 action=call_transmit
kworker/u8:10-440 [000] 123.028381: xprt_reserve_cong: task:47@3 snd_task:47 cong=0 cwnd=256
kworker/u8:10-440 [000] 123.028381: xprt_transmit: task:47@3 xid=0x8b95e935 seqno=0 status=-512
kworker/u8:10-440 [000] 123.028382: xprt_release_cong: task:47@3 snd_task:4294967295 cong=0 cwnd=256
kworker/u8:10-440 [000] 123.028382: rpc_task_run_action: task:47@3 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|NEED_RECV|SIGNALLED status=0 action=call_transmit_status
If we want to avoid the early RPC_IS_SIGNALLED check in the scheduler,
I guess an alternative would be to have call_transmit_status() check for
RPC_TASK_SIGNALLED. That's not as broad a fix, but it would address
both the NULL and the CB looping cases, IIUC.
>>> To avoid changing the behavior of call_transmit_status in the
>>> forward direction, make backchannel RPCs return with a different
>>> error than ENOTCONN when they fail.
>>>
>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>>> ---
>>> net/sunrpc/xprtrdma/svc_rdma_backchannel.c | 15 ++++++++++-----
>>> net/sunrpc/xprtsock.c | 6 ++++--
>>> 2 files changed, 14 insertions(+), 7 deletions(-)
>>>
>>> I'm playing with this fix. It seems to be required in order to get
>>> Kerberos mounts to work under load with NFSv4.1 and later on RDMA.
>>>
>>> If there are no objections, I can carry this for v5.7-rc ...
>>>
>>>
>>> diff --git a/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
>>> b/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
>>> index d510a3a15d4b..b8a72d7fbcc2 100644
>>> --- a/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
>>> +++ b/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
>>> @@ -207,11 +207,16 @@ rpcrdma_bc_send_request(struct svcxprt_rdma
>>> *rdma, struct rpc_rqst *rqst)
>>>
>>> drop_connection:
>>> dprintk("svcrdma: failed to send bc call\n");
>>> - return -ENOTCONN;
>>> + return -EHOSTUNREACH;
>>> }
>>>
>>> -/* Send an RPC call on the passive end of a transport
>>> - * connection.
>>> +/**
>>> + * xprt_rdma_bc_send_request - send an RPC backchannel Call
>>> + * @rqst: RPC Call in rq_snd_buf
>>> + *
>>> + * Returns:
>>> + * %0 if the RPC message has been sent
>>> + * %-EHOSTUNREACH if the Call could not be sent
>>> */
>>> static int
>>> xprt_rdma_bc_send_request(struct rpc_rqst *rqst)
>>> @@ -225,11 +230,11 @@ xprt_rdma_bc_send_request(struct rpc_rqst
>>> *rqst)
>>>
>>> mutex_lock(&sxprt->xpt_mutex);
>>>
>>> - ret = -ENOTCONN;
>>> + ret = -EHOSTUNREACH;
>>> rdma = container_of(sxprt, struct svcxprt_rdma, sc_xprt);
>>> if (!test_bit(XPT_DEAD, &sxprt->xpt_flags)) {
>>> ret = rpcrdma_bc_send_request(rdma, rqst);
>>> - if (ret == -ENOTCONN)
>>> + if (ret < 0)
>>> svc_close_xprt(sxprt);
>>> }
>>>
>>> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
>>> index 17cb902e5153..92a358fd2ff0 100644
>>> --- a/net/sunrpc/xprtsock.c
>>> +++ b/net/sunrpc/xprtsock.c
>>> @@ -2543,7 +2543,9 @@ static int bc_sendto(struct rpc_rqst *req)
>>> req->rq_xtime = ktime_get();
>>> err = xprt_sock_sendmsg(transport->sock, &msg, xdr, 0, marker,
>>> &sent);
>>> xdr_free_bvec(xdr);
>>> - if (err < 0 || sent != (xdr->len + sizeof(marker)))
>>> + if (err < 0)
>>> + return -EHOSTUNREACH;
>>> + if (sent != (xdr->len + sizeof(marker)))
>>> return -EAGAIN;
>>> return sent;
>>> }
>>> @@ -2567,7 +2569,7 @@ static int bc_send_request(struct rpc_rqst
>>> *req)
>>> */
>>> mutex_lock(&xprt->xpt_mutex);
>>> if (test_bit(XPT_DEAD, &xprt->xpt_flags))
>>> - len = -ENOTCONN;
>>> + len = -EHOSTUNREACH;
>>> else
>>> len = bc_sendto(req);
>>> mutex_unlock(&xprt->xpt_mutex);
>>>
>> --
>> Trond Myklebust
>> Linux NFS client maintainer, Hammerspace
>> trond.myklebust@hammerspace.com
>
> --
> Chuck Lever
--
Chuck Lever
next prev parent reply other threads:[~2020-04-03 21:46 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-04-03 19:42 [PATCH RFC] SUNRPC: Backchannel RPCs don't fail when the transport disconnects Chuck Lever
2020-04-03 20:00 ` Trond Myklebust
2020-04-03 20:05 ` Chuck Lever
2020-04-03 21:46 ` Chuck Lever [this message]
2020-04-03 22:43 ` Trond Myklebust
2020-04-03 23:11 ` Chuck Lever
2020-04-06 20:10 ` Bruce Fields
2020-04-07 14:04 ` Chuck Lever
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=B385D770-D511-4E72-B0D7-90ED66892C2D@oracle.com \
--to=chuck.lever@oracle.com \
--cc=linux-nfs@vger.kernel.org \
--cc=trondmy@hammerspace.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox