Re: copy_file_range() infinitely hangs on NFSv4.2 over RDMA

Linux NFS development
 help / color / mirror / Atom feed

From: Chuck Lever <chuck.lever@oracle.com>
To: Olga Kornievskaia <aglo@umich.edu>
Cc: Timo Rothenpieler <timo@rothenpieler.org>,
	linux-rdma <linux-rdma@vger.kernel.org>,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: copy_file_range() infinitely hangs on NFSv4.2 over RDMA
Date: Fri, 19 Feb 2021 20:55:58 +0000	[thread overview]
Message-ID: <1A329051-FD97-4200-AB48-B3DAE7AA9D21@oracle.com> (raw)
In-Reply-To: <CAN-5tyHq-1SM8o-qpeF-_UGd0a74ky8Atcam=gY9HqUrMhfp_g@mail.gmail.com>



> On Feb 19, 2021, at 3:43 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
> 
> On Fri, Feb 19, 2021 at 1:48 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>> 
>> 
>> 
>>> On Feb 19, 2021, at 1:01 PM, Timo Rothenpieler <timo@rothenpieler.org> wrote:
>>> 
>>> On 19.02.2021 18:48, Chuck Lever wrote:
>>>>> On Feb 19, 2021, at 12:38 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>>>> 
>>>>> On Thu, Feb 18, 2021 at 3:22 PM Timo Rothenpieler <timo@rothenpieler.org> wrote:
>>>>>> 
>>>>>> On 18.02.2021 19:30, Olga Kornievskaia wrote:
>>>>>>> Thank you for getting tracepoints from a busy server but can you get
>>>>>>> more? As suspected, the server is having issues sending the callback.
>>>>>>> I'm not sure why. Any chance to turn on the server's sunrpc
>>>>>>> tracespoints, probably both sunrpc and rdmas tracepoints, I wonder if
>>>>>>> we can any more info about why it's failing?
>>>>>> 
>>>>>> I isolated out two of the machines on that cluster now, one acting as
>>>>>> NFS server from an ext4 mount, the other is the same client as before.
>>>>>> That way I managed to capture a trace and ibdump of an entire cycle:
>>>>>> mount + successful copy + 5 minutes later a copy that got stuck
>>>>>> 
>>>>>> Next to no noise happened during those traces, you can find them attached.
>>>>>> 
>>>>>> Another observation made due to this: unmount and re-mounting the NFS
>>>>>> share also gets it back into working condition for a while, no reboot
>>>>>> necessary.
>>>>>> During this trace, I got "lucky", and after just 5 minutes of waiting,
>>>>>> it got stuck.
>>>>>> 
>>>>>> Before that, I had a run of mount + trying to copy every 5 minutes where
>>>>>> it ran for 45 minutes without getting stuck. At which point I decided to
>>>>>> remount once more.
>>>>> 
>>>>> Timo, thank you for gathering the debug info.
>>>>> 
>>>>> Chuck, I need your help. Why would the server lose a callback channel?
>>>>> 
>>>>>          <...>-1461944 [001] 2076465.200151: rpc_request:
>>>>> task:57752@6 nfs4_cbv1 CB_OFFLOAD (async)
>>>>>          <...>-1461944 [001] 2076465.200151: rpc_task_run_action:
>>>>> task:57752@6 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
>>>>> runstate=RUNNING|ACTIVE status=0 action=call_reserve
>>>>>          <...>-1461944 [001] 2076465.200154: xprt_reserve:
>>>>> task:57752@6 xid=0x00a0aaf9
>>>>>          <...>-1461944 [001] 2076465.200155: rpc_task_run_action:
>>>>> task:57752@6 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
>>>>> runstate=RUNNING|ACTIVE status=0 action=call_reserveresult
>>>>>          <...>-1461944 [001] 2076465.200156: rpc_task_run_action:
>>>>> task:57752@6 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
>>>>> runstate=RUNNING|ACTIVE status=0 action=call_refresh
>>>>>          <...>-1461944 [001] 2076465.200163: rpc_task_run_action:
>>>>> task:57752@6 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
>>>>> runstate=RUNNING|ACTIVE status=0 action=call_refreshresult
>>>>>          <...>-1461944 [001] 2076465.200163: rpc_task_run_action:
>>>>> task:57752@6 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
>>>>> runstate=RUNNING|ACTIVE status=0 action=call_allocate
>>>>>          <...>-1461944 [001] 2076465.200168: rpc_buf_alloc:
>>>>> task:57752@6 callsize=548 recvsize=104 status=0
>>>>>          <...>-1461944 [001] 2076465.200168: rpc_task_run_action:
>>>>> task:57752@6 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
>>>>> runstate=RUNNING|ACTIVE status=0 action=call_encode
>>>>>          <...>-1461944 [001] 2076465.200173: rpc_task_run_action:
>>>>> task:57752@6 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
>>>>> runstate=RUNNING|ACTIVE|NEED_XMIT|NEED_RECV status=0
>>>>> action=call_connect
>>>>>          <...>-1461944 [001] 2076465.200174: rpc_call_rpcerror:
>>>>> task:57752@6 tk_status=-107 rpc_status=-107
>>>>>          <...>-1461944 [001] 2076465.200174: rpc_task_run_action:
>>>>> task:57752@6 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT
>>>>> runstate=RUNNING|ACTIVE|NEED_XMIT|NEED_RECV status=-107
>>>>> action=rpc_exit_task
>>>>> 
>>>>> It's reporting ENOTCON. I'm not really sure if this is related to copy
>>>>> offload but more perhaps doing callbacks over RDMA/IB.
>>>> If the client is awaiting a COPY notification callback, I can see why
>>>> a lost CB channel would cause the client to wait indefinitely.
>>>> The copy mechanism has to deal with this contingency... Perhaps the
>>>> server could force a disconnect so that the client and server have an
>>>> opportunity to re-establish the callback channel. <shrug>
>>>> In any event, the trace log above shows nothing more than "hey, it's
>>>> not working." Are there any rpcrdma trace events we can look at to
>>>> determine why the backchannel is getting lost?
>>> 
>>> The trace log attached to my previous mail has it enabled, along with nfsd and sunrpc.
>>> I'm attaching the two files again here.
>> 
>> Thanks, Timo.
>> 
>> The first CB_OFFLOAD callback succeeds:
>> 
>> 2076155.216687: nfsd_cb_work:         addr=10.110.10.252:0 client 602eb645:daa037ae procedure=CB_OFFLOAD
>> 2076155.216704: rpc_request:          task:57746@6 nfs4_cbv1 CB_OFFLOAD (async)
>> 
>> 2076155.216975: rpc_stats_latency:    task:57746@6 xid=0xff9faaf9 nfs4_cbv1 CB_OFFLOAD backlog=33 rtt=195 execute=282
>> 2076155.216977: nfsd_cb_done:         addr=10.110.10.252:0 client 602eb645:daa037ae status=0
>> 
>> 
>> About 305 seconds later, the autodisconnect timer fires. I'm not sure if this is the backchannel transport, but it looks suspicious:
>> 
>> 2076460.314954: xprt_disconnect_auto: peer=[10.110.10.252]:0 state=LOCKED|CONNECTED|BOUND
>> 2076460.314957: xprt_disconnect_done: peer=[10.110.10.252]:0 state=LOCKED|CONNECTED|BOUND
>> 
>> 
>> The next CB_OFFLOAD callback fails because the xprt has been marked "disconnected" and the request's NOCONNECT flag is set.
>> 
>> 2076465.200136: nfsd_cb_work:         addr=10.110.10.252:0 client 602eb645:daa037ae procedure=CB_OFFLOAD
>> 2076465.200151: rpc_request:          task:57752@6 nfs4_cbv1 CB_OFFLOAD (async)
>> 
>> 2076465.200168: rpc_task_run_action:  task:57752@6 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE status=0 action=call_encode
>> 2076465.200173: rpc_task_run_action:  task:57752@6 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|NEED_XMIT|NEED_RECV status=0 action=call_connect
>> 2076465.200174: rpc_call_rpcerror:    task:57752@6 tk_status=-107 rpc_status=-107
>> 2076465.200174: rpc_task_run_action:  task:57752@6 flags=ASYNC|DYNAMIC|SOFT|NOCONNECT runstate=RUNNING|ACTIVE|NEED_XMIT|NEED_RECV status=-107 action=rpc_exit_task
>> 2076465.200179: nfsd_cb_done:         addr=10.110.10.252:0 client 602eb645:daa037ae status=-107
>> 2076465.200180: nfsd_cb_state:        addr=10.110.10.252:0 client 602eb645:daa037ae state=FAULT
>> 
>> 
>> Perhaps RPC clients for NFSv4.1+ callback should be created with
>> the RPC_CLNT_CREATE_NO_IDLE_TIMEOUT flag.
> 
> Do you know if this callback behavior is new? The same problem would
> exist with delegation recalls.

Delegation recall would be affected, but that wouldn't cause an
indefinite hang. The server should eventually revoke the delegation
and life would continue.

Also, the server should be setting the CALLBACK_DOWN sequence status
flag in its replies. If the client isn't sending any other traffic
while waiting for the copy to complete, it probably wouldn't notice
the problem.

Is the client renewing its lease while the copy is ongoing? That
would be an opportunity for the server to signal the loss of
backchannel connectivity.

Again, in the long term, the copy mechanism needs to be prepared for
the contingency that the callback channel is lost while copy
operations are pending.

There's plenty to investigate and confirm here.


--
Chuck Lever

next prev parent reply	other threads:[~2021-02-19 20:56 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-14  3:31 copy_file_range() infinitely hangs on NFSv4.2 over RDMA Timo Rothenpieler
2021-02-16 20:12 ` Olga Kornievskaia
2021-02-16 20:37   ` Timo Rothenpieler
     [not found]     ` <0e49471c-e640-a331-c7b4-4e0a49a7a967@rothenpieler.org>
2021-02-17 22:37       ` Olga Kornievskaia
2021-02-18  1:12         ` Timo Rothenpieler
2021-02-18  3:52           ` Olga Kornievskaia
2021-02-18 13:28             ` Timo Rothenpieler
     [not found]               ` <e89ab742-7984-6a2c-2f01-402283ba6e89@rothenpieler.org>
2021-02-18 18:30                 ` Olga Kornievskaia
     [not found]                   ` <def12560-2481-b17d-5a42-7236edbd5392@rothenpieler.org>
2021-02-19 17:38                     ` Olga Kornievskaia
2021-02-19 17:48                       ` Chuck Lever
     [not found]                         ` <19c74710-bf35-6412-dd06-071331419ab5@rothenpieler.org>
2021-02-19 18:48                           ` Chuck Lever
2021-02-19 20:37                             ` Timo Rothenpieler
2021-02-19 20:43                             ` Olga Kornievskaia
2021-02-19 20:55                               ` Chuck Lever [this message]
2021-02-20 21:03                             ` Timo Rothenpieler
2021-02-21 17:45                               ` Chuck Lever

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1A329051-FD97-4200-AB48-B3DAE7AA9D21@oracle.com \
    --to=chuck.lever@oracle.com \
    --cc=aglo@umich.edu \
    --cc=linux-nfs@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=timo@rothenpieler.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox