From: Dai Ngo <dai.ngo@oracle.com>
To: Trond Myklebust <trondmy@kernel.org>, anna@kernel.org
Cc: linux-nfs@vger.kernel.org
Subject: Re: [PATCH 1/1] SUNRPC: call_connect_status needs to destroy transport on ETIMEDOUT before retry
Date: Tue, 5 Aug 2025 08:46:39 -0700 [thread overview]
Message-ID: <de62d42e-5e85-45d5-825b-369938a4a24c@oracle.com> (raw)
In-Reply-To: <23fbff6b80f0c7c4b963f214c4c1e5d7b31c1d23.camel@kernel.org>
On 8/4/25 4:55 PM, Trond Myklebust wrote:
> On Mon, 2025-08-04 at 13:13 -0700, Dai Ngo wrote:
>> On 8/4/25 12:21 PM, Trond Myklebust wrote:
>>> On Mon, 2025-08-04 at 12:08 -0700, Dai Ngo wrote:
>>>> Currently, when an RPC connection times out during the connect
>>>> phase,
>>>> the task is retried by placing it back on the pending queue and
>>>> waiting
>>>> again. In some cases, the timeout occurs because TCP is unable to
>>>> send
>>>> the SYN packet. This situation most often arises on bare metal
>>>> systems
>>>> at boot time, when the NFS mount is attempted while the network
>>>> link
>>>> appears to be up but is not yet stable.
>>>>
>>>> This patch addresses the issue by updating call_connect_status to
>>>> destroy
>>>> the transport on ETIMEDOUT error before retrying the connection.
>>>> This
>>>> ensures that subsequent connection attempts use a fresh
>>>> transport,
>>>> reducing the likelihood of repeated failures due to lingering
>>>> network
>>>> issues.
>>>>
>>>> Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
>>>> ---
>>>> net/sunrpc/clnt.c | 2 +-
>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
>>>> index 21426c3049d3..701b742750c5 100644
>>>> --- a/net/sunrpc/clnt.c
>>>> +++ b/net/sunrpc/clnt.c
>>>> @@ -2215,6 +2215,7 @@ call_connect_status(struct rpc_task *task)
>>>> case -EHOSTUNREACH:
>>>> case -EPIPE:
>>>> case -EPROTO:
>>>> + case -ETIMEDOUT:
>>>> xprt_conditional_disconnect(task->tk_rqstp-
>>>>> rq_xprt,
>>>> task->tk_rqstp-
>>>>> rq_connect_cookie);
>>>> if (RPC_IS_SOFTCONN(task))
>>>> @@ -2225,7 +2226,6 @@ call_connect_status(struct rpc_task *task)
>>>> case -EADDRINUSE:
>>>> case -ENOTCONN:
>>>> case -EAGAIN:
>>>> - case -ETIMEDOUT:
>>>> if (!(task->tk_flags & RPC_TASK_NO_ROUND_ROBIN)
>>>> &&
>>>> (task->tk_flags & RPC_TASK_MOVEABLE) &&
>>>> test_bit(XPRT_REMOVE, &xprt->state)) {
>>> Why is this needed? The ETIMEDOUT is supposed to be a task level
>>> error,
>>> not a connection level thing.
>> If TCP was not able to sent the SYN out on due to the transient error
>> with the link status and the task just turns around a wait again,
>> since
>> TCP does not retry the SYN, eventually systemd times out and stops
>> the
>> mount.
>>
>>
>> Below is the snippet from the system log with rpcdebug enabled:
>>
>>
>> Jun 20 10:23:01 qa-i360-odi03 kernel: i40e 0000:98:00.0 eth1: NIC
>> Link is Up, 10 Gbps Full Duplex, Flow Control: None
>> Jun 20 10:23:09 qa-i360-odi03 NetworkManager[1511]: <info>
>> [1750414989.6033] manager: startup complete
>>
>> ... <NFS mount starts>
>> Jun 20 10:23:09 qa-i360-odi03 systemd[1]: Mounting /odi...
>> ...
>> Jun 20 10:23:09 qa-i360-odi03 kernel: RPC: 1 added to queue
>> 0000000093f481cd "xprt_pending"
>> Jun 20 10:23:09 qa-i360-odi03 kernel: RPC: 1 setting alarm for
>> 60000 ms
>>
>> ... <link status down & up>
>> Jun 20 10:23:10 qa-i360-odi03 kernel: tg3 0000:04:00.0 em1: Link is
>> up at 1000 Mbps, full duplex
>> Jun 20 10:23:10 qa-i360-odi03 NetworkManager[1511]: <info>
>> [1750414990.6359] device (em1): state change: disconnected -> prepare
>> (reason 'none', sys-iface-state: 'managed')
>> Jun 20 10:23:10 qa-i360-odi03 NetworkManager[1511]: <info>
>> [1750414990.6361] device (em1): state change: prepare -> config
>> (reason 'none', sys-iface-state: 'managed')
>> Jun 20 10:23:10 qa-i360-odi03 NetworkManager[1511]: <info>
>> [1750414990.6364] device (em1): state change: config -> ip-config
>> (reason 'none', sys-iface-state: 'managed')
>> Jun 20 10:23:10 qa-i360-odi03 NetworkManager[1511]: <info>
>> [1750414990.6383] device (em1): Activation: successful, device
>> activated.
>>
>> ... <connect timed out>
>> Jun 20 10:24:11 qa-i360-odi03 kernel: RPC: 1 timeout
>> Jun 20 10:24:11 qa-i360-odi03 kernel: RPC: 1 __rpc_wake_up_task
>> (now 4294742016)
>> Jun 20 10:24:11 qa-i360-odi03 kernel: RPC: 1 disabling timer
>> Jun 20 10:24:11 qa-i360-odi03 kernel: RPC: 1 removed from queue
>> 0000000093f481cd "xprt_pending"
>> Jun 20 10:24:11 qa-i360-odi03 kernel: RPC: 1 call_connect_status
>> (status -110)
>>
>> ... <wait again>
>> Jun 20 10:24:11 qa-i360-odi03 kernel: RPC: 1 sleep_on(queue
>> "xprt_pending" time 4294742016)
>> Jun 20 10:24:11 qa-i360-odi03 kernel: RPC: 1 added to queue
>> 0000000093f481cd "xprt_pending"
>>
>> ... <systemd timed out>
>> Jun 20 10:24:39 qa-i360-odi03 systemd[1]: odi.mount: Mounting timed
>> out. Terminating.
>>
>> Jun 20 10:24:39 qa-i360-odi03 kernel: RPC: 1 got signal
>> Jun 20 10:24:39 qa-i360-odi03 kernel: RPC: 1 __rpc_wake_up_task
>> (now 4294770229)
>> Jun 20 10:24:39 qa-i360-odi03 kernel: RPC: 1 disabling timer
>> Jun 20 10:24:39 qa-i360-odi03 kernel: RPC: 1 removed from queue
>> 0000000093f481cd "xprt_pending"
>>
>> Jun 20 10:24:39 qa-i360-odi03 kernel: <-- nfs4_try_mount() = -512
>> [error]
>> Jun 20 10:24:39 qa-i360-odi03 systemd[1]: odi.mount: Mount process
>> exited, code=killed status=15
>> Jun 20 10:24:39 qa-i360-odi03 systemd[1]: odi.mount: Failed with
>> result 'timeout'.
>>
>> This patch forces TCP to send the SYN on ETIMEDOUT error.
>>
> Something is very wrong here...
>
> If this patch is correct, and the call to xprt_conditional_disconnect()
> does indeed cause the socket to close, then something must have bumped
> xprt->connect_cookie.
I'm a little confused here, xprt_conditional_disconnect only closes the
socket if the connect cookie is still the same:
if (cookie != xprt->connect_cookie)
goto out;
So in this case the xprt->connect_cookie has not been bumped.
-Dai
> The only things that can do that are the state
> changes recorded in xs_tcp_state_change(),
> xs_sock_reset_connection_flags(), or xprt_autoclose().
>
> If it was xs_tcp_state_change() that bumped xprt->connect_cookie, then
> either we're in TCP_ESTABLISHED (in which case triggering a close on
> ETIMEDOUT is just wrong), or we're in the TCP_CLOSE state, in which
> case autoclose should already be scheduled.
> If xs_sock_reset_connection_flags() got called, then the socket is in
> the process of being closed.
> Ditto if xprt_autoclose() got called.
>
> So why do we need the call to xprt_conditional_disconnect()?
>
next prev parent reply other threads:[~2025-08-05 15:46 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-04 19:08 [PATCH 1/1] SUNRPC: call_connect_status needs to destroy transport on ETIMEDOUT before retry Dai Ngo
2025-08-04 19:21 ` Trond Myklebust
2025-08-04 20:13 ` Dai Ngo
2025-08-04 23:55 ` Trond Myklebust
2025-08-05 15:46 ` Dai Ngo [this message]
2025-08-05 16:41 ` Trond Myklebust
2025-08-05 17:07 ` Dai Ngo
2025-08-05 17:41 ` Trond Myklebust
2025-08-05 18:25 ` Dai Ngo
-- strict thread matches above, loose matches on Subject: below --
2025-08-04 18:46 Dai Ngo
2025-08-04 18:49 ` Chuck Lever
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=de62d42e-5e85-45d5-825b-369938a4a24c@oracle.com \
--to=dai.ngo@oracle.com \
--cc=anna@kernel.org \
--cc=linux-nfs@vger.kernel.org \
--cc=trondmy@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).