From: Trond Myklebust <trondmy@kernel.org>
To: Dai Ngo <dai.ngo@oracle.com>, anna@kernel.org
Cc: linux-nfs@vger.kernel.org
Subject: Re: [PATCH 1/1] SUNRPC: call_connect_status needs to destroy transport on ETIMEDOUT before retry
Date: Mon, 04 Aug 2025 16:55:34 -0700 [thread overview]
Message-ID: <23fbff6b80f0c7c4b963f214c4c1e5d7b31c1d23.camel@kernel.org> (raw)
In-Reply-To: <72e387a8-b800-431b-89c3-0104fbfe6273@oracle.com>
On Mon, 2025-08-04 at 13:13 -0700, Dai Ngo wrote:
>
> On 8/4/25 12:21 PM, Trond Myklebust wrote:
> > On Mon, 2025-08-04 at 12:08 -0700, Dai Ngo wrote:
> > > Currently, when an RPC connection times out during the connect
> > > phase,
> > > the task is retried by placing it back on the pending queue and
> > > waiting
> > > again. In some cases, the timeout occurs because TCP is unable to
> > > send
> > > the SYN packet. This situation most often arises on bare metal
> > > systems
> > > at boot time, when the NFS mount is attempted while the network
> > > link
> > > appears to be up but is not yet stable.
> > >
> > > This patch addresses the issue by updating call_connect_status to
> > > destroy
> > > the transport on ETIMEDOUT error before retrying the connection.
> > > This
> > > ensures that subsequent connection attempts use a fresh
> > > transport,
> > > reducing the likelihood of repeated failures due to lingering
> > > network
> > > issues.
> > >
> > > Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
> > > ---
> > > net/sunrpc/clnt.c | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
> > > index 21426c3049d3..701b742750c5 100644
> > > --- a/net/sunrpc/clnt.c
> > > +++ b/net/sunrpc/clnt.c
> > > @@ -2215,6 +2215,7 @@ call_connect_status(struct rpc_task *task)
> > > case -EHOSTUNREACH:
> > > case -EPIPE:
> > > case -EPROTO:
> > > + case -ETIMEDOUT:
> > > xprt_conditional_disconnect(task->tk_rqstp-
> > > >rq_xprt,
> > > task->tk_rqstp-
> > > > rq_connect_cookie);
> > > if (RPC_IS_SOFTCONN(task))
> > > @@ -2225,7 +2226,6 @@ call_connect_status(struct rpc_task *task)
> > > case -EADDRINUSE:
> > > case -ENOTCONN:
> > > case -EAGAIN:
> > > - case -ETIMEDOUT:
> > > if (!(task->tk_flags & RPC_TASK_NO_ROUND_ROBIN)
> > > &&
> > > (task->tk_flags & RPC_TASK_MOVEABLE) &&
> > > test_bit(XPRT_REMOVE, &xprt->state)) {
> > Why is this needed? The ETIMEDOUT is supposed to be a task level
> > error,
> > not a connection level thing.
>
> If TCP was not able to sent the SYN out on due to the transient error
> with the link status and the task just turns around a wait again,
> since
> TCP does not retry the SYN, eventually systemd times out and stops
> the
> mount.
>
>
> Below is the snippet from the system log with rpcdebug enabled:
>
>
> Jun 20 10:23:01 qa-i360-odi03 kernel: i40e 0000:98:00.0 eth1: NIC
> Link is Up, 10 Gbps Full Duplex, Flow Control: None
> Jun 20 10:23:09 qa-i360-odi03 NetworkManager[1511]: <info>
> [1750414989.6033] manager: startup complete
>
> ... <NFS mount starts>
> Jun 20 10:23:09 qa-i360-odi03 systemd[1]: Mounting /odi...
> ...
> Jun 20 10:23:09 qa-i360-odi03 kernel: RPC: 1 added to queue
> 0000000093f481cd "xprt_pending"
> Jun 20 10:23:09 qa-i360-odi03 kernel: RPC: 1 setting alarm for
> 60000 ms
>
> ... <link status down & up>
> Jun 20 10:23:10 qa-i360-odi03 kernel: tg3 0000:04:00.0 em1: Link is
> up at 1000 Mbps, full duplex
> Jun 20 10:23:10 qa-i360-odi03 NetworkManager[1511]: <info>
> [1750414990.6359] device (em1): state change: disconnected -> prepare
> (reason 'none', sys-iface-state: 'managed')
> Jun 20 10:23:10 qa-i360-odi03 NetworkManager[1511]: <info>
> [1750414990.6361] device (em1): state change: prepare -> config
> (reason 'none', sys-iface-state: 'managed')
> Jun 20 10:23:10 qa-i360-odi03 NetworkManager[1511]: <info>
> [1750414990.6364] device (em1): state change: config -> ip-config
> (reason 'none', sys-iface-state: 'managed')
> Jun 20 10:23:10 qa-i360-odi03 NetworkManager[1511]: <info>
> [1750414990.6383] device (em1): Activation: successful, device
> activated.
>
> ... <connect timed out>
> Jun 20 10:24:11 qa-i360-odi03 kernel: RPC: 1 timeout
> Jun 20 10:24:11 qa-i360-odi03 kernel: RPC: 1 __rpc_wake_up_task
> (now 4294742016)
> Jun 20 10:24:11 qa-i360-odi03 kernel: RPC: 1 disabling timer
> Jun 20 10:24:11 qa-i360-odi03 kernel: RPC: 1 removed from queue
> 0000000093f481cd "xprt_pending"
> Jun 20 10:24:11 qa-i360-odi03 kernel: RPC: 1 call_connect_status
> (status -110)
>
> ... <wait again>
> Jun 20 10:24:11 qa-i360-odi03 kernel: RPC: 1 sleep_on(queue
> "xprt_pending" time 4294742016)
> Jun 20 10:24:11 qa-i360-odi03 kernel: RPC: 1 added to queue
> 0000000093f481cd "xprt_pending"
>
> ... <systemd timed out>
> Jun 20 10:24:39 qa-i360-odi03 systemd[1]: odi.mount: Mounting timed
> out. Terminating.
>
> Jun 20 10:24:39 qa-i360-odi03 kernel: RPC: 1 got signal
> Jun 20 10:24:39 qa-i360-odi03 kernel: RPC: 1 __rpc_wake_up_task
> (now 4294770229)
> Jun 20 10:24:39 qa-i360-odi03 kernel: RPC: 1 disabling timer
> Jun 20 10:24:39 qa-i360-odi03 kernel: RPC: 1 removed from queue
> 0000000093f481cd "xprt_pending"
>
> Jun 20 10:24:39 qa-i360-odi03 kernel: <-- nfs4_try_mount() = -512
> [error]
> Jun 20 10:24:39 qa-i360-odi03 systemd[1]: odi.mount: Mount process
> exited, code=killed status=15
> Jun 20 10:24:39 qa-i360-odi03 systemd[1]: odi.mount: Failed with
> result 'timeout'.
>
> This patch forces TCP to send the SYN on ETIMEDOUT error.
>
Something is very wrong here...
If this patch is correct, and the call to xprt_conditional_disconnect()
does indeed cause the socket to close, then something must have bumped
xprt->connect_cookie. The only things that can do that are the state
changes recorded in xs_tcp_state_change(),
xs_sock_reset_connection_flags(), or xprt_autoclose().
If it was xs_tcp_state_change() that bumped xprt->connect_cookie, then
either we're in TCP_ESTABLISHED (in which case triggering a close on
ETIMEDOUT is just wrong), or we're in the TCP_CLOSE state, in which
case autoclose should already be scheduled.
If xs_sock_reset_connection_flags() got called, then the socket is in
the process of being closed.
Ditto if xprt_autoclose() got called.
So why do we need the call to xprt_conditional_disconnect()?
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trondmy@kernel.org, trond.myklebust@hammerspace.com
next prev parent reply other threads:[~2025-08-04 23:55 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-04 19:08 [PATCH 1/1] SUNRPC: call_connect_status needs to destroy transport on ETIMEDOUT before retry Dai Ngo
2025-08-04 19:21 ` Trond Myklebust
2025-08-04 20:13 ` Dai Ngo
2025-08-04 23:55 ` Trond Myklebust [this message]
2025-08-05 15:46 ` Dai Ngo
2025-08-05 16:41 ` Trond Myklebust
2025-08-05 17:07 ` Dai Ngo
2025-08-05 17:41 ` Trond Myklebust
2025-08-05 18:25 ` Dai Ngo
-- strict thread matches above, loose matches on Subject: below --
2025-08-04 18:46 Dai Ngo
2025-08-04 18:49 ` Chuck Lever
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=23fbff6b80f0c7c4b963f214c4c1e5d7b31c1d23.camel@kernel.org \
--to=trondmy@kernel.org \
--cc=anna@kernel.org \
--cc=dai.ngo@oracle.com \
--cc=linux-nfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).