* [PATCH v2 0/2] Avoid hang when mounting xprtsec=[m]tls
From: cel @ 2025-05-21 20:34 UTC
To: Trond Myklebust, Anna Schumaker
Cc: Mike Snitzer, Thomas Haynes, linux-nfs, netdev,
kernel-tls-handshake, Chuck Lever
From: Chuck Lever <chuck.lever@oracle.com>
An NFS mount request can sometimes hang when TLS is requested.
This series attempts to address that.
I've checked on a couple of things since v1.
- Why doesn't the Linux kernel SunRPC client already poll just
after connecting? Typically the SunRPC client does not expect
an RPC Reply (i.e., any ingress traffic) until it has sent an RPC
Call first. RPC-with-TLS has changed that scenario a bit.
- Is this an issue for other in-kernel TLS consumers? It is. But
the only other in-kernel TLS consumer at the moment is NVMe over
TCP, and it already polls after a successful connection, for
other reasons.
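To illustrate the scenario described above, here is a minimal
userspace sketch in C. It is only an analogy, not kernel code: the
demo_* names are hypothetical, a socketpair stands in for the TLS
session, and a plain flag stands in for installing ->data_ready.
It shows that data queued before the notification hook exists is
never announced, and that one explicit non-blocking poll picks it up.

```c
#include <poll.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical stand-in for a transport whose "data ready" hook is
 * installed some time after the socket is already usable.
 */
struct demo_transport {
	int fd;
	bool callback_installed;
	int wakeups;		/* times the reader was kicked */
};

/* Notification path: events that fire before the hook is installed
 * are lost, which is the situation this series works around.
 */
static void demo_data_ready(struct demo_transport *t)
{
	if (t->callback_installed)
		t->wakeups++;
}

/* Explicit, non-blocking check for data that is already queued. */
static bool demo_poll_check_readable(struct demo_transport *t)
{
	struct pollfd pfd = { .fd = t->fd, .events = POLLIN };

	if (poll(&pfd, 1, 0) == 1 && (pfd.revents & POLLIN)) {
		t->wakeups++;
		return true;
	}
	return false;
}

/* Returns how many times the reader was woken, or -1 on error. */
static int demo_run(bool with_explicit_poll)
{
	struct demo_transport t = { .fd = -1 };
	int sv[2];

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;
	t.fd = sv[0];

	/* The peer sends before the hook exists. */
	if (write(sv[1], "x", 1) != 1) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}
	demo_data_ready(&t);		/* lost: hook not installed yet */

	t.callback_installed = true;	/* transport set-up completes */
	if (with_explicit_poll)
		demo_poll_check_readable(&t);

	close(sv[0]);
	close(sv[1]);
	return t.wakeups;
}
```

Calling demo_run(false) models the hang: the reader is never woken
even though a byte is waiting. Calling demo_run(true) models the
approach taken in 1/2, where one check after set-up finds the data.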
Changes since v1:
- Include Mike's R-b and T-b tags in 1/2
- Clean up dead code noticed while testing
Chuck Lever (2):
SUNRPC: Prevent hang on NFS mount with xprtsec=[m]tls
SUNRPC: Remove dead code from xs_tcp_tls_setup_socket()
net/sunrpc/xprtsock.c | 16 +++++-----------
1 file changed, 5 insertions(+), 11 deletions(-)
--
2.49.0
* [PATCH v2 1/2] SUNRPC: Prevent hang on NFS mount with xprtsec=[m]tls
From: cel @ 2025-05-21 20:34 UTC
To: Trond Myklebust, Anna Schumaker
Cc: Mike Snitzer, Thomas Haynes, linux-nfs, netdev,
kernel-tls-handshake, Chuck Lever, Steve Sears, Jakub Kicinski,
stable
From: Chuck Lever <chuck.lever@oracle.com>
Engineers at Hammerspace noticed that sometimes mounting with
"xprtsec=tls" hangs for a minute or so, and then times out, even
when the NFS server is reachable and responsive.
kTLS suppresses data_ready callbacks while strp->msg_ready is set.
This is meant to limit data_ready callbacks when a full TLS record
is not yet ready to be read from the socket.
Normally msg_ready is clear when the first TLS record arrives on
a socket. However, I observed that sometimes tls_setsockopt() sets
strp->msg_ready, and that prevents forward progress because
tls_data_ready() becomes a no-op.
Moreover, Jakub says: "If there's a full record queued at the time
when [tlshd] passes the socket back to the kernel, it's up to the
reader to read the already queued data out." So SunRPC cannot
expect a data_ready call when ingress data is already waiting.
Add an explicit poll after SunRPC's upper transport is set up to
pick up any data that arrived after the TLS handshake but before
transport set-up is complete.
Reported-by: Steve Sears <sjs@hammerspace.com>
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Fixes: 75eb6af7acdf ("SUNRPC: Add a TCP-with-TLS RPC transport class")
Tested-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Mike Snitzer <snitzer@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
net/sunrpc/xprtsock.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 83cc095846d3..4b10ecf4c265 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -2740,6 +2740,11 @@ static void xs_tcp_tls_setup_socket(struct work_struct *work)
 	}
 	rpc_shutdown_client(lower_clnt);
 
+	/* Check for ingress data that arrived before the socket's
+	 * ->data_ready callback was set up.
+	 */
+	xs_poll_check_readable(upper_transport);
+
 out_unlock:
 	current_restore_flags(pflags, PF_MEMALLOC);
 	upper_transport->clnt = NULL;
--
2.49.0
* [PATCH v2 2/2] SUNRPC: Remove dead code from xs_tcp_tls_setup_socket()
From: cel @ 2025-05-21 20:34 UTC
To: Trond Myklebust, Anna Schumaker
Cc: Mike Snitzer, Thomas Haynes, linux-nfs, netdev,
kernel-tls-handshake, Chuck Lever
From: Chuck Lever <chuck.lever@oracle.com>
xs_tcp_tls_finish_connecting() already marks the upper xprt
connected, so the same code in xs_tcp_tls_setup_socket() is
never executed.
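A minimal userspace sketch of why the removed branch is dead, using
C11 atomics. The demo_* names are hypothetical, and the sketch
assumes the finish-connecting step always runs before the second
test-and-set, as the description above implies. A test-and-set
returns the previous state, so the second caller always sees the
flag already set.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the upper xprt's connected state. */
struct demo_xprt {
	atomic_flag connected;
	unsigned int connect_cookie;
};

static void demo_xprt_init(struct demo_xprt *x)
{
	atomic_flag_clear(&x->connected);
	x->connect_cookie = 0;
}

/* Returns the previous state of the flag and leaves it set. */
static bool demo_test_and_set_connected(struct demo_xprt *x)
{
	return atomic_flag_test_and_set(&x->connected);
}

/* First caller: wins the test-and-set and does the bookkeeping. */
static void demo_finish_connecting(struct demo_xprt *x)
{
	if (!demo_test_and_set_connected(x))
		x->connect_cookie++;
}

/* Second caller: returns true only if its own branch ran. */
static bool demo_setup_socket(struct demo_xprt *x)
{
	demo_finish_connecting(x);

	if (!demo_test_and_set_connected(x)) {
		/* Not reached: the flag was set just above. */
		x->connect_cookie++;
		return true;
	}
	return false;
}
```

In this sketch demo_setup_socket() always returns false and the
cookie is bumped exactly once, which mirrors the reasoning for
dropping the duplicate block.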
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
net/sunrpc/xprtsock.c | 11 -----------
1 file changed, 11 deletions(-)
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 4b10ecf4c265..04ff66758fc3 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -2726,18 +2726,7 @@ static void xs_tcp_tls_setup_socket(struct work_struct *work)
 	if (status)
 		goto out_close;
 	xprt_release_write(lower_xprt, NULL);
-
 	trace_rpc_socket_connect(upper_xprt, upper_transport->sock, 0);
-	if (!xprt_test_and_set_connected(upper_xprt)) {
-		upper_xprt->connect_cookie++;
-		clear_bit(XPRT_SOCK_CONNECTING, &upper_transport->sock_state);
-		xprt_clear_connecting(upper_xprt);
-
-		upper_xprt->stat.connect_count++;
-		upper_xprt->stat.connect_time += (long)jiffies -
-			upper_xprt->stat.connect_start;
-		xs_run_error_worker(upper_transport, XPRT_SOCK_WAKE_PENDING);
-	}
 	rpc_shutdown_client(lower_clnt);
 
 	/* Check for ingress data that arrived before the socket's
--
2.49.0