netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] tls: Fix tls_sw_sendmsg error handling
@ 2025-01-04 15:29 Benjamin Coddington
  2025-01-07  2:36 ` Jakub Kicinski
  2025-01-08  2:10 ` patchwork-bot+netdevbpf
  0 siblings, 2 replies; 4+ messages in thread
From: Benjamin Coddington @ 2025-01-04 15:29 UTC (permalink / raw)
  To: Boris Pismenny, John Fastabend, Jakub Kicinski, David S. Miller,
	Eric Dumazet, Paolo Abeni, Simon Horman
  Cc: netdev, linux-nfs

We've noticed that NFS can hang when using RPC over TLS on an unstable
connection, and investigation shows that the RPC layer is stuck in a tight
loop attempting to transmit, but forever getting -EBADMSG back from the
underlying network.  The loop begins when tcp_sendmsg_locked() returns
-EPIPE to tls_tx_records(), but that error is converted to -EBADMSG when
calling the socket's error reporting handler.

Instead of converting errors from tcp_sendmsg_locked(), let's pass them
along in this path.  The RPC layer handles -EPIPE by reconnecting the
transport, which prevents the endless attempts to transmit on a broken
connection.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
---
 net/tls/tls_sw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index bbf26cc4f6ee..7bcc9b4408a2 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -458,7 +458,7 @@ int tls_tx_records(struct sock *sk, int flags)
 
 tx_err:
 	if (rc < 0 && rc != -EAGAIN)
-		tls_err_abort(sk, -EBADMSG);
+		tls_err_abort(sk, rc);
 
 	return rc;
 }

base-commit: 0bc21e701a6ffacfdde7f04f87d664d82e8a13bf
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] tls: Fix tls_sw_sendmsg error handling
  2025-01-04 15:29 [PATCH] tls: Fix tls_sw_sendmsg error handling Benjamin Coddington
@ 2025-01-07  2:36 ` Jakub Kicinski
  2025-01-07 12:28   ` Benjamin Coddington
  2025-01-08  2:10 ` patchwork-bot+netdevbpf
  1 sibling, 1 reply; 4+ messages in thread
From: Jakub Kicinski @ 2025-01-07  2:36 UTC (permalink / raw)
  To: Benjamin Coddington
  Cc: Boris Pismenny, John Fastabend, David S. Miller, Eric Dumazet,
	Paolo Abeni, Simon Horman, netdev, linux-nfs

On Sat,  4 Jan 2025 10:29:45 -0500 Benjamin Coddington wrote:
> We've noticed that NFS can hang when using RPC over TLS on an unstable
> connection, and investigation shows that the RPC layer is stuck in a tight
> loop attempting to transmit, but forever getting -EBADMSG back from the
> underlying network.  The loop begins when tcp_sendmsg_locked() returns
> -EPIPE to tls_tx_records(), but that error is converted to -EBADMSG when
> calling the socket's error reporting handler.
> 
> Instead of converting errors from tcp_sendmsg_locked(), let's pass them
> along in this path.  The RPC layer handles -EPIPE by reconnecting the
> transport, which prevents the endless attempts to transmit on a broken
> connection.

LGTM, only question in my mind is whether we should send this to stable.
Any preference?

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] tls: Fix tls_sw_sendmsg error handling
  2025-01-07  2:36 ` Jakub Kicinski
@ 2025-01-07 12:28   ` Benjamin Coddington
  0 siblings, 0 replies; 4+ messages in thread
From: Benjamin Coddington @ 2025-01-07 12:28 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Boris Pismenny, John Fastabend, David S. Miller, Eric Dumazet,
	Paolo Abeni, Simon Horman, netdev, linux-nfs, Vakul Garg

On 6 Jan 2025, at 21:36, Jakub Kicinski wrote:

> On Sat,  4 Jan 2025 10:29:45 -0500 Benjamin Coddington wrote:
>> We've noticed that NFS can hang when using RPC over TLS on an unstable
>> connection, and investigation shows that the RPC layer is stuck in a tight
>> loop attempting to transmit, but forever getting -EBADMSG back from the
>> underlying network.  The loop begins when tcp_sendmsg_locked() returns
>> -EPIPE to tls_tx_records(), but that error is converted to -EBADMSG when
>> calling the socket's error reporting handler.
>>
>> Instead of converting errors from tcp_sendmsg_locked(), let's pass them
>> along in this path.  The RPC layer handles -EPIPE by reconnecting the
>> transport, which prevents the endless attempts to transmit on a broken
>> connection.
>
> LGTM, only question in my mind is whether we should send this to stable.
> Any preference?

Yes, I think it can go, though not a strong preference.  This code well
predates RPC over TLS which landed on v6.5.  I haven't investigated other
users - they may not have the same problem since RPC over TLS has very
precise error handling, so it perhaps it makes sense to show the Fixes but
limit how far back we go for RPC.

Fixes: a42055e8d2c3 ("net/tls: Add support for async encryption of records for performance")
Cc: <stable@vger.kernel.org> # 6.5.x

Thanks for the look Jakub.
Ben


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] tls: Fix tls_sw_sendmsg error handling
  2025-01-04 15:29 [PATCH] tls: Fix tls_sw_sendmsg error handling Benjamin Coddington
  2025-01-07  2:36 ` Jakub Kicinski
@ 2025-01-08  2:10 ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 4+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-01-08  2:10 UTC (permalink / raw)
  To: Benjamin Coddington
  Cc: borisp, john.fastabend, kuba, davem, edumazet, pabeni, horms,
	netdev, linux-nfs

Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Sat,  4 Jan 2025 10:29:45 -0500 you wrote:
> We've noticed that NFS can hang when using RPC over TLS on an unstable
> connection, and investigation shows that the RPC layer is stuck in a tight
> loop attempting to transmit, but forever getting -EBADMSG back from the
> underlying network.  The loop begins when tcp_sendmsg_locked() returns
> -EPIPE to tls_tx_records(), but that error is converted to -EBADMSG when
> calling the socket's error reporting handler.
> 
> [...]

Here is the summary with links:
  - tls: Fix tls_sw_sendmsg error handling
    https://git.kernel.org/netdev/net/c/b341ca51d267

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2025-01-08  2:10 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-04 15:29 [PATCH] tls: Fix tls_sw_sendmsg error handling Benjamin Coddington
2025-01-07  2:36 ` Jakub Kicinski
2025-01-07 12:28   ` Benjamin Coddington
2025-01-08  2:10 ` patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).