netdev.vger.kernel.org archive mirror
* [PATCH] net/tls: avoid TCP window full during ->read_sock()
@ 2023-08-03 10:08 Hannes Reinecke
  2023-08-05  0:57 ` Jakub Kicinski
  0 siblings, 1 reply; 3+ messages in thread
From: Hannes Reinecke @ 2023-08-03 10:08 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Sagi Grimberg, Keith Busch, linux-nvme, Jakub Kicinski,
	Eric Dumazet, Paolo Abeni, netdev, Hannes Reinecke

When flushing the backlog after decoding each record in ->read_sock()
we may end up with really long records, causing a TCP window full
condition, as the TCP window is only increased again after the record
has been processed. So we should rather process the record first,
allowing the TCP window to be increased again before flushing the
backlog.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 net/tls/tls_sw.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 9c1f13541708..57db189b29b0 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -2240,7 +2240,6 @@ int tls_sw_read_sock(struct sock *sk, read_descriptor_t *desc,
 			tlm = tls_msg(skb);
 		} else {
 			struct tls_decrypt_arg darg;
-			int to_decrypt;
 
 			err = tls_rx_rec_wait(sk, NULL, true, released);
 			if (err <= 0)
@@ -2248,20 +2247,16 @@ int tls_sw_read_sock(struct sock *sk, read_descriptor_t *desc,
 
 			memset(&darg.inargs, 0, sizeof(darg.inargs));
 
-			rxm = strp_msg(tls_strp_msg(ctx));
-			tlm = tls_msg(tls_strp_msg(ctx));
-
-			to_decrypt = rxm->full_len - prot->overhead_size;
-
 			err = tls_rx_one_record(sk, NULL, &darg);
 			if (err < 0) {
 				tls_err_abort(sk, -EBADMSG);
 				goto read_sock_end;
 			}
 
-			released = tls_read_flush_backlog(sk, prot, rxm->full_len, to_decrypt,
-							  decrypted, &flushed_at);
 			skb = darg.skb;
+			/* TLS 1.3 may have updated the length by more than overhead */
+			rxm = strp_msg(skb);
+			tlm = tls_msg(skb);
 			decrypted += rxm->full_len;
 
 			tls_rx_rec_done(ctx);
@@ -2280,6 +2275,12 @@ int tls_sw_read_sock(struct sock *sk, read_descriptor_t *desc,
 			goto read_sock_requeue;
 		}
 		copied += used;
+		/*
+		 * flush backlog after processing the TLS record, otherwise we might
+		 * end up with really large records and trigger a TCP window full.
+		 */
+		released = tls_read_flush_backlog(sk, prot, decrypted - copied, decrypted,
+						  copied, &flushed_at);
 		if (used < rxm->full_len) {
 			rxm->offset += used;
 			rxm->full_len -= used;
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] net/tls: avoid TCP window full during ->read_sock()
  2023-08-03 10:08 [PATCH] net/tls: avoid TCP window full during ->read_sock() Hannes Reinecke
@ 2023-08-05  0:57 ` Jakub Kicinski
  2023-08-07  7:08   ` Sagi Grimberg
  0 siblings, 1 reply; 3+ messages in thread
From: Jakub Kicinski @ 2023-08-05  0:57 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Christoph Hellwig, Sagi Grimberg, Keith Busch, linux-nvme,
	Eric Dumazet, Paolo Abeni, netdev

On Thu,  3 Aug 2023 12:08:09 +0200 Hannes Reinecke wrote:
> When flushing the backlog after decoding each record in ->read_sock()
> we may end up with really long records, causing a TCP window full as
> the TCP window would only be increased again after we process the
> record. So we should rather process the record first to allow the
> TCP window to be increased again before flushing the backlog.

> -			released = tls_read_flush_backlog(sk, prot, rxm->full_len, to_decrypt,
> -							  decrypted, &flushed_at);
>  			skb = darg.skb;
> +			/* TLS 1.3 may have updated the length by more than overhead */

> +			rxm = strp_msg(skb);
> +			tlm = tls_msg(skb);
>  			decrypted += rxm->full_len;
>  
>  			tls_rx_rec_done(ctx);
> @@ -2280,6 +2275,12 @@ int tls_sw_read_sock(struct sock *sk, read_descriptor_t *desc,
>  			goto read_sock_requeue;
>  		}
>  		copied += used;
> +		/*
> +		 * flush backlog after processing the TLS record, otherwise we might
> +		 * end up with really large records and triggering a TCP window full.
> +		 */
> +		released = tls_read_flush_backlog(sk, prot, decrypted - copied, decrypted,
> +						  copied, &flushed_at);

I'm surprised moving the flushing out makes a difference.
rx_list should generally hold at most 1 skb (16kB) unless something 
is PEEKing the data.

Looking at it closer, I think the problem may be the calling args to
tls_read_flush_backlog(). Since we don't know how much data the
reader wants, we can't sensibly evaluate the first condition,
so how would it work if, instead of this patch, we did:

-			released = tls_read_flush_backlog(sk, prot, rxm->full_len, to_decrypt,
+			released = tls_read_flush_backlog(sk, prot, INT_MAX, 0,
							  decrypted, &flushed_at);

That would give us a flush every 128k of data (or every record if
inq is shorter than 16kB).

side note - I still prefer 80 char max lines, please. It seems to result
in prettier code overall as it forces people to think more about code
structure.

>  		if (used < rxm->full_len) {
>  			rxm->offset += used;
>  			rxm->full_len -= used;
-- 
pw-bot: cr


* Re: [PATCH] net/tls: avoid TCP window full during ->read_sock()
  2023-08-05  0:57 ` Jakub Kicinski
@ 2023-08-07  7:08   ` Sagi Grimberg
  0 siblings, 0 replies; 3+ messages in thread
From: Sagi Grimberg @ 2023-08-07  7:08 UTC (permalink / raw)
  To: Jakub Kicinski, Hannes Reinecke
  Cc: Christoph Hellwig, Keith Busch, linux-nvme, Eric Dumazet,
	Paolo Abeni, netdev


>> When flushing the backlog after decoding each record in ->read_sock()
>> we may end up with really long records, causing a TCP window full as
>> the TCP window would only be increased again after we process the
>> record. So we should rather process the record first to allow the
>> TCP window to be increased again before flushing the backlog.
> 
>> -			released = tls_read_flush_backlog(sk, prot, rxm->full_len, to_decrypt,
>> -							  decrypted, &flushed_at);
>>   			skb = darg.skb;
>> +			/* TLS 1.3 may have updated the length by more than overhead */
> 
>> +			rxm = strp_msg(skb);
>> +			tlm = tls_msg(skb);
>>   			decrypted += rxm->full_len;
>>   
>>   			tls_rx_rec_done(ctx);
>> @@ -2280,6 +2275,12 @@ int tls_sw_read_sock(struct sock *sk, read_descriptor_t *desc,
>>   			goto read_sock_requeue;
>>   		}
>>   		copied += used;
>> +		/*
>> +		 * flush backlog after processing the TLS record, otherwise we might
>> +		 * end up with really large records and triggering a TCP window full.
>> +		 */
>> +		released = tls_read_flush_backlog(sk, prot, decrypted - copied, decrypted,
>> +						  copied, &flushed_at);
> 
> I'm surprised moving the flushing out makes a difference.
> rx_list should generally hold at most 1 skb (16kB) unless something
> is PEEKing the data.
> 
> Looking at it closer I think the problem may be calling args to
> tls_read_flush_backlog(). Since we don't know how much data
> reader wants we can't sensibly evaluate the first condition,
> so how would it work if instead of this patch we did:
> 
> -			released = tls_read_flush_backlog(sk, prot, rxm->full_len, to_decrypt,
> +			released = tls_read_flush_backlog(sk, prot, INT_MAX, 0,
> 							  decrypted, &flushed_at);
> 
> That would give us a flush every 128k of data (or every record if
> inq is shorter than 16kB).

What happens if the window is smaller than 128K? Isn't that what
Hannes is trying to solve for?

Hannes, do you have some absolute numbers to how the window behaves?

