netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net 0/5] tls: fixes for record type handling with PEEK
@ 2024-02-15 16:17 Sabrina Dubroca
  2024-02-15 16:17 ` [PATCH net 1/5] tls: break out of main loop when PEEK gets a non-data record Sabrina Dubroca
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Sabrina Dubroca @ 2024-02-15 16:17 UTC (permalink / raw)
  To: netdev
  Cc: Sabrina Dubroca, Boris Pismenny, John Fastabend, Jakub Kicinski,
	David S. Miller, Eric Dumazet, Paolo Abeni, Shuah Khan,
	Vakul Garg, linux-kselftest

There are multiple bugs in tls_sw_recvmsg's handling of record types
when MSG_PEEK flag is used, which can lead to incorrectly merging two
records:
 - consecutive non-DATA records shouldn't be merged, even if they're
   the same type (partly handled by the test at the end of the main
   loop)
 - records of the same type (even DATA) shouldn't be merged if one
   record of a different type comes in between

Sabrina Dubroca (5):
  tls: break out of main loop when PEEK gets a non-data record
  tls: stop recv() if initial process_rx_list gave us non-DATA
  tls: don't skip over different type records from the rx_list
  selftests: tls: add test for merging of same-type control messages
  selftests: tls: add test for peeking past a record of a different type

 net/tls/tls_sw.c                  | 24 +++++++++++------
 tools/testing/selftests/net/tls.c | 45 +++++++++++++++++++++++++++++++
 2 files changed, 61 insertions(+), 8 deletions(-)

-- 
2.43.0


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH net 1/5] tls: break out of main loop when PEEK gets a non-data record
  2024-02-15 16:17 [PATCH net 0/5] tls: fixes for record type handling with PEEK Sabrina Dubroca
@ 2024-02-15 16:17 ` Sabrina Dubroca
  2024-02-15 16:17 ` [PATCH net 2/5] tls: stop recv() if initial process_rx_list gave us non-DATA Sabrina Dubroca
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Sabrina Dubroca @ 2024-02-15 16:17 UTC (permalink / raw)
  To: netdev
  Cc: Sabrina Dubroca, Boris Pismenny, John Fastabend, Jakub Kicinski,
	David S. Miller, Eric Dumazet, Paolo Abeni, Shuah Khan,
	Vakul Garg, linux-kselftest

PEEK needs to leave decrypted records on the rx_list so that we can
receive them later on, so it jumps back into the async code that
queues the skb. Unfortunately that makes us skip the
TLS_RECORD_TYPE_DATA check at the bottom of the main loop, so if two
records of the same (non-DATA) type are queued, we end up merging
them.

Add the same record type check, and make it unlikely to not penalize
the async fastpath. Async decrypt only applies to data record, so this
check is only needed for PEEK.

process_rx_list also has similar issues.

Fixes: 692d7b5d1f91 ("tls: Fix recvmsg() to be able to peek across multiple records")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
 net/tls/tls_sw.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 9fbc70200cd0..78aedfc682ba 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -2064,6 +2064,8 @@ int tls_sw_recvmsg(struct sock *sk,
 				decrypted += chunk;
 				len -= chunk;
 				__skb_queue_tail(&ctx->rx_list, skb);
+				if (unlikely(control != TLS_RECORD_TYPE_DATA))
+					break;
 				continue;
 			}
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH net 2/5] tls: stop recv() if initial process_rx_list gave us non-DATA
  2024-02-15 16:17 [PATCH net 0/5] tls: fixes for record type handling with PEEK Sabrina Dubroca
  2024-02-15 16:17 ` [PATCH net 1/5] tls: break out of main loop when PEEK gets a non-data record Sabrina Dubroca
@ 2024-02-15 16:17 ` Sabrina Dubroca
  2024-02-15 16:17 ` [PATCH net 3/5] tls: don't skip over different type records from the rx_list Sabrina Dubroca
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Sabrina Dubroca @ 2024-02-15 16:17 UTC (permalink / raw)
  To: netdev
  Cc: Sabrina Dubroca, Boris Pismenny, John Fastabend, Jakub Kicinski,
	David S. Miller, Eric Dumazet, Paolo Abeni, Shuah Khan,
	Vakul Garg, linux-kselftest

If we have a non-DATA record on the rx_list and another record of the
same type still on the queue, we will end up merging them:
 - process_rx_list copies the non-DATA record
 - we start the loop and process the first available record since it's
   of the same type
 - we break out of the loop since the record was not DATA

Just check the record type and jump to the end in case process_rx_list
did some work.

Fixes: 692d7b5d1f91 ("tls: Fix recvmsg() to be able to peek across multiple records")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
 net/tls/tls_sw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 78aedfc682ba..43dd0d82b6ed 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -1971,7 +1971,7 @@ int tls_sw_recvmsg(struct sock *sk,
 		goto end;
 
 	copied = err;
-	if (len <= copied)
+	if (len <= copied || (copied && control != TLS_RECORD_TYPE_DATA))
 		goto end;
 
 	target = sock_rcvlowat(sk, flags & MSG_WAITALL, len);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH net 3/5] tls: don't skip over different type records from the rx_list
  2024-02-15 16:17 [PATCH net 0/5] tls: fixes for record type handling with PEEK Sabrina Dubroca
  2024-02-15 16:17 ` [PATCH net 1/5] tls: break out of main loop when PEEK gets a non-data record Sabrina Dubroca
  2024-02-15 16:17 ` [PATCH net 2/5] tls: stop recv() if initial process_rx_list gave us non-DATA Sabrina Dubroca
@ 2024-02-15 16:17 ` Sabrina Dubroca
  2024-02-19 20:07   ` Jakub Kicinski
  2024-02-15 16:17 ` [PATCH net 4/5] selftests: tls: add test for merging of same-type control messages Sabrina Dubroca
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Sabrina Dubroca @ 2024-02-15 16:17 UTC (permalink / raw)
  To: netdev
  Cc: Sabrina Dubroca, Boris Pismenny, John Fastabend, Jakub Kicinski,
	David S. Miller, Eric Dumazet, Paolo Abeni, Shuah Khan,
	Vakul Garg, linux-kselftest

If we queue 3 records:
 - record 1, type DATA
 - record 2, some other type
 - record 3, type DATA
and do a recv(PEEK), the rx_list will contain the first two records.

The next large recv will walk through the rx_list and copy data from
record 1, then stop because record 2 is a different type. Since we
haven't filled up our buffer, we will process the next available
record. It's also DATA, so we can merge it with the current read.

We shouldn't do that, since there was a record in between that we
ignored.

Add a flag to let process_rx_list inform tls_sw_recvmsg that it had
more data available.

Fixes: 692d7b5d1f91 ("tls: Fix recvmsg() to be able to peek across multiple records")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
 net/tls/tls_sw.c | 22 ++++++++++++++--------
 1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 43dd0d82b6ed..de96959336c4 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -1772,7 +1772,8 @@ static int process_rx_list(struct tls_sw_context_rx *ctx,
 			   u8 *control,
 			   size_t skip,
 			   size_t len,
-			   bool is_peek)
+			   bool is_peek,
+			   bool *more)
 {
 	struct sk_buff *skb = skb_peek(&ctx->rx_list);
 	struct tls_msg *tlm;
@@ -1785,7 +1786,7 @@ static int process_rx_list(struct tls_sw_context_rx *ctx,
 
 		err = tls_record_content_type(msg, tlm, control);
 		if (err <= 0)
-			goto out;
+			goto more;
 
 		if (skip < rxm->full_len)
 			break;
@@ -1803,12 +1804,12 @@ static int process_rx_list(struct tls_sw_context_rx *ctx,
 
 		err = tls_record_content_type(msg, tlm, control);
 		if (err <= 0)
-			goto out;
+			goto more;
 
 		err = skb_copy_datagram_msg(skb, rxm->offset + skip,
 					    msg, chunk);
 		if (err < 0)
-			goto out;
+			goto more;
 
 		len = len - chunk;
 		copied = copied + chunk;
@@ -1844,6 +1845,10 @@ static int process_rx_list(struct tls_sw_context_rx *ctx,
 
 out:
 	return copied ? : err;
+more:
+	if (more)
+		*more = true;
+	goto out;
 }
 
 static bool
@@ -1947,6 +1952,7 @@ int tls_sw_recvmsg(struct sock *sk,
 	int target, err;
 	bool is_kvec = iov_iter_is_kvec(&msg->msg_iter);
 	bool is_peek = flags & MSG_PEEK;
+	bool rx_more = false;
 	bool released = true;
 	bool bpf_strp_enabled;
 	bool zc_capable;
@@ -1966,12 +1972,12 @@ int tls_sw_recvmsg(struct sock *sk,
 		goto end;
 
 	/* Process pending decrypted records. It must be non-zero-copy */
-	err = process_rx_list(ctx, msg, &control, 0, len, is_peek);
+	err = process_rx_list(ctx, msg, &control, 0, len, is_peek, &rx_more);
 	if (err < 0)
 		goto end;
 
 	copied = err;
-	if (len <= copied || (copied && control != TLS_RECORD_TYPE_DATA))
+	if (len <= copied || (copied && control != TLS_RECORD_TYPE_DATA) || rx_more)
 		goto end;
 
 	target = sock_rcvlowat(sk, flags & MSG_WAITALL, len);
@@ -2130,10 +2136,10 @@ int tls_sw_recvmsg(struct sock *sk,
 		/* Drain records from the rx_list & copy if required */
 		if (is_peek || is_kvec)
 			err = process_rx_list(ctx, msg, &control, copied,
-					      decrypted, is_peek);
+					      decrypted, is_peek, NULL);
 		else
 			err = process_rx_list(ctx, msg, &control, 0,
-					      async_copy_bytes, is_peek);
+					      async_copy_bytes, is_peek, NULL);
 	}
 
 	copied += decrypted;
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH net 4/5] selftests: tls: add test for merging of same-type control messages
  2024-02-15 16:17 [PATCH net 0/5] tls: fixes for record type handling with PEEK Sabrina Dubroca
                   ` (2 preceding siblings ...)
  2024-02-15 16:17 ` [PATCH net 3/5] tls: don't skip over different type records from the rx_list Sabrina Dubroca
@ 2024-02-15 16:17 ` Sabrina Dubroca
  2024-02-15 16:17 ` [PATCH net 5/5] selftests: tls: add test for peeking past a record of a different type Sabrina Dubroca
  2024-02-21 22:30 ` [PATCH net 0/5] tls: fixes for record type handling with PEEK patchwork-bot+netdevbpf
  5 siblings, 0 replies; 13+ messages in thread
From: Sabrina Dubroca @ 2024-02-15 16:17 UTC (permalink / raw)
  To: netdev
  Cc: Sabrina Dubroca, Boris Pismenny, John Fastabend, Jakub Kicinski,
	David S. Miller, Eric Dumazet, Paolo Abeni, Shuah Khan,
	Vakul Garg, linux-kselftest

Two consecutive control messages of the same type should never be
merged into one large received blob of data.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
 tools/testing/selftests/net/tls.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/tools/testing/selftests/net/tls.c b/tools/testing/selftests/net/tls.c
index 49c84602707f..2714c230a0f9 100644
--- a/tools/testing/selftests/net/tls.c
+++ b/tools/testing/selftests/net/tls.c
@@ -1485,6 +1485,32 @@ TEST_F(tls, control_msg)
 	EXPECT_EQ(memcmp(buf, test_str, send_len), 0);
 }
 
+TEST_F(tls, control_msg_nomerge)
+{
+	char *rec1 = "1111";
+	char *rec2 = "2222";
+	int send_len = 5;
+	char buf[15];
+
+	if (self->notls)
+		SKIP(return, "no TLS support");
+
+	EXPECT_EQ(tls_send_cmsg(self->fd, 100, rec1, send_len, 0), send_len);
+	EXPECT_EQ(tls_send_cmsg(self->fd, 100, rec2, send_len, 0), send_len);
+
+	EXPECT_EQ(tls_recv_cmsg(_metadata, self->cfd, 100, buf, sizeof(buf), MSG_PEEK), send_len);
+	EXPECT_EQ(memcmp(buf, rec1, send_len), 0);
+
+	EXPECT_EQ(tls_recv_cmsg(_metadata, self->cfd, 100, buf, sizeof(buf), MSG_PEEK), send_len);
+	EXPECT_EQ(memcmp(buf, rec1, send_len), 0);
+
+	EXPECT_EQ(tls_recv_cmsg(_metadata, self->cfd, 100, buf, sizeof(buf), 0), send_len);
+	EXPECT_EQ(memcmp(buf, rec1, send_len), 0);
+
+	EXPECT_EQ(tls_recv_cmsg(_metadata, self->cfd, 100, buf, sizeof(buf), 0), send_len);
+	EXPECT_EQ(memcmp(buf, rec2, send_len), 0);
+}
+
 TEST_F(tls, shutdown)
 {
 	char const *test_str = "test_read";
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH net 5/5] selftests: tls: add test for peeking past a record of a different type
  2024-02-15 16:17 [PATCH net 0/5] tls: fixes for record type handling with PEEK Sabrina Dubroca
                   ` (3 preceding siblings ...)
  2024-02-15 16:17 ` [PATCH net 4/5] selftests: tls: add test for merging of same-type control messages Sabrina Dubroca
@ 2024-02-15 16:17 ` Sabrina Dubroca
  2024-02-21 22:30 ` [PATCH net 0/5] tls: fixes for record type handling with PEEK patchwork-bot+netdevbpf
  5 siblings, 0 replies; 13+ messages in thread
From: Sabrina Dubroca @ 2024-02-15 16:17 UTC (permalink / raw)
  To: netdev
  Cc: Sabrina Dubroca, Boris Pismenny, John Fastabend, Jakub Kicinski,
	David S. Miller, Eric Dumazet, Paolo Abeni, Shuah Khan,
	Vakul Garg, linux-kselftest

If we queue 3 records:
 - record 1, type DATA
 - record 2, some other type
 - record 3, type DATA
the current code can look past the 2nd record and merge the 2 data
records.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
 tools/testing/selftests/net/tls.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/tools/testing/selftests/net/tls.c b/tools/testing/selftests/net/tls.c
index 2714c230a0f9..b95c249f81c2 100644
--- a/tools/testing/selftests/net/tls.c
+++ b/tools/testing/selftests/net/tls.c
@@ -1511,6 +1511,25 @@ TEST_F(tls, control_msg_nomerge)
 	EXPECT_EQ(memcmp(buf, rec2, send_len), 0);
 }
 
+TEST_F(tls, data_control_data)
+{
+	char *rec1 = "1111";
+	char *rec2 = "2222";
+	char *rec3 = "3333";
+	int send_len = 5;
+	char buf[15];
+
+	if (self->notls)
+		SKIP(return, "no TLS support");
+
+	EXPECT_EQ(send(self->fd, rec1, send_len, 0), send_len);
+	EXPECT_EQ(tls_send_cmsg(self->fd, 100, rec2, send_len, 0), send_len);
+	EXPECT_EQ(send(self->fd, rec3, send_len, 0), send_len);
+
+	EXPECT_EQ(recv(self->cfd, buf, sizeof(buf), MSG_PEEK), send_len);
+	EXPECT_EQ(recv(self->cfd, buf, sizeof(buf), MSG_PEEK), send_len);
+}
+
 TEST_F(tls, shutdown)
 {
 	char const *test_str = "test_read";
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH net 3/5] tls: don't skip over different type records from the rx_list
  2024-02-15 16:17 ` [PATCH net 3/5] tls: don't skip over different type records from the rx_list Sabrina Dubroca
@ 2024-02-19 20:07   ` Jakub Kicinski
  2024-02-19 23:10     ` Sabrina Dubroca
  0 siblings, 1 reply; 13+ messages in thread
From: Jakub Kicinski @ 2024-02-19 20:07 UTC (permalink / raw)
  To: Sabrina Dubroca
  Cc: netdev, Boris Pismenny, John Fastabend, David S. Miller,
	Eric Dumazet, Paolo Abeni, Shuah Khan, Vakul Garg,
	linux-kselftest

On Thu, 15 Feb 2024 17:17:31 +0100 Sabrina Dubroca wrote:
> @@ -1772,7 +1772,8 @@ static int process_rx_list(struct tls_sw_context_rx *ctx,
>  			   u8 *control,
>  			   size_t skip,
>  			   size_t len,
> -			   bool is_peek)
> +			   bool is_peek,
> +			   bool *more)
>  {
>  	struct sk_buff *skb = skb_peek(&ctx->rx_list);
>  	struct tls_msg *tlm;


> @@ -1844,6 +1845,10 @@ static int process_rx_list(struct tls_sw_context_rx *ctx,
>  
>  out:
>  	return copied ? : err;
> +more:
> +	if (more)
> +		*more = true;
> +	goto out;

Patches look correct, one small nit here -

I don't have great ideas how to avoid the 7th argument completely but 
I think it'd be a little cleaner if we either:
 - passed in err as an output argument (some datagram code does that
   IIRC), then function can always return copied directly, or 
 - passed copied as an output argument, and then we can always return
   err?
I like the former a little better because we won't have to special case
NULL for the "after async decryption" call sites.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net 3/5] tls: don't skip over different type records from the rx_list
  2024-02-19 20:07   ` Jakub Kicinski
@ 2024-02-19 23:10     ` Sabrina Dubroca
  2024-02-21  1:50       ` Jakub Kicinski
  0 siblings, 1 reply; 13+ messages in thread
From: Sabrina Dubroca @ 2024-02-19 23:10 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, Boris Pismenny, John Fastabend, David S. Miller,
	Eric Dumazet, Paolo Abeni, Shuah Khan, Vakul Garg,
	linux-kselftest

2024-02-19, 12:07:03 -0800, Jakub Kicinski wrote:
> On Thu, 15 Feb 2024 17:17:31 +0100 Sabrina Dubroca wrote:
> > @@ -1772,7 +1772,8 @@ static int process_rx_list(struct tls_sw_context_rx *ctx,
> >  			   u8 *control,
> >  			   size_t skip,
> >  			   size_t len,
> > -			   bool is_peek)
> > +			   bool is_peek,
> > +			   bool *more)
> >  {
> >  	struct sk_buff *skb = skb_peek(&ctx->rx_list);
> >  	struct tls_msg *tlm;
> 
> 
> > @@ -1844,6 +1845,10 @@ static int process_rx_list(struct tls_sw_context_rx *ctx,
> >  
> >  out:
> >  	return copied ? : err;
> > +more:
> > +	if (more)
> > +		*more = true;
> > +	goto out;
> 
> Patches look correct, one small nit here -
> 
> I don't have great ideas how to avoid the 7th argument completely but 

I hesitated between this patch and a variant combining is_peek and
more into a single u8 *flags, but that felt a bit messy (or does that
fall into what you describe as "not [having] great ideas"? :))

@@ -1772,9 +1777,10 @@ static int process_rx_list(struct tls_sw_context_rx *ctx,
 			   u8 *control,
 			   size_t skip,
 			   size_t len,
-			   bool is_peek)
+			   u8 *flags)
 {
 	struct sk_buff *skb = skb_peek(&ctx->rx_list);
+	bool is_peek = *flags & RXLIST_PEEK;
 	struct tls_msg *tlm;
 	ssize_t copied = 0;
 	int err;
[...]
@@ -1844,6 +1850,9 @@ static int process_rx_list(struct tls_sw_context_rx *ctx,
 
 out:
 	return copied ? : err;
+more:
+	*flags |= RXLIST_MORE;
+	goto out;
 }


and then in tls_sw_recvmsg:
u8 rxlist_flags = is_peek ? RXLIST_PEEK : 0;
err = process_rx_list(ctx, msg, &control, 0, len, &rxlist_flags);


> I think it'd be a little cleaner if we either:
>  - passed in err as an output argument (some datagram code does that
>    IIRC), then function can always return copied directly, or 

(yes, __skb_wait_for_more_packets, __skb_try_recv_datagram, and their
variants)

>  - passed copied as an output argument, and then we can always return
>    err?

Aren't those 2 options adding an 8th argument?

I tend to find ">= 0 on success, otherwise errno" more readable,
probably because that's a very common pattern (either for recvmsg
style of cases, or all the ERR_PTR type situations).

> I like the former a little better because we won't have to special case
> NULL for the "after async decryption" call sites.

We could also pass &rx_more every time and not check for NULL.

What do you want to clean up more specifically? The number of
arguments, the backwards goto, the NULL check before setting *more,
something else/all of the above?

-- 
Sabrina


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net 3/5] tls: don't skip over different type records from the rx_list
  2024-02-19 23:10     ` Sabrina Dubroca
@ 2024-02-21  1:50       ` Jakub Kicinski
  2024-02-21 13:59         ` Sabrina Dubroca
  0 siblings, 1 reply; 13+ messages in thread
From: Jakub Kicinski @ 2024-02-21  1:50 UTC (permalink / raw)
  To: Sabrina Dubroca
  Cc: netdev, Boris Pismenny, John Fastabend, David S. Miller,
	Eric Dumazet, Paolo Abeni, Shuah Khan, Vakul Garg,
	linux-kselftest

On Tue, 20 Feb 2024 00:10:58 +0100 Sabrina Dubroca wrote:
> 2024-02-19, 12:07:03 -0800, Jakub Kicinski wrote:
> > On Thu, 15 Feb 2024 17:17:31 +0100 Sabrina Dubroca wrote:  
> > > @@ -1772,7 +1772,8 @@ static int process_rx_list(struct tls_sw_context_rx *ctx,
> > >  			   u8 *control,
> > >  			   size_t skip,
> > >  			   size_t len,
> > > -			   bool is_peek)
> > > +			   bool is_peek,
> > > +			   bool *more)
> > >  {
> > >  	struct sk_buff *skb = skb_peek(&ctx->rx_list);
> > >  	struct tls_msg *tlm;  
> > 
> > > @@ -1844,6 +1845,10 @@ static int process_rx_list(struct tls_sw_context_rx *ctx,
> > >  
> > >  out:
> > >  	return copied ? : err;
> > > +more:
> > > +	if (more)
> > > +		*more = true;
> > > +	goto out;  
> > 
> > Patches look correct, one small nit here -
> > 
> > I don't have great ideas how to avoid the 7th argument completely but   
> 
> I hesitated between this patch and a variant combining is_peek and
> more into a single u8 *flags, but that felt a bit messy (or does that
> fall into what you describe as "not [having] great ideas"? :))

I guess it saves a register, it seems a bit better but then it's a
truly in/out argument :)

> > I think it'd be a little cleaner if we either:
> >  - passed in err as an output argument (some datagram code does that
> >    IIRC), then function can always return copied directly, or   
> 
> (yes, __skb_wait_for_more_packets, __skb_try_recv_datagram, and their
> variants)
> 
> >  - passed copied as an output argument, and then we can always return
> >    err?  
> 
> Aren't those 2 options adding an 8th argument?

No, no, still 7, if we separate copied from err - checking err < 0
is enough to know that we need to exit.

Differently put, perhaps, my preference is to pass an existing entity
(err or copied), rather that conjure new concept (more) on one end and
interpret it on the other.

> I tend to find ">= 0 on success, otherwise errno" more readable,
> probably because that's a very common pattern (either for recvmsg
> style of cases, or all the ERR_PTR type situations).

Right it definitely is a good pattern. I think passing copied via
argument would give us those semantics still?

> > I like the former a little better because we won't have to special case
> > NULL for the "after async decryption" call sites.  
> 
> We could also pass &rx_more every time and not check for NULL.
> 
> What do you want to clean up more specifically? The number of
> arguments, the backwards goto, the NULL check before setting *more,
> something else/all of the above?

Not compiled, but what I had in mind was something along the lines of:

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 9fbc70200cd0..6e6e6d89b173 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -1772,7 +1772,8 @@ static int process_rx_list(struct tls_sw_context_rx *ctx,
 			   u8 *control,
 			   size_t skip,
 			   size_t len,
-			   bool is_peek)
+			   bool is_peek,
+			   int *out_copied)
 {
 	struct sk_buff *skb = skb_peek(&ctx->rx_list);
 	struct tls_msg *tlm;
@@ -1843,7 +1844,8 @@ static int process_rx_list(struct tls_sw_context_rx *ctx,
 	err = 0;
 
 out:
-	return copied ? : err;
+	*out_copied = copied;
+	return err;
 }
 
 static bool
@@ -1966,11 +1968,10 @@ int tls_sw_recvmsg(struct sock *sk,
 		goto end;
 
 	/* Process pending decrypted records. It must be non-zero-copy */
-	err = process_rx_list(ctx, msg, &control, 0, len, is_peek);
+	err = process_rx_list(ctx, msg, &control, 0, len, is_peek, &copied);
 	if (err < 0)
 		goto end;
 
-	copied = err;
 	if (len <= copied)
 		goto end;
 
@@ -2128,10 +2129,10 @@ int tls_sw_recvmsg(struct sock *sk,
 		/* Drain records from the rx_list & copy if required */
 		if (is_peek || is_kvec)
 			err = process_rx_list(ctx, msg, &control, copied,
-					      decrypted, is_peek);
+					      decrypted, is_peek, &ret);
 		else
 			err = process_rx_list(ctx, msg, &control, 0,
-					      async_copy_bytes, is_peek);
+					      async_copy_bytes, is_peek, &ret);
 	}
 
 	copied += decrypted;

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH net 3/5] tls: don't skip over different type records from the rx_list
  2024-02-21  1:50       ` Jakub Kicinski
@ 2024-02-21 13:59         ` Sabrina Dubroca
  2024-02-21 18:33           ` Jakub Kicinski
  0 siblings, 1 reply; 13+ messages in thread
From: Sabrina Dubroca @ 2024-02-21 13:59 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, Boris Pismenny, John Fastabend, David S. Miller,
	Eric Dumazet, Paolo Abeni, Shuah Khan, Vakul Garg,
	linux-kselftest

2024-02-20, 17:50:53 -0800, Jakub Kicinski wrote:
> On Tue, 20 Feb 2024 00:10:58 +0100 Sabrina Dubroca wrote:
> > 2024-02-19, 12:07:03 -0800, Jakub Kicinski wrote:
> > > On Thu, 15 Feb 2024 17:17:31 +0100 Sabrina Dubroca wrote:  
> > > > @@ -1772,7 +1772,8 @@ static int process_rx_list(struct tls_sw_context_rx *ctx,
> > > >  			   u8 *control,
> > > >  			   size_t skip,
> > > >  			   size_t len,
> > > > -			   bool is_peek)
> > > > +			   bool is_peek,
> > > > +			   bool *more)
> > > >  {
> > > >  	struct sk_buff *skb = skb_peek(&ctx->rx_list);
> > > >  	struct tls_msg *tlm;  
> > > 
> > > > @@ -1844,6 +1845,10 @@ static int process_rx_list(struct tls_sw_context_rx *ctx,
> > > >  
> > > >  out:
> > > >  	return copied ? : err;
> > > > +more:
> > > > +	if (more)
> > > > +		*more = true;
> > > > +	goto out;  
> > > 
> > > Patches look correct, one small nit here -
> > > 
> > > I don't have great ideas how to avoid the 7th argument completely but   
> > 
> > I hesitated between this patch and a variant combining is_peek and
> > more into a single u8 *flags, but that felt a bit messy (or does that
> > fall into what you describe as "not [having] great ideas"? :))
> 
> I guess it saves a register, it seems a bit better but then it's a
> truly in/out argument :)

We already do that with darg all over the receive code, so it
shouldn't be too confusing to readers. It can be named flags_inout if
you think that would help, or have a comment like above tls_decrypt_sg.

> > > I think it'd be a little cleaner if we either:
> > >  - passed in err as an output argument (some datagram code does that
> > >    IIRC), then function can always return copied directly, or   
> > 
> > (yes, __skb_wait_for_more_packets, __skb_try_recv_datagram, and their
> > variants)
> > 
> > >  - passed copied as an output argument, and then we can always return
> > >    err?  
> > 
> > Aren't those 2 options adding an 8th argument?
> 
> No, no, still 7, if we separate copied from err - checking err < 0
> is enough to know that we need to exit.

Right, I realized that you probably meant something like that as I was
going to bed last night.

It's not exactly enough, since tls_record_content_type will return 0
on a content type mismatch. We'll have to translate that into an
"error". I think it would be a bit nicer to set err=1 and then check
err != 0 in tls_sw_recvmsg (we can document that in a comment above
process_rx_list) rather than making up a fake errno. See diff [1].

Or we could swap the 0/1 returns from tls_record_content_type and
switch the err <= 0 tests to err != 0 after the existing calls, then
process_rx_list doesn't have a weird special case [2].

What do you think?


> Differently put, perhaps, my preference is to pass an existing entity
> (err or copied), rather that conjure new concept (more) on one end and
> interpret it on the other.
> 
> > I tend to find ">= 0 on success, otherwise errno" more readable,
> > probably because that's a very common pattern (either for recvmsg
> > style of cases, or all the ERR_PTR type situations).
> 
> Right it definitely is a good pattern. I think passing copied via
> argument would give us those semantics still?

For recvmsg sure, but not for process_rx_list.

> > > I like the former a little better because we won't have to special case
> > > NULL for the "after async decryption" call sites.  
> > 
> > We could also pass &rx_more every time and not check for NULL.
> > 
> > What do you want to clean up more specifically? The number of
> > arguments, the backwards goto, the NULL check before setting *more,
> > something else/all of the above?
> 
> Not compiled, but what I had in mind was something along the lines of:

copied is a ssize_t (but ret isn't), so the change gets a bit uglier :(


------------ 8< ------------

[1] fix by setting err=1 in process_rx_list

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 43dd0d82b6ed..711504614da7 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -1766,13 +1766,19 @@ static void tls_rx_rec_done(struct tls_sw_context_rx *ctx)
  * decrypted records into the buffer provided by caller zero copy is not
  * true. Further, the records are removed from the rx_list if it is not a peek
  * case and the record has been consumed completely.
+ *
+ * Return:
+ *  - 0 if len bytes were copied
+ *  - 1 if < len bytes were copied due to a record type mismatch
+ *  - <0 if an error occurred
  */
 static int process_rx_list(struct tls_sw_context_rx *ctx,
 			   struct msghdr *msg,
 			   u8 *control,
 			   size_t skip,
 			   size_t len,
-			   bool is_peek)
+			   bool is_peek,
+			   ssize_t *out_copied)
 {
 	struct sk_buff *skb = skb_peek(&ctx->rx_list);
 	struct tls_msg *tlm;
@@ -1802,8 +1808,11 @@ static int process_rx_list(struct tls_sw_context_rx *ctx,
 		tlm = tls_msg(skb);
 
 		err = tls_record_content_type(msg, tlm, control);
-		if (err <= 0)
+		if (err <= 0) {
+			if (err == 0)
+				err = 1;
 			goto out;
+		}
 
 		err = skb_copy_datagram_msg(skb, rxm->offset + skip,
 					    msg, chunk);
@@ -1843,7 +1852,8 @@ static int process_rx_list(struct tls_sw_context_rx *ctx,
 	err = 0;
 
 out:
-	return copied ? : err;
+	*out_copied = copied;
+	return err;
 }
 
 static bool
@@ -1966,11 +1976,10 @@ int tls_sw_recvmsg(struct sock *sk,
 		goto end;
 
 	/* Process pending decrypted records. It must be non-zero-copy */
-	err = process_rx_list(ctx, msg, &control, 0, len, is_peek);
-	if (err < 0)
+	err = process_rx_list(ctx, msg, &control, 0, len, is_peek, &copied);
+	if (err != 0)
 		goto end;
 
-	copied = err;
 	if (len <= copied || (copied && control != TLS_RECORD_TYPE_DATA))
 		goto end;
 
@@ -2114,6 +2123,7 @@ int tls_sw_recvmsg(struct sock *sk,
 
 recv_end:
 	if (async) {
+		ssize_t ret2;
 		int ret;
 
 		/* Wait for all previously submitted records to be decrypted */
@@ -2130,10 +2140,10 @@ int tls_sw_recvmsg(struct sock *sk,
 		/* Drain records from the rx_list & copy if required */
 		if (is_peek || is_kvec)
 			err = process_rx_list(ctx, msg, &control, copied,
-					      decrypted, is_peek);
+					      decrypted, is_peek, &ret2);
 		else
 			err = process_rx_list(ctx, msg, &control, 0,
-					      async_copy_bytes, is_peek);
+					      async_copy_bytes, is_peek, &ret2);
 	}
 
 	copied += decrypted;


------------ 8< ------------

[2] fixing the bug by changing tls_record_content_type as well

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 43dd0d82b6ed..3da62ba97945 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -1734,6 +1734,11 @@ int decrypt_skb(struct sock *sk, struct scatterlist *sgout)
 	return tls_decrypt_sg(sk, NULL, sgout, &darg);
 }
 
+/* Return:
+ *  - 0 on success
+ *  - 1 if the record's type doesn't match the value in control
+ *  - <0 if an error occurred
+ */
 static int tls_record_content_type(struct msghdr *msg, struct tls_msg *tlm,
 				   u8 *control)
 {
@@ -1751,10 +1756,10 @@ static int tls_record_content_type(struct msghdr *msg, struct tls_msg *tlm,
 				return -EIO;
 		}
 	} else if (*control != tlm->control) {
-		return 0;
+		return 1;
 	}
 
-	return 1;
+	return 0;
 }
 
 static void tls_rx_rec_done(struct tls_sw_context_rx *ctx)
@@ -1766,13 +1771,19 @@ static void tls_rx_rec_done(struct tls_sw_context_rx *ctx)
  * decrypted records into the buffer provided by caller zero copy is not
  * true. Further, the records are removed from the rx_list if it is not a peek
  * case and the record has been consumed completely.
+ *
+ * Return:
+ *  - 0 if len bytes were copied
+ *  - 1 if < len bytes were copied due to a record type mismatch
+ *  - <0 if an error occurred
  */
 static int process_rx_list(struct tls_sw_context_rx *ctx,
 			   struct msghdr *msg,
 			   u8 *control,
 			   size_t skip,
 			   size_t len,
-			   bool is_peek)
+			   bool is_peek,
+			   ssize_t *out_copied)
 {
 	struct sk_buff *skb = skb_peek(&ctx->rx_list);
 	struct tls_msg *tlm;
@@ -1784,7 +1795,7 @@ static int process_rx_list(struct tls_sw_context_rx *ctx,
 		tlm = tls_msg(skb);
 
 		err = tls_record_content_type(msg, tlm, control);
-		if (err <= 0)
+		if (err != 0)
 			goto out;
 
 		if (skip < rxm->full_len)
@@ -1802,7 +1813,7 @@ static int process_rx_list(struct tls_sw_context_rx *ctx,
 		tlm = tls_msg(skb);
 
 		err = tls_record_content_type(msg, tlm, control);
-		if (err <= 0)
+		if (err != 0)
 			goto out;
 
 		err = skb_copy_datagram_msg(skb, rxm->offset + skip,
@@ -1843,7 +1854,8 @@ static int process_rx_list(struct tls_sw_context_rx *ctx,
 	err = 0;
 
 out:
-	return copied ? : err;
+	*out_copied = copied;
+	return err;
 }
 
 static bool
@@ -1966,11 +1978,10 @@ int tls_sw_recvmsg(struct sock *sk,
 		goto end;
 
 	/* Process pending decrypted records. It must be non-zero-copy */
-	err = process_rx_list(ctx, msg, &control, 0, len, is_peek);
-	if (err < 0)
+	err = process_rx_list(ctx, msg, &control, 0, len, is_peek, &copied);
+	if (err != 0)
 		goto end;
 
-	copied = err;
 	if (len <= copied || (copied && control != TLS_RECORD_TYPE_DATA))
 		goto end;
 
@@ -2032,7 +2043,7 @@ int tls_sw_recvmsg(struct sock *sk,
 		 * For tls1.3, we disable async.
 		 */
 		err = tls_record_content_type(msg, tls_msg(darg.skb), &control);
-		if (err <= 0) {
+		if (err != 0) {
 			DEBUG_NET_WARN_ON_ONCE(darg.zc);
 			tls_rx_rec_done(ctx);
 put_on_rx_list_err:
@@ -2114,6 +2125,7 @@ int tls_sw_recvmsg(struct sock *sk,
 
 recv_end:
 	if (async) {
+		ssize_t ret2;
 		int ret;
 
 		/* Wait for all previously submitted records to be decrypted */
@@ -2130,10 +2142,10 @@ int tls_sw_recvmsg(struct sock *sk,
 		/* Drain records from the rx_list & copy if required */
 		if (is_peek || is_kvec)
 			err = process_rx_list(ctx, msg, &control, copied,
-					      decrypted, is_peek);
+					      decrypted, is_peek, &ret2);
 		else
 			err = process_rx_list(ctx, msg, &control, 0,
-					      async_copy_bytes, is_peek);
+					      async_copy_bytes, is_peek, &ret2);
 	}
 
 	copied += decrypted;

-- 
Sabrina


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH net 3/5] tls: don't skip over different type records from the rx_list
  2024-02-21 13:59         ` Sabrina Dubroca
@ 2024-02-21 18:33           ` Jakub Kicinski
  2024-02-21 18:42             ` Sabrina Dubroca
  0 siblings, 1 reply; 13+ messages in thread
From: Jakub Kicinski @ 2024-02-21 18:33 UTC (permalink / raw)
  To: Sabrina Dubroca
  Cc: netdev, Boris Pismenny, John Fastabend, David S. Miller,
	Eric Dumazet, Paolo Abeni, Shuah Khan, Vakul Garg,
	linux-kselftest

On Wed, 21 Feb 2024 14:59:40 +0100 Sabrina Dubroca wrote:
> It's not exactly enough, since tls_record_content_type will return 0
> on a content type mismatch. We'll have to translate that into an
> "error". 

Ugh, that's unpleasant.

> I think it would be a bit nicer to set err=1 and then check
> err != 0 in tls_sw_recvmsg (we can document that in a comment above
> process_rx_list) rather than making up a fake errno. See diff [1].
> 
> Or we could swap the 0/1 returns from tls_record_content_type and
> switch the err <= 0 tests to err != 0 after the existing calls, then
> process_rx_list doesn't have a weird special case [2].
> 
> What do you think?

I missed the error = 1 case, sorry. No strong preference, then.
Checking for error = 1 will be as special as the new rx_more
flag. Should I apply this version as is, then?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net 3/5] tls: don't skip over different type records from the rx_list
  2024-02-21 18:33           ` Jakub Kicinski
@ 2024-02-21 18:42             ` Sabrina Dubroca
  0 siblings, 0 replies; 13+ messages in thread
From: Sabrina Dubroca @ 2024-02-21 18:42 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, Boris Pismenny, John Fastabend, David S. Miller,
	Eric Dumazet, Paolo Abeni, Shuah Khan, Vakul Garg,
	linux-kselftest

2024-02-21, 10:33:30 -0800, Jakub Kicinski wrote:
> On Wed, 21 Feb 2024 14:59:40 +0100 Sabrina Dubroca wrote:
> > It's not exactly enough, since tls_record_content_type will return 0
> > on a content type mismatch. We'll have to translate that into an
> > "error". 
> 
> Ugh, that's unpleasant.
> 
> > I think it would be a bit nicer to set err=1 and then check
> > err != 0 in tls_sw_recvmsg (we can document that in a comment above
> > process_rx_list) rather than making up a fake errno. See diff [1].
> > 
> > Or we could swap the 0/1 returns from tls_record_content_type and
> > switch the err <= 0 tests to err != 0 after the existing calls, then
> > process_rx_list doesn't have a weird special case [2].
> > 
> > What do you think?
> 
> I missed the error = 1 case, sorry. No strong preference, then.
> Checking for error = 1 will be as special as the new rx_more
> flag. Should I apply this version as is, then?

If you're ok with that version, sure. Thanks.

-- 
Sabrina


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net 0/5] tls: fixes for record type handling with PEEK
  2024-02-15 16:17 [PATCH net 0/5] tls: fixes for record type handling with PEEK Sabrina Dubroca
                   ` (4 preceding siblings ...)
  2024-02-15 16:17 ` [PATCH net 5/5] selftests: tls: add test for peeking past a record of a different type Sabrina Dubroca
@ 2024-02-21 22:30 ` patchwork-bot+netdevbpf
  5 siblings, 0 replies; 13+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-02-21 22:30 UTC (permalink / raw)
  To: Sabrina Dubroca
  Cc: netdev, borisp, john.fastabend, kuba, davem, edumazet, pabeni,
	shuah, vakul.garg, linux-kselftest

Hello:

This series was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Thu, 15 Feb 2024 17:17:28 +0100 you wrote:
> There are multiple bugs in tls_sw_recvmsg's handling of record types
> when MSG_PEEK flag is used, which can lead to incorrectly merging two
> records:
>  - consecutive non-DATA records shouldn't be merged, even if they're
>    the same type (partly handled by the test at the end of the main
>    loop)
>  - records of the same type (even DATA) shouldn't be merged if one
>    record of a different type comes in between
> 
> [...]

Here is the summary with links:
  - [net,1/5] tls: break out of main loop when PEEK gets a non-data record
    https://git.kernel.org/netdev/net/c/10f41d0710fc
  - [net,2/5] tls: stop recv() if initial process_rx_list gave us non-DATA
    https://git.kernel.org/netdev/net/c/fdfbaec5923d
  - [net,3/5] tls: don't skip over different type records from the rx_list
    https://git.kernel.org/netdev/net/c/ec823bf3a479
  - [net,4/5] selftests: tls: add test for merging of same-type control messages
    https://git.kernel.org/netdev/net/c/7b2a4c2a623a
  - [net,5/5] selftests: tls: add test for peeking past a record of a different type
    https://git.kernel.org/netdev/net/c/2bf6172632e1

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2024-02-21 22:30 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-02-15 16:17 [PATCH net 0/5] tls: fixes for record type handling with PEEK Sabrina Dubroca
2024-02-15 16:17 ` [PATCH net 1/5] tls: break out of main loop when PEEK gets a non-data record Sabrina Dubroca
2024-02-15 16:17 ` [PATCH net 2/5] tls: stop recv() if initial process_rx_list gave us non-DATA Sabrina Dubroca
2024-02-15 16:17 ` [PATCH net 3/5] tls: don't skip over different type records from the rx_list Sabrina Dubroca
2024-02-19 20:07   ` Jakub Kicinski
2024-02-19 23:10     ` Sabrina Dubroca
2024-02-21  1:50       ` Jakub Kicinski
2024-02-21 13:59         ` Sabrina Dubroca
2024-02-21 18:33           ` Jakub Kicinski
2024-02-21 18:42             ` Sabrina Dubroca
2024-02-15 16:17 ` [PATCH net 4/5] selftests: tls: add test for merging of same-type control messages Sabrina Dubroca
2024-02-15 16:17 ` [PATCH net 5/5] selftests: tls: add test for peeking past a record of a different type Sabrina Dubroca
2024-02-21 22:30 ` [PATCH net 0/5] tls: fixes for record type handling with PEEK patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).