linux-doc.vger.kernel.org archive mirror
* [PATCH v3] net/tls: support maximum record size limit
@ 2025-09-03  1:47 Wilfred Mallawa
  2025-09-03 10:14 ` Sabrina Dubroca
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Wilfred Mallawa @ 2025-09-03  1:47 UTC (permalink / raw)
  To: davem, edumazet, kuba, pabeni
  Cc: horms, corbet, john.fastabend, netdev, linux-doc, linux-kernel,
	alistair.francis, dlemoal, sd, Wilfred Mallawa

From: Wilfred Mallawa <wilfred.mallawa@wdc.com>

During a handshake, an endpoint may specify a maximum record size limit.
Currently, the kernel defaults to TLS_MAX_PAYLOAD_SIZE (16KB) for the
maximum record size, so outgoing records from the kernel can exceed a
lower limit negotiated during the handshake. In that case, the receiving
TLS endpoint must send a fatal "record_overflow" alert [1], and the
record is discarded.

Upcoming Western Digital NVMe-TCP hardware controllers implement TLS
support. For these devices, supporting TLS record size negotiation is
necessary because the maximum TLS record size supported by the controller
is less than the default 16KB currently used by the kernel.

This patch adds support for retrieving the negotiated record size limit
during a handshake, and enforcing it at the TLS layer such that outgoing
records are no larger than the size negotiated. This patch depends on
the respective userspace support in tlshd and GnuTLS [2].

[1] https://www.rfc-editor.org/rfc/rfc8449
[2] https://gitlab.com/gnutls/gnutls/-/merge_requests/2005

Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
---
V2 -> V3:
 - Added crucial missing change to tls_sw_sendmsg_locked() that actually
   enforces the record size limit.
 - Added record size enforcement in tls_device.c
 - Renamed `record_size_limit` -> `tx_record_size_limit` so it is easier to
   see that it's TX only.
 - Added do_tls_getsockopt() support for TLS_TX_RECORD_SIZE_LIM
 - tx_record_size_limit is set to TLS_MAX_PAYLOAD_SIZE in tls_init() and
   updated when record size is specified by userspace.
---
 Documentation/networking/tls.rst |  7 ++++
 include/net/tls.h                |  1 +
 include/uapi/linux/tls.h         |  2 +
 net/tls/tls_device.c             |  2 +-
 net/tls/tls_main.c               | 65 +++++++++++++++++++++++++++++++-
 net/tls/tls_sw.c                 |  2 +-
 6 files changed, 75 insertions(+), 4 deletions(-)

diff --git a/Documentation/networking/tls.rst b/Documentation/networking/tls.rst
index 36cc7afc2527..0232df902320 100644
--- a/Documentation/networking/tls.rst
+++ b/Documentation/networking/tls.rst
@@ -280,6 +280,13 @@ If the record decrypted turns out to had been padded or is not a data
 record it will be decrypted again into a kernel buffer without zero copy.
 Such events are counted in the ``TlsDecryptRetry`` statistic.
 
+TLS_TX_RECORD_SIZE_LIM
+~~~~~~~~~~~~~~~~~~~~~~
+
+During a TLS handshake, an endpoint may use the record size limit extension
+to specify a maximum record size. This allows enforcing the specified record
+size limit, such that outgoing records do not exceed the limit specified.
+
 Statistics
 ==========
 
diff --git a/include/net/tls.h b/include/net/tls.h
index 857340338b69..6db532d310d5 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -226,6 +226,7 @@ struct tls_context {
 	u8 rx_conf:3;
 	u8 zerocopy_sendfile:1;
 	u8 rx_no_pad:1;
+	u16 tx_record_size_limit;
 
 	int (*push_pending_record)(struct sock *sk, int flags);
 	void (*sk_write_space)(struct sock *sk);
diff --git a/include/uapi/linux/tls.h b/include/uapi/linux/tls.h
index b66a800389cc..3add266d5916 100644
--- a/include/uapi/linux/tls.h
+++ b/include/uapi/linux/tls.h
@@ -41,6 +41,7 @@
 #define TLS_RX			2	/* Set receive parameters */
 #define TLS_TX_ZEROCOPY_RO	3	/* TX zerocopy (only sendfile now) */
 #define TLS_RX_EXPECT_NO_PAD	4	/* Attempt opportunistic zero-copy */
+#define TLS_TX_RECORD_SIZE_LIM	5	/* Maximum record size */
 
 /* Supported versions */
 #define TLS_VERSION_MINOR(ver)	((ver) & 0xFF)
@@ -194,6 +195,7 @@ enum {
 	TLS_INFO_RXCONF,
 	TLS_INFO_ZC_RO_TX,
 	TLS_INFO_RX_NO_PAD,
+	TLS_INFO_TX_RECORD_SIZE_LIM,
 	__TLS_INFO_MAX,
 };
 #define TLS_INFO_MAX (__TLS_INFO_MAX - 1)
diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
index f672a62a9a52..bf16ceb41dde 100644
--- a/net/tls/tls_device.c
+++ b/net/tls/tls_device.c
@@ -459,7 +459,7 @@ static int tls_push_data(struct sock *sk,
 	/* TLS_HEADER_SIZE is not counted as part of the TLS record, and
 	 * we need to leave room for an authentication tag.
 	 */
-	max_open_record_len = TLS_MAX_PAYLOAD_SIZE +
+	max_open_record_len = tls_ctx->tx_record_size_limit +
 			      prot->prepend_size;
 	do {
 		rc = tls_do_allocation(sk, ctx, pfrag, prot->prepend_size);
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index a3ccb3135e51..94237c97f062 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -544,6 +544,28 @@ static int do_tls_getsockopt_no_pad(struct sock *sk, char __user *optval,
 	return 0;
 }
 
+static int do_tls_getsockopt_tx_record_size(struct sock *sk, char __user *optval,
+					    int __user *optlen)
+{
+	struct tls_context *ctx = tls_get_ctx(sk);
+	u16 record_size_limit = ctx->tx_record_size_limit;
+	int len;
+
+	if (get_user(len, optlen))
+		return -EFAULT;
+
+	if (len < sizeof(record_size_limit))
+		return -EINVAL;
+
+	if (put_user(sizeof(record_size_limit), optlen))
+		return -EFAULT;
+
+	if (copy_to_user(optval, &record_size_limit, sizeof(record_size_limit)))
+		return -EFAULT;
+
+	return 0;
+}
+
 static int do_tls_getsockopt(struct sock *sk, int optname,
 			     char __user *optval, int __user *optlen)
 {
@@ -563,6 +585,9 @@ static int do_tls_getsockopt(struct sock *sk, int optname,
 	case TLS_RX_EXPECT_NO_PAD:
 		rc = do_tls_getsockopt_no_pad(sk, optval, optlen);
 		break;
+	case TLS_TX_RECORD_SIZE_LIM:
+		rc = do_tls_getsockopt_tx_record_size(sk, optval, optlen);
+		break;
 	default:
 		rc = -ENOPROTOOPT;
 		break;
@@ -812,6 +837,31 @@ static int do_tls_setsockopt_no_pad(struct sock *sk, sockptr_t optval,
 	return rc;
 }
 
+static int do_tls_setsockopt_tx_record_size(struct sock *sk, sockptr_t optval,
+					    unsigned int optlen)
+{
+	struct tls_context *ctx = tls_get_ctx(sk);
+	u16 value;
+
+	if (sockptr_is_null(optval) || optlen != sizeof(value))
+		return -EINVAL;
+
+	if (copy_from_sockptr(&value, optval, sizeof(value)))
+		return -EFAULT;
+
+	if (ctx->prot_info.version == TLS_1_2_VERSION &&
+	    value > TLS_MAX_PAYLOAD_SIZE)
+		return -EINVAL;
+
+	if (ctx->prot_info.version == TLS_1_3_VERSION &&
+	    value > TLS_MAX_PAYLOAD_SIZE + 1)
+		return -EINVAL;
+
+	ctx->tx_record_size_limit = value;
+
+	return 0;
+}
+
 static int do_tls_setsockopt(struct sock *sk, int optname, sockptr_t optval,
 			     unsigned int optlen)
 {
@@ -833,6 +883,9 @@ static int do_tls_setsockopt(struct sock *sk, int optname, sockptr_t optval,
 	case TLS_RX_EXPECT_NO_PAD:
 		rc = do_tls_setsockopt_no_pad(sk, optval, optlen);
 		break;
+	case TLS_TX_RECORD_SIZE_LIM:
+		rc = do_tls_setsockopt_tx_record_size(sk, optval, optlen);
+		break;
 	default:
 		rc = -ENOPROTOOPT;
 		break;
@@ -1022,6 +1075,7 @@ static int tls_init(struct sock *sk)
 
 	ctx->tx_conf = TLS_BASE;
 	ctx->rx_conf = TLS_BASE;
+	ctx->tx_record_size_limit = TLS_MAX_PAYLOAD_SIZE;
 	update_sk_prot(sk, ctx);
 out:
 	write_unlock_bh(&sk->sk_callback_lock);
@@ -1065,7 +1119,7 @@ static u16 tls_user_config(struct tls_context *ctx, bool tx)
 
 static int tls_get_info(struct sock *sk, struct sk_buff *skb, bool net_admin)
 {
-	u16 version, cipher_type;
+	u16 version, cipher_type, tx_record_size_limit;
 	struct tls_context *ctx;
 	struct nlattr *start;
 	int err;
@@ -1110,7 +1164,13 @@ static int tls_get_info(struct sock *sk, struct sk_buff *skb, bool net_admin)
 		if (err)
 			goto nla_failure;
 	}
-
+	tx_record_size_limit = ctx->tx_record_size_limit;
+	if (tx_record_size_limit) {
+		err = nla_put_u16(skb, TLS_INFO_TX_RECORD_SIZE_LIM,
+				  tx_record_size_limit);
+		if (err)
+			goto nla_failure;
+	}
 	rcu_read_unlock();
 	nla_nest_end(skb, start);
 	return 0;
@@ -1132,6 +1192,7 @@ static size_t tls_get_info_size(const struct sock *sk, bool net_admin)
 		nla_total_size(sizeof(u16)) +	/* TLS_INFO_TXCONF */
 		nla_total_size(0) +		/* TLS_INFO_ZC_RO_TX */
 		nla_total_size(0) +		/* TLS_INFO_RX_NO_PAD */
+		nla_total_size(sizeof(u16)) +   /* TLS_INFO_TX_RECORD_SIZE_LIM */
 		0;
 
 	return size;
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index bac65d0d4e3e..28fb796573d1 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -1079,7 +1079,7 @@ static int tls_sw_sendmsg_locked(struct sock *sk, struct msghdr *msg,
 		orig_size = msg_pl->sg.size;
 		full_record = false;
 		try_to_copy = msg_data_left(msg);
-		record_room = TLS_MAX_PAYLOAD_SIZE - msg_pl->sg.size;
+		record_room = tls_ctx->tx_record_size_limit - msg_pl->sg.size;
 		if (try_to_copy >= record_room) {
 			try_to_copy = record_room;
 			full_record = true;
-- 
2.51.0


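For readers who want to exercise the interface from userspace, a minimal
sketch follows. It assumes the option number 5 proposed in this patch (it is
not in released uapi headers), a socket that already has the "tls" ULP
attached, and hypothetical helper names; it is an illustration, not the
tlshd implementation.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/socket.h>

#ifndef SOL_TLS
#define SOL_TLS 282
#endif

/* Assumption: value proposed by this patch, absent from released headers. */
#ifndef TLS_TX_RECORD_SIZE_LIM
#define TLS_TX_RECORD_SIZE_LIM 5
#endif

/* Hypothetical helper: set the TX record size limit on a kTLS socket.
 * The patch expects optlen == sizeof(u16). Returns 0 or a negative errno.
 */
static int tls_set_tx_record_size_limit(int fd, uint16_t limit)
{
	if (setsockopt(fd, SOL_TLS, TLS_TX_RECORD_SIZE_LIM,
		       &limit, sizeof(limit)) < 0)
		return -errno;
	return 0;
}

/* Hypothetical helper: read the limit back. Returns 0 or a negative errno. */
static int tls_get_tx_record_size_limit(int fd, uint16_t *limit)
{
	socklen_t len = sizeof(*limit);

	if (getsockopt(fd, SOL_TLS, TLS_TX_RECORD_SIZE_LIM, limit, &len) < 0)
		return -errno;
	return 0;
}
```

On a kernel without this patch, both calls are expected to fail on a kTLS
socket, since do_tls_setsockopt() and do_tls_getsockopt() return -ENOPROTOOPT
for unknown options.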

* Re: [PATCH v3] net/tls: support maximum record size limit
  2025-09-03  1:47 [PATCH v3] net/tls: support maximum record size limit Wilfred Mallawa
@ 2025-09-03 10:14 ` Sabrina Dubroca
  2025-09-04  9:54   ` Sabrina Dubroca
  2025-09-03 22:51 ` Jakub Kicinski
  2025-09-04 10:10 ` Sabrina Dubroca
  2 siblings, 1 reply; 7+ messages in thread
From: Sabrina Dubroca @ 2025-09-03 10:14 UTC (permalink / raw)
  To: Wilfred Mallawa
  Cc: davem, edumazet, kuba, pabeni, horms, corbet, john.fastabend,
	netdev, linux-doc, linux-kernel, alistair.francis, dlemoal,
	Wilfred Mallawa

note: since this is a new feature, the subject prefix should be
"[PATCH net-next vN]" (ie add "net-next", the target tree for "new
feature" changes)

2025-09-03, 11:47:57 +1000, Wilfred Mallawa wrote:
> diff --git a/Documentation/networking/tls.rst b/Documentation/networking/tls.rst
> index 36cc7afc2527..0232df902320 100644
> --- a/Documentation/networking/tls.rst
> +++ b/Documentation/networking/tls.rst
> @@ -280,6 +280,13 @@ If the record decrypted turns out to had been padded or is not a data
>  record it will be decrypted again into a kernel buffer without zero copy.
>  Such events are counted in the ``TlsDecryptRetry`` statistic.
>  
> +TLS_TX_RECORD_SIZE_LIM
> +~~~~~~~~~~~~~~~~~~~~~~
> +
> +During a TLS handshake, an endpoint may use the record size limit extension
> +to specify a maximum record size. This allows enforcing the specified record
> +size limit, such that outgoing records do not exceed the limit specified.

Maybe worth adding a reference to the RFC that defines this extension?
I'm not sure if that would be helpful to readers of this doc or not.


> diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
> index a3ccb3135e51..94237c97f062 100644
> --- a/net/tls/tls_main.c
> +++ b/net/tls/tls_main.c
[...]
> @@ -1022,6 +1075,7 @@ static int tls_init(struct sock *sk)
>  
>  	ctx->tx_conf = TLS_BASE;
>  	ctx->rx_conf = TLS_BASE;
> +	ctx->tx_record_size_limit = TLS_MAX_PAYLOAD_SIZE;
>  	update_sk_prot(sk, ctx);
>  out:
>  	write_unlock_bh(&sk->sk_callback_lock);
> @@ -1065,7 +1119,7 @@ static u16 tls_user_config(struct tls_context *ctx, bool tx)
>  
>  static int tls_get_info(struct sock *sk, struct sk_buff *skb, bool net_admin)
>  {
> -	u16 version, cipher_type;
> +	u16 version, cipher_type, tx_record_size_limit;
>  	struct tls_context *ctx;
>  	struct nlattr *start;
>  	int err;
> @@ -1110,7 +1164,13 @@ static int tls_get_info(struct sock *sk, struct sk_buff *skb, bool net_admin)
>  		if (err)
>  			goto nla_failure;
>  	}
> -
> +	tx_record_size_limit = ctx->tx_record_size_limit;
> +	if (tx_record_size_limit) {

You probably meant to update that to:

    tx_record_size_limit != TLS_MAX_PAYLOAD_SIZE

Otherwise, now that the default is TLS_MAX_PAYLOAD_SIZE, it will
always be exported - which is not wrong either. So I'd either update
the conditional so that the attribute is only exported for non-default
sizes (like in v2), or drop the if() and always export it.

> +		err = nla_put_u16(skb, TLS_INFO_TX_RECORD_SIZE_LIM,
> +				  tx_record_size_limit);
> +		if (err)
> +			goto nla_failure;
> +	}

[...]
> diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> index bac65d0d4e3e..28fb796573d1 100644
> --- a/net/tls/tls_sw.c
> +++ b/net/tls/tls_sw.c
> @@ -1079,7 +1079,7 @@ static int tls_sw_sendmsg_locked(struct sock *sk, struct msghdr *msg,
>  		orig_size = msg_pl->sg.size;
>  		full_record = false;
>  		try_to_copy = msg_data_left(msg);
> -		record_room = TLS_MAX_PAYLOAD_SIZE - msg_pl->sg.size;
> +		record_room = tls_ctx->tx_record_size_limit - msg_pl->sg.size;

If we entered tls_sw_sendmsg_locked with an existing open record, this
could end up being negative and confuse the rest of the code.

    send(MSG_MORE) returns with an open record of length len1
    setsockopt(TLS_TX_RECORD_SIZE_LIM, limit < len1)
    send() -> record_room < 0


Possibly not a problem with a "well-behaved" userspace, but we can't
rely on that.
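The arithmetic in the sequence above can be modelled outside the kernel. The
sketch below is a standalone illustration with hypothetical function names;
the clamped variant is only one possible guard and is not something this
thread settled on.

```c
#include <stdint.h>

/* Model of "limit - open record size" evaluated as a signed quantity.
 * A negative result corresponds to the record_room < 0 case above.
 */
static int record_room_signed(uint16_t limit, uint32_t open_len)
{
	return (int)((int64_t)limit - (int64_t)open_len);
}

/* Hypothetical guard: treat an already oversized open record as full,
 * so the caller sees zero room instead of a negative value.
 */
static uint32_t record_room_clamped(uint16_t limit, uint32_t open_len)
{
	return open_len >= limit ? 0 : (uint32_t)limit - open_len;
}
```

With an open record of 8192 bytes and a new limit of 4096, the signed model
yields -4096 while the clamped variant yields 0.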


Pushing out the pending "too big" record at the time we set
tx_record_size_limit would likely make the peer close the connection
(because it's already told us to limit our TX size), so I guess we'd
have to split the pending record into tx_record_size_limit chunks
before we start processing the new message (either directly at
setsockopt(TLS_TX_RECORD_SIZE_LIM) time, or the next send/etc
call). The final push during socket closing, and maybe some more
codepaths that deal with ctx->open_rec, would also have to do that.

I think additional selftests for
    send(MSG_MORE), TLS_TX_RECORD_SIZE_LIM, send
and
    send(MSG_MORE), TLS_TX_RECORD_SIZE_LIM, close
verifying the received record sizes would make sense, since it's a bit
tricky to get that right.
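As a rough aid for such a selftest, the record sizes one would expect after
splitting a pending record can be computed as below. This is a standalone
sketch with hypothetical helper names; it only models payload lengths and
says nothing about how the kernel would actually split ctx->open_rec.

```c
#include <stddef.h>

/* Number of records needed to carry `pending` payload bytes when no
 * record may hold more than `limit` bytes. A zero limit yields 0.
 */
static size_t chunk_count(size_t pending, size_t limit)
{
	if (limit == 0)
		return 0;
	return (pending + limit - 1) / limit;
}

/* Payload length of chunk `i` (0-based); 0 if `i` is out of range. */
static size_t chunk_len(size_t pending, size_t limit, size_t i)
{
	size_t off;

	if (i >= chunk_count(pending, limit))
		return 0;
	off = i * limit;
	return pending - off < limit ? pending - off : limit;
}
```

For example, 10000 pending bytes with a 4096-byte limit give three records
of 4096, 4096 and 1808 bytes.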

-- 
Sabrina


* Re: [PATCH v3] net/tls: support maximum record size limit
  2025-09-03  1:47 [PATCH v3] net/tls: support maximum record size limit Wilfred Mallawa
  2025-09-03 10:14 ` Sabrina Dubroca
@ 2025-09-03 22:51 ` Jakub Kicinski
  2025-09-04 23:41   ` Wilfred Mallawa
  2025-09-04 10:10 ` Sabrina Dubroca
  2 siblings, 1 reply; 7+ messages in thread
From: Jakub Kicinski @ 2025-09-03 22:51 UTC (permalink / raw)
  To: Wilfred Mallawa
  Cc: davem, edumazet, pabeni, horms, corbet, john.fastabend, netdev,
	linux-doc, linux-kernel, alistair.francis, dlemoal, sd,
	Wilfred Mallawa

On Wed,  3 Sep 2025 11:47:57 +1000 Wilfred Mallawa wrote:
> Upcoming Western Digital NVMe-TCP hardware controllers implement TLS
> support. For these devices, supporting TLS record size negotiation is
> necessary because the maximum TLS record size supported by the controller
> is less than the default 16KB currently used by the kernel.

Just to be clear -- the device does not require that the records align
with TCP segments, right?


* Re: [PATCH v3] net/tls: support maximum record size limit
  2025-09-03 10:14 ` Sabrina Dubroca
@ 2025-09-04  9:54   ` Sabrina Dubroca
  0 siblings, 0 replies; 7+ messages in thread
From: Sabrina Dubroca @ 2025-09-04  9:54 UTC (permalink / raw)
  To: Wilfred Mallawa
  Cc: davem, edumazet, kuba, pabeni, horms, corbet, john.fastabend,
	netdev, linux-doc, linux-kernel, alistair.francis, dlemoal,
	Wilfred Mallawa

2025-09-03, 12:14:32 +0200, Sabrina Dubroca wrote:
> 2025-09-03, 11:47:57 +1000, Wilfred Mallawa wrote:
> Pushing out the pending "too big" record at the time we set
> tx_record_size_limit would likely make the peer close the connection
> (because it's already told us to limit our TX size), so I guess we'd
> have to split the pending record into tx_record_size_limit chunks
> before we start processing the new message (either directly at
> setsockopt(TLS_TX_RECORD_SIZE_LIM) time, or the next send/etc
> call). The final push during socket closing, and maybe some more
> codepaths that deal with ctx->open_rec, would also have to do that.
> 
> I think additional selftests for
>     send(MSG_MORE), TLS_TX_RECORD_SIZE_LIM, send
> and
>     send(MSG_MORE), TLS_TX_RECORD_SIZE_LIM, close
> verifying the received record sizes would make sense, since it's a bit
> tricky to get that right.

Hmm, after thinking about this a bit more, maybe we don't need to
care? There could be more records larger than the new limit already
pushed out to TCP but not received by the peer, and we can't do
anything about those.

I suspect it's not a problem in practice because of what the TLS
exchange between the peers setting up this extension looks like? (ie,
there should never be an open record at this stage - unless userspace
delays doing this setsockopt after getting the message from the peer,
but then maybe we can call that a buggy userspace)

-- 
Sabrina


* Re: [PATCH v3] net/tls: support maximum record size limit
  2025-09-03  1:47 [PATCH v3] net/tls: support maximum record size limit Wilfred Mallawa
  2025-09-03 10:14 ` Sabrina Dubroca
  2025-09-03 22:51 ` Jakub Kicinski
@ 2025-09-04 10:10 ` Sabrina Dubroca
  2025-09-04 13:33   ` Jakub Kicinski
  2 siblings, 1 reply; 7+ messages in thread
From: Sabrina Dubroca @ 2025-09-04 10:10 UTC (permalink / raw)
  To: Wilfred Mallawa, kuba
  Cc: davem, edumazet, pabeni, horms, corbet, john.fastabend, netdev,
	linux-doc, linux-kernel, alistair.francis, dlemoal,
	Wilfred Mallawa

2025-09-03, 11:47:57 +1000, Wilfred Mallawa wrote:
> +static int do_tls_setsockopt_tx_record_size(struct sock *sk, sockptr_t optval,
> +					    unsigned int optlen)
> +{
> +	struct tls_context *ctx = tls_get_ctx(sk);
> +	u16 value;
> +
> +	if (sockptr_is_null(optval) || optlen != sizeof(value))
> +		return -EINVAL;
> +
> +	if (copy_from_sockptr(&value, optval, sizeof(value)))
> +		return -EFAULT;
> +
> +	if (ctx->prot_info.version == TLS_1_2_VERSION &&
> +	    value > TLS_MAX_PAYLOAD_SIZE)
> +		return -EINVAL;
> +
> +	if (ctx->prot_info.version == TLS_1_3_VERSION &&
> +	    value > TLS_MAX_PAYLOAD_SIZE + 1)
> +		return -EINVAL;

The RFC is not very explicit about this, but I think this +1 for
TLS1.3 is to allow an actual payload of TLS_MAX_PAYLOAD_SIZE and save
1B of room for the content_type that gets appended.

   This value is the length of the plaintext of a protected record.  The
   value includes the content type and padding added in TLS 1.3 (that
   is, the complete length of TLSInnerPlaintext).

AFAIU we don't actually want to stuff TLS_MAX_PAYLOAD_SIZE+1 bytes of
payload into a record.

If we set tx_record_size_limit to TLS_MAX_PAYLOAD_SIZE+1, we'll end up
sending a record with a plaintext of TLS_MAX_PAYLOAD_SIZE+2 bytes
(TLS_MAX_PAYLOAD_SIZE+1 of payload, then 1B of content_type), and a
"normal" implementation will reject the record since it's too big
(ktls does that in net/tls/tls_sw.c:tls_rx_msg_size).

So we should subtract 1 from the userspace-provided value for 1.3, and
then add it back in getsockopt/tls_get_info.

Or maybe userspace should provide the desired payload limit, instead
of the raw record_size_limit it got from the extension (ie, do -1 when
needed before calling the setsockopt). Then we should rename this
"tx_payload_size_limit" (and adjust the docs) to make it clear it's
not the raw record_size_limit.

The "tx_payload_size_limit" approach is maybe a little bit simpler
(not having to add/subtract 1 in a few places - I think userspace
would only have to do it in one place).
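The off-by-one discussed above can be written down as a pair of conversions.
This is a standalone sketch with hypothetical names, assuming the raw
record_size_limit counts the 1-byte content type in TLS 1.3 (as the RFC text
quoted above says) and that TLS_MAX_PAYLOAD_SIZE is 16KB; where the
conversion should live (kernel or userspace) is exactly the open question.

```c
#include <stdbool.h>
#include <stdint.h>

#define TLS_MAX_PAYLOAD_SIZE 16384u	/* 16KB, as used in this thread */

/* Convert a raw record_size_limit from the extension into the maximum
 * payload the sender may put in one record. Returns -1 if the value is
 * zero or above what the patch accepts for the given version.
 */
static int payload_limit_from_rfc(uint16_t rfc_limit, bool tls13)
{
	unsigned int max = TLS_MAX_PAYLOAD_SIZE + (tls13 ? 1u : 0u);

	if (rfc_limit == 0 || rfc_limit > max)
		return -1;
	/* TLS 1.3 appends 1 byte of content_type inside the plaintext. */
	return tls13 ? rfc_limit - 1 : rfc_limit;
}

/* Inverse mapping, e.g. for reporting via getsockopt/tls_get_info. */
static int rfc_limit_from_payload(uint16_t payload_limit, bool tls13)
{
	return tls13 ? payload_limit + 1 : payload_limit;
}
```

With this mapping, a TLS 1.3 peer advertising 16385 results in at most 16384
bytes of payload per record, which matches the concern raised above.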


Wilfred, Jakub, what do you think?


> +	ctx->tx_record_size_limit = value;
> +
> +	return 0;
> +}

-- 
Sabrina


* Re: [PATCH v3] net/tls: support maximum record size limit
  2025-09-04 10:10 ` Sabrina Dubroca
@ 2025-09-04 13:33   ` Jakub Kicinski
  0 siblings, 0 replies; 7+ messages in thread
From: Jakub Kicinski @ 2025-09-04 13:33 UTC (permalink / raw)
  To: Sabrina Dubroca
  Cc: Wilfred Mallawa, davem, edumazet, pabeni, horms, corbet,
	john.fastabend, netdev, linux-doc, linux-kernel, alistair.francis,
	dlemoal, Wilfred Mallawa

On Thu, 4 Sep 2025 12:10:48 +0200 Sabrina Dubroca wrote:
> If we set tx_record_size_limit to TLS_MAX_PAYLOAD_SIZE+1, we'll end up
> sending a record with a plaintext of TLS_MAX_PAYLOAD_SIZE+2 bytes
> (TLS_MAX_PAYLOAD_SIZE+1 of payload, then 1B of content_type), and a
> "normal" implementation will reject the record since it's too big
> (ktls does that in net/tls/tls_sw.c:tls_rx_msg_size).
> 
> So we should subtract 1 from the userspace-provided value for 1.3, and
> then add it back in getsockopt/tls_get_info.
> 
> Or maybe userspace should provide the desired payload limit, instead
> of the raw record_size_limit it got from the extension (ie, do -1 when
> needed before calling the setsockopt). Then we should rename this
> "tx_payload_size_limit" (and adjust the docs) to make it clear it's
> not the raw record_size_limit.
> 
> The "tx_payload_size_limit" approach is maybe a little bit simpler
> (not having to add/subtract 1 in a few places - I think userspace
> would only have to do it in one place).
> 
> 
> Wilfred, Jakub, what do you think?

I reckon either way is fine, assuming we clearly document the behavior.
I'd lean slightly to using the same definition of the setsockopt as the
RFC, it may be confusing but if it ever interacts with other settings
it may make it easier to refer to other RFCs.


* Re: [PATCH v3] net/tls: support maximum record size limit
  2025-09-03 22:51 ` Jakub Kicinski
@ 2025-09-04 23:41   ` Wilfred Mallawa
  0 siblings, 0 replies; 7+ messages in thread
From: Wilfred Mallawa @ 2025-09-04 23:41 UTC (permalink / raw)
  To: kuba@kernel.org
  Cc: corbet@lwn.net, dlemoal@kernel.org, davem@davemloft.net,
	Alistair Francis, john.fastabend@gmail.com, sd@queasysnail.net,
	linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
	horms@kernel.org, edumazet@google.com, pabeni@redhat.com,
	netdev@vger.kernel.org

On Wed, 2025-09-03 at 15:51 -0700, Jakub Kicinski wrote:
> On Wed,  3 Sep 2025 11:47:57 +1000 Wilfred Mallawa wrote:
> > Upcoming Western Digital NVMe-TCP hardware controllers implement TLS
> > support. For these devices, supporting TLS record size negotiation is
> > necessary because the maximum TLS record size supported by the controller
> > is less than the default 16KB currently used by the kernel.
> 
> Just to be clear -- the device does not require that the records align
> with TCP segments, right?

Yeah, that's correct. There is no requirement for alignment of TLS
records with TCP segments.

Cheers,
Wilfred
