* [PATCH net-next v8 1/2] net/tls: support setting the maximum payload size
@ 2025-10-22 0:19 Wilfred Mallawa
2025-10-22 0:19 ` [PATCH net-next v8 2/2] selftests: tls: add tls record_size_limit test Wilfred Mallawa
` (3 more replies)
0 siblings, 4 replies; 11+ messages in thread
From: Wilfred Mallawa @ 2025-10-22 0:19 UTC (permalink / raw)
To: netdev, linux-doc, linux-kernel, linux-kselftest
Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Jonathan Corbet, Simon Horman, John Fastabend, Sabrina Dubroca,
Shuah Khan, Wilfred Mallawa
From: Wilfred Mallawa <wilfred.mallawa@wdc.com>
During a handshake, an endpoint may specify a maximum record size limit.
Currently, the kernel defaults to TLS_MAX_PAYLOAD_SIZE (16KB) for the
maximum record size. This means that outgoing records from the kernel
can exceed a lower size negotiated during the handshake. In such a case,
the receiving TLS endpoint must send a fatal "record_overflow" alert [1],
and thus the record is discarded.
Upcoming Western Digital NVMe-TCP hardware controllers implement TLS
support. For these devices, supporting TLS record size negotiation is
necessary because the maximum TLS record size supported by the controller
is less than the default 16KB currently used by the kernel.
Currently, there is no way to inform the kernel of such a limit. This patch
adds support for a new setsockopt() option, `TLS_TX_MAX_PAYLOAD_LEN`, that
sets the maximum plaintext fragment size. Once set, the plaintext of outgoing
records is no larger than the size specified. This option can be used to
implement the record size limit.
[1] https://www.rfc-editor.org/rfc/rfc8449
Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
---
V7 -> V8:
- Fixup HTML doc indentation
- Drop the getsockopt() change in V7 where ContentType was included in the
max payload length
---
Documentation/networking/tls.rst | 20 ++++++++++
include/net/tls.h | 3 ++
include/uapi/linux/tls.h | 2 +
net/tls/tls_device.c | 2 +-
net/tls/tls_main.c | 64 ++++++++++++++++++++++++++++++++
net/tls/tls_sw.c | 2 +-
6 files changed, 91 insertions(+), 2 deletions(-)
diff --git a/Documentation/networking/tls.rst b/Documentation/networking/tls.rst
index 36cc7afc2527..980c442d7161 100644
--- a/Documentation/networking/tls.rst
+++ b/Documentation/networking/tls.rst
@@ -280,6 +280,26 @@ If the record decrypted turns out to had been padded or is not a data
record it will be decrypted again into a kernel buffer without zero copy.
Such events are counted in the ``TlsDecryptRetry`` statistic.
+TLS_TX_MAX_PAYLOAD_LEN
+~~~~~~~~~~~~~~~~~~~~~~
+
+Specifies the maximum size of the plaintext payload for transmitted TLS records.
+
+When this option is set, the kernel enforces the specified limit on all outgoing
+TLS records. No plaintext fragment will exceed this size. This option can be used
+to implement the TLS Record Size Limit extension [1].
+
+* For TLS 1.2, the value corresponds directly to the record size limit.
+* For TLS 1.3, the value should be set to record_size_limit - 1, since
+ the record size limit includes one additional byte for the ContentType
+ field.
+
+The valid range for this option is 64 to 16384 bytes for TLS 1.2, and 63 to
+16384 bytes for TLS 1.3. The lower minimum for TLS 1.3 accounts for the
+extra byte used by the ContentType field.
+
+[1] https://datatracker.ietf.org/doc/html/rfc8449
+
Statistics
==========
diff --git a/include/net/tls.h b/include/net/tls.h
index 857340338b69..f2af113728aa 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -53,6 +53,8 @@ struct tls_rec;
/* Maximum data size carried in a TLS record */
#define TLS_MAX_PAYLOAD_SIZE ((size_t)1 << 14)
+/* Minimum record size limit as per RFC8449 */
+#define TLS_MIN_RECORD_SIZE_LIM ((size_t)1 << 6)
#define TLS_HEADER_SIZE 5
#define TLS_NONCE_OFFSET TLS_HEADER_SIZE
@@ -226,6 +228,7 @@ struct tls_context {
u8 rx_conf:3;
u8 zerocopy_sendfile:1;
u8 rx_no_pad:1;
+ u16 tx_max_payload_len;
int (*push_pending_record)(struct sock *sk, int flags);
void (*sk_write_space)(struct sock *sk);
diff --git a/include/uapi/linux/tls.h b/include/uapi/linux/tls.h
index b66a800389cc..b8b9c42f848c 100644
--- a/include/uapi/linux/tls.h
+++ b/include/uapi/linux/tls.h
@@ -41,6 +41,7 @@
#define TLS_RX 2 /* Set receive parameters */
#define TLS_TX_ZEROCOPY_RO 3 /* TX zerocopy (only sendfile now) */
#define TLS_RX_EXPECT_NO_PAD 4 /* Attempt opportunistic zero-copy */
+#define TLS_TX_MAX_PAYLOAD_LEN 5 /* Maximum plaintext size */
/* Supported versions */
#define TLS_VERSION_MINOR(ver) ((ver) & 0xFF)
@@ -194,6 +195,7 @@ enum {
TLS_INFO_RXCONF,
TLS_INFO_ZC_RO_TX,
TLS_INFO_RX_NO_PAD,
+ TLS_INFO_TX_MAX_PAYLOAD_LEN,
__TLS_INFO_MAX,
};
#define TLS_INFO_MAX (__TLS_INFO_MAX - 1)
diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
index caa2b5d24622..4d29b390aed9 100644
--- a/net/tls/tls_device.c
+++ b/net/tls/tls_device.c
@@ -462,7 +462,7 @@ static int tls_push_data(struct sock *sk,
/* TLS_HEADER_SIZE is not counted as part of the TLS record, and
* we need to leave room for an authentication tag.
*/
- max_open_record_len = TLS_MAX_PAYLOAD_SIZE +
+ max_open_record_len = tls_ctx->tx_max_payload_len +
prot->prepend_size;
do {
rc = tls_do_allocation(sk, ctx, pfrag, prot->prepend_size);
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index 39a2ab47fe72..56ce0bc8317b 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -541,6 +541,28 @@ static int do_tls_getsockopt_no_pad(struct sock *sk, char __user *optval,
return 0;
}
+static int do_tls_getsockopt_tx_payload_len(struct sock *sk, char __user *optval,
+ int __user *optlen)
+{
+ struct tls_context *ctx = tls_get_ctx(sk);
+ u16 payload_len = ctx->tx_max_payload_len;
+ int len;
+
+ if (get_user(len, optlen))
+ return -EFAULT;
+
+ if (len < sizeof(payload_len))
+ return -EINVAL;
+
+ if (put_user(sizeof(payload_len), optlen))
+ return -EFAULT;
+
+ if (copy_to_user(optval, &payload_len, sizeof(payload_len)))
+ return -EFAULT;
+
+ return 0;
+}
+
static int do_tls_getsockopt(struct sock *sk, int optname,
char __user *optval, int __user *optlen)
{
@@ -560,6 +582,9 @@ static int do_tls_getsockopt(struct sock *sk, int optname,
case TLS_RX_EXPECT_NO_PAD:
rc = do_tls_getsockopt_no_pad(sk, optval, optlen);
break;
+ case TLS_TX_MAX_PAYLOAD_LEN:
+ rc = do_tls_getsockopt_tx_payload_len(sk, optval, optlen);
+ break;
default:
rc = -ENOPROTOOPT;
break;
@@ -809,6 +834,32 @@ static int do_tls_setsockopt_no_pad(struct sock *sk, sockptr_t optval,
return rc;
}
+static int do_tls_setsockopt_tx_payload_len(struct sock *sk, sockptr_t optval,
+ unsigned int optlen)
+{
+ struct tls_context *ctx = tls_get_ctx(sk);
+ struct tls_sw_context_tx *sw_ctx = tls_sw_ctx_tx(ctx);
+ u16 value;
+ bool tls_13 = ctx->prot_info.version == TLS_1_3_VERSION;
+
+ if (sw_ctx && sw_ctx->open_rec)
+ return -EBUSY;
+
+ if (sockptr_is_null(optval) || optlen != sizeof(value))
+ return -EINVAL;
+
+ if (copy_from_sockptr(&value, optval, sizeof(value)))
+ return -EFAULT;
+
+ if (value < TLS_MIN_RECORD_SIZE_LIM - (tls_13 ? 1 : 0) ||
+ value > TLS_MAX_PAYLOAD_SIZE)
+ return -EINVAL;
+
+ ctx->tx_max_payload_len = value;
+
+ return 0;
+}
+
static int do_tls_setsockopt(struct sock *sk, int optname, sockptr_t optval,
unsigned int optlen)
{
@@ -830,6 +881,11 @@ static int do_tls_setsockopt(struct sock *sk, int optname, sockptr_t optval,
case TLS_RX_EXPECT_NO_PAD:
rc = do_tls_setsockopt_no_pad(sk, optval, optlen);
break;
+ case TLS_TX_MAX_PAYLOAD_LEN:
+ lock_sock(sk);
+ rc = do_tls_setsockopt_tx_payload_len(sk, optval, optlen);
+ release_sock(sk);
+ break;
default:
rc = -ENOPROTOOPT;
break;
@@ -1019,6 +1075,7 @@ static int tls_init(struct sock *sk)
ctx->tx_conf = TLS_BASE;
ctx->rx_conf = TLS_BASE;
+ ctx->tx_max_payload_len = TLS_MAX_PAYLOAD_SIZE;
update_sk_prot(sk, ctx);
out:
write_unlock_bh(&sk->sk_callback_lock);
@@ -1108,6 +1165,12 @@ static int tls_get_info(struct sock *sk, struct sk_buff *skb, bool net_admin)
goto nla_failure;
}
+ err = nla_put_u16(skb, TLS_INFO_TX_MAX_PAYLOAD_LEN,
+ ctx->tx_max_payload_len);
+
+ if (err)
+ goto nla_failure;
+
rcu_read_unlock();
nla_nest_end(skb, start);
return 0;
@@ -1129,6 +1192,7 @@ static size_t tls_get_info_size(const struct sock *sk, bool net_admin)
nla_total_size(sizeof(u16)) + /* TLS_INFO_TXCONF */
nla_total_size(0) + /* TLS_INFO_ZC_RO_TX */
nla_total_size(0) + /* TLS_INFO_RX_NO_PAD */
+ nla_total_size(sizeof(u16)) + /* TLS_INFO_TX_MAX_PAYLOAD_LEN */
0;
return size;
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index d17135369980..9937d4c810f2 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -1079,7 +1079,7 @@ static int tls_sw_sendmsg_locked(struct sock *sk, struct msghdr *msg,
orig_size = msg_pl->sg.size;
full_record = false;
try_to_copy = msg_data_left(msg);
- record_room = TLS_MAX_PAYLOAD_SIZE - msg_pl->sg.size;
+ record_room = tls_ctx->tx_max_payload_len - msg_pl->sg.size;
if (try_to_copy >= record_room) {
try_to_copy = record_room;
full_record = true;
--
2.51.0
* [PATCH net-next v8 2/2] selftests: tls: add tls record_size_limit test
2025-10-22 0:19 [PATCH net-next v8 1/2] net/tls: support setting the maximum payload size Wilfred Mallawa
@ 2025-10-22 0:19 ` Wilfred Mallawa
2025-10-22 21:51 ` Sabrina Dubroca
2025-10-22 21:51 ` [PATCH net-next v8 1/2] net/tls: support setting the maximum payload size Sabrina Dubroca
` (2 subsequent siblings)
3 siblings, 1 reply; 11+ messages in thread
From: Wilfred Mallawa @ 2025-10-22 0:19 UTC (permalink / raw)
To: netdev, linux-doc, linux-kernel, linux-kselftest
Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Jonathan Corbet, Simon Horman, John Fastabend, Sabrina Dubroca,
Shuah Khan, Wilfred Mallawa
From: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Test that outgoing plaintext records respect the TLS_TX_MAX_PAYLOAD_LEN
set using setsockopt(). The limit is set to 128, so the plaintext in every
received record must not exceed this amount.
Also test that setting a new record size limit while an open record is
pending is rejected with EBUSY.
Suggested-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
---
V7 -> V8:
- Drop TLS 1.3 tests for the removed getsockopt() changes from V7
---
tools/testing/selftests/net/tls.c | 141 ++++++++++++++++++++++++++++++
1 file changed, 141 insertions(+)
diff --git a/tools/testing/selftests/net/tls.c b/tools/testing/selftests/net/tls.c
index 5c6d8215021c..da1b50b30719 100644
--- a/tools/testing/selftests/net/tls.c
+++ b/tools/testing/selftests/net/tls.c
@@ -2856,6 +2856,147 @@ TEST_F(tls_err, oob_pressure)
EXPECT_EQ(send(self->fd2, buf, 5, MSG_OOB), 5);
}
+/*
+ * Parse a stream of TLS records and ensure that each record respects
+ * the specified @max_payload_len.
+ */
+static size_t parse_tls_records(struct __test_metadata *_metadata,
+ const __u8 *rx_buf, int rx_len, int overhead,
+ __u16 max_payload_len)
+{
+ const __u8 *rec = rx_buf;
+ size_t total_plaintext_rx = 0;
+ const __u8 rec_header_len = 5;
+
+ while (rec < rx_buf + rx_len) {
+ __u16 record_payload_len;
+ __u16 plaintext_len;
+
+ /* Sanity check that it's a TLS header for application data */
+ ASSERT_EQ(rec[0], 23);
+ ASSERT_EQ(rec[1], 0x3);
+ ASSERT_EQ(rec[2], 0x3);
+
+ memcpy(&record_payload_len, rec + 3, 2);
+ record_payload_len = ntohs(record_payload_len);
+ ASSERT_GE(record_payload_len, overhead);
+
+ plaintext_len = record_payload_len - overhead;
+ total_plaintext_rx += plaintext_len;
+
+ /* Plaintext must not exceed the specified limit */
+ ASSERT_LE(plaintext_len, max_payload_len);
+ rec += rec_header_len + record_payload_len;
+ }
+
+ return total_plaintext_rx;
+}
+
+TEST(tls_12_tx_max_payload_len)
+{
+ struct tls_crypto_info_keys tls12;
+ int cfd, ret, fd, overhead;
+ size_t total_plaintext_rx = 0;
+ __u8 tx[1024], rx[2000];
+ __u16 limit = 128;
+ __u16 opt = 0;
+ unsigned int optlen = sizeof(opt);
+ bool notls;
+
+ tls_crypto_info_init(TLS_1_2_VERSION, TLS_CIPHER_AES_CCM_128,
+ &tls12, 0);
+
+ ulp_sock_pair(_metadata, &fd, &cfd, &notls);
+
+ if (notls)
+ exit(KSFT_SKIP);
+
+ /* Don't install keys on fd, we'll parse raw records */
+ ret = setsockopt(cfd, SOL_TLS, TLS_TX, &tls12, tls12.len);
+ ASSERT_EQ(ret, 0);
+
+ ret = setsockopt(cfd, SOL_TLS, TLS_TX_MAX_PAYLOAD_LEN, &limit,
+ sizeof(limit));
+ ASSERT_EQ(ret, 0);
+
+ ret = getsockopt(cfd, SOL_TLS, TLS_TX_MAX_PAYLOAD_LEN, &opt, &optlen);
+ EXPECT_EQ(ret, 0);
+ EXPECT_EQ(limit, opt);
+ EXPECT_EQ(optlen, sizeof(limit));
+
+ memset(tx, 0, sizeof(tx));
+ ASSERT_EQ(send(cfd, tx, sizeof(tx), 0), sizeof(tx));
+ close(cfd);
+
+ ret = recv(fd, rx, sizeof(rx), 0);
+
+ /*
+ * 16B tag + 8B IV -- record header (5B) is not counted but we'll
+ * need it to walk the record stream
+ */
+ overhead = 16 + 8;
+ total_plaintext_rx = parse_tls_records(_metadata, rx, ret, overhead,
+ limit);
+
+ ASSERT_EQ(total_plaintext_rx, sizeof(tx));
+ close(fd);
+}
+
+TEST(tls_12_tx_max_payload_len_open_rec)
+{
+ struct tls_crypto_info_keys tls12;
+ int cfd, ret, fd, overhead;
+ size_t total_plaintext_rx = 0;
+ __u8 tx[1024], rx[2000];
+ __u16 tx_partial = 256;
+ __u16 og_limit = 512, limit = 128;
+ bool notls;
+
+ tls_crypto_info_init(TLS_1_2_VERSION, TLS_CIPHER_AES_CCM_128,
+ &tls12, 0);
+
+ ulp_sock_pair(_metadata, &fd, &cfd, &notls);
+
+ if (notls)
+ exit(KSFT_SKIP);
+
+ /* Don't install keys on fd, we'll parse raw records */
+ ret = setsockopt(cfd, SOL_TLS, TLS_TX, &tls12, tls12.len);
+ ASSERT_EQ(ret, 0);
+
+ ret = setsockopt(cfd, SOL_TLS, TLS_TX_MAX_PAYLOAD_LEN, &og_limit,
+ sizeof(og_limit));
+ ASSERT_EQ(ret, 0);
+
+ memset(tx, 0, sizeof(tx));
+ ASSERT_EQ(send(cfd, tx, tx_partial, MSG_MORE), tx_partial);
+
+ /*
+ * Changing the payload limit with a pending open record should
+ * not be allowed.
+ */
+ ret = setsockopt(cfd, SOL_TLS, TLS_TX_MAX_PAYLOAD_LEN, &limit,
+ sizeof(limit));
+ ASSERT_EQ(ret, -1);
+ ASSERT_EQ(errno, EBUSY);
+
+ ASSERT_EQ(send(cfd, tx + tx_partial, sizeof(tx) - tx_partial, MSG_EOR),
+ sizeof(tx) - tx_partial);
+ close(cfd);
+
+ ret = recv(fd, rx, sizeof(rx), 0);
+
+ /*
+ * 16B tag + 8B IV -- record header (5B) is not counted but we'll
+ * need it to walk the record stream
+ */
+ overhead = 16 + 8;
+ total_plaintext_rx = parse_tls_records(_metadata, rx, ret, overhead,
+ og_limit);
+ ASSERT_EQ(total_plaintext_rx, sizeof(tx));
+ close(fd);
+}
+
TEST(non_established) {
struct tls12_crypto_info_aes_gcm_256 tls12;
struct sockaddr_in addr;
--
2.51.0
* Re: [PATCH net-next v8 1/2] net/tls: support setting the maximum payload size
2025-10-22 0:19 [PATCH net-next v8 1/2] net/tls: support setting the maximum payload size Wilfred Mallawa
2025-10-22 0:19 ` [PATCH net-next v8 2/2] selftests: tls: add tls record_size_limit test Wilfred Mallawa
@ 2025-10-22 21:51 ` Sabrina Dubroca
2025-10-24 1:44 ` Jakub Kicinski
2025-10-27 23:30 ` patchwork-bot+netdevbpf
3 siblings, 0 replies; 11+ messages in thread
From: Sabrina Dubroca @ 2025-10-22 21:51 UTC (permalink / raw)
To: Wilfred Mallawa
Cc: netdev, linux-doc, linux-kernel, linux-kselftest,
David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Jonathan Corbet, Simon Horman, John Fastabend, Shuah Khan,
Wilfred Mallawa
2025-10-22, 10:19:36 +1000, Wilfred Mallawa wrote:
> From: Wilfred Mallawa <wilfred.mallawa@wdc.com>
>
> During a handshake, an endpoint may specify a maximum record size limit.
> Currently, the kernel defaults to TLS_MAX_PAYLOAD_SIZE (16KB) for the
> maximum record size. This means that outgoing records from the kernel
> can exceed a lower size negotiated during the handshake. In such a case,
> the receiving TLS endpoint must send a fatal "record_overflow" alert [1],
> and thus the record is discarded.
>
> Upcoming Western Digital NVMe-TCP hardware controllers implement TLS
> support. For these devices, supporting TLS record size negotiation is
> necessary because the maximum TLS record size supported by the controller
> is less than the default 16KB currently used by the kernel.
>
> Currently, there is no way to inform the kernel of such a limit. This patch
> adds support for a new setsockopt() option, `TLS_TX_MAX_PAYLOAD_LEN`, that
> sets the maximum plaintext fragment size. Once set, the plaintext of outgoing
> records is no larger than the size specified. This option can be used to
> implement the record size limit.
>
> [1] https://www.rfc-editor.org/rfc/rfc8449
>
> Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
> ---
> V7 -> V8:
> - Fixup HTML doc indentation
> - Drop the getsockopt() change in V7 where ContentType was included in the
> max payload length
> ---
> Documentation/networking/tls.rst | 20 ++++++++++
> include/net/tls.h | 3 ++
> include/uapi/linux/tls.h | 2 +
> net/tls/tls_device.c | 2 +-
> net/tls/tls_main.c | 64 ++++++++++++++++++++++++++++++++
> net/tls/tls_sw.c | 2 +-
> 6 files changed, 91 insertions(+), 2 deletions(-)
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Thanks Wilfred.
--
Sabrina
* Re: [PATCH net-next v8 2/2] selftests: tls: add tls record_size_limit test
2025-10-22 0:19 ` [PATCH net-next v8 2/2] selftests: tls: add tls record_size_limit test Wilfred Mallawa
@ 2025-10-22 21:51 ` Sabrina Dubroca
0 siblings, 0 replies; 11+ messages in thread
From: Sabrina Dubroca @ 2025-10-22 21:51 UTC (permalink / raw)
To: Wilfred Mallawa
Cc: netdev, linux-doc, linux-kernel, linux-kselftest,
David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Jonathan Corbet, Simon Horman, John Fastabend, Shuah Khan,
Wilfred Mallawa
2025-10-22, 10:19:37 +1000, Wilfred Mallawa wrote:
> From: Wilfred Mallawa <wilfred.mallawa@wdc.com>
>
> Test that outgoing plaintext records respect the TLS_TX_MAX_PAYLOAD_LEN
> set using setsockopt(). The limit is set to 128, so the plaintext in every
> received record must not exceed this amount.
>
> Also test that setting a new record size limit while an open record is
> pending is rejected with EBUSY.
>
> Suggested-by: Sabrina Dubroca <sd@queasysnail.net>
> Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
> ---
> V7 -> V8:
> - Drop TLS 1.3 tests for the removed getsockopt() changes from V7
> ---
> tools/testing/selftests/net/tls.c | 141 ++++++++++++++++++++++++++++++
> 1 file changed, 141 insertions(+)
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
--
Sabrina
* Re: [PATCH net-next v8 1/2] net/tls: support setting the maximum payload size
2025-10-22 0:19 [PATCH net-next v8 1/2] net/tls: support setting the maximum payload size Wilfred Mallawa
2025-10-22 0:19 ` [PATCH net-next v8 2/2] selftests: tls: add tls record_size_limit test Wilfred Mallawa
2025-10-22 21:51 ` [PATCH net-next v8 1/2] net/tls: support setting the maximum payload size Sabrina Dubroca
@ 2025-10-24 1:44 ` Jakub Kicinski
2025-10-24 2:11 ` Wilfred Mallawa
2025-10-27 23:30 ` patchwork-bot+netdevbpf
3 siblings, 1 reply; 11+ messages in thread
From: Jakub Kicinski @ 2025-10-24 1:44 UTC (permalink / raw)
To: Wilfred Mallawa, Sabrina Dubroca
Cc: netdev, linux-doc, linux-kernel, linux-kselftest,
David S . Miller, Eric Dumazet, Paolo Abeni, Jonathan Corbet,
Simon Horman, John Fastabend, Shuah Khan, Wilfred Mallawa
On Wed, 22 Oct 2025 10:19:36 +1000 Wilfred Mallawa wrote:
> +TLS_TX_MAX_PAYLOAD_LEN
> +~~~~~~~~~~~~~~~~~~~~~~
> +
> +Specifies the maximum size of the plaintext payload for transmitted TLS records.
> +
> +When this option is set, the kernel enforces the specified limit on all outgoing
> +TLS records. No plaintext fragment will exceed this size. This option can be used
> +to implement the TLS Record Size Limit extension [1].
> +
> +* For TLS 1.2, the value corresponds directly to the record size limit.
> +* For TLS 1.3, the value should be set to record_size_limit - 1, since
> + the record size limit includes one additional byte for the ContentType
> + field.
> +
> +The valid range for this option is 64 to 16384 bytes for TLS 1.2, and 63 to
> +16384 bytes for TLS 1.3. The lower minimum for TLS 1.3 accounts for the
> +extra byte used by the ContentType field.
> +
> +[1] https://datatracker.ietf.org/doc/html/rfc8449
Sorry for not paying attention to the last few revisions.
So we decided to go with the non-RFC definition of the sockopt
parameter? Is there a reason for that? I like how the "per RFC"
behavior shifts any blame away from us :)
> + err = nla_put_u16(skb, TLS_INFO_TX_MAX_PAYLOAD_LEN,
> + ctx->tx_max_payload_len);
> +
nit: unnecessary empty line
> + if (err)
> + goto nla_failure;
* Re: [PATCH net-next v8 1/2] net/tls: support setting the maximum payload size
2025-10-24 1:44 ` Jakub Kicinski
@ 2025-10-24 2:11 ` Wilfred Mallawa
2025-10-24 23:33 ` Jakub Kicinski
0 siblings, 1 reply; 11+ messages in thread
From: Wilfred Mallawa @ 2025-10-24 2:11 UTC (permalink / raw)
To: Jakub Kicinski, Sabrina Dubroca
Cc: netdev, linux-doc, linux-kernel, linux-kselftest,
David S . Miller, Eric Dumazet, Paolo Abeni, Jonathan Corbet,
Simon Horman, John Fastabend, Shuah Khan
On Thu, 2025-10-23 at 18:44 -0700, Jakub Kicinski wrote:
> On Wed, 22 Oct 2025 10:19:36 +1000 Wilfred Mallawa wrote:
> > +TLS_TX_MAX_PAYLOAD_LEN
> > +~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +Specifies the maximum size of the plaintext payload for
> > transmitted TLS records.
> > +
> > +When this option is set, the kernel enforces the specified limit
> > on all outgoing
> > +TLS records. No plaintext fragment will exceed this size. This
> > option can be used
> > +to implement the TLS Record Size Limit extension [1].
> > +
> > +* For TLS 1.2, the value corresponds directly to the record size
> > limit.
> > +* For TLS 1.3, the value should be set to record_size_limit - 1,
> > since
> > + the record size limit includes one additional byte for the
> > ContentType
> > + field.
> > +
> > +The valid range for this option is 64 to 16384 bytes for TLS 1.2,
> > and 63 to
> > +16384 bytes for TLS 1.3. The lower minimum for TLS 1.3 accounts
> > for the
> > +extra byte used by the ContentType field.
> > +
> > +[1] https://datatracker.ietf.org/doc/html/rfc8449
>
> Sorry for not paying attention to the last few revisions.
>
> So we decided to go with the non-RFC definition of the sockopt
> parameter? Is there a reason for that? I like how the "per RFC"
> behavior shifts any blame away from us :)
>
Hey Jakub,
We've made the change from record_size_limit to max_payload_len mainly
because:
In the previous record_size_limit approach for TLS 1.3, we need to
account for the ContentType byte. That complicates get/setsockopt()
and tls_get_info(): in setsockopt() for TLS 1.3 we need to subtract 1
from the user-provided value, and in getsockopt() we need to add 1
to keep the symmetry between the two (similarly in tls_get_info()). The
underlying assumption was that userspace passes up directly what the
endpoint specified as the record_size_limit.
With this approach we don't need to worry about it and we can pass the
responsibility to user-space as documented, which I think makes the
kernel code simpler.
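To make the split of responsibility concrete, here is a minimal sketch of the conversion userspace would do before calling setsockopt(). The helper name and the local constants are made up for illustration; only the 64 and 16384 bounds and the "- 1" for TLS 1.3 come from the patch and its documentation.

```c
#include <assert.h>
#include <stdbool.h>

/* Bounds mirrored from the patch (TLS_MIN_RECORD_SIZE_LIM, TLS_MAX_PAYLOAD_SIZE). */
#define EX_MIN_RECORD_SIZE_LIM 64
#define EX_MAX_PAYLOAD_SIZE 16384

static_assert(EX_MIN_RECORD_SIZE_LIM - 1 == 63,
	      "TLS 1.3 minimum payload length documented in tls.rst");

/*
 * Hypothetical userspace helper: turn the peer's negotiated
 * record_size_limit into a value for TLS_TX_MAX_PAYLOAD_LEN.
 * For TLS 1.3 the limit also covers the one-byte ContentType, so the
 * plaintext payload gets one byte less. Returns -1 if the limit is below
 * the RFC minimum or the result exceeds what the kernel accepts.
 */
static inline int tx_max_payload_len_from_limit(unsigned int record_size_limit,
						bool tls13)
{
	unsigned int payload_len;

	if (record_size_limit < EX_MIN_RECORD_SIZE_LIM)
		return -1;

	payload_len = tls13 ? record_size_limit - 1 : record_size_limit;
	if (payload_len > EX_MAX_PAYLOAD_SIZE)
		return -1;

	return (int)payload_len;
}
```

So a TLS 1.3 peer advertising the minimum limit of 64 ends up with a payload length of 63, which is the lower bound the setsockopt() check accepts for TLS 1.3.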
> > + err = nla_put_u16(skb, TLS_INFO_TX_MAX_PAYLOAD_LEN,
> > + ctx->tx_max_payload_len);
> > +
>
> nit: unnecessary empty line
Ah! will fixup for V9
Regards,
Wilfred
>
> > + if (err)
> > + goto nla_failure;
* Re: [PATCH net-next v8 1/2] net/tls: support setting the maximum payload size
2025-10-24 2:11 ` Wilfred Mallawa
@ 2025-10-24 23:33 ` Jakub Kicinski
2025-10-26 8:50 ` Wilfred Mallawa
2025-10-27 11:32 ` Sabrina Dubroca
0 siblings, 2 replies; 11+ messages in thread
From: Jakub Kicinski @ 2025-10-24 23:33 UTC (permalink / raw)
To: Wilfred Mallawa
Cc: Sabrina Dubroca, netdev, linux-doc, linux-kernel, linux-kselftest,
David S . Miller, Eric Dumazet, Paolo Abeni, Jonathan Corbet,
Simon Horman, John Fastabend, Shuah Khan
On Fri, 24 Oct 2025 12:11:11 +1000 Wilfred Mallawa wrote:
> In the previous record_size_limit approach for TLS 1.3, we need to
> account for the ContentType byte. That complicates get/setsockopt()
> and tls_get_info(): in setsockopt() for TLS 1.3 we need to subtract 1
> from the user-provided value, and in getsockopt() we need to add 1
> to keep the symmetry between the two (similarly in tls_get_info()). The
> underlying assumption was that userspace passes up directly what the
> endpoint specified as the record_size_limit.
>
> With this approach we don't need to worry about it and we can pass the
> responsibility to user-space as documented, which I think makes the
> kernel code simpler.
But we haven't managed to avoid that completely:
+ if (value < TLS_MIN_RECORD_SIZE_LIM - (tls_13 ? 1 : 0) ||
I understand the motivation, the kernel code is indeed simpler.
Last night I read the RFC and then this patch, and it took me like
10min to get all of it straight in my head. Maybe I was tired but
I feel like the user space developers will judge us harshly for
the current uAPI.
* Re: [PATCH net-next v8 1/2] net/tls: support setting the maximum payload size
2025-10-24 23:33 ` Jakub Kicinski
@ 2025-10-26 8:50 ` Wilfred Mallawa
2025-10-27 11:32 ` Sabrina Dubroca
1 sibling, 0 replies; 11+ messages in thread
From: Wilfred Mallawa @ 2025-10-26 8:50 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Sabrina Dubroca, netdev, linux-doc, linux-kernel, linux-kselftest,
David S . Miller, Eric Dumazet, Paolo Abeni, Jonathan Corbet,
Simon Horman, John Fastabend, Shuah Khan
On Fri, 2025-10-24 at 16:33 -0700, Jakub Kicinski wrote:
> On Fri, 24 Oct 2025 12:11:11 +1000 Wilfred Mallawa wrote:
> > In the previous record_size_limit approach for TLS 1.3, we need to
> > account for the ContentType byte. That complicates get/setsockopt()
> > and tls_get_info(): in setsockopt() for TLS 1.3 we need to subtract 1
> > from the user-provided value, and in getsockopt() we need to add 1
> > to keep the symmetry between the two (similarly in tls_get_info()). The
> > underlying assumption was that userspace passes up directly what the
> > endpoint specified as the record_size_limit.
> >
> > With this approach we don't need to worry about it and we can pass
> > the
> > responsibility to user-space as documented, which I think makes the
> > kernel code simpler.
>
> But we haven't managed to avoid that completely:
>
> + if (value < TLS_MIN_RECORD_SIZE_LIM - (tls_13 ? 1 : 0) ||
>
> I understand the motivation, the kernel code is indeed simpler.
>
> Last night I read the RFC and then this patch, and it took me like
> 10min to get all of it straight in my head. Maybe I was tired but
> I feel like the user space developers will judge us harshly for
> the current uAPI.
I am open to reverting this to `record_size_limit` in that case. I
think the only trade off is just a bit more complexity in the kernel
side for the additional checks. Does that sound good to you
Jakub/Sabrina?
Regards,
Wilfred
* Re: [PATCH net-next v8 1/2] net/tls: support setting the maximum payload size
2025-10-24 23:33 ` Jakub Kicinski
2025-10-26 8:50 ` Wilfred Mallawa
@ 2025-10-27 11:32 ` Sabrina Dubroca
2025-10-27 23:13 ` Jakub Kicinski
1 sibling, 1 reply; 11+ messages in thread
From: Sabrina Dubroca @ 2025-10-27 11:32 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Wilfred Mallawa, netdev, linux-doc, linux-kernel, linux-kselftest,
David S . Miller, Eric Dumazet, Paolo Abeni, Jonathan Corbet,
Simon Horman, John Fastabend, Shuah Khan
2025-10-24, 16:33:36 -0700, Jakub Kicinski wrote:
> On Fri, 24 Oct 2025 12:11:11 +1000 Wilfred Mallawa wrote:
> > In the previous record_size_limit approach for TLS 1.3, we need to
> > account for the ContentType byte. That complicates get/setsockopt()
> > and tls_get_info(): in setsockopt() for TLS 1.3 we need to subtract 1
> > from the user-provided value, and in getsockopt() we need to add 1
> > to keep the symmetry between the two (similarly in tls_get_info()). The
> > underlying assumption was that userspace passes up directly what the
> > endpoint specified as the record_size_limit.
> >
> > With this approach we don't need to worry about it and we can pass the
> > responsibility to user-space as documented, which I think makes the
> > kernel code simpler.
>
> But we haven't managed to avoid that completely:
>
> + if (value < TLS_MIN_RECORD_SIZE_LIM - (tls_13 ? 1 : 0) ||
We could, by taking a smaller minimum payload size than what the RFC
says (anything that allows us to make progress, maybe 8B?). ie, I
don't think we have to be as strict as rfc8449 (leave the userspace
library in charge of rejecting bogus values during negotiation of this
extension).
> I understand the motivation, the kernel code is indeed simpler.
Also more consistent: the kernel syscalls work with record payload (at
the send()/recv() level). The rest is hidden. Userspace could try an
approximation by sending max_payload-sized chunks with MSG_EOR.
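For illustration, such an approximation could look roughly like the sketch below. The helper names are hypothetical, and it assumes a socket with the TLS ULP and TX keys already set up. As said, it only approximates the limit and is not a replacement for the sockopt.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: size of the next chunk, never above max_payload. */
static inline size_t next_chunk_len(size_t remaining, size_t max_payload)
{
	return remaining < max_payload ? remaining : max_payload;
}

/*
 * Hypothetical helper: write len bytes in chunks of at most max_payload
 * bytes, passing MSG_EOR on every send() so that each chunk ends the
 * current record. Returns len on success, or -1 with errno set by send().
 */
static inline ssize_t send_in_records(int fd, const void *buf, size_t len,
				      size_t max_payload)
{
	const char *p = buf;
	size_t left = len;

	assert(max_payload > 0);

	while (left > 0) {
		size_t chunk = next_chunk_len(left, max_payload);
		ssize_t n = send(fd, p, chunk, MSG_EOR);

		if (n < 0) {
			if (errno == EINTR)
				continue; /* interrupted before sending, retry */
			return -1;
		}
		/* A short write just means the next chunk starts earlier. */
		p += n;
		left -= (size_t)n;
	}

	return (ssize_t)len;
}
```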
> Last night I read the RFC and then this patch, and it took me like
> 10min to get all of it straight in my head.
I don't find this stuff very clear either tbh, but maybe that's a
problem in the RFC itself.
> Maybe I was tired but
> I feel like the user space developers will judge us harshly for
> the current uAPI.
But userspace libraries have to do the same computations on their side
if they want to implement this RFC. They have to figure out what the
max payload size is as they're building the record, they can't just
chop off a bit at the end after filling it.
Quick grepping through gnutls got me to this:
https://gitlab.com/gnutls/gnutls/-/blob/eb3c9febfa9969792b8ac0ca56ee9fbd9b0bd7ee/lib/ext/record_size_limit.c#L104-106
So I have a slight preference for not being tied to a (kind of
confusing) RFC.
--
Sabrina
* Re: [PATCH net-next v8 1/2] net/tls: support setting the maximum payload size
2025-10-27 11:32 ` Sabrina Dubroca
@ 2025-10-27 23:13 ` Jakub Kicinski
0 siblings, 0 replies; 11+ messages in thread
From: Jakub Kicinski @ 2025-10-27 23:13 UTC (permalink / raw)
To: Sabrina Dubroca
Cc: Wilfred Mallawa, netdev, linux-doc, linux-kernel, linux-kselftest,
David S . Miller, Eric Dumazet, Paolo Abeni, Jonathan Corbet,
Simon Horman, John Fastabend, Shuah Khan
On Mon, 27 Oct 2025 12:32:02 +0100 Sabrina Dubroca wrote:
> > But we haven't managed to avoid that completely:
> >
> > + if (value < TLS_MIN_RECORD_SIZE_LIM - (tls_13 ? 1 : 0) ||
>
> We could, by taking a smaller minimum payload size than what the RFC
> says (anything that allows us to make progress, maybe 8B?). ie, I
> don't think we have to be as strict as rfc8449 (leave the userspace
> library in charge of rejecting bogus values during negotiation of this
> extension).
>
> > I understand the motivation, the kernel code is indeed simpler.
>
> Also more consistent: the kernel syscalls work with record payload (at
> the send()/recv() level). The rest is hidden. Userspace could try an
> approximation by sending max_payload-sized chunks with MSG_EOR.
>
> > Last night I read the RFC and then this patch, and it took me like
> > 10min to get all of it straight in my head.
>
> I don't find this stuff very clear either tbh, but maybe that's a
> problem in the RFC itself.
>
> > Maybe I was tired but
> > I feel like the user space developers will judge us harshly for
> > the current uAPI.
>
> But userspace libraries have to do the same computations on their side
> if they want to implement this RFC. They have to figure out what the
> max payload size is as they're building the record, they can't just
> chop off a bit at the end after filling it.
>
> Quick grepping through gnutls got me to this:
> https://gitlab.com/gnutls/gnutls/-/blob/eb3c9febfa9969792b8ac0ca56ee9fbd9b0bd7ee/lib/ext/record_size_limit.c#L104-106
>
> So I have a slight preference for not being tied to a (kind of
> confusing) RFC.
Alright :)
* Re: [PATCH net-next v8 1/2] net/tls: support setting the maximum payload size
2025-10-22 0:19 [PATCH net-next v8 1/2] net/tls: support setting the maximum payload size Wilfred Mallawa
` (2 preceding siblings ...)
2025-10-24 1:44 ` Jakub Kicinski
@ 2025-10-27 23:30 ` patchwork-bot+netdevbpf
3 siblings, 0 replies; 11+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-10-27 23:30 UTC (permalink / raw)
To: Wilfred Mallawa
Cc: netdev, linux-doc, linux-kernel, linux-kselftest, davem, edumazet,
kuba, pabeni, corbet, horms, john.fastabend, sd, shuah,
wilfred.mallawa
Hello:
This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Wed, 22 Oct 2025 10:19:36 +1000 you wrote:
> From: Wilfred Mallawa <wilfred.mallawa@wdc.com>
>
> During a handshake, an endpoint may specify a maximum record size limit.
> Currently, the kernel defaults to TLS_MAX_PAYLOAD_SIZE (16KB) for the
> maximum record size. This means that outgoing records from the kernel
> can exceed a lower size negotiated during the handshake. In such a case,
> the receiving TLS endpoint must send a fatal "record_overflow" alert [1],
> and thus the record is discarded.
>
> [...]
Here is the summary with links:
- [net-next,v8,1/2] net/tls: support setting the maximum payload size
https://git.kernel.org/netdev/net-next/c/82cb5be6ad64
- [net-next,v8,2/2] selftests: tls: add tls record_size_limit test
https://git.kernel.org/netdev/net-next/c/5f30bc470672
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html