netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC 0/4] net/tls: add support for the record size limit extension
@ 2025-07-29  2:41 Wilfred Mallawa
  2025-07-29  2:41 ` [RFC 1/4] net/handshake: get negotiated tls record size limit Wilfred Mallawa
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Wilfred Mallawa @ 2025-07-29  2:41 UTC (permalink / raw)
  To: alistair.francis, dlemoal, chuck.lever, davem, edumazet, kuba,
	pabeni, horms, donald.hunter, corbet, kbusch, axboe, hch, sagi,
	kch, borisp, john.fastabend, jlayton, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kernel-tls-handshake, netdev
  Cc: linux-kernel, linux-doc, linux-nvme, linux-nfs, Wilfred Mallawa

From: Wilfred Mallawa <wilfred.mallawa@wdc.com>

During a tls handshake, an endpoint may specify a maximum record size limit. As
specified by [1]. which allows peers to negotiate a maximum plaintext record
size during the TLS handshake. If a TLS endpoint receives a record larger
than its advertised limit, it must send a fatal "record_overflow" alert [1].
Currently, this limit is not visble to the kernel, particularly in the case
where userspace handles the handshake (tlshd/gnutls).

This series in conjunction with the respective userspace changes for tlshd [2]
and gnutls [3], adds support for the kernel the receive the negotiated record
size limit through the existing netlink communication layer, and use this
value to limit outgoing records to the size specified.

[1] https://www.rfc-editor.org/rfc/rfc8449
[2] https://github.com/oracle/ktls-utils/pull/112
[3] https://gitlab.com/gnutls/gnutls/-/merge_requests/1989

Wilfred Mallawa (4):
  net/handshake: get negotiated tls record size limit
  net/tls/tls_sw: use the record size limit specified
  nvme/host/tcp: set max record size in the tls context
  nvme/target/tcp: set max record size in the tls context

 Documentation/netlink/specs/handshake.yaml |  3 +++
 Documentation/networking/tls-handshake.rst |  8 +++++++-
 drivers/nvme/host/tcp.c                    | 18 +++++++++++++++++-
 drivers/nvme/target/tcp.c                  | 16 +++++++++++++++-
 include/net/handshake.h                    |  4 +++-
 include/net/tls.h                          |  1 +
 include/uapi/linux/handshake.h             |  1 +
 net/handshake/genl.c                       |  5 +++--
 net/handshake/tlshd.c                      | 15 +++++++++++++--
 net/sunrpc/svcsock.c                       |  4 +++-
 net/sunrpc/xprtsock.c                      |  4 +++-
 net/tls/tls_sw.c                           | 10 +++++++++-
 12 files changed, 78 insertions(+), 11 deletions(-)

-- 
2.50.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [RFC 1/4] net/handshake: get negotiated tls record size limit
  2025-07-29  2:41 [RFC 0/4] net/tls: add support for the record size limit extension Wilfred Mallawa
@ 2025-07-29  2:41 ` Wilfred Mallawa
  2025-07-29  8:07   ` Damien Le Moal
  2025-07-29  8:12   ` Hannes Reinecke
  2025-07-29  2:41 ` [RFC 2/4] net/tls/tls_sw: use the record size limit specified Wilfred Mallawa
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 14+ messages in thread
From: Wilfred Mallawa @ 2025-07-29  2:41 UTC (permalink / raw)
  To: alistair.francis, dlemoal, chuck.lever, davem, edumazet, kuba,
	pabeni, horms, donald.hunter, corbet, kbusch, axboe, hch, sagi,
	kch, borisp, john.fastabend, jlayton, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kernel-tls-handshake, netdev
  Cc: linux-kernel, linux-doc, linux-nvme, linux-nfs, Wilfred Mallawa

From: Wilfred Mallawa <wilfred.mallawa@wdc.com>

During a handshake, an endpoint may specify a maximum record size limit.
Currently, this limit is not visble to the kernel particularly in the case
where userspace handles the handshake (tlshd/gnutls). This patch adds
support for retrieving the record size limit.

This is the first step in ensuring that the kernel can respect the record
size limit imposed by the endpoint.

Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
---
 Documentation/netlink/specs/handshake.yaml |  3 +++
 Documentation/networking/tls-handshake.rst |  8 +++++++-
 drivers/nvme/host/tcp.c                    |  3 ++-
 drivers/nvme/target/tcp.c                  |  3 ++-
 include/net/handshake.h                    |  4 +++-
 include/uapi/linux/handshake.h             |  1 +
 net/handshake/genl.c                       |  5 +++--
 net/handshake/tlshd.c                      | 15 +++++++++++++--
 net/sunrpc/svcsock.c                       |  4 +++-
 net/sunrpc/xprtsock.c                      |  4 +++-
 10 files changed, 40 insertions(+), 10 deletions(-)

diff --git a/Documentation/netlink/specs/handshake.yaml b/Documentation/netlink/specs/handshake.yaml
index b934cc513e3d..35d5eb91a3f9 100644
--- a/Documentation/netlink/specs/handshake.yaml
+++ b/Documentation/netlink/specs/handshake.yaml
@@ -84,6 +84,9 @@ attribute-sets:
         name: remote-auth
         type: u32
         multi-attr: true
+      -
+        name: record-size-limit
+        type: u32
 
 operations:
   list:
diff --git a/Documentation/networking/tls-handshake.rst b/Documentation/networking/tls-handshake.rst
index 6f5ea1646a47..cd984a137779 100644
--- a/Documentation/networking/tls-handshake.rst
+++ b/Documentation/networking/tls-handshake.rst
@@ -169,7 +169,8 @@ The synopsis of this function is:
 .. code-block:: c
 
   typedef void	(*tls_done_func_t)(void *data, int status,
-                                   key_serial_t peerid);
+                                   key_serial_t peerid,
+                                   size_t tls_record_size_limit);
 
 The consumer provides a cookie in the @ta_data field of the
 tls_handshake_args structure that is returned in the @data parameter of
@@ -200,6 +201,11 @@ The @peerid parameter contains the serial number of a key containing the
 remote peer's identity or the value TLS_NO_PEERID if the session is not
 authenticated.
 
+The @tls_record_size_limit parameter, if non-zero, exposes the tls max
+record size advertised by the endpoint. Record size must not exceed this amount.
+A negative value shall indicate that the endpoint did not advertise the
+maximum record size limit.
+
 A best practice is to close and destroy the socket immediately if the
 handshake failed.
 
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index d924008c3949..65ceadb4ffed 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1673,7 +1673,8 @@ static void nvme_tcp_set_queue_io_cpu(struct nvme_tcp_queue *queue)
 		qid, queue->io_cpu);
 }
 
-static void nvme_tcp_tls_done(void *data, int status, key_serial_t pskid)
+static void nvme_tcp_tls_done(void *data, int status, key_serial_t pskid,
+			      size_t tls_record_size_limit)
 {
 	struct nvme_tcp_queue *queue = data;
 	struct nvme_tcp_ctrl *ctrl = queue->ctrl;
diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index 470bf37e5a63..60e308401a54 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -1780,7 +1780,8 @@ static int nvmet_tcp_tls_key_lookup(struct nvmet_tcp_queue *queue,
 }
 
 static void nvmet_tcp_tls_handshake_done(void *data, int status,
-					 key_serial_t peerid)
+					key_serial_t peerid,
+					size_t tls_record_size_limit)
 {
 	struct nvmet_tcp_queue *queue = data;
 
diff --git a/include/net/handshake.h b/include/net/handshake.h
index 8ebd4f9ed26e..c00b1aaa7aba 100644
--- a/include/net/handshake.h
+++ b/include/net/handshake.h
@@ -15,10 +15,12 @@ enum {
 	TLS_NO_PEERID = 0,
 	TLS_NO_CERT = 0,
 	TLS_NO_PRIVKEY = 0,
+	TLS_NO_RECORD_SIZE_LIMIT = 0,
 };
 
 typedef void	(*tls_done_func_t)(void *data, int status,
-				   key_serial_t peerid);
+				   key_serial_t peerid,
+				   size_t tls_record_size_limit);
 
 struct tls_handshake_args {
 	struct socket		*ta_sock;
diff --git a/include/uapi/linux/handshake.h b/include/uapi/linux/handshake.h
index 3d7ea58778c9..0768eb8eb415 100644
--- a/include/uapi/linux/handshake.h
+++ b/include/uapi/linux/handshake.h
@@ -54,6 +54,7 @@ enum {
 	HANDSHAKE_A_DONE_STATUS = 1,
 	HANDSHAKE_A_DONE_SOCKFD,
 	HANDSHAKE_A_DONE_REMOTE_AUTH,
+	HANDSHAKE_A_DONE_RECORD_SIZE_LIMIT,
 
 	__HANDSHAKE_A_DONE_MAX,
 	HANDSHAKE_A_DONE_MAX = (__HANDSHAKE_A_DONE_MAX - 1)
diff --git a/net/handshake/genl.c b/net/handshake/genl.c
index f55d14d7b726..44c43ce18361 100644
--- a/net/handshake/genl.c
+++ b/net/handshake/genl.c
@@ -16,10 +16,11 @@ static const struct nla_policy handshake_accept_nl_policy[HANDSHAKE_A_ACCEPT_HAN
 };
 
 /* HANDSHAKE_CMD_DONE - do */
-static const struct nla_policy handshake_done_nl_policy[HANDSHAKE_A_DONE_REMOTE_AUTH + 1] = {
+static const struct nla_policy handshake_done_nl_policy[HANDSHAKE_A_DONE_RECORD_SIZE_LIMIT + 1] = {
 	[HANDSHAKE_A_DONE_STATUS] = { .type = NLA_U32, },
 	[HANDSHAKE_A_DONE_SOCKFD] = { .type = NLA_S32, },
 	[HANDSHAKE_A_DONE_REMOTE_AUTH] = { .type = NLA_U32, },
+	[HANDSHAKE_A_DONE_RECORD_SIZE_LIMIT] = { .type = NLA_U32, },
 };
 
 /* Ops table for handshake */
@@ -35,7 +36,7 @@ static const struct genl_split_ops handshake_nl_ops[] = {
 		.cmd		= HANDSHAKE_CMD_DONE,
 		.doit		= handshake_nl_done_doit,
 		.policy		= handshake_done_nl_policy,
-		.maxattr	= HANDSHAKE_A_DONE_REMOTE_AUTH,
+		.maxattr	= HANDSHAKE_A_DONE_RECORD_SIZE_LIMIT,
 		.flags		= GENL_CMD_CAP_DO,
 	},
 };
diff --git a/net/handshake/tlshd.c b/net/handshake/tlshd.c
index d6f52839827e..7cafac6cff1f 100644
--- a/net/handshake/tlshd.c
+++ b/net/handshake/tlshd.c
@@ -26,7 +26,8 @@
 
 struct tls_handshake_req {
 	void			(*th_consumer_done)(void *data, int status,
-						    key_serial_t peerid);
+						    key_serial_t peerid,
+						    size_t tls_record_size_limit);
 	void			*th_consumer_data;
 
 	int			th_type;
@@ -39,6 +40,8 @@ struct tls_handshake_req {
 
 	unsigned int		th_num_peerids;
 	key_serial_t		th_peerid[5];
+
+	size_t			record_size_limit;
 };
 
 static struct tls_handshake_req *
@@ -55,6 +58,7 @@ tls_handshake_req_init(struct handshake_req *req,
 	treq->th_num_peerids = 0;
 	treq->th_certificate = TLS_NO_CERT;
 	treq->th_privkey = TLS_NO_PRIVKEY;
+	treq->record_size_limit = TLS_NO_RECORD_SIZE_LIMIT;
 	return treq;
 }
 
@@ -83,6 +87,13 @@ static void tls_handshake_remote_peerids(struct tls_handshake_req *treq,
 		if (i >= treq->th_num_peerids)
 			break;
 	}
+
+	nla_for_each_attr(nla, head, len, rem) {
+		if (nla_type(nla) == HANDSHAKE_A_DONE_RECORD_SIZE_LIMIT) {
+			treq->record_size_limit = nla_get_u32(nla);
+			break;
+		}
+	}
 }
 
 /**
@@ -105,7 +116,7 @@ static void tls_handshake_done(struct handshake_req *req,
 		set_bit(HANDSHAKE_F_REQ_SESSION, &req->hr_flags);
 
 	treq->th_consumer_done(treq->th_consumer_data, -status,
-			       treq->th_peerid[0]);
+			       treq->th_peerid[0], treq->record_size_limit);
 }
 
 #if IS_ENABLED(CONFIG_KEYS)
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index e1c85123b445..2014d906ff06 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -417,13 +417,15 @@ static void svc_tcp_kill_temp_xprt(struct svc_xprt *xprt)
  * @data: address of xprt to wake
  * @status: status of handshake
  * @peerid: serial number of key containing the remote peer's identity
+ * @tls_record_size_limit: Max tls_record_size_limit of the endpoint
  *
  * If a security policy is specified as an export option, we don't
  * have a specific export here to check. So we set a "TLS session
  * is present" flag on the xprt and let an upper layer enforce local
  * security policy.
  */
-static void svc_tcp_handshake_done(void *data, int status, key_serial_t peerid)
+static void svc_tcp_handshake_done(void *data, int status, key_serial_t peerid,
+				   size_t tls_record_size_limit)
 {
 	struct svc_xprt *xprt = data;
 	struct svc_sock *svsk = container_of(xprt, struct svc_sock, sk_xprt);
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 04ff66758fc3..509aa6269b0a 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -2569,9 +2569,11 @@ static int xs_tcp_tls_finish_connecting(struct rpc_xprt *lower_xprt,
  * @data: address of xprt to wake
  * @status: status of handshake
  * @peerid: serial number of key containing the remote's identity
+ * @tls_record_size_limit: Max tls_record_size_limit of the endpoint
  *
  */
-static void xs_tls_handshake_done(void *data, int status, key_serial_t peerid)
+static void xs_tls_handshake_done(void *data, int status, key_serial_t peerid,
+				  size_t tls_record_size_limit)
 {
 	struct rpc_xprt *lower_xprt = data;
 	struct sock_xprt *lower_transport =
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [RFC 2/4] net/tls/tls_sw: use the record size limit specified
  2025-07-29  2:41 [RFC 0/4] net/tls: add support for the record size limit extension Wilfred Mallawa
  2025-07-29  2:41 ` [RFC 1/4] net/handshake: get negotiated tls record size limit Wilfred Mallawa
@ 2025-07-29  2:41 ` Wilfred Mallawa
  2025-07-29  8:13   ` Damien Le Moal
  2025-07-29  2:41 ` [RFC 3/4] nvme/host/tcp: set max record size in the tls context Wilfred Mallawa
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Wilfred Mallawa @ 2025-07-29  2:41 UTC (permalink / raw)
  To: alistair.francis, dlemoal, chuck.lever, davem, edumazet, kuba,
	pabeni, horms, donald.hunter, corbet, kbusch, axboe, hch, sagi,
	kch, borisp, john.fastabend, jlayton, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kernel-tls-handshake, netdev
  Cc: linux-kernel, linux-doc, linux-nvme, linux-nfs, Wilfred Mallawa

From: Wilfred Mallawa <wilfred.mallawa@wdc.com>

Currently, for tls_sw, the kernel uses the default 16K
TLS_MAX_PAYLOAD_SIZE for records. However, if an endpoint has specified
a record size much lower than that, it is currently not respected.

This patch adds support to using the record size limit specified by an
endpoint if it has been set.

Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
---
 include/net/tls.h |  1 +
 net/tls/tls_sw.c  | 10 +++++++++-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/include/net/tls.h b/include/net/tls.h
index 857340338b69..6248beb4a6c1 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -241,6 +241,7 @@ struct tls_context {
 
 	struct scatterlist *partially_sent_record;
 	u16 partially_sent_offset;
+	u32 tls_record_size_limit;
 
 	bool splicing_pages;
 	bool pending_open_record_frags;
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index fc88e34b7f33..4c64f1436832 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -1024,6 +1024,7 @@ static int tls_sw_sendmsg_locked(struct sock *sk, struct msghdr *msg,
 	ssize_t copied = 0;
 	struct sk_msg *msg_pl, *msg_en;
 	struct tls_rec *rec;
+	u32 tls_record_size_limit;
 	int required_size;
 	int num_async = 0;
 	bool full_record;
@@ -1045,6 +1046,13 @@ static int tls_sw_sendmsg_locked(struct sock *sk, struct msghdr *msg,
 		}
 	}
 
+	if (tls_ctx->tls_record_size_limit > 0) {
+		tls_record_size_limit = min(tls_ctx->tls_record_size_limit,
+					    TLS_MAX_PAYLOAD_SIZE);
+	} else {
+		tls_record_size_limit = TLS_MAX_PAYLOAD_SIZE;
+	}
+
 	while (msg_data_left(msg)) {
 		if (sk->sk_err) {
 			ret = -sk->sk_err;
@@ -1066,7 +1074,7 @@ static int tls_sw_sendmsg_locked(struct sock *sk, struct msghdr *msg,
 		orig_size = msg_pl->sg.size;
 		full_record = false;
 		try_to_copy = msg_data_left(msg);
-		record_room = TLS_MAX_PAYLOAD_SIZE - msg_pl->sg.size;
+		record_room = tls_record_size_limit - msg_pl->sg.size;
 		if (try_to_copy >= record_room) {
 			try_to_copy = record_room;
 			full_record = true;
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [RFC 3/4] nvme/host/tcp: set max record size in the tls context
  2025-07-29  2:41 [RFC 0/4] net/tls: add support for the record size limit extension Wilfred Mallawa
  2025-07-29  2:41 ` [RFC 1/4] net/handshake: get negotiated tls record size limit Wilfred Mallawa
  2025-07-29  2:41 ` [RFC 2/4] net/tls/tls_sw: use the record size limit specified Wilfred Mallawa
@ 2025-07-29  2:41 ` Wilfred Mallawa
  2025-07-29  8:16   ` Hannes Reinecke
  2025-07-29  2:41 ` [RFC 4/4] nvme/target/tcp: " Wilfred Mallawa
  2025-07-29 13:37 ` [RFC 0/4] net/tls: add support for the record size limit extension Chuck Lever
  4 siblings, 1 reply; 14+ messages in thread
From: Wilfred Mallawa @ 2025-07-29  2:41 UTC (permalink / raw)
  To: alistair.francis, dlemoal, chuck.lever, davem, edumazet, kuba,
	pabeni, horms, donald.hunter, corbet, kbusch, axboe, hch, sagi,
	kch, borisp, john.fastabend, jlayton, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kernel-tls-handshake, netdev
  Cc: linux-kernel, linux-doc, linux-nvme, linux-nfs, Wilfred Mallawa

From: Wilfred Mallawa <wilfred.mallawa@wdc.com>

During a tls handshake, a host may specify the tls record size limit
using the tls "record_size_limit" extension. Currently, the NVMe TCP
host driver does not specify this value to the tls layer.

This patch adds support for setting the tls record size limit into the
tls context, such that outgoing records may not exceed this limit
specified by the endpoint.

Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
---
 drivers/nvme/host/tcp.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 65ceadb4ffed..84a55736f269 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1677,6 +1677,7 @@ static void nvme_tcp_tls_done(void *data, int status, key_serial_t pskid,
 			      size_t tls_record_size_limit)
 {
 	struct nvme_tcp_queue *queue = data;
+	struct tls_context *tls_ctx = tls_get_ctx(queue->sock->sk);
 	struct nvme_tcp_ctrl *ctrl = queue->ctrl;
 	int qid = nvme_tcp_queue_id(queue);
 	struct key *tls_key;
@@ -1700,6 +1701,20 @@ static void nvme_tcp_tls_done(void *data, int status, key_serial_t pskid,
 			ctrl->ctrl.tls_pskid = key_serial(tls_key);
 		key_put(tls_key);
 		queue->tls_err = 0;
+
+		/* Endpoint has specified a maximum tls record size limit */
+		if (tls_record_size_limit > TLS_MAX_PAYLOAD_SIZE) {
+			dev_err(ctrl->ctrl.device,
+				"queue %d: invalid tls max record size limit: %zd\n",
+				nvme_tcp_queue_id(queue), tls_record_size_limit);
+			queue->tls_err = -EINVAL;
+			goto out_complete;
+		} else if (tls_record_size_limit > 0) {
+			tls_ctx->tls_record_size_limit = (u32)tls_record_size_limit;
+			dev_dbg(ctrl->ctrl.device,
+				"queue %d: target specified tls_record_size_limit %u\n",
+				nvme_tcp_queue_id(queue), tls_ctx->tls_record_size_limit);
+		}
 	}
 
 out_complete:
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [RFC 4/4] nvme/target/tcp: set max record size in the tls context
  2025-07-29  2:41 [RFC 0/4] net/tls: add support for the record size limit extension Wilfred Mallawa
                   ` (2 preceding siblings ...)
  2025-07-29  2:41 ` [RFC 3/4] nvme/host/tcp: set max record size in the tls context Wilfred Mallawa
@ 2025-07-29  2:41 ` Wilfred Mallawa
  2025-07-29  8:16   ` Hannes Reinecke
  2025-07-29 13:37 ` [RFC 0/4] net/tls: add support for the record size limit extension Chuck Lever
  4 siblings, 1 reply; 14+ messages in thread
From: Wilfred Mallawa @ 2025-07-29  2:41 UTC (permalink / raw)
  To: alistair.francis, dlemoal, chuck.lever, davem, edumazet, kuba,
	pabeni, horms, donald.hunter, corbet, kbusch, axboe, hch, sagi,
	kch, borisp, john.fastabend, jlayton, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kernel-tls-handshake, netdev
  Cc: linux-kernel, linux-doc, linux-nvme, linux-nfs, Wilfred Mallawa

From: Wilfred Mallawa <wilfred.mallawa@wdc.com>

During a tls handshake, a host may specify the tls record size limit
using the tls "record_size_limit" extension. Currently, the NVMe target
driver does not specify this value to the tls layer.

This patch adds support for setting the tls record size limit into the
tls context, such that outgoing records may not exceed this limit
specified by the endpoint.

Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
---
 drivers/nvme/target/tcp.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index 60e308401a54..f2ab473ea5de 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -1784,6 +1784,7 @@ static void nvmet_tcp_tls_handshake_done(void *data, int status,
 					size_t tls_record_size_limit)
 {
 	struct nvmet_tcp_queue *queue = data;
+	struct tls_context *tls_ctx = tls_get_ctx(queue->sock->sk);
 
 	pr_debug("queue %d: TLS handshake done, key %x, status %d\n",
 		 queue->idx, peerid, status);
@@ -1795,6 +1796,17 @@ static void nvmet_tcp_tls_handshake_done(void *data, int status,
 	if (!status) {
 		queue->tls_pskid = peerid;
 		queue->state = NVMET_TCP_Q_CONNECTING;
+
+		/* Endpoint has specified a maximum tls record size limit */
+		if (tls_record_size_limit > TLS_MAX_PAYLOAD_SIZE) {
+			pr_err("queue %d: invalid tls max record size limit: %zu\n",
+			       queue->idx, tls_record_size_limit);
+			queue->state = NVMET_TCP_Q_FAILED;
+		} else if (tls_record_size_limit > 0) {
+			tls_ctx->tls_record_size_limit = (u32)tls_record_size_limit;
+			pr_debug("queue %d: host specified tls max record size %u\n",
+				 queue->idx, tls_ctx->tls_record_size_limit);
+		}
 	} else
 		queue->state = NVMET_TCP_Q_FAILED;
 	spin_unlock_bh(&queue->state_lock);
@@ -1808,6 +1820,7 @@ static void nvmet_tcp_tls_handshake_done(void *data, int status,
 		nvmet_tcp_schedule_release_queue(queue);
 	else
 		nvmet_tcp_set_queue_sock(queue);
+
 	kref_put(&queue->kref, nvmet_tcp_release_queue);
 }
 
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [RFC 1/4] net/handshake: get negotiated tls record size limit
  2025-07-29  2:41 ` [RFC 1/4] net/handshake: get negotiated tls record size limit Wilfred Mallawa
@ 2025-07-29  8:07   ` Damien Le Moal
  2025-07-29  8:12   ` Hannes Reinecke
  1 sibling, 0 replies; 14+ messages in thread
From: Damien Le Moal @ 2025-07-29  8:07 UTC (permalink / raw)
  To: Wilfred Mallawa, alistair.francis, chuck.lever, davem, edumazet,
	kuba, pabeni, horms, donald.hunter, corbet, kbusch, axboe, hch,
	sagi, kch, borisp, john.fastabend, jlayton, neil, okorniev,
	Dai.Ngo, tom, trondmy, anna, kernel-tls-handshake, netdev
  Cc: linux-kernel, linux-doc, linux-nvme, linux-nfs, Wilfred Mallawa

On 7/29/25 11:41, Wilfred Mallawa wrote:
> From: Wilfred Mallawa <wilfred.mallawa@wdc.com>
> 
> During a handshake, an endpoint may specify a maximum record size limit.
> Currently, this limit is not visble to the kernel particularly in the case
> where userspace handles the handshake (tlshd/gnutls). This patch adds
> support for retrieving the record size limit.
> 
> This is the first step in ensuring that the kernel can respect the record
> size limit imposed by the endpoint.
> 
> Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
> ---
>  Documentation/netlink/specs/handshake.yaml |  3 +++
>  Documentation/networking/tls-handshake.rst |  8 +++++++-
>  drivers/nvme/host/tcp.c                    |  3 ++-
>  drivers/nvme/target/tcp.c                  |  3 ++-
>  include/net/handshake.h                    |  4 +++-
>  include/uapi/linux/handshake.h             |  1 +
>  net/handshake/genl.c                       |  5 +++--
>  net/handshake/tlshd.c                      | 15 +++++++++++++--
>  net/sunrpc/svcsock.c                       |  4 +++-
>  net/sunrpc/xprtsock.c                      |  4 +++-
>  10 files changed, 40 insertions(+), 10 deletions(-)
> 
> diff --git a/Documentation/netlink/specs/handshake.yaml b/Documentation/netlink/specs/handshake.yaml
> index b934cc513e3d..35d5eb91a3f9 100644
> --- a/Documentation/netlink/specs/handshake.yaml
> +++ b/Documentation/netlink/specs/handshake.yaml
> @@ -84,6 +84,9 @@ attribute-sets:
>          name: remote-auth
>          type: u32
>          multi-attr: true
> +      -
> +        name: record-size-limit
> +        type: u32
>  
>  operations:
>    list:
> diff --git a/Documentation/networking/tls-handshake.rst b/Documentation/networking/tls-handshake.rst
> index 6f5ea1646a47..cd984a137779 100644
> --- a/Documentation/networking/tls-handshake.rst
> +++ b/Documentation/networking/tls-handshake.rst
> @@ -169,7 +169,8 @@ The synopsis of this function is:
>  .. code-block:: c
>  
>    typedef void	(*tls_done_func_t)(void *data, int status,
> -                                   key_serial_t peerid);
> +                                   key_serial_t peerid,
> +                                   size_t tls_record_size_limit);
>  
>  The consumer provides a cookie in the @ta_data field of the
>  tls_handshake_args structure that is returned in the @data parameter of
> @@ -200,6 +201,11 @@ The @peerid parameter contains the serial number of a key containing the
>  remote peer's identity or the value TLS_NO_PEERID if the session is not
>  authenticated.
>  
> +The @tls_record_size_limit parameter, if non-zero, exposes the tls max
> +record size advertised by the endpoint. Record size must not exceed this amount.
> +A negative value shall indicate that the endpoint did not advertise the
> +maximum record size limit.

size_t cannot be negative... Did you mean:
"A value of 0 (TLS_NO_RECORD_SIZE_LIMIT)..."

Also note that even if the endpoint does not advertize a record sie limit, we
still have one (16K was it ?). So I think that the name TLS_NO_RECORD_SIZE_LIMIT
is a little misleading.

> +
>  A best practice is to close and destroy the socket immediately if the
>  handshake failed.

[...]

> diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
> index e1c85123b445..2014d906ff06 100644
> --- a/net/sunrpc/svcsock.c
> +++ b/net/sunrpc/svcsock.c
> @@ -417,13 +417,15 @@ static void svc_tcp_kill_temp_xprt(struct svc_xprt *xprt)
>   * @data: address of xprt to wake
>   * @status: status of handshake
>   * @peerid: serial number of key containing the remote peer's identity
> + * @tls_record_size_limit: Max tls_record_size_limit of the endpoint

Please make a proper sentence to describe tls_record_size_limit instead of
repeating that argument name.

>   *
>   * If a security policy is specified as an export option, we don't
>   * have a specific export here to check. So we set a "TLS session
>   * is present" flag on the xprt and let an upper layer enforce local
>   * security policy.
>   */
> -static void svc_tcp_handshake_done(void *data, int status, key_serial_t peerid)
> +static void svc_tcp_handshake_done(void *data, int status, key_serial_t peerid,
> +				   size_t tls_record_size_limit)
>  {
>  	struct svc_xprt *xprt = data;
>  	struct svc_sock *svsk = container_of(xprt, struct svc_sock, sk_xprt);
> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> index 04ff66758fc3..509aa6269b0a 100644
> --- a/net/sunrpc/xprtsock.c
> +++ b/net/sunrpc/xprtsock.c
> @@ -2569,9 +2569,11 @@ static int xs_tcp_tls_finish_connecting(struct rpc_xprt *lower_xprt,
>   * @data: address of xprt to wake
>   * @status: status of handshake
>   * @peerid: serial number of key containing the remote's identity
> + * @tls_record_size_limit: Max tls_record_size_limit of the endpoint

Same here.

>   *
>   */
> -static void xs_tls_handshake_done(void *data, int status, key_serial_t peerid)
> +static void xs_tls_handshake_done(void *data, int status, key_serial_t peerid,
> +				  size_t tls_record_size_limit)
>  {
>  	struct rpc_xprt *lower_xprt = data;
>  	struct sock_xprt *lower_transport =


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC 1/4] net/handshake: get negotiated tls record size limit
  2025-07-29  2:41 ` [RFC 1/4] net/handshake: get negotiated tls record size limit Wilfred Mallawa
  2025-07-29  8:07   ` Damien Le Moal
@ 2025-07-29  8:12   ` Hannes Reinecke
  2025-08-07  0:03     ` Wilfred Mallawa
  1 sibling, 1 reply; 14+ messages in thread
From: Hannes Reinecke @ 2025-07-29  8:12 UTC (permalink / raw)
  To: Wilfred Mallawa, alistair.francis, dlemoal, chuck.lever, davem,
	edumazet, kuba, pabeni, horms, donald.hunter, corbet, kbusch,
	axboe, hch, sagi, kch, borisp, john.fastabend, jlayton, neil,
	okorniev, Dai.Ngo, tom, trondmy, anna, kernel-tls-handshake,
	netdev
  Cc: linux-kernel, linux-doc, linux-nvme, linux-nfs, Wilfred Mallawa

On 7/29/25 04:41, Wilfred Mallawa wrote:
> From: Wilfred Mallawa <wilfred.mallawa@wdc.com>
> 
> During a handshake, an endpoint may specify a maximum record size limit.
> Currently, this limit is not visble to the kernel particularly in the case
> where userspace handles the handshake (tlshd/gnutls). This patch adds
> support for retrieving the record size limit.
> 
> This is the first step in ensuring that the kernel can respect the record
> size limit imposed by the endpoint.
> 
> Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
> ---
>   Documentation/netlink/specs/handshake.yaml |  3 +++
>   Documentation/networking/tls-handshake.rst |  8 +++++++-
>   drivers/nvme/host/tcp.c                    |  3 ++-
>   drivers/nvme/target/tcp.c                  |  3 ++-
>   include/net/handshake.h                    |  4 +++-
>   include/uapi/linux/handshake.h             |  1 +
>   net/handshake/genl.c                       |  5 +++--
>   net/handshake/tlshd.c                      | 15 +++++++++++++--
>   net/sunrpc/svcsock.c                       |  4 +++-
>   net/sunrpc/xprtsock.c                      |  4 +++-
>   10 files changed, 40 insertions(+), 10 deletions(-)
> 
> diff --git a/Documentation/netlink/specs/handshake.yaml b/Documentation/netlink/specs/handshake.yaml
> index b934cc513e3d..35d5eb91a3f9 100644
> --- a/Documentation/netlink/specs/handshake.yaml
> +++ b/Documentation/netlink/specs/handshake.yaml
> @@ -84,6 +84,9 @@ attribute-sets:
>           name: remote-auth
>           type: u32
>           multi-attr: true
> +      -
> +        name: record-size-limit
> +        type: u32
>   
>   operations:
>     list:
> diff --git a/Documentation/networking/tls-handshake.rst b/Documentation/networking/tls-handshake.rst
> index 6f5ea1646a47..cd984a137779 100644
> --- a/Documentation/networking/tls-handshake.rst
> +++ b/Documentation/networking/tls-handshake.rst
> @@ -169,7 +169,8 @@ The synopsis of this function is:
>   .. code-block:: c
>   
>     typedef void	(*tls_done_func_t)(void *data, int status,
> -                                   key_serial_t peerid);
> +                                   key_serial_t peerid,
> +                                   size_t tls_record_size_limit);
>   
>   The consumer provides a cookie in the @ta_data field of the
>   tls_handshake_args structure that is returned in the @data parameter of

Why is this exposed to the TLS handshake consumer?
The TLS record size is surely required for handling and processing TLS
streams in net/tls, but the consumer of that (eg NVMe-TCP, NFS)
are blissfully unaware that there _are_ such things like TLS records.
And they really should keep it that way.

So I'd really _not_ expose that to any ULP and keep it internal to
the TLS layer.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC 2/4] net/tls/tls_sw: use the record size limit specified
  2025-07-29  2:41 ` [RFC 2/4] net/tls/tls_sw: use the record size limit specified Wilfred Mallawa
@ 2025-07-29  8:13   ` Damien Le Moal
  2025-08-07  0:04     ` Wilfred Mallawa
  0 siblings, 1 reply; 14+ messages in thread
From: Damien Le Moal @ 2025-07-29  8:13 UTC (permalink / raw)
  To: Wilfred Mallawa, alistair.francis, chuck.lever, davem, edumazet,
	kuba, pabeni, horms, donald.hunter, corbet, kbusch, axboe, hch,
	sagi, kch, borisp, john.fastabend, jlayton, neil, okorniev,
	Dai.Ngo, tom, trondmy, anna, kernel-tls-handshake, netdev
  Cc: linux-kernel, linux-doc, linux-nvme, linux-nfs, Wilfred Mallawa

On 7/29/25 11:41, Wilfred Mallawa wrote:
> From: Wilfred Mallawa <wilfred.mallawa@wdc.com>
> 
> Currently, for tls_sw, the kernel uses the default 16K
> TLS_MAX_PAYLOAD_SIZE for records. However, if an endpoint has specified
> a record size much lower than that, it is currently not respected.

Remove "much". Lower is lower and we have to respect it, even if it is 1B.

> This patch adds support to using the record size limit specified by an
> endpoint if it has been set.

s/to using/for using

> 
> Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>

> @@ -1045,6 +1046,13 @@ static int tls_sw_sendmsg_locked(struct sock *sk, struct msghdr *msg,
>  		}
>  	}
>  
> +	if (tls_ctx->tls_record_size_limit > 0) {
> +		tls_record_size_limit = min(tls_ctx->tls_record_size_limit,
> +					    TLS_MAX_PAYLOAD_SIZE);
> +	} else {
> +		tls_record_size_limit = TLS_MAX_PAYLOAD_SIZE;
> +	}

You can simplify this with:

	tls_record_size_limit =
		min_not_zero(tls_ctx->tls_record_size_limit,
			     TLS_MAX_PAYLOAD_SIZE);

> +
>  	while (msg_data_left(msg)) {
>  		if (sk->sk_err) {
>  			ret = -sk->sk_err;
> @@ -1066,7 +1074,7 @@ static int tls_sw_sendmsg_locked(struct sock *sk, struct msghdr *msg,
>  		orig_size = msg_pl->sg.size;
>  		full_record = false;
>  		try_to_copy = msg_data_left(msg);
> -		record_room = TLS_MAX_PAYLOAD_SIZE - msg_pl->sg.size;
> +		record_room = tls_record_size_limit - msg_pl->sg.size;
>  		if (try_to_copy >= record_room) {
>  			try_to_copy = record_room;
>  			full_record = true;


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC 3/4] nvme/host/tcp: set max record size in the tls context
  2025-07-29  2:41 ` [RFC 3/4] nvme/host/tcp: set max record size in the tls context Wilfred Mallawa
@ 2025-07-29  8:16   ` Hannes Reinecke
  0 siblings, 0 replies; 14+ messages in thread
From: Hannes Reinecke @ 2025-07-29  8:16 UTC (permalink / raw)
  To: Wilfred Mallawa, alistair.francis, dlemoal, chuck.lever, davem,
	edumazet, kuba, pabeni, horms, donald.hunter, corbet, kbusch,
	axboe, hch, sagi, kch, borisp, john.fastabend, jlayton, neil,
	okorniev, Dai.Ngo, tom, trondmy, anna, kernel-tls-handshake,
	netdev
  Cc: linux-kernel, linux-doc, linux-nvme, linux-nfs, Wilfred Mallawa

On 7/29/25 04:41, Wilfred Mallawa wrote:
> From: Wilfred Mallawa <wilfred.mallawa@wdc.com>
> 
> During a tls handshake, a host may specify the tls record size limit
> using the tls "record_size_limit" extension. Currently, the NVMe TCP
> host driver does not specify this value to the tls layer.
> 
> This patch adds support for setting the tls record size limit into the
> tls context, such that outgoing records may not exceed this limit
> specified by the endpoint.
> 
> Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
> ---
>   drivers/nvme/host/tcp.c | 15 +++++++++++++++
>   1 file changed, 15 insertions(+)
> 
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index 65ceadb4ffed..84a55736f269 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -1677,6 +1677,7 @@ static void nvme_tcp_tls_done(void *data, int status, key_serial_t pskid,
>   			      size_t tls_record_size_limit)
>   {
>   	struct nvme_tcp_queue *queue = data;
> +	struct tls_context *tls_ctx = tls_get_ctx(queue->sock->sk);
>   	struct nvme_tcp_ctrl *ctrl = queue->ctrl;
>   	int qid = nvme_tcp_queue_id(queue);
>   	struct key *tls_key;
> @@ -1700,6 +1701,20 @@ static void nvme_tcp_tls_done(void *data, int status, key_serial_t pskid,
>   			ctrl->ctrl.tls_pskid = key_serial(tls_key);
>   		key_put(tls_key);
>   		queue->tls_err = 0;
> +
> +		/* Endpoint has specified a maximum tls record size limit */
> +		if (tls_record_size_limit > TLS_MAX_PAYLOAD_SIZE) {
> +			dev_err(ctrl->ctrl.device,
> +				"queue %d: invalid tls max record size limit: %zd\n",
> +				nvme_tcp_queue_id(queue), tls_record_size_limit);
> +			queue->tls_err = -EINVAL;
> +			goto out_complete;
> +		} else if (tls_record_size_limit > 0) {
> +			tls_ctx->tls_record_size_limit = (u32)tls_record_size_limit;
> +			dev_dbg(ctrl->ctrl.device,
> +				"queue %d: target specified tls_record_size_limit %u\n",
> +				nvme_tcp_queue_id(queue), tls_ctx->tls_record_size_limit);
> +		}
>   	}
>   
>   out_complete:

Why do we need to do that?
This value is never used in that driver, so why can't the TLS layer 
handle it?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC 4/4] nvme/target/tcp: set max record size in the tls context
  2025-07-29  2:41 ` [RFC 4/4] nvme/target/tcp: " Wilfred Mallawa
@ 2025-07-29  8:16   ` Hannes Reinecke
  0 siblings, 0 replies; 14+ messages in thread
From: Hannes Reinecke @ 2025-07-29  8:16 UTC (permalink / raw)
  To: Wilfred Mallawa, alistair.francis, dlemoal, chuck.lever, davem,
	edumazet, kuba, pabeni, horms, donald.hunter, corbet, kbusch,
	axboe, hch, sagi, kch, borisp, john.fastabend, jlayton, neil,
	okorniev, Dai.Ngo, tom, trondmy, anna, kernel-tls-handshake,
	netdev
  Cc: linux-kernel, linux-doc, linux-nvme, linux-nfs, Wilfred Mallawa

On 7/29/25 04:41, Wilfred Mallawa wrote:
> From: Wilfred Mallawa <wilfred.mallawa@wdc.com>
> 
> During a tls handshake, a host may specify the tls record size limit
> using the tls "record_size_limit" extension. Currently, the NVMe target
> driver does not specify this value to the tls layer.
> 
> This patch adds support for setting the tls record size limit into the
> tls context, such that outgoing records may not exceed this limit
> specified by the endpoint.
> 
> Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
> ---
>   drivers/nvme/target/tcp.c | 13 +++++++++++++
>   1 file changed, 13 insertions(+)
> 
> diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
> index 60e308401a54..f2ab473ea5de 100644
> --- a/drivers/nvme/target/tcp.c
> +++ b/drivers/nvme/target/tcp.c
> @@ -1784,6 +1784,7 @@ static void nvmet_tcp_tls_handshake_done(void *data, int status,
>   					size_t tls_record_size_limit)
>   {
>   	struct nvmet_tcp_queue *queue = data;
> +	struct tls_context *tls_ctx = tls_get_ctx(queue->sock->sk);
>   
>   	pr_debug("queue %d: TLS handshake done, key %x, status %d\n",
>   		 queue->idx, peerid, status);
> @@ -1795,6 +1796,17 @@ static void nvmet_tcp_tls_handshake_done(void *data, int status,
>   	if (!status) {
>   		queue->tls_pskid = peerid;
>   		queue->state = NVMET_TCP_Q_CONNECTING;
> +
> +		/* Endpoint has specified a maximum tls record size limit */
> +		if (tls_record_size_limit > TLS_MAX_PAYLOAD_SIZE) {
> +			pr_err("queue %d: invalid tls max record size limit: %zu\n",
> +			       queue->idx, tls_record_size_limit);
> +			queue->state = NVMET_TCP_Q_FAILED;
> +		} else if (tls_record_size_limit > 0) {
> +			tls_ctx->tls_record_size_limit = (u32)tls_record_size_limit;
> +			pr_debug("queue %d: host specified tls max record size %u\n",
> +				 queue->idx, tls_ctx->tls_record_size_limit);
> +		}
>   	} else
>   		queue->state = NVMET_TCP_Q_FAILED;
>   	spin_unlock_bh(&queue->state_lock);
> @@ -1808,6 +1820,7 @@ static void nvmet_tcp_tls_handshake_done(void *data, int status,
>   		nvmet_tcp_schedule_release_queue(queue);
>   	else
>   		nvmet_tcp_set_queue_sock(queue);
> +
>   	kref_put(&queue->kref, nvmet_tcp_release_queue);
>   }
>   
Again, why?

I'd rather have the TLS layer handling that internally.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC 0/4] net/tls: add support for the record size limit extension
  2025-07-29  2:41 [RFC 0/4] net/tls: add support for the record size limit extension Wilfred Mallawa
                   ` (3 preceding siblings ...)
  2025-07-29  2:41 ` [RFC 4/4] nvme/target/tcp: " Wilfred Mallawa
@ 2025-07-29 13:37 ` Chuck Lever
  2025-08-07  0:14   ` Wilfred Mallawa
  4 siblings, 1 reply; 14+ messages in thread
From: Chuck Lever @ 2025-07-29 13:37 UTC (permalink / raw)
  To: Wilfred Mallawa, alistair.francis, dlemoal, davem, edumazet, kuba,
	pabeni, horms, donald.hunter, corbet, kbusch, axboe, hch, sagi,
	kch, borisp, john.fastabend, jlayton, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kernel-tls-handshake, netdev
  Cc: linux-kernel, linux-doc, linux-nvme, linux-nfs, Wilfred Mallawa

Hi Wilfred -

On 7/28/25 10:41 PM, Wilfred Mallawa wrote:
> From: Wilfred Mallawa <wilfred.mallawa@wdc.com>
> 
> During a tls handshake, an endpoint may specify a maximum record size limit. As
> specified by [1]. which allows peers to negotiate a maximum plaintext record
> size during the TLS handshake. If a TLS endpoint receives a record larger
> than its advertised limit, it must send a fatal "record_overflow" alert [1].
> Currently, this limit is not visble to the kernel, particularly in the case
> where userspace handles the handshake (tlshd/gnutls).

This paragraph essentially says "The spec says we can, so I'm
implementing it". Generally we don't implement spec features just
because they are there.

What we reviewers need instead is a problem statement. What is not
working for you, and why is this the best way to solve it?


> This series in conjunction with the respective userspace changes for tlshd [2]
> and gnutls [3], adds support for the kernel the receive the negotiated record
> size limit through the existing netlink communication layer, and use this
> value to limit outgoing records to the size specified.

As Hannes asked elsewhere, why is it up to the TLS consumer to be
aware of this limit? Given the description here, it sounds to me
like something that should be handled for all consumers by the TLS
layer.


> [1] https://www.rfc-editor.org/rfc/rfc8449
> [2] https://github.com/oracle/ktls-utils/pull/112
> [3] https://gitlab.com/gnutls/gnutls/-/merge_requests/1989
> 
> Wilfred Mallawa (4):
>   net/handshake: get negotiated tls record size limit
>   net/tls/tls_sw: use the record size limit specified
>   nvme/host/tcp: set max record size in the tls context
>   nvme/target/tcp: set max record size in the tls context
> 
>  Documentation/netlink/specs/handshake.yaml |  3 +++
>  Documentation/networking/tls-handshake.rst |  8 +++++++-
>  drivers/nvme/host/tcp.c                    | 18 +++++++++++++++++-
>  drivers/nvme/target/tcp.c                  | 16 +++++++++++++++-
>  include/net/handshake.h                    |  4 +++-
>  include/net/tls.h                          |  1 +
>  include/uapi/linux/handshake.h             |  1 +
>  net/handshake/genl.c                       |  5 +++--
>  net/handshake/tlshd.c                      | 15 +++++++++++++--
>  net/sunrpc/svcsock.c                       |  4 +++-
>  net/sunrpc/xprtsock.c                      |  4 +++-
>  net/tls/tls_sw.c                           | 10 +++++++++-
>  12 files changed, 78 insertions(+), 11 deletions(-)
> 


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC 1/4] net/handshake: get negotiated tls record size limit
  2025-07-29  8:12   ` Hannes Reinecke
@ 2025-08-07  0:03     ` Wilfred Mallawa
  0 siblings, 0 replies; 14+ messages in thread
From: Wilfred Mallawa @ 2025-08-07  0:03 UTC (permalink / raw)
  To: Hannes Reinecke, alistair.francis, dlemoal, chuck.lever, davem,
	edumazet, kuba, pabeni, horms, donald.hunter, corbet, kbusch,
	axboe, hch, sagi, kch, borisp, john.fastabend, jlayton, neil,
	okorniev, Dai.Ngo, tom, trondmy, anna, kernel-tls-handshake,
	netdev
  Cc: linux-kernel, linux-doc, linux-nvme, linux-nfs

On Tue, 2025-07-29 at 10:12 +0200, Hannes Reinecke wrote:
> 
[snip]...
> > diff --git a/Documentation/networking/tls-handshake.rst
> > b/Documentation/networking/tls-handshake.rst
> > index 6f5ea1646a47..cd984a137779 100644
> > --- a/Documentation/networking/tls-handshake.rst
> > +++ b/Documentation/networking/tls-handshake.rst
> > @@ -169,7 +169,8 @@ The synopsis of this function is:
> >   .. code-block:: c
> >   
> >     typedef void	(*tls_done_func_t)(void *data, int status,
> > -                                   key_serial_t peerid);
> > +                                   key_serial_t peerid,
> > +                                   size_t tls_record_size_limit);
> >   
> >   The consumer provides a cookie in the @ta_data field of the
> >   tls_handshake_args structure that is returned in the @data
> > parameter of
> 
> Why is this exposed to the TLS handshake consumer?
> The TLS record size is surely required for handling and processing
> TLS
> streams in net/tls, but the consumer of that (eg NVMe-TCP, NFS)
> are blissfully unaware that there _are_ such things like TLS records.
> And they really should keep it that way.
> 
> So I'd really _not_ expose that to any ULP and keep it internal to
> the TLS layer.
> 
Hey Hannes,

Sorry for the delay in response, and thanks for the feedback! Yeah I
agree it was a bad approach from me. It definitely makes more sense to
keep things in the TLS layer. I will try to address this in V2.

Regards,
Wilfred

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC 2/4] net/tls/tls_sw: use the record size limit specified
  2025-07-29  8:13   ` Damien Le Moal
@ 2025-08-07  0:04     ` Wilfred Mallawa
  0 siblings, 0 replies; 14+ messages in thread
From: Wilfred Mallawa @ 2025-08-07  0:04 UTC (permalink / raw)
  To: Damien Le Moal, alistair.francis, chuck.lever, davem, edumazet,
	kuba, pabeni, horms, donald.hunter, corbet, kbusch, axboe, hch,
	sagi, kch, borisp, john.fastabend, jlayton, neil, okorniev,
	Dai.Ngo, tom, trondmy, anna, kernel-tls-handshake, netdev
  Cc: linux-kernel, linux-doc, linux-nvme, linux-nfs

On Tue, 2025-07-29 at 17:13 +0900, Damien Le Moal wrote:
> On 7/29/25 11:41, Wilfred Mallawa wrote:
> > From: Wilfred Mallawa <wilfred.mallawa@wdc.com>
> > 
> > Currently, for tls_sw, the kernel uses the default 16K
> > TLS_MAX_PAYLOAD_SIZE for records. However, if an endpoint has
> > specified
> > a record size much lower than that, it is currently not respected.
> 
> Remove "much". Lower is lower and we have to respect it, even if it
> is 1B.
> 
> > This patch adds support to using the record size limit specified by
> > an
> > endpoint if it has been set.
> 
> s/to using/for using
> 
> > 
> > Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
> 
> > @@ -1045,6 +1046,13 @@ static int tls_sw_sendmsg_locked(struct sock
> > *sk, struct msghdr *msg,
> >  		}
> >  	}
> >  
> > +	if (tls_ctx->tls_record_size_limit > 0) {
> > +		tls_record_size_limit = min(tls_ctx-
> > >tls_record_size_limit,
> > +					    TLS_MAX_PAYLOAD_SIZE);
> > +	} else {
> > +		tls_record_size_limit = TLS_MAX_PAYLOAD_SIZE;
> > +	}
> 
> You can simplify this with:
> 
> 	tls_record_size_limit =
> 		min_not_zero(tls_ctx->tls_record_size_limit,
> 			     TLS_MAX_PAYLOAD_SIZE);
> 
Hey Damien,

Thanks for the feedback! Will amend for V2.

Regards,
Wilfred
> 
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC 0/4] net/tls: add support for the record size limit extension
  2025-07-29 13:37 ` [RFC 0/4] net/tls: add support for the record size limit extension Chuck Lever
@ 2025-08-07  0:14   ` Wilfred Mallawa
  0 siblings, 0 replies; 14+ messages in thread
From: Wilfred Mallawa @ 2025-08-07  0:14 UTC (permalink / raw)
  To: Chuck Lever, alistair.francis, dlemoal, davem, edumazet, kuba,
	pabeni, horms, donald.hunter, corbet, kbusch, axboe, hch, sagi,
	kch, borisp, john.fastabend, jlayton, neil, okorniev, Dai.Ngo,
	tom, trondmy, anna, kernel-tls-handshake, netdev
  Cc: linux-kernel, linux-doc, linux-nvme, linux-nfs

Hey Chuck,

On Tue, 2025-07-29 at 09:37 -0400, Chuck Lever wrote:
> Hi Wilfred -
> 
> On 7/28/25 10:41 PM, Wilfred Mallawa wrote:
> > From: Wilfred Mallawa <wilfred.mallawa@wdc.com>
> > 
> > During a tls handshake, an endpoint may specify a maximum record
> > size limit. As
> > specified by [1]. which allows peers to negotiate a maximum
> > plaintext record
> > size during the TLS handshake. If a TLS endpoint receives a record
> > larger
> > than its advertised limit, it must send a fatal "record_overflow"
> > alert [1].
> > Currently, this limit is not visble to the kernel, particularly in
> > the case
> > where userspace handles the handshake (tlshd/gnutls).
> 
> This paragraph essentially says "The spec says we can, so I'm
> implementing it". Generally we don't implement spec features just
> because they are there.
> 
> What we reviewers need instead is a problem statement. What is not
> working for you, and why is this the best way to solve it?
Thanks for the feedback.

Essentially, this is to support upcoming WD NVMe-TCP controller that
implements TLS support. These devices require record size negotiation
as they support a maximum record size less than the current kernel
default. I will add this to my V2 series in more detail.
> 
> 
> > This series in conjunction with the respective userspace changes
> > for tlshd [2]
> > and gnutls [3], adds support for the kernel the receive the
> > negotiated record
> > size limit through the existing netlink communication layer, and
> > use this
> > value to limit outgoing records to the size specified.
> 
> As Hannes asked elsewhere, why is it up to the TLS consumer to be
> aware of this limit? Given the description here, it sounds to me
> like something that should be handled for all consumers by the TLS
> layer.
Yeah great point, I didn't think it through too well. I will address
this in V2 and have the record size limit implemented in the TLS layer
without involving ULPs.

Regards,
Wilfred


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2025-08-07  0:14 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-29  2:41 [RFC 0/4] net/tls: add support for the record size limit extension Wilfred Mallawa
2025-07-29  2:41 ` [RFC 1/4] net/handshake: get negotiated tls record size limit Wilfred Mallawa
2025-07-29  8:07   ` Damien Le Moal
2025-07-29  8:12   ` Hannes Reinecke
2025-08-07  0:03     ` Wilfred Mallawa
2025-07-29  2:41 ` [RFC 2/4] net/tls/tls_sw: use the record size limit specified Wilfred Mallawa
2025-07-29  8:13   ` Damien Le Moal
2025-08-07  0:04     ` Wilfred Mallawa
2025-07-29  2:41 ` [RFC 3/4] nvme/host/tcp: set max record size in the tls context Wilfred Mallawa
2025-07-29  8:16   ` Hannes Reinecke
2025-07-29  2:41 ` [RFC 4/4] nvme/target/tcp: " Wilfred Mallawa
2025-07-29  8:16   ` Hannes Reinecke
2025-07-29 13:37 ` [RFC 0/4] net/tls: add support for the record size limit extension Chuck Lever
2025-08-07  0:14   ` Wilfred Mallawa

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).