[PATCH v5 0/6] nvme-tcp: Support receiving KeyUpdate requests

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH v5 0/6] nvme-tcp: Support receiving KeyUpdate requests
@ 2025-11-12  4:27 alistair23
  2025-11-12  4:27 ` [PATCH v5 1/6] net/handshake: Store the key serial number on completion alistair23
                   ` (5 more replies)
  0 siblings, 6 replies; 33+ messages in thread
From: alistair23 @ 2025-11-12  4:27 UTC (permalink / raw)
  To: chuck.lever, hare, kernel-tls-handshake, netdev, linux-kernel,
	linux-doc, linux-nvme, linux-nfs
  Cc: kbusch, axboe, hch, sagi, kch, hare, alistair23, Alistair Francis

From: Alistair Francis <alistair.francis@wdc.com>

The TLS 1.3 specification allows the TLS client or server to send a
KeyUpdate. This is generally used when the sequence is about to
overflow or after a certain amount of bytes have been encrypted.

The TLS spec doesn't mandate the conditions though, so a KeyUpdate
can be sent by the TLS client or server at any time. This includes
when running NVMe-OF over a TLS 1.3 connection.

As such Linux should be able to handle a KeyUpdate event, as the
other NVMe side could initiate a KeyUpdate.

Upcoming WD NVMe-TCP hardware controllers implement TLS support
and send KeyUpdate requests.

This series builds on top of the existing TLS EKEYEXPIRED work,
which already detects a KeyUpdate request. We can now pass that
information up to the NVMe layer (target and host) and then pass
it up to userspace.

Userspace (ktls-utils) will need to save the connection state
in the keyring during the initial handshake. The kernel then
provides the key serial back to userspace when handling a
KeyUpdate. Userspace can use this to restore the connection
information and then update the keys, this final process
is similar to the initial handshake.

This series depends on the recvmsg() kernel patch:
https://lore.kernel.org/linux-nvme/2cbe1350-0bf5-4487-be33-1d317cb73acf@suse.de/T/#mf56283228ae6c93e37dfbf1c0f6263910217cd80

ktls-utils (tlshd) userspace patches are available at:
https://lore.kernel.org/kernel-tls-handshake/CAKmqyKNpFhPtM8HAkgRMKQA8_N7AgoeqaSTe2=0spPnb+Oz2ng@mail.gmail.com/T/#mb277f5c998282666d0f41cc02f4abf516fcc4e9c

Link: https://datatracker.ietf.org/doc/html/rfc8446#section-4.6.3

Based-on: 2cbe1350-0bf5-4487-be33-1d317cb73acf@suse.de

v5:
 - Cleanup code flow for nvme-tcp
 - When using recvmsg in the host code first check for MSG_CTRUNC
   in the msg_flags returned from recvmsg() and use that to determine
   if it's a control message
 - Drop clientkeyupdaterequest and serverkeyupdaterequest
v4:
 - Don't stop the keep-alive timer
 - Remove any support for sending a KeyUpdate
 - Add tls_client_keyupdate_psk()' and 'tls_server_keyupdate_psk()'
 - Code cleanups
 - Change order of patches
v3:
 - Rebase on the recvmsg() workflow patch
 - Add debugfs support for the host
 - Don't cancel an ongoing request
 - Ensure a request is destructed on completion
v2:
 - Change "key-serial" to "session-id"
 - Fix reported build failures
 - Drop tls_clear_err() function
 - Stop keep alive timer during KeyUpdate
 - Drop handshake message decoding in the NVMe layer

Alistair Francis (6):
  net/handshake: Store the key serial number on completion
  net/handshake: Define handshake_sk_destruct_req
  net/handshake: Ensure the request is destructed on completion
  net/handshake: Support KeyUpdate message types
  nvme-tcp: Support KeyUpdate
  nvmet-tcp: Support KeyUpdate

 Documentation/netlink/specs/handshake.yaml |  20 +-
 Documentation/networking/tls-handshake.rst |   1 +
 drivers/nvme/host/tcp.c                    | 103 ++++++++--
 drivers/nvme/target/tcp.c                  | 216 ++++++++++++++-------
 include/net/handshake.h                    |  10 +-
 include/uapi/linux/handshake.h             |  12 ++
 net/handshake/genl.c                       |   5 +-
 net/handshake/request.c                    |  18 ++
 net/handshake/tlshd.c                      |  96 ++++++++-
 net/sunrpc/svcsock.c                       |   4 +-
 net/sunrpc/xprtsock.c                      |   4 +-
 11 files changed, 397 insertions(+), 92 deletions(-)

-- 
2.51.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v5 1/6] net/handshake: Store the key serial number on completion
  2025-11-12  4:27 [PATCH v5 0/6] nvme-tcp: Support receiving KeyUpdate requests alistair23
@ 2025-11-12  4:27 ` alistair23
  2025-11-12 15:02   ` Chuck Lever
  2025-11-30 22:21   ` Sagi Grimberg
  2025-11-12  4:27 ` [PATCH v5 2/6] net/handshake: Define handshake_sk_destruct_req alistair23
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 33+ messages in thread
From: alistair23 @ 2025-11-12  4:27 UTC (permalink / raw)
  To: chuck.lever, hare, kernel-tls-handshake, netdev, linux-kernel,
	linux-doc, linux-nvme, linux-nfs
  Cc: kbusch, axboe, hch, sagi, kch, hare, alistair23, Alistair Francis

From: Alistair Francis <alistair.francis@wdc.com>

Allow userspace to include a key serial number when completing a
handshake with the HANDSHAKE_CMD_DONE command.

We then store this serial number and will provide it back to userspace
in the future. This allows userspace to save data to the keyring and
then restore that data later.

This will be used to support the TLS KeyUpdate operation, as now
userspace can resume information about a established session.

Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Hannes Reincke <hare@suse.de>
---
v5:
 - Change name to "handshake session ID"
v4:
 - No change
v3:
 - No change
v2:
 - Change "key-serial" to "session-id"

 Documentation/netlink/specs/handshake.yaml |  4 ++++
 Documentation/networking/tls-handshake.rst |  1 +
 drivers/nvme/host/tcp.c                    |  3 ++-
 drivers/nvme/target/tcp.c                  |  3 ++-
 include/net/handshake.h                    |  4 +++-
 include/uapi/linux/handshake.h             |  1 +
 net/handshake/genl.c                       |  5 +++--
 net/handshake/tlshd.c                      | 15 +++++++++++++--
 net/sunrpc/svcsock.c                       |  4 +++-
 net/sunrpc/xprtsock.c                      |  4 +++-
 10 files changed, 35 insertions(+), 9 deletions(-)

diff --git a/Documentation/netlink/specs/handshake.yaml b/Documentation/netlink/specs/handshake.yaml
index 95c3fade7a8d..a273bc74d26f 100644
--- a/Documentation/netlink/specs/handshake.yaml
+++ b/Documentation/netlink/specs/handshake.yaml
@@ -87,6 +87,9 @@ attribute-sets:
         name: remote-auth
         type: u32
         multi-attr: true
+      -
+        name: session-id
+        type: u32
 
 operations:
   list:
@@ -123,6 +126,7 @@ operations:
             - status
             - sockfd
             - remote-auth
+            - session-id
 
 mcast-groups:
   list:
diff --git a/Documentation/networking/tls-handshake.rst b/Documentation/networking/tls-handshake.rst
index 6f5ea1646a47..0941b81dde5c 100644
--- a/Documentation/networking/tls-handshake.rst
+++ b/Documentation/networking/tls-handshake.rst
@@ -60,6 +60,7 @@ fills in a structure that contains the parameters of the request:
         key_serial_t    ta_my_privkey;
         unsigned int    ta_num_peerids;
         key_serial_t    ta_my_peerids[5];
+        key_serial_t    handshake_session_id;
   };
 
 The @ta_sock field references an open and connected socket. The consumer
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 9058ea64b89c..024d02248831 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1694,7 +1694,8 @@ static void nvme_tcp_set_queue_io_cpu(struct nvme_tcp_queue *queue)
 		qid, queue->io_cpu);
 }
 
-static void nvme_tcp_tls_done(void *data, int status, key_serial_t pskid)
+static void nvme_tcp_tls_done(void *data, int status, key_serial_t pskid,
+			      key_serial_t handshake_session_id)
 {
 	struct nvme_tcp_queue *queue = data;
 	struct nvme_tcp_ctrl *ctrl = queue->ctrl;
diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index 470bf37e5a63..7f8516892359 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -1780,7 +1780,8 @@ static int nvmet_tcp_tls_key_lookup(struct nvmet_tcp_queue *queue,
 }
 
 static void nvmet_tcp_tls_handshake_done(void *data, int status,
-					 key_serial_t peerid)
+					 key_serial_t peerid,
+					 key_serial_t handshake_session_id)
 {
 	struct nvmet_tcp_queue *queue = data;
 
diff --git a/include/net/handshake.h b/include/net/handshake.h
index 8ebd4f9ed26e..68d7f89e431a 100644
--- a/include/net/handshake.h
+++ b/include/net/handshake.h
@@ -18,7 +18,8 @@ enum {
 };
 
 typedef void	(*tls_done_func_t)(void *data, int status,
-				   key_serial_t peerid);
+				   key_serial_t peerid,
+				   key_serial_t handshake_session_id);
 
 struct tls_handshake_args {
 	struct socket		*ta_sock;
@@ -31,6 +32,7 @@ struct tls_handshake_args {
 	key_serial_t		ta_my_privkey;
 	unsigned int		ta_num_peerids;
 	key_serial_t		ta_my_peerids[5];
+	key_serial_t		handshake_session_id;
 };
 
 int tls_client_hello_anon(const struct tls_handshake_args *args, gfp_t flags);
diff --git a/include/uapi/linux/handshake.h b/include/uapi/linux/handshake.h
index 662e7de46c54..b68ffbaa5f31 100644
--- a/include/uapi/linux/handshake.h
+++ b/include/uapi/linux/handshake.h
@@ -55,6 +55,7 @@ enum {
 	HANDSHAKE_A_DONE_STATUS = 1,
 	HANDSHAKE_A_DONE_SOCKFD,
 	HANDSHAKE_A_DONE_REMOTE_AUTH,
+	HANDSHAKE_A_DONE_SESSION_ID,
 
 	__HANDSHAKE_A_DONE_MAX,
 	HANDSHAKE_A_DONE_MAX = (__HANDSHAKE_A_DONE_MAX - 1)
diff --git a/net/handshake/genl.c b/net/handshake/genl.c
index f55d14d7b726..6cdce7e5dbc0 100644
--- a/net/handshake/genl.c
+++ b/net/handshake/genl.c
@@ -16,10 +16,11 @@ static const struct nla_policy handshake_accept_nl_policy[HANDSHAKE_A_ACCEPT_HAN
 };
 
 /* HANDSHAKE_CMD_DONE - do */
-static const struct nla_policy handshake_done_nl_policy[HANDSHAKE_A_DONE_REMOTE_AUTH + 1] = {
+static const struct nla_policy handshake_done_nl_policy[HANDSHAKE_A_DONE_SESSION_ID + 1] = {
 	[HANDSHAKE_A_DONE_STATUS] = { .type = NLA_U32, },
 	[HANDSHAKE_A_DONE_SOCKFD] = { .type = NLA_S32, },
 	[HANDSHAKE_A_DONE_REMOTE_AUTH] = { .type = NLA_U32, },
+	[HANDSHAKE_A_DONE_SESSION_ID] = { .type = NLA_U32, },
 };
 
 /* Ops table for handshake */
@@ -35,7 +36,7 @@ static const struct genl_split_ops handshake_nl_ops[] = {
 		.cmd		= HANDSHAKE_CMD_DONE,
 		.doit		= handshake_nl_done_doit,
 		.policy		= handshake_done_nl_policy,
-		.maxattr	= HANDSHAKE_A_DONE_REMOTE_AUTH,
+		.maxattr	= HANDSHAKE_A_DONE_SESSION_ID,
 		.flags		= GENL_CMD_CAP_DO,
 	},
 };
diff --git a/net/handshake/tlshd.c b/net/handshake/tlshd.c
index 081093dfd553..85c5fed690c0 100644
--- a/net/handshake/tlshd.c
+++ b/net/handshake/tlshd.c
@@ -26,7 +26,8 @@
 
 struct tls_handshake_req {
 	void			(*th_consumer_done)(void *data, int status,
-						    key_serial_t peerid);
+						    key_serial_t peerid,
+						    key_serial_t handshake_session_id);
 	void			*th_consumer_data;
 
 	int			th_type;
@@ -39,6 +40,8 @@ struct tls_handshake_req {
 
 	unsigned int		th_num_peerids;
 	key_serial_t		th_peerid[5];
+
+	key_serial_t		handshake_session_id;
 };
 
 static struct tls_handshake_req *
@@ -55,6 +58,7 @@ tls_handshake_req_init(struct handshake_req *req,
 	treq->th_num_peerids = 0;
 	treq->th_certificate = TLS_NO_CERT;
 	treq->th_privkey = TLS_NO_PRIVKEY;
+	treq->handshake_session_id = TLS_NO_PRIVKEY;
 	return treq;
 }
 
@@ -83,6 +87,13 @@ static void tls_handshake_remote_peerids(struct tls_handshake_req *treq,
 		if (i >= treq->th_num_peerids)
 			break;
 	}
+
+	nla_for_each_attr(nla, head, len, rem) {
+		if (nla_type(nla) == HANDSHAKE_A_DONE_SESSION_ID) {
+			treq->handshake_session_id = nla_get_u32(nla);
+			break;
+		}
+	}
 }
 
 /**
@@ -105,7 +116,7 @@ static void tls_handshake_done(struct handshake_req *req,
 		set_bit(HANDSHAKE_F_REQ_SESSION, &req->hr_flags);
 
 	treq->th_consumer_done(treq->th_consumer_data, -status,
-			       treq->th_peerid[0]);
+			       treq->th_peerid[0], treq->handshake_session_id);
 }
 
 #if IS_ENABLED(CONFIG_KEYS)
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 7b90abc5cf0e..2401b4c757f6 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -444,13 +444,15 @@ static void svc_tcp_kill_temp_xprt(struct svc_xprt *xprt)
  * @data: address of xprt to wake
  * @status: status of handshake
  * @peerid: serial number of key containing the remote peer's identity
+ * @handshake_session_id: serial number of the userspace session ID
  *
  * If a security policy is specified as an export option, we don't
  * have a specific export here to check. So we set a "TLS session
  * is present" flag on the xprt and let an upper layer enforce local
  * security policy.
  */
-static void svc_tcp_handshake_done(void *data, int status, key_serial_t peerid)
+static void svc_tcp_handshake_done(void *data, int status, key_serial_t peerid,
+				   key_serial_t handshake_session_id)
 {
 	struct svc_xprt *xprt = data;
 	struct svc_sock *svsk = container_of(xprt, struct svc_sock, sk_xprt);
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 3aa987e7f072..5c6e7543f293 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -2589,9 +2589,11 @@ static int xs_tcp_tls_finish_connecting(struct rpc_xprt *lower_xprt,
  * @data: address of xprt to wake
  * @status: status of handshake
  * @peerid: serial number of key containing the remote's identity
+ * @handshake_session_id: serial number of the userspace session ID
  *
  */
-static void xs_tls_handshake_done(void *data, int status, key_serial_t peerid)
+static void xs_tls_handshake_done(void *data, int status, key_serial_t peerid,
+				  key_serial_t handshake_session_id)
 {
 	struct rpc_xprt *lower_xprt = data;
 	struct sock_xprt *lower_transport =
-- 
2.51.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v5 2/6] net/handshake: Define handshake_sk_destruct_req
  2025-11-12  4:27 [PATCH v5 0/6] nvme-tcp: Support receiving KeyUpdate requests alistair23
  2025-11-12  4:27 ` [PATCH v5 1/6] net/handshake: Store the key serial number on completion alistair23
@ 2025-11-12  4:27 ` alistair23
  2025-11-12 15:47   ` Chuck Lever
  2025-11-12  4:27 ` [PATCH v5 3/6] net/handshake: Ensure the request is destructed on completion alistair23
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 33+ messages in thread
From: alistair23 @ 2025-11-12  4:27 UTC (permalink / raw)
  To: chuck.lever, hare, kernel-tls-handshake, netdev, linux-kernel,
	linux-doc, linux-nvme, linux-nfs
  Cc: kbusch, axboe, hch, sagi, kch, hare, alistair23, Alistair Francis

From: Alistair Francis <alistair.francis@wdc.com>

Define a `handshake_sk_destruct_req()` function to allow the destruction
of the handshake req.

This is required to avoid hash conflicts when handshake_req_hash_add()
is called as part of submitting the KeyUpdate request.

Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
v5:
 - No change
v4:
 - No change
v3:
 - New patch

 net/handshake/request.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/net/handshake/request.c b/net/handshake/request.c
index 274d2c89b6b2..0d1c91c80478 100644
--- a/net/handshake/request.c
+++ b/net/handshake/request.c
@@ -98,6 +98,22 @@ static void handshake_sk_destruct(struct sock *sk)
 		sk_destruct(sk);
 }
 
+/**
+ * handshake_sk_destruct_req - destroy an existing request
+ * @sk: socket on which there is an existing request
+ */
+static void handshake_sk_destruct_req(struct sock *sk)
+{
+	struct handshake_req *req;
+
+	req = handshake_req_hash_lookup(sk);
+	if (!req)
+		return;
+
+	trace_handshake_destruct(sock_net(sk), req, sk);
+	handshake_req_destroy(req);
+}
+
 /**
  * handshake_req_alloc - Allocate a handshake request
  * @proto: security protocol
-- 
2.51.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v5 3/6] net/handshake: Ensure the request is destructed on completion
  2025-11-12  4:27 [PATCH v5 0/6] nvme-tcp: Support receiving KeyUpdate requests alistair23
  2025-11-12  4:27 ` [PATCH v5 1/6] net/handshake: Store the key serial number on completion alistair23
  2025-11-12  4:27 ` [PATCH v5 2/6] net/handshake: Define handshake_sk_destruct_req alistair23
@ 2025-11-12  4:27 ` alistair23
  2025-11-12  4:27 ` [PATCH v5 4/6] net/handshake: Support KeyUpdate message types alistair23
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 33+ messages in thread
From: alistair23 @ 2025-11-12  4:27 UTC (permalink / raw)
  To: chuck.lever, hare, kernel-tls-handshake, netdev, linux-kernel,
	linux-doc, linux-nvme, linux-nfs
  Cc: kbusch, axboe, hch, sagi, kch, hare, alistair23, Alistair Francis

From: Alistair Francis <alistair.francis@wdc.com>

To avoid future handshake_req_hash_add() calls failing with EEXIST when
performing a KeyUpdate let's make sure the old request is destructed
as part of the completion.

Until now a handshake would only be destroyed on a failure or when a
sock is freed (via the sk_destruct function pointer).
handshake_complete() is only called on errors, not a successful
handshake so it doesn't remove the request.

Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
v5:
 - No change
v4:
 - Improve description in commit message
v3:
 - New patch

 net/handshake/request.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/handshake/request.c b/net/handshake/request.c
index 0d1c91c80478..194725a8aaca 100644
--- a/net/handshake/request.c
+++ b/net/handshake/request.c
@@ -311,6 +311,8 @@ void handshake_complete(struct handshake_req *req, unsigned int status,
 		/* Handshake request is no longer pending */
 		sock_put(sk);
 	}
+
+	handshake_sk_destruct_req(sk);
 }
 EXPORT_SYMBOL_IF_KUNIT(handshake_complete);
 
-- 
2.51.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v5 4/6] net/handshake: Support KeyUpdate message types
  2025-11-12  4:27 [PATCH v5 0/6] nvme-tcp: Support receiving KeyUpdate requests alistair23
                   ` (2 preceding siblings ...)
  2025-11-12  4:27 ` [PATCH v5 3/6] net/handshake: Ensure the request is destructed on completion alistair23
@ 2025-11-12  4:27 ` alistair23
  2025-11-12 15:49   ` Chuck Lever
                     ` (2 more replies)
  2025-11-12  4:27 ` [PATCH v5 5/6] nvme-tcp: Support KeyUpdate alistair23
  2025-11-12  4:27 ` [PATCH v5 6/6] nvmet-tcp: " alistair23
  5 siblings, 3 replies; 33+ messages in thread
From: alistair23 @ 2025-11-12  4:27 UTC (permalink / raw)
  To: chuck.lever, hare, kernel-tls-handshake, netdev, linux-kernel,
	linux-doc, linux-nvme, linux-nfs
  Cc: kbusch, axboe, hch, sagi, kch, hare, alistair23, Alistair Francis

From: Alistair Francis <alistair.francis@wdc.com>

When reporting the msg-type to userspace let's also support reporting
KeyUpdate events. This supports reporting a client/server event and if
the other side requested a KeyUpdateRequest.

Link: https://datatracker.ietf.org/doc/html/rfc8446#section-4.6.3
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
v5:
 - Drop clientkeyupdaterequest and serverkeyupdaterequest
v4:
 - Don't overload existing functions, instead create new ones
v3:
 - Fixup yamllint and kernel-doc failures

 Documentation/netlink/specs/handshake.yaml | 16 ++++-
 drivers/nvme/host/tcp.c                    | 15 +++-
 drivers/nvme/target/tcp.c                  | 10 ++-
 include/net/handshake.h                    |  6 ++
 include/uapi/linux/handshake.h             | 11 +++
 net/handshake/tlshd.c                      | 83 +++++++++++++++++++++-
 6 files changed, 133 insertions(+), 8 deletions(-)

diff --git a/Documentation/netlink/specs/handshake.yaml b/Documentation/netlink/specs/handshake.yaml
index a273bc74d26f..2f77216c8ddf 100644
--- a/Documentation/netlink/specs/handshake.yaml
+++ b/Documentation/netlink/specs/handshake.yaml
@@ -21,12 +21,18 @@ definitions:
     type: enum
     name: msg-type
     value-start: 0
-    entries: [unspec, clienthello, serverhello]
+    entries: [unspec, clienthello, serverhello, clientkeyupdate,
+              serverkeyupdate]
   -
     type: enum
     name: auth
     value-start: 0
     entries: [unspec, unauth, psk, x509]
+  -
+    type: enum
+    name: key-update-type
+    value-start: 0
+    entries: [unspec, send, received, received_request_update]
 
 attribute-sets:
   -
@@ -74,6 +80,13 @@ attribute-sets:
       -
         name: keyring
         type: u32
+      -
+        name: key-update-request
+        type: u32
+        enum: key-update-type
+      -
+        name: session-id
+        type: u32
   -
     name: done
     attributes:
@@ -116,6 +129,7 @@ operations:
             - certificate
             - peername
             - keyring
+            - session-id
     -
       name: done
       doc: Handler reports handshake completion
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 024d02248831..4797a4532b0d 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -20,6 +20,7 @@
 #include <linux/iov_iter.h>
 #include <net/busy_poll.h>
 #include <trace/events/sock.h>
+#include <uapi/linux/handshake.h>
 
 #include "nvme.h"
 #include "fabrics.h"
@@ -206,6 +207,10 @@ static struct workqueue_struct *nvme_tcp_wq;
 static const struct blk_mq_ops nvme_tcp_mq_ops;
 static const struct blk_mq_ops nvme_tcp_admin_mq_ops;
 static int nvme_tcp_try_send(struct nvme_tcp_queue *queue);
+static int nvme_tcp_start_tls(struct nvme_ctrl *nctrl,
+			      struct nvme_tcp_queue *queue,
+			      key_serial_t pskid,
+			      enum handshake_key_update_type keyupdate);
 
 static inline struct nvme_tcp_ctrl *to_tcp_ctrl(struct nvme_ctrl *ctrl)
 {
@@ -1729,7 +1734,8 @@ static void nvme_tcp_tls_done(void *data, int status, key_serial_t pskid,
 
 static int nvme_tcp_start_tls(struct nvme_ctrl *nctrl,
 			      struct nvme_tcp_queue *queue,
-			      key_serial_t pskid)
+			      key_serial_t pskid,
+			      enum handshake_key_update_type keyupdate)
 {
 	int qid = nvme_tcp_queue_id(queue);
 	int ret;
@@ -1751,7 +1757,10 @@ static int nvme_tcp_start_tls(struct nvme_ctrl *nctrl,
 	args.ta_timeout_ms = tls_handshake_timeout * 1000;
 	queue->tls_err = -EOPNOTSUPP;
 	init_completion(&queue->tls_complete);
-	ret = tls_client_hello_psk(&args, GFP_KERNEL);
+	if (keyupdate == HANDSHAKE_KEY_UPDATE_TYPE_UNSPEC)
+		ret = tls_client_hello_psk(&args, GFP_KERNEL);
+	else
+		ret = tls_client_keyupdate_psk(&args, GFP_KERNEL, keyupdate);
 	if (ret) {
 		dev_err(nctrl->device, "queue %d: failed to start TLS: %d\n",
 			qid, ret);
@@ -1901,7 +1910,7 @@ static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, int qid,
 
 	/* If PSKs are configured try to start TLS */
 	if (nvme_tcp_tls_configured(nctrl) && pskid) {
-		ret = nvme_tcp_start_tls(nctrl, queue, pskid);
+		ret = nvme_tcp_start_tls(nctrl, queue, pskid, HANDSHAKE_KEY_UPDATE_TYPE_UNSPEC);
 		if (ret)
 			goto err_init_connect;
 	}
diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index 7f8516892359..818efdeccef1 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -1833,7 +1833,8 @@ static void nvmet_tcp_tls_handshake_timeout(struct work_struct *w)
 	kref_put(&queue->kref, nvmet_tcp_release_queue);
 }
 
-static int nvmet_tcp_tls_handshake(struct nvmet_tcp_queue *queue)
+static int nvmet_tcp_tls_handshake(struct nvmet_tcp_queue *queue,
+				   enum handshake_key_update_type keyupdate)
 {
 	int ret = -EOPNOTSUPP;
 	struct tls_handshake_args args;
@@ -1852,7 +1853,10 @@ static int nvmet_tcp_tls_handshake(struct nvmet_tcp_queue *queue)
 	args.ta_keyring = key_serial(queue->port->nport->keyring);
 	args.ta_timeout_ms = tls_handshake_timeout * 1000;
 
-	ret = tls_server_hello_psk(&args, GFP_KERNEL);
+	if (keyupdate == HANDSHAKE_KEY_UPDATE_TYPE_UNSPEC)
+		ret = tls_server_hello_psk(&args, GFP_KERNEL);
+	else
+		ret = tls_server_keyupdate_psk(&args, GFP_KERNEL, keyupdate);
 	if (ret) {
 		kref_put(&queue->kref, nvmet_tcp_release_queue);
 		pr_err("failed to start TLS, err=%d\n", ret);
@@ -1934,7 +1938,7 @@ static void nvmet_tcp_alloc_queue(struct nvmet_tcp_port *port,
 		sk->sk_data_ready = port->data_ready;
 		write_unlock_bh(&sk->sk_callback_lock);
 		if (!nvmet_tcp_try_peek_pdu(queue)) {
-			if (!nvmet_tcp_tls_handshake(queue))
+			if (!nvmet_tcp_tls_handshake(queue, HANDSHAKE_KEY_UPDATE_TYPE_UNSPEC))
 				return;
 			/* TLS handshake failed, terminate the connection */
 			goto out_destroy_sq;
diff --git a/include/net/handshake.h b/include/net/handshake.h
index 68d7f89e431a..f5a249327bf6 100644
--- a/include/net/handshake.h
+++ b/include/net/handshake.h
@@ -10,6 +10,8 @@
 #ifndef _NET_HANDSHAKE_H
 #define _NET_HANDSHAKE_H
 
+#include <uapi/linux/handshake.h>
+
 enum {
 	TLS_NO_KEYRING = 0,
 	TLS_NO_PEERID = 0,
@@ -38,8 +40,12 @@ struct tls_handshake_args {
 int tls_client_hello_anon(const struct tls_handshake_args *args, gfp_t flags);
 int tls_client_hello_x509(const struct tls_handshake_args *args, gfp_t flags);
 int tls_client_hello_psk(const struct tls_handshake_args *args, gfp_t flags);
+int tls_client_keyupdate_psk(const struct tls_handshake_args *args, gfp_t flags,
+			     enum handshake_key_update_type keyupdate);
 int tls_server_hello_x509(const struct tls_handshake_args *args, gfp_t flags);
 int tls_server_hello_psk(const struct tls_handshake_args *args, gfp_t flags);
+int tls_server_keyupdate_psk(const struct tls_handshake_args *args, gfp_t flags,
+			     enum handshake_key_update_type keyupdate);
 
 bool tls_handshake_cancel(struct sock *sk);
 void tls_handshake_close(struct socket *sock);
diff --git a/include/uapi/linux/handshake.h b/include/uapi/linux/handshake.h
index b68ffbaa5f31..483815a064f0 100644
--- a/include/uapi/linux/handshake.h
+++ b/include/uapi/linux/handshake.h
@@ -19,6 +19,8 @@ enum handshake_msg_type {
 	HANDSHAKE_MSG_TYPE_UNSPEC,
 	HANDSHAKE_MSG_TYPE_CLIENTHELLO,
 	HANDSHAKE_MSG_TYPE_SERVERHELLO,
+	HANDSHAKE_MSG_TYPE_CLIENTKEYUPDATE,
+	HANDSHAKE_MSG_TYPE_SERVERKEYUPDATE,
 };
 
 enum handshake_auth {
@@ -28,6 +30,13 @@ enum handshake_auth {
 	HANDSHAKE_AUTH_X509,
 };
 
+enum handshake_key_update_type {
+	HANDSHAKE_KEY_UPDATE_TYPE_UNSPEC,
+	HANDSHAKE_KEY_UPDATE_TYPE_SEND,
+	HANDSHAKE_KEY_UPDATE_TYPE_RECEIVED,
+	HANDSHAKE_KEY_UPDATE_TYPE_RECEIVED_REQUEST_UPDATE,
+};
+
 enum {
 	HANDSHAKE_A_X509_CERT = 1,
 	HANDSHAKE_A_X509_PRIVKEY,
@@ -46,6 +55,8 @@ enum {
 	HANDSHAKE_A_ACCEPT_CERTIFICATE,
 	HANDSHAKE_A_ACCEPT_PEERNAME,
 	HANDSHAKE_A_ACCEPT_KEYRING,
+	HANDSHAKE_A_ACCEPT_KEY_UPDATE_REQUEST,
+	HANDSHAKE_A_ACCEPT_SESSION_ID,
 
 	__HANDSHAKE_A_ACCEPT_MAX,
 	HANDSHAKE_A_ACCEPT_MAX = (__HANDSHAKE_A_ACCEPT_MAX - 1)
diff --git a/net/handshake/tlshd.c b/net/handshake/tlshd.c
index 85c5fed690c0..91d2bb515b7d 100644
--- a/net/handshake/tlshd.c
+++ b/net/handshake/tlshd.c
@@ -41,6 +41,7 @@ struct tls_handshake_req {
 	unsigned int		th_num_peerids;
 	key_serial_t		th_peerid[5];
 
+	unsigned int		th_key_update_request;
 	key_serial_t		handshake_session_id;
 };
 
@@ -58,7 +59,8 @@ tls_handshake_req_init(struct handshake_req *req,
 	treq->th_num_peerids = 0;
 	treq->th_certificate = TLS_NO_CERT;
 	treq->th_privkey = TLS_NO_PRIVKEY;
-	treq->handshake_session_id = TLS_NO_PRIVKEY;
+	treq->handshake_session_id = args->handshake_session_id;
+
 	return treq;
 }
 
@@ -265,6 +267,16 @@ static int tls_handshake_accept(struct handshake_req *req,
 		break;
 	}
 
+	ret = nla_put_u32(msg, HANDSHAKE_A_ACCEPT_SESSION_ID,
+			  treq->handshake_session_id);
+	if (ret < 0)
+		goto out_cancel;
+
+	ret = nla_put_u32(msg, HANDSHAKE_A_ACCEPT_KEY_UPDATE_REQUEST,
+			  treq->th_key_update_request);
+	if (ret < 0)
+		goto out_cancel;
+
 	genlmsg_end(msg, hdr);
 	return genlmsg_reply(msg, info);
 
@@ -372,6 +384,44 @@ int tls_client_hello_psk(const struct tls_handshake_args *args, gfp_t flags)
 }
 EXPORT_SYMBOL(tls_client_hello_psk);
 
+/**
+ * tls_client_keyupdate_psk - request a PSK-based TLS handshake on a socket
+ * @args: socket and handshake parameters for this request
+ * @flags: memory allocation control flags
+ * @keyupdate: specifies the type of KeyUpdate operation
+ *
+ * Return values:
+ *   %0: Handshake request enqueue; ->done will be called when complete
+ *   %-EINVAL: Wrong number of local peer IDs
+ *   %-ESRCH: No user agent is available
+ *   %-ENOMEM: Memory allocation failed
+ */
+int tls_client_keyupdate_psk(const struct tls_handshake_args *args, gfp_t flags,
+			     enum handshake_key_update_type keyupdate)
+{
+	struct tls_handshake_req *treq;
+	struct handshake_req *req;
+	unsigned int i;
+
+	if (!args->ta_num_peerids ||
+	    args->ta_num_peerids > ARRAY_SIZE(treq->th_peerid))
+		return -EINVAL;
+
+	req = handshake_req_alloc(&tls_handshake_proto, flags);
+	if (!req)
+		return -ENOMEM;
+	treq = tls_handshake_req_init(req, args);
+	treq->th_type = HANDSHAKE_MSG_TYPE_CLIENTKEYUPDATE;
+	treq->th_key_update_request = keyupdate;
+	treq->th_auth_mode = HANDSHAKE_AUTH_PSK;
+	treq->th_num_peerids = args->ta_num_peerids;
+	for (i = 0; i < args->ta_num_peerids; i++)
+		treq->th_peerid[i] = args->ta_my_peerids[i];
+
+	return handshake_req_submit(args->ta_sock, req, flags);
+}
+EXPORT_SYMBOL(tls_client_keyupdate_psk);
+
 /**
  * tls_server_hello_x509 - request a server TLS handshake on a socket
  * @args: socket and handshake parameters for this request
@@ -428,6 +478,37 @@ int tls_server_hello_psk(const struct tls_handshake_args *args, gfp_t flags)
 }
 EXPORT_SYMBOL(tls_server_hello_psk);
 
+/**
+ * tls_server_keyupdate_psk - request a server TLS KeyUpdate on a socket
+ * @args: socket and handshake parameters for this request
+ * @flags: memory allocation control flags
+ * @keyupdate: specifies the type of KeyUpdate operation
+ *
+ * Return values:
+ *   %0: Handshake request enqueue; ->done will be called when complete
+ *   %-ESRCH: No user agent is available
+ *   %-ENOMEM: Memory allocation failed
+ */
+int tls_server_keyupdate_psk(const struct tls_handshake_args *args, gfp_t flags,
+			     enum handshake_key_update_type keyupdate)
+{
+	struct tls_handshake_req *treq;
+	struct handshake_req *req;
+
+	req = handshake_req_alloc(&tls_handshake_proto, flags);
+	if (!req)
+		return -ENOMEM;
+	treq = tls_handshake_req_init(req, args);
+	treq->th_type = HANDSHAKE_MSG_TYPE_SERVERKEYUPDATE;
+	treq->th_key_update_request = keyupdate;
+	treq->th_auth_mode = HANDSHAKE_AUTH_PSK;
+	treq->th_num_peerids = 1;
+	treq->th_peerid[0] = args->ta_my_peerids[0];
+
+	return handshake_req_submit(args->ta_sock, req, flags);
+}
+EXPORT_SYMBOL(tls_server_keyupdate_psk);
+
 /**
  * tls_handshake_cancel - cancel a pending handshake
  * @sk: socket on which there is an ongoing handshake
-- 
2.51.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v5 5/6] nvme-tcp: Support KeyUpdate
  2025-11-12  4:27 [PATCH v5 0/6] nvme-tcp: Support receiving KeyUpdate requests alistair23
                   ` (3 preceding siblings ...)
  2025-11-12  4:27 ` [PATCH v5 4/6] net/handshake: Support KeyUpdate message types alistair23
@ 2025-11-12  4:27 ` alistair23
  2025-11-12  6:59   ` Christoph Hellwig
                     ` (2 more replies)
  2025-11-12  4:27 ` [PATCH v5 6/6] nvmet-tcp: " alistair23
  5 siblings, 3 replies; 33+ messages in thread
From: alistair23 @ 2025-11-12  4:27 UTC (permalink / raw)
  To: chuck.lever, hare, kernel-tls-handshake, netdev, linux-kernel,
	linux-doc, linux-nvme, linux-nfs
  Cc: kbusch, axboe, hch, sagi, kch, hare, alistair23, Alistair Francis

From: Alistair Francis <alistair.francis@wdc.com>

If the nvme_tcp_try_send() or nvme_tcp_try_recv() functions return
EKEYEXPIRED then the underlying TLS keys need to be updated. This occurs
on an KeyUpdate event as described in RFC8446
https://datatracker.ietf.org/doc/html/rfc8446#section-4.6.3.

If the NVMe Target (TLS server) initiates a KeyUpdate this patch will
allow the NVMe layer to process the KeyUpdate request and forward the
request to userspace. Userspace must then update the key to keep the
connection alive.

This patch allows us to handle the NVMe target sending a KeyUpdate
request without aborting the connection. At this time we don't support
initiating a KeyUpdate.

Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
v5:
 - Cleanup code flow
 - Check for MSG_CTRUNC in the msg_flags return from recvmsg
   and use that to determine if it's a control message
v4:
 - Remove all support for initiating KeyUpdate
 - Don't call cancel_work() when updating keys
v3:
 - Don't cancel existing handshake requests
v2:
 - Don't change the state
 - Use a helper function for KeyUpdates
 - Continue sending in nvme_tcp_send_all() after a KeyUpdate
 - Remove command message using recvmsg

 drivers/nvme/host/tcp.c | 85 +++++++++++++++++++++++++++++++++--------
 1 file changed, 70 insertions(+), 15 deletions(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 4797a4532b0d..5cec5a974bbf 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -172,6 +172,7 @@ struct nvme_tcp_queue {
 	bool			tls_enabled;
 	u32			rcv_crc;
 	u32			snd_crc;
+	key_serial_t		handshake_session_id;
 	__le32			exp_ddgst;
 	__le32			recv_ddgst;
 	struct completion       tls_complete;
@@ -858,7 +859,10 @@ static void nvme_tcp_handle_c2h_term(struct nvme_tcp_queue *queue,
 static int nvme_tcp_recvmsg_pdu(struct nvme_tcp_queue *queue)
 {
 	char *pdu = queue->pdu;
+	char cbuf[CMSG_LEN(sizeof(char))] = {};
 	struct msghdr msg = {
+		.msg_control = cbuf,
+		.msg_controllen = sizeof(cbuf),
 		.msg_flags = MSG_DONTWAIT,
 	};
 	struct kvec iov = {
@@ -873,12 +877,17 @@ static int nvme_tcp_recvmsg_pdu(struct nvme_tcp_queue *queue)
 	if (ret <= 0)
 		return ret;
 
+	hdr = queue->pdu;
+	if (hdr->type == TLS_HANDSHAKE_KEYUPDATE) {
+		dev_err(queue->ctrl->ctrl.device, "KeyUpdate message\n");
+		return 1;
+	}
+
 	queue->pdu_remaining -= ret;
 	queue->pdu_offset += ret;
 	if (queue->pdu_remaining)
 		return 0;
 
-	hdr = queue->pdu;
 	if (unlikely(hdr->hlen != sizeof(struct nvme_tcp_rsp_pdu))) {
 		if (!nvme_tcp_recv_pdu_supported(hdr->type))
 			goto unsupported_pdu;
@@ -944,6 +953,7 @@ static int nvme_tcp_recvmsg_data(struct nvme_tcp_queue *queue)
 	struct request *rq =
 		nvme_cid_to_rq(nvme_tcp_tagset(queue), pdu->command_id);
 	struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq);
+	char cbuf[CMSG_LEN(sizeof(char))] = {};
 
 	if (nvme_tcp_recv_state(queue) != NVME_TCP_RECV_DATA)
 		return 0;
@@ -976,10 +986,26 @@ static int nvme_tcp_recvmsg_data(struct nvme_tcp_queue *queue)
 
 		ret = sock_recvmsg(queue->sock, &msg, msg.msg_flags);
 		if (ret < 0) {
-			dev_err(queue->ctrl->ctrl.device,
-				"queue %d failed to receive request %#x data",
-				nvme_tcp_queue_id(queue), rq->tag);
-			return ret;
+			/* If MSG_CTRUNC is set, it's a control message,
+			 * so let's read the control message.
+			 */
+			if (msg.msg_flags & MSG_CTRUNC) {
+				memset(&msg, 0, sizeof(msg));
+				msg.msg_flags = MSG_DONTWAIT;
+				msg.msg_control = cbuf;
+				msg.msg_controllen = sizeof(cbuf);
+
+				ret = sock_recvmsg(queue->sock, &msg, msg.msg_flags);
+			}
+
+			if (ret < 0) {
+				dev_dbg(queue->ctrl->ctrl.device,
+					"queue %d failed to receive request %#x data, %d",
+					nvme_tcp_queue_id(queue), rq->tag, ret);
+				return ret;
+			}
+
+			return 0;
 		}
 		if (queue->data_digest)
 			nvme_tcp_ddgst_calc(req, &queue->rcv_crc, ret);
@@ -1384,15 +1410,39 @@ static int nvme_tcp_try_recvmsg(struct nvme_tcp_queue *queue)
 		}
 	} while (result >= 0);
 
-	if (result < 0 && result != -EAGAIN) {
-		dev_err(queue->ctrl->ctrl.device,
-			"receive failed:  %d\n", result);
-		queue->rd_enabled = false;
-		nvme_tcp_error_recovery(&queue->ctrl->ctrl);
-	} else if (result == -EAGAIN)
-		result = 0;
+	if (result < 0) {
+		if (result != -EKEYEXPIRED && result != -EAGAIN) {
+			dev_err(queue->ctrl->ctrl.device,
+				"receive failed:  %d\n", result);
+			queue->rd_enabled = false;
+			nvme_tcp_error_recovery(&queue->ctrl->ctrl);
+		}
+		return result;
+	}
+
+	queue->nr_cqe = nr_cqe;
+	return nr_cqe;
+}
+
+static void update_tls_keys(struct nvme_tcp_queue *queue)
+{
+	int qid = nvme_tcp_queue_id(queue);
+	int ret;
+
+	dev_dbg(queue->ctrl->ctrl.device,
+		"updating key for queue %d\n", qid);
 
-	return result < 0 ? result : (queue->nr_cqe = nr_cqe);
+	flush_work(&(queue->ctrl->ctrl).async_event_work);
+
+	ret = nvme_tcp_start_tls(&(queue->ctrl->ctrl),
+				 queue, queue->ctrl->ctrl.tls_pskid,
+				 HANDSHAKE_KEY_UPDATE_TYPE_RECEIVED);
+
+	if (ret < 0) {
+		dev_err(queue->ctrl->ctrl.device,
+			"failed to update the keys %d\n", ret);
+		nvme_tcp_fail_request(queue->request);
+	}
 }
 
 static void nvme_tcp_io_work(struct work_struct *w)
@@ -1417,8 +1467,11 @@ static void nvme_tcp_io_work(struct work_struct *w)
 		result = nvme_tcp_try_recvmsg(queue);
 		if (result > 0)
 			pending = true;
-		else if (unlikely(result < 0))
-			return;
+		else if (unlikely(result < 0)) {
+			if (result == -EKEYEXPIRED)
+				update_tls_keys(queue);
+			break;
+		}
 
 		/* did we get some space after spending time in recv? */
 		if (nvme_tcp_queue_has_pending(queue) &&
@@ -1726,6 +1779,7 @@ static void nvme_tcp_tls_done(void *data, int status, key_serial_t pskid,
 			ctrl->ctrl.tls_pskid = key_serial(tls_key);
 		key_put(tls_key);
 		queue->tls_err = 0;
+		queue->handshake_session_id = handshake_session_id;
 	}
 
 out_complete:
@@ -1755,6 +1809,7 @@ static int nvme_tcp_start_tls(struct nvme_ctrl *nctrl,
 		keyring = key_serial(nctrl->opts->keyring);
 	args.ta_keyring = keyring;
 	args.ta_timeout_ms = tls_handshake_timeout * 1000;
+	args.handshake_session_id = queue->handshake_session_id;
 	queue->tls_err = -EOPNOTSUPP;
 	init_completion(&queue->tls_complete);
 	if (keyupdate == HANDSHAKE_KEY_UPDATE_TYPE_UNSPEC)
-- 
2.51.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v5 6/6] nvmet-tcp: Support KeyUpdate
  2025-11-12  4:27 [PATCH v5 0/6] nvme-tcp: Support receiving KeyUpdate requests alistair23
                   ` (4 preceding siblings ...)
  2025-11-12  4:27 ` [PATCH v5 5/6] nvme-tcp: Support KeyUpdate alistair23
@ 2025-11-12  4:27 ` alistair23
  2025-11-12  7:01   ` Christoph Hellwig
  5 siblings, 1 reply; 33+ messages in thread
From: alistair23 @ 2025-11-12  4:27 UTC (permalink / raw)
  To: chuck.lever, hare, kernel-tls-handshake, netdev, linux-kernel,
	linux-doc, linux-nvme, linux-nfs
  Cc: kbusch, axboe, hch, sagi, kch, hare, alistair23, Alistair Francis

From: Alistair Francis <alistair.francis@wdc.com>

If the nvmet_tcp_try_recv() function return EKEYEXPIRED or if we receive
a KeyUpdate handshake type then the underlying TLS keys need to be
updated.

If the NVMe Host (TLS client) initiates a KeyUpdate this patch will
allow the NVMe layer to process the KeyUpdate request and forward the
request to userspace. Userspace must then update the key to keep the
connection alive.

This patch allows us to handle the NVMe host sending a KeyUpdate
request without aborting the connection. At this time we don't support
initiating a KeyUpdate.

Link: https://datatracker.ietf.org/doc/html/rfc8446#section-4.6.3
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
v5:
 - No change
v4:
 - Restructure code to avoid #ifdefs and forward declarations
 - Use a helper function for checking -EKEYEXPIRED
 - Remove all support for initiating KeyUpdate
 - Use helper function for restoring callbacks
v3:
 - Use a write lock for sk_user_data
 - Fix build with CONFIG_NVME_TARGET_TCP_TLS disabled
 - Remove unused variable
v2:
 - Use a helper function for KeyUpdates
 - Ensure keep alive timer is stopped
 - Wait for TLS KeyUpdate to complete

 drivers/nvme/target/tcp.c | 203 ++++++++++++++++++++++++++------------
 1 file changed, 142 insertions(+), 61 deletions(-)

diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index 818efdeccef1..486ea7bb0056 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -175,6 +175,7 @@ struct nvmet_tcp_queue {
 
 	/* TLS state */
 	key_serial_t		tls_pskid;
+	key_serial_t		handshake_session_id;
 	struct delayed_work	tls_handshake_tmo_work;
 
 	unsigned long           poll_end;
@@ -186,6 +187,8 @@ struct nvmet_tcp_queue {
 	struct sockaddr_storage	sockaddr_peer;
 	struct work_struct	release_work;
 
+	struct completion       tls_complete;
+
 	int			idx;
 	struct list_head	queue_list;
 
@@ -214,6 +217,10 @@ static struct workqueue_struct *nvmet_tcp_wq;
 static const struct nvmet_fabrics_ops nvmet_tcp_ops;
 static void nvmet_tcp_free_cmd(struct nvmet_tcp_cmd *c);
 static void nvmet_tcp_free_cmd_buffers(struct nvmet_tcp_cmd *cmd);
+#ifdef CONFIG_NVME_TARGET_TCP_TLS
+static int nvmet_tcp_tls_handshake(struct nvmet_tcp_queue *queue,
+				   enum handshake_key_update_type keyupdate);
+#endif
 
 static inline u16 nvmet_tcp_cmd_tag(struct nvmet_tcp_queue *queue,
 		struct nvmet_tcp_cmd *cmd)
@@ -832,6 +839,23 @@ static int nvmet_tcp_try_send_one(struct nvmet_tcp_queue *queue,
 	return 1;
 }
 
+#ifdef CONFIG_NVME_TARGET_TCP_TLS
+static bool nvmet_tls_key_expired(struct nvmet_tcp_queue *queue, int ret)
+{
+	if (ret == -EKEYEXPIRED &&
+	    queue->state != NVMET_TCP_Q_DISCONNECTING &&
+	    queue->state != NVMET_TCP_Q_TLS_HANDSHAKE)
+					return true;
+
+	return false;
+}
+#else
+static bool nvmet_tls_key_expired(struct nvmet_tcp_queue *queue, int ret)
+{
+	return false;
+}
+#endif
+
 static int nvmet_tcp_try_send(struct nvmet_tcp_queue *queue,
 		int budget, int *sends)
 {
@@ -1106,6 +1130,103 @@ static inline bool nvmet_tcp_pdu_valid(u8 type)
 	return false;
 }
 
+static void nvmet_tcp_release_queue(struct kref *kref)
+{
+	struct nvmet_tcp_queue *queue =
+		container_of(kref, struct nvmet_tcp_queue, kref);
+
+	WARN_ON(queue->state != NVMET_TCP_Q_DISCONNECTING);
+	queue_work(nvmet_wq, &queue->release_work);
+}
+
+static void nvmet_tcp_schedule_release_queue(struct nvmet_tcp_queue *queue)
+{
+	spin_lock_bh(&queue->state_lock);
+	if (queue->state == NVMET_TCP_Q_TLS_HANDSHAKE) {
+		/* Socket closed during handshake */
+		tls_handshake_cancel(queue->sock->sk);
+	}
+	if (queue->state != NVMET_TCP_Q_DISCONNECTING) {
+		queue->state = NVMET_TCP_Q_DISCONNECTING;
+		kref_put(&queue->kref, nvmet_tcp_release_queue);
+	}
+	spin_unlock_bh(&queue->state_lock);
+}
+
+static void nvmet_tcp_restore_socket_callbacks(struct nvmet_tcp_queue *queue)
+{
+	struct socket *sock = queue->sock;
+
+	if (!queue->state_change)
+		return;
+
+	write_lock_bh(&sock->sk->sk_callback_lock);
+	sock->sk->sk_data_ready =  queue->data_ready;
+	sock->sk->sk_state_change = queue->state_change;
+	sock->sk->sk_write_space = queue->write_space;
+	sock->sk->sk_user_data = NULL;
+	write_unlock_bh(&sock->sk->sk_callback_lock);
+}
+
+#ifdef CONFIG_NVME_TARGET_TCP_TLS
+static void nvmet_tcp_tls_handshake_timeout(struct work_struct *w)
+{
+	struct nvmet_tcp_queue *queue = container_of(to_delayed_work(w),
+			struct nvmet_tcp_queue, tls_handshake_tmo_work);
+
+	pr_warn("queue %d: TLS handshake timeout\n", queue->idx);
+	/*
+	 * If tls_handshake_cancel() fails we've lost the race with
+	 * nvmet_tcp_tls_handshake_done() */
+	if (!tls_handshake_cancel(queue->sock->sk))
+		return;
+	spin_lock_bh(&queue->state_lock);
+	if (WARN_ON(queue->state != NVMET_TCP_Q_TLS_HANDSHAKE)) {
+		spin_unlock_bh(&queue->state_lock);
+		return;
+	}
+	queue->state = NVMET_TCP_Q_FAILED;
+	spin_unlock_bh(&queue->state_lock);
+	nvmet_tcp_schedule_release_queue(queue);
+	kref_put(&queue->kref, nvmet_tcp_release_queue);
+}
+
+static int update_tls_keys(struct nvmet_tcp_queue *queue)
+{
+	int ret;
+
+	cancel_work(&queue->io_work);
+	queue->state = NVMET_TCP_Q_TLS_HANDSHAKE;
+
+	nvmet_tcp_restore_socket_callbacks(queue);
+
+	INIT_DELAYED_WORK(&queue->tls_handshake_tmo_work,
+			  nvmet_tcp_tls_handshake_timeout);
+
+	ret = nvmet_tcp_tls_handshake(queue, HANDSHAKE_KEY_UPDATE_TYPE_RECEIVED);
+
+	if (ret < 0)
+		return ret;
+
+	ret = wait_for_completion_interruptible_timeout(&queue->tls_complete,
+							10 * HZ);
+
+	if (ret <= 0) {
+		tls_handshake_cancel(queue->sock->sk);
+		return ret;
+	}
+
+	queue->state = NVMET_TCP_Q_LIVE;
+
+	return 0;
+}
+#else
+static int update_tls_keys(struct nvmet_tcp_queue *queue)
+{
+	return -EPFNOSUPPORT;
+}
+#endif
+
 static int nvmet_tcp_tls_record_ok(struct nvmet_tcp_queue *queue,
 		struct msghdr *msg, char *cbuf)
 {
@@ -1131,6 +1252,9 @@ static int nvmet_tcp_tls_record_ok(struct nvmet_tcp_queue *queue,
 			ret = -EAGAIN;
 		}
 		break;
+	case TLS_RECORD_TYPE_HANDSHAKE:
+		ret = -EAGAIN;
+		break;
 	default:
 		/* discard this record type */
 		pr_err("queue %d: TLS record %d unhandled\n",
@@ -1340,6 +1464,8 @@ static int nvmet_tcp_try_recv(struct nvmet_tcp_queue *queue,
 	for (i = 0; i < budget; i++) {
 		ret = nvmet_tcp_try_recv_one(queue);
 		if (unlikely(ret < 0)) {
+			if (nvmet_tls_key_expired(queue, ret))
+					goto done;
 			nvmet_tcp_socket_error(queue, ret);
 			goto done;
 		} else if (ret == 0) {
@@ -1351,29 +1477,6 @@ static int nvmet_tcp_try_recv(struct nvmet_tcp_queue *queue,
 	return ret;
 }
 
-static void nvmet_tcp_release_queue(struct kref *kref)
-{
-	struct nvmet_tcp_queue *queue =
-		container_of(kref, struct nvmet_tcp_queue, kref);
-
-	WARN_ON(queue->state != NVMET_TCP_Q_DISCONNECTING);
-	queue_work(nvmet_wq, &queue->release_work);
-}
-
-static void nvmet_tcp_schedule_release_queue(struct nvmet_tcp_queue *queue)
-{
-	spin_lock_bh(&queue->state_lock);
-	if (queue->state == NVMET_TCP_Q_TLS_HANDSHAKE) {
-		/* Socket closed during handshake */
-		tls_handshake_cancel(queue->sock->sk);
-	}
-	if (queue->state != NVMET_TCP_Q_DISCONNECTING) {
-		queue->state = NVMET_TCP_Q_DISCONNECTING;
-		kref_put(&queue->kref, nvmet_tcp_release_queue);
-	}
-	spin_unlock_bh(&queue->state_lock);
-}
-
 static inline void nvmet_tcp_arm_queue_deadline(struct nvmet_tcp_queue *queue)
 {
 	queue->poll_end = jiffies + usecs_to_jiffies(idle_poll_period_usecs);
@@ -1404,8 +1507,12 @@ static void nvmet_tcp_io_work(struct work_struct *w)
 		ret = nvmet_tcp_try_recv(queue, NVMET_TCP_RECV_BUDGET, &ops);
 		if (ret > 0)
 			pending = true;
-		else if (ret < 0)
+		else if (ret < 0) {
+			if (ret == -EKEYEXPIRED)
+				break;
+
 			return;
+		}
 
 		ret = nvmet_tcp_try_send(queue, NVMET_TCP_SEND_BUDGET, &ops);
 		if (ret > 0)
@@ -1415,6 +1522,11 @@ static void nvmet_tcp_io_work(struct work_struct *w)
 
 	} while (pending && ops < NVMET_TCP_IO_WORK_BUDGET);
 
+	if (ret == -EKEYEXPIRED) {
+		update_tls_keys(queue);
+		pending = true;
+	}
+
 	/*
 	 * Requeue the worker if idle deadline period is in progress or any
 	 * ops activity was recorded during the do-while loop above.
@@ -1517,21 +1629,6 @@ static void nvmet_tcp_free_cmds(struct nvmet_tcp_queue *queue)
 	kfree(cmds);
 }
 
-static void nvmet_tcp_restore_socket_callbacks(struct nvmet_tcp_queue *queue)
-{
-	struct socket *sock = queue->sock;
-
-	if (!queue->state_change)
-		return;
-
-	write_lock_bh(&sock->sk->sk_callback_lock);
-	sock->sk->sk_data_ready =  queue->data_ready;
-	sock->sk->sk_state_change = queue->state_change;
-	sock->sk->sk_write_space = queue->write_space;
-	sock->sk->sk_user_data = NULL;
-	write_unlock_bh(&sock->sk->sk_callback_lock);
-}
-
 static void nvmet_tcp_uninit_data_in_cmds(struct nvmet_tcp_queue *queue)
 {
 	struct nvmet_tcp_cmd *cmd = queue->cmds;
@@ -1794,6 +1891,7 @@ static void nvmet_tcp_tls_handshake_done(void *data, int status,
 	}
 	if (!status) {
 		queue->tls_pskid = peerid;
+		queue->handshake_session_id = handshake_session_id;
 		queue->state = NVMET_TCP_Q_CONNECTING;
 	} else
 		queue->state = NVMET_TCP_Q_FAILED;
@@ -1809,28 +1907,7 @@ static void nvmet_tcp_tls_handshake_done(void *data, int status,
 	else
 		nvmet_tcp_set_queue_sock(queue);
 	kref_put(&queue->kref, nvmet_tcp_release_queue);
-}
-
-static void nvmet_tcp_tls_handshake_timeout(struct work_struct *w)
-{
-	struct nvmet_tcp_queue *queue = container_of(to_delayed_work(w),
-			struct nvmet_tcp_queue, tls_handshake_tmo_work);
-
-	pr_warn("queue %d: TLS handshake timeout\n", queue->idx);
-	/*
-	 * If tls_handshake_cancel() fails we've lost the race with
-	 * nvmet_tcp_tls_handshake_done() */
-	if (!tls_handshake_cancel(queue->sock->sk))
-		return;
-	spin_lock_bh(&queue->state_lock);
-	if (WARN_ON(queue->state != NVMET_TCP_Q_TLS_HANDSHAKE)) {
-		spin_unlock_bh(&queue->state_lock);
-		return;
-	}
-	queue->state = NVMET_TCP_Q_FAILED;
-	spin_unlock_bh(&queue->state_lock);
-	nvmet_tcp_schedule_release_queue(queue);
-	kref_put(&queue->kref, nvmet_tcp_release_queue);
+	complete(&queue->tls_complete);
 }
 
 static int nvmet_tcp_tls_handshake(struct nvmet_tcp_queue *queue,
@@ -1852,11 +1929,15 @@ static int nvmet_tcp_tls_handshake(struct nvmet_tcp_queue *queue,
 	args.ta_data = queue;
 	args.ta_keyring = key_serial(queue->port->nport->keyring);
 	args.ta_timeout_ms = tls_handshake_timeout * 1000;
+	args.handshake_session_id = queue->handshake_session_id;
+
+	init_completion(&queue->tls_complete);
 
 	if (keyupdate == HANDSHAKE_KEY_UPDATE_TYPE_UNSPEC)
 		ret = tls_server_hello_psk(&args, GFP_KERNEL);
 	else
 		ret = tls_server_keyupdate_psk(&args, GFP_KERNEL, keyupdate);
+
 	if (ret) {
 		kref_put(&queue->kref, nvmet_tcp_release_queue);
 		pr_err("failed to start TLS, err=%d\n", ret);
-- 
2.51.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 5/6] nvme-tcp: Support KeyUpdate
  2025-11-12  4:27 ` [PATCH v5 5/6] nvme-tcp: Support KeyUpdate alistair23
@ 2025-11-12  6:59   ` Christoph Hellwig
  2025-11-12 14:31     ` Chuck Lever
  2025-11-27 13:31   ` Hannes Reinecke
  2025-11-30 22:31   ` Sagi Grimberg
  2 siblings, 1 reply; 33+ messages in thread
From: Christoph Hellwig @ 2025-11-12  6:59 UTC (permalink / raw)
  To: alistair23
  Cc: chuck.lever, hare, kernel-tls-handshake, netdev, linux-kernel,
	linux-doc, linux-nvme, linux-nfs, kbusch, axboe, hch, sagi, kch,
	hare, Alistair Francis

On Wed, Nov 12, 2025 at 02:27:19PM +1000, alistair23@gmail.com wrote:
>  
>  		ret = sock_recvmsg(queue->sock, &msg, msg.msg_flags);
>  		if (ret < 0) {
> +			/* If MSG_CTRUNC is set, it's a control message,
> +			 * so let's read the control message.
> +			 */

This is not the normal kernel comment style, which would be:

			/*
			 * If MSG_CTRUNC is set, it's a control message, so
			 * let's read the control message.
+			 */

> +			if (msg.msg_flags & MSG_CTRUNC) {
> +				memset(&msg, 0, sizeof(msg));
> +				msg.msg_flags = MSG_DONTWAIT;
> +				msg.msg_control = cbuf;
> +				msg.msg_controllen = sizeof(cbuf);
> +
> +				ret = sock_recvmsg(queue->sock, &msg, msg.msg_flags);

Overly long line.  Also given that we're in the main receive handler
it would be nice to have this outside the main flow using a goto and
an unlikely label anyway.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 6/6] nvmet-tcp: Support KeyUpdate
  2025-11-12  4:27 ` [PATCH v5 6/6] nvmet-tcp: " alistair23
@ 2025-11-12  7:01   ` Christoph Hellwig
  0 siblings, 0 replies; 33+ messages in thread
From: Christoph Hellwig @ 2025-11-12  7:01 UTC (permalink / raw)
  To: alistair23
  Cc: chuck.lever, hare, kernel-tls-handshake, netdev, linux-kernel,
	linux-doc, linux-nvme, linux-nfs, kbusch, axboe, hch, sagi, kch,
	hare, Alistair Francis

On Wed, Nov 12, 2025 at 02:27:20PM +1000, alistair23@gmail.com wrote:
> From: Alistair Francis <alistair.francis@wdc.com>
> 
> If the nvmet_tcp_try_recv() function return EKEYEXPIRED or if we receive
> a KeyUpdate handshake type then the underlying TLS keys need to be
> updated.
> 
> If the NVMe Host (TLS client) initiates a KeyUpdate this patch will
> allow the NVMe layer to process the KeyUpdate request and forward the
> request to userspace. Userspace must then update the key to keep the
> connection alive.
> 
> This patch allows us to handle the NVMe host sending a KeyUpdate
> request without aborting the connection. At this time we don't support
> initiating a KeyUpdate.
> 
> Link: https://datatracker.ietf.org/doc/html/rfc8446#section-4.6.3
> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
> Reviewed-by: Hannes Reinecke <hare@suse.de>
> ---
> v5:
>  - No change
> v4:
>  - Restructure code to avoid #ifdefs and forward declarations
>  - Use a helper function for checking -EKEYEXPIRED
>  - Remove all support for initiating KeyUpdate
>  - Use helper function for restoring callbacks
> v3:
>  - Use a write lock for sk_user_data
>  - Fix build with CONFIG_NVME_TARGET_TCP_TLS disabled
>  - Remove unused variable
> v2:
>  - Use a helper function for KeyUpdates
>  - Ensure keep alive timer is stopped
>  - Wait for TLS KeyUpdate to complete
> 
>  drivers/nvme/target/tcp.c | 203 ++++++++++++++++++++++++++------------
>  1 file changed, 142 insertions(+), 61 deletions(-)
> 
> diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
> index 818efdeccef1..486ea7bb0056 100644
> --- a/drivers/nvme/target/tcp.c
> +++ b/drivers/nvme/target/tcp.c
> @@ -175,6 +175,7 @@ struct nvmet_tcp_queue {
>  
>  	/* TLS state */
>  	key_serial_t		tls_pskid;
> +	key_serial_t		handshake_session_id;
>  	struct delayed_work	tls_handshake_tmo_work;
>  
>  	unsigned long           poll_end;
> @@ -186,6 +187,8 @@ struct nvmet_tcp_queue {
>  	struct sockaddr_storage	sockaddr_peer;
>  	struct work_struct	release_work;
>  
> +	struct completion       tls_complete;
> +
>  	int			idx;
>  	struct list_head	queue_list;
>  
> @@ -214,6 +217,10 @@ static struct workqueue_struct *nvmet_tcp_wq;
>  static const struct nvmet_fabrics_ops nvmet_tcp_ops;
>  static void nvmet_tcp_free_cmd(struct nvmet_tcp_cmd *c);
>  static void nvmet_tcp_free_cmd_buffers(struct nvmet_tcp_cmd *cmd);
> +#ifdef CONFIG_NVME_TARGET_TCP_TLS
> +static int nvmet_tcp_tls_handshake(struct nvmet_tcp_queue *queue,
> +				   enum handshake_key_update_type keyupdate);
> +#endif
>  
>  static inline u16 nvmet_tcp_cmd_tag(struct nvmet_tcp_queue *queue,
>  		struct nvmet_tcp_cmd *cmd)
> @@ -832,6 +839,23 @@ static int nvmet_tcp_try_send_one(struct nvmet_tcp_queue *queue,
>  	return 1;
>  }
>  
> +#ifdef CONFIG_NVME_TARGET_TCP_TLS
> +static bool nvmet_tls_key_expired(struct nvmet_tcp_queue *queue, int ret)
> +{
> +	if (ret == -EKEYEXPIRED &&
> +	    queue->state != NVMET_TCP_Q_DISCONNECTING &&
> +	    queue->state != NVMET_TCP_Q_TLS_HANDSHAKE)
> +					return true;
> +
> +	return false;

Extra indentation before the return true.  This could also be simplified
down to

	return ret == -EKEYEXPIRED &&
		queue->state != NVMET_TCP_Q_DISCONNECTING &&
		queue->state != NVMET_TCP_Q_TLS_HANDSHAKE;

or if you want to do away with the ifdef entirely:

	return IS_ENABLED(CONFIG_NVME_TARGET_TCP_TLS) &&
		ret == -EKEYEXPIRED &&
		queue->state != NVMET_TCP_Q_DISCONNECTING &&
		queue->state != NVMET_TCP_Q_TLS_HANDSHAKE;

Othwise looks good.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 5/6] nvme-tcp: Support KeyUpdate
  2025-11-12  6:59   ` Christoph Hellwig
@ 2025-11-12 14:31     ` Chuck Lever
  2025-11-12 14:38       ` Christoph Hellwig
  0 siblings, 1 reply; 33+ messages in thread
From: Chuck Lever @ 2025-11-12 14:31 UTC (permalink / raw)
  To: Christoph Hellwig, alistair23
  Cc: hare, kernel-tls-handshake, netdev, linux-kernel, linux-doc,
	linux-nvme, linux-nfs, kbusch, axboe, sagi, kch, hare,
	Alistair Francis

On 11/12/25 1:59 AM, Christoph Hellwig wrote:
> On Wed, Nov 12, 2025 at 02:27:19PM +1000, alistair23@gmail.com wrote:
>>  
>>  		ret = sock_recvmsg(queue->sock, &msg, msg.msg_flags);
>>  		if (ret < 0) {
>> +			/* If MSG_CTRUNC is set, it's a control message,
>> +			 * so let's read the control message.
>> +			 */
> This is not the normal kernel comment style, which would be:
> 
> 			/*
> 			 * If MSG_CTRUNC is set, it's a control message, so
> 			 * let's read the control message.
> +			 */

But it is correct style for net/ .


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 5/6] nvme-tcp: Support KeyUpdate
  2025-11-12 14:31     ` Chuck Lever
@ 2025-11-12 14:38       ` Christoph Hellwig
  2025-11-12 14:38         ` Chuck Lever
  0 siblings, 1 reply; 33+ messages in thread
From: Christoph Hellwig @ 2025-11-12 14:38 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Christoph Hellwig, alistair23, hare, kernel-tls-handshake, netdev,
	linux-kernel, linux-doc, linux-nvme, linux-nfs, kbusch, axboe,
	sagi, kch, hare, Alistair Francis

On Wed, Nov 12, 2025 at 09:31:35AM -0500, Chuck Lever wrote:
> But it is correct style for net/ .

Maybe.  But why would non-net/ code care about their odd preference?

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 5/6] nvme-tcp: Support KeyUpdate
  2025-11-12 14:38       ` Christoph Hellwig
@ 2025-11-12 14:38         ` Chuck Lever
  0 siblings, 0 replies; 33+ messages in thread
From: Chuck Lever @ 2025-11-12 14:38 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: alistair23, hare, kernel-tls-handshake, netdev, linux-kernel,
	linux-doc, linux-nvme, linux-nfs, kbusch, axboe, sagi, kch, hare,
	Alistair Francis

On 11/12/25 9:38 AM, Christoph Hellwig wrote:
> On Wed, Nov 12, 2025 at 09:31:35AM -0500, Chuck Lever wrote:
>> But it is correct style for net/ .
> 
> Maybe.  But why would non-net/ code care about their odd preference?

Yes. My bad. Carry on!


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 1/6] net/handshake: Store the key serial number on completion
  2025-11-12  4:27 ` [PATCH v5 1/6] net/handshake: Store the key serial number on completion alistair23
@ 2025-11-12 15:02   ` Chuck Lever
  2025-11-30 22:21   ` Sagi Grimberg
  1 sibling, 0 replies; 33+ messages in thread
From: Chuck Lever @ 2025-11-12 15:02 UTC (permalink / raw)
  To: alistair23, hare, kernel-tls-handshake, netdev, linux-kernel,
	linux-doc, linux-nvme, linux-nfs
  Cc: kbusch, axboe, hch, sagi, kch, hare, Alistair Francis

On 11/11/25 11:27 PM, alistair23@gmail.com wrote:
> From: Alistair Francis <alistair.francis@wdc.com>
> 
> Allow userspace to include a key serial number when completing a
> handshake with the HANDSHAKE_CMD_DONE command.
> 
> We then store this serial number and will provide it back to userspace
> in the future. This allows userspace to save data to the keyring and
> then restore that data later.
> 
> This will be used to support the TLS KeyUpdate operation, as now
> userspace can resume information about a established session.
> 
> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
> Reviewed-by: Hannes Reincke <hare@suse.de>
> ---
> v5:
>  - Change name to "handshake session ID"
> v4:
>  - No change
> v3:
>  - No change
> v2:
>  - Change "key-serial" to "session-id"
> 
>  Documentation/netlink/specs/handshake.yaml |  4 ++++
>  Documentation/networking/tls-handshake.rst |  1 +
>  drivers/nvme/host/tcp.c                    |  3 ++-
>  drivers/nvme/target/tcp.c                  |  3 ++-
>  include/net/handshake.h                    |  4 +++-
>  include/uapi/linux/handshake.h             |  1 +
>  net/handshake/genl.c                       |  5 +++--
>  net/handshake/tlshd.c                      | 15 +++++++++++++--
>  net/sunrpc/svcsock.c                       |  4 +++-
>  net/sunrpc/xprtsock.c                      |  4 +++-
>  10 files changed, 35 insertions(+), 9 deletions(-)
> 
> diff --git a/Documentation/netlink/specs/handshake.yaml b/Documentation/netlink/specs/handshake.yaml
> index 95c3fade7a8d..a273bc74d26f 100644
> --- a/Documentation/netlink/specs/handshake.yaml
> +++ b/Documentation/netlink/specs/handshake.yaml
> @@ -87,6 +87,9 @@ attribute-sets:
>          name: remote-auth
>          type: u32
>          multi-attr: true
> +      -
> +        name: session-id
> +        type: u32
>  
>  operations:
>    list:
> @@ -123,6 +126,7 @@ operations:
>              - status
>              - sockfd
>              - remote-auth
> +            - session-id
>  
>  mcast-groups:
>    list:
> diff --git a/Documentation/networking/tls-handshake.rst b/Documentation/networking/tls-handshake.rst
> index 6f5ea1646a47..0941b81dde5c 100644
> --- a/Documentation/networking/tls-handshake.rst
> +++ b/Documentation/networking/tls-handshake.rst
> @@ -60,6 +60,7 @@ fills in a structure that contains the parameters of the request:
>          key_serial_t    ta_my_privkey;
>          unsigned int    ta_num_peerids;
>          key_serial_t    ta_my_peerids[5];
> +        key_serial_t    handshake_session_id;
>    };
>  
>  The @ta_sock field references an open and connected socket. The consumer
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index 9058ea64b89c..024d02248831 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -1694,7 +1694,8 @@ static void nvme_tcp_set_queue_io_cpu(struct nvme_tcp_queue *queue)
>  		qid, queue->io_cpu);
>  }
>  
> -static void nvme_tcp_tls_done(void *data, int status, key_serial_t pskid)
> +static void nvme_tcp_tls_done(void *data, int status, key_serial_t pskid,
> +			      key_serial_t handshake_session_id)
>  {
>  	struct nvme_tcp_queue *queue = data;
>  	struct nvme_tcp_ctrl *ctrl = queue->ctrl;
> diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
> index 470bf37e5a63..7f8516892359 100644
> --- a/drivers/nvme/target/tcp.c
> +++ b/drivers/nvme/target/tcp.c
> @@ -1780,7 +1780,8 @@ static int nvmet_tcp_tls_key_lookup(struct nvmet_tcp_queue *queue,
>  }
>  
>  static void nvmet_tcp_tls_handshake_done(void *data, int status,
> -					 key_serial_t peerid)
> +					 key_serial_t peerid,
> +					 key_serial_t handshake_session_id)
>  {
>  	struct nvmet_tcp_queue *queue = data;
>  
> diff --git a/include/net/handshake.h b/include/net/handshake.h
> index 8ebd4f9ed26e..68d7f89e431a 100644
> --- a/include/net/handshake.h
> +++ b/include/net/handshake.h
> @@ -18,7 +18,8 @@ enum {
>  };
>  
>  typedef void	(*tls_done_func_t)(void *data, int status,
> -				   key_serial_t peerid);
> +				   key_serial_t peerid,
> +				   key_serial_t handshake_session_id);
>  
>  struct tls_handshake_args {
>  	struct socket		*ta_sock;
> @@ -31,6 +32,7 @@ struct tls_handshake_args {
>  	key_serial_t		ta_my_privkey;
>  	unsigned int		ta_num_peerids;
>  	key_serial_t		ta_my_peerids[5];
> +	key_serial_t		handshake_session_id;

Please follow the existing field name convention and add a "ta_"
prefix.


>  };
>  
>  int tls_client_hello_anon(const struct tls_handshake_args *args, gfp_t flags);
> diff --git a/include/uapi/linux/handshake.h b/include/uapi/linux/handshake.h
> index 662e7de46c54..b68ffbaa5f31 100644
> --- a/include/uapi/linux/handshake.h
> +++ b/include/uapi/linux/handshake.h
> @@ -55,6 +55,7 @@ enum {
>  	HANDSHAKE_A_DONE_STATUS = 1,
>  	HANDSHAKE_A_DONE_SOCKFD,
>  	HANDSHAKE_A_DONE_REMOTE_AUTH,
> +	HANDSHAKE_A_DONE_SESSION_ID,
>  
>  	__HANDSHAKE_A_DONE_MAX,
>  	HANDSHAKE_A_DONE_MAX = (__HANDSHAKE_A_DONE_MAX - 1)
> diff --git a/net/handshake/genl.c b/net/handshake/genl.c
> index f55d14d7b726..6cdce7e5dbc0 100644
> --- a/net/handshake/genl.c
> +++ b/net/handshake/genl.c
> @@ -16,10 +16,11 @@ static const struct nla_policy handshake_accept_nl_policy[HANDSHAKE_A_ACCEPT_HAN
>  };
>  
>  /* HANDSHAKE_CMD_DONE - do */
> -static const struct nla_policy handshake_done_nl_policy[HANDSHAKE_A_DONE_REMOTE_AUTH + 1] = {
> +static const struct nla_policy handshake_done_nl_policy[HANDSHAKE_A_DONE_SESSION_ID + 1] = {
>  	[HANDSHAKE_A_DONE_STATUS] = { .type = NLA_U32, },
>  	[HANDSHAKE_A_DONE_SOCKFD] = { .type = NLA_S32, },
>  	[HANDSHAKE_A_DONE_REMOTE_AUTH] = { .type = NLA_U32, },
> +	[HANDSHAKE_A_DONE_SESSION_ID] = { .type = NLA_U32, },
>  };
>  
>  /* Ops table for handshake */
> @@ -35,7 +36,7 @@ static const struct genl_split_ops handshake_nl_ops[] = {
>  		.cmd		= HANDSHAKE_CMD_DONE,
>  		.doit		= handshake_nl_done_doit,
>  		.policy		= handshake_done_nl_policy,
> -		.maxattr	= HANDSHAKE_A_DONE_REMOTE_AUTH,
> +		.maxattr	= HANDSHAKE_A_DONE_SESSION_ID,
>  		.flags		= GENL_CMD_CAP_DO,
>  	},
>  };
> diff --git a/net/handshake/tlshd.c b/net/handshake/tlshd.c
> index 081093dfd553..85c5fed690c0 100644
> --- a/net/handshake/tlshd.c
> +++ b/net/handshake/tlshd.c
> @@ -26,7 +26,8 @@
>  
>  struct tls_handshake_req {
>  	void			(*th_consumer_done)(void *data, int status,
> -						    key_serial_t peerid);
> +						    key_serial_t peerid,
> +						    key_serial_t handshake_session_id);
>  	void			*th_consumer_data;
>  
>  	int			th_type;
> @@ -39,6 +40,8 @@ struct tls_handshake_req {
>  
>  	unsigned int		th_num_peerids;
>  	key_serial_t		th_peerid[5];
> +
> +	key_serial_t		handshake_session_id;

Please follow the existing field name convention and add a "th_"
prefix.


>  };
>  
>  static struct tls_handshake_req *
> @@ -55,6 +58,7 @@ tls_handshake_req_init(struct handshake_req *req,
>  	treq->th_num_peerids = 0;
>  	treq->th_certificate = TLS_NO_CERT;
>  	treq->th_privkey = TLS_NO_PRIVKEY;
> +	treq->handshake_session_id = TLS_NO_PRIVKEY;

Please define a TLS_NO_SESSION_ID symbolic constant for initializing
this field, which is not a private key.


>  	return treq;
>  }
>  
> @@ -83,6 +87,13 @@ static void tls_handshake_remote_peerids(struct tls_handshake_req *treq,
>  		if (i >= treq->th_num_peerids)
>  			break;
>  	}
> +
> +	nla_for_each_attr(nla, head, len, rem) {
> +		if (nla_type(nla) == HANDSHAKE_A_DONE_SESSION_ID) {
> +			treq->handshake_session_id = nla_get_u32(nla);
> +			break;
> +		}
> +	}
>  }
>  
>  /**
> @@ -105,7 +116,7 @@ static void tls_handshake_done(struct handshake_req *req,
>  		set_bit(HANDSHAKE_F_REQ_SESSION, &req->hr_flags);
>  
>  	treq->th_consumer_done(treq->th_consumer_data, -status,
> -			       treq->th_peerid[0]);
> +			       treq->th_peerid[0], treq->handshake_session_id);
>  }
>  
>  #if IS_ENABLED(CONFIG_KEYS)
> diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
> index 7b90abc5cf0e..2401b4c757f6 100644
> --- a/net/sunrpc/svcsock.c
> +++ b/net/sunrpc/svcsock.c
> @@ -444,13 +444,15 @@ static void svc_tcp_kill_temp_xprt(struct svc_xprt *xprt)
>   * @data: address of xprt to wake
>   * @status: status of handshake
>   * @peerid: serial number of key containing the remote peer's identity
> + * @handshake_session_id: serial number of the userspace session ID
>   *
>   * If a security policy is specified as an export option, we don't
>   * have a specific export here to check. So we set a "TLS session
>   * is present" flag on the xprt and let an upper layer enforce local
>   * security policy.
>   */
> -static void svc_tcp_handshake_done(void *data, int status, key_serial_t peerid)
> +static void svc_tcp_handshake_done(void *data, int status, key_serial_t peerid,
> +				   key_serial_t handshake_session_id)
>  {
>  	struct svc_xprt *xprt = data;
>  	struct svc_sock *svsk = container_of(xprt, struct svc_sock, sk_xprt);
> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> index 3aa987e7f072..5c6e7543f293 100644
> --- a/net/sunrpc/xprtsock.c
> +++ b/net/sunrpc/xprtsock.c
> @@ -2589,9 +2589,11 @@ static int xs_tcp_tls_finish_connecting(struct rpc_xprt *lower_xprt,
>   * @data: address of xprt to wake
>   * @status: status of handshake
>   * @peerid: serial number of key containing the remote's identity
> + * @handshake_session_id: serial number of the userspace session ID
>   *
>   */
> -static void xs_tls_handshake_done(void *data, int status, key_serial_t peerid)
> +static void xs_tls_handshake_done(void *data, int status, key_serial_t peerid,
> +				  key_serial_t handshake_session_id)
>  {
>  	struct rpc_xprt *lower_xprt = data;
>  	struct sock_xprt *lower_transport =

Aside from a few nits:

Reviewed-by: Chuck Lever <chuck.lever@oracle.com>


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 2/6] net/handshake: Define handshake_sk_destruct_req
  2025-11-12  4:27 ` [PATCH v5 2/6] net/handshake: Define handshake_sk_destruct_req alistair23
@ 2025-11-12 15:47   ` Chuck Lever
  2025-11-13 10:19     ` Alistair Francis
  0 siblings, 1 reply; 33+ messages in thread
From: Chuck Lever @ 2025-11-12 15:47 UTC (permalink / raw)
  To: alistair23, hare, kernel-tls-handshake, netdev, linux-kernel,
	linux-doc, linux-nvme, linux-nfs
  Cc: kbusch, axboe, hch, sagi, kch, hare, Alistair Francis

On 11/11/25 11:27 PM, alistair23@gmail.com wrote:
> From: Alistair Francis <alistair.francis@wdc.com>
> 
> Define a `handshake_sk_destruct_req()` function to allow the destruction
> of the handshake req.
> 
> This is required to avoid hash conflicts when handshake_req_hash_add()
> is called as part of submitting the KeyUpdate request.
> 
> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
> Reviewed-by: Hannes Reinecke <hare@suse.de>
> ---
> v5:
>  - No change
> v4:
>  - No change
> v3:
>  - New patch
> 
>  net/handshake/request.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/net/handshake/request.c b/net/handshake/request.c
> index 274d2c89b6b2..0d1c91c80478 100644
> --- a/net/handshake/request.c
> +++ b/net/handshake/request.c
> @@ -98,6 +98,22 @@ static void handshake_sk_destruct(struct sock *sk)
>  		sk_destruct(sk);
>  }
>  
> +/**
> + * handshake_sk_destruct_req - destroy an existing request
> + * @sk: socket on which there is an existing request

Generally the kdoc style is unnecessary for static helper functions,
especially functions with only a single caller.

This all looks so much like handshake_sk_destruct(). Consider
eliminating the code duplication by splitting that function into a
couple of helpers instead of adding this one.

> + */
> +static void handshake_sk_destruct_req(struct sock *sk)

Because this function is static, I imagine that the compiler will
bark about the addition of an unused function. Perhaps it would
be better to combine 2/6 and 3/6.

That would also make it easier for reviewers to check the resource
accounting issues mentioned below.

> +{
> +	struct handshake_req *req;
> +
> +	req = handshake_req_hash_lookup(sk);
> +	if (!req)
> +		return;
> +
> +	trace_handshake_destruct(sock_net(sk), req, sk);

Wondering if this function needs to preserve the socket's destructor
callback chain like so:

+	void (sk_destruct)(struct sock sk);

  ...

+	sk_destruct = req->hr_odestruct;
+	sk->sk_destruct = sk_destruct;

then:

> +	handshake_req_destroy(req);

Because of the current code organization and patch ordering, it's
difficult to confirm that sock_put() isn't necessary here.

> +}
> +
>  /**
>   * handshake_req_alloc - Allocate a handshake request
>   * @proto: security protocol

There's no synchronization preventing concurrent handshake_req_cancel()
calls from accessing the request after it's freed during handshake
completion. That is one reason why handshake_complete() leaves completed
requests in the hash.

So I'm thinking that removing requests like this is not going to work
out. Would it work better if handshake_req_hash_add() could recognize
that a KeyUpdate is going on, and allow replacement of a hashed
request? I haven't thought that through.

As always, please double-check my questions and assumptions before
revising this patch!

-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 4/6] net/handshake: Support KeyUpdate message types
  2025-11-12  4:27 ` [PATCH v5 4/6] net/handshake: Support KeyUpdate message types alistair23
@ 2025-11-12 15:49   ` Chuck Lever
  2025-11-13  2:16     ` Alistair Francis
  2025-11-13 14:41   ` Chuck Lever
  2025-11-27 13:12   ` Hannes Reinecke
  2 siblings, 1 reply; 33+ messages in thread
From: Chuck Lever @ 2025-11-12 15:49 UTC (permalink / raw)
  To: alistair23, hare, kernel-tls-handshake, netdev, linux-kernel,
	linux-doc, linux-nvme, linux-nfs
  Cc: kbusch, axboe, hch, sagi, kch, hare, Alistair Francis

On 11/11/25 11:27 PM, alistair23@gmail.com wrote:
> From: Alistair Francis <alistair.francis@wdc.com>
> 
> When reporting the msg-type to userspace let's also support reporting
> KeyUpdate events. This supports reporting a client/server event and if
> the other side requested a KeyUpdateRequest.
> 
> Link: https://datatracker.ietf.org/doc/html/rfc8446#section-4.6.3
> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>

I was not able to apply this to either v6.18-rc5, nfsd-testing, or
netdev-next, so I can't adequately review or test it. Is this series
available on a public git branch?


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 4/6] net/handshake: Support KeyUpdate message types
  2025-11-12 15:49   ` Chuck Lever
@ 2025-11-13  2:16     ` Alistair Francis
  0 siblings, 0 replies; 33+ messages in thread
From: Alistair Francis @ 2025-11-13  2:16 UTC (permalink / raw)
  To: Chuck Lever
  Cc: hare, kernel-tls-handshake, netdev, linux-kernel, linux-doc,
	linux-nvme, linux-nfs, kbusch, axboe, hch, sagi, kch, hare,
	Alistair Francis

On Thu, Nov 13, 2025 at 1:49 AM Chuck Lever <chuck.lever@oracle.com> wrote:
>
> On 11/11/25 11:27 PM, alistair23@gmail.com wrote:
> > From: Alistair Francis <alistair.francis@wdc.com>
> >
> > When reporting the msg-type to userspace let's also support reporting
> > KeyUpdate events. This supports reporting a client/server event and if
> > the other side requested a KeyUpdateRequest.
> >
> > Link: https://datatracker.ietf.org/doc/html/rfc8446#section-4.6.3
> > Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
>
> I was not able to apply this to either v6.18-rc5, nfsd-testing, or
> netdev-next, so I can't adequately review or test it. Is this series
> available on a public git branch?

You might have missed Hanne's "nvme-tcp: Implement recvmsg() receive
flow" patch that is required, it's mentioned in the cover letter.

You can find this series, Hanne's patch, a few other patches that
are all on list and some extra patches I'm using for testing and debugging here:
https://github.com/alistair23/linux/tree/alistair/tls-keyupdate

Just remove the top commits from that branch to get just this series

Alistair

>

>
> --
> Chuck Lever

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 2/6] net/handshake: Define handshake_sk_destruct_req
  2025-11-12 15:47   ` Chuck Lever
@ 2025-11-13 10:19     ` Alistair Francis
  2025-11-13 14:01       ` Chuck Lever
  0 siblings, 1 reply; 33+ messages in thread
From: Alistair Francis @ 2025-11-13 10:19 UTC (permalink / raw)
  To: Chuck Lever
  Cc: hare, kernel-tls-handshake, netdev, linux-kernel, linux-doc,
	linux-nvme, linux-nfs, kbusch, axboe, hch, sagi, kch, hare,
	Alistair Francis

On Thu, Nov 13, 2025 at 1:47 AM Chuck Lever <chuck.lever@oracle.com> wrote:
>
> On 11/11/25 11:27 PM, alistair23@gmail.com wrote:
> > From: Alistair Francis <alistair.francis@wdc.com>
> >
> > Define a `handshake_sk_destruct_req()` function to allow the destruction
> > of the handshake req.
> >
> > This is required to avoid hash conflicts when handshake_req_hash_add()
> > is called as part of submitting the KeyUpdate request.
> >
> > Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
> > Reviewed-by: Hannes Reinecke <hare@suse.de>
> > ---
> > v5:
> >  - No change
> > v4:
> >  - No change
> > v3:
> >  - New patch
> >
> >  net/handshake/request.c | 16 ++++++++++++++++
> >  1 file changed, 16 insertions(+)
> >
> > diff --git a/net/handshake/request.c b/net/handshake/request.c
> > index 274d2c89b6b2..0d1c91c80478 100644
> > --- a/net/handshake/request.c
> > +++ b/net/handshake/request.c
> > @@ -98,6 +98,22 @@ static void handshake_sk_destruct(struct sock *sk)
> >               sk_destruct(sk);
> >  }
> >
> > +/**
> > + * handshake_sk_destruct_req - destroy an existing request
> > + * @sk: socket on which there is an existing request
>
> Generally the kdoc style is unnecessary for static helper functions,
> especially functions with only a single caller.
>
> This all looks so much like handshake_sk_destruct(). Consider
> eliminating the code duplication by splitting that function into a
> couple of helpers instead of adding this one.
>
>
> > + */
> > +static void handshake_sk_destruct_req(struct sock *sk)
>
> Because this function is static, I imagine that the compiler will
> bark about the addition of an unused function. Perhaps it would
> be better to combine 2/6 and 3/6.
>
> That would also make it easier for reviewers to check the resource
> accounting issues mentioned below.
>
>
> > +{
> > +     struct handshake_req *req;
> > +
> > +     req = handshake_req_hash_lookup(sk);
> > +     if (!req)
> > +             return;
> > +
> > +     trace_handshake_destruct(sock_net(sk), req, sk);
>
> Wondering if this function needs to preserve the socket's destructor
> callback chain like so:
>
> +       void (sk_destruct)(struct sock sk);
>
>   ...
>
> +       sk_destruct = req->hr_odestruct;
> +       sk->sk_destruct = sk_destruct;
>
> then:
>
> > +     handshake_req_destroy(req);
>
> Because of the current code organization and patch ordering, it's
> difficult to confirm that sock_put() isn't necessary here.
>
>
> > +}
> > +
> >  /**
> >   * handshake_req_alloc - Allocate a handshake request
> >   * @proto: security protocol
>
> There's no synchronization preventing concurrent handshake_req_cancel()
> calls from accessing the request after it's freed during handshake
> completion. That is one reason why handshake_complete() leaves completed
> requests in the hash.

Ah, so you are worried that free-ing the request will race with
accessing the request after a handshake_req_hash_lookup().

Ok, makes sense. It seems like one answer to that is to add synchronisation

>
> So I'm thinking that removing requests like this is not going to work
> out. Would it work better if handshake_req_hash_add() could recognize
> that a KeyUpdate is going on, and allow replacement of a hashed
> request? I haven't thought that through.

I guess the idea would be to do something like this in
handshake_req_hash_add() if the entry already exists?

    if (test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED, &req->hr_flags)) {
        /* Request already completed */
        rhashtable_replace_fast(...);
    }

I'm not sure that's better. That could possibly still race with
something that hasn't yet set HANDSHAKE_F_REQ_COMPLETED and overwrite
the request unexpectedly.

What about adding synchronisation and keeping the current approach?
From a quick look it should be enough to just edit
handshake_sk_destruct() and handshake_req_cancel()

Alistair

>
>
> As always, please double-check my questions and assumptions before
> revising this patch!
>
>
> --
> Chuck Lever

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 2/6] net/handshake: Define handshake_sk_destruct_req
  2025-11-13 10:19     ` Alistair Francis
@ 2025-11-13 14:01       ` Chuck Lever
  2025-11-13 14:37         ` Chuck Lever
  0 siblings, 1 reply; 33+ messages in thread
From: Chuck Lever @ 2025-11-13 14:01 UTC (permalink / raw)
  To: Alistair Francis
  Cc: hare, kernel-tls-handshake, netdev, linux-kernel, linux-doc,
	linux-nvme, linux-nfs, kbusch, axboe, hch, sagi, kch, hare,
	Alistair Francis

On 11/13/25 5:19 AM, Alistair Francis wrote:
> On Thu, Nov 13, 2025 at 1:47 AM Chuck Lever <chuck.lever@oracle.com> wrote:
>>
>> On 11/11/25 11:27 PM, alistair23@gmail.com wrote:
>>> From: Alistair Francis <alistair.francis@wdc.com>
>>>
>>> Define a `handshake_sk_destruct_req()` function to allow the destruction
>>> of the handshake req.
>>>
>>> This is required to avoid hash conflicts when handshake_req_hash_add()
>>> is called as part of submitting the KeyUpdate request.
>>>
>>> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
>>> Reviewed-by: Hannes Reinecke <hare@suse.de>
>>> ---
>>> v5:
>>>  - No change
>>> v4:
>>>  - No change
>>> v3:
>>>  - New patch
>>>
>>>  net/handshake/request.c | 16 ++++++++++++++++
>>>  1 file changed, 16 insertions(+)
>>>
>>> diff --git a/net/handshake/request.c b/net/handshake/request.c
>>> index 274d2c89b6b2..0d1c91c80478 100644
>>> --- a/net/handshake/request.c
>>> +++ b/net/handshake/request.c
>>> @@ -98,6 +98,22 @@ static void handshake_sk_destruct(struct sock *sk)
>>>               sk_destruct(sk);
>>>  }
>>>
>>> +/**
>>> + * handshake_sk_destruct_req - destroy an existing request
>>> + * @sk: socket on which there is an existing request
>>
>> Generally the kdoc style is unnecessary for static helper functions,
>> especially functions with only a single caller.
>>
>> This all looks so much like handshake_sk_destruct(). Consider
>> eliminating the code duplication by splitting that function into a
>> couple of helpers instead of adding this one.
>>
>>
>>> + */
>>> +static void handshake_sk_destruct_req(struct sock *sk)
>>
>> Because this function is static, I imagine that the compiler will
>> bark about the addition of an unused function. Perhaps it would
>> be better to combine 2/6 and 3/6.
>>
>> That would also make it easier for reviewers to check the resource
>> accounting issues mentioned below.
>>
>>
>>> +{
>>> +     struct handshake_req *req;
>>> +
>>> +     req = handshake_req_hash_lookup(sk);
>>> +     if (!req)
>>> +             return;
>>> +
>>> +     trace_handshake_destruct(sock_net(sk), req, sk);
>>
>> Wondering if this function needs to preserve the socket's destructor
>> callback chain like so:
>>
>> +       void (sk_destruct)(struct sock sk);
>>
>>   ...
>>
>> +       sk_destruct = req->hr_odestruct;
>> +       sk->sk_destruct = sk_destruct;
>>
>> then:
>>
>>> +     handshake_req_destroy(req);
>>
>> Because of the current code organization and patch ordering, it's
>> difficult to confirm that sock_put() isn't necessary here.
>>
>>
>>> +}
>>> +
>>>  /**
>>>   * handshake_req_alloc - Allocate a handshake request
>>>   * @proto: security protocol
>>
>> There's no synchronization preventing concurrent handshake_req_cancel()
>> calls from accessing the request after it's freed during handshake
>> completion. That is one reason why handshake_complete() leaves completed
>> requests in the hash.
> 
> Ah, so you are worried that free-ing the request will race with
> accessing the request after a handshake_req_hash_lookup().
> 
> Ok, makes sense. It seems like one answer to that is to add synchronisation
> 
>>
>> So I'm thinking that removing requests like this is not going to work
>> out. Would it work better if handshake_req_hash_add() could recognize
>> that a KeyUpdate is going on, and allow replacement of a hashed
>> request? I haven't thought that through.
> 
> I guess the idea would be to do something like this in
> handshake_req_hash_add() if the entry already exists?
> 
>     if (test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED, &req->hr_flags)) {
>         /* Request already completed */
>         rhashtable_replace_fast(...);
>     }
> 
> I'm not sure that's better. That could possibly still race with
> something that hasn't yet set HANDSHAKE_F_REQ_COMPLETED and overwrite
> the request unexpectedly.
> 
> What about adding synchronisation and keeping the current approach?
> From a quick look it should be enough to just edit
> handshake_sk_destruct() and handshake_req_cancel()

Or make the KeyUpdate requests somehow distinctive so they do not
collide with initial handshake requests.


> Alistair
> 
>>
>>
>> As always, please double-check my questions and assumptions before
>> revising this patch!
>>
>>
>> --
>> Chuck Lever


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 2/6] net/handshake: Define handshake_sk_destruct_req
  2025-11-13 14:01       ` Chuck Lever
@ 2025-11-13 14:37         ` Chuck Lever
  2025-11-14  3:44           ` Alistair Francis
  0 siblings, 1 reply; 33+ messages in thread
From: Chuck Lever @ 2025-11-13 14:37 UTC (permalink / raw)
  To: Alistair Francis
  Cc: hare, kernel-tls-handshake, netdev, linux-kernel, linux-doc,
	linux-nvme, linux-nfs, kbusch, axboe, hch, sagi, kch, hare,
	Alistair Francis

On 11/13/25 9:01 AM, Chuck Lever wrote:
> On 11/13/25 5:19 AM, Alistair Francis wrote:
>> On Thu, Nov 13, 2025 at 1:47 AM Chuck Lever <chuck.lever@oracle.com> wrote:
>>>
>>> On 11/11/25 11:27 PM, alistair23@gmail.com wrote:
>>>> From: Alistair Francis <alistair.francis@wdc.com>
>>>>
>>>> Define a `handshake_sk_destruct_req()` function to allow the destruction
>>>> of the handshake req.
>>>>
>>>> This is required to avoid hash conflicts when handshake_req_hash_add()
>>>> is called as part of submitting the KeyUpdate request.
>>>>
>>>> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
>>>> Reviewed-by: Hannes Reinecke <hare@suse.de>
>>>> ---
>>>> v5:
>>>>  - No change
>>>> v4:
>>>>  - No change
>>>> v3:
>>>>  - New patch
>>>>
>>>>  net/handshake/request.c | 16 ++++++++++++++++
>>>>  1 file changed, 16 insertions(+)
>>>>
>>>> diff --git a/net/handshake/request.c b/net/handshake/request.c
>>>> index 274d2c89b6b2..0d1c91c80478 100644
>>>> --- a/net/handshake/request.c
>>>> +++ b/net/handshake/request.c
>>>> @@ -98,6 +98,22 @@ static void handshake_sk_destruct(struct sock *sk)
>>>>               sk_destruct(sk);
>>>>  }
>>>>
>>>> +/**
>>>> + * handshake_sk_destruct_req - destroy an existing request
>>>> + * @sk: socket on which there is an existing request
>>>
>>> Generally the kdoc style is unnecessary for static helper functions,
>>> especially functions with only a single caller.
>>>
>>> This all looks so much like handshake_sk_destruct(). Consider
>>> eliminating the code duplication by splitting that function into a
>>> couple of helpers instead of adding this one.
>>>
>>>
>>>> + */
>>>> +static void handshake_sk_destruct_req(struct sock *sk)
>>>
>>> Because this function is static, I imagine that the compiler will
>>> bark about the addition of an unused function. Perhaps it would
>>> be better to combine 2/6 and 3/6.
>>>
>>> That would also make it easier for reviewers to check the resource
>>> accounting issues mentioned below.
>>>
>>>
>>>> +{
>>>> +     struct handshake_req *req;
>>>> +
>>>> +     req = handshake_req_hash_lookup(sk);
>>>> +     if (!req)
>>>> +             return;
>>>> +
>>>> +     trace_handshake_destruct(sock_net(sk), req, sk);
>>>
>>> Wondering if this function needs to preserve the socket's destructor
>>> callback chain like so:
>>>
>>> +       void (sk_destruct)(struct sock sk);
>>>
>>>   ...
>>>
>>> +       sk_destruct = req->hr_odestruct;
>>> +       sk->sk_destruct = sk_destruct;
>>>
>>> then:
>>>
>>>> +     handshake_req_destroy(req);
>>>
>>> Because of the current code organization and patch ordering, it's
>>> difficult to confirm that sock_put() isn't necessary here.
>>>
>>>
>>>> +}
>>>> +
>>>>  /**
>>>>   * handshake_req_alloc - Allocate a handshake request
>>>>   * @proto: security protocol
>>>
>>> There's no synchronization preventing concurrent handshake_req_cancel()
>>> calls from accessing the request after it's freed during handshake
>>> completion. That is one reason why handshake_complete() leaves completed
>>> requests in the hash.
>>
>> Ah, so you are worried that free-ing the request will race with
>> accessing the request after a handshake_req_hash_lookup().
>>
>> Ok, makes sense. It seems like one answer to that is to add synchronisation
>>
>>>
>>> So I'm thinking that removing requests like this is not going to work
>>> out. Would it work better if handshake_req_hash_add() could recognize
>>> that a KeyUpdate is going on, and allow replacement of a hashed
>>> request? I haven't thought that through.
>>
>> I guess the idea would be to do something like this in
>> handshake_req_hash_add() if the entry already exists?
>>
>>     if (test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED, &req->hr_flags)) {
>>         /* Request already completed */
>>         rhashtable_replace_fast(...);
>>     }
>>
>> I'm not sure that's better. That could possibly still race with
>> something that hasn't yet set HANDSHAKE_F_REQ_COMPLETED and overwrite
>> the request unexpectedly.
>>
>> What about adding synchronisation and keeping the current approach?
>> From a quick look it should be enough to just edit
>> handshake_sk_destruct() and handshake_req_cancel()
> 
> Or make the KeyUpdate requests somehow distinctive so they do not
> collide with initial handshake requests.

Another thought: expand the current _req structure to also manage
KeyUpdates. I think there can be only one upcall request pending
at a time, right?


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 4/6] net/handshake: Support KeyUpdate message types
  2025-11-12  4:27 ` [PATCH v5 4/6] net/handshake: Support KeyUpdate message types alistair23
  2025-11-12 15:49   ` Chuck Lever
@ 2025-11-13 14:41   ` Chuck Lever
  2025-11-27 13:12   ` Hannes Reinecke
  2 siblings, 0 replies; 33+ messages in thread
From: Chuck Lever @ 2025-11-13 14:41 UTC (permalink / raw)
  To: alistair23, hare, kernel-tls-handshake, netdev, linux-kernel,
	linux-doc, linux-nvme, linux-nfs
  Cc: kbusch, axboe, hch, sagi, kch, hare, Alistair Francis

On 11/11/25 11:27 PM, alistair23@gmail.com wrote:
> From: Alistair Francis <alistair.francis@wdc.com>
> 
> When reporting the msg-type to userspace let's also support reporting
> KeyUpdate events. This supports reporting a client/server event and if
> the other side requested a KeyUpdateRequest.
> 
> Link: https://datatracker.ietf.org/doc/html/rfc8446#section-4.6.3
> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
> ---
> v5:
>  - Drop clientkeyupdaterequest and serverkeyupdaterequest
> v4:
>  - Don't overload existing functions, instead create new ones
> v3:
>  - Fixup yamllint and kernel-doc failures
> 
>  Documentation/netlink/specs/handshake.yaml | 16 ++++-
>  drivers/nvme/host/tcp.c                    | 15 +++-
>  drivers/nvme/target/tcp.c                  | 10 ++-
>  include/net/handshake.h                    |  6 ++
>  include/uapi/linux/handshake.h             | 11 +++
>  net/handshake/tlshd.c                      | 83 +++++++++++++++++++++-
>  6 files changed, 133 insertions(+), 8 deletions(-)
> 
> diff --git a/Documentation/netlink/specs/handshake.yaml b/Documentation/netlink/specs/handshake.yaml
> index a273bc74d26f..2f77216c8ddf 100644
> --- a/Documentation/netlink/specs/handshake.yaml
> +++ b/Documentation/netlink/specs/handshake.yaml
> @@ -21,12 +21,18 @@ definitions:
>      type: enum
>      name: msg-type
>      value-start: 0
> -    entries: [unspec, clienthello, serverhello]
> +    entries: [unspec, clienthello, serverhello, clientkeyupdate,
> +              serverkeyupdate]
>    -
>      type: enum
>      name: auth
>      value-start: 0
>      entries: [unspec, unauth, psk, x509]
> +  -
> +    type: enum
> +    name: key-update-type
> +    value-start: 0
> +    entries: [unspec, send, received, received_request_update]
>  
>  attribute-sets:
>    -
> @@ -74,6 +80,13 @@ attribute-sets:
>        -
>          name: keyring
>          type: u32
> +      -
> +        name: key-update-request
> +        type: u32
> +        enum: key-update-type
> +      -
> +        name: session-id
> +        type: u32
>    -
>      name: done
>      attributes:
> @@ -116,6 +129,7 @@ operations:
>              - certificate
>              - peername
>              - keyring
> +            - session-id
>      -
>        name: done
>        doc: Handler reports handshake completion
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index 024d02248831..4797a4532b0d 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -20,6 +20,7 @@
>  #include <linux/iov_iter.h>
>  #include <net/busy_poll.h>
>  #include <trace/events/sock.h>
> +#include <uapi/linux/handshake.h>
>  
>  #include "nvme.h"
>  #include "fabrics.h"
> @@ -206,6 +207,10 @@ static struct workqueue_struct *nvme_tcp_wq;
>  static const struct blk_mq_ops nvme_tcp_mq_ops;
>  static const struct blk_mq_ops nvme_tcp_admin_mq_ops;
>  static int nvme_tcp_try_send(struct nvme_tcp_queue *queue);
> +static int nvme_tcp_start_tls(struct nvme_ctrl *nctrl,
> +			      struct nvme_tcp_queue *queue,
> +			      key_serial_t pskid,
> +			      enum handshake_key_update_type keyupdate);
>  
>  static inline struct nvme_tcp_ctrl *to_tcp_ctrl(struct nvme_ctrl *ctrl)
>  {
> @@ -1729,7 +1734,8 @@ static void nvme_tcp_tls_done(void *data, int status, key_serial_t pskid,
>  
>  static int nvme_tcp_start_tls(struct nvme_ctrl *nctrl,
>  			      struct nvme_tcp_queue *queue,
> -			      key_serial_t pskid)
> +			      key_serial_t pskid,
> +			      enum handshake_key_update_type keyupdate)
>  {
>  	int qid = nvme_tcp_queue_id(queue);
>  	int ret;
> @@ -1751,7 +1757,10 @@ static int nvme_tcp_start_tls(struct nvme_ctrl *nctrl,
>  	args.ta_timeout_ms = tls_handshake_timeout * 1000;
>  	queue->tls_err = -EOPNOTSUPP;
>  	init_completion(&queue->tls_complete);
> -	ret = tls_client_hello_psk(&args, GFP_KERNEL);
> +	if (keyupdate == HANDSHAKE_KEY_UPDATE_TYPE_UNSPEC)
> +		ret = tls_client_hello_psk(&args, GFP_KERNEL);
> +	else
> +		ret = tls_client_keyupdate_psk(&args, GFP_KERNEL, keyupdate);
>  	if (ret) {
>  		dev_err(nctrl->device, "queue %d: failed to start TLS: %d\n",
>  			qid, ret);
> @@ -1901,7 +1910,7 @@ static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, int qid,
>  
>  	/* If PSKs are configured try to start TLS */
>  	if (nvme_tcp_tls_configured(nctrl) && pskid) {
> -		ret = nvme_tcp_start_tls(nctrl, queue, pskid);
> +		ret = nvme_tcp_start_tls(nctrl, queue, pskid, HANDSHAKE_KEY_UPDATE_TYPE_UNSPEC);
>  		if (ret)
>  			goto err_init_connect;
>  	}
> diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
> index 7f8516892359..818efdeccef1 100644
> --- a/drivers/nvme/target/tcp.c
> +++ b/drivers/nvme/target/tcp.c
> @@ -1833,7 +1833,8 @@ static void nvmet_tcp_tls_handshake_timeout(struct work_struct *w)
>  	kref_put(&queue->kref, nvmet_tcp_release_queue);
>  }
>  
> -static int nvmet_tcp_tls_handshake(struct nvmet_tcp_queue *queue)
> +static int nvmet_tcp_tls_handshake(struct nvmet_tcp_queue *queue,
> +				   enum handshake_key_update_type keyupdate)
>  {
>  	int ret = -EOPNOTSUPP;
>  	struct tls_handshake_args args;
> @@ -1852,7 +1853,10 @@ static int nvmet_tcp_tls_handshake(struct nvmet_tcp_queue *queue)
>  	args.ta_keyring = key_serial(queue->port->nport->keyring);
>  	args.ta_timeout_ms = tls_handshake_timeout * 1000;
>  
> -	ret = tls_server_hello_psk(&args, GFP_KERNEL);
> +	if (keyupdate == HANDSHAKE_KEY_UPDATE_TYPE_UNSPEC)
> +		ret = tls_server_hello_psk(&args, GFP_KERNEL);
> +	else
> +		ret = tls_server_keyupdate_psk(&args, GFP_KERNEL, keyupdate);
>  	if (ret) {
>  		kref_put(&queue->kref, nvmet_tcp_release_queue);
>  		pr_err("failed to start TLS, err=%d\n", ret);
> @@ -1934,7 +1938,7 @@ static void nvmet_tcp_alloc_queue(struct nvmet_tcp_port *port,
>  		sk->sk_data_ready = port->data_ready;
>  		write_unlock_bh(&sk->sk_callback_lock);
>  		if (!nvmet_tcp_try_peek_pdu(queue)) {
> -			if (!nvmet_tcp_tls_handshake(queue))
> +			if (!nvmet_tcp_tls_handshake(queue, HANDSHAKE_KEY_UPDATE_TYPE_UNSPEC))
>  				return;
>  			/* TLS handshake failed, terminate the connection */
>  			goto out_destroy_sq;
> diff --git a/include/net/handshake.h b/include/net/handshake.h
> index 68d7f89e431a..f5a249327bf6 100644
> --- a/include/net/handshake.h
> +++ b/include/net/handshake.h
> @@ -10,6 +10,8 @@
>  #ifndef _NET_HANDSHAKE_H
>  #define _NET_HANDSHAKE_H
>  
> +#include <uapi/linux/handshake.h>
> +
>  enum {
>  	TLS_NO_KEYRING = 0,
>  	TLS_NO_PEERID = 0,
> @@ -38,8 +40,12 @@ struct tls_handshake_args {
>  int tls_client_hello_anon(const struct tls_handshake_args *args, gfp_t flags);
>  int tls_client_hello_x509(const struct tls_handshake_args *args, gfp_t flags);
>  int tls_client_hello_psk(const struct tls_handshake_args *args, gfp_t flags);
> +int tls_client_keyupdate_psk(const struct tls_handshake_args *args, gfp_t flags,
> +			     enum handshake_key_update_type keyupdate);
>  int tls_server_hello_x509(const struct tls_handshake_args *args, gfp_t flags);
>  int tls_server_hello_psk(const struct tls_handshake_args *args, gfp_t flags);
> +int tls_server_keyupdate_psk(const struct tls_handshake_args *args, gfp_t flags,
> +			     enum handshake_key_update_type keyupdate);
>  
>  bool tls_handshake_cancel(struct sock *sk);
>  void tls_handshake_close(struct socket *sock);
> diff --git a/include/uapi/linux/handshake.h b/include/uapi/linux/handshake.h
> index b68ffbaa5f31..483815a064f0 100644
> --- a/include/uapi/linux/handshake.h
> +++ b/include/uapi/linux/handshake.h
> @@ -19,6 +19,8 @@ enum handshake_msg_type {
>  	HANDSHAKE_MSG_TYPE_UNSPEC,
>  	HANDSHAKE_MSG_TYPE_CLIENTHELLO,
>  	HANDSHAKE_MSG_TYPE_SERVERHELLO,
> +	HANDSHAKE_MSG_TYPE_CLIENTKEYUPDATE,
> +	HANDSHAKE_MSG_TYPE_SERVERKEYUPDATE,
>  };
>  
>  enum handshake_auth {
> @@ -28,6 +30,13 @@ enum handshake_auth {
>  	HANDSHAKE_AUTH_X509,
>  };
>  
> +enum handshake_key_update_type {
> +	HANDSHAKE_KEY_UPDATE_TYPE_UNSPEC,
> +	HANDSHAKE_KEY_UPDATE_TYPE_SEND,
> +	HANDSHAKE_KEY_UPDATE_TYPE_RECEIVED,
> +	HANDSHAKE_KEY_UPDATE_TYPE_RECEIVED_REQUEST_UPDATE,
> +};
> +
>  enum {
>  	HANDSHAKE_A_X509_CERT = 1,
>  	HANDSHAKE_A_X509_PRIVKEY,
> @@ -46,6 +55,8 @@ enum {
>  	HANDSHAKE_A_ACCEPT_CERTIFICATE,
>  	HANDSHAKE_A_ACCEPT_PEERNAME,
>  	HANDSHAKE_A_ACCEPT_KEYRING,
> +	HANDSHAKE_A_ACCEPT_KEY_UPDATE_REQUEST,
> +	HANDSHAKE_A_ACCEPT_SESSION_ID,
>  
>  	__HANDSHAKE_A_ACCEPT_MAX,
>  	HANDSHAKE_A_ACCEPT_MAX = (__HANDSHAKE_A_ACCEPT_MAX - 1)
> diff --git a/net/handshake/tlshd.c b/net/handshake/tlshd.c
> index 85c5fed690c0..91d2bb515b7d 100644
> --- a/net/handshake/tlshd.c
> +++ b/net/handshake/tlshd.c
> @@ -41,6 +41,7 @@ struct tls_handshake_req {
>  	unsigned int		th_num_peerids;
>  	key_serial_t		th_peerid[5];
>  
> +	unsigned int		th_key_update_request;
>  	key_serial_t		handshake_session_id;
>  };
>  
> @@ -58,7 +59,8 @@ tls_handshake_req_init(struct handshake_req *req,
>  	treq->th_num_peerids = 0;
>  	treq->th_certificate = TLS_NO_CERT;
>  	treq->th_privkey = TLS_NO_PRIVKEY;
> -	treq->handshake_session_id = TLS_NO_PRIVKEY;
> +	treq->handshake_session_id = args->handshake_session_id;

Should you initialize the new th_key_update_request field to 0 here?
Prevent leaking previous memory contents to user space...

Reviewed-by: Chuck Lever <chuck.lever@oracle.com>


> +
>  	return treq;
>  }
>  
> @@ -265,6 +267,16 @@ static int tls_handshake_accept(struct handshake_req *req,
>  		break;
>  	}
>  
> +	ret = nla_put_u32(msg, HANDSHAKE_A_ACCEPT_SESSION_ID,
> +			  treq->handshake_session_id);
> +	if (ret < 0)
> +		goto out_cancel;
> +
> +	ret = nla_put_u32(msg, HANDSHAKE_A_ACCEPT_KEY_UPDATE_REQUEST,
> +			  treq->th_key_update_request);
> +	if (ret < 0)
> +		goto out_cancel;
> +
>  	genlmsg_end(msg, hdr);
>  	return genlmsg_reply(msg, info);
>  
> @@ -372,6 +384,44 @@ int tls_client_hello_psk(const struct tls_handshake_args *args, gfp_t flags)
>  }
>  EXPORT_SYMBOL(tls_client_hello_psk);
>  
> +/**
> + * tls_client_keyupdate_psk - request a PSK-based TLS handshake on a socket
> + * @args: socket and handshake parameters for this request
> + * @flags: memory allocation control flags
> + * @keyupdate: specifies the type of KeyUpdate operation
> + *
> + * Return values:
> + *   %0: Handshake request enqueue; ->done will be called when complete
> + *   %-EINVAL: Wrong number of local peer IDs
> + *   %-ESRCH: No user agent is available
> + *   %-ENOMEM: Memory allocation failed
> + */
> +int tls_client_keyupdate_psk(const struct tls_handshake_args *args, gfp_t flags,
> +			     enum handshake_key_update_type keyupdate)
> +{
> +	struct tls_handshake_req *treq;
> +	struct handshake_req *req;
> +	unsigned int i;
> +
> +	if (!args->ta_num_peerids ||
> +	    args->ta_num_peerids > ARRAY_SIZE(treq->th_peerid))
> +		return -EINVAL;
> +
> +	req = handshake_req_alloc(&tls_handshake_proto, flags);
> +	if (!req)
> +		return -ENOMEM;
> +	treq = tls_handshake_req_init(req, args);
> +	treq->th_type = HANDSHAKE_MSG_TYPE_CLIENTKEYUPDATE;
> +	treq->th_key_update_request = keyupdate;
> +	treq->th_auth_mode = HANDSHAKE_AUTH_PSK;
> +	treq->th_num_peerids = args->ta_num_peerids;
> +	for (i = 0; i < args->ta_num_peerids; i++)
> +		treq->th_peerid[i] = args->ta_my_peerids[i];
> +
> +	return handshake_req_submit(args->ta_sock, req, flags);
> +}
> +EXPORT_SYMBOL(tls_client_keyupdate_psk);
> +
>  /**
>   * tls_server_hello_x509 - request a server TLS handshake on a socket
>   * @args: socket and handshake parameters for this request
> @@ -428,6 +478,37 @@ int tls_server_hello_psk(const struct tls_handshake_args *args, gfp_t flags)
>  }
>  EXPORT_SYMBOL(tls_server_hello_psk);
>  
> +/**
> + * tls_server_keyupdate_psk - request a server TLS KeyUpdate on a socket
> + * @args: socket and handshake parameters for this request
> + * @flags: memory allocation control flags
> + * @keyupdate: specifies the type of KeyUpdate operation
> + *
> + * Return values:
> + *   %0: Handshake request enqueue; ->done will be called when complete
> + *   %-ESRCH: No user agent is available
> + *   %-ENOMEM: Memory allocation failed
> + */
> +int tls_server_keyupdate_psk(const struct tls_handshake_args *args, gfp_t flags,
> +			     enum handshake_key_update_type keyupdate)
> +{
> +	struct tls_handshake_req *treq;
> +	struct handshake_req *req;
> +
> +	req = handshake_req_alloc(&tls_handshake_proto, flags);
> +	if (!req)
> +		return -ENOMEM;
> +	treq = tls_handshake_req_init(req, args);
> +	treq->th_type = HANDSHAKE_MSG_TYPE_SERVERKEYUPDATE;
> +	treq->th_key_update_request = keyupdate;
> +	treq->th_auth_mode = HANDSHAKE_AUTH_PSK;
> +	treq->th_num_peerids = 1;
> +	treq->th_peerid[0] = args->ta_my_peerids[0];
> +
> +	return handshake_req_submit(args->ta_sock, req, flags);
> +}
> +EXPORT_SYMBOL(tls_server_keyupdate_psk);
> +
>  /**
>   * tls_handshake_cancel - cancel a pending handshake
>   * @sk: socket on which there is an ongoing handshake


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 2/6] net/handshake: Define handshake_sk_destruct_req
  2025-11-13 14:37         ` Chuck Lever
@ 2025-11-14  3:44           ` Alistair Francis
  2025-11-14 14:14             ` Chuck Lever
  0 siblings, 1 reply; 33+ messages in thread
From: Alistair Francis @ 2025-11-14  3:44 UTC (permalink / raw)
  To: Chuck Lever
  Cc: hare, kernel-tls-handshake, netdev, linux-kernel, linux-doc,
	linux-nvme, linux-nfs, kbusch, axboe, hch, sagi, kch, hare,
	Alistair Francis

On Fri, Nov 14, 2025 at 12:37 AM Chuck Lever <chuck.lever@oracle.com> wrote:
>
> On 11/13/25 9:01 AM, Chuck Lever wrote:
> > On 11/13/25 5:19 AM, Alistair Francis wrote:
> >> On Thu, Nov 13, 2025 at 1:47 AM Chuck Lever <chuck.lever@oracle.com> wrote:
> >>>
> >>> On 11/11/25 11:27 PM, alistair23@gmail.com wrote:
> >>>> From: Alistair Francis <alistair.francis@wdc.com>
> >>>>
> >>>> Define a `handshake_sk_destruct_req()` function to allow the destruction
> >>>> of the handshake req.
> >>>>
> >>>> This is required to avoid hash conflicts when handshake_req_hash_add()
> >>>> is called as part of submitting the KeyUpdate request.
> >>>>
> >>>> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
> >>>> Reviewed-by: Hannes Reinecke <hare@suse.de>
> >>>> ---
> >>>> v5:
> >>>>  - No change
> >>>> v4:
> >>>>  - No change
> >>>> v3:
> >>>>  - New patch
> >>>>
> >>>>  net/handshake/request.c | 16 ++++++++++++++++
> >>>>  1 file changed, 16 insertions(+)
> >>>>
> >>>> diff --git a/net/handshake/request.c b/net/handshake/request.c
> >>>> index 274d2c89b6b2..0d1c91c80478 100644
> >>>> --- a/net/handshake/request.c
> >>>> +++ b/net/handshake/request.c
> >>>> @@ -98,6 +98,22 @@ static void handshake_sk_destruct(struct sock *sk)
> >>>>               sk_destruct(sk);
> >>>>  }
> >>>>
> >>>> +/**
> >>>> + * handshake_sk_destruct_req - destroy an existing request
> >>>> + * @sk: socket on which there is an existing request
> >>>
> >>> Generally the kdoc style is unnecessary for static helper functions,
> >>> especially functions with only a single caller.
> >>>
> >>> This all looks so much like handshake_sk_destruct(). Consider
> >>> eliminating the code duplication by splitting that function into a
> >>> couple of helpers instead of adding this one.
> >>>
> >>>
> >>>> + */
> >>>> +static void handshake_sk_destruct_req(struct sock *sk)
> >>>
> >>> Because this function is static, I imagine that the compiler will
> >>> bark about the addition of an unused function. Perhaps it would
> >>> be better to combine 2/6 and 3/6.
> >>>
> >>> That would also make it easier for reviewers to check the resource
> >>> accounting issues mentioned below.
> >>>
> >>>
> >>>> +{
> >>>> +     struct handshake_req *req;
> >>>> +
> >>>> +     req = handshake_req_hash_lookup(sk);
> >>>> +     if (!req)
> >>>> +             return;
> >>>> +
> >>>> +     trace_handshake_destruct(sock_net(sk), req, sk);
> >>>
> >>> Wondering if this function needs to preserve the socket's destructor
> >>> callback chain like so:
> >>>
> >>> +       void (sk_destruct)(struct sock sk);
> >>>
> >>>   ...
> >>>
> >>> +       sk_destruct = req->hr_odestruct;
> >>> +       sk->sk_destruct = sk_destruct;
> >>>
> >>> then:
> >>>
> >>>> +     handshake_req_destroy(req);
> >>>
> >>> Because of the current code organization and patch ordering, it's
> >>> difficult to confirm that sock_put() isn't necessary here.
> >>>
> >>>
> >>>> +}
> >>>> +
> >>>>  /**
> >>>>   * handshake_req_alloc - Allocate a handshake request
> >>>>   * @proto: security protocol
> >>>
> >>> There's no synchronization preventing concurrent handshake_req_cancel()
> >>> calls from accessing the request after it's freed during handshake
> >>> completion. That is one reason why handshake_complete() leaves completed
> >>> requests in the hash.
> >>
> >> Ah, so you are worried that free-ing the request will race with
> >> accessing the request after a handshake_req_hash_lookup().
> >>
> >> Ok, makes sense. It seems like one answer to that is to add synchronisation
> >>
> >>>
> >>> So I'm thinking that removing requests like this is not going to work
> >>> out. Would it work better if handshake_req_hash_add() could recognize
> >>> that a KeyUpdate is going on, and allow replacement of a hashed
> >>> request? I haven't thought that through.
> >>
> >> I guess the idea would be to do something like this in
> >> handshake_req_hash_add() if the entry already exists?
> >>
> >>     if (test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED, &req->hr_flags)) {
> >>         /* Request already completed */
> >>         rhashtable_replace_fast(...);
> >>     }
> >>
> >> I'm not sure that's better. That could possibly still race with
> >> something that hasn't yet set HANDSHAKE_F_REQ_COMPLETED and overwrite
> >> the request unexpectedly.
> >>
> >> What about adding synchronisation and keeping the current approach?
> >> From a quick look it should be enough to just edit
> >> handshake_sk_destruct() and handshake_req_cancel()
> >
> > Or make the KeyUpdate requests somehow distinctive so they do not
> > collide with initial handshake requests.

Hmmm... Then each KeyUpdate needs to be distinctive, which will
indefinitely grow the hash table

>
> Another thought: expand the current _req structure to also manage
> KeyUpdates. I think there can be only one upcall request pending
> at a time, right?

There should only be a single request pending per queue.

I'm not sure I see what we could do to expand the _req structure.

What about adding `HANDSHAKE_F_REQ_CANCEL` to `hr_flags_bits` and
using that to ensure we don't free something that is currently being
cancelled and the other way around?

Alistair

>
>
> --
> Chuck Lever

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 2/6] net/handshake: Define handshake_sk_destruct_req
  2025-11-14  3:44           ` Alistair Francis
@ 2025-11-14 14:14             ` Chuck Lever
  2025-11-19  0:45               ` Alistair Francis
  0 siblings, 1 reply; 33+ messages in thread
From: Chuck Lever @ 2025-11-14 14:14 UTC (permalink / raw)
  To: Alistair Francis
  Cc: hare, kernel-tls-handshake, netdev, linux-kernel, linux-doc,
	linux-nvme, linux-nfs, kbusch, axboe, hch, sagi, kch, hare,
	Alistair Francis

On 11/13/25 10:44 PM, Alistair Francis wrote:
> On Fri, Nov 14, 2025 at 12:37 AM Chuck Lever <chuck.lever@oracle.com> wrote:
>>
>> On 11/13/25 9:01 AM, Chuck Lever wrote:
>>> On 11/13/25 5:19 AM, Alistair Francis wrote:
>>>> On Thu, Nov 13, 2025 at 1:47 AM Chuck Lever <chuck.lever@oracle.com> wrote:
>>>>>
>>>>> On 11/11/25 11:27 PM, alistair23@gmail.com wrote:
>>>>>> From: Alistair Francis <alistair.francis@wdc.com>
>>>>>>
>>>>>> Define a `handshake_sk_destruct_req()` function to allow the destruction
>>>>>> of the handshake req.
>>>>>>
>>>>>> This is required to avoid hash conflicts when handshake_req_hash_add()
>>>>>> is called as part of submitting the KeyUpdate request.
>>>>>>
>>>>>> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
>>>>>> Reviewed-by: Hannes Reinecke <hare@suse.de>
>>>>>> ---
>>>>>> v5:
>>>>>>  - No change
>>>>>> v4:
>>>>>>  - No change
>>>>>> v3:
>>>>>>  - New patch
>>>>>>
>>>>>>  net/handshake/request.c | 16 ++++++++++++++++
>>>>>>  1 file changed, 16 insertions(+)
>>>>>>
>>>>>> diff --git a/net/handshake/request.c b/net/handshake/request.c
>>>>>> index 274d2c89b6b2..0d1c91c80478 100644
>>>>>> --- a/net/handshake/request.c
>>>>>> +++ b/net/handshake/request.c
>>>>>> @@ -98,6 +98,22 @@ static void handshake_sk_destruct(struct sock *sk)
>>>>>>               sk_destruct(sk);
>>>>>>  }
>>>>>>
>>>>>> +/**
>>>>>> + * handshake_sk_destruct_req - destroy an existing request
>>>>>> + * @sk: socket on which there is an existing request
>>>>>
>>>>> Generally the kdoc style is unnecessary for static helper functions,
>>>>> especially functions with only a single caller.
>>>>>
>>>>> This all looks so much like handshake_sk_destruct(). Consider
>>>>> eliminating the code duplication by splitting that function into a
>>>>> couple of helpers instead of adding this one.
>>>>>
>>>>>
>>>>>> + */
>>>>>> +static void handshake_sk_destruct_req(struct sock *sk)
>>>>>
>>>>> Because this function is static, I imagine that the compiler will
>>>>> bark about the addition of an unused function. Perhaps it would
>>>>> be better to combine 2/6 and 3/6.
>>>>>
>>>>> That would also make it easier for reviewers to check the resource
>>>>> accounting issues mentioned below.
>>>>>
>>>>>
>>>>>> +{
>>>>>> +     struct handshake_req *req;
>>>>>> +
>>>>>> +     req = handshake_req_hash_lookup(sk);
>>>>>> +     if (!req)
>>>>>> +             return;
>>>>>> +
>>>>>> +     trace_handshake_destruct(sock_net(sk), req, sk);
>>>>>
>>>>> Wondering if this function needs to preserve the socket's destructor
>>>>> callback chain like so:
>>>>>
>>>>> +       void (sk_destruct)(struct sock sk);
>>>>>
>>>>>   ...
>>>>>
>>>>> +       sk_destruct = req->hr_odestruct;
>>>>> +       sk->sk_destruct = sk_destruct;
>>>>>
>>>>> then:
>>>>>
>>>>>> +     handshake_req_destroy(req);
>>>>>
>>>>> Because of the current code organization and patch ordering, it's
>>>>> difficult to confirm that sock_put() isn't necessary here.
>>>>>
>>>>>
>>>>>> +}
>>>>>> +
>>>>>>  /**
>>>>>>   * handshake_req_alloc - Allocate a handshake request
>>>>>>   * @proto: security protocol
>>>>>
>>>>> There's no synchronization preventing concurrent handshake_req_cancel()
>>>>> calls from accessing the request after it's freed during handshake
>>>>> completion. That is one reason why handshake_complete() leaves completed
>>>>> requests in the hash.
>>>>
>>>> Ah, so you are worried that free-ing the request will race with
>>>> accessing the request after a handshake_req_hash_lookup().
>>>>
>>>> Ok, makes sense. It seems like one answer to that is to add synchronisation
>>>>
>>>>>
>>>>> So I'm thinking that removing requests like this is not going to work
>>>>> out. Would it work better if handshake_req_hash_add() could recognize
>>>>> that a KeyUpdate is going on, and allow replacement of a hashed
>>>>> request? I haven't thought that through.
>>>>
>>>> I guess the idea would be to do something like this in
>>>> handshake_req_hash_add() if the entry already exists?
>>>>
>>>>     if (test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED, &req->hr_flags)) {
>>>>         /* Request already completed */
>>>>         rhashtable_replace_fast(...);
>>>>     }
>>>>
>>>> I'm not sure that's better. That could possibly still race with
>>>> something that hasn't yet set HANDSHAKE_F_REQ_COMPLETED and overwrite
>>>> the request unexpectedly.
>>>>
>>>> What about adding synchronisation and keeping the current approach?
>>>> From a quick look it should be enough to just edit
>>>> handshake_sk_destruct() and handshake_req_cancel()
>>>
>>> Or make the KeyUpdate requests somehow distinctive so they do not
>>> collide with initial handshake requests.
> 
> Hmmm... Then each KeyUpdate needs to be distinctive, which will
> indefinitely grow the hash table

Two random observations:

1. There is only zero or one KeyUpdate going on at a time. Certainly
the KeyUpdate-related data structures don't need to stay around.

2. Maybe a separate data structure to track KeyUpdates is appropriate.


>> Another thought: expand the current _req structure to also manage
>> KeyUpdates. I think there can be only one upcall request pending
>> at a time, right?
> 
> There should only be a single request pending per queue.
> 
> I'm not sure I see what we could do to expand the _req structure.
> 
> What about adding `HANDSHAKE_F_REQ_CANCEL` to `hr_flags_bits` and
> using that to ensure we don't free something that is currently being
> cancelled and the other way around?

A CANCEL can happen at any time during the life of the session/socket,
including long after the handshake was done. It's part of socket
teardown. I don't think we can simply remove the req on handshake
completion.


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 2/6] net/handshake: Define handshake_sk_destruct_req
  2025-11-14 14:14             ` Chuck Lever
@ 2025-11-19  0:45               ` Alistair Francis
  2025-11-20 13:51                 ` Chuck Lever
  0 siblings, 1 reply; 33+ messages in thread
From: Alistair Francis @ 2025-11-19  0:45 UTC (permalink / raw)
  To: Chuck Lever
  Cc: hare, kernel-tls-handshake, netdev, linux-kernel, linux-doc,
	linux-nvme, linux-nfs, kbusch, axboe, hch, sagi, kch, hare,
	Alistair Francis

On Sat, Nov 15, 2025 at 12:14 AM Chuck Lever <chuck.lever@oracle.com> wrote:
>
> On 11/13/25 10:44 PM, Alistair Francis wrote:
> > On Fri, Nov 14, 2025 at 12:37 AM Chuck Lever <chuck.lever@oracle.com> wrote:
> >>
> >> On 11/13/25 9:01 AM, Chuck Lever wrote:
> >>> On 11/13/25 5:19 AM, Alistair Francis wrote:
> >>>> On Thu, Nov 13, 2025 at 1:47 AM Chuck Lever <chuck.lever@oracle.com> wrote:
> >>>>>
> >>>>> On 11/11/25 11:27 PM, alistair23@gmail.com wrote:
> >>>>>> From: Alistair Francis <alistair.francis@wdc.com>
> >>>>>>
> >>>>>> Define a `handshake_sk_destruct_req()` function to allow the destruction
> >>>>>> of the handshake req.
> >>>>>>
> >>>>>> This is required to avoid hash conflicts when handshake_req_hash_add()
> >>>>>> is called as part of submitting the KeyUpdate request.
> >>>>>>
> >>>>>> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
> >>>>>> Reviewed-by: Hannes Reinecke <hare@suse.de>
> >>>>>> ---
> >>>>>> v5:
> >>>>>>  - No change
> >>>>>> v4:
> >>>>>>  - No change
> >>>>>> v3:
> >>>>>>  - New patch
> >>>>>>
> >>>>>>  net/handshake/request.c | 16 ++++++++++++++++
> >>>>>>  1 file changed, 16 insertions(+)
> >>>>>>
> >>>>>> diff --git a/net/handshake/request.c b/net/handshake/request.c
> >>>>>> index 274d2c89b6b2..0d1c91c80478 100644
> >>>>>> --- a/net/handshake/request.c
> >>>>>> +++ b/net/handshake/request.c
> >>>>>> @@ -98,6 +98,22 @@ static void handshake_sk_destruct(struct sock *sk)
> >>>>>>               sk_destruct(sk);
> >>>>>>  }
> >>>>>>
> >>>>>> +/**
> >>>>>> + * handshake_sk_destruct_req - destroy an existing request
> >>>>>> + * @sk: socket on which there is an existing request
> >>>>>
> >>>>> Generally the kdoc style is unnecessary for static helper functions,
> >>>>> especially functions with only a single caller.
> >>>>>
> >>>>> This all looks so much like handshake_sk_destruct(). Consider
> >>>>> eliminating the code duplication by splitting that function into a
> >>>>> couple of helpers instead of adding this one.
> >>>>>
> >>>>>
> >>>>>> + */
> >>>>>> +static void handshake_sk_destruct_req(struct sock *sk)
> >>>>>
> >>>>> Because this function is static, I imagine that the compiler will
> >>>>> bark about the addition of an unused function. Perhaps it would
> >>>>> be better to combine 2/6 and 3/6.
> >>>>>
> >>>>> That would also make it easier for reviewers to check the resource
> >>>>> accounting issues mentioned below.
> >>>>>
> >>>>>
> >>>>>> +{
> >>>>>> +     struct handshake_req *req;
> >>>>>> +
> >>>>>> +     req = handshake_req_hash_lookup(sk);
> >>>>>> +     if (!req)
> >>>>>> +             return;
> >>>>>> +
> >>>>>> +     trace_handshake_destruct(sock_net(sk), req, sk);
> >>>>>
> >>>>> Wondering if this function needs to preserve the socket's destructor
> >>>>> callback chain like so:
> >>>>>
> >>>>> +       void (sk_destruct)(struct sock sk);
> >>>>>
> >>>>>   ...
> >>>>>
> >>>>> +       sk_destruct = req->hr_odestruct;
> >>>>> +       sk->sk_destruct = sk_destruct;
> >>>>>
> >>>>> then:
> >>>>>
> >>>>>> +     handshake_req_destroy(req);
> >>>>>
> >>>>> Because of the current code organization and patch ordering, it's
> >>>>> difficult to confirm that sock_put() isn't necessary here.
> >>>>>
> >>>>>
> >>>>>> +}
> >>>>>> +
> >>>>>>  /**
> >>>>>>   * handshake_req_alloc - Allocate a handshake request
> >>>>>>   * @proto: security protocol
> >>>>>
> >>>>> There's no synchronization preventing concurrent handshake_req_cancel()
> >>>>> calls from accessing the request after it's freed during handshake
> >>>>> completion. That is one reason why handshake_complete() leaves completed
> >>>>> requests in the hash.
> >>>>
> >>>> Ah, so you are worried that free-ing the request will race with
> >>>> accessing the request after a handshake_req_hash_lookup().
> >>>>
> >>>> Ok, makes sense. It seems like one answer to that is to add synchronisation
> >>>>
> >>>>>
> >>>>> So I'm thinking that removing requests like this is not going to work
> >>>>> out. Would it work better if handshake_req_hash_add() could recognize
> >>>>> that a KeyUpdate is going on, and allow replacement of a hashed
> >>>>> request? I haven't thought that through.
> >>>>
> >>>> I guess the idea would be to do something like this in
> >>>> handshake_req_hash_add() if the entry already exists?
> >>>>
> >>>>     if (test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED, &req->hr_flags)) {
> >>>>         /* Request already completed */
> >>>>         rhashtable_replace_fast(...);
> >>>>     }
> >>>>
> >>>> I'm not sure that's better. That could possibly still race with
> >>>> something that hasn't yet set HANDSHAKE_F_REQ_COMPLETED and overwrite
> >>>> the request unexpectedly.
> >>>>
> >>>> What about adding synchronisation and keeping the current approach?
> >>>> From a quick look it should be enough to just edit
> >>>> handshake_sk_destruct() and handshake_req_cancel()
> >>>
> >>> Or make the KeyUpdate requests somehow distinctive so they do not
> >>> collide with initial handshake requests.
> >
> > Hmmm... Then each KeyUpdate needs to be distinctive, which will
> > indefinitely grow the hash table
>
> Two random observations:
>
> 1. There is only zero or one KeyUpdate going on at a time. Certainly
> the KeyUpdate-related data structures don't need to stay around.

That's the same as the original handshake req though, which you are
saying can't be freed

>
> 2. Maybe a separate data structure to track KeyUpdates is appropriate.
>
>
> >> Another thought: expand the current _req structure to also manage
> >> KeyUpdates. I think there can be only one upcall request pending
> >> at a time, right?
> >
> > There should only be a single request pending per queue.
> >
> > I'm not sure I see what we could do to expand the _req structure.
> >
> > What about adding `HANDSHAKE_F_REQ_CANCEL` to `hr_flags_bits` and
> > using that to ensure we don't free something that is currently being
> > cancelled and the other way around?
>
> A CANCEL can happen at any time during the life of the session/socket,
> including long after the handshake was done. It's part of socket
> teardown. I don't think we can simply remove the req on handshake
> completion.

Does that matter though? If a cancel is issued on a freed req we just
do nothing as there is nothing to cancel.

Alistair

>
>
> --
> Chuck Lever

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 2/6] net/handshake: Define handshake_sk_destruct_req
  2025-11-19  0:45               ` Alistair Francis
@ 2025-11-20 13:51                 ` Chuck Lever
  2025-11-25  5:00                   ` Alistair Francis
  0 siblings, 1 reply; 33+ messages in thread
From: Chuck Lever @ 2025-11-20 13:51 UTC (permalink / raw)
  To: Alistair Francis
  Cc: hare, kernel-tls-handshake, netdev, linux-kernel, linux-doc,
	linux-nvme, linux-nfs, kbusch, axboe, hch, sagi, kch, hare,
	Alistair Francis

On 11/18/25 7:45 PM, Alistair Francis wrote:
> On Sat, Nov 15, 2025 at 12:14 AM Chuck Lever <chuck.lever@oracle.com> wrote:
>>
>> On 11/13/25 10:44 PM, Alistair Francis wrote:
>>> On Fri, Nov 14, 2025 at 12:37 AM Chuck Lever <chuck.lever@oracle.com> wrote:
>>>>
>>>> On 11/13/25 9:01 AM, Chuck Lever wrote:
>>>>> On 11/13/25 5:19 AM, Alistair Francis wrote:
>>>>>> On Thu, Nov 13, 2025 at 1:47 AM Chuck Lever <chuck.lever@oracle.com> wrote:
>>>>>>>
>>>>>>> On 11/11/25 11:27 PM, alistair23@gmail.com wrote:
>>>>>>>> From: Alistair Francis <alistair.francis@wdc.com>
>>>>>>>>
>>>>>>>> Define a `handshake_sk_destruct_req()` function to allow the destruction
>>>>>>>> of the handshake req.
>>>>>>>>
>>>>>>>> This is required to avoid hash conflicts when handshake_req_hash_add()
>>>>>>>> is called as part of submitting the KeyUpdate request.
>>>>>>>>
>>>>>>>> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
>>>>>>>> Reviewed-by: Hannes Reinecke <hare@suse.de>
>>>>>>>> ---
>>>>>>>> v5:
>>>>>>>>  - No change
>>>>>>>> v4:
>>>>>>>>  - No change
>>>>>>>> v3:
>>>>>>>>  - New patch
>>>>>>>>
>>>>>>>>  net/handshake/request.c | 16 ++++++++++++++++
>>>>>>>>  1 file changed, 16 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/net/handshake/request.c b/net/handshake/request.c
>>>>>>>> index 274d2c89b6b2..0d1c91c80478 100644
>>>>>>>> --- a/net/handshake/request.c
>>>>>>>> +++ b/net/handshake/request.c
>>>>>>>> @@ -98,6 +98,22 @@ static void handshake_sk_destruct(struct sock *sk)
>>>>>>>>               sk_destruct(sk);
>>>>>>>>  }
>>>>>>>>
>>>>>>>> +/**
>>>>>>>> + * handshake_sk_destruct_req - destroy an existing request
>>>>>>>> + * @sk: socket on which there is an existing request
>>>>>>>
>>>>>>> Generally the kdoc style is unnecessary for static helper functions,
>>>>>>> especially functions with only a single caller.
>>>>>>>
>>>>>>> This all looks so much like handshake_sk_destruct(). Consider
>>>>>>> eliminating the code duplication by splitting that function into a
>>>>>>> couple of helpers instead of adding this one.
>>>>>>>
>>>>>>>
>>>>>>>> + */
>>>>>>>> +static void handshake_sk_destruct_req(struct sock *sk)
>>>>>>>
>>>>>>> Because this function is static, I imagine that the compiler will
>>>>>>> bark about the addition of an unused function. Perhaps it would
>>>>>>> be better to combine 2/6 and 3/6.
>>>>>>>
>>>>>>> That would also make it easier for reviewers to check the resource
>>>>>>> accounting issues mentioned below.
>>>>>>>
>>>>>>>
>>>>>>>> +{
>>>>>>>> +     struct handshake_req *req;
>>>>>>>> +
>>>>>>>> +     req = handshake_req_hash_lookup(sk);
>>>>>>>> +     if (!req)
>>>>>>>> +             return;
>>>>>>>> +
>>>>>>>> +     trace_handshake_destruct(sock_net(sk), req, sk);
>>>>>>>
>>>>>>> Wondering if this function needs to preserve the socket's destructor
>>>>>>> callback chain like so:
>>>>>>>
>>>>>>> +       void (sk_destruct)(struct sock sk);
>>>>>>>
>>>>>>>   ...
>>>>>>>
>>>>>>> +       sk_destruct = req->hr_odestruct;
>>>>>>> +       sk->sk_destruct = sk_destruct;
>>>>>>>
>>>>>>> then:
>>>>>>>
>>>>>>>> +     handshake_req_destroy(req);
>>>>>>>
>>>>>>> Because of the current code organization and patch ordering, it's
>>>>>>> difficult to confirm that sock_put() isn't necessary here.
>>>>>>>
>>>>>>>
>>>>>>>> +}
>>>>>>>> +
>>>>>>>>  /**
>>>>>>>>   * handshake_req_alloc - Allocate a handshake request
>>>>>>>>   * @proto: security protocol
>>>>>>>
>>>>>>> There's no synchronization preventing concurrent handshake_req_cancel()
>>>>>>> calls from accessing the request after it's freed during handshake
>>>>>>> completion. That is one reason why handshake_complete() leaves completed
>>>>>>> requests in the hash.
>>>>>>
>>>>>> Ah, so you are worried that free-ing the request will race with
>>>>>> accessing the request after a handshake_req_hash_lookup().
>>>>>>
>>>>>> Ok, makes sense. It seems like one answer to that is to add synchronisation
>>>>>>
>>>>>>>
>>>>>>> So I'm thinking that removing requests like this is not going to work
>>>>>>> out. Would it work better if handshake_req_hash_add() could recognize
>>>>>>> that a KeyUpdate is going on, and allow replacement of a hashed
>>>>>>> request? I haven't thought that through.
>>>>>>
>>>>>> I guess the idea would be to do something like this in
>>>>>> handshake_req_hash_add() if the entry already exists?
>>>>>>
>>>>>>     if (test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED, &req->hr_flags)) {
>>>>>>         /* Request already completed */
>>>>>>         rhashtable_replace_fast(...);
>>>>>>     }
>>>>>>
>>>>>> I'm not sure that's better. That could possibly still race with
>>>>>> something that hasn't yet set HANDSHAKE_F_REQ_COMPLETED and overwrite
>>>>>> the request unexpectedly.
>>>>>>
>>>>>> What about adding synchronisation and keeping the current approach?
>>>>>> From a quick look it should be enough to just edit
>>>>>> handshake_sk_destruct() and handshake_req_cancel()
>>>>>
>>>>> Or make the KeyUpdate requests somehow distinctive so they do not
>>>>> collide with initial handshake requests.
>>>
>>> Hmmm... Then each KeyUpdate needs to be distinctive, which will
>>> indefinitely grow the hash table
>>
>> Two random observations:
>>
>> 1. There is only zero or one KeyUpdate going on at a time. Certainly
>> the KeyUpdate-related data structures don't need to stay around.
> 
> That's the same as the original handshake req though, which you are
> saying can't be freed
> 
>>
>> 2. Maybe a separate data structure to track KeyUpdates is appropriate.
>>
>>
>>>> Another thought: expand the current _req structure to also manage
>>>> KeyUpdates. I think there can be only one upcall request pending
>>>> at a time, right?
>>>
>>> There should only be a single request pending per queue.
>>>
>>> I'm not sure I see what we could do to expand the _req structure.
>>>
>>> What about adding `HANDSHAKE_F_REQ_CANCEL` to `hr_flags_bits` and
>>> using that to ensure we don't free something that is currently being
>>> cancelled and the other way around?
>>
>> A CANCEL can happen at any time during the life of the session/socket,
>> including long after the handshake was done. It's part of socket
>> teardown. I don't think we can simply remove the req on handshake
>> completion.
> 
> Does that matter though? If a cancel is issued on a freed req we just
> do nothing as there is nothing to cancel.

I confess I've lost a bit of the plot.

I still don't yet understand why the req has to be removed from the
hash rather than re-using the socket's existing req for KeyUpdates.


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 2/6] net/handshake: Define handshake_sk_destruct_req
  2025-11-20 13:51                 ` Chuck Lever
@ 2025-11-25  5:00                   ` Alistair Francis
  2025-11-25 13:55                     ` Chuck Lever
  0 siblings, 1 reply; 33+ messages in thread
From: Alistair Francis @ 2025-11-25  5:00 UTC (permalink / raw)
  To: Chuck Lever
  Cc: hare, kernel-tls-handshake, netdev, linux-kernel, linux-doc,
	linux-nvme, linux-nfs, kbusch, axboe, hch, sagi, kch, hare,
	Alistair Francis

On Thu, Nov 20, 2025 at 11:51 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>
> On 11/18/25 7:45 PM, Alistair Francis wrote:
> > On Sat, Nov 15, 2025 at 12:14 AM Chuck Lever <chuck.lever@oracle.com> wrote:
> >>
> >> On 11/13/25 10:44 PM, Alistair Francis wrote:
> >>> On Fri, Nov 14, 2025 at 12:37 AM Chuck Lever <chuck.lever@oracle.com> wrote:
> >>>>
> >>>> On 11/13/25 9:01 AM, Chuck Lever wrote:
> >>>>> On 11/13/25 5:19 AM, Alistair Francis wrote:
> >>>>>> On Thu, Nov 13, 2025 at 1:47 AM Chuck Lever <chuck.lever@oracle.com> wrote:
> >>>>>>>
> >>>>>>> On 11/11/25 11:27 PM, alistair23@gmail.com wrote:
> >>>>>>>> From: Alistair Francis <alistair.francis@wdc.com>
> >>>>>>>>
> >>>>>>>> Define a `handshake_sk_destruct_req()` function to allow the destruction
> >>>>>>>> of the handshake req.
> >>>>>>>>
> >>>>>>>> This is required to avoid hash conflicts when handshake_req_hash_add()
> >>>>>>>> is called as part of submitting the KeyUpdate request.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
> >>>>>>>> Reviewed-by: Hannes Reinecke <hare@suse.de>
> >>>>>>>> ---
> >>>>>>>> v5:
> >>>>>>>>  - No change
> >>>>>>>> v4:
> >>>>>>>>  - No change
> >>>>>>>> v3:
> >>>>>>>>  - New patch
> >>>>>>>>
> >>>>>>>>  net/handshake/request.c | 16 ++++++++++++++++
> >>>>>>>>  1 file changed, 16 insertions(+)
> >>>>>>>>
> >>>>>>>> diff --git a/net/handshake/request.c b/net/handshake/request.c
> >>>>>>>> index 274d2c89b6b2..0d1c91c80478 100644
> >>>>>>>> --- a/net/handshake/request.c
> >>>>>>>> +++ b/net/handshake/request.c
> >>>>>>>> @@ -98,6 +98,22 @@ static void handshake_sk_destruct(struct sock *sk)
> >>>>>>>>               sk_destruct(sk);
> >>>>>>>>  }
> >>>>>>>>
> >>>>>>>> +/**
> >>>>>>>> + * handshake_sk_destruct_req - destroy an existing request
> >>>>>>>> + * @sk: socket on which there is an existing request
> >>>>>>>
> >>>>>>> Generally the kdoc style is unnecessary for static helper functions,
> >>>>>>> especially functions with only a single caller.
> >>>>>>>
> >>>>>>> This all looks so much like handshake_sk_destruct(). Consider
> >>>>>>> eliminating the code duplication by splitting that function into a
> >>>>>>> couple of helpers instead of adding this one.
> >>>>>>>
> >>>>>>>
> >>>>>>>> + */
> >>>>>>>> +static void handshake_sk_destruct_req(struct sock *sk)
> >>>>>>>
> >>>>>>> Because this function is static, I imagine that the compiler will
> >>>>>>> bark about the addition of an unused function. Perhaps it would
> >>>>>>> be better to combine 2/6 and 3/6.
> >>>>>>>
> >>>>>>> That would also make it easier for reviewers to check the resource
> >>>>>>> accounting issues mentioned below.
> >>>>>>>
> >>>>>>>
> >>>>>>>> +{
> >>>>>>>> +     struct handshake_req *req;
> >>>>>>>> +
> >>>>>>>> +     req = handshake_req_hash_lookup(sk);
> >>>>>>>> +     if (!req)
> >>>>>>>> +             return;
> >>>>>>>> +
> >>>>>>>> +     trace_handshake_destruct(sock_net(sk), req, sk);
> >>>>>>>
> >>>>>>> Wondering if this function needs to preserve the socket's destructor
> >>>>>>> callback chain like so:
> >>>>>>>
> >>>>>>> +       void (sk_destruct)(struct sock sk);
> >>>>>>>
> >>>>>>>   ...
> >>>>>>>
> >>>>>>> +       sk_destruct = req->hr_odestruct;
> >>>>>>> +       sk->sk_destruct = sk_destruct;
> >>>>>>>
> >>>>>>> then:
> >>>>>>>
> >>>>>>>> +     handshake_req_destroy(req);
> >>>>>>>
> >>>>>>> Because of the current code organization and patch ordering, it's
> >>>>>>> difficult to confirm that sock_put() isn't necessary here.
> >>>>>>>
> >>>>>>>
> >>>>>>>> +}
> >>>>>>>> +
> >>>>>>>>  /**
> >>>>>>>>   * handshake_req_alloc - Allocate a handshake request
> >>>>>>>>   * @proto: security protocol
> >>>>>>>
> >>>>>>> There's no synchronization preventing concurrent handshake_req_cancel()
> >>>>>>> calls from accessing the request after it's freed during handshake
> >>>>>>> completion. That is one reason why handshake_complete() leaves completed
> >>>>>>> requests in the hash.
> >>>>>>
> >>>>>> Ah, so you are worried that free-ing the request will race with
> >>>>>> accessing the request after a handshake_req_hash_lookup().
> >>>>>>
> >>>>>> Ok, makes sense. It seems like one answer to that is to add synchronisation
> >>>>>>
> >>>>>>>
> >>>>>>> So I'm thinking that removing requests like this is not going to work
> >>>>>>> out. Would it work better if handshake_req_hash_add() could recognize
> >>>>>>> that a KeyUpdate is going on, and allow replacement of a hashed
> >>>>>>> request? I haven't thought that through.
> >>>>>>
> >>>>>> I guess the idea would be to do something like this in
> >>>>>> handshake_req_hash_add() if the entry already exists?
> >>>>>>
> >>>>>>     if (test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED, &req->hr_flags)) {
> >>>>>>         /* Request already completed */
> >>>>>>         rhashtable_replace_fast(...);
> >>>>>>     }
> >>>>>>
> >>>>>> I'm not sure that's better. That could possibly still race with
> >>>>>> something that hasn't yet set HANDSHAKE_F_REQ_COMPLETED and overwrite
> >>>>>> the request unexpectedly.
> >>>>>>
> >>>>>> What about adding synchronisation and keeping the current approach?
> >>>>>> From a quick look it should be enough to just edit
> >>>>>> handshake_sk_destruct() and handshake_req_cancel()
> >>>>>
> >>>>> Or make the KeyUpdate requests somehow distinctive so they do not
> >>>>> collide with initial handshake requests.
> >>>
> >>> Hmmm... Then each KeyUpdate needs to be distinctive, which will
> >>> indefinitely grow the hash table
> >>
> >> Two random observations:
> >>
> >> 1. There is only zero or one KeyUpdate going on at a time. Certainly
> >> the KeyUpdate-related data structures don't need to stay around.
> >
> > That's the same as the original handshake req though, which you are
> > saying can't be freed
> >
> >>
> >> 2. Maybe a separate data structure to track KeyUpdates is appropriate.
> >>
> >>
> >>>> Another thought: expand the current _req structure to also manage
> >>>> KeyUpdates. I think there can be only one upcall request pending
> >>>> at a time, right?
> >>>
> >>> There should only be a single request pending per queue.
> >>>
> >>> I'm not sure I see what we could do to expand the _req structure.
> >>>
> >>> What about adding `HANDSHAKE_F_REQ_CANCEL` to `hr_flags_bits` and
> >>> using that to ensure we don't free something that is currently being
> >>> cancelled and the other way around?
> >>
> >> A CANCEL can happen at any time during the life of the session/socket,
> >> including long after the handshake was done. It's part of socket
> >> teardown. I don't think we can simply remove the req on handshake
> >> completion.
> >
> > Does that matter though? If a cancel is issued on a freed req we just
> > do nothing as there is nothing to cancel.
>
> I confess I've lost a bit of the plot.

Ha, we are in the weeds a bit.

>
> I still don't yet understand why the req has to be removed from the
> hash rather than re-using the socket's existing req for KeyUpdates.

Basically we want to call handshake_req_submit() to submit a KeyUpdate
request. That will fail if there is already a request in the hash
table, in this case the request has been completed but not destroyed.

This patch is deconstructing the request on completion so that when we
perform a KeyUpdate the request doesn't exist. Which to me seems like
the way to go as we are no longer using the request, so why keep it
around.

You said that might race with cancelling the request
(handshake_req_cancel()), which I'm trying to find a solution to. My
proposal is to add some atomic synchronisation to ensure we don't
cancel/free a request at the same time.

You are saying that we could instead add a new function similar to
handshake_req_submit() that reuses the existing request. I was
thinking that would also race with handshake_req_cancel(), but I guess
it won't as nothing is being freed.

So you would prefer changing handshake_req_submit() to just re-use an
existing completed request for KeyUpdate?

Alistair

>
>
> --
> Chuck Lever

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 2/6] net/handshake: Define handshake_sk_destruct_req
  2025-11-25  5:00                   ` Alistair Francis
@ 2025-11-25 13:55                     ` Chuck Lever
  0 siblings, 0 replies; 33+ messages in thread
From: Chuck Lever @ 2025-11-25 13:55 UTC (permalink / raw)
  To: Alistair Francis
  Cc: hare, kernel-tls-handshake, netdev, linux-kernel, linux-doc,
	linux-nvme, linux-nfs, kbusch, axboe, hch, sagi, kch, hare,
	Alistair Francis

On 11/25/25 12:00 AM, Alistair Francis wrote:
> On Thu, Nov 20, 2025 at 11:51 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>>
>> On 11/18/25 7:45 PM, Alistair Francis wrote:
>>> On Sat, Nov 15, 2025 at 12:14 AM Chuck Lever <chuck.lever@oracle.com> wrote:
>>>>
>>>> On 11/13/25 10:44 PM, Alistair Francis wrote:
>>>>> On Fri, Nov 14, 2025 at 12:37 AM Chuck Lever <chuck.lever@oracle.com> wrote:
>>>>>>
>>>>>> On 11/13/25 9:01 AM, Chuck Lever wrote:
>>>>>>> On 11/13/25 5:19 AM, Alistair Francis wrote:
>>>>>>>> On Thu, Nov 13, 2025 at 1:47 AM Chuck Lever <chuck.lever@oracle.com> wrote:
>>>>>>>>>
>>>>>>>>> On 11/11/25 11:27 PM, alistair23@gmail.com wrote:
>>>>>>>>>> From: Alistair Francis <alistair.francis@wdc.com>
>>>>>>>>>>
>>>>>>>>>> Define a `handshake_sk_destruct_req()` function to allow the destruction
>>>>>>>>>> of the handshake req.
>>>>>>>>>>
>>>>>>>>>> This is required to avoid hash conflicts when handshake_req_hash_add()
>>>>>>>>>> is called as part of submitting the KeyUpdate request.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
>>>>>>>>>> Reviewed-by: Hannes Reinecke <hare@suse.de>
>>>>>>>>>> ---
>>>>>>>>>> v5:
>>>>>>>>>>  - No change
>>>>>>>>>> v4:
>>>>>>>>>>  - No change
>>>>>>>>>> v3:
>>>>>>>>>>  - New patch
>>>>>>>>>>
>>>>>>>>>>  net/handshake/request.c | 16 ++++++++++++++++
>>>>>>>>>>  1 file changed, 16 insertions(+)
>>>>>>>>>>
>>>>>>>>>> diff --git a/net/handshake/request.c b/net/handshake/request.c
>>>>>>>>>> index 274d2c89b6b2..0d1c91c80478 100644
>>>>>>>>>> --- a/net/handshake/request.c
>>>>>>>>>> +++ b/net/handshake/request.c
>>>>>>>>>> @@ -98,6 +98,22 @@ static void handshake_sk_destruct(struct sock *sk)
>>>>>>>>>>               sk_destruct(sk);
>>>>>>>>>>  }
>>>>>>>>>>
>>>>>>>>>> +/**
>>>>>>>>>> + * handshake_sk_destruct_req - destroy an existing request
>>>>>>>>>> + * @sk: socket on which there is an existing request
>>>>>>>>>
>>>>>>>>> Generally the kdoc style is unnecessary for static helper functions,
>>>>>>>>> especially functions with only a single caller.
>>>>>>>>>
>>>>>>>>> This all looks so much like handshake_sk_destruct(). Consider
>>>>>>>>> eliminating the code duplication by splitting that function into a
>>>>>>>>> couple of helpers instead of adding this one.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> + */
>>>>>>>>>> +static void handshake_sk_destruct_req(struct sock *sk)
>>>>>>>>>
>>>>>>>>> Because this function is static, I imagine that the compiler will
>>>>>>>>> bark about the addition of an unused function. Perhaps it would
>>>>>>>>> be better to combine 2/6 and 3/6.
>>>>>>>>>
>>>>>>>>> That would also make it easier for reviewers to check the resource
>>>>>>>>> accounting issues mentioned below.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> +{
>>>>>>>>>> +     struct handshake_req *req;
>>>>>>>>>> +
>>>>>>>>>> +     req = handshake_req_hash_lookup(sk);
>>>>>>>>>> +     if (!req)
>>>>>>>>>> +             return;
>>>>>>>>>> +
>>>>>>>>>> +     trace_handshake_destruct(sock_net(sk), req, sk);
>>>>>>>>>
>>>>>>>>> Wondering if this function needs to preserve the socket's destructor
>>>>>>>>> callback chain like so:
>>>>>>>>>
>>>>>>>>> +       void (sk_destruct)(struct sock sk);
>>>>>>>>>
>>>>>>>>>   ...
>>>>>>>>>
>>>>>>>>> +       sk_destruct = req->hr_odestruct;
>>>>>>>>> +       sk->sk_destruct = sk_destruct;
>>>>>>>>>
>>>>>>>>> then:
>>>>>>>>>
>>>>>>>>>> +     handshake_req_destroy(req);
>>>>>>>>>
>>>>>>>>> Because of the current code organization and patch ordering, it's
>>>>>>>>> difficult to confirm that sock_put() isn't necessary here.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> +}
>>>>>>>>>> +
>>>>>>>>>>  /**
>>>>>>>>>>   * handshake_req_alloc - Allocate a handshake request
>>>>>>>>>>   * @proto: security protocol
>>>>>>>>>
>>>>>>>>> There's no synchronization preventing concurrent handshake_req_cancel()
>>>>>>>>> calls from accessing the request after it's freed during handshake
>>>>>>>>> completion. That is one reason why handshake_complete() leaves completed
>>>>>>>>> requests in the hash.
>>>>>>>>
>>>>>>>> Ah, so you are worried that free-ing the request will race with
>>>>>>>> accessing the request after a handshake_req_hash_lookup().
>>>>>>>>
>>>>>>>> Ok, makes sense. It seems like one answer to that is to add synchronisation
>>>>>>>>
>>>>>>>>>
>>>>>>>>> So I'm thinking that removing requests like this is not going to work
>>>>>>>>> out. Would it work better if handshake_req_hash_add() could recognize
>>>>>>>>> that a KeyUpdate is going on, and allow replacement of a hashed
>>>>>>>>> request? I haven't thought that through.
>>>>>>>>
>>>>>>>> I guess the idea would be to do something like this in
>>>>>>>> handshake_req_hash_add() if the entry already exists?
>>>>>>>>
>>>>>>>>     if (test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED, &req->hr_flags)) {
>>>>>>>>         /* Request already completed */
>>>>>>>>         rhashtable_replace_fast(...);
>>>>>>>>     }
>>>>>>>>
>>>>>>>> I'm not sure that's better. That could possibly still race with
>>>>>>>> something that hasn't yet set HANDSHAKE_F_REQ_COMPLETED and overwrite
>>>>>>>> the request unexpectedly.
>>>>>>>>
>>>>>>>> What about adding synchronisation and keeping the current approach?
>>>>>>>> From a quick look it should be enough to just edit
>>>>>>>> handshake_sk_destruct() and handshake_req_cancel()
>>>>>>>
>>>>>>> Or make the KeyUpdate requests somehow distinctive so they do not
>>>>>>> collide with initial handshake requests.
>>>>>
>>>>> Hmmm... Then each KeyUpdate needs to be distinctive, which will
>>>>> indefinitely grow the hash table
>>>>
>>>> Two random observations:
>>>>
>>>> 1. There is only zero or one KeyUpdate going on at a time. Certainly
>>>> the KeyUpdate-related data structures don't need to stay around.
>>>
>>> That's the same as the original handshake req though, which you are
>>> saying can't be freed
>>>
>>>>
>>>> 2. Maybe a separate data structure to track KeyUpdates is appropriate.
>>>>
>>>>
>>>>>> Another thought: expand the current _req structure to also manage
>>>>>> KeyUpdates. I think there can be only one upcall request pending
>>>>>> at a time, right?
>>>>>
>>>>> There should only be a single request pending per queue.
>>>>>
>>>>> I'm not sure I see what we could do to expand the _req structure.
>>>>>
>>>>> What about adding `HANDSHAKE_F_REQ_CANCEL` to `hr_flags_bits` and
>>>>> using that to ensure we don't free something that is currently being
>>>>> cancelled and the other way around?
>>>>
>>>> A CANCEL can happen at any time during the life of the session/socket,
>>>> including long after the handshake was done. It's part of socket
>>>> teardown. I don't think we can simply remove the req on handshake
>>>> completion.
>>>
>>> Does that matter though? If a cancel is issued on a freed req we just
>>> do nothing as there is nothing to cancel.
>>
>> I confess I've lost a bit of the plot.
> 
> Ha, we are in the weeds a bit.
> 
>>
>> I still don't yet understand why the req has to be removed from the
>> hash rather than re-using the socket's existing req for KeyUpdates.
> 
> Basically we want to call handshake_req_submit() to submit a KeyUpdate
> request. That will fail if there is already a request in the hash
> table, in this case the request has been completed but not destroyed.
> 
> This patch is deconstructing the request on completion so that when we
> perform a KeyUpdate the request doesn't exist. Which to me seems like
> the way to go as we are no longer using the request, so why keep it
> around.
> 
> You said that might race with cancelling the request
> (handshake_req_cancel()), which I'm trying to find a solution to. My
> proposal is to add some atomic synchronisation to ensure we don't
> cancel/free a request at the same time.
> 
> You are saying that we could instead add a new function similar to
> handshake_req_submit() that reuses the existing request. I was
> thinking that would also race with handshake_req_cancel(), but I guess
> it won't as nothing is being freed.
> 
> So you would prefer changing handshake_req_submit() to just re-use an
> existing completed request for KeyUpdate?

Thanks. That jogs the memory.

How about a new API, say, handshake_req_keyupdate() that /expects/ to
find an existing req with a completed handshake? I think that fits a
little better with the state machine.


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 4/6] net/handshake: Support KeyUpdate message types
  2025-11-12  4:27 ` [PATCH v5 4/6] net/handshake: Support KeyUpdate message types alistair23
  2025-11-12 15:49   ` Chuck Lever
  2025-11-13 14:41   ` Chuck Lever
@ 2025-11-27 13:12   ` Hannes Reinecke
  2 siblings, 0 replies; 33+ messages in thread
From: Hannes Reinecke @ 2025-11-27 13:12 UTC (permalink / raw)
  To: alistair23, chuck.lever, hare, kernel-tls-handshake, netdev,
	linux-kernel, linux-doc, linux-nvme, linux-nfs
  Cc: kbusch, axboe, hch, sagi, kch, Alistair Francis

On 11/12/25 05:27, alistair23@gmail.com wrote:
> From: Alistair Francis <alistair.francis@wdc.com>
> 
> When reporting the msg-type to userspace let's also support reporting
> KeyUpdate events. This supports reporting a client/server event and if
> the other side requested a KeyUpdateRequest.
> 
> Link: https://datatracker.ietf.org/doc/html/rfc8446#section-4.6.3
> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
> ---
> v5:
>   - Drop clientkeyupdaterequest and serverkeyupdaterequest
> v4:
>   - Don't overload existing functions, instead create new ones
> v3:
>   - Fixup yamllint and kernel-doc failures
> 
>   Documentation/netlink/specs/handshake.yaml | 16 ++++-
>   drivers/nvme/host/tcp.c                    | 15 +++-
>   drivers/nvme/target/tcp.c                  | 10 ++-
>   include/net/handshake.h                    |  6 ++
>   include/uapi/linux/handshake.h             | 11 +++
>   net/handshake/tlshd.c                      | 83 +++++++++++++++++++++-
>   6 files changed, 133 insertions(+), 8 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 5/6] nvme-tcp: Support KeyUpdate
  2025-11-12  4:27 ` [PATCH v5 5/6] nvme-tcp: Support KeyUpdate alistair23
  2025-11-12  6:59   ` Christoph Hellwig
@ 2025-11-27 13:31   ` Hannes Reinecke
  2025-12-01  4:18     ` Alistair Francis
  2025-11-30 22:31   ` Sagi Grimberg
  2 siblings, 1 reply; 33+ messages in thread
From: Hannes Reinecke @ 2025-11-27 13:31 UTC (permalink / raw)
  To: alistair23, chuck.lever, hare, kernel-tls-handshake, netdev,
	linux-kernel, linux-doc, linux-nvme, linux-nfs
  Cc: kbusch, axboe, hch, sagi, kch, Alistair Francis

On 11/12/25 05:27, alistair23@gmail.com wrote:
> From: Alistair Francis <alistair.francis@wdc.com>
> 
> If the nvme_tcp_try_send() or nvme_tcp_try_recv() functions return
> EKEYEXPIRED then the underlying TLS keys need to be updated. This occurs
> on an KeyUpdate event as described in RFC8446
> https://datatracker.ietf.org/doc/html/rfc8446#section-4.6.3.
> 
> If the NVMe Target (TLS server) initiates a KeyUpdate this patch will
> allow the NVMe layer to process the KeyUpdate request and forward the
> request to userspace. Userspace must then update the key to keep the
> connection alive.
> 
> This patch allows us to handle the NVMe target sending a KeyUpdate
> request without aborting the connection. At this time we don't support
> initiating a KeyUpdate.
> 
> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
> ---
> v5:
>   - Cleanup code flow
>   - Check for MSG_CTRUNC in the msg_flags return from recvmsg
>     and use that to determine if it's a control message
> v4:
>   - Remove all support for initiating KeyUpdate
>   - Don't call cancel_work() when updating keys
> v3:
>   - Don't cancel existing handshake requests
> v2:
>   - Don't change the state
>   - Use a helper function for KeyUpdates
>   - Continue sending in nvme_tcp_send_all() after a KeyUpdate
>   - Remove command message using recvmsg
> 
>   drivers/nvme/host/tcp.c | 85 +++++++++++++++++++++++++++++++++--------
>   1 file changed, 70 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index 4797a4532b0d..5cec5a974bbf 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -172,6 +172,7 @@ struct nvme_tcp_queue {
>   	bool			tls_enabled;
>   	u32			rcv_crc;
>   	u32			snd_crc;
> +	key_serial_t		handshake_session_id;
>   	__le32			exp_ddgst;
>   	__le32			recv_ddgst;
>   	struct completion       tls_complete;
> @@ -858,7 +859,10 @@ static void nvme_tcp_handle_c2h_term(struct nvme_tcp_queue *queue,
>   static int nvme_tcp_recvmsg_pdu(struct nvme_tcp_queue *queue)
>   {
>   	char *pdu = queue->pdu;
> +	char cbuf[CMSG_LEN(sizeof(char))] = {};
>   	struct msghdr msg = {
> +		.msg_control = cbuf,
> +		.msg_controllen = sizeof(cbuf),
>   		.msg_flags = MSG_DONTWAIT,
>   	};
>   	struct kvec iov = {
> @@ -873,12 +877,17 @@ static int nvme_tcp_recvmsg_pdu(struct nvme_tcp_queue *queue)
>   	if (ret <= 0)
>   		return ret;
>   
> +	hdr = queue->pdu;
> +	if (hdr->type == TLS_HANDSHAKE_KEYUPDATE) {
> +		dev_err(queue->ctrl->ctrl.device, "KeyUpdate message\n");
> +		return 1;
> +	}
> +

Errm. 'hdr' is of type 'struct nvme_tcp_hdr', and that most certainly
does not define TLS_HANDSHAKE_KEYUPDATE. I think you should evaluate the
cmsg type here.

>   	queue->pdu_remaining -= ret;
>   	queue->pdu_offset += ret;
>   	if (queue->pdu_remaining)
>   		return 0;
>   
> -	hdr = queue->pdu;
>   	if (unlikely(hdr->hlen != sizeof(struct nvme_tcp_rsp_pdu))) {
>   		if (!nvme_tcp_recv_pdu_supported(hdr->type))
>   			goto unsupported_pdu;
> @@ -944,6 +953,7 @@ static int nvme_tcp_recvmsg_data(struct nvme_tcp_queue *queue)
>   	struct request *rq =
>   		nvme_cid_to_rq(nvme_tcp_tagset(queue), pdu->command_id);
>   	struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq);
> +	char cbuf[CMSG_LEN(sizeof(char))] = {};
>   
>   	if (nvme_tcp_recv_state(queue) != NVME_TCP_RECV_DATA)
>   		return 0;
> @@ -976,10 +986,26 @@ static int nvme_tcp_recvmsg_data(struct nvme_tcp_queue *queue)
>   
>   		ret = sock_recvmsg(queue->sock, &msg, msg.msg_flags);
>   		if (ret < 0) {
> -			dev_err(queue->ctrl->ctrl.device,
> -				"queue %d failed to receive request %#x data",
> -				nvme_tcp_queue_id(queue), rq->tag);
> -			return ret;
> +			/* If MSG_CTRUNC is set, it's a control message,
> +			 * so let's read the control message.
> +			 */
> +			if (msg.msg_flags & MSG_CTRUNC) {
> +				memset(&msg, 0, sizeof(msg));
> +				msg.msg_flags = MSG_DONTWAIT;
> +				msg.msg_control = cbuf;
> +				msg.msg_controllen = sizeof(cbuf);
> +
This is not correct; reading the control message implies a kernel
memory allocation as message buffer, not an interator (as it's the
case here).
  > +				ret = sock_recvmsg(queue->sock, &msg, msg.msg_flags);
> +			}
> +
> +			if (ret < 0) {
> +				dev_dbg(queue->ctrl->ctrl.device,
> +					"queue %d failed to receive request %#x data, %d",
> +					nvme_tcp_queue_id(queue), rq->tag, ret);
> +				return ret;
> +			}
> +
> +			return 0;
>   		}
>   		if (queue->data_digest)
>   			nvme_tcp_ddgst_calc(req, &queue->rcv_crc, ret);
> @@ -1384,15 +1410,39 @@ static int nvme_tcp_try_recvmsg(struct nvme_tcp_queue *queue)
>   		}
>   	} while (result >= 0);
>   
> -	if (result < 0 && result != -EAGAIN) {
> -		dev_err(queue->ctrl->ctrl.device,
> -			"receive failed:  %d\n", result);
> -		queue->rd_enabled = false;
> -		nvme_tcp_error_recovery(&queue->ctrl->ctrl);
> -	} else if (result == -EAGAIN)
> -		result = 0;
> +	if (result < 0) {
> +		if (result != -EKEYEXPIRED && result != -EAGAIN) {
> +			dev_err(queue->ctrl->ctrl.device,
> +				"receive failed:  %d\n", result);
> +			queue->rd_enabled = false;
> +			nvme_tcp_error_recovery(&queue->ctrl->ctrl);
> +		}
> +		return result;
> +	}
> +
> +	queue->nr_cqe = nr_cqe;
> +	return nr_cqe;
> +}
> +
> +static void update_tls_keys(struct nvme_tcp_queue *queue)
> +{
> +	int qid = nvme_tcp_queue_id(queue);
> +	int ret;
> +
> +	dev_dbg(queue->ctrl->ctrl.device,
> +		"updating key for queue %d\n", qid);
>   
> -	return result < 0 ? result : (queue->nr_cqe = nr_cqe);
> +	flush_work(&(queue->ctrl->ctrl).async_event_work);
> +
> +	ret = nvme_tcp_start_tls(&(queue->ctrl->ctrl),
> +				 queue, queue->ctrl->ctrl.tls_pskid,
> +				 HANDSHAKE_KEY_UPDATE_TYPE_RECEIVED);
> +
> +	if (ret < 0) {
> +		dev_err(queue->ctrl->ctrl.device,
> +			"failed to update the keys %d\n", ret);
> +		nvme_tcp_fail_request(queue->request);
> +	}
>   }
>   
>   static void nvme_tcp_io_work(struct work_struct *w)
> @@ -1417,8 +1467,11 @@ static void nvme_tcp_io_work(struct work_struct *w)
>   		result = nvme_tcp_try_recvmsg(queue);
>   		if (result > 0)
>   			pending = true;
> -		else if (unlikely(result < 0))
> -			return;
> +		else if (unlikely(result < 0)) {
> +			if (result == -EKEYEXPIRED)
> +				update_tls_keys(queue);
> +			break;
> +		}
>   
>   		/* did we get some space after spending time in recv? */
>   		if (nvme_tcp_queue_has_pending(queue) &&
> @@ -1726,6 +1779,7 @@ static void nvme_tcp_tls_done(void *data, int status, key_serial_t pskid,
>   			ctrl->ctrl.tls_pskid = key_serial(tls_key);
>   		key_put(tls_key);
>   		queue->tls_err = 0;
> +		queue->handshake_session_id = handshake_session_id;
>   	}
>   
>   out_complete:
> @@ -1755,6 +1809,7 @@ static int nvme_tcp_start_tls(struct nvme_ctrl *nctrl,
>   		keyring = key_serial(nctrl->opts->keyring);
>   	args.ta_keyring = keyring;
>   	args.ta_timeout_ms = tls_handshake_timeout * 1000;
> +	args.handshake_session_id = queue->handshake_session_id;
>   	queue->tls_err = -EOPNOTSUPP;
>   	init_completion(&queue->tls_complete);
>   	if (keyupdate == HANDSHAKE_KEY_UPDATE_TYPE_UNSPEC)
Cheers,Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 1/6] net/handshake: Store the key serial number on completion
  2025-11-12  4:27 ` [PATCH v5 1/6] net/handshake: Store the key serial number on completion alistair23
  2025-11-12 15:02   ` Chuck Lever
@ 2025-11-30 22:21   ` Sagi Grimberg
  1 sibling, 0 replies; 33+ messages in thread
From: Sagi Grimberg @ 2025-11-30 22:21 UTC (permalink / raw)
  To: alistair23, chuck.lever, hare, kernel-tls-handshake, netdev,
	linux-kernel, linux-doc, linux-nvme, linux-nfs
  Cc: kbusch, axboe, hch, kch, hare, Alistair Francis

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 5/6] nvme-tcp: Support KeyUpdate
  2025-11-12  4:27 ` [PATCH v5 5/6] nvme-tcp: Support KeyUpdate alistair23
  2025-11-12  6:59   ` Christoph Hellwig
  2025-11-27 13:31   ` Hannes Reinecke
@ 2025-11-30 22:31   ` Sagi Grimberg
  2025-12-01 23:27     ` Alistair Francis
  2 siblings, 1 reply; 33+ messages in thread
From: Sagi Grimberg @ 2025-11-30 22:31 UTC (permalink / raw)
  To: alistair23, chuck.lever, hare, kernel-tls-handshake, netdev,
	linux-kernel, linux-doc, linux-nvme, linux-nfs
  Cc: kbusch, axboe, hch, kch, hare, Alistair Francis



On 12/11/2025 6:27, alistair23@gmail.com wrote:
> From: Alistair Francis <alistair.francis@wdc.com>
>
> If the nvme_tcp_try_send() or nvme_tcp_try_recv() functions return
> EKEYEXPIRED then the underlying TLS keys need to be updated. This occurs
> on an KeyUpdate event as described in RFC8446
> https://datatracker.ietf.org/doc/html/rfc8446#section-4.6.3.
>
> If the NVMe Target (TLS server) initiates a KeyUpdate this patch will
> allow the NVMe layer to process the KeyUpdate request and forward the
> request to userspace. Userspace must then update the key to keep the
> connection alive.
>
> This patch allows us to handle the NVMe target sending a KeyUpdate
> request without aborting the connection. At this time we don't support
> initiating a KeyUpdate.
>
> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
> ---

I see this is on top of Hannes recvmsg patch. Worth noting in the patch 
here at least.

> v5:
>   - Cleanup code flow
>   - Check for MSG_CTRUNC in the msg_flags return from recvmsg
>     and use that to determine if it's a control message
> v4:
>   - Remove all support for initiating KeyUpdate
>   - Don't call cancel_work() when updating keys
> v3:
>   - Don't cancel existing handshake requests
> v2:
>   - Don't change the state
>   - Use a helper function for KeyUpdates
>   - Continue sending in nvme_tcp_send_all() after a KeyUpdate
>   - Remove command message using recvmsg
>
>   drivers/nvme/host/tcp.c | 85 +++++++++++++++++++++++++++++++++--------
>   1 file changed, 70 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index 4797a4532b0d..5cec5a974bbf 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -172,6 +172,7 @@ struct nvme_tcp_queue {
>   	bool			tls_enabled;
>   	u32			rcv_crc;
>   	u32			snd_crc;
> +	key_serial_t		handshake_session_id;
>   	__le32			exp_ddgst;
>   	__le32			recv_ddgst;
>   	struct completion       tls_complete;
> @@ -858,7 +859,10 @@ static void nvme_tcp_handle_c2h_term(struct nvme_tcp_queue *queue,
>   static int nvme_tcp_recvmsg_pdu(struct nvme_tcp_queue *queue)
>   {
>   	char *pdu = queue->pdu;
> +	char cbuf[CMSG_LEN(sizeof(char))] = {};
>   	struct msghdr msg = {
> +		.msg_control = cbuf,
> +		.msg_controllen = sizeof(cbuf),
>   		.msg_flags = MSG_DONTWAIT,
>   	};
>   	struct kvec iov = {
> @@ -873,12 +877,17 @@ static int nvme_tcp_recvmsg_pdu(struct nvme_tcp_queue *queue)
>   	if (ret <= 0)
>   		return ret;
>   
> +	hdr = queue->pdu;
> +	if (hdr->type == TLS_HANDSHAKE_KEYUPDATE) {
> +		dev_err(queue->ctrl->ctrl.device, "KeyUpdate message\n");
> +		return 1;
> +	}
> +
>   	queue->pdu_remaining -= ret;
>   	queue->pdu_offset += ret;
>   	if (queue->pdu_remaining)
>   		return 0;
>   
> -	hdr = queue->pdu;
>   	if (unlikely(hdr->hlen != sizeof(struct nvme_tcp_rsp_pdu))) {
>   		if (!nvme_tcp_recv_pdu_supported(hdr->type))
>   			goto unsupported_pdu;
> @@ -944,6 +953,7 @@ static int nvme_tcp_recvmsg_data(struct nvme_tcp_queue *queue)
>   	struct request *rq =
>   		nvme_cid_to_rq(nvme_tcp_tagset(queue), pdu->command_id);
>   	struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq);
> +	char cbuf[CMSG_LEN(sizeof(char))] = {};
>   
>   	if (nvme_tcp_recv_state(queue) != NVME_TCP_RECV_DATA)
>   		return 0;
> @@ -976,10 +986,26 @@ static int nvme_tcp_recvmsg_data(struct nvme_tcp_queue *queue)
>   
>   		ret = sock_recvmsg(queue->sock, &msg, msg.msg_flags);
>   		if (ret < 0) {
> -			dev_err(queue->ctrl->ctrl.device,
> -				"queue %d failed to receive request %#x data",
> -				nvme_tcp_queue_id(queue), rq->tag);
> -			return ret;
> +			/* If MSG_CTRUNC is set, it's a control message,
> +			 * so let's read the control message.
> +			 */
> +			if (msg.msg_flags & MSG_CTRUNC) {
> +				memset(&msg, 0, sizeof(msg));
> +				msg.msg_flags = MSG_DONTWAIT;
> +				msg.msg_control = cbuf;
> +				msg.msg_controllen = sizeof(cbuf);
> +
> +				ret = sock_recvmsg(queue->sock, &msg, msg.msg_flags);
> +			}
> +
> +			if (ret < 0) {
> +				dev_dbg(queue->ctrl->ctrl.device,
> +					"queue %d failed to receive request %#x data, %d",
> +					nvme_tcp_queue_id(queue), rq->tag, ret);
> +				return ret;
> +			}
> +
> +			return 0;
>   		}
>   		if (queue->data_digest)
>   			nvme_tcp_ddgst_calc(req, &queue->rcv_crc, ret);
> @@ -1384,15 +1410,39 @@ static int nvme_tcp_try_recvmsg(struct nvme_tcp_queue *queue)
>   		}
>   	} while (result >= 0);
>   
> -	if (result < 0 && result != -EAGAIN) {
> -		dev_err(queue->ctrl->ctrl.device,
> -			"receive failed:  %d\n", result);
> -		queue->rd_enabled = false;
> -		nvme_tcp_error_recovery(&queue->ctrl->ctrl);
> -	} else if (result == -EAGAIN)
> -		result = 0;
> +	if (result < 0) {
> +		if (result != -EKEYEXPIRED && result != -EAGAIN) {
> +			dev_err(queue->ctrl->ctrl.device,
> +				"receive failed:  %d\n", result);
> +			queue->rd_enabled = false;
> +			nvme_tcp_error_recovery(&queue->ctrl->ctrl);
> +		}
> +		return result;
> +	}
> +
> +	queue->nr_cqe = nr_cqe;
> +	return nr_cqe;
> +}
> +
> +static void update_tls_keys(struct nvme_tcp_queue *queue)
> +{
> +	int qid = nvme_tcp_queue_id(queue);
> +	int ret;
> +
> +	dev_dbg(queue->ctrl->ctrl.device,
> +		"updating key for queue %d\n", qid);
>   
> -	return result < 0 ? result : (queue->nr_cqe = nr_cqe);
> +	flush_work(&(queue->ctrl->ctrl).async_event_work);
> +
> +	ret = nvme_tcp_start_tls(&(queue->ctrl->ctrl),
> +				 queue, queue->ctrl->ctrl.tls_pskid,
> +				 HANDSHAKE_KEY_UPDATE_TYPE_RECEIVED);

No need to quiesce the queue or anything like that? everything continues 
as usual?
Why are you flushing async_event_work?

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 5/6] nvme-tcp: Support KeyUpdate
  2025-11-27 13:31   ` Hannes Reinecke
@ 2025-12-01  4:18     ` Alistair Francis
  2025-12-01 15:03       ` Hannes Reinecke
  0 siblings, 1 reply; 33+ messages in thread
From: Alistair Francis @ 2025-12-01  4:18 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: chuck.lever, hare, kernel-tls-handshake, netdev, linux-kernel,
	linux-doc, linux-nvme, linux-nfs, kbusch, axboe, hch, sagi, kch,
	Alistair Francis

On Thu, Nov 27, 2025 at 11:31 PM Hannes Reinecke <hare@suse.de> wrote:
>
> On 11/12/25 05:27, alistair23@gmail.com wrote:
> > From: Alistair Francis <alistair.francis@wdc.com>
> >
> > If the nvme_tcp_try_send() or nvme_tcp_try_recv() functions return
> > EKEYEXPIRED then the underlying TLS keys need to be updated. This occurs
> > on an KeyUpdate event as described in RFC8446
> > https://datatracker.ietf.org/doc/html/rfc8446#section-4.6.3.
> >
> > If the NVMe Target (TLS server) initiates a KeyUpdate this patch will
> > allow the NVMe layer to process the KeyUpdate request and forward the
> > request to userspace. Userspace must then update the key to keep the
> > connection alive.
> >
> > This patch allows us to handle the NVMe target sending a KeyUpdate
> > request without aborting the connection. At this time we don't support
> > initiating a KeyUpdate.
> >
> > Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
> > ---
> > v5:
> >   - Cleanup code flow
> >   - Check for MSG_CTRUNC in the msg_flags return from recvmsg
> >     and use that to determine if it's a control message
> > v4:
> >   - Remove all support for initiating KeyUpdate
> >   - Don't call cancel_work() when updating keys
> > v3:
> >   - Don't cancel existing handshake requests
> > v2:
> >   - Don't change the state
> >   - Use a helper function for KeyUpdates
> >   - Continue sending in nvme_tcp_send_all() after a KeyUpdate
> >   - Remove command message using recvmsg
> >
> >   drivers/nvme/host/tcp.c | 85 +++++++++++++++++++++++++++++++++--------
> >   1 file changed, 70 insertions(+), 15 deletions(-)
> >
> > diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> > index 4797a4532b0d..5cec5a974bbf 100644
> > --- a/drivers/nvme/host/tcp.c
> > +++ b/drivers/nvme/host/tcp.c
> > @@ -172,6 +172,7 @@ struct nvme_tcp_queue {
> >       bool                    tls_enabled;
> >       u32                     rcv_crc;
> >       u32                     snd_crc;
> > +     key_serial_t            handshake_session_id;
> >       __le32                  exp_ddgst;
> >       __le32                  recv_ddgst;
> >       struct completion       tls_complete;
> > @@ -858,7 +859,10 @@ static void nvme_tcp_handle_c2h_term(struct nvme_tcp_queue *queue,
> >   static int nvme_tcp_recvmsg_pdu(struct nvme_tcp_queue *queue)
> >   {
> >       char *pdu = queue->pdu;
> > +     char cbuf[CMSG_LEN(sizeof(char))] = {};
> >       struct msghdr msg = {
> > +             .msg_control = cbuf,
> > +             .msg_controllen = sizeof(cbuf),
> >               .msg_flags = MSG_DONTWAIT,
> >       };
> >       struct kvec iov = {
> > @@ -873,12 +877,17 @@ static int nvme_tcp_recvmsg_pdu(struct nvme_tcp_queue *queue)
> >       if (ret <= 0)
> >               return ret;
> >
> > +     hdr = queue->pdu;
> > +     if (hdr->type == TLS_HANDSHAKE_KEYUPDATE) {
> > +             dev_err(queue->ctrl->ctrl.device, "KeyUpdate message\n");
> > +             return 1;
> > +     }
> > +
>
> Errm. 'hdr' is of type 'struct nvme_tcp_hdr', and that most certainly
> does not define TLS_HANDSHAKE_KEYUPDATE. I think you should evaluate the
> cmsg type here.
>
> >       queue->pdu_remaining -= ret;
> >       queue->pdu_offset += ret;
> >       if (queue->pdu_remaining)
> >               return 0;
> >
> > -     hdr = queue->pdu;
> >       if (unlikely(hdr->hlen != sizeof(struct nvme_tcp_rsp_pdu))) {
> >               if (!nvme_tcp_recv_pdu_supported(hdr->type))
> >                       goto unsupported_pdu;
> > @@ -944,6 +953,7 @@ static int nvme_tcp_recvmsg_data(struct nvme_tcp_queue *queue)
> >       struct request *rq =
> >               nvme_cid_to_rq(nvme_tcp_tagset(queue), pdu->command_id);
> >       struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq);
> > +     char cbuf[CMSG_LEN(sizeof(char))] = {};
> >
> >       if (nvme_tcp_recv_state(queue) != NVME_TCP_RECV_DATA)
> >               return 0;
> > @@ -976,10 +986,26 @@ static int nvme_tcp_recvmsg_data(struct nvme_tcp_queue *queue)
> >
> >               ret = sock_recvmsg(queue->sock, &msg, msg.msg_flags);
> >               if (ret < 0) {
> > -                     dev_err(queue->ctrl->ctrl.device,
> > -                             "queue %d failed to receive request %#x data",
> > -                             nvme_tcp_queue_id(queue), rq->tag);
> > -                     return ret;
> > +                     /* If MSG_CTRUNC is set, it's a control message,
> > +                      * so let's read the control message.
> > +                      */
> > +                     if (msg.msg_flags & MSG_CTRUNC) {
> > +                             memset(&msg, 0, sizeof(msg));
> > +                             msg.msg_flags = MSG_DONTWAIT;
> > +                             msg.msg_control = cbuf;
> > +                             msg.msg_controllen = sizeof(cbuf);
> > +
> This is not correct; reading the control message implies a kernel
> memory allocation as message buffer, not an interator (as it's the
> case here).

I don't follow what you mean

Alistair

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 5/6] nvme-tcp: Support KeyUpdate
  2025-12-01  4:18     ` Alistair Francis
@ 2025-12-01 15:03       ` Hannes Reinecke
  0 siblings, 0 replies; 33+ messages in thread
From: Hannes Reinecke @ 2025-12-01 15:03 UTC (permalink / raw)
  To: Alistair Francis
  Cc: chuck.lever, hare, kernel-tls-handshake, netdev, linux-kernel,
	linux-doc, linux-nvme, linux-nfs, kbusch, axboe, hch, sagi, kch,
	Alistair Francis

On 12/1/25 05:18, Alistair Francis wrote:
> On Thu, Nov 27, 2025 at 11:31 PM Hannes Reinecke <hare@suse.de> wrote:
>>
>> On 11/12/25 05:27, alistair23@gmail.com wrote:
>>> From: Alistair Francis <alistair.francis@wdc.com>
[ .. ]>>> @@ -976,10 +986,26 @@ static int nvme_tcp_recvmsg_data(struct 
nvme_tcp_queue *queue)
>>>
>>>                ret = sock_recvmsg(queue->sock, &msg, msg.msg_flags);
>>>                if (ret < 0) {
>>> -                     dev_err(queue->ctrl->ctrl.device,
>>> -                             "queue %d failed to receive request %#x data",
>>> -                             nvme_tcp_queue_id(queue), rq->tag);
>>> -                     return ret;
>>> +                     /* If MSG_CTRUNC is set, it's a control message,
>>> +                      * so let's read the control message.
>>> +                      */
>>> +                     if (msg.msg_flags & MSG_CTRUNC) {
>>> +                             memset(&msg, 0, sizeof(msg));
>>> +                             msg.msg_flags = MSG_DONTWAIT;
>>> +                             msg.msg_control = cbuf;
>>> +                             msg.msg_controllen = sizeof(cbuf);
>>> +
>> This is not correct; reading the control message implies a kernel
>> memory allocation as message buffer, not an interator (as it's the
>> case here).
> 
> I don't follow what you mean
> 
Ah, right. My comment refers to users of tls_alert_recv(), which we
don't do here.
Sorry for the noise.

You can add my:

Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 5/6] nvme-tcp: Support KeyUpdate
  2025-11-30 22:31   ` Sagi Grimberg
@ 2025-12-01 23:27     ` Alistair Francis
  0 siblings, 0 replies; 33+ messages in thread
From: Alistair Francis @ 2025-12-01 23:27 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: chuck.lever, hare, kernel-tls-handshake, netdev, linux-kernel,
	linux-doc, linux-nvme, linux-nfs, kbusch, axboe, hch, kch, hare,
	Alistair Francis

On Mon, Dec 1, 2025 at 8:31 AM Sagi Grimberg <sagi@grimberg.me> wrote:
>
>
>
> On 12/11/2025 6:27, alistair23@gmail.com wrote:
> > From: Alistair Francis <alistair.francis@wdc.com>
> >
> > If the nvme_tcp_try_send() or nvme_tcp_try_recv() functions return
> > EKEYEXPIRED then the underlying TLS keys need to be updated. This occurs
> > on an KeyUpdate event as described in RFC8446
> > https://datatracker.ietf.org/doc/html/rfc8446#section-4.6.3.
> >
> > If the NVMe Target (TLS server) initiates a KeyUpdate this patch will
> > allow the NVMe layer to process the KeyUpdate request and forward the
> > request to userspace. Userspace must then update the key to keep the
> > connection alive.
> >
> > This patch allows us to handle the NVMe target sending a KeyUpdate
> > request without aborting the connection. At this time we don't support
> > initiating a KeyUpdate.
> >
> > Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
> > ---
>
> I see this is on top of Hannes recvmsg patch. Worth noting in the patch
> here at least.
>
> > v5:
> >   - Cleanup code flow
> >   - Check for MSG_CTRUNC in the msg_flags return from recvmsg
> >     and use that to determine if it's a control message
> > v4:
> >   - Remove all support for initiating KeyUpdate
> >   - Don't call cancel_work() when updating keys
> > v3:
> >   - Don't cancel existing handshake requests
> > v2:
> >   - Don't change the state
> >   - Use a helper function for KeyUpdates
> >   - Continue sending in nvme_tcp_send_all() after a KeyUpdate
> >   - Remove command message using recvmsg
> >
> >   drivers/nvme/host/tcp.c | 85 +++++++++++++++++++++++++++++++++--------
> >   1 file changed, 70 insertions(+), 15 deletions(-)
> >
> > diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> > index 4797a4532b0d..5cec5a974bbf 100644
> > --- a/drivers/nvme/host/tcp.c
> > +++ b/drivers/nvme/host/tcp.c
> > @@ -172,6 +172,7 @@ struct nvme_tcp_queue {
> >       bool                    tls_enabled;
> >       u32                     rcv_crc;
> >       u32                     snd_crc;
> > +     key_serial_t            handshake_session_id;
> >       __le32                  exp_ddgst;
> >       __le32                  recv_ddgst;
> >       struct completion       tls_complete;
> > @@ -858,7 +859,10 @@ static void nvme_tcp_handle_c2h_term(struct nvme_tcp_queue *queue,
> >   static int nvme_tcp_recvmsg_pdu(struct nvme_tcp_queue *queue)
> >   {
> >       char *pdu = queue->pdu;
> > +     char cbuf[CMSG_LEN(sizeof(char))] = {};
> >       struct msghdr msg = {
> > +             .msg_control = cbuf,
> > +             .msg_controllen = sizeof(cbuf),
> >               .msg_flags = MSG_DONTWAIT,
> >       };
> >       struct kvec iov = {
> > @@ -873,12 +877,17 @@ static int nvme_tcp_recvmsg_pdu(struct nvme_tcp_queue *queue)
> >       if (ret <= 0)
> >               return ret;
> >
> > +     hdr = queue->pdu;
> > +     if (hdr->type == TLS_HANDSHAKE_KEYUPDATE) {
> > +             dev_err(queue->ctrl->ctrl.device, "KeyUpdate message\n");
> > +             return 1;
> > +     }
> > +
> >       queue->pdu_remaining -= ret;
> >       queue->pdu_offset += ret;
> >       if (queue->pdu_remaining)
> >               return 0;
> >
> > -     hdr = queue->pdu;
> >       if (unlikely(hdr->hlen != sizeof(struct nvme_tcp_rsp_pdu))) {
> >               if (!nvme_tcp_recv_pdu_supported(hdr->type))
> >                       goto unsupported_pdu;
> > @@ -944,6 +953,7 @@ static int nvme_tcp_recvmsg_data(struct nvme_tcp_queue *queue)
> >       struct request *rq =
> >               nvme_cid_to_rq(nvme_tcp_tagset(queue), pdu->command_id);
> >       struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq);
> > +     char cbuf[CMSG_LEN(sizeof(char))] = {};
> >
> >       if (nvme_tcp_recv_state(queue) != NVME_TCP_RECV_DATA)
> >               return 0;
> > @@ -976,10 +986,26 @@ static int nvme_tcp_recvmsg_data(struct nvme_tcp_queue *queue)
> >
> >               ret = sock_recvmsg(queue->sock, &msg, msg.msg_flags);
> >               if (ret < 0) {
> > -                     dev_err(queue->ctrl->ctrl.device,
> > -                             "queue %d failed to receive request %#x data",
> > -                             nvme_tcp_queue_id(queue), rq->tag);
> > -                     return ret;
> > +                     /* If MSG_CTRUNC is set, it's a control message,
> > +                      * so let's read the control message.
> > +                      */
> > +                     if (msg.msg_flags & MSG_CTRUNC) {
> > +                             memset(&msg, 0, sizeof(msg));
> > +                             msg.msg_flags = MSG_DONTWAIT;
> > +                             msg.msg_control = cbuf;
> > +                             msg.msg_controllen = sizeof(cbuf);
> > +
> > +                             ret = sock_recvmsg(queue->sock, &msg, msg.msg_flags);
> > +                     }
> > +
> > +                     if (ret < 0) {
> > +                             dev_dbg(queue->ctrl->ctrl.device,
> > +                                     "queue %d failed to receive request %#x data, %d",
> > +                                     nvme_tcp_queue_id(queue), rq->tag, ret);
> > +                             return ret;
> > +                     }
> > +
> > +                     return 0;
> >               }
> >               if (queue->data_digest)
> >                       nvme_tcp_ddgst_calc(req, &queue->rcv_crc, ret);
> > @@ -1384,15 +1410,39 @@ static int nvme_tcp_try_recvmsg(struct nvme_tcp_queue *queue)
> >               }
> >       } while (result >= 0);
> >
> > -     if (result < 0 && result != -EAGAIN) {
> > -             dev_err(queue->ctrl->ctrl.device,
> > -                     "receive failed:  %d\n", result);
> > -             queue->rd_enabled = false;
> > -             nvme_tcp_error_recovery(&queue->ctrl->ctrl);
> > -     } else if (result == -EAGAIN)
> > -             result = 0;
> > +     if (result < 0) {
> > +             if (result != -EKEYEXPIRED && result != -EAGAIN) {
> > +                     dev_err(queue->ctrl->ctrl.device,
> > +                             "receive failed:  %d\n", result);
> > +                     queue->rd_enabled = false;
> > +                     nvme_tcp_error_recovery(&queue->ctrl->ctrl);
> > +             }
> > +             return result;
> > +     }
> > +
> > +     queue->nr_cqe = nr_cqe;
> > +     return nr_cqe;
> > +}
> > +
> > +static void update_tls_keys(struct nvme_tcp_queue *queue)
> > +{
> > +     int qid = nvme_tcp_queue_id(queue);
> > +     int ret;
> > +
> > +     dev_dbg(queue->ctrl->ctrl.device,
> > +             "updating key for queue %d\n", qid);
> >
> > -     return result < 0 ? result : (queue->nr_cqe = nr_cqe);
> > +     flush_work(&(queue->ctrl->ctrl).async_event_work);
> > +
> > +     ret = nvme_tcp_start_tls(&(queue->ctrl->ctrl),
> > +                              queue, queue->ctrl->ctrl.tls_pskid,
> > +                              HANDSHAKE_KEY_UPDATE_TYPE_RECEIVED);
>
> No need to quiesce the queue or anything like that? everything continues
> as usual?

Everything just continues as usual. Besides a drop in throughput
waiting for the KeyUpdate you don't notice it. For testing I trigger
an update every 30 packets, it's slow but works

> Why are you flushing async_event_work?

I think that's just leftover from earlier, I'll remove it

Alistair

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2025-12-01 23:27 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-11-12  4:27 [PATCH v5 0/6] nvme-tcp: Support receiving KeyUpdate requests alistair23
2025-11-12  4:27 ` [PATCH v5 1/6] net/handshake: Store the key serial number on completion alistair23
2025-11-12 15:02   ` Chuck Lever
2025-11-30 22:21   ` Sagi Grimberg
2025-11-12  4:27 ` [PATCH v5 2/6] net/handshake: Define handshake_sk_destruct_req alistair23
2025-11-12 15:47   ` Chuck Lever
2025-11-13 10:19     ` Alistair Francis
2025-11-13 14:01       ` Chuck Lever
2025-11-13 14:37         ` Chuck Lever
2025-11-14  3:44           ` Alistair Francis
2025-11-14 14:14             ` Chuck Lever
2025-11-19  0:45               ` Alistair Francis
2025-11-20 13:51                 ` Chuck Lever
2025-11-25  5:00                   ` Alistair Francis
2025-11-25 13:55                     ` Chuck Lever
2025-11-12  4:27 ` [PATCH v5 3/6] net/handshake: Ensure the request is destructed on completion alistair23
2025-11-12  4:27 ` [PATCH v5 4/6] net/handshake: Support KeyUpdate message types alistair23
2025-11-12 15:49   ` Chuck Lever
2025-11-13  2:16     ` Alistair Francis
2025-11-13 14:41   ` Chuck Lever
2025-11-27 13:12   ` Hannes Reinecke
2025-11-12  4:27 ` [PATCH v5 5/6] nvme-tcp: Support KeyUpdate alistair23
2025-11-12  6:59   ` Christoph Hellwig
2025-11-12 14:31     ` Chuck Lever
2025-11-12 14:38       ` Christoph Hellwig
2025-11-12 14:38         ` Chuck Lever
2025-11-27 13:31   ` Hannes Reinecke
2025-12-01  4:18     ` Alistair Francis
2025-12-01 15:03       ` Hannes Reinecke
2025-11-30 22:31   ` Sagi Grimberg
2025-12-01 23:27     ` Alistair Francis
2025-11-12  4:27 ` [PATCH v5 6/6] nvmet-tcp: " alistair23
2025-11-12  7:01   ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).