public inbox for mptcp@lists.linux.dev
* [RFC mptcp-next v4 0/8] NVME over MPTCP
@ 2026-03-05  4:05 Geliang Tang
  2026-03-05  4:05 ` [RFC mptcp-next v4 1/8] mptcp: add sk_is_msk helper Geliang Tang
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: Geliang Tang @ 2026-03-05  4:05 UTC (permalink / raw)
  To: mptcp, nilay, ming.lei, hare; +Cc: Geliang Tang

From: Geliang Tang <tanggeliang@kylinos.cn>

v4:
 - add a new patch to set the NVMe iopolicy, as Nilay suggested.
 - resend the whole set to trigger AI review.

v3:
 - update the implementation of sock_set_nodelay: originally it only set
the first subflow, but now it sets every subflow.
 - use sk_is_msk helper in this set.
 - update the selftest to perform testing under a multi-interface
environment.

v2:
 - Patch 1 fixes the timeout issue reported in v1, thanks to Paolo and Gang
Yan for their help.
 - Patch 5 implements an MPTCP-specific sock_set_syncnt helper.

This series (previously named "MPTCP support to 'NVME over TCP'") went
through three RFC versions sent to Hannes in May, with revisions based on
his input. After that, I started upstreaming the dependent "implement
mptcp read_sock" series to the main MPTCP repository; it was recently
merged into net-next.

Depends on: mptcp: fix stall because of data_ready
Without this fix, the NVMe-over-MPTCP test fails intermittently.
Based-on: <20260228011511.440437-1-gang.yan@linux.dev>

Geliang Tang (8):
  mptcp: add sk_is_msk helper
  nvmet-tcp: add mptcp support
  nvme-tcp: add mptcp support
  mptcp: add sock_set_nodelay
  mptcp: add sock_set_reuseaddr
  mptcp: add sock_set_syncnt
  selftests: mptcp: add NVMe-over-MPTCP test
  selftests: mptcp: nvme: set iopolicy

 drivers/nvme/host/tcp.c                       |  26 ++-
 drivers/nvme/target/configfs.c                |   1 +
 drivers/nvme/target/tcp.c                     |  38 +++-
 include/linux/nvme.h                          |   1 +
 include/net/mptcp.h                           |  27 +++
 net/mptcp/protocol.c                          |  51 +++++
 tools/testing/selftests/net/mptcp/config      |   7 +
 .../testing/selftests/net/mptcp/mptcp_nvme.sh | 187 ++++++++++++++++++
 8 files changed, 335 insertions(+), 3 deletions(-)
 create mode 100755 tools/testing/selftests/net/mptcp/mptcp_nvme.sh

-- 
2.53.0


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [RFC mptcp-next v4 1/8] mptcp: add sk_is_msk helper
  2026-03-05  4:05 [RFC mptcp-next v4 0/8] NVME over MPTCP Geliang Tang
@ 2026-03-05  4:05 ` Geliang Tang
  2026-03-05  4:05 ` [RFC mptcp-next v4 2/8] nvmet-tcp: add mptcp support Geliang Tang
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Geliang Tang @ 2026-03-05  4:05 UTC (permalink / raw)
  To: mptcp, nilay, ming.lei, hare; +Cc: Geliang Tang

From: Geliang Tang <tanggeliang@kylinos.cn>

This patch introduces a sk_is_msk() helper, modeled after sk_is_tcp(), to
determine whether a socket is an MPTCP one. Unlike sk_is_mptcp(), which
takes a subflow (TCP) socket as its parameter, this new helper takes the
MPTCP socket itself.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
 include/net/mptcp.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/include/net/mptcp.h b/include/net/mptcp.h
index 4cf59e83c1c5..82660374859a 100644
--- a/include/net/mptcp.h
+++ b/include/net/mptcp.h
@@ -150,6 +150,13 @@ static inline bool rsk_drop_req(const struct request_sock *req)
 	return tcp_rsk(req)->is_mptcp && tcp_rsk(req)->drop_req;
 }
 
+static inline bool sk_is_msk(const struct sock *sk)
+{
+	return sk_is_inet(sk) &&
+	       sk->sk_type == SOCK_STREAM &&
+	       sk->sk_protocol == IPPROTO_MPTCP;
+}
+
 void mptcp_space(const struct sock *ssk, int *space, int *full_space);
 bool mptcp_syn_options(struct sock *sk, const struct sk_buff *skb,
 		       unsigned int *size, struct mptcp_out_options *opts);
@@ -258,6 +265,11 @@ static inline bool rsk_drop_req(const struct request_sock *req)
 	return false;
 }
 
+static inline bool sk_is_msk(const struct sock *sk)
+{
+	return false;
+}
+
 static inline bool mptcp_syn_options(struct sock *sk, const struct sk_buff *skb,
 				     unsigned int *size,
 				     struct mptcp_out_options *opts)
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [RFC mptcp-next v4 2/8] nvmet-tcp: add mptcp support
  2026-03-05  4:05 [RFC mptcp-next v4 0/8] NVME over MPTCP Geliang Tang
  2026-03-05  4:05 ` [RFC mptcp-next v4 1/8] mptcp: add sk_is_msk helper Geliang Tang
@ 2026-03-05  4:05 ` Geliang Tang
  2026-03-05  4:05 ` [RFC mptcp-next v4 3/8] nvme-tcp: " Geliang Tang
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Geliang Tang @ 2026-03-05  4:05 UTC (permalink / raw)
  To: mptcp, nilay, ming.lei, hare; +Cc: Geliang Tang, zhenwei pi, Hui Zhu, Gang Yan

From: Geliang Tang <tanggeliang@kylinos.cn>

This patch adds a new NVMe target transport type, NVMF_TRTYPE_MPTCP, for
MPTCP, and defines a new nvmet_fabrics_ops named nvmet_mptcp_ops, which
is identical to nvmet_tcp_ops except for .type.

Check whether disc_addr.trtype is NVMF_TRTYPE_MPTCP in
nvmet_tcp_add_port() to decide whether to pass IPPROTO_MPTCP to
sock_create(), creating an MPTCP socket instead of a TCP one.

The matching nvmet_fabrics_ops is then selected in
nvmet_tcp_done_recv_pdu() according to the protocol in use.

v2:
 - use trtype instead of tsas (Hannes).

v3:
 - check mptcp protocol from disc_addr.trtype instead of passing a
parameter (Hannes).

v4:
 - check CONFIG_MPTCP.

Cc: Hannes Reinecke <hare@suse.de>
Co-developed-by: zhenwei pi <zhenwei.pi@linux.dev>
Signed-off-by: zhenwei pi <zhenwei.pi@linux.dev>
Co-developed-by: Hui Zhu <zhuhui@kylinos.cn>
Signed-off-by: Hui Zhu <zhuhui@kylinos.cn>
Co-developed-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
 drivers/nvme/target/configfs.c |  1 +
 drivers/nvme/target/tcp.c      | 34 ++++++++++++++++++++++++++++++++--
 include/linux/nvme.h           |  1 +
 3 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/target/configfs.c b/drivers/nvme/target/configfs.c
index 3088e044dbcb..4b7498ffb102 100644
--- a/drivers/nvme/target/configfs.c
+++ b/drivers/nvme/target/configfs.c
@@ -38,6 +38,7 @@ static struct nvmet_type_name_map nvmet_transport[] = {
 	{ NVMF_TRTYPE_RDMA,	"rdma" },
 	{ NVMF_TRTYPE_FC,	"fc" },
 	{ NVMF_TRTYPE_TCP,	"tcp" },
+	{ NVMF_TRTYPE_MPTCP,	"mptcp" },
 	{ NVMF_TRTYPE_PCI,	"pci" },
 	{ NVMF_TRTYPE_LOOP,	"loop" },
 };
diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index acc71a26733f..5a58b544f258 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -212,6 +212,7 @@ static DEFINE_MUTEX(nvmet_tcp_queue_mutex);
 
 static struct workqueue_struct *nvmet_tcp_wq;
 static const struct nvmet_fabrics_ops nvmet_tcp_ops;
+static const struct nvmet_fabrics_ops nvmet_mptcp_ops;
 static void nvmet_tcp_free_cmd(struct nvmet_tcp_cmd *c);
 static void nvmet_tcp_free_cmd_buffers(struct nvmet_tcp_cmd *cmd);
 
@@ -1067,7 +1068,9 @@ static int nvmet_tcp_done_recv_pdu(struct nvmet_tcp_queue *queue)
 	req = &queue->cmd->req;
 	memcpy(req->cmd, nvme_cmd, sizeof(*nvme_cmd));
 
-	if (unlikely(!nvmet_req_init(req, &queue->nvme_sq, &nvmet_tcp_ops))) {
+	if (unlikely(!nvmet_req_init(req, &queue->nvme_sq,
+				     sk_is_msk(queue->sock->sk) ?
+				     &nvmet_mptcp_ops : &nvmet_tcp_ops))) {
 		pr_err("failed cmd %p id %d opcode %d, data_len: %d, status: %04x\n",
 			req->cmd, req->cmd->common.command_id,
 			req->cmd->common.opcode,
@@ -2034,6 +2037,7 @@ static int nvmet_tcp_add_port(struct nvmet_port *nport)
 {
 	struct nvmet_tcp_port *port;
 	__kernel_sa_family_t af;
+	int proto = IPPROTO_TCP;
 	int ret;
 
 	port = kzalloc_obj(*port);
@@ -2054,6 +2058,11 @@ static int nvmet_tcp_add_port(struct nvmet_port *nport)
 		goto err_port;
 	}
 
+#ifdef CONFIG_MPTCP
+	if (nport->disc_addr.trtype == NVMF_TRTYPE_MPTCP)
+		proto = IPPROTO_MPTCP;
+#endif
+
 	ret = inet_pton_with_scope(&init_net, af, nport->disc_addr.traddr,
 			nport->disc_addr.trsvcid, &port->addr);
 	if (ret) {
@@ -2068,7 +2077,7 @@ static int nvmet_tcp_add_port(struct nvmet_port *nport)
 		port->nport->inline_data_size = NVMET_TCP_DEF_INLINE_DATA_SIZE;
 
 	ret = sock_create(port->addr.ss_family, SOCK_STREAM,
-				IPPROTO_TCP, &port->sock);
+				proto, &port->sock);
 	if (ret) {
 		pr_err("failed to create a socket\n");
 		goto err_port;
@@ -2220,6 +2229,19 @@ static const struct nvmet_fabrics_ops nvmet_tcp_ops = {
 	.host_traddr		= nvmet_tcp_host_port_addr,
 };
 
+static const struct nvmet_fabrics_ops nvmet_mptcp_ops = {
+	.owner			= THIS_MODULE,
+	.type			= NVMF_TRTYPE_MPTCP,
+	.msdbd			= 1,
+	.add_port		= nvmet_tcp_add_port,
+	.remove_port		= nvmet_tcp_remove_port,
+	.queue_response		= nvmet_tcp_queue_response,
+	.delete_ctrl		= nvmet_tcp_delete_ctrl,
+	.install_queue		= nvmet_tcp_install_queue,
+	.disc_traddr		= nvmet_tcp_disc_port_addr,
+	.host_traddr		= nvmet_tcp_host_port_addr,
+};
+
 static int __init nvmet_tcp_init(void)
 {
 	int ret;
@@ -2233,6 +2255,12 @@ static int __init nvmet_tcp_init(void)
 	if (ret)
 		goto err;
 
+	ret = nvmet_register_transport(&nvmet_mptcp_ops);
+	if (ret) {
+		nvmet_unregister_transport(&nvmet_tcp_ops);
+		goto err;
+	}
+
 	return 0;
 err:
 	destroy_workqueue(nvmet_tcp_wq);
@@ -2243,6 +2271,7 @@ static void __exit nvmet_tcp_exit(void)
 {
 	struct nvmet_tcp_queue *queue;
 
+	nvmet_unregister_transport(&nvmet_mptcp_ops);
 	nvmet_unregister_transport(&nvmet_tcp_ops);
 
 	flush_workqueue(nvmet_wq);
@@ -2262,3 +2291,4 @@ module_exit(nvmet_tcp_exit);
 MODULE_DESCRIPTION("NVMe target TCP transport driver");
 MODULE_LICENSE("GPL v2");
 MODULE_ALIAS("nvmet-transport-3"); /* 3 == NVMF_TRTYPE_TCP */
+MODULE_ALIAS("nvmet-transport-4"); /* 4 == NVMF_TRTYPE_MPTCP */
diff --git a/include/linux/nvme.h b/include/linux/nvme.h
index 655d194f8e72..8069667ad47e 100644
--- a/include/linux/nvme.h
+++ b/include/linux/nvme.h
@@ -68,6 +68,7 @@ enum {
 	NVMF_TRTYPE_RDMA	= 1,	/* RDMA */
 	NVMF_TRTYPE_FC		= 2,	/* Fibre Channel */
 	NVMF_TRTYPE_TCP		= 3,	/* TCP/IP */
+	NVMF_TRTYPE_MPTCP	= 4,	/* Multipath TCP */
 	NVMF_TRTYPE_LOOP	= 254,	/* Reserved for host usage */
 	NVMF_TRTYPE_MAX,
 };
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [RFC mptcp-next v4 3/8] nvme-tcp: add mptcp support
  2026-03-05  4:05 [RFC mptcp-next v4 0/8] NVME over MPTCP Geliang Tang
  2026-03-05  4:05 ` [RFC mptcp-next v4 1/8] mptcp: add sk_is_msk helper Geliang Tang
  2026-03-05  4:05 ` [RFC mptcp-next v4 2/8] nvmet-tcp: add mptcp support Geliang Tang
@ 2026-03-05  4:05 ` Geliang Tang
  2026-03-05  4:05 ` [RFC mptcp-next v4 4/8] mptcp: add sock_set_nodelay Geliang Tang
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Geliang Tang @ 2026-03-05  4:05 UTC (permalink / raw)
  To: mptcp, nilay, ming.lei, hare; +Cc: Geliang Tang, zhenwei pi, Hui Zhu, Gang Yan

From: Geliang Tang <tanggeliang@kylinos.cn>

This patch defines a new nvmf_transport_ops named nvme_mptcp_transport,
which is almost the same as nvme_tcp_transport except for its .name.

Check whether opts->transport is "mptcp" in nvme_tcp_alloc_queue() to
decide whether to pass IPPROTO_MPTCP to sock_create_kern(), creating an
MPTCP socket instead of a TCP one.

v2:
 - use 'trtype' instead of '--mptcp' (Hannes)

v3:
 - check mptcp protocol from opts->transport instead of passing a
parameter (Hannes).

v4:
 - check CONFIG_MPTCP.

Cc: Hannes Reinecke <hare@suse.de>
Co-developed-by: zhenwei pi <zhenwei.pi@linux.dev>
Signed-off-by: zhenwei pi <zhenwei.pi@linux.dev>
Co-developed-by: Hui Zhu <zhuhui@kylinos.cn>
Signed-off-by: Hui Zhu <zhuhui@kylinos.cn>
Co-developed-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
 drivers/nvme/host/tcp.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 9ab3f61196a3..8446630cceca 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1766,6 +1766,7 @@ static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, int qid,
 {
 	struct nvme_tcp_ctrl *ctrl = to_tcp_ctrl(nctrl);
 	struct nvme_tcp_queue *queue = &ctrl->queues[qid];
+	int proto = IPPROTO_TCP;
 	int ret, rcv_pdu_size;
 	struct file *sock_file;
 
@@ -1782,9 +1783,14 @@ static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, int qid,
 		queue->cmnd_capsule_len = sizeof(struct nvme_command) +
 						NVME_TCP_ADMIN_CCSZ;
 
+#ifdef CONFIG_MPTCP
+	if (!strcmp(ctrl->ctrl.opts->transport, "mptcp"))
+		proto = IPPROTO_MPTCP;
+#endif
+
 	ret = sock_create_kern(current->nsproxy->net_ns,
 			ctrl->addr.ss_family, SOCK_STREAM,
-			IPPROTO_TCP, &queue->sock);
+			proto, &queue->sock);
 	if (ret) {
 		dev_err(nctrl->device,
 			"failed to create socket: %d\n", ret);
@@ -3022,6 +3028,18 @@ static struct nvmf_transport_ops nvme_tcp_transport = {
 	.create_ctrl	= nvme_tcp_create_ctrl,
 };
 
+static struct nvmf_transport_ops nvme_mptcp_transport = {
+	.name		= "mptcp",
+	.module		= THIS_MODULE,
+	.required_opts	= NVMF_OPT_TRADDR,
+	.allowed_opts	= NVMF_OPT_TRSVCID | NVMF_OPT_RECONNECT_DELAY |
+			  NVMF_OPT_HOST_TRADDR | NVMF_OPT_CTRL_LOSS_TMO |
+			  NVMF_OPT_HDR_DIGEST | NVMF_OPT_DATA_DIGEST |
+			  NVMF_OPT_NR_WRITE_QUEUES | NVMF_OPT_NR_POLL_QUEUES |
+			  NVMF_OPT_TOS | NVMF_OPT_HOST_IFACE,
+	.create_ctrl	= nvme_tcp_create_ctrl,
+};
+
 static int __init nvme_tcp_init_module(void)
 {
 	unsigned int wq_flags = WQ_MEM_RECLAIM | WQ_HIGHPRI | WQ_SYSFS;
@@ -3047,6 +3065,7 @@ static int __init nvme_tcp_init_module(void)
 		atomic_set(&nvme_tcp_cpu_queues[cpu], 0);
 
 	nvmf_register_transport(&nvme_tcp_transport);
+	nvmf_register_transport(&nvme_mptcp_transport);
 	return 0;
 }
 
@@ -3054,6 +3073,7 @@ static void __exit nvme_tcp_cleanup_module(void)
 {
 	struct nvme_tcp_ctrl *ctrl;
 
+	nvmf_unregister_transport(&nvme_mptcp_transport);
 	nvmf_unregister_transport(&nvme_tcp_transport);
 
 	mutex_lock(&nvme_tcp_ctrl_mutex);
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [RFC mptcp-next v4 4/8] mptcp: add sock_set_nodelay
  2026-03-05  4:05 [RFC mptcp-next v4 0/8] NVME over MPTCP Geliang Tang
                   ` (2 preceding siblings ...)
  2026-03-05  4:05 ` [RFC mptcp-next v4 3/8] nvme-tcp: " Geliang Tang
@ 2026-03-05  4:05 ` Geliang Tang
  2026-03-05  4:05 ` [RFC mptcp-next v4 5/8] mptcp: add sock_set_reuseaddr Geliang Tang
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Geliang Tang @ 2026-03-05  4:05 UTC (permalink / raw)
  To: mptcp, nilay, ming.lei, hare; +Cc: Geliang Tang, zhenwei pi, Hui Zhu, Gang Yan

From: Geliang Tang <tanggeliang@kylinos.cn>

This patch introduces an MPTCP-specific helper, mptcp_sock_set_nodelay(),
which sets the TCP_NODELAY option on every subflow socket of an MPTCP
connection. It is used on both the target and host sides of the
'NVMe over MPTCP' implementation.

Calling tcp_sock_set_nodelay() directly on an MPTCP socket causes list
corruption:

  nvmet: adding nsid 1 to subsystem nqn.2014-08.org.nvmexpress.mptcpdev
  nvmet_tcp: enabling port 1234 (127.0.0.1:4420)
   slab MPTCP start ffff8880108f0b80 pointer offset 2480 size 2816
  list_add corruption. prev->next should be next (ffff8880108f1530), but
  was ffff8885108f1530. (prev=ffff8880108f1530).
  ------------[ cut here ]------------
  kernel BUG at lib/list_debug.c:32!
  Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI
  CPU: 1 UID: 0 PID: 182 Comm: nvme Not tainted 6.16.0-rc3+ #1 PREEMPT(full)
  Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011

Co-developed-by: zhenwei pi <zhenwei.pi@linux.dev>
Signed-off-by: zhenwei pi <zhenwei.pi@linux.dev>
Co-developed-by: Hui Zhu <zhuhui@kylinos.cn>
Signed-off-by: Hui Zhu <zhuhui@kylinos.cn>
Co-developed-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
 drivers/nvme/host/tcp.c   |  2 ++
 drivers/nvme/target/tcp.c |  2 ++
 include/net/mptcp.h       |  4 ++++
 net/mptcp/protocol.c      | 17 +++++++++++++++++
 4 files changed, 25 insertions(+)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 8446630cceca..dc5b3ecdd885 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1810,6 +1810,8 @@ static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, int qid,
 	tcp_sock_set_syncnt(queue->sock->sk, 1);
 
 	/* Set TCP no delay */
+	sk_is_msk(queue->sock->sk) ?
+	mptcp_sock_set_nodelay(queue->sock->sk) :
 	tcp_sock_set_nodelay(queue->sock->sk);
 
 	/*
diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index 5a58b544f258..8452d38614a6 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -2087,6 +2087,8 @@ static int nvmet_tcp_add_port(struct nvmet_port *nport)
 	port->data_ready = port->sock->sk->sk_data_ready;
 	port->sock->sk->sk_data_ready = nvmet_tcp_listen_data_ready;
 	sock_set_reuseaddr(port->sock->sk);
+	sk_is_msk(port->sock->sk) ?
+	mptcp_sock_set_nodelay(port->sock->sk) :
 	tcp_sock_set_nodelay(port->sock->sk);
 	if (so_priority > 0)
 		sock_set_priority(port->sock->sk, so_priority);
diff --git a/include/net/mptcp.h b/include/net/mptcp.h
index 82660374859a..60cbf29448b0 100644
--- a/include/net/mptcp.h
+++ b/include/net/mptcp.h
@@ -244,6 +244,8 @@ static inline __be32 mptcp_reset_option(const struct sk_buff *skb)
 }
 
 void mptcp_active_detect_blackhole(struct sock *sk, bool expired);
+
+void mptcp_sock_set_nodelay(struct sock *sk);
 #else
 
 static inline void mptcp_init(void)
@@ -335,6 +337,8 @@ static inline struct request_sock *mptcp_subflow_reqsk_alloc(const struct reques
 static inline __be32 mptcp_reset_option(const struct sk_buff *skb)  { return htonl(0u); }
 
 static inline void mptcp_active_detect_blackhole(struct sock *sk, bool expired) { }
+
+static inline void mptcp_sock_set_nodelay(struct sock *sk) { }
 #endif /* CONFIG_MPTCP */
 
 #if IS_ENABLED(CONFIG_MPTCP_IPV6)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index c8fcc46ed042..451bc4df4fa4 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -3806,6 +3806,23 @@ static void mptcp_sock_check_graft(struct sock *sk, struct sock *ssk)
 	}
 }
 
+void mptcp_sock_set_nodelay(struct sock *sk)
+{
+	struct mptcp_sock *msk = mptcp_sk(sk);
+	struct mptcp_subflow_context *subflow;
+
+	lock_sock(sk);
+	mptcp_for_each_subflow(msk, subflow) {
+		struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
+
+		lock_sock(ssk);
+		__tcp_sock_set_nodelay(ssk, true);
+		release_sock(ssk);
+	}
+	release_sock(sk);
+}
+EXPORT_SYMBOL(mptcp_sock_set_nodelay);
+
 bool mptcp_finish_join(struct sock *ssk)
 {
 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [RFC mptcp-next v4 5/8] mptcp: add sock_set_reuseaddr
  2026-03-05  4:05 [RFC mptcp-next v4 0/8] NVME over MPTCP Geliang Tang
                   ` (3 preceding siblings ...)
  2026-03-05  4:05 ` [RFC mptcp-next v4 4/8] mptcp: add sock_set_nodelay Geliang Tang
@ 2026-03-05  4:05 ` Geliang Tang
  2026-03-05  4:05 ` [RFC mptcp-next v4 6/8] mptcp: add sock_set_syncnt Geliang Tang
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Geliang Tang @ 2026-03-05  4:05 UTC (permalink / raw)
  To: mptcp, nilay, ming.lei, hare; +Cc: Geliang Tang, zhenwei pi, Hui Zhu, Gang Yan

From: Geliang Tang <tanggeliang@kylinos.cn>

This patch introduces a dedicated MPTCP helper, mptcp_sock_set_reuseaddr(),
which sets the address-reuse flag on the first subflow socket of an MPTCP
connection, and uses it on the target side of the 'NVMe over MPTCP'
implementation.

Co-developed-by: zhenwei pi <zhenwei.pi@linux.dev>
Signed-off-by: zhenwei pi <zhenwei.pi@linux.dev>
Co-developed-by: Hui Zhu <zhuhui@kylinos.cn>
Signed-off-by: Hui Zhu <zhuhui@kylinos.cn>
Co-developed-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
 drivers/nvme/target/tcp.c |  2 ++
 include/net/mptcp.h       |  4 ++++
 net/mptcp/protocol.c      | 15 +++++++++++++++
 3 files changed, 21 insertions(+)

diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index 8452d38614a6..5111c0e690ee 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -2086,6 +2086,8 @@ static int nvmet_tcp_add_port(struct nvmet_port *nport)
 	port->sock->sk->sk_user_data = port;
 	port->data_ready = port->sock->sk->sk_data_ready;
 	port->sock->sk->sk_data_ready = nvmet_tcp_listen_data_ready;
+	sk_is_msk(port->sock->sk) ?
+	mptcp_sock_set_reuseaddr(port->sock->sk) :
 	sock_set_reuseaddr(port->sock->sk);
 	sk_is_msk(port->sock->sk) ?
 	mptcp_sock_set_nodelay(port->sock->sk) :
diff --git a/include/net/mptcp.h b/include/net/mptcp.h
index 60cbf29448b0..63b64b7699e3 100644
--- a/include/net/mptcp.h
+++ b/include/net/mptcp.h
@@ -246,6 +246,8 @@ static inline __be32 mptcp_reset_option(const struct sk_buff *skb)
 void mptcp_active_detect_blackhole(struct sock *sk, bool expired);
 
 void mptcp_sock_set_nodelay(struct sock *sk);
+
+void mptcp_sock_set_reuseaddr(struct sock *sk);
 #else
 
 static inline void mptcp_init(void)
@@ -339,6 +341,8 @@ static inline __be32 mptcp_reset_option(const struct sk_buff *skb)  { return hto
 static inline void mptcp_active_detect_blackhole(struct sock *sk, bool expired) { }
 
 static inline void mptcp_sock_set_nodelay(struct sock *sk) { }
+
+static inline void mptcp_sock_set_reuseaddr(struct sock *sk) { }
 #endif /* CONFIG_MPTCP */
 
 #if IS_ENABLED(CONFIG_MPTCP_IPV6)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 451bc4df4fa4..61f4eba02b37 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -3823,6 +3823,21 @@ void mptcp_sock_set_nodelay(struct sock *sk)
 }
 EXPORT_SYMBOL(mptcp_sock_set_nodelay);
 
+void mptcp_sock_set_reuseaddr(struct sock *sk)
+{
+	struct mptcp_sock *msk = mptcp_sk(sk);
+	struct sock *ssk;
+
+	lock_sock(sk);
+	ssk = __mptcp_nmpc_sk(msk);
+	if (IS_ERR(ssk))
+		goto unlock;
+	ssk->sk_reuse = SK_CAN_REUSE;
+unlock:
+	release_sock(sk);
+}
+EXPORT_SYMBOL(mptcp_sock_set_reuseaddr);
+
 bool mptcp_finish_join(struct sock *ssk)
 {
 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [RFC mptcp-next v4 6/8] mptcp: add sock_set_syncnt
  2026-03-05  4:05 [RFC mptcp-next v4 0/8] NVME over MPTCP Geliang Tang
                   ` (4 preceding siblings ...)
  2026-03-05  4:05 ` [RFC mptcp-next v4 5/8] mptcp: add sock_set_reuseaddr Geliang Tang
@ 2026-03-05  4:05 ` Geliang Tang
  2026-03-05  4:05 ` [RFC mptcp-next v4 7/8] selftests: mptcp: add NVMe-over-MPTCP test Geliang Tang
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Geliang Tang @ 2026-03-05  4:05 UTC (permalink / raw)
  To: mptcp, nilay, ming.lei, hare; +Cc: Geliang Tang, zhenwei pi, Hui Zhu, Gang Yan

From: Geliang Tang <tanggeliang@kylinos.cn>

This patch introduces a dedicated MPTCP helper, mptcp_sock_set_syncnt(),
which sets the SYN retransmission count on the first subflow socket, and
uses it on the host side of the 'NVMe over MPTCP' implementation.

Co-developed-by: zhenwei pi <zhenwei.pi@linux.dev>
Signed-off-by: zhenwei pi <zhenwei.pi@linux.dev>
Co-developed-by: Hui Zhu <zhuhui@kylinos.cn>
Signed-off-by: Hui Zhu <zhuhui@kylinos.cn>
Co-developed-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
 drivers/nvme/host/tcp.c |  2 ++
 include/net/mptcp.h     |  7 +++++++
 net/mptcp/protocol.c    | 19 +++++++++++++++++++
 3 files changed, 28 insertions(+)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index dc5b3ecdd885..8ead932e7999 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1807,6 +1807,8 @@ static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, int qid,
 	nvme_tcp_reclassify_socket(queue->sock);
 
 	/* Single syn retry */
+	sk_is_msk(queue->sock->sk) ?
+	mptcp_sock_set_syncnt(queue->sock->sk, 1) :
 	tcp_sock_set_syncnt(queue->sock->sk, 1);
 
 	/* Set TCP no delay */
diff --git a/include/net/mptcp.h b/include/net/mptcp.h
index 63b64b7699e3..d6bb67a55f24 100644
--- a/include/net/mptcp.h
+++ b/include/net/mptcp.h
@@ -248,6 +248,8 @@ void mptcp_active_detect_blackhole(struct sock *sk, bool expired);
 void mptcp_sock_set_nodelay(struct sock *sk);
 
 void mptcp_sock_set_reuseaddr(struct sock *sk);
+
+int mptcp_sock_set_syncnt(struct sock *sk, int val);
 #else
 
 static inline void mptcp_init(void)
@@ -343,6 +345,11 @@ static inline void mptcp_active_detect_blackhole(struct sock *sk, bool expired)
 static inline void mptcp_sock_set_nodelay(struct sock *sk) { }
 
 static inline void mptcp_sock_set_reuseaddr(struct sock *sk) { }
+
+static inline int mptcp_sock_set_syncnt(struct sock *sk, int val)
+{
+	return 0;
+}
 #endif /* CONFIG_MPTCP */
 
 #if IS_ENABLED(CONFIG_MPTCP_IPV6)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 61f4eba02b37..bc507b6e91e7 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -3838,6 +3838,25 @@ void mptcp_sock_set_reuseaddr(struct sock *sk)
 }
 EXPORT_SYMBOL(mptcp_sock_set_reuseaddr);
 
+int mptcp_sock_set_syncnt(struct sock *sk, int val)
+{
+	struct mptcp_sock *msk = mptcp_sk(sk);
+	struct sock *ssk;
+
+	if (val < 1 || val > MAX_TCP_SYNCNT)
+		return -EINVAL;
+
+	lock_sock(sk);
+	ssk = __mptcp_nmpc_sk(msk);
+	if (IS_ERR(ssk))
+		goto unlock;
+	WRITE_ONCE(inet_csk(ssk)->icsk_syn_retries, val);
+unlock:
+	release_sock(sk);
+	return 0;
+}
+EXPORT_SYMBOL(mptcp_sock_set_syncnt);
+
 bool mptcp_finish_join(struct sock *ssk)
 {
 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [RFC mptcp-next v4 7/8] selftests: mptcp: add NVMe-over-MPTCP test
  2026-03-05  4:05 [RFC mptcp-next v4 0/8] NVME over MPTCP Geliang Tang
                   ` (5 preceding siblings ...)
  2026-03-05  4:05 ` [RFC mptcp-next v4 6/8] mptcp: add sock_set_syncnt Geliang Tang
@ 2026-03-05  4:05 ` Geliang Tang
  2026-03-05  4:05 ` [RFC mptcp-next v4 8/8] selftests: mptcp: nvme: set iopolicy Geliang Tang
  2026-03-05  6:13 ` [RFC mptcp-next v4 0/8] NVME over MPTCP MPTCP CI
  8 siblings, 0 replies; 10+ messages in thread
From: Geliang Tang @ 2026-03-05  4:05 UTC (permalink / raw)
  To: mptcp, nilay, ming.lei, hare; +Cc: Geliang Tang, zhenwei pi, Hui Zhu, Gang Yan

From: Geliang Tang <tanggeliang@kylinos.cn>

This patch adds a test case for NVMe over MPTCP. It verifies that the
nvme list, discover, connect, and disconnect commands work properly, and
evaluates read/write performance with fio.

The test simulates four NICs on both the target and host sides, each
limited to 100MB/s, and shows that 'NVMe over MPTCP' delivers up to four
times the bandwidth of standard TCP:

 # ./mptcp_nvme.sh tcp
  READ: bw=112MiB/s (118MB/s), 112MiB/s-112MiB/s (118MB/s-118MB/s),
		io=1123MiB (1177MB), run=10018-10018msec
  WRITE: bw=112MiB/s (117MB/s), 112MiB/s-112MiB/s (117MB/s-117MB/s),
		io=1118MiB (1173MB), run=10018-10018msec

 # ./mptcp_nvme.sh mptcp
  READ: bw=427MiB/s (448MB/s), 427MiB/s-427MiB/s (448MB/s-448MB/s),
		io=4286MiB (4494MB), run=10039-10039msec
  WRITE: bw=387MiB/s (406MB/s), 387MiB/s-387MiB/s (406MB/s-406MB/s),
		io=3885MiB (4073MB), run=10043-10043msec

Co-developed-by: zhenwei pi <zhenwei.pi@linux.dev>
Signed-off-by: zhenwei pi <zhenwei.pi@linux.dev>
Co-developed-by: Hui Zhu <zhuhui@kylinos.cn>
Signed-off-by: Hui Zhu <zhuhui@kylinos.cn>
Co-developed-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
 tools/testing/selftests/net/mptcp/config      |   7 +
 .../testing/selftests/net/mptcp/mptcp_nvme.sh | 179 ++++++++++++++++++
 2 files changed, 186 insertions(+)
 create mode 100755 tools/testing/selftests/net/mptcp/mptcp_nvme.sh

diff --git a/tools/testing/selftests/net/mptcp/config b/tools/testing/selftests/net/mptcp/config
index 59051ee2a986..0eee348eff8b 100644
--- a/tools/testing/selftests/net/mptcp/config
+++ b/tools/testing/selftests/net/mptcp/config
@@ -34,3 +34,10 @@ CONFIG_NFT_SOCKET=m
 CONFIG_NFT_TPROXY=m
 CONFIG_SYN_COOKIES=y
 CONFIG_VETH=y
+CONFIG_CONFIGFS_FS=y
+CONFIG_NVME_CORE=y
+CONFIG_NVME_FABRICS=y
+CONFIG_NVME_TCP=y
+CONFIG_NVME_TARGET=y
+CONFIG_NVME_TARGET_TCP=y
+CONFIG_NVME_MULTIPATH=y
diff --git a/tools/testing/selftests/net/mptcp/mptcp_nvme.sh b/tools/testing/selftests/net/mptcp/mptcp_nvme.sh
new file mode 100755
index 000000000000..14a620040df2
--- /dev/null
+++ b/tools/testing/selftests/net/mptcp/mptcp_nvme.sh
@@ -0,0 +1,179 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+. "$(dirname "$0")/mptcp_lib.sh"
+
+trtype="${1:-mptcp}"
+nqn=nqn.2014-08.org.nvmexpress.${trtype}dev
+ns=1
+port=1234
+trsvcid=4420
+ns1=""
+ns2=""
+
+ns1_cleanup()
+{
+	mount -t configfs none /sys/kernel/config
+
+	rm -rf /sys/kernel/config/nvmet/ports/${port}/subsystems/${trtype}subsys
+	rmdir /sys/kernel/config/nvmet/ports/${port}
+	echo 0 > /sys/kernel/config/nvmet/subsystems/${nqn}/namespaces/${ns}/enable
+	echo -n 0 > /sys/kernel/config/nvmet/subsystems/${nqn}/namespaces/${ns}/device_path
+	rmdir /sys/kernel/config/nvmet/subsystems/${nqn}/namespaces/${ns}
+	rmdir /sys/kernel/config/nvmet/subsystems/${nqn}
+}
+
+ns2_cleanup()
+{
+	nvme disconnect -n ${nqn} || true
+}
+
+cleanup()
+{
+	sleep 1
+
+	ip netns exec "$ns2" bash <<- EOF
+		$(declare -f ns2_cleanup)
+		ns2_cleanup
+	EOF
+
+	ip netns exec "$ns1" bash <<- EOF
+		$(declare -f ns1_cleanup)
+		ns1_cleanup
+	EOF
+
+	losetup -d /dev/loop100
+	rm -rf /tmp/test.raw
+
+	mptcp_lib_ns_exit "$ns1" "$ns2"
+
+	kill "$monitor_pid_ns1" 2>/dev/null
+	wait "$monitor_pid_ns1" 2>/dev/null
+
+	kill "$monitor_pid_ns2" 2>/dev/null
+	wait "$monitor_pid_ns2" 2>/dev/null
+
+	unset -v trtype nqn ns port trsvcid
+}
+
+init()
+{
+	mptcp_lib_ns_init ns1 ns2
+
+	# ns1		ns2
+	# 10.1.1.1	10.1.1.2
+	# 10.1.2.1	10.1.2.2
+	# 10.1.3.1	10.1.3.2
+	# 10.1.4.1	10.1.4.2
+	for i in {1..4}; do
+		ip link add ns1eth$i netns "$ns1" type veth peer name ns2eth$i netns "$ns2"
+		ip -net "$ns1" addr add 10.1.$i.1/24 dev ns1eth$i
+		ip -net "$ns1" addr add dead:beef:$i::1/64 dev ns1eth$i nodad
+		ip -net "$ns1" link set ns1eth$i up
+		ip -net "$ns2" addr add 10.1.$i.2/24 dev ns2eth$i
+		ip -net "$ns2" addr add dead:beef:$i::2/64 dev ns2eth$i nodad
+		ip -net "$ns2" link set ns2eth$i up
+		ip -net "$ns2" route add default via 10.1.$i.1 dev ns2eth$i metric 10$i
+		ip -net "$ns2" route add default via dead:beef:$i::1 dev ns2eth$i metric 10$i
+
+		# Add tc qdisc to both namespaces for bandwidth limiting
+		tc -n $ns1 qdisc add dev ns1eth$i root netem rate 1000mbit
+		tc -n $ns2 qdisc add dev ns2eth$i root netem rate 1000mbit
+	done
+
+	mptcp_lib_pm_nl_set_limits "${ns1}" 8 8
+
+	mptcp_lib_pm_nl_add_endpoint "$ns1" 10.1.2.1 flags signal
+	mptcp_lib_pm_nl_add_endpoint "$ns1" 10.1.3.1 flags signal
+	mptcp_lib_pm_nl_add_endpoint "$ns1" 10.1.4.1 flags signal
+
+	mptcp_lib_pm_nl_set_limits "${ns2}" 8 8
+
+	mptcp_lib_pm_nl_add_endpoint "$ns2" 10.1.2.2 flags subflow
+	mptcp_lib_pm_nl_add_endpoint "$ns2" 10.1.3.2 flags subflow
+	mptcp_lib_pm_nl_add_endpoint "$ns2" 10.1.4.2 flags subflow
+
+	ip -n "${ns1}" mptcp monitor &
+	monitor_pid_ns1=$!
+	ip -n "${ns2}" mptcp monitor &
+	monitor_pid_ns2=$!
+}
+
+run_target()
+{
+	mount -t configfs none /sys/kernel/config
+
+	cd /sys/kernel/config/nvmet/subsystems
+	mkdir -p "${nqn}"
+	cd "${nqn}"
+	echo 1 > attr_allow_any_host
+	mkdir -p namespaces/${ns}
+	echo /dev/loop100 > namespaces/${ns}/device_path
+	echo 1 > namespaces/${ns}/enable
+
+	cd /sys/kernel/config/nvmet/ports
+	mkdir -p "${port}"
+	cd "${port}"
+	echo "${trtype}" > addr_trtype
+	echo ipv4 > addr_adrfam
+	echo 0.0.0.0 > addr_traddr
+	echo "${trsvcid}" > addr_trsvcid
+
+	cd subsystems
+	ln -sf ../../../subsystems/${nqn} ${trtype}subsys
+}
+
+run_host()
+{
+	local traddr=10.1.1.1
+
+	echo "nvme discover -a ${traddr}"
+	nvme discover -t "${trtype}" -a "${traddr}" -s "${trsvcid}"
+
+	echo "nvme connect"
+	devname=$(nvme connect -t "${trtype}" -a "${traddr}" -s "${trsvcid}" -n "${nqn}" |
+		  awk '{print $NF}')
+	echo "devname=${devname}"
+
+	sleep 1
+
+	echo "nvme list"
+	nvme list
+
+	echo "fio randread /dev/${devname}n1"
+	fio --name=global --direct=1 --norandommap --randrepeat=0 --ioengine=libaio \
+	    --thread=1 --blocksize=4k --runtime=10 --time_based --rw=randread --numjobs=4 \
+	    --iodepth=256 --group_reporting --size=100% --name=libaio_4_256_4k_randread \
+	    --size=4m --filename=/dev/${devname}n1
+
+	sleep 1
+
+	echo "fio randwrite /dev/${devname}n1"
+	fio --name=global --direct=1 --norandommap --randrepeat=0 --ioengine=libaio \
+	    --thread=1 --blocksize=4k --runtime=10 --time_based --rw=randwrite --numjobs=4 \
+	    --iodepth=256 --group_reporting --size=100% --name=libaio_4_256_4k_randwrite \
+	    --size=4m --filename=/dev/${devname}n1
+}
+
+init
+trap cleanup EXIT
+
+dd if=/dev/zero of=/tmp/test.raw bs=1M count=0 seek=512
+losetup /dev/loop100 /tmp/test.raw
+
+run_test()
+{
+	export trtype nqn ns port trsvcid
+
+	ip netns exec "$ns1" bash <<- EOF
+		$(declare -f run_target)
+		run_target
+	EOF
+
+	ip netns exec "$ns2" bash <<- EOF
+		$(declare -f run_host)
+		run_host
+	EOF
+}
+
+run_test "$@"
-- 
2.53.0



* [RFC mptcp-next v4 8/8] selftests: mptcp: nvme: set iopolicy
  2026-03-05  4:05 [RFC mptcp-next v4 0/8] NVME over MPTCP Geliang Tang
                   ` (6 preceding siblings ...)
  2026-03-05  4:05 ` [RFC mptcp-next v4 7/8] selftests: mptcp: add NVMe-over-MPTCP test Geliang Tang
@ 2026-03-05  4:05 ` Geliang Tang
  2026-03-05  6:13 ` [RFC mptcp-next v4 0/8] NVME over MPTCP MPTCP CI
  8 siblings, 0 replies; 10+ messages in thread
From: Geliang Tang @ 2026-03-05  4:05 UTC (permalink / raw)
  To: mptcp, nilay, ming.lei, hare; +Cc: Geliang Tang

From: Geliang Tang <tanggeliang@kylinos.cn>

Add NVMe iopolicy testing to mptcp_nvme.sh. The iopolicy defaults to
"numa" and can be overridden to "round-robin" or "queue-depth" via a
second command-line argument:

 # ./mptcp_nvme.sh mptcp round-robin
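
These are the three iopolicy values accepted by the kernel's NVMe
multipath sysfs attribute. As a minimal sketch (not part of the patch;
the helper name is hypothetical), the argument could be validated
before being written to sysfs like this:

```shell
# Hypothetical helper: accept only the three iopolicy values the
# kernel's NVMe multipath code recognizes, echoing the value on
# success and failing otherwise.
validate_iopolicy() {
	case "$1" in
		numa|round-robin|queue-depth)
			echo "$1"
			return 0
			;;
		*)
			echo "invalid iopolicy: $1" >&2
			return 1
			;;
	esac
}

validate_iopolicy "round-robin"                       # prints "round-robin"
validate_iopolicy "bogus" 2>/dev/null || echo "rejected"  # prints "rejected"
```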

Cc: Nilay Shroff <nilay@linux.ibm.com>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
 tools/testing/selftests/net/mptcp/mptcp_nvme.sh | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/tools/testing/selftests/net/mptcp/mptcp_nvme.sh b/tools/testing/selftests/net/mptcp/mptcp_nvme.sh
index 14a620040df2..2e03f47d22da 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_nvme.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_nvme.sh
@@ -4,6 +4,7 @@
 . "$(dirname "$0")/mptcp_lib.sh"
 
 trtype="${1:-mptcp}"
+iopolicy="${2:-numa}" # or: round-robin, queue-depth
 nqn=nqn.2014-08.org.nvmexpress.${trtype}dev
 ns=1
 port=1234
@@ -140,6 +141,12 @@ run_host()
 	echo "nvme list"
 	nvme list
 
+	subname=$(nvme list-subsys /dev/${devname}n1 |
+		  grep -o 'nvme-subsys[0-9]*' | head -1)
+
+	echo ${iopolicy} > /sys/class/nvme-subsystem/${subname}/iopolicy
+	cat /sys/class/nvme-subsystem/${subname}/iopolicy
+
 	echo "fio randread /dev/${devname}n1"
 	fio --name=global --direct=1 --norandommap --randrepeat=0 --ioengine=libaio \
 	    --thread=1 --blocksize=4k --runtime=10 --time_based --rw=randread --numjobs=4 \
@@ -164,6 +171,7 @@ losetup /dev/loop100 /tmp/test.raw
 run_test()
 {
 	export trtype nqn ns port trsvcid
+	export iopolicy
 
 	ip netns exec "$ns1" bash <<- EOF
 		$(declare -f run_target)
-- 
2.53.0



* Re: [RFC mptcp-next v4 0/8] NVME over MPTCP
  2026-03-05  4:05 [RFC mptcp-next v4 0/8] NVME over MPTCP Geliang Tang
                   ` (7 preceding siblings ...)
  2026-03-05  4:05 ` [RFC mptcp-next v4 8/8] selftests: mptcp: nvme: set iopolicy Geliang Tang
@ 2026-03-05  6:13 ` MPTCP CI
  8 siblings, 0 replies; 10+ messages in thread
From: MPTCP CI @ 2026-03-05  6:13 UTC (permalink / raw)
  To: Geliang Tang; +Cc: mptcp

Hi Geliang,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal (except selftest_mptcp_join): Notice: Call Traces at boot time, rebooted and continued 🔴
- KVM Validation: normal (only selftest_mptcp_join): Success! ✅
- KVM Validation: debug (except selftest_mptcp_join): Unstable: 2 failed test(s): packetdrill_dss packetdrill_sockopts 🔴
- KVM Validation: debug (only selftest_mptcp_join): Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/22702177776

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/519df0e440a2
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=1061679


If there are any issues, you can reproduce them in the same environment as
the one used by the CI, thanks to a Docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-normal

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker


Please note that, despite all the efforts already made to keep the test
suite stable when run on a public CI like this one, some reported issues
may not be due to your modifications. Still, do not hesitate to help us
improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)

