* [PATCH v4 0/4] RDMA/rxe: Add the support that rxe can work in net namespace
@ 2026-03-08 23:35 Zhu Yanjun
  2026-03-08 23:35 ` [PATCH v4 1/4] RDMA/nldev: Add dellink function pointer Zhu Yanjun
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Zhu Yanjun @ 2026-03-08 23:35 UTC
  To: jgg, leon, zyjzyj2000, yanjun.zhu, dsahern, linux-rdma,
	linux-kselftest

Currently rxe does not work correctly in network namespaces.

When the rdma_rxe module is loaded, a UDP socket listening on port
4791 is created in init_net. When users run:

    rdma link add ... type rxe

inside another network namespace, the RXE RDMA link is created but it
cannot function properly because the underlying UDP socket belongs to
init_net. Other network namespaces cannot use that socket.

To address this issue, this series introduces net namespace support
for rxe and moves socket management to be per network namespace.

The series first introduces per-net namespace management for the IPv4
and IPv6 sockets used by rxe. The sockets are created when the first
rxe link is added in a network namespace and are released when the
last link is deleted or the namespace is destroyed.

Based on this infrastructure, rxe RDMA links are then created and
destroyed within each network namespace. This ensures that both the
UDP sockets and RDMA links are correctly scoped to the namespace in
which they are used.

With these changes, rxe RDMA links can be created and used both in
init_net and in other network namespaces, and resources are properly
cleaned up during namespace teardown.

The series also includes a selftest to verify RXE functionality in
network namespaces.
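
The socket sharing described above can be illustrated with a small
userspace model. This is only a sketch under stated assumptions: the
names model_sock, model_ns, model_link_add and model_link_del are
hypothetical and do not appear in the series, only the IPv4 slot is
modelled, and a plain counter stands in for the sk_refcnt handling
(sock_hold() and rxe_sock_put()) used in patch 3/4.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for one per-namespace UDP socket. */
struct model_sock {
	int refs;		/* one reference per rxe link using it */
};

/* Hypothetical stand-in for the per-namespace slot (IPv4 only). */
struct model_ns {
	struct model_sock *sk4;
};

/* Link creation: reuse the namespace's socket, or create it on first use. */
int model_link_add(struct model_ns *ns)
{
	if (ns->sk4) {
		ns->sk4->refs++;
		return 0;
	}

	ns->sk4 = calloc(1, sizeof(*ns->sk4));
	if (!ns->sk4)
		return -1;

	ns->sk4->refs = 1;
	return 0;
}

/* Link deletion: drop one reference; the last link releases the socket. */
void model_link_del(struct model_ns *ns)
{
	if (!ns->sk4)
		return;

	assert(ns->sk4->refs > 0);
	if (--ns->sk4->refs == 0) {
		free(ns->sk4);
		ns->sk4 = NULL;
	}
}
```

In this model, adding two links to the same namespace leaves a single
socket with two references, and the socket disappears only after both
links are deleted. That is the behaviour rxe_socket_with_netns.sh checks
with ss for UDP port 4791.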

V3 -> V4: Squash all changes to rxe_ns.c/h into one commit.
V2 -> V3: Fix build warnings.
V1 -> V2: Address review comments from David Ahern.


Zhu Yanjun (4):
  RDMA/nldev: Add dellink function pointer
  RDMA/rxe: Add net namespace support for IPv4/IPv6 sockets
  RDMA/rxe: Support RDMA link creation and destruction per net namespace
  RDMA/rxe: Add testcase for net namespace rxe

 MAINTAINERS                                   |   1 +
 drivers/infiniband/core/nldev.c               |   6 +
 drivers/infiniband/sw/rxe/Makefile            |   3 +-
 drivers/infiniband/sw/rxe/rxe.c               |  38 ++++-
 drivers/infiniband/sw/rxe/rxe_net.c           | 145 +++++++++++++-----
 drivers/infiniband/sw/rxe/rxe_net.h           |   9 +-
 drivers/infiniband/sw/rxe/rxe_ns.c            | 136 ++++++++++++++++
 drivers/infiniband/sw/rxe/rxe_ns.h            |  17 ++
 include/rdma/rdma_netlink.h                   |   2 +
 tools/testing/selftests/Makefile              |   1 +
 tools/testing/selftests/rdma/Makefile         |   7 +
 tools/testing/selftests/rdma/config           |   3 +
 tools/testing/selftests/rdma/rxe_ipv6.sh      |  47 ++++++
 .../selftests/rdma/rxe_rping_between_netns.sh |  57 +++++++
 .../selftests/rdma/rxe_socket_with_netns.sh   |  64 ++++++++
 .../rdma/rxe_test_NETDEV_UNREGISTER.sh        |  38 +++++
 16 files changed, 527 insertions(+), 47 deletions(-)
 create mode 100644 drivers/infiniband/sw/rxe/rxe_ns.c
 create mode 100644 drivers/infiniband/sw/rxe/rxe_ns.h
 create mode 100644 tools/testing/selftests/rdma/Makefile
 create mode 100644 tools/testing/selftests/rdma/config
 create mode 100755 tools/testing/selftests/rdma/rxe_ipv6.sh
 create mode 100755 tools/testing/selftests/rdma/rxe_rping_between_netns.sh
 create mode 100755 tools/testing/selftests/rdma/rxe_socket_with_netns.sh
 create mode 100755 tools/testing/selftests/rdma/rxe_test_NETDEV_UNREGISTER.sh

-- 
2.52.0



* [PATCH v4 1/4] RDMA/nldev: Add dellink function pointer
  2026-03-08 23:35 [PATCH v4 0/4] RDMA/rxe: Add the support that rxe can work in net namespace Zhu Yanjun
@ 2026-03-08 23:35 ` Zhu Yanjun
  2026-03-08 23:35 ` [PATCH v4 2/4] RDMA/rxe: Add net namespace support for IPv4/IPv6 sockets Zhu Yanjun
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Zhu Yanjun @ 2026-03-08 23:35 UTC
  To: jgg, leon, zyjzyj2000, yanjun.zhu, dsahern, linux-rdma,
	linux-kselftest

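rdma_link_ops already provides a newlink function pointer, and in this
series rxe creates the socket listening on UDP port 4791 from its newlink
handler. Add a matching dellink function pointer so that the driver can
release that socket when the link is deleted.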
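
The dispatch order can be modelled in userspace as follows. This is a
hedged sketch, not the nldev code: model_device, model_link_ops,
model_release_socket and model_dellink are hypothetical names, and the
extra NULL check on the callback itself is a choice made in this sketch
rather than something taken from the hunk below.

```c
#include <assert.h>
#include <stddef.h>

struct model_device;

/* Hypothetical stand-in for the link ops table. */
struct model_link_ops {
	const char *type;
	int (*dellink)(struct model_device *dev);	/* optional */
};

/* Hypothetical stand-in for a registered device. */
struct model_device {
	const struct model_link_ops *link_ops;	/* may be NULL */
	int registered;
	int socket_open;
};

/* Example driver callback: release the driver-owned socket. */
int model_release_socket(struct model_device *dev)
{
	dev->socket_open = 0;
	return 0;
}

/* Run the driver's dellink callback, if any, before unregistering. */
int model_dellink(struct model_device *dev)
{
	assert(dev != NULL);

	if (dev->link_ops && dev->link_ops->dellink) {
		int err = dev->link_ops->dellink(dev);

		if (err)
			return err;	/* device stays registered */
	}

	dev->registered = 0;
	return 0;
}
```

A device without link ops is simply unregistered, while a device whose
driver supplies dellink gets the chance to release its socket first.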

Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
---
 drivers/infiniband/core/nldev.c | 6 ++++++
 include/rdma/rdma_netlink.h     | 2 ++
 2 files changed, 8 insertions(+)

diff --git a/drivers/infiniband/core/nldev.c b/drivers/infiniband/core/nldev.c
index 2220a2dfab24..48684930660a 100644
--- a/drivers/infiniband/core/nldev.c
+++ b/drivers/infiniband/core/nldev.c
@@ -1824,6 +1824,12 @@ static int nldev_dellink(struct sk_buff *skb, struct nlmsghdr *nlh,
 		return -EINVAL;
 	}
 
+	if (device->link_ops) {
+		err = device->link_ops->dellink(device);
+		if (err)
+			return err;
+	}
+
 	ib_unregister_device_and_put(device);
 	return 0;
 }
diff --git a/include/rdma/rdma_netlink.h b/include/rdma/rdma_netlink.h
index 326deaf56d5d..2fd1358ea57d 100644
--- a/include/rdma/rdma_netlink.h
+++ b/include/rdma/rdma_netlink.h
@@ -5,6 +5,7 @@
 
 #include <linux/netlink.h>
 #include <uapi/rdma/rdma_netlink.h>
+#include <rdma/ib_verbs.h>
 
 struct ib_device;
 
@@ -126,6 +127,7 @@ struct rdma_link_ops {
 	struct list_head list;
 	const char *type;
 	int (*newlink)(const char *ibdev_name, struct net_device *ndev);
+	int (*dellink)(struct ib_device *dev);
 };
 
 void rdma_link_register(struct rdma_link_ops *ops);
-- 
2.52.0



* [PATCH v4 2/4] RDMA/rxe: Add net namespace support for IPv4/IPv6 sockets
  2026-03-08 23:35 [PATCH v4 0/4] RDMA/rxe: Add the support that rxe can work in net namespace Zhu Yanjun
  2026-03-08 23:35 ` [PATCH v4 1/4] RDMA/nldev: Add dellink function pointer Zhu Yanjun
@ 2026-03-08 23:35 ` Zhu Yanjun
  2026-03-09 18:54   ` David Ahern
  2026-03-08 23:35 ` [PATCH v4 3/4] RDMA/rxe: Support RDMA link creation and destruction per net namespace Zhu Yanjun
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 12+ messages in thread
From: Zhu Yanjun @ 2026-03-08 23:35 UTC
  To: jgg, leon, zyjzyj2000, yanjun.zhu, dsahern, linux-rdma,
	linux-kselftest

Add a net namespace implementation file to rxe to manage the
lifecycle of IPv4 and IPv6 sockets per network namespace.

This implementation stores the IPv4 and IPv6 sockets per namespace,
both for init_net and for dynamically created network namespaces.
Socket creation is deferred until the first rxe device is created in a
namespace, and any remaining socket is released when the namespace is
removed.

This change provides the infrastructure needed for rxe to operate
correctly in environments using multiple network namespaces.

Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
---
 drivers/infiniband/sw/rxe/Makefile |   3 +-
 drivers/infiniband/sw/rxe/rxe_ns.c | 136 +++++++++++++++++++++++++++++
 drivers/infiniband/sw/rxe/rxe_ns.h |  17 ++++
 3 files changed, 155 insertions(+), 1 deletion(-)
 create mode 100644 drivers/infiniband/sw/rxe/rxe_ns.c
 create mode 100644 drivers/infiniband/sw/rxe/rxe_ns.h

diff --git a/drivers/infiniband/sw/rxe/Makefile b/drivers/infiniband/sw/rxe/Makefile
index 93134f1d1d0c..3977f4f13258 100644
--- a/drivers/infiniband/sw/rxe/Makefile
+++ b/drivers/infiniband/sw/rxe/Makefile
@@ -22,6 +22,7 @@ rdma_rxe-y := \
 	rxe_mcast.o \
 	rxe_task.o \
 	rxe_net.o \
-	rxe_hw_counters.o
+	rxe_hw_counters.o \
+	rxe_ns.o
 
 rdma_rxe-$(CONFIG_INFINIBAND_ON_DEMAND_PAGING) += rxe_odp.o
diff --git a/drivers/infiniband/sw/rxe/rxe_ns.c b/drivers/infiniband/sw/rxe/rxe_ns.c
new file mode 100644
index 000000000000..6fe056c81ef3
--- /dev/null
+++ b/drivers/infiniband/sw/rxe/rxe_ns.c
@@ -0,0 +1,136 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+/*
+ * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved.
+ */
+
+#include <net/sock.h>
+#include <net/netns/generic.h>
+#include <net/net_namespace.h>
+#include <linux/module.h>
+#include <linux/skbuff.h>
+#include <linux/pid_namespace.h>
+#include <net/udp_tunnel.h>
+
+#include "rxe_ns.h"
+
+/*
+ * Per network namespace data
+ */
+struct rxe_ns_sock {
+	struct sock __rcu *rxe_sk4;
+	struct sock __rcu *rxe_sk6;
+};
+
+/*
+ * Index to store custom data for each network namespace.
+ */
+static unsigned int rxe_pernet_id;
+
+/*
+ * Called for every existing and newly added network namespace
+ */
+static int rxe_ns_init(struct net *net)
+{
+	/* defer socket create in the namespace to the first
+	 * device create.
+	 */
+
+	return 0;
+}
+
+static void rxe_ns_exit(struct net *net)
+{
+	/* called when the network namespace is removed
+	 */
+	struct rxe_ns_sock *ns_sk = net_generic(net, rxe_pernet_id);
+	struct sock *sk;
+
+	sk = rcu_dereference(ns_sk->rxe_sk4);
+	if (sk) {
+		rcu_assign_pointer(ns_sk->rxe_sk4, NULL);
+		udp_tunnel_sock_release(sk->sk_socket);
+	}
+
+#if IS_ENABLED(CONFIG_IPV6)
+	sk = rcu_dereference(ns_sk->rxe_sk6);
+	if (sk) {
+		rcu_assign_pointer(ns_sk->rxe_sk6, NULL);
+		udp_tunnel_sock_release(sk->sk_socket);
+	}
+#endif
+}
+
+/*
+ * callback to make the module network namespace aware
+ */
+static struct pernet_operations rxe_net_ops = {
+	.init = rxe_ns_init,
+	.exit = rxe_ns_exit,
+	.id = &rxe_pernet_id,
+	.size = sizeof(struct rxe_ns_sock),
+};
+
+struct sock *rxe_ns_pernet_sk4(struct net *net)
+{
+	struct rxe_ns_sock *ns_sk = net_generic(net, rxe_pernet_id);
+	struct sock *sk;
+
+	rcu_read_lock();
+	sk = rcu_dereference(ns_sk->rxe_sk4);
+	rcu_read_unlock();
+
+	return sk;
+}
+
+void rxe_ns_pernet_set_sk4(struct net *net, struct sock *sk)
+{
+	struct rxe_ns_sock *ns_sk = net_generic(net, rxe_pernet_id);
+
+	rcu_assign_pointer(ns_sk->rxe_sk4, sk);
+	synchronize_rcu();
+}
+
+#if IS_ENABLED(CONFIG_IPV6)
+struct sock *rxe_ns_pernet_sk6(struct net *net)
+{
+	struct rxe_ns_sock *ns_sk = net_generic(net, rxe_pernet_id);
+	struct sock *sk;
+
+	rcu_read_lock();
+	sk = rcu_dereference(ns_sk->rxe_sk6);
+	rcu_read_unlock();
+
+	return sk;
+}
+
+void rxe_ns_pernet_set_sk6(struct net *net, struct sock *sk)
+{
+	struct rxe_ns_sock *ns_sk = net_generic(net, rxe_pernet_id);
+
+	rcu_assign_pointer(ns_sk->rxe_sk6, sk);
+	synchronize_rcu();
+}
+
+#else /* IPV6 */
+
+struct sock *rxe_ns_pernet_sk6(struct net *net)
+{
+	return NULL;
+}
+
+void rxe_ns_pernet_set_sk6(struct net *net, struct sock *sk)
+{
+}
+
+#endif /* IPV6 */
+
+int rxe_namespace_init(void)
+{
+	return register_pernet_subsys(&rxe_net_ops);
+}
+
+void rxe_namespace_exit(void)
+{
+	unregister_pernet_subsys(&rxe_net_ops);
+}
diff --git a/drivers/infiniband/sw/rxe/rxe_ns.h b/drivers/infiniband/sw/rxe/rxe_ns.h
new file mode 100644
index 000000000000..998511742c47
--- /dev/null
+++ b/drivers/infiniband/sw/rxe/rxe_ns.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/*
+ * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved.
+ * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved.
+ */
+
+#ifndef RXE_NS_H
+#define RXE_NS_H
+
+struct sock *rxe_ns_pernet_sk4(struct net *net);
+struct sock *rxe_ns_pernet_sk6(struct net *net);
+void rxe_ns_pernet_set_sk4(struct net *net, struct sock *sk);
+void rxe_ns_pernet_set_sk6(struct net *net, struct sock *sk);
+int rxe_namespace_init(void);
+void rxe_namespace_exit(void);
+
+#endif /* RXE_NS_H */
-- 
2.52.0



* [PATCH v4 3/4] RDMA/rxe: Support RDMA link creation and destruction per net namespace
  2026-03-08 23:35 [PATCH v4 0/4] RDMA/rxe: Add the support that rxe can work in net namespace Zhu Yanjun
  2026-03-08 23:35 ` [PATCH v4 1/4] RDMA/nldev: Add dellink function pointer Zhu Yanjun
  2026-03-08 23:35 ` [PATCH v4 2/4] RDMA/rxe: Add net namespace support for IPv4/IPv6 sockets Zhu Yanjun
@ 2026-03-08 23:35 ` Zhu Yanjun
  2026-03-09 18:54   ` David Ahern
  2026-03-08 23:35 ` [PATCH v4 4/4] RDMA/rxe: Add testcase for net namespace rxe Zhu Yanjun
  2026-03-08 23:40 ` [PATCH v4 0/4] RDMA/rxe: Add the support that rxe can work in net namespace Zhu Yanjun
  4 siblings, 1 reply; 12+ messages in thread
From: Zhu Yanjun @ 2026-03-08 23:35 UTC
  To: jgg, leon, zyjzyj2000, yanjun.zhu, dsahern, linux-rdma,
	linux-kselftest

After introducing dellink handling and per-net namespace management
for IPv4 and IPv6 sockets, extend rxe to create and destroy RDMA links
within each network namespace.

With this change, RDMA links can be instantiated both in init_net and
in other network namespaces. The lifecycle of the RDMA link is now tied
to the corresponding namespace and is properly cleaned up when the
namespace or link is removed.

This ensures rxe behaves correctly in multi-namespace environments and
keeps socket and RDMA link resources consistent across namespace
creation and teardown.

Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
---
 drivers/infiniband/sw/rxe/rxe.c     |  38 +++++++-
 drivers/infiniband/sw/rxe/rxe_net.c | 145 +++++++++++++++++++++-------
 drivers/infiniband/sw/rxe/rxe_net.h |   9 +-
 3 files changed, 146 insertions(+), 46 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe.c b/drivers/infiniband/sw/rxe/rxe.c
index e891199cbdef..b0714f9abe3d 100644
--- a/drivers/infiniband/sw/rxe/rxe.c
+++ b/drivers/infiniband/sw/rxe/rxe.c
@@ -8,6 +8,8 @@
 #include <net/addrconf.h>
 #include "rxe.h"
 #include "rxe_loc.h"
+#include "rxe_net.h"
+#include "rxe_ns.h"
 
 MODULE_AUTHOR("Bob Pearson, Frank Zago, John Groves, Kamal Heib");
 MODULE_DESCRIPTION("Soft RDMA transport");
@@ -200,6 +202,8 @@ void rxe_set_mtu(struct rxe_dev *rxe, unsigned int ndev_mtu)
 	port->mtu_cap = ib_mtu_enum_to_int(mtu);
 }
 
+static struct rdma_link_ops rxe_link_ops;
+
 /* called by ifc layer to create new rxe device.
  * The caller should allocate memory for rxe by calling ib_alloc_device.
  */
@@ -208,6 +212,7 @@ int rxe_add(struct rxe_dev *rxe, unsigned int mtu, const char *ibdev_name,
 {
 	rxe_init(rxe, ndev);
 	rxe_set_mtu(rxe, mtu);
+	rxe->ib_dev.link_ops = &rxe_link_ops;
 
 	return rxe_register_device(rxe, ibdev_name, ndev);
 }
@@ -231,6 +236,10 @@ static int rxe_newlink(const char *ibdev_name, struct net_device *ndev)
 		goto err;
 	}
 
+	err = rxe_net_init(ndev);
+	if (err)
+		return err;
+
 	err = rxe_net_add(ibdev_name, ndev);
 	if (err) {
 		rxe_err("failed to add %s\n", ndev->name);
@@ -240,9 +249,17 @@ static int rxe_newlink(const char *ibdev_name, struct net_device *ndev)
 	return err;
 }
 
+static int rxe_dellink(struct ib_device *dev)
+{
+	rxe_net_del(dev);
+
+	return 0;
+}
+
 static struct rdma_link_ops rxe_link_ops = {
 	.type = "rxe",
 	.newlink = rxe_newlink,
+	.dellink = rxe_dellink,
 };
 
 static int __init rxe_module_init(void)
@@ -253,15 +270,24 @@ static int __init rxe_module_init(void)
 	if (err)
 		return err;
 
-	err = rxe_net_init();
-	if (err) {
-		rxe_destroy_wq();
-		return err;
-	}
+	err = rxe_namespace_init();
+	if (err)
+		goto err_destroy_wq;
+
+	err = rxe_register_notifier();
+	if (err)
+		goto err_namespace_exit;
 
 	rdma_link_register(&rxe_link_ops);
+
 	pr_info("loaded\n");
 	return 0;
+
+err_namespace_exit:
+	rxe_namespace_exit();
+err_destroy_wq:
+	rxe_destroy_wq();
+	return err;
 }
 
 static void __exit rxe_module_exit(void)
@@ -271,6 +297,8 @@ static void __exit rxe_module_exit(void)
 	rxe_net_exit();
 	rxe_destroy_wq();
 
+	rxe_namespace_exit();
+
 	pr_info("unloaded\n");
 }
 
diff --git a/drivers/infiniband/sw/rxe/rxe_net.c b/drivers/infiniband/sw/rxe/rxe_net.c
index 0bd0902b11f7..1e422d4f584f 100644
--- a/drivers/infiniband/sw/rxe/rxe_net.c
+++ b/drivers/infiniband/sw/rxe/rxe_net.c
@@ -17,8 +17,7 @@
 #include "rxe.h"
 #include "rxe_net.h"
 #include "rxe_loc.h"
-
-static struct rxe_recv_sockets recv_sockets;
+#include "rxe_ns.h"
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 /*
@@ -101,20 +100,20 @@ static inline void rxe_reclassify_recv_socket(struct socket *sock)
 }
 
 static struct dst_entry *rxe_find_route4(struct rxe_qp *qp,
+					 struct net *net,
 					 struct net_device *ndev,
 					 struct in_addr *saddr,
 					 struct in_addr *daddr)
 {
 	struct rtable *rt;
-	struct flowi4 fl = { { 0 } };
+	struct flowi4 fl = {};
 
-	memset(&fl, 0, sizeof(fl));
 	fl.flowi4_oif = ndev->ifindex;
 	memcpy(&fl.saddr, saddr, sizeof(*saddr));
 	memcpy(&fl.daddr, daddr, sizeof(*daddr));
 	fl.flowi4_proto = IPPROTO_UDP;
 
-	rt = ip_route_output_key(&init_net, &fl);
+	rt = ip_route_output_key(net, &fl);
 	if (IS_ERR(rt)) {
 		rxe_dbg_qp(qp, "no route to %pI4\n", &daddr->s_addr);
 		return NULL;
@@ -125,21 +124,21 @@ static struct dst_entry *rxe_find_route4(struct rxe_qp *qp,
 
 #if IS_ENABLED(CONFIG_IPV6)
 static struct dst_entry *rxe_find_route6(struct rxe_qp *qp,
+					 struct net *net,
 					 struct net_device *ndev,
 					 struct in6_addr *saddr,
 					 struct in6_addr *daddr)
 {
 	struct dst_entry *ndst;
-	struct flowi6 fl6 = { { 0 } };
+	struct flowi6 fl6 = {};
 
-	memset(&fl6, 0, sizeof(fl6));
 	fl6.flowi6_oif = ndev->ifindex;
 	memcpy(&fl6.saddr, saddr, sizeof(*saddr));
 	memcpy(&fl6.daddr, daddr, sizeof(*daddr));
 	fl6.flowi6_proto = IPPROTO_UDP;
 
-	ndst = ipv6_stub->ipv6_dst_lookup_flow(sock_net(recv_sockets.sk6->sk),
-					       recv_sockets.sk6->sk, &fl6,
+	ndst = ipv6_stub->ipv6_dst_lookup_flow(net,
+					       rxe_ns_pernet_sk6(dev_net(ndev)), &fl6,
 					       NULL);
 	if (IS_ERR(ndst)) {
 		rxe_dbg_qp(qp, "no route to %pI6\n", daddr);
@@ -160,6 +159,7 @@ static struct dst_entry *rxe_find_route6(struct rxe_qp *qp,
 #else
 
 static struct dst_entry *rxe_find_route6(struct rxe_qp *qp,
+					 struct net *net,
 					 struct net_device *ndev,
 					 struct in6_addr *saddr,
 					 struct in6_addr *daddr)
@@ -174,6 +174,7 @@ static struct dst_entry *rxe_find_route(struct net_device *ndev,
 					struct rxe_av *av)
 {
 	struct dst_entry *dst = NULL;
+	struct net *net;
 
 	if (qp_type(qp) == IB_QPT_RC)
 		dst = sk_dst_get(qp->sk->sk);
@@ -182,20 +183,22 @@ static struct dst_entry *rxe_find_route(struct net_device *ndev,
 		if (dst)
 			dst_release(dst);
 
+		net = dev_net(ndev);
+
 		if (av->network_type == RXE_NETWORK_TYPE_IPV4) {
 			struct in_addr *saddr;
 			struct in_addr *daddr;
 
 			saddr = &av->sgid_addr._sockaddr_in.sin_addr;
 			daddr = &av->dgid_addr._sockaddr_in.sin_addr;
-			dst = rxe_find_route4(qp, ndev, saddr, daddr);
+			dst = rxe_find_route4(qp, net, ndev, saddr, daddr);
 		} else if (av->network_type == RXE_NETWORK_TYPE_IPV6) {
 			struct in6_addr *saddr6;
 			struct in6_addr *daddr6;
 
 			saddr6 = &av->sgid_addr._sockaddr_in6.sin6_addr;
 			daddr6 = &av->dgid_addr._sockaddr_in6.sin6_addr;
-			dst = rxe_find_route6(qp, ndev, saddr6, daddr6);
+			dst = rxe_find_route6(qp, net, ndev, saddr6, daddr6);
 #if IS_ENABLED(CONFIG_IPV6)
 			if (dst)
 				qp->dst_cookie =
@@ -624,6 +627,46 @@ int rxe_net_add(const char *ibdev_name, struct net_device *ndev)
 	return 0;
 }
 
+#define SK_REF_FOR_TUNNEL	2
+
+static void rxe_sock_put(struct sock *sk,
+					void (*set_sk)(struct net *, struct sock *),
+					struct net *net)
+{
+	if (refcount_read(&sk->sk_refcnt) > SK_REF_FOR_TUNNEL) {
+		__sock_put(sk);
+	} else {
+		rxe_release_udp_tunnel(sk->sk_socket);
+		sk = NULL;
+		set_sk(net, sk);
+	}
+}
+
+void rxe_net_del(struct ib_device *dev)
+{
+	struct rxe_dev *rxe = container_of(dev, struct rxe_dev, ib_dev);
+	struct net_device *ndev;
+	struct sock *sk;
+	struct net *net;
+
+	ndev = rxe_ib_device_get_netdev(&rxe->ib_dev);
+	if (!ndev)
+		return;
+
+	net = dev_net(ndev);
+
+	sk = rxe_ns_pernet_sk4(net);
+	if (sk)
+		rxe_sock_put(sk, rxe_ns_pernet_set_sk4, net);
+
+	sk = rxe_ns_pernet_sk6(net);
+	if (sk)
+		rxe_sock_put(sk, rxe_ns_pernet_set_sk6, net);
+
+	dev_put(ndev);
+}
+#undef SK_REF_FOR_TUNNEL
+
 static void rxe_port_event(struct rxe_dev *rxe,
 			   enum ib_event_type event)
 {
@@ -680,6 +723,7 @@ static int rxe_notify(struct notifier_block *not_blk,
 	switch (event) {
 	case NETDEV_UNREGISTER:
 		ib_unregister_device_queued(&rxe->ib_dev);
+		rxe_net_del(&rxe->ib_dev);
 		break;
 	case NETDEV_CHANGEMTU:
 		rxe_dbg_dev(rxe, "%s changed mtu to %d\n", ndev->name, ndev->mtu);
@@ -709,66 +753,97 @@ static struct notifier_block rxe_net_notifier = {
 	.notifier_call = rxe_notify,
 };
 
-static int rxe_net_ipv4_init(void)
+static int rxe_net_ipv4_init(struct net *net)
 {
-	recv_sockets.sk4 = rxe_setup_udp_tunnel(&init_net,
-				htons(ROCE_V2_UDP_DPORT), false);
-	if (IS_ERR(recv_sockets.sk4)) {
-		recv_sockets.sk4 = NULL;
+	struct sock *sk;
+	struct socket *sock;
+
+	sk = rxe_ns_pernet_sk4(net);
+	if (sk) {
+		sock_hold(sk);
+		return 0;
+	}
+
+	sock = rxe_setup_udp_tunnel(net, htons(ROCE_V2_UDP_DPORT), false);
+	if (IS_ERR(sock)) {
 		pr_err("Failed to create IPv4 UDP tunnel\n");
 		return -1;
 	}
+	rxe_ns_pernet_set_sk4(net, sock->sk);
 
 	return 0;
 }
 
-static int rxe_net_ipv6_init(void)
+static int rxe_net_ipv6_init(struct net *net)
 {
 #if IS_ENABLED(CONFIG_IPV6)
+	struct sock *sk;
+	struct socket *sock;
 
-	recv_sockets.sk6 = rxe_setup_udp_tunnel(&init_net,
-						htons(ROCE_V2_UDP_DPORT), true);
-	if (PTR_ERR(recv_sockets.sk6) == -EAFNOSUPPORT) {
-		recv_sockets.sk6 = NULL;
+	sk = rxe_ns_pernet_sk6(net);
+	if (sk) {
+		sock_hold(sk);
+		return 0;
+	}
+
+	sock = rxe_setup_udp_tunnel(net, htons(ROCE_V2_UDP_DPORT), true);
+	if (PTR_ERR(sock) == -EAFNOSUPPORT) {
 		pr_warn("IPv6 is not supported, can not create a UDPv6 socket\n");
 		return 0;
 	}
 
-	if (IS_ERR(recv_sockets.sk6)) {
-		recv_sockets.sk6 = NULL;
+	if (IS_ERR(sock)) {
 		pr_err("Failed to create IPv6 UDP tunnel\n");
 		return -1;
 	}
+
+	rxe_ns_pernet_set_sk6(net, sock->sk);
+
 #endif
 	return 0;
 }
 
+int rxe_register_notifier(void)
+{
+	int err;
+
+	err = register_netdevice_notifier(&rxe_net_notifier);
+	if (err) {
+		pr_err("Failed to register netdev notifier\n");
+		return -1;
+	}
+
+	return 0;
+}
+
 void rxe_net_exit(void)
 {
-	rxe_release_udp_tunnel(recv_sockets.sk6);
-	rxe_release_udp_tunnel(recv_sockets.sk4);
 	unregister_netdevice_notifier(&rxe_net_notifier);
 }
 
-int rxe_net_init(void)
+int rxe_net_init(struct net_device *ndev)
 {
+	struct net *net;
 	int err;
 
-	recv_sockets.sk6 = NULL;
+	net = dev_net(ndev);
 
-	err = rxe_net_ipv4_init();
+	err = rxe_net_ipv4_init(net);
 	if (err)
 		return err;
-	err = rxe_net_ipv6_init();
+
+	err = rxe_net_ipv6_init(net);
 	if (err)
 		goto err_out;
-	err = register_netdevice_notifier(&rxe_net_notifier);
-	if (err) {
-		pr_err("Failed to register netdev notifier\n");
-		goto err_out;
-	}
+
 	return 0;
+
 err_out:
-	rxe_net_exit();
+	/* If ipv6 error, release ipv4 resource */
+	struct sock *sk = rxe_ns_pernet_sk4(net);
+
+	if (sk)
+		rxe_sock_put(sk, rxe_ns_pernet_set_sk4, net);
+
 	return err;
 }
diff --git a/drivers/infiniband/sw/rxe/rxe_net.h b/drivers/infiniband/sw/rxe/rxe_net.h
index 45d80d00f86b..56249677d692 100644
--- a/drivers/infiniband/sw/rxe/rxe_net.h
+++ b/drivers/infiniband/sw/rxe/rxe_net.h
@@ -11,14 +11,11 @@
 #include <net/if_inet6.h>
 #include <linux/module.h>
 
-struct rxe_recv_sockets {
-	struct socket *sk4;
-	struct socket *sk6;
-};
-
 int rxe_net_add(const char *ibdev_name, struct net_device *ndev);
+void rxe_net_del(struct ib_device *dev);
 
-int rxe_net_init(void);
+int rxe_register_notifier(void);
+int rxe_net_init(struct net_device *ndev);
 void rxe_net_exit(void);
 
 #endif /* RXE_NET_H */
-- 
2.52.0



* [PATCH v4 4/4] RDMA/rxe: Add testcase for net namespace rxe
  2026-03-08 23:35 [PATCH v4 0/4] RDMA/rxe: Add the support that rxe can work in net namespace Zhu Yanjun
                   ` (2 preceding siblings ...)
  2026-03-08 23:35 ` [PATCH v4 3/4] RDMA/rxe: Support RDMA link creation and destruction per net namespace Zhu Yanjun
@ 2026-03-08 23:35 ` Zhu Yanjun
  2026-03-08 23:40 ` [PATCH v4 0/4] RDMA/rxe: Add the support that rxe can work in net namespace Zhu Yanjun
  4 siblings, 0 replies; 12+ messages in thread
From: Zhu Yanjun @ 2026-03-08 23:35 UTC
  To: jgg, leon, zyjzyj2000, yanjun.zhu, dsahern, linux-rdma,
	linux-kselftest

Add four test cases for rxe with network namespace support.

Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
---
 MAINTAINERS                                   |  1 +
 tools/testing/selftests/Makefile              |  1 +
 tools/testing/selftests/rdma/Makefile         |  7 ++
 tools/testing/selftests/rdma/config           |  3 +
 tools/testing/selftests/rdma/rxe_ipv6.sh      | 47 ++++++++++++++
 .../selftests/rdma/rxe_rping_between_netns.sh | 57 +++++++++++++++++
 .../selftests/rdma/rxe_socket_with_netns.sh   | 64 +++++++++++++++++++
 .../rdma/rxe_test_NETDEV_UNREGISTER.sh        | 38 +++++++++++
 8 files changed, 218 insertions(+)
 create mode 100644 tools/testing/selftests/rdma/Makefile
 create mode 100644 tools/testing/selftests/rdma/config
 create mode 100755 tools/testing/selftests/rdma/rxe_ipv6.sh
 create mode 100755 tools/testing/selftests/rdma/rxe_rping_between_netns.sh
 create mode 100755 tools/testing/selftests/rdma/rxe_socket_with_netns.sh
 create mode 100755 tools/testing/selftests/rdma/rxe_test_NETDEV_UNREGISTER.sh

diff --git a/MAINTAINERS b/MAINTAINERS
index 89007f9ed35e..3835857d0192 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -24493,6 +24493,7 @@ L:	linux-rdma@vger.kernel.org
 S:	Supported
 F:	drivers/infiniband/sw/rxe/
 F:	include/uapi/rdma/rdma_user_rxe.h
+F:	tools/testing/selftests/rdma/
 
 SOFTLOGIC 6x10 MPEG CODEC
 M:	Bluecherry Maintainers <maintainers@bluecherrydvr.com>
diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index 450f13ba4cca..110e07c0d99d 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -94,6 +94,7 @@ TARGETS += proc
 TARGETS += pstore
 TARGETS += ptrace
 TARGETS += openat2
+TARGETS += rdma
 TARGETS += resctrl
 TARGETS += riscv
 TARGETS += rlimits
diff --git a/tools/testing/selftests/rdma/Makefile b/tools/testing/selftests/rdma/Makefile
new file mode 100644
index 000000000000..7dd7cba7a73c
--- /dev/null
+++ b/tools/testing/selftests/rdma/Makefile
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: GPL-2.0
+TEST_PROGS := rxe_rping_between_netns.sh \
+		rxe_ipv6.sh \
+		rxe_socket_with_netns.sh \
+		rxe_test_NETDEV_UNREGISTER.sh
+
+include ../lib.mk
diff --git a/tools/testing/selftests/rdma/config b/tools/testing/selftests/rdma/config
new file mode 100644
index 000000000000..4ffb814e253b
--- /dev/null
+++ b/tools/testing/selftests/rdma/config
@@ -0,0 +1,3 @@
+CONFIG_TUN
+CONFIG_VETH
+CONFIG_RDMA_RXE
diff --git a/tools/testing/selftests/rdma/rxe_ipv6.sh b/tools/testing/selftests/rdma/rxe_ipv6.sh
new file mode 100755
index 000000000000..9337ac4fd13f
--- /dev/null
+++ b/tools/testing/selftests/rdma/rxe_ipv6.sh
@@ -0,0 +1,47 @@
+#!/bin/sh
+
+# Notes:
+#
+# 1. Before running this script, please disable the firewall, as it may
+# block UDP port 4791.
+
+# 2. This test script depends on the veth and tun drivers. Before running
+#  the script, please verify that both drivers are available by executing:
+#
+# modinfo tun
+# modinfo veth
+#
+# Make sure these commands return valid module information.
+
+# 3. ipv6 test.
+# While RXE is conventionally deployed over IPv4, it maintains
+# native support for IPv6. However, IPv6 implementations typically
+# receive less validation and performance tuning in standard use cases.
+exec > /dev/null
+# 1) create ipv6 net namespace
+ip netns add net6
+ip link add veth0 type veth peer name veth1
+ip link set veth1 netns net6
+ip netns exec net6 ip addr add 2001:db8::1/64 dev veth1
+ip netns exec net6 ip link set veth1 up
+
+# 2) Add rdma link
+ip netns exec net6 rdma link add rxe6 type rxe netdev veth1
+
+# 3) check IPv6 UDP 4791 listening port
+if ! ip netns exec net6 ss -ul6n | grep :4791; then
+	echo "Error: udp port 4791 does not exist"
+	exit 1
+fi
+
+# 4) Delete rxe link
+ip netns exec net6 rdma link del rxe6
+if ip netns exec net6 ss -ul6n | grep :4791; then  # result should be null
+	echo "Error: udp port 4791 exists"
+	exit 1
+fi
+
+# 5) delete net6
+ip netns del net6
+
+modprobe -v -r rdma_rxe
diff --git a/tools/testing/selftests/rdma/rxe_rping_between_netns.sh b/tools/testing/selftests/rdma/rxe_rping_between_netns.sh
new file mode 100755
index 000000000000..80b4249dba55
--- /dev/null
+++ b/tools/testing/selftests/rdma/rxe_rping_between_netns.sh
@@ -0,0 +1,57 @@
+#!/bin/sh
+
+# Notes:
+#
+# 1. Before running this script, please disable the firewall, as it may
+# block UDP port 4791.
+
+# 2. This test script depends on the veth driver. Before running
+#  the script, please verify that the driver is available by executing:
+#
+# modinfo veth
+#
+# Make sure this command returns valid module information.
+
+# 1. Check whether rping works
+exec > /dev/null
+ip netns add test1
+ip netns ls
+ip link add veth-a type veth peer name veth-b
+ip l
+ip link set veth-a netns test1
+ip l
+ip netns exec test1 ip l set veth-a up
+ip netns exec test1 ip addr add 1.1.1.1/24 dev veth-a
+ip netns exec test1 ip l
+ip netns exec test1 ip -4 a
+ip netns exec test1 rdma link add rxe0 type rxe netdev veth-a
+
+# Check whether the socket exists
+ip netns exec test1 ss -lun | grep :4791
+
+ip netns exec test1 rdma link
+ip link set veth-b up
+ip addr add 1.1.1.2/24 dev veth-b
+ping -c 3 1.1.1.1 || exit 1
+ip netns exec test1 rping -s -a 1.1.1.1&
+rdma link add rxe1 type rxe netdev veth-b
+rdma link
+
+# Check whether the socket exists
+ss -lun | grep :4791
+
+rping -c -a 1.1.1.1 -d -v -C 3 || exit 1
+ip netns ls
+rdma link del rxe1
+
+# Check whether the socket exists
+ss -lun | grep :4791
+
+ip netns exec test1 ss -lun | grep :4791
+ip netns exec test1 rdma link del rxe0
+ip netns exec test1 ss -lun | grep :4791
+ip netns del test1
+ip netns ls
+
+modprobe -v -r veth
+modprobe -v -r rdma_rxe
diff --git a/tools/testing/selftests/rdma/rxe_socket_with_netns.sh b/tools/testing/selftests/rdma/rxe_socket_with_netns.sh
new file mode 100755
index 000000000000..676aec63babd
--- /dev/null
+++ b/tools/testing/selftests/rdma/rxe_socket_with_netns.sh
@@ -0,0 +1,64 @@
+#!/bin/sh
+
+# Notes:
+#
+# 1. Before running this script, please disable the firewall, as it may
+# block UDP port 4791.
+
+# 2. This test script depends on the tun driver. Before running
+#  the script, please verify that the driver is available by executing:
+#
+# modinfo tun
+#
+# Make sure this command returns valid module information.
+
+# Check whether the socket exists
+exec > /dev/null
+ip tuntap add mode tun tun0
+ip -4 a
+ip addr add 1.1.1.1/24 dev tun0
+ip link set tun0 up
+ip -4 a
+rdma link add rxe0 type rxe netdev tun0
+rdma link
+ret=`ss -lun | grep :4791`
+if [ X"$ret" = X"" ]; then
+	echo "Error: udp port 4791 does not exist"
+	exit 1
+fi
+
+ip tuntap add mode tun tun1
+ip -4 a
+ip addr add 2.2.2.2/24 dev tun1
+ip link set tun1 up
+rdma link add rxe1 type rxe netdev tun1
+rdma link
+ret=`ss -lun | grep :4791`
+if [ X"$ret" = X"" ]; then
+	echo "Error: udp port 4791 does not exist"
+	exit 1
+fi
+
+rdma link del rxe1
+rdma link
+ret=`ss -lun | grep :4791`
+if [ X"$ret" = X"" ]; then
+	echo "Error: udp port 4791 does not exist"
+	exit 1
+fi
+
+rdma link del rxe0
+rdma link
+if ss -lun | grep :4791; then
+	echo "Error: udp port 4791 exists"
+	exit 1
+fi
+
+ip addr del 2.2.2.2/24 dev tun1
+ip tuntap del mode tun tun1
+
+ip addr del 1.1.1.1/24 dev tun0
+ip tuntap del mode tun tun0
+
+modprobe -v -r tun
+modprobe -v -r rdma_rxe
diff --git a/tools/testing/selftests/rdma/rxe_test_NETDEV_UNREGISTER.sh b/tools/testing/selftests/rdma/rxe_test_NETDEV_UNREGISTER.sh
new file mode 100755
index 000000000000..c30ff905b121
--- /dev/null
+++ b/tools/testing/selftests/rdma/rxe_test_NETDEV_UNREGISTER.sh
@@ -0,0 +1,38 @@
+#!/bin/sh
+
+# Notes:
+#
+# 1. Before running this script, please disable the firewall, as it may
+# block UDP port 4791.
+
+# 2. This test script depends on the veth and tun drivers. Before running
+#  the script, please verify that both drivers are available by executing:
+#
+# modinfo tun
+# modinfo veth
+#
+# Make sure these commands return valid module information.
+
+# Trigger NETDEV_UNREGISTER
+exec > /dev/null
+ip tuntap add mode tun tun0
+ip -4 a
+ip addr add 1.1.1.1/24 dev tun0
+ip link set tun0 up
+ip -4 a
+rdma link add rxe0 type rxe netdev tun0
+rdma link
+ss -lun | grep :4791
+
+ip l
+ip addr del 1.1.1.1/24 dev tun0
+ip tuntap del mode tun tun0
+
+rdma link
+if ss -lun | grep :4791; then
+	echo "error"
+	exit 1
+fi
+
+modprobe -v -r tun
+modprobe -v -r rdma_rxe
-- 
2.52.0
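
Both scripts above repeat the same `ss -lun | grep :4791` check. A minimal sketch of how that check could be factored into a helper follows. The function name `udp_port_listed` is hypothetical and not part of the patch. It reads the listing from stdin so it can be exercised without root, and it anchors the match at the end of the address so that a port such as 47910 is not mistaken for 4791.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): succeed when the
# `ss -lun` style listing on stdin has a field ending in ":<port>".
udp_port_listed() {
	awk -v port="$1" '
		{ for (i = 1; i <= NF; i++) if ($i ~ (":" port "$")) found = 1 }
		END { exit !found }
	'
}

# Intended use inside a test script (needs iproute2's ss):
#   ss -lun | udp_port_listed 4791 || { echo "Error: udp port 4791 does not exist" >&2; exit 1; }
```

Sending the error text to stderr keeps it visible even when a script redirects stdout to /dev/null, as both scripts above do.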



* Re: [PATCH v4 0/4] RDMA/rxe: Add the support that rxe can work in net namespace
  2026-03-08 23:35 [PATCH v4 0/4] RDMA/rxe: Add the support that rxe can work in net namespace Zhu Yanjun
                   ` (3 preceding siblings ...)
  2026-03-08 23:35 ` [PATCH v4 4/4] RDMA/rxe: Add testcase for net namespace rxe Zhu Yanjun
@ 2026-03-08 23:40 ` Zhu Yanjun
  2026-03-09 18:55   ` David Ahern
  4 siblings, 1 reply; 12+ messages in thread
From: Zhu Yanjun @ 2026-03-08 23:40 UTC (permalink / raw)
  To: jgg, leon, zyjzyj2000, dsahern, linux-rdma, linux-kselftest,
	yanjun.zhu@linux.dev


On 2026/3/8 16:35, Zhu Yanjun wrote:
> Currently rxe does not work correctly in network namespaces.
>
> When the rdma_rxe module is loaded, a UDP socket listening on port
> 4791 is created in init_net. When users run:
>
>      ip link add ... type rxe
>
> inside another network namespace, the RXE RDMA link is created but it
> cannot function properly because the underlying UDP socket belongs to
> init_net. Other network namespaces cannot use that socket.
>
> To address this issue, this series introduces net namespace support
> for rxe and moves socket management to be per network namespace.
>
> The series first introduces per-net namespace management for the IPv4
> and IPv6 sockets used by rxe. The sockets are created when the network
> namespace becomes active and are released when the namespace is
> destroyed.
>
> Based on this infrastructure, rxe RDMA links are then created and
> destroyed within each network namespace. This ensures that both the
> UDP sockets and RDMA links are correctly scoped to the namespace in
> which they are used.
>
> With these changes, rxe RDMA links can be created and used both in
> init_net and in other network namespaces, and resources are properly
> cleaned up during namespace teardown.
>
> The series also includes a selftest to verify RXE functionality in
> network namespaces.

The selftest result is as below:

"
# make -C tools/testing/selftests/ TARGETS=rdma run_tests
make: Entering directory '/root/Development/linux/tools/testing/selftests'
make[1]: Nothing to be done for 'all'.
TAP version 13
1..4
# timeout set to 45
# selftests: rdma: rxe_rping_between_netns.sh
# server DISCONNECT EVENT...
# wait for RDMA_READ_ADV state 10
ok 1 selftests: rdma: rxe_rping_between_netns.sh
# timeout set to 45
# selftests: rdma: rxe_ipv6.sh
ok 2 selftests: rdma: rxe_ipv6.sh
# timeout set to 45
# selftests: rdma: rxe_socket_with_netns.sh
ok 3 selftests: rdma: rxe_socket_with_netns.sh
# timeout set to 45
# selftests: rdma: rxe_test_NETDEV_UNREGISTER.sh
ok 4 selftests: rdma: rxe_test_NETDEV_UNREGISTER.sh
make: Leaving directory '/root/Development/linux/tools/testing/selftests'
"

Zhu Yanjun

>
> V3 -> V4: Squash all the changes about rxe_ns.c/h into one commit.
> V2 -> V3: Fix build warnings
> V1 -> V2: Fix the problems based on David Ahern.
>
>
> Zhu Yanjun (4):
>    RDMA/nldev: Add dellink function pointer
>    RDMA/rxe: Add net namespace support for IPv4/IPv6 sockets
>    RDMA/rxe: Support RDMA link creation and destruction per net namespace
>    RDMA/rxe: Add testcase for net namespace rxe
>
>   MAINTAINERS                                   |   1 +
>   drivers/infiniband/core/nldev.c               |   6 +
>   drivers/infiniband/sw/rxe/Makefile            |   3 +-
>   drivers/infiniband/sw/rxe/rxe.c               |  38 ++++-
>   drivers/infiniband/sw/rxe/rxe_net.c           | 145 +++++++++++++-----
>   drivers/infiniband/sw/rxe/rxe_net.h           |   9 +-
>   drivers/infiniband/sw/rxe/rxe_ns.c            | 136 ++++++++++++++++
>   drivers/infiniband/sw/rxe/rxe_ns.h            |  17 ++
>   include/rdma/rdma_netlink.h                   |   2 +
>   tools/testing/selftests/Makefile              |   1 +
>   tools/testing/selftests/rdma/Makefile         |   7 +
>   tools/testing/selftests/rdma/config           |   3 +
>   tools/testing/selftests/rdma/rxe_ipv6.sh      |  47 ++++++
>   .../selftests/rdma/rxe_rping_between_netns.sh |  57 +++++++
>   .../selftests/rdma/rxe_socket_with_netns.sh   |  64 ++++++++
>   .../rdma/rxe_test_NETDEV_UNREGISTER.sh        |  38 +++++
>   16 files changed, 527 insertions(+), 47 deletions(-)
>   create mode 100644 drivers/infiniband/sw/rxe/rxe_ns.c
>   create mode 100644 drivers/infiniband/sw/rxe/rxe_ns.h
>   create mode 100644 tools/testing/selftests/rdma/Makefile
>   create mode 100644 tools/testing/selftests/rdma/config
>   create mode 100755 tools/testing/selftests/rdma/rxe_ipv6.sh
>   create mode 100755 tools/testing/selftests/rdma/rxe_rping_between_netns.sh
>   create mode 100755 tools/testing/selftests/rdma/rxe_socket_with_netns.sh
>   create mode 100755 tools/testing/selftests/rdma/rxe_test_NETDEV_UNREGISTER.sh
>
-- 
Best Regards,
Yanjun.Zhu



* Re: [PATCH v4 2/4] RDMA/rxe: Add net namespace support for IPv4/IPv6 sockets
  2026-03-08 23:35 ` [PATCH v4 2/4] RDMA/rxe: Add net namespace support for IPv4/IPv6 sockets Zhu Yanjun
@ 2026-03-09 18:54   ` David Ahern
  2026-03-10  0:59     ` yanjun.zhu
  0 siblings, 1 reply; 12+ messages in thread
From: David Ahern @ 2026-03-09 18:54 UTC (permalink / raw)
  To: Zhu Yanjun, jgg, leon, zyjzyj2000, linux-rdma, linux-kselftest

On 3/8/26 5:35 PM, Zhu Yanjun wrote:
> diff --git a/drivers/infiniband/sw/rxe/rxe_ns.c b/drivers/infiniband/sw/rxe/rxe_ns.c
> new file mode 100644
> index 000000000000..6fe056c81ef3
> --- /dev/null
> +++ b/drivers/infiniband/sw/rxe/rxe_ns.c
> @@ -0,0 +1,136 @@
> +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
> +/*
> + * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved.
> + * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved.

neither of those copyrights are relevant here.



> +static void rxe_ns_exit(struct net *net)
> +{
> +	/* called when the network namespace is removed
> +	 */
> +	struct rxe_ns_sock *ns_sk = net_generic(net, rxe_pernet_id);
> +	struct sock *sk;
> +
> +	sk = rcu_dereference(ns_sk->rxe_sk4);

[  323.527911] =============================
[  323.527915] WARNING: suspicious RCU usage
[  323.527918] 7.0.0-rc1-debug+ #3 Tainted: G        W
[  323.527922] -----------------------------
[  323.527925] drivers/infiniband/sw/rxe/rxe_ns.c:49 suspicious
rcu_dereference_check() usage!
[  323.527929]

> +	if (sk) {
> +		rcu_assign_pointer(ns_sk->rxe_sk4, NULL);
> +		udp_tunnel_sock_release(sk->sk_socket);
> +	}
> +
> +#if IS_ENABLED(CONFIG_IPV6)
> +	sk = rcu_dereference(ns_sk->rxe_sk6);

[  323.528243] =============================
[  323.528245] WARNING: suspicious RCU usage
[  323.528248] 7.0.0-rc1-debug+ #3 Tainted: G        W
[  323.528251] -----------------------------
[  323.528253] drivers/infiniband/sw/rxe/rxe_ns.c:56 suspicious
rcu_dereference_check() usage!


you should always run tests with a debug kernel that has kmemleak and
lock debugging enabled.
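
One way to confirm that a test kernel would report such problems is to check its config before running the selftests. This is a minimal sketch, assuming the config is available as a plain text file. The function name `missing_debug_opts` is hypothetical, and the option list is an assumption about which options matter here: CONFIG_PROVE_LOCKING and CONFIG_PROVE_RCU for the "suspicious RCU usage" splats, and CONFIG_DEBUG_KMEMLEAK for leak detection.

```shell
#!/bin/sh
# Hypothetical helper: print each debug option that is not set to "y"
# in the kernel config file given as $1. Prints nothing if all are set.
missing_debug_opts() {
	_cfg=$1
	for opt in CONFIG_PROVE_LOCKING CONFIG_PROVE_RCU CONFIG_DEBUG_KMEMLEAK; do
		grep -q "^${opt}=y\$" "$_cfg" || echo "$opt"
	done
}

# Example (the config path is distro dependent and only an assumption):
#   missing_debug_opts "/boot/config-$(uname -r)"
```

An empty result means all three options are built in; any printed name is an option to enable before rerunning the tests.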


> +#else /* IPV6 */
> +
> +struct sock *rxe_ns_pernet_sk6(struct net *net)
> +{
> +	return NULL;
> +}
> +
> +void rxe_ns_pernet_set_sk6(struct net *net, struct sock *sk)
> +{
> +}

This branch is typically done as an inline in the header file.




* Re: [PATCH v4 3/4] RDMA/rxe: Support RDMA link creation and destruction per net namespace
  2026-03-08 23:35 ` [PATCH v4 3/4] RDMA/rxe: Support RDMA link creation and destruction per net namespace Zhu Yanjun
@ 2026-03-09 18:54   ` David Ahern
  2026-03-10  0:57     ` yanjun.zhu
  0 siblings, 1 reply; 12+ messages in thread
From: David Ahern @ 2026-03-09 18:54 UTC (permalink / raw)
  To: Zhu Yanjun, jgg, leon, zyjzyj2000, linux-rdma, linux-kselftest

On 3/8/26 5:35 PM, Zhu Yanjun wrote:
> @@ -101,20 +100,20 @@ static inline void rxe_reclassify_recv_socket(struct socket *sock)
>  }
>  
>  static struct dst_entry *rxe_find_route4(struct rxe_qp *qp,
> +					 struct net *net,
>  					 struct net_device *ndev,
>  					 struct in_addr *saddr,
>  					 struct in_addr *daddr)
>  {
>  	struct rtable *rt;
> -	struct flowi4 fl = { { 0 } };
> +	struct flowi4 fl = {};
>  
> -	memset(&fl, 0, sizeof(fl));

changing init of fl here and fl6 in the next function are not relevant
to this patch. It should be a different one after this set.

>  	fl.flowi4_oif = ndev->ifindex;
>  	memcpy(&fl.saddr, saddr, sizeof(*saddr));
>  	memcpy(&fl.daddr, daddr, sizeof(*daddr));
>  	fl.flowi4_proto = IPPROTO_UDP;
>  
> -	rt = ip_route_output_key(&init_net, &fl);
> +	rt = ip_route_output_key(net, &fl);
>  	if (IS_ERR(rt)) {
>  		rxe_dbg_qp(qp, "no route to %pI4\n", &daddr->s_addr);
>  		return NULL;
> @@ -125,21 +124,21 @@ static struct dst_entry *rxe_find_route4(struct rxe_qp *qp,
>  
>  #if IS_ENABLED(CONFIG_IPV6)
>  static struct dst_entry *rxe_find_route6(struct rxe_qp *qp,
> +					 struct net *net,
>  					 struct net_device *ndev,
>  					 struct in6_addr *saddr,
>  					 struct in6_addr *daddr)
>  {
>  	struct dst_entry *ndst;
> -	struct flowi6 fl6 = { { 0 } };
> +	struct flowi6 fl6 = {};
>  
> -	memset(&fl6, 0, sizeof(fl6));
>  	fl6.flowi6_oif = ndev->ifindex;
>  	memcpy(&fl6.saddr, saddr, sizeof(*saddr));
>  	memcpy(&fl6.daddr, daddr, sizeof(*daddr));
>  	fl6.flowi6_proto = IPPROTO_UDP;
>  
> -	ndst = ipv6_stub->ipv6_dst_lookup_flow(sock_net(recv_sockets.sk6->sk),
> -					       recv_sockets.sk6->sk, &fl6,
> +	ndst = ipv6_stub->ipv6_dst_lookup_flow(net,
> +					       rxe_ns_pernet_sk6(dev_net(ndev)), &fl6,

why dev_net(ndev) here?


>  					       NULL);
>  	if (IS_ERR(ndst)) {
>  		rxe_dbg_qp(qp, "no route to %pI6\n", daddr);



* Re: [PATCH v4 0/4] RDMA/rxe: Add the support that rxe can work in net namespace
  2026-03-08 23:40 ` [PATCH v4 0/4] RDMA/rxe: Add the support that rxe can work in net namespace Zhu Yanjun
@ 2026-03-09 18:55   ` David Ahern
  2026-03-10  0:55     ` yanjun.zhu
  0 siblings, 1 reply; 12+ messages in thread
From: David Ahern @ 2026-03-09 18:55 UTC (permalink / raw)
  To: Zhu Yanjun, jgg, leon, zyjzyj2000, linux-rdma, linux-kselftest

On 3/8/26 5:40 PM, Zhu Yanjun wrote:
> # make -C tools/testing/selftests/ TARGETS=rdma run_tests
> make: Entering directory '/root/Development/linux/tools/testing/selftests'
> make[1]: Nothing to be done for 'all'.
> TAP version 13
> 1..4
> # timeout set to 45
> # selftests: rdma: rxe_rping_between_netns.sh
> # server DISCONNECT EVENT...
> # wait for RDMA_READ_ADV state 10
> ok 1 selftests: rdma: rxe_rping_between_netns.sh
> # timeout set to 45
> # selftests: rdma: rxe_ipv6.sh
> ok 2 selftests: rdma: rxe_ipv6.sh
> # timeout set to 45
> # selftests: rdma: rxe_socket_with_netns.sh
> ok 3 selftests: rdma: rxe_socket_with_netns.sh
> # timeout set to 45
> # selftests: rdma: rxe_test_NETDEV_UNREGISTER.sh
> ok 4 selftests: rdma: rxe_test_NETDEV_UNREGISTER.sh
> make: Leaving directory '/root/Development/linux/tools/testing/selftests'
> "
> 

Just put that in the cover letter; no need for followup emails to the set.


* Re: [PATCH v4 0/4] RDMA/rxe: Add the support that rxe can work in net namespace
  2026-03-09 18:55   ` David Ahern
@ 2026-03-10  0:55     ` yanjun.zhu
  0 siblings, 0 replies; 12+ messages in thread
From: yanjun.zhu @ 2026-03-10  0:55 UTC (permalink / raw)
  To: David Ahern, jgg, leon, zyjzyj2000, linux-rdma, linux-kselftest,
	Zhu Yanjun

On 3/9/26 11:55 AM, David Ahern wrote:
> On 3/8/26 5:40 PM, Zhu Yanjun wrote:
>> # make -C tools/testing/selftests/ TARGETS=rdma run_tests
>> make: Entering directory '/root/Development/linux/tools/testing/selftests'
>> make[1]: Nothing to be done for 'all'.
>> TAP version 13
>> 1..4
>> # timeout set to 45
>> # selftests: rdma: rxe_rping_between_netns.sh
>> # server DISCONNECT EVENT...
>> # wait for RDMA_READ_ADV state 10
>> ok 1 selftests: rdma: rxe_rping_between_netns.sh
>> # timeout set to 45
>> # selftests: rdma: rxe_ipv6.sh
>> ok 2 selftests: rdma: rxe_ipv6.sh
>> # timeout set to 45
>> # selftests: rdma: rxe_socket_with_netns.sh
>> ok 3 selftests: rdma: rxe_socket_with_netns.sh
>> # timeout set to 45
>> # selftests: rdma: rxe_test_NETDEV_UNREGISTER.sh
>> ok 4 selftests: rdma: rxe_test_NETDEV_UNREGISTER.sh
>> make: Leaving directory '/root/Development/linux/tools/testing/selftests'
>> "
>>
> 
> Just put that in the cover letter; no need for followup emails to the set.

Thanks, will do.

Zhu Yanjun


* Re: [PATCH v4 3/4] RDMA/rxe: Support RDMA link creation and destruction per net namespace
  2026-03-09 18:54   ` David Ahern
@ 2026-03-10  0:57     ` yanjun.zhu
  0 siblings, 0 replies; 12+ messages in thread
From: yanjun.zhu @ 2026-03-10  0:57 UTC (permalink / raw)
  To: David Ahern, jgg, leon, zyjzyj2000, linux-rdma, linux-kselftest,
	Zhu Yanjun

On 3/9/26 11:54 AM, David Ahern wrote:
> On 3/8/26 5:35 PM, Zhu Yanjun wrote:
>> @@ -101,20 +100,20 @@ static inline void rxe_reclassify_recv_socket(struct socket *sock)
>>   }
>>   
>>   static struct dst_entry *rxe_find_route4(struct rxe_qp *qp,
>> +					 struct net *net,
>>   					 struct net_device *ndev,
>>   					 struct in_addr *saddr,
>>   					 struct in_addr *daddr)
>>   {
>>   	struct rtable *rt;
>> -	struct flowi4 fl = { { 0 } };
>> +	struct flowi4 fl = {};
>>   
>> -	memset(&fl, 0, sizeof(fl));
> 
> changing init of fl here and fl6 in the next function are not relevant
> to this patch. It should be a different one after this set.

Thanks. The change is already made and it is trivial, so let us keep it
in this patch series.

> 
>>   	fl.flowi4_oif = ndev->ifindex;
>>   	memcpy(&fl.saddr, saddr, sizeof(*saddr));
>>   	memcpy(&fl.daddr, daddr, sizeof(*daddr));
>>   	fl.flowi4_proto = IPPROTO_UDP;
>>   
>> -	rt = ip_route_output_key(&init_net, &fl);
>> +	rt = ip_route_output_key(net, &fl);
>>   	if (IS_ERR(rt)) {
>>   		rxe_dbg_qp(qp, "no route to %pI4\n", &daddr->s_addr);
>>   		return NULL;
>> @@ -125,21 +124,21 @@ static struct dst_entry *rxe_find_route4(struct rxe_qp *qp,
>>   
>>   #if IS_ENABLED(CONFIG_IPV6)
>>   static struct dst_entry *rxe_find_route6(struct rxe_qp *qp,
>> +					 struct net *net,
>>   					 struct net_device *ndev,
>>   					 struct in6_addr *saddr,
>>   					 struct in6_addr *daddr)
>>   {
>>   	struct dst_entry *ndst;
>> -	struct flowi6 fl6 = { { 0 } };
>> +	struct flowi6 fl6 = {};
>>   
>> -	memset(&fl6, 0, sizeof(fl6));
>>   	fl6.flowi6_oif = ndev->ifindex;
>>   	memcpy(&fl6.saddr, saddr, sizeof(*saddr));
>>   	memcpy(&fl6.daddr, daddr, sizeof(*daddr));
>>   	fl6.flowi6_proto = IPPROTO_UDP;
>>   
>> -	ndst = ipv6_stub->ipv6_dst_lookup_flow(sock_net(recv_sockets.sk6->sk),
>> -					       recv_sockets.sk6->sk, &fl6,
>> +	ndst = ipv6_stub->ipv6_dst_lookup_flow(net,
>> +					       rxe_ns_pernet_sk6(dev_net(ndev)), &fl6,
> 
> why dev_net(ndev) here?

Got it. I will fix it in the latest commits.

Zhu Yanjun

> 
> 
>>   					       NULL);
>>   	if (IS_ERR(ndst)) {
>>   		rxe_dbg_qp(qp, "no route to %pI6\n", daddr);
> 



* Re: [PATCH v4 2/4] RDMA/rxe: Add net namespace support for IPv4/IPv6 sockets
  2026-03-09 18:54   ` David Ahern
@ 2026-03-10  0:59     ` yanjun.zhu
  0 siblings, 0 replies; 12+ messages in thread
From: yanjun.zhu @ 2026-03-10  0:59 UTC (permalink / raw)
  To: David Ahern, jgg, leon, zyjzyj2000, linux-rdma, linux-kselftest,
	Zhu Yanjun

On 3/9/26 11:54 AM, David Ahern wrote:
> On 3/8/26 5:35 PM, Zhu Yanjun wrote:
>> diff --git a/drivers/infiniband/sw/rxe/rxe_ns.c b/drivers/infiniband/sw/rxe/rxe_ns.c
>> new file mode 100644
>> index 000000000000..6fe056c81ef3
>> --- /dev/null
>> +++ b/drivers/infiniband/sw/rxe/rxe_ns.c
>> @@ -0,0 +1,136 @@
>> +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
>> +/*
>> + * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved.
>> + * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved.
> 
> neither of those copyrights are relevant here.
> 
> 
> 
>> +static void rxe_ns_exit(struct net *net)
>> +{
>> +	/* called when the network namespace is removed
>> +	 */
>> +	struct rxe_ns_sock *ns_sk = net_generic(net, rxe_pernet_id);
>> +	struct sock *sk;
>> +
>> +	sk = rcu_dereference(ns_sk->rxe_sk4);
> 
> [  323.527911] =============================
> [  323.527915] WARNING: suspicious RCU usage
> [  323.527918] 7.0.0-rc1-debug+ #3 Tainted: G        W
> [  323.527922] -----------------------------
> [  323.527925] drivers/infiniband/sw/rxe/rxe_ns.c:49 suspicious
> rcu_dereference_check() usage!
> [  323.527929]
> 
>> +	if (sk) {
>> +		rcu_assign_pointer(ns_sk->rxe_sk4, NULL);
>> +		udp_tunnel_sock_release(sk->sk_socket);
>> +	}
>> +
>> +#if IS_ENABLED(CONFIG_IPV6)
>> +	sk = rcu_dereference(ns_sk->rxe_sk6);
> 
> [  323.528243] =============================
> [  323.528245] WARNING: suspicious RCU usage
> [  323.528248] 7.0.0-rc1-debug+ #3 Tainted: G        W
> [  323.528251] -----------------------------
> [  323.528253] drivers/infiniband/sw/rxe/rxe_ns.c:56 suspicious
> rcu_dereference_check() usage!
> 
> 
> you should always run tests with a debug kernel that has kmemleak and
> lock debugging enabled.
> 
> 
>> +#else /* IPV6 */
>> +
>> +struct sock *rxe_ns_pernet_sk6(struct net *net)
>> +{
>> +	return NULL;
>> +}
>> +
>> +void rxe_ns_pernet_set_sk6(struct net *net, struct sock *sk)
>> +{
>> +}
> 
> This branch is typically done as an inline in the header file.
> 
> 

Got it. All the above problems are fixed in the latest commit.

Zhu Yanjun


