Netdev List
* [PATCH net-next v3 00/11] selftests: rds: Add ROCE support to rds selftests
@ 2026-05-18  1:24 Allison Henderson
  2026-05-18  1:24 ` [PATCH net-next v3 01/11] net/rds: Don't sleep inside rds_ib_conn_path_shutdown Allison Henderson
                   ` (10 more replies)
  0 siblings, 11 replies; 12+ messages in thread
From: Allison Henderson @ 2026-05-18  1:24 UTC (permalink / raw)
  To: netdev, pabeni, edumazet, kuba, horms, linux-rdma, achender,
	linux-kselftest, shuah

Currently the rds selftests only test the tcp transport.  This means
most of rds_rdma.ko has no test coverage.  This series refactors the
rds selftests to add an rdma option when running tests.  When used,
the test creates a pair of ROCE interfaces to run the payloads through.

Most of this set is refactoring the existing test.py module.  Since most
of this code is one long procedure, it is difficult to modularize it
without creating a lot of pylint complaints about lengthy functions
with too many variables or branches.  

Patch 1 fixes an RDS-IB shutdown hang exposed by the new ROCE selftests
in patches 10 and 11.  Patches 2 through 7 break test.py down into
helper functions, and patches 8 and 9 add error handling to
netns_socket and register the network teardown via atexit.  With the
send/recv packet logic modularized, the last two patches introduce the
new ROCE equivalent network configurations and add the new command line
flags to build and run the test with rdma support.

Questions, comments and feedback appreciated!

Thanks everyone!
Allison

Change Log
v2:
   [PATCH net-next v1 1/9] selftests: rds: Capitalize ret global in test.py
      Dropped

   [PATCH net-next v2 4/9] selftests: rds: Add helper function recv_burst() in test.py
      Pylint nits

   [PATCH net-next v2 6/9] selftests: rds: Add helper function snd_rcv_packets() in test.py
      Pylint nits

   [PATCH net-next v2 7/9] selftests: rds: Register network teardown via atexit
      NEW
      Registers network config cleanup function teardown_tcp() with atexit

   [PATCH net-next v2 8/9] selftests: rds: Add ROCE support to test.py
      Pylint nits
      Added rdma network teardown cleanup on atexit
      Fixed test result reporting with dynamic per-transport reporting

v3:
   [PATCH net-next v3 1/11] net/rds: Don't sleep inside rds_ib_conn_path_shutdown
      NEW

   [PATCH net-next v3 08/11] selftests: rds: Handle errors in netns_socket
      NEW

   [PATCH net-next v3 10/11] selftests: rds: Add ROCE support to test.py
      Sashiko complaint: expand snd_rcv_packets docstring
      Sashiko complaint: properly close sockets when test completes
      Sashiko complaint: collect pcaps per rdma iface
      Sashiko complaint: only teardown rdma net configs when -T rdma is used
      Sashiko complaint: cancel timeout before reporting test results

   [PATCH net-next v3 11/11] selftests: rds: Add ROCE support to run.sh
      Sashiko complaint: Update test.py usage and README with -T usage

Allison Henderson (11):
  net/rds: Don't sleep inside rds_ib_conn_path_shutdown
  selftests: rds: Add helper function setup_tcp() in test.py
  selftests: rds: Add helper function check_info() in test.py
  selftests: rds: Add helper function send_burst() in test.py
  selftests: rds: Add helper function recv_burst() in test.py
  selftests: rds: Add helper function verify_hashes() in test.py
  selftests: rds: Add helper function snd_rcv_packets() in test.py
  selftests: rds: Handle errors in netns_socket
  selftests: rds: Register network teardown via atexit
  selftests: rds: Add ROCE support to test.py
  selftests: rds: Add ROCE support to run.sh

 net/rds/ib_cm.c                            |  25 +-
 tools/testing/selftests/net/rds/README.txt |  29 +-
 tools/testing/selftests/net/rds/config.sh  |  15 +-
 tools/testing/selftests/net/rds/run.sh     |  53 +-
 tools/testing/selftests/net/rds/test.py    | 631 ++++++++++++++-------
 5 files changed, 529 insertions(+), 224 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH net-next v3 01/11] net/rds: Don't sleep inside rds_ib_conn_path_shutdown
  2026-05-18  1:24 [PATCH net-next v3 00/11] selftests: rds: Add ROCE support to rds selftests Allison Henderson
@ 2026-05-18  1:24 ` Allison Henderson
  2026-05-18  1:24 ` [PATCH net-next v3 02/11] selftests: rds: Add helper function setup_tcp() in test.py Allison Henderson
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Allison Henderson @ 2026-05-18  1:24 UTC (permalink / raw)
  To: netdev, pabeni, edumazet, kuba, horms, linux-rdma, achender,
	linux-kselftest, shuah

The new rds rdma selftests exposed a hang when tearing down the ib
network configs.  The hang is caused by the shutdown worker thread
sleeping in the wait_event() call, which blocks other work items in
the queue.  Fix this by changing wait_event() to wait_event_timeout()
and looping until the wait check succeeds, rescheduling the send and
recv tasklets after each timeout.

Signed-off-by: Allison Henderson <achender@kernel.org>
---
 net/rds/ib_cm.c | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/net/rds/ib_cm.c b/net/rds/ib_cm.c
index 0c64c504f79db..6b40345ba44d1 100644
--- a/net/rds/ib_cm.c
+++ b/net/rds/ib_cm.c
@@ -1038,6 +1038,19 @@ int rds_ib_conn_path_connect(struct rds_conn_path *cp)
 	return ret;
 }
 
+static unsigned long rds_ib_conn_path_shutdown_check_wait(struct rds_conn_path *cp)
+{
+	struct rds_connection *conn = cp->cp_conn;
+	struct rds_ib_connection *ic = conn->c_transport_data;
+
+	return (!ic->i_cm_id ||
+		(rds_ib_ring_empty(&ic->i_recv_ring) &&
+		 (atomic_read(&ic->i_signaled_sends) == 0) &&
+		 (atomic_read(&ic->i_fastreg_inuse_count) == 0) &&
+		 (atomic_read(&ic->i_fastreg_wrs) == RDS_IB_DEFAULT_FR_WR))) ? 0
+		: msecs_to_jiffies(1000);
+}
+
 /*
  * This is so careful about only cleaning up resources that were built up
  * so that it can be called at any point during startup.  In fact it
@@ -1078,11 +1091,13 @@ void rds_ib_conn_path_shutdown(struct rds_conn_path *cp)
 		 * sends to complete we're ensured that there will be no
 		 * more tx processing.
 		 */
-		wait_event(rds_ib_ring_empty_wait,
-			   rds_ib_ring_empty(&ic->i_recv_ring) &&
-			   (atomic_read(&ic->i_signaled_sends) == 0) &&
-			   (atomic_read(&ic->i_fastreg_inuse_count) == 0) &&
-			   (atomic_read(&ic->i_fastreg_wrs) == RDS_IB_DEFAULT_FR_WR));
+		while (!wait_event_timeout(rds_ib_ring_empty_wait,
+					   rds_ib_conn_path_shutdown_check_wait(cp) == 0,
+					   msecs_to_jiffies(1000))) {
+			tasklet_schedule(&ic->i_send_tasklet);
+			tasklet_schedule(&ic->i_recv_tasklet);
+		}
+
 		tasklet_kill(&ic->i_send_tasklet);
 		tasklet_kill(&ic->i_recv_tasklet);
 
-- 
2.25.1
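
For readers less familiar with the pattern in this patch, the following is a rough userspace analogy in Python, not kernel code.  It replaces an unbounded wait with a bounded wait in a loop and "kicks" the workers after each timeout.  The names wait_until_drained(), kick() and the toy state dict are hypothetical and exist only for illustration; threading.Condition stands in for the kernel wait queue.

```python
import threading


def wait_until_drained(cond, is_drained, kick, timeout=1.0):
    """Wait for is_drained() in bounded slices, calling kick() after each timeout.

    Returns the number of timeouts that elapsed before the predicate held.
    """
    timeouts = 0
    with cond:
        # wait_for() returns the predicate's last value, so False means
        # the slice expired without the condition becoming true.
        while not cond.wait_for(is_drained, timeout=timeout):
            timeouts += 1
            kick()  # stand-in for rescheduling the send/recv tasklets
    return timeouts


# Toy state: two outstanding completions that only progress when kicked.
state = {"pending": 2}


def kick():
    state["pending"] -= 1


cond = threading.Condition()
rounds = wait_until_drained(cond, lambda: state["pending"] == 0, kick,
                            timeout=0.01)
print(rounds, state["pending"])
```

In this toy setup the predicate only becomes true because of the kicks, which is the situation the timeout loop is meant to cover.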



* [PATCH net-next v3 02/11] selftests: rds: Add helper function setup_tcp() in test.py
  2026-05-18  1:24 [PATCH net-next v3 00/11] selftests: rds: Add ROCE support to rds selftests Allison Henderson
  2026-05-18  1:24 ` [PATCH net-next v3 01/11] net/rds: Don't sleep inside rds_ib_conn_path_shutdown Allison Henderson
@ 2026-05-18  1:24 ` Allison Henderson
  2026-05-18  1:24 ` [PATCH net-next v3 03/11] selftests: rds: Add helper function check_info() " Allison Henderson
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Allison Henderson @ 2026-05-18  1:24 UTC (permalink / raw)
  To: netdev, pabeni, edumazet, kuba, horms, linux-rdma, achender,
	linux-kselftest, shuah

Hoist the network configs in test.py into a tcp specific helper
function, setup_tcp().  This is a preparatory refactoring for the
rds over ROCE series, which will add a separate function for rdma
specific configs.  No functional changes are introduced in this patch.

Signed-off-by: Allison Henderson <achender@kernel.org>
---
 tools/testing/selftests/net/rds/test.py | 113 +++++++++++++-----------
 1 file changed, 60 insertions(+), 53 deletions(-)

diff --git a/tools/testing/selftests/net/rds/test.py b/tools/testing/selftests/net/rds/test.py
index 6db6067792312..118a5da83c98e 100755
--- a/tools/testing/selftests/net/rds/test.py
+++ b/tools/testing/selftests/net/rds/test.py
@@ -32,6 +32,15 @@ NET1 = 'net1'
 VETH0 = 'veth0'
 VETH1 = 'veth1'
 
+tcpdump_procs = []
+tcp_addrs = [
+    # we technically don't need different port numbers, but this will
+    # help identify traffic in the network analyzer
+    ('10.0.0.1', 10000),
+    ('10.0.0.2', 20000),
+]
+
+
 # Helper function for creating a socket inside a network namespace.
 # We need this because otherwise RDS will detect that the two TCP
 # sockets are on the same interface and use the loop transport instead
@@ -100,6 +109,55 @@ def signal_handler(_sig, _frame):
     print("not ok 1 rds selftest")
     sys.exit(1)
 
+def setup_tcp():
+    """
+    Configure tcp network
+    """
+
+    ip(f"netns add {NET0}")
+    ip(f"netns add {NET1}")
+    ip("link add type veth")
+
+    # Move TCP interfaces into separate namespaces so they can no longer be
+    # bound directly; this prevents rds from switching over from the tcp
+    # transport to the loop transport.
+    ip(f"link set {VETH0} netns {NET0} up")
+    ip(f"link set {VETH1} netns {NET1} up")
+
+    # add addresses
+    ip(f"-n {NET0} addr add {tcp_addrs[0][0]}/32 dev {VETH0}")
+    ip(f"-n {NET1} addr add {tcp_addrs[1][0]}/32 dev {VETH1}")
+
+    # add routes
+    ip(f"-n {NET0} route add {tcp_addrs[1][0]}/32 dev {VETH0}")
+    ip(f"-n {NET1} route add {tcp_addrs[0][0]}/32 dev {VETH1}")
+
+    # sanity check that our two interfaces/addresses are correctly set up
+    # and communicating by doing a single ping
+    ip(f"netns exec {NET0} ping -c 1 {tcp_addrs[1][0]}")
+
+    # Start a packet capture on each network
+    if logdir is not None:
+        for netn in [NET0, NET1]:
+            pcap = logdir+'/rds-'+netn+'.pcap'
+
+            tcpdump_cmd = ['ip', 'netns', 'exec', netn, '/usr/sbin/tcpdump']
+            sudo_user = os.environ.get('SUDO_USER')
+            if sudo_user:
+                tcpdump_cmd.extend(['-Z', sudo_user])
+            tcpdump_cmd.extend(['-i', 'any', '-w', pcap])
+
+            # pylint: disable-next=consider-using-with
+            p = subprocess.Popen(tcpdump_cmd,
+                                 stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
+            tcpdump_procs.append(p)
+
+    # simulate packet loss, duplication and corruption
+    for netn, iface in [(NET0, VETH0), (NET1, VETH1)]:
+        ip(f"netns exec {netn} /usr/sbin/tc qdisc add dev {iface} root netem  \
+             corrupt {PACKET_CORRUPTION} loss {PACKET_LOSS} duplicate  \
+             {PACKET_DUPLICATE}")
+
 #Parse out command line arguments.  We take an optional
 # timeout parameter and an optional log output folder
 parser = argparse.ArgumentParser(description="init script args",
@@ -120,59 +178,8 @@ PACKET_LOSS=str(args.loss)+'%'
 PACKET_CORRUPTION=str(args.corruption)+'%'
 PACKET_DUPLICATE=str(args.duplicate)+'%'
 
-ip(f"netns add {NET0}")
-ip(f"netns add {NET1}")
-ip("link add type veth")
-
-addrs = [
-    # we technically don't need different port numbers, but this will
-    # help identify traffic in the network analyzer
-    ('10.0.0.1', 10000),
-    ('10.0.0.2', 20000),
-]
-
-# move interfaces to separate namespaces so they can no longer be
-# bound directly; this prevents rds from switching over from the tcp
-# transport to the loop transport.
-ip(f"link set {VETH0} netns {NET0} up")
-ip(f"link set {VETH1} netns {NET1} up")
-
-
-
-# add addresses
-ip(f"-n {NET0} addr add {addrs[0][0]}/32 dev {VETH0}")
-ip(f"-n {NET1} addr add {addrs[1][0]}/32 dev {VETH1}")
-
-# add routes
-ip(f"-n {NET0} route add {addrs[1][0]}/32 dev {VETH0}")
-ip(f"-n {NET1} route add {addrs[0][0]}/32 dev {VETH1}")
-
-# sanity check that our two interfaces/addresses are correctly set up
-# and communicating by doing a single ping
-ip(f"netns exec {NET0} ping -c 1 {addrs[1][0]}")
-
-tcpdump_procs = []
-# Start a packet capture on each network
-if logdir is not None:
-    for net in [NET0, NET1]:
-        pcap = logdir+'/rds-'+net+'.pcap'
-
-        tcpdump_cmd = ['ip', 'netns', 'exec', net, '/usr/sbin/tcpdump']
-        sudo_user = os.environ.get('SUDO_USER')
-        if sudo_user:
-            tcpdump_cmd.extend(['-Z', sudo_user])
-        tcpdump_cmd.extend(['-i', 'any', '-w', pcap])
-
-        # pylint: disable-next=consider-using-with
-        p = subprocess.Popen(tcpdump_cmd,
-                             stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
-        tcpdump_procs.append(p)
-
-# simulate packet loss, duplication and corruption
-for net, iface in [(NET0, VETH0), (NET1, VETH1)]:
-    ip(f"netns exec {net} /usr/sbin/tc qdisc add dev {iface} root netem  \
-         corrupt {PACKET_CORRUPTION} loss {PACKET_LOSS} duplicate  \
-         {PACKET_DUPLICATE}")
+setup_tcp()
+addrs = tcp_addrs
 
 print("TAP version 13")
 print("1..1")
-- 
2.25.1
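
As a reading aid, the sequence of ip commands that setup_tcp() issues can be sketched as a dry run that only builds the argument strings.  The helper name tcp_setup_commands() is hypothetical, and the namespace and interface names passed in below are example values; nothing here touches the network.

```python
def tcp_setup_commands(net0, net1, veth0, veth1, addr0, addr1):
    """Return the `ip` argument strings in the order setup_tcp() issues them."""
    return [
        # create the two namespaces and a veth pair
        f"netns add {net0}",
        f"netns add {net1}",
        "link add type veth",
        # move one end of the pair into each namespace
        f"link set {veth0} netns {net0} up",
        f"link set {veth1} netns {net1} up",
        # one /32 address per end
        f"-n {net0} addr add {addr0}/32 dev {veth0}",
        f"-n {net1} addr add {addr1}/32 dev {veth1}",
        # host routes to the peer address
        f"-n {net0} route add {addr1}/32 dev {veth0}",
        f"-n {net1} route add {addr0}/32 dev {veth1}",
        # single ping as a sanity check
        f"netns exec {net0} ping -c 1 {addr1}",
    ]


cmds = tcp_setup_commands('net0', 'net1', 'veth0', 'veth1',
                          '10.0.0.1', '10.0.0.2')
for cmd in cmds:
    print("ip", cmd)
```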



* [PATCH net-next v3 03/11] selftests: rds: Add helper function check_info() in test.py
  2026-05-18  1:24 [PATCH net-next v3 00/11] selftests: rds: Add ROCE support to rds selftests Allison Henderson
  2026-05-18  1:24 ` [PATCH net-next v3 01/11] net/rds: Don't sleep inside rds_ib_conn_path_shutdown Allison Henderson
  2026-05-18  1:24 ` [PATCH net-next v3 02/11] selftests: rds: Add helper function setup_tcp() in test.py Allison Henderson
@ 2026-05-18  1:24 ` Allison Henderson
  2026-05-18  1:24 ` [PATCH net-next v3 04/11] selftests: rds: Add helper function send_burst() " Allison Henderson
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Allison Henderson @ 2026-05-18  1:24 UTC (permalink / raw)
  To: netdev, pabeni, edumazet, kuba, horms, linux-rdma, achender,
	linux-kselftest, shuah

Hoist the page info logic in test.py into a helper function,
check_info().  This is a preparatory refactoring for the rds over ROCE
series that helps modularize the send/recv logic. Breaking up the logic
now will help avoid large function pylint errors later.  No functional
changes are introduced in this patch.

Signed-off-by: Allison Henderson <achender@kernel.org>
---
 tools/testing/selftests/net/rds/test.py | 53 +++++++++++++++----------
 1 file changed, 31 insertions(+), 22 deletions(-)

diff --git a/tools/testing/selftests/net/rds/test.py b/tools/testing/selftests/net/rds/test.py
index 118a5da83c98e..d64af9e662e8c 100755
--- a/tools/testing/selftests/net/rds/test.py
+++ b/tools/testing/selftests/net/rds/test.py
@@ -79,6 +79,36 @@ def netns_socket(netns, *sock_args):
     u1.close()
     return socket.fromfd(fds[0], *sock_args)
 
+def check_info(socks):
+    """
+    Check all rds info pages for errors
+
+    :param socks: list of sockets to check
+    """
+
+    # the Python socket module doesn't know these
+    rds_info_first = 10000
+    rds_info_last = 10017
+
+    nr_success = 0
+    nr_error = 0
+
+    for sock in socks:
+        for optname in range(rds_info_first, rds_info_last + 1):
+            # Sigh, the Python socket module doesn't allow us to pass
+            # buffer lengths greater than 1024 for some reason. RDS
+            # wants multiple pages.
+            try:
+                sock.getsockopt(socket.SOL_RDS, optname, 1024)
+                nr_success = nr_success + 1
+            except OSError as e:
+                nr_error = nr_error + 1
+                if e.errno == errno.ENOSPC:
+                    # ignore
+                    pass
+
+    ksft_pr(f"getsockopt(): {nr_success}/{nr_error}")
+
 def stop_pcaps():
     """Stop tcpdump processes.
 
@@ -268,28 +298,7 @@ while nr_send < NUM_PACKETS:
 
 ksft_pr("done", nr_send, nr_recv)
 
-# the Python socket module doesn't know these
-RDS_INFO_FIRST = 10000
-RDS_INFO_LAST = 10017
-
-nr_success = 0
-nr_error = 0
-
-for s in sockets:
-    for optname in range(RDS_INFO_FIRST, RDS_INFO_LAST + 1):
-        # Sigh, the Python socket module doesn't allow us to pass
-        # buffer lengths greater than 1024 for some reason. RDS
-        # wants multiple pages.
-        try:
-            s.getsockopt(socket.SOL_RDS, optname, 1024)
-            nr_success = nr_success + 1
-        except OSError as e:
-            nr_error = nr_error + 1
-            if e.errno == errno.ENOSPC:
-                # ignore
-                pass
-
-ksft_pr(f"getsockopt(): {nr_success}/{nr_error}")
+check_info(sockets)
 
 # cancel timeout
 signal.alarm(0)
-- 
2.25.1
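
To illustrate what the "success/error" counts printed by check_info() mean, here is a small sketch that runs the same counting loop against a stand-in socket.  FakeSock and count_info() are hypothetical names, and the level argument is a placeholder rather than the real socket.SOL_RDS; only the option range 10000..10017 is taken from the patch.

```python
import errno

RDS_INFO_FIRST = 10000
RDS_INFO_LAST = 10017


class FakeSock:
    """Stand-in socket whose getsockopt() fails for selected option names."""

    def __init__(self, failing):
        self.failing = set(failing)

    def getsockopt(self, level, optname, buflen):
        if optname in self.failing:
            raise OSError(errno.ENOSPC, "No space left on device")
        return b"\0" * buflen


def count_info(socks, level=0):
    """Query every info option on every socket; return (successes, errors)."""
    nr_success = 0
    nr_error = 0
    for sock in socks:
        for optname in range(RDS_INFO_FIRST, RDS_INFO_LAST + 1):
            try:
                sock.getsockopt(level, optname, 1024)
                nr_success += 1
            except OSError:
                # Errors are counted but do not fail the test
                nr_error += 1
    return nr_success, nr_error


ok, err = count_info([FakeSock([]), FakeSock([10001, 10005])])
print(f"getsockopt(): {ok}/{err}")
```

With 18 options per socket, two sockets give 36 queries in total, so the two numbers always sum to 36 here.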



* [PATCH net-next v3 04/11] selftests: rds: Add helper function send_burst() in test.py
  2026-05-18  1:24 [PATCH net-next v3 00/11] selftests: rds: Add ROCE support to rds selftests Allison Henderson
                   ` (2 preceding siblings ...)
  2026-05-18  1:24 ` [PATCH net-next v3 03/11] selftests: rds: Add helper function check_info() " Allison Henderson
@ 2026-05-18  1:24 ` Allison Henderson
  2026-05-18  1:24 ` [PATCH net-next v3 05/11] selftests: rds: Add helper function recv_burst() " Allison Henderson
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Allison Henderson @ 2026-05-18  1:24 UTC (permalink / raw)
  To: netdev, pabeni, edumazet, kuba, horms, linux-rdma, achender,
	linux-kselftest, shuah

Hoist the send packet logic in test.py into a helper function,
send_burst().  This is a preparatory refactoring for the rds over ROCE
series that helps modularize the send/recv logic. Breaking up the logic
now will help avoid large function pylint errors later.  No functional
changes are introduced in this patch.

Signed-off-by: Allison Henderson <achender@kernel.org>
---
 tools/testing/selftests/net/rds/test.py | 50 +++++++++++++------------
 1 file changed, 27 insertions(+), 23 deletions(-)

diff --git a/tools/testing/selftests/net/rds/test.py b/tools/testing/selftests/net/rds/test.py
index d64af9e662e8c..d6e872af13600 100755
--- a/tools/testing/selftests/net/rds/test.py
+++ b/tools/testing/selftests/net/rds/test.py
@@ -79,6 +79,31 @@ def netns_socket(netns, *sock_args):
     u1.close()
     return socket.fromfd(fds[0], *sock_args)
 
+def send_burst(socks, ip_addrs, snd_hashes, nr_sent, nr_total):
+    """Send until blocked or nr_total reached. Return updated nr_sent."""
+
+    while nr_sent < nr_total:
+        data = hashlib.sha256(
+            f'packet {nr_sent}'.encode('utf-8')).hexdigest().encode('utf-8')
+        # pseudo-random send/receive pattern
+        snd_idx = nr_sent % 2
+        rcv_idx = 1 - (nr_sent % 3) % 2
+
+        snd = socks[snd_idx]
+        rcv = socks[rcv_idx]
+        try:
+            snd.sendto(data, ip_addrs[rcv_idx])
+        except BlockingIOError:
+            return nr_sent
+        except OSError as e:
+            if e.errno in (errno.ENOBUFS, errno.ECONNRESET, errno.EPIPE):
+                return nr_sent
+            raise
+        snd_hashes.setdefault((snd.fileno(), rcv.fileno()),
+                hashlib.sha256()).update(f'<{data}>'.encode('utf-8'))
+        nr_sent += 1
+    return nr_sent
+
 def check_info(socks):
     """
     Check all rds info pages for errors
@@ -234,10 +259,6 @@ fileno_to_socket = {
 
 addr_to_socket = dict(zip(addrs, sockets))
 
-socket_to_addr = {
-    s: addr for addr, s in zip(addrs, sockets)
-}
-
 send_hashes = {}
 recv_hashes = {}
 
@@ -251,27 +272,10 @@ nr_send = 0
 nr_recv = 0
 
 while nr_send < NUM_PACKETS:
+
     # Send as much as we can without blocking
     ksft_pr("sending...", nr_send, nr_recv)
-    while nr_send < NUM_PACKETS:
-        send_data = hashlib.sha256(
-            f'packet {nr_send}'.encode('utf-8')).hexdigest().encode('utf-8')
-
-        # pseudo-random send/receive pattern
-        sender = sockets[nr_send % 2]
-        receiver = sockets[1 - (nr_send % 3) % 2]
-
-        try:
-            sender.sendto(send_data, socket_to_addr[receiver])
-            send_hashes.setdefault((sender.fileno(), receiver.fileno()),
-                    hashlib.sha256()).update(f'<{send_data}>'.encode('utf-8'))
-            nr_send = nr_send + 1
-        except BlockingIOError:
-            break
-        except OSError as e:
-            if e.errno in [errno.ENOBUFS, errno.ECONNRESET, errno.EPIPE]:
-                break
-            raise
+    nr_send = send_burst(sockets, addrs, send_hashes, nr_send, NUM_PACKETS)
 
     # Receive as much as we can without blocking
     ksft_pr("receiving...", nr_send, nr_recv)
-- 
2.25.1
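
The "pseudo-random send/receive pattern" in send_burst() is plain index arithmetic, so it can be tabulated directly.  The helper name burst_pattern() is hypothetical; the two expressions are copied from the patch.

```python
def burst_pattern(nr_sent):
    """Return (snd_idx, rcv_idx) for the packet numbered nr_sent."""
    snd_idx = nr_sent % 2
    rcv_idx = 1 - (nr_sent % 3) % 2
    return snd_idx, rcv_idx


# The pattern repeats every 6 packets (period 2 for the sender index,
# period 3 for the receiver index).
pattern = [burst_pattern(n) for n in range(6)]
print(pattern)
```

Note that some packets in the cycle have the same index for sender and receiver, so the pattern covers a socket sending to its own address as well as sending to its peer.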



* [PATCH net-next v3 05/11] selftests: rds: Add helper function recv_burst() in test.py
  2026-05-18  1:24 [PATCH net-next v3 00/11] selftests: rds: Add ROCE support to rds selftests Allison Henderson
                   ` (3 preceding siblings ...)
  2026-05-18  1:24 ` [PATCH net-next v3 04/11] selftests: rds: Add helper function send_burst() " Allison Henderson
@ 2026-05-18  1:24 ` Allison Henderson
  2026-05-18  1:24 ` [PATCH net-next v3 06/11] selftests: rds: Add helper function verify_hashes() " Allison Henderson
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Allison Henderson @ 2026-05-18  1:24 UTC (permalink / raw)
  To: netdev, pabeni, edumazet, kuba, horms, linux-rdma, achender,
	linux-kselftest, shuah

Hoist the receive packet logic in test.py into a helper function,
recv_burst().  This is a preparatory refactoring for the rds over ROCE
series that helps modularize the send/recv logic. Breaking up the logic
now will help avoid large function pylint errors later.  No functional
changes are introduced in this patch.

Signed-off-by: Allison Henderson <achender@kernel.org>
---
 tools/testing/selftests/net/rds/test.py | 39 ++++++++++++-------------
 1 file changed, 19 insertions(+), 20 deletions(-)

diff --git a/tools/testing/selftests/net/rds/test.py b/tools/testing/selftests/net/rds/test.py
index d6e872af13600..ae74117b41747 100755
--- a/tools/testing/selftests/net/rds/test.py
+++ b/tools/testing/selftests/net/rds/test.py
@@ -104,6 +104,24 @@ def send_burst(socks, ip_addrs, snd_hashes, nr_sent, nr_total):
         nr_sent += 1
     return nr_sent
 
+def recv_burst(epoll, socks, ip_addrs, rcv_hashes, nr_rcv):
+    """Drain whatever's readable from epoll. Return updated nr_rcv."""
+    for filen, evntmask in epoll.poll():
+        if not evntmask & select.EPOLLRDNORM:
+            continue
+        rcv = next(s for s in socks if s.fileno() == filen)
+        while True:
+            try:
+                data, adr = rcv.recvfrom(1024)
+            except BlockingIOError:
+                break
+            snd_idx = ip_addrs.index(adr)
+            snd = socks[snd_idx]
+            rcv_hashes.setdefault((snd.fileno(), rcv.fileno()),
+                    hashlib.sha256()).update(f'<{data}>'.encode('utf-8'))
+            nr_rcv += 1
+    return nr_rcv
+
 def check_info(socks):
     """
     Check all rds info pages for errors
@@ -253,12 +271,6 @@ for s, addr in zip(sockets, addrs):
     s.bind(addr)
     s.setblocking(0)
 
-fileno_to_socket = {
-    s.fileno(): s for s in sockets
-}
-
-addr_to_socket = dict(zip(addrs, sockets))
-
 send_hashes = {}
 recv_hashes = {}
 
@@ -280,20 +292,7 @@ while nr_send < NUM_PACKETS:
     # Receive as much as we can without blocking
     ksft_pr("receiving...", nr_send, nr_recv)
     while nr_recv < nr_send:
-        for fileno, eventmask in ep.poll():
-            receiver = fileno_to_socket[fileno]
-
-            if eventmask & select.EPOLLRDNORM:
-                while True:
-                    try:
-                        recv_data, address = receiver.recvfrom(1024)
-                        sender = addr_to_socket[address]
-                        recv_hashes.setdefault((sender.fileno(),
-                            receiver.fileno()), hashlib.sha256()).update(
-                                    f'<{recv_data}>'.encode('utf-8'))
-                        nr_recv = nr_recv + 1
-                    except BlockingIOError:
-                        break
+        nr_recv = recv_burst(ep, sockets, addrs, recv_hashes, nr_recv)
 
     # exercise net/rds/tcp.c:rds_tcp_sysctl_reset()
     for net in [NET0, NET1]:
-- 
2.25.1
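
The inner loop of recv_burst() reads from a non-blocking socket until the kernel reports that nothing is left.  A minimal sketch of that drain loop follows, assuming an AF_UNIX datagram socketpair in place of RDS sockets and leaving out the epoll step; drain() is a hypothetical name.

```python
import socket


def drain(sock, bufsize=1024):
    """Read datagrams from a non-blocking socket until it would block."""
    received = []
    while True:
        try:
            received.append(sock.recv(bufsize))
        except BlockingIOError:
            # Nothing more is queued right now
            break
    return received


tx, rx = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
rx.setblocking(False)
for i in range(3):
    tx.send(f"packet {i}".encode("utf-8"))

messages = drain(rx)
print(messages)

tx.close()
rx.close()
```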



* [PATCH net-next v3 06/11] selftests: rds: Add helper function verify_hashes() in test.py
  2026-05-18  1:24 [PATCH net-next v3 00/11] selftests: rds: Add ROCE support to rds selftests Allison Henderson
                   ` (4 preceding siblings ...)
  2026-05-18  1:24 ` [PATCH net-next v3 05/11] selftests: rds: Add helper function recv_burst() " Allison Henderson
@ 2026-05-18  1:24 ` Allison Henderson
  2026-05-18  1:24 ` [PATCH net-next v3 07/11] selftests: rds: Add helper function snd_rcv_packets() " Allison Henderson
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Allison Henderson @ 2026-05-18  1:24 UTC (permalink / raw)
  To: netdev, pabeni, edumazet, kuba, horms, linux-rdma, achender,
	linux-kselftest, shuah

Hoist the verify hashes logic in test.py into a helper function,
verify_hashes().  This is a preparatory refactoring for the rds over
ROCE series that helps modularize the send/recv logic. Breaking up the
logic now will help avoid large function pylint errors later.  No
functional changes are introduced in this patch.

Signed-off-by: Allison Henderson <achender@kernel.org>
---
 tools/testing/selftests/net/rds/test.py | 33 ++++++++++++-------------
 1 file changed, 16 insertions(+), 17 deletions(-)

diff --git a/tools/testing/selftests/net/rds/test.py b/tools/testing/selftests/net/rds/test.py
index ae74117b41747..a3def413d84ad 100755
--- a/tools/testing/selftests/net/rds/test.py
+++ b/tools/testing/selftests/net/rds/test.py
@@ -152,6 +152,21 @@ def check_info(socks):
 
     ksft_pr(f"getsockopt(): {nr_success}/{nr_error}")
 
+def verify_hashes(snd_hashes, rcv_hashes):
+    """Compare send/recv hashes per (sender, receiver) pair."""
+    for key, snd_hash in snd_hashes.items():
+        rcv_hash = rcv_hashes.get(key)
+        if rcv_hash is None:
+            ksft_pr("FAIL: No data received")
+            return 1
+        if snd_hash.hexdigest() != rcv_hash.hexdigest():
+            ksft_pr("FAIL: Send/recv mismatch")
+            ksft_pr("hash expected:", snd_hash.hexdigest())
+            ksft_pr("hash received:", rcv_hash.hexdigest())
+            return 1
+        ksft_pr(f"{key[0]}/{key[1]}: ok")
+    return 0
+
 def stop_pcaps():
     """Stop tcpdump processes.
 
@@ -310,23 +325,7 @@ stop_pcaps()
 
 # We're done sending and receiving stuff, now let's check if what
 # we received is what we sent.
-ret = 0
-for (sender, receiver), send_hash in send_hashes.items():
-    recv_hash = recv_hashes.get((sender, receiver))
-
-    if recv_hash is None:
-        ksft_pr("FAIL: No data received")
-        ret = 1
-        break
-
-    if send_hash.hexdigest() != recv_hash.hexdigest():
-        ksft_pr("FAIL: Send/recv mismatch")
-        ksft_pr("hash expected:", send_hash.hexdigest())
-        ksft_pr("hash received:", recv_hash.hexdigest())
-        ret = 1
-        break
-
-    ksft_pr(f"{sender}/{receiver}: ok")
+ret = verify_hashes(send_hashes, recv_hashes)
 
 if ret == 0:
     ksft_pr("Success")
-- 
2.25.1
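
The check in verify_hashes() relies on one running sha256 per (sender, receiver) pair, updated once per packet on each side.  The sketch below shows that idea with hypothetical helpers fold() and compare(); the keys are arbitrary example integers standing in for file descriptors.  Because the hash is cumulative, a reordered stream produces a different digest.

```python
import hashlib


def fold(hashes, key, payload):
    """Mix one payload into the running hash for a (sender, receiver) key."""
    hashes.setdefault(key, hashlib.sha256()).update(
        f'<{payload}>'.encode('utf-8'))


def compare(snd_hashes, rcv_hashes):
    """Return None if every sent stream was received intact, else a reason."""
    for key, snd_hash in snd_hashes.items():
        rcv_hash = rcv_hashes.get(key)
        if rcv_hash is None:
            return f"{key}: no data received"
        if snd_hash.hexdigest() != rcv_hash.hexdigest():
            return f"{key}: send/recv mismatch"
    return None


sent, in_order, reordered = {}, {}, {}
key = (3, 4)
for payload in (b"a", b"b"):
    fold(sent, key, payload)
    fold(in_order, key, payload)
for payload in (b"b", b"a"):
    fold(reordered, key, payload)

print(compare(sent, in_order))
print(compare(sent, reordered))
print(compare(sent, {}))
```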



* [PATCH net-next v3 07/11] selftests: rds: Add helper function snd_rcv_packets() in test.py
  2026-05-18  1:24 [PATCH net-next v3 00/11] selftests: rds: Add ROCE support to rds selftests Allison Henderson
                   ` (5 preceding siblings ...)
  2026-05-18  1:24 ` [PATCH net-next v3 06/11] selftests: rds: Add helper function verify_hashes() " Allison Henderson
@ 2026-05-18  1:24 ` Allison Henderson
  2026-05-18  1:24 ` [PATCH net-next v3 08/11] selftests: rds: Handle errors in netns_socket Allison Henderson
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Allison Henderson @ 2026-05-18  1:24 UTC (permalink / raw)
  To: netdev, pabeni, edumazet, kuba, horms, linux-rdma, achender,
	linux-kselftest, shuah

Hoist the send/recv logic in test.py into a helper function,
snd_rcv_packets().  This is a preparatory refactoring for the
rds over ROCE series which can use the same function to run
the test over tcp, rdma, or both.  No functional changes are
introduced in this patch.

Signed-off-by: Allison Henderson <achender@kernel.org>
---
 tools/testing/selftests/net/rds/test.py | 99 ++++++++++++++-----------
 1 file changed, 54 insertions(+), 45 deletions(-)

diff --git a/tools/testing/selftests/net/rds/test.py b/tools/testing/selftests/net/rds/test.py
index a3def413d84ad..f7d0dba85131e 100755
--- a/tools/testing/selftests/net/rds/test.py
+++ b/tools/testing/selftests/net/rds/test.py
@@ -167,6 +167,59 @@ def verify_hashes(snd_hashes, rcv_hashes):
         ksft_pr(f"{key[0]}/{key[1]}: ok")
     return 0
 
+def snd_rcv_packets(addrs, netns_list):
+    """
+    Send packets on the given network interfaces
+
+    :param addrs: list of (ip, port) tuples matching the sockets
+    :param netns_list: list of network namespaces
+    """
+
+    sockets = [
+        netns_socket(netns_list[0], socket.AF_RDS, socket.SOCK_SEQPACKET),
+        netns_socket(netns_list[1], socket.AF_RDS, socket.SOCK_SEQPACKET),
+    ]
+
+    for s, addr in zip(sockets, addrs):
+        s.bind(addr)
+        s.setblocking(0)
+
+    send_hashes = {}
+    recv_hashes = {}
+
+    ep = select.epoll()
+
+    for s in sockets:
+        ep.register(s, select.EPOLLRDNORM)
+
+    num_packets = 50000
+    nr_send = 0
+    nr_recv = 0
+
+    while nr_send < num_packets:
+
+        # Send as much as we can without blocking
+        ksft_pr("sending...", nr_send, nr_recv)
+        nr_send = send_burst(sockets, addrs, send_hashes, nr_send, num_packets)
+
+        # Receive as much as we can without blocking
+        ksft_pr("receiving...", nr_send, nr_recv)
+        while nr_recv < nr_send:
+            nr_recv = recv_burst(ep, sockets, addrs, recv_hashes, nr_recv)
+
+        # exercise net/rds/tcp.c:rds_tcp_sysctl_reset()
+        for net in netns_list:
+            ip(f"netns exec {net} /usr/sbin/sysctl net.rds.tcp.rds_tcp_rcvbuf=10000")
+            ip(f"netns exec {net} /usr/sbin/sysctl net.rds.tcp.rds_tcp_sndbuf=10000")
+
+    ksft_pr("done", nr_send, nr_recv)
+
+    check_info(sockets)
+
+    # We're done sending and receiving stuff, now let's check if what
+    # we received is what we sent.
+    return verify_hashes(send_hashes, recv_hashes)
+
 def stop_pcaps():
     """Stop tcpdump processes.
 
@@ -267,7 +320,6 @@ PACKET_CORRUPTION=str(args.corruption)+'%'
 PACKET_DUPLICATE=str(args.duplicate)+'%'
 
 setup_tcp()
-addrs = tcp_addrs
 
 print("TAP version 13")
 print("1..1")
@@ -277,56 +329,13 @@ if args.timeout > 0:
     signal.alarm(args.timeout)
     signal.signal(signal.SIGALRM, signal_handler)
 
-sockets = [
-    netns_socket(NET0, socket.AF_RDS, socket.SOCK_SEQPACKET),
-    netns_socket(NET1, socket.AF_RDS, socket.SOCK_SEQPACKET),
-]
-
-for s, addr in zip(sockets, addrs):
-    s.bind(addr)
-    s.setblocking(0)
-
-send_hashes = {}
-recv_hashes = {}
-
-ep = select.epoll()
-
-for s in sockets:
-    ep.register(s, select.EPOLLRDNORM)
-
-NUM_PACKETS = 50000
-nr_send = 0
-nr_recv = 0
-
-while nr_send < NUM_PACKETS:
-
-    # Send as much as we can without blocking
-    ksft_pr("sending...", nr_send, nr_recv)
-    nr_send = send_burst(sockets, addrs, send_hashes, nr_send, NUM_PACKETS)
-
-    # Receive as much as we can without blocking
-    ksft_pr("receiving...", nr_send, nr_recv)
-    while nr_recv < nr_send:
-        nr_recv = recv_burst(ep, sockets, addrs, recv_hashes, nr_recv)
-
-    # exercise net/rds/tcp.c:rds_tcp_sysctl_reset()
-    for net in [NET0, NET1]:
-        ip(f"netns exec {net} /usr/sbin/sysctl net.rds.tcp.rds_tcp_rcvbuf=10000")
-        ip(f"netns exec {net} /usr/sbin/sysctl net.rds.tcp.rds_tcp_sndbuf=10000")
-
-ksft_pr("done", nr_send, nr_recv)
-
-check_info(sockets)
+ret = snd_rcv_packets(tcp_addrs, [NET0, NET1])
 
 # cancel timeout
 signal.alarm(0)
 
 stop_pcaps()
 
-# We're done sending and receiving stuff, now let's check if what
-# we received is what we sent.
-ret = verify_hashes(send_hashes, recv_hashes)
-
 if ret == 0:
     ksft_pr("Success")
     print("ok 1 rds selftest")
-- 
2.25.1



* [PATCH net-next v3 08/11] selftests: rds: Handle errors in netns_socket
  2026-05-18  1:24 [PATCH net-next v3 00/11] selftests: rds: Add ROCE support to rds selftests Allison Henderson
                   ` (6 preceding siblings ...)
  2026-05-18  1:24 ` [PATCH net-next v3 07/11] selftests: rds: Add helper function snd_rcv_packets() " Allison Henderson
@ 2026-05-18  1:24 ` Allison Henderson
  2026-05-18  1:24 ` [PATCH net-next v3 09/11] selftests: rds: Register network teardown via atexit Allison Henderson
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Allison Henderson @ 2026-05-18  1:24 UTC (permalink / raw)
  To: netdev, pabeni, edumazet, kuba, horms, linux-rdma, achender,
	linux-kselftest, shuah

Child processes forked by netns_socket() to create sockets may raise
exceptions that the parent currently does not handle, for example if
a namespace does not exist or the rds module did not load.  Because
these exceptions occur within the child process, the child exits,
but the parent does not check its exit status.

Further, allowing the child processes to quietly raise exceptions
will cause problems later if the parent registers cleanup functions
with atexit.  Since the child processes inherit the parent's handlers,
they may prematurely call the parent's cleanup routines without the
parent being aware.

Fix this by catching all exceptions raised by the child processes.
Child errors surface as a non-zero exit status, which the parent
then raises as a RuntimeError.

Signed-off-by: Allison Henderson <achender@kernel.org>
---
 tools/testing/selftests/net/rds/test.py | 27 +++++++++++++------------
 1 file changed, 14 insertions(+), 13 deletions(-)
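
For readers who want to try the error path in isolation, here is a
minimal standalone sketch of the pattern described above.  It is not the
code from test.py: the helper name run_in_child() and its callable
argument are hypothetical, and the fd passing done by netns_socket() is
left out.  Only the fork / os._exit() / waitpid() status check mirrors
the patch.

```python
import os


def run_in_child(work):
    """Run work() in a forked child; raise in the parent if the child fails.

    Hypothetical helper for illustration only; test.py's netns_socket()
    additionally passes a socket fd back to the parent.
    """
    child = os.fork()
    if child == 0:
        # Child: never let an exception propagate, since that would unwind
        # into code (and atexit handlers) inherited from the parent.
        try:
            work()
            os._exit(0)
        except BaseException:
            os._exit(1)

    # Parent: collect the exit status and turn a failure into an exception.
    _, status = os.waitpid(child, 0)
    if not os.WIFEXITED(status) or os.WEXITSTATUS(status) != 0:
        raise RuntimeError(f"child failed (status={status})")
```

Because the child always leaves through os._exit(), it skips any atexit
handlers it inherited from the parent, which is the property the next
patch in the series relies on.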

diff --git a/tools/testing/selftests/net/rds/test.py b/tools/testing/selftests/net/rds/test.py
index f7d0dba85131e..2188221ee7805 100755
--- a/tools/testing/selftests/net/rds/test.py
+++ b/tools/testing/selftests/net/rds/test.py
@@ -56,27 +56,28 @@ def netns_socket(netns, *sock_args):
 
     child = os.fork()
     if child == 0:
-        # change network namespace
-        with open(f'/var/run/netns/{netns}', encoding='utf-8') as f:
-            try:
+        try:
+            # change network namespace
+            with open(f'/var/run/netns/{netns}', encoding='utf-8') as f:
                 setns(f.fileno(), 0)
-            except IOError as e:
-                print(e.errno)
-                print(e)
-
-        # create socket in target namespace
-        sock = socket.socket(*sock_args)
+            # create socket in target namespace
+            sock = socket.socket(*sock_args)
 
-        # send resulting socket to parent
-        socket.send_fds(u0, [], [sock.fileno()])
+            # send resulting socket to parent
+            socket.send_fds(u0, [], [sock.fileno()])
 
-        os._exit(0)
+            os._exit(0)
+        except BaseException:
+            os._exit(1)
 
     # receive socket from child
     _, fds, _, _ = socket.recv_fds(u1, 0, 1)
-    os.waitpid(child, 0)
+    _, status = os.waitpid(child, 0)
     u0.close()
     u1.close()
+    if not os.WIFEXITED(status) or os.WEXITSTATUS(status) != 0:
+        raise RuntimeError(
+            f"netns_socket child failed in netns {netns} (status={status})")
     return socket.fromfd(fds[0], *sock_args)
 
 def send_burst(socks, ip_addrs, snd_hashes, nr_sent, nr_total):
-- 
2.25.1



* [PATCH net-next v3 09/11] selftests: rds: Register network teardown via atexit
  2026-05-18  1:24 [PATCH net-next v3 00/11] selftests: rds: Add ROCE support to rds selftests Allison Henderson
                   ` (7 preceding siblings ...)
  2026-05-18  1:24 ` [PATCH net-next v3 08/11] selftests: rds: Handle errors in netns_socket Allison Henderson
@ 2026-05-18  1:24 ` Allison Henderson
  2026-05-18  1:24 ` [PATCH net-next v3 10/11] selftests: rds: Add ROCE support to test.py Allison Henderson
  2026-05-18  1:24 ` [PATCH net-next v3 11/11] selftests: rds: Add ROCE support to run.sh Allison Henderson
  10 siblings, 0 replies; 12+ messages in thread
From: Allison Henderson @ 2026-05-18  1:24 UTC (permalink / raw)
  To: netdev, pabeni, edumazet, kuba, horms, linux-rdma, achender,
	linux-kselftest, shuah

This patch adds a teardown_tcp() helper that removes net0/net1.
The cmd() calls use fail=False so that teardown_tcp() is safe to
call after either a complete or a partial setup.  Also call
teardown_tcp() at the top of setup_tcp() so that a previously
interrupted run does not leave net0/net1 lingering and break a
subsequent ip netns add.  Register teardown_tcp() with atexit
before setup_tcp() is invoked.

Likewise, we can simplify stop_pcaps() handling by registering it
with atexit instead of calling it from the signal handler.

atexit handlers run on every exit path the test uses: normal
completion, an unhandled exception, and sys.exit() from the timeout
signal handler.  This guarantees that cleanup is called without
further wrapping the test body in try/finally blocks.

atexit LIFO ordering keeps stop_pcaps before teardown_tcp so
tcpdumps are killed cleanly before their namespaces go away.

This is a preparatory cleanup for the upcoming ROCE patch, which
will also register a teardown_rdma() alongside teardown_tcp().

Signed-off-by: Allison Henderson <achender@kernel.org>
---
 tools/testing/selftests/net/rds/test.py | 25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)
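
The LIFO ordering claimed above can be checked outside the selftest with
a small sketch.  This is an illustration under assumptions, not part of
the patch: the handler bodies are placeholders that only print their
names, and exit_order() is a hypothetical helper that runs the script in
a separate interpreter so the handlers actually fire.

```python
import subprocess
import sys

# Script run in a separate interpreter so its atexit handlers fire when it
# exits; the handler bodies are placeholders that only print their names.
SCRIPT = """
import atexit
import sys

def teardown_tcp():
    print("teardown_tcp")

def stop_pcaps():
    print("stop_pcaps")

# Registered in the same order as in the patch.
atexit.register(teardown_tcp)
atexit.register(stop_pcaps)

sys.exit(1)  # stands in for the timeout signal handler's exit
"""


def exit_order():
    """Return (handler names in the order they ran, exit code)."""
    proc = subprocess.run([sys.executable, "-c", SCRIPT],
                          capture_output=True, text=True, check=False)
    return proc.stdout.split(), proc.returncode
```

The handler registered last (stop_pcaps) runs first, and both run even
though the script leaves through sys.exit(1).  Note that os._exit()
bypasses atexit handlers entirely.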

diff --git a/tools/testing/selftests/net/rds/test.py b/tools/testing/selftests/net/rds/test.py
index 2188221ee7805..5b699bf87eb25 100755
--- a/tools/testing/selftests/net/rds/test.py
+++ b/tools/testing/selftests/net/rds/test.py
@@ -5,6 +5,7 @@ This module provides functional testing for the net/rds component.
 """
 
 import argparse
+import atexit
 import ctypes
 import errno
 import hashlib
@@ -19,7 +20,7 @@ import sys
 this_dir = os.path.dirname(os.path.realpath(__file__))
 sys.path.append(os.path.join(this_dir, "../"))
 # pylint: disable-next=wrong-import-position,import-error,no-name-in-module
-from lib.py.utils import ip # noqa: E402
+from lib.py.utils import ip, cmd # noqa: E402
 # pylint: disable-next=wrong-import-position,import-error,no-name-in-module
 from lib.py.ksft import ksft_pr # noqa: E402
 
@@ -247,7 +248,6 @@ def signal_handler(_sig, _frame):
     Test timed out signal handler
     """
     ksft_pr("Test timed out")
-    stop_pcaps()
     print("not ok 1 rds selftest")
     sys.exit(1)
 
@@ -256,6 +256,9 @@ def setup_tcp():
     Configure tcp network
     """
 
+    # clean up any leftovers from a previously interrupted run
+    teardown_tcp()
+
     ip(f"netns add {NET0}")
     ip(f"netns add {NET1}")
     ip("link add type veth")
@@ -300,6 +303,17 @@ def setup_tcp():
              corrupt {PACKET_CORRUPTION} loss {PACKET_LOSS} duplicate  \
              {PACKET_DUPLICATE}")
 
+def teardown_tcp():
+    """
+    Tear down the tcp network configured by setup_tcp().
+
+    Removing the namespaces also removes the veth pair, addresses,
+    routes, and netem qdisc that live inside them.  fail=False so
+    this is safe to call in error paths after a partial or complete setup.
+    """
+    cmd(f"ip netns del {NET0}", fail=False)
+    cmd(f"ip netns del {NET1}", fail=False)
+
 #Parse out command line arguments.  We take an optional
 # timeout parameter and an optional log output folder
 parser = argparse.ArgumentParser(description="init script args",
@@ -320,6 +334,11 @@ PACKET_LOSS=str(args.loss)+'%'
 PACKET_CORRUPTION=str(args.corruption)+'%'
 PACKET_DUPLICATE=str(args.duplicate)+'%'
 
+# Register cleanup before setup so a partial-setup crash still tears down
+# whatever state did get created.
+atexit.register(teardown_tcp)
+atexit.register(stop_pcaps)
+
 setup_tcp()
 
 print("TAP version 13")
@@ -335,8 +354,6 @@ ret = snd_rcv_packets(tcp_addrs, [NET0, NET1])
 # cancel timeout
 signal.alarm(0)
 
-stop_pcaps()
-
 if ret == 0:
     ksft_pr("Success")
     print("ok 1 rds selftest")
-- 
2.25.1



* [PATCH net-next v3 10/11] selftests: rds: Add ROCE support to test.py
  2026-05-18  1:24 [PATCH net-next v3 00/11] selftests: rds: Add ROCE support to rds selftests Allison Henderson
                   ` (8 preceding siblings ...)
  2026-05-18  1:24 ` [PATCH net-next v3 09/11] selftests: rds: Register network teardown via atexit Allison Henderson
@ 2026-05-18  1:24 ` Allison Henderson
  2026-05-18  1:24 ` [PATCH net-next v3 11/11] selftests: rds: Add ROCE support to run.sh Allison Henderson
  10 siblings, 0 replies; 12+ messages in thread
From: Allison Henderson @ 2026-05-18  1:24 UTC (permalink / raw)
  To: netdev, pabeni, edumazet, kuba, horms, linux-rdma, achender,
	linux-kselftest, shuah

This patch adds support for testing rds rdma over ROCE in test.py.
A new -T flag is added, which takes a comma-separated list of
transports: tcp, rdma, or tcp,rdma.  A new setup_rdma() function is
added that configures the rdma interfaces for use in the test case.

Signed-off-by: Allison Henderson <achender@kernel.org>
---
 tools/testing/selftests/net/rds/test.py | 238 ++++++++++++++++++++----
 1 file changed, 206 insertions(+), 32 deletions(-)
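
As a quick illustration of how the -T value and the OP_FLAG_* bits are
meant to interact, here is a minimal sketch.  The flag values match the
patch, but parse_transports(), socket_kind() and TRANSPORT_FLAGS are
hypothetical names used only here, and the sketch raises ValueError
where test.py raises SystemExit.

```python
OP_FLAG_TCP = 0x1
OP_FLAG_RDMA = 0x2

# Maps each accepted -T token to its transport flag.
TRANSPORT_FLAGS = {"tcp": OP_FLAG_TCP, "rdma": OP_FLAG_RDMA}


def parse_transports(arg):
    """Split a comma-separated -T value and reject unknown tokens."""
    transports = [t.strip() for t in arg.split(",")]
    for t in transports:
        if t not in TRANSPORT_FLAGS:
            raise ValueError(f"unknown transport: {t!r}")
    return transports


def socket_kind(flags):
    """Return which socket setup a flag value selects.

    Exactly one transport bit must be set, mirroring the checks at the
    top of snd_rcv_packets().
    """
    if (flags & OP_FLAG_TCP) and (flags & OP_FLAG_RDMA):
        raise RuntimeError(f"flags select multiple transports: {flags}")
    if flags & OP_FLAG_TCP:
        return "netns"    # one AF_RDS socket per network namespace
    if flags & OP_FLAG_RDMA:
        return "local"    # both AF_RDS sockets in the current namespace
    raise RuntimeError(f"flags select no transport: {flags}")
```

Each transport gets its own environment with a single bit set, so
"tcp,rdma" results in two separate runs rather than one run with both
bits.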

diff --git a/tools/testing/selftests/net/rds/test.py b/tools/testing/selftests/net/rds/test.py
index 5b699bf87eb25..08f2a846a8ab5 100755
--- a/tools/testing/selftests/net/rds/test.py
+++ b/tools/testing/selftests/net/rds/test.py
@@ -11,10 +11,12 @@ import errno
 import hashlib
 import os
 import select
+import re
 import signal
 import socket
 import subprocess
 import sys
+import time
 
 # Allow utils module to be imported from different directory
 this_dir = os.path.dirname(os.path.realpath(__file__))
@@ -41,6 +43,27 @@ tcp_addrs = [
     ('10.0.0.2', 20000),
 ]
 
+# RDMA network configs
+RXE_DEV0 = 'rxe0'
+RXE_DEV1 = 'rxe1'
+
+VETH_RDMA0 = 'veth_rdma0'
+VETH_RDMA1 = 'veth_rdma1'
+
+rdma_addrs = [
+    ('10.0.0.3', 30000),
+    ('10.0.0.4', 30000),
+]
+
+# send_packets flag space
+OP_FLAG_TCP     = 0x1
+OP_FLAG_RDMA    = 0x2
+
+signal_handler_label = ""
+
+tap_idx = 0
+nr_pass = 0
+nr_fail = 0
 
 # Helper function for creating a socket inside a network namespace.
 # We need this because otherwise RDS will detect that the two TCP
@@ -169,18 +192,35 @@ def verify_hashes(snd_hashes, rcv_hashes):
         ksft_pr(f"{key[0]}/{key[1]}: ok")
     return 0
 
-def snd_rcv_packets(addrs, netns_list):
+def snd_rcv_packets(env):
     """
     Send packets on the given network interfaces
 
-    :param addrs: list of (ip, port) tuples matching the sockets
-    :param netns_list: list of network namespaces
+    :param env: transport-environment dict for setup_tcp() / setup_rdma().
+                "addrs": list of (ip, port) tuples matching the sockets
+                "netns": list of netns names for TCP or None for RDMA
+                "flags": OP_FLAG_TCP or OP_FLAG_RDMA, selects sockets
     """
 
-    sockets = [
-        netns_socket(netns_list[0], socket.AF_RDS, socket.SOCK_SEQPACKET),
-        netns_socket(netns_list[1], socket.AF_RDS, socket.SOCK_SEQPACKET),
-    ]
+    addrs = env["addrs"]
+    netns_list = env["netns"]
+    flags = env.get("flags", 0)
+
+    if (flags & OP_FLAG_TCP) and (flags & OP_FLAG_RDMA):
+        raise RuntimeError(f"Invalid transport flag sets multiple transports: {flags}")
+
+    if flags & OP_FLAG_TCP:
+        sockets = [
+            netns_socket(netns_list[0], socket.AF_RDS, socket.SOCK_SEQPACKET),
+            netns_socket(netns_list[1], socket.AF_RDS, socket.SOCK_SEQPACKET),
+        ]
+    elif flags & OP_FLAG_RDMA:
+        sockets = [
+            socket.socket(socket.AF_RDS, socket.SOCK_SEQPACKET),
+            socket.socket(socket.AF_RDS, socket.SOCK_SEQPACKET),
+        ]
+    else:
+        raise RuntimeError(f"Invalid transport flag sets no transports: {flags}")
 
     for s, addr in zip(sockets, addrs):
         s.bind(addr)
@@ -210,9 +250,10 @@ def snd_rcv_packets(addrs, netns_list):
             nr_recv = recv_burst(ep, sockets, addrs, recv_hashes, nr_recv)
 
         # exercise net/rds/tcp.c:rds_tcp_sysctl_reset()
-        for net in netns_list:
-            ip(f"netns exec {net} /usr/sbin/sysctl net.rds.tcp.rds_tcp_rcvbuf=10000")
-            ip(f"netns exec {net} /usr/sbin/sysctl net.rds.tcp.rds_tcp_sndbuf=10000")
+        if netns_list:
+            for net in netns_list:
+                ip(f"netns exec {net} /usr/sbin/sysctl net.rds.tcp.rds_tcp_rcvbuf=10000")
+                ip(f"netns exec {net} /usr/sbin/sysctl net.rds.tcp.rds_tcp_sndbuf=10000")
 
     ksft_pr("done", nr_send, nr_recv)
 
@@ -220,7 +261,13 @@ def snd_rcv_packets(addrs, netns_list):
 
     # We're done sending and receiving stuff, now let's check if what
     # we received is what we sent.
-    return verify_hashes(send_hashes, recv_hashes)
+    rc = verify_hashes(send_hashes, recv_hashes)
+
+    ep.close()
+    for s in sockets:
+        s.close()
+
+    return rc
 
 def stop_pcaps():
     """Stop tcpdump processes.
@@ -247,8 +294,8 @@ def signal_handler(_sig, _frame):
     """
     Test timed out signal handler
     """
-    ksft_pr("Test timed out")
-    print("not ok 1 rds selftest")
+    ksft_pr(f"Test timed out: {signal_handler_label}")
+    print(f"not ok {tap_idx} rds selftest {signal_handler_label}")
     sys.exit(1)
 
 def setup_tcp():
@@ -314,12 +361,107 @@ def teardown_tcp():
     cmd(f"ip netns del {NET0}", fail=False)
     cmd(f"ip netns del {NET1}", fail=False)
 
+def get_iface_mac(iface):
+    """Return the MAC address of a local network interface."""
+    out = subprocess.check_output(['ip', 'link', 'show', iface], text=True)
+    mac = re.search(r'link/ether\s+([0-9a-f:]+)', out)
+    if not mac:
+        raise RuntimeError(f"Cannot determine MAC address of {iface}")
+    return mac.group(1)
+
+def setup_rdma():
+    """
+    Configure rdma network
+    """
+
+    # remove links left over by previously interrupted run.
+    teardown_rdma()
+
+    # use call here since modprobe may fail if the rdma_rxe
+    # module is built-in
+    subprocess.call(['modprobe', 'rdma_rxe'],
+                    stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
+
+    ip(f"link add {VETH_RDMA0} type veth peer name {VETH_RDMA1}")
+
+    ip(f"link set {VETH_RDMA0} up")
+    ip(f"link set {VETH_RDMA1} up")
+
+    # Since both addresses are in the same namespace, the source address
+    # is always local, so enable accept_local
+    cmd(f"/usr/sbin/sysctl -q net.ipv4.conf.{VETH_RDMA0}.accept_local=1")
+    cmd(f"/usr/sbin/sysctl -q net.ipv4.conf.{VETH_RDMA1}.accept_local=1")
+
+    # Reverse path filters must be disabled so that the local routes don't
+    # cause RPF failures.
+    cmd(f"/usr/sbin/sysctl -q net.ipv4.conf.{VETH_RDMA0}.rp_filter=0")
+    cmd(f"/usr/sbin/sysctl -q net.ipv4.conf.{VETH_RDMA1}.rp_filter=0")
+
+    # add addresses
+    ip(f"addr add {rdma_addrs[0][0]}/32 dev {VETH_RDMA0}")
+    ip(f"addr add {rdma_addrs[1][0]}/32 dev {VETH_RDMA1}")
+
+    # add routes
+    ip(f"route add {rdma_addrs[1][0]}/32 dev {VETH_RDMA0}")
+    ip(f"route add {rdma_addrs[0][0]}/32 dev {VETH_RDMA1}")
+
+    # ARP will not resolve neighbor IPs on /32 routes without a subnet.
+    # Avoid this by adding neighbors directly so RDMA CM can populate path
+    # records with correct mac addrs without waiting for the ARP.
+    mac0 = get_iface_mac(VETH_RDMA0)
+    mac1 = get_iface_mac(VETH_RDMA1)
+    ip(f"neigh add {rdma_addrs[1][0]} lladdr {mac1} dev {VETH_RDMA0} nud permanent")
+    ip(f"neigh add {rdma_addrs[0][0]} lladdr {mac0} dev {VETH_RDMA1} nud permanent")
+
+    cmd(f'rdma link add {RXE_DEV0} type rxe netdev {VETH_RDMA0}')
+    cmd(f'rdma link add {RXE_DEV1} type rxe netdev {VETH_RDMA1}')
+
+    time.sleep(1)  # allow RXE devices to initialise
+
+    # Start a packet capture on each network
+    if logdir is not None:
+        for iface in [VETH_RDMA0, VETH_RDMA1]:
+            pcap = logdir+'/rds-roce-'+iface+'.pcap'
+
+            tcpdump_cmd = ['/usr/sbin/tcpdump']
+            sudo_user = os.environ.get('SUDO_USER')
+            if sudo_user:
+                tcpdump_cmd.extend(['-Z', sudo_user])
+            tcpdump_cmd.extend(['-i', iface, '-w', pcap])
+
+            # pylint: disable-next=consider-using-with
+            p = subprocess.Popen(tcpdump_cmd,
+                                 stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
+            tcpdump_procs.append(p)
+
+    # simulate packet loss, duplication and corruption
+    for iface in [VETH_RDMA0, VETH_RDMA1]:
+        cmd(f"/usr/sbin/tc qdisc add dev {iface} root netem  \
+             corrupt {PACKET_CORRUPTION} loss {PACKET_LOSS} duplicate  \
+             {PACKET_DUPLICATE}")
+
+def teardown_rdma():
+    """
+    Tear down the rdma network configured by setup_rdma().
+    """
+
+    # remove links left over by previously interrupted run.
+    cmd(f'rdma link del {RXE_DEV0}', fail=False)
+    cmd(f'rdma link del {RXE_DEV1}', fail=False)
+    cmd(f'ip link del {VETH_RDMA0}', fail=False)
+
+
 #Parse out command line arguments.  We take an optional
 # timeout parameter and an optional log output folder
 parser = argparse.ArgumentParser(description="init script args",
                   formatter_class=argparse.ArgumentDefaultsHelpFormatter)
 parser.add_argument("-d", "--logdir", action="store",
                     help="directory to store logs", default=None)
+parser.add_argument("-T", "--transport", default="tcp",
+                    help="Comma-separated list of transports to test: "
+                         "tcp, rdma, or tcp,rdma.  Each matching test "
+                         "is run once per transport.  "
+                         "'rdma' requires CONFIG_RDS_RDMA and rdma_rxe.")
 parser.add_argument('-t', '--timeout', help="timeout to terminate hung test",
                     type=int, default=0)
 parser.add_argument('-l', '--loss', help="Simulate tcp packet loss",
@@ -334,31 +476,63 @@ PACKET_LOSS=str(args.loss)+'%'
 PACKET_CORRUPTION=str(args.corruption)+'%'
 PACKET_DUPLICATE=str(args.duplicate)+'%'
 
-# Register cleanup before setup so a partial-setup crash still tears down
-# whatever state did get created.
-atexit.register(teardown_tcp)
+# check transport is either tcp or rdma
+transports = [t.strip() for t in args.transport.split(',')]
+for t in transports:
+    if t not in ('tcp', 'rdma'):
+        raise SystemExit(f"test.py: unknown transport: {t!r}")
+
+# Register stop_pcaps before any network setups so that any partially setup
+# tcpdumps are still cleaned up on error
 atexit.register(stop_pcaps)
 
-setup_tcp()
+# Set up all requested transports upfront so network plumbing is
+# ready before any test runs.
+transport_envs = {}
+FLAGS = 0
+if 'tcp' in transports:
+    # Register cleanups before setups to handle partial setups that error'd out
+    atexit.register(teardown_tcp)
+    setup_tcp()
+    transport_envs['tcp'] = {
+        'addrs': tcp_addrs,
+        'netns': [NET0, NET1],
+        'flags': FLAGS | OP_FLAG_TCP,
+    }
+
+if 'rdma' in transports:
+    atexit.register(teardown_rdma)
+    setup_rdma()
+    transport_envs['rdma'] = {
+        'addrs': rdma_addrs,
+        'netns': None,
+        'flags': FLAGS | OP_FLAG_RDMA,
+    }
 
 print("TAP version 13")
-print("1..1")
+print(f"1..{len(transport_envs)}")
+
+for transport, tenv in transport_envs.items():
+    tap_idx += 1
 
-# add a timeout
-if args.timeout > 0:
-    signal.alarm(args.timeout)
-    signal.signal(signal.SIGALRM, signal_handler)
+    # add a timeout
+    if args.timeout > 0:
+        signal_handler_label = transport
+        signal.alarm(args.timeout)
+        signal.signal(signal.SIGALRM, signal_handler)
 
-ret = snd_rcv_packets(tcp_addrs, [NET0, NET1])
+    ret = snd_rcv_packets(tenv)
 
-# cancel timeout
-signal.alarm(0)
+    # cancel timeout
+    signal.alarm(0)
 
-if ret == 0:
-    ksft_pr("Success")
-    print("ok 1 rds selftest")
-else:
-    print("not ok 1 rds selftest")
+    if ret == 0:
+        ksft_pr("Success")
+        print(f"ok {tap_idx} rds selftest {transport}")
+        nr_pass += 1
+    else:
+        print(f"not ok {tap_idx} rds selftest {transport}")
+        nr_fail += 1
 
-ksft_pr(f"Totals: pass:{1-ret} fail:{ret} skip:0")
-sys.exit(ret)
+ksft_pr(f"Totals: pass:{nr_pass} fail:{nr_fail} skip:0")
+sys.exit(1 if nr_fail else 0)
-- 
2.25.1



* [PATCH net-next v3 11/11] selftests: rds: Add ROCE support to run.sh
  2026-05-18  1:24 [PATCH net-next v3 00/11] selftests: rds: Add ROCE support to rds selftests Allison Henderson
                   ` (9 preceding siblings ...)
  2026-05-18  1:24 ` [PATCH net-next v3 10/11] selftests: rds: Add ROCE support to test.py Allison Henderson
@ 2026-05-18  1:24 ` Allison Henderson
  10 siblings, 0 replies; 12+ messages in thread
From: Allison Henderson @ 2026-05-18  1:24 UTC (permalink / raw)
  To: netdev, pabeni, edumazet, kuba, horms, linux-rdma, achender,
	linux-kselftest, shuah

This patch adds support for testing rds rdma over ROCE.  A new
-r flag is added to config.sh, which enables the required kernel
configs for rdma.  We also add a -T flag to run.sh, which takes a
comma-separated list of transports: tcp, rdma, or tcp,rdma.  When
rdma is requested, run.sh checks that the required configs are
enabled and that the rdma tool is available, and skips the test
otherwise.  The flag is then passed to test.py, which runs the test
over the specified transport(s).

Signed-off-by: Allison Henderson <achender@kernel.org>
---
 tools/testing/selftests/net/rds/README.txt | 29 ++++++++----
 tools/testing/selftests/net/rds/config.sh  | 15 +++++-
 tools/testing/selftests/net/rds/run.sh     | 53 +++++++++++++++++++++-
 3 files changed, 84 insertions(+), 13 deletions(-)
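
For anyone who wants to check a .config by hand before running with
-T rdma, the config check can be approximated in a few lines of Python.
This is a sketch only: missing_configs() is a hypothetical helper, not
something run.sh or test.py provides, and it does exact line matching in
place of the grep -x call used by check_rdma_conf_enabled().

```python
def missing_configs(kconfig_text, required):
    """Return the options in `required` that are not set to =y.

    A line must equal "<OPTION>=y" exactly, so "=m" lines and
    "# ... is not set" lines count as missing.
    """
    lines = set(kconfig_text.splitlines())
    return [opt for opt in required if f"{opt}=y" not in lines]
```

An empty result for CONFIG_RDMA_RXE and CONFIG_RDS_RDMA corresponds to
the case where run.sh goes on to run the rdma transport instead of
exiting with the SKIP code.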

diff --git a/tools/testing/selftests/net/rds/README.txt b/tools/testing/selftests/net/rds/README.txt
index 295dc82c0770f..bac6f15a80d52 100644
--- a/tools/testing/selftests/net/rds/README.txt
+++ b/tools/testing/selftests/net/rds/README.txt
@@ -1,18 +1,22 @@
 RDS self-tests
 ==============
 
-These scripts provide a coverage test for RDS-TCP by creating two
-network namespaces and running rds packets between them. A loopback
-network is provisioned with optional probability of packet loss or
-corruption. A workload of 50000 hashes, each 64 characters in size,
-are passed over an RDS socket on this test network. A passing test means
-the RDS-TCP stack was able to recover properly.  The provided config.sh
-can be used to compile the kernel with the necessary gcov options.  The
-kernel may optionally be configured to omit the coverage report as well.
+These scripts provide a coverage test for RDS-TCP and RDS-RDMA (over
+RoCE/RXE) by setting up two endpoints and running RDS packets between
+them. The TCP path creates two network namespaces; the RDMA path uses
+an RXE (soft RoCE) device backed by a veth pair.  A workload of 50000
+hashes, each 64 characters in size, is passed over an RDS socket on
+this test network with an optional probability of packet loss or
+corruption.  A passing test means the RDS stack was able to recover
+properly.  The provided config.sh can be used to compile the kernel
+with the necessary gcov options; pass -r to also enable the kernel
+configs required for the RDMA transport.  The kernel may optionally be
+configured to omit the coverage report as well.
 
 USAGE:
 	run.sh [-d logdir] [-l packet_loss] [-c packet_corruption]
 	       [-u packet_duplicate] [-t timeout]
+	       [-T tcp|rdma|tcp,rdma]
 
 OPTIONS:
 	-d	Log directory.  If set, logs will be stored in the
@@ -27,6 +31,10 @@ OPTIONS:
 
 	-t	Test timeout.  Defaults to tools/testing/selftests/net/rds/settings
 
+	-T	Comma-separated list of transports to test.  Accepts
+		"tcp", "rdma", or "tcp,rdma".  Defaults to "tcp".  Use
+		config.sh -r to enable required RDMA configs
+
 ENV VARIABLES:
 	RDS_LOG_DIR	Log directory.  If set, logs will be stored in
 			the given dir, or skipped if unset. Log dir
@@ -48,6 +56,9 @@ EXAMPLE:
     # Create a suitable gcov enabled .config
     tools/testing/selftests/net/rds/config.sh -g
 
+    # Optionally add RDMA configs (CONFIG_RDS_RDMA, CONFIG_RDMA_RXE)
+    tools/testing/selftests/net/rds/config.sh -r
+
     # Alternatly create a gcov disabled .config
     tools/testing/selftests/net/rds/config.sh
 
@@ -62,5 +73,5 @@ EXAMPLE:
         "export PYTHONPATH=tools/testing/selftests/net/; \
          export SUDO_USER=example_user; \
          export RDS_LOG_DIR=tools/testing/selftests/net/rds/rds_logs; \
-         tools/testing/selftests/net/rds/run.sh"
+         tools/testing/selftests/net/rds/run.sh -T tcp,rdma"
 
diff --git a/tools/testing/selftests/net/rds/config.sh b/tools/testing/selftests/net/rds/config.sh
index 29a79314dd60f..be0668359a070 100755
--- a/tools/testing/selftests/net/rds/config.sh
+++ b/tools/testing/selftests/net/rds/config.sh
@@ -10,7 +10,8 @@ CONF_FILE=""
 FLAGS=()
 
 GENERATE_GCOV_REPORT=0
-while getopts "gc:" opt; do
+ENABLE_RDMA=0
+while getopts "gc:r" opt; do
   case ${opt} in
     g)
       GENERATE_GCOV_REPORT=1
@@ -18,8 +19,11 @@ while getopts "gc:" opt; do
     c)
       CONF_FILE=$OPTARG
       ;;
+    r)
+      ENABLE_RDMA=1
+      ;;
     :)
-      echo "USAGE: config.sh [-g] [-c config]"
+      echo "USAGE: config.sh [-g] [-c config] [-r]"
       exit 1
       ;;
     ?)
@@ -58,3 +62,10 @@ scripts/config "${FLAGS[@]}" --enable CONFIG_VETH
 # simulate packet loss
 scripts/config "${FLAGS[@]}" --enable CONFIG_NET_SCH_NETEM
 
+if [ "$ENABLE_RDMA" -eq 1 ]; then
+	# enable RDS over InfiniBand / RDMA (rds_rdma test)
+	scripts/config "${FLAGS[@]}" --enable CONFIG_INFINIBAND
+	scripts/config "${FLAGS[@]}" --enable CONFIG_INFINIBAND_ADDR_TRANS
+	scripts/config "${FLAGS[@]}" --enable CONFIG_RDMA_RXE
+	scripts/config "${FLAGS[@]}" --enable CONFIG_RDS_RDMA
+fi
diff --git a/tools/testing/selftests/net/rds/run.sh b/tools/testing/selftests/net/rds/run.sh
index 424fd57401d88..07af2f927a2a7 100755
--- a/tools/testing/selftests/net/rds/run.sh
+++ b/tools/testing/selftests/net/rds/run.sh
@@ -101,6 +101,16 @@ check_conf_enabled() {
 		exit 4
 	fi
 }
+
+check_rdma_conf_enabled() {
+	if ! grep -x "$1=y" "$kconfig" > /dev/null 2>&1; then
+		echo "selftests: [SKIP] rdma transport requires $1 enabled"
+		echo "To enable, run " \
+		     "tools/testing/selftests/net/rds/config.sh -r and rebuild"
+		exit 4
+	fi
+}
+
 check_conf_disabled() {
 	if grep -x "$1=y" "$kconfig" > /dev/null 2>&1; then
 		echo "selftests: [SKIP] This test requires $1 disabled"
@@ -117,6 +127,28 @@ check_conf() {
 	check_conf_disabled CONFIG_MODULES
 }
 
+# Check kernel config and host environment for RDS-RDMA support.
+# Exits with SKIP (4) if the user requested rdma but prerequisites
+# are not met.
+check_rdma_conf()
+{
+	case "$TRANSPORT" in
+	  *rdma*) ;;
+	  *) return ;;
+	esac
+
+	# Kconfig will enforce CONFIG_INFINIBAND_* as dependencies
+	# of CONFIG_RDMA_RXE
+	check_rdma_conf_enabled CONFIG_RDMA_RXE
+	check_rdma_conf_enabled CONFIG_RDS_RDMA
+
+	if ! which rdma > /dev/null 2>&1; then
+		echo "selftests: [SKIP] rdma transport requires the 'rdma'" \
+		      " tool (iproute2)"
+		exit 4
+	fi
+}
+
 check_env()
 {
 	if ! test -d "$obj_dir"; then
@@ -153,8 +185,10 @@ check_env()
 LOG_DIR="${RDS_LOG_DIR:-}"
 TIMEOUT=$timeout
 GENERATE_GCOV_REPORT=1
+TRANSPORT=tcp
 FLAGS=()
-while getopts "d:l:c:u:t:" opt; do
+
+while getopts "d:l:c:u:t:T:" opt; do
   case ${opt} in
     d)
       LOG_DIR=${OPTARG}
@@ -171,9 +205,12 @@ while getopts "d:l:c:u:t:" opt; do
     u)
       FLAGS+=("-u" "${OPTARG}")
       ;;
+    T)
+      TRANSPORT=${OPTARG}
+      ;;
     :)
       echo "USAGE: run.sh [-d logdir] [-l packet_loss] [-c packet_corruption]" \
-           "[-u packet_duplicate] [-t timeout]"
+           "[-u packet_duplicate] [-t timeout] [-T tcp|rdma|tcp,rdma]"
       exit 1
       ;;
     ?)
@@ -183,9 +220,21 @@ while getopts "d:l:c:u:t:" opt; do
   esac
 done
 
+# Validate transport tokens
+IFS=',' read -ra transports <<< "$TRANSPORT"
+for t in "${transports[@]}"; do
+    if [ "$t" != "tcp" ] && [ "$t" != "rdma" ]; then
+        echo "run.sh: unknown transport '$t' (expected tcp or rdma)"
+        exit 1
+    fi
+done
+
+FLAGS+=("--transport" "${TRANSPORT}")
+
 check_env
 check_conf
 check_gcov_conf
+check_rdma_conf
 
 TRACE_CMD=()
 if [[ -n "$LOG_DIR" ]]; then
-- 
2.25.1



Thread overview: 12+ messages
2026-05-18  1:24 [PATCH net-next v3 00/11] selftests: rds: Add ROCE support to rds selftests Allison Henderson
2026-05-18  1:24 ` [PATCH net-next v3 01/11] net/rds: Don't sleep inside rds_ib_conn_path_shutdown Allison Henderson
2026-05-18  1:24 ` [PATCH net-next v3 02/11] selftests: rds: Add helper function setup_tcp() in test.py Allison Henderson
2026-05-18  1:24 ` [PATCH net-next v3 03/11] selftests: rds: Add helper function check_info() " Allison Henderson
2026-05-18  1:24 ` [PATCH net-next v3 04/11] selftests: rds: Add helper function send_burst() " Allison Henderson
2026-05-18  1:24 ` [PATCH net-next v3 05/11] selftests: rds: Add helper function recv_burst() " Allison Henderson
2026-05-18  1:24 ` [PATCH net-next v3 06/11] selftests: rds: Add helper function verify_hashes() " Allison Henderson
2026-05-18  1:24 ` [PATCH net-next v3 07/11] selftests: rds: Add helper function snd_rcv_packets() " Allison Henderson
2026-05-18  1:24 ` [PATCH net-next v3 08/11] selftests: rds: Handle errors in netns_socket Allison Henderson
2026-05-18  1:24 ` [PATCH net-next v3 09/11] selftests: rds: Register network teardown via atexit Allison Henderson
2026-05-18  1:24 ` [PATCH net-next v3 10/11] selftests: rds: Add ROCE support to test.py Allison Henderson
2026-05-18  1:24 ` [PATCH net-next v3 11/11] selftests: rds: Add ROCE support to run.sh Allison Henderson
