The Linux Kernel Mailing List
 help / color / mirror / Atom feed
From: Michael Bommarito <michael.bommarito@gmail.com>
To: Bernard Metzler <bernard.metzler@linux.dev>,
	Jason Gunthorpe <jgg@ziepe.ca>, Leon Romanovsky <leon@kernel.org>,
	linux-rdma@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Subject: [PATCH 2/2] RDMA/siw: add KUnit tests for MPA receive parsing
Date: Wed, 13 May 2026 13:53:25 -0400	[thread overview]
Message-ID: <20260513175325.2042630-3-michael.bommarito@gmail.com> (raw)
In-Reply-To: <20260513175325.2042630-1-michael.bommarito@gmail.com>

Add a KUnit suite (CONFIG_SIW_MPA_RX_KUNIT_TEST) that exercises the
real siw_tcp_rx_data() path with three cases covering the MPA length
validation added in the previous patch:

  - siw_mpa_write_underflow_rejected
      Constructs an sk_buff carrying a tagged RDMA WRITE FPDU whose
      mpa_len is one below iwarp_pktinfo[opcode].hdr_len -
      MPA_HDR_SIZE.  Registers a REMOTE_WRITE MR in mem_xa so the
      WRITE path would otherwise reach siw_proc_write(), and calls
      siw_tcp_rx_data() directly.  Asserts the FPDU is rejected with
      TERM(LLP/MPA/FPDU_START) and rx_suspend = 1.

  - siw_mpa_write_minimum_valid_accepted
      Regression control with mpa_len = hdr_len - MPA_HDR_SIZE (the
      smallest legal value, i.e. a zero-length WRITE).  Asserts the
      new check does not fire: no terminate, rx_stream not
      suspended.

  - siw_mpa_write_underflow_rejected_live_socket
      Opens a loopback AF_INET socketpair via sock_create_kern(),
      attaches a struct siw_cep as sk_user_data so sk_to_qp()
      resolves to the test QP, and installs siw_qp_llp_data_ready as
      sk_data_ready on the victim socket.  Writes the malformed FPDU
      via kernel_sendmsg from the attacker side.  The kernel TCP
      stack delivers, sk_data_ready fires in softirq, and
      tcp_read_sock dispatches to siw_tcp_rx_data the same way a
      remote peer would.  Asserts the same terminate state as the
      first case.

The third case is the design driver: it confirms the bug-fix
codepath fires from a real softirq RX entry, not just a synthetic
direct call.  On a stock siw tree the same harness reproduces the
KASAN slab-out-of-bounds / use-after-free in skb_copy_bits.

Bringing siw's loopback netdev up (case 3 binds 127.0.0.1) is done
inline via dev_change_flags() under rtnl_lock since the KUnit
environment does not run init scripts.

Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Assisted-by: Claude:claude-opus-4-7
---
 drivers/infiniband/sw/siw/Kconfig            |  18 +
 drivers/infiniband/sw/siw/Makefile           |   2 +
 drivers/infiniband/sw/siw/siw_mpa_rx_kunit.c | 349 +++++++++++++++++++
 3 files changed, 369 insertions(+)
 create mode 100644 drivers/infiniband/sw/siw/siw_mpa_rx_kunit.c

diff --git a/drivers/infiniband/sw/siw/Kconfig b/drivers/infiniband/sw/siw/Kconfig
index 186f182b80e7..b137f5920271 100644
--- a/drivers/infiniband/sw/siw/Kconfig
+++ b/drivers/infiniband/sw/siw/Kconfig
@@ -18,3 +18,21 @@ config RDMA_SIW
 	space verbs API, libibverbs. To implement RDMA over
 	TCP/IP, the driver further interfaces with the Linux
 	in-kernel TCP socket layer.
+
+config SIW_MPA_RX_KUNIT_TEST
+	bool "KUnit tests for Soft-iWARP MPA receive parsing" if !KUNIT_ALL_TESTS
+	depends on KUNIT && RDMA_SIW
+	default KUNIT_ALL_TESTS
+	help
+	  Build KUnit regression tests for the Soft-iWARP MPA receive
+	  state machine. The tests cover the MPA length consistency
+	  check in siw_get_hdr(): malformed FPDUs whose mpa_len is
+	  below the opcode's fixed DDP/RDMAP header must be rejected
+	  with TERM(LLP/MPA/FPDU_START); the minimum-valid mpa_len
+	  (zero-length WRITE) must still be accepted. One case drives
+	  the real kernel TCP receive path via a loopback socketpair
+	  so the same softirq sk_data_ready -> tcp_read_sock ->
+	  siw_tcp_rx_data chain a remote peer would exercise is
+	  covered.
+
+	  If unsure, say N.
diff --git a/drivers/infiniband/sw/siw/Makefile b/drivers/infiniband/sw/siw/Makefile
index f5f7e3867889..09d4c90d8758 100644
--- a/drivers/infiniband/sw/siw/Makefile
+++ b/drivers/infiniband/sw/siw/Makefile
@@ -9,3 +9,5 @@ siw-y := \
 	siw_qp_tx.o \
 	siw_qp_rx.o \
 	siw_verbs.o
+
+siw-$(CONFIG_SIW_MPA_RX_KUNIT_TEST) += siw_mpa_rx_kunit.o
diff --git a/drivers/infiniband/sw/siw/siw_mpa_rx_kunit.c b/drivers/infiniband/sw/siw/siw_mpa_rx_kunit.c
new file mode 100644
index 000000000000..204b3213b840
--- /dev/null
+++ b/drivers/infiniband/sw/siw/siw_mpa_rx_kunit.c
@@ -0,0 +1,349 @@
+// SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause
+/*
+ * KUnit harness for siw MPA receive-state length validation.
+ *
+ * Internal to the SIW_MPA_LEN_UNDERFLOW_RX_COPY disclosure validation tree.
+ * Not part of the upstream patch.
+ *
+ *   case 1: short mpa_len triggers the new siw_get_hdr() check via direct
+ *           siw_tcp_rx_data() call with a constructed sk_buff
+ *           - expects: TERM(LLP/MPA/FPDU_START), rx_suspend=1
+ *           - under stock siw: KASAN slab-out-of-bounds in skb_copy_bits()
+ *           - under patched siw: no splat, terminate state set
+ *
+ *   case 2: minimum-valid mpa_len control (constructed sk_buff)
+ *           - mpa_len = hdr_len - MPA_HDR_SIZE  ->  fpdu_part_rem = 0
+ *             so siw_proc_write() takes the zero-length WRITE short path
+ *             and returns 0 without calling skb_copy_bits().
+ *           - expects: no TERM, state machine progressed normally
+ *
+ *   case 3: real loopback TCP socketpair (the "live two-node" analog)
+ *           - opens AF_INET TCP sockets in-kernel via sock_create_kern()
+ *           - binds/listens on 127.0.0.1:0, connects, accepts
+ *           - installs siw_qp_llp_data_ready on the victim socket and
+ *             attaches a struct siw_cep so sk_to_qp() resolves to our qp
+ *           - writes the malformed FPDU bytes via kernel_sendmsg on the
+ *             attacker socket
+ *           - the kernel TCP stack delivers, sk_data_ready fires, and
+ *             siw_qp_llp_data_ready -> tcp_read_sock -> siw_tcp_rx_data
+ *             runs in the normal kernel receive path
+ *           - expects: TERM(LLP/MPA/FPDU_START) on the qp
+ */
+
+#include <kunit/test.h>
+#include <linux/inet.h>
+#include <linux/in.h>
+#include <linux/netdevice.h>
+#include <linux/rtnetlink.h>
+#include <linux/skbuff.h>
+#include <linux/tcp.h>
+#include <linux/wait.h>
+#include <linux/xarray.h>
+#include <net/sock.h>
+#include <net/tcp.h>
+#include <rdma/ib_verbs.h>
+
+#include "siw.h"
+#include "siw_cm.h"
+#include "siw_mem.h"
+
+static void siw_kunit_kfree_skb(void *skb)
+{
+	kfree_skb(skb);
+}
+
+struct siw_mpa_rx_ctx {
+	struct siw_device *sdev;
+	struct siw_qp *qp;
+	struct siw_mem *mem;
+	void *target;
+	u32 stag;
+};
+
+static void siw_mpa_rx_setup(struct kunit *test, struct siw_mpa_rx_ctx *c)
+{
+	void *xa_ret;
+
+	c->stag = 0x00000100;
+
+	c->sdev = kunit_kzalloc(test, sizeof(*c->sdev), GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, c->sdev);
+	xa_init_flags(&c->sdev->mem_xa, XA_FLAGS_ALLOC1);
+
+	c->qp = kunit_kzalloc(test, sizeof(*c->qp), GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, c->qp);
+	c->qp->sdev = c->sdev;
+	c->qp->pd = kunit_kzalloc(test, sizeof(*c->qp->pd), GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, c->qp->pd);
+	c->qp->rx_stream.state = SIW_GET_HDR;
+
+	c->target = kunit_kzalloc(test, 64, GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, c->target);
+
+	c->mem = kunit_kzalloc(test, sizeof(*c->mem), GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, c->mem);
+	kref_init(&c->mem->ref);
+	c->mem->sdev = c->sdev;
+	c->mem->stag = c->stag;
+	c->mem->stag_valid = 1;
+	c->mem->va = (u64)(uintptr_t)c->target;
+	c->mem->len = 64;
+	c->mem->pd = c->qp->pd;
+	c->mem->perms = IB_ACCESS_REMOTE_WRITE;
+
+	xa_ret = xa_store(&c->sdev->mem_xa, c->stag >> 8, c->mem, GFP_KERNEL);
+	KUNIT_ASSERT_FALSE(test, xa_is_err(xa_ret));
+}
+
+static void siw_mpa_rx_run(struct kunit *test, struct siw_mpa_rx_ctx *c,
+			   u16 mpa_len_val)
+{
+	struct iwarp_rdma_write write = { };
+	struct sk_buff *skb;
+	read_descriptor_t rd_desc = { };
+	u8 payload[sizeof(write) + 1];
+
+	write.ctrl.mpa_len = cpu_to_be16(mpa_len_val);
+	write.ctrl.ddp_rdmap_ctrl = DDP_FLAG_TAGGED | DDP_FLAG_LAST |
+		cpu_to_be16(DDP_VERSION << 8) |
+		cpu_to_be16(RDMAP_VERSION << 6) |
+		cpu_to_be16(RDMAP_RDMA_WRITE);
+	write.sink_stag = cpu_to_be32(c->stag);
+	write.sink_to = cpu_to_be64((u64)(uintptr_t)c->target);
+
+	memcpy(payload, &write, sizeof(write));
+	payload[sizeof(write)] = 0x41;
+
+	skb = alloc_skb(sizeof(payload), GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, skb);
+	skb_put_data(skb, payload, sizeof(payload));
+	kunit_add_action_or_reset(test, siw_kunit_kfree_skb, skb);
+
+	rd_desc.arg.data = c->qp;
+	rd_desc.count = sizeof(payload);
+
+	siw_tcp_rx_data(&rd_desc, skb, 0, sizeof(payload));
+}
+
+static void siw_mpa_write_underflow_rejected(struct kunit *test)
+{
+	struct siw_mpa_rx_ctx c;
+
+	siw_mpa_rx_setup(test, &c);
+
+	/* mpa_len one byte short of the WRITE hdr_len - MPA_HDR_SIZE floor. */
+	siw_mpa_rx_run(test, &c,
+		       sizeof(struct iwarp_rdma_write) - MPA_HDR_SIZE - 1);
+
+	KUNIT_EXPECT_EQ(test, (int)c.qp->term_info.valid, 1);
+	KUNIT_EXPECT_EQ(test, (int)c.qp->term_info.layer,
+			(int)TERM_ERROR_LAYER_LLP);
+	KUNIT_EXPECT_EQ(test, (int)c.qp->term_info.etype,
+			(int)LLP_ETYPE_MPA);
+	KUNIT_EXPECT_EQ(test, (int)c.qp->term_info.ecode,
+			(int)LLP_ECODE_FPDU_START);
+	KUNIT_EXPECT_EQ(test, (int)c.qp->rx_stream.rx_suspend, 1);
+}
+
+static void siw_mpa_write_minimum_valid_accepted(struct kunit *test)
+{
+	struct siw_mpa_rx_ctx c;
+
+	siw_mpa_rx_setup(test, &c);
+
+	/*
+	 * mpa_len == hdr_len - MPA_HDR_SIZE is the smallest legal value;
+	 * it yields fpdu_part_rem = 0, i.e. a zero-length WRITE. The new
+	 * length check in siw_get_hdr() must accept this.
+	 */
+	siw_mpa_rx_run(test, &c,
+		       sizeof(struct iwarp_rdma_write) - MPA_HDR_SIZE);
+
+	KUNIT_EXPECT_EQ(test, (int)c.qp->term_info.valid, 0);
+	KUNIT_EXPECT_EQ(test, (int)c.qp->rx_stream.rx_suspend, 0);
+}
+
+static int siw_mpa_rx_bring_up_lo(struct kunit *test)
+{
+	struct net_device *lo;
+	int rv;
+
+	rtnl_lock();
+	lo = __dev_get_by_name(&init_net, "lo");
+	KUNIT_ASSERT_NOT_NULL(test, lo);
+	if (!(lo->flags & IFF_UP))
+		rv = dev_change_flags(lo, lo->flags | IFF_UP, NULL);
+	else
+		rv = 0;
+	rtnl_unlock();
+	KUNIT_ASSERT_EQ(test, rv, 0);
+	return 0;
+}
+
+static int siw_mpa_rx_make_pair(struct kunit *test, struct socket **listen,
+				struct socket **server, struct socket **client)
+{
+	struct sockaddr_in addr = { .sin_family = AF_INET, };
+	struct sockaddr_in bound = { };
+	struct socket *l = NULL, *s = NULL, *c = NULL;
+	int rv;
+
+	siw_mpa_rx_bring_up_lo(test);
+
+	rv = sock_create_kern(&init_net, AF_INET, SOCK_STREAM, IPPROTO_TCP, &l);
+	KUNIT_ASSERT_EQ(test, rv, 0);
+
+	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
+	addr.sin_port = 0;
+	rv = kernel_bind(l, (struct sockaddr_unsized *)&addr, sizeof(addr));
+	KUNIT_ASSERT_EQ(test, rv, 0);
+
+	rv = l->ops->getname(l, (struct sockaddr *)&bound, 0);
+	KUNIT_ASSERT_GT(test, rv, 0);
+
+	rv = kernel_listen(l, 1);
+	KUNIT_ASSERT_EQ(test, rv, 0);
+
+	rv = sock_create_kern(&init_net, AF_INET, SOCK_STREAM, IPPROTO_TCP, &c);
+	KUNIT_ASSERT_EQ(test, rv, 0);
+
+	rv = kernel_connect(c, (struct sockaddr_unsized *)&bound,
+			    sizeof(bound), 0);
+	KUNIT_ASSERT_EQ(test, rv, 0);
+
+	rv = kernel_accept(l, &s, 0);
+	KUNIT_ASSERT_EQ(test, rv, 0);
+
+	*listen = l;
+	*server = s;
+	*client = c;
+	return 0;
+}
+
+static void siw_mpa_write_underflow_rejected_live_socket(struct kunit *test)
+{
+	struct siw_device *sdev;
+	struct siw_qp *qp;
+	struct siw_cep *cep;
+	struct siw_mem *mem;
+	struct socket *listen_sock = NULL, *server_sock = NULL, *client_sock = NULL;
+	struct iwarp_rdma_write write = { };
+	struct kvec iov;
+	struct msghdr msg = { };
+	void *xa_ret, *target;
+	u8 payload[sizeof(write) + 1];
+	u32 stag = 0x00000100;
+	int rv, i;
+
+	sdev = kunit_kzalloc(test, sizeof(*sdev), GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, sdev);
+	xa_init_flags(&sdev->mem_xa, XA_FLAGS_ALLOC1);
+
+	qp = kunit_kzalloc(test, sizeof(*qp), GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, qp);
+	qp->sdev = sdev;
+	qp->pd = kunit_kzalloc(test, sizeof(*qp->pd), GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, qp->pd);
+	qp->rx_stream.state = SIW_GET_HDR;
+	init_rwsem(&qp->state_lock);
+	qp->attrs.state = SIW_QP_STATE_RTS;
+	qp->cep = NULL;
+
+	/* Register a valid REMOTE_WRITE memory object. On unpatched siw
+	 * this is what lets the negative-length copy reach skb_copy_bits;
+	 * without an MR the STag lookup in siw_proc_write() returns NULL
+	 * and the WRITE is terminated before the underflow primitive fires.
+	 * With this patch in place, the new siw_get_hdr() check rejects
+	 * the FPDU BEFORE STag lookup, so the MR is unused. We keep it in
+	 * the test so unpatched-kernel reruns also exercise the full path.
+	 */
+	target = kunit_kzalloc(test, 64, GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, target);
+	mem = kunit_kzalloc(test, sizeof(*mem), GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, mem);
+	kref_init(&mem->ref);
+	mem->sdev = sdev;
+	mem->stag = stag;
+	mem->stag_valid = 1;
+	mem->va = (u64)(uintptr_t)target;
+	mem->len = 64;
+	mem->pd = qp->pd;
+	mem->perms = IB_ACCESS_REMOTE_WRITE;
+	xa_ret = xa_store(&sdev->mem_xa, stag >> 8, mem, GFP_KERNEL);
+	KUNIT_ASSERT_FALSE(test, xa_is_err(xa_ret));
+
+	/* siw_qp_llp_data_ready dereferences sk_user_data as siw_cep. */
+	cep = kunit_kzalloc(test, sizeof(*cep), GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, cep);
+	cep->qp = qp;
+	spin_lock_init(&cep->lock);
+	kref_init(&cep->ref);
+
+	rv = siw_mpa_rx_make_pair(test, &listen_sock, &server_sock, &client_sock);
+	KUNIT_ASSERT_EQ(test, rv, 0);
+
+	write_lock_bh(&server_sock->sk->sk_callback_lock);
+	server_sock->sk->sk_user_data = cep;
+	server_sock->sk->sk_data_ready = siw_qp_llp_data_ready;
+	qp->attrs.sk = server_sock;
+	write_unlock_bh(&server_sock->sk->sk_callback_lock);
+
+	write.ctrl.mpa_len =
+		cpu_to_be16(sizeof(write) - MPA_HDR_SIZE - 1);
+	write.ctrl.ddp_rdmap_ctrl = DDP_FLAG_TAGGED | DDP_FLAG_LAST |
+		cpu_to_be16(DDP_VERSION << 8) |
+		cpu_to_be16(RDMAP_VERSION << 6) |
+		cpu_to_be16(RDMAP_RDMA_WRITE);
+	write.sink_stag = cpu_to_be32(stag);
+	write.sink_to = cpu_to_be64((u64)(uintptr_t)target);
+
+	memcpy(payload, &write, sizeof(write));
+	payload[sizeof(write)] = 0x41;
+
+	iov.iov_base = payload;
+	iov.iov_len = sizeof(payload);
+	rv = kernel_sendmsg(client_sock, &msg, &iov, 1, sizeof(payload));
+	KUNIT_ASSERT_EQ(test, rv, (int)sizeof(payload));
+
+	/* Wait for TCP to deliver bytes and sk_data_ready to fire. */
+	for (i = 0; i < 200; i++) {
+		if (qp->term_info.valid)
+			break;
+		msleep(20);
+	}
+
+	KUNIT_EXPECT_EQ(test, (int)qp->term_info.valid, 1);
+	KUNIT_EXPECT_EQ(test, (int)qp->term_info.layer,
+			(int)TERM_ERROR_LAYER_LLP);
+	KUNIT_EXPECT_EQ(test, (int)qp->term_info.etype,
+			(int)LLP_ETYPE_MPA);
+	KUNIT_EXPECT_EQ(test, (int)qp->term_info.ecode,
+			(int)LLP_ECODE_FPDU_START);
+	KUNIT_EXPECT_EQ(test, (int)qp->rx_stream.rx_suspend, 1);
+
+	/* Detach our handler before tearing down sockets so the TCP stack
+	 * cannot call into freed kunit memory after the test.
+	 */
+	write_lock_bh(&server_sock->sk->sk_callback_lock);
+	server_sock->sk->sk_user_data = NULL;
+	server_sock->sk->sk_data_ready = sock_def_readable;
+	write_unlock_bh(&server_sock->sk->sk_callback_lock);
+
+	sock_release(client_sock);
+	sock_release(server_sock);
+	sock_release(listen_sock);
+}
+
+static struct kunit_case siw_mpa_rx_cases[] = {
+	KUNIT_CASE(siw_mpa_write_underflow_rejected),
+	KUNIT_CASE(siw_mpa_write_minimum_valid_accepted),
+	KUNIT_CASE(siw_mpa_write_underflow_rejected_live_socket),
+	{ }
+};
+
+static struct kunit_suite siw_mpa_rx_suite = {
+	.name = "siw_mpa_rx",
+	.test_cases = siw_mpa_rx_cases,
+};
+
+kunit_test_suite(siw_mpa_rx_suite);
-- 
2.53.0


      parent reply	other threads:[~2026-05-13 17:53 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-13 17:53 [PATCH 0/2] RDMA/siw: fix MPA FPDU length underflow + add KUnit coverage Michael Bommarito
2026-05-13 17:53 ` [PATCH 1/2] RDMA/siw: reject MPA FPDU length underflow before signed receive math Michael Bommarito
2026-05-13 17:53 ` Michael Bommarito [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260513175325.2042630-3-michael.bommarito@gmail.com \
    --to=michael.bommarito@gmail.com \
    --cc=bernard.metzler@linux.dev \
    --cc=jgg@ziepe.ca \
    --cc=leon@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox