linux-rdma.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/8] RDMA/siw: [re-]introduce module parameters and add MPA V1
@ 2025-07-03 17:26 Stefan Metzmacher
  2025-07-03 17:26 ` [PATCH 1/8] RDMA/siw: make mpa_version = MPA_REVISION_2 const Stefan Metzmacher
                   ` (9 more replies)
  0 siblings, 10 replies; 13+ messages in thread
From: Stefan Metzmacher @ 2025-07-03 17:26 UTC (permalink / raw)
  To: linux-rdma; +Cc: metze, Bernard Metzler

While working on smbdirect cleanup I'm using siw.ko for
testing against Windows using Chelsio T404-BT (only supporting MPA V1)
and T520-BT (supporting MPA V2) cards.

Up to now I used old patches to add MPA V1 support
to siw.ko and compiled against the running kernel
each time. I don't want to do that forever, so
I re-introduced module parameters in order to
avoid patching and compiling.

For my testing I'm using something like this:

  echo 1 > /sys/module/siw/parameters/mpa_crc_required
  echo 1 > /sys/module/siw/parameters/mpa_crc_strict
  echo 0 > /sys/module/siw/parameters/mpa_rdma_write_rtr
  echo 1 > /sys/module/siw/parameters/mpa_peer_to_peer

  echo 1 > /sys/module/siw/parameters/mpa_version
  rdma link add siw_v1_eth0 type siw netdev eth0

  echo 2 > /sys/module/siw/parameters/mpa_version
  rdma link add siw_v2_eth1 type siw netdev eth1

The global parameters are copied at 'rdma link add' time
and will stay as is for the lifetime of the specific device.

So I can testing against the T520-BT card using siw_v2_eth1
and against the T404-BT card using siw_v1_eth0.

It would be great if this can go upstream.

Stefan Metzmacher (8):
  RDMA/siw: make mpa_version = MPA_REVISION_2 const
  RDMA/siw: remove unused loopback_enabled = true
  RDMA/siw: add and remember siw_device_options per device.
  RDMA/siw: make use of sdev->options.* and avoid globals
  RDMA/siw: combine global options into siw_default_options
  RDMA/siw: move rtr_type to siw_device_options
  RDMA/siw: [re-]introduce module parameters to alter the behavior at
    runtime
  RDMA/siw: add support for MPA V1 and IRD/ORD negotiation based on
    [MS-SMBD]

 drivers/infiniband/sw/siw/iwarp.h     |   8 +
 drivers/infiniband/sw/siw/siw.h       |  21 +-
 drivers/infiniband/sw/siw/siw_cm.c    | 268 +++++++++++++++++++++-----
 drivers/infiniband/sw/siw/siw_cm.h    |   2 +
 drivers/infiniband/sw/siw/siw_main.c  | 173 ++++++++++++++---
 drivers/infiniband/sw/siw/siw_qp_tx.c |   3 +-
 drivers/infiniband/sw/siw/siw_verbs.c |   2 +-
 7 files changed, 395 insertions(+), 82 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/8] RDMA/siw: make mpa_version = MPA_REVISION_2 const
  2025-07-03 17:26 [PATCH 0/8] RDMA/siw: [re-]introduce module parameters and add MPA V1 Stefan Metzmacher
@ 2025-07-03 17:26 ` Stefan Metzmacher
  2025-07-03 17:26 ` [PATCH 2/8] RDMA/siw: remove unused loopback_enabled = true Stefan Metzmacher
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Stefan Metzmacher @ 2025-07-03 17:26 UTC (permalink / raw)
  To: linux-rdma; +Cc: metze, Bernard Metzler

There's currently no way to change that value at runtime.

Cc: Bernard Metzler <bmt@zurich.ibm.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
---
 drivers/infiniband/sw/siw/siw.h      | 2 +-
 drivers/infiniband/sw/siw/siw_cm.c   | 6 ------
 drivers/infiniband/sw/siw/siw_main.c | 2 +-
 3 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/infiniband/sw/siw/siw.h b/drivers/infiniband/sw/siw/siw.h
index f5fd71717b80..3e04357ab197 100644
--- a/drivers/infiniband/sw/siw/siw.h
+++ b/drivers/infiniband/sw/siw/siw.h
@@ -494,7 +494,7 @@ extern const bool loopback_enabled;
 extern const bool mpa_crc_required;
 extern const bool mpa_crc_strict;
 extern const bool siw_tcp_nagle;
-extern u_char mpa_version;
+extern const u_char mpa_version;
 extern const bool peer_to_peer;
 extern struct task_struct *siw_tx_thread[];
 
diff --git a/drivers/infiniband/sw/siw/siw_cm.c b/drivers/infiniband/sw/siw/siw_cm.c
index 708b13993fdf..aba97d674402 100644
--- a/drivers/infiniband/sw/siw/siw_cm.c
+++ b/drivers/infiniband/sw/siw/siw_cm.c
@@ -1454,12 +1454,6 @@ int siw_connect(struct iw_cm_id *id, struct iw_cm_conn_param *params)
 	 * MPA Rev. according to module parameter 'mpa_version', Key 'Request'.
 	 */
 	cep->mpa.hdr.params.bits = 0;
-	if (version > MPA_REVISION_2) {
-		pr_warn("Setting MPA version to %u\n", MPA_REVISION_2);
-		version = MPA_REVISION_2;
-		/* Adjust also module parameter */
-		mpa_version = MPA_REVISION_2;
-	}
 	__mpa_rr_set_revision(&cep->mpa.hdr.params.bits, version);
 
 	if (try_gso)
diff --git a/drivers/infiniband/sw/siw/siw_main.c b/drivers/infiniband/sw/siw/siw_main.c
index 5168307229a9..4e1d29832ac8 100644
--- a/drivers/infiniband/sw/siw/siw_main.c
+++ b/drivers/infiniband/sw/siw/siw_main.c
@@ -51,7 +51,7 @@ const bool mpa_crc_strict;
 const bool siw_tcp_nagle;
 
 /* Select MPA version to be used during connection setup */
-u_char mpa_version = MPA_REVISION_2;
+const u_char mpa_version = MPA_REVISION_2;
 
 /* Selects MPA P2P mode (additional handshake during connection
  * setup, if true.
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/8] RDMA/siw: remove unused loopback_enabled = true
  2025-07-03 17:26 [PATCH 0/8] RDMA/siw: [re-]introduce module parameters and add MPA V1 Stefan Metzmacher
  2025-07-03 17:26 ` [PATCH 1/8] RDMA/siw: make mpa_version = MPA_REVISION_2 const Stefan Metzmacher
@ 2025-07-03 17:26 ` Stefan Metzmacher
  2025-07-04  1:14   ` yanjun.zhu
  2025-07-03 17:26 ` [PATCH 3/8] RDMA/siw: add and remember siw_device_options per device Stefan Metzmacher
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 13+ messages in thread
From: Stefan Metzmacher @ 2025-07-03 17:26 UTC (permalink / raw)
  To: linux-rdma; +Cc: metze, Bernard Metzler

Devices are created explicitly by the administrator using
'rdma link add siw_lo type siw netdev lo'.

Cc: Bernard Metzler <bmt@zurich.ibm.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
---
 drivers/infiniband/sw/siw/siw.h      | 1 -
 drivers/infiniband/sw/siw/siw_main.c | 5 +----
 2 files changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/infiniband/sw/siw/siw.h b/drivers/infiniband/sw/siw/siw.h
index 3e04357ab197..3bdc17eedbe7 100644
--- a/drivers/infiniband/sw/siw/siw.h
+++ b/drivers/infiniband/sw/siw/siw.h
@@ -490,7 +490,6 @@ struct siw_user_mmap_entry {
 /* Global siw parameters. Currently set in siw_main.c */
 extern const bool zcopy_tx;
 extern const bool try_gso;
-extern const bool loopback_enabled;
 extern const bool mpa_crc_required;
 extern const bool mpa_crc_strict;
 extern const bool siw_tcp_nagle;
diff --git a/drivers/infiniband/sw/siw/siw_main.c b/drivers/infiniband/sw/siw/siw_main.c
index 4e1d29832ac8..ba238b0b43a3 100644
--- a/drivers/infiniband/sw/siw/siw_main.c
+++ b/drivers/infiniband/sw/siw/siw_main.c
@@ -38,8 +38,6 @@ const bool zcopy_tx = true;
  */
 const bool try_gso;
 
-/* Attach siw also with loopback devices */
-const bool loopback_enabled = true;
 
 /* We try to negotiate CRC on, if true */
 const bool mpa_crc_required;
@@ -94,8 +92,7 @@ static int siw_dev_qualified(struct net_device *netdev)
 	 * <linux/if_arp.h> for type identifiers.
 	 */
 	if (netdev->type == ARPHRD_ETHER || netdev->type == ARPHRD_IEEE802 ||
-	    netdev->type == ARPHRD_NONE ||
-	    (netdev->type == ARPHRD_LOOPBACK && loopback_enabled))
+	    netdev->type == ARPHRD_NONE || netdev->type == ARPHRD_LOOPBACK)
 		return 1;
 
 	return 0;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/8] RDMA/siw: add and remember siw_device_options per device.
  2025-07-03 17:26 [PATCH 0/8] RDMA/siw: [re-]introduce module parameters and add MPA V1 Stefan Metzmacher
  2025-07-03 17:26 ` [PATCH 1/8] RDMA/siw: make mpa_version = MPA_REVISION_2 const Stefan Metzmacher
  2025-07-03 17:26 ` [PATCH 2/8] RDMA/siw: remove unused loopback_enabled = true Stefan Metzmacher
@ 2025-07-03 17:26 ` Stefan Metzmacher
  2025-07-03 17:26 ` [PATCH 4/8] RDMA/siw: make use of sdev->options.* and avoid globals Stefan Metzmacher
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Stefan Metzmacher @ 2025-07-03 17:26 UTC (permalink / raw)
  To: linux-rdma; +Cc: metze, Bernard Metzler

In future there should be a way to control these values
on a per device basis.

Cc: Bernard Metzler <bmt@zurich.ibm.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
---
 drivers/infiniband/sw/siw/siw.h      | 11 +++++++++++
 drivers/infiniband/sw/siw/siw_main.c |  8 ++++++++
 2 files changed, 19 insertions(+)

diff --git a/drivers/infiniband/sw/siw/siw.h b/drivers/infiniband/sw/siw/siw.h
index 3bdc17eedbe7..b38e78909c06 100644
--- a/drivers/infiniband/sw/siw/siw.h
+++ b/drivers/infiniband/sw/siw/siw.h
@@ -70,8 +70,19 @@ struct siw_pd {
 	struct ib_pd base_pd;
 };
 
+struct siw_device_options {
+	bool zcopy_tx;
+	bool try_gso;
+	bool crc_required;
+	bool crc_strict;
+	bool tcp_nagle;
+	bool peer_to_peer;
+	u8 mpa_version;
+};
+
 struct siw_device {
 	struct ib_device base_dev;
+	struct siw_device_options options;
 	struct siw_dev_cap attrs;
 
 	u32 vendor_part_id;
diff --git a/drivers/infiniband/sw/siw/siw_main.c b/drivers/infiniband/sw/siw/siw_main.c
index ba238b0b43a3..bbec1442dfa1 100644
--- a/drivers/infiniband/sw/siw/siw_main.c
+++ b/drivers/infiniband/sw/siw/siw_main.c
@@ -324,6 +324,14 @@ static struct siw_device *siw_device_create(struct net_device *netdev)
 	/* Disable TCP port mapping */
 	base_dev->iw_driver_flags = IW_F_NO_PORT_MAP;
 
+	sdev->options.zcopy_tx = zcopy_tx;
+	sdev->options.try_gso = try_gso;
+	sdev->options.crc_required = mpa_crc_required;
+	sdev->options.crc_strict = mpa_crc_strict;
+	sdev->options.tcp_nagle = siw_tcp_nagle;
+	sdev->options.peer_to_peer = peer_to_peer;
+	sdev->options.mpa_version = mpa_version;
+
 	sdev->attrs.max_qp = SIW_MAX_QP;
 	sdev->attrs.max_qp_wr = SIW_MAX_QP_WR;
 	sdev->attrs.max_ord = SIW_MAX_ORD_QP;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 4/8] RDMA/siw: make use of sdev->options.* and avoid globals
  2025-07-03 17:26 [PATCH 0/8] RDMA/siw: [re-]introduce module parameters and add MPA V1 Stefan Metzmacher
                   ` (2 preceding siblings ...)
  2025-07-03 17:26 ` [PATCH 3/8] RDMA/siw: add and remember siw_device_options per device Stefan Metzmacher
@ 2025-07-03 17:26 ` Stefan Metzmacher
  2025-07-03 17:26 ` [PATCH 5/8] RDMA/siw: combine global options into siw_default_options Stefan Metzmacher
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Stefan Metzmacher @ 2025-07-03 17:26 UTC (permalink / raw)
  To: linux-rdma; +Cc: metze, Bernard Metzler

Cc: Bernard Metzler <bmt@zurich.ibm.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
---
 drivers/infiniband/sw/siw/siw_cm.c    | 42 +++++++++++++++++----------
 drivers/infiniband/sw/siw/siw_qp_tx.c |  3 +-
 drivers/infiniband/sw/siw/siw_verbs.c |  2 +-
 3 files changed, 29 insertions(+), 18 deletions(-)

diff --git a/drivers/infiniband/sw/siw/siw_cm.c b/drivers/infiniband/sw/siw/siw_cm.c
index aba97d674402..9640450e1f87 100644
--- a/drivers/infiniband/sw/siw/siw_cm.c
+++ b/drivers/infiniband/sw/siw/siw_cm.c
@@ -601,6 +601,9 @@ static int siw_recv_mpa_rr(struct siw_cep *cep)
  */
 static int siw_proc_mpareq(struct siw_cep *cep)
 {
+	struct siw_device *sdev = cep->sdev;
+	const bool local_crc_required = sdev->options.crc_required;
+	const bool local_crc_strict = sdev->options.crc_strict;
 	struct mpa_rr *req;
 	int version, rv;
 	u16 pd_len;
@@ -648,11 +651,11 @@ static int siw_proc_mpareq(struct siw_cep *cep)
 		 * connection with CRC if local CRC off enforced by
 		 * 'mpa_crc_strict' module parameter.
 		 */
-		if (!mpa_crc_required && mpa_crc_strict)
+		if (!local_crc_required && local_crc_strict)
 			goto reject_conn;
 
 		/* Enable CRC if requested by module parameter */
-		if (mpa_crc_required)
+		if (local_crc_required)
 			req->params.bits |= MPA_RR_FLAG_CRC;
 	}
 	if (cep->enhanced_rdma_conn_est) {
@@ -705,13 +708,13 @@ static int siw_proc_mpareq(struct siw_cep *cep)
 reject_conn:
 	siw_dbg_cep(cep, "reject: crc %d:%d:%d, m %d:%d\n",
 		    req->params.bits & MPA_RR_FLAG_CRC ? 1 : 0,
-		    mpa_crc_required, mpa_crc_strict,
+		    local_crc_required, local_crc_strict,
 		    req->params.bits & MPA_RR_FLAG_MARKERS ? 1 : 0, 0);
 
 	req->params.bits &= ~MPA_RR_FLAG_MARKERS;
 	req->params.bits |= MPA_RR_FLAG_REJECT;
 
-	if (!mpa_crc_required && mpa_crc_strict)
+	if (!local_crc_required && local_crc_strict)
 		req->params.bits &= ~MPA_RR_FLAG_CRC;
 
 	if (pd_len)
@@ -729,6 +732,9 @@ static int siw_proc_mpareply(struct siw_cep *cep)
 	struct siw_qp_attrs qp_attrs;
 	enum siw_qp_attr_mask qp_attr_mask;
 	struct siw_qp *qp = cep->qp;
+	struct siw_device *sdev = cep->sdev;
+	const bool local_crc_required = sdev->options.crc_required;
+	const bool local_crc_strict = sdev->options.crc_strict;
 	struct mpa_rr *rep;
 	int rv;
 	u16 rep_ord;
@@ -762,17 +768,17 @@ static int siw_proc_mpareply(struct siw_cep *cep)
 
 		return -ECONNRESET;
 	}
-	if (try_gso && rep->params.bits & MPA_RR_FLAG_GSO_EXP) {
+	if (sdev->options.try_gso && rep->params.bits & MPA_RR_FLAG_GSO_EXP) {
 		siw_dbg_cep(cep, "peer allows GSO on TX\n");
 		qp->tx_ctx.gso_seg_limit = 0;
 	}
 	if ((rep->params.bits & MPA_RR_FLAG_MARKERS) ||
-	    (mpa_crc_required && !(rep->params.bits & MPA_RR_FLAG_CRC)) ||
-	    (mpa_crc_strict && !mpa_crc_required &&
+	    (local_crc_required && !(rep->params.bits & MPA_RR_FLAG_CRC)) ||
+	    (local_crc_strict && !local_crc_required &&
 	     (rep->params.bits & MPA_RR_FLAG_CRC))) {
 		siw_dbg_cep(cep, "reply unsupp: crc %d:%d:%d, m %d:%d\n",
 			    rep->params.bits & MPA_RR_FLAG_CRC ? 1 : 0,
-			    mpa_crc_required, mpa_crc_strict,
+			    local_crc_required, local_crc_strict,
 			    rep->params.bits & MPA_RR_FLAG_MARKERS ? 1 : 0, 0);
 
 		siw_cm_upcall(cep, IW_CM_EVENT_CONNECT_REPLY, -ECONNREFUSED);
@@ -920,6 +926,7 @@ static int siw_proc_mpareply(struct siw_cep *cep)
 static void siw_accept_newconn(struct siw_cep *cep)
 {
 	struct socket *s = cep->sock;
+	struct siw_device *sdev = cep->sdev;
 	struct socket *new_s = NULL;
 	struct siw_cep *new_cep = NULL;
 	int rv = 0; /* debug only. should disappear */
@@ -927,7 +934,7 @@ static void siw_accept_newconn(struct siw_cep *cep)
 	if (cep->state != SIW_EPSTATE_LISTENING)
 		goto error;
 
-	new_cep = siw_cep_alloc(cep->sdev);
+	new_cep = siw_cep_alloc(sdev);
 	if (!new_cep)
 		goto error;
 
@@ -960,7 +967,7 @@ static void siw_accept_newconn(struct siw_cep *cep)
 	siw_cep_get(new_cep);
 	new_s->sk->sk_user_data = new_cep;
 
-	if (siw_tcp_nagle == false)
+	if (sdev->options.tcp_nagle == false)
 		tcp_sock_set_nodelay(new_s->sk);
 	new_cep->state = SIW_EPSTATE_AWAIT_MPAREQ;
 
@@ -1357,9 +1364,11 @@ int siw_connect(struct iw_cm_id *id, struct iw_cm_conn_param *params)
 	struct socket *s = NULL;
 	struct sockaddr *laddr = (struct sockaddr *)&id->local_addr,
 			*raddr = (struct sockaddr *)&id->remote_addr;
-	bool p2p_mode = peer_to_peer, v4 = true;
+	bool p2p_mode = sdev->options.peer_to_peer;
+	bool v4 = true;
 	u16 pd_len = params->private_data_len;
-	int version = mpa_version, rv;
+	int version = sdev->options.mpa_version;
+	int rv;
 
 	if (pd_len > MPA_MAX_PRIVDATA)
 		return -EINVAL;
@@ -1405,7 +1414,7 @@ int siw_connect(struct iw_cm_id *id, struct iw_cm_conn_param *params)
 		siw_dbg_qp(qp, "kernel_bindconnect: error %d\n", rv);
 		goto error;
 	}
-	if (siw_tcp_nagle == false)
+	if (sdev->options.tcp_nagle == false)
 		tcp_sock_set_nodelay(s->sk);
 	cep = siw_cep_alloc(sdev);
 	if (!cep) {
@@ -1456,10 +1465,10 @@ int siw_connect(struct iw_cm_id *id, struct iw_cm_conn_param *params)
 	cep->mpa.hdr.params.bits = 0;
 	__mpa_rr_set_revision(&cep->mpa.hdr.params.bits, version);
 
-	if (try_gso)
+	if (sdev->options.try_gso)
 		cep->mpa.hdr.params.bits |= MPA_RR_FLAG_GSO_EXP;
 
-	if (mpa_crc_required)
+	if (sdev->options.crc_required)
 		cep->mpa.hdr.params.bits |= MPA_RR_FLAG_CRC;
 
 	/*
@@ -1577,7 +1586,8 @@ int siw_accept(struct iw_cm_id *id, struct iw_cm_conn_param *params)
 		goto error_unlock;
 	siw_dbg_cep(cep, "[QP %d]\n", params->qpn);
 
-	if (try_gso && cep->mpa.hdr.params.bits & MPA_RR_FLAG_GSO_EXP) {
+	if (sdev->options.try_gso &&
+	    cep->mpa.hdr.params.bits & MPA_RR_FLAG_GSO_EXP) {
 		siw_dbg_cep(cep, "peer allows GSO on TX\n");
 		qp->tx_ctx.gso_seg_limit = 0;
 	}
diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c b/drivers/infiniband/sw/siw/siw_qp_tx.c
index 3a08f57d2211..b12fc5ec656e 100644
--- a/drivers/infiniband/sw/siw/siw_qp_tx.c
+++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
@@ -794,6 +794,7 @@ static int siw_check_sgl_tx(struct ib_pd *pd, struct siw_wqe *wqe,
 static int siw_qp_sq_proc_tx(struct siw_qp *qp, struct siw_wqe *wqe)
 {
 	struct siw_iwarp_tx *c_tx = &qp->tx_ctx;
+	struct siw_device *sdev = qp->sdev;
 	struct socket *s = qp->attrs.sk;
 	int rv = 0, burst_len = qp->tx_ctx.burst;
 	enum rdmap_ecode ecode = RDMAP_ECODE_CATASTROPHIC_STREAM;
@@ -868,7 +869,7 @@ static int siw_qp_sq_proc_tx(struct siw_qp *qp, struct siw_wqe *wqe)
 		enum siw_opcode tx_type = tx_type(wqe);
 		unsigned int msg_flags;
 
-		if (siw_sq_empty(qp) || !siw_tcp_nagle || burst_len == 1)
+		if (siw_sq_empty(qp) || !sdev->options.tcp_nagle || burst_len == 1)
 			/*
 			 * End current TCP segment, if SQ runs empty,
 			 * or siw_tcp_nagle is not set, or we bail out
diff --git a/drivers/infiniband/sw/siw/siw_verbs.c b/drivers/infiniband/sw/siw/siw_verbs.c
index 2b2a7b8e93b0..1ca85b10e807 100644
--- a/drivers/infiniband/sw/siw/siw_verbs.c
+++ b/drivers/infiniband/sw/siw/siw_verbs.c
@@ -434,7 +434,7 @@ int siw_create_qp(struct ib_qp *ibqp, struct ib_qp_init_attr *attrs,
 
 	/* Make those two tunables fixed for now. */
 	qp->tx_ctx.gso_seg_limit = 1;
-	qp->tx_ctx.zcopy_tx = zcopy_tx;
+	qp->tx_ctx.zcopy_tx = sdev->options.zcopy_tx;
 
 	qp->attrs.state = SIW_QP_STATE_IDLE;
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 5/8] RDMA/siw: combine global options into siw_default_options
  2025-07-03 17:26 [PATCH 0/8] RDMA/siw: [re-]introduce module parameters and add MPA V1 Stefan Metzmacher
                   ` (3 preceding siblings ...)
  2025-07-03 17:26 ` [PATCH 4/8] RDMA/siw: make use of sdev->options.* and avoid globals Stefan Metzmacher
@ 2025-07-03 17:26 ` Stefan Metzmacher
  2025-07-03 17:26 ` [PATCH 6/8] RDMA/siw: move rtr_type to siw_device_options Stefan Metzmacher
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Stefan Metzmacher @ 2025-07-03 17:26 UTC (permalink / raw)
  To: linux-rdma; +Cc: metze, Bernard Metzler

Cc: Bernard Metzler <bmt@zurich.ibm.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
---
 drivers/infiniband/sw/siw/siw.h      |  8 ----
 drivers/infiniband/sw/siw/siw_main.c | 61 ++++++++++++++++------------
 2 files changed, 34 insertions(+), 35 deletions(-)

diff --git a/drivers/infiniband/sw/siw/siw.h b/drivers/infiniband/sw/siw/siw.h
index b38e78909c06..4d8e1f857099 100644
--- a/drivers/infiniband/sw/siw/siw.h
+++ b/drivers/infiniband/sw/siw/siw.h
@@ -498,14 +498,6 @@ struct siw_user_mmap_entry {
 	void *address;
 };
 
-/* Global siw parameters. Currently set in siw_main.c */
-extern const bool zcopy_tx;
-extern const bool try_gso;
-extern const bool mpa_crc_required;
-extern const bool mpa_crc_strict;
-extern const bool siw_tcp_nagle;
-extern const u_char mpa_version;
-extern const bool peer_to_peer;
 extern struct task_struct *siw_tx_thread[];
 
 extern struct iwarp_msg_info iwarp_pktinfo[RDMAP_TERMINATE + 1];
diff --git a/drivers/infiniband/sw/siw/siw_main.c b/drivers/infiniband/sw/siw/siw_main.c
index bbec1442dfa1..954990278256 100644
--- a/drivers/infiniband/sw/siw/siw_main.c
+++ b/drivers/infiniband/sw/siw/siw_main.c
@@ -29,32 +29,45 @@ MODULE_AUTHOR("Bernard Metzler");
 MODULE_DESCRIPTION("Software iWARP Driver");
 MODULE_LICENSE("Dual BSD/GPL");
 
-/* transmit from user buffer, if possible */
-const bool zcopy_tx = true;
-
-/* Restrict usage of GSO, if hardware peer iwarp is unable to process
- * large packets. try_gso = true lets siw try to use local GSO,
- * if peer agrees.  Not using GSO severly limits siw maximum tx bandwidth.
- */
-const bool try_gso;
+static const struct siw_device_options siw_default_options = {
+	/*
+	 * transmit from user buffer, if possible
+	 */
+	.zcopy_tx = true,
 
+	/*
+	 * Restrict usage of GSO, if hardware peer iwarp is unable to process
+	 * large packets. try_gso = true lets siw try to use local GSO, if peer
+	 * agrees.  Not using GSO severly limits siw maximum tx bandwidth.
+	 */
+	.try_gso = true,
 
-/* We try to negotiate CRC on, if true */
-const bool mpa_crc_required;
+	/*
+	 * We try to negotiate CRC on, if true
+	 */
+	.crc_required = false,
 
-/* MPA CRC on/off enforced */
-const bool mpa_crc_strict;
+	/*
+	 * MPA CRC on/off enforced
+	 */
+	.crc_strict = false,
 
-/* Control TCP_NODELAY socket option */
-const bool siw_tcp_nagle;
+	/*
+	 * Control TCP_NODELAY socket option
+	 */
+	.tcp_nagle = false,
 
-/* Select MPA version to be used during connection setup */
-const u_char mpa_version = MPA_REVISION_2;
+	/*
+	 * Selects MPA P2P mode (additional handshake during connection
+	 * setup, if true.
+	 */
+	.peer_to_peer = false,
 
-/* Selects MPA P2P mode (additional handshake during connection
- * setup, if true.
- */
-const bool peer_to_peer;
+	/*
+	 * Select MPA version to be used during connection setup
+	 */
+	.mpa_version = MPA_REVISION_2,
+};
 
 struct task_struct *siw_tx_thread[NR_CPUS];
 
@@ -324,13 +337,7 @@ static struct siw_device *siw_device_create(struct net_device *netdev)
 	/* Disable TCP port mapping */
 	base_dev->iw_driver_flags = IW_F_NO_PORT_MAP;
 
-	sdev->options.zcopy_tx = zcopy_tx;
-	sdev->options.try_gso = try_gso;
-	sdev->options.crc_required = mpa_crc_required;
-	sdev->options.crc_strict = mpa_crc_strict;
-	sdev->options.tcp_nagle = siw_tcp_nagle;
-	sdev->options.peer_to_peer = peer_to_peer;
-	sdev->options.mpa_version = mpa_version;
+	sdev->options = siw_default_options;
 
 	sdev->attrs.max_qp = SIW_MAX_QP;
 	sdev->attrs.max_qp_wr = SIW_MAX_QP_WR;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 6/8] RDMA/siw: move rtr_type to siw_device_options
  2025-07-03 17:26 [PATCH 0/8] RDMA/siw: [re-]introduce module parameters and add MPA V1 Stefan Metzmacher
                   ` (4 preceding siblings ...)
  2025-07-03 17:26 ` [PATCH 5/8] RDMA/siw: combine global options into siw_default_options Stefan Metzmacher
@ 2025-07-03 17:26 ` Stefan Metzmacher
  2025-07-03 17:26 ` [PATCH 7/8] RDMA/siw: [re-]introduce module parameters to alter the behavior at runtime Stefan Metzmacher
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Stefan Metzmacher @ 2025-07-03 17:26 UTC (permalink / raw)
  To: linux-rdma; +Cc: metze, Bernard Metzler

Cc: Bernard Metzler <bmt@zurich.ibm.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
---
 drivers/infiniband/sw/siw/siw.h      | 1 +
 drivers/infiniband/sw/siw/siw_cm.c   | 6 +-----
 drivers/infiniband/sw/siw/siw_main.c | 6 ++++++
 3 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/sw/siw/siw.h b/drivers/infiniband/sw/siw/siw.h
index 4d8e1f857099..f42418f44c0d 100644
--- a/drivers/infiniband/sw/siw/siw.h
+++ b/drivers/infiniband/sw/siw/siw.h
@@ -77,6 +77,7 @@ struct siw_device_options {
 	bool crc_strict;
 	bool tcp_nagle;
 	bool peer_to_peer;
+	__be16 rtr_type;
 	u8 mpa_version;
 };
 
diff --git a/drivers/infiniband/sw/siw/siw_cm.c b/drivers/infiniband/sw/siw/siw_cm.c
index 9640450e1f87..c1c66ae1fa97 100644
--- a/drivers/infiniband/sw/siw/siw_cm.c
+++ b/drivers/infiniband/sw/siw/siw_cm.c
@@ -25,11 +25,6 @@
 #include "siw.h"
 #include "siw_cm.h"
 
-/*
- * Set to any combination of
- * MPA_V2_RDMA_NO_RTR, MPA_V2_RDMA_READ_RTR, MPA_V2_RDMA_WRITE_RTR
- */
-static __be16 rtr_type = MPA_V2_RDMA_READ_RTR | MPA_V2_RDMA_WRITE_RTR;
 static const bool relaxed_ird_negotiation = true;
 
 static void siw_cm_llp_state_change(struct sock *s);
@@ -1365,6 +1360,7 @@ int siw_connect(struct iw_cm_id *id, struct iw_cm_conn_param *params)
 	struct sockaddr *laddr = (struct sockaddr *)&id->local_addr,
 			*raddr = (struct sockaddr *)&id->remote_addr;
 	bool p2p_mode = sdev->options.peer_to_peer;
+	__be16 rtr_type = sdev->options.rtr_type;
 	bool v4 = true;
 	u16 pd_len = params->private_data_len;
 	int version = sdev->options.mpa_version;
diff --git a/drivers/infiniband/sw/siw/siw_main.c b/drivers/infiniband/sw/siw/siw_main.c
index 954990278256..ff121f14d5b7 100644
--- a/drivers/infiniband/sw/siw/siw_main.c
+++ b/drivers/infiniband/sw/siw/siw_main.c
@@ -63,6 +63,12 @@ static const struct siw_device_options siw_default_options = {
 	 */
 	.peer_to_peer = false,
 
+	/*
+	 * Set to any combination of
+	 * MPA_V2_RDMA_NO_RTR, MPA_V2_RDMA_READ_RTR, MPA_V2_RDMA_WRITE_RTR
+	 */
+	.rtr_type = MPA_V2_RDMA_READ_RTR | MPA_V2_RDMA_WRITE_RTR,
+
 	/*
 	 * Select MPA version to be used during connection setup
 	 */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 7/8] RDMA/siw: [re-]introduce module parameters to alter the behavior at runtime
  2025-07-03 17:26 [PATCH 0/8] RDMA/siw: [re-]introduce module parameters and add MPA V1 Stefan Metzmacher
                   ` (5 preceding siblings ...)
  2025-07-03 17:26 ` [PATCH 6/8] RDMA/siw: move rtr_type to siw_device_options Stefan Metzmacher
@ 2025-07-03 17:26 ` Stefan Metzmacher
  2025-07-03 17:26 ` [PATCH 8/8] RDMA/siw: add support for MPA V1 and IRD/ORD negotiation based on [MS-SMBD] Stefan Metzmacher
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Stefan Metzmacher @ 2025-07-03 17:26 UTC (permalink / raw)
  To: linux-rdma; +Cc: metze, Bernard Metzler

During the out of tree development of the siw.ko driver it had module
parameters to change the runtime behavior. This got lost in the process
of bringing it upstream in order to reduce the complexity.

In order to interop with Chelsio iwarp cards (tested with T404-BT, T520-BT and T520-CR)
running from Windows in order to support smbdirect it is needed to
change some parameters:

echo 1 > /sys/module/siw/parameters/mpa_crc_required
echo 1 > /sys/module/siw/parameters/mpa_crc_strict
echo 0 > /sys/module/siw/parameters/mpa_rdma_write_rtr
echo 1 > /sys/module/siw/parameters/mpa_peer_to_peer

The T404-BT card also requires:
echo 1 > /sys/module/siw/parameters/mpa_version
(which will be supported in the next commit.

These only have an effect on following invocations of
'rdma link add siw_eth0 type siw netdev eth0',
which will make per device copy of the global parameters.

That means that each invocation of an
'rdma link add SIWDEV type siw netdev NETDEV'
can have its own set of values, which makes it
easy to test different settings on different interfaces.

Cc: Bernard Metzler <bmt@zurich.ibm.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
---
 drivers/infiniband/sw/siw/siw_cm.c   |   2 +-
 drivers/infiniband/sw/siw/siw_main.c | 111 ++++++++++++++++++++++++++-
 2 files changed, 111 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/sw/siw/siw_cm.c b/drivers/infiniband/sw/siw/siw_cm.c
index c1c66ae1fa97..7feed4a02d58 100644
--- a/drivers/infiniband/sw/siw/siw_cm.c
+++ b/drivers/infiniband/sw/siw/siw_cm.c
@@ -1471,7 +1471,7 @@ int siw_connect(struct iw_cm_id *id, struct iw_cm_conn_param *params)
 	 * If MPA version == 2:
 	 * o Include ORD and IRD.
 	 * o Indicate peer-to-peer mode, if required by module
-	 *   parameter 'peer_to_peer'.
+	 *   parameter 'mpa_peer_to_peer'.
 	 */
 	if (version == MPA_REVISION_2) {
 		cep->enhanced_rdma_conn_est = true;
diff --git a/drivers/infiniband/sw/siw/siw_main.c b/drivers/infiniband/sw/siw/siw_main.c
index ff121f14d5b7..c7d1e44ec8fa 100644
--- a/drivers/infiniband/sw/siw/siw_main.c
+++ b/drivers/infiniband/sw/siw/siw_main.c
@@ -29,7 +29,7 @@ MODULE_AUTHOR("Bernard Metzler");
 MODULE_DESCRIPTION("Software iWARP Driver");
 MODULE_LICENSE("Dual BSD/GPL");
 
-static const struct siw_device_options siw_default_options = {
+static struct siw_device_options siw_default_options = {
 	/*
 	 * transmit from user buffer, if possible
 	 */
@@ -75,6 +75,115 @@ static const struct siw_device_options siw_default_options = {
 	.mpa_version = MPA_REVISION_2,
 };
 
+module_param_named(zcopy_tx, siw_default_options.zcopy_tx, bool, 0644);
+MODULE_PARM_DESC(zcopy_tx, "Transmit from user buffer, if possible");
+
+module_param_named(try_gso, siw_default_options.try_gso, bool, 0644);
+MODULE_PARM_DESC(try_gso, "Try usage of GSO");
+
+module_param_named(mpa_crc_required, siw_default_options.crc_required, bool, 0644);
+MODULE_PARM_DESC(mpa_crc_required, "We try to negotiate CRC on, if true");
+
+module_param_named(mpa_crc_strict, siw_default_options.crc_strict, bool, 0644);
+MODULE_PARM_DESC(mpa_crc_strict, "MPA CRC on/off enforced");
+
+module_param_named(tcp_nagle, siw_default_options.tcp_nagle, bool, 0644);
+MODULE_PARM_DESC(tcp_nagle, "Control TCP_NODELAY socket option");
+
+module_param_named(mpa_peer_to_peer, siw_default_options.peer_to_peer, bool, 0644);
+MODULE_PARM_DESC(mpa_peer_to_peer, "Selects MPA P2P mode setup, if true");
+
+static int siw_rtr_type_set_bit(const char *val,
+				const struct kernel_param *kp,
+				__be16 bit)
+{
+	bool bval = true;
+	int ret;
+
+	/* No equals means "set"... */
+	if (!val) val = "1";
+
+	/* One of =[yYnN01] */
+	ret = kstrtobool(val, &bval);
+	if (ret != 0)
+		return ret;
+
+	if (bval)
+		*((__be16 *)kp->arg) |= bit;
+	else
+		*((__be16 *)kp->arg) &= ~bit;
+
+	return 0;
+}
+
+static int siw_rtr_type_get_bit(char *buffer,
+				const struct kernel_param *kp,
+				__be16 bit)
+{
+	bool bval = (*((__be16 *)kp->arg) & bit);
+
+	/* Y and N chosen as being relatively non-coder friendly */
+	return sprintf(buffer, "%c\n", bval ? 'Y' : 'N');
+}
+
+static int siw_rdma_read_rtr_set(const char *val, const struct kernel_param *kp)
+{
+	return siw_rtr_type_set_bit(val, kp, MPA_V2_RDMA_READ_RTR);
+}
+
+static int siw_rdma_read_rtr_get(char *buffer, const struct kernel_param *kp)
+{
+	return siw_rtr_type_get_bit(buffer, kp, MPA_V2_RDMA_READ_RTR);
+}
+module_param_call(mpa_rdma_read_rtr,
+		  siw_rdma_read_rtr_set,
+		  siw_rdma_read_rtr_get,
+		  &siw_default_options.rtr_type,
+		  0644);
+MODULE_PARM_DESC(mpa_rdma_read_rtr, "MPA P2P mode setup enable MPA_V2_RDMA_READ_RTR, if true");
+
+static int siw_rdma_write_rtr_set(const char *val, const struct kernel_param *kp)
+{
+	return siw_rtr_type_set_bit(val, kp, MPA_V2_RDMA_WRITE_RTR);
+}
+
+static int siw_rdma_write_rtr_get(char *buffer, const struct kernel_param *kp)
+{
+	return siw_rtr_type_get_bit(buffer, kp, MPA_V2_RDMA_WRITE_RTR);
+}
+module_param_call(mpa_rdma_write_rtr,
+		  siw_rdma_write_rtr_set,
+		  siw_rdma_write_rtr_get,
+		  &siw_default_options.rtr_type,
+		  0644);
+MODULE_PARM_DESC(mpa_rdma_write_rtr, "MPA P2P mode setup enable MPA_V2_RDMA_WRITE_RTR, if true");
+
+static int siw_mpa_version_set(const char *val, const struct kernel_param *kp)
+{
+	u8 uval = MPA_REVISION_2;
+	int ret;
+
+	ret = kstrtou8(val, 10, &uval);
+	if (ret != 0)
+		return ret;
+
+	switch (uval) {
+	case MPA_REVISION_2:
+	/* TODO case MPA_REVISION_1: */
+		siw_default_options.mpa_version = uval;
+		return 0;
+	}
+
+	return -ERANGE;
+}
+
+module_param_call(mpa_version,
+		  siw_mpa_version_set,
+		  param_get_byte,
+		  &siw_default_options.mpa_version,
+		  0644);
+MODULE_PARM_DESC(mpa_version, "Select MPA version to be used during connection setup");
+
 struct task_struct *siw_tx_thread[NR_CPUS];
 
 static int siw_device_register(struct siw_device *sdev, const char *name)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 8/8] RDMA/siw: add support for MPA V1 and IRD/ORD negotiation based on [MS-SMBD]
  2025-07-03 17:26 [PATCH 0/8] RDMA/siw: [re-]introduce module parameters and add MPA V1 Stefan Metzmacher
                   ` (6 preceding siblings ...)
  2025-07-03 17:26 ` [PATCH 7/8] RDMA/siw: [re-]introduce module parameters to alter the behavior at runtime Stefan Metzmacher
@ 2025-07-03 17:26 ` Stefan Metzmacher
  2025-07-03 19:16 ` [PATCH 0/8] RDMA/siw: [re-]introduce module parameters and add MPA V1 Chuck Lever
  2025-07-06  7:56 ` Leon Romanovsky
  9 siblings, 0 replies; 13+ messages in thread
From: Stefan Metzmacher @ 2025-07-03 17:26 UTC (permalink / raw)
  To: linux-rdma; +Cc: metze, Bernard Metzler

This is needed in order to interop with the Chelsio T404-BT card
it needs this:

  echo 1 > /sys/module/siw/parameters/mpa_crc_required
  echo 1 > /sys/module/siw/parameters/mpa_crc_strict
  echo 0 > /sys/module/siw/parameters/mpa_rdma_write_rtr
  echo 1 > /sys/module/siw/parameters/mpa_peer_to_peer
  echo 1 > /sys/module/siw/parameters/mpa_version

before calling 'rdma link add siw_eth0 type siw netdev eth0'.

Cc: Bernard Metzler <bmt@zurich.ibm.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
---
 drivers/infiniband/sw/siw/iwarp.h    |   8 +
 drivers/infiniband/sw/siw/siw_cm.c   | 214 ++++++++++++++++++++++++---
 drivers/infiniband/sw/siw/siw_cm.h   |   2 +
 drivers/infiniband/sw/siw/siw_main.c |   2 +-
 4 files changed, 204 insertions(+), 22 deletions(-)

diff --git a/drivers/infiniband/sw/siw/iwarp.h b/drivers/infiniband/sw/siw/iwarp.h
index 8cf69309827d..b30d9289950d 100644
--- a/drivers/infiniband/sw/siw/iwarp.h
+++ b/drivers/infiniband/sw/siw/iwarp.h
@@ -71,6 +71,14 @@ struct mpa_v2_data {
 	__be16 ord;
 };
 
+/*
+ * [MS-SMBD] 6 Appendix A: RDMA Provider IRD/ORD Negotiation
+ */
+struct mpa_v1_smbd_data {
+	__be32 ird;
+	__be32 ord;
+};
+
 struct mpa_marker {
 	__be16 rsvd;
 	__be16 fpdu_hmd; /* FPDU header-marker distance (= MPA's FPDUPTR) */
diff --git a/drivers/infiniband/sw/siw/siw_cm.c b/drivers/infiniband/sw/siw/siw_cm.c
index 7feed4a02d58..b8b9bf70fb60 100644
--- a/drivers/infiniband/sw/siw/siw_cm.c
+++ b/drivers/infiniband/sw/siw/siw_cm.c
@@ -27,6 +27,8 @@
 
 static const bool relaxed_ird_negotiation = true;
 
+static size_t siw_cep_mpa_vX_ctrl_len(struct siw_cep *cep);
+
 static void siw_cm_llp_state_change(struct sock *s);
 static void siw_cm_llp_data_ready(struct sock *s);
 static void siw_cm_llp_write_space(struct sock *s);
@@ -330,18 +332,18 @@ static int siw_cm_upcall(struct siw_cep *cep, enum iw_cm_event_type reason,
 		u16 pd_len = be16_to_cpu(cep->mpa.hdr.params.pd_len);
 
 		if (pd_len) {
+			size_t hide_len = siw_cep_mpa_vX_ctrl_len(cep);
+
 			/*
 			 * hand over MPA private data
 			 */
 			event.private_data_len = pd_len;
 			event.private_data = cep->mpa.pdata;
 
-			/* Hide MPA V2 IRD/ORD control */
-			if (cep->enhanced_rdma_conn_est) {
-				event.private_data_len -=
-					sizeof(struct mpa_v2_data);
-				event.private_data +=
-					sizeof(struct mpa_v2_data);
+			/* Hide MPA IRD/ORD control */
+			if (event.private_data_len >= hide_len) {
+				event.private_data_len -= hide_len;
+				event.private_data += hide_len;
 			}
 		}
 		getname_local(cep->sock, &event.local_addr);
@@ -454,6 +456,40 @@ void siw_cep_get(struct siw_cep *cep)
 	kref_get(&cep->ref);
 }
 
+static u8 *siw_cep_mpa_vX_ctrl_ptr(struct siw_cep *cep)
+{
+	if (!cep->enhanced_rdma_conn_est) {
+		return NULL;
+	}
+
+	switch (cep->sdev->options.mpa_version) {
+	case MPA_REVISION_2:
+		return (u8 *)&cep->mpa.v2_ctrl;
+
+	case MPA_REVISION_1:
+		return (u8 *)&cep->mpa.v1_ctrl;
+	}
+
+	return NULL;
+}
+
+static size_t siw_cep_mpa_vX_ctrl_len(struct siw_cep *cep)
+{
+	if (!cep->enhanced_rdma_conn_est) {
+		return 0;
+	}
+
+	switch (cep->sdev->options.mpa_version) {
+	case MPA_REVISION_2:
+		return sizeof(cep->mpa.v2_ctrl);
+
+	case MPA_REVISION_1:
+		return sizeof(cep->mpa.v1_ctrl);
+	}
+
+	return 0;
+}
+
 /*
  * Expects params->pd_len in host byte order
  */
@@ -466,6 +502,8 @@ static int siw_send_mpareqrep(struct siw_cep *cep, const void *pdata, u8 pd_len)
 	int rv;
 	int iovec_num = 0;
 	int mpa_len;
+	u8 *vX_ctrl = siw_cep_mpa_vX_ctrl_ptr(cep);
+	size_t vX_ctrl_len = siw_cep_mpa_vX_ctrl_len(cep);
 
 	memset(&msg, 0, sizeof(msg));
 
@@ -473,11 +511,11 @@ static int siw_send_mpareqrep(struct siw_cep *cep, const void *pdata, u8 pd_len)
 	iov[iovec_num].iov_len = sizeof(*rr);
 	mpa_len = sizeof(*rr);
 
-	if (cep->enhanced_rdma_conn_est) {
+	if (vX_ctrl) {
 		iovec_num++;
-		iov[iovec_num].iov_base = &cep->mpa.v2_ctrl;
-		iov[iovec_num].iov_len = sizeof(cep->mpa.v2_ctrl);
-		mpa_len += sizeof(cep->mpa.v2_ctrl);
+		iov[iovec_num].iov_base = vX_ctrl;
+		iov[iovec_num].iov_len = vX_ctrl_len;
+		mpa_len += vX_ctrl_len;
 	}
 	if (pd_len) {
 		iovec_num++;
@@ -485,8 +523,8 @@ static int siw_send_mpareqrep(struct siw_cep *cep, const void *pdata, u8 pd_len)
 		iov[iovec_num].iov_len = pd_len;
 		mpa_len += pd_len;
 	}
-	if (cep->enhanced_rdma_conn_est)
-		pd_len += sizeof(cep->mpa.v2_ctrl);
+	if (vX_ctrl)
+		pd_len += vX_ctrl_len;
 
 	rr->params.pd_len = cpu_to_be16(pd_len);
 
@@ -611,6 +649,8 @@ static int siw_proc_mpareq(struct siw_cep *cep)
 
 	version = __mpa_rr_revision(req->params.bits);
 	pd_len = be16_to_cpu(req->params.pd_len);
+	siw_dbg_cep(cep, "GOT MPA REQUEST: version got %d local %d pd_len=%u\n",
+		version, sdev->options.mpa_version, pd_len);
 
 	if (version > MPA_REVISION_2)
 		/* allow for 0, 1, and 2 only */
@@ -635,6 +675,18 @@ static int siw_proc_mpareq(struct siw_cep *cep)
 		cep->enhanced_rdma_conn_est = true;
 	}
 
+	if (version == MPA_REVISION_1 &&
+	    sdev->options.peer_to_peer) {
+		/*
+		 * MPA version 1 must signal IRD/ORD values and P2P mode
+		 * in private data.
+		 */
+		if (pd_len < sizeof(struct mpa_v1_smbd_data))
+			goto reject_conn;
+
+		cep->enhanced_rdma_conn_est = true;
+	}
+
 	/* MPA Markers: currently not supported. Marker TX to be added. */
 	if (req->params.bits & MPA_RR_FLAG_MARKERS)
 		goto reject_conn;
@@ -653,7 +705,7 @@ static int siw_proc_mpareq(struct siw_cep *cep)
 		if (local_crc_required)
 			req->params.bits |= MPA_RR_FLAG_CRC;
 	}
-	if (cep->enhanced_rdma_conn_est) {
+	if (cep->enhanced_rdma_conn_est && version == MPA_REVISION_2) {
 		struct mpa_v2_data *v2 = (struct mpa_v2_data *)cep->mpa.pdata;
 
 		/*
@@ -689,8 +741,42 @@ static int siw_proc_mpareq(struct siw_cep *cep)
 				cep->mpa.v2_ctrl.ord |= MPA_V2_RDMA_WRITE_RTR;
 		}
 	}
+	if (cep->enhanced_rdma_conn_est && version == MPA_REVISION_1) {
+		struct mpa_v1_smbd_data *v1 = (struct mpa_v1_smbd_data *)cep->mpa.pdata;
 
+		/*
+		 * Peer requested ORD becomes requested local IRD,
+		 * peer requested IRD becomes requested local ORD.
+		 * IRD and ORD get limited by global maximum values.
+		 */
+		cep->ord = ntohl(v1->ird);
+		cep->ord = min(cep->ord, SIW_MAX_ORD_QP);
+		cep->ird = ntohl(v1->ord);
+		cep->ird = min(cep->ird, SIW_MAX_IRD_QP);
+
+		/* May get overwritten by locally negotiated values */
+		cep->mpa.v1_ctrl.ird = htons(cep->ird);
+		cep->mpa.v1_ctrl.ord = htons(cep->ord);
+
+		/*
+		 * Keep the v2 values in sync.
+		 */
+		req->params.bits |= MPA_RR_FLAG_ENHANCED;
+		cep->mpa.v2_ctrl.ird = htons(cep->ird);
+		cep->mpa.v2_ctrl.ord = htons(cep->ord);
+		cep->mpa.v2_ctrl.ird |= MPA_V2_PEER_TO_PEER;
+		cep->mpa.v2_ctrl.ord |= MPA_V2_RDMA_READ_RTR;
+	}
+
+	siw_dbg_cep(cep, "set SIW_EPSTATE_RECVD_MPAREQ\n");
 	cep->state = SIW_EPSTATE_RECVD_MPAREQ;
+	siw_dbg_cep(
+		cep,
+		"ord (cep %d) (max %d), ird (cep %d) (max %d)\n",
+		cep->ord,
+		sdev->attrs.max_ord,
+		cep->ird,
+		sdev->attrs.max_ird);
 
 	/* Keep reference until IWCM accepts/rejects */
 	siw_cep_get(cep);
@@ -736,6 +822,8 @@ static int siw_proc_mpareply(struct siw_cep *cep)
 	u16 rep_ird;
 	bool ird_insufficient = false;
 	enum mpa_v2_ctrl mpa_p2p_mode = MPA_V2_RDMA_NO_RTR;
+	int version;
+	u16 pd_len;
 
 	rv = siw_recv_mpa_rr(cep);
 	if (rv)
@@ -745,6 +833,11 @@ static int siw_proc_mpareply(struct siw_cep *cep)
 
 	rep = &cep->mpa.hdr;
 
+	version = __mpa_rr_revision(rep->params.bits);
+	pd_len = be16_to_cpu(rep->params.pd_len);
+	siw_dbg_cep(cep, "GOT MPA REPLY: version got %d local %d pd_len=%u\n",
+		version, sdev->options.mpa_version, pd_len);
+
 	if (__mpa_rr_revision(rep->params.bits) > MPA_REVISION_2) {
 		/* allow for 0, 1,  and 2 only */
 		rv = -EPROTO;
@@ -781,10 +874,34 @@ static int siw_proc_mpareply(struct siw_cep *cep)
 		return -EINVAL;
 	}
 	if (cep->enhanced_rdma_conn_est) {
-		struct mpa_v2_data *v2;
+		struct mpa_v2_data *v2 = NULL;
+
+		v2 = (struct mpa_v2_data *)cep->mpa.pdata;
+		if (version == MPA_REVISION_1) {
+			struct mpa_v1_smbd_data *v1 = (struct mpa_v1_smbd_data *)cep->mpa.pdata;
+
+			/*
+			 * Peer requested ORD becomes requested local IRD,
+			 * peer requested IRD becomes requested local ORD.
+			 * IRD and ORD get limited by global maximum values.
+			 */
+			cep->ord = ntohl(v1->ird);
+			cep->ord = min(cep->ord, SIW_MAX_ORD_QP);
+			cep->ird = ntohl(v1->ord);
+			cep->ird = min(cep->ird, SIW_MAX_IRD_QP);
+
+			/*
+			 * Keep the v2 values in sync.
+			 */
+			rep->params.bits |= MPA_RR_FLAG_ENHANCED;
+			cep->mpa.v2_ctrl.ird = htons(cep->ird);
+			cep->mpa.v2_ctrl.ord = htons(cep->ord);
+			cep->mpa.v2_ctrl.ird |= MPA_V2_PEER_TO_PEER;
+			cep->mpa.v2_ctrl.ord |= MPA_V2_RDMA_READ_RTR;
+			v2 = &cep->mpa.v2_ctrl;
+		}
 
-		if (__mpa_rr_revision(rep->params.bits) < MPA_REVISION_2 ||
-		    !(rep->params.bits & MPA_RR_FLAG_ENHANCED)) {
+		if (!(rep->params.bits & MPA_RR_FLAG_ENHANCED)) {
 			/*
 			 * Protocol failure: The responder MUST reply with
 			 * MPA version 2 and MUST set MPA_RR_FLAG_ENHANCED.
@@ -799,7 +916,6 @@ static int siw_proc_mpareply(struct siw_cep *cep)
 				      -ECONNRESET);
 			return -EINVAL;
 		}
-		v2 = (struct mpa_v2_data *)cep->mpa.pdata;
 		rep_ird = ntohs(v2->ird) & MPA_IRD_ORD_MASK;
 		rep_ord = ntohs(v2->ord) & MPA_IRD_ORD_MASK;
 
@@ -1442,7 +1558,7 @@ int siw_connect(struct iw_cm_id *id, struct iw_cm_conn_param *params)
 	cep->ird = params->ird;
 	cep->ord = params->ord;
 
-	if (p2p_mode && cep->ord == 0)
+	if (p2p_mode && rtr_type & MPA_V2_RDMA_WRITE_RTR && cep->ord == 0)
 		cep->ord = 1;
 
 	cep->state = SIW_EPSTATE_CONNECTING;
@@ -1461,7 +1577,7 @@ int siw_connect(struct iw_cm_id *id, struct iw_cm_conn_param *params)
 	cep->mpa.hdr.params.bits = 0;
 	__mpa_rr_set_revision(&cep->mpa.hdr.params.bits, version);
 
-	if (sdev->options.try_gso)
+	if (sdev->options.try_gso && version >= MPA_REVISION_2)
 		cep->mpa.hdr.params.bits |= MPA_RR_FLAG_GSO_EXP;
 
 	if (sdev->options.crc_required)
@@ -1488,6 +1604,26 @@ int siw_connect(struct iw_cm_id *id, struct iw_cm_conn_param *params)
 		cep->mpa.v2_ctrl_req.ird = cep->mpa.v2_ctrl.ird;
 		cep->mpa.v2_ctrl_req.ord = cep->mpa.v2_ctrl.ord;
 	}
+	if (version == MPA_REVISION_1 && p2p_mode) {
+		cep->enhanced_rdma_conn_est = true;
+
+		cep->mpa.v1_ctrl.ird = htonl(cep->ird);
+		cep->mpa.v1_ctrl.ord = htonl(cep->ord);
+		/* Remember own P2P mode requested */
+		cep->mpa.v1_ctrl_req.ird = cep->mpa.v1_ctrl.ird;
+		cep->mpa.v1_ctrl_req.ord = cep->mpa.v1_ctrl.ord;
+
+		/*
+		 * Also setup the v2 values in order
+		 * to allow the callers to be unchanged
+		 */
+		cep->mpa.v2_ctrl.ird = htons(cep->ird);
+		cep->mpa.v2_ctrl.ord = htons(cep->ord);
+		cep->mpa.v2_ctrl.ird |= MPA_V2_PEER_TO_PEER;
+		cep->mpa.v2_ctrl.ord |= MPA_V2_RDMA_READ_RTR;
+		cep->mpa.v2_ctrl_req.ird = cep->mpa.v2_ctrl.ird;
+		cep->mpa.v2_ctrl_req.ord = cep->mpa.v2_ctrl.ord;
+	}
 	memcpy(cep->mpa.hdr.key, MPA_KEY_REQ, 16);
 
 	rv = siw_send_mpareqrep(cep, params->private_data, pd_len);
@@ -1551,6 +1687,7 @@ int siw_accept(struct iw_cm_id *id, struct iw_cm_conn_param *params)
 {
 	struct siw_device *sdev = to_siw_dev(id->device);
 	struct siw_cep *cep = (struct siw_cep *)id->provider_data;
+	size_t hide_len = siw_cep_mpa_vX_ctrl_len(cep);
 	struct siw_qp *qp;
 	struct siw_qp_attrs qp_attrs;
 	int rv = -EINVAL, max_priv_data = MPA_MAX_PRIVDATA;
@@ -1582,6 +1719,17 @@ int siw_accept(struct iw_cm_id *id, struct iw_cm_conn_param *params)
 		goto error_unlock;
 	siw_dbg_cep(cep, "[QP %d]\n", params->qpn);
 
+	siw_dbg_cep(
+		cep,
+		"[QP %u]: START ord (params %d) (cep %d) (max %d), ird (params %d) (cep %d) (max %d)\n",
+		qp_id(qp),
+		params->ord,
+		cep->ord,
+		sdev->attrs.max_ord,
+		params->ird,
+		cep->ird,
+		sdev->attrs.max_ird);
+
 	if (sdev->options.try_gso &&
 	    cep->mpa.hdr.params.bits & MPA_RR_FLAG_GSO_EXP) {
 		siw_dbg_cep(cep, "peer allows GSO on TX\n");
@@ -1596,8 +1744,8 @@ int siw_accept(struct iw_cm_id *id, struct iw_cm_conn_param *params)
 			params->ird, sdev->attrs.max_ird);
 		goto error_unlock;
 	}
-	if (cep->enhanced_rdma_conn_est)
-		max_priv_data -= sizeof(struct mpa_v2_data);
+	if (max_priv_data >= hide_len)
+		max_priv_data -= hide_len;
 
 	if (params->private_data_len > max_priv_data) {
 		siw_dbg_cep(
@@ -1637,10 +1785,24 @@ int siw_accept(struct iw_cm_id *id, struct iw_cm_conn_param *params)
 		cep->mpa.v2_ctrl.ird =
 			htons(params->ird & MPA_IRD_ORD_MASK) |
 			(cep->mpa.v2_ctrl.ird & ~MPA_V2_MASK_IRD_ORD);
+
+		cep->mpa.v1_ctrl.ord = htonl(params->ord & MPA_IRD_ORD_MASK);
+		cep->mpa.v1_ctrl.ird = htonl(params->ird & MPA_IRD_ORD_MASK);
 	}
 	cep->ird = params->ird;
 	cep->ord = params->ord;
 
+	siw_dbg_cep(
+		cep,
+		"[QP %u]: NEGOTIATED ord (params %d) (cep %d) (max %d), ird (params %d) (cep %d) (max %d)\n",
+		qp_id(qp),
+		params->ord,
+		cep->ord,
+		sdev->attrs.max_ord,
+		params->ird,
+		cep->ird,
+		sdev->attrs.max_ird);
+
 	cep->cm_id = id;
 	id->add_ref(id);
 
@@ -1675,6 +1837,16 @@ int siw_accept(struct iw_cm_id *id, struct iw_cm_conn_param *params)
 	siw_dbg_cep(cep, "[QP %u]: send mpa reply, %d byte pdata\n",
 		    qp_id(qp), params->private_data_len);
 
+	siw_dbg_cep(
+		cep,
+		"[QP %u]: MPA RESPONSE ord (params %d) (cep %d) (qp_attrs %d), ird (params %d) (cep %d) (qp_attrs %d)\n",
+		qp_id(qp),
+		params->ord,
+		cep->ord,
+		qp_attrs.orq_size,
+		params->ird,
+		cep->ird,
+		qp_attrs.irq_size);
 	rv = siw_send_mpareqrep(cep, params->private_data,
 				params->private_data_len);
 	if (rv != 0)
diff --git a/drivers/infiniband/sw/siw/siw_cm.h b/drivers/infiniband/sw/siw/siw_cm.h
index 7011c8a8ee7b..506de01a701b 100644
--- a/drivers/infiniband/sw/siw/siw_cm.h
+++ b/drivers/infiniband/sw/siw/siw_cm.h
@@ -28,6 +28,8 @@ struct siw_mpa_info {
 	struct mpa_rr hdr; /* peer mpa hdr in host byte order */
 	struct mpa_v2_data v2_ctrl;
 	struct mpa_v2_data v2_ctrl_req;
+	struct mpa_v1_smbd_data v1_ctrl;
+	struct mpa_v1_smbd_data v1_ctrl_req;
 	char *pdata;
 	int bytes_rcvd;
 };
diff --git a/drivers/infiniband/sw/siw/siw_main.c b/drivers/infiniband/sw/siw/siw_main.c
index c7d1e44ec8fa..208a6366e9e9 100644
--- a/drivers/infiniband/sw/siw/siw_main.c
+++ b/drivers/infiniband/sw/siw/siw_main.c
@@ -169,7 +169,7 @@ static int siw_mpa_version_set(const char *val, const struct kernel_param *kp)
 
 	switch (uval) {
 	case MPA_REVISION_2:
-	/* TODO case MPA_REVISION_1: */
+	case MPA_REVISION_1:
 		siw_default_options.mpa_version = uval;
 		return 0;
 	}
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/8] RDMA/siw: [re-]introduce module parameters and add MPA V1
  2025-07-03 17:26 [PATCH 0/8] RDMA/siw: [re-]introduce module parameters and add MPA V1 Stefan Metzmacher
                   ` (7 preceding siblings ...)
  2025-07-03 17:26 ` [PATCH 8/8] RDMA/siw: add support for MPA V1 and IRD/ORD negotiation based on [MS-SMBD] Stefan Metzmacher
@ 2025-07-03 19:16 ` Chuck Lever
  2025-07-04 14:13   ` Bernard Metzler
  2025-07-06  7:56 ` Leon Romanovsky
  9 siblings, 1 reply; 13+ messages in thread
From: Chuck Lever @ 2025-07-03 19:16 UTC (permalink / raw)
  To: linux-rdma; +Cc: Bernard Metzler, Stefan Metzmacher

On 7/3/25 1:26 PM, Stefan Metzmacher wrote:
> While working on smbdirect cleanup I'm using siw.ko for
> testing against Windows using Chelsio T404-BT (only supporting MPA V1)
> and T520-BT (supporting MPA V2) cards.
> 
> Up to now I used old patches to add MPA V1 support
> to siw.ko and compiled against the running kernel
> each time. I don't want to do that forever, so
> I re-introduced module parameters in order to
> avoid patching and compiling.
> 
> For my testing I'm using something like this:
> 
>   echo 1 > /sys/module/siw/parameters/mpa_crc_required
>   echo 1 > /sys/module/siw/parameters/mpa_crc_strict
>   echo 0 > /sys/module/siw/parameters/mpa_rdma_write_rtr
>   echo 1 > /sys/module/siw/parameters/mpa_peer_to_peer
> 
>   echo 1 > /sys/module/siw/parameters/mpa_version
>   rdma link add siw_v1_eth0 type siw netdev eth0
> 
>   echo 2 > /sys/module/siw/parameters/mpa_version
>   rdma link add siw_v2_eth1 type siw netdev eth1
> 
> The global parameters are copied at 'rdma link add' time
> and will stay as is for the lifetime of the specific device.
> 
> So I can testing against the T520-BT card using siw_v2_eth1
> and against the T404-BT card using siw_v1_eth0.
> 
> It would be great if this can go upstream.
> 
> Stefan Metzmacher (8):
>   RDMA/siw: make mpa_version = MPA_REVISION_2 const
>   RDMA/siw: remove unused loopback_enabled = true
>   RDMA/siw: add and remember siw_device_options per device.
>   RDMA/siw: make use of sdev->options.* and avoid globals
>   RDMA/siw: combine global options into siw_default_options
>   RDMA/siw: move rtr_type to siw_device_options
>   RDMA/siw: [re-]introduce module parameters to alter the behavior at
>     runtime
>   RDMA/siw: add support for MPA V1 and IRD/ORD negotiation based on
>     [MS-SMBD]
> 
>  drivers/infiniband/sw/siw/iwarp.h     |   8 +
>  drivers/infiniband/sw/siw/siw.h       |  21 +-
>  drivers/infiniband/sw/siw/siw_cm.c    | 268 +++++++++++++++++++++-----
>  drivers/infiniband/sw/siw/siw_cm.h    |   2 +
>  drivers/infiniband/sw/siw/siw_main.c  | 173 ++++++++++++++---
>  drivers/infiniband/sw/siw/siw_qp_tx.c |   3 +-
>  drivers/infiniband/sw/siw/siw_verbs.c |   2 +-
>  7 files changed, 395 insertions(+), 82 deletions(-)
> 

Without addressing the question of whether siw should implement MPAv1, I
would like to request a user interface that is more friendly to
containers and to automation (CI). I know there is plenty of precedent
for "pull in the current setting of some random global module
parameter", but I don't think that is up to today's most common
deployment scenarios.

Surely rxe will also need to have a similar switch between RoCEv1 and
RoCEv2? Could something be added in the "rdma" tool to make this a per-
device protocol version setting?


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/8] RDMA/siw: remove unused loopback_enabled = true
  2025-07-03 17:26 ` [PATCH 2/8] RDMA/siw: remove unused loopback_enabled = true Stefan Metzmacher
@ 2025-07-04  1:14   ` yanjun.zhu
  0 siblings, 0 replies; 13+ messages in thread
From: yanjun.zhu @ 2025-07-04  1:14 UTC (permalink / raw)
  To: Stefan Metzmacher, linux-rdma; +Cc: Bernard Metzler

On 7/3/25 10:26 AM, Stefan Metzmacher wrote:
> Devices are created explicitly by the administrator using
> 'rdma link add siw_lo type siw netdev lo'.

In the file drivers/infiniband/core/addr.c:

496 static int rdma_set_src_addr_rcu(struct rdma_dev_addr *dev_addr,
497                                  unsigned int *ndev_flags,
498                                  const struct sockaddr *dst_in,
499                                  const struct dst_entry *dst)
500 {
501         struct net_device *ndev = READ_ONCE(dst->dev);
502
503         *ndev_flags = ndev->flags;
504         /* A physical device must be the RDMA device to use */
505         if (ndev->flags & IFF_LOOPBACK) {
506                 /*
507                  * RDMA (IB/RoCE, iWarp) doesn't run on lo interface or
508                  * loopback IP address. So if route is resolved to 
loopback
509                  * interface, translate that to a real ndev based on non
510                  * loopback IP address.
511                  */
512                 ndev = rdma_find_ndev_for_src_ip_rcu(dev_net(ndev), 
dst_in);
513                 if (IS_ERR(ndev))
514                         return -ENODEV;
515         }
516
517         return copy_src_l2_addr(dev_addr, dst_in, dst, ndev);
518 }

I am not sure whether the above comments are correct or not because you 
are creating a siw device on lo netdev.

Yanjun.Zhu

> 
> Cc: Bernard Metzler <bmt@zurich.ibm.com>
> Cc: linux-rdma@vger.kernel.org
> Signed-off-by: Stefan Metzmacher <metze@samba.org>
> ---
>   drivers/infiniband/sw/siw/siw.h      | 1 -
>   drivers/infiniband/sw/siw/siw_main.c | 5 +----
>   2 files changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/drivers/infiniband/sw/siw/siw.h b/drivers/infiniband/sw/siw/siw.h
> index 3e04357ab197..3bdc17eedbe7 100644
> --- a/drivers/infiniband/sw/siw/siw.h
> +++ b/drivers/infiniband/sw/siw/siw.h
> @@ -490,7 +490,6 @@ struct siw_user_mmap_entry {
>   /* Global siw parameters. Currently set in siw_main.c */
>   extern const bool zcopy_tx;
>   extern const bool try_gso;
> -extern const bool loopback_enabled;
>   extern const bool mpa_crc_required;
>   extern const bool mpa_crc_strict;
>   extern const bool siw_tcp_nagle;
> diff --git a/drivers/infiniband/sw/siw/siw_main.c b/drivers/infiniband/sw/siw/siw_main.c
> index 4e1d29832ac8..ba238b0b43a3 100644
> --- a/drivers/infiniband/sw/siw/siw_main.c
> +++ b/drivers/infiniband/sw/siw/siw_main.c
> @@ -38,8 +38,6 @@ const bool zcopy_tx = true;
>    */
>   const bool try_gso;
>   
> -/* Attach siw also with loopback devices */
> -const bool loopback_enabled = true;
>   
>   /* We try to negotiate CRC on, if true */
>   const bool mpa_crc_required;
> @@ -94,8 +92,7 @@ static int siw_dev_qualified(struct net_device *netdev)
>   	 * <linux/if_arp.h> for type identifiers.
>   	 */
>   	if (netdev->type == ARPHRD_ETHER || netdev->type == ARPHRD_IEEE802 ||
> -	    netdev->type == ARPHRD_NONE ||
> -	    (netdev->type == ARPHRD_LOOPBACK && loopback_enabled))
> +	    netdev->type == ARPHRD_NONE || netdev->type == ARPHRD_LOOPBACK)
>   		return 1;
>   
>   	return 0;


^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH 0/8] RDMA/siw: [re-]introduce module parameters and add MPA V1
  2025-07-03 19:16 ` [PATCH 0/8] RDMA/siw: [re-]introduce module parameters and add MPA V1 Chuck Lever
@ 2025-07-04 14:13   ` Bernard Metzler
  0 siblings, 0 replies; 13+ messages in thread
From: Bernard Metzler @ 2025-07-04 14:13 UTC (permalink / raw)
  To: Chuck Lever, linux-rdma@vger.kernel.org; +Cc: Stefan Metzmacher



> -----Original Message-----
> From: Chuck Lever <chuck.lever@oracle.com>
> Sent: Thursday, July 3, 2025 9:17 PM
> To: linux-rdma@vger.kernel.org
> Cc: Bernard Metzler <BMT@zurich.ibm.com>; Stefan Metzmacher
> <metze@samba.org>
> Subject: [EXTERNAL] Re: [PATCH 0/8] RDMA/siw: [re-]introduce module
> parameters and add MPA V1
> 
> On 7/3/25 1:26 PM, Stefan Metzmacher wrote:
> > While working on smbdirect cleanup I'm using siw.ko for
> > testing against Windows using Chelsio T404-BT (only supporting MPA V1)
> > and T520-BT (supporting MPA V2) cards.
> >
> > Up to now I used old patches to add MPA V1 support
> > to siw.ko and compiled against the running kernel
> > each time. I don't want to do that forever, so
> > I re-introduced module parameters in order to
> > avoid patching and compiling.
> >

Using siw for testing upper layer RDMA
protocols/applications and involving 'real' RDMA
hardware at peer side is one of the major purposes
siw was originally done for. So I appreciate it
is being used that way and can follow the arguments.

However, we shall take care not to incorporate the
simplification of a single upper layer test case
like smbdirect into a kernel driver. Especially, I
am really not convinced smbdirect packet header
syntax to appear within siw. This is an upper layer
protocol and its semantics are void for iWarp.

I agree with Chuck that exporting module parameters
is an outdated solution. The proposed alternative -
the addition of control mechanics to the 'rdma tool'
sounds promising to me. We might invent wire protocol
specific options for RoCE, iWarp, ... devices.
Looking closer, however, for siw (and I think for rxe as
well), options could be even set per endpoint. This
would further improve the benefits of using those software
RDMA drivers for testing.
I am not sure if it is a good idea to add QPs as
configurable resources to the rdma tool, or if we might
consider using struct ib_qp_attr{}, which has
tons of parameters useless for iWarp, or extending rdma_cm
(rdma_set_option()), or if we better forget about
per-connection configuration.

What do others think?
 

Further comment after quickly looking through the patch:
Please no upper case characters in variable or function
names (siw_cep_mpa_vX_ctrl_len or vX_ctrl, etc.)
We have reserved capital letters for macros and constants;
we live a spartan life. 😊


Many thanks,
Bernard.


> > For my testing I'm using something like this:
> >
> >   echo 1 > /sys/module/siw/parameters/mpa_crc_required
> >   echo 1 > /sys/module/siw/parameters/mpa_crc_strict
> >   echo 0 > /sys/module/siw/parameters/mpa_rdma_write_rtr
> >   echo 1 > /sys/module/siw/parameters/mpa_peer_to_peer
> >
> >   echo 1 > /sys/module/siw/parameters/mpa_version
> >   rdma link add siw_v1_eth0 type siw netdev eth0
> >
> >   echo 2 > /sys/module/siw/parameters/mpa_version
> >   rdma link add siw_v2_eth1 type siw netdev eth1
> >
> > The global parameters are copied at 'rdma link add' time
> > and will stay as is for the lifetime of the specific device.
> >
> > So I can testing against the T520-BT card using siw_v2_eth1
> > and against the T404-BT card using siw_v1_eth0.
> >
> > It would be great if this can go upstream.
> >
> > Stefan Metzmacher (8):
> >   RDMA/siw: make mpa_version = MPA_REVISION_2 const
> >   RDMA/siw: remove unused loopback_enabled = true
> >   RDMA/siw: add and remember siw_device_options per device.
> >   RDMA/siw: make use of sdev->options.* and avoid globals
> >   RDMA/siw: combine global options into siw_default_options
> >   RDMA/siw: move rtr_type to siw_device_options
> >   RDMA/siw: [re-]introduce module parameters to alter the behavior at
> >     runtime
> >   RDMA/siw: add support for MPA V1 and IRD/ORD negotiation based on
> >     [MS-SMBD]
> >
> >  drivers/infiniband/sw/siw/iwarp.h     |   8 +
> >  drivers/infiniband/sw/siw/siw.h       |  21 +-
> >  drivers/infiniband/sw/siw/siw_cm.c    | 268 +++++++++++++++++++++-----
> >  drivers/infiniband/sw/siw/siw_cm.h    |   2 +
> >  drivers/infiniband/sw/siw/siw_main.c  | 173 ++++++++++++++---
> >  drivers/infiniband/sw/siw/siw_qp_tx.c |   3 +-
> >  drivers/infiniband/sw/siw/siw_verbs.c |   2 +-
> >  7 files changed, 395 insertions(+), 82 deletions(-)
> >
> 
> Without addressing the question of whether siw should implement MPAv1, I
> would like to request a user interface that is more friendly to
> containers and to automation (CI). I know there is plenty of precedent
> for "pull in the current setting of some random global module
> parameter", but I don't think that is up to today's most common
> deployment scenarios.
> 
> Surely rxe will also need to have a similar switch between RoCEv1 and
> RoCEv2? Could something be added in the "rdma" tool to make this a per-
> device protocol version setting?
> 
> 
> --
> Chuck Lever

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/8] RDMA/siw: [re-]introduce module parameters and add MPA V1
  2025-07-03 17:26 [PATCH 0/8] RDMA/siw: [re-]introduce module parameters and add MPA V1 Stefan Metzmacher
                   ` (8 preceding siblings ...)
  2025-07-03 19:16 ` [PATCH 0/8] RDMA/siw: [re-]introduce module parameters and add MPA V1 Chuck Lever
@ 2025-07-06  7:56 ` Leon Romanovsky
  9 siblings, 0 replies; 13+ messages in thread
From: Leon Romanovsky @ 2025-07-06  7:56 UTC (permalink / raw)
  To: Stefan Metzmacher; +Cc: linux-rdma, Bernard Metzler

On Thu, Jul 03, 2025 at 07:26:11PM +0200, Stefan Metzmacher wrote:
> While working on smbdirect cleanup I'm using siw.ko for
> testing against Windows using Chelsio T404-BT (only supporting MPA V1)
> and T520-BT (supporting MPA V2) cards.
> 
> Up to now I used old patches to add MPA V1 support
> to siw.ko and compiled against the running kernel
> each time. I don't want to do that forever, so
> I re-introduced module parameters in order to
> avoid patching and compiling.
> 
> For my testing I'm using something like this:
> 
>   echo 1 > /sys/module/siw/parameters/mpa_crc_required
>   echo 1 > /sys/module/siw/parameters/mpa_crc_strict
>   echo 0 > /sys/module/siw/parameters/mpa_rdma_write_rtr
>   echo 1 > /sys/module/siw/parameters/mpa_peer_to_peer
> 
>   echo 1 > /sys/module/siw/parameters/mpa_version
>   rdma link add siw_v1_eth0 type siw netdev eth0
> 
>   echo 2 > /sys/module/siw/parameters/mpa_version
>   rdma link add siw_v2_eth1 type siw netdev eth1
> 
> The global parameters are copied at 'rdma link add' time
> and will stay as is for the lifetime of the specific device.
> 
> So I can testing against the T520-BT card using siw_v2_eth1
> and against the T404-BT card using siw_v1_eth0.
> 
> It would be great if this can go upstream.

Sorry no, there is no benefit in adding complexity for such an old device,
like Chelsio T404-BT.

Thanks

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2025-07-06  7:56 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-03 17:26 [PATCH 0/8] RDMA/siw: [re-]introduce module parameters and add MPA V1 Stefan Metzmacher
2025-07-03 17:26 ` [PATCH 1/8] RDMA/siw: make mpa_version = MPA_REVISION_2 const Stefan Metzmacher
2025-07-03 17:26 ` [PATCH 2/8] RDMA/siw: remove unused loopback_enabled = true Stefan Metzmacher
2025-07-04  1:14   ` yanjun.zhu
2025-07-03 17:26 ` [PATCH 3/8] RDMA/siw: add and remember siw_device_options per device Stefan Metzmacher
2025-07-03 17:26 ` [PATCH 4/8] RDMA/siw: make use of sdev->options.* and avoid globals Stefan Metzmacher
2025-07-03 17:26 ` [PATCH 5/8] RDMA/siw: combine global options into siw_default_options Stefan Metzmacher
2025-07-03 17:26 ` [PATCH 6/8] RDMA/siw: move rtr_type to siw_device_options Stefan Metzmacher
2025-07-03 17:26 ` [PATCH 7/8] RDMA/siw: [re-]introduce module parameters to alter the behavior at runtime Stefan Metzmacher
2025-07-03 17:26 ` [PATCH 8/8] RDMA/siw: add support for MPA V1 and IRD/ORD negotiation based on [MS-SMBD] Stefan Metzmacher
2025-07-03 19:16 ` [PATCH 0/8] RDMA/siw: [re-]introduce module parameters and add MPA V1 Chuck Lever
2025-07-04 14:13   ` Bernard Metzler
2025-07-06  7:56 ` Leon Romanovsky

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).