linux-rdma.vger.kernel.org archive mirror
* [PATCH 0/9] Misc patches for RTRS
@ 2025-12-08 16:15 Md Haris Iqbal
  2025-12-08 16:15 ` [PATCH 1/9] RDMA/rtrs-srv: fix SG mapping Md Haris Iqbal
                   ` (8 more replies)
  0 siblings, 9 replies; 20+ messages in thread
From: Md Haris Iqbal @ 2025-12-08 16:15 UTC
  To: linux-rdma
  Cc: bvanassche, leon, jgg, haris.iqbal, jinpu.wang, grzegorz.prajsner

Hi Jason, hi Leon,

Please consider including the following changes in the next merge window.

Jack Wang (1):
  RDMA/rtrs-clt: Remove unused list-head in rtrs_clt_io_req

Kim Zhu (4):
  RDMA/rtrs: Add error description to the logs
  RDMA/rtrs: Improve error logging for RDMA cm events
  RDMA/rtrs-srv: Rate-limit I/O path error logging
  RDMA/rtrs: Extend log message when a port fails

Md Haris Iqbal (3):
  RDMA/rtrs: Add optional support for IB_MR_TYPE_SG_GAPS
  RDMA/rtrs-srv: Add check and closure for possible zombie paths
  RDMA/rtrs-clt.c: For conn rejection use actual err number

Roman Penyaev (1):
  RDMA/rtrs-srv: fix SG mapping

 drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c |   8 +-
 drivers/infiniband/ulp/rtrs/rtrs-clt.c       | 132 ++++++++-----
 drivers/infiniband/ulp/rtrs/rtrs-clt.h       |   1 -
 drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c |  12 +-
 drivers/infiniband/ulp/rtrs/rtrs-srv.c       | 185 +++++++++++++------
 drivers/infiniband/ulp/rtrs/rtrs-srv.h       |   1 +
 drivers/infiniband/ulp/rtrs/rtrs.c           |   9 +-
 7 files changed, 232 insertions(+), 116 deletions(-)

-- 
2.43.0



* [PATCH 1/9] RDMA/rtrs-srv: fix SG mapping
  2025-12-08 16:15 [PATCH 0/9] Misc patches for RTRS Md Haris Iqbal
@ 2025-12-08 16:15 ` Md Haris Iqbal
  2025-12-08 16:15 ` [PATCH 2/9] RDMA/rtrs: Add error description to the logs Md Haris Iqbal
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 20+ messages in thread
From: Md Haris Iqbal @ 2025-12-08 16:15 UTC
  To: linux-rdma
  Cc: bvanassche, leon, jgg, haris.iqbal, jinpu.wang, grzegorz.prajsner,
	Roman Penyaev

From: Roman Penyaev <r.peniaev@gmail.com>

This fixes the following error on the server side:

   RTRS server session allocation failed: -EINVAL

The error comes from the caller of `ib_dma_map_sg()`, which does not
expect fewer mapped entries than requested. Getting fewer entries is
normal, because adjacent regions can be merged, and it is easy to
reproduce on a machine with the IOMMU enabled.

The fix is to treat any positive number of mapped SG entries as a
successful mapping and to cache the DMA addresses by traversing the
modified SG table.

Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Signed-off-by: Roman Penyaev <r.peniaev@gmail.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-srv.c | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)
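
Reader's note, not part of the commit: the address-caching loop added
below can be tried out in userspace. This is only a minimal sketch under
stated assumptions. `struct seg`, `cache_chunk_addrs()` and the `out_cap`
bounds check are hypothetical names invented for the illustration; they
stand in for the scatterlist entry, the loop in map_cont_bufs() and the
size of srv_path->dma_addr[], and are not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one DMA-mapped scatterlist entry. */
struct seg {
	uint64_t dma_addr;	/* start of the mapped region */
	uint32_t dma_len;	/* length of the mapped region in bytes */
};

/*
 * Write one address per chunk_size bytes of every segment into out[].
 * A segment that covers several merged chunks yields several entries.
 * Returns the number of entries written, or 0 if out[] is too small.
 */
size_t cache_chunk_addrs(const struct seg *segs, size_t nr_segs,
			 uint32_t chunk_size,
			 uint64_t *out, size_t out_cap)
{
	size_t ind = 0;

	assert(chunk_size > 0);

	for (size_t i = 0; i < nr_segs; i++) {
		uint64_t addr = segs[i].dma_addr;
		uint64_t end = addr + segs[i].dma_len;

		/* Inner loop: walk a (possibly merged) region chunk by chunk. */
		do {
			if (ind == out_cap)
				return 0;
			out[ind++] = addr;
			addr += chunk_size;
		} while (addr < end);
	}
	return ind;
}
```

With a chunk size of 0x1000, a merged segment of length 0x2000 followed
by an unmerged segment of length 0x1000 fills three slots, even though
only two entries were mapped.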

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index ef4abdea3c2d..2589871c0fa9 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -601,7 +601,7 @@ static int map_cont_bufs(struct rtrs_srv_path *srv_path)
 	     srv_path->mrs_num++) {
 		struct rtrs_srv_mr *srv_mr = &srv_path->mrs[srv_path->mrs_num];
 		struct scatterlist *s;
-		int nr, nr_sgt, chunks;
+		int nr, nr_sgt, chunks, ind;
 
 		sgt = &srv_mr->sgt;
 		chunks = chunks_per_mr * srv_path->mrs_num;
@@ -631,7 +631,7 @@ static int map_cont_bufs(struct rtrs_srv_path *srv_path)
 		}
 		nr = ib_map_mr_sg(mr, sgt->sgl, nr_sgt,
 				  NULL, max_chunk_size);
-		if (nr != nr_sgt) {
+		if (nr < nr_sgt) {
 			err = nr < 0 ? nr : -EINVAL;
 			goto dereg_mr;
 		}
@@ -647,9 +647,24 @@ static int map_cont_bufs(struct rtrs_srv_path *srv_path)
 				goto dereg_mr;
 			}
 		}
-		/* Eventually dma addr for each chunk can be cached */
-		for_each_sg(sgt->sgl, s, nr_sgt, i)
-			srv_path->dma_addr[chunks + i] = sg_dma_address(s);
+
+		/*
+		 * Cache DMA addresses by traversing sg entries.  If
+		 * regions were merged, an inner loop is required to
+		 * populate the DMA address array by traversing larger
+		 * regions.
+		 */
+		ind = chunks;
+		for_each_sg(sgt->sgl, s, nr_sgt, i) {
+			unsigned int dma_len = sg_dma_len(s);
+			u64 dma_addr = sg_dma_address(s);
+			u64 dma_addr_end = dma_addr + dma_len;
+
+			do {
+				srv_path->dma_addr[ind++] = dma_addr;
+				dma_addr += max_chunk_size;
+			} while (dma_addr < dma_addr_end);
+		}
 
 		ib_update_fast_reg_key(mr, ib_inc_rkey(mr->rkey));
 		srv_mr->mr = mr;
-- 
2.43.0



* [PATCH 2/9] RDMA/rtrs: Add error description to the logs
  2025-12-08 16:15 [PATCH 0/9] Misc patches for RTRS Md Haris Iqbal
  2025-12-08 16:15 ` [PATCH 1/9] RDMA/rtrs-srv: fix SG mapping Md Haris Iqbal
@ 2025-12-08 16:15 ` Md Haris Iqbal
  2025-12-12  5:26   ` Dan Carpenter
  2025-12-18 15:51   ` Leon Romanovsky
  2025-12-08 16:15 ` [PATCH 3/9] RDMA/rtrs: Add optional support for IB_MR_TYPE_SG_GAPS Md Haris Iqbal
                   ` (6 subsequent siblings)
  8 siblings, 2 replies; 20+ messages in thread
From: Md Haris Iqbal @ 2025-12-08 16:15 UTC
  To: linux-rdma
  Cc: bvanassche, leon, jgg, haris.iqbal, jinpu.wang, grzegorz.prajsner,
	Kim Zhu

From: Kim Zhu <zhu.yanjun@ionos.com>

Print the error description in addition to the error number.

Signed-off-by: Kim Zhu <zhu.yanjun@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c |  8 +-
 drivers/infiniband/ulp/rtrs/rtrs-clt.c       | 89 ++++++++++----------
 drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c | 12 +--
 drivers/infiniband/ulp/rtrs/rtrs-srv.c       | 78 ++++++++---------
 drivers/infiniband/ulp/rtrs/rtrs.c           |  9 +-
 5 files changed, 101 insertions(+), 95 deletions(-)
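
Reader's note, not part of the commit: the "%d(%pe)" pattern prints the
numeric error followed by what %pe produces for ERR_PTR(err), which is
the symbolic name (for example -EINVAL) on kernels built with
CONFIG_SYMBOLIC_ERRNAME. Userspace printf has no %pe, so the sketch
below only imitates the resulting message shape. `err_name()` and
`format_err()` are hypothetical helpers, and the three-entry table is an
assumption made to keep the example short.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Small illustrative table; a real one would cover every errno value. */
static const char *err_name(int err)
{
	switch (-err) {
	case EINVAL:
		return "-EINVAL";
	case ENOMEM:
		return "-ENOMEM";
	case ECONNRESET:
		return "-ECONNRESET";
	default:
		return NULL;
	}
}

/*
 * Format err as "<number>(<name>)". Unknown values fall back to the
 * bare number in this sketch. Returns what snprintf() returns.
 */
int format_err(char *buf, size_t len, int err)
{
	const char *name = err_name(err);

	assert(buf && len > 0);

	if (name)
		return snprintf(buf, len, "%d(%s)", err, name);
	return snprintf(buf, len, "%d", err);
}
```

Calling format_err() with -EINVAL yields the number followed by
"(-EINVAL)", which is the shape the converted log messages take.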

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c b/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c
index 4aa80c9388f0..b318acc12b10 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c
@@ -439,19 +439,19 @@ int rtrs_clt_create_path_files(struct rtrs_clt_path *clt_path)
 				   clt->kobj_paths,
 				   "%s", str);
 	if (err) {
-		pr_err("kobject_init_and_add: %d\n", err);
+		pr_err("kobject_init_and_add: %d(%pe)\n", err, ERR_PTR(err));
 		kobject_put(&clt_path->kobj);
 		return err;
 	}
 	err = sysfs_create_group(&clt_path->kobj, &rtrs_clt_path_attr_group);
 	if (err) {
-		pr_err("sysfs_create_group(): %d\n", err);
+		pr_err("sysfs_create_group(): %d(%pe)\n", err, ERR_PTR(err));
 		goto put_kobj;
 	}
 	err = kobject_init_and_add(&clt_path->stats->kobj_stats, &ktype_stats,
 				   &clt_path->kobj, "stats");
 	if (err) {
-		pr_err("kobject_init_and_add: %d\n", err);
+		pr_err("kobject_init_and_add: %d(%pe)\n", err, ERR_PTR(err));
 		kobject_put(&clt_path->stats->kobj_stats);
 		goto remove_group;
 	}
@@ -459,7 +459,7 @@ int rtrs_clt_create_path_files(struct rtrs_clt_path *clt_path)
 	err = sysfs_create_group(&clt_path->stats->kobj_stats,
 				 &rtrs_clt_stats_attr_group);
 	if (err) {
-		pr_err("failed to create stats sysfs group, err: %d\n", err);
+		pr_err("failed to create stats sysfs group, err: %d(%pe)\n", err, ERR_PTR(err));
 		goto put_kobj_stats;
 	}
 
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index 71387811b281..808de144d2e4 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -422,8 +422,8 @@ static void complete_rdma_req(struct rtrs_clt_io_req *req, int errno,
 			refcount_inc(&req->ref);
 			err = rtrs_inv_rkey(req);
 			if (err) {
-				rtrs_err_rl(con->c.path, "Send INV WR key=%#x: %d\n",
-					  req->mr->rkey, err);
+				rtrs_err_rl(con->c.path, "Send INV WR key=%#x: %d(%pe)\n",
+					    req->mr->rkey, err, ERR_PTR(err));
 			} else if (can_wait) {
 				wait_for_completion(&req->inv_comp);
 			}
@@ -443,8 +443,8 @@ static void complete_rdma_req(struct rtrs_clt_io_req *req, int errno,
 
 	if (errno) {
 		rtrs_err_rl(con->c.path,
-			    "IO %s request failed: error=%d path=%s [%s:%u] notify=%d\n",
-			    req->dir == DMA_TO_DEVICE ? "write" : "read", errno,
+			    "IO %s request failed: error=%d(%pe) path=%s [%s:%u] notify=%d\n",
+			    req->dir == DMA_TO_DEVICE ? "write" : "read", errno, ERR_PTR(errno),
 			    kobject_name(&clt_path->kobj), clt_path->hca_name,
 			    clt_path->hca_port, notify);
 	}
@@ -514,7 +514,8 @@ static void rtrs_clt_recv_done(struct rtrs_clt_con *con, struct ib_wc *wc)
 			  cqe);
 	err = rtrs_iu_post_recv(&con->c, iu);
 	if (err) {
-		rtrs_err(con->c.path, "post iu failed %d\n", err);
+		rtrs_err(con->c.path, "post iu failed %d(%pe)\n", err,
+			 ERR_PTR(err));
 		rtrs_rdma_error_recovery(con);
 	}
 }
@@ -659,8 +660,8 @@ static void rtrs_clt_rdma_done(struct ib_cq *cq, struct ib_wc *wc)
 		else
 			err = rtrs_post_recv_empty(&con->c, &io_comp_cqe);
 		if (err) {
-			rtrs_err(con->c.path, "rtrs_post_recv_empty(): %d\n",
-				  err);
+			rtrs_err(con->c.path, "rtrs_post_recv_empty(): %d(%pe)\n",
+				 err, ERR_PTR(err));
 			rtrs_rdma_error_recovery(con);
 		}
 		break;
@@ -731,8 +732,8 @@ static int post_recv_path(struct rtrs_clt_path *clt_path)
 
 		err = post_recv_io(to_clt_con(clt_path->s.con[cid]), q_size);
 		if (err) {
-			rtrs_err(clt_path->clt, "post_recv_io(), err: %d\n",
-				 err);
+			rtrs_err(clt_path->clt, "post_recv_io(), err: %d(%pe)\n",
+				 err, ERR_PTR(err));
 			return err;
 		}
 	}
@@ -1122,8 +1123,8 @@ static int rtrs_clt_write_req(struct rtrs_clt_io_req *req)
 		ret = rtrs_map_sg_fr(req, count);
 		if (ret < 0) {
 			rtrs_err_rl(s,
-				    "Write request failed, failed to map fast reg. data, err: %d\n",
-				    ret);
+				    "Write request failed, failed to map fast reg. data, err: %d(%pe)\n",
+				    ret, ERR_PTR(ret));
 			ib_dma_unmap_sg(clt_path->s.dev->ib_dev, req->sglist,
 					req->sg_cnt, req->dir);
 			return ret;
@@ -1150,9 +1151,9 @@ static int rtrs_clt_write_req(struct rtrs_clt_io_req *req)
 				      imm, wr, NULL);
 	if (ret) {
 		rtrs_err_rl(s,
-			    "Write request failed: error=%d path=%s [%s:%u]\n",
-			    ret, kobject_name(&clt_path->kobj), clt_path->hca_name,
-			    clt_path->hca_port);
+			    "Write request failed: error=%d(%pe) path=%s [%s:%u]\n",
+			    ret, ERR_PTR(ret), kobject_name(&clt_path->kobj),
+			    clt_path->hca_name, clt_path->hca_port);
 		if (req->mp_policy == MP_POLICY_MIN_INFLIGHT)
 			atomic_dec(&clt_path->stats->inflight);
 		if (req->mr->need_inval) {
@@ -1208,8 +1209,8 @@ static int rtrs_clt_read_req(struct rtrs_clt_io_req *req)
 		ret = rtrs_map_sg_fr(req, count);
 		if (ret < 0) {
 			rtrs_err_rl(s,
-				     "Read request failed, failed to map fast reg. data, err: %d\n",
-				     ret);
+				     "Read request failed, failed to map fast reg. data, err: %d(%pe)\n",
+				     ret, ERR_PTR(ret));
 			ib_dma_unmap_sg(dev->ib_dev, req->sglist, req->sg_cnt,
 					req->dir);
 			return ret;
@@ -1260,9 +1261,9 @@ static int rtrs_clt_read_req(struct rtrs_clt_io_req *req)
 				   req->data_len, imm, wr);
 	if (ret) {
 		rtrs_err_rl(s,
-			    "Read request failed: error=%d path=%s [%s:%u]\n",
-			    ret, kobject_name(&clt_path->kobj), clt_path->hca_name,
-			    clt_path->hca_port);
+			    "Read request failed: error=%d(%pe) path=%s [%s:%u]\n",
+			    ret, ERR_PTR(ret), kobject_name(&clt_path->kobj),
+			    clt_path->hca_name, clt_path->hca_port);
 		if (req->mp_policy == MP_POLICY_MIN_INFLIGHT)
 			atomic_dec(&clt_path->stats->inflight);
 		req->mr->need_inval = false;
@@ -1774,12 +1775,12 @@ static int rtrs_rdma_addr_resolved(struct rtrs_clt_con *con)
 	err = create_con_cq_qp(con);
 	mutex_unlock(&con->con_mutex);
 	if (err) {
-		rtrs_err(s, "create_con_cq_qp(), err: %d\n", err);
+		rtrs_err(s, "create_con_cq_qp(), err: %d(%pe)\n", err, ERR_PTR(err));
 		return err;
 	}
 	err = rdma_resolve_route(con->c.cm_id, RTRS_CONNECT_TIMEOUT_MS);
 	if (err)
-		rtrs_err(s, "Resolving route failed, err: %d\n", err);
+		rtrs_err(s, "Resolving route failed, err: %d(%pe)\n", err, ERR_PTR(err));
 
 	return err;
 }
@@ -1813,7 +1814,7 @@ static int rtrs_rdma_route_resolved(struct rtrs_clt_con *con)
 
 	err = rdma_connect_locked(con->c.cm_id, &param);
 	if (err)
-		rtrs_err(clt, "rdma_connect_locked(): %d\n", err);
+		rtrs_err(clt, "rdma_connect_locked(): %d(%pe)\n", err, ERR_PTR(err));
 
 	return err;
 }
@@ -1846,8 +1847,8 @@ static int rtrs_rdma_conn_established(struct rtrs_clt_con *con,
 	}
 	errno = le16_to_cpu(msg->errno);
 	if (errno) {
-		rtrs_err(clt, "Invalid RTRS message: errno %d\n",
-			  errno);
+		rtrs_err(clt, "Invalid RTRS message: errno %d(%pe)\n",
+			  errno, ERR_PTR(errno));
 		return -ECONNRESET;
 	}
 	if (con->c.cid == 0) {
@@ -1936,12 +1937,12 @@ static int rtrs_rdma_conn_rejected(struct rtrs_clt_con *con,
 				  "Previous session is still exists on the server, please reconnect later\n");
 		else
 			rtrs_err(s,
-				  "Connect rejected: status %d (%s), rtrs errno %d\n",
-				  status, rej_msg, errno);
+				  "Connect rejected: status %d (%s), rtrs errno %d(%pe)\n",
+				  status, rej_msg, errno, ERR_PTR(errno));
 	} else {
 		rtrs_err(s,
-			  "Connect rejected but with malformed message: status %d (%s)\n",
-			  status, rej_msg);
+			  "Connect rejected but with malformed message: status %d(%pe) (%s)\n",
+			  status, ERR_PTR(status), rej_msg);
 	}
 
 	return -ECONNRESET;
@@ -2008,27 +2009,27 @@ static int rtrs_clt_rdma_cm_handler(struct rdma_cm_id *cm_id,
 	case RDMA_CM_EVENT_UNREACHABLE:
 	case RDMA_CM_EVENT_ADDR_CHANGE:
 	case RDMA_CM_EVENT_TIMEWAIT_EXIT:
-		rtrs_wrn(s, "CM error (CM event: %s, err: %d)\n",
-			 rdma_event_msg(ev->event), ev->status);
+		rtrs_wrn(s, "CM error (CM event: %s, err: %d(%pe))\n",
+			 rdma_event_msg(ev->event), ev->status, ERR_PTR(ev->status));
 		cm_err = -ECONNRESET;
 		break;
 	case RDMA_CM_EVENT_ADDR_ERROR:
 	case RDMA_CM_EVENT_ROUTE_ERROR:
-		rtrs_wrn(s, "CM error (CM event: %s, err: %d)\n",
-			 rdma_event_msg(ev->event), ev->status);
+		rtrs_wrn(s, "CM error (CM event: %s, err: %d(%pe))\n",
+			 rdma_event_msg(ev->event), ev->status, ERR_PTR(ev->status));
 		cm_err = -EHOSTUNREACH;
 		break;
 	case RDMA_CM_EVENT_DEVICE_REMOVAL:
 		/*
 		 * Device removal is a special case.  Queue close and return 0.
 		 */
-		rtrs_wrn_rl(s, "CM event: %s, status: %d\n", rdma_event_msg(ev->event),
-			    ev->status);
+		rtrs_wrn_rl(s, "CM event: %s, status: %d(%pe)\n", rdma_event_msg(ev->event),
+			    ev->status, ERR_PTR(ev->status));
 		rtrs_clt_close_conns(clt_path, false);
 		return 0;
 	default:
-		rtrs_err(s, "Unexpected RDMA CM error (CM event: %s, err: %d)\n",
-			 rdma_event_msg(ev->event), ev->status);
+		rtrs_err(s, "Unexpected RDMA CM error (CM event: %s, err: %d(%pe))\n",
+			 rdma_event_msg(ev->event), ev->status, ERR_PTR(ev->status));
 		cm_err = -ECONNRESET;
 		break;
 	}
@@ -2065,14 +2066,14 @@ static int create_cm(struct rtrs_clt_con *con)
 	/* allow the port to be reused */
 	err = rdma_set_reuseaddr(cm_id, 1);
 	if (err != 0) {
-		rtrs_err(s, "Set address reuse failed, err: %d\n", err);
+		rtrs_err(s, "Set address reuse failed, err: %d(%pe)\n", err, ERR_PTR(err));
 		return err;
 	}
 	err = rdma_resolve_addr(cm_id, (struct sockaddr *)&clt_path->s.src_addr,
 				(struct sockaddr *)&clt_path->s.dst_addr,
 				RTRS_CONNECT_TIMEOUT_MS);
 	if (err) {
-		rtrs_err(s, "Failed to resolve address, err: %d\n", err);
+		rtrs_err(s, "Failed to resolve address, err: %d(%pe)\n", err, ERR_PTR(err));
 		return err;
 	}
 	/*
@@ -2547,7 +2548,7 @@ static int rtrs_send_path_info(struct rtrs_clt_path *clt_path)
 	/* Prepare for getting info response */
 	err = rtrs_iu_post_recv(&usr_con->c, rx_iu);
 	if (err) {
-		rtrs_err(clt_path->clt, "rtrs_iu_post_recv(), err: %d\n", err);
+		rtrs_err(clt_path->clt, "rtrs_iu_post_recv(), err: %d(%pe)\n", err, ERR_PTR(err));
 		goto out;
 	}
 	rx_iu = NULL;
@@ -2563,7 +2564,7 @@ static int rtrs_send_path_info(struct rtrs_clt_path *clt_path)
 	/* Send info request */
 	err = rtrs_iu_post_send(&usr_con->c, tx_iu, sizeof(*msg), NULL);
 	if (err) {
-		rtrs_err(clt_path->clt, "rtrs_iu_post_send(), err: %d\n", err);
+		rtrs_err(clt_path->clt, "rtrs_iu_post_send(), err: %d(%pe)\n", err, ERR_PTR(err));
 		goto out;
 	}
 	tx_iu = NULL;
@@ -2614,15 +2615,15 @@ static int init_path(struct rtrs_clt_path *clt_path)
 	err = init_conns(clt_path);
 	if (err) {
 		rtrs_err(clt_path->clt,
-			 "init_conns() failed: err=%d path=%s [%s:%u]\n", err,
-			 str, clt_path->hca_name, clt_path->hca_port);
+			 "init_conns() failed: err=%d(%pe) path=%s [%s:%u]\n", err,
+			 ERR_PTR(err), str, clt_path->hca_name, clt_path->hca_port);
 		goto out;
 	}
 	err = rtrs_send_path_info(clt_path);
 	if (err) {
 		rtrs_err(clt_path->clt,
-			 "rtrs_send_path_info() failed: err=%d path=%s [%s:%u]\n",
-			 err, str, clt_path->hca_name, clt_path->hca_port);
+			 "rtrs_send_path_info() failed: err=%d(%pe) path=%s [%s:%u]\n",
+			 err, ERR_PTR(err), str, clt_path->hca_name, clt_path->hca_port);
 		goto out;
 	}
 	rtrs_clt_path_up(clt_path);
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c b/drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c
index 3f305e694fe8..5e12701a3733 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c
@@ -176,14 +176,14 @@ static int rtrs_srv_create_once_sysfs_root_folders(struct rtrs_srv_path *srv_pat
 	dev_set_uevent_suppress(&srv->dev, true);
 	err = device_add(&srv->dev);
 	if (err) {
-		pr_err("device_add(): %d\n", err);
+		pr_err("device_add(): %d(%pe)\n", err, ERR_PTR(err));
 		put_device(&srv->dev);
 		goto unlock;
 	}
 	srv->kobj_paths = kobject_create_and_add("paths", &srv->dev.kobj);
 	if (!srv->kobj_paths) {
 		err = -ENOMEM;
-		pr_err("kobject_create_and_add(): %d\n", err);
+		pr_err("kobject_create_and_add(): %d(%pe)\n", err, ERR_PTR(err));
 		device_del(&srv->dev);
 		put_device(&srv->dev);
 		goto unlock;
@@ -237,14 +237,14 @@ static int rtrs_srv_create_stats_files(struct rtrs_srv_path *srv_path)
 	err = kobject_init_and_add(&srv_path->stats->kobj_stats, &ktype_stats,
 				   &srv_path->kobj, "stats");
 	if (err) {
-		rtrs_err(s, "kobject_init_and_add(): %d\n", err);
+		rtrs_err(s, "kobject_init_and_add(): %d(%pe)\n", err, ERR_PTR(err));
 		kobject_put(&srv_path->stats->kobj_stats);
 		return err;
 	}
 	err = sysfs_create_group(&srv_path->stats->kobj_stats,
 				 &rtrs_srv_stats_attr_group);
 	if (err) {
-		rtrs_err(s, "sysfs_create_group(): %d\n", err);
+		rtrs_err(s, "sysfs_create_group(): %d(%pe)\n", err, ERR_PTR(err));
 		goto err;
 	}
 
@@ -276,12 +276,12 @@ int rtrs_srv_create_path_files(struct rtrs_srv_path *srv_path)
 	err = kobject_init_and_add(&srv_path->kobj, &ktype, srv->kobj_paths,
 				   "%s", str);
 	if (err) {
-		rtrs_err(s, "kobject_init_and_add(): %d\n", err);
+		rtrs_err(s, "kobject_init_and_add(): %d(%pe)\n", err, ERR_PTR(err));
 		goto destroy_root;
 	}
 	err = sysfs_create_group(&srv_path->kobj, &rtrs_srv_path_attr_group);
 	if (err) {
-		rtrs_err(s, "sysfs_create_group(): %d\n", err);
+		rtrs_err(s, "sysfs_create_group(): %d(%pe)\n", err, ERR_PTR(err));
 		goto put_kobj;
 	}
 	err = rtrs_srv_create_stats_files(srv_path);
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index 2589871c0fa9..758d77206315 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -323,8 +323,8 @@ static int rdma_write_sg(struct rtrs_srv_op *id)
 	err = ib_post_send(id->con->c.qp, &id->tx_wr.wr, NULL);
 	if (err)
 		rtrs_err(s,
-			  "Posting RDMA-Write-Request to QP failed, err: %d\n",
-			  err);
+			  "Posting RDMA-Write-Request to QP failed, err: %d(%pe)\n",
+			  err, ERR_PTR(err));
 
 	return err;
 }
@@ -440,8 +440,8 @@ static int send_io_resp_imm(struct rtrs_srv_con *con, struct rtrs_srv_op *id,
 
 	err = ib_post_send(id->con->c.qp, wr, NULL);
 	if (err)
-		rtrs_err_rl(s, "Posting RDMA-Reply to QP failed, err: %d\n",
-			     err);
+		rtrs_err_rl(s, "Posting RDMA-Reply to QP failed, err: %d(%pe)\n",
+			     err, ERR_PTR(err));
 
 	return err;
 }
@@ -525,8 +525,8 @@ bool rtrs_srv_resp_rdma(struct rtrs_srv_op *id, int status)
 		err = rdma_write_sg(id);
 
 	if (err) {
-		rtrs_err_rl(s, "IO response failed: %d: srv_path=%s\n", err,
-			    kobject_name(&srv_path->kobj));
+		rtrs_err_rl(s, "IO response failed: %d(%pe): srv_path=%s\n", err,
+			    ERR_PTR(err), kobject_name(&srv_path->kobj));
 		close_path(srv_path);
 	}
 out:
@@ -643,7 +643,7 @@ static int map_cont_bufs(struct rtrs_srv_path *srv_path)
 					DMA_TO_DEVICE, rtrs_srv_rdma_done);
 			if (!srv_mr->iu) {
 				err = -ENOMEM;
-				rtrs_err(ss, "rtrs_iu_alloc(), err: %d\n", err);
+				rtrs_err(ss, "rtrs_iu_alloc(), err: %d(%pe)\n", err, ERR_PTR(err));
 				goto dereg_mr;
 			}
 		}
@@ -819,7 +819,7 @@ static int process_info_req(struct rtrs_srv_con *con,
 
 	err = post_recv_path(srv_path);
 	if (err) {
-		rtrs_err(s, "post_recv_path(), err: %d\n", err);
+		rtrs_err(s, "post_recv_path(), err: %d(%pe)\n", err, ERR_PTR(err));
 		return err;
 	}
 
@@ -882,7 +882,7 @@ static int process_info_req(struct rtrs_srv_con *con,
 	get_device(&srv_path->srv->dev);
 	err = rtrs_srv_change_state(srv_path, RTRS_SRV_CONNECTED);
 	if (!err) {
-		rtrs_err(s, "rtrs_srv_change_state(), err: %d\n", err);
+		rtrs_err(s, "rtrs_srv_change_state(), err: %d(%pe)\n", err, ERR_PTR(err));
 		goto iu_free;
 	}
 
@@ -896,7 +896,7 @@ static int process_info_req(struct rtrs_srv_con *con,
 	 */
 	err = rtrs_srv_path_up(srv_path);
 	if (err) {
-		rtrs_err(s, "rtrs_srv_path_up(), err: %d\n", err);
+		rtrs_err(s, "rtrs_srv_path_up(), err: %d(%pe)\n", err, ERR_PTR(err));
 		goto iu_free;
 	}
 
@@ -907,7 +907,7 @@ static int process_info_req(struct rtrs_srv_con *con,
 	/* Send info response */
 	err = rtrs_iu_post_send(&con->c, tx_iu, tx_sz, reg_wr);
 	if (err) {
-		rtrs_err(s, "rtrs_iu_post_send(), err: %d\n", err);
+		rtrs_err(s, "rtrs_iu_post_send(), err: %d(%pe)\n", err, ERR_PTR(err));
 iu_free:
 		rtrs_iu_free(tx_iu, srv_path->s.dev->ib_dev, 1);
 	}
@@ -975,7 +975,7 @@ static int post_recv_info_req(struct rtrs_srv_con *con)
 	/* Prepare for getting info response */
 	err = rtrs_iu_post_recv(&con->c, rx_iu);
 	if (err) {
-		rtrs_err(s, "rtrs_iu_post_recv(), err: %d\n", err);
+		rtrs_err(s, "rtrs_iu_post_recv(), err: %d(%pe)\n", err, ERR_PTR(err));
 		rtrs_iu_free(rx_iu, srv_path->s.dev->ib_dev, 1);
 		return err;
 	}
@@ -1021,7 +1021,7 @@ static int post_recv_path(struct rtrs_srv_path *srv_path)
 
 		err = post_recv_io(to_srv_con(srv_path->s.con[cid]), q_size);
 		if (err) {
-			rtrs_err(s, "post_recv_io(), err: %d\n", err);
+			rtrs_err(s, "post_recv_io(), err: %d(%pe)\n", err, ERR_PTR(err));
 			return err;
 		}
 	}
@@ -1069,8 +1069,8 @@ static void process_read(struct rtrs_srv_con *con,
 
 	if (ret) {
 		rtrs_err_rl(s,
-			     "Processing read request failed, user module cb reported for msg_id %d, err: %d\n",
-			     buf_id, ret);
+			     "Processing read request failed, user module cb reported for msg_id %d, err: %d(%pe)\n",
+			     buf_id, ret, ERR_PTR(ret));
 		goto send_err_msg;
 	}
 
@@ -1080,8 +1080,8 @@ static void process_read(struct rtrs_srv_con *con,
 	ret = send_io_resp_imm(con, id, ret);
 	if (ret < 0) {
 		rtrs_err_rl(s,
-			     "Sending err msg for failed RDMA-Write-Req failed, msg_id %d, err: %d\n",
-			     buf_id, ret);
+			     "Sending err msg for failed RDMA-Write-Req failed, msg_id %d, err: %d(%pe)\n",
+			     buf_id, ret, ERR_PTR(ret));
 		close_path(srv_path);
 	}
 	rtrs_srv_put_ops_ids(srv_path);
@@ -1121,8 +1121,8 @@ static void process_write(struct rtrs_srv_con *con,
 			       data + data_len, usr_len);
 	if (ret) {
 		rtrs_err_rl(s,
-			     "Processing write request failed, user module callback reports err: %d\n",
-			     ret);
+			     "Processing write request failed, user module callback reports err: %d(%pe)\n",
+			     ret, ERR_PTR(ret));
 		goto send_err_msg;
 	}
 
@@ -1132,8 +1132,8 @@ static void process_write(struct rtrs_srv_con *con,
 	ret = send_io_resp_imm(con, id, ret);
 	if (ret < 0) {
 		rtrs_err_rl(s,
-			     "Processing write request failed, sending I/O response failed, msg_id %d, err: %d\n",
-			     buf_id, ret);
+			     "Processing write request failed, sending I/O response failed, msg_id %d, err: %d(%pe)\n",
+			     buf_id, ret, ERR_PTR(ret));
 		close_path(srv_path);
 	}
 	rtrs_srv_put_ops_ids(srv_path);
@@ -1263,7 +1263,8 @@ static void rtrs_srv_rdma_done(struct ib_cq *cq, struct ib_wc *wc)
 		srv_path->s.hb_missed_cnt = 0;
 		err = rtrs_post_recv_empty(&con->c, &io_comp_cqe);
 		if (err) {
-			rtrs_err(s, "rtrs_post_recv(), err: %d\n", err);
+			rtrs_err(s, "rtrs_post_recv(), err: %d(%pe)\n",
+				 err, ERR_PTR(err));
 			close_path(srv_path);
 			break;
 		}
@@ -1288,8 +1289,8 @@ static void rtrs_srv_rdma_done(struct ib_cq *cq, struct ib_wc *wc)
 				mr->msg_id = msg_id;
 				err = rtrs_srv_inv_rkey(con, mr);
 				if (err) {
-					rtrs_err(s, "rtrs_post_recv(), err: %d\n",
-						  err);
+					rtrs_err(s, "rtrs_post_recv(), err: %d(%pe)\n",
+						  err, ERR_PTR(err));
 					close_path(srv_path);
 					break;
 				}
@@ -1638,7 +1639,7 @@ static int rtrs_rdma_do_accept(struct rtrs_srv_path *srv_path,
 
 	err = rdma_accept(cm_id, &param);
 	if (err)
-		pr_err("rdma_accept(), err: %d\n", err);
+		pr_err("rdma_accept(), err: %d(%pe)\n", err, ERR_PTR(err));
 
 	return err;
 }
@@ -1656,7 +1657,7 @@ static int rtrs_rdma_do_reject(struct rdma_cm_id *cm_id, int errno)
 
 	err = rdma_reject(cm_id, &msg, sizeof(msg), IB_CM_REJ_CONSUMER_DEFINED);
 	if (err)
-		pr_err("rdma_reject(), err: %d\n", err);
+		pr_err("rdma_reject(), err: %d(%pe)\n", err, ERR_PTR(err));
 
 	/* Bounce errno back */
 	return errno;
@@ -1732,7 +1733,7 @@ static int create_con(struct rtrs_srv_path *srv_path,
 				 max_send_wr, max_recv_wr,
 				 IB_POLL_WORKQUEUE);
 	if (err) {
-		rtrs_err(s, "rtrs_cq_qp_create(), err: %d\n", err);
+		rtrs_err(s, "rtrs_cq_qp_create(), err: %d(%pe)\n", err, ERR_PTR(err));
 		goto free_con;
 	}
 	if (con->c.cid == 0) {
@@ -1947,7 +1948,7 @@ static int rtrs_rdma_connect(struct rdma_cm_id *cm_id,
 	}
 	err = create_con(srv_path, cm_id, cid);
 	if (err) {
-		rtrs_err((&srv_path->s), "create_con(), error %d\n", err);
+		rtrs_err((&srv_path->s), "create_con(), error %d(%pe)\n", err, ERR_PTR(err));
 		rtrs_rdma_do_reject(cm_id, err);
 		/*
 		 * Since session has other connections we follow normal way
@@ -1958,7 +1959,8 @@ static int rtrs_rdma_connect(struct rdma_cm_id *cm_id,
 	}
 	err = rtrs_rdma_do_accept(srv_path, cm_id);
 	if (err) {
-		rtrs_err((&srv_path->s), "rtrs_rdma_do_accept(), error %d\n", err);
+		rtrs_err((&srv_path->s), "rtrs_rdma_do_accept(), error %d(%pe)\n",
+				err, ERR_PTR(err));
 		rtrs_rdma_do_reject(cm_id, err);
 		/*
 		 * Since current connection was successfully added to the
@@ -2009,8 +2011,8 @@ static int rtrs_srv_rdma_cm_handler(struct rdma_cm_id *cm_id,
 	case RDMA_CM_EVENT_REJECTED:
 	case RDMA_CM_EVENT_CONNECT_ERROR:
 	case RDMA_CM_EVENT_UNREACHABLE:
-		rtrs_err(s, "CM error (CM event: %s, err: %d)\n",
-			  rdma_event_msg(ev->event), ev->status);
+		rtrs_err(s, "CM error (CM event: %s, err: %d(%pe))\n",
+			  rdma_event_msg(ev->event), ev->status, ERR_PTR(ev->status));
 		fallthrough;
 	case RDMA_CM_EVENT_DISCONNECTED:
 	case RDMA_CM_EVENT_ADDR_CHANGE:
@@ -2019,8 +2021,8 @@ static int rtrs_srv_rdma_cm_handler(struct rdma_cm_id *cm_id,
 		close_path(srv_path);
 		break;
 	default:
-		pr_err("Ignoring unexpected CM event %s, err %d\n",
-		       rdma_event_msg(ev->event), ev->status);
+		pr_err("Ignoring unexpected CM event %s, err %d(%pe)\n",
+		       rdma_event_msg(ev->event), ev->status, ERR_PTR(ev->status));
 		break;
 	}
 
@@ -2044,13 +2046,13 @@ static struct rdma_cm_id *rtrs_srv_cm_init(struct rtrs_srv_ctx *ctx,
 	}
 	ret = rdma_bind_addr(cm_id, addr);
 	if (ret) {
-		pr_err("Binding RDMA address failed, err: %d\n", ret);
+		pr_err("Binding RDMA address failed, err: %d(%pe)\n", ret, ERR_PTR(ret));
 		goto err_cm;
 	}
 	ret = rdma_listen(cm_id, 64);
 	if (ret) {
-		pr_err("Listening on RDMA connection failed, err: %d\n",
-		       ret);
+		pr_err("Listening on RDMA connection failed, err: %d(%pe)\n",
+		       ret, ERR_PTR(ret));
 		goto err_cm;
 	}
 
@@ -2328,8 +2330,8 @@ static int __init rtrs_server_init(void)
 
 	err = check_module_params();
 	if (err) {
-		pr_err("Failed to load module, invalid module parameters, err: %d\n",
-		       err);
+		pr_err("Failed to load module, invalid module parameters, err: %d(%pe)\n",
+		       err, ERR_PTR(err));
 		return err;
 	}
 	err = class_register(&rtrs_dev_class);
diff --git a/drivers/infiniband/ulp/rtrs/rtrs.c b/drivers/infiniband/ulp/rtrs/rtrs.c
index bf38ac6f87c4..ea91371f6ad7 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs.c
@@ -273,7 +273,8 @@ static int create_qp(struct rtrs_con *con, struct ib_pd *pd,
 
 	ret = rdma_create_qp(cm_id, pd, &init_attr);
 	if (ret) {
-		rtrs_err(con->path, "Creating QP failed, err: %d\n", ret);
+		rtrs_err(con->path, "Creating QP failed, err: %d(%pe)\n", ret,
+			 ERR_PTR(ret));
 		return ret;
 	}
 	con->qp = cm_id->qp;
@@ -341,7 +342,8 @@ void rtrs_send_hb_ack(struct rtrs_path *path)
 	err = rtrs_post_rdma_write_imm_empty(usr_con, path->hb_cqe, imm,
 					     NULL);
 	if (err) {
-		rtrs_err(path, "send HB ACK failed, errno: %d\n", err);
+		rtrs_err(path, "send HB ACK failed, errno: %d(%pe)\n", err,
+			 ERR_PTR(err));
 		path->hb_err_handler(usr_con);
 		return;
 	}
@@ -375,7 +377,8 @@ static void hb_work(struct work_struct *work)
 	err = rtrs_post_rdma_write_imm_empty(usr_con, path->hb_cqe, imm,
 					     NULL);
 	if (err) {
-		rtrs_err(path, "HB send failed, errno: %d\n", err);
+		rtrs_err(path, "HB send failed, errno: %d(%pe)\n", err,
+			 ERR_PTR(err));
 		path->hb_err_handler(usr_con);
 		return;
 	}
-- 
2.43.0



* [PATCH 3/9] RDMA/rtrs: Add optional support for IB_MR_TYPE_SG_GAPS
  2025-12-08 16:15 [PATCH 0/9] Misc patches for RTRS Md Haris Iqbal
  2025-12-08 16:15 ` [PATCH 1/9] RDMA/rtrs-srv: fix SG mapping Md Haris Iqbal
  2025-12-08 16:15 ` [PATCH 2/9] RDMA/rtrs: Add error description to the logs Md Haris Iqbal
@ 2025-12-08 16:15 ` Md Haris Iqbal
  2025-12-09  1:12   ` Honggang LI
  2025-12-08 16:15 ` [PATCH 4/9] RDMA/rtrs: Improve error logging for RDMA cm events Md Haris Iqbal
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Md Haris Iqbal @ 2025-12-08 16:15 UTC
  To: linux-rdma
  Cc: bvanassche, leon, jgg, haris.iqbal, jinpu.wang, grzegorz.prajsner,
	Kim Zhu

Support IB_MR_TYPE_SG_GAPS, which has fewer limitations than the
standard IB_MR_TYPE_MEM_REG. A few ULPs already support it.

Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Kim Zhu <zhu.yanjun@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-clt.c | 10 ++++++++--
 drivers/infiniband/ulp/rtrs/rtrs-srv.c | 13 ++++++++++---
 2 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index 808de144d2e4..ee0682021234 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -1360,7 +1360,9 @@ static void free_path_reqs(struct rtrs_clt_path *clt_path)
 
 static int alloc_path_reqs(struct rtrs_clt_path *clt_path)
 {
+	struct ib_device *ib_dev = clt_path->s.dev->ib_dev;
 	struct rtrs_clt_io_req *req;
+	enum ib_mr_type mr_type;
 	int i, err = -ENOMEM;
 
 	clt_path->reqs = kcalloc(clt_path->queue_depth,
@@ -1369,6 +1371,11 @@ static int alloc_path_reqs(struct rtrs_clt_path *clt_path)
 	if (!clt_path->reqs)
 		return -ENOMEM;
 
+	if (ib_dev->attrs.kernel_cap_flags & IBK_SG_GAPS_REG)
+		mr_type = IB_MR_TYPE_SG_GAPS;
+	else
+		mr_type = IB_MR_TYPE_MEM_REG;
+
 	for (i = 0; i < clt_path->queue_depth; ++i) {
 		req = &clt_path->reqs[i];
 		req->iu = rtrs_iu_alloc(1, clt_path->max_hdr_size, GFP_KERNEL,
@@ -1382,8 +1389,7 @@ static int alloc_path_reqs(struct rtrs_clt_path *clt_path)
 		if (!req->sge)
 			goto out;
 
-		req->mr = ib_alloc_mr(clt_path->s.dev->ib_pd,
-				      IB_MR_TYPE_MEM_REG,
+		req->mr = ib_alloc_mr(clt_path->s.dev->ib_pd, mr_type,
 				      clt_path->max_pages_per_mr);
 		if (IS_ERR(req->mr)) {
 			err = PTR_ERR(req->mr);
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index 758d77206315..905d5baec89b 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -568,13 +568,15 @@ static void unmap_cont_bufs(struct rtrs_srv_path *srv_path)
 
 static int map_cont_bufs(struct rtrs_srv_path *srv_path)
 {
+	struct ib_device *ib_dev = srv_path->s.dev->ib_dev;
 	struct rtrs_srv_sess *srv = srv_path->srv;
 	struct rtrs_path *ss = &srv_path->s;
 	int i, err, mrs_num;
 	unsigned int chunk_bits;
+	enum ib_mr_type mr_type;
 	int chunks_per_mr = 1;
-	struct ib_mr *mr;
 	struct sg_table *sgt;
+	struct ib_mr *mr;
 
 	/*
 	 * Here we map queue_depth chunks to MR.  Firstly we have to
@@ -623,8 +625,13 @@ static int map_cont_bufs(struct rtrs_srv_path *srv_path)
 			err = -EINVAL;
 			goto free_sg;
 		}
-		mr = ib_alloc_mr(srv_path->s.dev->ib_pd, IB_MR_TYPE_MEM_REG,
-				 nr_sgt);
+
+		if (ib_dev->attrs.kernel_cap_flags & IBK_SG_GAPS_REG)
+			mr_type = IB_MR_TYPE_SG_GAPS;
+		else
+			mr_type = IB_MR_TYPE_MEM_REG;
+
+		mr = ib_alloc_mr(srv_path->s.dev->ib_pd, mr_type, nr_sgt);
 		if (IS_ERR(mr)) {
 			err = PTR_ERR(mr);
 			goto unmap_sg;
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 4/9] RDMA/rtrs: Improve error logging for RDMA cm events
  2025-12-08 16:15 [PATCH 0/9] Misc patches for RTRS Md Haris Iqbal
                   ` (2 preceding siblings ...)
  2025-12-08 16:15 ` [PATCH 3/9] RDMA/rtrs: Add optional support for IB_MR_TYPE_SG_GAPS Md Haris Iqbal
@ 2025-12-08 16:15 ` Md Haris Iqbal
  2025-12-08 16:15 ` [PATCH 5/9] RDMA/rtrs-clt: Remove unused list-head in rtrs_clt_io_req Md Haris Iqbal
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 20+ messages in thread
From: Md Haris Iqbal @ 2025-12-08 16:15 UTC (permalink / raw)
  To: linux-rdma
  Cc: bvanassche, leon, jgg, haris.iqbal, jinpu.wang, grzegorz.prajsner,
	Kim Zhu

From: Kim Zhu <zhu.yanjun@ionos.com>

The status member of struct rdma_cm_event is used for both Linux
errors and the errors defined in the RDMA stack. Log negative values
as Linux errors with %pe and positive values with rdma_reject_msg().
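
The sign-based dispatch can be sketched in userspace C as follows. The
helper name format_cm_status() and its reason argument are assumptions
made for the example; the reason string stands in for whatever
rdma_reject_msg() would return, and strerror() stands in for %pe.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a CM status the way the patch distinguishes the cases:
 * negative -> errno description, positive -> caller-supplied reason,
 * zero -> nothing is written.  Returns snprintf()'s result, or 0.
 */
static int format_cm_status(int status, const char *reason,
			    char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	buf[0] = '\0';

	if (status < 0)
		return snprintf(buf, len, "err: %d(%s)", status,
				strerror(-status));
	if (status > 0)
		return snprintf(buf, len, "err: %d(%s)", status,
				reason ? reason : "unknown");
	return 0;
}
```

Note that a status of 0 produces no output at all in this sketch, which
mirrors the if/else-if structure used in the handlers.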

Signed-off-by: Kim Zhu <zhu.yanjun@ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-clt.c | 46 ++++++++++++++++++++------
 drivers/infiniband/ulp/rtrs/rtrs-srv.c | 22 +++++++++---
 2 files changed, 54 insertions(+), 14 deletions(-)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index ee0682021234..49249cc24152 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -1947,8 +1947,8 @@ static int rtrs_rdma_conn_rejected(struct rtrs_clt_con *con,
 				  status, rej_msg, errno, ERR_PTR(errno));
 	} else {
 		rtrs_err(s,
-			  "Connect rejected but with malformed message: status %d(%pe) (%s)\n",
-			  status, ERR_PTR(status), rej_msg);
+			  "Connect rejected but with malformed message: status %d (%s)\n",
+			  status, rej_msg);
 	}
 
 	return -ECONNRESET;
@@ -2015,27 +2015,53 @@ static int rtrs_clt_rdma_cm_handler(struct rdma_cm_id *cm_id,
 	case RDMA_CM_EVENT_UNREACHABLE:
 	case RDMA_CM_EVENT_ADDR_CHANGE:
 	case RDMA_CM_EVENT_TIMEWAIT_EXIT:
-		rtrs_wrn(s, "CM error (CM event: %s, err: %d(%pe))\n",
-			 rdma_event_msg(ev->event), ev->status, ERR_PTR(ev->status));
+		if (ev->status < 0) {
+			rtrs_wrn(s, "CM error (CM event: %s, err: %d(%pe))\n",
+				rdma_event_msg(ev->event), ev->status, ERR_PTR(ev->status));
+		} else if (ev->status > 0) {
+			rtrs_wrn(s, "CM error (CM event: %s, err: %d(%s))\n",
+				rdma_event_msg(ev->event), ev->status,
+				rdma_reject_msg(cm_id, ev->status));
+		}
 		cm_err = -ECONNRESET;
 		break;
 	case RDMA_CM_EVENT_ADDR_ERROR:
 	case RDMA_CM_EVENT_ROUTE_ERROR:
-		rtrs_wrn(s, "CM error (CM event: %s, err: %d(%pe))\n",
-			 rdma_event_msg(ev->event), ev->status, ERR_PTR(ev->status));
+		if (ev->status < 0) {
+			rtrs_wrn(s, "CM error (CM event: %s, err: %d(%pe))\n",
+				rdma_event_msg(ev->event), ev->status,
+				ERR_PTR(ev->status));
+		} else if (ev->status > 0) {
+			rtrs_wrn(s, "CM error (CM event: %s, err: %d(%s))\n",
+				rdma_event_msg(ev->event), ev->status,
+				rdma_reject_msg(cm_id, ev->status));
+		}
 		cm_err = -EHOSTUNREACH;
 		break;
 	case RDMA_CM_EVENT_DEVICE_REMOVAL:
 		/*
 		 * Device removal is a special case.  Queue close and return 0.
 		 */
-		rtrs_wrn_rl(s, "CM event: %s, status: %d(%pe)\n", rdma_event_msg(ev->event),
-			    ev->status, ERR_PTR(ev->status));
+		if (ev->status < 0) {
+			rtrs_wrn_rl(s, "CM event: %s, status: %d(%pe)\n",
+					rdma_event_msg(ev->event),
+					ev->status, ERR_PTR(ev->status));
+		} else if (ev->status > 0) {
+			rtrs_wrn_rl(s, "CM event: %s, status: %d(%s)\n",
+					rdma_event_msg(ev->event),
+					ev->status, rdma_reject_msg(cm_id, ev->status));
+		}
 		rtrs_clt_close_conns(clt_path, false);
 		return 0;
 	default:
-		rtrs_err(s, "Unexpected RDMA CM error (CM event: %s, err: %d(%pe))\n",
-			 rdma_event_msg(ev->event), ev->status, ERR_PTR(ev->status));
+		if (ev->status < 0) {
+			rtrs_err(s, "Unexpected RDMA CM error (CM event: %s, err: %d(%pe))\n",
+				rdma_event_msg(ev->event), ev->status, ERR_PTR(ev->status));
+		} else if (ev->status > 0) {
+			rtrs_err(s, "Unexpected RDMA CM error (CM event: %s, err: %d(%s))\n",
+				rdma_event_msg(ev->event), ev->status,
+				rdma_reject_msg(cm_id, ev->status));
+		}
 		cm_err = -ECONNRESET;
 		break;
 	}
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index 905d5baec89b..4e203140c990 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -2018,8 +2018,15 @@ static int rtrs_srv_rdma_cm_handler(struct rdma_cm_id *cm_id,
 	case RDMA_CM_EVENT_REJECTED:
 	case RDMA_CM_EVENT_CONNECT_ERROR:
 	case RDMA_CM_EVENT_UNREACHABLE:
-		rtrs_err(s, "CM error (CM event: %s, err: %d(%pe))\n",
-			  rdma_event_msg(ev->event), ev->status, ERR_PTR(ev->status));
+		if (ev->status < 0) {
+			rtrs_err(s, "CM error (CM event: %s, err: %d(%pe))\n",
+					rdma_event_msg(ev->event), ev->status,
+					ERR_PTR(ev->status));
+		} else if (ev->status > 0) {
+			rtrs_err(s, "CM error (CM event: %s, err: %d(%s))\n",
+					rdma_event_msg(ev->event), ev->status,
+					rdma_reject_msg(cm_id, ev->status));
+		}
 		fallthrough;
 	case RDMA_CM_EVENT_DISCONNECTED:
 	case RDMA_CM_EVENT_ADDR_CHANGE:
@@ -2028,8 +2035,15 @@ static int rtrs_srv_rdma_cm_handler(struct rdma_cm_id *cm_id,
 		close_path(srv_path);
 		break;
 	default:
-		pr_err("Ignoring unexpected CM event %s, err %d(%pe)\n",
-		       rdma_event_msg(ev->event), ev->status, ERR_PTR(ev->status));
+		if (ev->status < 0) {
+			pr_err("Ignoring unexpected CM event %s, err %d(%pe)\n",
+					rdma_event_msg(ev->event), ev->status,
+					ERR_PTR(ev->status));
+		} else if (ev->status > 0) {
+			pr_err("Ignoring unexpected CM event %s, err %d(%s)\n",
+					rdma_event_msg(ev->event), ev->status,
+					rdma_reject_msg(cm_id, ev->status));
+		}
 		break;
 	}
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 5/9] RDMA/rtrs-clt: Remove unused list-head in rtrs_clt_io_req
  2025-12-08 16:15 [PATCH 0/9] Misc patches for RTRS Md Haris Iqbal
                   ` (3 preceding siblings ...)
  2025-12-08 16:15 ` [PATCH 4/9] RDMA/rtrs: Improve error logging for RDMA cm events Md Haris Iqbal
@ 2025-12-08 16:15 ` Md Haris Iqbal
  2025-12-09  1:14   ` Honggang LI
  2025-12-08 16:15 ` [PATCH 6/9] RDMA/rtrs-srv: Add check and closure for possible zombie paths Md Haris Iqbal
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Md Haris Iqbal @ 2025-12-08 16:15 UTC (permalink / raw)
  To: linux-rdma
  Cc: bvanassche, leon, jgg, haris.iqbal, jinpu.wang, grzegorz.prajsner

From: Jack Wang <jinpu.wang@ionos.com>

Remove unused member.

Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-clt.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.h b/drivers/infiniband/ulp/rtrs/rtrs-clt.h
index 0f57759b3080..3633119d1db2 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.h
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.h
@@ -92,7 +92,6 @@ struct rtrs_permit {
  * rtrs_clt_io_req - describes one inflight IO request
  */
 struct rtrs_clt_io_req {
-	struct list_head        list;
 	struct rtrs_iu		*iu;
 	struct scatterlist	*sglist; /* list holding user data */
 	unsigned int		sg_cnt;
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 6/9] RDMA/rtrs-srv: Add check and closure for possible zombie paths
  2025-12-08 16:15 [PATCH 0/9] Misc patches for RTRS Md Haris Iqbal
                   ` (4 preceding siblings ...)
  2025-12-08 16:15 ` [PATCH 5/9] RDMA/rtrs-clt: Remove unused list-head in rtrs_clt_io_req Md Haris Iqbal
@ 2025-12-08 16:15 ` Md Haris Iqbal
  2025-12-09  1:17   ` Honggang LI
  2025-12-08 16:15 ` [PATCH 7/9] RDMA/rtrs-srv: Rate-limit I/O path error logging Md Haris Iqbal
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Md Haris Iqbal @ 2025-12-08 16:15 UTC (permalink / raw)
  To: linux-rdma
  Cc: bvanassche, leon, jgg, haris.iqbal, jinpu.wang, grzegorz.prajsner

During several network incidents, a number of RTRS paths for a session
went through a disconnect and reconnect phase. However, some of those
did not auto-reconnect successfully. Instead they failed with the
following logs.

On client,
kernel: rtrs_client L1991: <sess-name>: Connect rejected: status 28
  (consumer defined), rtrs errno -104
kernel: rtrs_client L2698: <sess-name>: init_conns() failed: err=-104
  path=gid:<gid1>@gid:<gid2> [mlx4_0:1]

On server, (log a)
kernel: ibtrs_server L1868: <>: Connection already exists: 0

When the misbehaving path was removed, and add_path was called to re-add
the path, the log on client side changed to, (log b)
kernel: rtrs_client L1991: <sess-name>: Connect rejected: status 28
  (consumer defined), rtrs errno -17

There was no log on the server side for this, which is expected since
there is no logging in that path,
if (unlikely(__is_path_w_addr_exists(srv, &cm_id->route.addr))) {
	err = -EEXIST;
	goto err;

Because of the following check on server side,
if (unlikely(sess->state != IBTRS_SRV_CONNECTING)) {
	ibtrs_err(s, "Session in wrong state: %s\n",

.. we know that the path in (log a) was in CONNECTING state.

The above state of the path persists for as long as we leave the session
be. This means that the path is in some zombie state, probably waiting
for the info_req packet to arrive, which never does.

The changes in this commit do two things.

1) Add logs at places where we see the errors happening. The logs would
shed more light on the state and lifetime of such zombie paths.

2) Close such zombie sessions, only if they are in CONNECTING state, and
after an inactivity period of 30 seconds.
  i) The state check prevents closure of paths which are CONNECTED.
Also, from the above logs and code, we already know that the path could
only be in CONNECTING state, so we play safe and narrow our impact
surface area by closing only CONNECTING paths.
  ii) The inactivity period is to allow requests for other cid to finish
processing, or for any stray packets to arrive/fail.

Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-srv.c | 46 +++++++++++++++++++++++---
 drivers/infiniband/ulp/rtrs/rtrs-srv.h |  1 +
 2 files changed, 42 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index 4e203140c990..20e7d2681668 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -911,6 +911,13 @@ static int process_info_req(struct rtrs_srv_con *con,
 				      tx_iu->dma_addr,
 				      tx_iu->size, DMA_TO_DEVICE);
 
+	/*
+	 * Now disable zombie connection closing. Since from the logs and code,
+	 * we know that it can never be in CONNECTED state.
+	 * See RNBD-3128 comments.
+	 */
+	srv_path->connection_timeout = 0;
+
 	/* Send info response */
 	err = rtrs_iu_post_send(&con->c, tx_iu, tx_sz, reg_wr);
 	if (err) {
@@ -1537,17 +1544,38 @@ static int sockaddr_cmp(const struct sockaddr *a, const struct sockaddr *b)
 	}
 }
 
+/* Let's close connections which have been waiting for more than 30 seconds */
+#define RTRS_MAX_CONN_TIMEOUT 30000
+
+static void rtrs_srv_check_close_path(struct rtrs_srv_path *srv_path)
+{
+	struct rtrs_path *s = &srv_path->s;
+
+	if (srv_path->state == RTRS_SRV_CONNECTING && srv_path->connection_timeout &&
+	   (jiffies_to_msecs(jiffies - srv_path->connection_timeout) > RTRS_MAX_CONN_TIMEOUT)) {
+		rtrs_err(s, "Closing zombie path\n");
+		close_path(srv_path);
+	}
+}
+
 static bool __is_path_w_addr_exists(struct rtrs_srv_sess *srv,
 				    struct rdma_addr *addr)
 {
 	struct rtrs_srv_path *srv_path;
 
-	list_for_each_entry(srv_path, &srv->paths_list, s.entry)
+	list_for_each_entry(srv_path, &srv->paths_list, s.entry) {
 		if (!sockaddr_cmp((struct sockaddr *)&srv_path->s.dst_addr,
 				  (struct sockaddr *)&addr->dst_addr) &&
 		    !sockaddr_cmp((struct sockaddr *)&srv_path->s.src_addr,
-				  (struct sockaddr *)&addr->src_addr))
+				  (struct sockaddr *)&addr->src_addr)) {
+			rtrs_err((&srv_path->s),
+				 "Path (%s) with same addr exists (lifetime %u)\n",
+				 rtrs_srv_state_str(srv_path->state),
+				 (jiffies_to_msecs(jiffies - srv_path->connection_timeout)));
+			rtrs_srv_check_close_path(srv_path);
 			return true;
+		}
+	}
 
 	return false;
 }
@@ -1785,7 +1813,6 @@ static struct rtrs_srv_path *__alloc_path(struct rtrs_srv_sess *srv,
 	}
 	if (__is_path_w_addr_exists(srv, &cm_id->route.addr)) {
 		err = -EEXIST;
-		pr_err("Path with same addr exists\n");
 		goto err;
 	}
 	srv_path = kzalloc(sizeof(*srv_path), GFP_KERNEL);
@@ -1832,6 +1859,7 @@ static struct rtrs_srv_path *__alloc_path(struct rtrs_srv_sess *srv,
 	spin_lock_init(&srv_path->state_lock);
 	INIT_WORK(&srv_path->close_work, rtrs_srv_close_work);
 	rtrs_srv_init_hb(srv_path);
+	srv_path->connection_timeout = 0;
 
 	srv_path->s.dev = rtrs_ib_dev_find_or_add(cm_id->device, &dev_pd);
 	if (!srv_path->s.dev) {
@@ -1937,8 +1965,10 @@ static int rtrs_rdma_connect(struct rdma_cm_id *cm_id,
 			goto reject_w_err;
 		}
 		if (s->con[cid]) {
-			rtrs_err(s, "Connection already exists: %d\n",
-				  cid);
+			rtrs_err(s, "Connection (%s) already exists: %d (lifetime %u)\n",
+				 rtrs_srv_state_str(srv_path->state), cid,
+				 (jiffies_to_msecs(jiffies - srv_path->connection_timeout)));
+			rtrs_srv_check_close_path(srv_path);
 			mutex_unlock(&srv->paths_mutex);
 			goto reject_w_err;
 		}
@@ -1953,6 +1983,12 @@ static int rtrs_rdma_connect(struct rdma_cm_id *cm_id,
 			goto reject_w_err;
 		}
 	}
+
+	/*
+	 * Start of any connection creation resets the timeout for the path.
+	 */
+	srv_path->connection_timeout = jiffies;
+
 	err = create_con(srv_path, cm_id, cid);
 	if (err) {
 		rtrs_err((&srv_path->s), "create_con(), error %d(%pe)\n", err, ERR_PTR(err));
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.h b/drivers/infiniband/ulp/rtrs/rtrs-srv.h
index 014f85681f37..3d36876527f5 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.h
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.h
@@ -89,6 +89,7 @@ struct rtrs_srv_path {
 	unsigned int		mem_bits;
 	struct kobject		kobj;
 	struct rtrs_srv_stats	*stats;
+	unsigned long		connection_timeout;
 };
 
 static inline struct rtrs_srv_path *to_srv_path(struct rtrs_path *s)
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 7/9] RDMA/rtrs-srv: Rate-limit I/O path error logging
  2025-12-08 16:15 [PATCH 0/9] Misc patches for RTRS Md Haris Iqbal
                   ` (5 preceding siblings ...)
  2025-12-08 16:15 ` [PATCH 6/9] RDMA/rtrs-srv: Add check and closure for possible zombie paths Md Haris Iqbal
@ 2025-12-08 16:15 ` Md Haris Iqbal
  2025-12-08 16:15 ` [PATCH 8/9] RDMA/rtrs: Extend log message when a port fails Md Haris Iqbal
  2025-12-08 16:15 ` [PATCH 9/9] RDMA/rtrs-clt.c: For conn rejection use actual err number Md Haris Iqbal
  8 siblings, 0 replies; 20+ messages in thread
From: Md Haris Iqbal @ 2025-12-08 16:15 UTC (permalink / raw)
  To: linux-rdma
  Cc: bvanassche, leon, jgg, haris.iqbal, jinpu.wang, grzegorz.prajsner,
	Kim Zhu

From: Kim Zhu <zhu.yanjun@ionos.com>

Excessive error logging is making it difficult to identify the root
cause of issues. Implement rate limiting to improve log clarity.

Signed-off-by: Kim Zhu <zhu.yanjun@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-srv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index 20e7d2681668..dfe38ffc2e38 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -184,7 +184,7 @@ static void rtrs_srv_reg_mr_done(struct ib_cq *cq, struct ib_wc *wc)
 	struct rtrs_srv_path *srv_path = to_srv_path(s);
 
 	if (wc->status != IB_WC_SUCCESS) {
-		rtrs_err(s, "REG MR failed: %s\n",
+		rtrs_err_rl(s, "REG MR failed: %s\n",
 			  ib_wc_status_msg(wc->status));
 		close_path(srv_path);
 		return;
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 8/9] RDMA/rtrs: Extend log message when a port fails
  2025-12-08 16:15 [PATCH 0/9] Misc patches for RTRS Md Haris Iqbal
                   ` (6 preceding siblings ...)
  2025-12-08 16:15 ` [PATCH 7/9] RDMA/rtrs-srv: Rate-limit I/O path error logging Md Haris Iqbal
@ 2025-12-08 16:15 ` Md Haris Iqbal
  2025-12-08 16:15 ` [PATCH 9/9] RDMA/rtrs-clt.c: For conn rejection use actual err number Md Haris Iqbal
  8 siblings, 0 replies; 20+ messages in thread
From: Md Haris Iqbal @ 2025-12-08 16:15 UTC (permalink / raw)
  To: linux-rdma
  Cc: bvanassche, leon, jgg, haris.iqbal, jinpu.wang, grzegorz.prajsner,
	Kim Zhu

From: Kim Zhu <zhu.yanjun@ionos.com>

Add HCA name and port of this HCA.
This would help with analysing and debugging the logs.

The logs would look something like this:

rtrs_server L2516: Handling event: port error (10).
		   HCA name: mlx4_0, port num: 2
rtrs_client L3326: Handling event: port error (10).
		   HCA name: mlx4_0, port num: 1

Signed-off-by: Kim Zhu <zhu.yanjun@ionos.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-clt.c | 7 +++++--
 drivers/infiniband/ulp/rtrs/rtrs-srv.c | 7 +++++--
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index 49249cc24152..dcf5704366eb 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -3179,8 +3179,11 @@ int rtrs_clt_create_path_from_sysfs(struct rtrs_clt_sess *clt,
 void rtrs_clt_ib_event_handler(struct ib_event_handler *handler,
 			       struct ib_event *ibevent)
 {
-	pr_info("Handling event: %s (%d).\n", ib_event_msg(ibevent->event),
-		ibevent->event);
+	struct ib_device *idev = ibevent->device;
+	u32 port_num = ibevent->element.port_num;
+
+	pr_info("Handling event: %s (%d). HCA name: %s, port num: %u\n",
+			ib_event_msg(ibevent->event), ibevent->event, idev->name, port_num);
 }
 
 
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index dfe38ffc2e38..301edaadfb1a 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -2349,8 +2349,11 @@ static int check_module_params(void)
 void rtrs_srv_ib_event_handler(struct ib_event_handler *handler,
 			       struct ib_event *ibevent)
 {
-	pr_info("Handling event: %s (%d).\n", ib_event_msg(ibevent->event),
-		ibevent->event);
+	struct ib_device *idev = ibevent->device;
+	u32 port_num = ibevent->element.port_num;
+
+	pr_info("Handling event: %s (%d). HCA name: %s, port num: %u\n",
+			ib_event_msg(ibevent->event), ibevent->event, idev->name, port_num);
 }
 
 static int rtrs_srv_ib_dev_init(struct rtrs_ib_dev *dev)
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 9/9] RDMA/rtrs-clt.c: For conn rejection use actual err number
  2025-12-08 16:15 [PATCH 0/9] Misc patches for RTRS Md Haris Iqbal
                   ` (7 preceding siblings ...)
  2025-12-08 16:15 ` [PATCH 8/9] RDMA/rtrs: Extend log message when a port fails Md Haris Iqbal
@ 2025-12-08 16:15 ` Md Haris Iqbal
  8 siblings, 0 replies; 20+ messages in thread
From: Md Haris Iqbal @ 2025-12-08 16:15 UTC (permalink / raw)
  To: linux-rdma
  Cc: bvanassche, leon, jgg, haris.iqbal, jinpu.wang, grzegorz.prajsner

When the connection establishment request is rejected from the server
side, then the actual error number sent back should be used.
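
The intended return-value behaviour can be sketched as below. The
struct and function names are hypothetical; the sketch assumes that a
well-formed reject message carries the server's error number and that
a malformed or missing message falls back to -ECONNRESET.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified view of a decoded reject message. */
struct reject_msg {
	int well_formed;	/* non-zero if the payload parsed correctly */
	int err;		/* error number reported by the server */
};

/*
 * Return the server-reported error for a well-formed rejection,
 * otherwise the generic -ECONNRESET default.
 */
static int conn_rejected_errno(const struct reject_msg *msg)
{
	int err = -ECONNRESET;

	if (msg && msg->well_formed)
		err = msg->err;

	return err;
}
```

With this shape, a caller can tell a rejection caused by an existing
path (-EEXIST) apart from a plain connection reset.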

Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Reviewed-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-clt.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index dcf5704366eb..3e62da5eaca7 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -1929,7 +1929,7 @@ static int rtrs_rdma_conn_rejected(struct rtrs_clt_con *con,
 	struct rtrs_path *s = con->c.path;
 	const struct rtrs_msg_conn_rsp *msg;
 	const char *rej_msg;
-	int status, errno;
+	int status, errno = -ECONNRESET;
 	u8 data_len;
 
 	status = ev->status;
@@ -1951,7 +1951,7 @@ static int rtrs_rdma_conn_rejected(struct rtrs_clt_con *con,
 			  status, rej_msg);
 	}
 
-	return -ECONNRESET;
+	return errno;
 }
 
 void rtrs_clt_close_conns(struct rtrs_clt_path *clt_path, bool wait)
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/9] RDMA/rtrs: Add optional support for IB_MR_TYPE_SG_GAPS
  2025-12-08 16:15 ` [PATCH 3/9] RDMA/rtrs: Add optional support for IB_MR_TYPE_SG_GAPS Md Haris Iqbal
@ 2025-12-09  1:12   ` Honggang LI
  2026-01-06  9:28     ` Haris Iqbal
  0 siblings, 1 reply; 20+ messages in thread
From: Honggang LI @ 2025-12-09  1:12 UTC (permalink / raw)
  To: Md Haris Iqbal
  Cc: linux-rdma, bvanassche, leon, jgg, jinpu.wang, grzegorz.prajsner,
	Kim Zhu

On Mon, Dec 08, 2025 at 05:15:07PM +0100, Md Haris Iqbal wrote:
> Subject: [PATCH 3/9] RDMA/rtrs: Add optional support for IB_MR_TYPE_SG_GAPS
> From: Md Haris Iqbal <haris.iqbal@ionos.com>
> Date: Mon,  8 Dec 2025 17:15:07 +0100
> X-Mailer: git-send-email 2.43.0
> 
> Support IB_MR_TYPE_SG_GAPS, which has fewer limitations
> than standard IB_MR_TYPE_MEM_REG. A few ULPs support this.

Do you have benchmark results showing the performance difference
between IB_MR_TYPE_MEM_REG and IB_MR_TYPE_SG_GAPS?

thanks


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 5/9] RDMA/rtrs-clt: Remove unused list-head in rtrs_clt_io_req
  2025-12-08 16:15 ` [PATCH 5/9] RDMA/rtrs-clt: Remove unused list-head in rtrs_clt_io_req Md Haris Iqbal
@ 2025-12-09  1:14   ` Honggang LI
  2026-01-06  9:26     ` Haris Iqbal
  0 siblings, 1 reply; 20+ messages in thread
From: Honggang LI @ 2025-12-09  1:14 UTC (permalink / raw)
  To: Md Haris Iqbal
  Cc: linux-rdma, bvanassche, leon, jgg, jinpu.wang, grzegorz.prajsner

On Mon, Dec 08, 2025 at 05:15:09PM +0100, Md Haris Iqbal wrote:
> Subject: [PATCH 5/9] RDMA/rtrs-clt: Remove unused list-head in
>  rtrs_clt_io_req
> From: Md Haris Iqbal <haris.iqbal@ionos.com>
> Date: Mon,  8 Dec 2025 17:15:09 +0100
> X-Mailer: git-send-email 2.43.0
> 
> From: Jack Wang <jinpu.wang@ionos.com>
> 
> Remove unused member.
> 
> Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
> Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
> ---
>  drivers/infiniband/ulp/rtrs/rtrs-clt.h | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.h b/drivers/infiniband/ulp/rtrs/rtrs-clt.h
> index 0f57759b3080..3633119d1db2 100644
> --- a/drivers/infiniband/ulp/rtrs/rtrs-clt.h
> +++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.h
> @@ -92,7 +92,6 @@ struct rtrs_permit {
>   * rtrs_clt_io_req - describes one inflight IO request
>   */
>  struct rtrs_clt_io_req {
> -	struct list_head        list;

It seems these two members are also unused. Why keep them?

struct rtrs_sg_desc        *desc;
unsigned long                start_jiffies;

thanks


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 6/9] RDMA/rtrs-srv: Add check and closure for possible zombie paths
  2025-12-08 16:15 ` [PATCH 6/9] RDMA/rtrs-srv: Add check and closure for possible zombie paths Md Haris Iqbal
@ 2025-12-09  1:17   ` Honggang LI
  2026-01-06  9:27     ` Haris Iqbal
  0 siblings, 1 reply; 20+ messages in thread
From: Honggang LI @ 2025-12-09  1:17 UTC (permalink / raw)
  To: Md Haris Iqbal
  Cc: linux-rdma, bvanassche, leon, jgg, jinpu.wang, grzegorz.prajsner

On Mon, Dec 08, 2025 at 05:15:10PM +0100, Md Haris Iqbal wrote:
> Subject: [PATCH 6/9] RDMA/rtrs-srv: Add check and closure for possible
>  zombie paths
> From: Md Haris Iqbal <haris.iqbal@ionos.com>
> Date: Mon,  8 Dec 2025 17:15:10 +0100
> X-Mailer: git-send-email 2.43.0
> 
> +++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> @@ -911,6 +911,13 @@ static int process_info_req(struct rtrs_srv_con *con,
>  				      tx_iu->dma_addr,
>  				      tx_iu->size, DMA_TO_DEVICE);
>  
> +	/*
> +	 * Now disable zombie connection closing. Since from the logs and code,
> +	 * we know that it can never be in CONNECTED state.
> +	 * See RNBD-3128 comments.
               ^^^^^^^^^^^^^^^^^
What is it? How to access it?

thanks


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/9] RDMA/rtrs: Add error description to the logs
  2025-12-08 16:15 ` [PATCH 2/9] RDMA/rtrs: Add error description to the logs Md Haris Iqbal
@ 2025-12-12  5:26   ` Dan Carpenter
  2026-01-06  9:47     ` Haris Iqbal
  2025-12-18 15:51   ` Leon Romanovsky
  1 sibling, 1 reply; 20+ messages in thread
From: Dan Carpenter @ 2025-12-12  5:26 UTC (permalink / raw)
  To: oe-kbuild, Md Haris Iqbal, linux-rdma
  Cc: lkp, oe-kbuild-all, bvanassche, leon, jgg, haris.iqbal,
	jinpu.wang, grzegorz.prajsner, Kim Zhu

Hi Md,

kernel test robot noticed the following build warnings:

https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Md-Haris-Iqbal/RDMA-rtrs-srv-fix-SG-mapping/20251209-001817
base:   https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git for-next
patch link:    https://lore.kernel.org/r/20251208161513.127049-3-haris.iqbal%40ionos.com
patch subject: [PATCH 2/9] RDMA/rtrs: Add error description to the logs
config: arm-randconfig-r072-20251210 (https://download.01.org/0day-ci/archive/20251212/202512120133.BuJVeI6M-lkp@intel.com/config)
compiler: arm-linux-gnueabi-gcc (GCC) 12.5.0

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
| Closes: https://lore.kernel.org/r/202512120133.BuJVeI6M-lkp@intel.com/

smatch warnings:
drivers/infiniband/ulp/rtrs/rtrs-srv.c:885 process_info_req() warn: passing zero to 'ERR_PTR'

vim +/ERR_PTR +885 drivers/infiniband/ulp/rtrs/rtrs-srv.c

9cb837480424e7 Jack Wang        2020-05-11  808  static int process_info_req(struct rtrs_srv_con *con,
9cb837480424e7 Jack Wang        2020-05-11  809  			    struct rtrs_msg_info_req *msg)
9cb837480424e7 Jack Wang        2020-05-11  810  {
d9372794717f44 Vaishali Thakkar 2022-01-05  811  	struct rtrs_path *s = con->c.path;
ae4c81644e9105 Vaishali Thakkar 2022-01-05  812  	struct rtrs_srv_path *srv_path = to_srv_path(s);
9cb837480424e7 Jack Wang        2020-05-11  813  	struct ib_send_wr *reg_wr = NULL;
9cb837480424e7 Jack Wang        2020-05-11  814  	struct rtrs_msg_info_rsp *rsp;
9cb837480424e7 Jack Wang        2020-05-11  815  	struct rtrs_iu *tx_iu;
9cb837480424e7 Jack Wang        2020-05-11  816  	struct ib_reg_wr *rwr;
9cb837480424e7 Jack Wang        2020-05-11  817  	int mri, err;
9cb837480424e7 Jack Wang        2020-05-11  818  	size_t tx_sz;
9cb837480424e7 Jack Wang        2020-05-11  819  
ae4c81644e9105 Vaishali Thakkar 2022-01-05  820  	err = post_recv_path(srv_path);
4693d6b767d6ca Gioh Kim         2021-08-06  821  	if (err) {
94ae3ce9b375c6 Kim Zhu          2025-12-08  822  		rtrs_err(s, "post_recv_path(), err: %d(%pe)\n", err, ERR_PTR(err));
9cb837480424e7 Jack Wang        2020-05-11  823  		return err;
9cb837480424e7 Jack Wang        2020-05-11  824  	}
07c14027295a32 Gioh Kim         2021-05-28  825  
ae4c81644e9105 Vaishali Thakkar 2022-01-05  826  	if (strchr(msg->pathname, '/') || strchr(msg->pathname, '.')) {
ae4c81644e9105 Vaishali Thakkar 2022-01-05  827  		rtrs_err(s, "pathname cannot contain / and .\n");
dea7bb3ad3e08f Md Haris Iqbal   2021-09-22  828  		return -EINVAL;
dea7bb3ad3e08f Md Haris Iqbal   2021-09-22  829  	}
dea7bb3ad3e08f Md Haris Iqbal   2021-09-22  830  
ae4c81644e9105 Vaishali Thakkar 2022-01-05  831  	if (exist_pathname(srv_path->srv->ctx,
ae4c81644e9105 Vaishali Thakkar 2022-01-05  832  			   msg->pathname, &srv_path->srv->paths_uuid)) {
ae4c81644e9105 Vaishali Thakkar 2022-01-05  833  		rtrs_err(s, "pathname is duplicated: %s\n", msg->pathname);
07c14027295a32 Gioh Kim         2021-05-28  834  		return -EPERM;
07c14027295a32 Gioh Kim         2021-05-28  835  	}
ae4c81644e9105 Vaishali Thakkar 2022-01-05  836  	strscpy(srv_path->s.sessname, msg->pathname,
ae4c81644e9105 Vaishali Thakkar 2022-01-05  837  		sizeof(srv_path->s.sessname));
07c14027295a32 Gioh Kim         2021-05-28  838  
ae4c81644e9105 Vaishali Thakkar 2022-01-05  839  	rwr = kcalloc(srv_path->mrs_num, sizeof(*rwr), GFP_KERNEL);
4693d6b767d6ca Gioh Kim         2021-08-06  840  	if (!rwr)
9cb837480424e7 Jack Wang        2020-05-11  841  		return -ENOMEM;
9cb837480424e7 Jack Wang        2020-05-11  842  
9cb837480424e7 Jack Wang        2020-05-11  843  	tx_sz  = sizeof(*rsp);
ae4c81644e9105 Vaishali Thakkar 2022-01-05  844  	tx_sz += sizeof(rsp->desc[0]) * srv_path->mrs_num;
ae4c81644e9105 Vaishali Thakkar 2022-01-05  845  	tx_iu = rtrs_iu_alloc(1, tx_sz, GFP_KERNEL, srv_path->s.dev->ib_dev,
9cb837480424e7 Jack Wang        2020-05-11  846  			       DMA_TO_DEVICE, rtrs_srv_info_rsp_done);
4693d6b767d6ca Gioh Kim         2021-08-06  847  	if (!tx_iu) {
9cb837480424e7 Jack Wang        2020-05-11  848  		err = -ENOMEM;
9cb837480424e7 Jack Wang        2020-05-11  849  		goto rwr_free;
9cb837480424e7 Jack Wang        2020-05-11  850  	}
9cb837480424e7 Jack Wang        2020-05-11  851  
9cb837480424e7 Jack Wang        2020-05-11  852  	rsp = tx_iu->buf;
9cb837480424e7 Jack Wang        2020-05-11  853  	rsp->type = cpu_to_le16(RTRS_MSG_INFO_RSP);
ae4c81644e9105 Vaishali Thakkar 2022-01-05  854  	rsp->sg_cnt = cpu_to_le16(srv_path->mrs_num);
9cb837480424e7 Jack Wang        2020-05-11  855  
ae4c81644e9105 Vaishali Thakkar 2022-01-05  856  	for (mri = 0; mri < srv_path->mrs_num; mri++) {
ae4c81644e9105 Vaishali Thakkar 2022-01-05  857  		struct ib_mr *mr = srv_path->mrs[mri].mr;
9cb837480424e7 Jack Wang        2020-05-11  858  
9cb837480424e7 Jack Wang        2020-05-11  859  		rsp->desc[mri].addr = cpu_to_le64(mr->iova);
9cb837480424e7 Jack Wang        2020-05-11  860  		rsp->desc[mri].key  = cpu_to_le32(mr->rkey);
9cb837480424e7 Jack Wang        2020-05-11  861  		rsp->desc[mri].len  = cpu_to_le32(mr->length);
9cb837480424e7 Jack Wang        2020-05-11  862  
9cb837480424e7 Jack Wang        2020-05-11  863  		/*
9cb837480424e7 Jack Wang        2020-05-11  864  		 * Fill in reg MR request and chain them *backwards*
9cb837480424e7 Jack Wang        2020-05-11  865  		 */
9cb837480424e7 Jack Wang        2020-05-11  866  		rwr[mri].wr.next = mri ? &rwr[mri - 1].wr : NULL;
9cb837480424e7 Jack Wang        2020-05-11  867  		rwr[mri].wr.opcode = IB_WR_REG_MR;
9cb837480424e7 Jack Wang        2020-05-11  868  		rwr[mri].wr.wr_cqe = &local_reg_cqe;
9cb837480424e7 Jack Wang        2020-05-11  869  		rwr[mri].wr.num_sge = 0;
e8ae7ddb48a1b8 Jack Wang        2020-12-17  870  		rwr[mri].wr.send_flags = 0;
9cb837480424e7 Jack Wang        2020-05-11  871  		rwr[mri].mr = mr;
9cb837480424e7 Jack Wang        2020-05-11  872  		rwr[mri].key = mr->rkey;
9cb837480424e7 Jack Wang        2020-05-11  873  		rwr[mri].access = (IB_ACCESS_LOCAL_WRITE |
9cb837480424e7 Jack Wang        2020-05-11  874  				   IB_ACCESS_REMOTE_WRITE);
9cb837480424e7 Jack Wang        2020-05-11  875  		reg_wr = &rwr[mri].wr;
9cb837480424e7 Jack Wang        2020-05-11  876  	}
9cb837480424e7 Jack Wang        2020-05-11  877  
ae4c81644e9105 Vaishali Thakkar 2022-01-05  878  	err = rtrs_srv_create_path_files(srv_path);
4693d6b767d6ca Gioh Kim         2021-08-06  879  	if (err)
9cb837480424e7 Jack Wang        2020-05-11  880  		goto iu_free;
ae4c81644e9105 Vaishali Thakkar 2022-01-05  881  	kobject_get(&srv_path->kobj);
ae4c81644e9105 Vaishali Thakkar 2022-01-05  882  	get_device(&srv_path->srv->dev);
ed1e52aefa16f1 Md Haris Iqbal   2023-11-20  883  	err = rtrs_srv_change_state(srv_path, RTRS_SRV_CONNECTED);
ed1e52aefa16f1 Md Haris Iqbal   2023-11-20  884  	if (!err) {

Probably remove the !?

94ae3ce9b375c6 Kim Zhu          2025-12-08 @885  		rtrs_err(s, "rtrs_srv_change_state(), err: %d(%pe)\n", err, ERR_PTR(err));

err is zero.  Or is this a success path?

ed1e52aefa16f1 Md Haris Iqbal   2023-11-20  886  		goto iu_free;
ed1e52aefa16f1 Md Haris Iqbal   2023-11-20  887  	}
ed1e52aefa16f1 Md Haris Iqbal   2023-11-20  888  
ae4c81644e9105 Vaishali Thakkar 2022-01-05  889  	rtrs_srv_start_hb(srv_path);
9cb837480424e7 Jack Wang        2020-05-11  890  
9cb837480424e7 Jack Wang        2020-05-11  891  	/*
9cb837480424e7 Jack Wang        2020-05-11  892  	 * We do not account number of established connections at the current
9cb837480424e7 Jack Wang        2020-05-11  893  	 * moment, we rely on the client, which should send info request when
9cb837480424e7 Jack Wang        2020-05-11  894  	 * all connections are successfully established.  Thus, simply notify
9cb837480424e7 Jack Wang        2020-05-11  895  	 * listener with a proper event if we are the first path.
9cb837480424e7 Jack Wang        2020-05-11  896  	 */
ed1e52aefa16f1 Md Haris Iqbal   2023-11-20  897  	err = rtrs_srv_path_up(srv_path);
ed1e52aefa16f1 Md Haris Iqbal   2023-11-20  898  	if (err) {
94ae3ce9b375c6 Kim Zhu          2025-12-08  899  		rtrs_err(s, "rtrs_srv_path_up(), err: %d(%pe)\n", err, ERR_PTR(err));
ed1e52aefa16f1 Md Haris Iqbal   2023-11-20  900  		goto iu_free;
ed1e52aefa16f1 Md Haris Iqbal   2023-11-20  901  	}
9cb837480424e7 Jack Wang        2020-05-11  902  
ae4c81644e9105 Vaishali Thakkar 2022-01-05  903  	ib_dma_sync_single_for_device(srv_path->s.dev->ib_dev,
ae4c81644e9105 Vaishali Thakkar 2022-01-05  904  				      tx_iu->dma_addr,
9cb837480424e7 Jack Wang        2020-05-11  905  				      tx_iu->size, DMA_TO_DEVICE);
9cb837480424e7 Jack Wang        2020-05-11  906  
9cb837480424e7 Jack Wang        2020-05-11  907  	/* Send info response */
9cb837480424e7 Jack Wang        2020-05-11  908  	err = rtrs_iu_post_send(&con->c, tx_iu, tx_sz, reg_wr);
4693d6b767d6ca Gioh Kim         2021-08-06  909  	if (err) {
94ae3ce9b375c6 Kim Zhu          2025-12-08  910  		rtrs_err(s, "rtrs_iu_post_send(), err: %d(%pe)\n", err, ERR_PTR(err));
9cb837480424e7 Jack Wang        2020-05-11  911  iu_free:
ae4c81644e9105 Vaishali Thakkar 2022-01-05  912  		rtrs_iu_free(tx_iu, srv_path->s.dev->ib_dev, 1);
9cb837480424e7 Jack Wang        2020-05-11  913  	}
9cb837480424e7 Jack Wang        2020-05-11  914  rwr_free:
9cb837480424e7 Jack Wang        2020-05-11  915  	kfree(rwr);
9cb837480424e7 Jack Wang        2020-05-11  916  
9cb837480424e7 Jack Wang        2020-05-11  917  	return err;
9cb837480424e7 Jack Wang        2020-05-11  918  }

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/9] RDMA/rtrs: Add error description to the logs
  2025-12-08 16:15 ` [PATCH 2/9] RDMA/rtrs: Add error description to the logs Md Haris Iqbal
  2025-12-12  5:26   ` Dan Carpenter
@ 2025-12-18 15:51   ` Leon Romanovsky
  2026-01-06 10:03     ` Haris Iqbal
  1 sibling, 1 reply; 20+ messages in thread
From: Leon Romanovsky @ 2025-12-18 15:51 UTC (permalink / raw)
  To: Md Haris Iqbal
  Cc: linux-rdma, bvanassche, jgg, jinpu.wang, grzegorz.prajsner,
	Kim Zhu

On Mon, Dec 08, 2025 at 05:15:06PM +0100, Md Haris Iqbal wrote:
> From: Kim Zhu <zhu.yanjun@ionos.com>
> 
> Print error description besides the error number.
> 
> Signed-off-by: Kim Zhu <zhu.yanjun@ionos.com>
> Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
> Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
> ---
>  drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c |  8 +-
>  drivers/infiniband/ulp/rtrs/rtrs-clt.c       | 89 ++++++++++----------
>  drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c | 12 +--
>  drivers/infiniband/ulp/rtrs/rtrs-srv.c       | 78 ++++++++---------
>  drivers/infiniband/ulp/rtrs/rtrs.c           |  9 +-
>  5 files changed, 101 insertions(+), 95 deletions(-)
> 
> diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c b/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c
> index 4aa80c9388f0..b318acc12b10 100644
> --- a/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c
> +++ b/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c
> @@ -439,19 +439,19 @@ int rtrs_clt_create_path_files(struct rtrs_clt_path *clt_path)
>  				   clt->kobj_paths,
>  				   "%s", str);
>  	if (err) {
> -		pr_err("kobject_init_and_add: %d\n", err);
> +		pr_err("kobject_init_and_add: %d(%pe)\n", err, ERR_PTR(err));

Either print the error number or print the error description, not both.

Thanks

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 5/9] RDMA/rtrs-clt: Remove unused list-head in rtrs_clt_io_req
  2025-12-09  1:14   ` Honggang LI
@ 2026-01-06  9:26     ` Haris Iqbal
  0 siblings, 0 replies; 20+ messages in thread
From: Haris Iqbal @ 2026-01-06  9:26 UTC (permalink / raw)
  To: Honggang LI
  Cc: linux-rdma, bvanassche, leon, jgg, jinpu.wang, grzegorz.prajsner

On Tue, Dec 9, 2025 at 2:14 AM Honggang LI <honggangli@163.com> wrote:
>
> On Mon, Dec 08, 2025 at 05:15:09PM +0100, Md Haris Iqbal wrote:
> > Subject: [PATCH 5/9] RDMA/rtrs-clt: Remove unused list-head in
> >  rtrs_clt_io_req
> > From: Md Haris Iqbal <haris.iqbal@ionos.com>
> > Date: Mon,  8 Dec 2025 17:15:09 +0100
> > X-Mailer: git-send-email 2.43.0
> >
> > From: Jack Wang <jinpu.wang@ionos.com>
> >
> > Remove unused member.
> >
> > Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
> > Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
> > ---
> >  drivers/infiniband/ulp/rtrs/rtrs-clt.h | 1 -
> >  1 file changed, 1 deletion(-)
> >
> > diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.h b/drivers/infiniband/ulp/rtrs/rtrs-clt.h
> > index 0f57759b3080..3633119d1db2 100644
> > --- a/drivers/infiniband/ulp/rtrs/rtrs-clt.h
> > +++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.h
> > @@ -92,7 +92,6 @@ struct rtrs_permit {
> >   * rtrs_clt_io_req - describes one inflight IO request
> >   */
> >  struct rtrs_clt_io_req {
> > -     struct list_head        list;
>
> It seems these two members are also unused. Why keep them?
>
> struct rtrs_sg_desc        *desc;
> unsigned long                start_jiffies;

Makes sense. Will remove them. Thanks

>
> thanks
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 6/9] RDMA/rtrs-srv: Add check and closure for possible zombie paths
  2025-12-09  1:17   ` Honggang LI
@ 2026-01-06  9:27     ` Haris Iqbal
  0 siblings, 0 replies; 20+ messages in thread
From: Haris Iqbal @ 2026-01-06  9:27 UTC (permalink / raw)
  To: Honggang LI
  Cc: linux-rdma, bvanassche, leon, jgg, jinpu.wang, grzegorz.prajsner

On Tue, Dec 9, 2025 at 2:17 AM Honggang LI <honggangli@163.com> wrote:
>
> On Mon, Dec 08, 2025 at 05:15:10PM +0100, Md Haris Iqbal wrote:
> > Subject: [PATCH 6/9] RDMA/rtrs-srv: Add check and closure for possible
> >  zombie paths
> > From: Md Haris Iqbal <haris.iqbal@ionos.com>
> > Date: Mon,  8 Dec 2025 17:15:10 +0100
> > X-Mailer: git-send-email 2.43.0
> >
> > +++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> > @@ -911,6 +911,13 @@ static int process_info_req(struct rtrs_srv_con *con,
> >                                     tx_iu->dma_addr,
> >                                     tx_iu->size, DMA_TO_DEVICE);
> >
> > +     /*
> > +      * Now disable zombie connection closing. Since from the logs and code,
> > +      * we know that it can never be in CONNECTED state.
> > +      * See RNBD-3128 comments.
>                ^^^^^^^^^^^^^^^^^
> What is it? How to access it?

It is an internal ticket number. Should have been removed, but we
missed it. Will remove it.
Thanks.

>
> thanks
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/9] RDMA/rtrs: Add optional support for IB_MR_TYPE_SG_GAPS
  2025-12-09  1:12   ` Honggang LI
@ 2026-01-06  9:28     ` Haris Iqbal
  0 siblings, 0 replies; 20+ messages in thread
From: Haris Iqbal @ 2026-01-06  9:28 UTC (permalink / raw)
  To: Honggang LI
  Cc: linux-rdma, bvanassche, leon, jgg, jinpu.wang, grzegorz.prajsner,
	Kim Zhu

On Tue, Dec 9, 2025 at 2:13 AM Honggang LI <honggangli@163.com> wrote:
>
> On Mon, Dec 08, 2025 at 05:15:07PM +0100, Md Haris Iqbal wrote:
> > Subject: [PATCH 3/9] RDMA/rtrs: Add optional support for IB_MR_TYPE_SG_GAPS
> > From: Md Haris Iqbal <haris.iqbal@ionos.com>
> > Date: Mon,  8 Dec 2025 17:15:07 +0100
> > X-Mailer: git-send-email 2.43.0
> >
> > Support IB_MR_TYPE_SG_GAPS, which has fewer limitations
> > than the standard IB_MR_TYPE_MEM_REG; a few ULPs already support this.
>
> Do you have benchmark results showing the performance difference between
> IB_MR_TYPE_MEM_REG and IB_MR_TYPE_SG_GAPS?

We haven't benchmarked it yet. As a ULP, we wanted to first add
support to RTRS.

>
> thanks
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/9] RDMA/rtrs: Add error description to the logs
  2025-12-12  5:26   ` Dan Carpenter
@ 2026-01-06  9:47     ` Haris Iqbal
  0 siblings, 0 replies; 20+ messages in thread
From: Haris Iqbal @ 2026-01-06  9:47 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: oe-kbuild, linux-rdma, lkp, oe-kbuild-all, bvanassche, leon, jgg,
	jinpu.wang, grzegorz.prajsner, Kim Zhu

On Fri, Dec 12, 2025 at 6:26 AM Dan Carpenter <dan.carpenter@linaro.org> wrote:
>
> Hi Md,
>
> kernel test robot noticed the following build warnings:
>
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
>
> url:    https://github.com/intel-lab-lkp/linux/commits/Md-Haris-Iqbal/RDMA-rtrs-srv-fix-SG-mapping/20251209-001817
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git for-next
> patch link:    https://lore.kernel.org/r/20251208161513.127049-3-haris.iqbal%40ionos.com
> patch subject: [PATCH 2/9] RDMA/rtrs: Add error description to the logs
> config: arm-randconfig-r072-20251210 (https://download.01.org/0day-ci/archive/20251212/202512120133.BuJVeI6M-lkp@intel.com/config)
> compiler: arm-linux-gnueabi-gcc (GCC) 12.5.0
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
> | Closes: https://lore.kernel.org/r/202512120133.BuJVeI6M-lkp@intel.com/
>
> smatch warnings:
> drivers/infiniband/ulp/rtrs/rtrs-srv.c:885 process_info_req() warn: passing zero to 'ERR_PTR'
>
> vim +/ERR_PTR +885 drivers/infiniband/ulp/rtrs/rtrs-srv.c
>
> 9cb837480424e7 Jack Wang        2020-05-11  808  static int process_info_req(struct rtrs_srv_con *con,
> 9cb837480424e7 Jack Wang        2020-05-11  809                             struct rtrs_msg_info_req *msg)
> 9cb837480424e7 Jack Wang        2020-05-11  810  {
> d9372794717f44 Vaishali Thakkar 2022-01-05  811         struct rtrs_path *s = con->c.path;
> ae4c81644e9105 Vaishali Thakkar 2022-01-05  812         struct rtrs_srv_path *srv_path = to_srv_path(s);
> 9cb837480424e7 Jack Wang        2020-05-11  813         struct ib_send_wr *reg_wr = NULL;
> 9cb837480424e7 Jack Wang        2020-05-11  814         struct rtrs_msg_info_rsp *rsp;
> 9cb837480424e7 Jack Wang        2020-05-11  815         struct rtrs_iu *tx_iu;
> 9cb837480424e7 Jack Wang        2020-05-11  816         struct ib_reg_wr *rwr;
> 9cb837480424e7 Jack Wang        2020-05-11  817         int mri, err;
> 9cb837480424e7 Jack Wang        2020-05-11  818         size_t tx_sz;
> 9cb837480424e7 Jack Wang        2020-05-11  819
> ae4c81644e9105 Vaishali Thakkar 2022-01-05  820         err = post_recv_path(srv_path);
> 4693d6b767d6ca Gioh Kim         2021-08-06  821         if (err) {
> 94ae3ce9b375c6 Kim Zhu          2025-12-08  822                 rtrs_err(s, "post_recv_path(), err: %d(%pe)\n", err, ERR_PTR(err));
> 9cb837480424e7 Jack Wang        2020-05-11  823                 return err;
> 9cb837480424e7 Jack Wang        2020-05-11  824         }
> 07c14027295a32 Gioh Kim         2021-05-28  825
> ae4c81644e9105 Vaishali Thakkar 2022-01-05  826         if (strchr(msg->pathname, '/') || strchr(msg->pathname, '.')) {
> ae4c81644e9105 Vaishali Thakkar 2022-01-05  827                 rtrs_err(s, "pathname cannot contain / and .\n");
> dea7bb3ad3e08f Md Haris Iqbal   2021-09-22  828                 return -EINVAL;
> dea7bb3ad3e08f Md Haris Iqbal   2021-09-22  829         }
> dea7bb3ad3e08f Md Haris Iqbal   2021-09-22  830
> ae4c81644e9105 Vaishali Thakkar 2022-01-05  831         if (exist_pathname(srv_path->srv->ctx,
> ae4c81644e9105 Vaishali Thakkar 2022-01-05  832                            msg->pathname, &srv_path->srv->paths_uuid)) {
> ae4c81644e9105 Vaishali Thakkar 2022-01-05  833                 rtrs_err(s, "pathname is duplicated: %s\n", msg->pathname);
> 07c14027295a32 Gioh Kim         2021-05-28  834                 return -EPERM;
> 07c14027295a32 Gioh Kim         2021-05-28  835         }
> ae4c81644e9105 Vaishali Thakkar 2022-01-05  836         strscpy(srv_path->s.sessname, msg->pathname,
> ae4c81644e9105 Vaishali Thakkar 2022-01-05  837                 sizeof(srv_path->s.sessname));
> 07c14027295a32 Gioh Kim         2021-05-28  838
> ae4c81644e9105 Vaishali Thakkar 2022-01-05  839         rwr = kcalloc(srv_path->mrs_num, sizeof(*rwr), GFP_KERNEL);
> 4693d6b767d6ca Gioh Kim         2021-08-06  840         if (!rwr)
> 9cb837480424e7 Jack Wang        2020-05-11  841                 return -ENOMEM;
> 9cb837480424e7 Jack Wang        2020-05-11  842
> 9cb837480424e7 Jack Wang        2020-05-11  843         tx_sz  = sizeof(*rsp);
> ae4c81644e9105 Vaishali Thakkar 2022-01-05  844         tx_sz += sizeof(rsp->desc[0]) * srv_path->mrs_num;
> ae4c81644e9105 Vaishali Thakkar 2022-01-05  845         tx_iu = rtrs_iu_alloc(1, tx_sz, GFP_KERNEL, srv_path->s.dev->ib_dev,
> 9cb837480424e7 Jack Wang        2020-05-11  846                                DMA_TO_DEVICE, rtrs_srv_info_rsp_done);
> 4693d6b767d6ca Gioh Kim         2021-08-06  847         if (!tx_iu) {
> 9cb837480424e7 Jack Wang        2020-05-11  848                 err = -ENOMEM;
> 9cb837480424e7 Jack Wang        2020-05-11  849                 goto rwr_free;
> 9cb837480424e7 Jack Wang        2020-05-11  850         }
> 9cb837480424e7 Jack Wang        2020-05-11  851
> 9cb837480424e7 Jack Wang        2020-05-11  852         rsp = tx_iu->buf;
> 9cb837480424e7 Jack Wang        2020-05-11  853         rsp->type = cpu_to_le16(RTRS_MSG_INFO_RSP);
> ae4c81644e9105 Vaishali Thakkar 2022-01-05  854         rsp->sg_cnt = cpu_to_le16(srv_path->mrs_num);
> 9cb837480424e7 Jack Wang        2020-05-11  855
> ae4c81644e9105 Vaishali Thakkar 2022-01-05  856         for (mri = 0; mri < srv_path->mrs_num; mri++) {
> ae4c81644e9105 Vaishali Thakkar 2022-01-05  857                 struct ib_mr *mr = srv_path->mrs[mri].mr;
> 9cb837480424e7 Jack Wang        2020-05-11  858
> 9cb837480424e7 Jack Wang        2020-05-11  859                 rsp->desc[mri].addr = cpu_to_le64(mr->iova);
> 9cb837480424e7 Jack Wang        2020-05-11  860                 rsp->desc[mri].key  = cpu_to_le32(mr->rkey);
> 9cb837480424e7 Jack Wang        2020-05-11  861                 rsp->desc[mri].len  = cpu_to_le32(mr->length);
> 9cb837480424e7 Jack Wang        2020-05-11  862
> 9cb837480424e7 Jack Wang        2020-05-11  863                 /*
> 9cb837480424e7 Jack Wang        2020-05-11  864                  * Fill in reg MR request and chain them *backwards*
> 9cb837480424e7 Jack Wang        2020-05-11  865                  */
> 9cb837480424e7 Jack Wang        2020-05-11  866                 rwr[mri].wr.next = mri ? &rwr[mri - 1].wr : NULL;
> 9cb837480424e7 Jack Wang        2020-05-11  867                 rwr[mri].wr.opcode = IB_WR_REG_MR;
> 9cb837480424e7 Jack Wang        2020-05-11  868                 rwr[mri].wr.wr_cqe = &local_reg_cqe;
> 9cb837480424e7 Jack Wang        2020-05-11  869                 rwr[mri].wr.num_sge = 0;
> e8ae7ddb48a1b8 Jack Wang        2020-12-17  870                 rwr[mri].wr.send_flags = 0;
> 9cb837480424e7 Jack Wang        2020-05-11  871                 rwr[mri].mr = mr;
> 9cb837480424e7 Jack Wang        2020-05-11  872                 rwr[mri].key = mr->rkey;
> 9cb837480424e7 Jack Wang        2020-05-11  873                 rwr[mri].access = (IB_ACCESS_LOCAL_WRITE |
> 9cb837480424e7 Jack Wang        2020-05-11  874                                    IB_ACCESS_REMOTE_WRITE);
> 9cb837480424e7 Jack Wang        2020-05-11  875                 reg_wr = &rwr[mri].wr;
> 9cb837480424e7 Jack Wang        2020-05-11  876         }
> 9cb837480424e7 Jack Wang        2020-05-11  877
> ae4c81644e9105 Vaishali Thakkar 2022-01-05  878         err = rtrs_srv_create_path_files(srv_path);
> 4693d6b767d6ca Gioh Kim         2021-08-06  879         if (err)
> 9cb837480424e7 Jack Wang        2020-05-11  880                 goto iu_free;
> ae4c81644e9105 Vaishali Thakkar 2022-01-05  881         kobject_get(&srv_path->kobj);
> ae4c81644e9105 Vaishali Thakkar 2022-01-05  882         get_device(&srv_path->srv->dev);
> ed1e52aefa16f1 Md Haris Iqbal   2023-11-20  883         err = rtrs_srv_change_state(srv_path, RTRS_SRV_CONNECTED);
> ed1e52aefa16f1 Md Haris Iqbal   2023-11-20  884         if (!err) {
>
> Probably remove the !?
>
> 94ae3ce9b375c6 Kim Zhu          2025-12-08 @885                 rtrs_err(s, "rtrs_srv_change_state(), err: %d(%pe)\n", err, ERR_PTR(err));
>
> err is zero.  Or is this a success path?

The function rtrs_srv_change_state() returns a bool, with true meaning
success. Since that return value is not an errno, it should not be passed
to ERR_PTR() in the error log.
Will drop this part of the change in the next version.
Thanks.
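
To make the point above concrete, here is a minimal userspace sketch of the
pattern, not the RTRS code itself. All demo_* names are hypothetical, and
-EINVAL is chosen purely for illustration. It shows a bool-returning state
change being translated into an errno by the caller, rather than the bool
being stored in 'err' and handed to ERR_PTR().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a path state machine; not the RTRS one. */
enum demo_state { DEMO_CONNECTING, DEMO_CONNECTED, DEMO_CLOSED };

struct demo_path {
	enum demo_state state;
};

/* Returns true when the transition happened, false when it was refused. */
bool demo_change_state(struct demo_path *p, enum demo_state new_state)
{
	if (p->state == DEMO_CLOSED)
		return false;	/* a closed path never comes back */
	p->state = new_state;
	return true;
}

/*
 * Caller that has to return an errno: translate the bool explicitly
 * instead of treating "false" (0) as if it were an error code.
 */
int demo_process(struct demo_path *p)
{
	assert(p != NULL);
	if (!demo_change_state(p, DEMO_CONNECTED))
		return -EINVAL;	/* illustrative error code only */
	return 0;
}
```

With this shape the failure branch always carries a real negative errno, so
a later log line or return statement never sees a zero "error".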

>
> ed1e52aefa16f1 Md Haris Iqbal   2023-11-20  886                 goto iu_free;
> ed1e52aefa16f1 Md Haris Iqbal   2023-11-20  887         }
> ed1e52aefa16f1 Md Haris Iqbal   2023-11-20  888
> ae4c81644e9105 Vaishali Thakkar 2022-01-05  889         rtrs_srv_start_hb(srv_path);
> 9cb837480424e7 Jack Wang        2020-05-11  890
> 9cb837480424e7 Jack Wang        2020-05-11  891         /*
> 9cb837480424e7 Jack Wang        2020-05-11  892          * We do not account number of established connections at the current
> 9cb837480424e7 Jack Wang        2020-05-11  893          * moment, we rely on the client, which should send info request when
> 9cb837480424e7 Jack Wang        2020-05-11  894          * all connections are successfully established.  Thus, simply notify
> 9cb837480424e7 Jack Wang        2020-05-11  895          * listener with a proper event if we are the first path.
> 9cb837480424e7 Jack Wang        2020-05-11  896          */
> ed1e52aefa16f1 Md Haris Iqbal   2023-11-20  897         err = rtrs_srv_path_up(srv_path);
> ed1e52aefa16f1 Md Haris Iqbal   2023-11-20  898         if (err) {
> 94ae3ce9b375c6 Kim Zhu          2025-12-08  899                 rtrs_err(s, "rtrs_srv_path_up(), err: %d(%pe)\n", err, ERR_PTR(err));
> ed1e52aefa16f1 Md Haris Iqbal   2023-11-20  900                 goto iu_free;
> ed1e52aefa16f1 Md Haris Iqbal   2023-11-20  901         }
> 9cb837480424e7 Jack Wang        2020-05-11  902
> ae4c81644e9105 Vaishali Thakkar 2022-01-05  903         ib_dma_sync_single_for_device(srv_path->s.dev->ib_dev,
> ae4c81644e9105 Vaishali Thakkar 2022-01-05  904                                       tx_iu->dma_addr,
> 9cb837480424e7 Jack Wang        2020-05-11  905                                       tx_iu->size, DMA_TO_DEVICE);
> 9cb837480424e7 Jack Wang        2020-05-11  906
> 9cb837480424e7 Jack Wang        2020-05-11  907         /* Send info response */
> 9cb837480424e7 Jack Wang        2020-05-11  908         err = rtrs_iu_post_send(&con->c, tx_iu, tx_sz, reg_wr);
> 4693d6b767d6ca Gioh Kim         2021-08-06  909         if (err) {
> 94ae3ce9b375c6 Kim Zhu          2025-12-08  910                 rtrs_err(s, "rtrs_iu_post_send(), err: %d(%pe)\n", err, ERR_PTR(err));
> 9cb837480424e7 Jack Wang        2020-05-11  911  iu_free:
> ae4c81644e9105 Vaishali Thakkar 2022-01-05  912                 rtrs_iu_free(tx_iu, srv_path->s.dev->ib_dev, 1);
> 9cb837480424e7 Jack Wang        2020-05-11  913         }
> 9cb837480424e7 Jack Wang        2020-05-11  914  rwr_free:
> 9cb837480424e7 Jack Wang        2020-05-11  915         kfree(rwr);
> 9cb837480424e7 Jack Wang        2020-05-11  916
> 9cb837480424e7 Jack Wang        2020-05-11  917         return err;
> 9cb837480424e7 Jack Wang        2020-05-11  918  }
>
> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/9] RDMA/rtrs: Add error description to the logs
  2025-12-18 15:51   ` Leon Romanovsky
@ 2026-01-06 10:03     ` Haris Iqbal
  0 siblings, 0 replies; 20+ messages in thread
From: Haris Iqbal @ 2026-01-06 10:03 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: linux-rdma, bvanassche, jgg, jinpu.wang, grzegorz.prajsner,
	Kim Zhu

On Thu, Dec 18, 2025 at 4:51 PM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Mon, Dec 08, 2025 at 05:15:06PM +0100, Md Haris Iqbal wrote:
> > From: Kim Zhu <zhu.yanjun@ionos.com>
> >
> > Print error description besides the error number.
> >
> > Signed-off-by: Kim Zhu <zhu.yanjun@ionos.com>
> > Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
> > Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
> > ---
> >  drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c |  8 +-
> >  drivers/infiniband/ulp/rtrs/rtrs-clt.c       | 89 ++++++++++----------
> >  drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c | 12 +--
> >  drivers/infiniband/ulp/rtrs/rtrs-srv.c       | 78 ++++++++---------
> >  drivers/infiniband/ulp/rtrs/rtrs.c           |  9 +-
> >  5 files changed, 101 insertions(+), 95 deletions(-)
> >
> > diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c b/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c
> > index 4aa80c9388f0..b318acc12b10 100644
> > --- a/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c
> > +++ b/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c
> > @@ -439,19 +439,19 @@ int rtrs_clt_create_path_files(struct rtrs_clt_path *clt_path)
> >                                  clt->kobj_paths,
> >                                  "%s", str);
> >       if (err) {
> > -             pr_err("kobject_init_and_add: %d\n", err);
> > +             pr_err("kobject_init_and_add: %d(%pe)\n", err, ERR_PTR(err));
>
> Either print the error number or print the error description, not both.

Makes sense. Will change it.
Thanks
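
As a rough illustration of the "one or the other" style, here is a userspace
sketch that mimics what the kernel's %pe does with ERR_PTR(err). The
demo_* helpers and the three-entry table are hypothetical; in the kernel the
call would presumably reduce to a single "%pe" with ERR_PTR(err). The
helper prints the symbolic name when it knows one and falls back to the
decimal value otherwise, so each message carries exactly one form.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Tiny illustrative table; the kernel's %pe knows far more codes. */
static const char *demo_errname(int err)
{
	switch (err) {
	case -ENOMEM: return "-ENOMEM";
	case -EINVAL: return "-EINVAL";
	case -EEXIST: return "-EEXIST";
	default:      return NULL;
	}
}

/*
 * Format "<what>: <symbolic name>", or "<what>: <number>" when no name
 * is known. Returns what snprintf() returns.
 */
int demo_format_err(char *buf, size_t len, const char *what, int err)
{
	const char *name = demo_errname(err);

	assert(buf != NULL && what != NULL);
	if (name)
		return snprintf(buf, len, "%s: %s", what, name);
	return snprintf(buf, len, "%s: %d", what, err);
}
```

Calling demo_format_err(buf, sizeof(buf), "kobject_init_and_add", -ENOMEM)
fills buf with the name only, without repeating the number next to it.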

>
> Thanks

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2026-01-06 10:03 UTC | newest]

Thread overview: 20+ messages
2025-12-08 16:15 [PATCH 0/9] Misc patches for RTRS Md Haris Iqbal
2025-12-08 16:15 ` [PATCH 1/9] RDMA/rtrs-srv: fix SG mapping Md Haris Iqbal
2025-12-08 16:15 ` [PATCH 2/9] RDMA/rtrs: Add error description to the logs Md Haris Iqbal
2025-12-12  5:26   ` Dan Carpenter
2026-01-06  9:47     ` Haris Iqbal
2025-12-18 15:51   ` Leon Romanovsky
2026-01-06 10:03     ` Haris Iqbal
2025-12-08 16:15 ` [PATCH 3/9] RDMA/rtrs: Add optional support for IB_MR_TYPE_SG_GAPS Md Haris Iqbal
2025-12-09  1:12   ` Honggang LI
2026-01-06  9:28     ` Haris Iqbal
2025-12-08 16:15 ` [PATCH 4/9] RDMA/rtrs: Improve error logging for RDMA cm events Md Haris Iqbal
2025-12-08 16:15 ` [PATCH 5/9] RDMA/rtrs-clt: Remove unused list-head in rtrs_clt_io_req Md Haris Iqbal
2025-12-09  1:14   ` Honggang LI
2026-01-06  9:26     ` Haris Iqbal
2025-12-08 16:15 ` [PATCH 6/9] RDMA/rtrs-srv: Add check and closure for possible zombie paths Md Haris Iqbal
2025-12-09  1:17   ` Honggang LI
2026-01-06  9:27     ` Haris Iqbal
2025-12-08 16:15 ` [PATCH 7/9] RDMA/rtrs-srv: Rate-limit I/O path error logging Md Haris Iqbal
2025-12-08 16:15 ` [PATCH 8/9] RDMA/rtrs: Extend log message when a port fails Md Haris Iqbal
2025-12-08 16:15 ` [PATCH 9/9] RDMA/rtrs-clt.c: For conn rejection use actual err number Md Haris Iqbal
