* [PATCH 0/9] Misc patches for RTRS
@ 2025-12-08 16:15 Md Haris Iqbal
2025-12-08 16:15 ` [PATCH 1/9] RDMA/rtrs-srv: fix SG mapping Md Haris Iqbal
` (8 more replies)
0 siblings, 9 replies; 20+ messages in thread
From: Md Haris Iqbal @ 2025-12-08 16:15 UTC
To: linux-rdma
Cc: bvanassche, leon, jgg, haris.iqbal, jinpu.wang, grzegorz.prajsner
Hi Jason, hi Leon,
Please consider including the following changes in the next merge window.
Jack Wang (1):
RDMA/rtrs-clt: Remove unused list-head in rtrs_clt_io_req
Kim Zhu (4):
RDMA/rtrs: Add error description to the logs
RDMA/rtrs: Improve error logging for RDMA cm events
RDMA/rtrs-srv: Rate-limit I/O path error logging
RDMA/rtrs: Extend log message when a port fails
Md Haris Iqbal (3):
RDMA/rtrs: Add optional support for IB_MR_TYPE_SG_GAPS
RDMA/rtrs-srv: Add check and closure for possible zombie paths
RDMA/rtrs-clt.c: For conn rejection use actual err number
Roman Penyaev (1):
RDMA/rtrs-srv: fix SG mapping
drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c | 8 +-
drivers/infiniband/ulp/rtrs/rtrs-clt.c | 132 ++++++++-----
drivers/infiniband/ulp/rtrs/rtrs-clt.h | 1 -
drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c | 12 +-
drivers/infiniband/ulp/rtrs/rtrs-srv.c | 185 +++++++++++++------
drivers/infiniband/ulp/rtrs/rtrs-srv.h | 1 +
drivers/infiniband/ulp/rtrs/rtrs.c | 9 +-
7 files changed, 232 insertions(+), 116 deletions(-)
--
2.43.0
* [PATCH 1/9] RDMA/rtrs-srv: fix SG mapping
2025-12-08 16:15 [PATCH 0/9] Misc patches for RTRS Md Haris Iqbal
@ 2025-12-08 16:15 ` Md Haris Iqbal
2025-12-08 16:15 ` [PATCH 2/9] RDMA/rtrs: Add error description to the logs Md Haris Iqbal
` (7 subsequent siblings)
8 siblings, 0 replies; 20+ messages in thread
From: Md Haris Iqbal @ 2025-12-08 16:15 UTC
To: linux-rdma
Cc: bvanassche, leon, jgg, haris.iqbal, jinpu.wang, grzegorz.prajsner,
Roman Penyaev
From: Roman Penyaev <r.peniaev@gmail.com>
This fixes the following error on the server side:
RTRS server session allocation failed: -EINVAL
The error is caused by the caller of `ib_dma_map_sg()`, which does not
expect fewer mapped entries than requested. Getting fewer entries is
normal and can be easily reproduced on a machine with the IOMMU enabled.
The fix is to treat any positive number of mapped sg entries as a
successful mapping and to cache the DMA addresses by traversing the
modified SG table.
Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Signed-off-by: Roman Penyaev <r.peniaev@gmail.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
---
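For readers following the address-caching loop, here is a minimal userspace sketch of the same idea: when the IOMMU merges neighbouring entries, each returned segment can cover several chunks, so the per-chunk address array has to be filled by stepping through every segment in `max_chunk_size` increments. The names `struct dma_seg` and `cache_chunk_addrs()` are hypothetical stand-ins for the scatterlist accessors and are not part of the patch; the bounds check on the output array is also an addition made only for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one mapped scatterlist entry. */
struct dma_seg {
	uint64_t addr;	/* what sg_dma_address() would return */
	uint32_t len;	/* what sg_dma_len() would return */
};

/*
 * Fill out[] with one DMA address per chunk by walking each (possibly
 * merged) segment in chunk_size steps. Returns the number of slots
 * written, or 0 if out[] is too small to hold every chunk.
 */
size_t cache_chunk_addrs(const struct dma_seg *segs, size_t nr_segs,
			 uint32_t chunk_size,
			 uint64_t *out, size_t out_len)
{
	size_t ind = 0;

	assert(chunk_size > 0);

	for (size_t i = 0; i < nr_segs; i++) {
		uint64_t addr = segs[i].addr;
		uint64_t end = addr + segs[i].len;

		/* One slot per chunk, even if the segment spans several. */
		do {
			if (ind == out_len)
				return 0;
			out[ind++] = addr;
			addr += chunk_size;
		} while (addr < end);
	}
	return ind;
}
```

With two segments where the first one is a merge of two chunks, the sketch yields three cached addresses, which matches the number of chunks rather than the number of mapped entries.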
drivers/infiniband/ulp/rtrs/rtrs-srv.c | 25 ++++++++++++++++++++-----
1 file changed, 20 insertions(+), 5 deletions(-)
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index ef4abdea3c2d..2589871c0fa9 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -601,7 +601,7 @@ static int map_cont_bufs(struct rtrs_srv_path *srv_path)
srv_path->mrs_num++) {
struct rtrs_srv_mr *srv_mr = &srv_path->mrs[srv_path->mrs_num];
struct scatterlist *s;
- int nr, nr_sgt, chunks;
+ int nr, nr_sgt, chunks, ind;
sgt = &srv_mr->sgt;
chunks = chunks_per_mr * srv_path->mrs_num;
@@ -631,7 +631,7 @@ static int map_cont_bufs(struct rtrs_srv_path *srv_path)
}
nr = ib_map_mr_sg(mr, sgt->sgl, nr_sgt,
NULL, max_chunk_size);
- if (nr != nr_sgt) {
+ if (nr < nr_sgt) {
err = nr < 0 ? nr : -EINVAL;
goto dereg_mr;
}
@@ -647,9 +647,24 @@ static int map_cont_bufs(struct rtrs_srv_path *srv_path)
goto dereg_mr;
}
}
- /* Eventually dma addr for each chunk can be cached */
- for_each_sg(sgt->sgl, s, nr_sgt, i)
- srv_path->dma_addr[chunks + i] = sg_dma_address(s);
+
+ /*
+ * Cache DMA addresses by traversing sg entries. If
+ * regions were merged, an inner loop is required to
+ * populate the DMA address array by traversing larger
+ * regions.
+ */
+ ind = chunks;
+ for_each_sg(sgt->sgl, s, nr_sgt, i) {
+ unsigned int dma_len = sg_dma_len(s);
+ u64 dma_addr = sg_dma_address(s);
+ u64 dma_addr_end = dma_addr + dma_len;
+
+ do {
+ srv_path->dma_addr[ind++] = dma_addr;
+ dma_addr += max_chunk_size;
+ } while (dma_addr < dma_addr_end);
+ }
ib_update_fast_reg_key(mr, ib_inc_rkey(mr->rkey));
srv_mr->mr = mr;
--
2.43.0
* [PATCH 2/9] RDMA/rtrs: Add error description to the logs
2025-12-08 16:15 [PATCH 0/9] Misc patches for RTRS Md Haris Iqbal
2025-12-08 16:15 ` [PATCH 1/9] RDMA/rtrs-srv: fix SG mapping Md Haris Iqbal
@ 2025-12-08 16:15 ` Md Haris Iqbal
2025-12-12 5:26 ` Dan Carpenter
2025-12-18 15:51 ` Leon Romanovsky
2025-12-08 16:15 ` [PATCH 3/9] RDMA/rtrs: Add optional support for IB_MR_TYPE_SG_GAPS Md Haris Iqbal
` (6 subsequent siblings)
8 siblings, 2 replies; 20+ messages in thread
From: Md Haris Iqbal @ 2025-12-08 16:15 UTC
To: linux-rdma
Cc: bvanassche, leon, jgg, haris.iqbal, jinpu.wang, grzegorz.prajsner,
Kim Zhu
From: Kim Zhu <zhu.yanjun@ionos.com>
Print the error description alongside the error number.
Signed-off-by: Kim Zhu <zhu.yanjun@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
---
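As an illustration of the resulting message shape, the sketch below imitates in userspace what the `"%d(%pe)"` pair produces: the numeric code followed by a symbolic name, with a decimal fallback for codes that have no known name. `err_name()` and `format_err()` are hypothetical helpers with a deliberately tiny name table; in the kernel the lookup is done by the `%pe` format specifier itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, deliberately tiny name table for this sketch only. */
const char *err_name(int err)
{
	switch (err) {
	case -EINVAL:
		return "-EINVAL";
	case -ENOMEM:
		return "-ENOMEM";
	case -ECONNRESET:
		return "-ECONNRESET";
	default:
		return NULL;
	}
}

/* Format err as "<number>(<name>)", mirroring the patched log strings. */
int format_err(char *buf, size_t len, int err)
{
	const char *name = err_name(err);

	assert(buf != NULL && len > 0);

	if (name)
		return snprintf(buf, len, "%d(%s)", err, name);
	/* Codes without a known name fall back to the decimal value. */
	return snprintf(buf, len, "%d(%d)", err, err);
}
```

On Linux, where EINVAL is 22, `format_err(buf, sizeof(buf), -EINVAL)` fills the buffer with `-22(-EINVAL)`.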
drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c | 8 +-
drivers/infiniband/ulp/rtrs/rtrs-clt.c | 89 ++++++++++----------
drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c | 12 +--
drivers/infiniband/ulp/rtrs/rtrs-srv.c | 78 ++++++++---------
drivers/infiniband/ulp/rtrs/rtrs.c | 9 +-
5 files changed, 101 insertions(+), 95 deletions(-)
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c b/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c
index 4aa80c9388f0..b318acc12b10 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c
@@ -439,19 +439,19 @@ int rtrs_clt_create_path_files(struct rtrs_clt_path *clt_path)
clt->kobj_paths,
"%s", str);
if (err) {
- pr_err("kobject_init_and_add: %d\n", err);
+ pr_err("kobject_init_and_add: %d(%pe)\n", err, ERR_PTR(err));
kobject_put(&clt_path->kobj);
return err;
}
err = sysfs_create_group(&clt_path->kobj, &rtrs_clt_path_attr_group);
if (err) {
- pr_err("sysfs_create_group(): %d\n", err);
+ pr_err("sysfs_create_group(): %d(%pe)\n", err, ERR_PTR(err));
goto put_kobj;
}
err = kobject_init_and_add(&clt_path->stats->kobj_stats, &ktype_stats,
&clt_path->kobj, "stats");
if (err) {
- pr_err("kobject_init_and_add: %d\n", err);
+ pr_err("kobject_init_and_add: %d(%pe)\n", err, ERR_PTR(err));
kobject_put(&clt_path->stats->kobj_stats);
goto remove_group;
}
@@ -459,7 +459,7 @@ int rtrs_clt_create_path_files(struct rtrs_clt_path *clt_path)
err = sysfs_create_group(&clt_path->stats->kobj_stats,
&rtrs_clt_stats_attr_group);
if (err) {
- pr_err("failed to create stats sysfs group, err: %d\n", err);
+ pr_err("failed to create stats sysfs group, err: %d(%pe)\n", err, ERR_PTR(err));
goto put_kobj_stats;
}
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index 71387811b281..808de144d2e4 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -422,8 +422,8 @@ static void complete_rdma_req(struct rtrs_clt_io_req *req, int errno,
refcount_inc(&req->ref);
err = rtrs_inv_rkey(req);
if (err) {
- rtrs_err_rl(con->c.path, "Send INV WR key=%#x: %d\n",
- req->mr->rkey, err);
+ rtrs_err_rl(con->c.path, "Send INV WR key=%#x: %d(%pe)\n",
+ req->mr->rkey, err, ERR_PTR(err));
} else if (can_wait) {
wait_for_completion(&req->inv_comp);
}
@@ -443,8 +443,8 @@ static void complete_rdma_req(struct rtrs_clt_io_req *req, int errno,
if (errno) {
rtrs_err_rl(con->c.path,
- "IO %s request failed: error=%d path=%s [%s:%u] notify=%d\n",
- req->dir == DMA_TO_DEVICE ? "write" : "read", errno,
+ "IO %s request failed: error=%d(%pe) path=%s [%s:%u] notify=%d\n",
+ req->dir == DMA_TO_DEVICE ? "write" : "read", errno, ERR_PTR(errno),
kobject_name(&clt_path->kobj), clt_path->hca_name,
clt_path->hca_port, notify);
}
@@ -514,7 +514,8 @@ static void rtrs_clt_recv_done(struct rtrs_clt_con *con, struct ib_wc *wc)
cqe);
err = rtrs_iu_post_recv(&con->c, iu);
if (err) {
- rtrs_err(con->c.path, "post iu failed %d\n", err);
+ rtrs_err(con->c.path, "post iu failed %d(%pe)\n", err,
+ ERR_PTR(err));
rtrs_rdma_error_recovery(con);
}
}
@@ -659,8 +660,8 @@ static void rtrs_clt_rdma_done(struct ib_cq *cq, struct ib_wc *wc)
else
err = rtrs_post_recv_empty(&con->c, &io_comp_cqe);
if (err) {
- rtrs_err(con->c.path, "rtrs_post_recv_empty(): %d\n",
- err);
+ rtrs_err(con->c.path, "rtrs_post_recv_empty(): %d(%pe)\n",
+ err, ERR_PTR(err));
rtrs_rdma_error_recovery(con);
}
break;
@@ -731,8 +732,8 @@ static int post_recv_path(struct rtrs_clt_path *clt_path)
err = post_recv_io(to_clt_con(clt_path->s.con[cid]), q_size);
if (err) {
- rtrs_err(clt_path->clt, "post_recv_io(), err: %d\n",
- err);
+ rtrs_err(clt_path->clt, "post_recv_io(), err: %d(%pe)\n",
+ err, ERR_PTR(err));
return err;
}
}
@@ -1122,8 +1123,8 @@ static int rtrs_clt_write_req(struct rtrs_clt_io_req *req)
ret = rtrs_map_sg_fr(req, count);
if (ret < 0) {
rtrs_err_rl(s,
- "Write request failed, failed to map fast reg. data, err: %d\n",
- ret);
+ "Write request failed, failed to map fast reg. data, err: %d(%pe)\n",
+ ret, ERR_PTR(ret));
ib_dma_unmap_sg(clt_path->s.dev->ib_dev, req->sglist,
req->sg_cnt, req->dir);
return ret;
@@ -1150,9 +1151,9 @@ static int rtrs_clt_write_req(struct rtrs_clt_io_req *req)
imm, wr, NULL);
if (ret) {
rtrs_err_rl(s,
- "Write request failed: error=%d path=%s [%s:%u]\n",
- ret, kobject_name(&clt_path->kobj), clt_path->hca_name,
- clt_path->hca_port);
+ "Write request failed: error=%d(%pe) path=%s [%s:%u]\n",
+ ret, ERR_PTR(ret), kobject_name(&clt_path->kobj),
+ clt_path->hca_name, clt_path->hca_port);
if (req->mp_policy == MP_POLICY_MIN_INFLIGHT)
atomic_dec(&clt_path->stats->inflight);
if (req->mr->need_inval) {
@@ -1208,8 +1209,8 @@ static int rtrs_clt_read_req(struct rtrs_clt_io_req *req)
ret = rtrs_map_sg_fr(req, count);
if (ret < 0) {
rtrs_err_rl(s,
- "Read request failed, failed to map fast reg. data, err: %d\n",
- ret);
+ "Read request failed, failed to map fast reg. data, err: %d(%pe)\n",
+ ret, ERR_PTR(ret));
ib_dma_unmap_sg(dev->ib_dev, req->sglist, req->sg_cnt,
req->dir);
return ret;
@@ -1260,9 +1261,9 @@ static int rtrs_clt_read_req(struct rtrs_clt_io_req *req)
req->data_len, imm, wr);
if (ret) {
rtrs_err_rl(s,
- "Read request failed: error=%d path=%s [%s:%u]\n",
- ret, kobject_name(&clt_path->kobj), clt_path->hca_name,
- clt_path->hca_port);
+ "Read request failed: error=%d(%pe) path=%s [%s:%u]\n",
+ ret, ERR_PTR(ret), kobject_name(&clt_path->kobj),
+ clt_path->hca_name, clt_path->hca_port);
if (req->mp_policy == MP_POLICY_MIN_INFLIGHT)
atomic_dec(&clt_path->stats->inflight);
req->mr->need_inval = false;
@@ -1774,12 +1775,12 @@ static int rtrs_rdma_addr_resolved(struct rtrs_clt_con *con)
err = create_con_cq_qp(con);
mutex_unlock(&con->con_mutex);
if (err) {
- rtrs_err(s, "create_con_cq_qp(), err: %d\n", err);
+ rtrs_err(s, "create_con_cq_qp(), err: %d(%pe)\n", err, ERR_PTR(err));
return err;
}
err = rdma_resolve_route(con->c.cm_id, RTRS_CONNECT_TIMEOUT_MS);
if (err)
- rtrs_err(s, "Resolving route failed, err: %d\n", err);
+ rtrs_err(s, "Resolving route failed, err: %d(%pe)\n", err, ERR_PTR(err));
return err;
}
@@ -1813,7 +1814,7 @@ static int rtrs_rdma_route_resolved(struct rtrs_clt_con *con)
err = rdma_connect_locked(con->c.cm_id, &param);
if (err)
- rtrs_err(clt, "rdma_connect_locked(): %d\n", err);
+ rtrs_err(clt, "rdma_connect_locked(): %d(%pe)\n", err, ERR_PTR(err));
return err;
}
@@ -1846,8 +1847,8 @@ static int rtrs_rdma_conn_established(struct rtrs_clt_con *con,
}
errno = le16_to_cpu(msg->errno);
if (errno) {
- rtrs_err(clt, "Invalid RTRS message: errno %d\n",
- errno);
+ rtrs_err(clt, "Invalid RTRS message: errno %d(%pe)\n",
+ errno, ERR_PTR(errno));
return -ECONNRESET;
}
if (con->c.cid == 0) {
@@ -1936,12 +1937,12 @@ static int rtrs_rdma_conn_rejected(struct rtrs_clt_con *con,
"Previous session is still exists on the server, please reconnect later\n");
else
rtrs_err(s,
- "Connect rejected: status %d (%s), rtrs errno %d\n",
- status, rej_msg, errno);
+ "Connect rejected: status %d (%s), rtrs errno %d(%pe)\n",
+ status, rej_msg, errno, ERR_PTR(errno));
} else {
rtrs_err(s,
- "Connect rejected but with malformed message: status %d (%s)\n",
- status, rej_msg);
+ "Connect rejected but with malformed message: status %d(%pe) (%s)\n",
+ status, ERR_PTR(status), rej_msg);
}
return -ECONNRESET;
@@ -2008,27 +2009,27 @@ static int rtrs_clt_rdma_cm_handler(struct rdma_cm_id *cm_id,
case RDMA_CM_EVENT_UNREACHABLE:
case RDMA_CM_EVENT_ADDR_CHANGE:
case RDMA_CM_EVENT_TIMEWAIT_EXIT:
- rtrs_wrn(s, "CM error (CM event: %s, err: %d)\n",
- rdma_event_msg(ev->event), ev->status);
+ rtrs_wrn(s, "CM error (CM event: %s, err: %d(%pe))\n",
+ rdma_event_msg(ev->event), ev->status, ERR_PTR(ev->status));
cm_err = -ECONNRESET;
break;
case RDMA_CM_EVENT_ADDR_ERROR:
case RDMA_CM_EVENT_ROUTE_ERROR:
- rtrs_wrn(s, "CM error (CM event: %s, err: %d)\n",
- rdma_event_msg(ev->event), ev->status);
+ rtrs_wrn(s, "CM error (CM event: %s, err: %d(%pe))\n",
+ rdma_event_msg(ev->event), ev->status, ERR_PTR(ev->status));
cm_err = -EHOSTUNREACH;
break;
case RDMA_CM_EVENT_DEVICE_REMOVAL:
/*
* Device removal is a special case. Queue close and return 0.
*/
- rtrs_wrn_rl(s, "CM event: %s, status: %d\n", rdma_event_msg(ev->event),
- ev->status);
+ rtrs_wrn_rl(s, "CM event: %s, status: %d(%pe)\n", rdma_event_msg(ev->event),
+ ev->status, ERR_PTR(ev->status));
rtrs_clt_close_conns(clt_path, false);
return 0;
default:
- rtrs_err(s, "Unexpected RDMA CM error (CM event: %s, err: %d)\n",
- rdma_event_msg(ev->event), ev->status);
+ rtrs_err(s, "Unexpected RDMA CM error (CM event: %s, err: %d(%pe))\n",
+ rdma_event_msg(ev->event), ev->status, ERR_PTR(ev->status));
cm_err = -ECONNRESET;
break;
}
@@ -2065,14 +2066,14 @@ static int create_cm(struct rtrs_clt_con *con)
/* allow the port to be reused */
err = rdma_set_reuseaddr(cm_id, 1);
if (err != 0) {
- rtrs_err(s, "Set address reuse failed, err: %d\n", err);
+ rtrs_err(s, "Set address reuse failed, err: %d(%pe)\n", err, ERR_PTR(err));
return err;
}
err = rdma_resolve_addr(cm_id, (struct sockaddr *)&clt_path->s.src_addr,
(struct sockaddr *)&clt_path->s.dst_addr,
RTRS_CONNECT_TIMEOUT_MS);
if (err) {
- rtrs_err(s, "Failed to resolve address, err: %d\n", err);
+ rtrs_err(s, "Failed to resolve address, err: %d(%pe)\n", err, ERR_PTR(err));
return err;
}
/*
@@ -2547,7 +2548,7 @@ static int rtrs_send_path_info(struct rtrs_clt_path *clt_path)
/* Prepare for getting info response */
err = rtrs_iu_post_recv(&usr_con->c, rx_iu);
if (err) {
- rtrs_err(clt_path->clt, "rtrs_iu_post_recv(), err: %d\n", err);
+ rtrs_err(clt_path->clt, "rtrs_iu_post_recv(), err: %d(%pe)\n", err, ERR_PTR(err));
goto out;
}
rx_iu = NULL;
@@ -2563,7 +2564,7 @@ static int rtrs_send_path_info(struct rtrs_clt_path *clt_path)
/* Send info request */
err = rtrs_iu_post_send(&usr_con->c, tx_iu, sizeof(*msg), NULL);
if (err) {
- rtrs_err(clt_path->clt, "rtrs_iu_post_send(), err: %d\n", err);
+ rtrs_err(clt_path->clt, "rtrs_iu_post_send(), err: %d(%pe)\n", err, ERR_PTR(err));
goto out;
}
tx_iu = NULL;
@@ -2614,15 +2615,15 @@ static int init_path(struct rtrs_clt_path *clt_path)
err = init_conns(clt_path);
if (err) {
rtrs_err(clt_path->clt,
- "init_conns() failed: err=%d path=%s [%s:%u]\n", err,
- str, clt_path->hca_name, clt_path->hca_port);
+ "init_conns() failed: err=%d(%pe) path=%s [%s:%u]\n", err,
+ ERR_PTR(err), str, clt_path->hca_name, clt_path->hca_port);
goto out;
}
err = rtrs_send_path_info(clt_path);
if (err) {
rtrs_err(clt_path->clt,
- "rtrs_send_path_info() failed: err=%d path=%s [%s:%u]\n",
- err, str, clt_path->hca_name, clt_path->hca_port);
+ "rtrs_send_path_info() failed: err=%d(%pe) path=%s [%s:%u]\n",
+ err, ERR_PTR(err), str, clt_path->hca_name, clt_path->hca_port);
goto out;
}
rtrs_clt_path_up(clt_path);
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c b/drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c
index 3f305e694fe8..5e12701a3733 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c
@@ -176,14 +176,14 @@ static int rtrs_srv_create_once_sysfs_root_folders(struct rtrs_srv_path *srv_pat
dev_set_uevent_suppress(&srv->dev, true);
err = device_add(&srv->dev);
if (err) {
- pr_err("device_add(): %d\n", err);
+ pr_err("device_add(): %d(%pe)\n", err, ERR_PTR(err));
put_device(&srv->dev);
goto unlock;
}
srv->kobj_paths = kobject_create_and_add("paths", &srv->dev.kobj);
if (!srv->kobj_paths) {
err = -ENOMEM;
- pr_err("kobject_create_and_add(): %d\n", err);
+ pr_err("kobject_create_and_add(): %d(%pe)\n", err, ERR_PTR(err));
device_del(&srv->dev);
put_device(&srv->dev);
goto unlock;
@@ -237,14 +237,14 @@ static int rtrs_srv_create_stats_files(struct rtrs_srv_path *srv_path)
err = kobject_init_and_add(&srv_path->stats->kobj_stats, &ktype_stats,
&srv_path->kobj, "stats");
if (err) {
- rtrs_err(s, "kobject_init_and_add(): %d\n", err);
+ rtrs_err(s, "kobject_init_and_add(): %d(%pe)\n", err, ERR_PTR(err));
kobject_put(&srv_path->stats->kobj_stats);
return err;
}
err = sysfs_create_group(&srv_path->stats->kobj_stats,
&rtrs_srv_stats_attr_group);
if (err) {
- rtrs_err(s, "sysfs_create_group(): %d\n", err);
+ rtrs_err(s, "sysfs_create_group(): %d(%pe)\n", err, ERR_PTR(err));
goto err;
}
@@ -276,12 +276,12 @@ int rtrs_srv_create_path_files(struct rtrs_srv_path *srv_path)
err = kobject_init_and_add(&srv_path->kobj, &ktype, srv->kobj_paths,
"%s", str);
if (err) {
- rtrs_err(s, "kobject_init_and_add(): %d\n", err);
+ rtrs_err(s, "kobject_init_and_add(): %d(%pe)\n", err, ERR_PTR(err));
goto destroy_root;
}
err = sysfs_create_group(&srv_path->kobj, &rtrs_srv_path_attr_group);
if (err) {
- rtrs_err(s, "sysfs_create_group(): %d\n", err);
+ rtrs_err(s, "sysfs_create_group(): %d(%pe)\n", err, ERR_PTR(err));
goto put_kobj;
}
err = rtrs_srv_create_stats_files(srv_path);
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index 2589871c0fa9..758d77206315 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -323,8 +323,8 @@ static int rdma_write_sg(struct rtrs_srv_op *id)
err = ib_post_send(id->con->c.qp, &id->tx_wr.wr, NULL);
if (err)
rtrs_err(s,
- "Posting RDMA-Write-Request to QP failed, err: %d\n",
- err);
+ "Posting RDMA-Write-Request to QP failed, err: %d(%pe)\n",
+ err, ERR_PTR(err));
return err;
}
@@ -440,8 +440,8 @@ static int send_io_resp_imm(struct rtrs_srv_con *con, struct rtrs_srv_op *id,
err = ib_post_send(id->con->c.qp, wr, NULL);
if (err)
- rtrs_err_rl(s, "Posting RDMA-Reply to QP failed, err: %d\n",
- err);
+ rtrs_err_rl(s, "Posting RDMA-Reply to QP failed, err: %d(%pe)\n",
+ err, ERR_PTR(err));
return err;
}
@@ -525,8 +525,8 @@ bool rtrs_srv_resp_rdma(struct rtrs_srv_op *id, int status)
err = rdma_write_sg(id);
if (err) {
- rtrs_err_rl(s, "IO response failed: %d: srv_path=%s\n", err,
- kobject_name(&srv_path->kobj));
+ rtrs_err_rl(s, "IO response failed: %d(%pe): srv_path=%s\n", err,
+ ERR_PTR(err), kobject_name(&srv_path->kobj));
close_path(srv_path);
}
out:
@@ -643,7 +643,7 @@ static int map_cont_bufs(struct rtrs_srv_path *srv_path)
DMA_TO_DEVICE, rtrs_srv_rdma_done);
if (!srv_mr->iu) {
err = -ENOMEM;
- rtrs_err(ss, "rtrs_iu_alloc(), err: %d\n", err);
+ rtrs_err(ss, "rtrs_iu_alloc(), err: %d(%pe)\n", err, ERR_PTR(err));
goto dereg_mr;
}
}
@@ -819,7 +819,7 @@ static int process_info_req(struct rtrs_srv_con *con,
err = post_recv_path(srv_path);
if (err) {
- rtrs_err(s, "post_recv_path(), err: %d\n", err);
+ rtrs_err(s, "post_recv_path(), err: %d(%pe)\n", err, ERR_PTR(err));
return err;
}
@@ -882,7 +882,7 @@ static int process_info_req(struct rtrs_srv_con *con,
get_device(&srv_path->srv->dev);
err = rtrs_srv_change_state(srv_path, RTRS_SRV_CONNECTED);
if (!err) {
- rtrs_err(s, "rtrs_srv_change_state(), err: %d\n", err);
+ rtrs_err(s, "rtrs_srv_change_state(), err: %d(%pe)\n", err, ERR_PTR(err));
goto iu_free;
}
@@ -896,7 +896,7 @@ static int process_info_req(struct rtrs_srv_con *con,
*/
err = rtrs_srv_path_up(srv_path);
if (err) {
- rtrs_err(s, "rtrs_srv_path_up(), err: %d\n", err);
+ rtrs_err(s, "rtrs_srv_path_up(), err: %d(%pe)\n", err, ERR_PTR(err));
goto iu_free;
}
@@ -907,7 +907,7 @@ static int process_info_req(struct rtrs_srv_con *con,
/* Send info response */
err = rtrs_iu_post_send(&con->c, tx_iu, tx_sz, reg_wr);
if (err) {
- rtrs_err(s, "rtrs_iu_post_send(), err: %d\n", err);
+ rtrs_err(s, "rtrs_iu_post_send(), err: %d(%pe)\n", err, ERR_PTR(err));
iu_free:
rtrs_iu_free(tx_iu, srv_path->s.dev->ib_dev, 1);
}
@@ -975,7 +975,7 @@ static int post_recv_info_req(struct rtrs_srv_con *con)
/* Prepare for getting info response */
err = rtrs_iu_post_recv(&con->c, rx_iu);
if (err) {
- rtrs_err(s, "rtrs_iu_post_recv(), err: %d\n", err);
+ rtrs_err(s, "rtrs_iu_post_recv(), err: %d(%pe)\n", err, ERR_PTR(err));
rtrs_iu_free(rx_iu, srv_path->s.dev->ib_dev, 1);
return err;
}
@@ -1021,7 +1021,7 @@ static int post_recv_path(struct rtrs_srv_path *srv_path)
err = post_recv_io(to_srv_con(srv_path->s.con[cid]), q_size);
if (err) {
- rtrs_err(s, "post_recv_io(), err: %d\n", err);
+ rtrs_err(s, "post_recv_io(), err: %d(%pe)\n", err, ERR_PTR(err));
return err;
}
}
@@ -1069,8 +1069,8 @@ static void process_read(struct rtrs_srv_con *con,
if (ret) {
rtrs_err_rl(s,
- "Processing read request failed, user module cb reported for msg_id %d, err: %d\n",
- buf_id, ret);
+ "Processing read request failed, user module cb reported for msg_id %d, err: %d(%pe)\n",
+ buf_id, ret, ERR_PTR(ret));
goto send_err_msg;
}
@@ -1080,8 +1080,8 @@ static void process_read(struct rtrs_srv_con *con,
ret = send_io_resp_imm(con, id, ret);
if (ret < 0) {
rtrs_err_rl(s,
- "Sending err msg for failed RDMA-Write-Req failed, msg_id %d, err: %d\n",
- buf_id, ret);
+ "Sending err msg for failed RDMA-Write-Req failed, msg_id %d, err: %d(%pe)\n",
+ buf_id, ret, ERR_PTR(ret));
close_path(srv_path);
}
rtrs_srv_put_ops_ids(srv_path);
@@ -1121,8 +1121,8 @@ static void process_write(struct rtrs_srv_con *con,
data + data_len, usr_len);
if (ret) {
rtrs_err_rl(s,
- "Processing write request failed, user module callback reports err: %d\n",
- ret);
+ "Processing write request failed, user module callback reports err: %d(%pe)\n",
+ ret, ERR_PTR(ret));
goto send_err_msg;
}
@@ -1132,8 +1132,8 @@ static void process_write(struct rtrs_srv_con *con,
ret = send_io_resp_imm(con, id, ret);
if (ret < 0) {
rtrs_err_rl(s,
- "Processing write request failed, sending I/O response failed, msg_id %d, err: %d\n",
- buf_id, ret);
+ "Processing write request failed, sending I/O response failed, msg_id %d, err: %d(%pe)\n",
+ buf_id, ret, ERR_PTR(ret));
close_path(srv_path);
}
rtrs_srv_put_ops_ids(srv_path);
@@ -1263,7 +1263,8 @@ static void rtrs_srv_rdma_done(struct ib_cq *cq, struct ib_wc *wc)
srv_path->s.hb_missed_cnt = 0;
err = rtrs_post_recv_empty(&con->c, &io_comp_cqe);
if (err) {
- rtrs_err(s, "rtrs_post_recv(), err: %d\n", err);
+ rtrs_err(s, "rtrs_post_recv(), err: %d(%pe)\n",
+ err, ERR_PTR(err));
close_path(srv_path);
break;
}
@@ -1288,8 +1289,8 @@ static void rtrs_srv_rdma_done(struct ib_cq *cq, struct ib_wc *wc)
mr->msg_id = msg_id;
err = rtrs_srv_inv_rkey(con, mr);
if (err) {
- rtrs_err(s, "rtrs_post_recv(), err: %d\n",
- err);
+ rtrs_err(s, "rtrs_post_recv(), err: %d(%pe)\n",
+ err, ERR_PTR(err));
close_path(srv_path);
break;
}
@@ -1638,7 +1639,7 @@ static int rtrs_rdma_do_accept(struct rtrs_srv_path *srv_path,
err = rdma_accept(cm_id, &param);
if (err)
- pr_err("rdma_accept(), err: %d\n", err);
+ pr_err("rdma_accept(), err: %d(%pe)\n", err, ERR_PTR(err));
return err;
}
@@ -1656,7 +1657,7 @@ static int rtrs_rdma_do_reject(struct rdma_cm_id *cm_id, int errno)
err = rdma_reject(cm_id, &msg, sizeof(msg), IB_CM_REJ_CONSUMER_DEFINED);
if (err)
- pr_err("rdma_reject(), err: %d\n", err);
+ pr_err("rdma_reject(), err: %d(%pe)\n", err, ERR_PTR(err));
/* Bounce errno back */
return errno;
@@ -1732,7 +1733,7 @@ static int create_con(struct rtrs_srv_path *srv_path,
max_send_wr, max_recv_wr,
IB_POLL_WORKQUEUE);
if (err) {
- rtrs_err(s, "rtrs_cq_qp_create(), err: %d\n", err);
+ rtrs_err(s, "rtrs_cq_qp_create(), err: %d(%pe)\n", err, ERR_PTR(err));
goto free_con;
}
if (con->c.cid == 0) {
@@ -1947,7 +1948,7 @@ static int rtrs_rdma_connect(struct rdma_cm_id *cm_id,
}
err = create_con(srv_path, cm_id, cid);
if (err) {
- rtrs_err((&srv_path->s), "create_con(), error %d\n", err);
+ rtrs_err((&srv_path->s), "create_con(), error %d(%pe)\n", err, ERR_PTR(err));
rtrs_rdma_do_reject(cm_id, err);
/*
* Since session has other connections we follow normal way
@@ -1958,7 +1959,8 @@ static int rtrs_rdma_connect(struct rdma_cm_id *cm_id,
}
err = rtrs_rdma_do_accept(srv_path, cm_id);
if (err) {
- rtrs_err((&srv_path->s), "rtrs_rdma_do_accept(), error %d\n", err);
+ rtrs_err((&srv_path->s), "rtrs_rdma_do_accept(), error %d(%pe)\n",
+ err, ERR_PTR(err));
rtrs_rdma_do_reject(cm_id, err);
/*
* Since current connection was successfully added to the
@@ -2009,8 +2011,8 @@ static int rtrs_srv_rdma_cm_handler(struct rdma_cm_id *cm_id,
case RDMA_CM_EVENT_REJECTED:
case RDMA_CM_EVENT_CONNECT_ERROR:
case RDMA_CM_EVENT_UNREACHABLE:
- rtrs_err(s, "CM error (CM event: %s, err: %d)\n",
- rdma_event_msg(ev->event), ev->status);
+ rtrs_err(s, "CM error (CM event: %s, err: %d(%pe))\n",
+ rdma_event_msg(ev->event), ev->status, ERR_PTR(ev->status));
fallthrough;
case RDMA_CM_EVENT_DISCONNECTED:
case RDMA_CM_EVENT_ADDR_CHANGE:
@@ -2019,8 +2021,8 @@ static int rtrs_srv_rdma_cm_handler(struct rdma_cm_id *cm_id,
close_path(srv_path);
break;
default:
- pr_err("Ignoring unexpected CM event %s, err %d\n",
- rdma_event_msg(ev->event), ev->status);
+ pr_err("Ignoring unexpected CM event %s, err %d(%pe)\n",
+ rdma_event_msg(ev->event), ev->status, ERR_PTR(ev->status));
break;
}
@@ -2044,13 +2046,13 @@ static struct rdma_cm_id *rtrs_srv_cm_init(struct rtrs_srv_ctx *ctx,
}
ret = rdma_bind_addr(cm_id, addr);
if (ret) {
- pr_err("Binding RDMA address failed, err: %d\n", ret);
+ pr_err("Binding RDMA address failed, err: %d(%pe)\n", ret, ERR_PTR(ret));
goto err_cm;
}
ret = rdma_listen(cm_id, 64);
if (ret) {
- pr_err("Listening on RDMA connection failed, err: %d\n",
- ret);
+ pr_err("Listening on RDMA connection failed, err: %d(%pe)\n",
+ ret, ERR_PTR(ret));
goto err_cm;
}
@@ -2328,8 +2330,8 @@ static int __init rtrs_server_init(void)
err = check_module_params();
if (err) {
- pr_err("Failed to load module, invalid module parameters, err: %d\n",
- err);
+ pr_err("Failed to load module, invalid module parameters, err: %d(%pe)\n",
+ err, ERR_PTR(err));
return err;
}
err = class_register(&rtrs_dev_class);
diff --git a/drivers/infiniband/ulp/rtrs/rtrs.c b/drivers/infiniband/ulp/rtrs/rtrs.c
index bf38ac6f87c4..ea91371f6ad7 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs.c
@@ -273,7 +273,8 @@ static int create_qp(struct rtrs_con *con, struct ib_pd *pd,
ret = rdma_create_qp(cm_id, pd, &init_attr);
if (ret) {
- rtrs_err(con->path, "Creating QP failed, err: %d\n", ret);
+ rtrs_err(con->path, "Creating QP failed, err: %d(%pe)\n", ret,
+ ERR_PTR(ret));
return ret;
}
con->qp = cm_id->qp;
@@ -341,7 +342,8 @@ void rtrs_send_hb_ack(struct rtrs_path *path)
err = rtrs_post_rdma_write_imm_empty(usr_con, path->hb_cqe, imm,
NULL);
if (err) {
- rtrs_err(path, "send HB ACK failed, errno: %d\n", err);
+ rtrs_err(path, "send HB ACK failed, errno: %d(%pe)\n", err,
+ ERR_PTR(err));
path->hb_err_handler(usr_con);
return;
}
@@ -375,7 +377,8 @@ static void hb_work(struct work_struct *work)
err = rtrs_post_rdma_write_imm_empty(usr_con, path->hb_cqe, imm,
NULL);
if (err) {
- rtrs_err(path, "HB send failed, errno: %d\n", err);
+ rtrs_err(path, "HB send failed, errno: %d(%pe)\n", err,
+ ERR_PTR(err));
path->hb_err_handler(usr_con);
return;
}
--
2.43.0
* [PATCH 3/9] RDMA/rtrs: Add optional support for IB_MR_TYPE_SG_GAPS
2025-12-08 16:15 [PATCH 0/9] Misc patches for RTRS Md Haris Iqbal
2025-12-08 16:15 ` [PATCH 1/9] RDMA/rtrs-srv: fix SG mapping Md Haris Iqbal
2025-12-08 16:15 ` [PATCH 2/9] RDMA/rtrs: Add error description to the logs Md Haris Iqbal
@ 2025-12-08 16:15 ` Md Haris Iqbal
2025-12-09 1:12 ` Honggang LI
2025-12-08 16:15 ` [PATCH 4/9] RDMA/rtrs: Improve error logging for RDMA cm events Md Haris Iqbal
` (5 subsequent siblings)
8 siblings, 1 reply; 20+ messages in thread
From: Md Haris Iqbal @ 2025-12-08 16:15 UTC
To: linux-rdma
Cc: bvanassche, leon, jgg, haris.iqbal, jinpu.wang, grzegorz.prajsner,
Kim Zhu
Support IB_MR_TYPE_SG_GAPS, which has fewer limitations than the
standard IB_MR_TYPE_MEM_REG. A few other ULPs already support this.
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Kim Zhu <zhu.yanjun@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
---
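The selection logic added on both the client and the server side can be sketched in isolation as follows. This is a userspace approximation only: `enum mr_type`, `CAP_SG_GAPS_REG` and `pick_mr_type()` are hypothetical stand-ins for `enum ib_mr_type`, `IBK_SG_GAPS_REG` and the open-coded `if`/`else` in the patch, and the bit position used here is illustrative, not the kernel's value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for enum ib_mr_type and IBK_SG_GAPS_REG. */
enum mr_type { MR_TYPE_MEM_REG, MR_TYPE_SG_GAPS };
#define CAP_SG_GAPS_REG (UINT64_C(1) << 0)	/* illustrative bit only */

/* Prefer the gap-tolerant MR type when the device advertises it. */
enum mr_type pick_mr_type(uint64_t kernel_cap_flags)
{
	if (kernel_cap_flags & CAP_SG_GAPS_REG)
		return MR_TYPE_SG_GAPS;
	return MR_TYPE_MEM_REG;
}

/* Usage: devices without the capability keep the previous MR type. */
void pick_mr_type_example(void)
{
	assert(pick_mr_type(CAP_SG_GAPS_REG) == MR_TYPE_SG_GAPS);
	assert(pick_mr_type(0) == MR_TYPE_MEM_REG);
}
```

Because the fallback is the previous MR type, devices that do not advertise the capability should see no behavioural change, which is why the support is described as optional.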
drivers/infiniband/ulp/rtrs/rtrs-clt.c | 10 ++++++++--
drivers/infiniband/ulp/rtrs/rtrs-srv.c | 13 ++++++++++---
2 files changed, 18 insertions(+), 5 deletions(-)
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index 808de144d2e4..ee0682021234 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -1360,7 +1360,9 @@ static void free_path_reqs(struct rtrs_clt_path *clt_path)
static int alloc_path_reqs(struct rtrs_clt_path *clt_path)
{
+ struct ib_device *ib_dev = clt_path->s.dev->ib_dev;
struct rtrs_clt_io_req *req;
+ enum ib_mr_type mr_type;
int i, err = -ENOMEM;
clt_path->reqs = kcalloc(clt_path->queue_depth,
@@ -1369,6 +1371,11 @@ static int alloc_path_reqs(struct rtrs_clt_path *clt_path)
if (!clt_path->reqs)
return -ENOMEM;
+ if (ib_dev->attrs.kernel_cap_flags & IBK_SG_GAPS_REG)
+ mr_type = IB_MR_TYPE_SG_GAPS;
+ else
+ mr_type = IB_MR_TYPE_MEM_REG;
+
for (i = 0; i < clt_path->queue_depth; ++i) {
req = &clt_path->reqs[i];
req->iu = rtrs_iu_alloc(1, clt_path->max_hdr_size, GFP_KERNEL,
@@ -1382,8 +1389,7 @@ static int alloc_path_reqs(struct rtrs_clt_path *clt_path)
if (!req->sge)
goto out;
- req->mr = ib_alloc_mr(clt_path->s.dev->ib_pd,
- IB_MR_TYPE_MEM_REG,
+ req->mr = ib_alloc_mr(clt_path->s.dev->ib_pd, mr_type,
clt_path->max_pages_per_mr);
if (IS_ERR(req->mr)) {
err = PTR_ERR(req->mr);
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index 758d77206315..905d5baec89b 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -568,13 +568,15 @@ static void unmap_cont_bufs(struct rtrs_srv_path *srv_path)
static int map_cont_bufs(struct rtrs_srv_path *srv_path)
{
+ struct ib_device *ib_dev = srv_path->s.dev->ib_dev;
struct rtrs_srv_sess *srv = srv_path->srv;
struct rtrs_path *ss = &srv_path->s;
int i, err, mrs_num;
unsigned int chunk_bits;
+ enum ib_mr_type mr_type;
int chunks_per_mr = 1;
- struct ib_mr *mr;
struct sg_table *sgt;
+ struct ib_mr *mr;
/*
* Here we map queue_depth chunks to MR. Firstly we have to
@@ -623,8 +625,13 @@ static int map_cont_bufs(struct rtrs_srv_path *srv_path)
err = -EINVAL;
goto free_sg;
}
- mr = ib_alloc_mr(srv_path->s.dev->ib_pd, IB_MR_TYPE_MEM_REG,
- nr_sgt);
+
+ if (ib_dev->attrs.kernel_cap_flags & IBK_SG_GAPS_REG)
+ mr_type = IB_MR_TYPE_SG_GAPS;
+ else
+ mr_type = IB_MR_TYPE_MEM_REG;
+
+ mr = ib_alloc_mr(srv_path->s.dev->ib_pd, mr_type, nr_sgt);
if (IS_ERR(mr)) {
err = PTR_ERR(mr);
goto unmap_sg;
--
2.43.0
* [PATCH 4/9] RDMA/rtrs: Improve error logging for RDMA cm events
2025-12-08 16:15 [PATCH 0/9] Misc patches for RTRS Md Haris Iqbal
` (2 preceding siblings ...)
2025-12-08 16:15 ` [PATCH 3/9] RDMA/rtrs: Add optional support for IB_MR_TYPE_SG_GAPS Md Haris Iqbal
@ 2025-12-08 16:15 ` Md Haris Iqbal
2025-12-08 16:15 ` [PATCH 5/9] RDMA/rtrs-clt: Remove unused list-head in rtrs_clt_io_req Md Haris Iqbal
` (4 subsequent siblings)
8 siblings, 0 replies; 20+ messages in thread
From: Md Haris Iqbal @ 2025-12-08 16:15 UTC (permalink / raw)
To: linux-rdma
Cc: bvanassche, leon, jgg, haris.iqbal, jinpu.wang, grzegorz.prajsner,
Kim Zhu
From: Kim Zhu <zhu.yanjun@ionos.com>
The status member of struct rdma_cm_event is used for both Linux
errors and the errors defined in the RDMA stack.
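As a rough userspace illustration of that dual meaning (not kernel code: describe_cm_status() is a hypothetical helper, and strerror() only stands in for the kernel's %pe formatting), the sign of the value can select which description gets printed:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical userspace stand-in for the sign check on ev->status.
 * Negative status: a Linux errno value.
 * Positive status: a code defined by the RDMA stack.
 * Zero: nothing to describe, so the buffer is left empty.
 */
static int describe_cm_status(int status, char *buf, size_t len)
{
	assert(buf && len > 0);

	if (status < 0)
		return snprintf(buf, len, "err: %d (%s)", status,
				strerror(-status));
	if (status > 0)
		return snprintf(buf, len, "err: %d (RDMA stack code)", status);

	buf[0] = '\0';
	return 0;
}
```

In the handlers changed by this patch, the positive branch asks rdma_reject_msg() for the text instead of printing a fixed string, and a status of zero produces no message.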
Signed-off-by: Kim Zhu <zhu.yanjun@ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
---
drivers/infiniband/ulp/rtrs/rtrs-clt.c | 46 ++++++++++++++++++++------
drivers/infiniband/ulp/rtrs/rtrs-srv.c | 22 +++++++++---
2 files changed, 54 insertions(+), 14 deletions(-)
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index ee0682021234..49249cc24152 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -1947,8 +1947,8 @@ static int rtrs_rdma_conn_rejected(struct rtrs_clt_con *con,
status, rej_msg, errno, ERR_PTR(errno));
} else {
rtrs_err(s,
- "Connect rejected but with malformed message: status %d(%pe) (%s)\n",
- status, ERR_PTR(status), rej_msg);
+ "Connect rejected but with malformed message: status %d (%s)\n",
+ status, rej_msg);
}
return -ECONNRESET;
@@ -2015,27 +2015,53 @@ static int rtrs_clt_rdma_cm_handler(struct rdma_cm_id *cm_id,
case RDMA_CM_EVENT_UNREACHABLE:
case RDMA_CM_EVENT_ADDR_CHANGE:
case RDMA_CM_EVENT_TIMEWAIT_EXIT:
- rtrs_wrn(s, "CM error (CM event: %s, err: %d(%pe))\n",
- rdma_event_msg(ev->event), ev->status, ERR_PTR(ev->status));
+ if (ev->status < 0) {
+ rtrs_wrn(s, "CM error (CM event: %s, err: %d(%pe))\n",
+ rdma_event_msg(ev->event), ev->status, ERR_PTR(ev->status));
+ } else if (ev->status > 0) {
+ rtrs_wrn(s, "CM error (CM event: %s, err: %d(%s))\n",
+ rdma_event_msg(ev->event), ev->status,
+ rdma_reject_msg(cm_id, ev->status));
+ }
cm_err = -ECONNRESET;
break;
case RDMA_CM_EVENT_ADDR_ERROR:
case RDMA_CM_EVENT_ROUTE_ERROR:
- rtrs_wrn(s, "CM error (CM event: %s, err: %d(%pe))\n",
- rdma_event_msg(ev->event), ev->status, ERR_PTR(ev->status));
+ if (ev->status < 0) {
+ rtrs_wrn(s, "CM error (CM event: %s, err: %d(%pe))\n",
+ rdma_event_msg(ev->event), ev->status,
+ ERR_PTR(ev->status));
+ } else if (ev->status > 0) {
+ rtrs_wrn(s, "CM error (CM event: %s, err: %d(%s))\n",
+ rdma_event_msg(ev->event), ev->status,
+ rdma_reject_msg(cm_id, ev->status));
+ }
cm_err = -EHOSTUNREACH;
break;
case RDMA_CM_EVENT_DEVICE_REMOVAL:
/*
* Device removal is a special case. Queue close and return 0.
*/
- rtrs_wrn_rl(s, "CM event: %s, status: %d(%pe)\n", rdma_event_msg(ev->event),
- ev->status, ERR_PTR(ev->status));
+ if (ev->status < 0) {
+ rtrs_wrn_rl(s, "CM event: %s, status: %d(%pe)\n",
+ rdma_event_msg(ev->event),
+ ev->status, ERR_PTR(ev->status));
+ } else if (ev->status > 0) {
+ rtrs_wrn_rl(s, "CM event: %s, status: %d(%s)\n",
+ rdma_event_msg(ev->event),
+ ev->status, rdma_reject_msg(cm_id, ev->status));
+ }
rtrs_clt_close_conns(clt_path, false);
return 0;
default:
- rtrs_err(s, "Unexpected RDMA CM error (CM event: %s, err: %d(%pe))\n",
- rdma_event_msg(ev->event), ev->status, ERR_PTR(ev->status));
+ if (ev->status < 0) {
+ rtrs_err(s, "Unexpected RDMA CM error (CM event: %s, err: %d(%pe))\n",
+ rdma_event_msg(ev->event), ev->status, ERR_PTR(ev->status));
+ } else if (ev->status > 0) {
+ rtrs_err(s, "Unexpected RDMA CM error (CM event: %s, err: %d(%s))\n",
+ rdma_event_msg(ev->event), ev->status,
+ rdma_reject_msg(cm_id, ev->status));
+ }
cm_err = -ECONNRESET;
break;
}
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index 905d5baec89b..4e203140c990 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -2018,8 +2018,15 @@ static int rtrs_srv_rdma_cm_handler(struct rdma_cm_id *cm_id,
case RDMA_CM_EVENT_REJECTED:
case RDMA_CM_EVENT_CONNECT_ERROR:
case RDMA_CM_EVENT_UNREACHABLE:
- rtrs_err(s, "CM error (CM event: %s, err: %d(%pe))\n",
- rdma_event_msg(ev->event), ev->status, ERR_PTR(ev->status));
+ if (ev->status < 0) {
+ rtrs_err(s, "CM error (CM event: %s, err: %d(%pe))\n",
+ rdma_event_msg(ev->event), ev->status,
+ ERR_PTR(ev->status));
+ } else if (ev->status > 0) {
+ rtrs_err(s, "CM error (CM event: %s, err: %d(%s))\n",
+ rdma_event_msg(ev->event), ev->status,
+ rdma_reject_msg(cm_id, ev->status));
+ }
fallthrough;
case RDMA_CM_EVENT_DISCONNECTED:
case RDMA_CM_EVENT_ADDR_CHANGE:
@@ -2028,8 +2035,15 @@ static int rtrs_srv_rdma_cm_handler(struct rdma_cm_id *cm_id,
close_path(srv_path);
break;
default:
- pr_err("Ignoring unexpected CM event %s, err %d(%pe)\n",
- rdma_event_msg(ev->event), ev->status, ERR_PTR(ev->status));
+ if (ev->status < 0) {
+ pr_err("Ignoring unexpected CM event %s, err %d(%pe)\n",
+ rdma_event_msg(ev->event), ev->status,
+ ERR_PTR(ev->status));
+ } else if (ev->status > 0) {
+ pr_err("Ignoring unexpected CM event %s, err %d(%s)\n",
+ rdma_event_msg(ev->event), ev->status,
+ rdma_reject_msg(cm_id, ev->status));
+ }
break;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 5/9] RDMA/rtrs-clt: Remove unused list-head in rtrs_clt_io_req
2025-12-08 16:15 [PATCH 0/9] Misc patches for RTRS Md Haris Iqbal
` (3 preceding siblings ...)
2025-12-08 16:15 ` [PATCH 4/9] RDMA/rtrs: Improve error logging for RDMA cm events Md Haris Iqbal
@ 2025-12-08 16:15 ` Md Haris Iqbal
2025-12-09 1:14 ` Honggang LI
2025-12-08 16:15 ` [PATCH 6/9] RDMA/rtrs-srv: Add check and closure for possible zombie paths Md Haris Iqbal
` (3 subsequent siblings)
8 siblings, 1 reply; 20+ messages in thread
From: Md Haris Iqbal @ 2025-12-08 16:15 UTC (permalink / raw)
To: linux-rdma
Cc: bvanassche, leon, jgg, haris.iqbal, jinpu.wang, grzegorz.prajsner
From: Jack Wang <jinpu.wang@ionos.com>
Remove unused member.
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
---
drivers/infiniband/ulp/rtrs/rtrs-clt.h | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.h b/drivers/infiniband/ulp/rtrs/rtrs-clt.h
index 0f57759b3080..3633119d1db2 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.h
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.h
@@ -92,7 +92,6 @@ struct rtrs_permit {
* rtrs_clt_io_req - describes one inflight IO request
*/
struct rtrs_clt_io_req {
- struct list_head list;
struct rtrs_iu *iu;
struct scatterlist *sglist; /* list holding user data */
unsigned int sg_cnt;
--
2.43.0
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 6/9] RDMA/rtrs-srv: Add check and closure for possible zombie paths
2025-12-08 16:15 [PATCH 0/9] Misc patches for RTRS Md Haris Iqbal
` (4 preceding siblings ...)
2025-12-08 16:15 ` [PATCH 5/9] RDMA/rtrs-clt: Remove unused list-head in rtrs_clt_io_req Md Haris Iqbal
@ 2025-12-08 16:15 ` Md Haris Iqbal
2025-12-09 1:17 ` Honggang LI
2025-12-08 16:15 ` [PATCH 7/9] RDMA/rtrs-srv: Rate-limit I/O path error logging Md Haris Iqbal
` (2 subsequent siblings)
8 siblings, 1 reply; 20+ messages in thread
From: Md Haris Iqbal @ 2025-12-08 16:15 UTC (permalink / raw)
To: linux-rdma
Cc: bvanassche, leon, jgg, haris.iqbal, jinpu.wang, grzegorz.prajsner
During several network incidents, a number of RTRS paths for a session
went through disconnect and reconnect phase. However, some of those did
not auto-reconnect successfully. Instead they failed with the following
logs,
On client,
kernel: rtrs_client L1991: <sess-name>: Connect rejected: status 28
(consumer defined), rtrs errno -104
kernel: rtrs_client L2698: <sess-name>: init_conns() failed: err=-104
path=gid:<gid1>@gid:<gid2> [mlx4_0:1]
On server, (log a)
kernel: ibtrs_server L1868: <>: Connection already exists: 0
When the misbehaving path was removed, and add_path was called to re-add
the path, the log on client side changed to, (log b)
kernel: rtrs_client L1991: <sess-name>: Connect rejected: status 28
(consumer defined), rtrs errno -17
There was no log on the server side for this, which is expected since
there is no logging in that path,
if (unlikely(__is_path_w_addr_exists(srv, &cm_id->route.addr))) {
err = -EEXIST;
goto err;
Because of the following check on server side,
if (unlikely(sess->state != IBTRS_SRV_CONNECTING)) {
ibtrs_err(s, "Session in wrong state: %s\n",
.. we know that the path in (log a) was in CONNECTING state.
The above state of the path persists for as long as we leave the session
be. This means that the path is in some zombie state, probably waiting
for the info_req packet to arrive, which never does.
The changes in this commit do 2 things.
1) Add logs at places where we see the errors happening. The logs would
shed more light on the state and lifetime of such zombie paths.
2) Close such zombie sessions, only if they are in CONNECTING state, and
after an inactivity period of 30 seconds.
i) The state check prevents closure of paths which are CONNECTED.
Also, from the above logs and code, we already know that the path could
only be in CONNECTING state, so we play safe and narrow our impact surface
area by closing only CONNECTING paths.
ii) The inactivity period is to allow requests for other cid to finish
processing, or for any stray packets to arrive/fail.
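The two conditions in 2) can be sketched in userspace as follows. This is only an illustration under assumptions: struct path, enum path_state and path_is_zombie() are hypothetical names, and a millisecond counter replaces jiffies. A start time of 0 is taken to mean that the check is disabled, mirroring how connection_timeout is used in the diff below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified path states for this sketch. */
enum path_state { PATH_CONNECTING, PATH_CONNECTED, PATH_CLOSING };

/* 30 seconds, expressed in milliseconds. */
#define MAX_CONN_TIMEOUT_MS 30000u

struct path {
	enum path_state state;
	uint32_t conn_start_ms;	/* 0 means "timeout check disabled" */
};

/*
 * A path counts as a zombie only if it is still CONNECTING, the check
 * is armed, and more than 30 seconds have passed since the last
 * connection attempt. The unsigned subtraction keeps the elapsed time
 * correct when the millisecond counter wraps around.
 */
static bool path_is_zombie(const struct path *p, uint32_t now_ms)
{
	assert(p);

	return p->state == PATH_CONNECTING &&
	       p->conn_start_ms != 0 &&
	       (uint32_t)(now_ms - p->conn_start_ms) > MAX_CONN_TIMEOUT_MS;
}
```

A CONNECTED path is never reported, whatever its age, which matches point i) above.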
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
---
drivers/infiniband/ulp/rtrs/rtrs-srv.c | 46 +++++++++++++++++++++++---
drivers/infiniband/ulp/rtrs/rtrs-srv.h | 1 +
2 files changed, 42 insertions(+), 5 deletions(-)
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index 4e203140c990..20e7d2681668 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -911,6 +911,13 @@ static int process_info_req(struct rtrs_srv_con *con,
tx_iu->dma_addr,
tx_iu->size, DMA_TO_DEVICE);
+ /*
+ * Now disable zombie connection closing. Since from the logs and code,
+ * we know that it can never be in CONNECTED state.
+ * See RNBD-3128 comments.
+ */
+ srv_path->connection_timeout = 0;
+
/* Send info response */
err = rtrs_iu_post_send(&con->c, tx_iu, tx_sz, reg_wr);
if (err) {
@@ -1537,17 +1544,38 @@ static int sockaddr_cmp(const struct sockaddr *a, const struct sockaddr *b)
}
}
+/* Let's close connections which have been waiting for more than 30 seconds */
+#define RTRS_MAX_CONN_TIMEOUT 30000
+
+static void rtrs_srv_check_close_path(struct rtrs_srv_path *srv_path)
+{
+ struct rtrs_path *s = &srv_path->s;
+
+ if (srv_path->state == RTRS_SRV_CONNECTING && srv_path->connection_timeout &&
+ (jiffies_to_msecs(jiffies - srv_path->connection_timeout) > RTRS_MAX_CONN_TIMEOUT)) {
+ rtrs_err(s, "Closing zombie path\n");
+ close_path(srv_path);
+ }
+}
+
static bool __is_path_w_addr_exists(struct rtrs_srv_sess *srv,
struct rdma_addr *addr)
{
struct rtrs_srv_path *srv_path;
- list_for_each_entry(srv_path, &srv->paths_list, s.entry)
+ list_for_each_entry(srv_path, &srv->paths_list, s.entry) {
if (!sockaddr_cmp((struct sockaddr *)&srv_path->s.dst_addr,
(struct sockaddr *)&addr->dst_addr) &&
!sockaddr_cmp((struct sockaddr *)&srv_path->s.src_addr,
- (struct sockaddr *)&addr->src_addr))
+ (struct sockaddr *)&addr->src_addr)) {
+ rtrs_err((&srv_path->s),
+ "Path (%s) with same addr exists (lifetime %u)\n",
+ rtrs_srv_state_str(srv_path->state),
+ (jiffies_to_msecs(jiffies - srv_path->connection_timeout)));
+ rtrs_srv_check_close_path(srv_path);
return true;
+ }
+ }
return false;
}
@@ -1785,7 +1813,6 @@ static struct rtrs_srv_path *__alloc_path(struct rtrs_srv_sess *srv,
}
if (__is_path_w_addr_exists(srv, &cm_id->route.addr)) {
err = -EEXIST;
- pr_err("Path with same addr exists\n");
goto err;
}
srv_path = kzalloc(sizeof(*srv_path), GFP_KERNEL);
@@ -1832,6 +1859,7 @@ static struct rtrs_srv_path *__alloc_path(struct rtrs_srv_sess *srv,
spin_lock_init(&srv_path->state_lock);
INIT_WORK(&srv_path->close_work, rtrs_srv_close_work);
rtrs_srv_init_hb(srv_path);
+ srv_path->connection_timeout = 0;
srv_path->s.dev = rtrs_ib_dev_find_or_add(cm_id->device, &dev_pd);
if (!srv_path->s.dev) {
@@ -1937,8 +1965,10 @@ static int rtrs_rdma_connect(struct rdma_cm_id *cm_id,
goto reject_w_err;
}
if (s->con[cid]) {
- rtrs_err(s, "Connection already exists: %d\n",
- cid);
+ rtrs_err(s, "Connection (%s) already exists: %d (lifetime %u)\n",
+ rtrs_srv_state_str(srv_path->state), cid,
+ (jiffies_to_msecs(jiffies - srv_path->connection_timeout)));
+ rtrs_srv_check_close_path(srv_path);
mutex_unlock(&srv->paths_mutex);
goto reject_w_err;
}
@@ -1953,6 +1983,12 @@ static int rtrs_rdma_connect(struct rdma_cm_id *cm_id,
goto reject_w_err;
}
}
+
+ /*
+ * Start of any connection creation resets the timeout for the path.
+ */
+ srv_path->connection_timeout = jiffies;
+
err = create_con(srv_path, cm_id, cid);
if (err) {
rtrs_err((&srv_path->s), "create_con(), error %d(%pe)\n", err, ERR_PTR(err));
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.h b/drivers/infiniband/ulp/rtrs/rtrs-srv.h
index 014f85681f37..3d36876527f5 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.h
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.h
@@ -89,6 +89,7 @@ struct rtrs_srv_path {
unsigned int mem_bits;
struct kobject kobj;
struct rtrs_srv_stats *stats;
+ unsigned long connection_timeout;
};
static inline struct rtrs_srv_path *to_srv_path(struct rtrs_path *s)
--
2.43.0
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 7/9] RDMA/rtrs-srv: Rate-limit I/O path error logging
2025-12-08 16:15 [PATCH 0/9] Misc patches for RTRS Md Haris Iqbal
` (5 preceding siblings ...)
2025-12-08 16:15 ` [PATCH 6/9] RDMA/rtrs-srv: Add check and closure for possible zombie paths Md Haris Iqbal
@ 2025-12-08 16:15 ` Md Haris Iqbal
2025-12-08 16:15 ` [PATCH 8/9] RDMA/rtrs: Extend log message when a port fails Md Haris Iqbal
2025-12-08 16:15 ` [PATCH 9/9] RDMA/rtrs-clt.c: For conn rejection use actual err number Md Haris Iqbal
8 siblings, 0 replies; 20+ messages in thread
From: Md Haris Iqbal @ 2025-12-08 16:15 UTC (permalink / raw)
To: linux-rdma
Cc: bvanassche, leon, jgg, haris.iqbal, jinpu.wang, grzegorz.prajsner,
Kim Zhu
From: Kim Zhu <zhu.yanjun@ionos.com>
Excessive error logging is making it difficult to identify the root
cause of issues. Implement rate limiting to improve log clarity.
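For readers unfamiliar with the idea, a window-based rate limiter can be sketched in userspace like this. It is not the kernel's implementation and not what rtrs_err_rl() expands to; struct ratelimit, ratelimit_allow() and the interval and burst values are assumptions made up for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limiter state: at most "burst" messages per window. */
struct ratelimit {
	uint32_t interval_ms;	/* length of one window */
	uint32_t burst;		/* messages allowed per window */
	uint32_t window_start;	/* start time of the current window */
	uint32_t printed;	/* messages emitted in this window */
	uint32_t missed;	/* messages suppressed in this window */
};

/* Return true if a message may be logged at time now_ms. */
static bool ratelimit_allow(struct ratelimit *rl, uint32_t now_ms)
{
	assert(rl && rl->burst > 0);

	/* Open a new window once the current one has expired. */
	if ((uint32_t)(now_ms - rl->window_start) >= rl->interval_ms) {
		rl->window_start = now_ms;
		rl->printed = 0;
		rl->missed = 0;
	}

	if (rl->printed < rl->burst) {
		rl->printed++;
		return true;
	}

	rl->missed++;
	return false;
}
```

A caller would wrap its log statement in `if (ratelimit_allow(&rl, now))`, so that a burst of identical failures leaves only a few lines per window in the log.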
Signed-off-by: Kim Zhu <zhu.yanjun@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
---
drivers/infiniband/ulp/rtrs/rtrs-srv.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index 20e7d2681668..dfe38ffc2e38 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -184,7 +184,7 @@ static void rtrs_srv_reg_mr_done(struct ib_cq *cq, struct ib_wc *wc)
struct rtrs_srv_path *srv_path = to_srv_path(s);
if (wc->status != IB_WC_SUCCESS) {
- rtrs_err(s, "REG MR failed: %s\n",
+ rtrs_err_rl(s, "REG MR failed: %s\n",
ib_wc_status_msg(wc->status));
close_path(srv_path);
return;
--
2.43.0
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 8/9] RDMA/rtrs: Extend log message when a port fails
2025-12-08 16:15 [PATCH 0/9] Misc patches for RTRS Md Haris Iqbal
` (6 preceding siblings ...)
2025-12-08 16:15 ` [PATCH 7/9] RDMA/rtrs-srv: Rate-limit I/O path error logging Md Haris Iqbal
@ 2025-12-08 16:15 ` Md Haris Iqbal
2025-12-08 16:15 ` [PATCH 9/9] RDMA/rtrs-clt.c: For conn rejection use actual err number Md Haris Iqbal
8 siblings, 0 replies; 20+ messages in thread
From: Md Haris Iqbal @ 2025-12-08 16:15 UTC (permalink / raw)
To: linux-rdma
Cc: bvanassche, leon, jgg, haris.iqbal, jinpu.wang, grzegorz.prajsner,
Kim Zhu
From: Kim Zhu <zhu.yanjun@ionos.com>
Add HCA name and port of this HCA.
This would help with analysing and debugging the logs.
The logs would look something like this,
rtrs_server L2516: Handling event: port error (10).
HCA name: mlx4_0, port num: 2
rtrs_client L3326: Handling event: port error (10).
HCA name: mlx4_0, port num: 1
Signed-off-by: Kim Zhu <zhu.yanjun@ionos.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
---
drivers/infiniband/ulp/rtrs/rtrs-clt.c | 7 +++++--
drivers/infiniband/ulp/rtrs/rtrs-srv.c | 7 +++++--
2 files changed, 10 insertions(+), 4 deletions(-)
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index 49249cc24152..dcf5704366eb 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -3179,8 +3179,11 @@ int rtrs_clt_create_path_from_sysfs(struct rtrs_clt_sess *clt,
void rtrs_clt_ib_event_handler(struct ib_event_handler *handler,
struct ib_event *ibevent)
{
- pr_info("Handling event: %s (%d).\n", ib_event_msg(ibevent->event),
- ibevent->event);
+ struct ib_device *idev = ibevent->device;
+ u32 port_num = ibevent->element.port_num;
+
+ pr_info("Handling event: %s (%d). HCA name: %s, port num: %u\n",
+ ib_event_msg(ibevent->event), ibevent->event, idev->name, port_num);
}
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index dfe38ffc2e38..301edaadfb1a 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -2349,8 +2349,11 @@ static int check_module_params(void)
void rtrs_srv_ib_event_handler(struct ib_event_handler *handler,
struct ib_event *ibevent)
{
- pr_info("Handling event: %s (%d).\n", ib_event_msg(ibevent->event),
- ibevent->event);
+ struct ib_device *idev = ibevent->device;
+ u32 port_num = ibevent->element.port_num;
+
+ pr_info("Handling event: %s (%d). HCA name: %s, port num: %u\n",
+ ib_event_msg(ibevent->event), ibevent->event, idev->name, port_num);
}
static int rtrs_srv_ib_dev_init(struct rtrs_ib_dev *dev)
--
2.43.0
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 9/9] RDMA/rtrs-clt.c: For conn rejection use actual err number
2025-12-08 16:15 [PATCH 0/9] Misc patches for RTRS Md Haris Iqbal
` (7 preceding siblings ...)
2025-12-08 16:15 ` [PATCH 8/9] RDMA/rtrs: Extend log message when a port fails Md Haris Iqbal
@ 2025-12-08 16:15 ` Md Haris Iqbal
8 siblings, 0 replies; 20+ messages in thread
From: Md Haris Iqbal @ 2025-12-08 16:15 UTC (permalink / raw)
To: linux-rdma
Cc: bvanassche, leon, jgg, haris.iqbal, jinpu.wang, grzegorz.prajsner
When the connection establishment request is rejected from the server
side, then the actual error number sent back should be used.
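The intended behaviour can be sketched in userspace as below. The names are hypothetical (struct conn_rsp, RSP_MAGIC and conn_rejected_errno() are not the rtrs definitions, and the real wire format differs); the point is only that the default stays -ECONNRESET and is replaced when a well-formed reject message carries an error number.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reject payload, not the real rtrs wire format. */
struct conn_rsp {
	uint16_t magic;
	int16_t err;	/* error number chosen by the server */
};

#define RSP_MAGIC 0x1BBDu	/* made-up value for this sketch */

static_assert(sizeof(struct conn_rsp) == 4, "sketch assumes a 4-byte payload");

/*
 * Return the error number carried by a well-formed reject message,
 * or -ECONNRESET when the message is missing, too short or malformed.
 */
static int conn_rejected_errno(const struct conn_rsp *rsp, size_t len)
{
	int err = -ECONNRESET;	/* default when nothing usable was sent */

	if (rsp && len >= sizeof(*rsp) && rsp->magic == RSP_MAGIC)
		err = rsp->err;

	return err;
}
```

With this shape, a server-side -EEXIST reaches the caller unchanged instead of being flattened to -ECONNRESET.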
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Reviewed-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
---
drivers/infiniband/ulp/rtrs/rtrs-clt.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index dcf5704366eb..3e62da5eaca7 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -1929,7 +1929,7 @@ static int rtrs_rdma_conn_rejected(struct rtrs_clt_con *con,
struct rtrs_path *s = con->c.path;
const struct rtrs_msg_conn_rsp *msg;
const char *rej_msg;
- int status, errno;
+ int status, errno = -ECONNRESET;
u8 data_len;
status = ev->status;
@@ -1951,7 +1951,7 @@ static int rtrs_rdma_conn_rejected(struct rtrs_clt_con *con,
status, rej_msg);
}
- return -ECONNRESET;
+ return errno;
}
void rtrs_clt_close_conns(struct rtrs_clt_path *clt_path, bool wait)
--
2.43.0
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH 3/9] RDMA/rtrs: Add optional support for IB_MR_TYPE_SG_GAPS
2025-12-08 16:15 ` [PATCH 3/9] RDMA/rtrs: Add optional support for IB_MR_TYPE_SG_GAPS Md Haris Iqbal
@ 2025-12-09 1:12 ` Honggang LI
2026-01-06 9:28 ` Haris Iqbal
0 siblings, 1 reply; 20+ messages in thread
From: Honggang LI @ 2025-12-09 1:12 UTC (permalink / raw)
To: Md Haris Iqbal
Cc: linux-rdma, bvanassche, leon, jgg, jinpu.wang, grzegorz.prajsner,
Kim Zhu
On Mon, Dec 08, 2025 at 05:15:07PM +0100, Md Haris Iqbal wrote:
> Subject: [PATCH 3/9] RDMA/rtrs: Add optional support for IB_MR_TYPE_SG_GAPS
> From: Md Haris Iqbal <haris.iqbal@ionos.com>
> Date: Mon, 8 Dec 2025 17:15:07 +0100
> X-Mailer: git-send-email 2.43.0
>
> Support IB_MR_TYPE_SG_GAPS, which has less limitations
> than standard IB_MR_TYPE_MEM_REG, a few ULP support this.
Do you have benchmark results showing the performance difference between
IB_MR_TYPE_MEM_REG and IB_MR_TYPE_SG_GAPS?
thanks
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 5/9] RDMA/rtrs-clt: Remove unused list-head in rtrs_clt_io_req
2025-12-08 16:15 ` [PATCH 5/9] RDMA/rtrs-clt: Remove unused list-head in rtrs_clt_io_req Md Haris Iqbal
@ 2025-12-09 1:14 ` Honggang LI
2026-01-06 9:26 ` Haris Iqbal
0 siblings, 1 reply; 20+ messages in thread
From: Honggang LI @ 2025-12-09 1:14 UTC (permalink / raw)
To: Md Haris Iqbal
Cc: linux-rdma, bvanassche, leon, jgg, jinpu.wang, grzegorz.prajsner
On Mon, Dec 08, 2025 at 05:15:09PM +0100, Md Haris Iqbal wrote:
> Subject: [PATCH 5/9] RDMA/rtrs-clt: Remove unused list-head in
> rtrs_clt_io_req
> From: Md Haris Iqbal <haris.iqbal@ionos.com>
> Date: Mon, 8 Dec 2025 17:15:09 +0100
> X-Mailer: git-send-email 2.43.0
>
> From: Jack Wang <jinpu.wang@ionos.com>
>
> Remove unused member.
>
> Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
> Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
> ---
> drivers/infiniband/ulp/rtrs/rtrs-clt.h | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.h b/drivers/infiniband/ulp/rtrs/rtrs-clt.h
> index 0f57759b3080..3633119d1db2 100644
> --- a/drivers/infiniband/ulp/rtrs/rtrs-clt.h
> +++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.h
> @@ -92,7 +92,6 @@ struct rtrs_permit {
> * rtrs_clt_io_req - describes one inflight IO request
> */
> struct rtrs_clt_io_req {
> - struct list_head list;
It seems these two members are also unused. Why keep them?
struct rtrs_sg_desc *desc;
unsigned long start_jiffies;
thanks
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 6/9] RDMA/rtrs-srv: Add check and closure for possible zombie paths
2025-12-08 16:15 ` [PATCH 6/9] RDMA/rtrs-srv: Add check and closure for possible zombie paths Md Haris Iqbal
@ 2025-12-09 1:17 ` Honggang LI
2026-01-06 9:27 ` Haris Iqbal
0 siblings, 1 reply; 20+ messages in thread
From: Honggang LI @ 2025-12-09 1:17 UTC (permalink / raw)
To: Md Haris Iqbal
Cc: linux-rdma, bvanassche, leon, jgg, jinpu.wang, grzegorz.prajsner
On Mon, Dec 08, 2025 at 05:15:10PM +0100, Md Haris Iqbal wrote:
> Subject: [PATCH 6/9] RDMA/rtrs-srv: Add check and closure for possible
> zombie paths
> From: Md Haris Iqbal <haris.iqbal@ionos.com>
> Date: Mon, 8 Dec 2025 17:15:10 +0100
> X-Mailer: git-send-email 2.43.0
>
> +++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> @@ -911,6 +911,13 @@ static int process_info_req(struct rtrs_srv_con *con,
> tx_iu->dma_addr,
> tx_iu->size, DMA_TO_DEVICE);
>
> + /*
> + * Now disable zombie connection closing. Since from the logs and code,
> + * we know that it can never be in CONNECTED state.
> + * See RNBD-3128 comments.
^^^^^^^^^^^^^^^^^
What is it? How to access it?
thanks
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/9] RDMA/rtrs: Add error description to the logs
2025-12-08 16:15 ` [PATCH 2/9] RDMA/rtrs: Add error description to the logs Md Haris Iqbal
@ 2025-12-12 5:26 ` Dan Carpenter
2026-01-06 9:47 ` Haris Iqbal
2025-12-18 15:51 ` Leon Romanovsky
1 sibling, 1 reply; 20+ messages in thread
From: Dan Carpenter @ 2025-12-12 5:26 UTC (permalink / raw)
To: oe-kbuild, Md Haris Iqbal, linux-rdma
Cc: lkp, oe-kbuild-all, bvanassche, leon, jgg, haris.iqbal,
jinpu.wang, grzegorz.prajsner, Kim Zhu
Hi Md,
kernel test robot noticed the following build warnings:
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Md-Haris-Iqbal/RDMA-rtrs-srv-fix-SG-mapping/20251209-001817
base: https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git for-next
patch link: https://lore.kernel.org/r/20251208161513.127049-3-haris.iqbal%40ionos.com
patch subject: [PATCH 2/9] RDMA/rtrs: Add error description to the logs
config: arm-randconfig-r072-20251210 (https://download.01.org/0day-ci/archive/20251212/202512120133.BuJVeI6M-lkp@intel.com/config)
compiler: arm-linux-gnueabi-gcc (GCC) 12.5.0
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
| Closes: https://lore.kernel.org/r/202512120133.BuJVeI6M-lkp@intel.com/
smatch warnings:
drivers/infiniband/ulp/rtrs/rtrs-srv.c:885 process_info_req() warn: passing zero to 'ERR_PTR'
vim +/ERR_PTR +885 drivers/infiniband/ulp/rtrs/rtrs-srv.c
9cb837480424e7 Jack Wang 2020-05-11 808 static int process_info_req(struct rtrs_srv_con *con,
9cb837480424e7 Jack Wang 2020-05-11 809 struct rtrs_msg_info_req *msg)
9cb837480424e7 Jack Wang 2020-05-11 810 {
d9372794717f44 Vaishali Thakkar 2022-01-05 811 struct rtrs_path *s = con->c.path;
ae4c81644e9105 Vaishali Thakkar 2022-01-05 812 struct rtrs_srv_path *srv_path = to_srv_path(s);
9cb837480424e7 Jack Wang 2020-05-11 813 struct ib_send_wr *reg_wr = NULL;
9cb837480424e7 Jack Wang 2020-05-11 814 struct rtrs_msg_info_rsp *rsp;
9cb837480424e7 Jack Wang 2020-05-11 815 struct rtrs_iu *tx_iu;
9cb837480424e7 Jack Wang 2020-05-11 816 struct ib_reg_wr *rwr;
9cb837480424e7 Jack Wang 2020-05-11 817 int mri, err;
9cb837480424e7 Jack Wang 2020-05-11 818 size_t tx_sz;
9cb837480424e7 Jack Wang 2020-05-11 819
ae4c81644e9105 Vaishali Thakkar 2022-01-05 820 err = post_recv_path(srv_path);
4693d6b767d6ca Gioh Kim 2021-08-06 821 if (err) {
94ae3ce9b375c6 Kim Zhu 2025-12-08 822 rtrs_err(s, "post_recv_path(), err: %d(%pe)\n", err, ERR_PTR(err));
9cb837480424e7 Jack Wang 2020-05-11 823 return err;
9cb837480424e7 Jack Wang 2020-05-11 824 }
07c14027295a32 Gioh Kim 2021-05-28 825
ae4c81644e9105 Vaishali Thakkar 2022-01-05 826 if (strchr(msg->pathname, '/') || strchr(msg->pathname, '.')) {
ae4c81644e9105 Vaishali Thakkar 2022-01-05 827 rtrs_err(s, "pathname cannot contain / and .\n");
dea7bb3ad3e08f Md Haris Iqbal 2021-09-22 828 return -EINVAL;
dea7bb3ad3e08f Md Haris Iqbal 2021-09-22 829 }
dea7bb3ad3e08f Md Haris Iqbal 2021-09-22 830
ae4c81644e9105 Vaishali Thakkar 2022-01-05 831 if (exist_pathname(srv_path->srv->ctx,
ae4c81644e9105 Vaishali Thakkar 2022-01-05 832 msg->pathname, &srv_path->srv->paths_uuid)) {
ae4c81644e9105 Vaishali Thakkar 2022-01-05 833 rtrs_err(s, "pathname is duplicated: %s\n", msg->pathname);
07c14027295a32 Gioh Kim 2021-05-28 834 return -EPERM;
07c14027295a32 Gioh Kim 2021-05-28 835 }
ae4c81644e9105 Vaishali Thakkar 2022-01-05 836 strscpy(srv_path->s.sessname, msg->pathname,
ae4c81644e9105 Vaishali Thakkar 2022-01-05 837 sizeof(srv_path->s.sessname));
07c14027295a32 Gioh Kim 2021-05-28 838
ae4c81644e9105 Vaishali Thakkar 2022-01-05 839 rwr = kcalloc(srv_path->mrs_num, sizeof(*rwr), GFP_KERNEL);
4693d6b767d6ca Gioh Kim 2021-08-06 840 if (!rwr)
9cb837480424e7 Jack Wang 2020-05-11 841 return -ENOMEM;
9cb837480424e7 Jack Wang 2020-05-11 842
9cb837480424e7 Jack Wang 2020-05-11 843 tx_sz = sizeof(*rsp);
ae4c81644e9105 Vaishali Thakkar 2022-01-05 844 tx_sz += sizeof(rsp->desc[0]) * srv_path->mrs_num;
ae4c81644e9105 Vaishali Thakkar 2022-01-05 845 tx_iu = rtrs_iu_alloc(1, tx_sz, GFP_KERNEL, srv_path->s.dev->ib_dev,
9cb837480424e7 Jack Wang 2020-05-11 846 DMA_TO_DEVICE, rtrs_srv_info_rsp_done);
4693d6b767d6ca Gioh Kim 2021-08-06 847 if (!tx_iu) {
9cb837480424e7 Jack Wang 2020-05-11 848 err = -ENOMEM;
9cb837480424e7 Jack Wang 2020-05-11 849 goto rwr_free;
9cb837480424e7 Jack Wang 2020-05-11 850 }
9cb837480424e7 Jack Wang 2020-05-11 851
9cb837480424e7 Jack Wang 2020-05-11 852 rsp = tx_iu->buf;
9cb837480424e7 Jack Wang 2020-05-11 853 rsp->type = cpu_to_le16(RTRS_MSG_INFO_RSP);
ae4c81644e9105 Vaishali Thakkar 2022-01-05 854 rsp->sg_cnt = cpu_to_le16(srv_path->mrs_num);
9cb837480424e7 Jack Wang 2020-05-11 855
ae4c81644e9105 Vaishali Thakkar 2022-01-05 856 for (mri = 0; mri < srv_path->mrs_num; mri++) {
ae4c81644e9105 Vaishali Thakkar 2022-01-05 857 struct ib_mr *mr = srv_path->mrs[mri].mr;
9cb837480424e7 Jack Wang 2020-05-11 858
9cb837480424e7 Jack Wang 2020-05-11 859 rsp->desc[mri].addr = cpu_to_le64(mr->iova);
9cb837480424e7 Jack Wang 2020-05-11 860 rsp->desc[mri].key = cpu_to_le32(mr->rkey);
9cb837480424e7 Jack Wang 2020-05-11 861 rsp->desc[mri].len = cpu_to_le32(mr->length);
9cb837480424e7 Jack Wang 2020-05-11 862
9cb837480424e7 Jack Wang 2020-05-11 863 /*
9cb837480424e7 Jack Wang 2020-05-11 864 * Fill in reg MR request and chain them *backwards*
9cb837480424e7 Jack Wang 2020-05-11 865 */
9cb837480424e7 Jack Wang 2020-05-11 866 rwr[mri].wr.next = mri ? &rwr[mri - 1].wr : NULL;
9cb837480424e7 Jack Wang 2020-05-11 867 rwr[mri].wr.opcode = IB_WR_REG_MR;
9cb837480424e7 Jack Wang 2020-05-11 868 rwr[mri].wr.wr_cqe = &local_reg_cqe;
9cb837480424e7 Jack Wang 2020-05-11 869 rwr[mri].wr.num_sge = 0;
e8ae7ddb48a1b8 Jack Wang 2020-12-17 870 rwr[mri].wr.send_flags = 0;
9cb837480424e7 Jack Wang 2020-05-11 871 rwr[mri].mr = mr;
9cb837480424e7 Jack Wang 2020-05-11 872 rwr[mri].key = mr->rkey;
9cb837480424e7 Jack Wang 2020-05-11 873 rwr[mri].access = (IB_ACCESS_LOCAL_WRITE |
9cb837480424e7 Jack Wang 2020-05-11 874 IB_ACCESS_REMOTE_WRITE);
9cb837480424e7 Jack Wang 2020-05-11 875 reg_wr = &rwr[mri].wr;
9cb837480424e7 Jack Wang 2020-05-11 876 }
9cb837480424e7 Jack Wang 2020-05-11 877
ae4c81644e9105 Vaishali Thakkar 2022-01-05 878 err = rtrs_srv_create_path_files(srv_path);
4693d6b767d6ca Gioh Kim 2021-08-06 879 if (err)
9cb837480424e7 Jack Wang 2020-05-11 880 goto iu_free;
ae4c81644e9105 Vaishali Thakkar 2022-01-05 881 kobject_get(&srv_path->kobj);
ae4c81644e9105 Vaishali Thakkar 2022-01-05 882 get_device(&srv_path->srv->dev);
ed1e52aefa16f1 Md Haris Iqbal 2023-11-20 883 err = rtrs_srv_change_state(srv_path, RTRS_SRV_CONNECTED);
ed1e52aefa16f1 Md Haris Iqbal 2023-11-20 884 if (!err) {
Probably remove the !?
94ae3ce9b375c6 Kim Zhu 2025-12-08 @885 rtrs_err(s, "rtrs_srv_change_state(), err: %d(%pe)\n", err, ERR_PTR(err));
err is zero. Or is this a success path?
ed1e52aefa16f1 Md Haris Iqbal 2023-11-20 886 goto iu_free;
ed1e52aefa16f1 Md Haris Iqbal 2023-11-20 887 }
ed1e52aefa16f1 Md Haris Iqbal 2023-11-20 888
ae4c81644e9105 Vaishali Thakkar 2022-01-05 889 rtrs_srv_start_hb(srv_path);
9cb837480424e7 Jack Wang 2020-05-11 890
9cb837480424e7 Jack Wang 2020-05-11 891 /*
9cb837480424e7 Jack Wang 2020-05-11 892 * We do not account number of established connections at the current
9cb837480424e7 Jack Wang 2020-05-11 893 * moment, we rely on the client, which should send info request when
9cb837480424e7 Jack Wang 2020-05-11 894 * all connections are successfully established. Thus, simply notify
9cb837480424e7 Jack Wang 2020-05-11 895 * listener with a proper event if we are the first path.
9cb837480424e7 Jack Wang 2020-05-11 896 */
ed1e52aefa16f1 Md Haris Iqbal 2023-11-20 897 err = rtrs_srv_path_up(srv_path);
ed1e52aefa16f1 Md Haris Iqbal 2023-11-20 898 if (err) {
94ae3ce9b375c6 Kim Zhu 2025-12-08 899 rtrs_err(s, "rtrs_srv_path_up(), err: %d(%pe)\n", err, ERR_PTR(err));
ed1e52aefa16f1 Md Haris Iqbal 2023-11-20 900 goto iu_free;
ed1e52aefa16f1 Md Haris Iqbal 2023-11-20 901 }
9cb837480424e7 Jack Wang 2020-05-11 902
ae4c81644e9105 Vaishali Thakkar 2022-01-05 903 ib_dma_sync_single_for_device(srv_path->s.dev->ib_dev,
ae4c81644e9105 Vaishali Thakkar 2022-01-05 904 tx_iu->dma_addr,
9cb837480424e7 Jack Wang 2020-05-11 905 tx_iu->size, DMA_TO_DEVICE);
9cb837480424e7 Jack Wang 2020-05-11 906
9cb837480424e7 Jack Wang 2020-05-11 907 /* Send info response */
9cb837480424e7 Jack Wang 2020-05-11 908 err = rtrs_iu_post_send(&con->c, tx_iu, tx_sz, reg_wr);
4693d6b767d6ca Gioh Kim 2021-08-06 909 if (err) {
94ae3ce9b375c6 Kim Zhu 2025-12-08 910 rtrs_err(s, "rtrs_iu_post_send(), err: %d(%pe)\n", err, ERR_PTR(err));
9cb837480424e7 Jack Wang 2020-05-11 911 iu_free:
ae4c81644e9105 Vaishali Thakkar 2022-01-05 912 rtrs_iu_free(tx_iu, srv_path->s.dev->ib_dev, 1);
9cb837480424e7 Jack Wang 2020-05-11 913 }
9cb837480424e7 Jack Wang 2020-05-11 914 rwr_free:
9cb837480424e7 Jack Wang 2020-05-11 915 kfree(rwr);
9cb837480424e7 Jack Wang 2020-05-11 916
9cb837480424e7 Jack Wang 2020-05-11 917 return err;
9cb837480424e7 Jack Wang 2020-05-11 918 }
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH 2/9] RDMA/rtrs: Add error description to the logs
2025-12-08 16:15 ` [PATCH 2/9] RDMA/rtrs: Add error description to the logs Md Haris Iqbal
2025-12-12 5:26 ` Dan Carpenter
@ 2025-12-18 15:51 ` Leon Romanovsky
2026-01-06 10:03 ` Haris Iqbal
1 sibling, 1 reply; 20+ messages in thread
From: Leon Romanovsky @ 2025-12-18 15:51 UTC (permalink / raw)
To: Md Haris Iqbal
Cc: linux-rdma, bvanassche, jgg, jinpu.wang, grzegorz.prajsner,
Kim Zhu
On Mon, Dec 08, 2025 at 05:15:06PM +0100, Md Haris Iqbal wrote:
> From: Kim Zhu <zhu.yanjun@ionos.com>
>
> Print error description besides the error number.
>
> Signed-off-by: Kim Zhu <zhu.yanjun@ionos.com>
> Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
> Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
> ---
> drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c | 8 +-
> drivers/infiniband/ulp/rtrs/rtrs-clt.c | 89 ++++++++++----------
> drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c | 12 +--
> drivers/infiniband/ulp/rtrs/rtrs-srv.c | 78 ++++++++---------
> drivers/infiniband/ulp/rtrs/rtrs.c | 9 +-
> 5 files changed, 101 insertions(+), 95 deletions(-)
>
> diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c b/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c
> index 4aa80c9388f0..b318acc12b10 100644
> --- a/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c
> +++ b/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c
> @@ -439,19 +439,19 @@ int rtrs_clt_create_path_files(struct rtrs_clt_path *clt_path)
> clt->kobj_paths,
> "%s", str);
> if (err) {
> - pr_err("kobject_init_and_add: %d\n", err);
> + pr_err("kobject_init_and_add: %d(%pe)\n", err, ERR_PTR(err));
Print either the error number or the error description, not both.
Thanks
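
For illustration only, a minimal userspace sketch of the "description only"
form. In the kernel this would presumably be a single %pe with ERR_PTR(err)
and no extra %d. The helpers err_name() and format_kobj_err() below are
hypothetical stand-ins, not RTRS or printk code, and the table covers only a
few error codes:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for the symbolic name that %pe can print for
 * ERR_PTR(err), e.g. "-ENOMEM". Only a handful of codes are listed.
 */
static const char *err_name(int err)
{
	switch (err) {
	case -ENOMEM: return "-ENOMEM";
	case -EINVAL: return "-EINVAL";
	case -EEXIST: return "-EEXIST";
	default:      return "(unknown)";
	}
}

/* Format the log line with the description only, not "%d(%pe)". */
static int format_kobj_err(char *buf, size_t len, int err)
{
	return snprintf(buf, len, "kobject_init_and_add: %s\n",
			err_name(err));
}
```

The symbolic name already identifies the errno, so the extra number adds
little to the log line.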
* Re: [PATCH 5/9] RDMA/rtrs-clt: Remove unused list-head in rtrs_clt_io_req
2025-12-09 1:14 ` Honggang LI
@ 2026-01-06 9:26 ` Haris Iqbal
0 siblings, 0 replies; 20+ messages in thread
From: Haris Iqbal @ 2026-01-06 9:26 UTC (permalink / raw)
To: Honggang LI
Cc: linux-rdma, bvanassche, leon, jgg, jinpu.wang, grzegorz.prajsner
On Tue, Dec 9, 2025 at 2:14 AM Honggang LI <honggangli@163.com> wrote:
>
> On Mon, Dec 08, 2025 at 05:15:09PM +0100, Md Haris Iqbal wrote:
> > Subject: [PATCH 5/9] RDMA/rtrs-clt: Remove unused list-head in
> > rtrs_clt_io_req
> > From: Md Haris Iqbal <haris.iqbal@ionos.com>
> > Date: Mon, 8 Dec 2025 17:15:09 +0100
> > X-Mailer: git-send-email 2.43.0
> >
> > From: Jack Wang <jinpu.wang@ionos.com>
> >
> > Remove unused member.
> >
> > Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
> > Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
> > ---
> > drivers/infiniband/ulp/rtrs/rtrs-clt.h | 1 -
> > 1 file changed, 1 deletion(-)
> >
> > diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.h b/drivers/infiniband/ulp/rtrs/rtrs-clt.h
> > index 0f57759b3080..3633119d1db2 100644
> > --- a/drivers/infiniband/ulp/rtrs/rtrs-clt.h
> > +++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.h
> > @@ -92,7 +92,6 @@ struct rtrs_permit {
> > * rtrs_clt_io_req - describes one inflight IO request
> > */
> > struct rtrs_clt_io_req {
> > - struct list_head list;
>
> It seems these two members are also unused. Why keep them?
>
> struct rtrs_sg_desc *desc;
> unsigned long start_jiffies;
Makes sense. Will remove them. Thanks
>
> thanks
>
* Re: [PATCH 6/9] RDMA/rtrs-srv: Add check and closure for possible zombie paths
2025-12-09 1:17 ` Honggang LI
@ 2026-01-06 9:27 ` Haris Iqbal
0 siblings, 0 replies; 20+ messages in thread
From: Haris Iqbal @ 2026-01-06 9:27 UTC (permalink / raw)
To: Honggang LI
Cc: linux-rdma, bvanassche, leon, jgg, jinpu.wang, grzegorz.prajsner
On Tue, Dec 9, 2025 at 2:17 AM Honggang LI <honggangli@163.com> wrote:
>
> On Mon, Dec 08, 2025 at 05:15:10PM +0100, Md Haris Iqbal wrote:
> > Subject: [PATCH 6/9] RDMA/rtrs-srv: Add check and closure for possible
> > zombie paths
> > From: Md Haris Iqbal <haris.iqbal@ionos.com>
> > Date: Mon, 8 Dec 2025 17:15:10 +0100
> > X-Mailer: git-send-email 2.43.0
> >
> > +++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> > @@ -911,6 +911,13 @@ static int process_info_req(struct rtrs_srv_con *con,
> > tx_iu->dma_addr,
> > tx_iu->size, DMA_TO_DEVICE);
> >
> > + /*
> > + * Now disable zombie connection closing. Since from the logs and code,
> > + * we know that it can never be in CONNECTED state.
> > + * See RNBD-3128 comments.
> ^^^^^^^^^^^^^^^^^
> What is it? How to access it?
It is an internal ticket number. Should have been removed, but we
missed it. Will remove it.
Thanks.
>
> thanks
>
* Re: [PATCH 3/9] RDMA/rtrs: Add optional support for IB_MR_TYPE_SG_GAPS
2025-12-09 1:12 ` Honggang LI
@ 2026-01-06 9:28 ` Haris Iqbal
0 siblings, 0 replies; 20+ messages in thread
From: Haris Iqbal @ 2026-01-06 9:28 UTC (permalink / raw)
To: Honggang LI
Cc: linux-rdma, bvanassche, leon, jgg, jinpu.wang, grzegorz.prajsner,
Kim Zhu
On Tue, Dec 9, 2025 at 2:13 AM Honggang LI <honggangli@163.com> wrote:
>
> On Mon, Dec 08, 2025 at 05:15:07PM +0100, Md Haris Iqbal wrote:
> > Subject: [PATCH 3/9] RDMA/rtrs: Add optional support for IB_MR_TYPE_SG_GAPS
> > From: Md Haris Iqbal <haris.iqbal@ionos.com>
> > Date: Mon, 8 Dec 2025 17:15:07 +0100
> > X-Mailer: git-send-email 2.43.0
> >
> > Support IB_MR_TYPE_SG_GAPS, which has fewer limitations
> > than the standard IB_MR_TYPE_MEM_REG; a few ULPs already support this.
>
> Do you have benchmark results comparing IB_MR_TYPE_MEM_REG
> and IB_MR_TYPE_SG_GAPS?
We haven't benchmarked it yet. As a ULP, we wanted to first add
support to RTRS.
>
> thanks
>
* Re: [PATCH 2/9] RDMA/rtrs: Add error description to the logs
2025-12-12 5:26 ` Dan Carpenter
@ 2026-01-06 9:47 ` Haris Iqbal
0 siblings, 0 replies; 20+ messages in thread
From: Haris Iqbal @ 2026-01-06 9:47 UTC (permalink / raw)
To: Dan Carpenter
Cc: oe-kbuild, linux-rdma, lkp, oe-kbuild-all, bvanassche, leon, jgg,
jinpu.wang, grzegorz.prajsner, Kim Zhu
On Fri, Dec 12, 2025 at 6:26 AM Dan Carpenter <dan.carpenter@linaro.org> wrote:
>
> Hi Md,
>
> kernel test robot noticed the following build warnings:
>
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
>
> url: https://github.com/intel-lab-lkp/linux/commits/Md-Haris-Iqbal/RDMA-rtrs-srv-fix-SG-mapping/20251209-001817
> base: https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git for-next
> patch link: https://lore.kernel.org/r/20251208161513.127049-3-haris.iqbal%40ionos.com
> patch subject: [PATCH 2/9] RDMA/rtrs: Add error description to the logs
> config: arm-randconfig-r072-20251210 (https://download.01.org/0day-ci/archive/20251212/202512120133.BuJVeI6M-lkp@intel.com/config)
> compiler: arm-linux-gnueabi-gcc (GCC) 12.5.0
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
> | Closes: https://lore.kernel.org/r/202512120133.BuJVeI6M-lkp@intel.com/
>
> smatch warnings:
> drivers/infiniband/ulp/rtrs/rtrs-srv.c:885 process_info_req() warn: passing zero to 'ERR_PTR'
>
> vim +/ERR_PTR +885 drivers/infiniband/ulp/rtrs/rtrs-srv.c
>
> 9cb837480424e7 Jack Wang 2020-05-11 808 static int process_info_req(struct rtrs_srv_con *con,
> 9cb837480424e7 Jack Wang 2020-05-11 809 struct rtrs_msg_info_req *msg)
> 9cb837480424e7 Jack Wang 2020-05-11 810 {
> d9372794717f44 Vaishali Thakkar 2022-01-05 811 struct rtrs_path *s = con->c.path;
> ae4c81644e9105 Vaishali Thakkar 2022-01-05 812 struct rtrs_srv_path *srv_path = to_srv_path(s);
> 9cb837480424e7 Jack Wang 2020-05-11 813 struct ib_send_wr *reg_wr = NULL;
> 9cb837480424e7 Jack Wang 2020-05-11 814 struct rtrs_msg_info_rsp *rsp;
> 9cb837480424e7 Jack Wang 2020-05-11 815 struct rtrs_iu *tx_iu;
> 9cb837480424e7 Jack Wang 2020-05-11 816 struct ib_reg_wr *rwr;
> 9cb837480424e7 Jack Wang 2020-05-11 817 int mri, err;
> 9cb837480424e7 Jack Wang 2020-05-11 818 size_t tx_sz;
> 9cb837480424e7 Jack Wang 2020-05-11 819
> ae4c81644e9105 Vaishali Thakkar 2022-01-05 820 err = post_recv_path(srv_path);
> 4693d6b767d6ca Gioh Kim 2021-08-06 821 if (err) {
> 94ae3ce9b375c6 Kim Zhu 2025-12-08 822 rtrs_err(s, "post_recv_path(), err: %d(%pe)\n", err, ERR_PTR(err));
> 9cb837480424e7 Jack Wang 2020-05-11 823 return err;
> 9cb837480424e7 Jack Wang 2020-05-11 824 }
> 07c14027295a32 Gioh Kim 2021-05-28 825
> ae4c81644e9105 Vaishali Thakkar 2022-01-05 826 if (strchr(msg->pathname, '/') || strchr(msg->pathname, '.')) {
> ae4c81644e9105 Vaishali Thakkar 2022-01-05 827 rtrs_err(s, "pathname cannot contain / and .\n");
> dea7bb3ad3e08f Md Haris Iqbal 2021-09-22 828 return -EINVAL;
> dea7bb3ad3e08f Md Haris Iqbal 2021-09-22 829 }
> dea7bb3ad3e08f Md Haris Iqbal 2021-09-22 830
> ae4c81644e9105 Vaishali Thakkar 2022-01-05 831 if (exist_pathname(srv_path->srv->ctx,
> ae4c81644e9105 Vaishali Thakkar 2022-01-05 832 msg->pathname, &srv_path->srv->paths_uuid)) {
> ae4c81644e9105 Vaishali Thakkar 2022-01-05 833 rtrs_err(s, "pathname is duplicated: %s\n", msg->pathname);
> 07c14027295a32 Gioh Kim 2021-05-28 834 return -EPERM;
> 07c14027295a32 Gioh Kim 2021-05-28 835 }
> ae4c81644e9105 Vaishali Thakkar 2022-01-05 836 strscpy(srv_path->s.sessname, msg->pathname,
> ae4c81644e9105 Vaishali Thakkar 2022-01-05 837 sizeof(srv_path->s.sessname));
> 07c14027295a32 Gioh Kim 2021-05-28 838
> ae4c81644e9105 Vaishali Thakkar 2022-01-05 839 rwr = kcalloc(srv_path->mrs_num, sizeof(*rwr), GFP_KERNEL);
> 4693d6b767d6ca Gioh Kim 2021-08-06 840 if (!rwr)
> 9cb837480424e7 Jack Wang 2020-05-11 841 return -ENOMEM;
> 9cb837480424e7 Jack Wang 2020-05-11 842
> 9cb837480424e7 Jack Wang 2020-05-11 843 tx_sz = sizeof(*rsp);
> ae4c81644e9105 Vaishali Thakkar 2022-01-05 844 tx_sz += sizeof(rsp->desc[0]) * srv_path->mrs_num;
> ae4c81644e9105 Vaishali Thakkar 2022-01-05 845 tx_iu = rtrs_iu_alloc(1, tx_sz, GFP_KERNEL, srv_path->s.dev->ib_dev,
> 9cb837480424e7 Jack Wang 2020-05-11 846 DMA_TO_DEVICE, rtrs_srv_info_rsp_done);
> 4693d6b767d6ca Gioh Kim 2021-08-06 847 if (!tx_iu) {
> 9cb837480424e7 Jack Wang 2020-05-11 848 err = -ENOMEM;
> 9cb837480424e7 Jack Wang 2020-05-11 849 goto rwr_free;
> 9cb837480424e7 Jack Wang 2020-05-11 850 }
> 9cb837480424e7 Jack Wang 2020-05-11 851
> 9cb837480424e7 Jack Wang 2020-05-11 852 rsp = tx_iu->buf;
> 9cb837480424e7 Jack Wang 2020-05-11 853 rsp->type = cpu_to_le16(RTRS_MSG_INFO_RSP);
> ae4c81644e9105 Vaishali Thakkar 2022-01-05 854 rsp->sg_cnt = cpu_to_le16(srv_path->mrs_num);
> 9cb837480424e7 Jack Wang 2020-05-11 855
> ae4c81644e9105 Vaishali Thakkar 2022-01-05 856 for (mri = 0; mri < srv_path->mrs_num; mri++) {
> ae4c81644e9105 Vaishali Thakkar 2022-01-05 857 struct ib_mr *mr = srv_path->mrs[mri].mr;
> 9cb837480424e7 Jack Wang 2020-05-11 858
> 9cb837480424e7 Jack Wang 2020-05-11 859 rsp->desc[mri].addr = cpu_to_le64(mr->iova);
> 9cb837480424e7 Jack Wang 2020-05-11 860 rsp->desc[mri].key = cpu_to_le32(mr->rkey);
> 9cb837480424e7 Jack Wang 2020-05-11 861 rsp->desc[mri].len = cpu_to_le32(mr->length);
> 9cb837480424e7 Jack Wang 2020-05-11 862
> 9cb837480424e7 Jack Wang 2020-05-11 863 /*
> 9cb837480424e7 Jack Wang 2020-05-11 864 * Fill in reg MR request and chain them *backwards*
> 9cb837480424e7 Jack Wang 2020-05-11 865 */
> 9cb837480424e7 Jack Wang 2020-05-11 866 rwr[mri].wr.next = mri ? &rwr[mri - 1].wr : NULL;
> 9cb837480424e7 Jack Wang 2020-05-11 867 rwr[mri].wr.opcode = IB_WR_REG_MR;
> 9cb837480424e7 Jack Wang 2020-05-11 868 rwr[mri].wr.wr_cqe = &local_reg_cqe;
> 9cb837480424e7 Jack Wang 2020-05-11 869 rwr[mri].wr.num_sge = 0;
> e8ae7ddb48a1b8 Jack Wang 2020-12-17 870 rwr[mri].wr.send_flags = 0;
> 9cb837480424e7 Jack Wang 2020-05-11 871 rwr[mri].mr = mr;
> 9cb837480424e7 Jack Wang 2020-05-11 872 rwr[mri].key = mr->rkey;
> 9cb837480424e7 Jack Wang 2020-05-11 873 rwr[mri].access = (IB_ACCESS_LOCAL_WRITE |
> 9cb837480424e7 Jack Wang 2020-05-11 874 IB_ACCESS_REMOTE_WRITE);
> 9cb837480424e7 Jack Wang 2020-05-11 875 reg_wr = &rwr[mri].wr;
> 9cb837480424e7 Jack Wang 2020-05-11 876 }
> 9cb837480424e7 Jack Wang 2020-05-11 877
> ae4c81644e9105 Vaishali Thakkar 2022-01-05 878 err = rtrs_srv_create_path_files(srv_path);
> 4693d6b767d6ca Gioh Kim 2021-08-06 879 if (err)
> 9cb837480424e7 Jack Wang 2020-05-11 880 goto iu_free;
> ae4c81644e9105 Vaishali Thakkar 2022-01-05 881 kobject_get(&srv_path->kobj);
> ae4c81644e9105 Vaishali Thakkar 2022-01-05 882 get_device(&srv_path->srv->dev);
> ed1e52aefa16f1 Md Haris Iqbal 2023-11-20 883 err = rtrs_srv_change_state(srv_path, RTRS_SRV_CONNECTED);
> ed1e52aefa16f1 Md Haris Iqbal 2023-11-20 884 if (!err) {
>
> Probably remove the !?
>
> 94ae3ce9b375c6 Kim Zhu 2025-12-08 @885 rtrs_err(s, "rtrs_srv_change_state(), err: %d(%pe)\n", err, ERR_PTR(err));
>
> err is zero. Or is this a success path?
rtrs_srv_change_state() returns a bool, with true meaning success.
Such a return value is not an errno, so it should not be passed to
ERR_PTR() in the error log. Will drop this change in the next version.
Thanks.
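
For illustration only, a minimal userspace sketch of the mismatch smatch
flagged. change_state() is a hypothetical stand-in that returns true on
success, as described above for rtrs_srv_change_state(). The wrapper name
change_state_err(), the transition rule, and the choice of -EINVAL are all
assumptions for this sketch, not what the next version will necessarily do:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for a state-change helper that reports success
 * as true. Assumed rule for this sketch only: a transition is valid
 * when the new state is exactly one step ahead of the current one.
 */
static bool change_state(int cur_state, int new_state)
{
	return new_state == cur_state + 1;
}

/*
 * Translate the bool result into the usual 0 / negative errno
 * convention, so a caller never formats 0 as if it were an error code.
 */
static int change_state_err(int cur_state, int new_state)
{
	if (!change_state(cur_state, new_state))
		return -EINVAL;	/* errno choice is an assumption */
	return 0;
}
```

With a wrapper like this the failure branch always carries a real negative
errno. The alternative is to keep the bool and log a plain "failed" message
without %d or %pe.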
>
> ed1e52aefa16f1 Md Haris Iqbal 2023-11-20 886 goto iu_free;
> ed1e52aefa16f1 Md Haris Iqbal 2023-11-20 887 }
> ed1e52aefa16f1 Md Haris Iqbal 2023-11-20 888
> ae4c81644e9105 Vaishali Thakkar 2022-01-05 889 rtrs_srv_start_hb(srv_path);
> 9cb837480424e7 Jack Wang 2020-05-11 890
> 9cb837480424e7 Jack Wang 2020-05-11 891 /*
> 9cb837480424e7 Jack Wang 2020-05-11 892 * We do not account number of established connections at the current
> 9cb837480424e7 Jack Wang 2020-05-11 893 * moment, we rely on the client, which should send info request when
> 9cb837480424e7 Jack Wang 2020-05-11 894 * all connections are successfully established. Thus, simply notify
> 9cb837480424e7 Jack Wang 2020-05-11 895 * listener with a proper event if we are the first path.
> 9cb837480424e7 Jack Wang 2020-05-11 896 */
> ed1e52aefa16f1 Md Haris Iqbal 2023-11-20 897 err = rtrs_srv_path_up(srv_path);
> ed1e52aefa16f1 Md Haris Iqbal 2023-11-20 898 if (err) {
> 94ae3ce9b375c6 Kim Zhu 2025-12-08 899 rtrs_err(s, "rtrs_srv_path_up(), err: %d(%pe)\n", err, ERR_PTR(err));
> ed1e52aefa16f1 Md Haris Iqbal 2023-11-20 900 goto iu_free;
> ed1e52aefa16f1 Md Haris Iqbal 2023-11-20 901 }
> 9cb837480424e7 Jack Wang 2020-05-11 902
> ae4c81644e9105 Vaishali Thakkar 2022-01-05 903 ib_dma_sync_single_for_device(srv_path->s.dev->ib_dev,
> ae4c81644e9105 Vaishali Thakkar 2022-01-05 904 tx_iu->dma_addr,
> 9cb837480424e7 Jack Wang 2020-05-11 905 tx_iu->size, DMA_TO_DEVICE);
> 9cb837480424e7 Jack Wang 2020-05-11 906
> 9cb837480424e7 Jack Wang 2020-05-11 907 /* Send info response */
> 9cb837480424e7 Jack Wang 2020-05-11 908 err = rtrs_iu_post_send(&con->c, tx_iu, tx_sz, reg_wr);
> 4693d6b767d6ca Gioh Kim 2021-08-06 909 if (err) {
> 94ae3ce9b375c6 Kim Zhu 2025-12-08 910 rtrs_err(s, "rtrs_iu_post_send(), err: %d(%pe)\n", err, ERR_PTR(err));
> 9cb837480424e7 Jack Wang 2020-05-11 911 iu_free:
> ae4c81644e9105 Vaishali Thakkar 2022-01-05 912 rtrs_iu_free(tx_iu, srv_path->s.dev->ib_dev, 1);
> 9cb837480424e7 Jack Wang 2020-05-11 913 }
> 9cb837480424e7 Jack Wang 2020-05-11 914 rwr_free:
> 9cb837480424e7 Jack Wang 2020-05-11 915 kfree(rwr);
> 9cb837480424e7 Jack Wang 2020-05-11 916
> 9cb837480424e7 Jack Wang 2020-05-11 917 return err;
> 9cb837480424e7 Jack Wang 2020-05-11 918 }
>
> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki
>
* Re: [PATCH 2/9] RDMA/rtrs: Add error description to the logs
2025-12-18 15:51 ` Leon Romanovsky
@ 2026-01-06 10:03 ` Haris Iqbal
0 siblings, 0 replies; 20+ messages in thread
From: Haris Iqbal @ 2026-01-06 10:03 UTC (permalink / raw)
To: Leon Romanovsky
Cc: linux-rdma, bvanassche, jgg, jinpu.wang, grzegorz.prajsner,
Kim Zhu
On Thu, Dec 18, 2025 at 4:51 PM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Mon, Dec 08, 2025 at 05:15:06PM +0100, Md Haris Iqbal wrote:
> > From: Kim Zhu <zhu.yanjun@ionos.com>
> >
> > Print error description besides the error number.
> >
> > Signed-off-by: Kim Zhu <zhu.yanjun@ionos.com>
> > Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
> > Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
> > ---
> > drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c | 8 +-
> > drivers/infiniband/ulp/rtrs/rtrs-clt.c | 89 ++++++++++----------
> > drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c | 12 +--
> > drivers/infiniband/ulp/rtrs/rtrs-srv.c | 78 ++++++++---------
> > drivers/infiniband/ulp/rtrs/rtrs.c | 9 +-
> > 5 files changed, 101 insertions(+), 95 deletions(-)
> >
> > diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c b/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c
> > index 4aa80c9388f0..b318acc12b10 100644
> > --- a/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c
> > +++ b/drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c
> > @@ -439,19 +439,19 @@ int rtrs_clt_create_path_files(struct rtrs_clt_path *clt_path)
> > clt->kobj_paths,
> > "%s", str);
> > if (err) {
> > - pr_err("kobject_init_and_add: %d\n", err);
> > + pr_err("kobject_init_and_add: %d(%pe)\n", err, ERR_PTR(err));
>
> Print either the error number or the error description, not both.
Makes sense. Will change it.
Thanks
>
> Thanks
Thread overview: 20+ messages
2025-12-08 16:15 [PATCH 0/9] Misc patches for RTRS Md Haris Iqbal
2025-12-08 16:15 ` [PATCH 1/9] RDMA/rtrs-srv: fix SG mapping Md Haris Iqbal
2025-12-08 16:15 ` [PATCH 2/9] RDMA/rtrs: Add error description to the logs Md Haris Iqbal
2025-12-12 5:26 ` Dan Carpenter
2026-01-06 9:47 ` Haris Iqbal
2025-12-18 15:51 ` Leon Romanovsky
2026-01-06 10:03 ` Haris Iqbal
2025-12-08 16:15 ` [PATCH 3/9] RDMA/rtrs: Add optional support for IB_MR_TYPE_SG_GAPS Md Haris Iqbal
2025-12-09 1:12 ` Honggang LI
2026-01-06 9:28 ` Haris Iqbal
2025-12-08 16:15 ` [PATCH 4/9] RDMA/rtrs: Improve error logging for RDMA cm events Md Haris Iqbal
2025-12-08 16:15 ` [PATCH 5/9] RDMA/rtrs-clt: Remove unused list-head in rtrs_clt_io_req Md Haris Iqbal
2025-12-09 1:14 ` Honggang LI
2026-01-06 9:26 ` Haris Iqbal
2025-12-08 16:15 ` [PATCH 6/9] RDMA/rtrs-srv: Add check and closure for possible zombie paths Md Haris Iqbal
2025-12-09 1:17 ` Honggang LI
2026-01-06 9:27 ` Haris Iqbal
2025-12-08 16:15 ` [PATCH 7/9] RDMA/rtrs-srv: Rate-limit I/O path error logging Md Haris Iqbal
2025-12-08 16:15 ` [PATCH 8/9] RDMA/rtrs: Extend log message when a port fails Md Haris Iqbal
2025-12-08 16:15 ` [PATCH 9/9] RDMA/rtrs-clt.c: For conn rejection use actual err number Md Haris Iqbal