* [PATCH for-next 00/13] Misc patches for RTRS
@ 2024-08-09 13:15 Md Haris Iqbal
2024-08-09 13:15 ` [PATCH for-next 01/13] RDMA/rtrs-srv: Make rtrs_srv_open fail if its a second call Md Haris Iqbal
` (12 more replies)
0 siblings, 13 replies; 29+ messages in thread
From: Md Haris Iqbal @ 2024-08-09 13:15 UTC (permalink / raw)
To: linux-rdma; +Cc: bvanassche, leon, jgg, haris.iqbal, jinpu.wang
Hi Jason, hi Leon,
Please consider to include following changes to the next merge window.
Grzegorz Prajsner (1):
RDMA/rtrs: register ib event handler
Jack Wang (8):
RDMA/rtrs-srv: Fix use-after-free during session establishment
RDMA/rtrs-clt: Fix need_inv setting in error case
RDMA/rtrs-clt: Rate limit errors in IO path
RDMA/rtrs: Reset hb_missed_cnt after receiving other traffic from peer
RDMA/rtrs-clt: Reuse need_inval from mr
RDMA/rtrs-clt: Print request type for errors
RDMA/rtrs-clt: Do local invalidate after write io completion
RDMA/rtrs-clt: Remove an extra space
Md Haris Iqbal (4):
RDMA/rtrs-srv: Make rtrs_srv_open fail if its a second call
RDMA/rtrs: For HB error add additional clt/srv specific logging
RDMA/rtrs-clt: Reset cid to con_num - 1 to stay in bounds
RDMA/rtrs-srv: Avoid null pointer deref during path establishment
drivers/infiniband/ulp/rtrs/rtrs-clt.c | 92 ++++++++++++++++----------
drivers/infiniband/ulp/rtrs/rtrs-clt.h | 3 +-
drivers/infiniband/ulp/rtrs/rtrs-pri.h | 2 +
drivers/infiniband/ulp/rtrs/rtrs-srv.c | 80 ++++++++++++++++++++--
drivers/infiniband/ulp/rtrs/rtrs-srv.h | 3 +
drivers/infiniband/ulp/rtrs/rtrs.c | 2 +-
6 files changed, 138 insertions(+), 44 deletions(-)
--
2.25.1
^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH for-next 01/13] RDMA/rtrs-srv: Make rtrs_srv_open fail if its a second call
2024-08-09 13:15 [PATCH for-next 00/13] Misc patches for RTRS Md Haris Iqbal
@ 2024-08-09 13:15 ` Md Haris Iqbal
2024-08-11 8:38 ` Leon Romanovsky
2024-08-09 13:15 ` [PATCH for-next 02/13] RDMA/rtrs-srv: Fix use-after-free during session establishment Md Haris Iqbal
` (11 subsequent siblings)
12 siblings, 1 reply; 29+ messages in thread
From: Md Haris Iqbal @ 2024-08-09 13:15 UTC (permalink / raw)
To: linux-rdma
Cc: bvanassche, leon, jgg, haris.iqbal, jinpu.wang, Grzegorz Prajsner
Do not allow opening RTRS server if it is already in use and print
proper error message.
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
---
drivers/infiniband/ulp/rtrs/rtrs-srv.c | 27 +++++++++++++++++++++++---
drivers/infiniband/ulp/rtrs/rtrs-srv.h | 1 +
2 files changed, 25 insertions(+), 3 deletions(-)
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index 1d33efb8fb03..fb67b58a7f62 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -2174,9 +2174,18 @@ struct rtrs_srv_ctx *rtrs_srv_open(struct rtrs_srv_ops *ops, u16 port)
struct rtrs_srv_ctx *ctx;
int err;
+ mutex_lock(&ib_ctx.rtrs_srv_ib_mutex);
+ if (ib_ctx.srv_ctx) {
+ pr_err("%s: Already in use.\n", __func__);
+ ctx = ERR_PTR(-EEXIST);
+ goto out;
+ }
+
ctx = alloc_srv_ctx(ops);
- if (!ctx)
- return ERR_PTR(-ENOMEM);
+ if (!ctx) {
+ ctx = ERR_PTR(-ENOMEM);
+ goto out;
+ }
mutex_init(&ib_ctx.ib_dev_mutex);
ib_ctx.srv_ctx = ctx;
@@ -2185,9 +2194,11 @@ struct rtrs_srv_ctx *rtrs_srv_open(struct rtrs_srv_ops *ops, u16 port)
err = ib_register_client(&rtrs_srv_client);
if (err) {
free_srv_ctx(ctx);
- return ERR_PTR(err);
+ ctx = ERR_PTR(err);
}
+out:
+ mutex_unlock(&ib_ctx.rtrs_srv_ib_mutex);
return ctx;
}
EXPORT_SYMBOL(rtrs_srv_open);
@@ -2221,10 +2232,16 @@ static void close_ctx(struct rtrs_srv_ctx *ctx)
*/
void rtrs_srv_close(struct rtrs_srv_ctx *ctx)
{
+ mutex_lock(&ib_ctx.rtrs_srv_ib_mutex);
+ WARN_ON(ib_ctx.srv_ctx != ctx);
+
ib_unregister_client(&rtrs_srv_client);
mutex_destroy(&ib_ctx.ib_dev_mutex);
close_ctx(ctx);
free_srv_ctx(ctx);
+
+ ib_ctx.srv_ctx = NULL;
+ mutex_unlock(&ib_ctx.rtrs_srv_ib_mutex);
}
EXPORT_SYMBOL(rtrs_srv_close);
@@ -2282,6 +2299,9 @@ static int __init rtrs_server_init(void)
goto out_dev_class;
}
+ mutex_init(&ib_ctx.rtrs_srv_ib_mutex);
+ ib_ctx.srv_ctx = NULL;
+
return 0;
out_dev_class:
@@ -2292,6 +2312,7 @@ static int __init rtrs_server_init(void)
static void __exit rtrs_server_exit(void)
{
+ mutex_destroy(&ib_ctx.rtrs_srv_ib_mutex);
destroy_workqueue(rtrs_wq);
class_unregister(&rtrs_dev_class);
rtrs_rdma_dev_pd_deinit(&dev_pd);
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.h b/drivers/infiniband/ulp/rtrs/rtrs-srv.h
index 5e325b82ff33..4924dde0a708 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.h
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.h
@@ -127,6 +127,7 @@ struct rtrs_srv_ib_ctx {
u16 port;
struct mutex ib_dev_mutex;
int ib_dev_count;
+ struct mutex rtrs_srv_ib_mutex;
};
extern const struct class rtrs_dev_class;
--
2.25.1
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH for-next 02/13] RDMA/rtrs-srv: Fix use-after-free during session establishment
2024-08-09 13:15 [PATCH for-next 00/13] Misc patches for RTRS Md Haris Iqbal
2024-08-09 13:15 ` [PATCH for-next 01/13] RDMA/rtrs-srv: Make rtrs_srv_open fail if its a second call Md Haris Iqbal
@ 2024-08-09 13:15 ` Md Haris Iqbal
2024-08-11 8:41 ` Leon Romanovsky
2024-08-09 13:15 ` [PATCH for-next 03/13] RDMA/rtrs: For HB error add additional clt/srv specific logging Md Haris Iqbal
` (10 subsequent siblings)
12 siblings, 1 reply; 29+ messages in thread
From: Md Haris Iqbal @ 2024-08-09 13:15 UTC (permalink / raw)
To: linux-rdma
Cc: bvanassche, leon, jgg, haris.iqbal, jinpu.wang, Grzegorz Prajsner
From: Jack Wang <jinpu.wang@ionos.com>
In case of error happening during session stablishment, close_work is
running. A new RDMA CM event may arrive since we don't destroy cm_id
before destroying qp. To fix this, we first destroy cm_id after drain_qp,
so no new RDMA CM event will arrive afterwards.
Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
---
drivers/infiniband/ulp/rtrs/rtrs-srv.c | 2 +-
drivers/infiniband/ulp/rtrs/rtrs.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index fb67b58a7f62..90ea25ad6720 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -1540,6 +1540,7 @@ static void rtrs_srv_close_work(struct work_struct *work)
con = to_srv_con(srv_path->s.con[i]);
rdma_disconnect(con->c.cm_id);
ib_drain_qp(con->c.qp);
+ rdma_destroy_id(con->c.cm_id);
}
/*
@@ -1564,7 +1565,6 @@ static void rtrs_srv_close_work(struct work_struct *work)
continue;
con = to_srv_con(srv_path->s.con[i]);
rtrs_cq_qp_destroy(&con->c);
- rdma_destroy_id(con->c.cm_id);
kfree(con);
}
rtrs_ib_dev_put(srv_path->s.dev);
diff --git a/drivers/infiniband/ulp/rtrs/rtrs.c b/drivers/infiniband/ulp/rtrs/rtrs.c
index 4e17d546d4cc..44167fd1c958 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs.c
@@ -318,7 +318,7 @@ EXPORT_SYMBOL_GPL(rtrs_cq_qp_create);
void rtrs_cq_qp_destroy(struct rtrs_con *con)
{
if (con->qp) {
- rdma_destroy_qp(con->cm_id);
+ ib_destroy_qp(con->qp);
con->qp = NULL;
}
destroy_cq(con);
--
2.25.1
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH for-next 03/13] RDMA/rtrs: For HB error add additional clt/srv specific logging
2024-08-09 13:15 [PATCH for-next 00/13] Misc patches for RTRS Md Haris Iqbal
2024-08-09 13:15 ` [PATCH for-next 01/13] RDMA/rtrs-srv: Make rtrs_srv_open fail if its a second call Md Haris Iqbal
2024-08-09 13:15 ` [PATCH for-next 02/13] RDMA/rtrs-srv: Fix use-after-free during session establishment Md Haris Iqbal
@ 2024-08-09 13:15 ` Md Haris Iqbal
2024-08-09 13:15 ` [PATCH for-next 04/13] RDMA/rtrs-clt: Fix need_inv setting in error case Md Haris Iqbal
` (9 subsequent siblings)
12 siblings, 0 replies; 29+ messages in thread
From: Md Haris Iqbal @ 2024-08-09 13:15 UTC (permalink / raw)
To: linux-rdma
Cc: bvanassche, leon, jgg, haris.iqbal, jinpu.wang, Grzegorz Prajsner
In case of HB error, we need to know the specific path on which it
happened, for better debugging. Since the clt/srv path structures are not
available in rtrs.c, it needs to be done in the individual HB error
handler.
This commit add those loging. A sample kernel log output after this commit:
rtrs_core L357: <blya>: HB missed max reached.
rtrs_server L717: <blya>: HB err handler for path=ip:x.x.x.x@ip:x.x.x.x
.
.
rtrs_core L357: <blya>: HB missed max reached.
rtrs_client L1519: <blya>: HB err handler for path=ip:x.x.x.x@ip:x.x.x.x
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
---
drivers/infiniband/ulp/rtrs/rtrs-clt.c | 2 ++
drivers/infiniband/ulp/rtrs/rtrs-srv.c | 4 ++++
2 files changed, 6 insertions(+)
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index 88106cf5ce55..66ac4dba990f 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -1494,7 +1494,9 @@ static bool rtrs_clt_change_state_get_old(struct rtrs_clt_path *clt_path,
static void rtrs_clt_hb_err_handler(struct rtrs_con *c)
{
struct rtrs_clt_con *con = container_of(c, typeof(*con), c);
+ struct rtrs_clt_path *clt_path = to_clt_path(con->c.path);
+ rtrs_err(con->c.path, "HB err handler for path=%s\n", kobject_name(&clt_path->kobj));
rtrs_rdma_error_recovery(con);
}
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index 90ea25ad6720..e6885c88fa41 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -672,6 +672,10 @@ static int map_cont_bufs(struct rtrs_srv_path *srv_path)
static void rtrs_srv_hb_err_handler(struct rtrs_con *c)
{
+ struct rtrs_srv_con *con = container_of(c, typeof(*con), c);
+ struct rtrs_srv_path *srv_path = to_srv_path(con->c.path);
+
+ rtrs_err(con->c.path, "HB err handler for path=%s\n", kobject_name(&srv_path->kobj));
close_path(to_srv_path(c->path));
}
--
2.25.1
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH for-next 04/13] RDMA/rtrs-clt: Fix need_inv setting in error case
2024-08-09 13:15 [PATCH for-next 00/13] Misc patches for RTRS Md Haris Iqbal
` (2 preceding siblings ...)
2024-08-09 13:15 ` [PATCH for-next 03/13] RDMA/rtrs: For HB error add additional clt/srv specific logging Md Haris Iqbal
@ 2024-08-09 13:15 ` Md Haris Iqbal
2024-08-09 13:15 ` [PATCH for-next 05/13] RDMA/rtrs-clt: Rate limit errors in IO path Md Haris Iqbal
` (8 subsequent siblings)
12 siblings, 0 replies; 29+ messages in thread
From: Md Haris Iqbal @ 2024-08-09 13:15 UTC (permalink / raw)
To: linux-rdma
Cc: bvanassche, leon, jgg, haris.iqbal, jinpu.wang, Grzegorz Prajsner
From: Jack Wang <jinpu.wang@ionos.com>
In some cases need_inv can be missed for write requests, additionally
driver has to handle missing invalidates for write requests. While at
it, remove the else case from write invalidate path as it is possible
to reach there.
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
---
drivers/infiniband/ulp/rtrs/rtrs-clt.c | 20 +++++++++-----------
1 file changed, 9 insertions(+), 11 deletions(-)
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index 66ac4dba990f..d09018c11ece 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -391,11 +391,12 @@ static void complete_rdma_req(struct rtrs_clt_io_req *req, int errno,
clt_path = to_clt_path(con->c.path);
if (req->sg_cnt) {
- if (req->dir == DMA_FROM_DEVICE && req->need_inv) {
+ if (req->need_inv) {
/*
- * We are here to invalidate read requests
+ * We are here to invalidate read/write requests
* ourselves. In normal scenario server should
- * send INV for all read requests, but
+ * send INV for all read requests, we do chained local
+ * invalidate for write requests, but
* we are here, thus two things could happen:
*
* 1. this is failover, when errno != 0
@@ -422,14 +423,6 @@ static void complete_rdma_req(struct rtrs_clt_io_req *req, int errno,
req->mr->rkey, err);
} else if (can_wait) {
wait_for_completion(&req->inv_comp);
- } else {
- /*
- * Something went wrong, so request will be
- * completed from INV callback.
- */
- WARN_ON_ONCE(1);
-
- return;
}
if (!refcount_dec_and_test(&req->ref))
return;
@@ -1146,6 +1139,7 @@ static int rtrs_clt_write_req(struct rtrs_clt_io_req *req)
};
wr = &rwr.wr;
fr_en = true;
+ req->need_inv = true;
refcount_inc(&req->ref);
}
/*
@@ -1164,6 +1158,10 @@ static int rtrs_clt_write_req(struct rtrs_clt_io_req *req)
clt_path->hca_port);
if (req->mp_policy == MP_POLICY_MIN_INFLIGHT)
atomic_dec(&clt_path->stats->inflight);
+ if (req->need_inv) {
+ req->need_inv = false;
+ refcount_dec(&req->ref);
+ }
if (req->sg_cnt)
ib_dma_unmap_sg(clt_path->s.dev->ib_dev, req->sglist,
req->sg_cnt, req->dir);
--
2.25.1
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH for-next 05/13] RDMA/rtrs-clt: Rate limit errors in IO path
2024-08-09 13:15 [PATCH for-next 00/13] Misc patches for RTRS Md Haris Iqbal
` (3 preceding siblings ...)
2024-08-09 13:15 ` [PATCH for-next 04/13] RDMA/rtrs-clt: Fix need_inv setting in error case Md Haris Iqbal
@ 2024-08-09 13:15 ` Md Haris Iqbal
2024-08-09 13:15 ` [PATCH for-next 06/13] RDMA/rtrs: Reset hb_missed_cnt after receiving other traffic from peer Md Haris Iqbal
` (7 subsequent siblings)
12 siblings, 0 replies; 29+ messages in thread
From: Md Haris Iqbal @ 2024-08-09 13:15 UTC (permalink / raw)
To: linux-rdma
Cc: bvanassche, leon, jgg, haris.iqbal, jinpu.wang, Grzegorz Prajsner
From: Jack Wang <jinpu.wang@ionos.com>
On network errors, a large number of these logs are printed due to all the
inflight IOs, rate limit them so they do not clutter kernel log.
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
---
drivers/infiniband/ulp/rtrs/rtrs-clt.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index d09018c11ece..b34eb4908185 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -331,7 +331,7 @@ static void rtrs_clt_fast_reg_done(struct ib_cq *cq, struct ib_wc *wc)
struct rtrs_clt_con *con = to_clt_con(wc->qp->qp_context);
if (wc->status != IB_WC_SUCCESS) {
- rtrs_err(con->c.path, "Failed IB_WR_REG_MR: %s\n",
+ rtrs_err_rl(con->c.path, "Failed IB_WR_REG_MR: %s\n",
ib_wc_status_msg(wc->status));
rtrs_rdma_error_recovery(con);
}
@@ -351,7 +351,7 @@ static void rtrs_clt_inv_rkey_done(struct ib_cq *cq, struct ib_wc *wc)
struct rtrs_clt_con *con = to_clt_con(wc->qp->qp_context);
if (wc->status != IB_WC_SUCCESS) {
- rtrs_err(con->c.path, "Failed IB_WR_LOCAL_INV: %s\n",
+ rtrs_err_rl(con->c.path, "Failed IB_WR_LOCAL_INV: %s\n",
ib_wc_status_msg(wc->status));
rtrs_rdma_error_recovery(con);
}
@@ -419,7 +419,7 @@ static void complete_rdma_req(struct rtrs_clt_io_req *req, int errno,
refcount_inc(&req->ref);
err = rtrs_inv_rkey(req);
if (err) {
- rtrs_err(con->c.path, "Send INV WR key=%#x: %d\n",
+ rtrs_err_rl(con->c.path, "Send INV WR key=%#x: %d\n",
req->mr->rkey, err);
} else if (can_wait) {
wait_for_completion(&req->inv_comp);
--
2.25.1
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH for-next 06/13] RDMA/rtrs: Reset hb_missed_cnt after receiving other traffic from peer
2024-08-09 13:15 [PATCH for-next 00/13] Misc patches for RTRS Md Haris Iqbal
` (4 preceding siblings ...)
2024-08-09 13:15 ` [PATCH for-next 05/13] RDMA/rtrs-clt: Rate limit errors in IO path Md Haris Iqbal
@ 2024-08-09 13:15 ` Md Haris Iqbal
2024-08-09 13:15 ` [PATCH for-next 07/13] RDMA/rtrs-clt: Reuse need_inval from mr Md Haris Iqbal
` (6 subsequent siblings)
12 siblings, 0 replies; 29+ messages in thread
From: Md Haris Iqbal @ 2024-08-09 13:15 UTC (permalink / raw)
To: linux-rdma
Cc: bvanassche, leon, jgg, haris.iqbal, jinpu.wang, Grzegorz Prajsner
From: Jack Wang <jinpu.wang@ionos.com>
Reset hb_missed_cnt after receiving traffic from other peer, so
hb is more robust again high load on host or network.
Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality")
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
---
drivers/infiniband/ulp/rtrs/rtrs-clt.c | 3 ++-
drivers/infiniband/ulp/rtrs/rtrs-srv.c | 1 +
2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index b34eb4908185..c1bca8972015 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -619,6 +619,7 @@ static void rtrs_clt_rdma_done(struct ib_cq *cq, struct ib_wc *wc)
*/
if (WARN_ON(wc->wr_cqe->done != rtrs_clt_rdma_done))
return;
+ clt_path->s.hb_missed_cnt = 0;
rtrs_from_imm(be32_to_cpu(wc->ex.imm_data),
&imm_type, &imm_payload);
if (imm_type == RTRS_IO_RSP_IMM ||
@@ -636,7 +637,6 @@ static void rtrs_clt_rdma_done(struct ib_cq *cq, struct ib_wc *wc)
return rtrs_clt_recv_done(con, wc);
} else if (imm_type == RTRS_HB_ACK_IMM) {
WARN_ON(con->c.cid);
- clt_path->s.hb_missed_cnt = 0;
clt_path->s.hb_cur_latency =
ktime_sub(ktime_get(), clt_path->s.hb_last_sent);
if (clt_path->flags & RTRS_MSG_NEW_RKEY_F)
@@ -663,6 +663,7 @@ static void rtrs_clt_rdma_done(struct ib_cq *cq, struct ib_wc *wc)
/*
* Key invalidations from server side
*/
+ clt_path->s.hb_missed_cnt = 0;
WARN_ON(!(wc->wc_flags & IB_WC_WITH_INVALIDATE ||
wc->wc_flags & IB_WC_WITH_IMM));
WARN_ON(wc->wr_cqe->done != rtrs_clt_rdma_done);
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index e6885c88fa41..c7aec40d4934 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -1233,6 +1233,7 @@ static void rtrs_srv_rdma_done(struct ib_cq *cq, struct ib_wc *wc)
*/
if (WARN_ON(wc->wr_cqe != &io_comp_cqe))
return;
+ srv_path->s.hb_missed_cnt = 0;
err = rtrs_post_recv_empty(&con->c, &io_comp_cqe);
if (err) {
rtrs_err(s, "rtrs_post_recv(), err: %d\n", err);
--
2.25.1
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH for-next 07/13] RDMA/rtrs-clt: Reuse need_inval from mr
2024-08-09 13:15 [PATCH for-next 00/13] Misc patches for RTRS Md Haris Iqbal
` (5 preceding siblings ...)
2024-08-09 13:15 ` [PATCH for-next 06/13] RDMA/rtrs: Reset hb_missed_cnt after receiving other traffic from peer Md Haris Iqbal
@ 2024-08-09 13:15 ` Md Haris Iqbal
2024-08-09 13:15 ` [PATCH for-next 08/13] RDMA/rtrs-clt: Reset cid to con_num - 1 to stay in bounds Md Haris Iqbal
` (5 subsequent siblings)
12 siblings, 0 replies; 29+ messages in thread
From: Md Haris Iqbal @ 2024-08-09 13:15 UTC (permalink / raw)
To: linux-rdma
Cc: bvanassche, leon, jgg, haris.iqbal, jinpu.wang, Grzegorz Prajsner
From: Jack Wang <jinpu.wang@ionos.com>
mr has a member need_inval, which can be used to indicate if
local invalidate is needed, switch to it and remove need_inv
from rtrs_clt_io_req.
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
---
drivers/infiniband/ulp/rtrs/rtrs-clt.c | 18 +++++++++---------
drivers/infiniband/ulp/rtrs/rtrs-clt.h | 1 -
2 files changed, 9 insertions(+), 10 deletions(-)
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index c1bca8972015..e1557b0cda05 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -355,7 +355,7 @@ static void rtrs_clt_inv_rkey_done(struct ib_cq *cq, struct ib_wc *wc)
ib_wc_status_msg(wc->status));
rtrs_rdma_error_recovery(con);
}
- req->need_inv = false;
+ req->mr->need_inval = false;
if (req->need_inv_comp)
complete(&req->inv_comp);
else
@@ -391,7 +391,7 @@ static void complete_rdma_req(struct rtrs_clt_io_req *req, int errno,
clt_path = to_clt_path(con->c.path);
if (req->sg_cnt) {
- if (req->need_inv) {
+ if (req->mr->need_inval) {
/*
* We are here to invalidate read/write requests
* ourselves. In normal scenario server should
@@ -494,7 +494,7 @@ static void process_io_rsp(struct rtrs_clt_path *clt_path, u32 msg_id,
req = &clt_path->reqs[msg_id];
/* Drop need_inv if server responded with send with invalidation */
- req->need_inv &= !w_inval;
+ req->mr->need_inval &= !w_inval;
complete_rdma_req(req, errno, true, false);
}
@@ -961,7 +961,7 @@ static void rtrs_clt_init_req(struct rtrs_clt_io_req *req,
req->dir = dir;
req->con = rtrs_permit_to_clt_con(clt_path, permit);
req->conf = conf;
- req->need_inv = false;
+ req->mr->need_inval = false;
req->need_inv_comp = false;
req->inv_errno = 0;
refcount_set(&req->ref, 1);
@@ -1140,8 +1140,8 @@ static int rtrs_clt_write_req(struct rtrs_clt_io_req *req)
};
wr = &rwr.wr;
fr_en = true;
- req->need_inv = true;
refcount_inc(&req->ref);
+ req->mr->need_inval = true;
}
/*
* Update stats now, after request is successfully sent it is not
@@ -1159,8 +1159,8 @@ static int rtrs_clt_write_req(struct rtrs_clt_io_req *req)
clt_path->hca_port);
if (req->mp_policy == MP_POLICY_MIN_INFLIGHT)
atomic_dec(&clt_path->stats->inflight);
- if (req->need_inv) {
- req->need_inv = false;
+ if (req->mr->need_inval) {
+ req->mr->need_inval = false;
refcount_dec(&req->ref);
}
if (req->sg_cnt)
@@ -1236,7 +1236,7 @@ static int rtrs_clt_read_req(struct rtrs_clt_io_req *req)
msg->desc[0].len = cpu_to_le32(req->mr->length);
/* Further invalidation is required */
- req->need_inv = !!RTRS_MSG_NEED_INVAL_F;
+ req->mr->need_inval = !!RTRS_MSG_NEED_INVAL_F;
} else {
msg->sg_cnt = 0;
@@ -1269,7 +1269,7 @@ static int rtrs_clt_read_req(struct rtrs_clt_io_req *req)
clt_path->hca_port);
if (req->mp_policy == MP_POLICY_MIN_INFLIGHT)
atomic_dec(&clt_path->stats->inflight);
- req->need_inv = false;
+ req->mr->need_inval = false;
if (req->sg_cnt)
ib_dma_unmap_sg(dev->ib_dev, req->sglist,
req->sg_cnt, req->dir);
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.h b/drivers/infiniband/ulp/rtrs/rtrs-clt.h
index f848c0392d98..45dac15825f4 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.h
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.h
@@ -115,7 +115,6 @@ struct rtrs_clt_io_req {
struct completion inv_comp;
int inv_errno;
bool need_inv_comp;
- bool need_inv;
refcount_t ref;
};
--
2.25.1
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH for-next 08/13] RDMA/rtrs-clt: Reset cid to con_num - 1 to stay in bounds
2024-08-09 13:15 [PATCH for-next 00/13] Misc patches for RTRS Md Haris Iqbal
` (6 preceding siblings ...)
2024-08-09 13:15 ` [PATCH for-next 07/13] RDMA/rtrs-clt: Reuse need_inval from mr Md Haris Iqbal
@ 2024-08-09 13:15 ` Md Haris Iqbal
2024-08-09 13:15 ` [PATCH for-next 09/13] RDMA/rtrs-clt: Print request type for errors Md Haris Iqbal
` (4 subsequent siblings)
12 siblings, 0 replies; 29+ messages in thread
From: Md Haris Iqbal @ 2024-08-09 13:15 UTC (permalink / raw)
To: linux-rdma
Cc: bvanassche, leon, jgg, haris.iqbal, jinpu.wang, Grzegorz Prajsner
In the function init_conns(), after the create_con() and create_cm() for
loop if something fails. In the cleanup for loop after the destroy tag, we
access out of bound memory because cid is set to clt_path->s.con_num.
This commits resets the cid to clt_path->s.con_num - 1, to stay in bounds
in the cleanup loop later.
Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality")
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
---
drivers/infiniband/ulp/rtrs/rtrs-clt.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index e1557b0cda05..777f8e52ed7c 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -2347,6 +2347,12 @@ static int init_conns(struct rtrs_clt_path *clt_path)
if (err)
goto destroy;
}
+
+ /*
+ * Set the cid to con_num - 1, since if we fail later, we want to stay in bounds.
+ */
+ cid = clt_path->s.con_num - 1;
+
err = alloc_path_reqs(clt_path);
if (err)
goto destroy;
--
2.25.1
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH for-next 09/13] RDMA/rtrs-clt: Print request type for errors
2024-08-09 13:15 [PATCH for-next 00/13] Misc patches for RTRS Md Haris Iqbal
` (7 preceding siblings ...)
2024-08-09 13:15 ` [PATCH for-next 08/13] RDMA/rtrs-clt: Reset cid to con_num - 1 to stay in bounds Md Haris Iqbal
@ 2024-08-09 13:15 ` Md Haris Iqbal
2024-08-09 13:15 ` [PATCH for-next 10/13] RDMA/rtrs-srv: Avoid null pointer deref during path establishment Md Haris Iqbal
` (3 subsequent siblings)
12 siblings, 0 replies; 29+ messages in thread
From: Md Haris Iqbal @ 2024-08-09 13:15 UTC (permalink / raw)
To: linux-rdma
Cc: bvanassche, leon, jgg, haris.iqbal, jinpu.wang, Grzegorz Prajsner
From: Jack Wang <jinpu.wang@ionos.com>
Extend the output to print also the request type.
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
---
drivers/infiniband/ulp/rtrs/rtrs-clt.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index 777f8e52ed7c..7c6d40380638 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -439,8 +439,10 @@ static void complete_rdma_req(struct rtrs_clt_io_req *req, int errno,
req->con = NULL;
if (errno) {
- rtrs_err_rl(con->c.path, "IO request failed: error=%d path=%s [%s:%u] notify=%d\n",
- errno, kobject_name(&clt_path->kobj), clt_path->hca_name,
+ rtrs_err_rl(con->c.path,
+ "IO %s request failed: error=%d path=%s [%s:%u] notify=%d\n",
+ req->dir == DMA_TO_DEVICE ? "write" : "read", errno,
+ kobject_name(&clt_path->kobj), clt_path->hca_name,
clt_path->hca_port, notify);
}
--
2.25.1
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH for-next 10/13] RDMA/rtrs-srv: Avoid null pointer deref during path establishment
2024-08-09 13:15 [PATCH for-next 00/13] Misc patches for RTRS Md Haris Iqbal
` (8 preceding siblings ...)
2024-08-09 13:15 ` [PATCH for-next 09/13] RDMA/rtrs-clt: Print request type for errors Md Haris Iqbal
@ 2024-08-09 13:15 ` Md Haris Iqbal
2024-08-09 13:15 ` [PATCH for-next 11/13] RDMA/rtrs: register ib event handler Md Haris Iqbal
` (2 subsequent siblings)
12 siblings, 0 replies; 29+ messages in thread
From: Md Haris Iqbal @ 2024-08-09 13:15 UTC (permalink / raw)
To: linux-rdma
Cc: bvanassche, leon, jgg, haris.iqbal, jinpu.wang, Grzegorz Prajsner
For RTRS path establishment, RTRS client initiates and completes con_num
of connections. After establishing all its connections, the information
is exchanged between the client and server through the info_req message.
During this exchange, it is essential that all connections have been
established, and the state of the RTRS srv path is CONNECTED.
So add these sanity checks, to make sure we detect and abort process in
error scenarios to avoid null pointer deref.
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
---
drivers/infiniband/ulp/rtrs/rtrs-srv.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index c7aec40d4934..fe8abbe0d2e9 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -935,12 +935,11 @@ static void rtrs_srv_info_req_done(struct ib_cq *cq, struct ib_wc *wc)
if (err)
goto close;
-out:
rtrs_iu_free(iu, srv_path->s.dev->ib_dev, 1);
return;
close:
+ rtrs_iu_free(iu, srv_path->s.dev->ib_dev, 1);
close_path(srv_path);
- goto out;
}
static int post_recv_info_req(struct rtrs_srv_con *con)
@@ -991,6 +990,16 @@ static int post_recv_path(struct rtrs_srv_path *srv_path)
q_size = SERVICE_CON_QUEUE_DEPTH;
else
q_size = srv->queue_depth;
+ if (srv_path->state != RTRS_SRV_CONNECTING) {
+ rtrs_err(s, "Path state invalid. state %s\n",
+ rtrs_srv_state_str(srv_path->state));
+ return -EIO;
+ }
+
+ if (!srv_path->s.con[cid]) {
+ rtrs_err(s, "Conn not set for %d\n", cid);
+ return -EIO;
+ }
err = post_recv_io(to_srv_con(srv_path->s.con[cid]), q_size);
if (err) {
--
2.25.1
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH for-next 11/13] RDMA/rtrs: register ib event handler
2024-08-09 13:15 [PATCH for-next 00/13] Misc patches for RTRS Md Haris Iqbal
` (9 preceding siblings ...)
2024-08-09 13:15 ` [PATCH for-next 10/13] RDMA/rtrs-srv: Avoid null pointer deref during path establishment Md Haris Iqbal
@ 2024-08-09 13:15 ` Md Haris Iqbal
2024-08-09 13:15 ` [PATCH for-next 12/13] RDMA/rtrs-clt: Do local invalidate after write io completion Md Haris Iqbal
2024-08-09 13:15 ` [PATCH for-next 13/13] RDMA/rtrs-clt: Remove an extra space Md Haris Iqbal
12 siblings, 0 replies; 29+ messages in thread
From: Md Haris Iqbal @ 2024-08-09 13:15 UTC (permalink / raw)
To: linux-rdma
Cc: bvanassche, leon, jgg, haris.iqbal, jinpu.wang, Grzegorz Prajsner
From: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
Use ib_register_event_handler() to register event handlers for both
client and server side. For now, all those handlers do, is to print
type of incoming event.
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
---
drivers/infiniband/ulp/rtrs/rtrs-clt.c | 21 +++++++++++++++-
drivers/infiniband/ulp/rtrs/rtrs-clt.h | 2 ++
drivers/infiniband/ulp/rtrs/rtrs-pri.h | 2 ++
drivers/infiniband/ulp/rtrs/rtrs-srv.c | 33 +++++++++++++++++++++++++-
drivers/infiniband/ulp/rtrs/rtrs-srv.h | 2 ++
5 files changed, 58 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index 7c6d40380638..230e5f6c8c90 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -3149,8 +3149,20 @@ int rtrs_clt_create_path_from_sysfs(struct rtrs_clt_sess *clt,
return err;
}
+void rtrs_clt_ib_event_handler(struct ib_event_handler *handler,
+ struct ib_event *ibevent)
+{
+ pr_info("Handling event: %s (%d).\n", ib_event_msg(ibevent->event),
+ ibevent->event);
+}
+
+
static int rtrs_clt_ib_dev_init(struct rtrs_ib_dev *dev)
{
+ INIT_IB_EVENT_HANDLER(&dev->event_handler, dev->ib_dev,
+ rtrs_clt_ib_event_handler);
+ ib_register_event_handler(&dev->event_handler);
+
if (!(dev->ib_dev->attrs.device_cap_flags &
IB_DEVICE_MEM_MGT_EXTENSIONS)) {
pr_err("Memory registrations not supported.\n");
@@ -3160,8 +3172,15 @@ static int rtrs_clt_ib_dev_init(struct rtrs_ib_dev *dev)
return 0;
}
+static void rtrs_clt_ib_dev_deinit(struct rtrs_ib_dev *dev)
+{
+ ib_unregister_event_handler(&dev->event_handler);
+}
+
+
static const struct rtrs_rdma_dev_pd_ops dev_pd_ops = {
- .init = rtrs_clt_ib_dev_init
+ .init = rtrs_clt_ib_dev_init,
+ .deinit = rtrs_clt_ib_dev_deinit
};
static int __init rtrs_client_init(void)
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.h b/drivers/infiniband/ulp/rtrs/rtrs-clt.h
index 45dac15825f4..0f57759b3080 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.h
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.h
@@ -212,6 +212,8 @@ int rtrs_clt_remove_path_from_sysfs(struct rtrs_clt_path *path,
void rtrs_clt_set_max_reconnect_attempts(struct rtrs_clt_sess *clt, int value);
int rtrs_clt_get_max_reconnect_attempts(const struct rtrs_clt_sess *clt);
void free_path(struct rtrs_clt_path *clt_path);
+void rtrs_clt_ib_event_handler(struct ib_event_handler *handler,
+ struct ib_event *ibevent);
/* rtrs-clt-stats.c */
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-pri.h b/drivers/infiniband/ulp/rtrs/rtrs-pri.h
index ab25619261d2..ef29bd483b5a 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-pri.h
+++ b/drivers/infiniband/ulp/rtrs/rtrs-pri.h
@@ -69,6 +69,7 @@ struct rtrs_ib_dev;
struct rtrs_rdma_dev_pd_ops {
int (*init)(struct rtrs_ib_dev *dev);
+ void (*deinit)(struct rtrs_ib_dev *dev);
};
struct rtrs_rdma_dev_pd {
@@ -84,6 +85,7 @@ struct rtrs_ib_dev {
struct kref ref;
struct list_head entry;
struct rtrs_rdma_dev_pd *pool;
+ struct ib_event_handler event_handler;
};
struct rtrs_con {
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index fe8abbe0d2e9..141f05a1c395 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -26,7 +26,10 @@ MODULE_LICENSE("GPL");
#define DEFAULT_SESS_QUEUE_DEPTH 512
#define MAX_HDR_SIZE PAGE_SIZE
-static struct rtrs_rdma_dev_pd dev_pd;
+static const struct rtrs_rdma_dev_pd_ops dev_pd_ops;
+static struct rtrs_rdma_dev_pd dev_pd = {
+ .ops = &dev_pd_ops
+};
const struct class rtrs_dev_class = {
.name = "rtrs-server",
};
@@ -2286,6 +2289,34 @@ static int check_module_params(void)
return 0;
}
+void rtrs_srv_ib_event_handler(struct ib_event_handler *handler,
+ struct ib_event *ibevent)
+{
+ pr_info("Handling event: %s (%d).\n", ib_event_msg(ibevent->event),
+ ibevent->event);
+}
+
+static int rtrs_srv_ib_dev_init(struct rtrs_ib_dev *dev)
+{
+ INIT_IB_EVENT_HANDLER(&dev->event_handler, dev->ib_dev,
+ rtrs_srv_ib_event_handler);
+ ib_register_event_handler(&dev->event_handler);
+
+ return 0;
+}
+
+static void rtrs_srv_ib_dev_deinit(struct rtrs_ib_dev *dev)
+{
+ ib_unregister_event_handler(&dev->event_handler);
+}
+
+
+static const struct rtrs_rdma_dev_pd_ops dev_pd_ops = {
+ .init = rtrs_srv_ib_dev_init,
+ .deinit = rtrs_srv_ib_dev_deinit
+};
+
+
static int __init rtrs_server_init(void)
{
int err;
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.h b/drivers/infiniband/ulp/rtrs/rtrs-srv.h
index 4924dde0a708..99cfd2e58ca1 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.h
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.h
@@ -133,6 +133,8 @@ struct rtrs_srv_ib_ctx {
extern const struct class rtrs_dev_class;
void close_path(struct rtrs_srv_path *srv_path);
+void rtrs_srv_ib_event_handler(struct ib_event_handler *handler,
+ struct ib_event *ibevent);
static inline void rtrs_srv_update_rdma_stats(struct rtrs_srv_stats *s,
size_t size, int d)
--
2.25.1
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH for-next 12/13] RDMA/rtrs-clt: Do local invalidate after write io completion
2024-08-09 13:15 [PATCH for-next 00/13] Misc patches for RTRS Md Haris Iqbal
` (10 preceding siblings ...)
2024-08-09 13:15 ` [PATCH for-next 11/13] RDMA/rtrs: register ib event handler Md Haris Iqbal
@ 2024-08-09 13:15 ` Md Haris Iqbal
2024-08-09 13:15 ` [PATCH for-next 13/13] RDMA/rtrs-clt: Remove an extra space Md Haris Iqbal
12 siblings, 0 replies; 29+ messages in thread
From: Md Haris Iqbal @ 2024-08-09 13:15 UTC (permalink / raw)
To: linux-rdma
Cc: bvanassche, leon, jgg, haris.iqbal, jinpu.wang, Grzegorz Prajsner
From: Jack Wang <jinpu.wang@ionos.com>
Switch local invalidate after write io completion avoid the
chain usage of WR, this fixed the local protection error on
LOCAL INVALIDATE WR.
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
---
drivers/infiniband/ulp/rtrs/rtrs-clt.c | 20 +++++++-------------
1 file changed, 7 insertions(+), 13 deletions(-)
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index 230e5f6c8c90..fb548d6a0aae 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -395,9 +395,9 @@ static void complete_rdma_req(struct rtrs_clt_io_req *req, int errno,
/*
* We are here to invalidate read/write requests
* ourselves. In normal scenario server should
- * send INV for all read requests, we do chained local
- * invalidate for write requests, but
- * we are here, thus two things could happen:
+ * send INV for all read requests, we do local
+ * invalidate for write requests ourselves, but
+ * we are here, thus three things could happen:
*
* 1. this is failover, when errno != 0
* and can_wait == 1,
@@ -405,6 +405,9 @@ static void complete_rdma_req(struct rtrs_clt_io_req *req, int errno,
* 2. something totally bad happened and
* server forgot to send INV, so we
* should do that ourselves.
+ *
+ * 3. write request finishes, we need to do local
+ * invalidate
*/
if (can_wait) {
@@ -1085,7 +1088,6 @@ static int rtrs_clt_write_req(struct rtrs_clt_io_req *req)
int ret, count = 0;
u32 imm, buf_id;
struct ib_reg_wr rwr;
- struct ib_send_wr inv_wr;
struct ib_send_wr *wr = NULL;
bool fr_en = false;
@@ -1126,13 +1128,6 @@ static int rtrs_clt_write_req(struct rtrs_clt_io_req *req)
req->sg_cnt, req->dir);
return ret;
}
- inv_wr = (struct ib_send_wr) {
- .opcode = IB_WR_LOCAL_INV,
- .wr_cqe = &req->inv_cqe,
- .send_flags = IB_SEND_SIGNALED,
- .ex.invalidate_rkey = req->mr->rkey,
- };
- req->inv_cqe.done = rtrs_clt_inv_rkey_done;
rwr = (struct ib_reg_wr) {
.wr.opcode = IB_WR_REG_MR,
.wr.wr_cqe = &fast_reg_cqe,
@@ -1142,7 +1137,6 @@ static int rtrs_clt_write_req(struct rtrs_clt_io_req *req)
};
wr = &rwr.wr;
fr_en = true;
- refcount_inc(&req->ref);
req->mr->need_inval = true;
}
/*
@@ -1153,7 +1147,7 @@ static int rtrs_clt_write_req(struct rtrs_clt_io_req *req)
ret = rtrs_post_rdma_write_sg(req->con, req, rbuf, fr_en, count,
req->usr_len + sizeof(*msg),
- imm, wr, &inv_wr);
+ imm, wr, NULL);
if (ret) {
rtrs_err_rl(s,
"Write request failed: error=%d path=%s [%s:%u]\n",
--
2.25.1
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH for-next 13/13] RDMA/rtrs-clt: Remove an extra space
2024-08-09 13:15 [PATCH for-next 00/13] Misc patches for RTRS Md Haris Iqbal
` (11 preceding siblings ...)
2024-08-09 13:15 ` [PATCH for-next 12/13] RDMA/rtrs-clt: Do local invalidate after write io completion Md Haris Iqbal
@ 2024-08-09 13:15 ` Md Haris Iqbal
2024-08-11 8:43 ` Leon Romanovsky
12 siblings, 1 reply; 29+ messages in thread
From: Md Haris Iqbal @ 2024-08-09 13:15 UTC (permalink / raw)
To: linux-rdma
Cc: bvanassche, leon, jgg, haris.iqbal, jinpu.wang, Alexei Pastuchov,
Grzegorz Prajsner
From: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Alexei Pastuchov <alexei.pastuchov@ionos.com>
Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
---
drivers/infiniband/ulp/rtrs/rtrs-clt.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index fb548d6a0aae..71387811b281 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -1208,7 +1208,7 @@ static int rtrs_clt_read_req(struct rtrs_clt_io_req *req)
ret = rtrs_map_sg_fr(req, count);
if (ret < 0) {
rtrs_err_rl(s,
- "Read request failed, failed to map fast reg. data, err: %d\n",
+ "Read request failed, failed to map fast reg. data, err: %d\n",
ret);
ib_dma_unmap_sg(dev->ib_dev, req->sglist, req->sg_cnt,
req->dir);
--
2.25.1
^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH for-next 01/13] RDMA/rtrs-srv: Make rtrs_srv_open fail if its a second call
2024-08-09 13:15 ` [PATCH for-next 01/13] RDMA/rtrs-srv: Make rtrs_srv_open fail if its a second call Md Haris Iqbal
@ 2024-08-11 8:38 ` Leon Romanovsky
2024-08-12 10:16 ` Haris Iqbal
0 siblings, 1 reply; 29+ messages in thread
From: Leon Romanovsky @ 2024-08-11 8:38 UTC (permalink / raw)
To: Md Haris Iqbal; +Cc: linux-rdma, bvanassche, jgg, jinpu.wang, Grzegorz Prajsner
On Fri, Aug 09, 2024 at 03:15:26PM +0200, Md Haris Iqbal wrote:
> Do not allow opening RTRS server if it is already in use and print
> proper error message.
1. How is it even possible? I see only one call to rtrs_srv_open() and
it is happening when the driver is loaded.
2. You already print an error message, why do you need to add another
one?
Thanks
>
> Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
> Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
> Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
> ---
> drivers/infiniband/ulp/rtrs/rtrs-srv.c | 27 +++++++++++++++++++++++---
> drivers/infiniband/ulp/rtrs/rtrs-srv.h | 1 +
> 2 files changed, 25 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> index 1d33efb8fb03..fb67b58a7f62 100644
> --- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> +++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> @@ -2174,9 +2174,18 @@ struct rtrs_srv_ctx *rtrs_srv_open(struct rtrs_srv_ops *ops, u16 port)
> struct rtrs_srv_ctx *ctx;
> int err;
>
> + mutex_lock(&ib_ctx.rtrs_srv_ib_mutex);
> + if (ib_ctx.srv_ctx) {
> + pr_err("%s: Already in use.\n", __func__);
> + ctx = ERR_PTR(-EEXIST);
> + goto out;
> + }
> +
> ctx = alloc_srv_ctx(ops);
> - if (!ctx)
> - return ERR_PTR(-ENOMEM);
> + if (!ctx) {
> + ctx = ERR_PTR(-ENOMEM);
> + goto out;
> + }
>
> mutex_init(&ib_ctx.ib_dev_mutex);
> ib_ctx.srv_ctx = ctx;
> @@ -2185,9 +2194,11 @@ struct rtrs_srv_ctx *rtrs_srv_open(struct rtrs_srv_ops *ops, u16 port)
> err = ib_register_client(&rtrs_srv_client);
> if (err) {
> free_srv_ctx(ctx);
> - return ERR_PTR(err);
> + ctx = ERR_PTR(err);
> }
>
> +out:
> + mutex_unlock(&ib_ctx.rtrs_srv_ib_mutex);
> return ctx;
> }
> EXPORT_SYMBOL(rtrs_srv_open);
> @@ -2221,10 +2232,16 @@ static void close_ctx(struct rtrs_srv_ctx *ctx)
> */
> void rtrs_srv_close(struct rtrs_srv_ctx *ctx)
> {
> + mutex_lock(&ib_ctx.rtrs_srv_ib_mutex);
> + WARN_ON(ib_ctx.srv_ctx != ctx);
> +
> ib_unregister_client(&rtrs_srv_client);
> mutex_destroy(&ib_ctx.ib_dev_mutex);
> close_ctx(ctx);
> free_srv_ctx(ctx);
> +
> + ib_ctx.srv_ctx = NULL;
> + mutex_unlock(&ib_ctx.rtrs_srv_ib_mutex);
> }
> EXPORT_SYMBOL(rtrs_srv_close);
>
> @@ -2282,6 +2299,9 @@ static int __init rtrs_server_init(void)
> goto out_dev_class;
> }
>
> + mutex_init(&ib_ctx.rtrs_srv_ib_mutex);
> + ib_ctx.srv_ctx = NULL;
> +
> return 0;
>
> out_dev_class:
> @@ -2292,6 +2312,7 @@ static int __init rtrs_server_init(void)
>
> static void __exit rtrs_server_exit(void)
> {
> + mutex_destroy(&ib_ctx.rtrs_srv_ib_mutex);
> destroy_workqueue(rtrs_wq);
> class_unregister(&rtrs_dev_class);
> rtrs_rdma_dev_pd_deinit(&dev_pd);
> diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.h b/drivers/infiniband/ulp/rtrs/rtrs-srv.h
> index 5e325b82ff33..4924dde0a708 100644
> --- a/drivers/infiniband/ulp/rtrs/rtrs-srv.h
> +++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.h
> @@ -127,6 +127,7 @@ struct rtrs_srv_ib_ctx {
> u16 port;
> struct mutex ib_dev_mutex;
> int ib_dev_count;
> + struct mutex rtrs_srv_ib_mutex;
> };
>
> extern const struct class rtrs_dev_class;
> --
> 2.25.1
>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH for-next 02/13] RDMA/rtrs-srv: Fix use-after-free during session establishment
2024-08-09 13:15 ` [PATCH for-next 02/13] RDMA/rtrs-srv: Fix use-after-free during session establishment Md Haris Iqbal
@ 2024-08-11 8:41 ` Leon Romanovsky
2024-08-12 10:52 ` Jinpu Wang
0 siblings, 1 reply; 29+ messages in thread
From: Leon Romanovsky @ 2024-08-11 8:41 UTC (permalink / raw)
To: Md Haris Iqbal; +Cc: linux-rdma, bvanassche, jgg, jinpu.wang, Grzegorz Prajsner
On Fri, Aug 09, 2024 at 03:15:27PM +0200, Md Haris Iqbal wrote:
> From: Jack Wang <jinpu.wang@ionos.com>
>
> In case of error happening during session stablishment, close_work is
> running. A new RDMA CM event may arrive since we don't destroy cm_id
> before destroying qp. To fix this, we first destroy cm_id after drain_qp,
> so no new RDMA CM event will arrive afterwards.
>
> Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
> Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
> Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
> Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
> ---
> drivers/infiniband/ulp/rtrs/rtrs-srv.c | 2 +-
> drivers/infiniband/ulp/rtrs/rtrs.c | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> index fb67b58a7f62..90ea25ad6720 100644
> --- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> +++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> @@ -1540,6 +1540,7 @@ static void rtrs_srv_close_work(struct work_struct *work)
> con = to_srv_con(srv_path->s.con[i]);
> rdma_disconnect(con->c.cm_id);
> ib_drain_qp(con->c.qp);
> + rdma_destroy_id(con->c.cm_id);
> }
>
> /*
> @@ -1564,7 +1565,6 @@ static void rtrs_srv_close_work(struct work_struct *work)
> continue;
> con = to_srv_con(srv_path->s.con[i]);
> rtrs_cq_qp_destroy(&con->c);
> - rdma_destroy_id(con->c.cm_id);
> kfree(con);
> }
> rtrs_ib_dev_put(srv_path->s.dev);
> diff --git a/drivers/infiniband/ulp/rtrs/rtrs.c b/drivers/infiniband/ulp/rtrs/rtrs.c
> index 4e17d546d4cc..44167fd1c958 100644
> --- a/drivers/infiniband/ulp/rtrs/rtrs.c
> +++ b/drivers/infiniband/ulp/rtrs/rtrs.c
> @@ -318,7 +318,7 @@ EXPORT_SYMBOL_GPL(rtrs_cq_qp_create);
> void rtrs_cq_qp_destroy(struct rtrs_con *con)
> {
> if (con->qp) {
> - rdma_destroy_qp(con->cm_id);
> + ib_destroy_qp(con->qp);
You created that QP with rdma_create_qp() and you should destroy it with rdma_destroy_qp().
Thanks
> con->qp = NULL;
> }
> destroy_cq(con);
> --
> 2.25.1
>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH for-next 13/13] RDMA/rtrs-clt: Remove an extra space
2024-08-09 13:15 ` [PATCH for-next 13/13] RDMA/rtrs-clt: Remove an extra space Md Haris Iqbal
@ 2024-08-11 8:43 ` Leon Romanovsky
2024-08-12 4:10 ` Jinpu Wang
0 siblings, 1 reply; 29+ messages in thread
From: Leon Romanovsky @ 2024-08-11 8:43 UTC (permalink / raw)
To: Md Haris Iqbal
Cc: linux-rdma, bvanassche, jgg, jinpu.wang, Alexei Pastuchov,
Grzegorz Prajsner
On Fri, Aug 09, 2024 at 03:15:38PM +0200, Md Haris Iqbal wrote:
> From: Jack Wang <jinpu.wang@ionos.com>
>
No empty commit message, please provide a proper description.
Thanks
> Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
> Signed-off-by: Alexei Pastuchov <alexei.pastuchov@ionos.com>
> Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
> Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
> ---
> drivers/infiniband/ulp/rtrs/rtrs-clt.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
> index fb548d6a0aae..71387811b281 100644
> --- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
> +++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
> @@ -1208,7 +1208,7 @@ static int rtrs_clt_read_req(struct rtrs_clt_io_req *req)
> ret = rtrs_map_sg_fr(req, count);
> if (ret < 0) {
> rtrs_err_rl(s,
> - "Read request failed, failed to map fast reg. data, err: %d\n",
> + "Read request failed, failed to map fast reg. data, err: %d\n",
> ret);
> ib_dma_unmap_sg(dev->ib_dev, req->sglist, req->sg_cnt,
> req->dir);
> --
> 2.25.1
>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH for-next 13/13] RDMA/rtrs-clt: Remove an extra space
2024-08-11 8:43 ` Leon Romanovsky
@ 2024-08-12 4:10 ` Jinpu Wang
2024-08-12 9:48 ` Zhu Yanjun
0 siblings, 1 reply; 29+ messages in thread
From: Jinpu Wang @ 2024-08-12 4:10 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Md Haris Iqbal, linux-rdma, bvanassche, jgg, Alexei Pastuchov,
Grzegorz Prajsner
On Sun, Aug 11, 2024 at 10:43 AM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Fri, Aug 09, 2024 at 03:15:38PM +0200, Md Haris Iqbal wrote:
> > From: Jack Wang <jinpu.wang@ionos.com>
> >
>
> No empty commit message, please provide a proper description.
This is really simple change, the subject should explain it clear, but
ok, I will extend it also in the commit message.
>
> Thanks
Thx & Regards
>
>
> > Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
> > Signed-off-by: Alexei Pastuchov <alexei.pastuchov@ionos.com>
> > Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
> > Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
> > ---
> > drivers/infiniband/ulp/rtrs/rtrs-clt.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
> > index fb548d6a0aae..71387811b281 100644
> > --- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
> > +++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
> > @@ -1208,7 +1208,7 @@ static int rtrs_clt_read_req(struct rtrs_clt_io_req *req)
> > ret = rtrs_map_sg_fr(req, count);
> > if (ret < 0) {
> > rtrs_err_rl(s,
> > - "Read request failed, failed to map fast reg. data, err: %d\n",
> > + "Read request failed, failed to map fast reg. data, err: %d\n",
> > ret);
> > ib_dma_unmap_sg(dev->ib_dev, req->sglist, req->sg_cnt,
> > req->dir);
> > --
> > 2.25.1
> >
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH for-next 13/13] RDMA/rtrs-clt: Remove an extra space
2024-08-12 4:10 ` Jinpu Wang
@ 2024-08-12 9:48 ` Zhu Yanjun
0 siblings, 0 replies; 29+ messages in thread
From: Zhu Yanjun @ 2024-08-12 9:48 UTC (permalink / raw)
To: Jinpu Wang, Leon Romanovsky
Cc: Md Haris Iqbal, linux-rdma, bvanassche, jgg, Alexei Pastuchov,
Grzegorz Prajsner
在 2024/8/12 12:10, Jinpu Wang 写道:
> On Sun, Aug 11, 2024 at 10:43 AM Leon Romanovsky <leon@kernel.org> wrote:
>>
>> On Fri, Aug 09, 2024 at 03:15:38PM +0200, Md Haris Iqbal wrote:
>>> From: Jack Wang <jinpu.wang@ionos.com>
>>>
>>
>> No empty commit message, please provide a proper description.
> This is really simple change, the subject should explain it clear, but
> ok, I will extend it also in the commit message.
For your reference, Adding "No functional changes" in commit log should
be OK. ^_^
Zhu Yanjun
>>
>> Thanks
> Thx & Regards
>>
>>
>>> Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
>>> Signed-off-by: Alexei Pastuchov <alexei.pastuchov@ionos.com>
>>> Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
>>> Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
>>> ---
>>> drivers/infiniband/ulp/rtrs/rtrs-clt.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
>>> index fb548d6a0aae..71387811b281 100644
>>> --- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
>>> +++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
>>> @@ -1208,7 +1208,7 @@ static int rtrs_clt_read_req(struct rtrs_clt_io_req *req)
>>> ret = rtrs_map_sg_fr(req, count);
>>> if (ret < 0) {
>>> rtrs_err_rl(s,
>>> - "Read request failed, failed to map fast reg. data, err: %d\n",
>>> + "Read request failed, failed to map fast reg. data, err: %d\n",
>>> ret);
>>> ib_dma_unmap_sg(dev->ib_dev, req->sglist, req->sg_cnt,
>>> req->dir);
>>> --
>>> 2.25.1
>>>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH for-next 01/13] RDMA/rtrs-srv: Make rtrs_srv_open fail if its a second call
2024-08-11 8:38 ` Leon Romanovsky
@ 2024-08-12 10:16 ` Haris Iqbal
2024-08-12 10:34 ` Leon Romanovsky
0 siblings, 1 reply; 29+ messages in thread
From: Haris Iqbal @ 2024-08-12 10:16 UTC (permalink / raw)
To: Leon Romanovsky
Cc: linux-rdma, bvanassche, jgg, jinpu.wang, Grzegorz Prajsner
On Sun, Aug 11, 2024 at 10:38 AM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Fri, Aug 09, 2024 at 03:15:26PM +0200, Md Haris Iqbal wrote:
> > Do not allow opening RTRS server if it is already in use and print
> > proper error message.
>
> 1. How is it even possible? I see only one call to rtrs_srv_open() and
> it is happening when the driver is loaded.
rtrs_srv_open() is NOT called during RTRS driver load. It is called
during RNBD driver load, which is a client which uses RTRS.
RTRS server currently works with only a single client. Hence if, while
in use by RNBD, another driver wants to use RTRS and calls
rtrs_srv_open(), it should fail.
> 2. You already print an error message, why do you need to add another
> one?
This patch adds only a single error print, in function
rtrs_srv_open(). And it has not other print message. Am I missing
something?
>
> Thanks
>
> >
> > Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
> > Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
> > Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
> > ---
> > drivers/infiniband/ulp/rtrs/rtrs-srv.c | 27 +++++++++++++++++++++++---
> > drivers/infiniband/ulp/rtrs/rtrs-srv.h | 1 +
> > 2 files changed, 25 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> > index 1d33efb8fb03..fb67b58a7f62 100644
> > --- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> > +++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> > @@ -2174,9 +2174,18 @@ struct rtrs_srv_ctx *rtrs_srv_open(struct rtrs_srv_ops *ops, u16 port)
> > struct rtrs_srv_ctx *ctx;
> > int err;
> >
> > + mutex_lock(&ib_ctx.rtrs_srv_ib_mutex);
> > + if (ib_ctx.srv_ctx) {
> > + pr_err("%s: Already in use.\n", __func__);
> > + ctx = ERR_PTR(-EEXIST);
> > + goto out;
> > + }
> > +
> > ctx = alloc_srv_ctx(ops);
> > - if (!ctx)
> > - return ERR_PTR(-ENOMEM);
> > + if (!ctx) {
> > + ctx = ERR_PTR(-ENOMEM);
> > + goto out;
> > + }
> >
> > mutex_init(&ib_ctx.ib_dev_mutex);
> > ib_ctx.srv_ctx = ctx;
> > @@ -2185,9 +2194,11 @@ struct rtrs_srv_ctx *rtrs_srv_open(struct rtrs_srv_ops *ops, u16 port)
> > err = ib_register_client(&rtrs_srv_client);
> > if (err) {
> > free_srv_ctx(ctx);
> > - return ERR_PTR(err);
> > + ctx = ERR_PTR(err);
> > }
> >
> > +out:
> > + mutex_unlock(&ib_ctx.rtrs_srv_ib_mutex);
> > return ctx;
> > }
> > EXPORT_SYMBOL(rtrs_srv_open);
> > @@ -2221,10 +2232,16 @@ static void close_ctx(struct rtrs_srv_ctx *ctx)
> > */
> > void rtrs_srv_close(struct rtrs_srv_ctx *ctx)
> > {
> > + mutex_lock(&ib_ctx.rtrs_srv_ib_mutex);
> > + WARN_ON(ib_ctx.srv_ctx != ctx);
> > +
> > ib_unregister_client(&rtrs_srv_client);
> > mutex_destroy(&ib_ctx.ib_dev_mutex);
> > close_ctx(ctx);
> > free_srv_ctx(ctx);
> > +
> > + ib_ctx.srv_ctx = NULL;
> > + mutex_unlock(&ib_ctx.rtrs_srv_ib_mutex);
> > }
> > EXPORT_SYMBOL(rtrs_srv_close);
> >
> > @@ -2282,6 +2299,9 @@ static int __init rtrs_server_init(void)
> > goto out_dev_class;
> > }
> >
> > + mutex_init(&ib_ctx.rtrs_srv_ib_mutex);
> > + ib_ctx.srv_ctx = NULL;
> > +
> > return 0;
> >
> > out_dev_class:
> > @@ -2292,6 +2312,7 @@ static int __init rtrs_server_init(void)
> >
> > static void __exit rtrs_server_exit(void)
> > {
> > + mutex_destroy(&ib_ctx.rtrs_srv_ib_mutex);
> > destroy_workqueue(rtrs_wq);
> > class_unregister(&rtrs_dev_class);
> > rtrs_rdma_dev_pd_deinit(&dev_pd);
> > diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.h b/drivers/infiniband/ulp/rtrs/rtrs-srv.h
> > index 5e325b82ff33..4924dde0a708 100644
> > --- a/drivers/infiniband/ulp/rtrs/rtrs-srv.h
> > +++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.h
> > @@ -127,6 +127,7 @@ struct rtrs_srv_ib_ctx {
> > u16 port;
> > struct mutex ib_dev_mutex;
> > int ib_dev_count;
> > + struct mutex rtrs_srv_ib_mutex;
> > };
> >
> > extern const struct class rtrs_dev_class;
> > --
> > 2.25.1
> >
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH for-next 01/13] RDMA/rtrs-srv: Make rtrs_srv_open fail if its a second call
2024-08-12 10:16 ` Haris Iqbal
@ 2024-08-12 10:34 ` Leon Romanovsky
2024-08-12 10:39 ` Haris Iqbal
0 siblings, 1 reply; 29+ messages in thread
From: Leon Romanovsky @ 2024-08-12 10:34 UTC (permalink / raw)
To: Haris Iqbal; +Cc: linux-rdma, bvanassche, jgg, jinpu.wang, Grzegorz Prajsner
On Mon, Aug 12, 2024 at 12:16:19PM +0200, Haris Iqbal wrote:
> On Sun, Aug 11, 2024 at 10:38 AM Leon Romanovsky <leon@kernel.org> wrote:
> >
> > On Fri, Aug 09, 2024 at 03:15:26PM +0200, Md Haris Iqbal wrote:
> > > Do not allow opening RTRS server if it is already in use and print
> > > proper error message.
> >
> > 1. How is it even possible? I see only one call to rtrs_srv_open() and
> > it is happening when the driver is loaded.
>
> rtrs_srv_open() is NOT called during RTRS driver load. It is called
> during RNBD driver load, which is a client which uses RTRS.
> RTRS server currently works with only a single client. Hence if, while
> in use by RNBD, another driver wants to use RTRS and calls
> rtrs_srv_open(), it should fail.
➜ kernel git:(rdma-next) ✗ git grep rtrs_srv_open
drivers/block/rnbd/rnbd-srv.c: rtrs_ctx = rtrs_srv_open(&rtrs_ops, port_nr); <---- SINGLE CALL
drivers/block/rnbd/rnbd-srv.c: pr_err("rtrs_srv_open(), err: %pe\n", rtrs_ctx);
drivers/infiniband/ulp/rtrs/rtrs-srv.c: * rtrs_srv_open() - open RTRS server context
drivers/infiniband/ulp/rtrs/rtrs-srv.c:struct rtrs_srv_ctx *rtrs_srv_open(struct rtrs_srv_ops *ops, u16 port)
drivers/infiniband/ulp/rtrs/rtrs-srv.c:EXPORT_SYMBOL(rtrs_srv_open);
drivers/infiniband/ulp/rtrs/rtrs.h:struct rtrs_srv_ctx *rtrs_srv_open(struct rtrs_srv_ops *ops, u16 port);
807 static int __init rnbd_srv_init_module(void)
808 {
809 int err = 0;
810
811 BUILD_BUG_ON(sizeof(struct rnbd_msg_hdr) != 4);
812 BUILD_BUG_ON(sizeof(struct rnbd_msg_sess_info) != 36);
813 BUILD_BUG_ON(sizeof(struct rnbd_msg_sess_info_rsp) != 36);
814 BUILD_BUG_ON(sizeof(struct rnbd_msg_open) != 264);
815 BUILD_BUG_ON(sizeof(struct rnbd_msg_close) != 8);
816 BUILD_BUG_ON(sizeof(struct rnbd_msg_open_rsp) != 56);
817 rtrs_ops = (struct rtrs_srv_ops) {
818 .rdma_ev = rnbd_srv_rdma_ev,
819 .link_ev = rnbd_srv_link_ev,
820 };
821 rtrs_ctx = rtrs_srv_open(&rtrs_ops, port_nr);
822 if (IS_ERR(rtrs_ctx)) {
823 pr_err("rtrs_srv_open(), err: %pe\n", rtrs_ctx); <---- ALREADY PRINTED ERROR
824 return PTR_ERR(rtrs_ctx);
825 }
...
843 module_init(rnbd_srv_init_module); <---- SINGLE CALL
Upstream code has only on RNBD and one RTRS.
Thanks
>
> > 2. You already print an error message, why do you need to add another
> > one?
>
> This patch adds only a single error print, in function
> rtrs_srv_open(). And it has not other print message. Am I missing
> something?
>
> >
> > Thanks
> >
> > >
> > > Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
> > > Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
> > > Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
> > > ---
> > > drivers/infiniband/ulp/rtrs/rtrs-srv.c | 27 +++++++++++++++++++++++---
> > > drivers/infiniband/ulp/rtrs/rtrs-srv.h | 1 +
> > > 2 files changed, 25 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> > > index 1d33efb8fb03..fb67b58a7f62 100644
> > > --- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> > > +++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> > > @@ -2174,9 +2174,18 @@ struct rtrs_srv_ctx *rtrs_srv_open(struct rtrs_srv_ops *ops, u16 port)
> > > struct rtrs_srv_ctx *ctx;
> > > int err;
> > >
> > > + mutex_lock(&ib_ctx.rtrs_srv_ib_mutex);
> > > + if (ib_ctx.srv_ctx) {
> > > + pr_err("%s: Already in use.\n", __func__);
> > > + ctx = ERR_PTR(-EEXIST);
> > > + goto out;
> > > + }
> > > +
> > > ctx = alloc_srv_ctx(ops);
> > > - if (!ctx)
> > > - return ERR_PTR(-ENOMEM);
> > > + if (!ctx) {
> > > + ctx = ERR_PTR(-ENOMEM);
> > > + goto out;
> > > + }
> > >
> > > mutex_init(&ib_ctx.ib_dev_mutex);
> > > ib_ctx.srv_ctx = ctx;
> > > @@ -2185,9 +2194,11 @@ struct rtrs_srv_ctx *rtrs_srv_open(struct rtrs_srv_ops *ops, u16 port)
> > > err = ib_register_client(&rtrs_srv_client);
> > > if (err) {
> > > free_srv_ctx(ctx);
> > > - return ERR_PTR(err);
> > > + ctx = ERR_PTR(err);
> > > }
> > >
> > > +out:
> > > + mutex_unlock(&ib_ctx.rtrs_srv_ib_mutex);
> > > return ctx;
> > > }
> > > EXPORT_SYMBOL(rtrs_srv_open);
> > > @@ -2221,10 +2232,16 @@ static void close_ctx(struct rtrs_srv_ctx *ctx)
> > > */
> > > void rtrs_srv_close(struct rtrs_srv_ctx *ctx)
> > > {
> > > + mutex_lock(&ib_ctx.rtrs_srv_ib_mutex);
> > > + WARN_ON(ib_ctx.srv_ctx != ctx);
> > > +
> > > ib_unregister_client(&rtrs_srv_client);
> > > mutex_destroy(&ib_ctx.ib_dev_mutex);
> > > close_ctx(ctx);
> > > free_srv_ctx(ctx);
> > > +
> > > + ib_ctx.srv_ctx = NULL;
> > > + mutex_unlock(&ib_ctx.rtrs_srv_ib_mutex);
> > > }
> > > EXPORT_SYMBOL(rtrs_srv_close);
> > >
> > > @@ -2282,6 +2299,9 @@ static int __init rtrs_server_init(void)
> > > goto out_dev_class;
> > > }
> > >
> > > + mutex_init(&ib_ctx.rtrs_srv_ib_mutex);
> > > + ib_ctx.srv_ctx = NULL;
> > > +
> > > return 0;
> > >
> > > out_dev_class:
> > > @@ -2292,6 +2312,7 @@ static int __init rtrs_server_init(void)
> > >
> > > static void __exit rtrs_server_exit(void)
> > > {
> > > + mutex_destroy(&ib_ctx.rtrs_srv_ib_mutex);
> > > destroy_workqueue(rtrs_wq);
> > > class_unregister(&rtrs_dev_class);
> > > rtrs_rdma_dev_pd_deinit(&dev_pd);
> > > diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.h b/drivers/infiniband/ulp/rtrs/rtrs-srv.h
> > > index 5e325b82ff33..4924dde0a708 100644
> > > --- a/drivers/infiniband/ulp/rtrs/rtrs-srv.h
> > > +++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.h
> > > @@ -127,6 +127,7 @@ struct rtrs_srv_ib_ctx {
> > > u16 port;
> > > struct mutex ib_dev_mutex;
> > > int ib_dev_count;
> > > + struct mutex rtrs_srv_ib_mutex;
> > > };
> > >
> > > extern const struct class rtrs_dev_class;
> > > --
> > > 2.25.1
> > >
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH for-next 01/13] RDMA/rtrs-srv: Make rtrs_srv_open fail if its a second call
2024-08-12 10:34 ` Leon Romanovsky
@ 2024-08-12 10:39 ` Haris Iqbal
2024-08-12 10:59 ` Leon Romanovsky
0 siblings, 1 reply; 29+ messages in thread
From: Haris Iqbal @ 2024-08-12 10:39 UTC (permalink / raw)
To: Leon Romanovsky
Cc: linux-rdma, bvanassche, jgg, jinpu.wang, Grzegorz Prajsner
On Mon, Aug 12, 2024 at 12:34 PM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Mon, Aug 12, 2024 at 12:16:19PM +0200, Haris Iqbal wrote:
> > On Sun, Aug 11, 2024 at 10:38 AM Leon Romanovsky <leon@kernel.org> wrote:
> > >
> > > On Fri, Aug 09, 2024 at 03:15:26PM +0200, Md Haris Iqbal wrote:
> > > > Do not allow opening RTRS server if it is already in use and print
> > > > proper error message.
> > >
> > > 1. How is it even possible? I see only one call to rtrs_srv_open() and
> > > it is happening when the driver is loaded.
> >
> > rtrs_srv_open() is NOT called during RTRS driver load. It is called
> > during RNBD driver load, which is a client which uses RTRS.
> > RTRS server currently works with only a single client. Hence if, while
> > in use by RNBD, another driver wants to use RTRS and calls
> > rtrs_srv_open(), it should fail.
>
> ➜ kernel git:(rdma-next) ✗ git grep rtrs_srv_open
> drivers/block/rnbd/rnbd-srv.c: rtrs_ctx = rtrs_srv_open(&rtrs_ops, port_nr); <---- SINGLE CALL
> drivers/block/rnbd/rnbd-srv.c: pr_err("rtrs_srv_open(), err: %pe\n", rtrs_ctx);
> drivers/infiniband/ulp/rtrs/rtrs-srv.c: * rtrs_srv_open() - open RTRS server context
> drivers/infiniband/ulp/rtrs/rtrs-srv.c:struct rtrs_srv_ctx *rtrs_srv_open(struct rtrs_srv_ops *ops, u16 port)
> drivers/infiniband/ulp/rtrs/rtrs-srv.c:EXPORT_SYMBOL(rtrs_srv_open);
> drivers/infiniband/ulp/rtrs/rtrs.h:struct rtrs_srv_ctx *rtrs_srv_open(struct rtrs_srv_ops *ops, u16 port);
>
> 807 static int __init rnbd_srv_init_module(void)
> 808 {
> 809 int err = 0;
> 810
> 811 BUILD_BUG_ON(sizeof(struct rnbd_msg_hdr) != 4);
> 812 BUILD_BUG_ON(sizeof(struct rnbd_msg_sess_info) != 36);
> 813 BUILD_BUG_ON(sizeof(struct rnbd_msg_sess_info_rsp) != 36);
> 814 BUILD_BUG_ON(sizeof(struct rnbd_msg_open) != 264);
> 815 BUILD_BUG_ON(sizeof(struct rnbd_msg_close) != 8);
> 816 BUILD_BUG_ON(sizeof(struct rnbd_msg_open_rsp) != 56);
> 817 rtrs_ops = (struct rtrs_srv_ops) {
> 818 .rdma_ev = rnbd_srv_rdma_ev,
> 819 .link_ev = rnbd_srv_link_ev,
> 820 };
> 821 rtrs_ctx = rtrs_srv_open(&rtrs_ops, port_nr);
> 822 if (IS_ERR(rtrs_ctx)) {
> 823 pr_err("rtrs_srv_open(), err: %pe\n", rtrs_ctx); <---- ALREADY PRINTED ERROR
> 824 return PTR_ERR(rtrs_ctx);
> 825 }
>
> ...
>
> 843 module_init(rnbd_srv_init_module); <---- SINGLE CALL
>
> Upstream code has only on RNBD and one RTRS.
Yes. But they are different drivers. RTRS as a stand-alone ULP does
not know about RNBD or for that matter any other client driver, which
may use it, either out of tree or in the future. If RTRS can serve
only a single client, then it should should have protection for
multiple calls to *_open().
>
> Thanks
>
>
> >
> > > 2. You already print an error message, why do you need to add another
> > > one?
> >
> > This patch adds only a single error print, in function
> > rtrs_srv_open(). And it has not other print message. Am I missing
> > something?
> >
> > >
> > > Thanks
> > >
> > > >
> > > > Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
> > > > Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
> > > > Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
> > > > ---
> > > > drivers/infiniband/ulp/rtrs/rtrs-srv.c | 27 +++++++++++++++++++++++---
> > > > drivers/infiniband/ulp/rtrs/rtrs-srv.h | 1 +
> > > > 2 files changed, 25 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> > > > index 1d33efb8fb03..fb67b58a7f62 100644
> > > > --- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> > > > +++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> > > > @@ -2174,9 +2174,18 @@ struct rtrs_srv_ctx *rtrs_srv_open(struct rtrs_srv_ops *ops, u16 port)
> > > > struct rtrs_srv_ctx *ctx;
> > > > int err;
> > > >
> > > > + mutex_lock(&ib_ctx.rtrs_srv_ib_mutex);
> > > > + if (ib_ctx.srv_ctx) {
> > > > + pr_err("%s: Already in use.\n", __func__);
> > > > + ctx = ERR_PTR(-EEXIST);
> > > > + goto out;
> > > > + }
> > > > +
> > > > ctx = alloc_srv_ctx(ops);
> > > > - if (!ctx)
> > > > - return ERR_PTR(-ENOMEM);
> > > > + if (!ctx) {
> > > > + ctx = ERR_PTR(-ENOMEM);
> > > > + goto out;
> > > > + }
> > > >
> > > > mutex_init(&ib_ctx.ib_dev_mutex);
> > > > ib_ctx.srv_ctx = ctx;
> > > > @@ -2185,9 +2194,11 @@ struct rtrs_srv_ctx *rtrs_srv_open(struct rtrs_srv_ops *ops, u16 port)
> > > > err = ib_register_client(&rtrs_srv_client);
> > > > if (err) {
> > > > free_srv_ctx(ctx);
> > > > - return ERR_PTR(err);
> > > > + ctx = ERR_PTR(err);
> > > > }
> > > >
> > > > +out:
> > > > + mutex_unlock(&ib_ctx.rtrs_srv_ib_mutex);
> > > > return ctx;
> > > > }
> > > > EXPORT_SYMBOL(rtrs_srv_open);
> > > > @@ -2221,10 +2232,16 @@ static void close_ctx(struct rtrs_srv_ctx *ctx)
> > > > */
> > > > void rtrs_srv_close(struct rtrs_srv_ctx *ctx)
> > > > {
> > > > + mutex_lock(&ib_ctx.rtrs_srv_ib_mutex);
> > > > + WARN_ON(ib_ctx.srv_ctx != ctx);
> > > > +
> > > > ib_unregister_client(&rtrs_srv_client);
> > > > mutex_destroy(&ib_ctx.ib_dev_mutex);
> > > > close_ctx(ctx);
> > > > free_srv_ctx(ctx);
> > > > +
> > > > + ib_ctx.srv_ctx = NULL;
> > > > + mutex_unlock(&ib_ctx.rtrs_srv_ib_mutex);
> > > > }
> > > > EXPORT_SYMBOL(rtrs_srv_close);
> > > >
> > > > @@ -2282,6 +2299,9 @@ static int __init rtrs_server_init(void)
> > > > goto out_dev_class;
> > > > }
> > > >
> > > > + mutex_init(&ib_ctx.rtrs_srv_ib_mutex);
> > > > + ib_ctx.srv_ctx = NULL;
> > > > +
> > > > return 0;
> > > >
> > > > out_dev_class:
> > > > @@ -2292,6 +2312,7 @@ static int __init rtrs_server_init(void)
> > > >
> > > > static void __exit rtrs_server_exit(void)
> > > > {
> > > > + mutex_destroy(&ib_ctx.rtrs_srv_ib_mutex);
> > > > destroy_workqueue(rtrs_wq);
> > > > class_unregister(&rtrs_dev_class);
> > > > rtrs_rdma_dev_pd_deinit(&dev_pd);
> > > > diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.h b/drivers/infiniband/ulp/rtrs/rtrs-srv.h
> > > > index 5e325b82ff33..4924dde0a708 100644
> > > > --- a/drivers/infiniband/ulp/rtrs/rtrs-srv.h
> > > > +++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.h
> > > > @@ -127,6 +127,7 @@ struct rtrs_srv_ib_ctx {
> > > > u16 port;
> > > > struct mutex ib_dev_mutex;
> > > > int ib_dev_count;
> > > > + struct mutex rtrs_srv_ib_mutex;
> > > > };
> > > >
> > > > extern const struct class rtrs_dev_class;
> > > > --
> > > > 2.25.1
> > > >
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH for-next 02/13] RDMA/rtrs-srv: Fix use-after-free during session establishment
2024-08-11 8:41 ` Leon Romanovsky
@ 2024-08-12 10:52 ` Jinpu Wang
2024-08-12 11:00 ` Leon Romanovsky
0 siblings, 1 reply; 29+ messages in thread
From: Jinpu Wang @ 2024-08-12 10:52 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Md Haris Iqbal, linux-rdma, bvanassche, jgg, Grzegorz Prajsner
Hi,
On Sun, Aug 11, 2024 at 10:41 AM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Fri, Aug 09, 2024 at 03:15:27PM +0200, Md Haris Iqbal wrote:
> > From: Jack Wang <jinpu.wang@ionos.com>
> >
> > In case of error happening during session stablishment, close_work is
> > running. A new RDMA CM event may arrive since we don't destroy cm_id
> > before destroying qp. To fix this, we first destroy cm_id after drain_qp,
> > so no new RDMA CM event will arrive afterwards.
> >
> > Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
> > Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
> > Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
> > Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
> > ---
> > drivers/infiniband/ulp/rtrs/rtrs-srv.c | 2 +-
> > drivers/infiniband/ulp/rtrs/rtrs.c | 2 +-
> > 2 files changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> > index fb67b58a7f62..90ea25ad6720 100644
> > --- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> > +++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> > @@ -1540,6 +1540,7 @@ static void rtrs_srv_close_work(struct work_struct *work)
> > con = to_srv_con(srv_path->s.con[i]);
> > rdma_disconnect(con->c.cm_id);
> > ib_drain_qp(con->c.qp);
> > + rdma_destroy_id(con->c.cm_id);
> > }
> >
> > /*
> > @@ -1564,7 +1565,6 @@ static void rtrs_srv_close_work(struct work_struct *work)
> > continue;
> > con = to_srv_con(srv_path->s.con[i]);
> > rtrs_cq_qp_destroy(&con->c);
> > - rdma_destroy_id(con->c.cm_id);
> > kfree(con);
> > }
> > rtrs_ib_dev_put(srv_path->s.dev);
> > diff --git a/drivers/infiniband/ulp/rtrs/rtrs.c b/drivers/infiniband/ulp/rtrs/rtrs.c
> > index 4e17d546d4cc..44167fd1c958 100644
> > --- a/drivers/infiniband/ulp/rtrs/rtrs.c
> > +++ b/drivers/infiniband/ulp/rtrs/rtrs.c
> > @@ -318,7 +318,7 @@ EXPORT_SYMBOL_GPL(rtrs_cq_qp_create);
> > void rtrs_cq_qp_destroy(struct rtrs_con *con)
> > {
> > if (con->qp) {
> > - rdma_destroy_qp(con->cm_id);
> > + ib_destroy_qp(con->qp);
>
> You created that QP with rdma_create_qp() and you should destroy it with rdma_destroy_qp().
We can't do it, as we move rdma_destroy_id before rtrs_cq_qp_destroy,
if we still call rdma_destroy_qp, which will lead to use after free as
cm_id could already be free-ed.
>
> Thanks
Thx!
>
> > con->qp = NULL;
> > }
> > destroy_cq(con);
> > --
> > 2.25.1
> >
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH for-next 01/13] RDMA/rtrs-srv: Make rtrs_srv_open fail if its a second call
2024-08-12 10:39 ` Haris Iqbal
@ 2024-08-12 10:59 ` Leon Romanovsky
2024-08-12 11:17 ` Haris Iqbal
0 siblings, 1 reply; 29+ messages in thread
From: Leon Romanovsky @ 2024-08-12 10:59 UTC (permalink / raw)
To: Haris Iqbal; +Cc: linux-rdma, bvanassche, jgg, jinpu.wang, Grzegorz Prajsner
On Mon, Aug 12, 2024 at 12:39:06PM +0200, Haris Iqbal wrote:
> On Mon, Aug 12, 2024 at 12:34 PM Leon Romanovsky <leon@kernel.org> wrote:
> >
> > On Mon, Aug 12, 2024 at 12:16:19PM +0200, Haris Iqbal wrote:
> > > On Sun, Aug 11, 2024 at 10:38 AM Leon Romanovsky <leon@kernel.org> wrote:
> > > >
> > > > On Fri, Aug 09, 2024 at 03:15:26PM +0200, Md Haris Iqbal wrote:
> > > > > Do not allow opening RTRS server if it is already in use and print
> > > > > proper error message.
> > > >
> > > > 1. How is it even possible? I see only one call to rtrs_srv_open() and
> > > > it is happening when the driver is loaded.
> > >
> > > rtrs_srv_open() is NOT called during RTRS driver load. It is called
> > > during RNBD driver load, which is a client which uses RTRS.
> > > RTRS server currently works with only a single client. Hence if, while
> > > in use by RNBD, another driver wants to use RTRS and calls
> > > rtrs_srv_open(), it should fail.
> >
> > ➜ kernel git:(rdma-next) ✗ git grep rtrs_srv_open
> > drivers/block/rnbd/rnbd-srv.c: rtrs_ctx = rtrs_srv_open(&rtrs_ops, port_nr); <---- SINGLE CALL
> > drivers/block/rnbd/rnbd-srv.c: pr_err("rtrs_srv_open(), err: %pe\n", rtrs_ctx);
> > drivers/infiniband/ulp/rtrs/rtrs-srv.c: * rtrs_srv_open() - open RTRS server context
> > drivers/infiniband/ulp/rtrs/rtrs-srv.c:struct rtrs_srv_ctx *rtrs_srv_open(struct rtrs_srv_ops *ops, u16 port)
> > drivers/infiniband/ulp/rtrs/rtrs-srv.c:EXPORT_SYMBOL(rtrs_srv_open);
> > drivers/infiniband/ulp/rtrs/rtrs.h:struct rtrs_srv_ctx *rtrs_srv_open(struct rtrs_srv_ops *ops, u16 port);
> >
> > 807 static int __init rnbd_srv_init_module(void)
> > 808 {
> > 809 int err = 0;
> > 810
> > 811 BUILD_BUG_ON(sizeof(struct rnbd_msg_hdr) != 4);
> > 812 BUILD_BUG_ON(sizeof(struct rnbd_msg_sess_info) != 36);
> > 813 BUILD_BUG_ON(sizeof(struct rnbd_msg_sess_info_rsp) != 36);
> > 814 BUILD_BUG_ON(sizeof(struct rnbd_msg_open) != 264);
> > 815 BUILD_BUG_ON(sizeof(struct rnbd_msg_close) != 8);
> > 816 BUILD_BUG_ON(sizeof(struct rnbd_msg_open_rsp) != 56);
> > 817 rtrs_ops = (struct rtrs_srv_ops) {
> > 818 .rdma_ev = rnbd_srv_rdma_ev,
> > 819 .link_ev = rnbd_srv_link_ev,
> > 820 };
> > 821 rtrs_ctx = rtrs_srv_open(&rtrs_ops, port_nr);
> > 822 if (IS_ERR(rtrs_ctx)) {
> > 823 pr_err("rtrs_srv_open(), err: %pe\n", rtrs_ctx); <---- ALREADY PRINTED ERROR
> > 824 return PTR_ERR(rtrs_ctx);
> > 825 }
> >
> > ...
> >
> > 843 module_init(rnbd_srv_init_module); <---- SINGLE CALL
> >
> > Upstream code has only on RNBD and one RTRS.
>
> Yes. But they are different drivers. RTRS as a stand-alone ULP does
> not know about RNBD or for that matter any other client driver, which
> may use it, either out of tree or in the future. If RTRS can serve
> only a single client, then it should should have protection for
> multiple calls to *_open().
For now, there is only one upstream client and server.
I want to remind you that during initial submission of RTR code, the
feedback was that this ULP shouldn't exist in first place and right
thing to do it is to use NVMe over fabrics.
So chances that we will have real out-of-tree client are very low.
Thanks
>
> >
> > Thanks
> >
> >
> > >
> > > > 2. You already print an error message, why do you need to add another
> > > > one?
> > >
> > > This patch adds only a single error print, in function
> > > rtrs_srv_open(). And it has not other print message. Am I missing
> > > something?
> > >
> > > >
> > > > Thanks
> > > >
> > > > >
> > > > > Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
> > > > > Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
> > > > > Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
> > > > > ---
> > > > > drivers/infiniband/ulp/rtrs/rtrs-srv.c | 27 +++++++++++++++++++++++---
> > > > > drivers/infiniband/ulp/rtrs/rtrs-srv.h | 1 +
> > > > > 2 files changed, 25 insertions(+), 3 deletions(-)
> > > > >
> > > > > diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> > > > > index 1d33efb8fb03..fb67b58a7f62 100644
> > > > > --- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> > > > > +++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> > > > > @@ -2174,9 +2174,18 @@ struct rtrs_srv_ctx *rtrs_srv_open(struct rtrs_srv_ops *ops, u16 port)
> > > > > struct rtrs_srv_ctx *ctx;
> > > > > int err;
> > > > >
> > > > > + mutex_lock(&ib_ctx.rtrs_srv_ib_mutex);
> > > > > + if (ib_ctx.srv_ctx) {
> > > > > + pr_err("%s: Already in use.\n", __func__);
> > > > > + ctx = ERR_PTR(-EEXIST);
> > > > > + goto out;
> > > > > + }
> > > > > +
> > > > > ctx = alloc_srv_ctx(ops);
> > > > > - if (!ctx)
> > > > > - return ERR_PTR(-ENOMEM);
> > > > > + if (!ctx) {
> > > > > + ctx = ERR_PTR(-ENOMEM);
> > > > > + goto out;
> > > > > + }
> > > > >
> > > > > mutex_init(&ib_ctx.ib_dev_mutex);
> > > > > ib_ctx.srv_ctx = ctx;
> > > > > @@ -2185,9 +2194,11 @@ struct rtrs_srv_ctx *rtrs_srv_open(struct rtrs_srv_ops *ops, u16 port)
> > > > > err = ib_register_client(&rtrs_srv_client);
> > > > > if (err) {
> > > > > free_srv_ctx(ctx);
> > > > > - return ERR_PTR(err);
> > > > > + ctx = ERR_PTR(err);
> > > > > }
> > > > >
> > > > > +out:
> > > > > + mutex_unlock(&ib_ctx.rtrs_srv_ib_mutex);
> > > > > return ctx;
> > > > > }
> > > > > EXPORT_SYMBOL(rtrs_srv_open);
> > > > > @@ -2221,10 +2232,16 @@ static void close_ctx(struct rtrs_srv_ctx *ctx)
> > > > > */
> > > > > void rtrs_srv_close(struct rtrs_srv_ctx *ctx)
> > > > > {
> > > > > + mutex_lock(&ib_ctx.rtrs_srv_ib_mutex);
> > > > > + WARN_ON(ib_ctx.srv_ctx != ctx);
> > > > > +
> > > > > ib_unregister_client(&rtrs_srv_client);
> > > > > mutex_destroy(&ib_ctx.ib_dev_mutex);
> > > > > close_ctx(ctx);
> > > > > free_srv_ctx(ctx);
> > > > > +
> > > > > + ib_ctx.srv_ctx = NULL;
> > > > > + mutex_unlock(&ib_ctx.rtrs_srv_ib_mutex);
> > > > > }
> > > > > EXPORT_SYMBOL(rtrs_srv_close);
> > > > >
> > > > > @@ -2282,6 +2299,9 @@ static int __init rtrs_server_init(void)
> > > > > goto out_dev_class;
> > > > > }
> > > > >
> > > > > + mutex_init(&ib_ctx.rtrs_srv_ib_mutex);
> > > > > + ib_ctx.srv_ctx = NULL;
> > > > > +
> > > > > return 0;
> > > > >
> > > > > out_dev_class:
> > > > > @@ -2292,6 +2312,7 @@ static int __init rtrs_server_init(void)
> > > > >
> > > > > static void __exit rtrs_server_exit(void)
> > > > > {
> > > > > + mutex_destroy(&ib_ctx.rtrs_srv_ib_mutex);
> > > > > destroy_workqueue(rtrs_wq);
> > > > > class_unregister(&rtrs_dev_class);
> > > > > rtrs_rdma_dev_pd_deinit(&dev_pd);
> > > > > diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.h b/drivers/infiniband/ulp/rtrs/rtrs-srv.h
> > > > > index 5e325b82ff33..4924dde0a708 100644
> > > > > --- a/drivers/infiniband/ulp/rtrs/rtrs-srv.h
> > > > > +++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.h
> > > > > @@ -127,6 +127,7 @@ struct rtrs_srv_ib_ctx {
> > > > > u16 port;
> > > > > struct mutex ib_dev_mutex;
> > > > > int ib_dev_count;
> > > > > + struct mutex rtrs_srv_ib_mutex;
> > > > > };
> > > > >
> > > > > extern const struct class rtrs_dev_class;
> > > > > --
> > > > > 2.25.1
> > > > >
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH for-next 02/13] RDMA/rtrs-srv: Fix use-after-free during session establishment
2024-08-12 10:52 ` Jinpu Wang
@ 2024-08-12 11:00 ` Leon Romanovsky
2024-08-13 10:54 ` Jinpu Wang
0 siblings, 1 reply; 29+ messages in thread
From: Leon Romanovsky @ 2024-08-12 11:00 UTC (permalink / raw)
To: Jinpu Wang; +Cc: Md Haris Iqbal, linux-rdma, bvanassche, jgg, Grzegorz Prajsner
On Mon, Aug 12, 2024 at 12:52:25PM +0200, Jinpu Wang wrote:
> Hi,
>
> On Sun, Aug 11, 2024 at 10:41 AM Leon Romanovsky <leon@kernel.org> wrote:
> >
> > On Fri, Aug 09, 2024 at 03:15:27PM +0200, Md Haris Iqbal wrote:
> > > From: Jack Wang <jinpu.wang@ionos.com>
> > >
> > > In case of error happening during session stablishment, close_work is
> > > running. A new RDMA CM event may arrive since we don't destroy cm_id
> > > before destroying qp. To fix this, we first destroy cm_id after drain_qp,
> > > so no new RDMA CM event will arrive afterwards.
> > >
> > > Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
> > > Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
> > > Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
> > > Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
> > > ---
> > > drivers/infiniband/ulp/rtrs/rtrs-srv.c | 2 +-
> > > drivers/infiniband/ulp/rtrs/rtrs.c | 2 +-
> > > 2 files changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> > > index fb67b58a7f62..90ea25ad6720 100644
> > > --- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> > > +++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> > > @@ -1540,6 +1540,7 @@ static void rtrs_srv_close_work(struct work_struct *work)
> > > con = to_srv_con(srv_path->s.con[i]);
> > > rdma_disconnect(con->c.cm_id);
> > > ib_drain_qp(con->c.qp);
> > > + rdma_destroy_id(con->c.cm_id);
> > > }
> > >
> > > /*
> > > @@ -1564,7 +1565,6 @@ static void rtrs_srv_close_work(struct work_struct *work)
> > > continue;
> > > con = to_srv_con(srv_path->s.con[i]);
> > > rtrs_cq_qp_destroy(&con->c);
> > > - rdma_destroy_id(con->c.cm_id);
> > > kfree(con);
> > > }
> > > rtrs_ib_dev_put(srv_path->s.dev);
> > > diff --git a/drivers/infiniband/ulp/rtrs/rtrs.c b/drivers/infiniband/ulp/rtrs/rtrs.c
> > > index 4e17d546d4cc..44167fd1c958 100644
> > > --- a/drivers/infiniband/ulp/rtrs/rtrs.c
> > > +++ b/drivers/infiniband/ulp/rtrs/rtrs.c
> > > @@ -318,7 +318,7 @@ EXPORT_SYMBOL_GPL(rtrs_cq_qp_create);
> > > void rtrs_cq_qp_destroy(struct rtrs_con *con)
> > > {
> > > if (con->qp) {
> > > - rdma_destroy_qp(con->cm_id);
> > > + ib_destroy_qp(con->qp);
> >
> > You created that QP with rdma_create_qp() and you should destroy it with rdma_destroy_qp().
> We can't do it, as we move rdma_destroy_id before rtrs_cq_qp_destroy,
> if we still call rdma_destroy_qp, which will lead to use after free as
> cm_id could already be free-ed.
It is a hint that you are doing something wrong.
Thanks
>
> >
> > Thanks
> Thx!
> >
> > > con->qp = NULL;
> > > }
> > > destroy_cq(con);
> > > --
> > > 2.25.1
> > >
>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH for-next 01/13] RDMA/rtrs-srv: Make rtrs_srv_open fail if its a second call
2024-08-12 10:59 ` Leon Romanovsky
@ 2024-08-12 11:17 ` Haris Iqbal
2024-08-12 12:15 ` Leon Romanovsky
0 siblings, 1 reply; 29+ messages in thread
From: Haris Iqbal @ 2024-08-12 11:17 UTC (permalink / raw)
To: Leon Romanovsky
Cc: linux-rdma, bvanassche, jgg, jinpu.wang, Grzegorz Prajsner
On Mon, Aug 12, 2024 at 12:59 PM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Mon, Aug 12, 2024 at 12:39:06PM +0200, Haris Iqbal wrote:
> > On Mon, Aug 12, 2024 at 12:34 PM Leon Romanovsky <leon@kernel.org> wrote:
> > >
> > > On Mon, Aug 12, 2024 at 12:16:19PM +0200, Haris Iqbal wrote:
> > > > On Sun, Aug 11, 2024 at 10:38 AM Leon Romanovsky <leon@kernel.org> wrote:
> > > > >
> > > > > On Fri, Aug 09, 2024 at 03:15:26PM +0200, Md Haris Iqbal wrote:
> > > > > > Do not allow opening RTRS server if it is already in use and print
> > > > > > proper error message.
> > > > >
> > > > > 1. How is it even possible? I see only one call to rtrs_srv_open() and
> > > > > it is happening when the driver is loaded.
> > > >
> > > > rtrs_srv_open() is NOT called during RTRS driver load. It is called
> > > > during RNBD driver load, which is a client which uses RTRS.
> > > > RTRS server currently works with only a single client. Hence if, while
> > > > in use by RNBD, another driver wants to use RTRS and calls
> > > > rtrs_srv_open(), it should fail.
> > >
> > > ➜ kernel git:(rdma-next) ✗ git grep rtrs_srv_open
> > > drivers/block/rnbd/rnbd-srv.c: rtrs_ctx = rtrs_srv_open(&rtrs_ops, port_nr); <---- SINGLE CALL
> > > drivers/block/rnbd/rnbd-srv.c: pr_err("rtrs_srv_open(), err: %pe\n", rtrs_ctx);
> > > drivers/infiniband/ulp/rtrs/rtrs-srv.c: * rtrs_srv_open() - open RTRS server context
> > > drivers/infiniband/ulp/rtrs/rtrs-srv.c:struct rtrs_srv_ctx *rtrs_srv_open(struct rtrs_srv_ops *ops, u16 port)
> > > drivers/infiniband/ulp/rtrs/rtrs-srv.c:EXPORT_SYMBOL(rtrs_srv_open);
> > > drivers/infiniband/ulp/rtrs/rtrs.h:struct rtrs_srv_ctx *rtrs_srv_open(struct rtrs_srv_ops *ops, u16 port);
> > >
> > > 807 static int __init rnbd_srv_init_module(void)
> > > 808 {
> > > 809 int err = 0;
> > > 810
> > > 811 BUILD_BUG_ON(sizeof(struct rnbd_msg_hdr) != 4);
> > > 812 BUILD_BUG_ON(sizeof(struct rnbd_msg_sess_info) != 36);
> > > 813 BUILD_BUG_ON(sizeof(struct rnbd_msg_sess_info_rsp) != 36);
> > > 814 BUILD_BUG_ON(sizeof(struct rnbd_msg_open) != 264);
> > > 815 BUILD_BUG_ON(sizeof(struct rnbd_msg_close) != 8);
> > > 816 BUILD_BUG_ON(sizeof(struct rnbd_msg_open_rsp) != 56);
> > > 817 rtrs_ops = (struct rtrs_srv_ops) {
> > > 818 .rdma_ev = rnbd_srv_rdma_ev,
> > > 819 .link_ev = rnbd_srv_link_ev,
> > > 820 };
> > > 821 rtrs_ctx = rtrs_srv_open(&rtrs_ops, port_nr);
> > > 822 if (IS_ERR(rtrs_ctx)) {
> > > 823 pr_err("rtrs_srv_open(), err: %pe\n", rtrs_ctx); <---- ALREADY PRINTED ERROR
> > > 824 return PTR_ERR(rtrs_ctx);
> > > 825 }
> > >
> > > ...
> > >
> > > 843 module_init(rnbd_srv_init_module); <---- SINGLE CALL
> > >
> > > Upstream code has only on RNBD and one RTRS.
> >
> > Yes. But they are different drivers. RTRS as a stand-alone ULP does
> > not know about RNBD or for that matter any other client driver, which
> > may use it, either out of tree or in the future. If RTRS can serve
> > only a single client, then it should should have protection for
> > multiple calls to *_open().
>
> For now, there is only one upstream client and server.
In my understanding, its the general rule of abstraction that this
type of limitation is handled where it exists.
>
> I want to remind you that during initial submission of RTR code, the
> feedback was that this ULP shouldn't exist in first place and right
> thing to do it is to use NVMe over fabrics.
>
> So chances that we will have real out-of-tree client are very low.
One reason for us to write this patch is that we are working on
another client which uses RTRS. We could have kept this change
out-of-tree, but frankly, it felt right to add this protection.
>
> Thanks
>
> >
> > >
> > > Thanks
> > >
> > >
> > > >
> > > > > 2. You already print an error message, why do you need to add another
> > > > > one?
> > > >
> > > > This patch adds only a single error print, in function
> > > > rtrs_srv_open(). And it has not other print message. Am I missing
> > > > something?
> > > >
> > > > >
> > > > > Thanks
> > > > >
> > > > > >
> > > > > > Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
> > > > > > Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
> > > > > > Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
> > > > > > ---
> > > > > > drivers/infiniband/ulp/rtrs/rtrs-srv.c | 27 +++++++++++++++++++++++---
> > > > > > drivers/infiniband/ulp/rtrs/rtrs-srv.h | 1 +
> > > > > > 2 files changed, 25 insertions(+), 3 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> > > > > > index 1d33efb8fb03..fb67b58a7f62 100644
> > > > > > --- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> > > > > > +++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> > > > > > @@ -2174,9 +2174,18 @@ struct rtrs_srv_ctx *rtrs_srv_open(struct rtrs_srv_ops *ops, u16 port)
> > > > > > struct rtrs_srv_ctx *ctx;
> > > > > > int err;
> > > > > >
> > > > > > + mutex_lock(&ib_ctx.rtrs_srv_ib_mutex);
> > > > > > + if (ib_ctx.srv_ctx) {
> > > > > > + pr_err("%s: Already in use.\n", __func__);
> > > > > > + ctx = ERR_PTR(-EEXIST);
> > > > > > + goto out;
> > > > > > + }
> > > > > > +
> > > > > > ctx = alloc_srv_ctx(ops);
> > > > > > - if (!ctx)
> > > > > > - return ERR_PTR(-ENOMEM);
> > > > > > + if (!ctx) {
> > > > > > + ctx = ERR_PTR(-ENOMEM);
> > > > > > + goto out;
> > > > > > + }
> > > > > >
> > > > > > mutex_init(&ib_ctx.ib_dev_mutex);
> > > > > > ib_ctx.srv_ctx = ctx;
> > > > > > @@ -2185,9 +2194,11 @@ struct rtrs_srv_ctx *rtrs_srv_open(struct rtrs_srv_ops *ops, u16 port)
> > > > > > err = ib_register_client(&rtrs_srv_client);
> > > > > > if (err) {
> > > > > > free_srv_ctx(ctx);
> > > > > > - return ERR_PTR(err);
> > > > > > + ctx = ERR_PTR(err);
> > > > > > }
> > > > > >
> > > > > > +out:
> > > > > > + mutex_unlock(&ib_ctx.rtrs_srv_ib_mutex);
> > > > > > return ctx;
> > > > > > }
> > > > > > EXPORT_SYMBOL(rtrs_srv_open);
> > > > > > @@ -2221,10 +2232,16 @@ static void close_ctx(struct rtrs_srv_ctx *ctx)
> > > > > > */
> > > > > > void rtrs_srv_close(struct rtrs_srv_ctx *ctx)
> > > > > > {
> > > > > > + mutex_lock(&ib_ctx.rtrs_srv_ib_mutex);
> > > > > > + WARN_ON(ib_ctx.srv_ctx != ctx);
> > > > > > +
> > > > > > ib_unregister_client(&rtrs_srv_client);
> > > > > > mutex_destroy(&ib_ctx.ib_dev_mutex);
> > > > > > close_ctx(ctx);
> > > > > > free_srv_ctx(ctx);
> > > > > > +
> > > > > > + ib_ctx.srv_ctx = NULL;
> > > > > > + mutex_unlock(&ib_ctx.rtrs_srv_ib_mutex);
> > > > > > }
> > > > > > EXPORT_SYMBOL(rtrs_srv_close);
> > > > > >
> > > > > > @@ -2282,6 +2299,9 @@ static int __init rtrs_server_init(void)
> > > > > > goto out_dev_class;
> > > > > > }
> > > > > >
> > > > > > + mutex_init(&ib_ctx.rtrs_srv_ib_mutex);
> > > > > > + ib_ctx.srv_ctx = NULL;
> > > > > > +
> > > > > > return 0;
> > > > > >
> > > > > > out_dev_class:
> > > > > > @@ -2292,6 +2312,7 @@ static int __init rtrs_server_init(void)
> > > > > >
> > > > > > static void __exit rtrs_server_exit(void)
> > > > > > {
> > > > > > + mutex_destroy(&ib_ctx.rtrs_srv_ib_mutex);
> > > > > > destroy_workqueue(rtrs_wq);
> > > > > > class_unregister(&rtrs_dev_class);
> > > > > > rtrs_rdma_dev_pd_deinit(&dev_pd);
> > > > > > diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.h b/drivers/infiniband/ulp/rtrs/rtrs-srv.h
> > > > > > index 5e325b82ff33..4924dde0a708 100644
> > > > > > --- a/drivers/infiniband/ulp/rtrs/rtrs-srv.h
> > > > > > +++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.h
> > > > > > @@ -127,6 +127,7 @@ struct rtrs_srv_ib_ctx {
> > > > > > u16 port;
> > > > > > struct mutex ib_dev_mutex;
> > > > > > int ib_dev_count;
> > > > > > + struct mutex rtrs_srv_ib_mutex;
> > > > > > };
> > > > > >
> > > > > > extern const struct class rtrs_dev_class;
> > > > > > --
> > > > > > 2.25.1
> > > > > >
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH for-next 01/13] RDMA/rtrs-srv: Make rtrs_srv_open fail if its a second call
2024-08-12 11:17 ` Haris Iqbal
@ 2024-08-12 12:15 ` Leon Romanovsky
2024-08-13 9:42 ` Haris Iqbal
0 siblings, 1 reply; 29+ messages in thread
From: Leon Romanovsky @ 2024-08-12 12:15 UTC (permalink / raw)
To: Haris Iqbal; +Cc: linux-rdma, bvanassche, jgg, jinpu.wang, Grzegorz Prajsner
On Mon, Aug 12, 2024 at 01:17:11PM +0200, Haris Iqbal wrote:
> On Mon, Aug 12, 2024 at 12:59 PM Leon Romanovsky <leon@kernel.org> wrote:
> >
> > On Mon, Aug 12, 2024 at 12:39:06PM +0200, Haris Iqbal wrote:
> > > On Mon, Aug 12, 2024 at 12:34 PM Leon Romanovsky <leon@kernel.org> wrote:
> > > >
> > > > On Mon, Aug 12, 2024 at 12:16:19PM +0200, Haris Iqbal wrote:
> > > > > On Sun, Aug 11, 2024 at 10:38 AM Leon Romanovsky <leon@kernel.org> wrote:
> > > > > >
> > > > > > On Fri, Aug 09, 2024 at 03:15:26PM +0200, Md Haris Iqbal wrote:
> > > > > > > Do not allow opening RTRS server if it is already in use and print
> > > > > > > proper error message.
> > > > > >
> > > > > > 1. How is it even possible? I see only one call to rtrs_srv_open() and
> > > > > > it is happening when the driver is loaded.
> > > > >
> > > > > rtrs_srv_open() is NOT called during RTRS driver load. It is called
> > > > > during RNBD driver load, which is a client which uses RTRS.
> > > > > RTRS server currently works with only a single client. Hence if, while
> > > > > in use by RNBD, another driver wants to use RTRS and calls
> > > > > rtrs_srv_open(), it should fail.
> > > >
> > > > ➜ kernel git:(rdma-next) ✗ git grep rtrs_srv_open
> > > > drivers/block/rnbd/rnbd-srv.c: rtrs_ctx = rtrs_srv_open(&rtrs_ops, port_nr); <---- SINGLE CALL
> > > > drivers/block/rnbd/rnbd-srv.c: pr_err("rtrs_srv_open(), err: %pe\n", rtrs_ctx);
> > > > drivers/infiniband/ulp/rtrs/rtrs-srv.c: * rtrs_srv_open() - open RTRS server context
> > > > drivers/infiniband/ulp/rtrs/rtrs-srv.c:struct rtrs_srv_ctx *rtrs_srv_open(struct rtrs_srv_ops *ops, u16 port)
> > > > drivers/infiniband/ulp/rtrs/rtrs-srv.c:EXPORT_SYMBOL(rtrs_srv_open);
> > > > drivers/infiniband/ulp/rtrs/rtrs.h:struct rtrs_srv_ctx *rtrs_srv_open(struct rtrs_srv_ops *ops, u16 port);
> > > >
> > > > 807 static int __init rnbd_srv_init_module(void)
> > > > 808 {
> > > > 809 int err = 0;
> > > > 810
> > > > 811 BUILD_BUG_ON(sizeof(struct rnbd_msg_hdr) != 4);
> > > > 812 BUILD_BUG_ON(sizeof(struct rnbd_msg_sess_info) != 36);
> > > > 813 BUILD_BUG_ON(sizeof(struct rnbd_msg_sess_info_rsp) != 36);
> > > > 814 BUILD_BUG_ON(sizeof(struct rnbd_msg_open) != 264);
> > > > 815 BUILD_BUG_ON(sizeof(struct rnbd_msg_close) != 8);
> > > > 816 BUILD_BUG_ON(sizeof(struct rnbd_msg_open_rsp) != 56);
> > > > 817 rtrs_ops = (struct rtrs_srv_ops) {
> > > > 818 .rdma_ev = rnbd_srv_rdma_ev,
> > > > 819 .link_ev = rnbd_srv_link_ev,
> > > > 820 };
> > > > 821 rtrs_ctx = rtrs_srv_open(&rtrs_ops, port_nr);
> > > > 822 if (IS_ERR(rtrs_ctx)) {
> > > > 823 pr_err("rtrs_srv_open(), err: %pe\n", rtrs_ctx); <---- ALREADY PRINTED ERROR
> > > > 824 return PTR_ERR(rtrs_ctx);
> > > > 825 }
> > > >
> > > > ...
> > > >
> > > > 843 module_init(rnbd_srv_init_module); <---- SINGLE CALL
> > > >
> > > > Upstream code has only on RNBD and one RTRS.
> > >
> > > Yes. But they are different drivers. RTRS as a stand-alone ULP does
> > > not know about RNBD or for that matter any other client driver, which
> > > may use it, either out of tree or in the future. If RTRS can serve
> > > only a single client, then it should should have protection for
> > > multiple calls to *_open().
> >
> > For now, there is only one upstream client and server.
>
> In my understanding, its the general rule of abstraction that this
> type of limitation is handled where it exists.
We have such protection and it is called "monolithic kernel".
>
> >
> > I want to remind you that during initial submission of RTR code, the
> > feedback was that this ULP shouldn't exist in first place and right
> > thing to do it is to use NVMe over fabrics.
> >
> > So chances that we will have real out-of-tree client are very low.
>
> One reason for us to write this patch is that we are working on
> another client which uses RTRS. We could have kept this change
> out-of-tree, but frankly, it felt right to add this protection.
So once you will have this client upstream, we can discuss this change.
Thanks
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH for-next 01/13] RDMA/rtrs-srv: Make rtrs_srv_open fail if its a second call
2024-08-12 12:15 ` Leon Romanovsky
@ 2024-08-13 9:42 ` Haris Iqbal
0 siblings, 0 replies; 29+ messages in thread
From: Haris Iqbal @ 2024-08-13 9:42 UTC (permalink / raw)
To: Leon Romanovsky
Cc: linux-rdma, bvanassche, jgg, jinpu.wang, Grzegorz Prajsner
On Mon, Aug 12, 2024 at 2:15 PM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Mon, Aug 12, 2024 at 01:17:11PM +0200, Haris Iqbal wrote:
> > On Mon, Aug 12, 2024 at 12:59 PM Leon Romanovsky <leon@kernel.org> wrote:
> > >
> > > On Mon, Aug 12, 2024 at 12:39:06PM +0200, Haris Iqbal wrote:
> > > > On Mon, Aug 12, 2024 at 12:34 PM Leon Romanovsky <leon@kernel.org> wrote:
> > > > >
> > > > > On Mon, Aug 12, 2024 at 12:16:19PM +0200, Haris Iqbal wrote:
> > > > > > On Sun, Aug 11, 2024 at 10:38 AM Leon Romanovsky <leon@kernel.org> wrote:
> > > > > > >
> > > > > > > On Fri, Aug 09, 2024 at 03:15:26PM +0200, Md Haris Iqbal wrote:
> > > > > > > > Do not allow opening RTRS server if it is already in use and print
> > > > > > > > proper error message.
> > > > > > >
> > > > > > > 1. How is it even possible? I see only one call to rtrs_srv_open() and
> > > > > > > it is happening when the driver is loaded.
> > > > > >
> > > > > > rtrs_srv_open() is NOT called during RTRS driver load. It is called
> > > > > > during RNBD driver load, which is a client which uses RTRS.
> > > > > > RTRS server currently works with only a single client. Hence if, while
> > > > > > in use by RNBD, another driver wants to use RTRS and calls
> > > > > > rtrs_srv_open(), it should fail.
> > > > >
> > > > > ➜ kernel git:(rdma-next) ✗ git grep rtrs_srv_open
> > > > > drivers/block/rnbd/rnbd-srv.c: rtrs_ctx = rtrs_srv_open(&rtrs_ops, port_nr); <---- SINGLE CALL
> > > > > drivers/block/rnbd/rnbd-srv.c: pr_err("rtrs_srv_open(), err: %pe\n", rtrs_ctx);
> > > > > drivers/infiniband/ulp/rtrs/rtrs-srv.c: * rtrs_srv_open() - open RTRS server context
> > > > > drivers/infiniband/ulp/rtrs/rtrs-srv.c:struct rtrs_srv_ctx *rtrs_srv_open(struct rtrs_srv_ops *ops, u16 port)
> > > > > drivers/infiniband/ulp/rtrs/rtrs-srv.c:EXPORT_SYMBOL(rtrs_srv_open);
> > > > > drivers/infiniband/ulp/rtrs/rtrs.h:struct rtrs_srv_ctx *rtrs_srv_open(struct rtrs_srv_ops *ops, u16 port);
> > > > >
> > > > > 807 static int __init rnbd_srv_init_module(void)
> > > > > 808 {
> > > > > 809 int err = 0;
> > > > > 810
> > > > > 811 BUILD_BUG_ON(sizeof(struct rnbd_msg_hdr) != 4);
> > > > > 812 BUILD_BUG_ON(sizeof(struct rnbd_msg_sess_info) != 36);
> > > > > 813 BUILD_BUG_ON(sizeof(struct rnbd_msg_sess_info_rsp) != 36);
> > > > > 814 BUILD_BUG_ON(sizeof(struct rnbd_msg_open) != 264);
> > > > > 815 BUILD_BUG_ON(sizeof(struct rnbd_msg_close) != 8);
> > > > > 816 BUILD_BUG_ON(sizeof(struct rnbd_msg_open_rsp) != 56);
> > > > > 817 rtrs_ops = (struct rtrs_srv_ops) {
> > > > > 818 .rdma_ev = rnbd_srv_rdma_ev,
> > > > > 819 .link_ev = rnbd_srv_link_ev,
> > > > > 820 };
> > > > > 821 rtrs_ctx = rtrs_srv_open(&rtrs_ops, port_nr);
> > > > > 822 if (IS_ERR(rtrs_ctx)) {
> > > > > 823 pr_err("rtrs_srv_open(), err: %pe\n", rtrs_ctx); <---- ALREADY PRINTED ERROR
> > > > > 824 return PTR_ERR(rtrs_ctx);
> > > > > 825 }
> > > > >
> > > > > ...
> > > > >
> > > > > 843 module_init(rnbd_srv_init_module); <---- SINGLE CALL
> > > > >
> > > > > Upstream code has only on RNBD and one RTRS.
> > > >
> > > > Yes. But they are different drivers. RTRS as a stand-alone ULP does
> > > > not know about RNBD or for that matter any other client driver, which
> > > > may use it, either out of tree or in the future. If RTRS can serve
> > > > only a single client, then it should should have protection for
> > > > multiple calls to *_open().
> > >
> > > For now, there is only one upstream client and server.
> >
> > In my understanding, its the general rule of abstraction that this
> > type of limitation is handled where it exists.
>
> We have such protection and it is called "monolithic kernel".
>
> >
> > >
> > > I want to remind you that during initial submission of RTR code, the
> > > feedback was that this ULP shouldn't exist in first place and right
> > > thing to do it is to use NVMe over fabrics.
> > >
> > > So chances that we will have real out-of-tree client are very low.
> >
> > One reason for us to write this patch is that we are working on
> > another client which uses RTRS. We could have kept this change
> > out-of-tree, but frankly, it felt right to add this protection.
>
> So once you will have this client upstream, we can discuss this change.
Okay
>
> Thanks
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH for-next 02/13] RDMA/rtrs-srv: Fix use-after-free during session establishment
2024-08-12 11:00 ` Leon Romanovsky
@ 2024-08-13 10:54 ` Jinpu Wang
0 siblings, 0 replies; 29+ messages in thread
From: Jinpu Wang @ 2024-08-13 10:54 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Md Haris Iqbal, linux-rdma, bvanassche, jgg, Grzegorz Prajsner
On Mon, Aug 12, 2024 at 1:01 PM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Mon, Aug 12, 2024 at 12:52:25PM +0200, Jinpu Wang wrote:
> > Hi,
> >
> > On Sun, Aug 11, 2024 at 10:41 AM Leon Romanovsky <leon@kernel.org> wrote:
> > >
> > > On Fri, Aug 09, 2024 at 03:15:27PM +0200, Md Haris Iqbal wrote:
> > > > From: Jack Wang <jinpu.wang@ionos.com>
> > > >
> > > > In case of error happening during session stablishment, close_work is
> > > > running. A new RDMA CM event may arrive since we don't destroy cm_id
> > > > before destroying qp. To fix this, we first destroy cm_id after drain_qp,
> > > > so no new RDMA CM event will arrive afterwards.
> > > >
> > > > Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
> > > > Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
> > > > Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com>
> > > > Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
> > > > ---
> > > > drivers/infiniband/ulp/rtrs/rtrs-srv.c | 2 +-
> > > > drivers/infiniband/ulp/rtrs/rtrs.c | 2 +-
> > > > 2 files changed, 2 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> > > > index fb67b58a7f62..90ea25ad6720 100644
> > > > --- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> > > > +++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> > > > @@ -1540,6 +1540,7 @@ static void rtrs_srv_close_work(struct work_struct *work)
> > > > con = to_srv_con(srv_path->s.con[i]);
> > > > rdma_disconnect(con->c.cm_id);
> > > > ib_drain_qp(con->c.qp);
> > > > + rdma_destroy_id(con->c.cm_id);
> > > > }
> > > >
> > > > /*
> > > > @@ -1564,7 +1565,6 @@ static void rtrs_srv_close_work(struct work_struct *work)
> > > > continue;
> > > > con = to_srv_con(srv_path->s.con[i]);
> > > > rtrs_cq_qp_destroy(&con->c);
> > > > - rdma_destroy_id(con->c.cm_id);
> > > > kfree(con);
> > > > }
> > > > rtrs_ib_dev_put(srv_path->s.dev);
> > > > diff --git a/drivers/infiniband/ulp/rtrs/rtrs.c b/drivers/infiniband/ulp/rtrs/rtrs.c
> > > > index 4e17d546d4cc..44167fd1c958 100644
> > > > --- a/drivers/infiniband/ulp/rtrs/rtrs.c
> > > > +++ b/drivers/infiniband/ulp/rtrs/rtrs.c
> > > > @@ -318,7 +318,7 @@ EXPORT_SYMBOL_GPL(rtrs_cq_qp_create);
> > > > void rtrs_cq_qp_destroy(struct rtrs_con *con)
> > > > {
> > > > if (con->qp) {
> > > > - rdma_destroy_qp(con->cm_id);
> > > > + ib_destroy_qp(con->qp);
> > >
> > > You created that QP with rdma_create_qp() and you should destroy it with rdma_destroy_qp().
> > We can't do it, as we move rdma_destroy_id before rtrs_cq_qp_destroy,
> > if we still call rdma_destroy_qp, which will lead to use after free as
> > cm_id could already be free-ed.
>
> It is a hint that you are doing something wrong.
will drop it for now, will see if there is other way to fix it.
>
> Thanks
Thx
>
> >
> > >
> > > Thanks
> > Thx!
> > >
> > > > con->qp = NULL;
> > > > }
> > > > destroy_cq(con);
> > > > --
> > > > 2.25.1
> > > >
> >
^ permalink raw reply [flat|nested] 29+ messages in thread
end of thread, other threads:[~2024-08-13 10:54 UTC | newest]
Thread overview: 29+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-09 13:15 [PATCH for-next 00/13] Misc patches for RTRS Md Haris Iqbal
2024-08-09 13:15 ` [PATCH for-next 01/13] RDMA/rtrs-srv: Make rtrs_srv_open fail if its a second call Md Haris Iqbal
2024-08-11 8:38 ` Leon Romanovsky
2024-08-12 10:16 ` Haris Iqbal
2024-08-12 10:34 ` Leon Romanovsky
2024-08-12 10:39 ` Haris Iqbal
2024-08-12 10:59 ` Leon Romanovsky
2024-08-12 11:17 ` Haris Iqbal
2024-08-12 12:15 ` Leon Romanovsky
2024-08-13 9:42 ` Haris Iqbal
2024-08-09 13:15 ` [PATCH for-next 02/13] RDMA/rtrs-srv: Fix use-after-free during session establishment Md Haris Iqbal
2024-08-11 8:41 ` Leon Romanovsky
2024-08-12 10:52 ` Jinpu Wang
2024-08-12 11:00 ` Leon Romanovsky
2024-08-13 10:54 ` Jinpu Wang
2024-08-09 13:15 ` [PATCH for-next 03/13] RDMA/rtrs: For HB error add additional clt/srv specific logging Md Haris Iqbal
2024-08-09 13:15 ` [PATCH for-next 04/13] RDMA/rtrs-clt: Fix need_inv setting in error case Md Haris Iqbal
2024-08-09 13:15 ` [PATCH for-next 05/13] RDMA/rtrs-clt: Rate limit errors in IO path Md Haris Iqbal
2024-08-09 13:15 ` [PATCH for-next 06/13] RDMA/rtrs: Reset hb_missed_cnt after receiving other traffic from peer Md Haris Iqbal
2024-08-09 13:15 ` [PATCH for-next 07/13] RDMA/rtrs-clt: Reuse need_inval from mr Md Haris Iqbal
2024-08-09 13:15 ` [PATCH for-next 08/13] RDMA/rtrs-clt: Reset cid to con_num - 1 to stay in bounds Md Haris Iqbal
2024-08-09 13:15 ` [PATCH for-next 09/13] RDMA/rtrs-clt: Print request type for errors Md Haris Iqbal
2024-08-09 13:15 ` [PATCH for-next 10/13] RDMA/rtrs-srv: Avoid null pointer deref during path establishment Md Haris Iqbal
2024-08-09 13:15 ` [PATCH for-next 11/13] RDMA/rtrs: register ib event handler Md Haris Iqbal
2024-08-09 13:15 ` [PATCH for-next 12/13] RDMA/rtrs-clt: Do local invalidate after write io completion Md Haris Iqbal
2024-08-09 13:15 ` [PATCH for-next 13/13] RDMA/rtrs-clt: Remove an extra space Md Haris Iqbal
2024-08-11 8:43 ` Leon Romanovsky
2024-08-12 4:10 ` Jinpu Wang
2024-08-12 9:48 ` Zhu Yanjun
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).