public inbox for linux-rdma@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH for-next v2 0/4] RDMA/erdma: erdma updates
@ 2024-08-28  6:09 Cheng Xu
  2024-08-28  6:09 ` [PATCH for-next v2 1/4] RDMA/erdma: Make the device probe process more robust Cheng Xu
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Cheng Xu @ 2024-08-28  6:09 UTC (permalink / raw)
  To: jgg, leon; +Cc: linux-rdma, KaiShen

Hi,

This series has some updates for erdma driver:
- #1 always issues a reset request in probe routing to ensure that
  the hardware is ready to probe before continuing.
- #2 refactors the initialization and destruction process of EQ to
  make the code cleaner.
- #3 adds disassociate ucontext support.
- #4 returns QP state in erdma_query_qp.

v2:
- Remove unsed erdma_aeq_destroy.

Thanks,
Cheng Xu

Cheng Xu (4):
  RDMA/erdma: Make the device probe process more robust
  RDMA/erdma: Refactor the initialization and destruction of EQ
  RDMA/erdma: Add disassociate ucontext support
  RDMA/erdma: Return QP state in erdma_query_qp

 drivers/infiniband/hw/erdma/erdma.h       |  4 +-
 drivers/infiniband/hw/erdma/erdma_cmdq.c  | 26 ++-----
 drivers/infiniband/hw/erdma/erdma_eq.c    | 91 +++++++++++------------
 drivers/infiniband/hw/erdma/erdma_main.c  | 49 +++++++++---
 drivers/infiniband/hw/erdma/erdma_verbs.c | 29 +++++++-
 drivers/infiniband/hw/erdma/erdma_verbs.h |  1 +
 6 files changed, 120 insertions(+), 80 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH for-next v2 1/4] RDMA/erdma: Make the device probe process more robust
  2024-08-28  6:09 [PATCH for-next v2 0/4] RDMA/erdma: erdma updates Cheng Xu
@ 2024-08-28  6:09 ` Cheng Xu
  2024-08-29 10:09   ` Leon Romanovsky
  2024-08-28  6:09 ` [PATCH for-next v2 2/4] RDMA/erdma: Refactor the initialization and destruction of EQ Cheng Xu
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Cheng Xu @ 2024-08-28  6:09 UTC (permalink / raw)
  To: jgg, leon; +Cc: linux-rdma, KaiShen

Driver may probe again while hardware is destroying the internal
resources allocated for previous probing, which will fail the
device probe. To make it more robust, we always issue a reset at the
beginning of the device probe process.

Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com>
---
 drivers/infiniband/hw/erdma/erdma.h      |  1 +
 drivers/infiniband/hw/erdma/erdma_main.c | 44 +++++++++++++++++++-----
 2 files changed, 36 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/hw/erdma/erdma.h b/drivers/infiniband/hw/erdma/erdma.h
index c8bd698e21b0..b5c258f77ca0 100644
--- a/drivers/infiniband/hw/erdma/erdma.h
+++ b/drivers/infiniband/hw/erdma/erdma.h
@@ -94,6 +94,7 @@ enum {
 
 #define ERDMA_CMDQ_TIMEOUT_MS 15000
 #define ERDMA_REG_ACCESS_WAIT_MS 20
+#define ERDMA_WAIT_DEV_REST_CNT 50
 #define ERDMA_WAIT_DEV_DONE_CNT 500
 
 struct erdma_cmdq {
diff --git a/drivers/infiniband/hw/erdma/erdma_main.c b/drivers/infiniband/hw/erdma/erdma_main.c
index 7080f8a71ec4..9199058a0b29 100644
--- a/drivers/infiniband/hw/erdma/erdma_main.c
+++ b/drivers/infiniband/hw/erdma/erdma_main.c
@@ -209,11 +209,30 @@ static void erdma_device_uninit(struct erdma_dev *dev)
 	dma_pool_destroy(dev->resp_pool);
 }
 
-static void erdma_hw_reset(struct erdma_dev *dev)
+static int erdma_hw_reset(struct erdma_dev *dev, bool wait)
 {
 	u32 ctrl = FIELD_PREP(ERDMA_REG_DEV_CTRL_RESET_MASK, 1);
+	int i;
 
 	erdma_reg_write32(dev, ERDMA_REGS_DEV_CTRL_REG, ctrl);
+
+	if (!wait)
+		return 0;
+
+	for (i = 0; i < ERDMA_WAIT_DEV_REST_CNT; i++) {
+		if (erdma_reg_read32_filed(dev, ERDMA_REGS_DEV_ST_REG,
+					   ERDMA_REG_DEV_ST_RESET_DONE_MASK))
+			break;
+
+		msleep(ERDMA_REG_ACCESS_WAIT_MS);
+	}
+
+	if (i == ERDMA_WAIT_DEV_REST_CNT) {
+		dev_err(&dev->pdev->dev, "wait reset done timeout.\n");
+		return -ETIME;
+	}
+
+	return 0;
 }
 
 static int erdma_wait_hw_init_done(struct erdma_dev *dev)
@@ -239,6 +258,17 @@ static int erdma_wait_hw_init_done(struct erdma_dev *dev)
 	return 0;
 }
 
+static int erdma_preinit_check(struct erdma_dev *dev)
+{
+	u32 version = erdma_reg_read32(dev, ERDMA_REGS_VERSION_REG);
+
+	/* we knows that it is a non-functional function. */
+	if (version == 0)
+		return -ENODEV;
+
+	return erdma_hw_reset(dev, true);
+}
+
 static const struct pci_device_id erdma_pci_tbl[] = {
 	{ PCI_DEVICE(PCI_VENDOR_ID_ALIBABA, 0x107f) },
 	{}
@@ -248,7 +278,6 @@ static int erdma_probe_dev(struct pci_dev *pdev)
 {
 	struct erdma_dev *dev;
 	int bars, err;
-	u32 version;
 
 	err = pci_enable_device(pdev);
 	if (err) {
@@ -287,12 +316,9 @@ static int erdma_probe_dev(struct pci_dev *pdev)
 		goto err_release_bars;
 	}
 
-	version = erdma_reg_read32(dev, ERDMA_REGS_VERSION_REG);
-	if (version == 0) {
-		/* we knows that it is a non-functional function. */
-		err = -ENODEV;
+	err = erdma_preinit_check(dev);
+	if (err)
 		goto err_iounmap_func_bar;
-	}
 
 	err = erdma_device_init(dev, pdev);
 	if (err)
@@ -327,7 +353,7 @@ static int erdma_probe_dev(struct pci_dev *pdev)
 	return 0;
 
 err_reset_hw:
-	erdma_hw_reset(dev);
+	erdma_hw_reset(dev, false);
 
 err_uninit_cmdq:
 	erdma_cmdq_destroy(dev);
@@ -364,7 +390,7 @@ static void erdma_remove_dev(struct pci_dev *pdev)
 	struct erdma_dev *dev = pci_get_drvdata(pdev);
 
 	erdma_ceqs_uninit(dev);
-	erdma_hw_reset(dev);
+	erdma_hw_reset(dev, false);
 	erdma_cmdq_destroy(dev);
 	erdma_aeq_destroy(dev);
 	erdma_comm_irq_uninit(dev);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH for-next v2 2/4] RDMA/erdma: Refactor the initialization and destruction of EQ
  2024-08-28  6:09 [PATCH for-next v2 0/4] RDMA/erdma: erdma updates Cheng Xu
  2024-08-28  6:09 ` [PATCH for-next v2 1/4] RDMA/erdma: Make the device probe process more robust Cheng Xu
@ 2024-08-28  6:09 ` Cheng Xu
  2024-08-28  6:09 ` [PATCH for-next v2 3/4] RDMA/erdma: Add disassociate ucontext support Cheng Xu
  2024-08-28  6:09 ` [PATCH for-next v2 4/4] RDMA/erdma: Return QP state in erdma_query_qp Cheng Xu
  3 siblings, 0 replies; 11+ messages in thread
From: Cheng Xu @ 2024-08-28  6:09 UTC (permalink / raw)
  To: jgg, leon; +Cc: linux-rdma, KaiShen

We extracted the common parts of the initialization/destruction
process to make the code cleaner.

Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com>
---
 drivers/infiniband/hw/erdma/erdma.h      |  3 +-
 drivers/infiniband/hw/erdma/erdma_cmdq.c | 26 ++-----
 drivers/infiniband/hw/erdma/erdma_eq.c   | 91 ++++++++++++------------
 drivers/infiniband/hw/erdma/erdma_main.c |  4 +-
 4 files changed, 54 insertions(+), 70 deletions(-)

diff --git a/drivers/infiniband/hw/erdma/erdma.h b/drivers/infiniband/hw/erdma/erdma.h
index b5c258f77ca0..f3b648dcf4b6 100644
--- a/drivers/infiniband/hw/erdma/erdma.h
+++ b/drivers/infiniband/hw/erdma/erdma.h
@@ -275,7 +275,8 @@ void notify_eq(struct erdma_eq *eq);
 void *get_next_valid_eqe(struct erdma_eq *eq);
 
 int erdma_aeq_init(struct erdma_dev *dev);
-void erdma_aeq_destroy(struct erdma_dev *dev);
+int erdma_eq_common_init(struct erdma_dev *dev, struct erdma_eq *eq, u32 depth);
+void erdma_eq_destroy(struct erdma_dev *dev, struct erdma_eq *eq);
 
 void erdma_aeq_event_handler(struct erdma_dev *dev);
 void erdma_ceq_completion_handler(struct erdma_eq_cb *ceq_cb);
diff --git a/drivers/infiniband/hw/erdma/erdma_cmdq.c b/drivers/infiniband/hw/erdma/erdma_cmdq.c
index 43ff40b5a09d..a3d8922d1ad1 100644
--- a/drivers/infiniband/hw/erdma/erdma_cmdq.c
+++ b/drivers/infiniband/hw/erdma/erdma_cmdq.c
@@ -158,20 +158,13 @@ static int erdma_cmdq_eq_init(struct erdma_dev *dev)
 {
 	struct erdma_cmdq *cmdq = &dev->cmdq;
 	struct erdma_eq *eq = &cmdq->eq;
+	int ret;
 
-	eq->depth = cmdq->max_outstandings;
-	eq->qbuf = dma_alloc_coherent(&dev->pdev->dev, eq->depth << EQE_SHIFT,
-				      &eq->qbuf_dma_addr, GFP_KERNEL);
-	if (!eq->qbuf)
-		return -ENOMEM;
-
-	spin_lock_init(&eq->lock);
-	atomic64_set(&eq->event_num, 0);
+	ret = erdma_eq_common_init(dev, eq, cmdq->max_outstandings);
+	if (ret)
+		return ret;
 
 	eq->db = dev->func_bar + ERDMA_REGS_CEQ_DB_BASE_REG;
-	eq->dbrec = dma_pool_zalloc(dev->db_pool, GFP_KERNEL, &eq->dbrec_dma);
-	if (!eq->dbrec)
-		goto err_out;
 
 	erdma_reg_write32(dev, ERDMA_REGS_CMDQ_EQ_ADDR_H_REG,
 			  upper_32_bits(eq->qbuf_dma_addr));
@@ -181,12 +174,6 @@ static int erdma_cmdq_eq_init(struct erdma_dev *dev)
 	erdma_reg_write64(dev, ERDMA_CMDQ_EQ_DB_HOST_ADDR_REG, eq->dbrec_dma);
 
 	return 0;
-
-err_out:
-	dma_free_coherent(&dev->pdev->dev, eq->depth << EQE_SHIFT, eq->qbuf,
-			  eq->qbuf_dma_addr);
-
-	return -ENOMEM;
 }
 
 int erdma_cmdq_init(struct erdma_dev *dev)
@@ -247,10 +234,7 @@ void erdma_cmdq_destroy(struct erdma_dev *dev)
 
 	clear_bit(ERDMA_CMDQ_STATE_OK_BIT, &cmdq->state);
 
-	dma_free_coherent(&dev->pdev->dev, cmdq->eq.depth << EQE_SHIFT,
-			  cmdq->eq.qbuf, cmdq->eq.qbuf_dma_addr);
-
-	dma_pool_free(dev->db_pool, cmdq->eq.dbrec, cmdq->eq.dbrec_dma);
+	erdma_eq_destroy(dev, &cmdq->eq);
 
 	dma_free_coherent(&dev->pdev->dev, cmdq->sq.depth << SQEBB_SHIFT,
 			  cmdq->sq.qbuf, cmdq->sq.qbuf_dma_addr);
diff --git a/drivers/infiniband/hw/erdma/erdma_eq.c b/drivers/infiniband/hw/erdma/erdma_eq.c
index 84ccdd8144c9..c277c8f9233c 100644
--- a/drivers/infiniband/hw/erdma/erdma_eq.c
+++ b/drivers/infiniband/hw/erdma/erdma_eq.c
@@ -80,50 +80,62 @@ void erdma_aeq_event_handler(struct erdma_dev *dev)
 	notify_eq(&dev->aeq);
 }
 
-int erdma_aeq_init(struct erdma_dev *dev)
+int erdma_eq_common_init(struct erdma_dev *dev, struct erdma_eq *eq, u32 depth)
 {
-	struct erdma_eq *eq = &dev->aeq;
+	u32 buf_size = depth << EQE_SHIFT;
 
-	eq->depth = ERDMA_DEFAULT_EQ_DEPTH;
-
-	eq->qbuf = dma_alloc_coherent(&dev->pdev->dev, eq->depth << EQE_SHIFT,
-				      &eq->qbuf_dma_addr, GFP_KERNEL);
+	eq->qbuf = dma_alloc_coherent(&dev->pdev->dev, buf_size,
+				      &eq->qbuf_dma_addr,
+				      GFP_KERNEL | __GFP_ZERO);
 	if (!eq->qbuf)
 		return -ENOMEM;
 
+	eq->dbrec = dma_pool_alloc(dev->db_pool, GFP_KERNEL | __GFP_ZERO,
+				   &eq->dbrec_dma);
+	if (!eq->dbrec)
+		goto err_free_qbuf;
+
 	spin_lock_init(&eq->lock);
 	atomic64_set(&eq->event_num, 0);
 	atomic64_set(&eq->notify_num, 0);
-
-	eq->db = dev->func_bar + ERDMA_REGS_AEQ_DB_REG;
-	eq->dbrec = dma_pool_zalloc(dev->db_pool, GFP_KERNEL, &eq->dbrec_dma);
-	if (!eq->dbrec)
-		goto err_out;
-
-	erdma_reg_write32(dev, ERDMA_REGS_AEQ_ADDR_H_REG,
-			  upper_32_bits(eq->qbuf_dma_addr));
-	erdma_reg_write32(dev, ERDMA_REGS_AEQ_ADDR_L_REG,
-			  lower_32_bits(eq->qbuf_dma_addr));
-	erdma_reg_write32(dev, ERDMA_REGS_AEQ_DEPTH_REG, eq->depth);
-	erdma_reg_write64(dev, ERDMA_AEQ_DB_HOST_ADDR_REG, eq->dbrec_dma);
+	eq->ci = 0;
+	eq->depth = depth;
 
 	return 0;
 
-err_out:
-	dma_free_coherent(&dev->pdev->dev, eq->depth << EQE_SHIFT, eq->qbuf,
+err_free_qbuf:
+	dma_free_coherent(&dev->pdev->dev, buf_size, eq->qbuf,
 			  eq->qbuf_dma_addr);
 
 	return -ENOMEM;
 }
 
-void erdma_aeq_destroy(struct erdma_dev *dev)
+void erdma_eq_destroy(struct erdma_dev *dev, struct erdma_eq *eq)
 {
-	struct erdma_eq *eq = &dev->aeq;
-
+	dma_pool_free(dev->db_pool, eq->dbrec, eq->dbrec_dma);
 	dma_free_coherent(&dev->pdev->dev, eq->depth << EQE_SHIFT, eq->qbuf,
 			  eq->qbuf_dma_addr);
+}
 
-	dma_pool_free(dev->db_pool, eq->dbrec, eq->dbrec_dma);
+int erdma_aeq_init(struct erdma_dev *dev)
+{
+	struct erdma_eq *eq = &dev->aeq;
+	int ret;
+
+	ret = erdma_eq_common_init(dev, &dev->aeq, ERDMA_DEFAULT_EQ_DEPTH);
+	if (ret)
+		return ret;
+
+	eq->db = dev->func_bar + ERDMA_REGS_AEQ_DB_REG;
+
+	erdma_reg_write32(dev, ERDMA_REGS_AEQ_ADDR_H_REG,
+			  upper_32_bits(eq->qbuf_dma_addr));
+	erdma_reg_write32(dev, ERDMA_REGS_AEQ_ADDR_L_REG,
+			  lower_32_bits(eq->qbuf_dma_addr));
+	erdma_reg_write32(dev, ERDMA_REGS_AEQ_DEPTH_REG, eq->depth);
+	erdma_reg_write64(dev, ERDMA_AEQ_DB_HOST_ADDR_REG, eq->dbrec_dma);
+
+	return 0;
 }
 
 void erdma_ceq_completion_handler(struct erdma_eq_cb *ceq_cb)
@@ -234,32 +246,21 @@ static int erdma_ceq_init_one(struct erdma_dev *dev, u16 ceqn)
 	struct erdma_eq *eq = &dev->ceqs[ceqn].eq;
 	int ret;
 
-	eq->depth = ERDMA_DEFAULT_EQ_DEPTH;
-	eq->qbuf = dma_alloc_coherent(&dev->pdev->dev, eq->depth << EQE_SHIFT,
-				      &eq->qbuf_dma_addr, GFP_KERNEL);
-	if (!eq->qbuf)
-		return -ENOMEM;
-
-	spin_lock_init(&eq->lock);
-	atomic64_set(&eq->event_num, 0);
-	atomic64_set(&eq->notify_num, 0);
+	ret = erdma_eq_common_init(dev, eq, ERDMA_DEFAULT_EQ_DEPTH);
+	if (ret)
+		return ret;
 
 	eq->db = dev->func_bar + ERDMA_REGS_CEQ_DB_BASE_REG +
 		 (ceqn + 1) * ERDMA_DB_SIZE;
-
-	eq->dbrec = dma_pool_zalloc(dev->db_pool, GFP_KERNEL, &eq->dbrec_dma);
-	if (!eq->dbrec) {
-		dma_free_coherent(&dev->pdev->dev, eq->depth << EQE_SHIFT,
-				  eq->qbuf, eq->qbuf_dma_addr);
-		return -ENOMEM;
-	}
-
-	eq->ci = 0;
 	dev->ceqs[ceqn].dev = dev;
+	dev->ceqs[ceqn].ready = true;
 
 	/* CEQ indexed from 1, 0 rsvd for CMDQ-EQ. */
 	ret = create_eq_cmd(dev, ceqn + 1, eq);
-	dev->ceqs[ceqn].ready = ret ? false : true;
+	if (ret) {
+		erdma_eq_destroy(dev, eq);
+		dev->ceqs[ceqn].ready = false;
+	}
 
 	return ret;
 }
@@ -283,9 +284,7 @@ static void erdma_ceq_uninit_one(struct erdma_dev *dev, u16 ceqn)
 	if (err)
 		return;
 
-	dma_free_coherent(&dev->pdev->dev, eq->depth << EQE_SHIFT, eq->qbuf,
-			  eq->qbuf_dma_addr);
-	dma_pool_free(dev->db_pool, eq->dbrec, eq->dbrec_dma);
+	erdma_eq_destroy(dev, eq);
 }
 
 int erdma_ceqs_init(struct erdma_dev *dev)
diff --git a/drivers/infiniband/hw/erdma/erdma_main.c b/drivers/infiniband/hw/erdma/erdma_main.c
index 9199058a0b29..d1cb488e7ad4 100644
--- a/drivers/infiniband/hw/erdma/erdma_main.c
+++ b/drivers/infiniband/hw/erdma/erdma_main.c
@@ -359,7 +359,7 @@ static int erdma_probe_dev(struct pci_dev *pdev)
 	erdma_cmdq_destroy(dev);
 
 err_uninit_aeq:
-	erdma_aeq_destroy(dev);
+	erdma_eq_destroy(dev, &dev->aeq);
 
 err_uninit_comm_irq:
 	erdma_comm_irq_uninit(dev);
@@ -392,7 +392,7 @@ static void erdma_remove_dev(struct pci_dev *pdev)
 	erdma_ceqs_uninit(dev);
 	erdma_hw_reset(dev, false);
 	erdma_cmdq_destroy(dev);
-	erdma_aeq_destroy(dev);
+	erdma_eq_destroy(dev, &dev->aeq);
 	erdma_comm_irq_uninit(dev);
 	pci_free_irq_vectors(dev->pdev);
 	erdma_device_uninit(dev);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH for-next v2 3/4] RDMA/erdma: Add disassociate ucontext support
  2024-08-28  6:09 [PATCH for-next v2 0/4] RDMA/erdma: erdma updates Cheng Xu
  2024-08-28  6:09 ` [PATCH for-next v2 1/4] RDMA/erdma: Make the device probe process more robust Cheng Xu
  2024-08-28  6:09 ` [PATCH for-next v2 2/4] RDMA/erdma: Refactor the initialization and destruction of EQ Cheng Xu
@ 2024-08-28  6:09 ` Cheng Xu
  2024-08-28  6:09 ` [PATCH for-next v2 4/4] RDMA/erdma: Return QP state in erdma_query_qp Cheng Xu
  3 siblings, 0 replies; 11+ messages in thread
From: Cheng Xu @ 2024-08-28  6:09 UTC (permalink / raw)
  To: jgg, leon; +Cc: linux-rdma, KaiShen

All IO pages mapped to user space are handled by rdma_user_mmap_io,
so add empty stub for disassociate ucontext.

Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com>
---
 drivers/infiniband/hw/erdma/erdma_main.c  | 1 +
 drivers/infiniband/hw/erdma/erdma_verbs.c | 4 ++++
 drivers/infiniband/hw/erdma/erdma_verbs.h | 1 +
 3 files changed, 6 insertions(+)

diff --git a/drivers/infiniband/hw/erdma/erdma_main.c b/drivers/infiniband/hw/erdma/erdma_main.c
index d1cb488e7ad4..1ccf1b65c02c 100644
--- a/drivers/infiniband/hw/erdma/erdma_main.c
+++ b/drivers/infiniband/hw/erdma/erdma_main.c
@@ -516,6 +516,7 @@ static const struct ib_device_ops erdma_device_ops = {
 	.dereg_mr = erdma_dereg_mr,
 	.destroy_cq = erdma_destroy_cq,
 	.destroy_qp = erdma_destroy_qp,
+	.disassociate_ucontext = erdma_disassociate_ucontext,
 	.get_dma_mr = erdma_get_dma_mr,
 	.get_hw_stats = erdma_get_hw_stats,
 	.get_port_immutable = erdma_get_port_immutable,
diff --git a/drivers/infiniband/hw/erdma/erdma_verbs.c b/drivers/infiniband/hw/erdma/erdma_verbs.c
index 40c9b6e46b82..48b08a15e6a8 100644
--- a/drivers/infiniband/hw/erdma/erdma_verbs.c
+++ b/drivers/infiniband/hw/erdma/erdma_verbs.c
@@ -1700,6 +1700,10 @@ int erdma_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 	return ret;
 }
 
+void erdma_disassociate_ucontext(struct ib_ucontext *ibcontext)
+{
+}
+
 void erdma_set_mtu(struct erdma_dev *dev, u32 mtu)
 {
 	struct erdma_cmdq_config_mtu_req req;
diff --git a/drivers/infiniband/hw/erdma/erdma_verbs.h b/drivers/infiniband/hw/erdma/erdma_verbs.h
index 4f02ba06b210..b7478376eb80 100644
--- a/drivers/infiniband/hw/erdma/erdma_verbs.h
+++ b/drivers/infiniband/hw/erdma/erdma_verbs.h
@@ -344,6 +344,7 @@ int erdma_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, int mask,
 		    struct ib_udata *data);
 int erdma_destroy_qp(struct ib_qp *ibqp, struct ib_udata *udata);
 int erdma_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata);
+void erdma_disassociate_ucontext(struct ib_ucontext *ibcontext);
 int erdma_req_notify_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags);
 struct ib_mr *erdma_reg_user_mr(struct ib_pd *ibpd, u64 start, u64 len,
 				u64 virt, int access, struct ib_udata *udata);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH for-next v2 4/4] RDMA/erdma: Return QP state in erdma_query_qp
  2024-08-28  6:09 [PATCH for-next v2 0/4] RDMA/erdma: erdma updates Cheng Xu
                   ` (2 preceding siblings ...)
  2024-08-28  6:09 ` [PATCH for-next v2 3/4] RDMA/erdma: Add disassociate ucontext support Cheng Xu
@ 2024-08-28  6:09 ` Cheng Xu
  3 siblings, 0 replies; 11+ messages in thread
From: Cheng Xu @ 2024-08-28  6:09 UTC (permalink / raw)
  To: jgg, leon; +Cc: linux-rdma, KaiShen

Fix qp_state and cur_qp_state to return correct values in
struct ib_qp_attr.

Fixes: 155055771704 ("RDMA/erdma: Add verbs implementation")
Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com>
---
 drivers/infiniband/hw/erdma/erdma_verbs.c | 25 ++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/erdma/erdma_verbs.c b/drivers/infiniband/hw/erdma/erdma_verbs.c
index 48b08a15e6a8..de11f0f1adb1 100644
--- a/drivers/infiniband/hw/erdma/erdma_verbs.c
+++ b/drivers/infiniband/hw/erdma/erdma_verbs.c
@@ -1544,11 +1544,31 @@ int erdma_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, int attr_mask,
 	return ret;
 }
 
+static inline enum ib_qp_state query_qp_state(struct erdma_qp *qp)
+{
+	switch (qp->attrs.state) {
+	case ERDMA_QP_STATE_IDLE:
+		return IB_QPS_INIT;
+	case ERDMA_QP_STATE_RTR:
+		return IB_QPS_RTR;
+	case ERDMA_QP_STATE_RTS:
+		return IB_QPS_RTS;
+	case ERDMA_QP_STATE_CLOSING:
+		return IB_QPS_ERR;
+	case ERDMA_QP_STATE_TERMINATE:
+		return IB_QPS_ERR;
+	case ERDMA_QP_STATE_ERROR:
+		return IB_QPS_ERR;
+	default:
+		return IB_QPS_ERR;
+	}
+}
+
 int erdma_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *qp_attr,
 		   int qp_attr_mask, struct ib_qp_init_attr *qp_init_attr)
 {
-	struct erdma_qp *qp;
 	struct erdma_dev *dev;
+	struct erdma_qp *qp;
 
 	if (ibqp && qp_attr && qp_init_attr) {
 		qp = to_eqp(ibqp);
@@ -1575,6 +1595,9 @@ int erdma_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *qp_attr,
 
 	qp_init_attr->cap = qp_attr->cap;
 
+	qp_attr->qp_state = query_qp_state(qp);
+	qp_attr->cur_qp_state = query_qp_state(qp);
+
 	return 0;
 }
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH for-next v2 1/4] RDMA/erdma: Make the device probe process more robust
  2024-08-28  6:09 ` [PATCH for-next v2 1/4] RDMA/erdma: Make the device probe process more robust Cheng Xu
@ 2024-08-29 10:09   ` Leon Romanovsky
  2024-08-30  2:34     ` Cheng Xu
  0 siblings, 1 reply; 11+ messages in thread
From: Leon Romanovsky @ 2024-08-29 10:09 UTC (permalink / raw)
  To: Cheng Xu; +Cc: jgg, linux-rdma, KaiShen

On Wed, Aug 28, 2024 at 02:09:41PM +0800, Cheng Xu wrote:
> Driver may probe again while hardware is destroying the internal
> resources allocated for previous probing

How is it possible?


> which will fail the device probe. To make it more robust, we always issue a reset at the
> beginning of the device probe process.
> 
> Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com>
> ---
>  drivers/infiniband/hw/erdma/erdma.h      |  1 +
>  drivers/infiniband/hw/erdma/erdma_main.c | 44 +++++++++++++++++++-----
>  2 files changed, 36 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/erdma/erdma.h b/drivers/infiniband/hw/erdma/erdma.h
> index c8bd698e21b0..b5c258f77ca0 100644
> --- a/drivers/infiniband/hw/erdma/erdma.h
> +++ b/drivers/infiniband/hw/erdma/erdma.h
> @@ -94,6 +94,7 @@ enum {
>  
>  #define ERDMA_CMDQ_TIMEOUT_MS 15000
>  #define ERDMA_REG_ACCESS_WAIT_MS 20
> +#define ERDMA_WAIT_DEV_REST_CNT 50
>  #define ERDMA_WAIT_DEV_DONE_CNT 500
>  
>  struct erdma_cmdq {
> diff --git a/drivers/infiniband/hw/erdma/erdma_main.c b/drivers/infiniband/hw/erdma/erdma_main.c
> index 7080f8a71ec4..9199058a0b29 100644
> --- a/drivers/infiniband/hw/erdma/erdma_main.c
> +++ b/drivers/infiniband/hw/erdma/erdma_main.c
> @@ -209,11 +209,30 @@ static void erdma_device_uninit(struct erdma_dev *dev)
>  	dma_pool_destroy(dev->resp_pool);
>  }
>  
> -static void erdma_hw_reset(struct erdma_dev *dev)
> +static int erdma_hw_reset(struct erdma_dev *dev, bool wait)
>  {
>  	u32 ctrl = FIELD_PREP(ERDMA_REG_DEV_CTRL_RESET_MASK, 1);
> +	int i;
>  
>  	erdma_reg_write32(dev, ERDMA_REGS_DEV_CTRL_REG, ctrl);
> +
> +	if (!wait)
> +		return 0;
> +
> +	for (i = 0; i < ERDMA_WAIT_DEV_REST_CNT; i++) {
> +		if (erdma_reg_read32_filed(dev, ERDMA_REGS_DEV_ST_REG,
> +					   ERDMA_REG_DEV_ST_RESET_DONE_MASK))
> +			break;
> +
> +		msleep(ERDMA_REG_ACCESS_WAIT_MS);
> +	}
> +
> +	if (i == ERDMA_WAIT_DEV_REST_CNT) {
> +		dev_err(&dev->pdev->dev, "wait reset done timeout.\n");
> +		return -ETIME;
> +	}
> +
> +	return 0;
>  }
>  
>  static int erdma_wait_hw_init_done(struct erdma_dev *dev)
> @@ -239,6 +258,17 @@ static int erdma_wait_hw_init_done(struct erdma_dev *dev)
>  	return 0;
>  }
>  
> +static int erdma_preinit_check(struct erdma_dev *dev)
> +{
> +	u32 version = erdma_reg_read32(dev, ERDMA_REGS_VERSION_REG);
> +
> +	/* we knows that it is a non-functional function. */
> +	if (version == 0)
> +		return -ENODEV;
> +
> +	return erdma_hw_reset(dev, true);
> +}
> +
>  static const struct pci_device_id erdma_pci_tbl[] = {
>  	{ PCI_DEVICE(PCI_VENDOR_ID_ALIBABA, 0x107f) },
>  	{}
> @@ -248,7 +278,6 @@ static int erdma_probe_dev(struct pci_dev *pdev)
>  {
>  	struct erdma_dev *dev;
>  	int bars, err;
> -	u32 version;
>  
>  	err = pci_enable_device(pdev);
>  	if (err) {
> @@ -287,12 +316,9 @@ static int erdma_probe_dev(struct pci_dev *pdev)
>  		goto err_release_bars;
>  	}
>  
> -	version = erdma_reg_read32(dev, ERDMA_REGS_VERSION_REG);
> -	if (version == 0) {
> -		/* we knows that it is a non-functional function. */
> -		err = -ENODEV;
> +	err = erdma_preinit_check(dev);
> +	if (err)
>  		goto err_iounmap_func_bar;
> -	}
>  
>  	err = erdma_device_init(dev, pdev);
>  	if (err)
> @@ -327,7 +353,7 @@ static int erdma_probe_dev(struct pci_dev *pdev)
>  	return 0;
>  
>  err_reset_hw:
> -	erdma_hw_reset(dev);
> +	erdma_hw_reset(dev, false);
>  
>  err_uninit_cmdq:
>  	erdma_cmdq_destroy(dev);
> @@ -364,7 +390,7 @@ static void erdma_remove_dev(struct pci_dev *pdev)
>  	struct erdma_dev *dev = pci_get_drvdata(pdev);
>  
>  	erdma_ceqs_uninit(dev);
> -	erdma_hw_reset(dev);
> +	erdma_hw_reset(dev, false);
>  	erdma_cmdq_destroy(dev);
>  	erdma_aeq_destroy(dev);
>  	erdma_comm_irq_uninit(dev);
> -- 
> 2.31.1
> 
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH for-next v2 1/4] RDMA/erdma: Make the device probe process more robust
  2024-08-29 10:09   ` Leon Romanovsky
@ 2024-08-30  2:34     ` Cheng Xu
  2024-09-02  7:21       ` Leon Romanovsky
  0 siblings, 1 reply; 11+ messages in thread
From: Cheng Xu @ 2024-08-30  2:34 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: jgg, linux-rdma, KaiShen



On 8/29/24 6:09 PM, Leon Romanovsky wrote:
> On Wed, Aug 28, 2024 at 02:09:41PM +0800, Cheng Xu wrote:
>> Driver may probe again while hardware is destroying the internal
>> resources allocated for previous probing
> 
> How is it possible?
> 

The resources I mentioned is totally unseen to driver, it's something related
to our device management part in hypervisor, so it won't cause host resources
leak, and the cleanup/reset process may take a long time. For these reason,
we don't wait the completion of the cleanup/reset in the remove routing.
Instead, the driver will wait the device status become ready in probe routing
(In most cases, the hardware will have enough time to finish the cleanup/reset
before the second probe), so that we can boost the remove process.

Thanks,
Cheng Xu

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH for-next v2 1/4] RDMA/erdma: Make the device probe process more robust
  2024-08-30  2:34     ` Cheng Xu
@ 2024-09-02  7:21       ` Leon Romanovsky
  2024-09-02  9:09         ` Cheng Xu
  0 siblings, 1 reply; 11+ messages in thread
From: Leon Romanovsky @ 2024-09-02  7:21 UTC (permalink / raw)
  To: Cheng Xu; +Cc: jgg, linux-rdma, KaiShen

On Fri, Aug 30, 2024 at 10:34:42AM +0800, Cheng Xu wrote:
> 
> 
> On 8/29/24 6:09 PM, Leon Romanovsky wrote:
> > On Wed, Aug 28, 2024 at 02:09:41PM +0800, Cheng Xu wrote:
> >> Driver may probe again while hardware is destroying the internal
> >> resources allocated for previous probing
> > 
> > How is it possible?
> > 
> 
> The resources I mentioned is totally unseen to driver, it's something related
> to our device management part in hypervisor, so it won't cause host resources
> leak, and the cleanup/reset process may take a long time. For these reason,
> we don't wait the completion of the cleanup/reset in the remove routing.
> Instead, the driver will wait the device status become ready in probe routing
> (In most cases, the hardware will have enough time to finish the cleanup/reset
> before the second probe), so that we can boost the remove process.

And why don't hypervisor wait for the device to be ready before giving it to VM?
Why do you need to complicate the probe routine to overcome the hypervisor behavior?

Thanks

> 
> Thanks,
> Cheng Xu
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH for-next v2 1/4] RDMA/erdma: Make the device probe process more robust
  2024-09-02  7:21       ` Leon Romanovsky
@ 2024-09-02  9:09         ` Cheng Xu
  2024-09-04 16:06           ` Jason Gunthorpe
  0 siblings, 1 reply; 11+ messages in thread
From: Cheng Xu @ 2024-09-02  9:09 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: jgg, linux-rdma, KaiShen



On 9/2/24 3:21 PM, Leon Romanovsky wrote:
> On Fri, Aug 30, 2024 at 10:34:42AM +0800, Cheng Xu wrote:
>>
>>
>> On 8/29/24 6:09 PM, Leon Romanovsky wrote:
>>> On Wed, Aug 28, 2024 at 02:09:41PM +0800, Cheng Xu wrote:
>>>> Driver may probe again while hardware is destroying the internal
>>>> resources allocated for previous probing
>>>
>>> How is it possible?
>>>
>>
>> The resources I mentioned is totally unseen to driver, it's something related
>> to our device management part in hypervisor, so it won't cause host resources
>> leak, and the cleanup/reset process may take a long time. For these reason,
>> we don't wait the completion of the cleanup/reset in the remove routing.
>> Instead, the driver will wait the device status become ready in probe routing
>> (In most cases, the hardware will have enough time to finish the cleanup/reset
>> before the second probe), so that we can boost the remove process.
> 
> And why don't hypervisor wait for the device to be ready before giving it to VM?

Hypervisor actually does what you described during the first bootup. However, one
scenario is that the erdma driver is unloaded and loaded quickly while the device
always exists in the VM. In this case, there is no opportunity for the hypervisor
to perform that action.

> Why do you need to complicate the probe routine to overcome the hypervisor behavior?
> 

The hardware now requires that the former reset (issued in the remove routine) must be
completed before device init (issued in the probe routine). Waiting the reset completed
either in the remove routine or in the probe routine both can meet the requirement.
This patch chose to wait in the probe routine because it can speed up the remove process.

Actually this is a good question, and inspires me that maybe the requirement in the
hardware/backend may be eliminated, so that simplify the driver process.

I'd like to remove this patch in v3 and leave it for internal discussion.

Thanks very much
Cheng Xu


> Thanks
> 
>>
>> Thanks,
>> Cheng Xu
>>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH for-next v2 1/4] RDMA/erdma: Make the device probe process more robust
  2024-09-02  9:09         ` Cheng Xu
@ 2024-09-04 16:06           ` Jason Gunthorpe
  2024-09-05  3:39             ` Cheng Xu
  0 siblings, 1 reply; 11+ messages in thread
From: Jason Gunthorpe @ 2024-09-04 16:06 UTC (permalink / raw)
  To: Cheng Xu; +Cc: Leon Romanovsky, linux-rdma, KaiShen

On Mon, Sep 02, 2024 at 05:09:09PM +0800, Cheng Xu wrote:

> The hardware now requires that the former reset (issued in the
> remove routine) must be completed before device init (issued in the
> probe routine). Waiting the reset completed either in the remove
> routine or in the probe routine both can meet the requirement.  This
> patch chose to wait in the probe routine because it can speed up the
> remove process.

But what happens if you attach VFIO or some other driver while this
background reset is occuring? Are you OK with that?

Jason

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH for-next v2 1/4] RDMA/erdma: Make the device probe process more robust
  2024-09-04 16:06           ` Jason Gunthorpe
@ 2024-09-05  3:39             ` Cheng Xu
  0 siblings, 0 replies; 11+ messages in thread
From: Cheng Xu @ 2024-09-05  3:39 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Leon Romanovsky, linux-rdma, KaiShen



On 9/5/24 12:06 AM, Jason Gunthorpe wrote:
> On Mon, Sep 02, 2024 at 05:09:09PM +0800, Cheng Xu wrote:
> 
>> The hardware now requires that the former reset (issued in the
>> remove routine) must be completed before device init (issued in the
>> probe routine). Waiting the reset completed either in the remove
>> routine or in the probe routine both can meet the requirement.  This
>> patch chose to wait in the probe routine because it can speed up the
>> remove process.
> 
> But what happens if you attach VFIO or some other driver while this
> background reset is occuring? Are you OK with that?

Yes, it's OK.

To simplify the description, We have two relevant components in the
hardware: The device management engine and the RDMA engine. The PCIe
device initialization (erdma, vfio and other drivers will perform it)
is handled by device management engine without any issue. The issue described
in this patch pertains to the RDMA engine, which is only invoked by erdma
driver.

In fact, this low-probability issue is not very serious and can be resolved
by reloading the driver. After internal discussion, we have decided to eliminate
this constraint through appropriate firmware modifications.

Thanks,
Cheng Xu

> 
> Jason

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2024-09-05  3:40 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-28  6:09 [PATCH for-next v2 0/4] RDMA/erdma: erdma updates Cheng Xu
2024-08-28  6:09 ` [PATCH for-next v2 1/4] RDMA/erdma: Make the device probe process more robust Cheng Xu
2024-08-29 10:09   ` Leon Romanovsky
2024-08-30  2:34     ` Cheng Xu
2024-09-02  7:21       ` Leon Romanovsky
2024-09-02  9:09         ` Cheng Xu
2024-09-04 16:06           ` Jason Gunthorpe
2024-09-05  3:39             ` Cheng Xu
2024-08-28  6:09 ` [PATCH for-next v2 2/4] RDMA/erdma: Refactor the initialization and destruction of EQ Cheng Xu
2024-08-28  6:09 ` [PATCH for-next v2 3/4] RDMA/erdma: Add disassociate ucontext support Cheng Xu
2024-08-28  6:09 ` [PATCH for-next v2 4/4] RDMA/erdma: Return QP state in erdma_query_qp Cheng Xu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox