linux-rdma.vger.kernel.org archive mirror
* [PATCH for-next 0/2] RDMA/erdma: Add non-4K page size support
@ 2023-02-20 10:20 Cheng Xu
  2023-02-20 10:20 ` [PATCH for-next 1/2] RDMA/erdma: Use fixed hardware page size Cheng Xu
  2023-02-20 10:20 ` [PATCH for-next 2/2] RDMA/erdma: Support larger page size with doorbell allocation Cheng Xu
  0 siblings, 2 replies; 5+ messages in thread
From: Cheng Xu @ 2023-02-20 10:20 UTC (permalink / raw)
  To: jgg, leon; +Cc: linux-rdma, KaiShen

Hi,

This series adds support for page sizes other than 4K. On some aarch64
distributions the default page size is 64K rather than 4K, and erdma
currently does not work correctly on these systems, for two reasons.
First, the kernel driver passes the kernel's page size to the HW, but
the HW always treats 4096 as the basic page size. Second, the userspace
provider uses 4096 to map the doorbell space, which is correct only when
the page size is 4096.

This patchset fixes both issues:
- #1 fixes the wrong values put in the CMD sent to HW when PAGE_SIZE is
  not 4096.
- #2 returns the information userspace needs to call mmap. This commit
  requires changes in the userspace provider; the PR is [1].

Thanks,
Cheng Xu

[1] https://github.com/linux-rdma/rdma-core/pull/1313

Cheng Xu (2):
  RDMA/erdma: Use fixed hardware page size
  RDMA/erdma: Support larger page size with doorbell allocation

 drivers/infiniband/hw/erdma/erdma_hw.h    |  4 ++
 drivers/infiniband/hw/erdma/erdma_verbs.c | 57 +++++++++++------------
 drivers/infiniband/hw/erdma/erdma_verbs.h |  5 +-
 include/uapi/rdma/erdma-abi.h             |  5 +-
 4 files changed, 36 insertions(+), 35 deletions(-)

-- 
2.27.0



* [PATCH for-next 1/2] RDMA/erdma: Use fixed hardware page size
  2023-02-20 10:20 [PATCH for-next 0/2] RDMA/erdma: Add non-4K page size support Cheng Xu
@ 2023-02-20 10:20 ` Cheng Xu
  2023-02-21  1:13   ` Jason Gunthorpe
  2023-02-20 10:20 ` [PATCH for-next 2/2] RDMA/erdma: Support larger page size with doorbell allocation Cheng Xu
  1 sibling, 1 reply; 5+ messages in thread
From: Cheng Xu @ 2023-02-20 10:20 UTC (permalink / raw)
  To: jgg, leon; +Cc: linux-rdma, KaiShen

The hardware page size is 4096, but the kernel's page size may vary. The
driver should use the hardware page size when setting parameters for the
hardware.

Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com>
---
 drivers/infiniband/hw/erdma/erdma_hw.h    |  4 ++++
 drivers/infiniband/hw/erdma/erdma_verbs.c | 17 +++++++++--------
 2 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/drivers/infiniband/hw/erdma/erdma_hw.h b/drivers/infiniband/hw/erdma/erdma_hw.h
index 4c38d99c73f1..56e0c7a3e8f8 100644
--- a/drivers/infiniband/hw/erdma/erdma_hw.h
+++ b/drivers/infiniband/hw/erdma/erdma_hw.h
@@ -112,6 +112,10 @@
 
 #define ERDMA_PAGE_SIZE_SUPPORT 0x7FFFF000
 
+/* Hardware page size definition */
+#define ERDMA_HW_PAGE_SHIFT 12
+#define ERDMA_HW_PAGE_SIZE 4096
+
 /* WQE related. */
 #define EQE_SIZE 16
 #define EQE_SHIFT 4
diff --git a/drivers/infiniband/hw/erdma/erdma_verbs.c b/drivers/infiniband/hw/erdma/erdma_verbs.c
index 9c30d78730aa..83e1b0d55977 100644
--- a/drivers/infiniband/hw/erdma/erdma_verbs.c
+++ b/drivers/infiniband/hw/erdma/erdma_verbs.c
@@ -38,7 +38,7 @@ static int create_qp_cmd(struct erdma_dev *dev, struct erdma_qp *qp)
 		   FIELD_PREP(ERDMA_CMD_CREATE_QP_PD_MASK, pd->pdn);
 
 	if (rdma_is_kernel_res(&qp->ibqp.res)) {
-		u32 pgsz_range = ilog2(SZ_1M) - PAGE_SHIFT;
+		u32 pgsz_range = ilog2(SZ_1M) - ERDMA_HW_PAGE_SHIFT;
 
 		req.sq_cqn_mtt_cfg =
 			FIELD_PREP(ERDMA_CMD_CREATE_QP_PAGE_SIZE_MASK,
@@ -66,13 +66,13 @@ static int create_qp_cmd(struct erdma_dev *dev, struct erdma_qp *qp)
 		user_qp = &qp->user_qp;
 		req.sq_cqn_mtt_cfg = FIELD_PREP(
 			ERDMA_CMD_CREATE_QP_PAGE_SIZE_MASK,
-			ilog2(user_qp->sq_mtt.page_size) - PAGE_SHIFT);
+			ilog2(user_qp->sq_mtt.page_size) - ERDMA_HW_PAGE_SHIFT);
 		req.sq_cqn_mtt_cfg |=
 			FIELD_PREP(ERDMA_CMD_CREATE_QP_CQN_MASK, qp->scq->cqn);
 
 		req.rq_cqn_mtt_cfg = FIELD_PREP(
 			ERDMA_CMD_CREATE_QP_PAGE_SIZE_MASK,
-			ilog2(user_qp->rq_mtt.page_size) - PAGE_SHIFT);
+			ilog2(user_qp->rq_mtt.page_size) - ERDMA_HW_PAGE_SHIFT);
 		req.rq_cqn_mtt_cfg |=
 			FIELD_PREP(ERDMA_CMD_CREATE_QP_CQN_MASK, qp->rcq->cqn);
 
@@ -162,7 +162,7 @@ static int create_cq_cmd(struct erdma_dev *dev, struct erdma_cq *cq)
 	if (rdma_is_kernel_res(&cq->ibcq.res)) {
 		page_size = SZ_32M;
 		req.cfg0 |= FIELD_PREP(ERDMA_CMD_CREATE_CQ_PAGESIZE_MASK,
-				       ilog2(page_size) - PAGE_SHIFT);
+				       ilog2(page_size) - ERDMA_HW_PAGE_SHIFT);
 		req.qbuf_addr_l = lower_32_bits(cq->kern_cq.qbuf_dma_addr);
 		req.qbuf_addr_h = upper_32_bits(cq->kern_cq.qbuf_dma_addr);
 
@@ -175,8 +175,9 @@ static int create_cq_cmd(struct erdma_dev *dev, struct erdma_cq *cq)
 			cq->kern_cq.qbuf_dma_addr + (cq->depth << CQE_SHIFT);
 	} else {
 		mtt = &cq->user_cq.qbuf_mtt;
-		req.cfg0 |= FIELD_PREP(ERDMA_CMD_CREATE_CQ_PAGESIZE_MASK,
-				       ilog2(mtt->page_size) - PAGE_SHIFT);
+		req.cfg0 |=
+			FIELD_PREP(ERDMA_CMD_CREATE_CQ_PAGESIZE_MASK,
+				   ilog2(mtt->page_size) - ERDMA_HW_PAGE_SHIFT);
 		if (mtt->mtt_nents == 1) {
 			req.qbuf_addr_l = lower_32_bits(*(u64 *)mtt->mtt_buf);
 			req.qbuf_addr_h = upper_32_bits(*(u64 *)mtt->mtt_buf);
@@ -636,7 +637,7 @@ static int init_user_qp(struct erdma_qp *qp, struct erdma_ucontext *uctx,
 	u32 rq_offset;
 	int ret;
 
-	if (len < (PAGE_ALIGN(qp->attrs.sq_size * SQEBB_SIZE) +
+	if (len < (ALIGN(qp->attrs.sq_size * SQEBB_SIZE, ERDMA_HW_PAGE_SIZE) +
 		   qp->attrs.rq_size * RQE_SIZE))
 		return -EINVAL;
 
@@ -646,7 +647,7 @@ static int init_user_qp(struct erdma_qp *qp, struct erdma_ucontext *uctx,
 	if (ret)
 		return ret;
 
-	rq_offset = PAGE_ALIGN(qp->attrs.sq_size << SQEBB_SHIFT);
+	rq_offset = ALIGN(qp->attrs.sq_size << SQEBB_SHIFT, ERDMA_HW_PAGE_SIZE);
 	qp->user_qp.rq_offset = rq_offset;
 
 	ret = get_mtt_entries(qp->dev, &qp->user_qp.rq_mtt, va + rq_offset,
-- 
2.27.0



* [PATCH for-next 2/2] RDMA/erdma: Support larger page size with doorbell allocation
  2023-02-20 10:20 [PATCH for-next 0/2] RDMA/erdma: Add non-4K page size support Cheng Xu
  2023-02-20 10:20 ` [PATCH for-next 1/2] RDMA/erdma: Use fixed hardware page size Cheng Xu
@ 2023-02-20 10:20 ` Cheng Xu
  1 sibling, 0 replies; 5+ messages in thread
From: Cheng Xu @ 2023-02-20 10:20 UTC (permalink / raw)
  To: jgg, leon; +Cc: linux-rdma, KaiShen

Doorbell resources are exposed to userspace by mmap. The size unit of
mmap is PAGE_SIZE, so the previous implementation does not work
correctly if PAGE_SIZE is not 4K (for example, 64K). This commit adds
support for larger page sizes.

Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com>
---
 drivers/infiniband/hw/erdma/erdma_verbs.c | 40 ++++++++++-------------
 drivers/infiniband/hw/erdma/erdma_verbs.h |  5 ++-
 include/uapi/rdma/erdma-abi.h             |  5 ++-
 3 files changed, 23 insertions(+), 27 deletions(-)

diff --git a/drivers/infiniband/hw/erdma/erdma_verbs.c b/drivers/infiniband/hw/erdma/erdma_verbs.c
index 83e1b0d55977..0cd8dd61f569 100644
--- a/drivers/infiniband/hw/erdma/erdma_verbs.c
+++ b/drivers/infiniband/hw/erdma/erdma_verbs.c
@@ -1137,8 +1137,8 @@ void erdma_mmap_free(struct rdma_user_mmap_entry *rdma_entry)
 static void alloc_db_resources(struct erdma_dev *dev,
 			       struct erdma_ucontext *ctx)
 {
-	u32 bitmap_idx;
 	struct erdma_devattr *attrs = &dev->attrs;
+	u32 bitmap_idx, hw_page_idx;
 
 	if (attrs->disable_dwqe)
 		goto alloc_normal_db;
@@ -1151,11 +1151,9 @@ static void alloc_db_resources(struct erdma_dev *dev,
 		spin_unlock(&dev->db_bitmap_lock);
 
 		ctx->sdb_type = ERDMA_SDB_PAGE;
-		ctx->sdb_idx = bitmap_idx;
-		ctx->sdb_page_idx = bitmap_idx;
+		ctx->sdb_bitmap_idx = bitmap_idx;
 		ctx->sdb = dev->func_bar_addr + ERDMA_BAR_SQDB_SPACE_OFFSET +
-			   (bitmap_idx << PAGE_SHIFT);
-		ctx->sdb_page_off = 0;
+			   (bitmap_idx << ERDMA_HW_PAGE_SHIFT);
 
 		return;
 	}
@@ -1166,13 +1164,13 @@ static void alloc_db_resources(struct erdma_dev *dev,
 		spin_unlock(&dev->db_bitmap_lock);
 
 		ctx->sdb_type = ERDMA_SDB_ENTRY;
-		ctx->sdb_idx = bitmap_idx;
-		ctx->sdb_page_idx = attrs->dwqe_pages +
+		ctx->sdb_bitmap_idx = bitmap_idx;
+		hw_page_idx = attrs->dwqe_pages +
 				    bitmap_idx / ERDMA_DWQE_TYPE1_CNT_PER_PAGE;
-		ctx->sdb_page_off = bitmap_idx % ERDMA_DWQE_TYPE1_CNT_PER_PAGE;
 
+		ctx->sdb_entid = bitmap_idx % ERDMA_DWQE_TYPE1_CNT_PER_PAGE;
 		ctx->sdb = dev->func_bar_addr + ERDMA_BAR_SQDB_SPACE_OFFSET +
-			   (ctx->sdb_page_idx << PAGE_SHIFT);
+			   (hw_page_idx << ERDMA_HW_PAGE_SHIFT);
 
 		return;
 	}
@@ -1181,11 +1179,8 @@ static void alloc_db_resources(struct erdma_dev *dev,
 
 alloc_normal_db:
 	ctx->sdb_type = ERDMA_SDB_SHARED;
-	ctx->sdb_idx = 0;
-	ctx->sdb_page_idx = ERDMA_SDB_SHARED_PAGE_INDEX;
-	ctx->sdb_page_off = 0;
-
-	ctx->sdb = dev->func_bar_addr + (ctx->sdb_page_idx << PAGE_SHIFT);
+	ctx->sdb = dev->func_bar_addr +
+		   (ERDMA_SDB_SHARED_PAGE_INDEX << ERDMA_HW_PAGE_SHIFT);
 }
 
 static void erdma_uctx_user_mmap_entries_remove(struct erdma_ucontext *uctx)
@@ -1215,11 +1210,6 @@ int erdma_alloc_ucontext(struct ib_ucontext *ibctx, struct ib_udata *udata)
 	ctx->rdb = dev->func_bar_addr + ERDMA_BAR_RQDB_SPACE_OFFSET;
 	ctx->cdb = dev->func_bar_addr + ERDMA_BAR_CQDB_SPACE_OFFSET;
 
-	if (udata->outlen < sizeof(uresp)) {
-		ret = -EINVAL;
-		goto err_out;
-	}
-
 	ctx->sq_db_mmap_entry = erdma_user_mmap_entry_insert(
 		ctx, (void *)ctx->sdb, PAGE_SIZE, ERDMA_MMAP_IO_NC, &uresp.sdb);
 	if (!ctx->sq_db_mmap_entry) {
@@ -1243,9 +1233,13 @@ int erdma_alloc_ucontext(struct ib_ucontext *ibctx, struct ib_udata *udata)
 
 	uresp.dev_id = dev->pdev->device;
 	uresp.sdb_type = ctx->sdb_type;
-	uresp.sdb_offset = ctx->sdb_page_off;
+	uresp.sdb_entid = ctx->sdb_entid;
+	uresp.sdb_off = ctx->sdb & ~PAGE_MASK;
+	uresp.rdb_off = ctx->rdb & ~PAGE_MASK;
+	uresp.cdb_off = ctx->cdb & ~PAGE_MASK;
 
-	ret = ib_copy_to_udata(udata, &uresp, sizeof(uresp));
+	ret = ib_copy_to_udata(udata, &uresp,
+			       min(sizeof(uresp), udata->outlen));
 	if (ret)
 		goto err_out;
 
@@ -1264,9 +1258,9 @@ void erdma_dealloc_ucontext(struct ib_ucontext *ibctx)
 
 	spin_lock(&dev->db_bitmap_lock);
 	if (ctx->sdb_type == ERDMA_SDB_PAGE)
-		clear_bit(ctx->sdb_idx, dev->sdb_page);
+		clear_bit(ctx->sdb_bitmap_idx, dev->sdb_page);
 	else if (ctx->sdb_type == ERDMA_SDB_ENTRY)
-		clear_bit(ctx->sdb_idx, dev->sdb_entry);
+		clear_bit(ctx->sdb_bitmap_idx, dev->sdb_entry);
 
 	erdma_uctx_user_mmap_entries_remove(ctx);
 
diff --git a/drivers/infiniband/hw/erdma/erdma_verbs.h b/drivers/infiniband/hw/erdma/erdma_verbs.h
index e0a993bc032a..4dbef1483027 100644
--- a/drivers/infiniband/hw/erdma/erdma_verbs.h
+++ b/drivers/infiniband/hw/erdma/erdma_verbs.h
@@ -35,9 +35,8 @@ struct erdma_ucontext {
 	struct ib_ucontext ibucontext;
 
 	u32 sdb_type;
-	u32 sdb_idx;
-	u32 sdb_page_idx;
-	u32 sdb_page_off;
+	u32 sdb_bitmap_idx;
+	u32 sdb_entid;
 	u64 sdb;
 	u64 rdb;
 	u64 cdb;
diff --git a/include/uapi/rdma/erdma-abi.h b/include/uapi/rdma/erdma-abi.h
index b7a0222f978f..57f8942a3c56 100644
--- a/include/uapi/rdma/erdma-abi.h
+++ b/include/uapi/rdma/erdma-abi.h
@@ -40,10 +40,13 @@ struct erdma_uresp_alloc_ctx {
 	__u32 dev_id;
 	__u32 pad;
 	__u32 sdb_type;
-	__u32 sdb_offset;
+	__u32 sdb_entid;
 	__aligned_u64 sdb;
 	__aligned_u64 rdb;
 	__aligned_u64 cdb;
+	__u32 sdb_off;
+	__u32 rdb_off;
+	__u32 cdb_off;
 };
 
 #endif
-- 
2.27.0



* Re: [PATCH for-next 1/2] RDMA/erdma: Use fixed hardware page size
  2023-02-20 10:20 ` [PATCH for-next 1/2] RDMA/erdma: Use fixed hardware page size Cheng Xu
@ 2023-02-21  1:13   ` Jason Gunthorpe
  2023-02-21  2:23     ` Cheng Xu
  0 siblings, 1 reply; 5+ messages in thread
From: Jason Gunthorpe @ 2023-02-21  1:13 UTC (permalink / raw)
  To: Cheng Xu; +Cc: leon, linux-rdma, KaiShen

On Mon, Feb 20, 2023 at 06:20:14PM +0800, Cheng Xu wrote:
> Hardware page size is 4096, but the kernel's page size may vary. driver
> should use hardware page size to set the parameters to hardware.
> 
> Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com>
> ---
>  drivers/infiniband/hw/erdma/erdma_hw.h    |  4 ++++
>  drivers/infiniband/hw/erdma/erdma_verbs.c | 17 +++++++++--------
>  2 files changed, 13 insertions(+), 8 deletions(-)

This should have a fixes line

Jason


* Re: [PATCH for-next 1/2] RDMA/erdma: Use fixed hardware page size
  2023-02-21  1:13   ` Jason Gunthorpe
@ 2023-02-21  2:23     ` Cheng Xu
  0 siblings, 0 replies; 5+ messages in thread
From: Cheng Xu @ 2023-02-21  2:23 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: leon, linux-rdma, KaiShen



On 2/21/23 9:13 AM, Jason Gunthorpe wrote:
> On Mon, Feb 20, 2023 at 06:20:14PM +0800, Cheng Xu wrote:
>> Hardware page size is 4096, but the kernel's page size may vary. driver
>> should use hardware page size to set the parameters to hardware.
>>
>> Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com>
>> ---
>>  drivers/infiniband/hw/erdma/erdma_hw.h    |  4 ++++
>>  drivers/infiniband/hw/erdma/erdma_verbs.c | 17 +++++++++--------
>>  2 files changed, 13 insertions(+), 8 deletions(-)
> 
> This should have a fixes line
> 

OK, I will send v2 after the merge window.

Thanks,
Cheng Xu
