* [PATCH RFC V2 0/6] rxe_map_mr_sg() fix cleanup and refactor
@ 2023-11-03  9:55 Li Zhijian
  2023-11-03  9:55 ` [PATCH RFC V2 1/6] RDMA/rxe: RDMA/rxe: don't allow registering !PAGE_SIZE mr Li Zhijian
                   ` (8 more replies)
  0 siblings, 9 replies; 25+ messages in thread
From: Li Zhijian @ 2023-11-03  9:55 UTC (permalink / raw)
  To: zyjzyj2000, jgg, leon, linux-rdma
  Cc: linux-kernel, rpearsonhpe, matsuda-daisuke, bvanassche, yi.zhang,
	Li Zhijian

I have not carried over the Reviewed-by tags for patches 1-2 this time,
since I think we can make them better.

Patch 1-2: Fix the kernel panic [1] and help make SRP work again.
           Almost nothing changed from V1.
Patch 3-5: cleanups # newly added
Patch 6: make RXE support PAGE_SIZE aligned MRs # newly added, but not fully tested

My unreliable arm64 machine often hangs when running blktests, even when
I use the default siw driver.

- nvme and the ULPs (rtrs, iser) that always register a 4K MR are still
  not supported.

[1] https://lore.kernel.org/all/CAHj4cs9XRqE25jyVw9rj9YugffLn5+f=1znaBEnu1usLOciD+g@mail.gmail.com/T/

Li Zhijian (6):
  RDMA/rxe: RDMA/rxe: don't allow registering !PAGE_SIZE mr
  RDMA/rxe: set RXE_PAGE_SIZE_CAP to PAGE_SIZE
  RDMA/rxe: remove unused rxe_mr.page_shift
  RDMA/rxe: Use PAGE_SIZE and PAGE_SHIFT to extract address from
    page_list
  RDMA/rxe: cleanup rxe_mr.{page_size,page_shift}
  RDMA/rxe: Support PAGE_SIZE aligned MR

 drivers/infiniband/sw/rxe/rxe_mr.c    | 80 ++++++++++++++++-----------
 drivers/infiniband/sw/rxe/rxe_param.h |  2 +-
 drivers/infiniband/sw/rxe/rxe_verbs.h |  9 ---
 3 files changed, 48 insertions(+), 43 deletions(-)

-- 
2.41.0



* [PATCH RFC V2 1/6] RDMA/rxe: RDMA/rxe: don't allow registering !PAGE_SIZE mr
  2023-11-03  9:55 [PATCH RFC V2 0/6] rxe_map_mr_sg() fix cleanup and refactor Li Zhijian
@ 2023-11-03  9:55 ` Li Zhijian
  2023-11-03 10:14   ` Greg Sword
  2023-11-03  9:55 ` [PATCH RFC V2 2/6] RDMA/rxe: set RXE_PAGE_SIZE_CAP to PAGE_SIZE Li Zhijian
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 25+ messages in thread
From: Li Zhijian @ 2023-11-03  9:55 UTC (permalink / raw)
  To: zyjzyj2000, jgg, leon, linux-rdma
  Cc: linux-kernel, rpearsonhpe, matsuda-daisuke, bvanassche, yi.zhang,
	Li Zhijian

rxe_set_page() stores only one PAGE_SIZE page for each step of page_size.
When page_size != PAGE_SIZE, the index and page_offset come out wrong, so
we cannot recover the correct address.

Let's take a look at how the xarray is currently used.

0. offset = iova & (page_size -1); // offset is less than page_size
                                      but may not be less than PAGE_SIZE
1. index = (iova - mr.iova) >> page_shift;
2. page = xa_load(&mr->page_list, index);
3. page_va = kmap_local_page(page) // map one page only, that means only
                                      memory [page_va, page_va + PAGE_SIZE)
                                      is valid for this mapping.
4. memcpy(addr, page_va + offset, bytes);

- when page_size > PAGE_SIZE, the offset can exceed PAGE_SIZE, so
  page_va + offset may point outside the mapped page.
- when page_size < PAGE_SIZE, the offset is truncated to page_size, so
  the position inside the PAGE_SIZE page is lost.

Note that this patch will break some ULPs that try to register a 4K
MR when PAGE_SIZE is not 4K. SRP and nvme over RXE are known to be
impacted.

Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>

---
 drivers/infiniband/sw/rxe/rxe_mr.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index f54042e9aeb2..3755e530e6dc 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -234,6 +234,12 @@ int rxe_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sgl,
 	struct rxe_mr *mr = to_rmr(ibmr);
 	unsigned int page_size = mr_page_size(mr);
 
+	if (page_size != PAGE_SIZE) {
+		rxe_err_mr(mr, "Unsupport mr page size %x, expect PAGE_SIZE(%lx)\n",
+			   page_size, PAGE_SIZE);
+		return -EINVAL;
+	}
+
 	mr->nbuf = 0;
 	mr->page_shift = ilog2(page_size);
 	mr->page_mask = ~((u64)page_size - 1);
-- 
2.41.0



* [PATCH RFC V2 2/6] RDMA/rxe: set RXE_PAGE_SIZE_CAP to PAGE_SIZE
  2023-11-03  9:55 [PATCH RFC V2 0/6] rxe_map_mr_sg() fix cleanup and refactor Li Zhijian
  2023-11-03  9:55 ` [PATCH RFC V2 1/6] RDMA/rxe: RDMA/rxe: don't allow registering !PAGE_SIZE mr Li Zhijian
@ 2023-11-03  9:55 ` Li Zhijian
  2023-11-03  9:55 ` [PATCH RFC V2 3/6] RDMA/rxe: remove unused rxe_mr.page_shift Li Zhijian
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 25+ messages in thread
From: Li Zhijian @ 2023-11-03  9:55 UTC (permalink / raw)
  To: zyjzyj2000, jgg, leon, linux-rdma
  Cc: linux-kernel, rpearsonhpe, matsuda-daisuke, bvanassche, yi.zhang,
	Li Zhijian

RXE_PAGE_SIZE_CAP describes the MR page sizes supported by RXE. However,
in the current RXE implementation only PAGE_SIZE MRs work well, so
change it to PAGE_SIZE only.

ULPs such as SRP that calculate the page size from this attribute
work again with this change.

Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---
 drivers/infiniband/sw/rxe/rxe_param.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_param.h b/drivers/infiniband/sw/rxe/rxe_param.h
index d2f57ead78ad..b1cf1e1c0ce1 100644
--- a/drivers/infiniband/sw/rxe/rxe_param.h
+++ b/drivers/infiniband/sw/rxe/rxe_param.h
@@ -38,7 +38,7 @@ static inline enum ib_mtu eth_mtu_int_to_enum(int mtu)
 /* default/initial rxe device parameter settings */
 enum rxe_device_param {
 	RXE_MAX_MR_SIZE			= -1ull,
-	RXE_PAGE_SIZE_CAP		= 0xfffff000,
+	RXE_PAGE_SIZE_CAP		= PAGE_SIZE,
 	RXE_MAX_QP_WR			= DEFAULT_MAX_VALUE,
 	RXE_DEVICE_CAP_FLAGS		= IB_DEVICE_BAD_PKEY_CNTR
 					| IB_DEVICE_BAD_QKEY_CNTR
-- 
2.41.0



* [PATCH RFC V2 3/6] RDMA/rxe: remove unused rxe_mr.page_shift
  2023-11-03  9:55 [PATCH RFC V2 0/6] rxe_map_mr_sg() fix cleanup and refactor Li Zhijian
  2023-11-03  9:55 ` [PATCH RFC V2 1/6] RDMA/rxe: RDMA/rxe: don't allow registering !PAGE_SIZE mr Li Zhijian
  2023-11-03  9:55 ` [PATCH RFC V2 2/6] RDMA/rxe: set RXE_PAGE_SIZE_CAP to PAGE_SIZE Li Zhijian
@ 2023-11-03  9:55 ` Li Zhijian
  2023-11-03  9:55 ` [PATCH RFC V2 4/6] RDMA/rxe: Use PAGE_SIZE and PAGE_SHIFT to extract address from page_list Li Zhijian
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 25+ messages in thread
From: Li Zhijian @ 2023-11-03  9:55 UTC (permalink / raw)
  To: zyjzyj2000, jgg, leon, linux-rdma
  Cc: linux-kernel, rpearsonhpe, matsuda-daisuke, bvanassche, yi.zhang,
	Li Zhijian

rxe_mr.page_offset is assigned but never used.

Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---
 drivers/infiniband/sw/rxe/rxe_mr.c    | 1 -
 drivers/infiniband/sw/rxe/rxe_verbs.h | 1 -
 2 files changed, 2 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index 3755e530e6dc..bbfedcd8d2cb 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -243,7 +243,6 @@ int rxe_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sgl,
 	mr->nbuf = 0;
 	mr->page_shift = ilog2(page_size);
 	mr->page_mask = ~((u64)page_size - 1);
-	mr->page_offset = mr->ibmr.iova & (page_size - 1);
 
 	return ib_sg_to_pages(ibmr, sgl, sg_nents, sg_offset, rxe_set_page);
 }
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h
index ccb9d19ffe8a..11647e976282 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.h
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.h
@@ -309,7 +309,6 @@ struct rxe_mr {
 	int			access;
 	atomic_t		num_mw;
 
-	unsigned int		page_offset;
 	unsigned int		page_shift;
 	u64			page_mask;
 
-- 
2.41.0



* [PATCH RFC V2 4/6] RDMA/rxe: Use PAGE_SIZE and PAGE_SHIFT to extract address from page_list
  2023-11-03  9:55 [PATCH RFC V2 0/6] rxe_map_mr_sg() fix cleanup and refactor Li Zhijian
                   ` (2 preceding siblings ...)
  2023-11-03  9:55 ` [PATCH RFC V2 3/6] RDMA/rxe: remove unused rxe_mr.page_shift Li Zhijian
@ 2023-11-03  9:55 ` Li Zhijian
  2023-11-03 17:59   ` Jason Gunthorpe
  2023-11-03  9:55 ` [PATCH RFC V2 5/6] RDMA/rxe: cleanup rxe_mr.{page_size,page_shift} Li Zhijian
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 25+ messages in thread
From: Li Zhijian @ 2023-11-03  9:55 UTC (permalink / raw)
  To: zyjzyj2000, jgg, leon, linux-rdma
  Cc: linux-kernel, rpearsonhpe, matsuda-daisuke, bvanassche, yi.zhang,
	Li Zhijian

As noted in a previous commit, page_list only stores PAGE_SIZE pages, so
when we extract an address from the page_list, we should use PAGE_SIZE
and PAGE_SHIFT instead of ibmr.page_size.

Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---
 drivers/infiniband/sw/rxe/rxe_mr.c    | 42 +++++++++------------------
 drivers/infiniband/sw/rxe/rxe_verbs.h |  5 ----
 2 files changed, 14 insertions(+), 33 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index bbfedcd8d2cb..d39c02f0c51e 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -72,16 +72,6 @@ void rxe_mr_init_dma(int access, struct rxe_mr *mr)
 	mr->ibmr.type = IB_MR_TYPE_DMA;
 }
 
-static unsigned long rxe_mr_iova_to_index(struct rxe_mr *mr, u64 iova)
-{
-	return (iova >> mr->page_shift) - (mr->ibmr.iova >> mr->page_shift);
-}
-
-static unsigned long rxe_mr_iova_to_page_offset(struct rxe_mr *mr, u64 iova)
-{
-	return iova & (mr_page_size(mr) - 1);
-}
-
 static bool is_pmem_page(struct page *pg)
 {
 	unsigned long paddr = page_to_phys(pg);
@@ -232,17 +222,16 @@ int rxe_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sgl,
 		  int sg_nents, unsigned int *sg_offset)
 {
 	struct rxe_mr *mr = to_rmr(ibmr);
-	unsigned int page_size = mr_page_size(mr);
 
-	if (page_size != PAGE_SIZE) {
+	if (ibmr->page_size != PAGE_SIZE) {
 		rxe_err_mr(mr, "Unsupport mr page size %x, expect PAGE_SIZE(%lx)\n",
-			   page_size, PAGE_SIZE);
+			   ibmr->page_size, PAGE_SIZE);
 		return -EINVAL;
 	}
 
 	mr->nbuf = 0;
-	mr->page_shift = ilog2(page_size);
-	mr->page_mask = ~((u64)page_size - 1);
+	mr->page_shift = PAGE_SHIFT;
+	mr->page_mask = PAGE_MASK;
 
 	return ib_sg_to_pages(ibmr, sgl, sg_nents, sg_offset, rxe_set_page);
 }
@@ -250,8 +239,8 @@ int rxe_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sgl,
 static int rxe_mr_copy_xarray(struct rxe_mr *mr, u64 iova, void *addr,
 			      unsigned int length, enum rxe_mr_copy_dir dir)
 {
-	unsigned int page_offset = rxe_mr_iova_to_page_offset(mr, iova);
-	unsigned long index = rxe_mr_iova_to_index(mr, iova);
+	unsigned int page_offset = iova & (PAGE_SIZE - 1);
+	unsigned long index = (iova - mr->ibmr.iova) >> PAGE_SHIFT;
 	unsigned int bytes;
 	struct page *page;
 	void *va;
@@ -261,8 +250,7 @@ static int rxe_mr_copy_xarray(struct rxe_mr *mr, u64 iova, void *addr,
 		if (!page)
 			return -EFAULT;
 
-		bytes = min_t(unsigned int, length,
-				mr_page_size(mr) - page_offset);
+		bytes = min_t(unsigned int, length, PAGE_SIZE - page_offset);
 		va = kmap_local_page(page);
 		if (dir == RXE_FROM_MR_OBJ)
 			memcpy(addr, va + page_offset, bytes);
@@ -450,14 +438,12 @@ int rxe_flush_pmem_iova(struct rxe_mr *mr, u64 iova, unsigned int length)
 		return err;
 
 	while (length > 0) {
-		index = rxe_mr_iova_to_index(mr, iova);
+		index = (iova - mr->ibmr.iova) >> PAGE_SHIFT;
 		page = xa_load(&mr->page_list, index);
-		page_offset = rxe_mr_iova_to_page_offset(mr, iova);
+		page_offset = iova & (PAGE_SIZE - 1);
 		if (!page)
 			return -EFAULT;
-		bytes = min_t(unsigned int, length,
-				mr_page_size(mr) - page_offset);
-
+		bytes = min_t(unsigned int, length, PAGE_SIZE - page_offset);
 		va = kmap_local_page(page);
 		arch_wb_cache_pmem(va + page_offset, bytes);
 		kunmap_local(va);
@@ -498,8 +484,8 @@ int rxe_mr_do_atomic_op(struct rxe_mr *mr, u64 iova, int opcode,
 			rxe_dbg_mr(mr, "iova out of range");
 			return RESPST_ERR_RKEY_VIOLATION;
 		}
-		page_offset = rxe_mr_iova_to_page_offset(mr, iova);
-		index = rxe_mr_iova_to_index(mr, iova);
+		page_offset = iova & (PAGE_SIZE - 1);
+		index = (iova - mr->ibmr.iova) >> PAGE_SHIFT;
 		page = xa_load(&mr->page_list, index);
 		if (!page)
 			return RESPST_ERR_RKEY_VIOLATION;
@@ -556,8 +542,8 @@ int rxe_mr_do_atomic_write(struct rxe_mr *mr, u64 iova, u64 value)
 			rxe_dbg_mr(mr, "iova out of range");
 			return RESPST_ERR_RKEY_VIOLATION;
 		}
-		page_offset = rxe_mr_iova_to_page_offset(mr, iova);
-		index = rxe_mr_iova_to_index(mr, iova);
+		page_offset = iova & (PAGE_SIZE - 1);
+		index = (iova - mr->ibmr.iova) >> PAGE_SHIFT;
 		page = xa_load(&mr->page_list, index);
 		if (!page)
 			return RESPST_ERR_RKEY_VIOLATION;
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h
index 11647e976282..ccc75f8c0985 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.h
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.h
@@ -318,11 +318,6 @@ struct rxe_mr {
 	struct xarray		page_list;
 };
 
-static inline unsigned int mr_page_size(struct rxe_mr *mr)
-{
-	return mr ? mr->ibmr.page_size : PAGE_SIZE;
-}
-
 enum rxe_mw_state {
 	RXE_MW_STATE_INVALID	= RXE_MR_STATE_INVALID,
 	RXE_MW_STATE_FREE	= RXE_MR_STATE_FREE,
-- 
2.41.0



* [PATCH RFC V2 5/6] RDMA/rxe: cleanup rxe_mr.{page_size,page_shift}
  2023-11-03  9:55 [PATCH RFC V2 0/6] rxe_map_mr_sg() fix cleanup and refactor Li Zhijian
                   ` (3 preceding siblings ...)
  2023-11-03  9:55 ` [PATCH RFC V2 4/6] RDMA/rxe: Use PAGE_SIZE and PAGE_SHIFT to extract address from page_list Li Zhijian
@ 2023-11-03  9:55 ` Li Zhijian
  2023-11-03  9:55 ` [PATCH RFC V2 6/6] RDMA/rxe: Support PAGE_SIZE aligned MR Li Zhijian
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 25+ messages in thread
From: Li Zhijian @ 2023-11-03  9:55 UTC (permalink / raw)
  To: zyjzyj2000, jgg, leon, linux-rdma
  Cc: linux-kernel, rpearsonhpe, matsuda-daisuke, bvanassche, yi.zhang,
	Li Zhijian

These two fields appear to have been intended for extracting addresses
from the page_list. Now that we use PAGE_SIZE and PAGE_SHIFT directly,
we can drop them.

Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---
 drivers/infiniband/sw/rxe/rxe_mr.c    | 4 ----
 drivers/infiniband/sw/rxe/rxe_verbs.h | 3 ---
 2 files changed, 7 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index d39c02f0c51e..a038133e1322 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -59,8 +59,6 @@ static void rxe_mr_init(int access, struct rxe_mr *mr)
 
 	mr->access = access;
 	mr->ibmr.page_size = PAGE_SIZE;
-	mr->page_mask = PAGE_MASK;
-	mr->page_shift = PAGE_SHIFT;
 	mr->state = RXE_MR_STATE_INVALID;
 }
 
@@ -230,8 +228,6 @@ int rxe_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sgl,
 	}
 
 	mr->nbuf = 0;
-	mr->page_shift = PAGE_SHIFT;
-	mr->page_mask = PAGE_MASK;
 
 	return ib_sg_to_pages(ibmr, sgl, sg_nents, sg_offset, rxe_set_page);
 }
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h
index ccc75f8c0985..ef813560b0ab 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.h
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.h
@@ -309,9 +309,6 @@ struct rxe_mr {
 	int			access;
 	atomic_t		num_mw;
 
-	unsigned int		page_shift;
-	u64			page_mask;
-
 	u32			num_buf;
 	u32			nbuf;
 
-- 
2.41.0



* [PATCH RFC V2 6/6] RDMA/rxe: Support PAGE_SIZE aligned MR
  2023-11-03  9:55 [PATCH RFC V2 0/6] rxe_map_mr_sg() fix cleanup and refactor Li Zhijian
                   ` (4 preceding siblings ...)
  2023-11-03  9:55 ` [PATCH RFC V2 5/6] RDMA/rxe: cleanup rxe_mr.{page_size,page_shift} Li Zhijian
@ 2023-11-03  9:55 ` Li Zhijian
  2023-11-03 15:04   ` Bart Van Assche
  2023-11-03 10:17 ` [PATCH RFC V2 0/6] rxe_map_mr_sg() fix cleanup and refactor Greg Sword
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 25+ messages in thread
From: Li Zhijian @ 2023-11-03  9:55 UTC (permalink / raw)
  To: zyjzyj2000, jgg, leon, linux-rdma
  Cc: linux-kernel, rpearsonhpe, matsuda-daisuke, bvanassche, yi.zhang,
	Li Zhijian

To support PAGE_SIZE aligned MRs, rxe_map_mr_sg() should be able to
split a large buffer into N PAGE_SIZE entries in the xarray page_list.

Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---
 drivers/infiniband/sw/rxe/rxe_mr.c | 39 +++++++++++++++++++++++++-----
 1 file changed, 33 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index a038133e1322..3761740af986 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -193,9 +193,8 @@ int rxe_mr_init_fast(int max_pages, struct rxe_mr *mr)
 	return err;
 }
 
-static int rxe_set_page(struct ib_mr *ibmr, u64 dma_addr)
+static int rxe_store_page(struct rxe_mr *mr, u64 dma_addr)
 {
-	struct rxe_mr *mr = to_rmr(ibmr);
 	struct page *page = ib_virt_dma_to_page(dma_addr);
 	bool persistent = !!(mr->access & IB_ACCESS_FLUSH_PERSISTENT);
 	int err;
@@ -216,20 +215,48 @@ static int rxe_set_page(struct ib_mr *ibmr, u64 dma_addr)
 	return 0;
 }
 
+static int rxe_set_page(struct ib_mr *base_mr, u64 buf_addr)
+{
+	return 0;
+}
+
 int rxe_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sgl,
-		  int sg_nents, unsigned int *sg_offset)
+		  int sg_nents, unsigned int *sg_offset_p)
 {
 	struct rxe_mr *mr = to_rmr(ibmr);
+	struct scatterlist *sg;
+	unsigned int sg_offset = sg_offset_p ? *sg_offset_p : 0;
+	int i;
 
-	if (ibmr->page_size != PAGE_SIZE) {
-		rxe_err_mr(mr, "Unsupport mr page size %x, expect PAGE_SIZE(%lx)\n",
+	if (!IS_ALIGNED(ibmr->page_size, PAGE_SIZE)) {
+		rxe_err_mr(mr, "Misaligned page size %x, expect PAGE_SIZE(%lx) aligned\n",
 			   ibmr->page_size, PAGE_SIZE);
 		return -EINVAL;
 	}
 
 	mr->nbuf = 0;
 
-	return ib_sg_to_pages(ibmr, sgl, sg_nents, sg_offset, rxe_set_page);
+	for_each_sg(sgl, sg, sg_nents, i) {
+		u64 dma_addr = sg_dma_address(sg) + sg_offset;
+		unsigned int dma_len = sg_dma_len(sg) - sg_offset;
+		u64 end_dma_addr = dma_addr + dma_len;
+		u64 page_addr = dma_addr & PAGE_MASK;
+
+		if (sg_dma_len(sg) == 0) {
+			rxe_dbg_mr(mr, "empty SGE\n");
+			return -EINVAL;
+		}
+		do {
+			int ret = rxe_store_page(mr, page_addr);
+			if (ret)
+				return ret;
+
+			page_addr += PAGE_SIZE;
+		} while (page_addr < end_dma_addr);
+		sg_offset = 0;
+	}
+
+	return ib_sg_to_pages(ibmr, sgl, sg_nents, sg_offset_p, rxe_set_page);
 }
 
 static int rxe_mr_copy_xarray(struct rxe_mr *mr, u64 iova, void *addr,
-- 
2.41.0



* Re: [PATCH RFC V2 1/6] RDMA/rxe: RDMA/rxe: don't allow registering !PAGE_SIZE mr
  2023-11-03  9:55 ` [PATCH RFC V2 1/6] RDMA/rxe: RDMA/rxe: don't allow registering !PAGE_SIZE mr Li Zhijian
@ 2023-11-03 10:14   ` Greg Sword
  0 siblings, 0 replies; 25+ messages in thread
From: Greg Sword @ 2023-11-03 10:14 UTC (permalink / raw)
  To: Li Zhijian
  Cc: zyjzyj2000, jgg, leon, linux-rdma, linux-kernel, rpearsonhpe,
	matsuda-daisuke, bvanassche, yi.zhang

On Fri, Nov 3, 2023 at 5:56 PM Li Zhijian <lizhijian@fujitsu.com> wrote:
>
> rxe_set_page() only store one PAGE_SIZE page by the step of page_size.
> when page_size != PAGE_SIZE, we cannot restore the address with wrong
> index and page_offset.
>
> Let's take a look how current the xarray is being used.
>
> 0. offset = iova & (page_size -1); // offset is less than page_size
>                                       but may not PAGE_SIZE
> 1. index = (iova - mr.iova) >> page_shift;
> 2. page = xa_load(&mr->page_list, index);
> 3. page_va = kmap_local_page(page) // map one page only, that means only
>                                       memory [page_va, page_va + PAGE_SIZE)
>                                       is valid for this mapping.
> 4. memcpy(addr, page_va + offset, bytes);
>
> - when page_size > PAGE_SIZE, the offset could be beyond PAGE_SIZE,
>   then page_va + offset may be invalid.
> - when page_size < PAGE_SIZE, the offset may get lost.
>
> Note that this patch will break some ULPs that try to register 4K
> MR when PAGE_SIZE is not 4K. SRP and nvme over RXE is known to be
> impacted.
>
> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
>
> ---
> ---
>  drivers/infiniband/sw/rxe/rxe_mr.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
> index f54042e9aeb2..3755e530e6dc 100644
> --- a/drivers/infiniband/sw/rxe/rxe_mr.c
> +++ b/drivers/infiniband/sw/rxe/rxe_mr.c
> @@ -234,6 +234,12 @@ int rxe_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sgl,
>         struct rxe_mr *mr = to_rmr(ibmr);
>         unsigned int page_size = mr_page_size(mr);
>
> +       if (page_size != PAGE_SIZE) {
> +               rxe_err_mr(mr, "Unsupport mr page size %x, expect PAGE_SIZE(%lx)\n",
> +                          page_size, PAGE_SIZE);
> +               return -EINVAL;
> +       }

Are you kidding us? What problem are you fixing? Did you test this on your host?

A rubbish patch.

> +
>         mr->nbuf = 0;
>         mr->page_shift = ilog2(page_size);
>         mr->page_mask = ~((u64)page_size - 1);
> --
> 2.41.0
>


* Re: [PATCH RFC V2 0/6] rxe_map_mr_sg() fix cleanup and refactor
  2023-11-03  9:55 [PATCH RFC V2 0/6] rxe_map_mr_sg() fix cleanup and refactor Li Zhijian
                   ` (5 preceding siblings ...)
  2023-11-03  9:55 ` [PATCH RFC V2 6/6] RDMA/rxe: Support PAGE_SIZE aligned MR Li Zhijian
@ 2023-11-03 10:17 ` Greg Sword
  2023-11-06  3:46   ` Zhijian Li (Fujitsu)
  2023-11-03 13:00 ` Zhu Yanjun
  2023-11-06  7:59 ` Zhijian Li (Fujitsu)
  8 siblings, 1 reply; 25+ messages in thread
From: Greg Sword @ 2023-11-03 10:17 UTC (permalink / raw)
  To: Li Zhijian
  Cc: zyjzyj2000, jgg, leon, linux-rdma, linux-kernel, rpearsonhpe,
	matsuda-daisuke, bvanassche, yi.zhang

On Fri, Nov 3, 2023 at 5:58 PM Li Zhijian <lizhijian@fujitsu.com> wrote:
>
> I don't collect the Reviewed-by to the patch1-2 this time, since i
> think we can make it better.
>
> Patch1-2: Fix kernel panic[1] and benifit to make srp work again.
>           Almost nothing change from V1.
> Patch3-5: cleanups # newly add
> Patch6: make RXE support PAGE_SIZE aligned mr # newly add, but not fully tested

Do some work. Do not use these rubbish patches to waste our time.

>
> My bad arm64 mechine offten hangs when doing blktests even though i use the
> default siw driver.
>
> - nvme and ULPs(rtrs, iser) always registers 4K mr still don't supported yet.
>
> [1] https://lore.kernel.org/all/CAHj4cs9XRqE25jyVw9rj9YugffLn5+f=1znaBEnu1usLOciD+g@mail.gmail.com/T/
>
> Li Zhijian (6):
>   RDMA/rxe: RDMA/rxe: don't allow registering !PAGE_SIZE mr
>   RDMA/rxe: set RXE_PAGE_SIZE_CAP to PAGE_SIZE
>   RDMA/rxe: remove unused rxe_mr.page_shift
>   RDMA/rxe: Use PAGE_SIZE and PAGE_SHIFT to extract address from
>     page_list
>   RDMA/rxe: cleanup rxe_mr.{page_size,page_shift}
>   RDMA/rxe: Support PAGE_SIZE aligned MR
>
>  drivers/infiniband/sw/rxe/rxe_mr.c    | 80 ++++++++++++++++-----------
>  drivers/infiniband/sw/rxe/rxe_param.h |  2 +-
>  drivers/infiniband/sw/rxe/rxe_verbs.h |  9 ---
>  3 files changed, 48 insertions(+), 43 deletions(-)
>
> --
> 2.41.0
>


* Re: [PATCH RFC V2 0/6] rxe_map_mr_sg() fix cleanup and refactor
  2023-11-03  9:55 [PATCH RFC V2 0/6] rxe_map_mr_sg() fix cleanup and refactor Li Zhijian
                   ` (6 preceding siblings ...)
  2023-11-03 10:17 ` [PATCH RFC V2 0/6] rxe_map_mr_sg() fix cleanup and refactor Greg Sword
@ 2023-11-03 13:00 ` Zhu Yanjun
  2023-11-06  4:07   ` Zhijian Li (Fujitsu)
  2023-11-06  7:59 ` Zhijian Li (Fujitsu)
  8 siblings, 1 reply; 25+ messages in thread
From: Zhu Yanjun @ 2023-11-03 13:00 UTC (permalink / raw)
  To: Li Zhijian, zyjzyj2000, jgg, leon, linux-rdma
  Cc: linux-kernel, rpearsonhpe, matsuda-daisuke, bvanassche, yi.zhang

On 2023/11/3 17:55, Li Zhijian wrote:
> I don't collect the Reviewed-by to the patch1-2 this time, since i
> think we can make it better.
> 
> Patch1-2: Fix kernel panic[1] and benifit to make srp work again.
>            Almost nothing change from V1.
> Patch3-5: cleanups # newly add
> Patch6: make RXE support PAGE_SIZE aligned mr # newly add, but not fully tested
> 
> My bad arm64 mechine offten hangs when doing blktests even though i use the
> default siw driver.
> 
> - nvme and ULPs(rtrs, iser) always registers 4K mr still don't supported yet.

Zhijian

Please read the whole discussion about this problem carefully. You will
find a lot of valuable suggestions, especially those from Jason.

From the whole discussion, it seems that the root cause is very clear.
We need to fix this problem. Please do not send this kind of commit again.

Zhu Yanjun

> 
> [1] https://lore.kernel.org/all/CAHj4cs9XRqE25jyVw9rj9YugffLn5+f=1znaBEnu1usLOciD+g@mail.gmail.com/T/
> 
> Li Zhijian (6):
>    RDMA/rxe: RDMA/rxe: don't allow registering !PAGE_SIZE mr
>    RDMA/rxe: set RXE_PAGE_SIZE_CAP to PAGE_SIZE
>    RDMA/rxe: remove unused rxe_mr.page_shift
>    RDMA/rxe: Use PAGE_SIZE and PAGE_SHIFT to extract address from
>      page_list
>    RDMA/rxe: cleanup rxe_mr.{page_size,page_shift}
>    RDMA/rxe: Support PAGE_SIZE aligned MR
> 
>   drivers/infiniband/sw/rxe/rxe_mr.c    | 80 ++++++++++++++++-----------
>   drivers/infiniband/sw/rxe/rxe_param.h |  2 +-
>   drivers/infiniband/sw/rxe/rxe_verbs.h |  9 ---
>   3 files changed, 48 insertions(+), 43 deletions(-)
> 



* Re: [PATCH RFC V2 6/6] RDMA/rxe: Support PAGE_SIZE aligned MR
  2023-11-03  9:55 ` [PATCH RFC V2 6/6] RDMA/rxe: Support PAGE_SIZE aligned MR Li Zhijian
@ 2023-11-03 15:04   ` Bart Van Assche
  2023-11-06  3:07     ` Zhijian Li (Fujitsu)
  0 siblings, 1 reply; 25+ messages in thread
From: Bart Van Assche @ 2023-11-03 15:04 UTC (permalink / raw)
  To: Li Zhijian, zyjzyj2000, jgg, leon, linux-rdma
  Cc: linux-kernel, rpearsonhpe, matsuda-daisuke, yi.zhang


On 11/3/23 02:55, Li Zhijian wrote:
> -	return ib_sg_to_pages(ibmr, sgl, sg_nents, sg_offset, rxe_set_page);
> +	for_each_sg(sgl, sg, sg_nents, i) {
> +		u64 dma_addr = sg_dma_address(sg) + sg_offset;
> +		unsigned int dma_len = sg_dma_len(sg) - sg_offset;
> +		u64 end_dma_addr = dma_addr + dma_len;
> +		u64 page_addr = dma_addr & PAGE_MASK;
> +
> +		if (sg_dma_len(sg) == 0) {
> +			rxe_dbg_mr(mr, "empty SGE\n");
> +			return -EINVAL;
> +		}
> +		do {
> +			int ret = rxe_store_page(mr, page_addr);
> +			if (ret)
> +				return ret;
> +
> +			page_addr += PAGE_SIZE;
> +		} while (page_addr < end_dma_addr);
> +		sg_offset = 0;
> +	}
> +
> +	return ib_sg_to_pages(ibmr, sgl, sg_nents, sg_offset_p, rxe_set_page);
>   }

Is this change necessary? There is already a loop in ib_sg_to_pages()
that splits SG entries that are larger than mr->page_size into entries
with size mr->page_size.

Bart.


* Re: [PATCH RFC V2 4/6] RDMA/rxe: Use PAGE_SIZE and PAGE_SHIFT to extract address from page_list
  2023-11-03  9:55 ` [PATCH RFC V2 4/6] RDMA/rxe: Use PAGE_SIZE and PAGE_SHIFT to extract address from page_list Li Zhijian
@ 2023-11-03 17:59   ` Jason Gunthorpe
  0 siblings, 0 replies; 25+ messages in thread
From: Jason Gunthorpe @ 2023-11-03 17:59 UTC (permalink / raw)
  To: Li Zhijian
  Cc: zyjzyj2000, leon, linux-rdma, linux-kernel, rpearsonhpe,
	matsuda-daisuke, bvanassche, yi.zhang

On Fri, Nov 03, 2023 at 05:55:47PM +0800, Li Zhijian wrote:
> As we said in previous commit, page_list only stores PAGE_SIZE page, so
> when we extract an address from the page_list, we should use PAGE_SIZE
> and PAGE_SHIFT instead of the ibmr.page_size.

The concept was that the xarray could store anything larger than
PAGE_SIZE and the entry would point at the first struct page of the
contiguous chunk.

That looks like it is right, or at least close to right, so let's try
to keep it.

Jason


* Re: [PATCH RFC V2 6/6] RDMA/rxe: Support PAGE_SIZE aligned MR
  2023-11-03 15:04   ` Bart Van Assche
@ 2023-11-06  3:07     ` Zhijian Li (Fujitsu)
  0 siblings, 0 replies; 25+ messages in thread
From: Zhijian Li (Fujitsu) @ 2023-11-06  3:07 UTC (permalink / raw)
  To: Bart Van Assche, zyjzyj2000@gmail.com, jgg@ziepe.ca,
	leon@kernel.org, linux-rdma@vger.kernel.org
  Cc: linux-kernel@vger.kernel.org, rpearsonhpe@gmail.com,
	Daisuke Matsuda (Fujitsu), yi.zhang@redhat.com



On 03/11/2023 23:04, Bart Van Assche wrote:
> 
> On 11/3/23 02:55, Li Zhijian wrote:
>> -    return ib_sg_to_pages(ibmr, sgl, sg_nents, sg_offset, rxe_set_page);
>> +    for_each_sg(sgl, sg, sg_nents, i) {
>> +        u64 dma_addr = sg_dma_address(sg) + sg_offset;
>> +        unsigned int dma_len = sg_dma_len(sg) - sg_offset;
>> +        u64 end_dma_addr = dma_addr + dma_len;
>> +        u64 page_addr = dma_addr & PAGE_MASK;
>> +
>> +        if (sg_dma_len(sg) == 0) {
>> +            rxe_dbg_mr(mr, "empty SGE\n");
>> +            return -EINVAL;
>> +        }
>> +        do {
>> +            int ret = rxe_store_page(mr, page_addr);
>> +            if (ret)
>> +                return ret;
>> +
>> +            page_addr += PAGE_SIZE;
>> +        } while (page_addr < end_dma_addr);
>> +        sg_offset = 0;
>> +    }
>> +
>> +    return ib_sg_to_pages(ibmr, sgl, sg_nents, sg_offset_p, rxe_set_page);
>>   }
> 
> Is this change necessary? There is already a loop in ib_sg_to_pages()
> that splits SG entries that are larger than mr->page_size into entries
> with size mr->page_size.

I see.

My thought was that we can only safely access one PAGE_SIZE range,
[page_va, page_va + PAGE_SIZE), through the address returned by
kmap_local_page(page). However, when mr->page_size is larger than
PAGE_SIZE, we may access the following pages without mapping them.

Thanks
Zhijian


* Re: [PATCH RFC V2 0/6] rxe_map_mr_sg() fix cleanup and refactor
  2023-11-03 10:17 ` [PATCH RFC V2 0/6] rxe_map_mr_sg() fix cleanup and refactor Greg Sword
@ 2023-11-06  3:46   ` Zhijian Li (Fujitsu)
  0 siblings, 0 replies; 25+ messages in thread
From: Zhijian Li (Fujitsu) @ 2023-11-06  3:46 UTC (permalink / raw)
  To: Greg Sword
  Cc: zyjzyj2000@gmail.com, jgg@ziepe.ca, leon@kernel.org,
	linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
	rpearsonhpe@gmail.com, Daisuke Matsuda (Fujitsu),
	bvanassche@acm.org, yi.zhang@redhat.com



On 03/11/2023 18:17, Greg Sword wrote:
> On Fri, Nov 3, 2023 at 5:58 PM Li Zhijian <lizhijian@fujitsu.com> wrote:
>>
>> I don't collect the Reviewed-by to the patch1-2 this time, since i
>> think we can make it better.
>>
>> Patch1-2: Fix kernel panic[1] and benifit to make srp work again.
>>            Almost nothing change from V1.
>> Patch3-5: cleanups # newly add
>> Patch6: make RXE support PAGE_SIZE aligned mr # newly add, but not fully tested
> 
> Do some work. Do not use these rubbish patch to waste our time.

Sorry about this. Of course, any other proposals are welcome.

> 
>>
>> My bad arm64 mechine offten hangs when doing blktests even though i use the
>> default siw driver.
>>
>> - nvme and ULPs(rtrs, iser) always registers 4K mr still don't supported yet.
>>
>> [1] https://lore.kernel.org/all/CAHj4cs9XRqE25jyVw9rj9YugffLn5+f=1znaBEnu1usLOciD+g@mail.gmail.com/T/
>>
>> Li Zhijian (6):
>>    RDMA/rxe: RDMA/rxe: don't allow registering !PAGE_SIZE mr
>>    RDMA/rxe: set RXE_PAGE_SIZE_CAP to PAGE_SIZE
>>    RDMA/rxe: remove unused rxe_mr.page_shift
>>    RDMA/rxe: Use PAGE_SIZE and PAGE_SHIFT to extract address from
>>      page_list
>>    RDMA/rxe: cleanup rxe_mr.{page_size,page_shift}
>>    RDMA/rxe: Support PAGE_SIZE aligned MR
>>
>>   drivers/infiniband/sw/rxe/rxe_mr.c    | 80 ++++++++++++++++-----------
>>   drivers/infiniband/sw/rxe/rxe_param.h |  2 +-
>>   drivers/infiniband/sw/rxe/rxe_verbs.h |  9 ---
>>   3 files changed, 48 insertions(+), 43 deletions(-)
>>
>> --
>> 2.41.0
>>

* Re: [PATCH RFC V2 0/6] rxe_map_mr_sg() fix cleanup and refactor
  2023-11-03 13:00 ` Zhu Yanjun
@ 2023-11-06  4:07   ` Zhijian Li (Fujitsu)
  2023-11-06 13:58     ` Zhu Yanjun
  2023-11-06 14:13     ` Jason Gunthorpe
  0 siblings, 2 replies; 25+ messages in thread
From: Zhijian Li (Fujitsu) @ 2023-11-06  4:07 UTC (permalink / raw)
  To: Zhu Yanjun, zyjzyj2000@gmail.com, jgg@ziepe.ca, leon@kernel.org,
	linux-rdma@vger.kernel.org
  Cc: linux-kernel@vger.kernel.org, rpearsonhpe@gmail.com,
	Daisuke Matsuda (Fujitsu), bvanassche@acm.org,
	yi.zhang@redhat.com



On 03/11/2023 21:00, Zhu Yanjun wrote:
> On 2023/11/3 17:55, Li Zhijian wrote:
>> I don't collect the Reviewed-by to the patch1-2 this time, since i
>> think we can make it better.
>>
>> Patch1-2: Fix kernel panic[1] and benifit to make srp work again.
>>            Almost nothing change from V1.
>> Patch3-5: cleanups # newly add
>> Patch6: make RXE support PAGE_SIZE aligned mr # newly add, but not fully tested
>>
>> My bad arm64 mechine offten hangs when doing blktests even though i use the
>> default siw driver.
>>
>> - nvme and ULPs(rtrs, iser) always registers 4K mr still don't supported yet.
> 
> Zhijian
> 
> Please read carefully the whole discussion about this problem. You will find a lot of valuable suggestions, especially suggestions from Jason.

Okay, I will read it again. It would help if you could tell me which thread.


> 
>  From the whole discussion, it seems that the root cause is very clear.
> We need to fix this prolem. Please do not send this kind of commits again.
> 

Let's first think about what our goal is.

- 1) Fix the panic[1] and only support PAGE_SIZE MR
- 2) Support PAGE_SIZE aligned MR
- 3) Support any page_size MR

I'm sorry, I'm not familiar with the Linux MM subsystem. It seems it's safe/correct to access
memory across pages, starting from the address returned by kmap_local_page(page).
In other words, 2) is already natively supported, right?

I am totally confused now.



> Zhu Yanjun
> 
>>
>> [1] https://lore.kernel.org/all/CAHj4cs9XRqE25jyVw9rj9YugffLn5+f=1znaBEnu1usLOciD+g@mail.gmail.com/T/
>>
>> Li Zhijian (6):
>>    RDMA/rxe: RDMA/rxe: don't allow registering !PAGE_SIZE mr
>>    RDMA/rxe: set RXE_PAGE_SIZE_CAP to PAGE_SIZE
>>    RDMA/rxe: remove unused rxe_mr.page_shift
>>    RDMA/rxe: Use PAGE_SIZE and PAGE_SHIFT to extract address from
>>      page_list
>>    RDMA/rxe: cleanup rxe_mr.{page_size,page_shift}
>>    RDMA/rxe: Support PAGE_SIZE aligned MR
>>
>>   drivers/infiniband/sw/rxe/rxe_mr.c    | 80 ++++++++++++++++-----------
>>   drivers/infiniband/sw/rxe/rxe_param.h |  2 +-
>>   drivers/infiniband/sw/rxe/rxe_verbs.h |  9 ---
>>   3 files changed, 48 insertions(+), 43 deletions(-)
>>
> 

* Re: [PATCH RFC V2 0/6] rxe_map_mr_sg() fix cleanup and refactor
  2023-11-03  9:55 [PATCH RFC V2 0/6] rxe_map_mr_sg() fix cleanup and refactor Li Zhijian
                   ` (7 preceding siblings ...)
  2023-11-03 13:00 ` Zhu Yanjun
@ 2023-11-06  7:59 ` Zhijian Li (Fujitsu)
  2023-11-06  9:35   ` Greg Sword
  8 siblings, 1 reply; 25+ messages in thread
From: Zhijian Li (Fujitsu) @ 2023-11-06  7:59 UTC (permalink / raw)
  To: zyjzyj2000@gmail.com, jgg@ziepe.ca, leon@kernel.org,
	linux-rdma@vger.kernel.org
  Cc: linux-kernel@vger.kernel.org, rpearsonhpe@gmail.com,
	Daisuke Matsuda (Fujitsu), bvanassche@acm.org,
	yi.zhang@redhat.com



Many thanks for all your feedback.

On 03/11/2023 17:55, Li Zhijian wrote:
> I don't collect the Reviewed-by to the patch1-2 this time, since i
> think we can make it better.
> 
> Patch1-2: Fix kernel panic[1] and benifit to make srp work again.
>            Almost nothing change from V1.

Quote from Jason:
"
> The concept was that the xarray could store anything larger than
> PAGE_SIZE and the entry would point at the first struct page of the
> contiguous chunk
> 
> That looks like it is right, or at least close to right, so lets try
> to keep it
"


It seems it's okay on RXE to access memory across pages even though
we only map the first page.

That also means PAGE_SIZE aligned MRs are already supported, so checking only
`if (IS_ALIGNED(page_size, PAGE_SIZE))` is sufficient, right?

diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index f54042e9aeb2..3755e530e6dc 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -234,6 +234,12 @@ int rxe_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sgl,
         struct rxe_mr *mr = to_rmr(ibmr);
         unsigned int page_size = mr_page_size(mr);
  
+       if (!IS_ALIGNED(page_size, PAGE_SIZE)) {
+               rxe_err_mr(mr, "FIXME...\n");
+               return -EINVAL;
+       }
+
         mr->nbuf = 0;
         mr->page_shift = ilog2(page_size);
         mr->page_mask = ~((u64)page_size - 1);
diff --git a/drivers/infiniband/sw/rxe/rxe_param.h b/drivers/infiniband/sw/rxe/rxe_param.h
index d2f57ead78ad..b1cf1e1c0ce1 100644
--- a/drivers/infiniband/sw/rxe/rxe_param.h
+++ b/drivers/infiniband/sw/rxe/rxe_param.h
@@ -38,7 +38,7 @@ static inline enum ib_mtu eth_mtu_int_to_enum(int mtu)
  /* default/initial rxe device parameter settings */
  enum rxe_device_param {
         RXE_MAX_MR_SIZE                 = -1ull,
-       RXE_PAGE_SIZE_CAP               = 0xfffff000,
+       RXE_PAGE_SIZE_CAP               = 0xffffffff - (PAGE_SIZE - 1),
         RXE_MAX_QP_WR                   = DEFAULT_MAX_VALUE,
         RXE_DEVICE_CAP_FLAGS            = IB_DEVICE_BAD_PKEY_CNTR
                                         | IB_DEVICE_BAD_QKEY_CNTR


* minor cleanup will be done after this.
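
The two changes in the diff above can be modelled in userspace as follows.
This is only a sketch under assumptions: SIM_IS_ALIGNED(), page_size_cap()
and cap_allows() are hypothetical stand-ins, and it assumes (as I understand
it) that page_size_cap is a bitmap in which bit n set means a 2^n page size
is supported.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace equivalent of the kernel's IS_ALIGNED() for a power-of-two a. */
#define SIM_IS_ALIGNED(x, a) (((x) & ((a) - 1)) == 0)

/*
 * Same arithmetic as 0xffffffff - (PAGE_SIZE - 1): one bit set for every
 * power-of-two size from base_page_size up to 2^31.
 */
static uint32_t page_size_cap(uint32_t base_page_size)
{
	return 0xffffffffu - (base_page_size - 1);
}

/* True if mr_page_size is a single power of two advertised in cap. */
static bool cap_allows(uint32_t cap, uint32_t mr_page_size)
{
	return mr_page_size != 0 &&
	       (mr_page_size & (mr_page_size - 1)) == 0 &&
	       (cap & mr_page_size) != 0;
}
```

With a 4K base page the cap is the old 0xfffff000. With a 64K base page it
becomes 0xffff0000, so a 4K MR page size is neither advertised nor aligned,
which matches the note that ULPs registering 4K MRs are not supported there.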

Thanks
Zhijian

> Patch3-5: cleanups # newly add
> Patch6: make RXE support PAGE_SIZE aligned mr # newly add, but not fully tested
> 
> My bad arm64 mechine offten hangs when doing blktests even though i use the
> default siw driver.
> 
> - nvme and ULPs(rtrs, iser) always registers 4K mr still don't supported yet.
> 
> [1] https://lore.kernel.org/all/CAHj4cs9XRqE25jyVw9rj9YugffLn5+f=1znaBEnu1usLOciD+g@mail.gmail.com/T/
> 
> Li Zhijian (6):
>    RDMA/rxe: RDMA/rxe: don't allow registering !PAGE_SIZE mr
>    RDMA/rxe: set RXE_PAGE_SIZE_CAP to PAGE_SIZE
>    RDMA/rxe: remove unused rxe_mr.page_shift
>    RDMA/rxe: Use PAGE_SIZE and PAGE_SHIFT to extract address from
>      page_list
>    RDMA/rxe: cleanup rxe_mr.{page_size,page_shift}
>    RDMA/rxe: Support PAGE_SIZE aligned MR
> 
>   drivers/infiniband/sw/rxe/rxe_mr.c    | 80 ++++++++++++++++-----------
>   drivers/infiniband/sw/rxe/rxe_param.h |  2 +-
>   drivers/infiniband/sw/rxe/rxe_verbs.h |  9 ---
>   3 files changed, 48 insertions(+), 43 deletions(-)
> 

* Re: [PATCH RFC V2 0/6] rxe_map_mr_sg() fix cleanup and refactor
  2023-11-06  7:59 ` Zhijian Li (Fujitsu)
@ 2023-11-06  9:35   ` Greg Sword
  2023-11-06  9:55     ` Zhijian Li (Fujitsu)
  0 siblings, 1 reply; 25+ messages in thread
From: Greg Sword @ 2023-11-06  9:35 UTC (permalink / raw)
  To: Zhijian Li (Fujitsu)
  Cc: zyjzyj2000@gmail.com, jgg@ziepe.ca, leon@kernel.org,
	linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
	rpearsonhpe@gmail.com, Daisuke Matsuda (Fujitsu),
	bvanassche@acm.org, yi.zhang@redhat.com

On Mon, Nov 6, 2023 at 4:01 PM Zhijian Li (Fujitsu)
<lizhijian@fujitsu.com> wrote:
>
>
>
> Very thanks for all your feedback.
>
> On 03/11/2023 17:55, Li Zhijian wrote:
> > I don't collect the Reviewed-by to the patch1-2 this time, since i
> > think we can make it better.
> >
> > Patch1-2: Fix kernel panic[1] and benifit to make srp work again.
> >            Almost nothing change from V1.
>
> Quote from Jason:
> "
> > The concept was that the xarray could store anything larger than
> > PAGE_SIZE and the entry would point at the first struct page of the
> > contiguous chunk
> >
> > That looks like it is right, or at least close to right, so lets try
> > to keep it
> "
>
>
> It seems it's okay to access address/memory across pages on RXE even though
> we only map the first page.

Do you really run tests in your test environment? Do you have a test environment?
Can you really reproduce this problem in your test environment?
Your patches do not actually work. Please do not send these rubbish patches out.

>
> That also means PAGE_SIZE aligned MR is already supported, so only check
> `if (IS_ALIGNED(page_size, PAGE_SIZE))` is sufficient, right?
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
> index f54042e9aeb2..3755e530e6dc 100644
> --- a/drivers/infiniband/sw/rxe/rxe_mr.c
> +++ b/drivers/infiniband/sw/rxe/rxe_mr.c
> @@ -234,6 +234,12 @@ int rxe_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sgl,
>          struct rxe_mr *mr = to_rmr(ibmr);
>          unsigned int page_size = mr_page_size(mr);
>
> +       if (!IS_ALIGNED(page_size, PAGE_SIZE)) {
> +               rxe_err_mr(mr, "FIXME...\n")
> +               return -EINVAL;
> +       }
> +
>          mr->nbuf = 0;
>          mr->page_shift = ilog2(page_size);
>          mr->page_mask = ~((u64)page_size - 1);
> diff --git a/drivers/infiniband/sw/rxe/rxe_param.h b/drivers/infiniband/sw/rxe/rxe_param.h
> index d2f57ead78ad..b1cf1e1c0ce1 100644
> --- a/drivers/infiniband/sw/rxe/rxe_param.h
> +++ b/drivers/infiniband/sw/rxe/rxe_param.h
> @@ -38,7 +38,7 @@ static inline enum ib_mtu eth_mtu_int_to_enum(int mtu)
>   /* default/initial rxe device parameter settings */
>   enum rxe_device_param {
>          RXE_MAX_MR_SIZE                 = -1ull,
> -       RXE_PAGE_SIZE_CAP               = 0xfffff000,
> +       RXE_PAGE_SIZE_CAP               = 0xffffffff - (PAGE_SIZE - 1),
>          RXE_MAX_QP_WR                   = DEFAULT_MAX_VALUE,
>          RXE_DEVICE_CAP_FLAGS            = IB_DEVICE_BAD_PKEY_CNTR
>                                          | IB_DEVICE_BAD_QKEY_CNTR
>
>
> * minor cleanup will be done after this.
>
> Thanks
> Zhijian
>
> > Patch3-5: cleanups # newly add
> > Patch6: make RXE support PAGE_SIZE aligned mr # newly add, but not fully tested
> >
> > My bad arm64 mechine offten hangs when doing blktests even though i use the
> > default siw driver.
> >
> > - nvme and ULPs(rtrs, iser) always registers 4K mr still don't supported yet.
> >
> > [1] https://lore.kernel.org/all/CAHj4cs9XRqE25jyVw9rj9YugffLn5+f=1znaBEnu1usLOciD+g@mail.gmail.com/T/
> >
> > Li Zhijian (6):
> >    RDMA/rxe: RDMA/rxe: don't allow registering !PAGE_SIZE mr
> >    RDMA/rxe: set RXE_PAGE_SIZE_CAP to PAGE_SIZE
> >    RDMA/rxe: remove unused rxe_mr.page_shift
> >    RDMA/rxe: Use PAGE_SIZE and PAGE_SHIFT to extract address from
> >      page_list
> >    RDMA/rxe: cleanup rxe_mr.{page_size,page_shift}
> >    RDMA/rxe: Support PAGE_SIZE aligned MR
> >
> >   drivers/infiniband/sw/rxe/rxe_mr.c    | 80 ++++++++++++++++-----------
> >   drivers/infiniband/sw/rxe/rxe_param.h |  2 +-
> >   drivers/infiniband/sw/rxe/rxe_verbs.h |  9 ---
> >   3 files changed, 48 insertions(+), 43 deletions(-)
> >

* Re: [PATCH RFC V2 0/6] rxe_map_mr_sg() fix cleanup and refactor
  2023-11-06  9:35   ` Greg Sword
@ 2023-11-06  9:55     ` Zhijian Li (Fujitsu)
  0 siblings, 0 replies; 25+ messages in thread
From: Zhijian Li (Fujitsu) @ 2023-11-06  9:55 UTC (permalink / raw)
  To: Greg Sword
  Cc: zyjzyj2000@gmail.com, jgg@ziepe.ca, leon@kernel.org,
	linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
	rpearsonhpe@gmail.com, Daisuke Matsuda (Fujitsu),
	bvanassche@acm.org, yi.zhang@redhat.com



On 06/11/2023 17:35, Greg Sword wrote:
> On Mon, Nov 6, 2023 at 4:01 PM Zhijian Li (Fujitsu)
> <lizhijian@fujitsu.com> wrote:
>>
>>
>>
>> Very thanks for all your feedback.
>>
>> On 03/11/2023 17:55, Li Zhijian wrote:
>>> I don't collect the Reviewed-by to the patch1-2 this time, since i
>>> think we can make it better.
>>>
>>> Patch1-2: Fix kernel panic[1] and benifit to make srp work again.
>>>             Almost nothing change from V1.
>>
>> Quote from Jason:
>> "
>>> The concept was that the xarray could store anything larger than
>>> PAGE_SIZE and the entry would point at the first struct page of the
>>> contiguous chunk
>>>
>>> That looks like it is right, or at least close to right, so lets try
>>> to keep it
>> "
>>
>>
>> It seems it's okay to access address/memory across pages on RXE even though
>> we only map the first page.
> 
> Do you really make tests in your test environment? Do you have test environment?



> Do you really reproduce this problem in your test environment?
I did run the test; the kernel panic[1] is gone after patches 1-2.


Thanks
Zhijian


> Your patches do not work actually. Please do not send these rubbish patches out.
> 
>>
>> That also means PAGE_SIZE aligned MR is already supported, so only check
>> `if (IS_ALIGNED(page_size, PAGE_SIZE))` is sufficient, right?
>>
>> diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
>> index f54042e9aeb2..3755e530e6dc 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_mr.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_mr.c
>> @@ -234,6 +234,12 @@ int rxe_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sgl,
>>           struct rxe_mr *mr = to_rmr(ibmr);
>>           unsigned int page_size = mr_page_size(mr);
>>
>> +       if (!IS_ALIGNED(page_size, PAGE_SIZE)) {
>> +               rxe_err_mr(mr, "FIXME...\n")
>> +               return -EINVAL;
>> +       }
>> +
>>           mr->nbuf = 0;
>>           mr->page_shift = ilog2(page_size);
>>           mr->page_mask = ~((u64)page_size - 1);
>> diff --git a/drivers/infiniband/sw/rxe/rxe_param.h b/drivers/infiniband/sw/rxe/rxe_param.h
>> index d2f57ead78ad..b1cf1e1c0ce1 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_param.h
>> +++ b/drivers/infiniband/sw/rxe/rxe_param.h
>> @@ -38,7 +38,7 @@ static inline enum ib_mtu eth_mtu_int_to_enum(int mtu)
>>    /* default/initial rxe device parameter settings */
>>    enum rxe_device_param {
>>           RXE_MAX_MR_SIZE                 = -1ull,
>> -       RXE_PAGE_SIZE_CAP               = 0xfffff000,
>> +       RXE_PAGE_SIZE_CAP               = 0xffffffff - (PAGE_SIZE - 1),
>>           RXE_MAX_QP_WR                   = DEFAULT_MAX_VALUE,
>>           RXE_DEVICE_CAP_FLAGS            = IB_DEVICE_BAD_PKEY_CNTR
>>                                           | IB_DEVICE_BAD_QKEY_CNTR
>>
>>
>> * minor cleanup will be done after this.
>>
>> Thanks
>> Zhijian
>>
>>> Patch3-5: cleanups # newly add
>>> Patch6: make RXE support PAGE_SIZE aligned mr # newly add, but not fully tested
>>>
>>> My bad arm64 mechine offten hangs when doing blktests even though i use the
>>> default siw driver.
>>>
>>> - nvme and ULPs(rtrs, iser) always registers 4K mr still don't supported yet.
>>>
>>> [1] https://lore.kernel.org/all/CAHj4cs9XRqE25jyVw9rj9YugffLn5+f=1znaBEnu1usLOciD+g@mail.gmail.com/T/
>>>
>>> Li Zhijian (6):
>>>     RDMA/rxe: RDMA/rxe: don't allow registering !PAGE_SIZE mr
>>>     RDMA/rxe: set RXE_PAGE_SIZE_CAP to PAGE_SIZE
>>>     RDMA/rxe: remove unused rxe_mr.page_shift
>>>     RDMA/rxe: Use PAGE_SIZE and PAGE_SHIFT to extract address from
>>>       page_list
>>>     RDMA/rxe: cleanup rxe_mr.{page_size,page_shift}
>>>     RDMA/rxe: Support PAGE_SIZE aligned MR
>>>
>>>    drivers/infiniband/sw/rxe/rxe_mr.c    | 80 ++++++++++++++++-----------
>>>    drivers/infiniband/sw/rxe/rxe_param.h |  2 +-
>>>    drivers/infiniband/sw/rxe/rxe_verbs.h |  9 ---
>>>    3 files changed, 48 insertions(+), 43 deletions(-)
>>>

* Re: [PATCH RFC V2 0/6] rxe_map_mr_sg() fix cleanup and refactor
  2023-11-06  4:07   ` Zhijian Li (Fujitsu)
@ 2023-11-06 13:58     ` Zhu Yanjun
  2023-11-09  2:24       ` Zhijian Li (Fujitsu)
  2023-11-06 14:13     ` Jason Gunthorpe
  1 sibling, 1 reply; 25+ messages in thread
From: Zhu Yanjun @ 2023-11-06 13:58 UTC (permalink / raw)
  To: Zhijian Li (Fujitsu), zyjzyj2000@gmail.com, jgg@ziepe.ca,
	leon@kernel.org, linux-rdma@vger.kernel.org
  Cc: linux-kernel@vger.kernel.org, rpearsonhpe@gmail.com,
	Daisuke Matsuda (Fujitsu), bvanassche@acm.org,
	yi.zhang@redhat.com

On 2023/11/6 12:07, Zhijian Li (Fujitsu) wrote:
> 
> 
> On 03/11/2023 21:00, Zhu Yanjun wrote:
>> On 2023/11/3 17:55, Li Zhijian wrote:
>>> I don't collect the Reviewed-by to the patch1-2 this time, since i
>>> think we can make it better.
>>>
>>> Patch1-2: Fix kernel panic[1] and benifit to make srp work again.
>>>             Almost nothing change from V1.
>>> Patch3-5: cleanups # newly add
>>> Patch6: make RXE support PAGE_SIZE aligned mr # newly add, but not fully tested
>>>
>>> My bad arm64 mechine offten hangs when doing blktests even though i use the
>>> default siw driver.
>>>
>>> - nvme and ULPs(rtrs, iser) always registers 4K mr still don't supported yet.
>>
>> Zhijian
>>
>> Please read carefully the whole discussion about this problem. You will find a lot of valuable suggestions, especially suggestions from Jason.
> 
> Okay, i will read it again. If you can tell me which thread, that would be better.
> 
> 
>>
>>   From the whole discussion, it seems that the root cause is very clear.
>> We need to fix this prolem. Please do not send this kind of commits again.
>>
> 
> Let's think about what's our goal first.
> 
> - 1) Fix the panic[1] and only support PAGE_SIZE MR
> - 2) support PAGE_SIZE aligned MR
> - 3) support any page_size MR.
> 
> I'm sorry i'm not familiar with the linux MM subsystem. It seem it's safe/correct to access
> address/memory across pages start from the return of kmap_loca_page(page).
> In other words, 2) is already native supported, right?

Yes. Please read the comments from Jason, Leon and Bart. They shared a
lot of good advice. From them, we can learn the root cause and how to fix
this problem.

Good Luck.

Zhu Yanjun

> 
> I get totally confused now.
> 
> 
> 
>> Zhu Yanjun
>>
>>>
>>> [1] https://lore.kernel.org/all/CAHj4cs9XRqE25jyVw9rj9YugffLn5+f=1znaBEnu1usLOciD+g@mail.gmail.com/T/
>>>
>>> Li Zhijian (6):
>>>     RDMA/rxe: RDMA/rxe: don't allow registering !PAGE_SIZE mr
>>>     RDMA/rxe: set RXE_PAGE_SIZE_CAP to PAGE_SIZE
>>>     RDMA/rxe: remove unused rxe_mr.page_shift
>>>     RDMA/rxe: Use PAGE_SIZE and PAGE_SHIFT to extract address from
>>>       page_list
>>>     RDMA/rxe: cleanup rxe_mr.{page_size,page_shift}
>>>     RDMA/rxe: Support PAGE_SIZE aligned MR
>>>
>>>    drivers/infiniband/sw/rxe/rxe_mr.c    | 80 ++++++++++++++++-----------
>>>    drivers/infiniband/sw/rxe/rxe_param.h |  2 +-
>>>    drivers/infiniband/sw/rxe/rxe_verbs.h |  9 ---
>>>    3 files changed, 48 insertions(+), 43 deletions(-)
>>>


* Re: [PATCH RFC V2 0/6] rxe_map_mr_sg() fix cleanup and refactor
  2023-11-06  4:07   ` Zhijian Li (Fujitsu)
  2023-11-06 13:58     ` Zhu Yanjun
@ 2023-11-06 14:13     ` Jason Gunthorpe
  1 sibling, 0 replies; 25+ messages in thread
From: Jason Gunthorpe @ 2023-11-06 14:13 UTC (permalink / raw)
  To: Zhijian Li (Fujitsu)
  Cc: Zhu Yanjun, zyjzyj2000@gmail.com, leon@kernel.org,
	linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
	rpearsonhpe@gmail.com, Daisuke Matsuda (Fujitsu),
	bvanassche@acm.org, yi.zhang@redhat.com

On Mon, Nov 06, 2023 at 04:07:19AM +0000, Zhijian Li (Fujitsu) wrote:

> I'm sorry i'm not familiar with the linux MM subsystem. It seem it's safe/correct to access
> address/memory across pages start from the return of
> kmap_loca_page(page).

kmap_local_page() gives you a PAGE_SIZE window only

Jason

* Re: [PATCH RFC V2 0/6] rxe_map_mr_sg() fix cleanup and refactor
  2023-11-06 13:58     ` Zhu Yanjun
@ 2023-11-09  2:24       ` Zhijian Li (Fujitsu)
  2023-11-09  6:36         ` Zhu Yanjun
  2023-11-09  7:16         ` Greg Sword
  0 siblings, 2 replies; 25+ messages in thread
From: Zhijian Li (Fujitsu) @ 2023-11-09  2:24 UTC (permalink / raw)
  To: Zhu Yanjun, zyjzyj2000@gmail.com, jgg@ziepe.ca, leon@kernel.org,
	linux-rdma@vger.kernel.org
  Cc: linux-kernel@vger.kernel.org, rpearsonhpe@gmail.com,
	Daisuke Matsuda (Fujitsu), bvanassche@acm.org,
	yi.zhang@redhat.com



On 06/11/2023 21:58, Zhu Yanjun wrote:
> On 2023/11/6 12:07, Zhijian Li (Fujitsu) wrote:
>>
>>
>> On 03/11/2023 21:00, Zhu Yanjun wrote:
>>> On 2023/11/3 17:55, Li Zhijian wrote:
>>>> I don't collect the Reviewed-by to the patch1-2 this time, since i
>>>> think we can make it better.
>>>>
>>>> Patch1-2: Fix kernel panic[1] and benifit to make srp work again.
>>>>             Almost nothing change from V1.
>>>> Patch3-5: cleanups # newly add
>>>> Patch6: make RXE support PAGE_SIZE aligned mr # newly add, but not fully tested
>>>>
>>>> My bad arm64 mechine offten hangs when doing blktests even though i use the
>>>> default siw driver.
>>>>
>>>> - nvme and ULPs(rtrs, iser) always registers 4K mr still don't supported yet.
>>>
>>> Zhijian
>>>
>>> Please read carefully the whole discussion about this problem. You will find a lot of valuable suggestions, especially suggestions from Jason.
>>
>> Okay, i will read it again. If you can tell me which thread, that would be better.
>>
>>
>>>
>>>   From the whole discussion, it seems that the root cause is very clear.
>>> We need to fix this prolem. Please do not send this kind of commits again.
>>>
>>
>> Let's think about what's our goal first.
>>
>> - 1) Fix the panic[1] and only support PAGE_SIZE MR
>> - 2) support PAGE_SIZE aligned MR
>> - 3) support any page_size MR.
>>
>> I'm sorry i'm not familiar with the linux MM subsystem. It seem it's safe/correct to access
>> address/memory across pages start from the return of kmap_loca_page(page).
>> In other words, 2) is already native supported, right?
> 
> Yes. Please read the comments from Jason, Leon and Bart. They shared a lot of good advice. 

I read the whole discussion again, but I believe I still missed a lot.


> From them, we can know the root cause and how to fix this problem.

I don't think I misunderstood the root cause:
RXE splits memory into PAGE_SIZE units in the xarray. As a result, when we extract an address from the xarray,
we should not access addresses beyond a PAGE_SIZE window.
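
As a rough illustration of that arithmetic, here is a userspace sketch. It is
not the driver's code: struct page_slot, locate() and bytes_left_in_window()
are hypothetical names, and SIM_PAGE_SHIFT stands in for PAGE_SHIFT. It shows
how an iova would map to one PAGE_SIZE entry plus an offset, and how many
bytes remain before the end of that entry's window.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_PAGE_SHIFT 12u			/* stand-in for PAGE_SHIFT */
#define SIM_PAGE_SIZE  (1u << SIM_PAGE_SHIFT)	/* stand-in for PAGE_SIZE */

struct page_slot {
	size_t index;	/* which PAGE_SIZE entry holds the byte */
	size_t offset;	/* byte offset inside that entry */
};

/* Map an iova inside an MR that starts at base_iova to (entry, offset). */
static struct page_slot locate(uint64_t base_iova, uint64_t iova)
{
	struct page_slot slot;

	assert(iova >= base_iova);
	slot.index = (size_t)((iova >> SIM_PAGE_SHIFT) -
			      (base_iova >> SIM_PAGE_SHIFT));
	slot.offset = (size_t)(iova & (SIM_PAGE_SIZE - 1));
	return slot;
}

/* Bytes that can be touched before leaving this entry's window. */
static size_t bytes_left_in_window(struct page_slot slot)
{
	return SIM_PAGE_SIZE - slot.offset;
}
```

For an MR based at 0x10000, iova 0x11ff8 lands in entry 1 at offset 0xff8,
leaving 8 bytes in the window; anything longer has to continue from entry 2.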

If I understand correctly, how should we fix it?
- I'm not going to do the "removing page_size set" change; it's out of this patch set's scope.
   Feel free to do that cleanup separately.
- I'm not going to fix, in this patch set, the NVMe/rtrs etc. problems that appear when 64K pages are enabled.
   But RXE will tell its callers explicitly "RXE doesn't support such a page_size".
- I didn't state that RXE supports PAGE_SIZE aligned page_size MRs before refactoring rxe_map_mr_sg(),
   because I was worried that it was not correct to access addresses beyond the PAGE_SIZE window.

What should I do next?
Just state "RXE supports PAGE_SIZE aligned MR"? Then the patches become:
RDMA/rxe: RDMA/rxe: don't allow registering !PAGE_SIZE aligned MR
RDMA/rxe: set RXE_PAGE_SIZE_CAP to starting from PAGE_SIZE

Or just keep what we have done in V1?

Thanks


> 
> Good Luck.
> 
> Zhu Yanjun
> 
>>
>> I get totally confused now.
>>
>>
>>
>>> Zhu Yanjun
>>>
>>>>
>>>> [1] https://lore.kernel.org/all/CAHj4cs9XRqE25jyVw9rj9YugffLn5+f=1znaBEnu1usLOciD+g@mail.gmail.com/T/
>>>>
>>>> Li Zhijian (6):
>>>>     RDMA/rxe: RDMA/rxe: don't allow registering !PAGE_SIZE mr
>>>>     RDMA/rxe: set RXE_PAGE_SIZE_CAP to PAGE_SIZE
>>>>     RDMA/rxe: remove unused rxe_mr.page_shift
>>>>     RDMA/rxe: Use PAGE_SIZE and PAGE_SHIFT to extract address from
>>>>       page_list
>>>>     RDMA/rxe: cleanup rxe_mr.{page_size,page_shift}
>>>>     RDMA/rxe: Support PAGE_SIZE aligned MR
>>>>
>>>>    drivers/infiniband/sw/rxe/rxe_mr.c    | 80 ++++++++++++++++-----------
>>>>    drivers/infiniband/sw/rxe/rxe_param.h |  2 +-
>>>>    drivers/infiniband/sw/rxe/rxe_verbs.h |  9 ---
>>>>    3 files changed, 48 insertions(+), 43 deletions(-)
>>>>
> 

* Re: [PATCH RFC V2 0/6] rxe_map_mr_sg() fix cleanup and refactor
  2023-11-09  2:24       ` Zhijian Li (Fujitsu)
@ 2023-11-09  6:36         ` Zhu Yanjun
  2023-11-09  7:16         ` Greg Sword
  1 sibling, 0 replies; 25+ messages in thread
From: Zhu Yanjun @ 2023-11-09  6:36 UTC (permalink / raw)
  To: Zhijian Li (Fujitsu), zyjzyj2000@gmail.com, jgg@ziepe.ca,
	leon@kernel.org, linux-rdma@vger.kernel.org
  Cc: linux-kernel@vger.kernel.org, rpearsonhpe@gmail.com,
	Daisuke Matsuda (Fujitsu), bvanassche@acm.org,
	yi.zhang@redhat.com


On 2023/11/9 10:24, Zhijian Li (Fujitsu) wrote:
>
> On 06/11/2023 21:58, Zhu Yanjun wrote:
>> On 2023/11/6 12:07, Zhijian Li (Fujitsu) wrote:
>>>
>>> On 03/11/2023 21:00, Zhu Yanjun wrote:
>>>> On 2023/11/3 17:55, Li Zhijian wrote:
>>>>> I don't collect the Reviewed-by to the patch1-2 this time, since i
>>>>> think we can make it better.
>>>>>
>>>>> Patch1-2: Fix kernel panic[1] and benifit to make srp work again.
>>>>>              Almost nothing change from V1.
>>>>> Patch3-5: cleanups # newly add
>>>>> Patch6: make RXE support PAGE_SIZE aligned mr # newly add, but not fully tested
>>>>>
>>>>> My bad arm64 mechine offten hangs when doing blktests even though i use the
>>>>> default siw driver.
>>>>>
>>>>> - nvme and ULPs(rtrs, iser) always registers 4K mr still don't supported yet.
>>>> Zhijian
>>>>
>>>> Please read carefully the whole discussion about this problem. You will find a lot of valuable suggestions, especially suggestions from Jason.
>>> Okay, i will read it again. If you can tell me which thread, that would be better.
>>>
>>>
>>>>    From the whole discussion, it seems that the root cause is very clear.
>>>> We need to fix this prolem. Please do not send this kind of commits again.
>>>>
>>> Let's think about what's our goal first.
>>>
>>> - 1) Fix the panic[1] and only support PAGE_SIZE MR
>>> - 2) support PAGE_SIZE aligned MR
>>> - 3) support any page_size MR.
>>>
>>> I'm sorry i'm not familiar with the linux MM subsystem. It seem it's safe/correct to access
>>> address/memory across pages start from the return of kmap_loca_page(page).
>>> In other words, 2) is already native supported, right?
>> Yes. Please read the comments from Jason, Leon and Bart. They shared a lot of good advice.
> I read the whole discussion again, but I believed i still missed a lost.
>
>
>>  From them, we can know the root cause and how to fix this problem.
> I don't think i misunderstood the root cause:
> RXE splits memory into PAGE_SIZE units in the xarray. As a result, when we extract an address from the xarray,
> we should not access address beyond a PAGE_SIZE window.

This is a complicated problem that is deeply tied to memory
management.

Someone who is very familiar with Linux MM should provide a better solution
to this problem.

I expect a complete solution to this problem.

Zhu Yanjun

>
> IIUC, then how to fix it?
> - I'm not going to "removing page_size set", it's out of this patch scope.
>     Feel free to do the cleanup separately.
> - I'm not going to fix the NVMe/rtrs etc problems in this patch set when 64K page is enabled.
>     But RXE will tell its callers explicitly "RXE don't don't support such page_size"
> - I didn't state RXE supports PAGE_SIZE aligned page_size MR before refactoring rxe_map_mr_sg(),
>     because I worry about it was not correct to access address beyond the PAGE_SIZE window.
>
> What I should do next?
> Just state "RXE support PAGE_SIZE aligned MR" ? Then patches become
> RDMA/rxe: RDMA/rxe: don't allow registering !PAGE_SIZE aligned MR
> RDMA/rxe: set RXE_PAGE_SIZE_CAP to starting from PAGE_SIZE
>
> Or just keep we have done in the V1
>
> Thanks
>
>
>> Good Luck.
>>
>> Zhu Yanjun
>>
>>> I get totally confused now.
>>>
>>>
>>>
>>>> Zhu Yanjun
>>>>
>>>>> [1] https://lore.kernel.org/all/CAHj4cs9XRqE25jyVw9rj9YugffLn5+f=1znaBEnu1usLOciD+g@mail.gmail.com/T/
>>>>>
>>>>> Li Zhijian (6):
>>>>>      RDMA/rxe: RDMA/rxe: don't allow registering !PAGE_SIZE mr
>>>>>      RDMA/rxe: set RXE_PAGE_SIZE_CAP to PAGE_SIZE
>>>>>      RDMA/rxe: remove unused rxe_mr.page_shift
>>>>>      RDMA/rxe: Use PAGE_SIZE and PAGE_SHIFT to extract address from
>>>>>        page_list
>>>>>      RDMA/rxe: cleanup rxe_mr.{page_size,page_shift}
>>>>>      RDMA/rxe: Support PAGE_SIZE aligned MR
>>>>>
>>>>>     drivers/infiniband/sw/rxe/rxe_mr.c    | 80 ++++++++++++++++-----------
>>>>>     drivers/infiniband/sw/rxe/rxe_param.h |  2 +-
>>>>>     drivers/infiniband/sw/rxe/rxe_verbs.h |  9 ---
>>>>>     3 files changed, 48 insertions(+), 43 deletions(-)
>>>>>

* Re: [PATCH RFC V2 0/6] rxe_map_mr_sg() fix cleanup and refactor
  2023-11-09  2:24       ` Zhijian Li (Fujitsu)
  2023-11-09  6:36         ` Zhu Yanjun
@ 2023-11-09  7:16         ` Greg Sword
  2023-11-09  7:26           ` Zhijian Li (Fujitsu)
  2023-11-09 13:10           ` Jason Gunthorpe
  1 sibling, 2 replies; 25+ messages in thread
From: Greg Sword @ 2023-11-09  7:16 UTC (permalink / raw)
  To: Zhijian Li (Fujitsu)
  Cc: Zhu Yanjun, zyjzyj2000@gmail.com, jgg@ziepe.ca, leon@kernel.org,
	linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
	rpearsonhpe@gmail.com, Daisuke Matsuda (Fujitsu),
	bvanassche@acm.org, yi.zhang@redhat.com

On Thu, Nov 9, 2023 at 10:25 AM Zhijian Li (Fujitsu)
<lizhijian@fujitsu.com> wrote:
>
>
>
> On 06/11/2023 21:58, Zhu Yanjun wrote:
> > On 2023/11/6 12:07, Zhijian Li (Fujitsu) wrote:
> >>
> >>
> >> On 03/11/2023 21:00, Zhu Yanjun wrote:
> >>> On 2023/11/3 17:55, Li Zhijian wrote:
> >>>> I don't collect the Reviewed-by to the patch1-2 this time, since i
> >>>> think we can make it better.
> >>>>
> >>>> Patch1-2: Fix kernel panic[1] and benifit to make srp work again.
> >>>>             Almost nothing change from V1.
> >>>> Patch3-5: cleanups # newly add
> >>>> Patch6: make RXE support PAGE_SIZE aligned mr # newly add, but not fully tested
> >>>>
> >>>> My bad arm64 mechine offten hangs when doing blktests even though i use the
> >>>> default siw driver.
> >>>>
> >>>> - nvme and ULPs(rtrs, iser) always registers 4K mr still don't supported yet.
> >>>
> >>> Zhijian
> >>>
> >>> Please read carefully the whole discussion about this problem. You will find a lot of valuable suggestions, especially suggestions from Jason.
> >>
> >> Okay, i will read it again. If you can tell me which thread, that would be better.
> >>
> >>
> >>>
> >>>   From the whole discussion, it seems that the root cause is very clear.
> >>> We need to fix this prolem. Please do not send this kind of commits again.
> >>>
> >>
> >> Let's think about what's our goal first.
> >>
> >> - 1) Fix the panic[1] and only support PAGE_SIZE MR
> >> - 2) support PAGE_SIZE aligned MR
> >> - 3) support any page_size MR.
> >>
> >> I'm sorry i'm not familiar with the linux MM subsystem. It seem it's safe/correct to access
> >> address/memory across pages start from the return of kmap_loca_page(page).
> >> In other words, 2) is already native supported, right?
> >
> > Yes. Please read the comments from Jason, Leon and Bart. They shared a lot of good advice.
>
> I read the whole discussion again, but I believed i still missed a lost.
>
>
> > From them, we can know the root cause and how to fix this problem.
>
> I don't think I misunderstood the root cause:
> RXE splits memory into PAGE_SIZE units in the xarray. As a result, when we extract an address from the xarray,
> we should not access addresses beyond a PAGE_SIZE window.
>
> IIUC, then how to fix it?
> - I'm not going to do the "removing page_size set" change; it's out of the scope of this patch set.
>    Feel free to do the cleanup separately.
> - I'm not going to fix the NVMe/rtrs etc. problems in this patch set when 64K pages are enabled.
>    But RXE will tell its callers explicitly "RXE doesn't support such page_size"
> - I didn't state that RXE supports PAGE_SIZE aligned page_size MR before refactoring rxe_map_mr_sg(),
>    because I was worried that it was not correct to access addresses beyond the PAGE_SIZE window.
>
> What should I do next?
> Just state "RXE supports PAGE_SIZE aligned MR"? Then the patches become
> RDMA/rxe: RDMA/rxe: don't allow registering !PAGE_SIZE aligned MR
> RDMA/rxe: set RXE_PAGE_SIZE_CAP to starting from PAGE_SIZE
>

What do you take the rdma mailing list for? Your bugzilla, jira? Or your dev
program launch? Or your playground?

> Or just keep what we have done in V1?
>
> Thanks
>
>
> >
> > Good Luck.
> >
> > Zhu Yanjun
> >
> >>
> >> I am totally confused now.
> >>
> >>
> >>
> >>> Zhu Yanjun
> >>>
> >>>>
> >>>> [1] https://lore.kernel.org/all/CAHj4cs9XRqE25jyVw9rj9YugffLn5+f=1znaBEnu1usLOciD+g@mail.gmail.com/T/
> >>>>
> >>>> Li Zhijian (6):
> >>>>     RDMA/rxe: RDMA/rxe: don't allow registering !PAGE_SIZE mr
> >>>>     RDMA/rxe: set RXE_PAGE_SIZE_CAP to PAGE_SIZE
> >>>>     RDMA/rxe: remove unused rxe_mr.page_shift
> >>>>     RDMA/rxe: Use PAGE_SIZE and PAGE_SHIFT to extract address from
> >>>>       page_list
> >>>>     RDMA/rxe: cleanup rxe_mr.{page_size,page_shift}
> >>>>     RDMA/rxe: Support PAGE_SIZE aligned MR
> >>>>
> >>>>    drivers/infiniband/sw/rxe/rxe_mr.c    | 80 ++++++++++++++++-----------
> >>>>    drivers/infiniband/sw/rxe/rxe_param.h |  2 +-
> >>>>    drivers/infiniband/sw/rxe/rxe_verbs.h |  9 ---
> >>>>    3 files changed, 48 insertions(+), 43 deletions(-)
> >>>>
> >


* Re: [PATCH RFC V2 0/6] rxe_map_mr_sg() fix cleanup and refactor
  2023-11-09  7:16         ` Greg Sword
@ 2023-11-09  7:26           ` Zhijian Li (Fujitsu)
  2023-11-09 13:10           ` Jason Gunthorpe
  1 sibling, 0 replies; 25+ messages in thread
From: Zhijian Li (Fujitsu) @ 2023-11-09  7:26 UTC (permalink / raw)
  To: Greg Sword
  Cc: Zhu Yanjun, zyjzyj2000@gmail.com, jgg@ziepe.ca, leon@kernel.org,
	linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
	rpearsonhpe@gmail.com, Daisuke Matsuda (Fujitsu),
	bvanassche@acm.org, yi.zhang@redhat.com



On 09/11/2023 15:16, Greg Sword wrote:
> On Thu, Nov 9, 2023 at 10:25 AM Zhijian Li (Fujitsu)
> <lizhijian@fujitsu.com> wrote:
>>
>>
>>
>> On 06/11/2023 21:58, Zhu Yanjun wrote:
>>> On 2023/11/6 12:07, Zhijian Li (Fujitsu) wrote:
>>>>
>>>>
>>>> On 03/11/2023 21:00, Zhu Yanjun wrote:
>>>>> On 2023/11/3 17:55, Li Zhijian wrote:
>>>>>> I didn't collect the Reviewed-by tags for patches 1-2 this time, since I
>>>>>> think we can make them better.
>>>>>>
>>>>>> Patch1-2: Fix the kernel panic[1], which also makes srp work again.
>>>>>>              Almost nothing changed from V1.
>>>>>> Patch3-5: cleanups # newly added
>>>>>> Patch6: make RXE support PAGE_SIZE aligned MR # newly added, but not fully tested
>>>>>>
>>>>>> My bad arm64 machine often hangs when running blktests, even when I use the
>>>>>> default siw driver.
>>>>>>
>>>>>> - NVMe and ULPs (rtrs, iser), which always register 4K MRs, are still not supported.
>>>>>
>>>>> Zhijian
>>>>>
>>>>> Please read carefully the whole discussion about this problem. You will find a lot of valuable suggestions, especially suggestions from Jason.
>>>>
>>>> Okay, I will read it again. It would help if you could tell me which thread.
>>>>
>>>>
>>>>>
>>>>>    From the whole discussion, it seems that the root cause is very clear.
>>>>> We need to fix this problem. Please do not send this kind of commit again.
>>>>>
>>>>
>>>> Let's first think about what our goal is.
>>>>
>>>> - 1) Fix the panic[1] and only support PAGE_SIZE MR
>>>> - 2) Support PAGE_SIZE aligned MR
>>>> - 3) Support any page_size MR.
>>>>
>>>> I'm sorry, I'm not familiar with the Linux MM subsystem. It seems to be safe/correct to access
>>>> memory across pages starting from the address returned by kmap_local_page(page).
>>>> In other words, 2) is already natively supported, right?
>>>
>>> Yes. Please read the comments from Jason, Leon and Bart. They shared a lot of good advice.
>>
>> I read the whole discussion again, but I believe I still missed a lot.
>>
>>
>>>  From them, we can know the root cause and how to fix this problem.
>>
>> I don't think I misunderstood the root cause:
>> RXE splits memory into PAGE_SIZE units in the xarray. As a result, when we extract an address from the xarray,
>> we should not access addresses beyond a PAGE_SIZE window.
>>
>> IIUC, then how to fix it?
>> - I'm not going to do the "removing page_size set" change; it's out of the scope of this patch set.
>>     Feel free to do the cleanup separately.
>> - I'm not going to fix the NVMe/rtrs etc. problems in this patch set when 64K pages are enabled.
>>     But RXE will tell its callers explicitly "RXE doesn't support such page_size"
>> - I didn't state that RXE supports PAGE_SIZE aligned page_size MR before refactoring rxe_map_mr_sg(),
>>     because I was worried that it was not correct to access addresses beyond the PAGE_SIZE window.
>>
>> What should I do next?
>> Just state "RXE supports PAGE_SIZE aligned MR"? Then the patches become
>> RDMA/rxe: RDMA/rxe: don't allow registering !PAGE_SIZE aligned MR
>> RDMA/rxe: set RXE_PAGE_SIZE_CAP to starting from PAGE_SIZE
>>
> 
> What do you take the rdma mailing list for? Your bugzilla, jira? Or your dev
> program launch? Or your playground?

May I know which bug you are concerned about? Honestly, I can never work out what your point is.
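
For readers following the root-cause summary quoted above (memory tracked in
PAGE_SIZE units, so one looked-up address must not be used beyond its
PAGE_SIZE window), here is a minimal userspace sketch of the idea. It is not
the rxe code: TOY_PAGE_SIZE, struct toy_mr and toy_mr_copy_from() are
hypothetical names, and a plain pointer array stands in for the xarray.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: stands in for the kernel's PAGE_SIZE. */
#define TOY_PAGE_SIZE 4096u

/* Hypothetical stand-in for an MR tracked one PAGE_SIZE unit per slot. */
struct toy_mr {
	unsigned char **pages;	/* pages[i] points to TOY_PAGE_SIZE bytes */
	size_t npages;
};

/*
 * Copy len bytes starting at byte offset off within the MR into dst.
 * Each memcpy() stays inside a single page; the page pointer is looked
 * up again whenever the copy crosses a page boundary.
 * Returns 0 on success, -1 if the range does not fit in the MR.
 */
static int toy_mr_copy_from(const struct toy_mr *mr, size_t off,
			    void *dst, size_t len)
{
	unsigned char *out = dst;
	size_t total = mr->npages * (size_t)TOY_PAGE_SIZE;

	if (off > total || len > total - off)
		return -1;

	while (len > 0) {
		size_t index = off / TOY_PAGE_SIZE;	/* which slot */
		size_t page_off = off % TOY_PAGE_SIZE;	/* offset in slot */
		size_t chunk = TOY_PAGE_SIZE - page_off;

		if (chunk > len)
			chunk = len;

		memcpy(out, mr->pages[index] + page_off, chunk);
		out += chunk;
		off += chunk;
		len -= chunk;
	}
	return 0;
}
```

The point of the sketch is only the loop shape: the per-copy length is capped
at the end of the current page, so pages that are not contiguous in memory are
never treated as one flat buffer.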





> 
>> Or just keep what we have done in V1?
>>
>> Thanks
>>
>>
>>>
>>> Good Luck.
>>>
>>> Zhu Yanjun
>>>
>>>>
>>>> I am totally confused now.
>>>>
>>>>
>>>>
>>>>> Zhu Yanjun
>>>>>
>>>>>>
>>>>>> [1] https://lore.kernel.org/all/CAHj4cs9XRqE25jyVw9rj9YugffLn5+f=1znaBEnu1usLOciD+g@mail.gmail.com/T/
>>>>>>
>>>>>> Li Zhijian (6):
>>>>>>      RDMA/rxe: RDMA/rxe: don't allow registering !PAGE_SIZE mr
>>>>>>      RDMA/rxe: set RXE_PAGE_SIZE_CAP to PAGE_SIZE
>>>>>>      RDMA/rxe: remove unused rxe_mr.page_shift
>>>>>>      RDMA/rxe: Use PAGE_SIZE and PAGE_SHIFT to extract address from
>>>>>>        page_list
>>>>>>      RDMA/rxe: cleanup rxe_mr.{page_size,page_shift}
>>>>>>      RDMA/rxe: Support PAGE_SIZE aligned MR
>>>>>>
>>>>>>     drivers/infiniband/sw/rxe/rxe_mr.c    | 80 ++++++++++++++++-----------
>>>>>>     drivers/infiniband/sw/rxe/rxe_param.h |  2 +-
>>>>>>     drivers/infiniband/sw/rxe/rxe_verbs.h |  9 ---
>>>>>>     3 files changed, 48 insertions(+), 43 deletions(-)
>>>>>>
>>>


* Re: [PATCH RFC V2 0/6] rxe_map_mr_sg() fix cleanup and refactor
  2023-11-09  7:16         ` Greg Sword
  2023-11-09  7:26           ` Zhijian Li (Fujitsu)
@ 2023-11-09 13:10           ` Jason Gunthorpe
  1 sibling, 0 replies; 25+ messages in thread
From: Jason Gunthorpe @ 2023-11-09 13:10 UTC (permalink / raw)
  To: Greg Sword
  Cc: Zhijian Li (Fujitsu), Zhu Yanjun, zyjzyj2000@gmail.com,
	leon@kernel.org, linux-rdma@vger.kernel.org,
	linux-kernel@vger.kernel.org, rpearsonhpe@gmail.com,
	Daisuke Matsuda (Fujitsu), bvanassche@acm.org,
	yi.zhang@redhat.com

On Thu, Nov 09, 2023 at 03:16:58PM +0800, Greg Sword wrote:
> > What should I do next?
> > Just state "RXE supports PAGE_SIZE aligned MR"? Then the patches become
> > RDMA/rxe: RDMA/rxe: don't allow registering !PAGE_SIZE aligned MR
> > RDMA/rxe: set RXE_PAGE_SIZE_CAP to starting from PAGE_SIZE
> 
> What do you take the rdma mailing list for? Your bugzilla, jira? Or your dev
> program launch? Or your playground?

We have a code of conduct on these mailing lists, please follow it or
stop posting.

Jason


end of thread, other threads:[~2023-11-09 13:10 UTC | newest]

Thread overview: 25+ messages
2023-11-03  9:55 [PATCH RFC V2 0/6] rxe_map_mr_sg() fix cleanup and refactor Li Zhijian
2023-11-03  9:55 ` [PATCH RFC V2 1/6] RDMA/rxe: RDMA/rxe: don't allow registering !PAGE_SIZE mr Li Zhijian
2023-11-03 10:14   ` Greg Sword
2023-11-03  9:55 ` [PATCH RFC V2 2/6] RDMA/rxe: set RXE_PAGE_SIZE_CAP to PAGE_SIZE Li Zhijian
2023-11-03  9:55 ` [PATCH RFC V2 3/6] RDMA/rxe: remove unused rxe_mr.page_shift Li Zhijian
2023-11-03  9:55 ` [PATCH RFC V2 4/6] RDMA/rxe: Use PAGE_SIZE and PAGE_SHIFT to extract address from page_list Li Zhijian
2023-11-03 17:59   ` Jason Gunthorpe
2023-11-03  9:55 ` [PATCH RFC V2 5/6] RDMA/rxe: cleanup rxe_mr.{page_size,page_shift} Li Zhijian
2023-11-03  9:55 ` [PATCH RFC V2 6/6] RDMA/rxe: Support PAGE_SIZE aligned MR Li Zhijian
2023-11-03 15:04   ` Bart Van Assche
2023-11-06  3:07     ` Zhijian Li (Fujitsu)
2023-11-03 10:17 ` [PATCH RFC V2 0/6] rxe_map_mr_sg() fix cleanup and refactor Greg Sword
2023-11-06  3:46   ` Zhijian Li (Fujitsu)
2023-11-03 13:00 ` Zhu Yanjun
2023-11-06  4:07   ` Zhijian Li (Fujitsu)
2023-11-06 13:58     ` Zhu Yanjun
2023-11-09  2:24       ` Zhijian Li (Fujitsu)
2023-11-09  6:36         ` Zhu Yanjun
2023-11-09  7:16         ` Greg Sword
2023-11-09  7:26           ` Zhijian Li (Fujitsu)
2023-11-09 13:10           ` Jason Gunthorpe
2023-11-06 14:13     ` Jason Gunthorpe
2023-11-06  7:59 ` Zhijian Li (Fujitsu)
2023-11-06  9:35   ` Greg Sword
2023-11-06  9:55     ` Zhijian Li (Fujitsu)
