* [for-next PATCH 0/5] iova_to_vaddr refactor
@ 2022-11-11 4:30 Li Zhijian
2022-11-11 4:30 ` [for-next PATCH 1/5] RDMA/rxe: Remove rxe_phys_buf.size Li Zhijian
` (5 more replies)
0 siblings, 6 replies; 16+ messages in thread
From: Li Zhijian @ 2022-11-11 4:30 UTC (permalink / raw)
To: Zhu Yanjun, Jason Gunthorpe, Leon Romanovsky, linux-rdma
Cc: linux-kernel, Li Zhijian
Background:
iova_to_vaddr() currently just looks up a map that is built when the MR is
registered and that records the relationship between iova and vaddr.
Conventionally, we should map the userspace address (iova) every time we
want to access it. Please refer to the previous discussion[1][2] for more
details.
Refactor:
With this refactoring, the mapping is done every time, right before the
memory is actually accessed, roughly as in the sketch below.
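A minimal sketch of the intended access pattern (illustrative only; the
example_copy_out() helper below is made up, while rxe_map_iova() and
rxe_unmap_vaddr() are the helpers introduced by patches 4 and 5):
/* Illustrative sketch, not part of the series: map the MR page only while
 * the data is actually being copied, then drop the mapping again.
 */
static int example_copy_out(struct rxe_mr *mr, u64 iova, void *dst, int len)
{
	/* for a user MR this ends up in kmap_local_page() (patch 4) */
	void *va = rxe_map_iova(mr, iova, len);

	if (!va)
		return -EFAULT;

	memcpy(dst, va, len);
	rxe_unmap_vaddr(mr, va);	/* kunmap_local() for a user MR */

	return 0;
}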
Patches 1, 3 and 5 are cleanup and preparation.
Patch 2 makes all subroutines that access an iova use the iova_to_vaddr() API.
Patch 4 is the refactor itself.
[1]: https://www.spinics.net/lists/linux-rdma/msg113206.html
[2]: https://lore.kernel.org/all/20220118123505.GF84788@nvidia.com/
Li Zhijian (5):
RDMA/rxe: Remove rxe_phys_buf.size
RDMA/rxe: use iova_to_vaddr to transform iova for rxe_mr_copy
RDMA/rxe: iova_to_vaddr cleanup
RDMA/rxe: refactor iova_to_vaddr
RDMA/rxe: Rename iova_to_vaddr to rxe_map_iova
drivers/infiniband/sw/rxe/rxe_loc.h | 3 +-
drivers/infiniband/sw/rxe/rxe_mr.c | 128 +++++++++++---------------
drivers/infiniband/sw/rxe/rxe_resp.c | 5 +-
drivers/infiniband/sw/rxe/rxe_verbs.c | 1 -
drivers/infiniband/sw/rxe/rxe_verbs.h | 6 +-
5 files changed, 62 insertions(+), 81 deletions(-)
--
2.31.1
* [for-next PATCH 1/5] RDMA/rxe: Remove rxe_phys_buf.size
2022-11-11 4:30 [for-next PATCH 0/5] iova_to_vaddr refactor Li Zhijian
@ 2022-11-11 4:30 ` Li Zhijian
2022-11-18 16:38 ` Jason Gunthorpe
2022-11-11 4:30 ` [for-next PATCH 2/5] RDMA/rxe: use iova_to_vaddr to transform iova for rxe_mr_copy Li Zhijian
` (4 subsequent siblings)
5 siblings, 1 reply; 16+ messages in thread
From: Li Zhijian @ 2022-11-11 4:30 UTC (permalink / raw)
To: Zhu Yanjun, Jason Gunthorpe, Leon Romanovsky, linux-rdma
Cc: linux-kernel, Li Zhijian
Every rxe_phys_buf used by either IB_MR_TYPE_MEM_REG or IB_MR_TYPE_USER
has the same size, which should be the same as ibmr->page_size. So we can
use ibmr->page_size instead.
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---
drivers/infiniband/sw/rxe/rxe_mr.c | 11 ++++-------
drivers/infiniband/sw/rxe/rxe_verbs.c | 1 -
drivers/infiniband/sw/rxe/rxe_verbs.h | 1 -
3 files changed, 4 insertions(+), 9 deletions(-)
diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index d4f10c2d1aa7..f6366a635b92 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -145,6 +145,7 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
mr->page_shift = PAGE_SHIFT;
mr->page_mask = PAGE_SIZE - 1;
+ mr->ibmr.page_size = PAGE_SIZE;
num_buf = 0;
map = mr->map;
@@ -167,7 +168,6 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
}
buf->addr = (uintptr_t)vaddr;
- buf->size = PAGE_SIZE;
num_buf++;
buf++;
@@ -219,7 +219,7 @@ static void lookup_iova(struct rxe_mr *mr, u64 iova, int *m_out, int *n_out,
size_t offset = iova - mr->ibmr.iova + mr->offset;
int map_index;
int buf_index;
- u64 length;
+ u64 length = mr->ibmr.page_size;
if (likely(mr->page_shift)) {
*offset_out = offset & mr->page_mask;
@@ -230,8 +230,6 @@ static void lookup_iova(struct rxe_mr *mr, u64 iova, int *m_out, int *n_out,
map_index = 0;
buf_index = 0;
- length = mr->map[map_index]->buf[buf_index].size;
-
while (offset >= length) {
offset -= length;
buf_index++;
@@ -240,7 +238,6 @@ static void lookup_iova(struct rxe_mr *mr, u64 iova, int *m_out, int *n_out,
map_index++;
buf_index = 0;
}
- length = mr->map[map_index]->buf[buf_index].size;
}
*m_out = map_index;
@@ -274,7 +271,7 @@ void *iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length)
lookup_iova(mr, iova, &m, &n, &offset);
- if (offset + length > mr->map[m]->buf[n].size) {
+ if (offset + length > mr->ibmr.page_size) {
pr_warn("crosses page boundary\n");
addr = NULL;
goto out;
@@ -336,7 +333,7 @@ int rxe_mr_copy(struct rxe_mr *mr, u64 iova, void *addr, int length,
src = (dir == RXE_TO_MR_OBJ) ? addr : va;
dest = (dir == RXE_TO_MR_OBJ) ? va : addr;
- bytes = buf->size - offset;
+ bytes = mr->ibmr.page_size - offset;
if (bytes > length)
bytes = length;
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
index 88825edc7dce..5da394c675bf 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -991,7 +991,6 @@ static int rxe_set_page(struct ib_mr *ibmr, u64 addr)
buf = &map->buf[mr->nbuf % RXE_BUF_PER_MAP];
buf->addr = addr;
- buf->size = ibmr->page_size;
mr->nbuf++;
return 0;
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h
index 22a299b0a9f0..acab785ba7e2 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.h
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.h
@@ -281,7 +281,6 @@ enum rxe_mr_lookup_type {
struct rxe_phys_buf {
u64 addr;
- u64 size;
};
struct rxe_map {
--
2.31.1
* [for-next PATCH 2/5] RDMA/rxe: use iova_to_vaddr to transform iova for rxe_mr_copy
2022-11-11 4:30 [for-next PATCH 0/5] iova_to_vaddr refactor Li Zhijian
2022-11-11 4:30 ` [for-next PATCH 1/5] RDMA/rxe: Remove rxe_phys_buf.size Li Zhijian
@ 2022-11-11 4:30 ` Li Zhijian
2022-11-18 16:41 ` Jason Gunthorpe
2022-11-11 4:30 ` [for-next PATCH 3/5] RDMA/rxe: iova_to_vaddr cleanup Li Zhijian
` (3 subsequent siblings)
5 siblings, 1 reply; 16+ messages in thread
From: Li Zhijian @ 2022-11-11 4:30 UTC (permalink / raw)
To: Zhu Yanjun, Jason Gunthorpe, Leon Romanovsky, linux-rdma
Cc: linux-kernel, Li Zhijian
Make the code more friendly and readable by using iova_to_vaddr().
This commit also removes the err1 label in rxe_mr_copy().
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---
drivers/infiniband/sw/rxe/rxe_mr.c | 46 +++++++++---------------------
1 file changed, 13 insertions(+), 33 deletions(-)
diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index f6366a635b92..cf3ce8d293b3 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -289,13 +289,6 @@ void *iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length)
int rxe_mr_copy(struct rxe_mr *mr, u64 iova, void *addr, int length,
enum rxe_mr_copy_dir dir)
{
- int err;
- int bytes;
- u8 *va;
- struct rxe_map **map;
- struct rxe_phys_buf *buf;
- int m;
- int i;
size_t offset;
if (length == 0)
@@ -315,49 +308,36 @@ int rxe_mr_copy(struct rxe_mr *mr, u64 iova, void *addr, int length,
WARN_ON_ONCE(!mr->map);
- err = mr_check_range(mr, iova, length);
- if (err) {
- err = -EFAULT;
- goto err1;
- }
-
- lookup_iova(mr, iova, &m, &i, &offset);
-
- map = mr->map + m;
- buf = map[0]->buf + i;
+ if (mr_check_range(mr, iova, length))
+ return -EFAULT;
+ offset = (iova - mr->ibmr.iova + mr->offset) & mr->page_mask;
while (length > 0) {
- u8 *src, *dest;
-
- va = (u8 *)(uintptr_t)buf->addr + offset;
- src = (dir == RXE_TO_MR_OBJ) ? addr : va;
- dest = (dir == RXE_TO_MR_OBJ) ? va : addr;
+ u8 *src, *dest, *va;
+ int bytes;
bytes = mr->ibmr.page_size - offset;
if (bytes > length)
bytes = length;
+ va = iova_to_vaddr(mr, iova, bytes);
+ if (!va)
+ return -EFAULT;
+
+ src = (dir == RXE_TO_MR_OBJ) ? addr : va;
+ dest = (dir == RXE_TO_MR_OBJ) ? va : addr;
+
memcpy(dest, src, bytes);
length -= bytes;
addr += bytes;
+ iova += bytes;
offset = 0;
- buf++;
- i++;
-
- if (i == RXE_BUF_PER_MAP) {
- i = 0;
- map++;
- buf = map[0]->buf;
- }
}
return 0;
-
-err1:
- return err;
}
/* copy data in or out of a wqe, i.e. sg list
--
2.31.1
* [for-next PATCH 3/5] RDMA/rxe: iova_to_vaddr cleanup
2022-11-11 4:30 [for-next PATCH 0/5] iova_to_vaddr refactor Li Zhijian
2022-11-11 4:30 ` [for-next PATCH 1/5] RDMA/rxe: Remove rxe_phys_buf.size Li Zhijian
2022-11-11 4:30 ` [for-next PATCH 2/5] RDMA/rxe: use iova_to_vaddr to transform iova for rxe_mr_copy Li Zhijian
@ 2022-11-11 4:30 ` Li Zhijian
2022-11-11 4:30 ` [for-next PATCH 4/5] RDMA/rxe: refactor iova_to_vaddr Li Zhijian
` (2 subsequent siblings)
5 siblings, 0 replies; 16+ messages in thread
From: Li Zhijian @ 2022-11-11 4:30 UTC (permalink / raw)
To: Zhu Yanjun, Jason Gunthorpe, Leon Romanovsky, linux-rdma
Cc: linux-kernel, Li Zhijian
No functional change; clean up the code so that it can be refactored more
easily later.
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---
drivers/infiniband/sw/rxe/rxe_mr.c | 37 ++++++++++++++----------------
1 file changed, 17 insertions(+), 20 deletions(-)
diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index cf3ce8d293b3..a4e786b657b7 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -246,41 +246,38 @@ static void lookup_iova(struct rxe_mr *mr, u64 iova, int *m_out, int *n_out,
}
}
-void *iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length)
+static void *__iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length)
{
size_t offset;
int m, n;
- void *addr;
+ lookup_iova(mr, iova, &m, &n, &offset);
+
+ if (offset + length > mr->ibmr.page_size) {
+ pr_warn("crosses page boundary\n");
+ return NULL;
+ }
+
+ return (void *)(uintptr_t)mr->map[m]->buf[n].addr + offset;
+}
+
+void *iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length)
+{
if (mr->state != RXE_MR_STATE_VALID) {
pr_warn("mr not in valid state\n");
- addr = NULL;
- goto out;
+ return NULL;
}
if (!mr->map) {
- addr = (void *)(uintptr_t)iova;
- goto out;
+ return (void *)(uintptr_t)iova;
}
if (mr_check_range(mr, iova, length)) {
pr_warn("range violation\n");
- addr = NULL;
- goto out;
- }
-
- lookup_iova(mr, iova, &m, &n, &offset);
-
- if (offset + length > mr->ibmr.page_size) {
- pr_warn("crosses page boundary\n");
- addr = NULL;
- goto out;
+ return NULL;
}
- addr = (void *)(uintptr_t)mr->map[m]->buf[n].addr + offset;
-
-out:
- return addr;
+ return __iova_to_vaddr(mr, iova, length);
}
/* copy data from a range (vaddr, vaddr+length-1) to or from
--
2.31.1
* [for-next PATCH 4/5] RDMA/rxe: refactor iova_to_vaddr
2022-11-11 4:30 [for-next PATCH 0/5] iova_to_vaddr refactor Li Zhijian
` (2 preceding siblings ...)
2022-11-11 4:30 ` [for-next PATCH 3/5] RDMA/rxe: iova_to_vaddr cleanup Li Zhijian
@ 2022-11-11 4:30 ` Li Zhijian
2022-11-16 12:37 ` Fabio M. De Francesco
` (2 more replies)
2022-11-11 4:32 ` [for-next PATCH 5/5] RDMA/rxe: Rename iova_to_vaddr to rxe_map_iova Li Zhijian
2022-11-11 6:13 ` [for-next PATCH 0/5] iova_to_vaddr refactor lizhijian
5 siblings, 3 replies; 16+ messages in thread
From: Li Zhijian @ 2022-11-11 4:30 UTC (permalink / raw)
To: Zhu Yanjun, Jason Gunthorpe, Leon Romanovsky, linux-rdma
Cc: linux-kernel, Li Zhijian
For an IB_MR_TYPE_USER MR, iova_to_vaddr() will call kmap_local_page() to
map the page.
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---
drivers/infiniband/sw/rxe/rxe_loc.h | 1 +
drivers/infiniband/sw/rxe/rxe_mr.c | 38 +++++++++++++++------------
drivers/infiniband/sw/rxe/rxe_resp.c | 1 +
drivers/infiniband/sw/rxe/rxe_verbs.h | 5 +++-
4 files changed, 27 insertions(+), 18 deletions(-)
diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h b/drivers/infiniband/sw/rxe/rxe_loc.h
index c2a5c8814a48..22a8c44d39c8 100644
--- a/drivers/infiniband/sw/rxe/rxe_loc.h
+++ b/drivers/infiniband/sw/rxe/rxe_loc.h
@@ -73,6 +73,7 @@ int rxe_mr_copy(struct rxe_mr *mr, u64 iova, void *addr, int length,
int copy_data(struct rxe_pd *pd, int access, struct rxe_dma_info *dma,
void *addr, int length, enum rxe_mr_copy_dir dir);
void *iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length);
+void rxe_unmap_vaddr(struct rxe_mr *mr, void *vaddr);
struct rxe_mr *lookup_mr(struct rxe_pd *pd, int access, u32 key,
enum rxe_mr_lookup_type type);
int mr_check_range(struct rxe_mr *mr, u64 iova, size_t length);
diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index a4e786b657b7..d26a4a33119c 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -120,9 +120,7 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
struct ib_umem *umem;
struct sg_page_iter sg_iter;
int num_buf;
- void *vaddr;
int err;
- int i;
umem = ib_umem_get(&rxe->ib_dev, start, length, access);
if (IS_ERR(umem)) {
@@ -159,18 +157,9 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
num_buf = 0;
}
- vaddr = page_address(sg_page_iter_page(&sg_iter));
- if (!vaddr) {
- pr_warn("%s: Unable to get virtual address\n",
- __func__);
- err = -ENOMEM;
- goto err_cleanup_map;
- }
-
- buf->addr = (uintptr_t)vaddr;
+ buf->page = sg_page_iter_page(&sg_iter);
num_buf++;
buf++;
-
}
}
@@ -182,10 +171,6 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
return 0;
-err_cleanup_map:
- for (i = 0; i < mr->num_map; i++)
- kfree(mr->map[i]);
- kfree(mr->map);
err_release_umem:
ib_umem_release(umem);
err_out:
@@ -246,6 +231,12 @@ static void lookup_iova(struct rxe_mr *mr, u64 iova, int *m_out, int *n_out,
}
}
+void rxe_unmap_vaddr(struct rxe_mr *mr, void *vaddr)
+{
+ if (mr->ibmr.type == IB_MR_TYPE_USER)
+ kunmap_local(vaddr);
+}
+
static void *__iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length)
{
size_t offset;
@@ -258,9 +249,21 @@ static void *__iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length)
return NULL;
}
- return (void *)(uintptr_t)mr->map[m]->buf[n].addr + offset;
+ if (mr->ibmr.type == IB_MR_TYPE_USER) {
+ char *paddr;
+ struct page *pg = mr->map[m]->buf[n].page;
+
+ paddr = kmap_local_page(pg);
+ if (paddr == NULL) {
+ pr_warn("Failed to map page");
+ return NULL;
+ }
+ return paddr + offset;
+ } else
+ return (void *)(uintptr_t)mr->map[m]->buf[n].addr + offset;
}
+/* must call rxe_unmap_vaddr to unmap vaddr */
void *iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length)
{
if (mr->state != RXE_MR_STATE_VALID) {
@@ -326,6 +329,7 @@ int rxe_mr_copy(struct rxe_mr *mr, u64 iova, void *addr, int length,
dest = (dir == RXE_TO_MR_OBJ) ? va : addr;
memcpy(dest, src, bytes);
+ rxe_unmap_vaddr(mr, va);
length -= bytes;
addr += bytes;
diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
index 483043dc4e89..765cb9f8538a 100644
--- a/drivers/infiniband/sw/rxe/rxe_resp.c
+++ b/drivers/infiniband/sw/rxe/rxe_resp.c
@@ -636,6 +636,7 @@ static enum resp_states atomic_reply(struct rxe_qp *qp,
*vaddr = value;
spin_unlock_bh(&atomic_ops_lock);
+ rxe_unmap_vaddr(mr, vaddr);
qp->resp.msn++;
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h
index acab785ba7e2..6080a4b32f09 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.h
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.h
@@ -280,7 +280,10 @@ enum rxe_mr_lookup_type {
#define RXE_BUF_PER_MAP (PAGE_SIZE / sizeof(struct rxe_phys_buf))
struct rxe_phys_buf {
- u64 addr;
+ union {
+ u64 addr; /* IB_MR_TYPE_MEM_REG */
+ struct page *page; /* IB_MR_TYPE_USER */
+ };
};
struct rxe_map {
--
2.31.1
* [for-next PATCH 5/5] RDMA/rxe: Rename iova_to_vaddr to rxe_map_iova
2022-11-11 4:30 [for-next PATCH 0/5] iova_to_vaddr refactor Li Zhijian
` (3 preceding siblings ...)
2022-11-11 4:30 ` [for-next PATCH 4/5] RDMA/rxe: refactor iova_to_vaddr Li Zhijian
@ 2022-11-11 4:32 ` Li Zhijian
2022-11-11 6:13 ` [for-next PATCH 0/5] iova_to_vaddr refactor lizhijian
5 siblings, 0 replies; 16+ messages in thread
From: Li Zhijian @ 2022-11-11 4:32 UTC (permalink / raw)
To: Zhu Yanjun, Jason Gunthorpe, Leon Romanovsky, linux-rdma
Cc: linux-kernel, Li Zhijian
Rename it so that it pairs with rxe_unmap_vaddr().
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---
drivers/infiniband/sw/rxe/rxe_loc.h | 2 +-
drivers/infiniband/sw/rxe/rxe_mr.c | 4 ++--
drivers/infiniband/sw/rxe/rxe_resp.c | 4 ++--
3 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h b/drivers/infiniband/sw/rxe/rxe_loc.h
index 22a8c44d39c8..496e29215e0f 100644
--- a/drivers/infiniband/sw/rxe/rxe_loc.h
+++ b/drivers/infiniband/sw/rxe/rxe_loc.h
@@ -72,7 +72,7 @@ int rxe_mr_copy(struct rxe_mr *mr, u64 iova, void *addr, int length,
enum rxe_mr_copy_dir dir);
int copy_data(struct rxe_pd *pd, int access, struct rxe_dma_info *dma,
void *addr, int length, enum rxe_mr_copy_dir dir);
-void *iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length);
+void *rxe_map_iova(struct rxe_mr *mr, u64 iova, int length);
void rxe_unmap_vaddr(struct rxe_mr *mr, void *vaddr);
struct rxe_mr *lookup_mr(struct rxe_pd *pd, int access, u32 key,
enum rxe_mr_lookup_type type);
diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index d26a4a33119c..e46ea2a7179c 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -264,7 +264,7 @@ static void *__iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length)
}
/* must call rxe_unmap_vaddr to unmap vaddr */
-void *iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length)
+void *rxe_map_iova(struct rxe_mr *mr, u64 iova, int length)
{
if (mr->state != RXE_MR_STATE_VALID) {
pr_warn("mr not in valid state\n");
@@ -321,7 +321,7 @@ int rxe_mr_copy(struct rxe_mr *mr, u64 iova, void *addr, int length,
if (bytes > length)
bytes = length;
- va = iova_to_vaddr(mr, iova, bytes);
+ va = rxe_map_iova(mr, iova, bytes);
if (!va)
return -EFAULT;
diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
index 765cb9f8538a..831d71e2054c 100644
--- a/drivers/infiniband/sw/rxe/rxe_resp.c
+++ b/drivers/infiniband/sw/rxe/rxe_resp.c
@@ -615,8 +615,8 @@ static enum resp_states atomic_reply(struct rxe_qp *qp,
goto out;
}
- vaddr = iova_to_vaddr(mr, qp->resp.va + qp->resp.offset,
- sizeof(u64));
+ vaddr = rxe_map_iova(mr, qp->resp.va + qp->resp.offset,
+ sizeof(u64));
/* check vaddr is 8 bytes aligned. */
if (!vaddr || (uintptr_t)vaddr & 7) {
--
2.31.1
* Re: [for-next PATCH 0/5] iova_to_vaddr refactor
2022-11-11 4:30 [for-next PATCH 0/5] iova_to_vaddr refactor Li Zhijian
` (4 preceding siblings ...)
2022-11-11 4:32 ` [for-next PATCH 5/5] RDMA/rxe: Rename iova_to_vaddr to rxe_map_iova Li Zhijian
@ 2022-11-11 6:13 ` lizhijian
5 siblings, 0 replies; 16+ messages in thread
From: lizhijian @ 2022-11-11 6:13 UTC (permalink / raw)
To: Zhu Yanjun, Jason Gunthorpe, Leon Romanovsky,
linux-rdma@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, yangx.jy@fujitsu.com
CC+ Xiao
On 11/11/2022 12:30, Li Zhijian wrote:
> Background:
> iova_to_vaddr() currently just looks up a map that is built when the MR is
> registered and that records the relationship between iova and vaddr.
> Conventionally, we should map the userspace address (iova) every time we
> want to access it. Please refer to the previous discussion[1][2] for more
> details.
>
> Refactor:
> With this refactoring, the mapping is done every time, right before the
> memory is actually accessed, roughly as in the sketch below.
>
> Patches 1, 3 and 5 are cleanup and preparation.
> Patch 2 makes all subroutines that access an iova use the iova_to_vaddr() API.
> Patch 4 is the refactor itself.
>
> [1]: https://www.spinics.net/lists/linux-rdma/msg113206.html
> [2]: https://lore.kernel.org/all/20220118123505.GF84788@nvidia.com/
>
> Li Zhijian (5):
> RDMA/rxe: Remove rxe_phys_buf.size
> RDMA/rxe: use iova_to_vaddr to transform iova for rxe_mr_copy
> RDMA/rxe: iova_to_vaddr cleanup
> RDMA/rxe: refactor iova_to_vaddr
> RDMA/rxe: Rename iova_to_vaddr to rxe_map_iova
>
> drivers/infiniband/sw/rxe/rxe_loc.h | 3 +-
> drivers/infiniband/sw/rxe/rxe_mr.c | 128 +++++++++++---------------
> drivers/infiniband/sw/rxe/rxe_resp.c | 5 +-
> drivers/infiniband/sw/rxe/rxe_verbs.c | 1 -
> drivers/infiniband/sw/rxe/rxe_verbs.h | 6 +-
> 5 files changed, 62 insertions(+), 81 deletions(-)
>
* Re: [for-next PATCH 4/5] RDMA/rxe: refactor iova_to_vaddr
2022-11-11 4:30 ` [for-next PATCH 4/5] RDMA/rxe: refactor iova_to_vaddr Li Zhijian
@ 2022-11-16 12:37 ` Fabio M. De Francesco
2022-11-18 1:32 ` lizhijian
2022-11-21 18:53 ` Jason Gunthorpe
2022-11-16 12:49 ` Fabio M. De Francesco
2022-11-18 17:33 ` Jason Gunthorpe
2 siblings, 2 replies; 16+ messages in thread
From: Fabio M. De Francesco @ 2022-11-16 12:37 UTC (permalink / raw)
To: Li Zhijian
Cc: Zhu Yanjun, Jason Gunthorpe, Leon Romanovsky, linux-rdma,
linux-kernel, Ira Weiny
On Friday, 11 November 2022 05:30:29 CET Li Zhijian wrote:
> For IB_MR_TYPE_USER MR, iova_to_vaddr() will call kmap_local_page() to
> map page.
>
> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
> ---
> drivers/infiniband/sw/rxe/rxe_loc.h | 1 +
> drivers/infiniband/sw/rxe/rxe_mr.c | 38 +++++++++++++++------------
> drivers/infiniband/sw/rxe/rxe_resp.c | 1 +
> drivers/infiniband/sw/rxe/rxe_verbs.h | 5 +++-
> 4 files changed, 27 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h
> b/drivers/infiniband/sw/rxe/rxe_loc.h index c2a5c8814a48..22a8c44d39c8
100644
> --- a/drivers/infiniband/sw/rxe/rxe_loc.h
> +++ b/drivers/infiniband/sw/rxe/rxe_loc.h
> @@ -73,6 +73,7 @@ int rxe_mr_copy(struct rxe_mr *mr, u64 iova, void *addr,
int
> length, int copy_data(struct rxe_pd *pd, int access, struct rxe_dma_info
> *dma, void *addr, int length, enum rxe_mr_copy_dir dir);
> void *iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length);
> +void rxe_unmap_vaddr(struct rxe_mr *mr, void *vaddr);
> struct rxe_mr *lookup_mr(struct rxe_pd *pd, int access, u32 key,
> enum rxe_mr_lookup_type type);
> int mr_check_range(struct rxe_mr *mr, u64 iova, size_t length);
> diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c
> b/drivers/infiniband/sw/rxe/rxe_mr.c index a4e786b657b7..d26a4a33119c 100644
> --- a/drivers/infiniband/sw/rxe/rxe_mr.c
> +++ b/drivers/infiniband/sw/rxe/rxe_mr.c
> @@ -120,9 +120,7 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64
> length, u64 iova, struct ib_umem *umem;
> struct sg_page_iter sg_iter;
> int num_buf;
> - void *vaddr;
> int err;
> - int i;
>
> umem = ib_umem_get(&rxe->ib_dev, start, length, access);
> if (IS_ERR(umem)) {
> @@ -159,18 +157,9 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start,
u64
> length, u64 iova, num_buf = 0;
> }
>
> - vaddr =
page_address(sg_page_iter_page(&sg_iter));
> - if (!vaddr) {
> - pr_warn("%s: Unable to get virtual
address\n",
> - __func__);
> - err = -ENOMEM;
> - goto err_cleanup_map;
> - }
> -
> - buf->addr = (uintptr_t)vaddr;
> + buf->page = sg_page_iter_page(&sg_iter);
> num_buf++;
> buf++;
> -
> }
> }
>
> @@ -182,10 +171,6 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start,
u64
> length, u64 iova,
>
> return 0;
>
> -err_cleanup_map:
> - for (i = 0; i < mr->num_map; i++)
> - kfree(mr->map[i]);
> - kfree(mr->map);
> err_release_umem:
> ib_umem_release(umem);
> err_out:
> @@ -246,6 +231,12 @@ static void lookup_iova(struct rxe_mr *mr, u64 iova,
int
> *m_out, int *n_out, }
> }
>
> +void rxe_unmap_vaddr(struct rxe_mr *mr, void *vaddr)
> +{
> + if (mr->ibmr.type == IB_MR_TYPE_USER)
> + kunmap_local(vaddr);
> +}
> +
> static void *__iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length)
> {
> size_t offset;
> @@ -258,9 +249,21 @@ static void *__iova_to_vaddr(struct rxe_mr *mr, u64
iova,
> int length) return NULL;
> }
>
> - return (void *)(uintptr_t)mr->map[m]->buf[n].addr + offset;
> + if (mr->ibmr.type == IB_MR_TYPE_USER) {
> + char *paddr;
> + struct page *pg = mr->map[m]->buf[n].page;
> +
> + paddr = kmap_local_page(pg);
> + if (paddr == NULL) {
> + pr_warn("Failed to map page");
> + return NULL;
> + }
I know nothing about this code but I am here as a result of regular checks for
changes to HIGHMEM mappings across the entire kernel. So please forgive me if
I'm objecting to the correct changes.
1) It looks like this code had a call to page_address() and you converted it
to mapping with kmap_local_page().
If page_address() is related and it used to work properly, the page you are
mapping cannot come from ZONE_HIGHMEM. Therefore, kmap_local_page() looks like
an overkill.
I'm probably missing something...
> + return paddr + offset;
> + } else
> + return (void *)(uintptr_t)mr->map[m]->buf[n].addr +
offset;
> }
>
> +/* must call rxe_unmap_vaddr to unmap vaddr */
> void *iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length)
> {
> if (mr->state != RXE_MR_STATE_VALID) {
> @@ -326,6 +329,7 @@ int rxe_mr_copy(struct rxe_mr *mr, u64 iova, void *addr,
> int length, dest = (dir == RXE_TO_MR_OBJ) ? va : addr;
>
> memcpy(dest, src, bytes);
> + rxe_unmap_vaddr(mr, va);
>
> length -= bytes;
> addr += bytes;
> diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c
> b/drivers/infiniband/sw/rxe/rxe_resp.c index 483043dc4e89..765cb9f8538a
> 100644
> --- a/drivers/infiniband/sw/rxe/rxe_resp.c
> +++ b/drivers/infiniband/sw/rxe/rxe_resp.c
> @@ -636,6 +636,7 @@ static enum resp_states atomic_reply(struct rxe_qp *qp,
>
> *vaddr = value;
> spin_unlock_bh(&atomic_ops_lock);
> + rxe_unmap_vaddr(mr, vaddr);
>
> qp->resp.msn++;
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h
> b/drivers/infiniband/sw/rxe/rxe_verbs.h index acab785ba7e2..6080a4b32f09
> 100644
> --- a/drivers/infiniband/sw/rxe/rxe_verbs.h
> +++ b/drivers/infiniband/sw/rxe/rxe_verbs.h
> @@ -280,7 +280,10 @@ enum rxe_mr_lookup_type {
> #define RXE_BUF_PER_MAP (PAGE_SIZE / sizeof(struct
rxe_phys_buf))
>
> struct rxe_phys_buf {
> - u64 addr;
> + union {
> + u64 addr; /* IB_MR_TYPE_MEM_REG */
> + struct page *page; /* IB_MR_TYPE_USER */
> + };
> };
>
> struct rxe_map {
> --
> 2.31.1
* Re: [for-next PATCH 4/5] RDMA/rxe: refactor iova_to_vaddr
2022-11-11 4:30 ` [for-next PATCH 4/5] RDMA/rxe: refactor iova_to_vaddr Li Zhijian
2022-11-16 12:37 ` Fabio M. De Francesco
@ 2022-11-16 12:49 ` Fabio M. De Francesco
2022-11-18 17:33 ` Jason Gunthorpe
2 siblings, 0 replies; 16+ messages in thread
From: Fabio M. De Francesco @ 2022-11-16 12:49 UTC (permalink / raw)
To: Li Zhijian
Cc: Zhu Yanjun, Jason Gunthorpe, Leon Romanovsky, linux-rdma,
linux-kernel, ira.weiny
On Friday, 11 November 2022 05:30:29 CET Li Zhijian wrote:
> For IB_MR_TYPE_USER MR, iova_to_vaddr() will call kmap_local_page() to
> map page.
>
> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
> ---
> drivers/infiniband/sw/rxe/rxe_loc.h | 1 +
> drivers/infiniband/sw/rxe/rxe_mr.c | 38 +++++++++++++++------------
> drivers/infiniband/sw/rxe/rxe_resp.c | 1 +
> drivers/infiniband/sw/rxe/rxe_verbs.h | 5 +++-
> 4 files changed, 27 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h
> b/drivers/infiniband/sw/rxe/rxe_loc.h index c2a5c8814a48..22a8c44d39c8
100644
> --- a/drivers/infiniband/sw/rxe/rxe_loc.h
> +++ b/drivers/infiniband/sw/rxe/rxe_loc.h
> @@ -73,6 +73,7 @@ int rxe_mr_copy(struct rxe_mr *mr, u64 iova, void *addr,
int
> length, int copy_data(struct rxe_pd *pd, int access, struct rxe_dma_info
> *dma, void *addr, int length, enum rxe_mr_copy_dir dir);
> void *iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length);
> +void rxe_unmap_vaddr(struct rxe_mr *mr, void *vaddr);
> struct rxe_mr *lookup_mr(struct rxe_pd *pd, int access, u32 key,
> enum rxe_mr_lookup_type type);
> int mr_check_range(struct rxe_mr *mr, u64 iova, size_t length);
> diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c
> b/drivers/infiniband/sw/rxe/rxe_mr.c index a4e786b657b7..d26a4a33119c 100644
> --- a/drivers/infiniband/sw/rxe/rxe_mr.c
> +++ b/drivers/infiniband/sw/rxe/rxe_mr.c
> @@ -120,9 +120,7 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64
> length, u64 iova, struct ib_umem *umem;
> struct sg_page_iter sg_iter;
> int num_buf;
> - void *vaddr;
> int err;
> - int i;
>
> umem = ib_umem_get(&rxe->ib_dev, start, length, access);
> if (IS_ERR(umem)) {
> @@ -159,18 +157,9 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start,
u64
> length, u64 iova, num_buf = 0;
> }
>
> - vaddr =
page_address(sg_page_iter_page(&sg_iter));
> - if (!vaddr) {
> - pr_warn("%s: Unable to get virtual
address\n",
> - __func__);
> - err = -ENOMEM;
> - goto err_cleanup_map;
> - }
> -
> - buf->addr = (uintptr_t)vaddr;
> + buf->page = sg_page_iter_page(&sg_iter);
> num_buf++;
> buf++;
> -
> }
> }
>
> @@ -182,10 +171,6 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start,
u64
> length, u64 iova,
>
> return 0;
>
> -err_cleanup_map:
> - for (i = 0; i < mr->num_map; i++)
> - kfree(mr->map[i]);
> - kfree(mr->map);
> err_release_umem:
> ib_umem_release(umem);
> err_out:
> @@ -246,6 +231,12 @@ static void lookup_iova(struct rxe_mr *mr, u64 iova,
int
> *m_out, int *n_out, }
> }
>
> +void rxe_unmap_vaddr(struct rxe_mr *mr, void *vaddr)
> +{
> + if (mr->ibmr.type == IB_MR_TYPE_USER)
> + kunmap_local(vaddr);
> +}
> +
> static void *__iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length)
> {
> size_t offset;
> @@ -258,9 +249,21 @@ static void *__iova_to_vaddr(struct rxe_mr *mr, u64
iova,
> int length) return NULL;
> }
>
> - return (void *)(uintptr_t)mr->map[m]->buf[n].addr + offset;
> + if (mr->ibmr.type == IB_MR_TYPE_USER) {
> + char *paddr;
> + struct page *pg = mr->map[m]->buf[n].page;
> +
> + paddr = kmap_local_page(pg);
> + if (paddr == NULL) {
> + pr_warn("Failed to map page");
> + return NULL;
> + }
I know nothing about this code but I am here as a result of regular checks for
changes to HIGHMEM mappings across the entire kernel. So please forgive me if
I'm objecting to correct changes.
1) It looks like this code had a call to page_address() and you converted it
to mapping with kmap_local_page().
If page_address() is related and it used to work properly, the page you are
mapping cannot come from ZONE_HIGHMEM. Therefore, kmap_local_page() looks like
an overkill.
I'm probably missing something...
2) What made you think that kmap_local_page() might fail and return NULL?
AFAIK, kmap_local_page() won't ever return NULL, therefore there is no need to
check.
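The usual pattern is simply the following (illustrative sketch, not the rxe
code; example_copy_to_page() is a made-up name):
#include <linux/highmem.h>
#include <linux/string.h>

/* kmap_local_page() always returns a valid kernel virtual address, so
 * there is nothing to check for NULL; the mapping only has to be
 * released with kunmap_local(), in nested order.
 */
static void example_copy_to_page(struct page *page, size_t offset,
				 const void *src, size_t len)
{
	u8 *va = kmap_local_page(page);

	memcpy(va + offset, src, len);
	kunmap_local(va);
}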
Regards,
Fabio
P.S.: I just noticed that the second entry in my list was missing from the
other email which should be discarded.
> + return paddr + offset;
> + } else
> + return (void *)(uintptr_t)mr->map[m]->buf[n].addr +
offset;
> }
>
> +/* must call rxe_unmap_vaddr to unmap vaddr */
> void *iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length)
> {
> if (mr->state != RXE_MR_STATE_VALID) {
> @@ -326,6 +329,7 @@ int rxe_mr_copy(struct rxe_mr *mr, u64 iova, void *addr,
> int length, dest = (dir == RXE_TO_MR_OBJ) ? va : addr;
>
> memcpy(dest, src, bytes);
> + rxe_unmap_vaddr(mr, va);
>
> length -= bytes;
> addr += bytes;
> diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c
> b/drivers/infiniband/sw/rxe/rxe_resp.c index 483043dc4e89..765cb9f8538a
> 100644
> --- a/drivers/infiniband/sw/rxe/rxe_resp.c
> +++ b/drivers/infiniband/sw/rxe/rxe_resp.c
> @@ -636,6 +636,7 @@ static enum resp_states atomic_reply(struct rxe_qp *qp,
>
> *vaddr = value;
> spin_unlock_bh(&atomic_ops_lock);
> + rxe_unmap_vaddr(mr, vaddr);
>
> qp->resp.msn++;
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h
> b/drivers/infiniband/sw/rxe/rxe_verbs.h index acab785ba7e2..6080a4b32f09
> 100644
> --- a/drivers/infiniband/sw/rxe/rxe_verbs.h
> +++ b/drivers/infiniband/sw/rxe/rxe_verbs.h
> @@ -280,7 +280,10 @@ enum rxe_mr_lookup_type {
> #define RXE_BUF_PER_MAP (PAGE_SIZE / sizeof(struct
rxe_phys_buf))
>
> struct rxe_phys_buf {
> - u64 addr;
> + union {
> + u64 addr; /* IB_MR_TYPE_MEM_REG */
> + struct page *page; /* IB_MR_TYPE_USER */
> + };
> };
>
> struct rxe_map {
> --
> 2.31.1
* Re: [for-next PATCH 4/5] RDMA/rxe: refactor iova_to_vaddr
2022-11-16 12:37 ` Fabio M. De Francesco
@ 2022-11-18 1:32 ` lizhijian
2022-11-21 18:53 ` Jason Gunthorpe
1 sibling, 0 replies; 16+ messages in thread
From: lizhijian @ 2022-11-18 1:32 UTC (permalink / raw)
To: Fabio M. De Francesco
Cc: Zhu Yanjun, Jason Gunthorpe, Leon Romanovsky,
linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
Ira Weiny
On 16/11/2022 20:37, Fabio M. De Francesco wrote:
>> - return (void *)(uintptr_t)mr->map[m]->buf[n].addr + offset;
>> + if (mr->ibmr.type == IB_MR_TYPE_USER) {
>> + char *paddr;
>> + struct page *pg = mr->map[m]->buf[n].page;
>> +
>> + paddr = kmap_local_page(pg);
>> + if (paddr == NULL) {
>> + pr_warn("Failed to map page");
>> + return NULL;
>> + }
> I know nothing about this code but I am here as a result of regular checks for
> changes to HIGHMEM mappings across the entire kernel. So please forgive me if
> I'm objecting to the correct changes.
>
> 1) It looks like this code had a call to page_address() and you converted it
> to mapping with kmap_local_page().
>
> If page_address() is related and it used to work properly, the page you are
> mapping cannot come from ZONE_HIGHMEM.
Yes, you are totally right.
> Therefore, kmap_local_page() looks like
> an overkill.
The confusion about page_address() here has been raised for a long
time[1][2].
[1]: https://www.spinics.net/lists/linux-rdma/msg113206.html
[2]: https://lore.kernel.org/all/20220121160654.GC773547@iweiny-DESK2.sc.intel.com/
Thanks
Zhijian
>
> I'm probably missing something...
>
* Re: [for-next PATCH 1/5] RDMA/rxe: Remove rxe_phys_buf.size
2022-11-11 4:30 ` [for-next PATCH 1/5] RDMA/rxe: Remove rxe_phys_buf.size Li Zhijian
@ 2022-11-18 16:38 ` Jason Gunthorpe
0 siblings, 0 replies; 16+ messages in thread
From: Jason Gunthorpe @ 2022-11-18 16:38 UTC (permalink / raw)
To: Li Zhijian; +Cc: Zhu Yanjun, Leon Romanovsky, linux-rdma, linux-kernel
On Fri, Nov 11, 2022 at 04:30:26AM +0000, Li Zhijian wrote:
> Every rxe_phys_buf used by either IB_MR_TYPE_MEM_REG or IB_MR_TYPE_USER
> has the same size, which should be the same as ibmr->page_size. So we can
> use ibmr->page_size instead.
ibmr->page_size is really only supposed to be used by MRs that are
going to be used with FMR. It is some protocol to pass information to
the IB_WR_REG_MR op.
The whole way rxe stores the MRs really could stand to be modernized,
just replace the whole mr->map and everything under it with a simple
xarray of page pointers.
You are right though, the page_size of the logical page array should
be global to the mr, not stored in every entry. It isn't a sgl, it is
a dynamic array. Just don't abuse ibmr->page_size for it.
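Something along these lines, just to illustrate the shape (untested sketch,
not a real conversion; the example_* names are made up):
#include <linux/types.h>
#include <linux/xarray.h>

/* Untested sketch: one page_shift for the whole MR and an xarray of
 * struct page pointers indexed by logical page number.  xa_init() is
 * assumed to have been called when the MR was created.
 */
struct example_mr {
	struct xarray	page_list;	/* entries are struct page * */
	unsigned int	page_shift;
	u64		iova;
};

static int example_store_page(struct example_mr *mr, unsigned long index,
			      struct page *page)
{
	return xa_err(xa_store(&mr->page_list, index, page, GFP_KERNEL));
}

static struct page *example_iova_to_page(struct example_mr *mr, u64 iova)
{
	return xa_load(&mr->page_list, (iova - mr->iova) >> mr->page_shift);
}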
Jason
* Re: [for-next PATCH 2/5] RDMA/rxe: use iova_to_vaddr to transform iova for rxe_mr_copy
2022-11-11 4:30 ` [for-next PATCH 2/5] RDMA/rxe: use iova_to_vaddr to transform iova for rxe_mr_copy Li Zhijian
@ 2022-11-18 16:41 ` Jason Gunthorpe
0 siblings, 0 replies; 16+ messages in thread
From: Jason Gunthorpe @ 2022-11-18 16:41 UTC (permalink / raw)
To: Li Zhijian; +Cc: Zhu Yanjun, Leon Romanovsky, linux-rdma, linux-kernel
On Fri, Nov 11, 2022 at 04:30:27AM +0000, Li Zhijian wrote:
> Make the code more friendly and readable by using iova_to_vaddr().
But the point of this is to be faster because it knows pages are in a
linear list.
As with my previous suggestion, just convert this to an xarray and
then it is a trivial and fast xas_for_each_range() returning struct
pages.
Jason
* Re: [for-next PATCH 4/5] RDMA/rxe: refactor iova_to_vaddr
2022-11-11 4:30 ` [for-next PATCH 4/5] RDMA/rxe: refactor iova_to_vaddr Li Zhijian
2022-11-16 12:37 ` Fabio M. De Francesco
2022-11-16 12:49 ` Fabio M. De Francesco
@ 2022-11-18 17:33 ` Jason Gunthorpe
2 siblings, 0 replies; 16+ messages in thread
From: Jason Gunthorpe @ 2022-11-18 17:33 UTC (permalink / raw)
To: Li Zhijian; +Cc: Zhu Yanjun, Leon Romanovsky, linux-rdma, linux-kernel
On Fri, Nov 11, 2022 at 04:30:29AM +0000, Li Zhijian wrote:
> diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h
> index acab785ba7e2..6080a4b32f09 100644
> --- a/drivers/infiniband/sw/rxe/rxe_verbs.h
> +++ b/drivers/infiniband/sw/rxe/rxe_verbs.h
> @@ -280,7 +280,10 @@ enum rxe_mr_lookup_type {
> #define RXE_BUF_PER_MAP (PAGE_SIZE / sizeof(struct rxe_phys_buf))
>
> struct rxe_phys_buf {
> - u64 addr;
> + union {
> + u64 addr; /* IB_MR_TYPE_MEM_REG */
> + struct page *page; /* IB_MR_TYPE_USER */
> + };
Everything rxe handles is a struct page, even the kernel
registrations.
Jason
* Re: [for-next PATCH 4/5] RDMA/rxe: refactor iova_to_vaddr
2022-11-16 12:37 ` Fabio M. De Francesco
2022-11-18 1:32 ` lizhijian
@ 2022-11-21 18:53 ` Jason Gunthorpe
2022-11-22 23:24 ` Fabio M. De Francesco
1 sibling, 1 reply; 16+ messages in thread
From: Jason Gunthorpe @ 2022-11-21 18:53 UTC (permalink / raw)
To: Fabio M. De Francesco
Cc: Li Zhijian, Zhu Yanjun, Leon Romanovsky, linux-rdma, linux-kernel,
Ira Weiny
On Wed, Nov 16, 2022 at 01:37:14PM +0100, Fabio M. De Francesco wrote:
> > - return (void *)(uintptr_t)mr->map[m]->buf[n].addr + offset;
> > + if (mr->ibmr.type == IB_MR_TYPE_USER) {
> > + char *paddr;
> > + struct page *pg = mr->map[m]->buf[n].page;
> > +
> > + paddr = kmap_local_page(pg);
> > + if (paddr == NULL) {
> > + pr_warn("Failed to map page");
> > + return NULL;
> > + }
>
> I know nothing about this code but I am here as a result of regular checks for
> changes to HIGHMEM mappings across the entire kernel. So please forgive me if
> I'm objecting to the correct changes.
>
> 1) It looks like this code had a call to page_address() and you converted it
> to mapping with kmap_local_page().
It only worked properly because the kconfig is blocking CONFIG_HIGHMEM
so ZONE_HIGHMEM doesn't exist. These pages are obtained from GUP and
could be anything.
This is done indirectly through
config INFINIBAND_VIRT_DMA
def_bool !HIGHMEM
But this patch is undoing parts of what are causing that restriction
by using the proper APIs. Though I'm not sure if we can eventually get
to remove it for RXE completely.
Jason
* Re: [for-next PATCH 4/5] RDMA/rxe: refactor iova_to_vaddr
2022-11-21 18:53 ` Jason Gunthorpe
@ 2022-11-22 23:24 ` Fabio M. De Francesco
2022-11-22 23:55 ` Fabio M. De Francesco
0 siblings, 1 reply; 16+ messages in thread
From: Fabio M. De Francesco @ 2022-11-22 23:24 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Li Zhijian, Zhu Yanjun, Leon Romanovsky, linux-rdma, linux-kernel,
Ira Weiny
On Monday, 21 November 2022 19:53:54 CET Jason Gunthorpe wrote:
> On Wed, Nov 16, 2022 at 01:37:14PM +0100, Fabio M. De Francesco wrote:
> > > - return (void *)(uintptr_t)mr->map[m]->buf[n].addr + offset;
> > > + if (mr->ibmr.type == IB_MR_TYPE_USER) {
> > > + char *paddr;
> > > + struct page *pg = mr->map[m]->buf[n].page;
> > > +
> > > + paddr = kmap_local_page(pg);
> > > + if (paddr == NULL) {
> > > + pr_warn("Failed to map page");
> > > + return NULL;
> > > + }
> >
> > I know nothing about this code but I am here as a result of regular checks
> > for changes to HIGHMEM mappings across the entire kernel. So please forgive
> > me if I'm objecting to the correct changes.
> >
> > 1) It looks like this code had a call to page_address() and you converted it
> > to mapping with kmap_local_page().
> It only worked properly because the kconfig is blocking CONFIG_HIGHMEM
> so ZONE_HIGHMEM doesn't exist. These pages are obtained from GUP and
> could be anything.
>
> This is done indirectly through
>
> config INFINIBAND_VIRT_DMA
> def_bool !HIGHMEM
> But this patch is undoing parts of what are causing that restriction
> by using the proper APIs. Though I'm not sure if we can eventually get
> to remove it for RXE completely.
> Jason
Ah, OK. This code was for !HIGHMEM kernels...
I understand, but FWIW I doubt that the use of page_address(), instead of
kmapping, should ever be made on the basis that kconfig is blocking HIGHMEM.
However, if I understand correctly, that restriction (!HIGHMEM) is going to be
removed. Therefore, page_address() will break on HIGHMEM and Jason is
correctly converting to kmap_local_page().
There is only one thing left... I think that he missed another mail from me.
The one you responded to was missing the final paragraph, so I sent another
message a few minutes later.
kmap_local_page() always returns valid pointers to kernel virtual addresses. I
can't see any way for kmap_local_page() to return NULL. Therefore, I was
asking for the reason to check for NULL.
Thanks,
Fabio
* Re: [for-next PATCH 4/5] RDMA/rxe: refactor iova_to_vaddr
2022-11-22 23:24 ` Fabio M. De Francesco
@ 2022-11-22 23:55 ` Fabio M. De Francesco
0 siblings, 0 replies; 16+ messages in thread
From: Fabio M. De Francesco @ 2022-11-22 23:55 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Li Zhijian, Zhu Yanjun, Leon Romanovsky, linux-rdma, linux-kernel,
Ira Weiny
On mercoledì 23 novembre 2022 00:24:26 CET Fabio M. De Francesco wrote:
> On lunedì 21 novembre 2022 19:53:54 CET Jason Gunthorpe wrote:
> > On Wed, Nov 16, 2022 at 01:37:14PM +0100, Fabio M. De Francesco wrote:
> > > > - return (void *)(uintptr_t)mr->map[m]->buf[n].addr + offset;
> > > > + if (mr->ibmr.type == IB_MR_TYPE_USER) {
> > > > + char *paddr;
> > > > + struct page *pg = mr->map[m]->buf[n].page;
> > > > +
> > > > + paddr = kmap_local_page(pg);
> > > > + if (paddr == NULL) {
> > > > + pr_warn("Failed to map page");
> > > > + return NULL;
> > > > + }
> > >
> > > I know nothing about this code but I am here as a result of regular checks
> > > for changes to HIGHMEM mappings across the entire kernel. So please forgive
> > > me if I'm objecting to the correct changes.
> > >
> > > 1) It looks like this code had a call to page_address() and you converted it
> > > to mapping with kmap_local_page().
> >
> > It only worked properly because the kconfig is blocking CONFIG_HIGHMEM
> > so ZONE_HIGHMEM doesn't exist. These pages are obtained from GUP and
> > could be anything.
> >
> > This is done indirectly through
> >
> > config INFINIBAND_VIRT_DMA
> >
> > def_bool !HIGHMEM
> >
> > But this patch is undoing parts of what are causing that restriction
> > by using the proper APIs. Though I'm not sure if we can eventually get
> > to remove it for RXE completely.
> > Jason
>
> Ah, OK. This code was for !HIGHMEM kernels...
>
> I understand, but FWIW I doubt that the use of page_address(), instead of
> kmapping, should ever be made on the basis that kconfig is blocking HIGHMEM.
>
> However, if I understand correctly, that restriction (!HIGHMEM) is going to be
> removed. Therefore, page_address() will break on HIGHMEM and Jason is
> correctly converting to kmap_local_page().
Jason, I'm sorry for the erroneous attribution :-(
Fabio
> There is only one thing left... I think that he missed another mail from me.
> The one you responded to was missing the final paragraph, so I sent another
> message a few minutes later.
>
> kmap_local_page() always returns valid pointers to kernel virtual addresses. I
> can't see any way for kmap_local_page() to return NULL. Therefore, I was
> asking for the reason to check for NULL.
>
> Thanks,
>
> Fabio
end of thread, other threads:[~2022-11-22 23:55 UTC | newest]
Thread overview: 16+ messages
2022-11-11 4:30 [for-next PATCH 0/5] iova_to_vaddr refactor Li Zhijian
2022-11-11 4:30 ` [for-next PATCH 1/5] RDMA/rxe: Remove rxe_phys_buf.size Li Zhijian
2022-11-18 16:38 ` Jason Gunthorpe
2022-11-11 4:30 ` [for-next PATCH 2/5] RDMA/rxe: use iova_to_vaddr to transform iova for rxe_mr_copy Li Zhijian
2022-11-18 16:41 ` Jason Gunthorpe
2022-11-11 4:30 ` [for-next PATCH 3/5] RDMA/rxe: iova_to_vaddr cleanup Li Zhijian
2022-11-11 4:30 ` [for-next PATCH 4/5] RDMA/rxe: refactor iova_to_vaddr Li Zhijian
2022-11-16 12:37 ` Fabio M. De Francesco
2022-11-18 1:32 ` lizhijian
2022-11-21 18:53 ` Jason Gunthorpe
2022-11-22 23:24 ` Fabio M. De Francesco
2022-11-22 23:55 ` Fabio M. De Francesco
2022-11-16 12:49 ` Fabio M. De Francesco
2022-11-18 17:33 ` Jason Gunthorpe
2022-11-11 4:32 ` [for-next PATCH 5/5] RDMA/rxe: Rename iova_to_vaddr to rxe_map_iova Li Zhijian
2022-11-11 6:13 ` [for-next PATCH 0/5] iova_to_vaddr refactor lizhijian