* [PATCH 0/5] udmabuf bug fix and some improvements
@ 2024-08-01 10:45 Huan Yang
2024-08-01 10:45 ` [PATCH 1/5] udmabuf: cancel mmap page fault, direct map it Huan Yang
` (5 more replies)
0 siblings, 6 replies; 14+ messages in thread
From: Huan Yang @ 2024-08-01 10:45 UTC (permalink / raw)
To: Gerd Hoffmann, Sumit Semwal, Christian König, dri-devel,
linux-media, linaro-mm-sig, linux-kernel
Cc: opensource.kernel, Huan Yang
This patchset attempts to fix some errors in udmabuf and to remove the
unpin_list structure.
Some of these fixes simply gather patches which I uploaded before.
Patch1
===
Remove the page-fault path from mmap and map the pages directly instead.
The current udmabuf has already obtained and pinned the folios upon
completion of the creation. This means that the physical memory has
already been acquired, rather than being accessed dynamically; the
page-fault method only saves some page-table memory.
As a result, the page-fault mechanism has lost its purpose as demand
paging. Because a page fault requires a trap into kernel mode when a
mapped virtual address is first accessed, every such user-mode access
incurs a kernel-mode transition.
Therefore, when creating a large udmabuf, this represents a
considerable overhead.
This patch removes the page-fault method of mmap and instead populates
the mapping directly when mmap is called.
This is achieved by using the scatter-gather table to establish a
linear relationship for the pages. Calling remap_pfn_range does not
invalidate the previously set VMA flags.
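To illustrate the bounds handling in the new mmap path, here is a small
userspace sketch (not the kernel code; `pages_to_remap` is a hypothetical
helper) of the loop that replaces the fault handler: it starts at the
page given by vma->vm_pgoff and stops once the VMA is covered or the
backing pages run out.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096UL

/* Model of the new mmap path: instead of installing a fault handler,
 * walk the page list starting at vm_pgoff and map one page per
 * PAGE_SIZE step until the end of the VMA is reached.  Returns the
 * number of pages that would be remapped, or -1 if the VMA asks for
 * more pages than the buffer holds past vm_pgoff. */
static long pages_to_remap(unsigned long vm_start, unsigned long vm_end,
                           unsigned long vm_pgoff, unsigned long pagecount)
{
    unsigned long addr = vm_start;
    unsigned long mapped = 0;
    unsigned long pg;

    for (pg = vm_pgoff; pg < pagecount; pg++) {
        /* remap_pfn_range(vma, addr, pfn, PAGE_SIZE, prot) goes here */
        mapped++;
        addr += PAGE_SIZE;
        if (addr >= vm_end)
            return (long)mapped;
    }
    /* ran out of backing pages before covering the VMA */
    return -1;
}
```

Note that the early return when `addr` reaches `vm_end` mirrors the check
in the patch; without it, a partial mapping of a large buffer would walk
the whole sg table.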
Patch2
===
This is the same as the patch at:
https://lore.kernel.org/all/20240725021349.580574-1-link@vivo.com/
I have simply gathered it into this patchset.
Patch3
===
The current implementation of udmabuf's vmap has a bug.
It does not correctly set each page of a folio in the pages array,
so when vmap is called, every entry is the head page of its folio.
This implementation differs from the earlier patch:
https://lore.kernel.org/all/20240731090233.1343559-1-link@vivo.com/
Here the sg table is reused to map all pages into the vmalloc area.
Patch4
===
Wrap the repeated calls to get_sg_table in a helper function.
udmabuf->sg is set with cmpxchg, which should prevent concurrent-access
races. (I see that mmap does not take a lock.)
Patch5
===
Attempt to remove unpin_list and its related data structures.
In order to adapt to folios, the unpin_list data structure was
established to unpin all folios and maintain the page-mapping
relationship.
However, this data structure requires 24 bytes for each entry and has
poor list-traversal performance. Maintaining the offsets array also
consumes a portion of memory.
This patch removes these data structures and modifies the semantics of
some existing ones.
udmabuf:
folios -> folios array, which now contains each folio only once (the
original contained duplicates).
add item_offset -> indexed by create item, records the start offset
within each memfd.
add item_size -> indexed by create item, records the size within each
memfd.
add nr_folios -> number of entries in the folios array.
So, when building the sg table, it is necessary to iterate in this way:
when the accumulated size crosses item_size, take care of the next
item's start offset within its folio;
for each folio reached, set each of its pages into the sgl until the
folio size is consumed.
This patch also removes the temporary folios allocation made for each
create item; instead a pointer into the ubuf->folios array slides
forward, filling the folios belonging to each item directly into the
array.
After the modification, the various data structures in udmabuf have the
following relationships:
pagecount * PAGE_SIZE = sum(folio_size(folios[i])), i = 0..nr_folios-1
pagecount * PAGE_SIZE = sum(item_size[i]), i = 0..item_count-1
(item_count itself is not recorded)
item_offset records each memfd's start offset if one exists, else 0.
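The iteration described above can be sketched in userspace as follows.
`walk_pages` is a hypothetical model of the loop in get_sg_table():
folio_size[] stands in for folio_size(ubuf->folios[i]), and the
boundary updates are skipped on the final page so all reads stay in
bounds.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096UL

/* For each of `pagecount` pages, compute (folio index, byte offset
 * inside that folio), exactly as the sg-table build loop would feed
 * sg_set_folio().  Crossing a folio boundary advances findex; crossing
 * an item boundary reloads the next item's start offset. */
static int walk_pages(const size_t *folio_size, const size_t *item_size,
                      const size_t *item_offset, size_t pagecount,
                      size_t *out_folio, size_t *out_off)
{
    size_t findex = 0, item_idx = 0;
    size_t foffset = item_offset[0];
    size_t cur_size = 0;
    size_t i;

    for (i = 0; i < pagecount; i++) {
        out_folio[i] = findex;
        out_off[i] = foffset;
        foffset += PAGE_SIZE;
        cur_size += PAGE_SIZE;

        if (i + 1 == pagecount)        /* last page: no more boundaries */
            break;

        if (foffset == folio_size[findex]) {   /* move to next folio */
            findex++;
            foffset = 0;
        }
        if (cur_size == item_size[item_idx]) { /* next create item */
            item_idx++;
            foffset = item_offset[item_idx];
            cur_size = 0;
        }
    }
    return 0;
}
```

For example, an item of 3 pages starting at offset PAGE_SIZE inside a
4-page folio, followed by a 2-page item in its own folio, yields page
offsets 4096/8192/12288 in folio 0 and then 0/4096 in folio 1.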
Huan Yang (5):
udmabuf: cancel mmap page fault, direct map it
udmabuf: change folios array from kmalloc to kvmalloc
udmabuf: fix vmap_udmabuf error page set
udmabuf: add get_sg_table helper function
udmabuf: remove folio pin list
drivers/dma-buf/udmabuf.c | 270 +++++++++++++++++++++-----------------
1 file changed, 148 insertions(+), 122 deletions(-)
base-commit: cd19ac2f903276b820f5d0d89de0c896c27036ed
--
2.45.2
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 1/5] udmabuf: cancel mmap page fault, direct map it
2024-08-01 10:45 [PATCH 0/5] udmabuf bug fix and some improvements Huan Yang
@ 2024-08-01 10:45 ` Huan Yang
2024-08-01 10:50 ` Christian König
2024-08-01 10:45 ` [PATCH 2/5] udmabuf: change folios array from kmalloc to kvmalloc Huan Yang
` (4 subsequent siblings)
5 siblings, 1 reply; 14+ messages in thread
From: Huan Yang @ 2024-08-01 10:45 UTC (permalink / raw)
To: Gerd Hoffmann, Sumit Semwal, Christian König, dri-devel,
linux-media, linaro-mm-sig, linux-kernel
Cc: opensource.kernel, Huan Yang
The current udmabuf mmap uses a page fault mechanism to populate the vma.
However, the current udmabuf has already obtained and pinned the folios
upon completion of the creation. This means that the physical memory has
already been acquired, rather than being accessed dynamically; the
page-fault method only saves some page-table memory.
As a result, the page-fault mechanism has lost its purpose as demand
paging. Because a page fault requires a trap into kernel mode when a
mapped virtual address is first accessed, every such user-mode access
incurs a kernel-mode transition.
Therefore, when creating a large udmabuf, this represents a
considerable overhead.
Therefore, this patch removes the page-fault method of mmap and instead
populates the mapping directly when mmap is called.
Signed-off-by: Huan Yang <link@vivo.com>
---
drivers/dma-buf/udmabuf.c | 70 ++++++++++++++++++++++-----------------
1 file changed, 39 insertions(+), 31 deletions(-)
diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
index 047c3cd2ceff..d69aeada7367 100644
--- a/drivers/dma-buf/udmabuf.c
+++ b/drivers/dma-buf/udmabuf.c
@@ -38,36 +38,39 @@ struct udmabuf_folio {
struct list_head list;
};
-static vm_fault_t udmabuf_vm_fault(struct vm_fault *vmf)
-{
- struct vm_area_struct *vma = vmf->vma;
- struct udmabuf *ubuf = vma->vm_private_data;
- pgoff_t pgoff = vmf->pgoff;
- unsigned long pfn;
-
- if (pgoff >= ubuf->pagecount)
- return VM_FAULT_SIGBUS;
-
- pfn = folio_pfn(ubuf->folios[pgoff]);
- pfn += ubuf->offsets[pgoff] >> PAGE_SHIFT;
-
- return vmf_insert_pfn(vma, vmf->address, pfn);
-}
-
-static const struct vm_operations_struct udmabuf_vm_ops = {
- .fault = udmabuf_vm_fault,
-};
+static struct sg_table *get_sg_table(struct device *dev, struct dma_buf *buf,
+ enum dma_data_direction direction);
static int mmap_udmabuf(struct dma_buf *buf, struct vm_area_struct *vma)
{
struct udmabuf *ubuf = buf->priv;
+ struct sg_table *table = ubuf->sg;
+ unsigned long addr = vma->vm_start;
+ struct sg_page_iter piter;
+ int ret;
if ((vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) == 0)
return -EINVAL;
- vma->vm_ops = &udmabuf_vm_ops;
- vma->vm_private_data = ubuf;
- vm_flags_set(vma, VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP);
+ if (!table) {
+ table = get_sg_table(NULL, buf, 0);
+ if (IS_ERR(table))
+ return PTR_ERR(table);
+ ubuf->sg = table;
+ }
+
+ for_each_sgtable_page(table, &piter, vma->vm_pgoff) {
+ struct page *page = sg_page_iter_page(&piter);
+
+ ret = remap_pfn_range(vma, addr, page_to_pfn(page), PAGE_SIZE,
+ vma->vm_page_prot);
+ if (ret)
+ return ret;
+ addr += PAGE_SIZE;
+ if (addr >= vma->vm_end)
+ return 0;
+ }
+
return 0;
}
@@ -126,6 +129,10 @@ static struct sg_table *get_sg_table(struct device *dev, struct dma_buf *buf,
sg_set_folio(sgl, ubuf->folios[i], PAGE_SIZE,
ubuf->offsets[i]);
+ // if dev is NULL, no need to sync.
+ if (!dev)
+ return sg;
+
ret = dma_map_sgtable(dev, sg, direction, 0);
if (ret < 0)
goto err_map;
@@ -206,20 +213,21 @@ static int begin_cpu_udmabuf(struct dma_buf *buf,
{
struct udmabuf *ubuf = buf->priv;
struct device *dev = ubuf->device->this_device;
- int ret = 0;
+ struct sg_table *sg;
- if (!ubuf->sg) {
- ubuf->sg = get_sg_table(dev, buf, direction);
- if (IS_ERR(ubuf->sg)) {
- ret = PTR_ERR(ubuf->sg);
- ubuf->sg = NULL;
- }
- } else {
+ if (ubuf->sg) {
dma_sync_sg_for_cpu(dev, ubuf->sg->sgl, ubuf->sg->nents,
direction);
+ return 0;
}
- return ret;
+ sg = get_sg_table(dev, buf, direction);
+ if (IS_ERR(sg))
+ return PTR_ERR(sg);
+
+ ubuf->sg = sg;
+
+ return 0;
}
static int end_cpu_udmabuf(struct dma_buf *buf,
--
2.45.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 2/5] udmabuf: change folios array from kmalloc to kvmalloc
2024-08-01 10:45 [PATCH 0/5] udmabuf bug fix and some improvements Huan Yang
2024-08-01 10:45 ` [PATCH 1/5] udmabuf: cancel mmap page fault, direct map it Huan Yang
@ 2024-08-01 10:45 ` Huan Yang
2024-08-01 10:54 ` Christian König
2024-08-01 10:45 ` [PATCH 3/5] udmabuf: fix vmap_udmabuf error page set Huan Yang
` (3 subsequent siblings)
5 siblings, 1 reply; 14+ messages in thread
From: Huan Yang @ 2024-08-01 10:45 UTC (permalink / raw)
To: Gerd Hoffmann, Sumit Semwal, Christian König, dri-devel,
linux-media, linaro-mm-sig, linux-kernel
Cc: opensource.kernel, Huan Yang
When PAGE_SIZE is 4096 and MAX_PAGE_ORDER is 10 on a 64-bit machine,
page_alloc can only supply up to 4MB.
Anything above this triggers the warning below and returns NULL.
udmabuf's size limit can be changed; if it is raised to 3072 (3GB) and
a 3GB udmabuf is then allocated, creation will fail.
[ 4080.876581] ------------[ cut here ]------------
[ 4080.876843] WARNING: CPU: 3 PID: 2015 at mm/page_alloc.c:4556 __alloc_pages+0x2c8/0x350
[ 4080.878839] RIP: 0010:__alloc_pages+0x2c8/0x350
[ 4080.879470] Call Trace:
[ 4080.879473] <TASK>
[ 4080.879473] ? __alloc_pages+0x2c8/0x350
[ 4080.879475] ? __warn.cold+0x8e/0xe8
[ 4080.880647] ? __alloc_pages+0x2c8/0x350
[ 4080.880909] ? report_bug+0xff/0x140
[ 4080.881175] ? handle_bug+0x3c/0x80
[ 4080.881556] ? exc_invalid_op+0x17/0x70
[ 4080.881559] ? asm_exc_invalid_op+0x1a/0x20
[ 4080.882077] ? udmabuf_create+0x131/0x400
Because of MAX_PAGE_ORDER, kmalloc can allocate at most 4096 * (1 << 10)
= 4MB of memory; each array entry is a pointer (8 bytes), so such an
array can cover 524288 pages (2GB).
Furthermore, a costly-order (above order 3) allocation may not be
guaranteed to succeed, due to fragmentation.
This patch changes the udmabuf arrays to use kvmalloc_array, which can
fall back to vmalloc. This guarantees the allocation for any size and
does not affect the performance of kmalloc allocations.
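The limits quoted above can be checked with a little arithmetic. This
sketch assumes typical x86-64 values (4 KiB PAGE_SIZE, MAX_PAGE_ORDER
of 10, 8-byte pointers); they are not universal constants.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE      4096UL
#define MAX_PAGE_ORDER 10

/* Largest contiguous allocation the page allocator hands out. */
static unsigned long kmalloc_max_bytes(void)
{
    return PAGE_SIZE << MAX_PAGE_ORDER;          /* 4 MiB */
}

/* How many per-page pointers fit in that allocation... */
static unsigned long max_array_entries(void)
{
    return kmalloc_max_bytes() / sizeof(void *); /* 524288 entries */
}

/* ...and how much buffer such a per-page pointer array can describe. */
static unsigned long max_buffer_bytes(void)
{
    return max_array_entries() * PAGE_SIZE;      /* 2 GiB */
}
```

So a kmalloc-backed pointer array tops out at a 2GB buffer, which is why
raising the module's size limit to 3GB makes creation fail without the
kvmalloc fallback.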
Signed-off-by: Huan Yang <link@vivo.com>
---
drivers/dma-buf/udmabuf.c | 26 +++++++++++++-------------
1 file changed, 13 insertions(+), 13 deletions(-)
diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
index d69aeada7367..a915714c5dce 100644
--- a/drivers/dma-buf/udmabuf.c
+++ b/drivers/dma-buf/udmabuf.c
@@ -83,7 +83,7 @@ static int vmap_udmabuf(struct dma_buf *buf, struct iosys_map *map)
dma_resv_assert_held(buf->resv);
- pages = kmalloc_array(ubuf->pagecount, sizeof(*pages), GFP_KERNEL);
+ pages = kvmalloc_array(ubuf->pagecount, sizeof(*pages), GFP_KERNEL);
if (!pages)
return -ENOMEM;
@@ -91,7 +91,7 @@ static int vmap_udmabuf(struct dma_buf *buf, struct iosys_map *map)
pages[pg] = &ubuf->folios[pg]->page;
vaddr = vm_map_ram(pages, ubuf->pagecount, -1);
- kfree(pages);
+ kvfree(pages);
if (!vaddr)
return -EINVAL;
@@ -203,8 +203,8 @@ static void release_udmabuf(struct dma_buf *buf)
put_sg_table(dev, ubuf->sg, DMA_BIDIRECTIONAL);
unpin_all_folios(&ubuf->unpin_list);
- kfree(ubuf->offsets);
- kfree(ubuf->folios);
+ kvfree(ubuf->offsets);
+ kvfree(ubuf->folios);
kfree(ubuf);
}
@@ -330,14 +330,14 @@ static long udmabuf_create(struct miscdevice *device,
if (!ubuf->pagecount)
goto err;
- ubuf->folios = kmalloc_array(ubuf->pagecount, sizeof(*ubuf->folios),
- GFP_KERNEL);
+ ubuf->folios = kvmalloc_array(ubuf->pagecount, sizeof(*ubuf->folios),
+ GFP_KERNEL);
if (!ubuf->folios) {
ret = -ENOMEM;
goto err;
}
- ubuf->offsets = kcalloc(ubuf->pagecount, sizeof(*ubuf->offsets),
- GFP_KERNEL);
+ ubuf->offsets =
+ kvcalloc(ubuf->pagecount, sizeof(*ubuf->offsets), GFP_KERNEL);
if (!ubuf->offsets) {
ret = -ENOMEM;
goto err;
@@ -351,7 +351,7 @@ static long udmabuf_create(struct miscdevice *device,
goto err;
pgcnt = list[i].size >> PAGE_SHIFT;
- folios = kmalloc_array(pgcnt, sizeof(*folios), GFP_KERNEL);
+ folios = kvmalloc_array(pgcnt, sizeof(*folios), GFP_KERNEL);
if (!folios) {
ret = -ENOMEM;
goto err;
@@ -361,7 +361,7 @@ static long udmabuf_create(struct miscdevice *device,
ret = memfd_pin_folios(memfd, list[i].offset, end,
folios, pgcnt, &pgoff);
if (ret <= 0) {
- kfree(folios);
+ kvfree(folios);
if (!ret)
ret = -EINVAL;
goto err;
@@ -390,7 +390,7 @@ static long udmabuf_create(struct miscdevice *device,
}
}
- kfree(folios);
+ kvfree(folios);
fput(memfd);
memfd = NULL;
}
@@ -406,8 +406,8 @@ static long udmabuf_create(struct miscdevice *device,
if (memfd)
fput(memfd);
unpin_all_folios(&ubuf->unpin_list);
- kfree(ubuf->offsets);
- kfree(ubuf->folios);
+ kvfree(ubuf->offsets);
+ kvfree(ubuf->folios);
kfree(ubuf);
return ret;
}
--
2.45.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 3/5] udmabuf: fix vmap_udmabuf error page set
2024-08-01 10:45 [PATCH 0/5] udmabuf bug fix and some improvements Huan Yang
2024-08-01 10:45 ` [PATCH 1/5] udmabuf: cancel mmap page fault, direct map it Huan Yang
2024-08-01 10:45 ` [PATCH 2/5] udmabuf: change folios array from kmalloc to kvmalloc Huan Yang
@ 2024-08-01 10:45 ` Huan Yang
2024-08-01 10:55 ` Christian König
2024-08-01 10:45 ` [PATCH 4/5] udmabuf: add get_sg_table helper function Huan Yang
` (2 subsequent siblings)
5 siblings, 1 reply; 14+ messages in thread
From: Huan Yang @ 2024-08-01 10:45 UTC (permalink / raw)
To: Gerd Hoffmann, Sumit Semwal, Christian König, dri-devel,
linux-media, linaro-mm-sig, linux-kernel
Cc: opensource.kernel, Huan Yang
Currently vmap_udmabuf fills the pages array from each folio.
But ubuf->folios contains only each folio's head page.
That means we repeatedly mapped the folio head page into the vmalloc
area.
This patch fixes it by setting each page of each folio correctly, so
that the pages array contains the right pages before they are mapped
into the vmalloc area.
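The bug can be modeled in userspace like this. A "folio" is a run of
contiguous page structs (modeled here as plain structs in an array);
`fill_buggy` and `fill_fixed` are hypothetical names for the old and
new fill loops, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

struct page { int id; };

/* Buggy fill: every slot for an n-page folio gets the head page,
 * the effect of taking &folio->page for each pagecount index. */
static void fill_buggy(struct page *head, struct page *out[], int n)
{
    for (int i = 0; i < n; i++)
        out[i] = head;            /* always the head page */
}

/* Fixed fill: walk the folio page by page, as iterating the sg table
 * with sg_page_iter does. */
static void fill_fixed(struct page *head, struct page *out[], int n)
{
    for (int i = 0; i < n; i++)
        out[i] = head + i;        /* i-th page of the folio */
}
```

With the buggy fill, vm_map_ram would map the head page n times; with
the fixed fill, each distinct page lands in the vmalloc area once.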
Signed-off-by: Huan Yang <link@vivo.com>
---
drivers/dma-buf/udmabuf.c | 17 +++++++++++++----
1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
index a915714c5dce..7ed532342d7f 100644
--- a/drivers/dma-buf/udmabuf.c
+++ b/drivers/dma-buf/udmabuf.c
@@ -77,18 +77,27 @@ static int mmap_udmabuf(struct dma_buf *buf, struct vm_area_struct *vma)
static int vmap_udmabuf(struct dma_buf *buf, struct iosys_map *map)
{
struct udmabuf *ubuf = buf->priv;
- struct page **pages;
+ struct page **pages, **tmp;
+ struct sg_table *sg = ubuf->sg;
+ struct sg_page_iter piter;
void *vaddr;
- pgoff_t pg;
dma_resv_assert_held(buf->resv);
+ if (!sg) {
+ sg = get_sg_table(NULL, buf, 0);
+ if (IS_ERR(sg))
+ return PTR_ERR(sg);
+ ubuf->sg = sg;
+ }
+
pages = kvmalloc_array(ubuf->pagecount, sizeof(*pages), GFP_KERNEL);
if (!pages)
return -ENOMEM;
+ tmp = pages;
- for (pg = 0; pg < ubuf->pagecount; pg++)
- pages[pg] = &ubuf->folios[pg]->page;
+ for_each_sgtable_page(sg, &piter, 0)
+ *tmp++ = sg_page_iter_page(&piter);
vaddr = vm_map_ram(pages, ubuf->pagecount, -1);
kvfree(pages);
--
2.45.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 4/5] udmabuf: add get_sg_table helper function
2024-08-01 10:45 [PATCH 0/5] udmabuf bug fix and some improvements Huan Yang
` (2 preceding siblings ...)
2024-08-01 10:45 ` [PATCH 3/5] udmabuf: fix vmap_udmabuf error page set Huan Yang
@ 2024-08-01 10:45 ` Huan Yang
2024-08-01 10:56 ` Christian König
2024-08-01 10:45 ` [PATCH 5/5] udmabuf: remove folio pin list Huan Yang
2024-08-01 18:32 ` [PATCH 0/5] udmabuf bug fix and some improvements Kasireddy, Vivek
5 siblings, 1 reply; 14+ messages in thread
From: Huan Yang @ 2024-08-01 10:45 UTC (permalink / raw)
To: Gerd Hoffmann, Sumit Semwal, Christian König, dri-devel,
linux-media, linaro-mm-sig, linux-kernel
Cc: opensource.kernel, Huan Yang
Currently, there are three duplicated pieces of code that retrieve the
sg_table and update ubuf->sg.
Since the sgt is used to populate the pages in both mmap and vmap, it
is necessary to ensure that ubuf->sg is set correctly.
This patch adds a helper function: if ubuf->sg already exists, it is
simply returned. Otherwise, a new sgt is allocated and installed with
cmpxchg.
When the swap fails, another thread has already set sg correctly, so
the freshly built table is freed and the winner's table is reused. If
triggered by a device, dma_map_sgtable must still be invoked to map it.
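The lockless publish pattern can be sketched in userspace with C11
atomics standing in for the kernel's cmpxchg(). This is an analog, not
the driver code: `sg_table_stub` is an opaque stand-in for struct
sg_table, `publish_once` for the helper, and the DMA mapping step is
omitted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct sg_table_stub { int dummy; };

/* First caller to install its freshly built table wins; a loser frees
 * its own copy and adopts the winner's.  Mirrors:
 *   if (!cmpxchg(&ubuf->sg, NULL, sg)) return sg;  ...  */
static struct sg_table_stub *
publish_once(struct sg_table_stub *_Atomic *slot)
{
    struct sg_table_stub *sg = atomic_load(slot);

    if (sg)
        return sg;                       /* already set, fast path */

    sg = malloc(sizeof(*sg));            /* "get_sg_table(...)" */
    if (!sg)
        return NULL;

    struct sg_table_stub *expected = NULL;
    if (atomic_compare_exchange_strong(slot, &expected, sg))
        return sg;                       /* we won the race */

    free(sg);                            /* lost: drop our copy ... */
    return atomic_load(slot);            /* ... reuse the winner's  */
}
```

Every caller ends up with the same published table, and at most one
table survives a race, which is the property the helper relies on in
place of a lock.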
Signed-off-by: Huan Yang <link@vivo.com>
---
drivers/dma-buf/udmabuf.c | 60 ++++++++++++++++++++++++++++-----------
1 file changed, 43 insertions(+), 17 deletions(-)
diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
index 7ed532342d7f..677ebb2d462f 100644
--- a/drivers/dma-buf/udmabuf.c
+++ b/drivers/dma-buf/udmabuf.c
@@ -38,8 +38,9 @@ struct udmabuf_folio {
struct list_head list;
};
-static struct sg_table *get_sg_table(struct device *dev, struct dma_buf *buf,
- enum dma_data_direction direction);
+static struct sg_table *udmabuf_get_sg_table(struct device *dev,
+ struct dma_buf *buf,
+ enum dma_data_direction direction);
static int mmap_udmabuf(struct dma_buf *buf, struct vm_area_struct *vma)
{
@@ -52,12 +53,9 @@ static int mmap_udmabuf(struct dma_buf *buf, struct vm_area_struct *vma)
if ((vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) == 0)
return -EINVAL;
- if (!table) {
- table = get_sg_table(NULL, buf, 0);
- if (IS_ERR(table))
- return PTR_ERR(table);
- ubuf->sg = table;
- }
+ table = udmabuf_get_sg_table(NULL, buf, 0);
+ if (IS_ERR(table))
+ return PTR_ERR(table);
for_each_sgtable_page(table, &piter, vma->vm_pgoff) {
struct page *page = sg_page_iter_page(&piter);
@@ -84,12 +82,9 @@ static int vmap_udmabuf(struct dma_buf *buf, struct iosys_map *map)
dma_resv_assert_held(buf->resv);
- if (!sg) {
- sg = get_sg_table(NULL, buf, 0);
- if (IS_ERR(sg))
- return PTR_ERR(sg);
- ubuf->sg = sg;
- }
+ sg = udmabuf_get_sg_table(NULL, buf, 0);
+ if (IS_ERR(sg))
+ return PTR_ERR(sg);
pages = kvmalloc_array(ubuf->pagecount, sizeof(*pages), GFP_KERNEL);
if (!pages)
@@ -154,6 +149,39 @@ static struct sg_table *get_sg_table(struct device *dev, struct dma_buf *buf,
return ERR_PTR(ret);
}
+static struct sg_table *udmabuf_get_sg_table(struct device *dev,
+ struct dma_buf *buf,
+ enum dma_data_direction direction)
+{
+ struct udmabuf *ubuf = buf->priv;
+ struct sg_table *sg = READ_ONCE(ubuf->sg);
+ int ret = 0;
+
+ if (sg)
+ return sg;
+
+ sg = get_sg_table(dev, buf, direction);
+ if (IS_ERR(sg))
+ return sg;
+
+ // Success update ubuf's sg, just return.
+ if (!cmpxchg(&ubuf->sg, NULL, sg))
+ return sg;
+
+ // use the new sg table.
+ sg_free_table(sg);
+ kfree(sg);
+ sg = READ_ONCE(ubuf->sg);
+
+ if (dev)
+ ret = dma_map_sgtable(dev, sg, direction, 0);
+
+ if (ret < 0)
+ return ERR_PTR(ret);
+
+ return sg;
+}
+
static void put_sg_table(struct device *dev, struct sg_table *sg,
enum dma_data_direction direction)
{
@@ -230,12 +258,10 @@ static int begin_cpu_udmabuf(struct dma_buf *buf,
return 0;
}
- sg = get_sg_table(dev, buf, direction);
+ sg = udmabuf_get_sg_table(dev, buf, direction);
if (IS_ERR(sg))
return PTR_ERR(sg);
- ubuf->sg = sg;
-
return 0;
}
--
2.45.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 5/5] udmabuf: remove folio pin list
2024-08-01 10:45 [PATCH 0/5] udmabuf bug fix and some improvements Huan Yang
` (3 preceding siblings ...)
2024-08-01 10:45 ` [PATCH 4/5] udmabuf: add get_sg_table helper function Huan Yang
@ 2024-08-01 10:45 ` Huan Yang
2024-08-01 18:32 ` [PATCH 0/5] udmabuf bug fix and some improvements Kasireddy, Vivek
5 siblings, 0 replies; 14+ messages in thread
From: Huan Yang @ 2024-08-01 10:45 UTC (permalink / raw)
To: Gerd Hoffmann, Sumit Semwal, Christian König, dri-devel,
linux-media, linaro-mm-sig, linux-kernel
Cc: opensource.kernel, Huan Yang
Currently, udmabuf handles folios by creating an unpin list that
records each folio obtained from the memfds and unpins them when the
buffer is released. To maintain this approach, several data structures
have been established.
However, maintaining these data structures requires a significant
amount of memory, and traversing the list is a substantial overhead
that is unfriendly to the CPU caches, TLB, and so on.
Therefore, this patch removes the stored relationship between each
folio and its offset in the linear address mapping.
As an alternative, udmabuf only tracks the folio structures themselves
and splits them into individual pages when needed, by traversing them
in the required locations (mmap/vmap, sg table).
So, udmabuf's folios array now saves only the folio structs, and
nr_folios is added to record how many folios are stored in it.
The per-page offsets array is removed, replaced by per-item offset and
size arrays: because a memfd in the create list may have an offset, the
correct starting page within the first folio must be recorded.
So, when building the sg_table, iteration must start at the correct
offset within each item, and then set each folio's pages into the
sgtable.
The item offset and size arrays have only as many entries as the
create list, so their memory footprint will not be too large.
By doing this, we avoid the overhead of the udmabuf_folio structure
and the performance loss of traversing the list during unpinning.
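A rough comparison of the bookkeeping memory before and after this
patch can be sketched as follows. The sizes assume a 64-bit kernel
(24-byte list node per pinned folio, 8-byte pointers and pgoff_t/size_t
entries) and the example numbers below (a 3GB buffer of 4K pages backed
by 2MB folios from one memfd item) are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Before: a head-page pointer and a pgoff_t per page, plus one
 * udmabuf_folio list node (folio ptr + list_head) per pinned folio. */
static size_t old_bookkeeping(size_t pages, size_t folios)
{
    return pages * 8        /* folios[]: head page ptr per page */
         + pages * 8        /* offsets[]: one pgoff_t per page  */
         + folios * 24;     /* unpin_list: one node per folio   */
}

/* After: one pointer per distinct folio, plus a per-item offset and
 * size entry. */
static size_t new_bookkeeping(size_t folios, size_t items)
{
    return folios * 8       /* folios[]: one pointer per folio  */
         + items * 16;      /* item_offset[] + item_size[]      */
}
```

For 786432 pages in 1536 folios from a single item, this drops the
bookkeeping from roughly 12MB to about 12KB, which is the motivation
stated above.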
Signed-off-by: Huan Yang <link@vivo.com>
---
drivers/dma-buf/udmabuf.c | 149 +++++++++++++++++---------------------
1 file changed, 66 insertions(+), 83 deletions(-)
diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
index 677ebb2d462f..1106e0b1e746 100644
--- a/drivers/dma-buf/udmabuf.c
+++ b/drivers/dma-buf/udmabuf.c
@@ -25,17 +25,19 @@ module_param(size_limit_mb, int, 0644);
MODULE_PARM_DESC(size_limit_mb, "Max size of a dmabuf, in megabytes. Default is 64.");
struct udmabuf {
+ // all page's count, pagecount * PAGE_SIZE is the udmabuf's size
pgoff_t pagecount;
+
+ // folios array only point to each folio, do not duplicate set.
struct folio **folios;
+ // folios array's number
+ pgoff_t nr_folios;
+
struct sg_table *sg;
struct miscdevice *device;
- pgoff_t *offsets;
- struct list_head unpin_list;
-};
-struct udmabuf_folio {
- struct folio *folio;
- struct list_head list;
+ pgoff_t *item_offset;
+ size_t *item_size;
};
static struct sg_table *udmabuf_get_sg_table(struct device *dev,
@@ -118,7 +120,10 @@ static struct sg_table *get_sg_table(struct device *dev, struct dma_buf *buf,
struct udmabuf *ubuf = buf->priv;
struct sg_table *sg;
struct scatterlist *sgl;
- unsigned int i = 0;
+ struct folio *folio = NULL;
+ size_t fsize, foffset;
+ unsigned int i = 0, item_idx = 0, findex = 0;
+ size_t cur_size, item_size;
int ret;
sg = kzalloc(sizeof(*sg), GFP_KERNEL);
@@ -129,9 +134,33 @@ static struct sg_table *get_sg_table(struct device *dev, struct dma_buf *buf,
if (ret < 0)
goto err_alloc;
- for_each_sg(sg->sgl, sgl, ubuf->pagecount, i)
- sg_set_folio(sgl, ubuf->folios[i], PAGE_SIZE,
- ubuf->offsets[i]);
+ cur_size = 0;
+ item_size = ubuf->item_size[0];
+ foffset = ubuf->item_offset[0];
+ folio = ubuf->folios[0];
+ fsize = folio_size(folio);
+
+ for_each_sg(sg->sgl, sgl, ubuf->pagecount, i) {
+ sg_set_folio(sgl, folio, PAGE_SIZE, foffset);
+ foffset += PAGE_SIZE;
+ cur_size += PAGE_SIZE;
+
+ // move to next folio.
+ if (foffset == fsize) {
+ ++findex;
+ folio = ubuf->folios[findex];
+ fsize = folio_size(folio);
+ foffset = 0;
+ }
+
+ // if reach to next item, must check the start offset.
+ if (cur_size == item_size) {
+ ++item_idx;
+ foffset = ubuf->item_offset[item_idx];
+ item_size = ubuf->item_size[item_idx];
+ cur_size = 0;
+ }
+ }
// if dev is NULL, no need to sync.
if (!dev)
@@ -203,34 +232,6 @@ static void unmap_udmabuf(struct dma_buf_attachment *at,
return put_sg_table(at->dev, sg, direction);
}
-static void unpin_all_folios(struct list_head *unpin_list)
-{
- struct udmabuf_folio *ubuf_folio;
-
- while (!list_empty(unpin_list)) {
- ubuf_folio = list_first_entry(unpin_list,
- struct udmabuf_folio, list);
- unpin_folio(ubuf_folio->folio);
-
- list_del(&ubuf_folio->list);
- kfree(ubuf_folio);
- }
-}
-
-static int add_to_unpin_list(struct list_head *unpin_list,
- struct folio *folio)
-{
- struct udmabuf_folio *ubuf_folio;
-
- ubuf_folio = kzalloc(sizeof(*ubuf_folio), GFP_KERNEL);
- if (!ubuf_folio)
- return -ENOMEM;
-
- ubuf_folio->folio = folio;
- list_add_tail(&ubuf_folio->list, unpin_list);
- return 0;
-}
-
static void release_udmabuf(struct dma_buf *buf)
{
struct udmabuf *ubuf = buf->priv;
@@ -239,8 +240,9 @@ static void release_udmabuf(struct dma_buf *buf)
if (ubuf->sg)
put_sg_table(dev, ubuf->sg, DMA_BIDIRECTIONAL);
- unpin_all_folios(&ubuf->unpin_list);
- kvfree(ubuf->offsets);
+ unpin_folios(ubuf->folios, ubuf->nr_folios);
+ kfree(ubuf->item_offset);
+ kfree(ubuf->item_size);
kvfree(ubuf->folios);
kfree(ubuf);
}
@@ -338,19 +340,18 @@ static long udmabuf_create(struct miscdevice *device,
struct udmabuf_create_list *head,
struct udmabuf_create_item *list)
{
- pgoff_t pgoff, pgcnt, pglimit, pgbuf = 0;
+ pgoff_t pgoff, pgcnt, pglimit;
long nr_folios, ret = -EINVAL;
struct file *memfd = NULL;
struct folio **folios;
struct udmabuf *ubuf;
- u32 i, j, k, flags;
+ u32 i, flags;
loff_t end;
ubuf = kzalloc(sizeof(*ubuf), GFP_KERNEL);
if (!ubuf)
return -ENOMEM;
- INIT_LIST_HEAD(&ubuf->unpin_list);
pglimit = (size_limit_mb * 1024 * 1024) >> PAGE_SHIFT;
for (i = 0; i < head->count; i++) {
if (!IS_ALIGNED(list[i].offset, PAGE_SIZE))
@@ -365,20 +366,27 @@ static long udmabuf_create(struct miscdevice *device,
if (!ubuf->pagecount)
goto err;
- ubuf->folios = kvmalloc_array(ubuf->pagecount, sizeof(*ubuf->folios),
- GFP_KERNEL);
- if (!ubuf->folios) {
+ ubuf->item_size =
+ kmalloc_array(head->count, sizeof(size_t), GFP_KERNEL);
+ if (!ubuf->item_size)
+ return -ENOMEM;
+
+ ubuf->item_offset =
+ kmalloc_array(head->count, sizeof(pgoff_t), GFP_KERNEL);
+ if (!ubuf->item_offset) {
ret = -ENOMEM;
goto err;
}
- ubuf->offsets =
- kvcalloc(ubuf->pagecount, sizeof(*ubuf->offsets), GFP_KERNEL);
- if (!ubuf->offsets) {
+
+ ubuf->folios = kvmalloc_array(ubuf->pagecount, sizeof(*ubuf->folios),
+ GFP_KERNEL);
+ if (!ubuf->folios) {
ret = -ENOMEM;
goto err;
}
+ folios = ubuf->folios;
- pgbuf = 0;
+ nr_folios = 0;
for (i = 0; i < head->count; i++) {
memfd = fget(list[i].memfd);
ret = check_memfd_seals(memfd);
@@ -386,49 +394,24 @@ static long udmabuf_create(struct miscdevice *device,
goto err;
pgcnt = list[i].size >> PAGE_SHIFT;
- folios = kvmalloc_array(pgcnt, sizeof(*folios), GFP_KERNEL);
- if (!folios) {
- ret = -ENOMEM;
- goto err;
- }
end = list[i].offset + (pgcnt << PAGE_SHIFT) - 1;
ret = memfd_pin_folios(memfd, list[i].offset, end,
folios, pgcnt, &pgoff);
if (ret <= 0) {
- kvfree(folios);
- if (!ret)
- ret = -EINVAL;
+ ret = ret ?: -EINVAL;
goto err;
}
+ ubuf->item_size[i] = list[i].size;
+ ubuf->item_offset[i] = pgoff;
- nr_folios = ret;
- pgoff >>= PAGE_SHIFT;
- for (j = 0, k = 0; j < pgcnt; j++) {
- ubuf->folios[pgbuf] = folios[k];
- ubuf->offsets[pgbuf] = pgoff << PAGE_SHIFT;
-
- if (j == 0 || ubuf->folios[pgbuf-1] != folios[k]) {
- ret = add_to_unpin_list(&ubuf->unpin_list,
- folios[k]);
- if (ret < 0) {
- kfree(folios);
- goto err;
- }
- }
-
- pgbuf++;
- if (++pgoff == folio_nr_pages(folios[k])) {
- pgoff = 0;
- if (++k == nr_folios)
- break;
- }
- }
+ nr_folios += ret;
+ folios += ret;
- kvfree(folios);
fput(memfd);
memfd = NULL;
}
+ ubuf->nr_folios = nr_folios;
flags = head->flags & UDMABUF_FLAGS_CLOEXEC ? O_CLOEXEC : 0;
ret = export_udmabuf(ubuf, device, flags);
@@ -440,8 +423,8 @@ static long udmabuf_create(struct miscdevice *device,
err:
if (memfd)
fput(memfd);
- unpin_all_folios(&ubuf->unpin_list);
- kvfree(ubuf->offsets);
+ kfree(ubuf->item_size);
+ kfree(ubuf->item_offset);
kvfree(ubuf->folios);
kfree(ubuf);
return ret;
--
2.45.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 1/5] udmabuf: cancel mmap page fault, direct map it
2024-08-01 10:45 ` [PATCH 1/5] udmabuf: cancel mmap page fault, direct map it Huan Yang
@ 2024-08-01 10:50 ` Christian König
2024-08-01 11:08 ` Huan Yang
0 siblings, 1 reply; 14+ messages in thread
From: Christian König @ 2024-08-01 10:50 UTC (permalink / raw)
To: Huan Yang, Gerd Hoffmann, Sumit Semwal, dri-devel, linux-media,
linaro-mm-sig, linux-kernel
Cc: opensource.kernel
Am 01.08.24 um 12:45 schrieb Huan Yang:
> The current udmabuf mmap uses a page fault mechanism to populate the vma.
>
> However, the current udmabuf has already obtained and pinned the folios
> upon completion of the creation. This means that the physical memory has
> already been acquired, rather than being accessed dynamically; the
> page-fault method only saves some page-table memory.
>
> As a result, the page-fault mechanism has lost its purpose as demand
> paging. Because a page fault requires a trap into kernel mode when a
> mapped virtual address is first accessed, every such user-mode access
> incurs a kernel-mode transition.
>
> Therefore, when creating a large udmabuf, this represents a
> considerable overhead.
>
> Therefore, this patch removes the page-fault method of mmap and instead
> populates the mapping directly when mmap is called.
>
> Signed-off-by: Huan Yang <link@vivo.com>
> ---
> drivers/dma-buf/udmabuf.c | 70 ++++++++++++++++++++++-----------------
> 1 file changed, 39 insertions(+), 31 deletions(-)
>
> diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
> index 047c3cd2ceff..d69aeada7367 100644
> --- a/drivers/dma-buf/udmabuf.c
> +++ b/drivers/dma-buf/udmabuf.c
> @@ -38,36 +38,39 @@ struct udmabuf_folio {
> struct list_head list;
> };
>
> -static vm_fault_t udmabuf_vm_fault(struct vm_fault *vmf)
> -{
> - struct vm_area_struct *vma = vmf->vma;
> - struct udmabuf *ubuf = vma->vm_private_data;
> - pgoff_t pgoff = vmf->pgoff;
> - unsigned long pfn;
> -
> - if (pgoff >= ubuf->pagecount)
> - return VM_FAULT_SIGBUS;
> -
> - pfn = folio_pfn(ubuf->folios[pgoff]);
> - pfn += ubuf->offsets[pgoff] >> PAGE_SHIFT;
> -
> - return vmf_insert_pfn(vma, vmf->address, pfn);
> -}
> -
> -static const struct vm_operations_struct udmabuf_vm_ops = {
> - .fault = udmabuf_vm_fault,
> -};
> +static struct sg_table *get_sg_table(struct device *dev, struct dma_buf *buf,
> + enum dma_data_direction direction);
>
> static int mmap_udmabuf(struct dma_buf *buf, struct vm_area_struct *vma)
> {
> struct udmabuf *ubuf = buf->priv;
> + struct sg_table *table = ubuf->sg;
> + unsigned long addr = vma->vm_start;
> + struct sg_page_iter piter;
> + int ret;
>
> if ((vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) == 0)
> return -EINVAL;
>
> - vma->vm_ops = &udmabuf_vm_ops;
> - vma->vm_private_data = ubuf;
> - vm_flags_set(vma, VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP);
> + if (!table) {
> + table = get_sg_table(NULL, buf, 0);
> + if (IS_ERR(table))
> + return PTR_ERR(table);
> + ubuf->sg = table;
> + }
> +
> + for_each_sgtable_page(table, &piter, vma->vm_pgoff) {
That might not work correctly. We intentionally remove the pages from
the sgtable when it is shared between devices.
In addition to that, the sgtable is *not* a page container, but rather a
DMA address container. So this is also a rather bad idea from the
design side.
Regards,
Christian.
> + struct page *page = sg_page_iter_page(&piter);
> +
> + ret = remap_pfn_range(vma, addr, page_to_pfn(page), PAGE_SIZE,
> + vma->vm_page_prot);
> + if (ret)
> + return ret;
> + addr += PAGE_SIZE;
> + if (addr >= vma->vm_end)
> + return 0;
> + }
> +
> return 0;
> }
>
> @@ -126,6 +129,10 @@ static struct sg_table *get_sg_table(struct device *dev, struct dma_buf *buf,
> sg_set_folio(sgl, ubuf->folios[i], PAGE_SIZE,
> ubuf->offsets[i]);
>
> + // if dev is NULL, no need to sync.
> + if (!dev)
> + return sg;
> +
> ret = dma_map_sgtable(dev, sg, direction, 0);
> if (ret < 0)
> goto err_map;
> @@ -206,20 +213,21 @@ static int begin_cpu_udmabuf(struct dma_buf *buf,
> {
> struct udmabuf *ubuf = buf->priv;
> struct device *dev = ubuf->device->this_device;
> - int ret = 0;
> + struct sg_table *sg;
>
> - if (!ubuf->sg) {
> - ubuf->sg = get_sg_table(dev, buf, direction);
> - if (IS_ERR(ubuf->sg)) {
> - ret = PTR_ERR(ubuf->sg);
> - ubuf->sg = NULL;
> - }
> - } else {
> + if (ubuf->sg) {
> dma_sync_sg_for_cpu(dev, ubuf->sg->sgl, ubuf->sg->nents,
> direction);
> + return 0;
> }
>
> - return ret;
> + sg = get_sg_table(dev, buf, direction);
> + if (IS_ERR(sg))
> + return PTR_ERR(sg);
> +
> + ubuf->sg = sg;
> +
> + return 0;
> }
>
> static int end_cpu_udmabuf(struct dma_buf *buf,
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/5] udmabuf: change folios array from kmalloc to kvmalloc
2024-08-01 10:45 ` [PATCH 2/5] udmabuf: change folios array from kmalloc to kvmalloc Huan Yang
@ 2024-08-01 10:54 ` Christian König
0 siblings, 0 replies; 14+ messages in thread
From: Christian König @ 2024-08-01 10:54 UTC (permalink / raw)
To: Huan Yang, Gerd Hoffmann, Sumit Semwal, dri-devel, linux-media,
linaro-mm-sig, linux-kernel
Cc: opensource.kernel
On 01.08.24 12:45, Huan Yang wrote:
> With PAGE_SIZE 4096 and MAX_PAGE_ORDER 10 on a 64-bit machine,
> page_alloc supports at most a 4MB allocation.
> Anything above this triggers the warning below and returns NULL.
>
> udmabuf's size limit is configurable; if it is raised to 3072 (3GB) and
> a 3GB udmabuf is then allocated, creation fails:
>
> [ 4080.876581] ------------[ cut here ]------------
> [ 4080.876843] WARNING: CPU: 3 PID: 2015 at mm/page_alloc.c:4556 __alloc_pages+0x2c8/0x350
> [ 4080.878839] RIP: 0010:__alloc_pages+0x2c8/0x350
> [ 4080.879470] Call Trace:
> [ 4080.879473] <TASK>
> [ 4080.879473] ? __alloc_pages+0x2c8/0x350
> [ 4080.879475] ? __warn.cold+0x8e/0xe8
> [ 4080.880647] ? __alloc_pages+0x2c8/0x350
> [ 4080.880909] ? report_bug+0xff/0x140
> [ 4080.881175] ? handle_bug+0x3c/0x80
> [ 4080.881556] ? exc_invalid_op+0x17/0x70
> [ 4080.881559] ? asm_exc_invalid_op+0x1a/0x20
> [ 4080.882077] ? udmabuf_create+0x131/0x400
>
> Because of MAX_PAGE_ORDER, kmalloc can allocate at most 4096 * (1 << 10),
> i.e. 4MB of memory. Each array entry is a pointer (8 bytes), so the
> array can describe at most 524288 pages (2GB of buffer).
>
> Furthermore, a costly-order allocation (order 3 and above) is not
> guaranteed to succeed, due to fragmentation.
>
> This patch changes the udmabuf arrays to use kvmalloc_array, which can
> fall back to vmalloc; this guarantees the allocation can succeed for any
> size and does not affect the performance of kmalloc allocations.
>
> Signed-off-by: Huan Yang <link@vivo.com>
Acked-by: Christian König <christian.koenig@amd.com>
> ---
> drivers/dma-buf/udmabuf.c | 26 +++++++++++++-------------
> 1 file changed, 13 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
> index d69aeada7367..a915714c5dce 100644
> --- a/drivers/dma-buf/udmabuf.c
> +++ b/drivers/dma-buf/udmabuf.c
> @@ -83,7 +83,7 @@ static int vmap_udmabuf(struct dma_buf *buf, struct iosys_map *map)
>
> dma_resv_assert_held(buf->resv);
>
> - pages = kmalloc_array(ubuf->pagecount, sizeof(*pages), GFP_KERNEL);
> + pages = kvmalloc_array(ubuf->pagecount, sizeof(*pages), GFP_KERNEL);
> if (!pages)
> return -ENOMEM;
>
> @@ -91,7 +91,7 @@ static int vmap_udmabuf(struct dma_buf *buf, struct iosys_map *map)
> pages[pg] = &ubuf->folios[pg]->page;
>
> vaddr = vm_map_ram(pages, ubuf->pagecount, -1);
> - kfree(pages);
> + kvfree(pages);
> if (!vaddr)
> return -EINVAL;
>
> @@ -203,8 +203,8 @@ static void release_udmabuf(struct dma_buf *buf)
> put_sg_table(dev, ubuf->sg, DMA_BIDIRECTIONAL);
>
> unpin_all_folios(&ubuf->unpin_list);
> - kfree(ubuf->offsets);
> - kfree(ubuf->folios);
> + kvfree(ubuf->offsets);
> + kvfree(ubuf->folios);
> kfree(ubuf);
> }
>
> @@ -330,14 +330,14 @@ static long udmabuf_create(struct miscdevice *device,
> if (!ubuf->pagecount)
> goto err;
>
> - ubuf->folios = kmalloc_array(ubuf->pagecount, sizeof(*ubuf->folios),
> - GFP_KERNEL);
> + ubuf->folios = kvmalloc_array(ubuf->pagecount, sizeof(*ubuf->folios),
> + GFP_KERNEL);
> if (!ubuf->folios) {
> ret = -ENOMEM;
> goto err;
> }
> - ubuf->offsets = kcalloc(ubuf->pagecount, sizeof(*ubuf->offsets),
> - GFP_KERNEL);
> + ubuf->offsets =
> + kvcalloc(ubuf->pagecount, sizeof(*ubuf->offsets), GFP_KERNEL);
> if (!ubuf->offsets) {
> ret = -ENOMEM;
> goto err;
> @@ -351,7 +351,7 @@ static long udmabuf_create(struct miscdevice *device,
> goto err;
>
> pgcnt = list[i].size >> PAGE_SHIFT;
> - folios = kmalloc_array(pgcnt, sizeof(*folios), GFP_KERNEL);
> + folios = kvmalloc_array(pgcnt, sizeof(*folios), GFP_KERNEL);
> if (!folios) {
> ret = -ENOMEM;
> goto err;
> @@ -361,7 +361,7 @@ static long udmabuf_create(struct miscdevice *device,
> ret = memfd_pin_folios(memfd, list[i].offset, end,
> folios, pgcnt, &pgoff);
> if (ret <= 0) {
> - kfree(folios);
> + kvfree(folios);
> if (!ret)
> ret = -EINVAL;
> goto err;
> @@ -390,7 +390,7 @@ static long udmabuf_create(struct miscdevice *device,
> }
> }
>
> - kfree(folios);
> + kvfree(folios);
> fput(memfd);
> memfd = NULL;
> }
> @@ -406,8 +406,8 @@ static long udmabuf_create(struct miscdevice *device,
> if (memfd)
> fput(memfd);
> unpin_all_folios(&ubuf->unpin_list);
> - kfree(ubuf->offsets);
> - kfree(ubuf->folios);
> + kvfree(ubuf->offsets);
> + kvfree(ubuf->folios);
> kfree(ubuf);
> return ret;
> }
* Re: [PATCH 3/5] udmabuf: fix vmap_udmabuf error page set
2024-08-01 10:45 ` [PATCH 3/5] udmabuf: fix vmap_udmabuf error page set Huan Yang
@ 2024-08-01 10:55 ` Christian König
0 siblings, 0 replies; 14+ messages in thread
From: Christian König @ 2024-08-01 10:55 UTC (permalink / raw)
To: Huan Yang, Gerd Hoffmann, Sumit Semwal, dri-devel, linux-media,
linaro-mm-sig, linux-kernel
Cc: opensource.kernel
On 01.08.24 12:45, Huan Yang wrote:
> Currently vmap_udmabuf fills the pages array from the folios array.
> But each entry of ubuf->folios only records the folio, so every pages
> entry resolves to the folio's head page.
>
> That means we repeatedly mapped the folio's head page into the vmalloc area.
>
> This patch fixes it by setting each folio's pages correctly, so that the
> pages array contains the right page for every index before it is mapped
> into the vmalloc area.
>
> Signed-off-by: Huan Yang <link@vivo.com>
> ---
> drivers/dma-buf/udmabuf.c | 17 +++++++++++++----
> 1 file changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
> index a915714c5dce..7ed532342d7f 100644
> --- a/drivers/dma-buf/udmabuf.c
> +++ b/drivers/dma-buf/udmabuf.c
> @@ -77,18 +77,27 @@ static int mmap_udmabuf(struct dma_buf *buf, struct vm_area_struct *vma)
> static int vmap_udmabuf(struct dma_buf *buf, struct iosys_map *map)
> {
> struct udmabuf *ubuf = buf->priv;
> - struct page **pages;
> + struct page **pages, **tmp;
> + struct sg_table *sg = ubuf->sg;
> + struct sg_page_iter piter;
> void *vaddr;
> - pgoff_t pg;
>
> dma_resv_assert_held(buf->resv);
>
> + if (!sg) {
> + sg = get_sg_table(NULL, buf, 0);
> + if (IS_ERR(sg))
> + return PTR_ERR(sg);
> + ubuf->sg = sg;
> + }
> +
> pages = kvmalloc_array(ubuf->pagecount, sizeof(*pages), GFP_KERNEL);
> if (!pages)
> return -ENOMEM;
> + tmp = pages;
>
> - for (pg = 0; pg < ubuf->pagecount; pg++)
> - pages[pg] = &ubuf->folios[pg]->page;
> + for_each_sgtable_page(sg, &piter, 0)
> + *tmp++ = sg_page_iter_page(&piter);
Again don't abuse the sg table for that!
Regards,
Christian.
>
> vaddr = vm_map_ram(pages, ubuf->pagecount, -1);
> kvfree(pages);
* Re: [PATCH 4/5] udmabuf: add get_sg_table helper function
2024-08-01 10:45 ` [PATCH 4/5] udmabuf: add get_sg_table helper function Huan Yang
@ 2024-08-01 10:56 ` Christian König
0 siblings, 0 replies; 14+ messages in thread
From: Christian König @ 2024-08-01 10:56 UTC (permalink / raw)
To: Huan Yang, Gerd Hoffmann, Sumit Semwal, dri-devel, linux-media,
linaro-mm-sig, linux-kernel
Cc: opensource.kernel
On 01.08.24 12:45, Huan Yang wrote:
> Currently, there are three duplicate pieces of code that retrieve the
> sg_table and update ubuf->sg.
>
> Since the sgt is used to populate the pages in both mmap and vmap, it is
> necessary to ensure that ubuf->sg is set correctly.
That is a really bad idea. Why are sg tables used to populate the page
tables?
Regards,
Christian.
>
> This patch adds a helper function: if ubuf->sg already exists, just
> return it. Otherwise, allocate a new sgt and install it with cmpxchg.
>
> When the exchange fails, another process has already set sg correctly,
> so we free our copy and reuse the existing one. If triggered by a
> device, the map still needs to be invoked to sync it.
>
> Signed-off-by: Huan Yang <link@vivo.com>
> ---
> drivers/dma-buf/udmabuf.c | 60 ++++++++++++++++++++++++++++-----------
> 1 file changed, 43 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
> index 7ed532342d7f..677ebb2d462f 100644
> --- a/drivers/dma-buf/udmabuf.c
> +++ b/drivers/dma-buf/udmabuf.c
> @@ -38,8 +38,9 @@ struct udmabuf_folio {
> struct list_head list;
> };
>
> -static struct sg_table *get_sg_table(struct device *dev, struct dma_buf *buf,
> - enum dma_data_direction direction);
> +static struct sg_table *udmabuf_get_sg_table(struct device *dev,
> + struct dma_buf *buf,
> + enum dma_data_direction direction);
>
> static int mmap_udmabuf(struct dma_buf *buf, struct vm_area_struct *vma)
> {
> @@ -52,12 +53,9 @@ static int mmap_udmabuf(struct dma_buf *buf, struct vm_area_struct *vma)
> if ((vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) == 0)
> return -EINVAL;
>
> - if (!table) {
> - table = get_sg_table(NULL, buf, 0);
> - if (IS_ERR(table))
> - return PTR_ERR(table);
> - ubuf->sg = table;
> - }
> + table = udmabuf_get_sg_table(NULL, buf, 0);
> + if (IS_ERR(table))
> + return PTR_ERR(table);
>
> for_each_sgtable_page(table, &piter, vma->vm_pgoff) {
> struct page *page = sg_page_iter_page(&piter);
> @@ -84,12 +82,9 @@ static int vmap_udmabuf(struct dma_buf *buf, struct iosys_map *map)
>
> dma_resv_assert_held(buf->resv);
>
> - if (!sg) {
> - sg = get_sg_table(NULL, buf, 0);
> - if (IS_ERR(sg))
> - return PTR_ERR(sg);
> - ubuf->sg = sg;
> - }
> + sg = udmabuf_get_sg_table(NULL, buf, 0);
> + if (IS_ERR(sg))
> + return PTR_ERR(sg);
>
> pages = kvmalloc_array(ubuf->pagecount, sizeof(*pages), GFP_KERNEL);
> if (!pages)
> @@ -154,6 +149,39 @@ static struct sg_table *get_sg_table(struct device *dev, struct dma_buf *buf,
> return ERR_PTR(ret);
> }
>
> +static struct sg_table *udmabuf_get_sg_table(struct device *dev,
> + struct dma_buf *buf,
> + enum dma_data_direction direction)
> +{
> + struct udmabuf *ubuf = buf->priv;
> + struct sg_table *sg = READ_ONCE(ubuf->sg);
> + int ret = 0;
> +
> + if (sg)
> + return sg;
> +
> + sg = get_sg_table(dev, buf, direction);
> + if (IS_ERR(sg))
> + return sg;
> +
> + // Success update ubuf's sg, just return.
> + if (!cmpxchg(&ubuf->sg, NULL, sg))
> + return sg;
> +
> + // use the new sg table.
> + sg_free_table(sg);
> + kfree(sg);
> + sg = READ_ONCE(ubuf->sg);
> +
> + if (dev)
> + ret = dma_map_sgtable(dev, sg, direction, 0);
> +
> + if (ret < 0)
> + return ERR_PTR(ret);
> +
> + return sg;
> +}
> +
> static void put_sg_table(struct device *dev, struct sg_table *sg,
> enum dma_data_direction direction)
> {
> @@ -230,12 +258,10 @@ static int begin_cpu_udmabuf(struct dma_buf *buf,
> return 0;
> }
>
> - sg = get_sg_table(dev, buf, direction);
> + sg = udmabuf_get_sg_table(dev, buf, direction);
> if (IS_ERR(sg))
> return PTR_ERR(sg);
>
> - ubuf->sg = sg;
> -
> return 0;
> }
>
* Re: [PATCH 1/5] udmabuf: cancel mmap page fault, direct map it
2024-08-01 10:50 ` Christian König
@ 2024-08-01 11:08 ` Huan Yang
0 siblings, 0 replies; 14+ messages in thread
From: Huan Yang @ 2024-08-01 11:08 UTC (permalink / raw)
To: Christian König, Gerd Hoffmann, Sumit Semwal, dri-devel,
linux-media, linaro-mm-sig, linux-kernel
Cc: opensource.kernel
On 2024/8/1 18:50, Christian König wrote:
> On 01.08.24 12:45, Huan Yang wrote:
>> The current udmabuf mmap uses a page fault mechanism to populate the
>> vma.
>>
>> However, the current udmabuf has already obtained and pinned the folios
>> upon completion of creation. This means that the physical memory has
>> already been acquired, rather than being accessed dynamically. The
>> current page fault method only saves some page table memory.
>>
>> As a result, the page fault mechanism has lost its purpose as demand
>> paging. Because a page fault requires trapping into kernel mode to fill
>> in the entry when the corresponding virtual address in the mmap is
>> accessed, user mode access to these virtual addresses must trap into
>> kernel mode.
>>
>> Therefore, when creating a large udmabuf, this represents a
>> considerable overhead.
>>
>> Therefore, the current patch removes the page fault method of mmap and
>> instead fills it directly when mmap is triggered.
>>
>> Signed-off-by: Huan Yang <link@vivo.com>
>> ---
>> drivers/dma-buf/udmabuf.c | 70 ++++++++++++++++++++++-----------------
>> 1 file changed, 39 insertions(+), 31 deletions(-)
>>
>> diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
>> index 047c3cd2ceff..d69aeada7367 100644
>> --- a/drivers/dma-buf/udmabuf.c
>> +++ b/drivers/dma-buf/udmabuf.c
>> @@ -38,36 +38,39 @@ struct udmabuf_folio {
>> struct list_head list;
>> };
>> -static vm_fault_t udmabuf_vm_fault(struct vm_fault *vmf)
>> -{
>> - struct vm_area_struct *vma = vmf->vma;
>> - struct udmabuf *ubuf = vma->vm_private_data;
>> - pgoff_t pgoff = vmf->pgoff;
>> - unsigned long pfn;
>> -
>> - if (pgoff >= ubuf->pagecount)
>> - return VM_FAULT_SIGBUS;
>> -
>> - pfn = folio_pfn(ubuf->folios[pgoff]);
>> - pfn += ubuf->offsets[pgoff] >> PAGE_SHIFT;
>> -
>> - return vmf_insert_pfn(vma, vmf->address, pfn);
>> -}
>> -
>> -static const struct vm_operations_struct udmabuf_vm_ops = {
>> - .fault = udmabuf_vm_fault,
>> -};
>> +static struct sg_table *get_sg_table(struct device *dev, struct
>> dma_buf *buf,
>> + enum dma_data_direction direction);
>> static int mmap_udmabuf(struct dma_buf *buf, struct
>> vm_area_struct *vma)
>> {
>> struct udmabuf *ubuf = buf->priv;
>> + struct sg_table *table = ubuf->sg;
>> + unsigned long addr = vma->vm_start;
>> + struct sg_page_iter piter;
>> + int ret;
>> if ((vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) == 0)
>> return -EINVAL;
>> - vma->vm_ops = &udmabuf_vm_ops;
>> - vma->vm_private_data = ubuf;
>> - vm_flags_set(vma, VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP);
>> + if (!table) {
>> + table = get_sg_table(NULL, buf, 0);
>> + if (IS_ERR(table))
>> + return PTR_ERR(table);
>> + ubuf->sg = table;
>> + }
>> +
>> + for_each_sgtable_page(table, &piter, vma->vm_pgoff) {
>
> That might not work correctly. We intentionally remove the pages from
> the sgtable when it is shared between devices.
>
> Additional to that the sgtable is *not* a page container, but rather a
> DMA address container. So that here is also a rather bad idea from the
> design side.
Sorry for that, and for the same approach in patches 1, 3 and 4. I was
not aware of this before.
The idea of doing this in mmap/vmap came from system_heap and the other
heaps I learned from.
Good to learn it.
BTW, since the sg table is the wrong place to maintain pages, maybe we
need to set up both a pages array (order-0 pages) and a folios array
(only the folios, used for unpinning)?
Otherwise, mapping pages to vm_pgoff and the vma solely through the
folios array is quite challenging.
Even so, the memory overhead would still be smaller than the current
unpin list.
Thanks for the correction. :)
>
> Regards,
> Christian.
>
>> + struct page *page = sg_page_iter_page(&piter);
>> +
>> + ret = remap_pfn_range(vma, addr, page_to_pfn(page), PAGE_SIZE,
>> + vma->vm_page_prot);
>> + if (ret)
>> + return ret;
>> + addr += PAGE_SIZE;
>> + if (addr >= vma->vm_end)
>> + return 0;
>> + }
>> +
>> return 0;
>> }
>> @@ -126,6 +129,10 @@ static struct sg_table *get_sg_table(struct
>> device *dev, struct dma_buf *buf,
>> sg_set_folio(sgl, ubuf->folios[i], PAGE_SIZE,
>> ubuf->offsets[i]);
>> + // if dev is NULL, no need to sync.
>> + if (!dev)
>> + return sg;
>> +
>> ret = dma_map_sgtable(dev, sg, direction, 0);
>> if (ret < 0)
>> goto err_map;
>> @@ -206,20 +213,21 @@ static int begin_cpu_udmabuf(struct dma_buf *buf,
>> {
>> struct udmabuf *ubuf = buf->priv;
>> struct device *dev = ubuf->device->this_device;
>> - int ret = 0;
>> + struct sg_table *sg;
>> - if (!ubuf->sg) {
>> - ubuf->sg = get_sg_table(dev, buf, direction);
>> - if (IS_ERR(ubuf->sg)) {
>> - ret = PTR_ERR(ubuf->sg);
>> - ubuf->sg = NULL;
>> - }
>> - } else {
>> + if (ubuf->sg) {
>> dma_sync_sg_for_cpu(dev, ubuf->sg->sgl, ubuf->sg->nents,
>> direction);
>> + return 0;
>> }
>> - return ret;
>> + sg = get_sg_table(dev, buf, direction);
>> + if (IS_ERR(sg))
>> + return PTR_ERR(sg);
>> +
>> + ubuf->sg = sg;
>> +
>> + return 0;
>> }
>> static int end_cpu_udmabuf(struct dma_buf *buf,
>
* RE: [PATCH 0/5] udmbuf bug fix and some improvements
2024-08-01 10:45 [PATCH 0/5] udmbuf bug fix and some improvements Huan Yang
` (4 preceding siblings ...)
2024-08-01 10:45 ` [PATCH 5/5] udmabuf: remove folio pin list Huan Yang
@ 2024-08-01 18:32 ` Kasireddy, Vivek
2024-08-02 1:16 ` Huan Yang
2024-08-02 15:26 ` Michel Dänzer
5 siblings, 2 replies; 14+ messages in thread
From: Kasireddy, Vivek @ 2024-08-01 18:32 UTC (permalink / raw)
To: Huan Yang, Gerd Hoffmann, Sumit Semwal, Christian König,
dri-devel@lists.freedesktop.org, linux-media@vger.kernel.org,
linaro-mm-sig@lists.linaro.org, linux-kernel@vger.kernel.org
Cc: opensource.kernel@vivo.com
Hi Huan,
> This patchset attempts to fix some errors in udmabuf and remove the
> unpin_list structure.
>
> Some of these fixes just gather patches which I uploaded before.
>
> Patch1
> ===
> Try to remove the page fault mmap and map it directly.
> The current udmabuf has already obtained and pinned the folios
> upon completion of creation. This means that the physical memory has
> already been acquired, rather than being accessed dynamically. The
> current page fault method only saves some page table memory.
>
> As a result, the page fault mechanism has lost its purpose as demand
> paging. Because a page fault requires trapping into kernel mode to fill
> in the entry when the corresponding virtual address in the mmap is
> accessed, user mode access to these virtual addresses must trap into
> kernel mode.
>
> Therefore, when creating a large udmabuf, this represents a
> considerable overhead.
Just want to mention that for the main use-case the udmabuf driver is designed for,
(sharing Qemu Guest FB with Host for GPU DMA), udmabufs are not created very
frequently. And, I think providing CPU access via mmap is just a backup, mainly
intended for debugging purposes.
>
> Therefore, the current patch removes the page fault method of mmap and
> instead fills it directly when mmap is triggered.
>
> This is achieved by using the scatter-gather table to establish a
> linear relationship for the page. Calling remap_pfn_range does not cause
> the previously set VMA flags to become invalid.
>
> Patch2
> ===
> This is the same to patch:
> https://lore.kernel.org/all/20240725021349.580574-1-link@vivo.com/
> I just gather it to this patchset.
>
> Patch3
> ===
> The current implementation of udmabuf's vmap has issues.
>
> It does not correctly store each page of the folio in the pages array,
> so when vmap is called, every entry is the head page of its folio.
>
> This implementation is not the same as this patch:
> https://lore.kernel.org/all/20240731090233.1343559-1-link@vivo.com/
>
> It instead reuses the sg table to map all pages into the vmalloc area.
>
> Patch4
> ===
> Wrap the repeated calls to get_sg_table in a helper function.
> udmabuf->sg is set with cmpxchg, which should prevent concurrent-access
> problems. (I see mmap does not take a lock.)
>
> Patch5
> ===
> Attempt to remove unpin_list and other related data structures.
>
> In order to adapt to folios, we established the unpin_list data structure
> to unpin all folios and maintain the page mapping relationship.
>
> However, this data structure requires 24 bytes for each page and has
> poor traversal performance as a list. Maintaining the offsets structure
> also consumes a portion of memory.
>
> This patch attempts to remove these data structures and modify the
> semantics of some existing data structures.
>
> udmabuf:
> folios -> folios array, now containing only the folios; the original
> contained duplicates.
> add item_offset -> per create item, records its start offset in each
> memfd.
> add item_size -> per create item, records its size in each memfd.
> add nr_folios -> number of entries in the folios array
I am not sure if these changes improve the readability. Instead, I think it makes
sense to add comments to the existing code.
>
> So, when building the sg table, it is necessary to iterate in this way:
> if the size crosses item->size, take care of its start offset in the
> folio; for each folio obtained, set each of its pages into the sgl until
> the folio size is reached.
>
> This patch also removes the per-item temporary folios allocation;
> instead, a pointer into the ubuf->folios array slides forward to fill in
> the folios belonging to each item.
>
> After the modification, the various data structures in udmabuf have the
> following corresponding relationships:
> pagecount * PAGESIZE = sum(folio_size(folios[i])), i = 0..nr_folios
> pagecount * PAGESIZE = sum(item_size[i]), i = 0..item_count (not
> recorded)
> item_offset records each memfd's offset if one exists, else 0.
>
> Huan Yang (5):
> udmabuf: cancel mmap page fault, direct map it
> udmabuf: change folios array from kmalloc to kvmalloc
> udmabuf: fix vmap_udmabuf error page set
Do you have a test-case to test this patch?
> udmabuf: add get_sg_table helper function
> udmabuf: remove folio pin list
Please run the newly added udmabuf selftests to make sure that these
patches are not causing any regressions. And, we also need to make sure that
the main use-cases (Qemu with memfd + shmem and Qemu with memfd + hugetlb)
are working as expected given the invasive changes.
I'll be able to test and provide more detailed feedback on all patches once I am back from
vacation late next week.
Thanks,
Vivek
>
> drivers/dma-buf/udmabuf.c | 270 +++++++++++++++++++++-----------------
> 1 file changed, 148 insertions(+), 122 deletions(-)
>
>
> base-commit: cd19ac2f903276b820f5d0d89de0c896c27036ed
> --
> 2.45.2
* Re: [PATCH 0/5] udmbuf bug fix and some improvements
2024-08-01 18:32 ` [PATCH 0/5] udmbuf bug fix and some improvements Kasireddy, Vivek
@ 2024-08-02 1:16 ` Huan Yang
2024-08-02 15:26 ` Michel Dänzer
1 sibling, 0 replies; 14+ messages in thread
From: Huan Yang @ 2024-08-02 1:16 UTC (permalink / raw)
To: Kasireddy, Vivek, Gerd Hoffmann, Sumit Semwal,
Christian König, dri-devel@lists.freedesktop.org,
linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org,
linux-kernel@vger.kernel.org
Cc: opensource.kernel@vivo.com
On 2024/8/2 2:32, Kasireddy, Vivek wrote:
> Hi Huan,
>
>> This patchset attempts to fix some errors in udmabuf and remove the
>> upin_list structure.
>>
>> Some of this fix just gather the patches which I upload before.
>>
>> Patch1
>> ===
>> Try to remove the page fault mmap and map it directly.
>> The current udmabuf has already obtained and pinned the folios
>> upon completion of creation. This means that the physical memory has
>> already been acquired, rather than being accessed dynamically. The
>> current page fault method only saves some page table memory.
>>
>> As a result, the page fault mechanism has lost its purpose as demand
>> paging. Because a page fault requires trapping into kernel mode to fill
>> in the entry when the corresponding virtual address in the mmap is
>> accessed, user mode access to these virtual addresses must trap into
>> kernel mode.
>>
>> Therefore, when creating a large udmabuf, this represents a
>> considerable overhead.
> Just want to mention that for the main use-case the udmabuf driver is designed for,
> (sharing Qemu Guest FB with Host for GPU DMA), udmabufs are not created very
> frequently. And, I think providing CPU access via mmap is just a backup, mainly
> intended for debugging purposes.
I'm very glad to know this.
However, I have recently been researching the use of asynchronous and
direct I/O (DIO) when loading large model files with dma-buf, which can
improve performance and reduce power consumption. You can see the
patchset:
https://lore.kernel.org/all/20240730075755.10941-1-link@vivo.com/
In the discussion, the maintainer suggested that we should base our work
on udmabuf. I tested udmabuf and found that using asynchronous and
direct I/O (DIO) to read files performs similarly to my patchset.
So I turned to studying udmabuf, and once I am familiar with it, I will
be able to encourage our partners to adopt it.
>
>> Therefore, the current patch removes the page fault method of mmap and
>> instead fills it directly when mmap is triggered.
>>
>> This is achieved by using the scatter-gather table to establish a
>> linear relationship for the page. Calling remap_pfn_range does not cause
>> the previously set VMA flags to become invalid.
>>
>> Patch2
>> ===
>> This is the same to patch:
>> https://lore.kernel.org/all/20240725021349.580574-1-link@vivo.com/
>> I just gather it to this patchset.
>>
>> Patch3
>> ===
>> The current implementation of udmabuf's vmap has issues.
>>
>> It does not correctly set each page of the folio to the page structure,
>> so that when vmap is called, all pages are the head page of the folio.
>>
>> This implementation is not the same as this patch:
>> https://lore.kernel.org/all/20240731090233.1343559-1-link@vivo.com/
>>
>> This reuse sgt table to map all page into vmalloc area.
>>
>> Patch4
>> ===
>> Wrap the repeated calls to get_sg_table, add a helper function to do it.
>> Set to udmabuf->sg use cmpxchg, It should be able to prevent concurrent
>> access situations. (I see mmap do not use lock)
>>
>> Patch5
>> ===
>> Attempt to remove unpin_list and other related data structures.
>>
>> In order to adapt to Folio, we established the unpin_list data structure
>> to unpin all folios and maintain the page mapping relationship.
>>
>> However, this data structure requires 24 bytes for each page and has low
>> traversal performance for the list. And maintaining the offset structure
>> also consumes a portion of memory.
>>
>> This patch attempts to remove these data structures and modify the
>> semantics of some existing data structures.
>>
>> udmabuf:
>> folios -> folios array, which only contain's the folio, org contains
>> duplicate.
>> add item_offset -> base on create item count, record it's start offset
>> in every memfd.
>> add item_size -> base on create item count, record it's size in every
>> memfd.
>> add nr_folios -> folios array number
> I am not sure if these changes improve the readability. Instead, I think it makes
> sense to add comments to the existing code.
This is not aimed at improving readability, but rather at saving memory
and improving performance, as each unpin_list node costs 24 bytes per
folio. At 24 bytes per folio, that adds up to a significant overhead.
I previously provided a patch that set up a kmem_cache to reduce the
memory waste:
https://lore.kernel.org/all/20240731062642.1164140-1-link@vivo.com/
(that patch forgot to unregister the cache on module exit)
But after recent study, I believe the unpin_list may not need to be
constructed at all; instead, operations can be based directly on the
folio array.
>
>> So, when building the sg table, it is necessary to iterate in this way:
>> if size cross item->size, take care of it's start offset in folio.
>> if got folio, set each page into sgl until reach into folio size.
>>
>> This patch also remove single folios' create on each create item, use it
>> be the ubuf->folios arrays' pointer, slide to fill the corresponding
>> folio under the item into the array.
>>
>> After the modification, the various data structures in udmabuf have the
>> following corresponding relationships:
>> pagecount * PAGESIZE = sum(folios_size(folios[i])) i=0->nr_folios
>> pagecount * PAGESIZE = sum(item_size[i]) i=0, item_count (do not
>> record)
>> item_offset use to record each memfd offset if exist, else 0.
>>
>> Huan Yang (5):
>> udmabuf: cancel mmap page fault, direct map it
>> udmabuf: change folios array from kmalloc to kvmalloc
>> udmabuf: fix vmap_udmabuf error page set
> Do you have a test-case to test this patch?
>
>> udmabuf: add get_sg_table helper function
>> udmabuf: remove folio pin list
> Please run the newly added udmabuf selftests to make sure that these
Yes, you're right; when I send the next version, I will include it.
Thank you for pointing this out.
Christian König reminded me not to build page associations based on the
sg table, which I had not considered.
Therefore, the overall logic of the patchset needs to be revised.
> patches are not causing any regressions. And, we also need to make sure that
> the main use-cases (Qemu with memfd + shmem and Qemu with memfd + hugetlb)
> are working as expected given the invasive changes.
>
> I'll be able to test and provide more detailed feedback on all patches once I am back from
> vacation late next week.
Wish you a pleasant holiday.
Thank you.
>
> Thanks,
> Vivek
>
>> drivers/dma-buf/udmabuf.c | 270 +++++++++++++++++++++-----------------
>> 1 file changed, 148 insertions(+), 122 deletions(-)
>>
>>
>> base-commit: cd19ac2f903276b820f5d0d89de0c896c27036ed
>> --
>> 2.45.2
* Re: [PATCH 0/5] udmbuf bug fix and some improvements
2024-08-01 18:32 ` [PATCH 0/5] udmbuf bug fix and some improvements Kasireddy, Vivek
2024-08-02 1:16 ` Huan Yang
@ 2024-08-02 15:26 ` Michel Dänzer
1 sibling, 0 replies; 14+ messages in thread
From: Michel Dänzer @ 2024-08-02 15:26 UTC (permalink / raw)
To: Kasireddy, Vivek, Huan Yang, Gerd Hoffmann, Sumit Semwal,
Christian König, dri-devel@lists.freedesktop.org,
linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org,
linux-kernel@vger.kernel.org
Cc: opensource.kernel@vivo.com
On 2024-08-01 20:32, Kasireddy, Vivek wrote:
> Hi Huan,
>
>> This patchset attempts to fix some errors in udmabuf and remove the
>> upin_list structure.
>>
>> Some of this fix just gather the patches which I upload before.
>>
>> Patch1
>> ===
>> Try to remove the page fault mmap and map it directly.
>> The current udmabuf has already obtained and pinned the folios
>> upon completion of creation. This means that the physical memory has
>> already been acquired, rather than being accessed dynamically. The
>> current page fault method only saves some page table memory.
>>
>> As a result, the page fault mechanism has lost its purpose as demand
>> paging. Because a page fault requires trapping into kernel mode to fill
>> in the entry when the corresponding virtual address in the mmap is
>> accessed, user mode access to these virtual addresses must trap into
>> kernel mode.
>>
>> Therefore, when creating a large udmabuf, this represents a
>> considerable overhead.
> Just want to mention that for the main use-case the udmabuf driver is designed for,
> (sharing Qemu Guest FB with Host for GPU DMA), udmabufs are not created very
> frequently. And, I think providing CPU access via mmap is just a backup, mainly
> intended for debugging purposes.
FYI, Mesa now uses udmabuf for supporting dma-bufs with software rendering.
--
Earthling Michel Dänzer | https://redhat.com
Libre software enthusiast | Mesa and Xwayland developer