linux-mm.kvack.org archive mirror
* [PATCH v2 0/5] vfio/type1: optimize vfio_pin_pages_remote() and vfio_unpin_pages_remote()
@ 2025-07-04  6:25 lizhe.67
  2025-07-04  6:25 ` [PATCH v2 1/5] mm: introduce num_pages_contiguous() lizhe.67
                   ` (4 more replies)
  0 siblings, 5 replies; 17+ messages in thread
From: lizhe.67 @ 2025-07-04  6:25 UTC (permalink / raw)
  To: alex.williamson, akpm, david, peterx, jgg
  Cc: kvm, linux-kernel, linux-mm, lizhe.67

From: Li Zhe <lizhe.67@bytedance.com>

This patchset is an integration of the two previous patchsets[1][2].

When vfio_pin_pages_remote() is called with a range of addresses that
includes large folios, the function currently performs the statistics
counting operations individually for each page. This can incur
significant performance overhead, especially when dealing with large
ranges of pages.

The function vfio_unpin_pages_remote() has a similar issue: calling
put_pfn() for each pfn incurs considerable overhead.

This patchset optimizes the performance of both functions by batching
the per-page operations mentioned above.

The first two patches optimize the performance of the function
vfio_pin_pages_remote(), while the remaining three optimize the
performance of the function vfio_unpin_pages_remote().
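
As a simplified sketch of the batching idea (illustrative pseudocode
only; the account_*() helpers are placeholders, while
num_pages_contiguous() is introduced in patch 1 and the real driver
code is in the diffs below):

	/* before: one statistics/accounting step per pinned page */
	for (i = 0; i < npage; i++)
		account_one_page(dma, iova + i * PAGE_SIZE);

	/* after: account each physically contiguous run at once */
	while (npage) {
		nr = num_pages_contiguous(&pages[offset], npage);
		account_pages(dma, iova, nr);	/* batched statistics */
		iova += nr * PAGE_SIZE;
		offset += nr;
		npage -= nr;
	}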

The performance test results for completing the 16G VFIO MAP/UNMAP
DMA, based on v6.16-rc4 and obtained with the unit test[3] (slightly
modified[4]), are as follows.

Base(v6.16-rc4):
./vfio-pci-mem-dma-map 0000:03:00.0 16
------- AVERAGE (MADV_HUGEPAGE) --------
VFIO MAP DMA in 0.047 s (340.2 GB/s)
VFIO UNMAP DMA in 0.135 s (118.6 GB/s)
------- AVERAGE (MAP_POPULATE) --------
VFIO MAP DMA in 0.280 s (57.2 GB/s)
VFIO UNMAP DMA in 0.312 s (51.3 GB/s)
------- AVERAGE (HUGETLBFS) --------
VFIO MAP DMA in 0.052 s (310.5 GB/s)
VFIO UNMAP DMA in 0.136 s (117.3 GB/s)

With this patchset:
------- AVERAGE (MADV_HUGEPAGE) --------
VFIO MAP DMA in 0.027 s (600.7 GB/s)
VFIO UNMAP DMA in 0.045 s (357.0 GB/s)
------- AVERAGE (MAP_POPULATE) --------
VFIO MAP DMA in 0.261 s (61.4 GB/s)
VFIO UNMAP DMA in 0.288 s (55.6 GB/s)
------- AVERAGE (HUGETLBFS) --------
VFIO MAP DMA in 0.031 s (516.4 GB/s)
VFIO UNMAP DMA in 0.045 s (353.9 GB/s)

For large folios, we achieve an over 40% performance improvement for
VFIO MAP DMA and an over 66% performance improvement for VFIO UNMAP
DMA. For small folios, the performance test results show a slight
improvement over the performance before optimization.
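
(For reference, the MADV_HUGEPAGE numbers above work out to
(0.047 - 0.027) / 0.047 ≈ 43% less MAP time and
(0.135 - 0.045) / 0.135 ≈ 67% less UNMAP time, consistent with these
claims.)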

[1]: https://lore.kernel.org/all/20250529064947.38433-1-lizhe.67@bytedance.com/
[2]: https://lore.kernel.org/all/20250620032344.13382-1-lizhe.67@bytedance.com/#t
[3]: https://github.com/awilliam/tests/blob/vfio-pci-mem-dma-map/vfio-pci-mem-dma-map.c
[4]: https://lore.kernel.org/all/20250610031013.98556-1-lizhe.67@bytedance.com/

Li Zhe (5):
  mm: introduce num_pages_contiguous()
  vfio/type1: optimize vfio_pin_pages_remote()
  vfio/type1: batch vfio_find_vpfn() in function
    vfio_unpin_pages_remote()
  vfio/type1: introduce a new member has_rsvd for struct vfio_dma
  vfio/type1: optimize vfio_unpin_pages_remote()

 drivers/vfio/vfio_iommu_type1.c | 111 ++++++++++++++++++++++++++------
 include/linux/mm.h              |  20 ++++++
 2 files changed, 110 insertions(+), 21 deletions(-)

---
Changelogs:

v1->v2:
- Update the performance test results.
- The function num_pages_contiguous() is extracted and placed in a
  separate commit.
- The phrase 'for large folio' has been removed from the patchset title.

v1: https://lore.kernel.org/all/20250630072518.31846-1-lizhe.67@bytedance.com/

-- 
2.20.1




* [PATCH v2 1/5] mm: introduce num_pages_contiguous()
  2025-07-04  6:25 [PATCH v2 0/5] vfio/type1: optimize vfio_pin_pages_remote() and vfio_unpin_pages_remote() lizhe.67
@ 2025-07-04  6:25 ` lizhe.67
  2025-07-04  7:56   ` David Hildenbrand
                     ` (2 more replies)
  2025-07-04  6:25 ` [PATCH v2 2/5] vfio/type1: optimize vfio_pin_pages_remote() lizhe.67
                   ` (3 subsequent siblings)
  4 siblings, 3 replies; 17+ messages in thread
From: lizhe.67 @ 2025-07-04  6:25 UTC (permalink / raw)
  To: alex.williamson, akpm, david, peterx, jgg
  Cc: kvm, linux-kernel, linux-mm, lizhe.67

From: Li Zhe <lizhe.67@bytedance.com>

The function num_pages_contiguous() determines the number of contiguous
pages starting from the first page in the given array of page pointers.
VFIO will use this interface to accelerate the VFIO DMA map process.

Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Li Zhe <lizhe.67@bytedance.com>
---
 include/linux/mm.h | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0ef2ba0c667a..1d26203d1ced 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -205,6 +205,26 @@ extern unsigned long sysctl_admin_reserve_kbytes;
 #define folio_page_idx(folio, p)	((p) - &(folio)->page)
 #endif
 
+/*
+ * num_pages_contiguous() - determine the number of contiguous pages
+ * starting from the first page.
+ *
+ * @pages: an array of page pointers
+ * @nr_pages: length of the array
+ */
+static inline unsigned long num_pages_contiguous(struct page **pages,
+						 unsigned long nr_pages)
+{
+	struct page *first_page = pages[0];
+	unsigned long i;
+
+	for (i = 1; i < nr_pages; i++)
+		if (pages[i] != nth_page(first_page, i))
+			break;
+
+	return i;
+}
+
 /* to align the pointer to the (next) page boundary */
 #define PAGE_ALIGN(addr) ALIGN(addr, PAGE_SIZE)
 
-- 
2.20.1
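
As a usage illustration (a hypothetical caller, not part of this
patch), the helper lets a loop over a GUP result advance one
physically contiguous run at a time:

	unsigned long i = 0;

	while (i < nr_pages) {
		unsigned long nr = num_pages_contiguous(&pages[i],
							nr_pages - i);

		/* pfns page_to_pfn(pages[i]) .. +nr-1 form one chunk */
		handle_contiguous_run(page_to_pfn(pages[i]), nr);
		i += nr;
	}

Here handle_contiguous_run() stands in for whatever per-run work the
caller does; for VFIO that is the batched accounting in patch 2.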




* [PATCH v2 2/5] vfio/type1: optimize vfio_pin_pages_remote()
  2025-07-04  6:25 [PATCH v2 0/5] vfio/type1: optimize vfio_pin_pages_remote() and vfio_unpin_pages_remote() lizhe.67
  2025-07-04  6:25 ` [PATCH v2 1/5] mm: introduce num_pages_contiguous() lizhe.67
@ 2025-07-04  6:25 ` lizhe.67
  2025-07-04  8:41   ` David Hildenbrand
  2025-07-04  6:26 ` [PATCH v2 3/5] vfio/type1: batch vfio_find_vpfn() in function vfio_unpin_pages_remote() lizhe.67
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 17+ messages in thread
From: lizhe.67 @ 2025-07-04  6:25 UTC (permalink / raw)
  To: alex.williamson, akpm, david, peterx, jgg
  Cc: kvm, linux-kernel, linux-mm, lizhe.67

From: Li Zhe <lizhe.67@bytedance.com>

When vfio_pin_pages_remote() is called with a range of addresses that
includes large folios, the function currently performs the statistics
counting operations individually for each page. This can incur
significant performance overhead, especially when dealing with large
ranges of pages. Batching these statistics counting operations
effectively improves performance.

In addition, the pages obtained through longterm GUP are neither invalid
nor reserved. Therefore, we can avoid the overhead of some calls to
is_invalid_reserved_pfn().

The performance test results for completing the 16G VFIO IOMMU DMA mapping
are as follows.

Base(v6.16-rc4):
------- AVERAGE (MADV_HUGEPAGE) --------
VFIO MAP DMA in 0.047 s (340.2 GB/s)
------- AVERAGE (MAP_POPULATE) --------
VFIO MAP DMA in 0.280 s (57.2 GB/s)
------- AVERAGE (HUGETLBFS) --------
VFIO MAP DMA in 0.052 s (310.5 GB/s)

With this patch:
------- AVERAGE (MADV_HUGEPAGE) --------
VFIO MAP DMA in 0.027 s (602.1 GB/s)
------- AVERAGE (MAP_POPULATE) --------
VFIO MAP DMA in 0.257 s (62.4 GB/s)
------- AVERAGE (HUGETLBFS) --------
VFIO MAP DMA in 0.031 s (517.4 GB/s)

For large folios, we achieve an over 40% performance improvement.
For small folios, the performance test results indicate a slight
improvement.

Signed-off-by: Li Zhe <lizhe.67@bytedance.com>
Co-developed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/vfio/vfio_iommu_type1.c | 83 ++++++++++++++++++++++++++++-----
 1 file changed, 71 insertions(+), 12 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 1136d7ac6b59..03fce54e1372 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -318,7 +318,13 @@ static void vfio_dma_bitmap_free_all(struct vfio_iommu *iommu)
 /*
  * Helper Functions for host iova-pfn list
  */
-static struct vfio_pfn *vfio_find_vpfn(struct vfio_dma *dma, dma_addr_t iova)
+
+/*
+ * Find the highest vfio_pfn that overlaps the range
+ * [iova_start, iova_end) in the rb tree.
+ */
+static struct vfio_pfn *vfio_find_vpfn_range(struct vfio_dma *dma,
+		dma_addr_t iova_start, dma_addr_t iova_end)
 {
 	struct vfio_pfn *vpfn;
 	struct rb_node *node = dma->pfn_list.rb_node;
@@ -326,9 +332,9 @@ static struct vfio_pfn *vfio_find_vpfn(struct vfio_dma *dma, dma_addr_t iova)
 	while (node) {
 		vpfn = rb_entry(node, struct vfio_pfn, node);
 
-		if (iova < vpfn->iova)
+		if (iova_end <= vpfn->iova)
 			node = node->rb_left;
-		else if (iova > vpfn->iova)
+		else if (iova_start > vpfn->iova)
 			node = node->rb_right;
 		else
 			return vpfn;
@@ -336,6 +342,11 @@ static struct vfio_pfn *vfio_find_vpfn(struct vfio_dma *dma, dma_addr_t iova)
 	return NULL;
 }
 
+static inline struct vfio_pfn *vfio_find_vpfn(struct vfio_dma *dma, dma_addr_t iova)
+{
+	return vfio_find_vpfn_range(dma, iova, iova + PAGE_SIZE);
+}
+
 static void vfio_link_pfn(struct vfio_dma *dma,
 			  struct vfio_pfn *new)
 {
@@ -614,6 +625,39 @@ static long vaddr_get_pfns(struct mm_struct *mm, unsigned long vaddr,
 	return ret;
 }
 
+
+static long vpfn_pages(struct vfio_dma *dma,
+		dma_addr_t iova_start, long nr_pages)
+{
+	dma_addr_t iova_end = iova_start + (nr_pages << PAGE_SHIFT);
+	struct vfio_pfn *top = vfio_find_vpfn_range(dma, iova_start, iova_end);
+	long ret = 1;
+	struct vfio_pfn *vpfn;
+	struct rb_node *prev;
+	struct rb_node *next;
+
+	if (likely(!top))
+		return 0;
+
+	prev = next = &top->node;
+
+	while ((prev = rb_prev(prev))) {
+		vpfn = rb_entry(prev, struct vfio_pfn, node);
+		if (vpfn->iova < iova_start)
+			break;
+		ret++;
+	}
+
+	while ((next = rb_next(next))) {
+		vpfn = rb_entry(next, struct vfio_pfn, node);
+		if (vpfn->iova >= iova_end)
+			break;
+		ret++;
+	}
+
+	return ret;
+}
+
 /*
  * Attempt to pin pages.  We really don't want to track all the pfns and
  * the iommu can only map chunks of consecutive pfns anyway, so get the
@@ -680,32 +724,47 @@ static long vfio_pin_pages_remote(struct vfio_dma *dma, unsigned long vaddr,
 		 * and rsvd here, and therefore continues to use the batch.
 		 */
 		while (true) {
+			long nr_pages, acct_pages = 0;
+
 			if (pfn != *pfn_base + pinned ||
 			    rsvd != is_invalid_reserved_pfn(pfn))
 				goto out;
 
+			/*
+			 * Using GUP with the FOLL_LONGTERM in
+			 * vaddr_get_pfns() will not return invalid
+			 * or reserved pages.
+			 */
+			nr_pages = num_pages_contiguous(
+					&batch->pages[batch->offset],
+					batch->size);
+			if (!rsvd) {
+				acct_pages = nr_pages;
+				acct_pages -= vpfn_pages(dma, iova, nr_pages);
+			}
+
 			/*
 			 * Reserved pages aren't counted against the user,
 			 * externally pinned pages are already counted against
 			 * the user.
 			 */
-			if (!rsvd && !vfio_find_vpfn(dma, iova)) {
+			if (acct_pages) {
 				if (!dma->lock_cap &&
-				    mm->locked_vm + lock_acct + 1 > limit) {
+						mm->locked_vm + lock_acct + acct_pages > limit) {
 					pr_warn("%s: RLIMIT_MEMLOCK (%ld) exceeded\n",
 						__func__, limit << PAGE_SHIFT);
 					ret = -ENOMEM;
 					goto unpin_out;
 				}
-				lock_acct++;
+				lock_acct += acct_pages;
 			}
 
-			pinned++;
-			npage--;
-			vaddr += PAGE_SIZE;
-			iova += PAGE_SIZE;
-			batch->offset++;
-			batch->size--;
+			pinned += nr_pages;
+			npage -= nr_pages;
+			vaddr += PAGE_SIZE * nr_pages;
+			iova += PAGE_SIZE * nr_pages;
+			batch->offset += nr_pages;
+			batch->size -= nr_pages;
 
 			if (!batch->size)
 				break;
-- 
2.20.1
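
To make the batched accounting concrete with a worked example: suppose
a contiguous run covers nr_pages = 512 pages starting at iova, and
three of those pages were pinned externally earlier, i.e. three vpfn
nodes fall inside [iova, iova + 512 * PAGE_SIZE). vfio_find_vpfn_range()
locates one overlapping node, vpfn_pages() walks rb_prev()/rb_next()
from it to count all three, and acct_pages = 512 - 3 = 509 pages are
checked against the RLIMIT_MEMLOCK limit and added to lock_acct in one
step, replacing 512 separate vfio_find_vpfn() lookups and increments.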




* [PATCH v2 3/5] vfio/type1: batch vfio_find_vpfn() in function vfio_unpin_pages_remote()
  2025-07-04  6:25 [PATCH v2 0/5] vfio/type1: optimize vfio_pin_pages_remote() and vfio_unpin_pages_remote() lizhe.67
  2025-07-04  6:25 ` [PATCH v2 1/5] mm: introduce num_pages_contiguous() lizhe.67
  2025-07-04  6:25 ` [PATCH v2 2/5] vfio/type1: optimize vfio_pin_pages_remote() lizhe.67
@ 2025-07-04  6:26 ` lizhe.67
  2025-07-04  6:26 ` [PATCH v2 4/5] vfio/type1: introduce a new member has_rsvd for struct vfio_dma lizhe.67
  2025-07-04  6:26 ` [PATCH v2 5/5] vfio/type1: optimize vfio_unpin_pages_remote() lizhe.67
  4 siblings, 0 replies; 17+ messages in thread
From: lizhe.67 @ 2025-07-04  6:26 UTC (permalink / raw)
  To: alex.williamson, akpm, david, peterx, jgg
  Cc: kvm, linux-kernel, linux-mm, lizhe.67

From: Li Zhe <lizhe.67@bytedance.com>

The function vpfn_pages() determines the number of vpfn nodes in the
vpfn rb tree within a specified range, which lets us avoid searching
for each vpfn individually. This patch uses it to batch the
vfio_find_vpfn() calls in vfio_unpin_pages_remote().

Signed-off-by: Li Zhe <lizhe.67@bytedance.com>
---
 drivers/vfio/vfio_iommu_type1.c | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 03fce54e1372..b770eb1c4e07 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -794,16 +794,12 @@ static long vfio_unpin_pages_remote(struct vfio_dma *dma, dma_addr_t iova,
 				    unsigned long pfn, unsigned long npage,
 				    bool do_accounting)
 {
-	long unlocked = 0, locked = 0;
+	long unlocked = 0, locked = vpfn_pages(dma, iova, npage);
 	long i;
 
-	for (i = 0; i < npage; i++, iova += PAGE_SIZE) {
-		if (put_pfn(pfn++, dma->prot)) {
+	for (i = 0; i < npage; i++)
+		if (put_pfn(pfn++, dma->prot))
 			unlocked++;
-			if (vfio_find_vpfn(dma, iova))
-				locked++;
-		}
-	}
 
 	if (do_accounting)
 		vfio_lock_acct(dma, locked - unlocked, true);
-- 
2.20.1
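
In terms of work saved: the old loop performed up to npage
vfio_find_vpfn() lookups, each an O(log n) descent of the pfn_list rb
tree, while the new code makes a single vpfn_pages() call, i.e. one
O(log n) descent plus one rb_prev()/rb_next() step per vpfn node that
actually overlaps the range (typically none).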




* [PATCH v2 4/5] vfio/type1: introduce a new member has_rsvd for struct vfio_dma
  2025-07-04  6:25 [PATCH v2 0/5] vfio/type1: optimize vfio_pin_pages_remote() and vfio_unpin_pages_remote() lizhe.67
                   ` (2 preceding siblings ...)
  2025-07-04  6:26 ` [PATCH v2 3/5] vfio/type1: batch vfio_find_vpfn() in function vfio_unpin_pages_remote() lizhe.67
@ 2025-07-04  6:26 ` lizhe.67
  2025-07-04  8:46   ` David Hildenbrand
  2025-07-04  6:26 ` [PATCH v2 5/5] vfio/type1: optimize vfio_unpin_pages_remote() lizhe.67
  4 siblings, 1 reply; 17+ messages in thread
From: lizhe.67 @ 2025-07-04  6:26 UTC (permalink / raw)
  To: alex.williamson, akpm, david, peterx, jgg
  Cc: kvm, linux-kernel, linux-mm, lizhe.67

From: Li Zhe <lizhe.67@bytedance.com>

Introduce a new member, has_rsvd, for struct vfio_dma. It indicates
whether there are any reserved or invalid pfns in the region
represented by this vfio_dma: if true, at least one pfn in the region
is either reserved or invalid.

Signed-off-by: Li Zhe <lizhe.67@bytedance.com>
---
 drivers/vfio/vfio_iommu_type1.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index b770eb1c4e07..13c5667d431c 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -92,6 +92,7 @@ struct vfio_dma {
 	bool			iommu_mapped;
 	bool			lock_cap;	/* capable(CAP_IPC_LOCK) */
 	bool			vaddr_invalid;
+	bool			has_rsvd;	/* has 1 or more rsvd pfns */
 	struct task_struct	*task;
 	struct rb_root		pfn_list;	/* Ex-user pinned pfn list */
 	unsigned long		*bitmap;
@@ -774,6 +775,7 @@ static long vfio_pin_pages_remote(struct vfio_dma *dma, unsigned long vaddr,
 	}
 
 out:
+	dma->has_rsvd |= rsvd;
 	ret = vfio_lock_acct(dma, lock_acct, false);
 
 unpin_out:
-- 
2.20.1




* [PATCH v2 5/5] vfio/type1: optimize vfio_unpin_pages_remote()
  2025-07-04  6:25 [PATCH v2 0/5] vfio/type1: optimize vfio_pin_pages_remote() and vfio_unpin_pages_remote() lizhe.67
                   ` (3 preceding siblings ...)
  2025-07-04  6:26 ` [PATCH v2 4/5] vfio/type1: introduce a new member has_rsvd for struct vfio_dma lizhe.67
@ 2025-07-04  6:26 ` lizhe.67
  2025-07-04  8:47   ` David Hildenbrand
  4 siblings, 1 reply; 17+ messages in thread
From: lizhe.67 @ 2025-07-04  6:26 UTC (permalink / raw)
  To: alex.williamson, akpm, david, peterx, jgg
  Cc: kvm, linux-kernel, linux-mm, lizhe.67

From: Li Zhe <lizhe.67@bytedance.com>

When vfio_unpin_pages_remote() is called with a range of addresses that
includes large folios, the function currently performs individual
put_pfn() operations for each page. This can lead to significant
performance overheads, especially when dealing with large ranges of pages.

It would be very rare for reserved and non-reserved PFNs to be mixed
within the same range. So this patch utilizes the has_rsvd variable
introduced in the previous patch to determine whether batch put_pfn()
operations can be performed. Moreover, compared to put_pfn(),
unpin_user_page_range_dirty_lock() is capable of handling large folio
scenarios more efficiently.

The performance test results for completing the 16G VFIO IOMMU DMA
unmapping are as follows.

Base(v6.16-rc4):
./vfio-pci-mem-dma-map 0000:03:00.0 16
------- AVERAGE (MADV_HUGEPAGE) --------
VFIO UNMAP DMA in 0.135 s (118.6 GB/s)
------- AVERAGE (MAP_POPULATE) --------
VFIO UNMAP DMA in 0.312 s (51.3 GB/s)
------- AVERAGE (HUGETLBFS) --------
VFIO UNMAP DMA in 0.136 s (117.3 GB/s)

With this patchset:
------- AVERAGE (MADV_HUGEPAGE) --------
VFIO UNMAP DMA in 0.045 s (357.0 GB/s)
------- AVERAGE (MAP_POPULATE) --------
VFIO UNMAP DMA in 0.288 s (55.6 GB/s)
------- AVERAGE (HUGETLBFS) --------
VFIO UNMAP DMA in 0.045 s (353.9 GB/s)

For large folios, we achieve an over 66% performance improvement for
VFIO UNMAP DMA. For small folios, the performance test results show a
slight improvement.

Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>
Signed-off-by: Li Zhe <lizhe.67@bytedance.com>
---
 drivers/vfio/vfio_iommu_type1.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 13c5667d431c..3971539b0d67 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -792,17 +792,29 @@ static long vfio_pin_pages_remote(struct vfio_dma *dma, unsigned long vaddr,
 	return pinned;
 }
 
+static inline void put_valid_unreserved_pfns(unsigned long start_pfn,
+		unsigned long npage, int prot)
+{
+	unpin_user_page_range_dirty_lock(pfn_to_page(start_pfn), npage,
+					 prot & IOMMU_WRITE);
+}
+
 static long vfio_unpin_pages_remote(struct vfio_dma *dma, dma_addr_t iova,
 				    unsigned long pfn, unsigned long npage,
 				    bool do_accounting)
 {
 	long unlocked = 0, locked = vpfn_pages(dma, iova, npage);
-	long i;
 
-	for (i = 0; i < npage; i++)
-		if (put_pfn(pfn++, dma->prot))
-			unlocked++;
+	if (dma->has_rsvd) {
+		long i;
 
+		for (i = 0; i < npage; i++)
+			if (put_pfn(pfn++, dma->prot))
+				unlocked++;
+	} else {
+		put_valid_unreserved_pfns(pfn, npage, dma->prot);
+		unlocked = npage;
+	}
 	if (do_accounting)
 		vfio_lock_acct(dma, locked - unlocked, true);
 
-- 
2.20.1
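
As a concrete example of the fast path above: with npage = 512 and,
say, three externally pinned vpfns in the range, locked = 3 and
unlocked = 512, so vfio_lock_acct(dma, 3 - 512, true) drops the task's
locked_vm accounting by 509 pages, the same result the per-page loop
would produce, but via a single unpin_user_page_range_dirty_lock()
call.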




* Re: [PATCH v2 1/5] mm: introduce num_pages_contiguous()
  2025-07-04  6:25 ` [PATCH v2 1/5] mm: introduce num_pages_contiguous() lizhe.67
@ 2025-07-04  7:56   ` David Hildenbrand
  2025-07-04  8:21     ` lizhe.67
  2025-07-04 17:10   ` Jason Gunthorpe
  2025-07-04 21:19   ` kernel test robot
  2 siblings, 1 reply; 17+ messages in thread
From: David Hildenbrand @ 2025-07-04  7:56 UTC (permalink / raw)
  To: lizhe.67, alex.williamson, akpm, peterx, jgg; +Cc: kvm, linux-kernel, linux-mm

On 04.07.25 08:25, lizhe.67@bytedance.com wrote:
> From: Li Zhe <lizhe.67@bytedance.com>
> 
> Function num_pages_contiguous() determine the number of contiguous
> pages starting from the first page in the given array of page pointers.
> VFIO will utilize this interface to accelerate the VFIO DMA map process.
> 
> Suggested-by: David Hildenbrand <david@redhat.com>

I think Jason suggested having this as a helper as well.

> Signed-off-by: Li Zhe <lizhe.67@bytedance.com>
> ---
>   include/linux/mm.h | 20 ++++++++++++++++++++
>   1 file changed, 20 insertions(+)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 0ef2ba0c667a..1d26203d1ced 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -205,6 +205,26 @@ extern unsigned long sysctl_admin_reserve_kbytes;
>   #define folio_page_idx(folio, p)	((p) - &(folio)->page)
>   #endif
>   
> +/*
> + * num_pages_contiguous() - determine the number of contiguous pages
> + * starting from the first page.


Maybe clarify here:

"Pages are contiguous if they represent contiguous PFNs. Depending on 
the memory model, this can mean that the addresses of the "struct page"s 
are not contiguous."


> + *
> + * @pages: an array of page pointers
> + * @nr_pages: length of the array
> + */
> +static inline unsigned long num_pages_contiguous(struct page **pages,
> +						 unsigned long nr_pages)
> +{
> +	struct page *first_page = pages[0];
> +	unsigned long i;
> +
> +	for (i = 1; i < nr_pages; i++)
> +		if (pages[i] != nth_page(first_page, i))

if (pages[i] != nth_page(pages[0], i))

Should be clear as well, so no need for the temporary "first_page" variable.

Apart from that LGTM.

-- 
Cheers,

David / dhildenb




* Re: [PATCH v2 1/5] mm: introduce num_pages_contiguous()
  2025-07-04  7:56   ` David Hildenbrand
@ 2025-07-04  8:21     ` lizhe.67
  0 siblings, 0 replies; 17+ messages in thread
From: lizhe.67 @ 2025-07-04  8:21 UTC (permalink / raw)
  To: david
  Cc: akpm, alex.williamson, jgg, kvm, linux-kernel, linux-mm, lizhe.67,
	peterx

On Fri, 4 Jul 2025 09:56:23 +0200, david@redhat.com wrote:

> On 04.07.25 08:25, lizhe.67@bytedance.com wrote:
> > From: Li Zhe <lizhe.67@bytedance.com>
> > 
> > Function num_pages_contiguous() determine the number of contiguous
> > pages starting from the first page in the given array of page pointers.
> > VFIO will utilize this interface to accelerate the VFIO DMA map process.
> > 
> > Suggested-by: David Hildenbrand <david@redhat.com>
> 
> I think Jason suggested having this as a helper as well.

Yes, thank you for the reminder. Jason needs to be added here.

Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>

> > Signed-off-by: Li Zhe <lizhe.67@bytedance.com>
> > ---
> >   include/linux/mm.h | 20 ++++++++++++++++++++
> >   1 file changed, 20 insertions(+)
> > 
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 0ef2ba0c667a..1d26203d1ced 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -205,6 +205,26 @@ extern unsigned long sysctl_admin_reserve_kbytes;
> >   #define folio_page_idx(folio, p)	((p) - &(folio)->page)
> >   #endif
> >   
> > +/*
> > + * num_pages_contiguous() - determine the number of contiguous pages
> > + * starting from the first page.
> 
> Maybe clarify here:
> 
> "Pages are contiguous if they represent contiguous PFNs. Depending on 
> the memory model, this can mean that the addresses of the "struct page"s 
> are not contiguous."

Thank you. I will include this clarification in the comment.

> > + *
> > + * @pages: an array of page pointers
> > + * @nr_pages: length of the array
> > + */
> > +static inline unsigned long num_pages_contiguous(struct page **pages,
> > +						 unsigned long nr_pages)
> > +{
> > +	struct page *first_page = pages[0];
> > +	unsigned long i;
> > +
> > +	for (i = 1; i < nr_pages; i++)
> > +		if (pages[i] != nth_page(first_page, i))
> 
> if (pages[i] != nth_page(pages[0], i))
> 
> Should be clear as well, so no need for the temporary "first_page" variable.

Thank you. Doing so makes the function appear much clearer.

> Apart from that LGTM.

Thank you for your review!

Thanks,
Zhe



* Re: [PATCH v2 2/5] vfio/type1: optimize vfio_pin_pages_remote()
  2025-07-04  6:25 ` [PATCH v2 2/5] vfio/type1: optimize vfio_pin_pages_remote() lizhe.67
@ 2025-07-04  8:41   ` David Hildenbrand
  0 siblings, 0 replies; 17+ messages in thread
From: David Hildenbrand @ 2025-07-04  8:41 UTC (permalink / raw)
  To: lizhe.67, alex.williamson, akpm, peterx, jgg; +Cc: kvm, linux-kernel, linux-mm


>    * Attempt to pin pages.  We really don't want to track all the pfns and
>    * the iommu can only map chunks of consecutive pfns anyway, so get the
> @@ -680,32 +724,47 @@ static long vfio_pin_pages_remote(struct vfio_dma *dma, unsigned long vaddr,
>   		 * and rsvd here, and therefore continues to use the batch.
>   		 */
>   		while (true) {
> +			long nr_pages, acct_pages = 0;
> +
>   			if (pfn != *pfn_base + pinned ||
>   			    rsvd != is_invalid_reserved_pfn(pfn))
>   				goto out;
>   
> +			/*
> +			 * Using GUP with the FOLL_LONGTERM in
> +			 * vaddr_get_pfns() will not return invalid
> +			 * or reserved pages.
> +			 */
> +			nr_pages = num_pages_contiguous(
> +					&batch->pages[batch->offset],
> +					batch->size);
> +			if (!rsvd) {
> +				acct_pages = nr_pages;
> +				acct_pages -= vpfn_pages(dma, iova, nr_pages);

I'm not familiar with the vpfn_pages() part, but the other stuff LGTM.

-- 
Cheers,

David / dhildenb




* Re: [PATCH v2 4/5] vfio/type1: introduce a new member has_rsvd for struct vfio_dma
  2025-07-04  6:26 ` [PATCH v2 4/5] vfio/type1: introduce a new member has_rsvd for struct vfio_dma lizhe.67
@ 2025-07-04  8:46   ` David Hildenbrand
  0 siblings, 0 replies; 17+ messages in thread
From: David Hildenbrand @ 2025-07-04  8:46 UTC (permalink / raw)
  To: lizhe.67, alex.williamson, akpm, peterx, jgg; +Cc: kvm, linux-kernel, linux-mm

On 04.07.25 08:26, lizhe.67@bytedance.com wrote:
> From: Li Zhe <lizhe.67@bytedance.com>
> 
> Introduce a new member has_rsvd for struct vfio_dma. This member is
> used to indicate whether there are any reserved or invalid pfns in
> the region represented by this vfio_dma. If it is true, it indicates
> that there is at least one pfn in this region that is either reserved
> or invalid.
> 
> Signed-off-by: Li Zhe <lizhe.67@bytedance.com>
> ---
>   drivers/vfio/vfio_iommu_type1.c | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index b770eb1c4e07..13c5667d431c 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -92,6 +92,7 @@ struct vfio_dma {
>   	bool			iommu_mapped;
>   	bool			lock_cap;	/* capable(CAP_IPC_LOCK) */
>   	bool			vaddr_invalid;
> +	bool			has_rsvd;	/* has 1 or more rsvd pfns */
>   	struct task_struct	*task;
>   	struct rb_root		pfn_list;	/* Ex-user pinned pfn list */
>   	unsigned long		*bitmap;
> @@ -774,6 +775,7 @@ static long vfio_pin_pages_remote(struct vfio_dma *dma, unsigned long vaddr,
>   	}
>   
>   out:
> +	dma->has_rsvd |= rsvd;
>   	ret = vfio_lock_acct(dma, lock_acct, false);
>   
>   unpin_out:

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb




* Re: [PATCH v2 5/5] vfio/type1: optimize vfio_unpin_pages_remote()
  2025-07-04  6:26 ` [PATCH v2 5/5] vfio/type1: optimize vfio_unpin_pages_remote() lizhe.67
@ 2025-07-04  8:47   ` David Hildenbrand
  2025-07-04 17:11     ` Jason Gunthorpe
  0 siblings, 1 reply; 17+ messages in thread
From: David Hildenbrand @ 2025-07-04  8:47 UTC (permalink / raw)
  To: lizhe.67, alex.williamson, akpm, peterx, jgg; +Cc: kvm, linux-kernel, linux-mm

On 04.07.25 08:26, lizhe.67@bytedance.com wrote:
> From: Li Zhe <lizhe.67@bytedance.com>
> 
> When vfio_unpin_pages_remote() is called with a range of addresses that
> includes large folios, the function currently performs individual
> put_pfn() operations for each page. This can lead to significant
> performance overheads, especially when dealing with large ranges of pages.
> 
> It would be very rare for reserved and non-reserved PFNs to be mixed
> within the same range. So this patch utilizes the has_rsvd variable
> introduced in the previous patch to determine whether batch put_pfn()
> operations can be performed. Moreover, compared to put_pfn(),
> unpin_user_page_range_dirty_lock() is capable of handling large folio
> scenarios more efficiently.
> 
> The performance test results for completing the 16G VFIO IOMMU DMA
> unmapping are as follows.
> 
> Base(v6.16-rc4):
> ./vfio-pci-mem-dma-map 0000:03:00.0 16
> ------- AVERAGE (MADV_HUGEPAGE) --------
> VFIO UNMAP DMA in 0.135 s (118.6 GB/s)
> ------- AVERAGE (MAP_POPULATE) --------
> VFIO UNMAP DMA in 0.312 s (51.3 GB/s)
> ------- AVERAGE (HUGETLBFS) --------
> VFIO UNMAP DMA in 0.136 s (117.3 GB/s)
> 
> With this patchset:
> ------- AVERAGE (MADV_HUGEPAGE) --------
> VFIO UNMAP DMA in 0.045 s (357.0 GB/s)
> ------- AVERAGE (MAP_POPULATE) --------
> VFIO UNMAP DMA in 0.288 s (55.6 GB/s)
> ------- AVERAGE (HUGETLBFS) --------
> VFIO UNMAP DMA in 0.045 s (353.9 GB/s)
> 
> For large folio, we achieve an over 66% performance improvement in
> the VFIO UNMAP DMA item. For small folios, the performance test
> results appear to show a slight improvement.
> 
> Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>
> Signed-off-by: Li Zhe <lizhe.67@bytedance.com>
> ---
>   drivers/vfio/vfio_iommu_type1.c | 20 ++++++++++++++++----
>   1 file changed, 16 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 13c5667d431c..3971539b0d67 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -792,17 +792,29 @@ static long vfio_pin_pages_remote(struct vfio_dma *dma, unsigned long vaddr,
>   	return pinned;
>   }
>   
> +static inline void put_valid_unreserved_pfns(unsigned long start_pfn,
> +		unsigned long npage, int prot)
> +{
> +	unpin_user_page_range_dirty_lock(pfn_to_page(start_pfn), npage,
> +					 prot & IOMMU_WRITE);
> +}
> +
>   static long vfio_unpin_pages_remote(struct vfio_dma *dma, dma_addr_t iova,
>   				    unsigned long pfn, unsigned long npage,
>   				    bool do_accounting)
>   {
>   	long unlocked = 0, locked = vpfn_pages(dma, iova, npage);
> -	long i;
>   
> -	for (i = 0; i < npage; i++)
> -		if (put_pfn(pfn++, dma->prot))
> -			unlocked++;
> +	if (dma->has_rsvd) {
> +		long i;

No need to move "long i" here, but also doesn't really matter.

>   
> +		for (i = 0; i < npage; i++)
> +			if (put_pfn(pfn++, dma->prot))
> +				unlocked++;
> +	} else {
> +		put_valid_unreserved_pfns(pfn, npage, dma->prot);
> +		unlocked = npage;
> +	}
>   	if (do_accounting)
>   		vfio_lock_acct(dma, locked - unlocked, true);
>   

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb




* Re: [PATCH v2 1/5] mm: introduce num_pages_contiguous()
  2025-07-04  6:25 ` [PATCH v2 1/5] mm: introduce num_pages_contiguous() lizhe.67
  2025-07-04  7:56   ` David Hildenbrand
@ 2025-07-04 17:10   ` Jason Gunthorpe
  2025-07-07  3:38     ` lizhe.67
  2025-07-04 21:19   ` kernel test robot
  2 siblings, 1 reply; 17+ messages in thread
From: Jason Gunthorpe @ 2025-07-04 17:10 UTC (permalink / raw)
  To: lizhe.67
  Cc: alex.williamson, akpm, david, peterx, kvm, linux-kernel, linux-mm

On Fri, Jul 04, 2025 at 02:25:58PM +0800, lizhe.67@bytedance.com wrote:
> From: Li Zhe <lizhe.67@bytedance.com>
> 
> Function num_pages_contiguous() determine the number of contiguous
> pages starting from the first page in the given array of page pointers.
> VFIO will utilize this interface to accelerate the VFIO DMA map process.
> 
> Suggested-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Li Zhe <lizhe.67@bytedance.com>
> ---
>  include/linux/mm.h | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 0ef2ba0c667a..1d26203d1ced 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -205,6 +205,26 @@ extern unsigned long sysctl_admin_reserve_kbytes;
>  #define folio_page_idx(folio, p)	((p) - &(folio)->page)
>  #endif
>  
> +/*
> + * num_pages_contiguous() - determine the number of contiguous pages
> + * starting from the first page.
> + *
> + * @pages: an array of page pointers
> + * @nr_pages: length of the array
> + */
> +static inline unsigned long num_pages_contiguous(struct page **pages,
> +						 unsigned long nr_pages)

Both longs should be size_t I think

> +{
> +	struct page *first_page = pages[0];
> +	unsigned long i;

size_t

> +
> +	for (i = 1; i < nr_pages; i++)
> +		if (pages[i] != nth_page(first_page, i))
> +			break;

It seems OK. So the reasoning here is this is faster on
CONFIG_SPARSEMEM_VMEMMAP/nonsparse and about the same on sparse mem?
(or we don't care?)

Jason



* Re: [PATCH v2 5/5] vfio/type1: optimize vfio_unpin_pages_remote()
  2025-07-04  8:47   ` David Hildenbrand
@ 2025-07-04 17:11     ` Jason Gunthorpe
  2025-07-07  3:44       ` lizhe.67
  0 siblings, 1 reply; 17+ messages in thread
From: Jason Gunthorpe @ 2025-07-04 17:11 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: lizhe.67, alex.williamson, akpm, peterx, kvm, linux-kernel,
	linux-mm

On Fri, Jul 04, 2025 at 10:47:00AM +0200, David Hildenbrand wrote:
> >   static long vfio_unpin_pages_remote(struct vfio_dma *dma, dma_addr_t iova,
> >   				    unsigned long pfn, unsigned long npage,
> >   				    bool do_accounting)
> >   {
> >   	long unlocked = 0, locked = vpfn_pages(dma, iova, npage);
> > -	long i;
> > -	for (i = 0; i < npage; i++)
> > -		if (put_pfn(pfn++, dma->prot))
> > -			unlocked++;
> > +	if (dma->has_rsvd) {
> > +		long i;
> 
> No need to move "long i" here, but also doesn't really matter.

It should also be unsigned long as npage is unsigned

Jason



* Re: [PATCH v2 1/5] mm: introduce num_pages_contiguous()
  2025-07-04  6:25 ` [PATCH v2 1/5] mm: introduce num_pages_contiguous() lizhe.67
  2025-07-04  7:56   ` David Hildenbrand
  2025-07-04 17:10   ` Jason Gunthorpe
@ 2025-07-04 21:19   ` kernel test robot
  2025-07-07  3:52     ` lizhe.67
  2 siblings, 1 reply; 17+ messages in thread
From: kernel test robot @ 2025-07-04 21:19 UTC (permalink / raw)
  To: lizhe.67, alex.williamson, akpm, david, peterx, jgg
  Cc: oe-kbuild-all, kvm, linux-kernel, linux-mm, lizhe.67

Hi,

kernel test robot noticed the following build errors:

[auto build test ERROR on awilliam-vfio/next]
[also build test ERROR on awilliam-vfio/for-linus akpm-mm/mm-everything linus/master v6.16-rc4 next-20250704]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/lizhe-67-bytedance-com/mm-introduce-num_pages_contiguous/20250704-142948
base:   https://github.com/awilliam/linux-vfio.git next
patch link:    https://lore.kernel.org/r/20250704062602.33500-2-lizhe.67%40bytedance.com
patch subject: [PATCH v2 1/5] mm: introduce num_pages_contiguous()
config: sh-allmodconfig (https://download.01.org/0day-ci/archive/20250705/202507050529.EoMuEtd8-lkp@intel.com/config)
compiler: sh4-linux-gcc (GCC) 15.1.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250705/202507050529.EoMuEtd8-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202507050529.EoMuEtd8-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from arch/sh/include/asm/page.h:160,
                    from arch/sh/include/asm/thread_info.h:13,
                    from include/linux/thread_info.h:60,
                    from include/asm-generic/preempt.h:5,
                    from ./arch/sh/include/generated/asm/preempt.h:1,
                    from include/linux/preempt.h:79,
                    from include/linux/spinlock.h:56,
                    from include/linux/mmzone.h:8,
                    from include/linux/gfp.h:7,
                    from include/linux/mm.h:7,
                    from arch/sh/kernel/asm-offsets.c:14:
   include/linux/mm.h: In function 'num_pages_contiguous':
>> include/asm-generic/memory_model.h:48:21: error: implicit declaration of function 'page_to_section'; did you mean 'present_section'? [-Wimplicit-function-declaration]
      48 |         int __sec = page_to_section(__pg);                      \
         |                     ^~~~~~~~~~~~~~~
   include/asm-generic/memory_model.h:53:32: note: in definition of macro '__pfn_to_page'
      53 | ({      unsigned long __pfn = (pfn);                    \
         |                                ^~~
   include/asm-generic/memory_model.h:65:21: note: in expansion of macro '__page_to_pfn'
      65 | #define page_to_pfn __page_to_pfn
         |                     ^~~~~~~~~~~~~
   include/linux/mm.h:200:38: note: in expansion of macro 'page_to_pfn'
     200 | #define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + (n))
         |                                      ^~~~~~~~~~~
   include/linux/mm.h:221:33: note: in expansion of macro 'nth_page'
     221 |                 if (pages[i] != nth_page(first_page, i))
         |                                 ^~~~~~~~
   include/linux/mm.h: At top level:
>> include/linux/mm.h:2002:29: error: conflicting types for 'page_to_section'; have 'long unsigned int(const struct page *)'
    2002 | static inline unsigned long page_to_section(const struct page *page)
         |                             ^~~~~~~~~~~~~~~
   include/asm-generic/memory_model.h:48:21: note: previous implicit declaration of 'page_to_section' with type 'int()'
      48 |         int __sec = page_to_section(__pg);                      \
         |                     ^~~~~~~~~~~~~~~
   include/asm-generic/memory_model.h:53:32: note: in definition of macro '__pfn_to_page'
      53 | ({      unsigned long __pfn = (pfn);                    \
         |                                ^~~
   include/asm-generic/memory_model.h:65:21: note: in expansion of macro '__page_to_pfn'
      65 | #define page_to_pfn __page_to_pfn
         |                     ^~~~~~~~~~~~~
   include/linux/mm.h:200:38: note: in expansion of macro 'page_to_pfn'
     200 | #define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + (n))
         |                                      ^~~~~~~~~~~
   include/linux/mm.h:221:33: note: in expansion of macro 'nth_page'
     221 |                 if (pages[i] != nth_page(first_page, i))
         |                                 ^~~~~~~~
   make[3]: *** [scripts/Makefile.build:98: arch/sh/kernel/asm-offsets.s] Error 1
   make[3]: Target 'prepare' not remade because of errors.
   make[2]: *** [Makefile:1274: prepare0] Error 2
   make[2]: Target 'prepare' not remade because of errors.
   make[1]: *** [Makefile:248: __sub-make] Error 2
   make[1]: Target 'prepare' not remade because of errors.
   make: *** [Makefile:248: __sub-make] Error 2
   make: Target 'prepare' not remade because of errors.


vim +2002 include/linux/mm.h

bf4e8902ee5080 Daniel Kiper      2011-05-24  2001  
aa462abe8aaf21 Ian Campbell      2011-08-17 @2002  static inline unsigned long page_to_section(const struct page *page)
d41dee369bff3b Andy Whitcroft    2005-06-23  2003  {
d41dee369bff3b Andy Whitcroft    2005-06-23  2004  	return (page->flags >> SECTIONS_PGSHIFT) & SECTIONS_MASK;
d41dee369bff3b Andy Whitcroft    2005-06-23  2005  }
308c05e35e3517 Christoph Lameter 2008-04-28  2006  #endif
d41dee369bff3b Andy Whitcroft    2005-06-23  2007  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki



* Re: [PATCH v2 1/5] mm: introduce num_pages_contiguous()
  2025-07-04 17:10   ` Jason Gunthorpe
@ 2025-07-07  3:38     ` lizhe.67
  0 siblings, 0 replies; 17+ messages in thread
From: lizhe.67 @ 2025-07-07  3:38 UTC (permalink / raw)
  To: jgg
  Cc: akpm, alex.williamson, david, kvm, linux-kernel, linux-mm,
	lizhe.67, peterx

On Fri, 4 Jul 2025 14:10:15 -0300, jgg@ziepe.ca wrote:

> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 0ef2ba0c667a..1d26203d1ced 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -205,6 +205,26 @@ extern unsigned long sysctl_admin_reserve_kbytes;
> >  #define folio_page_idx(folio, p)	((p) - &(folio)->page)
> >  #endif
> >  
> > +/*
> > + * num_pages_contiguous() - determine the number of contiguous pages
> > + * starting from the first page.
> > + *
> > + * @pages: an array of page pointers
> > + * @nr_pages: length of the array
> > + */
> > +static inline unsigned long num_pages_contiguous(struct page **pages,
> > +						 unsigned long nr_pages)
> 
> Both longs should be size_t I think

Yes, size_t is a better choice.

> > +{
> > +	struct page *first_page = pages[0];
> > +	unsigned long i;
> 
> Size_t
> 
> > +
> > +	for (i = 1; i < nr_pages; i++)
> > +		if (pages[i] != nth_page(first_page, i))
> > +			break;
> 
> It seems OK. So the reasoning here is this is faster on
> CONFIG_SPARSEMEM_VMEMMAP/nonsparse

Yes.

> and about the same on sparse mem?
> (or we don't care?)

Regarding sparse memory, I'm not entirely certain. From my
understanding, VFIO is predominantly utilized in virtualization
scenarios, which typically have sufficient kernel resources. This
implies that CONFIG_SPARSEMEM_VMEMMAP is generally set to "y" in
such cases. Therefore, we need not overly concern ourselves with
this particular scenario. Of course, David has also proposed
optimization solutions for sparse memory scenarios[1]. If anyone
later complains about performance in this context, I would be happy
to assist with further optimization efforts. Currently, I only have
an x86_64 machine, on which CONFIG_SPARSEMEM_VMEMMAP is forcibly
enabled. Attempting to compile with CONFIG_SPARSEMEM &&
!CONFIG_SPARSEMEM_VMEMMAP results in compilation errors, preventing
me from conducting the relevant performance tests.

Thanks,
Zhe

[1]: https://lore.kernel.org/all/c1144447-6b67-48d3-b37c-5f1ca6a9b4a7@redhat.com/#t
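
For context on why the VMEMMAP case is cheap: with
CONFIG_SPARSEMEM_VMEMMAP, include/asm-generic/memory_model.h defines
the pfn/page conversions as plain pointer arithmetic on the virtually
contiguous vmemmap array:

	/* memmap is virtually contiguous.  */
	#define __pfn_to_page(pfn)	(vmemmap + (pfn))
	#define __page_to_pfn(page)	(unsigned long)((page) - vmemmap)

so nth_page(page, n) reduces to page + n, and the comparison in
num_pages_contiguous() is a pointer add and compare. Classic
SPARSEMEM instead resolves the pfn's memory section on every
pfn_to_page(), which is the case Jason is asking about.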



* Re: [PATCH v2 5/5] vfio/type1: optimize vfio_unpin_pages_remote()
  2025-07-04 17:11     ` Jason Gunthorpe
@ 2025-07-07  3:44       ` lizhe.67
  0 siblings, 0 replies; 17+ messages in thread
From: lizhe.67 @ 2025-07-07  3:44 UTC (permalink / raw)
  To: jgg
  Cc: akpm, alex.williamson, david, kvm, linux-kernel, linux-mm,
	lizhe.67, peterx

On Fri, 4 Jul 2025 14:11:23 -0300, jgg@ziepe.ca wrote:

> On Fri, Jul 04, 2025 at 10:47:00AM +0200, David Hildenbrand wrote:
> > >   static long vfio_unpin_pages_remote(struct vfio_dma *dma, dma_addr_t iova,
> > >   				    unsigned long pfn, unsigned long npage,
> > >   				    bool do_accounting)
> > >   {
> > >   	long unlocked = 0, locked = vpfn_pages(dma, iova, npage);
> > > -	long i;
> > > -	for (i = 0; i < npage; i++)
> > > -		if (put_pfn(pfn++, dma->prot))
> > > -			unlocked++;
> > > +	if (dma->has_rsvd) {
> > > +		long i;
> > 
> > No need to move "long i" here, but also doesn't really matter.
> 
> It should also be unsigned long as npage is unsigned

Yes, unsigned long is a better choice.

Thanks,
Zhe



* Re: [PATCH v2 1/5] mm: introduce num_pages_contiguous()
  2025-07-04 21:19   ` kernel test robot
@ 2025-07-07  3:52     ` lizhe.67
  0 siblings, 0 replies; 17+ messages in thread
From: lizhe.67 @ 2025-07-07  3:52 UTC (permalink / raw)
  To: lkp
  Cc: akpm, alex.williamson, david, jgg, kvm, linux-kernel, linux-mm,
	lizhe.67, oe-kbuild-all, peterx

On Sat, 5 Jul 2025 05:19:34 +0800, lkp@intel.com wrote:
 
> All errors (new ones prefixed by >>):
> 
>    In file included from arch/sh/include/asm/page.h:160,
>                     from arch/sh/include/asm/thread_info.h:13,
>                     from include/linux/thread_info.h:60,
>                     from include/asm-generic/preempt.h:5,
>                     from ./arch/sh/include/generated/asm/preempt.h:1,
>                     from include/linux/preempt.h:79,
>                     from include/linux/spinlock.h:56,
>                     from include/linux/mmzone.h:8,
>                     from include/linux/gfp.h:7,
>                     from include/linux/mm.h:7,
>                     from arch/sh/kernel/asm-offsets.c:14:
>    include/linux/mm.h: In function 'num_pages_contiguous':
> >> include/asm-generic/memory_model.h:48:21: error: implicit declaration of function 'page_to_section'; did you mean 'present_section'? [-Wimplicit-function-declaration]
>       48 |         int __sec = page_to_section(__pg);                      \
>          |                     ^~~~~~~~~~~~~~~
>    include/asm-generic/memory_model.h:53:32: note: in definition of macro '__pfn_to_page'
>       53 | ({      unsigned long __pfn = (pfn);                    \
>          |                                ^~~
>    include/asm-generic/memory_model.h:65:21: note: in expansion of macro '__page_to_pfn'
>       65 | #define page_to_pfn __page_to_pfn
>          |                     ^~~~~~~~~~~~~
>    include/linux/mm.h:200:38: note: in expansion of macro 'page_to_pfn'
>      200 | #define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + (n))
>          |                                      ^~~~~~~~~~~
>    include/linux/mm.h:221:33: note: in expansion of macro 'nth_page'
>      221 |                 if (pages[i] != nth_page(first_page, i))
>          |                                 ^~~~~~~~
>    include/linux/mm.h: At top level:
> >> include/linux/mm.h:2002:29: error: conflicting types for 'page_to_section'; have 'long unsigned int(const struct page *)'
>     2002 | static inline unsigned long page_to_section(const struct page *page)
>          |                             ^~~~~~~~~~~~~~~
>    include/asm-generic/memory_model.h:48:21: note: previous implicit declaration of 'page_to_section' with type 'int()'
>       48 |         int __sec = page_to_section(__pg);                      \
>          |                     ^~~~~~~~~~~~~~~
>    include/asm-generic/memory_model.h:53:32: note: in definition of macro '__pfn_to_page'
>       53 | ({      unsigned long __pfn = (pfn);                    \
>          |                                ^~~
>    include/asm-generic/memory_model.h:65:21: note: in expansion of macro '__page_to_pfn'
>       65 | #define page_to_pfn __page_to_pfn
>          |                     ^~~~~~~~~~~~~
>    include/linux/mm.h:200:38: note: in expansion of macro 'page_to_pfn'
>      200 | #define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + (n))
>          |                                      ^~~~~~~~~~~
>    include/linux/mm.h:221:33: note: in expansion of macro 'nth_page'
>      221 |                 if (pages[i] != nth_page(first_page, i))
>          |                                 ^~~~~~~~
>    make[3]: *** [scripts/Makefile.build:98: arch/sh/kernel/asm-offsets.s] Error 1
>    make[3]: Target 'prepare' not remade because of errors.
>    make[2]: *** [Makefile:1274: prepare0] Error 2
>    make[2]: Target 'prepare' not remade because of errors.
>    make[1]: *** [Makefile:248: __sub-make] Error 2
>    make[1]: Target 'prepare' not remade because of errors.
>    make: *** [Makefile:248: __sub-make] Error 2
>    make: Target 'prepare' not remade because of errors.
> 
> 
> vim +2002 include/linux/mm.h
> 
> bf4e8902ee5080 Daniel Kiper      2011-05-24  2001  
> aa462abe8aaf21 Ian Campbell      2011-08-17 @2002  static inline unsigned long page_to_section(const struct page *page)
> d41dee369bff3b Andy Whitcroft    2005-06-23  2003  {
> d41dee369bff3b Andy Whitcroft    2005-06-23  2004  	return (page->flags >> SECTIONS_PGSHIFT) & SECTIONS_MASK;
> d41dee369bff3b Andy Whitcroft    2005-06-23  2005  }
> 308c05e35e3517 Christoph Lameter 2008-04-28  2006  #endif
> d41dee369bff3b Andy Whitcroft    2005-06-23  2007  

Thank you. Let's move function num_pages_contiguous() below the
definition of page_to_section() to resolve this compilation error.

Thanks,
Zhe
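
Putting the thread's feedback together (David's documentation
clarification and nth_page(pages[0], i) suggestion, Jason's size_t
suggestion, and the placement fix prompted by the robot report), the
helper in a v3 might look roughly like this (a sketch, not the posted
v3):

	/*
	 * num_pages_contiguous() - determine the number of contiguous
	 * pages starting from the first page.
	 *
	 * @pages: an array of page pointers
	 * @nr_pages: length of the array
	 *
	 * Pages are contiguous if they represent contiguous PFNs.
	 * Depending on the memory model, this can mean that the
	 * addresses of the "struct page"s are not contiguous.
	 *
	 * Placed below the definition of page_to_section() so that
	 * nth_page() resolves on SPARSEMEM builds.
	 */
	static inline size_t num_pages_contiguous(struct page **pages,
						  size_t nr_pages)
	{
		size_t i;

		for (i = 1; i < nr_pages; i++)
			if (pages[i] != nth_page(pages[0], i))
				break;

		return i;
	}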



Thread overview: 17+ messages
2025-07-04  6:25 [PATCH v2 0/5] vfio/type1: optimize vfio_pin_pages_remote() and vfio_unpin_pages_remote() lizhe.67
2025-07-04  6:25 ` [PATCH v2 1/5] mm: introduce num_pages_contiguous() lizhe.67
2025-07-04  7:56   ` David Hildenbrand
2025-07-04  8:21     ` lizhe.67
2025-07-04 17:10   ` Jason Gunthorpe
2025-07-07  3:38     ` lizhe.67
2025-07-04 21:19   ` kernel test robot
2025-07-07  3:52     ` lizhe.67
2025-07-04  6:25 ` [PATCH v2 2/5] vfio/type1: optimize vfio_pin_pages_remote() lizhe.67
2025-07-04  8:41   ` David Hildenbrand
2025-07-04  6:26 ` [PATCH v2 3/5] vfio/type1: batch vfio_find_vpfn() in function vfio_unpin_pages_remote() lizhe.67
2025-07-04  6:26 ` [PATCH v2 4/5] vfio/type1: introduce a new member has_rsvd for struct vfio_dma lizhe.67
2025-07-04  8:46   ` David Hildenbrand
2025-07-04  6:26 ` [PATCH v2 5/5] vfio/type1: optimize vfio_unpin_pages_remote() lizhe.67
2025-07-04  8:47   ` David Hildenbrand
2025-07-04 17:11     ` Jason Gunthorpe
2025-07-07  3:44       ` lizhe.67
