kvm.vger.kernel.org archive mirror
* [PATCH v5 0/3] vfio/type1: optimize vfio_unpin_pages_remote() for large folio
@ 2025-06-20  3:23 lizhe.67
  2025-06-20  3:23 ` [PATCH v5 1/3] vfio/type1: batch vfio_find_vpfn() in function vfio_unpin_pages_remote() lizhe.67
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: lizhe.67 @ 2025-06-20  3:23 UTC (permalink / raw)
  To: alex.williamson, jgg, david; +Cc: peterx, kvm, linux-kernel, lizhe.67

From: Li Zhe <lizhe.67@bytedance.com>

This patchset is based on patch 'vfio/type1: optimize
vfio_pin_pages_remote() for large folios'[1].

When vfio_unpin_pages_remote() is called on a range of addresses that
includes large folios, it currently performs one put_pfn() operation
per page. This per-page cost adds up to significant overhead,
especially for large ranges of pages. We can reduce it by batching
the put_pfn() operations.

The first patch batches the vfio_find_vpfn() calls in function
vfio_unpin_pages_remote(). However, performance testing indicates that
this patch does not seem to have a significant impact. The primary
reason is that the vpfn rb tree is generally empty. Nevertheless, we
believe it can still offer performance benefits in certain scenarios
and also lays the groundwork for the third patch. The second patch
introduces a new member has_rsvd for struct vfio_dma, which will be
utilized by the third patch. The third patch, using the method described
earlier, optimizes the performance of vfio_unpin_pages_remote() for
large folio scenarios.

The performance test results, based on v6.15, for completing the 16G VFIO
IOMMU DMA unmapping, obtained through unit test[2] with slight
modifications[3], are as follows.

Base(v6.15):
./vfio-pci-mem-dma-map 0000:03:00.0 16
------- AVERAGE (MADV_HUGEPAGE) --------
VFIO MAP DMA in 0.047 s (338.6 GB/s)
VFIO UNMAP DMA in 0.138 s (116.2 GB/s)
------- AVERAGE (MAP_POPULATE) --------
VFIO MAP DMA in 0.280 s (57.2 GB/s)
VFIO UNMAP DMA in 0.312 s (51.3 GB/s)
------- AVERAGE (HUGETLBFS) --------
VFIO MAP DMA in 0.052 s (308.3 GB/s)
VFIO UNMAP DMA in 0.139 s (115.1 GB/s)

Map[1] + First patch:
------- AVERAGE (MADV_HUGEPAGE) --------
VFIO MAP DMA in 0.027 s (596.1 GB/s)
VFIO UNMAP DMA in 0.138 s (115.8 GB/s)
------- AVERAGE (MAP_POPULATE) --------
VFIO MAP DMA in 0.292 s (54.8 GB/s)
VFIO UNMAP DMA in 0.310 s (51.6 GB/s)
------- AVERAGE (HUGETLBFS) --------
VFIO MAP DMA in 0.032 s (506.5 GB/s)
VFIO UNMAP DMA in 0.140 s (114.1 GB/s)

Map[1] + This patchset:
------- AVERAGE (MADV_HUGEPAGE) --------
VFIO MAP DMA in 0.028 s (563.9 GB/s)
VFIO UNMAP DMA in 0.049 s (325.1 GB/s)
------- AVERAGE (MAP_POPULATE) --------
VFIO MAP DMA in 0.292 s (54.7 GB/s)
VFIO UNMAP DMA in 0.292 s (54.9 GB/s)
------- AVERAGE (HUGETLBFS) --------
VFIO MAP DMA in 0.033 s (491.3 GB/s)
VFIO UNMAP DMA in 0.049 s (323.9 GB/s)

The first patch alone has a negligible impact on VFIO UNMAP DMA
performance.

With the second and third patches applied, VFIO UNMAP DMA time for
large folios drops by approximately 64% (0.138 s to 0.049 s with
MADV_HUGEPAGE, 0.139 s to 0.049 s with HUGETLBFS). For small folios,
the results show no significant change.
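
As a quick check of the figure above, here is a minimal sketch of the
arithmetic. The helper name time_reduction_pct() is hypothetical and
not part of the patchset; the inputs are the averaged times reported
in the tables.

```c
#include <assert.h>

/*
 * Percentage reduction in elapsed time when going from base_s to new_s.
 * Hypothetical helper, used only to check the numbers quoted above.
 */
static double time_reduction_pct(double base_s, double new_s)
{
	assert(base_s > 0.0);	/* a zero baseline has no meaningful ratio */
	return (base_s - new_s) / base_s * 100.0;
}
```

Called with the UNMAP times above, time_reduction_pct(0.138, 0.049)
gives about 64.5 for MADV_HUGEPAGE, time_reduction_pct(0.139, 0.049)
about 64.7 for HUGETLBFS, and time_reduction_pct(0.312, 0.292) about
6.4 for MAP_POPULATE.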

[1]: https://lore.kernel.org/all/20250529064947.38433-1-lizhe.67@bytedance.com/
[2]: https://github.com/awilliam/tests/blob/vfio-pci-mem-dma-map/vfio-pci-mem-dma-map.c
[3]: https://lore.kernel.org/all/20250610031013.98556-1-lizhe.67@bytedance.com/

Changelogs:

v4->v5:
- Remove the unpin_user_folio_dirty_locked() interface introduced in
  v4.
- Introduce a new member has_rsvd for struct vfio_dma. We use it to
  determine whether there are any reserved or invalid pfns in the
  region represented by this vfio_dma. If not, we can perform batch
  put_pfn() operations by directly calling
  unpin_user_page_range_dirty_lock().
- Update the performance test results.

v3->v4:
- Introduce a new interface unpin_user_folio_dirty_locked(). Its
  purpose is to conditionally mark a folio as dirty and unpin it.
  This interface will be called in the VFIO DMA unmap process.
- Revert the related changes to put_pfn().
- Update the performance test results.

v2->v3:
- Split the original patch into two separate patches.
- Add several comments specific to large folio scenarios.
- Rename two variables.
- Remove the iova update from the loop in
  vfio_unpin_pages_remote().
- Update the performance test results.

v1->v2:
- Refactor the implementation of the optimized code

v4: https://lore.kernel.org/all/20250617041821.85555-1-lizhe.67@bytedance.com/
v3: https://lore.kernel.org/all/20250616075251.89067-1-lizhe.67@bytedance.com/
v2: https://lore.kernel.org/all/20250610045753.6405-1-lizhe.67@bytedance.com/
v1: https://lore.kernel.org/all/20250605124923.21896-1-lizhe.67@bytedance.com/

Li Zhe (3):
  vfio/type1: batch vfio_find_vpfn() in function
    vfio_unpin_pages_remote()
  vfio/type1: introduce a new member has_rsvd for struct vfio_dma
  vfio/type1: optimize vfio_unpin_pages_remote() for large folio

 drivers/vfio/vfio_iommu_type1.c | 31 ++++++++++++++++++++++---------
 1 file changed, 22 insertions(+), 9 deletions(-)

-- 
2.20.1


* [PATCH v5 1/3] vfio/type1: batch vfio_find_vpfn() in function vfio_unpin_pages_remote()
  2025-06-20  3:23 [PATCH v5 0/3] vfio/type1: optimize vfio_unpin_pages_remote() for large folio lizhe.67
@ 2025-06-20  3:23 ` lizhe.67
  2025-06-24 16:23   ` kernel test robot
  2025-06-27 21:40   ` Alex Williamson
  2025-06-20  3:23 ` [PATCH v5 2/3] vfio/type1: introduce a new member has_rsvd for struct vfio_dma lizhe.67
  2025-06-20  3:23 ` [PATCH v5 3/3] vfio/type1: optimize vfio_unpin_pages_remote() for large folio lizhe.67
  2 siblings, 2 replies; 9+ messages in thread
From: lizhe.67 @ 2025-06-20  3:23 UTC (permalink / raw)
  To: alex.williamson, jgg, david; +Cc: peterx, kvm, linux-kernel, lizhe.67

From: Li Zhe <lizhe.67@bytedance.com>

This patch is based on patch 'vfio/type1: optimize
vfio_pin_pages_remote() for large folios'[1].

The function vpfn_pages() can help us determine the number of vpfn
nodes on the vpfn rb tree within a specified range. This allows us
to avoid searching for each vpfn individually in the function
vfio_unpin_pages_remote(). This patch batches the vfio_find_vpfn()
calls in function vfio_unpin_pages_remote().
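
The idea can be illustrated with a small user-space model. This is
only a sketch under stated assumptions: a sorted array stands in for
the vpfn rb tree, every stored iova is page aligned, and the names
vpfn_set, lower_bound(), count_per_page(), count_batched() and
MODEL_PAGE_SIZE are hypothetical. It is not the kernel's vpfn_pages()
implementation.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096UL	/* hypothetical page size for this model */

/* Hypothetical stand-in for the vpfn rb tree: iovas sorted ascending. */
struct vpfn_set {
	const unsigned long *iova;
	size_t count;
};

/* Index of the first entry >= key (binary search). */
static size_t lower_bound(const struct vpfn_set *s, unsigned long key)
{
	size_t lo = 0, hi = s->count;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (s->iova[mid] < key)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/* Per-page style: one lookup for every page in the range. */
static long count_per_page(const struct vpfn_set *s, unsigned long iova,
			   unsigned long npage)
{
	long found = 0;
	unsigned long i;

	for (i = 0; i < npage; i++, iova += MODEL_PAGE_SIZE) {
		size_t pos = lower_bound(s, iova);

		if (pos < s->count && s->iova[pos] == iova)
			found++;
	}
	return found;
}

/* Batched style: two searches bound the range, independent of npage. */
static long count_batched(const struct vpfn_set *s, unsigned long iova,
			  unsigned long npage)
{
	unsigned long end = iova + npage * MODEL_PAGE_SIZE;

	assert(end >= iova);	/* the model does not handle wraparound */
	return (long)(lower_bound(s, end) - lower_bound(s, iova));
}
```

Under the alignment assumption both functions return the same count
for the same range. The batched form does a fixed number of searches
per call, while the per-page form does one search per page.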

[1]: https://lore.kernel.org/all/20250529064947.38433-1-lizhe.67@bytedance.com/

Signed-off-by: Li Zhe <lizhe.67@bytedance.com>
---
 drivers/vfio/vfio_iommu_type1.c | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 28ee4b8d39ae..e952bf8bdfab 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -805,16 +805,12 @@ static long vfio_unpin_pages_remote(struct vfio_dma *dma, dma_addr_t iova,
 				    unsigned long pfn, unsigned long npage,
 				    bool do_accounting)
 {
-	long unlocked = 0, locked = 0;
+	long unlocked = 0, locked = vpfn_pages(dma, iova, npage);
 	long i;
 
-	for (i = 0; i < npage; i++, iova += PAGE_SIZE) {
-		if (put_pfn(pfn++, dma->prot)) {
+	for (i = 0; i < npage; i++)
+		if (put_pfn(pfn++, dma->prot))
 			unlocked++;
-			if (vfio_find_vpfn(dma, iova))
-				locked++;
-		}
-	}
 
 	if (do_accounting)
 		vfio_lock_acct(dma, locked - unlocked, true);
-- 
2.20.1


* [PATCH v5 2/3] vfio/type1: introduce a new member has_rsvd for struct vfio_dma
  2025-06-20  3:23 [PATCH v5 0/3] vfio/type1: optimize vfio_unpin_pages_remote() for large folio lizhe.67
  2025-06-20  3:23 ` [PATCH v5 1/3] vfio/type1: batch vfio_find_vpfn() in function vfio_unpin_pages_remote() lizhe.67
@ 2025-06-20  3:23 ` lizhe.67
  2025-06-27 21:40   ` Alex Williamson
  2025-06-20  3:23 ` [PATCH v5 3/3] vfio/type1: optimize vfio_unpin_pages_remote() for large folio lizhe.67
  2 siblings, 1 reply; 9+ messages in thread
From: lizhe.67 @ 2025-06-20  3:23 UTC (permalink / raw)
  To: alex.williamson, jgg, david; +Cc: peterx, kvm, linux-kernel, lizhe.67

From: Li Zhe <lizhe.67@bytedance.com>

Introduce a new member has_rsvd for struct vfio_dma. This member is
used to indicate whether there are any reserved or invalid pfns in
the region represented by this vfio_dma. If it is true, it indicates
that there is at least one pfn in this region that is either reserved
or invalid.
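
A minimal user-space sketch of the flag's behaviour, assuming it is
only ever OR-ed with the per-batch result as in the diff below. The
names dma_model and note_pin_batch() are hypothetical and exist only
for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reduced model of the mapping; only the flag is kept. */
struct dma_model {
	bool has_rsvd;
};

/* Fold the result of one pin batch into the mapping-wide flag. */
static void note_pin_batch(struct dma_model *dma, bool batch_rsvd)
{
	assert(dma != NULL);
	dma->has_rsvd |= batch_rsvd;	/* sticky: this helper never clears it */
}
```

Once any batch reports a reserved or invalid pfn, the flag stays set
for later batches, so a false value means no batch seen so far
reported one.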

Signed-off-by: Li Zhe <lizhe.67@bytedance.com>
---
 drivers/vfio/vfio_iommu_type1.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index e952bf8bdfab..8827e315e3d8 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -93,6 +93,10 @@ struct vfio_dma {
 	bool			iommu_mapped;
 	bool			lock_cap;	/* capable(CAP_IPC_LOCK) */
 	bool			vaddr_invalid;
+	/*
+	 * Any reserved or invalid pfns within this range?
+	 */
+	bool			has_rsvd;
 	struct task_struct	*task;
 	struct rb_root		pfn_list;	/* Ex-user pinned pfn list */
 	unsigned long		*bitmap;
@@ -785,6 +789,7 @@ static long vfio_pin_pages_remote(struct vfio_dma *dma, unsigned long vaddr,
 	}
 
 out:
+	dma->has_rsvd |= rsvd;
 	ret = vfio_lock_acct(dma, lock_acct, false);
 
 unpin_out:
-- 
2.20.1


* [PATCH v5 3/3] vfio/type1: optimize vfio_unpin_pages_remote() for large folio
  2025-06-20  3:23 [PATCH v5 0/3] vfio/type1: optimize vfio_unpin_pages_remote() for large folio lizhe.67
  2025-06-20  3:23 ` [PATCH v5 1/3] vfio/type1: batch vfio_find_vpfn() in function vfio_unpin_pages_remote() lizhe.67
  2025-06-20  3:23 ` [PATCH v5 2/3] vfio/type1: introduce a new member has_rsvd for struct vfio_dma lizhe.67
@ 2025-06-20  3:23 ` lizhe.67
  2 siblings, 0 replies; 9+ messages in thread
From: lizhe.67 @ 2025-06-20  3:23 UTC (permalink / raw)
  To: alex.williamson, jgg, david; +Cc: peterx, kvm, linux-kernel, lizhe.67

From: Li Zhe <lizhe.67@bytedance.com>

When vfio_unpin_pages_remote() is called on a range of addresses that
includes large folios, it currently performs one put_pfn() operation
per page. This per-page cost adds up to significant overhead,
especially for large ranges of pages.

It is very rare for reserved and non-reserved PFNs to be mixed within
the same range, so this patch uses the has_rsvd member introduced in
the previous patch to decide whether put_pfn() operations can be
batched. Moreover, compared to put_pfn(),
unpin_user_page_range_dirty_lock() handles large folios more
efficiently.
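
The two paths can be compared with a small user-space model. This is
a sketch, not the kernel code: model_page, unpin_stats,
unpin_per_page(), unpin_by_folio() and unpin_range() are hypothetical
names, and "ref_ops" simply counts simulated reference-count updates.
It assumes the range unpin helper drops references once per run of
pages backed by the same folio.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page descriptor for this model. */
struct model_page {
	int folio_id;	/* pages sharing an id belong to one folio */
	bool reserved;	/* reserved pages are never unpinned */
};

struct unpin_stats {
	long unlocked;	/* pages reported as unpinned */
	long ref_ops;	/* simulated reference-count updates */
};

/* Slow path model: check and drop one page at a time. */
static struct unpin_stats unpin_per_page(const struct model_page *pg,
					 size_t npage)
{
	struct unpin_stats st = { 0, 0 };
	size_t i;

	for (i = 0; i < npage; i++) {
		if (pg[i].reserved)
			continue;
		st.unlocked++;
		st.ref_ops++;
	}
	return st;
}

/* Fast path model: one update per run of pages in the same folio. */
static struct unpin_stats unpin_by_folio(const struct model_page *pg,
					 size_t npage)
{
	struct unpin_stats st = { 0, 0 };
	size_t i = 0;

	while (i < npage) {
		size_t run = 0;
		int id = pg[i].folio_id;

		/* Extend the run while consecutive pages share the folio. */
		while (i + run < npage && pg[i + run].folio_id == id) {
			assert(!pg[i + run].reserved);
			run++;
		}
		st.unlocked += (long)run;
		st.ref_ops++;	/* one update covers the whole run */
		i += run;
	}
	return st;
}

/* Dispatch on the mapping-wide flag, mirroring the shape of the patch. */
static struct unpin_stats unpin_range(const struct model_page *pg,
				      size_t npage, bool has_rsvd)
{
	return has_rsvd ? unpin_per_page(pg, npage)
			: unpin_by_folio(pg, npage);
}
```

For eight pages split across two four-page folios, the fast path in
this model reports eight pages unpinned with two updates, while the
slow path needs eight. That difference grows with folio size, which
is consistent with the gain showing up only in the large folio cases
below.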

The performance test results, based on v6.15, for completing the 16G VFIO
IOMMU DMA unmapping, obtained through unit test[1] with slight
modifications[2], are as follows.

Base(v6.15):
./vfio-pci-mem-dma-map 0000:03:00.0 16
------- AVERAGE (MADV_HUGEPAGE) --------
VFIO MAP DMA in 0.047 s (338.6 GB/s)
VFIO UNMAP DMA in 0.138 s (116.2 GB/s)
------- AVERAGE (MAP_POPULATE) --------
VFIO MAP DMA in 0.280 s (57.2 GB/s)
VFIO UNMAP DMA in 0.312 s (51.3 GB/s)
------- AVERAGE (HUGETLBFS) --------
VFIO MAP DMA in 0.052 s (308.3 GB/s)
VFIO UNMAP DMA in 0.139 s (115.1 GB/s)

Map[3] + This patchset:
------- AVERAGE (MADV_HUGEPAGE) --------
VFIO MAP DMA in 0.028 s (563.9 GB/s)
VFIO UNMAP DMA in 0.049 s (325.1 GB/s)
------- AVERAGE (MAP_POPULATE) --------
VFIO MAP DMA in 0.292 s (54.7 GB/s)
VFIO UNMAP DMA in 0.292 s (54.9 GB/s)
------- AVERAGE (HUGETLBFS) --------
VFIO MAP DMA in 0.033 s (491.3 GB/s)
VFIO UNMAP DMA in 0.049 s (323.9 GB/s)

For large folios, VFIO UNMAP DMA time drops by approximately 64%.
For small folios, the results show no significant change.

[1]: https://github.com/awilliam/tests/blob/vfio-pci-mem-dma-map/vfio-pci-mem-dma-map.c
[2]: https://lore.kernel.org/all/20250610031013.98556-1-lizhe.67@bytedance.com/
[3]: https://lore.kernel.org/all/20250529064947.38433-1-lizhe.67@bytedance.com/

Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>
Signed-off-by: Li Zhe <lizhe.67@bytedance.com>
---
 drivers/vfio/vfio_iommu_type1.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 8827e315e3d8..88a54b44df5b 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -806,17 +806,29 @@ static long vfio_pin_pages_remote(struct vfio_dma *dma, unsigned long vaddr,
 	return pinned;
 }
 
+static inline void put_valid_unreserved_pfns(unsigned long start_pfn,
+		unsigned long npage, int prot)
+{
+	unpin_user_page_range_dirty_lock(pfn_to_page(start_pfn), npage,
+					 prot & IOMMU_WRITE);
+}
+
 static long vfio_unpin_pages_remote(struct vfio_dma *dma, dma_addr_t iova,
 				    unsigned long pfn, unsigned long npage,
 				    bool do_accounting)
 {
 	long unlocked = 0, locked = vpfn_pages(dma, iova, npage);
-	long i;
 
-	for (i = 0; i < npage; i++)
-		if (put_pfn(pfn++, dma->prot))
-			unlocked++;
+	if (dma->has_rsvd) {
+		long i;
 
+		for (i = 0; i < npage; i++)
+			if (put_pfn(pfn++, dma->prot))
+				unlocked++;
+	} else {
+		put_valid_unreserved_pfns(pfn, npage, dma->prot);
+		unlocked = npage;
+	}
 	if (do_accounting)
 		vfio_lock_acct(dma, locked - unlocked, true);
 
-- 
2.20.1


* Re: [PATCH v5 1/3] vfio/type1: batch vfio_find_vpfn() in function vfio_unpin_pages_remote()
  2025-06-20  3:23 ` [PATCH v5 1/3] vfio/type1: batch vfio_find_vpfn() in function vfio_unpin_pages_remote() lizhe.67
@ 2025-06-24 16:23   ` kernel test robot
  2025-06-25  2:29     ` lizhe.67
  2025-06-27 21:40   ` Alex Williamson
  1 sibling, 1 reply; 9+ messages in thread
From: kernel test robot @ 2025-06-24 16:23 UTC (permalink / raw)
  To: lizhe.67, alex.williamson, jgg, david
  Cc: oe-kbuild-all, peterx, kvm, linux-kernel, lizhe.67

Hi,

kernel test robot noticed the following build errors:

[auto build test ERROR on awilliam-vfio/next]
[also build test ERROR on awilliam-vfio/for-linus linus/master v6.16-rc3 next-20250624]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/lizhe-67-bytedance-com/vfio-type1-batch-vfio_find_vpfn-in-function-vfio_unpin_pages_remote/20250620-112605
base:   https://github.com/awilliam/linux-vfio.git next
patch link:    https://lore.kernel.org/r/20250620032344.13382-2-lizhe.67%40bytedance.com
patch subject: [PATCH v5 1/3] vfio/type1: batch vfio_find_vpfn() in function vfio_unpin_pages_remote()
config: x86_64-rhel-9.4 (https://download.01.org/0day-ci/archive/20250625/202506250037.VfdBAPP3-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250625/202506250037.VfdBAPP3-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202506250037.VfdBAPP3-lkp@intel.com/

All errors (new ones prefixed by >>):

   drivers/vfio/vfio_iommu_type1.c: In function 'vfio_unpin_pages_remote':
>> drivers/vfio/vfio_iommu_type1.c:738:37: error: implicit declaration of function 'vpfn_pages'; did you mean 'vma_pages'? [-Werror=implicit-function-declaration]
     738 |         long unlocked = 0, locked = vpfn_pages(dma, iova, npage);
         |                                     ^~~~~~~~~~
         |                                     vma_pages
   cc1: some warnings being treated as errors


vim +738 drivers/vfio/vfio_iommu_type1.c

   733	
   734	static long vfio_unpin_pages_remote(struct vfio_dma *dma, dma_addr_t iova,
   735					    unsigned long pfn, unsigned long npage,
   736					    bool do_accounting)
   737	{
 > 738		long unlocked = 0, locked = vpfn_pages(dma, iova, npage);
   739		long i;
   740	
   741		for (i = 0; i < npage; i++)
   742			if (put_pfn(pfn++, dma->prot))
   743				unlocked++;
   744	
   745		if (do_accounting)
   746			vfio_lock_acct(dma, locked - unlocked, true);
   747	
   748		return unlocked;
   749	}
   750	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

* Re: [PATCH v5 1/3] vfio/type1: batch vfio_find_vpfn() in function vfio_unpin_pages_remote()
  2025-06-24 16:23   ` kernel test robot
@ 2025-06-25  2:29     ` lizhe.67
  0 siblings, 0 replies; 9+ messages in thread
From: lizhe.67 @ 2025-06-25  2:29 UTC (permalink / raw)
  To: lkp
  Cc: alex.williamson, david, jgg, kvm, linux-kernel, lizhe.67,
	oe-kbuild-all, peterx

On Wed, 25 Jun 2025 00:23:03 +0800,
kernel test robot <lkp@intel.com> wrote:

> kernel test robot noticed the following build errors:
> 
> [auto build test ERROR on awilliam-vfio/next]
> [also build test ERROR on awilliam-vfio/for-linus linus/master v6.16-rc3 next-20250624]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
> 
> url:    https://github.com/intel-lab-lkp/linux/commits/lizhe-67-bytedance-com/vfio-type1-batch-vfio_find_vpfn-in-function-vfio_unpin_pages_remote/20250620-112605
> base:   https://github.com/awilliam/linux-vfio.git next
> patch link:    https://lore.kernel.org/r/20250620032344.13382-2-lizhe.67%40bytedance.com
> patch subject: [PATCH v5 1/3] vfio/type1: batch vfio_find_vpfn() in function vfio_unpin_pages_remote()
> config: x86_64-rhel-9.4 (https://download.01.org/0day-ci/archive/20250625/202506250037.VfdBAPP3-lkp@intel.com/config)
> compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250625/202506250037.VfdBAPP3-lkp@intel.com/reproduce)
> 
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202506250037.VfdBAPP3-lkp@intel.com/
> 
> All errors (new ones prefixed by >>):
> 
>    drivers/vfio/vfio_iommu_type1.c: In function 'vfio_unpin_pages_remote':
> >> drivers/vfio/vfio_iommu_type1.c:738:37: error: implicit declaration of function 'vpfn_pages'; did you mean 'vma_pages'? [-Werror=implicit-function-declaration]
>      738 |         long unlocked = 0, locked = vpfn_pages(dma, iova, npage);
>          |                                     ^~~~~~~~~~
>          |                                     vma_pages
>    cc1: some warnings being treated as errors

This series is based on patch[1], so the build most likely needs that
patch applied first to avoid this error.

Thanks,
Zhe

[1]: https://lore.kernel.org/all/20250529064947.38433-1-lizhe.67@bytedance.com/

* Re: [PATCH v5 2/3] vfio/type1: introduce a new member has_rsvd for struct vfio_dma
  2025-06-20  3:23 ` [PATCH v5 2/3] vfio/type1: introduce a new member has_rsvd for struct vfio_dma lizhe.67
@ 2025-06-27 21:40   ` Alex Williamson
  0 siblings, 0 replies; 9+ messages in thread
From: Alex Williamson @ 2025-06-27 21:40 UTC (permalink / raw)
  To: lizhe.67; +Cc: jgg, david, peterx, kvm, linux-kernel

On Fri, 20 Jun 2025 11:23:43 +0800
lizhe.67@bytedance.com wrote:

> From: Li Zhe <lizhe.67@bytedance.com>
> 
> Introduce a new member has_rsvd for struct vfio_dma. This member is
> used to indicate whether there are any reserved or invalid pfns in
> the region represented by this vfio_dma. If it is true, it indicates
> that there is at least one pfn in this region that is either reserved
> or invalid.
> 
> Signed-off-by: Li Zhe <lizhe.67@bytedance.com>
> ---
>  drivers/vfio/vfio_iommu_type1.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index e952bf8bdfab..8827e315e3d8 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -93,6 +93,10 @@ struct vfio_dma {
>  	bool			iommu_mapped;
>  	bool			lock_cap;	/* capable(CAP_IPC_LOCK) */
>  	bool			vaddr_invalid;
> +	/*
> +	 * Any reserved or invalid pfns within this range?
> +	 */
> +	bool			has_rsvd;

Nit, the topic isn't so complex that it needs more than a brief comment:

	bool			has_rsvd;	/* has 1 or more rsvd pfns */

Thanks,
Alex

>  	struct task_struct	*task;
>  	struct rb_root		pfn_list;	/* Ex-user pinned pfn list */
>  	unsigned long		*bitmap;
> @@ -785,6 +789,7 @@ static long vfio_pin_pages_remote(struct vfio_dma *dma, unsigned long vaddr,
>  	}
>  
>  out:
> +	dma->has_rsvd |= rsvd;
>  	ret = vfio_lock_acct(dma, lock_acct, false);
>  
>  unpin_out:


* Re: [PATCH v5 1/3] vfio/type1: batch vfio_find_vpfn() in function vfio_unpin_pages_remote()
  2025-06-20  3:23 ` [PATCH v5 1/3] vfio/type1: batch vfio_find_vpfn() in function vfio_unpin_pages_remote() lizhe.67
  2025-06-24 16:23   ` kernel test robot
@ 2025-06-27 21:40   ` Alex Williamson
  2025-06-30  2:45     ` lizhe.67
  1 sibling, 1 reply; 9+ messages in thread
From: Alex Williamson @ 2025-06-27 21:40 UTC (permalink / raw)
  To: lizhe.67; +Cc: jgg, david, peterx, kvm, linux-kernel

On Fri, 20 Jun 2025 11:23:42 +0800
lizhe.67@bytedance.com wrote:

> From: Li Zhe <lizhe.67@bytedance.com>
> 
> This patch is based on patch 'vfio/type1: optimize
> vfio_pin_pages_remote() for large folios'[1].

The above and the below link are only necessary in the cover letter, or
below the --- marker below, they don't really make sense in the
committed log.

Anyway, aside from that and one nit on 2/ (sent separately), the series
looks ok to me and I hope David and Jason will chime in with A-b/R-b
given the previous discussions.

Given the build bot error[1] I'd suggest resending all your work in a
single series, the previous map optimization and the unmap optimization
here.  That way the dependency is already included, and it's a good
nudge for acks.  Thanks,

Alex


[1]: https://lore.kernel.org/all/202506250037.VfdBAPP3-lkp@intel.com/

> 
> The function vpfn_pages() can help us determine the number of vpfn
> nodes on the vpfn rb tree within a specified range. This allows us
> to avoid searching for each vpfn individually in the function
> vfio_unpin_pages_remote(). This patch batches the vfio_find_vpfn()
> calls in function vfio_unpin_pages_remote().
> 
> [1]: https://lore.kernel.org/all/20250529064947.38433-1-lizhe.67@bytedance.com/
> 
> Signed-off-by: Li Zhe <lizhe.67@bytedance.com>
> ---
>  drivers/vfio/vfio_iommu_type1.c | 10 +++-------
>  1 file changed, 3 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 28ee4b8d39ae..e952bf8bdfab 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -805,16 +805,12 @@ static long vfio_unpin_pages_remote(struct vfio_dma *dma, dma_addr_t iova,
>  				    unsigned long pfn, unsigned long npage,
>  				    bool do_accounting)
>  {
> -	long unlocked = 0, locked = 0;
> +	long unlocked = 0, locked = vpfn_pages(dma, iova, npage);
>  	long i;
>  
> -	for (i = 0; i < npage; i++, iova += PAGE_SIZE) {
> -		if (put_pfn(pfn++, dma->prot)) {
> +	for (i = 0; i < npage; i++)
> +		if (put_pfn(pfn++, dma->prot))
>  			unlocked++;
> -			if (vfio_find_vpfn(dma, iova))
> -				locked++;
> -		}
> -	}
>  
>  	if (do_accounting)
>  		vfio_lock_acct(dma, locked - unlocked, true);


* Re: [PATCH v5 1/3] vfio/type1: batch vfio_find_vpfn() in function vfio_unpin_pages_remote()
  2025-06-27 21:40   ` Alex Williamson
@ 2025-06-30  2:45     ` lizhe.67
  0 siblings, 0 replies; 9+ messages in thread
From: lizhe.67 @ 2025-06-30  2:45 UTC (permalink / raw)
  To: alex.williamson; +Cc: david, jgg, kvm, linux-kernel, lizhe.67, peterx

On Fri, 27 Jun 2025 15:40:59 -0600, alex.williamson@redhat.com wrote:

> On Fri, 20 Jun 2025 11:23:42 +0800
> lizhe.67@bytedance.com wrote:
> 
> > From: Li Zhe <lizhe.67@bytedance.com>
> > 
> > This patch is based on patch 'vfio/type1: optimize
> > vfio_pin_pages_remote() for large folios'[1].
> 
> The above and the below link are only necessary in the cover letter, or
> below the --- marker below, they don't really make sense in the
> committed log.
> 
> Anyway, aside from that and one nit on 2/ (sent separately), the series
> looks ok to me and I hope David and Jason will chime in with A-b/R-b
> given the previous discussions.
> 
> Given the build bot error[1] I'd suggest resending all your work in a
> single series, the previous map optimization and the unmap optimization
> here.  That way the dependency is already included, and it's a good
> nudge for acks.  Thanks,

Thank you for your review. I will send a new patchset that includes
the latest optimizations for both map and unmap.

Thanks,
Zhe
