All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH V5 0/9] iommu_ioas_map_file
@ 2024-10-22 21:00 Steve Sistare
  2024-10-22 21:00 ` [PATCH V5 1/9] mm/gup: folio_add_pins Steve Sistare
                   ` (8 more replies)
  0 siblings, 9 replies; 33+ messages in thread
From: Steve Sistare @ 2024-10-22 21:00 UTC (permalink / raw)
  To: iommu; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen, Steve Sistare

Provide the IOMMU_IOAS_MAP_FILE ioctl, which allows a user to register
memory by passing a memfd plus offset and length.  Implement it using
the memfd_map_folios KAPI, and the proposed folio_add_pins KAPI.
See the individual patches for details.

Changes in V2:
 * changed names and commit message in "rename uptr in iopt_alloc_iova"
 * normalized comments describing the iopt_map_user_pages interface
 * submitted folio_split_user_page_pin (fka folio_repin_unhugely) separately
 * replaced nupages[] optimization with folio-to-batch optimization
 * added selftests for map file

Changes in V3:
  * fixed bug setting user->locked
  * fixed bug in pages->file refs
  * replaced lockdep_off with down_write_nest_lock
  * added ufolios_next to track folio consumption in reader
  * combined IOMMU_IOAS_MAP_FILE interface and implementation
  * added patch folio_add_pins (fka folio_split_user_page_pin )
  * reformatted patches using clang-format
  * misc cosmetic changes in response to review comments

Changes in V4:
  * deleted ufolios_huge optimization
  * optimized batch_from_folios
  * squashed "optimize file mapping" into "pfn reader for file mappings"
  * moved length overflow check to iopt_alloc_file_pages
  * one declaration per line, and use local scope declarations
  * rebased to iommufd git tree

Changes in V5:
  * hoisted folio_add_pins call and deleted folios_unpin_partial
  * simplified batch_add_pfn_num and its caller
  * added cmd_length selftest for iommu_ioas_map_file
  * added map_file case for iommufd_fail_nth selftest
  * fixed an unreported mmput bug in pfn_reader_user_destroy that broke mdev

Steve Sistare (9):
  mm/gup: folio_add_pins
  iommufd: rename uptr in iopt_alloc_iova
  iommufd: generalize iopt_pages address
  iommufd: pfn reader local variables
  iommufd: folio subroutines
  iommufd: pfn reader for file mappings
  iommufd: IOMMU_IOAS_MAP_FILE
  iommufd: file mappings for mdev
  iommufd: map file selftest

 drivers/iommu/iommufd/io_pagetable.c             | 117 ++++++---
 drivers/iommu/iommufd/io_pagetable.h             |  20 +-
 drivers/iommu/iommufd/ioas.c                     |  38 +++
 drivers/iommu/iommufd/iommufd_private.h          |   5 +
 drivers/iommu/iommufd/main.c                     |   2 +
 drivers/iommu/iommufd/pages.c                    | 309 +++++++++++++++++++----
 include/linux/mm.h                               |   1 +
 include/uapi/linux/iommufd.h                     |  25 ++
 mm/gup.c                                         |  24 ++
 tools/testing/selftests/iommu/iommufd.c          | 136 ++++++++--
 tools/testing/selftests/iommu/iommufd_fail_nth.c |  39 +++
 tools/testing/selftests/iommu/iommufd_utils.h    |  57 +++++
 12 files changed, 664 insertions(+), 109 deletions(-)

base-commit: e2d8fe9148b79ed1cbf0663edc988db7769173dc

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH V5 1/9] mm/gup: folio_add_pins
  2024-10-22 21:00 [PATCH V5 0/9] iommu_ioas_map_file Steve Sistare
@ 2024-10-22 21:00 ` Steve Sistare
  2024-10-22 21:00 ` [PATCH V5 2/9] iommufd: rename uptr in iopt_alloc_iova Steve Sistare
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 33+ messages in thread
From: Steve Sistare @ 2024-10-22 21:00 UTC (permalink / raw)
  To: iommu; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen, Steve Sistare

Export a function that adds pins to an already-pinned huge-page folio.
This allows any range of small pages within the folio to be unpinned later.
For example, pages pinned via memfd_pin_folios and modified by
folio_add_pins could be unpinned via unpin_user_page(s).

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
 include/linux/mm.h |  1 +
 mm/gup.c           | 24 ++++++++++++++++++++++++
 2 files changed, 25 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index ecf63d2..f9de33a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2524,6 +2524,7 @@ long pin_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
 long memfd_pin_folios(struct file *memfd, loff_t start, loff_t end,
 		      struct folio **folios, unsigned int max_folios,
 		      pgoff_t *offset);
+int folio_add_pins(struct folio *folio, unsigned int pins);
 
 int get_user_pages_fast(unsigned long start, int nr_pages,
 			unsigned int gup_flags, struct page **pages);
diff --git a/mm/gup.c b/mm/gup.c
index a82890b..4ac3e567 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -3717,3 +3717,27 @@ long memfd_pin_folios(struct file *memfd, loff_t start, loff_t end,
 	return ret;
 }
 EXPORT_SYMBOL_GPL(memfd_pin_folios);
+
+/**
+ * folio_add_pins() - add pins to an already-pinned folio
+ * @folio: the folio to add more pins to
+ * @pins: number of pins to add
+ *
+ * Try to add more pins to an already-pinned folio. The semantics
+ * of the pin (e.g., FOLL_WRITE) follow any existing pin and cannot
+ * be changed.
+ *
+ * This function is helpful when having obtained a pin on a large folio
+ * using memfd_pin_folios(), but wanting to logically unpin parts
+ * (e.g., individual pages) of the folio later, for example, using
+ * unpin_user_page_range_dirty_lock().
+ *
+ * This is not the right interface to initially pin a folio.
+ */
+int folio_add_pins(struct folio *folio, unsigned int pins)
+{
+	VM_WARN_ON_ONCE(!folio_maybe_dma_pinned(folio));
+
+	return try_grab_folio(folio, pins, FOLL_PIN);
+}
+EXPORT_SYMBOL_GPL(folio_add_pins);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH V5 2/9] iommufd: rename uptr in iopt_alloc_iova
  2024-10-22 21:00 [PATCH V5 0/9] iommu_ioas_map_file Steve Sistare
  2024-10-22 21:00 ` [PATCH V5 1/9] mm/gup: folio_add_pins Steve Sistare
@ 2024-10-22 21:00 ` Steve Sistare
  2024-10-23  7:05   ` Tian, Kevin
  2024-10-22 21:00 ` [PATCH V5 3/9] iommufd: generalize iopt_pages address Steve Sistare
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 33+ messages in thread
From: Steve Sistare @ 2024-10-22 21:00 UTC (permalink / raw)
  To: iommu; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen, Steve Sistare

iopt_alloc_iova takes a uptr argument but only checks for its alignment.
Generalize this to an unsigned address, which can be the offset from the
start of a file in a subsequent patch.  No functional change.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/iommu/iommufd/io_pagetable.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/iommufd/io_pagetable.c b/drivers/iommu/iommufd/io_pagetable.c
index 4bf7ccd..68ad91d 100644
--- a/drivers/iommu/iommufd/io_pagetable.c
+++ b/drivers/iommu/iommufd/io_pagetable.c
@@ -107,9 +107,9 @@ static bool __alloc_iova_check_used(struct interval_tree_span_iter *span,
  * Does not return a 0 IOVA even if it is valid.
  */
 static int iopt_alloc_iova(struct io_pagetable *iopt, unsigned long *iova,
-			   unsigned long uptr, unsigned long length)
+			   unsigned long addr, unsigned long length)
 {
-	unsigned long page_offset = uptr % PAGE_SIZE;
+	unsigned long page_offset = addr % PAGE_SIZE;
 	struct interval_tree_double_span_iter used_span;
 	struct interval_tree_span_iter allowed_span;
 	unsigned long max_alignment = PAGE_SIZE;
@@ -122,15 +122,15 @@ static int iopt_alloc_iova(struct io_pagetable *iopt, unsigned long *iova,
 		return -EOVERFLOW;
 
 	/*
-	 * Keep alignment present in the uptr when building the IOVA, this
+	 * Keep alignment present in addr when building the IOVA, which
 	 * increases the chance we can map a THP.
 	 */
-	if (!uptr)
+	if (!addr)
 		iova_alignment = roundup_pow_of_two(length);
 	else
 		iova_alignment = min_t(unsigned long,
 				       roundup_pow_of_two(length),
-				       1UL << __ffs64(uptr));
+				       1UL << __ffs64(addr));
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 	max_alignment = HPAGE_SIZE;
@@ -248,6 +248,7 @@ static int iopt_alloc_area_pages(struct io_pagetable *iopt,
 				 int iommu_prot, unsigned int flags)
 {
 	struct iopt_pages_list *elm;
+	unsigned long start;
 	unsigned long iova;
 	int rc = 0;
 
@@ -267,9 +268,8 @@ static int iopt_alloc_area_pages(struct io_pagetable *iopt,
 		/* Use the first entry to guess the ideal IOVA alignment */
 		elm = list_first_entry(pages_list, struct iopt_pages_list,
 				       next);
-		rc = iopt_alloc_iova(
-			iopt, dst_iova,
-			(uintptr_t)elm->pages->uptr + elm->start_byte, length);
+		start = elm->start_byte + (uintptr_t)elm->pages->uptr;
+		rc = iopt_alloc_iova(iopt, dst_iova, start, length);
 		if (rc)
 			goto out_unlock;
 		if (IS_ENABLED(CONFIG_IOMMUFD_TEST) &&
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH V5 3/9] iommufd: generalize iopt_pages address
  2024-10-22 21:00 [PATCH V5 0/9] iommu_ioas_map_file Steve Sistare
  2024-10-22 21:00 ` [PATCH V5 1/9] mm/gup: folio_add_pins Steve Sistare
  2024-10-22 21:00 ` [PATCH V5 2/9] iommufd: rename uptr in iopt_alloc_iova Steve Sistare
@ 2024-10-22 21:00 ` Steve Sistare
  2024-10-23  7:08   ` Tian, Kevin
  2024-10-22 21:00 ` [PATCH V5 4/9] iommufd: pfn reader local variables Steve Sistare
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 33+ messages in thread
From: Steve Sistare @ 2024-10-22 21:00 UTC (permalink / raw)
  To: iommu; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen, Steve Sistare

The starting address in iopt_pages is currently a __user *uptr.  Generalize
to allow other types of addresses.  Refactor iopt_alloc_pages and
iopt_map_user_pages into address-type specific and common functions.

Suggested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/iommu/iommufd/io_pagetable.c | 55 ++++++++++++++++++++++--------------
 drivers/iommu/iommufd/io_pagetable.h | 13 +++++++--
 drivers/iommu/iommufd/pages.c        | 31 ++++++++++++++------
 3 files changed, 67 insertions(+), 32 deletions(-)

diff --git a/drivers/iommu/iommufd/io_pagetable.c b/drivers/iommu/iommufd/io_pagetable.c
index 68ad91d..874ee9e 100644
--- a/drivers/iommu/iommufd/io_pagetable.c
+++ b/drivers/iommu/iommufd/io_pagetable.c
@@ -384,6 +384,34 @@ int iopt_map_pages(struct io_pagetable *iopt, struct list_head *pages_list,
 	return rc;
 }
 
+static int iopt_map_common(struct iommufd_ctx *ictx, struct io_pagetable *iopt,
+			   struct iopt_pages *pages, unsigned long *iova,
+			   unsigned long length, unsigned long start_byte,
+			   int iommu_prot, unsigned int flags)
+{
+	struct iopt_pages_list elm = {};
+	LIST_HEAD(pages_list);
+	int rc;
+
+	elm.pages = pages;
+	elm.start_byte = start_byte;
+	if (ictx->account_mode == IOPT_PAGES_ACCOUNT_MM &&
+	    elm.pages->account_mode == IOPT_PAGES_ACCOUNT_USER)
+		elm.pages->account_mode = IOPT_PAGES_ACCOUNT_MM;
+	elm.length = length;
+	list_add(&elm.next, &pages_list);
+
+	rc = iopt_map_pages(iopt, &pages_list, length, iova, iommu_prot, flags);
+	if (rc) {
+		if (elm.area)
+			iopt_abort_area(elm.area);
+		if (elm.pages)
+			iopt_put_pages(elm.pages);
+		return rc;
+	}
+	return 0;
+}
+
 /**
  * iopt_map_user_pages() - Map a user VA to an iova in the io page table
  * @ictx: iommufd_ctx the iopt is part of
@@ -408,29 +436,14 @@ int iopt_map_user_pages(struct iommufd_ctx *ictx, struct io_pagetable *iopt,
 			unsigned long length, int iommu_prot,
 			unsigned int flags)
 {
-	struct iopt_pages_list elm = {};
-	LIST_HEAD(pages_list);
-	int rc;
+	struct iopt_pages *pages;
 
-	elm.pages = iopt_alloc_pages(uptr, length, iommu_prot & IOMMU_WRITE);
-	if (IS_ERR(elm.pages))
-		return PTR_ERR(elm.pages);
-	if (ictx->account_mode == IOPT_PAGES_ACCOUNT_MM &&
-	    elm.pages->account_mode == IOPT_PAGES_ACCOUNT_USER)
-		elm.pages->account_mode = IOPT_PAGES_ACCOUNT_MM;
-	elm.start_byte = uptr - elm.pages->uptr;
-	elm.length = length;
-	list_add(&elm.next, &pages_list);
+	pages = iopt_alloc_user_pages(uptr, length, iommu_prot & IOMMU_WRITE);
+	if (IS_ERR(pages))
+		return PTR_ERR(pages);
 
-	rc = iopt_map_pages(iopt, &pages_list, length, iova, iommu_prot, flags);
-	if (rc) {
-		if (elm.area)
-			iopt_abort_area(elm.area);
-		if (elm.pages)
-			iopt_put_pages(elm.pages);
-		return rc;
-	}
-	return 0;
+	return iopt_map_common(ictx, iopt, pages, iova, length,
+			       uptr - pages->uptr, iommu_prot, flags);
 }
 
 struct iova_bitmap_fn_arg {
diff --git a/drivers/iommu/iommufd/io_pagetable.h b/drivers/iommu/iommufd/io_pagetable.h
index c61d744..8e48266 100644
--- a/drivers/iommu/iommufd/io_pagetable.h
+++ b/drivers/iommu/iommufd/io_pagetable.h
@@ -175,6 +175,10 @@ enum {
 	IOPT_PAGES_ACCOUNT_MM = 2,
 };
 
+enum iopt_address_type {
+	IOPT_ADDRESS_USER = 0,
+};
+
 /*
  * This holds a pinned page list for multiple areas of IO address space. The
  * pages always originate from a linear chunk of userspace VA. Multiple
@@ -195,7 +199,10 @@ struct iopt_pages {
 	struct task_struct *source_task;
 	struct mm_struct *source_mm;
 	struct user_struct *source_user;
-	void __user *uptr;
+	enum iopt_address_type type;
+	union {
+		void __user *uptr;		/* IOPT_ADDRESS_USER */
+	};
 	bool writable:1;
 	u8 account_mode;
 
@@ -206,8 +213,8 @@ struct iopt_pages {
 	struct rb_root_cached domains_itree;
 };
 
-struct iopt_pages *iopt_alloc_pages(void __user *uptr, unsigned long length,
-				    bool writable);
+struct iopt_pages *iopt_alloc_user_pages(void __user *uptr,
+					 unsigned long length, bool writable);
 void iopt_release_pages(struct kref *kref);
 static inline void iopt_put_pages(struct iopt_pages *pages)
 {
diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
index 93d806c..aa69c97 100644
--- a/drivers/iommu/iommufd/pages.c
+++ b/drivers/iommu/iommufd/pages.c
@@ -1139,11 +1139,11 @@ static int pfn_reader_first(struct pfn_reader *pfns, struct iopt_pages *pages,
 	return 0;
 }
 
-struct iopt_pages *iopt_alloc_pages(void __user *uptr, unsigned long length,
-				    bool writable)
+static struct iopt_pages *iopt_alloc_pages(unsigned long length,
+					   unsigned long start_byte,
+					   bool writable)
 {
 	struct iopt_pages *pages;
-	unsigned long end;
 
 	/*
 	 * The iommu API uses size_t as the length, and protect the DIV_ROUND_UP
@@ -1152,9 +1152,6 @@ struct iopt_pages *iopt_alloc_pages(void __user *uptr, unsigned long length,
 	if (length > SIZE_MAX - PAGE_SIZE || length == 0)
 		return ERR_PTR(-EINVAL);
 
-	if (check_add_overflow((unsigned long)uptr, length, &end))
-		return ERR_PTR(-EOVERFLOW);
-
 	pages = kzalloc(sizeof(*pages), GFP_KERNEL_ACCOUNT);
 	if (!pages)
 		return ERR_PTR(-ENOMEM);
@@ -1164,8 +1161,7 @@ struct iopt_pages *iopt_alloc_pages(void __user *uptr, unsigned long length,
 	mutex_init(&pages->mutex);
 	pages->source_mm = current->mm;
 	mmgrab(pages->source_mm);
-	pages->uptr = (void __user *)ALIGN_DOWN((uintptr_t)uptr, PAGE_SIZE);
-	pages->npages = DIV_ROUND_UP(length + (uptr - pages->uptr), PAGE_SIZE);
+	pages->npages = DIV_ROUND_UP(length + start_byte, PAGE_SIZE);
 	pages->access_itree = RB_ROOT_CACHED;
 	pages->domains_itree = RB_ROOT_CACHED;
 	pages->writable = writable;
@@ -1179,6 +1175,25 @@ struct iopt_pages *iopt_alloc_pages(void __user *uptr, unsigned long length,
 	return pages;
 }
 
+struct iopt_pages *iopt_alloc_user_pages(void __user *uptr,
+					 unsigned long length, bool writable)
+{
+	struct iopt_pages *pages;
+	unsigned long end;
+	void __user *uptr_down =
+		(void __user *) ALIGN_DOWN((uintptr_t)uptr, PAGE_SIZE);
+
+	if (check_add_overflow((unsigned long)uptr, length, &end))
+		return ERR_PTR(-EOVERFLOW);
+
+	pages = iopt_alloc_pages(length, uptr - uptr_down, writable);
+	if (IS_ERR(pages))
+		return pages;
+	pages->uptr = uptr_down;
+	pages->type = IOPT_ADDRESS_USER;
+	return pages;
+}
+
 void iopt_release_pages(struct kref *kref)
 {
 	struct iopt_pages *pages = container_of(kref, struct iopt_pages, kref);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH V5 4/9] iommufd: pfn reader local variables
  2024-10-22 21:00 [PATCH V5 0/9] iommu_ioas_map_file Steve Sistare
                   ` (2 preceding siblings ...)
  2024-10-22 21:00 ` [PATCH V5 3/9] iommufd: generalize iopt_pages address Steve Sistare
@ 2024-10-22 21:00 ` Steve Sistare
  2024-10-23  7:09   ` Tian, Kevin
  2024-10-22 21:00 ` [PATCH V5 5/9] iommufd: folio subroutines Steve Sistare
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 33+ messages in thread
From: Steve Sistare @ 2024-10-22 21:00 UTC (permalink / raw)
  To: iommu; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen, Steve Sistare

Add local variables for common sub-expressions needed by a subsequent
patch.  No functional change.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/iommufd/pages.c | 25 ++++++++++++++-----------
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
index aa69c97..0cf5b65 100644
--- a/drivers/iommu/iommufd/pages.c
+++ b/drivers/iommu/iommufd/pages.c
@@ -978,6 +978,8 @@ static int pfn_reader_fill_span(struct pfn_reader *pfns)
 {
 	struct interval_tree_double_span_iter *span = &pfns->span;
 	unsigned long start_index = pfns->batch_end_index;
+	struct pfn_reader_user *user = &pfns->user;
+	unsigned long npages;
 	struct iopt_area *area;
 	int rc;
 
@@ -1015,10 +1017,9 @@ static int pfn_reader_fill_span(struct pfn_reader *pfns)
 			return rc;
 	}
 
-	batch_from_pages(&pfns->batch,
-			 pfns->user.upages +
-				 (start_index - pfns->user.upages_start),
-			 pfns->user.upages_end - start_index);
+	npages = user->upages_end - start_index;
+	start_index -= user->upages_start;
+	batch_from_pages(&pfns->batch, user->upages + start_index, npages);
 	return 0;
 }
 
@@ -1092,16 +1093,18 @@ static int pfn_reader_init(struct pfn_reader *pfns, struct iopt_pages *pages,
 static void pfn_reader_release_pins(struct pfn_reader *pfns)
 {
 	struct iopt_pages *pages = pfns->pages;
+	struct pfn_reader_user *user = &pfns->user;
 
-	if (pfns->user.upages_end > pfns->batch_end_index) {
-		size_t npages = pfns->user.upages_end - pfns->batch_end_index;
-
+	if (user->upages_end > pfns->batch_end_index) {
 		/* Any pages not transferred to the batch are just unpinned */
-		unpin_user_pages(pfns->user.upages + (pfns->batch_end_index -
-						      pfns->user.upages_start),
-				 npages);
+
+		unsigned long npages = user->upages_end - pfns->batch_end_index;
+		unsigned long start_index = pfns->batch_end_index -
+					    user->upages_start;
+
+		unpin_user_pages(user->upages + start_index, npages);
 		iopt_pages_sub_npinned(pages, npages);
-		pfns->user.upages_end = pfns->batch_end_index;
+		user->upages_end = pfns->batch_end_index;
 	}
 	if (pfns->batch_start_index != pfns->batch_end_index) {
 		pfn_reader_unpin(pfns);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH V5 5/9] iommufd: folio subroutines
  2024-10-22 21:00 [PATCH V5 0/9] iommu_ioas_map_file Steve Sistare
                   ` (3 preceding siblings ...)
  2024-10-22 21:00 ` [PATCH V5 4/9] iommufd: pfn reader local variables Steve Sistare
@ 2024-10-22 21:00 ` Steve Sistare
  2024-10-23  7:16   ` Tian, Kevin
                     ` (2 more replies)
  2024-10-22 21:00 ` [PATCH V5 6/9] iommufd: pfn reader for file mappings Steve Sistare
                   ` (3 subsequent siblings)
  8 siblings, 3 replies; 33+ messages in thread
From: Steve Sistare @ 2024-10-22 21:00 UTC (permalink / raw)
  To: iommu; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen, Steve Sistare

Add subroutines for copying folios to a batch.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 drivers/iommu/iommufd/pages.c | 79 +++++++++++++++++++++++++++++++++++--------
 1 file changed, 64 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
index 0cf5b65..635b1fe 100644
--- a/drivers/iommu/iommufd/pages.c
+++ b/drivers/iommu/iommufd/pages.c
@@ -346,27 +346,41 @@ static void batch_destroy(struct pfn_batch *batch, void *backup)
 		kfree(batch->pfns);
 }
 
-/* true if the pfn was added, false otherwise */
-static bool batch_add_pfn(struct pfn_batch *batch, unsigned long pfn)
+static bool batch_add_pfn_num(struct pfn_batch *batch, unsigned long pfn,
+			      unsigned long nr)
 {
-	const unsigned int MAX_NPFNS = type_max(typeof(*batch->npfns));
+	const unsigned long MAX_NPFNS = type_max(typeof(*batch->npfns));
+	int i = batch->end;
 
-	if (batch->end &&
-	    pfn == batch->pfns[batch->end - 1] + batch->npfns[batch->end - 1] &&
-	    batch->npfns[batch->end - 1] != MAX_NPFNS) {
-		batch->npfns[batch->end - 1]++;
-		batch->total_pfns++;
-		return true;
-	}
-	if (batch->end == batch->array_size)
+	if (i && pfn == batch->pfns[i - 1] + batch->npfns[i - 1] &&
+	    nr <= MAX_NPFNS - batch->npfns[i - 1]) {
+		batch->npfns[i - 1] += nr;
+	} else if (i < batch->array_size) {
+		batch->pfns[i] = pfn;
+		batch->npfns[i] = nr;
+		batch->end++;
+	} else {
 		return false;
-	batch->total_pfns++;
-	batch->pfns[batch->end] = pfn;
-	batch->npfns[batch->end] = 1;
-	batch->end++;
+	}
+
+	batch->total_pfns += nr;
 	return true;
 }
 
+static void batch_remove_pfn_num(struct pfn_batch *batch, unsigned long nr)
+{
+	batch->npfns[batch->end - 1] -= nr;
+	if (batch->npfns[batch->end - 1] == 0)
+		batch->end--;
+	batch->total_pfns -= nr;
+}
+
+/* true if the pfn was added, false otherwise */
+static bool batch_add_pfn(struct pfn_batch *batch, unsigned long pfn)
+{
+	return batch_add_pfn_num(batch, pfn, 1);
+}
+
 /*
  * Fill the batch with pfns from the domain. When the batch is full, or it
  * reaches last_index, the function will return. The caller should use
@@ -622,6 +636,41 @@ static void batch_from_pages(struct pfn_batch *batch, struct page **pages,
 			break;
 }
 
+static int batch_from_folios(struct pfn_batch *batch, struct folio ***folios_p,
+			     unsigned long *offset_p, unsigned long npages)
+{
+	int rc = 0;
+	struct folio **folios = *folios_p;
+	unsigned long offset = *offset_p;
+
+	while (npages) {
+		struct folio *folio = *folios;
+		unsigned long nr = folio_nr_pages(folio) - offset;
+		unsigned long pfn = page_to_pfn(folio_page(folio, offset));
+
+		nr = min(nr, npages);
+		npages -= nr;
+
+		if (!batch_add_pfn_num(batch, pfn, nr))
+			break;
+		if (nr > 1) {
+			rc = folio_add_pins(folio, nr - 1);
+			if (rc) {
+				batch_remove_pfn_num(batch, nr);
+				goto out;
+			}
+		}
+
+		folios++;
+		offset = 0;
+	}
+
+out:
+	*folios_p = folios;
+	*offset_p = offset;
+	return rc;
+}
+
 static void batch_unpin(struct pfn_batch *batch, struct iopt_pages *pages,
 			unsigned int first_page_off, size_t npages)
 {
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH V5 6/9] iommufd: pfn reader for file mappings
  2024-10-22 21:00 [PATCH V5 0/9] iommu_ioas_map_file Steve Sistare
                   ` (4 preceding siblings ...)
  2024-10-22 21:00 ` [PATCH V5 5/9] iommufd: folio subroutines Steve Sistare
@ 2024-10-22 21:00 ` Steve Sistare
  2024-10-23  7:40   ` Tian, Kevin
  2024-10-23 13:24   ` Jason Gunthorpe
  2024-10-22 21:00 ` [PATCH V5 7/9] iommufd: IOMMU_IOAS_MAP_FILE Steve Sistare
                   ` (2 subsequent siblings)
  8 siblings, 2 replies; 33+ messages in thread
From: Steve Sistare @ 2024-10-22 21:00 UTC (permalink / raw)
  To: iommu; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen, Steve Sistare

Extend pfn_reader_user to pin file mappings, by calling memfd_pin_folios.
Repin at small page granularity, and fill the batch from folios.  Expand
folios to upages for the iopt_pages_fill path.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/iommufd/io_pagetable.h |   5 ++
 drivers/iommu/iommufd/pages.c        | 128 ++++++++++++++++++++++++++++++-----
 2 files changed, 116 insertions(+), 17 deletions(-)

diff --git a/drivers/iommu/iommufd/io_pagetable.h b/drivers/iommu/iommufd/io_pagetable.h
index 8e48266..5ac4eed 100644
--- a/drivers/iommu/iommufd/io_pagetable.h
+++ b/drivers/iommu/iommufd/io_pagetable.h
@@ -177,6 +177,7 @@ enum {
 
 enum iopt_address_type {
 	IOPT_ADDRESS_USER = 0,
+	IOPT_ADDRESS_FILE = 1,
 };
 
 /*
@@ -202,6 +203,10 @@ struct iopt_pages {
 	enum iopt_address_type type;
 	union {
 		void __user *uptr;		/* IOPT_ADDRESS_USER */
+		struct {			/* IOPT_ADDRESS_FILE */
+			struct file *file;
+			unsigned long start;
+		};
 	};
 	bool writable:1;
 	u8 account_mode;
diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
index 635b1fe..59d40c1 100644
--- a/drivers/iommu/iommufd/pages.c
+++ b/drivers/iommu/iommufd/pages.c
@@ -752,19 +752,32 @@ struct pfn_reader_user {
 	 * neither
 	 */
 	int locked;
+
+	/* The following are only valid if file != NULL. */
+	struct file *file;
+	struct folio **ufolios;
+	size_t ufolios_len;
+	unsigned long ufolios_offset;
+	struct folio **ufolios_next;
 };
 
 static void pfn_reader_user_init(struct pfn_reader_user *user,
 				 struct iopt_pages *pages)
 {
 	user->upages = NULL;
+	user->upages_len = 0;
 	user->upages_start = 0;
 	user->upages_end = 0;
 	user->locked = -1;
-
 	user->gup_flags = FOLL_LONGTERM;
 	if (pages->writable)
 		user->gup_flags |= FOLL_WRITE;
+
+	user->file = (pages->type == IOPT_ADDRESS_FILE) ? pages->file : NULL;
+	user->ufolios = NULL;
+	user->ufolios_len = 0;
+	user->ufolios_next = NULL;
+	user->ufolios_offset = 0;
 }
 
 static void pfn_reader_user_destroy(struct pfn_reader_user *user,
@@ -773,13 +786,67 @@ static void pfn_reader_user_destroy(struct pfn_reader_user *user,
 	if (user->locked != -1) {
 		if (user->locked)
 			mmap_read_unlock(pages->source_mm);
-		if (pages->source_mm != current->mm)
+		if (!user->file && pages->source_mm != current->mm)
 			mmput(pages->source_mm);
 		user->locked = -1;
 	}
 
 	kfree(user->upages);
 	user->upages = NULL;
+	kfree(user->ufolios);
+	user->ufolios = NULL;
+}
+
+static long pin_memfd_pages(struct pfn_reader_user *user, unsigned long start,
+			    unsigned long npages)
+{
+	unsigned long i;
+	unsigned long offset;
+	unsigned long npages_out = 0;
+	struct page **upages = user->upages;
+	unsigned long end = start + (npages << PAGE_SHIFT) - 1;
+	long nfolios = user->ufolios_len / sizeof(*user->ufolios);
+
+	/*
+	 * todo: memfd_pin_folios should return the last pinned offset so
+	 * we can compute npages pinned, and avoid looping over folios here
+	 * if upages == NULL.
+	 */
+	nfolios = memfd_pin_folios(user->file, start, end, user->ufolios,
+				   nfolios, &offset);
+	if (nfolios <= 0)
+		return nfolios;
+
+	offset >>= PAGE_SHIFT;
+	user->ufolios_next = user->ufolios;
+	user->ufolios_offset = offset;
+
+	for (i = 0; i < nfolios; i++) {
+		struct folio *folio = user->ufolios[i];
+		unsigned long nr = folio_nr_pages(folio);
+		unsigned long npin = min(nr - offset, npages);
+
+		npages -= npin;
+		npages_out += npin;
+
+		if (upages) {
+			if (npin == 1) {
+				*upages++ = folio_page(folio, offset);
+			} else {
+				int rc = folio_add_pins(folio, npin - 1);
+
+				if (rc)
+					return rc;
+
+				while (npin--)
+					*upages++ = folio_page(folio, offset++);
+			}
+		}
+
+		offset = 0;
+	}
+
+	return npages_out;
 }
 
 static int pfn_reader_user_pin(struct pfn_reader_user *user,
@@ -788,7 +855,9 @@ static int pfn_reader_user_pin(struct pfn_reader_user *user,
 			       unsigned long last_index)
 {
 	bool remote_mm = pages->source_mm != current->mm;
-	unsigned long npages;
+	unsigned long npages = last_index - start_index + 1;
+	unsigned long start;
+	unsigned long unum;
 	uintptr_t uptr;
 	long rc;
 
@@ -796,40 +865,50 @@ static int pfn_reader_user_pin(struct pfn_reader_user *user,
 	    WARN_ON(last_index < start_index))
 		return -EINVAL;
 
-	if (!user->upages) {
+	if (!user->file && !user->upages) {
 		/* All undone in pfn_reader_destroy() */
-		user->upages_len =
-			(last_index - start_index + 1) * sizeof(*user->upages);
+		user->upages_len = npages * sizeof(*user->upages);
 		user->upages = temp_kmalloc(&user->upages_len, NULL, 0);
 		if (!user->upages)
 			return -ENOMEM;
 	}
 
+	if (user->file && !user->ufolios) {
+		user->ufolios_len = npages * sizeof(*user->ufolios);
+		user->ufolios = temp_kmalloc(&user->ufolios_len, NULL, 0);
+		if (!user->ufolios)
+			return -ENOMEM;
+	}
+
 	if (user->locked == -1) {
 		/*
 		 * The majority of usages will run the map task within the mm
 		 * providing the pages, so we can optimize into
 		 * get_user_pages_fast()
 		 */
-		if (remote_mm) {
+		if (!user->file && remote_mm) {
 			if (!mmget_not_zero(pages->source_mm))
 				return -EFAULT;
 		}
 		user->locked = 0;
 	}
 
-	npages = min_t(unsigned long, last_index - start_index + 1,
-		       user->upages_len / sizeof(*user->upages));
-
+	unum = user->file ? user->ufolios_len / sizeof(*user->ufolios) :
+			    user->upages_len / sizeof(*user->upages);
+	npages = min_t(unsigned long, npages, unum);
 
 	if (iommufd_should_fail())
 		return -EFAULT;
 
-	uptr = (uintptr_t)(pages->uptr + start_index * PAGE_SIZE);
-	if (!remote_mm)
+	if (user->file) {
+		start = pages->start + (start_index * PAGE_SIZE);
+		rc = pin_memfd_pages(user, start, npages);
+	} else if (!remote_mm) {
+		uptr = (uintptr_t)(pages->uptr + start_index * PAGE_SIZE);
 		rc = pin_user_pages_fast(uptr, npages, user->gup_flags,
 					 user->upages);
-	else {
+	} else {
+		uptr = (uintptr_t)(pages->uptr + start_index * PAGE_SIZE);
 		if (!user->locked) {
 			mmap_read_lock(pages->source_mm);
 			user->locked = 1;
@@ -887,7 +966,8 @@ static int update_mm_locked_vm(struct iopt_pages *pages, unsigned long npages,
 		mmap_read_unlock(pages->source_mm);
 		user->locked = 0;
 		/* If we had the lock then we also have a get */
-	} else if ((!user || !user->upages) &&
+
+	} else if ((!user || (!user->upages && !user->ufolios)) &&
 		   pages->source_mm != current->mm) {
 		if (!mmget_not_zero(pages->source_mm))
 			return -EINVAL;
@@ -1068,8 +1148,15 @@ static int pfn_reader_fill_span(struct pfn_reader *pfns)
 
 	npages = user->upages_end - start_index;
 	start_index -= user->upages_start;
-	batch_from_pages(&pfns->batch, user->upages + start_index, npages);
-	return 0;
+	rc = 0;
+
+	if (!user->file)
+		batch_from_pages(&pfns->batch, user->upages + start_index,
+				 npages);
+	else
+		rc = batch_from_folios(&pfns->batch, &user->ufolios_next,
+				       &user->ufolios_offset, npages);
+	return rc;
 }
 
 static bool pfn_reader_done(struct pfn_reader *pfns)
@@ -1151,7 +1238,14 @@ static void pfn_reader_release_pins(struct pfn_reader *pfns)
 		unsigned long start_index = pfns->batch_end_index -
 					    user->upages_start;
 
-		unpin_user_pages(user->upages + start_index, npages);
+		if (!user->file) {
+			unpin_user_pages(user->upages + start_index, npages);
+		} else {
+			long n = user->ufolios_len / sizeof(*user->ufolios);
+
+			unpin_folios(user->ufolios_next,
+				     user->ufolios + n - user->ufolios_next);
+		}
 		iopt_pages_sub_npinned(pages, npages);
 		user->upages_end = pfns->batch_end_index;
 	}
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH V5 7/9] iommufd: IOMMU_IOAS_MAP_FILE
  2024-10-22 21:00 [PATCH V5 0/9] iommu_ioas_map_file Steve Sistare
                   ` (5 preceding siblings ...)
  2024-10-22 21:00 ` [PATCH V5 6/9] iommufd: pfn reader for file mappings Steve Sistare
@ 2024-10-22 21:00 ` Steve Sistare
  2024-10-23  7:45   ` Tian, Kevin
  2024-10-22 21:00 ` [PATCH V5 8/9] iommufd: file mappings for mdev Steve Sistare
  2024-10-22 21:00 ` [PATCH V5 9/9] iommufd: map file selftest Steve Sistare
  8 siblings, 1 reply; 33+ messages in thread
From: Steve Sistare @ 2024-10-22 21:00 UTC (permalink / raw)
  To: iommu; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen, Steve Sistare

Define the IOMMU_IOAS_MAP_FILE ioctl interface, which allows a user to
register memory by passing a memfd plus offset and length.  Implement it
using the memfd_pin_folios KAPI.

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/iommufd/io_pagetable.c    | 36 ++++++++++++++++++++++++++++++-
 drivers/iommu/iommufd/io_pagetable.h    |  2 ++
 drivers/iommu/iommufd/ioas.c            | 38 +++++++++++++++++++++++++++++++++
 drivers/iommu/iommufd/iommufd_private.h |  5 +++++
 drivers/iommu/iommufd/main.c            |  2 ++
 drivers/iommu/iommufd/pages.c           | 23 ++++++++++++++++++++
 include/uapi/linux/iommufd.h            | 25 ++++++++++++++++++++++
 7 files changed, 130 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iommufd/io_pagetable.c b/drivers/iommu/iommufd/io_pagetable.c
index 874ee9e..8a790e5 100644
--- a/drivers/iommu/iommufd/io_pagetable.c
+++ b/drivers/iommu/iommufd/io_pagetable.c
@@ -268,7 +268,14 @@ static int iopt_alloc_area_pages(struct io_pagetable *iopt,
 		/* Use the first entry to guess the ideal IOVA alignment */
 		elm = list_first_entry(pages_list, struct iopt_pages_list,
 				       next);
-		start = elm->start_byte + (uintptr_t)elm->pages->uptr;
+		switch (elm->pages->type) {
+		case IOPT_ADDRESS_USER:
+			start = elm->start_byte + (uintptr_t)elm->pages->uptr;
+			break;
+		case IOPT_ADDRESS_FILE:
+			start = elm->start_byte + elm->pages->start;
+			break;
+		}
 		rc = iopt_alloc_iova(iopt, dst_iova, start, length);
 		if (rc)
 			goto out_unlock;
@@ -446,6 +453,33 @@ int iopt_map_user_pages(struct iommufd_ctx *ictx, struct io_pagetable *iopt,
 			       uptr - pages->uptr, iommu_prot, flags);
 }
 
+/**
+ * iopt_map_file_pages() - Like iopt_map_user_pages, but map a file.
+ * @ictx: iommufd_ctx the iopt is part of
+ * @iopt: io_pagetable to act on
+ * @iova: If IOPT_ALLOC_IOVA is set this is unused on input and contains
+ *        the chosen iova on output. Otherwise is the iova to map to on input
+ * @file: file to map
+ * @start: map file starting at this byte offset
+ * @length: Number of bytes to map
+ * @iommu_prot: Combination of IOMMU_READ/WRITE/etc bits for the mapping
+ * @flags: IOPT_ALLOC_IOVA or zero
+ */
+int iopt_map_file_pages(struct iommufd_ctx *ictx, struct io_pagetable *iopt,
+			unsigned long *iova, struct file *file,
+			unsigned long start, unsigned long length,
+			int iommu_prot, unsigned int flags)
+{
+	struct iopt_pages *pages;
+
+	pages = iopt_alloc_file_pages(file, start, length,
+				      iommu_prot & IOMMU_WRITE);
+	if (IS_ERR(pages))
+		return PTR_ERR(pages);
+	return iopt_map_common(ictx, iopt, pages, iova, length,
+			       start - pages->start, iommu_prot, flags);
+}
+
 struct iova_bitmap_fn_arg {
 	unsigned long flags;
 	struct io_pagetable *iopt;
diff --git a/drivers/iommu/iommufd/io_pagetable.h b/drivers/iommu/iommufd/io_pagetable.h
index 5ac4eed..9b40b22 100644
--- a/drivers/iommu/iommufd/io_pagetable.h
+++ b/drivers/iommu/iommufd/io_pagetable.h
@@ -220,6 +220,8 @@ struct iopt_pages {
 
 struct iopt_pages *iopt_alloc_user_pages(void __user *uptr,
 					 unsigned long length, bool writable);
+struct iopt_pages *iopt_alloc_file_pages(struct file *file, unsigned long start,
+					 unsigned long length, bool writable);
 void iopt_release_pages(struct kref *kref);
 static inline void iopt_put_pages(struct iopt_pages *pages)
 {
diff --git a/drivers/iommu/iommufd/ioas.c b/drivers/iommu/iommufd/ioas.c
index 2c4b2bb..3d31bcf 100644
--- a/drivers/iommu/iommufd/ioas.c
+++ b/drivers/iommu/iommufd/ioas.c
@@ -2,6 +2,7 @@
 /*
  * Copyright (c) 2021-2022, NVIDIA CORPORATION & AFFILIATES
  */
+#include <linux/file.h>
 #include <linux/interval_tree.h>
 #include <linux/iommu.h>
 #include <linux/iommufd.h>
@@ -197,6 +198,43 @@ static int conv_iommu_prot(u32 map_flags)
 	return iommu_prot;
 }
 
+int iommufd_ioas_map_file(struct iommufd_ucmd *ucmd)
+{
+	struct iommu_ioas_map_file *cmd = ucmd->cmd;
+	unsigned long iova = cmd->iova;
+	struct iommufd_ioas *ioas;
+	unsigned int flags = 0;
+	int rc;
+	struct file *file;
+
+	if (cmd->iova >= ULONG_MAX || cmd->length >= ULONG_MAX)
+		return -EOVERFLOW;
+
+	ioas = iommufd_get_ioas(ucmd->ictx, cmd->ioas_id);
+	if (IS_ERR(ioas))
+		return PTR_ERR(ioas);
+
+	if (!(cmd->flags & IOMMU_IOAS_MAP_FIXED_IOVA))
+		flags = IOPT_ALLOC_IOVA;
+
+	file = fget(cmd->fd);
+	if (!file)
+		return -EBADF;
+
+	rc = iopt_map_file_pages(ucmd->ictx, &ioas->iopt, &iova, file,
+				 cmd->start, cmd->length,
+				 conv_iommu_prot(cmd->flags), flags);
+	if (rc)
+		goto out_put;
+
+	cmd->iova = iova;
+	rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
+out_put:
+	iommufd_put_object(ucmd->ictx, &ioas->obj);
+	fput(file);
+	return rc;
+}
+
 int iommufd_ioas_map(struct iommufd_ucmd *ucmd)
 {
 	struct iommu_ioas_map *cmd = ucmd->cmd;
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index f1d865e..8f3c21a 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -69,6 +69,10 @@ int iopt_map_user_pages(struct iommufd_ctx *ictx, struct io_pagetable *iopt,
 			unsigned long *iova, void __user *uptr,
 			unsigned long length, int iommu_prot,
 			unsigned int flags);
+int iopt_map_file_pages(struct iommufd_ctx *ictx, struct io_pagetable *iopt,
+			unsigned long *iova, struct file *file,
+			unsigned long start, unsigned long length,
+			int iommu_prot, unsigned int flags);
 int iopt_map_pages(struct io_pagetable *iopt, struct list_head *pages_list,
 		   unsigned long length, unsigned long *dst_iova,
 		   int iommu_prot, unsigned int flags);
@@ -276,6 +280,7 @@ static inline struct iommufd_ioas *iommufd_get_ioas(struct iommufd_ctx *ictx,
 int iommufd_ioas_iova_ranges(struct iommufd_ucmd *ucmd);
 int iommufd_ioas_allow_iovas(struct iommufd_ucmd *ucmd);
 int iommufd_ioas_map(struct iommufd_ucmd *ucmd);
+int iommufd_ioas_map_file(struct iommufd_ucmd *ucmd);
 int iommufd_ioas_copy(struct iommufd_ucmd *ucmd);
 int iommufd_ioas_unmap(struct iommufd_ucmd *ucmd);
 int iommufd_ioas_option(struct iommufd_ucmd *ucmd);
diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index b5f5d27..826a2b2 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -378,6 +378,8 @@ struct iommufd_ioctl_op {
 		 struct iommu_ioas_iova_ranges, out_iova_alignment),
 	IOCTL_OP(IOMMU_IOAS_MAP, iommufd_ioas_map, struct iommu_ioas_map,
 		 iova),
+	IOCTL_OP(IOMMU_IOAS_MAP_FILE, iommufd_ioas_map_file,
+		 struct iommu_ioas_map_file, iova),
 	IOCTL_OP(IOMMU_IOAS_UNMAP, iommufd_ioas_unmap, struct iommu_ioas_unmap,
 		 length),
 	IOCTL_OP(IOMMU_OPTION, iommufd_option, struct iommu_option,
diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
index 59d40c1..1b8c3bf 100644
--- a/drivers/iommu/iommufd/pages.c
+++ b/drivers/iommu/iommufd/pages.c
@@ -45,6 +45,7 @@
  * last_iova + 1 can overflow. An iopt_pages index will always be much less than
  * ULONG_MAX so last_index + 1 cannot overflow.
  */
+#include <linux/file.h>
 #include <linux/highmem.h>
 #include <linux/iommu.h>
 #include <linux/iommufd.h>
@@ -1340,6 +1341,26 @@ struct iopt_pages *iopt_alloc_user_pages(void __user *uptr,
 	return pages;
 }
 
+struct iopt_pages *iopt_alloc_file_pages(struct file *file, unsigned long start,
+					 unsigned long length, bool writable)
+
+{
+	struct iopt_pages *pages;
+	unsigned long start_down = ALIGN_DOWN(start, PAGE_SIZE);
+	unsigned long end;
+
+	if (length && check_add_overflow(start, length - 1, &end))
+		return ERR_PTR(-EOVERFLOW);
+
+	pages = iopt_alloc_pages(length, start - start_down, writable);
+	if (IS_ERR(pages))
+		return pages;
+	pages->file = get_file(file);
+	pages->start = start_down;
+	pages->type = IOPT_ADDRESS_FILE;
+	return pages;
+}
+
 void iopt_release_pages(struct kref *kref)
 {
 	struct iopt_pages *pages = container_of(kref, struct iopt_pages, kref);
@@ -1352,6 +1373,8 @@ void iopt_release_pages(struct kref *kref)
 	mutex_destroy(&pages->mutex);
 	put_task_struct(pages->source_task);
 	free_uid(pages->source_user);
+	if (pages->type == IOPT_ADDRESS_FILE)
+		fput(pages->file);
 	kfree(pages);
 }
 
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index 72010f7..a8f0d30 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -51,6 +51,7 @@ enum {
 	IOMMUFD_CMD_HWPT_GET_DIRTY_BITMAP = 0x8c,
 	IOMMUFD_CMD_HWPT_INVALIDATE = 0x8d,
 	IOMMUFD_CMD_FAULT_QUEUE_ALLOC = 0x8e,
+	IOMMUFD_CMD_IOAS_MAP_FILE = 0x8f,
 };
 
 /**
@@ -214,6 +215,30 @@ struct iommu_ioas_map {
 #define IOMMU_IOAS_MAP _IO(IOMMUFD_TYPE, IOMMUFD_CMD_IOAS_MAP)
 
 /**
+ * struct iommu_ioas_map_file - ioctl(IOMMU_IOAS_MAP_FILE)
+ * @size: sizeof(struct iommu_ioas_map_file)
+ * @fd: the memfd to map
+ * @start: byte offset from start of file to map from
+ * @flags: same as for iommu_ioas_map
+ * @ios_id: ditto
+ * @length: ditto
+ * @iova: ditto
+ *
+ * Set an IOVA mapping from a memfd file.  All other arguments and semantics
+ * match those of IOMMU_IOAS_MAP.
+ */
+struct iommu_ioas_map_file {
+	__u32 size;
+	__u32 flags;
+	__u32 ioas_id;
+	__s32 fd;
+	__aligned_u64 start;
+	__aligned_u64 length;
+	__aligned_u64 iova;
+};
+#define IOMMU_IOAS_MAP_FILE _IO(IOMMUFD_TYPE, IOMMUFD_CMD_IOAS_MAP_FILE)
+
+/**
  * struct iommu_ioas_copy - ioctl(IOMMU_IOAS_COPY)
  * @size: sizeof(struct iommu_ioas_copy)
  * @flags: Combination of enum iommufd_ioas_map_flags
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH V5 8/9] iommufd: file mappings for mdev
  2024-10-22 21:00 [PATCH V5 0/9] iommu_ioas_map_file Steve Sistare
                   ` (6 preceding siblings ...)
  2024-10-22 21:00 ` [PATCH V5 7/9] iommufd: IOMMU_IOAS_MAP_FILE Steve Sistare
@ 2024-10-22 21:00 ` Steve Sistare
  2024-10-23  8:01   ` Tian, Kevin
  2024-10-22 21:00 ` [PATCH V5 9/9] iommufd: map file selftest Steve Sistare
  8 siblings, 1 reply; 33+ messages in thread
From: Steve Sistare @ 2024-10-22 21:00 UTC (permalink / raw)
  To: iommu; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen, Steve Sistare

Support file mappings for mediated devices, aka mdevs.  Access is initiated
by the vfio_pin_pages and vfio_dma_rw kernel interfaces.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/iommufd/pages.c | 27 ++++++++++++++++++++-------
 1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
index 1b8c3bf..daf8fde 100644
--- a/drivers/iommu/iommufd/pages.c
+++ b/drivers/iommu/iommufd/pages.c
@@ -1814,11 +1814,11 @@ static int iopt_pages_fill_from_domain(struct iopt_pages *pages,
 	return 0;
 }
 
-static int iopt_pages_fill_from_mm(struct iopt_pages *pages,
-				   struct pfn_reader_user *user,
-				   unsigned long start_index,
-				   unsigned long last_index,
-				   struct page **out_pages)
+static int iopt_pages_fill(struct iopt_pages *pages,
+			   struct pfn_reader_user *user,
+			   unsigned long start_index,
+			   unsigned long last_index,
+			   struct page **out_pages)
 {
 	unsigned long cur_index = start_index;
 	int rc;
@@ -1892,8 +1892,8 @@ int iopt_pages_fill_xarray(struct iopt_pages *pages, unsigned long start_index,
 
 		/* hole */
 		cur_pages = out_pages + (span.start_hole - start_index);
-		rc = iopt_pages_fill_from_mm(pages, &user, span.start_hole,
-					     span.last_hole, cur_pages);
+		rc = iopt_pages_fill(pages, &user, span.start_hole,
+				     span.last_hole, cur_pages);
 		if (rc)
 			goto out_clean_xa;
 		rc = pages_to_xarray(&pages->pinned_pfns, span.start_hole,
@@ -1973,6 +1973,10 @@ static int iopt_pages_rw_page(struct iopt_pages *pages, unsigned long index,
 	struct page *page = NULL;
 	int rc;
 
+	if (IS_ENABLED(CONFIG_IOMMUFD_TEST) &&
+	    WARN_ON(pages->type != IOPT_ADDRESS_USER))
+		return -EINVAL;
+
 	if (!mmget_not_zero(pages->source_mm))
 		return iopt_pages_rw_slow(pages, index, index, offset, data,
 					  length, flags);
@@ -2028,6 +2032,15 @@ int iopt_pages_rw_access(struct iopt_pages *pages, unsigned long start_byte,
 	if ((flags & IOMMUFD_ACCESS_RW_WRITE) && !pages->writable)
 		return -EPERM;
 
+	if (pages->type == IOPT_ADDRESS_FILE)
+		return iopt_pages_rw_slow(pages, start_index, last_index,
+					  start_byte % PAGE_SIZE, data, length,
+					  flags);
+
+	if (IS_ENABLED(CONFIG_IOMMUFD_TEST) &&
+	    WARN_ON(pages->type != IOPT_ADDRESS_USER))
+		return -EINVAL;
+
 	if (!(flags & IOMMUFD_ACCESS_RW_KTHREAD) && change_mm) {
 		if (start_index == last_index)
 			return iopt_pages_rw_page(pages, start_index,
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH V5 9/9] iommufd: map file selftest
  2024-10-22 21:00 [PATCH V5 0/9] iommu_ioas_map_file Steve Sistare
                   ` (7 preceding siblings ...)
  2024-10-22 21:00 ` [PATCH V5 8/9] iommufd: file mappings for mdev Steve Sistare
@ 2024-10-22 21:00 ` Steve Sistare
  2024-10-23 13:52   ` Steven Sistare
  2024-10-23 20:15   ` Nicolin Chen
  8 siblings, 2 replies; 33+ messages in thread
From: Steve Sistare @ 2024-10-22 21:00 UTC (permalink / raw)
  To: iommu; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen, Steve Sistare

Add test cases to exercise IOMMU_IOAS_MAP_FILE.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 tools/testing/selftests/iommu/iommufd.c          | 136 ++++++++++++++++++++---
 tools/testing/selftests/iommu/iommufd_fail_nth.c |  39 +++++++
 tools/testing/selftests/iommu/iommufd_utils.h    |  57 ++++++++++
 3 files changed, 217 insertions(+), 15 deletions(-)

diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c
index 4927b9a..3d62951 100644
--- a/tools/testing/selftests/iommu/iommufd.c
+++ b/tools/testing/selftests/iommu/iommufd.c
@@ -3,6 +3,7 @@
 #include <stdlib.h>
 #include <sys/mman.h>
 #include <sys/eventfd.h>
+#include <asm/unistd.h>
 
 #define __EXPORTED_HEADERS__
 #include <linux/vfio.h>
@@ -49,6 +50,9 @@ static __attribute__((constructor)) void setup_sizes(void)
 	vrc = mmap(buffer, BUFFER_SIZE, PROT_READ | PROT_WRITE,
 		   MAP_SHARED | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
 	assert(vrc == buffer);
+
+	mfd_buffer = memfd_mmap(BUFFER_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED,
+				&mfd);
 }
 
 FIXTURE(iommufd)
@@ -128,6 +132,7 @@ static __attribute__((constructor)) void setup_sizes(void)
 	TEST_LENGTH(iommu_ioas_unmap, IOMMU_IOAS_UNMAP, length);
 	TEST_LENGTH(iommu_option, IOMMU_OPTION, val64);
 	TEST_LENGTH(iommu_vfio_ioas, IOMMU_VFIO_IOAS, __reserved);
+	TEST_LENGTH(iommu_ioas_map_file, IOMMU_IOAS_MAP_FILE, iova);
 #undef TEST_LENGTH
 }
 
@@ -1372,6 +1377,7 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
 {
 	unsigned int mock_domains;
 	bool hugepages;
+	bool file;
 };
 
 FIXTURE_SETUP(iommufd_mock_domain)
@@ -1410,26 +1416,45 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
 {
 	.mock_domains = 1,
 	.hugepages = false,
+	.file = false,
 };
 
 FIXTURE_VARIANT_ADD(iommufd_mock_domain, two_domains)
 {
 	.mock_domains = 2,
 	.hugepages = false,
+	.file = false,
 };
 
 FIXTURE_VARIANT_ADD(iommufd_mock_domain, one_domain_hugepage)
 {
 	.mock_domains = 1,
 	.hugepages = true,
+	.file = false,
 };
 
 FIXTURE_VARIANT_ADD(iommufd_mock_domain, two_domains_hugepage)
 {
 	.mock_domains = 2,
 	.hugepages = true,
+	.file = false,
 };
 
+FIXTURE_VARIANT_ADD(iommufd_mock_domain, one_domain_file)
+{
+	.mock_domains = 1,
+	.hugepages = false,
+	.file = true,
+};
+
+FIXTURE_VARIANT_ADD(iommufd_mock_domain, one_domain_file_hugepage)
+{
+	.mock_domains = 1,
+	.hugepages = true,
+	.file = true,
+};
+
+
 /* Have the kernel check that the user pages made it to the iommu_domain */
 #define check_mock_iova(_ptr, _iova, _length)                                \
 	({                                                                   \
@@ -1455,7 +1480,10 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
 		}                                                            \
 	})
 
-TEST_F(iommufd_mock_domain, basic)
+static void
+test_basic_mmap(struct __test_metadata *_metadata,
+		struct _test_data_iommufd_mock_domain *self,
+		const struct _fixture_variant_iommufd_mock_domain *variant)
 {
 	size_t buf_size = self->mmap_buf_size;
 	uint8_t *buf;
@@ -1478,12 +1506,52 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
 	test_err_ioctl_ioas_map(EFAULT, buf, buf_size, &iova);
 }
 
+static void
+test_basic_file(struct __test_metadata *_metadata,
+		struct _test_data_iommufd_mock_domain *self,
+		const struct _fixture_variant_iommufd_mock_domain *variant)
+{
+	size_t buf_size = self->mmap_buf_size;
+	uint8_t *buf;
+	__u64 iova;
+	int mfd_tmp;
+	int prot = PROT_READ | PROT_WRITE;
+
+	/* Simple one page map */
+	test_ioctl_ioas_map_file(mfd, 0, PAGE_SIZE, &iova);
+	check_mock_iova(mfd_buffer, iova, PAGE_SIZE);
+
+	buf = memfd_mmap(buf_size, prot, MAP_SHARED, &mfd_tmp);
+	ASSERT_NE(MAP_FAILED, buf);
+
+	/* EFAULT half way through mapping */
+	ASSERT_EQ(0, munmap(buf + buf_size / 2, buf_size / 2));
+	test_err_ioctl_ioas_map_file(EFAULT, 0, buf_size, &iova);
+
+	/* EFAULT on first page */
+	ASSERT_EQ(0, munmap(buf, buf_size / 2));
+	test_err_ioctl_ioas_map_file(EFAULT, 0, buf_size, &iova);
+
+	close(mfd_tmp);
+}
+
+TEST_F(iommufd_mock_domain, basic)
+{
+	if (variant->file)
+		test_basic_file(_metadata, self, variant);
+	else
+		test_basic_mmap(_metadata, self, variant);
+}
+
 TEST_F(iommufd_mock_domain, ro_unshare)
 {
 	uint8_t *buf;
 	__u64 iova;
 	int fd;
 
+	if (variant->file)
+		SKIP(return, "redundant test");
+
 	fd = open("/proc/self/exe", O_RDONLY);
 	ASSERT_NE(-1, fd);
 
@@ -1513,9 +1581,13 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
 	unsigned int start;
 	unsigned int end;
 	uint8_t *buf;
+	int prot = PROT_READ | PROT_WRITE;
+	int mfd;
 
-	buf = mmap(0, buf_size, PROT_READ | PROT_WRITE, self->mmap_flags, -1,
-		   0);
+	if (variant->file)
+		buf = memfd_mmap(buf_size, prot, MAP_SHARED, &mfd);
+	else
+		buf = mmap(0, buf_size, prot, self->mmap_flags, -1, 0);
 	ASSERT_NE(MAP_FAILED, buf);
 	check_refs(buf, buf_size, 0);
 
@@ -1532,7 +1604,12 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
 			size_t length = end - start;
 			__u64 iova;
 
-			test_ioctl_ioas_map(buf + start, length, &iova);
+			if (variant->file) {
+				test_ioctl_ioas_map_file(mfd, start, length,
+							 &iova);
+			} else {
+				test_ioctl_ioas_map(buf + start, length, &iova);
+			}
 			check_mock_iova(buf + start, iova, length);
 			check_refs(buf + start / PAGE_SIZE * PAGE_SIZE,
 				   end / PAGE_SIZE * PAGE_SIZE -
@@ -1544,6 +1621,8 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
 	}
 	check_refs(buf, buf_size, 0);
 	ASSERT_EQ(0, munmap(buf, buf_size));
+	if (variant->file)
+		close(mfd);
 }
 
 TEST_F(iommufd_mock_domain, all_aligns_copy)
@@ -1554,9 +1633,13 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
 	unsigned int start;
 	unsigned int end;
 	uint8_t *buf;
+	int prot = PROT_READ | PROT_WRITE;
+	int mfd;
 
-	buf = mmap(0, buf_size, PROT_READ | PROT_WRITE, self->mmap_flags, -1,
-		   0);
+	if (variant->file)
+		buf = memfd_mmap(buf_size, prot, MAP_SHARED, &mfd);
+	else
+		buf = mmap(0, buf_size, prot, self->mmap_flags, -1, 0);
 	ASSERT_NE(MAP_FAILED, buf);
 	check_refs(buf, buf_size, 0);
 
@@ -1575,7 +1658,12 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
 			uint32_t mock_stdev_id;
 			__u64 iova;
 
-			test_ioctl_ioas_map(buf + start, length, &iova);
+			if (variant->file) {
+				test_ioctl_ioas_map_file(mfd, start, length,
+							 &iova);
+			} else {
+				test_ioctl_ioas_map(buf + start, length, &iova);
+			}
 
 			/* Add and destroy a domain while the area exists */
 			old_id = self->hwpt_ids[1];
@@ -1596,15 +1684,18 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
 	}
 	check_refs(buf, buf_size, 0);
 	ASSERT_EQ(0, munmap(buf, buf_size));
+	if (variant->file)
+		close(mfd);
 }
 
 TEST_F(iommufd_mock_domain, user_copy)
 {
+	void *buf = variant->file ? mfd_buffer : buffer;
 	struct iommu_test_cmd access_cmd = {
 		.size = sizeof(access_cmd),
 		.op = IOMMU_TEST_OP_ACCESS_PAGES,
 		.access_pages = { .length = BUFFER_SIZE,
-				  .uptr = (uintptr_t)buffer },
+				  .uptr = (uintptr_t)buf },
 	};
 	struct iommu_ioas_copy copy_cmd = {
 		.size = sizeof(copy_cmd),
@@ -1623,9 +1714,13 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
 
 	/* Pin the pages in an IOAS with no domains then copy to an IOAS with domains */
 	test_ioctl_ioas_alloc(&ioas_id);
-	test_ioctl_ioas_map_id(ioas_id, buffer, BUFFER_SIZE,
-			       &copy_cmd.src_iova);
-
+	if (variant->file) {
+		test_ioctl_ioas_map_id_file(ioas_id, mfd, 0, BUFFER_SIZE,
+					    &copy_cmd.src_iova);
+	} else {
+		test_ioctl_ioas_map_id(ioas_id, buf, BUFFER_SIZE,
+				       &copy_cmd.src_iova);
+	}
 	test_cmd_create_access(ioas_id, &access_cmd.id,
 			       MOCK_FLAGS_ACCESS_CREATE_NEEDS_PIN_PAGES);
 
@@ -1635,12 +1730,17 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
 			&access_cmd));
 	copy_cmd.src_ioas_id = ioas_id;
 	ASSERT_EQ(0, ioctl(self->fd, IOMMU_IOAS_COPY, &copy_cmd));
-	check_mock_iova(buffer, MOCK_APERTURE_START, BUFFER_SIZE);
+	check_mock_iova(buf, MOCK_APERTURE_START, BUFFER_SIZE);
 
 	/* Now replace the ioas with a new one */
 	test_ioctl_ioas_alloc(&new_ioas_id);
-	test_ioctl_ioas_map_id(new_ioas_id, buffer, BUFFER_SIZE,
-			       &copy_cmd.src_iova);
+	if (variant->file) {
+		test_ioctl_ioas_map_id_file(new_ioas_id, mfd, 0, BUFFER_SIZE,
+					    &copy_cmd.src_iova);
+	} else {
+		test_ioctl_ioas_map_id(new_ioas_id, buf, BUFFER_SIZE,
+				       &copy_cmd.src_iova);
+	}
 	test_cmd_access_replace_ioas(access_cmd.id, new_ioas_id);
 
 	/* Destroy the old ioas and cleanup copied mapping */
@@ -1654,7 +1754,7 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
 			&access_cmd));
 	copy_cmd.src_ioas_id = new_ioas_id;
 	ASSERT_EQ(0, ioctl(self->fd, IOMMU_IOAS_COPY, &copy_cmd));
-	check_mock_iova(buffer, MOCK_APERTURE_START, BUFFER_SIZE);
+	check_mock_iova(buf, MOCK_APERTURE_START, BUFFER_SIZE);
 
 	test_cmd_destroy_access_pages(
 		access_cmd.id, access_cmd.access_pages.out_access_pages_id);
@@ -1667,6 +1767,9 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
 {
 	uint32_t ioas_id;
 
+	if (variant->file)
+		SKIP(return, "redundant test");
+
 	test_ioctl_ioas_alloc(&ioas_id);
 
 	test_cmd_mock_domain_replace(self->stdev_ids[0], ioas_id);
@@ -1697,6 +1800,9 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
 {
 	int i;
 
+	if (variant->file)
+		SKIP(return, "redundant test");
+
 	for (i = 0; i != variant->mock_domains; i++) {
 		uint32_t hwpt_id[2];
 		uint32_t stddev_id;
diff --git a/tools/testing/selftests/iommu/iommufd_fail_nth.c b/tools/testing/selftests/iommu/iommufd_fail_nth.c
index c5d5e69..111d43e 100644
--- a/tools/testing/selftests/iommu/iommufd_fail_nth.c
+++ b/tools/testing/selftests/iommu/iommufd_fail_nth.c
@@ -47,6 +47,9 @@ static __attribute__((constructor)) void setup_buffer(void)
 
 	buffer = mmap(0, BUFFER_SIZE, PROT_READ | PROT_WRITE,
 		      MAP_SHARED | MAP_ANONYMOUS, -1, 0);
+
+	mfd_buffer = memfd_mmap(BUFFER_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED,
+				&mfd);
 }
 
 /*
@@ -331,6 +334,42 @@ void __fail_nth_enable(struct __test_metadata *_metadata,
 	return 0;
 }
 
+/* iopt_area_fill_domains() and iopt_area_fill_domain() */
+TEST_FAIL_NTH(basic_fail_nth, map_file_domain)
+{
+	uint32_t ioas_id;
+	__u32 stdev_id;
+	__u32 hwpt_id;
+	__u64 iova;
+
+	self->fd = open("/dev/iommu", O_RDWR);
+	if (self->fd == -1)
+		return -1;
+
+	if (_test_ioctl_ioas_alloc(self->fd, &ioas_id))
+		return -1;
+
+	if (_test_ioctl_set_temp_memory_limit(self->fd, 32))
+		return -1;
+
+	fail_nth_enable();
+
+	if (_test_cmd_mock_domain(self->fd, ioas_id, &stdev_id, &hwpt_id, NULL))
+		return -1;
+
+	if (_test_ioctl_ioas_map_file(self->fd, ioas_id, mfd, 0, 262144, &iova,
+				 IOMMU_IOAS_MAP_WRITEABLE |
+					 IOMMU_IOAS_MAP_READABLE))
+		return -1;
+
+	if (_test_ioctl_destroy(self->fd, stdev_id))
+		return -1;
+
+	if (_test_cmd_mock_domain(self->fd, ioas_id, &stdev_id, &hwpt_id, NULL))
+		return -1;
+	return 0;
+}
+
 TEST_FAIL_NTH(basic_fail_nth, map_two_domains)
 {
 	uint32_t ioas_id;
diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h
index 40f6f14..c708dc1 100644
--- a/tools/testing/selftests/iommu/iommufd_utils.h
+++ b/tools/testing/selftests/iommu/iommufd_utils.h
@@ -40,12 +40,28 @@ static inline bool test_bit(unsigned int nr, unsigned long *addr)
 static void *buffer;
 static unsigned long BUFFER_SIZE;
 
+static void *mfd_buffer;
+static int mfd;
+
 static unsigned long PAGE_SIZE;
 
 #define sizeof_field(TYPE, MEMBER) sizeof((((TYPE *)0)->MEMBER))
 #define offsetofend(TYPE, MEMBER) \
 	(offsetof(TYPE, MEMBER) + sizeof_field(TYPE, MEMBER))
 
+static inline void *memfd_mmap(size_t length, int prot, int flags, int *mfd_p)
+{
+	int mfd_flags = (flags & MAP_HUGETLB) ? MFD_HUGETLB : 0;
+	int mfd = memfd_create("buffer", mfd_flags);
+
+	if (mfd <= 0)
+		return MAP_FAILED;
+	if (ftruncate(mfd, length))
+		return MAP_FAILED;
+	*mfd_p = mfd;
+	return mmap(0, length, prot, flags, mfd, 0);
+}
+
 /*
  * Have the kernel check the refcount on pages. I don't know why a freshly
  * mmap'd anon non-compound page starts out with a ref of 3
@@ -589,6 +605,47 @@ static int _test_ioctl_ioas_unmap(int fd, unsigned int ioas_id, uint64_t iova,
 	EXPECT_ERRNO(_errno, _test_ioctl_ioas_unmap(self->fd, self->ioas_id, \
 						    iova, length, NULL))
 
+static int _test_ioctl_ioas_map_file(int fd, unsigned int ioas_id, int mfd,
+				     size_t start, size_t length, __u64 *iova,
+				     unsigned int flags)
+{
+	struct iommu_ioas_map_file cmd = {
+		.size = sizeof(cmd),
+		.flags = flags,
+		.ioas_id = ioas_id,
+		.fd = mfd,
+		.start = start,
+		.length = length,
+	};
+	int ret;
+
+	if (flags & IOMMU_IOAS_MAP_FIXED_IOVA)
+		cmd.iova = *iova;
+
+	ret = ioctl(fd, IOMMU_IOAS_MAP_FILE, &cmd);
+	*iova = cmd.iova;
+	return ret;
+}
+
+#define test_ioctl_ioas_map_file(mfd, start, length, iova_p)                 \
+	ASSERT_EQ(0, _test_ioctl_ioas_map_file(self->fd, self->ioas_id, mfd, \
+					       start, length, iova_p,        \
+					       IOMMU_IOAS_MAP_WRITEABLE |    \
+						   IOMMU_IOAS_MAP_READABLE))
+
+#define test_err_ioctl_ioas_map_file(_errno, start, length, iova_p)	     \
+	EXPECT_ERRNO(_errno,                                                 \
+		     _test_ioctl_ioas_map_file(self->fd, self->ioas_id, mfd, \
+					       start, length, iova_p,        \
+					       IOMMU_IOAS_MAP_WRITEABLE |    \
+						   IOMMU_IOAS_MAP_READABLE))
+
+#define test_ioctl_ioas_map_id_file(ioas_id, mfd, start, length, iova_p)     \
+	ASSERT_EQ(0, _test_ioctl_ioas_map_file(self->fd, ioas_id, mfd,       \
+					  start, length, iova_p,             \
+					  IOMMU_IOAS_MAP_WRITEABLE |         \
+						  IOMMU_IOAS_MAP_READABLE))
+
 static int _test_ioctl_set_temp_memory_limit(int fd, unsigned int limit)
 {
 	struct iommu_test_cmd memlimit_cmd = {
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* RE: [PATCH V5 2/9] iommufd: rename uptr in iopt_alloc_iova
  2024-10-22 21:00 ` [PATCH V5 2/9] iommufd: rename uptr in iopt_alloc_iova Steve Sistare
@ 2024-10-23  7:05   ` Tian, Kevin
  0 siblings, 0 replies; 33+ messages in thread
From: Tian, Kevin @ 2024-10-23  7:05 UTC (permalink / raw)
  To: Steve Sistare, iommu@lists.linux.dev; +Cc: Jason Gunthorpe, Nicolin Chen

> From: Steve Sistare <steven.sistare@oracle.com>
> Sent: Wednesday, October 23, 2024 5:01 AM
> 
> iopt_alloc_iova takes a uptr argument but only checks for its alignment.
> Generalize this to an unsigned address, which can be the offset from the
> start of a file in a subsequent patch.  No functional change.
> 
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [PATCH V5 3/9] iommufd: generalize iopt_pages address
  2024-10-22 21:00 ` [PATCH V5 3/9] iommufd: generalize iopt_pages address Steve Sistare
@ 2024-10-23  7:08   ` Tian, Kevin
  2024-10-23 13:33     ` Steven Sistare
  0 siblings, 1 reply; 33+ messages in thread
From: Tian, Kevin @ 2024-10-23  7:08 UTC (permalink / raw)
  To: Steve Sistare, iommu@lists.linux.dev; +Cc: Jason Gunthorpe, Nicolin Chen

> From: Steve Sistare <steven.sistare@oracle.com>
> Sent: Wednesday, October 23, 2024 5:01 AM
> 
> -struct iopt_pages *iopt_alloc_pages(void __user *uptr, unsigned long length,
> -				    bool writable)
> +static struct iopt_pages *iopt_alloc_pages(unsigned long length,
> +					   unsigned long start_byte,
> +					   bool writable)

it's more intuitive to put start_byte as the 1st parameter, followed
by length.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [PATCH V5 4/9] iommufd: pfn reader local variables
  2024-10-22 21:00 ` [PATCH V5 4/9] iommufd: pfn reader local variables Steve Sistare
@ 2024-10-23  7:09   ` Tian, Kevin
  0 siblings, 0 replies; 33+ messages in thread
From: Tian, Kevin @ 2024-10-23  7:09 UTC (permalink / raw)
  To: Steve Sistare, iommu@lists.linux.dev; +Cc: Jason Gunthorpe, Nicolin Chen

> From: Steve Sistare <steven.sistare@oracle.com>
> Sent: Wednesday, October 23, 2024 5:01 AM
> 
> Add local variables for common sub-expressions needed by a subsequent
> patch.  No functional change.
> 
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [PATCH V5 5/9] iommufd: folio subroutines
  2024-10-22 21:00 ` [PATCH V5 5/9] iommufd: folio subroutines Steve Sistare
@ 2024-10-23  7:16   ` Tian, Kevin
  2024-10-23  7:21   ` Tian, Kevin
  2024-10-23 13:01   ` Jason Gunthorpe
  2 siblings, 0 replies; 33+ messages in thread
From: Tian, Kevin @ 2024-10-23  7:16 UTC (permalink / raw)
  To: Steve Sistare, iommu@lists.linux.dev; +Cc: Jason Gunthorpe, Nicolin Chen

> From: Steve Sistare <steven.sistare@oracle.com>
> Sent: Wednesday, October 23, 2024 5:01 AM
> 
> -/* true if the pfn was added, false otherwise */
> -static bool batch_add_pfn(struct pfn_batch *batch, unsigned long pfn)
> +static bool batch_add_pfn_num(struct pfn_batch *batch, unsigned long pfn,
> +			      unsigned long nr)
>  {
> -	const unsigned int MAX_NPFNS = type_max(typeof(*batch->npfns));
> +	const unsigned long MAX_NPFNS = type_max(typeof(*batch->npfns));

why changing to 'unsigned long'?

except that,

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [PATCH V5 5/9] iommufd: folio subroutines
  2024-10-22 21:00 ` [PATCH V5 5/9] iommufd: folio subroutines Steve Sistare
  2024-10-23  7:16   ` Tian, Kevin
@ 2024-10-23  7:21   ` Tian, Kevin
  2024-10-23 13:03     ` Jason Gunthorpe
  2024-10-23 13:04     ` Steven Sistare
  2024-10-23 13:01   ` Jason Gunthorpe
  2 siblings, 2 replies; 33+ messages in thread
From: Tian, Kevin @ 2024-10-23  7:21 UTC (permalink / raw)
  To: Steve Sistare, iommu@lists.linux.dev; +Cc: Jason Gunthorpe, Nicolin Chen

> From: Steve Sistare <steven.sistare@oracle.com>
> Sent: Wednesday, October 23, 2024 5:01 AM
> 
> +static int batch_from_folios(struct pfn_batch *batch, struct folio ***folios_p,
> +			     unsigned long *offset_p, unsigned long npages)
> +{
> +	int rc = 0;
> +	struct folio **folios = *folios_p;
> +	unsigned long offset = *offset_p;
> +
> +	while (npages) {
> +		struct folio *folio = *folios;
> +		unsigned long nr = folio_nr_pages(folio) - offset;
> +		unsigned long pfn = page_to_pfn(folio_page(folio, offset));
> +
> +		nr = min(nr, npages);
> +		npages -= nr;
> +
> +		if (!batch_add_pfn_num(batch, pfn, nr))
> +			break;
> +		if (nr > 1) {
> +			rc = folio_add_pins(folio, nr - 1);
> +			if (rc) {
> +				batch_remove_pfn_num(batch, nr);
> +				goto out;
> +			}
> +		}

forgot to ask one thing here. Could you add a comment here why
additional pins is required only when batching more than 1 page?

it's not clear to a person w/o a lot of mm background. e.g. in patch1
folio_add_pins() is introduced for partial pin/unpin within an
already-pinned folio, but it's not mentioned to exclude such
operation on a single page...

> +
> +		folios++;
> +		offset = 0;
> +	}
> +
> +out:
> +	*folios_p = folios;
> +	*offset_p = offset;
> +	return rc;
> +}
> +
>  static void batch_unpin(struct pfn_batch *batch, struct iopt_pages *pages,
>  			unsigned int first_page_off, size_t npages)
>  {
> --
> 1.8.3.1
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [PATCH V5 6/9] iommufd: pfn reader for file mappings
  2024-10-22 21:00 ` [PATCH V5 6/9] iommufd: pfn reader for file mappings Steve Sistare
@ 2024-10-23  7:40   ` Tian, Kevin
  2024-10-23 13:06     ` Steven Sistare
  2024-10-23 13:15     ` Jason Gunthorpe
  2024-10-23 13:24   ` Jason Gunthorpe
  1 sibling, 2 replies; 33+ messages in thread
From: Tian, Kevin @ 2024-10-23  7:40 UTC (permalink / raw)
  To: Steve Sistare, iommu@lists.linux.dev; +Cc: Jason Gunthorpe, Nicolin Chen

> From: Steve Sistare <steven.sistare@oracle.com>
> Sent: Wednesday, October 23, 2024 5:01 AM
> 
> +
> +static long pin_memfd_pages(struct pfn_reader_user *user, unsigned long
> start,
> +			    unsigned long npages)
> +{
> +	unsigned long i;
> +	unsigned long offset;
> +	unsigned long npages_out = 0;
> +	struct page **upages = user->upages;
> +	unsigned long end = start + (npages << PAGE_SHIFT) - 1;
> +	long nfolios = user->ufolios_len / sizeof(*user->ufolios);

reverse Christmas tree

> +
> +	/*
> +	 * todo: memfd_pin_folios should return the last pinned offset so
> +	 * we can compute npages pinned, and avoid looping over folios here
> +	 * if upages == NULL.
> +	 */
> +	nfolios = memfd_pin_folios(user->file, start, end, user->ufolios,
> +				   nfolios, &offset);
> +	if (nfolios <= 0)
> +		return nfolios;
> +
> +	offset >>= PAGE_SHIFT;
> +	user->ufolios_next = user->ufolios;
> +	user->ufolios_offset = offset;
> +
> +	for (i = 0; i < nfolios; i++) {
> +		struct folio *folio = user->ufolios[i];
> +		unsigned long nr = folio_nr_pages(folio);
> +		unsigned long npin = min(nr - offset, npages);
> +
> +		npages -= npin;
> +		npages_out += npin;
> +
> +		if (upages) {
> +			if (npin == 1) {
> +				*upages++ = folio_page(folio, offset);
> +			} else {
> +				int rc = folio_add_pins(folio, npin - 1);
> +
> +				if (rc)
> +					return rc;
> +
> +				while (npin--)
> +					*upages++ = folio_page(folio,
> offset++);
> +			}
> +		}

dead code as user->upages is NULL for memfd (echoed below)?

> 
> @@ -796,40 +865,50 @@ static int pfn_reader_user_pin(struct
> pfn_reader_user *user,
>  	    WARN_ON(last_index < start_index))
>  		return -EINVAL;
> 
> -	if (!user->upages) {
> +	if (!user->file && !user->upages) {
>  		/* All undone in pfn_reader_destroy() */
> -		user->upages_len =
> -			(last_index - start_index + 1) * sizeof(*user->upages);
> +		user->upages_len = npages * sizeof(*user->upages);
>  		user->upages = temp_kmalloc(&user->upages_len, NULL, 0);
>  		if (!user->upages)
>  			return -ENOMEM;
>  	}
> 
> +	if (user->file && !user->ufolios) {
> +		user->ufolios_len = npages * sizeof(*user->ufolios);
> +		user->ufolios = temp_kmalloc(&user->ufolios_len, NULL, 0);
> +		if (!user->ufolios)
> +			return -ENOMEM;
> +	}


^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [PATCH V5 7/9] iommufd: IOMMU_IOAS_MAP_FILE
  2024-10-22 21:00 ` [PATCH V5 7/9] iommufd: IOMMU_IOAS_MAP_FILE Steve Sistare
@ 2024-10-23  7:45   ` Tian, Kevin
  2024-10-23 13:22     ` Steven Sistare
  0 siblings, 1 reply; 33+ messages in thread
From: Tian, Kevin @ 2024-10-23  7:45 UTC (permalink / raw)
  To: Steve Sistare, iommu@lists.linux.dev; +Cc: Jason Gunthorpe, Nicolin Chen

> From: Steve Sistare <steven.sistare@oracle.com>
> Sent: Wednesday, October 23, 2024 5:01 AM
> 
> +int iommufd_ioas_map_file(struct iommufd_ucmd *ucmd)
> +{
> +	struct iommu_ioas_map_file *cmd = ucmd->cmd;
> +	unsigned long iova = cmd->iova;
> +	struct iommufd_ioas *ioas;
> +	unsigned int flags = 0;
> +	int rc;
> +	struct file *file;
> +
> +	if (cmd->iova >= ULONG_MAX || cmd->length >= ULONG_MAX)
> +		return -EOVERFLOW;
> +
> +	ioas = iommufd_get_ioas(ucmd->ictx, cmd->ioas_id);
> +	if (IS_ERR(ioas))
> +		return PTR_ERR(ioas);
> +
> +	if (!(cmd->flags & IOMMU_IOAS_MAP_FIXED_IOVA))
> +		flags = IOPT_ALLOC_IOVA;

any reason skipping other checks in iommufd_ioas_map()?

	if ((cmd->flags &
	      ~(IOMMU_IOAS_MAP_FIXED_IOVA | IOMMU_IOAS_MAP_WRITEABLE |
	       IOMMU_IOAS_MAP_READABLE)) ||
	    cmd->__reserved)
		return -EOPNOTSUPP;

	if (!(cmd->flags &
	      (IOMMU_IOAS_MAP_WRITEABLE | IOMMU_IOAS_MAP_READABLE)))
		return -EINVAL;

> 
>  /**
> + * struct iommu_ioas_map_file - ioctl(IOMMU_IOAS_MAP_FILE)
> + * @size: sizeof(struct iommu_ioas_map_file)
> + * @fd: the memfd to map
> + * @start: byte offset from start of file to map from
> + * @flags: same as for iommu_ioas_map
> + * @ios_id: ditto

s/ios_id/ioas_id/

> + * @length: ditto
> + * @iova: ditto

prefer to duplicating "same as for iommu_ioas_map"

also adjust above lines in the same order as following fields.

> + *
> + * Set an IOVA mapping from a memfd file.  All other arguments and
> semantics
> + * match those of IOMMU_IOAS_MAP.
> + */
> +struct iommu_ioas_map_file {
> +	__u32 size;
> +	__u32 flags;
> +	__u32 ioas_id;
> +	__s32 fd;
> +	__aligned_u64 start;
> +	__aligned_u64 length;
> +	__aligned_u64 iova;
> +};
> +#define IOMMU_IOAS_MAP_FILE _IO(IOMMUFD_TYPE,
> IOMMUFD_CMD_IOAS_MAP_FILE)
> +
> +/**
>   * struct iommu_ioas_copy - ioctl(IOMMU_IOAS_COPY)
>   * @size: sizeof(struct iommu_ioas_copy)
>   * @flags: Combination of enum iommufd_ioas_map_flags
> --
> 1.8.3.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [PATCH V5 8/9] iommufd: file mappings for mdev
  2024-10-22 21:00 ` [PATCH V5 8/9] iommufd: file mappings for mdev Steve Sistare
@ 2024-10-23  8:01   ` Tian, Kevin
  2024-10-23 13:16     ` Jason Gunthorpe
  0 siblings, 1 reply; 33+ messages in thread
From: Tian, Kevin @ 2024-10-23  8:01 UTC (permalink / raw)
  To: Steve Sistare, iommu@lists.linux.dev; +Cc: Jason Gunthorpe, Nicolin Chen

> From: Steve Sistare <steven.sistare@oracle.com>
> Sent: Wednesday, October 23, 2024 5:01 AM
> 
> @@ -1973,6 +1973,10 @@ static int iopt_pages_rw_page(struct iopt_pages
> *pages, unsigned long index,
>  	struct page *page = NULL;
>  	int rc;
> 
> +	if (IS_ENABLED(CONFIG_IOMMUFD_TEST) &&
> +	    WARN_ON(pages->type != IOPT_ADDRESS_USER))
> +		return -EINVAL;
> +

is this check necessary? if yes why is it only enabled in test?

also it looks duplicated with the one in iopt_pages_rw_access().

>  	if (!mmget_not_zero(pages->source_mm))
>  		return iopt_pages_rw_slow(pages, index, index, offset, data,
>  					  length, flags);
> @@ -2028,6 +2032,15 @@ int iopt_pages_rw_access(struct iopt_pages
> *pages, unsigned long start_byte,
>  	if ((flags & IOMMUFD_ACCESS_RW_WRITE) && !pages->writable)
>  		return -EPERM;
> 
> +	if (pages->type == IOPT_ADDRESS_FILE)
> +		return iopt_pages_rw_slow(pages, start_index, last_index,
> +					  start_byte % PAGE_SIZE, data,
> length,
> +					  flags);

there are 3 invocations of iopt_pages_rw_slow() now, each counting
3 lines.

probably creating a goto label to reduce the lines.

> +
> +	if (IS_ENABLED(CONFIG_IOMMUFD_TEST) &&
> +	    WARN_ON(pages->type != IOPT_ADDRESS_USER))
> +		return -EINVAL;
> +
>  	if (!(flags & IOMMUFD_ACCESS_RW_KTHREAD) && change_mm) {
>  		if (start_index == last_index)
>  			return iopt_pages_rw_page(pages, start_index,
> --
> 1.8.3.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V5 5/9] iommufd: folio subroutines
  2024-10-22 21:00 ` [PATCH V5 5/9] iommufd: folio subroutines Steve Sistare
  2024-10-23  7:16   ` Tian, Kevin
  2024-10-23  7:21   ` Tian, Kevin
@ 2024-10-23 13:01   ` Jason Gunthorpe
  2 siblings, 0 replies; 33+ messages in thread
From: Jason Gunthorpe @ 2024-10-23 13:01 UTC (permalink / raw)
  To: Steve Sistare; +Cc: iommu, Kevin Tian, Nicolin Chen

On Tue, Oct 22, 2024 at 02:00:34PM -0700, Steve Sistare wrote:
> Add subroutines for copying folios to a batch.
> 
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
>  drivers/iommu/iommufd/pages.c | 79 +++++++++++++++++++++++++++++++++++--------
>  1 file changed, 64 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
> index 0cf5b65..635b1fe 100644
> --- a/drivers/iommu/iommufd/pages.c
> +++ b/drivers/iommu/iommufd/pages.c
> @@ -346,27 +346,41 @@ static void batch_destroy(struct pfn_batch *batch, void *backup)
>  		kfree(batch->pfns);
>  }
>  
> -/* true if the pfn was added, false otherwise */
> -static bool batch_add_pfn(struct pfn_batch *batch, unsigned long pfn)
> +static bool batch_add_pfn_num(struct pfn_batch *batch, unsigned long pfn,
> +			      unsigned long nr)

nr should be a u32 because that is iwhat npfns is

>  {
> -	const unsigned int MAX_NPFNS = type_max(typeof(*batch->npfns));
> +	const unsigned long MAX_NPFNS = type_max(typeof(*batch->npfns));
> +	int i = batch->end;

unsigned int end = batch->end;

Otherwise

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V5 5/9] iommufd: folio subroutines
  2024-10-23  7:21   ` Tian, Kevin
@ 2024-10-23 13:03     ` Jason Gunthorpe
  2024-10-24  6:14       ` Tian, Kevin
  2024-10-23 13:04     ` Steven Sistare
  1 sibling, 1 reply; 33+ messages in thread
From: Jason Gunthorpe @ 2024-10-23 13:03 UTC (permalink / raw)
  To: Tian, Kevin; +Cc: Steve Sistare, iommu@lists.linux.dev, Nicolin Chen

On Wed, Oct 23, 2024 at 07:21:05AM +0000, Tian, Kevin wrote:
> > From: Steve Sistare <steven.sistare@oracle.com>
> > Sent: Wednesday, October 23, 2024 5:01 AM
> > 
> > +static int batch_from_folios(struct pfn_batch *batch, struct folio ***folios_p,
> > +			     unsigned long *offset_p, unsigned long npages)
> > +{
> > +	int rc = 0;
> > +	struct folio **folios = *folios_p;
> > +	unsigned long offset = *offset_p;
> > +
> > +	while (npages) {
> > +		struct folio *folio = *folios;
> > +		unsigned long nr = folio_nr_pages(folio) - offset;
> > +		unsigned long pfn = page_to_pfn(folio_page(folio, offset));
> > +
> > +		nr = min(nr, npages);
> > +		npages -= nr;
> > +
> > +		if (!batch_add_pfn_num(batch, pfn, nr))
> > +			break;
> > +		if (nr > 1) {
> > +			rc = folio_add_pins(folio, nr - 1);
> > +			if (rc) {
> > +				batch_remove_pfn_num(batch, nr);
> > +				goto out;
> > +			}
> > +		}
> 
> forgot to ask one thing here. Could you add a comment here why
> additional pins is required only when batching more than 1 page?

Something like:

 The unpin path during unmap can unpin slices of a single folio, and
 broadly does not keep track of where the folio boundaries were. Since
 it treats everything as a single page transform the refcount on the
 folio from a per-folio ref to a per-page ref. This makes the unmap
 path uniform regardless of how the memory was pinned.

Jason

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V5 5/9] iommufd: folio subroutines
  2024-10-23  7:21   ` Tian, Kevin
  2024-10-23 13:03     ` Jason Gunthorpe
@ 2024-10-23 13:04     ` Steven Sistare
  1 sibling, 0 replies; 33+ messages in thread
From: Steven Sistare @ 2024-10-23 13:04 UTC (permalink / raw)
  To: Tian, Kevin, iommu@lists.linux.dev; +Cc: Jason Gunthorpe, Nicolin Chen

On 10/23/2024 3:21 AM, Tian, Kevin wrote:
>> From: Steve Sistare <steven.sistare@oracle.com>
>> Sent: Wednesday, October 23, 2024 5:01 AM
>>
>> +static int batch_from_folios(struct pfn_batch *batch, struct folio ***folios_p,
>> +			     unsigned long *offset_p, unsigned long npages)
>> +{
>> +	int rc = 0;
>> +	struct folio **folios = *folios_p;
>> +	unsigned long offset = *offset_p;
>> +
>> +	while (npages) {
>> +		struct folio *folio = *folios;
>> +		unsigned long nr = folio_nr_pages(folio) - offset;
>> +		unsigned long pfn = page_to_pfn(folio_page(folio, offset));
>> +
>> +		nr = min(nr, npages);
>> +		npages -= nr;
>> +
>> +		if (!batch_add_pfn_num(batch, pfn, nr))
>> +			break;
>> +		if (nr > 1) {
>> +			rc = folio_add_pins(folio, nr - 1);
>> +			if (rc) {
>> +				batch_remove_pfn_num(batch, nr);
>> +				goto out;
>> +			}
>> +		}
> 
> forgot to ask one thing here. Could you add a comment here why
> additional pins is required only when batching more than 1 page?
> 
> it's not clear to a person w/o a lot of mm background. e.g. in patch1
> folio_add_pins() is introduced for partial pin/unpin within an
> already-pinned folio, but it's not mentioned to exclude such
> operation on a single page...

We already gained 1 pin per folio from memfd_pin_folios.  When iommufd unmaps
the memory, it will unpin each (small) page once.  Thus we add 1 pin for each
page, less 1 for the initial pin.

- Steve

>> +
>> +		folios++;
>> +		offset = 0;
>> +	}
>> +
>> +out:
>> +	*folios_p = folios;
>> +	*offset_p = offset;
>> +	return rc;
>> +}
>> +
>>   static void batch_unpin(struct pfn_batch *batch, struct iopt_pages *pages,
>>   			unsigned int first_page_off, size_t npages)
>>   {
>> --
>> 1.8.3.1
>>
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V5 6/9] iommufd: pfn reader for file mappings
  2024-10-23  7:40   ` Tian, Kevin
@ 2024-10-23 13:06     ` Steven Sistare
  2024-10-24  6:17       ` Tian, Kevin
  2024-10-23 13:15     ` Jason Gunthorpe
  1 sibling, 1 reply; 33+ messages in thread
From: Steven Sistare @ 2024-10-23 13:06 UTC (permalink / raw)
  To: Tian, Kevin, iommu@lists.linux.dev; +Cc: Jason Gunthorpe, Nicolin Chen

On 10/23/2024 3:40 AM, Tian, Kevin wrote:
>> From: Steve Sistare <steven.sistare@oracle.com>
>> Sent: Wednesday, October 23, 2024 5:01 AM
>>
>> +
>> +static long pin_memfd_pages(struct pfn_reader_user *user, unsigned long
>> start,
>> +			    unsigned long npages)
>> +{
>> +	unsigned long i;
>> +	unsigned long offset;
>> +	unsigned long npages_out = 0;
>> +	struct page **upages = user->upages;
>> +	unsigned long end = start + (npages << PAGE_SHIFT) - 1;
>> +	long nfolios = user->ufolios_len / sizeof(*user->ufolios);
> 
> reverse Christmas tree
> 
>> +
>> +	/*
>> +	 * todo: memfd_pin_folios should return the last pinned offset so
>> +	 * we can compute npages pinned, and avoid looping over folios here
>> +	 * if upages == NULL.
>> +	 */
>> +	nfolios = memfd_pin_folios(user->file, start, end, user->ufolios,
>> +				   nfolios, &offset);
>> +	if (nfolios <= 0)
>> +		return nfolios;
>> +
>> +	offset >>= PAGE_SHIFT;
>> +	user->ufolios_next = user->ufolios;
>> +	user->ufolios_offset = offset;
>> +
>> +	for (i = 0; i < nfolios; i++) {
>> +		struct folio *folio = user->ufolios[i];
>> +		unsigned long nr = folio_nr_pages(folio);
>> +		unsigned long npin = min(nr - offset, npages);
>> +
>> +		npages -= npin;
>> +		npages_out += npin;
>> +
>> +		if (upages) {
>> +			if (npin == 1) {
>> +				*upages++ = folio_page(folio, offset);
>> +			} else {
>> +				int rc = folio_add_pins(folio, npin - 1);
>> +
>> +				if (rc)
>> +					return rc;
>> +
>> +				while (npin--)
>> +					*upages++ = folio_page(folio,
>> offset++);
>> +			}
>> +		}
> 
> dead code as user->upages is NULL for memfd (echoed below)?

user->upages is still used for mediated devices, assigned in iopt_pages_fill.
upages is filled directly, bypassing the batch.

- Steve

>> @@ -796,40 +865,50 @@ static int pfn_reader_user_pin(struct
>> pfn_reader_user *user,
>>   	    WARN_ON(last_index < start_index))
>>   		return -EINVAL;
>>
>> -	if (!user->upages) {
>> +	if (!user->file && !user->upages) {
>>   		/* All undone in pfn_reader_destroy() */
>> -		user->upages_len =
>> -			(last_index - start_index + 1) * sizeof(*user->upages);
>> +		user->upages_len = npages * sizeof(*user->upages);
>>   		user->upages = temp_kmalloc(&user->upages_len, NULL, 0);
>>   		if (!user->upages)
>>   			return -ENOMEM;
>>   	}
>>
>> +	if (user->file && !user->ufolios) {
>> +		user->ufolios_len = npages * sizeof(*user->ufolios);
>> +		user->ufolios = temp_kmalloc(&user->ufolios_len, NULL, 0);
>> +		if (!user->ufolios)
>> +			return -ENOMEM;
>> +	}
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V5 6/9] iommufd: pfn reader for file mappings
  2024-10-23  7:40   ` Tian, Kevin
  2024-10-23 13:06     ` Steven Sistare
@ 2024-10-23 13:15     ` Jason Gunthorpe
  1 sibling, 0 replies; 33+ messages in thread
From: Jason Gunthorpe @ 2024-10-23 13:15 UTC (permalink / raw)
  To: Tian, Kevin; +Cc: Steve Sistare, iommu@lists.linux.dev, Nicolin Chen

On Wed, Oct 23, 2024 at 07:40:34AM +0000, Tian, Kevin wrote:

> > +	for (i = 0; i < nfolios; i++) {
> > +		struct folio *folio = user->ufolios[i];
> > +		unsigned long nr = folio_nr_pages(folio);
> > +		unsigned long npin = min(nr - offset, npages);
> > +
> > +		npages -= npin;
> > +		npages_out += npin;
> > +
> > +		if (upages) {
> > +			if (npin == 1) {
> > +				*upages++ = folio_page(folio, offset);
> > +			} else {
> > +				int rc = folio_add_pins(folio, npin - 1);
> > +
> > +				if (rc)
> > +					return rc;
> > +
> > +				while (npin--)
> > +					*upages++ = folio_page(folio,
> > offset++);
> > +			}
> > +		}
> 
> dead code as user->upages is NULL for memfd (echoed below)?

It can be non-null if we go down the iopt_pages_fill() call chain. In
that case it will hold the output array for the access's page list for
mdev type drivers.

Jason

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V5 8/9] iommufd: file mappings for mdev
  2024-10-23  8:01   ` Tian, Kevin
@ 2024-10-23 13:16     ` Jason Gunthorpe
  0 siblings, 0 replies; 33+ messages in thread
From: Jason Gunthorpe @ 2024-10-23 13:16 UTC (permalink / raw)
  To: Tian, Kevin; +Cc: Steve Sistare, iommu@lists.linux.dev, Nicolin Chen

On Wed, Oct 23, 2024 at 08:01:56AM +0000, Tian, Kevin wrote:
> > From: Steve Sistare <steven.sistare@oracle.com>
> > Sent: Wednesday, October 23, 2024 5:01 AM
> > 
> > @@ -1973,6 +1973,10 @@ static int iopt_pages_rw_page(struct iopt_pages
> > *pages, unsigned long index,
> >  	struct page *page = NULL;
> >  	int rc;
> > 
> > +	if (IS_ENABLED(CONFIG_IOMMUFD_TEST) &&
> > +	    WARN_ON(pages->type != IOPT_ADDRESS_USER))
> > +		return -EINVAL;
> > +
>
> is this check necessary? if yes why is it only enabled in test?

This is the pattern for correctness assertions on performance paths

Jason

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V5 7/9] iommufd: IOMMU_IOAS_MAP_FILE
  2024-10-23  7:45   ` Tian, Kevin
@ 2024-10-23 13:22     ` Steven Sistare
  0 siblings, 0 replies; 33+ messages in thread
From: Steven Sistare @ 2024-10-23 13:22 UTC (permalink / raw)
  To: Tian, Kevin, iommu@lists.linux.dev; +Cc: Jason Gunthorpe, Nicolin Chen

On 10/23/2024 3:45 AM, Tian, Kevin wrote:
>> From: Steve Sistare <steven.sistare@oracle.com>
>> Sent: Wednesday, October 23, 2024 5:01 AM
>>
>> +int iommufd_ioas_map_file(struct iommufd_ucmd *ucmd)
>> +{
>> +	struct iommu_ioas_map_file *cmd = ucmd->cmd;
>> +	unsigned long iova = cmd->iova;
>> +	struct iommufd_ioas *ioas;
>> +	unsigned int flags = 0;
>> +	int rc;
>> +	struct file *file;
>> +
>> +	if (cmd->iova >= ULONG_MAX || cmd->length >= ULONG_MAX)
>> +		return -EOVERFLOW;
>> +
>> +	ioas = iommufd_get_ioas(ucmd->ictx, cmd->ioas_id);
>> +	if (IS_ERR(ioas))
>> +		return PTR_ERR(ioas);
>> +
>> +	if (!(cmd->flags & IOMMU_IOAS_MAP_FIXED_IOVA))
>> +		flags = IOPT_ALLOC_IOVA;
> 
> any reason skipping other checks in iommufd_ioas_map()?

No reason, just an oversight.  I will add them, thanks.

> 	if ((cmd->flags &
> 	      ~(IOMMU_IOAS_MAP_FIXED_IOVA | IOMMU_IOAS_MAP_WRITEABLE |
> 	       IOMMU_IOAS_MAP_READABLE)) ||
> 	    cmd->__reserved)
> 		return -EOPNOTSUPP;
> 
> 	if (!(cmd->flags &
> 	      (IOMMU_IOAS_MAP_WRITEABLE | IOMMU_IOAS_MAP_READABLE)))
> 		return -EINVAL;
> 
>>
>>   /**
>> + * struct iommu_ioas_map_file - ioctl(IOMMU_IOAS_MAP_FILE)
>> + * @size: sizeof(struct iommu_ioas_map_file)
>> + * @fd: the memfd to map
>> + * @start: byte offset from start of file to map from
>> + * @flags: same as for iommu_ioas_map
>> + * @ios_id: ditto
> 
> s/ios_id/ioas_id/
> 
>> + * @length: ditto
>> + * @iova: ditto
> 
> prefer to duplicating "same as for iommu_ioas_map"
> 
> also adjust above lines in the same order as following fields.

Will edit as suggested.

- Steve

>> + *
>> + * Set an IOVA mapping from a memfd file.  All other arguments and
>> semantics
>> + * match those of IOMMU_IOAS_MAP.
>> + */
>> +struct iommu_ioas_map_file {
>> +	__u32 size;
>> +	__u32 flags;
>> +	__u32 ioas_id;
>> +	__s32 fd;
>> +	__aligned_u64 start;
>> +	__aligned_u64 length;
>> +	__aligned_u64 iova;
>> +};
>> +#define IOMMU_IOAS_MAP_FILE _IO(IOMMUFD_TYPE,
>> IOMMUFD_CMD_IOAS_MAP_FILE)
>> +
>> +/**
>>    * struct iommu_ioas_copy - ioctl(IOMMU_IOAS_COPY)
>>    * @size: sizeof(struct iommu_ioas_copy)
>>    * @flags: Combination of enum iommufd_ioas_map_flags
>> --
>> 1.8.3.1
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V5 6/9] iommufd: pfn reader for file mappings
  2024-10-22 21:00 ` [PATCH V5 6/9] iommufd: pfn reader for file mappings Steve Sistare
  2024-10-23  7:40   ` Tian, Kevin
@ 2024-10-23 13:24   ` Jason Gunthorpe
  1 sibling, 0 replies; 33+ messages in thread
From: Jason Gunthorpe @ 2024-10-23 13:24 UTC (permalink / raw)
  To: Steve Sistare; +Cc: iommu, Kevin Tian, Nicolin Chen

On Tue, Oct 22, 2024 at 02:00:35PM -0700, Steve Sistare wrote:
> Extend pfn_reader_user to pin file mappings, by calling memfd_pin_folios.
> Repin at small page granularity, and fill the batch from folios.  Expand
> folios to upages for the iopt_pages_fill path.
> 
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/iommu/iommufd/io_pagetable.h |   5 ++
>  drivers/iommu/iommufd/pages.c        | 128 ++++++++++++++++++++++++++++++-----
>  2 files changed, 116 insertions(+), 17 deletions(-)

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V5 3/9] iommufd: generalize iopt_pages address
  2024-10-23  7:08   ` Tian, Kevin
@ 2024-10-23 13:33     ` Steven Sistare
  0 siblings, 0 replies; 33+ messages in thread
From: Steven Sistare @ 2024-10-23 13:33 UTC (permalink / raw)
  To: Tian, Kevin, iommu@lists.linux.dev; +Cc: Jason Gunthorpe, Nicolin Chen

On 10/23/2024 3:08 AM, Tian, Kevin wrote:
>> From: Steve Sistare <steven.sistare@oracle.com>
>> Sent: Wednesday, October 23, 2024 5:01 AM
>>
>> -struct iopt_pages *iopt_alloc_pages(void __user *uptr, unsigned long length,
>> -				    bool writable)
>> +static struct iopt_pages *iopt_alloc_pages(unsigned long length,
>> +					   unsigned long start_byte,
>> +					   bool writable)
> 
> it's more intuitive to put start_byte as the 1st parameter, followed
> by length.

Will do - steve

> Reviewed-by: Kevin Tian <kevin.tian@intel.com>


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V5 9/9] iommufd: map file selftest
  2024-10-22 21:00 ` [PATCH V5 9/9] iommufd: map file selftest Steve Sistare
@ 2024-10-23 13:52   ` Steven Sistare
  2024-10-23 20:15   ` Nicolin Chen
  1 sibling, 0 replies; 33+ messages in thread
From: Steven Sistare @ 2024-10-23 13:52 UTC (permalink / raw)
  To: iommu; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen

Who is willing to send an RB for this patch?
All new and old iommufd and iommufd_fail_nth tests pass.

- Steve

On 10/22/2024 5:00 PM, Steve Sistare wrote:
> Add test cases to exercise IOMMU_IOAS_MAP_FILE.
> 
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
>   tools/testing/selftests/iommu/iommufd.c          | 136 ++++++++++++++++++++---
>   tools/testing/selftests/iommu/iommufd_fail_nth.c |  39 +++++++
>   tools/testing/selftests/iommu/iommufd_utils.h    |  57 ++++++++++
>   3 files changed, 217 insertions(+), 15 deletions(-)
> 
> diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c
> index 4927b9a..3d62951 100644
> --- a/tools/testing/selftests/iommu/iommufd.c
> +++ b/tools/testing/selftests/iommu/iommufd.c
> @@ -3,6 +3,7 @@
>   #include <stdlib.h>
>   #include <sys/mman.h>
>   #include <sys/eventfd.h>
> +#include <asm/unistd.h>
>   
>   #define __EXPORTED_HEADERS__
>   #include <linux/vfio.h>
> @@ -49,6 +50,9 @@ static __attribute__((constructor)) void setup_sizes(void)
>   	vrc = mmap(buffer, BUFFER_SIZE, PROT_READ | PROT_WRITE,
>   		   MAP_SHARED | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
>   	assert(vrc == buffer);
> +
> +	mfd_buffer = memfd_mmap(BUFFER_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED,
> +				&mfd);
>   }
>   
>   FIXTURE(iommufd)
> @@ -128,6 +132,7 @@ static __attribute__((constructor)) void setup_sizes(void)
>   	TEST_LENGTH(iommu_ioas_unmap, IOMMU_IOAS_UNMAP, length);
>   	TEST_LENGTH(iommu_option, IOMMU_OPTION, val64);
>   	TEST_LENGTH(iommu_vfio_ioas, IOMMU_VFIO_IOAS, __reserved);
> +	TEST_LENGTH(iommu_ioas_map_file, IOMMU_IOAS_MAP_FILE, iova);
>   #undef TEST_LENGTH
>   }
>   
> @@ -1372,6 +1377,7 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
>   {
>   	unsigned int mock_domains;
>   	bool hugepages;
> +	bool file;
>   };
>   
>   FIXTURE_SETUP(iommufd_mock_domain)
> @@ -1410,26 +1416,45 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
>   {
>   	.mock_domains = 1,
>   	.hugepages = false,
> +	.file = false,
>   };
>   
>   FIXTURE_VARIANT_ADD(iommufd_mock_domain, two_domains)
>   {
>   	.mock_domains = 2,
>   	.hugepages = false,
> +	.file = false,
>   };
>   
>   FIXTURE_VARIANT_ADD(iommufd_mock_domain, one_domain_hugepage)
>   {
>   	.mock_domains = 1,
>   	.hugepages = true,
> +	.file = false,
>   };
>   
>   FIXTURE_VARIANT_ADD(iommufd_mock_domain, two_domains_hugepage)
>   {
>   	.mock_domains = 2,
>   	.hugepages = true,
> +	.file = false,
>   };
>   
> +FIXTURE_VARIANT_ADD(iommufd_mock_domain, one_domain_file)
> +{
> +	.mock_domains = 1,
> +	.hugepages = false,
> +	.file = true,
> +};
> +
> +FIXTURE_VARIANT_ADD(iommufd_mock_domain, one_domain_file_hugepage)
> +{
> +	.mock_domains = 1,
> +	.hugepages = true,
> +	.file = true,
> +};
> +
> +
>   /* Have the kernel check that the user pages made it to the iommu_domain */
>   #define check_mock_iova(_ptr, _iova, _length)                                \
>   	({                                                                   \
> @@ -1455,7 +1480,10 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
>   		}                                                            \
>   	})
>   
> -TEST_F(iommufd_mock_domain, basic)
> +static void
> +test_basic_mmap(struct __test_metadata *_metadata,
> +		struct _test_data_iommufd_mock_domain *self,
> +		const struct _fixture_variant_iommufd_mock_domain *variant)
>   {
>   	size_t buf_size = self->mmap_buf_size;
>   	uint8_t *buf;
> @@ -1478,12 +1506,52 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
>   	test_err_ioctl_ioas_map(EFAULT, buf, buf_size, &iova);
>   }
>   
> +static void
> +test_basic_file(struct __test_metadata *_metadata,
> +		struct _test_data_iommufd_mock_domain *self,
> +		const struct _fixture_variant_iommufd_mock_domain *variant)
> +{
> +	size_t buf_size = self->mmap_buf_size;
> +	uint8_t *buf;
> +	__u64 iova;
> +	int mfd_tmp;
> +	int prot = PROT_READ | PROT_WRITE;
> +
> +	/* Simple one page map */
> +	test_ioctl_ioas_map_file(mfd, 0, PAGE_SIZE, &iova);
> +	check_mock_iova(mfd_buffer, iova, PAGE_SIZE);
> +
> +	buf = memfd_mmap(buf_size, prot, MAP_SHARED, &mfd_tmp);
> +	ASSERT_NE(MAP_FAILED, buf);
> +
> +	/* EFAULT half way through mapping */
> +	ASSERT_EQ(0, munmap(buf + buf_size / 2, buf_size / 2));
> +	test_err_ioctl_ioas_map_file(EFAULT, 0, buf_size, &iova);
> +
> +	/* EFAULT on first page */
> +	ASSERT_EQ(0, munmap(buf, buf_size / 2));
> +	test_err_ioctl_ioas_map_file(EFAULT, 0, buf_size, &iova);
> +
> +	close(mfd_tmp);
> +}
> +
> +TEST_F(iommufd_mock_domain, basic)
> +{
> +	if (variant->file)
> +		test_basic_file(_metadata, self, variant);
> +	else
> +		test_basic_mmap(_metadata, self, variant);
> +}
> +
>   TEST_F(iommufd_mock_domain, ro_unshare)
>   {
>   	uint8_t *buf;
>   	__u64 iova;
>   	int fd;
>   
> +	if (variant->file)
> +		SKIP(return, "redundant test");
> +
>   	fd = open("/proc/self/exe", O_RDONLY);
>   	ASSERT_NE(-1, fd);
>   
> @@ -1513,9 +1581,13 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
>   	unsigned int start;
>   	unsigned int end;
>   	uint8_t *buf;
> +	int prot = PROT_READ | PROT_WRITE;
> +	int mfd;
>   
> -	buf = mmap(0, buf_size, PROT_READ | PROT_WRITE, self->mmap_flags, -1,
> -		   0);
> +	if (variant->file)
> +		buf = memfd_mmap(buf_size, prot, MAP_SHARED, &mfd);
> +	else
> +		buf = mmap(0, buf_size, prot, self->mmap_flags, -1, 0);
>   	ASSERT_NE(MAP_FAILED, buf);
>   	check_refs(buf, buf_size, 0);
>   
> @@ -1532,7 +1604,12 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
>   			size_t length = end - start;
>   			__u64 iova;
>   
> -			test_ioctl_ioas_map(buf + start, length, &iova);
> +			if (variant->file) {
> +				test_ioctl_ioas_map_file(mfd, start, length,
> +							 &iova);
> +			} else {
> +				test_ioctl_ioas_map(buf + start, length, &iova);
> +			}
>   			check_mock_iova(buf + start, iova, length);
>   			check_refs(buf + start / PAGE_SIZE * PAGE_SIZE,
>   				   end / PAGE_SIZE * PAGE_SIZE -
> @@ -1544,6 +1621,8 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
>   	}
>   	check_refs(buf, buf_size, 0);
>   	ASSERT_EQ(0, munmap(buf, buf_size));
> +	if (variant->file)
> +		close(mfd);
>   }
>   
>   TEST_F(iommufd_mock_domain, all_aligns_copy)
> @@ -1554,9 +1633,13 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
>   	unsigned int start;
>   	unsigned int end;
>   	uint8_t *buf;
> +	int prot = PROT_READ | PROT_WRITE;
> +	int mfd;
>   
> -	buf = mmap(0, buf_size, PROT_READ | PROT_WRITE, self->mmap_flags, -1,
> -		   0);
> +	if (variant->file)
> +		buf = memfd_mmap(buf_size, prot, MAP_SHARED, &mfd);
> +	else
> +		buf = mmap(0, buf_size, prot, self->mmap_flags, -1, 0);
>   	ASSERT_NE(MAP_FAILED, buf);
>   	check_refs(buf, buf_size, 0);
>   
> @@ -1575,7 +1658,12 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
>   			uint32_t mock_stdev_id;
>   			__u64 iova;
>   
> -			test_ioctl_ioas_map(buf + start, length, &iova);
> +			if (variant->file) {
> +				test_ioctl_ioas_map_file(mfd, start, length,
> +							 &iova);
> +			} else {
> +				test_ioctl_ioas_map(buf + start, length, &iova);
> +			}
>   
>   			/* Add and destroy a domain while the area exists */
>   			old_id = self->hwpt_ids[1];
> @@ -1596,15 +1684,18 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
>   	}
>   	check_refs(buf, buf_size, 0);
>   	ASSERT_EQ(0, munmap(buf, buf_size));
> +	if (variant->file)
> +		close(mfd);
>   }
>   
>   TEST_F(iommufd_mock_domain, user_copy)
>   {
> +	void *buf = variant->file ? mfd_buffer : buffer;
>   	struct iommu_test_cmd access_cmd = {
>   		.size = sizeof(access_cmd),
>   		.op = IOMMU_TEST_OP_ACCESS_PAGES,
>   		.access_pages = { .length = BUFFER_SIZE,
> -				  .uptr = (uintptr_t)buffer },
> +				  .uptr = (uintptr_t)buf },
>   	};
>   	struct iommu_ioas_copy copy_cmd = {
>   		.size = sizeof(copy_cmd),
> @@ -1623,9 +1714,13 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
>   
>   	/* Pin the pages in an IOAS with no domains then copy to an IOAS with domains */
>   	test_ioctl_ioas_alloc(&ioas_id);
> -	test_ioctl_ioas_map_id(ioas_id, buffer, BUFFER_SIZE,
> -			       &copy_cmd.src_iova);
> -
> +	if (variant->file) {
> +		test_ioctl_ioas_map_id_file(ioas_id, mfd, 0, BUFFER_SIZE,
> +					    &copy_cmd.src_iova);
> +	} else {
> +		test_ioctl_ioas_map_id(ioas_id, buf, BUFFER_SIZE,
> +				       &copy_cmd.src_iova);
> +	}
>   	test_cmd_create_access(ioas_id, &access_cmd.id,
>   			       MOCK_FLAGS_ACCESS_CREATE_NEEDS_PIN_PAGES);
>   
> @@ -1635,12 +1730,17 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
>   			&access_cmd));
>   	copy_cmd.src_ioas_id = ioas_id;
>   	ASSERT_EQ(0, ioctl(self->fd, IOMMU_IOAS_COPY, &copy_cmd));
> -	check_mock_iova(buffer, MOCK_APERTURE_START, BUFFER_SIZE);
> +	check_mock_iova(buf, MOCK_APERTURE_START, BUFFER_SIZE);
>   
>   	/* Now replace the ioas with a new one */
>   	test_ioctl_ioas_alloc(&new_ioas_id);
> -	test_ioctl_ioas_map_id(new_ioas_id, buffer, BUFFER_SIZE,
> -			       &copy_cmd.src_iova);
> +	if (variant->file) {
> +		test_ioctl_ioas_map_id_file(new_ioas_id, mfd, 0, BUFFER_SIZE,
> +					    &copy_cmd.src_iova);
> +	} else {
> +		test_ioctl_ioas_map_id(new_ioas_id, buf, BUFFER_SIZE,
> +				       &copy_cmd.src_iova);
> +	}
>   	test_cmd_access_replace_ioas(access_cmd.id, new_ioas_id);
>   
>   	/* Destroy the old ioas and cleanup copied mapping */
> @@ -1654,7 +1754,7 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
>   			&access_cmd));
>   	copy_cmd.src_ioas_id = new_ioas_id;
>   	ASSERT_EQ(0, ioctl(self->fd, IOMMU_IOAS_COPY, &copy_cmd));
> -	check_mock_iova(buffer, MOCK_APERTURE_START, BUFFER_SIZE);
> +	check_mock_iova(buf, MOCK_APERTURE_START, BUFFER_SIZE);
>   
>   	test_cmd_destroy_access_pages(
>   		access_cmd.id, access_cmd.access_pages.out_access_pages_id);
> @@ -1667,6 +1767,9 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
>   {
>   	uint32_t ioas_id;
>   
> +	if (variant->file)
> +		SKIP(return, "redundant test");
> +
>   	test_ioctl_ioas_alloc(&ioas_id);
>   
>   	test_cmd_mock_domain_replace(self->stdev_ids[0], ioas_id);
> @@ -1697,6 +1800,9 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
>   {
>   	int i;
>   
> +	if (variant->file)
> +		SKIP(return, "redundant test");
> +
>   	for (i = 0; i != variant->mock_domains; i++) {
>   		uint32_t hwpt_id[2];
>   		uint32_t stddev_id;
> diff --git a/tools/testing/selftests/iommu/iommufd_fail_nth.c b/tools/testing/selftests/iommu/iommufd_fail_nth.c
> index c5d5e69..111d43e 100644
> --- a/tools/testing/selftests/iommu/iommufd_fail_nth.c
> +++ b/tools/testing/selftests/iommu/iommufd_fail_nth.c
> @@ -47,6 +47,9 @@ static __attribute__((constructor)) void setup_buffer(void)
>   
>   	buffer = mmap(0, BUFFER_SIZE, PROT_READ | PROT_WRITE,
>   		      MAP_SHARED | MAP_ANONYMOUS, -1, 0);
> +
> +	mfd_buffer = memfd_mmap(BUFFER_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED,
> +				&mfd);
>   }
>   
>   /*
> @@ -331,6 +334,42 @@ void __fail_nth_enable(struct __test_metadata *_metadata,
>   	return 0;
>   }
>   
> +/* iopt_area_fill_domains() and iopt_area_fill_domain() */
> +TEST_FAIL_NTH(basic_fail_nth, map_file_domain)
> +{
> +	uint32_t ioas_id;
> +	__u32 stdev_id;
> +	__u32 hwpt_id;
> +	__u64 iova;
> +
> +	self->fd = open("/dev/iommu", O_RDWR);
> +	if (self->fd == -1)
> +		return -1;
> +
> +	if (_test_ioctl_ioas_alloc(self->fd, &ioas_id))
> +		return -1;
> +
> +	if (_test_ioctl_set_temp_memory_limit(self->fd, 32))
> +		return -1;
> +
> +	fail_nth_enable();
> +
> +	if (_test_cmd_mock_domain(self->fd, ioas_id, &stdev_id, &hwpt_id, NULL))
> +		return -1;
> +
> +	if (_test_ioctl_ioas_map_file(self->fd, ioas_id, mfd, 0, 262144, &iova,
> +				 IOMMU_IOAS_MAP_WRITEABLE |
> +					 IOMMU_IOAS_MAP_READABLE))
> +		return -1;
> +
> +	if (_test_ioctl_destroy(self->fd, stdev_id))
> +		return -1;
> +
> +	if (_test_cmd_mock_domain(self->fd, ioas_id, &stdev_id, &hwpt_id, NULL))
> +		return -1;
> +	return 0;
> +}
> +
>   TEST_FAIL_NTH(basic_fail_nth, map_two_domains)
>   {
>   	uint32_t ioas_id;
> diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h
> index 40f6f14..c708dc1 100644
> --- a/tools/testing/selftests/iommu/iommufd_utils.h
> +++ b/tools/testing/selftests/iommu/iommufd_utils.h
> @@ -40,12 +40,28 @@ static inline bool test_bit(unsigned int nr, unsigned long *addr)
>   static void *buffer;
>   static unsigned long BUFFER_SIZE;
>   
> +static void *mfd_buffer;
> +static int mfd;
> +
>   static unsigned long PAGE_SIZE;
>   
>   #define sizeof_field(TYPE, MEMBER) sizeof((((TYPE *)0)->MEMBER))
>   #define offsetofend(TYPE, MEMBER) \
>   	(offsetof(TYPE, MEMBER) + sizeof_field(TYPE, MEMBER))
>   
> +static inline void *memfd_mmap(size_t length, int prot, int flags, int *mfd_p)
> +{
> +	int mfd_flags = (flags & MAP_HUGETLB) ? MFD_HUGETLB : 0;
> +	int mfd = memfd_create("buffer", mfd_flags);
> +
> +	if (mfd <= 0)
> +		return MAP_FAILED;
> +	if (ftruncate(mfd, length))
> +		return MAP_FAILED;
> +	*mfd_p = mfd;
> +	return mmap(0, length, prot, flags, mfd, 0);
> +}
> +
>   /*
>    * Have the kernel check the refcount on pages. I don't know why a freshly
>    * mmap'd anon non-compound page starts out with a ref of 3
> @@ -589,6 +605,47 @@ static int _test_ioctl_ioas_unmap(int fd, unsigned int ioas_id, uint64_t iova,
>   	EXPECT_ERRNO(_errno, _test_ioctl_ioas_unmap(self->fd, self->ioas_id, \
>   						    iova, length, NULL))
>   
> +static int _test_ioctl_ioas_map_file(int fd, unsigned int ioas_id, int mfd,
> +				     size_t start, size_t length, __u64 *iova,
> +				     unsigned int flags)
> +{
> +	struct iommu_ioas_map_file cmd = {
> +		.size = sizeof(cmd),
> +		.flags = flags,
> +		.ioas_id = ioas_id,
> +		.fd = mfd,
> +		.start = start,
> +		.length = length,
> +	};
> +	int ret;
> +
> +	if (flags & IOMMU_IOAS_MAP_FIXED_IOVA)
> +		cmd.iova = *iova;
> +
> +	ret = ioctl(fd, IOMMU_IOAS_MAP_FILE, &cmd);
> +	*iova = cmd.iova;
> +	return ret;
> +}
> +
> +#define test_ioctl_ioas_map_file(mfd, start, length, iova_p)                 \
> +	ASSERT_EQ(0, _test_ioctl_ioas_map_file(self->fd, self->ioas_id, mfd, \
> +					       start, length, iova_p,        \
> +					       IOMMU_IOAS_MAP_WRITEABLE |    \
> +						   IOMMU_IOAS_MAP_READABLE))
> +
> +#define test_err_ioctl_ioas_map_file(_errno, start, length, iova_p)	     \
> +	EXPECT_ERRNO(_errno,                                                 \
> +		     _test_ioctl_ioas_map_file(self->fd, self->ioas_id, mfd, \
> +					       start, length, iova_p,        \
> +					       IOMMU_IOAS_MAP_WRITEABLE |    \
> +						   IOMMU_IOAS_MAP_READABLE))
> +
> +#define test_ioctl_ioas_map_id_file(ioas_id, mfd, start, length, iova_p)     \
> +	ASSERT_EQ(0, _test_ioctl_ioas_map_file(self->fd, ioas_id, mfd,       \
> +					  start, length, iova_p,             \
> +					  IOMMU_IOAS_MAP_WRITEABLE |         \
> +						  IOMMU_IOAS_MAP_READABLE))
> +
>   static int _test_ioctl_set_temp_memory_limit(int fd, unsigned int limit)
>   {
>   	struct iommu_test_cmd memlimit_cmd = {


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V5 9/9] iommufd: map file selftest
  2024-10-22 21:00 ` [PATCH V5 9/9] iommufd: map file selftest Steve Sistare
  2024-10-23 13:52   ` Steven Sistare
@ 2024-10-23 20:15   ` Nicolin Chen
  2024-10-23 20:45     ` Steven Sistare
  1 sibling, 1 reply; 33+ messages in thread
From: Nicolin Chen @ 2024-10-23 20:15 UTC (permalink / raw)
  To: Steve Sistare; +Cc: iommu, Jason Gunthorpe, Kevin Tian

On Tue, Oct 22, 2024 at 02:00:38PM -0700, Steve Sistare wrote:

> Add test cases to exercise IOMMU_IOAS_MAP_FILE.
> 
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>

LGTM overall.

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>

With some nits:

> ---
>  tools/testing/selftests/iommu/iommufd.c          | 136 ++++++++++++++++++++---
>  tools/testing/selftests/iommu/iommufd_fail_nth.c |  39 +++++++
>  tools/testing/selftests/iommu/iommufd_utils.h    |  57 ++++++++++
>  3 files changed, 217 insertions(+), 15 deletions(-)
> 
> diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c
> index 4927b9a..3d62951 100644
> --- a/tools/testing/selftests/iommu/iommufd.c
> +++ b/tools/testing/selftests/iommu/iommufd.c
> @@ -3,6 +3,7 @@
>  #include <stdlib.h>
>  #include <sys/mman.h>
>  #include <sys/eventfd.h>
> +#include <asm/unistd.h>

Can we add it to the top of the list for an alphabetical order?

>  TEST_F(iommufd_mock_domain, ro_unshare)
>  {
>         uint8_t *buf;
>         __u64 iova;
>         int fd;
> 
> +       if (variant->file)
> +               SKIP(return, "redundant test");

Why is it redundant? A line of comments maybe?

> +/* iopt_area_fill_domains() and iopt_area_fill_domain() */
> +TEST_FAIL_NTH(basic_fail_nth, map_file_domain)
> +{
[...]
> +       if (_test_ioctl_ioas_map_file(self->fd, ioas_id, mfd, 0, 262144, &iova,
> +                                IOMMU_IOAS_MAP_WRITEABLE |
> +                                        IOMMU_IOAS_MAP_READABLE))
[...]
> +#define test_ioctl_ioas_map_id_file(ioas_id, mfd, start, length, iova_p)     \
> +       ASSERT_EQ(0, _test_ioctl_ioas_map_file(self->fd, ioas_id, mfd,       \
> +                                         start, length, iova_p,             \
> +                                         IOMMU_IOAS_MAP_WRITEABLE |         \
> +                                                 IOMMU_IOAS_MAP_READABLE))
> +

The indentations are a bit odd here. Would you please do git-
clang-format to fix them?

Thanks
Nicolin

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V5 9/9] iommufd: map file selftest
  2024-10-23 20:15   ` Nicolin Chen
@ 2024-10-23 20:45     ` Steven Sistare
  2024-10-24  0:36       ` Nicolin Chen
  0 siblings, 1 reply; 33+ messages in thread
From: Steven Sistare @ 2024-10-23 20:45 UTC (permalink / raw)
  To: Nicolin Chen; +Cc: iommu, Jason Gunthorpe, Kevin Tian

Thanks very much!  I will fix all the nits.

- Steve

On 10/23/2024 4:15 PM, Nicolin Chen wrote:
> On Tue, Oct 22, 2024 at 02:00:38PM -0700, Steve Sistare wrote:
> 
>> Add test cases to exercise IOMMU_IOAS_MAP_FILE.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> 
> LGTM overall.
> 
> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
> 
> With some nits:
> 
>> ---
>>   tools/testing/selftests/iommu/iommufd.c          | 136 ++++++++++++++++++++---
>>   tools/testing/selftests/iommu/iommufd_fail_nth.c |  39 +++++++
>>   tools/testing/selftests/iommu/iommufd_utils.h    |  57 ++++++++++
>>   3 files changed, 217 insertions(+), 15 deletions(-)
>>
>> diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c
>> index 4927b9a..3d62951 100644
>> --- a/tools/testing/selftests/iommu/iommufd.c
>> +++ b/tools/testing/selftests/iommu/iommufd.c
>> @@ -3,6 +3,7 @@
>>   #include <stdlib.h>
>>   #include <sys/mman.h>
>>   #include <sys/eventfd.h>
>> +#include <asm/unistd.h>
> 
> Can we add it to the top of the list for an alphabetical order?
> 
>>   TEST_F(iommufd_mock_domain, ro_unshare)
>>   {
>>          uint8_t *buf;
>>          __u64 iova;
>>          int fd;
>>
>> +       if (variant->file)
>> +               SKIP(return, "redundant test");
> 
> Why is it redundant? A line of comments maybe?
> 
>> +/* iopt_area_fill_domains() and iopt_area_fill_domain() */
>> +TEST_FAIL_NTH(basic_fail_nth, map_file_domain)
>> +{
> [...]
>> +       if (_test_ioctl_ioas_map_file(self->fd, ioas_id, mfd, 0, 262144, &iova,
>> +                                IOMMU_IOAS_MAP_WRITEABLE |
>> +                                        IOMMU_IOAS_MAP_READABLE))
> [...]
>> +#define test_ioctl_ioas_map_id_file(ioas_id, mfd, start, length, iova_p)     \
>> +       ASSERT_EQ(0, _test_ioctl_ioas_map_file(self->fd, ioas_id, mfd,       \
>> +                                         start, length, iova_p,             \
>> +                                         IOMMU_IOAS_MAP_WRITEABLE |         \
>> +                                                 IOMMU_IOAS_MAP_READABLE))
>> +
> 
> The indentations are a bit odd here. Would you please do git-
> clang-format to fix them?
> 
> Thanks
> Nicolin


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH V5 9/9] iommufd: map file selftest
  2024-10-23 20:45     ` Steven Sistare
@ 2024-10-24  0:36       ` Nicolin Chen
  0 siblings, 0 replies; 33+ messages in thread
From: Nicolin Chen @ 2024-10-24  0:36 UTC (permalink / raw)
  To: Steven Sistare; +Cc: iommu, Jason Gunthorpe, Kevin Tian

On Wed, Oct 23, 2024 at 04:45:32PM -0400, Steven Sistare wrote:
> Thanks very much!  I will fix all the nits.

When you respin the series, would you please also send the series
to the selftest maillist? I think get_maintainer.pl should give
you the list that has it.

Thanks
Nicolin

^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [PATCH V5 5/9] iommufd: folio subroutines
  2024-10-23 13:03     ` Jason Gunthorpe
@ 2024-10-24  6:14       ` Tian, Kevin
  0 siblings, 0 replies; 33+ messages in thread
From: Tian, Kevin @ 2024-10-24  6:14 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Steve Sistare, iommu@lists.linux.dev, Nicolin Chen

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Wednesday, October 23, 2024 9:04 PM
> 
> On Wed, Oct 23, 2024 at 07:21:05AM +0000, Tian, Kevin wrote:
> > > From: Steve Sistare <steven.sistare@oracle.com>
> > > Sent: Wednesday, October 23, 2024 5:01 AM
> > >
> > > +static int batch_from_folios(struct pfn_batch *batch, struct folio
> ***folios_p,
> > > +			     unsigned long *offset_p, unsigned long npages)
> > > +{
> > > +	int rc = 0;
> > > +	struct folio **folios = *folios_p;
> > > +	unsigned long offset = *offset_p;
> > > +
> > > +	while (npages) {
> > > +		struct folio *folio = *folios;
> > > +		unsigned long nr = folio_nr_pages(folio) - offset;
> > > +		unsigned long pfn = page_to_pfn(folio_page(folio, offset));
> > > +
> > > +		nr = min(nr, npages);
> > > +		npages -= nr;
> > > +
> > > +		if (!batch_add_pfn_num(batch, pfn, nr))
> > > +			break;
> > > +		if (nr > 1) {
> > > +			rc = folio_add_pins(folio, nr - 1);
> > > +			if (rc) {
> > > +				batch_remove_pfn_num(batch, nr);
> > > +				goto out;
> > > +			}
> > > +		}
> >
> > forgot to ask one thing here. Could you add a comment here why
> > additional pins is required only when batching more than 1 page?
> 
> Something like:
> 
>  The unpin path during unmap can unpin slices of a single folio, and
>  broadly does not keep track of where the folio boundaries were. Since
>  it treats everything as a single page transform the refcount on the
>  folio from a per-folio ref to a per-page ref. This makes the unmap
>  path uniform regardless of how the memory was pinned.
> 

this is much clearer. Thanks!

^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [PATCH V5 6/9] iommufd: pfn reader for file mappings
  2024-10-23 13:06     ` Steven Sistare
@ 2024-10-24  6:17       ` Tian, Kevin
  0 siblings, 0 replies; 33+ messages in thread
From: Tian, Kevin @ 2024-10-24  6:17 UTC (permalink / raw)
  To: Steven Sistare, iommu@lists.linux.dev; +Cc: Jason Gunthorpe, Nicolin Chen

> From: Steven Sistare <steven.sistare@oracle.com>
> Sent: Wednesday, October 23, 2024 9:07 PM
> 
> On 10/23/2024 3:40 AM, Tian, Kevin wrote:
> >> From: Steve Sistare <steven.sistare@oracle.com>
> >> Sent: Wednesday, October 23, 2024 5:01 AM
> >>
> >> +
> >> +static long pin_memfd_pages(struct pfn_reader_user *user, unsigned
> long
> >> start,
> >> +			    unsigned long npages)
> >> +{
> >> +	unsigned long i;
> >> +	unsigned long offset;
> >> +	unsigned long npages_out = 0;
> >> +	struct page **upages = user->upages;
> >> +	unsigned long end = start + (npages << PAGE_SHIFT) - 1;
> >> +	long nfolios = user->ufolios_len / sizeof(*user->ufolios);
> >
> > reverse Christmas tree
> >
> >> +
> >> +	/*
> >> +	 * todo: memfd_pin_folios should return the last pinned offset so
> >> +	 * we can compute npages pinned, and avoid looping over folios here
> >> +	 * if upages == NULL.
> >> +	 */
> >> +	nfolios = memfd_pin_folios(user->file, start, end, user->ufolios,
> >> +				   nfolios, &offset);
> >> +	if (nfolios <= 0)
> >> +		return nfolios;
> >> +
> >> +	offset >>= PAGE_SHIFT;
> >> +	user->ufolios_next = user->ufolios;
> >> +	user->ufolios_offset = offset;
> >> +
> >> +	for (i = 0; i < nfolios; i++) {
> >> +		struct folio *folio = user->ufolios[i];
> >> +		unsigned long nr = folio_nr_pages(folio);
> >> +		unsigned long npin = min(nr - offset, npages);
> >> +
> >> +		npages -= npin;
> >> +		npages_out += npin;
> >> +
> >> +		if (upages) {
> >> +			if (npin == 1) {
> >> +				*upages++ = folio_page(folio, offset);
> >> +			} else {
> >> +				int rc = folio_add_pins(folio, npin - 1);
> >> +
> >> +				if (rc)
> >> +					return rc;
> >> +
> >> +				while (npin--)
> >> +					*upages++ = folio_page(folio,
> >> offset++);
> >> +			}
> >> +		}
> >
> > dead code as user->upages is NULL for memfd (echoed below)?
> 
> user->upages is still used for mediated devices, assigned in iopt_pages_fill.
> upages is filled directly, bypassing the batch.
> 

Okay, I overlooked that part. 

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2024-10-24  6:17 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-22 21:00 [PATCH V5 0/9] iommu_ioas_map_file Steve Sistare
2024-10-22 21:00 ` [PATCH V5 1/9] mm/gup: folio_add_pins Steve Sistare
2024-10-22 21:00 ` [PATCH V5 2/9] iommufd: rename uptr in iopt_alloc_iova Steve Sistare
2024-10-23  7:05   ` Tian, Kevin
2024-10-22 21:00 ` [PATCH V5 3/9] iommufd: generalize iopt_pages address Steve Sistare
2024-10-23  7:08   ` Tian, Kevin
2024-10-23 13:33     ` Steven Sistare
2024-10-22 21:00 ` [PATCH V5 4/9] iommufd: pfn reader local variables Steve Sistare
2024-10-23  7:09   ` Tian, Kevin
2024-10-22 21:00 ` [PATCH V5 5/9] iommufd: folio subroutines Steve Sistare
2024-10-23  7:16   ` Tian, Kevin
2024-10-23  7:21   ` Tian, Kevin
2024-10-23 13:03     ` Jason Gunthorpe
2024-10-24  6:14       ` Tian, Kevin
2024-10-23 13:04     ` Steven Sistare
2024-10-23 13:01   ` Jason Gunthorpe
2024-10-22 21:00 ` [PATCH V5 6/9] iommufd: pfn reader for file mappings Steve Sistare
2024-10-23  7:40   ` Tian, Kevin
2024-10-23 13:06     ` Steven Sistare
2024-10-24  6:17       ` Tian, Kevin
2024-10-23 13:15     ` Jason Gunthorpe
2024-10-23 13:24   ` Jason Gunthorpe
2024-10-22 21:00 ` [PATCH V5 7/9] iommufd: IOMMU_IOAS_MAP_FILE Steve Sistare
2024-10-23  7:45   ` Tian, Kevin
2024-10-23 13:22     ` Steven Sistare
2024-10-22 21:00 ` [PATCH V5 8/9] iommufd: file mappings for mdev Steve Sistare
2024-10-23  8:01   ` Tian, Kevin
2024-10-23 13:16     ` Jason Gunthorpe
2024-10-22 21:00 ` [PATCH V5 9/9] iommufd: map file selftest Steve Sistare
2024-10-23 13:52   ` Steven Sistare
2024-10-23 20:15   ` Nicolin Chen
2024-10-23 20:45     ` Steven Sistare
2024-10-24  0:36       ` Nicolin Chen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.