* [PATCH V7 1/9] mm/gup: folio_add_pins
2024-10-25 13:11 [PATCH V7 0/9] iommu_ioas_map_file Steve Sistare
@ 2024-10-25 13:11 ` Steve Sistare
2024-10-25 13:11 ` [PATCH V7 2/9] iommufd: rename uptr in iopt_alloc_iova Steve Sistare
` (8 subsequent siblings)
9 siblings, 0 replies; 35+ messages in thread
From: Steve Sistare @ 2024-10-25 13:11 UTC (permalink / raw)
To: iommu; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen, Steve Sistare
Export a function that adds pins to an already-pinned huge-page folio.
This allows any range of small pages within the folio to be unpinned later.
For example, pages pinned via memfd_pin_folios and given additional
pins by folio_add_pins can later be unpinned via unpin_user_page(s).
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
include/linux/mm.h | 1 +
mm/gup.c | 24 ++++++++++++++++++++++++
2 files changed, 25 insertions(+)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index ecf63d2..f9de33a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2524,6 +2524,7 @@ long pin_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
long memfd_pin_folios(struct file *memfd, loff_t start, loff_t end,
struct folio **folios, unsigned int max_folios,
pgoff_t *offset);
+int folio_add_pins(struct folio *folio, unsigned int pins);
int get_user_pages_fast(unsigned long start, int nr_pages,
unsigned int gup_flags, struct page **pages);
diff --git a/mm/gup.c b/mm/gup.c
index a82890b..4ac3e567 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -3717,3 +3717,27 @@ long memfd_pin_folios(struct file *memfd, loff_t start, loff_t end,
return ret;
}
EXPORT_SYMBOL_GPL(memfd_pin_folios);
+
+/**
+ * folio_add_pins() - add pins to an already-pinned folio
+ * @folio: the folio to add more pins to
+ * @pins: number of pins to add
+ *
+ * Try to add more pins to an already-pinned folio. The semantics
+ * of the pin (e.g., FOLL_WRITE) follow any existing pin and cannot
+ * be changed.
+ *
+ * This function is helpful when having obtained a pin on a large folio
+ * using memfd_pin_folios(), but wanting to logically unpin parts
+ * (e.g., individual pages) of the folio later, for example, using
+ * unpin_user_page_range_dirty_lock().
+ *
+ * This is not the right interface to initially pin a folio.
+ */
+int folio_add_pins(struct folio *folio, unsigned int pins)
+{
+ VM_WARN_ON_ONCE(!folio_maybe_dma_pinned(folio));
+
+ return try_grab_folio(folio, pins, FOLL_PIN);
+}
+EXPORT_SYMBOL_GPL(folio_add_pins);
--
1.8.3.1
* [PATCH V7 2/9] iommufd: rename uptr in iopt_alloc_iova
2024-10-25 13:11 [PATCH V7 0/9] iommu_ioas_map_file Steve Sistare
2024-10-25 13:11 ` [PATCH V7 1/9] mm/gup: folio_add_pins Steve Sistare
@ 2024-10-25 13:11 ` Steve Sistare
2024-10-25 13:11 ` [PATCH V7 3/9] iommufd: generalize iopt_pages address Steve Sistare
From: Steve Sistare @ 2024-10-25 13:11 UTC (permalink / raw)
To: iommu; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen, Steve Sistare
iopt_alloc_iova takes a uptr argument, but uses it only to compute
alignment. Generalize it to an unsigned address, which in a subsequent
patch can also be the offset from the start of a file. No functional change.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
drivers/iommu/iommufd/io_pagetable.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/drivers/iommu/iommufd/io_pagetable.c b/drivers/iommu/iommufd/io_pagetable.c
index 4bf7ccd..68ad91d 100644
--- a/drivers/iommu/iommufd/io_pagetable.c
+++ b/drivers/iommu/iommufd/io_pagetable.c
@@ -107,9 +107,9 @@ static bool __alloc_iova_check_used(struct interval_tree_span_iter *span,
* Does not return a 0 IOVA even if it is valid.
*/
static int iopt_alloc_iova(struct io_pagetable *iopt, unsigned long *iova,
- unsigned long uptr, unsigned long length)
+ unsigned long addr, unsigned long length)
{
- unsigned long page_offset = uptr % PAGE_SIZE;
+ unsigned long page_offset = addr % PAGE_SIZE;
struct interval_tree_double_span_iter used_span;
struct interval_tree_span_iter allowed_span;
unsigned long max_alignment = PAGE_SIZE;
@@ -122,15 +122,15 @@ static int iopt_alloc_iova(struct io_pagetable *iopt, unsigned long *iova,
return -EOVERFLOW;
/*
- * Keep alignment present in the uptr when building the IOVA, this
+ * Keep alignment present in addr when building the IOVA, which
* increases the chance we can map a THP.
*/
- if (!uptr)
+ if (!addr)
iova_alignment = roundup_pow_of_two(length);
else
iova_alignment = min_t(unsigned long,
roundup_pow_of_two(length),
- 1UL << __ffs64(uptr));
+ 1UL << __ffs64(addr));
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
max_alignment = HPAGE_SIZE;
@@ -248,6 +248,7 @@ static int iopt_alloc_area_pages(struct io_pagetable *iopt,
int iommu_prot, unsigned int flags)
{
struct iopt_pages_list *elm;
+ unsigned long start;
unsigned long iova;
int rc = 0;
@@ -267,9 +268,8 @@ static int iopt_alloc_area_pages(struct io_pagetable *iopt,
/* Use the first entry to guess the ideal IOVA alignment */
elm = list_first_entry(pages_list, struct iopt_pages_list,
next);
- rc = iopt_alloc_iova(
- iopt, dst_iova,
- (uintptr_t)elm->pages->uptr + elm->start_byte, length);
+ start = elm->start_byte + (uintptr_t)elm->pages->uptr;
+ rc = iopt_alloc_iova(iopt, dst_iova, start, length);
if (rc)
goto out_unlock;
if (IS_ENABLED(CONFIG_IOMMUFD_TEST) &&
--
1.8.3.1
* [PATCH V7 3/9] iommufd: generalize iopt_pages address
2024-10-25 13:11 [PATCH V7 0/9] iommu_ioas_map_file Steve Sistare
2024-10-25 13:11 ` [PATCH V7 1/9] mm/gup: folio_add_pins Steve Sistare
2024-10-25 13:11 ` [PATCH V7 2/9] iommufd: rename uptr in iopt_alloc_iova Steve Sistare
@ 2024-10-25 13:11 ` Steve Sistare
2024-10-25 13:11 ` [PATCH V7 4/9] iommufd: pfn reader local variables Steve Sistare
From: Steve Sistare @ 2024-10-25 13:11 UTC (permalink / raw)
To: iommu; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen, Steve Sistare
The starting address in iopt_pages is currently a __user *uptr. Generalize
it to allow other address types. Refactor iopt_alloc_pages and
iopt_map_user_pages into address-type-specific and common functions.
Suggested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
drivers/iommu/iommufd/io_pagetable.c | 55 ++++++++++++++++++++++--------------
drivers/iommu/iommufd/io_pagetable.h | 13 +++++++--
drivers/iommu/iommufd/pages.c | 31 ++++++++++++++------
3 files changed, 67 insertions(+), 32 deletions(-)
diff --git a/drivers/iommu/iommufd/io_pagetable.c b/drivers/iommu/iommufd/io_pagetable.c
index 68ad91d..874ee9e 100644
--- a/drivers/iommu/iommufd/io_pagetable.c
+++ b/drivers/iommu/iommufd/io_pagetable.c
@@ -384,6 +384,34 @@ int iopt_map_pages(struct io_pagetable *iopt, struct list_head *pages_list,
return rc;
}
+static int iopt_map_common(struct iommufd_ctx *ictx, struct io_pagetable *iopt,
+ struct iopt_pages *pages, unsigned long *iova,
+ unsigned long length, unsigned long start_byte,
+ int iommu_prot, unsigned int flags)
+{
+ struct iopt_pages_list elm = {};
+ LIST_HEAD(pages_list);
+ int rc;
+
+ elm.pages = pages;
+ elm.start_byte = start_byte;
+ if (ictx->account_mode == IOPT_PAGES_ACCOUNT_MM &&
+ elm.pages->account_mode == IOPT_PAGES_ACCOUNT_USER)
+ elm.pages->account_mode = IOPT_PAGES_ACCOUNT_MM;
+ elm.length = length;
+ list_add(&elm.next, &pages_list);
+
+ rc = iopt_map_pages(iopt, &pages_list, length, iova, iommu_prot, flags);
+ if (rc) {
+ if (elm.area)
+ iopt_abort_area(elm.area);
+ if (elm.pages)
+ iopt_put_pages(elm.pages);
+ return rc;
+ }
+ return 0;
+}
+
/**
* iopt_map_user_pages() - Map a user VA to an iova in the io page table
* @ictx: iommufd_ctx the iopt is part of
@@ -408,29 +436,14 @@ int iopt_map_user_pages(struct iommufd_ctx *ictx, struct io_pagetable *iopt,
unsigned long length, int iommu_prot,
unsigned int flags)
{
- struct iopt_pages_list elm = {};
- LIST_HEAD(pages_list);
- int rc;
+ struct iopt_pages *pages;
- elm.pages = iopt_alloc_pages(uptr, length, iommu_prot & IOMMU_WRITE);
- if (IS_ERR(elm.pages))
- return PTR_ERR(elm.pages);
- if (ictx->account_mode == IOPT_PAGES_ACCOUNT_MM &&
- elm.pages->account_mode == IOPT_PAGES_ACCOUNT_USER)
- elm.pages->account_mode = IOPT_PAGES_ACCOUNT_MM;
- elm.start_byte = uptr - elm.pages->uptr;
- elm.length = length;
- list_add(&elm.next, &pages_list);
+ pages = iopt_alloc_user_pages(uptr, length, iommu_prot & IOMMU_WRITE);
+ if (IS_ERR(pages))
+ return PTR_ERR(pages);
- rc = iopt_map_pages(iopt, &pages_list, length, iova, iommu_prot, flags);
- if (rc) {
- if (elm.area)
- iopt_abort_area(elm.area);
- if (elm.pages)
- iopt_put_pages(elm.pages);
- return rc;
- }
- return 0;
+ return iopt_map_common(ictx, iopt, pages, iova, length,
+ uptr - pages->uptr, iommu_prot, flags);
}
struct iova_bitmap_fn_arg {
diff --git a/drivers/iommu/iommufd/io_pagetable.h b/drivers/iommu/iommufd/io_pagetable.h
index c61d744..8e48266 100644
--- a/drivers/iommu/iommufd/io_pagetable.h
+++ b/drivers/iommu/iommufd/io_pagetable.h
@@ -175,6 +175,10 @@ enum {
IOPT_PAGES_ACCOUNT_MM = 2,
};
+enum iopt_address_type {
+ IOPT_ADDRESS_USER = 0,
+};
+
/*
* This holds a pinned page list for multiple areas of IO address space. The
* pages always originate from a linear chunk of userspace VA. Multiple
@@ -195,7 +199,10 @@ struct iopt_pages {
struct task_struct *source_task;
struct mm_struct *source_mm;
struct user_struct *source_user;
- void __user *uptr;
+ enum iopt_address_type type;
+ union {
+ void __user *uptr; /* IOPT_ADDRESS_USER */
+ };
bool writable:1;
u8 account_mode;
@@ -206,8 +213,8 @@ struct iopt_pages {
struct rb_root_cached domains_itree;
};
-struct iopt_pages *iopt_alloc_pages(void __user *uptr, unsigned long length,
- bool writable);
+struct iopt_pages *iopt_alloc_user_pages(void __user *uptr,
+ unsigned long length, bool writable);
void iopt_release_pages(struct kref *kref);
static inline void iopt_put_pages(struct iopt_pages *pages)
{
diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
index 93d806c..4206e90 100644
--- a/drivers/iommu/iommufd/pages.c
+++ b/drivers/iommu/iommufd/pages.c
@@ -1139,11 +1139,11 @@ static int pfn_reader_first(struct pfn_reader *pfns, struct iopt_pages *pages,
return 0;
}
-struct iopt_pages *iopt_alloc_pages(void __user *uptr, unsigned long length,
- bool writable)
+static struct iopt_pages *iopt_alloc_pages(unsigned long start_byte,
+ unsigned long length,
+ bool writable)
{
struct iopt_pages *pages;
- unsigned long end;
/*
* The iommu API uses size_t as the length, and protect the DIV_ROUND_UP
@@ -1152,9 +1152,6 @@ struct iopt_pages *iopt_alloc_pages(void __user *uptr, unsigned long length,
if (length > SIZE_MAX - PAGE_SIZE || length == 0)
return ERR_PTR(-EINVAL);
- if (check_add_overflow((unsigned long)uptr, length, &end))
- return ERR_PTR(-EOVERFLOW);
-
pages = kzalloc(sizeof(*pages), GFP_KERNEL_ACCOUNT);
if (!pages)
return ERR_PTR(-ENOMEM);
@@ -1164,8 +1161,7 @@ struct iopt_pages *iopt_alloc_pages(void __user *uptr, unsigned long length,
mutex_init(&pages->mutex);
pages->source_mm = current->mm;
mmgrab(pages->source_mm);
- pages->uptr = (void __user *)ALIGN_DOWN((uintptr_t)uptr, PAGE_SIZE);
- pages->npages = DIV_ROUND_UP(length + (uptr - pages->uptr), PAGE_SIZE);
+ pages->npages = DIV_ROUND_UP(length + start_byte, PAGE_SIZE);
pages->access_itree = RB_ROOT_CACHED;
pages->domains_itree = RB_ROOT_CACHED;
pages->writable = writable;
@@ -1179,6 +1175,25 @@ struct iopt_pages *iopt_alloc_pages(void __user *uptr, unsigned long length,
return pages;
}
+struct iopt_pages *iopt_alloc_user_pages(void __user *uptr,
+ unsigned long length, bool writable)
+{
+ struct iopt_pages *pages;
+ unsigned long end;
+ void __user *uptr_down =
+ (void __user *) ALIGN_DOWN((uintptr_t)uptr, PAGE_SIZE);
+
+ if (check_add_overflow((unsigned long)uptr, length, &end))
+ return ERR_PTR(-EOVERFLOW);
+
+ pages = iopt_alloc_pages(uptr - uptr_down, length, writable);
+ if (IS_ERR(pages))
+ return pages;
+ pages->uptr = uptr_down;
+ pages->type = IOPT_ADDRESS_USER;
+ return pages;
+}
+
void iopt_release_pages(struct kref *kref)
{
struct iopt_pages *pages = container_of(kref, struct iopt_pages, kref);
--
1.8.3.1
* [PATCH V7 4/9] iommufd: pfn reader local variables
2024-10-25 13:11 [PATCH V7 0/9] iommu_ioas_map_file Steve Sistare
2024-10-25 13:11 ` [PATCH V7 3/9] iommufd: generalize iopt_pages address Steve Sistare
@ 2024-10-25 13:11 ` Steve Sistare
2024-10-25 13:11 ` [PATCH V7 5/9] iommufd: folio subroutines Steve Sistare
From: Steve Sistare @ 2024-10-25 13:11 UTC (permalink / raw)
To: iommu; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen, Steve Sistare
Add local variables for common sub-expressions needed by a subsequent
patch. No functional change.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
drivers/iommu/iommufd/pages.c | 25 ++++++++++++++-----------
1 file changed, 14 insertions(+), 11 deletions(-)
diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
index 4206e90..9097b25 100644
--- a/drivers/iommu/iommufd/pages.c
+++ b/drivers/iommu/iommufd/pages.c
@@ -978,6 +978,8 @@ static int pfn_reader_fill_span(struct pfn_reader *pfns)
{
struct interval_tree_double_span_iter *span = &pfns->span;
unsigned long start_index = pfns->batch_end_index;
+ struct pfn_reader_user *user = &pfns->user;
+ unsigned long npages;
struct iopt_area *area;
int rc;
@@ -1015,10 +1017,9 @@ static int pfn_reader_fill_span(struct pfn_reader *pfns)
return rc;
}
- batch_from_pages(&pfns->batch,
- pfns->user.upages +
- (start_index - pfns->user.upages_start),
- pfns->user.upages_end - start_index);
+ npages = user->upages_end - start_index;
+ start_index -= user->upages_start;
+ batch_from_pages(&pfns->batch, user->upages + start_index, npages);
return 0;
}
@@ -1092,16 +1093,18 @@ static int pfn_reader_init(struct pfn_reader *pfns, struct iopt_pages *pages,
static void pfn_reader_release_pins(struct pfn_reader *pfns)
{
struct iopt_pages *pages = pfns->pages;
+ struct pfn_reader_user *user = &pfns->user;
- if (pfns->user.upages_end > pfns->batch_end_index) {
- size_t npages = pfns->user.upages_end - pfns->batch_end_index;
-
+ if (user->upages_end > pfns->batch_end_index) {
/* Any pages not transferred to the batch are just unpinned */
- unpin_user_pages(pfns->user.upages + (pfns->batch_end_index -
- pfns->user.upages_start),
- npages);
+
+ unsigned long npages = user->upages_end - pfns->batch_end_index;
+ unsigned long start_index = pfns->batch_end_index -
+ user->upages_start;
+
+ unpin_user_pages(user->upages + start_index, npages);
iopt_pages_sub_npinned(pages, npages);
- pfns->user.upages_end = pfns->batch_end_index;
+ user->upages_end = pfns->batch_end_index;
}
if (pfns->batch_start_index != pfns->batch_end_index) {
pfn_reader_unpin(pfns);
--
1.8.3.1
* [PATCH V7 5/9] iommufd: folio subroutines
2024-10-25 13:11 [PATCH V7 0/9] iommu_ioas_map_file Steve Sistare
2024-10-25 13:11 ` [PATCH V7 4/9] iommufd: pfn reader local variables Steve Sistare
@ 2024-10-25 13:11 ` Steve Sistare
2024-10-25 13:11 ` [PATCH V7 6/9] iommufd: pfn reader for file mappings Steve Sistare
From: Steve Sistare @ 2024-10-25 13:11 UTC (permalink / raw)
To: iommu; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen, Steve Sistare
Add subroutines for copying folios to a batch.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
drivers/iommu/iommufd/pages.c | 77 +++++++++++++++++++++++++++++++++++--------
1 file changed, 63 insertions(+), 14 deletions(-)
diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
index 9097b25..aa79504 100644
--- a/drivers/iommu/iommufd/pages.c
+++ b/drivers/iommu/iommufd/pages.c
@@ -346,27 +346,41 @@ static void batch_destroy(struct pfn_batch *batch, void *backup)
kfree(batch->pfns);
}
-/* true if the pfn was added, false otherwise */
-static bool batch_add_pfn(struct pfn_batch *batch, unsigned long pfn)
+static bool batch_add_pfn_num(struct pfn_batch *batch, unsigned long pfn,
+ u32 nr)
{
const unsigned int MAX_NPFNS = type_max(typeof(*batch->npfns));
+ unsigned int end = batch->end;
- if (batch->end &&
- pfn == batch->pfns[batch->end - 1] + batch->npfns[batch->end - 1] &&
- batch->npfns[batch->end - 1] != MAX_NPFNS) {
- batch->npfns[batch->end - 1]++;
- batch->total_pfns++;
- return true;
- }
- if (batch->end == batch->array_size)
+ if (end && pfn == batch->pfns[end - 1] + batch->npfns[end - 1] &&
+ nr <= MAX_NPFNS - batch->npfns[end - 1]) {
+ batch->npfns[end - 1] += nr;
+ } else if (end < batch->array_size) {
+ batch->pfns[end] = pfn;
+ batch->npfns[end] = nr;
+ batch->end++;
+ } else {
return false;
- batch->total_pfns++;
- batch->pfns[batch->end] = pfn;
- batch->npfns[batch->end] = 1;
- batch->end++;
+ }
+
+ batch->total_pfns += nr;
return true;
}
+static void batch_remove_pfn_num(struct pfn_batch *batch, unsigned long nr)
+{
+ batch->npfns[batch->end - 1] -= nr;
+ if (batch->npfns[batch->end - 1] == 0)
+ batch->end--;
+ batch->total_pfns -= nr;
+}
+
+/* true if the pfn was added, false otherwise */
+static bool batch_add_pfn(struct pfn_batch *batch, unsigned long pfn)
+{
+ return batch_add_pfn_num(batch, pfn, 1);
+}
+
/*
* Fill the batch with pfns from the domain. When the batch is full, or it
* reaches last_index, the function will return. The caller should use
@@ -622,6 +636,41 @@ static void batch_from_pages(struct pfn_batch *batch, struct page **pages,
break;
}
+static int batch_from_folios(struct pfn_batch *batch, struct folio ***folios_p,
+ unsigned long *offset_p, unsigned long npages)
+{
+ int rc = 0;
+ struct folio **folios = *folios_p;
+ unsigned long offset = *offset_p;
+
+ while (npages) {
+ struct folio *folio = *folios;
+ unsigned long nr = folio_nr_pages(folio) - offset;
+ unsigned long pfn = page_to_pfn(folio_page(folio, offset));
+
+ nr = min(nr, npages);
+ npages -= nr;
+
+ if (!batch_add_pfn_num(batch, pfn, nr))
+ break;
+ if (nr > 1) {
+ rc = folio_add_pins(folio, nr - 1);
+ if (rc) {
+ batch_remove_pfn_num(batch, nr);
+ goto out;
+ }
+ }
+
+ folios++;
+ offset = 0;
+ }
+
+out:
+ *folios_p = folios;
+ *offset_p = offset;
+ return rc;
+}
+
static void batch_unpin(struct pfn_batch *batch, struct iopt_pages *pages,
unsigned int first_page_off, size_t npages)
{
--
1.8.3.1
* [PATCH V7 6/9] iommufd: pfn reader for file mappings
2024-10-25 13:11 [PATCH V7 0/9] iommu_ioas_map_file Steve Sistare
2024-10-25 13:11 ` [PATCH V7 5/9] iommufd: folio subroutines Steve Sistare
@ 2024-10-25 13:11 ` Steve Sistare
2024-10-31 3:34 ` Alexey Kardashevskiy
2024-10-25 13:11 ` [PATCH V7 7/9] iommufd: IOMMU_IOAS_MAP_FILE Steve Sistare
From: Steve Sistare @ 2024-10-25 13:11 UTC (permalink / raw)
To: iommu; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen, Steve Sistare
Extend pfn_reader_user to pin file mappings, by calling memfd_pin_folios.
Repin at small page granularity, and fill the batch from folios. Expand
folios to upages for the iopt_pages_fill path.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
drivers/iommu/iommufd/io_pagetable.h | 5 ++
drivers/iommu/iommufd/pages.c | 128 ++++++++++++++++++++++++++++++-----
2 files changed, 116 insertions(+), 17 deletions(-)
diff --git a/drivers/iommu/iommufd/io_pagetable.h b/drivers/iommu/iommufd/io_pagetable.h
index 8e48266..5ac4eed 100644
--- a/drivers/iommu/iommufd/io_pagetable.h
+++ b/drivers/iommu/iommufd/io_pagetable.h
@@ -177,6 +177,7 @@ enum {
enum iopt_address_type {
IOPT_ADDRESS_USER = 0,
+ IOPT_ADDRESS_FILE = 1,
};
/*
@@ -202,6 +203,10 @@ struct iopt_pages {
enum iopt_address_type type;
union {
void __user *uptr; /* IOPT_ADDRESS_USER */
+ struct { /* IOPT_ADDRESS_FILE */
+ struct file *file;
+ unsigned long start;
+ };
};
bool writable:1;
u8 account_mode;
diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
index aa79504..5f371fa 100644
--- a/drivers/iommu/iommufd/pages.c
+++ b/drivers/iommu/iommufd/pages.c
@@ -752,19 +752,32 @@ struct pfn_reader_user {
* neither
*/
int locked;
+
+ /* The following are only valid if file != NULL. */
+ struct file *file;
+ struct folio **ufolios;
+ size_t ufolios_len;
+ unsigned long ufolios_offset;
+ struct folio **ufolios_next;
};
static void pfn_reader_user_init(struct pfn_reader_user *user,
struct iopt_pages *pages)
{
user->upages = NULL;
+ user->upages_len = 0;
user->upages_start = 0;
user->upages_end = 0;
user->locked = -1;
-
user->gup_flags = FOLL_LONGTERM;
if (pages->writable)
user->gup_flags |= FOLL_WRITE;
+
+ user->file = (pages->type == IOPT_ADDRESS_FILE) ? pages->file : NULL;
+ user->ufolios = NULL;
+ user->ufolios_len = 0;
+ user->ufolios_next = NULL;
+ user->ufolios_offset = 0;
}
static void pfn_reader_user_destroy(struct pfn_reader_user *user,
@@ -773,13 +786,67 @@ static void pfn_reader_user_destroy(struct pfn_reader_user *user,
if (user->locked != -1) {
if (user->locked)
mmap_read_unlock(pages->source_mm);
- if (pages->source_mm != current->mm)
+ if (!user->file && pages->source_mm != current->mm)
mmput(pages->source_mm);
user->locked = -1;
}
kfree(user->upages);
user->upages = NULL;
+ kfree(user->ufolios);
+ user->ufolios = NULL;
+}
+
+static long pin_memfd_pages(struct pfn_reader_user *user, unsigned long start,
+ unsigned long npages)
+{
+ unsigned long i;
+ unsigned long offset;
+ unsigned long npages_out = 0;
+ struct page **upages = user->upages;
+ unsigned long end = start + (npages << PAGE_SHIFT) - 1;
+ long nfolios = user->ufolios_len / sizeof(*user->ufolios);
+
+ /*
+ * todo: memfd_pin_folios should return the last pinned offset so
+ * we can compute npages pinned, and avoid looping over folios here
+ * if upages == NULL.
+ */
+ nfolios = memfd_pin_folios(user->file, start, end, user->ufolios,
+ nfolios, &offset);
+ if (nfolios <= 0)
+ return nfolios;
+
+ offset >>= PAGE_SHIFT;
+ user->ufolios_next = user->ufolios;
+ user->ufolios_offset = offset;
+
+ for (i = 0; i < nfolios; i++) {
+ struct folio *folio = user->ufolios[i];
+ unsigned long nr = folio_nr_pages(folio);
+ unsigned long npin = min(nr - offset, npages);
+
+ npages -= npin;
+ npages_out += npin;
+
+ if (upages) {
+ if (npin == 1) {
+ *upages++ = folio_page(folio, offset);
+ } else {
+ int rc = folio_add_pins(folio, npin - 1);
+
+ if (rc)
+ return rc;
+
+ while (npin--)
+ *upages++ = folio_page(folio, offset++);
+ }
+ }
+
+ offset = 0;
+ }
+
+ return npages_out;
}
static int pfn_reader_user_pin(struct pfn_reader_user *user,
@@ -788,7 +855,9 @@ static int pfn_reader_user_pin(struct pfn_reader_user *user,
unsigned long last_index)
{
bool remote_mm = pages->source_mm != current->mm;
- unsigned long npages;
+ unsigned long npages = last_index - start_index + 1;
+ unsigned long start;
+ unsigned long unum;
uintptr_t uptr;
long rc;
@@ -796,40 +865,50 @@ static int pfn_reader_user_pin(struct pfn_reader_user *user,
WARN_ON(last_index < start_index))
return -EINVAL;
- if (!user->upages) {
+ if (!user->file && !user->upages) {
/* All undone in pfn_reader_destroy() */
- user->upages_len =
- (last_index - start_index + 1) * sizeof(*user->upages);
+ user->upages_len = npages * sizeof(*user->upages);
user->upages = temp_kmalloc(&user->upages_len, NULL, 0);
if (!user->upages)
return -ENOMEM;
}
+ if (user->file && !user->ufolios) {
+ user->ufolios_len = npages * sizeof(*user->ufolios);
+ user->ufolios = temp_kmalloc(&user->ufolios_len, NULL, 0);
+ if (!user->ufolios)
+ return -ENOMEM;
+ }
+
if (user->locked == -1) {
/*
* The majority of usages will run the map task within the mm
* providing the pages, so we can optimize into
* get_user_pages_fast()
*/
- if (remote_mm) {
+ if (!user->file && remote_mm) {
if (!mmget_not_zero(pages->source_mm))
return -EFAULT;
}
user->locked = 0;
}
- npages = min_t(unsigned long, last_index - start_index + 1,
- user->upages_len / sizeof(*user->upages));
-
+ unum = user->file ? user->ufolios_len / sizeof(*user->ufolios) :
+ user->upages_len / sizeof(*user->upages);
+ npages = min_t(unsigned long, npages, unum);
if (iommufd_should_fail())
return -EFAULT;
- uptr = (uintptr_t)(pages->uptr + start_index * PAGE_SIZE);
- if (!remote_mm)
+ if (user->file) {
+ start = pages->start + (start_index * PAGE_SIZE);
+ rc = pin_memfd_pages(user, start, npages);
+ } else if (!remote_mm) {
+ uptr = (uintptr_t)(pages->uptr + start_index * PAGE_SIZE);
rc = pin_user_pages_fast(uptr, npages, user->gup_flags,
user->upages);
- else {
+ } else {
+ uptr = (uintptr_t)(pages->uptr + start_index * PAGE_SIZE);
if (!user->locked) {
mmap_read_lock(pages->source_mm);
user->locked = 1;
@@ -887,7 +966,8 @@ static int update_mm_locked_vm(struct iopt_pages *pages, unsigned long npages,
mmap_read_unlock(pages->source_mm);
user->locked = 0;
/* If we had the lock then we also have a get */
- } else if ((!user || !user->upages) &&
+
+ } else if ((!user || (!user->upages && !user->ufolios)) &&
pages->source_mm != current->mm) {
if (!mmget_not_zero(pages->source_mm))
return -EINVAL;
@@ -1068,8 +1148,15 @@ static int pfn_reader_fill_span(struct pfn_reader *pfns)
npages = user->upages_end - start_index;
start_index -= user->upages_start;
- batch_from_pages(&pfns->batch, user->upages + start_index, npages);
- return 0;
+ rc = 0;
+
+ if (!user->file)
+ batch_from_pages(&pfns->batch, user->upages + start_index,
+ npages);
+ else
+ rc = batch_from_folios(&pfns->batch, &user->ufolios_next,
+ &user->ufolios_offset, npages);
+ return rc;
}
static bool pfn_reader_done(struct pfn_reader *pfns)
@@ -1151,7 +1238,14 @@ static void pfn_reader_release_pins(struct pfn_reader *pfns)
unsigned long start_index = pfns->batch_end_index -
user->upages_start;
- unpin_user_pages(user->upages + start_index, npages);
+ if (!user->file) {
+ unpin_user_pages(user->upages + start_index, npages);
+ } else {
+ long n = user->ufolios_len / sizeof(*user->ufolios);
+
+ unpin_folios(user->ufolios_next,
+ user->ufolios + n - user->ufolios_next);
+ }
iopt_pages_sub_npinned(pages, npages);
user->upages_end = pfns->batch_end_index;
}
--
1.8.3.1
* Re: [PATCH V7 6/9] iommufd: pfn reader for file mappings
2024-10-25 13:11 ` [PATCH V7 6/9] iommufd: pfn reader for file mappings Steve Sistare
@ 2024-10-31 3:34 ` Alexey Kardashevskiy
2024-11-06 3:18 ` Alexey Kardashevskiy
From: Alexey Kardashevskiy @ 2024-10-31 3:34 UTC (permalink / raw)
To: Steve Sistare, iommu; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen
On 26/10/24 00:11, Steve Sistare wrote:
> Extend pfn_reader_user to pin file mappings, by calling memfd_pin_folios.
> Repin at small page granularity, and fill the batch from folios. Expand
> folios to upages for the iopt_pages_fill path.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> ---
> drivers/iommu/iommufd/io_pagetable.h | 5 ++
> drivers/iommu/iommufd/pages.c | 128 ++++++++++++++++++++++++++++++-----
> 2 files changed, 116 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/iommu/iommufd/io_pagetable.h b/drivers/iommu/iommufd/io_pagetable.h
> index 8e48266..5ac4eed 100644
> --- a/drivers/iommu/iommufd/io_pagetable.h
> +++ b/drivers/iommu/iommufd/io_pagetable.h
> @@ -177,6 +177,7 @@ enum {
>
> enum iopt_address_type {
> IOPT_ADDRESS_USER = 0,
> + IOPT_ADDRESS_FILE = 1,
> };
>
> /*
> @@ -202,6 +203,10 @@ struct iopt_pages {
> enum iopt_address_type type;
> union {
> void __user *uptr; /* IOPT_ADDRESS_USER */
> + struct { /* IOPT_ADDRESS_FILE */
> + struct file *file;
> + unsigned long start;
> + };
> };
> bool writable:1;
> u8 account_mode;
> diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
> index aa79504..5f371fa 100644
> --- a/drivers/iommu/iommufd/pages.c
> +++ b/drivers/iommu/iommufd/pages.c
> @@ -752,19 +752,32 @@ struct pfn_reader_user {
> * neither
> */
> int locked;
> +
> + /* The following are only valid if file != NULL. */
Are "struct page **upages" and "size_t upages_len" still valid in this
case? The code (kind of) suggests they are not... Thanks,
> + struct file *file;
> + struct folio **ufolios;
> + size_t ufolios_len;
> + unsigned long ufolios_offset;
> + struct folio **ufolios_next;
> };
>
> static void pfn_reader_user_init(struct pfn_reader_user *user,
> struct iopt_pages *pages)
> {
> user->upages = NULL;
> + user->upages_len = 0;
> user->upages_start = 0;
> user->upages_end = 0;
> user->locked = -1;
> -
> user->gup_flags = FOLL_LONGTERM;
> if (pages->writable)
> user->gup_flags |= FOLL_WRITE;
> +
> + user->file = (pages->type == IOPT_ADDRESS_FILE) ? pages->file : NULL;
> + user->ufolios = NULL;
> + user->ufolios_len = 0;
> + user->ufolios_next = NULL;
> + user->ufolios_offset = 0;
> }
>
> static void pfn_reader_user_destroy(struct pfn_reader_user *user,
> @@ -773,13 +786,67 @@ static void pfn_reader_user_destroy(struct pfn_reader_user *user,
> if (user->locked != -1) {
> if (user->locked)
> mmap_read_unlock(pages->source_mm);
> - if (pages->source_mm != current->mm)
> + if (!user->file && pages->source_mm != current->mm)
> mmput(pages->source_mm);
> user->locked = -1;
> }
>
> kfree(user->upages);
> user->upages = NULL;
> + kfree(user->ufolios);
> + user->ufolios = NULL;
> +}
> +
> +static long pin_memfd_pages(struct pfn_reader_user *user, unsigned long start,
> + unsigned long npages)
> +{
> + unsigned long i;
> + unsigned long offset;
> + unsigned long npages_out = 0;
> + struct page **upages = user->upages;
> + unsigned long end = start + (npages << PAGE_SHIFT) - 1;
> + long nfolios = user->ufolios_len / sizeof(*user->ufolios);
> +
> + /*
> + * todo: memfd_pin_folios should return the last pinned offset so
> + * we can compute npages pinned, and avoid looping over folios here
> + * if upages == NULL.
> + */
> + nfolios = memfd_pin_folios(user->file, start, end, user->ufolios,
> + nfolios, &offset);
> + if (nfolios <= 0)
> + return nfolios;
> +
> + offset >>= PAGE_SHIFT;
> + user->ufolios_next = user->ufolios;
> + user->ufolios_offset = offset;
> +
> + for (i = 0; i < nfolios; i++) {
> + struct folio *folio = user->ufolios[i];
> + unsigned long nr = folio_nr_pages(folio);
> + unsigned long npin = min(nr - offset, npages);
> +
> + npages -= npin;
> + npages_out += npin;
> +
> + if (upages) {
> + if (npin == 1) {
> + *upages++ = folio_page(folio, offset);
> + } else {
> + int rc = folio_add_pins(folio, npin - 1);
> +
> + if (rc)
> + return rc;
> +
> + while (npin--)
> + *upages++ = folio_page(folio, offset++);
> + }
> + }
> +
> + offset = 0;
> + }
> +
> + return npages_out;
> }
>
> static int pfn_reader_user_pin(struct pfn_reader_user *user,
> @@ -788,7 +855,9 @@ static int pfn_reader_user_pin(struct pfn_reader_user *user,
> unsigned long last_index)
> {
> bool remote_mm = pages->source_mm != current->mm;
> - unsigned long npages;
> + unsigned long npages = last_index - start_index + 1;
> + unsigned long start;
> + unsigned long unum;
> uintptr_t uptr;
> long rc;
>
> @@ -796,40 +865,50 @@ static int pfn_reader_user_pin(struct pfn_reader_user *user,
> WARN_ON(last_index < start_index))
> return -EINVAL;
>
> - if (!user->upages) {
> + if (!user->file && !user->upages) {
> /* All undone in pfn_reader_destroy() */
> - user->upages_len =
> - (last_index - start_index + 1) * sizeof(*user->upages);
> + user->upages_len = npages * sizeof(*user->upages);
> user->upages = temp_kmalloc(&user->upages_len, NULL, 0);
> if (!user->upages)
> return -ENOMEM;
> }
>
> + if (user->file && !user->ufolios) {
> + user->ufolios_len = npages * sizeof(*user->ufolios);
> + user->ufolios = temp_kmalloc(&user->ufolios_len, NULL, 0);
> + if (!user->ufolios)
> + return -ENOMEM;
> + }
> +
> if (user->locked == -1) {
> /*
> * The majority of usages will run the map task within the mm
> * providing the pages, so we can optimize into
> * get_user_pages_fast()
> */
> - if (remote_mm) {
> + if (!user->file && remote_mm) {
> if (!mmget_not_zero(pages->source_mm))
> return -EFAULT;
> }
> user->locked = 0;
> }
>
> - npages = min_t(unsigned long, last_index - start_index + 1,
> - user->upages_len / sizeof(*user->upages));
> -
> + unum = user->file ? user->ufolios_len / sizeof(*user->ufolios) :
> + user->upages_len / sizeof(*user->upages);
> + npages = min_t(unsigned long, npages, unum);
>
> if (iommufd_should_fail())
> return -EFAULT;
>
> - uptr = (uintptr_t)(pages->uptr + start_index * PAGE_SIZE);
> - if (!remote_mm)
> + if (user->file) {
> + start = pages->start + (start_index * PAGE_SIZE);
> + rc = pin_memfd_pages(user, start, npages);
> + } else if (!remote_mm) {
> + uptr = (uintptr_t)(pages->uptr + start_index * PAGE_SIZE);
> rc = pin_user_pages_fast(uptr, npages, user->gup_flags,
> user->upages);
> - else {
> + } else {
> + uptr = (uintptr_t)(pages->uptr + start_index * PAGE_SIZE);
> if (!user->locked) {
> mmap_read_lock(pages->source_mm);
> user->locked = 1;
> @@ -887,7 +966,8 @@ static int update_mm_locked_vm(struct iopt_pages *pages, unsigned long npages,
> mmap_read_unlock(pages->source_mm);
> user->locked = 0;
> /* If we had the lock then we also have a get */
> - } else if ((!user || !user->upages) &&
> +
> + } else if ((!user || (!user->upages && !user->ufolios)) &&
> pages->source_mm != current->mm) {
> if (!mmget_not_zero(pages->source_mm))
> return -EINVAL;
> @@ -1068,8 +1148,15 @@ static int pfn_reader_fill_span(struct pfn_reader *pfns)
>
> npages = user->upages_end - start_index;
> start_index -= user->upages_start;
> - batch_from_pages(&pfns->batch, user->upages + start_index, npages);
> - return 0;
> + rc = 0;
> +
> + if (!user->file)
> + batch_from_pages(&pfns->batch, user->upages + start_index,
> + npages);
> + else
> + rc = batch_from_folios(&pfns->batch, &user->ufolios_next,
> + &user->ufolios_offset, npages);
> + return rc;
> }
>
> static bool pfn_reader_done(struct pfn_reader *pfns)
> @@ -1151,7 +1238,14 @@ static void pfn_reader_release_pins(struct pfn_reader *pfns)
> unsigned long start_index = pfns->batch_end_index -
> user->upages_start;
>
> - unpin_user_pages(user->upages + start_index, npages);
> + if (!user->file) {
> + unpin_user_pages(user->upages + start_index, npages);
> + } else {
> + long n = user->ufolios_len / sizeof(*user->ufolios);
> +
> + unpin_folios(user->ufolios_next,
> + user->ufolios + n - user->ufolios_next);
> + }
> iopt_pages_sub_npinned(pages, npages);
> user->upages_end = pfns->batch_end_index;
> }
--
Alexey
^ permalink raw reply [flat|nested] 35+ messages in thread

* Re: [PATCH V7 6/9] iommufd: pfn reader for file mappings
2024-10-31 3:34 ` Alexey Kardashevskiy
@ 2024-11-06 3:18 ` Alexey Kardashevskiy
2024-11-06 13:19 ` Steven Sistare
0 siblings, 1 reply; 35+ messages in thread
From: Alexey Kardashevskiy @ 2024-11-06 3:18 UTC (permalink / raw)
To: Steve Sistare, iommu; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen
On 31/10/24 14:34, Alexey Kardashevskiy wrote:
> On 26/10/24 00:11, Steve Sistare wrote:
>> Extend pfn_reader_user to pin file mappings, by calling memfd_pin_folios.
>> Repin at small page granularity, and fill the batch from folios. Expand
>> folios to upages for the iopt_pages_fill path.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
>> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
>> ---
>> drivers/iommu/iommufd/io_pagetable.h | 5 ++
>> drivers/iommu/iommufd/pages.c | 128
>> ++++++++++++++++++++++++++++++-----
>> 2 files changed, 116 insertions(+), 17 deletions(-)
>>
>> diff --git a/drivers/iommu/iommufd/io_pagetable.h
>> b/drivers/iommu/iommufd/io_pagetable.h
>> index 8e48266..5ac4eed 100644
>> --- a/drivers/iommu/iommufd/io_pagetable.h
>> +++ b/drivers/iommu/iommufd/io_pagetable.h
>> @@ -177,6 +177,7 @@ enum {
>> enum iopt_address_type {
>> IOPT_ADDRESS_USER = 0,
>> + IOPT_ADDRESS_FILE = 1,
>> };
>> /*
>> @@ -202,6 +203,10 @@ struct iopt_pages {
>> enum iopt_address_type type;
>> union {
>> void __user *uptr; /* IOPT_ADDRESS_USER */
>> + struct { /* IOPT_ADDRESS_FILE */
>> + struct file *file;
>> + unsigned long start;
>> + };
>> };
>> bool writable:1;
>> u8 account_mode;
>> diff --git a/drivers/iommu/iommufd/pages.c
>> b/drivers/iommu/iommufd/pages.c
>> index aa79504..5f371fa 100644
>> --- a/drivers/iommu/iommufd/pages.c
>> +++ b/drivers/iommu/iommufd/pages.c
>> @@ -752,19 +752,32 @@ struct pfn_reader_user {
>> * neither
>> */
>> int locked;
>> +
>> + /* The following are only valid if file != NULL. */
>
>
> Are "struct page **upages" and "size_t upages_len" still valid in this
> case? The code (kind of) suggests they are not... Thanks,
Ping?
Also, does "user" in function names mean "userspace addresses" (as
opposed to "file")? If so, the patchset makes many names misleading.
Thanks,
>
>
>
>> + struct file *file;
>> + struct folio **ufolios;
>> + size_t ufolios_len;
>> + unsigned long ufolios_offset;
>> + struct folio **ufolios_next;
>> };
>> static void pfn_reader_user_init(struct pfn_reader_user *user,
>> struct iopt_pages *pages)
>> {
>> user->upages = NULL;
>> + user->upages_len = 0;
>> user->upages_start = 0;
>> user->upages_end = 0;
>> user->locked = -1;
>> -
>> user->gup_flags = FOLL_LONGTERM;
>> if (pages->writable)
>> user->gup_flags |= FOLL_WRITE;
>> +
>> + user->file = (pages->type == IOPT_ADDRESS_FILE) ? pages->file :
>> NULL;
>> + user->ufolios = NULL;
>> + user->ufolios_len = 0;
>> + user->ufolios_next = NULL;
>> + user->ufolios_offset = 0;
>> }
>> static void pfn_reader_user_destroy(struct pfn_reader_user *user,
>> @@ -773,13 +786,67 @@ static void pfn_reader_user_destroy(struct
>> pfn_reader_user *user,
>> if (user->locked != -1) {
>> if (user->locked)
>> mmap_read_unlock(pages->source_mm);
>> - if (pages->source_mm != current->mm)
>> + if (!user->file && pages->source_mm != current->mm)
>> mmput(pages->source_mm);
>> user->locked = -1;
>> }
>> kfree(user->upages);
>> user->upages = NULL;
>> + kfree(user->ufolios);
>> + user->ufolios = NULL;
>> +}
>> +
>> +static long pin_memfd_pages(struct pfn_reader_user *user, unsigned
>> long start,
>> + unsigned long npages)
>> +{
>> + unsigned long i;
>> + unsigned long offset;
>> + unsigned long npages_out = 0;
>> + struct page **upages = user->upages;
>> + unsigned long end = start + (npages << PAGE_SHIFT) - 1;
>> + long nfolios = user->ufolios_len / sizeof(*user->ufolios);
>> +
>> + /*
>> + * todo: memfd_pin_folios should return the last pinned offset so
>> + * we can compute npages pinned, and avoid looping over folios here
>> + * if upages == NULL.
>> + */
>> + nfolios = memfd_pin_folios(user->file, start, end, user->ufolios,
>> + nfolios, &offset);
>> + if (nfolios <= 0)
>> + return nfolios;
>> +
>> + offset >>= PAGE_SHIFT;
>> + user->ufolios_next = user->ufolios;
>> + user->ufolios_offset = offset;
>> +
>> + for (i = 0; i < nfolios; i++) {
>> + struct folio *folio = user->ufolios[i];
>> + unsigned long nr = folio_nr_pages(folio);
>> + unsigned long npin = min(nr - offset, npages);
>> +
>> + npages -= npin;
>> + npages_out += npin;
>> +
>> + if (upages) {
>> + if (npin == 1) {
>> + *upages++ = folio_page(folio, offset);
>> + } else {
>> + int rc = folio_add_pins(folio, npin - 1);
>> +
>> + if (rc)
>> + return rc;
>> +
>> + while (npin--)
>> + *upages++ = folio_page(folio, offset++);
>> + }
>> + }
>> +
>> + offset = 0;
>> + }
>> +
>> + return npages_out;
>> }
>> static int pfn_reader_user_pin(struct pfn_reader_user *user,
>> @@ -788,7 +855,9 @@ static int pfn_reader_user_pin(struct
>> pfn_reader_user *user,
>> unsigned long last_index)
>> {
>> bool remote_mm = pages->source_mm != current->mm;
>> - unsigned long npages;
>> + unsigned long npages = last_index - start_index + 1;
>> + unsigned long start;
>> + unsigned long unum;
>> uintptr_t uptr;
>> long rc;
>> @@ -796,40 +865,50 @@ static int pfn_reader_user_pin(struct
>> pfn_reader_user *user,
>> WARN_ON(last_index < start_index))
>> return -EINVAL;
>> - if (!user->upages) {
>> + if (!user->file && !user->upages) {
>> /* All undone in pfn_reader_destroy() */
>> - user->upages_len =
>> - (last_index - start_index + 1) * sizeof(*user->upages);
>> + user->upages_len = npages * sizeof(*user->upages);
>> user->upages = temp_kmalloc(&user->upages_len, NULL, 0);
>> if (!user->upages)
>> return -ENOMEM;
>> }
>> + if (user->file && !user->ufolios) {
>> + user->ufolios_len = npages * sizeof(*user->ufolios);
>> + user->ufolios = temp_kmalloc(&user->ufolios_len, NULL, 0);
>> + if (!user->ufolios)
>> + return -ENOMEM;
>> + }
>> +
>> if (user->locked == -1) {
>> /*
>> * The majority of usages will run the map task within the mm
>> * providing the pages, so we can optimize into
>> * get_user_pages_fast()
>> */
>> - if (remote_mm) {
>> + if (!user->file && remote_mm) {
>> if (!mmget_not_zero(pages->source_mm))
>> return -EFAULT;
>> }
>> user->locked = 0;
>> }
>> - npages = min_t(unsigned long, last_index - start_index + 1,
>> - user->upages_len / sizeof(*user->upages));
>> -
>> + unum = user->file ? user->ufolios_len / sizeof(*user->ufolios) :
>> + user->upages_len / sizeof(*user->upages);
>> + npages = min_t(unsigned long, npages, unum);
>> if (iommufd_should_fail())
>> return -EFAULT;
>> - uptr = (uintptr_t)(pages->uptr + start_index * PAGE_SIZE);
>> - if (!remote_mm)
>> + if (user->file) {
>> + start = pages->start + (start_index * PAGE_SIZE);
>> + rc = pin_memfd_pages(user, start, npages);
>> + } else if (!remote_mm) {
>> + uptr = (uintptr_t)(pages->uptr + start_index * PAGE_SIZE);
>> rc = pin_user_pages_fast(uptr, npages, user->gup_flags,
>> user->upages);
>> - else {
>> + } else {
>> + uptr = (uintptr_t)(pages->uptr + start_index * PAGE_SIZE);
>> if (!user->locked) {
>> mmap_read_lock(pages->source_mm);
>> user->locked = 1;
>> @@ -887,7 +966,8 @@ static int update_mm_locked_vm(struct iopt_pages
>> *pages, unsigned long npages,
>> mmap_read_unlock(pages->source_mm);
>> user->locked = 0;
>> /* If we had the lock then we also have a get */
>> - } else if ((!user || !user->upages) &&
>> +
>> + } else if ((!user || (!user->upages && !user->ufolios)) &&
>> pages->source_mm != current->mm) {
>> if (!mmget_not_zero(pages->source_mm))
>> return -EINVAL;
>> @@ -1068,8 +1148,15 @@ static int pfn_reader_fill_span(struct
>> pfn_reader *pfns)
>> npages = user->upages_end - start_index;
>> start_index -= user->upages_start;
>> - batch_from_pages(&pfns->batch, user->upages + start_index, npages);
>> - return 0;
>> + rc = 0;
>> +
>> + if (!user->file)
>> + batch_from_pages(&pfns->batch, user->upages + start_index,
>> + npages);
>> + else
>> + rc = batch_from_folios(&pfns->batch, &user->ufolios_next,
>> + &user->ufolios_offset, npages);
>> + return rc;
>> }
>> static bool pfn_reader_done(struct pfn_reader *pfns)
>> @@ -1151,7 +1238,14 @@ static void pfn_reader_release_pins(struct
>> pfn_reader *pfns)
>> unsigned long start_index = pfns->batch_end_index -
>> user->upages_start;
>> - unpin_user_pages(user->upages + start_index, npages);
>> + if (!user->file) {
>> + unpin_user_pages(user->upages + start_index, npages);
>> + } else {
>> + long n = user->ufolios_len / sizeof(*user->ufolios);
>> +
>> + unpin_folios(user->ufolios_next,
>> + user->ufolios + n - user->ufolios_next);
>> + }
>> iopt_pages_sub_npinned(pages, npages);
>> user->upages_end = pfns->batch_end_index;
>> }
>
--
Alexey
^ permalink raw reply [flat|nested] 35+ messages in thread

* Re: [PATCH V7 6/9] iommufd: pfn reader for file mappings
2024-11-06 3:18 ` Alexey Kardashevskiy
@ 2024-11-06 13:19 ` Steven Sistare
2024-11-07 10:00 ` Alexey Kardashevskiy
0 siblings, 1 reply; 35+ messages in thread
From: Steven Sistare @ 2024-11-06 13:19 UTC (permalink / raw)
To: Alexey Kardashevskiy, iommu; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen
On 11/5/2024 10:18 PM, Alexey Kardashevskiy wrote:
> On 31/10/24 14:34, Alexey Kardashevskiy wrote:
>> On 26/10/24 00:11, Steve Sistare wrote:
>>> Extend pfn_reader_user to pin file mappings, by calling memfd_pin_folios.
>>> Repin at small page granularity, and fill the batch from folios. Expand
>>> folios to upages for the iopt_pages_fill path.
>>>
>>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>>> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
>>> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
>>> [...]
>>> @@ -752,19 +752,32 @@ struct pfn_reader_user {
>>> * neither
>>> */
>>> int locked;
>>> +
>>> + /* The following are only valid if file != NULL. */
>>
>>
>> Are "struct page **upages" and "size_t upages_len" still valid in this case? The code (kind of) suggests they are not... Thanks,
>
>
> Ping?
Sorry, I don't recall seeing your first email, but others asked this same question
and I answered. upages and upages_len are still used when file!=NULL for mediated
device access, to point to the pages array returned to the kernel client. Maybe
"u" for user is not a good description in that case, but changing that and still
using "u"pages elsewhere would complicate the code for little gain in clarity IMO.
> Also, does "user" in function names mean "userspace addresses" (as opposed to "file")?
Yes, except for mdev as noted above.
> If so, the patchset makes many names misleading. Thanks,
The new stuff I added is prefixed with file. I left the existing upages names
as is. That made sense to me, and made the maintainers happy enough to accept it :)
- Steve
>>> [...]
^ permalink raw reply [flat|nested] 35+ messages in thread

* Re: [PATCH V7 6/9] iommufd: pfn reader for file mappings
2024-11-06 13:19 ` Steven Sistare
@ 2024-11-07 10:00 ` Alexey Kardashevskiy
2024-11-07 13:21 ` Steven Sistare
2024-11-07 14:08 ` Jason Gunthorpe
0 siblings, 2 replies; 35+ messages in thread
From: Alexey Kardashevskiy @ 2024-11-07 10:00 UTC (permalink / raw)
To: Steven Sistare, iommu; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen
On 7/11/24 00:19, Steven Sistare wrote:
> On 11/5/2024 10:18 PM, Alexey Kardashevskiy wrote:
>> On 31/10/24 14:34, Alexey Kardashevskiy wrote:
>>> On 26/10/24 00:11, Steve Sistare wrote:
>>>> Extend pfn_reader_user to pin file mappings, by calling
>>>> memfd_pin_folios.
>>>> Repin at small page granularity, and fill the batch from folios.
>>>> Expand
>>>> folios to upages for the iopt_pages_fill path.
>>>>
>>>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>>>> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
>>>> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
>>>> [...]
>>>> @@ -752,19 +752,32 @@ struct pfn_reader_user {
>>>> * neither
>>>> */
>>>> int locked;
>>>> +
>>>> + /* The following are only valid if file != NULL. */
>>>
>>>
>>> Are "struct page **upages" and "size_t upages_len" still valid in
>>> this case? The code (kind of) suggests they are not... Thanks,
>>
>>
>> Ping?
>
> Sorry, I don't recall seeing your first email, but others asked this
> same question and I answered. upages and upages_len are still used
> when file!=NULL for mediated device access, to point to the pages
> array returned to the kernel client.
This would make a nice addition to the commit log :)
> Maybe "u" for user is not a good description in that case, but
> changing that and still using "u"pages elsewhere would complicate
> the code for little gain in clarity IMO.
>
>> Also, does "user" in function names mean "userspace addresses" (as
>> opposed to "file")?
>
> Yes, except for mdev as noted above.
>
>> If so, the patchset makes many names misleading. Thanks,
>
> The new stuff I added is prefixed with file. I left the existing upages
> names
> as is. That made sense to me, and made the maintainers happy enough to
> accept it :)
Fair enough :)
I was initially pointed at this patchset as an example of what my
guest_memfd pinning of private memory (CoCo VM) should look like. So
far it has been a good fit, except for the actual pinning part:
memfd_pin_folios() expects the file to be shmem or hugetlbfs, and
guest_memfd (at least as of the last couple of months) is neither. Is
there any plan to add support for it? If not, I will, just wanted to
check. At the moment I do what KVM does, which is calling
filemap_grab_folio(). Thanks,
>
> - Steve
>
>>>> + struct file *file;
>>>> + struct folio **ufolios;
>>>> + size_t ufolios_len;
>>>> + unsigned long ufolios_offset;
>>>> + struct folio **ufolios_next;
>>>> };
>>>> static void pfn_reader_user_init(struct pfn_reader_user *user,
>>>> struct iopt_pages *pages)
>>>> {
>>>> user->upages = NULL;
>>>> + user->upages_len = 0;
>>>> user->upages_start = 0;
>>>> user->upages_end = 0;
>>>> user->locked = -1;
>>>> -
>>>> user->gup_flags = FOLL_LONGTERM;
>>>> if (pages->writable)
>>>> user->gup_flags |= FOLL_WRITE;
>>>> +
>>>> + user->file = (pages->type == IOPT_ADDRESS_FILE) ? pages->file :
>>>> NULL;
>>>> + user->ufolios = NULL;
>>>> + user->ufolios_len = 0;
>>>> + user->ufolios_next = NULL;
>>>> + user->ufolios_offset = 0;
>>>> }
>>>> static void pfn_reader_user_destroy(struct pfn_reader_user *user,
>>>> @@ -773,13 +786,67 @@ static void pfn_reader_user_destroy(struct
>>>> pfn_reader_user *user,
>>>> if (user->locked != -1) {
>>>> if (user->locked)
>>>> mmap_read_unlock(pages->source_mm);
>>>> - if (pages->source_mm != current->mm)
>>>> + if (!user->file && pages->source_mm != current->mm)
>>>> mmput(pages->source_mm);
>>>> user->locked = -1;
>>>> }
>>>> kfree(user->upages);
>>>> user->upages = NULL;
>>>> + kfree(user->ufolios);
>>>> + user->ufolios = NULL;
>>>> +}
>>>> +
>>>> +static long pin_memfd_pages(struct pfn_reader_user *user, unsigned
>>>> long start,
>>>> + unsigned long npages)
>>>> +{
>>>> + unsigned long i;
>>>> + unsigned long offset;
>>>> + unsigned long npages_out = 0;
>>>> + struct page **upages = user->upages;
>>>> + unsigned long end = start + (npages << PAGE_SHIFT) - 1;
>>>> + long nfolios = user->ufolios_len / sizeof(*user->ufolios);
>>>> +
>>>> + /*
>>>> + * todo: memfd_pin_folios should return the last pinned offset so
>>>> + * we can compute npages pinned, and avoid looping over folios
>>>> here
>>>> + * if upages == NULL.
>>>> + */
>>>> + nfolios = memfd_pin_folios(user->file, start, end, user->ufolios,
>>>> + nfolios, &offset);
>>>> + if (nfolios <= 0)
>>>> + return nfolios;
>>>> +
>>>> + offset >>= PAGE_SHIFT;
>>>> + user->ufolios_next = user->ufolios;
>>>> + user->ufolios_offset = offset;
>>>> +
>>>> + for (i = 0; i < nfolios; i++) {
>>>> + struct folio *folio = user->ufolios[i];
>>>> + unsigned long nr = folio_nr_pages(folio);
>>>> + unsigned long npin = min(nr - offset, npages);
>>>> +
>>>> + npages -= npin;
>>>> + npages_out += npin;
>>>> +
>>>> + if (upages) {
>>>> + if (npin == 1) {
>>>> + *upages++ = folio_page(folio, offset);
>>>> + } else {
>>>> + int rc = folio_add_pins(folio, npin - 1);
>>>> +
>>>> + if (rc)
>>>> + return rc;
>>>> +
>>>> + while (npin--)
>>>> + *upages++ = folio_page(folio, offset++);
>>>> + }
>>>> + }
>>>> +
>>>> + offset = 0;
>>>> + }
>>>> +
>>>> + return npages_out;
>>>> }
>>>> static int pfn_reader_user_pin(struct pfn_reader_user *user,
>>>> @@ -788,7 +855,9 @@ static int pfn_reader_user_pin(struct
>>>> pfn_reader_user *user,
>>>> unsigned long last_index)
>>>> {
>>>> bool remote_mm = pages->source_mm != current->mm;
>>>> - unsigned long npages;
>>>> + unsigned long npages = last_index - start_index + 1;
>>>> + unsigned long start;
>>>> + unsigned long unum;
>>>> uintptr_t uptr;
>>>> long rc;
>>>> @@ -796,40 +865,50 @@ static int pfn_reader_user_pin(struct
>>>> pfn_reader_user *user,
>>>> WARN_ON(last_index < start_index))
>>>> return -EINVAL;
>>>> - if (!user->upages) {
>>>> + if (!user->file && !user->upages) {
>>>> /* All undone in pfn_reader_destroy() */
>>>> - user->upages_len =
>>>> - (last_index - start_index + 1) * sizeof(*user->upages);
>>>> + user->upages_len = npages * sizeof(*user->upages);
>>>> user->upages = temp_kmalloc(&user->upages_len, NULL, 0);
>>>> if (!user->upages)
>>>> return -ENOMEM;
>>>> }
>>>> + if (user->file && !user->ufolios) {
>>>> + user->ufolios_len = npages * sizeof(*user->ufolios);
>>>> + user->ufolios = temp_kmalloc(&user->ufolios_len, NULL, 0);
>>>> + if (!user->ufolios)
>>>> + return -ENOMEM;
>>>> + }
>>>> +
>>>> if (user->locked == -1) {
>>>> /*
>>>> * The majority of usages will run the map task within the mm
>>>> * providing the pages, so we can optimize into
>>>> * get_user_pages_fast()
>>>> */
>>>> - if (remote_mm) {
>>>> + if (!user->file && remote_mm) {
>>>> if (!mmget_not_zero(pages->source_mm))
>>>> return -EFAULT;
>>>> }
>>>> user->locked = 0;
>>>> }
>>>> - npages = min_t(unsigned long, last_index - start_index + 1,
>>>> - user->upages_len / sizeof(*user->upages));
>>>> -
>>>> + unum = user->file ? user->ufolios_len / sizeof(*user->ufolios) :
>>>> + user->upages_len / sizeof(*user->upages);
>>>> + npages = min_t(unsigned long, npages, unum);
>>>> if (iommufd_should_fail())
>>>> return -EFAULT;
>>>> - uptr = (uintptr_t)(pages->uptr + start_index * PAGE_SIZE);
>>>> - if (!remote_mm)
>>>> + if (user->file) {
>>>> + start = pages->start + (start_index * PAGE_SIZE);
>>>> + rc = pin_memfd_pages(user, start, npages);
>>>> + } else if (!remote_mm) {
>>>> + uptr = (uintptr_t)(pages->uptr + start_index * PAGE_SIZE);
>>>> rc = pin_user_pages_fast(uptr, npages, user->gup_flags,
>>>> user->upages);
>>>> - else {
>>>> + } else {
>>>> + uptr = (uintptr_t)(pages->uptr + start_index * PAGE_SIZE);
>>>> if (!user->locked) {
>>>> mmap_read_lock(pages->source_mm);
>>>> user->locked = 1;
>>>> @@ -887,7 +966,8 @@ static int update_mm_locked_vm(struct iopt_pages
>>>> *pages, unsigned long npages,
>>>> mmap_read_unlock(pages->source_mm);
>>>> user->locked = 0;
>>>> /* If we had the lock then we also have a get */
>>>> - } else if ((!user || !user->upages) &&
>>>> +
>>>> + } else if ((!user || (!user->upages && !user->ufolios)) &&
>>>> pages->source_mm != current->mm) {
>>>> if (!mmget_not_zero(pages->source_mm))
>>>> return -EINVAL;
>>>> @@ -1068,8 +1148,15 @@ static int pfn_reader_fill_span(struct
>>>> pfn_reader *pfns)
>>>> npages = user->upages_end - start_index;
>>>> start_index -= user->upages_start;
>>>> - batch_from_pages(&pfns->batch, user->upages + start_index,
>>>> npages);
>>>> - return 0;
>>>> + rc = 0;
>>>> +
>>>> + if (!user->file)
>>>> + batch_from_pages(&pfns->batch, user->upages + start_index,
>>>> + npages);
>>>> + else
>>>> + rc = batch_from_folios(&pfns->batch, &user->ufolios_next,
>>>> + &user->ufolios_offset, npages);
>>>> + return rc;
>>>> }
>>>> static bool pfn_reader_done(struct pfn_reader *pfns)
>>>> @@ -1151,7 +1238,14 @@ static void pfn_reader_release_pins(struct pfn_reader *pfns)
>>>> unsigned long start_index = pfns->batch_end_index -
>>>> user->upages_start;
>>>> - unpin_user_pages(user->upages + start_index, npages);
>>>> + if (!user->file) {
>>>> + unpin_user_pages(user->upages + start_index, npages);
>>>> + } else {
>>>> + long n = user->ufolios_len / sizeof(*user->ufolios);
>>>> +
>>>> + unpin_folios(user->ufolios_next,
>>>> + user->ufolios + n - user->ufolios_next);
>>>> + }
>>>> iopt_pages_sub_npinned(pages, npages);
>>>> user->upages_end = pfns->batch_end_index;
>>>> }
>>>
>>
>
--
Alexey
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH V7 6/9] iommufd: pfn reader for file mappings
2024-11-07 10:00 ` Alexey Kardashevskiy
@ 2024-11-07 13:21 ` Steven Sistare
2024-11-07 14:08 ` Jason Gunthorpe
1 sibling, 0 replies; 35+ messages in thread
From: Steven Sistare @ 2024-11-07 13:21 UTC (permalink / raw)
To: Alexey Kardashevskiy, iommu; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen
On 11/7/2024 5:00 AM, Alexey Kardashevskiy wrote:
>
>
> On 7/11/24 00:19, Steven Sistare wrote:
>> On 11/5/2024 10:18 PM, Alexey Kardashevskiy wrote:
>>> On 31/10/24 14:34, Alexey Kardashevskiy wrote:
>>>> On 26/10/24 00:11, Steve Sistare wrote:
>>>>> Extend pfn_reader_user to pin file mappings, by calling memfd_pin_folios.
>>>>> Repin at small page granularity, and fill the batch from folios. Expand
>>>>> folios to upages for the iopt_pages_fill path.
>>>>>
>>>>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>>>>> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
>>>>> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
>>>>> ---
>>>>> drivers/iommu/iommufd/io_pagetable.h | 5 ++
>>>>> drivers/iommu/iommufd/pages.c | 128 ++++++++++++++++++++++++++++++-----
>>>>> 2 files changed, 116 insertions(+), 17 deletions(-)
>>>>>
>>>>> diff --git a/drivers/iommu/iommufd/io_pagetable.h b/drivers/iommu/iommufd/io_pagetable.h
>>>>> index 8e48266..5ac4eed 100644
>>>>> --- a/drivers/iommu/iommufd/io_pagetable.h
>>>>> +++ b/drivers/iommu/iommufd/io_pagetable.h
>>>>> @@ -177,6 +177,7 @@ enum {
>>>>> enum iopt_address_type {
>>>>> IOPT_ADDRESS_USER = 0,
>>>>> + IOPT_ADDRESS_FILE = 1,
>>>>> };
>>>>> /*
>>>>> @@ -202,6 +203,10 @@ struct iopt_pages {
>>>>> enum iopt_address_type type;
>>>>> union {
>>>>> void __user *uptr; /* IOPT_ADDRESS_USER */
>>>>> + struct { /* IOPT_ADDRESS_FILE */
>>>>> + struct file *file;
>>>>> + unsigned long start;
>>>>> + };
>>>>> };
>>>>> bool writable:1;
>>>>> u8 account_mode;
>>>>> diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
>>>>> index aa79504..5f371fa 100644
>>>>> --- a/drivers/iommu/iommufd/pages.c
>>>>> +++ b/drivers/iommu/iommufd/pages.c
>>>>> @@ -752,19 +752,32 @@ struct pfn_reader_user {
>>>>> * neither
>>>>> */
>>>>> int locked;
>>>>> +
>>>>> + /* The following are only valid if file != NULL. */
>>>>
>>>>
>>>> Are "struct page **upages" and "size_t upages_len" still valid in this case? The code (kind of) suggests they are not... Thanks,
>>>
>>>
>>> Ping?
>>
>> Sorry, I don't recall seeing your first email, but others asked this same question
>> and I answered. upages and upages_len are still used when file!=NULL for mediated
>> device access, to point to the pages array returned to the kernel client.
>
> This would make a nice addition to the commit log :)
>
>> Maybe
>> "u" for user is not a good description in that case, but changing that and still
>> using "u"pages elsewhere would complicate the code for little gain in clarity IMO.
>>
>>> Also does "user" in function names mean "userspace addresses" (as opposite to "file"?
>>
>> Yes, except for mdev as noted above.
>>
>>> If so, the patchset makes many names misleading then. Thanks,
>>
>> The new stuff I added is prefixed with file. I left the existing upages names
>> as is. That made sense to me, and made the maintainers happy enough to accept it :)
>
> Fair enough :)
>
> I was initially pointed at this patchset as an example of what my guest_memfd pinning of private memory (CoCo VM) should look like. And so far it's been good except the actual pinning part - memfd_pin_folios() - expects the file to be shmem or hugetlbfs and guest_memfd is neither (well, for the last couple of months). Is there any plan to add those? If none, I will then; just wanted to check. At the moment I do what KVM does, which is calling filemap_grab_folio(). Thanks,
I have no plans to implement guest_memfd pinning. Not sure about other folks.
- Steve
* Re: [PATCH V7 6/9] iommufd: pfn reader for file mappings
2024-11-07 10:00 ` Alexey Kardashevskiy
2024-11-07 13:21 ` Steven Sistare
@ 2024-11-07 14:08 ` Jason Gunthorpe
2024-11-08 0:01 ` Alexey Kardashevskiy
1 sibling, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2024-11-07 14:08 UTC (permalink / raw)
To: Alexey Kardashevskiy; +Cc: Steven Sistare, iommu, Kevin Tian, Nicolin Chen
On Thu, Nov 07, 2024 at 09:00:50PM +1100, Alexey Kardashevskiy wrote:
> I was initially pointed at this patchset as an example of what my guest_memfd
> pinning of private memory (CoCo VM) should look like. And so far it's been
> good except the actual pinning part - memfd_pin_folios() - expects the file
> to be shmem or hugetlbfs and guest_memfd is neither (well, last couple of
> months). Is there any plan to add those? If none, I will then, just wanted
> to check. At the moment I do what KVM does which is calling
> filemap_grab_folio(). Thanks,
I understood that the guestmemfd folks really don't want typical
pinning behavior; you'll probably have to negotiate with them on
exactly what the interface is?
I thought KVM had its own private interface to guestmemfd?
Jason
* Re: [PATCH V7 6/9] iommufd: pfn reader for file mappings
2024-11-07 14:08 ` Jason Gunthorpe
@ 2024-11-08 0:01 ` Alexey Kardashevskiy
2024-11-14 4:03 ` Alexey Kardashevskiy
0 siblings, 1 reply; 35+ messages in thread
From: Alexey Kardashevskiy @ 2024-11-08 0:01 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Steven Sistare, iommu, Kevin Tian, Nicolin Chen
On 8/11/24 01:08, Jason Gunthorpe wrote:
> On Thu, Nov 07, 2024 at 09:00:50PM +1100, Alexey Kardashevskiy wrote:
>> I was initially pointed at this patchset as an example of what my guest_memfd
>> pinning of private memory (CoCo VM) should look like. And so far it's been
>> good except the actual pinning part - memfd_pin_folios() - expects the file
>> to be shmem or hugetlbfs and guest_memfd is neither (well, last couple of
>> months). Is there any plan to add those? If none, I will then, just wanted
>> to check. At the moment I do what KVM does which is calling
>> filemap_grab_folio(). Thanks,
>
> I understood that the guestmemfd folks really don't want typical
> pinning behavior, you'll probably have to negotiate with them on
> exactly what the interface is?
Yup I will, just checking the plans here.
> I thought KVM had its own private interface to guestmemfd?
At the moment it is essentially this:
folio = filemap_grab_folio(file_inode(file)->i_mapping, index);
Thanks,
--
Alexey
* Re: [PATCH V7 6/9] iommufd: pfn reader for file mappings
2024-11-08 0:01 ` Alexey Kardashevskiy
@ 2024-11-14 4:03 ` Alexey Kardashevskiy
2024-11-14 16:17 ` Jason Gunthorpe
0 siblings, 1 reply; 35+ messages in thread
From: Alexey Kardashevskiy @ 2024-11-14 4:03 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Steven Sistare, iommu, Kevin Tian, Nicolin Chen
On 8/11/24 11:01, Alexey Kardashevskiy wrote:
>
>
> On 8/11/24 01:08, Jason Gunthorpe wrote:
>> On Thu, Nov 07, 2024 at 09:00:50PM +1100, Alexey Kardashevskiy wrote:
>>> I was initially pointed at this patchset as an example of what my
>>> guest_memfd
>>> pinning of private memory (CoCo VM) should look like. And so far it's
>>> been
>>> good except the actual pinning part - memfd_pin_folios() - expects
>>> the file
>>> to be shmem or hugetlbfs and guest_memfd is neither (well, last
>>> couple of
>>> months). Is there any plan to add those? If none, I will then, just
>>> wanted
>>> to check. At the moment I do what KVM does which is calling
>>> filemap_grab_folio(). Thanks,
>>
>> I understood that the guestmemfd folks really don't want typical
>> pinning behavior, you'll probably have to negotiate with them on
>> exactly what the interface is?
>
> Yup I will, just checking the plans here.
>
>> I thought KVM had its own private interface to guestmemfd?
>
> At the moment it is essentially this:
>
> folio = filemap_grab_folio(file_inode(file)->i_mapping, index);
Turns out the complete sequence is:
folio = filemap_grab_folio(file_inode(file)->i_mapping, index);
folio_add_pins(folio, 1);
folio_put(folio);
so then unpin_user_page_range_dirty_lock() =>
gup_fast_unpin_user_pages() => gup_put_folio() work properly.
Later on, hopefully, guest_memfd will develop the ability to mmap() and
then the existing iommufd mapping API will "just work", I suppose. I'll
carry on with the above for now. Thanks,
>
> Thanks,
>
--
Alexey
* Re: [PATCH V7 6/9] iommufd: pfn reader for file mappings
2024-11-14 4:03 ` Alexey Kardashevskiy
@ 2024-11-14 16:17 ` Jason Gunthorpe
2024-11-18 1:24 ` Alexey Kardashevskiy
0 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2024-11-14 16:17 UTC (permalink / raw)
To: Alexey Kardashevskiy; +Cc: Steven Sistare, iommu, Kevin Tian, Nicolin Chen
On Thu, Nov 14, 2024 at 03:03:36PM +1100, Alexey Kardashevskiy wrote:
> > > I thought KVM had its own private interface to guestmemfd?
> >
> > At the moment it is essentially this:
> >
> > folio = filemap_grab_folio(file_inode(file)->i_mapping, index);
>
> Turns out the complete sequence is:
>
> folio = filemap_grab_folio(file_inode(file)->i_mapping, index);
> folio_add_pins(folio, 1);
> folio_put(folio);
>
> so then unpin_user_page_range_dirty_lock() => gup_fast_unpin_user_pages() =>
> gup_put_folio() work properly.
I'm surprised given how forceful people were that pinning should not
be done there. Please confirm with guestmemfd experts..
>> Later on, hopefully, guest_memfd will develop the ability to mmap() and then
>> the existing iommufd mapping API will "just work", I suppose. I'll carry on
>> with the above for now. Thanks,
I think that is not possible; the entire point of guest memfd is that
private memory is never mapped.
For the AMD case we still need to program the address of private
memory into the hypervisor iommu page tables, so the FD path will have
to be used to make that connection.
Jason
* Re: [PATCH V7 6/9] iommufd: pfn reader for file mappings
2024-11-14 16:17 ` Jason Gunthorpe
@ 2024-11-18 1:24 ` Alexey Kardashevskiy
0 siblings, 0 replies; 35+ messages in thread
From: Alexey Kardashevskiy @ 2024-11-18 1:24 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Steven Sistare, iommu, Kevin Tian, Nicolin Chen
On 15/11/24 03:17, Jason Gunthorpe wrote:
> On Thu, Nov 14, 2024 at 03:03:36PM +1100, Alexey Kardashevskiy wrote:
>>>> I thought KVM had its own private interface to guestmemfd?
>>>
>>> At the moment it is essentially this:
>>>
>>> folio = filemap_grab_folio(file_inode(file)->i_mapping, index);
>>
>> Turns out the complete sequence is:
>>
>> folio = filemap_grab_folio(file_inode(file)->i_mapping, index);
>> folio_add_pins(folio, 1);
>> folio_put(folio);
>>
>> so then unpin_user_page_range_dirty_lock() => gup_fast_unpin_user_pages() =>
>> gup_put_folio() work properly.
>
> I'm surprised given how forceful people were that pinning should not
> be done there. Please confirm with guestmemfd experts..
>
>> Later on, hopefully, guest_memfd will develop the ability to mmap() and then
>> the existing iommufd mapping API will "just work", I suppose. I'll carry on
>> with the above for now. Thanks,
>
> I think that is not possible; the entire point of guest memfd is that
> private memory is never mapped.
It was like that but recently the approach changed to allow mappings. I
do not think the interface is finalized yet though.
> For the AMD case we still need to program the address of private
> memory into the hypervisor iommu page tables, so the FD path will have
> to be used to make that connection.
Right, using it now in my working tree, with this new vdevice thingy
https://github.com/aik/linux/commits/tsm/
Thanks,
--
Alexey
* [PATCH V7 7/9] iommufd: IOMMU_IOAS_MAP_FILE
2024-10-25 13:11 [PATCH V7 0/9] iommu_ioas_map_file Steve Sistare
` (5 preceding siblings ...)
2024-10-25 13:11 ` [PATCH V7 6/9] iommufd: pfn reader for file mappings Steve Sistare
@ 2024-10-25 13:11 ` Steve Sistare
2024-10-25 13:11 ` [PATCH V7 8/9] iommufd: file mappings for mdev Steve Sistare
` (2 subsequent siblings)
9 siblings, 0 replies; 35+ messages in thread
From: Steve Sistare @ 2024-10-25 13:11 UTC (permalink / raw)
To: iommu; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen, Steve Sistare
Define the IOMMU_IOAS_MAP_FILE ioctl interface, which allows a user to
register memory by passing a memfd plus offset and length. Implement it
using the memfd_pin_folios KAPI.
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
drivers/iommu/iommufd/io_pagetable.c | 36 ++++++++++++++++++++++++-
drivers/iommu/iommufd/io_pagetable.h | 2 ++
drivers/iommu/iommufd/ioas.c | 47 +++++++++++++++++++++++++++++++++
drivers/iommu/iommufd/iommufd_private.h | 5 ++++
drivers/iommu/iommufd/main.c | 2 ++
drivers/iommu/iommufd/pages.c | 23 ++++++++++++++++
include/uapi/linux/iommufd.h | 25 ++++++++++++++++++
7 files changed, 139 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/iommufd/io_pagetable.c b/drivers/iommu/iommufd/io_pagetable.c
index 874ee9e..8a790e5 100644
--- a/drivers/iommu/iommufd/io_pagetable.c
+++ b/drivers/iommu/iommufd/io_pagetable.c
@@ -268,7 +268,14 @@ static int iopt_alloc_area_pages(struct io_pagetable *iopt,
/* Use the first entry to guess the ideal IOVA alignment */
elm = list_first_entry(pages_list, struct iopt_pages_list,
next);
- start = elm->start_byte + (uintptr_t)elm->pages->uptr;
+ switch (elm->pages->type) {
+ case IOPT_ADDRESS_USER:
+ start = elm->start_byte + (uintptr_t)elm->pages->uptr;
+ break;
+ case IOPT_ADDRESS_FILE:
+ start = elm->start_byte + elm->pages->start;
+ break;
+ }
rc = iopt_alloc_iova(iopt, dst_iova, start, length);
if (rc)
goto out_unlock;
@@ -446,6 +453,33 @@ int iopt_map_user_pages(struct iommufd_ctx *ictx, struct io_pagetable *iopt,
uptr - pages->uptr, iommu_prot, flags);
}
+/**
+ * iopt_map_file_pages() - Like iopt_map_user_pages, but map a file.
+ * @ictx: iommufd_ctx the iopt is part of
+ * @iopt: io_pagetable to act on
+ * @iova: If IOPT_ALLOC_IOVA is set this is unused on input and contains
+ * the chosen iova on output. Otherwise is the iova to map to on input
+ * @file: file to map
+ * @start: map file starting at this byte offset
+ * @length: Number of bytes to map
+ * @iommu_prot: Combination of IOMMU_READ/WRITE/etc bits for the mapping
+ * @flags: IOPT_ALLOC_IOVA or zero
+ */
+int iopt_map_file_pages(struct iommufd_ctx *ictx, struct io_pagetable *iopt,
+ unsigned long *iova, struct file *file,
+ unsigned long start, unsigned long length,
+ int iommu_prot, unsigned int flags)
+{
+ struct iopt_pages *pages;
+
+ pages = iopt_alloc_file_pages(file, start, length,
+ iommu_prot & IOMMU_WRITE);
+ if (IS_ERR(pages))
+ return PTR_ERR(pages);
+ return iopt_map_common(ictx, iopt, pages, iova, length,
+ start - pages->start, iommu_prot, flags);
+}
+
struct iova_bitmap_fn_arg {
unsigned long flags;
struct io_pagetable *iopt;
diff --git a/drivers/iommu/iommufd/io_pagetable.h b/drivers/iommu/iommufd/io_pagetable.h
index 5ac4eed..9b40b22 100644
--- a/drivers/iommu/iommufd/io_pagetable.h
+++ b/drivers/iommu/iommufd/io_pagetable.h
@@ -220,6 +220,8 @@ struct iopt_pages {
struct iopt_pages *iopt_alloc_user_pages(void __user *uptr,
unsigned long length, bool writable);
+struct iopt_pages *iopt_alloc_file_pages(struct file *file, unsigned long start,
+ unsigned long length, bool writable);
void iopt_release_pages(struct kref *kref);
static inline void iopt_put_pages(struct iopt_pages *pages)
{
diff --git a/drivers/iommu/iommufd/ioas.c b/drivers/iommu/iommufd/ioas.c
index 2c4b2bb..c05d33f 100644
--- a/drivers/iommu/iommufd/ioas.c
+++ b/drivers/iommu/iommufd/ioas.c
@@ -2,6 +2,7 @@
/*
* Copyright (c) 2021-2022, NVIDIA CORPORATION & AFFILIATES
*/
+#include <linux/file.h>
#include <linux/interval_tree.h>
#include <linux/iommu.h>
#include <linux/iommufd.h>
@@ -197,6 +198,52 @@ static int conv_iommu_prot(u32 map_flags)
return iommu_prot;
}
+int iommufd_ioas_map_file(struct iommufd_ucmd *ucmd)
+{
+ struct iommu_ioas_map_file *cmd = ucmd->cmd;
+ unsigned long iova = cmd->iova;
+ struct iommufd_ioas *ioas;
+ unsigned int flags = 0;
+ struct file *file;
+ int rc;
+
+ if (cmd->flags &
+ ~(IOMMU_IOAS_MAP_FIXED_IOVA | IOMMU_IOAS_MAP_WRITEABLE |
+ IOMMU_IOAS_MAP_READABLE))
+ return -EOPNOTSUPP;
+
+ if (cmd->iova >= ULONG_MAX || cmd->length >= ULONG_MAX)
+ return -EOVERFLOW;
+
+ if (!(cmd->flags &
+ (IOMMU_IOAS_MAP_WRITEABLE | IOMMU_IOAS_MAP_READABLE)))
+ return -EINVAL;
+
+ ioas = iommufd_get_ioas(ucmd->ictx, cmd->ioas_id);
+ if (IS_ERR(ioas))
+ return PTR_ERR(ioas);
+
+ if (!(cmd->flags & IOMMU_IOAS_MAP_FIXED_IOVA))
+ flags = IOPT_ALLOC_IOVA;
+
+ file = fget(cmd->fd);
+ if (!file)
+ return -EBADF;
+
+ rc = iopt_map_file_pages(ucmd->ictx, &ioas->iopt, &iova, file,
+ cmd->start, cmd->length,
+ conv_iommu_prot(cmd->flags), flags);
+ if (rc)
+ goto out_put;
+
+ cmd->iova = iova;
+ rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
+out_put:
+ iommufd_put_object(ucmd->ictx, &ioas->obj);
+ fput(file);
+ return rc;
+}
+
int iommufd_ioas_map(struct iommufd_ucmd *ucmd)
{
struct iommu_ioas_map *cmd = ucmd->cmd;
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index f1d865e..8f3c21a 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -69,6 +69,10 @@ int iopt_map_user_pages(struct iommufd_ctx *ictx, struct io_pagetable *iopt,
unsigned long *iova, void __user *uptr,
unsigned long length, int iommu_prot,
unsigned int flags);
+int iopt_map_file_pages(struct iommufd_ctx *ictx, struct io_pagetable *iopt,
+ unsigned long *iova, struct file *file,
+ unsigned long start, unsigned long length,
+ int iommu_prot, unsigned int flags);
int iopt_map_pages(struct io_pagetable *iopt, struct list_head *pages_list,
unsigned long length, unsigned long *dst_iova,
int iommu_prot, unsigned int flags);
@@ -276,6 +280,7 @@ static inline struct iommufd_ioas *iommufd_get_ioas(struct iommufd_ctx *ictx,
int iommufd_ioas_iova_ranges(struct iommufd_ucmd *ucmd);
int iommufd_ioas_allow_iovas(struct iommufd_ucmd *ucmd);
int iommufd_ioas_map(struct iommufd_ucmd *ucmd);
+int iommufd_ioas_map_file(struct iommufd_ucmd *ucmd);
int iommufd_ioas_copy(struct iommufd_ucmd *ucmd);
int iommufd_ioas_unmap(struct iommufd_ucmd *ucmd);
int iommufd_ioas_option(struct iommufd_ucmd *ucmd);
diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index b5f5d27..826a2b2 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -378,6 +378,8 @@ struct iommufd_ioctl_op {
struct iommu_ioas_iova_ranges, out_iova_alignment),
IOCTL_OP(IOMMU_IOAS_MAP, iommufd_ioas_map, struct iommu_ioas_map,
iova),
+ IOCTL_OP(IOMMU_IOAS_MAP_FILE, iommufd_ioas_map_file,
+ struct iommu_ioas_map_file, iova),
IOCTL_OP(IOMMU_IOAS_UNMAP, iommufd_ioas_unmap, struct iommu_ioas_unmap,
length),
IOCTL_OP(IOMMU_OPTION, iommufd_option, struct iommu_option,
diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
index 5f371fa..2ee6fcd 100644
--- a/drivers/iommu/iommufd/pages.c
+++ b/drivers/iommu/iommufd/pages.c
@@ -45,6 +45,7 @@
* last_iova + 1 can overflow. An iopt_pages index will always be much less than
* ULONG_MAX so last_index + 1 cannot overflow.
*/
+#include <linux/file.h>
#include <linux/highmem.h>
#include <linux/iommu.h>
#include <linux/iommufd.h>
@@ -1340,6 +1341,26 @@ struct iopt_pages *iopt_alloc_user_pages(void __user *uptr,
return pages;
}
+struct iopt_pages *iopt_alloc_file_pages(struct file *file, unsigned long start,
+ unsigned long length, bool writable)
+
+{
+ struct iopt_pages *pages;
+ unsigned long start_down = ALIGN_DOWN(start, PAGE_SIZE);
+ unsigned long end;
+
+ if (length && check_add_overflow(start, length - 1, &end))
+ return ERR_PTR(-EOVERFLOW);
+
+ pages = iopt_alloc_pages(start - start_down, length, writable);
+ if (IS_ERR(pages))
+ return pages;
+ pages->file = get_file(file);
+ pages->start = start_down;
+ pages->type = IOPT_ADDRESS_FILE;
+ return pages;
+}
+
void iopt_release_pages(struct kref *kref)
{
struct iopt_pages *pages = container_of(kref, struct iopt_pages, kref);
@@ -1352,6 +1373,8 @@ void iopt_release_pages(struct kref *kref)
mutex_destroy(&pages->mutex);
put_task_struct(pages->source_task);
free_uid(pages->source_user);
+ if (pages->type == IOPT_ADDRESS_FILE)
+ fput(pages->file);
kfree(pages);
}
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index 72010f7..41b1a01 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -51,6 +51,7 @@ enum {
IOMMUFD_CMD_HWPT_GET_DIRTY_BITMAP = 0x8c,
IOMMUFD_CMD_HWPT_INVALIDATE = 0x8d,
IOMMUFD_CMD_FAULT_QUEUE_ALLOC = 0x8e,
+ IOMMUFD_CMD_IOAS_MAP_FILE = 0x8f,
};
/**
@@ -214,6 +215,30 @@ struct iommu_ioas_map {
#define IOMMU_IOAS_MAP _IO(IOMMUFD_TYPE, IOMMUFD_CMD_IOAS_MAP)
/**
+ * struct iommu_ioas_map_file - ioctl(IOMMU_IOAS_MAP_FILE)
+ * @size: sizeof(struct iommu_ioas_map_file)
+ * @flags: same as for iommu_ioas_map
+ * @ioas_id: same as for iommu_ioas_map
+ * @fd: the memfd to map
+ * @start: byte offset from start of file to map from
+ * @length: same as for iommu_ioas_map
+ * @iova: same as for iommu_ioas_map
+ *
+ * Set an IOVA mapping from a memfd file. All other arguments and semantics
+ * match those of IOMMU_IOAS_MAP.
+ */
+struct iommu_ioas_map_file {
+ __u32 size;
+ __u32 flags;
+ __u32 ioas_id;
+ __s32 fd;
+ __aligned_u64 start;
+ __aligned_u64 length;
+ __aligned_u64 iova;
+};
+#define IOMMU_IOAS_MAP_FILE _IO(IOMMUFD_TYPE, IOMMUFD_CMD_IOAS_MAP_FILE)
+
+/**
* struct iommu_ioas_copy - ioctl(IOMMU_IOAS_COPY)
* @size: sizeof(struct iommu_ioas_copy)
* @flags: Combination of enum iommufd_ioas_map_flags
--
1.8.3.1
* [PATCH V7 8/9] iommufd: file mappings for mdev
2024-10-25 13:11 [PATCH V7 0/9] iommu_ioas_map_file Steve Sistare
` (6 preceding siblings ...)
2024-10-25 13:11 ` [PATCH V7 7/9] iommufd: IOMMU_IOAS_MAP_FILE Steve Sistare
@ 2024-10-25 13:11 ` Steve Sistare
2024-10-25 13:11 ` [PATCH V7 9/9] iommufd: map file selftest Steve Sistare
2024-10-30 0:11 ` [PATCH V7 0/9] iommu_ioas_map_file Jason Gunthorpe
9 siblings, 0 replies; 35+ messages in thread
From: Steve Sistare @ 2024-10-25 13:11 UTC (permalink / raw)
To: iommu; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen, Steve Sistare
Support file mappings for mediated devices, aka mdevs. Access is initiated
by the vfio_pin_pages and vfio_dma_rw kernel interfaces.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
drivers/iommu/iommufd/pages.c | 27 ++++++++++++++++++++-------
1 file changed, 20 insertions(+), 7 deletions(-)
diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
index 2ee6fcd..8f24916 100644
--- a/drivers/iommu/iommufd/pages.c
+++ b/drivers/iommu/iommufd/pages.c
@@ -1814,11 +1814,11 @@ static int iopt_pages_fill_from_domain(struct iopt_pages *pages,
return 0;
}
-static int iopt_pages_fill_from_mm(struct iopt_pages *pages,
- struct pfn_reader_user *user,
- unsigned long start_index,
- unsigned long last_index,
- struct page **out_pages)
+static int iopt_pages_fill(struct iopt_pages *pages,
+ struct pfn_reader_user *user,
+ unsigned long start_index,
+ unsigned long last_index,
+ struct page **out_pages)
{
unsigned long cur_index = start_index;
int rc;
@@ -1892,8 +1892,8 @@ int iopt_pages_fill_xarray(struct iopt_pages *pages, unsigned long start_index,
/* hole */
cur_pages = out_pages + (span.start_hole - start_index);
- rc = iopt_pages_fill_from_mm(pages, &user, span.start_hole,
- span.last_hole, cur_pages);
+ rc = iopt_pages_fill(pages, &user, span.start_hole,
+ span.last_hole, cur_pages);
if (rc)
goto out_clean_xa;
rc = pages_to_xarray(&pages->pinned_pfns, span.start_hole,
@@ -1973,6 +1973,10 @@ static int iopt_pages_rw_page(struct iopt_pages *pages, unsigned long index,
struct page *page = NULL;
int rc;
+ if (IS_ENABLED(CONFIG_IOMMUFD_TEST) &&
+ WARN_ON(pages->type != IOPT_ADDRESS_USER))
+ return -EINVAL;
+
if (!mmget_not_zero(pages->source_mm))
return iopt_pages_rw_slow(pages, index, index, offset, data,
length, flags);
@@ -2028,6 +2032,15 @@ int iopt_pages_rw_access(struct iopt_pages *pages, unsigned long start_byte,
if ((flags & IOMMUFD_ACCESS_RW_WRITE) && !pages->writable)
return -EPERM;
+ if (pages->type == IOPT_ADDRESS_FILE)
+ return iopt_pages_rw_slow(pages, start_index, last_index,
+ start_byte % PAGE_SIZE, data, length,
+ flags);
+
+ if (IS_ENABLED(CONFIG_IOMMUFD_TEST) &&
+ WARN_ON(pages->type != IOPT_ADDRESS_USER))
+ return -EINVAL;
+
if (!(flags & IOMMUFD_ACCESS_RW_KTHREAD) && change_mm) {
if (start_index == last_index)
return iopt_pages_rw_page(pages, start_index,
--
1.8.3.1
* [PATCH V7 9/9] iommufd: map file selftest
2024-10-25 13:11 [PATCH V7 0/9] iommu_ioas_map_file Steve Sistare
` (7 preceding siblings ...)
2024-10-25 13:11 ` [PATCH V7 8/9] iommufd: file mappings for mdev Steve Sistare
@ 2024-10-25 13:11 ` Steve Sistare
2024-10-25 13:14 ` Steven Sistare
2024-10-25 17:04 ` Nicolin Chen
2024-10-30 0:11 ` [PATCH V7 0/9] iommu_ioas_map_file Jason Gunthorpe
9 siblings, 2 replies; 35+ messages in thread
From: Steve Sistare @ 2024-10-25 13:11 UTC (permalink / raw)
To: iommu; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen, Steve Sistare
Add test cases to exercise IOMMU_IOAS_MAP_FILE.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
---
tools/testing/selftests/iommu/iommufd.c | 124 ++++++++++++++++++++---
tools/testing/selftests/iommu/iommufd_fail_nth.c | 39 +++++++
tools/testing/selftests/iommu/iommufd_utils.h | 57 +++++++++++
3 files changed, 205 insertions(+), 15 deletions(-)
diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c
index 4927b9a..88b92bb 100644
--- a/tools/testing/selftests/iommu/iommufd.c
+++ b/tools/testing/selftests/iommu/iommufd.c
@@ -1,5 +1,6 @@
// SPDX-License-Identifier: GPL-2.0-only
/* Copyright (c) 2021-2022, NVIDIA CORPORATION & AFFILIATES */
+#include <asm/unistd.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/eventfd.h>
@@ -49,6 +50,9 @@ static __attribute__((constructor)) void setup_sizes(void)
vrc = mmap(buffer, BUFFER_SIZE, PROT_READ | PROT_WRITE,
MAP_SHARED | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
assert(vrc == buffer);
+
+ mfd_buffer = memfd_mmap(BUFFER_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED,
+ &mfd);
}
FIXTURE(iommufd)
@@ -128,6 +132,7 @@ static __attribute__((constructor)) void setup_sizes(void)
TEST_LENGTH(iommu_ioas_unmap, IOMMU_IOAS_UNMAP, length);
TEST_LENGTH(iommu_option, IOMMU_OPTION, val64);
TEST_LENGTH(iommu_vfio_ioas, IOMMU_VFIO_IOAS, __reserved);
+ TEST_LENGTH(iommu_ioas_map_file, IOMMU_IOAS_MAP_FILE, iova);
#undef TEST_LENGTH
}
@@ -1372,6 +1377,7 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
{
unsigned int mock_domains;
bool hugepages;
+ bool file;
};
FIXTURE_SETUP(iommufd_mock_domain)
@@ -1410,26 +1416,45 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
{
.mock_domains = 1,
.hugepages = false,
+ .file = false,
};
FIXTURE_VARIANT_ADD(iommufd_mock_domain, two_domains)
{
.mock_domains = 2,
.hugepages = false,
+ .file = false,
};
FIXTURE_VARIANT_ADD(iommufd_mock_domain, one_domain_hugepage)
{
.mock_domains = 1,
.hugepages = true,
+ .file = false,
};
FIXTURE_VARIANT_ADD(iommufd_mock_domain, two_domains_hugepage)
{
.mock_domains = 2,
.hugepages = true,
+ .file = false,
};
+FIXTURE_VARIANT_ADD(iommufd_mock_domain, one_domain_file)
+{
+ .mock_domains = 1,
+ .hugepages = false,
+ .file = true,
+};
+
+FIXTURE_VARIANT_ADD(iommufd_mock_domain, one_domain_file_hugepage)
+{
+ .mock_domains = 1,
+ .hugepages = true,
+ .file = true,
+};
+
+
/* Have the kernel check that the user pages made it to the iommu_domain */
#define check_mock_iova(_ptr, _iova, _length) \
({ \
@@ -1455,7 +1480,10 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
} \
})
-TEST_F(iommufd_mock_domain, basic)
+static void
+test_basic_mmap(struct __test_metadata *_metadata,
+ struct _test_data_iommufd_mock_domain *self,
+ const struct _fixture_variant_iommufd_mock_domain *variant)
{
size_t buf_size = self->mmap_buf_size;
uint8_t *buf;
@@ -1478,6 +1506,40 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
test_err_ioctl_ioas_map(EFAULT, buf, buf_size, &iova);
}
+static void
+test_basic_file(struct __test_metadata *_metadata,
+ struct _test_data_iommufd_mock_domain *self,
+ const struct _fixture_variant_iommufd_mock_domain *variant)
+{
+ size_t buf_size = self->mmap_buf_size;
+ uint8_t *buf;
+ __u64 iova;
+ int mfd_tmp;
+ int prot = PROT_READ | PROT_WRITE;
+
+ /* Simple one page map */
+ test_ioctl_ioas_map_file(mfd, 0, PAGE_SIZE, &iova);
+ check_mock_iova(mfd_buffer, iova, PAGE_SIZE);
+
+ buf = memfd_mmap(buf_size, prot, MAP_SHARED, &mfd_tmp);
+ ASSERT_NE(MAP_FAILED, buf);
+
+ test_err_ioctl_ioas_map_file(EINVAL, mfd_tmp, 0, buf_size + 1, &iova);
+
+ ASSERT_EQ(0, ftruncate(mfd_tmp, 0));
+ test_err_ioctl_ioas_map_file(EINVAL, mfd_tmp, 0, buf_size, &iova);
+
+ close(mfd_tmp);
+}
+
+TEST_F(iommufd_mock_domain, basic)
+{
+ if (variant->file)
+ test_basic_file(_metadata, self, variant);
+ else
+ test_basic_mmap(_metadata, self, variant);
+}
+
TEST_F(iommufd_mock_domain, ro_unshare)
{
uint8_t *buf;
@@ -1513,9 +1575,13 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
unsigned int start;
unsigned int end;
uint8_t *buf;
+ int prot = PROT_READ | PROT_WRITE;
+ int mfd;
- buf = mmap(0, buf_size, PROT_READ | PROT_WRITE, self->mmap_flags, -1,
- 0);
+ if (variant->file)
+ buf = memfd_mmap(buf_size, prot, MAP_SHARED, &mfd);
+ else
+ buf = mmap(0, buf_size, prot, self->mmap_flags, -1, 0);
ASSERT_NE(MAP_FAILED, buf);
check_refs(buf, buf_size, 0);
@@ -1532,7 +1598,12 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
size_t length = end - start;
__u64 iova;
- test_ioctl_ioas_map(buf + start, length, &iova);
+ if (variant->file) {
+ test_ioctl_ioas_map_file(mfd, start, length,
+ &iova);
+ } else {
+ test_ioctl_ioas_map(buf + start, length, &iova);
+ }
check_mock_iova(buf + start, iova, length);
check_refs(buf + start / PAGE_SIZE * PAGE_SIZE,
end / PAGE_SIZE * PAGE_SIZE -
@@ -1544,6 +1615,8 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
}
check_refs(buf, buf_size, 0);
ASSERT_EQ(0, munmap(buf, buf_size));
+ if (variant->file)
+ close(mfd);
}
TEST_F(iommufd_mock_domain, all_aligns_copy)
@@ -1554,9 +1627,13 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
unsigned int start;
unsigned int end;
uint8_t *buf;
+ int prot = PROT_READ | PROT_WRITE;
+ int mfd;
- buf = mmap(0, buf_size, PROT_READ | PROT_WRITE, self->mmap_flags, -1,
- 0);
+ if (variant->file)
+ buf = memfd_mmap(buf_size, prot, MAP_SHARED, &mfd);
+ else
+ buf = mmap(0, buf_size, prot, self->mmap_flags, -1, 0);
ASSERT_NE(MAP_FAILED, buf);
check_refs(buf, buf_size, 0);
@@ -1575,7 +1652,12 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
uint32_t mock_stdev_id;
__u64 iova;
- test_ioctl_ioas_map(buf + start, length, &iova);
+ if (variant->file) {
+ test_ioctl_ioas_map_file(mfd, start, length,
+ &iova);
+ } else {
+ test_ioctl_ioas_map(buf + start, length, &iova);
+ }
/* Add and destroy a domain while the area exists */
old_id = self->hwpt_ids[1];
@@ -1596,15 +1678,18 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
}
check_refs(buf, buf_size, 0);
ASSERT_EQ(0, munmap(buf, buf_size));
+ if (variant->file)
+ close(mfd);
}
TEST_F(iommufd_mock_domain, user_copy)
{
+ void *buf = variant->file ? mfd_buffer : buffer;
struct iommu_test_cmd access_cmd = {
.size = sizeof(access_cmd),
.op = IOMMU_TEST_OP_ACCESS_PAGES,
.access_pages = { .length = BUFFER_SIZE,
- .uptr = (uintptr_t)buffer },
+ .uptr = (uintptr_t)buf },
};
struct iommu_ioas_copy copy_cmd = {
.size = sizeof(copy_cmd),
@@ -1623,9 +1708,13 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
/* Pin the pages in an IOAS with no domains then copy to an IOAS with domains */
test_ioctl_ioas_alloc(&ioas_id);
- test_ioctl_ioas_map_id(ioas_id, buffer, BUFFER_SIZE,
- &copy_cmd.src_iova);
-
+ if (variant->file) {
+ test_ioctl_ioas_map_id_file(ioas_id, mfd, 0, BUFFER_SIZE,
+ &copy_cmd.src_iova);
+ } else {
+ test_ioctl_ioas_map_id(ioas_id, buf, BUFFER_SIZE,
+ &copy_cmd.src_iova);
+ }
test_cmd_create_access(ioas_id, &access_cmd.id,
MOCK_FLAGS_ACCESS_CREATE_NEEDS_PIN_PAGES);
@@ -1635,12 +1724,17 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
&access_cmd));
copy_cmd.src_ioas_id = ioas_id;
ASSERT_EQ(0, ioctl(self->fd, IOMMU_IOAS_COPY, &copy_cmd));
- check_mock_iova(buffer, MOCK_APERTURE_START, BUFFER_SIZE);
+ check_mock_iova(buf, MOCK_APERTURE_START, BUFFER_SIZE);
/* Now replace the ioas with a new one */
test_ioctl_ioas_alloc(&new_ioas_id);
- test_ioctl_ioas_map_id(new_ioas_id, buffer, BUFFER_SIZE,
- &copy_cmd.src_iova);
+ if (variant->file) {
+ test_ioctl_ioas_map_id_file(new_ioas_id, mfd, 0, BUFFER_SIZE,
+ &copy_cmd.src_iova);
+ } else {
+ test_ioctl_ioas_map_id(new_ioas_id, buf, BUFFER_SIZE,
+ &copy_cmd.src_iova);
+ }
test_cmd_access_replace_ioas(access_cmd.id, new_ioas_id);
/* Destroy the old ioas and cleanup copied mapping */
@@ -1654,7 +1748,7 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
&access_cmd));
copy_cmd.src_ioas_id = new_ioas_id;
ASSERT_EQ(0, ioctl(self->fd, IOMMU_IOAS_COPY, &copy_cmd));
- check_mock_iova(buffer, MOCK_APERTURE_START, BUFFER_SIZE);
+ check_mock_iova(buf, MOCK_APERTURE_START, BUFFER_SIZE);
test_cmd_destroy_access_pages(
access_cmd.id, access_cmd.access_pages.out_access_pages_id);
diff --git a/tools/testing/selftests/iommu/iommufd_fail_nth.c b/tools/testing/selftests/iommu/iommufd_fail_nth.c
index c5d5e69..2d7d016 100644
--- a/tools/testing/selftests/iommu/iommufd_fail_nth.c
+++ b/tools/testing/selftests/iommu/iommufd_fail_nth.c
@@ -47,6 +47,9 @@ static __attribute__((constructor)) void setup_buffer(void)
buffer = mmap(0, BUFFER_SIZE, PROT_READ | PROT_WRITE,
MAP_SHARED | MAP_ANONYMOUS, -1, 0);
+
+ mfd_buffer = memfd_mmap(BUFFER_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED,
+ &mfd);
}
/*
@@ -331,6 +334,42 @@ void __fail_nth_enable(struct __test_metadata *_metadata,
return 0;
}
+/* iopt_area_fill_domains() and iopt_area_fill_domain() */
+TEST_FAIL_NTH(basic_fail_nth, map_file_domain)
+{
+ uint32_t ioas_id;
+ __u32 stdev_id;
+ __u32 hwpt_id;
+ __u64 iova;
+
+ self->fd = open("/dev/iommu", O_RDWR);
+ if (self->fd == -1)
+ return -1;
+
+ if (_test_ioctl_ioas_alloc(self->fd, &ioas_id))
+ return -1;
+
+ if (_test_ioctl_set_temp_memory_limit(self->fd, 32))
+ return -1;
+
+ fail_nth_enable();
+
+ if (_test_cmd_mock_domain(self->fd, ioas_id, &stdev_id, &hwpt_id, NULL))
+ return -1;
+
+ if (_test_ioctl_ioas_map_file(self->fd, ioas_id, mfd, 0, 262144, &iova,
+ IOMMU_IOAS_MAP_WRITEABLE |
+ IOMMU_IOAS_MAP_READABLE))
+ return -1;
+
+ if (_test_ioctl_destroy(self->fd, stdev_id))
+ return -1;
+
+ if (_test_cmd_mock_domain(self->fd, ioas_id, &stdev_id, &hwpt_id, NULL))
+ return -1;
+ return 0;
+}
+
TEST_FAIL_NTH(basic_fail_nth, map_two_domains)
{
uint32_t ioas_id;
diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h
index 40f6f14..6a11c26 100644
--- a/tools/testing/selftests/iommu/iommufd_utils.h
+++ b/tools/testing/selftests/iommu/iommufd_utils.h
@@ -40,12 +40,28 @@ static inline bool test_bit(unsigned int nr, unsigned long *addr)
static void *buffer;
static unsigned long BUFFER_SIZE;
+static void *mfd_buffer;
+static int mfd;
+
static unsigned long PAGE_SIZE;
#define sizeof_field(TYPE, MEMBER) sizeof((((TYPE *)0)->MEMBER))
#define offsetofend(TYPE, MEMBER) \
(offsetof(TYPE, MEMBER) + sizeof_field(TYPE, MEMBER))
+static inline void *memfd_mmap(size_t length, int prot, int flags, int *mfd_p)
+{
+ int mfd_flags = (flags & MAP_HUGETLB) ? MFD_HUGETLB : 0;
+ int mfd = memfd_create("buffer", mfd_flags);
+
+ if (mfd <= 0)
+ return MAP_FAILED;
+ if (ftruncate(mfd, length))
+ return MAP_FAILED;
+ *mfd_p = mfd;
+ return mmap(0, length, prot, flags, mfd, 0);
+}
+
/*
* Have the kernel check the refcount on pages. I don't know why a freshly
* mmap'd anon non-compound page starts out with a ref of 3
@@ -589,6 +605,47 @@ static int _test_ioctl_ioas_unmap(int fd, unsigned int ioas_id, uint64_t iova,
EXPECT_ERRNO(_errno, _test_ioctl_ioas_unmap(self->fd, self->ioas_id, \
iova, length, NULL))
+static int _test_ioctl_ioas_map_file(int fd, unsigned int ioas_id, int mfd,
+ size_t start, size_t length, __u64 *iova,
+ unsigned int flags)
+{
+ struct iommu_ioas_map_file cmd = {
+ .size = sizeof(cmd),
+ .flags = flags,
+ .ioas_id = ioas_id,
+ .fd = mfd,
+ .start = start,
+ .length = length,
+ };
+ int ret;
+
+ if (flags & IOMMU_IOAS_MAP_FIXED_IOVA)
+ cmd.iova = *iova;
+
+ ret = ioctl(fd, IOMMU_IOAS_MAP_FILE, &cmd);
+ *iova = cmd.iova;
+ return ret;
+}
+
+#define test_ioctl_ioas_map_file(mfd, start, length, iova_p) \
+ ASSERT_EQ(0, \
+ _test_ioctl_ioas_map_file( \
+ self->fd, self->ioas_id, mfd, start, length, iova_p, \
+ IOMMU_IOAS_MAP_WRITEABLE | IOMMU_IOAS_MAP_READABLE))
+
+#define test_err_ioctl_ioas_map_file(_errno, mfd, start, length, iova_p) \
+ EXPECT_ERRNO( \
+ _errno, \
+ _test_ioctl_ioas_map_file( \
+ self->fd, self->ioas_id, mfd, start, length, iova_p, \
+ IOMMU_IOAS_MAP_WRITEABLE | IOMMU_IOAS_MAP_READABLE))
+
+#define test_ioctl_ioas_map_id_file(ioas_id, mfd, start, length, iova_p) \
+ ASSERT_EQ(0, \
+ _test_ioctl_ioas_map_file( \
+ self->fd, ioas_id, mfd, start, length, iova_p, \
+ IOMMU_IOAS_MAP_WRITEABLE | IOMMU_IOAS_MAP_READABLE))
+
static int _test_ioctl_set_temp_memory_limit(int fd, unsigned int limit)
{
struct iommu_test_cmd memlimit_cmd = {
--
1.8.3.1
* Re: [PATCH V7 9/9] iommufd: map file selftest
2024-10-25 13:11 ` [PATCH V7 9/9] iommufd: map file selftest Steve Sistare
@ 2024-10-25 13:14 ` Steven Sistare
2024-10-25 17:00 ` Nicolin Chen
2024-10-25 17:04 ` Nicolin Chen
1 sibling, 1 reply; 35+ messages in thread
From: Steven Sistare @ 2024-10-25 13:14 UTC (permalink / raw)
To: linux-kselftest; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen, iommu
cc linux-selftest
- Steve
On 10/25/2024 9:11 AM, Steve Sistare wrote:
> Add test cases to exercise IOMMU_IOAS_MAP_FILE.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
> ---
> tools/testing/selftests/iommu/iommufd.c | 124 ++++++++++++++++++++---
> tools/testing/selftests/iommu/iommufd_fail_nth.c | 39 +++++++
> tools/testing/selftests/iommu/iommufd_utils.h | 57 +++++++++++
> 3 files changed, 205 insertions(+), 15 deletions(-)
>
> diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c
> index 4927b9a..88b92bb 100644
> --- a/tools/testing/selftests/iommu/iommufd.c
> +++ b/tools/testing/selftests/iommu/iommufd.c
> @@ -1,5 +1,6 @@
> // SPDX-License-Identifier: GPL-2.0-only
> /* Copyright (c) 2021-2022, NVIDIA CORPORATION & AFFILIATES */
> +#include <asm/unistd.h>
> #include <stdlib.h>
> #include <sys/mman.h>
> #include <sys/eventfd.h>
> @@ -49,6 +50,9 @@ static __attribute__((constructor)) void setup_sizes(void)
> vrc = mmap(buffer, BUFFER_SIZE, PROT_READ | PROT_WRITE,
> MAP_SHARED | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
> assert(vrc == buffer);
> +
> + mfd_buffer = memfd_mmap(BUFFER_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED,
> + &mfd);
> }
>
> FIXTURE(iommufd)
> @@ -128,6 +132,7 @@ static __attribute__((constructor)) void setup_sizes(void)
> TEST_LENGTH(iommu_ioas_unmap, IOMMU_IOAS_UNMAP, length);
> TEST_LENGTH(iommu_option, IOMMU_OPTION, val64);
> TEST_LENGTH(iommu_vfio_ioas, IOMMU_VFIO_IOAS, __reserved);
> + TEST_LENGTH(iommu_ioas_map_file, IOMMU_IOAS_MAP_FILE, iova);
> #undef TEST_LENGTH
> }
>
> @@ -1372,6 +1377,7 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
> {
> unsigned int mock_domains;
> bool hugepages;
> + bool file;
> };
>
> FIXTURE_SETUP(iommufd_mock_domain)
> @@ -1410,26 +1416,45 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
> {
> .mock_domains = 1,
> .hugepages = false,
> + .file = false,
> };
>
> FIXTURE_VARIANT_ADD(iommufd_mock_domain, two_domains)
> {
> .mock_domains = 2,
> .hugepages = false,
> + .file = false,
> };
>
> FIXTURE_VARIANT_ADD(iommufd_mock_domain, one_domain_hugepage)
> {
> .mock_domains = 1,
> .hugepages = true,
> + .file = false,
> };
>
> FIXTURE_VARIANT_ADD(iommufd_mock_domain, two_domains_hugepage)
> {
> .mock_domains = 2,
> .hugepages = true,
> + .file = false,
> };
>
> +FIXTURE_VARIANT_ADD(iommufd_mock_domain, one_domain_file)
> +{
> + .mock_domains = 1,
> + .hugepages = false,
> + .file = true,
> +};
> +
> +FIXTURE_VARIANT_ADD(iommufd_mock_domain, one_domain_file_hugepage)
> +{
> + .mock_domains = 1,
> + .hugepages = true,
> + .file = true,
> +};
> +
> +
> /* Have the kernel check that the user pages made it to the iommu_domain */
> #define check_mock_iova(_ptr, _iova, _length) \
> ({ \
> @@ -1455,7 +1480,10 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
> } \
> })
>
> -TEST_F(iommufd_mock_domain, basic)
> +static void
> +test_basic_mmap(struct __test_metadata *_metadata,
> + struct _test_data_iommufd_mock_domain *self,
> + const struct _fixture_variant_iommufd_mock_domain *variant)
> {
> size_t buf_size = self->mmap_buf_size;
> uint8_t *buf;
> @@ -1478,6 +1506,40 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
> test_err_ioctl_ioas_map(EFAULT, buf, buf_size, &iova);
> }
>
> +static void
> +test_basic_file(struct __test_metadata *_metadata,
> + struct _test_data_iommufd_mock_domain *self,
> + const struct _fixture_variant_iommufd_mock_domain *variant)
> +{
> + size_t buf_size = self->mmap_buf_size;
> + uint8_t *buf;
> + __u64 iova;
> + int mfd_tmp;
> + int prot = PROT_READ | PROT_WRITE;
> +
> + /* Simple one page map */
> + test_ioctl_ioas_map_file(mfd, 0, PAGE_SIZE, &iova);
> + check_mock_iova(mfd_buffer, iova, PAGE_SIZE);
> +
> + buf = memfd_mmap(buf_size, prot, MAP_SHARED, &mfd_tmp);
> + ASSERT_NE(MAP_FAILED, buf);
> +
> + test_err_ioctl_ioas_map_file(EINVAL, mfd_tmp, 0, buf_size + 1, &iova);
> +
> + ASSERT_EQ(0, ftruncate(mfd_tmp, 0));
> + test_err_ioctl_ioas_map_file(EINVAL, mfd_tmp, 0, buf_size, &iova);
> +
> + close(mfd_tmp);
> +}
> +
> +TEST_F(iommufd_mock_domain, basic)
> +{
> + if (variant->file)
> + test_basic_file(_metadata, self, variant);
> + else
> + test_basic_mmap(_metadata, self, variant);
> +}
> +
> TEST_F(iommufd_mock_domain, ro_unshare)
> {
> uint8_t *buf;
> @@ -1513,9 +1575,13 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
> unsigned int start;
> unsigned int end;
> uint8_t *buf;
> + int prot = PROT_READ | PROT_WRITE;
> + int mfd;
>
> - buf = mmap(0, buf_size, PROT_READ | PROT_WRITE, self->mmap_flags, -1,
> - 0);
> + if (variant->file)
> + buf = memfd_mmap(buf_size, prot, MAP_SHARED, &mfd);
> + else
> + buf = mmap(0, buf_size, prot, self->mmap_flags, -1, 0);
> ASSERT_NE(MAP_FAILED, buf);
> check_refs(buf, buf_size, 0);
>
> @@ -1532,7 +1598,12 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
> size_t length = end - start;
> __u64 iova;
>
> - test_ioctl_ioas_map(buf + start, length, &iova);
> + if (variant->file) {
> + test_ioctl_ioas_map_file(mfd, start, length,
> + &iova);
> + } else {
> + test_ioctl_ioas_map(buf + start, length, &iova);
> + }
> check_mock_iova(buf + start, iova, length);
> check_refs(buf + start / PAGE_SIZE * PAGE_SIZE,
> end / PAGE_SIZE * PAGE_SIZE -
> @@ -1544,6 +1615,8 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
> }
> check_refs(buf, buf_size, 0);
> ASSERT_EQ(0, munmap(buf, buf_size));
> + if (variant->file)
> + close(mfd);
> }
>
> TEST_F(iommufd_mock_domain, all_aligns_copy)
> @@ -1554,9 +1627,13 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
> unsigned int start;
> unsigned int end;
> uint8_t *buf;
> + int prot = PROT_READ | PROT_WRITE;
> + int mfd;
>
> - buf = mmap(0, buf_size, PROT_READ | PROT_WRITE, self->mmap_flags, -1,
> - 0);
> + if (variant->file)
> + buf = memfd_mmap(buf_size, prot, MAP_SHARED, &mfd);
> + else
> + buf = mmap(0, buf_size, prot, self->mmap_flags, -1, 0);
> ASSERT_NE(MAP_FAILED, buf);
> check_refs(buf, buf_size, 0);
>
> @@ -1575,7 +1652,12 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
> uint32_t mock_stdev_id;
> __u64 iova;
>
> - test_ioctl_ioas_map(buf + start, length, &iova);
> + if (variant->file) {
> + test_ioctl_ioas_map_file(mfd, start, length,
> + &iova);
> + } else {
> + test_ioctl_ioas_map(buf + start, length, &iova);
> + }
>
> /* Add and destroy a domain while the area exists */
> old_id = self->hwpt_ids[1];
> @@ -1596,15 +1678,18 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
> }
> check_refs(buf, buf_size, 0);
> ASSERT_EQ(0, munmap(buf, buf_size));
> + if (variant->file)
> + close(mfd);
> }
>
> TEST_F(iommufd_mock_domain, user_copy)
> {
> + void *buf = variant->file ? mfd_buffer : buffer;
> struct iommu_test_cmd access_cmd = {
> .size = sizeof(access_cmd),
> .op = IOMMU_TEST_OP_ACCESS_PAGES,
> .access_pages = { .length = BUFFER_SIZE,
> - .uptr = (uintptr_t)buffer },
> + .uptr = (uintptr_t)buf },
> };
> struct iommu_ioas_copy copy_cmd = {
> .size = sizeof(copy_cmd),
> @@ -1623,9 +1708,13 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
>
> /* Pin the pages in an IOAS with no domains then copy to an IOAS with domains */
> test_ioctl_ioas_alloc(&ioas_id);
> - test_ioctl_ioas_map_id(ioas_id, buffer, BUFFER_SIZE,
> - &copy_cmd.src_iova);
> -
> + if (variant->file) {
> + test_ioctl_ioas_map_id_file(ioas_id, mfd, 0, BUFFER_SIZE,
> + &copy_cmd.src_iova);
> + } else {
> + test_ioctl_ioas_map_id(ioas_id, buf, BUFFER_SIZE,
> + &copy_cmd.src_iova);
> + }
> test_cmd_create_access(ioas_id, &access_cmd.id,
> MOCK_FLAGS_ACCESS_CREATE_NEEDS_PIN_PAGES);
>
> @@ -1635,12 +1724,17 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
> &access_cmd));
> copy_cmd.src_ioas_id = ioas_id;
> ASSERT_EQ(0, ioctl(self->fd, IOMMU_IOAS_COPY, &copy_cmd));
> - check_mock_iova(buffer, MOCK_APERTURE_START, BUFFER_SIZE);
> + check_mock_iova(buf, MOCK_APERTURE_START, BUFFER_SIZE);
>
> /* Now replace the ioas with a new one */
> test_ioctl_ioas_alloc(&new_ioas_id);
> - test_ioctl_ioas_map_id(new_ioas_id, buffer, BUFFER_SIZE,
> - &copy_cmd.src_iova);
> + if (variant->file) {
> + test_ioctl_ioas_map_id_file(new_ioas_id, mfd, 0, BUFFER_SIZE,
> + &copy_cmd.src_iova);
> + } else {
> + test_ioctl_ioas_map_id(new_ioas_id, buf, BUFFER_SIZE,
> + &copy_cmd.src_iova);
> + }
> test_cmd_access_replace_ioas(access_cmd.id, new_ioas_id);
>
> /* Destroy the old ioas and cleanup copied mapping */
> @@ -1654,7 +1748,7 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
> &access_cmd));
> copy_cmd.src_ioas_id = new_ioas_id;
> ASSERT_EQ(0, ioctl(self->fd, IOMMU_IOAS_COPY, &copy_cmd));
> - check_mock_iova(buffer, MOCK_APERTURE_START, BUFFER_SIZE);
> + check_mock_iova(buf, MOCK_APERTURE_START, BUFFER_SIZE);
>
> test_cmd_destroy_access_pages(
> access_cmd.id, access_cmd.access_pages.out_access_pages_id);
> diff --git a/tools/testing/selftests/iommu/iommufd_fail_nth.c b/tools/testing/selftests/iommu/iommufd_fail_nth.c
> index c5d5e69..2d7d016 100644
> --- a/tools/testing/selftests/iommu/iommufd_fail_nth.c
> +++ b/tools/testing/selftests/iommu/iommufd_fail_nth.c
> @@ -47,6 +47,9 @@ static __attribute__((constructor)) void setup_buffer(void)
>
> buffer = mmap(0, BUFFER_SIZE, PROT_READ | PROT_WRITE,
> MAP_SHARED | MAP_ANONYMOUS, -1, 0);
> +
> + mfd_buffer = memfd_mmap(BUFFER_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED,
> + &mfd);
> }
>
> /*
> @@ -331,6 +334,42 @@ void __fail_nth_enable(struct __test_metadata *_metadata,
> return 0;
> }
>
> +/* iopt_area_fill_domains() and iopt_area_fill_domain() */
> +TEST_FAIL_NTH(basic_fail_nth, map_file_domain)
> +{
> + uint32_t ioas_id;
> + __u32 stdev_id;
> + __u32 hwpt_id;
> + __u64 iova;
> +
> + self->fd = open("/dev/iommu", O_RDWR);
> + if (self->fd == -1)
> + return -1;
> +
> + if (_test_ioctl_ioas_alloc(self->fd, &ioas_id))
> + return -1;
> +
> + if (_test_ioctl_set_temp_memory_limit(self->fd, 32))
> + return -1;
> +
> + fail_nth_enable();
> +
> + if (_test_cmd_mock_domain(self->fd, ioas_id, &stdev_id, &hwpt_id, NULL))
> + return -1;
> +
> + if (_test_ioctl_ioas_map_file(self->fd, ioas_id, mfd, 0, 262144, &iova,
> + IOMMU_IOAS_MAP_WRITEABLE |
> + IOMMU_IOAS_MAP_READABLE))
> + return -1;
> +
> + if (_test_ioctl_destroy(self->fd, stdev_id))
> + return -1;
> +
> + if (_test_cmd_mock_domain(self->fd, ioas_id, &stdev_id, &hwpt_id, NULL))
> + return -1;
> + return 0;
> +}
> +
> TEST_FAIL_NTH(basic_fail_nth, map_two_domains)
> {
> uint32_t ioas_id;
> diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h
> index 40f6f14..6a11c26 100644
> --- a/tools/testing/selftests/iommu/iommufd_utils.h
> +++ b/tools/testing/selftests/iommu/iommufd_utils.h
> @@ -40,12 +40,28 @@ static inline bool test_bit(unsigned int nr, unsigned long *addr)
> static void *buffer;
> static unsigned long BUFFER_SIZE;
>
> +static void *mfd_buffer;
> +static int mfd;
> +
> static unsigned long PAGE_SIZE;
>
> #define sizeof_field(TYPE, MEMBER) sizeof((((TYPE *)0)->MEMBER))
> #define offsetofend(TYPE, MEMBER) \
> (offsetof(TYPE, MEMBER) + sizeof_field(TYPE, MEMBER))
>
> +static inline void *memfd_mmap(size_t length, int prot, int flags, int *mfd_p)
> +{
> + int mfd_flags = (flags & MAP_HUGETLB) ? MFD_HUGETLB : 0;
> + int mfd = memfd_create("buffer", mfd_flags);
> +
> + if (mfd <= 0)
> + return MAP_FAILED;
> + if (ftruncate(mfd, length))
> + return MAP_FAILED;
> + *mfd_p = mfd;
> + return mmap(0, length, prot, flags, mfd, 0);
> +}
> +
> /*
> * Have the kernel check the refcount on pages. I don't know why a freshly
> * mmap'd anon non-compound page starts out with a ref of 3
> @@ -589,6 +605,47 @@ static int _test_ioctl_ioas_unmap(int fd, unsigned int ioas_id, uint64_t iova,
> EXPECT_ERRNO(_errno, _test_ioctl_ioas_unmap(self->fd, self->ioas_id, \
> iova, length, NULL))
>
> +static int _test_ioctl_ioas_map_file(int fd, unsigned int ioas_id, int mfd,
> + size_t start, size_t length, __u64 *iova,
> + unsigned int flags)
> +{
> + struct iommu_ioas_map_file cmd = {
> + .size = sizeof(cmd),
> + .flags = flags,
> + .ioas_id = ioas_id,
> + .fd = mfd,
> + .start = start,
> + .length = length,
> + };
> + int ret;
> +
> + if (flags & IOMMU_IOAS_MAP_FIXED_IOVA)
> + cmd.iova = *iova;
> +
> + ret = ioctl(fd, IOMMU_IOAS_MAP_FILE, &cmd);
> + *iova = cmd.iova;
> + return ret;
> +}
> +
> +#define test_ioctl_ioas_map_file(mfd, start, length, iova_p) \
> + ASSERT_EQ(0, \
> + _test_ioctl_ioas_map_file( \
> + self->fd, self->ioas_id, mfd, start, length, iova_p, \
> + IOMMU_IOAS_MAP_WRITEABLE | IOMMU_IOAS_MAP_READABLE))
> +
> +#define test_err_ioctl_ioas_map_file(_errno, mfd, start, length, iova_p) \
> + EXPECT_ERRNO( \
> + _errno, \
> + _test_ioctl_ioas_map_file( \
> + self->fd, self->ioas_id, mfd, start, length, iova_p, \
> + IOMMU_IOAS_MAP_WRITEABLE | IOMMU_IOAS_MAP_READABLE))
> +
> +#define test_ioctl_ioas_map_id_file(ioas_id, mfd, start, length, iova_p) \
> + ASSERT_EQ(0, \
> + _test_ioctl_ioas_map_file( \
> + self->fd, ioas_id, mfd, start, length, iova_p, \
> + IOMMU_IOAS_MAP_WRITEABLE | IOMMU_IOAS_MAP_READABLE))
> +
> static int _test_ioctl_set_temp_memory_limit(int fd, unsigned int limit)
> {
> struct iommu_test_cmd memlimit_cmd = {
* Re: [PATCH V7 9/9] iommufd: map file selftest
2024-10-25 13:11 ` [PATCH V7 9/9] iommufd: map file selftest Steve Sistare
2024-10-25 13:14 ` Steven Sistare
@ 2024-10-25 17:04 ` Nicolin Chen
2024-10-25 17:58 ` Steven Sistare
1 sibling, 1 reply; 35+ messages in thread
From: Nicolin Chen @ 2024-10-25 17:04 UTC (permalink / raw)
To: Steve Sistare; +Cc: iommu, Jason Gunthorpe, Kevin Tian
On Fri, Oct 25, 2024 at 06:11:59AM -0700, Steve Sistare wrote:
>
> Add test cases to exercise IOMMU_IOAS_MAP_FILE.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
The build passes, but I now see a few failing tests. Any idea?
# RUN iommufd_mock_domain.one_domain_file.basic ...
# iommufd.c:1521:basic:Expected 0 (0) == _test_ioctl_ioas_map_file( self->fd, self->ioas_id, mfd, 0, PAGE_SIZE, &iova, IOMMU_IOAS_MAP_WRITEABLE | IOMMU_IOAS_MAP_READABLE) (-1)
# basic: Test terminated by assertion
# FAIL iommufd_mock_domain.one_domain_file.basic
not ok 136 iommufd_mock_domain.one_domain_file.basic
# RUN iommufd_mock_domain.one_domain_file.ro_unshare ...
# OK iommufd_mock_domain.one_domain_file.ro_unshare
ok 137 iommufd_mock_domain.one_domain_file.ro_unshare
# RUN iommufd_mock_domain.one_domain_file.all_aligns ...
# iommufd.c:1602:all_aligns:Expected 0 (0) == _test_ioctl_ioas_map_file( self->fd, self->ioas_id, mfd, start, length, &iova, IOMMU_IOAS_MAP_WRITEABLE | IOMMU_IOAS_MAP_READABLE) (-1)
# all_aligns: Test terminated by assertion
# FAIL iommufd_mock_domain.one_domain_file.all_aligns
not ok 138 iommufd_mock_domain.one_domain_file.all_aligns
# RUN iommufd_mock_domain.one_domain_file.all_aligns_copy ...
# iommufd.c:1656:all_aligns_copy:Expected 0 (0) == _test_ioctl_ioas_map_file( self->fd, self->ioas_id, mfd, start, length, &iova, IOMMU_IOAS_MAP_WRITEABLE | IOMMU_IOAS_MAP_READABLE) (-1)
# all_aligns_copy: Test terminated by assertion
# FAIL iommufd_mock_domain.one_domain_file.all_aligns_copy
not ok 139 iommufd_mock_domain.one_domain_file.all_aligns_copy
# RUN iommufd_mock_domain.one_domain_file.user_copy ...
# iommufd.c:1712:user_copy:Expected 0 (0) == _test_ioctl_ioas_map_file( self->fd, ioas_id, mfd, 0, BUFFER_SIZE, &copy_cmd.src_iova, IOMMU_IOAS_MAP_WRITEABLE | IOMMU_IOAS_MAP_READABLE) (-1)
# user_copy: Test terminated by assertion
# FAIL iommufd_mock_domain.one_domain_file.user_copy
not ok 140 iommufd_mock_domain.one_domain_file.user_copy
# RUN iommufd_mock_domain.one_domain_file.replace ...
# OK iommufd_mock_domain.one_domain_file.replace
ok 141 iommufd_mock_domain.one_domain_file.replace
# RUN iommufd_mock_domain.one_domain_file.alloc_hwpt ...
# OK iommufd_mock_domain.one_domain_file.alloc_hwpt
ok 142 iommufd_mock_domain.one_domain_file.alloc_hwpt
# RUN iommufd_mock_domain.one_domain_file_hugepage.basic ...
# iommufd.c:1521:basic:Expected 0 (0) == _test_ioctl_ioas_map_file( self->fd, self->ioas_id, mfd, 0, PAGE_SIZE, &iova, IOMMU_IOAS_MAP_WRITEABLE | IOMMU_IOAS_MAP_READABLE) (-1)
# basic: Test terminated by assertion
# FAIL iommufd_mock_domain.one_domain_file_hugepage.basic
not ok 143 iommufd_mock_domain.one_domain_file_hugepage.basic
# RUN iommufd_mock_domain.one_domain_file_hugepage.ro_unshare ...
# OK iommufd_mock_domain.one_domain_file_hugepage.ro_unshare
ok 144 iommufd_mock_domain.one_domain_file_hugepage.ro_unshare
# RUN iommufd_mock_domain.one_domain_file_hugepage.all_aligns ...
# OK iommufd_mock_domain.one_domain_file_hugepage.all_aligns
ok 145 iommufd_mock_domain.one_domain_file_hugepage.all_aligns
# RUN iommufd_mock_domain.one_domain_file_hugepage.all_aligns_copy ...
# OK iommufd_mock_domain.one_domain_file_hugepage.all_aligns_copy
ok 146 iommufd_mock_domain.one_domain_file_hugepage.all_aligns_copy
# RUN iommufd_mock_domain.one_domain_file_hugepage.user_copy ...
# iommufd.c:1712:user_copy:Expected 0 (0) == _test_ioctl_ioas_map_file( self->fd, ioas_id, mfd, 0, BUFFER_SIZE, &copy_cmd.src_iova, IOMMU_IOAS_MAP_WRITEABLE | IOMMU_IOAS_MAP_READABLE) (-1)
# user_copy: Test terminated by assertion
# FAIL iommufd_mock_domain.one_domain_file_hugepage.user_copy
...
ok 2 basic_fail_nth.map_domain
# RUN basic_fail_nth.map_file_domain ...
# iommufd_fail_nth.c:338:map_file_domain:Expected 0 (0) == test_nth_map_file_domain(_metadata, self, variant, &nth_state) (-1)
# map_file_domain: Test terminated by assertion
# FAIL basic_fail_nth.map_file_domain
not ok 3 basic_fail_nth.map_file_domain
* Re: [PATCH V7 9/9] iommufd: map file selftest
2024-10-25 17:04 ` Nicolin Chen
@ 2024-10-25 17:58 ` Steven Sistare
2024-10-25 18:39 ` Nicolin Chen
0 siblings, 1 reply; 35+ messages in thread
From: Steven Sistare @ 2024-10-25 17:58 UTC (permalink / raw)
To: Nicolin Chen; +Cc: iommu, Jason Gunthorpe, Kevin Tian
On 10/25/2024 1:04 PM, Nicolin Chen wrote:
> On Fri, Oct 25, 2024 at 06:11:59AM -0700, Steve Sistare wrote:
>>
>> Add test cases to exercise IOMMU_IOAS_MAP_FILE.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
>
> The build passes, but I now see a few failing tests. Any idea?
Strange. All tests pass for me, with patches applied to base commit
e2d8fe9148b79ed1cbf0663edc988db7769173dc
# modprobe iommufd
# iommufd
...
Totals: pass:204 fail:0 xfail:0 xpass:0 skip:0 error:0
However, I also needed this patch from Yi Liu to make it work:
[2] https://github.com/yiliu1765/iommufd/commit/ff0772d155c1ae2841e7b7ff0d7ca889271b81b2
Without that, most tests fail, even before my series.
I will try to reproduce your failures.
What is your base commit?
What is your value of /sys/kernel/mm/transparent_hugepage/shmem_enabled ?
(both always and never pass for me)
- Steve
> # RUN iommufd_mock_domain.one_domain_file.basic ...
> # iommufd.c:1521:basic:Expected 0 (0) == _test_ioctl_ioas_map_file( self->fd, self->ioas_id, mfd, 0, PAGE_SIZE, &iova, IOMMU_IOAS_MAP_WRITEABLE | IOMMU_IOAS_MAP_READABLE) (-1)
> # basic: Test terminated by assertion
> # FAIL iommufd_mock_domain.one_domain_file.basic
> not ok 136 iommufd_mock_domain.one_domain_file.basic
> # RUN iommufd_mock_domain.one_domain_file.ro_unshare ...
> # OK iommufd_mock_domain.one_domain_file.ro_unshare
> ok 137 iommufd_mock_domain.one_domain_file.ro_unshare
> # RUN iommufd_mock_domain.one_domain_file.all_aligns ...
> # iommufd.c:1602:all_aligns:Expected 0 (0) == _test_ioctl_ioas_map_file( self->fd, self->ioas_id, mfd, start, length, &iova, IOMMU_IOAS_MAP_WRITEABLE | IOMMU_IOAS_MAP_READABLE) (-1)
> # all_aligns: Test terminated by assertion
> # FAIL iommufd_mock_domain.one_domain_file.all_aligns
> not ok 138 iommufd_mock_domain.one_domain_file.all_aligns
> # RUN iommufd_mock_domain.one_domain_file.all_aligns_copy ...
> # iommufd.c:1656:all_aligns_copy:Expected 0 (0) == _test_ioctl_ioas_map_file( self->fd, self->ioas_id, mfd, start, length, &iova, IOMMU_IOAS_MAP_WRITEABLE | IOMMU_IOAS_MAP_READABLE) (-1)
> # all_aligns_copy: Test terminated by assertion
> # FAIL iommufd_mock_domain.one_domain_file.all_aligns_copy
> not ok 139 iommufd_mock_domain.one_domain_file.all_aligns_copy
> # RUN iommufd_mock_domain.one_domain_file.user_copy ...
> # iommufd.c:1712:user_copy:Expected 0 (0) == _test_ioctl_ioas_map_file( self->fd, ioas_id, mfd, 0, BUFFER_SIZE, &copy_cmd.src_iova, IOMMU_IOAS_MAP_WRITEABLE | IOMMU_IOAS_MAP_READABLE) (-1)
> # user_copy: Test terminated by assertion
> # FAIL iommufd_mock_domain.one_domain_file.user_copy
> not ok 140 iommufd_mock_domain.one_domain_file.user_copy
> # RUN iommufd_mock_domain.one_domain_file.replace ...
> # OK iommufd_mock_domain.one_domain_file.replace
> ok 141 iommufd_mock_domain.one_domain_file.replace
> # RUN iommufd_mock_domain.one_domain_file.alloc_hwpt ...
> # OK iommufd_mock_domain.one_domain_file.alloc_hwpt
> ok 142 iommufd_mock_domain.one_domain_file.alloc_hwpt
> # RUN iommufd_mock_domain.one_domain_file_hugepage.basic ...
> # iommufd.c:1521:basic:Expected 0 (0) == _test_ioctl_ioas_map_file( self->fd, self->ioas_id, mfd, 0, PAGE_SIZE, &iova, IOMMU_IOAS_MAP_WRITEABLE | IOMMU_IOAS_MAP_READABLE) (-1)
> # basic: Test terminated by assertion
> # FAIL iommufd_mock_domain.one_domain_file_hugepage.basic
> not ok 143 iommufd_mock_domain.one_domain_file_hugepage.basic
> # RUN iommufd_mock_domain.one_domain_file_hugepage.ro_unshare ...
> # OK iommufd_mock_domain.one_domain_file_hugepage.ro_unshare
> ok 144 iommufd_mock_domain.one_domain_file_hugepage.ro_unshare
> # RUN iommufd_mock_domain.one_domain_file_hugepage.all_aligns ...
> # OK iommufd_mock_domain.one_domain_file_hugepage.all_aligns
> ok 145 iommufd_mock_domain.one_domain_file_hugepage.all_aligns
> # RUN iommufd_mock_domain.one_domain_file_hugepage.all_aligns_copy ...
> # OK iommufd_mock_domain.one_domain_file_hugepage.all_aligns_copy
> ok 146 iommufd_mock_domain.one_domain_file_hugepage.all_aligns_copy
> # RUN iommufd_mock_domain.one_domain_file_hugepage.user_copy ...
> # iommufd.c:1712:user_copy:Expected 0 (0) == _test_ioctl_ioas_map_file( self->fd, ioas_id, mfd, 0, BUFFER_SIZE, &copy_cmd.src_iova, IOMMU_IOAS_MAP_WRITEABLE | IOMMU_IOAS_MAP_READABLE) (-1)
> # user_copy: Test terminated by assertion
> # FAIL iommufd_mock_domain.one_domain_file_hugepage.user_copy
> ...
> ok 2 basic_fail_nth.map_domain
> # RUN basic_fail_nth.map_file_domain ...
> # iommufd_fail_nth.c:338:map_file_domain:Expected 0 (0) == test_nth_map_file_domain(_metadata, self, variant, &nth_state) (-1)
> # map_file_domain: Test terminated by assertion
> # FAIL basic_fail_nth.map_file_domain
> not ok 3 basic_fail_nth.map_file_domain
>
* Re: [PATCH V7 9/9] iommufd: map file selftest
2024-10-25 17:58 ` Steven Sistare
@ 2024-10-25 18:39 ` Nicolin Chen
2024-10-25 23:58 ` Jason Gunthorpe
0 siblings, 1 reply; 35+ messages in thread
From: Nicolin Chen @ 2024-10-25 18:39 UTC (permalink / raw)
To: Steven Sistare; +Cc: iommu, Jason Gunthorpe, Kevin Tian
On Fri, Oct 25, 2024 at 01:58:57PM -0400, Steven Sistare wrote:
> On 10/25/2024 1:04 PM, Nicolin Chen wrote:
> > On Fri, Oct 25, 2024 at 06:11:59AM -0700, Steve Sistare wrote:
> > >
> > > Add test cases to exercise IOMMU_IOAS_MAP_FILE.
> > >
> > > Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> > > Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reran on Jason's linus tree:
...
# PASSED: 204 / 204 tests passed.
# Totals: pass:194 fail:0 xfail:0 xpass:0 skip:10 error:0
...
# PASSED: 8 / 8 tests passed.
# Totals: pass:8 fail:0 xfail:0 xpass:0 skip:0 error:0
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
> > The build passes, but I now see a few failing tests. Any idea?
>
> Strange. All tests pass for me, with patches applied to base commit
> e2d8fe9148b79ed1cbf0663edc988db7769173dc
>
> # modprobe iomufd
> # iommufd
> ...
> Totals: pass:204 fail:0 xfail:0 xpass:0 skip:0 error:0
>
> However, I also needed this patch from Yi Liu to make it work:
> [2] https://github.com/yiliu1765/iommufd/commit/ff0772d155c1ae2841e7b7ff0d7ca889271b81b2
>
> Without that, most tests fail, even before my series.
I run on ARM64, so I don't have that patch, since it fixes an x86 issue.
> I will try to reproduce your failures.
> What is your base commit?
I was using the for-next tree (rc1) from:
git git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd.git
And it was failing, so I guess rc1 is missing something for memfd.
So, it should be a false alarm.
> What is your value of /sys/kernel/mm/transparent_hugepage/shmem_enabled ?
> (both always and never pass for me)
$ cat /sys/kernel/mm/transparent_hugepage/shmem_enabled
always within_size advise [never] deny force
And I retried with "always"; that works too with the linus tree.
Thanks
Nicolin
* Re: [PATCH V7 9/9] iommufd: map file selftest
2024-10-25 18:39 ` Nicolin Chen
@ 2024-10-25 23:58 ` Jason Gunthorpe
2024-10-26 19:13 ` Steven Sistare
0 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2024-10-25 23:58 UTC (permalink / raw)
To: Nicolin Chen; +Cc: Steven Sistare, iommu, Kevin Tian
On Fri, Oct 25, 2024 at 11:39:40AM -0700, Nicolin Chen wrote:
> > I will try to reproduce your failures.
> > What is your base commit?
>
> I was using the for-next tree (rc1) from:
> git git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd.git
>
> And it was failing, so I guess rc1 is missing something for memfd.
> So, it should be a false alarm.
Oh, that's exciting. So I will have to base it on a rc that works so
bisection works :\
Jason
* Re: [PATCH V7 9/9] iommufd: map file selftest
2024-10-25 23:58 ` Jason Gunthorpe
@ 2024-10-26 19:13 ` Steven Sistare
2024-10-26 19:16 ` Steven Sistare
0 siblings, 1 reply; 35+ messages in thread
From: Steven Sistare @ 2024-10-26 19:13 UTC (permalink / raw)
To: Jason Gunthorpe, Nicolin Chen; +Cc: iommu, Kevin Tian
On 10/25/2024 7:58 PM, Jason Gunthorpe wrote:
> On Fri, Oct 25, 2024 at 11:39:40AM -0700, Nicolin Chen wrote:
>
>>> I will try to reproduce your failures.
>>> What is your base commit?
>>
>> I was using the for-next tree (rc1) from:
>> git git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd.git
>>
>> And it was failing, so I guess rc1 is missing something for memfd.
>> So, it should be a false alarm.
>
> Oh, that's exciting. So I will have to base it on a rc that works so
> bisection works :\
I'll test builds on arm to identify the minimum requirement.
- Steve
* Re: [PATCH V7 9/9] iommufd: map file selftest
2024-10-26 19:13 ` Steven Sistare
@ 2024-10-26 19:16 ` Steven Sistare
2024-10-26 23:09 ` Nicolin Chen
0 siblings, 1 reply; 35+ messages in thread
From: Steven Sistare @ 2024-10-26 19:16 UTC (permalink / raw)
To: Jason Gunthorpe, Nicolin Chen; +Cc: iommu, Kevin Tian
On 10/26/2024 3:13 PM, Steven Sistare wrote:
> On 10/25/2024 7:58 PM, Jason Gunthorpe wrote:
>> On Fri, Oct 25, 2024 at 11:39:40AM -0700, Nicolin Chen wrote:
>>
>>>> I will try to reproduce your failures.
>>>> What is your base commit?
>>>
>>> I was using the for-next tree (rc1) from:
>>> git git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd.git
>>>
>>> And it was failing, so I guess rc1 is missing something for memfd.
>>> So, it should be a false alarm.
>>
>> Oh, that's exciting. So I will have to base it on a rc that works so
>> bisection works :\
>
> I'll test builds on arm to identify the minimum requirement.
BTW this is likely an arm-specific issue, because I have been using
for-next rc1 on x86, and all tests pass.
- Steve
* Re: [PATCH V7 9/9] iommufd: map file selftest
2024-10-26 19:16 ` Steven Sistare
@ 2024-10-26 23:09 ` Nicolin Chen
2024-10-27 14:38 ` Steven Sistare
0 siblings, 1 reply; 35+ messages in thread
From: Nicolin Chen @ 2024-10-26 23:09 UTC (permalink / raw)
To: Steven Sistare; +Cc: Jason Gunthorpe, iommu, Kevin Tian
On Sat, Oct 26, 2024 at 03:16:36PM -0400, Steven Sistare wrote:
> On 10/26/2024 3:13 PM, Steven Sistare wrote:
> > On 10/25/2024 7:58 PM, Jason Gunthorpe wrote:
> > > On Fri, Oct 25, 2024 at 11:39:40AM -0700, Nicolin Chen wrote:
> > >
> > > > > I will try to reproduce your failures.
> > > > > What is your base commit?
> > > >
> > > > I was using the for-next tree (rc1) from:
> > > > git git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd.git
> > > >
> > > > And it was failing, so I guess rc1 is missing something for memfd.
> > > > So, it should be a false alarm.
> > >
> > > Oh, that's exciting. So I will have to base it on a rc that works so
> > > bisection works :\
> >
> > I'll test builds on arm to identify the minimum requirement.
>
> BTW this is likely an arm-specific issue, because I have been using
> for-next rc1 on x86, and all tests pass.
I did a clean build today. And it passes with for-next tree.
Looks like something was wrong with my build yesterday. Sorry.
Nicolin
* Re: [PATCH V7 9/9] iommufd: map file selftest
2024-10-26 23:09 ` Nicolin Chen
@ 2024-10-27 14:38 ` Steven Sistare
0 siblings, 0 replies; 35+ messages in thread
From: Steven Sistare @ 2024-10-27 14:38 UTC (permalink / raw)
To: Nicolin Chen; +Cc: Jason Gunthorpe, iommu, Kevin Tian
On 10/26/2024 7:09 PM, Nicolin Chen wrote:
> On Sat, Oct 26, 2024 at 03:16:36PM -0400, Steven Sistare wrote:
>> On 10/26/2024 3:13 PM, Steven Sistare wrote:
>>> On 10/25/2024 7:58 PM, Jason Gunthorpe wrote:
>>>> On Fri, Oct 25, 2024 at 11:39:40AM -0700, Nicolin Chen wrote:
>>>>
>>>>>> I will try to reproduce your failures.
>>>>>> What is your base commit?
>>>>>
>>>>> I was using the for-next tree (rc1) from:
>>>>> git git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd.git
>>>>>
>>>>> And it was failing, so I guess rc1 is missing something for memfd.
>>>>> So, it should be a false alarm.
>>>>
>>>> Oh, that's exciting. So I will have to base it on a rc that works so
>>>> bisection works :\
>>>
>>> I'll test builds on arm to identify the minimum requirement.
>>
>> BTW this is likely an arm-specific issue, because I have been using
>> for-next rc1 on x86, and all tests pass.
>
> I did a clean build today. And it passes with for-next tree.
>
> Looks like something wrong with my yesterday's build. Sorry.
No problem. Thanks for letting me know.
- Steve
* Re: [PATCH V7 0/9] iommu_ioas_map_file
2024-10-25 13:11 [PATCH V7 0/9] iommu_ioas_map_file Steve Sistare
` (8 preceding siblings ...)
2024-10-25 13:11 ` [PATCH V7 9/9] iommufd: map file selftest Steve Sistare
@ 2024-10-30 0:11 ` Jason Gunthorpe
2024-10-30 12:43 ` Steven Sistare
9 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2024-10-30 0:11 UTC (permalink / raw)
To: Steve Sistare; +Cc: iommu, Kevin Tian, Nicolin Chen
On Fri, Oct 25, 2024 at 06:11:50AM -0700, Steve Sistare wrote:
> Provide the IOMMU_IOAS_MAP_FILE ioctl, which allows a user to register
> memory by passing a memfd plus offset and length. Implement it using
> the memfd_pin_folios KAPI, and the proposed folio_add_pins KAPI.
> See the individual patches for details.
>
> Steve Sistare (9):
> mm/gup: folio_add_pins
> iommufd: rename uptr in iopt_alloc_iova
> iommufd: generalize iopt_pages address
> iommufd: pfn reader local variables
> iommufd: folio subroutines
> iommufd: pfn reader for file mappings
> iommufd: IOMMU_IOAS_MAP_FILE
> iommufd: file mappings for mdev
> iommufd: map file selftest
Applied to iommufd for-next
Is v3 still the latest for "iommufd live update"?
Thanks,
Jason
* Re: [PATCH V7 0/9] iommu_ioas_map_file
2024-10-30 0:11 ` [PATCH V7 0/9] iommu_ioas_map_file Jason Gunthorpe
@ 2024-10-30 12:43 ` Steven Sistare
2024-11-04 13:58 ` Steven Sistare
0 siblings, 1 reply; 35+ messages in thread
From: Steven Sistare @ 2024-10-30 12:43 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: iommu, Kevin Tian, Nicolin Chen
On 10/29/2024 8:11 PM, Jason Gunthorpe wrote:
> On Fri, Oct 25, 2024 at 06:11:50AM -0700, Steve Sistare wrote:
>> Provide the IOMMU_IOAS_MAP_FILE ioctl, which allows a user to register
>> memory by passing a memfd plus offset and length. Implement it using
>> the memfd_pin_folios KAPI, and the proposed folio_add_pins KAPI.
>> See the individual patches for details.
>>
>> Steve Sistare (9):
>> mm/gup: folio_add_pins
>> iommufd: rename uptr in iopt_alloc_iova
>> iommufd: generalize iopt_pages address
>> iommufd: pfn reader local variables
>> iommufd: folio subroutines
>> iommufd: pfn reader for file mappings
>> iommufd: IOMMU_IOAS_MAP_FILE
>> iommufd: file mappings for mdev
>> iommufd: map file selftest
>
> Applied to iommufd for-next
>
> Is v3 still the latest for "iommufd live update"?
Yes.
Its selftest patch no longer applies cleanly, but I'll fix that in V4
after you review the other patches. Thanks!
- Steve
* Re: [PATCH V7 0/9] iommu_ioas_map_file
2024-10-30 12:43 ` Steven Sistare
@ 2024-11-04 13:58 ` Steven Sistare
2024-11-04 14:18 ` Jason Gunthorpe
0 siblings, 1 reply; 35+ messages in thread
From: Steven Sistare @ 2024-11-04 13:58 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: iommu, Kevin Tian, Nicolin Chen
On 10/30/2024 8:43 AM, Steven Sistare wrote:
> On 10/29/2024 8:11 PM, Jason Gunthorpe wrote:
>> On Fri, Oct 25, 2024 at 06:11:50AM -0700, Steve Sistare wrote:
>>> Provide the IOMMU_IOAS_MAP_FILE ioctl, which allows a user to register
>>> memory by passing a memfd plus offset and length. Implement it using
>>> the memfd_pin_folios KAPI, and the proposed folio_add_pins KAPI.
>>> See the individual patches for details.
>>>
>>> Steve Sistare (9):
>>> mm/gup: folio_add_pins
>>> iommufd: rename uptr in iopt_alloc_iova
>>> iommufd: generalize iopt_pages address
>>> iommufd: pfn reader local variables
>>> iommufd: folio subroutines
>>> iommufd: pfn reader for file mappings
>>> iommufd: IOMMU_IOAS_MAP_FILE
>>> iommufd: file mappings for mdev
>>> iommufd: map file selftest
>>
>> Applied to iommufd for-next
>>
>> Is v3 still the latest for "iommufd live update"?
>
> Yes.
>
> Its selftest patch no longer applies cleanly, but I'll fix that in V4
> after you review the other patches. Thanks!
Just checking -- are you waiting for me to post V4 before reviewing any
further?
- Steve
* Re: [PATCH V7 0/9] iommu_ioas_map_file
2024-11-04 13:58 ` Steven Sistare
@ 2024-11-04 14:18 ` Jason Gunthorpe
2024-11-04 14:24 ` Steven Sistare
0 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2024-11-04 14:18 UTC (permalink / raw)
To: Steven Sistare; +Cc: iommu, Kevin Tian, Nicolin Chen
On Mon, Nov 04, 2024 at 08:58:29AM -0500, Steven Sistare wrote:
> > Its selftest patch no longer applies cleanly, but I'll fix that in V4
> > after you review the other patches. Thanks!
>
> Just checking -- are you waiting for me to post V4 before reviewing any
> further?
No, just busy with other patches :|
Jason
* Re: [PATCH V7 0/9] iommu_ioas_map_file
2024-11-04 14:18 ` Jason Gunthorpe
@ 2024-11-04 14:24 ` Steven Sistare
0 siblings, 0 replies; 35+ messages in thread
From: Steven Sistare @ 2024-11-04 14:24 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: iommu, Kevin Tian, Nicolin Chen
On 11/4/2024 9:18 AM, Jason Gunthorpe wrote:
> On Mon, Nov 04, 2024 at 08:58:29AM -0500, Steven Sistare wrote:
>
>>> Its selftest patch no longer applies cleanly, but I'll fix that in V4
>>> after you review the other patches. Thanks!
>>
>> Just checking -- are you waiting for me to post V4 before reviewing any
>> further?
>
> No, just busy with other patches :|
No problem, I understand, thanks. Just making sure we were not deadlocked
waiting for each other.
- Steve