* [PATCH V5 1/9] mm/gup: folio_add_pins
2024-10-22 21:00 [PATCH V5 0/9] iommu_ioas_map_file Steve Sistare
@ 2024-10-22 21:00 ` Steve Sistare
2024-10-22 21:00 ` [PATCH V5 2/9] iommufd: rename uptr in iopt_alloc_iova Steve Sistare
` (7 subsequent siblings)
8 siblings, 0 replies; 33+ messages in thread
From: Steve Sistare @ 2024-10-22 21:00 UTC (permalink / raw)
To: iommu; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen, Steve Sistare
Export a function that adds pins to an already-pinned huge-page folio.
This allows any range of small pages within the folio to be unpinned later.
For example, pages pinned via memfd_pin_folios and modified by
folio_add_pins could be unpinned via unpin_user_page(s).
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
include/linux/mm.h | 1 +
mm/gup.c | 24 ++++++++++++++++++++++++
2 files changed, 25 insertions(+)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index ecf63d2..f9de33a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2524,6 +2524,7 @@ long pin_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
long memfd_pin_folios(struct file *memfd, loff_t start, loff_t end,
struct folio **folios, unsigned int max_folios,
pgoff_t *offset);
+int folio_add_pins(struct folio *folio, unsigned int pins);
int get_user_pages_fast(unsigned long start, int nr_pages,
unsigned int gup_flags, struct page **pages);
diff --git a/mm/gup.c b/mm/gup.c
index a82890b..4ac3e567 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -3717,3 +3717,27 @@ long memfd_pin_folios(struct file *memfd, loff_t start, loff_t end,
return ret;
}
EXPORT_SYMBOL_GPL(memfd_pin_folios);
+
+/**
+ * folio_add_pins() - add pins to an already-pinned folio
+ * @folio: the folio to add more pins to
+ * @pins: number of pins to add
+ *
+ * Try to add more pins to an already-pinned folio. The semantics
+ * of the pin (e.g., FOLL_WRITE) follow any existing pin and cannot
+ * be changed.
+ *
+ * This function is helpful when having obtained a pin on a large folio
+ * using memfd_pin_folios(), but wanting to logically unpin parts
+ * (e.g., individual pages) of the folio later, for example, using
+ * unpin_user_page_range_dirty_lock().
+ *
+ * This is not the right interface to initially pin a folio.
+ */
+int folio_add_pins(struct folio *folio, unsigned int pins)
+{
+ VM_WARN_ON_ONCE(!folio_maybe_dma_pinned(folio));
+
+ return try_grab_folio(folio, pins, FOLL_PIN);
+}
+EXPORT_SYMBOL_GPL(folio_add_pins);
--
1.8.3.1
^ permalink raw reply related [flat|nested] 33+ messages in thread* [PATCH V5 2/9] iommufd: rename uptr in iopt_alloc_iova
2024-10-22 21:00 [PATCH V5 0/9] iommu_ioas_map_file Steve Sistare
2024-10-22 21:00 ` [PATCH V5 1/9] mm/gup: folio_add_pins Steve Sistare
@ 2024-10-22 21:00 ` Steve Sistare
2024-10-23 7:05 ` Tian, Kevin
2024-10-22 21:00 ` [PATCH V5 3/9] iommufd: generalize iopt_pages address Steve Sistare
` (6 subsequent siblings)
8 siblings, 1 reply; 33+ messages in thread
From: Steve Sistare @ 2024-10-22 21:00 UTC (permalink / raw)
To: iommu; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen, Steve Sistare
iopt_alloc_iova takes a uptr argument but only checks for its alignment.
Generalize this to an unsigned address, which can be the offset from the
start of a file in a subsequent patch. No functional change.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/iommufd/io_pagetable.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/drivers/iommu/iommufd/io_pagetable.c b/drivers/iommu/iommufd/io_pagetable.c
index 4bf7ccd..68ad91d 100644
--- a/drivers/iommu/iommufd/io_pagetable.c
+++ b/drivers/iommu/iommufd/io_pagetable.c
@@ -107,9 +107,9 @@ static bool __alloc_iova_check_used(struct interval_tree_span_iter *span,
* Does not return a 0 IOVA even if it is valid.
*/
static int iopt_alloc_iova(struct io_pagetable *iopt, unsigned long *iova,
- unsigned long uptr, unsigned long length)
+ unsigned long addr, unsigned long length)
{
- unsigned long page_offset = uptr % PAGE_SIZE;
+ unsigned long page_offset = addr % PAGE_SIZE;
struct interval_tree_double_span_iter used_span;
struct interval_tree_span_iter allowed_span;
unsigned long max_alignment = PAGE_SIZE;
@@ -122,15 +122,15 @@ static int iopt_alloc_iova(struct io_pagetable *iopt, unsigned long *iova,
return -EOVERFLOW;
/*
- * Keep alignment present in the uptr when building the IOVA, this
+ * Keep alignment present in addr when building the IOVA, which
* increases the chance we can map a THP.
*/
- if (!uptr)
+ if (!addr)
iova_alignment = roundup_pow_of_two(length);
else
iova_alignment = min_t(unsigned long,
roundup_pow_of_two(length),
- 1UL << __ffs64(uptr));
+ 1UL << __ffs64(addr));
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
max_alignment = HPAGE_SIZE;
@@ -248,6 +248,7 @@ static int iopt_alloc_area_pages(struct io_pagetable *iopt,
int iommu_prot, unsigned int flags)
{
struct iopt_pages_list *elm;
+ unsigned long start;
unsigned long iova;
int rc = 0;
@@ -267,9 +268,8 @@ static int iopt_alloc_area_pages(struct io_pagetable *iopt,
/* Use the first entry to guess the ideal IOVA alignment */
elm = list_first_entry(pages_list, struct iopt_pages_list,
next);
- rc = iopt_alloc_iova(
- iopt, dst_iova,
- (uintptr_t)elm->pages->uptr + elm->start_byte, length);
+ start = elm->start_byte + (uintptr_t)elm->pages->uptr;
+ rc = iopt_alloc_iova(iopt, dst_iova, start, length);
if (rc)
goto out_unlock;
if (IS_ENABLED(CONFIG_IOMMUFD_TEST) &&
--
1.8.3.1
^ permalink raw reply related [flat|nested] 33+ messages in thread* RE: [PATCH V5 2/9] iommufd: rename uptr in iopt_alloc_iova
2024-10-22 21:00 ` [PATCH V5 2/9] iommufd: rename uptr in iopt_alloc_iova Steve Sistare
@ 2024-10-23 7:05 ` Tian, Kevin
0 siblings, 0 replies; 33+ messages in thread
From: Tian, Kevin @ 2024-10-23 7:05 UTC (permalink / raw)
To: Steve Sistare, iommu@lists.linux.dev; +Cc: Jason Gunthorpe, Nicolin Chen
> From: Steve Sistare <steven.sistare@oracle.com>
> Sent: Wednesday, October 23, 2024 5:01 AM
>
> iopt_alloc_iova takes a uptr argument but only checks for its alignment.
> Generalize this to an unsigned address, which can be the offset from the
> start of a file in a subsequent patch. No functional change.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
^ permalink raw reply [flat|nested] 33+ messages in thread
* [PATCH V5 3/9] iommufd: generalize iopt_pages address
2024-10-22 21:00 [PATCH V5 0/9] iommu_ioas_map_file Steve Sistare
2024-10-22 21:00 ` [PATCH V5 1/9] mm/gup: folio_add_pins Steve Sistare
2024-10-22 21:00 ` [PATCH V5 2/9] iommufd: rename uptr in iopt_alloc_iova Steve Sistare
@ 2024-10-22 21:00 ` Steve Sistare
2024-10-23 7:08 ` Tian, Kevin
2024-10-22 21:00 ` [PATCH V5 4/9] iommufd: pfn reader local variables Steve Sistare
` (5 subsequent siblings)
8 siblings, 1 reply; 33+ messages in thread
From: Steve Sistare @ 2024-10-22 21:00 UTC (permalink / raw)
To: iommu; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen, Steve Sistare
The starting address in iopt_pages is currently a __user *uptr. Generalize
to allow other types of addresses. Refactor iopt_alloc_pages and
iopt_map_user_pages into address-type specific and common functions.
Suggested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/iommufd/io_pagetable.c | 55 ++++++++++++++++++++++--------------
drivers/iommu/iommufd/io_pagetable.h | 13 +++++++--
drivers/iommu/iommufd/pages.c | 31 ++++++++++++++------
3 files changed, 67 insertions(+), 32 deletions(-)
diff --git a/drivers/iommu/iommufd/io_pagetable.c b/drivers/iommu/iommufd/io_pagetable.c
index 68ad91d..874ee9e 100644
--- a/drivers/iommu/iommufd/io_pagetable.c
+++ b/drivers/iommu/iommufd/io_pagetable.c
@@ -384,6 +384,34 @@ int iopt_map_pages(struct io_pagetable *iopt, struct list_head *pages_list,
return rc;
}
+static int iopt_map_common(struct iommufd_ctx *ictx, struct io_pagetable *iopt,
+ struct iopt_pages *pages, unsigned long *iova,
+ unsigned long length, unsigned long start_byte,
+ int iommu_prot, unsigned int flags)
+{
+ struct iopt_pages_list elm = {};
+ LIST_HEAD(pages_list);
+ int rc;
+
+ elm.pages = pages;
+ elm.start_byte = start_byte;
+ if (ictx->account_mode == IOPT_PAGES_ACCOUNT_MM &&
+ elm.pages->account_mode == IOPT_PAGES_ACCOUNT_USER)
+ elm.pages->account_mode = IOPT_PAGES_ACCOUNT_MM;
+ elm.length = length;
+ list_add(&elm.next, &pages_list);
+
+ rc = iopt_map_pages(iopt, &pages_list, length, iova, iommu_prot, flags);
+ if (rc) {
+ if (elm.area)
+ iopt_abort_area(elm.area);
+ if (elm.pages)
+ iopt_put_pages(elm.pages);
+ return rc;
+ }
+ return 0;
+}
+
/**
* iopt_map_user_pages() - Map a user VA to an iova in the io page table
* @ictx: iommufd_ctx the iopt is part of
@@ -408,29 +436,14 @@ int iopt_map_user_pages(struct iommufd_ctx *ictx, struct io_pagetable *iopt,
unsigned long length, int iommu_prot,
unsigned int flags)
{
- struct iopt_pages_list elm = {};
- LIST_HEAD(pages_list);
- int rc;
+ struct iopt_pages *pages;
- elm.pages = iopt_alloc_pages(uptr, length, iommu_prot & IOMMU_WRITE);
- if (IS_ERR(elm.pages))
- return PTR_ERR(elm.pages);
- if (ictx->account_mode == IOPT_PAGES_ACCOUNT_MM &&
- elm.pages->account_mode == IOPT_PAGES_ACCOUNT_USER)
- elm.pages->account_mode = IOPT_PAGES_ACCOUNT_MM;
- elm.start_byte = uptr - elm.pages->uptr;
- elm.length = length;
- list_add(&elm.next, &pages_list);
+ pages = iopt_alloc_user_pages(uptr, length, iommu_prot & IOMMU_WRITE);
+ if (IS_ERR(pages))
+ return PTR_ERR(pages);
- rc = iopt_map_pages(iopt, &pages_list, length, iova, iommu_prot, flags);
- if (rc) {
- if (elm.area)
- iopt_abort_area(elm.area);
- if (elm.pages)
- iopt_put_pages(elm.pages);
- return rc;
- }
- return 0;
+ return iopt_map_common(ictx, iopt, pages, iova, length,
+ uptr - pages->uptr, iommu_prot, flags);
}
struct iova_bitmap_fn_arg {
diff --git a/drivers/iommu/iommufd/io_pagetable.h b/drivers/iommu/iommufd/io_pagetable.h
index c61d744..8e48266 100644
--- a/drivers/iommu/iommufd/io_pagetable.h
+++ b/drivers/iommu/iommufd/io_pagetable.h
@@ -175,6 +175,10 @@ enum {
IOPT_PAGES_ACCOUNT_MM = 2,
};
+enum iopt_address_type {
+ IOPT_ADDRESS_USER = 0,
+};
+
/*
* This holds a pinned page list for multiple areas of IO address space. The
* pages always originate from a linear chunk of userspace VA. Multiple
@@ -195,7 +199,10 @@ struct iopt_pages {
struct task_struct *source_task;
struct mm_struct *source_mm;
struct user_struct *source_user;
- void __user *uptr;
+ enum iopt_address_type type;
+ union {
+ void __user *uptr; /* IOPT_ADDRESS_USER */
+ };
bool writable:1;
u8 account_mode;
@@ -206,8 +213,8 @@ struct iopt_pages {
struct rb_root_cached domains_itree;
};
-struct iopt_pages *iopt_alloc_pages(void __user *uptr, unsigned long length,
- bool writable);
+struct iopt_pages *iopt_alloc_user_pages(void __user *uptr,
+ unsigned long length, bool writable);
void iopt_release_pages(struct kref *kref);
static inline void iopt_put_pages(struct iopt_pages *pages)
{
diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
index 93d806c..aa69c97 100644
--- a/drivers/iommu/iommufd/pages.c
+++ b/drivers/iommu/iommufd/pages.c
@@ -1139,11 +1139,11 @@ static int pfn_reader_first(struct pfn_reader *pfns, struct iopt_pages *pages,
return 0;
}
-struct iopt_pages *iopt_alloc_pages(void __user *uptr, unsigned long length,
- bool writable)
+static struct iopt_pages *iopt_alloc_pages(unsigned long length,
+ unsigned long start_byte,
+ bool writable)
{
struct iopt_pages *pages;
- unsigned long end;
/*
* The iommu API uses size_t as the length, and protect the DIV_ROUND_UP
@@ -1152,9 +1152,6 @@ struct iopt_pages *iopt_alloc_pages(void __user *uptr, unsigned long length,
if (length > SIZE_MAX - PAGE_SIZE || length == 0)
return ERR_PTR(-EINVAL);
- if (check_add_overflow((unsigned long)uptr, length, &end))
- return ERR_PTR(-EOVERFLOW);
-
pages = kzalloc(sizeof(*pages), GFP_KERNEL_ACCOUNT);
if (!pages)
return ERR_PTR(-ENOMEM);
@@ -1164,8 +1161,7 @@ struct iopt_pages *iopt_alloc_pages(void __user *uptr, unsigned long length,
mutex_init(&pages->mutex);
pages->source_mm = current->mm;
mmgrab(pages->source_mm);
- pages->uptr = (void __user *)ALIGN_DOWN((uintptr_t)uptr, PAGE_SIZE);
- pages->npages = DIV_ROUND_UP(length + (uptr - pages->uptr), PAGE_SIZE);
+ pages->npages = DIV_ROUND_UP(length + start_byte, PAGE_SIZE);
pages->access_itree = RB_ROOT_CACHED;
pages->domains_itree = RB_ROOT_CACHED;
pages->writable = writable;
@@ -1179,6 +1175,25 @@ struct iopt_pages *iopt_alloc_pages(void __user *uptr, unsigned long length,
return pages;
}
+struct iopt_pages *iopt_alloc_user_pages(void __user *uptr,
+ unsigned long length, bool writable)
+{
+ struct iopt_pages *pages;
+ unsigned long end;
+ void __user *uptr_down =
+ (void __user *) ALIGN_DOWN((uintptr_t)uptr, PAGE_SIZE);
+
+ if (check_add_overflow((unsigned long)uptr, length, &end))
+ return ERR_PTR(-EOVERFLOW);
+
+ pages = iopt_alloc_pages(length, uptr - uptr_down, writable);
+ if (IS_ERR(pages))
+ return pages;
+ pages->uptr = uptr_down;
+ pages->type = IOPT_ADDRESS_USER;
+ return pages;
+}
+
void iopt_release_pages(struct kref *kref)
{
struct iopt_pages *pages = container_of(kref, struct iopt_pages, kref);
--
1.8.3.1
^ permalink raw reply related [flat|nested] 33+ messages in thread* RE: [PATCH V5 3/9] iommufd: generalize iopt_pages address
2024-10-22 21:00 ` [PATCH V5 3/9] iommufd: generalize iopt_pages address Steve Sistare
@ 2024-10-23 7:08 ` Tian, Kevin
2024-10-23 13:33 ` Steven Sistare
0 siblings, 1 reply; 33+ messages in thread
From: Tian, Kevin @ 2024-10-23 7:08 UTC (permalink / raw)
To: Steve Sistare, iommu@lists.linux.dev; +Cc: Jason Gunthorpe, Nicolin Chen
> From: Steve Sistare <steven.sistare@oracle.com>
> Sent: Wednesday, October 23, 2024 5:01 AM
>
> -struct iopt_pages *iopt_alloc_pages(void __user *uptr, unsigned long length,
> - bool writable)
> +static struct iopt_pages *iopt_alloc_pages(unsigned long length,
> + unsigned long start_byte,
> + bool writable)
it's more intuitive to put start_byte as the 1st parameter, followed
by length.
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH V5 3/9] iommufd: generalize iopt_pages address
2024-10-23 7:08 ` Tian, Kevin
@ 2024-10-23 13:33 ` Steven Sistare
0 siblings, 0 replies; 33+ messages in thread
From: Steven Sistare @ 2024-10-23 13:33 UTC (permalink / raw)
To: Tian, Kevin, iommu@lists.linux.dev; +Cc: Jason Gunthorpe, Nicolin Chen
On 10/23/2024 3:08 AM, Tian, Kevin wrote:
>> From: Steve Sistare <steven.sistare@oracle.com>
>> Sent: Wednesday, October 23, 2024 5:01 AM
>>
>> -struct iopt_pages *iopt_alloc_pages(void __user *uptr, unsigned long length,
>> - bool writable)
>> +static struct iopt_pages *iopt_alloc_pages(unsigned long length,
>> + unsigned long start_byte,
>> + bool writable)
>
> it's more intuitive to put start_byte as the 1st parameter, followed
> by length.
Will do - steve
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
^ permalink raw reply [flat|nested] 33+ messages in thread
* [PATCH V5 4/9] iommufd: pfn reader local variables
2024-10-22 21:00 [PATCH V5 0/9] iommu_ioas_map_file Steve Sistare
` (2 preceding siblings ...)
2024-10-22 21:00 ` [PATCH V5 3/9] iommufd: generalize iopt_pages address Steve Sistare
@ 2024-10-22 21:00 ` Steve Sistare
2024-10-23 7:09 ` Tian, Kevin
2024-10-22 21:00 ` [PATCH V5 5/9] iommufd: folio subroutines Steve Sistare
` (4 subsequent siblings)
8 siblings, 1 reply; 33+ messages in thread
From: Steve Sistare @ 2024-10-22 21:00 UTC (permalink / raw)
To: iommu; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen, Steve Sistare
Add local variables for common sub-expressions needed by a subsequent
patch. No functional change.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/iommufd/pages.c | 25 ++++++++++++++-----------
1 file changed, 14 insertions(+), 11 deletions(-)
diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
index aa69c97..0cf5b65 100644
--- a/drivers/iommu/iommufd/pages.c
+++ b/drivers/iommu/iommufd/pages.c
@@ -978,6 +978,8 @@ static int pfn_reader_fill_span(struct pfn_reader *pfns)
{
struct interval_tree_double_span_iter *span = &pfns->span;
unsigned long start_index = pfns->batch_end_index;
+ struct pfn_reader_user *user = &pfns->user;
+ unsigned long npages;
struct iopt_area *area;
int rc;
@@ -1015,10 +1017,9 @@ static int pfn_reader_fill_span(struct pfn_reader *pfns)
return rc;
}
- batch_from_pages(&pfns->batch,
- pfns->user.upages +
- (start_index - pfns->user.upages_start),
- pfns->user.upages_end - start_index);
+ npages = user->upages_end - start_index;
+ start_index -= user->upages_start;
+ batch_from_pages(&pfns->batch, user->upages + start_index, npages);
return 0;
}
@@ -1092,16 +1093,18 @@ static int pfn_reader_init(struct pfn_reader *pfns, struct iopt_pages *pages,
static void pfn_reader_release_pins(struct pfn_reader *pfns)
{
struct iopt_pages *pages = pfns->pages;
+ struct pfn_reader_user *user = &pfns->user;
- if (pfns->user.upages_end > pfns->batch_end_index) {
- size_t npages = pfns->user.upages_end - pfns->batch_end_index;
-
+ if (user->upages_end > pfns->batch_end_index) {
/* Any pages not transferred to the batch are just unpinned */
- unpin_user_pages(pfns->user.upages + (pfns->batch_end_index -
- pfns->user.upages_start),
- npages);
+
+ unsigned long npages = user->upages_end - pfns->batch_end_index;
+ unsigned long start_index = pfns->batch_end_index -
+ user->upages_start;
+
+ unpin_user_pages(user->upages + start_index, npages);
iopt_pages_sub_npinned(pages, npages);
- pfns->user.upages_end = pfns->batch_end_index;
+ user->upages_end = pfns->batch_end_index;
}
if (pfns->batch_start_index != pfns->batch_end_index) {
pfn_reader_unpin(pfns);
--
1.8.3.1
^ permalink raw reply related [flat|nested] 33+ messages in thread* [PATCH V5 5/9] iommufd: folio subroutines
2024-10-22 21:00 [PATCH V5 0/9] iommu_ioas_map_file Steve Sistare
` (3 preceding siblings ...)
2024-10-22 21:00 ` [PATCH V5 4/9] iommufd: pfn reader local variables Steve Sistare
@ 2024-10-22 21:00 ` Steve Sistare
2024-10-23 7:16 ` Tian, Kevin
` (2 more replies)
2024-10-22 21:00 ` [PATCH V5 6/9] iommufd: pfn reader for file mappings Steve Sistare
` (3 subsequent siblings)
8 siblings, 3 replies; 33+ messages in thread
From: Steve Sistare @ 2024-10-22 21:00 UTC (permalink / raw)
To: iommu; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen, Steve Sistare
Add subroutines for copying folios to a batch.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
drivers/iommu/iommufd/pages.c | 79 +++++++++++++++++++++++++++++++++++--------
1 file changed, 64 insertions(+), 15 deletions(-)
diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
index 0cf5b65..635b1fe 100644
--- a/drivers/iommu/iommufd/pages.c
+++ b/drivers/iommu/iommufd/pages.c
@@ -346,27 +346,41 @@ static void batch_destroy(struct pfn_batch *batch, void *backup)
kfree(batch->pfns);
}
-/* true if the pfn was added, false otherwise */
-static bool batch_add_pfn(struct pfn_batch *batch, unsigned long pfn)
+static bool batch_add_pfn_num(struct pfn_batch *batch, unsigned long pfn,
+ unsigned long nr)
{
- const unsigned int MAX_NPFNS = type_max(typeof(*batch->npfns));
+ const unsigned long MAX_NPFNS = type_max(typeof(*batch->npfns));
+ int i = batch->end;
- if (batch->end &&
- pfn == batch->pfns[batch->end - 1] + batch->npfns[batch->end - 1] &&
- batch->npfns[batch->end - 1] != MAX_NPFNS) {
- batch->npfns[batch->end - 1]++;
- batch->total_pfns++;
- return true;
- }
- if (batch->end == batch->array_size)
+ if (i && pfn == batch->pfns[i - 1] + batch->npfns[i - 1] &&
+ nr <= MAX_NPFNS - batch->npfns[i - 1]) {
+ batch->npfns[i - 1] += nr;
+ } else if (i < batch->array_size) {
+ batch->pfns[i] = pfn;
+ batch->npfns[i] = nr;
+ batch->end++;
+ } else {
return false;
- batch->total_pfns++;
- batch->pfns[batch->end] = pfn;
- batch->npfns[batch->end] = 1;
- batch->end++;
+ }
+
+ batch->total_pfns += nr;
return true;
}
+static void batch_remove_pfn_num(struct pfn_batch *batch, unsigned long nr)
+{
+ batch->npfns[batch->end - 1] -= nr;
+ if (batch->npfns[batch->end - 1] == 0)
+ batch->end--;
+ batch->total_pfns -= nr;
+}
+
+/* true if the pfn was added, false otherwise */
+static bool batch_add_pfn(struct pfn_batch *batch, unsigned long pfn)
+{
+ return batch_add_pfn_num(batch, pfn, 1);
+}
+
/*
* Fill the batch with pfns from the domain. When the batch is full, or it
* reaches last_index, the function will return. The caller should use
@@ -622,6 +636,41 @@ static void batch_from_pages(struct pfn_batch *batch, struct page **pages,
break;
}
+static int batch_from_folios(struct pfn_batch *batch, struct folio ***folios_p,
+ unsigned long *offset_p, unsigned long npages)
+{
+ int rc = 0;
+ struct folio **folios = *folios_p;
+ unsigned long offset = *offset_p;
+
+ while (npages) {
+ struct folio *folio = *folios;
+ unsigned long nr = folio_nr_pages(folio) - offset;
+ unsigned long pfn = page_to_pfn(folio_page(folio, offset));
+
+ nr = min(nr, npages);
+ npages -= nr;
+
+ if (!batch_add_pfn_num(batch, pfn, nr))
+ break;
+ if (nr > 1) {
+ rc = folio_add_pins(folio, nr - 1);
+ if (rc) {
+ batch_remove_pfn_num(batch, nr);
+ goto out;
+ }
+ }
+
+ folios++;
+ offset = 0;
+ }
+
+out:
+ *folios_p = folios;
+ *offset_p = offset;
+ return rc;
+}
+
static void batch_unpin(struct pfn_batch *batch, struct iopt_pages *pages,
unsigned int first_page_off, size_t npages)
{
--
1.8.3.1
^ permalink raw reply related [flat|nested] 33+ messages in thread* RE: [PATCH V5 5/9] iommufd: folio subroutines
2024-10-22 21:00 ` [PATCH V5 5/9] iommufd: folio subroutines Steve Sistare
@ 2024-10-23 7:16 ` Tian, Kevin
2024-10-23 7:21 ` Tian, Kevin
2024-10-23 13:01 ` Jason Gunthorpe
2 siblings, 0 replies; 33+ messages in thread
From: Tian, Kevin @ 2024-10-23 7:16 UTC (permalink / raw)
To: Steve Sistare, iommu@lists.linux.dev; +Cc: Jason Gunthorpe, Nicolin Chen
> From: Steve Sistare <steven.sistare@oracle.com>
> Sent: Wednesday, October 23, 2024 5:01 AM
>
> -/* true if the pfn was added, false otherwise */
> -static bool batch_add_pfn(struct pfn_batch *batch, unsigned long pfn)
> +static bool batch_add_pfn_num(struct pfn_batch *batch, unsigned long pfn,
> + unsigned long nr)
> {
> - const unsigned int MAX_NPFNS = type_max(typeof(*batch->npfns));
> + const unsigned long MAX_NPFNS = type_max(typeof(*batch->npfns));
why changing to 'unsigned long'?
except that,
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
^ permalink raw reply [flat|nested] 33+ messages in thread* RE: [PATCH V5 5/9] iommufd: folio subroutines
2024-10-22 21:00 ` [PATCH V5 5/9] iommufd: folio subroutines Steve Sistare
2024-10-23 7:16 ` Tian, Kevin
@ 2024-10-23 7:21 ` Tian, Kevin
2024-10-23 13:03 ` Jason Gunthorpe
2024-10-23 13:04 ` Steven Sistare
2024-10-23 13:01 ` Jason Gunthorpe
2 siblings, 2 replies; 33+ messages in thread
From: Tian, Kevin @ 2024-10-23 7:21 UTC (permalink / raw)
To: Steve Sistare, iommu@lists.linux.dev; +Cc: Jason Gunthorpe, Nicolin Chen
> From: Steve Sistare <steven.sistare@oracle.com>
> Sent: Wednesday, October 23, 2024 5:01 AM
>
> +static int batch_from_folios(struct pfn_batch *batch, struct folio ***folios_p,
> + unsigned long *offset_p, unsigned long npages)
> +{
> + int rc = 0;
> + struct folio **folios = *folios_p;
> + unsigned long offset = *offset_p;
> +
> + while (npages) {
> + struct folio *folio = *folios;
> + unsigned long nr = folio_nr_pages(folio) - offset;
> + unsigned long pfn = page_to_pfn(folio_page(folio, offset));
> +
> + nr = min(nr, npages);
> + npages -= nr;
> +
> + if (!batch_add_pfn_num(batch, pfn, nr))
> + break;
> + if (nr > 1) {
> + rc = folio_add_pins(folio, nr - 1);
> + if (rc) {
> + batch_remove_pfn_num(batch, nr);
> + goto out;
> + }
> + }
forgot to ask one thing here. Could you add a comment here why
additional pins is required only when batching more than 1 page?
it's not clear to a person w/o a lot of mm background. e.g. in patch1
folio_add_pins() is introduced for partial pin/unpin within an
already-pinned folio, but it's not mentioned to exclude such
operation on a single page...
> +
> + folios++;
> + offset = 0;
> + }
> +
> +out:
> + *folios_p = folios;
> + *offset_p = offset;
> + return rc;
> +}
> +
> static void batch_unpin(struct pfn_batch *batch, struct iopt_pages *pages,
> unsigned int first_page_off, size_t npages)
> {
> --
> 1.8.3.1
>
^ permalink raw reply [flat|nested] 33+ messages in thread* Re: [PATCH V5 5/9] iommufd: folio subroutines
2024-10-23 7:21 ` Tian, Kevin
@ 2024-10-23 13:03 ` Jason Gunthorpe
2024-10-24 6:14 ` Tian, Kevin
2024-10-23 13:04 ` Steven Sistare
1 sibling, 1 reply; 33+ messages in thread
From: Jason Gunthorpe @ 2024-10-23 13:03 UTC (permalink / raw)
To: Tian, Kevin; +Cc: Steve Sistare, iommu@lists.linux.dev, Nicolin Chen
On Wed, Oct 23, 2024 at 07:21:05AM +0000, Tian, Kevin wrote:
> > From: Steve Sistare <steven.sistare@oracle.com>
> > Sent: Wednesday, October 23, 2024 5:01 AM
> >
> > +static int batch_from_folios(struct pfn_batch *batch, struct folio ***folios_p,
> > + unsigned long *offset_p, unsigned long npages)
> > +{
> > + int rc = 0;
> > + struct folio **folios = *folios_p;
> > + unsigned long offset = *offset_p;
> > +
> > + while (npages) {
> > + struct folio *folio = *folios;
> > + unsigned long nr = folio_nr_pages(folio) - offset;
> > + unsigned long pfn = page_to_pfn(folio_page(folio, offset));
> > +
> > + nr = min(nr, npages);
> > + npages -= nr;
> > +
> > + if (!batch_add_pfn_num(batch, pfn, nr))
> > + break;
> > + if (nr > 1) {
> > + rc = folio_add_pins(folio, nr - 1);
> > + if (rc) {
> > + batch_remove_pfn_num(batch, nr);
> > + goto out;
> > + }
> > + }
>
> forgot to ask one thing here. Could you add a comment here why
> additional pins is required only when batching more than 1 page?
Something like:
The unpin path during unmap can unpin slices of a single folio, and
broadly does not keep track of where the folio boundaries were. Since
it treats everything as a single page transform the refcount on the
folio from a per-folio ref to a per-page ref. This makes the unmap
path uniform regardless of how the memory was pinned.
Jason
^ permalink raw reply [flat|nested] 33+ messages in thread* RE: [PATCH V5 5/9] iommufd: folio subroutines
2024-10-23 13:03 ` Jason Gunthorpe
@ 2024-10-24 6:14 ` Tian, Kevin
0 siblings, 0 replies; 33+ messages in thread
From: Tian, Kevin @ 2024-10-24 6:14 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Steve Sistare, iommu@lists.linux.dev, Nicolin Chen
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Wednesday, October 23, 2024 9:04 PM
>
> On Wed, Oct 23, 2024 at 07:21:05AM +0000, Tian, Kevin wrote:
> > > From: Steve Sistare <steven.sistare@oracle.com>
> > > Sent: Wednesday, October 23, 2024 5:01 AM
> > >
> > > +static int batch_from_folios(struct pfn_batch *batch, struct folio
> ***folios_p,
> > > + unsigned long *offset_p, unsigned long npages)
> > > +{
> > > + int rc = 0;
> > > + struct folio **folios = *folios_p;
> > > + unsigned long offset = *offset_p;
> > > +
> > > + while (npages) {
> > > + struct folio *folio = *folios;
> > > + unsigned long nr = folio_nr_pages(folio) - offset;
> > > + unsigned long pfn = page_to_pfn(folio_page(folio, offset));
> > > +
> > > + nr = min(nr, npages);
> > > + npages -= nr;
> > > +
> > > + if (!batch_add_pfn_num(batch, pfn, nr))
> > > + break;
> > > + if (nr > 1) {
> > > + rc = folio_add_pins(folio, nr - 1);
> > > + if (rc) {
> > > + batch_remove_pfn_num(batch, nr);
> > > + goto out;
> > > + }
> > > + }
> >
> > forgot to ask one thing here. Could you add a comment here why
> > additional pins is required only when batching more than 1 page?
>
> Something like:
>
> The unpin path during unmap can unpin slices of a single folio, and
> broadly does not keep track of where the folio boundaries were. Since
> it treats everything as a single page transform the refcount on the
> folio from a per-folio ref to a per-page ref. This makes the unmap
> path uniform regardless of how the memory was pinned.
>
this is much clearer. Thanks!
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH V5 5/9] iommufd: folio subroutines
2024-10-23 7:21 ` Tian, Kevin
2024-10-23 13:03 ` Jason Gunthorpe
@ 2024-10-23 13:04 ` Steven Sistare
1 sibling, 0 replies; 33+ messages in thread
From: Steven Sistare @ 2024-10-23 13:04 UTC (permalink / raw)
To: Tian, Kevin, iommu@lists.linux.dev; +Cc: Jason Gunthorpe, Nicolin Chen
On 10/23/2024 3:21 AM, Tian, Kevin wrote:
>> From: Steve Sistare <steven.sistare@oracle.com>
>> Sent: Wednesday, October 23, 2024 5:01 AM
>>
>> +static int batch_from_folios(struct pfn_batch *batch, struct folio ***folios_p,
>> + unsigned long *offset_p, unsigned long npages)
>> +{
>> + int rc = 0;
>> + struct folio **folios = *folios_p;
>> + unsigned long offset = *offset_p;
>> +
>> + while (npages) {
>> + struct folio *folio = *folios;
>> + unsigned long nr = folio_nr_pages(folio) - offset;
>> + unsigned long pfn = page_to_pfn(folio_page(folio, offset));
>> +
>> + nr = min(nr, npages);
>> + npages -= nr;
>> +
>> + if (!batch_add_pfn_num(batch, pfn, nr))
>> + break;
>> + if (nr > 1) {
>> + rc = folio_add_pins(folio, nr - 1);
>> + if (rc) {
>> + batch_remove_pfn_num(batch, nr);
>> + goto out;
>> + }
>> + }
>
> forgot to ask one thing here. Could you add a comment here why
> additional pins is required only when batching more than 1 page?
>
> it's not clear to a person w/o a lot of mm background. e.g. in patch1
> folio_add_pins() is introduced for partial pin/unpin within an
> already-pinned folio, but it's not mentioned to exclude such
> operation on a single page...
We already gained 1 pin per folio from memfd_pin_folios. When iommufd unmaps
the memory, it will unpin each (small) page once. Thus we add 1 pin for each
page, less 1 for the initial pin.
- Steve
>> +
>> + folios++;
>> + offset = 0;
>> + }
>> +
>> +out:
>> + *folios_p = folios;
>> + *offset_p = offset;
>> + return rc;
>> +}
>> +
>> static void batch_unpin(struct pfn_batch *batch, struct iopt_pages *pages,
>> unsigned int first_page_off, size_t npages)
>> {
>> --
>> 1.8.3.1
>>
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH V5 5/9] iommufd: folio subroutines
2024-10-22 21:00 ` [PATCH V5 5/9] iommufd: folio subroutines Steve Sistare
2024-10-23 7:16 ` Tian, Kevin
2024-10-23 7:21 ` Tian, Kevin
@ 2024-10-23 13:01 ` Jason Gunthorpe
2 siblings, 0 replies; 33+ messages in thread
From: Jason Gunthorpe @ 2024-10-23 13:01 UTC (permalink / raw)
To: Steve Sistare; +Cc: iommu, Kevin Tian, Nicolin Chen
On Tue, Oct 22, 2024 at 02:00:34PM -0700, Steve Sistare wrote:
> Add subroutines for copying folios to a batch.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
> drivers/iommu/iommufd/pages.c | 79 +++++++++++++++++++++++++++++++++++--------
> 1 file changed, 64 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
> index 0cf5b65..635b1fe 100644
> --- a/drivers/iommu/iommufd/pages.c
> +++ b/drivers/iommu/iommufd/pages.c
> @@ -346,27 +346,41 @@ static void batch_destroy(struct pfn_batch *batch, void *backup)
> kfree(batch->pfns);
> }
>
> -/* true if the pfn was added, false otherwise */
> -static bool batch_add_pfn(struct pfn_batch *batch, unsigned long pfn)
> +static bool batch_add_pfn_num(struct pfn_batch *batch, unsigned long pfn,
> + unsigned long nr)
nr should be a u32 because that is iwhat npfns is
> {
> - const unsigned int MAX_NPFNS = type_max(typeof(*batch->npfns));
> + const unsigned long MAX_NPFNS = type_max(typeof(*batch->npfns));
> + int i = batch->end;
unsigned int end = batch->end;
Otherwise
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Jason
^ permalink raw reply [flat|nested] 33+ messages in thread
* [PATCH V5 6/9] iommufd: pfn reader for file mappings
2024-10-22 21:00 [PATCH V5 0/9] iommu_ioas_map_file Steve Sistare
` (4 preceding siblings ...)
2024-10-22 21:00 ` [PATCH V5 5/9] iommufd: folio subroutines Steve Sistare
@ 2024-10-22 21:00 ` Steve Sistare
2024-10-23 7:40 ` Tian, Kevin
2024-10-23 13:24 ` Jason Gunthorpe
2024-10-22 21:00 ` [PATCH V5 7/9] iommufd: IOMMU_IOAS_MAP_FILE Steve Sistare
` (2 subsequent siblings)
8 siblings, 2 replies; 33+ messages in thread
From: Steve Sistare @ 2024-10-22 21:00 UTC (permalink / raw)
To: iommu; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen, Steve Sistare
Extend pfn_reader_user to pin file mappings, by calling memfd_pin_folios.
Repin at small page granularity, and fill the batch from folios. Expand
folios to upages for the iopt_pages_fill path.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/iommufd/io_pagetable.h | 5 ++
drivers/iommu/iommufd/pages.c | 128 ++++++++++++++++++++++++++++++-----
2 files changed, 116 insertions(+), 17 deletions(-)
diff --git a/drivers/iommu/iommufd/io_pagetable.h b/drivers/iommu/iommufd/io_pagetable.h
index 8e48266..5ac4eed 100644
--- a/drivers/iommu/iommufd/io_pagetable.h
+++ b/drivers/iommu/iommufd/io_pagetable.h
@@ -177,6 +177,7 @@ enum {
enum iopt_address_type {
IOPT_ADDRESS_USER = 0,
+ IOPT_ADDRESS_FILE = 1,
};
/*
@@ -202,6 +203,10 @@ struct iopt_pages {
enum iopt_address_type type;
union {
void __user *uptr; /* IOPT_ADDRESS_USER */
+ struct { /* IOPT_ADDRESS_FILE */
+ struct file *file;
+ unsigned long start;
+ };
};
bool writable:1;
u8 account_mode;
diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
index 635b1fe..59d40c1 100644
--- a/drivers/iommu/iommufd/pages.c
+++ b/drivers/iommu/iommufd/pages.c
@@ -752,19 +752,32 @@ struct pfn_reader_user {
* neither
*/
int locked;
+
+ /* The following are only valid if file != NULL. */
+ struct file *file;
+ struct folio **ufolios;
+ size_t ufolios_len;
+ unsigned long ufolios_offset;
+ struct folio **ufolios_next;
};
static void pfn_reader_user_init(struct pfn_reader_user *user,
struct iopt_pages *pages)
{
user->upages = NULL;
+ user->upages_len = 0;
user->upages_start = 0;
user->upages_end = 0;
user->locked = -1;
-
user->gup_flags = FOLL_LONGTERM;
if (pages->writable)
user->gup_flags |= FOLL_WRITE;
+
+ user->file = (pages->type == IOPT_ADDRESS_FILE) ? pages->file : NULL;
+ user->ufolios = NULL;
+ user->ufolios_len = 0;
+ user->ufolios_next = NULL;
+ user->ufolios_offset = 0;
}
static void pfn_reader_user_destroy(struct pfn_reader_user *user,
@@ -773,13 +786,67 @@ static void pfn_reader_user_destroy(struct pfn_reader_user *user,
if (user->locked != -1) {
if (user->locked)
mmap_read_unlock(pages->source_mm);
- if (pages->source_mm != current->mm)
+ if (!user->file && pages->source_mm != current->mm)
mmput(pages->source_mm);
user->locked = -1;
}
kfree(user->upages);
user->upages = NULL;
+ kfree(user->ufolios);
+ user->ufolios = NULL;
+}
+
+static long pin_memfd_pages(struct pfn_reader_user *user, unsigned long start,
+ unsigned long npages)
+{
+ unsigned long i;
+ unsigned long offset;
+ unsigned long npages_out = 0;
+ struct page **upages = user->upages;
+ unsigned long end = start + (npages << PAGE_SHIFT) - 1;
+ long nfolios = user->ufolios_len / sizeof(*user->ufolios);
+
+ /*
+ * todo: memfd_pin_folios should return the last pinned offset so
+ * we can compute npages pinned, and avoid looping over folios here
+ * if upages == NULL.
+ */
+ nfolios = memfd_pin_folios(user->file, start, end, user->ufolios,
+ nfolios, &offset);
+ if (nfolios <= 0)
+ return nfolios;
+
+ offset >>= PAGE_SHIFT;
+ user->ufolios_next = user->ufolios;
+ user->ufolios_offset = offset;
+
+ for (i = 0; i < nfolios; i++) {
+ struct folio *folio = user->ufolios[i];
+ unsigned long nr = folio_nr_pages(folio);
+ unsigned long npin = min(nr - offset, npages);
+
+ npages -= npin;
+ npages_out += npin;
+
+ if (upages) {
+ if (npin == 1) {
+ *upages++ = folio_page(folio, offset);
+ } else {
+ int rc = folio_add_pins(folio, npin - 1);
+
+ if (rc)
+ return rc;
+
+ while (npin--)
+ *upages++ = folio_page(folio, offset++);
+ }
+ }
+
+ offset = 0;
+ }
+
+ return npages_out;
}
static int pfn_reader_user_pin(struct pfn_reader_user *user,
@@ -788,7 +855,9 @@ static int pfn_reader_user_pin(struct pfn_reader_user *user,
unsigned long last_index)
{
bool remote_mm = pages->source_mm != current->mm;
- unsigned long npages;
+ unsigned long npages = last_index - start_index + 1;
+ unsigned long start;
+ unsigned long unum;
uintptr_t uptr;
long rc;
@@ -796,40 +865,50 @@ static int pfn_reader_user_pin(struct pfn_reader_user *user,
WARN_ON(last_index < start_index))
return -EINVAL;
- if (!user->upages) {
+ if (!user->file && !user->upages) {
/* All undone in pfn_reader_destroy() */
- user->upages_len =
- (last_index - start_index + 1) * sizeof(*user->upages);
+ user->upages_len = npages * sizeof(*user->upages);
user->upages = temp_kmalloc(&user->upages_len, NULL, 0);
if (!user->upages)
return -ENOMEM;
}
+ if (user->file && !user->ufolios) {
+ user->ufolios_len = npages * sizeof(*user->ufolios);
+ user->ufolios = temp_kmalloc(&user->ufolios_len, NULL, 0);
+ if (!user->ufolios)
+ return -ENOMEM;
+ }
+
if (user->locked == -1) {
/*
* The majority of usages will run the map task within the mm
* providing the pages, so we can optimize into
* get_user_pages_fast()
*/
- if (remote_mm) {
+ if (!user->file && remote_mm) {
if (!mmget_not_zero(pages->source_mm))
return -EFAULT;
}
user->locked = 0;
}
- npages = min_t(unsigned long, last_index - start_index + 1,
- user->upages_len / sizeof(*user->upages));
-
+ unum = user->file ? user->ufolios_len / sizeof(*user->ufolios) :
+ user->upages_len / sizeof(*user->upages);
+ npages = min_t(unsigned long, npages, unum);
if (iommufd_should_fail())
return -EFAULT;
- uptr = (uintptr_t)(pages->uptr + start_index * PAGE_SIZE);
- if (!remote_mm)
+ if (user->file) {
+ start = pages->start + (start_index * PAGE_SIZE);
+ rc = pin_memfd_pages(user, start, npages);
+ } else if (!remote_mm) {
+ uptr = (uintptr_t)(pages->uptr + start_index * PAGE_SIZE);
rc = pin_user_pages_fast(uptr, npages, user->gup_flags,
user->upages);
- else {
+ } else {
+ uptr = (uintptr_t)(pages->uptr + start_index * PAGE_SIZE);
if (!user->locked) {
mmap_read_lock(pages->source_mm);
user->locked = 1;
@@ -887,7 +966,8 @@ static int update_mm_locked_vm(struct iopt_pages *pages, unsigned long npages,
mmap_read_unlock(pages->source_mm);
user->locked = 0;
/* If we had the lock then we also have a get */
- } else if ((!user || !user->upages) &&
+
+ } else if ((!user || (!user->upages && !user->ufolios)) &&
pages->source_mm != current->mm) {
if (!mmget_not_zero(pages->source_mm))
return -EINVAL;
@@ -1068,8 +1148,15 @@ static int pfn_reader_fill_span(struct pfn_reader *pfns)
npages = user->upages_end - start_index;
start_index -= user->upages_start;
- batch_from_pages(&pfns->batch, user->upages + start_index, npages);
- return 0;
+ rc = 0;
+
+ if (!user->file)
+ batch_from_pages(&pfns->batch, user->upages + start_index,
+ npages);
+ else
+ rc = batch_from_folios(&pfns->batch, &user->ufolios_next,
+ &user->ufolios_offset, npages);
+ return rc;
}
static bool pfn_reader_done(struct pfn_reader *pfns)
@@ -1151,7 +1238,14 @@ static void pfn_reader_release_pins(struct pfn_reader *pfns)
unsigned long start_index = pfns->batch_end_index -
user->upages_start;
- unpin_user_pages(user->upages + start_index, npages);
+ if (!user->file) {
+ unpin_user_pages(user->upages + start_index, npages);
+ } else {
+ long n = user->ufolios_len / sizeof(*user->ufolios);
+
+ unpin_folios(user->ufolios_next,
+ user->ufolios + n - user->ufolios_next);
+ }
iopt_pages_sub_npinned(pages, npages);
user->upages_end = pfns->batch_end_index;
}
--
1.8.3.1
^ permalink raw reply related [flat|nested] 33+ messages in thread* RE: [PATCH V5 6/9] iommufd: pfn reader for file mappings
2024-10-22 21:00 ` [PATCH V5 6/9] iommufd: pfn reader for file mappings Steve Sistare
@ 2024-10-23 7:40 ` Tian, Kevin
2024-10-23 13:06 ` Steven Sistare
2024-10-23 13:15 ` Jason Gunthorpe
2024-10-23 13:24 ` Jason Gunthorpe
1 sibling, 2 replies; 33+ messages in thread
From: Tian, Kevin @ 2024-10-23 7:40 UTC (permalink / raw)
To: Steve Sistare, iommu@lists.linux.dev; +Cc: Jason Gunthorpe, Nicolin Chen
> From: Steve Sistare <steven.sistare@oracle.com>
> Sent: Wednesday, October 23, 2024 5:01 AM
>
> +
> +static long pin_memfd_pages(struct pfn_reader_user *user, unsigned long
> start,
> + unsigned long npages)
> +{
> + unsigned long i;
> + unsigned long offset;
> + unsigned long npages_out = 0;
> + struct page **upages = user->upages;
> + unsigned long end = start + (npages << PAGE_SHIFT) - 1;
> + long nfolios = user->ufolios_len / sizeof(*user->ufolios);
reverse Christmas tree
> +
> + /*
> + * todo: memfd_pin_folios should return the last pinned offset so
> + * we can compute npages pinned, and avoid looping over folios here
> + * if upages == NULL.
> + */
> + nfolios = memfd_pin_folios(user->file, start, end, user->ufolios,
> + nfolios, &offset);
> + if (nfolios <= 0)
> + return nfolios;
> +
> + offset >>= PAGE_SHIFT;
> + user->ufolios_next = user->ufolios;
> + user->ufolios_offset = offset;
> +
> + for (i = 0; i < nfolios; i++) {
> + struct folio *folio = user->ufolios[i];
> + unsigned long nr = folio_nr_pages(folio);
> + unsigned long npin = min(nr - offset, npages);
> +
> + npages -= npin;
> + npages_out += npin;
> +
> + if (upages) {
> + if (npin == 1) {
> + *upages++ = folio_page(folio, offset);
> + } else {
> + int rc = folio_add_pins(folio, npin - 1);
> +
> + if (rc)
> + return rc;
> +
> + while (npin--)
> + *upages++ = folio_page(folio,
> offset++);
> + }
> + }
dead code as user->upages is NULL for memfd (echoed below)?
>
> @@ -796,40 +865,50 @@ static int pfn_reader_user_pin(struct
> pfn_reader_user *user,
> WARN_ON(last_index < start_index))
> return -EINVAL;
>
> - if (!user->upages) {
> + if (!user->file && !user->upages) {
> /* All undone in pfn_reader_destroy() */
> - user->upages_len =
> - (last_index - start_index + 1) * sizeof(*user->upages);
> + user->upages_len = npages * sizeof(*user->upages);
> user->upages = temp_kmalloc(&user->upages_len, NULL, 0);
> if (!user->upages)
> return -ENOMEM;
> }
>
> + if (user->file && !user->ufolios) {
> + user->ufolios_len = npages * sizeof(*user->ufolios);
> + user->ufolios = temp_kmalloc(&user->ufolios_len, NULL, 0);
> + if (!user->ufolios)
> + return -ENOMEM;
> + }
^ permalink raw reply [flat|nested] 33+ messages in thread* Re: [PATCH V5 6/9] iommufd: pfn reader for file mappings
2024-10-23 7:40 ` Tian, Kevin
@ 2024-10-23 13:06 ` Steven Sistare
2024-10-24 6:17 ` Tian, Kevin
2024-10-23 13:15 ` Jason Gunthorpe
1 sibling, 1 reply; 33+ messages in thread
From: Steven Sistare @ 2024-10-23 13:06 UTC (permalink / raw)
To: Tian, Kevin, iommu@lists.linux.dev; +Cc: Jason Gunthorpe, Nicolin Chen
On 10/23/2024 3:40 AM, Tian, Kevin wrote:
>> From: Steve Sistare <steven.sistare@oracle.com>
>> Sent: Wednesday, October 23, 2024 5:01 AM
>>
>> +
>> +static long pin_memfd_pages(struct pfn_reader_user *user, unsigned long
>> start,
>> + unsigned long npages)
>> +{
>> + unsigned long i;
>> + unsigned long offset;
>> + unsigned long npages_out = 0;
>> + struct page **upages = user->upages;
>> + unsigned long end = start + (npages << PAGE_SHIFT) - 1;
>> + long nfolios = user->ufolios_len / sizeof(*user->ufolios);
>
> reverse Christmas tree
>
>> +
>> + /*
>> + * todo: memfd_pin_folios should return the last pinned offset so
>> + * we can compute npages pinned, and avoid looping over folios here
>> + * if upages == NULL.
>> + */
>> + nfolios = memfd_pin_folios(user->file, start, end, user->ufolios,
>> + nfolios, &offset);
>> + if (nfolios <= 0)
>> + return nfolios;
>> +
>> + offset >>= PAGE_SHIFT;
>> + user->ufolios_next = user->ufolios;
>> + user->ufolios_offset = offset;
>> +
>> + for (i = 0; i < nfolios; i++) {
>> + struct folio *folio = user->ufolios[i];
>> + unsigned long nr = folio_nr_pages(folio);
>> + unsigned long npin = min(nr - offset, npages);
>> +
>> + npages -= npin;
>> + npages_out += npin;
>> +
>> + if (upages) {
>> + if (npin == 1) {
>> + *upages++ = folio_page(folio, offset);
>> + } else {
>> + int rc = folio_add_pins(folio, npin - 1);
>> +
>> + if (rc)
>> + return rc;
>> +
>> + while (npin--)
>> + *upages++ = folio_page(folio,
>> offset++);
>> + }
>> + }
>
> dead code as user->upages is NULL for memfd (echoed below)?
user->upages is still used for mediated devices, assigned in iopt_pages_fill.
upages is filled directly, bypassing the batch.
- Steve
>> @@ -796,40 +865,50 @@ static int pfn_reader_user_pin(struct
>> pfn_reader_user *user,
>> WARN_ON(last_index < start_index))
>> return -EINVAL;
>>
>> - if (!user->upages) {
>> + if (!user->file && !user->upages) {
>> /* All undone in pfn_reader_destroy() */
>> - user->upages_len =
>> - (last_index - start_index + 1) * sizeof(*user->upages);
>> + user->upages_len = npages * sizeof(*user->upages);
>> user->upages = temp_kmalloc(&user->upages_len, NULL, 0);
>> if (!user->upages)
>> return -ENOMEM;
>> }
>>
>> + if (user->file && !user->ufolios) {
>> + user->ufolios_len = npages * sizeof(*user->ufolios);
>> + user->ufolios = temp_kmalloc(&user->ufolios_len, NULL, 0);
>> + if (!user->ufolios)
>> + return -ENOMEM;
>> + }
>
^ permalink raw reply [flat|nested] 33+ messages in thread* RE: [PATCH V5 6/9] iommufd: pfn reader for file mappings
2024-10-23 13:06 ` Steven Sistare
@ 2024-10-24 6:17 ` Tian, Kevin
0 siblings, 0 replies; 33+ messages in thread
From: Tian, Kevin @ 2024-10-24 6:17 UTC (permalink / raw)
To: Steven Sistare, iommu@lists.linux.dev; +Cc: Jason Gunthorpe, Nicolin Chen
> From: Steven Sistare <steven.sistare@oracle.com>
> Sent: Wednesday, October 23, 2024 9:07 PM
>
> On 10/23/2024 3:40 AM, Tian, Kevin wrote:
> >> From: Steve Sistare <steven.sistare@oracle.com>
> >> Sent: Wednesday, October 23, 2024 5:01 AM
> >>
> >> +
> >> +static long pin_memfd_pages(struct pfn_reader_user *user, unsigned
> long
> >> start,
> >> + unsigned long npages)
> >> +{
> >> + unsigned long i;
> >> + unsigned long offset;
> >> + unsigned long npages_out = 0;
> >> + struct page **upages = user->upages;
> >> + unsigned long end = start + (npages << PAGE_SHIFT) - 1;
> >> + long nfolios = user->ufolios_len / sizeof(*user->ufolios);
> >
> > reverse Christmas tree
> >
> >> +
> >> + /*
> >> + * todo: memfd_pin_folios should return the last pinned offset so
> >> + * we can compute npages pinned, and avoid looping over folios here
> >> + * if upages == NULL.
> >> + */
> >> + nfolios = memfd_pin_folios(user->file, start, end, user->ufolios,
> >> + nfolios, &offset);
> >> + if (nfolios <= 0)
> >> + return nfolios;
> >> +
> >> + offset >>= PAGE_SHIFT;
> >> + user->ufolios_next = user->ufolios;
> >> + user->ufolios_offset = offset;
> >> +
> >> + for (i = 0; i < nfolios; i++) {
> >> + struct folio *folio = user->ufolios[i];
> >> + unsigned long nr = folio_nr_pages(folio);
> >> + unsigned long npin = min(nr - offset, npages);
> >> +
> >> + npages -= npin;
> >> + npages_out += npin;
> >> +
> >> + if (upages) {
> >> + if (npin == 1) {
> >> + *upages++ = folio_page(folio, offset);
> >> + } else {
> >> + int rc = folio_add_pins(folio, npin - 1);
> >> +
> >> + if (rc)
> >> + return rc;
> >> +
> >> + while (npin--)
> >> + *upages++ = folio_page(folio,
> >> offset++);
> >> + }
> >> + }
> >
> > dead code as user->upages is NULL for memfd (echoed below)?
>
> user->upages is still used for mediated devices, assigned in iopt_pages_fill.
> upages is filled directly, bypassing the batch.
>
Okay, I overlooked that part.
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH V5 6/9] iommufd: pfn reader for file mappings
2024-10-23 7:40 ` Tian, Kevin
2024-10-23 13:06 ` Steven Sistare
@ 2024-10-23 13:15 ` Jason Gunthorpe
1 sibling, 0 replies; 33+ messages in thread
From: Jason Gunthorpe @ 2024-10-23 13:15 UTC (permalink / raw)
To: Tian, Kevin; +Cc: Steve Sistare, iommu@lists.linux.dev, Nicolin Chen
On Wed, Oct 23, 2024 at 07:40:34AM +0000, Tian, Kevin wrote:
> > + for (i = 0; i < nfolios; i++) {
> > + struct folio *folio = user->ufolios[i];
> > + unsigned long nr = folio_nr_pages(folio);
> > + unsigned long npin = min(nr - offset, npages);
> > +
> > + npages -= npin;
> > + npages_out += npin;
> > +
> > + if (upages) {
> > + if (npin == 1) {
> > + *upages++ = folio_page(folio, offset);
> > + } else {
> > + int rc = folio_add_pins(folio, npin - 1);
> > +
> > + if (rc)
> > + return rc;
> > +
> > + while (npin--)
> > + *upages++ = folio_page(folio,
> > offset++);
> > + }
> > + }
>
> dead code as user->upages is NULL for memfd (echoed below)?
It can be non-null if we go down the iopt_pages_fill() call chain. In
that case it will hold the output array for the access's page list for
mdev type drivers.
Jason
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH V5 6/9] iommufd: pfn reader for file mappings
2024-10-22 21:00 ` [PATCH V5 6/9] iommufd: pfn reader for file mappings Steve Sistare
2024-10-23 7:40 ` Tian, Kevin
@ 2024-10-23 13:24 ` Jason Gunthorpe
1 sibling, 0 replies; 33+ messages in thread
From: Jason Gunthorpe @ 2024-10-23 13:24 UTC (permalink / raw)
To: Steve Sistare; +Cc: iommu, Kevin Tian, Nicolin Chen
On Tue, Oct 22, 2024 at 02:00:35PM -0700, Steve Sistare wrote:
> Extend pfn_reader_user to pin file mappings, by calling memfd_pin_folios.
> Repin at small page granularity, and fill the batch from folios. Expand
> folios to upages for the iopt_pages_fill path.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
> drivers/iommu/iommufd/io_pagetable.h | 5 ++
> drivers/iommu/iommufd/pages.c | 128 ++++++++++++++++++++++++++++++-----
> 2 files changed, 116 insertions(+), 17 deletions(-)
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Jason
^ permalink raw reply [flat|nested] 33+ messages in thread
* [PATCH V5 7/9] iommufd: IOMMU_IOAS_MAP_FILE
2024-10-22 21:00 [PATCH V5 0/9] iommu_ioas_map_file Steve Sistare
` (5 preceding siblings ...)
2024-10-22 21:00 ` [PATCH V5 6/9] iommufd: pfn reader for file mappings Steve Sistare
@ 2024-10-22 21:00 ` Steve Sistare
2024-10-23 7:45 ` Tian, Kevin
2024-10-22 21:00 ` [PATCH V5 8/9] iommufd: file mappings for mdev Steve Sistare
2024-10-22 21:00 ` [PATCH V5 9/9] iommufd: map file selftest Steve Sistare
8 siblings, 1 reply; 33+ messages in thread
From: Steve Sistare @ 2024-10-22 21:00 UTC (permalink / raw)
To: iommu; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen, Steve Sistare
Define the IOMMU_IOAS_MAP_FILE ioctl interface, which allows a user to
register memory by passing a memfd plus offset and length. Implement it
using the memfd_pin_folios KAPI.
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/iommufd/io_pagetable.c | 36 ++++++++++++++++++++++++++++++-
drivers/iommu/iommufd/io_pagetable.h | 2 ++
drivers/iommu/iommufd/ioas.c | 38 +++++++++++++++++++++++++++++++++
drivers/iommu/iommufd/iommufd_private.h | 5 +++++
drivers/iommu/iommufd/main.c | 2 ++
drivers/iommu/iommufd/pages.c | 23 ++++++++++++++++++++
include/uapi/linux/iommufd.h | 25 ++++++++++++++++++++++
7 files changed, 130 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/iommufd/io_pagetable.c b/drivers/iommu/iommufd/io_pagetable.c
index 874ee9e..8a790e5 100644
--- a/drivers/iommu/iommufd/io_pagetable.c
+++ b/drivers/iommu/iommufd/io_pagetable.c
@@ -268,7 +268,14 @@ static int iopt_alloc_area_pages(struct io_pagetable *iopt,
/* Use the first entry to guess the ideal IOVA alignment */
elm = list_first_entry(pages_list, struct iopt_pages_list,
next);
- start = elm->start_byte + (uintptr_t)elm->pages->uptr;
+ switch (elm->pages->type) {
+ case IOPT_ADDRESS_USER:
+ start = elm->start_byte + (uintptr_t)elm->pages->uptr;
+ break;
+ case IOPT_ADDRESS_FILE:
+ start = elm->start_byte + elm->pages->start;
+ break;
+ }
rc = iopt_alloc_iova(iopt, dst_iova, start, length);
if (rc)
goto out_unlock;
@@ -446,6 +453,33 @@ int iopt_map_user_pages(struct iommufd_ctx *ictx, struct io_pagetable *iopt,
uptr - pages->uptr, iommu_prot, flags);
}
+/**
+ * iopt_map_file_pages() - Like iopt_map_user_pages, but map a file.
+ * @ictx: iommufd_ctx the iopt is part of
+ * @iopt: io_pagetable to act on
+ * @iova: If IOPT_ALLOC_IOVA is set this is unused on input and contains
+ * the chosen iova on output. Otherwise is the iova to map to on input
+ * @file: file to map
+ * @start: map file starting at this byte offset
+ * @length: Number of bytes to map
+ * @iommu_prot: Combination of IOMMU_READ/WRITE/etc bits for the mapping
+ * @flags: IOPT_ALLOC_IOVA or zero
+ */
+int iopt_map_file_pages(struct iommufd_ctx *ictx, struct io_pagetable *iopt,
+ unsigned long *iova, struct file *file,
+ unsigned long start, unsigned long length,
+ int iommu_prot, unsigned int flags)
+{
+ struct iopt_pages *pages;
+
+ pages = iopt_alloc_file_pages(file, start, length,
+ iommu_prot & IOMMU_WRITE);
+ if (IS_ERR(pages))
+ return PTR_ERR(pages);
+ return iopt_map_common(ictx, iopt, pages, iova, length,
+ start - pages->start, iommu_prot, flags);
+}
+
struct iova_bitmap_fn_arg {
unsigned long flags;
struct io_pagetable *iopt;
diff --git a/drivers/iommu/iommufd/io_pagetable.h b/drivers/iommu/iommufd/io_pagetable.h
index 5ac4eed..9b40b22 100644
--- a/drivers/iommu/iommufd/io_pagetable.h
+++ b/drivers/iommu/iommufd/io_pagetable.h
@@ -220,6 +220,8 @@ struct iopt_pages {
struct iopt_pages *iopt_alloc_user_pages(void __user *uptr,
unsigned long length, bool writable);
+struct iopt_pages *iopt_alloc_file_pages(struct file *file, unsigned long start,
+ unsigned long length, bool writable);
void iopt_release_pages(struct kref *kref);
static inline void iopt_put_pages(struct iopt_pages *pages)
{
diff --git a/drivers/iommu/iommufd/ioas.c b/drivers/iommu/iommufd/ioas.c
index 2c4b2bb..3d31bcf 100644
--- a/drivers/iommu/iommufd/ioas.c
+++ b/drivers/iommu/iommufd/ioas.c
@@ -2,6 +2,7 @@
/*
* Copyright (c) 2021-2022, NVIDIA CORPORATION & AFFILIATES
*/
+#include <linux/file.h>
#include <linux/interval_tree.h>
#include <linux/iommu.h>
#include <linux/iommufd.h>
@@ -197,6 +198,43 @@ static int conv_iommu_prot(u32 map_flags)
return iommu_prot;
}
+int iommufd_ioas_map_file(struct iommufd_ucmd *ucmd)
+{
+ struct iommu_ioas_map_file *cmd = ucmd->cmd;
+ unsigned long iova = cmd->iova;
+ struct iommufd_ioas *ioas;
+ unsigned int flags = 0;
+ int rc;
+ struct file *file;
+
+ if (cmd->iova >= ULONG_MAX || cmd->length >= ULONG_MAX)
+ return -EOVERFLOW;
+
+ ioas = iommufd_get_ioas(ucmd->ictx, cmd->ioas_id);
+ if (IS_ERR(ioas))
+ return PTR_ERR(ioas);
+
+ if (!(cmd->flags & IOMMU_IOAS_MAP_FIXED_IOVA))
+ flags = IOPT_ALLOC_IOVA;
+
+ file = fget(cmd->fd);
+ if (!file)
+ return -EBADF;
+
+ rc = iopt_map_file_pages(ucmd->ictx, &ioas->iopt, &iova, file,
+ cmd->start, cmd->length,
+ conv_iommu_prot(cmd->flags), flags);
+ if (rc)
+ goto out_put;
+
+ cmd->iova = iova;
+ rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
+out_put:
+ iommufd_put_object(ucmd->ictx, &ioas->obj);
+ fput(file);
+ return rc;
+}
+
int iommufd_ioas_map(struct iommufd_ucmd *ucmd)
{
struct iommu_ioas_map *cmd = ucmd->cmd;
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index f1d865e..8f3c21a 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -69,6 +69,10 @@ int iopt_map_user_pages(struct iommufd_ctx *ictx, struct io_pagetable *iopt,
unsigned long *iova, void __user *uptr,
unsigned long length, int iommu_prot,
unsigned int flags);
+int iopt_map_file_pages(struct iommufd_ctx *ictx, struct io_pagetable *iopt,
+ unsigned long *iova, struct file *file,
+ unsigned long start, unsigned long length,
+ int iommu_prot, unsigned int flags);
int iopt_map_pages(struct io_pagetable *iopt, struct list_head *pages_list,
unsigned long length, unsigned long *dst_iova,
int iommu_prot, unsigned int flags);
@@ -276,6 +280,7 @@ static inline struct iommufd_ioas *iommufd_get_ioas(struct iommufd_ctx *ictx,
int iommufd_ioas_iova_ranges(struct iommufd_ucmd *ucmd);
int iommufd_ioas_allow_iovas(struct iommufd_ucmd *ucmd);
int iommufd_ioas_map(struct iommufd_ucmd *ucmd);
+int iommufd_ioas_map_file(struct iommufd_ucmd *ucmd);
int iommufd_ioas_copy(struct iommufd_ucmd *ucmd);
int iommufd_ioas_unmap(struct iommufd_ucmd *ucmd);
int iommufd_ioas_option(struct iommufd_ucmd *ucmd);
diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index b5f5d27..826a2b2 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -378,6 +378,8 @@ struct iommufd_ioctl_op {
struct iommu_ioas_iova_ranges, out_iova_alignment),
IOCTL_OP(IOMMU_IOAS_MAP, iommufd_ioas_map, struct iommu_ioas_map,
iova),
+ IOCTL_OP(IOMMU_IOAS_MAP_FILE, iommufd_ioas_map_file,
+ struct iommu_ioas_map_file, iova),
IOCTL_OP(IOMMU_IOAS_UNMAP, iommufd_ioas_unmap, struct iommu_ioas_unmap,
length),
IOCTL_OP(IOMMU_OPTION, iommufd_option, struct iommu_option,
diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
index 59d40c1..1b8c3bf 100644
--- a/drivers/iommu/iommufd/pages.c
+++ b/drivers/iommu/iommufd/pages.c
@@ -45,6 +45,7 @@
* last_iova + 1 can overflow. An iopt_pages index will always be much less than
* ULONG_MAX so last_index + 1 cannot overflow.
*/
+#include <linux/file.h>
#include <linux/highmem.h>
#include <linux/iommu.h>
#include <linux/iommufd.h>
@@ -1340,6 +1341,26 @@ struct iopt_pages *iopt_alloc_user_pages(void __user *uptr,
return pages;
}
+struct iopt_pages *iopt_alloc_file_pages(struct file *file, unsigned long start,
+ unsigned long length, bool writable)
+
+{
+ struct iopt_pages *pages;
+ unsigned long start_down = ALIGN_DOWN(start, PAGE_SIZE);
+ unsigned long end;
+
+ if (length && check_add_overflow(start, length - 1, &end))
+ return ERR_PTR(-EOVERFLOW);
+
+ pages = iopt_alloc_pages(length, start - start_down, writable);
+ if (IS_ERR(pages))
+ return pages;
+ pages->file = get_file(file);
+ pages->start = start_down;
+ pages->type = IOPT_ADDRESS_FILE;
+ return pages;
+}
+
void iopt_release_pages(struct kref *kref)
{
struct iopt_pages *pages = container_of(kref, struct iopt_pages, kref);
@@ -1352,6 +1373,8 @@ void iopt_release_pages(struct kref *kref)
mutex_destroy(&pages->mutex);
put_task_struct(pages->source_task);
free_uid(pages->source_user);
+ if (pages->type == IOPT_ADDRESS_FILE)
+ fput(pages->file);
kfree(pages);
}
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index 72010f7..a8f0d30 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -51,6 +51,7 @@ enum {
IOMMUFD_CMD_HWPT_GET_DIRTY_BITMAP = 0x8c,
IOMMUFD_CMD_HWPT_INVALIDATE = 0x8d,
IOMMUFD_CMD_FAULT_QUEUE_ALLOC = 0x8e,
+ IOMMUFD_CMD_IOAS_MAP_FILE = 0x8f,
};
/**
@@ -214,6 +215,30 @@ struct iommu_ioas_map {
#define IOMMU_IOAS_MAP _IO(IOMMUFD_TYPE, IOMMUFD_CMD_IOAS_MAP)
/**
+ * struct iommu_ioas_map_file - ioctl(IOMMU_IOAS_MAP_FILE)
+ * @size: sizeof(struct iommu_ioas_map_file)
+ * @fd: the memfd to map
+ * @start: byte offset from start of file to map from
+ * @flags: same as for iommu_ioas_map
+ * @ios_id: ditto
+ * @length: ditto
+ * @iova: ditto
+ *
+ * Set an IOVA mapping from a memfd file. All other arguments and semantics
+ * match those of IOMMU_IOAS_MAP.
+ */
+struct iommu_ioas_map_file {
+ __u32 size;
+ __u32 flags;
+ __u32 ioas_id;
+ __s32 fd;
+ __aligned_u64 start;
+ __aligned_u64 length;
+ __aligned_u64 iova;
+};
+#define IOMMU_IOAS_MAP_FILE _IO(IOMMUFD_TYPE, IOMMUFD_CMD_IOAS_MAP_FILE)
+
+/**
* struct iommu_ioas_copy - ioctl(IOMMU_IOAS_COPY)
* @size: sizeof(struct iommu_ioas_copy)
* @flags: Combination of enum iommufd_ioas_map_flags
--
1.8.3.1
^ permalink raw reply related [flat|nested] 33+ messages in thread* RE: [PATCH V5 7/9] iommufd: IOMMU_IOAS_MAP_FILE
2024-10-22 21:00 ` [PATCH V5 7/9] iommufd: IOMMU_IOAS_MAP_FILE Steve Sistare
@ 2024-10-23 7:45 ` Tian, Kevin
2024-10-23 13:22 ` Steven Sistare
0 siblings, 1 reply; 33+ messages in thread
From: Tian, Kevin @ 2024-10-23 7:45 UTC (permalink / raw)
To: Steve Sistare, iommu@lists.linux.dev; +Cc: Jason Gunthorpe, Nicolin Chen
> From: Steve Sistare <steven.sistare@oracle.com>
> Sent: Wednesday, October 23, 2024 5:01 AM
>
> +int iommufd_ioas_map_file(struct iommufd_ucmd *ucmd)
> +{
> + struct iommu_ioas_map_file *cmd = ucmd->cmd;
> + unsigned long iova = cmd->iova;
> + struct iommufd_ioas *ioas;
> + unsigned int flags = 0;
> + int rc;
> + struct file *file;
> +
> + if (cmd->iova >= ULONG_MAX || cmd->length >= ULONG_MAX)
> + return -EOVERFLOW;
> +
> + ioas = iommufd_get_ioas(ucmd->ictx, cmd->ioas_id);
> + if (IS_ERR(ioas))
> + return PTR_ERR(ioas);
> +
> + if (!(cmd->flags & IOMMU_IOAS_MAP_FIXED_IOVA))
> + flags = IOPT_ALLOC_IOVA;
any reason skipping other checks in iommufd_ioas_map()?
if ((cmd->flags &
~(IOMMU_IOAS_MAP_FIXED_IOVA | IOMMU_IOAS_MAP_WRITEABLE |
IOMMU_IOAS_MAP_READABLE)) ||
cmd->__reserved)
return -EOPNOTSUPP;
if (!(cmd->flags &
(IOMMU_IOAS_MAP_WRITEABLE | IOMMU_IOAS_MAP_READABLE)))
return -EINVAL;
>
> /**
> + * struct iommu_ioas_map_file - ioctl(IOMMU_IOAS_MAP_FILE)
> + * @size: sizeof(struct iommu_ioas_map_file)
> + * @fd: the memfd to map
> + * @start: byte offset from start of file to map from
> + * @flags: same as for iommu_ioas_map
> + * @ios_id: ditto
s/ios_id/ioas_id/
> + * @length: ditto
> + * @iova: ditto
prefer to duplicating "same as for iommu_ioas_map"
also adjust above lines in the same order as following fields.
> + *
> + * Set an IOVA mapping from a memfd file. All other arguments and
> semantics
> + * match those of IOMMU_IOAS_MAP.
> + */
> +struct iommu_ioas_map_file {
> + __u32 size;
> + __u32 flags;
> + __u32 ioas_id;
> + __s32 fd;
> + __aligned_u64 start;
> + __aligned_u64 length;
> + __aligned_u64 iova;
> +};
> +#define IOMMU_IOAS_MAP_FILE _IO(IOMMUFD_TYPE,
> IOMMUFD_CMD_IOAS_MAP_FILE)
> +
> +/**
> * struct iommu_ioas_copy - ioctl(IOMMU_IOAS_COPY)
> * @size: sizeof(struct iommu_ioas_copy)
> * @flags: Combination of enum iommufd_ioas_map_flags
> --
> 1.8.3.1
^ permalink raw reply [flat|nested] 33+ messages in thread* Re: [PATCH V5 7/9] iommufd: IOMMU_IOAS_MAP_FILE
2024-10-23 7:45 ` Tian, Kevin
@ 2024-10-23 13:22 ` Steven Sistare
0 siblings, 0 replies; 33+ messages in thread
From: Steven Sistare @ 2024-10-23 13:22 UTC (permalink / raw)
To: Tian, Kevin, iommu@lists.linux.dev; +Cc: Jason Gunthorpe, Nicolin Chen
On 10/23/2024 3:45 AM, Tian, Kevin wrote:
>> From: Steve Sistare <steven.sistare@oracle.com>
>> Sent: Wednesday, October 23, 2024 5:01 AM
>>
>> +int iommufd_ioas_map_file(struct iommufd_ucmd *ucmd)
>> +{
>> + struct iommu_ioas_map_file *cmd = ucmd->cmd;
>> + unsigned long iova = cmd->iova;
>> + struct iommufd_ioas *ioas;
>> + unsigned int flags = 0;
>> + int rc;
>> + struct file *file;
>> +
>> + if (cmd->iova >= ULONG_MAX || cmd->length >= ULONG_MAX)
>> + return -EOVERFLOW;
>> +
>> + ioas = iommufd_get_ioas(ucmd->ictx, cmd->ioas_id);
>> + if (IS_ERR(ioas))
>> + return PTR_ERR(ioas);
>> +
>> + if (!(cmd->flags & IOMMU_IOAS_MAP_FIXED_IOVA))
>> + flags = IOPT_ALLOC_IOVA;
>
> any reason skipping other checks in iommufd_ioas_map()?
No reason, just an oversight. I will add them, thanks.
> if ((cmd->flags &
> ~(IOMMU_IOAS_MAP_FIXED_IOVA | IOMMU_IOAS_MAP_WRITEABLE |
> IOMMU_IOAS_MAP_READABLE)) ||
> cmd->__reserved)
> return -EOPNOTSUPP;
>
> if (!(cmd->flags &
> (IOMMU_IOAS_MAP_WRITEABLE | IOMMU_IOAS_MAP_READABLE)))
> return -EINVAL;
>
>>
>> /**
>> + * struct iommu_ioas_map_file - ioctl(IOMMU_IOAS_MAP_FILE)
>> + * @size: sizeof(struct iommu_ioas_map_file)
>> + * @fd: the memfd to map
>> + * @start: byte offset from start of file to map from
>> + * @flags: same as for iommu_ioas_map
>> + * @ios_id: ditto
>
> s/ios_id/ioas_id/
>
>> + * @length: ditto
>> + * @iova: ditto
>
> prefer to duplicating "same as for iommu_ioas_map"
>
> also adjust above lines in the same order as following fields.
Will edit as suggested.
- Steve
>> + *
>> + * Set an IOVA mapping from a memfd file. All other arguments and
>> semantics
>> + * match those of IOMMU_IOAS_MAP.
>> + */
>> +struct iommu_ioas_map_file {
>> + __u32 size;
>> + __u32 flags;
>> + __u32 ioas_id;
>> + __s32 fd;
>> + __aligned_u64 start;
>> + __aligned_u64 length;
>> + __aligned_u64 iova;
>> +};
>> +#define IOMMU_IOAS_MAP_FILE _IO(IOMMUFD_TYPE,
>> IOMMUFD_CMD_IOAS_MAP_FILE)
>> +
>> +/**
>> * struct iommu_ioas_copy - ioctl(IOMMU_IOAS_COPY)
>> * @size: sizeof(struct iommu_ioas_copy)
>> * @flags: Combination of enum iommufd_ioas_map_flags
>> --
>> 1.8.3.1
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* [PATCH V5 8/9] iommufd: file mappings for mdev
2024-10-22 21:00 [PATCH V5 0/9] iommu_ioas_map_file Steve Sistare
` (6 preceding siblings ...)
2024-10-22 21:00 ` [PATCH V5 7/9] iommufd: IOMMU_IOAS_MAP_FILE Steve Sistare
@ 2024-10-22 21:00 ` Steve Sistare
2024-10-23 8:01 ` Tian, Kevin
2024-10-22 21:00 ` [PATCH V5 9/9] iommufd: map file selftest Steve Sistare
8 siblings, 1 reply; 33+ messages in thread
From: Steve Sistare @ 2024-10-22 21:00 UTC (permalink / raw)
To: iommu; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen, Steve Sistare
Support file mappings for mediated devices, aka mdevs. Access is initiated
by the vfio_pin_pages and vfio_dma_rw kernel interfaces.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/iommufd/pages.c | 27 ++++++++++++++++++++-------
1 file changed, 20 insertions(+), 7 deletions(-)
diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
index 1b8c3bf..daf8fde 100644
--- a/drivers/iommu/iommufd/pages.c
+++ b/drivers/iommu/iommufd/pages.c
@@ -1814,11 +1814,11 @@ static int iopt_pages_fill_from_domain(struct iopt_pages *pages,
return 0;
}
-static int iopt_pages_fill_from_mm(struct iopt_pages *pages,
- struct pfn_reader_user *user,
- unsigned long start_index,
- unsigned long last_index,
- struct page **out_pages)
+static int iopt_pages_fill(struct iopt_pages *pages,
+ struct pfn_reader_user *user,
+ unsigned long start_index,
+ unsigned long last_index,
+ struct page **out_pages)
{
unsigned long cur_index = start_index;
int rc;
@@ -1892,8 +1892,8 @@ int iopt_pages_fill_xarray(struct iopt_pages *pages, unsigned long start_index,
/* hole */
cur_pages = out_pages + (span.start_hole - start_index);
- rc = iopt_pages_fill_from_mm(pages, &user, span.start_hole,
- span.last_hole, cur_pages);
+ rc = iopt_pages_fill(pages, &user, span.start_hole,
+ span.last_hole, cur_pages);
if (rc)
goto out_clean_xa;
rc = pages_to_xarray(&pages->pinned_pfns, span.start_hole,
@@ -1973,6 +1973,10 @@ static int iopt_pages_rw_page(struct iopt_pages *pages, unsigned long index,
struct page *page = NULL;
int rc;
+ if (IS_ENABLED(CONFIG_IOMMUFD_TEST) &&
+ WARN_ON(pages->type != IOPT_ADDRESS_USER))
+ return -EINVAL;
+
if (!mmget_not_zero(pages->source_mm))
return iopt_pages_rw_slow(pages, index, index, offset, data,
length, flags);
@@ -2028,6 +2032,15 @@ int iopt_pages_rw_access(struct iopt_pages *pages, unsigned long start_byte,
if ((flags & IOMMUFD_ACCESS_RW_WRITE) && !pages->writable)
return -EPERM;
+ if (pages->type == IOPT_ADDRESS_FILE)
+ return iopt_pages_rw_slow(pages, start_index, last_index,
+ start_byte % PAGE_SIZE, data, length,
+ flags);
+
+ if (IS_ENABLED(CONFIG_IOMMUFD_TEST) &&
+ WARN_ON(pages->type != IOPT_ADDRESS_USER))
+ return -EINVAL;
+
if (!(flags & IOMMUFD_ACCESS_RW_KTHREAD) && change_mm) {
if (start_index == last_index)
return iopt_pages_rw_page(pages, start_index,
--
1.8.3.1
^ permalink raw reply related [flat|nested] 33+ messages in thread* RE: [PATCH V5 8/9] iommufd: file mappings for mdev
2024-10-22 21:00 ` [PATCH V5 8/9] iommufd: file mappings for mdev Steve Sistare
@ 2024-10-23 8:01 ` Tian, Kevin
2024-10-23 13:16 ` Jason Gunthorpe
0 siblings, 1 reply; 33+ messages in thread
From: Tian, Kevin @ 2024-10-23 8:01 UTC (permalink / raw)
To: Steve Sistare, iommu@lists.linux.dev; +Cc: Jason Gunthorpe, Nicolin Chen
> From: Steve Sistare <steven.sistare@oracle.com>
> Sent: Wednesday, October 23, 2024 5:01 AM
>
> @@ -1973,6 +1973,10 @@ static int iopt_pages_rw_page(struct iopt_pages
> *pages, unsigned long index,
> struct page *page = NULL;
> int rc;
>
> + if (IS_ENABLED(CONFIG_IOMMUFD_TEST) &&
> + WARN_ON(pages->type != IOPT_ADDRESS_USER))
> + return -EINVAL;
> +
is this check necessary? if yes why is it only enabled in test?
also it looks duplicated with the one in iopt_pages_rw_access().
> if (!mmget_not_zero(pages->source_mm))
> return iopt_pages_rw_slow(pages, index, index, offset, data,
> length, flags);
> @@ -2028,6 +2032,15 @@ int iopt_pages_rw_access(struct iopt_pages
> *pages, unsigned long start_byte,
> if ((flags & IOMMUFD_ACCESS_RW_WRITE) && !pages->writable)
> return -EPERM;
>
> + if (pages->type == IOPT_ADDRESS_FILE)
> + return iopt_pages_rw_slow(pages, start_index, last_index,
> + start_byte % PAGE_SIZE, data,
> length,
> + flags);
there are 3 invocations of iopt_pages_rw_slow() now, each counting
3 lines.
probably creating a goto label to reduce the lines.
> +
> + if (IS_ENABLED(CONFIG_IOMMUFD_TEST) &&
> + WARN_ON(pages->type != IOPT_ADDRESS_USER))
> + return -EINVAL;
> +
> if (!(flags & IOMMUFD_ACCESS_RW_KTHREAD) && change_mm) {
> if (start_index == last_index)
> return iopt_pages_rw_page(pages, start_index,
> --
> 1.8.3.1
^ permalink raw reply [flat|nested] 33+ messages in thread* Re: [PATCH V5 8/9] iommufd: file mappings for mdev
2024-10-23 8:01 ` Tian, Kevin
@ 2024-10-23 13:16 ` Jason Gunthorpe
0 siblings, 0 replies; 33+ messages in thread
From: Jason Gunthorpe @ 2024-10-23 13:16 UTC (permalink / raw)
To: Tian, Kevin; +Cc: Steve Sistare, iommu@lists.linux.dev, Nicolin Chen
On Wed, Oct 23, 2024 at 08:01:56AM +0000, Tian, Kevin wrote:
> > From: Steve Sistare <steven.sistare@oracle.com>
> > Sent: Wednesday, October 23, 2024 5:01 AM
> >
> > @@ -1973,6 +1973,10 @@ static int iopt_pages_rw_page(struct iopt_pages
> > *pages, unsigned long index,
> > struct page *page = NULL;
> > int rc;
> >
> > + if (IS_ENABLED(CONFIG_IOMMUFD_TEST) &&
> > + WARN_ON(pages->type != IOPT_ADDRESS_USER))
> > + return -EINVAL;
> > +
>
> is this check necessary? if yes why is it only enabled in test?
This is the pattern for correctness assertions on performance paths
Jason
^ permalink raw reply [flat|nested] 33+ messages in thread
* [PATCH V5 9/9] iommufd: map file selftest
2024-10-22 21:00 [PATCH V5 0/9] iommu_ioas_map_file Steve Sistare
` (7 preceding siblings ...)
2024-10-22 21:00 ` [PATCH V5 8/9] iommufd: file mappings for mdev Steve Sistare
@ 2024-10-22 21:00 ` Steve Sistare
2024-10-23 13:52 ` Steven Sistare
2024-10-23 20:15 ` Nicolin Chen
8 siblings, 2 replies; 33+ messages in thread
From: Steve Sistare @ 2024-10-22 21:00 UTC (permalink / raw)
To: iommu; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen, Steve Sistare
Add test cases to exercise IOMMU_IOAS_MAP_FILE.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
tools/testing/selftests/iommu/iommufd.c | 136 ++++++++++++++++++++---
tools/testing/selftests/iommu/iommufd_fail_nth.c | 39 +++++++
tools/testing/selftests/iommu/iommufd_utils.h | 57 ++++++++++
3 files changed, 217 insertions(+), 15 deletions(-)
diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c
index 4927b9a..3d62951 100644
--- a/tools/testing/selftests/iommu/iommufd.c
+++ b/tools/testing/selftests/iommu/iommufd.c
@@ -3,6 +3,7 @@
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/eventfd.h>
+#include <asm/unistd.h>
#define __EXPORTED_HEADERS__
#include <linux/vfio.h>
@@ -49,6 +50,9 @@ static __attribute__((constructor)) void setup_sizes(void)
vrc = mmap(buffer, BUFFER_SIZE, PROT_READ | PROT_WRITE,
MAP_SHARED | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
assert(vrc == buffer);
+
+ mfd_buffer = memfd_mmap(BUFFER_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED,
+ &mfd);
}
FIXTURE(iommufd)
@@ -128,6 +132,7 @@ static __attribute__((constructor)) void setup_sizes(void)
TEST_LENGTH(iommu_ioas_unmap, IOMMU_IOAS_UNMAP, length);
TEST_LENGTH(iommu_option, IOMMU_OPTION, val64);
TEST_LENGTH(iommu_vfio_ioas, IOMMU_VFIO_IOAS, __reserved);
+ TEST_LENGTH(iommu_ioas_map_file, IOMMU_IOAS_MAP_FILE, iova);
#undef TEST_LENGTH
}
@@ -1372,6 +1377,7 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
{
unsigned int mock_domains;
bool hugepages;
+ bool file;
};
FIXTURE_SETUP(iommufd_mock_domain)
@@ -1410,26 +1416,45 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
{
.mock_domains = 1,
.hugepages = false,
+ .file = false,
};
FIXTURE_VARIANT_ADD(iommufd_mock_domain, two_domains)
{
.mock_domains = 2,
.hugepages = false,
+ .file = false,
};
FIXTURE_VARIANT_ADD(iommufd_mock_domain, one_domain_hugepage)
{
.mock_domains = 1,
.hugepages = true,
+ .file = false,
};
FIXTURE_VARIANT_ADD(iommufd_mock_domain, two_domains_hugepage)
{
.mock_domains = 2,
.hugepages = true,
+ .file = false,
};
+FIXTURE_VARIANT_ADD(iommufd_mock_domain, one_domain_file)
+{
+ .mock_domains = 1,
+ .hugepages = false,
+ .file = true,
+};
+
+FIXTURE_VARIANT_ADD(iommufd_mock_domain, one_domain_file_hugepage)
+{
+ .mock_domains = 1,
+ .hugepages = true,
+ .file = true,
+};
+
+
/* Have the kernel check that the user pages made it to the iommu_domain */
#define check_mock_iova(_ptr, _iova, _length) \
({ \
@@ -1455,7 +1480,10 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
} \
})
-TEST_F(iommufd_mock_domain, basic)
+static void
+test_basic_mmap(struct __test_metadata *_metadata,
+ struct _test_data_iommufd_mock_domain *self,
+ const struct _fixture_variant_iommufd_mock_domain *variant)
{
size_t buf_size = self->mmap_buf_size;
uint8_t *buf;
@@ -1478,12 +1506,52 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
test_err_ioctl_ioas_map(EFAULT, buf, buf_size, &iova);
}
+static void
+test_basic_file(struct __test_metadata *_metadata,
+ struct _test_data_iommufd_mock_domain *self,
+ const struct _fixture_variant_iommufd_mock_domain *variant)
+{
+ size_t buf_size = self->mmap_buf_size;
+ uint8_t *buf;
+ __u64 iova;
+ int mfd_tmp;
+ int prot = PROT_READ | PROT_WRITE;
+
+ /* Simple one page map */
+ test_ioctl_ioas_map_file(mfd, 0, PAGE_SIZE, &iova);
+ check_mock_iova(mfd_buffer, iova, PAGE_SIZE);
+
+ buf = memfd_mmap(buf_size, prot, MAP_SHARED, &mfd_tmp);
+ ASSERT_NE(MAP_FAILED, buf);
+
+ /* EFAULT half way through mapping */
+ ASSERT_EQ(0, munmap(buf + buf_size / 2, buf_size / 2));
+ test_err_ioctl_ioas_map_file(EFAULT, 0, buf_size, &iova);
+
+ /* EFAULT on first page */
+ ASSERT_EQ(0, munmap(buf, buf_size / 2));
+ test_err_ioctl_ioas_map_file(EFAULT, 0, buf_size, &iova);
+
+ close(mfd_tmp);
+}
+
+TEST_F(iommufd_mock_domain, basic)
+{
+ if (variant->file)
+ test_basic_file(_metadata, self, variant);
+ else
+ test_basic_mmap(_metadata, self, variant);
+}
+
TEST_F(iommufd_mock_domain, ro_unshare)
{
uint8_t *buf;
__u64 iova;
int fd;
+ if (variant->file)
+ SKIP(return, "redundant test");
+
fd = open("/proc/self/exe", O_RDONLY);
ASSERT_NE(-1, fd);
@@ -1513,9 +1581,13 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
unsigned int start;
unsigned int end;
uint8_t *buf;
+ int prot = PROT_READ | PROT_WRITE;
+ int mfd;
- buf = mmap(0, buf_size, PROT_READ | PROT_WRITE, self->mmap_flags, -1,
- 0);
+ if (variant->file)
+ buf = memfd_mmap(buf_size, prot, MAP_SHARED, &mfd);
+ else
+ buf = mmap(0, buf_size, prot, self->mmap_flags, -1, 0);
ASSERT_NE(MAP_FAILED, buf);
check_refs(buf, buf_size, 0);
@@ -1532,7 +1604,12 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
size_t length = end - start;
__u64 iova;
- test_ioctl_ioas_map(buf + start, length, &iova);
+ if (variant->file) {
+ test_ioctl_ioas_map_file(mfd, start, length,
+ &iova);
+ } else {
+ test_ioctl_ioas_map(buf + start, length, &iova);
+ }
check_mock_iova(buf + start, iova, length);
check_refs(buf + start / PAGE_SIZE * PAGE_SIZE,
end / PAGE_SIZE * PAGE_SIZE -
@@ -1544,6 +1621,8 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
}
check_refs(buf, buf_size, 0);
ASSERT_EQ(0, munmap(buf, buf_size));
+ if (variant->file)
+ close(mfd);
}
TEST_F(iommufd_mock_domain, all_aligns_copy)
@@ -1554,9 +1633,13 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
unsigned int start;
unsigned int end;
uint8_t *buf;
+ int prot = PROT_READ | PROT_WRITE;
+ int mfd;
- buf = mmap(0, buf_size, PROT_READ | PROT_WRITE, self->mmap_flags, -1,
- 0);
+ if (variant->file)
+ buf = memfd_mmap(buf_size, prot, MAP_SHARED, &mfd);
+ else
+ buf = mmap(0, buf_size, prot, self->mmap_flags, -1, 0);
ASSERT_NE(MAP_FAILED, buf);
check_refs(buf, buf_size, 0);
@@ -1575,7 +1658,12 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
uint32_t mock_stdev_id;
__u64 iova;
- test_ioctl_ioas_map(buf + start, length, &iova);
+ if (variant->file) {
+ test_ioctl_ioas_map_file(mfd, start, length,
+ &iova);
+ } else {
+ test_ioctl_ioas_map(buf + start, length, &iova);
+ }
/* Add and destroy a domain while the area exists */
old_id = self->hwpt_ids[1];
@@ -1596,15 +1684,18 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
}
check_refs(buf, buf_size, 0);
ASSERT_EQ(0, munmap(buf, buf_size));
+ if (variant->file)
+ close(mfd);
}
TEST_F(iommufd_mock_domain, user_copy)
{
+ void *buf = variant->file ? mfd_buffer : buffer;
struct iommu_test_cmd access_cmd = {
.size = sizeof(access_cmd),
.op = IOMMU_TEST_OP_ACCESS_PAGES,
.access_pages = { .length = BUFFER_SIZE,
- .uptr = (uintptr_t)buffer },
+ .uptr = (uintptr_t)buf },
};
struct iommu_ioas_copy copy_cmd = {
.size = sizeof(copy_cmd),
@@ -1623,9 +1714,13 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
/* Pin the pages in an IOAS with no domains then copy to an IOAS with domains */
test_ioctl_ioas_alloc(&ioas_id);
- test_ioctl_ioas_map_id(ioas_id, buffer, BUFFER_SIZE,
- ©_cmd.src_iova);
-
+ if (variant->file) {
+ test_ioctl_ioas_map_id_file(ioas_id, mfd, 0, BUFFER_SIZE,
+ ©_cmd.src_iova);
+ } else {
+ test_ioctl_ioas_map_id(ioas_id, buf, BUFFER_SIZE,
+ ©_cmd.src_iova);
+ }
test_cmd_create_access(ioas_id, &access_cmd.id,
MOCK_FLAGS_ACCESS_CREATE_NEEDS_PIN_PAGES);
@@ -1635,12 +1730,17 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
&access_cmd));
copy_cmd.src_ioas_id = ioas_id;
ASSERT_EQ(0, ioctl(self->fd, IOMMU_IOAS_COPY, ©_cmd));
- check_mock_iova(buffer, MOCK_APERTURE_START, BUFFER_SIZE);
+ check_mock_iova(buf, MOCK_APERTURE_START, BUFFER_SIZE);
/* Now replace the ioas with a new one */
test_ioctl_ioas_alloc(&new_ioas_id);
- test_ioctl_ioas_map_id(new_ioas_id, buffer, BUFFER_SIZE,
- ©_cmd.src_iova);
+ if (variant->file) {
+ test_ioctl_ioas_map_id_file(new_ioas_id, mfd, 0, BUFFER_SIZE,
+ ©_cmd.src_iova);
+ } else {
+ test_ioctl_ioas_map_id(new_ioas_id, buf, BUFFER_SIZE,
+ ©_cmd.src_iova);
+ }
test_cmd_access_replace_ioas(access_cmd.id, new_ioas_id);
/* Destroy the old ioas and cleanup copied mapping */
@@ -1654,7 +1754,7 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
&access_cmd));
copy_cmd.src_ioas_id = new_ioas_id;
ASSERT_EQ(0, ioctl(self->fd, IOMMU_IOAS_COPY, ©_cmd));
- check_mock_iova(buffer, MOCK_APERTURE_START, BUFFER_SIZE);
+ check_mock_iova(buf, MOCK_APERTURE_START, BUFFER_SIZE);
test_cmd_destroy_access_pages(
access_cmd.id, access_cmd.access_pages.out_access_pages_id);
@@ -1667,6 +1767,9 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
{
uint32_t ioas_id;
+ if (variant->file)
+ SKIP(return, "redundant test");
+
test_ioctl_ioas_alloc(&ioas_id);
test_cmd_mock_domain_replace(self->stdev_ids[0], ioas_id);
@@ -1697,6 +1800,9 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
{
int i;
+ if (variant->file)
+ SKIP(return, "redundant test");
+
for (i = 0; i != variant->mock_domains; i++) {
uint32_t hwpt_id[2];
uint32_t stddev_id;
diff --git a/tools/testing/selftests/iommu/iommufd_fail_nth.c b/tools/testing/selftests/iommu/iommufd_fail_nth.c
index c5d5e69..111d43e 100644
--- a/tools/testing/selftests/iommu/iommufd_fail_nth.c
+++ b/tools/testing/selftests/iommu/iommufd_fail_nth.c
@@ -47,6 +47,9 @@ static __attribute__((constructor)) void setup_buffer(void)
buffer = mmap(0, BUFFER_SIZE, PROT_READ | PROT_WRITE,
MAP_SHARED | MAP_ANONYMOUS, -1, 0);
+
+ mfd_buffer = memfd_mmap(BUFFER_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED,
+ &mfd);
}
/*
@@ -331,6 +334,42 @@ void __fail_nth_enable(struct __test_metadata *_metadata,
return 0;
}
+/* iopt_area_fill_domains() and iopt_area_fill_domain() */
+TEST_FAIL_NTH(basic_fail_nth, map_file_domain)
+{
+ uint32_t ioas_id;
+ __u32 stdev_id;
+ __u32 hwpt_id;
+ __u64 iova;
+
+ self->fd = open("/dev/iommu", O_RDWR);
+ if (self->fd == -1)
+ return -1;
+
+ if (_test_ioctl_ioas_alloc(self->fd, &ioas_id))
+ return -1;
+
+ if (_test_ioctl_set_temp_memory_limit(self->fd, 32))
+ return -1;
+
+ fail_nth_enable();
+
+ if (_test_cmd_mock_domain(self->fd, ioas_id, &stdev_id, &hwpt_id, NULL))
+ return -1;
+
+ if (_test_ioctl_ioas_map_file(self->fd, ioas_id, mfd, 0, 262144, &iova,
+ IOMMU_IOAS_MAP_WRITEABLE |
+ IOMMU_IOAS_MAP_READABLE))
+ return -1;
+
+ if (_test_ioctl_destroy(self->fd, stdev_id))
+ return -1;
+
+ if (_test_cmd_mock_domain(self->fd, ioas_id, &stdev_id, &hwpt_id, NULL))
+ return -1;
+ return 0;
+}
+
TEST_FAIL_NTH(basic_fail_nth, map_two_domains)
{
uint32_t ioas_id;
diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h
index 40f6f14..c708dc1 100644
--- a/tools/testing/selftests/iommu/iommufd_utils.h
+++ b/tools/testing/selftests/iommu/iommufd_utils.h
@@ -40,12 +40,28 @@ static inline bool test_bit(unsigned int nr, unsigned long *addr)
static void *buffer;
static unsigned long BUFFER_SIZE;
+static void *mfd_buffer;
+static int mfd;
+
static unsigned long PAGE_SIZE;
#define sizeof_field(TYPE, MEMBER) sizeof((((TYPE *)0)->MEMBER))
#define offsetofend(TYPE, MEMBER) \
(offsetof(TYPE, MEMBER) + sizeof_field(TYPE, MEMBER))
+static inline void *memfd_mmap(size_t length, int prot, int flags, int *mfd_p)
+{
+ int mfd_flags = (flags & MAP_HUGETLB) ? MFD_HUGETLB : 0;
+ int mfd = memfd_create("buffer", mfd_flags);
+
+ if (mfd <= 0)
+ return MAP_FAILED;
+ if (ftruncate(mfd, length))
+ return MAP_FAILED;
+ *mfd_p = mfd;
+ return mmap(0, length, prot, flags, mfd, 0);
+}
+
/*
* Have the kernel check the refcount on pages. I don't know why a freshly
* mmap'd anon non-compound page starts out with a ref of 3
@@ -589,6 +605,47 @@ static int _test_ioctl_ioas_unmap(int fd, unsigned int ioas_id, uint64_t iova,
EXPECT_ERRNO(_errno, _test_ioctl_ioas_unmap(self->fd, self->ioas_id, \
iova, length, NULL))
+static int _test_ioctl_ioas_map_file(int fd, unsigned int ioas_id, int mfd,
+ size_t start, size_t length, __u64 *iova,
+ unsigned int flags)
+{
+ struct iommu_ioas_map_file cmd = {
+ .size = sizeof(cmd),
+ .flags = flags,
+ .ioas_id = ioas_id,
+ .fd = mfd,
+ .start = start,
+ .length = length,
+ };
+ int ret;
+
+ if (flags & IOMMU_IOAS_MAP_FIXED_IOVA)
+ cmd.iova = *iova;
+
+ ret = ioctl(fd, IOMMU_IOAS_MAP_FILE, &cmd);
+ *iova = cmd.iova;
+ return ret;
+}
+
+#define test_ioctl_ioas_map_file(mfd, start, length, iova_p) \
+ ASSERT_EQ(0, _test_ioctl_ioas_map_file(self->fd, self->ioas_id, mfd, \
+ start, length, iova_p, \
+ IOMMU_IOAS_MAP_WRITEABLE | \
+ IOMMU_IOAS_MAP_READABLE))
+
+#define test_err_ioctl_ioas_map_file(_errno, start, length, iova_p) \
+ EXPECT_ERRNO(_errno, \
+ _test_ioctl_ioas_map_file(self->fd, self->ioas_id, mfd, \
+ start, length, iova_p, \
+ IOMMU_IOAS_MAP_WRITEABLE | \
+ IOMMU_IOAS_MAP_READABLE))
+
+#define test_ioctl_ioas_map_id_file(ioas_id, mfd, start, length, iova_p) \
+ ASSERT_EQ(0, _test_ioctl_ioas_map_file(self->fd, ioas_id, mfd, \
+ start, length, iova_p, \
+ IOMMU_IOAS_MAP_WRITEABLE | \
+ IOMMU_IOAS_MAP_READABLE))
+
static int _test_ioctl_set_temp_memory_limit(int fd, unsigned int limit)
{
struct iommu_test_cmd memlimit_cmd = {
--
1.8.3.1
^ permalink raw reply related [flat|nested] 33+ messages in thread* Re: [PATCH V5 9/9] iommufd: map file selftest
2024-10-22 21:00 ` [PATCH V5 9/9] iommufd: map file selftest Steve Sistare
@ 2024-10-23 13:52 ` Steven Sistare
2024-10-23 20:15 ` Nicolin Chen
1 sibling, 0 replies; 33+ messages in thread
From: Steven Sistare @ 2024-10-23 13:52 UTC (permalink / raw)
To: iommu; +Cc: Jason Gunthorpe, Kevin Tian, Nicolin Chen
Who is willing to send an RB for this patch?
All new and old iommufd and iommufd_fail_nth tests pass.
- Steve
On 10/22/2024 5:00 PM, Steve Sistare wrote:
> Add test cases to exercise IOMMU_IOAS_MAP_FILE.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
> tools/testing/selftests/iommu/iommufd.c | 136 ++++++++++++++++++++---
> tools/testing/selftests/iommu/iommufd_fail_nth.c | 39 +++++++
> tools/testing/selftests/iommu/iommufd_utils.h | 57 ++++++++++
> 3 files changed, 217 insertions(+), 15 deletions(-)
>
> diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c
> index 4927b9a..3d62951 100644
> --- a/tools/testing/selftests/iommu/iommufd.c
> +++ b/tools/testing/selftests/iommu/iommufd.c
> @@ -3,6 +3,7 @@
> #include <stdlib.h>
> #include <sys/mman.h>
> #include <sys/eventfd.h>
> +#include <asm/unistd.h>
>
> #define __EXPORTED_HEADERS__
> #include <linux/vfio.h>
> @@ -49,6 +50,9 @@ static __attribute__((constructor)) void setup_sizes(void)
> vrc = mmap(buffer, BUFFER_SIZE, PROT_READ | PROT_WRITE,
> MAP_SHARED | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
> assert(vrc == buffer);
> +
> + mfd_buffer = memfd_mmap(BUFFER_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED,
> + &mfd);
> }
>
> FIXTURE(iommufd)
> @@ -128,6 +132,7 @@ static __attribute__((constructor)) void setup_sizes(void)
> TEST_LENGTH(iommu_ioas_unmap, IOMMU_IOAS_UNMAP, length);
> TEST_LENGTH(iommu_option, IOMMU_OPTION, val64);
> TEST_LENGTH(iommu_vfio_ioas, IOMMU_VFIO_IOAS, __reserved);
> + TEST_LENGTH(iommu_ioas_map_file, IOMMU_IOAS_MAP_FILE, iova);
> #undef TEST_LENGTH
> }
>
> @@ -1372,6 +1377,7 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
> {
> unsigned int mock_domains;
> bool hugepages;
> + bool file;
> };
>
> FIXTURE_SETUP(iommufd_mock_domain)
> @@ -1410,26 +1416,45 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
> {
> .mock_domains = 1,
> .hugepages = false,
> + .file = false,
> };
>
> FIXTURE_VARIANT_ADD(iommufd_mock_domain, two_domains)
> {
> .mock_domains = 2,
> .hugepages = false,
> + .file = false,
> };
>
> FIXTURE_VARIANT_ADD(iommufd_mock_domain, one_domain_hugepage)
> {
> .mock_domains = 1,
> .hugepages = true,
> + .file = false,
> };
>
> FIXTURE_VARIANT_ADD(iommufd_mock_domain, two_domains_hugepage)
> {
> .mock_domains = 2,
> .hugepages = true,
> + .file = false,
> };
>
> +FIXTURE_VARIANT_ADD(iommufd_mock_domain, one_domain_file)
> +{
> + .mock_domains = 1,
> + .hugepages = false,
> + .file = true,
> +};
> +
> +FIXTURE_VARIANT_ADD(iommufd_mock_domain, one_domain_file_hugepage)
> +{
> + .mock_domains = 1,
> + .hugepages = true,
> + .file = true,
> +};
> +
> +
> /* Have the kernel check that the user pages made it to the iommu_domain */
> #define check_mock_iova(_ptr, _iova, _length) \
> ({ \
> @@ -1455,7 +1480,10 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
> } \
> })
>
> -TEST_F(iommufd_mock_domain, basic)
> +static void
> +test_basic_mmap(struct __test_metadata *_metadata,
> + struct _test_data_iommufd_mock_domain *self,
> + const struct _fixture_variant_iommufd_mock_domain *variant)
> {
> size_t buf_size = self->mmap_buf_size;
> uint8_t *buf;
> @@ -1478,12 +1506,52 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
> test_err_ioctl_ioas_map(EFAULT, buf, buf_size, &iova);
> }
>
> +static void
> +test_basic_file(struct __test_metadata *_metadata,
> + struct _test_data_iommufd_mock_domain *self,
> + const struct _fixture_variant_iommufd_mock_domain *variant)
> +{
> + size_t buf_size = self->mmap_buf_size;
> + uint8_t *buf;
> + __u64 iova;
> + int mfd_tmp;
> + int prot = PROT_READ | PROT_WRITE;
> +
> + /* Simple one page map */
> + test_ioctl_ioas_map_file(mfd, 0, PAGE_SIZE, &iova);
> + check_mock_iova(mfd_buffer, iova, PAGE_SIZE);
> +
> + buf = memfd_mmap(buf_size, prot, MAP_SHARED, &mfd_tmp);
> + ASSERT_NE(MAP_FAILED, buf);
> +
> + /* EFAULT half way through mapping */
> + ASSERT_EQ(0, munmap(buf + buf_size / 2, buf_size / 2));
> + test_err_ioctl_ioas_map_file(EFAULT, 0, buf_size, &iova);
> +
> + /* EFAULT on first page */
> + ASSERT_EQ(0, munmap(buf, buf_size / 2));
> + test_err_ioctl_ioas_map_file(EFAULT, 0, buf_size, &iova);
> +
> + close(mfd_tmp);
> +}
> +
> +TEST_F(iommufd_mock_domain, basic)
> +{
> + if (variant->file)
> + test_basic_file(_metadata, self, variant);
> + else
> + test_basic_mmap(_metadata, self, variant);
> +}
> +
> TEST_F(iommufd_mock_domain, ro_unshare)
> {
> uint8_t *buf;
> __u64 iova;
> int fd;
>
> + if (variant->file)
> + SKIP(return, "redundant test");
> +
> fd = open("/proc/self/exe", O_RDONLY);
> ASSERT_NE(-1, fd);
>
> @@ -1513,9 +1581,13 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
> unsigned int start;
> unsigned int end;
> uint8_t *buf;
> + int prot = PROT_READ | PROT_WRITE;
> + int mfd;
>
> - buf = mmap(0, buf_size, PROT_READ | PROT_WRITE, self->mmap_flags, -1,
> - 0);
> + if (variant->file)
> + buf = memfd_mmap(buf_size, prot, MAP_SHARED, &mfd);
> + else
> + buf = mmap(0, buf_size, prot, self->mmap_flags, -1, 0);
> ASSERT_NE(MAP_FAILED, buf);
> check_refs(buf, buf_size, 0);
>
> @@ -1532,7 +1604,12 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
> size_t length = end - start;
> __u64 iova;
>
> - test_ioctl_ioas_map(buf + start, length, &iova);
> + if (variant->file) {
> + test_ioctl_ioas_map_file(mfd, start, length,
> + &iova);
> + } else {
> + test_ioctl_ioas_map(buf + start, length, &iova);
> + }
> check_mock_iova(buf + start, iova, length);
> check_refs(buf + start / PAGE_SIZE * PAGE_SIZE,
> end / PAGE_SIZE * PAGE_SIZE -
> @@ -1544,6 +1621,8 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
> }
> check_refs(buf, buf_size, 0);
> ASSERT_EQ(0, munmap(buf, buf_size));
> + if (variant->file)
> + close(mfd);
> }
>
> TEST_F(iommufd_mock_domain, all_aligns_copy)
> @@ -1554,9 +1633,13 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
> unsigned int start;
> unsigned int end;
> uint8_t *buf;
> + int prot = PROT_READ | PROT_WRITE;
> + int mfd;
>
> - buf = mmap(0, buf_size, PROT_READ | PROT_WRITE, self->mmap_flags, -1,
> - 0);
> + if (variant->file)
> + buf = memfd_mmap(buf_size, prot, MAP_SHARED, &mfd);
> + else
> + buf = mmap(0, buf_size, prot, self->mmap_flags, -1, 0);
> ASSERT_NE(MAP_FAILED, buf);
> check_refs(buf, buf_size, 0);
>
> @@ -1575,7 +1658,12 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
> uint32_t mock_stdev_id;
> __u64 iova;
>
> - test_ioctl_ioas_map(buf + start, length, &iova);
> + if (variant->file) {
> + test_ioctl_ioas_map_file(mfd, start, length,
> + &iova);
> + } else {
> + test_ioctl_ioas_map(buf + start, length, &iova);
> + }
>
> /* Add and destroy a domain while the area exists */
> old_id = self->hwpt_ids[1];
> @@ -1596,15 +1684,18 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
> }
> check_refs(buf, buf_size, 0);
> ASSERT_EQ(0, munmap(buf, buf_size));
> + if (variant->file)
> + close(mfd);
> }
>
> TEST_F(iommufd_mock_domain, user_copy)
> {
> + void *buf = variant->file ? mfd_buffer : buffer;
> struct iommu_test_cmd access_cmd = {
> .size = sizeof(access_cmd),
> .op = IOMMU_TEST_OP_ACCESS_PAGES,
> .access_pages = { .length = BUFFER_SIZE,
> - .uptr = (uintptr_t)buffer },
> + .uptr = (uintptr_t)buf },
> };
> struct iommu_ioas_copy copy_cmd = {
> .size = sizeof(copy_cmd),
> @@ -1623,9 +1714,13 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
>
> /* Pin the pages in an IOAS with no domains then copy to an IOAS with domains */
> test_ioctl_ioas_alloc(&ioas_id);
> - test_ioctl_ioas_map_id(ioas_id, buffer, BUFFER_SIZE,
> - ©_cmd.src_iova);
> -
> + if (variant->file) {
> + test_ioctl_ioas_map_id_file(ioas_id, mfd, 0, BUFFER_SIZE,
> + ©_cmd.src_iova);
> + } else {
> + test_ioctl_ioas_map_id(ioas_id, buf, BUFFER_SIZE,
> + ©_cmd.src_iova);
> + }
> test_cmd_create_access(ioas_id, &access_cmd.id,
> MOCK_FLAGS_ACCESS_CREATE_NEEDS_PIN_PAGES);
>
> @@ -1635,12 +1730,17 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
> &access_cmd));
> copy_cmd.src_ioas_id = ioas_id;
> ASSERT_EQ(0, ioctl(self->fd, IOMMU_IOAS_COPY, ©_cmd));
> - check_mock_iova(buffer, MOCK_APERTURE_START, BUFFER_SIZE);
> + check_mock_iova(buf, MOCK_APERTURE_START, BUFFER_SIZE);
>
> /* Now replace the ioas with a new one */
> test_ioctl_ioas_alloc(&new_ioas_id);
> - test_ioctl_ioas_map_id(new_ioas_id, buffer, BUFFER_SIZE,
> - ©_cmd.src_iova);
> + if (variant->file) {
> + test_ioctl_ioas_map_id_file(new_ioas_id, mfd, 0, BUFFER_SIZE,
> + ©_cmd.src_iova);
> + } else {
> + test_ioctl_ioas_map_id(new_ioas_id, buf, BUFFER_SIZE,
> + ©_cmd.src_iova);
> + }
> test_cmd_access_replace_ioas(access_cmd.id, new_ioas_id);
>
> /* Destroy the old ioas and cleanup copied mapping */
> @@ -1654,7 +1754,7 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
> &access_cmd));
> copy_cmd.src_ioas_id = new_ioas_id;
> ASSERT_EQ(0, ioctl(self->fd, IOMMU_IOAS_COPY, ©_cmd));
> - check_mock_iova(buffer, MOCK_APERTURE_START, BUFFER_SIZE);
> + check_mock_iova(buf, MOCK_APERTURE_START, BUFFER_SIZE);
>
> test_cmd_destroy_access_pages(
> access_cmd.id, access_cmd.access_pages.out_access_pages_id);
> @@ -1667,6 +1767,9 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
> {
> uint32_t ioas_id;
>
> + if (variant->file)
> + SKIP(return, "redundant test");
> +
> test_ioctl_ioas_alloc(&ioas_id);
>
> test_cmd_mock_domain_replace(self->stdev_ids[0], ioas_id);
> @@ -1697,6 +1800,9 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
> {
> int i;
>
> + if (variant->file)
> + SKIP(return, "redundant test");
> +
> for (i = 0; i != variant->mock_domains; i++) {
> uint32_t hwpt_id[2];
> uint32_t stddev_id;
> diff --git a/tools/testing/selftests/iommu/iommufd_fail_nth.c b/tools/testing/selftests/iommu/iommufd_fail_nth.c
> index c5d5e69..111d43e 100644
> --- a/tools/testing/selftests/iommu/iommufd_fail_nth.c
> +++ b/tools/testing/selftests/iommu/iommufd_fail_nth.c
> @@ -47,6 +47,9 @@ static __attribute__((constructor)) void setup_buffer(void)
>
> buffer = mmap(0, BUFFER_SIZE, PROT_READ | PROT_WRITE,
> MAP_SHARED | MAP_ANONYMOUS, -1, 0);
> +
> + mfd_buffer = memfd_mmap(BUFFER_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED,
> + &mfd);
> }
>
> /*
> @@ -331,6 +334,42 @@ void __fail_nth_enable(struct __test_metadata *_metadata,
> return 0;
> }
>
> +/* iopt_area_fill_domains() and iopt_area_fill_domain() */
> +TEST_FAIL_NTH(basic_fail_nth, map_file_domain)
> +{
> + uint32_t ioas_id;
> + __u32 stdev_id;
> + __u32 hwpt_id;
> + __u64 iova;
> +
> + self->fd = open("/dev/iommu", O_RDWR);
> + if (self->fd == -1)
> + return -1;
> +
> + if (_test_ioctl_ioas_alloc(self->fd, &ioas_id))
> + return -1;
> +
> + if (_test_ioctl_set_temp_memory_limit(self->fd, 32))
> + return -1;
> +
> + fail_nth_enable();
> +
> + if (_test_cmd_mock_domain(self->fd, ioas_id, &stdev_id, &hwpt_id, NULL))
> + return -1;
> +
> + if (_test_ioctl_ioas_map_file(self->fd, ioas_id, mfd, 0, 262144, &iova,
> + IOMMU_IOAS_MAP_WRITEABLE |
> + IOMMU_IOAS_MAP_READABLE))
> + return -1;
> +
> + if (_test_ioctl_destroy(self->fd, stdev_id))
> + return -1;
> +
> + if (_test_cmd_mock_domain(self->fd, ioas_id, &stdev_id, &hwpt_id, NULL))
> + return -1;
> + return 0;
> +}
> +
> TEST_FAIL_NTH(basic_fail_nth, map_two_domains)
> {
> uint32_t ioas_id;
> diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h
> index 40f6f14..c708dc1 100644
> --- a/tools/testing/selftests/iommu/iommufd_utils.h
> +++ b/tools/testing/selftests/iommu/iommufd_utils.h
> @@ -40,12 +40,28 @@ static inline bool test_bit(unsigned int nr, unsigned long *addr)
> static void *buffer;
> static unsigned long BUFFER_SIZE;
>
> +static void *mfd_buffer;
> +static int mfd;
> +
> static unsigned long PAGE_SIZE;
>
> #define sizeof_field(TYPE, MEMBER) sizeof((((TYPE *)0)->MEMBER))
> #define offsetofend(TYPE, MEMBER) \
> (offsetof(TYPE, MEMBER) + sizeof_field(TYPE, MEMBER))
>
> +static inline void *memfd_mmap(size_t length, int prot, int flags, int *mfd_p)
> +{
> + int mfd_flags = (flags & MAP_HUGETLB) ? MFD_HUGETLB : 0;
> + int mfd = memfd_create("buffer", mfd_flags);
> +
> + if (mfd <= 0)
> + return MAP_FAILED;
> + if (ftruncate(mfd, length))
> + return MAP_FAILED;
> + *mfd_p = mfd;
> + return mmap(0, length, prot, flags, mfd, 0);
> +}
> +
> /*
> * Have the kernel check the refcount on pages. I don't know why a freshly
> * mmap'd anon non-compound page starts out with a ref of 3
> @@ -589,6 +605,47 @@ static int _test_ioctl_ioas_unmap(int fd, unsigned int ioas_id, uint64_t iova,
> EXPECT_ERRNO(_errno, _test_ioctl_ioas_unmap(self->fd, self->ioas_id, \
> iova, length, NULL))
>
> +static int _test_ioctl_ioas_map_file(int fd, unsigned int ioas_id, int mfd,
> + size_t start, size_t length, __u64 *iova,
> + unsigned int flags)
> +{
> + struct iommu_ioas_map_file cmd = {
> + .size = sizeof(cmd),
> + .flags = flags,
> + .ioas_id = ioas_id,
> + .fd = mfd,
> + .start = start,
> + .length = length,
> + };
> + int ret;
> +
> + if (flags & IOMMU_IOAS_MAP_FIXED_IOVA)
> + cmd.iova = *iova;
> +
> + ret = ioctl(fd, IOMMU_IOAS_MAP_FILE, &cmd);
> + *iova = cmd.iova;
> + return ret;
> +}
> +
> +#define test_ioctl_ioas_map_file(mfd, start, length, iova_p) \
> + ASSERT_EQ(0, _test_ioctl_ioas_map_file(self->fd, self->ioas_id, mfd, \
> + start, length, iova_p, \
> + IOMMU_IOAS_MAP_WRITEABLE | \
> + IOMMU_IOAS_MAP_READABLE))
> +
> +#define test_err_ioctl_ioas_map_file(_errno, start, length, iova_p) \
> + EXPECT_ERRNO(_errno, \
> + _test_ioctl_ioas_map_file(self->fd, self->ioas_id, mfd, \
> + start, length, iova_p, \
> + IOMMU_IOAS_MAP_WRITEABLE | \
> + IOMMU_IOAS_MAP_READABLE))
> +
> +#define test_ioctl_ioas_map_id_file(ioas_id, mfd, start, length, iova_p) \
> + ASSERT_EQ(0, _test_ioctl_ioas_map_file(self->fd, ioas_id, mfd, \
> + start, length, iova_p, \
> + IOMMU_IOAS_MAP_WRITEABLE | \
> + IOMMU_IOAS_MAP_READABLE))
> +
> static int _test_ioctl_set_temp_memory_limit(int fd, unsigned int limit)
> {
> struct iommu_test_cmd memlimit_cmd = {
^ permalink raw reply [flat|nested] 33+ messages in thread* Re: [PATCH V5 9/9] iommufd: map file selftest
2024-10-22 21:00 ` [PATCH V5 9/9] iommufd: map file selftest Steve Sistare
2024-10-23 13:52 ` Steven Sistare
@ 2024-10-23 20:15 ` Nicolin Chen
2024-10-23 20:45 ` Steven Sistare
1 sibling, 1 reply; 33+ messages in thread
From: Nicolin Chen @ 2024-10-23 20:15 UTC (permalink / raw)
To: Steve Sistare; +Cc: iommu, Jason Gunthorpe, Kevin Tian
On Tue, Oct 22, 2024 at 02:00:38PM -0700, Steve Sistare wrote:
> Add test cases to exercise IOMMU_IOAS_MAP_FILE.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
LGTM overall.
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
With some nits:
> ---
> tools/testing/selftests/iommu/iommufd.c | 136 ++++++++++++++++++++---
> tools/testing/selftests/iommu/iommufd_fail_nth.c | 39 +++++++
> tools/testing/selftests/iommu/iommufd_utils.h | 57 ++++++++++
> 3 files changed, 217 insertions(+), 15 deletions(-)
>
> diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c
> index 4927b9a..3d62951 100644
> --- a/tools/testing/selftests/iommu/iommufd.c
> +++ b/tools/testing/selftests/iommu/iommufd.c
> @@ -3,6 +3,7 @@
> #include <stdlib.h>
> #include <sys/mman.h>
> #include <sys/eventfd.h>
> +#include <asm/unistd.h>
Can we add it to the top of the list for an alphabetical order?
> TEST_F(iommufd_mock_domain, ro_unshare)
> {
> uint8_t *buf;
> __u64 iova;
> int fd;
>
> + if (variant->file)
> + SKIP(return, "redundant test");
Why is it redundant? A line of comments maybe?
> +/* iopt_area_fill_domains() and iopt_area_fill_domain() */
> +TEST_FAIL_NTH(basic_fail_nth, map_file_domain)
> +{
[...]
> + if (_test_ioctl_ioas_map_file(self->fd, ioas_id, mfd, 0, 262144, &iova,
> + IOMMU_IOAS_MAP_WRITEABLE |
> + IOMMU_IOAS_MAP_READABLE))
[...]
> +#define test_ioctl_ioas_map_id_file(ioas_id, mfd, start, length, iova_p) \
> + ASSERT_EQ(0, _test_ioctl_ioas_map_file(self->fd, ioas_id, mfd, \
> + start, length, iova_p, \
> + IOMMU_IOAS_MAP_WRITEABLE | \
> + IOMMU_IOAS_MAP_READABLE))
> +
The indentations are a bit odd here. Would you please do git-
clang-format to fix them?
Thanks
Nicolin
^ permalink raw reply [flat|nested] 33+ messages in thread* Re: [PATCH V5 9/9] iommufd: map file selftest
2024-10-23 20:15 ` Nicolin Chen
@ 2024-10-23 20:45 ` Steven Sistare
2024-10-24 0:36 ` Nicolin Chen
0 siblings, 1 reply; 33+ messages in thread
From: Steven Sistare @ 2024-10-23 20:45 UTC (permalink / raw)
To: Nicolin Chen; +Cc: iommu, Jason Gunthorpe, Kevin Tian
Thanks very much! I will fix all the nits.
- Steve
On 10/23/2024 4:15 PM, Nicolin Chen wrote:
> On Tue, Oct 22, 2024 at 02:00:38PM -0700, Steve Sistare wrote:
>
>> Add test cases to exercise IOMMU_IOAS_MAP_FILE.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>
> LGTM overall.
>
> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
>
> With some nits:
>
>> ---
>> tools/testing/selftests/iommu/iommufd.c | 136 ++++++++++++++++++++---
>> tools/testing/selftests/iommu/iommufd_fail_nth.c | 39 +++++++
>> tools/testing/selftests/iommu/iommufd_utils.h | 57 ++++++++++
>> 3 files changed, 217 insertions(+), 15 deletions(-)
>>
>> diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c
>> index 4927b9a..3d62951 100644
>> --- a/tools/testing/selftests/iommu/iommufd.c
>> +++ b/tools/testing/selftests/iommu/iommufd.c
>> @@ -3,6 +3,7 @@
>> #include <stdlib.h>
>> #include <sys/mman.h>
>> #include <sys/eventfd.h>
>> +#include <asm/unistd.h>
>
> Can we add it to the top of the list for an alphabetical order?
>
>> TEST_F(iommufd_mock_domain, ro_unshare)
>> {
>> uint8_t *buf;
>> __u64 iova;
>> int fd;
>>
>> + if (variant->file)
>> + SKIP(return, "redundant test");
>
> Why is it redundant? A line of comments maybe?
>
>> +/* iopt_area_fill_domains() and iopt_area_fill_domain() */
>> +TEST_FAIL_NTH(basic_fail_nth, map_file_domain)
>> +{
> [...]
>> + if (_test_ioctl_ioas_map_file(self->fd, ioas_id, mfd, 0, 262144, &iova,
>> + IOMMU_IOAS_MAP_WRITEABLE |
>> + IOMMU_IOAS_MAP_READABLE))
> [...]
>> +#define test_ioctl_ioas_map_id_file(ioas_id, mfd, start, length, iova_p) \
>> + ASSERT_EQ(0, _test_ioctl_ioas_map_file(self->fd, ioas_id, mfd, \
>> + start, length, iova_p, \
>> + IOMMU_IOAS_MAP_WRITEABLE | \
>> + IOMMU_IOAS_MAP_READABLE))
>> +
>
> The indentations are a bit odd here. Would you please do git-
> clang-format to fix them?
>
> Thanks
> Nicolin
^ permalink raw reply [flat|nested] 33+ messages in thread* Re: [PATCH V5 9/9] iommufd: map file selftest
2024-10-23 20:45 ` Steven Sistare
@ 2024-10-24 0:36 ` Nicolin Chen
0 siblings, 0 replies; 33+ messages in thread
From: Nicolin Chen @ 2024-10-24 0:36 UTC (permalink / raw)
To: Steven Sistare; +Cc: iommu, Jason Gunthorpe, Kevin Tian
On Wed, Oct 23, 2024 at 04:45:32PM -0400, Steven Sistare wrote:
> Thanks very much! I will fix all the nits.
When you respin the series, would you please also send the series
to the selftest maillist? I think get_maintainer.pl should give
you the list that has it.
Thanks
Nicolin
^ permalink raw reply [flat|nested] 33+ messages in thread