* [PATCH V3 0/4] iommufd live update
@ 2024-10-07 17:13 Steve Sistare
2024-10-07 17:13 ` [PATCH V3 1/4] iommufd: Export do_update_pinned Steve Sistare
` (3 more replies)
0 siblings, 4 replies; 11+ messages in thread
From: Steve Sistare @ 2024-10-07 17:13 UTC (permalink / raw)
To: iommu
Cc: Jason Gunthorpe, Kevin Tian, Alex Williamson, Cornelia Huck,
Steve Sistare
Live update is a technique wherein an application saves its state, launches
an updated version of itself, and restores its state. Clients of the
application experience a brief suspension of service, on the order of
100's of milliseconds, but are otherwise unaffected.
Define the IOMMU_IOAS_CHANGE_PROCESS ioctl to allow management and use
of an iommufd device to be transferred from one process to another. The
application is responsible for transferring the device descriptor to the new
process, eg either by preservation across fork and exec or via SCM_RIGHTS.
It atomically updates everything to the current process, grabbing the new mm,
and transferring pinned page counts.
IOMMU_IOAS_CHANGE_PROCESS only supports DMA mappings created with
IOMMU_IOAS_MAP_FILE, because the kernel metadata for such mappings does not
depend on the userland VA of the pages (which is different in the new process).
IOMMU_IOAS_CHANGE_PROCESS fails if other types of mappings are present.
Thanks to Jason Gunthorpe for code and ideas in this series.
This series depends on the series:
[PATCH V3] iommu_ioas_map_file
This is implemented in QEMU by the series:
[PATCH] Live update: iommufd (V2 to be posted)
Changes in V2:
* deleted interface to update VA
* add selftest cases
Changes in V3:
* replaced lockdep_off with down_write_nest_lock
* cosmetic changes as requested
* check for zero length in iommufd_ioas_map_file
Steve Sistare (4):
iommufd: Export do_update_pinned
iommufd: Lock all objects
iommufd: Add IOMMU_IOAS_CHANGE_PROCESS
iommufd: IOMMU_IOAS_CHANGE_PROCESS selftest
drivers/iommu/iommufd/io_pagetable.h | 6 +
drivers/iommu/iommufd/ioas.c | 207 ++++++++++++++++++++++++++
drivers/iommu/iommufd/iommufd_private.h | 2 +
drivers/iommu/iommufd/main.c | 3 +
drivers/iommu/iommufd/pages.c | 10 +-
include/uapi/linux/iommufd.h | 23 +++
tools/testing/selftests/iommu/Makefile | 1 +
tools/testing/selftests/iommu/iommufd.c | 148 +++++++++++++++++-
tools/testing/selftests/iommu/iommufd_utils.h | 34 +++--
9 files changed, 409 insertions(+), 25 deletions(-)
--
1.8.3.1
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH V3 1/4] iommufd: Export do_update_pinned
2024-10-07 17:13 [PATCH V3 0/4] iommufd live update Steve Sistare
@ 2024-10-07 17:13 ` Steve Sistare
2024-10-07 17:13 ` [PATCH V3 2/4] iommufd: Lock all objects Steve Sistare
` (2 subsequent siblings)
3 siblings, 0 replies; 11+ messages in thread
From: Steve Sistare @ 2024-10-07 17:13 UTC (permalink / raw)
To: iommu
Cc: Jason Gunthorpe, Kevin Tian, Alex Williamson, Cornelia Huck,
Steve Sistare
Export do_update_pinned. No functional change.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/iommufd/io_pagetable.h | 5 +++++
drivers/iommu/iommufd/pages.c | 10 +++++-----
2 files changed, 10 insertions(+), 5 deletions(-)
diff --git a/drivers/iommu/iommufd/io_pagetable.h b/drivers/iommu/iommufd/io_pagetable.h
index 909d396..622773a 100644
--- a/drivers/iommu/iommufd/io_pagetable.h
+++ b/drivers/iommu/iommufd/io_pagetable.h
@@ -252,4 +252,9 @@ struct iopt_pages_access {
unsigned int users;
};
+struct pfn_reader_user;
+
+int iopt_update_pinned(struct iopt_pages *pages, unsigned long npages, bool inc,
+ struct pfn_reader_user *user);
+
#endif
diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
index b57c2b9..c3700e7 100644
--- a/drivers/iommu/iommufd/pages.c
+++ b/drivers/iommu/iommufd/pages.c
@@ -1006,8 +1006,8 @@ static int update_mm_locked_vm(struct iopt_pages *pages, unsigned long npages,
return rc;
}
-static int do_update_pinned(struct iopt_pages *pages, unsigned long npages,
- bool inc, struct pfn_reader_user *user)
+int iopt_update_pinned(struct iopt_pages *pages, unsigned long npages, bool inc,
+ struct pfn_reader_user *user)
{
int rc = 0;
@@ -1041,8 +1041,8 @@ static void update_unpinned(struct iopt_pages *pages)
return;
if (pages->npinned == pages->last_npinned)
return;
- do_update_pinned(pages, pages->last_npinned - pages->npinned, false,
- NULL);
+ iopt_update_pinned(pages, pages->last_npinned - pages->npinned, false,
+ NULL);
}
/*
@@ -1072,7 +1072,7 @@ static int pfn_reader_user_update_pinned(struct pfn_reader_user *user,
npages = pages->npinned - pages->last_npinned;
inc = true;
}
- return do_update_pinned(pages, npages, inc, user);
+ return iopt_update_pinned(pages, npages, inc, user);
}
/*
--
1.8.3.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH V3 2/4] iommufd: Lock all objects
2024-10-07 17:13 [PATCH V3 0/4] iommufd live update Steve Sistare
2024-10-07 17:13 ` [PATCH V3 1/4] iommufd: Export do_update_pinned Steve Sistare
@ 2024-10-07 17:13 ` Steve Sistare
2024-11-07 20:23 ` Jason Gunthorpe
2024-10-07 17:13 ` [PATCH V3 3/4] iommufd: Add IOMMU_IOAS_CHANGE_PROCESS Steve Sistare
2024-10-07 17:13 ` [PATCH V3 4/4] iommufd: IOMMU_IOAS_CHANGE_PROCESS selftest Steve Sistare
3 siblings, 1 reply; 11+ messages in thread
From: Steve Sistare @ 2024-10-07 17:13 UTC (permalink / raw)
To: iommu
Cc: Jason Gunthorpe, Kevin Tian, Alex Williamson, Cornelia Huck,
Steve Sistare
Define helpers to lock and unlock all iommufd_ctx objects.
This will allow DMA mappings to be updated atomically during live update.
This code is substantially the same as an initial version provided by
Jason, plus fixes.
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
drivers/iommu/iommufd/ioas.c | 65 +++++++++++++++++++++++++++++++++
drivers/iommu/iommufd/iommufd_private.h | 1 +
drivers/iommu/iommufd/main.c | 1 +
3 files changed, 67 insertions(+)
diff --git a/drivers/iommu/iommufd/ioas.c b/drivers/iommu/iommufd/ioas.c
index 2b2e172..480fe75 100644
--- a/drivers/iommu/iommufd/ioas.c
+++ b/drivers/iommu/iommufd/ioas.c
@@ -52,7 +52,10 @@ int iommufd_ioas_alloc_ioctl(struct iommufd_ucmd *ucmd)
rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
if (rc)
goto out_table;
+
+ down_read(&ucmd->ictx->ioas_creation_lock);
iommufd_object_finalize(ucmd->ictx, &ioas->obj);
+ up_read(&ucmd->ictx->ioas_creation_lock);
return 0;
out_table:
@@ -370,6 +373,68 @@ int iommufd_ioas_unmap(struct iommufd_ucmd *ucmd)
return rc;
}
+static void iommufd_release_all_iova_rwsem(struct iommufd_ctx *ictx,
+ struct xarray *ioas_list)
+{
+ struct iommufd_ioas *ioas;
+ unsigned long index;
+
+ xa_for_each(ioas_list, index, ioas) {
+ up_write(&ioas->iopt.iova_rwsem);
+ refcount_dec(&ioas->obj.users);
+ }
+ up_write(&ictx->ioas_creation_lock);
+ xa_destroy(ioas_list);
+}
+
+static int iommufd_take_all_iova_rwsem(struct iommufd_ctx *ictx,
+ struct xarray *ioas_list)
+{
+ struct iommufd_object *obj;
+ unsigned long index;
+ int rc;
+
+ /*
+ * This is very ugly, it is done instead of adding a lock around
+ * pages->source_mm, which is a performance path for mdev, we just
+ * obtain the write side of all the iova_rwsems which also protects the
+ * pages->source_*. Due to copies we can't know which IOAS could read
+ * from the pages, so we just lock everything. This is the only place
+ * locks are nested and they are uniformly taken in ID order.
+ *
+ * ioas_creation_lock prevents new IOAS from being installed in the
+ * xarray while we do this, and also prevents more than one thread from
+ * holding nested locks.
+ */
+ down_write(&ictx->ioas_creation_lock);
+ xa_lock(&ictx->objects);
+ xa_for_each(&ictx->objects, index, obj) {
+ struct iommufd_ioas *ioas;
+
+ if (!obj || obj->type != IOMMUFD_OBJ_IOAS)
+ continue;
+
+ if (!refcount_inc_not_zero(&obj->users))
+ continue;
+
+ xa_unlock(&ictx->objects);
+
+ ioas = container_of(obj, struct iommufd_ioas, obj);
+ down_write_nest_lock(&ioas->iopt.iova_rwsem,
+ &ictx->ioas_creation_lock);
+
+ rc = xa_err(xa_store(ioas_list, index, ioas, GFP_KERNEL));
+ if (rc) {
+ iommufd_release_all_iova_rwsem(ictx, ioas_list);
+ return rc;
+ }
+
+ xa_lock(&ictx->objects);
+ }
+ xa_unlock(&ictx->objects);
+ return 0;
+}
+
int iommufd_option_rlimit_mode(struct iommu_option *cmd,
struct iommufd_ctx *ictx)
{
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index 34b1f3a..49557de 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -23,6 +23,7 @@ struct iommufd_ctx {
struct xarray objects;
struct xarray groups;
wait_queue_head_t destroy_wait;
+ struct rw_semaphore ioas_creation_lock;
u8 account_mode;
/* Compatibility with VFIO no iommu */
diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index f57c3318..e7ff72f 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -248,6 +248,7 @@ static int iommufd_fops_open(struct inode *inode, struct file *filp)
pr_info_once("IOMMUFD is providing /dev/vfio/vfio, not VFIO.\n");
}
+ init_rwsem(&ictx->ioas_creation_lock);
xa_init_flags(&ictx->objects, XA_FLAGS_ALLOC1 | XA_FLAGS_ACCOUNT);
xa_init(&ictx->groups);
ictx->file = filp;
--
1.8.3.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH V3 3/4] iommufd: Add IOMMU_IOAS_CHANGE_PROCESS
2024-10-07 17:13 [PATCH V3 0/4] iommufd live update Steve Sistare
2024-10-07 17:13 ` [PATCH V3 1/4] iommufd: Export do_update_pinned Steve Sistare
2024-10-07 17:13 ` [PATCH V3 2/4] iommufd: Lock all objects Steve Sistare
@ 2024-10-07 17:13 ` Steve Sistare
2024-11-06 18:08 ` Jason Gunthorpe
2024-11-07 20:25 ` Jason Gunthorpe
2024-10-07 17:13 ` [PATCH V3 4/4] iommufd: IOMMU_IOAS_CHANGE_PROCESS selftest Steve Sistare
3 siblings, 2 replies; 11+ messages in thread
From: Steve Sistare @ 2024-10-07 17:13 UTC (permalink / raw)
To: iommu
Cc: Jason Gunthorpe, Kevin Tian, Alex Williamson, Cornelia Huck,
Steve Sistare
Add an ioctl that updates all DMA mappings to reflect the current process,
Change the mm and transfer locked memory accounting from old to current mm.
This will be used for live update, allowing an old process to hand the
iommufd device descriptor to a new process. The new process calls the
ioctl.
IOMMU_IOAS_CHANGE_PROCESS only supports DMA mappings created with
IOMMU_IOAS_MAP_FILE, because the kernel metadata for such mappings does
not depend on the userland VA of the pages (which is different in the new
process).
IOMMU_IOAS_CHANGE_PROCESS fails if other types of mappings are present.
This is a revised version of code originally provided by Jason.
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
drivers/iommu/iommufd/io_pagetable.h | 1 +
drivers/iommu/iommufd/ioas.c | 142 ++++++++++++++++++++++++++++++++
drivers/iommu/iommufd/iommufd_private.h | 1 +
drivers/iommu/iommufd/main.c | 2 +
include/uapi/linux/iommufd.h | 23 ++++++
5 files changed, 169 insertions(+)
diff --git a/drivers/iommu/iommufd/io_pagetable.h b/drivers/iommu/iommufd/io_pagetable.h
index 622773a..a3cca5e 100644
--- a/drivers/iommu/iommufd/io_pagetable.h
+++ b/drivers/iommu/iommufd/io_pagetable.h
@@ -173,6 +173,7 @@ enum {
IOPT_PAGES_ACCOUNT_NONE = 0,
IOPT_PAGES_ACCOUNT_USER = 1,
IOPT_PAGES_ACCOUNT_MM = 2,
+ IOPT_PAGES_ACCOUNT_MODE_NUM = 3,
};
enum iopt_address_type {
diff --git a/drivers/iommu/iommufd/ioas.c b/drivers/iommu/iommufd/ioas.c
index 480fe75..12570e6 100644
--- a/drivers/iommu/iommufd/ioas.c
+++ b/drivers/iommu/iommufd/ioas.c
@@ -435,6 +435,148 @@ static int iommufd_take_all_iova_rwsem(struct iommufd_ctx *ictx,
return 0;
}
+static bool need_charge_update(struct iopt_pages *pages)
+{
+ switch (pages->account_mode) {
+ case IOPT_PAGES_ACCOUNT_NONE:
+ return false;
+ case IOPT_PAGES_ACCOUNT_MM:
+ return pages->source_mm != current->mm;
+ case IOPT_PAGES_ACCOUNT_USER:
+ /*
+ * Update when mm changes because it also accounts
+ * in mm->pinned_vm.
+ */
+ return (pages->source_user != current_user()) ||
+ (pages->source_mm != current->mm);
+ }
+ return true;
+}
+
+static int charge_current(unsigned long *npinned)
+{
+ struct iopt_pages tmp = {
+ .source_mm = current->mm,
+ .source_task = current->group_leader,
+ .source_user = current_user(),
+ };
+ unsigned int account_mode;
+ int rc;
+
+ for (account_mode = 0; account_mode != IOPT_PAGES_ACCOUNT_MODE_NUM;
+ account_mode++) {
+ if (!npinned[account_mode])
+ continue;
+
+ tmp.account_mode = account_mode;
+ rc = iopt_update_pinned(&tmp, npinned[account_mode], true,
+ NULL);
+ if (rc)
+ goto err_undo;
+ }
+ return 0;
+
+err_undo:
+ while (account_mode != 0) {
+ account_mode--;
+ if (!npinned[account_mode])
+ continue;
+ tmp.account_mode = account_mode;
+ iopt_update_pinned(&tmp, npinned[account_mode], false, NULL);
+ }
+ return rc;
+}
+
+static void change_mm(struct iopt_pages *pages)
+{
+ struct task_struct *old_task = pages->source_task;
+ struct user_struct *old_user = pages->source_user;
+ struct mm_struct *old_mm = pages->source_mm;
+
+ pages->source_mm = current->mm;
+ mmgrab(pages->source_mm);
+ mmdrop(old_mm);
+
+ pages->source_task = current->group_leader;
+ get_task_struct(pages->source_task);
+ put_task_struct(old_task);
+
+ pages->source_user = get_uid(current_user());
+ free_uid(old_user);
+}
+
+#define for_each_ioas_area(_xa, _index, _ioas, _area) \
+ xa_for_each((_xa), (_index), (_ioas)) \
+ for (_area = iopt_area_iter_first(&_ioas->iopt, 0, ULONG_MAX); \
+ _area; \
+ _area = iopt_area_iter_next(_area, 0, ULONG_MAX))
+
+int iommufd_ioas_change_process(struct iommufd_ucmd *ucmd)
+{
+ struct iommu_ioas_change_process *cmd = ucmd->cmd;
+ struct iommufd_ctx *ictx = ucmd->ictx;
+ unsigned long all_npinned[IOPT_PAGES_ACCOUNT_MODE_NUM] = {};
+ struct iommufd_ioas *ioas;
+ struct iopt_area *area;
+ struct iopt_pages *pages;
+ struct xarray ioas_list;
+ unsigned long index;
+ int rc;
+
+ if (cmd->__reserved)
+ return -EOPNOTSUPP;
+
+ xa_init(&ioas_list);
+ rc = iommufd_take_all_iova_rwsem(ictx, &ioas_list);
+ if (rc)
+ return rc;
+
+ for_each_ioas_area(&ioas_list, index, ioas, area) {
+ if (area->pages->type != IOPT_ADDRESS_FILE) {
+ rc = -EINVAL;
+ goto out;
+ }
+ }
+
+ /*
+ * Count last_pinned pages, then clear it to avoid double counting
+ * if the same iopt_pages is visited multiple times in this loop.
+ * Since we are under all the locks, npinned == last_npinned, so we
+ * can easily restore last_npinned before we return.
+ */
+ for_each_ioas_area(&ioas_list, index, ioas, area) {
+ pages = area->pages;
+
+ if (need_charge_update(pages)) {
+ all_npinned[pages->account_mode] += pages->last_npinned;
+ pages->last_npinned = 0;
+ }
+ }
+
+ rc = charge_current(all_npinned);
+
+ if (rc) {
+ /* Charge failed. Fix last_npinned and bail. */
+ for_each_ioas_area(&ioas_list, index, ioas, area)
+ area->pages->last_npinned = area->pages->npinned;
+ goto out;
+ }
+
+ for_each_ioas_area(&ioas_list, index, ioas, area) {
+ pages = area->pages;
+
+ /* Uncharge the old one (which also restores last_npinned) */
+ if (need_charge_update(pages))
+ iopt_update_pinned(pages, pages->npinned, false, NULL);
+
+ change_mm(pages);
+ }
+
+out:
+ iommufd_release_all_iova_rwsem(ictx, &ioas_list);
+ return rc;
+}
+
int iommufd_option_rlimit_mode(struct iommu_option *cmd,
struct iommufd_ctx *ictx)
{
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index 49557de..2dc9e40 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -281,6 +281,7 @@ static inline struct iommufd_ioas *iommufd_get_ioas(struct iommufd_ctx *ictx,
int iommufd_ioas_allow_iovas(struct iommufd_ucmd *ucmd);
int iommufd_ioas_map(struct iommufd_ucmd *ucmd);
int iommufd_ioas_map_file(struct iommufd_ucmd *ucmd);
+int iommufd_ioas_change_process(struct iommufd_ucmd *ucmd);
int iommufd_ioas_copy(struct iommufd_ucmd *ucmd);
int iommufd_ioas_unmap(struct iommufd_ucmd *ucmd);
int iommufd_ioas_option(struct iommufd_ucmd *ucmd);
diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index e7ff72f..8c76eca 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -373,6 +373,8 @@ struct iommufd_ioctl_op {
struct iommu_ioas_alloc, out_ioas_id),
IOCTL_OP(IOMMU_IOAS_ALLOW_IOVAS, iommufd_ioas_allow_iovas,
struct iommu_ioas_allow_iovas, allowed_iovas),
+ IOCTL_OP(IOMMU_IOAS_CHANGE_PROCESS, iommufd_ioas_change_process,
+ struct iommu_ioas_change_process, size),
IOCTL_OP(IOMMU_IOAS_COPY, iommufd_ioas_copy, struct iommu_ioas_copy,
src_iova),
IOCTL_OP(IOMMU_IOAS_IOVA_RANGES, iommufd_ioas_iova_ranges,
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index c484981..523de8c 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -52,6 +52,7 @@ enum {
IOMMUFD_CMD_HWPT_INVALIDATE = 0x8d,
IOMMUFD_CMD_FAULT_QUEUE_ALLOC = 0x8e,
IOMMUFD_CMD_IOAS_MAP_FILE = 0x8f,
+ IOMMUFD_CMD_IOAS_CHANGE_PROCESS = 0x90,
};
/**
@@ -822,4 +823,26 @@ struct iommu_fault_alloc {
__u32 out_fault_fd;
};
#define IOMMU_FAULT_QUEUE_ALLOC _IO(IOMMUFD_TYPE, IOMMUFD_CMD_FAULT_QUEUE_ALLOC)
+
+/**
+ * struct iommu_ioas_change_process - ioctl(VFIO_IOAS_CHANGE_PROCESS)
+ * @size: sizeof(struct iommu_ioas_change_process)
+ * @__reserved: Must be 0
+ *
+ * This transfers pinned memory counts for every memory map in every IOAS
+ * in the context to the current process. This only supports maps created
+ * with IOMMU_IOAS_MAP_FILE, and returns EINVAL if other maps are present.
+ * If the ioctl returns a failure status, then nothing is changed.
+ *
+ * This API is useful for transferring operation of a device from one process
+ * to another, such as during userland live update.
+ */
+struct iommu_ioas_change_process {
+ __u32 size;
+ __u32 __reserved;
+};
+
+#define IOMMU_IOAS_CHANGE_PROCESS \
+ _IO(IOMMUFD_TYPE, IOMMUFD_CMD_IOAS_CHANGE_PROCESS)
+
#endif
--
1.8.3.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH V3 4/4] iommufd: IOMMU_IOAS_CHANGE_PROCESS selftest
2024-10-07 17:13 [PATCH V3 0/4] iommufd live update Steve Sistare
` (2 preceding siblings ...)
2024-10-07 17:13 ` [PATCH V3 3/4] iommufd: Add IOMMU_IOAS_CHANGE_PROCESS Steve Sistare
@ 2024-10-07 17:13 ` Steve Sistare
2024-11-07 20:26 ` Jason Gunthorpe
3 siblings, 1 reply; 11+ messages in thread
From: Steve Sistare @ 2024-10-07 17:13 UTC (permalink / raw)
To: iommu
Cc: Jason Gunthorpe, Kevin Tian, Alex Williamson, Cornelia Huck,
Steve Sistare
Add selftest cases for IOMMU_IOAS_CHANGE_PROCESS.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
tools/testing/selftests/iommu/Makefile | 1 +
tools/testing/selftests/iommu/iommufd.c | 148 ++++++++++++++++++++++++--
tools/testing/selftests/iommu/iommufd_utils.h | 34 +++---
3 files changed, 163 insertions(+), 20 deletions(-)
diff --git a/tools/testing/selftests/iommu/Makefile b/tools/testing/selftests/iommu/Makefile
index fd64779..bd23ac9 100644
--- a/tools/testing/selftests/iommu/Makefile
+++ b/tools/testing/selftests/iommu/Makefile
@@ -1,6 +1,7 @@
# SPDX-License-Identifier: GPL-2.0-only
CFLAGS += -Wall -O2 -Wno-unused-function
CFLAGS += $(KHDR_INCLUDES)
+CFLAGS += -lcap
TEST_GEN_PROGS :=
TEST_GEN_PROGS += iommufd
diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c
index 98c9f80..648deb9 100644
--- a/tools/testing/selftests/iommu/iommufd.c
+++ b/tools/testing/selftests/iommu/iommufd.c
@@ -5,6 +5,7 @@
#include <sys/eventfd.h>
#include <sys/syscall.h>
#include <asm/unistd.h>
+#include <sys/capability.h>
#define __EXPORTED_HEADERS__
#include <linux/vfio.h>
@@ -215,6 +216,144 @@ static __attribute__((constructor)) void setup_sizes(void)
EXPECT_ERRNO(ENOENT, ioctl(self->fd, IOMMU_OPTION, &cmd));
}
+static void drop_cap_ipc_lock(struct __test_metadata *_metadata)
+{
+ cap_t caps;
+ cap_value_t cap_list[1] = { CAP_IPC_LOCK };
+
+ caps = cap_get_proc();
+ ASSERT_NE(caps, NULL);
+ ASSERT_NE(-1,
+ cap_set_flag(caps, CAP_EFFECTIVE, 1, cap_list, CAP_CLEAR));
+ ASSERT_NE(-1, cap_set_proc(caps));
+ cap_free(caps);
+}
+
+static long get_proc_status_value(pid_t pid, const char *var)
+{
+ FILE *fp;
+ char buf[80], tag[80];
+ long val = -1;
+
+ snprintf(buf, sizeof(buf), "/proc/%d/status", pid);
+ fp = fopen(buf, "r");
+ if (!fp)
+ return val;
+
+ while (fgets(buf, sizeof(buf), fp))
+ if (fscanf(fp, "%s %ld\n", tag, &val) == 2 && !strcmp(tag, var))
+ break;
+
+ fclose(fp);
+ return val;
+}
+
+static long get_vm_pinned(pid_t pid)
+{
+ return get_proc_status_value(pid, "VmPin:");
+}
+
+static long get_vm_locked(pid_t pid)
+{
+ return get_proc_status_value(pid, "VmLck:");
+}
+
+FIXTURE(change_process)
+{
+ int fd;
+ uint32_t ioas_id;
+};
+
+FIXTURE_VARIANT(change_process)
+{
+ int accounting;
+};
+
+FIXTURE_SETUP(change_process)
+{
+ self->fd = open("/dev/iommu", O_RDWR);
+ ASSERT_NE(-1, self->fd);
+
+ drop_cap_ipc_lock(_metadata);
+ if (variant->accounting != IOPT_PAGES_ACCOUNT_NONE) {
+ struct iommu_option set_limit_cmd = {
+ .size = sizeof(set_limit_cmd),
+ .option_id = IOMMU_OPTION_RLIMIT_MODE,
+ .op = IOMMU_OPTION_OP_SET,
+ .val64 = (variant->accounting == IOPT_PAGES_ACCOUNT_MM),
+ };
+ ASSERT_EQ(0, ioctl(self->fd, IOMMU_OPTION, &set_limit_cmd));
+ }
+
+ test_ioctl_ioas_alloc(&self->ioas_id);
+ test_cmd_mock_domain(self->ioas_id, NULL, NULL, NULL);
+}
+
+FIXTURE_TEARDOWN(change_process)
+{
+ teardown_iommufd(self->fd, _metadata);
+}
+
+FIXTURE_VARIANT_ADD(change_process, account_none)
+{
+ .accounting = IOPT_PAGES_ACCOUNT_NONE,
+};
+
+FIXTURE_VARIANT_ADD(change_process, account_user)
+{
+ .accounting = IOPT_PAGES_ACCOUNT_USER,
+};
+
+FIXTURE_VARIANT_ADD(change_process, account_mm)
+{
+ .accounting = IOPT_PAGES_ACCOUNT_MM,
+};
+
+TEST_F(change_process, basic)
+{
+ pid_t parent = getpid();
+ pid_t child;
+ __u64 iova;
+ struct iommu_ioas_change_process cmd = {
+ .size = sizeof(cmd),
+ };
+
+ /* Expect failure if non-file maps exist */
+ test_ioctl_ioas_map(buffer, PAGE_SIZE, &iova);
+ EXPECT_ERRNO(EINVAL, ioctl(self->fd, IOMMU_IOAS_CHANGE_PROCESS, &cmd));
+ test_ioctl_ioas_unmap(iova, PAGE_SIZE);
+
+ /* Change process works in current process. */
+ test_ioctl_ioas_map_file(mfd, 0, PAGE_SIZE, &iova);
+ ASSERT_EQ(0, ioctl(self->fd, IOMMU_IOAS_CHANGE_PROCESS, &cmd));
+
+ /* Change process works in another process */
+ child = fork();
+ if (!child) {
+ int nlock = PAGE_SIZE / 1024;
+
+ /* Parent accounts for locked memory before */
+ ASSERT_EQ(nlock, get_vm_pinned(parent));
+ if (variant->accounting == IOPT_PAGES_ACCOUNT_MM)
+ ASSERT_EQ(nlock, get_vm_locked(parent));
+ ASSERT_EQ(0, get_vm_pinned(getpid()));
+ ASSERT_EQ(0, get_vm_locked(getpid()));
+
+ ASSERT_EQ(0, ioctl(self->fd, IOMMU_IOAS_CHANGE_PROCESS, &cmd));
+
+ /* Child accounts for locked memory after */
+ ASSERT_EQ(0, get_vm_pinned(parent));
+ ASSERT_EQ(0, get_vm_locked(parent));
+ ASSERT_EQ(nlock, get_vm_pinned(getpid()));
+ if (variant->accounting == IOPT_PAGES_ACCOUNT_MM)
+ ASSERT_EQ(nlock, get_vm_locked(getpid()));
+
+ exit(0);
+ }
+ ASSERT_NE(-1, child);
+ ASSERT_EQ(child, waitpid(child, NULL, 0));
+}
+
FIXTURE(iommufd_ioas)
{
int fd;
@@ -1548,13 +1687,10 @@ static void check_access_rw(struct __test_metadata *_metadata, int fd,
buf = memfd_mmap(buf_size, prot, MAP_SHARED, &mfd_tmp);
ASSERT_NE(MAP_FAILED, buf);
- /* EFAULT half way through mapping */
- ASSERT_EQ(0, munmap(buf + buf_size / 2, buf_size / 2));
- test_err_ioctl_ioas_map_file(EFAULT, 0, buf_size, &iova);
+ test_err_ioctl_ioas_map_file(EINVAL, mfd_tmp, 0, buf_size + 1, &iova);
- /* EFAULT on first page */
- ASSERT_EQ(0, munmap(buf, buf_size / 2));
- test_err_ioctl_ioas_map_file(EFAULT, 0, buf_size, &iova);
+ ASSERT_EQ(0, ftruncate(mfd_tmp, 0));
+ test_err_ioctl_ioas_map_file(EINVAL, mfd_tmp, 0, buf_size, &iova);
close(mfd_tmp);
}
diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h
index 7bc664e..b5bf1c7 100644
--- a/tools/testing/selftests/iommu/iommufd_utils.h
+++ b/tools/testing/selftests/iommu/iommufd_utils.h
@@ -22,6 +22,12 @@
#define BIT_MASK(nr) (1UL << ((nr) % __BITS_PER_LONG))
#define BIT_WORD(nr) ((nr) / __BITS_PER_LONG)
+enum {
+ IOPT_PAGES_ACCOUNT_NONE = 0,
+ IOPT_PAGES_ACCOUNT_USER = 1,
+ IOPT_PAGES_ACCOUNT_MM = 2,
+};
+
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
static inline void set_bit(unsigned int nr, unsigned long *addr)
@@ -612,23 +618,23 @@ static int _test_ioctl_ioas_map_file(int fd, unsigned int ioas_id, int mfd,
}
#define test_ioctl_ioas_map_file(mfd, start, length, iova_p) \
- ASSERT_EQ(0, _test_ioctl_ioas_map_file(self->fd, self->ioas_id, mfd, \
- start, length, iova_p, \
- IOMMU_IOAS_MAP_WRITEABLE | \
- IOMMU_IOAS_MAP_READABLE))
+ ASSERT_EQ(0, \
+ _test_ioctl_ioas_map_file( \
+ self->fd, self->ioas_id, mfd, start, length, iova_p, \
+ IOMMU_IOAS_MAP_WRITEABLE | IOMMU_IOAS_MAP_READABLE))
-#define test_err_ioctl_ioas_map_file(_errno, start, length, iova_p) \
- EXPECT_ERRNO(_errno, \
- _test_ioctl_ioas_map_file(self->fd, self->ioas_id, mfd, \
- start, length, iova_p, \
- IOMMU_IOAS_MAP_WRITEABLE | \
- IOMMU_IOAS_MAP_READABLE))
+#define test_err_ioctl_ioas_map_file(_errno, mfd, start, length, iova_p) \
+ EXPECT_ERRNO( \
+ _errno, \
+ _test_ioctl_ioas_map_file( \
+ self->fd, self->ioas_id, mfd, start, length, iova_p, \
+ IOMMU_IOAS_MAP_WRITEABLE | IOMMU_IOAS_MAP_READABLE))
#define test_ioctl_ioas_map_id_file(ioas_id, mfd, start, length, iova_p) \
- ASSERT_EQ(0, _test_ioctl_ioas_map_file(self->fd, ioas_id, mfd, \
- start, length, iova_p, \
- IOMMU_IOAS_MAP_WRITEABLE | \
- IOMMU_IOAS_MAP_READABLE))
+ ASSERT_EQ(0, \
+ _test_ioctl_ioas_map_file( \
+ self->fd, ioas_id, mfd, start, length, iova_p, \
+ IOMMU_IOAS_MAP_WRITEABLE | IOMMU_IOAS_MAP_READABLE))
static int _test_ioctl_set_temp_memory_limit(int fd, unsigned int limit)
{
--
1.8.3.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH V3 3/4] iommufd: Add IOMMU_IOAS_CHANGE_PROCESS
2024-10-07 17:13 ` [PATCH V3 3/4] iommufd: Add IOMMU_IOAS_CHANGE_PROCESS Steve Sistare
@ 2024-11-06 18:08 ` Jason Gunthorpe
2024-11-06 18:55 ` Steven Sistare
2024-11-07 20:25 ` Jason Gunthorpe
1 sibling, 1 reply; 11+ messages in thread
From: Jason Gunthorpe @ 2024-11-06 18:08 UTC (permalink / raw)
To: Steve Sistare; +Cc: iommu, Kevin Tian, Alex Williamson, Cornelia Huck
On Mon, Oct 07, 2024 at 10:13:22AM -0700, Steve Sistare wrote:
> diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
> index c484981..523de8c 100644
> --- a/include/uapi/linux/iommufd.h
> +++ b/include/uapi/linux/iommufd.h
> @@ -52,6 +52,7 @@ enum {
> IOMMUFD_CMD_HWPT_INVALIDATE = 0x8d,
> IOMMUFD_CMD_FAULT_QUEUE_ALLOC = 0x8e,
> IOMMUFD_CMD_IOAS_MAP_FILE = 0x8f,
> + IOMMUFD_CMD_IOAS_CHANGE_PROCESS = 0x90,
I'm going to give 0x90 to the nesting series
IOMMUFD_CMD_FAULT_QUEUE_ALLOC = 0x8e,
IOMMUFD_CMD_IOAS_MAP_FILE = 0x8f,
IOMMUFD_CMD_VIOMMU_ALLOC = 0x90,
IOMMUFD_CMD_VDEVICE_ALLOC = 0x91,
};
Can you use
IOMMUFD_CMD_IOAS_CHANGE_PROCESS = 0x92
?
Jason
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH V3 3/4] iommufd: Add IOMMU_IOAS_CHANGE_PROCESS
2024-11-06 18:08 ` Jason Gunthorpe
@ 2024-11-06 18:55 ` Steven Sistare
0 siblings, 0 replies; 11+ messages in thread
From: Steven Sistare @ 2024-11-06 18:55 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: iommu, Kevin Tian, Alex Williamson, Cornelia Huck
On 11/6/2024 1:08 PM, Jason Gunthorpe wrote:
> On Mon, Oct 07, 2024 at 10:13:22AM -0700, Steve Sistare wrote:
>> diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
>> index c484981..523de8c 100644
>> --- a/include/uapi/linux/iommufd.h
>> +++ b/include/uapi/linux/iommufd.h
>> @@ -52,6 +52,7 @@ enum {
>> IOMMUFD_CMD_HWPT_INVALIDATE = 0x8d,
>> IOMMUFD_CMD_FAULT_QUEUE_ALLOC = 0x8e,
>> IOMMUFD_CMD_IOAS_MAP_FILE = 0x8f,
>> + IOMMUFD_CMD_IOAS_CHANGE_PROCESS = 0x90,
>
> I'm going to give 0x90 to the nesting series
>
> IOMMUFD_CMD_FAULT_QUEUE_ALLOC = 0x8e,
> IOMMUFD_CMD_IOAS_MAP_FILE = 0x8f,
> IOMMUFD_CMD_VIOMMU_ALLOC = 0x90,
> IOMMUFD_CMD_VDEVICE_ALLOC = 0x91,
> };
>
> Can you use
>
> IOMMUFD_CMD_IOAS_CHANGE_PROCESS = 0x92
Sure, will do - steve
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH V3 2/4] iommufd: Lock all objects
2024-10-07 17:13 ` [PATCH V3 2/4] iommufd: Lock all objects Steve Sistare
@ 2024-11-07 20:23 ` Jason Gunthorpe
0 siblings, 0 replies; 11+ messages in thread
From: Jason Gunthorpe @ 2024-11-07 20:23 UTC (permalink / raw)
To: Steve Sistare; +Cc: iommu, Kevin Tian, Alex Williamson, Cornelia Huck
On Mon, Oct 07, 2024 at 10:13:21AM -0700, Steve Sistare wrote:
> Define helpers to lock and unlock all iommufd_ctx objects.
> This will allow DMA mappings to be updated atomically during live update.
> This code is substantially the same as an initial version provided by
> Jason, plus fixes.
>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
> drivers/iommu/iommufd/ioas.c | 65 +++++++++++++++++++++++++++++++++
> drivers/iommu/iommufd/iommufd_private.h | 1 +
> drivers/iommu/iommufd/main.c | 1 +
> 3 files changed, 67 insertions(+)
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Jason
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH V3 3/4] iommufd: Add IOMMU_IOAS_CHANGE_PROCESS
2024-10-07 17:13 ` [PATCH V3 3/4] iommufd: Add IOMMU_IOAS_CHANGE_PROCESS Steve Sistare
2024-11-06 18:08 ` Jason Gunthorpe
@ 2024-11-07 20:25 ` Jason Gunthorpe
1 sibling, 0 replies; 11+ messages in thread
From: Jason Gunthorpe @ 2024-11-07 20:25 UTC (permalink / raw)
To: Steve Sistare; +Cc: iommu, Kevin Tian, Alex Williamson, Cornelia Huck
On Mon, Oct 07, 2024 at 10:13:22AM -0700, Steve Sistare wrote:
> Add an ioctl that updates all DMA mappings to reflect the current process,
> Change the mm and transfer locked memory accounting from old to current mm.
> This will be used for live update, allowing an old process to hand the
> iommufd device descriptor to a new process. The new process calls the
> ioctl.
>
> IOMMU_IOAS_CHANGE_PROCESS only supports DMA mappings created with
> IOMMU_IOAS_MAP_FILE, because the kernel metadata for such mappings does
> not depend on the userland VA of the pages (which is different in the new
> process).
> IOMMU_IOAS_CHANGE_PROCESS fails if other types of mappings are present.
>
> This is a revised version of code originally provided by Jason.
>
> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
> drivers/iommu/iommufd/io_pagetable.h | 1 +
> drivers/iommu/iommufd/ioas.c | 142 ++++++++++++++++++++++++++++++++
> drivers/iommu/iommufd/iommufd_private.h | 1 +
> drivers/iommu/iommufd/main.c | 2 +
> include/uapi/linux/iommufd.h | 23 ++++++
> 5 files changed, 169 insertions(+)
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Jason
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH V3 4/4] iommufd: IOMMU_IOAS_CHANGE_PROCESS selftest
2024-10-07 17:13 ` [PATCH V3 4/4] iommufd: IOMMU_IOAS_CHANGE_PROCESS selftest Steve Sistare
@ 2024-11-07 20:26 ` Jason Gunthorpe
2024-11-07 20:47 ` Steven Sistare
0 siblings, 1 reply; 11+ messages in thread
From: Jason Gunthorpe @ 2024-11-07 20:26 UTC (permalink / raw)
To: Steve Sistare; +Cc: iommu, Kevin Tian, Alex Williamson, Cornelia Huck
On Mon, Oct 07, 2024 at 10:13:23AM -0700, Steve Sistare wrote:
> Add selftest cases for IOMMU_IOAS_CHANGE_PROCESS.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
> tools/testing/selftests/iommu/Makefile | 1 +
> tools/testing/selftests/iommu/iommufd.c | 148 ++++++++++++++++++++++++--
> tools/testing/selftests/iommu/iommufd_utils.h | 34 +++---
> 3 files changed, 163 insertions(+), 20 deletions(-)
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Jason
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH V3 4/4] iommufd: IOMMU_IOAS_CHANGE_PROCESS selftest
2024-11-07 20:26 ` Jason Gunthorpe
@ 2024-11-07 20:47 ` Steven Sistare
0 siblings, 0 replies; 11+ messages in thread
From: Steven Sistare @ 2024-11-07 20:47 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: iommu, Kevin Tian, Alex Williamson, Cornelia Huck
On 11/7/2024 3:26 PM, Jason Gunthorpe wrote:
> On Mon, Oct 07, 2024 at 10:13:23AM -0700, Steve Sistare wrote:
>> Add selftest cases for IOMMU_IOAS_CHANGE_PROCESS.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> ---
>> tools/testing/selftests/iommu/Makefile | 1 +
>> tools/testing/selftests/iommu/iommufd.c | 148 ++++++++++++++++++++++++--
>> tools/testing/selftests/iommu/iommufd_utils.h | 34 +++---
>> 3 files changed, 163 insertions(+), 20 deletions(-)
>
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Cool, thanks!
I'll post V4 with minor changes to the self test code.
- Steve
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2024-11-07 20:47 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-07 17:13 [PATCH V3 0/4] iommufd live update Steve Sistare
2024-10-07 17:13 ` [PATCH V3 1/4] iommufd: Export do_update_pinned Steve Sistare
2024-10-07 17:13 ` [PATCH V3 2/4] iommufd: Lock all objects Steve Sistare
2024-11-07 20:23 ` Jason Gunthorpe
2024-10-07 17:13 ` [PATCH V3 3/4] iommufd: Add IOMMU_IOAS_CHANGE_PROCESS Steve Sistare
2024-11-06 18:08 ` Jason Gunthorpe
2024-11-06 18:55 ` Steven Sistare
2024-11-07 20:25 ` Jason Gunthorpe
2024-10-07 17:13 ` [PATCH V3 4/4] iommufd: IOMMU_IOAS_CHANGE_PROCESS selftest Steve Sistare
2024-11-07 20:26 ` Jason Gunthorpe
2024-11-07 20:47 ` Steven Sistare
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox