* [RFC PATCH v2 00/32] Add live update state preservation
@ 2025-12-02 23:02 Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 01/32] iommufd: Allow HWPTs to have a NULL IOAS Samiullah Khawaja
` (32 more replies)
0 siblings, 33 replies; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-02 23:02 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: Samiullah Khawaja, Robin Murphy, Pratyush Yadav, Kevin Tian,
Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, YiFei Zhu, Chris Li, praan
Hi,
This RFC patch series introduces a mechanism for IOMMU state
preservation across live update, using the Intel VT-d driver as the
initial example implementation and demonstration platform.
Please take a look at the following LWN article to learn about KHO and
Live Update Orchestrator:
https://lwn.net/Articles/1033364/
This work is based on the LUOv8 that is in linux-next. Please find the
details of various LUO sessions, file handlers and FLB preservation
callbacks, and memory preservation mechanisms in the LUOv8 series.
https://lore.kernel.org/linux-mm/20251125165850.3389713-1-pasha.tatashin@soleen.com/
This series also uses the VFIO cdev preservation support series v2 that
is sent out for review separately:
https://lore.kernel.org/all/20251126193608.2678510-1-dmatlack@google.com/
The kernel tree with all dependencies is uploaded to the following
GitHub location:
https://github.com/samikhawaja/linux/tree/iommu/rfc-v2
Overall Goals:
The goal of this effort is to preserve the IOMMU domains, managed by
iommufd, attached to preserved VFIO. This allows DMA mappings and IOMMU
context of a device assigned to a VM to be maintained across a live
update.
This is achieved by preserving IOMMU page tables using Generic Page
Table support, IOMMU root table and the relevant context entries across
live update.
Current Implementation, Scope and Limitations:
This RFC is based on the newer LUO, unlike the previous version, and
includes a major rework of lifecycle management.
It includes preservation and restoration of the following state:
- The iommufd and its HWPTs that are marked by the user for preservation.
- The iommu domains associated with the preserved HWPTs.
- The page tables of those iommu domains.
- The preserved device context and domain IDs of the VFIO cdev.
- The IOMMU translation units associated with the preserved devices.
It also includes a selftest to validate the end-to-end preservation and
restoration of an iommufd and vfio cdev file descriptor.
This version does not implement hotswap of the iommu domain in the IOMMU
driver. That will be done as a separate patch series.
The series also does not yet include a versioning scheme for the
persisted state; this will be added later. Currently the structs for the
persisted state are defined in include/linux/kho/abi/iommu.h and
include/linux/kho/abi/iommufd.h.
Architectural Overview:
The target architecture for IOMMU state preservation across a live
update involves coordination between the Live Update Orchestrator,
iommufd, and the IOMMU drivers.
The core design uses the Live Update Orchestrator's file descriptor
preservation mechanism to preserve iommufd file descriptors. The user
marks the iommufd HWPTs for preservation using a new ioctl added in this
series. Once done, the preservation of iommufd inside an LUO session is
triggered using LUO ioctls. During preservation, the LUO preserve
callback for an iommufd walks through the HWPTs it manages to identify
the ones that need to be preserved. Once identified, a new IOMMU core
API is used to preserve the iommu domain. The IOMMU core uses Generic
Page Table to preserve the page tables of these domains. The domains are
then marked as preserved.
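As a rough illustration of the preserve walk described above, the
following self-contained C sketch models it in userspace. All names
(sketch_hwpt, sketch_domain, sketch_domain_preserve,
sketch_iommufd_preserve) are hypothetical stand-ins and not the symbols
added by this series; the rollback on failure is likewise an assumption
about how an unpreserve path might be used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the kernel's real structures. */
struct sketch_domain {
	bool preserved;		/* set once the page tables were preserved */
};

struct sketch_hwpt {
	bool lu_preserved;	/* set when the user marks the HWPT */
	struct sketch_domain *domain;
};

/* Stand-in for the IOMMU core call that preserves one iommu domain. */
static int sketch_domain_preserve(struct sketch_domain *domain)
{
	if (!domain)
		return -1;
	domain->preserved = true;
	return 0;
}

/*
 * Walk every HWPT and preserve the domain of each marked one.
 * Returns the number of preserved domains, or -1 on failure after
 * undoing the domains preserved so far (assumed rollback policy).
 */
static int sketch_iommufd_preserve(struct sketch_hwpt *hwpts, size_t n)
{
	int count = 0;
	size_t i;

	assert(hwpts || n == 0);

	for (i = 0; i < n; i++) {
		if (!hwpts[i].lu_preserved)
			continue;	/* unmarked HWPTs are skipped */
		if (sketch_domain_preserve(hwpts[i].domain)) {
			/* Undo earlier entries, newest first. */
			while (i-- > 0)
				if (hwpts[i].lu_preserved && hwpts[i].domain)
					hwpts[i].domain->preserved = false;
			return -1;
		}
		count++;
	}
	return count;
}
```

In this model only HWPTs with lu_preserved set contribute a preserved
domain; everything else is left untouched.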
When the user triggers the preservation of a VFIO cdev that is attached
to an iommufd that is preserved, the device attachment state of that
VFIO cdev is also preserved using an API exported by iommufd. IOMMUFD
fetches all the information that needs to be preserved and calls the
IOMMU core API to preserve the device state. The IOMMU core also
preserves the state of the IOMMU that is associated with this device.
The IOMMU core has an LUO FLB registered with the iommufd LUO file handler
so that the preserved iommu domain and iommu hardware unit state is
available during boot for early restore in the next kernel.
During boot, the driver fetches the preserved state from the IOMMU core
and restores the state of the preserved IOMMUs. Later, when the IOMMU core
goes through the devices and probes them, the iommu domains of preserved
devices are restored and the preserved devices are attached to them.
Later, during iommufd retrieval, the preserved HWPTs are restored and
associated with the restored iommu domains. The preserved iommufd cannot
finish (can_finish fails) until all of the following are true:
- The iommufd is retrieved.
- The preserved HWPTs are restored.
- The restored iommu domains no longer have attachments, i.e. they have
been replaced.
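The finish gate can be sketched as a simple predicate. This is a minimal
userspace model, not the code in this series: the struct and function
names are hypothetical. In line with the v2 changelog note that the
iommufd file handler cannot finish until domains are replaced, the sketch
treats a restored domain that still has attachments as blocking finish.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one preserved HWPT after the live update. */
struct sketch_restored_hwpt {
	bool restored;			/* HWPT re-created and bound to its domain */
	unsigned int nr_attachments;	/* devices still attached to that domain */
};

/* Hypothetical model of the per-iommufd live update state. */
struct sketch_iommufd_lu {
	bool retrieved;			/* userspace retrieved the iommufd FD */
	const struct sketch_restored_hwpt *hwpts;
	size_t nr_hwpts;
};

/* Returns true only when the session may finish for this iommufd. */
static bool sketch_can_finish(const struct sketch_iommufd_lu *lu)
{
	size_t i;

	assert(lu && (lu->hwpts || lu->nr_hwpts == 0));

	if (!lu->retrieved)
		return false;

	for (i = 0; i < lu->nr_hwpts; i++) {
		if (!lu->hwpts[i].restored)
			return false;
		/* A restored domain still in use has not been replaced yet. */
		if (lu->hwpts[i].nr_attachments > 0)
			return false;
	}
	return true;
}
```

The predicate has no side effects, so in this model userspace could retry
FINISH after completing the hotswap.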
Hotswap:
Userspace restores the iommufd, creates a new HWPT and sets up DMA
mappings on it. Once the HWPT is fully set up, the device can be attached
to the new HWPT to do a hotswap; that is, the underlying iommu domains
are replaced. Once done, the restored HWPT and the associated restored
iommu domain can be destroyed using the HWPT ID.
In this series the IOMMU core allows replacement of restored iommu
domains, so that attaching a new HWPT doesn't return -EBUSY.
This could perhaps be avoided by using the existing hwpt replace logic in
iommufd, by:
- Using the replace code path if the vfio_device is restored.
- Reconstructing the vfio_device<->hwpt association during vfio-cdev
restore. During vfio-cdev restore when it autobinds with the iommufd,
it also re-attaches itself to the restored hwpts that it was attached
to before. The new attachment is immutable and can only be replaced by
a new HWPT.
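The replacement rule used in this series (a restored domain may be
replaced, so the attach does not return -EBUSY) can be modelled with a
toy attach function. This is a deliberately simplified sketch: the types
and sketch_attach() are hypothetical, and the real attach path has more
state than a single pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: a domain only records whether it was restored. */
struct sketch_domain {
	bool restored;		/* true for domains restored after live update */
};

/* Hypothetical model: a device tracks its currently attached domain. */
struct sketch_device {
	struct sketch_domain *domain;
};

/*
 * Attach new_domain to dev. A device that already has an ordinary
 * domain is busy; a device holding a restored domain may be switched
 * over directly, which is the hotswap case.
 */
static int sketch_attach(struct sketch_device *dev,
			 struct sketch_domain *new_domain)
{
	assert(dev && new_domain);

	if (dev->domain && !dev->domain->restored)
		return -EBUSY;

	dev->domain = new_domain;
	return 0;
}
```

After the swap the device holds an ordinary domain, so a further plain
attach is rejected again in this model.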
Tested:
This series was tested using QEMU with virtual IOMMU (VT-d) support. The
workflow was validated using a guest with virtio-net device bound to the
vfio-pci driver.
The new iommufd_liveupdate selftest was used to verify the end-to-end
preservation and restore logic.
Future Work:
- Implement IOMMU domain replacement for Intel IOMMU Driver.
- Add support for preserving PASID tables for devices that use them.
- Auto bind VFIO cdev during retrieve using preserved iommufd token
fetched from LUO using internal API.
- Implement a versioning scheme for serialized data to ensure
compatibility across kernel versions.
- Extend support to other IOMMU architectures (e.g., AMD-Vi, Arm SMMUv3).
- Only accept memfds that are sealed for iommufd preservation.
- Keep preserved memfd immutable after retrieve.
- Make HWPTs immutable after iommufd is preserved.
High-Level Sequence Flow:
The following diagrams illustrate the high-level interactions during the
preservation and restore phases.
Prepare:
Before live update the PREPARE event of the Live Update Orchestrator invokes
callbacks of the registered file and subsystem handlers.
Userspace (VMM)  | LUO Core |    iommufd    |  IOMMU Core   | IOMMU Driver
-----------------|----------|---------------|---------------|-------------
                 |          |               |               |
MARK_HWPT        |          |               |               |
---------------------------->               |               |
                 |          | Mark HWPT as  |               |
                 |          | lu_preserved  |               |
                 |          |               |               |
PRESERVE         |          |               |               |
iommufd_fd       |          |               |               |
----------------->          |               |               |
                 | preserve |               |               |
                 |---------->               |               |
                 |          | For each HWPT |               |
                 |          |--------------->               |
                 |          |               | domain_presrv |
                 |          |               |--------------->
                 |          |               |               | gpt(preserve)
                 |          |               |<--------------|
                 |          |<--------------|               |
                 |<---------|               |               |
                 |          |               |               |
...              |          |               |               |
                 |          |               |               |
PRESERVE,        |          |               |               |
vfio_cdev_fd     |          |               |               |
----------------->          |               |               |
                 | preserve |               |               |
                 |---------->               |               |
                 |          |               |               |
                 |          | Get domain    |               |
                 |          | iommu_preserv |               |
                 |          | _device()     |               |
                 |          |--------------->               |
                 |          |               | preserve      |
                 |          |               | (iommu_hw)    |
                 |          |               |--------------->
                 |          |               |               | preserve(root)
                 |          |               |<--------------|
                 |          |               |               |
                 |          |               | preserve      |
                 |          |               | _device(dev)  |
                 |          |               |--------------->
                 |          |               |               |
                 |          |               |<--------------|
                 |          |<--------------|               |
                 |<---------|               |               |
Restore:
After a live update, the preserved state is restored during boot and
when userspace retrieves the preserved FDs.
Userspace (VMM)  | LUO Core |    iommufd    |  IOMMU Core   | IOMMU Driver
-----------------|----------|---------------|---------------|-------------
                 |          |               |               |
                 |          |               |               | Restore
                 |          |               |               | Root, DIDs
                 |          |               |               |
                 |          |               |               | Register
                 |          |               | probe devices |
                 |          |               |               |
                 |          |               | restore       |
                 |          |               | domain        |
                 |          |               |--------------->
                 |          |               |               | restore
                 |          |               | reattach      |
                 |          |               | domain        |
                 |          |               |--------------->
                 |          |               |               |
RETRIEVE         |          |               |               |
iommufd_fd       |          |               |               |
----------------->          |               |               |
                 | retrieve |               |               |
                 |---------->               |               |
                 |          |               |               |
RETRIEVE         |          |               |               |
HWPT             |          |               |               |
---------------------------->               |               |
                 |          | get domain    |               |
                 |          |--------------->               |
                 |          |               |               |
                 |          |<--------------|               |
                 |<---------|               |               |
                 |          |               |               |
                 |      ... (create new HWPT, map) ...      |
                 |          |               |               |
ATTACH PT        |          |               |               |
---------------------------->               |               |
                 |          | replace hwpt  |               |
                 |          |--------------->               |
                 |          |               | replace domain|
                 |          |<--------------|               |
                 |<---------|               |               |
FINISH session   |          |               |               |
----------------->          |               |               |
                 | can      |               |               |
                 | finish   |               |               |
                 |---------->               |               |
                 |          | has           |               |
                 |          | attachments   |               |
                 |          |--------------->               |
                 |          |               | return false  |
                 |          |<--------------|               |
                 | finish   |               |               |
                 |---------->               |               |
                 |          | finish        |               |
                 |          |--------------->               |
                 |          |               |               |
                 |          |<--------------|               |
Looking forward to your feedback on this.
v2:
- Reworked for LUOv8
- HWPTs are marked for preservation by user.
- Use GPT to preserve iommu domain page tables.
- iommufd file handler cannot finish until domains are replaced.
- Device iommu context preservation is triggered through VFIO cdev
preservation.
- Use LUO FLB instead of subsystems and preserve state synchronously
instead of staging it.
Samiullah Khawaja (26):
iommu: Add liveupdate FLB for IOMMU state preservation
iommu: Register IOMMU FLB with iommufd file handler
iommu: Implement IOMMU LU FLB preserve/unpreserve callbacks
iommu: Add iommu_domain ops to preserve, unpreserve and restore
iommu/pages: Add APIs to preserve/unpreserve/restore iommu pages
iommupt: Implement preserve/unpreserve/restore callbacks
iommu: Add APIs to preserve/unpreserve iommu domains
iommufd: Use the iommu_domain_preserve/unpreserve APIs
iommu: Add API to keep track of iommu domain attachments
iommu: Add API to preserve/unpreserve a device
iommu/vt-d: Implement device and iommu preserve/unpreserve ops
iommufd: Add APIs to preserve/unpreserve a vfio cdev
vfio/pci: Preserve the iommufd state of the vfio cdev
iommu: Add APIs to get preserved state of a device
iommu/vt-d: Clean the context entries of unpreserved devices
iommu: Implement IOMMU FLB retrieve and finish ops
iommu: Add an API get the preserved state of an IOMMU
iommu/vt-d: restore state of the preserved IOMMU
iommu: Add helper APIs to fetch preserved device state
iommu/vt-d: reclaim domain ids of the preserved devices
iommu: restore preserved domain and reattach
iommu/vt-d: reuse the preserved domain id for preserved devices
iommufd: Handle the iommufd can_finish properly
iommu: Transfer device ownership after liveupdate
iommu: Allow replacing restored domain
iommufd/selftest: Add test to verify iommufd preservation
YiFei Zhu (6):
iommufd: Allow HWPTs to have a NULL IOAS
iommufd: split alloc and domain assign from iommufd_hwpt_paging_alloc
iommufd: Add basic skeleton based on liveupdate_file_handler
iommufd-lu: Implement basic prepare/cancel/finish/retrieve using
folios
iommufd-lu: Implement ioctl to let userspace mark an HWPT to be
preserved
iommufd-lu: Persist iommu hardware pagetables for live update
MAINTAINERS | 1 +
drivers/iommu/Makefile | 1 +
drivers/iommu/generic_pt/iommu_pt.h | 100 ++++
drivers/iommu/intel/Makefile | 1 +
drivers/iommu/intel/iommu.c | 111 +++-
drivers/iommu/intel/iommu.h | 15 +-
drivers/iommu/intel/liveupdate.c | 168 ++++++
drivers/iommu/intel/nested.c | 2 +-
drivers/iommu/iommu-pages.c | 70 +++
drivers/iommu/iommu-pages.h | 8 +
drivers/iommu/iommu.c | 88 ++-
drivers/iommu/iommufd/Makefile | 1 +
drivers/iommu/iommufd/device.c | 56 +-
drivers/iommu/iommufd/hw_pagetable.c | 89 ++--
drivers/iommu/iommufd/iommufd_private.h | 44 ++
drivers/iommu/iommufd/liveupdate.c | 424 +++++++++++++++
drivers/iommu/iommufd/main.c | 39 +-
drivers/iommu/liveupdate.c | 500 ++++++++++++++++++
drivers/vfio/iommufd.c | 7 +-
drivers/vfio/pci/vfio_pci_liveupdate.c | 24 +-
include/linux/generic_pt/iommu.h | 10 +
include/linux/iommu-lu.h | 92 ++++
include/linux/iommu.h | 36 +-
include/linux/iommufd.h | 9 +-
include/linux/kho/abi/iommu.h | 119 +++++
include/linux/kho/abi/iommufd.h | 39 ++
include/linux/kho/abi/vfio_pci.h | 10 +
include/linux/vfio.h | 4 +
include/uapi/linux/iommufd.h | 41 ++
tools/testing/selftests/iommu/Makefile | 1 +
.../selftests/iommu/iommufd_liveupdate.c | 291 ++++++++++
31 files changed, 2318 insertions(+), 83 deletions(-)
create mode 100644 drivers/iommu/intel/liveupdate.c
create mode 100644 drivers/iommu/iommufd/liveupdate.c
create mode 100644 drivers/iommu/liveupdate.c
create mode 100644 include/linux/iommu-lu.h
create mode 100644 include/linux/kho/abi/iommu.h
create mode 100644 include/linux/kho/abi/iommufd.h
create mode 100644 tools/testing/selftests/iommu/iommufd_liveupdate.c
base-commit: ef68bf704646690aba5e81c2f7be8d6ef13d7ad8
--
2.52.0.158.g65b55ccf14-goog
* [RFC PATCH v2 01/32] iommufd: Allow HWPTs to have a NULL IOAS
2025-12-02 23:02 [RFC PATCH v2 00/32] Add live update state preservation Samiullah Khawaja
@ 2025-12-02 23:02 ` Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 02/32] iommufd: split alloc and domain assign from iommufd_hwpt_paging_alloc Samiullah Khawaja
` (31 subsequent siblings)
32 siblings, 0 replies; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-02 23:02 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: YiFei Zhu, Robin Murphy, Pratyush Yadav, Samiullah Khawaja,
Kevin Tian, Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, Chris Li, praan
From: YiFei Zhu <zhuyifei@google.com>
Normally HWPTs are created with a parent IOAS to allow the mappings
to be modified. For liveupdate we want an immutable HWPT upon restore,
so no IOAS is needed. This patch prepares iommufd so that it does not
crash on a NULL hwpt_paging->ioas.
Signed-off-by: YiFei Zhu <zhuyifei@google.com>
---
drivers/iommu/iommufd/device.c | 11 ++++++++---
drivers/iommu/iommufd/hw_pagetable.c | 15 +++++++++++++--
2 files changed, 21 insertions(+), 5 deletions(-)
diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 4c842368289f..ba4d9c3cfa8b 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -418,6 +418,7 @@ iommufd_device_attach_reserved_iova(struct iommufd_device *idev,
lockdep_assert_held(&igroup->lock);
+ /* unreachable if !hwpt_paging->ioas */
rc = iopt_table_enforce_dev_resv_regions(&hwpt_paging->ioas->iopt,
idev->dev,
&igroup->sw_msi_start);
@@ -603,7 +604,7 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
struct iommufd_device *idev, ioasid_t pasid)
{
struct iommufd_hwpt_paging *hwpt_paging = find_hwpt_paging(hwpt);
- bool attach_resv = hwpt_paging && pasid == IOMMU_NO_PASID;
+ bool attach_resv = hwpt_paging && pasid == IOMMU_NO_PASID && hwpt_paging->ioas;
struct iommufd_group *igroup = idev->igroup;
struct iommufd_hw_pagetable *old_hwpt;
struct iommufd_attach *attach;
@@ -707,7 +708,7 @@ iommufd_hw_pagetable_detach(struct iommufd_device *idev, ioasid_t pasid)
xa_erase(&igroup->pasid_attach, pasid);
kfree(attach);
}
- if (hwpt_paging && pasid == IOMMU_NO_PASID)
+ if (hwpt_paging && pasid == IOMMU_NO_PASID && hwpt_paging->ioas)
iopt_remove_reserved_iova(&hwpt_paging->ioas->iopt, idev->dev);
mutex_unlock(&igroup->lock);
@@ -739,6 +740,9 @@ iommufd_group_remove_reserved_iova(struct iommufd_group *igroup,
lockdep_assert_held(&igroup->lock);
+ if (!hwpt_paging->ioas)
+ return;
+
attach = xa_load(&igroup->pasid_attach, IOMMU_NO_PASID);
xa_for_each(&attach->device_array, index, cur)
iopt_remove_reserved_iova(&hwpt_paging->ioas->iopt, cur->dev);
@@ -756,6 +760,7 @@ iommufd_group_do_replace_reserved_iova(struct iommufd_group *igroup,
lockdep_assert_held(&igroup->lock);
+ /* unreachable if !hwpt_paging->ioas */
attach = xa_load(&igroup->pasid_attach, IOMMU_NO_PASID);
old_hwpt_paging = find_hwpt_paging(attach->hwpt);
if (!old_hwpt_paging || hwpt_paging->ioas != old_hwpt_paging->ioas) {
@@ -782,7 +787,7 @@ iommufd_device_do_replace(struct iommufd_device *idev, ioasid_t pasid,
struct iommufd_hw_pagetable *hwpt)
{
struct iommufd_hwpt_paging *hwpt_paging = find_hwpt_paging(hwpt);
- bool attach_resv = hwpt_paging && pasid == IOMMU_NO_PASID;
+ bool attach_resv = hwpt_paging && pasid == IOMMU_NO_PASID && hwpt_paging->ioas;
struct iommufd_hwpt_paging *old_hwpt_paging;
struct iommufd_group *igroup = idev->igroup;
struct iommufd_hw_pagetable *old_hwpt;
diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c
index fe789c2dc0c9..78d2130e0061 100644
--- a/drivers/iommu/iommufd/hw_pagetable.c
+++ b/drivers/iommu/iommufd/hw_pagetable.c
@@ -23,6 +23,7 @@ void iommufd_hwpt_paging_destroy(struct iommufd_object *obj)
container_of(obj, struct iommufd_hwpt_paging, common.obj);
if (!list_empty(&hwpt_paging->hwpt_item)) {
+ /* unreachable if !hwpt_paging->ioas */
mutex_lock(&hwpt_paging->ioas->mutex);
list_del(&hwpt_paging->hwpt_item);
mutex_unlock(&hwpt_paging->ioas->mutex);
@@ -32,7 +33,9 @@ void iommufd_hwpt_paging_destroy(struct iommufd_object *obj)
}
__iommufd_hwpt_destroy(&hwpt_paging->common);
- refcount_dec(&hwpt_paging->ioas->obj.users);
+
+ if (hwpt_paging->ioas)
+ refcount_dec(&hwpt_paging->ioas->obj.users);
}
void iommufd_hwpt_paging_abort(struct iommufd_object *obj)
@@ -41,9 +44,11 @@ void iommufd_hwpt_paging_abort(struct iommufd_object *obj)
container_of(obj, struct iommufd_hwpt_paging, common.obj);
/* The ioas->mutex must be held until finalize is called. */
- lockdep_assert_held(&hwpt_paging->ioas->mutex);
+ if (hwpt_paging->ioas)
+ lockdep_assert_held(&hwpt_paging->ioas->mutex);
if (!list_empty(&hwpt_paging->hwpt_item)) {
+ /* unreachable if !hwpt_paging->ioas */
list_del_init(&hwpt_paging->hwpt_item);
iopt_table_remove_domain(&hwpt_paging->ioas->iopt,
hwpt_paging->common.domain);
@@ -457,6 +462,9 @@ int iommufd_hwpt_set_dirty_tracking(struct iommufd_ucmd *ucmd)
return PTR_ERR(hwpt_paging);
ioas = hwpt_paging->ioas;
+ if (!ioas)
+ return -EINVAL;
+
enable = cmd->flags & IOMMU_HWPT_DIRTY_TRACKING_ENABLE;
rc = iopt_set_dirty_tracking(&ioas->iopt, hwpt_paging->common.domain,
@@ -482,6 +490,9 @@ int iommufd_hwpt_get_dirty_bitmap(struct iommufd_ucmd *ucmd)
return PTR_ERR(hwpt_paging);
ioas = hwpt_paging->ioas;
+ if (!ioas)
+ return -EINVAL;
+
rc = iopt_read_and_clear_dirty_data(
&ioas->iopt, hwpt_paging->common.domain, cmd->flags, cmd);
--
2.52.0.158.g65b55ccf14-goog
* [RFC PATCH v2 02/32] iommufd: split alloc and domain assign from iommufd_hwpt_paging_alloc
2025-12-02 23:02 [RFC PATCH v2 00/32] Add live update state preservation Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 01/32] iommufd: Allow HWPTs to have a NULL IOAS Samiullah Khawaja
@ 2025-12-02 23:02 ` Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 03/32] iommufd: Add basic skeleton based on liveupdate_file_handler Samiullah Khawaja
` (30 subsequent siblings)
32 siblings, 0 replies; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-02 23:02 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: YiFei Zhu, Robin Murphy, Pratyush Yadav, Samiullah Khawaja,
Kevin Tian, Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, Chris Li, praan
From: YiFei Zhu <zhuyifei@google.com>
To avoid code duplication, this code is split off into smaller
functions that may also be called by liveupdate.
Signed-off-by: YiFei Zhu <zhuyifei@google.com>
---
drivers/iommu/iommufd/hw_pagetable.c | 74 +++++++++++++++----------
drivers/iommu/iommufd/iommufd_private.h | 4 ++
2 files changed, 50 insertions(+), 28 deletions(-)
diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c
index 78d2130e0061..a528f84ad429 100644
--- a/drivers/iommu/iommufd/hw_pagetable.c
+++ b/drivers/iommu/iommufd/hw_pagetable.c
@@ -90,6 +90,29 @@ iommufd_hwpt_paging_enforce_cc(struct iommufd_hwpt_paging *hwpt_paging)
return 0;
}
+struct iommufd_hwpt_paging *
+_iommufd_hwpt_paging_alloc(struct iommufd_ctx *ictx)
+{
+ struct iommufd_hwpt_paging *hwpt_paging;
+
+ hwpt_paging = __iommufd_object_alloc(
+ ictx, hwpt_paging, IOMMUFD_OBJ_HWPT_PAGING, common.obj);
+ if (IS_ERR(hwpt_paging))
+ return ERR_CAST(hwpt_paging);
+
+ INIT_LIST_HEAD(&hwpt_paging->hwpt_item);
+
+ return hwpt_paging;
+}
+
+void iommufd_hwpt_init_from_domain(struct iommufd_hw_pagetable *hwpt,
+ struct iommu_domain *domain)
+{
+ hwpt->domain = domain;
+ domain->iommufd_hwpt = hwpt;
+ domain->cookie_type = IOMMU_COOKIE_IOMMUFD;
+}
+
/**
* iommufd_hwpt_paging_alloc() - Get a PAGING iommu_domain for a device
* @ictx: iommufd context
@@ -122,6 +145,7 @@ iommufd_hwpt_paging_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas,
const struct iommu_ops *ops = dev_iommu_ops(idev->dev);
struct iommufd_hwpt_paging *hwpt_paging;
struct iommufd_hw_pagetable *hwpt;
+ struct iommu_domain *domain;
int rc;
lockdep_assert_held(&ioas->mutex);
@@ -137,38 +161,34 @@ iommufd_hwpt_paging_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas,
(flags & IOMMU_HWPT_ALLOC_NEST_PARENT))
return ERR_PTR(-EOPNOTSUPP);
- hwpt_paging = __iommufd_object_alloc(
- ictx, hwpt_paging, IOMMUFD_OBJ_HWPT_PAGING, common.obj);
+ hwpt_paging = _iommufd_hwpt_paging_alloc(ictx);
if (IS_ERR(hwpt_paging))
return ERR_CAST(hwpt_paging);
+
hwpt = &hwpt_paging->common;
hwpt->pasid_compat = flags & IOMMU_HWPT_ALLOC_PASID;
- INIT_LIST_HEAD(&hwpt_paging->hwpt_item);
/* Pairs with iommufd_hw_pagetable_destroy() */
refcount_inc(&ioas->obj.users);
hwpt_paging->ioas = ioas;
hwpt_paging->nest_parent = flags & IOMMU_HWPT_ALLOC_NEST_PARENT;
if (ops->domain_alloc_paging_flags) {
- hwpt->domain = ops->domain_alloc_paging_flags(idev->dev,
+ domain = ops->domain_alloc_paging_flags(idev->dev,
flags & ~IOMMU_HWPT_FAULT_ID_VALID, user_data);
- if (IS_ERR(hwpt->domain)) {
- rc = PTR_ERR(hwpt->domain);
- hwpt->domain = NULL;
+ if (IS_ERR(domain)) {
+ rc = PTR_ERR(domain);
goto out_abort;
}
- hwpt->domain->owner = ops;
+ domain->owner = ops;
} else {
- hwpt->domain = iommu_paging_domain_alloc(idev->dev);
- if (IS_ERR(hwpt->domain)) {
- rc = PTR_ERR(hwpt->domain);
- hwpt->domain = NULL;
+ domain = iommu_paging_domain_alloc(idev->dev);
+ if (IS_ERR(domain)) {
+ rc = PTR_ERR(domain);
goto out_abort;
}
}
- hwpt->domain->iommufd_hwpt = hwpt;
- hwpt->domain->cookie_type = IOMMU_COOKIE_IOMMUFD;
+ iommufd_hwpt_init_from_domain(hwpt, domain);
/*
* Set the coherency mode before we do iopt_table_add_domain() as some
@@ -237,6 +257,7 @@ iommufd_hwpt_nested_alloc(struct iommufd_ctx *ictx,
const struct iommu_ops *ops = dev_iommu_ops(idev->dev);
struct iommufd_hwpt_nested *hwpt_nested;
struct iommufd_hw_pagetable *hwpt;
+ struct iommu_domain *domain;
int rc;
if ((flags & ~(IOMMU_HWPT_FAULT_ID_VALID | IOMMU_HWPT_ALLOC_PASID)) ||
@@ -256,17 +277,15 @@ iommufd_hwpt_nested_alloc(struct iommufd_ctx *ictx,
refcount_inc(&parent->common.obj.users);
hwpt_nested->parent = parent;
- hwpt->domain = ops->domain_alloc_nested(
+ domain = ops->domain_alloc_nested(
idev->dev, parent->common.domain,
flags & ~IOMMU_HWPT_FAULT_ID_VALID, user_data);
- if (IS_ERR(hwpt->domain)) {
- rc = PTR_ERR(hwpt->domain);
- hwpt->domain = NULL;
+ if (IS_ERR(domain)) {
+ rc = PTR_ERR(domain);
goto out_abort;
}
- hwpt->domain->owner = ops;
- hwpt->domain->iommufd_hwpt = hwpt;
- hwpt->domain->cookie_type = IOMMU_COOKIE_IOMMUFD;
+ iommufd_hwpt_init_from_domain(hwpt, domain);
+ domain->owner = ops;
if (WARN_ON_ONCE(hwpt->domain->type != IOMMU_DOMAIN_NESTED)) {
rc = -EOPNOTSUPP;
@@ -294,6 +313,7 @@ iommufd_viommu_alloc_hwpt_nested(struct iommufd_viommu *viommu, u32 flags,
{
struct iommufd_hwpt_nested *hwpt_nested;
struct iommufd_hw_pagetable *hwpt;
+ struct iommu_domain *domain;
int rc;
if (flags & ~(IOMMU_HWPT_FAULT_ID_VALID | IOMMU_HWPT_ALLOC_PASID))
@@ -314,16 +334,14 @@ iommufd_viommu_alloc_hwpt_nested(struct iommufd_viommu *viommu, u32 flags,
refcount_inc(&viommu->obj.users);
hwpt_nested->parent = viommu->hwpt;
- hwpt->domain = viommu->ops->alloc_domain_nested(
+ domain = viommu->ops->alloc_domain_nested(
viommu, flags & ~IOMMU_HWPT_FAULT_ID_VALID, user_data);
- if (IS_ERR(hwpt->domain)) {
- rc = PTR_ERR(hwpt->domain);
- hwpt->domain = NULL;
+ if (IS_ERR(domain)) {
+ rc = PTR_ERR(domain);
goto out_abort;
}
- hwpt->domain->iommufd_hwpt = hwpt;
- hwpt->domain->owner = viommu->iommu_dev->ops;
- hwpt->domain->cookie_type = IOMMU_COOKIE_IOMMUFD;
+ iommufd_hwpt_init_from_domain(hwpt, domain);
+ domain->owner = viommu->iommu_dev->ops;
if (WARN_ON_ONCE(hwpt->domain->type != IOMMU_DOMAIN_NESTED)) {
rc = -EOPNOTSUPP;
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index eb6d1a70f673..e43da269ab80 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -434,6 +434,10 @@ iommufd_get_hwpt_nested(struct iommufd_ucmd *ucmd, u32 id)
int iommufd_hwpt_set_dirty_tracking(struct iommufd_ucmd *ucmd);
int iommufd_hwpt_get_dirty_bitmap(struct iommufd_ucmd *ucmd);
+struct iommufd_hwpt_paging *
+_iommufd_hwpt_paging_alloc(struct iommufd_ctx *ictx);
+void iommufd_hwpt_init_from_domain(struct iommufd_hw_pagetable *hwpt,
+ struct iommu_domain *domain);
struct iommufd_hwpt_paging *
iommufd_hwpt_paging_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas,
struct iommufd_device *idev, ioasid_t pasid,
--
2.52.0.158.g65b55ccf14-goog
* [RFC PATCH v2 03/32] iommufd: Add basic skeleton based on liveupdate_file_handler
2025-12-02 23:02 [RFC PATCH v2 00/32] Add live update state preservation Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 01/32] iommufd: Allow HWPTs to have a NULL IOAS Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 02/32] iommufd: split alloc and domain assign from iommufd_hwpt_paging_alloc Samiullah Khawaja
@ 2025-12-02 23:02 ` Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 04/32] iommufd-lu: Implement basic prepare/cancel/finish/retrieve using folios Samiullah Khawaja
` (29 subsequent siblings)
32 siblings, 0 replies; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-02 23:02 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: YiFei Zhu, Samiullah Khawaja, Robin Murphy, Pratyush Yadav,
Kevin Tian, Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, Chris Li, praan
From: YiFei Zhu <zhuyifei@google.com>
No functionality is implemented in this commit. It only registers and
unregisters the struct liveupdate_file_handler for iommufd.
All operations are stubs that either return an error or are no-ops.
Signed-off-by: YiFei Zhu <zhuyifei@google.com>
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
---
drivers/iommu/iommufd/Makefile | 1 +
drivers/iommu/iommufd/iommufd_private.h | 15 ++++++
drivers/iommu/iommufd/liveupdate.c | 69 +++++++++++++++++++++++++
drivers/iommu/iommufd/main.c | 14 ++++-
4 files changed, 98 insertions(+), 1 deletion(-)
create mode 100644 drivers/iommu/iommufd/liveupdate.c
diff --git a/drivers/iommu/iommufd/Makefile b/drivers/iommu/iommufd/Makefile
index 71d692c9a8f4..f37830ff7229 100644
--- a/drivers/iommu/iommufd/Makefile
+++ b/drivers/iommu/iommufd/Makefile
@@ -17,3 +17,4 @@ obj-$(CONFIG_IOMMUFD_DRIVER) += iova_bitmap.o
iommufd_driver-y := driver.o
obj-$(CONFIG_IOMMUFD_DRIVER_CORE) += iommufd_driver.o
+obj-$(CONFIG_LIVEUPDATE) += liveupdate.o
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index e43da269ab80..b6959ad55ad4 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -711,6 +711,21 @@ iommufd_get_vdevice(struct iommufd_ctx *ictx, u32 id)
struct iommufd_vdevice, obj);
}
+#ifdef CONFIG_LIVEUPDATE
+int iommufd_liveupdate_register_lufs(void);
+int iommufd_liveupdate_unregister_lufs(void);
+#else
+static inline int iommufd_liveupdate_register_lufs(void)
+{
+ return 0;
+}
+
+static inline int iommufd_liveupdate_unregister_lufs(void)
+{
+ return 0;
+}
+#endif
+
#ifdef CONFIG_IOMMUFD_TEST
int iommufd_test(struct iommufd_ucmd *ucmd);
void iommufd_selftest_destroy(struct iommufd_object *obj);
diff --git a/drivers/iommu/iommufd/liveupdate.c b/drivers/iommu/iommufd/liveupdate.c
new file mode 100644
index 000000000000..d228157b6fed
--- /dev/null
+++ b/drivers/iommu/iommufd/liveupdate.c
@@ -0,0 +1,69 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#define pr_fmt(fmt) "iommufd: " fmt
+
+#include <linux/file.h>
+#include <linux/iommufd.h>
+#include <linux/liveupdate.h>
+
+#include "iommufd_private.h"
+
+static int iommufd_liveupdate_preserve(struct liveupdate_file_op_args *args)
+{
+ return -EOPNOTSUPP;
+}
+
+static int iommufd_liveupdate_freeze(struct liveupdate_file_op_args *args)
+{
+ /* No-Op; everything should be made read-only */
+ return 0;
+}
+
+static void iommufd_liveupdate_unpreserve(struct liveupdate_file_op_args *args)
+{
+}
+
+static int iommufd_liveupdate_retrieve(struct liveupdate_file_op_args *args)
+{
+ return -EOPNOTSUPP;
+}
+
+static bool iommufd_liveupdate_can_finish(struct liveupdate_file_op_args *args)
+{
+ return false;
+}
+
+static void iommufd_liveupdate_finish(struct liveupdate_file_op_args *args)
+{
+}
+
+static bool iommufd_liveupdate_can_preserve(struct liveupdate_file_handler *handler,
+ struct file *file)
+{
+ return false;
+}
+
+static struct liveupdate_file_ops iommufd_lu_file_ops = {
+ .can_preserve = iommufd_liveupdate_can_preserve,
+ .preserve = iommufd_liveupdate_preserve,
+ .unpreserve = iommufd_liveupdate_unpreserve,
+ .freeze = iommufd_liveupdate_freeze,
+ .retrieve = iommufd_liveupdate_retrieve,
+ .can_finish = iommufd_liveupdate_can_finish,
+ .finish = iommufd_liveupdate_finish,
+};
+
+static struct liveupdate_file_handler iommufd_lu_handler = {
+ .compatible = "iommufd-v1",
+ .ops = &iommufd_lu_file_ops,
+};
+
+int iommufd_liveupdate_register_lufs(void)
+{
+ return liveupdate_register_file_handler(&iommufd_lu_handler);
+}
+
+int iommufd_liveupdate_unregister_lufs(void)
+{
+ return liveupdate_unregister_file_handler(&iommufd_lu_handler);
+}
diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index 5cc4b08c25f5..18cc4af0a5c4 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -773,11 +773,21 @@ static int __init iommufd_init(void)
if (ret)
goto err_misc;
}
+
+ if (IS_ENABLED(CONFIG_LIVEUPDATE)) {
+ ret = iommufd_liveupdate_register_lufs();
+ if (ret)
+ goto err_vfio_misc;
+ }
+
ret = iommufd_test_init();
if (ret)
- goto err_vfio_misc;
+ goto err_lufs;
return 0;
+err_lufs:
+ if (IS_ENABLED(CONFIG_LIVEUPDATE))
+ iommufd_liveupdate_unregister_lufs();
err_vfio_misc:
if (IS_ENABLED(CONFIG_IOMMUFD_VFIO_CONTAINER))
misc_deregister(&vfio_misc_dev);
@@ -789,6 +799,8 @@ static int __init iommufd_init(void)
static void __exit iommufd_exit(void)
{
iommufd_test_exit();
+ if (IS_ENABLED(CONFIG_LIVEUPDATE))
+ iommufd_liveupdate_unregister_lufs();
if (IS_ENABLED(CONFIG_IOMMUFD_VFIO_CONTAINER))
misc_deregister(&vfio_misc_dev);
misc_deregister(&iommu_misc_dev);
--
2.52.0.158.g65b55ccf14-goog
* [RFC PATCH v2 04/32] iommufd-lu: Implement basic prepare/cancel/finish/retrieve using folios
2025-12-02 23:02 [RFC PATCH v2 00/32] Add live update state preservation Samiullah Khawaja
` (2 preceding siblings ...)
2025-12-02 23:02 ` [RFC PATCH v2 03/32] iommufd: Add basic skeleton based on liveupdate_file_handler Samiullah Khawaja
@ 2025-12-02 23:02 ` Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 05/32] iommufd-lu: Implement ioctl to let userspace mark an HWPT to be preserved Samiullah Khawaja
` (28 subsequent siblings)
32 siblings, 0 replies; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-02 23:02 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: YiFei Zhu, Samiullah Khawaja, Robin Murphy, Pratyush Yadav,
Kevin Tian, Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, Chris Li, praan
From: YiFei Zhu <zhuyifei@google.com>
The actual serialization and de-serialization are implemented in
follow-up commits.
- On prepare, a single folio is created and preserved to store
all the structs.
- On cancel, the folio is unpreserved and freed.
- On retrieve, the folio is restored, then an fd with anon_inode
is created, with data pointing to the folio.
- On finish, the folio is freed.
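The four steps above can be sketched in plain userspace C, as a rough
illustration only. The names struct lu_state, struct lu_args and the lu_*()
helpers are hypothetical stand-ins, and calloc()/free() stand in for the
KHO preserve/restore calls; none of this is the kernel API used by the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the (still empty) struct iommufd_lu. */
struct lu_state {
	unsigned int placeholder;
};

/* Hypothetical stand-in for the per-file arguments passed to each callback. */
struct lu_args {
	uintptr_t serialized_data;	/* handle carried across the update */
	struct lu_state *live;		/* set once the state has been retrieved */
};

/* preserve: allocate the state buffer and publish a handle to it. */
static int lu_preserve(struct lu_args *args)
{
	struct lu_state *st = calloc(1, sizeof(*st));

	if (!st)
		return -ENOMEM;
	args->serialized_data = (uintptr_t)st;
	return 0;
}

/* unpreserve (cancel): drop the buffer, nothing crosses the update. */
static void lu_unpreserve(struct lu_args *args)
{
	/* Cancel only makes sense before the state has been retrieved. */
	assert(!args->live);
	free((void *)args->serialized_data);
	args->serialized_data = 0;
}

/* retrieve: turn the handle back into a pointer owned by the new context. */
static int lu_retrieve(struct lu_args *args)
{
	if (!args->serialized_data)
		return -EFAULT;
	if (args->live)
		return -EEXIST;	/* state was already attached to a context */
	args->live = (struct lu_state *)args->serialized_data;
	return 0;
}

/* finish: the restored state is no longer needed, so release it. */
static void lu_finish(struct lu_args *args)
{
	free(args->live);
	args->live = NULL;
	args->serialized_data = 0;
}
```

A caller would run lu_preserve() followed by either lu_unpreserve() or,
after the update, lu_retrieve() and lu_finish().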
Signed-off-by: YiFei Zhu <zhuyifei@google.com>
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
---
MAINTAINERS | 1 +
drivers/iommu/iommufd/iommufd_private.h | 7 ++
drivers/iommu/iommufd/liveupdate.c | 113 ++++++++++++++++++++++--
drivers/iommu/iommufd/main.c | 2 +-
include/linux/kho/abi/iommufd.h | 31 +++++++
5 files changed, 148 insertions(+), 6 deletions(-)
create mode 100644 include/linux/kho/abi/iommufd.h
diff --git a/MAINTAINERS b/MAINTAINERS
index 3bb9edf09f0e..bfe646d1c74f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -13243,6 +13243,7 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd.git
F: Documentation/userspace-api/iommufd.rst
F: drivers/iommu/iommufd/
F: include/linux/iommufd.h
+F: include/linux/kho/abi/iommufd.h
F: include/uapi/linux/iommufd.h
F: tools/testing/selftests/iommu/
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index b6959ad55ad4..dfe9120aced0 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -21,6 +21,9 @@ struct iommu_option;
struct iommufd_device;
struct dma_buf_attachment;
struct dma_buf_phys_vec;
+struct iommufd_lu;
+
+extern const struct file_operations iommufd_fops;
struct iommufd_sw_msi_map {
struct list_head sw_msi_item;
@@ -57,6 +60,10 @@ struct iommufd_ctx {
/* Compatibility with VFIO no iommu */
u8 no_iommu_mode;
struct iommufd_ioas *vfio_ioas;
+
+#ifdef CONFIG_LIVEUPDATE
+ struct iommufd_lu *lu;
+#endif
};
/* Entry for iommufd_ctx::mt_mmap */
diff --git a/drivers/iommu/iommufd/liveupdate.c b/drivers/iommu/iommufd/liveupdate.c
index d228157b6fed..a9b5956c0dee 100644
--- a/drivers/iommu/iommufd/liveupdate.c
+++ b/drivers/iommu/iommufd/liveupdate.c
@@ -2,15 +2,44 @@
#define pr_fmt(fmt) "iommufd: " fmt
+#include <linux/anon_inodes.h>
#include <linux/file.h>
#include <linux/iommufd.h>
+#include <linux/kexec_handover.h>
+#include <linux/kho/abi/iommufd.h>
#include <linux/liveupdate.h>
+#include <linux/mm.h>
#include "iommufd_private.h"
static int iommufd_liveupdate_preserve(struct liveupdate_file_op_args *args)
{
- return -EOPNOTSUPP;
+ struct iommufd_ctx *ictx = iommufd_ctx_from_file(args->file);
+ struct iommufd_lu *iommufd_lu;
+ size_t serial_size;
+ void *mem;
+ int rc;
+
+ if (IS_ERR(ictx))
+ return PTR_ERR(ictx);
+
+ serial_size = sizeof(*iommufd_lu);
+
+ mem = kho_alloc_preserve(serial_size);
+ if (!mem) {
+ rc = -ENOMEM;
+ goto err_ctx_put;
+ }
+
+ iommufd_lu = mem;
+
+ args->serialized_data = virt_to_phys(iommufd_lu);
+ iommufd_ctx_put(ictx);
+ return 0;
+
+err_ctx_put:
+ iommufd_ctx_put(ictx);
+ return rc;
}
static int iommufd_liveupdate_freeze(struct liveupdate_file_op_args *args)
@@ -21,26 +50,100 @@ static int iommufd_liveupdate_freeze(struct liveupdate_file_op_args *args)
static void iommufd_liveupdate_unpreserve(struct liveupdate_file_op_args *args)
{
+ struct iommufd_ctx *ictx = iommufd_ctx_from_file(args->file);
+
+ if (WARN_ON(IS_ERR(ictx)))
+ return;
+
+ kho_unpreserve_free(phys_to_virt(args->serialized_data));
+ iommufd_ctx_put(ictx);
}
static int iommufd_liveupdate_retrieve(struct liveupdate_file_op_args *args)
{
- return -EOPNOTSUPP;
+ struct iommufd_lu *iommufd_lu;
+ struct iommufd_ctx *ictx;
+ struct folio *folio_lu;
+ struct file *file;
+ int rc;
+
+ folio_lu = kho_restore_folio(args->serialized_data);
+ if (IS_ERR_OR_NULL(folio_lu))
+ return -EFAULT;
+
+ iommufd_lu = folio_address(folio_lu);
+
+ file = anon_inode_create_getfile("iommufd", &iommufd_fops,
+ NULL, O_RDWR, NULL);
+ if (IS_ERR(file)) {
+ rc = PTR_ERR(file);
+ goto err_folio_put;
+ }
+
+ rc = iommufd_fops.open(file->f_inode, file);
+ if (rc)
+ goto err_fput;
+
+ ictx = iommufd_ctx_from_file(file);
+ if (WARN_ON(IS_ERR(ictx))) {
+ rc = PTR_ERR(ictx);
+ goto err_fput;
+ }
+
+ if (WARN_ON(ictx->lu)) {
+ rc = -EEXIST;
+ goto err_ctx_put;
+ }
+ ictx->lu = iommufd_lu;
+
+ iommufd_ctx_put(ictx);
+
+ args->file = file;
+
+ return 0;
+
+err_ctx_put:
+ iommufd_ctx_put(ictx);
+err_fput:
+ fput(file);
+err_folio_put:
+ folio_put(folio_lu);
+ return rc;
}
static bool iommufd_liveupdate_can_finish(struct liveupdate_file_op_args *args)
{
- return false;
+ if (!args->retrieved || !args->file) {
+ pr_warn("%s: fd not reclaimed\n", __func__);
+ return false;
+ }
+
+ return true;
}
static void iommufd_liveupdate_finish(struct liveupdate_file_op_args *args)
{
+ struct iommufd_lu *iommufd_lu;
+ struct iommufd_ctx *ictx;
+
+ ictx = iommufd_ctx_from_file(args->file);
+ iommufd_lu = ictx->lu;
+ ictx->lu = NULL;
+ iommufd_ctx_put(ictx);
+
+ folio_put(virt_to_folio(iommufd_lu));
}
static bool iommufd_liveupdate_can_preserve(struct liveupdate_file_handler *handler,
struct file *file)
{
- return false;
+ struct iommufd_ctx *ictx = iommufd_ctx_from_file(file);
+
+ if (IS_ERR(ictx))
+ return false;
+
+ iommufd_ctx_put(ictx);
+ return true;
}
static struct liveupdate_file_ops iommufd_lu_file_ops = {
@@ -54,7 +157,7 @@ static struct liveupdate_file_ops iommufd_lu_file_ops = {
};
static struct liveupdate_file_handler iommufd_lu_handler = {
- .compatible = "iommufd-v1",
+ .compatible = IOMMUFD_LUO_COMPATIBLE,
.ops = &iommufd_lu_file_ops,
};
diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index 18cc4af0a5c4..12601f9ad217 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -610,7 +610,7 @@ static int iommufd_fops_mmap(struct file *filp, struct vm_area_struct *vma)
return rc;
}
-static const struct file_operations iommufd_fops = {
+const struct file_operations iommufd_fops = {
.owner = THIS_MODULE,
.open = iommufd_fops_open,
.release = iommufd_fops_release,
diff --git a/include/linux/kho/abi/iommufd.h b/include/linux/kho/abi/iommufd.h
new file mode 100644
index 000000000000..19d6b61ec3c3
--- /dev/null
+++ b/include/linux/kho/abi/iommufd.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Copyright (C) 2025, Google LLC
+ * Author: Samiullah Khawaja <skhawaja@google.com>
+ */
+
+#ifndef _LINUX_KHO_ABI_IOMMUFD_H
+#define _LINUX_KHO_ABI_IOMMUFD_H
+
+#include <linux/mutex_types.h>
+#include <linux/compiler.h>
+#include <linux/types.h>
+
+/**
+ * DOC: IOMMUFD Live Update ABI
+ *
+ * This header defines the ABI for preserving the state of an IOMMUFD file
+ * across a kexec reboot using LUO.
+ *
+ * This interface is a contract. Any modification to any of the serialization
+ * structs defined here constitutes a breaking change. Such changes require
+ * incrementing the version number in the IOMMUFD_LUO_COMPATIBLE string.
+ */
+
+#define IOMMUFD_LUO_COMPATIBLE "iommufd-v1"
+
+struct iommufd_lu {
+};
+
+#endif /* _LINUX_KHO_ABI_IOMMUFD_H */
--
2.52.0.158.g65b55ccf14-goog
* [RFC PATCH v2 05/32] iommufd-lu: Implement ioctl to let userspace mark an HWPT to be preserved
2025-12-02 23:02 [RFC PATCH v2 00/32] Add live update state preservation Samiullah Khawaja
` (3 preceding siblings ...)
2025-12-02 23:02 ` [RFC PATCH v2 04/32] iommufd-lu: Implement basic prepare/cancel/finish/retrieve using folios Samiullah Khawaja
@ 2025-12-02 23:02 ` Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 06/32] iommufd-lu: Persist iommu hardware pagetables for live update Samiullah Khawaja
` (27 subsequent siblings)
32 siblings, 0 replies; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-02 23:02 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: YiFei Zhu, Robin Murphy, Pratyush Yadav, Samiullah Khawaja,
Kevin Tian, Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, Chris Li, praan
From: YiFei Zhu <zhuyifei@google.com>
Userspace provides a token, which will then be used at restore to
identify this HWPT.
A skeleton for the restore-side ioctl is also added.
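As a rough userspace illustration of the token rule (a token may not be
shared by two preserved HWPTs in the same iommufd), consider the sketch
below. struct hwpt and hwpt_set_preserved() are hypothetical simplifications;
a flat array stands in for the iommufd object xarray, and no locking is
modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct iommufd_hwpt_paging. */
struct hwpt {
	bool lu_preserved;
	uint32_t lu_token;
};

/*
 * Mark @target as preserved (or not) under @token. Fails with -EADDRINUSE
 * if another preserved HWPT in @table already uses @token.
 */
static int hwpt_set_preserved(struct hwpt *table, size_t n,
			      struct hwpt *target, uint32_t token,
			      bool preserved)
{
	assert(table && target);

	for (size_t i = 0; i < n; i++) {
		struct hwpt *h = &table[i];

		/* The target itself and unpreserved HWPTs cannot conflict. */
		if (h == target || !h->lu_preserved)
			continue;
		if (h->lu_token == token)
			return -EADDRINUSE;
	}

	target->lu_preserved = preserved;
	target->lu_token = token;
	return 0;
}
```

Re-marking the same HWPT with its own token succeeds, and a token becomes
reusable once its previous owner is no longer marked preserved.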
Signed-off-by: YiFei Zhu <zhuyifei@google.com>
---
drivers/iommu/iommufd/iommufd_private.h | 18 ++++++++++
drivers/iommu/iommufd/liveupdate.c | 46 +++++++++++++++++++++++++
drivers/iommu/iommufd/main.c | 4 +++
include/uapi/linux/iommufd.h | 41 ++++++++++++++++++++++
4 files changed, 109 insertions(+)
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index dfe9120aced0..54c7c9888de3 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -381,6 +381,10 @@ struct iommufd_hwpt_paging {
bool auto_domain : 1;
bool enforce_cache_coherency : 1;
bool nest_parent : 1;
+#ifdef CONFIG_LIVEUPDATE
+ bool lu_preserved : 1;
+ u32 lu_token;
+#endif
/* Head at iommufd_ioas::hwpt_list */
struct list_head hwpt_item;
struct iommufd_sw_msi_maps present_sw_msi;
@@ -721,6 +725,10 @@ iommufd_get_vdevice(struct iommufd_ctx *ictx, u32 id)
#ifdef CONFIG_LIVEUPDATE
int iommufd_liveupdate_register_lufs(void);
int iommufd_liveupdate_unregister_lufs(void);
+
+
+int iommufd_hwpt_lu_set_preserved(struct iommufd_ucmd *ucmd);
+int iommufd_hwpt_lu_restore(struct iommufd_ucmd *ucmd);
#else
static inline int iommufd_liveupdate_register_lufs(void)
{
@@ -731,6 +739,16 @@ static inline int iommufd_liveupdate_unregister_lufs(void)
{
return 0;
}
+
+static inline int iommufd_hwpt_lu_set_preserved(struct iommufd_ucmd *ucmd)
+{
+ return -ENOTTY;
+}
+
+static inline int iommufd_hwpt_lu_restore(struct iommufd_ucmd *ucmd)
+{
+ return -ENOTTY;
+}
#endif
#ifdef CONFIG_IOMMUFD_TEST
diff --git a/drivers/iommu/iommufd/liveupdate.c b/drivers/iommu/iommufd/liveupdate.c
index a9b5956c0dee..83d1b888d914 100644
--- a/drivers/iommu/iommufd/liveupdate.c
+++ b/drivers/iommu/iommufd/liveupdate.c
@@ -12,6 +12,47 @@
#include "iommufd_private.h"
+int iommufd_hwpt_lu_set_preserved(struct iommufd_ucmd *ucmd)
+{
+ struct iommu_hwpt_lu_set_preserved *cmd = ucmd->cmd;
+ struct iommufd_hwpt_paging *hwpt_target, *hwpt;
+ struct iommufd_ctx *ictx = ucmd->ictx;
+ struct iommufd_object *obj;
+ unsigned long index;
+ int rc = 0;
+
+ /* TODO: return error if iommufd is already preserved. */
+
+ hwpt_target = iommufd_get_hwpt_paging(ucmd, cmd->hwpt_id);
+ if (IS_ERR(hwpt_target))
+ return PTR_ERR(hwpt_target);
+
+ xa_lock(&ictx->objects);
+ xa_for_each(&ictx->objects, index, obj) {
+ if (obj->type != IOMMUFD_OBJ_HWPT_PAGING)
+ continue;
+
+ hwpt = container_of(obj, struct iommufd_hwpt_paging, common.obj);
+
+ if (hwpt == hwpt_target)
+ continue;
+ if (!hwpt->lu_preserved)
+ continue;
+ if (hwpt->lu_token == cmd->hwpt_token) {
+ rc = -EADDRINUSE;
+ goto out;
+ }
+ }
+
+ hwpt_target->lu_preserved = cmd->preserved;
+ hwpt_target->lu_token = cmd->hwpt_token;
+
+out:
+ xa_unlock(&ictx->objects);
+ iommufd_put_object(ictx, &hwpt_target->common.obj);
+ return rc;
+}
+
static int iommufd_liveupdate_preserve(struct liveupdate_file_op_args *args)
{
struct iommufd_ctx *ictx = iommufd_ctx_from_file(args->file);
@@ -121,6 +162,11 @@ static bool iommufd_liveupdate_can_finish(struct liveupdate_file_op_args *args)
return true;
}
+int iommufd_hwpt_lu_restore(struct iommufd_ucmd *ucmd)
+{
+ return -ENOTTY;
+}
+
static void iommufd_liveupdate_finish(struct liveupdate_file_op_args *args)
{
struct iommufd_lu *iommufd_lu;
diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index 12601f9ad217..b63f61331cae 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -493,6 +493,10 @@ static const struct iommufd_ioctl_op iommufd_ioctl_ops[] = {
__reserved),
IOCTL_OP(IOMMU_VIOMMU_ALLOC, iommufd_viommu_alloc_ioctl,
struct iommu_viommu_alloc, out_viommu_id),
+ IOCTL_OP(IOMMU_HWPT_LU_SET_PRESERVED, iommufd_hwpt_lu_set_preserved,
+ struct iommu_hwpt_lu_set_preserved, preserved),
+ IOCTL_OP(IOMMU_HWPT_LU_RESTORE, iommufd_hwpt_lu_restore,
+ struct iommu_hwpt_lu_restore, hwpt_token),
#ifdef CONFIG_IOMMUFD_TEST
IOCTL_OP(IOMMU_TEST_CMD, iommufd_test, struct iommu_test_cmd, last),
#endif
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index 2c41920b641d..4b953129f8d8 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -57,6 +57,8 @@ enum {
IOMMUFD_CMD_IOAS_CHANGE_PROCESS = 0x92,
IOMMUFD_CMD_VEVENTQ_ALLOC = 0x93,
IOMMUFD_CMD_HW_QUEUE_ALLOC = 0x94,
+ IOMMUFD_CMD_HWPT_LU_SET_PRESERVED = 0x95,
+ IOMMUFD_CMD_HWPT_LU_RESTORE = 0x96,
};
/**
@@ -1299,4 +1301,43 @@ struct iommu_hw_queue_alloc {
__aligned_u64 length;
};
#define IOMMU_HW_QUEUE_ALLOC _IO(IOMMUFD_TYPE, IOMMUFD_CMD_HW_QUEUE_ALLOC)
+
+/**
+ * struct iommu_hwpt_lu_set_preserved - ioctl(IOMMU_HWPT_LU_SET_PRESERVED)
+ * @size: sizeof(struct iommu_hwpt_lu_set_preserved)
+ * @hwpt_id: Iommufd object ID of the target HWPT
+ * @hwpt_token: Token to identify this hwpt upon restore
+ * @preserved: If non-zero, HWPT is preserved by liveupdate
+ *
+ * If preserved is set to non-zero, the target HWPT will be preserved.
+ *
+ * The hwpt_token is provided by userspace. If userspace enters a token
+ * already in use within this iommufd, -EADDRINUSE is returned from this ioctl.
+ */
+struct iommu_hwpt_lu_set_preserved {
+ __u32 size;
+ __u32 hwpt_id;
+ __u32 hwpt_token;
+ __u8 preserved;
+};
+#define IOMMU_HWPT_LU_SET_PRESERVED _IO(IOMMUFD_TYPE, IOMMUFD_CMD_HWPT_LU_SET_PRESERVED)
+
+/**
+ * struct iommu_hwpt_lu_restore - ioctl(IOMMU_HWPT_LU_RESTORE)
+ * @size: sizeof(struct iommu_hwpt_lu_restore)
+ * @pt_id: Output the ID of the recreated HWPT.
+ * @hwpt_token: Token to identify this hwpt
+ * @hwpt_alloc_flags: Combination of enum iommufd_hwpt_alloc_flags
+ *
+ * An immutable HWPT is restored without a parent IOAS, and the ID
+ * of this new HWPT is returned.
+ */
+
+struct iommu_hwpt_lu_restore {
+ __u32 size;
+ __u32 pt_id;
+ __u32 hwpt_token;
+ __u32 hwpt_alloc_flags;
+};
+#define IOMMU_HWPT_LU_RESTORE _IO(IOMMUFD_TYPE, IOMMUFD_CMD_HWPT_LU_RESTORE)
#endif
--
2.52.0.158.g65b55ccf14-goog
* [RFC PATCH v2 06/32] iommufd-lu: Persist iommu hardware pagetables for live update
2025-12-02 23:02 [RFC PATCH v2 00/32] Add live update state preservation Samiullah Khawaja
` (4 preceding siblings ...)
2025-12-02 23:02 ` [RFC PATCH v2 05/32] iommufd-lu: Implement ioctl to let userspace mark an HWPT to be preserved Samiullah Khawaja
@ 2025-12-02 23:02 ` Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 07/32] iommu: Add liveupdate FLB for IOMMU state preservation Samiullah Khawaja
` (26 subsequent siblings)
32 siblings, 0 replies; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-02 23:02 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: YiFei Zhu, Samiullah Khawaja, Robin Murphy, Pratyush Yadav,
Kevin Tian, Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, Chris Li, praan
From: YiFei Zhu <zhuyifei@google.com>
The caller is expected to mark each HWPT to be preserved with an ioctl
call, with a token that will be used in restore. At preserve time,
iommu_domain_preserve is called on the domain of each marked HWPT to
preserve the iommu domain.
On restore, each preserved HWPT is expected to be restored with another
ioctl call. This HWPT will be recreated without a parent IOAS, and its
domain recreated with iommu_domain_restore. The caller is expected to
later swap the old restored attachments with newly created HWPTs through
normal means such as VFIO_DEVICE_ATTACH_IOMMUFD_PT.
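The count-then-fill serialization and the token lookup on restore can be
sketched in userspace C as follows. This is an illustration under stated
assumptions, not the patch's code: struct hwpt, struct hwpt_ser,
struct lu_ser and the helper names are hypothetical, calloc() stands in for
the KHO allocation, domain preservation is left out, and the bound check in
the fill pass is a defensive addition of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for struct iommufd_hwpt_paging. */
struct hwpt {
	bool lu_preserved;
	uint32_t lu_token;
};

/* Simplified per-HWPT record, loosely modelled on struct iommufd_hwpt_lu. */
struct hwpt_ser {
	uint32_t token;
	bool reclaimed;
};

/* Header plus flexible array, loosely modelled on struct iommufd_lu. */
struct lu_ser {
	unsigned int nr_hwpts;
	struct hwpt_ser hwpts[];
};

/* With out == NULL only count marked HWPTs; otherwise also fill out->hwpts[]. */
static int save_hwpts(const struct hwpt *table, size_t n, struct lu_ser *out)
{
	unsigned int nr = 0;

	for (size_t i = 0; i < n; i++) {
		if (!table[i].lu_preserved)
			continue;
		if (out) {
			/* Defensive bound check added for this sketch. */
			if (nr >= out->nr_hwpts)
				return -EFAULT;
			out->hwpts[nr].token = table[i].lu_token;
			out->hwpts[nr].reclaimed = false;
		}
		nr++;
	}

	/* The set of marked HWPTs must not change between the two passes. */
	if (out && out->nr_hwpts != nr)
		return -EFAULT;
	return (int)nr;
}

/* First pass sizes the buffer, second pass fills it. */
static struct lu_ser *serialize_hwpts(const struct hwpt *table, size_t n)
{
	struct lu_ser *ser;
	int nr = save_hwpts(table, n, NULL);

	if (nr < 0)
		return NULL;
	ser = calloc(1, sizeof(*ser) + (size_t)nr * sizeof(ser->hwpts[0]));
	if (!ser)
		return NULL;
	ser->nr_hwpts = (unsigned int)nr;
	if (save_hwpts(table, n, ser) < 0) {
		free(ser);
		return NULL;
	}
	return ser;
}

/* Restore side: claim the first unreclaimed record matching @token. */
static int restore_hwpt(struct lu_ser *ser, uint32_t token)
{
	assert(ser);

	for (unsigned int i = 0; i < ser->nr_hwpts; i++) {
		if (ser->hwpts[i].reclaimed)
			continue;
		if (ser->hwpts[i].token == token) {
			ser->hwpts[i].reclaimed = true;
			return (int)i;
		}
	}
	return -ENOENT;
}
```

Each record can be claimed once; a second restore with the same token, or a
restore with a token that was never marked, yields -ENOENT.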
Signed-off-by: YiFei Zhu <zhuyifei@google.com>
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
---
drivers/iommu/iommufd/iommufd_private.h | 6 +-
drivers/iommu/iommufd/liveupdate.c | 161 +++++++++++++++++++++++-
drivers/iommu/iommufd/main.c | 19 +++
include/linux/kho/abi/iommufd.h | 8 ++
4 files changed, 189 insertions(+), 5 deletions(-)
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index 54c7c9888de3..15afff6ba0ea 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -726,9 +726,13 @@ iommufd_get_vdevice(struct iommufd_ctx *ictx, u32 id)
int iommufd_liveupdate_register_lufs(void);
int iommufd_liveupdate_unregister_lufs(void);
-
int iommufd_hwpt_lu_set_preserved(struct iommufd_ucmd *ucmd);
int iommufd_hwpt_lu_restore(struct iommufd_ucmd *ucmd);
+
+/* TODO */
+#define iommu_domain_restore(x) ERR_PTR(-EOPNOTSUPP)
+#define iommu_domain_preserve(x, y) (-EOPNOTSUPP)
+#define iommu_domain_has_attachments(x) (false)
#else
static inline int iommufd_liveupdate_register_lufs(void)
{
diff --git a/drivers/iommu/iommufd/liveupdate.c b/drivers/iommu/iommufd/liveupdate.c
index 83d1b888d914..42b380229c57 100644
--- a/drivers/iommu/iommufd/liveupdate.c
+++ b/drivers/iommu/iommufd/liveupdate.c
@@ -9,6 +9,7 @@
#include <linux/kho/abi/iommufd.h>
#include <linux/liveupdate.h>
#include <linux/mm.h>
+#include <linux/pci.h>
#include "iommufd_private.h"
@@ -53,6 +54,82 @@ int iommufd_hwpt_lu_set_preserved(struct iommufd_ucmd *ucmd)
return rc;
}
+static int iommufd_save_hwpts(struct iommufd_ctx *ictx,
+ struct iommufd_lu *iommufd_lu)
+{
+ struct iommufd_hwpt_paging *hwpt, **hwpts = NULL;
+ struct iommufd_hwpt_lu *hwpt_lu;
+ struct iommufd_object *obj;
+ unsigned int nr_hwpts = 0;
+ unsigned long index;
+ unsigned int i;
+ int rc = 0;
+
+ if (iommufd_lu) {
+ hwpts = kcalloc(iommufd_lu->nr_hwpts, sizeof(*hwpts),
+ GFP_KERNEL);
+ if (!hwpts)
+ return -ENOMEM;
+ }
+
+ xa_lock(&ictx->objects);
+ xa_for_each(&ictx->objects, index, obj) {
+ if (obj->type != IOMMUFD_OBJ_HWPT_PAGING)
+ continue;
+
+ hwpt = container_of(obj, struct iommufd_hwpt_paging, common.obj);
+ if (!hwpt->lu_preserved)
+ continue;
+
+ /*
+ * TODO: The HWPT should be made immutable, and cannot be
+ * destroyed
+ */
+
+ if (!hwpt->common.domain) {
+ rc = -EINVAL;
+ xa_unlock(&ictx->objects);
+ goto out;
+ }
+
+ if (iommufd_lu) {
+ hwpts[nr_hwpts] = hwpt;
+ hwpt_lu = &iommufd_lu->hwpts[nr_hwpts];
+
+ hwpt_lu->token = hwpt->lu_token;
+ hwpt_lu->reclaimed = false;
+ }
+
+ nr_hwpts++;
+ }
+ xa_unlock(&ictx->objects);
+
+ if (WARN_ON(iommufd_lu && iommufd_lu->nr_hwpts != nr_hwpts)) {
+ rc = -EFAULT;
+ goto out;
+ }
+
+ if (iommufd_lu) {
+ /*
+ * iommu_domain_preserve may sleep and must be called
+ * outside of xa_lock
+ */
+ for (i = 0; i < nr_hwpts; i++) {
+ hwpt = hwpts[i];
+ hwpt_lu = &iommufd_lu->hwpts[i];
+ rc = iommu_domain_preserve(hwpt->common.domain, &hwpt_lu->domain_data);
+ if (rc < 0)
+ goto out;
+ }
+ }
+
+ rc = nr_hwpts;
+
+out:
+ kfree(hwpts);
+ return rc;
+}
+
static int iommufd_liveupdate_preserve(struct liveupdate_file_op_args *args)
{
struct iommufd_ctx *ictx = iommufd_ctx_from_file(args->file);
@@ -64,7 +141,11 @@ static int iommufd_liveupdate_preserve(struct liveupdate_file_op_args *args)
if (IS_ERR(ictx))
return PTR_ERR(ictx);
- serial_size = sizeof(*iommufd_lu);
+ rc = iommufd_save_hwpts(ictx, NULL);
+ if (rc < 0)
+ goto err_ctx_put;
+
+ serial_size = struct_size(iommufd_lu, hwpts, rc);
mem = kho_alloc_preserve(serial_size);
if (!mem) {
@@ -73,11 +154,17 @@ static int iommufd_liveupdate_preserve(struct liveupdate_file_op_args *args)
}
iommufd_lu = mem;
+ iommufd_lu->nr_hwpts = rc;
+ rc = iommufd_save_hwpts(ictx, iommufd_lu);
+ if (rc < 0)
+ goto err_free;
args->serialized_data = virt_to_phys(iommufd_lu);
iommufd_ctx_put(ictx);
return 0;
+err_free:
+ kho_unpreserve_free(mem);
err_ctx_put:
iommufd_ctx_put(ictx);
return rc;
@@ -92,10 +179,31 @@ static int iommufd_liveupdate_freeze(struct liveupdate_file_op_args *args)
static void iommufd_liveupdate_unpreserve(struct liveupdate_file_op_args *args)
{
struct iommufd_ctx *ictx = iommufd_ctx_from_file(args->file);
+ struct iommufd_hwpt_paging *hwpt;
+ struct iommufd_object *obj;
+ unsigned long index;
if (WARN_ON(IS_ERR(ictx)))
return;
+ xa_lock(&ictx->objects);
+ xa_for_each(&ictx->objects, index, obj) {
+ if (obj->type != IOMMUFD_OBJ_HWPT_PAGING)
+ continue;
+
+ hwpt = container_of(obj, struct iommufd_hwpt_paging, common.obj);
+ if (!hwpt->lu_preserved)
+ continue;
+
+ /* TODO: The HWPT should be made mutable again */
+
+ if (!hwpt->common.domain)
+ continue;
+
+ /* TODO: WARN_ON(iommu_domain_unpreserve(hwpt->common.domain)); */
+ }
+ xa_unlock(&ictx->objects);
+
kho_unpreserve_free(phys_to_virt(args->serialized_data));
iommufd_ctx_put(ictx);
}
@@ -164,7 +272,53 @@ static bool iommufd_liveupdate_can_finish(struct liveupdate_file_op_args *args)
int iommufd_hwpt_lu_restore(struct iommufd_ucmd *ucmd)
{
- return -ENOTTY;
+ struct iommu_hwpt_lu_restore *cmd = ucmd->cmd;
+ struct iommufd_hwpt_paging *hwpt = NULL;
+ struct iommufd_ctx *ictx = ucmd->ictx;
+ struct iommufd_hwpt_lu *hwpt_lu;
+ struct iommufd_lu *iommufd_lu;
+ struct iommu_domain *domain;
+ unsigned int i;
+ int rc;
+
+ iommufd_lu = ictx->lu;
+ if (!iommufd_lu)
+ return -ENOTTY;
+
+ for (i = 0; i < iommufd_lu->nr_hwpts; i++) {
+ hwpt_lu = &iommufd_lu->hwpts[i];
+
+ if (hwpt_lu->reclaimed)
+ continue;
+
+ if (hwpt_lu->token == cmd->hwpt_token)
+ goto hwpt_found;
+ }
+
+ return -ENOENT;
+
+hwpt_found:
+ hwpt = _iommufd_hwpt_paging_alloc(ictx);
+ if (IS_ERR(hwpt))
+ return PTR_ERR(hwpt);
+
+ /* a successful iommu_domain_restore marks the point of no return */
+ domain = iommu_domain_restore(hwpt_lu->domain_data);
+ if (IS_ERR(domain)) {
+ rc = PTR_ERR(domain);
+ goto err_destroy;
+ }
+
+ iommufd_hwpt_init_from_domain(&hwpt->common, domain);
+ iommufd_object_finalize(ictx, &hwpt->common.obj);
+
+ hwpt_lu->reclaimed = true;
+ cmd->pt_id = hwpt->common.obj.id;
+ return 0;
+
+err_destroy:
+ iommufd_object_abort_and_destroy(ictx, &hwpt->common.obj);
+ return rc;
}
static void iommufd_liveupdate_finish(struct liveupdate_file_op_args *args)
@@ -175,9 +329,8 @@ static void iommufd_liveupdate_finish(struct liveupdate_file_op_args *args)
ictx = iommufd_ctx_from_file(args->file);
iommufd_lu = ictx->lu;
ictx->lu = NULL;
- iommufd_ctx_put(ictx);
-
folio_put(virt_to_folio(iommufd_lu));
+ iommufd_ctx_put(ictx);
}
static bool iommufd_liveupdate_can_preserve(struct liveupdate_file_handler *handler,
diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index b63f61331cae..a334e3da3f45 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -207,6 +207,8 @@ int iommufd_object_remove(struct iommufd_ctx *ictx,
struct iommufd_object *to_destroy, u32 id,
unsigned int flags)
{
+ struct iommufd_hwpt_paging *hwpt_paging;
+ struct iommu_domain *domain;
struct iommufd_object *obj;
XA_STATE(xas, &ictx->objects, id);
bool zerod_wait_cnt = false;
@@ -250,6 +252,23 @@ int iommufd_object_remove(struct iommufd_ctx *ictx,
goto err_xa;
}
+ if (obj->type == IOMMUFD_OBJ_HWPT_PAGING) {
+ /*
+ * Normally attachments are refcounted, but this is not the case
+ * for liveupdate-restored HWPTs.
+ * Additionally, LUO holds a reference to struct files until
+ * finish, which makes sure HWPTs are no longer attached, so
+ * this code path is not a concern in iommufd_fops_release
+ */
+ hwpt_paging = container_of(obj, struct iommufd_hwpt_paging,
+ common.obj);
+ domain = hwpt_paging->common.domain;
+ if (domain && iommu_domain_has_attachments(domain)) {
+ ret = -EBUSY;
+ goto err_xa;
+ }
+ }
+
if (!refcount_dec_if_one(&obj->users)) {
ret = -EBUSY;
goto err_xa;
diff --git a/include/linux/kho/abi/iommufd.h b/include/linux/kho/abi/iommufd.h
index 19d6b61ec3c3..f7393ac78aa9 100644
--- a/include/linux/kho/abi/iommufd.h
+++ b/include/linux/kho/abi/iommufd.h
@@ -25,7 +25,15 @@
#define IOMMUFD_LUO_COMPATIBLE "iommufd-v1"
+struct iommufd_hwpt_lu {
+ u32 token;
+ u64 domain_data;
+ bool reclaimed;
+} __packed;
+
struct iommufd_lu {
+ unsigned int nr_hwpts;
+ struct iommufd_hwpt_lu hwpts[];
};
#endif /* _LINUX_KHO_ABI_IOMMUFD_H */
--
2.52.0.158.g65b55ccf14-goog
* [RFC PATCH v2 07/32] iommu: Add liveupdate FLB for IOMMU state preservation
2025-12-02 23:02 [RFC PATCH v2 00/32] Add live update state preservation Samiullah Khawaja
` (5 preceding siblings ...)
2025-12-02 23:02 ` [RFC PATCH v2 06/32] iommufd-lu: Persist iommu hardware pagetables for live update Samiullah Khawaja
@ 2025-12-02 23:02 ` Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 08/32] iommu: Register IOMMU FLB with iommufd file handler Samiullah Khawaja
` (25 subsequent siblings)
32 siblings, 0 replies; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-02 23:02 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: Samiullah Khawaja, Robin Murphy, Pratyush Yadav, Kevin Tian,
Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, YiFei Zhu, Chris Li, praan
Add a liveupdate FLB for IOMMU state preservation with stub
implementation only. Also add APIs to register/unregister this FLB
with the iommufd LU file handler.
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
---
drivers/iommu/Makefile | 1 +
drivers/iommu/liveupdate.c | 60 +++++++++++++++++
include/linux/iommu-lu.h | 17 +++++
include/linux/kho/abi/iommu.h | 119 ++++++++++++++++++++++++++++++++++
4 files changed, 197 insertions(+)
create mode 100644 drivers/iommu/liveupdate.c
create mode 100644 include/linux/iommu-lu.h
create mode 100644 include/linux/kho/abi/iommu.h
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 8e8843316c4b..f1d740bb3592 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -15,6 +15,7 @@ obj-$(CONFIG_IOMMU_IO_PGTABLE_ARMV7S) += io-pgtable-arm-v7s.o
obj-$(CONFIG_IOMMU_IO_PGTABLE_LPAE) += io-pgtable-arm.o
obj-$(CONFIG_IOMMU_IO_PGTABLE_LPAE_KUNIT_TEST) += io-pgtable-arm-selftests.o
obj-$(CONFIG_IOMMU_IO_PGTABLE_DART) += io-pgtable-dart.o
+obj-$(CONFIG_LIVEUPDATE) += liveupdate.o
obj-$(CONFIG_IOMMU_IOVA) += iova.o
obj-$(CONFIG_OF_IOMMU) += of_iommu.o
obj-$(CONFIG_MSM_IOMMU) += msm_iommu.o
diff --git a/drivers/iommu/liveupdate.c b/drivers/iommu/liveupdate.c
new file mode 100644
index 000000000000..93e5691a8c1f
--- /dev/null
+++ b/drivers/iommu/liveupdate.c
@@ -0,0 +1,60 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+/*
+ * Copyright (C) 2025, Google LLC
+ * Author: Samiullah Khawaja <skhawaja@google.com>
+ */
+
+#define pr_fmt(fmt) "iommu: liveupdate: " fmt
+
+#include <linux/kexec_handover.h>
+#include <linux/liveupdate.h>
+#include <linux/iommu-lu.h>
+#include <linux/errno.h>
+
+static void iommu_liveupdate_flb_free(struct iommu_lu_flb_obj *obj)
+{
+}
+
+static int iommu_liveupdate_flb_preserve(struct liveupdate_flb_op_args *argp)
+{
+ return -EOPNOTSUPP;
+}
+
+static void iommu_liveupdate_flb_unpreserve(struct liveupdate_flb_op_args *argp)
+{
+ iommu_liveupdate_flb_free(argp->obj);
+}
+
+static void iommu_liveupdate_flb_finish(struct liveupdate_flb_op_args *argp)
+{
+}
+
+static int iommu_liveupdate_flb_retrieve(struct liveupdate_flb_op_args *argp)
+{
+ return -EOPNOTSUPP;
+}
+
+static struct liveupdate_flb_ops iommu_flb_ops = {
+ .preserve = iommu_liveupdate_flb_preserve,
+ .unpreserve = iommu_liveupdate_flb_unpreserve,
+ .finish = iommu_liveupdate_flb_finish,
+ .retrieve = iommu_liveupdate_flb_retrieve,
+};
+
+static struct liveupdate_flb iommu_flb = {
+ .compatible = IOMMU_LUO_FLB_COMPATIBLE,
+ .ops = &iommu_flb_ops,
+};
+
+int iommu_liveupdate_register_flb(struct liveupdate_file_handler *handler)
+{
+ return liveupdate_register_flb(handler, &iommu_flb);
+}
+EXPORT_SYMBOL(iommu_liveupdate_register_flb);
+
+int iommu_liveupdate_unregister_flb(struct liveupdate_file_handler *handler)
+{
+ return liveupdate_unregister_flb(handler, &iommu_flb);
+}
+EXPORT_SYMBOL(iommu_liveupdate_unregister_flb);
diff --git a/include/linux/iommu-lu.h b/include/linux/iommu-lu.h
new file mode 100644
index 000000000000..59095d2f1bb2
--- /dev/null
+++ b/include/linux/iommu-lu.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Copyright (C) 2025, Google LLC
+ * Author: Samiullah Khawaja <skhawaja@google.com>
+ */
+
+#ifndef _LINUX_IOMMU_LU_H
+#define _LINUX_IOMMU_LU_H
+
+#include <linux/liveupdate.h>
+#include <linux/kho/abi/iommu.h>
+
+int iommu_liveupdate_register_flb(struct liveupdate_file_handler *handler);
+int iommu_liveupdate_unregister_flb(struct liveupdate_file_handler *handler);
+
+#endif /* _LINUX_IOMMU_LU_H */
diff --git a/include/linux/kho/abi/iommu.h b/include/linux/kho/abi/iommu.h
new file mode 100644
index 000000000000..71ff9b670a0b
--- /dev/null
+++ b/include/linux/kho/abi/iommu.h
@@ -0,0 +1,119 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Copyright (C) 2025, Google LLC
+ * Author: Samiullah Khawaja <skhawaja@google.com>
+ */
+
+#ifndef _LINUX_KHO_ABI_IOMMU_H
+#define _LINUX_KHO_ABI_IOMMU_H
+
+#include <linux/mutex_types.h>
+#include <linux/compiler.h>
+#include <linux/types.h>
+
+/**
+ * DOC: IOMMU File-Lifecycle Bound (FLB) Live Update ABI
+ *
+ * This header defines the ABI for preserving IOMMU state across kexec using
+ * Live Update File-Lifecycle Bound (FLB) data.
+ *
+ * This interface is a contract. Any modification to any of the serialization
+ * structs defined here constitutes a breaking change. Such changes require
+ * incrementing the version number in the IOMMU_LUO_FLB_COMPATIBLE string.
+ */
+
+#define IOMMU_LUO_FLB_COMPATIBLE "iommu-v1"
+
+enum iommu_lu_type {
+ IOMMU_INVALID,
+ IOMMU_INTEL,
+};
+
+struct iommu_obj_ser {
+ u32 idx;
+ u32 ref_count;
+ u32 deleted:1;
+ u32 incoming:1;
+} __packed;
+
+struct iommu_domain_ser {
+ struct iommu_obj_ser obj;
+ u64 top_table;
+ u64 top_level;
+ u64 attach_count;
+ struct iommu_domain *restored_domain;
+} __packed;
+
+struct device_domain_iommu_ser {
+ u32 did;
+ u64 domain_phys;
+ u64 iommu_phys;
+};
+
+struct device_ser {
+ struct iommu_obj_ser obj;
+ u64 token;
+ u32 devid;
+ u32 pci_domain;
+ struct device_domain_iommu_ser domain_iommu_ser;
+ enum iommu_lu_type type;
+} __packed;
+
+struct iommu_intel_ser {
+ u64 phys_addr;
+ u64 root_table;
+} __packed;
+
+struct iommu_ser {
+ struct iommu_obj_ser obj;
+ u64 token;
+ enum iommu_lu_type type;
+ union {
+ struct iommu_intel_ser intel;
+ };
+};
+
+struct iommu_objs_ser {
+ u64 next_objs;
+ u64 nr_objs;
+};
+
+struct iommus_ser {
+ struct iommu_objs_ser objs;
+ struct iommu_ser iommus[];
+} __packed;
+
+struct iommu_domains_ser {
+ struct iommu_objs_ser objs;
+ struct iommu_domain_ser iommu_domains[];
+} __packed;
+
+struct devices_ser {
+ struct iommu_objs_ser objs;
+ struct device_ser devices[];
+} __packed;
+
+#define MAX_IOMMU_SERS ((PAGE_SIZE - sizeof(struct iommus_ser)) / sizeof(struct iommu_ser))
+#define MAX_IOMMU_DOMAIN_SERS ((PAGE_SIZE - sizeof(struct iommu_domains_ser)) / sizeof(struct iommu_domain_ser))
+#define MAX_DEVICE_SERS ((PAGE_SIZE - sizeof(struct devices_ser)) / sizeof(struct device_ser))
+
+struct iommu_lu_flb_ser {
+ u64 iommus_phys;
+ u64 nr_iommus;
+ u64 iommu_domains_phys;
+ u64 nr_domains;
+ u64 devices_phys;
+ u64 nr_devices;
+} __packed;
+
+struct iommu_lu_flb_obj {
+ struct mutex lock;
+ struct iommu_lu_flb_ser *ser;
+
+ struct iommu_domains_ser *iommu_domains;
+ struct iommus_ser *iommus;
+ struct devices_ser *devices;
+};
+
+#endif /* _LINUX_KHO_ABI_IOMMU_H */
--
2.52.0.158.g65b55ccf14-goog
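
The MAX_*_SERS macros in the header above size each array so that one page header plus its flexible array fits in a single page. A minimal userspace sketch of that arithmetic follows. The 4096-byte SKETCH_PAGE_SIZE, the uint64_t stand-in for the restored_domain pointer, the flags word and every name prefixed sketch_ are assumptions made for illustration; they are not part of the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: a 4 KiB page; the kernel code uses PAGE_SIZE instead. */
#define SKETCH_PAGE_SIZE 4096u

/* Page header shared by every serialized array, shaped like iommu_objs_ser. */
struct sketch_objs_ser {
	uint64_t next_objs;	/* address of the next page, 0 if none */
	uint64_t nr_objs;	/* entries used in this page */
};

/* Hypothetical element; field sizes only loosely follow iommu_domain_ser. */
struct sketch_domain_ser {
	uint32_t idx;
	uint32_t ref_count;
	uint32_t flags;
	uint64_t top_table;
	uint64_t top_level;
	uint64_t attach_count;
	uint64_t restored_domain;	/* stand-in for the kernel pointer */
} __attribute__((packed));		/* GCC/Clang attribute */

struct sketch_domains_ser {
	struct sketch_objs_ser objs;
	struct sketch_domain_ser domains[];	/* flexible array member */
} __attribute__((packed));

/* Same shape as MAX_IOMMU_DOMAIN_SERS: whole elements left after the header. */
#define SKETCH_MAX_DOMAIN_SERS \
	((SKETCH_PAGE_SIZE - sizeof(struct sketch_domains_ser)) / \
	 sizeof(struct sketch_domain_ser))

/* Bytes consumed by a page that holds n elements. */
static size_t sketch_page_bytes(size_t n)
{
	return sizeof(struct sketch_domains_ser) +
	       n * sizeof(struct sketch_domain_ser);
}
```

The point of the division is that the computed capacity is the largest element count whose page still fits; one more element would overflow the page.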
* [RFC PATCH v2 08/32] iommu: Register IOMMU FLB with iommufd file handler
2025-12-02 23:02 [RFC PATCH v2 00/32] Add live update state preservation Samiullah Khawaja
` (6 preceding siblings ...)
2025-12-02 23:02 ` [RFC PATCH v2 07/32] iommu: Add liveupdate FLB for IOMMU state preservation Samiullah Khawaja
@ 2025-12-02 23:02 ` Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 09/32] iommu: Implement IOMMU LU FLB preserve/unpreserve callbacks Samiullah Khawaja
` (24 subsequent siblings)
32 siblings, 0 replies; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-02 23:02 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: Samiullah Khawaja, Robin Murphy, Pratyush Yadav, Kevin Tian,
Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, YiFei Zhu, Chris Li, praan
Use the IOMMU FLB register/unregister API to associate the IOMMU FLB with
the iommufd LU file handler. If FLB registration fails, the file handler
registration is rolled back.
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
---
drivers/iommu/iommufd/liveupdate.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/iommufd/liveupdate.c b/drivers/iommu/iommufd/liveupdate.c
index 42b380229c57..782585aff44a 100644
--- a/drivers/iommu/iommufd/liveupdate.c
+++ b/drivers/iommu/iommufd/liveupdate.c
@@ -8,6 +8,7 @@
#include <linux/kexec_handover.h>
#include <linux/kho/abi/iommufd.h>
#include <linux/liveupdate.h>
+#include <linux/iommu-lu.h>
#include <linux/mm.h>
#include <linux/pci.h>
@@ -362,10 +363,22 @@ static struct liveupdate_file_handler iommufd_lu_handler = {
int iommufd_liveupdate_register_lufs(void)
{
- return liveupdate_register_file_handler(&iommufd_lu_handler);
+ int ret;
+
+ ret = liveupdate_register_file_handler(&iommufd_lu_handler);
+ if (ret)
+ return ret;
+
+ ret = iommu_liveupdate_register_flb(&iommufd_lu_handler);
+ if (ret)
+ liveupdate_unregister_file_handler(&iommufd_lu_handler);
+
+ return ret;
}
int iommufd_liveupdate_unregister_lufs(void)
{
+ WARN_ON(iommu_liveupdate_unregister_flb(&iommufd_lu_handler));
+
return liveupdate_unregister_file_handler(&iommufd_lu_handler);
}
--
2.52.0.158.g65b55ccf14-goog
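
The register path in the patch above is a two-step registration with rollback of the first step when the second fails. A small userspace sketch of that pattern follows; struct sketch_state and all sketch_ functions are hypothetical stand-ins, not the liveupdate API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical registry state standing in for the liveupdate core. */
struct sketch_state {
	bool handler;	/* file handler registered */
	bool flb;	/* FLB registered */
	bool fail_flb;	/* force the second step to fail */
};

static int sketch_register_handler(struct sketch_state *s)
{
	s->handler = true;
	return 0;
}

static void sketch_unregister_handler(struct sketch_state *s)
{
	s->handler = false;
}

static int sketch_register_flb(struct sketch_state *s)
{
	if (s->fail_flb)
		return -1;
	s->flb = true;
	return 0;
}

/* Register both pieces, undoing the first if the second fails. */
static int sketch_register_all(struct sketch_state *s)
{
	int ret = sketch_register_handler(s);

	if (ret)
		return ret;

	ret = sketch_register_flb(s);
	if (ret)
		sketch_unregister_handler(s);

	return ret;
}

/* Number of components left registered after one registration attempt. */
static int sketch_registered_after(bool fail_flb)
{
	struct sketch_state s = { .fail_flb = fail_flb };

	(void)sketch_register_all(&s);
	return (int)s.handler + (int)s.flb;
}
```

Either both components end up registered or neither does, which keeps the unregister path symmetric.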
* [RFC PATCH v2 09/32] iommu: Implement IOMMU LU FLB preserve/unpreserve callbacks
2025-12-02 23:02 [RFC PATCH v2 00/32] Add live update state preservation Samiullah Khawaja
` (7 preceding siblings ...)
2025-12-02 23:02 ` [RFC PATCH v2 08/32] iommu: Register IOMMU FLB with iommufd file handler Samiullah Khawaja
@ 2025-12-02 23:02 ` Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 10/32] iommu: Add iommu_domain ops to preserve, unpreserve and restore Samiullah Khawaja
` (23 subsequent siblings)
32 siblings, 0 replies; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-02 23:02 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: Samiullah Khawaja, Robin Murphy, Pratyush Yadav, Kevin Tian,
Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, YiFei Zhu, Chris Li, praan
Use the KHO preserved-memory alloc/free helper functions to allocate
memory for the IOMMU LU FLB. The preserve callback allocates the
serialization structs for devices, domains and IOMMUs.
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
---
drivers/iommu/liveupdate.c | 71 +++++++++++++++++++++++++++++++++++++-
1 file changed, 70 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/liveupdate.c b/drivers/iommu/liveupdate.c
index 93e5691a8c1f..21ce7be9b87e 100644
--- a/drivers/iommu/liveupdate.c
+++ b/drivers/iommu/liveupdate.c
@@ -10,15 +10,84 @@
#include <linux/kexec_handover.h>
#include <linux/liveupdate.h>
#include <linux/iommu-lu.h>
+#include <linux/iommu.h>
#include <linux/errno.h>
+static void iommu_liveupdate_free_objs(u64 next, bool incoming)
+{
+ struct iommu_objs_ser *objs;
+
+ while (next) {
+ objs = __va(next);
+ next = objs->next_objs;
+
+ if (!incoming)
+ kho_unpreserve_free(objs);
+ else
+ folio_put(virt_to_folio(objs));
+ }
+}
+
static void iommu_liveupdate_flb_free(struct iommu_lu_flb_obj *obj)
{
+ if (obj->iommu_domains)
+ iommu_liveupdate_free_objs(obj->ser->iommu_domains_phys, false);
+
+ if (obj->devices)
+ iommu_liveupdate_free_objs(obj->ser->devices_phys, false);
+
+ if (obj->iommus)
+ iommu_liveupdate_free_objs(obj->ser->iommus_phys, false);
+
+ kho_unpreserve_free(obj->ser);
}
static int iommu_liveupdate_flb_preserve(struct liveupdate_flb_op_args *argp)
{
- return -EOPNOTSUPP;
+ struct iommu_lu_flb_obj *obj;
+ struct iommu_lu_flb_ser *ser;
+ void *mem;
+
+ obj = kzalloc(sizeof(*obj), GFP_KERNEL);
+ if (!obj)
+ return -ENOMEM;
+
+ mutex_init(&obj->lock);
+ mem = kho_alloc_preserve(sizeof(*ser));
+ if (IS_ERR(mem))
+ goto err_free;
+
+ ser = mem;
+ obj->ser = ser;
+
+ mem = kho_alloc_preserve(PAGE_SIZE);
+ if (IS_ERR(mem))
+ goto err_free;
+
+ obj->iommu_domains = mem;
+ ser->iommu_domains_phys = virt_to_phys(obj->iommu_domains);
+
+ mem = kho_alloc_preserve(PAGE_SIZE);
+ if (IS_ERR(mem))
+ goto err_free;
+
+ obj->devices = mem;
+ ser->devices_phys = virt_to_phys(obj->devices);
+
+ mem = kho_alloc_preserve(PAGE_SIZE);
+ if (IS_ERR(mem))
+ goto err_free;
+
+ obj->iommus = mem;
+ ser->iommus_phys = virt_to_phys(obj->iommus);
+
+ argp->obj = obj;
+ argp->data = virt_to_phys(ser);
+ return 0;
+
+err_free:
+ iommu_liveupdate_flb_free(obj);
+ return PTR_ERR(mem);
}
static void iommu_liveupdate_flb_unpreserve(struct liveupdate_flb_op_args *argp)
--
2.52.0.158.g65b55ccf14-goog
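
iommu_liveupdate_free_objs() in the patch above walks a chain of pages linked through next_objs and reads the link before releasing each page. A userspace sketch of that walk follows. It uses calloc/free and plain addresses instead of KHO helpers and physical addresses, and all sketch_ names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Hypothetical page header; mirrors the next_objs/nr_objs pair. */
struct sketch_objs {
	uintptr_t next_objs;	/* address of the next page, 0 terminates */
	uint64_t nr_objs;
};

/*
 * Free every page in the chain and return how many were freed. The link
 * is read before the page is released, so the walk never touches freed
 * memory.
 */
static unsigned int sketch_free_chain(uintptr_t next)
{
	unsigned int freed = 0;

	while (next) {
		struct sketch_objs *objs = (struct sketch_objs *)next;

		next = objs->next_objs;
		free(objs);
		freed++;
	}
	return freed;
}

/* Build a chain of n zeroed pages; returns 0 if n is 0 or allocation fails. */
static uintptr_t sketch_build_chain(unsigned int n)
{
	uintptr_t head = 0;

	for (unsigned int i = 0; i < n; i++) {
		struct sketch_objs *objs = calloc(1, SKETCH_PAGE_SIZE);

		if (!objs) {
			sketch_free_chain(head);
			return 0;
		}
		objs->next_objs = head;
		head = (uintptr_t)objs;
	}
	return head;
}
```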
* [RFC PATCH v2 10/32] iommu: Add iommu_domain ops to preserve, unpreserve and restore
2025-12-02 23:02 [RFC PATCH v2 00/32] Add live update state preservation Samiullah Khawaja
` (8 preceding siblings ...)
2025-12-02 23:02 ` [RFC PATCH v2 09/32] iommu: Implement IOMMU LU FLB preserve/unpreserve callbacks Samiullah Khawaja
@ 2025-12-02 23:02 ` Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 11/32] iommu/pages: Add APIs to preserve/unpreserve/restore iommu pages Samiullah Khawaja
` (22 subsequent siblings)
32 siblings, 0 replies; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-02 23:02 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: Samiullah Khawaja, Robin Murphy, Pratyush Yadav, Kevin Tian,
Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, YiFei Zhu, Chris Li, praan
These domain ops can be implemented by the iommu drivers if they support
iommu domain preservation across liveupdate.
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
---
include/linux/iommu.h | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 8c66284a91a8..f681d4d27d6e 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -14,6 +14,7 @@
#include <linux/err.h>
#include <linux/of.h>
#include <linux/iova_bitmap.h>
+#include <linux/kho/abi/iommu.h>
#include <uapi/linux/iommufd.h>
#define IOMMU_READ (1 << 0)
@@ -749,6 +750,11 @@ struct iommu_ops {
* specific mechanisms.
* @set_pgtable_quirks: Set io page table quirks (IO_PGTABLE_QUIRK_*)
* @free: Release the domain after use.
+ * @preserve: Preserve the iommu domain for liveupdate.
+ * Returns 0 on success, a negative errno on failure.
+ * @unpreserve: Unpreserve the iommu domain that was preserved earlier.
+ * @restore: Restore the iommu domain after liveupdate.
+ * Returns 0 on success, a negative errno on failure.
*/
struct iommu_domain_ops {
int (*attach_dev)(struct iommu_domain *domain, struct device *dev,
@@ -779,6 +785,9 @@ struct iommu_domain_ops {
unsigned long quirks);
void (*free)(struct iommu_domain *domain);
+ int (*preserve)(struct iommu_domain *domain, struct iommu_domain_ser *ser);
+ void (*unpreserve)(struct iommu_domain *domain, struct iommu_domain_ser *ser);
+ int (*restore)(struct iommu_domain *domain, struct iommu_domain_ser *ser);
};
/**
--
2.52.0.158.g65b55ccf14-goog
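
The new domain ops in the patch above are optional, so callers are expected to check for a missing callback before dispatching. A userspace sketch of that convention follows; the sketch_ types are hypothetical and only model the "NULL op means not supported" rule.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct sketch_domain;

/* Hypothetical ops table with one optional callback. */
struct sketch_domain_ops {
	int (*preserve)(struct sketch_domain *d);
};

struct sketch_domain {
	const struct sketch_domain_ops *ops;
	int preserved;
};

/* A driver that supports preservation provides the callback. */
static int sketch_pt_preserve(struct sketch_domain *d)
{
	d->preserved = 1;
	return 0;
}

static const struct sketch_domain_ops sketch_ops_with = {
	.preserve = sketch_pt_preserve,
};

/* A driver without support simply leaves the callback NULL. */
static const struct sketch_domain_ops sketch_ops_without = {
	.preserve = NULL,
};

/* Core-side wrapper: a missing op is reported as -EOPNOTSUPP. */
static int sketch_domain_preserve(struct sketch_domain *d)
{
	if (!d->ops->preserve)
		return -EOPNOTSUPP;
	return d->ops->preserve(d);
}

/* Run the wrapper against a fresh domain using the given ops table. */
static int sketch_try(const struct sketch_domain_ops *ops)
{
	struct sketch_domain d = { .ops = ops };

	return sketch_domain_preserve(&d);
}
```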
* [RFC PATCH v2 11/32] iommu/pages: Add APIs to preserve/unpreserve/restore iommu pages
2025-12-02 23:02 [RFC PATCH v2 00/32] Add live update state preservation Samiullah Khawaja
` (9 preceding siblings ...)
2025-12-02 23:02 ` [RFC PATCH v2 10/32] iommu: Add iommu_domain ops to preserve, unpreserve and restore Samiullah Khawaja
@ 2025-12-02 23:02 ` Samiullah Khawaja
2025-12-04 2:25 ` Baolu Lu
2025-12-02 23:02 ` [RFC PATCH v2 12/32] iommupt: Implement preserve/unpreserve/restore callbacks Samiullah Khawaja
` (21 subsequent siblings)
32 siblings, 1 reply; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-02 23:02 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: Samiullah Khawaja, Robin Murphy, Pratyush Yadav, Kevin Tian,
Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, YiFei Zhu, Chris Li, praan
IOMMU pages are allocated and freed through APIs built on struct
ioptdesc. Add helper functions so that the ioptdesc of each page is
preserved and restored correctly.
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
---
drivers/iommu/iommu-pages.c | 70 +++++++++++++++++++++++++++++++++++++
drivers/iommu/iommu-pages.h | 8 +++++
2 files changed, 78 insertions(+)
diff --git a/drivers/iommu/iommu-pages.c b/drivers/iommu/iommu-pages.c
index 3bab175d8557..14669b0f498f 100644
--- a/drivers/iommu/iommu-pages.c
+++ b/drivers/iommu/iommu-pages.c
@@ -6,6 +6,7 @@
#include "iommu-pages.h"
#include <linux/dma-mapping.h>
#include <linux/gfp.h>
+#include <linux/kexec_handover.h>
#include <linux/mm.h>
#define IOPTDESC_MATCH(pg_elm, elm) \
@@ -131,6 +132,75 @@ void iommu_put_pages_list(struct iommu_pages_list *list)
}
EXPORT_SYMBOL_GPL(iommu_put_pages_list);
+#if IS_ENABLED(CONFIG_LIVEUPDATE)
+void iommu_unpreserve_page(void *virt)
+{
+ kho_unpreserve_folio(ioptdesc_folio(virt_to_ioptdesc(virt)));
+}
+EXPORT_SYMBOL_GPL(iommu_unpreserve_page);
+
+int iommu_preserve_page(void *virt)
+{
+ return kho_preserve_folio(ioptdesc_folio(virt_to_ioptdesc(virt)));
+}
+EXPORT_SYMBOL_GPL(iommu_preserve_page);
+
+void iommu_unpreserve_pages(struct iommu_pages_list *list, int count)
+{
+ struct ioptdesc *iopt;
+
+ if (count < 0)
+ count = 0;
+
+ list_for_each_entry(iopt, &list->pages, iopt_freelist_elm) {
+ kho_unpreserve_folio(ioptdesc_folio(iopt));
+ if (count > 0 && --count == 0)
+ break;
+ }
+}
+EXPORT_SYMBOL_GPL(iommu_unpreserve_pages);
+
+void iommu_restore_page(u64 phys)
+{
+ struct ioptdesc *iopt;
+ struct folio *folio;
+ unsigned long pgcnt;
+ unsigned int order;
+
+ folio = kho_restore_folio(phys);
+ BUG_ON(!folio);
+
+ iopt = folio_ioptdesc(folio);
+
+ order = folio_order(folio);
+ pgcnt = 1UL << order;
+ mod_node_page_state(folio_pgdat(folio), NR_IOMMU_PAGES, pgcnt);
+ lruvec_stat_mod_folio(folio, NR_SECONDARY_PAGETABLE, pgcnt);
+}
+EXPORT_SYMBOL_GPL(iommu_restore_page);
+
+int iommu_preserve_pages(struct iommu_pages_list *list)
+{
+ struct ioptdesc *iopt;
+ int count = 0;
+ int ret;
+
+ list_for_each_entry(iopt, &list->pages, iopt_freelist_elm) {
+ ret = kho_preserve_folio(ioptdesc_folio(iopt));
+ if (ret) {
+ iommu_unpreserve_pages(list, count);
+ return ret;
+ }
+
+ ++count;
+ }
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(iommu_preserve_pages);
+
+#endif
+
/**
* iommu_pages_start_incoherent - Setup the page for cache incoherent operation
* @virt: The page to setup
diff --git a/drivers/iommu/iommu-pages.h b/drivers/iommu/iommu-pages.h
index ae9da4f571f6..ec1787776edd 100644
--- a/drivers/iommu/iommu-pages.h
+++ b/drivers/iommu/iommu-pages.h
@@ -53,6 +53,14 @@ void *iommu_alloc_pages_node_sz(int nid, gfp_t gfp, size_t size);
void iommu_free_pages(void *virt);
void iommu_put_pages_list(struct iommu_pages_list *list);
+#if IS_ENABLED(CONFIG_LIVEUPDATE)
+int iommu_preserve_page(void *virt);
+void iommu_unpreserve_page(void *virt);
+int iommu_preserve_pages(struct iommu_pages_list *list);
+void iommu_unpreserve_pages(struct iommu_pages_list *list, int count);
+void iommu_restore_page(u64 phys);
+#endif
+
/**
* iommu_pages_list_add - add the page to a iommu_pages_list
* @list: List to add the page to
--
2.52.0.158.g65b55ccf14-goog
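
iommu_preserve_pages() in the patch above is an all-or-nothing operation: when one page fails, the pages preserved so far are rolled back by count. A userspace sketch of that rollback pattern follows. struct sketch_page, the array-based list and all sketch_ functions are hypothetical; the real code walks an iommu_pages_list and calls the KHO folio helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_PAGES 8

/* Hypothetical stand-in for a page that can be marked preserved. */
struct sketch_page {
	bool preserved;
	bool fail;	/* force the preserve step to fail for this page */
};

static int sketch_preserve_one(struct sketch_page *p)
{
	if (p->fail)
		return -1;
	p->preserved = true;
	return 0;
}

/* Undo the first 'count' pages only; count == 0 undoes nothing here. */
static void sketch_unpreserve_first(struct sketch_page *pages, size_t count)
{
	for (size_t i = 0; i < count; i++)
		pages[i].preserved = false;
}

/* Preserve all pages or none: on failure, roll back what succeeded. */
static int sketch_preserve_all(struct sketch_page *pages, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		int ret = sketch_preserve_one(&pages[i]);

		if (ret) {
			sketch_unpreserve_first(pages, i);
			return ret;
		}
	}
	return 0;
}

/* Pages left preserved after trying n pages with a failure at fail_at. */
static size_t sketch_preserved_after(size_t n, size_t fail_at)
{
	struct sketch_page pages[SKETCH_MAX_PAGES] = { 0 };
	size_t preserved = 0;

	if (n > SKETCH_MAX_PAGES)
		n = SKETCH_MAX_PAGES;
	if (fail_at < n)
		pages[fail_at].fail = true;

	(void)sketch_preserve_all(pages, n);
	for (size_t i = 0; i < n; i++)
		preserved += pages[i].preserved;
	return preserved;
}
```

The sketch treats a count of 0 as "nothing to undo". iommu_unpreserve_pages() in the patch appears to walk the entire list when count is 0, so a failure on the very first page takes that path; which behaviour is intended there is for the series to decide.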
* [RFC PATCH v2 12/32] iommupt: Implement preserve/unpreserve/restore callbacks
2025-12-02 23:02 [RFC PATCH v2 00/32] Add live update state preservation Samiullah Khawaja
` (10 preceding siblings ...)
2025-12-02 23:02 ` [RFC PATCH v2 11/32] iommu/pages: Add APIs to preserve/unpreserve/restore iommu pages Samiullah Khawaja
@ 2025-12-02 23:02 ` Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 13/32] iommu: Add APIs to preserve/unpreserve iommu domains Samiullah Khawaja
` (20 subsequent siblings)
32 siblings, 0 replies; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-02 23:02 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: Samiullah Khawaja, Robin Murphy, Pratyush Yadav, Kevin Tian,
Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, YiFei Zhu, Chris Li, praan
Implement the iommu domain ops for preservation, unpreservation and
restoration of iommu domains for liveupdate. Use the existing page
walker to preserve the ioptdesc of the top_table and the lower tables.
Also preserve the top_level so that it can be restored during boot.
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
---
drivers/iommu/generic_pt/iommu_pt.h | 100 ++++++++++++++++++++++++++++
include/linux/generic_pt/iommu.h | 10 +++
2 files changed, 110 insertions(+)
diff --git a/drivers/iommu/generic_pt/iommu_pt.h b/drivers/iommu/generic_pt/iommu_pt.h
index 032d04ec7b56..f71b8c92372d 100644
--- a/drivers/iommu/generic_pt/iommu_pt.h
+++ b/drivers/iommu/generic_pt/iommu_pt.h
@@ -354,6 +354,7 @@ static int __collect_tables(struct pt_range *range, void *arg,
return ret;
continue;
}
+
if (pts.type == PT_ENTRY_OA && collect->check_mapped)
return -EADDRINUSE;
}
@@ -918,6 +919,105 @@ int DOMAIN_NS(map_pages)(struct iommu_domain *domain, unsigned long iova,
}
EXPORT_SYMBOL_NS_GPL(DOMAIN_NS(map_pages), "GENERIC_PT_IOMMU");
+/**
+ * unpreserve() - Unpreserve page tables and other state of a domain.
+ * @domain: Domain to unpreserve
+ */
+void DOMAIN_NS(unpreserve)(struct iommu_domain *domain, struct iommu_domain_ser *ser)
+{
+ struct pt_iommu *iommu_table =
+ container_of(domain, struct pt_iommu, domain);
+ struct pt_common *common = common_from_iommu(iommu_table);
+ struct pt_range range = pt_all_range(common);
+ struct pt_iommu_collect_args collect = {
+ .free_list = IOMMU_PAGES_LIST_INIT(collect.free_list),
+ };
+
+ iommu_pages_list_add(&collect.free_list, range.top_table);
+ pt_walk_range(&range, __collect_tables, &collect);
+
+ iommu_unpreserve_pages(&collect.free_list, -1);
+}
+EXPORT_SYMBOL_NS_GPL(DOMAIN_NS(unpreserve), "GENERIC_PT_IOMMU");
+
+/**
+ * preserve() - Preserve page tables and other state of a domain.
+ * @domain: Domain to preserve
+ *
+ * Returns: -ERRNO on failure, 0 on success.
+ */
+int DOMAIN_NS(preserve)(struct iommu_domain *domain, struct iommu_domain_ser *ser)
+{
+ struct pt_iommu *iommu_table =
+ container_of(domain, struct pt_iommu, domain);
+ struct pt_common *common = common_from_iommu(iommu_table);
+ struct pt_range range = pt_all_range(common);
+ struct pt_iommu_collect_args collect = {
+ .free_list = IOMMU_PAGES_LIST_INIT(collect.free_list),
+ };
+ int ret;
+
+ iommu_pages_list_add(&collect.free_list, range.top_table);
+ pt_walk_range(&range, __collect_tables, &collect);
+
+ ret = iommu_preserve_pages(&collect.free_list);
+ if (ret)
+ return ret;
+
+ ser->top_table = virt_to_phys(range.top_table);
+ ser->top_level = range.top_level;
+
+ return 0;
+}
+EXPORT_SYMBOL_NS_GPL(DOMAIN_NS(preserve), "GENERIC_PT_IOMMU");
+
+static int __restore_tables(struct pt_range *range, void *arg,
+ unsigned int level, struct pt_table_p *table)
+{
+ struct pt_state pts = pt_init(range, level, table);
+ int ret;
+
+ for_each_pt_level_entry(&pts) {
+ if (pts.type == PT_ENTRY_TABLE) {
+ iommu_restore_page(virt_to_phys(pts.table_lower));
+ ret = pt_descend(&pts, arg, __restore_tables);
+ if (ret)
+ return ret;
+ continue;
+ }
+ }
+ return 0;
+}
+
+/**
+ * restore() - Restore page tables and other state of a domain.
+ * @domain: Domain to restore
+ *
+ * Returns: -ERRNO on failure, 0 on success.
+ */
+int DOMAIN_NS(restore)(struct iommu_domain *domain, struct iommu_domain_ser *ser)
+{
+ struct pt_iommu *iommu_table =
+ container_of(domain, struct pt_iommu, domain);
+ struct pt_common *common = common_from_iommu(iommu_table);
+ struct pt_range range = pt_all_range(common);
+
+ iommu_restore_page(ser->top_table);
+
+ /* Free new table */
+ iommu_free_pages(range.top_table);
+
+ /* Set the restored top table */
+ pt_top_set(common, phys_to_virt(ser->top_table), ser->top_level);
+
+ /* Restore all lower-level tables */
+ range = pt_all_range(common);
+ pt_walk_range(&range, __restore_tables, NULL);
+
+ return 0;
+}
+EXPORT_SYMBOL_NS_GPL(DOMAIN_NS(restore), "GENERIC_PT_IOMMU");
+
struct pt_unmap_args {
struct iommu_pages_list free_list;
pt_vaddr_t unmapped;
diff --git a/include/linux/generic_pt/iommu.h b/include/linux/generic_pt/iommu.h
index cfe05a77f86b..d67d1d8b509f 100644
--- a/include/linux/generic_pt/iommu.h
+++ b/include/linux/generic_pt/iommu.h
@@ -13,6 +13,7 @@ struct iommu_iotlb_gather;
struct pt_iommu_ops;
struct pt_iommu_driver_ops;
struct iommu_dirty_bitmap;
+struct iommu_domain_ser;
/**
* DOC: IOMMU Radix Page Table
@@ -198,6 +199,12 @@ struct pt_iommu_cfg {
unsigned long iova, phys_addr_t paddr, \
size_t pgsize, size_t pgcount, \
int prot, gfp_t gfp, size_t *mapped); \
+ int pt_iommu_##fmt##_preserve(struct iommu_domain *domain, \
+ struct iommu_domain_ser *ser); \
+ void pt_iommu_##fmt##_unpreserve(struct iommu_domain *domain, \
+ struct iommu_domain_ser *ser); \
+ int pt_iommu_##fmt##_restore(struct iommu_domain *domain, \
+ struct iommu_domain_ser *ser); \
size_t pt_iommu_##fmt##_unmap_pages( \
struct iommu_domain *domain, unsigned long iova, \
size_t pgsize, size_t pgcount, \
@@ -224,6 +231,9 @@ struct pt_iommu_cfg {
#define IOMMU_PT_DOMAIN_OPS(fmt) \
.iova_to_phys = &pt_iommu_##fmt##_iova_to_phys, \
.map_pages = &pt_iommu_##fmt##_map_pages, \
+ .preserve = &pt_iommu_##fmt##_preserve, \
+ .unpreserve = &pt_iommu_##fmt##_unpreserve, \
+ .restore = &pt_iommu_##fmt##_restore, \
.unmap_pages = &pt_iommu_##fmt##_unmap_pages
#define IOMMU_PT_DIRTY_OPS(fmt) \
.read_and_clear_dirty = &pt_iommu_##fmt##_read_and_clear_dirty
--
2.52.0.158.g65b55ccf14-goog
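
The restore path in the patch above restores the top table first and then recursively restores each lower table before descending into it. A userspace sketch of that walk order follows. The four-entry tables and every sketch_ name are hypothetical; the real code uses the generic page table walker (pt_walk_range, pt_descend) and iommu_restore_page().

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_ENTRIES 4	/* assumption: tiny tables for illustration */

enum sketch_entry_type { SKETCH_EMPTY, SKETCH_LEAF, SKETCH_TABLE };

struct sketch_table;

struct sketch_entry {
	enum sketch_entry_type type;
	struct sketch_table *lower;	/* valid only for SKETCH_TABLE */
};

struct sketch_table {
	struct sketch_entry entries[SKETCH_ENTRIES];
	int restored;
};

/* Restore every lower table reachable from 'table'; returns the count. */
static unsigned int sketch_restore_lower(struct sketch_table *table)
{
	unsigned int n = 0;

	for (size_t i = 0; i < SKETCH_ENTRIES; i++) {
		struct sketch_entry *e = &table->entries[i];

		if (e->type != SKETCH_TABLE)
			continue;

		/* Restore the lower table before reading its entries. */
		e->lower->restored = 1;
		n += 1 + sketch_restore_lower(e->lower);
	}
	return n;
}

/* Restore the top table first, then everything below it. */
static unsigned int sketch_restore(struct sketch_table *top)
{
	top->restored = 1;
	return 1 + sketch_restore_lower(top);
}

/* Three-level example: top -> mid -> {leaf_a, leaf_b}. */
static unsigned int sketch_demo(void)
{
	static struct sketch_table top, mid, leaf_a, leaf_b;

	top.entries[0] = (struct sketch_entry){ SKETCH_TABLE, &mid };
	top.entries[1] = (struct sketch_entry){ SKETCH_LEAF, NULL };
	mid.entries[2] = (struct sketch_entry){ SKETCH_TABLE, &leaf_a };
	mid.entries[3] = (struct sketch_entry){ SKETCH_TABLE, &leaf_b };

	return sketch_restore(&top);
}
```

Restoring a table before descending matters because its entries can only be read safely once the backing page has been reclaimed from the preserved state.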
* [RFC PATCH v2 13/32] iommu: Add APIs to preserve/unpreserve iommu domains
2025-12-02 23:02 [RFC PATCH v2 00/32] Add live update state preservation Samiullah Khawaja
` (11 preceding siblings ...)
2025-12-02 23:02 ` [RFC PATCH v2 12/32] iommupt: Implement preserve/unpreserve/restore callbacks Samiullah Khawaja
@ 2025-12-02 23:02 ` Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 14/32] iommufd: Use the iommu_domain_preserve/unpreserve APIs Samiullah Khawaja
` (19 subsequent siblings)
32 siblings, 0 replies; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-02 23:02 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: Samiullah Khawaja, Robin Murphy, Pratyush Yadav, Kevin Tian,
Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, YiFei Zhu, Chris Li, praan
These APIs will be used by the iommufd LU handler to preserve and
unpreserve a domain for live update.
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
---
drivers/iommu/liveupdate.c | 82 ++++++++++++++++++++++++++++++++++++++
include/linux/iommu-lu.h | 2 +
include/linux/iommu.h | 4 ++
3 files changed, 88 insertions(+)
diff --git a/drivers/iommu/liveupdate.c b/drivers/iommu/liveupdate.c
index 21ce7be9b87e..25a943e5e1e3 100644
--- a/drivers/iommu/liveupdate.c
+++ b/drivers/iommu/liveupdate.c
@@ -127,3 +127,85 @@ int iommu_liveupdate_unregister_flb(struct liveupdate_file_handler *handler)
return liveupdate_unregister_flb(handler, &iommu_flb);
}
EXPORT_SYMBOL(iommu_liveupdate_unregister_flb);
+
+static int reserve_obj_ser(struct iommu_objs_ser **objs_ptr, u64 max_objs)
+{
+ struct iommu_objs_ser *next_objs, *objs = *objs_ptr;
+ int idx;
+
+ if (objs->nr_objs == max_objs) {
+ next_objs = kho_alloc_preserve(PAGE_SIZE);
+ if (IS_ERR(next_objs))
+ return PTR_ERR(next_objs);
+
+ objs->next_objs = virt_to_phys(next_objs);
+ objs = next_objs;
+ *objs_ptr = objs;
+ objs->nr_objs = 0;
+ }
+
+ idx = objs->nr_objs++;
+ return idx;
+}
+
+int iommu_domain_preserve(struct iommu_domain *domain, struct iommu_domain_ser **ser)
+{
+ struct iommu_domain_ser *domain_ser;
+ struct iommu_lu_flb_obj *flb_obj;
+ int idx, ret;
+
+ if (!domain->ops->preserve)
+ return -EOPNOTSUPP;
+
+ ret = liveupdate_flb_get_outgoing(&iommu_flb, (void **)&flb_obj);
+ if (ret)
+ return ret;
+
+ guard(mutex)(&flb_obj->lock);
+ idx = reserve_obj_ser((struct iommu_objs_ser **)&flb_obj->iommu_domains,
+ MAX_IOMMU_DOMAIN_SERS);
+ if (idx < 0)
+ return idx;
+
+ domain_ser = &flb_obj->iommu_domains->iommu_domains[idx];
+ idx = flb_obj->ser->nr_domains++;
+ domain_ser->obj.idx = idx;
+ domain_ser->obj.ref_count = 1;
+
+ ret = domain->ops->preserve(domain, domain_ser);
+ if (ret) {
+ domain_ser->obj.deleted = true;
+ return ret;
+ }
+
+ domain->preserved_state = domain_ser;
+ *ser = domain_ser;
+ return 0;
+}
+EXPORT_SYMBOL_GPL(iommu_domain_preserve);
+
+int iommu_domain_unpreserve(struct iommu_domain *domain)
+{
+ struct iommu_domain_ser *domain_ser;
+ struct iommu_lu_flb_obj *flb_obj;
+ int ret;
+
+ if (!domain->ops->unpreserve)
+ return -EOPNOTSUPP;
+
+ ret = liveupdate_flb_get_outgoing(&iommu_flb, (void **)&flb_obj);
+ if (ret)
+ return ret;
+
+ guard(mutex)(&flb_obj->lock);
+ domain_ser = domain->preserved_state;
+ if (domain_ser->attach_count)
+ return -EBUSY;
+
+ domain->ops->unpreserve(domain, domain_ser);
+ domain_ser->obj.deleted = true;
+ domain->preserved_state = NULL;
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(iommu_domain_unpreserve);
diff --git a/include/linux/iommu-lu.h b/include/linux/iommu-lu.h
index 59095d2f1bb2..2c8e5ac746ad 100644
--- a/include/linux/iommu-lu.h
+++ b/include/linux/iommu-lu.h
@@ -11,6 +11,8 @@
#include <linux/liveupdate.h>
#include <linux/kho/abi/iommu.h>
+int iommu_domain_preserve(struct iommu_domain *domain, struct iommu_domain_ser **ser);
+int iommu_domain_unpreserve(struct iommu_domain *domain);
int iommu_liveupdate_register_flb(struct liveupdate_file_handler *handler);
int iommu_liveupdate_unregister_flb(struct liveupdate_file_handler *handler);
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index f681d4d27d6e..17e1f3c29958 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -249,6 +249,10 @@ struct iommu_domain {
struct list_head next;
};
};
+
+#ifdef CONFIG_LIVEUPDATE
+ struct iommu_domain_ser *preserved_state;
+#endif
};
static inline bool iommu_is_dma_domain(struct iommu_domain *domain)
--
2.52.0.158.g65b55ccf14-goog
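
reserve_obj_ser() in the patch above hands out the next free slot in the current page and chains a fresh page when the current one is full, so the returned index is relative to the (possibly new) current page. A userspace sketch follows. The capacity of 4, the plain pointer link (the patch stores a physical address) and all sketch_ names are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_MAX_OBJS 4u	/* assumption: tiny capacity to show chaining */

struct sketch_objs {
	struct sketch_objs *next;	/* the patch stores a physical address */
	uint64_t nr_objs;
};

struct sketch_result {
	int idx;		/* last index handed out, -1 on failure */
	unsigned int pages;	/* pages in the chain afterwards */
};

/*
 * Reserve one slot in the current page, chaining a new page when full.
 * Returns the slot index within the current page, or -1 on failure.
 */
static int sketch_reserve(struct sketch_objs **cur)
{
	struct sketch_objs *objs = *cur;

	if (objs->nr_objs == SKETCH_MAX_OBJS) {
		struct sketch_objs *next = calloc(1, sizeof(*next));

		if (!next)
			return -1;

		objs->next = next;
		objs = next;
		*cur = objs;
	}
	return (int)objs->nr_objs++;
}

/* Reserve n slots from a fresh chain, then free the chain. */
static struct sketch_result sketch_reserve_n(unsigned int n)
{
	struct sketch_result res = { .idx = -1, .pages = 0 };
	struct sketch_objs *head = calloc(1, sizeof(*head));
	struct sketch_objs *cur = head;

	for (unsigned int i = 0; head && i < n; i++) {
		res.idx = sketch_reserve(&cur);
		if (res.idx < 0)
			break;
	}

	while (head) {
		struct sketch_objs *next = head->next;

		free(head);
		head = next;
		res.pages++;
	}
	return res;
}
```

Because the index restarts at 0 on every new page, a caller needs a separate running counter for a global object index, which is the role nr_domains plays in iommu_domain_preserve().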
* [RFC PATCH v2 14/32] iommufd: Use the iommu_domain_preserve/unpreserve APIs
2025-12-02 23:02 [RFC PATCH v2 00/32] Add live update state preservation Samiullah Khawaja
` (12 preceding siblings ...)
2025-12-02 23:02 ` [RFC PATCH v2 13/32] iommu: Add APIs to preserve/unpreserve iommu domains Samiullah Khawaja
@ 2025-12-02 23:02 ` Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 15/32] iommu: Add API to keep track of iommu domain attachments Samiullah Khawaja
` (18 subsequent siblings)
32 siblings, 0 replies; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-02 23:02 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: Samiullah Khawaja, Robin Murphy, Pratyush Yadav, Kevin Tian,
Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, YiFei Zhu, Chris Li, praan
Remove the stub definitions of iommu_domain_preserve and
iommu_domain_restore, and use the preserve/unpreserve APIs exported by
the IOMMU core.
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
---
drivers/iommu/iommufd/iommufd_private.h | 2 --
drivers/iommu/iommufd/liveupdate.c | 19 ++++++++++++-------
2 files changed, 12 insertions(+), 9 deletions(-)
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index 15afff6ba0ea..0d358e5486d0 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -730,8 +730,6 @@ int iommufd_hwpt_lu_set_preserved(struct iommufd_ucmd *ucmd);
int iommufd_hwpt_lu_restore(struct iommufd_ucmd *ucmd);
/* TODO */
-#define iommu_domain_restore(x) ERR_PTR(-EOPNOTSUPP)
-#define iommu_domain_preserve(x, y) (-EOPNOTSUPP)
#define iommu_domain_has_attachments(x) (false)
#else
static inline int iommufd_liveupdate_register_lufs(void)
diff --git a/drivers/iommu/iommufd/liveupdate.c b/drivers/iommu/iommufd/liveupdate.c
index 782585aff44a..5b45071d7dd2 100644
--- a/drivers/iommu/iommufd/liveupdate.c
+++ b/drivers/iommu/iommufd/liveupdate.c
@@ -59,6 +59,7 @@ static int iommufd_save_hwpts(struct iommufd_ctx *ictx,
struct iommufd_lu *iommufd_lu)
{
struct iommufd_hwpt_paging *hwpt, **hwpts = NULL;
+ struct iommu_domain_ser *domain_ser;
struct iommufd_hwpt_lu *hwpt_lu;
struct iommufd_object *obj;
unsigned int nr_hwpts = 0;
@@ -119,8 +120,11 @@ static int iommufd_save_hwpts(struct iommufd_ctx *ictx,
hwpt = hwpts[i];
hwpt_lu = &iommufd_lu->hwpts[i];
- rc = iommu_domain_preserve(hwpt->common.domain, &hwpt_lu->domain_data);
- goto out;
+ rc = iommu_domain_preserve(hwpt->common.domain, &domain_ser);
+ if (rc < 0)
+ goto out;
+
+ hwpt_lu->domain_data = __pa(domain_ser);
}
}
@@ -201,7 +205,7 @@ static void iommufd_liveupdate_unpreserve(struct liveupdate_file_op_args *args)
if (!hwpt->common.domain)
continue;
- /* TODO: WARN_ON(iommu_domain_unpreserve(hwpt->common.domain)); */
+ WARN_ON(iommu_domain_unpreserve(hwpt->common.domain));
}
xa_unlock(&ictx->objects);
@@ -276,6 +280,7 @@ int iommufd_hwpt_lu_restore(struct iommufd_ucmd *ucmd)
struct iommu_hwpt_lu_restore *cmd = ucmd->cmd;
struct iommufd_hwpt_paging *hwpt = NULL;
struct iommufd_ctx *ictx = ucmd->ictx;
+ struct iommu_domain_ser *domain_ser;
struct iommufd_hwpt_lu *hwpt_lu;
struct iommufd_lu *iommufd_lu;
struct iommu_domain *domain;
@@ -303,10 +308,10 @@ int iommufd_hwpt_lu_restore(struct iommufd_ucmd *ucmd)
if (IS_ERR(hwpt))
return PTR_ERR(hwpt);
- /* a successful iommu_domain_restore mars the point of no return */
- domain = iommu_domain_restore(hwpt_lu->domain_data);
- if (IS_ERR(domain)) {
- rc = PTR_ERR(domain);
+ domain_ser = __va(hwpt_lu->domain_data);
+ domain = domain_ser->restored_domain;
+ if (!domain) {
+ rc = -ENOENT;
goto err_destroy;
}
--
2.52.0.158.g65b55ccf14-goog
* [RFC PATCH v2 15/32] iommu: Add API to keep track of iommu domain attachments
2025-12-02 23:02 [RFC PATCH v2 00/32] Add live update state preservation Samiullah Khawaja
` (13 preceding siblings ...)
2025-12-02 23:02 ` [RFC PATCH v2 14/32] iommufd: Use the iommu_domain_preserve/unpreserve APIs Samiullah Khawaja
@ 2025-12-02 23:02 ` Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 16/32] iommu: Add API to preserve/unpreserve a device Samiullah Khawaja
` (17 subsequent siblings)
32 siblings, 0 replies; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-02 23:02 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: Samiullah Khawaja, Robin Murphy, Pratyush Yadav, Kevin Tian,
Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, YiFei Zhu, Chris Li, praan
Keep track of iommu domain attachments by incrementing and decrementing
a counter when a device is attached to or detached from a domain.
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
---
drivers/iommu/iommu.c | 18 ++++++++++++++++++
include/linux/iommu.h | 3 +++
2 files changed, 21 insertions(+)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 2ca990dfbb88..a70898d11959 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2041,6 +2041,7 @@ static void iommu_domain_init(struct iommu_domain *domain, unsigned int type,
{
domain->type = type;
domain->owner = ops;
+ atomic_set(&domain->attach_count, 0);
if (!domain->ops)
domain->ops = ops->default_domain_ops;
}
@@ -2093,8 +2094,20 @@ struct iommu_domain *iommu_paging_domain_alloc_flags(struct device *dev,
}
EXPORT_SYMBOL_GPL(iommu_paging_domain_alloc_flags);
+bool iommu_domain_has_attachments(struct iommu_domain *domain)
+{
+ return atomic_read(&domain->attach_count) != 0;
+}
+EXPORT_SYMBOL_GPL(iommu_domain_has_attachments);
+
void iommu_domain_free(struct iommu_domain *domain)
{
+ if (WARN_ON_ONCE(iommu_domain_has_attachments(domain))) {
+ pr_err("Attempt to free an iommu_domain that has attachments: %d\n",
+ atomic_read(&domain->attach_count));
+ return;
+ }
+
switch (domain->cookie_type) {
case IOMMU_COOKIE_DMA_IOVA:
iommu_put_dma_cookie(domain);
@@ -2140,6 +2153,11 @@ static int __iommu_attach_device(struct iommu_domain *domain,
ret = domain->ops->attach_dev(domain, dev, old);
if (ret)
return ret;
+
+ atomic_inc(&domain->attach_count);
+ if (old)
+ atomic_dec(&old->attach_count);
+
dev->iommu->attach_deferred = 0;
trace_attach_device_to_domain(dev);
return 0;
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 17e1f3c29958..387cbe20c83b 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -14,6 +14,7 @@
#include <linux/err.h>
#include <linux/of.h>
#include <linux/iova_bitmap.h>
+#include <linux/atomic.h>
#include <linux/kho/abi/iommu.h>
#include <uapi/linux/iommufd.h>
@@ -250,6 +251,7 @@ struct iommu_domain {
};
};
+ atomic_t attach_count;
#ifdef CONFIG_LIVEUPDATE
struct iommu_domain_ser *preserved_state;
#endif
@@ -917,6 +919,7 @@ static inline struct iommu_domain *iommu_paging_domain_alloc(struct device *dev)
{
return iommu_paging_domain_alloc_flags(dev, 0);
}
+extern bool iommu_domain_has_attachments(struct iommu_domain *domain);
extern void iommu_domain_free(struct iommu_domain *domain);
extern int iommu_attach_device(struct iommu_domain *domain,
struct device *dev);
--
2.52.0.158.g65b55ccf14-goog
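
The attachment tracking in the patch above moves one count from the old domain to the new one on a successful attach and refuses to free a domain whose count is non-zero. A userspace sketch using C11 atomics follows; struct sketch_domain and the sketch_ functions are hypothetical and only model the counting rule, not the kernel's atomic_t API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical domain carrying only the attachment counter. */
struct sketch_domain {
	atomic_int attach_count;
	bool freed;
};

static void sketch_domain_init(struct sketch_domain *d)
{
	atomic_init(&d->attach_count, 0);
	d->freed = false;
}

/* Move one attachment from 'old' (may be NULL) to 'domain'. */
static void sketch_attach(struct sketch_domain *domain,
			  struct sketch_domain *old)
{
	atomic_fetch_add(&domain->attach_count, 1);
	if (old)
		atomic_fetch_sub(&old->attach_count, 1);
}

static bool sketch_has_attachments(struct sketch_domain *d)
{
	return atomic_load(&d->attach_count) != 0;
}

/* Refuse to free a domain that still has attachments. */
static bool sketch_free(struct sketch_domain *d)
{
	if (sketch_has_attachments(d))
		return false;
	d->freed = true;
	return true;
}

/*
 * Attach to a, move the attachment to b, then try to free both.
 * Returns true when a can be freed and b cannot.
 */
static bool sketch_move_then_free(void)
{
	struct sketch_domain a, b;

	sketch_domain_init(&a);
	sketch_domain_init(&b);

	sketch_attach(&a, NULL);
	if (sketch_free(&a))
		return false;	/* a is still attached; free must refuse */

	sketch_attach(&b, &a);
	return sketch_free(&a) && !sketch_free(&b);
}
```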
* [RFC PATCH v2 16/32] iommu: Add API to preserve/unpreserve a device
2025-12-02 23:02 [RFC PATCH v2 00/32] Add live update state preservation Samiullah Khawaja
` (14 preceding siblings ...)
2025-12-02 23:02 ` [RFC PATCH v2 15/32] iommu: Add API to keep track of iommu domain attachments Samiullah Khawaja
@ 2025-12-02 23:02 ` Samiullah Khawaja
2025-12-04 5:46 ` Baolu Lu
2025-12-02 23:02 ` [RFC PATCH v2 17/32] iommu/vt-d: Implement device and iommu preserve/unpreserve ops Samiullah Khawaja
` (16 subsequent siblings)
32 siblings, 1 reply; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-02 23:02 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: Samiullah Khawaja, Robin Murphy, Pratyush Yadav, Kevin Tian,
Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, YiFei Zhu, Chris Li, praan
iommu_preserve_device/iommu_unpreserve_device can be used to
preserve/unpreserve a device for liveupdate. During device preservation
the state of the associated IOMMU is also preserved. The device can only
be preserved if the attached iommu domain is preserved and the associated
iommu supports preservation.
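
For illustration only (not part of this patch): the rule above, where
each preserved device takes a reference on the preserved state of its
IOMMU and the IOMMU itself is serialized only once, can be modelled in
userspace. This is a minimal sketch under stated assumptions; every
mock_* name below is hypothetical and does not exist in the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the serialized state of one IOMMU. */
struct mock_iommu_ser {
	unsigned int ref_count;	/* number of preserved devices using it */
	bool deleted;		/* set once the last reference is dropped */
};

/* Hypothetical stand-in for an IOMMU instance. */
struct mock_iommu {
	bool supports_preserve;			/* models ops->preserve != NULL */
	struct mock_iommu_ser ser;		/* backing storage for the state */
	struct mock_iommu_ser *outgoing;	/* NULL until first preservation */
};

struct mock_domain {
	bool preserved;		/* models domain->preserved_state != NULL */
};

struct mock_device {
	struct mock_iommu *iommu;
	bool preserved;
};

/* Preserve the IOMMU once; later callers only take a reference. */
int mock_iommu_preserve(struct mock_iommu *iommu)
{
	if (!iommu->supports_preserve)
		return -EOPNOTSUPP;

	if (iommu->outgoing) {
		iommu->outgoing->ref_count++;
		return 0;
	}

	iommu->ser.ref_count = 1;
	iommu->ser.deleted = false;
	iommu->outgoing = &iommu->ser;
	return 0;
}

/* Drop one reference; the state is released with the last one. */
void mock_iommu_unpreserve(struct mock_iommu *iommu)
{
	assert(iommu->outgoing && iommu->outgoing->ref_count > 0);

	if (--iommu->outgoing->ref_count)
		return;

	iommu->outgoing->deleted = true;
	iommu->outgoing = NULL;
}

/* A device is preservable only on a preserved domain and a capable IOMMU. */
int mock_preserve_device(struct mock_domain *domain, struct mock_device *dev)
{
	int ret;

	if (!domain->preserved)
		return -EINVAL;

	ret = mock_iommu_preserve(dev->iommu);
	if (ret)
		return ret;

	dev->preserved = true;
	return 0;
}

void mock_unpreserve_device(struct mock_device *dev)
{
	if (!dev->preserved)
		return;

	dev->preserved = false;
	mock_iommu_unpreserve(dev->iommu);
}
```

In this model two devices behind the same IOMMU leave ref_count at 2,
and the IOMMU state is released only when the second device is
unpreserved.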
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
---
drivers/iommu/iommu.c | 3 +
drivers/iommu/liveupdate.c | 115 +++++++++++++++++++++++++++++++++++++
include/linux/iommu-lu.h | 2 +
include/linux/iommu.h | 18 ++++++
4 files changed, 138 insertions(+)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index a70898d11959..3feb440de40a 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -382,6 +382,9 @@ static struct dev_iommu *dev_iommu_get(struct device *dev)
mutex_init(&param->lock);
dev->iommu = param;
+#ifdef CONFIG_LIVEUPDATE
+ dev->iommu->device_ser = NULL;
+#endif
return param;
}
diff --git a/drivers/iommu/liveupdate.c b/drivers/iommu/liveupdate.c
index 25a943e5e1e3..5780761a7024 100644
--- a/drivers/iommu/liveupdate.c
+++ b/drivers/iommu/liveupdate.c
@@ -11,6 +11,7 @@
#include <linux/liveupdate.h>
#include <linux/iommu-lu.h>
#include <linux/iommu.h>
+#include <linux/pci.h>
#include <linux/errno.h>
static void iommu_liveupdate_free_objs(u64 next, bool incoming)
@@ -209,3 +210,117 @@ int iommu_domain_unpreserve(struct iommu_domain *domain)
return 0;
}
EXPORT_SYMBOL_GPL(iommu_domain_unpreserve);
+
+static int iommu_preserve_locked(struct iommu_device *iommu)
+{
+ struct iommu_lu_flb_obj *flb_obj;
+ struct iommu_ser *iommu_ser;
+ int idx, ret;
+
+ if (!iommu->ops->preserve)
+ return -EOPNOTSUPP;
+
+ if (iommu->outgoing_preserved_state) {
+ iommu->outgoing_preserved_state->obj.ref_count++;
+ return 0;
+ }
+
+ ret = liveupdate_flb_get_outgoing(&iommu_flb, (void **)&flb_obj);
+ if (ret)
+ return ret;
+
+ idx = reserve_obj_ser((struct iommu_objs_ser **)&flb_obj->iommus, MAX_IOMMU_SERS);
+ if (idx < 0)
+ return idx;
+
+ iommu_ser = &flb_obj->iommus->iommus[idx];
+ idx = flb_obj->ser->nr_iommus++;
+ iommu_ser->obj.idx = idx;
+ iommu_ser->obj.ref_count = 1;
+
+ ret = iommu->ops->preserve(iommu, iommu_ser);
+ if (ret)
+ iommu_ser->obj.deleted = true;
+
+ iommu->outgoing_preserved_state = iommu_ser;
+ return ret;
+}
+
+static void iommu_unpreserve_locked(struct iommu_device *iommu)
+{
+ struct iommu_ser *iommu_ser = iommu->outgoing_preserved_state;
+
+ iommu_ser->obj.ref_count--;
+ if (iommu_ser->obj.ref_count)
+ return;
+
+ iommu->outgoing_preserved_state = NULL;
+ iommu->ops->unpreserve(iommu, iommu_ser);
+ iommu_ser->obj.deleted = true;
+}
+
+int iommu_preserve_device(struct iommu_domain *domain, struct device *dev)
+{
+ struct iommu_lu_flb_obj *flb_obj;
+ struct device_ser *device_ser;
+ struct dev_iommu *iommu;
+ struct pci_dev *pdev;
+ int ret, idx;
+
+ if (!dev_is_pci(dev))
+ return -EOPNOTSUPP;
+
+ if (!domain->preserved_state)
+ return -EINVAL;
+
+ pdev = to_pci_dev(dev);
+ iommu = dev->iommu;
+ if (!iommu->iommu_dev->ops->preserve_device ||
+ !iommu->iommu_dev->ops->preserve)
+ return -EOPNOTSUPP;
+
+ if (!iommu->iommu_dev->ops->preserve)
+ return -EOPNOTSUPP;
+
+ ret = liveupdate_flb_get_outgoing(&iommu_flb, (void **)&flb_obj);
+ if (ret)
+ return ret;
+
+ guard(mutex)(&flb_obj->lock);
+ idx = reserve_obj_ser((struct iommu_objs_ser **)&flb_obj->devices, MAX_IOMMU_SERS);
+ if (idx < 0)
+ return idx;
+
+ device_ser = &flb_obj->devices->devices[idx];
+ idx = flb_obj->ser->nr_devices++;
+ device_ser->obj.idx = idx;
+ device_ser->obj.ref_count = 1;
+
+ ret = iommu_preserve_locked(iommu->iommu_dev);
+ if (ret) {
+ device_ser->obj.deleted = true;
+ return ret;
+ }
+
+ device_ser->domain_iommu_ser.domain_phys = __pa(domain->preserved_state);
+ device_ser->domain_iommu_ser.iommu_phys = __pa(iommu->iommu_dev->outgoing_preserved_state);
+ device_ser->devid = pci_dev_id(pdev);
+ device_ser->pci_domain = pci_domain_nr(pdev->bus);
+ device_ser->token = device_ser->obj.idx + 1;
+
+ ret = iommu->iommu_dev->ops->preserve_device(dev, device_ser);
+ if (ret) {
+ device_ser->obj.deleted = true;
+ iommu_unpreserve_locked(iommu->iommu_dev);
+ return ret;
+ }
+
+ dev->iommu->device_ser = device_ser;
+ domain->preserved_state->attach_count++;
+ return device_ser->token;
+}
+
+int iommu_unpreserve_device(struct iommu_domain *domain, struct device *dev)
+{
+ return -EOPNOTSUPP;
+}
diff --git a/include/linux/iommu-lu.h b/include/linux/iommu-lu.h
index 2c8e5ac746ad..95375530b7be 100644
--- a/include/linux/iommu-lu.h
+++ b/include/linux/iommu-lu.h
@@ -13,6 +13,8 @@
int iommu_domain_preserve(struct iommu_domain *domain, struct iommu_domain_ser **ser);
int iommu_domain_unpreserve(struct iommu_domain *domain);
+int iommu_preserve_device(struct iommu_domain *domain, struct device *dev);
+int iommu_unpreserve_device(struct iommu_domain *domain, struct device *dev);
int iommu_liveupdate_register_flb(struct liveupdate_file_handler *handler);
int iommu_liveupdate_unregister_flb(struct liveupdate_file_handler *handler);
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 387cbe20c83b..45da5b88f35d 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -654,6 +654,10 @@ __iommu_copy_struct_to_user(const struct iommu_user_data *dst_data,
* resources shared/passed to user space IOMMU instance. Associate
* it with a nesting @parent_domain. It is required for driver to
* set @viommu->ops pointing to its own viommu_ops
+ * @preserve_device: Preserve state of a device for liveupdate.
+ * @unpreserve_device: Unpreserve state that was preserved earlier.
+ * @preserve: Preserve state of iommu translation hardware for liveupdate.
+ * @unpreserve: Unpreserve state of iommu that was preserved earlier.
* @owner: Driver module providing these ops
* @identity_domain: An always available, always attachable identity
* translation.
@@ -710,6 +714,11 @@ struct iommu_ops {
struct iommu_domain *parent_domain,
const struct iommu_user_data *user_data);
+ int (*preserve_device)(struct device *dev, struct device_ser *device_ser);
+ void (*unpreserve_device)(struct device *dev, struct device_ser *device_ser);
+ int (*preserve)(struct iommu_device *iommu, struct iommu_ser *iommu_ser);
+ void (*unpreserve)(struct iommu_device *iommu, struct iommu_ser *iommu_ser);
+
const struct iommu_domain_ops *default_domain_ops;
struct module *owner;
struct iommu_domain *identity_domain;
@@ -805,6 +814,8 @@ struct iommu_domain_ops {
* @singleton_group: Used internally for drivers that have only one group
* @max_pasids: number of supported PASIDs
* @ready: set once iommu_device_register() has completed successfully
+ * @outgoing_preserved_state: preserved iommu state of outgoing kernel for
+ * liveupdate.
*/
struct iommu_device {
struct list_head list;
@@ -814,6 +825,10 @@ struct iommu_device {
struct iommu_group *singleton_group;
u32 max_pasids;
bool ready;
+
+#ifdef CONFIG_LIVEUPDATE
+ struct iommu_ser *outgoing_preserved_state;
+#endif
};
/**
@@ -868,6 +883,9 @@ struct dev_iommu {
u32 pci_32bit_workaround:1;
u32 require_direct:1;
u32 shadow_on_flush:1;
+#ifdef CONFIG_LIVEUPDATE
+ struct device_ser *device_ser;
+#endif
};
int iommu_device_register(struct iommu_device *iommu,
--
2.52.0.158.g65b55ccf14-goog
* [RFC PATCH v2 17/32] iommu/vt-d: Implement device and iommu preserve/unpreserve ops
2025-12-02 23:02 [RFC PATCH v2 00/32] Add live update state preservation Samiullah Khawaja
` (15 preceding siblings ...)
2025-12-02 23:02 ` [RFC PATCH v2 16/32] iommu: Add API to preserve/unpreserve a device Samiullah Khawaja
@ 2025-12-02 23:02 ` Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 18/32] iommufd: Add APIs to preserve/unpreserve a vfio cdev Samiullah Khawaja
` (15 subsequent siblings)
32 siblings, 0 replies; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-02 23:02 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: Samiullah Khawaja, Robin Murphy, Pratyush Yadav, Kevin Tian,
Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, YiFei Zhu, Chris Li, praan
Add implementation of the device and iommu preservation in a separate
file. Also set the device and iommu preserve/unpreserve ops in
struct iommu_ops.
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
---
drivers/iommu/intel/Makefile | 1 +
drivers/iommu/intel/iommu.c | 6 +-
drivers/iommu/intel/iommu.h | 9 +++
drivers/iommu/intel/liveupdate.c | 127 +++++++++++++++++++++++++++++++
4 files changed, 141 insertions(+), 2 deletions(-)
create mode 100644 drivers/iommu/intel/liveupdate.c
diff --git a/drivers/iommu/intel/Makefile b/drivers/iommu/intel/Makefile
index ada651c4a01b..58922d580c79 100644
--- a/drivers/iommu/intel/Makefile
+++ b/drivers/iommu/intel/Makefile
@@ -6,3 +6,4 @@ obj-$(CONFIG_INTEL_IOMMU_DEBUGFS) += debugfs.o
obj-$(CONFIG_INTEL_IOMMU_SVM) += svm.o
obj-$(CONFIG_IRQ_REMAP) += irq_remapping.o
obj-$(CONFIG_INTEL_IOMMU_PERF_EVENTS) += perfmon.o
+obj-$(CONFIG_LIVEUPDATE) += liveupdate.o
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index d745f833d8b5..3f69a073b2d8 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -60,8 +60,6 @@ static int force_on = 0;
static int intel_iommu_tboot_noforce;
static int no_platform_optin;
-#define ROOT_ENTRY_NR (VTD_PAGE_SIZE/sizeof(struct root_entry))
-
/*
* Take a root_entry and return the Lower Context Table Pointer (LCTP)
* if marked present.
@@ -3909,6 +3907,10 @@ const struct iommu_ops intel_iommu_ops = {
.is_attach_deferred = intel_iommu_is_attach_deferred,
.def_domain_type = device_def_domain_type,
.page_response = intel_iommu_page_response,
+ .preserve_device = intel_iommu_preserve_device,
+ .unpreserve_device = intel_iommu_unpreserve_device,
+ .preserve = intel_iommu_preserve,
+ .unpreserve = intel_iommu_unpreserve,
};
static void quirk_iommu_igfx(struct pci_dev *dev)
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index 25c5e22096d4..ea88c86030bb 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -557,6 +557,8 @@ struct root_entry {
u64 hi;
};
+#define ROOT_ENTRY_NR (VTD_PAGE_SIZE/sizeof(struct root_entry))
+
/*
* low 64 bits:
* 0: present
@@ -1276,6 +1278,13 @@ static inline int iopf_for_domain_replace(struct iommu_domain *new,
return 0;
}
+#ifdef CONFIG_LIVEUPDATE
+int intel_iommu_preserve_device(struct device *dev, struct device_ser *device_ser);
+void intel_iommu_unpreserve_device(struct device *dev, struct device_ser *device_ser);
+int intel_iommu_preserve(struct iommu_device *iommu, struct iommu_ser *iommu_ser);
+void intel_iommu_unpreserve(struct iommu_device *iommu, struct iommu_ser *iommu_ser);
+#endif
+
#ifdef CONFIG_INTEL_IOMMU_SVM
void intel_svm_check(struct intel_iommu *iommu);
struct iommu_domain *intel_svm_domain_alloc(struct device *dev,
diff --git a/drivers/iommu/intel/liveupdate.c b/drivers/iommu/intel/liveupdate.c
new file mode 100644
index 000000000000..491075802e4b
--- /dev/null
+++ b/drivers/iommu/intel/liveupdate.c
@@ -0,0 +1,127 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+/*
+ * Copyright (C) 2025, Google LLC
+ * Author: Samiullah Khawaja <skhawaja@google.com>
+ */
+
+#define pr_fmt(fmt) "iommu: liveupdate: " fmt
+
+#include <linux/kexec_handover.h>
+#include <linux/liveupdate.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+
+#include "iommu.h"
+#include "../iommu-pages.h"
+
+static void unpreserve_iommu_context(struct intel_iommu *iommu, int end)
+{
+ struct context_entry *context;
+ int i;
+
+ if (end < 0)
+ end = ROOT_ENTRY_NR;
+
+ for (i = 0; i < end; i++) {
+ context = iommu_context_addr(iommu, i, 0, 0);
+ if (context)
+ iommu_unpreserve_page(context);
+
+ if (!sm_supported(iommu))
+ continue;
+
+ context = iommu_context_addr(iommu, i, 0x80, 0);
+ if (context)
+ iommu_unpreserve_page(context);
+ }
+}
+
+static int preserve_iommu_context(struct intel_iommu *iommu)
+{
+ struct context_entry *context;
+ int ret;
+ int i;
+
+ for (i = 0; i < ROOT_ENTRY_NR; i++) {
+ context = iommu_context_addr(iommu, i, 0, 0);
+ if (context) {
+ ret = iommu_preserve_page(context);
+ if (ret)
+ goto error;
+ }
+
+ if (!sm_supported(iommu))
+ continue;
+
+ context = iommu_context_addr(iommu, i, 0x80, 0);
+ if (context) {
+ ret = iommu_preserve_page(context);
+ if (ret)
+ goto error_sm;
+ }
+ }
+
+ return 0;
+
+error_sm:
+ context = iommu_context_addr(iommu, i, 0, 0);
+ iommu_unpreserve_page(context);
+error:
+ unpreserve_iommu_context(iommu, i);
+ return ret;
+}
+
+int intel_iommu_preserve_device(struct device *dev, struct device_ser *device_ser)
+{
+ struct device_domain_info *info = dev_iommu_priv_get(dev);
+
+ if (!dev_is_pci(dev))
+ return -EOPNOTSUPP;
+
+ if (!info)
+ return -EINVAL;
+
+ device_ser->domain_iommu_ser.did = domain_id_iommu(info->domain, info->iommu);
+
+ /* TODO: Add support for preservation of PASIDs. */
+ return 0;
+}
+
+void intel_iommu_unpreserve_device(struct device *dev, struct device_ser *device_ser)
+{
+}
+
+int intel_iommu_preserve(struct iommu_device *iommu_dev, struct iommu_ser *ser)
+{
+ struct intel_iommu *iommu;
+ int ret;
+
+ iommu = container_of(iommu_dev, struct intel_iommu, iommu);
+
+ spin_lock(&iommu->lock);
+ ret = preserve_iommu_context(iommu);
+ if (ret)
+ goto err;
+
+ ret = iommu_preserve_page(iommu->root_entry);
+ if (ret) {
+ unpreserve_iommu_context(iommu, -1);
+ goto err;
+ }
+
+ ser->intel.phys_addr = iommu->reg_phys;
+ ser->intel.root_table = __pa(iommu->root_entry);
+ ser->type = IOMMU_INTEL;
+ ser->token = ser->intel.phys_addr;
+ spin_unlock(&iommu->lock);
+
+ return 0;
+err:
+ spin_unlock(&iommu->lock);
+ return ret;
+}
+
+void intel_iommu_unpreserve(struct iommu_device *iommu, struct iommu_ser *iommu_ser)
+{
+}
--
2.52.0.158.g65b55ccf14-goog
* [RFC PATCH v2 18/32] iommufd: Add APIs to preserve/unpreserve a vfio cdev
2025-12-02 23:02 [RFC PATCH v2 00/32] Add live update state preservation Samiullah Khawaja
` (16 preceding siblings ...)
2025-12-02 23:02 ` [RFC PATCH v2 17/32] iommu/vt-d: Implement device and iommu preserve/unpreserve ops Samiullah Khawaja
@ 2025-12-02 23:02 ` Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 19/32] vfio/pci: Preserve the iommufd state of the " Samiullah Khawaja
` (14 subsequent siblings)
32 siblings, 0 replies; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-02 23:02 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: Samiullah Khawaja, Robin Murphy, Pratyush Yadav, Kevin Tian,
Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, YiFei Zhu, Chris Li, praan
Add APIs that can be used to preserve and unpreserve a vfio cdev. Use
the APIs exported by the IOMMU core to preserve/unpreserve the device.
Once preserved, the token is returned to the caller so it can be used to
restore/rebind to the device after liveupdate.
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
---
drivers/iommu/iommufd/device.c | 39 ++++++++++++++++++++++++++++++++++
include/linux/iommufd.h | 6 ++++++
2 files changed, 45 insertions(+)
diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index ba4d9c3cfa8b..2c81bfa7dedd 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -2,6 +2,7 @@
/* Copyright (c) 2021-2022, NVIDIA CORPORATION & AFFILIATES
*/
#include <linux/iommu.h>
+#include <linux/iommu-lu.h>
#include <linux/iommufd.h>
#include <linux/pci-ats.h>
#include <linux/slab.h>
@@ -1666,3 +1667,41 @@ int iommufd_get_hw_info(struct iommufd_ucmd *ucmd)
iommufd_put_object(ucmd->ictx, &idev->obj);
return rc;
}
+
+#ifdef CONFIG_LIVEUPDATE
+int iommufd_device_preserve(struct iommufd_device *idev, ioasid_t pasid)
+{
+ struct iommufd_group *igroup = idev->igroup;
+ struct iommufd_hwpt_paging *hwpt_paging;
+ struct iommufd_hw_pagetable *hwpt;
+ struct iommufd_attach *attach;
+ int ret;
+
+ mutex_lock(&igroup->lock);
+ attach = xa_load(&igroup->pasid_attach, pasid);
+ if (!attach) {
+ ret = -ENOENT;
+ goto out;
+ }
+
+ hwpt = attach->hwpt;
+ hwpt_paging = find_hwpt_paging(hwpt);
+ if (!hwpt_paging || !hwpt_paging->lu_preserved) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ /* TODO: Add support for PASIDs */
+ ret = iommu_preserve_device(hwpt_paging->common.domain, idev->dev);
+
+out:
+ mutex_unlock(&igroup->lock);
+ return ret;
+}
+EXPORT_SYMBOL_NS_GPL(iommufd_device_preserve, "IOMMUFD");
+
+void iommufd_device_unpreserve(struct iommufd_device *idev)
+{
+}
+EXPORT_SYMBOL_NS_GPL(iommufd_device_unpreserve, "IOMMUFD");
+#endif
diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h
index 6e7efe83bc5d..ba433fb1a481 100644
--- a/include/linux/iommufd.h
+++ b/include/linux/iommufd.h
@@ -9,6 +9,7 @@
#include <linux/err.h>
#include <linux/errno.h>
#include <linux/iommu.h>
+#include <linux/liveupdate.h>
#include <linux/refcount.h>
#include <linux/types.h>
#include <linux/xarray.h>
@@ -71,6 +72,11 @@ void iommufd_device_detach(struct iommufd_device *idev, ioasid_t pasid);
struct iommufd_ctx *iommufd_device_to_ictx(struct iommufd_device *idev);
u32 iommufd_device_to_id(struct iommufd_device *idev);
+#ifdef CONFIG_LIVEUPDATE
+int iommufd_device_preserve(struct iommufd_device *idev, ioasid_t pasid);
+void iommufd_device_unpreserve(struct iommufd_device *idev);
+#endif
+
struct iommufd_access_ops {
u8 needs_pin_pages : 1;
void (*unmap)(void *data, unsigned long iova, unsigned long length);
--
2.52.0.158.g65b55ccf14-goog
* [RFC PATCH v2 19/32] vfio/pci: Preserve the iommufd state of the vfio cdev
2025-12-02 23:02 [RFC PATCH v2 00/32] Add live update state preservation Samiullah Khawaja
` (17 preceding siblings ...)
2025-12-02 23:02 ` [RFC PATCH v2 18/32] iommufd: Add APIs to preserve/unpreserve a vfio cdev Samiullah Khawaja
@ 2025-12-02 23:02 ` Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 20/32] iommu: Add APIs to get preserved state of a device Samiullah Khawaja
` (13 subsequent siblings)
32 siblings, 0 replies; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-02 23:02 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: Samiullah Khawaja, Robin Murphy, Pratyush Yadav, Kevin Tian,
Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, YiFei Zhu, Chris Li, praan
If the vfio cdev is attached to an iommufd, also preserve the state of
the attached iommufd, that is, the iommu state of the device and the
attached domain. The token returned by the preservation API will be
used to restore/rebind to the iommufd state.
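
For illustration only (not the code in this patch): the flow described
above, "preserve the iommufd state first, record the returned token in
the serialized data, and undo the iommufd preservation if a later step
fails", can be sketched in a userspace model. All mock_* names are
hypothetical, and using 0 to mean "no iommufd state" is an assumption
of this sketch, not something the patch defines.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a vfio device and its iommufd binding. */
struct mock_vfio_dev {
	bool iommufd_attached;
	bool iommufd_preserved;
	int token;		/* token the mock iommufd layer hands out (> 0) */
	bool fail_alloc;	/* test knob: force the allocation to fail */
};

/* Hypothetical serialized record handed to the next kernel. */
struct mock_vfio_ser {
	uint32_t iommufd_token;
};

/* Returns a positive token on success or a negative errno. */
int mock_iommufd_preserve(struct mock_vfio_dev *dev)
{
	if (dev->token <= 0)
		return -EINVAL;

	dev->iommufd_preserved = true;
	return dev->token;
}

void mock_iommufd_unpreserve(struct mock_vfio_dev *dev)
{
	dev->iommufd_preserved = false;
}

int mock_vfio_preserve(struct mock_vfio_dev *dev, struct mock_vfio_ser **out)
{
	struct mock_vfio_ser *ser;
	int token = 0;	/* assumption: 0 means "no iommufd state" */
	int err;

	if (dev->iommufd_attached) {
		token = mock_iommufd_preserve(dev);
		if (token < 0)
			return token;
	}

	ser = dev->fail_alloc ? NULL : calloc(1, sizeof(*ser));
	if (!ser) {
		err = -ENOMEM;
		goto err_unpreserve;
	}

	ser->iommufd_token = (uint32_t)token;
	*out = ser;
	return 0;

err_unpreserve:
	/* Undo only what was actually done, in reverse order. */
	if (dev->iommufd_attached)
		mock_iommufd_unpreserve(dev);
	return err;
}
```

The point of the sketch is the ordering: each error label releases only
the resources acquired before the failing step, so a failed allocation
never reaches a release call for memory that was not allocated.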
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
---
drivers/vfio/pci/vfio_pci_liveupdate.c | 23 +++++++++++++++++++----
include/linux/kho/abi/vfio_pci.h | 10 ++++++++++
2 files changed, 29 insertions(+), 4 deletions(-)
diff --git a/drivers/vfio/pci/vfio_pci_liveupdate.c b/drivers/vfio/pci/vfio_pci_liveupdate.c
index bcaf9de8a823..b721080599d5 100644
--- a/drivers/vfio/pci/vfio_pci_liveupdate.c
+++ b/drivers/vfio/pci/vfio_pci_liveupdate.c
@@ -15,6 +15,7 @@
#include <linux/liveupdate.h>
#include <linux/errno.h>
#include <linux/vfio.h>
+#include <linux/iommufd.h>
#include "vfio_pci_priv.h"
@@ -38,9 +39,9 @@ static int vfio_pci_liveupdate_preserve(struct liveupdate_file_op_args *args)
struct vfio_device *device = vfio_device_from_file(args->file);
struct vfio_pci_core_device_ser *ser;
struct vfio_pci_core_device *vdev;
+ int err, iommufd_token;
struct pci_dev *pdev;
struct folio *folio;
- int err;
vdev = container_of(device, struct vfio_pci_core_device, vdev);
pdev = vdev->pdev;
@@ -51,15 +52,26 @@ static int vfio_pci_liveupdate_preserve(struct liveupdate_file_op_args *args)
if (vfio_pci_is_intel_display(pdev))
return -EINVAL;
+ /* If iommufd is attached, preserve the underlying domain */
+ if (device->iommufd_attached) {
+ iommufd_token = iommufd_device_preserve(device->iommufd_device,
+ IOMMU_NO_PASID);
+ if (iommufd_token < 0)
+ return iommufd_token;
+ }
+
folio = folio_alloc(GFP_KERNEL | __GFP_ZERO, get_order(sizeof(*ser)));
- if (!folio)
- return -ENOMEM;
+ if (!folio) {
+ err = -ENOMEM;
+ goto error_folio;
+ }
ser = folio_address(folio);
ser->bdf = pci_dev_id(pdev);
ser->domain = pci_domain_nr(pdev->bus);
ser->reset_works = vdev->reset_works;
+ ser->iommufd_ser.token = iommufd_token;
err = kho_preserve_folio(folio);
if (err)
@@ -69,8 +81,11 @@ static int vfio_pci_liveupdate_preserve(struct liveupdate_file_op_args *args)
args->serialized_data = virt_to_phys(ser);
return 0;
-error:
+error_folio:
folio_put(folio);
+error:
+ if (device->iommufd_attached)
+ iommufd_device_unpreserve(device->iommufd_device);
return err;
}
diff --git a/include/linux/kho/abi/vfio_pci.h b/include/linux/kho/abi/vfio_pci.h
index 6c3d3c6dfc09..28d6eac5fd65 100644
--- a/include/linux/kho/abi/vfio_pci.h
+++ b/include/linux/kho/abi/vfio_pci.h
@@ -28,6 +28,15 @@
#define VFIO_PCI_LUO_FH_COMPATIBLE "vfio-pci-v1"
+/**
+ * struct vfio_iommufd_ser - Serialized state of the attached iommufd.
+ *
+ * @token: The token of the bound iommufd state.
+ */
+struct vfio_iommufd_ser {
+ u32 token;
+} __packed;
+
/**
* struct vfio_pci_core_device_ser - Serialized state of a single VFIO PCI
* device.
@@ -40,6 +49,7 @@ struct vfio_pci_core_device_ser {
u16 bdf;
u16 domain;
u8 reset_works;
+ struct vfio_iommufd_ser iommufd_ser;
} __packed;
#endif /* _LINUX_LIVEUPDATE_ABI_VFIO_PCI_H */
--
2.52.0.158.g65b55ccf14-goog
* [RFC PATCH v2 20/32] iommu: Add APIs to get preserved state of a device
2025-12-02 23:02 [RFC PATCH v2 00/32] Add live update state preservation Samiullah Khawaja
` (18 preceding siblings ...)
2025-12-02 23:02 ` [RFC PATCH v2 19/32] vfio/pci: Preserve the iommufd state of the " Samiullah Khawaja
@ 2025-12-02 23:02 ` Samiullah Khawaja
2025-12-04 6:19 ` Baolu Lu
2025-12-02 23:02 ` [RFC PATCH v2 21/32] iommu/vt-d: Clean the context entries of unpreserved devices Samiullah Khawaja
` (12 subsequent siblings)
32 siblings, 1 reply; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-02 23:02 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: Samiullah Khawaja, Robin Murphy, Pratyush Yadav, Kevin Tian,
Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, YiFei Zhu, Chris Li, praan
The preserved state of the device needs to be fetched at various places
during liveupdate. The added API can also be used to check whether a
device is preserved. The API is only used during shutdown and after
liveupdate, so no locking is needed.
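
For illustration only: the helpers added here use a single per-device
pointer plus an "incoming" flag to tell state preserved by the running
kernel (outgoing) apart from state inherited from the previous kernel
(restored). A minimal userspace sketch of that distinction follows;
the mock_* types and functions are hypothetical simplifications.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the serialized device state. */
struct mock_device_ser {
	bool incoming;	/* true: inherited from the previous kernel */
	int did;	/* domain ID recorded at preservation time */
};

struct mock_dev {
	struct mock_device_ser *device_ser;	/* NULL if not preserved */
};

/* State preserved by the running kernel for the next one. */
struct mock_device_ser *mock_preserved_state(const struct mock_dev *dev)
{
	struct mock_device_ser *ser = dev->device_ser;

	return (ser && !ser->incoming) ? ser : NULL;
}

/* State handed over by the previous kernel. */
struct mock_device_ser *mock_restored_state(const struct mock_dev *dev)
{
	struct mock_device_ser *ser = dev->device_ser;

	return (ser && ser->incoming) ? ser : NULL;
}

/* Domain ID to reuse after live update, or -1 if nothing was restored. */
int mock_restore_did(const struct mock_dev *dev, bool domain_restored)
{
	struct mock_device_ser *ser = mock_restored_state(dev);

	if (ser && domain_restored)
		return ser->did;

	return -1;
}
```

A caller can therefore use mock_preserved_state() != NULL as an "is
this device preserved" check before shutdown, and mock_restore_did()
after the kexec to decide whether a domain ID must be reused.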
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
---
include/linux/iommu-lu.h | 67 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 67 insertions(+)
diff --git a/include/linux/iommu-lu.h b/include/linux/iommu-lu.h
index 95375530b7be..08a659de8553 100644
--- a/include/linux/iommu-lu.h
+++ b/include/linux/iommu-lu.h
@@ -8,9 +8,76 @@
#ifndef _LINUX_IOMMU_LU_H
#define _LINUX_IOMMU_LU_H
+#include <linux/device.h>
+#include <linux/iommu.h>
#include <linux/liveupdate.h>
#include <linux/kho/abi/iommu.h>
+#ifdef CONFIG_LIVEUPDATE
+static inline void *dev_iommu_preserved_state(struct device *dev)
+{
+ struct device_ser *ser;
+
+ ser = dev->iommu->device_ser;
+ if (ser && !ser->obj.incoming)
+ return ser;
+
+ return NULL;
+}
+
+static inline void *dev_iommu_restored_state(struct device *dev)
+{
+ struct device_ser *ser;
+
+ ser = dev->iommu->device_ser;
+ if (ser && ser->obj.incoming)
+ return ser;
+
+ return NULL;
+}
+
+static inline void *iommu_domain_restored_state(struct iommu_domain *domain)
+{
+ struct iommu_domain_ser *ser;
+
+ ser = domain->preserved_state;
+ if (ser && ser->obj.incoming)
+ return ser;
+
+ return NULL;
+}
+
+static inline int dev_iommu_restore_did(struct device *dev, struct iommu_domain *domain)
+{
+ struct device_ser *ser = dev_iommu_restored_state(dev);
+
+ if (ser && iommu_domain_restored_state(domain))
+ return ser->domain_iommu_ser.did;
+
+ return -1;
+}
+#else
+static inline void *dev_iommu_preserved_state(struct device *dev)
+{
+ return NULL;
+}
+
+static inline void *dev_iommu_restored_state(struct device *dev)
+{
+ return NULL;
+}
+
+static inline int dev_iommu_restore_did(struct device *dev, struct iommu_domain *domain)
+{
+ return -1;
+}
+
+static inline void *iommu_domain_restored_state(struct iommu_domain *domain)
+{
+ return NULL;
+}
+#endif
+
int iommu_domain_preserve(struct iommu_domain *domain, struct iommu_domain_ser **ser);
int iommu_domain_unpreserve(struct iommu_domain *domain);
int iommu_preserve_device(struct iommu_domain *domain, struct device *dev);
--
2.52.0.158.g65b55ccf14-goog
* [RFC PATCH v2 21/32] iommu/vt-d: Clean the context entries of unpreserved devices
2025-12-02 23:02 [RFC PATCH v2 00/32] Add live update state preservation Samiullah Khawaja
` (19 preceding siblings ...)
2025-12-02 23:02 ` [RFC PATCH v2 20/32] iommu: Add APIs to get preserved state of a device Samiullah Khawaja
@ 2025-12-02 23:02 ` Samiullah Khawaja
2025-12-04 6:28 ` Baolu Lu
2025-12-02 23:02 ` [RFC PATCH v2 22/32] iommu: Implement IOMMU FLB retrieve and finish ops Samiullah Khawaja
` (11 subsequent siblings)
32 siblings, 1 reply; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-02 23:02 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: Samiullah Khawaja, Robin Murphy, Pratyush Yadav, Kevin Tian,
Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, YiFei Zhu, Chris Li, praan
During normal shutdown the iommu translation is disabled. When the iommu
is preserved for live update, translation is left enabled and the root
table is handed over to the next kernel, so the context entries of the
unpreserved devices need to be cleared instead.
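
For illustration only: the shutdown decision described above can be
modelled in userspace as follows. This is a sketch, not the driver
code; the mock_* types are hypothetical and a boolean stands in for a
programmed context entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mock_iommu {
	bool preserved;			/* models outgoing_preserved_state != NULL */
	bool translation_enabled;
};

struct mock_dev {
	struct mock_iommu *iommu;	/* IOMMU this device sits behind */
	bool preserved;			/* device has outgoing preserved state */
	bool context_present;		/* context entry currently programmed */
};

void mock_iommu_shutdown(struct mock_iommu *iommu,
			 struct mock_dev *devs, size_t nr_devs)
{
	size_t i;

	/* No preserved state: plain shutdown, switch translation off. */
	if (!iommu->preserved) {
		iommu->translation_enabled = false;
		return;
	}

	/*
	 * Preserved IOMMU: translation stays on, so drop the context
	 * entries of every device behind it that is not preserved.
	 */
	for (i = 0; i < nr_devs; i++) {
		if (devs[i].iommu != iommu || devs[i].preserved)
			continue;

		devs[i].context_present = false;
	}
}
```

Devices behind other IOMMUs are left untouched by the loop, which
mirrors the info->iommu != iommu check in the patch below.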
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
---
drivers/iommu/intel/iommu.c | 33 ++++++++++++++++++++++++++++++--
drivers/iommu/intel/iommu.h | 1 +
drivers/iommu/intel/liveupdate.c | 1 +
3 files changed, 33 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 3f69a073b2d8..84fef81ecf4d 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -16,6 +16,7 @@
#include <linux/crash_dump.h>
#include <linux/dma-direct.h>
#include <linux/dmi.h>
+#include <linux/iommu-lu.h>
#include <linux/memory.h>
#include <linux/pci.h>
#include <linux/pci-ats.h>
@@ -52,6 +53,10 @@ static int rwbf_quirk;
#define rwbf_required(iommu) (rwbf_quirk || cap_rwbf((iommu)->cap))
+#ifdef CONFIG_LIVEUPDATE
+static void __clean_unpreserved_context_entries(struct intel_iommu *iommu);
+#endif
+
/*
* set to 1 to panic kernel if can't successfully enable VT-d
* (used when kernel is launched w/ TXT)
@@ -2376,8 +2381,12 @@ void intel_iommu_shutdown(void)
/* Disable PMRs explicitly here. */
iommu_disable_protect_mem_regions(iommu);
- /* Make sure the IOMMUs are switched off */
- iommu_disable_translation(iommu);
+ if (iommu->iommu.outgoing_preserved_state) {
+ __clean_unpreserved_context_entries(iommu);
+ } else {
+ /* Make sure the IOMMUs are switched off */
+ iommu_disable_translation(iommu);
+ }
}
}
@@ -2884,6 +2893,26 @@ static const struct iommu_dirty_ops intel_second_stage_dirty_ops = {
.set_dirty_tracking = intel_iommu_set_dirty_tracking,
};
+static void __clean_unpreserved_context_entries(struct intel_iommu *iommu)
+{
+ struct device_domain_info *info;
+ struct pci_dev *pdev = NULL;
+
+ for_each_pci_dev(pdev) {
+ info = dev_iommu_priv_get(&pdev->dev);
+ if (!info)
+ continue;
+
+ if (info->iommu != iommu)
+ continue;
+
+ if (dev_iommu_preserved_state(&pdev->dev))
+ continue;
+
+ domain_context_clear(info);
+ }
+}
+
static struct iommu_domain *
intel_iommu_domain_alloc_second_stage(struct device *dev,
struct intel_iommu *iommu, u32 flags)
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index ea88c86030bb..1eb60ce1300f 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -1283,6 +1283,7 @@ int intel_iommu_preserve_device(struct device *dev, struct device_ser *device_se
void intel_iommu_unpreserve_device(struct device *dev, struct device_ser *device_ser);
int intel_iommu_preserve(struct iommu_device *iommu, struct iommu_ser *iommu_ser);
void intel_iommu_unpreserve(struct iommu_device *iommu, struct iommu_ser *iommu_ser);
+bool intel_iommu_liveupdate_clear_context_entries(struct intel_iommu *iommu);
#endif
#ifdef CONFIG_INTEL_IOMMU_SVM
diff --git a/drivers/iommu/intel/liveupdate.c b/drivers/iommu/intel/liveupdate.c
index 491075802e4b..3f8c7f15bc36 100644
--- a/drivers/iommu/intel/liveupdate.c
+++ b/drivers/iommu/intel/liveupdate.c
@@ -9,6 +9,7 @@
#include <linux/kexec_handover.h>
#include <linux/liveupdate.h>
+#include <linux/iommu-lu.h>
#include <linux/module.h>
#include <linux/pci.h>
--
2.52.0.158.g65b55ccf14-goog
* [RFC PATCH v2 22/32] iommu: Implement IOMMU FLB retrieve and finish ops
2025-12-02 23:02 [RFC PATCH v2 00/32] Add live update state preservation Samiullah Khawaja
` (20 preceding siblings ...)
2025-12-02 23:02 ` [RFC PATCH v2 21/32] iommu/vt-d: Clean the context entries of unpreserved devices Samiullah Khawaja
@ 2025-12-02 23:02 ` Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 23/32] iommu: Add an API get the preserved state of an IOMMU Samiullah Khawaja
` (10 subsequent siblings)
32 siblings, 0 replies; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-02 23:02 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: Samiullah Khawaja, Robin Murphy, Pratyush Yadav, Kevin Tian,
Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, YiFei Zhu, Chris Li, praan
Add implementation of the IOMMU LU FLB retrieve and finish ops. During
retrieve, walk through the preserved objs nodes and restore each folio.
Also recreate the FLB obj.
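
For illustration only: the retrieve path follows a chain of preserved
pages in which each page stores the physical address of the next one,
and a page has to be restored before its link is read. A minimal
userspace sketch of that walk is shown here. The mock_* names are
hypothetical, and small integers stand in for physical addresses
(address N maps to pages[N - 1], 0 terminates the chain).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_NR_PAGES 8

/* Hypothetical page header: each preserved page links to the next one. */
struct mock_objs_page {
	uint64_t next;	/* mock "physical address" of the next page, 0 ends */
	bool restored;
};

/* Mock address translation: address N (1-based) maps to pages[N - 1]. */
struct mock_objs_page *mock_va(struct mock_objs_page *pages, uint64_t phys)
{
	if (phys == 0 || phys > MOCK_NR_PAGES)
		return NULL;

	return &pages[phys - 1];
}

/* Returns the number of pages restored, or -1 on a bad link or a cycle. */
int mock_restore_chain(struct mock_objs_page *pages, uint64_t first)
{
	uint64_t next = first;
	int count = 0;

	while (next) {
		struct mock_objs_page *page = mock_va(pages, next);

		if (!page || page->restored)
			return -1;

		/* Restore first, then follow the link stored in the page. */
		page->restored = true;
		count++;
		next = page->next;
	}

	return count;
}
```

Unlike the patch, which uses BUG_ON() when a folio cannot be restored,
the sketch reports an error so that the failure case can be exercised
in a test.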
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
---
drivers/iommu/liveupdate.c | 48 +++++++++++++++++++++++++++++++++++++-
1 file changed, 47 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/liveupdate.c b/drivers/iommu/liveupdate.c
index 5780761a7024..0dfa03673178 100644
--- a/drivers/iommu/liveupdate.c
+++ b/drivers/iommu/liveupdate.c
@@ -14,6 +14,17 @@
#include <linux/pci.h>
#include <linux/errno.h>
+static void iommu_liveupdate_restore_objs(u64 next)
+{
+ struct iommu_objs_ser *objs;
+
+ while (next) {
+ BUG_ON(!kho_restore_folio(next));
+ objs = __va(next);
+ next = objs->next_objs;
+ }
+}
+
static void iommu_liveupdate_free_objs(u64 next, bool incoming)
{
struct iommu_objs_ser *objs;
@@ -98,11 +109,46 @@ static void iommu_liveupdate_flb_unpreserve(struct liveupdate_flb_op_args *argp)
static void iommu_liveupdate_flb_finish(struct liveupdate_flb_op_args *argp)
{
+ struct iommu_lu_flb_obj *obj = argp->obj;
+
+ if (obj->iommu_domains)
+ iommu_liveupdate_free_objs(obj->ser->iommu_domains_phys, true);
+
+ if (obj->devices)
+ iommu_liveupdate_free_objs(obj->ser->devices_phys, true);
+
+ if (obj->iommus)
+ iommu_liveupdate_free_objs(obj->ser->iommus_phys, true);
+
+ folio_put(virt_to_folio(obj->ser));
}
static int iommu_liveupdate_flb_retrieve(struct liveupdate_flb_op_args *argp)
{
- return -EOPNOTSUPP;
+ struct iommu_lu_flb_obj *obj;
+ struct iommu_lu_flb_ser *ser;
+
+ obj = kzalloc(sizeof(*obj), GFP_ATOMIC);
+ if (!obj)
+ return -ENOMEM;
+
+ mutex_init(&obj->lock);
+ BUG_ON(!kho_restore_folio(argp->data));
+ ser = phys_to_virt(argp->data);
+ obj->ser = ser;
+
+ iommu_liveupdate_restore_objs(ser->iommu_domains_phys);
+ obj->iommu_domains = phys_to_virt(ser->iommu_domains_phys);
+
+ iommu_liveupdate_restore_objs(ser->devices_phys);
+ obj->devices = phys_to_virt(ser->devices_phys);
+
+ iommu_liveupdate_restore_objs(ser->iommus_phys);
+ obj->iommus = phys_to_virt(ser->iommus_phys);
+
+ argp->obj = obj;
+
+ return 0;
}
static struct liveupdate_flb_ops iommu_flb_ops = {
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 49+ messages in thread
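The retrieve path in patch 22 follows a chain of preserved pages in which each node stores the physical address of the next node and 0 terminates the chain. The following is a minimal userspace sketch of that traversal, not the kernel code: an array index stands in for a physical address, and the hypothetical restore_node() stands in for kho_restore_folio() followed by __va(). Unlike the patch, which uses BUG_ON(), the sketch reports a failed restore with a return value.

```c
#include <stddef.h>
#include <stdint.h>

#define NR_NODES 8

/* Hypothetical stand-in for struct iommu_objs_ser: only the link matters here. */
struct objs_node {
	uint64_t next;	/* "physical address" of the next node, 0 ends the chain */
	int restored;	/* set once the node has been restored */
};

/* Fake physical memory; index 0 is reserved so that 0 can mean "no next node". */
static struct objs_node fake_mem[NR_NODES];

/* Hypothetical stand-in for kho_restore_folio() followed by __va(). */
static struct objs_node *restore_node(uint64_t phys)
{
	if (phys == 0 || phys >= NR_NODES)
		return NULL;
	fake_mem[phys].restored = 1;
	return &fake_mem[phys];
}

/* Restore every node reachable from @next; returns the count, or -1 on failure. */
static int restore_chain(uint64_t next)
{
	int count = 0;

	while (next) {
		struct objs_node *node = restore_node(next);

		if (!node)
			return -1;
		/* Read the link only after the node itself has been restored. */
		next = node->next;
		count++;
	}
	return count;
}
```

The ordering is the point of the sketch: the link to the next node lives inside the node, so a node has to be restored before its next pointer is read.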
* [RFC PATCH v2 23/32] iommu: Add an API get the preserved state of an IOMMU
2025-12-02 23:02 [RFC PATCH v2 00/32] Add live update state preservation Samiullah Khawaja
` (21 preceding siblings ...)
2025-12-02 23:02 ` [RFC PATCH v2 22/32] iommu: Implement IOMMU FLB retrieve and finish ops Samiullah Khawaja
@ 2025-12-02 23:02 ` Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 24/32] iommu/vt-d: restore state of the preserved IOMMU Samiullah Khawaja
` (9 subsequent siblings)
32 siblings, 0 replies; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-02 23:02 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: Samiullah Khawaja, Robin Murphy, Pratyush Yadav, Kevin Tian,
Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, YiFei Zhu, Chris Li, praan
During boot after a live update kexec, the state of each preserved IOMMU
needs to be restored. Because IOMMU drivers restore this state while
initializing and registering with the IOMMU core, add an API that the
drivers can use to fetch the preserved state.
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
---
drivers/iommu/liveupdate.c | 29 +++++++++++++++++++++++++++++
include/linux/iommu-lu.h | 1 +
2 files changed, 30 insertions(+)
diff --git a/drivers/iommu/liveupdate.c b/drivers/iommu/liveupdate.c
index 0dfa03673178..e7ecf2e9aa4e 100644
--- a/drivers/iommu/liveupdate.c
+++ b/drivers/iommu/liveupdate.c
@@ -175,6 +175,35 @@ int iommu_liveupdate_unregister_flb(struct liveupdate_file_handler *handler)
}
EXPORT_SYMBOL(iommu_liveupdate_unregister_flb);
+struct iommu_ser *iommu_get_preserved_data(u64 token, enum iommu_lu_type type)
+{
+ struct iommu_lu_flb_obj *obj;
+ struct iommus_ser *iommus;
+ int ret, i, idx;
+
+ ret = liveupdate_flb_get_incoming(&iommu_flb, (void **)&obj);
+ if (ret)
+ return NULL;
+
+ iommus = __va(obj->ser->iommus_phys);
+ for (i = 0, idx = 0; i < obj->ser->nr_iommus; ++i, ++idx) {
+ if (idx >= MAX_IOMMU_SERS) {
+ iommus = __va(iommus->objs.next_objs);
+ idx = 0;
+ }
+
+ if (iommus->iommus[idx].obj.deleted)
+ continue;
+
+ if (iommus->iommus[idx].token == token &&
+ iommus->iommus[idx].type == type)
+ return &iommus->iommus[idx];
+ }
+
+ return NULL;
+}
+EXPORT_SYMBOL(iommu_get_preserved_data);
+
static int reserve_obj_ser(struct iommu_objs_ser **objs_ptr, u64 max_objs)
{
struct iommu_objs_ser *next_objs, *objs = *objs_ptr;
diff --git a/include/linux/iommu-lu.h b/include/linux/iommu-lu.h
index 08a659de8553..ffce7043e997 100644
--- a/include/linux/iommu-lu.h
+++ b/include/linux/iommu-lu.h
@@ -78,6 +78,7 @@ static inline void *iommu_domain_restored_state(struct iommu_domain *domain)
}
#endif
+struct iommu_ser *iommu_get_preserved_data(u64 token, enum iommu_lu_type type);
int iommu_domain_preserve(struct iommu_domain *domain, struct iommu_domain_ser **ser);
int iommu_domain_unpreserve(struct iommu_domain *domain);
int iommu_preserve_device(struct iommu_domain *domain, struct device *dev);
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 49+ messages in thread
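The lookup in patch 23 scans a fixed-capacity array that continues into a linked chunk once the per-chunk index reaches the capacity, skipping entries marked deleted. The following is a rough userspace sketch of that pattern. The struct names, the MAX_SERS value and the use of a plain pointer for the link are assumptions made for illustration (the patch links chunks by physical address and uses MAX_IOMMU_SERS). The sketch also stops if the chain ends early, which the patch does not check.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_SERS 2	/* hypothetical per-chunk capacity */

struct ser_entry {
	uint64_t token;
	int type;
	int deleted;
};

struct ser_chunk {
	struct ser_chunk *next;	/* the patch stores a physical address instead */
	struct ser_entry entries[MAX_SERS];
};

/* Return the live entry matching (token, type), or NULL if there is none. */
static struct ser_entry *find_entry(struct ser_chunk *chunk, size_t nr_entries,
				    uint64_t token, int type)
{
	size_t i, idx;

	for (i = 0, idx = 0; chunk && i < nr_entries; ++i, ++idx) {
		if (idx >= MAX_SERS) {
			/* Current chunk exhausted: hop to the next one. */
			chunk = chunk->next;
			idx = 0;
			if (!chunk)
				break;
		}

		if (chunk->entries[idx].deleted)
			continue;

		if (chunk->entries[idx].token == token &&
		    chunk->entries[idx].type == type)
			return &chunk->entries[idx];
	}

	return NULL;
}
```

Note that i counts every slot, including deleted ones, so nr_entries here is the number of slots in use rather than the number of live entries.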
* [RFC PATCH v2 24/32] iommu/vt-d: restore state of the preserved IOMMU
2025-12-02 23:02 [RFC PATCH v2 00/32] Add live update state preservation Samiullah Khawaja
` (22 preceding siblings ...)
2025-12-02 23:02 ` [RFC PATCH v2 23/32] iommu: Add an API get the preserved state of an IOMMU Samiullah Khawaja
@ 2025-12-02 23:02 ` Samiullah Khawaja
2025-12-04 6:43 ` Baolu Lu
2025-12-02 23:02 ` [RFC PATCH v2 25/32] iommu: Add helper APIs to fetch preserved device state Samiullah Khawaja
` (8 subsequent siblings)
32 siblings, 1 reply; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-02 23:02 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: Samiullah Khawaja, Robin Murphy, Pratyush Yadav, Kevin Tian,
Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, YiFei Zhu, Chris Li, praan
During boot, fetch the preserved state of the IOMMU unit and, if it is
found, restore it. Reuse the root_table that was preserved by the
previous kernel.
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
---
drivers/iommu/intel/iommu.c | 30 ++++++++++++++++++++++++------
drivers/iommu/intel/iommu.h | 2 ++
drivers/iommu/intel/liveupdate.c | 30 ++++++++++++++++++++++++++++++
3 files changed, 56 insertions(+), 6 deletions(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 84fef81ecf4d..888351f91918 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -224,12 +224,12 @@ static void clear_translation_pre_enabled(struct intel_iommu *iommu)
iommu->flags &= ~VTD_FLAG_TRANS_PRE_ENABLED;
}
-static void init_translation_status(struct intel_iommu *iommu)
+static void init_translation_status(struct intel_iommu *iommu, bool restoring)
{
u32 gsts;
gsts = readl(iommu->reg + DMAR_GSTS_REG);
- if (gsts & DMA_GSTS_TES)
+ if (!restoring && (gsts & DMA_GSTS_TES))
iommu->flags |= VTD_FLAG_TRANS_PRE_ENABLED;
}
@@ -672,10 +672,18 @@ void dmar_fault_dump_ptes(struct intel_iommu *iommu, u16 source_id,
#endif
/* iommu handling */
-static int iommu_alloc_root_entry(struct intel_iommu *iommu)
+static int iommu_alloc_root_entry(struct intel_iommu *iommu, struct iommu_ser *restored_state)
{
struct root_entry *root;
+#if IS_ENABLED(CONFIG_LIVEUPDATE)
+ if (restored_state) {
+ intel_iommu_liveupdate_restore_root_table(iommu, restored_state);
+ /* Should not be needed since the entries are already cleaned in last kernel. */
+ __iommu_flush_cache(iommu, iommu->root_entry, ROOT_SIZE);
+ return 0;
+ }
+#endif
root = iommu_alloc_pages_node_sz(iommu->node, GFP_ATOMIC, SZ_4K);
if (!root) {
pr_err("Allocating root entry for %s failed\n",
@@ -1616,6 +1624,7 @@ static int copy_translation_tables(struct intel_iommu *iommu)
static int __init init_dmars(void)
{
+ struct iommu_ser *iommu_ser = NULL;
struct dmar_drhd_unit *drhd;
struct intel_iommu *iommu;
int ret;
@@ -1638,8 +1647,12 @@ static int __init init_dmars(void)
intel_pasid_max_id);
}
+#if IS_ENABLED(CONFIG_LIVEUPDATE)
+ iommu_ser = iommu_get_preserved_data(iommu->reg_phys, IOMMU_INTEL);
+#endif
+
intel_iommu_init_qi(iommu);
- init_translation_status(iommu);
+ init_translation_status(iommu, !!iommu_ser);
if (translation_pre_enabled(iommu) && !is_kdump_kernel()) {
iommu_disable_translation(iommu);
@@ -1653,7 +1666,7 @@ static int __init init_dmars(void)
* we could share the same root & context tables
* among all IOMMU's. Need to Split it later.
*/
- ret = iommu_alloc_root_entry(iommu);
+ ret = iommu_alloc_root_entry(iommu, iommu_ser);
if (ret)
goto free_iommu;
@@ -2112,6 +2125,7 @@ int dmar_parse_one_satc(struct acpi_dmar_header *hdr, void *arg)
static int intel_iommu_add(struct dmar_drhd_unit *dmaru)
{
struct intel_iommu *iommu = dmaru->iommu;
+ struct iommu_ser *iommu_ser = NULL;
int ret;
/*
@@ -2120,7 +2134,11 @@ static int intel_iommu_add(struct dmar_drhd_unit *dmaru)
if (iommu->gcmd & DMA_GCMD_TE)
iommu_disable_translation(iommu);
- ret = iommu_alloc_root_entry(iommu);
+#if IS_ENABLED(CONFIG_LIVEUPDATE)
+ iommu_ser = iommu_get_preserved_data(iommu->reg_phys, IOMMU_INTEL);
+#endif
+
+ ret = iommu_alloc_root_entry(iommu, iommu_ser);
if (ret)
goto out;
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index 1eb60ce1300f..b0c56e27f167 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -1283,6 +1283,8 @@ int intel_iommu_preserve_device(struct device *dev, struct device_ser *device_se
void intel_iommu_unpreserve_device(struct device *dev, struct device_ser *device_ser);
int intel_iommu_preserve(struct iommu_device *iommu, struct iommu_ser *iommu_ser);
void intel_iommu_unpreserve(struct iommu_device *iommu, struct iommu_ser *iommu_ser);
+void intel_iommu_liveupdate_restore_root_table(struct intel_iommu *iommu,
+ struct iommu_ser *iommu_ser);
bool intel_iommu_liveupdate_clear_context_entries(struct intel_iommu *iommu);
#endif
diff --git a/drivers/iommu/intel/liveupdate.c b/drivers/iommu/intel/liveupdate.c
index 3f8c7f15bc36..140887187084 100644
--- a/drivers/iommu/intel/liveupdate.c
+++ b/drivers/iommu/intel/liveupdate.c
@@ -73,6 +73,36 @@ static int preserve_iommu_context(struct intel_iommu *iommu)
return ret;
}
+static void restore_iommu_context(struct intel_iommu *iommu)
+{
+ struct context_entry *context;
+ int i;
+
+ for (i = 0; i < ROOT_ENTRY_NR; i++) {
+ context = iommu_context_addr(iommu, i, 0, 0);
+ if (context)
+ BUG_ON(!kho_restore_folio(virt_to_phys(context)));
+
+ if (!sm_supported(iommu))
+ continue;
+
+ context = iommu_context_addr(iommu, i, 0x80, 0);
+ if (context)
+ BUG_ON(!kho_restore_folio(virt_to_phys(context)));
+ }
+}
+
+void intel_iommu_liveupdate_restore_root_table(struct intel_iommu *iommu,
+ struct iommu_ser *iommu_ser)
+{
+ BUG_ON(!kho_restore_folio(iommu_ser->intel.root_table));
+ iommu->root_entry = __va(iommu_ser->intel.root_table);
+
+ restore_iommu_context(iommu);
+ pr_info("Restored IOMMU[0x%llx] Root Table at: 0x%llx\n",
+ iommu->reg_phys, iommu_ser->intel.root_table);
+}
+
int intel_iommu_preserve_device(struct device *dev, struct device_ser *device_ser)
{
struct device_domain_info *info = dev_iommu_priv_get(dev);
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 49+ messages in thread
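Patch 24 changes init_translation_status() so that translation left enabled by the previous kernel is not flagged as "pre-enabled" when the IOMMU state is being restored. Otherwise init_dmars() would take the path that disables translation. The following is a small userspace sketch of that decision. The flag value is hypothetical, and GSTS_TES mirrors the Translation Enable Status bit (bit 31 of the Global Status Register).

```c
#include <stdbool.h>
#include <stdint.h>

#define GSTS_TES		(UINT32_C(1) << 31)	/* Translation Enable Status */
#define FLAG_TRANS_PRE_ENABLED	(UINT32_C(1) << 0)	/* hypothetical flag value */

/*
 * Compute the flags for a freshly probed IOMMU from its status register.
 * When @restoring is true, enabled translation is expected and is therefore
 * not reported as pre-enabled.
 */
static uint32_t translation_flags(uint32_t gsts, bool restoring)
{
	uint32_t flags = 0;

	if (!restoring && (gsts & GSTS_TES))
		flags |= FLAG_TRANS_PRE_ENABLED;

	return flags;
}
```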
* [RFC PATCH v2 25/32] iommu: Add helper APIs to fetch preserved device state
2025-12-02 23:02 [RFC PATCH v2 00/32] Add live update state preservation Samiullah Khawaja
` (23 preceding siblings ...)
2025-12-02 23:02 ` [RFC PATCH v2 24/32] iommu/vt-d: restore state of the preserved IOMMU Samiullah Khawaja
@ 2025-12-02 23:02 ` Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 26/32] iommu/vt-d: reclaim domain ids of the preserved devices Samiullah Khawaja
` (7 subsequent siblings)
32 siblings, 0 replies; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-02 23:02 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: Samiullah Khawaja, Robin Murphy, Pratyush Yadav, Kevin Tian,
Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, YiFei Zhu, Chris Li, praan
Add two APIs to fetch the state of preserved devices: one iterates over
the state of all preserved devices, and the other fetches the state of a
single preserved device. Note that these APIs only fetch the state that
was preserved by the previous kernel.
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
---
drivers/iommu/liveupdate.c | 68 ++++++++++++++++++++++++++++++++++++++
include/linux/iommu-lu.h | 2 ++
2 files changed, 70 insertions(+)
diff --git a/drivers/iommu/liveupdate.c b/drivers/iommu/liveupdate.c
index e7ecf2e9aa4e..1ca97612c501 100644
--- a/drivers/iommu/liveupdate.c
+++ b/drivers/iommu/liveupdate.c
@@ -175,6 +175,74 @@ int iommu_liveupdate_unregister_flb(struct liveupdate_file_handler *handler)
}
EXPORT_SYMBOL(iommu_liveupdate_unregister_flb);
+int iommu_for_each_preserved_device(int (*fn)(struct device_ser *ser, void *arg), void *arg)
+{
+ struct iommu_lu_flb_obj *obj;
+ struct devices_ser *devices;
+ int ret, i, idx;
+
+ ret = liveupdate_flb_get_incoming(&iommu_flb, (void **)&obj);
+ if (ret)
+ return -ENOENT;
+
+ devices = __va(obj->ser->devices_phys);
+ for (i = 0, idx = 0; i < obj->ser->nr_devices; ++i, ++idx) {
+ if (idx >= MAX_DEVICE_SERS) {
+ devices = __va(devices->objs.next_objs);
+ idx = 0;
+ }
+
+ if (devices->devices[idx].obj.deleted)
+ continue;
+
+ ret = fn(&devices->devices[idx], arg);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+EXPORT_SYMBOL(iommu_for_each_preserved_device);
+
+static inline bool device_ser_match(struct device_ser *match,
+ struct pci_dev *pdev)
+{
+ return match->devid == pci_dev_id(pdev) && match->pci_domain == pci_domain_nr(pdev->bus);
+}
+
+struct device_ser *iommu_get_device_preserved_data(struct device *dev)
+{
+ struct iommu_lu_flb_obj *obj;
+ struct devices_ser *devices;
+ int ret, i, idx;
+
+ if (!dev_is_pci(dev))
+ return NULL;
+
+ ret = liveupdate_flb_get_incoming(&iommu_flb, (void **)&obj);
+ if (ret)
+ return NULL;
+
+ devices = __va(obj->ser->devices_phys);
+ for (i = 0, idx = 0; i < obj->ser->nr_devices; ++i, ++idx) {
+ if (idx >= MAX_DEVICE_SERS) {
+ devices = __va(devices->objs.next_objs);
+ idx = 0;
+ }
+
+ if (devices->devices[idx].obj.deleted)
+ continue;
+
+ if (device_ser_match(&devices->devices[idx], to_pci_dev(dev))) {
+ devices->devices[idx].obj.incoming = true;
+ return &devices->devices[idx];
+ }
+ }
+
+ return NULL;
+}
+EXPORT_SYMBOL(iommu_get_device_preserved_data);
+
struct iommu_ser *iommu_get_preserved_data(u64 token, enum iommu_lu_type type)
{
struct iommu_lu_flb_obj *obj;
diff --git a/include/linux/iommu-lu.h b/include/linux/iommu-lu.h
index ffce7043e997..d0226ec19b2f 100644
--- a/include/linux/iommu-lu.h
+++ b/include/linux/iommu-lu.h
@@ -78,6 +78,8 @@ static inline void *iommu_domain_restored_state(struct iommu_domain *domain)
}
#endif
+int iommu_for_each_preserved_device(int (*fn)(struct device_ser *ser, void *arg), void *arg);
+struct device_ser *iommu_get_device_preserved_data(struct device *dev);
struct iommu_ser *iommu_get_preserved_data(u64 token, enum iommu_lu_type type);
int iommu_domain_preserve(struct iommu_domain *domain, struct iommu_domain_ser **ser);
int iommu_domain_unpreserve(struct iommu_domain *domain);
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 49+ messages in thread
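The per-device lookup in patch 25 matches a preserved record against a PCI device by comparing the PCI domain (segment) number and the 16-bit device ID returned by pci_dev_id(), which packs the bus number and devfn. The following is a userspace sketch of that key. The struct is a hypothetical subset of struct device_ser, and its field widths are assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of struct device_ser; field widths are assumed. */
struct dev_ser {
	uint32_t pci_domain;
	uint16_t devid;
};

/* devfn packs a 5-bit slot and a 3-bit function, like PCI_DEVFN(). */
static uint8_t make_devfn(uint8_t slot, uint8_t func)
{
	return (uint8_t)(((slot & 0x1f) << 3) | (func & 0x07));
}

/* devid packs the bus number above devfn, like PCI_DEVID(). */
static uint16_t make_devid(uint8_t bus, uint8_t devfn)
{
	return (uint16_t)((bus << 8) | devfn);
}

/* A record matches only if both the segment and the bus/devfn pair agree. */
static bool dev_ser_match(const struct dev_ser *ser, uint32_t pci_domain,
			  uint8_t bus, uint8_t devfn)
{
	return ser->devid == make_devid(bus, devfn) &&
	       ser->pci_domain == pci_domain;
}
```

For example, device 0000:3a:00.1 has devfn 0x01 and devid 0x3a01. The same bus/devfn pair on a different segment does not match, which is why the domain number is part of the key.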
* [RFC PATCH v2 26/32] iommu/vt-d: reclaim domain ids of the preserved devices
2025-12-02 23:02 [RFC PATCH v2 00/32] Add live update state preservation Samiullah Khawaja
` (24 preceding siblings ...)
2025-12-02 23:02 ` [RFC PATCH v2 25/32] iommu: Add helper APIs to fetch preserved device state Samiullah Khawaja
@ 2025-12-02 23:02 ` Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 27/32] iommu: restore preserved domain and reattach Samiullah Khawaja
` (6 subsequent siblings)
32 siblings, 0 replies; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-02 23:02 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: Samiullah Khawaja, Robin Murphy, Pratyush Yadav, Kevin Tian,
Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, YiFei Zhu, Chris Li, praan
During IOMMU unit state restore, reclaim the domain IDs of the preserved
devices so that they are not handed out to another device.
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
---
drivers/iommu/intel/liveupdate.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/iommu/intel/liveupdate.c b/drivers/iommu/intel/liveupdate.c
index 140887187084..fc1e8545ed0e 100644
--- a/drivers/iommu/intel/liveupdate.c
+++ b/drivers/iommu/intel/liveupdate.c
@@ -92,6 +92,15 @@ static void restore_iommu_context(struct intel_iommu *iommu)
}
}
+static int __restore_used_domain_ids(struct device_ser *ser, void *arg)
+{
+ int id = ser->domain_iommu_ser.did;
+ struct intel_iommu *iommu = arg;
+
+ ida_alloc_range(&iommu->domain_ida, id, id, GFP_KERNEL);
+ return 0;
+}
+
void intel_iommu_liveupdate_restore_root_table(struct intel_iommu *iommu,
struct iommu_ser *iommu_ser)
{
@@ -99,6 +108,7 @@ void intel_iommu_liveupdate_restore_root_table(struct intel_iommu *iommu,
iommu->root_entry = __va(iommu_ser->intel.root_table);
restore_iommu_context(iommu);
+ iommu_for_each_preserved_device(__restore_used_domain_ids, iommu);
pr_info("Restored IOMMU[0x%llx] Root Table at: 0x%llx\n",
iommu->reg_phys, iommu_ser->intel.root_table);
}
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 49+ messages in thread
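Patch 26 reserves a specific domain ID by calling ida_alloc_range() with the minimum and maximum both set to that ID. The following is a userspace sketch of why this works, using a hypothetical 64-entry bitmap allocator with similar semantics (lowest free ID in the range, or -ENOSPC). It is not the kernel IDA implementation.

```c
#include <errno.h>
#include <stdint.h>

#define NR_IDS 64

/* Hypothetical ID pool: bit N set means ID N is in use. */
struct id_pool {
	uint64_t used;
};

/* Allocate the lowest free ID in [min, max], or return -ENOSPC. */
static int id_alloc_range(struct id_pool *pool, unsigned int min,
			  unsigned int max)
{
	unsigned int id;

	if (max >= NR_IDS)
		max = NR_IDS - 1;

	for (id = min; id <= max; id++) {
		if (!(pool->used & (UINT64_C(1) << id))) {
			pool->used |= UINT64_C(1) << id;
			return (int)id;
		}
	}

	return -ENOSPC;
}
```

With min == max the call either claims exactly that ID or fails because it is already taken. A second reservation of the same ID fails in the sketch; presumably this is why the patch ignores the return value, since several preserved devices attached to one domain would report the same ID.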
* [RFC PATCH v2 27/32] iommu: restore preserved domain and reattach
2025-12-02 23:02 [RFC PATCH v2 00/32] Add live update state preservation Samiullah Khawaja
` (25 preceding siblings ...)
2025-12-02 23:02 ` [RFC PATCH v2 26/32] iommu/vt-d: reclaim domain ids of the preserved devices Samiullah Khawaja
@ 2025-12-02 23:02 ` Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 28/32] iommu/vt-d: reuse the preserved domain id for preserved devices Samiullah Khawaja
` (5 subsequent siblings)
32 siblings, 0 replies; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-02 23:02 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: Samiullah Khawaja, Robin Murphy, Pratyush Yadav, Kevin Tian,
Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, YiFei Zhu, Chris Li, praan
During boot, the preserved IOMMU domains need to be recreated and
reattached to the preserved devices. Once a device is reattached to its
preserved domain, the preserved state recorded in its dev_iommu can be
cleared.
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
---
drivers/iommu/iommu.c | 39 ++++++++++++++++++++++++++++++++++++--
drivers/iommu/liveupdate.c | 31 ++++++++++++++++++++++++++++++
include/linux/iommu-lu.h | 1 +
3 files changed, 69 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 3feb440de40a..07453611cf8e 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -18,6 +18,7 @@
#include <linux/errno.h>
#include <linux/host1x_context_bus.h>
#include <linux/iommu.h>
+#include <linux/iommu-lu.h>
#include <linux/iommufd.h>
#include <linux/idr.h>
#include <linux/err.h>
@@ -489,6 +490,10 @@ static int iommu_init_device(struct device *dev)
}
dev->iommu->iommu_dev = iommu_dev;
+#ifdef CONFIG_LIVEUPDATE
+ dev->iommu->device_ser = iommu_get_device_preserved_data(dev);
+#endif
+
ret = iommu_device_link(iommu_dev, dev);
if (ret)
goto err_release;
@@ -2161,6 +2166,12 @@ static int __iommu_attach_device(struct iommu_domain *domain,
if (old)
atomic_dec(&old->attach_count);
+#ifdef CONFIG_LIVEUPDATE
+ /* The associated state can be unset once restored. */
+ if (dev_iommu_restored_state(dev))
+ WRITE_ONCE(dev->iommu->device_ser, NULL);
+#endif
+
dev->iommu->attach_deferred = 0;
trace_attach_device_to_domain(dev);
return 0;
@@ -3006,6 +3017,27 @@ int iommu_fwspec_add_ids(struct device *dev, const u32 *ids, int num_ids)
}
EXPORT_SYMBOL_GPL(iommu_fwspec_add_ids);
+static struct iommu_domain *__iommu_group_maybe_restore_domain(struct iommu_group *group)
+{
+ struct device_ser *device_ser;
+ struct iommu_domain *domain;
+ struct device *dev;
+
+ dev = iommu_group_first_dev(group);
+ if (!dev_is_pci(dev))
+ return NULL;
+
+ device_ser = dev_iommu_restored_state(dev);
+ if (!device_ser)
+ return NULL;
+
+ domain = iommu_restore_domain(dev, device_ser);
+ if (WARN_ON(IS_ERR(domain)))
+ return NULL;
+
+ return domain;
+}
+
/**
* iommu_setup_default_domain - Set the default_domain for the group
* @group: Group to change
@@ -3020,8 +3052,8 @@ static int iommu_setup_default_domain(struct iommu_group *group,
int target_type)
{
struct iommu_domain *old_dom = group->default_domain;
+ struct iommu_domain *dom, *restored_domain;
struct group_device *gdev;
- struct iommu_domain *dom;
bool direct_failed;
int req_type;
int ret;
@@ -3065,6 +3097,9 @@ static int iommu_setup_default_domain(struct iommu_group *group,
/* We must set default_domain early for __iommu_device_set_domain */
group->default_domain = dom;
if (!group->domain) {
+ restored_domain = __iommu_group_maybe_restore_domain(group);
+ if (!restored_domain)
+ restored_domain = dom;
/*
* Drivers are not allowed to fail the first domain attach.
* The only way to recover from this is to fail attaching the
@@ -3072,7 +3107,7 @@ static int iommu_setup_default_domain(struct iommu_group *group,
* in group->default_domain so it is freed after.
*/
ret = __iommu_group_set_domain_internal(
- group, dom, IOMMU_SET_DOMAIN_MUST_SUCCEED);
+ group, restored_domain, IOMMU_SET_DOMAIN_MUST_SUCCEED);
if (WARN_ON(ret))
goto out_free_old;
} else {
diff --git a/drivers/iommu/liveupdate.c b/drivers/iommu/liveupdate.c
index 1ca97612c501..735b7d9417c7 100644
--- a/drivers/iommu/liveupdate.c
+++ b/drivers/iommu/liveupdate.c
@@ -467,3 +467,34 @@ int iommu_unpreserve_device(struct iommu_domain *domain, struct device *dev)
{
return -EOPNOTSUPP;
}
+
+struct iommu_domain *iommu_restore_domain(struct device *dev, struct device_ser *ser)
+{
+ struct iommu_domain_ser *domain_ser;
+ struct iommu_lu_flb_obj *flb_obj;
+ struct iommu_domain *domain;
+ int ret;
+
+ domain_ser = __va(ser->domain_iommu_ser.domain_phys);
+
+ ret = liveupdate_flb_get_incoming(&iommu_flb, (void **)&flb_obj);
+ if (ret)
+ return ERR_PTR(ret);
+
+ guard(mutex)(&flb_obj->lock);
+ if (domain_ser->restored_domain)
+ return domain_ser->restored_domain;
+
+ domain_ser->obj.incoming = true;
+ domain = iommu_paging_domain_alloc(dev);
+ if (IS_ERR(domain))
+ return domain;
+
+ ret = domain->ops->restore(domain, domain_ser);
+ if (ret)
+ return ERR_PTR(ret);
+
+ domain->preserved_state = domain_ser;
+ domain_ser->restored_domain = domain;
+ return domain;
+}
diff --git a/include/linux/iommu-lu.h b/include/linux/iommu-lu.h
index d0226ec19b2f..1f90f4fb0328 100644
--- a/include/linux/iommu-lu.h
+++ b/include/linux/iommu-lu.h
@@ -78,6 +78,7 @@ static inline void *iommu_domain_restored_state(struct iommu_domain *domain)
}
#endif
+struct iommu_domain *iommu_restore_domain(struct device *dev, struct device_ser *ser);
int iommu_for_each_preserved_device(int (*fn)(struct device_ser *ser, void *arg), void *arg);
struct device_ser *iommu_get_device_preserved_data(struct device *dev);
struct iommu_ser *iommu_get_preserved_data(u64 token, enum iommu_lu_type type);
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 49+ messages in thread
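iommu_restore_domain() in patch 27 restores each preserved domain at most once: the first caller allocates and restores the domain and caches it in the serialized record, and later callers for devices that share the domain get the cached pointer. The following is a userspace sketch of that restore-once pattern with hypothetical types. Locking is omitted here, whereas the patch serializes callers with the FLB object's mutex.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct iommu_domain and struct iommu_domain_ser. */
struct domain {
	int did;
};

struct domain_ser {
	int did;
	struct domain *restored_domain;	/* cache of the restored object */
};

static int nr_allocs;	/* counts real allocations, for illustration only */

/* Return the domain for @ser, creating it on first use; NULL on failure. */
static struct domain *restore_domain(struct domain_ser *ser)
{
	struct domain *domain;

	if (ser->restored_domain)
		return ser->restored_domain;

	domain = malloc(sizeof(*domain));
	if (!domain)
		return NULL;

	domain->did = ser->did;
	nr_allocs++;

	/* Publish only after the domain is fully set up. */
	ser->restored_domain = domain;
	return domain;
}
```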
* [RFC PATCH v2 28/32] iommu/vt-d: reuse the preserved domain id for preserved devices
2025-12-02 23:02 [RFC PATCH v2 00/32] Add live update state preservation Samiullah Khawaja
` (26 preceding siblings ...)
2025-12-02 23:02 ` [RFC PATCH v2 27/32] iommu: restore preserved domain and reattach Samiullah Khawaja
@ 2025-12-02 23:02 ` Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 29/32] iommufd: Handle the iommufd can_finish properly Samiullah Khawaja
` (4 subsequent siblings)
32 siblings, 0 replies; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-02 23:02 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: Samiullah Khawaja, Robin Murphy, Pratyush Yadav, Kevin Tian,
Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, YiFei Zhu, Chris Li, praan
Preserved devices have their domain IDs preserved by the previous
kernel. By the time the next kernel restores and reattaches them, those
domain IDs have already been reclaimed, so reuse the preserved domain ID.
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
---
drivers/iommu/intel/iommu.c | 42 +++++++++++++++++++++++-------------
drivers/iommu/intel/iommu.h | 3 ++-
drivers/iommu/intel/nested.c | 2 +-
3 files changed, 30 insertions(+), 17 deletions(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 888351f91918..177bf1b2715f 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1033,7 +1033,8 @@ static bool first_level_by_default(struct intel_iommu *iommu)
return true;
}
-int domain_attach_iommu(struct dmar_domain *domain, struct intel_iommu *iommu)
+int domain_attach_iommu(struct dmar_domain *domain, struct intel_iommu *iommu,
+ int restore_did)
{
struct iommu_domain_info *info, *curr;
int num, ret = -ENOSPC;
@@ -1053,8 +1054,11 @@ int domain_attach_iommu(struct dmar_domain *domain, struct intel_iommu *iommu)
return 0;
}
- num = ida_alloc_range(&iommu->domain_ida, IDA_START_DID,
- cap_ndoms(iommu->cap) - 1, GFP_KERNEL);
+ if (restore_did >= 0)
+ num = restore_did;
+ else
+ num = ida_alloc_range(&iommu->domain_ida, IDA_START_DID,
+ cap_ndoms(iommu->cap) - 1, GFP_KERNEL);
if (num < 0) {
pr_err("%s: No free domain ids\n", iommu->name);
goto err_unlock;
@@ -1325,10 +1329,16 @@ static int dmar_domain_attach_device(struct dmar_domain *domain,
{
struct device_domain_info *info = dev_iommu_priv_get(dev);
struct intel_iommu *iommu = info->iommu;
+ struct device_ser *device_ser = NULL;
unsigned long flags;
int ret;
- ret = domain_attach_iommu(domain, iommu);
+#ifdef CONFIG_LIVEUPDATE
+ device_ser = dev_iommu_restored_state(dev);
+#endif
+
+ ret = domain_attach_iommu(domain, iommu,
+ dev_iommu_restore_did(dev, &domain->domain));
if (ret)
return ret;
@@ -1341,16 +1351,18 @@ static int dmar_domain_attach_device(struct dmar_domain *domain,
if (dev_is_real_dma_subdevice(dev))
return 0;
- if (!sm_supported(iommu))
- ret = domain_context_mapping(domain, dev);
- else if (intel_domain_is_fs_paging(domain))
- ret = domain_setup_first_level(iommu, domain, dev,
- IOMMU_NO_PASID, NULL);
- else if (intel_domain_is_ss_paging(domain))
- ret = domain_setup_second_level(iommu, domain, dev,
- IOMMU_NO_PASID, NULL);
- else if (WARN_ON(true))
- ret = -EINVAL;
+ if (!device_ser) {
+ if (!sm_supported(iommu))
+ ret = domain_context_mapping(domain, dev);
+ else if (intel_domain_is_fs_paging(domain))
+ ret = domain_setup_first_level(iommu, domain, dev,
+ IOMMU_NO_PASID, NULL);
+ else if (intel_domain_is_ss_paging(domain))
+ ret = domain_setup_second_level(iommu, domain, dev,
+ IOMMU_NO_PASID, NULL);
+ else if (WARN_ON(true))
+ ret = -EINVAL;
+ }
if (ret)
goto out_block_translation;
@@ -3612,7 +3624,7 @@ domain_add_dev_pasid(struct iommu_domain *domain,
if (!dev_pasid)
return ERR_PTR(-ENOMEM);
- ret = domain_attach_iommu(dmar_domain, iommu);
+ ret = domain_attach_iommu(dmar_domain, iommu, -1);
if (ret)
goto out_free;
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index b0c56e27f167..aa336050015e 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -1174,7 +1174,8 @@ void __iommu_flush_iotlb(struct intel_iommu *iommu, u16 did, u64 addr,
*/
#define QI_OPT_WAIT_DRAIN BIT(0)
-int domain_attach_iommu(struct dmar_domain *domain, struct intel_iommu *iommu);
+int domain_attach_iommu(struct dmar_domain *domain, struct intel_iommu *iommu,
+ int restore_did);
void domain_detach_iommu(struct dmar_domain *domain, struct intel_iommu *iommu);
void device_block_translation(struct device *dev);
int paging_domain_compatible(struct iommu_domain *domain, struct device *dev);
diff --git a/drivers/iommu/intel/nested.c b/drivers/iommu/intel/nested.c
index a3fb8c193ca6..4fed9f5981e5 100644
--- a/drivers/iommu/intel/nested.c
+++ b/drivers/iommu/intel/nested.c
@@ -40,7 +40,7 @@ static int intel_nested_attach_dev(struct iommu_domain *domain,
return ret;
}
- ret = domain_attach_iommu(dmar_domain, iommu);
+ ret = domain_attach_iommu(dmar_domain, iommu, -1);
if (ret) {
dev_err_ratelimited(dev, "Failed to attach domain to iommu\n");
return ret;
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 49+ messages in thread
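The change to domain_attach_iommu() in patch 28 comes down to one decision: a non-negative restore_did is used as-is, because the ID was already reserved during restore, and a negative value falls back to normal allocation. The following is a userspace sketch of that selection. The counter-based allocator and both constants are hypothetical (the driver allocates from an IDA starting at IDA_START_DID and bounded by cap_ndoms()).

```c
#include <errno.h>

#define FIRST_DID 2	/* hypothetical first allocatable domain ID */
#define NR_DIDS   16	/* hypothetical number of domain IDs */

static int next_free_did = FIRST_DID;

/* Trivial allocator used only for illustration. */
static int alloc_did(void)
{
	if (next_free_did >= NR_DIDS)
		return -ENOSPC;
	return next_free_did++;
}

/* Reuse the preserved ID when one is given, otherwise allocate a new one. */
static int pick_did(int restore_did)
{
	if (restore_did >= 0)
		return restore_did;
	return alloc_did();
}
```

Callers with nothing to restore pass -1, as the nested and PASID attach paths do in the patch.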
* [RFC PATCH v2 29/32] iommufd: Handle the iommufd can_finish properly
2025-12-02 23:02 [RFC PATCH v2 00/32] Add live update state preservation Samiullah Khawaja
` (27 preceding siblings ...)
2025-12-02 23:02 ` [RFC PATCH v2 28/32] iommu/vt-d: reuse the preserved domain id for preserved devices Samiullah Khawaja
@ 2025-12-02 23:02 ` Samiullah Khawaja
2025-12-02 23:03 ` [RFC PATCH v2 30/32] iommu: Transfer device ownership after liveupdate Samiullah Khawaja
` (3 subsequent siblings)
32 siblings, 0 replies; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-02 23:02 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: Samiullah Khawaja, Robin Murphy, Pratyush Yadav, Kevin Tian,
Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, YiFei Zhu, Chris Li, praan
IOMMUFD cannot finish while the restored domains still have attachments,
or before every preserved HWPT has been reclaimed. The devices attached
to the restored domains must be detached or hot-swapped to different
IOMMU domains before finish can happen.
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
---
drivers/iommu/iommufd/iommufd_private.h | 4 +-
drivers/iommu/iommufd/liveupdate.c | 55 ++++++++++++++++++++-----
2 files changed, 46 insertions(+), 13 deletions(-)
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index 0d358e5486d0..3ae324d7da14 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -383,6 +383,7 @@ struct iommufd_hwpt_paging {
bool nest_parent : 1;
#ifdef CONFIG_LIVEUPDATE
bool lu_preserved : 1;
+ bool lu_restored : 1;
u32 lu_token;
#endif
/* Head at iommufd_ioas::hwpt_list */
@@ -728,9 +729,6 @@ int iommufd_liveupdate_unregister_lufs(void);
int iommufd_hwpt_lu_set_preserved(struct iommufd_ucmd *ucmd);
int iommufd_hwpt_lu_restore(struct iommufd_ucmd *ucmd);
-
-/* TODO */
-#define iommu_domain_has_attachments(x) (false)
#else
static inline int iommufd_liveupdate_register_lufs(void)
{
diff --git a/drivers/iommu/iommufd/liveupdate.c b/drivers/iommu/iommufd/liveupdate.c
index 5b45071d7dd2..fe4c514811a4 100644
--- a/drivers/iommu/iommufd/liveupdate.c
+++ b/drivers/iommu/iommufd/liveupdate.c
@@ -265,16 +265,6 @@ static int iommufd_liveupdate_retrieve(struct liveupdate_file_op_args *args)
return rc;
}
-static bool iommufd_liveupdate_can_finish(struct liveupdate_file_op_args *args)
-{
- if (!args->retrieved || !args->file) {
- pr_warn("%s: fd not reclaimed\n", __func__);
- return false;
- }
-
- return true;
-}
-
int iommufd_hwpt_lu_restore(struct iommufd_ucmd *ucmd)
{
struct iommu_hwpt_lu_restore *cmd = ucmd->cmd;
@@ -319,6 +309,7 @@ int iommufd_hwpt_lu_restore(struct iommufd_ucmd *ucmd)
iommufd_object_finalize(ictx, &hwpt->common.obj);
hwpt_lu->reclaimed = true;
+ hwpt->lu_restored = true;
cmd->pt_id = hwpt->common.obj.id;
return 0;
@@ -327,6 +318,50 @@ int iommufd_hwpt_lu_restore(struct iommufd_ucmd *ucmd)
return rc;
}
+static bool iommufd_liveupdate_can_finish(struct liveupdate_file_op_args *args)
+{
+ struct iommufd_hwpt_paging *hwpt;
+ struct iommufd_hwpt_lu *hwpt_lu;
+ struct iommufd_lu *iommufd_lu;
+ struct iommufd_object *obj;
+ struct iommufd_ctx *ictx;
+ unsigned long index;
+ unsigned int i;
+
+ if (!args->retrieved || !args->file) {
+ pr_warn("%s: fd not reclaimed\n", __func__);
+ return false;
+ }
+
+ ictx = iommufd_ctx_from_file(args->file);
+ iommufd_lu = ictx->lu;
+
+ for (i = 0; i < iommufd_lu->nr_hwpts; i++) {
+ hwpt_lu = &iommufd_lu->hwpts[i];
+
+ if (!hwpt_lu->reclaimed)
+ return false;
+ }
+
+ xa_lock(&ictx->objects);
+ xa_for_each(&ictx->objects, index, obj) {
+ if (obj->type != IOMMUFD_OBJ_HWPT_PAGING)
+ continue;
+
+ hwpt = container_of(obj, struct iommufd_hwpt_paging, common.obj);
+ if (!hwpt->lu_restored)
+ continue;
+
+ if (!hwpt->common.domain || iommu_domain_has_attachments(hwpt->common.domain)) {
+ xa_unlock(&ictx->objects);
+ return false;
+ }
+ }
+ xa_unlock(&ictx->objects);
+
+ return true;
+}
+
static void iommufd_liveupdate_finish(struct liveupdate_file_op_args *args)
{
struct iommufd_lu *iommufd_lu;
--
2.52.0.158.g65b55ccf14-goog
^ permalink raw reply related [flat|nested] 49+ messages in thread
* [RFC PATCH v2 30/32] iommu: Transfer device ownership after liveupdate
2025-12-02 23:02 [RFC PATCH v2 00/32] Add live update state preservation Samiullah Khawaja
` (28 preceding siblings ...)
2025-12-02 23:02 ` [RFC PATCH v2 29/32] iommufd: Handle the iommufd can_finish properly Samiullah Khawaja
@ 2025-12-02 23:03 ` Samiullah Khawaja
2025-12-02 23:03 ` [RFC PATCH v2 31/32] iommu: Allow replacing restored domain Samiullah Khawaja
` (2 subsequent siblings)
32 siblings, 0 replies; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-02 23:03 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: Samiullah Khawaja, Robin Murphy, Pratyush Yadav, Kevin Tian,
Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, YiFei Zhu, Chris Li, praan
Get the token of the preserved device and use that to reclaim the
ownership of the preserved device.
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
---
drivers/iommu/iommu.c | 23 ++++++++++++++---------
drivers/iommu/iommufd/device.c | 6 ++++--
drivers/vfio/iommufd.c | 7 ++++++-
drivers/vfio/pci/vfio_pci_liveupdate.c | 1 +
include/linux/iommu.h | 2 +-
include/linux/iommufd.h | 3 ++-
include/linux/vfio.h | 4 ++++
7 files changed, 32 insertions(+), 14 deletions(-)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 07453611cf8e..ad47597baa04 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -3297,20 +3297,24 @@ static int __iommu_group_alloc_blocking_domain(struct iommu_group *group)
return 0;
}
-static int __iommu_take_dma_ownership(struct iommu_group *group, void *owner)
+static int __iommu_take_dma_ownership(struct iommu_group *group, void *owner, bool transfer)
{
int ret;
- if ((group->domain && group->domain != group->default_domain) ||
- !xa_empty(&group->pasid_array))
+ if (!transfer &&
+ ((group->domain && group->domain != group->default_domain) ||
+ !xa_empty(&group->pasid_array)))
return -EBUSY;
ret = __iommu_group_alloc_blocking_domain(group);
if (ret)
return ret;
- ret = __iommu_group_set_domain(group, group->blocking_domain);
- if (ret)
- return ret;
+
+ if (!transfer) {
+ ret = __iommu_group_set_domain(group, group->blocking_domain);
+ if (ret)
+ return ret;
+ }
group->owner = owner;
group->owner_cnt++;
@@ -3339,7 +3343,7 @@ int iommu_group_claim_dma_owner(struct iommu_group *group, void *owner)
goto unlock_out;
}
- ret = __iommu_take_dma_ownership(group, owner);
+ ret = __iommu_take_dma_ownership(group, owner, false);
unlock_out:
mutex_unlock(&group->mutex);
@@ -3351,12 +3355,13 @@ EXPORT_SYMBOL_GPL(iommu_group_claim_dma_owner);
* iommu_device_claim_dma_owner() - Set DMA ownership of a device
* @dev: The device.
* @owner: Caller specified pointer. Used for exclusive ownership.
+ * @transfer: Transfer ownership even if a domain is attached.
*
* Claim the DMA ownership of a device. Multiple devices in the same group may
* concurrently claim ownership if they present the same owner value. Returns 0
* on success and error code on failure
*/
-int iommu_device_claim_dma_owner(struct device *dev, void *owner)
+int iommu_device_claim_dma_owner(struct device *dev, void *owner, bool transfer)
{
/* Caller must be a probed driver on dev */
struct iommu_group *group = dev->iommu_group;
@@ -3378,7 +3383,7 @@ int iommu_device_claim_dma_owner(struct device *dev, void *owner)
goto unlock_out;
}
- ret = __iommu_take_dma_ownership(group, owner);
+ ret = __iommu_take_dma_ownership(group, owner, transfer);
unlock_out:
mutex_unlock(&group->mutex);
return ret;
diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 2c81bfa7dedd..c7b48de53f66 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -205,6 +205,7 @@ void iommufd_device_destroy(struct iommufd_object *obj)
* @ictx: iommufd file descriptor
* @dev: Pointer to a physical device struct
* @id: Output ID number to return to userspace for this device
+ * @restore_token: Preserved state token if restoring.
*
* A successful bind establishes an ownership over the device and returns
* struct iommufd_device pointer, otherwise returns error pointer.
@@ -217,7 +218,8 @@ void iommufd_device_destroy(struct iommufd_object *obj)
* The caller must undo this with iommufd_device_unbind()
*/
struct iommufd_device *iommufd_device_bind(struct iommufd_ctx *ictx,
- struct device *dev, u32 *id)
+ struct device *dev, u32 *id,
+ u32 restore_token)
{
struct iommufd_device *idev;
struct iommufd_group *igroup;
@@ -254,7 +256,7 @@ struct iommufd_device *iommufd_device_bind(struct iommufd_ctx *ictx,
"Use the \"allow_unsafe_interrupts\" module parameter to override\n");
}
- rc = iommu_device_claim_dma_owner(dev, ictx);
+ rc = iommu_device_claim_dma_owner(dev, ictx, restore_token);
if (rc)
goto out_group_put;
diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c
index a38d262c6028..57f0f395408d 100644
--- a/drivers/vfio/iommufd.c
+++ b/drivers/vfio/iommufd.c
@@ -118,8 +118,13 @@ int vfio_iommufd_physical_bind(struct vfio_device *vdev,
struct iommufd_ctx *ictx, u32 *out_device_id)
{
struct iommufd_device *idev;
+ u32 restore_token = 0;
- idev = iommufd_device_bind(ictx, vdev->dev, out_device_id);
+#ifdef CONFIG_LIVEUPDATE
+ restore_token = vdev->preserved_iommufd_token;
+#endif
+
+ idev = iommufd_device_bind(ictx, vdev->dev, out_device_id, restore_token);
if (IS_ERR(idev))
return PTR_ERR(idev);
vdev->iommufd_device = idev;
diff --git a/drivers/vfio/pci/vfio_pci_liveupdate.c b/drivers/vfio/pci/vfio_pci_liveupdate.c
index b721080599d5..208a0b60c10e 100644
--- a/drivers/vfio/pci/vfio_pci_liveupdate.c
+++ b/drivers/vfio/pci/vfio_pci_liveupdate.c
@@ -205,6 +205,7 @@ static int vfio_pci_liveupdate_retrieve(struct liveupdate_file_op_args *args)
vdev = container_of(device, struct vfio_pci_core_device, vdev);
vdev->liveupdate_state = ser;
+ device->preserved_iommufd_token = ser->iommufd_ser.token;
args->file = file;
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 45da5b88f35d..111a892fba1b 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -1211,7 +1211,7 @@ int iommu_group_claim_dma_owner(struct iommu_group *group, void *owner);
void iommu_group_release_dma_owner(struct iommu_group *group);
bool iommu_group_dma_owner_claimed(struct iommu_group *group);
-int iommu_device_claim_dma_owner(struct device *dev, void *owner);
+int iommu_device_claim_dma_owner(struct device *dev, void *owner, bool transfer);
void iommu_device_release_dma_owner(struct device *dev);
int iommu_attach_device_pasid(struct iommu_domain *domain,
diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h
index ba433fb1a481..59d31c76f50d 100644
--- a/include/linux/iommufd.h
+++ b/include/linux/iommufd.h
@@ -60,7 +60,8 @@ struct iommufd_object {
};
struct iommufd_device *iommufd_device_bind(struct iommufd_ctx *ictx,
- struct device *dev, u32 *id);
+ struct device *dev, u32 *id,
+ u32 restore_token);
void iommufd_device_unbind(struct iommufd_device *idev);
int iommufd_device_attach(struct iommufd_device *idev, ioasid_t pasid,
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index 0e9df71e17ab..eacdc025e26a 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -80,6 +80,10 @@ struct vfio_device {
*/
struct dentry *debug_root;
#endif
+
+#ifdef CONFIG_LIVEUPDATE
+ u32 preserved_iommufd_token;
+#endif
};
struct vfio_device_file {
--
2.52.0.158.g65b55ccf14-goog
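To make the effect of the new "transfer" argument easier to follow, here is a minimal userspace sketch of the decision made in __iommu_take_dma_ownership() after this patch. It is a model only: struct own_group and model_take_dma_ownership() are hypothetical names, the PASID array is reduced to a boolean, and the blocking-domain allocation step is left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct iommu_group; only the fields the check reads. */
struct own_group {
	const void *domain;          /* currently attached domain, may be NULL */
	const void *default_domain;
	const void *blocking_domain;
	bool has_pasid_attachments;  /* models !xa_empty(&group->pasid_array) */
	const void *owner;
	unsigned int owner_cnt;
};

/*
 * Models the patched ownership claim: without "transfer" a group that
 * already has a non-default domain (or PASID attachments) is busy and the
 * group is moved to the blocking domain on success; with "transfer" the
 * attached domain is left untouched and only the owner is recorded.
 */
static int model_take_dma_ownership(struct own_group *group, const void *owner,
				    bool transfer)
{
	if (!transfer &&
	    ((group->domain && group->domain != group->default_domain) ||
	     group->has_pasid_attachments))
		return -EBUSY;

	if (!transfer)
		group->domain = group->blocking_domain;

	group->owner = owner;
	group->owner_cnt++;
	return 0;
}
```

In this model a group that still carries a restored domain can only be claimed with transfer set, and the claim does not disturb the domain, which is the behaviour the commit message describes.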
^ permalink raw reply related [flat|nested] 49+ messages in thread
* [RFC PATCH v2 31/32] iommu: Allow replacing restored domain
2025-12-02 23:02 [RFC PATCH v2 00/32] Add live update state preservation Samiullah Khawaja
` (29 preceding siblings ...)
2025-12-02 23:03 ` [RFC PATCH v2 30/32] iommu: Transfer device ownership after liveupdate Samiullah Khawaja
@ 2025-12-02 23:03 ` Samiullah Khawaja
2025-12-02 23:03 ` [RFC PATCH v2 32/32] iommufd/selftest: Add test to verify iommufd preservation Samiullah Khawaja
2026-01-28 19:59 ` [RFC PATCH v2 00/32] Add live update state preservation Jason Gunthorpe
32 siblings, 0 replies; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-02 23:03 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: Samiullah Khawaja, Robin Murphy, Pratyush Yadav, Kevin Tian,
Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, YiFei Zhu, Chris Li, praan
Allow replacing the restored domain with a new domain after liveupdate, as
the restored domain is going to be replaced and is not used for anything else.
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
---
drivers/iommu/iommu.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index ad47597baa04..b3cda7d044ee 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2290,9 +2290,12 @@ static bool domain_iommu_ops_compatible(const struct iommu_ops *ops,
static int __iommu_attach_group(struct iommu_domain *domain,
struct iommu_group *group)
{
+ bool allow_replace = false;
struct device *dev;
- if (group->domain && group->domain != group->default_domain &&
+ allow_replace = group->domain && iommu_domain_restored_state(group->domain);
+ if (!allow_replace && group->domain &&
+ group->domain != group->default_domain &&
group->domain != group->blocking_domain)
return -EBUSY;
--
2.52.0.158.g65b55ccf14-goog
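The changed condition in __iommu_attach_group() can be read as a small predicate. The sketch below models it in userspace; struct model_domain, struct model_group and model_attach_is_busy() are hypothetical names, and the "restored" flag stands in for whatever iommu_domain_restored_state() reports.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures. */
struct model_domain {
	bool restored;  /* models iommu_domain_restored_state(domain) */
};

struct model_group {
	struct model_domain *domain;  /* currently attached, may be NULL */
	struct model_domain *default_domain;
	struct model_domain *blocking_domain;
};

/* Returns true where the patched __iommu_attach_group() would return -EBUSY. */
static bool model_attach_is_busy(const struct model_group *group)
{
	bool allow_replace = group->domain && group->domain->restored;

	return !allow_replace && group->domain &&
	       group->domain != group->default_domain &&
	       group->domain != group->blocking_domain;
}
```

Under this model an ordinary user-attached domain still blocks a new attach, while a restored domain behaves like the default or blocking domain and may be replaced.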
^ permalink raw reply related [flat|nested] 49+ messages in thread
* [RFC PATCH v2 32/32] iommufd/selftest: Add test to verify iommufd preservation
2025-12-02 23:02 [RFC PATCH v2 00/32] Add live update state preservation Samiullah Khawaja
` (30 preceding siblings ...)
2025-12-02 23:03 ` [RFC PATCH v2 31/32] iommu: Allow replacing restored domain Samiullah Khawaja
@ 2025-12-02 23:03 ` Samiullah Khawaja
2026-01-28 19:59 ` [RFC PATCH v2 00/32] Add live update state preservation Jason Gunthorpe
32 siblings, 0 replies; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-02 23:03 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: Samiullah Khawaja, Robin Murphy, Pratyush Yadav, Kevin Tian,
Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, YiFei Zhu, Chris Li, praan
Test iommufd preservation by setting up an iommufd and a VFIO cdev and
preserving them across live update. The test takes the VFIO cdev path of a
device bound to the vfio-pci driver and binds it to the iommufd being
preserved. It also preserves the VFIO cdev so that the iommufd state
associated with it is preserved as well.
The restore path is tested by restoring the preserved VFIO cdev, iommufd
and HWPT, and hotswapping the restored HWPT with a new HWPT. The test also
tries to finish the session before restoring the iommufd, after restoring
the iommufd and after restoring the HWPT, and verifies that each attempt
fails because the hotswap has not happened yet.
Note that the helper functions setup_cdev, open_iommufd, and
setup_iommufd will be replaced with the VFIO selftest library. Similarly,
the helper functions defined to open and interface with the Live Update
Orchestrator device will be replaced with a common helper library.
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
Signed-off-by: YiFei Zhu <zhuyifei@google.com>
---
tools/testing/selftests/iommu/Makefile | 1 +
.../selftests/iommu/iommufd_liveupdate.c | 291 ++++++++++++++++++
2 files changed, 292 insertions(+)
create mode 100644 tools/testing/selftests/iommu/iommufd_liveupdate.c
diff --git a/tools/testing/selftests/iommu/Makefile b/tools/testing/selftests/iommu/Makefile
index 84abeb2f0949..42c962c5e612 100644
--- a/tools/testing/selftests/iommu/Makefile
+++ b/tools/testing/selftests/iommu/Makefile
@@ -6,5 +6,6 @@ LDLIBS += -lcap
TEST_GEN_PROGS :=
TEST_GEN_PROGS += iommufd
TEST_GEN_PROGS += iommufd_fail_nth
+TEST_GEN_PROGS += iommufd_liveupdate
include ../lib.mk
diff --git a/tools/testing/selftests/iommu/iommufd_liveupdate.c b/tools/testing/selftests/iommu/iommufd_liveupdate.c
new file mode 100644
index 000000000000..5f767adb10fa
--- /dev/null
+++ b/tools/testing/selftests/iommu/iommufd_liveupdate.c
@@ -0,0 +1,291 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+/*
+ * Copyright (c) 2025, Google LLC.
+ * Samiullah Khawaja <skhawaja@google.com>
+ */
+
+#include <fcntl.h>
+#include <sys/ioctl.h>
+#include <stdbool.h>
+#include <unistd.h>
+
+#define __EXPORTED_HEADERS__
+#include <linux/liveupdate.h>
+#include <linux/iommufd.h>
+#include <linux/types.h>
+#include <linux/vfio.h>
+
+#include "../kselftest.h"
+
+#define ksft_assert(condition) \
+ do { if (!(condition)) \
+ ksft_exit_fail_msg("Failed: %s at %s %d: %s\n", \
+ #condition, __FILE__, __LINE__, strerror(errno)); } while (0)
+
+int setup_cdev(const char *vfio_cdev_path)
+{
+ int cdev_fd;
+
+ cdev_fd = open(vfio_cdev_path, O_RDWR);
+ if (cdev_fd < 0)
+ ksft_exit_skip("Failed to open VFIO cdev: %s\n", vfio_cdev_path);
+
+ return cdev_fd;
+}
+
+int open_iommufd(void)
+{
+ int iommufd;
+
+ iommufd = open("/dev/iommu", O_RDWR);
+ if (iommufd < 0)
+ ksft_exit_skip("Failed to open /dev/iommu. IOMMUFD support not enabled.\n");
+
+ return iommufd;
+}
+
+int setup_iommufd(int iommufd, int cdev_fd, int hwpt_token)
+{
+ int ret;
+
+ struct vfio_device_bind_iommufd bind = {
+ .argsz = sizeof(bind),
+ .flags = 0,
+ };
+ struct iommu_ioas_alloc alloc_data = {
+ .size = sizeof(alloc_data),
+ .flags = 0,
+ };
+ struct iommu_hwpt_alloc hwpt_alloc = {
+ .size = sizeof(hwpt_alloc),
+ .flags = 0,
+ };
+ struct vfio_device_attach_iommufd_pt attach_data = {
+ .argsz = sizeof(attach_data),
+ .flags = 0,
+ };
+ struct iommu_hwpt_lu_set_preserved set_preserved = {
+ .size = sizeof(set_preserved),
+ .hwpt_token = hwpt_token,
+ .preserved = 1,
+ };
+
+ bind.iommufd = iommufd;
+ ret = ioctl(cdev_fd, VFIO_DEVICE_BIND_IOMMUFD, &bind);
+ ksft_assert(!ret);
+
+ ret = ioctl(iommufd, IOMMU_IOAS_ALLOC, &alloc_data);
+ ksft_assert(!ret);
+
+ hwpt_alloc.dev_id = bind.out_devid;
+ hwpt_alloc.pt_id = alloc_data.out_ioas_id;
+ ret = ioctl(iommufd, IOMMU_HWPT_ALLOC, &hwpt_alloc);
+ ksft_assert(!ret);
+
+ attach_data.pt_id = hwpt_alloc.pt_id;
+ ret = ioctl(cdev_fd, VFIO_DEVICE_ATTACH_IOMMUFD_PT, &attach_data);
+ ksft_assert(!ret);
+
+ set_preserved.hwpt_id = attach_data.pt_id;
+ ret = ioctl(iommufd, IOMMU_HWPT_LU_SET_PRESERVED, &set_preserved);
+ ksft_assert(!ret);
+
+ return ret;
+}
+
+int luo_session_finish(int session_fd)
+{
+ struct liveupdate_session_finish arg = { .size = sizeof(arg) };
+
+ if (ioctl(session_fd, LIVEUPDATE_SESSION_FINISH, &arg) < 0)
+ return -errno;
+
+ return 0;
+}
+
+int restore_iommufd(int session, int iommufd, int cdev_fd, int hwpt_token)
+{
+ int ret;
+
+ struct vfio_device_bind_iommufd bind = {
+ .argsz = sizeof(bind),
+ .flags = 0,
+ };
+ struct iommu_ioas_alloc alloc_data = {
+ .size = sizeof(alloc_data),
+ .flags = 0,
+ };
+ struct iommu_hwpt_alloc hwpt_alloc = {
+ .size = sizeof(hwpt_alloc),
+ .flags = 0,
+ };
+ struct iommu_hwpt_lu_restore restore = {
+ .size = sizeof(restore),
+ .hwpt_token = hwpt_token,
+ .hwpt_alloc_flags = 0,
+ };
+ struct vfio_device_attach_iommufd_pt attach_data = {
+ .argsz = sizeof(attach_data),
+ .flags = 0,
+ };
+
+ bind.iommufd = iommufd;
+ ret = ioctl(cdev_fd, VFIO_DEVICE_BIND_IOMMUFD, &bind);
+ ksft_assert(!ret);
+
+ ret = ioctl(iommufd, IOMMU_IOAS_ALLOC, &alloc_data);
+ ksft_assert(!ret);
+
+ ret = ioctl(iommufd, IOMMU_HWPT_LU_RESTORE, &restore);
+ ksft_assert(!ret);
+
+ /* Should fail */
+ ret = luo_session_finish(session);
+ ksft_assert(ret);
+
+ hwpt_alloc.dev_id = bind.out_devid;
+ hwpt_alloc.pt_id = alloc_data.out_ioas_id;
+ ret = ioctl(iommufd, IOMMU_HWPT_ALLOC, &hwpt_alloc);
+ ksft_assert(!ret);
+
+ attach_data.pt_id = hwpt_alloc.pt_id;
+ ret = ioctl(cdev_fd, VFIO_DEVICE_ATTACH_IOMMUFD_PT, &attach_data);
+ ksft_assert(!ret);
+ attach_data.pt_id = alloc_data.out_ioas_id;
+ ret = ioctl(cdev_fd, VFIO_DEVICE_ATTACH_IOMMUFD_PT, &attach_data);
+ ksft_assert(!ret);
+
+ return ret;
+}
+
+int open_liveupdate_orchestrator(void)
+{
+ int luo;
+
+ luo = open("/dev/liveupdate", O_RDWR);
+ ksft_assert(luo > 0);
+
+ return luo;
+}
+
+int luo_create_session(int luo_fd, const char *name)
+{
+ struct liveupdate_ioctl_create_session arg = { .size = sizeof(arg) };
+ int ret;
+
+ snprintf((char *)arg.name, LIVEUPDATE_SESSION_NAME_LENGTH, "%.*s",
+ LIVEUPDATE_SESSION_NAME_LENGTH - 1, name);
+ ret = ioctl(luo_fd, LIVEUPDATE_IOCTL_CREATE_SESSION, &arg);
+ ksft_assert(!ret);
+ ksft_assert(arg.fd > 0);
+
+ return arg.fd;
+}
+
+int luo_retrieve_session(int luo_fd, const char *name)
+{
+ struct liveupdate_ioctl_retrieve_session arg = { .size = sizeof(arg) };
+ int ret;
+
+ snprintf((char *)arg.name, LIVEUPDATE_SESSION_NAME_LENGTH, "%.*s",
+ LIVEUPDATE_SESSION_NAME_LENGTH - 1, name);
+ ret = ioctl(luo_fd, LIVEUPDATE_IOCTL_RETRIEVE_SESSION, &arg);
+ ksft_assert(!ret || errno == ENOENT);
+
+ if (ret && errno == ENOENT)
+ return -errno;
+
+ return arg.fd;
+}
+
+int liveupdate_preserve_fd(int session_fd, int fd, int token)
+{
+ struct liveupdate_session_preserve_fd preserve;
+ int ret;
+
+ preserve.fd = fd;
+ preserve.token = token;
+ preserve.size = sizeof(preserve);
+
+ ret = ioctl(session_fd, LIVEUPDATE_SESSION_PRESERVE_FD, &preserve);
+ ksft_assert(!ret);
+
+ return ret;
+}
+
+int liveupdate_restore_fd(int session_fd, int token)
+{
+ struct liveupdate_session_retrieve_fd arg = { .size = sizeof(arg) };
+ int ret;
+
+ arg.token = token;
+
+ ret = ioctl(session_fd, LIVEUPDATE_SESSION_RETRIEVE_FD, &arg);
+ ksft_assert(!ret);
+ ksft_assert(arg.fd > 0);
+
+ return arg.fd;
+}
+
+int main(int argc, char *argv[])
+{
+ int iommufd, cdev_fd, luo, session, ret;
+ const int token = 0x123456;
+ const int cdev_token = 0x654321;
+ const int hwpt_token = 0x789012;
+ bool updated = false;
+
+ if (argc < 2) {
+ printf("Usage: ./iommufd_liveupdate <vfio_cdev_path>\n");
+ return 1;
+ }
+
+ luo = open_liveupdate_orchestrator();
+ ksft_assert(luo > 0);
+
+ session = luo_retrieve_session(luo, "iommufd-test");
+ if (session == -ENOENT) {
+ session = luo_create_session(luo, "iommufd-test");
+
+ iommufd = open_iommufd();
+ cdev_fd = setup_cdev(argv[1]);
+ } else {
+ updated = true;
+
+ /* Should fail */
+ ret = luo_session_finish(session);
+ ksft_assert(ret);
+
+ iommufd = liveupdate_restore_fd(session, token);
+ cdev_fd = liveupdate_restore_fd(session, cdev_token);
+ }
+
+ if (!updated) {
+ ret = setup_iommufd(iommufd, cdev_fd, hwpt_token);
+ ksft_assert(!ret);
+ } else {
+ /* Should fail */
+ ret = luo_session_finish(session);
+ ksft_assert(ret);
+
+ ret = restore_iommufd(session, iommufd, cdev_fd, hwpt_token);
+ ksft_assert(!ret);
+ }
+
+ if (!updated) {
+ ret = liveupdate_preserve_fd(session, iommufd, token);
+ ksft_assert(!ret);
+
+ ret = liveupdate_preserve_fd(session, cdev_fd, cdev_token);
+ ksft_assert(!ret);
+
+ while (1)
+ sleep(5);
+ } else {
+ ret = luo_session_finish(session);
+ ksft_assert(!ret);
+ }
+
+ return 0;
+}
--
2.52.0.158.g65b55ccf14-goog
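The sequence of "should fail" checks in the test follows from the can_finish conditions added in patch 29. A minimal userspace model of that gating, assuming the three conditions visible in that patch (fd retrieved, every HWPT reclaimed, no restored HWPT still attached), might look like the sketch below. All names are hypothetical, and the two kernel loops are merged into one for brevity.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-HWPT state tracked across the restore. */
struct model_hwpt {
	bool reclaimed;        /* IOMMU_HWPT_LU_RESTORE has been issued for it */
	bool has_attachments;  /* a device is still attached to the restored domain */
};

struct model_session {
	bool fd_retrieved;     /* the iommufd was retrieved from the session */
	struct model_hwpt *hwpts;
	size_t nr_hwpts;
};

/* Finish is allowed only once every preserved HWPT is reclaimed and hotswapped. */
static bool model_can_finish(const struct model_session *s)
{
	size_t i;

	if (!s->fd_retrieved)
		return false;

	for (i = 0; i < s->nr_hwpts; i++) {
		if (!s->hwpts[i].reclaimed)
			return false;
		if (s->hwpts[i].has_attachments)
			return false;
	}

	return true;
}
```

Walking the model through the test's steps gives false before the iommufd is retrieved, false after retrieval, false after the HWPT restore, and true only once the device has been moved to a new HWPT.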
^ permalink raw reply related [flat|nested] 49+ messages in thread
* Re: [RFC PATCH v2 11/32] iommu/pages: Add APIs to preserve/unpreserve/restore iommu pages
2025-12-02 23:02 ` [RFC PATCH v2 11/32] iommu/pages: Add APIs to preserve/unpreserve/restore iommu pages Samiullah Khawaja
@ 2025-12-04 2:25 ` Baolu Lu
2025-12-04 17:39 ` Samiullah Khawaja
0 siblings, 1 reply; 49+ messages in thread
From: Baolu Lu @ 2025-12-04 2:25 UTC (permalink / raw)
To: Samiullah Khawaja, David Woodhouse, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: Robin Murphy, Pratyush Yadav, Kevin Tian, Alex Williamson,
linux-kernel, Saeed Mahameed, Adithya Jayachandran, Parav Pandit,
Leon Romanovsky, William Tu, Vipin Sharma, dmatlack, YiFei Zhu,
Chris Li, praan
On 12/3/25 07:02, Samiullah Khawaja wrote:
> +int iommu_preserve_pages(struct iommu_pages_list *list)
> +{
> + struct ioptdesc *iopt;
> + int count = 0;
> + int ret;
> +
> + list_for_each_entry(iopt, &list->pages, iopt_freelist_elm) {
> + ret = kho_preserve_folio(ioptdesc_folio(iopt));
> + if (ret) {
> + iommu_unpreserve_pages(list, count);
> + return ret;
> + }
> +
> + ++count;
> + }
> +
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(iommu_preserve_pages);
What is the purpose of "count"?
Thanks,
baolu
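For context on the question above: one plausible reading of the quoted code, not confirmed anywhere in this message, is that "count" records how many folios were successfully preserved so that the error path unwinds only those. The sketch below models that partial-unwind pattern in userspace; struct model_page and the model_* helpers are hypothetical and stand in for the list walk and the kho_preserve_folio() calls.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct model_page {
	bool preserved;
	bool fail;  /* force the preserve step to fail for this entry */
};

static int model_preserve_one(struct model_page *p)
{
	if (p->fail)
		return -ENOMEM;
	p->preserved = true;
	return 0;
}

/* Undo only the first "count" entries, i.e. the ones that were preserved. */
static void model_unpreserve_pages(struct model_page *pages, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++)
		pages[i].preserved = false;
}

static int model_preserve_pages(struct model_page *pages, size_t n)
{
	size_t count;
	int ret;

	for (count = 0; count < n; count++) {
		ret = model_preserve_one(&pages[count]);
		if (ret) {
			/* Entries at index >= count were never preserved. */
			model_unpreserve_pages(pages, count);
			return ret;
		}
	}

	return 0;
}
```

If that reading is right, the counter exists purely for the failure path, which may be why its purpose is not obvious from the success path alone.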
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [RFC PATCH v2 16/32] iommu: Add API to preserve/unpreserve a device
2025-12-02 23:02 ` [RFC PATCH v2 16/32] iommu: Add API to preserve/unpreserve a device Samiullah Khawaja
@ 2025-12-04 5:46 ` Baolu Lu
2025-12-04 17:47 ` Samiullah Khawaja
0 siblings, 1 reply; 49+ messages in thread
From: Baolu Lu @ 2025-12-04 5:46 UTC (permalink / raw)
To: Samiullah Khawaja, David Woodhouse, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: Robin Murphy, Pratyush Yadav, Kevin Tian, Alex Williamson,
linux-kernel, Saeed Mahameed, Adithya Jayachandran, Parav Pandit,
Leon Romanovsky, William Tu, Vipin Sharma, dmatlack, YiFei Zhu,
Chris Li, praan
On 12/3/25 07:02, Samiullah Khawaja wrote:
> iommu_preserve_device/iommu_unpreserve_device can be used to
> preserve/unpreserve a device for liveupdate. During device preservation
> the state of the associated IOMMU is also preserved. The device can only
> be preserved if the attached iommu domain is preserved and the associated
> iommu supports preservation.
If the device supports PASID, multiple domains might be attached to it.
...
>
> Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
> ---
> drivers/iommu/iommu.c | 3 +
> drivers/iommu/liveupdate.c | 115 +++++++++++++++++++++++++++++++++++++
> include/linux/iommu-lu.h | 2 +
> include/linux/iommu.h | 18 ++++++
> 4 files changed, 138 insertions(+)
>
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index a70898d11959..3feb440de40a 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -382,6 +382,9 @@ static struct dev_iommu *dev_iommu_get(struct device *dev)
>
> mutex_init(&param->lock);
> dev->iommu = param;
> +#ifdef CONFIG_LIVEUPDATE
> + dev->iommu->device_ser = NULL;
> +#endif
> return param;
> }
>
> diff --git a/drivers/iommu/liveupdate.c b/drivers/iommu/liveupdate.c
> index 25a943e5e1e3..5780761a7024 100644
> --- a/drivers/iommu/liveupdate.c
> +++ b/drivers/iommu/liveupdate.c
> @@ -11,6 +11,7 @@
> #include <linux/liveupdate.h>
> #include <linux/iommu-lu.h>
> #include <linux/iommu.h>
> +#include <linux/pci.h>
> #include <linux/errno.h>
>
> static void iommu_liveupdate_free_objs(u64 next, bool incoming)
> @@ -209,3 +210,117 @@ int iommu_domain_unpreserve(struct iommu_domain *domain)
> return 0;
> }
> EXPORT_SYMBOL_GPL(iommu_domain_unpreserve);
> +
> +static int iommu_preserve_locked(struct iommu_device *iommu)
> +{
> + struct iommu_lu_flb_obj *flb_obj;
> + struct iommu_ser *iommu_ser;
> + int idx, ret;
> +
> + if (!iommu->ops->preserve)
> + return -EOPNOTSUPP;
> +
> + if (iommu->outgoing_preserved_state) {
> + iommu->outgoing_preserved_state->obj.ref_count++;
> + return 0;
> + }
> +
> + ret = liveupdate_flb_get_outgoing(&iommu_flb, (void **)&flb_obj);
> + if (ret)
> + return ret;
> +
> + idx = reserve_obj_ser((struct iommu_objs_ser **)&flb_obj->iommus, MAX_IOMMU_SERS);
> + if (idx < 0)
> + return idx;
> +
> + iommu_ser = &flb_obj->iommus->iommus[idx];
> + idx = flb_obj->ser->nr_iommus++;
> + iommu_ser->obj.idx = idx;
> + iommu_ser->obj.ref_count = 1;
> +
> + ret = iommu->ops->preserve(iommu, iommu_ser);
> + if (ret)
> + iommu_ser->obj.deleted = true;
> +
> + iommu->outgoing_preserved_state = iommu_ser;
> + return ret;
> +}
> +
> +static void iommu_unpreserve_locked(struct iommu_device *iommu)
> +{
> + struct iommu_ser *iommu_ser = iommu->outgoing_preserved_state;
> +
> + iommu_ser->obj.ref_count--;
> + if (iommu_ser->obj.ref_count)
> + return;
> +
> + iommu->outgoing_preserved_state = NULL;
> + iommu->ops->unpreserve(iommu, iommu_ser);
> + iommu_ser->obj.deleted = true;
> +}
> +
> +int iommu_preserve_device(struct iommu_domain *domain, struct device *dev)
> +{
... but this helper only cares about a single domain.
> + struct iommu_lu_flb_obj *flb_obj;
> + struct device_ser *device_ser;
> + struct dev_iommu *iommu;
> + struct pci_dev *pdev;
> + int ret, idx;
Thanks,
baolu
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [RFC PATCH v2 20/32] iommu: Add APIs to get preserved state of a device
2025-12-02 23:02 ` [RFC PATCH v2 20/32] iommu: Add APIs to get preserved state of a device Samiullah Khawaja
@ 2025-12-04 6:19 ` Baolu Lu
2025-12-04 17:48 ` Samiullah Khawaja
0 siblings, 1 reply; 49+ messages in thread
From: Baolu Lu @ 2025-12-04 6:19 UTC (permalink / raw)
To: Samiullah Khawaja, David Woodhouse, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: Robin Murphy, Pratyush Yadav, Kevin Tian, Alex Williamson,
linux-kernel, Saeed Mahameed, Adithya Jayachandran, Parav Pandit,
Leon Romanovsky, William Tu, Vipin Sharma, dmatlack, YiFei Zhu,
Chris Li, praan
On 12/3/25 07:02, Samiullah Khawaja wrote:
> The preserved state of the device needs to be fetched at various places
> during liveupdate. The added API can also be used to check if a device
> is preserved or not. The API is only used during shutdown and after
> liveupdate so no locking needed.
>
> Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
> ---
> include/linux/iommu-lu.h | 67 ++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 67 insertions(+)
>
> diff --git a/include/linux/iommu-lu.h b/include/linux/iommu-lu.h
> index 95375530b7be..08a659de8553 100644
> --- a/include/linux/iommu-lu.h
> +++ b/include/linux/iommu-lu.h
> @@ -8,9 +8,76 @@
> #ifndef _LINUX_IOMMU_LU_H
> #define _LINUX_IOMMU_LU_H
>
> +#include <linux/device.h>
> +#include <linux/iommu.h>
> #include <linux/liveupdate.h>
> #include <linux/kho/abi/iommu.h>
>
> +#ifdef CONFIG_LIVEUPDATE
> +static inline void *dev_iommu_preserved_state(struct device *dev)
> +{
> + struct device_ser *ser;
> +
> + ser = dev->iommu->device_ser;
This might cause a NULL pointer dereference if the device has not been
probed by the iommu core:
if (!dev->iommu)
return NULL;
ser = dev->iommu->device_ser;
> + if (ser && !ser->obj.incoming)
> + return ser;
> +
> + return NULL;
> +}
Thanks,
baolu
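The suggested guard can be exercised outside the kernel with a small model. The sketch below folds the proposed NULL check into the quoted accessor; the model_* structures are hypothetical stand-ins that keep only the fields the accessor reads, and the "incoming" test is copied from the quoted patch as-is.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct device_ser, struct dev_iommu, struct device. */
struct model_device_ser {
	bool incoming;
};

struct model_dev_iommu {
	struct model_device_ser *device_ser;
};

struct model_device {
	struct model_dev_iommu *iommu;  /* NULL until the device is iommu probed */
};

static struct model_device_ser *
model_dev_iommu_preserved_state(const struct model_device *dev)
{
	struct model_device_ser *ser;

	/* The guard proposed in the review: bail out for unprobed devices. */
	if (!dev->iommu)
		return NULL;

	ser = dev->iommu->device_ser;
	if (ser && !ser->incoming)
		return ser;

	return NULL;
}
```

With the guard in place, callers that iterate over all PCI devices can call the accessor without first checking whether each device has iommu data.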
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [RFC PATCH v2 21/32] iommu/vt-d: Clean the context entries of unpreserved devices
2025-12-02 23:02 ` [RFC PATCH v2 21/32] iommu/vt-d: Clean the context entries of unpreserved devices Samiullah Khawaja
@ 2025-12-04 6:28 ` Baolu Lu
0 siblings, 0 replies; 49+ messages in thread
From: Baolu Lu @ 2025-12-04 6:28 UTC (permalink / raw)
To: Samiullah Khawaja, David Woodhouse, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: Robin Murphy, Pratyush Yadav, Kevin Tian, Alex Williamson,
linux-kernel, Saeed Mahameed, Adithya Jayachandran, Parav Pandit,
Leon Romanovsky, William Tu, Vipin Sharma, dmatlack, YiFei Zhu,
Chris Li, praan
On 12/3/25 07:02, Samiullah Khawaja wrote:
> During normal shutdown the iommu translation is disabled. Since the root
> table is preserved during live update, it needs to be cleaned up and the
> context entries of the unpreserved devices need to be cleared.
>
> Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
> ---
> drivers/iommu/intel/iommu.c | 33 ++++++++++++++++++++++++++++++--
> drivers/iommu/intel/iommu.h | 1 +
> drivers/iommu/intel/liveupdate.c | 1 +
> 3 files changed, 33 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 3f69a073b2d8..84fef81ecf4d 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -16,6 +16,7 @@
> #include <linux/crash_dump.h>
> #include <linux/dma-direct.h>
> #include <linux/dmi.h>
> +#include <linux/iommu-lu.h>
> #include <linux/memory.h>
> #include <linux/pci.h>
> #include <linux/pci-ats.h>
> @@ -52,6 +53,10 @@ static int rwbf_quirk;
>
> #define rwbf_required(iommu) (rwbf_quirk || cap_rwbf((iommu)->cap))
>
> +#ifdef CONFIG_LIVEUPDATE
> +static void __clean_unpreserved_context_entries(struct intel_iommu *iommu);
> +#endif
> +
> /*
> * set to 1 to panic kernel if can't successfully enable VT-d
> * (used when kernel is launched w/ TXT)
> @@ -2376,8 +2381,12 @@ void intel_iommu_shutdown(void)
> /* Disable PMRs explicitly here. */
> iommu_disable_protect_mem_regions(iommu);
>
> - /* Make sure the IOMMUs are switched off */
> - iommu_disable_translation(iommu);
> + if (iommu->iommu.outgoing_preserved_state) {
> + __clean_unpreserved_context_entries(iommu);
> + } else {
> + /* Make sure the IOMMUs are switched off */
> + iommu_disable_translation(iommu);
> + }
> }
> }
>
> @@ -2884,6 +2893,26 @@ static const struct iommu_dirty_ops intel_second_stage_dirty_ops = {
> .set_dirty_tracking = intel_iommu_set_dirty_tracking,
> };
>
> +static void __clean_unpreserved_context_entries(struct intel_iommu *iommu)
> +{
> + struct device_domain_info *info;
> + struct pci_dev *pdev = NULL;
> +
> + for_each_pci_dev(pdev) {
> + info = dev_iommu_priv_get(&pdev->dev);
> + if (!info)
> + continue;
I assume the per-device iommu private data is freed in the
release_device path, which runs before intel_iommu_shutdown(). If that
is the case, "info" would always be NULL here, making the subsequent
code dead code. Or not?
> +
> + if (info->iommu != iommu)
> + continue;
> +
> + if (dev_iommu_preserved_state(&pdev->dev))
> + continue;
> +
> + domain_context_clear(info);
> + }
> +}
Thanks,
baolu
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [RFC PATCH v2 24/32] iommu/vt-d: restore state of the preserved IOMMU
2025-12-02 23:02 ` [RFC PATCH v2 24/32] iommu/vt-d: restore state of the preserved IOMMU Samiullah Khawaja
@ 2025-12-04 6:43 ` Baolu Lu
2025-12-04 17:55 ` Samiullah Khawaja
0 siblings, 1 reply; 49+ messages in thread
From: Baolu Lu @ 2025-12-04 6:43 UTC (permalink / raw)
To: Samiullah Khawaja, David Woodhouse, Joerg Roedel, Will Deacon,
Pasha Tatashin, Jason Gunthorpe, iommu
Cc: Robin Murphy, Pratyush Yadav, Kevin Tian, Alex Williamson,
linux-kernel, Saeed Mahameed, Adithya Jayachandran, Parav Pandit,
Leon Romanovsky, William Tu, Vipin Sharma, dmatlack, YiFei Zhu,
Chris Li, praan
On 12/3/25 07:02, Samiullah Khawaja wrote:
> During boot fetch the preserved state of IOMMU unit and if found then
> restore the state. Reuse the root_table that was preserved in the
> previous kernel.
>
> Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
> ---
> drivers/iommu/intel/iommu.c | 30 ++++++++++++++++++++++++------
> drivers/iommu/intel/iommu.h | 2 ++
> drivers/iommu/intel/liveupdate.c | 30 ++++++++++++++++++++++++++++++
> 3 files changed, 56 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 84fef81ecf4d..888351f91918 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -224,12 +224,12 @@ static void clear_translation_pre_enabled(struct intel_iommu *iommu)
> iommu->flags &= ~VTD_FLAG_TRANS_PRE_ENABLED;
> }
>
> -static void init_translation_status(struct intel_iommu *iommu)
> +static void init_translation_status(struct intel_iommu *iommu, bool restoring)
> {
> u32 gsts;
>
> gsts = readl(iommu->reg + DMAR_GSTS_REG);
> - if (gsts & DMA_GSTS_TES)
> + if (!restoring && (gsts & DMA_GSTS_TES))
> iommu->flags |= VTD_FLAG_TRANS_PRE_ENABLED;
> }
>
> @@ -672,10 +672,18 @@ void dmar_fault_dump_ptes(struct intel_iommu *iommu, u16 source_id,
> #endif
>
> /* iommu handling */
> -static int iommu_alloc_root_entry(struct intel_iommu *iommu)
> +static int iommu_alloc_root_entry(struct intel_iommu *iommu, struct iommu_ser *restored_state)
> {
> struct root_entry *root;
>
> +#if CONFIG_LIVEUPDATE
nit: Is it possible to hide the "IS_ENABLED(CONFIG_LIVEUPDATE)" and "#if
CONFIG_LIVEUPDATE" checks within the header files?
> + if (restored_state) {
> + intel_iommu_liveupdate_restore_root_table(iommu, restored_state);
> + /* Should not be needed since the entries are already cleaned in last kernel. */
> + __iommu_flush_cache(iommu, iommu->root_entry, ROOT_SIZE);
> + return 0;
> + }
> +#endif
> root = iommu_alloc_pages_node_sz(iommu->node, GFP_ATOMIC, SZ_4K);
> if (!root) {
> pr_err("Allocating root entry for %s failed\n",
> @@ -1616,6 +1624,7 @@ static int copy_translation_tables(struct intel_iommu *iommu)
>
> static int __init init_dmars(void)
> {
> + struct iommu_ser *iommu_ser = NULL;
> struct dmar_drhd_unit *drhd;
> struct intel_iommu *iommu;
> int ret;
> @@ -1638,8 +1647,12 @@ static int __init init_dmars(void)
> intel_pasid_max_id);
> }
>
> +#if IS_ENABLED(CONFIG_LIVEUPDATE)
> + iommu_ser = iommu_get_preserved_data(iommu->reg_phys, IOMMU_INTEL);
> +#endif
> +
> intel_iommu_init_qi(iommu);
> - init_translation_status(iommu);
> + init_translation_status(iommu, !!iommu_ser);
>
> if (translation_pre_enabled(iommu) && !is_kdump_kernel()) {
> iommu_disable_translation(iommu);
> @@ -1653,7 +1666,7 @@ static int __init init_dmars(void)
> * we could share the same root & context tables
> * among all IOMMU's. Need to Split it later.
> */
> - ret = iommu_alloc_root_entry(iommu);
> + ret = iommu_alloc_root_entry(iommu, iommu_ser);
> if (ret)
> goto free_iommu;
>
> @@ -2112,6 +2125,7 @@ int dmar_parse_one_satc(struct acpi_dmar_header *hdr, void *arg)
> static int intel_iommu_add(struct dmar_drhd_unit *dmaru)
> {
> struct intel_iommu *iommu = dmaru->iommu;
> + struct iommu_ser *iommu_ser = NULL;
> int ret;
>
> /*
> @@ -2120,7 +2134,11 @@ static int intel_iommu_add(struct dmar_drhd_unit *dmaru)
> if (iommu->gcmd & DMA_GCMD_TE)
> iommu_disable_translation(iommu);
>
> - ret = iommu_alloc_root_entry(iommu);
> +#if IS_ENABLED(CONFIG_LIVEUPDATE)
> + iommu_ser = iommu_get_preserved_data(iommu->reg_phys, IOMMU_INTEL);
> +#endif
> +
> + ret = iommu_alloc_root_entry(iommu, iommu_ser);
> if (ret)
> goto out;
>
> diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
> index 1eb60ce1300f..b0c56e27f167 100644
> --- a/drivers/iommu/intel/iommu.h
> +++ b/drivers/iommu/intel/iommu.h
> @@ -1283,6 +1283,8 @@ int intel_iommu_preserve_device(struct device *dev, struct device_ser *device_se
> void intel_iommu_unpreserve_device(struct device *dev, struct device_ser *device_ser);
> int intel_iommu_preserve(struct iommu_device *iommu, struct iommu_ser *iommu_ser);
> void intel_iommu_unpreserve(struct iommu_device *iommu, struct iommu_ser *iommu_ser);
> +void intel_iommu_liveupdate_restore_root_table(struct intel_iommu *iommu,
> + struct iommu_ser *iommu_ser);
> bool intel_iommu_liveupdate_clear_context_entries(struct intel_iommu *iommu);
> #endif
>
> diff --git a/drivers/iommu/intel/liveupdate.c b/drivers/iommu/intel/liveupdate.c
> index 3f8c7f15bc36..140887187084 100644
> --- a/drivers/iommu/intel/liveupdate.c
> +++ b/drivers/iommu/intel/liveupdate.c
> @@ -73,6 +73,36 @@ static int preserve_iommu_context(struct intel_iommu *iommu)
> return ret;
> }
>
> +static void restore_iommu_context(struct intel_iommu *iommu)
> +{
> + struct context_entry *context;
> + int i;
> +
> + for (i = 0; i < ROOT_ENTRY_NR; i++) {
> + context = iommu_context_addr(iommu, i, 0, 0);
> + if (context)
> + BUG_ON(!kho_restore_folio(virt_to_phys(context)));
> +
> + if (!sm_supported(iommu))
> + continue;
> +
> + context = iommu_context_addr(iommu, i, 0x80, 0);
> + if (context)
> + BUG_ON(!kho_restore_folio(virt_to_phys(context)));
> + }
> +}
> +
> +void intel_iommu_liveupdate_restore_root_table(struct intel_iommu *iommu,
> + struct iommu_ser *iommu_ser)
> +{
> + BUG_ON(!kho_restore_folio(iommu_ser->intel.root_table));
> + iommu->root_entry = __va(iommu_ser->intel.root_table);
> +
> + restore_iommu_context(iommu);
> + pr_info("Restored IOMMU[0x%llx] Root Table at: 0x%llx\n",
> + iommu->reg_phys, iommu_ser->intel.root_table);
> +}
> +
> int intel_iommu_preserve_device(struct device *dev, struct device_ser *device_ser)
> {
> struct device_domain_info *info = dev_iommu_priv_get(dev);
Thanks,
baolu
* Re: [RFC PATCH v2 11/32] iommu/pages: Add APIs to preserve/unpreserve/restore iommu pages
2025-12-04 2:25 ` Baolu Lu
@ 2025-12-04 17:39 ` Samiullah Khawaja
2025-12-05 4:58 ` Baolu Lu
0 siblings, 1 reply; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-04 17:39 UTC (permalink / raw)
To: Baolu Lu
Cc: David Woodhouse, Joerg Roedel, Will Deacon, Pasha Tatashin,
Jason Gunthorpe, iommu, Robin Murphy, Pratyush Yadav, Kevin Tian,
Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, YiFei Zhu, Chris Li, praan
On Wed, Dec 3, 2025 at 6:30 PM Baolu Lu <baolu.lu@linux.intel.com> wrote:
>
> On 12/3/25 07:02, Samiullah Khawaja wrote:
> > +int iommu_preserve_pages(struct iommu_pages_list *list)
> > +{
> > + struct ioptdesc *iopt;
> > + int count = 0;
> > + int ret;
> > +
> > + list_for_each_entry(iopt, &list->pages, iopt_freelist_elm) {
> > + ret = kho_preserve_folio(ioptdesc_folio(iopt));
> > + if (ret) {
> > + iommu_unpreserve_pages(list, count);
> > + return ret;
> > + }
> > +
> > + ++count;
> > + }
> > +
> > + return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(iommu_preserve_pages);
>
> What is the purpose of "count"?
Thanks for reviewing.
The count is there to unpreserve only the preserved pages if an error
occurs during preservation. For example if 4 pages (from the start of
the list) were preserved out of 10 pages in the list, the unpreserve
function will unpreserve the first 4 pages from the start of the list.
Please take a look at the implementation of iommu_unpreserve_pages for
details.
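
The rollback pattern described above can be sketched in isolation as
follows. This is only an illustration under stated assumptions: the
demo_* names are hypothetical stand-ins (not the KHO or iommu-pages
API), and an array replaces the kernel list so the sketch builds in
userspace.

```c
#include <stddef.h>

/* Hypothetical stand-in for a page that KHO would preserve. */
struct demo_page {
	int preserved;	/* 1 once the page has been preserved */
	int fail;	/* set to 1 to simulate a preservation failure */
};

/* Hypothetical stand-in for kho_preserve_folio(). */
static int demo_preserve_one(struct demo_page *p)
{
	if (p->fail)
		return -1;
	p->preserved = 1;
	return 0;
}

/* Hypothetical stand-in for the per-page unpreserve step. */
static void demo_unpreserve_one(struct demo_page *p)
{
	p->preserved = 0;
}

/* Undo only the first @count entries; this is the role of "count". */
static void demo_unpreserve_pages(struct demo_page *pages, size_t count)
{
	for (size_t i = 0; i < count; i++)
		demo_unpreserve_one(&pages[i]);
}

/* Preserve every page; on failure roll back the ones already done. */
static int demo_preserve_pages(struct demo_page *pages, size_t nr)
{
	size_t count = 0;

	for (size_t i = 0; i < nr; i++) {
		int ret = demo_preserve_one(&pages[i]);

		if (ret) {
			demo_unpreserve_pages(pages, count);
			return ret;
		}
		++count;
	}
	return 0;
}
```

With 10 pages where the fifth one fails, only the first 4 are walked
by the rollback, and the remaining pages are never touched.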
>
> Thanks,
> baolu
Sami
* Re: [RFC PATCH v2 16/32] iommu: Add API to preserve/unpreserve a device
2025-12-04 5:46 ` Baolu Lu
@ 2025-12-04 17:47 ` Samiullah Khawaja
0 siblings, 0 replies; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-04 17:47 UTC (permalink / raw)
To: Baolu Lu
Cc: David Woodhouse, Joerg Roedel, Will Deacon, Pasha Tatashin,
Jason Gunthorpe, iommu, Robin Murphy, Pratyush Yadav, Kevin Tian,
Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, YiFei Zhu, Chris Li, praan
On Wed, Dec 3, 2025 at 9:51 PM Baolu Lu <baolu.lu@linux.intel.com> wrote:
>
> On 12/3/25 07:02, Samiullah Khawaja wrote:
> > iommu_preserve_device/iommu_unpreserve_device can be used to
> > preserve/unpreserve a device for liveupdate. During device preservation
> > the state of the associated IOMMU is also preserved. The device can only
> > be preserved if the attached iommu domain is preserved and the associated
> > iommu supports preservation.
>
> If the device supports PASID, multiple domains might be attached to it.
>
> ...
>
> >
> > Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
> > ---
> > drivers/iommu/iommu.c | 3 +
> > drivers/iommu/liveupdate.c | 115 +++++++++++++++++++++++++++++++++++++
> > include/linux/iommu-lu.h | 2 +
> > include/linux/iommu.h | 18 ++++++
> > 4 files changed, 138 insertions(+)
> >
> > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> > index a70898d11959..3feb440de40a 100644
> > --- a/drivers/iommu/iommu.c
> > +++ b/drivers/iommu/iommu.c
> > @@ -382,6 +382,9 @@ static struct dev_iommu *dev_iommu_get(struct device *dev)
> >
> > mutex_init(&param->lock);
> > dev->iommu = param;
> > +#ifdef CONFIG_LIVEUPDATE
> > + dev->iommu->device_ser = NULL;
> > +#endif
> > return param;
> > }
> >
> > diff --git a/drivers/iommu/liveupdate.c b/drivers/iommu/liveupdate.c
> > index 25a943e5e1e3..5780761a7024 100644
> > --- a/drivers/iommu/liveupdate.c
> > +++ b/drivers/iommu/liveupdate.c
> > @@ -11,6 +11,7 @@
> > #include <linux/liveupdate.h>
> > #include <linux/iommu-lu.h>
> > #include <linux/iommu.h>
> > +#include <linux/pci.h>
> > #include <linux/errno.h>
> >
> > static void iommu_liveupdate_free_objs(u64 next, bool incoming)
> > @@ -209,3 +210,117 @@ int iommu_domain_unpreserve(struct iommu_domain *domain)
> > return 0;
> > }
> > EXPORT_SYMBOL_GPL(iommu_domain_unpreserve);
> > +
> > +static int iommu_preserve_locked(struct iommu_device *iommu)
> > +{
> > + struct iommu_lu_flb_obj *flb_obj;
> > + struct iommu_ser *iommu_ser;
> > + int idx, ret;
> > +
> > + if (!iommu->ops->preserve)
> > + return -EOPNOTSUPP;
> > +
> > + if (iommu->outgoing_preserved_state) {
> > + iommu->outgoing_preserved_state->obj.ref_count++;
> > + return 0;
> > + }
> > +
> > + ret = liveupdate_flb_get_outgoing(&iommu_flb, (void **)&flb_obj);
> > + if (ret)
> > + return ret;
> > +
> > + idx = reserve_obj_ser((struct iommu_objs_ser **)&flb_obj->iommus, MAX_IOMMU_SERS);
> > + if (idx < 0)
> > + return idx;
> > +
> > + iommu_ser = &flb_obj->iommus->iommus[idx];
> > + idx = flb_obj->ser->nr_iommus++;
> > + iommu_ser->obj.idx = idx;
> > + iommu_ser->obj.ref_count = 1;
> > +
> > + ret = iommu->ops->preserve(iommu, iommu_ser);
> > + if (ret)
> > + iommu_ser->obj.deleted = true;
> > +
> > + iommu->outgoing_preserved_state = iommu_ser;
> > + return ret;
> > +}
> > +
> > +static void iommu_unpreserve_locked(struct iommu_device *iommu)
> > +{
> > + struct iommu_ser *iommu_ser = iommu->outgoing_preserved_state;
> > +
> > + iommu_ser->obj.ref_count--;
> > + if (iommu_ser->obj.ref_count)
> > + return;
> > +
> > + iommu->outgoing_preserved_state = NULL;
> > + iommu->ops->unpreserve(iommu, iommu_ser);
> > + iommu_ser->obj.deleted = true;
> > +}
> > +
> > +int iommu_preserve_device(struct iommu_domain *domain, struct device *dev)
> > +{
>
> ... but this helper only cares about a single domain.
Thanks
Yes, PASID support is unimplemented in this RFC and I will be adding
PASID support when I send a non-RFC patch series. I have mentioned
this as future work in the cover letter.
Also, for PASID, the PASID table needs to be preserved as well.
>
> > + struct iommu_lu_flb_obj *flb_obj;
> > + struct device_ser *device_ser;
> > + struct dev_iommu *iommu;
> > + struct pci_dev *pdev;
> > + int ret, idx;
>
> Thanks,
> baolu
* Re: [RFC PATCH v2 20/32] iommu: Add APIs to get preserved state of a device
2025-12-04 6:19 ` Baolu Lu
@ 2025-12-04 17:48 ` Samiullah Khawaja
0 siblings, 0 replies; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-04 17:48 UTC (permalink / raw)
To: Baolu Lu
Cc: David Woodhouse, Joerg Roedel, Will Deacon, Pasha Tatashin,
Jason Gunthorpe, iommu, Robin Murphy, Pratyush Yadav, Kevin Tian,
Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, YiFei Zhu, Chris Li, praan
On Wed, Dec 3, 2025 at 10:24 PM Baolu Lu <baolu.lu@linux.intel.com> wrote:
>
> On 12/3/25 07:02, Samiullah Khawaja wrote:
> > The preserved state of the device needs to be fetched at various places
> > during liveupdate. The added API can also be used to check if a device
> > is preserved or not. The API is only used during shutdown and after
> > liveupdate so no locking needed.
> >
> > Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
> > ---
> > include/linux/iommu-lu.h | 67 ++++++++++++++++++++++++++++++++++++++++
> > 1 file changed, 67 insertions(+)
> >
> > diff --git a/include/linux/iommu-lu.h b/include/linux/iommu-lu.h
> > index 95375530b7be..08a659de8553 100644
> > --- a/include/linux/iommu-lu.h
> > +++ b/include/linux/iommu-lu.h
> > @@ -8,9 +8,76 @@
> > #ifndef _LINUX_IOMMU_LU_H
> > #define _LINUX_IOMMU_LU_H
> >
> > +#include <linux/device.h>
> > +#include <linux/iommu.h>
> > #include <linux/liveupdate.h>
> > #include <linux/kho/abi/iommu.h>
> >
> > +#ifdef CONFIG_LIVEUPDATE
> > +static inline void *dev_iommu_preserved_state(struct device *dev)
> > +{
> > + struct device_ser *ser;
> > +
> > + ser = dev->iommu->device_ser;
>
> This might cause "NULL pointer dereference" issue if the device is not
> iommu probed.
>
> if (!dev->iommu)
> return NULL;
> ser = dev->iommu->device_ser;
>
Thanks for this.
I will fix it in the next iteration.
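
For reference, the guarded accessor would look roughly like the sketch
below. The demo_* structures are hypothetical, trimmed-down stand-ins
for struct device, struct dev_iommu and struct device_ser so that the
snippet builds on its own; only the NULL check is the point here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for the kernel structures. */
struct demo_obj { bool incoming; };
struct demo_device_ser { struct demo_obj obj; };
struct demo_dev_iommu { struct demo_device_ser *device_ser; };
struct demo_device { struct demo_dev_iommu *iommu; };

static inline void *demo_dev_iommu_preserved_state(struct demo_device *dev)
{
	struct demo_device_ser *ser;

	/* Device was never IOMMU-probed: there is no state to return. */
	if (!dev->iommu)
		return NULL;

	ser = dev->iommu->device_ser;
	if (ser && !ser->obj.incoming)
		return ser;

	return NULL;
}
```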
> > + if (ser && !ser->obj.incoming)
> > + return ser;
> > +
> > + return NULL;
> > +}
>
> Thanks,
> baolu
>
* Re: [RFC PATCH v2 24/32] iommu/vt-d: restore state of the preserved IOMMU
2025-12-04 6:43 ` Baolu Lu
@ 2025-12-04 17:55 ` Samiullah Khawaja
0 siblings, 0 replies; 49+ messages in thread
From: Samiullah Khawaja @ 2025-12-04 17:55 UTC (permalink / raw)
To: Baolu Lu
Cc: David Woodhouse, Joerg Roedel, Will Deacon, Pasha Tatashin,
Jason Gunthorpe, iommu, Robin Murphy, Pratyush Yadav, Kevin Tian,
Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, YiFei Zhu, Chris Li, praan
On Wed, Dec 3, 2025 at 10:48 PM Baolu Lu <baolu.lu@linux.intel.com> wrote:
>
> On 12/3/25 07:02, Samiullah Khawaja wrote:
> > During boot fetch the preserved state of IOMMU unit and if found then
> > restore the state. Reuse the root_table that was preserved in the
> > previous kernel.
> >
> > Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
> > ---
> > drivers/iommu/intel/iommu.c | 30 ++++++++++++++++++++++++------
> > drivers/iommu/intel/iommu.h | 2 ++
> > drivers/iommu/intel/liveupdate.c | 30 ++++++++++++++++++++++++++++++
> > 3 files changed, 56 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> > index 84fef81ecf4d..888351f91918 100644
> > --- a/drivers/iommu/intel/iommu.c
> > +++ b/drivers/iommu/intel/iommu.c
> > @@ -224,12 +224,12 @@ static void clear_translation_pre_enabled(struct intel_iommu *iommu)
> > iommu->flags &= ~VTD_FLAG_TRANS_PRE_ENABLED;
> > }
> >
> > -static void init_translation_status(struct intel_iommu *iommu)
> > +static void init_translation_status(struct intel_iommu *iommu, bool restoring)
> > {
> > u32 gsts;
> >
> > gsts = readl(iommu->reg + DMAR_GSTS_REG);
> > - if (gsts & DMA_GSTS_TES)
> > + if (!restoring && (gsts & DMA_GSTS_TES))
> > iommu->flags |= VTD_FLAG_TRANS_PRE_ENABLED;
> > }
> >
> > @@ -672,10 +672,18 @@ void dmar_fault_dump_ptes(struct intel_iommu *iommu, u16 source_id,
> > #endif
> >
> > /* iommu handling */
> > -static int iommu_alloc_root_entry(struct intel_iommu *iommu)
> > +static int iommu_alloc_root_entry(struct intel_iommu *iommu, struct iommu_ser *restored_state)
> > {
> > struct root_entry *root;
> >
> > +#if CONFIG_LIVEUPDATE
>
> nit: Is it possible to hide the "IS_ENABLED(CONFIG_LIVEUPDATE)" and "#if
> CONFIG_LIVEUPDATE" checks within the header files?
Thanks for pointing this out.
Yes I think this one can be moved into the header file. I will
evaluate/check all of them and move the ones that can be moved.
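
A rough sketch of what that could look like, assuming the usual
static-inline-stub pattern. The DEMO_CONFIG_LIVEUPDATE macro and all
demo_* names are hypothetical and only stand in for the real config
symbol and helpers; the return values of the caller are illustrative.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct intel_iommu and struct iommu_ser. */
struct demo_iommu { unsigned long reg_phys; };
struct demo_iommu_ser { unsigned long root_table; };

/* --- what would live in the header --- */
#ifdef DEMO_CONFIG_LIVEUPDATE
struct demo_iommu_ser *demo_get_preserved_data(struct demo_iommu *iommu);
#else
static inline struct demo_iommu_ser *
demo_get_preserved_data(struct demo_iommu *iommu)
{
	(void)iommu;
	return NULL;	/* feature compiled out: nothing was preserved */
}
#endif

/* --- what the caller in the .c file looks like: no #if needed --- */
static int demo_alloc_root_entry(struct demo_iommu *iommu)
{
	struct demo_iommu_ser *ser = demo_get_preserved_data(iommu);

	if (ser)
		return 1;	/* would reuse the preserved root table */
	return 0;		/* would allocate a fresh root table */
}
```

With the stub in the header, the restore branch in the caller is dead
code when the feature is disabled, so the compiler can drop it without
any preprocessor checks in the .c file.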
>
> > + if (restored_state) {
> > + intel_iommu_liveupdate_restore_root_table(iommu, restored_state);
> > + /* Should not be needed since the entries are already cleaned in last kernel. */
> > + __iommu_flush_cache(iommu, iommu->root_entry, ROOT_SIZE);
> > + return 0;
> > + }
> > +#endif
> > root = iommu_alloc_pages_node_sz(iommu->node, GFP_ATOMIC, SZ_4K);
> > if (!root) {
> > pr_err("Allocating root entry for %s failed\n",
> > @@ -1616,6 +1624,7 @@ static int copy_translation_tables(struct intel_iommu *iommu)
> >
> > static int __init init_dmars(void)
> > {
> > + struct iommu_ser *iommu_ser = NULL;
> > struct dmar_drhd_unit *drhd;
> > struct intel_iommu *iommu;
> > int ret;
> > @@ -1638,8 +1647,12 @@ static int __init init_dmars(void)
> > intel_pasid_max_id);
> > }
> >
> > +#if IS_ENABLED(CONFIG_LIVEUPDATE)
> > + iommu_ser = iommu_get_preserved_data(iommu->reg_phys, IOMMU_INTEL);
> > +#endif
> > +
> > intel_iommu_init_qi(iommu);
> > - init_translation_status(iommu);
> > + init_translation_status(iommu, !!iommu_ser);
> >
> > if (translation_pre_enabled(iommu) && !is_kdump_kernel()) {
> > iommu_disable_translation(iommu);
> > @@ -1653,7 +1666,7 @@ static int __init init_dmars(void)
> > * we could share the same root & context tables
> > * among all IOMMU's. Need to Split it later.
> > */
> > - ret = iommu_alloc_root_entry(iommu);
> > + ret = iommu_alloc_root_entry(iommu, iommu_ser);
> > if (ret)
> > goto free_iommu;
> >
> > @@ -2112,6 +2125,7 @@ int dmar_parse_one_satc(struct acpi_dmar_header *hdr, void *arg)
> > static int intel_iommu_add(struct dmar_drhd_unit *dmaru)
> > {
> > struct intel_iommu *iommu = dmaru->iommu;
> > + struct iommu_ser *iommu_ser = NULL;
> > int ret;
> >
> > /*
> > @@ -2120,7 +2134,11 @@ static int intel_iommu_add(struct dmar_drhd_unit *dmaru)
> > if (iommu->gcmd & DMA_GCMD_TE)
> > iommu_disable_translation(iommu);
> >
> > - ret = iommu_alloc_root_entry(iommu);
> > +#if IS_ENABLED(CONFIG_LIVEUPDATE)
> > + iommu_ser = iommu_get_preserved_data(iommu->reg_phys, IOMMU_INTEL);
> > +#endif
> > +
> > + ret = iommu_alloc_root_entry(iommu, iommu_ser);
> > if (ret)
> > goto out;
> >
> > diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
> > index 1eb60ce1300f..b0c56e27f167 100644
> > --- a/drivers/iommu/intel/iommu.h
> > +++ b/drivers/iommu/intel/iommu.h
> > @@ -1283,6 +1283,8 @@ int intel_iommu_preserve_device(struct device *dev, struct device_ser *device_se
> > void intel_iommu_unpreserve_device(struct device *dev, struct device_ser *device_ser);
> > int intel_iommu_preserve(struct iommu_device *iommu, struct iommu_ser *iommu_ser);
> > void intel_iommu_unpreserve(struct iommu_device *iommu, struct iommu_ser *iommu_ser);
> > +void intel_iommu_liveupdate_restore_root_table(struct intel_iommu *iommu,
> > + struct iommu_ser *iommu_ser);
> > bool intel_iommu_liveupdate_clear_context_entries(struct intel_iommu *iommu);
> > #endif
> >
> > diff --git a/drivers/iommu/intel/liveupdate.c b/drivers/iommu/intel/liveupdate.c
> > index 3f8c7f15bc36..140887187084 100644
> > --- a/drivers/iommu/intel/liveupdate.c
> > +++ b/drivers/iommu/intel/liveupdate.c
> > @@ -73,6 +73,36 @@ static int preserve_iommu_context(struct intel_iommu *iommu)
> > return ret;
> > }
> >
> > +static void restore_iommu_context(struct intel_iommu *iommu)
> > +{
> > + struct context_entry *context;
> > + int i;
> > +
> > + for (i = 0; i < ROOT_ENTRY_NR; i++) {
> > + context = iommu_context_addr(iommu, i, 0, 0);
> > + if (context)
> > + BUG_ON(!kho_restore_folio(virt_to_phys(context)));
> > +
> > + if (!sm_supported(iommu))
> > + continue;
> > +
> > + context = iommu_context_addr(iommu, i, 0x80, 0);
> > + if (context)
> > + BUG_ON(!kho_restore_folio(virt_to_phys(context)));
> > + }
> > +}
> > +
> > +void intel_iommu_liveupdate_restore_root_table(struct intel_iommu *iommu,
> > + struct iommu_ser *iommu_ser)
> > +{
> > + BUG_ON(!kho_restore_folio(iommu_ser->intel.root_table));
> > + iommu->root_entry = __va(iommu_ser->intel.root_table);
> > +
> > + restore_iommu_context(iommu);
> > + pr_info("Restored IOMMU[0x%llx] Root Table at: 0x%llx\n",
> > + iommu->reg_phys, iommu_ser->intel.root_table);
> > +}
> > +
> > int intel_iommu_preserve_device(struct device *dev, struct device_ser *device_ser)
> > {
> > struct device_domain_info *info = dev_iommu_priv_get(dev);
>
> Thanks,
> baolu
>
* Re: [RFC PATCH v2 11/32] iommu/pages: Add APIs to preserve/unpreserve/restore iommu pages
2025-12-04 17:39 ` Samiullah Khawaja
@ 2025-12-05 4:58 ` Baolu Lu
0 siblings, 0 replies; 49+ messages in thread
From: Baolu Lu @ 2025-12-05 4:58 UTC (permalink / raw)
To: Samiullah Khawaja
Cc: David Woodhouse, Joerg Roedel, Will Deacon, Pasha Tatashin,
Jason Gunthorpe, iommu, Robin Murphy, Pratyush Yadav, Kevin Tian,
Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, YiFei Zhu, Chris Li, praan
On 12/5/25 01:39, Samiullah Khawaja wrote:
> On Wed, Dec 3, 2025 at 6:30 PM Baolu Lu <baolu.lu@linux.intel.com> wrote:
>> On 12/3/25 07:02, Samiullah Khawaja wrote:
>>> +int iommu_preserve_pages(struct iommu_pages_list *list)
>>> +{
>>> + struct ioptdesc *iopt;
>>> + int count = 0;
>>> + int ret;
>>> +
>>> + list_for_each_entry(iopt, &list->pages, iopt_freelist_elm) {
>>> + ret = kho_preserve_folio(ioptdesc_folio(iopt));
>>> + if (ret) {
>>> + iommu_unpreserve_pages(list, count);
>>> + return ret;
>>> + }
>>> +
>>> + ++count;
>>> + }
>>> +
>>> + return 0;
>>> +}
>>> +EXPORT_SYMBOL_GPL(iommu_preserve_pages);
>> What is the purpose of "count"?
> Thanks for reviewing.
>
> The count is there to unpreserve only the preserved pages if an error
> occurs during preservation. For example if 4 pages (from the start of
> the list) were preserved out of 10 pages in the list, the unpreserve
> function will unpreserve the first 4 pages from the start of the list.
> Please take a look at the implementation of iommu_unpreserve_pages for
> details.
Oh, I see now. Thanks!
* Re: [RFC PATCH v2 00/32] Add live update state preservation
2025-12-02 23:02 [RFC PATCH v2 00/32] Add live update state preservation Samiullah Khawaja
` (31 preceding siblings ...)
2025-12-02 23:03 ` [RFC PATCH v2 32/32] iommufd/selftest: Add test to verify iommufd preservation Samiullah Khawaja
@ 2026-01-28 19:59 ` Jason Gunthorpe
2026-01-29 1:11 ` Samiullah Khawaja
32 siblings, 1 reply; 49+ messages in thread
From: Jason Gunthorpe @ 2026-01-28 19:59 UTC (permalink / raw)
To: Samiullah Khawaja
Cc: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Pasha Tatashin, iommu, Robin Murphy, Pratyush Yadav, Kevin Tian,
Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, YiFei Zhu, Chris Li, praan
On Tue, Dec 02, 2025 at 11:02:30PM +0000, Samiullah Khawaja wrote:
>
> Samiullah Khawaja (26):
> iommu: Add liveupdate FLB for IOMMU state preservation
> iommu: Register IOMMU FLB with iommufd file handler
> iommu: Implement IOMMU LU FLB preserve/unpreserve callbacks
> iommu: Add iommu_domain ops to preserve, unpreserve and restore
> iommu/pages: Add APIs to preserve/unpreserve/restore iommu pages
> iommupt: Implement preserve/unpreserve/restore callbacks
> iommu: Add APIs to preserve/unpreserve iommu domains
> iommufd: Use the iommu_domain_preserve/unpreserve APIs
> iommu: Add API to keep track of iommu domain attachments
> iommu: Add API to preserve/unpreserve a device
> iommu/vt-d: Implement device and iommu preserve/unpreserve ops
> iommufd: Add APIs to preserve/unpreserve a vfio cdev
> vfio/pci: Preserve the iommufd state of the vfio cdev
> iommu: Add APIs to get preserved state of a device
> iommu/vt-d: Clean the context entries of unpreserved devices
> iommu: Implement IOMMU FLB retrieve and finish ops
> iommu: Add an API get the preserved state of an IOMMU
> iommu/vt-d: restore state of the preserved IOMMU
> iommu: Add helper APIs to fetch preserved device state
> iommu/vt-d: reclaim domain ids of the preserved devices
> iommu: restore preserved domain and reattach
> iommu/vt-d: reuse the preserved domain id for preserved devices
> iommufd: Handle the iommufd can_finish properly
> iommu: Transfer device ownership after liveupdate
> iommu: Allow replacing restored domain
> iommufd/selftest: Add test to verify iommufd preservation
>
> YiFei Zhu (6):
> iommufd: Allow HWPTs to have a NULL IOAS
> iommufd: split alloc and domain assign from iommufd_hwpt_paging_alloc
> iommufd: Add basic skeleton based on liveupdate_file_handler
> iommufd-lu: Implement basic prepare/cancel/finish/retrieve using
> folios
> iommufd-lu: Implement ioctl to let userspace mark an HWPT to be
> preserved
> iommufd-lu: Persist iommu hardware pagetables for live update
This really needs to be made smaller and more focused to be
reviewable and mergeable. Try to stick to 15-ish patches.
There is a lot going on here. I suggest focusing this only on the main
iommu core, a vt-d driver implementation of the bare minimum, and an
iommufd function to trigger preservation of a domain.
Start off by just making the successor kernel fail to accept any
drivers at all because the iommu_domain was preserved. ie restore the
domain and set it as the default_domain and then fail in
iommu_device_use_default_domain() and related functions.
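
As a very rough sketch of the idea, assuming a "preserved" marker on
the restored domain. Everything below is hypothetical (the demo_*
structures, the flag, and the choice of -EBUSY); it is not the iommu
core code, just the shape of the check.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an iommu_domain restored from the old kernel. */
struct demo_domain {
	bool preserved;
};

/* Hypothetical stand-in for the group that owns the default domain. */
struct demo_group {
	struct demo_domain *default_domain;
	unsigned int owner_cnt;
};

/*
 * Refuse kernel driver use of the default domain while it still
 * carries preserved translations (assumed error code: -EBUSY).
 */
static int demo_use_default_domain(struct demo_group *group)
{
	if (group->default_domain && group->default_domain->preserved)
		return -EBUSY;

	group->owner_cnt++;
	return 0;
}
```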
All it should do is convey an iommu_domain and the pasid table entry
unchanged without hit from prior to successor kernel. That's the bare
minimum.
Then a second series which picks up from there and feeds the successor path
through iommufd/vfio.
Jason
* Re: [RFC PATCH v2 00/32] Add live update state preservation
2026-01-28 19:59 ` [RFC PATCH v2 00/32] Add live update state preservation Jason Gunthorpe
@ 2026-01-29 1:11 ` Samiullah Khawaja
2026-02-02 22:45 ` David Matlack
0 siblings, 1 reply; 49+ messages in thread
From: Samiullah Khawaja @ 2026-01-29 1:11 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Pasha Tatashin, iommu, Robin Murphy, Pratyush Yadav, Kevin Tian,
Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, dmatlack, YiFei Zhu, Chris Li, praan
On Wed, Jan 28, 2026 at 11:59 AM Jason Gunthorpe <jgg@ziepe.ca> wrote:
>
> On Tue, Dec 02, 2025 at 11:02:30PM +0000, Samiullah Khawaja wrote:
> >
> > Samiullah Khawaja (26):
> > iommu: Add liveupdate FLB for IOMMU state preservation
> > iommu: Register IOMMU FLB with iommufd file handler
> > iommu: Implement IOMMU LU FLB preserve/unpreserve callbacks
> > iommu: Add iommu_domain ops to preserve, unpreserve and restore
> > iommu/pages: Add APIs to preserve/unpreserve/restore iommu pages
> > iommupt: Implement preserve/unpreserve/restore callbacks
> > iommu: Add APIs to preserve/unpreserve iommu domains
> > iommufd: Use the iommu_domain_preserve/unpreserve APIs
> > iommu: Add API to keep track of iommu domain attachments
> > iommu: Add API to preserve/unpreserve a device
> > iommu/vt-d: Implement device and iommu preserve/unpreserve ops
> > iommufd: Add APIs to preserve/unpreserve a vfio cdev
> > vfio/pci: Preserve the iommufd state of the vfio cdev
> > iommu: Add APIs to get preserved state of a device
> > iommu/vt-d: Clean the context entries of unpreserved devices
> > iommu: Implement IOMMU FLB retrieve and finish ops
> > iommu: Add an API get the preserved state of an IOMMU
> > iommu/vt-d: restore state of the preserved IOMMU
> > iommu: Add helper APIs to fetch preserved device state
> > iommu/vt-d: reclaim domain ids of the preserved devices
> > iommu: restore preserved domain and reattach
> > iommu/vt-d: reuse the preserved domain id for preserved devices
> > iommufd: Handle the iommufd can_finish properly
> > iommu: Transfer device ownership after liveupdate
> > iommu: Allow replacing restored domain
> > iommufd/selftest: Add test to verify iommufd preservation
> >
> > YiFei Zhu (6):
> > iommufd: Allow HWPTs to have a NULL IOAS
> > iommufd: split alloc and domain assign from iommufd_hwpt_paging_alloc
> > iommufd: Add basic skeleton based on liveupdate_file_handler
> > iommufd-lu: Implement basic prepare/cancel/finish/retrieve using
> > folios
> > iommufd-lu: Implement ioctl to let userspace mark an HWPT to be
> > preserved
> > iommufd-lu: Persist iommu hardware pagetables for live update
>
> This really needs to be made smaller and more focused to be
> reviewable and mergeable. Try to stick to 15-ish patches.
Agreed. I will restructure this into a more focused series.
>
> There is a lot going on here. I suggest focusing this only on the main
> iommu core, a vt-d driver implementation of the bare minimum, and an
> iommufd function to trigger preservation of a domain.
The preservation is done into the FLB, which is bound to the iommufd
file handler. So the phase 1 series will need some boilerplate logic
to trigger preserve/unpreserve. It will also need to check that memfds
are preserved and sealed to make it robust. But I will try to keep it
to a minimum.
>
> Start off by just making the successor kernel fail to accept any
> drivers at all because the iommu_domain was preserved. ie restore the
> domain and set it as the default_domain and then fail in
> iommu_device_use_default_domain() and related functions.
Right. In this model, the device would remain unusable after
liveupdate as you cannot do session finish. So the only way to recover
would be to do a proper reboot. But I think this is a great
intermediate step.
>
> All it should do is convey an iommu_domain and the pasid table entry
> unchanged without hit from prior to successor kernel. That's the bare
> minimum.
>
> Then a second series which picks up from there and feeds the successor path
> through iommufd/vfio.
Yes, this sounds like a great split of the series. I experimented with
this a little and it is workable. But as I mentioned earlier, some
boilerplate will be needed.
>
> Jason
* Re: [RFC PATCH v2 00/32] Add live update state preservation
2026-01-29 1:11 ` Samiullah Khawaja
@ 2026-02-02 22:45 ` David Matlack
2026-02-02 23:00 ` Samiullah Khawaja
2026-02-02 23:45 ` Jason Gunthorpe
0 siblings, 2 replies; 49+ messages in thread
From: David Matlack @ 2026-02-02 22:45 UTC (permalink / raw)
To: Samiullah Khawaja
Cc: Jason Gunthorpe, David Woodhouse, Lu Baolu, Joerg Roedel,
Will Deacon, Pasha Tatashin, iommu, Robin Murphy, Pratyush Yadav,
Kevin Tian, Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, YiFei Zhu, Chris Li, praan
On 2026-01-28 05:11 PM, Samiullah Khawaja wrote:
> On Wed, Jan 28, 2026 at 11:59 AM Jason Gunthorpe <jgg@ziepe.ca> wrote:
> > On Tue, Dec 02, 2025 at 11:02:30PM +0000, Samiullah Khawaja wrote:
> > Start off by just making the successor kernel fail to accept any
> > drivers at all because the iommu_domain was preserved. ie restore the
> > domain and set it as the default_domain and then fail in
> > iommu_device_use_default_domain() and related functions.
>
> Right. In this model, the device would remain unusable after
> liveupdate as you cannot do session finish. So the only way to recover
> would be to do a proper reboot. But I think this is a great
> intermediate step.
This will break the new selftest vfio_pci_liveupdate_kexec_test that is
added in the vfio cdev series:
https://lore.kernel.org/kvm/20260129212510.967611-19-dmatlack@google.com/
https://lore.kernel.org/kvm/20260129212510.967611-23-dmatlack@google.com/
And having to reboot a machine back to life after every Live Update will
be painful for everyone working on VFIO, PCI, and IOMMU Live Update
support.
I agree with wanting to do incremental steps, but can we target that the
initial iommufd series supports an e2e scenario that doesn't leave the
host (or its devices) in an unusable state?
* Re: [RFC PATCH v2 00/32] Add live update state preservation
2026-02-02 22:45 ` David Matlack
@ 2026-02-02 23:00 ` Samiullah Khawaja
2026-02-02 23:46 ` Jason Gunthorpe
2026-02-02 23:45 ` Jason Gunthorpe
1 sibling, 1 reply; 49+ messages in thread
From: Samiullah Khawaja @ 2026-02-02 23:00 UTC (permalink / raw)
To: David Matlack
Cc: Jason Gunthorpe, David Woodhouse, Lu Baolu, Joerg Roedel,
Will Deacon, Pasha Tatashin, iommu, Robin Murphy, Pratyush Yadav,
Kevin Tian, Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, YiFei Zhu, Chris Li, praan
On Mon, Feb 2, 2026 at 2:45 PM David Matlack <dmatlack@google.com> wrote:
>
> On 2026-01-28 05:11 PM, Samiullah Khawaja wrote:
> > On Wed, Jan 28, 2026 at 11:59 AM Jason Gunthorpe <jgg@ziepe.ca> wrote:
> > > On Tue, Dec 02, 2025 at 11:02:30PM +0000, Samiullah Khawaja wrote:
>
> > > Start off by just making the successor kernel fail to accept any
> > > drivers at all because the iommu_domain was preserved. ie restore the
> > > domain and set it as the default_domain and then fail in
> > > iommu_device_use_default_domain() and related functions.
> >
> > Right. In this model, the device would remain unusable after
> > liveupdate as you cannot do session finish. So the only way to recover
> > would be to do a proper reboot. But I think this is a great
> > intermediate step.
>
> This will break the new selftest vfio_pci_liveupdate_kexec_test that is
> added in the vfio cdev series:
>
> https://lore.kernel.org/kvm/20260129212510.967611-19-dmatlack@google.com/
> https://lore.kernel.org/kvm/20260129212510.967611-23-dmatlack@google.com/
Yes, the test will fail to bind with the new iommufd. I will send a
patch with modifications to the test.
>
> And having to reboot a machine back to life after every Live Update will
> be painful for everyone working on VFIO, PCI, and IOMMU Live Update
> support.
>
> I agree with wanting to do incremental steps, but can we target that the
> initial iommufd series supports an e2e scenario that doesn't leave the
> host (or its devices) in an unusable state?
Agreed. In that case, I can send a couple of extra patches with phase
1 to restore iommufd and discard the iommu domain on finish if it has
no attachments.
* Re: [RFC PATCH v2 00/32] Add live update state preservation
2026-02-02 22:45 ` David Matlack
2026-02-02 23:00 ` Samiullah Khawaja
@ 2026-02-02 23:45 ` Jason Gunthorpe
1 sibling, 0 replies; 49+ messages in thread
From: Jason Gunthorpe @ 2026-02-02 23:45 UTC (permalink / raw)
To: David Matlack
Cc: Samiullah Khawaja, David Woodhouse, Lu Baolu, Joerg Roedel,
Will Deacon, Pasha Tatashin, iommu, Robin Murphy, Pratyush Yadav,
Kevin Tian, Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, YiFei Zhu, Chris Li, praan
On Mon, Feb 02, 2026 at 10:45:48PM +0000, David Matlack wrote:
> On 2026-01-28 05:11 PM, Samiullah Khawaja wrote:
> > On Wed, Jan 28, 2026 at 11:59 AM Jason Gunthorpe <jgg@ziepe.ca> wrote:
> > > On Tue, Dec 02, 2025 at 11:02:30PM +0000, Samiullah Khawaja wrote:
>
> > > Start off by just making the successor kernel fail to accept any
> > > drivers at all because the iommu_domain was preserved. ie restore the
> > > domain and set it as the default_domain and then fail in
> > > iommu_device_use_default_domain() and related functions.
> >
> > Right. In this model, the device would remain unusable after
> > liveupdate as you cannot do session finish. So the only way to recover
> > would be to do a proper reboot. But I think this is a great
> > intermediate step.
>
> This will break the new selftest vfio_pci_liveupdate_kexec_test that is
> added in the vfio cdev series:
>
> https://lore.kernel.org/kvm/20260129212510.967611-19-dmatlack@google.com/
> https://lore.kernel.org/kvm/20260129212510.967611-23-dmatlack@google.com/
>
> And having to reboot a machine back to life after every Live Update will
> be painful for everyone working on VFIO, PCI, and IOMMU Live Update
> support.
I expect it to be under a config protection and people will simply
turn it off.
Jason
* Re: [RFC PATCH v2 00/32] Add live update state preservation
2026-02-02 23:00 ` Samiullah Khawaja
@ 2026-02-02 23:46 ` Jason Gunthorpe
0 siblings, 0 replies; 49+ messages in thread
From: Jason Gunthorpe @ 2026-02-02 23:46 UTC (permalink / raw)
To: Samiullah Khawaja
Cc: David Matlack, David Woodhouse, Lu Baolu, Joerg Roedel,
Will Deacon, Pasha Tatashin, iommu, Robin Murphy, Pratyush Yadav,
Kevin Tian, Alex Williamson, linux-kernel, Saeed Mahameed,
Adithya Jayachandran, Parav Pandit, Leon Romanovsky, William Tu,
Vipin Sharma, YiFei Zhu, Chris Li, praan
On Mon, Feb 02, 2026 at 03:00:12PM -0800, Samiullah Khawaja wrote:
> Agreed. In that case, I can send a couple of extra patches with phase
> 1 to restore iommufd and discard the iommu domain on finish if it has
> no attachments.
I think restoring is the hardest part of all of this, you need lots of
iommu driver changes too, and we shouldn't have some weird uapi to
accommodate this in-between state.
Jason
end of thread, other threads:[~2026-02-02 23:46 UTC | newest]
Thread overview: 49+ messages
2025-12-02 23:02 [RFC PATCH v2 00/32] Add live update state preservation Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 01/32] iommufd: Allow HWPTs to have a NULL IOAS Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 02/32] iommufd: split alloc and domain assign from iommufd_hwpt_paging_alloc Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 03/32] iommufd: Add basic skeleton based on liveupdate_file_handler Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 04/32] iommufd-lu: Implement basic prepare/cancel/finish/retrieve using folios Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 05/32] iommufd-lu: Implement ioctl to let userspace mark an HWPT to be preserved Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 06/32] iommufd-lu: Persist iommu hardware pagetables for live update Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 07/32] iommu: Add liveupdate FLB for IOMMU state preservation Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 08/32] iommu: Register IOMMU FLB with iommufd file handler Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 09/32] iommu: Implement IOMMU LU FLB preserve/unpreserve callbacks Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 10/32] iommu: Add iommu_domain ops to preserve, unpreserve and restore Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 11/32] iommu/pages: Add APIs to preserve/unpreserve/restore iommu pages Samiullah Khawaja
2025-12-04 2:25 ` Baolu Lu
2025-12-04 17:39 ` Samiullah Khawaja
2025-12-05 4:58 ` Baolu Lu
2025-12-02 23:02 ` [RFC PATCH v2 12/32] iommupt: Implement preserve/unpreserve/restore callbacks Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 13/32] iommu: Add APIs to preserve/unpreserve iommu domains Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 14/32] iommufd: Use the iommu_domain_preserve/unpreserve APIs Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 15/32] iommu: Add API to keep track of iommu domain attachments Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 16/32] iommu: Add API to preserve/unpreserve a device Samiullah Khawaja
2025-12-04 5:46 ` Baolu Lu
2025-12-04 17:47 ` Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 17/32] iommu/vt-d: Implement device and iommu preserve/unpreserve ops Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 18/32] iommufd: Add APIs to preserve/unpreserve a vfio cdev Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 19/32] vfio/pci: Preserve the iommufd state of the " Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 20/32] iommu: Add APIs to get preserved state of a device Samiullah Khawaja
2025-12-04 6:19 ` Baolu Lu
2025-12-04 17:48 ` Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 21/32] iommu/vt-d: Clean the context entries of unpreserved devices Samiullah Khawaja
2025-12-04 6:28 ` Baolu Lu
2025-12-02 23:02 ` [RFC PATCH v2 22/32] iommu: Implement IOMMU FLB retrieve and finish ops Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 23/32] iommu: Add an API get the preserved state of an IOMMU Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 24/32] iommu/vt-d: restore state of the preserved IOMMU Samiullah Khawaja
2025-12-04 6:43 ` Baolu Lu
2025-12-04 17:55 ` Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 25/32] iommu: Add helper APIs to fetch preserved device state Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 26/32] iommu/vt-d: reclaim domain ids of the preserved devices Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 27/32] iommu: restore preserved domain and reattach Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 28/32] iommu/vt-d: reuse the preserved domain id for preserved devices Samiullah Khawaja
2025-12-02 23:02 ` [RFC PATCH v2 29/32] iommufd: Handle the iommufd can_finish properly Samiullah Khawaja
2025-12-02 23:03 ` [RFC PATCH v2 30/32] iommu: Transfer device ownership after liveupdate Samiullah Khawaja
2025-12-02 23:03 ` [RFC PATCH v2 31/32] iommu: Allow replacing restored domain Samiullah Khawaja
2025-12-02 23:03 ` [RFC PATCH v2 32/32] iommufd/selftest: Add test to verify iommufd preservation Samiullah Khawaja
2026-01-28 19:59 ` [RFC PATCH v2 00/32] Add live update state preservation Jason Gunthorpe
2026-01-29 1:11 ` Samiullah Khawaja
2026-02-02 22:45 ` David Matlack
2026-02-02 23:00 ` Samiullah Khawaja
2026-02-02 23:46 ` Jason Gunthorpe
2026-02-02 23:45 ` Jason Gunthorpe