From: Steve Sistare <steven.sistare@oracle.com>
To: qemu-devel@nongnu.org
Cc: Alex Williamson <alex.williamson@redhat.com>,
Cedric Le Goater <clg@redhat.com>, Yi Liu <yi.l.liu@intel.com>,
Eric Auger <eric.auger@redhat.com>,
Zhenzhong Duan <zhenzhong.duan@intel.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
Marcel Apfelbaum <marcel.apfelbaum@gmail.com>,
Peter Xu <peterx@redhat.com>, Fabiano Rosas <farosas@suse.de>,
Steve Sistare <steven.sistare@oracle.com>
Subject: [PATCH V5 00/38] Live update: vfio and iommufd
Date: Tue, 10 Jun 2025 08:39:13 -0700
Message-ID: <1749569991-25171-1-git-send-email-steven.sistare@oracle.com>

Support vfio and iommufd devices with the cpr-transfer live migration mode.
Devices that do not support live migration can still support cpr-transfer,
allowing live update to a new version of QEMU on the same host, with no loss
of guest connectivity.

No user-visible interfaces are added.

For legacy containers:

Pass vfio device descriptors to new QEMU. In new QEMU, during vfio_realize,
skip the ioctls that configure the device, because it is already configured.

Use VFIO_DMA_UNMAP_FLAG_VADDR to abandon the old VAs of DMA-mapped
regions, and use VFIO_DMA_MAP_FLAG_VADDR to register the new VAs in new
QEMU and update the locked memory accounting. The physical pages remain
pinned, because the descriptor of the device that locked them remains open,
so DMA to those pages continues without interruption. Mediated devices are
not supported, however, because they require the VA to always be valid, and
there is a brief window where no VA is registered.
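
As a rough illustration of the VA handoff described above, the sketch below
fills in the two type1 ioctl arguments from <linux/vfio.h>. The helper names
(init_unmap_all_vaddr, init_map_vaddr, discard_old_vaddr, restore_vaddr) are
hypothetical and are not the functions added by this series. Combining
VFIO_DMA_UNMAP_FLAG_ALL with the VADDR flag is an assumption about how one
would cover every mapping in a single call. It needs kernel headers that
define the VADDR flags.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Old QEMU: argument that invalidates the VA of every mapping at once.
 * With VFIO_DMA_UNMAP_FLAG_ALL, iova and size must both be zero. */
void init_unmap_all_vaddr(struct vfio_iommu_type1_dma_unmap *unmap)
{
    memset(unmap, 0, sizeof(*unmap));
    unmap->argsz = sizeof(*unmap);
    unmap->flags = VFIO_DMA_UNMAP_FLAG_VADDR | VFIO_DMA_UNMAP_FLAG_ALL;
}

/* New QEMU: argument that registers a new VA for an existing mapping.
 * iova and size must match the mapping created by old QEMU; no READ or
 * WRITE flags are passed, since the mapping itself is unchanged. */
void init_map_vaddr(struct vfio_iommu_type1_dma_map *map,
                    uint64_t iova, uint64_t size, void *new_va)
{
    memset(map, 0, sizeof(*map));
    map->argsz = sizeof(*map);
    map->flags = VFIO_DMA_MAP_FLAG_VADDR;
    map->vaddr = (uintptr_t)new_va;
    map->iova = iova;
    map->size = size;
}

/* Issue the unmap-all-vaddr request on a legacy container descriptor. */
int discard_old_vaddr(int container_fd)
{
    struct vfio_iommu_type1_dma_unmap unmap;

    init_unmap_all_vaddr(&unmap);
    return ioctl(container_fd, VFIO_IOMMU_UNMAP_DMA, &unmap) ? -errno : 0;
}

/* Issue the new-VA registration for one mapping. */
int restore_vaddr(int container_fd, uint64_t iova, uint64_t size,
                  void *new_va)
{
    struct vfio_iommu_type1_dma_map map;

    init_map_vaddr(&map, iova, size, new_va);
    return ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &map) ? -errno : 0;
}
```

Between discard_old_vaddr() in old QEMU and the last restore_vaddr() in new
QEMU, no VA is registered, which is the window that rules out mediated
devices.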

Save the MSI message area as part of vfio-pci vmstate, and pass the interrupt
and notifier eventfds to new QEMU. New QEMU loads the MSI data, then the
vfio-pci post_load handler finds the eventfds in CPR state, rebuilds vector
data structures, and attaches the interrupts to the new KVM instance. This
logic also applies to iommufd containers.
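
To show why a passed eventfd keeps working in the receiver, here is a
minimal, generic sketch that hands an eventfd across a Unix socket with
SCM_RIGHTS and then signals it through the received copy. This is only an
illustration of descriptor passing. The transport and the helper names
(send_fd, recv_fd, eventfd_handoff_demo) are assumptions and do not describe
the CPR state code in this series.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/eventfd.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one descriptor. */
union fd_cmsg {
    char buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align;
};

/* Send one descriptor as SCM_RIGHTS ancillary data with a 1-byte payload. */
static int send_fd(int sock, int fd)
{
    char byte = 0;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_cmsg u;
    struct msghdr msg;
    struct cmsghdr *c;

    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    c = CMSG_FIRSTHDR(&msg);
    c->cmsg_level = SOL_SOCKET;
    c->cmsg_type = SCM_RIGHTS;
    c->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(c), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor; returns the new fd number or -1. */
static int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_cmsg u;
    struct msghdr msg;
    struct cmsghdr *c;
    int fd;

    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(sock, &msg, 0) != 1) {
        return -1;
    }
    c = CMSG_FIRSTHDR(&msg);
    if (!c || c->cmsg_level != SOL_SOCKET || c->cmsg_type != SCM_RIGHTS) {
        return -1;
    }
    memcpy(&fd, CMSG_DATA(c), sizeof(int));
    return fd;
}

/* Pass an eventfd through a socketpair, signal it via the received copy,
 * and observe the signal on the original.  Returns 0 on success. */
int eventfd_handoff_demo(void)
{
    int sv[2], efd, passed = -1, ok = 0;
    uint64_t one = 1, got = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -1;
    }
    efd = eventfd(0, 0);
    if (efd >= 0 && send_fd(sv[0], efd) == 0) {
        passed = recv_fd(sv[1]);
    }
    if (passed >= 0) {
        ok = write(passed, &one, sizeof(one)) == (ssize_t)sizeof(one) &&
             read(efd, &got, sizeof(got)) == (ssize_t)sizeof(got) &&
             got == 1;
        close(passed);
    }
    if (efd >= 0) {
        close(efd);
    }
    close(sv[0]);
    close(sv[1]);
    return ok ? 0 : -1;
}
```

Both descriptors refer to the same kernel eventfd object, so whatever is
already attached to it on the kernel side is unaffected by which process
holds the descriptor.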

For iommufd containers:

Use IOMMU_IOAS_MAP_FILE to register memory regions for DMA when they are
backed by a file (including a memfd), so DMA mappings do not depend on VA,
which can differ after live update. This allows mediated devices to be
supported.
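
The point about VA independence can be illustrated without iommufd at all.
The sketch below maps the same file offset at two different addresses, drops
the first mapping, and shows that the data is still reachable through the
second. It uses an unlinked temporary file as a stand-in for a memfd, and the
function name is hypothetical. It does not call IOMMU_IOAS_MAP_FILE.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 0 if data written through one mapping of (fd, offset 0) is
 * still visible through a second mapping after the first is unmapped. */
int file_backing_outlives_va(void)
{
    char path[] = "/tmp/cpr-sketch-XXXXXX";
    const char msg[] = "guest page";
    long psz = sysconf(_SC_PAGESIZE);
    char *old_va, *new_va;
    int fd, ok;

    fd = mkstemp(path);
    if (fd < 0) {
        return -1;
    }
    unlink(path);                 /* keep only the open descriptor */
    if (psz <= 0 || ftruncate(fd, psz) < 0) {
        close(fd);
        return -1;
    }

    /* Two simultaneous shared mappings get two distinct addresses. */
    old_va = mmap(NULL, psz, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (old_va == MAP_FAILED) {
        close(fd);
        return -1;
    }
    new_va = mmap(NULL, psz, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (new_va == MAP_FAILED) {
        munmap(old_va, psz);
        close(fd);
        return -1;
    }

    memcpy(old_va, msg, sizeof(msg));
    munmap(old_va, psz);          /* the "old" VA is no longer valid */

    ok = new_va != old_va && memcmp(new_va, msg, sizeof(msg)) == 0;

    munmap(new_va, psz);
    close(fd);
    return ok ? 0 : -1;
}
```

A mapping identified by descriptor and offset stays meaningful across the
VA change, whereas one identified by VA would have to be re-registered.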

Pass the iommufd and vfio device descriptors from old to new QEMU. In new
QEMU, during vfio_realize, skip the ioctls that configure the device, because
it is already configured.

In new QEMU, call ioctl(IOMMU_IOAS_CHANGE_PROCESS) to update mm ownership and
locked memory accounting.
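
A minimal sketch of that last call, assuming an already-open iommufd
descriptor inherited from old QEMU. The wrapper name change_process is
hypothetical. The header guards are only there so the sketch also builds
against kernel headers that predate IOMMU_IOAS_CHANGE_PROCESS, in which case
it reports -ENOTSUP.

```c
#include <assert.h>
#include <errno.h>
#include <sys/ioctl.h>

#if defined(__has_include)
# if __has_include(<linux/iommufd.h>)
#  include <linux/iommufd.h>
# endif
#endif

/* Ask the kernel to move mm ownership and locked-memory accounting for
 * this iommufd to the calling process.  Returns 0 or a negative errno. */
int change_process(int iommufd)
{
#ifdef IOMMU_IOAS_CHANGE_PROCESS
    struct iommu_ioas_change_process args = {
        .size = sizeof(args),
    };

    return ioctl(iommufd, IOMMU_IOAS_CHANGE_PROCESS, &args) ? -errno : 0;
#else
    /* Installed headers do not know this ioctl. */
    (void)iommufd;
    return -ENOTSUP;
#endif
}
```

The call takes no per-mapping arguments, which fits the description above:
the mappings themselves are untouched and only their accounting changes
owner.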

Patches 3 to 8 are specific to legacy containers.
Patches 21 to 36 are specific to iommufd containers.
The remainder apply to both.

Changes from previous versions:

* V1 of this series contains minor changes from the "Live update: vfio" and
  "Live update: iommufd" series, mainly bug fixes and refactored patches.

Changes in V2:
* refactored various vfio code snippets into new cpr helpers
* refactored vfio struct members into cpr-specific structures
* refactored various small changes into their own patches
* split complex patches. Notably:
  - split "refactor for cpr" into 5 patches
  - split "reconstruct device" into 4 patches
* refactored vfio_connect_container using helpers and made its
  error recovery more robust.
* moved vfio pci msi/vector/intx cpr functions to cpr.c
* renamed "reused" to cpr_reused and cpr.reused
* squashed vfio_cpr_[un]register_container to their call sites
* simplified iommu_type setting after cpr
* added cpr_open_fd and cpr_is_incoming helpers
* removed changes from vfio_legacy_dma_map, and instead temporarily
  override dma_map and dma_unmap ops.
* deleted error_report and returned Error to callers where possible.
* simplified the memory_get_xlat_addr interface
* fixed flags passed to iommufd_backend_alloc_hwpt
* defined MIG_PRI_UNINITIALIZED
* added maintainers

Changes in V3:
* removed cleanup patches that were already pulled
* rebased to latest master

Changes in V4:
* added SPDX-License-Identifier
* patch "vfio/container: preserve descriptors"
  - rewrote search loop in vfio_container_connect
  - do not return pfd from vfio_cpr_container_match
  - add helper for VFIO_GROUP_GET_DEVICE_FD
* deleted patch "export vfio_legacy_dma_map"
* patch "vfio/container: restore DMA vaddr"
  - deleted redundant error_report from vfio_legacy_cpr_dma_map
  - save old dma_map function
* patch "vfio-pci: skip reset during cpr"
  - use cpr_is_incoming instead of cpr_reused
* renamed err -> local_err in all new code
* patch "export MSI functions"
  - renamed with vfio_pci prefix, and defined wrappers for low level
    routines instead of exporting them.
* patch "close kvm after cpr"
  - fixed build error for !CONFIG_KVM
* added the cpr_resave_fd helper
* dropped patch "pass ramblock to vfio_container_dma_map", relying on
  "pass MemoryRegion" from the vfio-user series instead.
* deleted "reused" variables, replaced with cpr_is_incoming()
* renamed cpr_needed_for_reuse -> cpr_incoming_needed
* rewrote patch "pci: skip reset during cpr"
* rebased to latest master

for iommufd:
* deleted redundant error_report from iommufd_backend_map_file_dma
* added interface doc for dma_map_file
* check return value of cpr_open_fd
* deleted "export iommufd_cdev_get_info_iova_range"
* deleted "reconstruct device"
* deleted "reconstruct hw_caps"
* deleted "define hwpt constructors"
* separated cpr registration for iommufd be and vfio container
* correctly attach to multiple containers per iommufd using ioas_id
* simplified "reconstruct hwpt" by matching against hwpt_id.
* added patch "add vfio_device_free_name"

Changes in V5:
* dropped: vfio/pci: vfio_pci_put_device on failure
* added: "vfio: doc changes for cpr"
* deleted unnecessary include of vfio-cpr.h
* fixed compilation for !CONFIG_VFIO and !CONFIG_IOMMUFD
* misc minor changes
* added RBs, rebased to master

Steve Sistare (38):
  migration: cpr helpers
  migration: lower handler priority
  vfio/container: register container for cpr
  vfio/container: preserve descriptors
  vfio/container: discard old DMA vaddr
  vfio/container: restore DMA vaddr
  vfio/container: mdev cpr blocker
  vfio/container: recover from unmap-all-vaddr failure
  pci: export msix_is_pending
  pci: skip reset during cpr
  vfio-pci: skip reset during cpr
  vfio/pci: vfio_pci_vector_init
  vfio/pci: vfio_notifier_init
  vfio/pci: pass vector to virq functions
  vfio/pci: vfio_notifier_init cpr parameters
  vfio/pci: vfio_notifier_cleanup
  vfio/pci: export MSI functions
  vfio-pci: preserve MSI
  vfio-pci: preserve INTx
  migration: close kvm after cpr
  migration: cpr_get_fd_param helper
  backends/iommufd: iommufd_backend_map_file_dma
  backends/iommufd: change process ioctl
  physmem: qemu_ram_get_fd_offset
  vfio/iommufd: use IOMMU_IOAS_MAP_FILE
  vfio/iommufd: invariant device name
  vfio/iommufd: add vfio_device_free_name
  vfio/iommufd: device name blocker
  vfio/iommufd: register container for cpr
  migration: vfio cpr state hook
  vfio/iommufd: cpr state
  vfio/iommufd: preserve descriptors
  vfio/iommufd: reconstruct device
  vfio/iommufd: reconstruct hwpt
  vfio/iommufd: change process
  iommufd: preserve DMA mappings
  vfio/container: delete old cpr register
  vfio: doc changes for cpr
docs/devel/migration/CPR.rst | 5 +-
qapi/migration.json | 6 +-
hw/vfio/pci.h | 10 ++
include/exec/cpu-common.h | 1 +
include/hw/pci/msix.h | 1 +
include/hw/pci/pci_device.h | 3 +
include/hw/vfio/vfio-container-base.h | 18 +++
include/hw/vfio/vfio-container.h | 2 +
include/hw/vfio/vfio-cpr.h | 66 +++++++-
include/hw/vfio/vfio-device.h | 5 +
include/migration/cpr.h | 21 +++
include/migration/vmstate.h | 6 +-
include/system/iommufd.h | 7 +
include/system/kvm.h | 1 +
accel/kvm/kvm-all.c | 32 ++++
accel/stubs/kvm-stub.c | 5 +
backends/iommufd.c | 101 +++++++++++-
hw/pci/msix.c | 2 +-
hw/pci/pci.c | 5 +
hw/vfio/ap.c | 2 +-
hw/vfio/ccw.c | 2 +-
hw/vfio/container-base.c | 9 ++
hw/vfio/container.c | 97 +++++++++---
hw/vfio/cpr-iommufd.c | 220 ++++++++++++++++++++++++++
hw/vfio/cpr-legacy.c | 287 ++++++++++++++++++++++++++++++++++
hw/vfio/cpr.c | 159 +++++++++++++++++--
hw/vfio/device.c | 40 +++--
hw/vfio/helpers.c | 10 ++
hw/vfio/iommufd-stubs.c | 18 +++
hw/vfio/iommufd.c | 81 ++++++++--
hw/vfio/listener.c | 19 ++-
hw/vfio/pci.c | 231 ++++++++++++++++++++-------
hw/vfio/platform.c | 2 +-
hw/vfio/vfio-stubs.c | 13 ++
migration/cpr-transfer.c | 18 +++
migration/cpr.c | 95 +++++++++--
migration/migration.c | 1 +
migration/savevm.c | 4 +-
system/physmem.c | 5 +
backends/trace-events | 2 +
hw/vfio/meson.build | 5 +
41 files changed, 1482 insertions(+), 135 deletions(-)
create mode 100644 hw/vfio/cpr-iommufd.c
create mode 100644 hw/vfio/cpr-legacy.c
create mode 100644 hw/vfio/iommufd-stubs.c
create mode 100644 hw/vfio/vfio-stubs.c
base-commit: bc98ffdc7577e55ab8373c579c28fe24d600c40f
--
1.8.3.1