From: "Cédric Le Goater" <clg@redhat.com>
To: Steve Sistare <steven.sistare@oracle.com>, qemu-devel@nongnu.org
Cc: Alex Williamson <alex.williamson@redhat.com>,
Yi Liu <yi.l.liu@intel.com>, Eric Auger <eric.auger@redhat.com>,
Zhenzhong Duan <zhenzhong.duan@intel.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
Marcel Apfelbaum <marcel.apfelbaum@gmail.com>,
Peter Xu <peterx@redhat.com>, Fabiano Rosas <farosas@suse.de>
Subject: Re: [PATCH V3 12/42] vfio/container: recover from unmap-all-vaddr failure
Date: Tue, 20 May 2025 08:29:13 +0200
Message-ID: <af8772de-5469-4736-99cd-ec917a855aac@redhat.com>
In-Reply-To: <1747063973-124548-13-git-send-email-steven.sistare@oracle.com>
On 5/12/25 17:32, Steve Sistare wrote:
> If there are multiple containers and unmap-all fails for some container, we
> need to remap vaddr for the other containers for which unmap-all succeeded.
> Recover by walking all address ranges of all containers to restore the vaddr
> for each. Do so by invoking the vfio listener callback, and passing a new
> "remap" flag that tells it to restore a mapping without re-allocating new
> userland data structures.
>
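The recovery flow described above can be modeled outside QEMU. What follows is a minimal sketch, not the code in this patch: every name in it (ToyContainer, toy_region_add, and so on) is hypothetical, and it collapses the migration failure notifier and the temporary MemoryListener into one direct call. It shows only the shape of the idea: a per-container vaddr_unmapped flag records which containers lost their vaddrs, and the shared add path takes a remap flag so that recovery restores vaddr without allocating new bookkeeping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names are QEMU or kernel APIs. */
#define TOY_MAX_RANGES 4

typedef struct ToyRange {
    unsigned long iova;
    void *vaddr;        /* NULL models a vaddr discarded by unmap-all */
    void *saved_vaddr;  /* the address userland still has on record */
} ToyRange;

typedef struct ToyContainer {
    ToyRange ranges[TOY_MAX_RANGES];
    size_t nr_ranges;
    size_t nr_allocs;     /* counts bookkeeping allocations on the normal path */
    bool fail_unmap;      /* test knob: make unmap-all fail for this container */
    bool vaddr_unmapped;  /* plays the role of cpr.vaddr_unmapped */
} ToyContainer;

/* Drop every vaddr in the container, or fail without touching anything. */
static bool toy_unmap_vaddr_all(ToyContainer *c)
{
    if (c->fail_unmap) {
        return false;
    }
    for (size_t i = 0; i < c->nr_ranges; i++) {
        c->ranges[i].vaddr = NULL;
    }
    c->vaddr_unmapped = true;
    return true;
}

/* Shared add path; with remap set it only restores vaddr. */
static void toy_region_add(ToyContainer *c, ToyRange *r, bool remap)
{
    if (!remap) {
        c->nr_allocs++;   /* stands in for allocating tracking structures */
    }
    assert(r->saved_vaddr);
    r->vaddr = r->saved_vaddr;
}

/* Walk all ranges of every container that lost its vaddrs. */
static void toy_recover(ToyContainer *cs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!cs[i].vaddr_unmapped) {
            continue;
        }
        for (size_t j = 0; j < cs[i].nr_ranges; j++) {
            toy_region_add(&cs[i], &cs[i].ranges[j], true);
        }
        cs[i].vaddr_unmapped = false;
    }
}

/* Unmap all containers; on the first failure, undo the ones that succeeded. */
static bool toy_unmap_all_containers(ToyContainer *cs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!toy_unmap_vaddr_all(&cs[i])) {
            toy_recover(cs, n);
            return false;
        }
    }
    return true;
}
```

With two containers where the second one fails, toy_unmap_all_containers() returns false, the first container has its vaddr restored, and nr_allocs stays at zero because recovery goes through the remap path.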
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
> hw/vfio/cpr-legacy.c | 91 +++++++++++++++++++++++++++++++++++
> hw/vfio/listener.c | 19 +++++++-
> include/hw/vfio/vfio-container-base.h | 3 ++
> include/hw/vfio/vfio-cpr.h | 10 ++++
> 4 files changed, 122 insertions(+), 1 deletion(-)
>
> diff --git a/hw/vfio/cpr-legacy.c b/hw/vfio/cpr-legacy.c
> index bbcf71e..f8ddf78 100644
> --- a/hw/vfio/cpr-legacy.c
> +++ b/hw/vfio/cpr-legacy.c
> @@ -31,6 +31,7 @@ static bool vfio_dma_unmap_vaddr_all(VFIOContainer *container, Error **errp)
> error_setg_errno(errp, errno, "vfio_dma_unmap_vaddr_all");
> return false;
> }
> + container->cpr.vaddr_unmapped = true;
> return true;
> }
>
> @@ -63,6 +64,14 @@ static int vfio_legacy_cpr_dma_map(const VFIOContainerBase *bcontainer,
> return 0;
> }
>
> +static void vfio_region_remap(MemoryListener *listener,
> + MemoryRegionSection *section)
> +{
> + VFIOContainer *container = container_of(listener, VFIOContainer,
> + cpr.remap_listener);
> + vfio_container_region_add(&container->bcontainer, section, true);
> +}
> +
> static bool vfio_cpr_supported(VFIOContainer *container, Error **errp)
> {
> if (!ioctl(container->fd, VFIO_CHECK_EXTENSION, VFIO_UPDATE_VADDR)) {
> @@ -131,6 +140,40 @@ static const VMStateDescription vfio_container_vmstate = {
> }
> };
>
> +static int vfio_cpr_fail_notifier(NotifierWithReturn *notifier,
> + MigrationEvent *e, Error **errp)
> +{
> + VFIOContainer *container =
> + container_of(notifier, VFIOContainer, cpr.transfer_notifier);
> + VFIOContainerBase *bcontainer = &container->bcontainer;
> +
> + if (e->type != MIG_EVENT_PRECOPY_FAILED) {
> + return 0;
> + }
> +
> + if (container->cpr.vaddr_unmapped) {
> + /*
> + * Force a call to vfio_region_remap for each mapped section by
> + * temporarily registering a listener, and temporarily diverting
> + * dma_map to vfio_legacy_cpr_dma_map. The latter restores vaddr.
> + */
> +
> + VFIOIOMMUClass *vioc = VFIO_IOMMU_GET_CLASS(bcontainer);
> + vioc->dma_map = vfio_legacy_cpr_dma_map;
> +
> + container->cpr.remap_listener = (MemoryListener) {
> + .name = "vfio cpr recover",
> + .region_add = vfio_region_remap
> + };
> + memory_listener_register(&container->cpr.remap_listener,
> + bcontainer->space->as);
> + memory_listener_unregister(&container->cpr.remap_listener);
> + container->cpr.vaddr_unmapped = false;
> + vioc->dma_map = vfio_legacy_dma_map;
> + }
> + return 0;
> +}
> +
> bool vfio_legacy_cpr_register_container(VFIOContainer *container, Error **errp)
> {
> VFIOContainerBase *bcontainer = &container->bcontainer;
> @@ -152,6 +195,10 @@ bool vfio_legacy_cpr_register_container(VFIOContainer *container, Error **errp)
> VFIOIOMMUClass *vioc = VFIO_IOMMU_GET_CLASS(bcontainer);
> vioc->dma_map = vfio_legacy_cpr_dma_map;
> }
> +
> + migration_add_notifier_mode(&container->cpr.transfer_notifier,
> + vfio_cpr_fail_notifier,
> + MIG_MODE_CPR_TRANSFER);
> return true;
> }
>
> @@ -162,6 +209,50 @@ void vfio_legacy_cpr_unregister_container(VFIOContainer *container)
> migration_remove_notifier(&bcontainer->cpr_reboot_notifier);
> migrate_del_blocker(&container->cpr.blocker);
> vmstate_unregister(NULL, &vfio_container_vmstate, container);
> + migration_remove_notifier(&container->cpr.transfer_notifier);
> +}
> +
> +/*
> + * In old QEMU, VFIO_DMA_UNMAP_FLAG_VADDR may fail on some mapping after
> + * succeeding for others, so the latter have lost their vaddr. Call this
> + * to restore vaddr for a section with a giommu.
> + *
> + * The giommu already exists. Find it and replay it, which calls
> + * vfio_legacy_cpr_dma_map further down the stack.
> + */
> +void vfio_cpr_giommu_remap(VFIOContainerBase *bcontainer,
> + MemoryRegionSection *section)
> +{
> + VFIOGuestIOMMU *giommu = NULL;
> + hwaddr as_offset = section->offset_within_address_space;
> + hwaddr iommu_offset = as_offset - section->offset_within_region;
> +
> + QLIST_FOREACH(giommu, &bcontainer->giommu_list, giommu_next) {
> + if (giommu->iommu_mr == IOMMU_MEMORY_REGION(section->mr) &&
> + giommu->iommu_offset == iommu_offset) {
> + break;
> + }
> + }
> + g_assert(giommu);
> + memory_region_iommu_replay(giommu->iommu_mr, &giommu->n);
> +}
> +
> +/*
> + * In old QEMU, VFIO_DMA_UNMAP_FLAG_VADDR may fail on some mapping after
> + * succeeding for others, so the latter have lost their vaddr. Call this
> + * to restore vaddr for a section with a RamDiscardManager.
> + *
> + * The ram discard listener already exists. Call its populate function
> + * directly, which calls vfio_legacy_cpr_dma_map.
> + */
> +bool vfio_cpr_ram_discard_register_listener(VFIOContainerBase *bcontainer,
> + MemoryRegionSection *section)
> +{
> + VFIORamDiscardListener *vrdl =
> + vfio_find_ram_discard_listener(bcontainer, section);
> +
> + g_assert(vrdl);
> + return vrdl->listener.notify_populate(&vrdl->listener, section) == 0;
> }
>
> static bool same_device(int fd1, int fd2)
> diff --git a/hw/vfio/listener.c b/hw/vfio/listener.c
> index 5642d04..e86ffcf 100644
> --- a/hw/vfio/listener.c
> +++ b/hw/vfio/listener.c
> @@ -474,6 +474,13 @@ static void vfio_listener_region_add(MemoryListener *listener,
> {
> VFIOContainerBase *bcontainer = container_of(listener, VFIOContainerBase,
> listener);
> + vfio_container_region_add(bcontainer, section, false);
> +}
> +
> +void vfio_container_region_add(VFIOContainerBase *bcontainer,
> + MemoryRegionSection *section,
> + bool cpr_remap)
> +{
> hwaddr iova, end;
> Int128 llend, llsize;
> void *vaddr;
> @@ -509,6 +516,11 @@ static void vfio_listener_region_add(MemoryListener *listener,
> int iommu_idx;
>
> trace_vfio_listener_region_add_iommu(section->mr->name, iova, end);
> +
> + if (cpr_remap) {
> + vfio_cpr_giommu_remap(bcontainer, section);
> + }
> +
> /*
> * FIXME: For VFIO iommu types which have KVM acceleration to
> * avoid bouncing all map/unmaps through qemu this way, this
> @@ -551,7 +563,12 @@ static void vfio_listener_region_add(MemoryListener *listener,
> * about changes.
> */
> if (memory_region_has_ram_discard_manager(section->mr)) {
> - vfio_ram_discard_register_listener(bcontainer, section);
> + if (!cpr_remap) {
> + vfio_ram_discard_register_listener(bcontainer, section);
> + } else if (!vfio_cpr_ram_discard_register_listener(bcontainer,
> + section)) {
> + goto fail;
> + }
> return;
> }
>
> diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h
> index a2f6c3a..5776fd7 100644
> --- a/include/hw/vfio/vfio-container-base.h
> +++ b/include/hw/vfio/vfio-container-base.h
> @@ -189,4 +189,7 @@ VFIORamDiscardListener *vfio_find_ram_discard_listener(
> int vfio_legacy_dma_map(const VFIOContainerBase *bcontainer, hwaddr iova,
> ram_addr_t size, void *vaddr, bool readonly);
>
> +void vfio_container_region_add(VFIOContainerBase *bcontainer,
> + MemoryRegionSection *section, bool cpr_remap);
> +
> #endif /* HW_VFIO_VFIO_CONTAINER_BASE_H */
> diff --git a/include/hw/vfio/vfio-cpr.h b/include/hw/vfio/vfio-cpr.h
> index 0fc7ab2..d6d22f2 100644
> --- a/include/hw/vfio/vfio-cpr.h
> +++ b/include/hw/vfio/vfio-cpr.h
> @@ -10,10 +10,14 @@
> #define HW_VFIO_VFIO_CPR_H
>
> #include "migration/misc.h"
> +#include "system/memory.h"
>
> typedef struct VFIOContainerCPR {
> Error *blocker;
> bool reused;
> + bool vaddr_unmapped;
> + NotifierWithReturn transfer_notifier;
> + MemoryListener remap_listener;
> } VFIOContainerCPR;
>
> typedef struct VFIODeviceCPR {
> @@ -39,4 +43,10 @@ void vfio_cpr_unregister_container(struct VFIOContainerBase *bcontainer);
> bool vfio_cpr_container_match(struct VFIOContainer *container,
> struct VFIOGroup *group, int *fd);
>
> +void vfio_cpr_giommu_remap(struct VFIOContainerBase *bcontainer,
> + MemoryRegionSection *section);
> +
> +bool vfio_cpr_ram_discard_register_listener(
> + struct VFIOContainerBase *bcontainer, MemoryRegionSection *section);
> +
> #endif /* HW_VFIO_VFIO_CPR_H */
Please add this to your .gitconfig so that the diff lists files in the project's preferred order:
[diff]
    orderFile = /path/to/qemu/scripts/git.orderfile
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Thanks,
C.