From: Alex Williamson <alex.williamson@redhat.com>
To: Avihai Horon <avihaih@nvidia.com>
Cc: <qemu-devel@nongnu.org>, Halil Pasic <pasic@linux.ibm.com>,
Christian Borntraeger <borntraeger@linux.ibm.com>,
Eric Farman <farman@linux.ibm.com>,
Richard Henderson <richard.henderson@linaro.org>,
David Hildenbrand <david@redhat.com>,
"Ilya Leoshkevich" <iii@linux.ibm.com>,
Thomas Huth <thuth@redhat.com>,
"Juan Quintela" <quintela@redhat.com>,
"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
Cornelia Huck <cohuck@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Stefan Hajnoczi <stefanha@redhat.com>, Fam Zheng <fam@euphon.net>,
Eric Blake <eblake@redhat.com>,
Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>,
John Snow <jsnow@redhat.com>, <qemu-s390x@nongnu.org>,
<qemu-block@nongnu.org>, Kunkun Jiang <jiangkunkun@huawei.com>,
"Zhang, Chen" <chen.zhang@intel.com>,
Yishai Hadas <yishaih@nvidia.com>,
Jason Gunthorpe <jgg@nvidia.com>,
Maor Gottlieb <maorg@nvidia.com>, Shay Drory <shayd@nvidia.com>,
Kirti Wankhede <kwankhede@nvidia.com>,
Tarun Gupta <targupta@nvidia.com>,
Joao Martins <joao.m.martins@oracle.com>
Subject: Re: [PATCH v3 07/17] vfio/migration: Allow migration without VFIO IOMMU dirty tracking support
Date: Tue, 15 Nov 2022 16:36:37 -0700
Message-ID: <20221115163637.710c3d70.alex.williamson@redhat.com>
In-Reply-To: <20221103161620.13120-8-avihaih@nvidia.com>
On Thu, 3 Nov 2022 18:16:10 +0200
Avihai Horon <avihaih@nvidia.com> wrote:
> Currently, if the IOMMU of a VFIO container doesn't support dirty page
> tracking, migration is blocked. This is because a DMA-capable VFIO
> device can dirty RAM pages without notifying QEMU, thus breaking
> migration.
>
> However, this doesn't mean that migration can't be done at all.
> In such a case, allow migration and let the QEMU VFIO code mark the
> entire bitmap dirty.
>
> This guarantees that all pages that might have gotten dirty are reported
> back, and thus guarantees a valid migration even without VFIO IOMMU
> dirty tracking support.
>
> The motivation for this patch is the future introduction of iommufd [1].
> iommufd will directly implement the /dev/vfio/vfio container IOCTLs by
> mapping them into its internal ops, allowing the usage of these IOCTLs
> over iommufd. However, VFIO IOMMU dirty tracking will not be supported
> by this VFIO compatibility API.
>
> This patch will allow migration on hosts that use the VFIO
> compatibility API and prevent migration regressions caused by the lack
> of VFIO IOMMU dirty tracking support.
>
> [1] https://lore.kernel.org/kvm/0-v2-f9436d0bde78+4bb-iommufd_jgg@nvidia.com/
>
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> ---
> hw/vfio/common.c | 84 +++++++++++++++++++++++++++++++++++++--------
> hw/vfio/migration.c | 3 +-
> 2 files changed, 70 insertions(+), 17 deletions(-)
This duplicates quite a bit of code; I think we can integrate this into
a common flow much more. See below, only compile tested. Thanks,

Alex
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 6b5d8c0bf694..4117b40fd9b0 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -397,17 +397,33 @@ static int vfio_dma_unmap_bitmap(VFIOContainer *container,
IOMMUTLBEntry *iotlb)
{
struct vfio_iommu_type1_dma_unmap *unmap;
- struct vfio_bitmap *bitmap;
+ struct vfio_bitmap *vbitmap;
+ unsigned long *bitmap;
+ uint64_t bitmap_size;
uint64_t pages = REAL_HOST_PAGE_ALIGN(size) / qemu_real_host_page_size();
int ret;
- unmap = g_malloc0(sizeof(*unmap) + sizeof(*bitmap));
+ unmap = g_malloc0(sizeof(*unmap) + sizeof(*vbitmap));
- unmap->argsz = sizeof(*unmap) + sizeof(*bitmap);
+ unmap->argsz = sizeof(*unmap);
unmap->iova = iova;
unmap->size = size;
- unmap->flags |= VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP;
- bitmap = (struct vfio_bitmap *)&unmap->data;
+
+ bitmap_size = ROUND_UP(pages, sizeof(__u64) * BITS_PER_BYTE) /
+ BITS_PER_BYTE;
+ bitmap = g_try_malloc0(bitmap_size);
+ if (!bitmap) {
+ ret = -ENOMEM;
+ goto unmap_exit;
+ }
+
+ if (!container->dirty_pages_supported) {
+ bitmap_set(bitmap, 0, pages);
+ goto do_unmap;
+ }
+
+ unmap->argsz += sizeof(*vbitmap);
+ unmap->flags = VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP;
/*
* cpu_physical_memory_set_dirty_lebitmap() supports pages in bitmap of
@@ -415,33 +431,28 @@ static int vfio_dma_unmap_bitmap(VFIOContainer *container,
* to qemu_real_host_page_size.
*/
- bitmap->pgsize = qemu_real_host_page_size();
- bitmap->size = ROUND_UP(pages, sizeof(__u64) * BITS_PER_BYTE) /
- BITS_PER_BYTE;
+ vbitmap = (struct vfio_bitmap *)&unmap->data;
+ vbitmap->data = (__u64 *)bitmap;
+ vbitmap->pgsize = qemu_real_host_page_size();
+ vbitmap->size = bitmap_size;
- if (bitmap->size > container->max_dirty_bitmap_size) {
- error_report("UNMAP: Size of bitmap too big 0x%"PRIx64,
- (uint64_t)bitmap->size);
+ if (bitmap_size > container->max_dirty_bitmap_size) {
+ error_report("UNMAP: Size of bitmap too big 0x%"PRIx64, bitmap_size);
ret = -E2BIG;
goto unmap_exit;
}
- bitmap->data = g_try_malloc0(bitmap->size);
- if (!bitmap->data) {
- ret = -ENOMEM;
- goto unmap_exit;
- }
-
+do_unmap:
ret = ioctl(container->fd, VFIO_IOMMU_UNMAP_DMA, unmap);
if (!ret) {
- cpu_physical_memory_set_dirty_lebitmap((unsigned long *)bitmap->data,
- iotlb->translated_addr, pages);
+ cpu_physical_memory_set_dirty_lebitmap(bitmap, iotlb->translated_addr,
+ pages);
} else {
error_report("VFIO_UNMAP_DMA with DIRTY_BITMAP : %m");
}
- g_free(bitmap->data);
unmap_exit:
+ g_free(bitmap);
g_free(unmap);
return ret;
}
@@ -460,8 +471,7 @@ static int vfio_dma_unmap(VFIOContainer *container,
.size = size,
};
- if (iotlb && container->dirty_pages_supported &&
- vfio_devices_all_running_and_saving(container)) {
+ if (iotlb && vfio_devices_all_running_and_saving(container)) {
return vfio_dma_unmap_bitmap(container, iova, size, iotlb);
}
@@ -1257,6 +1267,10 @@ static void vfio_set_dirty_page_tracking(VFIOContainer *container, bool start)
.argsz = sizeof(dirty),
};
+ if (!container->dirty_pages_supported) {
+ return;
+ }
+
if (start) {
dirty.flags = VFIO_IOMMU_DIRTY_PAGES_FLAG_START;
} else {
@@ -1287,11 +1301,26 @@ static void vfio_listener_log_global_stop(MemoryListener *listener)
static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
uint64_t size, ram_addr_t ram_addr)
{
- struct vfio_iommu_type1_dirty_bitmap *dbitmap;
+ struct vfio_iommu_type1_dirty_bitmap *dbitmap = NULL;
struct vfio_iommu_type1_dirty_bitmap_get *range;
+ unsigned long *bitmap;
+ uint64_t bitmap_size;
uint64_t pages;
int ret;
+ pages = REAL_HOST_PAGE_ALIGN(size) / qemu_real_host_page_size();
+ bitmap_size = ROUND_UP(pages, sizeof(__u64) * BITS_PER_BYTE) /
+ BITS_PER_BYTE;
+ bitmap = g_try_malloc0(bitmap_size);
+ if (!bitmap) {
+ return -ENOMEM;
+ }
+
+ if (!container->dirty_pages_supported) {
+ bitmap_set(bitmap, 0, pages);
+ goto set_dirty;
+ }
+
dbitmap = g_malloc0(sizeof(*dbitmap) + sizeof(*range));
dbitmap->argsz = sizeof(*dbitmap) + sizeof(*range);
@@ -1306,15 +1335,8 @@ static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
* to qemu_real_host_page_size.
*/
range->bitmap.pgsize = qemu_real_host_page_size();
-
- pages = REAL_HOST_PAGE_ALIGN(range->size) / qemu_real_host_page_size();
- range->bitmap.size = ROUND_UP(pages, sizeof(__u64) * BITS_PER_BYTE) /
- BITS_PER_BYTE;
- range->bitmap.data = g_try_malloc0(range->bitmap.size);
- if (!range->bitmap.data) {
- ret = -ENOMEM;
- goto err_out;
- }
+ range->bitmap.size = bitmap_size;
+ range->bitmap.data = (__u64 *)bitmap;
ret = ioctl(container->fd, VFIO_IOMMU_DIRTY_PAGES, dbitmap);
if (ret) {
@@ -1324,13 +1346,13 @@ static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
goto err_out;
}
- cpu_physical_memory_set_dirty_lebitmap((unsigned long *)range->bitmap.data,
- ram_addr, pages);
+set_dirty:
+ cpu_physical_memory_set_dirty_lebitmap(bitmap, ram_addr, pages);
- trace_vfio_get_dirty_bitmap(container->fd, range->iova, range->size,
- range->bitmap.size, ram_addr);
+ trace_vfio_get_dirty_bitmap(container->fd, iova, size,
+ bitmap_size, ram_addr);
err_out:
- g_free(range->bitmap.data);
+ g_free(bitmap);
g_free(dbitmap);
return ret;
@@ -1465,8 +1487,7 @@ static void vfio_listener_log_sync(MemoryListener *listener,
{
VFIOContainer *container = container_of(listener, VFIOContainer, listener);
- if (vfio_listener_skipped_section(section) ||
- !container->dirty_pages_supported) {
+ if (vfio_listener_skipped_section(section)) {
return;
}
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index f5e72c7ac198..99ffb7578290 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -857,11 +857,10 @@ int64_t vfio_mig_bytes_transferred(void)
int vfio_migration_probe(VFIODevice *vbasedev, Error **errp)
{
- VFIOContainer *container = vbasedev->group->container;
struct vfio_region_info *info = NULL;
int ret = -ENOTSUP;
- if (!vbasedev->enable_migration || !container->dirty_pages_supported) {
+ if (!vbasedev->enable_migration) {
goto add_blocker;
}