* [PATCH v2 0/8] vfio: relax the vIOMMU check
@ 2025-10-17 8:22 Zhenzhong Duan
2025-10-17 8:22 ` [PATCH v2 1/8] vfio/iommufd: Add framework code to support getting dirty bitmap before unmap Zhenzhong Duan
` (7 more replies)
0 siblings, 8 replies; 34+ messages in thread
From: Zhenzhong Duan @ 2025-10-17 8:22 UTC (permalink / raw)
To: qemu-devel
Cc: alex.williamson, clg, mst, jasowang, yi.l.liu,
clement.mathieu--drif, eric.auger, joao.m.martins, avihaih,
xudong.hao, giovanni.cabiddu, mark.gross, arjan.van.de.ven,
Zhenzhong Duan
Hi,
This series relaxes the vIOMMU check and allows live migration with vIOMMU
without VFs using device dirty tracking. It is rewritten based on the first 4
patches of [1] from Joao.
Currently, what blocks us is the lack of a dirty bitmap query with iommufd
before unmap. By adding that query and handling some corner cases, we can
relax the check.
Based on vfio-next branch:
patch1-2: add dirty bitmap query with iommufd
patch3: a renaming cleanup
patch4-5: unmap_bitmap optimization
patch6: fix large iommu notification triggered unmap_bitmap failure
patch7: add a blocker if VM memory is too large for unmap_bitmap to succeed
patch8: relax vIOMMU check
We tested VM live migration (running a QAT workload in the VM) with QAT
device passthrough, in the following matrix of configs:
1.Scalable mode vIOMMU + IOMMUFD cdev mode
2.Scalable mode vIOMMU + legacy VFIO mode
3.legacy mode vIOMMU + IOMMUFD cdev mode
4.legacy mode vIOMMU + legacy VFIO mode
[1] https://github.com/jpemartins/qemu/commits/vfio-migration-viommu/
Thanks
Zhenzhong
Changelog:
v2:
- add backend_flag parameter to pass DIRTY_BITMAP_NO_CLEAR (Joao, Cedric)
- add a cleanup patch to rename vfio_dma_unmap_bitmap (Cedric)
- add blocker if unmap_bitmap limit check fails (Liuyi)
Joao Martins (1):
vfio: Add a backend_flag parameter to
vfio_container_query_dirty_bitmap()
Zhenzhong Duan (7):
vfio/iommufd: Add framework code to support getting dirty bitmap
before unmap
vfio/iommufd: Query dirty bitmap before DMA unmap
vfio/container-legacy: rename vfio_dma_unmap_bitmap() to
vfio_legacy_dma_unmap_get_dirty_bitmap()
vfio/iommufd: Add IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag support
intel_iommu: Fix unmap_bitmap failure with legacy VFIO backend
vfio/migration: Add migration blocker if VM memory is too large to
cause unmap_bitmap failure
vfio/migration: Allow live migration with vIOMMU without VFs using
device dirty tracking
include/hw/vfio/vfio-container.h | 8 ++++--
include/hw/vfio/vfio-device.h | 10 +++++++
include/system/iommufd.h | 2 +-
backends/iommufd.c | 5 ++--
hw/i386/intel_iommu.c | 42 +++++++++++++++++++++++++++++
hw/vfio-user/container.c | 5 ++--
hw/vfio/container-legacy.c | 15 ++++++-----
hw/vfio/container.c | 20 +++++++-------
hw/vfio/device.c | 6 +++++
hw/vfio/iommufd.c | 46 ++++++++++++++++++++++++++++----
hw/vfio/listener.c | 6 ++---
hw/vfio/migration.c | 43 ++++++++++++++++++++++++++---
backends/trace-events | 2 +-
hw/vfio/trace-events | 2 +-
14 files changed, 176 insertions(+), 36 deletions(-)
--
2.47.1
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH v2 1/8] vfio/iommufd: Add framework code to support getting dirty bitmap before unmap
2025-10-17 8:22 [PATCH v2 0/8] vfio: relax the vIOMMU check Zhenzhong Duan
@ 2025-10-17 8:22 ` Zhenzhong Duan
2025-10-20 7:00 ` Cédric Le Goater
2025-10-20 12:44 ` Yi Liu
2025-10-17 8:22 ` [PATCH v2 2/8] vfio/iommufd: Query dirty bitmap before DMA unmap Zhenzhong Duan
` (6 subsequent siblings)
7 siblings, 2 replies; 34+ messages in thread
From: Zhenzhong Duan @ 2025-10-17 8:22 UTC (permalink / raw)
To: qemu-devel
Cc: alex.williamson, clg, mst, jasowang, yi.l.liu,
clement.mathieu--drif, eric.auger, joao.m.martins, avihaih,
xudong.hao, giovanni.cabiddu, mark.gross, arjan.van.de.ven,
Zhenzhong Duan
Currently we support both device and iommu dirty tracking; device dirty
tracking is preferred.
Add the framework code in iommufd_cdev_unmap() to choose either device or
iommu dirty tracking, just like vfio_legacy_dma_unmap_one().
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Tested-by: Xudong Hao <xudong.hao@intel.com>
Tested-by: Giovannio Cabiddu <giovanni.cabiddu@intel.com>
---
hw/vfio/iommufd.c | 34 +++++++++++++++++++++++++++++++---
1 file changed, 31 insertions(+), 3 deletions(-)
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index fc9cd9d22f..976c0a8814 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -61,14 +61,42 @@ static int iommufd_cdev_unmap(const VFIOContainer *bcontainer,
IOMMUTLBEntry *iotlb, bool unmap_all)
{
const VFIOIOMMUFDContainer *container = VFIO_IOMMU_IOMMUFD(bcontainer);
+ IOMMUFDBackend *be = container->be;
+ uint32_t ioas_id = container->ioas_id;
+ bool need_dirty_sync = false;
+ Error *local_err = NULL;
+ int ret;
if (unmap_all) {
size = UINT64_MAX;
}
- /* TODO: Handle dma_unmap_bitmap with iotlb args (migration) */
- return iommufd_backend_unmap_dma(container->be,
- container->ioas_id, iova, size);
+ if (iotlb && vfio_container_dirty_tracking_is_started(bcontainer)) {
+ if (!vfio_container_devices_dirty_tracking_is_supported(bcontainer) &&
+ bcontainer->dirty_pages_supported) {
+ /* TODO: query dirty bitmap before DMA unmap */
+ return iommufd_backend_unmap_dma(be, ioas_id, iova, size);
+ }
+
+ need_dirty_sync = true;
+ }
+
+ ret = iommufd_backend_unmap_dma(be, ioas_id, iova, size);
+ if (ret) {
+ return ret;
+ }
+
+ if (need_dirty_sync) {
+ ret = vfio_container_query_dirty_bitmap(bcontainer, iova, size,
+ iotlb->translated_addr,
+ &local_err);
+ if (ret) {
+ error_report_err(local_err);
+ return ret;
+ }
+ }
+
+ return 0;
}
static bool iommufd_cdev_kvm_device_add(VFIODevice *vbasedev, Error **errp)
--
2.47.1
* [PATCH v2 2/8] vfio/iommufd: Query dirty bitmap before DMA unmap
2025-10-17 8:22 [PATCH v2 0/8] vfio: relax the vIOMMU check Zhenzhong Duan
2025-10-17 8:22 ` [PATCH v2 1/8] vfio/iommufd: Add framework code to support getting dirty bitmap before unmap Zhenzhong Duan
@ 2025-10-17 8:22 ` Zhenzhong Duan
2025-10-20 7:00 ` Cédric Le Goater
` (2 more replies)
2025-10-17 8:22 ` [PATCH v2 3/8] vfio/container-legacy: rename vfio_dma_unmap_bitmap() to vfio_legacy_dma_unmap_get_dirty_bitmap() Zhenzhong Duan
` (5 subsequent siblings)
7 siblings, 3 replies; 34+ messages in thread
From: Zhenzhong Duan @ 2025-10-17 8:22 UTC (permalink / raw)
To: qemu-devel
Cc: alex.williamson, clg, mst, jasowang, yi.l.liu,
clement.mathieu--drif, eric.auger, joao.m.martins, avihaih,
xudong.hao, giovanni.cabiddu, mark.gross, arjan.van.de.ven,
Zhenzhong Duan
When an existing mapping is unmapped, there could already be dirty bits
which need to be recorded before the unmap.
If the dirty bitmap query fails, we still need to do the unmap, or else a
stale mapping remains, which is risky to the guest.
Co-developed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Tested-by: Xudong Hao <xudong.hao@intel.com>
Tested-by: Giovannio Cabiddu <giovanni.cabiddu@intel.com>
---
hw/vfio/iommufd.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index 976c0a8814..404e6249ca 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -74,7 +74,13 @@ static int iommufd_cdev_unmap(const VFIOContainer *bcontainer,
if (iotlb && vfio_container_dirty_tracking_is_started(bcontainer)) {
if (!vfio_container_devices_dirty_tracking_is_supported(bcontainer) &&
bcontainer->dirty_pages_supported) {
- /* TODO: query dirty bitmap before DMA unmap */
+ ret = vfio_container_query_dirty_bitmap(bcontainer, iova, size,
+ iotlb->translated_addr,
+ &local_err);
+ if (ret) {
+ error_report_err(local_err);
+ }
+ /* Unmap stale mapping even if query dirty bitmap fails */
return iommufd_backend_unmap_dma(be, ioas_id, iova, size);
}
--
2.47.1
* [PATCH v2 3/8] vfio/container-legacy: rename vfio_dma_unmap_bitmap() to vfio_legacy_dma_unmap_get_dirty_bitmap()
2025-10-17 8:22 [PATCH v2 0/8] vfio: relax the vIOMMU check Zhenzhong Duan
2025-10-17 8:22 ` [PATCH v2 1/8] vfio/iommufd: Add framework code to support getting dirty bitmap before unmap Zhenzhong Duan
2025-10-17 8:22 ` [PATCH v2 2/8] vfio/iommufd: Query dirty bitmap before DMA unmap Zhenzhong Duan
@ 2025-10-17 8:22 ` Zhenzhong Duan
2025-10-20 7:00 ` Cédric Le Goater
2025-10-20 12:45 ` Yi Liu
2025-10-17 8:22 ` [PATCH v2 4/8] vfio: Add a backend_flag parameter to vfio_container_query_dirty_bitmap() Zhenzhong Duan
` (4 subsequent siblings)
7 siblings, 2 replies; 34+ messages in thread
From: Zhenzhong Duan @ 2025-10-17 8:22 UTC (permalink / raw)
To: qemu-devel
Cc: alex.williamson, clg, mst, jasowang, yi.l.liu,
clement.mathieu--drif, eric.auger, joao.m.martins, avihaih,
xudong.hao, giovanni.cabiddu, mark.gross, arjan.van.de.ven,
Zhenzhong Duan
This follows the naming style in container-legacy.c, where low-level
functions have a vfio_legacy_ prefix.
No functional changes.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
hw/vfio/container-legacy.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/hw/vfio/container-legacy.c b/hw/vfio/container-legacy.c
index 8e9639603e..b7e3b892b9 100644
--- a/hw/vfio/container-legacy.c
+++ b/hw/vfio/container-legacy.c
@@ -68,9 +68,10 @@ static int vfio_ram_block_discard_disable(VFIOLegacyContainer *container,
}
}
-static int vfio_dma_unmap_bitmap(const VFIOLegacyContainer *container,
- hwaddr iova, uint64_t size,
- IOMMUTLBEntry *iotlb)
+static int
+vfio_legacy_dma_unmap_get_dirty_bitmap(const VFIOLegacyContainer *container,
+ hwaddr iova, uint64_t size,
+ IOMMUTLBEntry *iotlb)
{
const VFIOContainer *bcontainer = VFIO_IOMMU(container);
struct vfio_iommu_type1_dma_unmap *unmap;
@@ -141,7 +142,8 @@ static int vfio_legacy_dma_unmap_one(const VFIOLegacyContainer *container,
if (iotlb && vfio_container_dirty_tracking_is_started(bcontainer)) {
if (!vfio_container_devices_dirty_tracking_is_supported(bcontainer) &&
bcontainer->dirty_pages_supported) {
- return vfio_dma_unmap_bitmap(container, iova, size, iotlb);
+ return vfio_legacy_dma_unmap_get_dirty_bitmap(container, iova, size,
+ iotlb);
}
need_dirty_sync = true;
--
2.47.1
* [PATCH v2 4/8] vfio: Add a backend_flag parameter to vfio_container_query_dirty_bitmap()
2025-10-17 8:22 [PATCH v2 0/8] vfio: relax the vIOMMU check Zhenzhong Duan
` (2 preceding siblings ...)
2025-10-17 8:22 ` [PATCH v2 3/8] vfio/container-legacy: rename vfio_dma_unmap_bitmap() to vfio_legacy_dma_unmap_get_dirty_bitmap() Zhenzhong Duan
@ 2025-10-17 8:22 ` Zhenzhong Duan
2025-10-20 7:01 ` Cédric Le Goater
2025-10-20 12:44 ` Yi Liu
2025-10-17 8:22 ` [PATCH v2 5/8] vfio/iommufd: Add IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag support Zhenzhong Duan
` (3 subsequent siblings)
7 siblings, 2 replies; 34+ messages in thread
From: Zhenzhong Duan @ 2025-10-17 8:22 UTC (permalink / raw)
To: qemu-devel
Cc: alex.williamson, clg, mst, jasowang, yi.l.liu,
clement.mathieu--drif, eric.auger, joao.m.martins, avihaih,
xudong.hao, giovanni.cabiddu, mark.gross, arjan.van.de.ven,
Zhenzhong Duan
From: Joao Martins <joao.m.martins@oracle.com>
This new parameter will be used in a following patch; currently 0 is passed.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Tested-by: Giovannio Cabiddu <giovanni.cabiddu@intel.com>
---
include/hw/vfio/vfio-container.h | 8 ++++++--
hw/vfio-user/container.c | 5 +++--
hw/vfio/container-legacy.c | 5 +++--
hw/vfio/container.c | 15 +++++++++------
hw/vfio/iommufd.c | 7 ++++---
hw/vfio/listener.c | 6 +++---
hw/vfio/trace-events | 2 +-
7 files changed, 29 insertions(+), 19 deletions(-)
diff --git a/include/hw/vfio/vfio-container.h b/include/hw/vfio/vfio-container.h
index c4b58d664b..9f6e8cedfc 100644
--- a/include/hw/vfio/vfio-container.h
+++ b/include/hw/vfio/vfio-container.h
@@ -99,7 +99,9 @@ bool vfio_container_devices_dirty_tracking_is_supported(
const VFIOContainer *bcontainer);
int vfio_container_query_dirty_bitmap(const VFIOContainer *bcontainer,
uint64_t iova, uint64_t size,
- hwaddr translated_addr, Error **errp);
+ uint64_t backend_flag,
+ hwaddr translated_addr,
+ Error **errp);
GList *vfio_container_get_iova_ranges(const VFIOContainer *bcontainer);
@@ -253,12 +255,14 @@ struct VFIOIOMMUClass {
* @vbmap: #VFIOBitmap internal bitmap structure
* @iova: iova base address
* @size: size of iova range
+ * @backend_flag: flags for backend, opaque to upper layer container
* @errp: pointer to Error*, to store an error if it happens.
*
* Returns zero to indicate success and negative for error.
*/
int (*query_dirty_bitmap)(const VFIOContainer *bcontainer,
- VFIOBitmap *vbmap, hwaddr iova, hwaddr size, Error **errp);
+ VFIOBitmap *vbmap, hwaddr iova, hwaddr size,
+ uint64_t backend_flag, Error **errp);
/* PCI specific */
int (*pci_hot_reset)(VFIODevice *vbasedev, bool single);
diff --git a/hw/vfio-user/container.c b/hw/vfio-user/container.c
index e45192fef6..3ce6ea12db 100644
--- a/hw/vfio-user/container.c
+++ b/hw/vfio-user/container.c
@@ -162,8 +162,9 @@ vfio_user_set_dirty_page_tracking(const VFIOContainer *bcontainer,
}
static int vfio_user_query_dirty_bitmap(const VFIOContainer *bcontainer,
- VFIOBitmap *vbmap, hwaddr iova,
- hwaddr size, Error **errp)
+ VFIOBitmap *vbmap, hwaddr iova,
+ hwaddr size, uint64_t backend_flag,
+ Error **errp)
{
error_setg_errno(errp, ENOTSUP, "Not supported");
return -ENOTSUP;
diff --git a/hw/vfio/container-legacy.c b/hw/vfio/container-legacy.c
index b7e3b892b9..dd9c4a6a5a 100644
--- a/hw/vfio/container-legacy.c
+++ b/hw/vfio/container-legacy.c
@@ -154,7 +154,7 @@ static int vfio_legacy_dma_unmap_one(const VFIOLegacyContainer *container,
}
if (need_dirty_sync) {
- ret = vfio_container_query_dirty_bitmap(bcontainer, iova, size,
+ ret = vfio_container_query_dirty_bitmap(bcontainer, iova, size, 0,
iotlb->translated_addr, &local_err);
if (ret) {
error_report_err(local_err);
@@ -255,7 +255,8 @@ vfio_legacy_set_dirty_page_tracking(const VFIOContainer *bcontainer,
}
static int vfio_legacy_query_dirty_bitmap(const VFIOContainer *bcontainer,
- VFIOBitmap *vbmap, hwaddr iova, hwaddr size, Error **errp)
+ VFIOBitmap *vbmap, hwaddr iova, hwaddr size,
+ uint64_t backend_flag, Error **errp)
{
const VFIOLegacyContainer *container = VFIO_IOMMU_LEGACY(bcontainer);
struct vfio_iommu_type1_dirty_bitmap *dbitmap;
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index 9ddec300e3..7706603c1c 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -213,13 +213,13 @@ static int vfio_device_dma_logging_report(VFIODevice *vbasedev, hwaddr iova,
static int vfio_container_iommu_query_dirty_bitmap(
const VFIOContainer *bcontainer, VFIOBitmap *vbmap, hwaddr iova,
- hwaddr size, Error **errp)
+ hwaddr size, uint64_t backend_flag, Error **errp)
{
VFIOIOMMUClass *vioc = VFIO_IOMMU_GET_CLASS(bcontainer);
g_assert(vioc->query_dirty_bitmap);
return vioc->query_dirty_bitmap(bcontainer, vbmap, iova, size,
- errp);
+ backend_flag, errp);
}
static int vfio_container_devices_query_dirty_bitmap(
@@ -247,7 +247,9 @@ static int vfio_container_devices_query_dirty_bitmap(
int vfio_container_query_dirty_bitmap(const VFIOContainer *bcontainer,
uint64_t iova, uint64_t size,
- hwaddr translated_addr, Error **errp)
+ uint64_t backend_flag,
+ hwaddr translated_addr,
+ Error **errp)
{
bool all_device_dirty_tracking =
vfio_container_devices_dirty_tracking_is_supported(bcontainer);
@@ -274,7 +276,7 @@ int vfio_container_query_dirty_bitmap(const VFIOContainer *bcontainer,
errp);
} else {
ret = vfio_container_iommu_query_dirty_bitmap(bcontainer, &vbmap, iova, size,
- errp);
+ backend_flag, errp);
}
if (ret) {
@@ -285,8 +287,9 @@ int vfio_container_query_dirty_bitmap(const VFIOContainer *bcontainer,
translated_addr,
vbmap.pages);
- trace_vfio_container_query_dirty_bitmap(iova, size, vbmap.size,
- translated_addr, dirty_pages);
+ trace_vfio_container_query_dirty_bitmap(iova, size, backend_flag,
+ vbmap.size, translated_addr,
+ dirty_pages);
out:
g_free(vbmap.bitmap);
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index 404e6249ca..6457cef344 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -74,7 +74,7 @@ static int iommufd_cdev_unmap(const VFIOContainer *bcontainer,
if (iotlb && vfio_container_dirty_tracking_is_started(bcontainer)) {
if (!vfio_container_devices_dirty_tracking_is_supported(bcontainer) &&
bcontainer->dirty_pages_supported) {
- ret = vfio_container_query_dirty_bitmap(bcontainer, iova, size,
+ ret = vfio_container_query_dirty_bitmap(bcontainer, iova, size, 0,
iotlb->translated_addr,
&local_err);
if (ret) {
@@ -93,7 +93,7 @@ static int iommufd_cdev_unmap(const VFIOContainer *bcontainer,
}
if (need_dirty_sync) {
- ret = vfio_container_query_dirty_bitmap(bcontainer, iova, size,
+ ret = vfio_container_query_dirty_bitmap(bcontainer, iova, size, 0,
iotlb->translated_addr,
&local_err);
if (ret) {
@@ -209,7 +209,8 @@ err:
static int iommufd_query_dirty_bitmap(const VFIOContainer *bcontainer,
VFIOBitmap *vbmap, hwaddr iova,
- hwaddr size, Error **errp)
+ hwaddr size, uint64_t backend_flag,
+ Error **errp)
{
VFIOIOMMUFDContainer *container = VFIO_IOMMU_IOMMUFD(bcontainer);
unsigned long page_size = qemu_real_host_page_size();
diff --git a/hw/vfio/listener.c b/hw/vfio/listener.c
index 2d7d3a4645..2109101158 100644
--- a/hw/vfio/listener.c
+++ b/hw/vfio/listener.c
@@ -1083,7 +1083,7 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
translated_addr = memory_region_get_ram_addr(mr) + xlat;
ret = vfio_container_query_dirty_bitmap(bcontainer, iova, iotlb->addr_mask + 1,
- translated_addr, &local_err);
+ 0, translated_addr, &local_err);
if (ret) {
error_prepend(&local_err,
"vfio_iommu_map_dirty_notify(%p, 0x%"HWADDR_PRIx", "
@@ -1119,7 +1119,7 @@ static int vfio_ram_discard_query_dirty_bitmap(MemoryRegionSection *section,
* Sync the whole mapped region (spanning multiple individual mappings)
* in one go.
*/
- ret = vfio_container_query_dirty_bitmap(vrdl->bcontainer, iova, size,
+ ret = vfio_container_query_dirty_bitmap(vrdl->bcontainer, iova, size, 0,
translated_addr, &local_err);
if (ret) {
error_report_err(local_err);
@@ -1204,7 +1204,7 @@ static int vfio_sync_dirty_bitmap(VFIOContainer *bcontainer,
return vfio_container_query_dirty_bitmap(bcontainer,
REAL_HOST_PAGE_ALIGN(section->offset_within_address_space),
- int128_get64(section->size), translated_addr, errp);
+ int128_get64(section->size), 0, translated_addr, errp);
}
static void vfio_listener_log_sync(MemoryListener *listener,
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index 1e895448cd..3c62bab764 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -105,7 +105,7 @@ vfio_device_dirty_tracking_start(int nr_ranges, uint64_t min32, uint64_t max32,
vfio_iommu_map_dirty_notify(uint64_t iova_start, uint64_t iova_end) "iommu dirty @ 0x%"PRIx64" - 0x%"PRIx64
# container.c
-vfio_container_query_dirty_bitmap(uint64_t iova, uint64_t size, uint64_t bitmap_size, uint64_t translated_addr, uint64_t dirty_pages) "iova=0x%"PRIx64" size= 0x%"PRIx64" bitmap_size=0x%"PRIx64" gpa=0x%"PRIx64" dirty_pages=%"PRIu64
+vfio_container_query_dirty_bitmap(uint64_t iova, uint64_t size, uint64_t backend_flag, uint64_t bitmap_size, uint64_t translated_addr, uint64_t dirty_pages) "iova=0x%"PRIx64" size=0x%"PRIx64" backend_flag=0x%"PRIx64" bitmap_size=0x%"PRIx64" gpa=0x%"PRIx64" dirty_pages=%"PRIu64
# container-legacy.c
vfio_container_disconnect(int fd) "close container->fd=%d"
--
2.47.1
* [PATCH v2 5/8] vfio/iommufd: Add IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag support
2025-10-17 8:22 [PATCH v2 0/8] vfio: relax the vIOMMU check Zhenzhong Duan
` (3 preceding siblings ...)
2025-10-17 8:22 ` [PATCH v2 4/8] vfio: Add a backend_flag parameter to vfio_container_query_dirty_bitmap() Zhenzhong Duan
@ 2025-10-17 8:22 ` Zhenzhong Duan
2025-10-20 7:01 ` Cédric Le Goater
2025-10-20 12:45 ` Yi Liu
2025-10-17 8:22 ` [PATCH v2 6/8] intel_iommu: Fix unmap_bitmap failure with legacy VFIO backend Zhenzhong Duan
` (2 subsequent siblings)
7 siblings, 2 replies; 34+ messages in thread
From: Zhenzhong Duan @ 2025-10-17 8:22 UTC (permalink / raw)
To: qemu-devel
Cc: alex.williamson, clg, mst, jasowang, yi.l.liu,
clement.mathieu--drif, eric.auger, joao.m.martins, avihaih,
xudong.hao, giovanni.cabiddu, mark.gross, arjan.van.de.ven,
Zhenzhong Duan
Pass IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR when doing the last dirty
bitmap query right before unmap, so no PTE flushes are performed. This
accelerates the query and is safe because the unmap will tear down the
mapping anyway.
Co-developed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Tested-by: Xudong Hao <xudong.hao@intel.com>
Tested-by: Giovannio Cabiddu <giovanni.cabiddu@intel.com>
---
include/system/iommufd.h | 2 +-
backends/iommufd.c | 5 +++--
hw/vfio/iommufd.c | 5 +++--
backends/trace-events | 2 +-
4 files changed, 8 insertions(+), 6 deletions(-)
diff --git a/include/system/iommufd.h b/include/system/iommufd.h
index a659f36a20..767a8e4cb6 100644
--- a/include/system/iommufd.h
+++ b/include/system/iommufd.h
@@ -64,7 +64,7 @@ bool iommufd_backend_set_dirty_tracking(IOMMUFDBackend *be, uint32_t hwpt_id,
bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be, uint32_t hwpt_id,
uint64_t iova, ram_addr_t size,
uint64_t page_size, uint64_t *data,
- Error **errp);
+ uint64_t flags, Error **errp);
bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t id,
uint32_t data_type, uint32_t entry_len,
uint32_t *entry_num, void *data,
diff --git a/backends/iommufd.c b/backends/iommufd.c
index fdfb7c9d67..086bd67aea 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -361,7 +361,7 @@ bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be,
uint32_t hwpt_id,
uint64_t iova, ram_addr_t size,
uint64_t page_size, uint64_t *data,
- Error **errp)
+ uint64_t flags, Error **errp)
{
int ret;
struct iommu_hwpt_get_dirty_bitmap get_dirty_bitmap = {
@@ -371,11 +371,12 @@ bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be,
.length = size,
.page_size = page_size,
.data = (uintptr_t)data,
+ .flags = flags,
};
ret = ioctl(be->fd, IOMMU_HWPT_GET_DIRTY_BITMAP, &get_dirty_bitmap);
trace_iommufd_backend_get_dirty_bitmap(be->fd, hwpt_id, iova, size,
- page_size, ret ? errno : 0);
+ flags, page_size, ret ? errno : 0);
if (ret) {
error_setg_errno(errp, errno,
"IOMMU_HWPT_GET_DIRTY_BITMAP (iova: 0x%"HWADDR_PRIx
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index 6457cef344..937b80340c 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -74,7 +74,8 @@ static int iommufd_cdev_unmap(const VFIOContainer *bcontainer,
if (iotlb && vfio_container_dirty_tracking_is_started(bcontainer)) {
if (!vfio_container_devices_dirty_tracking_is_supported(bcontainer) &&
bcontainer->dirty_pages_supported) {
- ret = vfio_container_query_dirty_bitmap(bcontainer, iova, size, 0,
+ ret = vfio_container_query_dirty_bitmap(bcontainer, iova, size,
+ IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR,
iotlb->translated_addr,
&local_err);
if (ret) {
@@ -224,7 +225,7 @@ static int iommufd_query_dirty_bitmap(const VFIOContainer *bcontainer,
if (!iommufd_backend_get_dirty_bitmap(container->be, hwpt->hwpt_id,
iova, size, page_size,
(uint64_t *)vbmap->bitmap,
- errp)) {
+ backend_flag, errp)) {
return -EINVAL;
}
}
diff --git a/backends/trace-events b/backends/trace-events
index 56132d3fd2..e1992ba12f 100644
--- a/backends/trace-events
+++ b/backends/trace-events
@@ -19,5 +19,5 @@ iommufd_backend_alloc_ioas(int iommufd, uint32_t ioas) " iommufd=%d ioas=%d"
iommufd_backend_alloc_hwpt(int iommufd, uint32_t dev_id, uint32_t pt_id, uint32_t flags, uint32_t hwpt_type, uint32_t len, uint64_t data_ptr, uint32_t out_hwpt_id, int ret) " iommufd=%d dev_id=%u pt_id=%u flags=0x%x hwpt_type=%u len=%u data_ptr=0x%"PRIx64" out_hwpt=%u (%d)"
iommufd_backend_free_id(int iommufd, uint32_t id, int ret) " iommufd=%d id=%d (%d)"
iommufd_backend_set_dirty(int iommufd, uint32_t hwpt_id, bool start, int ret) " iommufd=%d hwpt=%u enable=%d (%d)"
-iommufd_backend_get_dirty_bitmap(int iommufd, uint32_t hwpt_id, uint64_t iova, uint64_t size, uint64_t page_size, int ret) " iommufd=%d hwpt=%u iova=0x%"PRIx64" size=0x%"PRIx64" page_size=0x%"PRIx64" (%d)"
+iommufd_backend_get_dirty_bitmap(int iommufd, uint32_t hwpt_id, uint64_t iova, uint64_t size, uint64_t flags, uint64_t page_size, int ret) " iommufd=%d hwpt=%u iova=0x%"PRIx64" size=0x%"PRIx64" flags=0x%"PRIx64" page_size=0x%"PRIx64" (%d)"
iommufd_backend_invalidate_cache(int iommufd, uint32_t id, uint32_t data_type, uint32_t entry_len, uint32_t entry_num, uint32_t done_num, uint64_t data_ptr, int ret) " iommufd=%d id=%u data_type=%u entry_len=%u entry_num=%u done_num=%u data_ptr=0x%"PRIx64" (%d)"
--
2.47.1
* [PATCH v2 6/8] intel_iommu: Fix unmap_bitmap failure with legacy VFIO backend
2025-10-17 8:22 [PATCH v2 0/8] vfio: relax the vIOMMU check Zhenzhong Duan
` (4 preceding siblings ...)
2025-10-17 8:22 ` [PATCH v2 5/8] vfio/iommufd: Add IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag support Zhenzhong Duan
@ 2025-10-17 8:22 ` Zhenzhong Duan
2025-10-20 7:06 ` Cédric Le Goater
2025-10-20 7:38 ` Yi Liu
2025-10-17 8:22 ` [PATCH v2 7/8] vfio/migration: Add migration blocker if VM memory is too large to cause unmap_bitmap failure Zhenzhong Duan
2025-10-17 8:22 ` [PATCH v2 8/8] vfio/migration: Allow live migration with vIOMMU without VFs using device dirty tracking Zhenzhong Duan
7 siblings, 2 replies; 34+ messages in thread
From: Zhenzhong Duan @ 2025-10-17 8:22 UTC (permalink / raw)
To: qemu-devel
Cc: alex.williamson, clg, mst, jasowang, yi.l.liu,
clement.mathieu--drif, eric.auger, joao.m.martins, avihaih,
xudong.hao, giovanni.cabiddu, mark.gross, arjan.van.de.ven,
Zhenzhong Duan
If a VFIO device in the guest switches from an IOMMU domain to a block
domain, vtd_address_space_unmap() is called to unmap the whole address space.
If that happens during migration, migration fails with the legacy VFIO
backend as below:
Status: failed (vfio_container_dma_unmap(0x561bbbd92d90, 0x100000000000, 0x100000000000) = -7 (Argument list too long))
Because legacy VFIO limits the maximum bitmap size to 256MB, which covers 8TB
on a 4K-page system, the unmap_bitmap ioctl fails when a 16TB-sized UNMAP
notification is sent.
Fix it by iterating over the DMAMap list and unmapping each range with an
active mapping while migration is active. If migration is not active,
unmapping the whole address space in one go remains optimal.
There is no such limitation with the iommufd backend, but it is still not
optimal to allocate a large bitmap: there may be large holes between IOVA
ranges, and allocating a large bitmap and dirty tracking over the holes is
time-consuming, useless work.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Tested-by: Giovannio Cabiddu <giovanni.cabiddu@intel.com>
---
hw/i386/intel_iommu.c | 42 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 42 insertions(+)
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 6a168d5107..f32d4f5a15 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -37,6 +37,7 @@
#include "system/system.h"
#include "hw/i386/apic_internal.h"
#include "kvm/kvm_i386.h"
+#include "migration/misc.h"
#include "migration/vmstate.h"
#include "trace.h"
@@ -4533,6 +4534,42 @@ static void vtd_dev_unset_iommu_device(PCIBus *bus, void *opaque, int devfn)
vtd_iommu_unlock(s);
}
+/*
+ * Unmapping a large range in one go is not optimal during migration because
+ * a large dirty bitmap needs to be allocated while there may be only small
+ * mappings, iterate over DMAMap list to unmap each range with active mapping.
+ */
+static void vtd_address_space_unmap_in_migration(VTDAddressSpace *as,
+ IOMMUNotifier *n)
+{
+ const DMAMap *map;
+ const DMAMap target = {
+ .iova = n->start,
+ .size = n->end,
+ };
+ IOVATree *tree = as->iova_tree;
+
+ /*
+ * DMAMap is created during IOMMU page table sync, it's either 4KB or huge
+ * page size and always a power of 2 in size. So the range of DMAMap could
+ * be used for UNMAP notification directly.
+ */
+ while ((map = iova_tree_find(tree, &target))) {
+ IOMMUTLBEvent event;
+
+ event.type = IOMMU_NOTIFIER_UNMAP;
+ event.entry.iova = map->iova;
+ event.entry.addr_mask = map->size;
+ event.entry.target_as = &address_space_memory;
+ event.entry.perm = IOMMU_NONE;
+ /* This field is meaningless for unmap */
+ event.entry.translated_addr = 0;
+ memory_region_notify_iommu_one(n, &event);
+
+ iova_tree_remove(tree, *map);
+ }
+}
+
/* Unmap the whole range in the notifier's scope. */
static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
{
@@ -4542,6 +4579,11 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
IntelIOMMUState *s = as->iommu_state;
DMAMap map;
+ if (migration_is_running()) {
+ vtd_address_space_unmap_in_migration(as, n);
+ return;
+ }
+
/*
* Note: all the codes in this function has a assumption that IOVA
* bits are no more than VTD_MGAW bits (which is restricted by
--
2.47.1
* [PATCH v2 7/8] vfio/migration: Add migration blocker if VM memory is too large to cause unmap_bitmap failure
2025-10-17 8:22 [PATCH v2 0/8] vfio: relax the vIOMMU check Zhenzhong Duan
` (5 preceding siblings ...)
2025-10-17 8:22 ` [PATCH v2 6/8] intel_iommu: Fix unmap_bitmap failure with legacy VFIO backend Zhenzhong Duan
@ 2025-10-17 8:22 ` Zhenzhong Duan
2025-10-20 8:39 ` Cédric Le Goater
2025-10-20 12:44 ` Yi Liu
2025-10-17 8:22 ` [PATCH v2 8/8] vfio/migration: Allow live migration with vIOMMU without VFs using device dirty tracking Zhenzhong Duan
7 siblings, 2 replies; 34+ messages in thread
From: Zhenzhong Duan @ 2025-10-17 8:22 UTC (permalink / raw)
To: qemu-devel
Cc: alex.williamson, clg, mst, jasowang, yi.l.liu,
clement.mathieu--drif, eric.auger, joao.m.martins, avihaih,
xudong.hao, giovanni.cabiddu, mark.gross, arjan.van.de.ven,
Zhenzhong Duan
With the default config, the kernel VFIO type1 driver limits the dirty
bitmap to 256MB for the unmap_bitmap ioctl, so the largest guest memory
region for which the ioctl can succeed is 8TB (on a 4K-page system).
Be conservative here and limit total guest memory to 8TB, or else add a
migration blocker. The IOMMUFD backend doesn't have such a limit; one can
use an IOMMUFD-backed device if there is a need to migrate such a large VM.
Suggested-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
hw/vfio/migration.c | 37 +++++++++++++++++++++++++++++++++++++
1 file changed, 37 insertions(+)
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 4c06e3db93..1106ca7857 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -16,6 +16,7 @@
#include <sys/ioctl.h>
#include "system/runstate.h"
+#include "hw/boards.h"
#include "hw/vfio/vfio-device.h"
#include "hw/vfio/vfio-migration.h"
#include "migration/misc.h"
@@ -1152,6 +1153,35 @@ static bool vfio_viommu_preset(VFIODevice *vbasedev)
return vbasedev->bcontainer->space->as != &address_space_memory;
}
+static bool vfio_dirty_tracking_exceed_limit(VFIODevice *vbasedev)
+{
+ VFIOContainer *bcontainer = vbasedev->bcontainer;
+ uint64_t max_size, page_size;
+
+ if (!object_dynamic_cast(OBJECT(bcontainer), TYPE_VFIO_IOMMU_LEGACY)) {
+ return false;
+ }
+
+ if (!bcontainer->dirty_pages_supported) {
+ return true;
+ }
+ /*
+ * VFIO type1 driver has a limitation of bitmap size on unmap_bitmap
+ * ioctl(), calculate the limit and compare with guest memory size to
+ * catch dirty tracking failure early.
+ *
+ * This limit is 8TB with default kernel and QEMU config, we are a bit
+ * conservative here as VM memory layout may be nonconsecutive or VM
+ * can run with vIOMMU enabled so the limitation could be relaxed. One
+ * can also switch to use IOMMUFD backend if there is a need to migrate
+ * large VM.
+ */
+ page_size = 1 << ctz64(bcontainer->dirty_pgsizes);
+ max_size = bcontainer->max_dirty_bitmap_size * BITS_PER_BYTE * page_size;
+
+ return current_machine->ram_size > max_size;
+}
+
/*
* Return true when either migration initialized or blocker registered.
* Currently only return false when adding blocker fails which will
@@ -1208,6 +1238,13 @@ bool vfio_migration_realize(VFIODevice *vbasedev, Error **errp)
goto add_blocker;
}
+ if (vfio_dirty_tracking_exceed_limit(vbasedev)) {
+ error_setg(&err, "%s: Migration is currently not supported with "
+ "large memory VM due to dirty tracking limitation in "
+ "VFIO type1 driver", vbasedev->name);
+ goto add_blocker;
+ }
+
trace_vfio_migration_realize(vbasedev->name);
return true;
--
2.47.1
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH v2 8/8] vfio/migration: Allow live migration with vIOMMU without VFs using device dirty tracking
2025-10-17 8:22 [PATCH v2 0/8] vfio: relax the vIOMMU check Zhenzhong Duan
` (6 preceding siblings ...)
2025-10-17 8:22 ` [PATCH v2 7/8] vfio/migration: Add migration blocker if VM memory is too large to cause unmap_bitmap failure Zhenzhong Duan
@ 2025-10-17 8:22 ` Zhenzhong Duan
2025-10-20 12:46 ` Yi Liu
7 siblings, 1 reply; 34+ messages in thread
From: Zhenzhong Duan @ 2025-10-17 8:22 UTC (permalink / raw)
To: qemu-devel
Cc: alex.williamson, clg, mst, jasowang, yi.l.liu,
clement.mathieu--drif, eric.auger, joao.m.martins, avihaih,
xudong.hao, giovanni.cabiddu, mark.gross, arjan.van.de.ven,
Zhenzhong Duan, Jason Zeng
Commit e46883204c38 ("vfio/migration: Block migration with vIOMMU")
introduces a migration blocker when vIOMMU is enabled, because we need
to calculate the IOVA ranges for device dirty tracking. But this is
unnecessary for iommu dirty tracking.
Limit the vfio_viommu_preset() check to those devices which use device
dirty tracking. This allows live migration with VFIO devices which use
iommu dirty tracking.
Introduce a helper vfio_device_dirty_pages_disabled() to facilitate this.
Suggested-by: Jason Zeng <jason.zeng@intel.com>
Co-developed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Tested-by: Xudong Hao <xudong.hao@intel.com>
Tested-by: Giovannio Cabiddu <giovanni.cabiddu@intel.com>
---
include/hw/vfio/vfio-device.h | 10 ++++++++++
hw/vfio/container.c | 5 +----
hw/vfio/device.c | 6 ++++++
hw/vfio/migration.c | 6 +++---
4 files changed, 20 insertions(+), 7 deletions(-)
diff --git a/include/hw/vfio/vfio-device.h b/include/hw/vfio/vfio-device.h
index 7e9aed6d3c..feda521514 100644
--- a/include/hw/vfio/vfio-device.h
+++ b/include/hw/vfio/vfio-device.h
@@ -148,6 +148,16 @@ bool vfio_device_irq_set_signaling(VFIODevice *vbasedev, int index, int subindex
void vfio_device_reset_handler(void *opaque);
bool vfio_device_is_mdev(VFIODevice *vbasedev);
+/**
+ * vfio_device_dirty_pages_disabled: Check if device dirty tracking is
+ * disabled for a VFIO device
+ *
+ * @vbasedev: The VFIODevice to check
+ *
+ * Return: true if @vbasedev doesn't support device dirty tracking or it is
+ * forcibly disabled on the command line, false otherwise.
+ */
+bool vfio_device_dirty_pages_disabled(VFIODevice *vbasedev);
bool vfio_device_hiod_create_and_realize(VFIODevice *vbasedev,
const char *typename, Error **errp);
bool vfio_device_attach(char *name, VFIODevice *vbasedev,
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index 7706603c1c..8879da78c8 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -178,10 +178,7 @@ bool vfio_container_devices_dirty_tracking_is_supported(
VFIODevice *vbasedev;
QLIST_FOREACH(vbasedev, &bcontainer->device_list, container_next) {
- if (vbasedev->device_dirty_page_tracking == ON_OFF_AUTO_OFF) {
- return false;
- }
- if (!vbasedev->dirty_pages_supported) {
+ if (vfio_device_dirty_pages_disabled(vbasedev)) {
return false;
}
}
diff --git a/hw/vfio/device.c b/hw/vfio/device.c
index 64f8750389..837872387f 100644
--- a/hw/vfio/device.c
+++ b/hw/vfio/device.c
@@ -400,6 +400,12 @@ bool vfio_device_is_mdev(VFIODevice *vbasedev)
return subsys && (strcmp(subsys, "/sys/bus/mdev") == 0);
}
+bool vfio_device_dirty_pages_disabled(VFIODevice *vbasedev)
+{
+ return (!vbasedev->dirty_pages_supported ||
+ vbasedev->device_dirty_page_tracking == ON_OFF_AUTO_OFF);
+}
+
bool vfio_device_hiod_create_and_realize(VFIODevice *vbasedev,
const char *typename, Error **errp)
{
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 1106ca7857..1093857a34 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -1213,8 +1213,7 @@ bool vfio_migration_realize(VFIODevice *vbasedev, Error **errp)
return !vfio_block_migration(vbasedev, err, errp);
}
- if ((!vbasedev->dirty_pages_supported ||
- vbasedev->device_dirty_page_tracking == ON_OFF_AUTO_OFF) &&
+ if (vfio_device_dirty_pages_disabled(vbasedev) &&
!vbasedev->iommu_dirty_tracking) {
if (vbasedev->enable_migration == ON_OFF_AUTO_AUTO) {
error_setg(&err,
@@ -1232,7 +1231,8 @@ bool vfio_migration_realize(VFIODevice *vbasedev, Error **errp)
goto out_deinit;
}
- if (vfio_viommu_preset(vbasedev)) {
+ if (!vfio_device_dirty_pages_disabled(vbasedev) &&
+ vfio_viommu_preset(vbasedev)) {
error_setg(&err, "%s: Migration is currently not supported "
"with vIOMMU enabled", vbasedev->name);
goto add_blocker;
--
2.47.1
^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [PATCH v2 1/8] vfio/iommufd: Add framework code to support getting dirty bitmap before unmap
2025-10-17 8:22 ` [PATCH v2 1/8] vfio/iommufd: Add framework code to support getting dirty bitmap before unmap Zhenzhong Duan
@ 2025-10-20 7:00 ` Cédric Le Goater
2025-10-20 12:44 ` Yi Liu
1 sibling, 0 replies; 34+ messages in thread
From: Cédric Le Goater @ 2025-10-20 7:00 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, mst, jasowang, yi.l.liu, clement.mathieu--drif,
eric.auger, joao.m.martins, avihaih, xudong.hao, giovanni.cabiddu,
mark.gross, arjan.van.de.ven
On 10/17/25 10:22, Zhenzhong Duan wrote:
> Currently we support both device and iommu dirty tracking; device dirty
> tracking is preferred.
>
> Add the framework code in iommufd_cdev_unmap() to choose either device or
> iommu dirty tracking, just like vfio_legacy_dma_unmap_one().
>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> Tested-by: Xudong Hao <xudong.hao@intel.com>
> Tested-by: Giovannio Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Thanks,
C.
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 2/8] vfio/iommufd: Query dirty bitmap before DMA unmap
2025-10-17 8:22 ` [PATCH v2 2/8] vfio/iommufd: Query dirty bitmap before DMA unmap Zhenzhong Duan
@ 2025-10-20 7:00 ` Cédric Le Goater
2025-10-20 8:14 ` Avihai Horon
2025-10-20 12:44 ` Yi Liu
2 siblings, 0 replies; 34+ messages in thread
From: Cédric Le Goater @ 2025-10-20 7:00 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, mst, jasowang, yi.l.liu, clement.mathieu--drif,
eric.auger, joao.m.martins, avihaih, xudong.hao, giovanni.cabiddu,
mark.gross, arjan.van.de.ven
On 10/17/25 10:22, Zhenzhong Duan wrote:
> When an existing mapping is unmapped, there could already be dirty bits
> which need to be recorded before the unmap.
>
> If querying the dirty bitmap fails, we still need to do the unmapping, or
> else a stale mapping remains, which is risky to the guest.
>
> Co-developed-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> Tested-by: Xudong Hao <xudong.hao@intel.com>
> Tested-by: Giovannio Cabiddu <giovanni.cabiddu@intel.com>
> ---
> hw/vfio/iommufd.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
> index 976c0a8814..404e6249ca 100644
> --- a/hw/vfio/iommufd.c
> +++ b/hw/vfio/iommufd.c
> @@ -74,7 +74,13 @@ static int iommufd_cdev_unmap(const VFIOContainer *bcontainer,
> if (iotlb && vfio_container_dirty_tracking_is_started(bcontainer)) {
> if (!vfio_container_devices_dirty_tracking_is_supported(bcontainer) &&
> bcontainer->dirty_pages_supported) {
> - /* TODO: query dirty bitmap before DMA unmap */
> + ret = vfio_container_query_dirty_bitmap(bcontainer, iova, size,
> + iotlb->translated_addr,
> + &local_err);
> + if (ret) {
> + error_report_err(local_err);
> + }
> + /* Unmap stale mapping even if query dirty bitmap fails */
> return iommufd_backend_unmap_dma(be, ioas_id, iova, size);
> }
>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Thanks,
C.
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 3/8] vfio/container-legacy: rename vfio_dma_unmap_bitmap() to vfio_legacy_dma_unmap_get_dirty_bitmap()
2025-10-17 8:22 ` [PATCH v2 3/8] vfio/container-legacy: rename vfio_dma_unmap_bitmap() to vfio_legacy_dma_unmap_get_dirty_bitmap() Zhenzhong Duan
@ 2025-10-20 7:00 ` Cédric Le Goater
2025-10-20 12:45 ` Yi Liu
1 sibling, 0 replies; 34+ messages in thread
From: Cédric Le Goater @ 2025-10-20 7:00 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, mst, jasowang, yi.l.liu, clement.mathieu--drif,
eric.auger, joao.m.martins, avihaih, xudong.hao, giovanni.cabiddu,
mark.gross, arjan.van.de.ven
On 10/17/25 10:22, Zhenzhong Duan wrote:
> This follows the naming style in container-legacy.c, where low-level
> functions have a vfio_legacy_ prefix.
>
> No functional changes.
>
> Suggested-by: Cédric Le Goater <clg@redhat.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
> hw/vfio/container-legacy.c | 10 ++++++----
> 1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/hw/vfio/container-legacy.c b/hw/vfio/container-legacy.c
> index 8e9639603e..b7e3b892b9 100644
> --- a/hw/vfio/container-legacy.c
> +++ b/hw/vfio/container-legacy.c
> @@ -68,9 +68,10 @@ static int vfio_ram_block_discard_disable(VFIOLegacyContainer *container,
> }
> }
>
> -static int vfio_dma_unmap_bitmap(const VFIOLegacyContainer *container,
> - hwaddr iova, uint64_t size,
> - IOMMUTLBEntry *iotlb)
> +static int
> +vfio_legacy_dma_unmap_get_dirty_bitmap(const VFIOLegacyContainer *container,
> + hwaddr iova, uint64_t size,
> + IOMMUTLBEntry *iotlb)
> {
> const VFIOContainer *bcontainer = VFIO_IOMMU(container);
> struct vfio_iommu_type1_dma_unmap *unmap;
> @@ -141,7 +142,8 @@ static int vfio_legacy_dma_unmap_one(const VFIOLegacyContainer *container,
> if (iotlb && vfio_container_dirty_tracking_is_started(bcontainer)) {
> if (!vfio_container_devices_dirty_tracking_is_supported(bcontainer) &&
> bcontainer->dirty_pages_supported) {
> - return vfio_dma_unmap_bitmap(container, iova, size, iotlb);
> + return vfio_legacy_dma_unmap_get_dirty_bitmap(container, iova, size,
> + iotlb);
> }
>
> need_dirty_sync = true;
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Thanks,
C.
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 4/8] vfio: Add a backend_flag parameter to vfio_contianer_query_dirty_bitmap()
2025-10-17 8:22 ` [PATCH v2 4/8] vfio: Add a backend_flag parameter to vfio_contianer_query_dirty_bitmap() Zhenzhong Duan
@ 2025-10-20 7:01 ` Cédric Le Goater
2025-10-20 12:44 ` Yi Liu
1 sibling, 0 replies; 34+ messages in thread
From: Cédric Le Goater @ 2025-10-20 7:01 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, mst, jasowang, yi.l.liu, clement.mathieu--drif,
eric.auger, joao.m.martins, avihaih, xudong.hao, giovanni.cabiddu,
mark.gross, arjan.van.de.ven
On 10/17/25 10:22, Zhenzhong Duan wrote:
> From: Joao Martins <joao.m.martins@oracle.com>
>
> This new parameter will be used in a following patch; currently 0 is passed.
>
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> Tested-by: Giovannio Cabiddu <giovanni.cabiddu@intel.com>
> ---
> include/hw/vfio/vfio-container.h | 8 ++++++--
> hw/vfio-user/container.c | 5 +++--
> hw/vfio/container-legacy.c | 5 +++--
> hw/vfio/container.c | 15 +++++++++------
> hw/vfio/iommufd.c | 7 ++++---
> hw/vfio/listener.c | 6 +++---
> hw/vfio/trace-events | 2 +-
> 7 files changed, 29 insertions(+), 19 deletions(-)
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Thanks,
C.
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 5/8] vfio/iommufd: Add IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag support
2025-10-17 8:22 ` [PATCH v2 5/8] vfio/iommufd: Add IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag support Zhenzhong Duan
@ 2025-10-20 7:01 ` Cédric Le Goater
2025-10-20 12:45 ` Yi Liu
1 sibling, 0 replies; 34+ messages in thread
From: Cédric Le Goater @ 2025-10-20 7:01 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, mst, jasowang, yi.l.liu, clement.mathieu--drif,
eric.auger, joao.m.martins, avihaih, xudong.hao, giovanni.cabiddu,
mark.gross, arjan.van.de.ven
On 10/17/25 10:22, Zhenzhong Duan wrote:
> Pass IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR when doing the last dirty
> bitmap query right before unmap, skipping PTE flushes. This accelerates
> the query without issue because the unmap will tear down the mapping anyway.
>
> Co-developed-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> Tested-by: Xudong Hao <xudong.hao@intel.com>
> Tested-by: Giovannio Cabiddu <giovanni.cabiddu@intel.com>
> ---
> include/system/iommufd.h | 2 +-
> backends/iommufd.c | 5 +++--
> hw/vfio/iommufd.c | 5 +++--
> backends/trace-events | 2 +-
> 4 files changed, 8 insertions(+), 6 deletions(-)
>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Thanks,
C.
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 6/8] intel_iommu: Fix unmap_bitmap failure with legacy VFIO backend
2025-10-17 8:22 ` [PATCH v2 6/8] intel_iommu: Fix unmap_bitmap failure with legacy VFIO backend Zhenzhong Duan
@ 2025-10-20 7:06 ` Cédric Le Goater
2025-10-20 7:38 ` Yi Liu
1 sibling, 0 replies; 34+ messages in thread
From: Cédric Le Goater @ 2025-10-20 7:06 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, mst, jasowang, yi.l.liu, clement.mathieu--drif,
eric.auger, joao.m.martins, avihaih, xudong.hao, giovanni.cabiddu,
mark.gross, arjan.van.de.ven
Clément, Yi Liu,
On 10/17/25 10:22, Zhenzhong Duan wrote:
> If a VFIO device in guest switches from IOMMU domain to block domain,
> vtd_address_space_unmap() is called to unmap whole address space.
>
> If that happens during migration, migration fails with legacy VFIO
> backend as below:
>
> Status: failed (vfio_container_dma_unmap(0x561bbbd92d90, 0x100000000000, 0x100000000000) = -7 (Argument list too long))
>
> Because legacy VFIO limits the maximum bitmap size to 256MB, which maps to
> 8TB on a 4K-page system, the unmap_bitmap ioctl fails when a 16TB-sized
> UNMAP notification is sent.
>
> Fix it by iterating over the DMAMap list to unmap each range with an active
> mapping when migration is active. If migration is not active, unmapping the
> whole address space in one go is optimal.
>
> There is no such limitation with the iommufd backend, but it's still not
> optimal to allocate a large bitmap: there may be large holes between IOVA
> ranges, and allocating a large bitmap and dirty tracking over the holes is
> time-consuming and useless work.
>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> Tested-by: Giovannio Cabiddu <giovanni.cabiddu@intel.com>
> ---
> hw/i386/intel_iommu.c | 42 ++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 42 insertions(+)
>
Could you ack this change please ?
Thanks,
C.
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 6/8] intel_iommu: Fix unmap_bitmap failure with legacy VFIO backend
2025-10-17 8:22 ` [PATCH v2 6/8] intel_iommu: Fix unmap_bitmap failure with legacy VFIO backend Zhenzhong Duan
2025-10-20 7:06 ` Cédric Le Goater
@ 2025-10-20 7:38 ` Yi Liu
2025-10-20 8:03 ` Duan, Zhenzhong
1 sibling, 1 reply; 34+ messages in thread
From: Yi Liu @ 2025-10-20 7:38 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, clg, mst, jasowang, clement.mathieu--drif,
eric.auger, joao.m.martins, avihaih, xudong.hao, giovanni.cabiddu,
mark.gross, arjan.van.de.ven
On 2025/10/17 16:22, Zhenzhong Duan wrote:
> If a VFIO device in guest switches from IOMMU domain to block domain,
> vtd_address_space_unmap() is called to unmap whole address space.
>
> If that happens during migration, migration fails with legacy VFIO
> backend as below:
>
> Status: failed (vfio_container_dma_unmap(0x561bbbd92d90, 0x100000000000, 0x100000000000) = -7 (Argument list too long))
>
> Because legacy VFIO limits the maximum bitmap size to 256MB, which maps to
> 8TB on a 4K-page system, the unmap_bitmap ioctl fails when a 16TB-sized
> UNMAP notification is sent.
It would be great to add some words to note why vIOMMU can trigger this.
> Fix it by iterating over the DMAMap list to unmap each range with an active
> mapping when migration is active. If migration is not active, unmapping the
> whole address space in one go is optimal.
>
> There is no such limitation with the iommufd backend, but it's still not
> optimal to allocate a large bitmap: there may be large holes between IOVA
> ranges, and allocating a large bitmap and dirty tracking over the holes is
> time-consuming and useless work.
>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> Tested-by: Giovannio Cabiddu <giovanni.cabiddu@intel.com>
> ---
> hw/i386/intel_iommu.c | 42 ++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 42 insertions(+)
with above comment, the patch LGTM.
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index 6a168d5107..f32d4f5a15 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -37,6 +37,7 @@
> #include "system/system.h"
> #include "hw/i386/apic_internal.h"
> #include "kvm/kvm_i386.h"
> +#include "migration/misc.h"
> #include "migration/vmstate.h"
> #include "trace.h"
>
> @@ -4533,6 +4534,42 @@ static void vtd_dev_unset_iommu_device(PCIBus *bus, void *opaque, int devfn)
> vtd_iommu_unlock(s);
> }
>
> +/*
> + * Unmapping a large range in one go is not optimal during migration because
> + * a large dirty bitmap needs to be allocated while there may be only small
> + * mappings, iterate over DMAMap list to unmap each range with active mapping.
> + */
> +static void vtd_address_space_unmap_in_migration(VTDAddressSpace *as,
> + IOMMUNotifier *n)
> +{
> + const DMAMap *map;
> + const DMAMap target = {
> + .iova = n->start,
> + .size = n->end,
> + };
> + IOVATree *tree = as->iova_tree;
> +
> + /*
> + * DMAMap is created during IOMMU page table sync, it's either 4KB or huge
> + * page size and always a power of 2 in size. So the range of DMAMap could
> + * be used for UNMAP notification directly.
> + */
> + while ((map = iova_tree_find(tree, &target))) {
> + IOMMUTLBEvent event;
> +
> + event.type = IOMMU_NOTIFIER_UNMAP;
> + event.entry.iova = map->iova;
> + event.entry.addr_mask = map->size;
> + event.entry.target_as = &address_space_memory;
> + event.entry.perm = IOMMU_NONE;
> + /* This field is meaningless for unmap */
> + event.entry.translated_addr = 0;
> + memory_region_notify_iommu_one(n, &event);
> +
> + iova_tree_remove(tree, *map);
> + }
> +}
> +
> /* Unmap the whole range in the notifier's scope. */
> static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
> {
> @@ -4542,6 +4579,11 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
> IntelIOMMUState *s = as->iommu_state;
> DMAMap map;
>
> + if (migration_is_running()) {
> + vtd_address_space_unmap_in_migration(as, n);
> + return;
> + }
> +
> /*
> * Note: all the codes in this function has a assumption that IOVA
> * bits are no more than VTD_MGAW bits (which is restricted by
^ permalink raw reply [flat|nested] 34+ messages in thread
* RE: [PATCH v2 6/8] intel_iommu: Fix unmap_bitmap failure with legacy VFIO backend
2025-10-20 7:38 ` Yi Liu
@ 2025-10-20 8:03 ` Duan, Zhenzhong
2025-10-20 8:37 ` Yi Liu
0 siblings, 1 reply; 34+ messages in thread
From: Duan, Zhenzhong @ 2025-10-20 8:03 UTC (permalink / raw)
To: Liu, Yi L, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
jasowang@redhat.com, clement.mathieu--drif@eviden.com,
eric.auger@redhat.com, joao.m.martins@oracle.com,
avihaih@nvidia.com, Hao, Xudong, Cabiddu, Giovanni, Gross, Mark,
Van De Ven, Arjan
>-----Original Message-----
>From: Liu, Yi L <yi.l.liu@intel.com>
>Subject: Re: [PATCH v2 6/8] intel_iommu: Fix unmap_bitmap failure with
>legacy VFIO backend
>
>On 2025/10/17 16:22, Zhenzhong Duan wrote:
>> If a VFIO device in guest switches from IOMMU domain to block domain,
>> vtd_address_space_unmap() is called to unmap whole address space.
>>
>> If that happens during migration, migration fails with legacy VFIO
>> backend as below:
>>
>> Status: failed (vfio_container_dma_unmap(0x561bbbd92d90,
>0x100000000000, 0x100000000000) = -7 (Argument list too long))
>>
>> Because legacy VFIO limits maximum bitmap size to 256MB which maps to
>8TB on
>> 4K page system, when 16TB sized UNMAP notification is sent,
>unmap_bitmap
>> ioctl fails.
>
>It would be great to add some words to note why vIOMMU can trigger this.
Hi Yi, I think the first sentence in the description explains that?
"If a VFIO device in guest switches from IOMMU domain to block domain,
vtd_address_space_unmap() is called to unmap whole address space."
Thanks
Zhenzhong
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 2/8] vfio/iommufd: Query dirty bitmap before DMA unmap
2025-10-17 8:22 ` [PATCH v2 2/8] vfio/iommufd: Query dirty bitmap before DMA unmap Zhenzhong Duan
2025-10-20 7:00 ` Cédric Le Goater
@ 2025-10-20 8:14 ` Avihai Horon
2025-10-20 10:00 ` Duan, Zhenzhong
2025-10-20 12:44 ` Yi Liu
2 siblings, 1 reply; 34+ messages in thread
From: Avihai Horon @ 2025-10-20 8:14 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, clg, mst, jasowang, yi.l.liu,
clement.mathieu--drif, eric.auger, joao.m.martins, xudong.hao,
giovanni.cabiddu, mark.gross, arjan.van.de.ven
Hi,
On 17/10/2025 11:22, Zhenzhong Duan wrote:
> When an existing mapping is unmapped, there could already be dirty bits
> which need to be recorded before the unmap.
>
> If querying the dirty bitmap fails, we still need to do the unmapping, or
> else a stale mapping remains, which is risky to the guest.
>
> Co-developed-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> Tested-by: Xudong Hao <xudong.hao@intel.com>
> Tested-by: Giovannio Cabiddu <giovanni.cabiddu@intel.com>
> ---
> hw/vfio/iommufd.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
> index 976c0a8814..404e6249ca 100644
> --- a/hw/vfio/iommufd.c
> +++ b/hw/vfio/iommufd.c
> @@ -74,7 +74,13 @@ static int iommufd_cdev_unmap(const VFIOContainer *bcontainer,
> if (iotlb && vfio_container_dirty_tracking_is_started(bcontainer)) {
> if (!vfio_container_devices_dirty_tracking_is_supported(bcontainer) &&
> bcontainer->dirty_pages_supported) {
> - /* TODO: query dirty bitmap before DMA unmap */
> + ret = vfio_container_query_dirty_bitmap(bcontainer, iova, size,
> + iotlb->translated_addr,
> + &local_err);
> + if (ret) {
> + error_report_err(local_err);
> + }
> + /* Unmap stale mapping even if query dirty bitmap fails */
> return iommufd_backend_unmap_dma(be, ioas_id, iova, size);
If the dirty bitmap query fails, shouldn't we unmap and return the query
error to fail migration? Otherwise, migration may succeed with some dirtied
pages not being migrated.
Thanks.
> }
>
> --
> 2.47.1
>
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 6/8] intel_iommu: Fix unmap_bitmap failure with legacy VFIO backend
2025-10-20 8:03 ` Duan, Zhenzhong
@ 2025-10-20 8:37 ` Yi Liu
2025-10-20 10:01 ` Duan, Zhenzhong
0 siblings, 1 reply; 34+ messages in thread
From: Yi Liu @ 2025-10-20 8:37 UTC (permalink / raw)
To: Duan, Zhenzhong, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
jasowang@redhat.com, clement.mathieu--drif@eviden.com,
eric.auger@redhat.com, joao.m.martins@oracle.com,
avihaih@nvidia.com, Hao, Xudong, Cabiddu, Giovanni, Gross, Mark,
Van De Ven, Arjan
On 2025/10/20 16:03, Duan, Zhenzhong wrote:
>
>
>> -----Original Message-----
>> From: Liu, Yi L <yi.l.liu@intel.com>
>> Subject: Re: [PATCH v2 6/8] intel_iommu: Fix unmap_bitmap failure with
>> legacy VFIO backend
>>
>> On 2025/10/17 16:22, Zhenzhong Duan wrote:
>>> If a VFIO device in guest switches from IOMMU domain to block domain,
>>> vtd_address_space_unmap() is called to unmap whole address space.
>>>
>>> If that happens during migration, migration fails with legacy VFIO
>>> backend as below:
>>>
>>> Status: failed (vfio_container_dma_unmap(0x561bbbd92d90,
>> 0x100000000000, 0x100000000000) = -7 (Argument list too long))
>>>
>>> Because legacy VFIO limits maximum bitmap size to 256MB which maps to
>> 8TB on
>>> 4K page system, when 16TB sized UNMAP notification is sent,
>> unmap_bitmap
>>> ioctl fails.
>>
>> It would be great to add some words to note why vIOMMU can trigger this.
>
> Hi Yi, I think the first sentence in description is explaining that?
>
> "If a VFIO device in guest switches from IOMMU domain to block domain,
> vtd_address_space_unmap() is called to unmap whole address space."
Aha, yes. I was trying to point out that it is NOT necessarily related to VM
memory size. Could you note that the address space is the guest IOVA space,
which is not system memory?
Regards,
Yi Liu
>
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 7/8] vfio/migration: Add migration blocker if VM memory is too large to cause unmap_bitmap failure
2025-10-17 8:22 ` [PATCH v2 7/8] vfio/migration: Add migration blocker if VM memory is too large to cause unmap_bitmap failure Zhenzhong Duan
@ 2025-10-20 8:39 ` Cédric Le Goater
2025-10-20 10:07 ` Duan, Zhenzhong
2025-10-20 12:44 ` Yi Liu
1 sibling, 1 reply; 34+ messages in thread
From: Cédric Le Goater @ 2025-10-20 8:39 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, mst, jasowang, yi.l.liu, clement.mathieu--drif,
eric.auger, joao.m.martins, avihaih, xudong.hao, giovanni.cabiddu,
mark.gross, arjan.van.de.ven
On 10/17/25 10:22, Zhenzhong Duan wrote:
> With default config, kernel VFIO type1 driver limits dirty bitmap to 256MB
... VFIO IOMMU Type1 ...
> for unmap_bitmap ioctl so the maximum guest memory region is no more than
> 8TB size for the ioctl to succeed.
>
> Be conservative here to limit total guest memory to 8TB or else add a
> migration blocker. IOMMUFD backend doesn't have such limit, one can use
> IOMMUFD backed device if there is a need to migration such large VM.
>
> Suggested-by: Yi Liu <yi.l.liu@intel.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
> hw/vfio/migration.c | 37 +++++++++++++++++++++++++++++++++++++
> 1 file changed, 37 insertions(+)
>
> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> index 4c06e3db93..1106ca7857 100644
> --- a/hw/vfio/migration.c
> +++ b/hw/vfio/migration.c
> @@ -16,6 +16,7 @@
> #include <sys/ioctl.h>
>
> #include "system/runstate.h"
> +#include "hw/boards.h"
> #include "hw/vfio/vfio-device.h"
> #include "hw/vfio/vfio-migration.h"
> #include "migration/misc.h"
> @@ -1152,6 +1153,35 @@ static bool vfio_viommu_preset(VFIODevice *vbasedev)
> return vbasedev->bcontainer->space->as != &address_space_memory;
> }
>
> +static bool vfio_dirty_tracking_exceed_limit(VFIODevice *vbasedev)
> +{
> + VFIOContainer *bcontainer = vbasedev->bcontainer;
> + uint64_t max_size, page_size;
> +
> + if (!object_dynamic_cast(OBJECT(bcontainer), TYPE_VFIO_IOMMU_LEGACY)) {
> + return false;
> + }
Could we set 'dirty_pgsizes' and 'max_dirty_bitmap_size' in the IOMMUFD
backend to avoid the object_dynamic_cast()?
Thanks,
C.
> + if (!bcontainer->dirty_pages_supported) {
> + return true;
> + }
> + /*
> + * VFIO type1 driver has a limitation of bitmap size on unmap_bitmap
> + * ioctl(), calculate the limit and compare with guest memory size to
> + * catch dirty tracking failure early.
> + *
> + * This limit is 8TB with default kernel and QEMU config, we are a bit
> + * conservative here as VM memory layout may be nonconsecutive or VM
> + * can run with vIOMMU enabled so the limitation could be relaxed. One
> + * can also switch to use IOMMUFD backend if there is a need to migrate
> + * large VM.
> + */
> + page_size = 1 << ctz64(bcontainer->dirty_pgsizes);
> + max_size = bcontainer->max_dirty_bitmap_size * BITS_PER_BYTE * page_size;
> +
> + return current_machine->ram_size > max_size;
> +}
> +
> /*
> * Return true when either migration initialized or blocker registered.
> * Currently only return false when adding blocker fails which will
> @@ -1208,6 +1238,13 @@ bool vfio_migration_realize(VFIODevice *vbasedev, Error **errp)
> goto add_blocker;
> }
>
> + if (vfio_dirty_tracking_exceed_limit(vbasedev)) {
> + error_setg(&err, "%s: Migration is currently not supported with "
> + "large memory VM due to dirty tracking limitation in "
> + "VFIO type1 driver", vbasedev->name);
> + goto add_blocker;
> + }
> +
> trace_vfio_migration_realize(vbasedev->name);
> return true;
>
^ permalink raw reply [flat|nested] 34+ messages in thread
* RE: [PATCH v2 2/8] vfio/iommufd: Query dirty bitmap before DMA unmap
2025-10-20 8:14 ` Avihai Horon
@ 2025-10-20 10:00 ` Duan, Zhenzhong
2025-10-20 12:45 ` Yi Liu
0 siblings, 1 reply; 34+ messages in thread
From: Duan, Zhenzhong @ 2025-10-20 10:00 UTC (permalink / raw)
To: Avihai Horon, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
jasowang@redhat.com, Liu, Yi L, clement.mathieu--drif@eviden.com,
eric.auger@redhat.com, joao.m.martins@oracle.com, Hao, Xudong,
Cabiddu, Giovanni, Gross, Mark, Van De Ven, Arjan
Hi
>-----Original Message-----
>From: Avihai Horon <avihaih@nvidia.com>
>Subject: Re: [PATCH v2 2/8] vfio/iommufd: Query dirty bitmap before DMA
>unmap
>
>Hi,
>
>On 17/10/2025 11:22, Zhenzhong Duan wrote:
>> When an existing mapping is unmapped, there could already be dirty bits
>> which need to be recorded before the unmap.
>>
>> If querying the dirty bitmap fails, we still need to do the unmapping, or
>> else a stale mapping remains, which is risky to the guest.
>>
>> Co-developed-by: Joao Martins <joao.m.martins@oracle.com>
>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> Tested-by: Xudong Hao <xudong.hao@intel.com>
>> Tested-by: Giovannio Cabiddu <giovanni.cabiddu@intel.com>
>> ---
>> hw/vfio/iommufd.c | 8 +++++++-
>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>> index 976c0a8814..404e6249ca 100644
>> --- a/hw/vfio/iommufd.c
>> +++ b/hw/vfio/iommufd.c
>> @@ -74,7 +74,13 @@ static int iommufd_cdev_unmap(const
>VFIOContainer *bcontainer,
>> if (iotlb && vfio_container_dirty_tracking_is_started(bcontainer)) {
>> if
>(!vfio_container_devices_dirty_tracking_is_supported(bcontainer) &&
>> bcontainer->dirty_pages_supported) {
>> - /* TODO: query dirty bitmap before DMA unmap */
>> + ret = vfio_container_query_dirty_bitmap(bcontainer, iova,
>size,
>> +
>iotlb->translated_addr,
>> +
>&local_err);
>> + if (ret) {
>> + error_report_err(local_err);
>> + }
>> + /* Unmap stale mapping even if query dirty bitmap fails */
>> return iommufd_backend_unmap_dma(be, ioas_id, iova,
>size);
>
>If query dirty bitmap fails, shouldn't we unmap and return the query
>bitmap error to fail migration? Otherwise, migration may succeed with
>some dirtied pages not being migrated.
Oh, good catch. Will make below change:
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -65,7 +65,7 @@ static int iommufd_cdev_unmap(const VFIOContainer *bcontainer,
uint32_t ioas_id = container->ioas_id;
bool need_dirty_sync = false;
Error *local_err = NULL;
- int ret;
+ int ret, unmap_ret;
if (unmap_all) {
size = UINT64_MAX;
@@ -82,7 +82,14 @@ static int iommufd_cdev_unmap(const VFIOContainer *bcontainer,
error_report_err(local_err);
}
/* Unmap stale mapping even if query dirty bitmap fails */
- return iommufd_backend_unmap_dma(be, ioas_id, iova, size);
+ unmap_ret = iommufd_backend_unmap_dma(be, ioas_id, iova, size);
+
+ /*
+ * If dirty tracking fails, return the failure to VFIO core to
+ * fail the migration, or else there will be dirty pages missed
+ * to be migrated.
+ */
+ return unmap_ret ? : ret;
}
Thanks
Zhenzhong
^ permalink raw reply [flat|nested] 34+ messages in thread
* RE: [PATCH v2 6/8] intel_iommu: Fix unmap_bitmap failure with legacy VFIO backend
2025-10-20 8:37 ` Yi Liu
@ 2025-10-20 10:01 ` Duan, Zhenzhong
0 siblings, 0 replies; 34+ messages in thread
From: Duan, Zhenzhong @ 2025-10-20 10:01 UTC (permalink / raw)
To: Liu, Yi L, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
jasowang@redhat.com, clement.mathieu--drif@eviden.com,
eric.auger@redhat.com, joao.m.martins@oracle.com,
avihaih@nvidia.com, Hao, Xudong, Cabiddu, Giovanni, Gross, Mark,
Van De Ven, Arjan
>-----Original Message-----
>From: Liu, Yi L <yi.l.liu@intel.com>
>Subject: Re: [PATCH v2 6/8] intel_iommu: Fix unmap_bitmap failure with
>legacy VFIO backend
>
>
>
>On 2025/10/20 16:03, Duan, Zhenzhong wrote:
>>
>>
>>> -----Original Message-----
>>> From: Liu, Yi L <yi.l.liu@intel.com>
>>> Subject: Re: [PATCH v2 6/8] intel_iommu: Fix unmap_bitmap failure with
>>> legacy VFIO backend
>>>
>>> On 2025/10/17 16:22, Zhenzhong Duan wrote:
>>>> If a VFIO device in guest switches from IOMMU domain to block domain,
>>>> vtd_address_space_unmap() is called to unmap whole address space.
>>>>
>>>> If that happens during migration, migration fails with legacy VFIO
>>>> backend as below:
>>>>
>>>> Status: failed (vfio_container_dma_unmap(0x561bbbd92d90, 0x100000000000, 0x100000000000) = -7 (Argument list too long))
>>>>
>>>> Because legacy VFIO limits maximum bitmap size to 256MB which maps to
>>>> 8TB on 4K page system, when 16TB sized UNMAP notification is sent,
>>>> unmap_bitmap ioctl fails.
>>>
>>> It would be great to add some words to note why vIOMMU can trigger this.
>>
>> Hi Yi, I think the first sentence in description is explaining that?
>>
>> "If a VFIO device in guest switches from IOMMU domain to block domain,
>> vtd_address_space_unmap() is called to unmap whole address space."
>
>aha, yes. I was trying to point out that it is NOT necessarily related to VM
>memory size. Could you note that the address space is guest IOVA, which
>is not system memory?
Sure, will do
Thanks
Zhenzhong
^ permalink raw reply [flat|nested] 34+ messages in thread
* RE: [PATCH v2 7/8] vfio/migration: Add migration blocker if VM memory is too large to cause unmap_bitmap failure
2025-10-20 8:39 ` Cédric Le Goater
@ 2025-10-20 10:07 ` Duan, Zhenzhong
0 siblings, 0 replies; 34+ messages in thread
From: Duan, Zhenzhong @ 2025-10-20 10:07 UTC (permalink / raw)
To: Cédric Le Goater, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, mst@redhat.com, jasowang@redhat.com,
Liu, Yi L, clement.mathieu--drif@eviden.com,
eric.auger@redhat.com, joao.m.martins@oracle.com,
avihaih@nvidia.com, Hao, Xudong, Cabiddu, Giovanni, Gross, Mark,
Van De Ven, Arjan
>-----Original Message-----
>From: Cédric Le Goater <clg@redhat.com>
>Subject: Re: [PATCH v2 7/8] vfio/migration: Add migration blocker if VM
>memory is too large to cause unmap_bitmap failure
>
>On 10/17/25 10:22, Zhenzhong Duan wrote:
>> With default config, kernel VFIO type1 driver limits dirty bitmap to 256MB
>
>
>... VFIO IOMMU Type1 ...
OK
>
>> for unmap_bitmap ioctl so the maximum guest memory region is no more than
>> 8TB size for the ioctl to succeed.
>>
>> Be conservative here to limit total guest memory to 8TB or else add a
>> migration blocker. IOMMUFD backend doesn't have such a limit; one can use
>> an IOMMUFD-backed device if there is a need to migrate such a large VM.
>>
>> Suggested-by: Yi Liu <yi.l.liu@intel.com>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>> hw/vfio/migration.c | 37 +++++++++++++++++++++++++++++++++++++
>> 1 file changed, 37 insertions(+)
>>
>> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
>> index 4c06e3db93..1106ca7857 100644
>> --- a/hw/vfio/migration.c
>> +++ b/hw/vfio/migration.c
>> @@ -16,6 +16,7 @@
>> #include <sys/ioctl.h>
>>
>> #include "system/runstate.h"
>> +#include "hw/boards.h"
>> #include "hw/vfio/vfio-device.h"
>> #include "hw/vfio/vfio-migration.h"
>> #include "migration/misc.h"
>> @@ -1152,6 +1153,35 @@ static bool vfio_viommu_preset(VFIODevice *vbasedev)
>>       return vbasedev->bcontainer->space->as != &address_space_memory;
>>   }
>>
>> +static bool vfio_dirty_tracking_exceed_limit(VFIODevice *vbasedev)
>> +{
>> + VFIOContainer *bcontainer = vbasedev->bcontainer;
>> + uint64_t max_size, page_size;
>> +
>> +    if (!object_dynamic_cast(OBJECT(bcontainer), TYPE_VFIO_IOMMU_LEGACY)) {
>> +        return false;
>> +    }
>
>
>Could we set in the IOMMUFD backend 'dirty_pgsizes' and
>'max_dirty_bitmap_size' to avoid the object_dynamic_cast()?
Sure, will do.
Thanks
Zhenzhong
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 1/8] vfio/iommufd: Add framework code to support getting dirty bitmap before unmap
2025-10-17 8:22 ` [PATCH v2 1/8] vfio/iommufd: Add framework code to support getting dirty bitmap before unmap Zhenzhong Duan
2025-10-20 7:00 ` Cédric Le Goater
@ 2025-10-20 12:44 ` Yi Liu
1 sibling, 0 replies; 34+ messages in thread
From: Yi Liu @ 2025-10-20 12:44 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, clg, mst, jasowang, clement.mathieu--drif,
eric.auger, joao.m.martins, avihaih, xudong.hao, giovanni.cabiddu,
mark.gross, arjan.van.de.ven
On 2025/10/17 16:22, Zhenzhong Duan wrote:
> Currently we support device and iommu dirty tracking, device dirty tracking
> is preferred.
>
> Add the framework code in iommufd_cdev_unmap() to choose either device or
> iommu dirty tracking, just like vfio_legacy_dma_unmap_one().
>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> Tested-by: Xudong Hao <xudong.hao@intel.com>
> Tested-by: Giovannio Cabiddu <giovanni.cabiddu@intel.com>
> ---
> hw/vfio/iommufd.c | 34 +++++++++++++++++++++++++++++++---
> 1 file changed, 31 insertions(+), 3 deletions(-)
>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
> index fc9cd9d22f..976c0a8814 100644
> --- a/hw/vfio/iommufd.c
> +++ b/hw/vfio/iommufd.c
> @@ -61,14 +61,42 @@ static int iommufd_cdev_unmap(const VFIOContainer *bcontainer,
> IOMMUTLBEntry *iotlb, bool unmap_all)
> {
> const VFIOIOMMUFDContainer *container = VFIO_IOMMU_IOMMUFD(bcontainer);
> + IOMMUFDBackend *be = container->be;
> + uint32_t ioas_id = container->ioas_id;
> + bool need_dirty_sync = false;
> + Error *local_err = NULL;
> + int ret;
>
> if (unmap_all) {
> size = UINT64_MAX;
> }
>
> - /* TODO: Handle dma_unmap_bitmap with iotlb args (migration) */
> - return iommufd_backend_unmap_dma(container->be,
> - container->ioas_id, iova, size);
> + if (iotlb && vfio_container_dirty_tracking_is_started(bcontainer)) {
> + if (!vfio_container_devices_dirty_tracking_is_supported(bcontainer) &&
> + bcontainer->dirty_pages_supported) {
> + /* TODO: query dirty bitmap before DMA unmap */
> + return iommufd_backend_unmap_dma(be, ioas_id, iova, size);
> + }
> +
> + need_dirty_sync = true;
> + }
> +
> + ret = iommufd_backend_unmap_dma(be, ioas_id, iova, size);
> + if (ret) {
> + return ret;
> + }
> +
> + if (need_dirty_sync) {
> + ret = vfio_container_query_dirty_bitmap(bcontainer, iova, size,
> + iotlb->translated_addr,
> + &local_err);
> + if (ret) {
> + error_report_err(local_err);
> + return ret;
> + }
> + }
> +
> + return 0;
> }
>
> static bool iommufd_cdev_kvm_device_add(VFIODevice *vbasedev, Error **errp)
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 2/8] vfio/iommufd: Query dirty bitmap before DMA unmap
2025-10-17 8:22 ` [PATCH v2 2/8] vfio/iommufd: Query dirty bitmap before DMA unmap Zhenzhong Duan
2025-10-20 7:00 ` Cédric Le Goater
2025-10-20 8:14 ` Avihai Horon
@ 2025-10-20 12:44 ` Yi Liu
2 siblings, 0 replies; 34+ messages in thread
From: Yi Liu @ 2025-10-20 12:44 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, clg, mst, jasowang, clement.mathieu--drif,
eric.auger, joao.m.martins, avihaih, xudong.hao, giovanni.cabiddu,
mark.gross, arjan.van.de.ven
On 2025/10/17 16:22, Zhenzhong Duan wrote:
> When a existing mapping is unmapped, there could already be dirty bits
> which need to be recorded before unmap.
s/a/an/
> If query dirty bitmap fails, we still need to do unmapping or else there
> is stale mapping and it's risky to guest.
>
> Co-developed-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> Tested-by: Xudong Hao <xudong.hao@intel.com>
> Tested-by: Giovannio Cabiddu <giovanni.cabiddu@intel.com>
> ---
> hw/vfio/iommufd.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
> index 976c0a8814..404e6249ca 100644
> --- a/hw/vfio/iommufd.c
> +++ b/hw/vfio/iommufd.c
> @@ -74,7 +74,13 @@ static int iommufd_cdev_unmap(const VFIOContainer *bcontainer,
> if (iotlb && vfio_container_dirty_tracking_is_started(bcontainer)) {
> if (!vfio_container_devices_dirty_tracking_is_supported(bcontainer) &&
> bcontainer->dirty_pages_supported) {
> - /* TODO: query dirty bitmap before DMA unmap */
> + ret = vfio_container_query_dirty_bitmap(bcontainer, iova, size,
> + iotlb->translated_addr,
> + &local_err);
> + if (ret) {
> + error_report_err(local_err);
> + }
> + /* Unmap stale mapping even if query dirty bitmap fails */
> return iommufd_backend_unmap_dma(be, ioas_id, iova, size);
> }
>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 4/8] vfio: Add a backend_flag parameter to vfio_contianer_query_dirty_bitmap()
2025-10-17 8:22 ` [PATCH v2 4/8] vfio: Add a backend_flag parameter to vfio_contianer_query_dirty_bitmap() Zhenzhong Duan
2025-10-20 7:01 ` Cédric Le Goater
@ 2025-10-20 12:44 ` Yi Liu
1 sibling, 0 replies; 34+ messages in thread
From: Yi Liu @ 2025-10-20 12:44 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, clg, mst, jasowang, clement.mathieu--drif,
eric.auger, joao.m.martins, avihaih, xudong.hao, giovanni.cabiddu,
mark.gross, arjan.van.de.ven
On 2025/10/17 16:22, Zhenzhong Duan wrote:
> From: Joao Martins <joao.m.martins@oracle.com>
>
> This new parameter will be used in following patch, currently 0 is passed.
>
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> Tested-by: Giovannio Cabiddu <giovanni.cabiddu@intel.com>
> ---
> include/hw/vfio/vfio-container.h | 8 ++++++--
> hw/vfio-user/container.c | 5 +++--
> hw/vfio/container-legacy.c | 5 +++--
> hw/vfio/container.c | 15 +++++++++------
> hw/vfio/iommufd.c | 7 ++++---
> hw/vfio/listener.c | 6 +++---
> hw/vfio/trace-events | 2 +-
> 7 files changed, 29 insertions(+), 19 deletions(-)
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
> diff --git a/include/hw/vfio/vfio-container.h b/include/hw/vfio/vfio-container.h
> index c4b58d664b..9f6e8cedfc 100644
> --- a/include/hw/vfio/vfio-container.h
> +++ b/include/hw/vfio/vfio-container.h
> @@ -99,7 +99,9 @@ bool vfio_container_devices_dirty_tracking_is_supported(
> const VFIOContainer *bcontainer);
> int vfio_container_query_dirty_bitmap(const VFIOContainer *bcontainer,
> uint64_t iova, uint64_t size,
> - hwaddr translated_addr, Error **errp);
> + uint64_t backend_flag,
> + hwaddr translated_addr,
> + Error **errp);
>
> GList *vfio_container_get_iova_ranges(const VFIOContainer *bcontainer);
>
> @@ -253,12 +255,14 @@ struct VFIOIOMMUClass {
> * @vbmap: #VFIOBitmap internal bitmap structure
> * @iova: iova base address
> * @size: size of iova range
> + * @backend_flag: flags for backend, opaque to upper layer container
> * @errp: pointer to Error*, to store an error if it happens.
> *
> * Returns zero to indicate success and negative for error.
> */
> int (*query_dirty_bitmap)(const VFIOContainer *bcontainer,
> - VFIOBitmap *vbmap, hwaddr iova, hwaddr size, Error **errp);
> + VFIOBitmap *vbmap, hwaddr iova, hwaddr size,
> + uint64_t backend_flag, Error **errp);
> /* PCI specific */
> int (*pci_hot_reset)(VFIODevice *vbasedev, bool single);
>
> diff --git a/hw/vfio-user/container.c b/hw/vfio-user/container.c
> index e45192fef6..3ce6ea12db 100644
> --- a/hw/vfio-user/container.c
> +++ b/hw/vfio-user/container.c
> @@ -162,8 +162,9 @@ vfio_user_set_dirty_page_tracking(const VFIOContainer *bcontainer,
> }
>
> static int vfio_user_query_dirty_bitmap(const VFIOContainer *bcontainer,
> - VFIOBitmap *vbmap, hwaddr iova,
> - hwaddr size, Error **errp)
> + VFIOBitmap *vbmap, hwaddr iova,
> + hwaddr size, uint64_t backend_flag,
> + Error **errp)
> {
> error_setg_errno(errp, ENOTSUP, "Not supported");
> return -ENOTSUP;
> diff --git a/hw/vfio/container-legacy.c b/hw/vfio/container-legacy.c
> index b7e3b892b9..dd9c4a6a5a 100644
> --- a/hw/vfio/container-legacy.c
> +++ b/hw/vfio/container-legacy.c
> @@ -154,7 +154,7 @@ static int vfio_legacy_dma_unmap_one(const VFIOLegacyContainer *container,
> }
>
> if (need_dirty_sync) {
> - ret = vfio_container_query_dirty_bitmap(bcontainer, iova, size,
> + ret = vfio_container_query_dirty_bitmap(bcontainer, iova, size, 0,
> iotlb->translated_addr, &local_err);
> if (ret) {
> error_report_err(local_err);
> @@ -255,7 +255,8 @@ vfio_legacy_set_dirty_page_tracking(const VFIOContainer *bcontainer,
> }
>
> static int vfio_legacy_query_dirty_bitmap(const VFIOContainer *bcontainer,
> - VFIOBitmap *vbmap, hwaddr iova, hwaddr size, Error **errp)
> + VFIOBitmap *vbmap, hwaddr iova, hwaddr size,
> + uint64_t backend_flag, Error **errp)
> {
> const VFIOLegacyContainer *container = VFIO_IOMMU_LEGACY(bcontainer);
> struct vfio_iommu_type1_dirty_bitmap *dbitmap;
> diff --git a/hw/vfio/container.c b/hw/vfio/container.c
> index 9ddec300e3..7706603c1c 100644
> --- a/hw/vfio/container.c
> +++ b/hw/vfio/container.c
> @@ -213,13 +213,13 @@ static int vfio_device_dma_logging_report(VFIODevice *vbasedev, hwaddr iova,
>
> static int vfio_container_iommu_query_dirty_bitmap(
> const VFIOContainer *bcontainer, VFIOBitmap *vbmap, hwaddr iova,
> - hwaddr size, Error **errp)
> + hwaddr size, uint64_t backend_flag, Error **errp)
> {
> VFIOIOMMUClass *vioc = VFIO_IOMMU_GET_CLASS(bcontainer);
>
> g_assert(vioc->query_dirty_bitmap);
> return vioc->query_dirty_bitmap(bcontainer, vbmap, iova, size,
> - errp);
> + backend_flag, errp);
> }
>
> static int vfio_container_devices_query_dirty_bitmap(
> @@ -247,7 +247,9 @@ static int vfio_container_devices_query_dirty_bitmap(
>
> int vfio_container_query_dirty_bitmap(const VFIOContainer *bcontainer,
> uint64_t iova, uint64_t size,
> - hwaddr translated_addr, Error **errp)
> + uint64_t backend_flag,
> + hwaddr translated_addr,
> + Error **errp)
> {
> bool all_device_dirty_tracking =
> vfio_container_devices_dirty_tracking_is_supported(bcontainer);
> @@ -274,7 +276,7 @@ int vfio_container_query_dirty_bitmap(const VFIOContainer *bcontainer,
> errp);
> } else {
> ret = vfio_container_iommu_query_dirty_bitmap(bcontainer, &vbmap, iova, size,
> - errp);
> + backend_flag, errp);
> }
>
> if (ret) {
> @@ -285,8 +287,9 @@ int vfio_container_query_dirty_bitmap(const VFIOContainer *bcontainer,
> translated_addr,
> vbmap.pages);
>
> - trace_vfio_container_query_dirty_bitmap(iova, size, vbmap.size,
> - translated_addr, dirty_pages);
> + trace_vfio_container_query_dirty_bitmap(iova, size, backend_flag,
> + vbmap.size, translated_addr,
> + dirty_pages);
> out:
> g_free(vbmap.bitmap);
>
> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
> index 404e6249ca..6457cef344 100644
> --- a/hw/vfio/iommufd.c
> +++ b/hw/vfio/iommufd.c
> @@ -74,7 +74,7 @@ static int iommufd_cdev_unmap(const VFIOContainer *bcontainer,
> if (iotlb && vfio_container_dirty_tracking_is_started(bcontainer)) {
> if (!vfio_container_devices_dirty_tracking_is_supported(bcontainer) &&
> bcontainer->dirty_pages_supported) {
> - ret = vfio_container_query_dirty_bitmap(bcontainer, iova, size,
> + ret = vfio_container_query_dirty_bitmap(bcontainer, iova, size, 0,
> iotlb->translated_addr,
> &local_err);
> if (ret) {
> @@ -93,7 +93,7 @@ static int iommufd_cdev_unmap(const VFIOContainer *bcontainer,
> }
>
> if (need_dirty_sync) {
> - ret = vfio_container_query_dirty_bitmap(bcontainer, iova, size,
> + ret = vfio_container_query_dirty_bitmap(bcontainer, iova, size, 0,
> iotlb->translated_addr,
> &local_err);
> if (ret) {
> @@ -209,7 +209,8 @@ err:
>
> static int iommufd_query_dirty_bitmap(const VFIOContainer *bcontainer,
> VFIOBitmap *vbmap, hwaddr iova,
> - hwaddr size, Error **errp)
> + hwaddr size, uint64_t backend_flag,
> + Error **errp)
> {
> VFIOIOMMUFDContainer *container = VFIO_IOMMU_IOMMUFD(bcontainer);
> unsigned long page_size = qemu_real_host_page_size();
> diff --git a/hw/vfio/listener.c b/hw/vfio/listener.c
> index 2d7d3a4645..2109101158 100644
> --- a/hw/vfio/listener.c
> +++ b/hw/vfio/listener.c
> @@ -1083,7 +1083,7 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
> translated_addr = memory_region_get_ram_addr(mr) + xlat;
>
> ret = vfio_container_query_dirty_bitmap(bcontainer, iova, iotlb->addr_mask + 1,
> - translated_addr, &local_err);
> + 0, translated_addr, &local_err);
> if (ret) {
> error_prepend(&local_err,
> "vfio_iommu_map_dirty_notify(%p, 0x%"HWADDR_PRIx", "
> @@ -1119,7 +1119,7 @@ static int vfio_ram_discard_query_dirty_bitmap(MemoryRegionSection *section,
> * Sync the whole mapped region (spanning multiple individual mappings)
> * in one go.
> */
> - ret = vfio_container_query_dirty_bitmap(vrdl->bcontainer, iova, size,
> + ret = vfio_container_query_dirty_bitmap(vrdl->bcontainer, iova, size, 0,
> translated_addr, &local_err);
> if (ret) {
> error_report_err(local_err);
> @@ -1204,7 +1204,7 @@ static int vfio_sync_dirty_bitmap(VFIOContainer *bcontainer,
>
> return vfio_container_query_dirty_bitmap(bcontainer,
> REAL_HOST_PAGE_ALIGN(section->offset_within_address_space),
> - int128_get64(section->size), translated_addr, errp);
> + int128_get64(section->size), 0, translated_addr, errp);
> }
>
> static void vfio_listener_log_sync(MemoryListener *listener,
> diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
> index 1e895448cd..3c62bab764 100644
> --- a/hw/vfio/trace-events
> +++ b/hw/vfio/trace-events
> @@ -105,7 +105,7 @@ vfio_device_dirty_tracking_start(int nr_ranges, uint64_t min32, uint64_t max32,
> vfio_iommu_map_dirty_notify(uint64_t iova_start, uint64_t iova_end) "iommu dirty @ 0x%"PRIx64" - 0x%"PRIx64
>
> # container.c
> -vfio_container_query_dirty_bitmap(uint64_t iova, uint64_t size, uint64_t bitmap_size, uint64_t translated_addr, uint64_t dirty_pages) "iova=0x%"PRIx64" size= 0x%"PRIx64" bitmap_size=0x%"PRIx64" gpa=0x%"PRIx64" dirty_pages=%"PRIu64
> +vfio_container_query_dirty_bitmap(uint64_t iova, uint64_t size, uint64_t backend_flag, uint64_t bitmap_size, uint64_t translated_addr, uint64_t dirty_pages) "iova=0x%"PRIx64" size=0x%"PRIx64" backend_flag=0x%"PRIx64" bitmap_size=0x%"PRIx64" gpa=0x%"PRIx64" dirty_pages=%"PRIu64
>
> # container-legacy.c
> vfio_container_disconnect(int fd) "close container->fd=%d"
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 7/8] vfio/migration: Add migration blocker if VM memory is too large to cause unmap_bitmap failure
2025-10-17 8:22 ` [PATCH v2 7/8] vfio/migration: Add migration blocker if VM memory is too large to cause unmap_bitmap failure Zhenzhong Duan
2025-10-20 8:39 ` Cédric Le Goater
@ 2025-10-20 12:44 ` Yi Liu
2025-10-21 8:25 ` Duan, Zhenzhong
1 sibling, 1 reply; 34+ messages in thread
From: Yi Liu @ 2025-10-20 12:44 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, clg, mst, jasowang, clement.mathieu--drif,
eric.auger, joao.m.martins, avihaih, xudong.hao, giovanni.cabiddu,
mark.gross, arjan.van.de.ven
On 2025/10/17 16:22, Zhenzhong Duan wrote:
> With default config, kernel VFIO type1 driver limits dirty bitmap to 256MB
> for unmap_bitmap ioctl so the maximum guest memory region is no more than
> 8TB size for the ioctl to succeed.
>
> Be conservative here to limit total guest memory to 8TB or else add a
> migration blocker. IOMMUFD backend doesn't have such a limit; one can use
> an IOMMUFD-backed device if there is a need to migrate such a large VM.
>
> Suggested-by: Yi Liu <yi.l.liu@intel.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
> hw/vfio/migration.c | 37 +++++++++++++++++++++++++++++++++++++
> 1 file changed, 37 insertions(+)
>
> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> index 4c06e3db93..1106ca7857 100644
> --- a/hw/vfio/migration.c
> +++ b/hw/vfio/migration.c
> @@ -16,6 +16,7 @@
> #include <sys/ioctl.h>
>
> #include "system/runstate.h"
> +#include "hw/boards.h"
> #include "hw/vfio/vfio-device.h"
> #include "hw/vfio/vfio-migration.h"
> #include "migration/misc.h"
> @@ -1152,6 +1153,35 @@ static bool vfio_viommu_preset(VFIODevice *vbasedev)
> return vbasedev->bcontainer->space->as != &address_space_memory;
> }
>
> +static bool vfio_dirty_tracking_exceed_limit(VFIODevice *vbasedev)
> +{
> + VFIOContainer *bcontainer = vbasedev->bcontainer;
> + uint64_t max_size, page_size;
> +
> + if (!object_dynamic_cast(OBJECT(bcontainer), TYPE_VFIO_IOMMU_LEGACY)) {
> + return false;
> + }
> +
> + if (!bcontainer->dirty_pages_supported) {
> + return true;
> + }
> + /*
> + * VFIO type1 driver has a limitation of bitmap size on unmap_bitmap
> + * ioctl(), calculate the limit and compare with guest memory size to
> + * catch dirty tracking failure early.
> + *
> + * This limit is 8TB with default kernel and QEMU config, we are a bit
> + * conservative here as VM memory layout may be nonconsecutive or VM
> + * can run with vIOMMU enabled so the limitation could be relaxed. One
> + * can also switch to use IOMMUFD backend if there is a need to migrate
> + * large VM.
> + */
> + page_size = 1 << ctz64(bcontainer->dirty_pgsizes);
Should use qemu_real_host_page_size() here?
> + max_size = bcontainer->max_dirty_bitmap_size * BITS_PER_BYTE * page_size;
> +
> + return current_machine->ram_size > max_size;
> +}
> +
> /*
> * Return true when either migration initialized or blocker registered.
> * Currently only return false when adding blocker fails which will
> @@ -1208,6 +1238,13 @@ bool vfio_migration_realize(VFIODevice *vbasedev, Error **errp)
> goto add_blocker;
> }
>
> + if (vfio_dirty_tracking_exceed_limit(vbasedev)) {
> + error_setg(&err, "%s: Migration is currently not supported with "
> + "large memory VM due to dirty tracking limitation in "
> + "VFIO type1 driver", vbasedev->name);
> + goto add_blocker;
> + }
> +
> trace_vfio_migration_realize(vbasedev->name);
> return true;
>
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 2/8] vfio/iommufd: Query dirty bitmap before DMA unmap
2025-10-20 10:00 ` Duan, Zhenzhong
@ 2025-10-20 12:45 ` Yi Liu
2025-10-20 13:04 ` Avihai Horon
0 siblings, 1 reply; 34+ messages in thread
From: Yi Liu @ 2025-10-20 12:45 UTC (permalink / raw)
To: Duan, Zhenzhong, Avihai Horon, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
jasowang@redhat.com, clement.mathieu--drif@eviden.com,
eric.auger@redhat.com, joao.m.martins@oracle.com, Hao, Xudong,
Cabiddu, Giovanni, Gross, Mark, Van De Ven, Arjan
On 2025/10/20 18:00, Duan, Zhenzhong wrote:
> Hi
>
>> -----Original Message-----
>> From: Avihai Horon <avihaih@nvidia.com>
>> Subject: Re: [PATCH v2 2/8] vfio/iommufd: Query dirty bitmap before DMA
>> unmap
>>
>> Hi,
>>
>> On 17/10/2025 11:22, Zhenzhong Duan wrote:
>>> When a existing mapping is unmapped, there could already be dirty bits
>>> which need to be recorded before unmap.
>>>
>>> If query dirty bitmap fails, we still need to do unmapping or else there
>>> is stale mapping and it's risky to guest.
>>>
>>> Co-developed-by: Joao Martins <joao.m.martins@oracle.com>
>>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>> Tested-by: Xudong Hao <xudong.hao@intel.com>
>>> Tested-by: Giovannio Cabiddu <giovanni.cabiddu@intel.com>
>>> ---
>>> hw/vfio/iommufd.c | 8 +++++++-
>>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>>> index 976c0a8814..404e6249ca 100644
>>> --- a/hw/vfio/iommufd.c
>>> +++ b/hw/vfio/iommufd.c
>>> @@ -74,7 +74,13 @@ static int iommufd_cdev_unmap(const VFIOContainer *bcontainer,
>>>       if (iotlb && vfio_container_dirty_tracking_is_started(bcontainer)) {
>>>           if (!vfio_container_devices_dirty_tracking_is_supported(bcontainer) &&
>>>               bcontainer->dirty_pages_supported) {
>>> -            /* TODO: query dirty bitmap before DMA unmap */
>>> +            ret = vfio_container_query_dirty_bitmap(bcontainer, iova, size,
>>> +                                                    iotlb->translated_addr,
>>> +                                                    &local_err);
>>> +            if (ret) {
>>> +                error_report_err(local_err);
>>> +            }
>>> +            /* Unmap stale mapping even if query dirty bitmap fails */
>>>               return iommufd_backend_unmap_dma(be, ioas_id, iova, size);
>>
>> If query dirty bitmap fails, shouldn't we unmap and return the query
>> bitmap error to fail migration? Otherwise, migration may succeed with
>> some dirtied pages not being migrated.
>
> Oh, good catch. Will make below change:
>
> --- a/hw/vfio/iommufd.c
> +++ b/hw/vfio/iommufd.c
> @@ -65,7 +65,7 @@ static int iommufd_cdev_unmap(const VFIOContainer *bcontainer,
> uint32_t ioas_id = container->ioas_id;
> bool need_dirty_sync = false;
> Error *local_err = NULL;
> - int ret;
> + int ret, unmap_ret;
>
> if (unmap_all) {
> size = UINT64_MAX;
> @@ -82,7 +82,14 @@ static int iommufd_cdev_unmap(const VFIOContainer *bcontainer,
> error_report_err(local_err);
> }
> /* Unmap stale mapping even if query dirty bitmap fails */
> - return iommufd_backend_unmap_dma(be, ioas_id, iova, size);
> + unmap_ret = iommufd_backend_unmap_dma(be, ioas_id, iova, size);
> +
> + /*
> + * If dirty tracking fails, return the failure to VFIO core to
> + * fail the migration, or else there will be dirty pages missed
> + * to be migrated.
> + */
> + return unmap_ret ? : ret;
> }
Do we need an async way to fail migration? This unmap path is not
necessarily in the migration path.
Regards,
Yi Liu
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 5/8] vfio/iommufd: Add IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag support
2025-10-17 8:22 ` [PATCH v2 5/8] vfio/iommufd: Add IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag support Zhenzhong Duan
2025-10-20 7:01 ` Cédric Le Goater
@ 2025-10-20 12:45 ` Yi Liu
1 sibling, 0 replies; 34+ messages in thread
From: Yi Liu @ 2025-10-20 12:45 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, clg, mst, jasowang, clement.mathieu--drif,
eric.auger, joao.m.martins, avihaih, xudong.hao, giovanni.cabiddu,
mark.gross, arjan.van.de.ven
On 2025/10/17 16:22, Zhenzhong Duan wrote:
> Pass IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR when doing the last dirty
> bitmap query right before unmap, no PTEs flushes. This accelerates the
> query without issue because unmap will tear down the mapping anyway.
>
> Co-developed-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> Tested-by: Xudong Hao <xudong.hao@intel.com>
> Tested-by: Giovannio Cabiddu <giovanni.cabiddu@intel.com>
> ---
> include/system/iommufd.h | 2 +-
> backends/iommufd.c | 5 +++--
> hw/vfio/iommufd.c | 5 +++--
> backends/trace-events | 2 +-
> 4 files changed, 8 insertions(+), 6 deletions(-)
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
> diff --git a/include/system/iommufd.h b/include/system/iommufd.h
> index a659f36a20..767a8e4cb6 100644
> --- a/include/system/iommufd.h
> +++ b/include/system/iommufd.h
> @@ -64,7 +64,7 @@ bool iommufd_backend_set_dirty_tracking(IOMMUFDBackend *be, uint32_t hwpt_id,
> bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be, uint32_t hwpt_id,
> uint64_t iova, ram_addr_t size,
> uint64_t page_size, uint64_t *data,
> - Error **errp);
> + uint64_t flags, Error **errp);
> bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t id,
> uint32_t data_type, uint32_t entry_len,
> uint32_t *entry_num, void *data,
> diff --git a/backends/iommufd.c b/backends/iommufd.c
> index fdfb7c9d67..086bd67aea 100644
> --- a/backends/iommufd.c
> +++ b/backends/iommufd.c
> @@ -361,7 +361,7 @@ bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be,
> uint32_t hwpt_id,
> uint64_t iova, ram_addr_t size,
> uint64_t page_size, uint64_t *data,
> - Error **errp)
> + uint64_t flags, Error **errp)
> {
> int ret;
> struct iommu_hwpt_get_dirty_bitmap get_dirty_bitmap = {
> @@ -371,11 +371,12 @@ bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be,
> .length = size,
> .page_size = page_size,
> .data = (uintptr_t)data,
> + .flags = flags,
> };
>
> ret = ioctl(be->fd, IOMMU_HWPT_GET_DIRTY_BITMAP, &get_dirty_bitmap);
> trace_iommufd_backend_get_dirty_bitmap(be->fd, hwpt_id, iova, size,
> - page_size, ret ? errno : 0);
> + flags, page_size, ret ? errno : 0);
> if (ret) {
> error_setg_errno(errp, errno,
> "IOMMU_HWPT_GET_DIRTY_BITMAP (iova: 0x%"HWADDR_PRIx
> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
> index 6457cef344..937b80340c 100644
> --- a/hw/vfio/iommufd.c
> +++ b/hw/vfio/iommufd.c
> @@ -74,7 +74,8 @@ static int iommufd_cdev_unmap(const VFIOContainer *bcontainer,
> if (iotlb && vfio_container_dirty_tracking_is_started(bcontainer)) {
> if (!vfio_container_devices_dirty_tracking_is_supported(bcontainer) &&
> bcontainer->dirty_pages_supported) {
> - ret = vfio_container_query_dirty_bitmap(bcontainer, iova, size, 0,
> + ret = vfio_container_query_dirty_bitmap(bcontainer, iova, size,
> + IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR,
> iotlb->translated_addr,
> &local_err);
> if (ret) {
> @@ -224,7 +225,7 @@ static int iommufd_query_dirty_bitmap(const VFIOContainer *bcontainer,
> if (!iommufd_backend_get_dirty_bitmap(container->be, hwpt->hwpt_id,
> iova, size, page_size,
> (uint64_t *)vbmap->bitmap,
> - errp)) {
> + backend_flag, errp)) {
> return -EINVAL;
> }
> }
> diff --git a/backends/trace-events b/backends/trace-events
> index 56132d3fd2..e1992ba12f 100644
> --- a/backends/trace-events
> +++ b/backends/trace-events
> @@ -19,5 +19,5 @@ iommufd_backend_alloc_ioas(int iommufd, uint32_t ioas) " iommufd=%d ioas=%d"
> iommufd_backend_alloc_hwpt(int iommufd, uint32_t dev_id, uint32_t pt_id, uint32_t flags, uint32_t hwpt_type, uint32_t len, uint64_t data_ptr, uint32_t out_hwpt_id, int ret) " iommufd=%d dev_id=%u pt_id=%u flags=0x%x hwpt_type=%u len=%u data_ptr=0x%"PRIx64" out_hwpt=%u (%d)"
> iommufd_backend_free_id(int iommufd, uint32_t id, int ret) " iommufd=%d id=%d (%d)"
> iommufd_backend_set_dirty(int iommufd, uint32_t hwpt_id, bool start, int ret) " iommufd=%d hwpt=%u enable=%d (%d)"
> -iommufd_backend_get_dirty_bitmap(int iommufd, uint32_t hwpt_id, uint64_t iova, uint64_t size, uint64_t page_size, int ret) " iommufd=%d hwpt=%u iova=0x%"PRIx64" size=0x%"PRIx64" page_size=0x%"PRIx64" (%d)"
> +iommufd_backend_get_dirty_bitmap(int iommufd, uint32_t hwpt_id, uint64_t iova, uint64_t size, uint64_t flags, uint64_t page_size, int ret) " iommufd=%d hwpt=%u iova=0x%"PRIx64" size=0x%"PRIx64" flags=0x%"PRIx64" page_size=0x%"PRIx64" (%d)"
> iommufd_backend_invalidate_cache(int iommufd, uint32_t id, uint32_t data_type, uint32_t entry_len, uint32_t entry_num, uint32_t done_num, uint64_t data_ptr, int ret) " iommufd=%d id=%u data_type=%u entry_len=%u entry_num=%u done_num=%u data_ptr=0x%"PRIx64" (%d)"
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 3/8] vfio/container-legacy: rename vfio_dma_unmap_bitmap() to vfio_legacy_dma_unmap_get_dirty_bitmap()
2025-10-17 8:22 ` [PATCH v2 3/8] vfio/container-legacy: rename vfio_dma_unmap_bitmap() to vfio_legacy_dma_unmap_get_dirty_bitmap() Zhenzhong Duan
2025-10-20 7:00 ` Cédric Le Goater
@ 2025-10-20 12:45 ` Yi Liu
1 sibling, 0 replies; 34+ messages in thread
From: Yi Liu @ 2025-10-20 12:45 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, clg, mst, jasowang, clement.mathieu--drif,
eric.auger, joao.m.martins, avihaih, xudong.hao, giovanni.cabiddu,
mark.gross, arjan.van.de.ven
On 2025/10/17 16:22, Zhenzhong Duan wrote:
> This follows the naming style in container-legacy.c, where low-level
> functions carry a vfio_legacy_ prefix.
>
> No functional changes.
>
> Suggested-by: Cédric Le Goater <clg@redhat.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
> hw/vfio/container-legacy.c | 10 ++++++----
> 1 file changed, 6 insertions(+), 4 deletions(-)
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
> diff --git a/hw/vfio/container-legacy.c b/hw/vfio/container-legacy.c
> index 8e9639603e..b7e3b892b9 100644
> --- a/hw/vfio/container-legacy.c
> +++ b/hw/vfio/container-legacy.c
> @@ -68,9 +68,10 @@ static int vfio_ram_block_discard_disable(VFIOLegacyContainer *container,
> }
> }
>
> -static int vfio_dma_unmap_bitmap(const VFIOLegacyContainer *container,
> - hwaddr iova, uint64_t size,
> - IOMMUTLBEntry *iotlb)
> +static int
> +vfio_legacy_dma_unmap_get_dirty_bitmap(const VFIOLegacyContainer *container,
> + hwaddr iova, uint64_t size,
> + IOMMUTLBEntry *iotlb)
> {
> const VFIOContainer *bcontainer = VFIO_IOMMU(container);
> struct vfio_iommu_type1_dma_unmap *unmap;
> @@ -141,7 +142,8 @@ static int vfio_legacy_dma_unmap_one(const VFIOLegacyContainer *container,
> if (iotlb && vfio_container_dirty_tracking_is_started(bcontainer)) {
> if (!vfio_container_devices_dirty_tracking_is_supported(bcontainer) &&
> bcontainer->dirty_pages_supported) {
> - return vfio_dma_unmap_bitmap(container, iova, size, iotlb);
> + return vfio_legacy_dma_unmap_get_dirty_bitmap(container, iova, size,
> + iotlb);
> }
>
> need_dirty_sync = true;
* Re: [PATCH v2 8/8] vfio/migration: Allow live migration with vIOMMU without VFs using device dirty tracking
2025-10-17 8:22 ` [PATCH v2 8/8] vfio/migration: Allow live migration with vIOMMU without VFs using device dirty tracking Zhenzhong Duan
@ 2025-10-20 12:46 ` Yi Liu
0 siblings, 0 replies; 34+ messages in thread
From: Yi Liu @ 2025-10-20 12:46 UTC (permalink / raw)
To: Zhenzhong Duan, qemu-devel
Cc: alex.williamson, clg, mst, jasowang, clement.mathieu--drif,
eric.auger, joao.m.martins, avihaih, xudong.hao, giovanni.cabiddu,
mark.gross, arjan.van.de.ven, Jason Zeng
On 2025/10/17 16:22, Zhenzhong Duan wrote:
> Commit e46883204c38 ("vfio/migration: Block migration with vIOMMU")
> introduces a migration blocker when vIOMMU is enabled, because we need
> to calculate the IOVA ranges for device dirty tracking. But this is
> unnecessary for iommu dirty tracking.
>
> Limit the vfio_viommu_preset() check to those devices which use device
> dirty tracking. This allows live migration with VFIO devices which use
> iommu dirty tracking.
>
> Introduce a helper vfio_device_dirty_pages_disabled() to facilicate it.
s/facilicate/facilitate/
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
>
> Suggested-by: Jason Zeng <jason.zeng@intel.com>
> Co-developed-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> Tested-by: Xudong Hao <xudong.hao@intel.com>
> Tested-by: Giovannio Cabiddu <giovanni.cabiddu@intel.com>
> ---
> include/hw/vfio/vfio-device.h | 10 ++++++++++
> hw/vfio/container.c | 5 +----
> hw/vfio/device.c | 6 ++++++
> hw/vfio/migration.c | 6 +++---
> 4 files changed, 20 insertions(+), 7 deletions(-)
>
> diff --git a/include/hw/vfio/vfio-device.h b/include/hw/vfio/vfio-device.h
> index 7e9aed6d3c..feda521514 100644
> --- a/include/hw/vfio/vfio-device.h
> +++ b/include/hw/vfio/vfio-device.h
> @@ -148,6 +148,16 @@ bool vfio_device_irq_set_signaling(VFIODevice *vbasedev, int index, int subindex
>
> void vfio_device_reset_handler(void *opaque);
> bool vfio_device_is_mdev(VFIODevice *vbasedev);
> +/**
> + * vfio_device_dirty_pages_disabled: Check if device dirty tracking will be
> + * used for a VFIO device
> + *
> + * @vbasedev: The VFIODevice to check
> + *
> + * Return: true if @vbasedev either doesn't support device dirty tracking
> + * or has it forcibly disabled on the command line, false otherwise.
> + */
> +bool vfio_device_dirty_pages_disabled(VFIODevice *vbasedev);
> bool vfio_device_hiod_create_and_realize(VFIODevice *vbasedev,
> const char *typename, Error **errp);
> bool vfio_device_attach(char *name, VFIODevice *vbasedev,
> diff --git a/hw/vfio/container.c b/hw/vfio/container.c
> index 7706603c1c..8879da78c8 100644
> --- a/hw/vfio/container.c
> +++ b/hw/vfio/container.c
> @@ -178,10 +178,7 @@ bool vfio_container_devices_dirty_tracking_is_supported(
> VFIODevice *vbasedev;
>
> QLIST_FOREACH(vbasedev, &bcontainer->device_list, container_next) {
> - if (vbasedev->device_dirty_page_tracking == ON_OFF_AUTO_OFF) {
> - return false;
> - }
> - if (!vbasedev->dirty_pages_supported) {
> + if (vfio_device_dirty_pages_disabled(vbasedev)) {
> return false;
> }
> }
> diff --git a/hw/vfio/device.c b/hw/vfio/device.c
> index 64f8750389..837872387f 100644
> --- a/hw/vfio/device.c
> +++ b/hw/vfio/device.c
> @@ -400,6 +400,12 @@ bool vfio_device_is_mdev(VFIODevice *vbasedev)
> return subsys && (strcmp(subsys, "/sys/bus/mdev") == 0);
> }
>
> +bool vfio_device_dirty_pages_disabled(VFIODevice *vbasedev)
> +{
> + return (!vbasedev->dirty_pages_supported ||
> + vbasedev->device_dirty_page_tracking == ON_OFF_AUTO_OFF);
> +}
> +
> bool vfio_device_hiod_create_and_realize(VFIODevice *vbasedev,
> const char *typename, Error **errp)
> {
> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> index 1106ca7857..1093857a34 100644
> --- a/hw/vfio/migration.c
> +++ b/hw/vfio/migration.c
> @@ -1213,8 +1213,7 @@ bool vfio_migration_realize(VFIODevice *vbasedev, Error **errp)
> return !vfio_block_migration(vbasedev, err, errp);
> }
>
> - if ((!vbasedev->dirty_pages_supported ||
> - vbasedev->device_dirty_page_tracking == ON_OFF_AUTO_OFF) &&
> + if (vfio_device_dirty_pages_disabled(vbasedev) &&
> !vbasedev->iommu_dirty_tracking) {
> if (vbasedev->enable_migration == ON_OFF_AUTO_AUTO) {
> error_setg(&err,
> @@ -1232,7 +1231,8 @@ bool vfio_migration_realize(VFIODevice *vbasedev, Error **errp)
> goto out_deinit;
> }
>
> - if (vfio_viommu_preset(vbasedev)) {
> + if (!vfio_device_dirty_pages_disabled(vbasedev) &&
> + vfio_viommu_preset(vbasedev)) {
> error_setg(&err, "%s: Migration is currently not supported "
> "with vIOMMU enabled", vbasedev->name);
> goto add_blocker;
* Re: [PATCH v2 2/8] vfio/iommufd: Query dirty bitmap before DMA unmap
2025-10-20 12:45 ` Yi Liu
@ 2025-10-20 13:04 ` Avihai Horon
2025-10-21 3:30 ` Yi Liu
0 siblings, 1 reply; 34+ messages in thread
From: Avihai Horon @ 2025-10-20 13:04 UTC (permalink / raw)
To: Yi Liu, Duan, Zhenzhong, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
jasowang@redhat.com, clement.mathieu--drif@eviden.com,
eric.auger@redhat.com, joao.m.martins@oracle.com, Hao, Xudong,
Cabiddu, Giovanni, Gross, Mark, Van De Ven, Arjan
On 20/10/2025 15:45, Yi Liu wrote:
> On 2025/10/20 18:00, Duan, Zhenzhong wrote:
>> Hi
>>
>>> -----Original Message-----
>>> From: Avihai Horon <avihaih@nvidia.com>
>>> Subject: Re: [PATCH v2 2/8] vfio/iommufd: Query dirty bitmap before DMA
>>> unmap
>>>
>>> Hi,
>>>
>>> On 17/10/2025 11:22, Zhenzhong Duan wrote:
>>>> When an existing mapping is unmapped, there could already be dirty
>>>> bits which need to be recorded before the unmap.
>>>>
>>>> If querying the dirty bitmap fails, we still need to do the unmap, or
>>>> else a stale mapping remains and that is risky to the guest.
>>>>
>>>> Co-developed-by: Joao Martins <joao.m.martins@oracle.com>
>>>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
>>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>>> Tested-by: Xudong Hao <xudong.hao@intel.com>
>>>> Tested-by: Giovannio Cabiddu <giovanni.cabiddu@intel.com>
>>>> ---
>>>> hw/vfio/iommufd.c | 8 +++++++-
>>>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>>>> index 976c0a8814..404e6249ca 100644
>>>> --- a/hw/vfio/iommufd.c
>>>> +++ b/hw/vfio/iommufd.c
>>>> @@ -74,7 +74,13 @@ static int iommufd_cdev_unmap(const VFIOContainer *bcontainer,
>>>> if (iotlb && vfio_container_dirty_tracking_is_started(bcontainer)) {
>>>> if (!vfio_container_devices_dirty_tracking_is_supported(bcontainer) &&
>>>> bcontainer->dirty_pages_supported) {
>>>> - /* TODO: query dirty bitmap before DMA unmap */
>>>> + ret = vfio_container_query_dirty_bitmap(bcontainer, iova, size,
>>>> + iotlb->translated_addr,
>>>> + &local_err);
>>>> + if (ret) {
>>>> + error_report_err(local_err);
>>>> + }
>>>> + /* Unmap stale mapping even if query dirty bitmap fails */
>>>> + return iommufd_backend_unmap_dma(be, ioas_id, iova, size);
>>>
>>> If query dirty bitmap fails, shouldn't we unmap and return the query
>>> bitmap error to fail migration? Otherwise, migration may succeed with
>>> some dirtied pages not being migrated.
>>
>> Oh, good catch. Will make below change:
>>
>> --- a/hw/vfio/iommufd.c
>> +++ b/hw/vfio/iommufd.c
>> @@ -65,7 +65,7 @@ static int iommufd_cdev_unmap(const VFIOContainer *bcontainer,
>> uint32_t ioas_id = container->ioas_id;
>> bool need_dirty_sync = false;
>> Error *local_err = NULL;
>> - int ret;
>> + int ret, unmap_ret;
>>
>> if (unmap_all) {
>> size = UINT64_MAX;
>> @@ -82,7 +82,14 @@ static int iommufd_cdev_unmap(const VFIOContainer *bcontainer,
>> error_report_err(local_err);
>> }
>> /* Unmap stale mapping even if query dirty bitmap fails */
>> - return iommufd_backend_unmap_dma(be, ioas_id, iova, size);
>> + unmap_ret = iommufd_backend_unmap_dma(be, ioas_id, iova, size);
>> +
>> + /*
>> + * If dirty tracking fails, return the failure to the VFIO core
>> + * to fail the migration, or else some dirty pages will be
>> + * missed and not migrated.
>> + */
>> + return unmap_ret ? : ret;
>> }
>
> do we need a async way to fail migration? This unmap path is not
> necessarily in the migration path.
I think there is a check in the upper layers: if migration is active, the
migration error is set so that migration fails.
vfio_iommu_map_notify():
...
    ret = vfio_container_dma_unmap(bcontainer, iova,
                                   iotlb->addr_mask + 1, iotlb, false);
    if (ret) {
        error_setg(&local_err,
                   "vfio_container_dma_unmap(%p, 0x%"HWADDR_PRIx", "
                   "0x%"HWADDR_PRIx") = %d (%s)",
                   bcontainer, iova,
                   iotlb->addr_mask + 1, ret, strerror(-ret));
        if (migration_is_running()) {
            migration_file_set_error(ret, local_err);
        } else {
            error_report_err(local_err);
        }
    }
Thanks.
* Re: [PATCH v2 2/8] vfio/iommufd: Query dirty bitmap before DMA unmap
2025-10-20 13:04 ` Avihai Horon
@ 2025-10-21 3:30 ` Yi Liu
0 siblings, 0 replies; 34+ messages in thread
From: Yi Liu @ 2025-10-21 3:30 UTC (permalink / raw)
To: Avihai Horon, Duan, Zhenzhong, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
jasowang@redhat.com, clement.mathieu--drif@eviden.com,
eric.auger@redhat.com, joao.m.martins@oracle.com, Hao, Xudong,
Cabiddu, Giovanni, Gross, Mark, Van De Ven, Arjan
On 2025/10/20 21:04, Avihai Horon wrote:
>
> On 20/10/2025 15:45, Yi Liu wrote:
>> On 2025/10/20 18:00, Duan, Zhenzhong wrote:
>>> Hi
>>>
>>>> -----Original Message-----
>>>> From: Avihai Horon <avihaih@nvidia.com>
>>>> Subject: Re: [PATCH v2 2/8] vfio/iommufd: Query dirty bitmap before DMA
>>>> unmap
>>>>
>>>> Hi,
>>>>
>>>> On 17/10/2025 11:22, Zhenzhong Duan wrote:
>>>>> When an existing mapping is unmapped, there could already be dirty
>>>>> bits which need to be recorded before the unmap.
>>>>>
>>>>> If querying the dirty bitmap fails, we still need to do the unmap, or
>>>>> else a stale mapping remains and that is risky to the guest.
>>>>>
>>>>> Co-developed-by: Joao Martins <joao.m.martins@oracle.com>
>>>>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
>>>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>>>> Tested-by: Xudong Hao <xudong.hao@intel.com>
>>>>> Tested-by: Giovannio Cabiddu <giovanni.cabiddu@intel.com>
>>>>> ---
>>>>> hw/vfio/iommufd.c | 8 +++++++-
>>>>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>>>>> index 976c0a8814..404e6249ca 100644
>>>>> --- a/hw/vfio/iommufd.c
>>>>> +++ b/hw/vfio/iommufd.c
>>>>> @@ -74,7 +74,13 @@ static int iommufd_cdev_unmap(const VFIOContainer *bcontainer,
>>>>> if (iotlb && vfio_container_dirty_tracking_is_started(bcontainer)) {
>>>>> if (!vfio_container_devices_dirty_tracking_is_supported(bcontainer) &&
>>>>> bcontainer->dirty_pages_supported) {
>>>>> - /* TODO: query dirty bitmap before DMA unmap */
>>>>> + ret = vfio_container_query_dirty_bitmap(bcontainer, iova, size,
>>>>> + iotlb->translated_addr,
>>>>> + &local_err);
>>>>> + if (ret) {
>>>>> + error_report_err(local_err);
>>>>> + }
>>>>> + /* Unmap stale mapping even if query dirty bitmap fails */
>>>>> + return iommufd_backend_unmap_dma(be, ioas_id, iova, size);
>>>>
>>>> If query dirty bitmap fails, shouldn't we unmap and return the query
>>>> bitmap error to fail migration? Otherwise, migration may succeed with
>>>> some dirtied pages not being migrated.
>>>
>>> Oh, good catch. Will make below change:
>>>
>>> --- a/hw/vfio/iommufd.c
>>> +++ b/hw/vfio/iommufd.c
>>> @@ -65,7 +65,7 @@ static int iommufd_cdev_unmap(const VFIOContainer *bcontainer,
>>> uint32_t ioas_id = container->ioas_id;
>>> bool need_dirty_sync = false;
>>> Error *local_err = NULL;
>>> - int ret;
>>> + int ret, unmap_ret;
>>>
>>> if (unmap_all) {
>>> size = UINT64_MAX;
>>> @@ -82,7 +82,14 @@ static int iommufd_cdev_unmap(const VFIOContainer *bcontainer,
>>> error_report_err(local_err);
>>> }
>>> /* Unmap stale mapping even if query dirty bitmap fails */
>>> - return iommufd_backend_unmap_dma(be, ioas_id, iova, size);
>>> + unmap_ret = iommufd_backend_unmap_dma(be, ioas_id, iova, size);
>>> +
>>> + /*
>>> + * If dirty tracking fails, return the failure to the VFIO core
>>> + * to fail the migration, or else some dirty pages will be
>>> + * missed and not migrated.
>>> + */
>>> + return unmap_ret ? : ret;
>>> }
>>
>> do we need a async way to fail migration? This unmap path is not
>> necessarily in the migration path.
>
> I think in upper layers there is a check, if migration is active then
> migration error is set to fail it.
>
> vfio_iommu_map_notify():
>
> ...
>     ret = vfio_container_dma_unmap(bcontainer, iova,
>                                    iotlb->addr_mask + 1, iotlb, false);
>     if (ret) {
>         error_setg(&local_err,
>                    "vfio_container_dma_unmap(%p, 0x%"HWADDR_PRIx", "
>                    "0x%"HWADDR_PRIx") = %d (%s)",
>                    bcontainer, iova,
>                    iotlb->addr_mask + 1, ret, strerror(-ret));
>         if (migration_is_running()) {
>             migration_file_set_error(ret, local_err);
>         } else {
>             error_report_err(local_err);
>         }
>     }
>
got it. thanks. :)
* RE: [PATCH v2 7/8] vfio/migration: Add migration blocker if VM memory is too large to cause unmap_bitmap failure
2025-10-20 12:44 ` Yi Liu
@ 2025-10-21 8:25 ` Duan, Zhenzhong
0 siblings, 0 replies; 34+ messages in thread
From: Duan, Zhenzhong @ 2025-10-21 8:25 UTC (permalink / raw)
To: Liu, Yi L, qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, mst@redhat.com,
jasowang@redhat.com, clement.mathieu--drif@eviden.com,
eric.auger@redhat.com, joao.m.martins@oracle.com,
avihaih@nvidia.com, Hao, Xudong, Cabiddu, Giovanni, Gross, Mark,
Van De Ven, Arjan
>-----Original Message-----
>From: Liu, Yi L <yi.l.liu@intel.com>
>Subject: Re: [PATCH v2 7/8] vfio/migration: Add migration blocker if VM
>memory is too large to cause unmap_bitmap failure
>
>On 2025/10/17 16:22, Zhenzhong Duan wrote:
>> With the default config, the kernel VFIO type1 driver limits the dirty
>> bitmap to 256MB for the unmap_bitmap ioctl, so the maximum guest memory
>> region is no more than 8TB in size for the ioctl to succeed.
>>
>> Be conservative here and limit total guest memory to 8TB, or else add a
>> migration blocker. The IOMMUFD backend doesn't have such a limit; one
>> can use an IOMMUFD-backed device if there is a need to migrate such a
>> large VM.
>>
>> Suggested-by: Yi Liu <yi.l.liu@intel.com>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>> hw/vfio/migration.c | 37 +++++++++++++++++++++++++++++++++++++
>> 1 file changed, 37 insertions(+)
>>
>> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
>> index 4c06e3db93..1106ca7857 100644
>> --- a/hw/vfio/migration.c
>> +++ b/hw/vfio/migration.c
>> @@ -16,6 +16,7 @@
>> #include <sys/ioctl.h>
>>
>> #include "system/runstate.h"
>> +#include "hw/boards.h"
>> #include "hw/vfio/vfio-device.h"
>> #include "hw/vfio/vfio-migration.h"
>> #include "migration/misc.h"
>> @@ -1152,6 +1153,35 @@ static bool vfio_viommu_preset(VFIODevice *vbasedev)
>> return vbasedev->bcontainer->space->as != &address_space_memory;
>> }
>>
>> +static bool vfio_dirty_tracking_exceed_limit(VFIODevice *vbasedev)
>> +{
>> + VFIOContainer *bcontainer = vbasedev->bcontainer;
>> + uint64_t max_size, page_size;
>> +
>> + if (!object_dynamic_cast(OBJECT(bcontainer), TYPE_VFIO_IOMMU_LEGACY)) {
>> + return false;
>> + }
>> +
>> + if (!bcontainer->dirty_pages_supported) {
>> + return true;
>> + }
>> + /*
>> + * The VFIO type1 driver limits the bitmap size on the unmap_bitmap
>> + * ioctl(); calculate the limit and compare it with the guest memory
>> + * size to catch dirty tracking failure early.
>> + *
>> + * This limit is 8TB with the default kernel and QEMU config. We are
>> + * a bit conservative here, as the VM memory layout may be
>> + * nonconsecutive or the VM can run with vIOMMU enabled, so the
>> + * limitation could be relaxed. One can also switch to the IOMMUFD
>> + * backend if there is a need to migrate such a large VM.
>> + */
>> + page_size = 1 << ctz64(bcontainer->dirty_pgsizes);
>
>Should use qemu_real_host_page_size() here?
Hmm, I think that's the host MMU page size, which is not as accurate as the IOMMU page sizes; here we want the IOMMU ones.
Thanks
Zhenzhong
end of thread, other threads:[~2025-10-21 8:25 UTC | newest]
Thread overview: 34+ messages
2025-10-17 8:22 [PATCH v2 0/8] vfio: relax the vIOMMU check Zhenzhong Duan
2025-10-17 8:22 ` [PATCH v2 1/8] vfio/iommufd: Add framework code to support getting dirty bitmap before unmap Zhenzhong Duan
2025-10-20 7:00 ` Cédric Le Goater
2025-10-20 12:44 ` Yi Liu
2025-10-17 8:22 ` [PATCH v2 2/8] vfio/iommufd: Query dirty bitmap before DMA unmap Zhenzhong Duan
2025-10-20 7:00 ` Cédric Le Goater
2025-10-20 8:14 ` Avihai Horon
2025-10-20 10:00 ` Duan, Zhenzhong
2025-10-20 12:45 ` Yi Liu
2025-10-20 13:04 ` Avihai Horon
2025-10-21 3:30 ` Yi Liu
2025-10-20 12:44 ` Yi Liu
2025-10-17 8:22 ` [PATCH v2 3/8] vfio/container-legacy: rename vfio_dma_unmap_bitmap() to vfio_legacy_dma_unmap_get_dirty_bitmap() Zhenzhong Duan
2025-10-20 7:00 ` Cédric Le Goater
2025-10-20 12:45 ` Yi Liu
2025-10-17 8:22 ` [PATCH v2 4/8] vfio: Add a backend_flag parameter to vfio_contianer_query_dirty_bitmap() Zhenzhong Duan
2025-10-20 7:01 ` Cédric Le Goater
2025-10-20 12:44 ` Yi Liu
2025-10-17 8:22 ` [PATCH v2 5/8] vfio/iommufd: Add IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag support Zhenzhong Duan
2025-10-20 7:01 ` Cédric Le Goater
2025-10-20 12:45 ` Yi Liu
2025-10-17 8:22 ` [PATCH v2 6/8] intel_iommu: Fix unmap_bitmap failure with legacy VFIO backend Zhenzhong Duan
2025-10-20 7:06 ` Cédric Le Goater
2025-10-20 7:38 ` Yi Liu
2025-10-20 8:03 ` Duan, Zhenzhong
2025-10-20 8:37 ` Yi Liu
2025-10-20 10:01 ` Duan, Zhenzhong
2025-10-17 8:22 ` [PATCH v2 7/8] vfio/migration: Add migration blocker if VM memory is too large to cause unmap_bitmap failure Zhenzhong Duan
2025-10-20 8:39 ` Cédric Le Goater
2025-10-20 10:07 ` Duan, Zhenzhong
2025-10-20 12:44 ` Yi Liu
2025-10-21 8:25 ` Duan, Zhenzhong
2025-10-17 8:22 ` [PATCH v2 8/8] vfio/migration: Allow live migration with vIOMMU without VFs using device dirty tracking Zhenzhong Duan
2025-10-20 12:46 ` Yi Liu