* [PATCH v3 0/6] vfio/migration: Block VFIO migration with postcopy and background snapshot
@ 2023-09-06 15:08 Avihai Horon
2023-09-06 15:08 ` [PATCH v3 1/6] migration: Add migration prefix to functions in target.c Avihai Horon
` (6 more replies)
0 siblings, 7 replies; 12+ messages in thread
From: Avihai Horon @ 2023-09-06 15:08 UTC (permalink / raw)
To: qemu-devel
Cc: Alex Williamson, Cédric Le Goater, Juan Quintela, Peter Xu,
Leonardo Bras, Yanghang Liu, Avihai Horon
Hello,
Recently added VFIO migration is not compatible with some of the
pre-existing migration features. This was overlooked and today these
combinations are not blocked by QEMU. This series fixes it.
Postcopy migration:
VFIO migration is not compatible with postcopy migration. A VFIO device
in the destination can't handle page faults for pages that have not been
sent yet. Doing such migration will cause the VM to crash in the
destination.
Background snapshot:
Background snapshot allows creating a snapshot of the VM while it's
running and keeping it small by not including dirty RAM pages.
The way it works is by first stopping the VM, saving the non-iterable
devices' state and then starting the VM and saving the RAM while write
protecting it with UFFD. The resulting snapshot represents the VM state
at snapshot start.
VFIO migration is not compatible with background snapshot.
First of all, VFIO device state is not even saved in background snapshot
because only non-iterable device state is saved. But even if it was
saved, after starting the VM, a VFIO device could dirty pages without it
being detected by UFFD write protection. This would corrupt the
snapshot, as the RAM in it would not represent the RAM at snapshot
start.
This series fixes it by blocking these combinations. This is done by
adding a .save_prepare() handler to struct SaveVMHandler. The
.save_prepare() handler is called early, even before migration starts,
and allows VFIO migration to check the migration capabilities and fail
migration if needed.
Note that this series is based on the P2P series [1] sent a few weeks
ago.
Comments and suggestions will be greatly appreciated.
Thanks.
Changes from v2 [3]:
* Added a new patch that moves more migration initializations to
migrate_init(). (Cedric)
* Consolidated the two call sites of qemu_savevm_state_prepare() into
migrate_init(). (Peter)
* Added R-bs and Tested-by tags.
Changes from v1 [2]:
* Adopted Peter's suggestion to block migration upon migrate command
using a new .save_prepare() handler in SaveVMHandlers.
* Added R-bs by Cedric.
[1]
https://lore.kernel.org/qemu-devel/20230802081449.2528-1-avihaih@nvidia.com/
[2]
https://lore.kernel.org/qemu-devel/20230828151842.11303-1-avihaih@nvidia.com/
[3]
https://lore.kernel.org/qemu-devel/20230831125702.11263-1-avihaih@nvidia.com/
Avihai Horon (6):
migration: Add migration prefix to functions in target.c
vfio/migration: Fail adding device with enable-migration=on and
existing blocker
migration: Move more initializations to migrate_init()
migration: Add .save_prepare() handler to struct SaveVMHandlers
vfio/migration: Block VFIO migration with postcopy migration
vfio/migration: Block VFIO migration with background snapshot
include/migration/register.h | 5 +++++
migration/migration.h | 6 +++---
migration/savevm.h | 1 +
hw/vfio/common.c | 7 +++++--
hw/vfio/migration.c | 31 +++++++++++++++++++++++++++++++
migration/migration.c | 33 ++++++++++++++++++++++-----------
migration/savevm.c | 32 ++++++++++++++++++++++++++++----
migration/target.c | 8 ++++----
8 files changed, 99 insertions(+), 24 deletions(-)
--
2.26.3
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v3 1/6] migration: Add migration prefix to functions in target.c
2023-09-06 15:08 [PATCH v3 0/6] vfio/migration: Block VFIO migration with postcopy and background snapshot Avihai Horon
@ 2023-09-06 15:08 ` Avihai Horon
2023-09-06 15:08 ` [PATCH v3 2/6] vfio/migration: Fail adding device with enable-migration=on and existing blocker Avihai Horon
` (5 subsequent siblings)
6 siblings, 0 replies; 12+ messages in thread
From: Avihai Horon @ 2023-09-06 15:08 UTC (permalink / raw)
To: qemu-devel
Cc: Alex Williamson, Cédric Le Goater, Juan Quintela, Peter Xu,
Leonardo Bras, Yanghang Liu, Avihai Horon
The functions in target.c are not static, yet they don't have a proper
migration prefix. Add such prefix.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
---
migration/migration.h | 4 ++--
migration/migration.c | 6 +++---
migration/savevm.c | 2 +-
migration/target.c | 8 ++++----
4 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/migration/migration.h b/migration/migration.h
index 6eea18db36..c5695de214 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -512,8 +512,8 @@ void migration_consume_urgent_request(void);
bool migration_rate_limit(void);
void migration_cancel(const Error *error);
-void populate_vfio_info(MigrationInfo *info);
-void reset_vfio_bytes_transferred(void);
+void migration_populate_vfio_info(MigrationInfo *info);
+void migration_reset_vfio_bytes_transferred(void);
void postcopy_temp_page_reset(PostcopyTmpPage *tmp_page);
#endif
diff --git a/migration/migration.c b/migration/migration.c
index 5528acb65e..92866a8f49 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1039,7 +1039,7 @@ static void fill_source_migration_info(MigrationInfo *info)
populate_time_info(info, s);
populate_ram_info(info, s);
populate_disk_info(info);
- populate_vfio_info(info);
+ migration_populate_vfio_info(info);
break;
case MIGRATION_STATUS_COLO:
info->has_status = true;
@@ -1048,7 +1048,7 @@ static void fill_source_migration_info(MigrationInfo *info)
case MIGRATION_STATUS_COMPLETED:
populate_time_info(info, s);
populate_ram_info(info, s);
- populate_vfio_info(info);
+ migration_populate_vfio_info(info);
break;
case MIGRATION_STATUS_FAILED:
info->has_status = true;
@@ -1641,7 +1641,7 @@ static bool migrate_prepare(MigrationState *s, bool blk, bool blk_inc,
*/
memset(&mig_stats, 0, sizeof(mig_stats));
memset(&compression_counters, 0, sizeof(compression_counters));
- reset_vfio_bytes_transferred();
+ migration_reset_vfio_bytes_transferred();
return true;
}
diff --git a/migration/savevm.c b/migration/savevm.c
index a2cb8855e2..5bf8b59a7d 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1622,7 +1622,7 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
migrate_init(ms);
memset(&mig_stats, 0, sizeof(mig_stats));
memset(&compression_counters, 0, sizeof(compression_counters));
- reset_vfio_bytes_transferred();
+ migration_reset_vfio_bytes_transferred();
ms->to_dst_file = f;
qemu_mutex_unlock_iothread();
diff --git a/migration/target.c b/migration/target.c
index f39c9a8d88..a6ffa9a5ce 100644
--- a/migration/target.c
+++ b/migration/target.c
@@ -15,7 +15,7 @@
#endif
#ifdef CONFIG_VFIO
-void populate_vfio_info(MigrationInfo *info)
+void migration_populate_vfio_info(MigrationInfo *info)
{
if (vfio_mig_active()) {
info->vfio = g_malloc0(sizeof(*info->vfio));
@@ -23,16 +23,16 @@ void populate_vfio_info(MigrationInfo *info)
}
}
-void reset_vfio_bytes_transferred(void)
+void migration_reset_vfio_bytes_transferred(void)
{
vfio_reset_bytes_transferred();
}
#else
-void populate_vfio_info(MigrationInfo *info)
+void migration_populate_vfio_info(MigrationInfo *info)
{
}
-void reset_vfio_bytes_transferred(void)
+void migration_reset_vfio_bytes_transferred(void)
{
}
#endif
--
2.26.3
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v3 2/6] vfio/migration: Fail adding device with enable-migration=on and existing blocker
2023-09-06 15:08 [PATCH v3 0/6] vfio/migration: Block VFIO migration with postcopy and background snapshot Avihai Horon
2023-09-06 15:08 ` [PATCH v3 1/6] migration: Add migration prefix to functions in target.c Avihai Horon
@ 2023-09-06 15:08 ` Avihai Horon
2023-09-06 15:08 ` [PATCH v3 3/6] migration: Move more initializations to migrate_init() Avihai Horon
` (4 subsequent siblings)
6 siblings, 0 replies; 12+ messages in thread
From: Avihai Horon @ 2023-09-06 15:08 UTC (permalink / raw)
To: qemu-devel
Cc: Alex Williamson, Cédric Le Goater, Juan Quintela, Peter Xu,
Leonardo Bras, Yanghang Liu, Avihai Horon
If a device with enable-migration=on is added and it causes a migration
blocker, adding the device should fail with a proper error.
This is not the case with multiple device migration blocker when the
blocker already exists. If the blocker already exists and a device with
enable-migration=on is added which causes a migration blocker, adding
the device will succeed.
Fix it by failing adding the device in such case.
Fixes: 8bbcb64a71d8 ("vfio/migration: Make VFIO migration non-experimental")
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
---
hw/vfio/common.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 8a8d074e18..237101d038 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -394,8 +394,7 @@ int vfio_block_multiple_devices_migration(VFIODevice *vbasedev, Error **errp)
{
int ret;
- if (multiple_devices_migration_blocker ||
- vfio_multiple_devices_migration_is_supported()) {
+ if (vfio_multiple_devices_migration_is_supported()) {
return 0;
}
@@ -405,6 +404,10 @@ int vfio_block_multiple_devices_migration(VFIODevice *vbasedev, Error **errp)
return -EINVAL;
}
+ if (multiple_devices_migration_blocker) {
+ return 0;
+ }
+
error_setg(&multiple_devices_migration_blocker,
"Multiple VFIO devices migration is supported only if all of "
"them support P2P migration");
--
2.26.3
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v3 3/6] migration: Move more initializations to migrate_init()
2023-09-06 15:08 [PATCH v3 0/6] vfio/migration: Block VFIO migration with postcopy and background snapshot Avihai Horon
2023-09-06 15:08 ` [PATCH v3 1/6] migration: Add migration prefix to functions in target.c Avihai Horon
2023-09-06 15:08 ` [PATCH v3 2/6] vfio/migration: Fail adding device with enable-migration=on and existing blocker Avihai Horon
@ 2023-09-06 15:08 ` Avihai Horon
2023-09-06 16:16 ` Cédric Le Goater
2023-09-06 15:08 ` [PATCH v3 4/6] migration: Add .save_prepare() handler to struct SaveVMHandlers Avihai Horon
` (3 subsequent siblings)
6 siblings, 1 reply; 12+ messages in thread
From: Avihai Horon @ 2023-09-06 15:08 UTC (permalink / raw)
To: qemu-devel
Cc: Alex Williamson, Cédric Le Goater, Juan Quintela, Peter Xu,
Leonardo Bras, Yanghang Liu, Avihai Horon
Initialization of mig_stats, compression_counters and VFIO bytes
transferred is hard-coded in migration code path and snapshot code path.
Make the code cleaner by initializing them in migrate_init().
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
migration/migration.c | 14 +++++++-------
migration/savevm.c | 3 ---
2 files changed, 7 insertions(+), 10 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index 92866a8f49..ce01a3ba6a 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1425,6 +1425,13 @@ void migrate_init(MigrationState *s)
s->iteration_initial_bytes = 0;
s->threshold_size = 0;
s->switchover_acked = false;
+ /*
+ * set mig_stats compression_counters memory to zero for a
+ * new migration
+ */
+ memset(&mig_stats, 0, sizeof(mig_stats));
+ memset(&compression_counters, 0, sizeof(compression_counters));
+ migration_reset_vfio_bytes_transferred();
}
int migrate_add_blocker_internal(Error *reason, Error **errp)
@@ -1635,13 +1642,6 @@ static bool migrate_prepare(MigrationState *s, bool blk, bool blk_inc,
}
migrate_init(s);
- /*
- * set mig_stats compression_counters memory to zero for a
- * new migration
- */
- memset(&mig_stats, 0, sizeof(mig_stats));
- memset(&compression_counters, 0, sizeof(compression_counters));
- migration_reset_vfio_bytes_transferred();
return true;
}
diff --git a/migration/savevm.c b/migration/savevm.c
index 5bf8b59a7d..e14efeced0 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1620,9 +1620,6 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
}
migrate_init(ms);
- memset(&mig_stats, 0, sizeof(mig_stats));
- memset(&compression_counters, 0, sizeof(compression_counters));
- migration_reset_vfio_bytes_transferred();
ms->to_dst_file = f;
qemu_mutex_unlock_iothread();
--
2.26.3
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v3 4/6] migration: Add .save_prepare() handler to struct SaveVMHandlers
2023-09-06 15:08 [PATCH v3 0/6] vfio/migration: Block VFIO migration with postcopy and background snapshot Avihai Horon
` (2 preceding siblings ...)
2023-09-06 15:08 ` [PATCH v3 3/6] migration: Move more initializations to migrate_init() Avihai Horon
@ 2023-09-06 15:08 ` Avihai Horon
2023-09-06 15:08 ` [PATCH v3 5/6] vfio/migration: Block VFIO migration with postcopy migration Avihai Horon
` (2 subsequent siblings)
6 siblings, 0 replies; 12+ messages in thread
From: Avihai Horon @ 2023-09-06 15:08 UTC (permalink / raw)
To: qemu-devel
Cc: Alex Williamson, Cédric Le Goater, Juan Quintela, Peter Xu,
Leonardo Bras, Yanghang Liu, Avihai Horon
Add a new .save_prepare() handler to struct SaveVMHandlers. This handler
is called early, even before migration starts, and can be used by
devices to perform early checks.
Refactor migrate_init() to be able to return errors and call
.save_prepare() from there.
Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
---
include/migration/register.h | 5 +++++
migration/migration.h | 2 +-
migration/savevm.h | 1 +
migration/migration.c | 15 +++++++++++++--
migration/savevm.c | 29 ++++++++++++++++++++++++++++-
5 files changed, 48 insertions(+), 4 deletions(-)
diff --git a/include/migration/register.h b/include/migration/register.h
index 90914f32f5..2b12c6adec 100644
--- a/include/migration/register.h
+++ b/include/migration/register.h
@@ -20,6 +20,11 @@ typedef struct SaveVMHandlers {
/* This runs inside the iothread lock. */
SaveStateHandler *save_state;
+ /*
+ * save_prepare is called early, even before migration starts, and can be
+ * used to perform early checks.
+ */
+ int (*save_prepare)(void *opaque, Error **errp);
void (*save_cleanup)(void *opaque);
int (*save_live_complete_postcopy)(QEMUFile *f, void *opaque);
int (*save_live_complete_precopy)(QEMUFile *f, void *opaque);
diff --git a/migration/migration.h b/migration/migration.h
index c5695de214..c390500604 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -472,7 +472,7 @@ void migrate_fd_connect(MigrationState *s, Error *error_in);
bool migration_is_setup_or_active(int state);
bool migration_is_running(int state);
-void migrate_init(MigrationState *s);
+int migrate_init(MigrationState *s, Error **errp);
bool migration_is_blocked(Error **errp);
/* True if outgoing migration has entered postcopy phase */
bool migration_in_postcopy(void);
diff --git a/migration/savevm.h b/migration/savevm.h
index e894bbc143..74669733dd 100644
--- a/migration/savevm.h
+++ b/migration/savevm.h
@@ -31,6 +31,7 @@
bool qemu_savevm_state_blocked(Error **errp);
void qemu_savevm_non_migratable_list(strList **reasons);
+int qemu_savevm_state_prepare(Error **errp);
void qemu_savevm_state_setup(QEMUFile *f);
bool qemu_savevm_state_guest_unplug_pending(void);
int qemu_savevm_state_resume_prepare(MigrationState *s);
diff --git a/migration/migration.c b/migration/migration.c
index ce01a3ba6a..d61e572742 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1392,8 +1392,15 @@ bool migration_is_active(MigrationState *s)
s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE);
}
-void migrate_init(MigrationState *s)
+int migrate_init(MigrationState *s, Error **errp)
{
+ int ret;
+
+ ret = qemu_savevm_state_prepare(errp);
+ if (ret) {
+ return ret;
+ }
+
/*
* Reinitialise all migration state, except
* parameters/capabilities that the user set, and
@@ -1432,6 +1439,8 @@ void migrate_init(MigrationState *s)
memset(&mig_stats, 0, sizeof(mig_stats));
memset(&compression_counters, 0, sizeof(compression_counters));
migration_reset_vfio_bytes_transferred();
+
+ return 0;
}
int migrate_add_blocker_internal(Error *reason, Error **errp)
@@ -1641,7 +1650,9 @@ static bool migrate_prepare(MigrationState *s, bool blk, bool blk_inc,
migrate_set_block_incremental(true);
}
- migrate_init(s);
+ if (migrate_init(s, errp)) {
+ return false;
+ }
return true;
}
diff --git a/migration/savevm.c b/migration/savevm.c
index e14efeced0..bb3e99194c 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1233,6 +1233,30 @@ bool qemu_savevm_state_guest_unplug_pending(void)
return false;
}
+int qemu_savevm_state_prepare(Error **errp)
+{
+ SaveStateEntry *se;
+ int ret;
+
+ QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
+ if (!se->ops || !se->ops->save_prepare) {
+ continue;
+ }
+ if (se->ops->is_active) {
+ if (!se->ops->is_active(se->opaque)) {
+ continue;
+ }
+ }
+
+ ret = se->ops->save_prepare(se->opaque, errp);
+ if (ret < 0) {
+ return ret;
+ }
+ }
+
+ return 0;
+}
+
void qemu_savevm_state_setup(QEMUFile *f)
{
MigrationState *ms = migrate_get_current();
@@ -1619,7 +1643,10 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
return -EINVAL;
}
- migrate_init(ms);
+ ret = migrate_init(ms, errp);
+ if (ret) {
+ return ret;
+ }
ms->to_dst_file = f;
qemu_mutex_unlock_iothread();
--
2.26.3
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v3 5/6] vfio/migration: Block VFIO migration with postcopy migration
2023-09-06 15:08 [PATCH v3 0/6] vfio/migration: Block VFIO migration with postcopy and background snapshot Avihai Horon
` (3 preceding siblings ...)
2023-09-06 15:08 ` [PATCH v3 4/6] migration: Add .save_prepare() handler to struct SaveVMHandlers Avihai Horon
@ 2023-09-06 15:08 ` Avihai Horon
2023-09-06 15:18 ` Peter Xu
2023-09-06 15:08 ` [PATCH v3 6/6] vfio/migration: Block VFIO migration with background snapshot Avihai Horon
2023-09-07 8:55 ` [PATCH v3 0/6] vfio/migration: Block VFIO migration with postcopy and " Cédric Le Goater
6 siblings, 1 reply; 12+ messages in thread
From: Avihai Horon @ 2023-09-06 15:08 UTC (permalink / raw)
To: qemu-devel
Cc: Alex Williamson, Cédric Le Goater, Juan Quintela, Peter Xu,
Leonardo Bras, Yanghang Liu, Avihai Horon
VFIO migration is not compatible with postcopy migration. A VFIO device
in the destination can't handle page faults for pages that have not been
sent yet.
Doing such migration will cause the VM to crash in the destination:
qemu-system-x86_64: VFIO_MAP_DMA failed: Bad address
qemu-system-x86_64: vfio_dma_map(0x55a28c7659d0, 0xc0000, 0xb000, 0x7f1b11a00000) = -14 (Bad address)
qemu: hardware error: vfio: DMA mapping failed, unable to continue
To prevent this, block VFIO migration with postcopy migration.
Reported-by: Yanghang Liu <yanghliu@redhat.com>
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Tested-by: Yanghang Liu <yanghliu@redhat.com>
---
hw/vfio/migration.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 71855468fe..20994dc1d6 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -335,6 +335,27 @@ static bool vfio_precopy_supported(VFIODevice *vbasedev)
/* ---------------------------------------------------------------------- */
+static int vfio_save_prepare(void *opaque, Error **errp)
+{
+ VFIODevice *vbasedev = opaque;
+
+ /*
+ * Snapshot doesn't use postcopy, so allow snapshot even if postcopy is on.
+ */
+ if (runstate_check(RUN_STATE_SAVE_VM)) {
+ return 0;
+ }
+
+ if (migrate_postcopy_ram()) {
+ error_setg(
+ errp, "%s: VFIO migration is not supported with postcopy migration",
+ vbasedev->name);
+ return -EOPNOTSUPP;
+ }
+
+ return 0;
+}
+
static int vfio_save_setup(QEMUFile *f, void *opaque)
{
VFIODevice *vbasedev = opaque;
@@ -640,6 +661,7 @@ static bool vfio_switchover_ack_needed(void *opaque)
}
static const SaveVMHandlers savevm_vfio_handlers = {
+ .save_prepare = vfio_save_prepare,
.save_setup = vfio_save_setup,
.save_cleanup = vfio_save_cleanup,
.state_pending_estimate = vfio_state_pending_estimate,
--
2.26.3
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v3 6/6] vfio/migration: Block VFIO migration with background snapshot
2023-09-06 15:08 [PATCH v3 0/6] vfio/migration: Block VFIO migration with postcopy and background snapshot Avihai Horon
` (4 preceding siblings ...)
2023-09-06 15:08 ` [PATCH v3 5/6] vfio/migration: Block VFIO migration with postcopy migration Avihai Horon
@ 2023-09-06 15:08 ` Avihai Horon
2023-09-07 8:55 ` [PATCH v3 0/6] vfio/migration: Block VFIO migration with postcopy and " Cédric Le Goater
6 siblings, 0 replies; 12+ messages in thread
From: Avihai Horon @ 2023-09-06 15:08 UTC (permalink / raw)
To: qemu-devel
Cc: Alex Williamson, Cédric Le Goater, Juan Quintela, Peter Xu,
Leonardo Bras, Yanghang Liu, Avihai Horon
Background snapshot allows creating a snapshot of the VM while it's
running and keeping it small by not including dirty RAM pages.
The way it works is by first stopping the VM, saving the non-iterable
devices' state and then starting the VM and saving the RAM while write
protecting it with UFFD. The resulting snapshot represents the VM state
at snapshot start.
VFIO migration is not compatible with background snapshot.
First of all, VFIO device state is not even saved in background snapshot
because only non-iterable device state is saved. But even if it was
saved, after starting the VM, a VFIO device could dirty pages without it
being detected by UFFD write protection. This would corrupt the
snapshot, as the RAM in it would not represent the RAM at snapshot
start.
To prevent this, block VFIO migration with background snapshot.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
---
hw/vfio/migration.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 20994dc1d6..da43dcd2fe 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -340,7 +340,8 @@ static int vfio_save_prepare(void *opaque, Error **errp)
VFIODevice *vbasedev = opaque;
/*
- * Snapshot doesn't use postcopy, so allow snapshot even if postcopy is on.
+ * Snapshot doesn't use postcopy nor background snapshot, so allow snapshot
+ * even if they are on.
*/
if (runstate_check(RUN_STATE_SAVE_VM)) {
return 0;
@@ -353,6 +354,14 @@ static int vfio_save_prepare(void *opaque, Error **errp)
return -EOPNOTSUPP;
}
+ if (migrate_background_snapshot()) {
+ error_setg(
+ errp,
+ "%s: VFIO migration is not supported with background snapshot",
+ vbasedev->name);
+ return -EOPNOTSUPP;
+ }
+
return 0;
}
--
2.26.3
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH v3 5/6] vfio/migration: Block VFIO migration with postcopy migration
2023-09-06 15:08 ` [PATCH v3 5/6] vfio/migration: Block VFIO migration with postcopy migration Avihai Horon
@ 2023-09-06 15:18 ` Peter Xu
0 siblings, 0 replies; 12+ messages in thread
From: Peter Xu @ 2023-09-06 15:18 UTC (permalink / raw)
To: Avihai Horon
Cc: qemu-devel, Alex Williamson, Cédric Le Goater, Juan Quintela,
Leonardo Bras, Yanghang Liu
On Wed, Sep 06, 2023 at 06:08:52PM +0300, Avihai Horon wrote:
> VFIO migration is not compatible with postcopy migration. A VFIO device
> in the destination can't handle page faults for pages that have not been
> sent yet.
>
> Doing such migration will cause the VM to crash in the destination:
>
> qemu-system-x86_64: VFIO_MAP_DMA failed: Bad address
> qemu-system-x86_64: vfio_dma_map(0x55a28c7659d0, 0xc0000, 0xb000, 0x7f1b11a00000) = -14 (Bad address)
> qemu: hardware error: vfio: DMA mapping failed, unable to continue
>
> To prevent this, block VFIO migration with postcopy migration.
>
> Reported-by: Yanghang Liu <yanghliu@redhat.com>
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> Tested-by: Yanghang Liu <yanghliu@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3 3/6] migration: Move more initializations to migrate_init()
2023-09-06 15:08 ` [PATCH v3 3/6] migration: Move more initializations to migrate_init() Avihai Horon
@ 2023-09-06 16:16 ` Cédric Le Goater
0 siblings, 0 replies; 12+ messages in thread
From: Cédric Le Goater @ 2023-09-06 16:16 UTC (permalink / raw)
To: Avihai Horon, qemu-devel
Cc: Alex Williamson, Juan Quintela, Peter Xu, Leonardo Bras,
Yanghang Liu
On 9/6/23 17:08, Avihai Horon wrote:
> Initialization of mig_stats, compression_counters and VFIO bytes
> transferred is hard-coded in migration code path and snapshot code path.
>
> Make the code cleaner by initializing them in migrate_init().
>
> Suggested-by: Cédric Le Goater <clg@redhat.com>
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Thanks,
C.
> ---
> migration/migration.c | 14 +++++++-------
> migration/savevm.c | 3 ---
> 2 files changed, 7 insertions(+), 10 deletions(-)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index 92866a8f49..ce01a3ba6a 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1425,6 +1425,13 @@ void migrate_init(MigrationState *s)
> s->iteration_initial_bytes = 0;
> s->threshold_size = 0;
> s->switchover_acked = false;
> + /*
> + * set mig_stats compression_counters memory to zero for a
> + * new migration
> + */
> + memset(&mig_stats, 0, sizeof(mig_stats));
> + memset(&compression_counters, 0, sizeof(compression_counters));
> + migration_reset_vfio_bytes_transferred();
> }
>
> int migrate_add_blocker_internal(Error *reason, Error **errp)
> @@ -1635,13 +1642,6 @@ static bool migrate_prepare(MigrationState *s, bool blk, bool blk_inc,
> }
>
> migrate_init(s);
> - /*
> - * set mig_stats compression_counters memory to zero for a
> - * new migration
> - */
> - memset(&mig_stats, 0, sizeof(mig_stats));
> - memset(&compression_counters, 0, sizeof(compression_counters));
> - migration_reset_vfio_bytes_transferred();
>
> return true;
> }
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 5bf8b59a7d..e14efeced0 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1620,9 +1620,6 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
> }
>
> migrate_init(ms);
> - memset(&mig_stats, 0, sizeof(mig_stats));
> - memset(&compression_counters, 0, sizeof(compression_counters));
> - migration_reset_vfio_bytes_transferred();
> ms->to_dst_file = f;
>
> qemu_mutex_unlock_iothread();
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3 0/6] vfio/migration: Block VFIO migration with postcopy and background snapshot
2023-09-06 15:08 [PATCH v3 0/6] vfio/migration: Block VFIO migration with postcopy and background snapshot Avihai Horon
` (5 preceding siblings ...)
2023-09-06 15:08 ` [PATCH v3 6/6] vfio/migration: Block VFIO migration with background snapshot Avihai Horon
@ 2023-09-07 8:55 ` Cédric Le Goater
2023-09-07 9:07 ` Cédric Le Goater
6 siblings, 1 reply; 12+ messages in thread
From: Cédric Le Goater @ 2023-09-07 8:55 UTC (permalink / raw)
To: Avihai Horon, qemu-devel
Cc: Alex Williamson, Juan Quintela, Peter Xu, Leonardo Bras,
Yanghang Liu
On 9/6/23 17:08, Avihai Horon wrote:
> Hello,
>
> Recently added VFIO migration is not compatible with some of the
> pre-existing migration features. This was overlooked and today these
> combinations are not blocked by QEMU. This series fixes it.
>
> Postcopy migration:
> VFIO migration is not compatible with postcopy migration. A VFIO device
> in the destination can't handle page faults for pages that have not been
> sent yet. Doing such migration will cause the VM to crash in the
> destination.
>
> Background snapshot:
> Background snapshot allows creating a snapshot of the VM while it's
> running and keeping it small by not including dirty RAM pages.
>
> The way it works is by first stopping the VM, saving the non-iterable
> devices' state and then starting the VM and saving the RAM while write
> protecting it with UFFD. The resulting snapshot represents the VM state
> at snapshot start.
>
> VFIO migration is not compatible with background snapshot.
> First of all, VFIO device state is not even saved in background snapshot
> because only non-iterable device state is saved. But even if it was
> saved, after starting the VM, a VFIO device could dirty pages without it
> being detected by UFFD write protection. This would corrupt the
> snapshot, as the RAM in it would not represent the RAM at snapshot
> start.
>
> This series fixes it by blocking these combinations. This is done by
> adding a .save_prepare() handler to struct SaveVMHandler. The
> .save_prepare() handler is called early, even before migration starts,
> and allows VFIO migration to check the migration capabilities and fail
> migration if needed.
>
> Note that this series is based on the P2P series [1] sent a few weeks
> ago.
>
> Comments and suggestions will be greatly appreciated.
Applied to vfio-next.
Thanks,
C.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3 0/6] vfio/migration: Block VFIO migration with postcopy and background snapshot
2023-09-07 8:55 ` [PATCH v3 0/6] vfio/migration: Block VFIO migration with postcopy and " Cédric Le Goater
@ 2023-09-07 9:07 ` Cédric Le Goater
2023-09-08 21:47 ` Peter Xu
0 siblings, 1 reply; 12+ messages in thread
From: Cédric Le Goater @ 2023-09-07 9:07 UTC (permalink / raw)
To: Avihai Horon, qemu-devel
Cc: Alex Williamson, Juan Quintela, Peter Xu, Leonardo Bras,
Yanghang Liu
[ ... ]
> Applied to vfio-next.
On that topic I am preparing a PR.
Juan, Peter, Leonardo, is it ok for you if these migration changes
go through the VFIO tree ?
Thanks,
C.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3 0/6] vfio/migration: Block VFIO migration with postcopy and background snapshot
2023-09-07 9:07 ` Cédric Le Goater
@ 2023-09-08 21:47 ` Peter Xu
0 siblings, 0 replies; 12+ messages in thread
From: Peter Xu @ 2023-09-08 21:47 UTC (permalink / raw)
To: Cédric Le Goater
Cc: Avihai Horon, qemu-devel, Alex Williamson, Juan Quintela,
Leonardo Bras, Yanghang Liu
On Thu, Sep 07, 2023 at 11:07:10AM +0200, Cédric Le Goater wrote:
> [ ... ]
>
> > Applied to vfio-next.
>
> On that topic I am preparing a PR.
>
> Juan, Peter, Leonardo, is it ok for you if these migration changes
> go through the VFIO tree ?
All good here.
Thanks,
--
Peter Xu
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2023-09-08 21:48 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-09-06 15:08 [PATCH v3 0/6] vfio/migration: Block VFIO migration with postcopy and background snapshot Avihai Horon
2023-09-06 15:08 ` [PATCH v3 1/6] migration: Add migration prefix to functions in target.c Avihai Horon
2023-09-06 15:08 ` [PATCH v3 2/6] vfio/migration: Fail adding device with enable-migration=on and existing blocker Avihai Horon
2023-09-06 15:08 ` [PATCH v3 3/6] migration: Move more initializations to migrate_init() Avihai Horon
2023-09-06 16:16 ` Cédric Le Goater
2023-09-06 15:08 ` [PATCH v3 4/6] migration: Add .save_prepare() handler to struct SaveVMHandlers Avihai Horon
2023-09-06 15:08 ` [PATCH v3 5/6] vfio/migration: Block VFIO migration with postcopy migration Avihai Horon
2023-09-06 15:18 ` Peter Xu
2023-09-06 15:08 ` [PATCH v3 6/6] vfio/migration: Block VFIO migration with background snapshot Avihai Horon
2023-09-07 8:55 ` [PATCH v3 0/6] vfio/migration: Block VFIO migration with postcopy and " Cédric Le Goater
2023-09-07 9:07 ` Cédric Le Goater
2023-09-08 21:47 ` Peter Xu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).