* [PATCH v2 1/5] migration: Add migration prefix to functions in target.c
2023-08-31 12:56 [PATCH v2 0/5] vfio/migration: Block VFIO migration with postcopy and background snapshot Avihai Horon
@ 2023-08-31 12:56 ` Avihai Horon
2023-08-31 12:56 ` [PATCH v2 2/5] vfio/migration: Fail adding device with enable-migration=on and existing blocker Avihai Horon
` (3 subsequent siblings)
4 siblings, 0 replies; 14+ messages in thread
From: Avihai Horon @ 2023-08-31 12:56 UTC (permalink / raw)
To: qemu-devel
Cc: Alex Williamson, Cédric Le Goater, Juan Quintela, Peter Xu,
Leonardo Bras, Yanghang Liu, Avihai Horon
The functions in target.c are not static, yet they don't have a proper
migration prefix. Add such prefix.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
---
migration/migration.h | 4 ++--
migration/migration.c | 6 +++---
migration/savevm.c | 2 +-
migration/target.c | 8 ++++----
4 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/migration/migration.h b/migration/migration.h
index 6eea18db36..c5695de214 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -512,8 +512,8 @@ void migration_consume_urgent_request(void);
bool migration_rate_limit(void);
void migration_cancel(const Error *error);
-void populate_vfio_info(MigrationInfo *info);
-void reset_vfio_bytes_transferred(void);
+void migration_populate_vfio_info(MigrationInfo *info);
+void migration_reset_vfio_bytes_transferred(void);
void postcopy_temp_page_reset(PostcopyTmpPage *tmp_page);
#endif
diff --git a/migration/migration.c b/migration/migration.c
index 5528acb65e..92866a8f49 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1039,7 +1039,7 @@ static void fill_source_migration_info(MigrationInfo *info)
populate_time_info(info, s);
populate_ram_info(info, s);
populate_disk_info(info);
- populate_vfio_info(info);
+ migration_populate_vfio_info(info);
break;
case MIGRATION_STATUS_COLO:
info->has_status = true;
@@ -1048,7 +1048,7 @@ static void fill_source_migration_info(MigrationInfo *info)
case MIGRATION_STATUS_COMPLETED:
populate_time_info(info, s);
populate_ram_info(info, s);
- populate_vfio_info(info);
+ migration_populate_vfio_info(info);
break;
case MIGRATION_STATUS_FAILED:
info->has_status = true;
@@ -1641,7 +1641,7 @@ static bool migrate_prepare(MigrationState *s, bool blk, bool blk_inc,
*/
memset(&mig_stats, 0, sizeof(mig_stats));
memset(&compression_counters, 0, sizeof(compression_counters));
- reset_vfio_bytes_transferred();
+ migration_reset_vfio_bytes_transferred();
return true;
}
diff --git a/migration/savevm.c b/migration/savevm.c
index a2cb8855e2..5bf8b59a7d 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1622,7 +1622,7 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
migrate_init(ms);
memset(&mig_stats, 0, sizeof(mig_stats));
memset(&compression_counters, 0, sizeof(compression_counters));
- reset_vfio_bytes_transferred();
+ migration_reset_vfio_bytes_transferred();
ms->to_dst_file = f;
qemu_mutex_unlock_iothread();
diff --git a/migration/target.c b/migration/target.c
index f39c9a8d88..a6ffa9a5ce 100644
--- a/migration/target.c
+++ b/migration/target.c
@@ -15,7 +15,7 @@
#endif
#ifdef CONFIG_VFIO
-void populate_vfio_info(MigrationInfo *info)
+void migration_populate_vfio_info(MigrationInfo *info)
{
if (vfio_mig_active()) {
info->vfio = g_malloc0(sizeof(*info->vfio));
@@ -23,16 +23,16 @@ void populate_vfio_info(MigrationInfo *info)
}
}
-void reset_vfio_bytes_transferred(void)
+void migration_reset_vfio_bytes_transferred(void)
{
vfio_reset_bytes_transferred();
}
#else
-void populate_vfio_info(MigrationInfo *info)
+void migration_populate_vfio_info(MigrationInfo *info)
{
}
-void reset_vfio_bytes_transferred(void)
+void migration_reset_vfio_bytes_transferred(void)
{
}
#endif
--
2.26.3
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v2 2/5] vfio/migration: Fail adding device with enable-migration=on and existing blocker
2023-08-31 12:56 [PATCH v2 0/5] vfio/migration: Block VFIO migration with postcopy and background snapshot Avihai Horon
2023-08-31 12:56 ` [PATCH v2 1/5] migration: Add migration prefix to functions in target.c Avihai Horon
@ 2023-08-31 12:56 ` Avihai Horon
2023-08-31 12:57 ` [PATCH v2 3/5] migration: Add .save_prepare() handler to struct SaveVMHandlers Avihai Horon
` (2 subsequent siblings)
4 siblings, 0 replies; 14+ messages in thread
From: Avihai Horon @ 2023-08-31 12:56 UTC (permalink / raw)
To: qemu-devel
Cc: Alex Williamson, Cédric Le Goater, Juan Quintela, Peter Xu,
Leonardo Bras, Yanghang Liu, Avihai Horon
If a device with enable-migration=on is added and it causes a migration
blocker, adding the device should fail with a proper error.
This is not the case with multiple device migration blocker when the
blocker already exists. If the blocker already exists and a device with
enable-migration=on is added which causes a migration blocker, adding
the device will succeed.
Fix it by failing adding the device in such case.
Fixes: 8bbcb64a71d8 ("vfio/migration: Make VFIO migration non-experimental")
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
---
hw/vfio/common.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 8a8d074e18..237101d038 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -394,8 +394,7 @@ int vfio_block_multiple_devices_migration(VFIODevice *vbasedev, Error **errp)
{
int ret;
- if (multiple_devices_migration_blocker ||
- vfio_multiple_devices_migration_is_supported()) {
+ if (vfio_multiple_devices_migration_is_supported()) {
return 0;
}
@@ -405,6 +404,10 @@ int vfio_block_multiple_devices_migration(VFIODevice *vbasedev, Error **errp)
return -EINVAL;
}
+ if (multiple_devices_migration_blocker) {
+ return 0;
+ }
+
error_setg(&multiple_devices_migration_blocker,
"Multiple VFIO devices migration is supported only if all of "
"them support P2P migration");
--
2.26.3
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v2 3/5] migration: Add .save_prepare() handler to struct SaveVMHandlers
2023-08-31 12:56 [PATCH v2 0/5] vfio/migration: Block VFIO migration with postcopy and background snapshot Avihai Horon
2023-08-31 12:56 ` [PATCH v2 1/5] migration: Add migration prefix to functions in target.c Avihai Horon
2023-08-31 12:56 ` [PATCH v2 2/5] vfio/migration: Fail adding device with enable-migration=on and existing blocker Avihai Horon
@ 2023-08-31 12:57 ` Avihai Horon
2023-09-01 15:49 ` Peter Xu
2023-08-31 12:57 ` [PATCH v2 4/5] vfio/migration: Block VFIO migration with postcopy migration Avihai Horon
2023-08-31 12:57 ` [PATCH v2 5/5] vfio/migration: Block VFIO migration with background snapshot Avihai Horon
4 siblings, 1 reply; 14+ messages in thread
From: Avihai Horon @ 2023-08-31 12:57 UTC (permalink / raw)
To: qemu-devel
Cc: Alex Williamson, Cédric Le Goater, Juan Quintela, Peter Xu,
Leonardo Bras, Yanghang Liu, Avihai Horon
Add a new .save_prepare() handler to struct SaveVMHandlers. This handler
is called early, even before migration starts, and can be used by
devices to perform early checks.
Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
include/migration/register.h | 5 +++++
migration/savevm.h | 1 +
migration/migration.c | 4 ++++
migration/savevm.c | 29 +++++++++++++++++++++++++++++
4 files changed, 39 insertions(+)
diff --git a/include/migration/register.h b/include/migration/register.h
index 90914f32f5..2b12c6adec 100644
--- a/include/migration/register.h
+++ b/include/migration/register.h
@@ -20,6 +20,11 @@ typedef struct SaveVMHandlers {
/* This runs inside the iothread lock. */
SaveStateHandler *save_state;
+ /*
+ * save_prepare is called early, even before migration starts, and can be
+ * used to perform early checks.
+ */
+ int (*save_prepare)(void *opaque, Error **errp);
void (*save_cleanup)(void *opaque);
int (*save_live_complete_postcopy)(QEMUFile *f, void *opaque);
int (*save_live_complete_precopy)(QEMUFile *f, void *opaque);
diff --git a/migration/savevm.h b/migration/savevm.h
index e894bbc143..74669733dd 100644
--- a/migration/savevm.h
+++ b/migration/savevm.h
@@ -31,6 +31,7 @@
bool qemu_savevm_state_blocked(Error **errp);
void qemu_savevm_non_migratable_list(strList **reasons);
+int qemu_savevm_state_prepare(Error **errp);
void qemu_savevm_state_setup(QEMUFile *f);
bool qemu_savevm_state_guest_unplug_pending(void);
int qemu_savevm_state_resume_prepare(MigrationState *s);
diff --git a/migration/migration.c b/migration/migration.c
index 92866a8f49..914783f07b 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1634,6 +1634,10 @@ static bool migrate_prepare(MigrationState *s, bool blk, bool blk_inc,
migrate_set_block_incremental(true);
}
+ if (qemu_savevm_state_prepare(errp)) {
+ return false;
+ }
+
migrate_init(s);
/*
* set mig_stats compression_counters memory to zero for a
diff --git a/migration/savevm.c b/migration/savevm.c
index 5bf8b59a7d..818510ec57 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1233,6 +1233,30 @@ bool qemu_savevm_state_guest_unplug_pending(void)
return false;
}
+int qemu_savevm_state_prepare(Error **errp)
+{
+ SaveStateEntry *se;
+ int ret;
+
+ QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
+ if (!se->ops || !se->ops->save_prepare) {
+ continue;
+ }
+ if (se->ops->is_active) {
+ if (!se->ops->is_active(se->opaque)) {
+ continue;
+ }
+ }
+
+ ret = se->ops->save_prepare(se->opaque, errp);
+ if (ret < 0) {
+ return ret;
+ }
+ }
+
+ return 0;
+}
+
void qemu_savevm_state_setup(QEMUFile *f)
{
MigrationState *ms = migrate_get_current();
@@ -1619,6 +1643,11 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
return -EINVAL;
}
+ ret = qemu_savevm_state_prepare(errp);
+ if (ret) {
+ return ret;
+ }
+
migrate_init(ms);
memset(&mig_stats, 0, sizeof(mig_stats));
memset(&compression_counters, 0, sizeof(compression_counters));
--
2.26.3
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v2 3/5] migration: Add .save_prepare() handler to struct SaveVMHandlers
2023-08-31 12:57 ` [PATCH v2 3/5] migration: Add .save_prepare() handler to struct SaveVMHandlers Avihai Horon
@ 2023-09-01 15:49 ` Peter Xu
2023-09-05 16:13 ` Cédric Le Goater
0 siblings, 1 reply; 14+ messages in thread
From: Peter Xu @ 2023-09-01 15:49 UTC (permalink / raw)
To: Avihai Horon
Cc: qemu-devel, Alex Williamson, Cédric Le Goater, Juan Quintela,
Leonardo Bras, Yanghang Liu
On Thu, Aug 31, 2023 at 03:57:00PM +0300, Avihai Horon wrote:
> Add a new .save_prepare() handler to struct SaveVMHandlers. This handler
> is called early, even before migration starts, and can be used by
> devices to perform early checks.
>
> Suggested-by: Peter Xu <peterx@redhat.com>
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Shouldn't be hard to unify the two call sites for qmp migrate and save
snapshot, but we can leave that for later:
Reviewed-by: Peter Xu <peterx@redhat.com>
Thanks,
--
Peter Xu
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 3/5] migration: Add .save_prepare() handler to struct SaveVMHandlers
2023-09-01 15:49 ` Peter Xu
@ 2023-09-05 16:13 ` Cédric Le Goater
2023-09-06 14:11 ` Avihai Horon
0 siblings, 1 reply; 14+ messages in thread
From: Cédric Le Goater @ 2023-09-05 16:13 UTC (permalink / raw)
To: Peter Xu, Avihai Horon
Cc: qemu-devel, Alex Williamson, Juan Quintela, Leonardo Bras,
Yanghang Liu
On 9/1/23 17:49, Peter Xu wrote:
> On Thu, Aug 31, 2023 at 03:57:00PM +0300, Avihai Horon wrote:
>> Add a new .save_prepare() handler to struct SaveVMHandlers. This handler
>> is called early, even before migration starts, and can be used by
>> devices to perform early checks.
>>
>> Suggested-by: Peter Xu <peterx@redhat.com>
>> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
>
> Shouldn't be hard to unify the two call sites for qmp migrate and save
> snapshot, but we can leave that for later:
yes. It could be called from migrate_init() with minor changes.
We could probably move :
memset(&mig_stats, 0, sizeof(mig_stats));
memset(&compression_counters, 0, sizeof(compression_counters));
migration_reset_vfio_bytes_transferred();
under migrate_init() also. Anyhow,
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Thanks,
C.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 3/5] migration: Add .save_prepare() handler to struct SaveVMHandlers
2023-09-05 16:13 ` Cédric Le Goater
@ 2023-09-06 14:11 ` Avihai Horon
2023-09-06 14:21 ` Cédric Le Goater
0 siblings, 1 reply; 14+ messages in thread
From: Avihai Horon @ 2023-09-06 14:11 UTC (permalink / raw)
To: Cédric Le Goater, Peter Xu
Cc: qemu-devel, Alex Williamson, Juan Quintela, Leonardo Bras,
Yanghang Liu
On 05/09/2023 19:13, Cédric Le Goater wrote:
> External email: Use caution opening links or attachments
>
>
> On 9/1/23 17:49, Peter Xu wrote:
>> On Thu, Aug 31, 2023 at 03:57:00PM +0300, Avihai Horon wrote:
>>> Add a new .save_prepare() handler to struct SaveVMHandlers. This
>>> handler
>>> is called early, even before migration starts, and can be used by
>>> devices to perform early checks.
>>>
>>> Suggested-by: Peter Xu <peterx@redhat.com>
>>> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
>>
>> Shouldn't be hard to unify the two call sites for qmp migrate and save
>> snapshot, but we can leave that for later:
>
> yes. It could be called from migrate_init() with minor changes.
>
> We could probably move :
>
> memset(&mig_stats, 0, sizeof(mig_stats));
> memset(&compression_counters, 0, sizeof(compression_counters));
> migration_reset_vfio_bytes_transferred();
>
> under migrate_init() also.
I will send v3 with the above changes shortly.
Thanks.
> Anyhow,
>
> Reviewed-by: Cédric Le Goater <clg@redhat.com>
>
> Thanks,
>
> C.
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 3/5] migration: Add .save_prepare() handler to struct SaveVMHandlers
2023-09-06 14:11 ` Avihai Horon
@ 2023-09-06 14:21 ` Cédric Le Goater
0 siblings, 0 replies; 14+ messages in thread
From: Cédric Le Goater @ 2023-09-06 14:21 UTC (permalink / raw)
To: Avihai Horon, Peter Xu
Cc: qemu-devel, Alex Williamson, Juan Quintela, Leonardo Bras,
Yanghang Liu
On 9/6/23 16:11, Avihai Horon wrote:
>
> On 05/09/2023 19:13, Cédric Le Goater wrote:
>> External email: Use caution opening links or attachments
>>
>>
>> On 9/1/23 17:49, Peter Xu wrote:
>>> On Thu, Aug 31, 2023 at 03:57:00PM +0300, Avihai Horon wrote:
>>>> Add a new .save_prepare() handler to struct SaveVMHandlers. This handler
>>>> is called early, even before migration starts, and can be used by
>>>> devices to perform early checks.
>>>>
>>>> Suggested-by: Peter Xu <peterx@redhat.com>
>>>> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
>>>
>>> Shouldn't be hard to unify the two call sites for qmp migrate and save
>>> snapshot, but we can leave that for later:
>>
>> yes. It could be called from migrate_init() with minor changes.
>>
>> We could probably move :
>>
>> memset(&mig_stats, 0, sizeof(mig_stats));
>> memset(&compression_counters, 0, sizeof(compression_counters));
>> migration_reset_vfio_bytes_transferred();
>>
>> under migrate_init() also.
>
> I will send v3 with the above changes shortly.
Good timing. I was going to start working on downstream.
Thanks !
C.
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v2 4/5] vfio/migration: Block VFIO migration with postcopy migration
2023-08-31 12:56 [PATCH v2 0/5] vfio/migration: Block VFIO migration with postcopy and background snapshot Avihai Horon
` (2 preceding siblings ...)
2023-08-31 12:57 ` [PATCH v2 3/5] migration: Add .save_prepare() handler to struct SaveVMHandlers Avihai Horon
@ 2023-08-31 12:57 ` Avihai Horon
2023-09-01 10:14 ` YangHang Liu
2023-09-01 15:51 ` Peter Xu
2023-08-31 12:57 ` [PATCH v2 5/5] vfio/migration: Block VFIO migration with background snapshot Avihai Horon
4 siblings, 2 replies; 14+ messages in thread
From: Avihai Horon @ 2023-08-31 12:57 UTC (permalink / raw)
To: qemu-devel
Cc: Alex Williamson, Cédric Le Goater, Juan Quintela, Peter Xu,
Leonardo Bras, Yanghang Liu, Avihai Horon
VFIO migration is not compatible with postcopy migration. A VFIO device
in the destination can't handle page faults for pages that have not been
sent yet.
Doing such migration will cause the VM to crash in the destination:
qemu-system-x86_64: VFIO_MAP_DMA failed: Bad address
qemu-system-x86_64: vfio_dma_map(0x55a28c7659d0, 0xc0000, 0xb000, 0x7f1b11a00000) = -14 (Bad address)
qemu: hardware error: vfio: DMA mapping failed, unable to continue
To prevent this, block VFIO migration with postcopy migration.
Reported-by: Yanghang Liu <yanghliu@redhat.com>
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
hw/vfio/migration.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 71855468fe..20994dc1d6 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -335,6 +335,27 @@ static bool vfio_precopy_supported(VFIODevice *vbasedev)
/* ---------------------------------------------------------------------- */
+static int vfio_save_prepare(void *opaque, Error **errp)
+{
+ VFIODevice *vbasedev = opaque;
+
+ /*
+ * Snapshot doesn't use postcopy, so allow snapshot even if postcopy is on.
+ */
+ if (runstate_check(RUN_STATE_SAVE_VM)) {
+ return 0;
+ }
+
+ if (migrate_postcopy_ram()) {
+ error_setg(
+ errp, "%s: VFIO migration is not supported with postcopy migration",
+ vbasedev->name);
+ return -EOPNOTSUPP;
+ }
+
+ return 0;
+}
+
static int vfio_save_setup(QEMUFile *f, void *opaque)
{
VFIODevice *vbasedev = opaque;
@@ -640,6 +661,7 @@ static bool vfio_switchover_ack_needed(void *opaque)
}
static const SaveVMHandlers savevm_vfio_handlers = {
+ .save_prepare = vfio_save_prepare,
.save_setup = vfio_save_setup,
.save_cleanup = vfio_save_cleanup,
.state_pending_estimate = vfio_state_pending_estimate,
--
2.26.3
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v2 4/5] vfio/migration: Block VFIO migration with postcopy migration
2023-08-31 12:57 ` [PATCH v2 4/5] vfio/migration: Block VFIO migration with postcopy migration Avihai Horon
@ 2023-09-01 10:14 ` YangHang Liu
2023-09-01 15:51 ` Peter Xu
1 sibling, 0 replies; 14+ messages in thread
From: YangHang Liu @ 2023-09-01 10:14 UTC (permalink / raw)
To: Avihai Horon
Cc: qemu-devel, Alex Williamson, Cédric Le Goater, Juan Quintela,
Peter Xu, Leonardo Bras
When try to do the vfio post-copy migration, we can get an expected
internal error now: "unable to execute QEMU command 'migrate':
0000:b1:00.2: VFIO migration is not supported with postcopy migration"
Tested-by: Yanghang Liu <yanghliu@redhat.com>
Best Regards,
YangHang Liu
On Thu, Aug 31, 2023 at 8:57 PM Avihai Horon <avihaih@nvidia.com> wrote:
>
> VFIO migration is not compatible with postcopy migration. A VFIO device
> in the destination can't handle page faults for pages that have not been
> sent yet.
>
> Doing such migration will cause the VM to crash in the destination:
>
> qemu-system-x86_64: VFIO_MAP_DMA failed: Bad address
> qemu-system-x86_64: vfio_dma_map(0x55a28c7659d0, 0xc0000, 0xb000, 0x7f1b11a00000) = -14 (Bad address)
> qemu: hardware error: vfio: DMA mapping failed, unable to continue
>
> To prevent this, block VFIO migration with postcopy migration.
>
> Reported-by: Yanghang Liu <yanghliu@redhat.com>
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> ---
> hw/vfio/migration.c | 22 ++++++++++++++++++++++
> 1 file changed, 22 insertions(+)
>
> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> index 71855468fe..20994dc1d6 100644
> --- a/hw/vfio/migration.c
> +++ b/hw/vfio/migration.c
> @@ -335,6 +335,27 @@ static bool vfio_precopy_supported(VFIODevice *vbasedev)
>
> /* ---------------------------------------------------------------------- */
>
> +static int vfio_save_prepare(void *opaque, Error **errp)
> +{
> + VFIODevice *vbasedev = opaque;
> +
> + /*
> + * Snapshot doesn't use postcopy, so allow snapshot even if postcopy is on.
> + */
> + if (runstate_check(RUN_STATE_SAVE_VM)) {
> + return 0;
> + }
> +
> + if (migrate_postcopy_ram()) {
> + error_setg(
> + errp, "%s: VFIO migration is not supported with postcopy migration",
> + vbasedev->name);
> + return -EOPNOTSUPP;
> + }
> +
> + return 0;
> +}
> +
> static int vfio_save_setup(QEMUFile *f, void *opaque)
> {
> VFIODevice *vbasedev = opaque;
> @@ -640,6 +661,7 @@ static bool vfio_switchover_ack_needed(void *opaque)
> }
>
> static const SaveVMHandlers savevm_vfio_handlers = {
> + .save_prepare = vfio_save_prepare,
> .save_setup = vfio_save_setup,
> .save_cleanup = vfio_save_cleanup,
> .state_pending_estimate = vfio_state_pending_estimate,
> --
> 2.26.3
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 4/5] vfio/migration: Block VFIO migration with postcopy migration
2023-08-31 12:57 ` [PATCH v2 4/5] vfio/migration: Block VFIO migration with postcopy migration Avihai Horon
2023-09-01 10:14 ` YangHang Liu
@ 2023-09-01 15:51 ` Peter Xu
2023-09-03 7:52 ` Avihai Horon
1 sibling, 1 reply; 14+ messages in thread
From: Peter Xu @ 2023-09-01 15:51 UTC (permalink / raw)
To: Avihai Horon
Cc: qemu-devel, Alex Williamson, Cédric Le Goater, Juan Quintela,
Leonardo Bras, Yanghang Liu
On Thu, Aug 31, 2023 at 03:57:01PM +0300, Avihai Horon wrote:
> VFIO migration is not compatible with postcopy migration. A VFIO device
> in the destination can't handle page faults for pages that have not been
> sent yet.
>
> Doing such migration will cause the VM to crash in the destination:
>
> qemu-system-x86_64: VFIO_MAP_DMA failed: Bad address
> qemu-system-x86_64: vfio_dma_map(0x55a28c7659d0, 0xc0000, 0xb000, 0x7f1b11a00000) = -14 (Bad address)
> qemu: hardware error: vfio: DMA mapping failed, unable to continue
>
> To prevent this, block VFIO migration with postcopy migration.
>
> Reported-by: Yanghang Liu <yanghliu@redhat.com>
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> ---
> hw/vfio/migration.c | 22 ++++++++++++++++++++++
> 1 file changed, 22 insertions(+)
>
> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> index 71855468fe..20994dc1d6 100644
> --- a/hw/vfio/migration.c
> +++ b/hw/vfio/migration.c
> @@ -335,6 +335,27 @@ static bool vfio_precopy_supported(VFIODevice *vbasedev)
>
> /* ---------------------------------------------------------------------- */
>
> +static int vfio_save_prepare(void *opaque, Error **errp)
> +{
> + VFIODevice *vbasedev = opaque;
> +
> + /*
> + * Snapshot doesn't use postcopy, so allow snapshot even if postcopy is on.
> + */
> + if (runstate_check(RUN_STATE_SAVE_VM)) {
> + return 0;
> + }
Just purely curious: will it really work to save a snapshot for the GPU
assigned use case?
> +
> + if (migrate_postcopy_ram()) {
> + error_setg(
> + errp, "%s: VFIO migration is not supported with postcopy migration",
> + vbasedev->name);
> + return -EOPNOTSUPP;
> + }
> +
> + return 0;
> +}
--
Peter Xu
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 4/5] vfio/migration: Block VFIO migration with postcopy migration
2023-09-01 15:51 ` Peter Xu
@ 2023-09-03 7:52 ` Avihai Horon
0 siblings, 0 replies; 14+ messages in thread
From: Avihai Horon @ 2023-09-03 7:52 UTC (permalink / raw)
To: Peter Xu, Tarun Gupta
Cc: qemu-devel, Alex Williamson, Cédric Le Goater, Juan Quintela,
Leonardo Bras, Yanghang Liu
On 01/09/2023 18:51, Peter Xu wrote:
> External email: Use caution opening links or attachments
>
>
> On Thu, Aug 31, 2023 at 03:57:01PM +0300, Avihai Horon wrote:
>> VFIO migration is not compatible with postcopy migration. A VFIO device
>> in the destination can't handle page faults for pages that have not been
>> sent yet.
>>
>> Doing such migration will cause the VM to crash in the destination:
>>
>> qemu-system-x86_64: VFIO_MAP_DMA failed: Bad address
>> qemu-system-x86_64: vfio_dma_map(0x55a28c7659d0, 0xc0000, 0xb000, 0x7f1b11a00000) = -14 (Bad address)
>> qemu: hardware error: vfio: DMA mapping failed, unable to continue
>>
>> To prevent this, block VFIO migration with postcopy migration.
>>
>> Reported-by: Yanghang Liu <yanghliu@redhat.com>
>> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
>> ---
>> hw/vfio/migration.c | 22 ++++++++++++++++++++++
>> 1 file changed, 22 insertions(+)
>>
>> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
>> index 71855468fe..20994dc1d6 100644
>> --- a/hw/vfio/migration.c
>> +++ b/hw/vfio/migration.c
>> @@ -335,6 +335,27 @@ static bool vfio_precopy_supported(VFIODevice *vbasedev)
>>
>> /* ---------------------------------------------------------------------- */
>>
>> +static int vfio_save_prepare(void *opaque, Error **errp)
>> +{
>> + VFIODevice *vbasedev = opaque;
>> +
>> + /*
>> + * Snapshot doesn't use postcopy, so allow snapshot even if postcopy is on.
>> + */
>> + if (runstate_check(RUN_STATE_SAVE_VM)) {
>> + return 0;
>> + }
> Just purely curious: will it really work to save a snapshot for the GPU
> assigned use case?
I have never tried that.
Adding Tarun, maybe he can answer that.
Thanks.
>> +
>> + if (migrate_postcopy_ram()) {
>> + error_setg(
>> + errp, "%s: VFIO migration is not supported with postcopy migration",
>> + vbasedev->name);
>> + return -EOPNOTSUPP;
>> + }
>> +
>> + return 0;
>> +}
> --
> Peter Xu
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v2 5/5] vfio/migration: Block VFIO migration with background snapshot
2023-08-31 12:56 [PATCH v2 0/5] vfio/migration: Block VFIO migration with postcopy and background snapshot Avihai Horon
` (3 preceding siblings ...)
2023-08-31 12:57 ` [PATCH v2 4/5] vfio/migration: Block VFIO migration with postcopy migration Avihai Horon
@ 2023-08-31 12:57 ` Avihai Horon
2023-09-01 15:51 ` Peter Xu
4 siblings, 1 reply; 14+ messages in thread
From: Avihai Horon @ 2023-08-31 12:57 UTC (permalink / raw)
To: qemu-devel
Cc: Alex Williamson, Cédric Le Goater, Juan Quintela, Peter Xu,
Leonardo Bras, Yanghang Liu, Avihai Horon
Background snapshot allows creating a snapshot of the VM while it's
running and keeping it small by not including dirty RAM pages.
The way it works is by first stopping the VM, saving the non-iterable
devices' state and then starting the VM and saving the RAM while write
protecting it with UFFD. The resulting snapshot represents the VM state
at snapshot start.
VFIO migration is not compatible with background snapshot.
First of all, VFIO device state is not even saved in background snapshot
because only non-iterable device state is saved. But even if it was
saved, after starting the VM, a VFIO device could dirty pages without it
being detected by UFFD write protection. This would corrupt the
snapshot, as the RAM in it would not represent the RAM at snapshot
start.
To prevent this, block VFIO migration with background snapshot.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
hw/vfio/migration.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 20994dc1d6..da43dcd2fe 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -340,7 +340,8 @@ static int vfio_save_prepare(void *opaque, Error **errp)
VFIODevice *vbasedev = opaque;
/*
- * Snapshot doesn't use postcopy, so allow snapshot even if postcopy is on.
+ * Snapshot doesn't use postcopy nor background snapshot, so allow snapshot
+ * even if they are on.
*/
if (runstate_check(RUN_STATE_SAVE_VM)) {
return 0;
@@ -353,6 +354,14 @@ static int vfio_save_prepare(void *opaque, Error **errp)
return -EOPNOTSUPP;
}
+ if (migrate_background_snapshot()) {
+ error_setg(
+ errp,
+ "%s: VFIO migration is not supported with background snapshot",
+ vbasedev->name);
+ return -EOPNOTSUPP;
+ }
+
return 0;
}
--
2.26.3
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v2 5/5] vfio/migration: Block VFIO migration with background snapshot
2023-08-31 12:57 ` [PATCH v2 5/5] vfio/migration: Block VFIO migration with background snapshot Avihai Horon
@ 2023-09-01 15:51 ` Peter Xu
0 siblings, 0 replies; 14+ messages in thread
From: Peter Xu @ 2023-09-01 15:51 UTC (permalink / raw)
To: Avihai Horon
Cc: qemu-devel, Alex Williamson, Cédric Le Goater, Juan Quintela,
Leonardo Bras, Yanghang Liu
On Thu, Aug 31, 2023 at 03:57:02PM +0300, Avihai Horon wrote:
> Background snapshot allows creating a snapshot of the VM while it's
> running and keeping it small by not including dirty RAM pages.
>
> The way it works is by first stopping the VM, saving the non-iterable
> devices' state and then starting the VM and saving the RAM while write
> protecting it with UFFD. The resulting snapshot represents the VM state
> at snapshot start.
>
> VFIO migration is not compatible with background snapshot.
> First of all, VFIO device state is not even saved in background snapshot
> because only non-iterable device state is saved. But even if it was
> saved, after starting the VM, a VFIO device could dirty pages without it
> being detected by UFFD write protection. This would corrupt the
> snapshot, as the RAM in it would not represent the RAM at snapshot
> start.
>
> To prevent this, block VFIO migration with background snapshot.
>
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
^ permalink raw reply [flat|nested] 14+ messages in thread