* [PATCH v2 0/5] migration: Notifier fixes for 11.0
@ 2026-01-26 21:36 Peter Xu
2026-01-26 21:36 ` [PATCH v2 1/5] migration: Add a tracepoint for invoking migration notifiers Peter Xu
` (5 more replies)
0 siblings, 6 replies; 9+ messages in thread
From: Peter Xu @ 2026-01-26 21:36 UTC (permalink / raw)
To: qemu-devel
Cc: Prasad Pandit, Stefan Hajnoczi, peterx, Fabiano Rosas,
Juraj Marcin
CI: https://gitlab.com/peterx/qemu/-/pipelines/2287309287
v2:
- Collected r-bs / a-bs
- Patch 2: update comment for possible sequence of notifies [Fabiano]
v1: https://lore.kernel.org/r/20260122230331.3543312-1-peterx@redhat.com
Two major goals for this small series:
- Fix postcopy issue where DONE and FAILED notifiers will be invoked twice
- Move FAILED notifier to be before vm_start() if the failure happens
during switchover (where we will stop the VM first)
The 2nd goal is needed by Stefan's ongoing work on block persistent
reservations, which requires a fallback on the source to happen before
vm_start(). Instead of introducing another FAILED_BEFORE_START event, this
patchset makes FAILED itself work for that.
Patch 1 adds a tracepoint for me to verify this fix.
Patches 2-3 are the real changes for the two goals above.
Patches 4-5 are some cleanups along the way that we can do, hence
attached at the end.
More details in the individual commit logs. Comments welcome, thanks.
Peter Xu (5):
migration: Add a tracepoint for invoking migration notifiers
migration: Fix double notification of DONE/FAIL for postcopy
migration: Notify migration FAILED before starting VM
migration: Drop explicit block activation in postcopy fail path
migration: Rename MIG_EVENT_PRECOPY_* to MIG_EVENT_*
include/migration/misc.h | 20 ++++++++++++--------
hw/intc/arm_gicv3_kvm.c | 2 +-
hw/net/virtio-net.c | 4 ++--
hw/vfio/cpr-legacy.c | 2 +-
hw/vfio/cpr.c | 8 ++++----
hw/vfio/migration.c | 4 ++--
migration/cpr-exec.c | 6 +++---
migration/migration.c | 29 ++++++++++++++++++++---------
net/vhost-vdpa.c | 4 ++--
ui/spice-core.c | 7 ++++---
migration/trace-events | 1 +
11 files changed, 52 insertions(+), 35 deletions(-)
--
2.50.1
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v2 1/5] migration: Add a tracepoint for invoking migration notifiers
2026-01-26 21:36 [PATCH v2 0/5] migration: Notifier fixes for 11.0 Peter Xu
@ 2026-01-26 21:36 ` Peter Xu
2026-01-26 21:36 ` [PATCH v2 2/5] migration: Fix double notification of DONE/FAIL for postcopy Peter Xu
` (4 subsequent siblings)
5 siblings, 0 replies; 9+ messages in thread
From: Peter Xu @ 2026-01-26 21:36 UTC (permalink / raw)
To: qemu-devel
Cc: Prasad Pandit, Stefan Hajnoczi, peterx, Fabiano Rosas,
Juraj Marcin
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/migration.c | 2 ++
migration/trace-events | 1 +
2 files changed, 3 insertions(+)
diff --git a/migration/migration.c b/migration/migration.c
index b103a82fc0..341b9be80e 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1528,6 +1528,8 @@ int migration_call_notifiers(MigrationEventType type, Error **errp)
GSList *elem, *next;
int ret;
+ trace_migration_call_notifiers(type);
+
e.type = type;
for (elem = migration_state_notifiers[mode]; elem; elem = next) {
diff --git a/migration/trace-events b/migration/trace-events
index 91d7506634..90629f828f 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -198,6 +198,7 @@ process_incoming_migration_co_end(int ret) "ret=%d"
process_incoming_migration_co_postcopy_end_main(void) ""
postcopy_preempt_enabled(bool value) "%d"
migration_precopy_complete(void) ""
+migration_call_notifiers(int type) "type=%d"
# migration-stats
migration_transferred_bytes(uint64_t qemu_file, uint64_t multifd, uint64_t rdma) "qemu_file %" PRIu64 " multifd %" PRIu64 " RDMA %" PRIu64
--
2.50.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH v2 2/5] migration: Fix double notification of DONE/FAIL for postcopy
2026-01-26 21:36 [PATCH v2 0/5] migration: Notifier fixes for 11.0 Peter Xu
2026-01-26 21:36 ` [PATCH v2 1/5] migration: Add a tracepoint for invoking migration notifiers Peter Xu
@ 2026-01-26 21:36 ` Peter Xu
2026-01-26 22:03 ` Fabiano Rosas
2026-01-26 21:36 ` [PATCH v2 3/5] migration: Notify migration FAILED before starting VM Peter Xu
` (3 subsequent siblings)
5 siblings, 1 reply; 9+ messages in thread
From: Peter Xu @ 2026-01-26 21:36 UTC (permalink / raw)
To: qemu-devel
Cc: Prasad Pandit, Stefan Hajnoczi, peterx, Fabiano Rosas,
Juraj Marcin, Marc-André Lureau, Dr. David Alan Gilbert
Migration notifiers will notify at any of three places: (1) SETUP
phase, (2) migration completes, (3) migration fails.
There's actually a special case for spice: one can refer to
b82fc321bf ("Postcopy+spice: Pass spice migration data earlier"). It
doesn't need another 4th event because in commit 9d9babf78d ("migration:
MigrationEvent for notifiers") we merged it together with the DONE event.
The merge makes some sense if we treat "switchover" of postcopy as "DONE",
however that also means for postcopy we'll notify DONE twice: once at
switchover, and once more at the end of postcopy in migration_cleanup().
In reality, the current code base will also notify FAILED for postcopy
twice. That's because of a (maybe accidental) change in commit
4af667f87c ("migration: notifier error checking").
First of all, we still need that notification at switchover as stated in
Dave's commit, however that's only needed for spice. To fix it, introduce a
POSTCOPY_START event to differentiate it from DONE. Use that instead in
postcopy_start(). Then spice will need to capture this event too.
Then we remove the extra FAILED notification in postcopy_start().
In case one wonders whether other DONE users should also monitor the
POSTCOPY_START event, we have two more DONE users:
- kvm_arm_gicv3_notifier
- cpr_exec_notifier
Neither of them needs a notification for POSTCOPY_START, only one when
migration has completed. Actually, both of them are used in CPR, which
doesn't support postcopy.
While at it, update the notifier transition graph in the comment, and move
it from migration_add_notifier() to be closer to where the enum is defined.
I didn't attach Fixes: because I am not aware of any real bug on such
double reporting. I'm wildly guessing the 2nd notify might be silently
ignored in many cases. However this is still worth fixing.
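To make the before/after event sequences concrete, here is a minimal,
self-contained C sketch. It is not QEMU code: the enum mirrors the one in
include/migration/misc.h after this patch, while Listener, listener_notify(),
replay() and the two event arrays are hypothetical names used only for
illustration. It replays a successful postcopy migration against a listener
that simply counts the events it receives.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors MigrationEventType in include/migration/misc.h after this patch. */
typedef enum {
    MIG_EVENT_PRECOPY_SETUP,
    MIG_EVENT_PRECOPY_DONE,
    MIG_EVENT_PRECOPY_FAILED,
    MIG_EVENT_POSTCOPY_START,
    MIG_EVENT_MAX
} MigrationEventType;

/* Hypothetical listener: only counts how often each event arrives. */
typedef struct {
    int seen[MIG_EVENT_MAX];
} Listener;

static void listener_notify(Listener *l, MigrationEventType type)
{
    assert(type < MIG_EVENT_MAX);
    l->seen[type]++;
}

/* Deliver a whole event sequence to one listener, in order. */
static void replay(Listener *l, const MigrationEventType *events, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        listener_notify(l, events[i]);
    }
}

/* Successful postcopy before this patch: switchover is reported as DONE. */
static const MigrationEventType postcopy_before_fix[] = {
    MIG_EVENT_PRECOPY_SETUP,
    MIG_EVENT_PRECOPY_DONE,      /* from postcopy_start() */
    MIG_EVENT_PRECOPY_DONE,      /* from migration_cleanup() */
};

/* Same migration after this patch: switchover has its own event. */
static const MigrationEventType postcopy_after_fix[] = {
    MIG_EVENT_PRECOPY_SETUP,
    MIG_EVENT_POSTCOPY_START,    /* from postcopy_start() */
    MIG_EVENT_PRECOPY_DONE,      /* from migration_cleanup() */
};
```

Replaying postcopy_before_fix leaves the DONE counter at 2, while
postcopy_after_fix leaves it at 1 plus a single POSTCOPY_START, which is the
behavior change this patch is after.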
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Dr. David Alan Gilbert <dave@treblig.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
include/migration/misc.h | 16 ++++++++++++----
migration/migration.c | 3 +--
ui/spice-core.c | 3 ++-
3 files changed, 15 insertions(+), 7 deletions(-)
diff --git a/include/migration/misc.h b/include/migration/misc.h
index e26d418a6e..1cd6cfd7f7 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -59,10 +59,22 @@ void migration_shutdown(void);
bool migration_is_running(void);
bool migration_thread_is_self(void);
+/*
+ * Notifiers may receive events in any of the following orders:
+ *
+ * - MIG_EVENT_PRECOPY_SETUP [-> MIG_EVENT_POSTCOPY_START]
+ * -> MIG_EVENT_PRECOPY_DONE
+ *
+ * - MIG_EVENT_PRECOPY_SETUP [-> MIG_EVENT_POSTCOPY_START]
+ * -> MIG_EVENT_PRECOPY_FAILED
+ *
+ * - MIG_EVENT_PRECOPY_FAILED
+ */
typedef enum MigrationEventType {
MIG_EVENT_PRECOPY_SETUP,
MIG_EVENT_PRECOPY_DONE,
MIG_EVENT_PRECOPY_FAILED,
+ MIG_EVENT_POSTCOPY_START,
MIG_EVENT_MAX
} MigrationEventType;
@@ -81,10 +93,6 @@ typedef int (*MigrationNotifyFunc)(NotifierWithReturn *notify,
/*
* Register the notifier @notify to be called when a migration event occurs
* for MIG_MODE_NORMAL, as specified by the MigrationEvent passed to @func.
- * Notifiers may receive events in any of the following orders:
- * - MIG_EVENT_PRECOPY_SETUP -> MIG_EVENT_PRECOPY_DONE
- * - MIG_EVENT_PRECOPY_SETUP -> MIG_EVENT_PRECOPY_FAILED
- * - MIG_EVENT_PRECOPY_FAILED
*/
void migration_add_notifier(NotifierWithReturn *notify,
MigrationNotifyFunc func);
diff --git a/migration/migration.c b/migration/migration.c
index 341b9be80e..bd24006c1a 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2591,7 +2591,7 @@ static int postcopy_start(MigrationState *ms, Error **errp)
* at the transition to postcopy and after the device state; in particular
* spice needs to trigger a transition now
*/
- migration_call_notifiers(MIG_EVENT_PRECOPY_DONE, NULL);
+ migration_call_notifiers(MIG_EVENT_POSTCOPY_START, NULL);
migration_downtime_end(ms);
@@ -2640,7 +2640,6 @@ fail:
migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILED);
}
migration_block_activate(NULL);
- migration_call_notifiers(MIG_EVENT_PRECOPY_FAILED, NULL);
bql_unlock();
return -1;
}
diff --git a/ui/spice-core.c b/ui/spice-core.c
index 8a6050f4ae..ce3c2954e3 100644
--- a/ui/spice-core.c
+++ b/ui/spice-core.c
@@ -585,7 +585,8 @@ static int migration_state_notifier(NotifierWithReturn *notifier,
if (e->type == MIG_EVENT_PRECOPY_SETUP) {
spice_server_migrate_start(spice_server);
- } else if (e->type == MIG_EVENT_PRECOPY_DONE) {
+ } else if (e->type == MIG_EVENT_PRECOPY_DONE ||
+ e->type == MIG_EVENT_POSTCOPY_START) {
spice_server_migrate_end(spice_server, true);
spice_have_target_host = false;
} else if (e->type == MIG_EVENT_PRECOPY_FAILED) {
--
2.50.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH v2 3/5] migration: Notify migration FAILED before starting VM
2026-01-26 21:36 [PATCH v2 0/5] migration: Notifier fixes for 11.0 Peter Xu
2026-01-26 21:36 ` [PATCH v2 1/5] migration: Add a tracepoint for invoking migration notifiers Peter Xu
2026-01-26 21:36 ` [PATCH v2 2/5] migration: Fix double notification of DONE/FAIL for postcopy Peter Xu
@ 2026-01-26 21:36 ` Peter Xu
2026-01-26 21:36 ` [PATCH v2 4/5] migration: Drop explicit block activation in postcopy fail path Peter Xu
` (2 subsequent siblings)
5 siblings, 0 replies; 9+ messages in thread
From: Peter Xu @ 2026-01-26 21:36 UTC (permalink / raw)
To: qemu-devel
Cc: Prasad Pandit, Stefan Hajnoczi, peterx, Fabiano Rosas,
Juraj Marcin, Cédric Le Goater, Marc-André Lureau
Devices may opt in to migration FAILED notifiers, to be invoked when
migration fails. Currently, the notifications happen in migration_cleanup().
That is normally fine, but not ideal if there is an ordering dependency
between the fallback and the VM starting again.
This patch moves the FAILED notification earlier, so that if the failure
happens during switchover, it'll notify before the VM restarts.
After walking over all existing FAILED notifier users, I came to the
conclusion that this should also be a cleaner approach, at least from a
design POV.
We have these notifier users, where the first two do not need to trap
FAILED:
|----------------------------+-------------------------------------+---------------------|
| device | handler | events needed |
|----------------------------+-------------------------------------+---------------------|
| gicv3 | kvm_arm_gicv3_notifier | DONE |
| vfio_iommufd / vfio_legacy | vfio_cpr_reboot_notifier | SETUP |
| cpr-exec | cpr_exec_notifier | FAILED, DONE |
| virtio-net | virtio_net_migration_state_notifier | SETUP, FAILED |
| vfio | vfio_migration_state_notifier | FAILED |
| vdpa | vdpa_net_migration_state_notifier | SETUP, FAILED |
| spice [*] | migration_state_notifier | SETUP, FAILED, DONE |
|----------------------------+-------------------------------------+---------------------|
For cpr-exec, it tries to clean up some cpr-exec specific fds or env
variables. This should be fine either way, as long as it happens before
migration_cleanup().
For virtio-net, we need to re-plug the primary device back into the guest in
failover mode. Likely benign.
VFIO needs to restart the device on FAILED. IIUC it should do so before
vm_start(): if the VFIO device can be put into a stopped state due to
migration, we should logically make it run again before vCPUs run.
VDPA will disable SVQ when migration is FAILED. Likely benign too, but it
looks better if we can do it before resuming vCPUs.
For spice, we should rely on "spice_server_migrate_end(false)" to retake
the ownership. Benign, but it looks more reasonable if the spice client does
it before the VM runs again.
Note that this change may introduce slightly more downtime if the
migration fails exactly at the switchover phase. But that's very rare,
and even if it happens, none of the above is expected to add a long delay,
only a short one that will likely be buried in the total downtime even on
failure.
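The ordering argument can be sketched with a toy model in C. This is only an
illustration under stated assumptions, not QEMU code: Model, on_failed(),
model_vm_start() and fail_during_switchover() are all hypothetical names, and
the "device" stands in for something like a VFIO device that was stopped for
switchover and is restarted by its FAILED handler.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy state: one device and the vCPUs. */
typedef struct {
    bool device_running;
    bool vcpus_running;
    bool vcpus_saw_stopped_device;  /* the situation we want to avoid */
} Model;

/* Stand-in for a device's FAILED handler: bring the device back. */
static void on_failed(Model *m)
{
    m->device_running = true;
}

/* Stand-in for vm_start(): vCPUs run and may touch the device at once. */
static void model_vm_start(Model *m)
{
    assert(!m->vcpus_running);
    m->vcpus_running = true;
    if (!m->device_running) {
        m->vcpus_saw_stopped_device = true;
    }
}

/*
 * Model a migration that fails during switchover, when both the VM and
 * the device have already been stopped.  notify_before_start selects the
 * ordering introduced by this patch (true) or the previous one (false).
 */
static void fail_during_switchover(Model *m, bool notify_before_start)
{
    m->device_running = false;
    m->vcpus_running = false;
    m->vcpus_saw_stopped_device = false;

    if (notify_before_start) {
        on_failed(m);
        model_vm_start(m);
    } else {
        model_vm_start(m);
        on_failed(m);
    }
}
```

In both orderings the device ends up running again; the difference is only
whether the vCPUs had a window in which they ran against a stopped device.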
Cc: Cédric Le Goater <clg@redhat.com>
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/migration.c | 19 +++++++++++++++----
1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index bd24006c1a..8d1c294b47 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1283,7 +1283,6 @@ static void migration_cleanup_json_writer(MigrationState *s)
static void migration_cleanup(MigrationState *s)
{
- MigrationEventType type;
QEMUFile *tmp = NULL;
trace_migration_cleanup();
@@ -1333,9 +1332,14 @@ static void migration_cleanup(MigrationState *s)
MIGRATION_STATUS_CANCELLED);
}
- type = migration_has_failed(s) ? MIG_EVENT_PRECOPY_FAILED :
- MIG_EVENT_PRECOPY_DONE;
- migration_call_notifiers(type, NULL);
+ /*
+ * FAILED notification should have already happened. Notify DONE if
+ * migration completed successfully.
+ */
+ if (!migration_has_failed(s)) {
+ migration_call_notifiers(MIG_EVENT_PRECOPY_DONE, NULL);
+ }
+
yank_unregister_instance(MIGRATION_YANK_INSTANCE);
}
@@ -3323,6 +3327,13 @@ static void migration_iteration_finish(MigrationState *s)
error_free(local_err);
break;
}
+
+ /*
+ * Notify FAILED before starting VM, so that devices can invoke
+ * necessary fallbacks before vCPUs run again.
+ */
+ migration_call_notifiers(MIG_EVENT_PRECOPY_FAILED, NULL);
+
if (runstate_is_live(s->vm_old_state)) {
if (!runstate_check(RUN_STATE_SHUTDOWN)) {
vm_start();
--
2.50.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH v2 4/5] migration: Drop explicit block activation in postcopy fail path
2026-01-26 21:36 [PATCH v2 0/5] migration: Notifier fixes for 11.0 Peter Xu
` (2 preceding siblings ...)
2026-01-26 21:36 ` [PATCH v2 3/5] migration: Notify migration FAILED before starting VM Peter Xu
@ 2026-01-26 21:36 ` Peter Xu
2026-01-26 21:36 ` [PATCH v2 5/5] migration: Rename MIG_EVENT_PRECOPY_* to MIG_EVENT_* Peter Xu
2026-01-26 22:04 ` [PATCH v2 0/5] migration: Notifier fixes for 11.0 Fabiano Rosas
5 siblings, 0 replies; 9+ messages in thread
From: Peter Xu @ 2026-01-26 21:36 UTC (permalink / raw)
To: qemu-devel
Cc: Prasad Pandit, Stefan Hajnoczi, peterx, Fabiano Rosas,
Juraj Marcin
Postcopy (in its failure path) should share the disk reactivation with
precopy. Explicit activation used to be fine even if called twice, but after
26f65c01ed ("migration: Do not try to start VM if disk activation fails")
we may want to avoid it and always capture the failure when reactivation
happens (even if we do not expect the failure to happen). Remove this
redundant call.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/migration.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/migration/migration.c b/migration/migration.c
index 8d1c294b47..a5b0561cbe 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2643,7 +2643,6 @@ fail:
if (ms->state != MIGRATION_STATUS_CANCELLING) {
migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILED);
}
- migration_block_activate(NULL);
bql_unlock();
return -1;
}
--
2.50.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH v2 5/5] migration: Rename MIG_EVENT_PRECOPY_* to MIG_EVENT_*
2026-01-26 21:36 [PATCH v2 0/5] migration: Notifier fixes for 11.0 Peter Xu
` (3 preceding siblings ...)
2026-01-26 21:36 ` [PATCH v2 4/5] migration: Drop explicit block activation in postcopy fail path Peter Xu
@ 2026-01-26 21:36 ` Peter Xu
2026-01-26 22:04 ` [PATCH v2 0/5] migration: Notifier fixes for 11.0 Fabiano Rosas
5 siblings, 0 replies; 9+ messages in thread
From: Peter Xu @ 2026-01-26 21:36 UTC (permalink / raw)
To: qemu-devel
Cc: Prasad Pandit, Stefan Hajnoczi, peterx, Fabiano Rosas,
Juraj Marcin
All three events are shared between precopy and postcopy, rather than
precopy specific.
For example, both precopy and postcopy will go through a SETUP process.
Meanwhile, both FAILED and DONE notifiers will be notified for either
precopy or postcopy on completions / failures.
Rename them to make them match what they do, and shorter.
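As a reading aid for the renamed events, the orderings documented in the
comment above the enum can be expressed as a small checker in C. This is a
sketch under stated assumptions, not QEMU code: the enum mirrors the renamed
MigrationEventType from this patch, while sequence_is_documented() is a
hypothetical helper that accepts exactly the three orderings listed in that
comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the renamed enum in include/migration/misc.h. */
typedef enum {
    MIG_EVENT_SETUP,
    MIG_EVENT_POSTCOPY_START,
    MIG_EVENT_DONE,
    MIG_EVENT_FAILED,
    MIG_EVENT_MAX
} MigrationEventType;

/*
 * Return true if @ev (of length @n) is one of:
 *   - SETUP [-> POSTCOPY_START] -> DONE
 *   - SETUP [-> POSTCOPY_START] -> FAILED
 *   - FAILED
 */
static bool sequence_is_documented(const MigrationEventType *ev, size_t n)
{
    size_t i = 0;

    assert(ev != NULL || n == 0);

    /* A lone FAILED, with no SETUP before it. */
    if (n == 1 && ev[0] == MIG_EVENT_FAILED) {
        return true;
    }
    /* Everything else must start with SETUP and have a terminal event. */
    if (n < 2 || ev[i++] != MIG_EVENT_SETUP) {
        return false;
    }
    /* Optional POSTCOPY_START right after SETUP. */
    if (ev[i] == MIG_EVENT_POSTCOPY_START) {
        i++;
    }
    /* Exactly one event must remain, and it must be DONE or FAILED. */
    if (i != n - 1) {
        return false;
    }
    return ev[i] == MIG_EVENT_DONE || ev[i] == MIG_EVENT_FAILED;
}
```

A sequence with two DONE events is rejected by this checker, as is one that
ends at POSTCOPY_START without a terminal DONE or FAILED.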
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
include/migration/misc.h | 18 +++++++-----------
hw/intc/arm_gicv3_kvm.c | 2 +-
hw/net/virtio-net.c | 4 ++--
hw/vfio/cpr-legacy.c | 2 +-
hw/vfio/cpr.c | 8 ++++----
hw/vfio/migration.c | 4 ++--
migration/cpr-exec.c | 6 +++---
migration/migration.c | 8 ++++----
net/vhost-vdpa.c | 4 ++--
ui/spice-core.c | 6 +++---
10 files changed, 29 insertions(+), 33 deletions(-)
diff --git a/include/migration/misc.h b/include/migration/misc.h
index 1cd6cfd7f7..3159a5e53c 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -62,19 +62,15 @@ bool migration_thread_is_self(void);
/*
* Notifiers may receive events in any of the following orders:
*
- * - MIG_EVENT_PRECOPY_SETUP [-> MIG_EVENT_POSTCOPY_START]
- * -> MIG_EVENT_PRECOPY_DONE
- *
- * - MIG_EVENT_PRECOPY_SETUP [-> MIG_EVENT_POSTCOPY_START]
- * -> MIG_EVENT_PRECOPY_FAILED
- *
- * - MIG_EVENT_PRECOPY_FAILED
+ * - MIG_EVENT_SETUP [-> MIG_EVENT_POSTCOPY_START] -> MIG_EVENT_DONE
+ * - MIG_EVENT_SETUP [-> MIG_EVENT_POSTCOPY_START] -> MIG_EVENT_FAILED
+ * - MIG_EVENT_FAILED
*/
typedef enum MigrationEventType {
- MIG_EVENT_PRECOPY_SETUP,
- MIG_EVENT_PRECOPY_DONE,
- MIG_EVENT_PRECOPY_FAILED,
+ MIG_EVENT_SETUP,
MIG_EVENT_POSTCOPY_START,
+ MIG_EVENT_DONE,
+ MIG_EVENT_FAILED,
MIG_EVENT_MAX
} MigrationEventType;
@@ -84,7 +80,7 @@ typedef struct MigrationEvent {
/*
* A MigrationNotifyFunc may return an error code and an Error object,
- * but only when @e->type is MIG_EVENT_PRECOPY_SETUP. The code is an int
+ * but only when @e->type is MIG_EVENT_SETUP. The code is an int
* to allow for different failure modes and recovery actions.
*/
typedef int (*MigrationNotifyFunc)(NotifierWithReturn *notify,
diff --git a/hw/intc/arm_gicv3_kvm.c b/hw/intc/arm_gicv3_kvm.c
index 6f311e37ef..fddeefa26f 100644
--- a/hw/intc/arm_gicv3_kvm.c
+++ b/hw/intc/arm_gicv3_kvm.c
@@ -774,7 +774,7 @@ static void vm_change_state_handler(void *opaque, bool running,
static int kvm_arm_gicv3_notifier(NotifierWithReturn *notifier,
MigrationEvent *e, Error **errp)
{
- if (e->type == MIG_EVENT_PRECOPY_DONE) {
+ if (e->type == MIG_EVENT_DONE) {
GICv3State *s = container_of(notifier, GICv3State, cpr_notifier);
return kvm_device_access(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_CTRL,
KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES,
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 317f1ad23b..3e2dc30da6 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -3786,7 +3786,7 @@ static void virtio_net_handle_migration_primary(VirtIONet *n, MigrationEvent *e)
should_be_hidden = qatomic_read(&n->failover_primary_hidden);
- if (e->type == MIG_EVENT_PRECOPY_SETUP && !should_be_hidden) {
+ if (e->type == MIG_EVENT_SETUP && !should_be_hidden) {
if (failover_unplug_primary(n, dev)) {
vmstate_unregister(VMSTATE_IF(dev), qdev_get_vmsd(dev), dev);
qapi_event_send_unplug_primary(dev->id);
@@ -3794,7 +3794,7 @@ static void virtio_net_handle_migration_primary(VirtIONet *n, MigrationEvent *e)
} else {
warn_report("couldn't unplug primary device");
}
- } else if (e->type == MIG_EVENT_PRECOPY_FAILED) {
+ } else if (e->type == MIG_EVENT_FAILED) {
/* We already unplugged the device let's plug it back */
if (!failover_replug_primary(n, dev, &err)) {
if (err) {
diff --git a/hw/vfio/cpr-legacy.c b/hw/vfio/cpr-legacy.c
index 7c03ddb961..033a546c30 100644
--- a/hw/vfio/cpr-legacy.c
+++ b/hw/vfio/cpr-legacy.c
@@ -137,7 +137,7 @@ static int vfio_cpr_fail_notifier(NotifierWithReturn *notifier,
container_of(notifier, VFIOLegacyContainer, cpr.transfer_notifier);
VFIOContainer *bcontainer = VFIO_IOMMU(container);
- if (e->type != MIG_EVENT_PRECOPY_FAILED) {
+ if (e->type != MIG_EVENT_FAILED) {
return 0;
}
diff --git a/hw/vfio/cpr.c b/hw/vfio/cpr.c
index 998230d271..ffa4f8e099 100644
--- a/hw/vfio/cpr.c
+++ b/hw/vfio/cpr.c
@@ -18,7 +18,7 @@
int vfio_cpr_reboot_notifier(NotifierWithReturn *notifier,
MigrationEvent *e, Error **errp)
{
- if (e->type == MIG_EVENT_PRECOPY_SETUP &&
+ if (e->type == MIG_EVENT_SETUP &&
!runstate_check(RUN_STATE_SUSPENDED) && !vm_get_suspended()) {
error_setg(errp,
@@ -186,7 +186,7 @@ static int vfio_cpr_kvm_close_notifier(NotifierWithReturn *notifier,
MigrationEvent *e,
Error **errp)
{
- if (e->type == MIG_EVENT_PRECOPY_DONE) {
+ if (e->type == MIG_EVENT_DONE) {
vfio_kvm_device_close();
}
return 0;
@@ -272,9 +272,9 @@ static int vfio_cpr_pci_notifier(NotifierWithReturn *notifier,
VFIOPCIDevice *vdev =
container_of(notifier, VFIOPCIDevice, cpr.transfer_notifier);
- if (e->type == MIG_EVENT_PRECOPY_SETUP) {
+ if (e->type == MIG_EVENT_SETUP) {
return vfio_cpr_set_msi_virq(vdev, errp, false);
- } else if (e->type == MIG_EVENT_PRECOPY_FAILED) {
+ } else if (e->type == MIG_EVENT_FAILED) {
return vfio_cpr_set_msi_virq(vdev, errp, true);
}
return 0;
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index f857dc25ed..76a902b79c 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -917,10 +917,10 @@ static int vfio_migration_state_notifier(NotifierWithReturn *notifier,
trace_vfio_migration_state_notifier(vbasedev->name, e->type);
- if (e->type == MIG_EVENT_PRECOPY_FAILED) {
+ if (e->type == MIG_EVENT_FAILED) {
/*
* MigrationNotifyFunc may not return an error code and an Error
- * object for MIG_EVENT_PRECOPY_FAILED. Hence, report the error
+ * object for MIG_EVENT_FAILED. Hence, report the error
* locally and ignore the errp argument.
*/
ret = vfio_migration_set_state_or_reset(vbasedev,
diff --git a/migration/cpr-exec.c b/migration/cpr-exec.c
index e315a30f92..daa50916d2 100644
--- a/migration/cpr-exec.c
+++ b/migration/cpr-exec.c
@@ -164,7 +164,7 @@ static void cpr_exec_cb(void *opaque)
err = NULL;
/* Note, we can go from state COMPLETED to FAILED */
- migration_call_notifiers(MIG_EVENT_PRECOPY_FAILED, NULL);
+ migration_call_notifiers(MIG_EVENT_FAILED, NULL);
if (!migration_block_activate(&err)) {
/* error was already reported */
@@ -182,12 +182,12 @@ static int cpr_exec_notifier(NotifierWithReturn *notifier, MigrationEvent *e,
{
MigrationState *s = migrate_get_current();
- if (e->type == MIG_EVENT_PRECOPY_DONE) {
+ if (e->type == MIG_EVENT_DONE) {
QEMUBH *cpr_exec_bh = qemu_bh_new(cpr_exec_cb, NULL);
assert(s->state == MIGRATION_STATUS_COMPLETED);
qemu_bh_schedule(cpr_exec_bh);
qemu_notify_event();
- } else if (e->type == MIG_EVENT_PRECOPY_FAILED) {
+ } else if (e->type == MIG_EVENT_FAILED) {
cpr_exec_unpersist_state();
}
return 0;
diff --git a/migration/migration.c b/migration/migration.c
index a5b0561cbe..7ab0294d22 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1337,7 +1337,7 @@ static void migration_cleanup(MigrationState *s)
* migration completed successfully.
*/
if (!migration_has_failed(s)) {
- migration_call_notifiers(MIG_EVENT_PRECOPY_DONE, NULL);
+ migration_call_notifiers(MIG_EVENT_DONE, NULL);
}
yank_unregister_instance(MIGRATION_YANK_INSTANCE);
@@ -1541,7 +1541,7 @@ int migration_call_notifiers(MigrationEventType type, Error **errp)
notifier = (NotifierWithReturn *)elem->data;
ret = notifier->notify(notifier, &e, errp);
if (ret) {
- assert(type == MIG_EVENT_PRECOPY_SETUP);
+ assert(type == MIG_EVENT_SETUP);
return ret;
}
}
@@ -3331,7 +3331,7 @@ static void migration_iteration_finish(MigrationState *s)
* Notify FAILED before starting VM, so that devices can invoke
* necessary fallbacks before vCPUs run again.
*/
- migration_call_notifiers(MIG_EVENT_PRECOPY_FAILED, NULL);
+ migration_call_notifiers(MIG_EVENT_FAILED, NULL);
if (runstate_is_live(s->vm_old_state)) {
if (!runstate_check(RUN_STATE_SHUTDOWN)) {
@@ -3769,7 +3769,7 @@ void migration_start_outgoing(MigrationState *s)
rate_limit = migrate_max_bandwidth();
/* Notify before starting migration thread */
- if (migration_call_notifiers(MIG_EVENT_PRECOPY_SETUP, &local_err)) {
+ if (migration_call_notifiers(MIG_EVENT_SETUP, &local_err)) {
goto fail;
}
}
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 74d26a9497..f4b1f0e9e0 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -378,9 +378,9 @@ static int vdpa_net_migration_state_notifier(NotifierWithReturn *notifier,
{
VhostVDPAState *s = container_of(notifier, VhostVDPAState, migration_state);
- if (e->type == MIG_EVENT_PRECOPY_SETUP) {
+ if (e->type == MIG_EVENT_SETUP) {
vhost_vdpa_net_log_global_enable(s, true);
- } else if (e->type == MIG_EVENT_PRECOPY_FAILED) {
+ } else if (e->type == MIG_EVENT_FAILED) {
vhost_vdpa_net_log_global_enable(s, false);
}
return 0;
diff --git a/ui/spice-core.c b/ui/spice-core.c
index ce3c2954e3..ee13ecc4a5 100644
--- a/ui/spice-core.c
+++ b/ui/spice-core.c
@@ -583,13 +583,13 @@ static int migration_state_notifier(NotifierWithReturn *notifier,
return 0;
}
- if (e->type == MIG_EVENT_PRECOPY_SETUP) {
+ if (e->type == MIG_EVENT_SETUP) {
spice_server_migrate_start(spice_server);
- } else if (e->type == MIG_EVENT_PRECOPY_DONE ||
+ } else if (e->type == MIG_EVENT_DONE ||
e->type == MIG_EVENT_POSTCOPY_START) {
spice_server_migrate_end(spice_server, true);
spice_have_target_host = false;
- } else if (e->type == MIG_EVENT_PRECOPY_FAILED) {
+ } else if (e->type == MIG_EVENT_FAILED) {
spice_server_migrate_end(spice_server, false);
spice_have_target_host = false;
}
--
2.50.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH v2 2/5] migration: Fix double notification of DONE/FAIL for postcopy
2026-01-26 21:36 ` [PATCH v2 2/5] migration: Fix double notification of DONE/FAIL for postcopy Peter Xu
@ 2026-01-26 22:03 ` Fabiano Rosas
0 siblings, 0 replies; 9+ messages in thread
From: Fabiano Rosas @ 2026-01-26 22:03 UTC (permalink / raw)
To: Peter Xu, qemu-devel
Cc: Prasad Pandit, Stefan Hajnoczi, peterx, Juraj Marcin,
Marc-André Lureau, Dr. David Alan Gilbert
Peter Xu <peterx@redhat.com> writes:
> Migration notifiers will notify at any of three places: (1) SETUP
> phase, (2) migration completes, (3) migration fails.
>
> There's actually a special case for spice: one can refer to
> b82fc321bf ("Postcopy+spice: Pass spice migration data earlier"). It
> doesn't need another 4th event because in commit 9d9babf78d ("migration:
> MigrationEvent for notifiers") we merged it together with the DONE event.
>
> The merge makes some sense if we treat "switchover" of postcopy as "DONE",
> however that also means for postcopy we'll notify DONE twice.. The other
> one at the end of postcopy when migration_cleanup().
>
> In reality, the current code base will also notify FAILED for postcopy
> twice. It's because an (maybe accidental) change in commit
> 4af667f87c ("migration: notifier error checking").
>
> First of all, we still need that notification when switchover as stated in
> Dave's commit, however that's only needed for spice. To fix it, introduce
> POSTCOPY_START event to differentiate it from DONE. Use that instead in
> postcopy_start(). Then spice will need to capture this event too.
>
> Then we remove the extra FAILED notification in postcopy_start().
>
> If one wonder if other DONE users should also monitor POSTCOPY_START
> event.. We have two more DONE users:
>
> - kvm_arm_gicv3_notifier
> - cpr_exec_notifier
>
> Both of them do not need a notification for POSTCOPY_START, but only when
> migration completed. Actually, both of them are used in CPR, which doesn't
> support postcopy.
>
> When at this, update the notifier transition graph in the comment, and move
> it from migration_add_notifier() to be closer to where the enum is defined.
>
> I didn't attach Fixes: because I am not aware of any real bug on such
> double reporting. I'm wildly guessing the 2nd notify might be silently
> ignored in many cases. However this is still worth fixing.
>
> Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
> Cc: Dr. David Alan Gilbert <dave@treblig.org>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
> include/migration/misc.h | 16 ++++++++++++----
> migration/migration.c | 3 +--
> ui/spice-core.c | 3 ++-
> 3 files changed, 15 insertions(+), 7 deletions(-)
>
> diff --git a/include/migration/misc.h b/include/migration/misc.h
> index e26d418a6e..1cd6cfd7f7 100644
> --- a/include/migration/misc.h
> +++ b/include/migration/misc.h
> @@ -59,10 +59,22 @@ void migration_shutdown(void);
> bool migration_is_running(void);
> bool migration_thread_is_self(void);
>
> +/*
> + * Notifiers may receive events in any of the following orders:
> + *
> + * - MIG_EVENT_PRECOPY_SETUP [-> MIG_EVENT_POSTCOPY_START]
> + * -> MIG_EVENT_PRECOPY_DONE
> + *
> + * - MIG_EVENT_PRECOPY_SETUP [-> MIG_EVENT_POSTCOPY_START]
> + * -> MIG_EVENT_PRECOPY_FAILED
> + *
> + * - MIG_EVENT_PRECOPY_FAILED
> + */
Ugh, and for cpr-exec this is also possible:
- MIG_EVENT_PRECOPY_DONE -> MIG_EVENT_PRECOPY_FAILED
Maybe add it for completeness.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
> typedef enum MigrationEventType {
> MIG_EVENT_PRECOPY_SETUP,
> MIG_EVENT_PRECOPY_DONE,
> MIG_EVENT_PRECOPY_FAILED,
> + MIG_EVENT_POSTCOPY_START,
> MIG_EVENT_MAX
> } MigrationEventType;
>
> @@ -81,10 +93,6 @@ typedef int (*MigrationNotifyFunc)(NotifierWithReturn *notify,
> /*
> * Register the notifier @notify to be called when a migration event occurs
> * for MIG_MODE_NORMAL, as specified by the MigrationEvent passed to @func.
> - * Notifiers may receive events in any of the following orders:
> - * - MIG_EVENT_PRECOPY_SETUP -> MIG_EVENT_PRECOPY_DONE
> - * - MIG_EVENT_PRECOPY_SETUP -> MIG_EVENT_PRECOPY_FAILED
> - * - MIG_EVENT_PRECOPY_FAILED
> */
> void migration_add_notifier(NotifierWithReturn *notify,
> MigrationNotifyFunc func);
> diff --git a/migration/migration.c b/migration/migration.c
> index 341b9be80e..bd24006c1a 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -2591,7 +2591,7 @@ static int postcopy_start(MigrationState *ms, Error **errp)
> * at the transition to postcopy and after the device state; in particular
> * spice needs to trigger a transition now
> */
> - migration_call_notifiers(MIG_EVENT_PRECOPY_DONE, NULL);
> + migration_call_notifiers(MIG_EVENT_POSTCOPY_START, NULL);
>
> migration_downtime_end(ms);
>
> @@ -2640,7 +2640,6 @@ fail:
> migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILED);
> }
> migration_block_activate(NULL);
> - migration_call_notifiers(MIG_EVENT_PRECOPY_FAILED, NULL);
> bql_unlock();
> return -1;
> }
> diff --git a/ui/spice-core.c b/ui/spice-core.c
> index 8a6050f4ae..ce3c2954e3 100644
> --- a/ui/spice-core.c
> +++ b/ui/spice-core.c
> @@ -585,7 +585,8 @@ static int migration_state_notifier(NotifierWithReturn *notifier,
>
> if (e->type == MIG_EVENT_PRECOPY_SETUP) {
> spice_server_migrate_start(spice_server);
> - } else if (e->type == MIG_EVENT_PRECOPY_DONE) {
> + } else if (e->type == MIG_EVENT_PRECOPY_DONE ||
> + e->type == MIG_EVENT_POSTCOPY_START) {
> spice_server_migrate_end(spice_server, true);
> spice_have_target_host = false;
> } else if (e->type == MIG_EVENT_PRECOPY_FAILED) {
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2 0/5] migration: Notifier fixes for 11.0
2026-01-26 21:36 [PATCH v2 0/5] migration: Notifier fixes for 11.0 Peter Xu
` (4 preceding siblings ...)
2026-01-26 21:36 ` [PATCH v2 5/5] migration: Rename MIG_EVENT_PRECOPY_* to MIG_EVENT_* Peter Xu
@ 2026-01-26 22:04 ` Fabiano Rosas
2026-01-26 22:36 ` Peter Xu
5 siblings, 1 reply; 9+ messages in thread
From: Fabiano Rosas @ 2026-01-26 22:04 UTC (permalink / raw)
To: Peter Xu, qemu-devel; +Cc: Prasad Pandit, Stefan Hajnoczi, peterx, Juraj Marcin
Peter Xu <peterx@redhat.com> writes:
> CI: https://gitlab.com/peterx/qemu/-/pipelines/2287309287
>
> v2:
> - Collected r-bs / a-bs
> - Patch 2: update comment for possible sequence of notifies [Fabiano]
>
> v1: https://lore.kernel.org/r/20260122230331.3543312-1-peterx@redhat.com
>
> Two major goals for this small series:
>
> - Fix postcopy issue where DONE and FAILED notifiers will be invoked twice
>
> - Move FAILED notifier to be before vm_start() if the failure happens
> during switchover (where we will stop the VM first)
>
> The 2nd goal will be needed by Stefan's ongoing work on block persistent
> reservations, where a fallback should be required on src to happen before
> vm_start(). Instead of introducing another FAILED_BEFORE_START, this
> patchset should make FAILED work instead.
>
> Patch 1 adds a tracepoint for me to verify this fix.
>
> Patch 2-3 are the real changes of above two.
>
> Patch 3-4 are some cleanups alone the context that we can do, hence
> attached at the end.
>
> More details in commit logs individually. Comments welcomed, thanks.
>
> Peter Xu (5):
> migration: Add a tracepoint for invoking migration notifiers
> migration: Fix double notification of DONE/FAIL for postcopy
> migration: Notify migration FAILED before starting VM
> migration: Drop explicit block activation in postcopy fail path
> migration: Rename MIG_EVENT_PRECOPY_* to MIG_EVENT_*
>
> include/migration/misc.h | 20 ++++++++++++--------
> hw/intc/arm_gicv3_kvm.c | 2 +-
> hw/net/virtio-net.c | 4 ++--
> hw/vfio/cpr-legacy.c | 2 +-
> hw/vfio/cpr.c | 8 ++++----
> hw/vfio/migration.c | 4 ++--
> migration/cpr-exec.c | 6 +++---
> migration/migration.c | 29 ++++++++++++++++++++---------
> net/vhost-vdpa.c | 4 ++--
> ui/spice-core.c | 7 ++++---
> migration/trace-events | 1 +
> 11 files changed, 52 insertions(+), 35 deletions(-)
Queued, let me know if you want to change patch 2.
* Re: [PATCH v2 0/5] migration: Notifier fixes for 11.0
2026-01-26 22:04 ` [PATCH v2 0/5] migration: Notifier fixes for 11.0 Fabiano Rosas
@ 2026-01-26 22:36 ` Peter Xu
0 siblings, 0 replies; 9+ messages in thread
From: Peter Xu @ 2026-01-26 22:36 UTC (permalink / raw)
To: Fabiano Rosas; +Cc: qemu-devel, Prasad Pandit, Stefan Hajnoczi, Juraj Marcin
On Mon, Jan 26, 2026 at 07:04:38PM -0300, Fabiano Rosas wrote:
> Queued, let me know if you want to change patch 2.
Thanks. I have no strong preference.
Considering that cpr-exec's execve() is not designed to fail at all, at
least in production, and that FAILED after DONE is utterly confusing, maybe
we don't need to mention it in a doc everyone would read? I actually hope
instead that nobody noticed it's even possible.
I believe that code is there only because, when Steve worked on it, he
wanted to trap the case where he typed something wrong by accident while
testing cpr-exec. The feature is too special and mustn't be used wrongly, or
VM data can definitely get lost. From that POV, maybe some day we can
replace that whole blob, including the notify, with a
g_assert_not_reached().
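To make the idea concrete, here is a rough sketch in plain C of the two shapes being discussed. It is not QEMU code: the `Sketch*` and `sketch_*` names, the event enum, and the function-pointer stand-in for execve() are all hypothetical, and abort() plays the role that g_assert_not_reached() would play in the real tree. The "current" shape is only my reading of the DONE-then-FAILED sequence described above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the migration events; not QEMU's real enum. */
typedef enum { EV_SETUP, EV_DONE, EV_FAILED } SketchEvent;

#define SKETCH_MAX_EVENTS 8
static SketchEvent sketch_log[SKETCH_MAX_EVENTS];
static int sketch_log_len;

/* Record an event instead of walking a real notifier list. */
static void sketch_notify(SketchEvent ev)
{
    assert(sketch_log_len < SKETCH_MAX_EVENTS);
    sketch_log[sketch_log_len++] = ev;
}

/* Stand-in for execve(): like the real call, it returns only on failure. */
typedef int (*SketchExecFn)(void);

int sketch_exec_always_fails(void)
{
    return -1;
}

/*
 * Assumed current shape: DONE is notified, exec is attempted, and a
 * failed exec falls through to a FAILED notification (FAILED after DONE).
 */
void sketch_cpr_exec_current(SketchExecFn do_exec)
{
    sketch_notify(EV_DONE);
    do_exec();
    /* Reached only if the exec failed. */
    sketch_notify(EV_FAILED);
}

/*
 * Possible future shape: a returning exec is treated as a programming
 * error, so the fallback blob and its FAILED notification disappear.
 */
void sketch_cpr_exec_strict(SketchExecFn do_exec)
{
    sketch_notify(EV_DONE);
    do_exec();
    fprintf(stderr, "exec returned unexpectedly\n");
    abort(); /* plays the role of g_assert_not_reached() */
}
```

Calling `sketch_cpr_exec_current(sketch_exec_always_fails)` records DONE followed by FAILED, which is the confusing sequence; the strict variant never reports FAILED and terminates the process instead.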
--
Peter Xu