* [PATCH V3 00/13] allow cpr-reboot for vfio
@ 2024-02-08 18:53 Steve Sistare
2024-02-08 18:53 ` [PATCH V3 01/13] notify: pass error to notifier with return Steve Sistare
` (13 more replies)
0 siblings, 14 replies; 42+ messages in thread
From: Steve Sistare @ 2024-02-08 18:53 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau, David Hildenbrand, Steve Sistare
Allow cpr-reboot for vfio if the guest is in the suspended runstate. The
guest drivers' suspend methods flush outstanding requests and re-initialize
the devices, and thus there is no device state to save and restore. The
user is responsible for suspending the guest before initiating cpr, such as
by issuing guest-suspend-ram to the qemu guest agent.
Most of the patches in this series enhance migration notifiers so they can
return an error status and message. The last few patches register a notifier
for vfio that returns an error if the guest is not suspended.
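
For illustration only, here is a minimal standalone sketch of the kind of check the vfio notifier performs. It is not the code from this series: the Error type is a simplified stand-in for QEMU's Error object, and guest_is_suspended, cpr_reboot_check and the message text are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for QEMU's Error object (assumption). */
typedef struct Error {
    const char *msg;
} Error;

/* Hypothetical stand-in for querying the guest runstate. */
static bool guest_is_suspended;

/*
 * Hypothetical notifier body: allow the migration to proceed only when
 * the guest is suspended, otherwise report an error and return non-zero.
 */
int cpr_reboot_check(Error **errp)
{
    static Error not_suspended = {
        "guest must be suspended for cpr-reboot with vfio (example text)"
    };

    if (!guest_is_suspended) {
        if (errp) {
            *errp = &not_suspended;
        }
        return -1;
    }
    return 0;
}
```

A caller would treat a non-zero return as a request to abort the migration and would surface the message to the user.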
Changes in V3:
  * update to tip, add RB's
  * replace MigrationStatus with new enum MigrationEventType
  * simplify migrate_fd_connect error recovery
  * support vfio iommufd containers
  * add patches:
      migration: stop vm for cpr
      migration: update cpr-reboot description
Steve Sistare (13):
  notify: pass error to notifier with return
  migration: remove error from notifier data
  migration: convert to NotifierWithReturn
  migration: MigrationEvent for notifiers
  migration: remove postcopy_after_devices
  migration: MigrationNotifyFunc
  migration: per-mode notifiers
  migration: refactor migrate_fd_connect failures
  migration: notifier error checking
  migration: stop vm for cpr
  vfio: register container for cpr
  vfio: allow cpr-reboot migration if suspended
  migration: update cpr-reboot description
hw/net/virtio-net.c | 13 ++--
hw/vfio/common.c | 2 +-
hw/vfio/container.c | 11 ++-
hw/vfio/cpr.c | 39 +++++++++++
hw/vfio/iommufd.c | 6 ++
hw/vfio/meson.build | 1 +
hw/vfio/migration.c | 15 ++--
hw/vfio/trace-events | 2 +-
hw/virtio/vhost-user.c | 10 +--
hw/virtio/virtio-balloon.c | 3 +-
include/hw/vfio/vfio-common.h | 5 +-
include/hw/vfio/vfio-container-base.h | 1 +
include/hw/virtio/virtio-net.h | 2 +-
include/migration/misc.h | 31 ++++++--
include/qemu/notify.h | 8 ++-
migration/migration.c | 128 +++++++++++++++++++++++-----------
migration/migration.h | 2 -
migration/postcopy-ram.c | 3 +-
migration/postcopy-ram.h | 1 -
migration/ram.c | 3 +-
net/vhost-vdpa.c | 14 ++--
qapi/migration.json | 36 ++++++----
ui/spice-core.c | 17 +++--
util/notify.c | 5 +-
24 files changed, 244 insertions(+), 114 deletions(-)
create mode 100644 hw/vfio/cpr.c
--
1.8.3.1
* [PATCH V3 01/13] notify: pass error to notifier with return
2024-02-08 18:53 [PATCH V3 00/13] allow cpr-reboot for vfio Steve Sistare
@ 2024-02-08 18:53 ` Steve Sistare
2024-02-12 9:08 ` David Hildenbrand
2024-02-08 18:53 ` [PATCH V3 02/13] migration: remove error from notifier data Steve Sistare
` (12 subsequent siblings)
13 siblings, 1 reply; 42+ messages in thread
From: Steve Sistare @ 2024-02-08 18:53 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau, David Hildenbrand, Steve Sistare
Pass an error object as the third parameter to "notifier with return"
notifiers, so clients no longer need to bundle an error object in the
opaque data. The new parameter is used in a later patch.
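
As a rough standalone model of the calling convention described here (not the QEMU implementation): the list walker forwards errp to each notifier and stops at the first non-zero return. The singly linked list, the simplified Error type, and the names list_notify, ok_notifier and failing_notifier are assumptions made for the example; QEMU itself uses QLIST and its own Error API.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for QEMU's Error object (assumption). */
typedef struct Error {
    const char *msg;
} Error;

typedef struct Notifier Notifier;

/* Callback shape: the error pointer is the third parameter. */
typedef int (*NotifyFunc)(Notifier *n, void *data, Error **errp);

struct Notifier {
    NotifyFunc notify;
    Notifier *next;     /* plain linked list instead of QLIST */
};

/* Call each notifier in order; stop and return the first non-zero value. */
int list_notify(Notifier *head, void *data, Error **errp)
{
    int ret = 0;

    for (Notifier *n = head; n != NULL; n = n->next) {
        ret = n->notify(n, data, errp);
        if (ret != 0) {
            break;
        }
    }
    return ret;
}

/* Example notifier that succeeds; counts calls through the opaque data. */
int ok_notifier(Notifier *n, void *data, Error **errp)
{
    (void)n;
    (void)errp;
    ++*(int *)data;
    return 0;
}

/* Example notifier that fails and reports why through errp. */
int failing_notifier(Notifier *n, void *data, Error **errp)
{
    static Error failure = { "notifier refused (example text)" };

    (void)n;
    ++*(int *)data;
    if (errp) {
        *errp = &failure;
    }
    return -1;
}
```

With the error passed as a parameter, the opaque data only needs to carry the event-specific payload, which is what the next patch in the series relies on.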
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
---
hw/virtio/vhost-user.c | 2 +-
hw/virtio/virtio-balloon.c | 3 ++-
include/qemu/notify.h | 7 +++++--
migration/postcopy-ram.c | 2 +-
migration/ram.c | 2 +-
util/notify.c | 5 +++--
6 files changed, 13 insertions(+), 8 deletions(-)
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index f214df8..f502345 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -2084,7 +2084,7 @@ static int vhost_user_postcopy_end(struct vhost_dev *dev, Error **errp)
}
static int vhost_user_postcopy_notifier(NotifierWithReturn *notifier,
- void *opaque)
+ void *opaque, Error **errp)
{
struct PostcopyNotifyData *pnd = opaque;
struct vhost_user *u = container_of(notifier, struct vhost_user,
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 486fe3d..89f853f 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -633,7 +633,8 @@ static void virtio_balloon_free_page_done(VirtIOBalloon *s)
}
static int
-virtio_balloon_free_page_hint_notify(NotifierWithReturn *n, void *data)
+virtio_balloon_free_page_hint_notify(NotifierWithReturn *n, void *data,
+ Error **errp)
{
VirtIOBalloon *dev = container_of(n, VirtIOBalloon, free_page_hint_notify);
VirtIODevice *vdev = VIRTIO_DEVICE(dev);
diff --git a/include/qemu/notify.h b/include/qemu/notify.h
index bcfa70f..9a85631 100644
--- a/include/qemu/notify.h
+++ b/include/qemu/notify.h
@@ -45,12 +45,15 @@ bool notifier_list_empty(NotifierList *list);
/* Same as Notifier but allows .notify() to return errors */
typedef struct NotifierWithReturn NotifierWithReturn;
+typedef int (*NotifierWithReturnFunc)(NotifierWithReturn *notifier, void *data,
+ Error **errp);
+
struct NotifierWithReturn {
/**
* Return 0 on success (next notifier will be invoked), otherwise
* notifier_with_return_list_notify() will stop and return the value.
*/
- int (*notify)(NotifierWithReturn *notifier, void *data);
+ NotifierWithReturnFunc notify;
QLIST_ENTRY(NotifierWithReturn) node;
};
@@ -69,6 +72,6 @@ void notifier_with_return_list_add(NotifierWithReturnList *list,
void notifier_with_return_remove(NotifierWithReturn *notifier);
int notifier_with_return_list_notify(NotifierWithReturnList *list,
- void *data);
+ void *data, Error **errp);
#endif
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index 893ec8f..3ab2f6b 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -80,7 +80,7 @@ int postcopy_notify(enum PostcopyNotifyReason reason, Error **errp)
pnd.errp = errp;
return notifier_with_return_list_notify(&postcopy_notifier_list,
- &pnd);
+ &pnd, errp);
}
/*
diff --git a/migration/ram.c b/migration/ram.c
index d5b7cd5..ed9241a 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -428,7 +428,7 @@ int precopy_notify(PrecopyNotifyReason reason, Error **errp)
pnd.reason = reason;
pnd.errp = errp;
- return notifier_with_return_list_notify(&precopy_notifier_list, &pnd);
+ return notifier_with_return_list_notify(&precopy_notifier_list, &pnd, errp);
}
uint64_t ram_bytes_remaining(void)
diff --git a/util/notify.c b/util/notify.c
index 76bab21..c6e158f 100644
--- a/util/notify.c
+++ b/util/notify.c
@@ -61,13 +61,14 @@ void notifier_with_return_remove(NotifierWithReturn *notifier)
QLIST_REMOVE(notifier, node);
}
-int notifier_with_return_list_notify(NotifierWithReturnList *list, void *data)
+int notifier_with_return_list_notify(NotifierWithReturnList *list, void *data,
+ Error **errp)
{
NotifierWithReturn *notifier, *next;
int ret = 0;
QLIST_FOREACH_SAFE(notifier, &list->notifiers, node, next) {
- ret = notifier->notify(notifier, data);
+ ret = notifier->notify(notifier, data, errp);
if (ret != 0) {
break;
}
--
1.8.3.1
* [PATCH V3 02/13] migration: remove error from notifier data
2024-02-08 18:53 [PATCH V3 00/13] allow cpr-reboot for vfio Steve Sistare
2024-02-08 18:53 ` [PATCH V3 01/13] notify: pass error to notifier with return Steve Sistare
@ 2024-02-08 18:53 ` Steve Sistare
2024-02-12 9:08 ` David Hildenbrand
2024-02-08 18:53 ` [PATCH V3 03/13] migration: convert to NotifierWithReturn Steve Sistare
` (11 subsequent siblings)
13 siblings, 1 reply; 42+ messages in thread
From: Steve Sistare @ 2024-02-08 18:53 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau, David Hildenbrand, Steve Sistare
Remove the error object from opaque data passed to notifiers.
Use the new error parameter passed to the notifier instead.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
---
hw/virtio/vhost-user.c | 8 ++++----
include/migration/misc.h | 1 -
migration/postcopy-ram.c | 1 -
migration/postcopy-ram.h | 1 -
migration/ram.c | 1 -
5 files changed, 4 insertions(+), 8 deletions(-)
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index f502345..a1eea85 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -2096,20 +2096,20 @@ static int vhost_user_postcopy_notifier(NotifierWithReturn *notifier,
if (!virtio_has_feature(dev->protocol_features,
VHOST_USER_PROTOCOL_F_PAGEFAULT)) {
/* TODO: Get the device name into this error somehow */
- error_setg(pnd->errp,
+ error_setg(errp,
"vhost-user backend not capable of postcopy");
return -ENOENT;
}
break;
case POSTCOPY_NOTIFY_INBOUND_ADVISE:
- return vhost_user_postcopy_advise(dev, pnd->errp);
+ return vhost_user_postcopy_advise(dev, errp);
case POSTCOPY_NOTIFY_INBOUND_LISTEN:
- return vhost_user_postcopy_listen(dev, pnd->errp);
+ return vhost_user_postcopy_listen(dev, errp);
case POSTCOPY_NOTIFY_INBOUND_END:
- return vhost_user_postcopy_end(dev, pnd->errp);
+ return vhost_user_postcopy_end(dev, errp);
default:
/* We ignore notifications we don't know */
diff --git a/include/migration/misc.h b/include/migration/misc.h
index 1bc8902..5e65c18 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -31,7 +31,6 @@ typedef enum PrecopyNotifyReason {
typedef struct PrecopyNotifyData {
enum PrecopyNotifyReason reason;
- Error **errp;
} PrecopyNotifyData;
void precopy_infrastructure_init(void);
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index 3ab2f6b..0273dc6 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -77,7 +77,6 @@ int postcopy_notify(enum PostcopyNotifyReason reason, Error **errp)
{
struct PostcopyNotifyData pnd;
pnd.reason = reason;
- pnd.errp = errp;
return notifier_with_return_list_notify(&postcopy_notifier_list,
&pnd, errp);
diff --git a/migration/postcopy-ram.h b/migration/postcopy-ram.h
index 442ab89..ecae941 100644
--- a/migration/postcopy-ram.h
+++ b/migration/postcopy-ram.h
@@ -128,7 +128,6 @@ enum PostcopyNotifyReason {
struct PostcopyNotifyData {
enum PostcopyNotifyReason reason;
- Error **errp;
};
void postcopy_add_notifier(NotifierWithReturn *nn);
diff --git a/migration/ram.c b/migration/ram.c
index ed9241a..fe90024 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -426,7 +426,6 @@ int precopy_notify(PrecopyNotifyReason reason, Error **errp)
{
PrecopyNotifyData pnd;
pnd.reason = reason;
- pnd.errp = errp;
return notifier_with_return_list_notify(&precopy_notifier_list, &pnd, errp);
}
--
1.8.3.1
* [PATCH V3 03/13] migration: convert to NotifierWithReturn
2024-02-08 18:53 [PATCH V3 00/13] allow cpr-reboot for vfio Steve Sistare
2024-02-08 18:53 ` [PATCH V3 01/13] notify: pass error to notifier with return Steve Sistare
2024-02-08 18:53 ` [PATCH V3 02/13] migration: remove error from notifier data Steve Sistare
@ 2024-02-08 18:53 ` Steve Sistare
2024-02-12 9:10 ` David Hildenbrand
2024-02-08 18:53 ` [PATCH V3 04/13] migration: MigrationEvent for notifiers Steve Sistare
` (10 subsequent siblings)
13 siblings, 1 reply; 42+ messages in thread
From: Steve Sistare @ 2024-02-08 18:53 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau, David Hildenbrand, Steve Sistare
Change all migration notifiers to type NotifierWithReturn, so notifiers
can return an error status in a future patch. For now, pass NULL for the
notifier error parameter, and do not check the return value.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
---
hw/net/virtio-net.c | 4 +++-
hw/vfio/migration.c | 4 +++-
include/hw/vfio/vfio-common.h | 2 +-
include/hw/virtio/virtio-net.h | 2 +-
include/migration/misc.h | 6 +++---
include/qemu/notify.h | 1 +
migration/migration.c | 16 ++++++++--------
net/vhost-vdpa.c | 6 ++++--
ui/spice-core.c | 8 +++++---
9 files changed, 29 insertions(+), 20 deletions(-)
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 5a79bc3..75f4e86 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -3534,11 +3534,13 @@ static void virtio_net_handle_migration_primary(VirtIONet *n, MigrationState *s)
}
}
-static void virtio_net_migration_state_notifier(Notifier *notifier, void *data)
+static int virtio_net_migration_state_notifier(NotifierWithReturn *notifier,
+ void *data, Error **errp)
{
MigrationState *s = data;
VirtIONet *n = container_of(notifier, VirtIONet, migration_state);
virtio_net_handle_migration_primary(n, s);
+ return 0;
}
static bool failover_hide_primary_device(DeviceListener *listener,
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 70e6b1a..6b6acc4 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -754,7 +754,8 @@ static void vfio_vmstate_change(void *opaque, bool running, RunState state)
mig_state_to_str(new_state));
}
-static void vfio_migration_state_notifier(Notifier *notifier, void *data)
+static int vfio_migration_state_notifier(NotifierWithReturn *notifier,
+ void *data, Error **errp)
{
MigrationState *s = data;
VFIOMigration *migration = container_of(notifier, VFIOMigration,
@@ -770,6 +771,7 @@ static void vfio_migration_state_notifier(Notifier *notifier, void *data)
case MIGRATION_STATUS_FAILED:
vfio_migration_set_state_or_reset(vbasedev, VFIO_DEVICE_STATE_RUNNING);
}
+ return 0;
}
static void vfio_migration_free(VFIODevice *vbasedev)
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 9b7ef7d..4a6c262 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -62,7 +62,7 @@ typedef struct VFIORegion {
typedef struct VFIOMigration {
struct VFIODevice *vbasedev;
VMChangeStateEntry *vm_state;
- Notifier migration_state;
+ NotifierWithReturn migration_state;
uint32_t device_state;
int data_fd;
void *data_buffer;
diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h
index 55977f0..eaee8f4 100644
--- a/include/hw/virtio/virtio-net.h
+++ b/include/hw/virtio/virtio-net.h
@@ -221,7 +221,7 @@ struct VirtIONet {
DeviceListener primary_listener;
QDict *primary_opts;
bool primary_opts_from_json;
- Notifier migration_state;
+ NotifierWithReturn migration_state;
VirtioNetRssData rss_data;
struct NetRxPkt *rx_pkt;
struct EBPFRSSContext ebpf_rss;
diff --git a/include/migration/misc.h b/include/migration/misc.h
index 5e65c18..b62e351 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -60,9 +60,9 @@ void migration_object_init(void);
void migration_shutdown(void);
bool migration_is_idle(void);
bool migration_is_active(MigrationState *);
-void migration_add_notifier(Notifier *notify,
- void (*func)(Notifier *notifier, void *data));
-void migration_remove_notifier(Notifier *notify);
+void migration_add_notifier(NotifierWithReturn *notify,
+ NotifierWithReturnFunc func);
+void migration_remove_notifier(NotifierWithReturn *notify);
void migration_call_notifiers(MigrationState *s);
bool migration_in_setup(MigrationState *);
bool migration_has_finished(MigrationState *);
diff --git a/include/qemu/notify.h b/include/qemu/notify.h
index 9a85631..abf18db 100644
--- a/include/qemu/notify.h
+++ b/include/qemu/notify.h
@@ -45,6 +45,7 @@ bool notifier_list_empty(NotifierList *list);
/* Same as Notifier but allows .notify() to return errors */
typedef struct NotifierWithReturn NotifierWithReturn;
+/* Return int to allow for different failure modes and recovery actions */
typedef int (*NotifierWithReturnFunc)(NotifierWithReturn *notifier, void *data,
Error **errp);
diff --git a/migration/migration.c b/migration/migration.c
index d5f705c..ff9945b 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -68,8 +68,8 @@
#include "sysemu/dirtylimit.h"
#include "qemu/sockets.h"
-static NotifierList migration_state_notifiers =
- NOTIFIER_LIST_INITIALIZER(migration_state_notifiers);
+static NotifierWithReturnList migration_state_notifiers =
+ NOTIFIER_WITH_RETURN_LIST_INITIALIZER(migration_state_notifiers);
/* Messages sent on the return path from destination to source */
enum mig_rp_message_type {
@@ -1453,24 +1453,24 @@ static void migrate_fd_cancel(MigrationState *s)
}
}
-void migration_add_notifier(Notifier *notify,
- void (*func)(Notifier *notifier, void *data))
+void migration_add_notifier(NotifierWithReturn *notify,
+ NotifierWithReturnFunc func)
{
notify->notify = func;
- notifier_list_add(&migration_state_notifiers, notify);
+ notifier_with_return_list_add(&migration_state_notifiers, notify);
}
-void migration_remove_notifier(Notifier *notify)
+void migration_remove_notifier(NotifierWithReturn *notify)
{
if (notify->notify) {
- notifier_remove(notify);
+ notifier_with_return_remove(notify);
notify->notify = NULL;
}
}
void migration_call_notifiers(MigrationState *s)
{
- notifier_list_notify(&migration_state_notifiers, s);
+ notifier_with_return_list_notify(&migration_state_notifiers, s, 0);
}
bool migration_in_setup(MigrationState *s)
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 3726ee5..1c00519 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -34,7 +34,7 @@
typedef struct VhostVDPAState {
NetClientState nc;
struct vhost_vdpa vhost_vdpa;
- Notifier migration_state;
+ NotifierWithReturn migration_state;
VHostNetState *vhost_net;
/* Control commands shadow buffers */
@@ -322,7 +322,8 @@ static void vhost_vdpa_net_log_global_enable(VhostVDPAState *s, bool enable)
}
}
-static void vdpa_net_migration_state_notifier(Notifier *notifier, void *data)
+static int vdpa_net_migration_state_notifier(NotifierWithReturn *notifier,
+ void *data, Error **errp)
{
MigrationState *migration = data;
VhostVDPAState *s = container_of(notifier, VhostVDPAState,
@@ -333,6 +334,7 @@ static void vdpa_net_migration_state_notifier(Notifier *notifier, void *data)
} else if (migration_has_failed(migration)) {
vhost_vdpa_net_log_global_enable(s, false);
}
+ return 0;
}
static void vhost_vdpa_net_data_start_first(VhostVDPAState *s)
diff --git a/ui/spice-core.c b/ui/spice-core.c
index 37b277f..b3cd229 100644
--- a/ui/spice-core.c
+++ b/ui/spice-core.c
@@ -42,7 +42,7 @@
/* core bits */
static SpiceServer *spice_server;
-static Notifier migration_state;
+static NotifierWithReturn migration_state;
static const char *auth = "spice";
static char *auth_passwd;
static time_t auth_expires = TIME_MAX;
@@ -568,12 +568,13 @@ static SpiceInfo *qmp_query_spice_real(Error **errp)
return info;
}
-static void migration_state_notifier(Notifier *notifier, void *data)
+static int migration_state_notifier(NotifierWithReturn *notifier,
+ void *data, Error **errp)
{
MigrationState *s = data;
if (!spice_have_target_host) {
- return;
+ return 0;
}
if (migration_in_setup(s)) {
@@ -586,6 +587,7 @@ static void migration_state_notifier(Notifier *notifier, void *data)
spice_server_migrate_end(spice_server, false);
spice_have_target_host = false;
}
+ return 0;
}
int qemu_spice_migrate_info(const char *hostname, int port, int tls_port,
--
1.8.3.1
* [PATCH V3 04/13] migration: MigrationEvent for notifiers
2024-02-08 18:53 [PATCH V3 00/13] allow cpr-reboot for vfio Steve Sistare
` (2 preceding siblings ...)
2024-02-08 18:53 ` [PATCH V3 03/13] migration: convert to NotifierWithReturn Steve Sistare
@ 2024-02-08 18:53 ` Steve Sistare
2024-02-12 9:11 ` David Hildenbrand
2024-02-20 6:38 ` Peter Xu
2024-02-08 18:53 ` [PATCH V3 05/13] migration: remove postcopy_after_devices Steve Sistare
` (9 subsequent siblings)
13 siblings, 2 replies; 42+ messages in thread
From: Steve Sistare @ 2024-02-08 18:53 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau, David Hildenbrand, Steve Sistare
Passing MigrationState to notifiers is unsound because they could access
unstable migration state internals or even modify the state. Instead, pass
the minimal info needed in a new MigrationEvent struct, which could be
extended in the future if needed.
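
A small standalone sketch of how a notifier might consume such an event, assuming the enum and struct introduced by this patch. DemoDevice, demo_notifier and cleanup_event are hypothetical names for the example; the mapping in cleanup_event mirrors the completed-versus-failed choice made at cleanup time in this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Event types as introduced by this patch. */
typedef enum MigrationEventType {
    MIG_EVENT_PRECOPY_SETUP,
    MIG_EVENT_PRECOPY_DONE,
    MIG_EVENT_PRECOPY_FAILED,
    MIG_EVENT_MAX
} MigrationEventType;

typedef struct MigrationEvent {
    MigrationEventType type;
} MigrationEvent;

/* Hypothetical device that toggles dirty logging around a migration. */
typedef struct DemoDevice {
    bool dirty_logging;
} DemoDevice;

/* The notifier sees only the event type, never the migration internals. */
int demo_notifier(DemoDevice *dev, const MigrationEvent *e)
{
    switch (e->type) {
    case MIG_EVENT_PRECOPY_SETUP:
        dev->dirty_logging = true;
        break;
    case MIG_EVENT_PRECOPY_FAILED:
        dev->dirty_logging = false;
        break;
    default:
        break;
    }
    return 0;
}

/* Choose the cleanup event from the final outcome of the migration. */
MigrationEventType cleanup_event(bool completed)
{
    return completed ? MIG_EVENT_PRECOPY_DONE : MIG_EVENT_PRECOPY_FAILED;
}
```

Because the notifier receives a small value type, adding a field to MigrationEvent later would not require exposing MigrationState to device code.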
Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
hw/net/virtio-net.c | 11 ++++++-----
hw/vfio/migration.c | 10 +++-------
hw/vfio/trace-events | 2 +-
include/migration/misc.h | 14 +++++++++++++-
migration/migration.c | 15 ++++++++++-----
net/vhost-vdpa.c | 6 +++---
ui/spice-core.c | 9 ++++-----
7 files changed, 40 insertions(+), 27 deletions(-)
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 75f4e86..e803f98 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -3504,7 +3504,7 @@ out:
return !err;
}
-static void virtio_net_handle_migration_primary(VirtIONet *n, MigrationState *s)
+static void virtio_net_handle_migration_primary(VirtIONet *n, MigrationEvent *e)
{
bool should_be_hidden;
Error *err = NULL;
@@ -3516,7 +3516,7 @@ static void virtio_net_handle_migration_primary(VirtIONet *n, MigrationState *s)
should_be_hidden = qatomic_read(&n->failover_primary_hidden);
- if (migration_in_setup(s) && !should_be_hidden) {
+ if (e->type == MIG_EVENT_PRECOPY_SETUP && !should_be_hidden) {
if (failover_unplug_primary(n, dev)) {
vmstate_unregister(VMSTATE_IF(dev), qdev_get_vmsd(dev), dev);
qapi_event_send_unplug_primary(dev->id);
@@ -3524,7 +3524,7 @@ static void virtio_net_handle_migration_primary(VirtIONet *n, MigrationState *s)
} else {
warn_report("couldn't unplug primary device");
}
- } else if (migration_has_failed(s)) {
+ } else if (e->type == MIG_EVENT_PRECOPY_FAILED) {
/* We already unplugged the device let's plug it back */
if (!failover_replug_primary(n, dev, &err)) {
if (err) {
@@ -3537,9 +3537,10 @@ static void virtio_net_handle_migration_primary(VirtIONet *n, MigrationState *s)
static int virtio_net_migration_state_notifier(NotifierWithReturn *notifier,
void *data, Error **errp)
{
- MigrationState *s = data;
+ MigrationEvent *e = data;
+
VirtIONet *n = container_of(notifier, VirtIONet, migration_state);
- virtio_net_handle_migration_primary(n, s);
+ virtio_net_handle_migration_primary(n, e);
return 0;
}
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 6b6acc4..869d841 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -757,18 +757,14 @@ static void vfio_vmstate_change(void *opaque, bool running, RunState state)
static int vfio_migration_state_notifier(NotifierWithReturn *notifier,
void *data, Error **errp)
{
- MigrationState *s = data;
+ MigrationEvent *e = data;
VFIOMigration *migration = container_of(notifier, VFIOMigration,
migration_state);
VFIODevice *vbasedev = migration->vbasedev;
- trace_vfio_migration_state_notifier(vbasedev->name,
- MigrationStatus_str(s->state));
+ trace_vfio_migration_state_notifier(vbasedev->name, e->type);
- switch (s->state) {
- case MIGRATION_STATUS_CANCELLING:
- case MIGRATION_STATUS_CANCELLED:
- case MIGRATION_STATUS_FAILED:
+ if (e->type == MIG_EVENT_PRECOPY_FAILED) {
vfio_migration_set_state_or_reset(vbasedev, VFIO_DEVICE_STATE_RUNNING);
}
return 0;
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index 8fdde54..f0474b2 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -153,7 +153,7 @@ vfio_load_state(const char *name, uint64_t data) " (%s) data 0x%"PRIx64
vfio_load_state_device_data(const char *name, uint64_t data_size, int ret) " (%s) size 0x%"PRIx64" ret %d"
vfio_migration_realize(const char *name) " (%s)"
vfio_migration_set_state(const char *name, const char *state) " (%s) state %s"
-vfio_migration_state_notifier(const char *name, const char *state) " (%s) state %s"
+vfio_migration_state_notifier(const char *name, int state) " (%s) state %d"
vfio_save_block(const char *name, int data_size) " (%s) data_size %d"
vfio_save_cleanup(const char *name) " (%s)"
vfio_save_complete_precopy(const char *name, int ret) " (%s) ret %d"
diff --git a/include/migration/misc.h b/include/migration/misc.h
index b62e351..ff8cc57 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -60,10 +60,22 @@ void migration_object_init(void);
void migration_shutdown(void);
bool migration_is_idle(void);
bool migration_is_active(MigrationState *);
+
+typedef enum MigrationEventType {
+ MIG_EVENT_PRECOPY_SETUP,
+ MIG_EVENT_PRECOPY_DONE,
+ MIG_EVENT_PRECOPY_FAILED,
+ MIG_EVENT_MAX
+} MigrationEventType;
+
+typedef struct MigrationEvent {
+ MigrationEventType type;
+} MigrationEvent;
+
void migration_add_notifier(NotifierWithReturn *notify,
NotifierWithReturnFunc func);
void migration_remove_notifier(NotifierWithReturn *notify);
-void migration_call_notifiers(MigrationState *s);
+void migration_call_notifiers(MigrationState *s, MigrationEventType type);
bool migration_in_setup(MigrationState *);
bool migration_has_finished(MigrationState *);
bool migration_has_failed(MigrationState *);
diff --git a/migration/migration.c b/migration/migration.c
index ff9945b..82098ad 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1361,7 +1361,9 @@ static void migrate_fd_cleanup(MigrationState *s)
/* It is used on info migrate. We can't free it */
error_report_err(error_copy(s->error));
}
- migration_call_notifiers(s);
+ migration_call_notifiers(s, s->state == MIGRATION_STATUS_COMPLETED ?
+ MIG_EVENT_PRECOPY_DONE :
+ MIG_EVENT_PRECOPY_FAILED);
block_cleanup_parameters();
yank_unregister_instance(MIGRATION_YANK_INSTANCE);
}
@@ -1468,9 +1470,12 @@ void migration_remove_notifier(NotifierWithReturn *notify)
}
}
-void migration_call_notifiers(MigrationState *s)
+void migration_call_notifiers(MigrationState *s, MigrationEventType type)
{
- notifier_with_return_list_notify(&migration_state_notifiers, s, 0);
+ MigrationEvent e;
+
+ e.type = type;
+ notifier_with_return_list_notify(&migration_state_notifiers, &e, 0);
}
bool migration_in_setup(MigrationState *s)
@@ -2525,7 +2530,7 @@ static int postcopy_start(MigrationState *ms, Error **errp)
* spice needs to trigger a transition now
*/
ms->postcopy_after_devices = true;
- migration_call_notifiers(ms);
+ migration_call_notifiers(ms, MIG_EVENT_PRECOPY_DONE);
migration_downtime_end(ms);
@@ -3584,7 +3589,7 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
rate_limit = migrate_max_bandwidth();
/* Notify before starting migration thread */
- migration_call_notifiers(s);
+ migration_call_notifiers(s, MIG_EVENT_PRECOPY_SETUP);
}
migration_rate_set(rate_limit);
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 1c00519..a29d18a 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -325,13 +325,13 @@ static void vhost_vdpa_net_log_global_enable(VhostVDPAState *s, bool enable)
static int vdpa_net_migration_state_notifier(NotifierWithReturn *notifier,
void *data, Error **errp)
{
- MigrationState *migration = data;
+ MigrationEvent *e = data;
VhostVDPAState *s = container_of(notifier, VhostVDPAState,
migration_state);
- if (migration_in_setup(migration)) {
+ if (e->type == MIG_EVENT_PRECOPY_SETUP) {
vhost_vdpa_net_log_global_enable(s, true);
- } else if (migration_has_failed(migration)) {
+ } else if (e->type == MIG_EVENT_PRECOPY_FAILED) {
vhost_vdpa_net_log_global_enable(s, false);
}
return 0;
diff --git a/ui/spice-core.c b/ui/spice-core.c
index b3cd229..0a59876 100644
--- a/ui/spice-core.c
+++ b/ui/spice-core.c
@@ -571,19 +571,18 @@ static SpiceInfo *qmp_query_spice_real(Error **errp)
static int migration_state_notifier(NotifierWithReturn *notifier,
void *data, Error **errp)
{
- MigrationState *s = data;
+ MigrationEvent *e = data;
if (!spice_have_target_host) {
return 0;
}
- if (migration_in_setup(s)) {
+ if (e->type == MIG_EVENT_PRECOPY_SETUP) {
spice_server_migrate_start(spice_server);
- } else if (migration_has_finished(s) ||
- migration_in_postcopy_after_devices(s)) {
+ } else if (e->type == MIG_EVENT_PRECOPY_DONE) {
spice_server_migrate_end(spice_server, true);
spice_have_target_host = false;
- } else if (migration_has_failed(s)) {
+ } else if (e->type == MIG_EVENT_PRECOPY_FAILED) {
spice_server_migrate_end(spice_server, false);
spice_have_target_host = false;
}
--
1.8.3.1
* [PATCH V3 05/13] migration: remove postcopy_after_devices
2024-02-08 18:53 [PATCH V3 00/13] allow cpr-reboot for vfio Steve Sistare
` (3 preceding siblings ...)
2024-02-08 18:53 ` [PATCH V3 04/13] migration: MigrationEvent for notifiers Steve Sistare
@ 2024-02-08 18:53 ` Steve Sistare
2024-02-20 6:42 ` Peter Xu
2024-02-08 18:53 ` [PATCH V3 06/13] migration: MigrationNotifyFunc Steve Sistare
` (8 subsequent siblings)
13 siblings, 1 reply; 42+ messages in thread
From: Steve Sistare @ 2024-02-08 18:53 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau, David Hildenbrand, Steve Sistare
postcopy_after_devices and migration_in_postcopy_after_devices are no
longer used, so delete them.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
include/migration/misc.h | 1 -
migration/migration.c | 7 -------
migration/migration.h | 2 --
3 files changed, 10 deletions(-)
diff --git a/include/migration/misc.h b/include/migration/misc.h
index ff8cc57..7a51b45 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -80,7 +80,6 @@ bool migration_in_setup(MigrationState *);
bool migration_has_finished(MigrationState *);
bool migration_has_failed(MigrationState *);
/* ...and after the device transmission */
-bool migration_in_postcopy_after_devices(MigrationState *);
/* True if incoming migration entered POSTCOPY_INCOMING_DISCARD */
bool migration_in_incoming_postcopy(void);
/* True if incoming migration entered POSTCOPY_INCOMING_ADVISE */
diff --git a/migration/migration.c b/migration/migration.c
index 82098ad..9a72680 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1519,11 +1519,6 @@ bool migration_postcopy_is_alive(int state)
}
}
-bool migration_in_postcopy_after_devices(MigrationState *s)
-{
- return migration_in_postcopy() && s->postcopy_after_devices;
-}
-
bool migration_in_incoming_postcopy(void)
{
PostcopyState ps = postcopy_state_get();
@@ -1605,7 +1600,6 @@ int migrate_init(MigrationState *s, Error **errp)
s->expected_downtime = 0;
s->setup_time = 0;
s->start_postcopy = false;
- s->postcopy_after_devices = false;
s->migration_thread_running = false;
error_free(s->error);
s->error = NULL;
@@ -2529,7 +2523,6 @@ static int postcopy_start(MigrationState *ms, Error **errp)
* at the transition to postcopy and after the device state; in particular
* spice needs to trigger a transition now
*/
- ms->postcopy_after_devices = true;
migration_call_notifiers(ms, MIG_EVENT_PRECOPY_DONE);
migration_downtime_end(ms);
diff --git a/migration/migration.h b/migration/migration.h
index f2c8b8f..aef8afb 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -348,8 +348,6 @@ struct MigrationState {
/* Flag set once the migration has been asked to enter postcopy */
bool start_postcopy;
- /* Flag set after postcopy has sent the device state */
- bool postcopy_after_devices;
/* Flag set once the migration thread is running (and needs joining) */
bool migration_thread_running;
--
1.8.3.1
* [PATCH V3 06/13] migration: MigrationNotifyFunc
2024-02-08 18:53 [PATCH V3 00/13] allow cpr-reboot for vfio Steve Sistare
` (4 preceding siblings ...)
2024-02-08 18:53 ` [PATCH V3 05/13] migration: remove postcopy_after_devices Steve Sistare
@ 2024-02-08 18:53 ` Steve Sistare
2024-02-12 9:14 ` David Hildenbrand
2024-02-20 6:48 ` Peter Xu
2024-02-08 18:54 ` [PATCH V3 07/13] migration: per-mode notifiers Steve Sistare
` (7 subsequent siblings)
13 siblings, 2 replies; 42+ messages in thread
From: Steve Sistare @ 2024-02-08 18:53 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau, David Hildenbrand, Steve Sistare
Define MigrationNotifyFunc to improve type safety and simplify migration
notifiers.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
hw/net/virtio-net.c | 4 +---
hw/vfio/migration.c | 3 +--
include/migration/misc.h | 7 ++++++-
migration/migration.c | 4 ++--
net/vhost-vdpa.c | 6 ++----
ui/spice-core.c | 4 +---
6 files changed, 13 insertions(+), 15 deletions(-)
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index e803f98..a3c711b 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -3535,10 +3535,8 @@ static void virtio_net_handle_migration_primary(VirtIONet *n, MigrationEvent *e)
}
static int virtio_net_migration_state_notifier(NotifierWithReturn *notifier,
- void *data, Error **errp)
+ MigrationEvent *e, Error **errp)
{
- MigrationEvent *e = data;
-
VirtIONet *n = container_of(notifier, VirtIONet, migration_state);
virtio_net_handle_migration_primary(n, e);
return 0;
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 869d841..50140ed 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -755,9 +755,8 @@ static void vfio_vmstate_change(void *opaque, bool running, RunState state)
}
static int vfio_migration_state_notifier(NotifierWithReturn *notifier,
- void *data, Error **errp)
+ MigrationEvent *e, Error **errp)
{
- MigrationEvent *e = data;
VFIOMigration *migration = container_of(notifier, VFIOMigration,
migration_state);
VFIODevice *vbasedev = migration->vbasedev;
diff --git a/include/migration/misc.h b/include/migration/misc.h
index 7a51b45..d75c8b0 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -72,8 +72,13 @@ typedef struct MigrationEvent {
MigrationEventType type;
} MigrationEvent;
+
+/* Return int to allow for different failure modes and recovery actions */
+typedef int (*MigrationNotifyFunc)(NotifierWithReturn *notify,
+ MigrationEvent *e, Error **errp);
+
void migration_add_notifier(NotifierWithReturn *notify,
- NotifierWithReturnFunc func);
+ MigrationNotifyFunc func);
void migration_remove_notifier(NotifierWithReturn *notify);
void migration_call_notifiers(MigrationState *s, MigrationEventType type);
bool migration_in_setup(MigrationState *);
diff --git a/migration/migration.c b/migration/migration.c
index 9a72680..5f04c46 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1456,9 +1456,9 @@ static void migrate_fd_cancel(MigrationState *s)
}
void migration_add_notifier(NotifierWithReturn *notify,
- NotifierWithReturnFunc func)
+ MigrationNotifyFunc func)
{
- notify->notify = func;
+ notify->notify = (NotifierWithReturnFunc)func;
notifier_with_return_list_add(&migration_state_notifiers, notify);
}
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index a29d18a..e6bdb45 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -323,11 +323,9 @@ static void vhost_vdpa_net_log_global_enable(VhostVDPAState *s, bool enable)
}
static int vdpa_net_migration_state_notifier(NotifierWithReturn *notifier,
- void *data, Error **errp)
+ MigrationEvent *e, Error **errp)
{
- MigrationEvent *e = data;
- VhostVDPAState *s = container_of(notifier, VhostVDPAState,
- migration_state);
+ VhostVDPAState *s = container_of(notifier, VhostVDPAState, migration_state);
if (e->type == MIG_EVENT_PRECOPY_SETUP) {
vhost_vdpa_net_log_global_enable(s, true);
diff --git a/ui/spice-core.c b/ui/spice-core.c
index 0a59876..15be640 100644
--- a/ui/spice-core.c
+++ b/ui/spice-core.c
@@ -569,10 +569,8 @@ static SpiceInfo *qmp_query_spice_real(Error **errp)
}
static int migration_state_notifier(NotifierWithReturn *notifier,
- void *data, Error **errp)
+ MigrationEvent *e, Error **errp)
{
- MigrationEvent *e = data;
-
if (!spice_have_target_host) {
return 0;
}
--
1.8.3.1
* [PATCH V3 07/13] migration: per-mode notifiers
2024-02-08 18:53 [PATCH V3 00/13] allow cpr-reboot for vfio Steve Sistare
` (5 preceding siblings ...)
2024-02-08 18:53 ` [PATCH V3 06/13] migration: MigrationNotifyFunc Steve Sistare
@ 2024-02-08 18:54 ` Steve Sistare
2024-02-12 9:16 ` David Hildenbrand
2024-02-20 6:51 ` Peter Xu
2024-02-08 18:54 ` [PATCH V3 08/13] migration: refactor migrate_fd_connect failures Steve Sistare
` (6 subsequent siblings)
13 siblings, 2 replies; 42+ messages in thread
From: Steve Sistare @ 2024-02-08 18:54 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau, David Hildenbrand, Steve Sistare
Keep a separate list of migration notifiers for each migration mode.
Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
include/migration/misc.h | 2 ++
migration/migration.c | 22 +++++++++++++++++-----
2 files changed, 19 insertions(+), 5 deletions(-)
diff --git a/include/migration/misc.h b/include/migration/misc.h
index d75c8b0..0ea1902 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -79,6 +79,8 @@ typedef int (*MigrationNotifyFunc)(NotifierWithReturn *notify,
void migration_add_notifier(NotifierWithReturn *notify,
MigrationNotifyFunc func);
+void migration_add_notifier_mode(NotifierWithReturn *notify,
+ MigrationNotifyFunc func, MigMode mode);
void migration_remove_notifier(NotifierWithReturn *notify);
void migration_call_notifiers(MigrationState *s, MigrationEventType type);
bool migration_in_setup(MigrationState *);
diff --git a/migration/migration.c b/migration/migration.c
index 5f04c46..1601a03 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -68,8 +68,13 @@
#include "sysemu/dirtylimit.h"
#include "qemu/sockets.h"
-static NotifierWithReturnList migration_state_notifiers =
- NOTIFIER_WITH_RETURN_LIST_INITIALIZER(migration_state_notifiers);
+#define NOTIFIER_ELEM_INIT(array, elem) \
+ [elem] = NOTIFIER_WITH_RETURN_LIST_INITIALIZER((array)[elem])
+
+static NotifierWithReturnList migration_state_notifiers[] = {
+ NOTIFIER_ELEM_INIT(migration_state_notifiers, MIG_MODE_NORMAL),
+ NOTIFIER_ELEM_INIT(migration_state_notifiers, MIG_MODE_CPR_REBOOT),
+};
/* Messages sent on the return path from destination to source */
enum mig_rp_message_type {
@@ -1455,11 +1460,17 @@ static void migrate_fd_cancel(MigrationState *s)
}
}
+void migration_add_notifier_mode(NotifierWithReturn *notify,
+ MigrationNotifyFunc func, MigMode mode)
+{
+ notify->notify = (NotifierWithReturnFunc)func;
+ notifier_with_return_list_add(&migration_state_notifiers[mode], notify);
+}
+
void migration_add_notifier(NotifierWithReturn *notify,
MigrationNotifyFunc func)
{
- notify->notify = (NotifierWithReturnFunc)func;
- notifier_with_return_list_add(&migration_state_notifiers, notify);
+ migration_add_notifier_mode(notify, func, MIG_MODE_NORMAL);
}
void migration_remove_notifier(NotifierWithReturn *notify)
@@ -1472,10 +1483,11 @@ void migration_remove_notifier(NotifierWithReturn *notify)
void migration_call_notifiers(MigrationState *s, MigrationEventType type)
{
+ MigMode mode = s->parameters.mode;
MigrationEvent e;
e.type = type;
- notifier_with_return_list_notify(&migration_state_notifiers, &e, 0);
+ notifier_with_return_list_notify(&migration_state_notifiers[mode], &e, 0);
}
bool migration_in_setup(MigrationState *s)
--
1.8.3.1
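A minimal standalone sketch of the per-mode lookup described in this patch, not QEMU code. The names Mode, Notifier, add_notifier_mode and call_notifiers are hypothetical stand-ins for MigMode, NotifierWithReturn, migration_add_notifier_mode and migration_call_notifiers, and the hand-rolled list replaces QEMU's NotifierWithReturnList. Stopping at the first nonzero return is an assumption about the list helper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for MigMode. */
typedef enum { MODE_NORMAL, MODE_CPR_REBOOT, MODE__MAX } Mode;

typedef struct Notifier Notifier;
typedef int (*NotifyFn)(Notifier *n, int event);

/* Hypothetical stand-in for NotifierWithReturn. */
struct Notifier {
    NotifyFn fn;
    Notifier *next;
};

/* One list head per mode, selected with designated initializers. */
static Notifier *lists[MODE__MAX] = {
    [MODE_NORMAL] = NULL,
    [MODE_CPR_REBOOT] = NULL,
};

static void add_notifier_mode(Notifier *n, NotifyFn fn, Mode mode)
{
    assert(mode < MODE__MAX);
    n->fn = fn;
    n->next = lists[mode];
    lists[mode] = n;
}

/* Registering without a mode lands on the normal-mode list. */
static void add_notifier(Notifier *n, NotifyFn fn)
{
    add_notifier_mode(n, fn, MODE_NORMAL);
}

/* Call only the notifiers of the given mode; stop at the first error. */
static int call_notifiers(Mode mode, int event)
{
    assert(mode < MODE__MAX);
    for (Notifier *n = lists[mode]; n; n = n->next) {
        int ret = n->fn(n, event);
        if (ret) {
            return ret;
        }
    }
    return 0;
}

static int normal_calls;
static int cpr_calls;

static int count_normal(Notifier *n, int event)
{
    (void)n;
    (void)event;
    normal_calls++;
    return 0;
}

static int count_cpr(Notifier *n, int event)
{
    (void)n;
    (void)event;
    cpr_calls++;
    return 0;
}

/* Register one notifier per mode, then fire one event in cpr-reboot mode. */
static void demo_per_mode(void)
{
    static Notifier a, b;

    add_notifier(&a, count_normal);
    add_notifier_mode(&b, count_cpr, MODE_CPR_REBOOT);
    call_notifiers(MODE_CPR_REBOOT, 0);
}
```

After demo_per_mode() runs, only the cpr-reboot notifier has been called. This is the property the series relies on: a notifier registered for one mode never sees events from a migration in another mode.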
* [PATCH V3 08/13] migration: refactor migrate_fd_connect failures
2024-02-08 18:53 [PATCH V3 00/13] allow cpr-reboot for vfio Steve Sistare
` (6 preceding siblings ...)
2024-02-08 18:54 ` [PATCH V3 07/13] migration: per-mode notifiers Steve Sistare
@ 2024-02-08 18:54 ` Steve Sistare
2024-02-12 9:17 ` David Hildenbrand
2024-02-08 18:54 ` [PATCH V3 09/13] migration: notifier error checking Steve Sistare
` (5 subsequent siblings)
13 siblings, 1 reply; 42+ messages in thread
From: Steve Sistare @ 2024-02-08 18:54 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau, David Hildenbrand, Steve Sistare
Move common code for the error path in migrate_fd_connect to a shared
fail label. No functional change.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
---
migration/migration.c | 20 +++++++++-----------
1 file changed, 9 insertions(+), 11 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index 1601a03..01d8867 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3608,11 +3608,7 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
if (migrate_postcopy_ram() || migrate_return_path()) {
if (open_return_path_on_source(s)) {
error_setg(&local_err, "Unable to open return-path for postcopy");
- migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
- migrate_set_error(s, local_err);
- error_report_err(local_err);
- migrate_fd_cleanup(s);
- return;
+ goto fail;
}
}
@@ -3634,12 +3630,7 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
}
if (multifd_save_setup(&local_err) != 0) {
- migrate_set_error(s, local_err);
- error_report_err(local_err);
- migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
- MIGRATION_STATUS_FAILED);
- migrate_fd_cleanup(s);
- return;
+ goto fail;
}
if (migrate_background_snapshot()) {
@@ -3650,6 +3641,13 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
migration_thread, s, QEMU_THREAD_JOINABLE);
}
s->migration_thread_running = true;
+ return;
+
+fail:
+ migrate_set_error(s, local_err);
+ migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
+ error_report_err(local_err);
+ migrate_fd_cleanup(s);
}
static void migration_class_init(ObjectClass *klass, void *data)
--
1.8.3.1
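A minimal standalone sketch of the shared fail label this patch introduces, not QEMU code. Conn, conn_connect and conn_cleanup are hypothetical names; the two flags simulate the return-path and multifd setup steps that can fail in migrate_fd_connect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef enum { ST_SETUP, ST_ACTIVE, ST_FAILED } State;

/* Hypothetical stand-in for the parts of MigrationState used here. */
typedef struct {
    State state;
    const char *error;   /* last recorded error, or NULL */
    int cleaned_up;      /* number of cleanup calls */
} Conn;

static void conn_cleanup(Conn *c)
{
    c->cleaned_up++;
}

/* step_a_ok and step_b_ok simulate two setup steps that may fail. */
static void conn_connect(Conn *c, int step_a_ok, int step_b_ok)
{
    const char *err = NULL;

    assert(c != NULL);

    if (!step_a_ok) {
        err = "step A failed";
        goto fail;
    }
    if (!step_b_ok) {
        err = "step B failed";
        goto fail;
    }

    c->state = ST_ACTIVE;
    return;

fail:
    /* Every failure takes the same path: record, mark failed, report, clean up. */
    c->error = err;
    c->state = ST_FAILED;
    fprintf(stderr, "connect: %s\n", err);
    conn_cleanup(c);
}
```

Each failing step only sets its own error and jumps, so a later failure site (such as the notifier check added in the next patch) needs no duplicated recovery code.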
* [PATCH V3 09/13] migration: notifier error checking
2024-02-08 18:53 [PATCH V3 00/13] allow cpr-reboot for vfio Steve Sistare
` (7 preceding siblings ...)
2024-02-08 18:54 ` [PATCH V3 08/13] migration: refactor migrate_fd_connect failures Steve Sistare
@ 2024-02-08 18:54 ` Steve Sistare
2024-02-12 9:24 ` David Hildenbrand
2024-02-20 7:12 ` Peter Xu
2024-02-08 18:54 ` [PATCH V3 10/13] migration: stop vm for cpr Steve Sistare
` (4 subsequent siblings)
13 siblings, 2 replies; 42+ messages in thread
From: Steve Sistare @ 2024-02-08 18:54 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau, David Hildenbrand, Steve Sistare
Check the status returned by migration notifiers and report errors.
If notifiers fail, call the notifiers again so they can clean up.
None of the notifiers return an error status at this time.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
include/migration/misc.h | 3 ++-
migration/migration.c | 40 +++++++++++++++++++++++++++++-----------
2 files changed, 31 insertions(+), 12 deletions(-)
diff --git a/include/migration/misc.h b/include/migration/misc.h
index 0ea1902..6dc234b 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -82,7 +82,8 @@ void migration_add_notifier(NotifierWithReturn *notify,
void migration_add_notifier_mode(NotifierWithReturn *notify,
MigrationNotifyFunc func, MigMode mode);
void migration_remove_notifier(NotifierWithReturn *notify);
-void migration_call_notifiers(MigrationState *s, MigrationEventType type);
+int migration_call_notifiers(MigrationState *s, MigrationEventType type,
+ Error **errp);
bool migration_in_setup(MigrationState *);
bool migration_has_finished(MigrationState *);
bool migration_has_failed(MigrationState *);
diff --git a/migration/migration.c b/migration/migration.c
index 01d8867..d1fce9e 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1318,6 +1318,8 @@ void migrate_set_state(int *state, int old_state, int new_state)
static void migrate_fd_cleanup(MigrationState *s)
{
+ Error *local_err = NULL;
+
g_free(s->hostname);
s->hostname = NULL;
json_writer_free(s->vmdesc);
@@ -1362,13 +1364,23 @@ static void migrate_fd_cleanup(MigrationState *s)
MIGRATION_STATUS_CANCELLED);
}
+ if (!migration_has_failed(s) &&
+ migration_call_notifiers(s, MIG_EVENT_PRECOPY_DONE, &local_err)) {
+
+ migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
+ migrate_set_error(s, local_err);
+ error_free(local_err);
+ }
+
if (s->error) {
/* It is used on info migrate. We can't free it */
error_report_err(error_copy(s->error));
}
- migration_call_notifiers(s, s->state == MIGRATION_STATUS_COMPLETED ?
- MIG_EVENT_PRECOPY_DONE :
- MIG_EVENT_PRECOPY_FAILED);
+
+ if (migration_has_failed(s)) {
+ migration_call_notifiers(s, MIG_EVENT_PRECOPY_FAILED, NULL);
+ }
+
block_cleanup_parameters();
yank_unregister_instance(MIGRATION_YANK_INSTANCE);
}
@@ -1481,13 +1493,15 @@ void migration_remove_notifier(NotifierWithReturn *notify)
}
}
-void migration_call_notifiers(MigrationState *s, MigrationEventType type)
+int migration_call_notifiers(MigrationState *s, MigrationEventType type,
+ Error **errp)
{
MigMode mode = s->parameters.mode;
MigrationEvent e;
e.type = type;
- notifier_with_return_list_notify(&migration_state_notifiers[mode], &e, 0);
+ return notifier_with_return_list_notify(&migration_state_notifiers[mode],
+ &e, errp);
}
bool migration_in_setup(MigrationState *s)
@@ -2535,7 +2549,9 @@ static int postcopy_start(MigrationState *ms, Error **errp)
* at the transition to postcopy and after the device state; in particular
* spice needs to trigger a transition now
*/
- migration_call_notifiers(ms, MIG_EVENT_PRECOPY_DONE);
+ if (migration_call_notifiers(ms, MIG_EVENT_PRECOPY_DONE, errp)) {
+ goto fail;
+ }
migration_downtime_end(ms);
@@ -2555,11 +2571,10 @@ static int postcopy_start(MigrationState *ms, Error **errp)
ret = qemu_file_get_error(ms->to_dst_file);
if (ret) {
- error_setg(errp, "postcopy_start: Migration stream errored");
- migrate_set_state(&ms->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
- MIGRATION_STATUS_FAILED);
+ error_setg_errno(errp, -ret, "postcopy_start: Migration stream error");
+ bql_lock();
+ goto fail;
}
-
trace_postcopy_preempt_enabled(migrate_postcopy_preempt());
return ret;
@@ -2580,6 +2595,7 @@ fail:
error_report_err(local_err);
}
}
+ migration_call_notifiers(ms, MIG_EVENT_PRECOPY_FAILED, NULL);
bql_unlock();
return -1;
}
@@ -3594,7 +3610,9 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
rate_limit = migrate_max_bandwidth();
/* Notify before starting migration thread */
- migration_call_notifiers(s, MIG_EVENT_PRECOPY_SETUP);
+ if (migration_call_notifiers(s, MIG_EVENT_PRECOPY_SETUP, &local_err)) {
+ goto fail;
+ }
}
migration_rate_set(rate_limit);
--
1.8.3.1
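A minimal standalone sketch of the check-then-renotify flow described in this patch, not QEMU code. The event names mirror MIG_EVENT_PRECOPY_SETUP and MIG_EVENT_PRECOPY_FAILED, but the array-based list, the string error pointer, and the two example notifiers are hypothetical simplifications.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { EV_SETUP, EV_DONE, EV_FAILED } EventType;
typedef struct { EventType type; } Event;

/* Hypothetical callback type; errp may be NULL when the caller ignores errors. */
typedef int (*NotifyFn)(const Event *e, const char **errp);

#define MAX_NOTIFIERS 4
static NotifyFn notifiers[MAX_NOTIFIERS];
static size_t n_notifiers;

static void add_notifier(NotifyFn fn)
{
    assert(n_notifiers < MAX_NOTIFIERS);
    notifiers[n_notifiers++] = fn;
}

/* Return the first nonzero status, leaving later notifiers uncalled. */
static int call_notifiers(EventType type, const char **errp)
{
    Event e = { .type = type };

    for (size_t i = 0; i < n_notifiers; i++) {
        int ret = notifiers[i](&e, errp);
        if (ret) {
            return ret;
        }
    }
    return 0;
}

static int resources_held;

/* Acquires something on setup and releases it when told the migration failed. */
static int device_notifier(const Event *e, const char **errp)
{
    (void)errp;
    if (e->type == EV_SETUP) {
        resources_held = 1;
    } else if (e->type == EV_FAILED) {
        resources_held = 0;
    }
    return 0;
}

/* Rejects every setup event, to exercise the error path. */
static int strict_notifier(const Event *e, const char **errp)
{
    if (e->type == EV_SETUP) {
        if (errp) {
            *errp = "precondition not met";
        }
        return -1;
    }
    return 0;
}

static int start_migration(const char **errp)
{
    if (call_notifiers(EV_SETUP, errp)) {
        /* Tell all notifiers about the failure so they can undo setup work. */
        call_notifiers(EV_FAILED, NULL);
        return -1;
    }
    return 0;
}
```

With device_notifier and strict_notifier both registered, start_migration() fails, the error message is returned to the caller, and the second pass with EV_FAILED lets device_notifier release what it acquired during setup.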
* [PATCH V3 10/13] migration: stop vm for cpr
2024-02-08 18:53 [PATCH V3 00/13] allow cpr-reboot for vfio Steve Sistare
` (8 preceding siblings ...)
2024-02-08 18:54 ` [PATCH V3 09/13] migration: notifier error checking Steve Sistare
@ 2024-02-08 18:54 ` Steve Sistare
2024-02-20 7:33 ` Peter Xu
2024-02-08 18:54 ` [PATCH V3 11/13] vfio: register container " Steve Sistare
` (3 subsequent siblings)
13 siblings, 1 reply; 42+ messages in thread
From: Steve Sistare @ 2024-02-08 18:54 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau, David Hildenbrand, Steve Sistare
When migration for cpr is initiated, stop the vm and set state
RUN_STATE_FINISH_MIGRATE before ram is saved. This eliminates the
possibility of ram and device state being out of sync, and guarantees
that a guest in the suspended state remains suspended, because qmp_cont
rejects a cont command in the RUN_STATE_FINISH_MIGRATE state.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
include/migration/misc.h | 1 +
migration/migration.c | 32 +++++++++++++++++++++++++-------
2 files changed, 26 insertions(+), 7 deletions(-)
diff --git a/include/migration/misc.h b/include/migration/misc.h
index 6dc234b..54c99a3 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -60,6 +60,7 @@ void migration_object_init(void);
void migration_shutdown(void);
bool migration_is_idle(void);
bool migration_is_active(MigrationState *);
+bool migrate_mode_is_cpr(MigrationState *);
typedef enum MigrationEventType {
MIG_EVENT_PRECOPY_SETUP,
diff --git a/migration/migration.c b/migration/migration.c
index d1fce9e..fc5c587 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1603,6 +1603,11 @@ bool migration_is_active(MigrationState *s)
s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE);
}
+bool migrate_mode_is_cpr(MigrationState *s)
+{
+ return s->parameters.mode == MIG_MODE_CPR_REBOOT;
+}
+
int migrate_init(MigrationState *s, Error **errp)
{
int ret;
@@ -2651,13 +2656,14 @@ static int migration_completion_precopy(MigrationState *s,
bql_lock();
migration_downtime_start(s);
- s->vm_old_state = runstate_get();
- global_state_store();
-
- ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
- trace_migration_completion_vm_stop(ret);
- if (ret < 0) {
- goto out_unlock;
+ if (!migrate_mode_is_cpr(s)) {
+ s->vm_old_state = runstate_get();
+ global_state_store();
+ ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
+ trace_migration_completion_vm_stop(ret);
+ if (ret < 0) {
+ goto out_unlock;
+ }
}
ret = migration_maybe_pause(s, current_active_state,
@@ -3576,6 +3582,7 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
Error *local_err = NULL;
uint64_t rate_limit;
bool resume = s->state == MIGRATION_STATUS_POSTCOPY_PAUSED;
+ int ret;
/*
* If there's a previous error, free it and prepare for another one.
@@ -3651,6 +3658,17 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
goto fail;
}
+ if (migrate_mode_is_cpr(s)) {
+ s->vm_old_state = runstate_get();
+ global_state_store();
+ ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
+ trace_migration_completion_vm_stop(ret);
+ if (ret < 0) {
+ error_setg(&local_err, "migration_stop_vm failed, error %d", -ret);
+ goto fail;
+ }
+ }
+
if (migrate_background_snapshot()) {
qemu_thread_create(&s->thread, "bg_snapshot",
bg_migration_thread, s, QEMU_THREAD_JOINABLE);
--
1.8.3.1
* [PATCH V3 11/13] vfio: register container for cpr
2024-02-08 18:53 [PATCH V3 00/13] allow cpr-reboot for vfio Steve Sistare
` (9 preceding siblings ...)
2024-02-08 18:54 ` [PATCH V3 10/13] migration: stop vm for cpr Steve Sistare
@ 2024-02-08 18:54 ` Steve Sistare
2024-02-08 18:54 ` [PATCH V3 12/13] vfio: allow cpr-reboot migration if suspended Steve Sistare
` (2 subsequent siblings)
13 siblings, 0 replies; 42+ messages in thread
From: Steve Sistare @ 2024-02-08 18:54 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau, David Hildenbrand, Steve Sistare
Define entry points to perform per-container cpr-specific initialization
and teardown.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
hw/vfio/container.c | 11 ++++++++++-
hw/vfio/cpr.c | 19 +++++++++++++++++++
hw/vfio/iommufd.c | 6 ++++++
hw/vfio/meson.build | 1 +
include/hw/vfio/vfio-common.h | 3 +++
5 files changed, 39 insertions(+), 1 deletion(-)
create mode 100644 hw/vfio/cpr.c
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index bd25b9f..096d77e 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -621,10 +621,15 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
goto free_container_exit;
}
+ ret = vfio_cpr_register_container(bcontainer, errp);
+ if (ret) {
+ goto free_container_exit;
+ }
+
ret = vfio_ram_block_discard_disable(container, true);
if (ret) {
error_setg_errno(errp, -ret, "Cannot set discarding of RAM broken");
- goto free_container_exit;
+ goto unregister_container_exit;
}
assert(bcontainer->ops->setup);
@@ -667,6 +672,9 @@ listener_release_exit:
enable_discards_exit:
vfio_ram_block_discard_disable(container, false);
+unregister_container_exit:
+ vfio_cpr_unregister_container(bcontainer);
+
free_container_exit:
g_free(container);
@@ -710,6 +718,7 @@ static void vfio_disconnect_container(VFIOGroup *group)
vfio_container_destroy(bcontainer);
trace_vfio_disconnect_container(container->fd);
+ vfio_cpr_unregister_container(bcontainer);
close(container->fd);
g_free(container);
diff --git a/hw/vfio/cpr.c b/hw/vfio/cpr.c
new file mode 100644
index 0000000..3bede54
--- /dev/null
+++ b/hw/vfio/cpr.c
@@ -0,0 +1,19 @@
+/*
+ * Copyright (c) 2021-2024 Oracle and/or its affiliates.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/vfio/vfio-common.h"
+#include "qapi/error.h"
+
+int vfio_cpr_register_container(VFIOContainerBase *bcontainer, Error **errp)
+{
+ return 0;
+}
+
+void vfio_cpr_unregister_container(VFIOContainerBase *bcontainer)
+{
+}
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index 9bfddc1..e1be224 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -411,6 +411,11 @@ found_container:
goto err_listener_register;
}
+ ret = vfio_cpr_register_container(bcontainer, errp);
+ if (ret) {
+ goto err_listener_register;
+ }
+
/*
* TODO: examine RAM_BLOCK_DISCARD stuff, should we do group level
* for discarding incompatibility check as well?
@@ -461,6 +466,7 @@ static void iommufd_cdev_detach(VFIODevice *vbasedev)
iommufd_cdev_ram_block_discard_disable(false);
}
+ vfio_cpr_unregister_container(bcontainer);
iommufd_cdev_detach_container(vbasedev, container);
iommufd_cdev_container_destroy(container);
vfio_put_address_space(space);
diff --git a/hw/vfio/meson.build b/hw/vfio/meson.build
index bb98493..bba776f 100644
--- a/hw/vfio/meson.build
+++ b/hw/vfio/meson.build
@@ -5,6 +5,7 @@ vfio_ss.add(files(
'container-base.c',
'container.c',
'migration.c',
+ 'cpr.c',
))
vfio_ss.add(when: 'CONFIG_PSERIES', if_true: files('spapr.c'))
vfio_ss.add(when: 'CONFIG_IOMMUFD', if_true: files(
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 4a6c262..b9da6c0 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -205,6 +205,9 @@ void vfio_detach_device(VFIODevice *vbasedev);
int vfio_kvm_device_add_fd(int fd, Error **errp);
int vfio_kvm_device_del_fd(int fd, Error **errp);
+int vfio_cpr_register_container(VFIOContainerBase *bcontainer, Error **errp);
+void vfio_cpr_unregister_container(VFIOContainerBase *bcontainer);
+
extern const MemoryRegionOps vfio_region_ops;
typedef QLIST_HEAD(VFIOGroupList, VFIOGroup) VFIOGroupList;
typedef QLIST_HEAD(VFIODeviceList, VFIODevice) VFIODeviceList;
--
1.8.3.1
* [PATCH V3 12/13] vfio: allow cpr-reboot migration if suspended
2024-02-08 18:53 [PATCH V3 00/13] allow cpr-reboot for vfio Steve Sistare
` (10 preceding siblings ...)
2024-02-08 18:54 ` [PATCH V3 11/13] vfio: register container " Steve Sistare
@ 2024-02-08 18:54 ` Steve Sistare
2024-02-21 18:32 ` Steven Sistare
2024-02-08 18:54 ` [PATCH V3 13/13] migration: update cpr-reboot description Steve Sistare
2024-02-20 7:49 ` [PATCH V3 00/13] allow cpr-reboot for vfio Peter Xu
13 siblings, 1 reply; 42+ messages in thread
From: Steve Sistare @ 2024-02-08 18:54 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau, David Hildenbrand, Steve Sistare
Allow cpr-reboot for vfio if the guest is in the suspended runstate. The
guest drivers' suspend methods flush outstanding requests and re-initialize
the devices, and thus there is no device state to save and restore. The
user is responsible for suspending the guest before initiating cpr, such as
by issuing guest-suspend-ram to the qemu guest agent.
Relax the vfio blocker so it does not apply to cpr, and add a notifier that
verifies the guest is suspended.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
hw/vfio/common.c | 2 +-
hw/vfio/cpr.c | 20 ++++++++++++++++++++
hw/vfio/migration.c | 2 +-
include/hw/vfio/vfio-container-base.h | 1 +
4 files changed, 23 insertions(+), 2 deletions(-)
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 059bfdc..ff88c3f 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -128,7 +128,7 @@ int vfio_block_multiple_devices_migration(VFIODevice *vbasedev, Error **errp)
error_setg(&multiple_devices_migration_blocker,
"Multiple VFIO devices migration is supported only if all of "
"them support P2P migration");
- ret = migrate_add_blocker(&multiple_devices_migration_blocker, errp);
+ ret = migrate_add_blocker_normal(&multiple_devices_migration_blocker, errp);
return ret;
}
diff --git a/hw/vfio/cpr.c b/hw/vfio/cpr.c
index 3bede54..392c2dd 100644
--- a/hw/vfio/cpr.c
+++ b/hw/vfio/cpr.c
@@ -7,13 +7,33 @@
#include "qemu/osdep.h"
#include "hw/vfio/vfio-common.h"
+#include "migration/misc.h"
#include "qapi/error.h"
+#include "sysemu/runstate.h"
+
+static int vfio_cpr_reboot_notifier(NotifierWithReturn *notifier,
+ MigrationEvent *e, Error **errp)
+{
+ if (e->type == MIG_EVENT_PRECOPY_SETUP &&
+ !runstate_check(RUN_STATE_SUSPENDED) && !vm_get_suspended()) {
+
+ error_setg(errp,
+ "VFIO device only supports cpr-reboot for runstate suspended");
+
+ return -1;
+ }
+ return 0;
+}
int vfio_cpr_register_container(VFIOContainerBase *bcontainer, Error **errp)
{
+ migration_add_notifier_mode(&bcontainer->cpr_reboot_notifier,
+ vfio_cpr_reboot_notifier,
+ MIG_MODE_CPR_REBOOT);
return 0;
}
void vfio_cpr_unregister_container(VFIOContainerBase *bcontainer)
{
+ migration_remove_notifier(&bcontainer->cpr_reboot_notifier);
}
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 50140ed..2050ac8 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -889,7 +889,7 @@ static int vfio_block_migration(VFIODevice *vbasedev, Error *err, Error **errp)
vbasedev->migration_blocker = error_copy(err);
error_free(err);
- return migrate_add_blocker(&vbasedev->migration_blocker, errp);
+ return migrate_add_blocker_normal(&vbasedev->migration_blocker, errp);
}
/* ---------------------------------------------------------------------- */
diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h
index b2813b0..3582d5f 100644
--- a/include/hw/vfio/vfio-container-base.h
+++ b/include/hw/vfio/vfio-container-base.h
@@ -49,6 +49,7 @@ typedef struct VFIOContainerBase {
QLIST_ENTRY(VFIOContainerBase) next;
QLIST_HEAD(, VFIODevice) device_list;
GList *iova_ranges;
+ NotifierWithReturn cpr_reboot_notifier;
} VFIOContainerBase;
typedef struct VFIOGuestIOMMU {
--
1.8.3.1
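A minimal standalone sketch of the suspended-state check performed by vfio_cpr_reboot_notifier, not QEMU code. The Vm struct and cpr_reboot_check are hypothetical; the was_suspended flag models vm_get_suspended(), which is assumed here to report that the guest was suspended before the VM was stopped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { EV_SETUP, EV_DONE, EV_FAILED } EventType;
typedef enum { RS_RUNNING, RS_PAUSED, RS_SUSPENDED } RunState;

/* Hypothetical model of the VM state consulted by the notifier. */
typedef struct {
    RunState runstate;
    bool was_suspended;  /* assumed meaning: suspended before being stopped */
} Vm;

/*
 * Reject the setup event unless the guest is suspended now or was
 * suspended when it was stopped. All other events pass through.
 */
static int cpr_reboot_check(const Vm *vm, EventType type, const char **errp)
{
    assert(vm != NULL);

    if (type == EV_SETUP &&
        vm->runstate != RS_SUSPENDED && !vm->was_suspended) {
        if (errp) {
            *errp = "device only supports cpr-reboot for runstate suspended";
        }
        return -1;
    }
    return 0;
}
```

The check fires only on the setup event, so a notifier called again with a failure event during error recovery does not report a second error.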
* [PATCH V3 13/13] migration: update cpr-reboot description
2024-02-08 18:53 [PATCH V3 00/13] allow cpr-reboot for vfio Steve Sistare
` (11 preceding siblings ...)
2024-02-08 18:54 ` [PATCH V3 12/13] vfio: allow cpr-reboot migration if suspended Steve Sistare
@ 2024-02-08 18:54 ` Steve Sistare
2024-02-20 7:49 ` [PATCH V3 00/13] allow cpr-reboot for vfio Peter Xu
13 siblings, 0 replies; 42+ messages in thread
From: Steve Sistare @ 2024-02-08 18:54 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau, David Hildenbrand, Steve Sistare
Clarify qapi for cpr-reboot migration mode, and add vfio support.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
qapi/migration.json | 36 +++++++++++++++++++++++-------------
1 file changed, 23 insertions(+), 13 deletions(-)
diff --git a/qapi/migration.json b/qapi/migration.json
index 8197083..c83e0c0 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -636,19 +636,29 @@
#
# @normal: the original form of migration. (since 8.2)
#
-# @cpr-reboot: The migrate command saves state to a file, allowing one to
-# quit qemu, reboot to an updated kernel, and restart an updated
-# version of qemu. The caller must specify a migration URI
-# that writes to and reads from a file. Unlike normal mode,
-# the use of certain local storage options does not block the
-# migration, but the caller must not modify guest block devices
-# between the quit and restart. To avoid saving guest RAM to the
-# file, the memory backend must be shared, and the @x-ignore-shared
-# migration capability must be set. Guest RAM must be non-volatile
-# across reboot, such as by backing it with a dax device, but this
-# is not enforced. The restarted qemu arguments must match those
-# used to initially start qemu, plus the -incoming option.
-# (since 8.2)
+# @cpr-reboot: The migrate command stops the VM, saves state to the URI,
+# and qemu exits. After qemu exits, the user resumes by running
+# qemu -incoming.
+#
+# This mode allows the user to quit qemu, and restart an updated version
+# of qemu. The user may even update and reboot the OS before restarting,
+# as long as the URI persists across a reboot.
+#
+# Unlike normal mode, the use of certain local storage options does not
+# block the migration, but the user must not modify guest block devices
+# between the quit and restart.
+#
+# This mode supports vfio devices provided the user first puts the guest
+# in the suspended runstate, such as by issuing guest-suspend-ram to the
+# qemu guest agent.
+#
+# Best performance is achieved when the memory backend is shared and the
+# @x-ignore-shared migration capability is set, but this is not required.
+# Further, if the user reboots before restarting such a configuration, the
+# shared backend must be non-volatile across reboot, such as by backing
+# it with a dax device.
+#
+# (since 8.2)
##
{ 'enum': 'MigMode',
'data': [ 'normal', 'cpr-reboot' ] }
--
1.8.3.1
* Re: [PATCH V3 01/13] notify: pass error to notifier with return
2024-02-08 18:53 ` [PATCH V3 01/13] notify: pass error to notifier with return Steve Sistare
@ 2024-02-12 9:08 ` David Hildenbrand
0 siblings, 0 replies; 42+ messages in thread
From: David Hildenbrand @ 2024-02-12 9:08 UTC (permalink / raw)
To: Steve Sistare, qemu-devel
Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau
On 08.02.24 19:53, Steve Sistare wrote:
> Pass an error object as the third parameter to "notifier with return"
> notifiers, so clients no longer need to bundle an error object in the
> opaque data. The new parameter is used in a later patch.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> Reviewed-by: Peter Xu <peterx@redhat.com>
> ---
Reviewed-by: David Hildenbrand <david@redhat.com>
--
Cheers,
David / dhildenb
* Re: [PATCH V3 02/13] migration: remove error from notifier data
2024-02-08 18:53 ` [PATCH V3 02/13] migration: remove error from notifier data Steve Sistare
@ 2024-02-12 9:08 ` David Hildenbrand
0 siblings, 0 replies; 42+ messages in thread
From: David Hildenbrand @ 2024-02-12 9:08 UTC (permalink / raw)
To: Steve Sistare, qemu-devel
Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau
On 08.02.24 19:53, Steve Sistare wrote:
> Remove the error object from opaque data passed to notifiers.
> Use the new error parameter passed to the notifier instead.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> Reviewed-by: Peter Xu <peterx@redhat.com>
> ---
Would have squashed #1 and #2.
Reviewed-by: David Hildenbrand <david@redhat.com>
--
Cheers,
David / dhildenb
* Re: [PATCH V3 03/13] migration: convert to NotifierWithReturn
2024-02-08 18:53 ` [PATCH V3 03/13] migration: convert to NotifierWithReturn Steve Sistare
@ 2024-02-12 9:10 ` David Hildenbrand
0 siblings, 0 replies; 42+ messages in thread
From: David Hildenbrand @ 2024-02-12 9:10 UTC (permalink / raw)
To: Steve Sistare, qemu-devel
Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau
On 08.02.24 19:53, Steve Sistare wrote:
> Change all migration notifiers to type NotifierWithReturn, so notifiers
> can return an error status in a future patch. For now, pass NULL for the
> notifier error parameter, and do not check the return value.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> Reviewed-by: Peter Xu <peterx@redhat.com>
> ---
Reviewed-by: David Hildenbrand <david@redhat.com>
--
Cheers,
David / dhildenb
* Re: [PATCH V3 04/13] migration: MigrationEvent for notifiers
2024-02-08 18:53 ` [PATCH V3 04/13] migration: MigrationEvent for notifiers Steve Sistare
@ 2024-02-12 9:11 ` David Hildenbrand
2024-02-20 6:38 ` Peter Xu
1 sibling, 0 replies; 42+ messages in thread
From: David Hildenbrand @ 2024-02-12 9:11 UTC (permalink / raw)
To: Steve Sistare, qemu-devel
Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau
On 08.02.24 19:53, Steve Sistare wrote:
> Passing MigrationState to notifiers is unsound because they could access
> unstable migration state internals or even modify the state. Instead, pass
> the minimal info needed in a new MigrationEvent struct, which could be
> extended in the future if needed.
>
> Suggested-by: Peter Xu <peterx@redhat.com>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
Reviewed-by: David Hildenbrand <david@redhat.com>
--
Cheers,
David / dhildenb
* Re: [PATCH V3 06/13] migration: MigrationNotifyFunc
2024-02-08 18:53 ` [PATCH V3 06/13] migration: MigrationNotifyFunc Steve Sistare
@ 2024-02-12 9:14 ` David Hildenbrand
2024-02-20 6:48 ` Peter Xu
1 sibling, 0 replies; 42+ messages in thread
From: David Hildenbrand @ 2024-02-12 9:14 UTC (permalink / raw)
To: Steve Sistare, qemu-devel
Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau
On 08.02.24 19:53, Steve Sistare wrote:
> Define MigrationNotifyFunc to improve type safety and simplify migration
> notifiers.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
Reviewed-by: David Hildenbrand <david@redhat.com>
--
Cheers,
David / dhildenb
* Re: [PATCH V3 07/13] migration: per-mode notifiers
2024-02-08 18:54 ` [PATCH V3 07/13] migration: per-mode notifiers Steve Sistare
@ 2024-02-12 9:16 ` David Hildenbrand
2024-02-20 6:51 ` Peter Xu
1 sibling, 0 replies; 42+ messages in thread
From: David Hildenbrand @ 2024-02-12 9:16 UTC (permalink / raw)
To: Steve Sistare, qemu-devel
Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau
On 08.02.24 19:54, Steve Sistare wrote:
> Keep a separate list of migration notifiers for each migration mode.
>
> Suggested-by: Peter Xu <peterx@redhat.com>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
Reviewed-by: David Hildenbrand <david@redhat.com>
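For readers following along, the per-mode idea can be sketched in standalone C roughly as below. This is only an illustration under assumptions, not the code in the patch: every name here (DemoMode, DemoNotifier, demo_add_notifier, demo_call_notifiers, demo_count) is invented, and the real series uses NotifierWithReturn lists rather than this hand-rolled linked list.

```c
#include <stddef.h>

/* Hypothetical stand-ins for MigMode and NotifierWithReturn. */
typedef enum {
    DEMO_MODE_NORMAL,
    DEMO_MODE_CPR_REBOOT,
    DEMO_MODE__MAX
} DemoMode;

typedef struct DemoNotifier DemoNotifier;
struct DemoNotifier {
    int (*notify)(DemoNotifier *n, int event);
    DemoNotifier *next;
    int calls;                      /* bookkeeping for the example only */
};

/* One list head per mode, so a notifier only sees events for its mode. */
static DemoNotifier *demo_lists[DEMO_MODE__MAX];

void demo_add_notifier(DemoNotifier *n,
                       int (*func)(DemoNotifier *, int), DemoMode mode)
{
    n->notify = func;
    n->next = demo_lists[mode];
    demo_lists[mode] = n;
}

/* Walk only the list for the active mode; stop at the first error. */
int demo_call_notifiers(DemoMode mode, int event)
{
    for (DemoNotifier *n = demo_lists[mode]; n != NULL; n = n->next) {
        int ret = n->notify(n, event);
        if (ret) {
            return ret;
        }
    }
    return 0;
}

/* Example notifier: counts how often it was invoked. */
int demo_count(DemoNotifier *n, int event)
{
    (void)event;
    n->calls++;
    return 0;
}
```

With this layout a notifier registered for DEMO_MODE_NORMAL is never invoked while the cpr-reboot list is being walked, which is the property the commit message describes.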
--
Cheers,
David / dhildenb
* Re: [PATCH V3 08/13] migration: refactor migrate_fd_connect failures
2024-02-08 18:54 ` [PATCH V3 08/13] migration: refactor migrate_fd_connect failures Steve Sistare
@ 2024-02-12 9:17 ` David Hildenbrand
0 siblings, 0 replies; 42+ messages in thread
From: David Hildenbrand @ 2024-02-12 9:17 UTC (permalink / raw)
To: Steve Sistare, qemu-devel
Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau
On 08.02.24 19:54, Steve Sistare wrote:
> Move common code for the error path in migrate_fd_connect to a shared
> fail label. No functional change.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> Reviewed-by: Peter Xu <peterx@redhat.com>
> ---
Reviewed-by: David Hildenbrand <david@redhat.com>
--
Cheers,
David / dhildenb
* Re: [PATCH V3 09/13] migration: notifier error checking
2024-02-08 18:54 ` [PATCH V3 09/13] migration: notifier error checking Steve Sistare
@ 2024-02-12 9:24 ` David Hildenbrand
2024-02-12 15:37 ` Steven Sistare
2024-02-20 7:12 ` Peter Xu
1 sibling, 1 reply; 42+ messages in thread
From: David Hildenbrand @ 2024-02-12 9:24 UTC (permalink / raw)
To: Steve Sistare, qemu-devel
Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau
On 08.02.24 19:54, Steve Sistare wrote:
> Check the status returned by migration notifiers and report errors.
> If notifiers fail, call the notifiers again so they can clean up.
IIUC, if any of the notifiers actually fails, say, during
MIG_EVENT_PRECOPY_SETUP, you will call MIG_EVENT_PRECOPY_FAILED on all
notifiers.
That will include notifiers that have never seen a
MIG_EVENT_PRECOPY_SETUP call.
Is that what we expect notifiers to be able to deal with? Can we
document that?
--
Cheers,
David / dhildenb
* Re: [PATCH V3 09/13] migration: notifier error checking
2024-02-12 9:24 ` David Hildenbrand
@ 2024-02-12 15:37 ` Steven Sistare
0 siblings, 0 replies; 42+ messages in thread
From: Steven Sistare @ 2024-02-12 15:37 UTC (permalink / raw)
To: David Hildenbrand, qemu-devel
Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau
On 2/12/2024 4:24 AM, David Hildenbrand wrote:
> On 08.02.24 19:54, Steve Sistare wrote:
>> Check the status returned by migration notifiers and report errors.
>> If notifiers fail, call the notifiers again so they can clean up.
>
> IIUC, if any of the notifiers actually fails, say, during MIG_EVENT_PRECOPY_SETUP, you will call MIG_EVENT_PRECOPY_FAILED on all notifiers.
>
> That will include notifiers that have never seen a MIG_EVENT_PRECOPY_SETUP call.
Correct.
> Is that what we expect notifiers to be able to deal with? Can we document that?
The notifiers have always needed to handle failure without knowing the previous migration
states, and are robust about unwinding their own internal state. I will document it.
Thanks for all the RBs.
- Steve
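To make the expectation discussed above concrete, here is a minimal standalone sketch of a notifier that tolerates a FAILED event it never saw a SETUP for. It is an assumption-laden illustration, not code from the series: DemoEventType, DemoDevice and demo_device_notify are hypothetical names, and the real notifiers take a NotifierWithReturn, a MigrationEvent and an Error **.

```c
#include <stdbool.h>

/* Hypothetical event types, loosely mirroring MIG_EVENT_PRECOPY_*. */
typedef enum {
    DEMO_EVENT_SETUP,
    DEMO_EVENT_DONE,
    DEMO_EVENT_FAILED
} DemoEventType;

/* Hypothetical per-device state owned by the notifier. */
typedef struct {
    bool prepared;   /* true only between a SETUP and the matching unwind */
    int cleanups;    /* number of times state was really unwound */
} DemoDevice;

int demo_device_notify(DemoDevice *dev, DemoEventType type)
{
    switch (type) {
    case DEMO_EVENT_SETUP:
        dev->prepared = true;
        return 0;
    case DEMO_EVENT_DONE:
    case DEMO_EVENT_FAILED:
        /*
         * Unwind based on our own state, not on which events we assume
         * came before.  FAILED may arrive without a preceding SETUP, for
         * example when an earlier notifier in the list rejected SETUP.
         */
        if (dev->prepared) {
            dev->prepared = false;
            dev->cleanups++;
        }
        return 0;
    }
    return 0;
}
```

The point of the design is that the unwind is idempotent: calling the FAILED path on a notifier that was never set up is a no-op.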
* Re: [PATCH V3 04/13] migration: MigrationEvent for notifiers
2024-02-08 18:53 ` [PATCH V3 04/13] migration: MigrationEvent for notifiers Steve Sistare
2024-02-12 9:11 ` David Hildenbrand
@ 2024-02-20 6:38 ` Peter Xu
1 sibling, 0 replies; 42+ messages in thread
From: Peter Xu @ 2024-02-20 6:38 UTC (permalink / raw)
To: Steve Sistare
Cc: qemu-devel, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau, David Hildenbrand
On Thu, Feb 08, 2024 at 10:53:57AM -0800, Steve Sistare wrote:
> Passing MigrationState to notifiers is unsound because they could access
> unstable migration state internals or even modify the state. Instead, pass
> the minimal info needed in a new MigrationEvent struct, which could be
> extended in the future if needed.
>
> Suggested-by: Peter Xu <peterx@redhat.com>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
* Re: [PATCH V3 05/13] migration: remove postcopy_after_devices
2024-02-08 18:53 ` [PATCH V3 05/13] migration: remove postcopy_after_devices Steve Sistare
@ 2024-02-20 6:42 ` Peter Xu
0 siblings, 0 replies; 42+ messages in thread
From: Peter Xu @ 2024-02-20 6:42 UTC (permalink / raw)
To: Steve Sistare
Cc: qemu-devel, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau, David Hildenbrand
On Thu, Feb 08, 2024 at 10:53:58AM -0800, Steve Sistare wrote:
> postcopy_after_devices and migration_in_postcopy_after_devices are no
> longer used, so delete them.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
* Re: [PATCH V3 06/13] migration: MigrationNotifyFunc
2024-02-08 18:53 ` [PATCH V3 06/13] migration: MigrationNotifyFunc Steve Sistare
2024-02-12 9:14 ` David Hildenbrand
@ 2024-02-20 6:48 ` Peter Xu
1 sibling, 0 replies; 42+ messages in thread
From: Peter Xu @ 2024-02-20 6:48 UTC (permalink / raw)
To: Steve Sistare
Cc: qemu-devel, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau, David Hildenbrand
On Thu, Feb 08, 2024 at 10:53:59AM -0800, Steve Sistare wrote:
> Define MigrationNotifyFunc to improve type safety and simplify migration
> notifiers.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
* Re: [PATCH V3 07/13] migration: per-mode notifiers
2024-02-08 18:54 ` [PATCH V3 07/13] migration: per-mode notifiers Steve Sistare
2024-02-12 9:16 ` David Hildenbrand
@ 2024-02-20 6:51 ` Peter Xu
1 sibling, 0 replies; 42+ messages in thread
From: Peter Xu @ 2024-02-20 6:51 UTC (permalink / raw)
To: Steve Sistare
Cc: qemu-devel, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau, David Hildenbrand
On Thu, Feb 08, 2024 at 10:54:00AM -0800, Steve Sistare wrote:
> Keep a separate list of migration notifiers for each migration mode.
>
> Suggested-by: Peter Xu <peterx@redhat.com>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
* Re: [PATCH V3 09/13] migration: notifier error checking
2024-02-08 18:54 ` [PATCH V3 09/13] migration: notifier error checking Steve Sistare
2024-02-12 9:24 ` David Hildenbrand
@ 2024-02-20 7:12 ` Peter Xu
2024-02-20 22:12 ` Steven Sistare
1 sibling, 1 reply; 42+ messages in thread
From: Peter Xu @ 2024-02-20 7:12 UTC (permalink / raw)
To: Steve Sistare
Cc: qemu-devel, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau, David Hildenbrand
On Thu, Feb 08, 2024 at 10:54:02AM -0800, Steve Sistare wrote:
> Check the status returned by migration notifiers and report errors.
> If notifiers fail, call the notifiers again so they can clean up.
> None of the notifiers return an error status at this time.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
> include/migration/misc.h | 3 ++-
> migration/migration.c | 40 +++++++++++++++++++++++++++++-----------
> 2 files changed, 31 insertions(+), 12 deletions(-)
>
> diff --git a/include/migration/misc.h b/include/migration/misc.h
> index 0ea1902..6dc234b 100644
> --- a/include/migration/misc.h
> +++ b/include/migration/misc.h
> @@ -82,7 +82,8 @@ void migration_add_notifier(NotifierWithReturn *notify,
> void migration_add_notifier_mode(NotifierWithReturn *notify,
> MigrationNotifyFunc func, MigMode mode);
> void migration_remove_notifier(NotifierWithReturn *notify);
> -void migration_call_notifiers(MigrationState *s, MigrationEventType type);
> +int migration_call_notifiers(MigrationState *s, MigrationEventType type,
> + Error **errp);
> bool migration_in_setup(MigrationState *);
> bool migration_has_finished(MigrationState *);
> bool migration_has_failed(MigrationState *);
> diff --git a/migration/migration.c b/migration/migration.c
> index 01d8867..d1fce9e 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1318,6 +1318,8 @@ void migrate_set_state(int *state, int old_state, int new_state)
>
> static void migrate_fd_cleanup(MigrationState *s)
> {
> + Error *local_err = NULL;
> +
> g_free(s->hostname);
> s->hostname = NULL;
> json_writer_free(s->vmdesc);
> @@ -1362,13 +1364,23 @@ static void migrate_fd_cleanup(MigrationState *s)
> MIGRATION_STATUS_CANCELLED);
> }
>
> + if (!migration_has_failed(s) &&
> + migration_call_notifiers(s, MIG_EVENT_PRECOPY_DONE, &local_err)) {
> +
> + migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
> + migrate_set_error(s, local_err);
> + error_free(local_err);
> + }
> +
> if (s->error) {
> /* It is used on info migrate. We can't free it */
> error_report_err(error_copy(s->error));
> }
> - migration_call_notifiers(s, s->state == MIGRATION_STATUS_COMPLETED ?
> - MIG_EVENT_PRECOPY_DONE :
> - MIG_EVENT_PRECOPY_FAILED);
> +
> + if (migration_has_failed(s)) {
> + migration_call_notifiers(s, MIG_EVENT_PRECOPY_FAILED, NULL);
> + }
AFAIU, the whole point of this split is to allow DONE notifiers to fail too,
so that if that happens we can then invoke the FAILED notifiers.
Perhaps we can avoid that complexity, and instead document that only SETUP
notifiers can fail?
The problem is that failing a notifier at this stage (if migration has
already finished) can be too late; the dest QEMU may already have started
running, so there is no way to roll back. We can document that, and check
and assert for !SETUP cases to make sure an error is never hit?
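A rough standalone sketch of that rule, purely as an illustration: the dispatcher returns the first error, and asserts that an error can only come from a SETUP event. All names here (DemoEvent, DemoNotifyFunc, demo_notify_all, demo_ok, demo_reject_setup) are made up for the example, and a plain const char ** stands in for Error **.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event types and notifier signature. */
typedef enum { DEMO_SETUP, DEMO_DONE, DEMO_FAILED } DemoEvent;
typedef int (*DemoNotifyFunc)(DemoEvent type, const char **errmsg);

/*
 * Call every notifier in order and stop at the first error.  Only SETUP
 * is allowed to fail; a failure for any other event is treated as a
 * programming error, since the migration can no longer be rolled back.
 */
int demo_notify_all(DemoNotifyFunc *funcs, size_t n, DemoEvent type,
                    const char **errmsg)
{
    for (size_t i = 0; i < n; i++) {
        int ret = funcs[i](type, errmsg);
        if (ret) {
            assert(type == DEMO_SETUP);
            return ret;
        }
    }
    return 0;
}

/* A notifier that never fails. */
int demo_ok(DemoEvent type, const char **errmsg)
{
    (void)type;
    (void)errmsg;
    return 0;
}

/* A notifier that vetoes the migration at SETUP time only. */
int demo_reject_setup(DemoEvent type, const char **errmsg)
{
    if (type == DEMO_SETUP) {
        *errmsg = "guest is not suspended";
        return -1;
    }
    return 0;
}
```

With this shape the callers of DONE and FAILED would not need their own error paths, which is the simplification being asked for.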
> +
> block_cleanup_parameters();
> yank_unregister_instance(MIGRATION_YANK_INSTANCE);
> }
> @@ -1481,13 +1493,15 @@ void migration_remove_notifier(NotifierWithReturn *notify)
> }
> }
>
> -void migration_call_notifiers(MigrationState *s, MigrationEventType type)
> +int migration_call_notifiers(MigrationState *s, MigrationEventType type,
> + Error **errp)
> {
> MigMode mode = s->parameters.mode;
> MigrationEvent e;
>
> e.type = type;
> - notifier_with_return_list_notify(&migration_state_notifiers[mode], &e, 0);
> + return notifier_with_return_list_notify(&migration_state_notifiers[mode],
> + &e, errp);
> }
>
> bool migration_in_setup(MigrationState *s)
> @@ -2535,7 +2549,9 @@ static int postcopy_start(MigrationState *ms, Error **errp)
> * at the transition to postcopy and after the device state; in particular
> * spice needs to trigger a transition now
> */
> - migration_call_notifiers(ms, MIG_EVENT_PRECOPY_DONE);
> + if (migration_call_notifiers(ms, MIG_EVENT_PRECOPY_DONE, errp)) {
> + goto fail;
> + }
>
> migration_downtime_end(ms);
>
> @@ -2555,11 +2571,10 @@ static int postcopy_start(MigrationState *ms, Error **errp)
>
> ret = qemu_file_get_error(ms->to_dst_file);
> if (ret) {
> - error_setg(errp, "postcopy_start: Migration stream errored");
> - migrate_set_state(&ms->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
> - MIGRATION_STATUS_FAILED);
> + error_setg_errno(errp, -ret, "postcopy_start: Migration stream error");
> + bql_lock();
> + goto fail;
> }
> -
> trace_postcopy_preempt_enabled(migrate_postcopy_preempt());
>
> return ret;
> @@ -2580,6 +2595,7 @@ fail:
> error_report_err(local_err);
> }
> }
> + migration_call_notifiers(ms, MIG_EVENT_PRECOPY_FAILED, NULL);
> bql_unlock();
> return -1;
> }
> @@ -3594,7 +3610,9 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
> rate_limit = migrate_max_bandwidth();
>
> /* Notify before starting migration thread */
> - migration_call_notifiers(s, MIG_EVENT_PRECOPY_SETUP);
> + if (migration_call_notifiers(s, MIG_EVENT_PRECOPY_SETUP, &local_err)) {
> + goto fail;
> + }
> }
>
> migration_rate_set(rate_limit);
> --
> 1.8.3.1
>
--
Peter Xu
* Re: [PATCH V3 10/13] migration: stop vm for cpr
2024-02-08 18:54 ` [PATCH V3 10/13] migration: stop vm for cpr Steve Sistare
@ 2024-02-20 7:33 ` Peter Xu
2024-02-20 22:19 ` Steven Sistare
` (2 more replies)
0 siblings, 3 replies; 42+ messages in thread
From: Peter Xu @ 2024-02-20 7:33 UTC (permalink / raw)
To: Steve Sistare
Cc: qemu-devel, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau, David Hildenbrand
On Thu, Feb 08, 2024 at 10:54:03AM -0800, Steve Sistare wrote:
> When migration for cpr is initiated, stop the vm and set state
> RUN_STATE_FINISH_MIGRATE before ram is saved. This eliminates the
> possibility of ram and device state being out of sync, and guarantees
> that a guest in the suspended state remains suspended, because qmp_cont
> rejects a cont command in the RUN_STATE_FINISH_MIGRATE state.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
> include/migration/misc.h | 1 +
> migration/migration.c | 32 +++++++++++++++++++++++++-------
> 2 files changed, 26 insertions(+), 7 deletions(-)
>
> diff --git a/include/migration/misc.h b/include/migration/misc.h
> index 6dc234b..54c99a3 100644
> --- a/include/migration/misc.h
> +++ b/include/migration/misc.h
> @@ -60,6 +60,7 @@ void migration_object_init(void);
> void migration_shutdown(void);
> bool migration_is_idle(void);
> bool migration_is_active(MigrationState *);
> +bool migrate_mode_is_cpr(MigrationState *);
>
> typedef enum MigrationEventType {
> MIG_EVENT_PRECOPY_SETUP,
> diff --git a/migration/migration.c b/migration/migration.c
> index d1fce9e..fc5c587 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1603,6 +1603,11 @@ bool migration_is_active(MigrationState *s)
> s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE);
> }
>
> +bool migrate_mode_is_cpr(MigrationState *s)
> +{
> + return s->parameters.mode == MIG_MODE_CPR_REBOOT;
> +}
> +
> int migrate_init(MigrationState *s, Error **errp)
> {
> int ret;
> @@ -2651,13 +2656,14 @@ static int migration_completion_precopy(MigrationState *s,
> bql_lock();
> migration_downtime_start(s);
>
> - s->vm_old_state = runstate_get();
> - global_state_store();
> -
> - ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
> - trace_migration_completion_vm_stop(ret);
> - if (ret < 0) {
> - goto out_unlock;
> + if (!migrate_mode_is_cpr(s)) {
> + s->vm_old_state = runstate_get();
> + global_state_store();
> + ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
> + trace_migration_completion_vm_stop(ret);
> + if (ret < 0) {
> + goto out_unlock;
> + }
> }
>
> ret = migration_maybe_pause(s, current_active_state,
> @@ -3576,6 +3582,7 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
> Error *local_err = NULL;
> uint64_t rate_limit;
> bool resume = s->state == MIGRATION_STATUS_POSTCOPY_PAUSED;
> + int ret;
>
> /*
> * If there's a previous error, free it and prepare for another one.
> @@ -3651,6 +3658,17 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
> goto fail;
> }
>
> + if (migrate_mode_is_cpr(s)) {
> + s->vm_old_state = runstate_get();
> + global_state_store();
> + ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
> + trace_migration_completion_vm_stop(ret);
> + if (ret < 0) {
> + error_setg(&local_err, "migration_stop_vm failed, error %d", -ret);
> + goto fail;
> + }
> + }
Could we have a helper function for the shared code?
How about postcopy? I know it's nonsense to enable postcopy for cpr.. but
iiuc we don't yet forbid a user from doing so. Maybe we should?
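Something along these lines, as a standalone sketch only: the save-old-state and stop-vm steps live in one helper, and the two call sites differ only in their mode check. Every name below (DemoMigration, demo_stop_vm, demo_stop_vm_for_migration, demo_connect, demo_completion) is hypothetical, and demo_stop_vm is a trivial stand-in for migration_stop_vm().

```c
#include <stdbool.h>

/* Hypothetical run states; the values are arbitrary. */
enum {
    DEMO_RUN_STATE_RUNNING = 1,
    DEMO_RUN_STATE_SUSPENDED = 2,
    DEMO_RUN_STATE_FINISH_MIGRATE = 3
};

/* Hypothetical, much reduced stand-in for MigrationState. */
typedef struct {
    int vm_old_state;   /* run state remembered before stopping */
    int runstate;       /* current run state */
    int stop_calls;     /* how many times the vm was stopped */
} DemoMigration;

/* Stand-in for migration_stop_vm(); always succeeds here. */
int demo_stop_vm(DemoMigration *s, int new_state)
{
    s->runstate = new_state;
    s->stop_calls++;
    return 0;
}

/* The shared steps: remember the old state, then stop the vm. */
int demo_stop_vm_for_migration(DemoMigration *s)
{
    s->vm_old_state = s->runstate;
    /* The real code would also call global_state_store() here. */
    return demo_stop_vm(s, DEMO_RUN_STATE_FINISH_MIGRATE);
}

/* Early call site: cpr stops the vm before any ram is saved. */
int demo_connect(DemoMigration *s, bool is_cpr)
{
    return is_cpr ? demo_stop_vm_for_migration(s) : 0;
}

/* Late call site: other modes stop the vm at precopy completion. */
int demo_completion(DemoMigration *s, bool is_cpr)
{
    return is_cpr ? 0 : demo_stop_vm_for_migration(s);
}
```

Each mode then stops the vm exactly once, and the old run state (for example suspended) is captured before the stop in both paths.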
> +
> if (migrate_background_snapshot()) {
> qemu_thread_create(&s->thread, "bg_snapshot",
> bg_migration_thread, s, QEMU_THREAD_JOINABLE);
> --
> 1.8.3.1
>
--
Peter Xu
* Re: [PATCH V3 00/13] allow cpr-reboot for vfio
2024-02-08 18:53 [PATCH V3 00/13] allow cpr-reboot for vfio Steve Sistare
` (12 preceding siblings ...)
2024-02-08 18:54 ` [PATCH V3 13/13] migration: update cpr-reboot description Steve Sistare
@ 2024-02-20 7:49 ` Peter Xu
2024-02-20 22:32 ` Steven Sistare
13 siblings, 1 reply; 42+ messages in thread
From: Peter Xu @ 2024-02-20 7:49 UTC (permalink / raw)
To: Steve Sistare
Cc: qemu-devel, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau, David Hildenbrand
On Thu, Feb 08, 2024 at 10:53:53AM -0800, Steve Sistare wrote:
> Allow cpr-reboot for vfio if the guest is in the suspended runstate. The
> guest drivers' suspend methods flush outstanding requests and re-initialize
> the devices, and thus there is no device state to save and restore. The
> user is responsible for suspending the guest before initiating cpr, such as
> by issuing guest-suspend-ram to the qemu guest agent.
>
> Most of the patches in this series enhance migration notifiers so they can
> return an error status and message. The last few patches register a notifier
> for vfio that returns an error if the guest is not suspended.
>
> Changes in V3:
> * update to tip, add RB's
> * replace MigrationStatus with new enum MigrationEventType
> * simplify migrate_fd_connect error recovery
> * support vfio iommufd containers
> * add patches:
> migration: stop vm for cpr
> migration: update cpr-reboot description
This doesn't apply to master anymore, please rebase when reposting, thanks.
--
Peter Xu
* Re: [PATCH V3 09/13] migration: notifier error checking
2024-02-20 7:12 ` Peter Xu
@ 2024-02-20 22:12 ` Steven Sistare
0 siblings, 0 replies; 42+ messages in thread
From: Steven Sistare @ 2024-02-20 22:12 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau, David Hildenbrand
On 2/20/2024 2:12 AM, Peter Xu wrote:
> On Thu, Feb 08, 2024 at 10:54:02AM -0800, Steve Sistare wrote:
>> Check the status returned by migration notifiers and report errors.
>> If notifiers fail, call the notifiers again so they can clean up.
>> None of the notifiers return an error status at this time.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> ---
>> include/migration/misc.h | 3 ++-
>> migration/migration.c | 40 +++++++++++++++++++++++++++++-----------
>> 2 files changed, 31 insertions(+), 12 deletions(-)
>>
>> diff --git a/include/migration/misc.h b/include/migration/misc.h
>> index 0ea1902..6dc234b 100644
>> --- a/include/migration/misc.h
>> +++ b/include/migration/misc.h
>> @@ -82,7 +82,8 @@ void migration_add_notifier(NotifierWithReturn *notify,
>> void migration_add_notifier_mode(NotifierWithReturn *notify,
>> MigrationNotifyFunc func, MigMode mode);
>> void migration_remove_notifier(NotifierWithReturn *notify);
>> -void migration_call_notifiers(MigrationState *s, MigrationEventType type);
>> +int migration_call_notifiers(MigrationState *s, MigrationEventType type,
>> + Error **errp);
>> bool migration_in_setup(MigrationState *);
>> bool migration_has_finished(MigrationState *);
>> bool migration_has_failed(MigrationState *);
>> diff --git a/migration/migration.c b/migration/migration.c
>> index 01d8867..d1fce9e 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -1318,6 +1318,8 @@ void migrate_set_state(int *state, int old_state, int new_state)
>>
>> static void migrate_fd_cleanup(MigrationState *s)
>> {
>> + Error *local_err = NULL;
>> +
>> g_free(s->hostname);
>> s->hostname = NULL;
>> json_writer_free(s->vmdesc);
>> @@ -1362,13 +1364,23 @@ static void migrate_fd_cleanup(MigrationState *s)
>> MIGRATION_STATUS_CANCELLED);
>> }
>>
>> + if (!migration_has_failed(s) &&
>> + migration_call_notifiers(s, MIG_EVENT_PRECOPY_DONE, &local_err)) {
>> +
>> + migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
>> + migrate_set_error(s, local_err);
>> + error_free(local_err);
>> + }
>> +
>> if (s->error) {
>> /* It is used on info migrate. We can't free it */
>> error_report_err(error_copy(s->error));
>> }
>> - migration_call_notifiers(s, s->state == MIGRATION_STATUS_COMPLETED ?
>> - MIG_EVENT_PRECOPY_DONE :
>> - MIG_EVENT_PRECOPY_FAILED);
>> +
>> + if (migration_has_failed(s)) {
>> + migration_call_notifiers(s, MIG_EVENT_PRECOPY_FAILED, NULL);
>> + }
>
> AFAIU, the whole point of this split is to allow DONE notifiers to fail too,
> so that if that happens we can then invoke the FAILED notifiers.
Correct.
>
> Perhaps we can avoid that complexity, and instead document that only SETUP
> notifiers can fail?
>
> The problem is that failing a notifier at this stage (if migration has
> already finished) can be too late; the dest QEMU may already have started
> running, so there is no way to roll back. We can document that, and check
> and assert for !SETUP cases to make sure an error is never hit?
Makes sense. I will modify the patch as you suggest.
- Steve
>> +
>> block_cleanup_parameters();
>> yank_unregister_instance(MIGRATION_YANK_INSTANCE);
>> }
>> @@ -1481,13 +1493,15 @@ void migration_remove_notifier(NotifierWithReturn *notify)
>> }
>> }
>>
>> -void migration_call_notifiers(MigrationState *s, MigrationEventType type)
>> +int migration_call_notifiers(MigrationState *s, MigrationEventType type,
>> + Error **errp)
>> {
>> MigMode mode = s->parameters.mode;
>> MigrationEvent e;
>>
>> e.type = type;
>> - notifier_with_return_list_notify(&migration_state_notifiers[mode], &e, 0);
>> + return notifier_with_return_list_notify(&migration_state_notifiers[mode],
>> + &e, errp);
>> }
>>
>> bool migration_in_setup(MigrationState *s)
>> @@ -2535,7 +2549,9 @@ static int postcopy_start(MigrationState *ms, Error **errp)
>> * at the transition to postcopy and after the device state; in particular
>> * spice needs to trigger a transition now
>> */
>> - migration_call_notifiers(ms, MIG_EVENT_PRECOPY_DONE);
>> + if (migration_call_notifiers(ms, MIG_EVENT_PRECOPY_DONE, errp)) {
>> + goto fail;
>> + }
>>
>> migration_downtime_end(ms);
>>
>> @@ -2555,11 +2571,10 @@ static int postcopy_start(MigrationState *ms, Error **errp)
>>
>> ret = qemu_file_get_error(ms->to_dst_file);
>> if (ret) {
>> - error_setg(errp, "postcopy_start: Migration stream errored");
>> - migrate_set_state(&ms->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
>> - MIGRATION_STATUS_FAILED);
>> + error_setg_errno(errp, -ret, "postcopy_start: Migration stream error");
>> + bql_lock();
>> + goto fail;
>> }
>> -
>> trace_postcopy_preempt_enabled(migrate_postcopy_preempt());
>>
>> return ret;
>> @@ -2580,6 +2595,7 @@ fail:
>> error_report_err(local_err);
>> }
>> }
>> + migration_call_notifiers(ms, MIG_EVENT_PRECOPY_FAILED, NULL);
>> bql_unlock();
>> return -1;
>> }
>> @@ -3594,7 +3610,9 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
>> rate_limit = migrate_max_bandwidth();
>>
>> /* Notify before starting migration thread */
>> - migration_call_notifiers(s, MIG_EVENT_PRECOPY_SETUP);
>> + if (migration_call_notifiers(s, MIG_EVENT_PRECOPY_SETUP, &local_err)) {
>> + goto fail;
>> + }
>> }
>>
>> migration_rate_set(rate_limit);
>> --
>> 1.8.3.1
>>
>
* Re: [PATCH V3 10/13] migration: stop vm for cpr
2024-02-20 7:33 ` Peter Xu
@ 2024-02-20 22:19 ` Steven Sistare
2024-02-21 21:20 ` Steven Sistare
2024-02-21 21:23 ` Steven Sistare
2 siblings, 0 replies; 42+ messages in thread
From: Steven Sistare @ 2024-02-20 22:19 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau, David Hildenbrand
On 2/20/2024 2:33 AM, Peter Xu wrote:
> On Thu, Feb 08, 2024 at 10:54:03AM -0800, Steve Sistare wrote:
>> When migration for cpr is initiated, stop the vm and set state
>> RUN_STATE_FINISH_MIGRATE before ram is saved. This eliminates the
>> possibility of ram and device state being out of sync, and guarantees
>> that a guest in the suspended state remains suspended, because qmp_cont
>> rejects a cont command in the RUN_STATE_FINISH_MIGRATE state.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> ---
>> include/migration/misc.h | 1 +
>> migration/migration.c | 32 +++++++++++++++++++++++++-------
>> 2 files changed, 26 insertions(+), 7 deletions(-)
>>
>> diff --git a/include/migration/misc.h b/include/migration/misc.h
>> index 6dc234b..54c99a3 100644
>> --- a/include/migration/misc.h
>> +++ b/include/migration/misc.h
>> @@ -60,6 +60,7 @@ void migration_object_init(void);
>> void migration_shutdown(void);
>> bool migration_is_idle(void);
>> bool migration_is_active(MigrationState *);
>> +bool migrate_mode_is_cpr(MigrationState *);
>>
>> typedef enum MigrationEventType {
>> MIG_EVENT_PRECOPY_SETUP,
>> diff --git a/migration/migration.c b/migration/migration.c
>> index d1fce9e..fc5c587 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -1603,6 +1603,11 @@ bool migration_is_active(MigrationState *s)
>> s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE);
>> }
>>
>> +bool migrate_mode_is_cpr(MigrationState *s)
>> +{
>> + return s->parameters.mode == MIG_MODE_CPR_REBOOT;
>> +}
>> +
>> int migrate_init(MigrationState *s, Error **errp)
>> {
>> int ret;
>> @@ -2651,13 +2656,14 @@ static int migration_completion_precopy(MigrationState *s,
>> bql_lock();
>> migration_downtime_start(s);
>>
>> - s->vm_old_state = runstate_get();
>> - global_state_store();
>> -
>> - ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
>> - trace_migration_completion_vm_stop(ret);
>> - if (ret < 0) {
>> - goto out_unlock;
>> + if (!migrate_mode_is_cpr(s)) {
>> + s->vm_old_state = runstate_get();
>> + global_state_store();
>> + ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
>> + trace_migration_completion_vm_stop(ret);
>> + if (ret < 0) {
>> + goto out_unlock;
>> + }
>> }
>>
>> ret = migration_maybe_pause(s, current_active_state,
>> @@ -3576,6 +3582,7 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
>> Error *local_err = NULL;
>> uint64_t rate_limit;
>> bool resume = s->state == MIGRATION_STATUS_POSTCOPY_PAUSED;
>> + int ret;
>>
>> /*
>> * If there's a previous error, free it and prepare for another one.
>> @@ -3651,6 +3658,17 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
>> goto fail;
>> }
>>
>> + if (migrate_mode_is_cpr(s)) {
>> + s->vm_old_state = runstate_get();
>> + global_state_store();
>> + ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
>> + trace_migration_completion_vm_stop(ret);
>> + if (ret < 0) {
>> + error_setg(&local_err, "migration_stop_vm failed, error %d", -ret);
>> + goto fail;
>> + }
>> + }
>
> Could we have a helper function for the shared code?
Will do.
> How about postcopy? I know it's nonsense to enable postcopy for cpr.. but
> iiuc we don't yet forbid a user from doing so. Maybe we should?
I will assert that mode != cpr in the postcopy path instead, to prevent any nonsense.
- Steve
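The assertion mentioned above might look roughly like the following standalone sketch. It is only an illustration with invented names (DemoMigMode, demo_mode_is_cpr, demo_postcopy_start); the real check would sit in the postcopy path and use the MigrationState.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for MigMode. */
typedef enum { DEMO_MODE_NORMAL, DEMO_MODE_CPR_REBOOT } DemoMigMode;

/* Mirrors the shape of migrate_mode_is_cpr() from the quoted patch. */
bool demo_mode_is_cpr(DemoMigMode mode)
{
    return mode == DEMO_MODE_CPR_REBOOT;
}

/* Hypothetical postcopy entry point: cpr must never reach it. */
int demo_postcopy_start(DemoMigMode mode)
{
    assert(!demo_mode_is_cpr(mode));
    /* ... the postcopy transition would follow here ... */
    return 0;
}
```

An assert only documents the invariant; it does not give the user an error message, so it would presumably rely on the cpr plus postcopy combination being unreachable, or rejected earlier, by other means.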
>> +
>> if (migrate_background_snapshot()) {
>> qemu_thread_create(&s->thread, "bg_snapshot",
>> bg_migration_thread, s, QEMU_THREAD_JOINABLE);
>> --
>> 1.8.3.1
>>
>
* Re: [PATCH V3 00/13] allow cpr-reboot for vfio
2024-02-20 7:49 ` [PATCH V3 00/13] allow cpr-reboot for vfio Peter Xu
@ 2024-02-20 22:32 ` Steven Sistare
2024-02-21 2:13 ` Peter Xu
0 siblings, 1 reply; 42+ messages in thread
From: Steven Sistare @ 2024-02-20 22:32 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau, David Hildenbrand
On 2/20/2024 2:49 AM, Peter Xu wrote:
> On Thu, Feb 08, 2024 at 10:53:53AM -0800, Steve Sistare wrote:
>> Allow cpr-reboot for vfio if the guest is in the suspended runstate. The
>> guest drivers' suspend methods flush outstanding requests and re-initialize
>> the devices, and thus there is no device state to save and restore. The
>> user is responsible for suspending the guest before initiating cpr, such as
>> by issuing guest-suspend-ram to the qemu guest agent.
>>
>> Most of the patches in this series enhance migration notifiers so they can
>> return an error status and message. The last few patches register a notifier
>> for vfio that returns an error if the guest is not suspended.
>>
>> Changes in V3:
>> * update to tip, add RB's
>> * replace MigrationStatus with new enum MigrationEventType
>> * simplify migrate_fd_connect error recovery
>> * support vfio iommufd containers
>> * add patches:
>> migration: stop vm for cpr
>> migration: update cpr-reboot description
>
> This doesn't apply to master anymore, please rebase when repost, thanks.
Will do. Before I do, any comments on "migration: update cpr-reboot description"?
After we converge on that short description, I will submit a longer treatment in
docs/devel/migration, which I see you have recently populated.
- Steve
* Re: [PATCH V3 00/13] allow cpr-reboot for vfio
2024-02-20 22:32 ` Steven Sistare
@ 2024-02-21 2:13 ` Peter Xu
0 siblings, 0 replies; 42+ messages in thread
From: Peter Xu @ 2024-02-21 2:13 UTC (permalink / raw)
To: Steven Sistare
Cc: qemu-devel, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau, David Hildenbrand
On Tue, Feb 20, 2024 at 05:32:34PM -0500, Steven Sistare wrote:
> On 2/20/2024 2:49 AM, Peter Xu wrote:
> > On Thu, Feb 08, 2024 at 10:53:53AM -0800, Steve Sistare wrote:
> >> Allow cpr-reboot for vfio if the guest is in the suspended runstate. The
> >> guest drivers' suspend methods flush outstanding requests and re-initialize
> >> the devices, and thus there is no device state to save and restore. The
> >> user is responsible for suspending the guest before initiating cpr, such as
> >> by issuing guest-suspend-ram to the qemu guest agent.
> >>
> >> Most of the patches in this series enhance migration notifiers so they can
> >> return an error status and message. The last few patches register a notifier
> >> for vfio that returns an error if the guest is not suspended.
> >>
> >> Changes in V3:
> >> * update to tip, add RB's
> >> * replace MigrationStatus with new enum MigrationEventType
> >> * simplify migrate_fd_connect error recovery
> >> * support vfio iommufd containers
> >> * add patches:
> >> migration: stop vm for cpr
> >> migration: update cpr-reboot description
> >
> > This doesn't apply to master anymore, please rebase when repost, thanks.
>
> Will do. Before I do, any comments on "migration: update cpr-reboot description"?
> After we converge on that short description, I will submit a longer treatment in
> docs/devel/migration, which I see you have recently populated.
Sounds good; yes I hope we have a file there, as it'll pop up later in
https://www.qemu.org/docs/master/devel/migration/.
You can add a short sentence to forbid postcopy if that's the plan. Other
than that it looks good.
Thanks,
--
Peter Xu
* Re: [PATCH V3 12/13] vfio: allow cpr-reboot migration if suspended
2024-02-08 18:54 ` [PATCH V3 12/13] vfio: allow cpr-reboot migration if suspended Steve Sistare
@ 2024-02-21 18:32 ` Steven Sistare
0 siblings, 0 replies; 42+ messages in thread
From: Steven Sistare @ 2024-02-21 18:32 UTC (permalink / raw)
To: qemu-devel, Alex Williamson
Cc: Peter Xu, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Cedric Le Goater, Gerd Hoffmann, Marc-Andre Lureau,
David Hildenbrand
Hi Alex, any comments or RB on this or patch 11? The last few changes I am
making for Peter will not change this patch.
- Steve
On 2/8/2024 1:54 PM, Steve Sistare wrote:
> Allow cpr-reboot for vfio if the guest is in the suspended runstate. The
> guest drivers' suspend methods flush outstanding requests and re-initialize
> the devices, and thus there is no device state to save and restore. The
> user is responsible for suspending the guest before initiating cpr, such as
> by issuing guest-suspend-ram to the qemu guest agent.
>
> Relax the vfio blocker so it does not apply to cpr, and add a notifier that
> verifies the guest is suspended.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
> hw/vfio/common.c | 2 +-
> hw/vfio/cpr.c | 20 ++++++++++++++++++++
> hw/vfio/migration.c | 2 +-
> include/hw/vfio/vfio-container-base.h | 1 +
> 4 files changed, 23 insertions(+), 2 deletions(-)
>
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 059bfdc..ff88c3f 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -128,7 +128,7 @@ int vfio_block_multiple_devices_migration(VFIODevice *vbasedev, Error **errp)
> error_setg(&multiple_devices_migration_blocker,
> "Multiple VFIO devices migration is supported only if all of "
> "them support P2P migration");
> - ret = migrate_add_blocker(&multiple_devices_migration_blocker, errp);
> + ret = migrate_add_blocker_normal(&multiple_devices_migration_blocker, errp);
>
> return ret;
> }
> diff --git a/hw/vfio/cpr.c b/hw/vfio/cpr.c
> index 3bede54..392c2dd 100644
> --- a/hw/vfio/cpr.c
> +++ b/hw/vfio/cpr.c
> @@ -7,13 +7,33 @@
>
> #include "qemu/osdep.h"
> #include "hw/vfio/vfio-common.h"
> +#include "migration/misc.h"
> #include "qapi/error.h"
> +#include "sysemu/runstate.h"
> +
> +static int vfio_cpr_reboot_notifier(NotifierWithReturn *notifier,
> + MigrationEvent *e, Error **errp)
> +{
> + if (e->type == MIG_EVENT_PRECOPY_SETUP &&
> + !runstate_check(RUN_STATE_SUSPENDED) && !vm_get_suspended()) {
> +
> + error_setg(errp,
> + "VFIO device only supports cpr-reboot for runstate suspended");
> +
> + return -1;
> + }
> + return 0;
> +}
>
> int vfio_cpr_register_container(VFIOContainerBase *bcontainer, Error **errp)
> {
> + migration_add_notifier_mode(&bcontainer->cpr_reboot_notifier,
> + vfio_cpr_reboot_notifier,
> + MIG_MODE_CPR_REBOOT);
> return 0;
> }
>
> void vfio_cpr_unregister_container(VFIOContainerBase *bcontainer)
> {
> + migration_remove_notifier(&bcontainer->cpr_reboot_notifier);
> }
> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> index 50140ed..2050ac8 100644
> --- a/hw/vfio/migration.c
> +++ b/hw/vfio/migration.c
> @@ -889,7 +889,7 @@ static int vfio_block_migration(VFIODevice *vbasedev, Error *err, Error **errp)
> vbasedev->migration_blocker = error_copy(err);
> error_free(err);
>
> - return migrate_add_blocker(&vbasedev->migration_blocker, errp);
> + return migrate_add_blocker_normal(&vbasedev->migration_blocker, errp);
> }
>
> /* ---------------------------------------------------------------------- */
> diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h
> index b2813b0..3582d5f 100644
> --- a/include/hw/vfio/vfio-container-base.h
> +++ b/include/hw/vfio/vfio-container-base.h
> @@ -49,6 +49,7 @@ typedef struct VFIOContainerBase {
> QLIST_ENTRY(VFIOContainerBase) next;
> QLIST_HEAD(, VFIODevice) device_list;
> GList *iova_ranges;
> + NotifierWithReturn cpr_reboot_notifier;
> } VFIOContainerBase;
>
> typedef struct VFIOGuestIOMMU {
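The notifier logic in the patch above can be modeled outside of QEMU. This
is only a minimal sketch under stated assumptions: the enum values,
struct notifier, notify_all(), and the guest_suspended flag are hypothetical
stand-ins, not QEMU's definitions, and the single flag collapses the two
runstate checks the patch performs. It illustrates how a notifier that
returns an error can veto a cpr-reboot migration at setup time when the
guest is not suspended.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for QEMU's event and notifier types. */
enum mig_event { EVENT_PRECOPY_SETUP, EVENT_PRECOPY_DONE, EVENT_PRECOPY_FAILED };

struct notifier {
    /* Returns 0 to continue, negative to veto; writes a message on veto. */
    int (*notify)(struct notifier *n, enum mig_event ev,
                  char *err, size_t errlen);
    struct notifier *next;
};

/* Stand-in for the guest runstate that the real notifier queries. */
bool guest_suspended;

int cpr_reboot_notifier(struct notifier *n, enum mig_event ev,
                        char *err, size_t errlen)
{
    (void)n;
    /* Only the setup event is checked; later events always pass. */
    if (ev == EVENT_PRECOPY_SETUP && !guest_suspended) {
        snprintf(err, errlen,
                 "VFIO device only supports cpr-reboot for runstate suspended");
        return -1;
    }
    return 0;
}

/* Call each notifier in turn and stop at the first one that fails. */
int notify_all(struct notifier *list, enum mig_event ev,
               char *err, size_t errlen)
{
    for (struct notifier *n = list; n != NULL; n = n->next) {
        int ret = n->notify(n, ev, err, errlen);
        if (ret < 0) {
            return ret;
        }
    }
    return 0;
}
```

In this model the caller aborts the migration as soon as notify_all()
returns a negative value, and reports the message left in err.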
* Re: [PATCH V3 10/13] migration: stop vm for cpr
2024-02-20 7:33 ` Peter Xu
2024-02-20 22:19 ` Steven Sistare
@ 2024-02-21 21:20 ` Steven Sistare
2024-02-22 9:12 ` Peter Xu
2024-02-21 21:23 ` Steven Sistare
2 siblings, 1 reply; 42+ messages in thread
From: Steven Sistare @ 2024-02-21 21:20 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau, David Hildenbrand
On 2/20/2024 2:33 AM, Peter Xu wrote:
> On Thu, Feb 08, 2024 at 10:54:03AM -0800, Steve Sistare wrote:
>> When migration for cpr is initiated, stop the vm and set state
>> RUN_STATE_FINISH_MIGRATE before ram is saved. This eliminates the
>> possibility of ram and device state being out of sync, and guarantees
>> that a guest in the suspended state remains suspended, because qmp_cont
>> rejects a cont command in the RUN_STATE_FINISH_MIGRATE state.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> ---
>> include/migration/misc.h | 1 +
>> migration/migration.c | 32 +++++++++++++++++++++++++-------
>> 2 files changed, 26 insertions(+), 7 deletions(-)
>>
>> diff --git a/include/migration/misc.h b/include/migration/misc.h
>> index 6dc234b..54c99a3 100644
>> --- a/include/migration/misc.h
>> +++ b/include/migration/misc.h
>> @@ -60,6 +60,7 @@ void migration_object_init(void);
>> void migration_shutdown(void);
>> bool migration_is_idle(void);
>> bool migration_is_active(MigrationState *);
>> +bool migrate_mode_is_cpr(MigrationState *);
>>
>> typedef enum MigrationEventType {
>> MIG_EVENT_PRECOPY_SETUP,
>> diff --git a/migration/migration.c b/migration/migration.c
>> index d1fce9e..fc5c587 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -1603,6 +1603,11 @@ bool migration_is_active(MigrationState *s)
>> s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE);
>> }
>>
>> +bool migrate_mode_is_cpr(MigrationState *s)
>> +{
>> + return s->parameters.mode == MIG_MODE_CPR_REBOOT;
>> +}
>> +
>> int migrate_init(MigrationState *s, Error **errp)
>> {
>> int ret;
>> @@ -2651,13 +2656,14 @@ static int migration_completion_precopy(MigrationState *s,
>> bql_lock();
>> migration_downtime_start(s);
>>
>> - s->vm_old_state = runstate_get();
>> - global_state_store();
>> -
>> - ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
>> - trace_migration_completion_vm_stop(ret);
>> - if (ret < 0) {
>> - goto out_unlock;
>> + if (!migrate_mode_is_cpr(s)) {
>> + s->vm_old_state = runstate_get();
>> + global_state_store();
>> + ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
>> + trace_migration_completion_vm_stop(ret);
>> + if (ret < 0) {
>> + goto out_unlock;
>> + }
>> }
>>
>> ret = migration_maybe_pause(s, current_active_state,
>> @@ -3576,6 +3582,7 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
>> Error *local_err = NULL;
>> uint64_t rate_limit;
>> bool resume = s->state == MIGRATION_STATUS_POSTCOPY_PAUSED;
>> + int ret;
>>
>> /*
>> * If there's a previous error, free it and prepare for another one.
>> @@ -3651,6 +3658,17 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
>> goto fail;
>> }
>>
>> + if (migrate_mode_is_cpr(s)) {
>> + s->vm_old_state = runstate_get();
>> + global_state_store();
>> + ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
>> + trace_migration_completion_vm_stop(ret);
>> + if (ret < 0) {
>> + error_setg(&local_err, "migration_stop_vm failed, error %d", -ret);
>> + goto fail;
>> + }
>> + }
>
> Could we have a helper function for the shared codes?
I propose to add code to migration_stop_vm to make it the helper. Some call sites emit
more traces (via migration_stop_vm) as a result of my refactoring, and postcopy start sets
vm_old_state, which is not used thereafter in that path. Those changes seem harmless to me.
Tell me what you think:
-------------------------------------------------------
diff --git a/migration/migration.c b/migration/migration.c
index fc5c587..30d2b08 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -107,12 +107,6 @@ static int migration_maybe_pause(MigrationState *s,
static void migrate_fd_cancel(MigrationState *s);
static bool close_return_path_on_source(MigrationState *s);
-static void migration_downtime_start(MigrationState *s)
-{
- trace_vmstate_downtime_checkpoint("src-downtime-start");
- s->downtime_start = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
-}
-
static void migration_downtime_end(MigrationState *s)
{
int64_t now = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
@@ -161,11 +155,20 @@ static gint page_request_addr_cmp(gconstpointer ap, gconstpo
return (a > b) - (a < b);
}
-int migration_stop_vm(RunState state)
+static int migration_stop_vm(MigrationState *s, RunState state)
{
- int ret = vm_stop_force_state(state);
+ int ret;
+
+ trace_vmstate_downtime_checkpoint("src-downtime-start");
+ s->downtime_start = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+
+ s->vm_old_state = runstate_get();
+ global_state_store();
+
+ ret = vm_stop_force_state(state);
trace_vmstate_downtime_checkpoint("src-vm-stopped");
+ trace_migration_completion_vm_stop(ret);
return ret;
}
@@ -2454,10 +2457,7 @@ static int postcopy_start(MigrationState *ms, Error **errp)
bql_lock();
trace_postcopy_start_set_run();
- migration_downtime_start(ms);
-
- global_state_store();
- ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
+ ret = migration_stop_vm(ms, RUN_STATE_FINISH_MIGRATE);
if (ret < 0) {
goto fail;
}
@@ -2654,13 +2654,9 @@ static int migration_completion_precopy(MigrationState *s,
int ret;
bql_lock();
- migration_downtime_start(s);
if (!migrate_mode_is_cpr(s)) {
- s->vm_old_state = runstate_get();
- global_state_store();
- ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
- trace_migration_completion_vm_stop(ret);
+ ret = migration_stop_vm(s, RUN_STATE_FINISH_MIGRATE);
if (ret < 0) {
goto out_unlock;
}
@@ -3498,15 +3494,10 @@ static void *bg_migration_thread(void *opaque)
s->setup_time = qemu_clock_get_ms(QEMU_CLOCK_HOST) - setup_start;
trace_migration_thread_setup_complete();
- migration_downtime_start(s);
bql_lock();
- s->vm_old_state = runstate_get();
-
- global_state_store();
- /* Forcibly stop VM before saving state of vCPUs and devices */
- if (migration_stop_vm(RUN_STATE_PAUSED)) {
+ if (migration_stop_vm(s, RUN_STATE_PAUSED)) {
goto fail;
}
/*
@@ -3659,10 +3650,7 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
}
if (migrate_mode_is_cpr(s)) {
- s->vm_old_state = runstate_get();
- global_state_store();
- ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
- trace_migration_completion_vm_stop(ret);
+ ret = migration_stop_vm(s, RUN_STATE_FINISH_MIGRATE);
if (ret < 0) {
error_setg(&local_err, "migration_stop_vm failed, error %d", -ret);
goto fail;
diff --git a/migration/migration.h b/migration/migration.h
index aef8afb..65c0b61 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -541,6 +541,4 @@ int migration_rp_wait(MigrationState *s);
*/
void migration_rp_kick(MigrationState *s);
-int migration_stop_vm(RunState state);
-
#endif
-------------------------------------------------------
- Steve
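The effect of folding the bookkeeping into migration_stop_vm can be
illustrated with a small stand-alone model. This is a sketch under
assumptions, not QEMU code: struct mig_state, current_runstate,
fake_clock_ms, and stop_vm_model() are hypothetical stand-ins, and the model
omits the trace points and global_state_store(). It shows that every
caller, including the postcopy path, records the downtime start and the old
runstate before the VM is stopped.

```c
#include <stdint.h>

/* Hypothetical stand-ins for QEMU's RunState and MigrationState. */
enum run_state { STATE_RUNNING, STATE_SUSPENDED, STATE_PAUSED,
                 STATE_FINISH_MIGRATE };

struct mig_state {
    enum run_state vm_old_state;  /* runstate seen just before the stop */
    int64_t downtime_start;       /* timestamp taken just before the stop */
};

/* Stand-ins for the global runstate and the clock. */
enum run_state current_runstate = STATE_RUNNING;
int64_t fake_clock_ms = 0;

/*
 * Model of the consolidated helper: record the downtime start and the old
 * runstate, then force the new runstate. Returns 0 on success.
 */
int stop_vm_model(struct mig_state *s, enum run_state state)
{
    s->downtime_start = fake_clock_ms;
    s->vm_old_state = current_runstate;
    current_runstate = state;
    return 0;
}
```

Because the old runstate is captured inside the helper, a call site cannot
forget to set it, which is the property the refactoring relies on.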
* Re: [PATCH V3 10/13] migration: stop vm for cpr
2024-02-20 7:33 ` Peter Xu
2024-02-20 22:19 ` Steven Sistare
2024-02-21 21:20 ` Steven Sistare
@ 2024-02-21 21:23 ` Steven Sistare
2024-02-22 9:03 ` Peter Xu
2 siblings, 1 reply; 42+ messages in thread
From: Steven Sistare @ 2024-02-21 21:23 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau, David Hildenbrand
On 2/20/2024 2:33 AM, Peter Xu wrote:
> On Thu, Feb 08, 2024 at 10:54:03AM -0800, Steve Sistare wrote:
>> When migration for cpr is initiated, stop the vm and set state
>> RUN_STATE_FINISH_MIGRATE before ram is saved. This eliminates the
>> possibility of ram and device state being out of sync, and guarantees
>> that a guest in the suspended state remains suspended, because qmp_cont
>> rejects a cont command in the RUN_STATE_FINISH_MIGRATE state.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> ---
>> include/migration/misc.h | 1 +
>> migration/migration.c | 32 +++++++++++++++++++++++++-------
>> 2 files changed, 26 insertions(+), 7 deletions(-)
>>
>> diff --git a/include/migration/misc.h b/include/migration/misc.h
>> index 6dc234b..54c99a3 100644
>> --- a/include/migration/misc.h
>> +++ b/include/migration/misc.h
>> @@ -60,6 +60,7 @@ void migration_object_init(void);
>> void migration_shutdown(void);
>> bool migration_is_idle(void);
>> bool migration_is_active(MigrationState *);
>> +bool migrate_mode_is_cpr(MigrationState *);
>>
>> typedef enum MigrationEventType {
>> MIG_EVENT_PRECOPY_SETUP,
>> diff --git a/migration/migration.c b/migration/migration.c
>> index d1fce9e..fc5c587 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -1603,6 +1603,11 @@ bool migration_is_active(MigrationState *s)
>> s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE);
>> }
>>
>> +bool migrate_mode_is_cpr(MigrationState *s)
>> +{
>> + return s->parameters.mode == MIG_MODE_CPR_REBOOT;
>> +}
>> +
>> int migrate_init(MigrationState *s, Error **errp)
>> {
>> int ret;
>> @@ -2651,13 +2656,14 @@ static int migration_completion_precopy(MigrationState *s,
>> bql_lock();
>> migration_downtime_start(s);
>>
>> - s->vm_old_state = runstate_get();
>> - global_state_store();
>> -
>> - ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
>> - trace_migration_completion_vm_stop(ret);
>> - if (ret < 0) {
>> - goto out_unlock;
>> + if (!migrate_mode_is_cpr(s)) {
>> + s->vm_old_state = runstate_get();
>> + global_state_store();
>> + ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
>> + trace_migration_completion_vm_stop(ret);
>> + if (ret < 0) {
>> + goto out_unlock;
>> + }
>> }
>>
>> ret = migration_maybe_pause(s, current_active_state,
>> @@ -3576,6 +3582,7 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
>> Error *local_err = NULL;
>> uint64_t rate_limit;
>> bool resume = s->state == MIGRATION_STATUS_POSTCOPY_PAUSED;
>> + int ret;
>>
>> /*
>> * If there's a previous error, free it and prepare for another one.
>> @@ -3651,6 +3658,17 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
>> goto fail;
>> }
>>
>> + if (migrate_mode_is_cpr(s)) {
>> + s->vm_old_state = runstate_get();
>> + global_state_store();
>> + ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
>> + trace_migration_completion_vm_stop(ret);
>> + if (ret < 0) {
>> + error_setg(&local_err, "migration_stop_vm failed, error %d", -ret);
>> + goto fail;
>> + }
>> + }
>
> Could we have a helper function for the shared codes?
>
> How about postcopy? I know it's nonsense to enable postcopy for cpr.. but
> iiuc we don't yet forbid an user doing so. Maybe we should?
How about this?
-------------------------------------------
@@ -3600,6 +3600,11 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
return;
}
+ if (migrate_mode_is_cpr(s) && migrate_postcopy()) {
+ error_setg(&local_err, "cannot mix postcopy and cpr");
+ goto fail;
+ }
+
if (resume) {
/* This is a resumed migration */
rate_limit = migrate_max_postcopy_bandwidth();
------------------------------------------------
- Steve
* Re: [PATCH V3 10/13] migration: stop vm for cpr
2024-02-21 21:23 ` Steven Sistare
@ 2024-02-22 9:03 ` Peter Xu
2024-02-22 13:24 ` Steven Sistare
0 siblings, 1 reply; 42+ messages in thread
From: Peter Xu @ 2024-02-22 9:03 UTC (permalink / raw)
To: Steven Sistare
Cc: qemu-devel, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau, David Hildenbrand
On Wed, Feb 21, 2024 at 04:23:07PM -0500, Steven Sistare wrote:
> > How about postcopy? I know it's nonsense to enable postcopy for cpr.. but
> > iiuc we don't yet forbid an user doing so. Maybe we should?
>
> How about this?
>
> -------------------------------------------
> @@ -3600,6 +3600,11 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
> return;
> }
>
> + if (migrate_mode_is_cpr(s) && migrate_postcopy()) {
> + error_setg(&local_err, "cannot mix postcopy and cpr");
> + goto fail;
> + }
> +
> if (resume) {
> /* This is a resumed migration */
> rate_limit = migrate_max_postcopy_bandwidth();
> ------------------------------------------------
migrate_fd_connect() is a bit late for this check; by then the error can no
longer be attached to the reply to the "migrate" request. Perhaps do it in
migrate_prepare() instead?
--
Peter Xu
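The difference between the two check locations can be sketched with a
stand-alone model. The names below (struct mig_request, prepare_check(),
and the mode enum) are hypothetical and are not QEMU's API. The assumption
illustrated is that a check made while the command is still being validated
can hand its error straight back to the caller, before any connection is
started.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the migration mode and request parameters. */
enum mig_mode { MODE_NORMAL, MODE_CPR_REBOOT };

struct mig_request {
    enum mig_mode mode;
    bool postcopy;      /* true if a postcopy capability is enabled */
};

/*
 * Validate a request up front. Returns true if the migration may proceed;
 * otherwise returns false and writes a message into err.
 */
bool prepare_check(const struct mig_request *req, char *err, size_t errlen)
{
    if (req->mode == MODE_CPR_REBOOT && req->postcopy) {
        snprintf(err, errlen, "cannot mix postcopy and cpr");
        return false;
    }
    return true;
}
```

A caller in this model would run prepare_check() first and return the
message to the user immediately on failure.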
* Re: [PATCH V3 10/13] migration: stop vm for cpr
2024-02-21 21:20 ` Steven Sistare
@ 2024-02-22 9:12 ` Peter Xu
2024-02-22 9:30 ` Peter Xu
0 siblings, 1 reply; 42+ messages in thread
From: Peter Xu @ 2024-02-22 9:12 UTC (permalink / raw)
To: Steven Sistare
Cc: qemu-devel, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau, David Hildenbrand
On Wed, Feb 21, 2024 at 04:20:07PM -0500, Steven Sistare wrote:
> On 2/20/2024 2:33 AM, Peter Xu wrote:
> > On Thu, Feb 08, 2024 at 10:54:03AM -0800, Steve Sistare wrote:
> >> When migration for cpr is initiated, stop the vm and set state
> >> RUN_STATE_FINISH_MIGRATE before ram is saved. This eliminates the
> >> possibility of ram and device state being out of sync, and guarantees
> >> that a guest in the suspended state remains suspended, because qmp_cont
> >> rejects a cont command in the RUN_STATE_FINISH_MIGRATE state.
> >>
> >> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> >> ---
> >> include/migration/misc.h | 1 +
> >> migration/migration.c | 32 +++++++++++++++++++++++++-------
> >> 2 files changed, 26 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/include/migration/misc.h b/include/migration/misc.h
> >> index 6dc234b..54c99a3 100644
> >> --- a/include/migration/misc.h
> >> +++ b/include/migration/misc.h
> >> @@ -60,6 +60,7 @@ void migration_object_init(void);
> >> void migration_shutdown(void);
> >> bool migration_is_idle(void);
> >> bool migration_is_active(MigrationState *);
> >> +bool migrate_mode_is_cpr(MigrationState *);
> >>
> >> typedef enum MigrationEventType {
> >> MIG_EVENT_PRECOPY_SETUP,
> >> diff --git a/migration/migration.c b/migration/migration.c
> >> index d1fce9e..fc5c587 100644
> >> --- a/migration/migration.c
> >> +++ b/migration/migration.c
> >> @@ -1603,6 +1603,11 @@ bool migration_is_active(MigrationState *s)
> >> s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE);
> >> }
> >>
> >> +bool migrate_mode_is_cpr(MigrationState *s)
> >> +{
> >> + return s->parameters.mode == MIG_MODE_CPR_REBOOT;
> >> +}
> >> +
> >> int migrate_init(MigrationState *s, Error **errp)
> >> {
> >> int ret;
> >> @@ -2651,13 +2656,14 @@ static int migration_completion_precopy(MigrationState *s,
> >> bql_lock();
> >> migration_downtime_start(s);
> >>
> >> - s->vm_old_state = runstate_get();
> >> - global_state_store();
> >> -
> >> - ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
> >> - trace_migration_completion_vm_stop(ret);
> >> - if (ret < 0) {
> >> - goto out_unlock;
> >> + if (!migrate_mode_is_cpr(s)) {
> >> + s->vm_old_state = runstate_get();
> >> + global_state_store();
> >> + ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
> >> + trace_migration_completion_vm_stop(ret);
> >> + if (ret < 0) {
> >> + goto out_unlock;
> >> + }
> >> }
> >>
> >> ret = migration_maybe_pause(s, current_active_state,
> >> @@ -3576,6 +3582,7 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
> >> Error *local_err = NULL;
> >> uint64_t rate_limit;
> >> bool resume = s->state == MIGRATION_STATUS_POSTCOPY_PAUSED;
> >> + int ret;
> >>
> >> /*
> >> * If there's a previous error, free it and prepare for another one.
> >> @@ -3651,6 +3658,17 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
> >> goto fail;
> >> }
> >>
> >> + if (migrate_mode_is_cpr(s)) {
> >> + s->vm_old_state = runstate_get();
> >> + global_state_store();
> >> + ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
> >> + trace_migration_completion_vm_stop(ret);
> >> + if (ret < 0) {
> >> + error_setg(&local_err, "migration_stop_vm failed, error %d", -ret);
> >> + goto fail;
> >> + }
> >> + }
> >
> > Could we have a helper function for the shared codes?
>
> I propose to add code to migration_stop_vm to make it the helper. Some call sites emit
> more traces (via migration_stop_vm) as a result of my refactoring,
This should be fine.
> and postcopy start sets
> vm_old_state, which is not used thereafter in that path. Those changes seem harmless to me.
Not only harmless, I think it was a bug to not set vm_old_state in
postcopy_start().. See:
https://issues.redhat.com/browse/RHEL-18061
I suspect that was the cause.
> Tell me what you think:
I'll have a closer look later, but so far it looks all good.
Thanks,
>
> -------------------------------------------------------
> diff --git a/migration/migration.c b/migration/migration.c
> index fc5c587..30d2b08 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -107,12 +107,6 @@ static int migration_maybe_pause(MigrationState *s,
> static void migrate_fd_cancel(MigrationState *s);
> static bool close_return_path_on_source(MigrationState *s);
>
> -static void migration_downtime_start(MigrationState *s)
> -{
> - trace_vmstate_downtime_checkpoint("src-downtime-start");
> - s->downtime_start = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
> -}
> -
> static void migration_downtime_end(MigrationState *s)
> {
> int64_t now = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
> @@ -161,11 +155,20 @@ static gint page_request_addr_cmp(gconstpointer ap, gconstpo
> return (a > b) - (a < b);
> }
>
> -int migration_stop_vm(RunState state)
> +static int migration_stop_vm(MigrationState *s, RunState state)
> {
> - int ret = vm_stop_force_state(state);
> + int ret;
> +
> + trace_vmstate_downtime_checkpoint("src-downtime-start");
> + s->downtime_start = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
> +
> + s->vm_old_state = runstate_get();
> + global_state_store();
> +
> + ret = vm_stop_force_state(state);
>
> trace_vmstate_downtime_checkpoint("src-vm-stopped");
> + trace_migration_completion_vm_stop(ret);
>
> return ret;
> }
> @@ -2454,10 +2457,7 @@ static int postcopy_start(MigrationState *ms, Error **errp)
> bql_lock();
> trace_postcopy_start_set_run();
>
> - migration_downtime_start(ms);
> -
> - global_state_store();
> - ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
> + ret = migration_stop_vm(ms, RUN_STATE_FINISH_MIGRATE);
> if (ret < 0) {
> goto fail;
> }
> @@ -2654,13 +2654,9 @@ static int migration_completion_precopy(MigrationState *s,
> int ret;
>
> bql_lock();
> - migration_downtime_start(s);
>
> if (!migrate_mode_is_cpr(s)) {
> - s->vm_old_state = runstate_get();
> - global_state_store();
> - ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
> - trace_migration_completion_vm_stop(ret);
> + ret = migration_stop_vm(s, RUN_STATE_FINISH_MIGRATE);
> if (ret < 0) {
> goto out_unlock;
> }
> @@ -3498,15 +3494,10 @@ static void *bg_migration_thread(void *opaque)
> s->setup_time = qemu_clock_get_ms(QEMU_CLOCK_HOST) - setup_start;
>
> trace_migration_thread_setup_complete();
> - migration_downtime_start(s);
>
> bql_lock();
>
> - s->vm_old_state = runstate_get();
> -
> - global_state_store();
> - /* Forcibly stop VM before saving state of vCPUs and devices */
> - if (migration_stop_vm(RUN_STATE_PAUSED)) {
> + if (migration_stop_vm(s, RUN_STATE_PAUSED)) {
> goto fail;
> }
> /*
> @@ -3659,10 +3650,7 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
> }
>
> if (migrate_mode_is_cpr(s)) {
> - s->vm_old_state = runstate_get();
> - global_state_store();
> - ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
> - trace_migration_completion_vm_stop(ret);
> + ret = migration_stop_vm(s, RUN_STATE_FINISH_MIGRATE);
> if (ret < 0) {
> error_setg(&local_err, "migration_stop_vm failed, error %d", -ret);
> goto fail;
> diff --git a/migration/migration.h b/migration/migration.h
> index aef8afb..65c0b61 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -541,6 +541,4 @@ int migration_rp_wait(MigrationState *s);
> */
> void migration_rp_kick(MigrationState *s);
>
> -int migration_stop_vm(RunState state);
> -
> #endif
> -------------------------------------------------------
>
> - Steve
>
--
Peter Xu
* Re: [PATCH V3 10/13] migration: stop vm for cpr
2024-02-22 9:12 ` Peter Xu
@ 2024-02-22 9:30 ` Peter Xu
2024-02-22 13:29 ` Steven Sistare
0 siblings, 1 reply; 42+ messages in thread
From: Peter Xu @ 2024-02-22 9:30 UTC (permalink / raw)
To: Steven Sistare
Cc: qemu-devel, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau, David Hildenbrand
On Thu, Feb 22, 2024 at 05:12:53PM +0800, Peter Xu wrote:
> On Wed, Feb 21, 2024 at 04:20:07PM -0500, Steven Sistare wrote:
> > On 2/20/2024 2:33 AM, Peter Xu wrote:
> > > On Thu, Feb 08, 2024 at 10:54:03AM -0800, Steve Sistare wrote:
> > >> When migration for cpr is initiated, stop the vm and set state
> > >> RUN_STATE_FINISH_MIGRATE before ram is saved. This eliminates the
> > >> possibility of ram and device state being out of sync, and guarantees
> > >> that a guest in the suspended state remains suspended, because qmp_cont
> > >> rejects a cont command in the RUN_STATE_FINISH_MIGRATE state.
> > >>
> > >> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> > >> ---
> > >> include/migration/misc.h | 1 +
> > >> migration/migration.c | 32 +++++++++++++++++++++++++-------
> > >> 2 files changed, 26 insertions(+), 7 deletions(-)
> > >>
> > >> diff --git a/include/migration/misc.h b/include/migration/misc.h
> > >> index 6dc234b..54c99a3 100644
> > >> --- a/include/migration/misc.h
> > >> +++ b/include/migration/misc.h
> > >> @@ -60,6 +60,7 @@ void migration_object_init(void);
> > >> void migration_shutdown(void);
> > >> bool migration_is_idle(void);
> > >> bool migration_is_active(MigrationState *);
> > >> +bool migrate_mode_is_cpr(MigrationState *);
> > >>
> > >> typedef enum MigrationEventType {
> > >> MIG_EVENT_PRECOPY_SETUP,
> > >> diff --git a/migration/migration.c b/migration/migration.c
> > >> index d1fce9e..fc5c587 100644
> > >> --- a/migration/migration.c
> > >> +++ b/migration/migration.c
> > >> @@ -1603,6 +1603,11 @@ bool migration_is_active(MigrationState *s)
> > >> s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE);
> > >> }
> > >>
> > >> +bool migrate_mode_is_cpr(MigrationState *s)
> > >> +{
> > >> + return s->parameters.mode == MIG_MODE_CPR_REBOOT;
> > >> +}
> > >> +
> > >> int migrate_init(MigrationState *s, Error **errp)
> > >> {
> > >> int ret;
> > >> @@ -2651,13 +2656,14 @@ static int migration_completion_precopy(MigrationState *s,
> > >> bql_lock();
> > >> migration_downtime_start(s);
> > >>
> > >> - s->vm_old_state = runstate_get();
> > >> - global_state_store();
> > >> -
> > >> - ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
> > >> - trace_migration_completion_vm_stop(ret);
> > >> - if (ret < 0) {
> > >> - goto out_unlock;
> > >> + if (!migrate_mode_is_cpr(s)) {
> > >> + s->vm_old_state = runstate_get();
> > >> + global_state_store();
> > >> + ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
> > >> + trace_migration_completion_vm_stop(ret);
> > >> + if (ret < 0) {
> > >> + goto out_unlock;
> > >> + }
> > >> }
> > >>
> > >> ret = migration_maybe_pause(s, current_active_state,
> > >> @@ -3576,6 +3582,7 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
> > >> Error *local_err = NULL;
> > >> uint64_t rate_limit;
> > >> bool resume = s->state == MIGRATION_STATUS_POSTCOPY_PAUSED;
> > >> + int ret;
> > >>
> > >> /*
> > >> * If there's a previous error, free it and prepare for another one.
> > >> @@ -3651,6 +3658,17 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
> > >> goto fail;
> > >> }
> > >>
> > >> + if (migrate_mode_is_cpr(s)) {
> > >> + s->vm_old_state = runstate_get();
> > >> + global_state_store();
> > >> + ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
> > >> + trace_migration_completion_vm_stop(ret);
> > >> + if (ret < 0) {
> > >> + error_setg(&local_err, "migration_stop_vm failed, error %d", -ret);
> > >> + goto fail;
> > >> + }
> > >> + }
> > >
> > > Could we have a helper function for the shared codes?
> >
> > I propose to add code to migration_stop_vm to make it the helper. Some call sites emit
> > more traces (via migration_stop_vm) as a result of my refactoring,
>
> This should be fine.
>
> > and postcopy start sets
> > vm_old_state, which is not used thereafter in that path. Those changes seem harmless to me.
>
> Not only harmless, I think it was a bug to not set vm_old_state in
> postcopy_start().. See:
>
> https://issues.redhat.com/browse/RHEL-18061
>
> I suspect that was the cause.
>
> > Tell me what you think:
>
> I'll have a closer look later, but so far it looks all good.
>
> Thanks,
>
> >
> > -------------------------------------------------------
> > diff --git a/migration/migration.c b/migration/migration.c
> > index fc5c587..30d2b08 100644
> > --- a/migration/migration.c
> > +++ b/migration/migration.c
> > @@ -107,12 +107,6 @@ static int migration_maybe_pause(MigrationState *s,
> > static void migrate_fd_cancel(MigrationState *s);
> > static bool close_return_path_on_source(MigrationState *s);
> >
> > -static void migration_downtime_start(MigrationState *s)
> > -{
> > - trace_vmstate_downtime_checkpoint("src-downtime-start");
> > - s->downtime_start = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
> > -}
Ah, one more thing: would you mind keeping this helper, even if it could be
squashed, when sending the formal patch? I want to keep downtime start/end
super clear and paired, as they're important hook points. The compiler
should inline it anyway.
> > -
> > static void migration_downtime_end(MigrationState *s)
> > {
> > int64_t now = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
> > @@ -161,11 +155,20 @@ static gint page_request_addr_cmp(gconstpointer ap, gconstpo
> > return (a > b) - (a < b);
> > }
> >
> > -int migration_stop_vm(RunState state)
> > +static int migration_stop_vm(MigrationState *s, RunState state)
> > {
> > - int ret = vm_stop_force_state(state);
> > + int ret;
> > +
> > + trace_vmstate_downtime_checkpoint("src-downtime-start");
> > + s->downtime_start = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
> > +
> > + s->vm_old_state = runstate_get();
> > + global_state_store();
> > +
> > + ret = vm_stop_force_state(state);
> >
> > trace_vmstate_downtime_checkpoint("src-vm-stopped");
> > + trace_migration_completion_vm_stop(ret);
> >
> > return ret;
> > }
> > @@ -2454,10 +2457,7 @@ static int postcopy_start(MigrationState *ms, Error **errp)
> > bql_lock();
> > trace_postcopy_start_set_run();
> >
> > - migration_downtime_start(ms);
> > -
> > - global_state_store();
> > - ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
> > + ret = migration_stop_vm(ms, RUN_STATE_FINISH_MIGRATE);
> > if (ret < 0) {
> > goto fail;
> > }
> > @@ -2654,13 +2654,9 @@ static int migration_completion_precopy(MigrationState *s,
> > int ret;
> >
> > bql_lock();
> > - migration_downtime_start(s);
> >
> > if (!migrate_mode_is_cpr(s)) {
> > - s->vm_old_state = runstate_get();
> > - global_state_store();
> > - ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
> > - trace_migration_completion_vm_stop(ret);
> > + ret = migration_stop_vm(s, RUN_STATE_FINISH_MIGRATE);
> > if (ret < 0) {
> > goto out_unlock;
> > }
> > @@ -3498,15 +3494,10 @@ static void *bg_migration_thread(void *opaque)
> > s->setup_time = qemu_clock_get_ms(QEMU_CLOCK_HOST) - setup_start;
> >
> > trace_migration_thread_setup_complete();
> > - migration_downtime_start(s);
> >
> > bql_lock();
> >
> > - s->vm_old_state = runstate_get();
> > -
> > - global_state_store();
> > - /* Forcibly stop VM before saving state of vCPUs and devices */
> > - if (migration_stop_vm(RUN_STATE_PAUSED)) {
> > + if (migration_stop_vm(s, RUN_STATE_PAUSED)) {
> > goto fail;
> > }
> > /*
> > @@ -3659,10 +3650,7 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
> > }
> >
> > if (migrate_mode_is_cpr(s)) {
> > - s->vm_old_state = runstate_get();
> > - global_state_store();
> > - ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
> > - trace_migration_completion_vm_stop(ret);
> > + ret = migration_stop_vm(s, RUN_STATE_FINISH_MIGRATE);
> > if (ret < 0) {
> > error_setg(&local_err, "migration_stop_vm failed, error %d", -ret);
> > goto fail;
> > diff --git a/migration/migration.h b/migration/migration.h
> > index aef8afb..65c0b61 100644
> > --- a/migration/migration.h
> > +++ b/migration/migration.h
> > @@ -541,6 +541,4 @@ int migration_rp_wait(MigrationState *s);
> > */
> > void migration_rp_kick(MigrationState *s);
> >
> > -int migration_stop_vm(RunState state);
> > -
> > #endif
> > -------------------------------------------------------
> >
> > - Steve
> >
>
> --
> Peter Xu
--
Peter Xu
* Re: [PATCH V3 10/13] migration: stop vm for cpr
2024-02-22 9:03 ` Peter Xu
@ 2024-02-22 13:24 ` Steven Sistare
0 siblings, 0 replies; 42+ messages in thread
From: Steven Sistare @ 2024-02-22 13:24 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau, David Hildenbrand
On 2/22/2024 4:03 AM, Peter Xu wrote:
> On Wed, Feb 21, 2024 at 04:23:07PM -0500, Steven Sistare wrote:
>>> How about postcopy? I know it's nonsense to enable postcopy for cpr, but
>>> IIUC we don't yet forbid a user from doing so. Maybe we should?
>>
>> How about this?
>>
>> -------------------------------------------
>> @@ -3600,6 +3600,11 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
>> return;
>> }
>>
>> + if (migrate_mode_is_cpr(s) && migrate_postcopy()) {
>> + error_setg(&local_err, "cannot mix postcopy and cpr");
>> + goto fail;
>> + }
>> +
>> if (resume) {
>> /* This is a resumed migration */
>> rate_limit = migrate_max_postcopy_bandwidth();
>> ------------------------------------------------
>
> migrate_fd_connect() would be a bit late; the error could not be
> attached to the "migrate" request. Perhaps migrate_prepare()?
Thank you, that is better - steve
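A minimal standalone sketch of the timing point above, not QEMU code. It assumes the check runs synchronously while the "migrate" command is being handled, so a rejected configuration can be reported in that command's reply. All types and functions here (ModeModel, MigConfigModel, ErrModel, prepare_model, migrate_cmd_model) are hypothetical stand-ins; only the error string comes from the proposed patch.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins; none of these are QEMU's real types. */
typedef enum { MODE_NORMAL, MODE_CPR_REBOOT } ModeModel;

typedef struct {
    ModeModel mode;
    bool postcopy;      /* stands in for the result of migrate_postcopy() */
} MigConfigModel;

typedef struct {
    char msg[64];       /* stands in for an Error object */
} ErrModel;

/*
 * Validation performed before any connection is started.
 * Returns false and fills err->msg when the configuration is rejected.
 */
static bool prepare_model(const MigConfigModel *c, ErrModel *err)
{
    if (c->mode == MODE_CPR_REBOOT && c->postcopy) {
        snprintf(err->msg, sizeof(err->msg), "cannot mix postcopy and cpr");
        return false;
    }
    return true;
}

/*
 * Stand-in for the command handler: because the check runs here,
 * the caller that issued the command receives the error directly.
 */
static bool migrate_cmd_model(const MigConfigModel *c, ErrModel *err)
{
    err->msg[0] = '\0';
    if (!prepare_model(c, err)) {
        return false;   /* error travels back with the command reply */
    }
    /* The asynchronous connect step would begin here. */
    return true;
}
```

In this model, a caller that passes a cpr-reboot configuration with postcopy enabled gets false back along with the message, before any connect step has begun.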
* Re: [PATCH V3 10/13] migration: stop vm for cpr
2024-02-22 9:30 ` Peter Xu
@ 2024-02-22 13:29 ` Steven Sistare
0 siblings, 0 replies; 42+ messages in thread
From: Steven Sistare @ 2024-02-22 13:29 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Fabiano Rosas, Michael S. Tsirkin, Jason Wang,
Alex Williamson, Cedric Le Goater, Gerd Hoffmann,
Marc-Andre Lureau, David Hildenbrand
On 2/22/2024 4:30 AM, Peter Xu wrote:
> On Thu, Feb 22, 2024 at 05:12:53PM +0800, Peter Xu wrote:
>> On Wed, Feb 21, 2024 at 04:20:07PM -0500, Steven Sistare wrote:
>>> On 2/20/2024 2:33 AM, Peter Xu wrote:
>>>> On Thu, Feb 08, 2024 at 10:54:03AM -0800, Steve Sistare wrote:
>>>>> When migration for cpr is initiated, stop the vm and set state
>>>>> RUN_STATE_FINISH_MIGRATE before ram is saved. This eliminates the
>>>>> possibility of ram and device state being out of sync, and guarantees
>>>>> that a guest in the suspended state remains suspended, because qmp_cont
>>>>> rejects a cont command in the RUN_STATE_FINISH_MIGRATE state.
>>>>>
>>>>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>>>>> ---
>>>>> include/migration/misc.h | 1 +
>>>>> migration/migration.c | 32 +++++++++++++++++++++++++-------
>>>>> 2 files changed, 26 insertions(+), 7 deletions(-)
>>>>>
>>>>> diff --git a/include/migration/misc.h b/include/migration/misc.h
>>>>> index 6dc234b..54c99a3 100644
>>>>> --- a/include/migration/misc.h
>>>>> +++ b/include/migration/misc.h
>>>>> @@ -60,6 +60,7 @@ void migration_object_init(void);
>>>>> void migration_shutdown(void);
>>>>> bool migration_is_idle(void);
>>>>> bool migration_is_active(MigrationState *);
>>>>> +bool migrate_mode_is_cpr(MigrationState *);
>>>>>
>>>>> typedef enum MigrationEventType {
>>>>> MIG_EVENT_PRECOPY_SETUP,
>>>>> diff --git a/migration/migration.c b/migration/migration.c
>>>>> index d1fce9e..fc5c587 100644
>>>>> --- a/migration/migration.c
>>>>> +++ b/migration/migration.c
>>>>> @@ -1603,6 +1603,11 @@ bool migration_is_active(MigrationState *s)
>>>>> s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE);
>>>>> }
>>>>>
>>>>> +bool migrate_mode_is_cpr(MigrationState *s)
>>>>> +{
>>>>> + return s->parameters.mode == MIG_MODE_CPR_REBOOT;
>>>>> +}
>>>>> +
>>>>> int migrate_init(MigrationState *s, Error **errp)
>>>>> {
>>>>> int ret;
>>>>> @@ -2651,13 +2656,14 @@ static int migration_completion_precopy(MigrationState *s,
>>>>> bql_lock();
>>>>> migration_downtime_start(s);
>>>>>
>>>>> - s->vm_old_state = runstate_get();
>>>>> - global_state_store();
>>>>> -
>>>>> - ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
>>>>> - trace_migration_completion_vm_stop(ret);
>>>>> - if (ret < 0) {
>>>>> - goto out_unlock;
>>>>> + if (!migrate_mode_is_cpr(s)) {
>>>>> + s->vm_old_state = runstate_get();
>>>>> + global_state_store();
>>>>> + ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
>>>>> + trace_migration_completion_vm_stop(ret);
>>>>> + if (ret < 0) {
>>>>> + goto out_unlock;
>>>>> + }
>>>>> }
>>>>>
>>>>> ret = migration_maybe_pause(s, current_active_state,
>>>>> @@ -3576,6 +3582,7 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
>>>>> Error *local_err = NULL;
>>>>> uint64_t rate_limit;
>>>>> bool resume = s->state == MIGRATION_STATUS_POSTCOPY_PAUSED;
>>>>> + int ret;
>>>>>
>>>>> /*
>>>>> * If there's a previous error, free it and prepare for another one.
>>>>> @@ -3651,6 +3658,17 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
>>>>> goto fail;
>>>>> }
>>>>>
>>>>> + if (migrate_mode_is_cpr(s)) {
>>>>> + s->vm_old_state = runstate_get();
>>>>> + global_state_store();
>>>>> + ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
>>>>> + trace_migration_completion_vm_stop(ret);
>>>>> + if (ret < 0) {
>>>>> + error_setg(&local_err, "migration_stop_vm failed, error %d", -ret);
>>>>> + goto fail;
>>>>> + }
>>>>> + }
>>>>
>>>> Could we have a helper function for the shared codes?
>>>
>>> I propose to add code to migration_stop_vm to make it the helper. Some call sites emit
>>> more traces (via migration_stop_vm) as a result of my refactoring,
>>
>> This should be fine.
>>
>>> and postcopy start sets
>>> vm_old_state, which is not used thereafter in that path. Those changes seem harmless to me.
>>
>> Not only harmless, I think it was a bug to not set vm_old_state in
>> postcopy_start().. See:
>>
>> https://issues.redhat.com/browse/RHEL-18061
>>
>> I suspect that was the cause.
>>
>>> Tell me what you think:
>>
>> I'll have a closer look later, but so far it looks all good.
>>
>> Thanks,
>>
>>>
>>> -------------------------------------------------------
>>> diff --git a/migration/migration.c b/migration/migration.c
>>> index fc5c587..30d2b08 100644
>>> --- a/migration/migration.c
>>> +++ b/migration/migration.c
>>> @@ -107,12 +107,6 @@ static int migration_maybe_pause(MigrationState *s,
>>> static void migrate_fd_cancel(MigrationState *s);
>>> static bool close_return_path_on_source(MigrationState *s);
>>>
>>> -static void migration_downtime_start(MigrationState *s)
>>> -{
>>> - trace_vmstate_downtime_checkpoint("src-downtime-start");
>>> - s->downtime_start = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
>>> -}
>
> Ah, one more thing: would you mind keeping this helper, even if it could be
> squashed, when sending the formal patch? I want to keep downtime start/end
> super clear and paired, as they're important hook points. The compiler
> should inline it anyway.
Will do - steve
>>> -
>>> static void migration_downtime_end(MigrationState *s)
>>> {
>>> int64_t now = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
>>> @@ -161,11 +155,20 @@ static gint page_request_addr_cmp(gconstpointer ap, gconstpo
>>> return (a > b) - (a < b);
>>> }
>>>
>>> -int migration_stop_vm(RunState state)
>>> +static int migration_stop_vm(MigrationState *s, RunState state)
>>> {
>>> - int ret = vm_stop_force_state(state);
>>> + int ret;
>>> +
>>> + trace_vmstate_downtime_checkpoint("src-downtime-start");
>>> + s->downtime_start = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
>>> +
>>> + s->vm_old_state = runstate_get();
>>> + global_state_store();
>>> +
>>> + ret = vm_stop_force_state(state);
>>>
>>> trace_vmstate_downtime_checkpoint("src-vm-stopped");
>>> + trace_migration_completion_vm_stop(ret);
>>>
>>> return ret;
>>> }
>>> @@ -2454,10 +2457,7 @@ static int postcopy_start(MigrationState *ms, Error **errp)
>>> bql_lock();
>>> trace_postcopy_start_set_run();
>>>
>>> - migration_downtime_start(ms);
>>> -
>>> - global_state_store();
>>> - ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
>>> + ret = migration_stop_vm(ms, RUN_STATE_FINISH_MIGRATE);
>>> if (ret < 0) {
>>> goto fail;
>>> }
>>> @@ -2654,13 +2654,9 @@ static int migration_completion_precopy(MigrationState *s,
>>> int ret;
>>>
>>> bql_lock();
>>> - migration_downtime_start(s);
>>>
>>> if (!migrate_mode_is_cpr(s)) {
>>> - s->vm_old_state = runstate_get();
>>> - global_state_store();
>>> - ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
>>> - trace_migration_completion_vm_stop(ret);
>>> + ret = migration_stop_vm(s, RUN_STATE_FINISH_MIGRATE);
>>> if (ret < 0) {
>>> goto out_unlock;
>>> }
>>> @@ -3498,15 +3494,10 @@ static void *bg_migration_thread(void *opaque)
>>> s->setup_time = qemu_clock_get_ms(QEMU_CLOCK_HOST) - setup_start;
>>>
>>> trace_migration_thread_setup_complete();
>>> - migration_downtime_start(s);
>>>
>>> bql_lock();
>>>
>>> - s->vm_old_state = runstate_get();
>>> -
>>> - global_state_store();
>>> - /* Forcibly stop VM before saving state of vCPUs and devices */
>>> - if (migration_stop_vm(RUN_STATE_PAUSED)) {
>>> + if (migration_stop_vm(s, RUN_STATE_PAUSED)) {
>>> goto fail;
>>> }
>>> /*
>>> @@ -3659,10 +3650,7 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
>>> }
>>>
>>> if (migrate_mode_is_cpr(s)) {
>>> - s->vm_old_state = runstate_get();
>>> - global_state_store();
>>> - ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
>>> - trace_migration_completion_vm_stop(ret);
>>> + ret = migration_stop_vm(s, RUN_STATE_FINISH_MIGRATE);
>>> if (ret < 0) {
>>> error_setg(&local_err, "migration_stop_vm failed, error %d", -ret);
>>> goto fail;
>>> diff --git a/migration/migration.h b/migration/migration.h
>>> index aef8afb..65c0b61 100644
>>> --- a/migration/migration.h
>>> +++ b/migration/migration.h
>>> @@ -541,6 +541,4 @@ int migration_rp_wait(MigrationState *s);
>>> */
>>> void migration_rp_kick(MigrationState *s);
>>>
>>> -int migration_stop_vm(RunState state);
>>> -
>>> #endif
>>> -------------------------------------------------------
>>>
>>> - Steve
>>>
>>
>> --
>> Peter Xu
>
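A minimal standalone sketch of the shape agreed above, under stated assumptions: the stop helper records the old run state and the downtime start itself, while the small downtime-start helper is kept as a separate function. This is a model, not the eventual patch. The function names mirror the thread for readability, but the types, the fake clock, and the simulated stop result are hypothetical stand-ins, and global_state_store() is reduced to a counter.

```c
#include <stdint.h>

/* Hypothetical stand-ins for QEMU's run states and migration state. */
typedef enum {
    RS_RUNNING, RS_SUSPENDED, RS_PAUSED, RS_FINISH_MIGRATE
} RunStateModel;

typedef struct {
    int64_t downtime_start;
    RunStateModel vm_old_state;
} MigStateModel;

static RunStateModel current_state = RS_RUNNING; /* models runstate_get() */
static int64_t fake_clock_ms = 1000;             /* models the realtime clock */
static int force_stop_result = 0;                /* < 0 simulates a failed stop */
static int global_state_stores = 0;              /* models global_state_store() */

/* Models vm_stop_force_state(): changes state only on success. */
static int vm_stop_force_state_model(RunStateModel state)
{
    if (force_stop_result < 0) {
        return force_stop_result;
    }
    current_state = state;
    return 0;
}

/* Kept as its own helper so downtime start stays an explicit hook point. */
static void migration_downtime_start(MigStateModel *s)
{
    s->downtime_start = fake_clock_ms;
}

/* Single helper shared by every call site that stops the VM. */
static int migration_stop_vm(MigStateModel *s, RunStateModel state)
{
    migration_downtime_start(s);
    s->vm_old_state = current_state;  /* record the state before stopping */
    global_state_stores++;
    return vm_stop_force_state_model(state);
}
```

With this shape, each call site reduces to one call plus its own error handling, and the old run state is recorded on every path, including the postcopy one.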
Thread overview: 42+ messages
2024-02-08 18:53 [PATCH V3 00/13] allow cpr-reboot for vfio Steve Sistare
2024-02-08 18:53 ` [PATCH V3 01/13] notify: pass error to notifier with return Steve Sistare
2024-02-12 9:08 ` David Hildenbrand
2024-02-08 18:53 ` [PATCH V3 02/13] migration: remove error from notifier data Steve Sistare
2024-02-12 9:08 ` David Hildenbrand
2024-02-08 18:53 ` [PATCH V3 03/13] migration: convert to NotifierWithReturn Steve Sistare
2024-02-12 9:10 ` David Hildenbrand
2024-02-08 18:53 ` [PATCH V3 04/13] migration: MigrationEvent for notifiers Steve Sistare
2024-02-12 9:11 ` David Hildenbrand
2024-02-20 6:38 ` Peter Xu
2024-02-08 18:53 ` [PATCH V3 05/13] migration: remove postcopy_after_devices Steve Sistare
2024-02-20 6:42 ` Peter Xu
2024-02-08 18:53 ` [PATCH V3 06/13] migration: MigrationNotifyFunc Steve Sistare
2024-02-12 9:14 ` David Hildenbrand
2024-02-20 6:48 ` Peter Xu
2024-02-08 18:54 ` [PATCH V3 07/13] migration: per-mode notifiers Steve Sistare
2024-02-12 9:16 ` David Hildenbrand
2024-02-20 6:51 ` Peter Xu
2024-02-08 18:54 ` [PATCH V3 08/13] migration: refactor migrate_fd_connect failures Steve Sistare
2024-02-12 9:17 ` David Hildenbrand
2024-02-08 18:54 ` [PATCH V3 09/13] migration: notifier error checking Steve Sistare
2024-02-12 9:24 ` David Hildenbrand
2024-02-12 15:37 ` Steven Sistare
2024-02-20 7:12 ` Peter Xu
2024-02-20 22:12 ` Steven Sistare
2024-02-08 18:54 ` [PATCH V3 10/13] migration: stop vm for cpr Steve Sistare
2024-02-20 7:33 ` Peter Xu
2024-02-20 22:19 ` Steven Sistare
2024-02-21 21:20 ` Steven Sistare
2024-02-22 9:12 ` Peter Xu
2024-02-22 9:30 ` Peter Xu
2024-02-22 13:29 ` Steven Sistare
2024-02-21 21:23 ` Steven Sistare
2024-02-22 9:03 ` Peter Xu
2024-02-22 13:24 ` Steven Sistare
2024-02-08 18:54 ` [PATCH V3 11/13] vfio: register container " Steve Sistare
2024-02-08 18:54 ` [PATCH V3 12/13] vfio: allow cpr-reboot migration if suspended Steve Sistare
2024-02-21 18:32 ` Steven Sistare
2024-02-08 18:54 ` [PATCH V3 13/13] migration: update cpr-reboot description Steve Sistare
2024-02-20 7:49 ` [PATCH V3 00/13] allow cpr-reboot for vfio Peter Xu
2024-02-20 22:32 ` Steven Sistare
2024-02-21 2:13 ` Peter Xu