* [PATCH v2 00/21] migration: Improve error reporting
@ 2024-02-27 18:03 Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 01/21] migration: Report error when shutdown fails Cédric Le Goater
` (20 more replies)
0 siblings, 21 replies; 42+ messages in thread
From: Cédric Le Goater @ 2024-02-27 18:03 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Cédric Le Goater
Hello,
The motivation behind these changes is to improve error reporting to
the upper management layer (libvirt) with a more detailed error, this
to let it decide, depending on the reported error, whether to try
migration again later. It would be useful in cases where migration
fails due to lack of HW resources on the host. For instance, some
adapters can only initiate a limited number of simultaneous dirty
tracking requests and this imposes a limit on the the number of VMs
that can be migrated simultaneously.
We are not quite ready for such a mechanism but what we can do first is
to cleanup the error reporting in the early save_setup sequence. This
is what the following changes propose, by adding an Error** argument to
various handlers and propagating it to the core migration subsystem.
Thanks,
C.
Changes in v2:
- Removed v1 patches addressing the return-path thread termination as
they are now superseded by :
https://lore.kernel.org/qemu-devel/20240226203122.22894-1-farosas@suse.de/
- Documentation updates of handlers
- Removed call to PRECOPY_NOTIFY_SETUP notifiers in case of errors
- Modified routines taking an Error** argument to return a bool when
possible and made adjustments in callers.
- new MEMORY_LISTENER_CALL_LOG_GLOBAL macro for .log_global*()
handlers
- Handled SETUP state when migration terminates
- Modified memory_get_xlat_addr() to take an Error** argument
- Various refinements on error handling
Cédric Le Goater (21):
migration: Report error when shutdown fails
migration: Remove SaveStateHandler and LoadStateHandler typedefs
migration: Add documentation for SaveVMHandlers
migration: Do not call PRECOPY_NOTIFY_SETUP notifiers in case of error
migration: Add Error** argument to qemu_savevm_state_setup()
migration: Add Error** argument to .save_setup() handler
migration: Add Error** argument to .load_setup() handler
memory: Add Error** argument to .log_global*() handlers
memory: Add Error** argument to the global_dirty_log routines
migration: Modify ram_init_bitmaps() to report dirty tracking errors
migration: Fix migration termination
vfio: Add Error** argument to .set_dirty_page_tracking() handler
vfio: Add Error** argument to vfio_devices_dma_logging_start()
vfio: Add Error** argument to vfio_devices_dma_logging_stop()
vfio: Use new Error** argument in vfio_save_setup()
vfio: Add Error** argument to .vfio_save_config() handler
vfio: Reverse test on vfio_get_dirty_bitmap()
memory: Add Error** argument to memory_get_xlat_addr()
vfio: Add Error** argument to .get_dirty_bitmap() handler
vfio: Also trace event failures in vfio_save_complete_precopy()
vfio: Extend vfio_set_migration_error() with Error* argument
include/exec/memory.h | 40 +++-
include/hw/vfio/vfio-common.h | 29 ++-
include/hw/vfio/vfio-container-base.h | 35 +++-
include/migration/register.h | 267 +++++++++++++++++++++++---
include/qemu/typedefs.h | 2 -
migration/savevm.h | 2 +-
hw/i386/xen/xen-hvm.c | 10 +-
hw/ppc/spapr.c | 2 +-
hw/s390x/s390-stattrib.c | 2 +-
hw/vfio/common.c | 160 +++++++++------
hw/vfio/container-base.c | 9 +-
hw/vfio/container.c | 19 +-
hw/vfio/migration.c | 89 ++++++---
hw/vfio/pci.c | 5 +-
hw/virtio/vhost-vdpa.c | 5 +-
hw/virtio/vhost.c | 6 +-
migration/block-dirty-bitmap.c | 2 +-
migration/block.c | 2 +-
migration/dirtyrate.c | 21 +-
migration/migration.c | 24 ++-
migration/qemu-file.c | 5 +-
migration/ram.c | 48 ++++-
migration/savevm.c | 28 +--
system/memory.c | 95 +++++++--
system/physmem.c | 5 +-
25 files changed, 699 insertions(+), 213 deletions(-)
--
2.43.2
^ permalink raw reply [flat|nested] 42+ messages in thread
* [PATCH v2 01/21] migration: Report error when shutdown fails
2024-02-27 18:03 [PATCH v2 00/21] migration: Improve error reporting Cédric Le Goater
@ 2024-02-27 18:03 ` Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 02/21] migration: Remove SaveStateHandler and LoadStateHandler typedefs Cédric Le Goater
` (19 subsequent siblings)
20 siblings, 0 replies; 42+ messages in thread
From: Cédric Le Goater @ 2024-02-27 18:03 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Cédric Le Goater
This will help detect issues regarding I/O channels usage.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
migration/qemu-file.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 94231ff2955c80b3d0fab11a40510d34c334a826..b69e0c62e2fcf21d346a3687df7eebee23791fdc 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -62,6 +62,8 @@ struct QEMUFile {
*/
int qemu_file_shutdown(QEMUFile *f)
{
+ Error *err = NULL;
+
/*
* We must set qemufile error before the real shutdown(), otherwise
* there can be a race window where we thought IO all went though
@@ -90,7 +92,8 @@ int qemu_file_shutdown(QEMUFile *f)
return -ENOSYS;
}
- if (qio_channel_shutdown(f->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL) < 0) {
+ if (qio_channel_shutdown(f->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, &err) < 0) {
+ error_report_err(err);
return -EIO;
}
--
2.43.2
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH v2 02/21] migration: Remove SaveStateHandler and LoadStateHandler typedefs
2024-02-27 18:03 [PATCH v2 00/21] migration: Improve error reporting Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 01/21] migration: Report error when shutdown fails Cédric Le Goater
@ 2024-02-27 18:03 ` Cédric Le Goater
2024-02-29 1:21 ` Fabiano Rosas
2024-02-29 3:55 ` Peter Xu
2024-02-27 18:03 ` [PATCH v2 03/21] migration: Add documentation for SaveVMHandlers Cédric Le Goater
` (18 subsequent siblings)
20 siblings, 2 replies; 42+ messages in thread
From: Cédric Le Goater @ 2024-02-27 18:03 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Cédric Le Goater
They are only used once.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
include/migration/register.h | 4 ++--
include/qemu/typedefs.h | 2 --
2 files changed, 2 insertions(+), 4 deletions(-)
diff --git a/include/migration/register.h b/include/migration/register.h
index 9ab1f79512c605f0c88a45b560c57486fa054441..2e6a7d766e62f64940086b7b511249c9ff21fa62 100644
--- a/include/migration/register.h
+++ b/include/migration/register.h
@@ -18,7 +18,7 @@
typedef struct SaveVMHandlers {
/* This runs inside the BQL. */
- SaveStateHandler *save_state;
+ void (*save_state)(QEMUFile *f, void *opaque);
/*
* save_prepare is called early, even before migration starts, and can be
@@ -71,7 +71,7 @@ typedef struct SaveVMHandlers {
/* This calculate the exact remaining data to transfer */
void (*state_pending_exact)(void *opaque, uint64_t *must_precopy,
uint64_t *can_postcopy);
- LoadStateHandler *load_state;
+ int (*load_state)(QEMUFile *f, void *opaque, int version_id);
int (*load_setup)(QEMUFile *f, void *opaque);
int (*load_cleanup)(void *opaque);
/* Called when postcopy migration wants to resume from failure */
diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
index d7c703b4ae9c91d9638111bcaafc656686e1dbb8..5fcba1c1b467826d4f7b6bd287690d33cdc48acf 100644
--- a/include/qemu/typedefs.h
+++ b/include/qemu/typedefs.h
@@ -150,8 +150,6 @@ typedef struct IRQState *qemu_irq;
/*
* Function types
*/
-typedef void SaveStateHandler(QEMUFile *f, void *opaque);
-typedef int LoadStateHandler(QEMUFile *f, void *opaque, int version_id);
typedef void (*qemu_irq_handler)(void *opaque, int n, int level);
#endif /* QEMU_TYPEDEFS_H */
--
2.43.2
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH v2 03/21] migration: Add documentation for SaveVMHandlers
2024-02-27 18:03 [PATCH v2 00/21] migration: Improve error reporting Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 01/21] migration: Report error when shutdown fails Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 02/21] migration: Remove SaveStateHandler and LoadStateHandler typedefs Cédric Le Goater
@ 2024-02-27 18:03 ` Cédric Le Goater
2024-02-29 4:10 ` Peter Xu
2024-03-04 9:05 ` Avihai Horon
2024-02-27 18:03 ` [PATCH v2 04/21] migration: Do not call PRECOPY_NOTIFY_SETUP notifiers in case of error Cédric Le Goater
` (17 subsequent siblings)
20 siblings, 2 replies; 42+ messages in thread
From: Cédric Le Goater @ 2024-02-27 18:03 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Cédric Le Goater
The SaveVMHandlers structure is still in use for complex subsystems
and devices. Document the handlers since we are going to modify a few
later.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
include/migration/register.h | 257 +++++++++++++++++++++++++++++++----
1 file changed, 231 insertions(+), 26 deletions(-)
diff --git a/include/migration/register.h b/include/migration/register.h
index 2e6a7d766e62f64940086b7b511249c9ff21fa62..2cc71ec45f65bf2884c9e7a823d2968752f15c20 100644
--- a/include/migration/register.h
+++ b/include/migration/register.h
@@ -16,30 +16,129 @@
#include "hw/vmstate-if.h"
+/**
+ * struct SaveVMHandlers: handler structure to finely control
+ * migration of complex subsystems and devices, such as RAM, block and
+ * VFIO.
+ */
typedef struct SaveVMHandlers {
- /* This runs inside the BQL. */
+
+ /* The following handlers runs inside the BQL. */
+
+ /**
+ * @save_state
+ *
+ * Saves state section on the source using the latest state format
+ * version.
+ *
+ * Legacy method. Should be deprecated when all users are ported
+ * to VMState.
+ *
+ * @f: QEMUFile where to send the data
+ * @opaque: data pointer passed to register_savevm_live()
+ */
void (*save_state)(QEMUFile *f, void *opaque);
- /*
- * save_prepare is called early, even before migration starts, and can be
- * used to perform early checks.
+ /**
+ * @save_prepare
+ *
+ * Called early, even before migration starts, and can be used to
+ * perform early checks.
+ *
+ * @opaque: data pointer passed to register_savevm_live()
+ * @errp: pointer to Error*, to store an error if it happens.
+ *
+ * Returns zero to indicate success and negative for error
*/
int (*save_prepare)(void *opaque, Error **errp);
+
+ /**
+ * @save_setup
+ *
+ * Initializes the data structures on the source and transmits
+ * first section containing information on the device
+ *
+ * @f: QEMUFile where to send the data
+ * @opaque: data pointer passed to register_savevm_live()
+ *
+ * Returns zero to indicate success and negative for error
+ */
int (*save_setup)(QEMUFile *f, void *opaque);
+
+ /**
+ * @save_cleanup
+ *
+ * Performs save related cleanup
+ *
+ * @opaque: data pointer passed to register_savevm_live()
+ *
+ * Returns zero to indicate success and negative for error
+ */
void (*save_cleanup)(void *opaque);
+
+ /**
+ * @save_live_complete_postcopy
+ *
+ * Called at the end of postcopy for all postcopyiable devices.
+ *
+ * @f: QEMUFile where to send the data
+ * @opaque: data pointer passed to register_savevm_live()
+ *
+ * Returns zero to indicate success and negative for error
+ */
int (*save_live_complete_postcopy)(QEMUFile *f, void *opaque);
+
+ /**
+ * @save_live_complete_precopy
+ *
+ * Transmits the last section for the device containing any
+ * remaining data.
+ *
+ * @f: QEMUFile where to send the data
+ * @opaque: data pointer passed to register_savevm_live()
+ *
+ * Returns zero to indicate success and negative for error
+ */
int (*save_live_complete_precopy)(QEMUFile *f, void *opaque);
/* This runs both outside and inside the BQL. */
+
+ /**
+ * @is_active
+ *
+ * Will skip a state section if not active
+ *
+ * @opaque: data pointer passed to register_savevm_live()
+ *
+ * Returns true if state section is active else false
+ */
bool (*is_active)(void *opaque);
+
+ /**
+ * @has_postcopy
+ *
+ * checks if a device supports postcopy
+ *
+ * @opaque: data pointer passed to register_savevm_live()
+ *
+ * Returns true for postcopy support else false
+ */
bool (*has_postcopy)(void *opaque);
- /* is_active_iterate
- * If it is not NULL then qemu_savevm_state_iterate will skip iteration if
- * it returns false. For example, it is needed for only-postcopy-states,
- * which needs to be handled by qemu_savevm_state_setup and
- * qemu_savevm_state_pending, but do not need iterations until not in
- * postcopy stage.
+ /**
+ * @is_active_iterate
+ *
+ * As #SaveVMHandlers.is_active(), will skip an inactive state
+ * section in qemu_savevm_state_iterate.
+ *
+ * For example, it is needed for only-postcopy-states, which needs
+ * to be handled by qemu_savevm_state_setup() and
+ * qemu_savevm_state_pending(), but do not need iterations until
+ * not in postcopy stage.
+ *
+ * @opaque: data pointer passed to register_savevm_live()
+ *
+ * Returns true if state section is active else false
*/
bool (*is_active_iterate)(void *opaque);
@@ -48,44 +147,150 @@ typedef struct SaveVMHandlers {
* use data that is local to the migration thread or protected
* by other locks.
*/
+
+ /**
+ * @save_live_iterate
+ *
+ * Should send a chunk of data until the point that stream
+ * bandwidth limits tell it to stop. Each call generates one
+ * section.
+ *
+ * @f: QEMUFile where to send the data
+ * @opaque: data pointer passed to register_savevm_live()
+ *
+ * Returns zero to indicate success and negative for error
+ */
int (*save_live_iterate)(QEMUFile *f, void *opaque);
/* This runs outside the BQL! */
- /* Note for save_live_pending:
- * must_precopy:
- * - must be migrated in precopy or in stopped state
- * - i.e. must be migrated before target start
- *
- * can_postcopy:
- * - can migrate in postcopy or in stopped state
- * - i.e. can migrate after target start
- * - some can also be migrated during precopy (RAM)
- * - some must be migrated after source stops (block-dirty-bitmap)
- *
- * Sum of can_postcopy and must_postcopy is the whole amount of
+
+ /**
+ * @state_pending_estimate
+ *
+ * This estimates the remaining data to transfer
+ *
+ * Sum of @can_postcopy and @must_postcopy is the whole amount of
* pending data.
+ *
+ * @opaque: data pointer passed to register_savevm_live()
+ * @must_precopy: amount of data that must be migrated in precopy
+ * or in stopped state, i.e. that must be migrated
+ * before target start.
+ * @can_postcopy: amount of data that can be migrated in postcopy
+ * or in stopped state, i.e. after target start.
+ * Some can also be migrated during precopy (RAM).
+ * Some must be migrated after source stops
+ * (block-dirty-bitmap)
*/
- /* This estimates the remaining data to transfer */
void (*state_pending_estimate)(void *opaque, uint64_t *must_precopy,
uint64_t *can_postcopy);
- /* This calculate the exact remaining data to transfer */
+
+ /**
+ * @state_pending_exact
+ *
+ * This calculate the exact remaining data to transfer
+ *
+ * Sum of @can_postcopy and @must_postcopy is the whole amount of
+ * pending data.
+ *
+ * @opaque: data pointer passed to register_savevm_live()
+ * @must_precopy: amount of data that must be migrated in precopy
+ * or in stopped state, i.e. that must be migrated
+ * before target start.
+ * @can_postcopy: amount of data that can be migrated in postcopy
+ * or in stopped state, i.e. after target start.
+ * Some can also be migrated during precopy (RAM).
+ * Some must be migrated after source stops
+ * (block-dirty-bitmap)
+ */
void (*state_pending_exact)(void *opaque, uint64_t *must_precopy,
uint64_t *can_postcopy);
+
+ /**
+ * @load_state
+ *
+ * Load sections generated by any of the save functions that
+ * generate sections.
+ *
+ * Legacy method. Should be deprecated when all users are ported
+ * to VMState.
+ *
+ * @f: QEMUFile where to receive the data
+ * @opaque: data pointer passed to register_savevm_live()
+ * @version_id: the maximum version_id supported
+ *
+ * Returns zero to indicate success and negative for error
+ */
int (*load_state)(QEMUFile *f, void *opaque, int version_id);
+
+ /**
+ * @load_setup
+ *
+ * Initializes the data structures on the destination.
+ *
+ * @f: QEMUFile where to receive the data
+ * @opaque: data pointer passed to register_savevm_live()
+ *
+ * Returns zero to indicate success and negative for error
+ */
int (*load_setup)(QEMUFile *f, void *opaque);
+
+ /**
+ * @load_cleanup
+ *
+ * Performs cleanup of load data structures
+ *
+ * @opaque: data pointer passed to register_savevm_live()
+ *
+ * Returns zero to indicate success and negative for error
+ */
int (*load_cleanup)(void *opaque);
- /* Called when postcopy migration wants to resume from failure */
+
+ /**
+ * @resume_prepare
+ *
+ * Called when postcopy migration wants to resume from failure
+ *
+ * @s: Current migration state
+ * @opaque: data pointer passed to register_savevm_live()
+ *
+ * Returns zero to indicate success and negative for error
+ */
int (*resume_prepare)(MigrationState *s, void *opaque);
- /* Checks if switchover ack should be used. Called only in dest */
+
+ /**
+ * @switchover_ack_needed
+ *
+ * Checks if switchover ack should be used. Called only on
+ * destination.
+ *
+ * @opaque: data pointer passed to register_savevm_live()
+ */
bool (*switchover_ack_needed)(void *opaque);
} SaveVMHandlers;
+/**
+ * register_savevm_live: Register a set of custom migration handlers
+ *
+ * @idstr: state section identifier
+ * @instance_id: instance id
+ * @version_id: version id supported
+ * @ops: SaveVMHandlers structurex
+ * @opaque: data pointer passed SaveVMHandlers handlers
+ */
int register_savevm_live(const char *idstr,
uint32_t instance_id,
int version_id,
const SaveVMHandlers *ops,
void *opaque);
+/**
+ * unregister_savevm: Unregister custom migration handlers
+ *
+ * @obj: object associated with state section
+ * @idstr: state section identifier
+ * @opaque: data pointer passed to register_savevm_live()
+ */
void unregister_savevm(VMStateIf *obj, const char *idstr, void *opaque);
#endif
--
2.43.2
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH v2 04/21] migration: Do not call PRECOPY_NOTIFY_SETUP notifiers in case of error
2024-02-27 18:03 [PATCH v2 00/21] migration: Improve error reporting Cédric Le Goater
` (2 preceding siblings ...)
2024-02-27 18:03 ` [PATCH v2 03/21] migration: Add documentation for SaveVMHandlers Cédric Le Goater
@ 2024-02-27 18:03 ` Cédric Le Goater
2024-02-29 4:12 ` Peter Xu
2024-02-27 18:03 ` [PATCH v2 05/21] migration: Add Error** argument to qemu_savevm_state_setup() Cédric Le Goater
` (16 subsequent siblings)
20 siblings, 1 reply; 42+ messages in thread
From: Cédric Le Goater @ 2024-02-27 18:03 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Cédric Le Goater
When commit bd2270608fa0 ("migration/ram.c: add a notifier chain for
precopy") added PRECOPY_NOTIFY_SETUP notifiers at the end of
qemu_savevm_state_setup(), it didn't take into account a possible
error in the loop calling vmstate_save() or .save_setup() handlers.
Check ret value before calling the notifiers.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
migration/savevm.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/migration/savevm.c b/migration/savevm.c
index d612c8a9020b204d5d078d5df85f0e6449c27645..51876f2ef674bb76c7e7ef96e1119a083883deac 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1316,7 +1316,7 @@ void qemu_savevm_state_setup(QEMUFile *f)
MigrationState *ms = migrate_get_current();
SaveStateEntry *se;
Error *local_err = NULL;
- int ret;
+ int ret = 0;
json_writer_int64(ms->vmdesc, "page_size", qemu_target_page_size());
json_writer_start_array(ms->vmdesc, "devices");
@@ -1350,6 +1350,10 @@ void qemu_savevm_state_setup(QEMUFile *f)
}
}
+ if (ret) {
+ return;
+ }
+
if (precopy_notify(PRECOPY_NOTIFY_SETUP, &local_err)) {
error_report_err(local_err);
}
--
2.43.2
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH v2 05/21] migration: Add Error** argument to qemu_savevm_state_setup()
2024-02-27 18:03 [PATCH v2 00/21] migration: Improve error reporting Cédric Le Goater
` (3 preceding siblings ...)
2024-02-27 18:03 ` [PATCH v2 04/21] migration: Do not call PRECOPY_NOTIFY_SETUP notifiers in case of error Cédric Le Goater
@ 2024-02-27 18:03 ` Cédric Le Goater
2024-02-29 4:19 ` Peter Xu
2024-02-27 18:03 ` [PATCH v2 06/21] migration: Add Error** argument to .save_setup() handler Cédric Le Goater
` (15 subsequent siblings)
20 siblings, 1 reply; 42+ messages in thread
From: Cédric Le Goater @ 2024-02-27 18:03 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Cédric Le Goater
This prepares ground for the changes coming next which add an Error**
argument to the .save_setup() handler. Callers of qemu_savevm_state_setup()
now handle the error and fail earlier. This is a functional change
that should be examined closely.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
migration/savevm.h | 2 +-
migration/migration.c | 20 ++++++++++++++++++--
migration/savevm.c | 14 +++++++-------
3 files changed, 26 insertions(+), 10 deletions(-)
diff --git a/migration/savevm.h b/migration/savevm.h
index 74669733dd63a080b765866c703234a5c4939223..9ec96a995c93a42aad621595f0ed58596c532328 100644
--- a/migration/savevm.h
+++ b/migration/savevm.h
@@ -32,7 +32,7 @@
bool qemu_savevm_state_blocked(Error **errp);
void qemu_savevm_non_migratable_list(strList **reasons);
int qemu_savevm_state_prepare(Error **errp);
-void qemu_savevm_state_setup(QEMUFile *f);
+int qemu_savevm_state_setup(QEMUFile *f, Error **errp);
bool qemu_savevm_state_guest_unplug_pending(void);
int qemu_savevm_state_resume_prepare(MigrationState *s);
void qemu_savevm_state_header(QEMUFile *f);
diff --git a/migration/migration.c b/migration/migration.c
index 5316bbe6704742e604ae55dc7b47a4e11e73c2a4..c1a62b696f62c0d5aca0505e58bc4dc0ff561fde 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3314,6 +3314,8 @@ static void *migration_thread(void *opaque)
int64_t setup_start = qemu_clock_get_ms(QEMU_CLOCK_HOST);
MigThrError thr_error;
bool urgent = false;
+ Error *local_err = NULL;
+ int ret;
thread = migration_threads_add("live_migration", qemu_get_thread_id());
@@ -3357,9 +3359,15 @@ static void *migration_thread(void *opaque)
}
bql_lock();
- qemu_savevm_state_setup(s->to_dst_file);
+ ret = qemu_savevm_state_setup(s->to_dst_file, &local_err);
bql_unlock();
+ if (ret) {
+ migrate_set_error(s, local_err);
+ error_free(local_err);
+ goto out;
+ }
+
qemu_savevm_wait_unplug(s, MIGRATION_STATUS_SETUP,
MIGRATION_STATUS_ACTIVE);
@@ -3436,6 +3444,8 @@ static void *bg_migration_thread(void *opaque)
MigThrError thr_error;
QEMUFile *fb;
bool early_fail = true;
+ Error *local_err = NULL;
+ int ret;
rcu_register_thread();
object_ref(OBJECT(s));
@@ -3469,9 +3479,15 @@ static void *bg_migration_thread(void *opaque)
bql_lock();
qemu_savevm_state_header(s->to_dst_file);
- qemu_savevm_state_setup(s->to_dst_file);
+ ret = qemu_savevm_state_setup(s->to_dst_file, &local_err);
bql_unlock();
+ if (ret) {
+ migrate_set_error(s, local_err);
+ error_free(local_err);
+ goto fail;
+ }
+
qemu_savevm_wait_unplug(s, MIGRATION_STATUS_SETUP,
MIGRATION_STATUS_ACTIVE);
diff --git a/migration/savevm.c b/migration/savevm.c
index 51876f2ef674bb76c7e7ef96e1119a083883deac..bc168371a31acf85f29f2c284be181250db45df4 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1311,11 +1311,10 @@ int qemu_savevm_state_prepare(Error **errp)
return 0;
}
-void qemu_savevm_state_setup(QEMUFile *f)
+int qemu_savevm_state_setup(QEMUFile *f, Error **errp)
{
MigrationState *ms = migrate_get_current();
SaveStateEntry *se;
- Error *local_err = NULL;
int ret = 0;
json_writer_int64(ms->vmdesc, "page_size", qemu_target_page_size());
@@ -1351,12 +1350,10 @@ void qemu_savevm_state_setup(QEMUFile *f)
}
if (ret) {
- return;
+ return ret;
}
- if (precopy_notify(PRECOPY_NOTIFY_SETUP, &local_err)) {
- error_report_err(local_err);
- }
+ return precopy_notify(PRECOPY_NOTIFY_SETUP, errp);
}
int qemu_savevm_state_resume_prepare(MigrationState *s)
@@ -1725,7 +1722,10 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
ms->to_dst_file = f;
qemu_savevm_state_header(f);
- qemu_savevm_state_setup(f);
+ ret = qemu_savevm_state_setup(f, errp);
+ if (ret) {
+ return ret;
+ }
while (qemu_file_get_error(f) == 0) {
if (qemu_savevm_state_iterate(f, false) > 0) {
--
2.43.2
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH v2 06/21] migration: Add Error** argument to .save_setup() handler
2024-02-27 18:03 [PATCH v2 00/21] migration: Improve error reporting Cédric Le Goater
` (4 preceding siblings ...)
2024-02-27 18:03 ` [PATCH v2 05/21] migration: Add Error** argument to qemu_savevm_state_setup() Cédric Le Goater
@ 2024-02-27 18:03 ` Cédric Le Goater
2024-02-29 6:32 ` Markus Armbruster
2024-02-27 18:03 ` [PATCH v2 07/21] migration: Add Error** argument to .load_setup() handler Cédric Le Goater
` (14 subsequent siblings)
20 siblings, 1 reply; 42+ messages in thread
From: Cédric Le Goater @ 2024-02-27 18:03 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Cédric Le Goater,
Nicholas Piggin, Harsh Prateek Bora, Halil Pasic, Thomas Huth,
Eric Blake, Vladimir Sementsov-Ogievskiy, John Snow,
Stefan Hajnoczi
The purpose is to record a potential error in the migration stream if
qemu_savevm_state_setup() fails. Most of the current .save_setup()
handlers can be modified to use the Error argument instead of managing
their own and calling locally error_report(). The following patches
will introduce such changes for VFIO first.
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Harsh Prateek Bora <harshpb@linux.ibm.com>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Thomas Huth <thuth@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Cc: John Snow <jsnow@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
Changes in v2:
- dropped qemu_file_set_error_obj(f, ret, local_err);
include/migration/register.h | 3 ++-
hw/ppc/spapr.c | 2 +-
hw/s390x/s390-stattrib.c | 2 +-
hw/vfio/migration.c | 2 +-
migration/block-dirty-bitmap.c | 2 +-
migration/block.c | 2 +-
migration/ram.c | 3 ++-
migration/savevm.c | 2 +-
8 files changed, 10 insertions(+), 8 deletions(-)
diff --git a/include/migration/register.h b/include/migration/register.h
index 2cc71ec45f65bf2884c9e7a823d2968752f15c20..96eae9dba2970552c379c732393e3ab6ef578a58 100644
--- a/include/migration/register.h
+++ b/include/migration/register.h
@@ -60,10 +60,11 @@ typedef struct SaveVMHandlers {
*
* @f: QEMUFile where to send the data
* @opaque: data pointer passed to register_savevm_live()
+ * @errp: pointer to Error*, to store an error if it happens.
*
* Returns zero to indicate success and negative for error
*/
- int (*save_setup)(QEMUFile *f, void *opaque);
+ int (*save_setup)(QEMUFile *f, void *opaque, Error **errp);
/**
* @save_cleanup
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 55263f0815ed7671b32ea20b394ae71c82e616cb..045c024ffa76eacfc496bd486cb6cafbee2df73e 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2142,7 +2142,7 @@ static const VMStateDescription vmstate_spapr = {
}
};
-static int htab_save_setup(QEMUFile *f, void *opaque)
+static int htab_save_setup(QEMUFile *f, void *opaque, Error **errp)
{
SpaprMachineState *spapr = opaque;
diff --git a/hw/s390x/s390-stattrib.c b/hw/s390x/s390-stattrib.c
index c483b62a9b5f71772639fc180bdad15ecb6711cb..c934df424a555d83d2198f5ddfc0cbe0ea98e9ec 100644
--- a/hw/s390x/s390-stattrib.c
+++ b/hw/s390x/s390-stattrib.c
@@ -166,7 +166,7 @@ static int cmma_load(QEMUFile *f, void *opaque, int version_id)
return ret;
}
-static int cmma_save_setup(QEMUFile *f, void *opaque)
+static int cmma_save_setup(QEMUFile *f, void *opaque, Error **errp)
{
S390StAttribState *sas = S390_STATTRIB(opaque);
S390StAttribClass *sac = S390_STATTRIB_GET_CLASS(sas);
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 70e6b1a709f9b67e4c9eb41033d76347275cac42..8bcb4bc73cd5ba5338e3ffa4d907d0e6bfbb9485 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -378,7 +378,7 @@ static int vfio_save_prepare(void *opaque, Error **errp)
return 0;
}
-static int vfio_save_setup(QEMUFile *f, void *opaque)
+static int vfio_save_setup(QEMUFile *f, void *opaque, Error **errp)
{
VFIODevice *vbasedev = opaque;
VFIOMigration *migration = vbasedev->migration;
diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 2708abf3d762de774ed294d3fdb8e56690d2974c..16f84e6c57c2403a8c2d6319f4e7b6360dade28c 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -1213,7 +1213,7 @@ fail:
return ret;
}
-static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque)
+static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque, Error **errp)
{
DBMSaveState *s = &((DBMState *)opaque)->save;
SaveBitmapState *dbms = NULL;
diff --git a/migration/block.c b/migration/block.c
index 8c6ebafacc1ffe930d1d4f19d968817b14852c69..df15319ceab66201b043f15eac1b0a7d6522b60c 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -708,7 +708,7 @@ static void block_migration_cleanup(void *opaque)
blk_mig_unlock();
}
-static int block_save_setup(QEMUFile *f, void *opaque)
+static int block_save_setup(QEMUFile *f, void *opaque, Error **errp)
{
int ret;
diff --git a/migration/ram.c b/migration/ram.c
index 4649a8120492a03d331d660622e1a0a51adb0a96..745482899e18c86b73261b683c1bec04039a76d2 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2930,8 +2930,9 @@ void qemu_guest_free_page_hint(void *addr, size_t len)
*
* @f: QEMUFile where to send the data
* @opaque: RAMState pointer
+ * @errp: pointer to Error*, to store an error if it happens.
*/
-static int ram_save_setup(QEMUFile *f, void *opaque)
+static int ram_save_setup(QEMUFile *f, void *opaque, Error **errp)
{
RAMState **rsp = opaque;
RAMBlock *block;
diff --git a/migration/savevm.c b/migration/savevm.c
index bc168371a31acf85f29f2c284be181250db45df4..b5b3b51bad94dc4c04ae22cd687ba111299339aa 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1341,7 +1341,7 @@ int qemu_savevm_state_setup(QEMUFile *f, Error **errp)
}
save_section_header(f, se, QEMU_VM_SECTION_START);
- ret = se->ops->save_setup(f, se->opaque);
+ ret = se->ops->save_setup(f, se->opaque, errp);
save_section_footer(f, se);
if (ret < 0) {
qemu_file_set_error(f, ret);
--
2.43.2
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH v2 07/21] migration: Add Error** argument to .load_setup() handler
2024-02-27 18:03 [PATCH v2 00/21] migration: Improve error reporting Cédric Le Goater
` (5 preceding siblings ...)
2024-02-27 18:03 ` [PATCH v2 06/21] migration: Add Error** argument to .save_setup() handler Cédric Le Goater
@ 2024-02-27 18:03 ` Cédric Le Goater
2024-02-29 4:21 ` Peter Xu
2024-02-27 18:03 ` [PATCH v2 08/21] memory: Add Error** argument to .log_global*() handlers Cédric Le Goater
` (13 subsequent siblings)
20 siblings, 1 reply; 42+ messages in thread
From: Cédric Le Goater @ 2024-02-27 18:03 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Cédric Le Goater
This will be useful to report errors at a higher level, mostly in VFIO
today.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
include/migration/register.h | 3 ++-
hw/vfio/migration.c | 2 +-
migration/ram.c | 3 ++-
migration/savevm.c | 10 ++++++----
4 files changed, 11 insertions(+), 7 deletions(-)
diff --git a/include/migration/register.h b/include/migration/register.h
index 96eae9dba2970552c379c732393e3ab6ef578a58..2cfc167f717de8e08c1ca8accdc3011c03eb1554 100644
--- a/include/migration/register.h
+++ b/include/migration/register.h
@@ -231,10 +231,11 @@ typedef struct SaveVMHandlers {
*
* @f: QEMUFile where to receive the data
* @opaque: data pointer passed to register_savevm_live()
+ * @errp: pointer to Error*, to store an error if it happens.
*
* Returns zero to indicate success and negative for error
*/
- int (*load_setup)(QEMUFile *f, void *opaque);
+ int (*load_setup)(QEMUFile *f, void *opaque, Error **errp);
/**
* @load_cleanup
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 8bcb4bc73cd5ba5338e3ffa4d907d0e6bfbb9485..2dfbe671f6f45aa530c7341177bb532d8292cecd 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -580,7 +580,7 @@ static void vfio_save_state(QEMUFile *f, void *opaque)
}
}
-static int vfio_load_setup(QEMUFile *f, void *opaque)
+static int vfio_load_setup(QEMUFile *f, void *opaque, Error **errp)
{
VFIODevice *vbasedev = opaque;
diff --git a/migration/ram.c b/migration/ram.c
index 745482899e18c86b73261b683c1bec04039a76d2..d648134133fc22cd91c7b2064198a90287ee733d 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3498,8 +3498,9 @@ void colo_release_ram_cache(void)
*
* @f: QEMUFile where to receive the data
* @opaque: RAMState pointer
+ * @errp: pointer to Error*, to store an error if it happens.
*/
-static int ram_load_setup(QEMUFile *f, void *opaque)
+static int ram_load_setup(QEMUFile *f, void *opaque, Error **errp)
{
xbzrle_load_setup();
ramblock_recv_map_init();
diff --git a/migration/savevm.c b/migration/savevm.c
index b5b3b51bad94dc4c04ae22cd687ba111299339aa..a4ef41d3ff5b471a1cd4166c2dc5813e44ea3a5a 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2741,7 +2741,7 @@ static void qemu_loadvm_state_switchover_ack_needed(MigrationIncomingState *mis)
trace_loadvm_state_switchover_ack_needed(mis->switchover_ack_pending_num);
}
-static int qemu_loadvm_state_setup(QEMUFile *f)
+static int qemu_loadvm_state_setup(QEMUFile *f, Error **errp)
{
SaveStateEntry *se;
int ret;
@@ -2757,10 +2757,11 @@ static int qemu_loadvm_state_setup(QEMUFile *f)
}
}
- ret = se->ops->load_setup(f, se->opaque);
+ ret = se->ops->load_setup(f, se->opaque, errp);
if (ret < 0) {
+ error_prepend(errp, "Load state of device %s failed: ",
+ se->idstr);
qemu_file_set_error(f, ret);
- error_report("Load state of device %s failed", se->idstr);
return ret;
}
}
@@ -2941,7 +2942,8 @@ int qemu_loadvm_state(QEMUFile *f)
return ret;
}
- if (qemu_loadvm_state_setup(f) != 0) {
+ if (qemu_loadvm_state_setup(f, &local_err) != 0) {
+ error_report_err(local_err);
return -EINVAL;
}
--
2.43.2
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH v2 08/21] memory: Add Error** argument to .log_global*() handlers
2024-02-27 18:03 [PATCH v2 00/21] migration: Improve error reporting Cédric Le Goater
` (6 preceding siblings ...)
2024-02-27 18:03 ` [PATCH v2 07/21] migration: Add Error** argument to .load_setup() handler Cédric Le Goater
@ 2024-02-27 18:03 ` Cédric Le Goater
2024-02-29 5:29 ` Peter Xu
2024-02-27 18:03 ` [PATCH v2 09/21] memory: Add Error** argument to the global_dirty_log routines Cédric Le Goater
` (12 subsequent siblings)
20 siblings, 1 reply; 42+ messages in thread
From: Cédric Le Goater @ 2024-02-27 18:03 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Cédric Le Goater,
Stefano Stabellini, Anthony Perard, Paul Durrant,
Michael S . Tsirkin, Paolo Bonzini, David Hildenbrand
Modify all log_global*() handlers to take an Error** parameter and
return a bool. A new MEMORY_LISTENER_CALL_LOG_GLOBAL macro looping on
the listeners is introduced to handle a possible error, which will
would interrupt the loop if necessary.
To be noted a change in memory_global_dirty_log_start() behavior as it
will return as soon as an error is detected.
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Paul Durrant <paul@xen.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
include/exec/memory.h | 15 ++++++--
hw/i386/xen/xen-hvm.c | 6 ++--
hw/vfio/common.c | 8 +++--
hw/virtio/vhost.c | 6 ++--
system/memory.c | 83 +++++++++++++++++++++++++++++++++++++------
system/physmem.c | 5 +--
6 files changed, 101 insertions(+), 22 deletions(-)
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 8626a355b310ed7b1a1db7978ba4b394032c2f15..4bc146c5ebdd377cd14a4e462f32cc945db5a0a8 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -998,8 +998,11 @@ struct MemoryListener {
* active at that time.
*
* @listener: The #MemoryListener.
+ * @errp: pointer to Error*, to store an error if it happens.
+ *
+ * Return: true on success, else false setting @errp with error.
*/
- void (*log_global_start)(MemoryListener *listener);
+ bool (*log_global_start)(MemoryListener *listener, Error **errp);
/**
* @log_global_stop:
@@ -1009,8 +1012,11 @@ struct MemoryListener {
* the address space.
*
* @listener: The #MemoryListener.
+ * @errp: pointer to Error*, to store an error if it happens.
+ *
+ * Return: true on success, else false setting @errp with error.
*/
- void (*log_global_stop)(MemoryListener *listener);
+ bool (*log_global_stop)(MemoryListener *listener, Error **errp);
/**
* @log_global_after_sync:
@@ -1019,8 +1025,11 @@ struct MemoryListener {
* for any #MemoryRegionSection.
*
* @listener: The #MemoryListener.
+ * @errp: pointer to Error*, to store an error if it happens.
+ *
+ * Return: true on success, else false setting @errp with error.
*/
- void (*log_global_after_sync)(MemoryListener *listener);
+ bool (*log_global_after_sync)(MemoryListener *listener, Error **errp);
/**
* @eventfd_add:
diff --git a/hw/i386/xen/xen-hvm.c b/hw/i386/xen/xen-hvm.c
index f42621e6742552035122ea58092c91c3458338ff..925a207b494b4eed52d5f360b554f18ac8a9806d 100644
--- a/hw/i386/xen/xen-hvm.c
+++ b/hw/i386/xen/xen-hvm.c
@@ -446,16 +446,18 @@ static void xen_log_sync(MemoryListener *listener, MemoryRegionSection *section)
int128_get64(section->size));
}
-static void xen_log_global_start(MemoryListener *listener)
+static bool xen_log_global_start(MemoryListener *listener, Error **errp)
{
if (xen_enabled()) {
xen_in_migration = true;
}
+ return true;
}
-static void xen_log_global_stop(MemoryListener *listener)
+static bool xen_log_global_stop(MemoryListener *listener, Error **errp)
{
xen_in_migration = false;
+ return true;
}
static const MemoryListener xen_memory_listener = {
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 059bfdc07a85e2eb908df828c1f42104d683e911..8bba95ba6a2010b78cae54c6905857686bbb6309 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1075,7 +1075,8 @@ out:
return ret;
}
-static void vfio_listener_log_global_start(MemoryListener *listener)
+static bool vfio_listener_log_global_start(MemoryListener *listener,
+ Error **errp)
{
VFIOContainerBase *bcontainer = container_of(listener, VFIOContainerBase,
listener);
@@ -1092,9 +1093,11 @@ static void vfio_listener_log_global_start(MemoryListener *listener)
ret, strerror(-ret));
vfio_set_migration_error(ret);
}
+ return !!ret;
}
-static void vfio_listener_log_global_stop(MemoryListener *listener)
+static bool vfio_listener_log_global_stop(MemoryListener *listener,
+ Error **errp)
{
VFIOContainerBase *bcontainer = container_of(listener, VFIOContainerBase,
listener);
@@ -1111,6 +1114,7 @@ static void vfio_listener_log_global_stop(MemoryListener *listener)
ret, strerror(-ret));
vfio_set_migration_error(ret);
}
+ return !!ret;
}
static int vfio_device_dma_logging_report(VFIODevice *vbasedev, hwaddr iova,
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 2c9ac794680ea9b65eba6cc22e70cf141e90aa73..7a555f941934991a72a2817e5505fe0ce6d6fc64 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1044,7 +1044,7 @@ check_dev_state:
return r;
}
-static void vhost_log_global_start(MemoryListener *listener)
+static bool vhost_log_global_start(MemoryListener *listener, Error **errp)
{
int r;
@@ -1052,9 +1052,10 @@ static void vhost_log_global_start(MemoryListener *listener)
if (r < 0) {
abort();
}
+ return true;
}
-static void vhost_log_global_stop(MemoryListener *listener)
+static bool vhost_log_global_stop(MemoryListener *listener, Error **errp)
{
int r;
@@ -1062,6 +1063,7 @@ static void vhost_log_global_stop(MemoryListener *listener)
if (r < 0) {
abort();
}
+ return true;
}
static void vhost_log_start(MemoryListener *listener,
diff --git a/system/memory.c b/system/memory.c
index a229a79988fce2aa3cb77e3a130db4c694e8cd49..af06157ead5b1272548e87f79ab9fb3036055328 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -127,6 +127,35 @@ enum ListenerDirection { Forward, Reverse };
} \
} while (0)
+#define MEMORY_LISTENER_CALL_LOG_GLOBAL(_callback, _direction, _errp, \
+ _args...) \
+ do { \
+ MemoryListener *_listener; \
+ \
+ switch (_direction) { \
+ case Forward: \
+ QTAILQ_FOREACH(_listener, &memory_listeners, link) { \
+ if (_listener->_callback) { \
+ if (!_listener->_callback(_listener, _errp, ##_args)) { \
+ break; \
+ } \
+ } \
+ } \
+ break; \
+ case Reverse: \
+ QTAILQ_FOREACH_REVERSE(_listener, &memory_listeners, link) { \
+ if (_listener->_callback) { \
+ if (!_listener->_callback(_listener, _errp, ##_args)) { \
+ break; \
+ } \
+ } \
+ } \
+ break; \
+ default: \
+ abort(); \
+ }; \
+ } while (0)
+
#define MEMORY_LISTENER_CALL(_as, _callback, _direction, _section, _args...) \
do { \
MemoryListener *_listener; \
@@ -2903,7 +2932,13 @@ void memory_global_dirty_log_sync(bool last_stage)
void memory_global_after_dirty_log_sync(void)
{
- MEMORY_LISTENER_CALL_GLOBAL(log_global_after_sync, Forward);
+ Error *local_err = NULL;
+
+ MEMORY_LISTENER_CALL_LOG_GLOBAL(log_global_after_sync, Forward,
+ &local_err);
+ if (local_err) {
+ error_report_err(local_err);
+ }
}
/*
@@ -2912,18 +2947,22 @@ void memory_global_after_dirty_log_sync(void)
*/
static unsigned int postponed_stop_flags;
static VMChangeStateEntry *vmstate_change;
-static void memory_global_dirty_log_stop_postponed_run(void);
+static bool memory_global_dirty_log_stop_postponed_run(Error **errp);
void memory_global_dirty_log_start(unsigned int flags)
{
unsigned int old_flags;
+ Error *local_err = NULL;
assert(flags && !(flags & (~GLOBAL_DIRTY_MASK)));
if (vmstate_change) {
/* If there is postponed stop(), operate on it first */
postponed_stop_flags &= ~flags;
- memory_global_dirty_log_stop_postponed_run();
+ if (!memory_global_dirty_log_stop_postponed_run(&local_err)) {
+ error_report_err(local_err);
+ return;
+ }
}
flags &= ~global_dirty_tracking;
@@ -2936,15 +2975,22 @@ void memory_global_dirty_log_start(unsigned int flags)
trace_global_dirty_changed(global_dirty_tracking);
if (!old_flags) {
- MEMORY_LISTENER_CALL_GLOBAL(log_global_start, Forward);
+ MEMORY_LISTENER_CALL_LOG_GLOBAL(log_global_start, Forward,
+ &local_err);
+ if (local_err) {
+ error_report_err(local_err);
+ return;
+ }
memory_region_transaction_begin();
memory_region_update_pending = true;
memory_region_transaction_commit();
}
}
-static void memory_global_dirty_log_do_stop(unsigned int flags)
+static bool memory_global_dirty_log_do_stop(unsigned int flags, Error **errp)
{
+ ERRP_GUARD();
+
assert(flags && !(flags & (~GLOBAL_DIRTY_MASK)));
assert((global_dirty_tracking & flags) == flags);
global_dirty_tracking &= ~flags;
@@ -2955,39 +3001,49 @@ static void memory_global_dirty_log_do_stop(unsigned int flags)
memory_region_transaction_begin();
memory_region_update_pending = true;
memory_region_transaction_commit();
- MEMORY_LISTENER_CALL_GLOBAL(log_global_stop, Reverse);
+ MEMORY_LISTENER_CALL_LOG_GLOBAL(log_global_stop, Reverse, errp);
}
+ return !*errp;
}
/*
* Execute the postponed dirty log stop operations if there is, then reset
* everything (including the flags and the vmstate change hook).
*/
-static void memory_global_dirty_log_stop_postponed_run(void)
+static bool memory_global_dirty_log_stop_postponed_run(Error **errp)
{
+ bool ret = true;
+
/* This must be called with the vmstate handler registered */
assert(vmstate_change);
/* Note: postponed_stop_flags can be cleared in log start routine */
if (postponed_stop_flags) {
- memory_global_dirty_log_do_stop(postponed_stop_flags);
+ ret = memory_global_dirty_log_do_stop(postponed_stop_flags, errp);
postponed_stop_flags = 0;
}
qemu_del_vm_change_state_handler(vmstate_change);
vmstate_change = NULL;
+ return ret;
}
static void memory_vm_change_state_handler(void *opaque, bool running,
RunState state)
{
+ Error *local_err = NULL;
+
if (running) {
- memory_global_dirty_log_stop_postponed_run();
+ if (!memory_global_dirty_log_stop_postponed_run(&local_err)) {
+ error_report_err(local_err);
+ }
}
}
void memory_global_dirty_log_stop(unsigned int flags)
{
+ Error *local_err = NULL;
+
if (!runstate_is_running()) {
/* Postpone the dirty log stop, e.g., to when VM starts again */
if (vmstate_change) {
@@ -3001,7 +3057,9 @@ void memory_global_dirty_log_stop(unsigned int flags)
return;
}
- memory_global_dirty_log_do_stop(flags);
+ if (!memory_global_dirty_log_do_stop(flags, &local_err)) {
+ error_report_err(local_err);
+ }
}
static void listener_add_address_space(MemoryListener *listener,
@@ -3009,13 +3067,16 @@ static void listener_add_address_space(MemoryListener *listener,
{
FlatView *view;
FlatRange *fr;
+ Error *local_err = NULL;
if (listener->begin) {
listener->begin(listener);
}
if (global_dirty_tracking) {
if (listener->log_global_start) {
- listener->log_global_start(listener);
+ if (!listener->log_global_start(listener, &local_err)) {
+ error_report_err(local_err);
+ }
}
}
diff --git a/system/physmem.c b/system/physmem.c
index e3ebc19eefd8050a1dee16e3d1449f0c144f751f..9adbf9aea847cd80bdac6dca466fb476844ac048 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -148,7 +148,7 @@ typedef struct subpage_t {
static void io_mem_init(void);
static void memory_map_init(void);
-static void tcg_log_global_after_sync(MemoryListener *listener);
+static bool tcg_log_global_after_sync(MemoryListener *listener, Error **errp);
static void tcg_commit(MemoryListener *listener);
/**
@@ -2475,7 +2475,7 @@ static void do_nothing(CPUState *cpu, run_on_cpu_data d)
{
}
-static void tcg_log_global_after_sync(MemoryListener *listener)
+static bool tcg_log_global_after_sync(MemoryListener *listener, Error **errp)
{
CPUAddressSpace *cpuas;
@@ -2507,6 +2507,7 @@ static void tcg_log_global_after_sync(MemoryListener *listener)
cpuas = container_of(listener, CPUAddressSpace, tcg_as_listener);
run_on_cpu(cpuas->cpu, do_nothing, RUN_ON_CPU_NULL);
}
+ return true;
}
static void tcg_commit_cpu(CPUState *cpu, run_on_cpu_data data)
--
2.43.2
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH v2 09/21] memory: Add Error** argument to the global_dirty_log routines
2024-02-27 18:03 [PATCH v2 00/21] migration: Improve error reporting Cédric Le Goater
` (7 preceding siblings ...)
2024-02-27 18:03 ` [PATCH v2 08/21] memory: Add Error** argument to .log_global*() handlers Cédric Le Goater
@ 2024-02-27 18:03 ` Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 10/21] migration: Modify ram_init_bitmaps() to report dirty tracking errors Cédric Le Goater
` (11 subsequent siblings)
20 siblings, 0 replies; 42+ messages in thread
From: Cédric Le Goater @ 2024-02-27 18:03 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Cédric Le Goater,
Stefano Stabellini, Anthony Perard, Paul Durrant,
Michael S . Tsirkin, Paolo Bonzini, David Hildenbrand,
Hyman Huang
Now that the log_global*() handlers take an Error** parameter and
return a bool, do the same for memory_global_dirty_log_start() and
memory_global_dirty_log_stop(). The error is reported in the callers
for now and it will be propagated in the call stack in the next
changes.
To be noted a functional change in ram_init_bitmaps(), if the dirty
pages logger fails to start, there is no need to synchronize the dirty
pages bitmaps. colo_incoming_start_dirty_log() could be modified in a
similar way.
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Paul Durrant <paul@xen.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hyman Huang <yong.huang@smartx.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
include/exec/memory.h | 10 ++++++++--
hw/i386/xen/xen-hvm.c | 4 ++--
migration/dirtyrate.c | 21 +++++++++++++++++----
migration/ram.c | 34 ++++++++++++++++++++++++++++++----
system/memory.c | 30 ++++++++++++------------------
5 files changed, 69 insertions(+), 30 deletions(-)
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 4bc146c5ebdd377cd14a4e462f32cc945db5a0a8..8b019465ab13ce85c03075c80865a0865ea1feed 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -2576,15 +2576,21 @@ void memory_listener_unregister(MemoryListener *listener);
* memory_global_dirty_log_start: begin dirty logging for all regions
*
* @flags: purpose of starting dirty log, migration or dirty rate
+ * @errp: pointer to Error*, to store an error if it happens.
+ *
+ * Return: true on success, else false setting @errp with error.
*/
-void memory_global_dirty_log_start(unsigned int flags);
+bool memory_global_dirty_log_start(unsigned int flags, Error **errp);
/**
* memory_global_dirty_log_stop: end dirty logging for all regions
*
* @flags: purpose of stopping dirty log, migration or dirty rate
+ * @errp: pointer to Error*, to store an error if it happens.
+ *
+ * Return: true on success, else false setting @errp with error.
*/
-void memory_global_dirty_log_stop(unsigned int flags);
+bool memory_global_dirty_log_stop(unsigned int flags, Error **errp);
void mtree_info(bool flatview, bool dispatch_tree, bool owner, bool disabled);
diff --git a/hw/i386/xen/xen-hvm.c b/hw/i386/xen/xen-hvm.c
index 925a207b494b4eed52d5f360b554f18ac8a9806d..286269b47572d90e57df5ff44835bb5f8e16c7ad 100644
--- a/hw/i386/xen/xen-hvm.c
+++ b/hw/i386/xen/xen-hvm.c
@@ -655,9 +655,9 @@ void xen_hvm_modified_memory(ram_addr_t start, ram_addr_t length)
void qmp_xen_set_global_dirty_log(bool enable, Error **errp)
{
if (enable) {
- memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION);
+ memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION, errp);
} else {
- memory_global_dirty_log_stop(GLOBAL_DIRTY_MIGRATION);
+ memory_global_dirty_log_stop(GLOBAL_DIRTY_MIGRATION, errp);
}
}
diff --git a/migration/dirtyrate.c b/migration/dirtyrate.c
index 1d2e85746fb7b10eb7f149976970f9a92125af8a..34f6d803ff5f4e6ccf2e06aaaed65a336c4be469 100644
--- a/migration/dirtyrate.c
+++ b/migration/dirtyrate.c
@@ -90,11 +90,17 @@ static int64_t do_calculate_dirtyrate(DirtyPageRecord dirty_pages,
void global_dirty_log_change(unsigned int flag, bool start)
{
+ Error *local_err = NULL;
+ bool ret;
+
bql_lock();
if (start) {
- memory_global_dirty_log_start(flag);
+ ret = memory_global_dirty_log_start(flag, &local_err);
} else {
- memory_global_dirty_log_stop(flag);
+ ret = memory_global_dirty_log_stop(flag, &local_err);
+ }
+ if (!ret) {
+ error_report_err(local_err);
}
bql_unlock();
}
@@ -106,10 +112,14 @@ void global_dirty_log_change(unsigned int flag, bool start)
*/
static void global_dirty_log_sync(unsigned int flag, bool one_shot)
{
+ Error *local_err = NULL;
+
bql_lock();
memory_global_dirty_log_sync(false);
if (one_shot) {
- memory_global_dirty_log_stop(flag);
+ if (!memory_global_dirty_log_stop(flag, &local_err)) {
+ error_report_err(local_err);
+ }
}
bql_unlock();
}
@@ -608,9 +618,12 @@ static void calculate_dirtyrate_dirty_bitmap(struct DirtyRateConfig config)
{
int64_t start_time;
DirtyPageRecord dirty_pages;
+ Error *local_err = NULL;
bql_lock();
- memory_global_dirty_log_start(GLOBAL_DIRTY_DIRTY_RATE);
+ if (!memory_global_dirty_log_start(GLOBAL_DIRTY_DIRTY_RATE, &local_err)) {
+ error_report_err(local_err);
+ }
/*
* 1'round of log sync may return all 1 bits with
diff --git a/migration/ram.c b/migration/ram.c
index d648134133fc22cd91c7b2064198a90287ee733d..9fb1875aad73b2fa009199bdfa8960339df7287d 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2391,6 +2391,7 @@ static void ram_save_cleanup(void *opaque)
{
RAMState **rsp = opaque;
RAMBlock *block;
+ Error *local_err = NULL;
/* We don't use dirty log with background snapshots */
if (!migrate_background_snapshot()) {
@@ -2403,7 +2404,10 @@ static void ram_save_cleanup(void *opaque)
* memory_global_dirty_log_stop will assert that
* memory_global_dirty_log_start/stop used in pairs
*/
- memory_global_dirty_log_stop(GLOBAL_DIRTY_MIGRATION);
+ if (!memory_global_dirty_log_stop(GLOBAL_DIRTY_MIGRATION,
+ &local_err)) {
+ error_report_err(local_err);
+ }
}
}
@@ -2800,18 +2804,31 @@ static void migration_bitmap_clear_discarded_pages(RAMState *rs)
static void ram_init_bitmaps(RAMState *rs)
{
+ Error *local_err = NULL;
+ bool ret = true;
+
qemu_mutex_lock_ramlist();
WITH_RCU_READ_LOCK_GUARD() {
ram_list_init_bitmaps();
/* We don't use dirty log with background snapshots */
if (!migrate_background_snapshot()) {
- memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION);
+ ret = memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION,
+ &local_err);
+ if (!ret) {
+ error_report_err(local_err);
+ goto out_unlock;
+ }
migration_bitmap_sync_precopy(rs, false);
}
}
+out_unlock:
qemu_mutex_unlock_ramlist();
+ if (!ret) {
+ return;
+ }
+
/*
* After an eventual first bitmap sync, fixup the initial bitmap
* containing all 1s to exclude any discarded pages from migration.
@@ -3451,6 +3468,8 @@ int colo_init_ram_cache(void)
void colo_incoming_start_dirty_log(void)
{
RAMBlock *block = NULL;
+ Error *local_err = NULL;
+
/* For memory_global_dirty_log_start below. */
bql_lock();
qemu_mutex_lock_ramlist();
@@ -3462,7 +3481,10 @@ void colo_incoming_start_dirty_log(void)
/* Discard this dirty bitmap record */
bitmap_zero(block->bmap, block->max_length >> TARGET_PAGE_BITS);
}
- memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION);
+ if (!memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION,
+ &local_err)) {
+ error_report_err(local_err);
+ }
}
ram_state->migration_dirty_pages = 0;
qemu_mutex_unlock_ramlist();
@@ -3473,8 +3495,12 @@ void colo_incoming_start_dirty_log(void)
void colo_release_ram_cache(void)
{
RAMBlock *block;
+ Error *local_err = NULL;
+
+ if (!memory_global_dirty_log_stop(GLOBAL_DIRTY_MIGRATION, &local_err)) {
+ error_report_err(local_err);
+ }
- memory_global_dirty_log_stop(GLOBAL_DIRTY_MIGRATION);
RAMBLOCK_FOREACH_NOT_IGNORED(block) {
g_free(block->bmap);
block->bmap = NULL;
diff --git a/system/memory.c b/system/memory.c
index af06157ead5b1272548e87f79ab9fb3036055328..48aed0f8ece1c731849636c442b8ab8e5d7ff6a5 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -2949,25 +2949,24 @@ static unsigned int postponed_stop_flags;
static VMChangeStateEntry *vmstate_change;
static bool memory_global_dirty_log_stop_postponed_run(Error **errp);
-void memory_global_dirty_log_start(unsigned int flags)
+bool memory_global_dirty_log_start(unsigned int flags, Error **errp)
{
+ ERRP_GUARD();
unsigned int old_flags;
- Error *local_err = NULL;
assert(flags && !(flags & (~GLOBAL_DIRTY_MASK)));
if (vmstate_change) {
/* If there is postponed stop(), operate on it first */
postponed_stop_flags &= ~flags;
- if (!memory_global_dirty_log_stop_postponed_run(&local_err)) {
- error_report_err(local_err);
- return;
+ if (!memory_global_dirty_log_stop_postponed_run(errp)) {
+ return false;
}
}
flags &= ~global_dirty_tracking;
if (!flags) {
- return;
+ return true;
}
old_flags = global_dirty_tracking;
@@ -2975,16 +2974,15 @@ void memory_global_dirty_log_start(unsigned int flags)
trace_global_dirty_changed(global_dirty_tracking);
if (!old_flags) {
- MEMORY_LISTENER_CALL_LOG_GLOBAL(log_global_start, Forward,
- &local_err);
- if (local_err) {
- error_report_err(local_err);
- return;
+ MEMORY_LISTENER_CALL_LOG_GLOBAL(log_global_start, Forward, errp);
+ if (*errp) {
+ return false;
}
memory_region_transaction_begin();
memory_region_update_pending = true;
memory_region_transaction_commit();
}
+ return true;
}
static bool memory_global_dirty_log_do_stop(unsigned int flags, Error **errp)
@@ -3040,10 +3038,8 @@ static void memory_vm_change_state_handler(void *opaque, bool running,
}
}
-void memory_global_dirty_log_stop(unsigned int flags)
+bool memory_global_dirty_log_stop(unsigned int flags, Error **errp)
{
- Error *local_err = NULL;
-
if (!runstate_is_running()) {
/* Postpone the dirty log stop, e.g., to when VM starts again */
if (vmstate_change) {
@@ -3054,12 +3050,10 @@ void memory_global_dirty_log_stop(unsigned int flags)
vmstate_change = qemu_add_vm_change_state_handler(
memory_vm_change_state_handler, NULL);
}
- return;
+ return true;
}
- if (!memory_global_dirty_log_do_stop(flags, &local_err)) {
- error_report_err(local_err);
- }
+ return memory_global_dirty_log_do_stop(flags, errp);
}
static void listener_add_address_space(MemoryListener *listener,
--
2.43.2
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH v2 10/21] migration: Modify ram_init_bitmaps() to report dirty tracking errors
2024-02-27 18:03 [PATCH v2 00/21] migration: Improve error reporting Cédric Le Goater
` (8 preceding siblings ...)
2024-02-27 18:03 ` [PATCH v2 09/21] memory: Add Error** argument to the global_dirty_log routines Cédric Le Goater
@ 2024-02-27 18:03 ` Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 11/21] migration: Fix migration termination Cédric Le Goater
` (10 subsequent siblings)
20 siblings, 0 replies; 42+ messages in thread
From: Cédric Le Goater @ 2024-02-27 18:03 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Cédric Le Goater
The .save_setup() handler has now an Error** argument that we can use
to propagate errors reported by the .log_global_start() handler. Do
that for the RAM. The caller qemu_savevm_state_setup() will store the
error under the migration stream for later detection in the migration
sequence.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
migration/ram.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/migration/ram.c b/migration/ram.c
index 9fb1875aad73b2fa009199bdfa8960339df7287d..23f4df4779309bbbe164c56c1436b60d65749860 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2802,9 +2802,8 @@ static void migration_bitmap_clear_discarded_pages(RAMState *rs)
}
}
-static void ram_init_bitmaps(RAMState *rs)
+static bool ram_init_bitmaps(RAMState *rs, Error **errp)
{
- Error *local_err = NULL;
bool ret = true;
qemu_mutex_lock_ramlist();
@@ -2813,10 +2812,8 @@ static void ram_init_bitmaps(RAMState *rs)
ram_list_init_bitmaps();
/* We don't use dirty log with background snapshots */
if (!migrate_background_snapshot()) {
- ret = memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION,
- &local_err);
+ ret = memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION, errp);
if (!ret) {
- error_report_err(local_err);
goto out_unlock;
}
migration_bitmap_sync_precopy(rs, false);
@@ -2826,7 +2823,7 @@ out_unlock:
qemu_mutex_unlock_ramlist();
if (!ret) {
- return;
+ return false;
}
/*
@@ -2834,9 +2831,10 @@ out_unlock:
* containing all 1s to exclude any discarded pages from migration.
*/
migration_bitmap_clear_discarded_pages(rs);
+ return true;
}
-static int ram_init_all(RAMState **rsp)
+static int ram_init_all(RAMState **rsp, Error **errp)
{
if (ram_state_init(rsp)) {
return -1;
@@ -2847,7 +2845,9 @@ static int ram_init_all(RAMState **rsp)
return -1;
}
- ram_init_bitmaps(*rsp);
+ if (!ram_init_bitmaps(*rsp, errp)) {
+ return -1;
+ }
return 0;
}
@@ -2961,7 +2961,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque, Error **errp)
/* migration has already setup the bitmap, reuse it. */
if (!migration_in_colo_state()) {
- if (ram_init_all(rsp) != 0) {
+ if (ram_init_all(rsp, errp) != 0) {
compress_threads_save_cleanup();
return -1;
}
--
2.43.2
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH v2 11/21] migration: Fix migration termination
2024-02-27 18:03 [PATCH v2 00/21] migration: Improve error reporting Cédric Le Goater
` (9 preceding siblings ...)
2024-02-27 18:03 ` [PATCH v2 10/21] migration: Modify ram_init_bitmaps() to report dirty tracking errors Cédric Le Goater
@ 2024-02-27 18:03 ` Cédric Le Goater
2024-02-29 5:34 ` Peter Xu
2024-02-27 18:03 ` [PATCH v2 12/21] vfio: Add Error** argument to .set_dirty_page_tracking() handler Cédric Le Goater
` (9 subsequent siblings)
20 siblings, 1 reply; 42+ messages in thread
From: Cédric Le Goater @ 2024-02-27 18:03 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Cédric Le Goater
Handle migration termination when in SETUP state. This can happen if
qemu_savevm_state_setup() fails.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
migration/migration.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/migration/migration.c b/migration/migration.c
index c1a62b696f62c0d5aca0505e58bc4dc0ff561fde..63294417ff9cae868ad8a167094a795fc30e4da0 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3161,6 +3161,8 @@ static void migration_iteration_finish(MigrationState *s)
}
}
break;
+ case MIGRATION_STATUS_SETUP:
+ break;
default:
/* Should not reach here, but if so, forgive the VM. */
@@ -3192,6 +3194,8 @@ static void bg_migration_iteration_finish(MigrationState *s)
case MIGRATION_STATUS_CANCELLED:
case MIGRATION_STATUS_CANCELLING:
break;
+ case MIGRATION_STATUS_SETUP:
+ break;
default:
/* Should not reach here, but if so, forgive the VM. */
--
2.43.2
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH v2 12/21] vfio: Add Error** argument to .set_dirty_page_tracking() handler
2024-02-27 18:03 [PATCH v2 00/21] migration: Improve error reporting Cédric Le Goater
` (10 preceding siblings ...)
2024-02-27 18:03 ` [PATCH v2 11/21] migration: Fix migration termination Cédric Le Goater
@ 2024-02-27 18:03 ` Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 13/21] vfio: Add Error** argument to vfio_devices_dma_logging_start() Cédric Le Goater
` (8 subsequent siblings)
20 siblings, 0 replies; 42+ messages in thread
From: Cédric Le Goater @ 2024-02-27 18:03 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Cédric Le Goater
We will use the Error object to improve error reporting in the
.log_global*() handlers of VFIO. Add documentation while at it.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
include/hw/vfio/vfio-container-base.h | 18 ++++++++++++++++--
hw/vfio/common.c | 4 ++--
hw/vfio/container-base.c | 4 ++--
hw/vfio/container.c | 6 +++---
4 files changed, 23 insertions(+), 9 deletions(-)
diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h
index b2813b0c117985425c842d91f011bb895955d738..dec2023eceb6c7d62b0ee35008cc58f8e695e190 100644
--- a/include/hw/vfio/vfio-container-base.h
+++ b/include/hw/vfio/vfio-container-base.h
@@ -81,7 +81,7 @@ int vfio_container_add_section_window(VFIOContainerBase *bcontainer,
void vfio_container_del_section_window(VFIOContainerBase *bcontainer,
MemoryRegionSection *section);
int vfio_container_set_dirty_page_tracking(VFIOContainerBase *bcontainer,
- bool start);
+ bool start, Error **errp);
int vfio_container_query_dirty_bitmap(const VFIOContainerBase *bcontainer,
VFIOBitmap *vbmap,
hwaddr iova, hwaddr size);
@@ -120,9 +120,23 @@ struct VFIOIOMMUClass {
int (*attach_device)(const char *name, VFIODevice *vbasedev,
AddressSpace *as, Error **errp);
void (*detach_device)(VFIODevice *vbasedev);
+
/* migration feature */
+
+ /**
+ * @set_dirty_page_tracking
+ *
+ * Start or stop dirty pages tracking on VFIO container
+ *
+ * @bcontainer: #VFIOContainerBase on which to de/activate dirty
+ * pages tracking
+ * @start: indicates whether to start or stop dirty pages tracking
+ * @errp: pointer to Error*, to store an error if it happens.
+ *
+ * Returns zero to indicate success and negative for error
+ */
int (*set_dirty_page_tracking)(const VFIOContainerBase *bcontainer,
- bool start);
+ bool start, Error **errp);
int (*query_dirty_bitmap)(const VFIOContainerBase *bcontainer,
VFIOBitmap *vbmap,
hwaddr iova, hwaddr size);
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 8bba95ba6a2010b78cae54c6905857686bbb6309..560f4bc38499f7f4a3bc84ef7e4184fd6dc89935 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1085,7 +1085,7 @@ static bool vfio_listener_log_global_start(MemoryListener *listener,
if (vfio_devices_all_device_dirty_tracking(bcontainer)) {
ret = vfio_devices_dma_logging_start(bcontainer);
} else {
- ret = vfio_container_set_dirty_page_tracking(bcontainer, true);
+ ret = vfio_container_set_dirty_page_tracking(bcontainer, true, NULL);
}
if (ret) {
@@ -1106,7 +1106,7 @@ static bool vfio_listener_log_global_stop(MemoryListener *listener,
if (vfio_devices_all_device_dirty_tracking(bcontainer)) {
vfio_devices_dma_logging_stop(bcontainer);
} else {
- ret = vfio_container_set_dirty_page_tracking(bcontainer, false);
+ ret = vfio_container_set_dirty_page_tracking(bcontainer, false, NULL);
}
if (ret) {
diff --git a/hw/vfio/container-base.c b/hw/vfio/container-base.c
index 913ae49077c4f09b7b27517c1231cfbe4befb7fb..7c0764121d24b02b6c4e66e368d7dff78a6d65aa 100644
--- a/hw/vfio/container-base.c
+++ b/hw/vfio/container-base.c
@@ -53,14 +53,14 @@ void vfio_container_del_section_window(VFIOContainerBase *bcontainer,
}
int vfio_container_set_dirty_page_tracking(VFIOContainerBase *bcontainer,
- bool start)
+ bool start, Error **errp)
{
if (!bcontainer->dirty_pages_supported) {
return 0;
}
g_assert(bcontainer->ops->set_dirty_page_tracking);
- return bcontainer->ops->set_dirty_page_tracking(bcontainer, start);
+ return bcontainer->ops->set_dirty_page_tracking(bcontainer, start, errp);
}
int vfio_container_query_dirty_bitmap(const VFIOContainerBase *bcontainer,
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index bd25b9fbad2e717e63c2ab0e331186e5f63cef49..f772ac79b9c413c86d7e60f6dc4e6699852d5aac 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -210,7 +210,7 @@ static int vfio_legacy_dma_map(const VFIOContainerBase *bcontainer, hwaddr iova,
static int
vfio_legacy_set_dirty_page_tracking(const VFIOContainerBase *bcontainer,
- bool start)
+ bool start, Error **errp)
{
const VFIOContainer *container = container_of(bcontainer, VFIOContainer,
bcontainer);
@@ -228,8 +228,8 @@ vfio_legacy_set_dirty_page_tracking(const VFIOContainerBase *bcontainer,
ret = ioctl(container->fd, VFIO_IOMMU_DIRTY_PAGES, &dirty);
if (ret) {
ret = -errno;
- error_report("Failed to set dirty tracking flag 0x%x errno: %d",
- dirty.flags, errno);
+ error_setg(errp, "Failed to set dirty tracking flag 0x%x errno: %d",
+ dirty.flags, errno);
}
return ret;
--
2.43.2
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH v2 13/21] vfio: Add Error** argument to vfio_devices_dma_logging_start()
2024-02-27 18:03 [PATCH v2 00/21] migration: Improve error reporting Cédric Le Goater
` (11 preceding siblings ...)
2024-02-27 18:03 ` [PATCH v2 12/21] vfio: Add Error** argument to .set_dirty_page_tracking() handler Cédric Le Goater
@ 2024-02-27 18:03 ` Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 14/21] vfio: Add Error** argument to vfio_devices_dma_logging_stop() Cédric Le Goater
` (7 subsequent siblings)
20 siblings, 0 replies; 42+ messages in thread
From: Cédric Le Goater @ 2024-02-27 18:03 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Cédric Le Goater
This allows to update the Error argument of the VFIO log_global_start()
handler. Errors detected when device level logging is started will be
propagated up to qemu_savevm_state_setup() when the ram save_setup()
handler is executed.
The vfio_set_migration_error() call becomes redundant. Remove it.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
hw/vfio/common.c | 21 +++++++++------------
1 file changed, 9 insertions(+), 12 deletions(-)
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 560f4bc38499f7f4a3bc84ef7e4184fd6dc89935..a2d26cd08cb132d2b27c388bd75db3d9b8128407 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1036,7 +1036,8 @@ static void vfio_device_feature_dma_logging_start_destroy(
g_free(feature);
}
-static int vfio_devices_dma_logging_start(VFIOContainerBase *bcontainer)
+static int vfio_devices_dma_logging_start(VFIOContainerBase *bcontainer,
+ Error **errp)
{
struct vfio_device_feature *feature;
VFIODirtyRanges ranges;
@@ -1058,8 +1059,8 @@ static int vfio_devices_dma_logging_start(VFIOContainerBase *bcontainer)
ret = ioctl(vbasedev->fd, VFIO_DEVICE_FEATURE, feature);
if (ret) {
ret = -errno;
- error_report("%s: Failed to start DMA logging, err %d (%s)",
- vbasedev->name, ret, strerror(errno));
+ error_setg(errp, "%s: Failed to start DMA logging, err %d (%s)",
+ vbasedev->name, ret, strerror(errno));
goto out;
}
vbasedev->dirty_tracking = true;
@@ -1083,15 +1084,13 @@ static bool vfio_listener_log_global_start(MemoryListener *listener,
int ret;
if (vfio_devices_all_device_dirty_tracking(bcontainer)) {
- ret = vfio_devices_dma_logging_start(bcontainer);
+ ret = vfio_devices_dma_logging_start(bcontainer, errp);
} else {
- ret = vfio_container_set_dirty_page_tracking(bcontainer, true, NULL);
+ ret = vfio_container_set_dirty_page_tracking(bcontainer, true, errp);
}
if (ret) {
- error_report("vfio: Could not start dirty page tracking, err: %d (%s)",
- ret, strerror(-ret));
- vfio_set_migration_error(ret);
+ error_prepend(errp, "vfio: Could not start dirty page tracking - ");
}
return !!ret;
}
@@ -1106,13 +1105,11 @@ static bool vfio_listener_log_global_stop(MemoryListener *listener,
if (vfio_devices_all_device_dirty_tracking(bcontainer)) {
vfio_devices_dma_logging_stop(bcontainer);
} else {
- ret = vfio_container_set_dirty_page_tracking(bcontainer, false, NULL);
+ ret = vfio_container_set_dirty_page_tracking(bcontainer, false, errp);
}
if (ret) {
- error_report("vfio: Could not stop dirty page tracking, err: %d (%s)",
- ret, strerror(-ret));
- vfio_set_migration_error(ret);
+ error_prepend(errp, "vfio: Could not stop dirty page tracking - ");
}
return !!ret;
}
--
2.43.2
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH v2 14/21] vfio: Add Error** argument to vfio_devices_dma_logging_stop()
2024-02-27 18:03 [PATCH v2 00/21] migration: Improve error reporting Cédric Le Goater
` (12 preceding siblings ...)
2024-02-27 18:03 ` [PATCH v2 13/21] vfio: Add Error** argument to vfio_devices_dma_logging_start() Cédric Le Goater
@ 2024-02-27 18:03 ` Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 15/21] vfio: Use new Error** argument in vfio_save_setup() Cédric Le Goater
` (6 subsequent siblings)
20 siblings, 0 replies; 42+ messages in thread
From: Cédric Le Goater @ 2024-02-27 18:03 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Cédric Le Goater
This improves error reporting in the log_global_stop() VFIO handler.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
hw/vfio/common.c | 19 ++++++++++++++-----
1 file changed, 14 insertions(+), 5 deletions(-)
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index a2d26cd08cb132d2b27c388bd75db3d9b8128407..706e915a3ba5f8520deb3753d9bb450a986f207a 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -938,12 +938,14 @@ static void vfio_dirty_tracking_init(VFIOContainerBase *bcontainer,
memory_listener_unregister(&dirty.listener);
}
-static void vfio_devices_dma_logging_stop(VFIOContainerBase *bcontainer)
+static int vfio_devices_dma_logging_stop(VFIOContainerBase *bcontainer,
+ Error **errp)
{
uint64_t buf[DIV_ROUND_UP(sizeof(struct vfio_device_feature),
sizeof(uint64_t))] = {};
struct vfio_device_feature *feature = (struct vfio_device_feature *)buf;
VFIODevice *vbasedev;
+ int ret = 0;
feature->argsz = sizeof(buf);
feature->flags = VFIO_DEVICE_FEATURE_SET |
@@ -955,11 +957,17 @@ static void vfio_devices_dma_logging_stop(VFIOContainerBase *bcontainer)
}
if (ioctl(vbasedev->fd, VFIO_DEVICE_FEATURE, feature)) {
- warn_report("%s: Failed to stop DMA logging, err %d (%s)",
- vbasedev->name, -errno, strerror(errno));
+ /* Keep first error */
+ if (!ret) {
+ ret = -errno;
+ error_setg(errp, "%s: Failed to stop DMA logging, err %d (%s)",
+ vbasedev->name, -errno, strerror(errno));
+ }
}
vbasedev->dirty_tracking = false;
}
+
+ return ret;
}
static struct vfio_device_feature *
@@ -1068,7 +1076,8 @@ static int vfio_devices_dma_logging_start(VFIOContainerBase *bcontainer,
out:
if (ret) {
- vfio_devices_dma_logging_stop(bcontainer);
+ /* Ignore the potential errors when doing rollback */
+ vfio_devices_dma_logging_stop(bcontainer, NULL);
}
vfio_device_feature_dma_logging_start_destroy(feature);
@@ -1103,7 +1112,7 @@ static bool vfio_listener_log_global_stop(MemoryListener *listener,
int ret = 0;
if (vfio_devices_all_device_dirty_tracking(bcontainer)) {
- vfio_devices_dma_logging_stop(bcontainer);
+ ret = vfio_devices_dma_logging_stop(bcontainer, errp);
} else {
ret = vfio_container_set_dirty_page_tracking(bcontainer, false, errp);
}
--
2.43.2
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH v2 15/21] vfio: Use new Error** argument in vfio_save_setup()
2024-02-27 18:03 [PATCH v2 00/21] migration: Improve error reporting Cédric Le Goater
` (13 preceding siblings ...)
2024-02-27 18:03 ` [PATCH v2 14/21] vfio: Add Error** argument to vfio_devices_dma_logging_stop() Cédric Le Goater
@ 2024-02-27 18:03 ` Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 16/21] vfio: Add Error** argument to .vfio_save_config() handler Cédric Le Goater
` (5 subsequent siblings)
20 siblings, 0 replies; 42+ messages in thread
From: Cédric Le Goater @ 2024-02-27 18:03 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Cédric Le Goater
Add an Error** argument to vfio_migration_set_state() and adjust
callers, including vfio_save_setup(). The error will be propagated up
to qemu_savevm_state_setup() where the save_setup() handler is
executed.
Modify vfio_vmstate_change_prepare() and vfio_vmstate_change() to
store a reported error under the migration stream if a migration is in
progress.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
hw/vfio/migration.c | 64 ++++++++++++++++++++++++++++++---------------
1 file changed, 43 insertions(+), 21 deletions(-)
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 2dfbe671f6f45aa530c7341177bb532d8292cecd..8bdc68c66516710c52443135284262580825e0b8 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -84,7 +84,8 @@ static const char *mig_state_to_str(enum vfio_device_mig_state state)
static int vfio_migration_set_state(VFIODevice *vbasedev,
enum vfio_device_mig_state new_state,
- enum vfio_device_mig_state recover_state)
+ enum vfio_device_mig_state recover_state,
+ Error **errp)
{
VFIOMigration *migration = vbasedev->migration;
uint64_t buf[DIV_ROUND_UP(sizeof(struct vfio_device_feature) +
@@ -104,15 +105,15 @@ static int vfio_migration_set_state(VFIODevice *vbasedev,
ret = -errno;
if (recover_state == VFIO_DEVICE_STATE_ERROR) {
- error_report("%s: Failed setting device state to %s, err: %s. "
- "Recover state is ERROR. Resetting device",
- vbasedev->name, mig_state_to_str(new_state),
- strerror(errno));
+ error_setg(errp, "%s: Failed setting device state to %s, err: %s. "
+ "Recover state is ERROR. Resetting device",
+ vbasedev->name, mig_state_to_str(new_state),
+ strerror(errno));
goto reset_device;
}
- error_report(
+ error_setg(errp,
"%s: Failed setting device state to %s, err: %s. Setting device in recover state %s",
vbasedev->name, mig_state_to_str(new_state),
strerror(errno), mig_state_to_str(recover_state));
@@ -120,7 +121,7 @@ static int vfio_migration_set_state(VFIODevice *vbasedev,
mig_state->device_state = recover_state;
if (ioctl(vbasedev->fd, VFIO_DEVICE_FEATURE, feature)) {
ret = -errno;
- error_report(
+ error_setg(errp,
"%s: Failed setting device in recover state, err: %s. Resetting device",
vbasedev->name, strerror(errno));
@@ -139,7 +140,7 @@ static int vfio_migration_set_state(VFIODevice *vbasedev,
* This can happen if the device is asynchronously reset and
* terminates a data transfer.
*/
- error_report("%s: data_fd out of sync", vbasedev->name);
+ error_setg(errp, "%s: data_fd out of sync", vbasedev->name);
close(mig_state->data_fd);
return -EBADF;
@@ -170,10 +171,11 @@ reset_device:
*/
static int
vfio_migration_set_state_or_reset(VFIODevice *vbasedev,
- enum vfio_device_mig_state new_state)
+ enum vfio_device_mig_state new_state,
+ Error **errp)
{
return vfio_migration_set_state(vbasedev, new_state,
- VFIO_DEVICE_STATE_ERROR);
+ VFIO_DEVICE_STATE_ERROR, errp);
}
static int vfio_load_buffer(QEMUFile *f, VFIODevice *vbasedev,
@@ -391,8 +393,8 @@ static int vfio_save_setup(QEMUFile *f, void *opaque, Error **errp)
stop_copy_size);
migration->data_buffer = g_try_malloc0(migration->data_buffer_size);
if (!migration->data_buffer) {
- error_report("%s: Failed to allocate migration data buffer",
- vbasedev->name);
+ error_setg(errp, "%s: Failed to allocate migration data buffer",
+ vbasedev->name);
return -ENOMEM;
}
@@ -402,7 +404,7 @@ static int vfio_save_setup(QEMUFile *f, void *opaque, Error **errp)
switch (migration->device_state) {
case VFIO_DEVICE_STATE_RUNNING:
ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_PRE_COPY,
- VFIO_DEVICE_STATE_RUNNING);
+ VFIO_DEVICE_STATE_RUNNING, errp);
if (ret) {
return ret;
}
@@ -429,13 +431,20 @@ static void vfio_save_cleanup(void *opaque)
{
VFIODevice *vbasedev = opaque;
VFIOMigration *migration = vbasedev->migration;
+ Error *local_err = NULL;
+ int ret;
/*
* Changing device state from STOP_COPY to STOP can take time. Do it here,
* after migration has completed, so it won't increase downtime.
*/
if (migration->device_state == VFIO_DEVICE_STATE_STOP_COPY) {
- vfio_migration_set_state_or_reset(vbasedev, VFIO_DEVICE_STATE_STOP);
+ ret = vfio_migration_set_state_or_reset(vbasedev,
+ VFIO_DEVICE_STATE_STOP,
+ &local_err);
+ if (ret) {
+ error_report_err(local_err);
+ }
}
g_free(migration->data_buffer);
@@ -541,11 +550,13 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
VFIODevice *vbasedev = opaque;
ssize_t data_size;
int ret;
+ Error *local_err = NULL;
/* We reach here with device state STOP or STOP_COPY only */
ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_STOP_COPY,
- VFIO_DEVICE_STATE_STOP);
+ VFIO_DEVICE_STATE_STOP, &local_err);
if (ret) {
+ error_report_err(local_err);
return ret;
}
@@ -585,7 +596,7 @@ static int vfio_load_setup(QEMUFile *f, void *opaque, Error **errp)
VFIODevice *vbasedev = opaque;
return vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_RESUMING,
- vbasedev->migration->device_state);
+ vbasedev->migration->device_state, errp);
}
static int vfio_load_cleanup(void *opaque)
@@ -701,20 +712,22 @@ static void vfio_vmstate_change_prepare(void *opaque, bool running,
VFIODevice *vbasedev = opaque;
VFIOMigration *migration = vbasedev->migration;
enum vfio_device_mig_state new_state;
+ Error *local_err = NULL;
int ret;
new_state = migration->device_state == VFIO_DEVICE_STATE_PRE_COPY ?
VFIO_DEVICE_STATE_PRE_COPY_P2P :
VFIO_DEVICE_STATE_RUNNING_P2P;
- ret = vfio_migration_set_state_or_reset(vbasedev, new_state);
+ ret = vfio_migration_set_state_or_reset(vbasedev, new_state, &local_err);
if (ret) {
/*
* Migration should be aborted in this case, but vm_state_notify()
* currently does not support reporting failures.
*/
if (migrate_get_current()->to_dst_file) {
- qemu_file_set_error(migrate_get_current()->to_dst_file, ret);
+ qemu_file_set_error_obj(migrate_get_current()->to_dst_file, ret,
+ local_err);
}
}
@@ -727,6 +740,7 @@ static void vfio_vmstate_change(void *opaque, bool running, RunState state)
{
VFIODevice *vbasedev = opaque;
enum vfio_device_mig_state new_state;
+ Error *local_err = NULL;
int ret;
if (running) {
@@ -739,14 +753,15 @@ static void vfio_vmstate_change(void *opaque, bool running, RunState state)
VFIO_DEVICE_STATE_STOP;
}
- ret = vfio_migration_set_state_or_reset(vbasedev, new_state);
+ ret = vfio_migration_set_state_or_reset(vbasedev, new_state, &local_err);
if (ret) {
/*
* Migration should be aborted in this case, but vm_state_notify()
* currently does not support reporting failures.
*/
if (migrate_get_current()->to_dst_file) {
- qemu_file_set_error(migrate_get_current()->to_dst_file, ret);
+ qemu_file_set_error_obj(migrate_get_current()->to_dst_file, ret,
+ local_err);
}
}
@@ -760,6 +775,8 @@ static void vfio_migration_state_notifier(Notifier *notifier, void *data)
VFIOMigration *migration = container_of(notifier, VFIOMigration,
migration_state);
VFIODevice *vbasedev = migration->vbasedev;
+ Error *local_err = NULL;
+ int ret;
trace_vfio_migration_state_notifier(vbasedev->name,
MigrationStatus_str(s->state));
@@ -768,7 +785,12 @@ static void vfio_migration_state_notifier(Notifier *notifier, void *data)
case MIGRATION_STATUS_CANCELLING:
case MIGRATION_STATUS_CANCELLED:
case MIGRATION_STATUS_FAILED:
- vfio_migration_set_state_or_reset(vbasedev, VFIO_DEVICE_STATE_RUNNING);
+ ret = vfio_migration_set_state_or_reset(vbasedev,
+ VFIO_DEVICE_STATE_RUNNING,
+ &local_err);
+ if (ret) {
+ error_report_err(local_err);
+ }
}
}
--
2.43.2
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH v2 16/21] vfio: Add Error** argument to .vfio_save_config() handler
2024-02-27 18:03 [PATCH v2 00/21] migration: Improve error reporting Cédric Le Goater
` (14 preceding siblings ...)
2024-02-27 18:03 ` [PATCH v2 15/21] vfio: Use new Error** argument in vfio_save_setup() Cédric Le Goater
@ 2024-02-27 18:03 ` Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 17/21] vfio: Reverse test on vfio_get_dirty_bitmap() Cédric Le Goater
` (4 subsequent siblings)
20 siblings, 0 replies; 42+ messages in thread
From: Cédric Le Goater @ 2024-02-27 18:03 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Cédric Le Goater
Use vmstate_save_state_with_err() to improve error reporting in the
callers and store a reported error under the migration stream. Add
documentation while at it.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
include/hw/vfio/vfio-common.h | 25 ++++++++++++++++++++++++-
hw/vfio/migration.c | 18 ++++++++++++------
hw/vfio/pci.c | 5 +++--
3 files changed, 39 insertions(+), 9 deletions(-)
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 9b7ef7d02b5a0ad5266bcc4d06cd6874178978e4..6d9dee626afc491645d2c2398f3e3210961f67e9 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -133,7 +133,30 @@ struct VFIODeviceOps {
int (*vfio_hot_reset_multi)(VFIODevice *vdev);
void (*vfio_eoi)(VFIODevice *vdev);
Object *(*vfio_get_object)(VFIODevice *vdev);
- void (*vfio_save_config)(VFIODevice *vdev, QEMUFile *f);
+
+ /**
+ * @vfio_save_config
+ *
+ * Save device config state
+ *
+ * @vdev: #VFIODevice for which to save the config
+ * @f: #QEMUFile where to send the data
+ * @errp: pointer to Error*, to store an error if it happens.
+ *
+ * Returns zero to indicate success and negative for error
+ */
+ int (*vfio_save_config)(VFIODevice *vdev, QEMUFile *f, Error **errp);
+
+ /**
+ * @vfio_load_config
+ *
+ * Load device config state
+ *
+ * @vdev: #VFIODevice for which to load the config
+ * @f: #QEMUFile where to get the data
+ *
+ * Returns zero to indicate success and negative for error
+ */
int (*vfio_load_config)(VFIODevice *vdev, QEMUFile *f);
};
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 8bdc68c66516710c52443135284262580825e0b8..228e8854594f3714b7c6f4fcfc5468d6b56337cb 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -190,14 +190,19 @@ static int vfio_load_buffer(QEMUFile *f, VFIODevice *vbasedev,
return ret;
}
-static int vfio_save_device_config_state(QEMUFile *f, void *opaque)
+static int vfio_save_device_config_state(QEMUFile *f, void *opaque,
+ Error **errp)
{
VFIODevice *vbasedev = opaque;
+ int ret;
qemu_put_be64(f, VFIO_MIG_FLAG_DEV_CONFIG_STATE);
if (vbasedev->ops && vbasedev->ops->vfio_save_config) {
- vbasedev->ops->vfio_save_config(vbasedev, f);
+ ret = vbasedev->ops->vfio_save_config(vbasedev, f, errp);
+ if (ret) {
+ return ret;
+ }
}
qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
@@ -581,13 +586,14 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
static void vfio_save_state(QEMUFile *f, void *opaque)
{
VFIODevice *vbasedev = opaque;
+ Error *local_err = NULL;
int ret;
- ret = vfio_save_device_config_state(f, opaque);
+ ret = vfio_save_device_config_state(f, opaque, &local_err);
if (ret) {
- error_report("%s: Failed to save device config space",
- vbasedev->name);
- qemu_file_set_error(f, ret);
+ error_prepend(&local_err, "%s: Failed to save device config space",
+ vbasedev->name);
+ qemu_file_set_error_obj(f, ret, local_err);
}
}
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 4fa387f0430d62ca2ba1b5ae5b7037f8f06b33f9..99d86e1d40ef25133fc76ad6e58294b07bd20843 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2585,11 +2585,12 @@ const VMStateDescription vmstate_vfio_pci_config = {
}
};
-static void vfio_pci_save_config(VFIODevice *vbasedev, QEMUFile *f)
+static int vfio_pci_save_config(VFIODevice *vbasedev, QEMUFile *f, Error **errp)
{
VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
- vmstate_save_state(f, &vmstate_vfio_pci_config, vdev, NULL);
+ return vmstate_save_state_with_err(f, &vmstate_vfio_pci_config, vdev, NULL,
+ errp);
}
static int vfio_pci_load_config(VFIODevice *vbasedev, QEMUFile *f)
--
2.43.2
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH v2 17/21] vfio: Reverse test on vfio_get_dirty_bitmap()
2024-02-27 18:03 [PATCH v2 00/21] migration: Improve error reporting Cédric Le Goater
` (15 preceding siblings ...)
2024-02-27 18:03 ` [PATCH v2 16/21] vfio: Add Error** argument to .vfio_save_config() handler Cédric Le Goater
@ 2024-02-27 18:03 ` Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 18/21] memory: Add Error** argument to memory_get_xlat_addr() Cédric Le Goater
` (3 subsequent siblings)
20 siblings, 0 replies; 42+ messages in thread
From: Cédric Le Goater @ 2024-02-27 18:03 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Cédric Le Goater
It will simplify the changes coming after.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
hw/vfio/common.c | 22 +++++++++++++---------
1 file changed, 13 insertions(+), 9 deletions(-)
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 706e915a3ba5f8520deb3753d9bb450a986f207a..e51757e7d747c60b67deb966bb29b946a511b328 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1237,16 +1237,20 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
}
rcu_read_lock();
- if (vfio_get_xlat_addr(iotlb, NULL, &translated_addr, NULL)) {
- ret = vfio_get_dirty_bitmap(bcontainer, iova, iotlb->addr_mask + 1,
- translated_addr);
- if (ret) {
- error_report("vfio_iommu_map_dirty_notify(%p, 0x%"HWADDR_PRIx", "
- "0x%"HWADDR_PRIx") = %d (%s)",
- bcontainer, iova, iotlb->addr_mask + 1, ret,
- strerror(-ret));
- }
+ if (!vfio_get_xlat_addr(iotlb, NULL, &translated_addr, NULL)) {
+ goto out_lock;
}
+
+ ret = vfio_get_dirty_bitmap(bcontainer, iova, iotlb->addr_mask + 1,
+ translated_addr);
+ if (ret) {
+ error_report("vfio_iommu_map_dirty_notify(%p, 0x%"HWADDR_PRIx", "
+ "0x%"HWADDR_PRIx") = %d (%s)",
+ bcontainer, iova, iotlb->addr_mask + 1, ret,
+ strerror(-ret));
+ }
+
+out_lock:
rcu_read_unlock();
out:
--
2.43.2
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH v2 18/21] memory: Add Error** argument to memory_get_xlat_addr()
2024-02-27 18:03 [PATCH v2 00/21] migration: Improve error reporting Cédric Le Goater
` (16 preceding siblings ...)
2024-02-27 18:03 ` [PATCH v2 17/21] vfio: Reverse test on vfio_get_dirty_bitmap() Cédric Le Goater
@ 2024-02-27 18:03 ` Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 19/21] vfio: Add Error** argument to .get_dirty_bitmap() handler Cédric Le Goater
` (2 subsequent siblings)
20 siblings, 0 replies; 42+ messages in thread
From: Cédric Le Goater @ 2024-02-27 18:03 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Cédric Le Goater,
Michael S . Tsirkin, Paolo Bonzini, David Hildenbrand
Let the callers do the reporting. This will be useful in
vfio_iommu_map_dirty_notify().
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
include/exec/memory.h | 15 ++++++++++++++-
hw/vfio/common.c | 13 +++++++++----
hw/virtio/vhost-vdpa.c | 5 ++++-
system/memory.c | 10 +++++-----
4 files changed, 32 insertions(+), 11 deletions(-)
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 8b019465ab13ce85c03075c80865a0865ea1feed..baca989023415b69be3b4b4e7a622f983182314b 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -771,9 +771,22 @@ void ram_discard_manager_register_listener(RamDiscardManager *rdm,
void ram_discard_manager_unregister_listener(RamDiscardManager *rdm,
RamDiscardListener *rdl);
+/**
+ * memory_get_xlat_addr: Extract addresses from a TLB entry
+ *
+ * @iotlb: pointer to an #IOMMUTLBEntry
+ * @vaddr: virtual addressf
+ * @ram_addr: RAM address
+ * @read_only: indicates if writes are allowed
+ * @mr_has_discard_manager: indicates memory is controlled by a
+ * RamDiscardManager
+ * @errp: pointer to Error*, to store an error if it happens.
+ *
+ * Return: true on success, else false setting @errp with error.
+ */
bool memory_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
ram_addr_t *ram_addr, bool *read_only,
- bool *mr_has_discard_manager);
+ bool *mr_has_discard_manager, Error **errp);
typedef struct CoalescedMemoryRange CoalescedMemoryRange;
typedef struct MemoryRegionIoeventfd MemoryRegionIoeventfd;
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index e51757e7d747c60b67deb966bb29b946a511b328..43f37447e3692ffa97788b02f83b81b44aaf301a 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -262,12 +262,13 @@ static bool vfio_listener_skipped_section(MemoryRegionSection *section)
/* Called with rcu_read_lock held. */
static bool vfio_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
- ram_addr_t *ram_addr, bool *read_only)
+ ram_addr_t *ram_addr, bool *read_only,
+ Error **errp)
{
bool ret, mr_has_discard_manager;
ret = memory_get_xlat_addr(iotlb, vaddr, ram_addr, read_only,
- &mr_has_discard_manager);
+ &mr_has_discard_manager, errp);
if (ret && mr_has_discard_manager) {
/*
* Malicious VMs might trigger discarding of IOMMU-mapped memory. The
@@ -297,6 +298,7 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
hwaddr iova = iotlb->iova + giommu->iommu_offset;
void *vaddr;
int ret;
+ Error *local_err = NULL;
trace_vfio_iommu_map_notify(iotlb->perm == IOMMU_NONE ? "UNMAP" : "MAP",
iova, iova + iotlb->addr_mask);
@@ -313,7 +315,8 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) {
bool read_only;
- if (!vfio_get_xlat_addr(iotlb, &vaddr, NULL, &read_only)) {
+ if (!vfio_get_xlat_addr(iotlb, &vaddr, NULL, &read_only, &local_err)) {
+ error_report_err(local_err);
goto out;
}
/*
@@ -1226,6 +1229,7 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
VFIOContainerBase *bcontainer = giommu->bcontainer;
hwaddr iova = iotlb->iova + giommu->iommu_offset;
ram_addr_t translated_addr;
+ Error *local_err = NULL;
int ret = -EINVAL;
trace_vfio_iommu_map_dirty_notify(iova, iova + iotlb->addr_mask);
@@ -1237,7 +1241,8 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
}
rcu_read_lock();
- if (!vfio_get_xlat_addr(iotlb, NULL, &translated_addr, NULL)) {
+ if (!vfio_get_xlat_addr(iotlb, NULL, &translated_addr, NULL, &local_err)) {
+ error_report_err(local_err);
goto out_lock;
}
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index ddae494ca8e8154ce03b88bc781fe9f1e639aceb..a6f06266cfc798b20b98001fa97ce771722175ec 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -203,6 +203,7 @@ static void vhost_vdpa_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
void *vaddr;
int ret;
Int128 llend;
+ Error *local_err = NULL;
if (iotlb->target_as != &address_space_memory) {
error_report("Wrong target AS \"%s\", only system memory is allowed",
@@ -222,7 +223,9 @@ static void vhost_vdpa_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) {
bool read_only;
- if (!memory_get_xlat_addr(iotlb, &vaddr, NULL, &read_only, NULL)) {
+ if (!memory_get_xlat_addr(iotlb, &vaddr, NULL, &read_only, NULL,
+ &local_err)) {
+ error_report_err(local_err);
return;
}
ret = vhost_vdpa_dma_map(s, VHOST_VDPA_GUEST_PA_ASID, iova,
diff --git a/system/memory.c b/system/memory.c
index 48aed0f8ece1c731849636c442b8ab8e5d7ff6a5..132c026e35cbeb0ab8fa0fe64bb9db50f6024e0d 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -2203,7 +2203,7 @@ void ram_discard_manager_unregister_listener(RamDiscardManager *rdm,
/* Called with rcu_read_lock held. */
bool memory_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
ram_addr_t *ram_addr, bool *read_only,
- bool *mr_has_discard_manager)
+ bool *mr_has_discard_manager, Error **errp)
{
MemoryRegion *mr;
hwaddr xlat;
@@ -2221,7 +2221,7 @@ bool memory_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
mr = address_space_translate(&address_space_memory, iotlb->translated_addr,
&xlat, &len, writable, MEMTXATTRS_UNSPECIFIED);
if (!memory_region_is_ram(mr)) {
- error_report("iommu map to non memory area %" HWADDR_PRIx "", xlat);
+ error_setg(errp, "iommu map to non memory area %" HWADDR_PRIx "", xlat);
return false;
} else if (memory_region_has_ram_discard_manager(mr)) {
RamDiscardManager *rdm = memory_region_get_ram_discard_manager(mr);
@@ -2240,8 +2240,8 @@ bool memory_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
* were already restored before IOMMUs are restored.
*/
if (!ram_discard_manager_is_populated(rdm, &tmp)) {
- error_report("iommu map to discarded memory (e.g., unplugged via"
- " virtio-mem): %" HWADDR_PRIx "",
+ error_setg(errp, "iommu map to discarded memory (e.g., unplugged"
+ " via virtio-mem): %" HWADDR_PRIx "",
iotlb->translated_addr);
return false;
}
@@ -2252,7 +2252,7 @@ bool memory_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
* check that it did not truncate too much.
*/
if (len & iotlb->addr_mask) {
- error_report("iommu has granularity incompatible with target AS");
+ error_setg(errp, "iommu has granularity incompatible with target AS");
return false;
}
--
2.43.2
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH v2 19/21] vfio: Add Error** argument to .get_dirty_bitmap() handler
2024-02-27 18:03 [PATCH v2 00/21] migration: Improve error reporting Cédric Le Goater
` (17 preceding siblings ...)
2024-02-27 18:03 ` [PATCH v2 18/21] memory: Add Error** argument to memory_get_xlat_addr() Cédric Le Goater
@ 2024-02-27 18:03 ` Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 20/21] vfio: Also trace event failures in vfio_save_complete_precopy() Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 21/21] vfio: Extend vfio_set_migration_error() with Error* argument Cédric Le Goater
20 siblings, 0 replies; 42+ messages in thread
From: Cédric Le Goater @ 2024-02-27 18:03 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Cédric Le Goater
Let the callers do the error reporting. Add documentation while at it.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
include/hw/vfio/vfio-common.h | 4 +-
include/hw/vfio/vfio-container-base.h | 17 +++++++-
hw/vfio/common.c | 59 ++++++++++++++++++---------
hw/vfio/container-base.c | 5 ++-
hw/vfio/container.c | 13 +++---
5 files changed, 67 insertions(+), 31 deletions(-)
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 6d9dee626afc491645d2c2398f3e3210961f67e9..83ffad89f5cf434452332fe29fb752d9ec71b2f0 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -271,9 +271,9 @@ bool
vfio_devices_all_device_dirty_tracking(const VFIOContainerBase *bcontainer);
int vfio_devices_query_dirty_bitmap(const VFIOContainerBase *bcontainer,
VFIOBitmap *vbmap, hwaddr iova,
- hwaddr size);
+ hwaddr size, Error **errp);
int vfio_get_dirty_bitmap(const VFIOContainerBase *bcontainer, uint64_t iova,
- uint64_t size, ram_addr_t ram_addr);
+ uint64_t size, ram_addr_t ram_addr, Error **errp);
/* Returns 0 on success, or a negative errno. */
int vfio_device_get_name(VFIODevice *vbasedev, Error **errp);
diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h
index dec2023eceb6c7d62b0ee35008cc58f8e695e190..3ee713014cb414f18b34092641a17717983b5559 100644
--- a/include/hw/vfio/vfio-container-base.h
+++ b/include/hw/vfio/vfio-container-base.h
@@ -84,7 +84,7 @@ int vfio_container_set_dirty_page_tracking(VFIOContainerBase *bcontainer,
bool start, Error **errp);
int vfio_container_query_dirty_bitmap(const VFIOContainerBase *bcontainer,
VFIOBitmap *vbmap,
- hwaddr iova, hwaddr size);
+ hwaddr iova, hwaddr size, Error **errp);
void vfio_container_init(VFIOContainerBase *bcontainer,
VFIOAddressSpace *space,
@@ -137,9 +137,22 @@ struct VFIOIOMMUClass {
*/
int (*set_dirty_page_tracking)(const VFIOContainerBase *bcontainer,
bool start, Error **errp);
+ /**
+ * @query_dirty_bitmap
+ *
+ * Get list of dirty pages from container
+ *
+ * @bcontainer: #VFIOContainerBase from which to get dirty pages
+ * @vbmap: #VFIOBitmap internal bitmap structure
+ * @iova: iova base address
+ * @size: size of iova range
+ * @errp: pointer to Error*, to store an error if it happens.
+ *
+ * Returns zero to indicate success and negative for error
+ */
int (*query_dirty_bitmap)(const VFIOContainerBase *bcontainer,
VFIOBitmap *vbmap,
- hwaddr iova, hwaddr size);
+ hwaddr iova, hwaddr size, Error **errp);
/* PCI specific */
int (*pci_hot_reset)(VFIODevice *vbasedev, bool single);
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 43f37447e3692ffa97788b02f83b81b44aaf301a..8fbf04e55d1b304bc80fdd9ef6f5f5089acd3360 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1154,7 +1154,7 @@ static int vfio_device_dma_logging_report(VFIODevice *vbasedev, hwaddr iova,
int vfio_devices_query_dirty_bitmap(const VFIOContainerBase *bcontainer,
VFIOBitmap *vbmap, hwaddr iova,
- hwaddr size)
+ hwaddr size, Error **errp)
{
VFIODevice *vbasedev;
int ret;
@@ -1163,10 +1163,10 @@ int vfio_devices_query_dirty_bitmap(const VFIOContainerBase *bcontainer,
ret = vfio_device_dma_logging_report(vbasedev, iova, size,
vbmap->bitmap);
if (ret) {
- error_report("%s: Failed to get DMA logging report, iova: "
- "0x%" HWADDR_PRIx ", size: 0x%" HWADDR_PRIx
- ", err: %d (%s)",
- vbasedev->name, iova, size, ret, strerror(-ret));
+ error_setg(errp, "%s: Failed to get DMA logging report, iova: "
+ "0x%" HWADDR_PRIx ", size: 0x%" HWADDR_PRIx
+ ", err: %d (%s)",
+ vbasedev->name, iova, size, ret, strerror(-ret));
return ret;
}
@@ -1176,7 +1176,7 @@ int vfio_devices_query_dirty_bitmap(const VFIOContainerBase *bcontainer,
}
int vfio_get_dirty_bitmap(const VFIOContainerBase *bcontainer, uint64_t iova,
- uint64_t size, ram_addr_t ram_addr)
+ uint64_t size, ram_addr_t ram_addr, Error **errp)
{
bool all_device_dirty_tracking =
vfio_devices_all_device_dirty_tracking(bcontainer);
@@ -1193,13 +1193,17 @@ int vfio_get_dirty_bitmap(const VFIOContainerBase *bcontainer, uint64_t iova,
ret = vfio_bitmap_alloc(&vbmap, size);
if (ret) {
+ error_setg_errno(errp, -ret,
+ "Failed to allocate dirty tracking bitmap");
return ret;
}
if (all_device_dirty_tracking) {
- ret = vfio_devices_query_dirty_bitmap(bcontainer, &vbmap, iova, size);
+ ret = vfio_devices_query_dirty_bitmap(bcontainer, &vbmap, iova, size,
+ errp);
} else {
- ret = vfio_container_query_dirty_bitmap(bcontainer, &vbmap, iova, size);
+ ret = vfio_container_query_dirty_bitmap(bcontainer, &vbmap, iova, size,
+ errp);
}
if (ret) {
@@ -1247,12 +1251,13 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
}
ret = vfio_get_dirty_bitmap(bcontainer, iova, iotlb->addr_mask + 1,
- translated_addr);
+ translated_addr, &local_err);
if (ret) {
- error_report("vfio_iommu_map_dirty_notify(%p, 0x%"HWADDR_PRIx", "
- "0x%"HWADDR_PRIx") = %d (%s)",
- bcontainer, iova, iotlb->addr_mask + 1, ret,
- strerror(-ret));
+ error_prepend(&local_err,
+ "vfio_iommu_map_dirty_notify(%p, 0x%"HWADDR_PRIx", "
+ "0x%"HWADDR_PRIx") failed :", bcontainer, iova,
+ iotlb->addr_mask + 1);
+ error_report_err(local_err);
}
out_lock:
@@ -1272,12 +1277,19 @@ static int vfio_ram_discard_get_dirty_bitmap(MemoryRegionSection *section,
const ram_addr_t ram_addr = memory_region_get_ram_addr(section->mr) +
section->offset_within_region;
VFIORamDiscardListener *vrdl = opaque;
+ Error *local_err = NULL;
+ int ret;
/*
* Sync the whole mapped region (spanning multiple individual mappings)
* in one go.
*/
- return vfio_get_dirty_bitmap(vrdl->bcontainer, iova, size, ram_addr);
+ ret = vfio_get_dirty_bitmap(vrdl->bcontainer, iova, size, ram_addr,
+ &local_err);
+ if (ret) {
+ error_report_err(local_err);
+ }
+ return ret;
}
static int
@@ -1309,7 +1321,7 @@ vfio_sync_ram_discard_listener_dirty_bitmap(VFIOContainerBase *bcontainer,
}
static int vfio_sync_dirty_bitmap(VFIOContainerBase *bcontainer,
- MemoryRegionSection *section)
+ MemoryRegionSection *section, Error **errp)
{
ram_addr_t ram_addr;
@@ -1340,7 +1352,14 @@ static int vfio_sync_dirty_bitmap(VFIOContainerBase *bcontainer,
}
return 0;
} else if (memory_region_has_ram_discard_manager(section->mr)) {
- return vfio_sync_ram_discard_listener_dirty_bitmap(bcontainer, section);
+ int ret;
+
+ ret = vfio_sync_ram_discard_listener_dirty_bitmap(bcontainer, section);
+ if (ret) {
+ error_setg(errp,
+ "Failed to sync dirty bitmap with RAM discard listener");
+ return ret;
+ }
}
ram_addr = memory_region_get_ram_addr(section->mr) +
@@ -1348,7 +1367,7 @@ static int vfio_sync_dirty_bitmap(VFIOContainerBase *bcontainer,
return vfio_get_dirty_bitmap(bcontainer,
REAL_HOST_PAGE_ALIGN(section->offset_within_address_space),
- int128_get64(section->size), ram_addr);
+ int128_get64(section->size), ram_addr, errp);
}
static void vfio_listener_log_sync(MemoryListener *listener,
@@ -1357,16 +1376,16 @@ static void vfio_listener_log_sync(MemoryListener *listener,
VFIOContainerBase *bcontainer = container_of(listener, VFIOContainerBase,
listener);
int ret;
+ Error *local_err = NULL;
if (vfio_listener_skipped_section(section)) {
return;
}
if (vfio_devices_all_dirty_tracking(bcontainer)) {
- ret = vfio_sync_dirty_bitmap(bcontainer, section);
+ ret = vfio_sync_dirty_bitmap(bcontainer, section, &local_err);
if (ret) {
- error_report("vfio: Failed to sync dirty bitmap, err: %d (%s)", ret,
- strerror(-ret));
+ error_report_err(local_err);
vfio_set_migration_error(ret);
}
}
diff --git a/hw/vfio/container-base.c b/hw/vfio/container-base.c
index 7c0764121d24b02b6c4e66e368d7dff78a6d65aa..8db59881873c3b1edee81104b966af737e5fa6f6 100644
--- a/hw/vfio/container-base.c
+++ b/hw/vfio/container-base.c
@@ -65,10 +65,11 @@ int vfio_container_set_dirty_page_tracking(VFIOContainerBase *bcontainer,
int vfio_container_query_dirty_bitmap(const VFIOContainerBase *bcontainer,
VFIOBitmap *vbmap,
- hwaddr iova, hwaddr size)
+ hwaddr iova, hwaddr size, Error **errp)
{
g_assert(bcontainer->ops->query_dirty_bitmap);
- return bcontainer->ops->query_dirty_bitmap(bcontainer, vbmap, iova, size);
+ return bcontainer->ops->query_dirty_bitmap(bcontainer, vbmap, iova, size,
+ errp);
}
void vfio_container_init(VFIOContainerBase *bcontainer, VFIOAddressSpace *space,
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index f772ac79b9c413c86d7e60f6dc4e6699852d5aac..a5243b5ee4bb5bc2f732554bd6cfa1b92758195e 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -131,6 +131,7 @@ static int vfio_legacy_dma_unmap(const VFIOContainerBase *bcontainer,
};
bool need_dirty_sync = false;
int ret;
+ Error *local_err = NULL;
if (iotlb && vfio_devices_all_running_and_mig_active(bcontainer)) {
if (!vfio_devices_all_device_dirty_tracking(bcontainer) &&
@@ -166,8 +167,9 @@ static int vfio_legacy_dma_unmap(const VFIOContainerBase *bcontainer,
if (need_dirty_sync) {
ret = vfio_get_dirty_bitmap(bcontainer, iova, size,
- iotlb->translated_addr);
+ iotlb->translated_addr, &local_err);
if (ret) {
+ error_report_err(local_err);
return ret;
}
}
@@ -237,7 +239,8 @@ vfio_legacy_set_dirty_page_tracking(const VFIOContainerBase *bcontainer,
static int vfio_legacy_query_dirty_bitmap(const VFIOContainerBase *bcontainer,
VFIOBitmap *vbmap,
- hwaddr iova, hwaddr size)
+ hwaddr iova, hwaddr size,
+ Error **errp)
{
const VFIOContainer *container = container_of(bcontainer, VFIOContainer,
bcontainer);
@@ -265,9 +268,9 @@ static int vfio_legacy_query_dirty_bitmap(const VFIOContainerBase *bcontainer,
ret = ioctl(container->fd, VFIO_IOMMU_DIRTY_PAGES, dbitmap);
if (ret) {
ret = -errno;
- error_report("Failed to get dirty bitmap for iova: 0x%"PRIx64
- " size: 0x%"PRIx64" err: %d", (uint64_t)range->iova,
- (uint64_t)range->size, errno);
+ error_setg(errp, "Failed to get dirty bitmap for iova: 0x%"PRIx64
+ " size: 0x%"PRIx64" err: %d", (uint64_t)range->iova,
+ (uint64_t)range->size, errno);
}
g_free(dbitmap);
--
2.43.2
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH v2 20/21] vfio: Also trace event failures in vfio_save_complete_precopy()
2024-02-27 18:03 [PATCH v2 00/21] migration: Improve error reporting Cédric Le Goater
` (18 preceding siblings ...)
2024-02-27 18:03 ` [PATCH v2 19/21] vfio: Add Error** argument to .get_dirty_bitmap() handler Cédric Le Goater
@ 2024-02-27 18:03 ` Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 21/21] vfio: Extend vfio_set_migration_error() with Error* argument Cédric Le Goater
20 siblings, 0 replies; 42+ messages in thread
From: Cédric Le Goater @ 2024-02-27 18:03 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Cédric Le Goater
vfio_save_complete_precopy() currently returns before doing the trace
event. Change that.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
hw/vfio/migration.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 228e8854594f3714b7c6f4fcfc5468d6b56337cb..f3b500dd1cab944722ccbc41575b15046c2420c9 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -574,9 +574,6 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
ret = qemu_file_get_error(f);
- if (ret) {
- return ret;
- }
trace_vfio_save_complete_precopy(vbasedev->name, ret);
--
2.43.2
^ permalink raw reply related [flat|nested] 42+ messages in thread
* [PATCH v2 21/21] vfio: Extend vfio_set_migration_error() with Error* argument
2024-02-27 18:03 [PATCH v2 00/21] migration: Improve error reporting Cédric Le Goater
` (19 preceding siblings ...)
2024-02-27 18:03 ` [PATCH v2 20/21] vfio: Also trace event failures in vfio_save_complete_precopy() Cédric Le Goater
@ 2024-02-27 18:03 ` Cédric Le Goater
20 siblings, 0 replies; 42+ messages in thread
From: Cédric Le Goater @ 2024-02-27 18:03 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Cédric Le Goater
vfio_set_migration_error() sets the 'return' error on the migration
stream if a migration is in progress. To improve error reporting, add
a new Error* argument to also set the Error object on the migration
stream, if a migration is progress.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
hw/vfio/common.c | 36 +++++++++++++++++++-----------------
1 file changed, 19 insertions(+), 17 deletions(-)
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 8fbf04e55d1b304bc80fdd9ef6f5f5089acd3360..5e6353ae468c885af0fa169b671902a518df4c75 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -148,16 +148,18 @@ bool vfio_viommu_preset(VFIODevice *vbasedev)
return vbasedev->bcontainer->space->as != &address_space_memory;
}
-static void vfio_set_migration_error(int err)
+static void vfio_set_migration_error(int ret, Error *err)
{
MigrationState *ms = migrate_get_current();
if (migration_is_setup_or_active(ms->state)) {
WITH_QEMU_LOCK_GUARD(&ms->qemu_file_lock) {
if (ms->to_dst_file) {
- qemu_file_set_error(ms->to_dst_file, err);
+ qemu_file_set_error_obj(ms->to_dst_file, ret, err);
}
}
+ } else {
+ error_report_err(err);
}
}
@@ -304,9 +306,10 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
iova, iova + iotlb->addr_mask);
if (iotlb->target_as != &address_space_memory) {
- error_report("Wrong target AS \"%s\", only system memory is allowed",
- iotlb->target_as->name ? iotlb->target_as->name : "none");
- vfio_set_migration_error(-EINVAL);
+ error_setg(&local_err,
+ "Wrong target AS \"%s\", only system memory is allowed",
+ iotlb->target_as->name ? iotlb->target_as->name : "none");
+ vfio_set_migration_error(-EINVAL, local_err);
return;
}
@@ -339,11 +342,12 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
ret = vfio_container_dma_unmap(bcontainer, iova,
iotlb->addr_mask + 1, iotlb);
if (ret) {
- error_report("vfio_container_dma_unmap(%p, 0x%"HWADDR_PRIx", "
- "0x%"HWADDR_PRIx") = %d (%s)",
- bcontainer, iova,
- iotlb->addr_mask + 1, ret, strerror(-ret));
- vfio_set_migration_error(ret);
+ error_setg(&local_err,
+ "vfio_container_dma_unmap(%p, 0x%"HWADDR_PRIx", "
+ "0x%"HWADDR_PRIx") = %d (%s)",
+ bcontainer, iova,
+ iotlb->addr_mask + 1, ret, strerror(-ret));
+ vfio_set_migration_error(ret, local_err);
}
}
out:
@@ -1239,14 +1243,14 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
trace_vfio_iommu_map_dirty_notify(iova, iova + iotlb->addr_mask);
if (iotlb->target_as != &address_space_memory) {
- error_report("Wrong target AS \"%s\", only system memory is allowed",
- iotlb->target_as->name ? iotlb->target_as->name : "none");
+ error_setg(&local_err,
+ "Wrong target AS \"%s\", only system memory is allowed",
+ iotlb->target_as->name ? iotlb->target_as->name : "none");
goto out;
}
rcu_read_lock();
if (!vfio_get_xlat_addr(iotlb, NULL, &translated_addr, NULL, &local_err)) {
- error_report_err(local_err);
goto out_lock;
}
@@ -1257,7 +1261,6 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
"vfio_iommu_map_dirty_notify(%p, 0x%"HWADDR_PRIx", "
"0x%"HWADDR_PRIx") failed :", bcontainer, iova,
iotlb->addr_mask + 1);
- error_report_err(local_err);
}
out_lock:
@@ -1265,7 +1268,7 @@ out_lock:
out:
if (ret) {
- vfio_set_migration_error(ret);
+ vfio_set_migration_error(ret, local_err);
}
}
@@ -1385,8 +1388,7 @@ static void vfio_listener_log_sync(MemoryListener *listener,
if (vfio_devices_all_dirty_tracking(bcontainer)) {
ret = vfio_sync_dirty_bitmap(bcontainer, section, &local_err);
if (ret) {
- error_report_err(local_err);
- vfio_set_migration_error(ret);
+ vfio_set_migration_error(ret, local_err);
}
}
}
--
2.43.2
^ permalink raw reply related [flat|nested] 42+ messages in thread
* Re: [PATCH v2 02/21] migration: Remove SaveStateHandler and LoadStateHandler typedefs
2024-02-27 18:03 ` [PATCH v2 02/21] migration: Remove SaveStateHandler and LoadStateHandler typedefs Cédric Le Goater
@ 2024-02-29 1:21 ` Fabiano Rosas
2024-02-29 3:55 ` Peter Xu
1 sibling, 0 replies; 42+ messages in thread
From: Fabiano Rosas @ 2024-02-29 1:21 UTC (permalink / raw)
To: Cédric Le Goater, qemu-devel
Cc: Peter Xu, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Cédric Le Goater
Cédric Le Goater <clg@redhat.com> writes:
> They are only used once.
>
> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 02/21] migration: Remove SaveStateHandler and LoadStateHandler typedefs
2024-02-27 18:03 ` [PATCH v2 02/21] migration: Remove SaveStateHandler and LoadStateHandler typedefs Cédric Le Goater
2024-02-29 1:21 ` Fabiano Rosas
@ 2024-02-29 3:55 ` Peter Xu
1 sibling, 0 replies; 42+ messages in thread
From: Peter Xu @ 2024-02-29 3:55 UTC (permalink / raw)
To: Cédric Le Goater
Cc: qemu-devel, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé
On Tue, Feb 27, 2024 at 07:03:26PM +0100, Cédric Le Goater wrote:
> They are only used once.
>
> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 03/21] migration: Add documentation for SaveVMHandlers
2024-02-27 18:03 ` [PATCH v2 03/21] migration: Add documentation for SaveVMHandlers Cédric Le Goater
@ 2024-02-29 4:10 ` Peter Xu
2024-02-29 13:19 ` Cédric Le Goater
2024-03-04 9:05 ` Avihai Horon
1 sibling, 1 reply; 42+ messages in thread
From: Peter Xu @ 2024-02-29 4:10 UTC (permalink / raw)
To: Cédric Le Goater
Cc: qemu-devel, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé
On Tue, Feb 27, 2024 at 07:03:27PM +0100, Cédric Le Goater wrote:
> The SaveVMHandlers structure is still in use for complex subsystems
> and devices. Document the handlers since we are going to modify a few
> later.
>
> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Still a few nitpick comments below.
> ---
> include/migration/register.h | 257 +++++++++++++++++++++++++++++++----
> 1 file changed, 231 insertions(+), 26 deletions(-)
>
> diff --git a/include/migration/register.h b/include/migration/register.h
> index 2e6a7d766e62f64940086b7b511249c9ff21fa62..2cc71ec45f65bf2884c9e7a823d2968752f15c20 100644
> --- a/include/migration/register.h
> +++ b/include/migration/register.h
> @@ -16,30 +16,129 @@
>
> #include "hw/vmstate-if.h"
>
> +/**
> + * struct SaveVMHandlers: handler structure to finely control
> + * migration of complex subsystems and devices, such as RAM, block and
> + * VFIO.
> + */
> typedef struct SaveVMHandlers {
> - /* This runs inside the BQL. */
> +
> + /* The following handlers runs inside the BQL. */
> +
> + /**
> + * @save_state
> + *
> + * Saves state section on the source using the latest state format
> + * version.
> + *
> + * Legacy method. Should be deprecated when all users are ported
> + * to VMState.
VMStateDescription?
> + *
> + * @f: QEMUFile where to send the data
> + * @opaque: data pointer passed to register_savevm_live()
> + */
> void (*save_state)(QEMUFile *f, void *opaque);
>
> - /*
> - * save_prepare is called early, even before migration starts, and can be
> - * used to perform early checks.
> + /**
> + * @save_prepare
> + *
> + * Called early, even before migration starts, and can be used to
> + * perform early checks.
> + *
> + * @opaque: data pointer passed to register_savevm_live()
> + * @errp: pointer to Error*, to store an error if it happens.
> + *
> + * Returns zero to indicate success and negative for error
> */
> int (*save_prepare)(void *opaque, Error **errp);
> +
> + /**
> + * @save_setup
> + *
> + * Initializes the data structures on the source and transmits
> + * first section containing information on the device
> + *
> + * @f: QEMUFile where to send the data
> + * @opaque: data pointer passed to register_savevm_live()
> + *
> + * Returns zero to indicate success and negative for error
> + */
> int (*save_setup)(QEMUFile *f, void *opaque);
> +
> + /**
> + * @save_cleanup
> + *
> + * Performs save related cleanup
> + *
> + * @opaque: data pointer passed to register_savevm_live()
> + *
> + * Returns zero to indicate success and negative for error
> + */
> void (*save_cleanup)(void *opaque);
> +
> + /**
> + * @save_live_complete_postcopy
> + *
> + * Called at the end of postcopy for all postcopyiable devices.
> + *
> + * @f: QEMUFile where to send the data
> + * @opaque: data pointer passed to register_savevm_live()
> + *
> + * Returns zero to indicate success and negative for error
> + */
> int (*save_live_complete_postcopy)(QEMUFile *f, void *opaque);
> +
> + /**
> + * @save_live_complete_precopy
> + *
> + * Transmits the last section for the device containing any
> + * remaining data.
This doesn't flush all remaining data when postcopy is enabled and
supported on the specific device. Perhaps attach one more sentence to
describe that case?
Transmits the last section for the device containing any remaining data
at the end of a precopy phase. When postcopy is enabled, devices that
support postcopy will skip this step, where the final data will be
flushed at the end of postcopy via @save_live_complete_postcopy instead.
> + *
> + * @f: QEMUFile where to send the data
> + * @opaque: data pointer passed to register_savevm_live()
> + *
> + * Returns zero to indicate success and negative for error
> + */
> int (*save_live_complete_precopy)(QEMUFile *f, void *opaque);
>
> /* This runs both outside and inside the BQL. */
> +
> + /**
> + * @is_active
> + *
> + * Will skip a state section if not active
> + *
> + * @opaque: data pointer passed to register_savevm_live()
> + *
> + * Returns true if state section is active else false
> + */
> bool (*is_active)(void *opaque);
> +
> + /**
> + * @has_postcopy
> + *
> + * checks if a device supports postcopy
> + *
> + * @opaque: data pointer passed to register_savevm_live()
> + *
> + * Returns true for postcopy support else false
> + */
> bool (*has_postcopy)(void *opaque);
>
> - /* is_active_iterate
> - * If it is not NULL then qemu_savevm_state_iterate will skip iteration if
> - * it returns false. For example, it is needed for only-postcopy-states,
> - * which needs to be handled by qemu_savevm_state_setup and
> - * qemu_savevm_state_pending, but do not need iterations until not in
> - * postcopy stage.
> + /**
> + * @is_active_iterate
> + *
> + * As #SaveVMHandlers.is_active(), will skip an inactive state
> + * section in qemu_savevm_state_iterate.
> + *
> + * For example, it is needed for only-postcopy-states, which needs
> + * to be handled by qemu_savevm_state_setup() and
> + * qemu_savevm_state_pending(), but do not need iterations until
> + * not in postcopy stage.
> + *
> + * @opaque: data pointer passed to register_savevm_live()
> + *
> + * Returns true if state section is active else false
> */
> bool (*is_active_iterate)(void *opaque);
>
> @@ -48,44 +147,150 @@ typedef struct SaveVMHandlers {
> * use data that is local to the migration thread or protected
> * by other locks.
> */
> +
> + /**
> + * @save_live_iterate
> + *
> + * Should send a chunk of data until the point that stream
> + * bandwidth limits tell it to stop. Each call generates one
> + * section.
> + *
> + * @f: QEMUFile where to send the data
> + * @opaque: data pointer passed to register_savevm_live()
> + *
> + * Returns zero to indicate success and negative for error
> + */
> int (*save_live_iterate)(QEMUFile *f, void *opaque);
>
> /* This runs outside the BQL! */
> - /* Note for save_live_pending:
> - * must_precopy:
> - * - must be migrated in precopy or in stopped state
> - * - i.e. must be migrated before target start
> - *
> - * can_postcopy:
> - * - can migrate in postcopy or in stopped state
> - * - i.e. can migrate after target start
> - * - some can also be migrated during precopy (RAM)
> - * - some must be migrated after source stops (block-dirty-bitmap)
> - *
> - * Sum of can_postcopy and must_postcopy is the whole amount of
> +
> + /**
> + * @state_pending_estimate
> + *
> + * This estimates the remaining data to transfer
> + *
> + * Sum of @can_postcopy and @must_postcopy is the whole amount of
> * pending data.
> + *
> + * @opaque: data pointer passed to register_savevm_live()
> + * @must_precopy: amount of data that must be migrated in precopy
> + * or in stopped state, i.e. that must be migrated
> + * before target start.
> + * @can_postcopy: amount of data that can be migrated in postcopy
> + * or in stopped state, i.e. after target start.
> + * Some can also be migrated during precopy (RAM).
> + * Some must be migrated after source stops
> + * (block-dirty-bitmap)
> */
> - /* This estimates the remaining data to transfer */
> void (*state_pending_estimate)(void *opaque, uint64_t *must_precopy,
> uint64_t *can_postcopy);
> - /* This calculate the exact remaining data to transfer */
> +
> + /**
> + * @state_pending_exact
> + *
> + * This calculate the exact remaining data to transfer
> + *
> + * Sum of @can_postcopy and @must_postcopy is the whole amount of
> + * pending data.
> + *
> + * @opaque: data pointer passed to register_savevm_live()
> + * @must_precopy: amount of data that must be migrated in precopy
> + * or in stopped state, i.e. that must be migrated
> + * before target start.
> + * @can_postcopy: amount of data that can be migrated in postcopy
> + * or in stopped state, i.e. after target start.
> + * Some can also be migrated during precopy (RAM).
> + * Some must be migrated after source stops
> + * (block-dirty-bitmap)
> + */
> void (*state_pending_exact)(void *opaque, uint64_t *must_precopy,
> uint64_t *can_postcopy);
> +
> + /**
> + * @load_state
> + *
> + * Load sections generated by any of the save functions that
> + * generate sections.
> + *
> + * Legacy method. Should be deprecated when all users are ported
> + * to VMState.
VMStateDescription.
> + *
> + * @f: QEMUFile where to receive the data
> + * @opaque: data pointer passed to register_savevm_live()
> + * @version_id: the maximum version_id supported
> + *
> + * Returns zero to indicate success and negative for error
> + */
> int (*load_state)(QEMUFile *f, void *opaque, int version_id);
> +
> + /**
> + * @load_setup
> + *
> + * Initializes the data structures on the destination.
> + *
> + * @f: QEMUFile where to receive the data
> + * @opaque: data pointer passed to register_savevm_live()
> + *
> + * Returns zero to indicate success and negative for error
> + */
> int (*load_setup)(QEMUFile *f, void *opaque);
> +
> + /**
> + * @load_cleanup
> + *
> + * Performs cleanup of load data structures
> + *
> + * @opaque: data pointer passed to register_savevm_live()
> + *
> + * Returns zero to indicate success and negative for error
> + */
> int (*load_cleanup)(void *opaque);
> - /* Called when postcopy migration wants to resume from failure */
> +
> + /**
> + * @resume_prepare
> + *
> + * Called when postcopy migration wants to resume from failure
> + *
> + * @s: Current migration state
> + * @opaque: data pointer passed to register_savevm_live()
> + *
> + * Returns zero to indicate success and negative for error
> + */
> int (*resume_prepare)(MigrationState *s, void *opaque);
> - /* Checks if switchover ack should be used. Called only in dest */
> +
> + /**
> + * @switchover_ack_needed
> + *
> + * Checks if switchover ack should be used. Called only on
> + * destination.
> + *
> + * @opaque: data pointer passed to register_savevm_live()
> + */
> bool (*switchover_ack_needed)(void *opaque);
> } SaveVMHandlers;
>
> +/**
> + * register_savevm_live: Register a set of custom migration handlers
> + *
> + * @idstr: state section identifier
> + * @instance_id: instance id
> + * @version_id: version id supported
> + * @ops: SaveVMHandlers structurex
> + * @opaque: data pointer passed SaveVMHandlers handlers
> + */
> int register_savevm_live(const char *idstr,
> uint32_t instance_id,
> int version_id,
> const SaveVMHandlers *ops,
> void *opaque);
>
> +/**
> + * unregister_savevm: Unregister custom migration handlers
> + *
> + * @obj: object associated with state section
> + * @idstr: state section identifier
> + * @opaque: data pointer passed to register_savevm_live()
> + */
> void unregister_savevm(VMStateIf *obj, const char *idstr, void *opaque);
>
> #endif
> --
> 2.43.2
>
--
Peter Xu
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 04/21] migration: Do not call PRECOPY_NOTIFY_SETUP notifiers in case of error
2024-02-27 18:03 ` [PATCH v2 04/21] migration: Do not call PRECOPY_NOTIFY_SETUP notifiers in case of error Cédric Le Goater
@ 2024-02-29 4:12 ` Peter Xu
0 siblings, 0 replies; 42+ messages in thread
From: Peter Xu @ 2024-02-29 4:12 UTC (permalink / raw)
To: Cédric Le Goater
Cc: qemu-devel, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé
On Tue, Feb 27, 2024 at 07:03:28PM +0100, Cédric Le Goater wrote:
> When commit bd2270608fa0 ("migration/ram.c: add a notifier chain for
> precopy") added PRECOPY_NOTIFY_SETUP notifiers at the end of
> qemu_savevm_state_setup(), it didn't take into account a possible
> error in the loop calling vmstate_save() or .save_setup() handlers.
>
> Check ret value before calling the notifiers.
>
> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 05/21] migration: Add Error** argument to qemu_savevm_state_setup()
2024-02-27 18:03 ` [PATCH v2 05/21] migration: Add Error** argument to qemu_savevm_state_setup() Cédric Le Goater
@ 2024-02-29 4:19 ` Peter Xu
0 siblings, 0 replies; 42+ messages in thread
From: Peter Xu @ 2024-02-29 4:19 UTC (permalink / raw)
To: Cédric Le Goater
Cc: qemu-devel, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé
On Tue, Feb 27, 2024 at 07:03:29PM +0100, Cédric Le Goater wrote:
> @@ -3469,9 +3479,15 @@ static void *bg_migration_thread(void *opaque)
>
> bql_lock();
> qemu_savevm_state_header(s->to_dst_file);
> - qemu_savevm_state_setup(s->to_dst_file);
> + ret = qemu_savevm_state_setup(s->to_dst_file, &local_err);
> bql_unlock();
>
> + if (ret) {
> + migrate_set_error(s, local_err);
> + error_free(local_err);
> + goto fail;
> + }
This chunk may need to be moved above bql unlock.
> +
> qemu_savevm_wait_unplug(s, MIGRATION_STATUS_SETUP,
> MIGRATION_STATUS_ACTIVE);
>
--
Peter Xu
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 07/21] migration: Add Error** argument to .load_setup() handler
2024-02-27 18:03 ` [PATCH v2 07/21] migration: Add Error** argument to .load_setup() handler Cédric Le Goater
@ 2024-02-29 4:21 ` Peter Xu
0 siblings, 0 replies; 42+ messages in thread
From: Peter Xu @ 2024-02-29 4:21 UTC (permalink / raw)
To: Cédric Le Goater
Cc: qemu-devel, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé
On Tue, Feb 27, 2024 at 07:03:31PM +0100, Cédric Le Goater wrote:
> This will be useful to report errors at a higher level, mostly in VFIO
> today.
>
> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 08/21] memory: Add Error** argument to .log_global*() handlers
2024-02-27 18:03 ` [PATCH v2 08/21] memory: Add Error** argument to .log_global*() handlers Cédric Le Goater
@ 2024-02-29 5:29 ` Peter Xu
2024-03-04 13:38 ` Cédric Le Goater
0 siblings, 1 reply; 42+ messages in thread
From: Peter Xu @ 2024-02-29 5:29 UTC (permalink / raw)
To: Cédric Le Goater
Cc: qemu-devel, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Stefano Stabellini, Anthony Perard,
Paul Durrant, Michael S . Tsirkin, Paolo Bonzini,
David Hildenbrand
On Tue, Feb 27, 2024 at 07:03:32PM +0100, Cédric Le Goater wrote:
> Modify all log_global*() handlers to take an Error** parameter and
> return a bool. A new MEMORY_LISTENER_CALL_LOG_GLOBAL macro looping on
> the listeners is introduced to handle a possible error, which will
> would interrupt the loop if necessary.
>
> To be noted a change in memory_global_dirty_log_start() behavior as it
> will return as soon as an error is detected.
>
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Anthony Perard <anthony.perard@citrix.com>
> Cc: Paul Durrant <paul@xen.org>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: David Hildenbrand <david@redhat.com>
> Signed-off-by: Cédric Le Goater <clg@redhat.com>
> ---
> include/exec/memory.h | 15 ++++++--
> hw/i386/xen/xen-hvm.c | 6 ++--
> hw/vfio/common.c | 8 +++--
> hw/virtio/vhost.c | 6 ++--
> system/memory.c | 83 +++++++++++++++++++++++++++++++++++++------
> system/physmem.c | 5 +--
> 6 files changed, 101 insertions(+), 22 deletions(-)
>
> diff --git a/include/exec/memory.h b/include/exec/memory.h
> index 8626a355b310ed7b1a1db7978ba4b394032c2f15..4bc146c5ebdd377cd14a4e462f32cc945db5a0a8 100644
> --- a/include/exec/memory.h
> +++ b/include/exec/memory.h
> @@ -998,8 +998,11 @@ struct MemoryListener {
> * active at that time.
> *
> * @listener: The #MemoryListener.
> + * @errp: pointer to Error*, to store an error if it happens.
> + *
> + * Return: true on success, else false setting @errp with error.
> */
> - void (*log_global_start)(MemoryListener *listener);
> + bool (*log_global_start)(MemoryListener *listener, Error **errp);
>
> /**
> * @log_global_stop:
> @@ -1009,8 +1012,11 @@ struct MemoryListener {
> * the address space.
> *
> * @listener: The #MemoryListener.
> + * @errp: pointer to Error*, to store an error if it happens.
> + *
> + * Return: true on success, else false setting @errp with error.
> */
> - void (*log_global_stop)(MemoryListener *listener);
> + bool (*log_global_stop)(MemoryListener *listener, Error **errp);
>
> /**
> * @log_global_after_sync:
> @@ -1019,8 +1025,11 @@ struct MemoryListener {
> * for any #MemoryRegionSection.
> *
> * @listener: The #MemoryListener.
> + * @errp: pointer to Error*, to store an error if it happens.
> + *
> + * Return: true on success, else false setting @errp with error.
> */
> - void (*log_global_after_sync)(MemoryListener *listener);
> + bool (*log_global_after_sync)(MemoryListener *listener, Error **errp);
>
> /**
> * @eventfd_add:
> diff --git a/hw/i386/xen/xen-hvm.c b/hw/i386/xen/xen-hvm.c
> index f42621e6742552035122ea58092c91c3458338ff..925a207b494b4eed52d5f360b554f18ac8a9806d 100644
> --- a/hw/i386/xen/xen-hvm.c
> +++ b/hw/i386/xen/xen-hvm.c
> @@ -446,16 +446,18 @@ static void xen_log_sync(MemoryListener *listener, MemoryRegionSection *section)
> int128_get64(section->size));
> }
>
> -static void xen_log_global_start(MemoryListener *listener)
> +static bool xen_log_global_start(MemoryListener *listener, Error **errp)
> {
> if (xen_enabled()) {
> xen_in_migration = true;
> }
> + return true;
> }
>
> -static void xen_log_global_stop(MemoryListener *listener)
> +static bool xen_log_global_stop(MemoryListener *listener, Error **errp)
> {
> xen_in_migration = false;
> + return true;
> }
>
> static const MemoryListener xen_memory_listener = {
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 059bfdc07a85e2eb908df828c1f42104d683e911..8bba95ba6a2010b78cae54c6905857686bbb6309 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -1075,7 +1075,8 @@ out:
> return ret;
> }
>
> -static void vfio_listener_log_global_start(MemoryListener *listener)
> +static bool vfio_listener_log_global_start(MemoryListener *listener,
> + Error **errp)
> {
> VFIOContainerBase *bcontainer = container_of(listener, VFIOContainerBase,
> listener);
> @@ -1092,9 +1093,11 @@ static void vfio_listener_log_global_start(MemoryListener *listener)
> ret, strerror(-ret));
> vfio_set_migration_error(ret);
> }
> + return !!ret;
> }
>
> -static void vfio_listener_log_global_stop(MemoryListener *listener)
> +static bool vfio_listener_log_global_stop(MemoryListener *listener,
> + Error **errp)
> {
> VFIOContainerBase *bcontainer = container_of(listener, VFIOContainerBase,
> listener);
> @@ -1111,6 +1114,7 @@ static void vfio_listener_log_global_stop(MemoryListener *listener)
> ret, strerror(-ret));
> vfio_set_migration_error(ret);
> }
> + return !!ret;
> }
Shouldn't both of these be "return !ret"?
Besides, maybe it'll be easier to go with either:
1) if to only add the new parameter, here we can constantly return true
to keep the old behavior, or,
2) allow failure within this patch, then error_report() should already do
error_setg() to errp.
>
> static int vfio_device_dma_logging_report(VFIODevice *vbasedev, hwaddr iova,
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 2c9ac794680ea9b65eba6cc22e70cf141e90aa73..7a555f941934991a72a2817e5505fe0ce6d6fc64 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -1044,7 +1044,7 @@ check_dev_state:
> return r;
> }
>
> -static void vhost_log_global_start(MemoryListener *listener)
> +static bool vhost_log_global_start(MemoryListener *listener, Error **errp)
> {
> int r;
>
> @@ -1052,9 +1052,10 @@ static void vhost_log_global_start(MemoryListener *listener)
> if (r < 0) {
> abort();
> }
> + return true;
> }
>
> -static void vhost_log_global_stop(MemoryListener *listener)
> +static bool vhost_log_global_stop(MemoryListener *listener, Error **errp)
> {
> int r;
>
> @@ -1062,6 +1063,7 @@ static void vhost_log_global_stop(MemoryListener *listener)
> if (r < 0) {
> abort();
> }
> + return true;
> }
>
> static void vhost_log_start(MemoryListener *listener,
> diff --git a/system/memory.c b/system/memory.c
> index a229a79988fce2aa3cb77e3a130db4c694e8cd49..af06157ead5b1272548e87f79ab9fb3036055328 100644
> --- a/system/memory.c
> +++ b/system/memory.c
> @@ -127,6 +127,35 @@ enum ListenerDirection { Forward, Reverse };
> } \
> } while (0)
>
> +#define MEMORY_LISTENER_CALL_LOG_GLOBAL(_callback, _direction, _errp, \
> + _args...) \
> + do { \
> + MemoryListener *_listener; \
> + \
> + switch (_direction) { \
> + case Forward: \
> + QTAILQ_FOREACH(_listener, &memory_listeners, link) { \
> + if (_listener->_callback) { \
> + if (!_listener->_callback(_listener, _errp, ##_args)) { \
> + break; \
> + } \
> + } \
> + } \
> + break; \
> + case Reverse: \
> + QTAILQ_FOREACH_REVERSE(_listener, &memory_listeners, link) { \
> + if (_listener->_callback) { \
> + if (!_listener->_callback(_listener, _errp, ##_args)) { \
> + break; \
> + } \
> + } \
> + } \
> + break; \
> + default: \
> + abort(); \
> + }; \
> + } while (0)
> +
> #define MEMORY_LISTENER_CALL(_as, _callback, _direction, _section, _args...) \
> do { \
> MemoryListener *_listener; \
> @@ -2903,7 +2932,13 @@ void memory_global_dirty_log_sync(bool last_stage)
>
> void memory_global_after_dirty_log_sync(void)
> {
> - MEMORY_LISTENER_CALL_GLOBAL(log_global_after_sync, Forward);
> + Error *local_err = NULL;
> +
> + MEMORY_LISTENER_CALL_LOG_GLOBAL(log_global_after_sync, Forward,
> + &local_err);
> + if (local_err) {
> + error_report_err(local_err);
> + }
> }
>
> /*
> @@ -2912,18 +2947,22 @@ void memory_global_after_dirty_log_sync(void)
> */
> static unsigned int postponed_stop_flags;
> static VMChangeStateEntry *vmstate_change;
> -static void memory_global_dirty_log_stop_postponed_run(void);
> +static bool memory_global_dirty_log_stop_postponed_run(Error **errp);
>
> void memory_global_dirty_log_start(unsigned int flags)
> {
> unsigned int old_flags;
> + Error *local_err = NULL;
>
> assert(flags && !(flags & (~GLOBAL_DIRTY_MASK)));
>
> if (vmstate_change) {
> /* If there is postponed stop(), operate on it first */
> postponed_stop_flags &= ~flags;
> - memory_global_dirty_log_stop_postponed_run();
> + if (!memory_global_dirty_log_stop_postponed_run(&local_err)) {
> + error_report_err(local_err);
> + return;
> + }
> }
>
> flags &= ~global_dirty_tracking;
> @@ -2936,15 +2975,22 @@ void memory_global_dirty_log_start(unsigned int flags)
> trace_global_dirty_changed(global_dirty_tracking);
>
> if (!old_flags) {
> - MEMORY_LISTENER_CALL_GLOBAL(log_global_start, Forward);
> + MEMORY_LISTENER_CALL_LOG_GLOBAL(log_global_start, Forward,
> + &local_err);
> + if (local_err) {
> + error_report_err(local_err);
> + return;
> + }
> memory_region_transaction_begin();
> memory_region_update_pending = true;
> memory_region_transaction_commit();
> }
> }
>
> -static void memory_global_dirty_log_do_stop(unsigned int flags)
> +static bool memory_global_dirty_log_do_stop(unsigned int flags, Error **errp)
> {
> + ERRP_GUARD();
> +
> assert(flags && !(flags & (~GLOBAL_DIRTY_MASK)));
> assert((global_dirty_tracking & flags) == flags);
> global_dirty_tracking &= ~flags;
> @@ -2955,39 +3001,49 @@ static void memory_global_dirty_log_do_stop(unsigned int flags)
> memory_region_transaction_begin();
> memory_region_update_pending = true;
> memory_region_transaction_commit();
> - MEMORY_LISTENER_CALL_GLOBAL(log_global_stop, Reverse);
> + MEMORY_LISTENER_CALL_LOG_GLOBAL(log_global_stop, Reverse, errp);
> }
> + return !*errp;
> }
>
> /*
> * Execute the postponed dirty log stop operations if there is, then reset
> * everything (including the flags and the vmstate change hook).
> */
> -static void memory_global_dirty_log_stop_postponed_run(void)
> +static bool memory_global_dirty_log_stop_postponed_run(Error **errp)
> {
> + bool ret = true;
> +
> /* This must be called with the vmstate handler registered */
> assert(vmstate_change);
>
> /* Note: postponed_stop_flags can be cleared in log start routine */
> if (postponed_stop_flags) {
> - memory_global_dirty_log_do_stop(postponed_stop_flags);
> + ret = memory_global_dirty_log_do_stop(postponed_stop_flags, errp);
> postponed_stop_flags = 0;
> }
>
> qemu_del_vm_change_state_handler(vmstate_change);
> vmstate_change = NULL;
> + return ret;
> }
>
> static void memory_vm_change_state_handler(void *opaque, bool running,
> RunState state)
> {
> + Error *local_err = NULL;
> +
> if (running) {
> - memory_global_dirty_log_stop_postponed_run();
> + if (!memory_global_dirty_log_stop_postponed_run(&local_err)) {
> + error_report_err(local_err);
> + }
> }
> }
>
> void memory_global_dirty_log_stop(unsigned int flags)
> {
> + Error *local_err = NULL;
> +
> if (!runstate_is_running()) {
> /* Postpone the dirty log stop, e.g., to when VM starts again */
> if (vmstate_change) {
> @@ -3001,7 +3057,9 @@ void memory_global_dirty_log_stop(unsigned int flags)
> return;
> }
>
> - memory_global_dirty_log_do_stop(flags);
> + if (!memory_global_dirty_log_do_stop(flags, &local_err)) {
> + error_report_err(local_err);
> + }
> }
>
> static void listener_add_address_space(MemoryListener *listener,
> @@ -3009,13 +3067,16 @@ static void listener_add_address_space(MemoryListener *listener,
> {
> FlatView *view;
> FlatRange *fr;
> + Error *local_err = NULL;
>
> if (listener->begin) {
> listener->begin(listener);
> }
> if (global_dirty_tracking) {
> if (listener->log_global_start) {
> - listener->log_global_start(listener);
> + if (!listener->log_global_start(listener, &local_err)) {
> + error_report_err(local_err);
> + }
> }
> }
The stop process is indeed more involved due to the postponed mechanism.
Is it really a goal that we should also allow global stop() to be able to
fail? Or maybe we only care about global start() on the "VFIO doesn't have
enough dirty tracking unit for the current device" issue?
It makes more sense to me to allow a start() to fail rather than stop(),
because if start()s failed we can still fallback to stop() on whatever got
start()ed already, and fail the operation that triggered the start().
If stop() can also fail I don't know what we can do; consider if the
fallback stop()s failed above. It sounds just more challenging to allow
stop() to fail v.s. start() only from that regard to behave correctly.
Same question to after_sync(); that seems only to be used by tcg. If VFIO
has no plan to leverage that Error** maybe we can keep them as-is?
>
> diff --git a/system/physmem.c b/system/physmem.c
> index e3ebc19eefd8050a1dee16e3d1449f0c144f751f..9adbf9aea847cd80bdac6dca466fb476844ac048 100644
> --- a/system/physmem.c
> +++ b/system/physmem.c
> @@ -148,7 +148,7 @@ typedef struct subpage_t {
>
> static void io_mem_init(void);
> static void memory_map_init(void);
> -static void tcg_log_global_after_sync(MemoryListener *listener);
> +static bool tcg_log_global_after_sync(MemoryListener *listener, Error **errp);
> static void tcg_commit(MemoryListener *listener);
>
> /**
> @@ -2475,7 +2475,7 @@ static void do_nothing(CPUState *cpu, run_on_cpu_data d)
> {
> }
>
> -static void tcg_log_global_after_sync(MemoryListener *listener)
> +static bool tcg_log_global_after_sync(MemoryListener *listener, Error **errp)
> {
> CPUAddressSpace *cpuas;
>
> @@ -2507,6 +2507,7 @@ static void tcg_log_global_after_sync(MemoryListener *listener)
> cpuas = container_of(listener, CPUAddressSpace, tcg_as_listener);
> run_on_cpu(cpuas->cpu, do_nothing, RUN_ON_CPU_NULL);
> }
> + return true;
> }
>
> static void tcg_commit_cpu(CPUState *cpu, run_on_cpu_data data)
> --
> 2.43.2
>
--
Peter Xu
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 11/21] migration: Fix migration termination
2024-02-27 18:03 ` [PATCH v2 11/21] migration: Fix migration termination Cédric Le Goater
@ 2024-02-29 5:34 ` Peter Xu
2024-02-29 15:13 ` Cédric Le Goater
0 siblings, 1 reply; 42+ messages in thread
From: Peter Xu @ 2024-02-29 5:34 UTC (permalink / raw)
To: Cédric Le Goater
Cc: qemu-devel, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé
On Tue, Feb 27, 2024 at 07:03:35PM +0100, Cédric Le Goater wrote:
> Handle migration termination when in SETUP state. This can happen if
> qemu_savevm_state_setup() fails.
>
> Signed-off-by: Cédric Le Goater <clg@redhat.com>
> ---
> migration/migration.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index c1a62b696f62c0d5aca0505e58bc4dc0ff561fde..63294417ff9cae868ad8a167094a795fc30e4da0 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -3161,6 +3161,8 @@ static void migration_iteration_finish(MigrationState *s)
> }
> }
> break;
> + case MIGRATION_STATUS_SETUP:
> + break;
>
> default:
> /* Should not reach here, but if so, forgive the VM. */
> @@ -3192,6 +3194,8 @@ static void bg_migration_iteration_finish(MigrationState *s)
> case MIGRATION_STATUS_CANCELLED:
> case MIGRATION_STATUS_CANCELLING:
> break;
> + case MIGRATION_STATUS_SETUP:
> + break;
>
> default:
> /* Should not reach here, but if so, forgive the VM. */
Would this cause "query-migrate" to keep reporting "SETUP" even if setup()
failed?
IIUC we may need to set state to FAILED when setup() failed. If so, this
patch might not be needed.
Thanks,
--
Peter Xu
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 06/21] migration: Add Error** argument to .save_setup() handler
2024-02-27 18:03 ` [PATCH v2 06/21] migration: Add Error** argument to .save_setup() handler Cédric Le Goater
@ 2024-02-29 6:32 ` Markus Armbruster
2024-02-29 7:20 ` Vladimir Sementsov-Ogievskiy
2024-02-29 13:34 ` Cédric Le Goater
0 siblings, 2 replies; 42+ messages in thread
From: Markus Armbruster @ 2024-02-29 6:32 UTC (permalink / raw)
To: Cédric Le Goater
Cc: qemu-devel, Peter Xu, Fabiano Rosas, Alex Williamson,
Avihai Horon, Philippe Mathieu-Daudé, Nicholas Piggin,
Harsh Prateek Bora, Halil Pasic, Thomas Huth, Eric Blake,
Vladimir Sementsov-Ogievskiy, John Snow, Stefan Hajnoczi
Cédric Le Goater <clg@redhat.com> writes:
> The purpose is to record a potential error in the migration stream if
> qemu_savevm_state_setup() fails. Most of the current .save_setup()
> handlers can be modified to use the Error argument instead of managing
> their own and calling locally error_report(). The following patches
> will introduce such changes for VFIO first.
>
> Cc: Nicholas Piggin <npiggin@gmail.com>
> Cc: Harsh Prateek Bora <harshpb@linux.ibm.com>
> Cc: Halil Pasic <pasic@linux.ibm.com>
> Cc: Thomas Huth <thuth@redhat.com>
> Cc: Eric Blake <eblake@redhat.com>
> Cc: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> Cc: John Snow <jsnow@redhat.com>
> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> Reviewed-by: Peter Xu <peterx@redhat.com>
> Signed-off-by: Cédric Le Goater <clg@redhat.com>
> ---
[...]
> diff --git a/hw/s390x/s390-stattrib.c b/hw/s390x/s390-stattrib.c
> index c483b62a9b5f71772639fc180bdad15ecb6711cb..c934df424a555d83d2198f5ddfc0cbe0ea98e9ec 100644
> --- a/hw/s390x/s390-stattrib.c
> +++ b/hw/s390x/s390-stattrib.c
> @@ -166,7 +166,7 @@ static int cmma_load(QEMUFile *f, void *opaque, int version_id)
> return ret;
> }
>
> -static int cmma_save_setup(QEMUFile *f, void *opaque)
> +static int cmma_save_setup(QEMUFile *f, void *opaque, Error **errp)
> {
> S390StAttribState *sas = S390_STATTRIB(opaque);
> S390StAttribClass *sac = S390_STATTRIB_GET_CLASS(sas);
int res;
/*
* Signal that we want to start a migration, thus needing PGSTE dirty
* tracking.
*/
res = sac->set_migrationmode(sas, 1);
if (res) {
return res;
I believe this is a failure return.
Anti-pattern: fail without setting an error. There might be more
elsewhere in the series.
qapi/error.h's big comment:
* - On success, the function should not touch *errp. On failure, it
* should set a new error, e.g. with error_setg(errp, ...), or
* propagate an existing one, e.g. with error_propagate(errp, ...).
*
* - Whenever practical, also return a value that indicates success /
* failure. This can make the error checking more concise, and can
* avoid useless error object creation and destruction. Note that
* we still have many functions returning void. We recommend
* • bool-valued functions return true on success / false on failure,
* • pointer-valued functions return non-null / null pointer, and
* • integer-valued functions return non-negative / negative.
}
qemu_put_be64(f, STATTR_FLAG_EOS);
return 0;
}
When adding Error **errp to a function, you must also add code to set an
error on failure to every failure path. Adding it in a later patch in
the same series can be okay, but I'd add a TODO comment to the function
then, and mention it in the commit message.
Questions?
[...]
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 06/21] migration: Add Error** argument to .save_setup() handler
2024-02-29 6:32 ` Markus Armbruster
@ 2024-02-29 7:20 ` Vladimir Sementsov-Ogievskiy
2024-02-29 10:35 ` Thomas Huth
2024-02-29 13:34 ` Cédric Le Goater
1 sibling, 1 reply; 42+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2024-02-29 7:20 UTC (permalink / raw)
To: Markus Armbruster, Cédric Le Goater
Cc: qemu-devel, Peter Xu, Fabiano Rosas, Alex Williamson,
Avihai Horon, Philippe Mathieu-Daudé, Nicholas Piggin,
Harsh Prateek Bora, Halil Pasic, Thomas Huth, Eric Blake,
John Snow, Stefan Hajnoczi
On 29.02.24 09:32, Markus Armbruster wrote:
> Cédric Le Goater <clg@redhat.com> writes:
>
>> The purpose is to record a potential error in the migration stream if
>> qemu_savevm_state_setup() fails. Most of the current .save_setup()
>> handlers can be modified to use the Error argument instead of managing
>> their own and calling locally error_report(). The following patches
>> will introduce such changes for VFIO first.
>>
>> Cc: Nicholas Piggin <npiggin@gmail.com>
>> Cc: Harsh Prateek Bora <harshpb@linux.ibm.com>
>> Cc: Halil Pasic <pasic@linux.ibm.com>
>> Cc: Thomas Huth <thuth@redhat.com>
>> Cc: Eric Blake <eblake@redhat.com>
>> Cc: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>> Cc: John Snow <jsnow@redhat.com>
>> Cc: Stefan Hajnoczi <stefanha@redhat.com>
>> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
>> Reviewed-by: Peter Xu <peterx@redhat.com>
>> Signed-off-by: Cédric Le Goater <clg@redhat.com>
>> ---
>
> [...]
>
>> diff --git a/hw/s390x/s390-stattrib.c b/hw/s390x/s390-stattrib.c
>> index c483b62a9b5f71772639fc180bdad15ecb6711cb..c934df424a555d83d2198f5ddfc0cbe0ea98e9ec 100644
>> --- a/hw/s390x/s390-stattrib.c
>> +++ b/hw/s390x/s390-stattrib.c
>> @@ -166,7 +166,7 @@ static int cmma_load(QEMUFile *f, void *opaque, int version_id)
>> return ret;
>> }
>>
>> -static int cmma_save_setup(QEMUFile *f, void *opaque)
>> +static int cmma_save_setup(QEMUFile *f, void *opaque, Error **errp)
>> {
>> S390StAttribState *sas = S390_STATTRIB(opaque);
>> S390StAttribClass *sac = S390_STATTRIB_GET_CLASS(sas);
> int res;
> /*
> * Signal that we want to start a migration, thus needing PGSTE dirty
> * tracking.
> */
> res = sac->set_migrationmode(sas, 1);
> if (res) {
> return res;
>
> I believe this is a failure return.
>
> Anti-pattern: fail without setting an error. There might be more
> elsewhere in the series.
>
> qapi/error.h's big comment:
>
> * - On success, the function should not touch *errp. On failure, it
> * should set a new error, e.g. with error_setg(errp, ...), or
> * propagate an existing one, e.g. with error_propagate(errp, ...).
> *
> * - Whenever practical, also return a value that indicates success /
> * failure. This can make the error checking more concise, and can
> * avoid useless error object creation and destruction. Note that
> * we still have many functions returning void. We recommend
> * • bool-valued functions return true on success / false on failure,
> * • pointer-valued functions return non-null / null pointer, and
> * • integer-valued functions return non-negative / negative.
>
> }
> qemu_put_be64(f, STATTR_FLAG_EOS);
> return 0;
> }
>
> When adding Error **errp to a function, you must also add code to set an
> error on failure to every failure path. Adding it in a later patch in
> the same series can be okay,
Personally, I'd prefer not doing so. Creating wrong commits and fixing them in same series - better to merge all fixes into bad commit:)
> but I'd add a TODO comment to the function
> then, and mention it in the commit message.
>
> Questions?
>
> [...]
>
--
Best regards,
Vladimir
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 06/21] migration: Add Error** argument to .save_setup() handler
2024-02-29 7:20 ` Vladimir Sementsov-Ogievskiy
@ 2024-02-29 10:35 ` Thomas Huth
2024-02-29 13:21 ` Markus Armbruster
0 siblings, 1 reply; 42+ messages in thread
From: Thomas Huth @ 2024-02-29 10:35 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy, Markus Armbruster,
Cédric Le Goater
Cc: qemu-devel, Peter Xu, Fabiano Rosas, Alex Williamson,
Avihai Horon, Philippe Mathieu-Daudé, Nicholas Piggin,
Harsh Prateek Bora, Halil Pasic, Eric Blake, John Snow,
Stefan Hajnoczi
On 29/02/2024 08.20, Vladimir Sementsov-Ogievskiy wrote:
> On 29.02.24 09:32, Markus Armbruster wrote:
>> Cédric Le Goater <clg@redhat.com> writes:
>>
>>> The purpose is to record a potential error in the migration stream if
>>> qemu_savevm_state_setup() fails. Most of the current .save_setup()
>>> handlers can be modified to use the Error argument instead of managing
>>> their own and calling locally error_report(). The following patches
>>> will introduce such changes for VFIO first.
>>>
>>> Cc: Nicholas Piggin <npiggin@gmail.com>
>>> Cc: Harsh Prateek Bora <harshpb@linux.ibm.com>
>>> Cc: Halil Pasic <pasic@linux.ibm.com>
>>> Cc: Thomas Huth <thuth@redhat.com>
>>> Cc: Eric Blake <eblake@redhat.com>
>>> Cc: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>>> Cc: John Snow <jsnow@redhat.com>
>>> Cc: Stefan Hajnoczi <stefanha@redhat.com>
>>> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
>>> Reviewed-by: Peter Xu <peterx@redhat.com>
>>> Signed-off-by: Cédric Le Goater <clg@redhat.com>
>>> ---
>>
>> [...]
>>
>>> diff --git a/hw/s390x/s390-stattrib.c b/hw/s390x/s390-stattrib.c
>>> index
>>> c483b62a9b5f71772639fc180bdad15ecb6711cb..c934df424a555d83d2198f5ddfc0cbe0ea98e9ec 100644
>>> --- a/hw/s390x/s390-stattrib.c
>>> +++ b/hw/s390x/s390-stattrib.c
>>> @@ -166,7 +166,7 @@ static int cmma_load(QEMUFile *f, void *opaque, int
>>> version_id)
>>> return ret;
>>> }
>>> -static int cmma_save_setup(QEMUFile *f, void *opaque)
>>> +static int cmma_save_setup(QEMUFile *f, void *opaque, Error **errp)
>>> {
>>> S390StAttribState *sas = S390_STATTRIB(opaque);
>>> S390StAttribClass *sac = S390_STATTRIB_GET_CLASS(sas);
>> int res;
>> /*
>> * Signal that we want to start a migration, thus needing PGSTE dirty
>> * tracking.
>> */
>> res = sac->set_migrationmode(sas, 1);
>> if (res) {
>> return res;
>>
>> I believe this is a failure return.
>>
>> Anti-pattern: fail without setting an error. There might be more
>> elsewhere in the series.
>>
>> qapi/error.h's big comment:
>>
>> * - On success, the function should not touch *errp. On failure, it
>> * should set a new error, e.g. with error_setg(errp, ...), or
>> * propagate an existing one, e.g. with error_propagate(errp, ...).
>> *
>> * - Whenever practical, also return a value that indicates success /
>> * failure. This can make the error checking more concise, and can
>> * avoid useless error object creation and destruction. Note that
>> * we still have many functions returning void. We recommend
>> * • bool-valued functions return true on success / false on failure,
>> * • pointer-valued functions return non-null / null pointer, and
>> * • integer-valued functions return non-negative / negative.
>>
>> }
>> qemu_put_be64(f, STATTR_FLAG_EOS);
>> return 0;
>> }
>>
>> When adding Error **errp to a function, you must also add code to set an
>> error on failure to every failure path. Adding it in a later patch in
>> the same series can be okay,
>
> Personally, I'd prefer not doing so. Creating wrong commits and fixing them
> in same series - better to merge all fixes into bad commit:)
I agree - that might create issues with bisecting later. Please fix it in
this patch here already!
Thanks,
Thomas
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 03/21] migration: Add documentation for SaveVMHandlers
2024-02-29 4:10 ` Peter Xu
@ 2024-02-29 13:19 ` Cédric Le Goater
0 siblings, 0 replies; 42+ messages in thread
From: Cédric Le Goater @ 2024-02-29 13:19 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé
On 2/29/24 05:10, Peter Xu wrote:
> On Tue, Feb 27, 2024 at 07:03:27PM +0100, Cédric Le Goater wrote:
>> The SaveVMHandlers structure is still in use for complex subsystems
>> and devices. Document the handlers since we are going to modify a few
>> later.
>>
>> Signed-off-by: Cédric Le Goater <clg@redhat.com>
>
> Reviewed-by: Peter Xu <peterx@redhat.com>
>
> Still a few nitpick comments below.
I have applied your suggestions for the next spin.
Thanks,
C.
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 06/21] migration: Add Error** argument to .save_setup() handler
2024-02-29 10:35 ` Thomas Huth
@ 2024-02-29 13:21 ` Markus Armbruster
2024-03-01 14:44 ` Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 42+ messages in thread
From: Markus Armbruster @ 2024-02-29 13:21 UTC (permalink / raw)
To: Thomas Huth
Cc: Vladimir Sementsov-Ogievskiy, Cédric Le Goater, qemu-devel,
Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Nicholas Piggin, Harsh Prateek Bora,
Halil Pasic, Eric Blake, John Snow, Stefan Hajnoczi
Thomas Huth <thuth@redhat.com> writes:
> On 29/02/2024 08.20, Vladimir Sementsov-Ogievskiy wrote:
>> On 29.02.24 09:32, Markus Armbruster wrote:
[...]
>>> Anti-pattern: fail without setting an error. There might be more
>>> elsewhere in the series.
>>>
>>> qapi/error.h's big comment:
>>>
>>> * - On success, the function should not touch *errp. On failure, it
>>> * should set a new error, e.g. with error_setg(errp, ...), or
>>> * propagate an existing one, e.g. with error_propagate(errp, ...).
>>> *
>>> * - Whenever practical, also return a value that indicates success /
>>> * failure. This can make the error checking more concise, and can
>>> * avoid useless error object creation and destruction. Note that
>>> * we still have many functions returning void. We recommend
>>> * • bool-valued functions return true on success / false on failure,
>>> * • pointer-valued functions return non-null / null pointer, and
>>> * • integer-valued functions return non-negative / negative.
>>>
>>> }
>>> qemu_put_be64(f, STATTR_FLAG_EOS);
>>> return 0;
>>> }
>>>
>>> When adding Error **errp to a function, you must also add code to set an
>>> error on failure to every failure path. Adding it in a later patch in
>>> the same series can be okay,
>>
>> Personally, I'd prefer not doing so. Creating wrong commits and fixing them in same series - better to merge all fixes into bad commit:)
>
> I agree - that might create issues with bisecting later. Please fix it in this patch here already!
Depends on the wrongness, really.
We don't want broken intermediate states, no argument.
But intermediate states that are merely unclean can be acceptable.
For instance, my commit a30ecde6e79 (net: Permit incremental conversion
of init functions to Error) added such Error ** parameters to a somewhat
tangled nest of functions, along with FIXME comments where errors
weren't set. The next few commits fixed most, but not all of them.
Later commits fixed some more. The one in tap-win32.c is still there
today.
This was acceptable, because it improved things from "bad error
reporting" to "okay error reporting in most cases, unclean and bad error
reporting in a few cases marked FIXME", with "a few" over time going
down to the one I can't test, and nobody else seems to care about.
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 06/21] migration: Add Error** argument to .save_setup() handler
2024-02-29 6:32 ` Markus Armbruster
2024-02-29 7:20 ` Vladimir Sementsov-Ogievskiy
@ 2024-02-29 13:34 ` Cédric Le Goater
1 sibling, 0 replies; 42+ messages in thread
From: Cédric Le Goater @ 2024-02-29 13:34 UTC (permalink / raw)
To: Markus Armbruster
Cc: qemu-devel, Peter Xu, Fabiano Rosas, Alex Williamson,
Avihai Horon, Philippe Mathieu-Daudé, Nicholas Piggin,
Harsh Prateek Bora, Halil Pasic, Thomas Huth, Eric Blake,
Vladimir Sementsov-Ogievskiy, John Snow, Stefan Hajnoczi
On 2/29/24 07:32, Markus Armbruster wrote:
> Cédric Le Goater <clg@redhat.com> writes:
>
>> The purpose is to record a potential error in the migration stream if
>> qemu_savevm_state_setup() fails. Most of the current .save_setup()
>> handlers can be modified to use the Error argument instead of managing
>> their own and calling locally error_report(). The following patches
>> will introduce such changes for VFIO first.
>>
>> Cc: Nicholas Piggin <npiggin@gmail.com>
>> Cc: Harsh Prateek Bora <harshpb@linux.ibm.com>
>> Cc: Halil Pasic <pasic@linux.ibm.com>
>> Cc: Thomas Huth <thuth@redhat.com>
>> Cc: Eric Blake <eblake@redhat.com>
>> Cc: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>> Cc: John Snow <jsnow@redhat.com>
>> Cc: Stefan Hajnoczi <stefanha@redhat.com>
>> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
>> Reviewed-by: Peter Xu <peterx@redhat.com>
>> Signed-off-by: Cédric Le Goater <clg@redhat.com>
>> ---
>
> [...]
>
>> diff --git a/hw/s390x/s390-stattrib.c b/hw/s390x/s390-stattrib.c
>> index c483b62a9b5f71772639fc180bdad15ecb6711cb..c934df424a555d83d2198f5ddfc0cbe0ea98e9ec 100644
>> --- a/hw/s390x/s390-stattrib.c
>> +++ b/hw/s390x/s390-stattrib.c
>> @@ -166,7 +166,7 @@ static int cmma_load(QEMUFile *f, void *opaque, int version_id)
>> return ret;
>> }
>>
>> -static int cmma_save_setup(QEMUFile *f, void *opaque)
>> +static int cmma_save_setup(QEMUFile *f, void *opaque, Error **errp)
>> {
>> S390StAttribState *sas = S390_STATTRIB(opaque);
>> S390StAttribClass *sac = S390_STATTRIB_GET_CLASS(sas);
> int res;
> /*
> * Signal that we want to start a migration, thus needing PGSTE dirty
> * tracking.
> */
> res = sac->set_migrationmode(sas, 1);
> if (res) {
> return res;
>
> I believe this is a failure return.
>
> Anti-pattern: fail without setting an error. There might be more
> elsewhere in the series.
>
> qapi/error.h's big comment:
>
> * - On success, the function should not touch *errp. On failure, it
> * should set a new error, e.g. with error_setg(errp, ...), or
> * propagate an existing one, e.g. with error_propagate(errp, ...).
> *
> * - Whenever practical, also return a value that indicates success /
> * failure. This can make the error checking more concise, and can
> * avoid useless error object creation and destruction. Note that
> * we still have many functions returning void. We recommend
> * • bool-valued functions return true on success / false on failure,
> * • pointer-valued functions return non-null / null pointer, and
> * • integer-valued functions return non-negative / negative.
>
> }
> qemu_put_be64(f, STATTR_FLAG_EOS);
> return 0;
> }
>
> When adding Error **errp to a function, you must also add code to set an
> error on failure to every failure path. Adding it in a later patch in
> the same series can be okay, but I'd add a TODO comment to the function
> then, and mention it in the commit message.
Indeed. I will check the other changes.
> Questions?
Perfectly clear.
Thanks,
C.
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 11/21] migration: Fix migration termination
2024-02-29 5:34 ` Peter Xu
@ 2024-02-29 15:13 ` Cédric Le Goater
0 siblings, 0 replies; 42+ messages in thread
From: Cédric Le Goater @ 2024-02-29 15:13 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé
On 2/29/24 06:34, Peter Xu wrote:
> On Tue, Feb 27, 2024 at 07:03:35PM +0100, Cédric Le Goater wrote:
>> Handle migration termination when in SETUP state. This can happen if
>> qemu_savevm_state_setup() fails.
>>
>> Signed-off-by: Cédric Le Goater <clg@redhat.com>
>> ---
>> migration/migration.c | 4 ++++
>> 1 file changed, 4 insertions(+)
>>
>> diff --git a/migration/migration.c b/migration/migration.c
>> index c1a62b696f62c0d5aca0505e58bc4dc0ff561fde..63294417ff9cae868ad8a167094a795fc30e4da0 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -3161,6 +3161,8 @@ static void migration_iteration_finish(MigrationState *s)
>> }
>> }
>> break;
>> + case MIGRATION_STATUS_SETUP:
>> + break;
>>
>> default:
>> /* Should not reach here, but if so, forgive the VM. */
>> @@ -3192,6 +3194,8 @@ static void bg_migration_iteration_finish(MigrationState *s)
>> case MIGRATION_STATUS_CANCELLED:
>> case MIGRATION_STATUS_CANCELLING:
>> break;
>> + case MIGRATION_STATUS_SETUP:
>> + break;
>>
>> default:
>> /* Should not reach here, but if so, forgive the VM. */
>
> Would this cause "query-migrate" to keep reporting "SETUP" even if setup()
> failed?
>
> IIUC we may need to set state to FAILED when setup() failed. If so, this
> patch might not be needed.
Oh yes. you are right. I will see how that can be done.
Thanks for the feedback.
C.
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 06/21] migration: Add Error** argument to .save_setup() handler
2024-02-29 13:21 ` Markus Armbruster
@ 2024-03-01 14:44 ` Vladimir Sementsov-Ogievskiy
2024-03-01 14:52 ` Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 42+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2024-03-01 14:44 UTC (permalink / raw)
To: Markus Armbruster, Thomas Huth
Cc: Cédric Le Goater, qemu-devel, Peter Xu, Fabiano Rosas,
Alex Williamson, Avihai Horon, Philippe Mathieu-Daudé,
Nicholas Piggin, Harsh Prateek Bora, Halil Pasic, Eric Blake,
John Snow, Stefan Hajnoczi
On 29.02.24 16:21, Markus Armbruster wrote:
> Thomas Huth <thuth@redhat.com> writes:
>
>> On 29/02/2024 08.20, Vladimir Sementsov-Ogievskiy wrote:
>>> On 29.02.24 09:32, Markus Armbruster wrote:
>
> [...]
>
>>>> Anti-pattern: fail without setting an error. There might be more
>>>> elsewhere in the series.
>>>>
>>>> qapi/error.h's big comment:
>>>>
>>>> * - On success, the function should not touch *errp. On failure, it
>>>> * should set a new error, e.g. with error_setg(errp, ...), or
>>>> * propagate an existing one, e.g. with error_propagate(errp, ...).
>>>> *
>>>> * - Whenever practical, also return a value that indicates success /
>>>> * failure. This can make the error checking more concise, and can
>>>> * avoid useless error object creation and destruction. Note that
>>>> * we still have many functions returning void. We recommend
>>>> * • bool-valued functions return true on success / false on failure,
>>>> * • pointer-valued functions return non-null / null pointer, and
>>>> * • integer-valued functions return non-negative / negative.
>>>>
>>>> }
>>>> qemu_put_be64(f, STATTR_FLAG_EOS);
>>>> return 0;
>>>> }
>>>>
>>>> When adding Error **errp to a function, you must also add code to set an
>>>> error on failure to every failure path. Adding it in a later patch in
>>>> the same series can be okay,
>>>
>>> Personally, I'd prefer not doing so. Creating wrong commits and fixing them in same series - better to merge all fixes into bad commit:)
>>
>> I agree - that might create issues with bisecting later. Please fix it in this patch here already!
>
> Depends on the wrongness, really.
In my understanding, unset errp on failure path is wrong enough.
For example, simple pattern
Error *local_err = NULL;
int ret = foo(&local_err);
if (ret < 0) {
error_report_err(local_err);
return;
}
will just crash.
When I write code, I expect that "errp is set" === "ret < 0", for all functions returning negative integer on failure.
Also, we have enough code that relying on errp for failure checking:
$ git grep 'if (local_err)' | wc -l
373
Of course, most of this should be checking failure of functions that return void, but who knows.
>
> We don't want broken intermediate states, no argument.
>
> But intermediate states that are merely unclean can be acceptable.
>
> For instance, my commit a30ecde6e79 (net: Permit incremental conversion
> of init functions to Error) added such Error ** parameters to a somewhat
> tangled nest of functions, along with FIXME comments where errors
> weren't set. The next few commits fixed most, but not all of them.
> Later commits fixed some more. The one in tap-win32.c is still there
> today.
>
> This was acceptable, because it improved things from "bad error
> reporting" to "okay error reporting in most cases, unclean and bad error
> reporting in a few cases marked FIXME", with "a few" over time going
> down to the one I can't test, and nobody else seems to care about.
>
--
Best regards,
Vladimir
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 06/21] migration: Add Error** argument to .save_setup() handler
2024-03-01 14:44 ` Vladimir Sementsov-Ogievskiy
@ 2024-03-01 14:52 ` Vladimir Sementsov-Ogievskiy
0 siblings, 0 replies; 42+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2024-03-01 14:52 UTC (permalink / raw)
To: Markus Armbruster, Thomas Huth
Cc: Cédric Le Goater, qemu-devel, Peter Xu, Fabiano Rosas,
Alex Williamson, Avihai Horon, Philippe Mathieu-Daudé,
Nicholas Piggin, Harsh Prateek Bora, Halil Pasic, Eric Blake,
John Snow, Stefan Hajnoczi
On 01.03.24 17:44, Vladimir Sementsov-Ogievskiy wrote:
> On 29.02.24 16:21, Markus Armbruster wrote:
>> Thomas Huth <thuth@redhat.com> writes:
>>
>>> On 29/02/2024 08.20, Vladimir Sementsov-Ogievskiy wrote:
>>>> On 29.02.24 09:32, Markus Armbruster wrote:
>>
>> [...]
>>
>>>>> Anti-pattern: fail without setting an error. There might be more
>>>>> elsewhere in the series.
>>>>>
>>>>> qapi/error.h's big comment:
>>>>>
>>>>> * - On success, the function should not touch *errp. On failure, it
>>>>> * should set a new error, e.g. with error_setg(errp, ...), or
>>>>> * propagate an existing one, e.g. with error_propagate(errp, ...).
>>>>> *
>>>>> * - Whenever practical, also return a value that indicates success /
>>>>> * failure. This can make the error checking more concise, and can
>>>>> * avoid useless error object creation and destruction. Note that
>>>>> * we still have many functions returning void. We recommend
>>>>> * • bool-valued functions return true on success / false on failure,
>>>>> * • pointer-valued functions return non-null / null pointer, and
>>>>> * • integer-valued functions return non-negative / negative.
>>>>>
>>>>> }
>>>>> qemu_put_be64(f, STATTR_FLAG_EOS);
>>>>> return 0;
>>>>> }
>>>>>
>>>>> When adding Error **errp to a function, you must also add code to set an
>>>>> error on failure to every failure path. Adding it in a later patch in
>>>>> the same series can be okay,
>>>>
>>>> Personally, I'd prefer not doing so. Creating wrong commits and fixing them in same series - better to merge all fixes into bad commit:)
>>>
>>> I agree - that might create issues with bisecting later. Please fix it in this patch here already!
>>
>> Depends on the wrongness, really.
>
> In my understanding, unset errp on failure path is wrong enough.
>
> For example, simple pattern
>
> Error *local_err = NULL;
>
> int ret = foo(&local_err);
> if (ret < 0) {
> error_report_err(local_err);
> return;
> }
>
> will just crash.
>
> When I write code, I expect that "errp is set" === "ret < 0", for all functions returning negative integer on failure.
>
> Also, we have enough code that relying on errp for failure checking:
> $ git grep 'if (local_err)' | wc -l
> 373
>
> Of course, most of this should be checking failure of functions that return void, but who knows.
>
>>
>> We don't want broken intermediate states, no argument.
>>
>> But intermediate states that are merely unclean can be acceptable.
>>
>> For instance, my commit a30ecde6e79 (net: Permit incremental conversion
>> of init functions to Error) added such Error ** parameters to a somewhat
>> tangled nest of functions, along with FIXME comments where errors
>> weren't set. The next few commits fixed most, but not all of them.
>> Later commits fixed some more. The one in tap-win32.c is still there
>> today.
>>
>> This was acceptable, because it improved things from "bad error
>> reporting" to "okay error reporting in most cases, unclean and bad error
>> reporting in a few cases marked FIXME", with "a few" over time going
>> down to the one I can't test, and nobody else seems to care about.
>>
You may be sure, that functions you modify are never used in conditions I've described above. But you can't guarantee that this will not change in future. Of course, if someone will create new call of the function, he should look (at least once) at that function and see "FIXME", but better not rely on this. Also, if someone will add a call to another function that calls function with bad error reporting, most probably he will not see the "FIXME"...
Formally, you should add such FIXME to all functions with errp, that may do (nested) calls to a function with FIXME
--
Best regards,
Vladimir
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 03/21] migration: Add documentation for SaveVMHandlers
2024-02-27 18:03 ` [PATCH v2 03/21] migration: Add documentation for SaveVMHandlers Cédric Le Goater
2024-02-29 4:10 ` Peter Xu
@ 2024-03-04 9:05 ` Avihai Horon
2024-03-04 10:29 ` Cédric Le Goater
1 sibling, 1 reply; 42+ messages in thread
From: Avihai Horon @ 2024-03-04 9:05 UTC (permalink / raw)
To: Cédric Le Goater, qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson,
Philippe Mathieu-Daudé
Hi Cedric,
A few nits below.
On 27/02/2024 20:03, Cédric Le Goater wrote:
> External email: Use caution opening links or attachments
>
>
> The SaveVMHandlers structure is still in use for complex subsystems
> and devices. Document the handlers since we are going to modify a few
> later.
>
> Signed-off-by: Cédric Le Goater <clg@redhat.com>
> ---
> include/migration/register.h | 257 +++++++++++++++++++++++++++++++----
> 1 file changed, 231 insertions(+), 26 deletions(-)
>
> diff --git a/include/migration/register.h b/include/migration/register.h
> index 2e6a7d766e62f64940086b7b511249c9ff21fa62..2cc71ec45f65bf2884c9e7a823d2968752f15c20 100644
> --- a/include/migration/register.h
> +++ b/include/migration/register.h
> @@ -16,30 +16,129 @@
>
> #include "hw/vmstate-if.h"
>
> +/**
> + * struct SaveVMHandlers: handler structure to finely control
> + * migration of complex subsystems and devices, such as RAM, block and
> + * VFIO.
> + */
> typedef struct SaveVMHandlers {
> - /* This runs inside the BQL. */
> +
> + /* The following handlers runs inside the BQL. */
s/runs/run
> +
> + /**
> + * @save_state
> + *
> + * Saves state section on the source using the latest state format
> + * version.
> + *
> + * Legacy method. Should be deprecated when all users are ported
> + * to VMState.
> + *
> + * @f: QEMUFile where to send the data
> + * @opaque: data pointer passed to register_savevm_live()
> + */
> void (*save_state)(QEMUFile *f, void *opaque);
>
> - /*
> - * save_prepare is called early, even before migration starts, and can be
> - * used to perform early checks.
> + /**
> + * @save_prepare
> + *
> + * Called early, even before migration starts, and can be used to
> + * perform early checks.
> + *
> + * @opaque: data pointer passed to register_savevm_live()
> + * @errp: pointer to Error*, to store an error if it happens.
> + *
> + * Returns zero to indicate success and negative for error
> */
> int (*save_prepare)(void *opaque, Error **errp);
> +
> + /**
> + * @save_setup
> + *
> + * Initializes the data structures on the source and transmits
> + * first section containing information on the device
> + *
> + * @f: QEMUFile where to send the data
> + * @opaque: data pointer passed to register_savevm_live()
> + *
> + * Returns zero to indicate success and negative for error
> + */
> int (*save_setup)(QEMUFile *f, void *opaque);
> +
> + /**
> + * @save_cleanup
> + *
> + * Performs save related cleanup
Use save_setup phrasing?
Uninitializes the data structures on the source.
> + *
> + * @opaque: data pointer passed to register_savevm_live()
> + *
> + * Returns zero to indicate success and negative for error
This can be dropped as it's a void function.
> + */
> void (*save_cleanup)(void *opaque);
> +
> + /**
> + * @save_live_complete_postcopy
> + *
> + * Called at the end of postcopy for all postcopyiable devices.
s/postcopyiable/postcopyable
> + *
> + * @f: QEMUFile where to send the data
> + * @opaque: data pointer passed to register_savevm_live()
> + *
> + * Returns zero to indicate success and negative for error
> + */
> int (*save_live_complete_postcopy)(QEMUFile *f, void *opaque);
> +
> + /**
> + * @save_live_complete_precopy
> + *
> + * Transmits the last section for the device containing any
> + * remaining data.
> + *
> + * @f: QEMUFile where to send the data
> + * @opaque: data pointer passed to register_savevm_live()
> + *
> + * Returns zero to indicate success and negative for error
> + */
> int (*save_live_complete_precopy)(QEMUFile *f, void *opaque);
>
> /* This runs both outside and inside the BQL. */
> +
> + /**
> + * @is_active
> + *
> + * Will skip a state section if not active
> + *
> + * @opaque: data pointer passed to register_savevm_live()
> + *
> + * Returns true if state section is active else false
> + */
> bool (*is_active)(void *opaque);
> +
> + /**
> + * @has_postcopy
> + *
> + * checks if a device supports postcopy
s/checks/Checks
> + *
> + * @opaque: data pointer passed to register_savevm_live()
> + *
> + * Returns true for postcopy support else false
> + */
> bool (*has_postcopy)(void *opaque);
>
> - /* is_active_iterate
> - * If it is not NULL then qemu_savevm_state_iterate will skip iteration if
> - * it returns false. For example, it is needed for only-postcopy-states,
> - * which needs to be handled by qemu_savevm_state_setup and
> - * qemu_savevm_state_pending, but do not need iterations until not in
> - * postcopy stage.
> + /**
> + * @is_active_iterate
> + *
> + * As #SaveVMHandlers.is_active(), will skip an inactive state
> + * section in qemu_savevm_state_iterate.
> + *
> + * For example, it is needed for only-postcopy-states, which needs
> + * to be handled by qemu_savevm_state_setup() and
> + * qemu_savevm_state_pending(), but do not need iterations until
> + * not in postcopy stage.
> + *
> + * @opaque: data pointer passed to register_savevm_live()
> + *
> + * Returns true if state section is active else false
> */
> bool (*is_active_iterate)(void *opaque);
>
> @@ -48,44 +147,150 @@ typedef struct SaveVMHandlers {
> * use data that is local to the migration thread or protected
> * by other locks.
> */
> +
> + /**
> + * @save_live_iterate
> + *
> + * Should send a chunk of data until the point that stream
> + * bandwidth limits tell it to stop. Each call generates one
> + * section.
> + *
> + * @f: QEMUFile where to send the data
> + * @opaque: data pointer passed to register_savevm_live()
> + *
> + * Returns zero to indicate success and negative for error
0 indicates that there is still more data to send, 1 indicates that
there is no more data to send and negative indicates error.
> + */
> int (*save_live_iterate)(QEMUFile *f, void *opaque);
>
> /* This runs outside the BQL! */
> - /* Note for save_live_pending:
> - * must_precopy:
> - * - must be migrated in precopy or in stopped state
> - * - i.e. must be migrated before target start
> - *
> - * can_postcopy:
> - * - can migrate in postcopy or in stopped state
> - * - i.e. can migrate after target start
> - * - some can also be migrated during precopy (RAM)
> - * - some must be migrated after source stops (block-dirty-bitmap)
> - *
> - * Sum of can_postcopy and must_postcopy is the whole amount of
> +
> + /**
> + * @state_pending_estimate
> + *
> + * This estimates the remaining data to transfer
> + *
> + * Sum of @can_postcopy and @must_postcopy is the whole amount of
> * pending data.
> + *
> + * @opaque: data pointer passed to register_savevm_live()
> + * @must_precopy: amount of data that must be migrated in precopy
> + * or in stopped state, i.e. that must be migrated
> + * before target start.
> + * @can_postcopy: amount of data that can be migrated in postcopy
> + * or in stopped state, i.e. after target start.
> + * Some can also be migrated during precopy (RAM).
> + * Some must be migrated after source stops
> + * (block-dirty-bitmap)
> */
> - /* This estimates the remaining data to transfer */
> void (*state_pending_estimate)(void *opaque, uint64_t *must_precopy,
> uint64_t *can_postcopy);
> - /* This calculate the exact remaining data to transfer */
> +
> + /**
> + * @state_pending_exact
> + *
> + * This calculate the exact remaining data to transfer
s/calculate/calculates
> + *
> + * Sum of @can_postcopy and @must_postcopy is the whole amount of
> + * pending data.
> + *
> + * @opaque: data pointer passed to register_savevm_live()
> + * @must_precopy: amount of data that must be migrated in precopy
> + * or in stopped state, i.e. that must be migrated
> + * before target start.
> + * @can_postcopy: amount of data that can be migrated in postcopy
> + * or in stopped state, i.e. after target start.
> + * Some can also be migrated during precopy (RAM).
> + * Some must be migrated after source stops
> + * (block-dirty-bitmap)
> + */
> void (*state_pending_exact)(void *opaque, uint64_t *must_precopy,
> uint64_t *can_postcopy);
> +
> + /**
> + * @load_state
> + *
> + * Load sections generated by any of the save functions that
> + * generate sections.
> + *
> + * Legacy method. Should be deprecated when all users are ported
> + * to VMState.
> + *
> + * @f: QEMUFile where to receive the data
> + * @opaque: data pointer passed to register_savevm_live()
> + * @version_id: the maximum version_id supported
> + *
> + * Returns zero to indicate success and negative for error
> + */
> int (*load_state)(QEMUFile *f, void *opaque, int version_id);
> +
> + /**
> + * @load_setup
> + *
> + * Initializes the data structures on the destination.
> + *
> + * @f: QEMUFile where to receive the data
> + * @opaque: data pointer passed to register_savevm_live()
> + *
> + * Returns zero to indicate success and negative for error
> + */
> int (*load_setup)(QEMUFile *f, void *opaque);
> +
> + /**
> + * @load_cleanup
> + *
> + * Performs cleanup of load data structures
Use load_setup phrasing?
Uninitializes the data structures on the destination.
> + *
> + * @opaque: data pointer passed to register_savevm_live()
> + *
> + * Returns zero to indicate success and negative for error
> + */
> int (*load_cleanup)(void *opaque);
> - /* Called when postcopy migration wants to resume from failure */
> +
> + /**
> + * @resume_prepare
> + *
> + * Called when postcopy migration wants to resume from failure
> + *
> + * @s: Current migration state
> + * @opaque: data pointer passed to register_savevm_live()
> + *
> + * Returns zero to indicate success and negative for error
> + */
> int (*resume_prepare)(MigrationState *s, void *opaque);
> - /* Checks if switchover ack should be used. Called only in dest */
> +
> + /**
> + * @switchover_ack_needed
> + *
> + * Checks if switchover ack should be used. Called only on
> + * destination.
> + *
> + * @opaque: data pointer passed to register_savevm_live()
Add:
Returns true if switchover ack should be used and false otherwise
> + */
> bool (*switchover_ack_needed)(void *opaque);
> } SaveVMHandlers;
>
> +/**
> + * register_savevm_live: Register a set of custom migration handlers
> + *
> + * @idstr: state section identifier
> + * @instance_id: instance id
> + * @version_id: version id supported
> + * @ops: SaveVMHandlers structurex
s/structurex/structure
> + * @opaque: data pointer passed SaveVMHandlers handlers
s/passed SaveVMHandlers/passed to SaveVMHandlers
Thanks.
> + */
> int register_savevm_live(const char *idstr,
> uint32_t instance_id,
> int version_id,
> const SaveVMHandlers *ops,
> void *opaque);
>
> +/**
> + * unregister_savevm: Unregister custom migration handlers
> + *
> + * @obj: object associated with state section
> + * @idstr: state section identifier
> + * @opaque: data pointer passed to register_savevm_live()
> + */
> void unregister_savevm(VMStateIf *obj, const char *idstr, void *opaque);
>
> #endif
> --
> 2.43.2
>
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 03/21] migration: Add documentation for SaveVMHandlers
2024-03-04 9:05 ` Avihai Horon
@ 2024-03-04 10:29 ` Cédric Le Goater
0 siblings, 0 replies; 42+ messages in thread
From: Cédric Le Goater @ 2024-03-04 10:29 UTC (permalink / raw)
To: Avihai Horon, qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson,
Philippe Mathieu-Daudé
On 3/4/24 10:05, Avihai Horon wrote:
> Hi Cedric,
>
> A few nits below.
Just in time for v3 I was about to send ! I will include these
suggestions.
Thanks,
C.
>
> On 27/02/2024 20:03, Cédric Le Goater wrote:
>> External email: Use caution opening links or attachments
>>
>>
>> The SaveVMHandlers structure is still in use for complex subsystems
>> and devices. Document the handlers since we are going to modify a few
>> later.
>>
>> Signed-off-by: Cédric Le Goater <clg@redhat.com>
>> ---
>> include/migration/register.h | 257 +++++++++++++++++++++++++++++++----
>> 1 file changed, 231 insertions(+), 26 deletions(-)
>>
>> diff --git a/include/migration/register.h b/include/migration/register.h
>> index 2e6a7d766e62f64940086b7b511249c9ff21fa62..2cc71ec45f65bf2884c9e7a823d2968752f15c20 100644
>> --- a/include/migration/register.h
>> +++ b/include/migration/register.h
>> @@ -16,30 +16,129 @@
>>
>> #include "hw/vmstate-if.h"
>>
>> +/**
>> + * struct SaveVMHandlers: handler structure to finely control
>> + * migration of complex subsystems and devices, such as RAM, block and
>> + * VFIO.
>> + */
>> typedef struct SaveVMHandlers {
>> - /* This runs inside the BQL. */
>> +
>> + /* The following handlers runs inside the BQL. */
>
> s/runs/run
>
>> +
>> + /**
>> + * @save_state
>> + *
>> + * Saves state section on the source using the latest state format
>> + * version.
>> + *
>> + * Legacy method. Should be deprecated when all users are ported
>> + * to VMState.
>> + *
>> + * @f: QEMUFile where to send the data
>> + * @opaque: data pointer passed to register_savevm_live()
>> + */
>> void (*save_state)(QEMUFile *f, void *opaque);
>>
>> - /*
>> - * save_prepare is called early, even before migration starts, and can be
>> - * used to perform early checks.
>> + /**
>> + * @save_prepare
>> + *
>> + * Called early, even before migration starts, and can be used to
>> + * perform early checks.
>> + *
>> + * @opaque: data pointer passed to register_savevm_live()
>> + * @errp: pointer to Error*, to store an error if it happens.
>> + *
>> + * Returns zero to indicate success and negative for error
>> */
>> int (*save_prepare)(void *opaque, Error **errp);
>> +
>> + /**
>> + * @save_setup
>> + *
>> + * Initializes the data structures on the source and transmits
>> + * first section containing information on the device
>> + *
>> + * @f: QEMUFile where to send the data
>> + * @opaque: data pointer passed to register_savevm_live()
>> + *
>> + * Returns zero to indicate success and negative for error
>> + */
>> int (*save_setup)(QEMUFile *f, void *opaque);
>> +
>> + /**
>> + * @save_cleanup
>> + *
>> + * Performs save related cleanup
>
> Use save_setup phrasing?
> Uninitializes the data structures on the source.
>
>> + *
>> + * @opaque: data pointer passed to register_savevm_live()
>> + *
>> + * Returns zero to indicate success and negative for error
>
> This can be dropped as it's a void function.
>
>> + */
>> void (*save_cleanup)(void *opaque);
>> +
>> + /**
>> + * @save_live_complete_postcopy
>> + *
>> + * Called at the end of postcopy for all postcopyiable devices.
>
> s/postcopyiable/postcopyable>
>> + *
>> + * @f: QEMUFile where to send the data
>> + * @opaque: data pointer passed to register_savevm_live()
>> + *
>> + * Returns zero to indicate success and negative for error
>> + */
>> int (*save_live_complete_postcopy)(QEMUFile *f, void *opaque);
>> +
>> + /**
>> + * @save_live_complete_precopy
>> + *
>> + * Transmits the last section for the device containing any
>> + * remaining data.
>> + *
>> + * @f: QEMUFile where to send the data
>> + * @opaque: data pointer passed to register_savevm_live()
>> + *
>> + * Returns zero to indicate success and negative for error
>> + */
>> int (*save_live_complete_precopy)(QEMUFile *f, void *opaque);
>>
>> /* This runs both outside and inside the BQL. */
>> +
>> + /**
>> + * @is_active
>> + *
>> + * Will skip a state section if not active
>> + *
>> + * @opaque: data pointer passed to register_savevm_live()
>> + *
>> + * Returns true if state section is active else false
>> + */
>> bool (*is_active)(void *opaque);
>> +
>> + /**
>> + * @has_postcopy
>> + *
>> + * checks if a device supports postcopy
>
> s/checks/Checks
>
>> + *
>> + * @opaque: data pointer passed to register_savevm_live()
>> + *
>> + * Returns true for postcopy support else false
>> + */
>> bool (*has_postcopy)(void *opaque);
>>
>> - /* is_active_iterate
>> - * If it is not NULL then qemu_savevm_state_iterate will skip iteration if
>> - * it returns false. For example, it is needed for only-postcopy-states,
>> - * which needs to be handled by qemu_savevm_state_setup and
>> - * qemu_savevm_state_pending, but do not need iterations until not in
>> - * postcopy stage.
>> + /**
>> + * @is_active_iterate
>> + *
>> + * As #SaveVMHandlers.is_active(), will skip an inactive state
>> + * section in qemu_savevm_state_iterate.
>> + *
>> + * For example, it is needed for only-postcopy-states, which needs
>> + * to be handled by qemu_savevm_state_setup() and
>> + * qemu_savevm_state_pending(), but do not need iterations until
>> + * not in postcopy stage.
>> + *
>> + * @opaque: data pointer passed to register_savevm_live()
>> + *
>> + * Returns true if state section is active else false
>> */
>> bool (*is_active_iterate)(void *opaque);
>>
>> @@ -48,44 +147,150 @@ typedef struct SaveVMHandlers {
>> * use data that is local to the migration thread or protected
>> * by other locks.
>> */
>> +
>> + /**
>> + * @save_live_iterate
>> + *
>> + * Should send a chunk of data until the point that stream
>> + * bandwidth limits tell it to stop. Each call generates one
>> + * section.
>> + *
>> + * @f: QEMUFile where to send the data
>> + * @opaque: data pointer passed to register_savevm_live()
>> + *
>> + * Returns zero to indicate success and negative for error
>
> 0 indicates that there is still more data to send, 1 indicates that there is no more data to send and negative indicates error.
>
>> + */
>> int (*save_live_iterate)(QEMUFile *f, void *opaque);
>>
>> /* This runs outside the BQL! */
>> - /* Note for save_live_pending:
>> - * must_precopy:
>> - * - must be migrated in precopy or in stopped state
>> - * - i.e. must be migrated before target start
>> - *
>> - * can_postcopy:
>> - * - can migrate in postcopy or in stopped state
>> - * - i.e. can migrate after target start
>> - * - some can also be migrated during precopy (RAM)
>> - * - some must be migrated after source stops (block-dirty-bitmap)
>> - *
>> - * Sum of can_postcopy and must_postcopy is the whole amount of
>> +
>> + /**
>> + * @state_pending_estimate
>> + *
>> + * This estimates the remaining data to transfer
>> + *
>> + * Sum of @can_postcopy and @must_postcopy is the whole amount of
>> * pending data.
>> + *
>> + * @opaque: data pointer passed to register_savevm_live()
>> + * @must_precopy: amount of data that must be migrated in precopy
>> + * or in stopped state, i.e. that must be migrated
>> + * before target start.
>> + * @can_postcopy: amount of data that can be migrated in postcopy
>> + * or in stopped state, i.e. after target start.
>> + * Some can also be migrated during precopy (RAM).
>> + * Some must be migrated after source stops
>> + * (block-dirty-bitmap)
>> */
>> - /* This estimates the remaining data to transfer */
>> void (*state_pending_estimate)(void *opaque, uint64_t *must_precopy,
>> uint64_t *can_postcopy);
>> - /* This calculate the exact remaining data to transfer */
>> +
>> + /**
>> + * @state_pending_exact
>> + *
>> + * This calculate the exact remaining data to transfer
>
> s/calculate/calculates
>
>> + *
>> + * Sum of @can_postcopy and @must_postcopy is the whole amount of
>> + * pending data.
>> + *
>> + * @opaque: data pointer passed to register_savevm_live()
>> + * @must_precopy: amount of data that must be migrated in precopy
>> + * or in stopped state, i.e. that must be migrated
>> + * before target start.
>> + * @can_postcopy: amount of data that can be migrated in postcopy
>> + * or in stopped state, i.e. after target start.
>> + * Some can also be migrated during precopy (RAM).
>> + * Some must be migrated after source stops
>> + * (block-dirty-bitmap)
>> + */
>> void (*state_pending_exact)(void *opaque, uint64_t *must_precopy,
>> uint64_t *can_postcopy);
>> +
>> + /**
>> + * @load_state
>> + *
>> + * Load sections generated by any of the save functions that
>> + * generate sections.
>> + *
>> + * Legacy method. Should be deprecated when all users are ported
>> + * to VMState.
>> + *
>> + * @f: QEMUFile where to receive the data
>> + * @opaque: data pointer passed to register_savevm_live()
>> + * @version_id: the maximum version_id supported
>> + *
>> + * Returns zero to indicate success and negative for error
>> + */
>> int (*load_state)(QEMUFile *f, void *opaque, int version_id);
>> +
>> + /**
>> + * @load_setup
>> + *
>> + * Initializes the data structures on the destination.
>> + *
>> + * @f: QEMUFile where to receive the data
>> + * @opaque: data pointer passed to register_savevm_live()
>> + *
>> + * Returns zero to indicate success and negative for error
>> + */
>> int (*load_setup)(QEMUFile *f, void *opaque);
>> +
>> + /**
>> + * @load_cleanup
>> + *
>> + * Performs cleanup of load data structures
>
> Use load_setup phrasing?
> Uninitializes the data structures on the destination.
>
>> + *
>> + * @opaque: data pointer passed to register_savevm_live()
>> + *
>> + * Returns zero to indicate success and negative for error
>> + */
>> int (*load_cleanup)(void *opaque);
>> - /* Called when postcopy migration wants to resume from failure */
>> +
>> + /**
>> + * @resume_prepare
>> + *
>> + * Called when postcopy migration wants to resume from failure
>> + *
>> + * @s: Current migration state
>> + * @opaque: data pointer passed to register_savevm_live()
>> + *
>> + * Returns zero to indicate success and negative for error
>> + */
>> int (*resume_prepare)(MigrationState *s, void *opaque);
>> - /* Checks if switchover ack should be used. Called only in dest */
>> +
>> + /**
>> + * @switchover_ack_needed
>> + *
>> + * Checks if switchover ack should be used. Called only on
>> + * destination.
>> + *
>> + * @opaque: data pointer passed to register_savevm_live()
>
> Add:
> Returns true if switchover ack should be used and false otherwise
>
>> + */
>> bool (*switchover_ack_needed)(void *opaque);
>> } SaveVMHandlers;
>>
>> +/**
>> + * register_savevm_live: Register a set of custom migration handlers
>> + *
>> + * @idstr: state section identifier
>> + * @instance_id: instance id
>> + * @version_id: version id supported
>> + * @ops: SaveVMHandlers structurex
>
> s/structurex/structure
>
>> + * @opaque: data pointer passed SaveVMHandlers handlers
>
> s/passed SaveVMHandlers/passed to SaveVMHandlers
>
> Thanks.
>
>> + */
>> int register_savevm_live(const char *idstr,
>> uint32_t instance_id,
>> int version_id,
>> const SaveVMHandlers *ops,
>> void *opaque);
>>
>> +/**
>> + * unregister_savevm: Unregister custom migration handlers
>> + *
>> + * @obj: object associated with state section
>> + * @idstr: state section identifier
>> + * @opaque: data pointer passed to register_savevm_live()
>> + */
>> void unregister_savevm(VMStateIf *obj, const char *idstr, void *opaque);
>>
>> #endif
>> --
>> 2.43.2
>>
>
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH v2 08/21] memory: Add Error** argument to .log_global*() handlers
2024-02-29 5:29 ` Peter Xu
@ 2024-03-04 13:38 ` Cédric Le Goater
0 siblings, 0 replies; 42+ messages in thread
From: Cédric Le Goater @ 2024-03-04 13:38 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Stefano Stabellini, Anthony Perard,
Paul Durrant, Michael S . Tsirkin, Paolo Bonzini,
David Hildenbrand
Hello Peter,
I missed this email last week, so I might not have addressed your
comments. sorry about that.
On 2/29/24 06:29, Peter Xu wrote:
> On Tue, Feb 27, 2024 at 07:03:32PM +0100, Cédric Le Goater wrote:
>> Modify all log_global*() handlers to take an Error** parameter and
>> return a bool. A new MEMORY_LISTENER_CALL_LOG_GLOBAL macro looping on
>> the listeners is introduced to handle a possible error, which will
>> would interrupt the loop if necessary.
>>
>> To be noted a change in memory_global_dirty_log_start() behavior as it
>> will return as soon as an error is detected.
>>
>> Cc: Stefano Stabellini <sstabellini@kernel.org>
>> Cc: Anthony Perard <anthony.perard@citrix.com>
>> Cc: Paul Durrant <paul@xen.org>
>> Cc: Michael S. Tsirkin <mst@redhat.com>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Cc: David Hildenbrand <david@redhat.com>
>> Signed-off-by: Cédric Le Goater <clg@redhat.com>
>> ---
>> include/exec/memory.h | 15 ++++++--
>> hw/i386/xen/xen-hvm.c | 6 ++--
>> hw/vfio/common.c | 8 +++--
>> hw/virtio/vhost.c | 6 ++--
>> system/memory.c | 83 +++++++++++++++++++++++++++++++++++++------
>> system/physmem.c | 5 +--
>> 6 files changed, 101 insertions(+), 22 deletions(-)
>>
>> diff --git a/include/exec/memory.h b/include/exec/memory.h
>> index 8626a355b310ed7b1a1db7978ba4b394032c2f15..4bc146c5ebdd377cd14a4e462f32cc945db5a0a8 100644
>> --- a/include/exec/memory.h
>> +++ b/include/exec/memory.h
>> @@ -998,8 +998,11 @@ struct MemoryListener {
>> * active at that time.
>> *
>> * @listener: The #MemoryListener.
>> + * @errp: pointer to Error*, to store an error if it happens.
>> + *
>> + * Return: true on success, else false setting @errp with error.
>> */
>> - void (*log_global_start)(MemoryListener *listener);
>> + bool (*log_global_start)(MemoryListener *listener, Error **errp);
>>
>> /**
>> * @log_global_stop:
>> @@ -1009,8 +1012,11 @@ struct MemoryListener {
>> * the address space.
>> *
>> * @listener: The #MemoryListener.
>> + * @errp: pointer to Error*, to store an error if it happens.
>> + *
>> + * Return: true on success, else false setting @errp with error.
>> */
>> - void (*log_global_stop)(MemoryListener *listener);
>> + bool (*log_global_stop)(MemoryListener *listener, Error **errp);
>>
>> /**
>> * @log_global_after_sync:
>> @@ -1019,8 +1025,11 @@ struct MemoryListener {
>> * for any #MemoryRegionSection.
>> *
>> * @listener: The #MemoryListener.
>> + * @errp: pointer to Error*, to store an error if it happens.
>> + *
>> + * Return: true on success, else false setting @errp with error.
>> */
>> - void (*log_global_after_sync)(MemoryListener *listener);
>> + bool (*log_global_after_sync)(MemoryListener *listener, Error **errp);
>>
>> /**
>> * @eventfd_add:
>> diff --git a/hw/i386/xen/xen-hvm.c b/hw/i386/xen/xen-hvm.c
>> index f42621e6742552035122ea58092c91c3458338ff..925a207b494b4eed52d5f360b554f18ac8a9806d 100644
>> --- a/hw/i386/xen/xen-hvm.c
>> +++ b/hw/i386/xen/xen-hvm.c
>> @@ -446,16 +446,18 @@ static void xen_log_sync(MemoryListener *listener, MemoryRegionSection *section)
>> int128_get64(section->size));
>> }
>>
>> -static void xen_log_global_start(MemoryListener *listener)
>> +static bool xen_log_global_start(MemoryListener *listener, Error **errp)
>> {
>> if (xen_enabled()) {
>> xen_in_migration = true;
>> }
>> + return true;
>> }
>>
>> -static void xen_log_global_stop(MemoryListener *listener)
>> +static bool xen_log_global_stop(MemoryListener *listener, Error **errp)
>> {
>> xen_in_migration = false;
>> + return true;
>> }
>>
>> static const MemoryListener xen_memory_listener = {
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index 059bfdc07a85e2eb908df828c1f42104d683e911..8bba95ba6a2010b78cae54c6905857686bbb6309 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -1075,7 +1075,8 @@ out:
>> return ret;
>> }
>>
>> -static void vfio_listener_log_global_start(MemoryListener *listener)
>> +static bool vfio_listener_log_global_start(MemoryListener *listener,
>> + Error **errp)
>> {
>> VFIOContainerBase *bcontainer = container_of(listener, VFIOContainerBase,
>> listener);
>> @@ -1092,9 +1093,11 @@ static void vfio_listener_log_global_start(MemoryListener *listener)
>> ret, strerror(-ret));
>> vfio_set_migration_error(ret);
>> }
>> + return !!ret;
>> }
>>
>> -static void vfio_listener_log_global_stop(MemoryListener *listener)
>> +static bool vfio_listener_log_global_stop(MemoryListener *listener,
>> + Error **errp)
>> {
>> VFIOContainerBase *bcontainer = container_of(listener, VFIOContainerBase,
>> listener);
>> @@ -1111,6 +1114,7 @@ static void vfio_listener_log_global_stop(MemoryListener *listener)
>> ret, strerror(-ret));
>> vfio_set_migration_error(ret);
>> }
>> + return !!ret;
>> }
>
> Shouldn't both of these be "return !ret"?
Indeed. It went unnoticed because the return value is not tested when the
memory macros MEMORY_LISTENER_CALL_LOG_GLOBAL at run.
>
> Besides, maybe it'll be easier to go with either:
>
> 1) if to only add the new parameter, here we can constantly return true
> to keep the old behavior, or,
>
> 2) allow failure within this patch, then error_report() should already do
> error_setg() to errp.
v3 adds more changes to allow failures.
>>
>> static int vfio_device_dma_logging_report(VFIODevice *vbasedev, hwaddr iova,
>> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
>> index 2c9ac794680ea9b65eba6cc22e70cf141e90aa73..7a555f941934991a72a2817e5505fe0ce6d6fc64 100644
>> --- a/hw/virtio/vhost.c
>> +++ b/hw/virtio/vhost.c
>> @@ -1044,7 +1044,7 @@ check_dev_state:
>> return r;
>> }
>>
>> -static void vhost_log_global_start(MemoryListener *listener)
>> +static bool vhost_log_global_start(MemoryListener *listener, Error **errp)
>> {
>> int r;
>>
>> @@ -1052,9 +1052,10 @@ static void vhost_log_global_start(MemoryListener *listener)
>> if (r < 0) {
>> abort();
>> }
>> + return true;
>> }
>>
>> -static void vhost_log_global_stop(MemoryListener *listener)
>> +static bool vhost_log_global_stop(MemoryListener *listener, Error **errp)
>> {
>> int r;
>>
>> @@ -1062,6 +1063,7 @@ static void vhost_log_global_stop(MemoryListener *listener)
>> if (r < 0) {
>> abort();
>> }
>> + return true;
>> }
>>
>> static void vhost_log_start(MemoryListener *listener,
>> diff --git a/system/memory.c b/system/memory.c
>> index a229a79988fce2aa3cb77e3a130db4c694e8cd49..af06157ead5b1272548e87f79ab9fb3036055328 100644
>> --- a/system/memory.c
>> +++ b/system/memory.c
>> @@ -127,6 +127,35 @@ enum ListenerDirection { Forward, Reverse };
>> } \
>> } while (0)
>>
>> +#define MEMORY_LISTENER_CALL_LOG_GLOBAL(_callback, _direction, _errp, \
>> + _args...) \
>> + do { \
>> + MemoryListener *_listener; \
>> + \
>> + switch (_direction) { \
>> + case Forward: \
>> + QTAILQ_FOREACH(_listener, &memory_listeners, link) { \
>> + if (_listener->_callback) { \
>> + if (!_listener->_callback(_listener, _errp, ##_args)) { \
>> + break; \
>> + } \
>> + } \
>> + } \
>> + break; \
>> + case Reverse: \
>> + QTAILQ_FOREACH_REVERSE(_listener, &memory_listeners, link) { \
>> + if (_listener->_callback) { \
>> + if (!_listener->_callback(_listener, _errp, ##_args)) { \
>> + break; \
>> + } \
>> + } \
>> + } \
>> + break; \
>> + default: \
>> + abort(); \
>> + }; \
>> + } while (0)
>> +
>> #define MEMORY_LISTENER_CALL(_as, _callback, _direction, _section, _args...) \
>> do { \
>> MemoryListener *_listener; \
>> @@ -2903,7 +2932,13 @@ void memory_global_dirty_log_sync(bool last_stage)
>>
>> void memory_global_after_dirty_log_sync(void)
>> {
>> - MEMORY_LISTENER_CALL_GLOBAL(log_global_after_sync, Forward);
>> + Error *local_err = NULL;
>> +
>> + MEMORY_LISTENER_CALL_LOG_GLOBAL(log_global_after_sync, Forward,
>> + &local_err);
>> + if (local_err) {
>> + error_report_err(local_err);
>> + }
>> }
>>
>> /*
>> @@ -2912,18 +2947,22 @@ void memory_global_after_dirty_log_sync(void)
>> */
>> static unsigned int postponed_stop_flags;
>> static VMChangeStateEntry *vmstate_change;
>> -static void memory_global_dirty_log_stop_postponed_run(void);
>> +static bool memory_global_dirty_log_stop_postponed_run(Error **errp);
>>
>> void memory_global_dirty_log_start(unsigned int flags)
>> {
>> unsigned int old_flags;
>> + Error *local_err = NULL;
>>
>> assert(flags && !(flags & (~GLOBAL_DIRTY_MASK)));
>>
>> if (vmstate_change) {
>> /* If there is postponed stop(), operate on it first */
>> postponed_stop_flags &= ~flags;
>> - memory_global_dirty_log_stop_postponed_run();
>> + if (!memory_global_dirty_log_stop_postponed_run(&local_err)) {
>> + error_report_err(local_err);
>> + return;
>> + }
>> }
>>
>> flags &= ~global_dirty_tracking;
>> @@ -2936,15 +2975,22 @@ void memory_global_dirty_log_start(unsigned int flags)
>> trace_global_dirty_changed(global_dirty_tracking);
>>
>> if (!old_flags) {
>> - MEMORY_LISTENER_CALL_GLOBAL(log_global_start, Forward);
>> + MEMORY_LISTENER_CALL_LOG_GLOBAL(log_global_start, Forward,
>> + &local_err);
>> + if (local_err) {
>> + error_report_err(local_err);
>> + return;
>> + }
>> memory_region_transaction_begin();
>> memory_region_update_pending = true;
>> memory_region_transaction_commit();
>> }
>> }
>>
>> -static void memory_global_dirty_log_do_stop(unsigned int flags)
>> +static bool memory_global_dirty_log_do_stop(unsigned int flags, Error **errp)
>> {
>> + ERRP_GUARD();
>> +
>> assert(flags && !(flags & (~GLOBAL_DIRTY_MASK)));
>> assert((global_dirty_tracking & flags) == flags);
>> global_dirty_tracking &= ~flags;
>> @@ -2955,39 +3001,49 @@ static void memory_global_dirty_log_do_stop(unsigned int flags)
>> memory_region_transaction_begin();
>> memory_region_update_pending = true;
>> memory_region_transaction_commit();
>> - MEMORY_LISTENER_CALL_GLOBAL(log_global_stop, Reverse);
>> + MEMORY_LISTENER_CALL_LOG_GLOBAL(log_global_stop, Reverse, errp);
>> }
>> + return !*errp;
>> }
>>
>> /*
>> * Execute the postponed dirty log stop operations if there is, then reset
>> * everything (including the flags and the vmstate change hook).
>> */
>> -static void memory_global_dirty_log_stop_postponed_run(void)
>> +static bool memory_global_dirty_log_stop_postponed_run(Error **errp)
>> {
>> + bool ret = true;
>> +
>> /* This must be called with the vmstate handler registered */
>> assert(vmstate_change);
>>
>> /* Note: postponed_stop_flags can be cleared in log start routine */
>> if (postponed_stop_flags) {
>> - memory_global_dirty_log_do_stop(postponed_stop_flags);
>> + ret = memory_global_dirty_log_do_stop(postponed_stop_flags, errp);
>> postponed_stop_flags = 0;
>> }
>>
>> qemu_del_vm_change_state_handler(vmstate_change);
>> vmstate_change = NULL;
>> + return ret;
>> }
>>
>> static void memory_vm_change_state_handler(void *opaque, bool running,
>> RunState state)
>> {
>> + Error *local_err = NULL;
>> +
>> if (running) {
>> - memory_global_dirty_log_stop_postponed_run();
>> + if (!memory_global_dirty_log_stop_postponed_run(&local_err)) {
>> + error_report_err(local_err);
>> + }
>> }
>> }
>>
>> void memory_global_dirty_log_stop(unsigned int flags)
>> {
>> + Error *local_err = NULL;
>> +
>> if (!runstate_is_running()) {
>> /* Postpone the dirty log stop, e.g., to when VM starts again */
>> if (vmstate_change) {
>> @@ -3001,7 +3057,9 @@ void memory_global_dirty_log_stop(unsigned int flags)
>> return;
>> }
>>
>> - memory_global_dirty_log_do_stop(flags);
>> + if (!memory_global_dirty_log_do_stop(flags, &local_err)) {
>> + error_report_err(local_err);
>> + }
>> }
>>
>> static void listener_add_address_space(MemoryListener *listener,
>> @@ -3009,13 +3067,16 @@ static void listener_add_address_space(MemoryListener *listener,
>> {
>> FlatView *view;
>> FlatRange *fr;
>> + Error *local_err = NULL;
>>
>> if (listener->begin) {
>> listener->begin(listener);
>> }
>> if (global_dirty_tracking) {
>> if (listener->log_global_start) {
>> - listener->log_global_start(listener);
>> + if (!listener->log_global_start(listener, &local_err)) {
>> + error_report_err(local_err);
>> + }
>> }
>> }
>
> The stop process is indeed more involved due to the postponed mechanism.
>
> Is it really a goal that we should also allow global stop() to be able to
> fail? Or maybe we only care about global start() on the "VFIO doesn't have
> enough dirty tracking unit for the current device" issue?
So you would rather keep the stop handler prototype as it is ? I made
the log global stop and start handlers similar for symmetry but I am
fine not changing stop. That said, the stop handler changes are much
easier to address.
> It makes more sense to me to allow a start() to fail rather than stop(),
The approach I took for the stop sequence is to keep the same behavior
and to only report the error, but no actions are taken. I should improve
the commit log if we keep the stop changes.
> because if start()s failed we can still fallback to stop() on whatever got
> start()ed already, and fail the operation that triggered the start().
Yes. The goal is to resume the guest execution "gracefully", with some
error.
> If stop() can also fail I don't know what we can do; consider if the
> fallback stop()s failed above. It sounds just more challenging to allow
> stop() to fail v.s. start() only from that regard to behave correctly.
Yes. this would be just for error reporting.
> Same question to after_sync(); that seems only to be used by tcg. If VFIO
> has no plan to leverage that Error** maybe we can keep them as-is?
AFAICS log_global_after_sync() handlers are called from ram_init_bitmaps()
ram_init_bitmaps
migration_bitmap_sync_precopy
migration_bitmap_sync
memory_global_dirty_log_sync
memory_region_sync_dirty_bitmap
calls .log_sync() handlers
memory_global_after_dirty_log_sync
calls .log_global_after_sync() handlers
The .log_global_after_sync() change is not necessary for VFIO. It is
proposed mostly for symmetry of the .log_global* handlers and because
migration_bitmap_sync_precopy() and migration_bitmap_sync() could fail.
I agree we can keep this change for later. It would be in v4 now. Sorry
about that.
Also, it would be interesting to have the .log_sync() handler report an
error since it is used by VFIO in the 'save_setup' migration step.
I think we should address .log_global_after_sync() and .log_sync() all
together in a new series.
Thanks,
C.
>
>>
>> diff --git a/system/physmem.c b/system/physmem.c
>> index e3ebc19eefd8050a1dee16e3d1449f0c144f751f..9adbf9aea847cd80bdac6dca466fb476844ac048 100644
>> --- a/system/physmem.c
>> +++ b/system/physmem.c
>> @@ -148,7 +148,7 @@ typedef struct subpage_t {
>>
>> static void io_mem_init(void);
>> static void memory_map_init(void);
>> -static void tcg_log_global_after_sync(MemoryListener *listener);
>> +static bool tcg_log_global_after_sync(MemoryListener *listener, Error **errp);
>> static void tcg_commit(MemoryListener *listener);
>>
>> /**
>> @@ -2475,7 +2475,7 @@ static void do_nothing(CPUState *cpu, run_on_cpu_data d)
>> {
>> }
>>
>> -static void tcg_log_global_after_sync(MemoryListener *listener)
>> +static bool tcg_log_global_after_sync(MemoryListener *listener, Error **errp)
>> {
>> CPUAddressSpace *cpuas;
>>
>> @@ -2507,6 +2507,7 @@ static void tcg_log_global_after_sync(MemoryListener *listener)
>> cpuas = container_of(listener, CPUAddressSpace, tcg_as_listener);
>> run_on_cpu(cpuas->cpu, do_nothing, RUN_ON_CPU_NULL);
>> }
>> + return true;
>> }
>>
>> static void tcg_commit_cpu(CPUState *cpu, run_on_cpu_data data)
>> --
>> 2.43.2
>>
>
^ permalink raw reply [flat|nested] 42+ messages in thread
end of thread, other threads:[~2024-03-04 13:39 UTC | newest]
Thread overview: 42+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-02-27 18:03 [PATCH v2 00/21] migration: Improve error reporting Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 01/21] migration: Report error when shutdown fails Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 02/21] migration: Remove SaveStateHandler and LoadStateHandler typedefs Cédric Le Goater
2024-02-29 1:21 ` Fabiano Rosas
2024-02-29 3:55 ` Peter Xu
2024-02-27 18:03 ` [PATCH v2 03/21] migration: Add documentation for SaveVMHandlers Cédric Le Goater
2024-02-29 4:10 ` Peter Xu
2024-02-29 13:19 ` Cédric Le Goater
2024-03-04 9:05 ` Avihai Horon
2024-03-04 10:29 ` Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 04/21] migration: Do not call PRECOPY_NOTIFY_SETUP notifiers in case of error Cédric Le Goater
2024-02-29 4:12 ` Peter Xu
2024-02-27 18:03 ` [PATCH v2 05/21] migration: Add Error** argument to qemu_savevm_state_setup() Cédric Le Goater
2024-02-29 4:19 ` Peter Xu
2024-02-27 18:03 ` [PATCH v2 06/21] migration: Add Error** argument to .save_setup() handler Cédric Le Goater
2024-02-29 6:32 ` Markus Armbruster
2024-02-29 7:20 ` Vladimir Sementsov-Ogievskiy
2024-02-29 10:35 ` Thomas Huth
2024-02-29 13:21 ` Markus Armbruster
2024-03-01 14:44 ` Vladimir Sementsov-Ogievskiy
2024-03-01 14:52 ` Vladimir Sementsov-Ogievskiy
2024-02-29 13:34 ` Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 07/21] migration: Add Error** argument to .load_setup() handler Cédric Le Goater
2024-02-29 4:21 ` Peter Xu
2024-02-27 18:03 ` [PATCH v2 08/21] memory: Add Error** argument to .log_global*() handlers Cédric Le Goater
2024-02-29 5:29 ` Peter Xu
2024-03-04 13:38 ` Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 09/21] memory: Add Error** argument to the global_dirty_log routines Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 10/21] migration: Modify ram_init_bitmaps() to report dirty tracking errors Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 11/21] migration: Fix migration termination Cédric Le Goater
2024-02-29 5:34 ` Peter Xu
2024-02-29 15:13 ` Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 12/21] vfio: Add Error** argument to .set_dirty_page_tracking() handler Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 13/21] vfio: Add Error** argument to vfio_devices_dma_logging_start() Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 14/21] vfio: Add Error** argument to vfio_devices_dma_logging_stop() Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 15/21] vfio: Use new Error** argument in vfio_save_setup() Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 16/21] vfio: Add Error** argument to .vfio_save_config() handler Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 17/21] vfio: Reverse test on vfio_get_dirty_bitmap() Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 18/21] memory: Add Error** argument to memory_get_xlat_addr() Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 19/21] vfio: Add Error** argument to .get_dirty_bitmap() handler Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 20/21] vfio: Also trace event failures in vfio_save_complete_precopy() Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 21/21] vfio: Extend vfio_set_migration_error() with Error* argument Cédric Le Goater
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).