* [PATCH for-9.1 v5 00/14] migration: Improve error reporting
@ 2024-03-20 6:48 Cédric Le Goater
2024-03-20 6:48 ` [PATCH for-9.1 v5 01/14] s390/stattrib: Add Error** argument to set_migrationmode() handler Cédric Le Goater
` (14 more replies)
0 siblings, 15 replies; 34+ messages in thread
From: Cédric Le Goater @ 2024-03-20 6:48 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster, Prasad Pandit,
Cédric Le Goater
Hello,
The motivation behind these changes is to improve error reporting to
the upper management layer (libvirt) with a more detailed error, this
to let it decide, depending on the reported error, whether to try
migration again later. It would be useful in cases where migration
fails due to lack of HW resources on the host. For instance, some
adapters can only initiate a limited number of simultaneous dirty
tracking requests and this imposes a limit on the the number of VMs
that can be migrated simultaneously.
We are not quite ready for such a mechanism but what we can do first is
to cleanup the error reporting in the early save_setup sequence. This
is what the following changes propose, by adding an Error** argument to
various handlers and propagating it to the core migration subsystem.
Patchset is organized as follow :
* [1-4] are prerequisite changes in other components related to the
migration save_setup() handler. They make sure a failure is not
returned without setting an error.
s390/stattrib: Add Error** argument to set_migrationmode() handler
vfio: Always report an error in vfio_save_setup()
migration: Always report an error in block_save_setup()
migration: Always report an error in ram_save_setup()
migration: Add Error** argument to vmstate_save()
* [5-14] are the core changes in migration and memory components to
propagate an error reported in a save_setup() handler.
migration: Add Error** argument to qemu_savevm_state_setup()
migration: Add Error** argument to .save_setup() handler
migration: Add Error** argument to .load_setup() handler
memory: Add Error** argument to .log_global_start() handler
migration: Introduce ram_bitmaps_destroy()
memory: Add Error** argument to the global_dirty_log routines
migration: Add Error** argument to ram_state_init()
migration: Add Error** argument to xbzrle_init()
migration: Modify ram_init_bitmaps() to report dirty tracking errors
The VFIO changes depend on the above. They are simpler and have been
reviewed already. I kept them for another series.
Thanks,
C.
Changes in v5:
- Rebased on 2e128776dc56 ("migration: Skip only empty block devices")
- Removed Fabiano's R-b because of changes
- Handled qemu_savevm_state_setup() failures after waiting for
virtio-net-failover devices to unplug.
- Removed memory_global_dirty_log_rollback()
- Introduced memory_global_dirty_log_do_start() to call
.log_global_start() handlers and do the rollback in case of error.
- Kept modification of the global_dirty_tracking flag within
memory_global_dirty_log_start()
- Added an assert on error of a .log_global_start() handler in
listener_add_address_space()
- Removed Yong Huang's R-b
- Introduced ram_bitmaps_destroy()
- Added Error** argument to ram_state_init() and xbzrle_init()
- Made use of ram_bitmaps_destroy() in ram_init_bitmaps() to cleanup
allocated bitmaps
- Took into account changes of ram_state_init() and xbzrle_init() to
propagate the error.
- Reduced series to migration. VFIO can come later.
Changes in v4:
- Fixed frenchism futur to future
- Fixed typo in set_migrationmode() handler
- Added error_free() in hmp_migrationmode()
- Fixed state name printed out in error returned by vfio_save_setup()
- Fixed test on error returned by qemu_file_get_error()
- Added an error when bdrv_nb_sectors() returns a negative value
- Dropped log_global_stop() and log_global_sync() changes
- Dropped MEMORY_LISTENER_CALL_LOG_GLOBAL
- Modified memory_global_dirty_log_start() to loop on the list of
listeners and handle errors directly.
- Introduced memory_global_dirty_log_rollback() to revert operations
previously done
Changes in v3:
- New changes to make sure an error is always set in case of failure.
This is the reason behind the 5/6 extra patches. (Markus)
- Documentation fixup (Peter + Avihai)
- Set migration state to MIGRATION_STATUS_FAILED always
- Fixed error handling in bg_migration_thread() (Peter)
- Fixed return value of vfio_listener_log_global_start/stop().
Went unnoticed because value is not tested. (Peter)
- Add ERRP_GUARD() when error_prepend is used
- Use error_setg_errno() when possible
Changes in v2:
- Removed v1 patches addressing the return-path thread termination as
they are now superseded by :
https://lore.kernel.org/qemu-devel/20240226203122.22894-1-farosas@suse.de/
- Documentation updates of handlers
- Removed call to PRECOPY_NOTIFY_SETUP notifiers in case of errors
- Modified routines taking an Error** argument to return a bool when
possible and made adjustments in callers.
- new MEMORY_LISTENER_CALL_LOG_GLOBAL macro for .log_global*()
handlers
- Handled SETUP state when migration terminates
- Modified memory_get_xlat_addr() to take an Error** argument
- Various refinements on error handling
Cédric Le Goater (14):
s390/stattrib: Add Error** argument to set_migrationmode() handler
vfio: Always report an error in vfio_save_setup()
migration: Always report an error in block_save_setup()
migration: Always report an error in ram_save_setup()
migration: Add Error** argument to vmstate_save()
migration: Add Error** argument to qemu_savevm_state_setup()
migration: Add Error** argument to .save_setup() handler
migration: Add Error** argument to .load_setup() handler
memory: Add Error** argument to .log_global_start() handler
migration: Introduce ram_bitmaps_destroy()
memory: Add Error** argument to the global_dirty_log routines
migration: Add Error** argument to ram_state_init()
migration: Add Error** argument to xbzrle_init()
migration: Modify ram_init_bitmaps() to report dirty tracking errors
include/exec/memory.h | 10 ++-
include/hw/s390x/storage-attributes.h | 2 +-
include/migration/register.h | 6 +-
migration/savevm.h | 2 +-
hw/i386/xen/xen-hvm.c | 5 +-
hw/ppc/spapr.c | 2 +-
hw/s390x/s390-stattrib-kvm.c | 12 ++-
hw/s390x/s390-stattrib.c | 15 ++--
hw/vfio/common.c | 4 +-
hw/vfio/migration.c | 29 +++++--
hw/virtio/vhost.c | 3 +-
migration/block-dirty-bitmap.c | 4 +-
migration/block.c | 17 +++--
migration/dirtyrate.c | 13 +++-
migration/migration.c | 33 +++++++-
migration/ram.c | 106 +++++++++++++++++---------
migration/savevm.c | 57 ++++++++------
system/memory.c | 40 +++++++++-
18 files changed, 261 insertions(+), 99 deletions(-)
--
2.44.0
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH for-9.1 v5 01/14] s390/stattrib: Add Error** argument to set_migrationmode() handler
2024-03-20 6:48 [PATCH for-9.1 v5 00/14] migration: Improve error reporting Cédric Le Goater
@ 2024-03-20 6:48 ` Cédric Le Goater
2024-03-20 6:48 ` [PATCH for-9.1 v5 02/14] vfio: Always report an error in vfio_save_setup() Cédric Le Goater
` (13 subsequent siblings)
14 siblings, 0 replies; 34+ messages in thread
From: Cédric Le Goater @ 2024-03-20 6:48 UTC (permalink / raw)
To: qemu-devel, Halil Pasic, Christian Borntraeger, Eric Farman,
Richard Henderson, David Hildenbrand, Ilya Leoshkevich,
Thomas Huth
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster, Prasad Pandit,
Cédric Le Goater, qemu-s390x
This will prepare ground for future changes adding an Error** argument
to the save_setup() handler. We need to make sure that on failure,
set_migrationmode() always sets a new error. See the Rules section in
qapi/error.h.
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Thomas Huth <thuth@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
include/hw/s390x/storage-attributes.h | 2 +-
hw/s390x/s390-stattrib-kvm.c | 12 ++++++++++--
hw/s390x/s390-stattrib.c | 15 ++++++++++-----
3 files changed, 21 insertions(+), 8 deletions(-)
diff --git a/include/hw/s390x/storage-attributes.h b/include/hw/s390x/storage-attributes.h
index 5239eb538c1b087797867a247abfc14551af6a4d..8921a04d514bf64a3113255ee10ed33fc598ae06 100644
--- a/include/hw/s390x/storage-attributes.h
+++ b/include/hw/s390x/storage-attributes.h
@@ -39,7 +39,7 @@ struct S390StAttribClass {
int (*set_stattr)(S390StAttribState *sa, uint64_t start_gfn,
uint32_t count, uint8_t *values);
void (*synchronize)(S390StAttribState *sa);
- int (*set_migrationmode)(S390StAttribState *sa, bool value);
+ int (*set_migrationmode)(S390StAttribState *sa, bool value, Error **errp);
int (*get_active)(S390StAttribState *sa);
long long (*get_dirtycount)(S390StAttribState *sa);
};
diff --git a/hw/s390x/s390-stattrib-kvm.c b/hw/s390x/s390-stattrib-kvm.c
index 24cd01382e2d74d62c2d7e980eb6aca1077d893d..eeaa8110981c970e91a8948f027e398c34637321 100644
--- a/hw/s390x/s390-stattrib-kvm.c
+++ b/hw/s390x/s390-stattrib-kvm.c
@@ -17,6 +17,7 @@
#include "sysemu/kvm.h"
#include "exec/ram_addr.h"
#include "kvm/kvm_s390x.h"
+#include "qapi/error.h"
Object *kvm_s390_stattrib_create(void)
{
@@ -137,14 +138,21 @@ static void kvm_s390_stattrib_synchronize(S390StAttribState *sa)
}
}
-static int kvm_s390_stattrib_set_migrationmode(S390StAttribState *sa, bool val)
+static int kvm_s390_stattrib_set_migrationmode(S390StAttribState *sa, bool val,
+ Error **errp)
{
struct kvm_device_attr attr = {
.group = KVM_S390_VM_MIGRATION,
.attr = val,
.addr = 0,
};
- return kvm_vm_ioctl(kvm_state, KVM_SET_DEVICE_ATTR, &attr);
+ int r;
+
+ r = kvm_vm_ioctl(kvm_state, KVM_SET_DEVICE_ATTR, &attr);
+ if (r) {
+ error_setg_errno(errp, -r, "setting KVM_S390_VM_MIGRATION failed");
+ }
+ return r;
}
static long long kvm_s390_stattrib_get_dirtycount(S390StAttribState *sa)
diff --git a/hw/s390x/s390-stattrib.c b/hw/s390x/s390-stattrib.c
index c483b62a9b5f71772639fc180bdad15ecb6711cb..b743e8a2fee84c7374460ccea6df1cf447cda44b 100644
--- a/hw/s390x/s390-stattrib.c
+++ b/hw/s390x/s390-stattrib.c
@@ -60,11 +60,13 @@ void hmp_migrationmode(Monitor *mon, const QDict *qdict)
S390StAttribState *sas = s390_get_stattrib_device();
S390StAttribClass *sac = S390_STATTRIB_GET_CLASS(sas);
uint64_t what = qdict_get_int(qdict, "mode");
+ Error *local_err = NULL;
int r;
- r = sac->set_migrationmode(sas, what);
+ r = sac->set_migrationmode(sas, what, &local_err);
if (r < 0) {
- monitor_printf(mon, "Error: %s", strerror(-r));
+ monitor_printf(mon, "Error: %s", error_get_pretty(local_err));
+ error_free(local_err);
}
}
@@ -170,13 +172,15 @@ static int cmma_save_setup(QEMUFile *f, void *opaque)
{
S390StAttribState *sas = S390_STATTRIB(opaque);
S390StAttribClass *sac = S390_STATTRIB_GET_CLASS(sas);
+ Error *local_err = NULL;
int res;
/*
* Signal that we want to start a migration, thus needing PGSTE dirty
* tracking.
*/
- res = sac->set_migrationmode(sas, 1);
+ res = sac->set_migrationmode(sas, true, &local_err);
if (res) {
+ error_report_err(local_err);
return res;
}
qemu_put_be64(f, STATTR_FLAG_EOS);
@@ -260,7 +264,7 @@ static void cmma_save_cleanup(void *opaque)
{
S390StAttribState *sas = S390_STATTRIB(opaque);
S390StAttribClass *sac = S390_STATTRIB_GET_CLASS(sas);
- sac->set_migrationmode(sas, 0);
+ sac->set_migrationmode(sas, false, NULL);
}
static bool cmma_active(void *opaque)
@@ -293,7 +297,8 @@ static long long qemu_s390_get_dirtycount_stub(S390StAttribState *sa)
{
return 0;
}
-static int qemu_s390_set_migrationmode_stub(S390StAttribState *sa, bool value)
+static int qemu_s390_set_migrationmode_stub(S390StAttribState *sa, bool value,
+ Error **errp)
{
return 0;
}
--
2.44.0
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH for-9.1 v5 02/14] vfio: Always report an error in vfio_save_setup()
2024-03-20 6:48 [PATCH for-9.1 v5 00/14] migration: Improve error reporting Cédric Le Goater
2024-03-20 6:48 ` [PATCH for-9.1 v5 01/14] s390/stattrib: Add Error** argument to set_migrationmode() handler Cédric Le Goater
@ 2024-03-20 6:48 ` Cédric Le Goater
2024-03-20 6:48 ` [PATCH for-9.1 v5 03/14] migration: Always report an error in block_save_setup() Cédric Le Goater
` (12 subsequent siblings)
14 siblings, 0 replies; 34+ messages in thread
From: Cédric Le Goater @ 2024-03-20 6:48 UTC (permalink / raw)
To: qemu-devel, Alex Williamson, Cédric Le Goater
Cc: Peter Xu, Fabiano Rosas, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster, Prasad Pandit,
Eric Auger
This will prepare ground for future changes adding an Error** argument
to the save_setup() handler. We need to make sure that on failure,
vfio_save_setup() always sets a new error.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
hw/vfio/migration.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 1149c6b3740f7f7bb8febdedf435be1adb223e59..bf5a29ddc15b0dbc7ae9c44f289539dd0cdddb0d 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -381,6 +381,7 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
VFIODevice *vbasedev = opaque;
VFIOMigration *migration = vbasedev->migration;
uint64_t stop_copy_size = VFIO_MIG_DEFAULT_DATA_BUFFER_SIZE;
+ int ret;
qemu_put_be64(f, VFIO_MIG_FLAG_DEV_SETUP_STATE);
@@ -395,13 +396,13 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
}
if (vfio_precopy_supported(vbasedev)) {
- int ret;
-
switch (migration->device_state) {
case VFIO_DEVICE_STATE_RUNNING:
ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_PRE_COPY,
VFIO_DEVICE_STATE_RUNNING);
if (ret) {
+ error_report("%s: Failed to set new PRE_COPY state",
+ vbasedev->name);
return ret;
}
@@ -412,6 +413,8 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
/* vfio_save_complete_precopy() will go to STOP_COPY */
break;
default:
+ error_report("%s: Invalid device state %d", vbasedev->name,
+ migration->device_state);
return -EINVAL;
}
}
@@ -420,7 +423,13 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
- return qemu_file_get_error(f);
+ ret = qemu_file_get_error(f);
+ if (ret < 0) {
+ error_report("%s: save setup failed : %s", vbasedev->name,
+ strerror(-ret));
+ }
+
+ return ret;
}
static void vfio_save_cleanup(void *opaque)
--
2.44.0
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH for-9.1 v5 03/14] migration: Always report an error in block_save_setup()
2024-03-20 6:48 [PATCH for-9.1 v5 00/14] migration: Improve error reporting Cédric Le Goater
2024-03-20 6:48 ` [PATCH for-9.1 v5 01/14] s390/stattrib: Add Error** argument to set_migrationmode() handler Cédric Le Goater
2024-03-20 6:48 ` [PATCH for-9.1 v5 02/14] vfio: Always report an error in vfio_save_setup() Cédric Le Goater
@ 2024-03-20 6:48 ` Cédric Le Goater
2024-03-20 6:49 ` [PATCH for-9.1 v5 04/14] migration: Always report an error in ram_save_setup() Cédric Le Goater
` (11 subsequent siblings)
14 siblings, 0 replies; 34+ messages in thread
From: Cédric Le Goater @ 2024-03-20 6:48 UTC (permalink / raw)
To: qemu-devel, Stefan Hajnoczi, Fam Zheng, Peter Xu, Fabiano Rosas
Cc: Alex Williamson, Avihai Horon, Philippe Mathieu-Daudé,
Markus Armbruster, Prasad Pandit, Cédric Le Goater,
qemu-block
This will prepare ground for future changes adding an Error** argument
to the save_setup() handler. We need to make sure that on failure,
block_save_setup() always sets a new error.
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
Changes in v5:
- Rebased on 2e128776dc56 ("migration: Skip only empty block devices")
migration/block.c | 20 +++++++++++++++-----
1 file changed, 15 insertions(+), 5 deletions(-)
diff --git a/migration/block.c b/migration/block.c
index 2b9054889ad2ba739828594c50cf047703757e96..f8a11beb37dac3df5c2cc654db6440509d1181ea 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -367,7 +367,7 @@ static void unset_dirty_tracking(void)
}
}
-static int init_blk_migration(QEMUFile *f)
+static int init_blk_migration(QEMUFile *f, Error **errp)
{
BlockDriverState *bs;
BlkMigDevState *bmds;
@@ -378,7 +378,6 @@ static int init_blk_migration(QEMUFile *f)
BlkMigDevState *bmds;
BlockDriverState *bs;
} *bmds_bs;
- Error *local_err = NULL;
int ret;
GRAPH_RDLOCK_GUARD_MAINLOOP();
@@ -406,6 +405,8 @@ static int init_blk_migration(QEMUFile *f)
continue;
}
if (sectors < 0) {
+ error_setg(errp, "Error getting length of block device %s",
+ bdrv_get_device_name(bs));
ret = sectors;
bdrv_next_cleanup(&it);
goto out;
@@ -442,9 +443,8 @@ static int init_blk_migration(QEMUFile *f)
bs = bmds_bs[i].bs;
if (bmds) {
- ret = blk_insert_bs(bmds->blk, bs, &local_err);
+ ret = blk_insert_bs(bmds->blk, bs, errp);
if (ret < 0) {
- error_report_err(local_err);
goto out;
}
@@ -714,6 +714,7 @@ static void block_migration_cleanup(void *opaque)
static int block_save_setup(QEMUFile *f, void *opaque)
{
int ret;
+ Error *local_err = NULL;
trace_migration_block_save("setup", block_mig_state.submitted,
block_mig_state.transferred);
@@ -721,18 +722,27 @@ static int block_save_setup(QEMUFile *f, void *opaque)
warn_report("block migration is deprecated;"
" use blockdev-mirror with NBD instead");
- ret = init_blk_migration(f);
+ ret = init_blk_migration(f, &local_err);
if (ret < 0) {
+ error_report_err(local_err);
return ret;
}
/* start track dirty blocks */
ret = set_dirty_tracking();
if (ret) {
+ error_setg_errno(&local_err, -ret,
+ "Failed to start block dirty tracking");
+ error_report_err(local_err);
return ret;
}
ret = flush_blks(f);
+ if (ret) {
+ error_setg_errno(&local_err, -ret, "Flushing block failed");
+ error_report_err(local_err);
+ return ret;
+ }
blk_mig_reset_dirty_cursor();
qemu_put_be64(f, BLK_MIG_FLAG_EOS);
--
2.44.0
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH for-9.1 v5 04/14] migration: Always report an error in ram_save_setup()
2024-03-20 6:48 [PATCH for-9.1 v5 00/14] migration: Improve error reporting Cédric Le Goater
` (2 preceding siblings ...)
2024-03-20 6:48 ` [PATCH for-9.1 v5 03/14] migration: Always report an error in block_save_setup() Cédric Le Goater
@ 2024-03-20 6:49 ` Cédric Le Goater
2024-03-20 6:49 ` [PATCH for-9.1 v5 05/14] migration: Add Error** argument to vmstate_save() Cédric Le Goater
` (10 subsequent siblings)
14 siblings, 0 replies; 34+ messages in thread
From: Cédric Le Goater @ 2024-03-20 6:49 UTC (permalink / raw)
To: qemu-devel, Peter Xu, Fabiano Rosas
Cc: Alex Williamson, Avihai Horon, Philippe Mathieu-Daudé,
Markus Armbruster, Prasad Pandit, Cédric Le Goater
This will prepare ground for future changes adding an Error** argument
to the save_setup() handler. We need to make sure that on failure,
ram_save_setup() sets a new error.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
migration/ram.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/migration/ram.c b/migration/ram.c
index 8deb84984f4a8e9bcdcbdba7d83ce56e406adddf..44d7073730c67fa09ab9a59a712e74d8b088bff4 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3074,12 +3074,14 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
int ret, max_hg_page_size;
if (compress_threads_save_setup()) {
+ error_report("%s: failed to start compress threads", __func__);
return -1;
}
/* migration has already setup the bitmap, reuse it. */
if (!migration_in_colo_state()) {
if (ram_init_all(rsp) != 0) {
+ error_report("%s: failed to setup RAM for migration", __func__);
compress_threads_save_cleanup();
return -1;
}
@@ -3116,12 +3118,14 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
ret = rdma_registration_start(f, RAM_CONTROL_SETUP);
if (ret < 0) {
+ error_report("%s: failed to start RDMA registration", __func__);
qemu_file_set_error(f, ret);
return ret;
}
ret = rdma_registration_stop(f, RAM_CONTROL_SETUP);
if (ret < 0) {
+ error_report("%s: failed to stop RDMA registration", __func__);
qemu_file_set_error(f, ret);
return ret;
}
@@ -3138,6 +3142,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
ret = multifd_send_sync_main();
bql_lock();
if (ret < 0) {
+ error_report("%s: multifd synchronization failed", __func__);
return ret;
}
@@ -3147,7 +3152,11 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
}
qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
- return qemu_fflush(f);
+ ret = qemu_fflush(f);
+ if (ret < 0) {
+ error_report("%s failed : %s", __func__, strerror(-ret));
+ }
+ return ret;
}
static void ram_save_file_bmap(QEMUFile *f)
--
2.44.0
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH for-9.1 v5 05/14] migration: Add Error** argument to vmstate_save()
2024-03-20 6:48 [PATCH for-9.1 v5 00/14] migration: Improve error reporting Cédric Le Goater
` (3 preceding siblings ...)
2024-03-20 6:49 ` [PATCH for-9.1 v5 04/14] migration: Always report an error in ram_save_setup() Cédric Le Goater
@ 2024-03-20 6:49 ` Cédric Le Goater
2024-03-20 6:49 ` [PATCH for-9.1 v5 06/14] migration: Add Error** argument to qemu_savevm_state_setup() Cédric Le Goater
` (9 subsequent siblings)
14 siblings, 0 replies; 34+ messages in thread
From: Cédric Le Goater @ 2024-03-20 6:49 UTC (permalink / raw)
To: qemu-devel, Peter Xu, Fabiano Rosas
Cc: Alex Williamson, Avihai Horon, Philippe Mathieu-Daudé,
Markus Armbruster, Prasad Pandit, Cédric Le Goater
This will prepare ground for future changes adding an Error** argument
to qemu_savevm_state_setup().
Reviewed-by: Prasad Pandit <pjp@fedoraproject.org>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
migration/savevm.c | 26 ++++++++++++++++----------
1 file changed, 16 insertions(+), 10 deletions(-)
diff --git a/migration/savevm.c b/migration/savevm.c
index 388d7af7cdd842ec94fe782ed53979b800ffd4f6..1a7b5cb78a912c36ae16db703afc90ef2906b61f 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1009,11 +1009,10 @@ static void save_section_footer(QEMUFile *f, SaveStateEntry *se)
}
}
-static int vmstate_save(QEMUFile *f, SaveStateEntry *se, JSONWriter *vmdesc)
+static int vmstate_save(QEMUFile *f, SaveStateEntry *se, JSONWriter *vmdesc,
+ Error **errp)
{
int ret;
- Error *local_err = NULL;
- MigrationState *s = migrate_get_current();
if ((!se->ops || !se->ops->save_state) && !se->vmsd) {
return 0;
@@ -1035,10 +1034,9 @@ static int vmstate_save(QEMUFile *f, SaveStateEntry *se, JSONWriter *vmdesc)
if (!se->vmsd) {
vmstate_save_old_style(f, se, vmdesc);
} else {
- ret = vmstate_save_state_with_err(f, se->vmsd, se->opaque, vmdesc, &local_err);
+ ret = vmstate_save_state_with_err(f, se->vmsd, se->opaque, vmdesc,
+ errp);
if (ret) {
- migrate_set_error(s, local_err);
- error_report_err(local_err);
return ret;
}
}
@@ -1325,8 +1323,10 @@ void qemu_savevm_state_setup(QEMUFile *f)
trace_savevm_state_setup();
QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
if (se->vmsd && se->vmsd->early_setup) {
- ret = vmstate_save(f, se, ms->vmdesc);
+ ret = vmstate_save(f, se, ms->vmdesc, &local_err);
if (ret) {
+ migrate_set_error(ms, local_err);
+ error_report_err(local_err);
qemu_file_set_error(f, ret);
break;
}
@@ -1542,6 +1542,7 @@ int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f,
JSONWriter *vmdesc = ms->vmdesc;
int vmdesc_len;
SaveStateEntry *se;
+ Error *local_err = NULL;
int ret;
QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
@@ -1552,8 +1553,10 @@ int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f,
start_ts_each = qemu_clock_get_us(QEMU_CLOCK_REALTIME);
- ret = vmstate_save(f, se, vmdesc);
+ ret = vmstate_save(f, se, vmdesc, &local_err);
if (ret) {
+ migrate_set_error(ms, local_err);
+ error_report_err(local_err);
qemu_file_set_error(f, ret);
return ret;
}
@@ -1568,7 +1571,6 @@ int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f,
* bdrv_activate_all() on the other end won't fail. */
ret = bdrv_inactivate_all();
if (ret) {
- Error *local_err = NULL;
error_setg(&local_err, "%s: bdrv_inactivate_all() failed (%d)",
__func__, ret);
migrate_set_error(ms, local_err);
@@ -1764,6 +1766,8 @@ void qemu_savevm_live_state(QEMUFile *f)
int qemu_save_device_state(QEMUFile *f)
{
+ MigrationState *ms = migrate_get_current();
+ Error *local_err = NULL;
SaveStateEntry *se;
if (!migration_in_colo_state()) {
@@ -1778,8 +1782,10 @@ int qemu_save_device_state(QEMUFile *f)
if (se->is_ram) {
continue;
}
- ret = vmstate_save(f, se, NULL);
+ ret = vmstate_save(f, se, NULL, &local_err);
if (ret) {
+ migrate_set_error(ms, local_err);
+ error_report_err(local_err);
return ret;
}
}
--
2.44.0
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH for-9.1 v5 06/14] migration: Add Error** argument to qemu_savevm_state_setup()
2024-03-20 6:48 [PATCH for-9.1 v5 00/14] migration: Improve error reporting Cédric Le Goater
` (4 preceding siblings ...)
2024-03-20 6:49 ` [PATCH for-9.1 v5 05/14] migration: Add Error** argument to vmstate_save() Cédric Le Goater
@ 2024-03-20 6:49 ` Cédric Le Goater
2024-03-20 14:29 ` Peter Xu
2024-03-20 6:49 ` [PATCH for-9.1 v5 07/14] migration: Add Error** argument to .save_setup() handler Cédric Le Goater
` (8 subsequent siblings)
14 siblings, 1 reply; 34+ messages in thread
From: Cédric Le Goater @ 2024-03-20 6:49 UTC (permalink / raw)
To: qemu-devel, Peter Xu, Fabiano Rosas
Cc: Alex Williamson, Avihai Horon, Philippe Mathieu-Daudé,
Markus Armbruster, Prasad Pandit, Cédric Le Goater
This prepares ground for the changes coming next which add an Error**
argument to the .save_setup() handler. Callers of qemu_savevm_state_setup()
now handle the error and fail earlier setting the migration state from
MIGRATION_STATUS_SETUP to MIGRATION_STATUS_FAILED.
In qemu_savevm_state(), move the cleanup to preserve the error
reported by .save_setup() handlers.
Since the previous behavior was to ignore errors at this step of
migration, this change should be examined closely to check that
cleanups are still correctly done.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
Changes in v5:
- Removed Fabiano's R-b because of changes
- Handled qemu_savevm_state_setup() failures after waiting for
virtio-net-failover devices to unplug.
migration/savevm.h | 2 +-
migration/migration.c | 33 +++++++++++++++++++++++++++++++--
migration/savevm.c | 26 +++++++++++++++-----------
3 files changed, 47 insertions(+), 14 deletions(-)
diff --git a/migration/savevm.h b/migration/savevm.h
index 74669733dd63a080b765866c703234a5c4939223..9ec96a995c93a42aad621595f0ed58596c532328 100644
--- a/migration/savevm.h
+++ b/migration/savevm.h
@@ -32,7 +32,7 @@
bool qemu_savevm_state_blocked(Error **errp);
void qemu_savevm_non_migratable_list(strList **reasons);
int qemu_savevm_state_prepare(Error **errp);
-void qemu_savevm_state_setup(QEMUFile *f);
+int qemu_savevm_state_setup(QEMUFile *f, Error **errp);
bool qemu_savevm_state_guest_unplug_pending(void);
int qemu_savevm_state_resume_prepare(MigrationState *s);
void qemu_savevm_state_header(QEMUFile *f);
diff --git a/migration/migration.c b/migration/migration.c
index f60bd371e3f896a74df8be4282a15b4280eba732..cd6b6120e31798de9361d02ee43d89989c8d30ce 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3427,6 +3427,8 @@ static void *migration_thread(void *opaque)
int64_t setup_start = qemu_clock_get_ms(QEMU_CLOCK_HOST);
MigThrError thr_error;
bool urgent = false;
+ Error *local_err = NULL;
+ int ret;
thread = migration_threads_add("live_migration", qemu_get_thread_id());
@@ -3470,12 +3472,24 @@ static void *migration_thread(void *opaque)
}
bql_lock();
- qemu_savevm_state_setup(s->to_dst_file);
+ ret = qemu_savevm_state_setup(s->to_dst_file, &local_err);
bql_unlock();
qemu_savevm_wait_unplug(s, MIGRATION_STATUS_SETUP,
MIGRATION_STATUS_ACTIVE);
+ /*
+ * Handle SETUP failures after waiting for virtio-net-failover
+ * devices to unplug. This to preserve migration state transitions.
+ */
+ if (ret) {
+ migrate_set_error(s, local_err);
+ error_free(local_err);
+ migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
+ MIGRATION_STATUS_FAILED);
+ goto out;
+ }
+
s->setup_time = qemu_clock_get_ms(QEMU_CLOCK_HOST) - setup_start;
trace_migration_thread_setup_complete();
@@ -3549,6 +3563,8 @@ static void *bg_migration_thread(void *opaque)
MigThrError thr_error;
QEMUFile *fb;
bool early_fail = true;
+ Error *local_err = NULL;
+ int ret;
rcu_register_thread();
object_ref(OBJECT(s));
@@ -3582,12 +3598,24 @@ static void *bg_migration_thread(void *opaque)
bql_lock();
qemu_savevm_state_header(s->to_dst_file);
- qemu_savevm_state_setup(s->to_dst_file);
+ ret = qemu_savevm_state_setup(s->to_dst_file, &local_err);
bql_unlock();
qemu_savevm_wait_unplug(s, MIGRATION_STATUS_SETUP,
MIGRATION_STATUS_ACTIVE);
+ /*
+ * Handle SETUP failures after waiting for virtio-net-failover
+ * devices to unplug. This to preserve migration state transitions.
+ */
+ if (ret) {
+ migrate_set_error(s, local_err);
+ error_free(local_err);
+ migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
+ MIGRATION_STATUS_FAILED);
+ goto fail_setup;
+ }
+
s->setup_time = qemu_clock_get_ms(QEMU_CLOCK_HOST) - setup_start;
trace_migration_thread_setup_complete();
@@ -3656,6 +3684,7 @@ fail:
bql_unlock();
}
+fail_setup:
bg_migration_iteration_finish(s);
qemu_fclose(fb);
diff --git a/migration/savevm.c b/migration/savevm.c
index 1a7b5cb78a912c36ae16db703afc90ef2906b61f..0eb94e61f888adba2c0732c2cb701b110814c455 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1310,11 +1310,11 @@ int qemu_savevm_state_prepare(Error **errp)
return 0;
}
-void qemu_savevm_state_setup(QEMUFile *f)
+int qemu_savevm_state_setup(QEMUFile *f, Error **errp)
{
+ ERRP_GUARD();
MigrationState *ms = migrate_get_current();
SaveStateEntry *se;
- Error *local_err = NULL;
int ret = 0;
json_writer_int64(ms->vmdesc, "page_size", qemu_target_page_size());
@@ -1323,10 +1323,9 @@ void qemu_savevm_state_setup(QEMUFile *f)
trace_savevm_state_setup();
QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
if (se->vmsd && se->vmsd->early_setup) {
- ret = vmstate_save(f, se, ms->vmdesc, &local_err);
+ ret = vmstate_save(f, se, ms->vmdesc, errp);
if (ret) {
- migrate_set_error(ms, local_err);
- error_report_err(local_err);
+ migrate_set_error(ms, *errp);
qemu_file_set_error(f, ret);
break;
}
@@ -1346,18 +1345,19 @@ void qemu_savevm_state_setup(QEMUFile *f)
ret = se->ops->save_setup(f, se->opaque);
save_section_footer(f, se);
if (ret < 0) {
+ error_setg(errp, "failed to setup SaveStateEntry with id(name): "
+ "%d(%s): %d", se->section_id, se->idstr, ret);
qemu_file_set_error(f, ret);
break;
}
}
if (ret) {
- return;
+ return ret;
}
- if (precopy_notify(PRECOPY_NOTIFY_SETUP, &local_err)) {
- error_report_err(local_err);
- }
+ /* TODO: Should we check that errp is set in case of failure ? */
+ return precopy_notify(PRECOPY_NOTIFY_SETUP, errp);
}
int qemu_savevm_state_resume_prepare(MigrationState *s)
@@ -1725,7 +1725,10 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
ms->to_dst_file = f;
qemu_savevm_state_header(f);
- qemu_savevm_state_setup(f);
+ ret = qemu_savevm_state_setup(f, errp);
+ if (ret) {
+ goto cleanup;
+ }
while (qemu_file_get_error(f) == 0) {
if (qemu_savevm_state_iterate(f, false) > 0) {
@@ -1738,10 +1741,11 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
qemu_savevm_state_complete_precopy(f, false, false);
ret = qemu_file_get_error(f);
}
- qemu_savevm_state_cleanup();
if (ret != 0) {
error_setg_errno(errp, -ret, "Error while writing VM state");
}
+cleanup:
+ qemu_savevm_state_cleanup();
if (ret != 0) {
status = MIGRATION_STATUS_FAILED;
--
2.44.0
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH for-9.1 v5 07/14] migration: Add Error** argument to .save_setup() handler
2024-03-20 6:48 [PATCH for-9.1 v5 00/14] migration: Improve error reporting Cédric Le Goater
` (5 preceding siblings ...)
2024-03-20 6:49 ` [PATCH for-9.1 v5 06/14] migration: Add Error** argument to qemu_savevm_state_setup() Cédric Le Goater
@ 2024-03-20 6:49 ` Cédric Le Goater
2024-03-29 9:32 ` Vladimir Sementsov-Ogievskiy
2024-03-20 6:49 ` [PATCH for-9.1 v5 08/14] migration: Add Error** argument to .load_setup() handler Cédric Le Goater
` (7 subsequent siblings)
14 siblings, 1 reply; 34+ messages in thread
From: Cédric Le Goater @ 2024-03-20 6:49 UTC (permalink / raw)
To: qemu-devel, Nicholas Piggin, Daniel Henrique Barboza,
David Gibson, Harsh Prateek Bora, Halil Pasic,
Christian Borntraeger, Eric Farman, Thomas Huth,
Richard Henderson, David Hildenbrand, Ilya Leoshkevich,
Alex Williamson, Cédric Le Goater, Peter Xu, Fabiano Rosas,
Stefan Hajnoczi, Fam Zheng, Eric Blake,
Vladimir Sementsov-Ogievskiy, John Snow
Cc: Avihai Horon, Philippe Mathieu-Daudé, Markus Armbruster,
Prasad Pandit, qemu-ppc, qemu-s390x, qemu-block
The purpose is to record a potential error in the migration stream if
qemu_savevm_state_setup() fails. Most of the current .save_setup()
handlers can be modified to use the Error argument instead of managing
their own and calling locally error_report().
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Harsh Prateek Bora <harshpb@linux.ibm.com>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Thomas Huth <thuth@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Cc: John Snow <jsnow@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
include/migration/register.h | 3 ++-
hw/ppc/spapr.c | 2 +-
hw/s390x/s390-stattrib.c | 6 ++----
hw/vfio/migration.c | 17 ++++++++---------
migration/block-dirty-bitmap.c | 4 +++-
migration/block.c | 13 ++++---------
migration/ram.c | 15 ++++++++-------
migration/savevm.c | 4 +---
8 files changed, 29 insertions(+), 35 deletions(-)
diff --git a/include/migration/register.h b/include/migration/register.h
index d7b70a8be68c9df47c7843bda7d430989d7ca384..64fc7c11036c82edd6d69513e56a0216d36c17aa 100644
--- a/include/migration/register.h
+++ b/include/migration/register.h
@@ -60,10 +60,11 @@ typedef struct SaveVMHandlers {
*
* @f: QEMUFile where to send the data
* @opaque: data pointer passed to register_savevm_live()
+ * @errp: pointer to Error*, to store an error if it happens.
*
* Returns zero to indicate success and negative for error
*/
- int (*save_setup)(QEMUFile *f, void *opaque);
+ int (*save_setup)(QEMUFile *f, void *opaque, Error **errp);
/**
* @save_cleanup
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index c417f9dd523547eabf6d66a8f505093758e80461..144a3f2b604872e09268b509b9b79ee5b2226136 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2171,7 +2171,7 @@ static const VMStateDescription vmstate_spapr = {
}
};
-static int htab_save_setup(QEMUFile *f, void *opaque)
+static int htab_save_setup(QEMUFile *f, void *opaque, Error **errp)
{
SpaprMachineState *spapr = opaque;
diff --git a/hw/s390x/s390-stattrib.c b/hw/s390x/s390-stattrib.c
index b743e8a2fee84c7374460ccea6df1cf447cda44b..bc04187b2b69226db80219da1a964a87428adc0c 100644
--- a/hw/s390x/s390-stattrib.c
+++ b/hw/s390x/s390-stattrib.c
@@ -168,19 +168,17 @@ static int cmma_load(QEMUFile *f, void *opaque, int version_id)
return ret;
}
-static int cmma_save_setup(QEMUFile *f, void *opaque)
+static int cmma_save_setup(QEMUFile *f, void *opaque, Error **errp)
{
S390StAttribState *sas = S390_STATTRIB(opaque);
S390StAttribClass *sac = S390_STATTRIB_GET_CLASS(sas);
- Error *local_err = NULL;
int res;
/*
* Signal that we want to start a migration, thus needing PGSTE dirty
* tracking.
*/
- res = sac->set_migrationmode(sas, true, &local_err);
+ res = sac->set_migrationmode(sas, true, errp);
if (res) {
- error_report_err(local_err);
return res;
}
qemu_put_be64(f, STATTR_FLAG_EOS);
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index bf5a29ddc15b0dbc7ae9c44f289539dd0cdddb0d..5763c0b68376b1e24ef3e77c3d19fcd406922c79 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -376,7 +376,7 @@ static int vfio_save_prepare(void *opaque, Error **errp)
return 0;
}
-static int vfio_save_setup(QEMUFile *f, void *opaque)
+static int vfio_save_setup(QEMUFile *f, void *opaque, Error **errp)
{
VFIODevice *vbasedev = opaque;
VFIOMigration *migration = vbasedev->migration;
@@ -390,8 +390,8 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
stop_copy_size);
migration->data_buffer = g_try_malloc0(migration->data_buffer_size);
if (!migration->data_buffer) {
- error_report("%s: Failed to allocate migration data buffer",
- vbasedev->name);
+ error_setg(errp, "%s: Failed to allocate migration data buffer",
+ vbasedev->name);
return -ENOMEM;
}
@@ -401,8 +401,8 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_PRE_COPY,
VFIO_DEVICE_STATE_RUNNING);
if (ret) {
- error_report("%s: Failed to set new PRE_COPY state",
- vbasedev->name);
+ error_setg(errp, "%s: Failed to set new PRE_COPY state",
+ vbasedev->name);
return ret;
}
@@ -413,8 +413,8 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
/* vfio_save_complete_precopy() will go to STOP_COPY */
break;
default:
- error_report("%s: Invalid device state %d", vbasedev->name,
- migration->device_state);
+ error_setg(errp, "%s: Invalid device state %d", vbasedev->name,
+ migration->device_state);
return -EINVAL;
}
}
@@ -425,8 +425,7 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
ret = qemu_file_get_error(f);
if (ret < 0) {
- error_report("%s: save setup failed : %s", vbasedev->name,
- strerror(-ret));
+ error_setg_errno(errp, -ret, "%s: save setup failed", vbasedev->name);
}
return ret;
diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 2708abf3d762de774ed294d3fdb8e56690d2974c..542a8c297b329abc30d1b3a205d29340fa59a961 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -1213,12 +1213,14 @@ fail:
return ret;
}
-static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque)
+static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque, Error **errp)
{
DBMSaveState *s = &((DBMState *)opaque)->save;
SaveBitmapState *dbms = NULL;
if (init_dirty_bitmap_migration(s) < 0) {
+ error_setg(errp,
+ "Failed to initialize dirty tracking bitmap for blocks");
return -1;
}
diff --git a/migration/block.c b/migration/block.c
index f8a11beb37dac3df5c2cc654db6440509d1181ea..bae6e94891f371302d7a78b2ec54449fd8cfe0b3 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -711,10 +711,9 @@ static void block_migration_cleanup(void *opaque)
blk_mig_unlock();
}
-static int block_save_setup(QEMUFile *f, void *opaque)
+static int block_save_setup(QEMUFile *f, void *opaque, Error **errp)
{
int ret;
- Error *local_err = NULL;
trace_migration_block_save("setup", block_mig_state.submitted,
block_mig_state.transferred);
@@ -722,25 +721,21 @@ static int block_save_setup(QEMUFile *f, void *opaque)
warn_report("block migration is deprecated;"
" use blockdev-mirror with NBD instead");
- ret = init_blk_migration(f, &local_err);
+ ret = init_blk_migration(f, errp);
if (ret < 0) {
- error_report_err(local_err);
return ret;
}
/* start track dirty blocks */
ret = set_dirty_tracking();
if (ret) {
- error_setg_errno(&local_err, -ret,
- "Failed to start block dirty tracking");
- error_report_err(local_err);
+ error_setg_errno(errp, -ret, "Failed to start block dirty tracking");
return ret;
}
ret = flush_blks(f);
if (ret) {
- error_setg_errno(&local_err, -ret, "Flushing block failed");
- error_report_err(local_err);
+ error_setg_errno(errp, -ret, "Flushing block failed");
return ret;
}
blk_mig_reset_dirty_cursor();
diff --git a/migration/ram.c b/migration/ram.c
index 44d7073730c67fa09ab9a59a712e74d8b088bff4..6ea5a06e00e30d0d1e4d8a6defdeb86c81fa707b 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3066,22 +3066,23 @@ static bool mapped_ram_read_header(QEMUFile *file, MappedRamHeader *header,
*
* @f: QEMUFile where to send the data
* @opaque: RAMState pointer
+ * @errp: pointer to Error*, to store an error if it happens.
*/
-static int ram_save_setup(QEMUFile *f, void *opaque)
+static int ram_save_setup(QEMUFile *f, void *opaque, Error **errp)
{
RAMState **rsp = opaque;
RAMBlock *block;
int ret, max_hg_page_size;
if (compress_threads_save_setup()) {
- error_report("%s: failed to start compress threads", __func__);
+ error_setg(errp, "%s: failed to start compress threads", __func__);
return -1;
}
/* migration has already setup the bitmap, reuse it. */
if (!migration_in_colo_state()) {
if (ram_init_all(rsp) != 0) {
- error_report("%s: failed to setup RAM for migration", __func__);
+ error_setg(errp, "%s: failed to setup RAM for migration", __func__);
compress_threads_save_cleanup();
return -1;
}
@@ -3118,14 +3119,14 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
ret = rdma_registration_start(f, RAM_CONTROL_SETUP);
if (ret < 0) {
- error_report("%s: failed to start RDMA registration", __func__);
+ error_setg(errp, "%s: failed to start RDMA registration", __func__);
qemu_file_set_error(f, ret);
return ret;
}
ret = rdma_registration_stop(f, RAM_CONTROL_SETUP);
if (ret < 0) {
- error_report("%s: failed to stop RDMA registration", __func__);
+ error_setg(errp, "%s: failed to stop RDMA registration", __func__);
qemu_file_set_error(f, ret);
return ret;
}
@@ -3142,7 +3143,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
ret = multifd_send_sync_main();
bql_lock();
if (ret < 0) {
- error_report("%s: multifd synchronization failed", __func__);
+ error_setg(errp, "%s: multifd synchronization failed", __func__);
return ret;
}
@@ -3154,7 +3155,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
ret = qemu_fflush(f);
if (ret < 0) {
- error_report("%s failed : %s", __func__, strerror(-ret));
+ error_setg_errno(errp, -ret, "%s failed", __func__);
}
return ret;
}
diff --git a/migration/savevm.c b/migration/savevm.c
index 0eb94e61f888adba2c0732c2cb701b110814c455..535ad5a32d67057dd172ce25d561a66a07172e97 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1342,11 +1342,9 @@ int qemu_savevm_state_setup(QEMUFile *f, Error **errp)
}
save_section_header(f, se, QEMU_VM_SECTION_START);
- ret = se->ops->save_setup(f, se->opaque);
+ ret = se->ops->save_setup(f, se->opaque, errp);
save_section_footer(f, se);
if (ret < 0) {
- error_setg(errp, "failed to setup SaveStateEntry with id(name): "
- "%d(%s): %d", se->section_id, se->idstr, ret);
qemu_file_set_error(f, ret);
break;
}
--
2.44.0
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH for-9.1 v5 08/14] migration: Add Error** argument to .load_setup() handler
2024-03-20 6:48 [PATCH for-9.1 v5 00/14] migration: Improve error reporting Cédric Le Goater
` (6 preceding siblings ...)
2024-03-20 6:49 ` [PATCH for-9.1 v5 07/14] migration: Add Error** argument to .save_setup() handler Cédric Le Goater
@ 2024-03-20 6:49 ` Cédric Le Goater
2024-03-20 8:02 ` Markus Armbruster
2024-03-20 6:49 ` [PATCH for-9.1 v5 09/14] memory: Add Error** argument to .log_global_start() handler Cédric Le Goater
` (6 subsequent siblings)
14 siblings, 1 reply; 34+ messages in thread
From: Cédric Le Goater @ 2024-03-20 6:49 UTC (permalink / raw)
To: qemu-devel, Alex Williamson, Cédric Le Goater, Peter Xu,
Fabiano Rosas
Cc: Avihai Horon, Philippe Mathieu-Daudé, Markus Armbruster,
Prasad Pandit
This will be useful to report errors at a higher level, mostly in VFIO
today.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
include/migration/register.h | 3 ++-
hw/vfio/migration.c | 9 +++++++--
migration/ram.c | 3 ++-
migration/savevm.c | 11 +++++++----
4 files changed, 18 insertions(+), 8 deletions(-)
diff --git a/include/migration/register.h b/include/migration/register.h
index 64fc7c11036c82edd6d69513e56a0216d36c17aa..f60e797894e5faacdf55d2d6de175074ac58944f 100644
--- a/include/migration/register.h
+++ b/include/migration/register.h
@@ -234,10 +234,11 @@ typedef struct SaveVMHandlers {
*
* @f: QEMUFile where to receive the data
* @opaque: data pointer passed to register_savevm_live()
+ * @errp: pointer to Error*, to store an error if it happens.
*
* Returns zero to indicate success and negative for error
*/
- int (*load_setup)(QEMUFile *f, void *opaque);
+ int (*load_setup)(QEMUFile *f, void *opaque, Error **errp);
/**
* @load_cleanup
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 5763c0b68376b1e24ef3e77c3d19fcd406922c79..06ae40969b6c19037e190008e14f28be646278cd 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -588,12 +588,17 @@ static void vfio_save_state(QEMUFile *f, void *opaque)
}
}
-static int vfio_load_setup(QEMUFile *f, void *opaque)
+static int vfio_load_setup(QEMUFile *f, void *opaque, Error **errp)
{
VFIODevice *vbasedev = opaque;
+ int ret;
- return vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_RESUMING,
+ ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_RESUMING,
vbasedev->migration->device_state);
+ if (ret) {
+ error_setg(errp, "%s: Failed to set RESUMING state", vbasedev->name);
+ }
+ return ret;
}
static int vfio_load_cleanup(void *opaque)
diff --git a/migration/ram.c b/migration/ram.c
index 6ea5a06e00e30d0d1e4d8a6defdeb86c81fa707b..4cd4f0158c8675e1515ef8476c64d1203eed4458 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3704,8 +3704,9 @@ void colo_release_ram_cache(void)
*
* @f: QEMUFile where to receive the data
* @opaque: RAMState pointer
+ * @errp: pointer to Error*, to store an error if it happens.
*/
-static int ram_load_setup(QEMUFile *f, void *opaque)
+static int ram_load_setup(QEMUFile *f, void *opaque, Error **errp)
{
xbzrle_load_setup();
ramblock_recv_map_init();
diff --git a/migration/savevm.c b/migration/savevm.c
index 535ad5a32d67057dd172ce25d561a66a07172e97..8f42999a15d1685957de9ed517d6bc9ba49c3f11 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2747,8 +2747,9 @@ static void qemu_loadvm_state_switchover_ack_needed(MigrationIncomingState *mis)
trace_loadvm_state_switchover_ack_needed(mis->switchover_ack_pending_num);
}
-static int qemu_loadvm_state_setup(QEMUFile *f)
+static int qemu_loadvm_state_setup(QEMUFile *f, Error **errp)
{
+ ERRP_GUARD(); /* error_prepend use */
SaveStateEntry *se;
int ret;
@@ -2763,10 +2764,11 @@ static int qemu_loadvm_state_setup(QEMUFile *f)
}
}
- ret = se->ops->load_setup(f, se->opaque);
+ ret = se->ops->load_setup(f, se->opaque, errp);
if (ret < 0) {
+ error_prepend(errp, "Load state of device %s failed: ",
+ se->idstr);
qemu_file_set_error(f, ret);
- error_report("Load state of device %s failed", se->idstr);
return ret;
}
}
@@ -2947,7 +2949,8 @@ int qemu_loadvm_state(QEMUFile *f)
return ret;
}
- if (qemu_loadvm_state_setup(f) != 0) {
+ if (qemu_loadvm_state_setup(f, &local_err) != 0) {
+ error_report_err(local_err);
return -EINVAL;
}
--
2.44.0
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH for-9.1 v5 09/14] memory: Add Error** argument to .log_global_start() handler
2024-03-20 6:48 [PATCH for-9.1 v5 00/14] migration: Improve error reporting Cédric Le Goater
` (7 preceding siblings ...)
2024-03-20 6:49 ` [PATCH for-9.1 v5 08/14] migration: Add Error** argument to .load_setup() handler Cédric Le Goater
@ 2024-03-20 6:49 ` Cédric Le Goater
2024-03-20 14:42 ` Peter Xu
2024-03-20 6:49 ` [PATCH for-9.1 v5 10/14] migration: Introduce ram_bitmaps_destroy() Cédric Le Goater
` (5 subsequent siblings)
14 siblings, 1 reply; 34+ messages in thread
From: Cédric Le Goater @ 2024-03-20 6:49 UTC (permalink / raw)
To: qemu-devel, Stefano Stabellini, Anthony Perard, Paul Durrant,
Paolo Bonzini, Richard Henderson, Eduardo Habkost,
Michael S. Tsirkin, Marcel Apfelbaum, Alex Williamson,
Cédric Le Goater, Peter Xu, David Hildenbrand,
Philippe Mathieu-Daudé
Cc: Fabiano Rosas, Avihai Horon, Markus Armbruster, Prasad Pandit,
xen-devel
Modify all .log_global_start() handlers to take an Error** parameter
and return a bool. Adapt memory_global_dirty_log_start() to interrupt
on the first error the loop on handlers. In such case, a rollback is
performed to stop dirty logging on all listeners where it was
previously enabled.
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Paul Durrant <paul@xen.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
Changes in v5:
- Removed memory_global_dirty_log_rollback
- Introduced memory_global_dirty_log_do_start() to call
.log_global_start() handlers and do the rollback in case of error.
- Kept modification of the global_dirty_tracking flag within
memory_global_dirty_log_start()
- Added an assert on error of a .log_global_start() handler in
listener_add_address_space()
include/exec/memory.h | 5 ++++-
hw/i386/xen/xen-hvm.c | 3 ++-
hw/vfio/common.c | 4 +++-
hw/virtio/vhost.c | 3 ++-
system/memory.c | 37 +++++++++++++++++++++++++++++++++++--
5 files changed, 46 insertions(+), 6 deletions(-)
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 8626a355b310ed7b1a1db7978ba4b394032c2f15..5555567bc4c9fdb53e8f63487f1400980275687d 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -998,8 +998,11 @@ struct MemoryListener {
* active at that time.
*
* @listener: The #MemoryListener.
+ * @errp: pointer to Error*, to store an error if it happens.
+ *
+ * Return: true on success, else false setting @errp with error.
*/
- void (*log_global_start)(MemoryListener *listener);
+ bool (*log_global_start)(MemoryListener *listener, Error **errp);
/**
* @log_global_stop:
diff --git a/hw/i386/xen/xen-hvm.c b/hw/i386/xen/xen-hvm.c
index 7745cb39631ea423aeb6e5d3eb7f7bcbe27ec6fa..f6e9a1bc86491783077b5cb5aafdb19ab294e392 100644
--- a/hw/i386/xen/xen-hvm.c
+++ b/hw/i386/xen/xen-hvm.c
@@ -457,11 +457,12 @@ static void xen_log_sync(MemoryListener *listener, MemoryRegionSection *section)
int128_get64(section->size));
}
-static void xen_log_global_start(MemoryListener *listener)
+static bool xen_log_global_start(MemoryListener *listener, Error **errp)
{
if (xen_enabled()) {
xen_in_migration = true;
}
+ return true;
}
static void xen_log_global_stop(MemoryListener *listener)
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 011ceaab89433de4496dffadc737286e053f321d..8f9cbdc0264044ce587877a7d19d14b28527291b 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1066,7 +1066,8 @@ out:
return ret;
}
-static void vfio_listener_log_global_start(MemoryListener *listener)
+static bool vfio_listener_log_global_start(MemoryListener *listener,
+ Error **errp)
{
VFIOContainerBase *bcontainer = container_of(listener, VFIOContainerBase,
listener);
@@ -1083,6 +1084,7 @@ static void vfio_listener_log_global_start(MemoryListener *listener)
ret, strerror(-ret));
vfio_set_migration_error(ret);
}
+ return !ret;
}
static void vfio_listener_log_global_stop(MemoryListener *listener)
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 2e4e040db87acf45166da86d268077f54511d82c..d405f03caf2fd3a5ea23bdc0392f4c6c072bc10b 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1044,7 +1044,7 @@ check_dev_state:
return r;
}
-static void vhost_log_global_start(MemoryListener *listener)
+static bool vhost_log_global_start(MemoryListener *listener, Error **errp)
{
int r;
@@ -1052,6 +1052,7 @@ static void vhost_log_global_start(MemoryListener *listener)
if (r < 0) {
abort();
}
+ return true;
}
static void vhost_log_global_stop(MemoryListener *listener)
diff --git a/system/memory.c b/system/memory.c
index a229a79988fce2aa3cb77e3a130db4c694e8cd49..ca4d91484fb3d06f4b902486fea49dba86dc141b 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -2914,9 +2914,33 @@ static unsigned int postponed_stop_flags;
static VMChangeStateEntry *vmstate_change;
static void memory_global_dirty_log_stop_postponed_run(void);
+static bool memory_global_dirty_log_do_start(Error **errp)
+{
+ MemoryListener *listener;
+
+ QTAILQ_FOREACH(listener, &memory_listeners, link) {
+ if (listener->log_global_start) {
+ if (!listener->log_global_start(listener, errp)) {
+ goto err;
+ }
+ }
+ }
+ return true;
+
+err:
+ while ((listener = QTAILQ_PREV(listener, link)) != NULL) {
+ if (listener->log_global_stop) {
+ listener->log_global_stop(listener);
+ }
+ }
+
+ return false;
+}
+
void memory_global_dirty_log_start(unsigned int flags)
{
unsigned int old_flags;
+ Error *local_err = NULL;
assert(flags && !(flags & (~GLOBAL_DIRTY_MASK)));
@@ -2936,7 +2960,13 @@ void memory_global_dirty_log_start(unsigned int flags)
trace_global_dirty_changed(global_dirty_tracking);
if (!old_flags) {
- MEMORY_LISTENER_CALL_GLOBAL(log_global_start, Forward);
+ if (!memory_global_dirty_log_do_start(&local_err)) {
+ global_dirty_tracking &= ~flags;
+ trace_global_dirty_changed(global_dirty_tracking);
+ error_report_err(local_err);
+ return;
+ }
+
memory_region_transaction_begin();
memory_region_update_pending = true;
memory_region_transaction_commit();
@@ -3014,8 +3044,11 @@ static void listener_add_address_space(MemoryListener *listener,
listener->begin(listener);
}
if (global_dirty_tracking) {
+ /*
+ * Migration has already started. Assert on any error.
+ */
if (listener->log_global_start) {
- listener->log_global_start(listener);
+ listener->log_global_start(listener, &error_abort);
}
}
--
2.44.0
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH for-9.1 v5 10/14] migration: Introduce ram_bitmaps_destroy()
2024-03-20 6:48 [PATCH for-9.1 v5 00/14] migration: Improve error reporting Cédric Le Goater
` (8 preceding siblings ...)
2024-03-20 6:49 ` [PATCH for-9.1 v5 09/14] memory: Add Error** argument to .log_global_start() handler Cédric Le Goater
@ 2024-03-20 6:49 ` Cédric Le Goater
2024-03-20 14:43 ` Peter Xu
2024-03-20 14:49 ` Fabiano Rosas
2024-03-20 6:49 ` [PATCH for-9.1 v5 11/14] memory: Add Error** argument to the global_dirty_log routines Cédric Le Goater
` (4 subsequent siblings)
14 siblings, 2 replies; 34+ messages in thread
From: Cédric Le Goater @ 2024-03-20 6:49 UTC (permalink / raw)
To: qemu-devel, Peter Xu, Fabiano Rosas
Cc: Alex Williamson, Avihai Horon, Philippe Mathieu-Daudé,
Markus Armbruster, Prasad Pandit, Cédric Le Goater
We will use it in ram_init_bitmaps() to clear the allocated bitmaps when
support for error reporting is added to memory_global_dirty_log_start().
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
migration/ram.c | 22 +++++++++++++++-------
1 file changed, 15 insertions(+), 7 deletions(-)
diff --git a/migration/ram.c b/migration/ram.c
index 4cd4f0158c8675e1515ef8476c64d1203eed4458..f0bd71438a4f7212118593b51648b645737933d4 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2438,10 +2438,23 @@ static void xbzrle_cleanup(void)
XBZRLE_cache_unlock();
}
+static void ram_bitmaps_destroy(void)
+{
+ RAMBlock *block;
+
+ RAMBLOCK_FOREACH_NOT_IGNORED(block) {
+ g_free(block->clear_bmap);
+ block->clear_bmap = NULL;
+ g_free(block->bmap);
+ block->bmap = NULL;
+ g_free(block->file_bmap);
+ block->file_bmap = NULL;
+ }
+}
+
static void ram_save_cleanup(void *opaque)
{
RAMState **rsp = opaque;
- RAMBlock *block;
/* We don't use dirty log with background snapshots */
if (!migrate_background_snapshot()) {
@@ -2458,12 +2471,7 @@ static void ram_save_cleanup(void *opaque)
}
}
- RAMBLOCK_FOREACH_NOT_IGNORED(block) {
- g_free(block->clear_bmap);
- block->clear_bmap = NULL;
- g_free(block->bmap);
- block->bmap = NULL;
- }
+ ram_bitmaps_destroy();
xbzrle_cleanup();
compress_threads_save_cleanup();
--
2.44.0
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH for-9.1 v5 11/14] memory: Add Error** argument to the global_dirty_log routines
2024-03-20 6:48 [PATCH for-9.1 v5 00/14] migration: Improve error reporting Cédric Le Goater
` (9 preceding siblings ...)
2024-03-20 6:49 ` [PATCH for-9.1 v5 10/14] migration: Introduce ram_bitmaps_destroy() Cédric Le Goater
@ 2024-03-20 6:49 ` Cédric Le Goater
2024-03-20 14:53 ` Fabiano Rosas
2024-03-20 15:18 ` Peter Xu
2024-03-20 6:49 ` [PATCH for-9.1 v5 12/14] migration: Add Error** argument to ram_state_init() Cédric Le Goater
` (3 subsequent siblings)
14 siblings, 2 replies; 34+ messages in thread
From: Cédric Le Goater @ 2024-03-20 6:49 UTC (permalink / raw)
To: qemu-devel, Stefano Stabellini, Anthony Perard, Paul Durrant,
Paolo Bonzini, Richard Henderson, Eduardo Habkost,
Michael S. Tsirkin, Marcel Apfelbaum, Peter Xu, David Hildenbrand,
Philippe Mathieu-Daudé, Hyman Huang, Fabiano Rosas
Cc: Alex Williamson, Avihai Horon, Markus Armbruster, Prasad Pandit,
Cédric Le Goater, xen-devel
Now that the log_global*() handlers take an Error** parameter and
return a bool, do the same for memory_global_dirty_log_start() and
memory_global_dirty_log_stop(). The error is reported in the callers
for now and it will be propagated in the call stack in the next
changes.
To be noted a functional change in ram_init_bitmaps(), if the dirty
pages logger fails to start, there is no need to synchronize the dirty
pages bitmaps. colo_incoming_start_dirty_log() could be modified in a
similar way.
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Paul Durrant <paul@xen.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hyman Huang <yong.huang@smartx.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
Changes in v5:
- Removed Yong Huang's R-b
- Made use of ram_bitmaps_destroy() in ram_init_bitmaps() to cleanup
allocated bitmaps
include/exec/memory.h | 5 ++++-
hw/i386/xen/xen-hvm.c | 2 +-
migration/dirtyrate.c | 13 +++++++++++--
migration/ram.c | 23 +++++++++++++++++++++--
system/memory.c | 11 +++++------
5 files changed, 42 insertions(+), 12 deletions(-)
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 5555567bc4c9fdb53e8f63487f1400980275687d..c129ee6db7162504bd72d4cfc69b5affb2cd87e8 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -2570,8 +2570,11 @@ void memory_listener_unregister(MemoryListener *listener);
* memory_global_dirty_log_start: begin dirty logging for all regions
*
* @flags: purpose of starting dirty log, migration or dirty rate
+ * @errp: pointer to Error*, to store an error if it happens.
+ *
+ * Return: true on success, else false setting @errp with error.
*/
-void memory_global_dirty_log_start(unsigned int flags);
+bool memory_global_dirty_log_start(unsigned int flags, Error **errp);
/**
* memory_global_dirty_log_stop: end dirty logging for all regions
diff --git a/hw/i386/xen/xen-hvm.c b/hw/i386/xen/xen-hvm.c
index f6e9a1bc86491783077b5cb5aafdb19ab294e392..006d219ad52d739cc406ad5f8082ca82c16c61cc 100644
--- a/hw/i386/xen/xen-hvm.c
+++ b/hw/i386/xen/xen-hvm.c
@@ -669,7 +669,7 @@ void xen_hvm_modified_memory(ram_addr_t start, ram_addr_t length)
void qmp_xen_set_global_dirty_log(bool enable, Error **errp)
{
if (enable) {
- memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION);
+ memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION, errp);
} else {
memory_global_dirty_log_stop(GLOBAL_DIRTY_MIGRATION);
}
diff --git a/migration/dirtyrate.c b/migration/dirtyrate.c
index 1d2e85746fb7b10eb7f149976970f9a92125af8a..d02d70b7b4b86a29d4d5540ded416543536d8f98 100644
--- a/migration/dirtyrate.c
+++ b/migration/dirtyrate.c
@@ -90,9 +90,15 @@ static int64_t do_calculate_dirtyrate(DirtyPageRecord dirty_pages,
void global_dirty_log_change(unsigned int flag, bool start)
{
+ Error *local_err = NULL;
+ bool ret;
+
bql_lock();
if (start) {
- memory_global_dirty_log_start(flag);
+ ret = memory_global_dirty_log_start(flag, &local_err);
+ if (!ret) {
+ error_report_err(local_err);
+ }
} else {
memory_global_dirty_log_stop(flag);
}
@@ -608,9 +614,12 @@ static void calculate_dirtyrate_dirty_bitmap(struct DirtyRateConfig config)
{
int64_t start_time;
DirtyPageRecord dirty_pages;
+ Error *local_err = NULL;
bql_lock();
- memory_global_dirty_log_start(GLOBAL_DIRTY_DIRTY_RATE);
+ if (!memory_global_dirty_log_start(GLOBAL_DIRTY_DIRTY_RATE, &local_err)) {
+ error_report_err(local_err);
+ }
/*
* 1'round of log sync may return all 1 bits with
diff --git a/migration/ram.c b/migration/ram.c
index f0bd71438a4f7212118593b51648b645737933d4..bade3e9281ae839578033524b800dcf3c6f486dc 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2862,18 +2862,32 @@ static void migration_bitmap_clear_discarded_pages(RAMState *rs)
static void ram_init_bitmaps(RAMState *rs)
{
+ Error *local_err = NULL;
+ bool ret = true;
+
qemu_mutex_lock_ramlist();
WITH_RCU_READ_LOCK_GUARD() {
ram_list_init_bitmaps();
/* We don't use dirty log with background snapshots */
if (!migrate_background_snapshot()) {
- memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION);
+ ret = memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION,
+ &local_err);
+ if (!ret) {
+ error_report_err(local_err);
+ goto out_unlock;
+ }
migration_bitmap_sync_precopy(rs, false);
}
}
+out_unlock:
qemu_mutex_unlock_ramlist();
+ if (!ret) {
+ ram_bitmaps_destroy();
+ return;
+ }
+
/*
* After an eventual first bitmap sync, fixup the initial bitmap
* containing all 1s to exclude any discarded pages from migration.
@@ -3665,6 +3679,8 @@ int colo_init_ram_cache(void)
void colo_incoming_start_dirty_log(void)
{
RAMBlock *block = NULL;
+ Error *local_err = NULL;
+
/* For memory_global_dirty_log_start below. */
bql_lock();
qemu_mutex_lock_ramlist();
@@ -3676,7 +3692,10 @@ void colo_incoming_start_dirty_log(void)
/* Discard this dirty bitmap record */
bitmap_zero(block->bmap, block->max_length >> TARGET_PAGE_BITS);
}
- memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION);
+ if (!memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION,
+ &local_err)) {
+ error_report_err(local_err);
+ }
}
ram_state->migration_dirty_pages = 0;
qemu_mutex_unlock_ramlist();
diff --git a/system/memory.c b/system/memory.c
index ca4d91484fb3d06f4b902486fea49dba86dc141b..d7ca1de994f6c3d08820eae4f0b1922db0030831 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -2937,10 +2937,9 @@ err:
return false;
}
-void memory_global_dirty_log_start(unsigned int flags)
+bool memory_global_dirty_log_start(unsigned int flags, Error **errp)
{
unsigned int old_flags;
- Error *local_err = NULL;
assert(flags && !(flags & (~GLOBAL_DIRTY_MASK)));
@@ -2952,7 +2951,7 @@ void memory_global_dirty_log_start(unsigned int flags)
flags &= ~global_dirty_tracking;
if (!flags) {
- return;
+ return true;
}
old_flags = global_dirty_tracking;
@@ -2960,17 +2959,17 @@ void memory_global_dirty_log_start(unsigned int flags)
trace_global_dirty_changed(global_dirty_tracking);
if (!old_flags) {
- if (!memory_global_dirty_log_do_start(&local_err)) {
+ if (!memory_global_dirty_log_do_start(errp)) {
global_dirty_tracking &= ~flags;
trace_global_dirty_changed(global_dirty_tracking);
- error_report_err(local_err);
- return;
+ return false;
}
memory_region_transaction_begin();
memory_region_update_pending = true;
memory_region_transaction_commit();
}
+ return true;
}
static void memory_global_dirty_log_do_stop(unsigned int flags)
--
2.44.0
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH for-9.1 v5 12/14] migration: Add Error** argument to ram_state_init()
2024-03-20 6:48 [PATCH for-9.1 v5 00/14] migration: Improve error reporting Cédric Le Goater
` (10 preceding siblings ...)
2024-03-20 6:49 ` [PATCH for-9.1 v5 11/14] memory: Add Error** argument to the global_dirty_log routines Cédric Le Goater
@ 2024-03-20 6:49 ` Cédric Le Goater
2024-03-20 14:59 ` Fabiano Rosas
2024-03-20 6:49 ` [PATCH for-9.1 v5 13/14] migration: Add Error** argument to xbzrle_init() Cédric Le Goater
` (2 subsequent siblings)
14 siblings, 1 reply; 34+ messages in thread
From: Cédric Le Goater @ 2024-03-20 6:49 UTC (permalink / raw)
To: qemu-devel, Peter Xu, Fabiano Rosas
Cc: Alex Williamson, Avihai Horon, Philippe Mathieu-Daudé,
Markus Armbruster, Prasad Pandit, Cédric Le Goater
Since the return value not exploited, follow the recommendations of
qapi/error.h and change it to a bool
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
migration/ram.c | 19 +++++++++++++------
1 file changed, 13 insertions(+), 6 deletions(-)
diff --git a/migration/ram.c b/migration/ram.c
index bade3e9281ae839578033524b800dcf3c6f486dc..26ce11a3379056d38062c1fb63166c6e3155bd8c 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2780,13 +2780,13 @@ err_out:
return -ENOMEM;
}
-static int ram_state_init(RAMState **rsp)
+static bool ram_state_init(RAMState **rsp, Error **errp)
{
*rsp = g_try_new0(RAMState, 1);
if (!*rsp) {
- error_report("%s: Init ramstate fail", __func__);
- return -1;
+ error_setg(errp, "%s: Init ramstate fail", __func__);
+ return false;
}
qemu_mutex_init(&(*rsp)->bitmap_mutex);
@@ -2802,7 +2802,7 @@ static int ram_state_init(RAMState **rsp)
(*rsp)->migration_dirty_pages = (*rsp)->ram_bytes_total >> TARGET_PAGE_BITS;
ram_state_reset(*rsp);
- return 0;
+ return true;
}
static void ram_list_init_bitmaps(void)
@@ -2897,7 +2897,10 @@ out_unlock:
static int ram_init_all(RAMState **rsp)
{
- if (ram_state_init(rsp)) {
+ Error *local_err = NULL;
+
+ if (!ram_state_init(rsp, &local_err)) {
+ error_report_err(local_err);
return -1;
}
@@ -3624,7 +3627,11 @@ void ram_handle_zero(void *host, uint64_t size)
static void colo_init_ram_state(void)
{
- ram_state_init(&ram_state);
+ Error *local_err = NULL;
+
+ if (!ram_state_init(&ram_state, &local_err)) {
+ error_report_err(local_err);
+ }
}
/*
--
2.44.0
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH for-9.1 v5 13/14] migration: Add Error** argument to xbzrle_init()
2024-03-20 6:48 [PATCH for-9.1 v5 00/14] migration: Improve error reporting Cédric Le Goater
` (11 preceding siblings ...)
2024-03-20 6:49 ` [PATCH for-9.1 v5 12/14] migration: Add Error** argument to ram_state_init() Cédric Le Goater
@ 2024-03-20 6:49 ` Cédric Le Goater
2024-03-20 15:01 ` Fabiano Rosas
2024-03-20 6:49 ` [PATCH for-9.1 v5 14/14] migration: Modify ram_init_bitmaps() to report dirty tracking errors Cédric Le Goater
2024-03-22 13:42 ` [PATCH for-9.1 v5 00/14] migration: Improve error reporting Peter Xu
14 siblings, 1 reply; 34+ messages in thread
From: Cédric Le Goater @ 2024-03-20 6:49 UTC (permalink / raw)
To: qemu-devel, Peter Xu, Fabiano Rosas
Cc: Alex Williamson, Avihai Horon, Philippe Mathieu-Daudé,
Markus Armbruster, Prasad Pandit, Cédric Le Goater
Since the return value (-ENOMEM) is not exploited, follow the
recommendations of qapi/error.h and change it to a bool
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
migration/ram.c | 22 ++++++++++------------
1 file changed, 10 insertions(+), 12 deletions(-)
diff --git a/migration/ram.c b/migration/ram.c
index 26ce11a3379056d38062c1fb63166c6e3155bd8c..70797ef5d80c6ccf61fee987bbe3969041664c69 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2727,44 +2727,41 @@ int ram_discard_range(const char *rbname, uint64_t start, size_t length)
* For every allocation, we will try not to crash the VM if the
* allocation failed.
*/
-static int xbzrle_init(void)
+static bool xbzrle_init(Error **errp)
{
- Error *local_err = NULL;
-
if (!migrate_xbzrle()) {
- return 0;
+ return true;
}
XBZRLE_cache_lock();
XBZRLE.zero_target_page = g_try_malloc0(TARGET_PAGE_SIZE);
if (!XBZRLE.zero_target_page) {
- error_report("%s: Error allocating zero page", __func__);
+ error_setg(errp, "%s: Error allocating zero page", __func__);
goto err_out;
}
XBZRLE.cache = cache_init(migrate_xbzrle_cache_size(),
- TARGET_PAGE_SIZE, &local_err);
+ TARGET_PAGE_SIZE, errp);
if (!XBZRLE.cache) {
- error_report_err(local_err);
goto free_zero_page;
}
XBZRLE.encoded_buf = g_try_malloc0(TARGET_PAGE_SIZE);
if (!XBZRLE.encoded_buf) {
- error_report("%s: Error allocating encoded_buf", __func__);
+ error_setg(errp, "%s: Error allocating encoded_buf", __func__);
goto free_cache;
}
XBZRLE.current_buf = g_try_malloc(TARGET_PAGE_SIZE);
if (!XBZRLE.current_buf) {
- error_report("%s: Error allocating current_buf", __func__);
+ error_setg(errp, "%s: Error allocating current_buf", __func__);
goto free_encoded_buf;
}
/* We are all good */
XBZRLE_cache_unlock();
- return 0;
+ return true;
free_encoded_buf:
g_free(XBZRLE.encoded_buf);
@@ -2777,7 +2774,7 @@ free_zero_page:
XBZRLE.zero_target_page = NULL;
err_out:
XBZRLE_cache_unlock();
- return -ENOMEM;
+ return false;
}
static bool ram_state_init(RAMState **rsp, Error **errp)
@@ -2904,7 +2901,8 @@ static int ram_init_all(RAMState **rsp)
return -1;
}
- if (xbzrle_init()) {
+ if (!xbzrle_init(&local_err)) {
+ error_report_err(local_err);
ram_state_cleanup(rsp);
return -1;
}
--
2.44.0
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH for-9.1 v5 14/14] migration: Modify ram_init_bitmaps() to report dirty tracking errors
2024-03-20 6:48 [PATCH for-9.1 v5 00/14] migration: Improve error reporting Cédric Le Goater
` (12 preceding siblings ...)
2024-03-20 6:49 ` [PATCH for-9.1 v5 13/14] migration: Add Error** argument to xbzrle_init() Cédric Le Goater
@ 2024-03-20 6:49 ` Cédric Le Goater
2024-03-20 15:05 ` Fabiano Rosas
2024-03-22 13:42 ` [PATCH for-9.1 v5 00/14] migration: Improve error reporting Peter Xu
14 siblings, 1 reply; 34+ messages in thread
From: Cédric Le Goater @ 2024-03-20 6:49 UTC (permalink / raw)
To: qemu-devel, Peter Xu, Fabiano Rosas
Cc: Alex Williamson, Avihai Horon, Philippe Mathieu-Daudé,
Markus Armbruster, Prasad Pandit, Cédric Le Goater
The .save_setup() handler has now an Error** argument that we can use
to propagate errors reported by the .log_global_start() handler. Do
that for the RAM. The caller qemu_savevm_state_setup() will store the
error under the migration stream for later detection in the migration
sequence.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
Changes in v5:
- Took into account changes of ram_state_init() and xbzrle_init() to
propagate the error.
migration/ram.c | 27 +++++++++++----------------
1 file changed, 11 insertions(+), 16 deletions(-)
diff --git a/migration/ram.c b/migration/ram.c
index 70797ef5d80c6ccf61fee987bbe3969041664c69..daffcd82d4f15a2defc66059e967092ebc3ec055 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2857,9 +2857,8 @@ static void migration_bitmap_clear_discarded_pages(RAMState *rs)
}
}
-static void ram_init_bitmaps(RAMState *rs)
+static bool ram_init_bitmaps(RAMState *rs, Error **errp)
{
- Error *local_err = NULL;
bool ret = true;
qemu_mutex_lock_ramlist();
@@ -2868,10 +2867,8 @@ static void ram_init_bitmaps(RAMState *rs)
ram_list_init_bitmaps();
/* We don't use dirty log with background snapshots */
if (!migrate_background_snapshot()) {
- ret = memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION,
- &local_err);
+ ret = memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION, errp);
if (!ret) {
- error_report_err(local_err);
goto out_unlock;
}
migration_bitmap_sync_precopy(rs, false);
@@ -2882,7 +2879,7 @@ out_unlock:
if (!ret) {
ram_bitmaps_destroy();
- return;
+ return false;
}
/*
@@ -2890,24 +2887,23 @@ out_unlock:
* containing all 1s to exclude any discarded pages from migration.
*/
migration_bitmap_clear_discarded_pages(rs);
+ return true;
}
-static int ram_init_all(RAMState **rsp)
+static int ram_init_all(RAMState **rsp, Error **errp)
{
- Error *local_err = NULL;
-
- if (!ram_state_init(rsp, &local_err)) {
- error_report_err(local_err);
+ if (!ram_state_init(rsp, errp)) {
return -1;
}
- if (!xbzrle_init(&local_err)) {
- error_report_err(local_err);
+ if (!xbzrle_init(errp)) {
ram_state_cleanup(rsp);
return -1;
}
- ram_init_bitmaps(*rsp);
+ if (!ram_init_bitmaps(*rsp, errp)) {
+ return -1;
+ }
return 0;
}
@@ -3104,8 +3100,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque, Error **errp)
/* migration has already setup the bitmap, reuse it. */
if (!migration_in_colo_state()) {
- if (ram_init_all(rsp) != 0) {
- error_setg(errp, "%s: failed to setup RAM for migration", __func__);
+ if (ram_init_all(rsp, errp) != 0) {
compress_threads_save_cleanup();
return -1;
}
--
2.44.0
^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [PATCH for-9.1 v5 08/14] migration: Add Error** argument to .load_setup() handler
2024-03-20 6:49 ` [PATCH for-9.1 v5 08/14] migration: Add Error** argument to .load_setup() handler Cédric Le Goater
@ 2024-03-20 8:02 ` Markus Armbruster
2024-03-20 8:41 ` Cédric Le Goater
0 siblings, 1 reply; 34+ messages in thread
From: Markus Armbruster @ 2024-03-20 8:02 UTC (permalink / raw)
To: Cédric Le Goater
Cc: qemu-devel, Alex Williamson, Peter Xu, Fabiano Rosas,
Avihai Horon, Philippe Mathieu-Daudé, Markus Armbruster,
Prasad Pandit
Cédric Le Goater <clg@redhat.com> writes:
> This will be useful to report errors at a higher level, mostly in VFIO
> today.
>
> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> Reviewed-by: Peter Xu <peterx@redhat.com>
> Signed-off-by: Cédric Le Goater <clg@redhat.com>
> ---
[...]
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 535ad5a32d67057dd172ce25d561a66a07172e97..8f42999a15d1685957de9ed517d6bc9ba49c3f11 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -2747,8 +2747,9 @@ static void qemu_loadvm_state_switchover_ack_needed(MigrationIncomingState *mis)
> trace_loadvm_state_switchover_ack_needed(mis->switchover_ack_pending_num);
> }
>
> -static int qemu_loadvm_state_setup(QEMUFile *f)
> +static int qemu_loadvm_state_setup(QEMUFile *f, Error **errp)
> {
> + ERRP_GUARD(); /* error_prepend use */
The comment is correct, but prone to go stale. No other use of
ERRP_GUARD() is commented. Suggest to drop it.
> SaveStateEntry *se;
> int ret;
>
> @@ -2763,10 +2764,11 @@ static int qemu_loadvm_state_setup(QEMUFile *f)
> }
> }
>
> - ret = se->ops->load_setup(f, se->opaque);
> + ret = se->ops->load_setup(f, se->opaque, errp);
> if (ret < 0) {
> + error_prepend(errp, "Load state of device %s failed: ",
> + se->idstr);
> qemu_file_set_error(f, ret);
> - error_report("Load state of device %s failed", se->idstr);
> return ret;
> }
> }
> @@ -2947,7 +2949,8 @@ int qemu_loadvm_state(QEMUFile *f)
> return ret;
> }
>
> - if (qemu_loadvm_state_setup(f) != 0) {
> + if (qemu_loadvm_state_setup(f, &local_err) != 0) {
> + error_report_err(local_err);
> return -EINVAL;
> }
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH for-9.1 v5 08/14] migration: Add Error** argument to .load_setup() handler
2024-03-20 8:02 ` Markus Armbruster
@ 2024-03-20 8:41 ` Cédric Le Goater
0 siblings, 0 replies; 34+ messages in thread
From: Cédric Le Goater @ 2024-03-20 8:41 UTC (permalink / raw)
To: Markus Armbruster
Cc: qemu-devel, Alex Williamson, Peter Xu, Fabiano Rosas,
Avihai Horon, Philippe Mathieu-Daudé, Prasad Pandit
On 3/20/24 09:02, Markus Armbruster wrote:
> Cédric Le Goater <clg@redhat.com> writes:
>
>> This will be useful to report errors at a higher level, mostly in VFIO
>> today.
>>
>> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
>> Reviewed-by: Peter Xu <peterx@redhat.com>
>> Signed-off-by: Cédric Le Goater <clg@redhat.com>
>> ---
>
> [...]
>
>> diff --git a/migration/savevm.c b/migration/savevm.c
>> index 535ad5a32d67057dd172ce25d561a66a07172e97..8f42999a15d1685957de9ed517d6bc9ba49c3f11 100644
>> --- a/migration/savevm.c
>> +++ b/migration/savevm.c
>> @@ -2747,8 +2747,9 @@ static void qemu_loadvm_state_switchover_ack_needed(MigrationIncomingState *mis)
>> trace_loadvm_state_switchover_ack_needed(mis->switchover_ack_pending_num);
>> }
>>
>> -static int qemu_loadvm_state_setup(QEMUFile *f)
>> +static int qemu_loadvm_state_setup(QEMUFile *f, Error **errp)
>> {
>> + ERRP_GUARD(); /* error_prepend use */
>
> The comment is correct, but prone to go stale. No other use of
> ERRP_GUARD() is commented. Suggest to drop it.
OK. I found it interesting, for me at least. Will drop in the VFIO
patches also.
Thanks,
C.
>
>> SaveStateEntry *se;
>> int ret;
>>
>> @@ -2763,10 +2764,11 @@ static int qemu_loadvm_state_setup(QEMUFile *f)
>> }
>> }
>>
>> - ret = se->ops->load_setup(f, se->opaque);
>> + ret = se->ops->load_setup(f, se->opaque, errp);
>> if (ret < 0) {
>> + error_prepend(errp, "Load state of device %s failed: ",
>> + se->idstr);
>> qemu_file_set_error(f, ret);
>> - error_report("Load state of device %s failed", se->idstr);
>> return ret;
>> }
>> }
>> @@ -2947,7 +2949,8 @@ int qemu_loadvm_state(QEMUFile *f)
>> return ret;
>> }
>>
>> - if (qemu_loadvm_state_setup(f) != 0) {
>> + if (qemu_loadvm_state_setup(f, &local_err) != 0) {
>> + error_report_err(local_err);
>> return -EINVAL;
>> }
>
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH for-9.1 v5 06/14] migration: Add Error** argument to qemu_savevm_state_setup()
2024-03-20 6:49 ` [PATCH for-9.1 v5 06/14] migration: Add Error** argument to qemu_savevm_state_setup() Cédric Le Goater
@ 2024-03-20 14:29 ` Peter Xu
0 siblings, 0 replies; 34+ messages in thread
From: Peter Xu @ 2024-03-20 14:29 UTC (permalink / raw)
To: Cédric Le Goater
Cc: qemu-devel, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster, Prasad Pandit
On Wed, Mar 20, 2024 at 07:49:02AM +0100, Cédric Le Goater wrote:
> This prepares ground for the changes coming next which add an Error**
> argument to the .save_setup() handler. Callers of qemu_savevm_state_setup()
> now handle the error and fail earlier setting the migration state from
> MIGRATION_STATUS_SETUP to MIGRATION_STATUS_FAILED.
>
> In qemu_savevm_state(), move the cleanup to preserve the error
> reported by .save_setup() handlers.
>
> Since the previous behavior was to ignore errors at this step of
> migration, this change should be examined closely to check that
> cleanups are still correctly done.
>
> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH for-9.1 v5 09/14] memory: Add Error** argument to .log_global_start() handler
2024-03-20 6:49 ` [PATCH for-9.1 v5 09/14] memory: Add Error** argument to .log_global_start() handler Cédric Le Goater
@ 2024-03-20 14:42 ` Peter Xu
2024-03-20 16:15 ` Cédric Le Goater
0 siblings, 1 reply; 34+ messages in thread
From: Peter Xu @ 2024-03-20 14:42 UTC (permalink / raw)
To: Cédric Le Goater
Cc: qemu-devel, Stefano Stabellini, Anthony Perard, Paul Durrant,
Paolo Bonzini, Richard Henderson, Eduardo Habkost,
Michael S. Tsirkin, Marcel Apfelbaum, Alex Williamson,
David Hildenbrand, Philippe Mathieu-Daudé, Fabiano Rosas,
Avihai Horon, Markus Armbruster, Prasad Pandit, xen-devel
On Wed, Mar 20, 2024 at 07:49:05AM +0100, Cédric Le Goater wrote:
> Modify all .log_global_start() handlers to take an Error** parameter
> and return a bool. Adapt memory_global_dirty_log_start() to interrupt
> on the first error the loop on handlers. In such case, a rollback is
> performed to stop dirty logging on all listeners where it was
> previously enabled.
>
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Anthony Perard <anthony.perard@citrix.com>
> Cc: Paul Durrant <paul@xen.org>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: David Hildenbrand <david@redhat.com>
> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Still one comment below:
> @@ -3014,8 +3044,11 @@ static void listener_add_address_space(MemoryListener *listener,
> listener->begin(listener);
> }
> if (global_dirty_tracking) {
> + /*
> + * Migration has already started. Assert on any error.
If you won't mind, I can change this to:
/*
* Currently only VFIO can fail log_global_start(), and it's not allowed
* to hotplug a VFIO device during migration, so this should never fail
* when invoked. If it can start to fail in the future, we need to be
* able to fail the whole listener_add_address_space() and its callers.
*/
Thanks,
> + */
> if (listener->log_global_start) {
> - listener->log_global_start(listener);
> + listener->log_global_start(listener, &error_abort);
> }
> }
>
> --
> 2.44.0
>
--
Peter Xu
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH for-9.1 v5 10/14] migration: Introduce ram_bitmaps_destroy()
2024-03-20 6:49 ` [PATCH for-9.1 v5 10/14] migration: Introduce ram_bitmaps_destroy() Cédric Le Goater
@ 2024-03-20 14:43 ` Peter Xu
2024-03-20 14:49 ` Fabiano Rosas
1 sibling, 0 replies; 34+ messages in thread
From: Peter Xu @ 2024-03-20 14:43 UTC (permalink / raw)
To: Cédric Le Goater
Cc: qemu-devel, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster, Prasad Pandit
On Wed, Mar 20, 2024 at 07:49:06AM +0100, Cédric Le Goater wrote:
> We will use it in ram_init_bitmaps() to clear the allocated bitmaps when
> support for error reporting is added to memory_global_dirty_log_start().
>
> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH for-9.1 v5 10/14] migration: Introduce ram_bitmaps_destroy()
2024-03-20 6:49 ` [PATCH for-9.1 v5 10/14] migration: Introduce ram_bitmaps_destroy() Cédric Le Goater
2024-03-20 14:43 ` Peter Xu
@ 2024-03-20 14:49 ` Fabiano Rosas
1 sibling, 0 replies; 34+ messages in thread
From: Fabiano Rosas @ 2024-03-20 14:49 UTC (permalink / raw)
To: Cédric Le Goater, qemu-devel, Peter Xu
Cc: Alex Williamson, Avihai Horon, Philippe Mathieu-Daudé,
Markus Armbruster, Prasad Pandit, Cédric Le Goater
Cédric Le Goater <clg@redhat.com> writes:
> We will use it in ram_init_bitmaps() to clear the allocated bitmaps when
> support for error reporting is added to memory_global_dirty_log_start().
>
> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH for-9.1 v5 11/14] memory: Add Error** argument to the global_dirty_log routines
2024-03-20 6:49 ` [PATCH for-9.1 v5 11/14] memory: Add Error** argument to the global_dirty_log routines Cédric Le Goater
@ 2024-03-20 14:53 ` Fabiano Rosas
2024-03-20 15:18 ` Peter Xu
1 sibling, 0 replies; 34+ messages in thread
From: Fabiano Rosas @ 2024-03-20 14:53 UTC (permalink / raw)
To: Cédric Le Goater, qemu-devel, Stefano Stabellini,
Anthony Perard, Paul Durrant, Paolo Bonzini, Richard Henderson,
Eduardo Habkost, Michael S. Tsirkin, Marcel Apfelbaum, Peter Xu,
David Hildenbrand, Philippe Mathieu-Daudé, Hyman Huang
Cc: Alex Williamson, Avihai Horon, Markus Armbruster, Prasad Pandit,
Cédric Le Goater, xen-devel
Cédric Le Goater <clg@redhat.com> writes:
> Now that the log_global*() handlers take an Error** parameter and
> return a bool, do the same for memory_global_dirty_log_start() and
> memory_global_dirty_log_stop(). The error is reported in the callers
> for now and it will be propagated in the call stack in the next
> changes.
>
> To be noted a functional change in ram_init_bitmaps(), if the dirty
> pages logger fails to start, there is no need to synchronize the dirty
> pages bitmaps. colo_incoming_start_dirty_log() could be modified in a
> similar way.
>
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Anthony Perard <anthony.perard@citrix.com>
> Cc: Paul Durrant <paul@xen.org>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Hyman Huang <yong.huang@smartx.com>
> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH for-9.1 v5 12/14] migration: Add Error** argument to ram_state_init()
2024-03-20 6:49 ` [PATCH for-9.1 v5 12/14] migration: Add Error** argument to ram_state_init() Cédric Le Goater
@ 2024-03-20 14:59 ` Fabiano Rosas
0 siblings, 0 replies; 34+ messages in thread
From: Fabiano Rosas @ 2024-03-20 14:59 UTC (permalink / raw)
To: Cédric Le Goater, qemu-devel, Peter Xu
Cc: Alex Williamson, Avihai Horon, Philippe Mathieu-Daudé,
Markus Armbruster, Prasad Pandit, Cédric Le Goater
Cédric Le Goater <clg@redhat.com> writes:
> Since the return value not exploited, follow the recommendations of
> qapi/error.h and change it to a bool
>
> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH for-9.1 v5 13/14] migration: Add Error** argument to xbzrle_init()
2024-03-20 6:49 ` [PATCH for-9.1 v5 13/14] migration: Add Error** argument to xbzrle_init() Cédric Le Goater
@ 2024-03-20 15:01 ` Fabiano Rosas
0 siblings, 0 replies; 34+ messages in thread
From: Fabiano Rosas @ 2024-03-20 15:01 UTC (permalink / raw)
To: Cédric Le Goater, qemu-devel, Peter Xu
Cc: Alex Williamson, Avihai Horon, Philippe Mathieu-Daudé,
Markus Armbruster, Prasad Pandit, Cédric Le Goater
Cédric Le Goater <clg@redhat.com> writes:
> Since the return value (-ENOMEM) is not exploited, follow the
> recommendations of qapi/error.h and change it to a bool
>
> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH for-9.1 v5 14/14] migration: Modify ram_init_bitmaps() to report dirty tracking errors
2024-03-20 6:49 ` [PATCH for-9.1 v5 14/14] migration: Modify ram_init_bitmaps() to report dirty tracking errors Cédric Le Goater
@ 2024-03-20 15:05 ` Fabiano Rosas
0 siblings, 0 replies; 34+ messages in thread
From: Fabiano Rosas @ 2024-03-20 15:05 UTC (permalink / raw)
To: Cédric Le Goater, qemu-devel, Peter Xu
Cc: Alex Williamson, Avihai Horon, Philippe Mathieu-Daudé,
Markus Armbruster, Prasad Pandit, Cédric Le Goater
Cédric Le Goater <clg@redhat.com> writes:
> The .save_setup() handler has now an Error** argument that we can use
> to propagate errors reported by the .log_global_start() handler. Do
> that for the RAM. The caller qemu_savevm_state_setup() will store the
> error under the migration stream for later detection in the migration
> sequence.
>
> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH for-9.1 v5 11/14] memory: Add Error** argument to the global_dirty_log routines
2024-03-20 6:49 ` [PATCH for-9.1 v5 11/14] memory: Add Error** argument to the global_dirty_log routines Cédric Le Goater
2024-03-20 14:53 ` Fabiano Rosas
@ 2024-03-20 15:18 ` Peter Xu
2024-03-22 1:55 ` Yong Huang
1 sibling, 1 reply; 34+ messages in thread
From: Peter Xu @ 2024-03-20 15:18 UTC (permalink / raw)
To: Cédric Le Goater
Cc: qemu-devel, Stefano Stabellini, Anthony Perard, Paul Durrant,
Paolo Bonzini, Richard Henderson, Eduardo Habkost,
Michael S. Tsirkin, Marcel Apfelbaum, David Hildenbrand,
Philippe Mathieu-Daudé, Hyman Huang, Fabiano Rosas,
Alex Williamson, Avihai Horon, Markus Armbruster, Prasad Pandit,
xen-devel, Hailiang Zhang
On Wed, Mar 20, 2024 at 07:49:07AM +0100, Cédric Le Goater wrote:
> Now that the log_global*() handlers take an Error** parameter and
> return a bool, do the same for memory_global_dirty_log_start() and
> memory_global_dirty_log_stop(). The error is reported in the callers
> for now and it will be propagated in the call stack in the next
> changes.
>
> To be noted a functional change in ram_init_bitmaps(), if the dirty
> pages logger fails to start, there is no need to synchronize the dirty
> pages bitmaps. colo_incoming_start_dirty_log() could be modified in a
> similar way.
>
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Anthony Perard <anthony.perard@citrix.com>
> Cc: Paul Durrant <paul@xen.org>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Hyman Huang <yong.huang@smartx.com>
> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Just to mention that for the rest users (dirtylimit/dirtyrate/colo) the
code still just keeps going even if start logging failed even after this
series applied as a whole. Migration framework handles the failure
gracefully, not yet the rest.
That might be an issue for some, e.g., ideally we should be able to fail a
calc-dirty-rate request, but it's not supported so far. Adding that could
add quite some burden to this series, so maybe that's fine to be done
later. After all, having a VFIO device (that can fail a start_log()), plus
any of those features should be even rarer, I think?
It seems at least memory_global_dirty_log_sync() can be called even without
start logging, so I expect nothing should crash immediately. I spot one in
colo_incoming_start_dirty_log() already of such use. My wild guess is it
relies on all log_sync*() hooks to cope with it, e.g. KVM ioctl() should
fail with -ENENT on most archs I think when it sees dirty log not ever
started.
For those bits, I'll wait and see whether Yong or Hailiang (cced) has any
comments. From generic migration/memory side, nothing I see wrong:
Acked-by: Peter Xu <peterx@redhat.com>
Thanks,
--
Peter Xu
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH for-9.1 v5 09/14] memory: Add Error** argument to .log_global_start() handler
2024-03-20 14:42 ` Peter Xu
@ 2024-03-20 16:15 ` Cédric Le Goater
2024-03-20 17:39 ` Peter Xu
0 siblings, 1 reply; 34+ messages in thread
From: Cédric Le Goater @ 2024-03-20 16:15 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Stefano Stabellini, Anthony Perard, Paul Durrant,
Paolo Bonzini, Richard Henderson, Eduardo Habkost,
Michael S. Tsirkin, Marcel Apfelbaum, Alex Williamson,
David Hildenbrand, Philippe Mathieu-Daudé, Fabiano Rosas,
Avihai Horon, Markus Armbruster, Prasad Pandit, xen-devel
On 3/20/24 15:42, Peter Xu wrote:
> On Wed, Mar 20, 2024 at 07:49:05AM +0100, Cédric Le Goater wrote:
>> Modify all .log_global_start() handlers to take an Error** parameter
>> and return a bool. Adapt memory_global_dirty_log_start() to interrupt
>> on the first error the loop on handlers. In such case, a rollback is
>> performed to stop dirty logging on all listeners where it was
>> previously enabled.
>>
>> Cc: Stefano Stabellini <sstabellini@kernel.org>
>> Cc: Anthony Perard <anthony.perard@citrix.com>
>> Cc: Paul Durrant <paul@xen.org>
>> Cc: "Michael S. Tsirkin" <mst@redhat.com>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Cc: David Hildenbrand <david@redhat.com>
>> Signed-off-by: Cédric Le Goater <clg@redhat.com>
>
> Reviewed-by: Peter Xu <peterx@redhat.com>
>
> Still one comment below:
>
>> @@ -3014,8 +3044,11 @@ static void listener_add_address_space(MemoryListener *listener,
>> listener->begin(listener);
>> }
>> if (global_dirty_tracking) {
>> + /*
>> + * Migration has already started. Assert on any error.
>
> If you won't mind, I can change this to:
>
> /*
> * Currently only VFIO can fail log_global_start(), and it's not allowed
> * to hotplug a VFIO device during migration, so this should never fail
> * when invoked. If it can start to fail in the future, we need to be
> * able to fail the whole listener_add_address_space() and its callers.
> */
Sure, or I will in a v6. Markus had a comment on 8/14.
Thanks,
C.
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH for-9.1 v5 09/14] memory: Add Error** argument to .log_global_start() handler
2024-03-20 16:15 ` Cédric Le Goater
@ 2024-03-20 17:39 ` Peter Xu
0 siblings, 0 replies; 34+ messages in thread
From: Peter Xu @ 2024-03-20 17:39 UTC (permalink / raw)
To: Cédric Le Goater
Cc: qemu-devel, Stefano Stabellini, Anthony Perard, Paul Durrant,
Paolo Bonzini, Richard Henderson, Eduardo Habkost,
Michael S. Tsirkin, Marcel Apfelbaum, Alex Williamson,
David Hildenbrand, Philippe Mathieu-Daudé, Fabiano Rosas,
Avihai Horon, Markus Armbruster, Prasad Pandit, xen-devel
On Wed, Mar 20, 2024 at 05:15:06PM +0100, Cédric Le Goater wrote:
> Sure, or I will in a v6. Markus had a comment on 8/14.
Yeah, I can handle both if they're the only ones. Thanks,
--
Peter Xu
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH for-9.1 v5 11/14] memory: Add Error** argument to the global_dirty_log routines
2024-03-20 15:18 ` Peter Xu
@ 2024-03-22 1:55 ` Yong Huang
2024-03-22 13:41 ` Peter Xu
0 siblings, 1 reply; 34+ messages in thread
From: Yong Huang @ 2024-03-22 1:55 UTC (permalink / raw)
To: Peter Xu
Cc: Cédric Le Goater, qemu-devel, Stefano Stabellini,
Anthony Perard, Paul Durrant, Paolo Bonzini, Richard Henderson,
Eduardo Habkost, Michael S. Tsirkin, Marcel Apfelbaum,
David Hildenbrand, Philippe Mathieu-Daudé, Fabiano Rosas,
Alex Williamson, Avihai Horon, Markus Armbruster, Prasad Pandit,
xen-devel, Hailiang Zhang
[-- Attachment #1: Type: text/plain, Size: 2558 bytes --]
On Wed, Mar 20, 2024 at 11:19 PM Peter Xu <peterx@redhat.com> wrote:
> On Wed, Mar 20, 2024 at 07:49:07AM +0100, Cédric Le Goater wrote:
> > Now that the log_global*() handlers take an Error** parameter and
> > return a bool, do the same for memory_global_dirty_log_start() and
> > memory_global_dirty_log_stop(). The error is reported in the callers
> > for now and it will be propagated in the call stack in the next
> > changes.
> >
> > To be noted a functional change in ram_init_bitmaps(), if the dirty
> > pages logger fails to start, there is no need to synchronize the dirty
> > pages bitmaps. colo_incoming_start_dirty_log() could be modified in a
> > similar way.
> >
> > Cc: Stefano Stabellini <sstabellini@kernel.org>
> > Cc: Anthony Perard <anthony.perard@citrix.com>
> > Cc: Paul Durrant <paul@xen.org>
> > Cc: "Michael S. Tsirkin" <mst@redhat.com>
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: David Hildenbrand <david@redhat.com>
> > Cc: Hyman Huang <yong.huang@smartx.com>
> > Signed-off-by: Cédric Le Goater <clg@redhat.com>
>
> Just to mention that for the rest users (dirtylimit/dirtyrate/colo) the
> code still just keeps going even if start logging failed even after this
> series applied as a whole. Migration framework handles the failure
> gracefully, not yet the rest.
>
> That might be an issue for some, e.g., ideally we should be able to fail a
> calc-dirty-rate request, but it's not supported so far. Adding that could
> add quite some burden to this series, so maybe that's fine to be done
>
Indeed, for the GLOBAL_DIRTY_DIRTY_RATE and GLOBAL_DIRTY_LIMIT
dirty tracking, they should handle the failure path of logging start.
The work may be done once the current patchset is merged.
> later. After all, having a VFIO device (that can fail a start_log()), plus
> any of those features should be even rarer, I think?
>
> It seems at least memory_global_dirty_log_sync() can be called even without
> start logging, so I expect nothing should crash immediately. I spot one in
> colo_incoming_start_dirty_log() already of such use. My wild guess is it
> relies on all log_sync*() hooks to cope with it, e.g. KVM ioctl() should
> fail with -ENENT on most archs I think when it sees dirty log not ever
> started.
>
> For those bits, I'll wait and see whether Yong or Hailiang (cced) has any
> comments. From generic migration/memory side, nothing I see wrong:
>
> Acked-by: Peter Xu <peterx@redhat.com>
>
> Thanks,
>
> --
> Peter Xu
>
>
--
Best regards
[-- Attachment #2: Type: text/html, Size: 4477 bytes --]
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH for-9.1 v5 11/14] memory: Add Error** argument to the global_dirty_log routines
2024-03-22 1:55 ` Yong Huang
@ 2024-03-22 13:41 ` Peter Xu
0 siblings, 0 replies; 34+ messages in thread
From: Peter Xu @ 2024-03-22 13:41 UTC (permalink / raw)
To: Yong Huang
Cc: Cédric Le Goater, qemu-devel, Stefano Stabellini,
Anthony Perard, Paul Durrant, Paolo Bonzini, Richard Henderson,
Eduardo Habkost, Michael S. Tsirkin, Marcel Apfelbaum,
David Hildenbrand, Philippe Mathieu-Daudé, Fabiano Rosas,
Alex Williamson, Avihai Horon, Markus Armbruster, Prasad Pandit,
xen-devel, Hailiang Zhang
On Fri, Mar 22, 2024 at 09:55:18AM +0800, Yong Huang wrote:
> Indeed, for the GLOBAL_DIRTY_DIRTY_RATE and GLOBAL_DIRTY_LIMIT
> dirty tracking, they should handle the failure path of logging start.
> The work may be done once the current patchset is merged.
Thanks for confirming this, Yong. I think I'll go ahead and queue it then.
It should be in the 1st migration pull for 9.1. Thanks,
--
Peter Xu
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH for-9.1 v5 00/14] migration: Improve error reporting
2024-03-20 6:48 [PATCH for-9.1 v5 00/14] migration: Improve error reporting Cédric Le Goater
` (13 preceding siblings ...)
2024-03-20 6:49 ` [PATCH for-9.1 v5 14/14] migration: Modify ram_init_bitmaps() to report dirty tracking errors Cédric Le Goater
@ 2024-03-22 13:42 ` Peter Xu
14 siblings, 0 replies; 34+ messages in thread
From: Peter Xu @ 2024-03-22 13:42 UTC (permalink / raw)
To: Cédric Le Goater
Cc: qemu-devel, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster, Prasad Pandit
On Wed, Mar 20, 2024 at 07:48:56AM +0100, Cédric Le Goater wrote:
> Hello,
>
> The motivation behind these changes is to improve error reporting to
> the upper management layer (libvirt) with a more detailed error, this
> to let it decide, depending on the reported error, whether to try
> migration again later. It would be useful in cases where migration
> fails due to lack of HW resources on the host. For instance, some
> adapters can only initiate a limited number of simultaneous dirty
> tracking requests and this imposes a limit on the the number of VMs
> that can be migrated simultaneously.
>
> We are not quite ready for such a mechanism but what we can do first is
> to cleanup the error reporting in the early save_setup sequence. This
> is what the following changes propose, by adding an Error** argument to
> various handlers and propagating it to the core migration subsystem.
>
>
> Patchset is organized as follow :
>
> * [1-4] are prerequisite changes in other components related to the
> migration save_setup() handler. They make sure a failure is not
> returned without setting an error.
>
> s390/stattrib: Add Error** argument to set_migrationmode() handler
> vfio: Always report an error in vfio_save_setup()
> migration: Always report an error in block_save_setup()
> migration: Always report an error in ram_save_setup()
> migration: Add Error** argument to vmstate_save()
>
> * [5-14] are the core changes in migration and memory components to
> propagate an error reported in a save_setup() handler.
>
> migration: Add Error** argument to qemu_savevm_state_setup()
> migration: Add Error** argument to .save_setup() handler
> migration: Add Error** argument to .load_setup() handler
> memory: Add Error** argument to .log_global_start() handler
> migration: Introduce ram_bitmaps_destroy()
> memory: Add Error** argument to the global_dirty_log routines
> migration: Add Error** argument to ram_state_init()
> migration: Add Error** argument to xbzrle_init()
> migration: Modify ram_init_bitmaps() to report dirty tracking errors
>
> The VFIO changes depend on the above. They are simpler and have been
> reviewed already. I kept them for another series.
queued for 9.1, thanks.
--
Peter Xu
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH for-9.1 v5 07/14] migration: Add Error** argument to .save_setup() handler
2024-03-20 6:49 ` [PATCH for-9.1 v5 07/14] migration: Add Error** argument to .save_setup() handler Cédric Le Goater
@ 2024-03-29 9:32 ` Vladimir Sementsov-Ogievskiy
2024-03-29 10:53 ` Cédric Le Goater
0 siblings, 1 reply; 34+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2024-03-29 9:32 UTC (permalink / raw)
To: Cédric Le Goater, qemu-devel, Nicholas Piggin,
Daniel Henrique Barboza, David Gibson, Harsh Prateek Bora,
Halil Pasic, Christian Borntraeger, Eric Farman, Thomas Huth,
Richard Henderson, David Hildenbrand, Ilya Leoshkevich,
Alex Williamson, Peter Xu, Fabiano Rosas, Stefan Hajnoczi,
Fam Zheng, Eric Blake, John Snow
Cc: Avihai Horon, Philippe Mathieu-Daudé, Markus Armbruster,
Prasad Pandit, qemu-ppc, qemu-s390x, qemu-block
On 20.03.24 09:49, Cédric Le Goater wrote:
> diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
> index 2708abf3d762de774ed294d3fdb8e56690d2974c..542a8c297b329abc30d1b3a205d29340fa59a961 100644
> --- a/migration/block-dirty-bitmap.c
> +++ b/migration/block-dirty-bitmap.c
> @@ -1213,12 +1213,14 @@ fail:
> return ret;
> }
>
> -static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque)
> +static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque, Error **errp)
> {
> DBMSaveState *s = &((DBMState *)opaque)->save;
> SaveBitmapState *dbms = NULL;
>
> if (init_dirty_bitmap_migration(s) < 0) {
> + error_setg(errp,
> + "Failed to initialize dirty tracking bitmap for blocks");
No, that's not about initializing a bitmap. This all is about migration of block-dirty-bitmaps themselves.
So correct would be say "Failed to initialize migration of block dirty bitmaps".
with this, for block dirty bitmap migration:
Acked-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Still, a lot better is add errp to init_dirty_bitmap_migration() and to add_bitmaps_to_list() too: look,
init_dirty_bitmap_migration() fails only if add_bitmaps_to_list() fails
in turn,
add_bitmaps_to_list() have several clear failure points, where it always does error_report (or error_report_err), which would be better to pass-through to the user.
> return -1;
> }
>
> diff --git a/migrat
--
Best regards,
Vladimir
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH for-9.1 v5 07/14] migration: Add Error** argument to .save_setup() handler
2024-03-29 9:32 ` Vladimir Sementsov-Ogievskiy
@ 2024-03-29 10:53 ` Cédric Le Goater
2024-03-29 11:32 ` Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 34+ messages in thread
From: Cédric Le Goater @ 2024-03-29 10:53 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy, qemu-devel, Nicholas Piggin,
Daniel Henrique Barboza, David Gibson, Harsh Prateek Bora,
Halil Pasic, Christian Borntraeger, Eric Farman, Thomas Huth,
Richard Henderson, David Hildenbrand, Ilya Leoshkevich,
Alex Williamson, Peter Xu, Fabiano Rosas, Stefan Hajnoczi,
Fam Zheng, Eric Blake, John Snow
Cc: Avihai Horon, Philippe Mathieu-Daudé, Markus Armbruster,
Prasad Pandit, qemu-ppc, qemu-s390x, qemu-block
Hello Vladimir,
On 3/29/24 10:32, Vladimir Sementsov-Ogievskiy wrote:
> On 20.03.24 09:49, Cédric Le Goater wrote:
>> diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
>> index 2708abf3d762de774ed294d3fdb8e56690d2974c..542a8c297b329abc30d1b3a205d29340fa59a961 100644
>> --- a/migration/block-dirty-bitmap.c
>> +++ b/migration/block-dirty-bitmap.c
>> @@ -1213,12 +1213,14 @@ fail:
>> return ret;
>> }
>> -static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque)
>> +static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque, Error **errp)
>> {
>> DBMSaveState *s = &((DBMState *)opaque)->save;
>> SaveBitmapState *dbms = NULL;
>> if (init_dirty_bitmap_migration(s) < 0) {
>> + error_setg(errp,
>> + "Failed to initialize dirty tracking bitmap for blocks");
>
> No, that's not about initializing a bitmap. This all is about migration of block-dirty-bitmaps themselves.
>
> So correct would be say "Failed to initialize migration of block dirty bitmaps".
>
> with this, for block dirty bitmap migration:
> Acked-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
I had kept your previous R-b.
Should we remove it ? or is it ok if I address your comments below in a
followup patch, in which case the error message above would be removed.
> Still, a lot better is add errp to init_dirty_bitmap_migration() and to add_bitmaps_to_list() too: look,
>
> init_dirty_bitmap_migration() fails only if add_bitmaps_to_list() fails
>
> in turn,
>
> add_bitmaps_to_list() have several clear failure points, where it always does error_report (or error_report_err), which would be better to pass-through to the user.
Good idea. Will do.
Thanks,
C.
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH for-9.1 v5 07/14] migration: Add Error** argument to .save_setup() handler
2024-03-29 10:53 ` Cédric Le Goater
@ 2024-03-29 11:32 ` Vladimir Sementsov-Ogievskiy
0 siblings, 0 replies; 34+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2024-03-29 11:32 UTC (permalink / raw)
To: Cédric Le Goater, qemu-devel, Nicholas Piggin,
Daniel Henrique Barboza, David Gibson, Harsh Prateek Bora,
Halil Pasic, Christian Borntraeger, Eric Farman, Thomas Huth,
Richard Henderson, David Hildenbrand, Ilya Leoshkevich,
Alex Williamson, Peter Xu, Fabiano Rosas, Stefan Hajnoczi,
Fam Zheng, Eric Blake, John Snow
Cc: Avihai Horon, Philippe Mathieu-Daudé, Markus Armbruster,
Prasad Pandit, qemu-ppc, qemu-s390x, qemu-block
On 29.03.24 13:53, Cédric Le Goater wrote:
> Hello Vladimir,
>
> On 3/29/24 10:32, Vladimir Sementsov-Ogievskiy wrote:
>> On 20.03.24 09:49, Cédric Le Goater wrote:
>>> diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
>>> index 2708abf3d762de774ed294d3fdb8e56690d2974c..542a8c297b329abc30d1b3a205d29340fa59a961 100644
>>> --- a/migration/block-dirty-bitmap.c
>>> +++ b/migration/block-dirty-bitmap.c
>>> @@ -1213,12 +1213,14 @@ fail:
>>> return ret;
>>> }
>>> -static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque)
>>> +static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque, Error **errp)
>>> {
>>> DBMSaveState *s = &((DBMState *)opaque)->save;
>>> SaveBitmapState *dbms = NULL;
>>> if (init_dirty_bitmap_migration(s) < 0) {
>>> + error_setg(errp,
>>> + "Failed to initialize dirty tracking bitmap for blocks");
>>
>> No, that's not about initializing a bitmap. This all is about migration of block-dirty-bitmaps themselves.
>>
>> So correct would be say "Failed to initialize migration of block dirty bitmaps".
>>
>> with this, for block dirty bitmap migration:
>> Acked-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>
> I had kept your previous R-b.
Oh, I missed that)
>
> Should we remove it ? or is it ok if I address your comments below in a
> followup patch, in which case the error message above would be removed.
Yes of course, you can keep my old r-b. Followup patch is appreciated
>
>> Still, a lot better is add errp to init_dirty_bitmap_migration() and to add_bitmaps_to_list() too: look,
>>
>> init_dirty_bitmap_migration() fails only if add_bitmaps_to_list() fails
>>
>> in turn,
>>
>> add_bitmaps_to_list() have several clear failure points, where it always does error_report (or error_report_err), which would be better to pass-through to the user.
>
> Good idea. Will do.
>
> Thanks,
>
> C.
>
>
--
Best regards,
Vladimir
^ permalink raw reply [flat|nested] 34+ messages in thread
end of thread, other threads:[~2024-03-29 11:33 UTC | newest]
Thread overview: 34+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-03-20 6:48 [PATCH for-9.1 v5 00/14] migration: Improve error reporting Cédric Le Goater
2024-03-20 6:48 ` [PATCH for-9.1 v5 01/14] s390/stattrib: Add Error** argument to set_migrationmode() handler Cédric Le Goater
2024-03-20 6:48 ` [PATCH for-9.1 v5 02/14] vfio: Always report an error in vfio_save_setup() Cédric Le Goater
2024-03-20 6:48 ` [PATCH for-9.1 v5 03/14] migration: Always report an error in block_save_setup() Cédric Le Goater
2024-03-20 6:49 ` [PATCH for-9.1 v5 04/14] migration: Always report an error in ram_save_setup() Cédric Le Goater
2024-03-20 6:49 ` [PATCH for-9.1 v5 05/14] migration: Add Error** argument to vmstate_save() Cédric Le Goater
2024-03-20 6:49 ` [PATCH for-9.1 v5 06/14] migration: Add Error** argument to qemu_savevm_state_setup() Cédric Le Goater
2024-03-20 14:29 ` Peter Xu
2024-03-20 6:49 ` [PATCH for-9.1 v5 07/14] migration: Add Error** argument to .save_setup() handler Cédric Le Goater
2024-03-29 9:32 ` Vladimir Sementsov-Ogievskiy
2024-03-29 10:53 ` Cédric Le Goater
2024-03-29 11:32 ` Vladimir Sementsov-Ogievskiy
2024-03-20 6:49 ` [PATCH for-9.1 v5 08/14] migration: Add Error** argument to .load_setup() handler Cédric Le Goater
2024-03-20 8:02 ` Markus Armbruster
2024-03-20 8:41 ` Cédric Le Goater
2024-03-20 6:49 ` [PATCH for-9.1 v5 09/14] memory: Add Error** argument to .log_global_start() handler Cédric Le Goater
2024-03-20 14:42 ` Peter Xu
2024-03-20 16:15 ` Cédric Le Goater
2024-03-20 17:39 ` Peter Xu
2024-03-20 6:49 ` [PATCH for-9.1 v5 10/14] migration: Introduce ram_bitmaps_destroy() Cédric Le Goater
2024-03-20 14:43 ` Peter Xu
2024-03-20 14:49 ` Fabiano Rosas
2024-03-20 6:49 ` [PATCH for-9.1 v5 11/14] memory: Add Error** argument to the global_dirty_log routines Cédric Le Goater
2024-03-20 14:53 ` Fabiano Rosas
2024-03-20 15:18 ` Peter Xu
2024-03-22 1:55 ` Yong Huang
2024-03-22 13:41 ` Peter Xu
2024-03-20 6:49 ` [PATCH for-9.1 v5 12/14] migration: Add Error** argument to ram_state_init() Cédric Le Goater
2024-03-20 14:59 ` Fabiano Rosas
2024-03-20 6:49 ` [PATCH for-9.1 v5 13/14] migration: Add Error** argument to xbzrle_init() Cédric Le Goater
2024-03-20 15:01 ` Fabiano Rosas
2024-03-20 6:49 ` [PATCH for-9.1 v5 14/14] migration: Modify ram_init_bitmaps() to report dirty tracking errors Cédric Le Goater
2024-03-20 15:05 ` Fabiano Rosas
2024-03-22 13:42 ` [PATCH for-9.1 v5 00/14] migration: Improve error reporting Peter Xu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).