* [PATCH v3 00/26] migration: Improve error reporting
@ 2024-03-04 12:28 Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 01/26] s390/stattrib: Add Error** argument to set_migrationmode() handler Cédric Le Goater
` (26 more replies)
0 siblings, 27 replies; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-04 12:28 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster,
Cédric Le Goater
Hello,
The motivation behind these changes is to improve error reporting to
the upper management layer (libvirt) with a more detailed error, this
to let it decide, depending on the reported error, whether to try
migration again later. It would be useful in cases where migration
fails due to lack of HW resources on the host. For instance, some
adapters can only initiate a limited number of simultaneous dirty
tracking requests and this imposes a limit on the the number of VMs
that can be migrated simultaneously.
We are not quite ready for such a mechanism but what we can do first is
to cleanup the error reporting in the early save_setup sequence. This
is what the following changes propose, by adding an Error** argument to
various handlers and propagating it to the core migration subsystem.
Thanks,
C.
Changes in v3:
- New changes to make sure an error is always set in case of failure.
This is the reason behing the 5/6 extra patches. (Markus)
- Documentation fixup (Peter + Avihai)
- Set migration state to MIGRATION_STATUS_FAILED always
- Fixed error handling in bg_migration_thread() (Peter)
- Fixed return value of vfio_listener_log_global_start/stop().
Went unnoticed because value is not tested. (Peter)
- Add ERRP_GUARD() when error_prepend is used
- Use error_setg_errno() when possible
Changes in v2:
- Removed v1 patches addressing the return-path thread termination as
they are now superseded by :
https://lore.kernel.org/qemu-devel/20240226203122.22894-1-farosas@suse.de/
- Documentation updates of handlers
- Removed call to PRECOPY_NOTIFY_SETUP notifiers in case of errors
- Modified routines taking an Error** argument to return a bool when
possible and made adjustments in callers.
- new MEMORY_LISTENER_CALL_LOG_GLOBAL macro for .log_global*()
handlers
- Handled SETUP state when migration terminates
- Modified memory_get_xlat_addr() to take an Error** argument
- Various refinements on error handling
Cédric Le Goater (26):
s390/stattrib: Add Error** argument to set_migrationmode() handler
vfio: Always report an error in vfio_save_setup()
migration: Always report an error in block_save_setup()
migration: Always report an error in ram_save_setup()
migration: Add Error** argument to vmstate_save()
migration: Report error when shutdown fails
migration: Remove SaveStateHandler and LoadStateHandler typedefs
migration: Add documentation for SaveVMHandlers
migration: Do not call PRECOPY_NOTIFY_SETUP notifiers in case of error
migration: Move cleanup after after error reporting in
qemu_savevm_state_setup()
migration: Add Error** argument to qemu_savevm_state_setup()
migration: Add Error** argument to .save_setup() handler
migration: Add Error** argument to .load_setup() handler
memory: Add Error** argument to .log_global*() handlers
memory: Add Error** argument to the global_dirty_log routines
migration: Modify ram_init_bitmaps() to report dirty tracking errors
vfio: Add Error** argument to .set_dirty_page_tracking() handler
vfio: Add Error** argument to vfio_devices_dma_logging_start()
vfio: Add Error** argument to vfio_devices_dma_logging_stop()
vfio: Use new Error** argument in vfio_save_setup()
vfio: Add Error** argument to .vfio_save_config() handler
vfio: Reverse test on vfio_get_dirty_bitmap()
memory: Add Error** argument to memory_get_xlat_addr()
vfio: Add Error** argument to .get_dirty_bitmap() handler
vfio: Also trace event failures in vfio_save_complete_precopy()
vfio: Extend vfio_set_migration_error() with Error* argument
include/exec/memory.h | 40 +++-
include/hw/s390x/storage-attributes.h | 2 +-
include/hw/vfio/vfio-common.h | 29 ++-
include/hw/vfio/vfio-container-base.h | 35 +++-
include/migration/register.h | 273 +++++++++++++++++++++++---
include/qemu/typedefs.h | 2 -
migration/savevm.h | 2 +-
hw/i386/xen/xen-hvm.c | 10 +-
hw/ppc/spapr.c | 2 +-
hw/s390x/s390-stattrib-kvm.c | 12 +-
hw/s390x/s390-stattrib.c | 14 +-
hw/vfio/common.c | 162 +++++++++------
hw/vfio/container-base.c | 9 +-
hw/vfio/container.c | 19 +-
hw/vfio/migration.c | 99 ++++++----
hw/vfio/pci.c | 5 +-
hw/virtio/vhost-vdpa.c | 5 +-
hw/virtio/vhost.c | 6 +-
migration/block-dirty-bitmap.c | 4 +-
migration/block.c | 15 +-
migration/dirtyrate.c | 21 +-
migration/migration.c | 27 ++-
migration/qemu-file.c | 5 +-
migration/ram.c | 58 ++++--
migration/savevm.c | 59 +++---
system/memory.c | 95 +++++++--
system/physmem.c | 5 +-
27 files changed, 772 insertions(+), 243 deletions(-)
--
2.44.0
^ permalink raw reply [flat|nested] 50+ messages in thread
* [PATCH v3 01/26] s390/stattrib: Add Error** argument to set_migrationmode() handler
2024-03-04 12:28 [PATCH v3 00/26] migration: Improve error reporting Cédric Le Goater
@ 2024-03-04 12:28 ` Cédric Le Goater
2024-03-04 20:49 ` Fabiano Rosas
2024-03-05 14:54 ` Avihai Horon
2024-03-04 12:28 ` [PATCH v3 02/26] vfio: Always report an error in vfio_save_setup() Cédric Le Goater
` (25 subsequent siblings)
26 siblings, 2 replies; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-04 12:28 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster,
Cédric Le Goater, Halil Pasic, Christian Borntraeger,
Thomas Huth
This will prepare ground for futur changes adding an Error** argument
to the save_setup() handler. We need to make sure that on failure,
set_migrationmode() always sets a new error. See the Rules section in
qapi/error.h.
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
include/hw/s390x/storage-attributes.h | 2 +-
hw/s390x/s390-stattrib-kvm.c | 12 ++++++++++--
hw/s390x/s390-stattrib.c | 14 +++++++++-----
3 files changed, 20 insertions(+), 8 deletions(-)
diff --git a/include/hw/s390x/storage-attributes.h b/include/hw/s390x/storage-attributes.h
index 5239eb538c1b087797867a247abfc14551af6a4d..8921a04d514bf64a3113255ee10ed33fc598ae06 100644
--- a/include/hw/s390x/storage-attributes.h
+++ b/include/hw/s390x/storage-attributes.h
@@ -39,7 +39,7 @@ struct S390StAttribClass {
int (*set_stattr)(S390StAttribState *sa, uint64_t start_gfn,
uint32_t count, uint8_t *values);
void (*synchronize)(S390StAttribState *sa);
- int (*set_migrationmode)(S390StAttribState *sa, bool value);
+ int (*set_migrationmode)(S390StAttribState *sa, bool value, Error **errp);
int (*get_active)(S390StAttribState *sa);
long long (*get_dirtycount)(S390StAttribState *sa);
};
diff --git a/hw/s390x/s390-stattrib-kvm.c b/hw/s390x/s390-stattrib-kvm.c
index 24cd01382e2d74d62c2d7e980eb6aca1077d893d..357cea2c987213b867c81b0e258f7d0c293fe665 100644
--- a/hw/s390x/s390-stattrib-kvm.c
+++ b/hw/s390x/s390-stattrib-kvm.c
@@ -17,6 +17,7 @@
#include "sysemu/kvm.h"
#include "exec/ram_addr.h"
#include "kvm/kvm_s390x.h"
+#include "qapi/error.h"
Object *kvm_s390_stattrib_create(void)
{
@@ -137,14 +138,21 @@ static void kvm_s390_stattrib_synchronize(S390StAttribState *sa)
}
}
-static int kvm_s390_stattrib_set_migrationmode(S390StAttribState *sa, bool val)
+static int kvm_s390_stattrib_set_migrationmode(S390StAttribState *sa, bool val,
+ Error **errp)
{
struct kvm_device_attr attr = {
.group = KVM_S390_VM_MIGRATION,
.attr = val,
.addr = 0,
};
- return kvm_vm_ioctl(kvm_state, KVM_SET_DEVICE_ATTR, &attr);
+ int r;
+
+ r = kvm_vm_ioctl(kvm_state, KVM_SET_DEVICE_ATTR, &attr);
+ if (r) {
+ error_setg_errno(errp, -r, "KVM_S390_SET_CMMA_BITS failed");
+ }
+ return r;
}
static long long kvm_s390_stattrib_get_dirtycount(S390StAttribState *sa)
diff --git a/hw/s390x/s390-stattrib.c b/hw/s390x/s390-stattrib.c
index c483b62a9b5f71772639fc180bdad15ecb6711cb..e99de190332a82363b1388bbc450013149295bc0 100644
--- a/hw/s390x/s390-stattrib.c
+++ b/hw/s390x/s390-stattrib.c
@@ -60,11 +60,12 @@ void hmp_migrationmode(Monitor *mon, const QDict *qdict)
S390StAttribState *sas = s390_get_stattrib_device();
S390StAttribClass *sac = S390_STATTRIB_GET_CLASS(sas);
uint64_t what = qdict_get_int(qdict, "mode");
+ Error *local_err = NULL;
int r;
- r = sac->set_migrationmode(sas, what);
+ r = sac->set_migrationmode(sas, what, &local_err);
if (r < 0) {
- monitor_printf(mon, "Error: %s", strerror(-r));
+ monitor_printf(mon, "Error: %s", error_get_pretty(local_err));
}
}
@@ -170,13 +171,15 @@ static int cmma_save_setup(QEMUFile *f, void *opaque)
{
S390StAttribState *sas = S390_STATTRIB(opaque);
S390StAttribClass *sac = S390_STATTRIB_GET_CLASS(sas);
+ Error *local_err = NULL;
int res;
/*
* Signal that we want to start a migration, thus needing PGSTE dirty
* tracking.
*/
- res = sac->set_migrationmode(sas, 1);
+ res = sac->set_migrationmode(sas, true, &local_err);
if (res) {
+ error_report_err(local_err);
return res;
}
qemu_put_be64(f, STATTR_FLAG_EOS);
@@ -260,7 +263,7 @@ static void cmma_save_cleanup(void *opaque)
{
S390StAttribState *sas = S390_STATTRIB(opaque);
S390StAttribClass *sac = S390_STATTRIB_GET_CLASS(sas);
- sac->set_migrationmode(sas, 0);
+ sac->set_migrationmode(sas, false, NULL);
}
static bool cmma_active(void *opaque)
@@ -293,7 +296,8 @@ static long long qemu_s390_get_dirtycount_stub(S390StAttribState *sa)
{
return 0;
}
-static int qemu_s390_set_migrationmode_stub(S390StAttribState *sa, bool value)
+static int qemu_s390_set_migrationmode_stub(S390StAttribState *sa, bool value,
+ Error **errp)
{
return 0;
}
--
2.44.0
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v3 02/26] vfio: Always report an error in vfio_save_setup()
2024-03-04 12:28 [PATCH v3 00/26] migration: Improve error reporting Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 01/26] s390/stattrib: Add Error** argument to set_migrationmode() handler Cédric Le Goater
@ 2024-03-04 12:28 ` Cédric Le Goater
2024-03-04 20:51 ` Fabiano Rosas
2024-03-06 9:56 ` Avihai Horon
2024-03-04 12:28 ` [PATCH v3 03/26] migration: Always report an error in block_save_setup() Cédric Le Goater
` (24 subsequent siblings)
26 siblings, 2 replies; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-04 12:28 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster,
Cédric Le Goater
This will prepare ground for futur changes adding an Error** argument
to the save_setup() handler. We need to make sure that on failure,
vfio_save_setup() always sets a new error.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
hw/vfio/migration.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 2050ac8897231ff89cc223f0570d5c7a65dede9e..51bea536cc290ba0aa393f78b017b0650e333bff 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -383,6 +383,7 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
VFIODevice *vbasedev = opaque;
VFIOMigration *migration = vbasedev->migration;
uint64_t stop_copy_size = VFIO_MIG_DEFAULT_DATA_BUFFER_SIZE;
+ int ret;
qemu_put_be64(f, VFIO_MIG_FLAG_DEV_SETUP_STATE);
@@ -397,13 +398,13 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
}
if (vfio_precopy_supported(vbasedev)) {
- int ret;
-
switch (migration->device_state) {
case VFIO_DEVICE_STATE_RUNNING:
ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_PRE_COPY,
VFIO_DEVICE_STATE_RUNNING);
if (ret) {
+ error_report("%s: Failed to set new RUNNING state",
+ vbasedev->name);
return ret;
}
@@ -414,6 +415,8 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
/* vfio_save_complete_precopy() will go to STOP_COPY */
break;
default:
+ error_report("%s: Invalid device state %d", vbasedev->name,
+ migration->device_state);
return -EINVAL;
}
}
@@ -422,7 +425,13 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
- return qemu_file_get_error(f);
+ ret = qemu_file_get_error(f);
+ if (ret) {
+ error_report("%s: save setup failed : %s", vbasedev->name,
+ strerror(ret));
+ }
+
+ return ret;
}
static void vfio_save_cleanup(void *opaque)
--
2.44.0
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v3 03/26] migration: Always report an error in block_save_setup()
2024-03-04 12:28 [PATCH v3 00/26] migration: Improve error reporting Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 01/26] s390/stattrib: Add Error** argument to set_migrationmode() handler Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 02/26] vfio: Always report an error in vfio_save_setup() Cédric Le Goater
@ 2024-03-04 12:28 ` Cédric Le Goater
2024-03-04 21:04 ` Fabiano Rosas
2024-03-04 12:28 ` [PATCH v3 04/26] migration: Always report an error in ram_save_setup() Cédric Le Goater
` (23 subsequent siblings)
26 siblings, 1 reply; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-04 12:28 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster,
Cédric Le Goater, Stefan Hajnoczi
This will prepare ground for futur changes adding an Error** argument
to the save_setup() handler. We need to make sure that on failure,
block_save_setup() always sets a new error.
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
migration/block.c | 18 +++++++++++++-----
1 file changed, 13 insertions(+), 5 deletions(-)
diff --git a/migration/block.c b/migration/block.c
index 8c6ebafacc1ffe930d1d4f19d968817b14852c69..06f5857cf049df3261d2cf1d7c3607ab92350ac6 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -367,7 +367,7 @@ static void unset_dirty_tracking(void)
}
}
-static int init_blk_migration(QEMUFile *f)
+static int init_blk_migration(QEMUFile *f, Error **errp)
{
BlockDriverState *bs;
BlkMigDevState *bmds;
@@ -378,7 +378,6 @@ static int init_blk_migration(QEMUFile *f)
BlkMigDevState *bmds;
BlockDriverState *bs;
} *bmds_bs;
- Error *local_err = NULL;
int ret;
GRAPH_RDLOCK_GUARD_MAINLOOP();
@@ -439,9 +438,8 @@ static int init_blk_migration(QEMUFile *f)
bs = bmds_bs[i].bs;
if (bmds) {
- ret = blk_insert_bs(bmds->blk, bs, &local_err);
+ ret = blk_insert_bs(bmds->blk, bs, errp);
if (ret < 0) {
- error_report_err(local_err);
goto out;
}
@@ -711,6 +709,7 @@ static void block_migration_cleanup(void *opaque)
static int block_save_setup(QEMUFile *f, void *opaque)
{
int ret;
+ Error *local_err = NULL;
trace_migration_block_save("setup", block_mig_state.submitted,
block_mig_state.transferred);
@@ -718,18 +717,27 @@ static int block_save_setup(QEMUFile *f, void *opaque)
warn_report("block migration is deprecated;"
" use blockdev-mirror with NBD instead");
- ret = init_blk_migration(f);
+ ret = init_blk_migration(f, &local_err);
if (ret < 0) {
+ error_report_err(local_err);
return ret;
}
/* start track dirty blocks */
ret = set_dirty_tracking();
if (ret) {
+ error_setg_errno(&local_err, -ret,
+ "Failed to start block dirty tracking");
+ error_report_err(local_err);
return ret;
}
ret = flush_blks(f);
+ if (ret) {
+ error_setg_errno(&local_err, -ret, "Flushing block failed");
+ error_report_err(local_err);
+ return ret;
+ }
blk_mig_reset_dirty_cursor();
qemu_put_be64(f, BLK_MIG_FLAG_EOS);
--
2.44.0
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v3 04/26] migration: Always report an error in ram_save_setup()
2024-03-04 12:28 [PATCH v3 00/26] migration: Improve error reporting Cédric Le Goater
` (2 preceding siblings ...)
2024-03-04 12:28 ` [PATCH v3 03/26] migration: Always report an error in block_save_setup() Cédric Le Goater
@ 2024-03-04 12:28 ` Cédric Le Goater
2024-03-04 21:30 ` Fabiano Rosas
2024-03-05 5:29 ` Prasad Pandit
2024-03-04 12:28 ` [PATCH v3 05/26] migration: Add Error** argument to vmstate_save() Cédric Le Goater
` (22 subsequent siblings)
26 siblings, 2 replies; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-04 12:28 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster,
Cédric Le Goater
This will prepare ground for futur changes adding an Error** argument
to the save_setup() handler. We need to make sure that on failure,
ram_save_setup() sets a new error.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
migration/ram.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/migration/ram.c b/migration/ram.c
index 45a00b45edd429ef0568adce09b7459883e00167..dbd04d8ee2167881007c836a6bbc79c1aced72d0 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2937,12 +2937,14 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
int ret;
if (compress_threads_save_setup()) {
+ error_report("%s: failed to start compress threads", __func__);
return -1;
}
/* migration has already setup the bitmap, reuse it. */
if (!migration_in_colo_state()) {
if (ram_init_all(rsp) != 0) {
+ error_report("%s: failed to setup RAM for migration", __func__);
compress_threads_save_cleanup();
return -1;
}
@@ -2969,12 +2971,14 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
ret = rdma_registration_start(f, RAM_CONTROL_SETUP);
if (ret < 0) {
+ error_report("%s: failed to start RDMA registration", __func__);
qemu_file_set_error(f, ret);
return ret;
}
ret = rdma_registration_stop(f, RAM_CONTROL_SETUP);
if (ret < 0) {
+ error_report("%s: failed to stop RDMA registration", __func__);
qemu_file_set_error(f, ret);
return ret;
}
@@ -2986,6 +2990,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
ret = multifd_send_sync_main();
bql_lock();
if (ret < 0) {
+ error_report("%s: multifd synchronization failed", __func__);
return ret;
}
@@ -2994,7 +2999,11 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
}
qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
- return qemu_fflush(f);
+ ret = qemu_fflush(f);
+ if (ret) {
+ error_report("%s failed : %s", __func__, strerror(ret));
+ }
+ return ret;
}
/**
--
2.44.0
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v3 05/26] migration: Add Error** argument to vmstate_save()
2024-03-04 12:28 [PATCH v3 00/26] migration: Improve error reporting Cédric Le Goater
` (3 preceding siblings ...)
2024-03-04 12:28 ` [PATCH v3 04/26] migration: Always report an error in ram_save_setup() Cédric Le Goater
@ 2024-03-04 12:28 ` Cédric Le Goater
2024-03-05 7:44 ` Prasad Pandit
2024-03-04 12:28 ` [PATCH v3 06/26] migration: Report error when shutdown fails Cédric Le Goater
` (21 subsequent siblings)
26 siblings, 1 reply; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-04 12:28 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster,
Cédric Le Goater
This will prepare ground for futur changes adding an Error** argument
to qemu_savevm_state_setup().
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
migration/savevm.c | 26 ++++++++++++++++----------
1 file changed, 16 insertions(+), 10 deletions(-)
diff --git a/migration/savevm.c b/migration/savevm.c
index d612c8a9020b204d5d078d5df85f0e6449c27645..c3642fac9dad3e314fe3fd5058889db1d2cecd47 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1008,11 +1008,10 @@ static void save_section_footer(QEMUFile *f, SaveStateEntry *se)
}
}
-static int vmstate_save(QEMUFile *f, SaveStateEntry *se, JSONWriter *vmdesc)
+static int vmstate_save(QEMUFile *f, SaveStateEntry *se, JSONWriter *vmdesc,
+ Error **errp)
{
int ret;
- Error *local_err = NULL;
- MigrationState *s = migrate_get_current();
if ((!se->ops || !se->ops->save_state) && !se->vmsd) {
return 0;
@@ -1034,10 +1033,9 @@ static int vmstate_save(QEMUFile *f, SaveStateEntry *se, JSONWriter *vmdesc)
if (!se->vmsd) {
vmstate_save_old_style(f, se, vmdesc);
} else {
- ret = vmstate_save_state_with_err(f, se->vmsd, se->opaque, vmdesc, &local_err);
+ ret = vmstate_save_state_with_err(f, se->vmsd, se->opaque, vmdesc,
+ errp);
if (ret) {
- migrate_set_error(s, local_err);
- error_report_err(local_err);
return ret;
}
}
@@ -1324,8 +1322,10 @@ void qemu_savevm_state_setup(QEMUFile *f)
trace_savevm_state_setup();
QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
if (se->vmsd && se->vmsd->early_setup) {
- ret = vmstate_save(f, se, ms->vmdesc);
+ ret = vmstate_save(f, se, ms->vmdesc, &local_err);
if (ret) {
+ migrate_set_error(ms, local_err);
+ error_report_err(local_err);
qemu_file_set_error(f, ret);
break;
}
@@ -1540,6 +1540,7 @@ int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f,
JSONWriter *vmdesc = ms->vmdesc;
int vmdesc_len;
SaveStateEntry *se;
+ Error *local_err = NULL;
int ret;
QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
@@ -1550,8 +1551,10 @@ int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f,
start_ts_each = qemu_clock_get_us(QEMU_CLOCK_REALTIME);
- ret = vmstate_save(f, se, vmdesc);
+ ret = vmstate_save(f, se, vmdesc, &local_err);
if (ret) {
+ migrate_set_error(ms, local_err);
+ error_report_err(local_err);
qemu_file_set_error(f, ret);
return ret;
}
@@ -1566,7 +1569,6 @@ int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f,
* bdrv_activate_all() on the other end won't fail. */
ret = bdrv_inactivate_all();
if (ret) {
- Error *local_err = NULL;
error_setg(&local_err, "%s: bdrv_inactivate_all() failed (%d)",
__func__, ret);
migrate_set_error(ms, local_err);
@@ -1762,6 +1764,8 @@ void qemu_savevm_live_state(QEMUFile *f)
int qemu_save_device_state(QEMUFile *f)
{
+ MigrationState *ms = migrate_get_current();
+ Error *local_err = NULL;
SaveStateEntry *se;
if (!migration_in_colo_state()) {
@@ -1776,8 +1780,10 @@ int qemu_save_device_state(QEMUFile *f)
if (se->is_ram) {
continue;
}
- ret = vmstate_save(f, se, NULL);
+ ret = vmstate_save(f, se, NULL, &local_err);
if (ret) {
+ migrate_set_error(ms, local_err);
+ error_report_err(local_err);
return ret;
}
}
--
2.44.0
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v3 06/26] migration: Report error when shutdown fails
2024-03-04 12:28 [PATCH v3 00/26] migration: Improve error reporting Cédric Le Goater
` (4 preceding siblings ...)
2024-03-04 12:28 ` [PATCH v3 05/26] migration: Add Error** argument to vmstate_save() Cédric Le Goater
@ 2024-03-04 12:28 ` Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 07/26] migration: Remove SaveStateHandler and LoadStateHandler typedefs Cédric Le Goater
` (20 subsequent siblings)
26 siblings, 0 replies; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-04 12:28 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster,
Cédric Le Goater
This will help detect issues regarding I/O channels usage.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
migration/qemu-file.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 94231ff2955c80b3d0fab11a40510d34c334a826..b69e0c62e2fcf21d346a3687df7eebee23791fdc 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -62,6 +62,8 @@ struct QEMUFile {
*/
int qemu_file_shutdown(QEMUFile *f)
{
+ Error *err = NULL;
+
/*
* We must set qemufile error before the real shutdown(), otherwise
* there can be a race window where we thought IO all went though
@@ -90,7 +92,8 @@ int qemu_file_shutdown(QEMUFile *f)
return -ENOSYS;
}
- if (qio_channel_shutdown(f->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL) < 0) {
+ if (qio_channel_shutdown(f->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, &err) < 0) {
+ error_report_err(err);
return -EIO;
}
--
2.44.0
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v3 07/26] migration: Remove SaveStateHandler and LoadStateHandler typedefs
2024-03-04 12:28 [PATCH v3 00/26] migration: Improve error reporting Cédric Le Goater
` (5 preceding siblings ...)
2024-03-04 12:28 ` [PATCH v3 06/26] migration: Report error when shutdown fails Cédric Le Goater
@ 2024-03-04 12:28 ` Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 08/26] migration: Add documentation for SaveVMHandlers Cédric Le Goater
` (19 subsequent siblings)
26 siblings, 0 replies; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-04 12:28 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster,
Cédric Le Goater
They are only used once.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
include/migration/register.h | 4 ++--
include/qemu/typedefs.h | 2 --
2 files changed, 2 insertions(+), 4 deletions(-)
diff --git a/include/migration/register.h b/include/migration/register.h
index 9ab1f79512c605f0c88a45b560c57486fa054441..2e6a7d766e62f64940086b7b511249c9ff21fa62 100644
--- a/include/migration/register.h
+++ b/include/migration/register.h
@@ -18,7 +18,7 @@
typedef struct SaveVMHandlers {
/* This runs inside the BQL. */
- SaveStateHandler *save_state;
+ void (*save_state)(QEMUFile *f, void *opaque);
/*
* save_prepare is called early, even before migration starts, and can be
@@ -71,7 +71,7 @@ typedef struct SaveVMHandlers {
/* This calculate the exact remaining data to transfer */
void (*state_pending_exact)(void *opaque, uint64_t *must_precopy,
uint64_t *can_postcopy);
- LoadStateHandler *load_state;
+ int (*load_state)(QEMUFile *f, void *opaque, int version_id);
int (*load_setup)(QEMUFile *f, void *opaque);
int (*load_cleanup)(void *opaque);
/* Called when postcopy migration wants to resume from failure */
diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
index a028dba4d0b67e87165f9f1a4e960e9e6b94477c..50c277cf0b467f782ba526041b2663207bf70945 100644
--- a/include/qemu/typedefs.h
+++ b/include/qemu/typedefs.h
@@ -151,8 +151,6 @@ typedef struct IRQState *qemu_irq;
/*
* Function types
*/
-typedef void SaveStateHandler(QEMUFile *f, void *opaque);
-typedef int LoadStateHandler(QEMUFile *f, void *opaque, int version_id);
typedef void (*qemu_irq_handler)(void *opaque, int n, int level);
#endif /* QEMU_TYPEDEFS_H */
--
2.44.0
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v3 08/26] migration: Add documentation for SaveVMHandlers
2024-03-04 12:28 [PATCH v3 00/26] migration: Improve error reporting Cédric Le Goater
` (6 preceding siblings ...)
2024-03-04 12:28 ` [PATCH v3 07/26] migration: Remove SaveStateHandler and LoadStateHandler typedefs Cédric Le Goater
@ 2024-03-04 12:28 ` Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 09/26] migration: Do not call PRECOPY_NOTIFY_SETUP notifiers in case of error Cédric Le Goater
` (18 subsequent siblings)
26 siblings, 0 replies; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-04 12:28 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster,
Cédric Le Goater
The SaveVMHandlers structure is still in use for complex subsystems
and devices. Document the handlers since we are going to modify a few
later.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
Changes in v3:
- VMState -> VMStateDescription
- Rephrased .save_live_complete_precopy() (Peter)
- Typos and rewordings (Avihai)
include/migration/register.h | 263 +++++++++++++++++++++++++++++++----
1 file changed, 237 insertions(+), 26 deletions(-)
diff --git a/include/migration/register.h b/include/migration/register.h
index 2e6a7d766e62f64940086b7b511249c9ff21fa62..d7b70a8be68c9df47c7843bda7d430989d7ca384 100644
--- a/include/migration/register.h
+++ b/include/migration/register.h
@@ -16,30 +16,130 @@
#include "hw/vmstate-if.h"
+/**
+ * struct SaveVMHandlers: handler structure to finely control
+ * migration of complex subsystems and devices, such as RAM, block and
+ * VFIO.
+ */
typedef struct SaveVMHandlers {
- /* This runs inside the BQL. */
+
+ /* The following handlers run inside the BQL. */
+
+ /**
+ * @save_state
+ *
+ * Saves state section on the source using the latest state format
+ * version.
+ *
+ * Legacy method. Should be deprecated when all users are ported
+ * to VMStateDescription.
+ *
+ * @f: QEMUFile where to send the data
+ * @opaque: data pointer passed to register_savevm_live()
+ */
void (*save_state)(QEMUFile *f, void *opaque);
- /*
- * save_prepare is called early, even before migration starts, and can be
- * used to perform early checks.
+ /**
+ * @save_prepare
+ *
+ * Called early, even before migration starts, and can be used to
+ * perform early checks.
+ *
+ * @opaque: data pointer passed to register_savevm_live()
+ * @errp: pointer to Error*, to store an error if it happens.
+ *
+ * Returns zero to indicate success and negative for error
*/
int (*save_prepare)(void *opaque, Error **errp);
+
+ /**
+ * @save_setup
+ *
+ * Initializes the data structures on the source and transmits
+ * first section containing information on the device
+ *
+ * @f: QEMUFile where to send the data
+ * @opaque: data pointer passed to register_savevm_live()
+ *
+ * Returns zero to indicate success and negative for error
+ */
int (*save_setup)(QEMUFile *f, void *opaque);
+
+ /**
+ * @save_cleanup
+ *
+ * Uninitializes the data structures on the source
+ *
+ * @opaque: data pointer passed to register_savevm_live()
+ */
void (*save_cleanup)(void *opaque);
+
+ /**
+ * @save_live_complete_postcopy
+ *
+ * Called at the end of postcopy for all postcopyable devices.
+ *
+ * @f: QEMUFile where to send the data
+ * @opaque: data pointer passed to register_savevm_live()
+ *
+ * Returns zero to indicate success and negative for error
+ */
int (*save_live_complete_postcopy)(QEMUFile *f, void *opaque);
+
+ /**
+ * @save_live_complete_precopy
+ *
+ * Transmits the last section for the device containing any
+ * remaining data at the end of a precopy phase. When postcopy is
+ * enabled, devices that support postcopy will skip this step,
+ * where the final data will be flushed at the end of postcopy via
+ * @save_live_complete_postcopy instead.
+ *
+ * @f: QEMUFile where to send the data
+ * @opaque: data pointer passed to register_savevm_live()
+ *
+ * Returns zero to indicate success and negative for error
+ */
int (*save_live_complete_precopy)(QEMUFile *f, void *opaque);
/* This runs both outside and inside the BQL. */
+
+ /**
+ * @is_active
+ *
+ * Will skip a state section if not active
+ *
+ * @opaque: data pointer passed to register_savevm_live()
+ *
+ * Returns true if state section is active else false
+ */
bool (*is_active)(void *opaque);
+
+ /**
+ * @has_postcopy
+ *
+ * Checks if a device supports postcopy
+ *
+ * @opaque: data pointer passed to register_savevm_live()
+ *
+ * Returns true for postcopy support else false
+ */
bool (*has_postcopy)(void *opaque);
- /* is_active_iterate
- * If it is not NULL then qemu_savevm_state_iterate will skip iteration if
- * it returns false. For example, it is needed for only-postcopy-states,
- * which needs to be handled by qemu_savevm_state_setup and
- * qemu_savevm_state_pending, but do not need iterations until not in
- * postcopy stage.
+ /**
+ * @is_active_iterate
+ *
+ * As #SaveVMHandlers.is_active(), will skip an inactive state
+ * section in qemu_savevm_state_iterate.
+ *
+ * For example, it is needed for only-postcopy-states, which needs
+ * to be handled by qemu_savevm_state_setup() and
+ * qemu_savevm_state_pending(), but do not need iterations until
+ * not in postcopy stage.
+ *
+ * @opaque: data pointer passed to register_savevm_live()
+ *
+ * Returns true if state section is active else false
*/
bool (*is_active_iterate)(void *opaque);
@@ -48,44 +148,155 @@ typedef struct SaveVMHandlers {
* use data that is local to the migration thread or protected
* by other locks.
*/
+
+ /**
+ * @save_live_iterate
+ *
+ * Should send a chunk of data until the point that stream
+ * bandwidth limits tell it to stop. Each call generates one
+ * section.
+ *
+ * @f: QEMUFile where to send the data
+ * @opaque: data pointer passed to register_savevm_live()
+ *
+ * Returns 0 to indicate that there is still more data to send,
+ * 1 that there is no more data to send and
+ * negative to indicate an error.
+ */
int (*save_live_iterate)(QEMUFile *f, void *opaque);
/* This runs outside the BQL! */
- /* Note for save_live_pending:
- * must_precopy:
- * - must be migrated in precopy or in stopped state
- * - i.e. must be migrated before target start
- *
- * can_postcopy:
- * - can migrate in postcopy or in stopped state
- * - i.e. can migrate after target start
- * - some can also be migrated during precopy (RAM)
- * - some must be migrated after source stops (block-dirty-bitmap)
- *
- * Sum of can_postcopy and must_postcopy is the whole amount of
+
+ /**
+ * @state_pending_estimate
+ *
+ * This estimates the remaining data to transfer
+ *
+ * Sum of @can_postcopy and @must_postcopy is the whole amount of
* pending data.
+ *
+ * @opaque: data pointer passed to register_savevm_live()
+ * @must_precopy: amount of data that must be migrated in precopy
+ * or in stopped state, i.e. that must be migrated
+ * before target start.
+ * @can_postcopy: amount of data that can be migrated in postcopy
+ * or in stopped state, i.e. after target start.
+ * Some can also be migrated during precopy (RAM).
+ * Some must be migrated after source stops
+ * (block-dirty-bitmap)
*/
- /* This estimates the remaining data to transfer */
void (*state_pending_estimate)(void *opaque, uint64_t *must_precopy,
uint64_t *can_postcopy);
- /* This calculate the exact remaining data to transfer */
+
+ /**
+ * @state_pending_exact
+ *
+ * This calculates the exact remaining data to transfer
+ *
+ * Sum of @can_postcopy and @must_postcopy is the whole amount of
+ * pending data.
+ *
+ * @opaque: data pointer passed to register_savevm_live()
+ * @must_precopy: amount of data that must be migrated in precopy
+ * or in stopped state, i.e. that must be migrated
+ * before target start.
+ * @can_postcopy: amount of data that can be migrated in postcopy
+ * or in stopped state, i.e. after target start.
+ * Some can also be migrated during precopy (RAM).
+ * Some must be migrated after source stops
+ * (block-dirty-bitmap)
+ */
void (*state_pending_exact)(void *opaque, uint64_t *must_precopy,
uint64_t *can_postcopy);
+
+ /**
+ * @load_state
+ *
+ * Load sections generated by any of the save functions that
+ * generate sections.
+ *
+ * Legacy method. Should be deprecated when all users are ported
+ * to VMStateDescription.
+ *
+ * @f: QEMUFile where to receive the data
+ * @opaque: data pointer passed to register_savevm_live()
+ * @version_id: the maximum version_id supported
+ *
+ * Returns zero to indicate success and negative for error
+ */
int (*load_state)(QEMUFile *f, void *opaque, int version_id);
+
+ /**
+ * @load_setup
+ *
+ * Initializes the data structures on the destination.
+ *
+ * @f: QEMUFile where to receive the data
+ * @opaque: data pointer passed to register_savevm_live()
+ *
+ * Returns zero to indicate success and negative for error
+ */
int (*load_setup)(QEMUFile *f, void *opaque);
+
+ /**
+ * @load_cleanup
+ *
+ * Uninitializes the data structures on the destination.
+ *
+ * @opaque: data pointer passed to register_savevm_live()
+ *
+ * Returns zero to indicate success and negative for error
+ */
int (*load_cleanup)(void *opaque);
- /* Called when postcopy migration wants to resume from failure */
+
+ /**
+ * @resume_prepare
+ *
+ * Called when postcopy migration wants to resume from failure
+ *
+ * @s: Current migration state
+ * @opaque: data pointer passed to register_savevm_live()
+ *
+ * Returns zero to indicate success and negative for error
+ */
int (*resume_prepare)(MigrationState *s, void *opaque);
- /* Checks if switchover ack should be used. Called only in dest */
+
+ /**
+ * @switchover_ack_needed
+ *
+ * Checks if switchover ack should be used. Called only on
+ * destination.
+ *
+ * @opaque: data pointer passed to register_savevm_live()
+ *
+ * Returns true if switchover ack should be used and false
+ * otherwise
+ */
bool (*switchover_ack_needed)(void *opaque);
} SaveVMHandlers;
+/**
+ * register_savevm_live: Register a set of custom migration handlers
+ *
+ * @idstr: state section identifier
+ * @instance_id: instance id
+ * @version_id: version id supported
+ * @ops: SaveVMHandlers structure
+ * @opaque: data pointer passed to SaveVMHandlers handlers
+ */
int register_savevm_live(const char *idstr,
uint32_t instance_id,
int version_id,
const SaveVMHandlers *ops,
void *opaque);
+/**
+ * unregister_savevm: Unregister custom migration handlers
+ *
+ * @obj: object associated with state section
+ * @idstr: state section identifier
+ * @opaque: data pointer passed to register_savevm_live()
+ */
void unregister_savevm(VMStateIf *obj, const char *idstr, void *opaque);
#endif
--
2.44.0
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v3 09/26] migration: Do not call PRECOPY_NOTIFY_SETUP notifiers in case of error
2024-03-04 12:28 [PATCH v3 00/26] migration: Improve error reporting Cédric Le Goater
` (7 preceding siblings ...)
2024-03-04 12:28 ` [PATCH v3 08/26] migration: Add documentation for SaveVMHandlers Cédric Le Goater
@ 2024-03-04 12:28 ` Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 10/26] migration: Move cleanup after after error reporting in qemu_savevm_state_setup() Cédric Le Goater
` (17 subsequent siblings)
26 siblings, 0 replies; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-04 12:28 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster,
Cédric Le Goater
When commit bd2270608fa0 ("migration/ram.c: add a notifier chain for
precopy") added PRECOPY_NOTIFY_SETUP notifiers at the end of
qemu_savevm_state_setup(), it didn't take into account a possible
error in the loop calling vmstate_save() or .save_setup() handlers.
Check ret value before calling the notifiers.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
migration/savevm.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/migration/savevm.c b/migration/savevm.c
index c3642fac9dad3e314fe3fd5058889db1d2cecd47..31ce9391d49c825d4ec835e26ac0246e192783a0 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1314,7 +1314,7 @@ void qemu_savevm_state_setup(QEMUFile *f)
MigrationState *ms = migrate_get_current();
SaveStateEntry *se;
Error *local_err = NULL;
- int ret;
+ int ret = 0;
json_writer_int64(ms->vmdesc, "page_size", qemu_target_page_size());
json_writer_start_array(ms->vmdesc, "devices");
@@ -1350,6 +1350,10 @@ void qemu_savevm_state_setup(QEMUFile *f)
}
}
+ if (ret) {
+ return;
+ }
+
if (precopy_notify(PRECOPY_NOTIFY_SETUP, &local_err)) {
error_report_err(local_err);
}
--
2.44.0
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v3 10/26] migration: Move cleanup after after error reporting in qemu_savevm_state_setup()
2024-03-04 12:28 [PATCH v3 00/26] migration: Improve error reporting Cédric Le Goater
` (8 preceding siblings ...)
2024-03-04 12:28 ` [PATCH v3 09/26] migration: Do not call PRECOPY_NOTIFY_SETUP notifiers in case of error Cédric Le Goater
@ 2024-03-04 12:28 ` Cédric Le Goater
2024-03-05 3:32 ` Peter Xu
2024-03-04 12:28 ` [PATCH v3 11/26] migration: Add Error** argument to qemu_savevm_state_setup() Cédric Le Goater
` (16 subsequent siblings)
26 siblings, 1 reply; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-04 12:28 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster,
Cédric Le Goater
This will help preserving the error set by .save_setup() handlers.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
migration/savevm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/migration/savevm.c b/migration/savevm.c
index 31ce9391d49c825d4ec835e26ac0246e192783a0..e400706e61e06d2d1d03a11aed14f30a243833f2 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1740,10 +1740,10 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
qemu_savevm_state_complete_precopy(f, false, false);
ret = qemu_file_get_error(f);
}
- qemu_savevm_state_cleanup();
if (ret != 0) {
error_setg_errno(errp, -ret, "Error while writing VM state");
}
+ qemu_savevm_state_cleanup();
if (ret != 0) {
status = MIGRATION_STATUS_FAILED;
--
2.44.0
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v3 11/26] migration: Add Error** argument to qemu_savevm_state_setup()
2024-03-04 12:28 [PATCH v3 00/26] migration: Improve error reporting Cédric Le Goater
` (9 preceding siblings ...)
2024-03-04 12:28 ` [PATCH v3 10/26] migration: Move cleanup after after error reporting in qemu_savevm_state_setup() Cédric Le Goater
@ 2024-03-04 12:28 ` Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 12/26] migration: Add Error** argument to .save_setup() handler Cédric Le Goater
` (15 subsequent siblings)
26 siblings, 0 replies; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-04 12:28 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster,
Cédric Le Goater
This prepares ground for the changes coming next which add an Error**
argument to the .save_setup() handler. Callers of qemu_savevm_state_setup()
now handle the error and fail earlier setting the migration state from
MIGRATION_STATUS_SETUP to MIGRATION_STATUS_FAILED.
Since the previous behavior was to ignore errors at this step of
migration, this change should be examined closely to check that
cleanups are still correctly done.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
Changes in v3:
- Set migration state to MIGRATION_STATUS_FAILED
- Fixed error handling to be done under lock in bg_migration_thread()
- Made sure an error is always set in case of failure in
qemu_savevm_state_setup()
migration/savevm.h | 2 +-
migration/migration.c | 27 ++++++++++++++++++++++++---
migration/savevm.c | 24 ++++++++++++++----------
3 files changed, 39 insertions(+), 14 deletions(-)
diff --git a/migration/savevm.h b/migration/savevm.h
index 74669733dd63a080b765866c703234a5c4939223..9ec96a995c93a42aad621595f0ed58596c532328 100644
--- a/migration/savevm.h
+++ b/migration/savevm.h
@@ -32,7 +32,7 @@
bool qemu_savevm_state_blocked(Error **errp);
void qemu_savevm_non_migratable_list(strList **reasons);
int qemu_savevm_state_prepare(Error **errp);
-void qemu_savevm_state_setup(QEMUFile *f);
+int qemu_savevm_state_setup(QEMUFile *f, Error **errp);
bool qemu_savevm_state_guest_unplug_pending(void);
int qemu_savevm_state_resume_prepare(MigrationState *s);
void qemu_savevm_state_header(QEMUFile *f);
diff --git a/migration/migration.c b/migration/migration.c
index bab68bcbef33edc9a78c2f4ebeccee690f4b1786..ff2d4bf9d153444d58d21f761071fe811034ea86 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3360,6 +3360,8 @@ static void *migration_thread(void *opaque)
int64_t setup_start = qemu_clock_get_ms(QEMU_CLOCK_HOST);
MigThrError thr_error;
bool urgent = false;
+ Error *local_err = NULL;
+ int ret;
thread = migration_threads_add("live_migration", qemu_get_thread_id());
@@ -3403,9 +3405,17 @@ static void *migration_thread(void *opaque)
}
bql_lock();
- qemu_savevm_state_setup(s->to_dst_file);
+ ret = qemu_savevm_state_setup(s->to_dst_file, &local_err);
bql_unlock();
+ if (ret) {
+ migrate_set_error(s, local_err);
+ error_free(local_err);
+ migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
+ MIGRATION_STATUS_FAILED);
+ goto out;
+ }
+
qemu_savevm_wait_unplug(s, MIGRATION_STATUS_SETUP,
MIGRATION_STATUS_ACTIVE);
@@ -3482,6 +3492,9 @@ static void *bg_migration_thread(void *opaque)
MigThrError thr_error;
QEMUFile *fb;
bool early_fail = true;
+ bool setup_fail = true;
+ Error *local_err = NULL;
+ int ret;
rcu_register_thread();
object_ref(OBJECT(s));
@@ -3515,9 +3528,16 @@ static void *bg_migration_thread(void *opaque)
bql_lock();
qemu_savevm_state_header(s->to_dst_file);
- qemu_savevm_state_setup(s->to_dst_file);
+ ret = qemu_savevm_state_setup(s->to_dst_file, &local_err);
+ if (ret) {
+ migrate_set_error(s, local_err);
+ error_free(local_err);
+ goto fail;
+ }
bql_unlock();
+ setup_fail = false;
+
qemu_savevm_wait_unplug(s, MIGRATION_STATUS_SETUP,
MIGRATION_STATUS_ACTIVE);
@@ -3584,7 +3604,8 @@ static void *bg_migration_thread(void *opaque)
fail:
if (early_fail) {
- migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
+ migrate_set_state(&s->state,
+ setup_fail ? MIGRATION_STATUS_SETUP : MIGRATION_STATUS_ACTIVE,
MIGRATION_STATUS_FAILED);
bql_unlock();
}
diff --git a/migration/savevm.c b/migration/savevm.c
index e400706e61e06d2d1d03a11aed14f30a243833f2..3ae4328b95d554cb4855f96989f8685daf4b8407 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1309,11 +1309,11 @@ int qemu_savevm_state_prepare(Error **errp)
return 0;
}
-void qemu_savevm_state_setup(QEMUFile *f)
+int qemu_savevm_state_setup(QEMUFile *f, Error **errp)
{
+ ERRP_GUARD();
MigrationState *ms = migrate_get_current();
SaveStateEntry *se;
- Error *local_err = NULL;
int ret = 0;
json_writer_int64(ms->vmdesc, "page_size", qemu_target_page_size());
@@ -1322,10 +1322,9 @@ void qemu_savevm_state_setup(QEMUFile *f)
trace_savevm_state_setup();
QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
if (se->vmsd && se->vmsd->early_setup) {
- ret = vmstate_save(f, se, ms->vmdesc, &local_err);
+ ret = vmstate_save(f, se, ms->vmdesc, errp);
if (ret) {
- migrate_set_error(ms, local_err);
- error_report_err(local_err);
+ migrate_set_error(ms, *errp);
qemu_file_set_error(f, ret);
break;
}
@@ -1345,18 +1344,19 @@ void qemu_savevm_state_setup(QEMUFile *f)
ret = se->ops->save_setup(f, se->opaque);
save_section_footer(f, se);
if (ret < 0) {
+ error_setg(errp, "failed to setup SaveStateEntry with id(name): "
+ "%d(%s): %d", se->section_id, se->idstr, ret);
qemu_file_set_error(f, ret);
break;
}
}
if (ret) {
- return;
+ return ret;
}
- if (precopy_notify(PRECOPY_NOTIFY_SETUP, &local_err)) {
- error_report_err(local_err);
- }
+ /* TODO: Should we check that errp is set in case of failure ? */
+ return precopy_notify(PRECOPY_NOTIFY_SETUP, errp);
}
int qemu_savevm_state_resume_prepare(MigrationState *s)
@@ -1727,7 +1727,10 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
ms->to_dst_file = f;
qemu_savevm_state_header(f);
- qemu_savevm_state_setup(f);
+ ret = qemu_savevm_state_setup(f, errp);
+ if (ret) {
+ goto cleanup;
+ }
while (qemu_file_get_error(f) == 0) {
if (qemu_savevm_state_iterate(f, false) > 0) {
@@ -1743,6 +1746,7 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
if (ret != 0) {
error_setg_errno(errp, -ret, "Error while writing VM state");
}
+cleanup:
qemu_savevm_state_cleanup();
if (ret != 0) {
--
2.44.0
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v3 12/26] migration: Add Error** argument to .save_setup() handler
2024-03-04 12:28 [PATCH v3 00/26] migration: Improve error reporting Cédric Le Goater
` (10 preceding siblings ...)
2024-03-04 12:28 ` [PATCH v3 11/26] migration: Add Error** argument to qemu_savevm_state_setup() Cédric Le Goater
@ 2024-03-04 12:28 ` Cédric Le Goater
2024-03-05 8:54 ` Thomas Huth
2024-03-04 12:28 ` [PATCH v3 13/26] migration: Add Error** argument to .load_setup() handler Cédric Le Goater
` (14 subsequent siblings)
26 siblings, 1 reply; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-04 12:28 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster,
Cédric Le Goater, Nicholas Piggin, Harsh Prateek Bora,
Halil Pasic, Thomas Huth, Eric Blake,
Vladimir Sementsov-Ogievskiy, John Snow, Stefan Hajnoczi
The purpose is to record a potential error in the migration stream if
qemu_savevm_state_setup() fails. Most of the current .save_setup()
handlers can be modified to use the Error argument instead of managing
their own and calling locally error_report().
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Harsh Prateek Bora <harshpb@linux.ibm.com>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Thomas Huth <thuth@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Cc: John Snow <jsnow@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
Changes in v3:
- Made sure an error is always set in case of failure in
qemu_savevm_state_setup()
Changes in v2:
- dropped qemu_file_set_error_obj(f, ret, local_err);
include/migration/register.h | 3 ++-
hw/ppc/spapr.c | 2 +-
hw/s390x/s390-stattrib.c | 6 ++----
hw/vfio/migration.c | 17 ++++++++---------
migration/block-dirty-bitmap.c | 4 +++-
migration/block.c | 13 ++++---------
migration/ram.c | 15 ++++++++-------
migration/savevm.c | 4 +---
8 files changed, 29 insertions(+), 35 deletions(-)
diff --git a/include/migration/register.h b/include/migration/register.h
index d7b70a8be68c9df47c7843bda7d430989d7ca384..64fc7c11036c82edd6d69513e56a0216d36c17aa 100644
--- a/include/migration/register.h
+++ b/include/migration/register.h
@@ -60,10 +60,11 @@ typedef struct SaveVMHandlers {
*
* @f: QEMUFile where to send the data
* @opaque: data pointer passed to register_savevm_live()
+ * @errp: pointer to Error*, to store an error if it happens.
*
* Returns zero to indicate success and negative for error
*/
- int (*save_setup)(QEMUFile *f, void *opaque);
+ int (*save_setup)(QEMUFile *f, void *opaque, Error **errp);
/**
* @save_cleanup
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 55263f0815ed7671b32ea20b394ae71c82e616cb..045c024ffa76eacfc496bd486cb6cafbee2df73e 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2142,7 +2142,7 @@ static const VMStateDescription vmstate_spapr = {
}
};
-static int htab_save_setup(QEMUFile *f, void *opaque)
+static int htab_save_setup(QEMUFile *f, void *opaque, Error **errp)
{
SpaprMachineState *spapr = opaque;
diff --git a/hw/s390x/s390-stattrib.c b/hw/s390x/s390-stattrib.c
index e99de190332a82363b1388bbc450013149295bc0..16dee2e2ea8031d072156bb7930d5ac6dfd5a214 100644
--- a/hw/s390x/s390-stattrib.c
+++ b/hw/s390x/s390-stattrib.c
@@ -167,19 +167,17 @@ static int cmma_load(QEMUFile *f, void *opaque, int version_id)
return ret;
}
-static int cmma_save_setup(QEMUFile *f, void *opaque)
+static int cmma_save_setup(QEMUFile *f, void *opaque, Error **errp)
{
S390StAttribState *sas = S390_STATTRIB(opaque);
S390StAttribClass *sac = S390_STATTRIB_GET_CLASS(sas);
- Error *local_err = NULL;
int res;
/*
* Signal that we want to start a migration, thus needing PGSTE dirty
* tracking.
*/
- res = sac->set_migrationmode(sas, true, &local_err);
+ res = sac->set_migrationmode(sas, true, errp);
if (res) {
- error_report_err(local_err);
return res;
}
qemu_put_be64(f, STATTR_FLAG_EOS);
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 51bea536cc290ba0aa393f78b017b0650e333bff..5152783e01c39edaf4ab37035924f8e495cf48d8 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -378,7 +378,7 @@ static int vfio_save_prepare(void *opaque, Error **errp)
return 0;
}
-static int vfio_save_setup(QEMUFile *f, void *opaque)
+static int vfio_save_setup(QEMUFile *f, void *opaque, Error **errp)
{
VFIODevice *vbasedev = opaque;
VFIOMigration *migration = vbasedev->migration;
@@ -392,8 +392,8 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
stop_copy_size);
migration->data_buffer = g_try_malloc0(migration->data_buffer_size);
if (!migration->data_buffer) {
- error_report("%s: Failed to allocate migration data buffer",
- vbasedev->name);
+ error_setg(errp, "%s: Failed to allocate migration data buffer",
+ vbasedev->name);
return -ENOMEM;
}
@@ -403,8 +403,8 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_PRE_COPY,
VFIO_DEVICE_STATE_RUNNING);
if (ret) {
- error_report("%s: Failed to set new RUNNING state",
- vbasedev->name);
+ error_setg(errp, "%s: Failed to set new RUNNING state",
+ vbasedev->name);
return ret;
}
@@ -415,8 +415,8 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
/* vfio_save_complete_precopy() will go to STOP_COPY */
break;
default:
- error_report("%s: Invalid device state %d", vbasedev->name,
- migration->device_state);
+ error_setg(errp, "%s: Invalid device state %d", vbasedev->name,
+ migration->device_state);
return -EINVAL;
}
}
@@ -427,8 +427,7 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
ret = qemu_file_get_error(f);
if (ret) {
- error_report("%s: save setup failed : %s", vbasedev->name,
- strerror(ret));
+ error_setg_errno(errp, ret, "%s: save setup failed", vbasedev->name);
}
return ret;
diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 2708abf3d762de774ed294d3fdb8e56690d2974c..542a8c297b329abc30d1b3a205d29340fa59a961 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -1213,12 +1213,14 @@ fail:
return ret;
}
-static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque)
+static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque, Error **errp)
{
DBMSaveState *s = &((DBMState *)opaque)->save;
SaveBitmapState *dbms = NULL;
if (init_dirty_bitmap_migration(s) < 0) {
+ error_setg(errp,
+ "Failed to initialize dirty tracking bitmap for blocks");
return -1;
}
diff --git a/migration/block.c b/migration/block.c
index 06f5857cf049df3261d2cf1d7c3607ab92350ac6..02d00095db4cb040d5508a191f469d657b0085bf 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -706,10 +706,9 @@ static void block_migration_cleanup(void *opaque)
blk_mig_unlock();
}
-static int block_save_setup(QEMUFile *f, void *opaque)
+static int block_save_setup(QEMUFile *f, void *opaque, Error **errp)
{
int ret;
- Error *local_err = NULL;
trace_migration_block_save("setup", block_mig_state.submitted,
block_mig_state.transferred);
@@ -717,25 +716,21 @@ static int block_save_setup(QEMUFile *f, void *opaque)
warn_report("block migration is deprecated;"
" use blockdev-mirror with NBD instead");
- ret = init_blk_migration(f, &local_err);
+ ret = init_blk_migration(f, errp);
if (ret < 0) {
- error_report_err(local_err);
return ret;
}
/* start track dirty blocks */
ret = set_dirty_tracking();
if (ret) {
- error_setg_errno(&local_err, -ret,
- "Failed to start block dirty tracking");
- error_report_err(local_err);
+ error_setg_errno(errp, -ret, "Failed to start block dirty tracking");
return ret;
}
ret = flush_blks(f);
if (ret) {
- error_setg_errno(&local_err, -ret, "Flushing block failed");
- error_report_err(local_err);
+ error_setg_errno(errp, -ret, "Flushing block failed");
return ret;
}
blk_mig_reset_dirty_cursor();
diff --git a/migration/ram.c b/migration/ram.c
index dbd04d8ee2167881007c836a6bbc79c1aced72d0..84e718c562c72310092114c438eb94e0077775f2 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2929,22 +2929,23 @@ void qemu_guest_free_page_hint(void *addr, size_t len)
*
* @f: QEMUFile where to send the data
* @opaque: RAMState pointer
+ * @errp: pointer to Error*, to store an error if it happens.
*/
-static int ram_save_setup(QEMUFile *f, void *opaque)
+static int ram_save_setup(QEMUFile *f, void *opaque, Error **errp)
{
RAMState **rsp = opaque;
RAMBlock *block;
int ret;
if (compress_threads_save_setup()) {
- error_report("%s: failed to start compress threads", __func__);
+ error_setg(errp, "%s: failed to start compress threads", __func__);
return -1;
}
/* migration has already setup the bitmap, reuse it. */
if (!migration_in_colo_state()) {
if (ram_init_all(rsp) != 0) {
- error_report("%s: failed to setup RAM for migration", __func__);
+ error_setg(errp, "%s: failed to setup RAM for migration", __func__);
compress_threads_save_cleanup();
return -1;
}
@@ -2971,14 +2972,14 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
ret = rdma_registration_start(f, RAM_CONTROL_SETUP);
if (ret < 0) {
- error_report("%s: failed to start RDMA registration", __func__);
+ error_setg(errp, "%s: failed to start RDMA registration", __func__);
qemu_file_set_error(f, ret);
return ret;
}
ret = rdma_registration_stop(f, RAM_CONTROL_SETUP);
if (ret < 0) {
- error_report("%s: failed to stop RDMA registration", __func__);
+ error_setg(errp, "%s: failed to stop RDMA registration", __func__);
qemu_file_set_error(f, ret);
return ret;
}
@@ -2990,7 +2991,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
ret = multifd_send_sync_main();
bql_lock();
if (ret < 0) {
- error_report("%s: multifd synchronization failed", __func__);
+ error_setg(errp, "%s: multifd synchronization failed", __func__);
return ret;
}
@@ -3001,7 +3002,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
ret = qemu_fflush(f);
if (ret) {
- error_report("%s failed : %s", __func__, strerror(ret));
+ error_setg_errno(errp, ret, "%s failed", __func__);
}
return ret;
}
diff --git a/migration/savevm.c b/migration/savevm.c
index 3ae4328b95d554cb4855f96989f8685daf4b8407..0ff2c4d97d475b81603bc2387cfaa3c000a25cdc 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1341,11 +1341,9 @@ int qemu_savevm_state_setup(QEMUFile *f, Error **errp)
}
save_section_header(f, se, QEMU_VM_SECTION_START);
- ret = se->ops->save_setup(f, se->opaque);
+ ret = se->ops->save_setup(f, se->opaque, errp);
save_section_footer(f, se);
if (ret < 0) {
- error_setg(errp, "failed to setup SaveStateEntry with id(name): "
- "%d(%s): %d", se->section_id, se->idstr, ret);
qemu_file_set_error(f, ret);
break;
}
--
2.44.0
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v3 13/26] migration: Add Error** argument to .load_setup() handler
2024-03-04 12:28 [PATCH v3 00/26] migration: Improve error reporting Cédric Le Goater
` (11 preceding siblings ...)
2024-03-04 12:28 ` [PATCH v3 12/26] migration: Add Error** argument to .save_setup() handler Cédric Le Goater
@ 2024-03-04 12:28 ` Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 14/26] memory: Add Error** argument to .log_global*() handlers Cédric Le Goater
` (13 subsequent siblings)
26 siblings, 0 replies; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-04 12:28 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster,
Cédric Le Goater
This will be useful to report errors at a higher level, mostly in VFIO
today.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
Changes in v3:
- ERRP_GUARD() because of error_prepend use
- Made sure an error is always set in case of failure in
vfio_load_setup()
include/migration/register.h | 3 ++-
hw/vfio/migration.c | 9 +++++++--
migration/ram.c | 3 ++-
migration/savevm.c | 11 +++++++----
4 files changed, 18 insertions(+), 8 deletions(-)
diff --git a/include/migration/register.h b/include/migration/register.h
index 64fc7c11036c82edd6d69513e56a0216d36c17aa..f60e797894e5faacdf55d2d6de175074ac58944f 100644
--- a/include/migration/register.h
+++ b/include/migration/register.h
@@ -234,10 +234,11 @@ typedef struct SaveVMHandlers {
*
* @f: QEMUFile where to receive the data
* @opaque: data pointer passed to register_savevm_live()
+ * @errp: pointer to Error*, to store an error if it happens.
*
* Returns zero to indicate success and negative for error
*/
- int (*load_setup)(QEMUFile *f, void *opaque);
+ int (*load_setup)(QEMUFile *f, void *opaque, Error **errp);
/**
* @load_cleanup
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 5152783e01c39edaf4ab37035924f8e495cf48d8..20eb7dad1a909d49a810294d87a07262ec5f1e9d 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -588,12 +588,17 @@ static void vfio_save_state(QEMUFile *f, void *opaque)
}
}
-static int vfio_load_setup(QEMUFile *f, void *opaque)
+static int vfio_load_setup(QEMUFile *f, void *opaque, Error **errp)
{
VFIODevice *vbasedev = opaque;
+ int ret;
- return vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_RESUMING,
+ ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_RESUMING,
vbasedev->migration->device_state);
+ if (ret) {
+ error_setg(errp, "%s: Failed to set RESUMING state", vbasedev->name);
+ }
+ return ret;
}
static int vfio_load_cleanup(void *opaque)
diff --git a/migration/ram.c b/migration/ram.c
index 84e718c562c72310092114c438eb94e0077775f2..20c6ad9e759b2b8ec7ae26b7ca72d5cbd20d481f 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3506,8 +3506,9 @@ void colo_release_ram_cache(void)
*
* @f: QEMUFile where to receive the data
* @opaque: RAMState pointer
+ * @errp: pointer to Error*, to store an error if it happens.
*/
-static int ram_load_setup(QEMUFile *f, void *opaque)
+static int ram_load_setup(QEMUFile *f, void *opaque, Error **errp)
{
xbzrle_load_setup();
ramblock_recv_map_init();
diff --git a/migration/savevm.c b/migration/savevm.c
index 0ff2c4d97d475b81603bc2387cfaa3c000a25cdc..a495676fab928b86df33911ea2f24d250efe1876 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2749,8 +2749,9 @@ static void qemu_loadvm_state_switchover_ack_needed(MigrationIncomingState *mis)
trace_loadvm_state_switchover_ack_needed(mis->switchover_ack_pending_num);
}
-static int qemu_loadvm_state_setup(QEMUFile *f)
+static int qemu_loadvm_state_setup(QEMUFile *f, Error **errp)
{
+ ERRP_GUARD(); /* error_prepend use */
SaveStateEntry *se;
int ret;
@@ -2765,10 +2766,11 @@ static int qemu_loadvm_state_setup(QEMUFile *f)
}
}
- ret = se->ops->load_setup(f, se->opaque);
+ ret = se->ops->load_setup(f, se->opaque, errp);
if (ret < 0) {
+ error_prepend(errp, "Load state of device %s failed: ",
+ se->idstr);
qemu_file_set_error(f, ret);
- error_report("Load state of device %s failed", se->idstr);
return ret;
}
}
@@ -2949,7 +2951,8 @@ int qemu_loadvm_state(QEMUFile *f)
return ret;
}
- if (qemu_loadvm_state_setup(f) != 0) {
+ if (qemu_loadvm_state_setup(f, &local_err) != 0) {
+ error_report_err(local_err);
return -EINVAL;
}
--
2.44.0
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v3 14/26] memory: Add Error** argument to .log_global*() handlers
2024-03-04 12:28 [PATCH v3 00/26] migration: Improve error reporting Cédric Le Goater
` (12 preceding siblings ...)
2024-03-04 12:28 ` [PATCH v3 13/26] migration: Add Error** argument to .load_setup() handler Cédric Le Goater
@ 2024-03-04 12:28 ` Cédric Le Goater
2024-03-05 7:57 ` Peter Xu
2024-03-04 12:28 ` [PATCH v3 15/26] memory: Add Error** argument to the global_dirty_log routines Cédric Le Goater
` (12 subsequent siblings)
26 siblings, 1 reply; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-04 12:28 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster,
Cédric Le Goater, Stefano Stabellini, Anthony Perard,
Paul Durrant, Michael S. Tsirkin, Paolo Bonzini,
David Hildenbrand
Modify all log_global*() handlers to take an Error** parameter and
return a bool. A new MEMORY_LISTENER_CALL_LOG_GLOBAL macro looping on
the listeners is introduced to handle a possible error, which will
would interrupt the loop if necessary.
To be noted the change in memory_global_dirty_log_start() behavior as
it will return as soon as an error is detected.
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Paul Durrant <paul@xen.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
Changes in v3:
- Fixed return value of vfio_listener_log_global_start() and
vfio_listener_log_global_stop(). Went unnoticed because not tested.
include/exec/memory.h | 15 ++++++--
hw/i386/xen/xen-hvm.c | 6 ++--
hw/vfio/common.c | 8 +++--
hw/virtio/vhost.c | 6 ++--
system/memory.c | 83 +++++++++++++++++++++++++++++++++++++------
system/physmem.c | 5 +--
6 files changed, 101 insertions(+), 22 deletions(-)
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 8626a355b310ed7b1a1db7978ba4b394032c2f15..4bc146c5ebdd377cd14a4e462f32cc945db5a0a8 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -998,8 +998,11 @@ struct MemoryListener {
* active at that time.
*
* @listener: The #MemoryListener.
+ * @errp: pointer to Error*, to store an error if it happens.
+ *
+ * Return: true on success, else false setting @errp with error.
*/
- void (*log_global_start)(MemoryListener *listener);
+ bool (*log_global_start)(MemoryListener *listener, Error **errp);
/**
* @log_global_stop:
@@ -1009,8 +1012,11 @@ struct MemoryListener {
* the address space.
*
* @listener: The #MemoryListener.
+ * @errp: pointer to Error*, to store an error if it happens.
+ *
+ * Return: true on success, else false setting @errp with error.
*/
- void (*log_global_stop)(MemoryListener *listener);
+ bool (*log_global_stop)(MemoryListener *listener, Error **errp);
/**
* @log_global_after_sync:
@@ -1019,8 +1025,11 @@ struct MemoryListener {
* for any #MemoryRegionSection.
*
* @listener: The #MemoryListener.
+ * @errp: pointer to Error*, to store an error if it happens.
+ *
+ * Return: true on success, else false setting @errp with error.
*/
- void (*log_global_after_sync)(MemoryListener *listener);
+ bool (*log_global_after_sync)(MemoryListener *listener, Error **errp);
/**
* @eventfd_add:
diff --git a/hw/i386/xen/xen-hvm.c b/hw/i386/xen/xen-hvm.c
index f42621e6742552035122ea58092c91c3458338ff..925a207b494b4eed52d5f360b554f18ac8a9806d 100644
--- a/hw/i386/xen/xen-hvm.c
+++ b/hw/i386/xen/xen-hvm.c
@@ -446,16 +446,18 @@ static void xen_log_sync(MemoryListener *listener, MemoryRegionSection *section)
int128_get64(section->size));
}
-static void xen_log_global_start(MemoryListener *listener)
+static bool xen_log_global_start(MemoryListener *listener, Error **errp)
{
if (xen_enabled()) {
xen_in_migration = true;
}
+ return true;
}
-static void xen_log_global_stop(MemoryListener *listener)
+static bool xen_log_global_stop(MemoryListener *listener, Error **errp)
{
xen_in_migration = false;
+ return true;
}
static const MemoryListener xen_memory_listener = {
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index ff88c3f31ca660b3c0a790601100fdc6116192a0..71352ad6532cad02f3a2ab53bdd531d769ea55b2 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1075,7 +1075,8 @@ out:
return ret;
}
-static void vfio_listener_log_global_start(MemoryListener *listener)
+static bool vfio_listener_log_global_start(MemoryListener *listener,
+ Error **errp)
{
VFIOContainerBase *bcontainer = container_of(listener, VFIOContainerBase,
listener);
@@ -1092,9 +1093,11 @@ static void vfio_listener_log_global_start(MemoryListener *listener)
ret, strerror(-ret));
vfio_set_migration_error(ret);
}
+ return !ret;
}
-static void vfio_listener_log_global_stop(MemoryListener *listener)
+static bool vfio_listener_log_global_stop(MemoryListener *listener,
+ Error **errp)
{
VFIOContainerBase *bcontainer = container_of(listener, VFIOContainerBase,
listener);
@@ -1111,6 +1114,7 @@ static void vfio_listener_log_global_stop(MemoryListener *listener)
ret, strerror(-ret));
vfio_set_migration_error(ret);
}
+ return !ret;
}
static int vfio_device_dma_logging_report(VFIODevice *vbasedev, hwaddr iova,
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 2c9ac794680ea9b65eba6cc22e70cf141e90aa73..7a555f941934991a72a2817e5505fe0ce6d6fc64 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1044,7 +1044,7 @@ check_dev_state:
return r;
}
-static void vhost_log_global_start(MemoryListener *listener)
+static bool vhost_log_global_start(MemoryListener *listener, Error **errp)
{
int r;
@@ -1052,9 +1052,10 @@ static void vhost_log_global_start(MemoryListener *listener)
if (r < 0) {
abort();
}
+ return true;
}
-static void vhost_log_global_stop(MemoryListener *listener)
+static bool vhost_log_global_stop(MemoryListener *listener, Error **errp)
{
int r;
@@ -1062,6 +1063,7 @@ static void vhost_log_global_stop(MemoryListener *listener)
if (r < 0) {
abort();
}
+ return true;
}
static void vhost_log_start(MemoryListener *listener,
diff --git a/system/memory.c b/system/memory.c
index a229a79988fce2aa3cb77e3a130db4c694e8cd49..af06157ead5b1272548e87f79ab9fb3036055328 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -127,6 +127,35 @@ enum ListenerDirection { Forward, Reverse };
} \
} while (0)
+#define MEMORY_LISTENER_CALL_LOG_GLOBAL(_callback, _direction, _errp, \
+ _args...) \
+ do { \
+ MemoryListener *_listener; \
+ \
+ switch (_direction) { \
+ case Forward: \
+ QTAILQ_FOREACH(_listener, &memory_listeners, link) { \
+ if (_listener->_callback) { \
+ if (!_listener->_callback(_listener, _errp, ##_args)) { \
+ break; \
+ } \
+ } \
+ } \
+ break; \
+ case Reverse: \
+ QTAILQ_FOREACH_REVERSE(_listener, &memory_listeners, link) { \
+ if (_listener->_callback) { \
+ if (!_listener->_callback(_listener, _errp, ##_args)) { \
+ break; \
+ } \
+ } \
+ } \
+ break; \
+ default: \
+ abort(); \
+ }; \
+ } while (0)
+
#define MEMORY_LISTENER_CALL(_as, _callback, _direction, _section, _args...) \
do { \
MemoryListener *_listener; \
@@ -2903,7 +2932,13 @@ void memory_global_dirty_log_sync(bool last_stage)
void memory_global_after_dirty_log_sync(void)
{
- MEMORY_LISTENER_CALL_GLOBAL(log_global_after_sync, Forward);
+ Error *local_err = NULL;
+
+ MEMORY_LISTENER_CALL_LOG_GLOBAL(log_global_after_sync, Forward,
+ &local_err);
+ if (local_err) {
+ error_report_err(local_err);
+ }
}
/*
@@ -2912,18 +2947,22 @@ void memory_global_after_dirty_log_sync(void)
*/
static unsigned int postponed_stop_flags;
static VMChangeStateEntry *vmstate_change;
-static void memory_global_dirty_log_stop_postponed_run(void);
+static bool memory_global_dirty_log_stop_postponed_run(Error **errp);
void memory_global_dirty_log_start(unsigned int flags)
{
unsigned int old_flags;
+ Error *local_err = NULL;
assert(flags && !(flags & (~GLOBAL_DIRTY_MASK)));
if (vmstate_change) {
/* If there is postponed stop(), operate on it first */
postponed_stop_flags &= ~flags;
- memory_global_dirty_log_stop_postponed_run();
+ if (!memory_global_dirty_log_stop_postponed_run(&local_err)) {
+ error_report_err(local_err);
+ return;
+ }
}
flags &= ~global_dirty_tracking;
@@ -2936,15 +2975,22 @@ void memory_global_dirty_log_start(unsigned int flags)
trace_global_dirty_changed(global_dirty_tracking);
if (!old_flags) {
- MEMORY_LISTENER_CALL_GLOBAL(log_global_start, Forward);
+ MEMORY_LISTENER_CALL_LOG_GLOBAL(log_global_start, Forward,
+ &local_err);
+ if (local_err) {
+ error_report_err(local_err);
+ return;
+ }
memory_region_transaction_begin();
memory_region_update_pending = true;
memory_region_transaction_commit();
}
}
-static void memory_global_dirty_log_do_stop(unsigned int flags)
+static bool memory_global_dirty_log_do_stop(unsigned int flags, Error **errp)
{
+ ERRP_GUARD();
+
assert(flags && !(flags & (~GLOBAL_DIRTY_MASK)));
assert((global_dirty_tracking & flags) == flags);
global_dirty_tracking &= ~flags;
@@ -2955,39 +3001,49 @@ static void memory_global_dirty_log_do_stop(unsigned int flags)
memory_region_transaction_begin();
memory_region_update_pending = true;
memory_region_transaction_commit();
- MEMORY_LISTENER_CALL_GLOBAL(log_global_stop, Reverse);
+ MEMORY_LISTENER_CALL_LOG_GLOBAL(log_global_stop, Reverse, errp);
}
+ return !*errp;
}
/*
* Execute the postponed dirty log stop operations if there is, then reset
* everything (including the flags and the vmstate change hook).
*/
-static void memory_global_dirty_log_stop_postponed_run(void)
+static bool memory_global_dirty_log_stop_postponed_run(Error **errp)
{
+ bool ret = true;
+
/* This must be called with the vmstate handler registered */
assert(vmstate_change);
/* Note: postponed_stop_flags can be cleared in log start routine */
if (postponed_stop_flags) {
- memory_global_dirty_log_do_stop(postponed_stop_flags);
+ ret = memory_global_dirty_log_do_stop(postponed_stop_flags, errp);
postponed_stop_flags = 0;
}
qemu_del_vm_change_state_handler(vmstate_change);
vmstate_change = NULL;
+ return ret;
}
static void memory_vm_change_state_handler(void *opaque, bool running,
RunState state)
{
+ Error *local_err = NULL;
+
if (running) {
- memory_global_dirty_log_stop_postponed_run();
+ if (!memory_global_dirty_log_stop_postponed_run(&local_err)) {
+ error_report_err(local_err);
+ }
}
}
void memory_global_dirty_log_stop(unsigned int flags)
{
+ Error *local_err = NULL;
+
if (!runstate_is_running()) {
/* Postpone the dirty log stop, e.g., to when VM starts again */
if (vmstate_change) {
@@ -3001,7 +3057,9 @@ void memory_global_dirty_log_stop(unsigned int flags)
return;
}
- memory_global_dirty_log_do_stop(flags);
+ if (!memory_global_dirty_log_do_stop(flags, &local_err)) {
+ error_report_err(local_err);
+ }
}
static void listener_add_address_space(MemoryListener *listener,
@@ -3009,13 +3067,16 @@ static void listener_add_address_space(MemoryListener *listener,
{
FlatView *view;
FlatRange *fr;
+ Error *local_err = NULL;
if (listener->begin) {
listener->begin(listener);
}
if (global_dirty_tracking) {
if (listener->log_global_start) {
- listener->log_global_start(listener);
+ if (!listener->log_global_start(listener, &local_err)) {
+ error_report_err(local_err);
+ }
}
}
diff --git a/system/physmem.c b/system/physmem.c
index e3ebc19eefd8050a1dee16e3d1449f0c144f751f..9adbf9aea847cd80bdac6dca466fb476844ac048 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -148,7 +148,7 @@ typedef struct subpage_t {
static void io_mem_init(void);
static void memory_map_init(void);
-static void tcg_log_global_after_sync(MemoryListener *listener);
+static bool tcg_log_global_after_sync(MemoryListener *listener, Error **errp);
static void tcg_commit(MemoryListener *listener);
/**
@@ -2475,7 +2475,7 @@ static void do_nothing(CPUState *cpu, run_on_cpu_data d)
{
}
-static void tcg_log_global_after_sync(MemoryListener *listener)
+static bool tcg_log_global_after_sync(MemoryListener *listener, Error **errp)
{
CPUAddressSpace *cpuas;
@@ -2507,6 +2507,7 @@ static void tcg_log_global_after_sync(MemoryListener *listener)
cpuas = container_of(listener, CPUAddressSpace, tcg_as_listener);
run_on_cpu(cpuas->cpu, do_nothing, RUN_ON_CPU_NULL);
}
+ return true;
}
static void tcg_commit_cpu(CPUState *cpu, run_on_cpu_data data)
--
2.44.0
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v3 15/26] memory: Add Error** argument to the global_dirty_log routines
2024-03-04 12:28 [PATCH v3 00/26] migration: Improve error reporting Cédric Le Goater
` (13 preceding siblings ...)
2024-03-04 12:28 ` [PATCH v3 14/26] memory: Add Error** argument to .log_global*() handlers Cédric Le Goater
@ 2024-03-04 12:28 ` Cédric Le Goater
2024-03-05 1:58 ` Yong Huang
2024-03-04 12:28 ` [PATCH v3 16/26] migration: Modify ram_init_bitmaps() to report dirty tracking errors Cédric Le Goater
` (11 subsequent siblings)
26 siblings, 1 reply; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-04 12:28 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster,
Cédric Le Goater, Stefano Stabellini, Anthony Perard,
Paul Durrant, Michael S. Tsirkin, Paolo Bonzini,
David Hildenbrand, Hyman Huang
Now that the log_global*() handlers take an Error** parameter and
return a bool, do the same for memory_global_dirty_log_start() and
memory_global_dirty_log_stop(). The error is reported in the callers
for now and it will be propagated in the call stack in the next
changes.
To be noted a functional change in ram_init_bitmaps(), if the dirty
pages logger fails to start, there is no need to synchronize the dirty
pages bitmaps. colo_incoming_start_dirty_log() could be modified in a
similar way.
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Paul Durrant <paul@xen.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hyman Huang <yong.huang@smartx.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
include/exec/memory.h | 10 ++++++++--
hw/i386/xen/xen-hvm.c | 4 ++--
migration/dirtyrate.c | 21 +++++++++++++++++----
migration/ram.c | 34 ++++++++++++++++++++++++++++++----
system/memory.c | 30 ++++++++++++------------------
5 files changed, 69 insertions(+), 30 deletions(-)
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 4bc146c5ebdd377cd14a4e462f32cc945db5a0a8..8b019465ab13ce85c03075c80865a0865ea1feed 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -2576,15 +2576,21 @@ void memory_listener_unregister(MemoryListener *listener);
* memory_global_dirty_log_start: begin dirty logging for all regions
*
* @flags: purpose of starting dirty log, migration or dirty rate
+ * @errp: pointer to Error*, to store an error if it happens.
+ *
+ * Return: true on success, else false setting @errp with error.
*/
-void memory_global_dirty_log_start(unsigned int flags);
+bool memory_global_dirty_log_start(unsigned int flags, Error **errp);
/**
* memory_global_dirty_log_stop: end dirty logging for all regions
*
* @flags: purpose of stopping dirty log, migration or dirty rate
+ * @errp: pointer to Error*, to store an error if it happens.
+ *
+ * Return: true on success, else false setting @errp with error.
*/
-void memory_global_dirty_log_stop(unsigned int flags);
+bool memory_global_dirty_log_stop(unsigned int flags, Error **errp);
void mtree_info(bool flatview, bool dispatch_tree, bool owner, bool disabled);
diff --git a/hw/i386/xen/xen-hvm.c b/hw/i386/xen/xen-hvm.c
index 925a207b494b4eed52d5f360b554f18ac8a9806d..286269b47572d90e57df5ff44835bb5f8e16c7ad 100644
--- a/hw/i386/xen/xen-hvm.c
+++ b/hw/i386/xen/xen-hvm.c
@@ -655,9 +655,9 @@ void xen_hvm_modified_memory(ram_addr_t start, ram_addr_t length)
void qmp_xen_set_global_dirty_log(bool enable, Error **errp)
{
if (enable) {
- memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION);
+ memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION, errp);
} else {
- memory_global_dirty_log_stop(GLOBAL_DIRTY_MIGRATION);
+ memory_global_dirty_log_stop(GLOBAL_DIRTY_MIGRATION, errp);
}
}
diff --git a/migration/dirtyrate.c b/migration/dirtyrate.c
index 1d2e85746fb7b10eb7f149976970f9a92125af8a..34f6d803ff5f4e6ccf2e06aaaed65a336c4be469 100644
--- a/migration/dirtyrate.c
+++ b/migration/dirtyrate.c
@@ -90,11 +90,17 @@ static int64_t do_calculate_dirtyrate(DirtyPageRecord dirty_pages,
void global_dirty_log_change(unsigned int flag, bool start)
{
+ Error *local_err = NULL;
+ bool ret;
+
bql_lock();
if (start) {
- memory_global_dirty_log_start(flag);
+ ret = memory_global_dirty_log_start(flag, &local_err);
} else {
- memory_global_dirty_log_stop(flag);
+ ret = memory_global_dirty_log_stop(flag, &local_err);
+ }
+ if (!ret) {
+ error_report_err(local_err);
}
bql_unlock();
}
@@ -106,10 +112,14 @@ void global_dirty_log_change(unsigned int flag, bool start)
*/
static void global_dirty_log_sync(unsigned int flag, bool one_shot)
{
+ Error *local_err = NULL;
+
bql_lock();
memory_global_dirty_log_sync(false);
if (one_shot) {
- memory_global_dirty_log_stop(flag);
+ if (!memory_global_dirty_log_stop(flag, &local_err)) {
+ error_report_err(local_err);
+ }
}
bql_unlock();
}
@@ -608,9 +618,12 @@ static void calculate_dirtyrate_dirty_bitmap(struct DirtyRateConfig config)
{
int64_t start_time;
DirtyPageRecord dirty_pages;
+ Error *local_err = NULL;
bql_lock();
- memory_global_dirty_log_start(GLOBAL_DIRTY_DIRTY_RATE);
+ if (!memory_global_dirty_log_start(GLOBAL_DIRTY_DIRTY_RATE, &local_err)) {
+ error_report_err(local_err);
+ }
/*
* 1'round of log sync may return all 1 bits with
diff --git a/migration/ram.c b/migration/ram.c
index 20c6ad9e759b2b8ec7ae26b7ca72d5cbd20d481f..3d9c08cfae8a59031a7c1b3c70721c2a90daceba 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2390,6 +2390,7 @@ static void ram_save_cleanup(void *opaque)
{
RAMState **rsp = opaque;
RAMBlock *block;
+ Error *local_err = NULL;
/* We don't use dirty log with background snapshots */
if (!migrate_background_snapshot()) {
@@ -2402,7 +2403,10 @@ static void ram_save_cleanup(void *opaque)
* memory_global_dirty_log_stop will assert that
* memory_global_dirty_log_start/stop used in pairs
*/
- memory_global_dirty_log_stop(GLOBAL_DIRTY_MIGRATION);
+ if (!memory_global_dirty_log_stop(GLOBAL_DIRTY_MIGRATION,
+ &local_err)) {
+ error_report_err(local_err);
+ }
}
}
@@ -2799,18 +2803,31 @@ static void migration_bitmap_clear_discarded_pages(RAMState *rs)
static void ram_init_bitmaps(RAMState *rs)
{
+ Error *local_err = NULL;
+ bool ret = true;
+
qemu_mutex_lock_ramlist();
WITH_RCU_READ_LOCK_GUARD() {
ram_list_init_bitmaps();
/* We don't use dirty log with background snapshots */
if (!migrate_background_snapshot()) {
- memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION);
+ ret = memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION,
+ &local_err);
+ if (!ret) {
+ error_report_err(local_err);
+ goto out_unlock;
+ }
migration_bitmap_sync_precopy(rs, false);
}
}
+out_unlock:
qemu_mutex_unlock_ramlist();
+ if (!ret) {
+ return;
+ }
+
/*
* After an eventual first bitmap sync, fixup the initial bitmap
* containing all 1s to exclude any discarded pages from migration.
@@ -3459,6 +3476,8 @@ int colo_init_ram_cache(void)
void colo_incoming_start_dirty_log(void)
{
RAMBlock *block = NULL;
+ Error *local_err = NULL;
+
/* For memory_global_dirty_log_start below. */
bql_lock();
qemu_mutex_lock_ramlist();
@@ -3470,7 +3489,10 @@ void colo_incoming_start_dirty_log(void)
/* Discard this dirty bitmap record */
bitmap_zero(block->bmap, block->max_length >> TARGET_PAGE_BITS);
}
- memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION);
+ if (!memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION,
+ &local_err)) {
+ error_report_err(local_err);
+ }
}
ram_state->migration_dirty_pages = 0;
qemu_mutex_unlock_ramlist();
@@ -3481,8 +3503,12 @@ void colo_incoming_start_dirty_log(void)
void colo_release_ram_cache(void)
{
RAMBlock *block;
+ Error *local_err = NULL;
+
+ if (!memory_global_dirty_log_stop(GLOBAL_DIRTY_MIGRATION, &local_err)) {
+ error_report_err(local_err);
+ }
- memory_global_dirty_log_stop(GLOBAL_DIRTY_MIGRATION);
RAMBLOCK_FOREACH_NOT_IGNORED(block) {
g_free(block->bmap);
block->bmap = NULL;
diff --git a/system/memory.c b/system/memory.c
index af06157ead5b1272548e87f79ab9fb3036055328..48aed0f8ece1c731849636c442b8ab8e5d7ff6a5 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -2949,25 +2949,24 @@ static unsigned int postponed_stop_flags;
static VMChangeStateEntry *vmstate_change;
static bool memory_global_dirty_log_stop_postponed_run(Error **errp);
-void memory_global_dirty_log_start(unsigned int flags)
+bool memory_global_dirty_log_start(unsigned int flags, Error **errp)
{
+ ERRP_GUARD();
unsigned int old_flags;
- Error *local_err = NULL;
assert(flags && !(flags & (~GLOBAL_DIRTY_MASK)));
if (vmstate_change) {
/* If there is postponed stop(), operate on it first */
postponed_stop_flags &= ~flags;
- if (!memory_global_dirty_log_stop_postponed_run(&local_err)) {
- error_report_err(local_err);
- return;
+ if (!memory_global_dirty_log_stop_postponed_run(errp)) {
+ return false;
}
}
flags &= ~global_dirty_tracking;
if (!flags) {
- return;
+ return true;
}
old_flags = global_dirty_tracking;
@@ -2975,16 +2974,15 @@ void memory_global_dirty_log_start(unsigned int flags)
trace_global_dirty_changed(global_dirty_tracking);
if (!old_flags) {
- MEMORY_LISTENER_CALL_LOG_GLOBAL(log_global_start, Forward,
- &local_err);
- if (local_err) {
- error_report_err(local_err);
- return;
+ MEMORY_LISTENER_CALL_LOG_GLOBAL(log_global_start, Forward, errp);
+ if (*errp) {
+ return false;
}
memory_region_transaction_begin();
memory_region_update_pending = true;
memory_region_transaction_commit();
}
+ return true;
}
static bool memory_global_dirty_log_do_stop(unsigned int flags, Error **errp)
@@ -3040,10 +3038,8 @@ static void memory_vm_change_state_handler(void *opaque, bool running,
}
}
-void memory_global_dirty_log_stop(unsigned int flags)
+bool memory_global_dirty_log_stop(unsigned int flags, Error **errp)
{
- Error *local_err = NULL;
-
if (!runstate_is_running()) {
/* Postpone the dirty log stop, e.g., to when VM starts again */
if (vmstate_change) {
@@ -3054,12 +3050,10 @@ void memory_global_dirty_log_stop(unsigned int flags)
vmstate_change = qemu_add_vm_change_state_handler(
memory_vm_change_state_handler, NULL);
}
- return;
+ return true;
}
- if (!memory_global_dirty_log_do_stop(flags, &local_err)) {
- error_report_err(local_err);
- }
+ return memory_global_dirty_log_do_stop(flags, errp);
}
static void listener_add_address_space(MemoryListener *listener,
--
2.44.0
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v3 16/26] migration: Modify ram_init_bitmaps() to report dirty tracking errors
2024-03-04 12:28 [PATCH v3 00/26] migration: Improve error reporting Cédric Le Goater
` (14 preceding siblings ...)
2024-03-04 12:28 ` [PATCH v3 15/26] memory: Add Error** argument to the global_dirty_log routines Cédric Le Goater
@ 2024-03-04 12:28 ` Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 17/26] vfio: Add Error** argument to .set_dirty_page_tracking() handler Cédric Le Goater
` (10 subsequent siblings)
26 siblings, 0 replies; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-04 12:28 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster,
Cédric Le Goater
The .save_setup() handler has now an Error** argument that we can use
to propagate errors reported by the .log_global_start() handler. Do
that for the RAM. The caller qemu_savevm_state_setup() will store the
error under the migration stream for later detection in the migration
sequence.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
migration/ram.c | 19 +++++++++----------
1 file changed, 9 insertions(+), 10 deletions(-)
diff --git a/migration/ram.c b/migration/ram.c
index 3d9c08cfae8a59031a7c1b3c70721c2a90daceba..b12a2f24335be4b2b009aabfe431f4e5a6b4d9a9 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2801,9 +2801,8 @@ static void migration_bitmap_clear_discarded_pages(RAMState *rs)
}
}
-static void ram_init_bitmaps(RAMState *rs)
+static bool ram_init_bitmaps(RAMState *rs, Error **errp)
{
- Error *local_err = NULL;
bool ret = true;
qemu_mutex_lock_ramlist();
@@ -2812,10 +2811,8 @@ static void ram_init_bitmaps(RAMState *rs)
ram_list_init_bitmaps();
/* We don't use dirty log with background snapshots */
if (!migrate_background_snapshot()) {
- ret = memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION,
- &local_err);
+ ret = memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION, errp);
if (!ret) {
- error_report_err(local_err);
goto out_unlock;
}
migration_bitmap_sync_precopy(rs, false);
@@ -2825,7 +2822,7 @@ out_unlock:
qemu_mutex_unlock_ramlist();
if (!ret) {
- return;
+ return false;
}
/*
@@ -2833,9 +2830,10 @@ out_unlock:
* containing all 1s to exclude any discarded pages from migration.
*/
migration_bitmap_clear_discarded_pages(rs);
+ return true;
}
-static int ram_init_all(RAMState **rsp)
+static int ram_init_all(RAMState **rsp, Error **errp)
{
if (ram_state_init(rsp)) {
return -1;
@@ -2846,7 +2844,9 @@ static int ram_init_all(RAMState **rsp)
return -1;
}
- ram_init_bitmaps(*rsp);
+ if (!ram_init_bitmaps(*rsp, errp)) {
+ return -1;
+ }
return 0;
}
@@ -2961,8 +2961,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque, Error **errp)
/* migration has already setup the bitmap, reuse it. */
if (!migration_in_colo_state()) {
- if (ram_init_all(rsp) != 0) {
- error_setg(errp, "%s: failed to setup RAM for migration", __func__);
+ if (ram_init_all(rsp, errp) != 0) {
compress_threads_save_cleanup();
return -1;
}
--
2.44.0
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v3 17/26] vfio: Add Error** argument to .set_dirty_page_tracking() handler
2024-03-04 12:28 [PATCH v3 00/26] migration: Improve error reporting Cédric Le Goater
` (15 preceding siblings ...)
2024-03-04 12:28 ` [PATCH v3 16/26] migration: Modify ram_init_bitmaps() to report dirty tracking errors Cédric Le Goater
@ 2024-03-04 12:28 ` Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 18/26] vfio: Add Error** argument to vfio_devices_dma_logging_start() Cédric Le Goater
` (9 subsequent siblings)
26 siblings, 0 replies; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-04 12:28 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster,
Cédric Le Goater
We will use the Error object to improve error reporting in the
.log_global*() handlers of VFIO. Add documentation while at it.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
Changes in v3:
- Use error_setg_errno() in vfio_legacy_set_dirty_page_tracking()
include/hw/vfio/vfio-container-base.h | 18 ++++++++++++++++--
hw/vfio/common.c | 4 ++--
hw/vfio/container-base.c | 4 ++--
hw/vfio/container.c | 6 +++---
4 files changed, 23 insertions(+), 9 deletions(-)
diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h
index 3582d5f97a37877b2adfc0d0b06996c82403f8b7..c76984654a596e3016a8cf833e10143eb872e102 100644
--- a/include/hw/vfio/vfio-container-base.h
+++ b/include/hw/vfio/vfio-container-base.h
@@ -82,7 +82,7 @@ int vfio_container_add_section_window(VFIOContainerBase *bcontainer,
void vfio_container_del_section_window(VFIOContainerBase *bcontainer,
MemoryRegionSection *section);
int vfio_container_set_dirty_page_tracking(VFIOContainerBase *bcontainer,
- bool start);
+ bool start, Error **errp);
int vfio_container_query_dirty_bitmap(const VFIOContainerBase *bcontainer,
VFIOBitmap *vbmap,
hwaddr iova, hwaddr size);
@@ -121,9 +121,23 @@ struct VFIOIOMMUClass {
int (*attach_device)(const char *name, VFIODevice *vbasedev,
AddressSpace *as, Error **errp);
void (*detach_device)(VFIODevice *vbasedev);
+
/* migration feature */
+
+ /**
+ * @set_dirty_page_tracking
+ *
+ * Start or stop dirty pages tracking on VFIO container
+ *
+ * @bcontainer: #VFIOContainerBase on which to de/activate dirty
+ * pages tracking
+ * @start: indicates whether to start or stop dirty pages tracking
+ * @errp: pointer to Error*, to store an error if it happens.
+ *
+ * Returns zero to indicate success and negative for error
+ */
int (*set_dirty_page_tracking)(const VFIOContainerBase *bcontainer,
- bool start);
+ bool start, Error **errp);
int (*query_dirty_bitmap)(const VFIOContainerBase *bcontainer,
VFIOBitmap *vbmap,
hwaddr iova, hwaddr size);
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 71352ad6532cad02f3a2ab53bdd531d769ea55b2..9c5bc76414487513775fa8e559f97c12ed9c5fda 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1085,7 +1085,7 @@ static bool vfio_listener_log_global_start(MemoryListener *listener,
if (vfio_devices_all_device_dirty_tracking(bcontainer)) {
ret = vfio_devices_dma_logging_start(bcontainer);
} else {
- ret = vfio_container_set_dirty_page_tracking(bcontainer, true);
+ ret = vfio_container_set_dirty_page_tracking(bcontainer, true, NULL);
}
if (ret) {
@@ -1106,7 +1106,7 @@ static bool vfio_listener_log_global_stop(MemoryListener *listener,
if (vfio_devices_all_device_dirty_tracking(bcontainer)) {
vfio_devices_dma_logging_stop(bcontainer);
} else {
- ret = vfio_container_set_dirty_page_tracking(bcontainer, false);
+ ret = vfio_container_set_dirty_page_tracking(bcontainer, false, NULL);
}
if (ret) {
diff --git a/hw/vfio/container-base.c b/hw/vfio/container-base.c
index 913ae49077c4f09b7b27517c1231cfbe4befb7fb..7c0764121d24b02b6c4e66e368d7dff78a6d65aa 100644
--- a/hw/vfio/container-base.c
+++ b/hw/vfio/container-base.c
@@ -53,14 +53,14 @@ void vfio_container_del_section_window(VFIOContainerBase *bcontainer,
}
int vfio_container_set_dirty_page_tracking(VFIOContainerBase *bcontainer,
- bool start)
+ bool start, Error **errp)
{
if (!bcontainer->dirty_pages_supported) {
return 0;
}
g_assert(bcontainer->ops->set_dirty_page_tracking);
- return bcontainer->ops->set_dirty_page_tracking(bcontainer, start);
+ return bcontainer->ops->set_dirty_page_tracking(bcontainer, start, errp);
}
int vfio_container_query_dirty_bitmap(const VFIOContainerBase *bcontainer,
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index 096d77eac3946a9c38fc2a98116b93353f71f06e..6524575aeddcea8470b5fd10caf57475088d1813 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -210,7 +210,7 @@ static int vfio_legacy_dma_map(const VFIOContainerBase *bcontainer, hwaddr iova,
static int
vfio_legacy_set_dirty_page_tracking(const VFIOContainerBase *bcontainer,
- bool start)
+ bool start, Error **errp)
{
const VFIOContainer *container = container_of(bcontainer, VFIOContainer,
bcontainer);
@@ -228,8 +228,8 @@ vfio_legacy_set_dirty_page_tracking(const VFIOContainerBase *bcontainer,
ret = ioctl(container->fd, VFIO_IOMMU_DIRTY_PAGES, &dirty);
if (ret) {
ret = -errno;
- error_report("Failed to set dirty tracking flag 0x%x errno: %d",
- dirty.flags, errno);
+ error_setg_errno(errp, errno, "Failed to set dirty tracking flag 0x%x",
+ dirty.flags);
}
return ret;
--
2.44.0
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v3 18/26] vfio: Add Error** argument to vfio_devices_dma_logging_start()
2024-03-04 12:28 [PATCH v3 00/26] migration: Improve error reporting Cédric Le Goater
` (16 preceding siblings ...)
2024-03-04 12:28 ` [PATCH v3 17/26] vfio: Add Error** argument to .set_dirty_page_tracking() handler Cédric Le Goater
@ 2024-03-04 12:28 ` Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 19/26] vfio: Add Error** argument to vfio_devices_dma_logging_stop() Cédric Le Goater
` (8 subsequent siblings)
26 siblings, 0 replies; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-04 12:28 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster,
Cédric Le Goater
This allows to update the Error argument of the VFIO log_global_start()
handler. Errors detected when device level logging is started will be
propagated up to qemu_savevm_state_setup() when the ram save_setup()
handler is executed.
The vfio_set_migration_error() call becomes redundant. Remove it.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
Changes in v3:
- Use error_setg_errno() in vfio_devices_dma_logging_start()
- ERRP_GUARD() because of error_prepend use in
vfio_listener_log_global_start()
hw/vfio/common.c | 23 +++++++++++------------
1 file changed, 11 insertions(+), 12 deletions(-)
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 9c5bc76414487513775fa8e559f97c12ed9c5fda..70e7c288f5f61f850fc9bbd204ecbe43c859f968 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1036,7 +1036,8 @@ static void vfio_device_feature_dma_logging_start_destroy(
g_free(feature);
}
-static int vfio_devices_dma_logging_start(VFIOContainerBase *bcontainer)
+static int vfio_devices_dma_logging_start(VFIOContainerBase *bcontainer,
+ Error **errp)
{
struct vfio_device_feature *feature;
VFIODirtyRanges ranges;
@@ -1058,8 +1059,8 @@ static int vfio_devices_dma_logging_start(VFIOContainerBase *bcontainer)
ret = ioctl(vbasedev->fd, VFIO_DEVICE_FEATURE, feature);
if (ret) {
ret = -errno;
- error_report("%s: Failed to start DMA logging, err %d (%s)",
- vbasedev->name, ret, strerror(errno));
+ error_setg_errno(errp, errno, "%s: Failed to start DMA logging",
+ vbasedev->name);
goto out;
}
vbasedev->dirty_tracking = true;
@@ -1078,20 +1079,19 @@ out:
static bool vfio_listener_log_global_start(MemoryListener *listener,
Error **errp)
{
+ ERRP_GUARD(); /* error_prepend use */
VFIOContainerBase *bcontainer = container_of(listener, VFIOContainerBase,
listener);
int ret;
if (vfio_devices_all_device_dirty_tracking(bcontainer)) {
- ret = vfio_devices_dma_logging_start(bcontainer);
+ ret = vfio_devices_dma_logging_start(bcontainer, errp);
} else {
- ret = vfio_container_set_dirty_page_tracking(bcontainer, true, NULL);
+ ret = vfio_container_set_dirty_page_tracking(bcontainer, true, errp);
}
if (ret) {
- error_report("vfio: Could not start dirty page tracking, err: %d (%s)",
- ret, strerror(-ret));
- vfio_set_migration_error(ret);
+ error_prepend(errp, "vfio: Could not start dirty page tracking - ");
}
return !ret;
}
@@ -1099,6 +1099,7 @@ static bool vfio_listener_log_global_start(MemoryListener *listener,
static bool vfio_listener_log_global_stop(MemoryListener *listener,
Error **errp)
{
+ ERRP_GUARD(); /* error_prepend use */
VFIOContainerBase *bcontainer = container_of(listener, VFIOContainerBase,
listener);
int ret = 0;
@@ -1106,13 +1107,11 @@ static bool vfio_listener_log_global_stop(MemoryListener *listener,
if (vfio_devices_all_device_dirty_tracking(bcontainer)) {
vfio_devices_dma_logging_stop(bcontainer);
} else {
- ret = vfio_container_set_dirty_page_tracking(bcontainer, false, NULL);
+ ret = vfio_container_set_dirty_page_tracking(bcontainer, false, errp);
}
if (ret) {
- error_report("vfio: Could not stop dirty page tracking, err: %d (%s)",
- ret, strerror(-ret));
- vfio_set_migration_error(ret);
+ error_prepend(errp, "vfio: Could not stop dirty page tracking - ");
}
return !ret;
}
--
2.44.0
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v3 19/26] vfio: Add Error** argument to vfio_devices_dma_logging_stop()
2024-03-04 12:28 [PATCH v3 00/26] migration: Improve error reporting Cédric Le Goater
` (17 preceding siblings ...)
2024-03-04 12:28 ` [PATCH v3 18/26] vfio: Add Error** argument to vfio_devices_dma_logging_start() Cédric Le Goater
@ 2024-03-04 12:28 ` Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 20/26] vfio: Use new Error** argument in vfio_save_setup() Cédric Le Goater
` (7 subsequent siblings)
26 siblings, 0 replies; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-04 12:28 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster,
Cédric Le Goater
This improves error reporting in the log_global_stop() VFIO handler.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
Changes in v3:
- Use error_setg_errno() in vfio_devices_dma_logging_stop()
hw/vfio/common.c | 19 ++++++++++++++-----
1 file changed, 14 insertions(+), 5 deletions(-)
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 70e7c288f5f61f850fc9bbd204ecbe43c859f968..8fa232538d482f094643e0f1601b8ebe25fe077f 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -938,12 +938,14 @@ static void vfio_dirty_tracking_init(VFIOContainerBase *bcontainer,
memory_listener_unregister(&dirty.listener);
}
-static void vfio_devices_dma_logging_stop(VFIOContainerBase *bcontainer)
+static int vfio_devices_dma_logging_stop(VFIOContainerBase *bcontainer,
+ Error **errp)
{
uint64_t buf[DIV_ROUND_UP(sizeof(struct vfio_device_feature),
sizeof(uint64_t))] = {};
struct vfio_device_feature *feature = (struct vfio_device_feature *)buf;
VFIODevice *vbasedev;
+ int ret = 0;
feature->argsz = sizeof(buf);
feature->flags = VFIO_DEVICE_FEATURE_SET |
@@ -955,11 +957,17 @@ static void vfio_devices_dma_logging_stop(VFIOContainerBase *bcontainer)
}
if (ioctl(vbasedev->fd, VFIO_DEVICE_FEATURE, feature)) {
- warn_report("%s: Failed to stop DMA logging, err %d (%s)",
- vbasedev->name, -errno, strerror(errno));
+ /* Keep first error */
+ if (!ret) {
+ ret = -errno;
+ error_setg_errno(errp, errno, "%s: Failed to stop DMA logging",
+ vbasedev->name);
+ }
}
vbasedev->dirty_tracking = false;
}
+
+ return ret;
}
static struct vfio_device_feature *
@@ -1068,7 +1076,8 @@ static int vfio_devices_dma_logging_start(VFIOContainerBase *bcontainer,
out:
if (ret) {
- vfio_devices_dma_logging_stop(bcontainer);
+ /* Ignore the potential errors when doing rollback */
+ vfio_devices_dma_logging_stop(bcontainer, NULL);
}
vfio_device_feature_dma_logging_start_destroy(feature);
@@ -1105,7 +1114,7 @@ static bool vfio_listener_log_global_stop(MemoryListener *listener,
int ret = 0;
if (vfio_devices_all_device_dirty_tracking(bcontainer)) {
- vfio_devices_dma_logging_stop(bcontainer);
+ ret = vfio_devices_dma_logging_stop(bcontainer, errp);
} else {
ret = vfio_container_set_dirty_page_tracking(bcontainer, false, errp);
}
--
2.44.0
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v3 20/26] vfio: Use new Error** argument in vfio_save_setup()
2024-03-04 12:28 [PATCH v3 00/26] migration: Improve error reporting Cédric Le Goater
` (18 preceding siblings ...)
2024-03-04 12:28 ` [PATCH v3 19/26] vfio: Add Error** argument to vfio_devices_dma_logging_stop() Cédric Le Goater
@ 2024-03-04 12:28 ` Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 21/26] vfio: Add Error** argument to .vfio_save_config() handler Cédric Le Goater
` (6 subsequent siblings)
26 siblings, 0 replies; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-04 12:28 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster,
Cédric Le Goater
Add an Error** argument to vfio_migration_set_state() and adjust
callers, including vfio_save_setup(). The error will be propagated up
to qemu_savevm_state_setup() where the save_setup() handler is
executed.
Modify vfio_vmstate_change_prepare() and vfio_vmstate_change() to
store a reported error under the migration stream if a migration is in
progress.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
Changes in v3:
- Use error_setg_errno() in vfio_save_setup()
- Made sure an error is always set in case of failure in
vfio_load_setup()
hw/vfio/migration.c | 67 ++++++++++++++++++++++++++-------------------
1 file changed, 39 insertions(+), 28 deletions(-)
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 20eb7dad1a909d49a810294d87a07262ec5f1e9d..a67c3e457aa8255e7e39db5127627524d75a55dd 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -84,7 +84,8 @@ static const char *mig_state_to_str(enum vfio_device_mig_state state)
static int vfio_migration_set_state(VFIODevice *vbasedev,
enum vfio_device_mig_state new_state,
- enum vfio_device_mig_state recover_state)
+ enum vfio_device_mig_state recover_state,
+ Error **errp)
{
VFIOMigration *migration = vbasedev->migration;
uint64_t buf[DIV_ROUND_UP(sizeof(struct vfio_device_feature) +
@@ -104,15 +105,15 @@ static int vfio_migration_set_state(VFIODevice *vbasedev,
ret = -errno;
if (recover_state == VFIO_DEVICE_STATE_ERROR) {
- error_report("%s: Failed setting device state to %s, err: %s. "
- "Recover state is ERROR. Resetting device",
- vbasedev->name, mig_state_to_str(new_state),
- strerror(errno));
+ error_setg(errp, "%s: Failed setting device state to %s, err: %s. "
+ "Recover state is ERROR. Resetting device",
+ vbasedev->name, mig_state_to_str(new_state),
+ strerror(errno));
goto reset_device;
}
- error_report(
+ error_setg(errp,
"%s: Failed setting device state to %s, err: %s. Setting device in recover state %s",
vbasedev->name, mig_state_to_str(new_state),
strerror(errno), mig_state_to_str(recover_state));
@@ -120,7 +121,7 @@ static int vfio_migration_set_state(VFIODevice *vbasedev,
mig_state->device_state = recover_state;
if (ioctl(vbasedev->fd, VFIO_DEVICE_FEATURE, feature)) {
ret = -errno;
- error_report(
+ error_setg(errp,
"%s: Failed setting device in recover state, err: %s. Resetting device",
vbasedev->name, strerror(errno));
@@ -139,7 +140,7 @@ static int vfio_migration_set_state(VFIODevice *vbasedev,
* This can happen if the device is asynchronously reset and
* terminates a data transfer.
*/
- error_report("%s: data_fd out of sync", vbasedev->name);
+ error_setg(errp, "%s: data_fd out of sync", vbasedev->name);
close(mig_state->data_fd);
return -EBADF;
@@ -170,10 +171,11 @@ reset_device:
*/
static int
vfio_migration_set_state_or_reset(VFIODevice *vbasedev,
- enum vfio_device_mig_state new_state)
+ enum vfio_device_mig_state new_state,
+ Error **errp)
{
return vfio_migration_set_state(vbasedev, new_state,
- VFIO_DEVICE_STATE_ERROR);
+ VFIO_DEVICE_STATE_ERROR, errp);
}
static int vfio_load_buffer(QEMUFile *f, VFIODevice *vbasedev,
@@ -401,10 +403,8 @@ static int vfio_save_setup(QEMUFile *f, void *opaque, Error **errp)
switch (migration->device_state) {
case VFIO_DEVICE_STATE_RUNNING:
ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_PRE_COPY,
- VFIO_DEVICE_STATE_RUNNING);
+ VFIO_DEVICE_STATE_RUNNING, errp);
if (ret) {
- error_setg(errp, "%s: Failed to set new RUNNING state",
- vbasedev->name);
return ret;
}
@@ -437,13 +437,20 @@ static void vfio_save_cleanup(void *opaque)
{
VFIODevice *vbasedev = opaque;
VFIOMigration *migration = vbasedev->migration;
+ Error *local_err = NULL;
+ int ret;
/*
* Changing device state from STOP_COPY to STOP can take time. Do it here,
* after migration has completed, so it won't increase downtime.
*/
if (migration->device_state == VFIO_DEVICE_STATE_STOP_COPY) {
- vfio_migration_set_state_or_reset(vbasedev, VFIO_DEVICE_STATE_STOP);
+ ret = vfio_migration_set_state_or_reset(vbasedev,
+ VFIO_DEVICE_STATE_STOP,
+ &local_err);
+ if (ret) {
+ error_report_err(local_err);
+ }
}
g_free(migration->data_buffer);
@@ -549,11 +556,13 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
VFIODevice *vbasedev = opaque;
ssize_t data_size;
int ret;
+ Error *local_err = NULL;
/* We reach here with device state STOP or STOP_COPY only */
ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_STOP_COPY,
- VFIO_DEVICE_STATE_STOP);
+ VFIO_DEVICE_STATE_STOP, &local_err);
if (ret) {
+ error_report_err(local_err);
return ret;
}
@@ -591,14 +600,9 @@ static void vfio_save_state(QEMUFile *f, void *opaque)
static int vfio_load_setup(QEMUFile *f, void *opaque, Error **errp)
{
VFIODevice *vbasedev = opaque;
- int ret;
- ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_RESUMING,
- vbasedev->migration->device_state);
- if (ret) {
- error_setg(errp, "%s: Failed to set RESUMING state", vbasedev->name);
- }
- return ret;
+ return vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_RESUMING,
+ vbasedev->migration->device_state, errp);
}
static int vfio_load_cleanup(void *opaque)
@@ -714,20 +718,22 @@ static void vfio_vmstate_change_prepare(void *opaque, bool running,
VFIODevice *vbasedev = opaque;
VFIOMigration *migration = vbasedev->migration;
enum vfio_device_mig_state new_state;
+ Error *local_err = NULL;
int ret;
new_state = migration->device_state == VFIO_DEVICE_STATE_PRE_COPY ?
VFIO_DEVICE_STATE_PRE_COPY_P2P :
VFIO_DEVICE_STATE_RUNNING_P2P;
- ret = vfio_migration_set_state_or_reset(vbasedev, new_state);
+ ret = vfio_migration_set_state_or_reset(vbasedev, new_state, &local_err);
if (ret) {
/*
* Migration should be aborted in this case, but vm_state_notify()
* currently does not support reporting failures.
*/
if (migrate_get_current()->to_dst_file) {
- qemu_file_set_error(migrate_get_current()->to_dst_file, ret);
+ qemu_file_set_error_obj(migrate_get_current()->to_dst_file, ret,
+ local_err);
}
}
@@ -740,6 +746,7 @@ static void vfio_vmstate_change(void *opaque, bool running, RunState state)
{
VFIODevice *vbasedev = opaque;
enum vfio_device_mig_state new_state;
+ Error *local_err = NULL;
int ret;
if (running) {
@@ -752,14 +759,15 @@ static void vfio_vmstate_change(void *opaque, bool running, RunState state)
VFIO_DEVICE_STATE_STOP;
}
- ret = vfio_migration_set_state_or_reset(vbasedev, new_state);
+ ret = vfio_migration_set_state_or_reset(vbasedev, new_state, &local_err);
if (ret) {
/*
* Migration should be aborted in this case, but vm_state_notify()
* currently does not support reporting failures.
*/
if (migrate_get_current()->to_dst_file) {
- qemu_file_set_error(migrate_get_current()->to_dst_file, ret);
+ qemu_file_set_error_obj(migrate_get_current()->to_dst_file, ret,
+ local_err);
}
}
@@ -773,13 +781,16 @@ static int vfio_migration_state_notifier(NotifierWithReturn *notifier,
VFIOMigration *migration = container_of(notifier, VFIOMigration,
migration_state);
VFIODevice *vbasedev = migration->vbasedev;
+ int ret = 0;
trace_vfio_migration_state_notifier(vbasedev->name, e->type);
if (e->type == MIG_EVENT_PRECOPY_FAILED) {
- vfio_migration_set_state_or_reset(vbasedev, VFIO_DEVICE_STATE_RUNNING);
+ ret = vfio_migration_set_state_or_reset(vbasedev,
+ VFIO_DEVICE_STATE_RUNNING,
+ errp);
}
- return 0;
+ return ret;
}
static void vfio_migration_free(VFIODevice *vbasedev)
--
2.44.0
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v3 21/26] vfio: Add Error** argument to .vfio_save_config() handler
2024-03-04 12:28 [PATCH v3 00/26] migration: Improve error reporting Cédric Le Goater
` (19 preceding siblings ...)
2024-03-04 12:28 ` [PATCH v3 20/26] vfio: Use new Error** argument in vfio_save_setup() Cédric Le Goater
@ 2024-03-04 12:28 ` Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 22/26] vfio: Reverse test on vfio_get_dirty_bitmap() Cédric Le Goater
` (5 subsequent siblings)
26 siblings, 0 replies; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-04 12:28 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster,
Cédric Le Goater
Use vmstate_save_state_with_err() to improve error reporting in the
callers and store a reported error under the migration stream. Add
documentation while at it.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
include/hw/vfio/vfio-common.h | 25 ++++++++++++++++++++++++-
hw/vfio/migration.c | 18 ++++++++++++------
hw/vfio/pci.c | 5 +++--
3 files changed, 39 insertions(+), 9 deletions(-)
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index b9da6c08ef41174610eb92726c590309a53696a3..46f88493634b5634a9c14a5caa33a463fbf2c50d 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -133,7 +133,30 @@ struct VFIODeviceOps {
int (*vfio_hot_reset_multi)(VFIODevice *vdev);
void (*vfio_eoi)(VFIODevice *vdev);
Object *(*vfio_get_object)(VFIODevice *vdev);
- void (*vfio_save_config)(VFIODevice *vdev, QEMUFile *f);
+
+ /**
+ * @vfio_save_config
+ *
+ * Save device config state
+ *
+ * @vdev: #VFIODevice for which to save the config
+ * @f: #QEMUFile where to send the data
+ * @errp: pointer to Error*, to store an error if it happens.
+ *
+ * Returns zero to indicate success and negative for error
+ */
+ int (*vfio_save_config)(VFIODevice *vdev, QEMUFile *f, Error **errp);
+
+ /**
+ * @vfio_load_config
+ *
+ * Load device config state
+ *
+ * @vdev: #VFIODevice for which to load the config
+ * @f: #QEMUFile where to get the data
+ *
+ * Returns zero to indicate success and negative for error
+ */
int (*vfio_load_config)(VFIODevice *vdev, QEMUFile *f);
};
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index a67c3e457aa8255e7e39db5127627524d75a55dd..4c4ccbaed10b82e7baa992770ec0ec5a6c82b757 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -190,14 +190,19 @@ static int vfio_load_buffer(QEMUFile *f, VFIODevice *vbasedev,
return ret;
}
-static int vfio_save_device_config_state(QEMUFile *f, void *opaque)
+static int vfio_save_device_config_state(QEMUFile *f, void *opaque,
+ Error **errp)
{
VFIODevice *vbasedev = opaque;
+ int ret;
qemu_put_be64(f, VFIO_MIG_FLAG_DEV_CONFIG_STATE);
if (vbasedev->ops && vbasedev->ops->vfio_save_config) {
- vbasedev->ops->vfio_save_config(vbasedev, f);
+ ret = vbasedev->ops->vfio_save_config(vbasedev, f, errp);
+ if (ret) {
+ return ret;
+ }
}
qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
@@ -587,13 +592,14 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
static void vfio_save_state(QEMUFile *f, void *opaque)
{
VFIODevice *vbasedev = opaque;
+ Error *local_err = NULL;
int ret;
- ret = vfio_save_device_config_state(f, opaque);
+ ret = vfio_save_device_config_state(f, opaque, &local_err);
if (ret) {
- error_report("%s: Failed to save device config space",
- vbasedev->name);
- qemu_file_set_error(f, ret);
+ error_prepend(&local_err, "%s: Failed to save device config space",
+ vbasedev->name);
+ qemu_file_set_error_obj(f, ret, local_err);
}
}
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 4fa387f0430d62ca2ba1b5ae5b7037f8f06b33f9..99d86e1d40ef25133fc76ad6e58294b07bd20843 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2585,11 +2585,12 @@ const VMStateDescription vmstate_vfio_pci_config = {
}
};
-static void vfio_pci_save_config(VFIODevice *vbasedev, QEMUFile *f)
+static int vfio_pci_save_config(VFIODevice *vbasedev, QEMUFile *f, Error **errp)
{
VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
- vmstate_save_state(f, &vmstate_vfio_pci_config, vdev, NULL);
+ return vmstate_save_state_with_err(f, &vmstate_vfio_pci_config, vdev, NULL,
+ errp);
}
static int vfio_pci_load_config(VFIODevice *vbasedev, QEMUFile *f)
--
2.44.0
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v3 22/26] vfio: Reverse test on vfio_get_dirty_bitmap()
2024-03-04 12:28 [PATCH v3 00/26] migration: Improve error reporting Cédric Le Goater
` (20 preceding siblings ...)
2024-03-04 12:28 ` [PATCH v3 21/26] vfio: Add Error** argument to .vfio_save_config() handler Cédric Le Goater
@ 2024-03-04 12:28 ` Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 23/26] memory: Add Error** argument to memory_get_xlat_addr() Cédric Le Goater
` (4 subsequent siblings)
26 siblings, 0 replies; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-04 12:28 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster,
Cédric Le Goater
It will simplify the changes coming after.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
hw/vfio/common.c | 22 +++++++++++++---------
1 file changed, 13 insertions(+), 9 deletions(-)
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 8fa232538d482f094643e0f1601b8ebe25fe077f..3c01b4c78ac73665d3c9f2322291a19ef8f71e76 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1239,16 +1239,20 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
}
rcu_read_lock();
- if (vfio_get_xlat_addr(iotlb, NULL, &translated_addr, NULL)) {
- ret = vfio_get_dirty_bitmap(bcontainer, iova, iotlb->addr_mask + 1,
- translated_addr);
- if (ret) {
- error_report("vfio_iommu_map_dirty_notify(%p, 0x%"HWADDR_PRIx", "
- "0x%"HWADDR_PRIx") = %d (%s)",
- bcontainer, iova, iotlb->addr_mask + 1, ret,
- strerror(-ret));
- }
+ if (!vfio_get_xlat_addr(iotlb, NULL, &translated_addr, NULL)) {
+ goto out_lock;
}
+
+ ret = vfio_get_dirty_bitmap(bcontainer, iova, iotlb->addr_mask + 1,
+ translated_addr);
+ if (ret) {
+ error_report("vfio_iommu_map_dirty_notify(%p, 0x%"HWADDR_PRIx", "
+ "0x%"HWADDR_PRIx") = %d (%s)",
+ bcontainer, iova, iotlb->addr_mask + 1, ret,
+ strerror(-ret));
+ }
+
+out_lock:
rcu_read_unlock();
out:
--
2.44.0
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v3 23/26] memory: Add Error** argument to memory_get_xlat_addr()
2024-03-04 12:28 [PATCH v3 00/26] migration: Improve error reporting Cédric Le Goater
` (21 preceding siblings ...)
2024-03-04 12:28 ` [PATCH v3 22/26] vfio: Reverse test on vfio_get_dirty_bitmap() Cédric Le Goater
@ 2024-03-04 12:28 ` Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 24/26] vfio: Add Error** argument to .get_dirty_bitmap() handler Cédric Le Goater
` (3 subsequent siblings)
26 siblings, 0 replies; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-04 12:28 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster,
Cédric Le Goater, Michael S. Tsirkin, Paolo Bonzini,
David Hildenbrand
Let the callers do the reporting. This will be useful in
vfio_iommu_map_dirty_notify().
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
include/exec/memory.h | 15 ++++++++++++++-
hw/vfio/common.c | 13 +++++++++----
hw/virtio/vhost-vdpa.c | 5 ++++-
system/memory.c | 10 +++++-----
4 files changed, 32 insertions(+), 11 deletions(-)
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 8b019465ab13ce85c03075c80865a0865ea1feed..4f700eed598ba2bec7b75630e75483305ac6469f 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -771,9 +771,22 @@ void ram_discard_manager_register_listener(RamDiscardManager *rdm,
void ram_discard_manager_unregister_listener(RamDiscardManager *rdm,
RamDiscardListener *rdl);
+/**
+ * memory_get_xlat_addr: Extract addresses from a TLB entry
+ *
+ * @iotlb: pointer to an #IOMMUTLBEntry
+ * @vaddr: virtual addressf
+ * @ram_addr: RAM address
+ * @read_only: indicates if writes are allowed
+ * @mr_has_discard_manager: indicates memory is controlled by a
+ * RamDiscardManager
+ * @errp: pointer to Error*, to store an error if it happens.
+ *
+ * Return: true on success, else false setting @errp with error.
+ */
bool memory_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
ram_addr_t *ram_addr, bool *read_only,
- bool *mr_has_discard_manager);
+ bool *mr_has_discard_manager, Error **errp);
typedef struct CoalescedMemoryRange CoalescedMemoryRange;
typedef struct MemoryRegionIoeventfd MemoryRegionIoeventfd;
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 3c01b4c78ac73665d3c9f2322291a19ef8f71e76..d4f6e695b671eeead2284d2116740d9eac31a5a3 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -262,12 +262,13 @@ static bool vfio_listener_skipped_section(MemoryRegionSection *section)
/* Called with rcu_read_lock held. */
static bool vfio_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
- ram_addr_t *ram_addr, bool *read_only)
+ ram_addr_t *ram_addr, bool *read_only,
+ Error **errp)
{
bool ret, mr_has_discard_manager;
ret = memory_get_xlat_addr(iotlb, vaddr, ram_addr, read_only,
- &mr_has_discard_manager);
+ &mr_has_discard_manager, errp);
if (ret && mr_has_discard_manager) {
/*
* Malicious VMs might trigger discarding of IOMMU-mapped memory. The
@@ -297,6 +298,7 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
hwaddr iova = iotlb->iova + giommu->iommu_offset;
void *vaddr;
int ret;
+ Error *local_err = NULL;
trace_vfio_iommu_map_notify(iotlb->perm == IOMMU_NONE ? "UNMAP" : "MAP",
iova, iova + iotlb->addr_mask);
@@ -313,7 +315,8 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) {
bool read_only;
- if (!vfio_get_xlat_addr(iotlb, &vaddr, NULL, &read_only)) {
+ if (!vfio_get_xlat_addr(iotlb, &vaddr, NULL, &read_only, &local_err)) {
+ error_report_err(local_err);
goto out;
}
/*
@@ -1228,6 +1231,7 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
VFIOContainerBase *bcontainer = giommu->bcontainer;
hwaddr iova = iotlb->iova + giommu->iommu_offset;
ram_addr_t translated_addr;
+ Error *local_err = NULL;
int ret = -EINVAL;
trace_vfio_iommu_map_dirty_notify(iova, iova + iotlb->addr_mask);
@@ -1239,7 +1243,8 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
}
rcu_read_lock();
- if (!vfio_get_xlat_addr(iotlb, NULL, &translated_addr, NULL)) {
+ if (!vfio_get_xlat_addr(iotlb, NULL, &translated_addr, NULL, &local_err)) {
+ error_report_err(local_err);
goto out_lock;
}
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index ddae494ca8e8154ce03b88bc781fe9f1e639aceb..a6f06266cfc798b20b98001fa97ce771722175ec 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -203,6 +203,7 @@ static void vhost_vdpa_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
void *vaddr;
int ret;
Int128 llend;
+ Error *local_err = NULL;
if (iotlb->target_as != &address_space_memory) {
error_report("Wrong target AS \"%s\", only system memory is allowed",
@@ -222,7 +223,9 @@ static void vhost_vdpa_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) {
bool read_only;
- if (!memory_get_xlat_addr(iotlb, &vaddr, NULL, &read_only, NULL)) {
+ if (!memory_get_xlat_addr(iotlb, &vaddr, NULL, &read_only, NULL,
+ &local_err)) {
+ error_report_err(local_err);
return;
}
ret = vhost_vdpa_dma_map(s, VHOST_VDPA_GUEST_PA_ASID, iova,
diff --git a/system/memory.c b/system/memory.c
index 48aed0f8ece1c731849636c442b8ab8e5d7ff6a5..132c026e35cbeb0ab8fa0fe64bb9db50f6024e0d 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -2203,7 +2203,7 @@ void ram_discard_manager_unregister_listener(RamDiscardManager *rdm,
/* Called with rcu_read_lock held. */
bool memory_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
ram_addr_t *ram_addr, bool *read_only,
- bool *mr_has_discard_manager)
+ bool *mr_has_discard_manager, Error **errp)
{
MemoryRegion *mr;
hwaddr xlat;
@@ -2221,7 +2221,7 @@ bool memory_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
mr = address_space_translate(&address_space_memory, iotlb->translated_addr,
&xlat, &len, writable, MEMTXATTRS_UNSPECIFIED);
if (!memory_region_is_ram(mr)) {
- error_report("iommu map to non memory area %" HWADDR_PRIx "", xlat);
+ error_setg(errp, "iommu map to non memory area %" HWADDR_PRIx "", xlat);
return false;
} else if (memory_region_has_ram_discard_manager(mr)) {
RamDiscardManager *rdm = memory_region_get_ram_discard_manager(mr);
@@ -2240,8 +2240,8 @@ bool memory_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
* were already restored before IOMMUs are restored.
*/
if (!ram_discard_manager_is_populated(rdm, &tmp)) {
- error_report("iommu map to discarded memory (e.g., unplugged via"
- " virtio-mem): %" HWADDR_PRIx "",
+ error_setg(errp, "iommu map to discarded memory (e.g., unplugged"
+ " via virtio-mem): %" HWADDR_PRIx "",
iotlb->translated_addr);
return false;
}
@@ -2252,7 +2252,7 @@ bool memory_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
* check that it did not truncate too much.
*/
if (len & iotlb->addr_mask) {
- error_report("iommu has granularity incompatible with target AS");
+ error_setg(errp, "iommu has granularity incompatible with target AS");
return false;
}
--
2.44.0
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v3 24/26] vfio: Add Error** argument to .get_dirty_bitmap() handler
2024-03-04 12:28 [PATCH v3 00/26] migration: Improve error reporting Cédric Le Goater
` (22 preceding siblings ...)
2024-03-04 12:28 ` [PATCH v3 23/26] memory: Add Error** argument to memory_get_xlat_addr() Cédric Le Goater
@ 2024-03-04 12:28 ` Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 25/26] vfio: Also trace event failures in vfio_save_complete_precopy() Cédric Le Goater
` (2 subsequent siblings)
26 siblings, 0 replies; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-04 12:28 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster,
Cédric Le Goater
Let the callers do the error reporting. Add documentation while at it.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
include/hw/vfio/vfio-common.h | 4 +-
include/hw/vfio/vfio-container-base.h | 17 +++++++-
hw/vfio/common.c | 59 ++++++++++++++++++---------
hw/vfio/container-base.c | 5 ++-
hw/vfio/container.c | 13 +++---
5 files changed, 67 insertions(+), 31 deletions(-)
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 46f88493634b5634a9c14a5caa33a463fbf2c50d..68911d36676667352e94a97895828aff4b194b57 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -274,9 +274,9 @@ bool
vfio_devices_all_device_dirty_tracking(const VFIOContainerBase *bcontainer);
int vfio_devices_query_dirty_bitmap(const VFIOContainerBase *bcontainer,
VFIOBitmap *vbmap, hwaddr iova,
- hwaddr size);
+ hwaddr size, Error **errp);
int vfio_get_dirty_bitmap(const VFIOContainerBase *bcontainer, uint64_t iova,
- uint64_t size, ram_addr_t ram_addr);
+ uint64_t size, ram_addr_t ram_addr, Error **errp);
/* Returns 0 on success, or a negative errno. */
int vfio_device_get_name(VFIODevice *vbasedev, Error **errp);
diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h
index c76984654a596e3016a8cf833e10143eb872e102..ebc49ebfbe7de862450941b1129faad5d62b3769 100644
--- a/include/hw/vfio/vfio-container-base.h
+++ b/include/hw/vfio/vfio-container-base.h
@@ -85,7 +85,7 @@ int vfio_container_set_dirty_page_tracking(VFIOContainerBase *bcontainer,
bool start, Error **errp);
int vfio_container_query_dirty_bitmap(const VFIOContainerBase *bcontainer,
VFIOBitmap *vbmap,
- hwaddr iova, hwaddr size);
+ hwaddr iova, hwaddr size, Error **errp);
void vfio_container_init(VFIOContainerBase *bcontainer,
VFIOAddressSpace *space,
@@ -138,9 +138,22 @@ struct VFIOIOMMUClass {
*/
int (*set_dirty_page_tracking)(const VFIOContainerBase *bcontainer,
bool start, Error **errp);
+ /**
+ * @query_dirty_bitmap
+ *
+ * Get list of dirty pages from container
+ *
+ * @bcontainer: #VFIOContainerBase from which to get dirty pages
+ * @vbmap: #VFIOBitmap internal bitmap structure
+ * @iova: iova base address
+ * @size: size of iova range
+ * @errp: pointer to Error*, to store an error if it happens.
+ *
+ * Returns zero to indicate success and negative for error
+ */
int (*query_dirty_bitmap)(const VFIOContainerBase *bcontainer,
VFIOBitmap *vbmap,
- hwaddr iova, hwaddr size);
+ hwaddr iova, hwaddr size, Error **errp);
/* PCI specific */
int (*pci_hot_reset)(VFIODevice *vbasedev, bool single);
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index d4f6e695b671eeead2284d2116740d9eac31a5a3..1dec3b4d49c2864a397c99754eca69c6ba443ca1 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1156,7 +1156,7 @@ static int vfio_device_dma_logging_report(VFIODevice *vbasedev, hwaddr iova,
int vfio_devices_query_dirty_bitmap(const VFIOContainerBase *bcontainer,
VFIOBitmap *vbmap, hwaddr iova,
- hwaddr size)
+ hwaddr size, Error **errp)
{
VFIODevice *vbasedev;
int ret;
@@ -1165,10 +1165,10 @@ int vfio_devices_query_dirty_bitmap(const VFIOContainerBase *bcontainer,
ret = vfio_device_dma_logging_report(vbasedev, iova, size,
vbmap->bitmap);
if (ret) {
- error_report("%s: Failed to get DMA logging report, iova: "
- "0x%" HWADDR_PRIx ", size: 0x%" HWADDR_PRIx
- ", err: %d (%s)",
- vbasedev->name, iova, size, ret, strerror(-ret));
+ error_setg(errp, "%s: Failed to get DMA logging report, iova: "
+ "0x%" HWADDR_PRIx ", size: 0x%" HWADDR_PRIx
+ ", err: %d (%s)",
+ vbasedev->name, iova, size, ret, strerror(-ret));
return ret;
}
@@ -1178,7 +1178,7 @@ int vfio_devices_query_dirty_bitmap(const VFIOContainerBase *bcontainer,
}
int vfio_get_dirty_bitmap(const VFIOContainerBase *bcontainer, uint64_t iova,
- uint64_t size, ram_addr_t ram_addr)
+ uint64_t size, ram_addr_t ram_addr, Error **errp)
{
bool all_device_dirty_tracking =
vfio_devices_all_device_dirty_tracking(bcontainer);
@@ -1195,13 +1195,17 @@ int vfio_get_dirty_bitmap(const VFIOContainerBase *bcontainer, uint64_t iova,
ret = vfio_bitmap_alloc(&vbmap, size);
if (ret) {
+ error_setg_errno(errp, -ret,
+ "Failed to allocate dirty tracking bitmap");
return ret;
}
if (all_device_dirty_tracking) {
- ret = vfio_devices_query_dirty_bitmap(bcontainer, &vbmap, iova, size);
+ ret = vfio_devices_query_dirty_bitmap(bcontainer, &vbmap, iova, size,
+ errp);
} else {
- ret = vfio_container_query_dirty_bitmap(bcontainer, &vbmap, iova, size);
+ ret = vfio_container_query_dirty_bitmap(bcontainer, &vbmap, iova, size,
+ errp);
}
if (ret) {
@@ -1249,12 +1253,13 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
}
ret = vfio_get_dirty_bitmap(bcontainer, iova, iotlb->addr_mask + 1,
- translated_addr);
+ translated_addr, &local_err);
if (ret) {
- error_report("vfio_iommu_map_dirty_notify(%p, 0x%"HWADDR_PRIx", "
- "0x%"HWADDR_PRIx") = %d (%s)",
- bcontainer, iova, iotlb->addr_mask + 1, ret,
- strerror(-ret));
+ error_prepend(&local_err,
+ "vfio_iommu_map_dirty_notify(%p, 0x%"HWADDR_PRIx", "
+ "0x%"HWADDR_PRIx") failed :", bcontainer, iova,
+ iotlb->addr_mask + 1);
+ error_report_err(local_err);
}
out_lock:
@@ -1274,12 +1279,19 @@ static int vfio_ram_discard_get_dirty_bitmap(MemoryRegionSection *section,
const ram_addr_t ram_addr = memory_region_get_ram_addr(section->mr) +
section->offset_within_region;
VFIORamDiscardListener *vrdl = opaque;
+ Error *local_err = NULL;
+ int ret;
/*
* Sync the whole mapped region (spanning multiple individual mappings)
* in one go.
*/
- return vfio_get_dirty_bitmap(vrdl->bcontainer, iova, size, ram_addr);
+ ret = vfio_get_dirty_bitmap(vrdl->bcontainer, iova, size, ram_addr,
+ &local_err);
+ if (ret) {
+ error_report_err(local_err);
+ }
+ return ret;
}
static int
@@ -1311,7 +1323,7 @@ vfio_sync_ram_discard_listener_dirty_bitmap(VFIOContainerBase *bcontainer,
}
static int vfio_sync_dirty_bitmap(VFIOContainerBase *bcontainer,
- MemoryRegionSection *section)
+ MemoryRegionSection *section, Error **errp)
{
ram_addr_t ram_addr;
@@ -1342,7 +1354,14 @@ static int vfio_sync_dirty_bitmap(VFIOContainerBase *bcontainer,
}
return 0;
} else if (memory_region_has_ram_discard_manager(section->mr)) {
- return vfio_sync_ram_discard_listener_dirty_bitmap(bcontainer, section);
+ int ret;
+
+ ret = vfio_sync_ram_discard_listener_dirty_bitmap(bcontainer, section);
+ if (ret) {
+ error_setg(errp,
+ "Failed to sync dirty bitmap with RAM discard listener");
+ return ret;
+ }
}
ram_addr = memory_region_get_ram_addr(section->mr) +
@@ -1350,7 +1369,7 @@ static int vfio_sync_dirty_bitmap(VFIOContainerBase *bcontainer,
return vfio_get_dirty_bitmap(bcontainer,
REAL_HOST_PAGE_ALIGN(section->offset_within_address_space),
- int128_get64(section->size), ram_addr);
+ int128_get64(section->size), ram_addr, errp);
}
static void vfio_listener_log_sync(MemoryListener *listener,
@@ -1359,16 +1378,16 @@ static void vfio_listener_log_sync(MemoryListener *listener,
VFIOContainerBase *bcontainer = container_of(listener, VFIOContainerBase,
listener);
int ret;
+ Error *local_err = NULL;
if (vfio_listener_skipped_section(section)) {
return;
}
if (vfio_devices_all_dirty_tracking(bcontainer)) {
- ret = vfio_sync_dirty_bitmap(bcontainer, section);
+ ret = vfio_sync_dirty_bitmap(bcontainer, section, &local_err);
if (ret) {
- error_report("vfio: Failed to sync dirty bitmap, err: %d (%s)", ret,
- strerror(-ret));
+ error_report_err(local_err);
vfio_set_migration_error(ret);
}
}
diff --git a/hw/vfio/container-base.c b/hw/vfio/container-base.c
index 7c0764121d24b02b6c4e66e368d7dff78a6d65aa..8db59881873c3b1edee81104b966af737e5fa6f6 100644
--- a/hw/vfio/container-base.c
+++ b/hw/vfio/container-base.c
@@ -65,10 +65,11 @@ int vfio_container_set_dirty_page_tracking(VFIOContainerBase *bcontainer,
int vfio_container_query_dirty_bitmap(const VFIOContainerBase *bcontainer,
VFIOBitmap *vbmap,
- hwaddr iova, hwaddr size)
+ hwaddr iova, hwaddr size, Error **errp)
{
g_assert(bcontainer->ops->query_dirty_bitmap);
- return bcontainer->ops->query_dirty_bitmap(bcontainer, vbmap, iova, size);
+ return bcontainer->ops->query_dirty_bitmap(bcontainer, vbmap, iova, size,
+ errp);
}
void vfio_container_init(VFIOContainerBase *bcontainer, VFIOAddressSpace *space,
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index 6524575aeddcea8470b5fd10caf57475088d1813..475d96eaaa927998c6aa8cc9aa9f2115f5a1efda 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -131,6 +131,7 @@ static int vfio_legacy_dma_unmap(const VFIOContainerBase *bcontainer,
};
bool need_dirty_sync = false;
int ret;
+ Error *local_err = NULL;
if (iotlb && vfio_devices_all_running_and_mig_active(bcontainer)) {
if (!vfio_devices_all_device_dirty_tracking(bcontainer) &&
@@ -166,8 +167,9 @@ static int vfio_legacy_dma_unmap(const VFIOContainerBase *bcontainer,
if (need_dirty_sync) {
ret = vfio_get_dirty_bitmap(bcontainer, iova, size,
- iotlb->translated_addr);
+ iotlb->translated_addr, &local_err);
if (ret) {
+ error_report_err(local_err);
return ret;
}
}
@@ -237,7 +239,8 @@ vfio_legacy_set_dirty_page_tracking(const VFIOContainerBase *bcontainer,
static int vfio_legacy_query_dirty_bitmap(const VFIOContainerBase *bcontainer,
VFIOBitmap *vbmap,
- hwaddr iova, hwaddr size)
+ hwaddr iova, hwaddr size,
+ Error **errp)
{
const VFIOContainer *container = container_of(bcontainer, VFIOContainer,
bcontainer);
@@ -265,9 +268,9 @@ static int vfio_legacy_query_dirty_bitmap(const VFIOContainerBase *bcontainer,
ret = ioctl(container->fd, VFIO_IOMMU_DIRTY_PAGES, dbitmap);
if (ret) {
ret = -errno;
- error_report("Failed to get dirty bitmap for iova: 0x%"PRIx64
- " size: 0x%"PRIx64" err: %d", (uint64_t)range->iova,
- (uint64_t)range->size, errno);
+ error_setg(errp, "Failed to get dirty bitmap for iova: 0x%"PRIx64
+ " size: 0x%"PRIx64" err: %d", (uint64_t)range->iova,
+ (uint64_t)range->size, errno);
}
g_free(dbitmap);
--
2.44.0
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v3 25/26] vfio: Also trace event failures in vfio_save_complete_precopy()
2024-03-04 12:28 [PATCH v3 00/26] migration: Improve error reporting Cédric Le Goater
` (23 preceding siblings ...)
2024-03-04 12:28 ` [PATCH v3 24/26] vfio: Add Error** argument to .get_dirty_bitmap() handler Cédric Le Goater
@ 2024-03-04 12:28 ` Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 26/26] vfio: Extend vfio_set_migration_error() with Error* argument Cédric Le Goater
2024-03-05 8:06 ` [PATCH v3 00/26] migration: Improve error reporting Peter Xu
26 siblings, 0 replies; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-04 12:28 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster,
Cédric Le Goater
vfio_save_complete_precopy() currently returns before doing the trace
event. Change that.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
hw/vfio/migration.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 4c4ccbaed10b82e7baa992770ec0ec5a6c82b757..4469ac6abafd4fa0cbf1df54faebc975c92bbcf2 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -580,9 +580,6 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
ret = qemu_file_get_error(f);
- if (ret) {
- return ret;
- }
trace_vfio_save_complete_precopy(vbasedev->name, ret);
--
2.44.0
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH v3 26/26] vfio: Extend vfio_set_migration_error() with Error* argument
2024-03-04 12:28 [PATCH v3 00/26] migration: Improve error reporting Cédric Le Goater
` (24 preceding siblings ...)
2024-03-04 12:28 ` [PATCH v3 25/26] vfio: Also trace event failures in vfio_save_complete_precopy() Cédric Le Goater
@ 2024-03-04 12:28 ` Cédric Le Goater
2024-03-05 8:06 ` [PATCH v3 00/26] migration: Improve error reporting Peter Xu
26 siblings, 0 replies; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-04 12:28 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster,
Cédric Le Goater
vfio_set_migration_error() sets the 'return' error on the migration
stream if a migration is in progress. To improve error reporting, add
a new Error* argument to also set the Error object on the migration
stream, if a migration is progress.
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
hw/vfio/common.c | 36 +++++++++++++++++++-----------------
1 file changed, 19 insertions(+), 17 deletions(-)
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 1dec3b4d49c2864a397c99754eca69c6ba443ca1..9fa6eee8035b5f23bb82833699e85c48c6bfaf60 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -148,16 +148,18 @@ bool vfio_viommu_preset(VFIODevice *vbasedev)
return vbasedev->bcontainer->space->as != &address_space_memory;
}
-static void vfio_set_migration_error(int err)
+static void vfio_set_migration_error(int ret, Error *err)
{
MigrationState *ms = migrate_get_current();
if (migration_is_setup_or_active(ms->state)) {
WITH_QEMU_LOCK_GUARD(&ms->qemu_file_lock) {
if (ms->to_dst_file) {
- qemu_file_set_error(ms->to_dst_file, err);
+ qemu_file_set_error_obj(ms->to_dst_file, ret, err);
}
}
+ } else {
+ error_report_err(err);
}
}
@@ -304,9 +306,10 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
iova, iova + iotlb->addr_mask);
if (iotlb->target_as != &address_space_memory) {
- error_report("Wrong target AS \"%s\", only system memory is allowed",
- iotlb->target_as->name ? iotlb->target_as->name : "none");
- vfio_set_migration_error(-EINVAL);
+ error_setg(&local_err,
+ "Wrong target AS \"%s\", only system memory is allowed",
+ iotlb->target_as->name ? iotlb->target_as->name : "none");
+ vfio_set_migration_error(-EINVAL, local_err);
return;
}
@@ -339,11 +342,12 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
ret = vfio_container_dma_unmap(bcontainer, iova,
iotlb->addr_mask + 1, iotlb);
if (ret) {
- error_report("vfio_container_dma_unmap(%p, 0x%"HWADDR_PRIx", "
- "0x%"HWADDR_PRIx") = %d (%s)",
- bcontainer, iova,
- iotlb->addr_mask + 1, ret, strerror(-ret));
- vfio_set_migration_error(ret);
+ error_setg(&local_err,
+ "vfio_container_dma_unmap(%p, 0x%"HWADDR_PRIx", "
+ "0x%"HWADDR_PRIx") = %d (%s)",
+ bcontainer, iova,
+ iotlb->addr_mask + 1, ret, strerror(-ret));
+ vfio_set_migration_error(ret, local_err);
}
}
out:
@@ -1241,14 +1245,14 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
trace_vfio_iommu_map_dirty_notify(iova, iova + iotlb->addr_mask);
if (iotlb->target_as != &address_space_memory) {
- error_report("Wrong target AS \"%s\", only system memory is allowed",
- iotlb->target_as->name ? iotlb->target_as->name : "none");
+ error_setg(&local_err,
+ "Wrong target AS \"%s\", only system memory is allowed",
+ iotlb->target_as->name ? iotlb->target_as->name : "none");
goto out;
}
rcu_read_lock();
if (!vfio_get_xlat_addr(iotlb, NULL, &translated_addr, NULL, &local_err)) {
- error_report_err(local_err);
goto out_lock;
}
@@ -1259,7 +1263,6 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
"vfio_iommu_map_dirty_notify(%p, 0x%"HWADDR_PRIx", "
"0x%"HWADDR_PRIx") failed :", bcontainer, iova,
iotlb->addr_mask + 1);
- error_report_err(local_err);
}
out_lock:
@@ -1267,7 +1270,7 @@ out_lock:
out:
if (ret) {
- vfio_set_migration_error(ret);
+ vfio_set_migration_error(ret, local_err);
}
}
@@ -1387,8 +1390,7 @@ static void vfio_listener_log_sync(MemoryListener *listener,
if (vfio_devices_all_dirty_tracking(bcontainer)) {
ret = vfio_sync_dirty_bitmap(bcontainer, section, &local_err);
if (ret) {
- error_report_err(local_err);
- vfio_set_migration_error(ret);
+ vfio_set_migration_error(ret, local_err);
}
}
}
--
2.44.0
^ permalink raw reply related [flat|nested] 50+ messages in thread
* Re: [PATCH v3 01/26] s390/stattrib: Add Error** argument to set_migrationmode() handler
2024-03-04 12:28 ` [PATCH v3 01/26] s390/stattrib: Add Error** argument to set_migrationmode() handler Cédric Le Goater
@ 2024-03-04 20:49 ` Fabiano Rosas
2024-03-05 7:12 ` Cédric Le Goater
2024-03-05 14:54 ` Avihai Horon
1 sibling, 1 reply; 50+ messages in thread
From: Fabiano Rosas @ 2024-03-04 20:49 UTC (permalink / raw)
To: Cédric Le Goater, qemu-devel
Cc: Peter Xu, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster,
Cédric Le Goater, Halil Pasic, Christian Borntraeger,
Thomas Huth
Cédric Le Goater <clg@redhat.com> writes:
> This will prepare ground for futur changes adding an Error** argument
> to the save_setup() handler. We need to make sure that on failure,
> set_migrationmode() always sets a new error. See the Rules section in
> qapi/error.h.
>
> Cc: Halil Pasic <pasic@linux.ibm.com>
> Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
> Cc: Thomas Huth <thuth@redhat.com>
> Signed-off-by: Cédric Le Goater <clg@redhat.com>
> ---
> include/hw/s390x/storage-attributes.h | 2 +-
> hw/s390x/s390-stattrib-kvm.c | 12 ++++++++++--
> hw/s390x/s390-stattrib.c | 14 +++++++++-----
> 3 files changed, 20 insertions(+), 8 deletions(-)
>
> diff --git a/include/hw/s390x/storage-attributes.h b/include/hw/s390x/storage-attributes.h
> index 5239eb538c1b087797867a247abfc14551af6a4d..8921a04d514bf64a3113255ee10ed33fc598ae06 100644
> --- a/include/hw/s390x/storage-attributes.h
> +++ b/include/hw/s390x/storage-attributes.h
> @@ -39,7 +39,7 @@ struct S390StAttribClass {
> int (*set_stattr)(S390StAttribState *sa, uint64_t start_gfn,
> uint32_t count, uint8_t *values);
> void (*synchronize)(S390StAttribState *sa);
> - int (*set_migrationmode)(S390StAttribState *sa, bool value);
> + int (*set_migrationmode)(S390StAttribState *sa, bool value, Error **errp);
> int (*get_active)(S390StAttribState *sa);
> long long (*get_dirtycount)(S390StAttribState *sa);
> };
> diff --git a/hw/s390x/s390-stattrib-kvm.c b/hw/s390x/s390-stattrib-kvm.c
> index 24cd01382e2d74d62c2d7e980eb6aca1077d893d..357cea2c987213b867c81b0e258f7d0c293fe665 100644
> --- a/hw/s390x/s390-stattrib-kvm.c
> +++ b/hw/s390x/s390-stattrib-kvm.c
> @@ -17,6 +17,7 @@
> #include "sysemu/kvm.h"
> #include "exec/ram_addr.h"
> #include "kvm/kvm_s390x.h"
> +#include "qapi/error.h"
>
> Object *kvm_s390_stattrib_create(void)
> {
> @@ -137,14 +138,21 @@ static void kvm_s390_stattrib_synchronize(S390StAttribState *sa)
> }
> }
>
> -static int kvm_s390_stattrib_set_migrationmode(S390StAttribState *sa, bool val)
> +static int kvm_s390_stattrib_set_migrationmode(S390StAttribState *sa, bool val,
> + Error **errp)
> {
> struct kvm_device_attr attr = {
> .group = KVM_S390_VM_MIGRATION,
> .attr = val,
> .addr = 0,
> };
> - return kvm_vm_ioctl(kvm_state, KVM_SET_DEVICE_ATTR, &attr);
> + int r;
> +
> + r = kvm_vm_ioctl(kvm_state, KVM_SET_DEVICE_ATTR, &attr);
> + if (r) {
> + error_setg_errno(errp, -r, "KVM_S390_SET_CMMA_BITS failed");
Did you mean KVM_SET_DEVICE_ATTR?
> + }
> + return r;
> }
>
> static long long kvm_s390_stattrib_get_dirtycount(S390StAttribState *sa)
> diff --git a/hw/s390x/s390-stattrib.c b/hw/s390x/s390-stattrib.c
> index c483b62a9b5f71772639fc180bdad15ecb6711cb..e99de190332a82363b1388bbc450013149295bc0 100644
> --- a/hw/s390x/s390-stattrib.c
> +++ b/hw/s390x/s390-stattrib.c
> @@ -60,11 +60,12 @@ void hmp_migrationmode(Monitor *mon, const QDict *qdict)
> S390StAttribState *sas = s390_get_stattrib_device();
> S390StAttribClass *sac = S390_STATTRIB_GET_CLASS(sas);
> uint64_t what = qdict_get_int(qdict, "mode");
> + Error *local_err = NULL;
> int r;
>
> - r = sac->set_migrationmode(sas, what);
> + r = sac->set_migrationmode(sas, what, &local_err);
> if (r < 0) {
> - monitor_printf(mon, "Error: %s", strerror(-r));
> + monitor_printf(mon, "Error: %s", error_get_pretty(local_err));
> }
> }
>
> @@ -170,13 +171,15 @@ static int cmma_save_setup(QEMUFile *f, void *opaque)
> {
> S390StAttribState *sas = S390_STATTRIB(opaque);
> S390StAttribClass *sac = S390_STATTRIB_GET_CLASS(sas);
> + Error *local_err = NULL;
> int res;
> /*
> * Signal that we want to start a migration, thus needing PGSTE dirty
> * tracking.
> */
> - res = sac->set_migrationmode(sas, 1);
> + res = sac->set_migrationmode(sas, true, &local_err);
> if (res) {
> + error_report_err(local_err);
> return res;
> }
> qemu_put_be64(f, STATTR_FLAG_EOS);
> @@ -260,7 +263,7 @@ static void cmma_save_cleanup(void *opaque)
> {
> S390StAttribState *sas = S390_STATTRIB(opaque);
> S390StAttribClass *sac = S390_STATTRIB_GET_CLASS(sas);
> - sac->set_migrationmode(sas, 0);
> + sac->set_migrationmode(sas, false, NULL);
> }
>
> static bool cmma_active(void *opaque)
> @@ -293,7 +296,8 @@ static long long qemu_s390_get_dirtycount_stub(S390StAttribState *sa)
> {
> return 0;
> }
> -static int qemu_s390_set_migrationmode_stub(S390StAttribState *sa, bool value)
> +static int qemu_s390_set_migrationmode_stub(S390StAttribState *sa, bool value,
> + Error **errp)
> {
> return 0;
> }
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH v3 02/26] vfio: Always report an error in vfio_save_setup()
2024-03-04 12:28 ` [PATCH v3 02/26] vfio: Always report an error in vfio_save_setup() Cédric Le Goater
@ 2024-03-04 20:51 ` Fabiano Rosas
2024-03-06 9:56 ` Avihai Horon
1 sibling, 0 replies; 50+ messages in thread
From: Fabiano Rosas @ 2024-03-04 20:51 UTC (permalink / raw)
To: Cédric Le Goater, qemu-devel
Cc: Peter Xu, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster,
Cédric Le Goater
Cédric Le Goater <clg@redhat.com> writes:
> This will prepare ground for futur changes adding an Error** argument
> to the save_setup() handler. We need to make sure that on failure,
> vfio_save_setup() always sets a new error.
>
> Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH v3 03/26] migration: Always report an error in block_save_setup()
2024-03-04 12:28 ` [PATCH v3 03/26] migration: Always report an error in block_save_setup() Cédric Le Goater
@ 2024-03-04 21:04 ` Fabiano Rosas
2024-03-05 8:23 ` Cédric Le Goater
0 siblings, 1 reply; 50+ messages in thread
From: Fabiano Rosas @ 2024-03-04 21:04 UTC (permalink / raw)
To: Cédric Le Goater, qemu-devel
Cc: Peter Xu, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster,
Cédric Le Goater, Stefan Hajnoczi
Cédric Le Goater <clg@redhat.com> writes:
> This will prepare ground for futur changes adding an Error** argument
> to the save_setup() handler. We need to make sure that on failure,
> block_save_setup() always sets a new error.
>
> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> Signed-off-by: Cédric Le Goater <clg@redhat.com>
> ---
> migration/block.c | 18 +++++++++++++-----
> 1 file changed, 13 insertions(+), 5 deletions(-)
>
> diff --git a/migration/block.c b/migration/block.c
> index 8c6ebafacc1ffe930d1d4f19d968817b14852c69..06f5857cf049df3261d2cf1d7c3607ab92350ac6 100644
> --- a/migration/block.c
> +++ b/migration/block.c
> @@ -367,7 +367,7 @@ static void unset_dirty_tracking(void)
> }
> }
>
> -static int init_blk_migration(QEMUFile *f)
> +static int init_blk_migration(QEMUFile *f, Error **errp)
> {
> BlockDriverState *bs;
> BlkMigDevState *bmds;
> @@ -378,7 +378,6 @@ static int init_blk_migration(QEMUFile *f)
> BlkMigDevState *bmds;
> BlockDriverState *bs;
> } *bmds_bs;
> - Error *local_err = NULL;
> int ret;
>
> GRAPH_RDLOCK_GUARD_MAINLOOP();
There's a negative return below at:
for (i = 0, bs = bdrv_first(&it); bs; bs = bdrv_next(&it), i++) {
if (bdrv_is_read_only(bs)) {
continue;
}
sectors = bdrv_nb_sectors(bs);
if (sectors <= 0) {
ret = sectors;
^
bdrv_next_cleanup(&it);
goto out;
}
...
I presume that must be covered by an error as well. A similar function
somewhere else reports:
total_sectors = blk_nb_sectors(blk);
if (total_sectors <= 0) {
error_report("Error getting length of block device %s",
device_name);
return -EINVAL;
}
> @@ -439,9 +438,8 @@ static int init_blk_migration(QEMUFile *f)
> bs = bmds_bs[i].bs;
>
> if (bmds) {
> - ret = blk_insert_bs(bmds->blk, bs, &local_err);
> + ret = blk_insert_bs(bmds->blk, bs, errp);
> if (ret < 0) {
> - error_report_err(local_err);
> goto out;
> }
>
> @@ -711,6 +709,7 @@ static void block_migration_cleanup(void *opaque)
> static int block_save_setup(QEMUFile *f, void *opaque)
> {
> int ret;
> + Error *local_err = NULL;
>
> trace_migration_block_save("setup", block_mig_state.submitted,
> block_mig_state.transferred);
> @@ -718,18 +717,27 @@ static int block_save_setup(QEMUFile *f, void *opaque)
> warn_report("block migration is deprecated;"
> " use blockdev-mirror with NBD instead");
>
> - ret = init_blk_migration(f);
> + ret = init_blk_migration(f, &local_err);
> if (ret < 0) {
> + error_report_err(local_err);
> return ret;
> }
>
> /* start track dirty blocks */
> ret = set_dirty_tracking();
> if (ret) {
> + error_setg_errno(&local_err, -ret,
> + "Failed to start block dirty tracking");
> + error_report_err(local_err);
> return ret;
> }
>
> ret = flush_blks(f);
> + if (ret) {
> + error_setg_errno(&local_err, -ret, "Flushing block failed");
> + error_report_err(local_err);
> + return ret;
> + }
> blk_mig_reset_dirty_cursor();
> qemu_put_be64(f, BLK_MIG_FLAG_EOS);
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH v3 04/26] migration: Always report an error in ram_save_setup()
2024-03-04 12:28 ` [PATCH v3 04/26] migration: Always report an error in ram_save_setup() Cédric Le Goater
@ 2024-03-04 21:30 ` Fabiano Rosas
2024-03-05 8:52 ` Cédric Le Goater
2024-03-05 5:29 ` Prasad Pandit
1 sibling, 1 reply; 50+ messages in thread
From: Fabiano Rosas @ 2024-03-04 21:30 UTC (permalink / raw)
To: Cédric Le Goater, qemu-devel
Cc: Peter Xu, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster,
Cédric Le Goater
Cédric Le Goater <clg@redhat.com> writes:
> This will prepare ground for futur changes adding an Error** argument
> to the save_setup() handler. We need to make sure that on failure,
> ram_save_setup() sets a new error.
>
> Signed-off-by: Cédric Le Goater <clg@redhat.com>
> ---
> migration/ram.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/migration/ram.c b/migration/ram.c
> index 45a00b45edd429ef0568adce09b7459883e00167..dbd04d8ee2167881007c836a6bbc79c1aced72d0 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -2937,12 +2937,14 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
> int ret;
>
> if (compress_threads_save_setup()) {
> + error_report("%s: failed to start compress threads", __func__);
> return -1;
> }
>
> /* migration has already setup the bitmap, reuse it. */
> if (!migration_in_colo_state()) {
> if (ram_init_all(rsp) != 0) {
> + error_report("%s: failed to setup RAM for migration", __func__);
> compress_threads_save_cleanup();
> return -1;
> }
> @@ -2969,12 +2971,14 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>
> ret = rdma_registration_start(f, RAM_CONTROL_SETUP);
> if (ret < 0) {
> + error_report("%s: failed to start RDMA registration", __func__);
> qemu_file_set_error(f, ret);
> return ret;
> }
>
> ret = rdma_registration_stop(f, RAM_CONTROL_SETUP);
> if (ret < 0) {
> + error_report("%s: failed to stop RDMA registration", __func__);
> qemu_file_set_error(f, ret);
> return ret;
> }
> @@ -2986,6 +2990,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
> ret = multifd_send_sync_main();
> bql_lock();
> if (ret < 0) {
> + error_report("%s: multifd synchronization failed", __func__);
> return ret;
> }
>
> @@ -2994,7 +2999,11 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
> }
>
> qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
> - return qemu_fflush(f);
> + ret = qemu_fflush(f);
> + if (ret) {
> + error_report("%s failed : %s", __func__, strerror(ret));
Here it should be -ret
The qemu_fflush function returns the QEMUFile error (f->last_error) and
that is expected to be a -errno value.
I started making sure all callers of qemu_file_set_error() use a
negative value, but got lost in the vmstate code and never finished:
https://lore.kernel.org/r/20230706195201.18595-1-farosas@suse.de
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH v3 15/26] memory: Add Error** argument to the global_dirty_log routines
2024-03-04 12:28 ` [PATCH v3 15/26] memory: Add Error** argument to the global_dirty_log routines Cédric Le Goater
@ 2024-03-05 1:58 ` Yong Huang
0 siblings, 0 replies; 50+ messages in thread
From: Yong Huang @ 2024-03-05 1:58 UTC (permalink / raw)
To: Cédric Le Goater
Cc: qemu-devel, Peter Xu, Fabiano Rosas, Alex Williamson,
Avihai Horon, Philippe Mathieu-Daudé, Markus Armbruster,
Stefano Stabellini, Anthony Perard, Paul Durrant,
Michael S. Tsirkin, Paolo Bonzini, David Hildenbrand
[-- Attachment #1: Type: text/plain, Size: 11816 bytes --]
On Mon, Mar 4, 2024 at 8:29 PM Cédric Le Goater <clg@redhat.com> wrote:
> Now that the log_global*() handlers take an Error** parameter and
> return a bool, do the same for memory_global_dirty_log_start() and
> memory_global_dirty_log_stop(). The error is reported in the callers
> for now and it will be propagated in the call stack in the next
> changes.
>
> To be noted a functional change in ram_init_bitmaps(), if the dirty
> pages logger fails to start, there is no need to synchronize the dirty
> pages bitmaps. colo_incoming_start_dirty_log() could be modified in a
> similar way.
>
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Anthony Perard <anthony.perard@citrix.com>
> Cc: Paul Durrant <paul@xen.org>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Hyman Huang <yong.huang@smartx.com>
> Signed-off-by: Cédric Le Goater <clg@redhat.com>
> ---
> include/exec/memory.h | 10 ++++++++--
> hw/i386/xen/xen-hvm.c | 4 ++--
> migration/dirtyrate.c | 21 +++++++++++++++++----
> migration/ram.c | 34 ++++++++++++++++++++++++++++++----
> system/memory.c | 30 ++++++++++++------------------
> 5 files changed, 69 insertions(+), 30 deletions(-)
>
> diff --git a/include/exec/memory.h b/include/exec/memory.h
> index
> 4bc146c5ebdd377cd14a4e462f32cc945db5a0a8..8b019465ab13ce85c03075c80865a0865ea1feed
> 100644
> --- a/include/exec/memory.h
> +++ b/include/exec/memory.h
> @@ -2576,15 +2576,21 @@ void memory_listener_unregister(MemoryListener
> *listener);
> * memory_global_dirty_log_start: begin dirty logging for all regions
> *
> * @flags: purpose of starting dirty log, migration or dirty rate
> + * @errp: pointer to Error*, to store an error if it happens.
> + *
> + * Return: true on success, else false setting @errp with error.
> */
> -void memory_global_dirty_log_start(unsigned int flags);
> +bool memory_global_dirty_log_start(unsigned int flags, Error **errp);
>
> /**
> * memory_global_dirty_log_stop: end dirty logging for all regions
> *
> * @flags: purpose of stopping dirty log, migration or dirty rate
> + * @errp: pointer to Error*, to store an error if it happens.
> + *
> + * Return: true on success, else false setting @errp with error.
> */
> -void memory_global_dirty_log_stop(unsigned int flags);
> +bool memory_global_dirty_log_stop(unsigned int flags, Error **errp);
>
> void mtree_info(bool flatview, bool dispatch_tree, bool owner, bool
> disabled);
>
> diff --git a/hw/i386/xen/xen-hvm.c b/hw/i386/xen/xen-hvm.c
> index
> 925a207b494b4eed52d5f360b554f18ac8a9806d..286269b47572d90e57df5ff44835bb5f8e16c7ad
> 100644
> --- a/hw/i386/xen/xen-hvm.c
> +++ b/hw/i386/xen/xen-hvm.c
> @@ -655,9 +655,9 @@ void xen_hvm_modified_memory(ram_addr_t start,
> ram_addr_t length)
> void qmp_xen_set_global_dirty_log(bool enable, Error **errp)
> {
> if (enable) {
> - memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION);
> + memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION, errp);
> } else {
> - memory_global_dirty_log_stop(GLOBAL_DIRTY_MIGRATION);
> + memory_global_dirty_log_stop(GLOBAL_DIRTY_MIGRATION, errp);
> }
> }
>
> diff --git a/migration/dirtyrate.c b/migration/dirtyrate.c
> index
> 1d2e85746fb7b10eb7f149976970f9a92125af8a..34f6d803ff5f4e6ccf2e06aaaed65a336c4be469
> 100644
> --- a/migration/dirtyrate.c
> +++ b/migration/dirtyrate.c
> @@ -90,11 +90,17 @@ static int64_t do_calculate_dirtyrate(DirtyPageRecord
> dirty_pages,
>
> void global_dirty_log_change(unsigned int flag, bool start)
> {
> + Error *local_err = NULL;
> + bool ret;
> +
> bql_lock();
> if (start) {
> - memory_global_dirty_log_start(flag);
> + ret = memory_global_dirty_log_start(flag, &local_err);
> } else {
> - memory_global_dirty_log_stop(flag);
> + ret = memory_global_dirty_log_stop(flag, &local_err);
> + }
> + if (!ret) {
> + error_report_err(local_err);
> }
> bql_unlock();
> }
> @@ -106,10 +112,14 @@ void global_dirty_log_change(unsigned int flag, bool
> start)
> */
> static void global_dirty_log_sync(unsigned int flag, bool one_shot)
> {
> + Error *local_err = NULL;
> +
> bql_lock();
> memory_global_dirty_log_sync(false);
> if (one_shot) {
> - memory_global_dirty_log_stop(flag);
> + if (!memory_global_dirty_log_stop(flag, &local_err)) {
> + error_report_err(local_err);
> + }
> }
> bql_unlock();
> }
> @@ -608,9 +618,12 @@ static void calculate_dirtyrate_dirty_bitmap(struct
> DirtyRateConfig config)
> {
> int64_t start_time;
> DirtyPageRecord dirty_pages;
> + Error *local_err = NULL;
>
> bql_lock();
> - memory_global_dirty_log_start(GLOBAL_DIRTY_DIRTY_RATE);
> + if (!memory_global_dirty_log_start(GLOBAL_DIRTY_DIRTY_RATE,
> &local_err)) {
> + error_report_err(local_err);
> + }
>
> /*
> * 1'round of log sync may return all 1 bits with
> diff --git a/migration/ram.c b/migration/ram.c
> index
> 20c6ad9e759b2b8ec7ae26b7ca72d5cbd20d481f..3d9c08cfae8a59031a7c1b3c70721c2a90daceba
> 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -2390,6 +2390,7 @@ static void ram_save_cleanup(void *opaque)
> {
> RAMState **rsp = opaque;
> RAMBlock *block;
> + Error *local_err = NULL;
>
> /* We don't use dirty log with background snapshots */
> if (!migrate_background_snapshot()) {
> @@ -2402,7 +2403,10 @@ static void ram_save_cleanup(void *opaque)
> * memory_global_dirty_log_stop will assert that
> * memory_global_dirty_log_start/stop used in pairs
> */
> - memory_global_dirty_log_stop(GLOBAL_DIRTY_MIGRATION);
> + if (!memory_global_dirty_log_stop(GLOBAL_DIRTY_MIGRATION,
> + &local_err)) {
> + error_report_err(local_err);
> + }
> }
> }
>
> @@ -2799,18 +2803,31 @@ static void
> migration_bitmap_clear_discarded_pages(RAMState *rs)
>
> static void ram_init_bitmaps(RAMState *rs)
> {
> + Error *local_err = NULL;
> + bool ret = true;
> +
> qemu_mutex_lock_ramlist();
>
> WITH_RCU_READ_LOCK_GUARD() {
> ram_list_init_bitmaps();
> /* We don't use dirty log with background snapshots */
> if (!migrate_background_snapshot()) {
> - memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION);
> + ret = memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION,
> + &local_err);
> + if (!ret) {
> + error_report_err(local_err);
> + goto out_unlock;
> + }
> migration_bitmap_sync_precopy(rs, false);
> }
> }
> +out_unlock:
> qemu_mutex_unlock_ramlist();
>
> + if (!ret) {
> + return;
> + }
> +
> /*
> * After an eventual first bitmap sync, fixup the initial bitmap
> * containing all 1s to exclude any discarded pages from migration.
> @@ -3459,6 +3476,8 @@ int colo_init_ram_cache(void)
> void colo_incoming_start_dirty_log(void)
> {
> RAMBlock *block = NULL;
> + Error *local_err = NULL;
> +
> /* For memory_global_dirty_log_start below. */
> bql_lock();
> qemu_mutex_lock_ramlist();
> @@ -3470,7 +3489,10 @@ void colo_incoming_start_dirty_log(void)
> /* Discard this dirty bitmap record */
> bitmap_zero(block->bmap, block->max_length >>
> TARGET_PAGE_BITS);
> }
> - memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION);
> + if (!memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION,
> + &local_err)) {
> + error_report_err(local_err);
> + }
> }
> ram_state->migration_dirty_pages = 0;
> qemu_mutex_unlock_ramlist();
> @@ -3481,8 +3503,12 @@ void colo_incoming_start_dirty_log(void)
> void colo_release_ram_cache(void)
> {
> RAMBlock *block;
> + Error *local_err = NULL;
> +
> + if (!memory_global_dirty_log_stop(GLOBAL_DIRTY_MIGRATION,
> &local_err)) {
> + error_report_err(local_err);
> + }
>
> - memory_global_dirty_log_stop(GLOBAL_DIRTY_MIGRATION);
> RAMBLOCK_FOREACH_NOT_IGNORED(block) {
> g_free(block->bmap);
> block->bmap = NULL;
> diff --git a/system/memory.c b/system/memory.c
> index
> af06157ead5b1272548e87f79ab9fb3036055328..48aed0f8ece1c731849636c442b8ab8e5d7ff6a5
> 100644
> --- a/system/memory.c
> +++ b/system/memory.c
> @@ -2949,25 +2949,24 @@ static unsigned int postponed_stop_flags;
> static VMChangeStateEntry *vmstate_change;
> static bool memory_global_dirty_log_stop_postponed_run(Error **errp);
>
> -void memory_global_dirty_log_start(unsigned int flags)
> +bool memory_global_dirty_log_start(unsigned int flags, Error **errp)
> {
> + ERRP_GUARD();
> unsigned int old_flags;
> - Error *local_err = NULL;
>
> assert(flags && !(flags & (~GLOBAL_DIRTY_MASK)));
>
> if (vmstate_change) {
> /* If there is postponed stop(), operate on it first */
> postponed_stop_flags &= ~flags;
> - if (!memory_global_dirty_log_stop_postponed_run(&local_err)) {
> - error_report_err(local_err);
> - return;
> + if (!memory_global_dirty_log_stop_postponed_run(errp)) {
> + return false;
> }
> }
>
> flags &= ~global_dirty_tracking;
> if (!flags) {
> - return;
> + return true;
> }
>
> old_flags = global_dirty_tracking;
> @@ -2975,16 +2974,15 @@ void memory_global_dirty_log_start(unsigned int
> flags)
> trace_global_dirty_changed(global_dirty_tracking);
>
> if (!old_flags) {
> - MEMORY_LISTENER_CALL_LOG_GLOBAL(log_global_start, Forward,
> - &local_err);
> - if (local_err) {
> - error_report_err(local_err);
> - return;
> + MEMORY_LISTENER_CALL_LOG_GLOBAL(log_global_start, Forward, errp);
> + if (*errp) {
> + return false;
> }
> memory_region_transaction_begin();
> memory_region_update_pending = true;
> memory_region_transaction_commit();
> }
> + return true;
> }
>
> static bool memory_global_dirty_log_do_stop(unsigned int flags, Error
> **errp)
> @@ -3040,10 +3038,8 @@ static void memory_vm_change_state_handler(void
> *opaque, bool running,
> }
> }
>
> -void memory_global_dirty_log_stop(unsigned int flags)
> +bool memory_global_dirty_log_stop(unsigned int flags, Error **errp)
> {
> - Error *local_err = NULL;
> -
> if (!runstate_is_running()) {
> /* Postpone the dirty log stop, e.g., to when VM starts again */
> if (vmstate_change) {
> @@ -3054,12 +3050,10 @@ void memory_global_dirty_log_stop(unsigned int
> flags)
> vmstate_change = qemu_add_vm_change_state_handler(
> memory_vm_change_state_handler, NULL);
> }
> - return;
> + return true;
> }
>
> - if (!memory_global_dirty_log_do_stop(flags, &local_err)) {
> - error_report_err(local_err);
> - }
> + return memory_global_dirty_log_do_stop(flags, errp);
> }
>
> static void listener_add_address_space(MemoryListener *listener,
> --
> 2.44.0
>
>
Reviewed-by: Hyman Huang <yong.huang@smartx.com>
--
Best regards
[-- Attachment #2: Type: text/html, Size: 14531 bytes --]
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH v3 10/26] migration: Move cleanup after after error reporting in qemu_savevm_state_setup()
2024-03-04 12:28 ` [PATCH v3 10/26] migration: Move cleanup after after error reporting in qemu_savevm_state_setup() Cédric Le Goater
@ 2024-03-05 3:32 ` Peter Xu
2024-03-05 7:57 ` Cédric Le Goater
0 siblings, 1 reply; 50+ messages in thread
From: Peter Xu @ 2024-03-05 3:32 UTC (permalink / raw)
To: Cédric Le Goater
Cc: qemu-devel, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster
On Mon, Mar 04, 2024 at 01:28:28PM +0100, Cédric Le Goater wrote:
> This will help preserving the error set by .save_setup() handlers.
>
> Signed-off-by: Cédric Le Goater <clg@redhat.com>
IIUC this is about the next patch. I got fully confused before reading
into the next one. IMHO we can squash it into where it's used.
Thanks,
> ---
> migration/savevm.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 31ce9391d49c825d4ec835e26ac0246e192783a0..e400706e61e06d2d1d03a11aed14f30a243833f2 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1740,10 +1740,10 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
> qemu_savevm_state_complete_precopy(f, false, false);
> ret = qemu_file_get_error(f);
> }
> - qemu_savevm_state_cleanup();
> if (ret != 0) {
> error_setg_errno(errp, -ret, "Error while writing VM state");
> }
> + qemu_savevm_state_cleanup();
>
> if (ret != 0) {
> status = MIGRATION_STATUS_FAILED;
> --
> 2.44.0
>
>
--
Peter Xu
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH v3 04/26] migration: Always report an error in ram_save_setup()
2024-03-04 12:28 ` [PATCH v3 04/26] migration: Always report an error in ram_save_setup() Cédric Le Goater
2024-03-04 21:30 ` Fabiano Rosas
@ 2024-03-05 5:29 ` Prasad Pandit
2024-03-05 8:53 ` Cédric Le Goater
1 sibling, 1 reply; 50+ messages in thread
From: Prasad Pandit @ 2024-03-05 5:29 UTC (permalink / raw)
To: Cédric Le Goater
Cc: qemu-devel, Peter Xu, Fabiano Rosas, Alex Williamson,
Avihai Horon, Philippe Mathieu-Daudé, Markus Armbruster
On Mon, 4 Mar 2024 at 18:01, Cédric Le Goater <clg@redhat.com> wrote:
> This will prepare ground for futur changes adding an Error** argument
* futur -> future
> + ret = qemu_fflush(f);
> + if (ret) {
* if (ret) -> if (ret < 0)
Thank you.
---
- Prasad
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH v3 01/26] s390/stattrib: Add Error** argument to set_migrationmode() handler
2024-03-04 20:49 ` Fabiano Rosas
@ 2024-03-05 7:12 ` Cédric Le Goater
0 siblings, 0 replies; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-05 7:12 UTC (permalink / raw)
To: Fabiano Rosas, qemu-devel
Cc: Peter Xu, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster, Halil Pasic,
Christian Borntraeger, Thomas Huth
On 3/4/24 21:49, Fabiano Rosas wrote:
> Cédric Le Goater <clg@redhat.com> writes:
>
>> This will prepare ground for futur changes adding an Error** argument
>> to the save_setup() handler. We need to make sure that on failure,
>> set_migrationmode() always sets a new error. See the Rules section in
>> qapi/error.h.
>>
>> Cc: Halil Pasic <pasic@linux.ibm.com>
>> Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
>> Cc: Thomas Huth <thuth@redhat.com>
>> Signed-off-by: Cédric Le Goater <clg@redhat.com>
>> ---
>> include/hw/s390x/storage-attributes.h | 2 +-
>> hw/s390x/s390-stattrib-kvm.c | 12 ++++++++++--
>> hw/s390x/s390-stattrib.c | 14 +++++++++-----
>> 3 files changed, 20 insertions(+), 8 deletions(-)
>>
>> diff --git a/include/hw/s390x/storage-attributes.h b/include/hw/s390x/storage-attributes.h
>> index 5239eb538c1b087797867a247abfc14551af6a4d..8921a04d514bf64a3113255ee10ed33fc598ae06 100644
>> --- a/include/hw/s390x/storage-attributes.h
>> +++ b/include/hw/s390x/storage-attributes.h
>> @@ -39,7 +39,7 @@ struct S390StAttribClass {
>> int (*set_stattr)(S390StAttribState *sa, uint64_t start_gfn,
>> uint32_t count, uint8_t *values);
>> void (*synchronize)(S390StAttribState *sa);
>> - int (*set_migrationmode)(S390StAttribState *sa, bool value);
>> + int (*set_migrationmode)(S390StAttribState *sa, bool value, Error **errp);
>> int (*get_active)(S390StAttribState *sa);
>> long long (*get_dirtycount)(S390StAttribState *sa);
>> };
>> diff --git a/hw/s390x/s390-stattrib-kvm.c b/hw/s390x/s390-stattrib-kvm.c
>> index 24cd01382e2d74d62c2d7e980eb6aca1077d893d..357cea2c987213b867c81b0e258f7d0c293fe665 100644
>> --- a/hw/s390x/s390-stattrib-kvm.c
>> +++ b/hw/s390x/s390-stattrib-kvm.c
>> @@ -17,6 +17,7 @@
>> #include "sysemu/kvm.h"
>> #include "exec/ram_addr.h"
>> #include "kvm/kvm_s390x.h"
>> +#include "qapi/error.h"
>>
>> Object *kvm_s390_stattrib_create(void)
>> {
>> @@ -137,14 +138,21 @@ static void kvm_s390_stattrib_synchronize(S390StAttribState *sa)
>> }
>> }
>>
>> -static int kvm_s390_stattrib_set_migrationmode(S390StAttribState *sa, bool val)
>> +static int kvm_s390_stattrib_set_migrationmode(S390StAttribState *sa, bool val,
>> + Error **errp)
>> {
>> struct kvm_device_attr attr = {
>> .group = KVM_S390_VM_MIGRATION,
>> .attr = val,
>> .addr = 0,
>> };
>> - return kvm_vm_ioctl(kvm_state, KVM_SET_DEVICE_ATTR, &attr);
>> + int r;
>> +
>> + r = kvm_vm_ioctl(kvm_state, KVM_SET_DEVICE_ATTR, &attr);
>> + if (r) {
>> + error_setg_errno(errp, -r, "KVM_S390_SET_CMMA_BITS failed");
>
> Did you mean KVM_SET_DEVICE_ATTR?
Drat. Copy paste :)
Thanks,
C.
>
>> + }
>> + return r;
>> }
>>
>> static long long kvm_s390_stattrib_get_dirtycount(S390StAttribState *sa)
>> diff --git a/hw/s390x/s390-stattrib.c b/hw/s390x/s390-stattrib.c
>> index c483b62a9b5f71772639fc180bdad15ecb6711cb..e99de190332a82363b1388bbc450013149295bc0 100644
>> --- a/hw/s390x/s390-stattrib.c
>> +++ b/hw/s390x/s390-stattrib.c
>> @@ -60,11 +60,12 @@ void hmp_migrationmode(Monitor *mon, const QDict *qdict)
>> S390StAttribState *sas = s390_get_stattrib_device();
>> S390StAttribClass *sac = S390_STATTRIB_GET_CLASS(sas);
>> uint64_t what = qdict_get_int(qdict, "mode");
>> + Error *local_err = NULL;
>> int r;
>>
>> - r = sac->set_migrationmode(sas, what);
>> + r = sac->set_migrationmode(sas, what, &local_err);
>> if (r < 0) {
>> - monitor_printf(mon, "Error: %s", strerror(-r));
>> + monitor_printf(mon, "Error: %s", error_get_pretty(local_err));
>> }
>> }
>>
>> @@ -170,13 +171,15 @@ static int cmma_save_setup(QEMUFile *f, void *opaque)
>> {
>> S390StAttribState *sas = S390_STATTRIB(opaque);
>> S390StAttribClass *sac = S390_STATTRIB_GET_CLASS(sas);
>> + Error *local_err = NULL;
>> int res;
>> /*
>> * Signal that we want to start a migration, thus needing PGSTE dirty
>> * tracking.
>> */
>> - res = sac->set_migrationmode(sas, 1);
>> + res = sac->set_migrationmode(sas, true, &local_err);
>> if (res) {
>> + error_report_err(local_err);
>> return res;
>> }
>> qemu_put_be64(f, STATTR_FLAG_EOS);
>> @@ -260,7 +263,7 @@ static void cmma_save_cleanup(void *opaque)
>> {
>> S390StAttribState *sas = S390_STATTRIB(opaque);
>> S390StAttribClass *sac = S390_STATTRIB_GET_CLASS(sas);
>> - sac->set_migrationmode(sas, 0);
>> + sac->set_migrationmode(sas, false, NULL);
>> }
>>
>> static bool cmma_active(void *opaque)
>> @@ -293,7 +296,8 @@ static long long qemu_s390_get_dirtycount_stub(S390StAttribState *sa)
>> {
>> return 0;
>> }
>> -static int qemu_s390_set_migrationmode_stub(S390StAttribState *sa, bool value)
>> +static int qemu_s390_set_migrationmode_stub(S390StAttribState *sa, bool value,
>> + Error **errp)
>> {
>> return 0;
>> }
>
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH v3 05/26] migration: Add Error** argument to vmstate_save()
2024-03-04 12:28 ` [PATCH v3 05/26] migration: Add Error** argument to vmstate_save() Cédric Le Goater
@ 2024-03-05 7:44 ` Prasad Pandit
0 siblings, 0 replies; 50+ messages in thread
From: Prasad Pandit @ 2024-03-05 7:44 UTC (permalink / raw)
To: Cédric Le Goater
Cc: qemu-devel, Peter Xu, Fabiano Rosas, Alex Williamson,
Avihai Horon, Philippe Mathieu-Daudé, Markus Armbruster
On Mon, 4 Mar 2024 at 19:38, Cédric Le Goater <clg@redhat.com> wrote:
> This will prepare ground for futur changes adding an Error** argument
* futur -> furure
>
> -static int vmstate_save(QEMUFile *f, SaveStateEntry *se, JSONWriter *vmdesc)
> +static int vmstate_save(QEMUFile *f, SaveStateEntry *se, JSONWriter *vmdesc,
> + Error **errp)
> {
> int ret;
> - Error *local_err = NULL;
> - MigrationState *s = migrate_get_current();
>
> if ((!se->ops || !se->ops->save_state) && !se->vmsd) {
> return 0;
> @@ -1034,10 +1033,9 @@ static int vmstate_save(QEMUFile *f, SaveStateEntry *se, JSONWriter *vmdesc)
> if (!se->vmsd) {
> vmstate_save_old_style(f, se, vmdesc);
> } else {
> - ret = vmstate_save_state_with_err(f, se->vmsd, se->opaque, vmdesc, &local_err);
> + ret = vmstate_save_state_with_err(f, se->vmsd, se->opaque, vmdesc,
> + errp);
> if (ret) {
> - migrate_set_error(s, local_err);
> - error_report_err(local_err);
> return ret;
> }
> }
> @@ -1324,8 +1322,10 @@ void qemu_savevm_state_setup(QEMUFile *f)
> trace_savevm_state_setup();
> QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
> if (se->vmsd && se->vmsd->early_setup) {
> - ret = vmstate_save(f, se, ms->vmdesc);
> + ret = vmstate_save(f, se, ms->vmdesc, &local_err);
> if (ret) {
> + migrate_set_error(ms, local_err);
> + error_report_err(local_err);
> qemu_file_set_error(f, ret);
> break;
> }
> @@ -1540,6 +1540,7 @@ int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f,
> JSONWriter *vmdesc = ms->vmdesc;
> int vmdesc_len;
> SaveStateEntry *se;
> + Error *local_err = NULL;
> int ret;
>
> QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
> @@ -1550,8 +1551,10 @@ int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f,
>
> start_ts_each = qemu_clock_get_us(QEMU_CLOCK_REALTIME);
>
> - ret = vmstate_save(f, se, vmdesc);
> + ret = vmstate_save(f, se, vmdesc, &local_err);
> if (ret) {
> + migrate_set_error(ms, local_err);
> + error_report_err(local_err);
> qemu_file_set_error(f, ret);
> return ret;
> }
> @@ -1566,7 +1569,6 @@ int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f,
> * bdrv_activate_all() on the other end won't fail. */
> ret = bdrv_inactivate_all();
> if (ret) {
> - Error *local_err = NULL;
> error_setg(&local_err, "%s: bdrv_inactivate_all() failed (%d)",
> __func__, ret);
> migrate_set_error(ms, local_err);
> @@ -1762,6 +1764,8 @@ void qemu_savevm_live_state(QEMUFile *f)
>
> int qemu_save_device_state(QEMUFile *f)
> {
> + MigrationState *ms = migrate_get_current();
> + Error *local_err = NULL;
> SaveStateEntry *se;
>
> if (!migration_in_colo_state()) {
> @@ -1776,8 +1780,10 @@ int qemu_save_device_state(QEMUFile *f)
> if (se->is_ram) {
> continue;
> }
> - ret = vmstate_save(f, se, NULL);
> + ret = vmstate_save(f, se, NULL, &local_err);
> if (ret) {
> + migrate_set_error(ms, local_err);
> + error_report_err(local_err);
> return ret;
> }
> }
> --
Reviewed-by: Prasad Pandit <pjp@fedoraproject.org>
Thank you.
---
- Prasad
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH v3 10/26] migration: Move cleanup after after error reporting in qemu_savevm_state_setup()
2024-03-05 3:32 ` Peter Xu
@ 2024-03-05 7:57 ` Cédric Le Goater
0 siblings, 0 replies; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-05 7:57 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster
On 3/5/24 04:32, Peter Xu wrote:
> On Mon, Mar 04, 2024 at 01:28:28PM +0100, Cédric Le Goater wrote:
>> This will help preserving the error set by .save_setup() handlers.
>>
>> Signed-off-by: Cédric Le Goater <clg@redhat.com>
>
> IIUC this is about the next patch. I got fully confused before reading
> into the next one. IMHO we can squash it into where it's used.
That's where the change was initially ... I thought extracting it in its
own patch would clarify. Oh well nevermind, I will put it back and add
a comment in the commit log.
Thanks,
C.
>
> Thanks,
>
>> ---
>> migration/savevm.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/migration/savevm.c b/migration/savevm.c
>> index 31ce9391d49c825d4ec835e26ac0246e192783a0..e400706e61e06d2d1d03a11aed14f30a243833f2 100644
>> --- a/migration/savevm.c
>> +++ b/migration/savevm.c
>> @@ -1740,10 +1740,10 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
>> qemu_savevm_state_complete_precopy(f, false, false);
>> ret = qemu_file_get_error(f);
>> }
>> - qemu_savevm_state_cleanup();
>> if (ret != 0) {
>> error_setg_errno(errp, -ret, "Error while writing VM state");
>> }
>> + qemu_savevm_state_cleanup();
>>
>> if (ret != 0) {
>> status = MIGRATION_STATUS_FAILED;
>> --
>> 2.44.0
>>
>>
>
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH v3 14/26] memory: Add Error** argument to .log_global*() handlers
2024-03-04 12:28 ` [PATCH v3 14/26] memory: Add Error** argument to .log_global*() handlers Cédric Le Goater
@ 2024-03-05 7:57 ` Peter Xu
2024-03-05 14:25 ` Cédric Le Goater
0 siblings, 1 reply; 50+ messages in thread
From: Peter Xu @ 2024-03-05 7:57 UTC (permalink / raw)
To: Cédric Le Goater
Cc: qemu-devel, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster,
Stefano Stabellini, Anthony Perard, Paul Durrant,
Michael S. Tsirkin, Paolo Bonzini, David Hildenbrand
On Mon, Mar 04, 2024 at 01:28:32PM +0100, Cédric Le Goater wrote:
> @@ -2936,15 +2975,22 @@ void memory_global_dirty_log_start(unsigned int flags)
> trace_global_dirty_changed(global_dirty_tracking);
>
> if (!old_flags) {
> - MEMORY_LISTENER_CALL_GLOBAL(log_global_start, Forward);
> + MEMORY_LISTENER_CALL_LOG_GLOBAL(log_global_start, Forward,
> + &local_err);
> + if (local_err) {
> + error_report_err(local_err);
> + return;
Returns here means global_dirty_tracking will keep the new value even if
it's not truly commited globally (in memory_region_transaction_commit()
later below). I think it'll cause inconsistency: global_dirty_tracking
should reflect the global status of dirty tracking, and that should match
with the MR status cached in FlatViews (which is used in memory core to
reflect address space translations).
For some details on how that flag applied to each MR, feel free to have a
quick look in address_space_update_topology_pass() of the "else if (frold
&& frnew && flatrange_equal(frold, frnew))".
Here IIUC if to fully support a graceful failure (IIUC that is the goal for
VFIO.. and this op should be easily triggerable by the user), then we need
to do proper unwind on both:
- Call proper log_global_stop() on those who has already been started
successfully before the current failed log_global_start(), then,
- Reset global_dirty_tracking to old_flags before return
We may want to make sure trace_global_dirty_changed() is only called when
all things succeeded.
I don't have a strong opinion on whether do we need similar error report
interfaces for _stop() and _log_sync(). I'd still suggest the same that we
drop them to make the patch simpler, but only add such error reports for
log_global_start(). If they never get triggered they're dead code anyway,
so I don't think "having errp for all APIs" is a must-to-have at least to me.
Thanks,
> + }
> memory_region_transaction_begin();
> memory_region_update_pending = true;
> memory_region_transaction_commit();
> }
> }
--
Peter Xu
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH v3 00/26] migration: Improve error reporting
2024-03-04 12:28 [PATCH v3 00/26] migration: Improve error reporting Cédric Le Goater
` (25 preceding siblings ...)
2024-03-04 12:28 ` [PATCH v3 26/26] vfio: Extend vfio_set_migration_error() with Error* argument Cédric Le Goater
@ 2024-03-05 8:06 ` Peter Xu
2024-03-05 8:30 ` Cédric Le Goater
26 siblings, 1 reply; 50+ messages in thread
From: Peter Xu @ 2024-03-05 8:06 UTC (permalink / raw)
To: Cédric Le Goater
Cc: qemu-devel, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster
On Mon, Mar 04, 2024 at 01:28:18PM +0100, Cédric Le Goater wrote:
> migration: Report error when shutdown fails
> migration: Remove SaveStateHandler and LoadStateHandler typedefs
> migration: Add documentation for SaveVMHandlers
> migration: Do not call PRECOPY_NOTIFY_SETUP notifiers in case of error
These four patches seem to be pretty standalone ones and got at least 1
ACKs. I queued them for 9.0, thanks.
--
Peter Xu
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH v3 03/26] migration: Always report an error in block_save_setup()
2024-03-04 21:04 ` Fabiano Rosas
@ 2024-03-05 8:23 ` Cédric Le Goater
0 siblings, 0 replies; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-05 8:23 UTC (permalink / raw)
To: Fabiano Rosas, qemu-devel
Cc: Peter Xu, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster, Stefan Hajnoczi,
qemu-block
On 3/4/24 22:04, Fabiano Rosas wrote:
> Cédric Le Goater <clg@redhat.com> writes:
>
>> This will prepare ground for futur changes adding an Error** argument
>> to the save_setup() handler. We need to make sure that on failure,
>> block_save_setup() always sets a new error.
>>
>> Cc: Stefan Hajnoczi <stefanha@redhat.com>
>> Signed-off-by: Cédric Le Goater <clg@redhat.com>
>> ---
>> migration/block.c | 18 +++++++++++++-----
>> 1 file changed, 13 insertions(+), 5 deletions(-)
>>
>> diff --git a/migration/block.c b/migration/block.c
>> index 8c6ebafacc1ffe930d1d4f19d968817b14852c69..06f5857cf049df3261d2cf1d7c3607ab92350ac6 100644
>> --- a/migration/block.c
>> +++ b/migration/block.c
>> @@ -367,7 +367,7 @@ static void unset_dirty_tracking(void)
>> }
>> }
>>
>> -static int init_blk_migration(QEMUFile *f)
>> +static int init_blk_migration(QEMUFile *f, Error **errp)
>> {
>> BlockDriverState *bs;
>> BlkMigDevState *bmds;
>> @@ -378,7 +378,6 @@ static int init_blk_migration(QEMUFile *f)
>> BlkMigDevState *bmds;
>> BlockDriverState *bs;
>> } *bmds_bs;
>> - Error *local_err = NULL;
>> int ret;
>>
>> GRAPH_RDLOCK_GUARD_MAINLOOP();
>
> There's a negative return below at:
>
> for (i = 0, bs = bdrv_first(&it); bs; bs = bdrv_next(&it), i++) {
> if (bdrv_is_read_only(bs)) {
> continue;
> }
>
> sectors = bdrv_nb_sectors(bs);
> if (sectors <= 0) {
> ret = sectors;
> ^
> bdrv_next_cleanup(&it);
> goto out;
> }
> ...
Indeed.
I understand that the bdrv_nb_sectors() == 0 case only breaks the loop but
it is not considered as an error. Could the block folks confirm please ?
Thanks,
C.
>
> I presume that must be covered by an error as well. A similar function
> somewhere else reports:
>
> total_sectors = blk_nb_sectors(blk);
> if (total_sectors <= 0) {
> error_report("Error getting length of block device %s",
> device_name);
> return -EINVAL;
> }
>
>> @@ -439,9 +438,8 @@ static int init_blk_migration(QEMUFile *f)
>> bs = bmds_bs[i].bs;
>>
>> if (bmds) {
>> - ret = blk_insert_bs(bmds->blk, bs, &local_err);
>> + ret = blk_insert_bs(bmds->blk, bs, errp);
>> if (ret < 0) {
>> - error_report_err(local_err);
>> goto out;
>> }
>>
>> @@ -711,6 +709,7 @@ static void block_migration_cleanup(void *opaque)
>> static int block_save_setup(QEMUFile *f, void *opaque)
>> {
>> int ret;
>> + Error *local_err = NULL;
>>
>> trace_migration_block_save("setup", block_mig_state.submitted,
>> block_mig_state.transferred);
>> @@ -718,18 +717,27 @@ static int block_save_setup(QEMUFile *f, void *opaque)
>> warn_report("block migration is deprecated;"
>> " use blockdev-mirror with NBD instead");
>>
>> - ret = init_blk_migration(f);
>> + ret = init_blk_migration(f, &local_err);
>> if (ret < 0) {
>> + error_report_err(local_err);
>> return ret;
>> }
>>
>> /* start track dirty blocks */
>> ret = set_dirty_tracking();
>> if (ret) {
>> + error_setg_errno(&local_err, -ret,
>> + "Failed to start block dirty tracking");
>> + error_report_err(local_err);
>> return ret;
>> }
>>
>> ret = flush_blks(f);
>> + if (ret) {
>> + error_setg_errno(&local_err, -ret, "Flushing block failed");
>> + error_report_err(local_err);
>> + return ret;
>> + }
>> blk_mig_reset_dirty_cursor();
>> qemu_put_be64(f, BLK_MIG_FLAG_EOS);
>
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH v3 00/26] migration: Improve error reporting
2024-03-05 8:06 ` [PATCH v3 00/26] migration: Improve error reporting Peter Xu
@ 2024-03-05 8:30 ` Cédric Le Goater
0 siblings, 0 replies; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-05 8:30 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster
On 3/5/24 09:06, Peter Xu wrote:
> On Mon, Mar 04, 2024 at 01:28:18PM +0100, Cédric Le Goater wrote:
>> migration: Report error when shutdown fails
>> migration: Remove SaveStateHandler and LoadStateHandler typedefs
>> migration: Add documentation for SaveVMHandlers
>> migration: Do not call PRECOPY_NOTIFY_SETUP notifiers in case of error
>
> These four patches seem to be pretty standalone ones and got at least 1
> ACKs. I queued them for 9.0, thanks.
OK.
I will try to have the first 5 ready before 9.0 :
s390/stattrib: Add Error** argument to set_migrationmode() handler
vfio: Always report an error in vfio_save_setup()
migration: Always report an error in block_save_setup()
migration: Always report an error in ram_save_setup()
migration: Add Error** argument to vmstate_save()
So that we only have the core changes in log_global_start() and
ram_init_bitmaps() to address in the next cycle.
As for the VFIO part coming after, I will see which initial cleanups
we can merge before soft freeze.
Thanks,
C.
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH v3 04/26] migration: Always report an error in ram_save_setup()
2024-03-04 21:30 ` Fabiano Rosas
@ 2024-03-05 8:52 ` Cédric Le Goater
0 siblings, 0 replies; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-05 8:52 UTC (permalink / raw)
To: Fabiano Rosas, qemu-devel
Cc: Peter Xu, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster
On 3/4/24 22:30, Fabiano Rosas wrote:
> Cédric Le Goater <clg@redhat.com> writes:
>
>> This will prepare ground for futur changes adding an Error** argument
>> to the save_setup() handler. We need to make sure that on failure,
>> ram_save_setup() sets a new error.
>>
>> Signed-off-by: Cédric Le Goater <clg@redhat.com>
>> ---
>> migration/ram.c | 11 ++++++++++-
>> 1 file changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/migration/ram.c b/migration/ram.c
>> index 45a00b45edd429ef0568adce09b7459883e00167..dbd04d8ee2167881007c836a6bbc79c1aced72d0 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -2937,12 +2937,14 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>> int ret;
>>
>> if (compress_threads_save_setup()) {
>> + error_report("%s: failed to start compress threads", __func__);
>> return -1;
>> }
>>
>> /* migration has already setup the bitmap, reuse it. */
>> if (!migration_in_colo_state()) {
>> if (ram_init_all(rsp) != 0) {
>> + error_report("%s: failed to setup RAM for migration", __func__);
>> compress_threads_save_cleanup();
>> return -1;
>> }
>> @@ -2969,12 +2971,14 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>>
>> ret = rdma_registration_start(f, RAM_CONTROL_SETUP);
>> if (ret < 0) {
>> + error_report("%s: failed to start RDMA registration", __func__);
>> qemu_file_set_error(f, ret);
>> return ret;
>> }
>>
>> ret = rdma_registration_stop(f, RAM_CONTROL_SETUP);
>> if (ret < 0) {
>> + error_report("%s: failed to stop RDMA registration", __func__);
>> qemu_file_set_error(f, ret);
>> return ret;
>> }
>> @@ -2986,6 +2990,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>> ret = multifd_send_sync_main();
>> bql_lock();
>> if (ret < 0) {
>> + error_report("%s: multifd synchronization failed", __func__);
>> return ret;
>> }
>>
>> @@ -2994,7 +2999,11 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>> }
>>
>> qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
>> - return qemu_fflush(f);
>> + ret = qemu_fflush(f);
>> + if (ret) {
>> + error_report("%s failed : %s", __func__, strerror(ret));
>
> Here it should be -ret
>
> The qemu_fflush function returns the QEMUFile error (f->last_error) and
> that is expected to be a -errno value.
oh yes. I should have know that since qemu_file_set_error() takes a -errno.
> I started making sure all callers of qemu_file_set_error() use a
> negative value, but got lost in the vmstate code and never finished:
>
> https://lore.kernel.org/r/20230706195201.18595-1-farosas@suse.de
Thanks for the pointer.
Thanks,
C.
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH v3 04/26] migration: Always report an error in ram_save_setup()
2024-03-05 5:29 ` Prasad Pandit
@ 2024-03-05 8:53 ` Cédric Le Goater
0 siblings, 0 replies; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-05 8:53 UTC (permalink / raw)
To: Prasad Pandit
Cc: qemu-devel, Peter Xu, Fabiano Rosas, Alex Williamson,
Avihai Horon, Philippe Mathieu-Daudé, Markus Armbruster
On 3/5/24 06:29, Prasad Pandit wrote:
> On Mon, 4 Mar 2024 at 18:01, Cédric Le Goater <clg@redhat.com> wrote:
>> This will prepare ground for futur changes adding an Error** argument
>
> * futur -> future
Fixed in all first 5 patches.
>
>> + ret = qemu_fflush(f);
>> + if (ret) {
>
> * if (ret) -> if (ret < 0)
yep. Thanks,
C.
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH v3 12/26] migration: Add Error** argument to .save_setup() handler
2024-03-04 12:28 ` [PATCH v3 12/26] migration: Add Error** argument to .save_setup() handler Cédric Le Goater
@ 2024-03-05 8:54 ` Thomas Huth
0 siblings, 0 replies; 50+ messages in thread
From: Thomas Huth @ 2024-03-05 8:54 UTC (permalink / raw)
To: Cédric Le Goater, qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster, Nicholas Piggin,
Harsh Prateek Bora, Halil Pasic, Eric Blake,
Vladimir Sementsov-Ogievskiy, John Snow, Stefan Hajnoczi
On 04/03/2024 13.28, Cédric Le Goater wrote:
> The purpose is to record a potential error in the migration stream if
> qemu_savevm_state_setup() fails. Most of the current .save_setup()
> handlers can be modified to use the Error argument instead of managing
> their own and calling locally error_report().
>
> Cc: Nicholas Piggin <npiggin@gmail.com>
> Cc: Harsh Prateek Bora <harshpb@linux.ibm.com>
> Cc: Halil Pasic <pasic@linux.ibm.com>
> Cc: Thomas Huth <thuth@redhat.com>
> Cc: Eric Blake <eblake@redhat.com>
> Cc: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> Cc: John Snow <jsnow@redhat.com>
> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> Reviewed-by: Peter Xu <peterx@redhat.com>
> Signed-off-by: Cédric Le Goater <clg@redhat.com>
> ---
Reviewed-by: Thomas Huth <thuth@redhat.com>
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH v3 14/26] memory: Add Error** argument to .log_global*() handlers
2024-03-05 7:57 ` Peter Xu
@ 2024-03-05 14:25 ` Cédric Le Goater
0 siblings, 0 replies; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-05 14:25 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Fabiano Rosas, Alex Williamson, Avihai Horon,
Philippe Mathieu-Daudé, Markus Armbruster,
Stefano Stabellini, Anthony Perard, Paul Durrant,
Michael S. Tsirkin, Paolo Bonzini, David Hildenbrand
On 3/5/24 08:57, Peter Xu wrote:
> On Mon, Mar 04, 2024 at 01:28:32PM +0100, Cédric Le Goater wrote:
>> @@ -2936,15 +2975,22 @@ void memory_global_dirty_log_start(unsigned int flags)
>> trace_global_dirty_changed(global_dirty_tracking);
>>
>> if (!old_flags) {
>> - MEMORY_LISTENER_CALL_GLOBAL(log_global_start, Forward);
>> + MEMORY_LISTENER_CALL_LOG_GLOBAL(log_global_start, Forward,
>> + &local_err);
>> + if (local_err) {
>> + error_report_err(local_err);
>> + return;
>
> Returns here means global_dirty_tracking will keep the new value even if
> it's not truly commited globally (in memory_region_transaction_commit()
> later below). I think it'll cause inconsistency: global_dirty_tracking
> should reflect the global status of dirty tracking, and that should match
> with the MR status cached in FlatViews (which is used in memory core to
> reflect address space translations).
You are right. FlatRange::dirty_log_mask could be out sync with
global_dirty_tracking.
> For some details on how that flag applied to each MR, feel free to have a
> quick look in address_space_update_topology_pass() of the "else if (frold
> && frnew && flatrange_equal(frold, frnew))".>
> Here IIUC if to fully support a graceful failure (IIUC that is the goal for
> VFIO.. and this op should be easily triggerable by the user), then we need
> to do proper unwind on both:
>
> - Call proper log_global_stop() on those who has already been started
> successfully before the current failed log_global_start(), then,
Yes. This needs more work to restore the initial state. The current
proposal is relying on save_cleanup() to restore the previous state.
This is not enough.
> - Reset global_dirty_tracking to old_flags before return
>
> We may want to make sure trace_global_dirty_changed() is only called when
> all things succeeded.
That should be done after the loop on listeners even today.
> I don't have a strong opinion on whether do we need similar error report
> interfaces for _stop() and _log_sync().
Yes. Let's focus on log_global_start(). The other changes are not necessary
for VFIO.
> I'd still suggest the same that we
> drop them to make the patch simpler, but only add such error reports for
> log_global_start(). If they never get triggered they're dead code anyway,
> so I don't think "having errp for all APIs" is a must-to-have at least to me.
I am fine with that.
Thanks,
C.
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH v3 01/26] s390/stattrib: Add Error** argument to set_migrationmode() handler
2024-03-04 12:28 ` [PATCH v3 01/26] s390/stattrib: Add Error** argument to set_migrationmode() handler Cédric Le Goater
2024-03-04 20:49 ` Fabiano Rosas
@ 2024-03-05 14:54 ` Avihai Horon
1 sibling, 0 replies; 50+ messages in thread
From: Avihai Horon @ 2024-03-05 14:54 UTC (permalink / raw)
To: Cédric Le Goater, qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson,
Philippe Mathieu-Daudé, Markus Armbruster, Halil Pasic,
Christian Borntraeger, Thomas Huth
On 04/03/2024 14:28, Cédric Le Goater wrote:
> External email: Use caution opening links or attachments
>
>
> This will prepare ground for futur changes adding an Error** argument
> to the save_setup() handler. We need to make sure that on failure,
> set_migrationmode() always sets a new error. See the Rules section in
> qapi/error.h.
>
> Cc: Halil Pasic <pasic@linux.ibm.com>
> Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
> Cc: Thomas Huth <thuth@redhat.com>
> Signed-off-by: Cédric Le Goater <clg@redhat.com>
> ---
> include/hw/s390x/storage-attributes.h | 2 +-
> hw/s390x/s390-stattrib-kvm.c | 12 ++++++++++--
> hw/s390x/s390-stattrib.c | 14 +++++++++-----
> 3 files changed, 20 insertions(+), 8 deletions(-)
>
> diff --git a/include/hw/s390x/storage-attributes.h b/include/hw/s390x/storage-attributes.h
> index 5239eb538c1b087797867a247abfc14551af6a4d..8921a04d514bf64a3113255ee10ed33fc598ae06 100644
> --- a/include/hw/s390x/storage-attributes.h
> +++ b/include/hw/s390x/storage-attributes.h
> @@ -39,7 +39,7 @@ struct S390StAttribClass {
> int (*set_stattr)(S390StAttribState *sa, uint64_t start_gfn,
> uint32_t count, uint8_t *values);
> void (*synchronize)(S390StAttribState *sa);
> - int (*set_migrationmode)(S390StAttribState *sa, bool value);
> + int (*set_migrationmode)(S390StAttribState *sa, bool value, Error **errp);
> int (*get_active)(S390StAttribState *sa);
> long long (*get_dirtycount)(S390StAttribState *sa);
> };
> diff --git a/hw/s390x/s390-stattrib-kvm.c b/hw/s390x/s390-stattrib-kvm.c
> index 24cd01382e2d74d62c2d7e980eb6aca1077d893d..357cea2c987213b867c81b0e258f7d0c293fe665 100644
> --- a/hw/s390x/s390-stattrib-kvm.c
> +++ b/hw/s390x/s390-stattrib-kvm.c
> @@ -17,6 +17,7 @@
> #include "sysemu/kvm.h"
> #include "exec/ram_addr.h"
> #include "kvm/kvm_s390x.h"
> +#include "qapi/error.h"
>
> Object *kvm_s390_stattrib_create(void)
> {
> @@ -137,14 +138,21 @@ static void kvm_s390_stattrib_synchronize(S390StAttribState *sa)
> }
> }
>
> -static int kvm_s390_stattrib_set_migrationmode(S390StAttribState *sa, bool val)
> +static int kvm_s390_stattrib_set_migrationmode(S390StAttribState *sa, bool val,
> + Error **errp)
> {
> struct kvm_device_attr attr = {
> .group = KVM_S390_VM_MIGRATION,
> .attr = val,
> .addr = 0,
> };
> - return kvm_vm_ioctl(kvm_state, KVM_SET_DEVICE_ATTR, &attr);
> + int r;
> +
> + r = kvm_vm_ioctl(kvm_state, KVM_SET_DEVICE_ATTR, &attr);
> + if (r) {
> + error_setg_errno(errp, -r, "KVM_S390_SET_CMMA_BITS failed");
> + }
> + return r;
> }
>
> static long long kvm_s390_stattrib_get_dirtycount(S390StAttribState *sa)
> diff --git a/hw/s390x/s390-stattrib.c b/hw/s390x/s390-stattrib.c
> index c483b62a9b5f71772639fc180bdad15ecb6711cb..e99de190332a82363b1388bbc450013149295bc0 100644
> --- a/hw/s390x/s390-stattrib.c
> +++ b/hw/s390x/s390-stattrib.c
> @@ -60,11 +60,12 @@ void hmp_migrationmode(Monitor *mon, const QDict *qdict)
> S390StAttribState *sas = s390_get_stattrib_device();
> S390StAttribClass *sac = S390_STATTRIB_GET_CLASS(sas);
> uint64_t what = qdict_get_int(qdict, "mode");
> + Error *local_err = NULL;
> int r;
>
> - r = sac->set_migrationmode(sas, what);
> + r = sac->set_migrationmode(sas, what, &local_err);
> if (r < 0) {
> - monitor_printf(mon, "Error: %s", strerror(-r));
> + monitor_printf(mon, "Error: %s", error_get_pretty(local_err));
I think we need to free the error here:
error_free(local_err);
Thanks.
> }
> }
>
> @@ -170,13 +171,15 @@ static int cmma_save_setup(QEMUFile *f, void *opaque)
> {
> S390StAttribState *sas = S390_STATTRIB(opaque);
> S390StAttribClass *sac = S390_STATTRIB_GET_CLASS(sas);
> + Error *local_err = NULL;
> int res;
> /*
> * Signal that we want to start a migration, thus needing PGSTE dirty
> * tracking.
> */
> - res = sac->set_migrationmode(sas, 1);
> + res = sac->set_migrationmode(sas, true, &local_err);
> if (res) {
> + error_report_err(local_err);
> return res;
> }
> qemu_put_be64(f, STATTR_FLAG_EOS);
> @@ -260,7 +263,7 @@ static void cmma_save_cleanup(void *opaque)
> {
> S390StAttribState *sas = S390_STATTRIB(opaque);
> S390StAttribClass *sac = S390_STATTRIB_GET_CLASS(sas);
> - sac->set_migrationmode(sas, 0);
> + sac->set_migrationmode(sas, false, NULL);
> }
>
> static bool cmma_active(void *opaque)
> @@ -293,7 +296,8 @@ static long long qemu_s390_get_dirtycount_stub(S390StAttribState *sa)
> {
> return 0;
> }
> -static int qemu_s390_set_migrationmode_stub(S390StAttribState *sa, bool value)
> +static int qemu_s390_set_migrationmode_stub(S390StAttribState *sa, bool value,
> + Error **errp)
> {
> return 0;
> }
> --
> 2.44.0
>
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH v3 02/26] vfio: Always report an error in vfio_save_setup()
2024-03-04 12:28 ` [PATCH v3 02/26] vfio: Always report an error in vfio_save_setup() Cédric Le Goater
2024-03-04 20:51 ` Fabiano Rosas
@ 2024-03-06 9:56 ` Avihai Horon
2024-03-06 10:16 ` Avihai Horon
2024-03-06 10:49 ` Cédric Le Goater
1 sibling, 2 replies; 50+ messages in thread
From: Avihai Horon @ 2024-03-06 9:56 UTC (permalink / raw)
To: Cédric Le Goater, qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson,
Philippe Mathieu-Daudé, Markus Armbruster
On 04/03/2024 14:28, Cédric Le Goater wrote:
> External email: Use caution opening links or attachments
>
>
> This will prepare ground for futur changes adding an Error** argument
> to the save_setup() handler. We need to make sure that on failure,
> vfio_save_setup() always sets a new error.
>
> Signed-off-by: Cédric Le Goater <clg@redhat.com>
> ---
> hw/vfio/migration.c | 15 ++++++++++++---
> 1 file changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> index 2050ac8897231ff89cc223f0570d5c7a65dede9e..51bea536cc290ba0aa393f78b017b0650e333bff 100644
> --- a/hw/vfio/migration.c
> +++ b/hw/vfio/migration.c
> @@ -383,6 +383,7 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
> VFIODevice *vbasedev = opaque;
> VFIOMigration *migration = vbasedev->migration;
> uint64_t stop_copy_size = VFIO_MIG_DEFAULT_DATA_BUFFER_SIZE;
> + int ret;
>
> qemu_put_be64(f, VFIO_MIG_FLAG_DEV_SETUP_STATE);
>
> @@ -397,13 +398,13 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
> }
>
> if (vfio_precopy_supported(vbasedev)) {
> - int ret;
> -
> switch (migration->device_state) {
> case VFIO_DEVICE_STATE_RUNNING:
> ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_PRE_COPY,
> VFIO_DEVICE_STATE_RUNNING);
> if (ret) {
> + error_report("%s: Failed to set new RUNNING state",
> + vbasedev->name);
> return ret;
> }
>
> @@ -414,6 +415,8 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
> /* vfio_save_complete_precopy() will go to STOP_COPY */
> break;
> default:
> + error_report("%s: Invalid device state %d", vbasedev->name,
> + migration->device_state);
> return -EINVAL;
> }
> }
> @@ -422,7 +425,13 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
>
> qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
>
> - return qemu_file_get_error(f);
> + ret = qemu_file_get_error(f);
> + if (ret) {
> + error_report("%s: save setup failed : %s", vbasedev->name,
> + strerror(ret));
Here it should be -ret (and also later in patch #12).
Thanks.
> + }
> +
> + return ret;
> }
>
> static void vfio_save_cleanup(void *opaque)
> --
> 2.44.0
>
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH v3 02/26] vfio: Always report an error in vfio_save_setup()
2024-03-06 9:56 ` Avihai Horon
@ 2024-03-06 10:16 ` Avihai Horon
2024-03-06 10:51 ` Cédric Le Goater
2024-03-06 10:49 ` Cédric Le Goater
1 sibling, 1 reply; 50+ messages in thread
From: Avihai Horon @ 2024-03-06 10:16 UTC (permalink / raw)
To: Cédric Le Goater, qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson,
Philippe Mathieu-Daudé, Markus Armbruster
On 06/03/2024 11:56, Avihai Horon wrote:
>
> On 04/03/2024 14:28, Cédric Le Goater wrote:
>> External email: Use caution opening links or attachments
>>
>>
>> This will prepare ground for futur changes adding an Error** argument
>> to the save_setup() handler. We need to make sure that on failure,
>> vfio_save_setup() always sets a new error.
>>
>> Signed-off-by: Cédric Le Goater <clg@redhat.com>
>> ---
>> hw/vfio/migration.c | 15 ++++++++++++---
>> 1 file changed, 12 insertions(+), 3 deletions(-)
>>
>> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
>> index
>> 2050ac8897231ff89cc223f0570d5c7a65dede9e..51bea536cc290ba0aa393f78b017b0650e333bff
>> 100644
>> --- a/hw/vfio/migration.c
>> +++ b/hw/vfio/migration.c
>> @@ -383,6 +383,7 @@ static int vfio_save_setup(QEMUFile *f, void
>> *opaque)
>> VFIODevice *vbasedev = opaque;
>> VFIOMigration *migration = vbasedev->migration;
>> uint64_t stop_copy_size = VFIO_MIG_DEFAULT_DATA_BUFFER_SIZE;
>> + int ret;
>>
>> qemu_put_be64(f, VFIO_MIG_FLAG_DEV_SETUP_STATE);
>>
>> @@ -397,13 +398,13 @@ static int vfio_save_setup(QEMUFile *f, void
>> *opaque)
>> }
>>
>> if (vfio_precopy_supported(vbasedev)) {
>> - int ret;
>> -
>> switch (migration->device_state) {
>> case VFIO_DEVICE_STATE_RUNNING:
>> ret = vfio_migration_set_state(vbasedev,
>> VFIO_DEVICE_STATE_PRE_COPY,
>> VFIO_DEVICE_STATE_RUNNING);
>> if (ret) {
>> + error_report("%s: Failed to set new RUNNING state",
Oh, sorry, forgot to mention in previous mail:
s/RUNNING/PRE_COPY
>> + vbasedev->name);
>> return ret;
>> }
>>
>> @@ -414,6 +415,8 @@ static int vfio_save_setup(QEMUFile *f, void
>> *opaque)
>> /* vfio_save_complete_precopy() will go to STOP_COPY */
>> break;
>> default:
>> + error_report("%s: Invalid device state %d", vbasedev->name,
>> + migration->device_state);
>> return -EINVAL;
>> }
>> }
>> @@ -422,7 +425,13 @@ static int vfio_save_setup(QEMUFile *f, void
>> *opaque)
>>
>> qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
>>
>> - return qemu_file_get_error(f);
>> + ret = qemu_file_get_error(f);
>> + if (ret) {
>> + error_report("%s: save setup failed : %s", vbasedev->name,
>> + strerror(ret));
>
> Here it should be -ret (and also later in patch #12).
>
> Thanks.
>
>> + }
>> +
>> + return ret;
>> }
>>
>> static void vfio_save_cleanup(void *opaque)
>> --
>> 2.44.0
>>
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH v3 02/26] vfio: Always report an error in vfio_save_setup()
2024-03-06 9:56 ` Avihai Horon
2024-03-06 10:16 ` Avihai Horon
@ 2024-03-06 10:49 ` Cédric Le Goater
1 sibling, 0 replies; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-06 10:49 UTC (permalink / raw)
To: Avihai Horon, qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson,
Philippe Mathieu-Daudé, Markus Armbruster
On 3/6/24 10:56, Avihai Horon wrote:
>
> On 04/03/2024 14:28, Cédric Le Goater wrote:
>> External email: Use caution opening links or attachments
>>
>>
>> This will prepare ground for futur changes adding an Error** argument
>> to the save_setup() handler. We need to make sure that on failure,
>> vfio_save_setup() always sets a new error.
>>
>> Signed-off-by: Cédric Le Goater <clg@redhat.com>
>> ---
>> hw/vfio/migration.c | 15 ++++++++++++---
>> 1 file changed, 12 insertions(+), 3 deletions(-)
>>
>> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
>> index 2050ac8897231ff89cc223f0570d5c7a65dede9e..51bea536cc290ba0aa393f78b017b0650e333bff 100644
>> --- a/hw/vfio/migration.c
>> +++ b/hw/vfio/migration.c
>> @@ -383,6 +383,7 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
>> VFIODevice *vbasedev = opaque;
>> VFIOMigration *migration = vbasedev->migration;
>> uint64_t stop_copy_size = VFIO_MIG_DEFAULT_DATA_BUFFER_SIZE;
>> + int ret;
>>
>> qemu_put_be64(f, VFIO_MIG_FLAG_DEV_SETUP_STATE);
>>
>> @@ -397,13 +398,13 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
>> }
>>
>> if (vfio_precopy_supported(vbasedev)) {
>> - int ret;
>> -
>> switch (migration->device_state) {
>> case VFIO_DEVICE_STATE_RUNNING:
>> ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_PRE_COPY,
>> VFIO_DEVICE_STATE_RUNNING);
>> if (ret) {
>> + error_report("%s: Failed to set new RUNNING state",
>> + vbasedev->name);
>> return ret;
>> }
>>
>> @@ -414,6 +415,8 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
>> /* vfio_save_complete_precopy() will go to STOP_COPY */
>> break;
>> default:
>> + error_report("%s: Invalid device state %d", vbasedev->name,
>> + migration->device_state);
>> return -EINVAL;
>> }
>> }
>> @@ -422,7 +425,13 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
>>
>> qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
>>
>> - return qemu_file_get_error(f);
>> + ret = qemu_file_get_error(f);
>> + if (ret) {
>> + error_report("%s: save setup failed : %s", vbasedev->name,
>> + strerror(ret));
>
> Here it should be -ret (and also later in patch #12).
Yes this is like qemu_fflush(). I will also change the test to
if (ret < 0)
As Prasad suggested.
Thanks,
C.
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH v3 02/26] vfio: Always report an error in vfio_save_setup()
2024-03-06 10:16 ` Avihai Horon
@ 2024-03-06 10:51 ` Cédric Le Goater
0 siblings, 0 replies; 50+ messages in thread
From: Cédric Le Goater @ 2024-03-06 10:51 UTC (permalink / raw)
To: Avihai Horon, qemu-devel
Cc: Peter Xu, Fabiano Rosas, Alex Williamson,
Philippe Mathieu-Daudé, Markus Armbruster
On 3/6/24 11:16, Avihai Horon wrote:
>
> On 06/03/2024 11:56, Avihai Horon wrote:
>>
>> On 04/03/2024 14:28, Cédric Le Goater wrote:
>>> External email: Use caution opening links or attachments
>>>
>>>
>>> This will prepare ground for futur changes adding an Error** argument
>>> to the save_setup() handler. We need to make sure that on failure,
>>> vfio_save_setup() always sets a new error.
>>>
>>> Signed-off-by: Cédric Le Goater <clg@redhat.com>
>>> ---
>>> hw/vfio/migration.c | 15 ++++++++++++---
>>> 1 file changed, 12 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
>>> index 2050ac8897231ff89cc223f0570d5c7a65dede9e..51bea536cc290ba0aa393f78b017b0650e333bff 100644
>>> --- a/hw/vfio/migration.c
>>> +++ b/hw/vfio/migration.c
>>> @@ -383,6 +383,7 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
>>> VFIODevice *vbasedev = opaque;
>>> VFIOMigration *migration = vbasedev->migration;
>>> uint64_t stop_copy_size = VFIO_MIG_DEFAULT_DATA_BUFFER_SIZE;
>>> + int ret;
>>>
>>> qemu_put_be64(f, VFIO_MIG_FLAG_DEV_SETUP_STATE);
>>>
>>> @@ -397,13 +398,13 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
>>> }
>>>
>>> if (vfio_precopy_supported(vbasedev)) {
>>> - int ret;
>>> -
>>> switch (migration->device_state) {
>>> case VFIO_DEVICE_STATE_RUNNING:
>>> ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_PRE_COPY,
>>> VFIO_DEVICE_STATE_RUNNING);
>>> if (ret) {
>>> + error_report("%s: Failed to set new RUNNING state",
>
> Oh, sorry, forgot to mention in previous mail:
> s/RUNNING/PRE_COPY
Ah yes. Parameters are <new> and <recover> state.
Thanks,
C.
>
>>> + vbasedev->name);
>>> return ret;
>>> }
>>>
>>> @@ -414,6 +415,8 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
>>> /* vfio_save_complete_precopy() will go to STOP_COPY */
>>> break;
>>> default:
>>> + error_report("%s: Invalid device state %d", vbasedev->name,
>>> + migration->device_state);
>>> return -EINVAL;
>>> }
>>> }
>>> @@ -422,7 +425,13 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
>>>
>>> qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
>>>
>>> - return qemu_file_get_error(f);
>>> + ret = qemu_file_get_error(f);
>>> + if (ret) {
>>> + error_report("%s: save setup failed : %s", vbasedev->name,
>>> + strerror(ret));
>>
>> Here it should be -ret (and also later in patch #12).
>>
>> Thanks.
>>
>>> + }
>>> +
>>> + return ret;
>>> }
>>>
>>> static void vfio_save_cleanup(void *opaque)
>>> --
>>> 2.44.0
>>>
>
^ permalink raw reply [flat|nested] 50+ messages in thread
end of thread, other threads:[~2024-03-06 10:51 UTC | newest]
Thread overview: 50+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-03-04 12:28 [PATCH v3 00/26] migration: Improve error reporting Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 01/26] s390/stattrib: Add Error** argument to set_migrationmode() handler Cédric Le Goater
2024-03-04 20:49 ` Fabiano Rosas
2024-03-05 7:12 ` Cédric Le Goater
2024-03-05 14:54 ` Avihai Horon
2024-03-04 12:28 ` [PATCH v3 02/26] vfio: Always report an error in vfio_save_setup() Cédric Le Goater
2024-03-04 20:51 ` Fabiano Rosas
2024-03-06 9:56 ` Avihai Horon
2024-03-06 10:16 ` Avihai Horon
2024-03-06 10:51 ` Cédric Le Goater
2024-03-06 10:49 ` Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 03/26] migration: Always report an error in block_save_setup() Cédric Le Goater
2024-03-04 21:04 ` Fabiano Rosas
2024-03-05 8:23 ` Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 04/26] migration: Always report an error in ram_save_setup() Cédric Le Goater
2024-03-04 21:30 ` Fabiano Rosas
2024-03-05 8:52 ` Cédric Le Goater
2024-03-05 5:29 ` Prasad Pandit
2024-03-05 8:53 ` Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 05/26] migration: Add Error** argument to vmstate_save() Cédric Le Goater
2024-03-05 7:44 ` Prasad Pandit
2024-03-04 12:28 ` [PATCH v3 06/26] migration: Report error when shutdown fails Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 07/26] migration: Remove SaveStateHandler and LoadStateHandler typedefs Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 08/26] migration: Add documentation for SaveVMHandlers Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 09/26] migration: Do not call PRECOPY_NOTIFY_SETUP notifiers in case of error Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 10/26] migration: Move cleanup after after error reporting in qemu_savevm_state_setup() Cédric Le Goater
2024-03-05 3:32 ` Peter Xu
2024-03-05 7:57 ` Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 11/26] migration: Add Error** argument to qemu_savevm_state_setup() Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 12/26] migration: Add Error** argument to .save_setup() handler Cédric Le Goater
2024-03-05 8:54 ` Thomas Huth
2024-03-04 12:28 ` [PATCH v3 13/26] migration: Add Error** argument to .load_setup() handler Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 14/26] memory: Add Error** argument to .log_global*() handlers Cédric Le Goater
2024-03-05 7:57 ` Peter Xu
2024-03-05 14:25 ` Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 15/26] memory: Add Error** argument to the global_dirty_log routines Cédric Le Goater
2024-03-05 1:58 ` Yong Huang
2024-03-04 12:28 ` [PATCH v3 16/26] migration: Modify ram_init_bitmaps() to report dirty tracking errors Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 17/26] vfio: Add Error** argument to .set_dirty_page_tracking() handler Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 18/26] vfio: Add Error** argument to vfio_devices_dma_logging_start() Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 19/26] vfio: Add Error** argument to vfio_devices_dma_logging_stop() Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 20/26] vfio: Use new Error** argument in vfio_save_setup() Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 21/26] vfio: Add Error** argument to .vfio_save_config() handler Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 22/26] vfio: Reverse test on vfio_get_dirty_bitmap() Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 23/26] memory: Add Error** argument to memory_get_xlat_addr() Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 24/26] vfio: Add Error** argument to .get_dirty_bitmap() handler Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 25/26] vfio: Also trace event failures in vfio_save_complete_precopy() Cédric Le Goater
2024-03-04 12:28 ` [PATCH v3 26/26] vfio: Extend vfio_set_migration_error() with Error* argument Cédric Le Goater
2024-03-05 8:06 ` [PATCH v3 00/26] migration: Improve error reporting Peter Xu
2024-03-05 8:30 ` Cédric Le Goater
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).