* [PULL 00/45] Staging patches
From: Peter Xu @ 2025-10-03 15:39 UTC
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini
The following changes since commit 517e9b4862cc9798b7a24b1935d94c2f96787f12:
Merge tag 'qtest-20251001-pull-request' of https://gitlab.com/farosas/qemu into staging (2025-10-01 15:03:00 -0700)
are available in the Git repository at:
https://gitlab.com/peterx/qemu.git tags/staging-pull-request
for you to fetch changes up to 27cffe16354816d57710d2d4357f16139405c749:
migration-test: test cpr-exec (2025-10-03 09:48:02 -0400)
----------------------------------------------------------------
Migration/Memory Pull for 10.2
- PeterX's fix on TLS warning for preempt channel when migration completes
- Arun's series to enhance error reporting for vTPM and migration framework
- PeterX's patch to cleanup multifd send TLS BYE messages
- Juraj's fix on postcopy start state transition when switchover failed
- Yanfei's fix to migrate APIC before VFIO-PCI to avoid irq fallbacks
- Dan's cleanup to simplify error reporting in qemu_fill_buffer()
- PeterM's fix on address space leak when cpu hot plug / unplug
- Steve's whole cpr-exec series
----------------------------------------------------------------
Arun Menon (26):
migration: push Error **errp into vmstate_subsection_load()
migration: push Error **errp into vmstate_load_state()
migration: push Error **errp into qemu_loadvm_state_header()
migration: push Error **errp into vmstate_load()
migration: push Error **errp into loadvm_process_command()
migration: push Error **errp into loadvm_handle_cmd_packaged()
migration: push Error **errp into qemu_loadvm_state()
migration: push Error **errp into qemu_load_device_state()
migration: push Error **errp into qemu_loadvm_state_main()
migration: push Error **errp into qemu_loadvm_section_start_full()
migration: push Error **errp into qemu_loadvm_section_part_end()
migration: Update qemu_file_get_return_path() docs and remove dead
checks
migration: make loadvm_postcopy_handle_resume() void
migration: push Error **errp into ram_postcopy_incoming_init()
migration: push Error **errp into loadvm_postcopy_handle_advise()
migration: push Error **errp into loadvm_postcopy_handle_listen()
migration: push Error **errp into loadvm_postcopy_handle_run()
migration: push Error **errp into loadvm_postcopy_ram_handle_discard()
migration: push Error **errp into loadvm_handle_recv_bitmap()
migration: Return -1 on memory allocation failure in ram.c
migration: push Error **errp into loadvm_process_enable_colo()
migration: push Error **errp into
loadvm_postcopy_handle_switchover_start()
migration: Capture error in postcopy_ram_listen_thread()
migration: Remove error variant of vmstate_save_state() function
migration: Add error-parameterized function variants in VMSD struct
backends/tpm: Propagate vTPM error on migration failure
Bin Guo (1):
migration: HMP: Adjust the order of output fields
Daniel P. Berrangé (1):
migration: simplify error reporting after channel read
Juraj Marcin (1):
migration: Fix state transition in postcopy_start() error handling
Peter Maydell (2):
include/system/memory.h: Clarify address_space_destroy() behaviour
physmem: Destroy all CPU AddressSpaces on unrealize
Peter Xu (4):
io/crypto: Move tls premature termination handling into QIO layer
migration: Make migration_has_failed() work even for CANCELLING
migration/multifd/tls: Cleanup BYE message processing on sender side
memory: New AS helper to serialize destroy+free
Steve Sistare (9):
migration: multi-mode notifier
migration: add cpr_walk_fd
oslib: qemu_clear_cloexec
migration: cpr-exec-command parameter
migration: cpr-exec save and load
migration: cpr-exec mode
migration: cpr-exec docs
vfio: cpr-exec mode
migration-test: test cpr-exec
Yanfei Xu (1):
migration: ensure APIC is loaded prior to VFIO PCI devices
docs/devel/migration/CPR.rst | 112 +++++++++-
docs/devel/migration/main.rst | 19 ++
qapi/migration.json | 46 +++-
include/crypto/tlssession.h | 10 +-
include/exec/cpu-common.h | 10 +-
include/hw/core/cpu.h | 1 -
include/migration/colo.h | 2 +-
include/migration/cpr.h | 10 +
include/migration/misc.h | 12 ++
include/migration/vmstate.h | 19 +-
include/qemu/osdep.h | 9 +
include/system/memory.h | 24 ++-
migration/postcopy-ram.h | 2 +-
migration/ram.h | 4 +-
migration/savevm.h | 7 +-
backends/tpm/tpm_emulator.c | 40 ++--
crypto/tlssession.c | 7 +-
hw/core/cpu-common.c | 1 +
hw/display/virtio-gpu.c | 5 +-
hw/intc/apic_common.c | 1 +
hw/pci/pci.c | 5 +-
hw/s390x/virtio-ccw.c | 4 +-
hw/scsi/spapr_vscsi.c | 6 +-
hw/vfio/container-legacy.c | 3 +-
hw/vfio/cpr-iommufd.c | 3 +-
hw/vfio/cpr-legacy.c | 9 +-
hw/vfio/cpr.c | 13 +-
hw/vfio/pci.c | 9 +-
hw/virtio/virtio-mmio.c | 5 +-
hw/virtio/virtio-pci.c | 4 +-
hw/virtio/virtio.c | 13 +-
io/channel-tls.c | 21 +-
migration/colo.c | 10 +-
migration/cpr-exec.c | 194 +++++++++++++++++
migration/cpr.c | 42 +++-
migration/migration-hmp-cmds.c | 44 +++-
migration/migration.c | 116 +++++++---
migration/multifd.c | 65 +++---
migration/options.c | 14 ++
migration/postcopy-ram.c | 9 +-
migration/qemu-file.c | 7 +-
migration/ram.c | 17 +-
migration/savevm.c | 329 +++++++++++++++++------------
migration/vmstate-types.c | 61 ++++--
migration/vmstate.c | 103 ++++++---
stubs/cpu-destroy-address-spaces.c | 15 ++
system/memory.c | 20 +-
system/physmem.c | 32 ++-
system/vl.c | 4 +-
tests/qtest/migration/cpr-tests.c | 133 ++++++++++++
tests/unit/test-vmstate.c | 83 ++++++--
ui/vdagent.c | 8 +-
util/oslib-posix.c | 9 +
util/oslib-win32.c | 4 +
hmp-commands.hx | 2 +-
migration/meson.build | 1 +
migration/trace-events | 1 +
stubs/meson.build | 1 +
58 files changed, 1351 insertions(+), 409 deletions(-)
create mode 100644 migration/cpr-exec.c
create mode 100644 stubs/cpu-destroy-address-spaces.c
--
2.50.1
* [PULL 01/45] migration: push Error **errp into vmstate_subsection_load()
From: Peter Xu @ 2025-10-03 15:39 UTC
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Arun Menon, Marc-André Lureau, Akihiko Odaki
From: Arun Menon <armenon@redhat.com>
This is an incremental step in converting the vmstate loading
code to report errors via Error objects instead of printing
them directly to the console/monitor.
vmstate_subsection_load() must now set an error in errp
whenever it fails.
For now the caller still reports the error with error_report_err().
That call is removed by later patches in this series, once the
error can actually be propagated to the calling function
through errp.
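
The intermediate state described above can be sketched outside of QEMU.
This is only an illustration under stated assumptions: the Error struct
and the mini_error_setg() / mini_error_report_err() helpers are
hypothetical stand-ins written for this example, not QEMU's real
qapi/error.h implementation, and load_subsection() / load_state() are
simplified placeholders for the functions touched by the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for QEMU's Error object (not the real type). */
typedef struct Error {
    char msg[256];
} Error;

/* Stand-in for error_setg(): store a formatted message in *errp. */
void mini_error_setg(Error **errp, const char *fmt, ...)
{
    va_list ap;
    Error *err;

    if (errp == NULL) {
        return;                 /* caller does not want the details */
    }
    assert(*errp == NULL);      /* an earlier error must not be overwritten */
    err = calloc(1, sizeof(*err));
    if (err == NULL) {
        abort();
    }
    va_start(ap, fmt);
    vsnprintf(err->msg, sizeof(err->msg), fmt, ap);
    va_end(ap);
    *errp = err;
}

/* Stand-in for error_report_err(): print the message, then free it. */
void mini_error_report_err(Error *err)
{
    fprintf(stderr, "error: %s\n", err->msg);
    free(err);
}

/* Converted callee: every failure path fills in errp. */
int load_subsection(const char *parent, const char *idstr, int known,
                    Error **errp)
{
    if (!known) {
        mini_error_setg(errp, "VM subsection '%s' in '%s' does not exist",
                        idstr, parent);
        return -ENOENT;
    }
    return 0;
}

/*
 * Caller that is not converted yet: it has no errp of its own, so it
 * collects the error in a local variable and reports it on the spot.
 */
int load_state(const char *name, const char *idstr, int known)
{
    Error *local_err = NULL;
    int ret = load_subsection(name, idstr, known, &local_err);

    if (ret != 0) {
        mini_error_report_err(local_err);
        return ret;
    }
    return 0;
}
```

Calling load_state("cpu", "cpu/x", 0) in this sketch prints the message
and returns -ENOENT. Once the caller gains its own Error **errp
parameter, the local variable and the report call go away and errp is
simply passed down.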
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-1-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/vmstate.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/migration/vmstate.c b/migration/vmstate.c
index 5feaa3244d..08f2b562e3 100644
--- a/migration/vmstate.c
+++ b/migration/vmstate.c
@@ -25,7 +25,7 @@ static int vmstate_subsection_save(QEMUFile *f, const VMStateDescription *vmsd,
void *opaque, JSONWriter *vmdesc,
Error **errp);
static int vmstate_subsection_load(QEMUFile *f, const VMStateDescription *vmsd,
- void *opaque);
+ void *opaque, Error **errp);
/* Whether this field should exist for either save or load the VM? */
static bool
@@ -136,6 +136,7 @@ int vmstate_load_state(QEMUFile *f, const VMStateDescription *vmsd,
{
const VMStateField *field = vmsd->fields;
int ret = 0;
+ Error *local_err = NULL;
trace_vmstate_load_state(vmsd->name, version_id);
if (version_id > vmsd->version_id) {
@@ -225,9 +226,10 @@ int vmstate_load_state(QEMUFile *f, const VMStateDescription *vmsd,
field++;
}
assert(field->flags == VMS_END);
- ret = vmstate_subsection_load(f, vmsd, opaque);
+ ret = vmstate_subsection_load(f, vmsd, opaque, &local_err);
if (ret != 0) {
qemu_file_set_error(f, ret);
+ error_report_err(local_err);
return ret;
}
if (vmsd->post_load) {
@@ -566,7 +568,7 @@ vmstate_get_subsection(const VMStateDescription * const *sub,
}
static int vmstate_subsection_load(QEMUFile *f, const VMStateDescription *vmsd,
- void *opaque)
+ void *opaque, Error **errp)
{
trace_vmstate_subsection_load(vmsd->name);
@@ -598,6 +600,8 @@ static int vmstate_subsection_load(QEMUFile *f, const VMStateDescription *vmsd,
sub_vmsd = vmstate_get_subsection(vmsd->subsections, idstr);
if (sub_vmsd == NULL) {
trace_vmstate_subsection_load_bad(vmsd->name, idstr, "(lookup)");
+ error_setg(errp, "VM subsection '%s' in '%s' does not exist",
+ idstr, vmsd->name);
return -ENOENT;
}
qemu_file_skip(f, 1); /* subsection */
@@ -608,6 +612,9 @@ static int vmstate_subsection_load(QEMUFile *f, const VMStateDescription *vmsd,
ret = vmstate_load_state(f, sub_vmsd, opaque, version_id);
if (ret) {
trace_vmstate_subsection_load_bad(vmsd->name, idstr, "(child)");
+ error_setg(errp,
+ "Loading VM subsection '%s' in '%s' failed: %d",
+ idstr, vmsd->name, ret);
return ret;
}
}
--
2.50.1
* [PULL 02/45] migration: push Error **errp into vmstate_load_state()
From: Peter Xu @ 2025-10-03 15:39 UTC
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Arun Menon, Marc-André Lureau, Akihiko Odaki
From: Arun Menon <armenon@redhat.com>
This is an incremental step in converting the vmstate loading
code to report errors via Error objects instead of printing
them directly to the console/monitor.
vmstate_load_state() must now set an error in errp
whenever it fails.
Callers that cannot propagate the error yet report it with
error_report_err(). That call is removed by later patches in this
series, once the error can actually be propagated to the calling
function through errp. Callers that should instead exit on
error pass &error_fatal.
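
How an error travels up once a caller does take errp, and how the
subsection loader adds context with a prefix rather than replacing the
message, can be sketched as follows. This is a hedged illustration only:
Error, mini_error_setg() and mini_error_prepend() are hypothetical
stand-ins written for this example, and the NULL check in
mini_error_prepend() is a simplification of what ERRP_GUARD() provides
in real QEMU code. The &error_fatal case is not modelled here.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MSG_MAX 256

/* Hypothetical stand-in for QEMU's Error object (not the real type). */
typedef struct Error {
    char msg[MSG_MAX];
} Error;

/* Stand-in for error_setg(): store a formatted message in *errp. */
void mini_error_setg(Error **errp, const char *fmt, ...)
{
    va_list ap;
    Error *err;

    if (errp == NULL) {
        return;
    }
    assert(*errp == NULL);
    err = calloc(1, sizeof(*err));
    if (err == NULL) {
        abort();
    }
    va_start(ap, fmt);
    vsnprintf(err->msg, sizeof(err->msg), fmt, ap);
    va_end(ap);
    *errp = err;
}

/* Stand-in for error_prepend(): put a formatted prefix before the message. */
void mini_error_prepend(Error **errp, const char *fmt, ...)
{
    char prefix[MSG_MAX];
    char joined[2 * MSG_MAX];
    va_list ap;

    if (errp == NULL || *errp == NULL) {
        return;                 /* nothing to decorate */
    }
    va_start(ap, fmt);
    vsnprintf(prefix, sizeof(prefix), fmt, ap);
    va_end(ap);
    snprintf(joined, sizeof(joined), "%s%s", prefix, (*errp)->msg);
    snprintf((*errp)->msg, MSG_MAX, "%s", joined);  /* may truncate */
}

/* Innermost loader: sets the original error. */
int load_state(const char *name, int version_id, int local_version_id,
               Error **errp)
{
    if (version_id > local_version_id) {
        mini_error_setg(errp, "%s: incoming version_id %d is too new "
                        "for local version_id %d",
                        name, version_id, local_version_id);
        return -EINVAL;
    }
    return 0;
}

/* Subsection loader: passes errp down and prefixes any failure. */
int load_subsection(const char *parent, const char *idstr, int version_id,
                    int local_version_id, Error **errp)
{
    int ret = load_state(idstr, version_id, local_version_id, errp);

    if (ret) {
        mini_error_prepend(errp,
                           "Loading VM subsection '%s' in '%s' failed: %d: ",
                           idstr, parent, ret);
        return ret;
    }
    return 0;
}
```

The point of the prefix is that the outermost caller ends up with one
message containing both the subsection context and the original cause,
instead of two unrelated lines printed at different levels.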
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-2-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
include/migration/vmstate.h | 2 +-
hw/display/virtio-gpu.c | 2 +-
hw/pci/pci.c | 3 +-
hw/s390x/virtio-ccw.c | 2 +-
hw/scsi/spapr_vscsi.c | 4 ++-
hw/vfio/pci.c | 5 ++-
hw/virtio/virtio-mmio.c | 3 +-
hw/virtio/virtio-pci.c | 2 +-
hw/virtio/virtio.c | 7 +++--
migration/cpr.c | 3 +-
migration/savevm.c | 8 +++--
migration/vmstate-types.c | 28 ++++++++++-------
migration/vmstate.c | 61 +++++++++++++++++++++++------------
tests/unit/test-vmstate.c | 63 +++++++++++++++++++++++++++++++------
ui/vdagent.c | 5 ++-
15 files changed, 143 insertions(+), 55 deletions(-)
diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
index 1ff7bd9ac4..056781b1c2 100644
--- a/include/migration/vmstate.h
+++ b/include/migration/vmstate.h
@@ -1196,7 +1196,7 @@ extern const VMStateInfo vmstate_info_qlist;
}
int vmstate_load_state(QEMUFile *f, const VMStateDescription *vmsd,
- void *opaque, int version_id);
+ void *opaque, int version_id, Error **errp);
int vmstate_save_state(QEMUFile *f, const VMStateDescription *vmsd,
void *opaque, JSONWriter *vmdesc);
int vmstate_save_state_with_err(QEMUFile *f, const VMStateDescription *vmsd,
diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
index de35902213..e61585aa61 100644
--- a/hw/display/virtio-gpu.c
+++ b/hw/display/virtio-gpu.c
@@ -1347,7 +1347,7 @@ static int virtio_gpu_load(QEMUFile *f, void *opaque, size_t size,
}
/* load & apply scanout state */
- vmstate_load_state(f, &vmstate_virtio_gpu_scanouts, g, 1);
+ vmstate_load_state(f, &vmstate_virtio_gpu_scanouts, g, 1, &error_fatal);
return 0;
}
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index c3df9d6656..17715ca1b3 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -934,7 +934,8 @@ void pci_device_save(PCIDevice *s, QEMUFile *f)
int pci_device_load(PCIDevice *s, QEMUFile *f)
{
int ret;
- ret = vmstate_load_state(f, &vmstate_pci_device, s, s->version_id);
+ ret = vmstate_load_state(f, &vmstate_pci_device, s, s->version_id,
+ &error_fatal);
/* Restore the interrupt status bit. */
pci_update_irq_status(s);
return ret;
diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c
index d2f85b39f3..6a9641a03d 100644
--- a/hw/s390x/virtio-ccw.c
+++ b/hw/s390x/virtio-ccw.c
@@ -1136,7 +1136,7 @@ static void virtio_ccw_save_config(DeviceState *d, QEMUFile *f)
static int virtio_ccw_load_config(DeviceState *d, QEMUFile *f)
{
VirtioCcwDevice *dev = VIRTIO_CCW_DEVICE(d);
- return vmstate_load_state(f, &vmstate_virtio_ccw_dev, dev, 1);
+ return vmstate_load_state(f, &vmstate_virtio_ccw_dev, dev, 1, &error_fatal);
}
static void virtio_ccw_pre_plugged(DeviceState *d, Error **errp)
diff --git a/hw/scsi/spapr_vscsi.c b/hw/scsi/spapr_vscsi.c
index 20f70fb272..da173f4867 100644
--- a/hw/scsi/spapr_vscsi.c
+++ b/hw/scsi/spapr_vscsi.c
@@ -642,15 +642,17 @@ static void *vscsi_load_request(QEMUFile *f, SCSIRequest *sreq)
VSCSIState *s = VIO_SPAPR_VSCSI_DEVICE(bus->qbus.parent);
vscsi_req *req;
int rc;
+ Error *local_err = NULL;
assert(sreq->tag < VSCSI_REQ_LIMIT);
req = &s->reqs[sreq->tag];
assert(!req->active);
memset(req, 0, sizeof(*req));
- rc = vmstate_load_state(f, &vmstate_spapr_vscsi_req, req, 1);
+ rc = vmstate_load_state(f, &vmstate_spapr_vscsi_req, req, 1, &local_err);
if (rc) {
fprintf(stderr, "VSCSI: failed loading request tag#%u\n", sreq->tag);
+ error_report_err(local_err);
return NULL;
}
assert(req->active);
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 5b022da19e..a5df4685d4 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2831,13 +2831,16 @@ static int vfio_pci_load_config(VFIODevice *vbasedev, QEMUFile *f)
PCIDevice *pdev = PCI_DEVICE(vdev);
pcibus_t old_addr[PCI_NUM_REGIONS - 1];
int bar, ret;
+ Error *local_err = NULL;
for (bar = 0; bar < PCI_ROM_SLOT; bar++) {
old_addr[bar] = pdev->io_regions[bar].addr;
}
- ret = vmstate_load_state(f, &vmstate_vfio_pci_config, vdev, 1);
+ ret = vmstate_load_state(f, &vmstate_vfio_pci_config, vdev, 1,
+ &local_err);
if (ret) {
+ error_report_err(local_err);
return ret;
}
diff --git a/hw/virtio/virtio-mmio.c b/hw/virtio/virtio-mmio.c
index 532c67107b..0a688909fc 100644
--- a/hw/virtio/virtio-mmio.c
+++ b/hw/virtio/virtio-mmio.c
@@ -34,6 +34,7 @@
#include "qemu/error-report.h"
#include "qemu/log.h"
#include "trace.h"
+#include "qapi/error.h"
static bool virtio_mmio_ioeventfd_enabled(DeviceState *d)
{
@@ -619,7 +620,7 @@ static int virtio_mmio_load_extra_state(DeviceState *opaque, QEMUFile *f)
{
VirtIOMMIOProxy *proxy = VIRTIO_MMIO(opaque);
- return vmstate_load_state(f, &vmstate_virtio_mmio, proxy, 1);
+ return vmstate_load_state(f, &vmstate_virtio_mmio, proxy, 1, &error_fatal);
}
static bool virtio_mmio_has_extra_state(DeviceState *opaque)
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 767216d795..b04faa1e5c 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -161,7 +161,7 @@ static int virtio_pci_load_extra_state(DeviceState *d, QEMUFile *f)
{
VirtIOPCIProxy *proxy = to_virtio_pci_proxy(d);
- return vmstate_load_state(f, &vmstate_virtio_pci, proxy, 1);
+ return vmstate_load_state(f, &vmstate_virtio_pci, proxy, 1, &error_fatal);
}
static void virtio_pci_save_queue(DeviceState *d, int n, QEMUFile *f)
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 9a81ad912e..018803c80d 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -3235,6 +3235,7 @@ virtio_load(VirtIODevice *vdev, QEMUFile *f, int version_id)
BusState *qbus = qdev_get_parent_bus(DEVICE(vdev));
VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
+ Error *local_err = NULL;
/*
* We poison the endianness to ensure it does not get used before
@@ -3327,15 +3328,17 @@ virtio_load(VirtIODevice *vdev, QEMUFile *f, int version_id)
}
if (vdc->vmsd) {
- ret = vmstate_load_state(f, vdc->vmsd, vdev, version_id);
+ ret = vmstate_load_state(f, vdc->vmsd, vdev, version_id, &local_err);
if (ret) {
+ error_report_err(local_err);
return ret;
}
}
/* Subsections */
- ret = vmstate_load_state(f, &vmstate_virtio, vdev, 1);
+ ret = vmstate_load_state(f, &vmstate_virtio, vdev, 1, &local_err);
if (ret) {
+ error_report_err(local_err);
return ret;
}
diff --git a/migration/cpr.c b/migration/cpr.c
index 9848a21ea6..6b0e19651a 100644
--- a/migration/cpr.c
+++ b/migration/cpr.c
@@ -234,9 +234,8 @@ int cpr_state_load(MigrationChannel *channel, Error **errp)
return -ENOTSUP;
}
- ret = vmstate_load_state(f, &vmstate_cpr_state, &cpr_state, 1);
+ ret = vmstate_load_state(f, &vmstate_cpr_state, &cpr_state, 1, errp);
if (ret) {
- error_setg(errp, "vmstate_load_state error %d", ret);
qemu_fclose(f);
return ret;
}
diff --git a/migration/savevm.c b/migration/savevm.c
index abe0547f9b..55c99e0902 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -969,7 +969,8 @@ static int vmstate_load(QEMUFile *f, SaveStateEntry *se)
if (!se->vmsd) { /* Old style */
return se->ops->load_state(f, se->opaque, se->load_version_id);
}
- return vmstate_load_state(f, se->vmsd, se->opaque, se->load_version_id);
+ return vmstate_load_state(f, se->vmsd, se->opaque, se->load_version_id,
+ &error_fatal);
}
static void vmstate_save_old_style(QEMUFile *f, SaveStateEntry *se,
@@ -2817,6 +2818,7 @@ static int qemu_loadvm_state_header(QEMUFile *f)
{
unsigned int v;
int ret;
+ Error *local_err = NULL;
v = qemu_get_be32(f);
if (v != QEMU_VM_FILE_MAGIC) {
@@ -2839,9 +2841,11 @@ static int qemu_loadvm_state_header(QEMUFile *f)
error_report("Configuration section missing");
return -EINVAL;
}
- ret = vmstate_load_state(f, &vmstate_configuration, &savevm_state, 0);
+ ret = vmstate_load_state(f, &vmstate_configuration, &savevm_state, 0,
+ &local_err);
if (ret) {
+ error_report_err(local_err);
return ret;
}
}
diff --git a/migration/vmstate-types.c b/migration/vmstate-types.c
index 741a588b7e..c5cfd861e3 100644
--- a/migration/vmstate-types.c
+++ b/migration/vmstate-types.c
@@ -19,6 +19,7 @@
#include "qemu/error-report.h"
#include "qemu/queue.h"
#include "trace.h"
+#include "qapi/error.h"
/* bool */
@@ -543,13 +544,17 @@ static int get_tmp(QEMUFile *f, void *pv, size_t size,
const VMStateField *field)
{
int ret;
+ Error *local_err = NULL;
const VMStateDescription *vmsd = field->vmsd;
int version_id = field->version_id;
void *tmp = g_malloc(size);
/* Writes the parent field which is at the start of the tmp */
*(void **)tmp = pv;
- ret = vmstate_load_state(f, vmsd, tmp, version_id);
+ ret = vmstate_load_state(f, vmsd, tmp, version_id, &local_err);
+ if (ret < 0) {
+ error_report_err(local_err);
+ }
g_free(tmp);
return ret;
}
@@ -626,6 +631,7 @@ static int get_qtailq(QEMUFile *f, void *pv, size_t unused_size,
const VMStateField *field)
{
int ret = 0;
+ Error *local_err = NULL;
const VMStateDescription *vmsd = field->vmsd;
/* size of a QTAILQ element */
size_t size = field->size;
@@ -649,8 +655,9 @@ static int get_qtailq(QEMUFile *f, void *pv, size_t unused_size,
while (qemu_get_byte(f)) {
elm = g_malloc(size);
- ret = vmstate_load_state(f, vmsd, elm, version_id);
+ ret = vmstate_load_state(f, vmsd, elm, version_id, &local_err);
if (ret) {
+ error_report_err(local_err);
return ret;
}
QTAILQ_RAW_INSERT_TAIL(pv, elm, entry_offset);
@@ -772,6 +779,7 @@ static int get_gtree(QEMUFile *f, void *pv, size_t unused_size,
GTree *tree = *pval;
void *key, *val;
int ret = 0;
+ Error *local_err = NULL;
/* in case of direct key, the key vmsd can be {}, ie. check fields */
if (!direct_key && version_id > key_vmsd->version_id) {
@@ -803,18 +811,16 @@ static int get_gtree(QEMUFile *f, void *pv, size_t unused_size,
key = (void *)(uintptr_t)qemu_get_be64(f);
} else {
key = g_malloc0(key_size);
- ret = vmstate_load_state(f, key_vmsd, key, version_id);
+ ret = vmstate_load_state(f, key_vmsd, key, version_id, &local_err);
if (ret) {
- error_report("%s : failed to load %s (%d)",
- field->name, key_vmsd->name, ret);
+ error_report_err(local_err);
goto key_error;
}
}
val = g_malloc0(val_size);
- ret = vmstate_load_state(f, val_vmsd, val, version_id);
+ ret = vmstate_load_state(f, val_vmsd, val, version_id, &local_err);
if (ret) {
- error_report("%s : failed to load %s (%d)",
- field->name, val_vmsd->name, ret);
+ error_report_err(local_err);
goto val_error;
}
g_tree_insert(tree, key, val);
@@ -872,6 +878,7 @@ static int get_qlist(QEMUFile *f, void *pv, size_t unused_size,
const VMStateField *field)
{
int ret = 0;
+ Error *local_err = NULL;
const VMStateDescription *vmsd = field->vmsd;
/* size of a QLIST element */
size_t size = field->size;
@@ -892,10 +899,9 @@ static int get_qlist(QEMUFile *f, void *pv, size_t unused_size,
while (qemu_get_byte(f)) {
elm = g_malloc(size);
- ret = vmstate_load_state(f, vmsd, elm, version_id);
+ ret = vmstate_load_state(f, vmsd, elm, version_id, &local_err);
if (ret) {
- error_report("%s: failed to load %s (%d)", field->name,
- vmsd->name, ret);
+ error_report_err(local_err);
g_free(elm);
return ret;
}
diff --git a/migration/vmstate.c b/migration/vmstate.c
index 08f2b562e3..8d1e9eb62b 100644
--- a/migration/vmstate.c
+++ b/migration/vmstate.c
@@ -132,30 +132,33 @@ static void vmstate_handle_alloc(void *ptr, const VMStateField *field,
}
int vmstate_load_state(QEMUFile *f, const VMStateDescription *vmsd,
- void *opaque, int version_id)
+ void *opaque, int version_id, Error **errp)
{
const VMStateField *field = vmsd->fields;
int ret = 0;
- Error *local_err = NULL;
trace_vmstate_load_state(vmsd->name, version_id);
if (version_id > vmsd->version_id) {
- error_report("%s: incoming version_id %d is too new "
- "for local version_id %d",
- vmsd->name, version_id, vmsd->version_id);
+ error_setg(errp, "%s: incoming version_id %d is too new "
+ "for local version_id %d",
+ vmsd->name, version_id, vmsd->version_id);
trace_vmstate_load_state_end(vmsd->name, "too new", -EINVAL);
return -EINVAL;
}
if (version_id < vmsd->minimum_version_id) {
- error_report("%s: incoming version_id %d is too old "
- "for local minimum version_id %d",
- vmsd->name, version_id, vmsd->minimum_version_id);
+ error_setg(errp, "%s: incoming version_id %d is too old "
+ "for local minimum version_id %d",
+ vmsd->name, version_id, vmsd->minimum_version_id);
trace_vmstate_load_state_end(vmsd->name, "too old", -EINVAL);
return -EINVAL;
}
if (vmsd->pre_load) {
ret = vmsd->pre_load(opaque);
if (ret) {
+ error_setg(errp, "pre load hook failed for: '%s', "
+ "version_id: %d, minimum version_id: %d, ret: %d",
+ vmsd->name, vmsd->version_id, vmsd->minimum_version_id,
+ ret);
return ret;
}
}
@@ -193,13 +196,21 @@ int vmstate_load_state(QEMUFile *f, const VMStateDescription *vmsd,
if (inner_field->flags & VMS_STRUCT) {
ret = vmstate_load_state(f, inner_field->vmsd, curr_elem,
- inner_field->vmsd->version_id);
+ inner_field->vmsd->version_id,
+ errp);
} else if (inner_field->flags & VMS_VSTRUCT) {
ret = vmstate_load_state(f, inner_field->vmsd, curr_elem,
- inner_field->struct_version_id);
+ inner_field->struct_version_id,
+ errp);
} else {
ret = inner_field->info->get(f, curr_elem, size,
inner_field);
+ if (ret < 0) {
+ error_setg(errp,
+ "Failed to load element of type %s for %s: "
+ "%d", inner_field->info->name,
+ inner_field->name, ret);
+ }
}
/* If we used a fake temp field.. free it now */
@@ -209,31 +220,40 @@ int vmstate_load_state(QEMUFile *f, const VMStateDescription *vmsd,
if (ret >= 0) {
ret = qemu_file_get_error(f);
+ if (ret < 0) {
+ error_setg(errp,
+ "Failed to load %s state: stream error: %d",
+ vmsd->name, ret);
+ }
}
if (ret < 0) {
qemu_file_set_error(f, ret);
- error_report("Failed to load %s:%s", vmsd->name,
- field->name);
trace_vmstate_load_field_error(field->name, ret);
return ret;
}
}
} else if (field->flags & VMS_MUST_EXIST) {
- error_report("Input validation failed: %s/%s",
- vmsd->name, field->name);
+ error_setg(errp, "Input validation failed: %s/%s version_id: %d",
+ vmsd->name, field->name, vmsd->version_id);
return -1;
}
field++;
}
assert(field->flags == VMS_END);
- ret = vmstate_subsection_load(f, vmsd, opaque, &local_err);
+ ret = vmstate_subsection_load(f, vmsd, opaque, errp);
if (ret != 0) {
qemu_file_set_error(f, ret);
- error_report_err(local_err);
return ret;
}
if (vmsd->post_load) {
ret = vmsd->post_load(opaque, version_id);
+ if (ret < 0) {
+ error_setg(errp,
+ "post load hook failed for: %s, version_id: %d, "
+ "minimum_version: %d, ret: %d",
+ vmsd->name, vmsd->version_id, vmsd->minimum_version_id,
+ ret);
+ }
}
trace_vmstate_load_state_end(vmsd->name, "end", ret);
return ret;
@@ -570,6 +590,7 @@ vmstate_get_subsection(const VMStateDescription * const *sub,
static int vmstate_subsection_load(QEMUFile *f, const VMStateDescription *vmsd,
void *opaque, Error **errp)
{
+ ERRP_GUARD();
trace_vmstate_subsection_load(vmsd->name);
while (qemu_peek_byte(f, 0) == QEMU_VM_SUBSECTION) {
@@ -609,12 +630,12 @@ static int vmstate_subsection_load(QEMUFile *f, const VMStateDescription *vmsd,
qemu_file_skip(f, len); /* idstr */
version_id = qemu_get_be32(f);
- ret = vmstate_load_state(f, sub_vmsd, opaque, version_id);
+ ret = vmstate_load_state(f, sub_vmsd, opaque, version_id, errp);
if (ret) {
trace_vmstate_subsection_load_bad(vmsd->name, idstr, "(child)");
- error_setg(errp,
- "Loading VM subsection '%s' in '%s' failed: %d",
- idstr, vmsd->name, ret);
+ error_prepend(errp,
+ "Loading VM subsection '%s' in '%s' failed: %d: ",
+ idstr, vmsd->name, ret);
return ret;
}
}
diff --git a/tests/unit/test-vmstate.c b/tests/unit/test-vmstate.c
index 63f28f26f4..4ff0ab632f 100644
--- a/tests/unit/test-vmstate.c
+++ b/tests/unit/test-vmstate.c
@@ -30,6 +30,7 @@
#include "../migration/savevm.h"
#include "qemu/module.h"
#include "io/channel-file.h"
+#include "qapi/error.h"
static int temp_fd;
@@ -108,14 +109,16 @@ static int load_vmstate_one(const VMStateDescription *desc, void *obj,
{
QEMUFile *f;
int ret;
+ Error *local_err = NULL;
f = open_test_file(true);
qemu_put_buffer(f, wire, size);
qemu_fclose(f);
f = open_test_file(false);
- ret = vmstate_load_state(f, desc, obj, version);
+ ret = vmstate_load_state(f, desc, obj, version, &local_err);
if (ret) {
+ error_report_err(local_err);
g_assert(qemu_file_get_error(f));
} else{
g_assert(!qemu_file_get_error(f));
@@ -355,6 +358,8 @@ static const VMStateDescription vmstate_versioned = {
static void test_load_v1(void)
{
+ Error *local_err = NULL;
+ int ret;
uint8_t buf[] = {
0, 0, 0, 10, /* a */
0, 0, 0, 30, /* c */
@@ -365,7 +370,10 @@ static void test_load_v1(void)
QEMUFile *loading = open_test_file(false);
TestStruct obj = { .b = 200, .e = 500, .f = 600 };
- vmstate_load_state(loading, &vmstate_versioned, &obj, 1);
+ ret = vmstate_load_state(loading, &vmstate_versioned, &obj, 1, &local_err);
+ if (ret < 0) {
+ error_report_err(local_err);
+ }
g_assert(!qemu_file_get_error(loading));
g_assert_cmpint(obj.a, ==, 10);
g_assert_cmpint(obj.b, ==, 200);
@@ -378,6 +386,8 @@ static void test_load_v1(void)
static void test_load_v2(void)
{
+ Error *local_err = NULL;
+ int ret;
uint8_t buf[] = {
0, 0, 0, 10, /* a */
0, 0, 0, 20, /* b */
@@ -391,7 +401,10 @@ static void test_load_v2(void)
QEMUFile *loading = open_test_file(false);
TestStruct obj;
- vmstate_load_state(loading, &vmstate_versioned, &obj, 2);
+ ret = vmstate_load_state(loading, &vmstate_versioned, &obj, 2, &local_err);
+ if (ret < 0) {
+ error_report_err(local_err);
+ }
g_assert_cmpint(obj.a, ==, 10);
g_assert_cmpint(obj.b, ==, 20);
g_assert_cmpint(obj.c, ==, 30);
@@ -467,6 +480,8 @@ static void test_save_skip(void)
static void test_load_noskip(void)
{
+ Error *local_err = NULL;
+ int ret;
uint8_t buf[] = {
0, 0, 0, 10, /* a */
0, 0, 0, 20, /* b */
@@ -480,7 +495,10 @@ static void test_load_noskip(void)
QEMUFile *loading = open_test_file(false);
TestStruct obj = { .skip_c_e = false };
- vmstate_load_state(loading, &vmstate_skipping, &obj, 2);
+ ret = vmstate_load_state(loading, &vmstate_skipping, &obj, 2, &local_err);
+ if (ret < 0) {
+ error_report_err(local_err);
+ }
g_assert(!qemu_file_get_error(loading));
g_assert_cmpint(obj.a, ==, 10);
g_assert_cmpint(obj.b, ==, 20);
@@ -493,6 +511,8 @@ static void test_load_noskip(void)
static void test_load_skip(void)
{
+ Error *local_err = NULL;
+ int ret;
uint8_t buf[] = {
0, 0, 0, 10, /* a */
0, 0, 0, 20, /* b */
@@ -504,7 +524,10 @@ static void test_load_skip(void)
QEMUFile *loading = open_test_file(false);
TestStruct obj = { .skip_c_e = true, .c = 300, .e = 500 };
- vmstate_load_state(loading, &vmstate_skipping, &obj, 2);
+ ret = vmstate_load_state(loading, &vmstate_skipping, &obj, 2, &local_err);
+ if (ret < 0) {
+ error_report_err(local_err);
+ }
g_assert(!qemu_file_get_error(loading));
g_assert_cmpint(obj.a, ==, 10);
g_assert_cmpint(obj.b, ==, 20);
@@ -744,6 +767,8 @@ static void test_save_q(void)
static void test_load_q(void)
{
+ int ret;
+ Error *local_err = NULL;
TestQtailq obj_q = {
.i16 = -512,
.i32 = 70000,
@@ -773,7 +798,10 @@ static void test_load_q(void)
TestQtailq tgt;
QTAILQ_INIT(&tgt.q);
- vmstate_load_state(fload, &vmstate_q, &tgt, 1);
+ ret = vmstate_load_state(fload, &vmstate_q, &tgt, 1, &local_err);
+ if (ret < 0) {
+ error_report_err(local_err);
+ }
char eof = qemu_get_byte(fload);
g_assert(!qemu_file_get_error(fload));
g_assert_cmpint(tgt.i16, ==, obj_q.i16);
@@ -1115,6 +1143,8 @@ static void diff_iommu(TestGTreeIOMMU *iommu1, TestGTreeIOMMU *iommu2)
static void test_gtree_load_domain(void)
{
+ Error *local_err = NULL;
+ int ret;
TestGTreeDomain *dest_domain = g_new0(TestGTreeDomain, 1);
TestGTreeDomain *orig_domain = create_first_domain();
QEMUFile *fload, *fsave;
@@ -1127,7 +1157,11 @@ static void test_gtree_load_domain(void)
fload = open_test_file(false);
- vmstate_load_state(fload, &vmstate_domain, dest_domain, 1);
+ ret = vmstate_load_state(fload, &vmstate_domain, dest_domain, 1,
+ &local_err);
+ if (ret < 0) {
+ error_report_err(local_err);
+ }
eof = qemu_get_byte(fload);
g_assert(!qemu_file_get_error(fload));
g_assert_cmpint(orig_domain->id, ==, dest_domain->id);
@@ -1230,6 +1264,8 @@ static void test_gtree_save_iommu(void)
static void test_gtree_load_iommu(void)
{
+ Error *local_err = NULL;
+ int ret;
TestGTreeIOMMU *dest_iommu = g_new0(TestGTreeIOMMU, 1);
TestGTreeIOMMU *orig_iommu = create_iommu();
QEMUFile *fsave, *fload;
@@ -1241,7 +1277,10 @@ static void test_gtree_load_iommu(void)
qemu_fclose(fsave);
fload = open_test_file(false);
- vmstate_load_state(fload, &vmstate_iommu, dest_iommu, 1);
+ ret = vmstate_load_state(fload, &vmstate_iommu, dest_iommu, 1, &local_err);
+ if (ret < 0) {
+ error_report_err(local_err);
+ }
eof = qemu_get_byte(fload);
g_assert(!qemu_file_get_error(fload));
g_assert_cmpint(orig_iommu->id, ==, dest_iommu->id);
@@ -1363,6 +1402,8 @@ static void test_save_qlist(void)
static void test_load_qlist(void)
{
+ Error *local_err = NULL;
+ int ret;
QEMUFile *fsave, *fload;
TestQListContainer *orig_container = alloc_container();
TestQListContainer *dest_container = g_new0(TestQListContainer, 1);
@@ -1376,7 +1417,11 @@ static void test_load_qlist(void)
qemu_fclose(fsave);
fload = open_test_file(false);
- vmstate_load_state(fload, &vmstate_container, dest_container, 1);
+ ret = vmstate_load_state(fload, &vmstate_container, dest_container, 1,
+ &local_err);
+ if (ret < 0) {
+ error_report_err(local_err);
+ }
eof = qemu_get_byte(fload);
g_assert(!qemu_file_get_error(fload));
g_assert_cmpint(eof, ==, QEMU_VM_EOF);
diff --git a/ui/vdagent.c b/ui/vdagent.c
index c0746fe5b1..bc3c77f013 100644
--- a/ui/vdagent.c
+++ b/ui/vdagent.c
@@ -1001,6 +1001,7 @@ static int get_cbinfo(QEMUFile *f, void *pv, size_t size,
VDAgentChardev *vd = QEMU_VDAGENT_CHARDEV(pv);
struct CBInfoArray cbinfo = {};
int i, ret;
+ Error *local_err = NULL;
if (!have_clipboard(vd)) {
return 0;
@@ -1008,8 +1009,10 @@ static int get_cbinfo(QEMUFile *f, void *pv, size_t size,
vdagent_clipboard_peer_register(vd);
- ret = vmstate_load_state(f, &vmstate_cbinfo_array, &cbinfo, 0);
+ ret = vmstate_load_state(f, &vmstate_cbinfo_array, &cbinfo, 0,
+ &local_err);
if (ret) {
+ error_report_err(local_err);
return ret;
}
--
2.50.1
* [PULL 03/45] migration: push Error **errp into qemu_loadvm_state_header()
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
2025-10-03 15:39 ` [PULL 01/45] migration: push Error **errp into vmstate_subsection_load() Peter Xu
2025-10-03 15:39 ` [PULL 02/45] migration: push Error **errp into vmstate_load_state() Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-03 15:39 ` [PULL 04/45] migration: push Error **errp into vmstate_load() Peter Xu
` (42 subsequent siblings)
45 siblings, 0 replies; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Arun Menon, Marc-André Lureau, Akihiko Odaki
From: Arun Menon <armenon@redhat.com>
This is an incremental step in converting the vmstate loading
code to report errors via Error objects instead of printing
them directly to the console/monitor.
Ensure that qemu_loadvm_state_header() reports an error
in errp on failure.
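The conversion pattern can be sketched in isolation. What follows is a minimal, self-contained illustration and not QEMU code: MiniError, mini_error_setg() and mini_load_header() are hypothetical stand-ins for QEMU's Error, error_setg() and qemu_loadvm_state_header(), and MINI_MAGIC is a made-up value. The callee prints nothing; it records the message in errp and returns a negative errno, leaving the decision of how to report it to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for QEMU's Error object: just a message. */
typedef struct MiniError {
    char msg[128];
} MiniError;

/* Stand-in for error_setg(): store a formatted message in *errp. */
static void mini_error_setg(MiniError **errp, const char *fmt, ...)
{
    va_list ap;
    MiniError *err;

    if (!errp) {
        return;                 /* caller does not want the details */
    }
    assert(*errp == NULL);      /* an error may only be set once */
    err = calloc(1, sizeof(*err));
    if (!err) {
        abort();
    }
    va_start(ap, fmt);
    vsnprintf(err->msg, sizeof(err->msg), fmt, ap);
    va_end(ap);
    *errp = err;
}

#define MINI_MAGIC 0x4d494e49U  /* made-up magic, for illustration only */

/* Converted callee: no printing, only errp plus a negative errno. */
static int mini_load_header(uint32_t magic, MiniError **errp)
{
    if (magic != MINI_MAGIC) {
        mini_error_setg(errp, "Not a migration stream, magic: %x != %x",
                        (unsigned)magic, MINI_MAGIC);
        return -EINVAL;
    }
    return 0;
}
```

A caller would pass the address of a NULL-initialised MiniError pointer, check the return value, and then either report err->msg or hand the error further up before freeing it.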
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-3-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/savevm.c | 28 +++++++++++++++++-----------
1 file changed, 17 insertions(+), 11 deletions(-)
diff --git a/migration/savevm.c b/migration/savevm.c
index 55c99e0902..8ac3d33814 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2814,38 +2814,43 @@ qemu_loadvm_section_part_end(QEMUFile *f, uint8_t type)
return 0;
}
-static int qemu_loadvm_state_header(QEMUFile *f)
+static int qemu_loadvm_state_header(QEMUFile *f, Error **errp)
{
unsigned int v;
int ret;
- Error *local_err = NULL;
v = qemu_get_be32(f);
if (v != QEMU_VM_FILE_MAGIC) {
- error_report("Not a migration stream");
+ error_setg(errp, "Not a migration stream, magic: %x != %x",
+ v, QEMU_VM_FILE_MAGIC);
return -EINVAL;
}
v = qemu_get_be32(f);
if (v == QEMU_VM_FILE_VERSION_COMPAT) {
- error_report("SaveVM v2 format is obsolete and don't work anymore");
+ error_setg(errp,
+ "SaveVM v2 format is obsolete and no longer supported");
+
return -ENOTSUP;
}
if (v != QEMU_VM_FILE_VERSION) {
- error_report("Unsupported migration stream version");
+ error_setg(errp, "Unsupported migration stream version, "
+ "file version %x != %x",
+ v, QEMU_VM_FILE_VERSION);
return -ENOTSUP;
}
if (migrate_get_current()->send_configuration) {
- if (qemu_get_byte(f) != QEMU_VM_CONFIGURATION) {
- error_report("Configuration section missing");
+ v = qemu_get_byte(f);
+ if (v != QEMU_VM_CONFIGURATION) {
+ error_setg(errp, "Configuration section missing, %x != %x",
+ v, QEMU_VM_CONFIGURATION);
return -EINVAL;
}
- ret = vmstate_load_state(f, &vmstate_configuration, &savevm_state, 0,
- &local_err);
+ ret = vmstate_load_state(f, &vmstate_configuration, &savevm_state, 0,
+ errp);
if (ret) {
- error_report_err(local_err);
return ret;
}
}
@@ -3121,8 +3126,9 @@ int qemu_loadvm_state(QEMUFile *f)
qemu_loadvm_thread_pool_create(mis);
- ret = qemu_loadvm_state_header(f);
+ ret = qemu_loadvm_state_header(f, &local_err);
if (ret) {
+ error_report_err(local_err);
return ret;
}
--
2.50.1
* [PULL 04/45] migration: push Error **errp into vmstate_load()
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (2 preceding siblings ...)
2025-10-03 15:39 ` [PULL 03/45] migration: push Error **errp into qemu_loadvm_state_header() Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-03 15:39 ` [PULL 05/45] migration: push Error **errp into loadvm_process_command() Peter Xu
` (41 subsequent siblings)
45 siblings, 0 replies; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Arun Menon, Marc-André Lureau, Akihiko Odaki
From: Arun Menon <armenon@redhat.com>
This is an incremental step in converting the vmstate loading
code to report errors via Error objects instead of printing
them directly to the console/monitor.
Ensure that vmstate_load() reports an error
in errp on failure.
For now, the errors are reported using error_report_err().
Subsequent patches in this series remove this
once the error can actually be propagated to the calling
function.
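The intermediate state described here, where a converted callee meets a caller that has no errp parameter yet, can be sketched as follows. This is an illustration under stated assumptions, not QEMU code: MiniError, mini_error_setg(), mini_error_report_err(), mini_vmstate_load() and mini_load_section() are hypothetical stand-ins for Error, error_setg(), error_report_err(), vmstate_load() and its callers.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for QEMU's Error object: just a message. */
typedef struct MiniError {
    char msg[128];
} MiniError;

/* Stand-in for error_setg(): store a formatted message in *errp. */
static void mini_error_setg(MiniError **errp, const char *fmt, ...)
{
    va_list ap;
    MiniError *err;

    if (!errp) {
        return;
    }
    assert(*errp == NULL);      /* an error may only be set once */
    err = calloc(1, sizeof(*err));
    if (!err) {
        abort();
    }
    va_start(ap, fmt);
    vsnprintf(err->msg, sizeof(err->msg), fmt, ap);
    va_end(ap);
    *errp = err;
}

/* Stand-in for error_report_err(): print the message, then free it. */
static void mini_error_report_err(MiniError *err)
{
    fprintf(stderr, "%s\n", err->msg);
    free(err);
}

/* Already converted: sets *errp on failure instead of printing. */
static int mini_vmstate_load(int version_id, MiniError **errp)
{
    if (version_id != 1) {
        mini_error_setg(errp, "Unsupported version_id: %d", version_id);
        return -EINVAL;
    }
    return 0;
}

/* Not converted yet: no errp parameter, so report locally for now. */
static int mini_load_section(int version_id)
{
    MiniError *local_err = NULL;
    int ret = mini_vmstate_load(version_id, &local_err);

    if (ret < 0) {
        mini_error_report_err(local_err);   /* consumes local_err */
    }
    return ret;
}
```

Once mini_load_section() gains its own errp parameter, the local variable and the report call disappear and errp is passed straight through, which is the step the later patches in the series perform for the real functions.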
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-4-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/savevm.c | 22 +++++++++++++++-------
1 file changed, 15 insertions(+), 7 deletions(-)
diff --git a/migration/savevm.c b/migration/savevm.c
index 8ac3d33814..fffea57cd9 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -963,14 +963,20 @@ void vmstate_unregister(VMStateIf *obj, const VMStateDescription *vmsd,
}
}
-static int vmstate_load(QEMUFile *f, SaveStateEntry *se)
+static int vmstate_load(QEMUFile *f, SaveStateEntry *se, Error **errp)
{
+ int ret;
trace_vmstate_load(se->idstr, se->vmsd ? se->vmsd->name : "(old)");
if (!se->vmsd) { /* Old style */
- return se->ops->load_state(f, se->opaque, se->load_version_id);
+ ret = se->ops->load_state(f, se->opaque, se->load_version_id);
+ if (ret < 0) {
+ error_setg(errp, "Failed to load vmstate version_id: %d, ret: %d",
+ se->load_version_id, ret);
+ }
+ return ret;
}
return vmstate_load_state(f, se->vmsd, se->opaque, se->load_version_id,
- &error_fatal);
+ errp);
}
static void vmstate_save_old_style(QEMUFile *f, SaveStateEntry *se,
@@ -2692,6 +2698,7 @@ qemu_loadvm_section_start_full(QEMUFile *f, uint8_t type)
SaveStateEntry *se;
char idstr[256];
int ret;
+ Error *local_err = NULL;
/* Read section start */
section_id = qemu_get_be32(f);
@@ -2741,10 +2748,11 @@ qemu_loadvm_section_start_full(QEMUFile *f, uint8_t type)
start_ts = qemu_clock_get_us(QEMU_CLOCK_REALTIME);
}
- ret = vmstate_load(f, se);
+ ret = vmstate_load(f, se, &local_err);
if (ret < 0) {
error_report("error while loading state for instance 0x%"PRIx32" of"
" device '%s'", instance_id, idstr);
+ error_report_err(local_err);
return ret;
}
@@ -2769,6 +2777,7 @@ qemu_loadvm_section_part_end(QEMUFile *f, uint8_t type)
uint32_t section_id;
SaveStateEntry *se;
int ret;
+ Error *local_err = NULL;
section_id = qemu_get_be32(f);
@@ -2794,10 +2803,9 @@ qemu_loadvm_section_part_end(QEMUFile *f, uint8_t type)
start_ts = qemu_clock_get_us(QEMU_CLOCK_REALTIME);
}
- ret = vmstate_load(f, se);
+ ret = vmstate_load(f, se, &local_err);
if (ret < 0) {
- error_report("error while loading state section id %d(%s)",
- section_id, se->idstr);
+ error_report_err(local_err);
return ret;
}
--
2.50.1
* [PULL 05/45] migration: push Error **errp into loadvm_process_command()
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (3 preceding siblings ...)
2025-10-03 15:39 ` [PULL 04/45] migration: push Error **errp into vmstate_load() Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-03 15:39 ` [PULL 06/45] migration: push Error **errp into loadvm_handle_cmd_packaged() Peter Xu
` (40 subsequent siblings)
45 siblings, 0 replies; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Arun Menon, Marc-André Lureau, Akihiko Odaki
From: Arun Menon <armenon@redhat.com>
This is an incremental step in converting the vmstate loading
code to report errors via Error objects instead of printing
them directly to the console/monitor.
Ensure that loadvm_process_command() reports an error
in errp on failure.
For now, the errors are reported using error_report_err().
Subsequent patches in this series remove this
once the error can actually be propagated to the calling
function.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-5-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/savevm.c | 86 +++++++++++++++++++++++++++++++++-------------
1 file changed, 63 insertions(+), 23 deletions(-)
diff --git a/migration/savevm.c b/migration/savevm.c
index fffea57cd9..d1ed2e1cde 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2546,32 +2546,37 @@ static int loadvm_postcopy_handle_switchover_start(void)
* LOADVM_QUIT All good, but exit the loop
* <0 Error
*/
-static int loadvm_process_command(QEMUFile *f)
+static int loadvm_process_command(QEMUFile *f, Error **errp)
{
MigrationIncomingState *mis = migration_incoming_get_current();
uint16_t cmd;
uint16_t len;
uint32_t tmp32;
+ int ret;
cmd = qemu_get_be16(f);
len = qemu_get_be16(f);
/* Check validity before continue processing of cmds */
- if (qemu_file_get_error(f)) {
- return qemu_file_get_error(f);
+ ret = qemu_file_get_error(f);
+ if (ret) {
+ error_setg(errp,
+ "Failed to load VM process command: stream error: %d",
+ ret);
+ return ret;
}
if (cmd >= MIG_CMD_MAX || cmd == MIG_CMD_INVALID) {
- error_report("MIG_CMD 0x%x unknown (len 0x%x)", cmd, len);
+ error_setg(errp, "MIG_CMD 0x%x unknown (len 0x%x)", cmd, len);
return -EINVAL;
}
trace_loadvm_process_command(mig_cmd_args[cmd].name, len);
if (mig_cmd_args[cmd].len != -1 && mig_cmd_args[cmd].len != len) {
- error_report("%s received with bad length - expecting %zu, got %d",
- mig_cmd_args[cmd].name,
- (size_t)mig_cmd_args[cmd].len, len);
+ error_setg(errp, "%s received with bad length - expecting %zu, got %d",
+ mig_cmd_args[cmd].name,
+ (size_t)mig_cmd_args[cmd].len, len);
return -ERANGE;
}
@@ -2584,7 +2589,7 @@ static int loadvm_process_command(QEMUFile *f)
}
mis->to_src_file = qemu_file_get_return_path(f);
if (!mis->to_src_file) {
- error_report("CMD_OPEN_RETURN_PATH failed");
+ error_setg(errp, "CMD_OPEN_RETURN_PATH failed");
return -1;
}
@@ -2594,11 +2599,10 @@ static int loadvm_process_command(QEMUFile *f)
* been created.
*/
if (migrate_switchover_ack() && !mis->switchover_ack_pending_num) {
- int ret = migrate_send_rp_switchover_ack(mis);
+ ret = migrate_send_rp_switchover_ack(mis);
if (ret) {
- error_report(
- "Could not send switchover ack RP MSG, err %d (%s)", ret,
- strerror(-ret));
+ error_setg_errno(errp, -ret,
+ "Could not send switchover ack RP MSG");
return ret;
}
}
@@ -2608,39 +2612,71 @@ static int loadvm_process_command(QEMUFile *f)
tmp32 = qemu_get_be32(f);
trace_loadvm_process_command_ping(tmp32);
if (!mis->to_src_file) {
- error_report("CMD_PING (0x%x) received with no return path",
- tmp32);
+ error_setg(errp, "CMD_PING (0x%x) received with no return path",
+ tmp32);
return -1;
}
migrate_send_rp_pong(mis, tmp32);
break;
case MIG_CMD_PACKAGED:
- return loadvm_handle_cmd_packaged(mis);
+ ret = loadvm_handle_cmd_packaged(mis);
+ if (ret < 0) {
+ error_setg(errp, "Failed to load device state command: %d", ret);
+ }
+ return ret;
case MIG_CMD_POSTCOPY_ADVISE:
- return loadvm_postcopy_handle_advise(mis, len);
+ ret = loadvm_postcopy_handle_advise(mis, len);
+ if (ret < 0) {
+ error_setg(errp, "Failed to load device state command: %d", ret);
+ }
+ return ret;
case MIG_CMD_POSTCOPY_LISTEN:
- return loadvm_postcopy_handle_listen(mis);
+ ret = loadvm_postcopy_handle_listen(mis);
+ if (ret < 0) {
+ error_setg(errp, "Failed to load device state command: %d", ret);
+ }
+ return ret;
case MIG_CMD_POSTCOPY_RUN:
- return loadvm_postcopy_handle_run(mis);
+ ret = loadvm_postcopy_handle_run(mis);
+ if (ret < 0) {
+ error_setg(errp, "Failed to load device state command: %d", ret);
+ }
+ return ret;
case MIG_CMD_POSTCOPY_RAM_DISCARD:
- return loadvm_postcopy_ram_handle_discard(mis, len);
+ ret = loadvm_postcopy_ram_handle_discard(mis, len);
+ if (ret < 0) {
+ error_setg(errp, "Failed to load device state command: %d", ret);
+ }
+ return ret;
case MIG_CMD_POSTCOPY_RESUME:
return loadvm_postcopy_handle_resume(mis);
case MIG_CMD_RECV_BITMAP:
- return loadvm_handle_recv_bitmap(mis, len);
+ ret = loadvm_handle_recv_bitmap(mis, len);
+ if (ret < 0) {
+ error_setg(errp, "Failed to load device state command: %d", ret);
+ }
+ return ret;
case MIG_CMD_ENABLE_COLO:
- return loadvm_process_enable_colo(mis);
+ ret = loadvm_process_enable_colo(mis);
+ if (ret < 0) {
+ error_setg(errp, "Failed to load device state command: %d", ret);
+ }
+ return ret;
case MIG_CMD_SWITCHOVER_START:
- return loadvm_postcopy_handle_switchover_start();
+ ret = loadvm_postcopy_handle_switchover_start();
+ if (ret < 0) {
+ error_setg(errp, "Failed to load device state command: %d", ret);
+ }
+ return ret;
}
return 0;
@@ -3049,6 +3085,7 @@ int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis)
{
uint8_t section_type;
int ret = 0;
+ Error *local_err = NULL;
retry:
while (true) {
@@ -3076,7 +3113,10 @@ retry:
}
break;
case QEMU_VM_COMMAND:
- ret = loadvm_process_command(f);
+ ret = loadvm_process_command(f, &local_err);
+ if (ret < 0) {
+ error_report_err(local_err);
+ }
trace_qemu_loadvm_state_section_command(ret);
if ((ret < 0) || (ret == LOADVM_QUIT)) {
goto out;
--
2.50.1
* [PULL 06/45] migration: push Error **errp into loadvm_handle_cmd_packaged()
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (4 preceding siblings ...)
2025-10-03 15:39 ` [PULL 05/45] migration: push Error **errp into loadvm_process_command() Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-03 15:39 ` [PULL 07/45] migration: push Error **errp into qemu_loadvm_state() Peter Xu
` (39 subsequent siblings)
45 siblings, 0 replies; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Arun Menon, Daniel P. Berrangé, Akihiko Odaki
From: Arun Menon <armenon@redhat.com>
This is an incremental step in converting the vmstate loading
code to report errors via Error objects instead of printing
them directly to the console/monitor.
Ensure that loadvm_handle_cmd_packaged() reports an error
in errp on failure.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-6-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/savevm.c | 17 ++++++++---------
1 file changed, 8 insertions(+), 9 deletions(-)
diff --git a/migration/savevm.c b/migration/savevm.c
index d1ed2e1cde..5e54651652 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2405,7 +2405,7 @@ static int loadvm_postcopy_handle_resume(MigrationIncomingState *mis)
* Returns: Negative values on error
*
*/
-static int loadvm_handle_cmd_packaged(MigrationIncomingState *mis)
+static int loadvm_handle_cmd_packaged(MigrationIncomingState *mis, Error **errp)
{
int ret;
size_t length;
@@ -2415,7 +2415,7 @@ static int loadvm_handle_cmd_packaged(MigrationIncomingState *mis)
trace_loadvm_handle_cmd_packaged(length);
if (length > MAX_VM_CMD_PACKAGED_SIZE) {
- error_report("Unreasonably large packaged state: %zu", length);
+ error_setg(errp, "Unreasonably large packaged state: %zu", length);
return -1;
}
@@ -2426,8 +2426,8 @@ static int loadvm_handle_cmd_packaged(MigrationIncomingState *mis)
length);
if (ret != length) {
object_unref(OBJECT(bioc));
- error_report("CMD_PACKAGED: Buffer receive fail ret=%d length=%zu",
- ret, length);
+ error_setg(errp, "CMD_PACKAGED: Buffer receive fail ret=%d length=%zu",
+ ret, length);
return (ret < 0) ? ret : -EAGAIN;
}
bioc->usage += length;
@@ -2457,6 +2457,9 @@ static int loadvm_handle_cmd_packaged(MigrationIncomingState *mis)
} while (1);
ret = qemu_loadvm_state_main(packf, mis);
+ if (ret < 0) {
+ error_setg(errp, "VM state load failed: %d", ret);
+ }
trace_loadvm_handle_cmd_packaged_main(ret);
qemu_fclose(packf);
object_unref(OBJECT(bioc));
@@ -2620,11 +2623,7 @@ static int loadvm_process_command(QEMUFile *f, Error **errp)
break;
case MIG_CMD_PACKAGED:
- ret = loadvm_handle_cmd_packaged(mis);
- if (ret < 0) {
- error_setg(errp, "Failed to load device state command: %d", ret);
- }
- return ret;
+ return loadvm_handle_cmd_packaged(mis, errp);
case MIG_CMD_POSTCOPY_ADVISE:
ret = loadvm_postcopy_handle_advise(mis, len);
--
2.50.1
* [PULL 07/45] migration: push Error **errp into qemu_loadvm_state()
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (5 preceding siblings ...)
2025-10-03 15:39 ` [PULL 06/45] migration: push Error **errp into loadvm_handle_cmd_packaged() Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-03 15:39 ` [PULL 08/45] migration: push Error **errp into qemu_load_device_state() Peter Xu
` (38 subsequent siblings)
45 siblings, 0 replies; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Arun Menon, Akihiko Odaki, Marc-André Lureau
From: Arun Menon <armenon@redhat.com>
This is an incremental step in converting the vmstate loading
code to report errors via Error objects instead of printing
them directly to the console/monitor.
Ensure that qemu_loadvm_state() reports an error
in errp on failure.
When postcopy live migration runs, the device states are loaded by
both the QEMU coroutine process_incoming_migration_co() and the
postcopy_ram_listen_thread(). Therefore, it is important that the
coroutine also reports the error on failure, with
error_report_err(). Otherwise, the source QEMU will not display
any errors before going into the postcopy pause state.
Suggested-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-7-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/savevm.h | 2 +-
migration/migration.c | 14 ++++++++++++--
migration/savevm.c | 30 ++++++++++++++++++------------
3 files changed, 31 insertions(+), 15 deletions(-)
diff --git a/migration/savevm.h b/migration/savevm.h
index 2d5e9c7166..b80770b746 100644
--- a/migration/savevm.h
+++ b/migration/savevm.h
@@ -64,7 +64,7 @@ void qemu_savevm_send_colo_enable(QEMUFile *f);
void qemu_savevm_live_state(QEMUFile *f);
int qemu_save_device_state(QEMUFile *f);
-int qemu_loadvm_state(QEMUFile *f);
+int qemu_loadvm_state(QEMUFile *f, Error **errp);
void qemu_loadvm_state_cleanup(MigrationIncomingState *mis);
int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis);
int qemu_load_device_state(QEMUFile *f);
diff --git a/migration/migration.c b/migration/migration.c
index e1ac4d73c2..cba2a39355 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -881,7 +881,7 @@ process_incoming_migration_co(void *opaque)
MIGRATION_STATUS_ACTIVE);
mis->loadvm_co = qemu_coroutine_self();
- ret = qemu_loadvm_state(mis->from_src_file);
+ ret = qemu_loadvm_state(mis->from_src_file, &local_err);
mis->loadvm_co = NULL;
trace_vmstate_downtime_checkpoint("dst-precopy-loadvm-completed");
@@ -908,7 +908,8 @@ process_incoming_migration_co(void *opaque)
}
if (ret < 0) {
- error_setg(&local_err, "load of migration failed: %s", strerror(-ret));
+ error_prepend(&local_err, "load of migration failed: %s: ",
+ strerror(-ret));
goto fail;
}
@@ -935,6 +936,15 @@ fail:
}
exit(EXIT_FAILURE);
+ } else {
+ /*
+ * Report the error here in case that QEMU abruptly exits
+ * when postcopy is enabled.
+ */
+ WITH_QEMU_LOCK_GUARD(&s->error_mutex) {
+ error_report_err(s->error);
+ s->error = NULL;
+ }
}
out:
/* Pairs with the refcount taken in qmp_migrate_incoming() */
diff --git a/migration/savevm.c b/migration/savevm.c
index 5e54651652..88116ed278 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -3159,28 +3159,24 @@ out:
return ret;
}
-int qemu_loadvm_state(QEMUFile *f)
+int qemu_loadvm_state(QEMUFile *f, Error **errp)
{
MigrationState *s = migrate_get_current();
MigrationIncomingState *mis = migration_incoming_get_current();
- Error *local_err = NULL;
int ret;
- if (qemu_savevm_state_blocked(&local_err)) {
- error_report_err(local_err);
+ if (qemu_savevm_state_blocked(errp)) {
return -EINVAL;
}
qemu_loadvm_thread_pool_create(mis);
- ret = qemu_loadvm_state_header(f, &local_err);
+ ret = qemu_loadvm_state_header(f, errp);
if (ret) {
- error_report_err(local_err);
return ret;
}
- if (qemu_loadvm_state_setup(f, &local_err) != 0) {
- error_report_err(local_err);
+ if (qemu_loadvm_state_setup(f, errp) != 0) {
return -EINVAL;
}
@@ -3191,6 +3187,9 @@ int qemu_loadvm_state(QEMUFile *f)
cpu_synchronize_all_pre_loadvm();
ret = qemu_loadvm_state_main(f, mis);
+ if (ret < 0) {
+ error_setg(errp, "Load VM state failed: %d", ret);
+ }
qemu_event_set(&mis->main_thread_load_event);
trace_qemu_loadvm_state_post_main(ret);
@@ -3208,8 +3207,15 @@ int qemu_loadvm_state(QEMUFile *f)
if (migrate_has_error(migrate_get_current()) ||
!qemu_loadvm_thread_pool_wait(s, mis)) {
ret = -EINVAL;
+ error_setg(errp,
+ "Error while loading vmstate");
} else {
ret = qemu_file_get_error(f);
+ if (ret < 0) {
+ error_setg(errp,
+ "Error while loading vmstate: stream error: %d",
+ ret);
+ }
}
}
/*
@@ -3474,6 +3480,7 @@ void qmp_xen_save_devices_state(const char *filename, bool has_live, bool live,
void qmp_xen_load_devices_state(const char *filename, Error **errp)
{
+ ERRP_GUARD();
QEMUFile *f;
QIOChannelFile *ioc;
int ret;
@@ -3495,10 +3502,10 @@ void qmp_xen_load_devices_state(const char *filename, Error **errp)
f = qemu_file_new_input(QIO_CHANNEL(ioc));
object_unref(OBJECT(ioc));
- ret = qemu_loadvm_state(f);
+ ret = qemu_loadvm_state(f, errp);
qemu_fclose(f);
if (ret < 0) {
- error_setg(errp, "loading Xen device state failed");
+ error_prepend(errp, "loading Xen device state failed: ");
}
migration_incoming_state_destroy();
}
@@ -3569,13 +3576,12 @@ bool load_snapshot(const char *name, const char *vmstate,
ret = -EINVAL;
goto err_drain;
}
- ret = qemu_loadvm_state(f);
+ ret = qemu_loadvm_state(f, errp);
migration_incoming_state_destroy();
bdrv_drain_all_end();
if (ret < 0) {
- error_setg(errp, "Error %d while loading VM state", ret);
return false;
}
--
2.50.1
* [PULL 08/45] migration: push Error **errp into qemu_load_device_state()
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (6 preceding siblings ...)
2025-10-03 15:39 ` [PULL 07/45] migration: push Error **errp into qemu_loadvm_state() Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-03 15:39 ` [PULL 09/45] migration: push Error **errp into qemu_loadvm_state_main() Peter Xu
` (37 subsequent siblings)
45 siblings, 0 replies; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Arun Menon, Marc-André Lureau, Akihiko Odaki
From: Arun Menon <armenon@redhat.com>
This is an incremental step in converting the vmstate loading
code to report errors via Error objects instead of printing
them directly to the console/monitor.
Ensure that qemu_load_device_state() reports an error
in errp on failure.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-8-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/savevm.h | 2 +-
migration/colo.c | 3 +--
migration/savevm.c | 4 ++--
3 files changed, 4 insertions(+), 5 deletions(-)
diff --git a/migration/savevm.h b/migration/savevm.h
index b80770b746..b12681839f 100644
--- a/migration/savevm.h
+++ b/migration/savevm.h
@@ -67,7 +67,7 @@ int qemu_save_device_state(QEMUFile *f);
int qemu_loadvm_state(QEMUFile *f, Error **errp);
void qemu_loadvm_state_cleanup(MigrationIncomingState *mis);
int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis);
-int qemu_load_device_state(QEMUFile *f);
+int qemu_load_device_state(QEMUFile *f, Error **errp);
int qemu_loadvm_approve_switchover(void);
int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f,
bool in_postcopy);
diff --git a/migration/colo.c b/migration/colo.c
index cf4d71d9ed..a426ec5b60 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -729,9 +729,8 @@ static void colo_incoming_process_checkpoint(MigrationIncomingState *mis,
bql_lock();
vmstate_loading = true;
colo_flush_ram_cache();
- ret = qemu_load_device_state(fb);
+ ret = qemu_load_device_state(fb, errp);
if (ret < 0) {
- error_setg(errp, "COLO: load device state failed");
vmstate_loading = false;
bql_unlock();
return;
diff --git a/migration/savevm.c b/migration/savevm.c
index 88116ed278..9e30718995 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -3264,7 +3264,7 @@ int qemu_loadvm_state(QEMUFile *f, Error **errp)
return ret;
}
-int qemu_load_device_state(QEMUFile *f)
+int qemu_load_device_state(QEMUFile *f, Error **errp)
{
MigrationIncomingState *mis = migration_incoming_get_current();
int ret;
@@ -3272,7 +3272,7 @@ int qemu_load_device_state(QEMUFile *f)
/* Load QEMU_VM_SECTION_FULL section */
ret = qemu_loadvm_state_main(f, mis);
if (ret < 0) {
- error_report("Failed to load device state: %d", ret);
+ error_setg(errp, "Failed to load device state: %d", ret);
return ret;
}
--
2.50.1
* [PULL 09/45] migration: push Error **errp into qemu_loadvm_state_main()
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (7 preceding siblings ...)
2025-10-03 15:39 ` [PULL 08/45] migration: push Error **errp into qemu_load_device_state() Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-03 15:39 ` [PULL 10/45] migration: push Error **errp into qemu_loadvm_section_start_full() Peter Xu
` (36 subsequent siblings)
45 siblings, 0 replies; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Arun Menon, Daniel P. Berrangé, Akihiko Odaki
From: Arun Menon <armenon@redhat.com>
This is an incremental step in converting the vmstate loading
code to report errors via Error objects instead of printing
them directly to the console/monitor.
Ensure that qemu_loadvm_state_main() reports an error
in errp on failure.
In the out section, set an error explicitly on failure if none has
been set yet (*errp is NULL). This will be removed in the subsequent
patch when all of the calls are converted to pass errp.
The error message in the default case of qemu_loadvm_state_main()
contains the word "savevm". It is removed because it can confuse
users reading destination-side error logs.
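The fallback in the out section can be sketched in isolation. This is a minimal illustration and not QEMU code: MiniError, mini_error_setg(), mini_step() and mini_load_main() are hypothetical stand-ins. The sketch simply requires a non-NULL errp, whereas the patch relies on ERRP_GUARD() to make *errp safe to inspect.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for QEMU's Error object: just a message. */
typedef struct MiniError {
    char msg[128];
} MiniError;

/* Stand-in for error_setg(): store a formatted message in *errp. */
static void mini_error_setg(MiniError **errp, const char *fmt, ...)
{
    va_list ap;
    MiniError *err;

    if (!errp) {
        return;
    }
    assert(*errp == NULL);      /* an error may only be set once */
    err = calloc(1, sizeof(*err));
    if (!err) {
        abort();
    }
    va_start(ap, fmt);
    vsnprintf(err->msg, sizeof(err->msg), fmt, ap);
    va_end(ap);
    *errp = err;
}

/*
 * A callee that may fail. With set_error == 0 it fails without
 * setting *errp, like a code path that has not been converted yet.
 */
static int mini_step(int fail, int set_error, MiniError **errp)
{
    if (!fail) {
        return 0;
    }
    if (set_error) {
        mini_error_setg(errp, "Unknown section type %d", 42);
    }
    return -EINVAL;
}

/* Guarantees that a negative return always comes with an error set. */
static int mini_load_main(int fail, int set_error, MiniError **errp)
{
    int ret;

    assert(errp != NULL);       /* this sketch needs to read *errp */
    ret = mini_step(fail, set_error, errp);
    if (ret < 0 && *errp == NULL) {
        /* Fallback so that callers can rely on *errp being set. */
        mini_error_setg(errp, "Loading VM state failed: %d", ret);
    }
    return ret;
}
```

A specific message set by the callee is kept as is; the generic message only appears when the failing path set nothing, so it becomes dead code once every callee is converted.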
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-9-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/savevm.h | 3 ++-
migration/colo.c | 3 +--
migration/savevm.c | 36 +++++++++++++++++-------------------
3 files changed, 20 insertions(+), 22 deletions(-)
diff --git a/migration/savevm.h b/migration/savevm.h
index b12681839f..c337e3e3d1 100644
--- a/migration/savevm.h
+++ b/migration/savevm.h
@@ -66,7 +66,8 @@ int qemu_save_device_state(QEMUFile *f);
int qemu_loadvm_state(QEMUFile *f, Error **errp);
void qemu_loadvm_state_cleanup(MigrationIncomingState *mis);
-int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis);
+int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis,
+ Error **errp);
int qemu_load_device_state(QEMUFile *f, Error **errp);
int qemu_loadvm_approve_switchover(void);
int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f,
diff --git a/migration/colo.c b/migration/colo.c
index a426ec5b60..ad50a3abc9 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -686,11 +686,10 @@ static void colo_incoming_process_checkpoint(MigrationIncomingState *mis,
bql_lock();
cpu_synchronize_all_states();
- ret = qemu_loadvm_state_main(mis->from_src_file, mis);
+ ret = qemu_loadvm_state_main(mis->from_src_file, mis, errp);
bql_unlock();
if (ret < 0) {
- error_setg(errp, "Load VM's live state (ram) error");
return;
}
diff --git a/migration/savevm.c b/migration/savevm.c
index 9e30718995..f1338f5cf6 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2105,7 +2105,7 @@ static void *postcopy_ram_listen_thread(void *opaque)
qemu_file_set_blocking(f, true, &error_fatal);
/* TODO: sanity check that only postcopiable data will be loaded here */
- load_res = qemu_loadvm_state_main(f, mis);
+ load_res = qemu_loadvm_state_main(f, mis, &error_fatal);
/*
* This is tricky, but, mis->from_src_file can change after it
@@ -2456,10 +2456,7 @@ static int loadvm_handle_cmd_packaged(MigrationIncomingState *mis, Error **errp)
qemu_coroutine_yield();
} while (1);
- ret = qemu_loadvm_state_main(packf, mis);
- if (ret < 0) {
- error_setg(errp, "VM state load failed: %d", ret);
- }
+ ret = qemu_loadvm_state_main(packf, mis, errp);
trace_loadvm_handle_cmd_packaged_main(ret);
qemu_fclose(packf);
object_unref(OBJECT(bioc));
@@ -3080,18 +3077,22 @@ static bool postcopy_pause_incoming(MigrationIncomingState *mis)
return true;
}
-int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis)
+int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis,
+ Error **errp)
{
+ ERRP_GUARD();
uint8_t section_type;
int ret = 0;
- Error *local_err = NULL;
retry:
while (true) {
section_type = qemu_get_byte(f);
- ret = qemu_file_get_error_obj_any(f, mis->postcopy_qemufile_dst, NULL);
+ ret = qemu_file_get_error_obj_any(f, mis->postcopy_qemufile_dst, errp);
if (ret) {
+ error_prepend(errp,
+ "Failed to load section ID: stream error: %d: ",
+ ret);
break;
}
@@ -3112,10 +3113,7 @@ retry:
}
break;
case QEMU_VM_COMMAND:
- ret = loadvm_process_command(f, &local_err);
- if (ret < 0) {
- error_report_err(local_err);
- }
+ ret = loadvm_process_command(f, errp);
trace_qemu_loadvm_state_section_command(ret);
if ((ret < 0) || (ret == LOADVM_QUIT)) {
goto out;
@@ -3125,7 +3123,7 @@ retry:
/* This is the end of migration */
goto out;
default:
- error_report("Unknown savevm section type %d", section_type);
+ error_setg(errp, "Unknown section type %d", section_type);
ret = -EINVAL;
goto out;
}
@@ -3133,6 +3131,9 @@ retry:
out:
if (ret < 0) {
+ if (*errp == NULL) {
+ error_setg(errp, "Loading VM state failed: %d", ret);
+ }
qemu_file_set_error(f, ret);
/* Cancel bitmaps incoming regardless of recovery */
@@ -3153,6 +3154,7 @@ out:
migrate_postcopy_ram() && postcopy_pause_incoming(mis)) {
/* Reset f to point to the newly created channel */
f = mis->from_src_file;
+ error_free_or_abort(errp);
goto retry;
}
}
@@ -3186,10 +3188,7 @@ int qemu_loadvm_state(QEMUFile *f, Error **errp)
cpu_synchronize_all_pre_loadvm();
- ret = qemu_loadvm_state_main(f, mis);
- if (ret < 0) {
- error_setg(errp, "Load VM state failed: %d", ret);
- }
+ ret = qemu_loadvm_state_main(f, mis, errp);
qemu_event_set(&mis->main_thread_load_event);
trace_qemu_loadvm_state_post_main(ret);
@@ -3270,9 +3269,8 @@ int qemu_load_device_state(QEMUFile *f, Error **errp)
int ret;
/* Load QEMU_VM_SECTION_FULL section */
- ret = qemu_loadvm_state_main(f, mis);
+ ret = qemu_loadvm_state_main(f, mis, errp);
if (ret < 0) {
- error_setg(errp, "Failed to load device state: %d", ret);
return ret;
}
--
2.50.1
* [PULL 10/45] migration: push Error **errp into qemu_loadvm_section_start_full()
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (8 preceding siblings ...)
2025-10-03 15:39 ` [PULL 09/45] migration: push Error **errp into qemu_loadvm_state_main() Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-03 15:39 ` [PULL 11/45] migration: push Error **errp into qemu_loadvm_section_part_end() Peter Xu
` (35 subsequent siblings)
45 siblings, 0 replies; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Arun Menon, Marc-André Lureau, Akihiko Odaki
From: Arun Menon <armenon@redhat.com>
This is an incremental step in converting the vmstate loading
code to report errors via Error objects instead of printing
them directly to the console/monitor.
qemu_loadvm_section_start_full() must now report an error in errp
on failure.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-10-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/savevm.c | 37 +++++++++++++++++++------------------
1 file changed, 19 insertions(+), 18 deletions(-)
diff --git a/migration/savevm.c b/migration/savevm.c
index f1338f5cf6..83d8fb8f41 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2722,21 +2722,21 @@ static bool check_section_footer(QEMUFile *f, SaveStateEntry *se)
}
static int
-qemu_loadvm_section_start_full(QEMUFile *f, uint8_t type)
+qemu_loadvm_section_start_full(QEMUFile *f, uint8_t type, Error **errp)
{
+ ERRP_GUARD();
bool trace_downtime = (type == QEMU_VM_SECTION_FULL);
uint32_t instance_id, version_id, section_id;
int64_t start_ts, end_ts;
SaveStateEntry *se;
char idstr[256];
int ret;
- Error *local_err = NULL;
/* Read section start */
section_id = qemu_get_be32(f);
if (!qemu_get_counted_string(f, idstr)) {
- error_report("Unable to read ID string for section %u",
- section_id);
+ error_setg(errp, "Unable to read ID string for section %u",
+ section_id);
return -EINVAL;
}
instance_id = qemu_get_be32(f);
@@ -2744,8 +2744,7 @@ qemu_loadvm_section_start_full(QEMUFile *f, uint8_t type)
ret = qemu_file_get_error(f);
if (ret) {
- error_report("%s: Failed to read instance/version ID: %d",
- __func__, ret);
+ error_setg(errp, "Failed to read instance/version ID: %d", ret);
return ret;
}
@@ -2754,17 +2753,17 @@ qemu_loadvm_section_start_full(QEMUFile *f, uint8_t type)
/* Find savevm section */
se = find_se(idstr, instance_id);
if (se == NULL) {
- error_report("Unknown savevm section or instance '%s' %"PRIu32". "
- "Make sure that your current VM setup matches your "
- "saved VM setup, including any hotplugged devices",
- idstr, instance_id);
+ error_setg(errp, "Unknown section or instance '%s' %"PRIu32". "
+ "Make sure that your current VM setup matches your "
+ "saved VM setup, including any hotplugged devices",
+ idstr, instance_id);
return -EINVAL;
}
/* Validate version */
if (version_id > se->version_id) {
- error_report("savevm: unsupported version %d for '%s' v%d",
- version_id, idstr, se->version_id);
+ error_setg(errp, "unsupported version %d for '%s' v%d",
+ version_id, idstr, se->version_id);
return -EINVAL;
}
se->load_version_id = version_id;
@@ -2772,7 +2771,7 @@ qemu_loadvm_section_start_full(QEMUFile *f, uint8_t type)
/* Validate if it is a device's state */
if (xen_enabled() && se->is_ram) {
- error_report("loadvm: %s RAM loading not allowed on Xen", idstr);
+ error_setg(errp, "loadvm: %s RAM loading not allowed on Xen", idstr);
return -EINVAL;
}
@@ -2780,11 +2779,11 @@ qemu_loadvm_section_start_full(QEMUFile *f, uint8_t type)
start_ts = qemu_clock_get_us(QEMU_CLOCK_REALTIME);
}
- ret = vmstate_load(f, se, &local_err);
+ ret = vmstate_load(f, se, errp);
if (ret < 0) {
- error_report("error while loading state for instance 0x%"PRIx32" of"
- " device '%s'", instance_id, idstr);
- error_report_err(local_err);
+ error_prepend(errp,
+ "error while loading state for instance 0x%"PRIx32" of"
+ " device '%s': ", instance_id, idstr);
return ret;
}
@@ -2795,6 +2794,8 @@ qemu_loadvm_section_start_full(QEMUFile *f, uint8_t type)
}
if (!check_section_footer(f, se)) {
+ error_setg(errp, "Section footer error, section_id: %d",
+ section_id);
return -EINVAL;
}
@@ -3100,7 +3101,7 @@ retry:
switch (section_type) {
case QEMU_VM_SECTION_START:
case QEMU_VM_SECTION_FULL:
- ret = qemu_loadvm_section_start_full(f, section_type);
+ ret = qemu_loadvm_section_start_full(f, section_type, errp);
if (ret < 0) {
goto out;
}
--
2.50.1
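The propagation pattern used by the patch above (the callee fills errp, the caller adds context in front and returns the same code) can be sketched outside QEMU as follows. This is a minimal illustration under stated assumptions, not the actual code: Err, err_set() and err_prepend() are hypothetical stand-ins for QEMU's Error, error_setg() and error_prepend(), and load_device() / load_section() are invented for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ERR_MSG_LEN 256

/* Hypothetical stand-in for QEMU's Error object; not the real API. */
typedef struct Err {
    char msg[ERR_MSG_LEN];
} Err;

/* Plays the role of error_setg(): store a formatted message in *errp. */
static void err_set(Err **errp, const char *fmt, ...)
{
    va_list ap;

    if (!errp) {
        return;                 /* caller does not want the details */
    }
    assert(*errp == NULL);      /* never overwrite an existing error */
    *errp = calloc(1, sizeof(**errp));
    if (!*errp) {
        abort();
    }
    va_start(ap, fmt);
    vsnprintf((*errp)->msg, ERR_MSG_LEN, fmt, ap);
    va_end(ap);
}

/* Plays the role of error_prepend(): put caller context in front. */
static void err_prepend(Err **errp, const char *prefix)
{
    char tmp[ERR_MSG_LEN];

    if (!errp || !*errp) {
        return;
    }
    snprintf(tmp, sizeof(tmp), "%s%s", prefix, (*errp)->msg);
    memcpy((*errp)->msg, tmp, strlen(tmp) + 1);
}

/* Hypothetical device loader: fails on request and says why. */
static int load_device(int should_fail, Err **errp)
{
    if (should_fail) {
        err_set(errp, "bad field value");
        return -EINVAL;
    }
    return 0;
}

/* Hypothetical section loader: keeps the callee's error, adds context. */
static int load_section(const char *idstr, int should_fail, Err **errp)
{
    char prefix[128];
    int ret = load_device(should_fail, errp);

    if (ret < 0) {
        snprintf(prefix, sizeof(prefix),
                 "error while loading state for device '%s': ", idstr);
        err_prepend(errp, prefix);
    }
    return ret;
}
```

The patch above also adds ERRP_GUARD() to the converted function. In QEMU that macro is what lets a function call error_prepend() on errp regardless of what the caller passed in; the stand-in sidesteps the issue by simply skipping a NULL errp.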
* [PULL 11/45] migration: push Error **errp into qemu_loadvm_section_part_end()
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (9 preceding siblings ...)
2025-10-03 15:39 ` [PULL 10/45] migration: push Error **errp into qemu_loadvm_section_start_full() Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-03 15:39 ` [PULL 12/45] migration: Update qemu_file_get_return_path() docs and remove dead checks Peter Xu
` (34 subsequent siblings)
45 siblings, 0 replies; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Arun Menon, Marc-André Lureau, Akihiko Odaki
From: Arun Menon <armenon@redhat.com>
This is an incremental step in converting the vmstate loading
code to report errors via Error objects instead of printing
them directly to the console/monitor.
qemu_loadvm_section_part_end() must now report an error in errp
on failure.
This patch also removes the fallback in the out: section that set errp
when *errp was still NULL, as it is no longer required at this point in
the series.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-11-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/savevm.c | 18 +++++++-----------
1 file changed, 7 insertions(+), 11 deletions(-)
diff --git a/migration/savevm.c b/migration/savevm.c
index 83d8fb8f41..c1ae36b50a 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2803,21 +2803,19 @@ qemu_loadvm_section_start_full(QEMUFile *f, uint8_t type, Error **errp)
}
static int
-qemu_loadvm_section_part_end(QEMUFile *f, uint8_t type)
+qemu_loadvm_section_part_end(QEMUFile *f, uint8_t type, Error **errp)
{
bool trace_downtime = (type == QEMU_VM_SECTION_END);
int64_t start_ts, end_ts;
uint32_t section_id;
SaveStateEntry *se;
int ret;
- Error *local_err = NULL;
section_id = qemu_get_be32(f);
ret = qemu_file_get_error(f);
if (ret) {
- error_report("%s: Failed to read section ID: %d",
- __func__, ret);
+ error_setg(errp, "Failed to read section ID: %d", ret);
return ret;
}
@@ -2828,7 +2826,7 @@ qemu_loadvm_section_part_end(QEMUFile *f, uint8_t type)
}
}
if (se == NULL) {
- error_report("Unknown savevm section %d", section_id);
+ error_setg(errp, "Unknown section %d", section_id);
return -EINVAL;
}
@@ -2836,9 +2834,8 @@ qemu_loadvm_section_part_end(QEMUFile *f, uint8_t type)
start_ts = qemu_clock_get_us(QEMU_CLOCK_REALTIME);
}
- ret = vmstate_load(f, se, &local_err);
+ ret = vmstate_load(f, se, errp);
if (ret < 0) {
- error_report_err(local_err);
return ret;
}
@@ -2849,6 +2846,8 @@ qemu_loadvm_section_part_end(QEMUFile *f, uint8_t type)
}
if (!check_section_footer(f, se)) {
+ error_setg(errp, "Section footer error, section_id: %d",
+ section_id);
return -EINVAL;
}
@@ -3108,7 +3107,7 @@ retry:
break;
case QEMU_VM_SECTION_PART:
case QEMU_VM_SECTION_END:
- ret = qemu_loadvm_section_part_end(f, section_type);
+ ret = qemu_loadvm_section_part_end(f, section_type, errp);
if (ret < 0) {
goto out;
}
@@ -3132,9 +3131,6 @@ retry:
out:
if (ret < 0) {
- if (*errp == NULL) {
- error_setg(errp, "Loading VM state failed: %d", ret);
- }
qemu_file_set_error(f, ret);
/* Cancel bitmaps incoming regardless of recovery */
--
2.50.1
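The reason the fallback in the out: section can go away is an invariant: once every handler sets errp itself, a negative return value implies that an error object exists. A small sketch of that invariant, assuming a hypothetical Err type and err_set() helper in place of QEMU's Error and error_setg(), and invented section constants and handlers:

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ERR_MSG_LEN 256

/* Hypothetical stand-in for QEMU's Error object; not the real API. */
typedef struct Err {
    char msg[ERR_MSG_LEN];
} Err;

/* Plays the role of error_setg(): store a formatted message in *errp. */
static void err_set(Err **errp, const char *fmt, ...)
{
    va_list ap;

    if (!errp) {
        return;
    }
    assert(*errp == NULL);
    *errp = calloc(1, sizeof(**errp));
    if (!*errp) {
        abort();
    }
    va_start(ap, fmt);
    vsnprintf((*errp)->msg, ERR_MSG_LEN, fmt, ap);
    va_end(ap);
}

/* Invented section types for the sketch. */
enum { SEC_EOF = 0, SEC_START = 1, SEC_PART = 2 };

/* Each handler is responsible for describing its own failure. */
static int load_start(int ok, Err **errp)
{
    if (!ok) {
        err_set(errp, "Unable to read ID string for section %u", 1u);
        return -EINVAL;
    }
    return 0;
}

static int load_part(int ok, Err **errp)
{
    if (!ok) {
        err_set(errp, "Unknown section %d", 2);
        return -EINVAL;
    }
    return 0;
}

/* Dispatcher: no generic "loading failed" fallback is needed at the end. */
static int load_main(int type, int ok, Err **errp)
{
    switch (type) {
    case SEC_START:
        return load_start(ok, errp);
    case SEC_PART:
        return load_part(ok, errp);
    case SEC_EOF:
        return 0;
    default:
        err_set(errp, "Unknown section type %d", type);
        return -EINVAL;
    }
}
```

A caller can then rely on "ret < 0" and "an error was set" being equivalent, which is exactly what makes a catch-all message at the exit label redundant.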
* [PULL 12/45] migration: Update qemu_file_get_return_path() docs and remove dead checks
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (10 preceding siblings ...)
2025-10-03 15:39 ` [PULL 11/45] migration: push Error **errp into qemu_loadvm_section_part_end() Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-03 15:39 ` [PULL 13/45] migration: make loadvm_postcopy_handle_resume() void Peter Xu
` (33 subsequent siblings)
45 siblings, 0 replies; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Arun Menon, Daniel P. Berrangé, Marc-André Lureau,
Akihiko Odaki
From: Arun Menon <armenon@redhat.com>
The documentation of qemu_file_get_return_path() states that it can
return NULL on failure. However, a review of the current implementation
shows that it always succeeds and never returns NULL.
As a result, the NULL checks after calls to the function are redundant.
This commit updates the documentation for the function and removes all
such NULL checks throughout the migration code.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-12-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/colo.c | 4 ----
migration/migration.c | 12 ++----------
migration/qemu-file.c | 1 -
migration/savevm.c | 4 ----
4 files changed, 2 insertions(+), 19 deletions(-)
diff --git a/migration/colo.c b/migration/colo.c
index ad50a3abc9..db783f6fa7 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -847,10 +847,6 @@ static void *colo_process_incoming_thread(void *opaque)
failover_init_state();
mis->to_src_file = qemu_file_get_return_path(mis->from_src_file);
- if (!mis->to_src_file) {
- error_report("COLO incoming thread: Open QEMUFile to_src_file failed");
- goto out;
- }
/*
* Note: the communication between Primary side and Secondary side
* should be sequential, we set the fd to unblocked in migration incoming
diff --git a/migration/migration.c b/migration/migration.c
index cba2a39355..ce17dcc1c0 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2656,12 +2656,9 @@ out:
return NULL;
}
-static int open_return_path_on_source(MigrationState *ms)
+static void open_return_path_on_source(MigrationState *ms)
{
ms->rp_state.from_dst_file = qemu_file_get_return_path(ms->to_dst_file);
- if (!ms->rp_state.from_dst_file) {
- return -1;
- }
trace_open_return_path_on_source();
@@ -2670,8 +2667,6 @@ static int open_return_path_on_source(MigrationState *ms)
ms->rp_state.rp_thread_created = true;
trace_open_return_path_on_source_continue();
-
- return 0;
}
/* Return true if error detected, or false otherwise */
@@ -4022,10 +4017,7 @@ void migration_connect(MigrationState *s, Error *error_in)
* QEMU uses the return path.
*/
if (migrate_postcopy_ram() || migrate_return_path()) {
- if (open_return_path_on_source(s)) {
- error_setg(&local_err, "Unable to open return-path for postcopy");
- goto fail;
- }
+ open_return_path_on_source(s);
}
/*
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 0f4280df21..0ee0f48a3e 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -125,7 +125,6 @@ static QEMUFile *qemu_file_new_impl(QIOChannel *ioc, bool is_writable)
/*
* Result: QEMUFile* for a 'return path' for comms in the opposite direction
- * NULL if not available
*/
QEMUFile *qemu_file_get_return_path(QEMUFile *f)
{
diff --git a/migration/savevm.c b/migration/savevm.c
index c1ae36b50a..eb2a905f32 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2588,10 +2588,6 @@ static int loadvm_process_command(QEMUFile *f, Error **errp)
return 0;
}
mis->to_src_file = qemu_file_get_return_path(f);
- if (!mis->to_src_file) {
- error_setg(errp, "CMD_OPEN_RETURN_PATH failed");
- return -1;
- }
/*
* Switchover ack is enabled but no device uses it, so send an ACK to
--
2.50.1
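The shape of the cleanup above can be sketched as follows: when a constructor cannot return NULL, its callers lose their error branch and a wrapper that only forwarded that failure can become void. The sketch assumes an allocator that aborts on failure; xcalloc(), Chan, Migration, chan_get_return_path() and open_return_path() are all hypothetical names, not QEMU code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical allocator that aborts instead of returning NULL. */
static void *xcalloc(size_t n, size_t size)
{
    void *p = calloc(n, size);

    if (!p) {
        abort();
    }
    return p;
}

/* Hypothetical one-directional channel. */
typedef struct Chan {
    int writable;
} Chan;

/*
 * Returns a channel for the opposite direction.
 * Never returns NULL: the only failure mode, allocation, aborts.
 */
static Chan *chan_get_return_path(const Chan *c)
{
    Chan *rp;

    assert(c != NULL);
    rp = xcalloc(1, sizeof(*rp));
    rp->writable = !c->writable;
    return rp;
}

typedef struct Migration {
    Chan to_dst;        /* outgoing stream */
    Chan *from_dst;     /* return path, opened on demand */
    int rp_open;
} Migration;

/* void, because there is no failure left to report to the caller. */
static void open_return_path(Migration *m)
{
    m->from_dst = chan_get_return_path(&m->to_dst);
    m->rp_open = 1;
}
```

With the wrapper returning void, a caller no longer needs an "unable to open return path" branch, mirroring how migration_connect() is simplified in the diff above.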
* [PULL 13/45] migration: make loadvm_postcopy_handle_resume() void
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (11 preceding siblings ...)
2025-10-03 15:39 ` [PULL 12/45] migration: Update qemu_file_get_return_path() docs and remove dead checks Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-03 15:39 ` [PULL 14/45] migration: push Error **errp into ram_postcopy_incoming_init() Peter Xu
` (32 subsequent siblings)
45 siblings, 0 replies; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Arun Menon, Daniel P. Berrangé, Akihiko Odaki
From: Arun Menon <armenon@redhat.com>
This is an incremental step in converting the vmstate loading
code to report errors via Error objects instead of printing
them directly to the console/monitor.
Use warn_report() instead of error_report(), since a resume command
received while the migration is not in the postcopy recover state is
not fatal. The message only notes that the received command is
unexpected, so errp should not be set with the error string.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-13-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/savevm.c | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/migration/savevm.c b/migration/savevm.c
index eb2a905f32..d145e7b1e5 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2334,12 +2334,12 @@ static void migrate_send_rp_req_pages_pending(MigrationIncomingState *mis)
}
}
-static int loadvm_postcopy_handle_resume(MigrationIncomingState *mis)
+static void loadvm_postcopy_handle_resume(MigrationIncomingState *mis)
{
if (mis->state != MIGRATION_STATUS_POSTCOPY_RECOVER) {
- error_report("%s: illegal resume received", __func__);
+ warn_report("%s: illegal resume received", __func__);
/* Don't fail the load, only for this. */
- return 0;
+ return;
}
/*
@@ -2391,8 +2391,6 @@ static int loadvm_postcopy_handle_resume(MigrationIncomingState *mis)
/* Kick the fast ram load thread too */
qemu_sem_post(&mis->postcopy_pause_sem_fast_load);
}
-
- return 0;
}
/**
@@ -2647,7 +2645,8 @@ static int loadvm_process_command(QEMUFile *f, Error **errp)
return ret;
case MIG_CMD_POSTCOPY_RESUME:
- return loadvm_postcopy_handle_resume(mis);
+ loadvm_postcopy_handle_resume(mis);
+ return 0;
case MIG_CMD_RECV_BITMAP:
ret = loadvm_handle_recv_bitmap(mis, len);
--
2.50.1
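The distinction drawn in the patch above, between a condition worth a warning and one that fails the load, can be sketched like this. The state names, the Incoming struct, the warnings counter and CMD_RESUME are invented for the example, and fprintf(stderr, ...) stands in for warn_report().

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical migration states and command id. */
enum MigState { STATE_ACTIVE, STATE_POSTCOPY_RECOVER };
enum { CMD_RESUME = 1 };

typedef struct Incoming {
    enum MigState state;
    int resumed;        /* set once a resume was acted upon */
    int warnings;       /* number of warnings emitted, for inspection */
} Incoming;

/*
 * void on purpose: an unexpected resume is only logged, so there is
 * no error for the caller to propagate.
 */
static void handle_resume(Incoming *mis)
{
    assert(mis != NULL);
    if (mis->state != STATE_POSTCOPY_RECOVER) {
        fprintf(stderr, "warning: %s: illegal resume received\n", __func__);
        mis->warnings++;
        return;         /* do not fail the load only for this */
    }
    mis->state = STATE_ACTIVE;
    mis->resumed = 1;
}

/* The dispatcher reports success for a resume in either case. */
static int process_command(Incoming *mis, int cmd)
{
    switch (cmd) {
    case CMD_RESUME:
        handle_resume(mis);
        return 0;
    default:
        return -EINVAL;
    }
}
```

Because the handler cannot fail, the dispatcher returns 0 explicitly after calling it, which matches the change to the MIG_CMD_POSTCOPY_RESUME case in the diff above.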
* [PULL 14/45] migration: push Error **errp into ram_postcopy_incoming_init()
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (12 preceding siblings ...)
2025-10-03 15:39 ` [PULL 13/45] migration: make loadvm_postcopy_handle_resume() void Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-03 15:39 ` [PULL 15/45] migration: push Error **errp into loadvm_postcopy_handle_advise() Peter Xu
` (31 subsequent siblings)
45 siblings, 0 replies; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Arun Menon, Marc-André Lureau, Akihiko Odaki
From: Arun Menon <armenon@redhat.com>
This is an incremental step in converting the vmstate loading
code to report errors via Error objects instead of printing
them directly to the console/monitor.
ram_postcopy_incoming_init() must now report an error in errp
on failure.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-14-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/postcopy-ram.h | 2 +-
migration/ram.h | 2 +-
migration/postcopy-ram.c | 9 ++++++---
migration/ram.c | 4 ++--
migration/savevm.c | 2 +-
5 files changed, 11 insertions(+), 8 deletions(-)
diff --git a/migration/postcopy-ram.h b/migration/postcopy-ram.h
index 3852141d7e..ca19433b24 100644
--- a/migration/postcopy-ram.h
+++ b/migration/postcopy-ram.h
@@ -30,7 +30,7 @@ int postcopy_ram_incoming_setup(MigrationIncomingState *mis);
* postcopy later; must be called prior to any precopy.
* called from ram.c's similarly named ram_postcopy_incoming_init
*/
-int postcopy_ram_incoming_init(MigrationIncomingState *mis);
+int postcopy_ram_incoming_init(MigrationIncomingState *mis, Error **errp);
/*
* At the end of a migration where postcopy_ram_incoming_init was called.
diff --git a/migration/ram.h b/migration/ram.h
index 921c39a2c5..275709a991 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -86,7 +86,7 @@ void ram_postcopy_migrated_memory_release(MigrationState *ms);
void ram_postcopy_send_discard_bitmap(MigrationState *ms);
/* For incoming postcopy discard */
int ram_discard_range(const char *block_name, uint64_t start, size_t length);
-int ram_postcopy_incoming_init(MigrationIncomingState *mis);
+int ram_postcopy_incoming_init(MigrationIncomingState *mis, Error **errp);
int ram_load_postcopy(QEMUFile *f, int channel);
void ram_handle_zero(void *host, uint64_t size);
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index 0172172343..5471efb4f0 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -681,6 +681,7 @@ out:
*/
static int init_range(RAMBlock *rb, void *opaque)
{
+ Error **errp = opaque;
const char *block_name = qemu_ram_get_idstr(rb);
void *host_addr = qemu_ram_get_host_addr(rb);
ram_addr_t offset = qemu_ram_get_offset(rb);
@@ -701,6 +702,8 @@ static int init_range(RAMBlock *rb, void *opaque)
* (Precopy will just overwrite this data, so doesn't need the discard)
*/
if (ram_discard_range(block_name, 0, length)) {
+ error_setg(errp, "failed to discard RAM block %s len=%zu",
+ block_name, length);
return -1;
}
@@ -749,9 +752,9 @@ static int cleanup_range(RAMBlock *rb, void *opaque)
* postcopy later; must be called prior to any precopy.
* called from arch_init's similarly named ram_postcopy_incoming_init
*/
-int postcopy_ram_incoming_init(MigrationIncomingState *mis)
+int postcopy_ram_incoming_init(MigrationIncomingState *mis, Error **errp)
{
- if (foreach_not_ignored_block(init_range, NULL)) {
+ if (foreach_not_ignored_block(init_range, errp)) {
return -1;
}
@@ -1703,7 +1706,7 @@ bool postcopy_ram_supported_by_host(MigrationIncomingState *mis, Error **errp)
return false;
}
-int postcopy_ram_incoming_init(MigrationIncomingState *mis)
+int postcopy_ram_incoming_init(MigrationIncomingState *mis, Error **errp)
{
error_report("postcopy_ram_incoming_init: No OS support");
return -1;
diff --git a/migration/ram.c b/migration/ram.c
index 7208bc114f..6a0dcc04f4 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3716,9 +3716,9 @@ static int ram_load_cleanup(void *opaque)
* postcopy-ram. postcopy-ram's similarly names
* postcopy_ram_incoming_init does the work.
*/
-int ram_postcopy_incoming_init(MigrationIncomingState *mis)
+int ram_postcopy_incoming_init(MigrationIncomingState *mis, Error **errp)
{
- return postcopy_ram_incoming_init(mis);
+ return postcopy_ram_incoming_init(mis, errp);
}
/**
diff --git a/migration/savevm.c b/migration/savevm.c
index d145e7b1e5..338d1a9756 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1989,7 +1989,7 @@ static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis,
return -1;
}
- if (ram_postcopy_incoming_init(mis)) {
+ if (ram_postcopy_incoming_init(mis, NULL) < 0) {
return -1;
}
--
2.50.1
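The patch above threads errp through a generic iterator by passing it as the callback's opaque pointer. A self-contained sketch of that technique follows. It assumes an iterator that stops at the first non-zero return; Block, foreach_block() and incoming_init() are hypothetical, and Err / err_set() stand in for QEMU's Error / error_setg().

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ERR_MSG_LEN 256

/* Hypothetical stand-in for QEMU's Error object; not the real API. */
typedef struct Err {
    char msg[ERR_MSG_LEN];
} Err;

/* Plays the role of error_setg(): store a formatted message in *errp. */
static void err_set(Err **errp, const char *fmt, ...)
{
    va_list ap;

    if (!errp) {
        return;
    }
    assert(*errp == NULL);
    *errp = calloc(1, sizeof(**errp));
    if (!*errp) {
        abort();
    }
    va_start(ap, fmt);
    vsnprintf((*errp)->msg, ERR_MSG_LEN, fmt, ap);
    va_end(ap);
}

/* Hypothetical RAM block descriptor. */
typedef struct Block {
    const char *name;
    size_t len;
    int discard_fails;  /* simulate a failing discard */
} Block;

typedef int (*BlockIter)(const Block *b, void *opaque);

/* Generic iterator: knows nothing about errors, stops at first failure. */
static int foreach_block(const Block *blocks, size_t n,
                         BlockIter fn, void *opaque)
{
    for (size_t i = 0; i < n; i++) {
        int ret = fn(&blocks[i], opaque);

        if (ret) {
            return ret;
        }
    }
    return 0;
}

/* Callback: recovers the typed errp from the opaque pointer. */
static int init_range(const Block *b, void *opaque)
{
    Err **errp = opaque;

    if (b->discard_fails) {
        err_set(errp, "failed to discard RAM block %s len=%zu",
                b->name, b->len);
        return -1;
    }
    return 0;
}

static int incoming_init(const Block *blocks, size_t n, Err **errp)
{
    return foreach_block(blocks, n, init_range, errp) ? -1 : 0;
}
```

Stopping at the first failure matters here: if the iterator kept going, a second failing block would try to set an error that is already set.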
* [PULL 15/45] migration: push Error **errp into loadvm_postcopy_handle_advise()
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (13 preceding siblings ...)
2025-10-03 15:39 ` [PULL 14/45] migration: push Error **errp into ram_postcopy_incoming_init() Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-03 15:39 ` [PULL 16/45] migration: push Error **errp into loadvm_postcopy_handle_listen() Peter Xu
` (30 subsequent siblings)
45 siblings, 0 replies; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Arun Menon, Daniel P. Berrangé, Marc-André Lureau,
Akihiko Odaki
From: Arun Menon <armenon@redhat.com>
This is an incremental step in converting the vmstate loading
code to report errors via Error objects instead of printing
them directly to the console/monitor.
loadvm_postcopy_handle_advise() must now report an error in errp
on failure.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-15-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/savevm.c | 40 +++++++++++++++++++---------------------
1 file changed, 19 insertions(+), 21 deletions(-)
diff --git a/migration/savevm.c b/migration/savevm.c
index 338d1a9756..38e22b435b 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1912,39 +1912,39 @@ enum LoadVMExitCodes {
* quickly.
*/
static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis,
- uint16_t len)
+ uint16_t len, Error **errp)
{
PostcopyState ps = postcopy_state_set(POSTCOPY_INCOMING_ADVISE);
uint64_t remote_pagesize_summary, local_pagesize_summary, remote_tps;
size_t page_size = qemu_target_page_size();
- Error *local_err = NULL;
trace_loadvm_postcopy_handle_advise();
if (ps != POSTCOPY_INCOMING_NONE) {
- error_report("CMD_POSTCOPY_ADVISE in wrong postcopy state (%d)", ps);
+ error_setg(errp, "CMD_POSTCOPY_ADVISE in wrong postcopy state (%d)",
+ ps);
return -1;
}
switch (len) {
case 0:
if (migrate_postcopy_ram()) {
- error_report("RAM postcopy is enabled but have 0 byte advise");
+ error_setg(errp, "RAM postcopy is enabled but have 0 byte advise");
return -EINVAL;
}
return 0;
case 8 + 8:
if (!migrate_postcopy_ram()) {
- error_report("RAM postcopy is disabled but have 16 byte advise");
+ error_setg(errp,
+ "RAM postcopy is disabled but have 16 byte advise");
return -EINVAL;
}
break;
default:
- error_report("CMD_POSTCOPY_ADVISE invalid length (%d)", len);
+ error_setg(errp, "CMD_POSTCOPY_ADVISE invalid length (%d)", len);
return -EINVAL;
}
- if (!postcopy_ram_supported_by_host(mis, &local_err)) {
- error_report_err(local_err);
+ if (!postcopy_ram_supported_by_host(mis, errp)) {
postcopy_state_set(POSTCOPY_INCOMING_NONE);
return -1;
}
@@ -1967,9 +1967,10 @@ static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis,
* also fails when passed to an older qemu that doesn't
* do huge pages.
*/
- error_report("Postcopy needs matching RAM page sizes (s=%" PRIx64
- " d=%" PRIx64 ")",
- remote_pagesize_summary, local_pagesize_summary);
+ error_setg(errp,
+ "Postcopy needs matching RAM page sizes "
+ "(s=%" PRIx64 " d=%" PRIx64 ")",
+ remote_pagesize_summary, local_pagesize_summary);
return -1;
}
@@ -1979,17 +1980,18 @@ static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis,
* Again, some differences could be dealt with, but for now keep it
* simple.
*/
- error_report("Postcopy needs matching target page sizes (s=%d d=%zd)",
- (int)remote_tps, page_size);
+ error_setg(errp,
+ "Postcopy needs matching target page sizes (s=%d d=%zd)",
+ (int)remote_tps, page_size);
return -1;
}
- if (postcopy_notify(POSTCOPY_NOTIFY_INBOUND_ADVISE, &local_err)) {
- error_report_err(local_err);
+ if (postcopy_notify(POSTCOPY_NOTIFY_INBOUND_ADVISE, errp)) {
return -1;
}
- if (ram_postcopy_incoming_init(mis, NULL) < 0) {
+ if (ram_postcopy_incoming_init(mis, errp) < 0) {
+ error_prepend(errp, "Postcopy RAM incoming init failed: ");
return -1;
}
@@ -2617,11 +2619,7 @@ static int loadvm_process_command(QEMUFile *f, Error **errp)
return loadvm_handle_cmd_packaged(mis, errp);
case MIG_CMD_POSTCOPY_ADVISE:
- ret = loadvm_postcopy_handle_advise(mis, len);
- if (ret < 0) {
- error_setg(errp, "Failed to load device state command: %d", ret);
- }
- return ret;
+ return loadvm_postcopy_handle_advise(mis, len, errp);
case MIG_CMD_POSTCOPY_LISTEN:
ret = loadvm_postcopy_handle_listen(mis);
--
2.50.1
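The length validation at the top of the advise handler above can be isolated into a small sketch. It reuses the message strings from the diff, but check_advise_len() and its 0 / 1 return convention are invented for the example, and Err / err_set() are hypothetical stand-ins for QEMU's Error / error_setg().

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ERR_MSG_LEN 256

/* Hypothetical stand-in for QEMU's Error object; not the real API. */
typedef struct Err {
    char msg[ERR_MSG_LEN];
} Err;

/* Plays the role of error_setg(): store a formatted message in *errp. */
static void err_set(Err **errp, const char *fmt, ...)
{
    va_list ap;

    if (!errp) {
        return;
    }
    assert(*errp == NULL);
    *errp = calloc(1, sizeof(**errp));
    if (!*errp) {
        abort();
    }
    va_start(ap, fmt);
    vsnprintf((*errp)->msg, ERR_MSG_LEN, fmt, ap);
    va_end(ap);
}

/*
 * Validate the payload length of an advise command.
 * Returns 0 if there is nothing further to parse, 1 if a 16-byte
 * payload follows, or -EINVAL with errp set on a mismatch.
 * (The 0 / 1 convention is an assumption of this sketch.)
 */
static int check_advise_len(uint16_t len, int postcopy_ram, Err **errp)
{
    switch (len) {
    case 0:
        if (postcopy_ram) {
            err_set(errp, "RAM postcopy is enabled but have 0 byte advise");
            return -EINVAL;
        }
        return 0;
    case 8 + 8:
        if (!postcopy_ram) {
            err_set(errp, "RAM postcopy is disabled but have 16 byte advise");
            return -EINVAL;
        }
        return 1;
    default:
        err_set(errp, "CMD_POSTCOPY_ADVISE invalid length (%d)", len);
        return -EINVAL;
    }
}
```

Since every rejecting branch now describes itself through errp, the caller has no need to wrap the result in a generic "failed to load device state command" message, which is why that wrapper is dropped from loadvm_process_command() in the diff above.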
* [PULL 16/45] migration: push Error **errp into loadvm_postcopy_handle_listen()
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (14 preceding siblings ...)
2025-10-03 15:39 ` [PULL 15/45] migration: push Error **errp into loadvm_postcopy_handle_advise() Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-03 15:39 ` [PULL 17/45] migration: push Error **errp into loadvm_postcopy_handle_run() Peter Xu
` (29 subsequent siblings)
45 siblings, 0 replies; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Arun Menon, Daniel P. Berrangé, Akihiko Odaki
From: Arun Menon <armenon@redhat.com>
This is an incremental step in converting the vmstate loading
code to report errors via Error objects instead of printing
them directly to the console/monitor.
loadvm_postcopy_handle_listen() must now report an error in errp
on failure.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-16-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/savevm.c | 17 +++++++----------
1 file changed, 7 insertions(+), 10 deletions(-)
diff --git a/migration/savevm.c b/migration/savevm.c
index 38e22b435b..ce88f56498 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2181,15 +2181,16 @@ static void *postcopy_ram_listen_thread(void *opaque)
}
/* After this message we must be able to immediately receive postcopy data */
-static int loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
+static int loadvm_postcopy_handle_listen(MigrationIncomingState *mis,
+ Error **errp)
{
PostcopyState ps = postcopy_state_set(POSTCOPY_INCOMING_LISTENING);
- Error *local_err = NULL;
trace_loadvm_postcopy_handle_listen("enter");
if (ps != POSTCOPY_INCOMING_ADVISE && ps != POSTCOPY_INCOMING_DISCARD) {
- error_report("CMD_POSTCOPY_LISTEN in wrong postcopy state (%d)", ps);
+ error_setg(errp,
+ "CMD_POSTCOPY_LISTEN in wrong postcopy state (%d)", ps);
return -1;
}
if (ps == POSTCOPY_INCOMING_ADVISE) {
@@ -2212,14 +2213,14 @@ static int loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
if (migrate_postcopy_ram()) {
if (postcopy_ram_incoming_setup(mis)) {
postcopy_ram_incoming_cleanup(mis);
+ error_setg(errp, "Failed to setup incoming postcopy RAM blocks");
return -1;
}
}
trace_loadvm_postcopy_handle_listen("after uffd");
- if (postcopy_notify(POSTCOPY_NOTIFY_INBOUND_LISTEN, &local_err)) {
- error_report_err(local_err);
+ if (postcopy_notify(POSTCOPY_NOTIFY_INBOUND_LISTEN, errp)) {
return -1;
}
@@ -2622,11 +2623,7 @@ static int loadvm_process_command(QEMUFile *f, Error **errp)
return loadvm_postcopy_handle_advise(mis, len, errp);
case MIG_CMD_POSTCOPY_LISTEN:
- ret = loadvm_postcopy_handle_listen(mis);
- if (ret < 0) {
- error_setg(errp, "Failed to load device state command: %d", ret);
- }
- return ret;
+ return loadvm_postcopy_handle_listen(mis, errp);
case MIG_CMD_POSTCOPY_RUN:
ret = loadvm_postcopy_handle_run(mis);
--
2.50.1
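The state check at the start of the listen handler above follows a set-then-validate shape: the new state is stored first and the previous one is checked afterwards. A minimal sketch, with an invented PState enum and state_set() helper, and Err / err_set() as hypothetical stand-ins for QEMU's Error / error_setg():

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ERR_MSG_LEN 256

/* Hypothetical stand-in for QEMU's Error object; not the real API. */
typedef struct Err {
    char msg[ERR_MSG_LEN];
} Err;

/* Plays the role of error_setg(): store a formatted message in *errp. */
static void err_set(Err **errp, const char *fmt, ...)
{
    va_list ap;

    if (!errp) {
        return;
    }
    assert(*errp == NULL);
    *errp = calloc(1, sizeof(**errp));
    if (!*errp) {
        abort();
    }
    va_start(ap, fmt);
    vsnprintf((*errp)->msg, ERR_MSG_LEN, fmt, ap);
    va_end(ap);
}

/* Invented postcopy states for the sketch. */
typedef enum {
    PS_NONE,
    PS_ADVISE,
    PS_DISCARD,
    PS_LISTENING,
} PState;

/* Store the new state and hand back the previous one. */
static PState state_set(PState *cur, PState next)
{
    PState old = *cur;

    *cur = next;
    return old;
}

/* Listen is only legal when coming from ADVISE or DISCARD. */
static int handle_listen(PState *cur, Err **errp)
{
    PState ps = state_set(cur, PS_LISTENING);

    if (ps != PS_ADVISE && ps != PS_DISCARD) {
        err_set(errp, "CMD_POSTCOPY_LISTEN in wrong postcopy state (%d)",
                (int)ps);
        return -1;
    }
    return 0;
}
```

Note that in this shape the state has already moved to LISTENING by the time the check fails; the error message reports the state the command arrived in, not the current one.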
* [PULL 17/45] migration: push Error **errp into loadvm_postcopy_handle_run()
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (15 preceding siblings ...)
2025-10-03 15:39 ` [PULL 16/45] migration: push Error **errp into loadvm_postcopy_handle_listen() Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-03 15:39 ` [PULL 18/45] migration: push Error **errp into loadvm_postcopy_ram_handle_discard() Peter Xu
` (28 subsequent siblings)
45 siblings, 0 replies; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Arun Menon, Daniel P. Berrangé, Akihiko Odaki
From: Arun Menon <armenon@redhat.com>
This is an incremental step in converting the vmstate loading
code to report errors via Error objects instead of printing
them directly to the console/monitor.
loadvm_postcopy_handle_run() must now report an error in errp
on failure.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-17-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/savevm.c | 10 +++-------
1 file changed, 3 insertions(+), 7 deletions(-)
diff --git a/migration/savevm.c b/migration/savevm.c
index ce88f56498..f7947160fd 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2273,13 +2273,13 @@ static void loadvm_postcopy_handle_run_bh(void *opaque)
}
/* After all discards we can start running and asking for pages */
-static int loadvm_postcopy_handle_run(MigrationIncomingState *mis)
+static int loadvm_postcopy_handle_run(MigrationIncomingState *mis, Error **errp)
{
PostcopyState ps = postcopy_state_get();
trace_loadvm_postcopy_handle_run();
if (ps != POSTCOPY_INCOMING_LISTENING) {
- error_report("CMD_POSTCOPY_RUN in wrong postcopy state (%d)", ps);
+ error_setg(errp, "CMD_POSTCOPY_RUN in wrong postcopy state (%d)", ps);
return -1;
}
@@ -2626,11 +2626,7 @@ static int loadvm_process_command(QEMUFile *f, Error **errp)
return loadvm_postcopy_handle_listen(mis, errp);
case MIG_CMD_POSTCOPY_RUN:
- ret = loadvm_postcopy_handle_run(mis);
- if (ret < 0) {
- error_setg(errp, "Failed to load device state command: %d", ret);
- }
- return ret;
+ return loadvm_postcopy_handle_run(mis, errp);
case MIG_CMD_POSTCOPY_RAM_DISCARD:
ret = loadvm_postcopy_ram_handle_discard(mis, len);
--
2.50.1
* [PULL 18/45] migration: push Error **errp into loadvm_postcopy_ram_handle_discard()
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Arun Menon, Marc-André Lureau, Akihiko Odaki
From: Arun Menon <armenon@redhat.com>
This is an incremental step in converting vmstate loading
code to report errors via Error objects instead of printing
them directly to the console/monitor.
Ensure that loadvm_postcopy_ram_handle_discard() reports an error
in errp on failure.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-18-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/savevm.c | 26 +++++++++++++-------------
1 file changed, 13 insertions(+), 13 deletions(-)
diff --git a/migration/savevm.c b/migration/savevm.c
index f7947160fd..b80da04b47 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2004,7 +2004,7 @@ static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis,
* There can be 0..many of these messages, each encoding multiple pages.
*/
static int loadvm_postcopy_ram_handle_discard(MigrationIncomingState *mis,
- uint16_t len)
+ uint16_t len, Error **errp)
{
int tmp;
char ramid[256];
@@ -2017,6 +2017,7 @@ static int loadvm_postcopy_ram_handle_discard(MigrationIncomingState *mis,
/* 1st discard */
tmp = postcopy_ram_prepare_discard(mis);
if (tmp) {
+ error_setg(errp, "Failed to prepare for RAM discard: %d", tmp);
return tmp;
}
break;
@@ -2026,8 +2027,9 @@ static int loadvm_postcopy_ram_handle_discard(MigrationIncomingState *mis,
break;
default:
- error_report("CMD_POSTCOPY_RAM_DISCARD in wrong postcopy state (%d)",
- ps);
+ error_setg(errp,
+ "CMD_POSTCOPY_RAM_DISCARD in wrong postcopy state (%d)",
+ ps);
return -1;
}
/* We're expecting a
@@ -2036,29 +2038,30 @@ static int loadvm_postcopy_ram_handle_discard(MigrationIncomingState *mis,
* then at least 1 16 byte chunk
*/
if (len < (1 + 1 + 1 + 1 + 2 * 8)) {
- error_report("CMD_POSTCOPY_RAM_DISCARD invalid length (%d)", len);
+ error_setg(errp, "CMD_POSTCOPY_RAM_DISCARD invalid length (%d)", len);
return -1;
}
tmp = qemu_get_byte(mis->from_src_file);
if (tmp != postcopy_ram_discard_version) {
- error_report("CMD_POSTCOPY_RAM_DISCARD invalid version (%d)", tmp);
+ error_setg(errp, "CMD_POSTCOPY_RAM_DISCARD invalid version (%d)", tmp);
return -1;
}
if (!qemu_get_counted_string(mis->from_src_file, ramid)) {
- error_report("CMD_POSTCOPY_RAM_DISCARD Failed to read RAMBlock ID");
+ error_setg(errp,
+ "CMD_POSTCOPY_RAM_DISCARD Failed to read RAMBlock ID");
return -1;
}
tmp = qemu_get_byte(mis->from_src_file);
if (tmp != 0) {
- error_report("CMD_POSTCOPY_RAM_DISCARD missing nil (%d)", tmp);
+ error_setg(errp, "CMD_POSTCOPY_RAM_DISCARD missing nil (%d)", tmp);
return -1;
}
len -= 3 + strlen(ramid);
if (len % 16) {
- error_report("CMD_POSTCOPY_RAM_DISCARD invalid length (%d)", len);
+ error_setg(errp, "CMD_POSTCOPY_RAM_DISCARD invalid length (%d)", len);
return -1;
}
trace_loadvm_postcopy_ram_handle_discard_header(ramid, len);
@@ -2070,6 +2073,7 @@ static int loadvm_postcopy_ram_handle_discard(MigrationIncomingState *mis,
len -= 16;
int ret = ram_discard_range(ramid, start_addr, block_length);
if (ret) {
+ error_setg(errp, "Failed to discard RAM range %s: %d", ramid, ret);
return ret;
}
}
@@ -2629,11 +2633,7 @@ static int loadvm_process_command(QEMUFile *f, Error **errp)
return loadvm_postcopy_handle_run(mis, errp);
case MIG_CMD_POSTCOPY_RAM_DISCARD:
- ret = loadvm_postcopy_ram_handle_discard(mis, len);
- if (ret < 0) {
- error_setg(errp, "Failed to load device state command: %d", ret);
- }
- return ret;
+ return loadvm_postcopy_ram_handle_discard(mis, len, errp);
case MIG_CMD_POSTCOPY_RESUME:
loadvm_postcopy_handle_resume(mis);
--
2.50.1
* [PULL 19/45] migration: push Error **errp into loadvm_handle_recv_bitmap()
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Arun Menon, Daniel P. Berrangé, Akihiko Odaki
From: Arun Menon <armenon@redhat.com>
This is an incremental step in converting vmstate loading
code to report errors via Error objects instead of printing
them directly to the console/monitor.
Ensure that loadvm_handle_recv_bitmap() reports an error
in errp on failure.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-19-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/savevm.c | 21 ++++++++++-----------
1 file changed, 10 insertions(+), 11 deletions(-)
diff --git a/migration/savevm.c b/migration/savevm.c
index b80da04b47..2e8776768f 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2476,32 +2476,35 @@ static int loadvm_handle_cmd_packaged(MigrationIncomingState *mis, Error **errp)
* len (1 byte) + ramblock_name (<255 bytes)
*/
static int loadvm_handle_recv_bitmap(MigrationIncomingState *mis,
- uint16_t len)
+ uint16_t len, Error **errp)
{
QEMUFile *file = mis->from_src_file;
RAMBlock *rb;
char block_name[256];
size_t cnt;
+ int ret;
cnt = qemu_get_counted_string(file, block_name);
if (!cnt) {
- error_report("%s: failed to read block name", __func__);
+ error_setg(errp, "failed to read block name");
return -EINVAL;
}
/* Validate before using the data */
- if (qemu_file_get_error(file)) {
- return qemu_file_get_error(file);
+ ret = qemu_file_get_error(file);
+ if (ret < 0) {
+ error_setg(errp, "loadvm failed: stream error: %d", ret);
+ return ret;
}
if (len != cnt + 1) {
- error_report("%s: invalid payload length (%d)", __func__, len);
+ error_setg(errp, "invalid payload length (%d)", len);
return -EINVAL;
}
rb = qemu_ram_block_by_name(block_name);
if (!rb) {
- error_report("%s: block '%s' not found", __func__, block_name);
+ error_setg(errp, "block '%s' not found", block_name);
return -EINVAL;
}
@@ -2640,11 +2643,7 @@ static int loadvm_process_command(QEMUFile *f, Error **errp)
return 0;
case MIG_CMD_RECV_BITMAP:
- ret = loadvm_handle_recv_bitmap(mis, len);
- if (ret < 0) {
- error_setg(errp, "Failed to load device state command: %d", ret);
- }
- return ret;
+ return loadvm_handle_recv_bitmap(mis, len, errp);
case MIG_CMD_ENABLE_COLO:
ret = loadvm_process_enable_colo(mis);
--
2.50.1
* [PULL 20/45] migration: Return -1 on memory allocation failure in ram.c
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Arun Menon, Akihiko Odaki
From: Arun Menon <armenon@redhat.com>
The function colo_init_ram_cache() currently returns -errno if
qemu_anon_ram_alloc() fails. However, the subsequent cleanup loop that
calls qemu_anon_ram_free() could potentially alter the value of errno.
This would cause the function to return a value that does not accurately
represent the original allocation failure.
This commit changes the return value to -1 on memory allocation failure.
This ensures that the return value is consistent and is not affected by
any errno changes that may occur during the free process.
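The hazard described above can be reproduced in a few lines. This is only a sketch: sketch_alloc() and sketch_cleanup() are hypothetical helpers that simulate a failing allocation and a cleanup step that overwrites errno; they are not the functions touched by this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical allocator that always fails with ENOMEM. */
static void *sketch_alloc(void)
{
    errno = ENOMEM;
    return NULL;
}

/* Hypothetical cleanup step that happens to overwrite errno. */
static void sketch_cleanup(void)
{
    errno = EINVAL;
}

/* Fragile variant: errno is read only after the cleanup has run. */
static int sketch_init_fragile(void)
{
    if (!sketch_alloc()) {
        assert(errno == ENOMEM);    /* still the allocation error here */
        sketch_cleanup();
        return -errno;              /* now reflects the cleanup, not the allocation */
    }
    return 0;
}

/* Variant in the spirit of the patch: a fixed -1 that cleanup cannot change. */
static int sketch_init_fixed(void)
{
    if (!sketch_alloc()) {
        sketch_cleanup();
        return -1;
    }
    return 0;
}
```

An alternative would be to save errno in a local variable before the cleanup; the patch instead opts for a constant return value.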
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-20-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/ram.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/migration/ram.c b/migration/ram.c
index 6a0dcc04f4..163265a57f 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3575,6 +3575,8 @@ static void colo_init_ram_state(void)
* colo cache: this is for secondary VM, we cache the whole
* memory of the secondary VM, it is need to hold the global lock
* to call this helper.
+ *
+ * Returns zero to indicate success or -1 on error.
*/
int colo_init_ram_cache(void)
{
@@ -3594,7 +3596,7 @@ int colo_init_ram_cache(void)
block->colo_cache = NULL;
}
}
- return -errno;
+ return -1;
}
if (!machine_dump_guest_core(current_machine)) {
qemu_madvise(block->colo_cache, block->used_length,
--
2.50.1
* [PULL 21/45] migration: push Error **errp into loadvm_process_enable_colo()
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Arun Menon, Marc-André Lureau, Akihiko Odaki
From: Arun Menon <armenon@redhat.com>
This is an incremental step in converting vmstate loading
code to report errors via Error objects instead of printing
them directly to the console/monitor.
Ensure that loadvm_process_enable_colo() reports an error
in errp on failure.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-21-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
include/migration/colo.h | 2 +-
migration/ram.h | 2 +-
migration/migration.c | 12 ++++++------
migration/ram.c | 8 ++++----
migration/savevm.c | 26 ++++++++++++++------------
5 files changed, 26 insertions(+), 24 deletions(-)
diff --git a/include/migration/colo.h b/include/migration/colo.h
index 43222ef5ae..d4fe422e4d 100644
--- a/include/migration/colo.h
+++ b/include/migration/colo.h
@@ -25,7 +25,7 @@ void migrate_start_colo_process(MigrationState *s);
bool migration_in_colo_state(void);
/* loadvm */
-int migration_incoming_enable_colo(void);
+int migration_incoming_enable_colo(Error **errp);
void migration_incoming_disable_colo(void);
bool migration_incoming_colo_enabled(void);
bool migration_incoming_in_colo_state(void);
diff --git a/migration/ram.h b/migration/ram.h
index 275709a991..24cd0bf585 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -109,7 +109,7 @@ void ramblock_set_file_bmap_atomic(RAMBlock *block, ram_addr_t offset,
bool set);
/* ram cache */
-int colo_init_ram_cache(void);
+int colo_init_ram_cache(Error **errp);
void colo_flush_ram_cache(void);
void colo_release_ram_cache(void);
void colo_incoming_start_dirty_log(void);
diff --git a/migration/migration.c b/migration/migration.c
index ce17dcc1c0..2f55f2784b 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -623,22 +623,22 @@ void migration_incoming_disable_colo(void)
migration_colo_enabled = false;
}
-int migration_incoming_enable_colo(void)
+int migration_incoming_enable_colo(Error **errp)
{
#ifndef CONFIG_REPLICATION
- error_report("ENABLE_COLO command come in migration stream, but the "
- "replication module is not built in");
+ error_setg(errp, "ENABLE_COLO command come in migration stream, but the "
+ "replication module is not built in");
return -ENOTSUP;
#endif
if (!migrate_colo()) {
- error_report("ENABLE_COLO command come in migration stream, but x-colo "
- "capability is not set");
+ error_setg(errp, "ENABLE_COLO command come in migration stream"
+ ", but x-colo capability is not set");
return -EINVAL;
}
if (ram_block_discard_disable(true)) {
- error_report("COLO: cannot disable RAM discard");
+ error_setg(errp, "COLO: cannot disable RAM discard");
return -EBUSY;
}
migration_colo_enabled = true;
diff --git a/migration/ram.c b/migration/ram.c
index 163265a57f..a8e8d2cc67 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3578,7 +3578,7 @@ static void colo_init_ram_state(void)
*
* Returns zero to indicate success or -1 on error.
*/
-int colo_init_ram_cache(void)
+int colo_init_ram_cache(Error **errp)
{
RAMBlock *block;
@@ -3587,9 +3587,9 @@ int colo_init_ram_cache(void)
block->colo_cache = qemu_anon_ram_alloc(block->used_length,
NULL, false, false);
if (!block->colo_cache) {
- error_report("%s: Can't alloc memory for COLO cache of block %s,"
- "size 0x" RAM_ADDR_FMT, __func__, block->idstr,
- block->used_length);
+ error_setg(errp, "Can't alloc memory for COLO cache of "
+ "block %s, size 0x" RAM_ADDR_FMT,
+ block->idstr, block->used_length);
RAMBLOCK_FOREACH_NOT_IGNORED(block) {
if (block->colo_cache) {
qemu_anon_ram_free(block->colo_cache, block->used_length);
diff --git a/migration/savevm.c b/migration/savevm.c
index 2e8776768f..8937496d9f 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2515,15 +2515,21 @@ static int loadvm_handle_recv_bitmap(MigrationIncomingState *mis,
return 0;
}
-static int loadvm_process_enable_colo(MigrationIncomingState *mis)
+static int loadvm_process_enable_colo(MigrationIncomingState *mis,
+ Error **errp)
{
- int ret = migration_incoming_enable_colo();
+ ERRP_GUARD();
+ int ret;
- if (!ret) {
- ret = colo_init_ram_cache();
- if (ret) {
- migration_incoming_disable_colo();
- }
+ ret = migration_incoming_enable_colo(errp);
+ if (ret < 0) {
+ return ret;
+ }
+
+ ret = colo_init_ram_cache(errp);
+ if (ret) {
+ error_prepend(errp, "failed to init colo RAM cache: %d: ", ret);
+ migration_incoming_disable_colo();
}
return ret;
}
@@ -2646,11 +2652,7 @@ static int loadvm_process_command(QEMUFile *f, Error **errp)
return loadvm_handle_recv_bitmap(mis, len, errp);
case MIG_CMD_ENABLE_COLO:
- ret = loadvm_process_enable_colo(mis);
- if (ret < 0) {
- error_setg(errp, "Failed to load device state command: %d", ret);
- }
- return ret;
+ return loadvm_process_enable_colo(mis, errp);
case MIG_CMD_SWITCHOVER_START:
ret = loadvm_postcopy_handle_switchover_start();
--
2.50.1
* [PULL 22/45] migration: push Error **errp into loadvm_postcopy_handle_switchover_start()
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Arun Menon, Daniel P. Berrangé, Akihiko Odaki
From: Arun Menon <armenon@redhat.com>
This is an incremental step in converting vmstate loading code to report
errors via Error objects instead of printing them directly to the
console/monitor.
Ensure that loadvm_postcopy_handle_switchover_start() reports
an error in errp on failure.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-22-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/savevm.c | 9 +++------
1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/migration/savevm.c b/migration/savevm.c
index 8937496d9f..34b7a28d38 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2534,7 +2534,7 @@ static int loadvm_process_enable_colo(MigrationIncomingState *mis,
return ret;
}
-static int loadvm_postcopy_handle_switchover_start(void)
+static int loadvm_postcopy_handle_switchover_start(Error **errp)
{
SaveStateEntry *se;
@@ -2547,6 +2547,7 @@ static int loadvm_postcopy_handle_switchover_start(void)
ret = se->ops->switchover_start(se->opaque);
if (ret < 0) {
+ error_setg(errp, "Switchover start failed: %d", ret);
return ret;
}
}
@@ -2655,11 +2656,7 @@ static int loadvm_process_command(QEMUFile *f, Error **errp)
return loadvm_process_enable_colo(mis, errp);
case MIG_CMD_SWITCHOVER_START:
- ret = loadvm_postcopy_handle_switchover_start();
- if (ret < 0) {
- error_setg(errp, "Failed to load device state command: %d", ret);
- }
- return ret;
+ return loadvm_postcopy_handle_switchover_start(errp);
}
return 0;
--
2.50.1
* [PULL 23/45] migration: Capture error in postcopy_ram_listen_thread()
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Arun Menon, Marc-André Lureau, Akihiko Odaki
From: Arun Menon <armenon@redhat.com>
This is an incremental step in converting vmstate loading
code to report errors via Error objects instead of printing
them directly to the console/monitor.
postcopy_ram_listen_thread() calls qemu_loadvm_state_main()
to load the VM; on failure, it should set the error
in the migration object.
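A rough sketch of the capture-and-record flow, under stated assumptions: the Error type, SketchMigrationState, and all sketch_* helpers are hypothetical stand-ins invented for this example and do not mirror QEMU's real structures. The point is only the order of operations: collect the error locally, prepend context, record it in the long-lived state, then report and free it.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an Error object. */
typedef struct Error { char msg[256]; } Error;

/* Hypothetical long-lived state that remembers the last error. */
typedef struct SketchMigrationState { char last_error[256]; } SketchMigrationState;

static void sketch_error_set(Error **errp, const char *msg)
{
    Error *err;

    if (!errp) {
        return;
    }
    err = calloc(1, sizeof(*err));
    assert(err != NULL);        /* allocation failure is not handled in this sketch */
    snprintf(err->msg, sizeof(err->msg), "%s", msg);
    *errp = err;
}

/* Simplified analogue of error_prepend(): put context in front of the message. */
static void sketch_error_prepend(Error **errp, const char *prefix)
{
    char tmp[256];

    if (!errp || !*errp) {
        return;
    }
    snprintf(tmp, sizeof(tmp), "%s%s", prefix, (*errp)->msg);
    memcpy((*errp)->msg, tmp, sizeof(tmp));
}

/* Hypothetical loader that always fails, to exercise the error path. */
static int sketch_loadvm_main(Error **errp)
{
    sketch_error_set(errp, "stream error");
    return -5;
}

static int sketch_listen_thread(SketchMigrationState *s)
{
    Error *local_err = NULL;
    int load_res = sketch_loadvm_main(&local_err);

    if (load_res < 0) {
        char prefix[64];

        snprintf(prefix, sizeof(prefix),
                 "loadvm failed during postcopy: %d: ", load_res);
        sketch_error_prepend(&local_err, prefix);
        /* Record a copy in the long-lived state, then report and free. */
        snprintf(s->last_error, sizeof(s->last_error), "%s", local_err->msg);
        fprintf(stderr, "%s\n", local_err->msg);
        free(local_err);
    }
    return load_res;
}
```

Collecting the error in a local variable, rather than passing a fatal sentinel, lets the thread mark the migration as failed instead of terminating the process.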
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-23-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/savevm.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/migration/savevm.c b/migration/savevm.c
index 34b7a28d38..996673b679 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2095,6 +2095,7 @@ static void *postcopy_ram_listen_thread(void *opaque)
QEMUFile *f = mis->from_src_file;
int load_res;
MigrationState *migr = migrate_get_current();
+ Error *local_err = NULL;
object_ref(OBJECT(migr));
@@ -2111,7 +2112,7 @@ static void *postcopy_ram_listen_thread(void *opaque)
qemu_file_set_blocking(f, true, &error_fatal);
/* TODO: sanity check that only postcopiable data will be loaded here */
- load_res = qemu_loadvm_state_main(f, mis, &error_fatal);
+ load_res = qemu_loadvm_state_main(f, mis, &local_err);
/*
* This is tricky, but, mis->from_src_file can change after it
@@ -2137,7 +2138,10 @@ static void *postcopy_ram_listen_thread(void *opaque)
__func__, load_res);
load_res = 0; /* prevent further exit() */
} else {
- error_report("%s: loadvm failed: %d", __func__, load_res);
+ error_prepend(&local_err,
+ "loadvm failed during postcopy: %d: ", load_res);
+ migrate_set_error(migr, local_err);
+ error_report_err(local_err);
migrate_set_state(&mis->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
MIGRATION_STATUS_FAILED);
}
--
2.50.1
* [PULL 24/45] migration: Remove error variant of vmstate_save_state() function
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Arun Menon, Akihiko Odaki
From: Arun Menon <armenon@redhat.com>
This commit removes the redundant vmstate_save_state_with_err()
function.
Previously, commit 969298f9d7 introduced vmstate_save_state_with_err()
to handle error propagation, while vmstate_save_state() existed for
non-error scenarios.
The split existed because some code paths in vmstate_save_state_v()
(called internally by vmstate_save_state()) did not explicitly set
an error on failure.
This change unifies error handling by
- updating vmstate_save_state() to accept an Error **errp argument;
- making vmstate_save_state_v() set errors directly in the errp
object, eliminating the need for two separate functions.
All calls to vmstate_save_state_with_err() are replaced with
vmstate_save_state(). This simplifies the API and improves code
maintainability.
Because vmstate_save_state() only calls vmstate_save_state_v(), it
also sets errp on failure.
Callers that cannot propagate the error report it using
error_report_err(). Where the function should exit on error,
&error_fatal is passed.
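The three caller policies mentioned above (propagate, report locally, exit) can be sketched against a single save function. Everything here is an assumption made for illustration: the Error type, the sketch_error_fatal sentinel, and the sketch_* functions are simplified stand-ins, not QEMU's implementation of &error_fatal or vmstate_save_state().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for an Error object. */
typedef struct Error { char msg[128]; } Error;

/* Hypothetical sentinel, similar in spirit to passing &error_fatal. */
static Error *sketch_error_fatal;

static void sketch_error_set(Error **errp, const char *msg)
{
    Error *err;

    if (errp == &sketch_error_fatal) {
        fprintf(stderr, "fatal: %s\n", msg);
        exit(1);
    }
    if (!errp) {
        return;
    }
    err = calloc(1, sizeof(*err));
    assert(err != NULL);        /* allocation failure is not handled in this sketch */
    snprintf(err->msg, sizeof(err->msg), "%s", msg);
    *errp = err;
}

/* The single save function; 'fail' simulates a failure for the example. */
static int sketch_save_state(int fail, Error **errp)
{
    if (fail) {
        sketch_error_set(errp, "save failed");
        return -1;
    }
    return 0;
}

/* Policy 1: the caller has its own errp, so it simply propagates. */
static int sketch_save_config(int fail, Error **errp)
{
    return sketch_save_state(fail, errp);
}

/* Policy 2: no errp in the signature, so collect locally and report. */
static int sketch_save_and_report(int fail)
{
    Error *local_err = NULL;
    int ret = sketch_save_state(fail, &local_err);

    if (ret) {
        fprintf(stderr, "%s\n", local_err->msg);
        free(local_err);
    }
    return ret;
}

/* Policy 3: a void function that cannot return an error exits instead. */
static void sketch_save_or_die(int fail)
{
    sketch_save_state(fail, &sketch_error_fatal);
}
```

Which policy fits depends on whether the calling function's signature can carry an error back to its own caller.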
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-24-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
include/migration/vmstate.h | 2 --
hw/display/virtio-gpu.c | 3 ++-
hw/pci/pci.c | 2 +-
hw/s390x/virtio-ccw.c | 2 +-
hw/scsi/spapr_vscsi.c | 2 +-
hw/vfio/pci.c | 4 ++--
hw/virtio/virtio-mmio.c | 2 +-
hw/virtio/virtio-pci.c | 2 +-
hw/virtio/virtio.c | 6 ++++--
migration/cpr.c | 3 +--
migration/savevm.c | 11 ++++++++---
migration/vmstate-types.c | 25 ++++++++++++++++++-------
migration/vmstate.c | 10 ++--------
tests/unit/test-vmstate.c | 20 +++++++++++++++++---
ui/vdagent.c | 3 ++-
15 files changed, 61 insertions(+), 36 deletions(-)
diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
index 056781b1c2..5fe9bbf390 100644
--- a/include/migration/vmstate.h
+++ b/include/migration/vmstate.h
@@ -1198,8 +1198,6 @@ extern const VMStateInfo vmstate_info_qlist;
int vmstate_load_state(QEMUFile *f, const VMStateDescription *vmsd,
void *opaque, int version_id, Error **errp);
int vmstate_save_state(QEMUFile *f, const VMStateDescription *vmsd,
- void *opaque, JSONWriter *vmdesc);
-int vmstate_save_state_with_err(QEMUFile *f, const VMStateDescription *vmsd,
void *opaque, JSONWriter *vmdesc, Error **errp);
int vmstate_save_state_v(QEMUFile *f, const VMStateDescription *vmsd,
void *opaque, JSONWriter *vmdesc,
diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
index e61585aa61..3a555125be 100644
--- a/hw/display/virtio-gpu.c
+++ b/hw/display/virtio-gpu.c
@@ -1248,7 +1248,8 @@ static int virtio_gpu_save(QEMUFile *f, void *opaque, size_t size,
}
qemu_put_be32(f, 0); /* end of list */
- return vmstate_save_state(f, &vmstate_virtio_gpu_scanouts, g, NULL);
+ return vmstate_save_state(f, &vmstate_virtio_gpu_scanouts, g, NULL,
+ &error_fatal);
}
static bool virtio_gpu_load_restore_mapping(VirtIOGPU *g,
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 17715ca1b3..5e2c3c6fc2 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -926,7 +926,7 @@ void pci_device_save(PCIDevice *s, QEMUFile *f)
* This makes us compatible with old devices
* which never set or clear this bit. */
s->config[PCI_STATUS] &= ~PCI_STATUS_INTERRUPT;
- vmstate_save_state(f, &vmstate_pci_device, s, NULL);
+ vmstate_save_state(f, &vmstate_pci_device, s, NULL, &error_fatal);
/* Restore the interrupt status bit. */
pci_update_irq_status(s);
}
diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c
index 6a9641a03d..4cb1ced001 100644
--- a/hw/s390x/virtio-ccw.c
+++ b/hw/s390x/virtio-ccw.c
@@ -1130,7 +1130,7 @@ static int virtio_ccw_load_queue(DeviceState *d, int n, QEMUFile *f)
static void virtio_ccw_save_config(DeviceState *d, QEMUFile *f)
{
VirtioCcwDevice *dev = VIRTIO_CCW_DEVICE(d);
- vmstate_save_state(f, &vmstate_virtio_ccw_dev, dev, NULL);
+ vmstate_save_state(f, &vmstate_virtio_ccw_dev, dev, NULL, &error_fatal);
}
static int virtio_ccw_load_config(DeviceState *d, QEMUFile *f)
diff --git a/hw/scsi/spapr_vscsi.c b/hw/scsi/spapr_vscsi.c
index da173f4867..f0a7dd2b88 100644
--- a/hw/scsi/spapr_vscsi.c
+++ b/hw/scsi/spapr_vscsi.c
@@ -630,7 +630,7 @@ static void vscsi_save_request(QEMUFile *f, SCSIRequest *sreq)
vscsi_req *req = sreq->hba_private;
assert(req->active);
- vmstate_save_state(f, &vmstate_spapr_vscsi_req, req, NULL);
+ vmstate_save_state(f, &vmstate_spapr_vscsi_req, req, NULL, &error_fatal);
trace_spapr_vscsi_save_request(req->qtag, req->cur_desc_num,
req->cur_desc_offset);
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index a5df4685d4..06b06afc2b 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2821,8 +2821,8 @@ static int vfio_pci_save_config(VFIODevice *vbasedev, QEMUFile *f, Error **errp)
{
VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
- return vmstate_save_state_with_err(f, &vmstate_vfio_pci_config, vdev, NULL,
- errp);
+ return vmstate_save_state(f, &vmstate_vfio_pci_config, vdev, NULL,
+ errp);
}
static int vfio_pci_load_config(VFIODevice *vbasedev, QEMUFile *f)
diff --git a/hw/virtio/virtio-mmio.c b/hw/virtio/virtio-mmio.c
index 0a688909fc..fb58c36452 100644
--- a/hw/virtio/virtio-mmio.c
+++ b/hw/virtio/virtio-mmio.c
@@ -613,7 +613,7 @@ static void virtio_mmio_save_extra_state(DeviceState *opaque, QEMUFile *f)
{
VirtIOMMIOProxy *proxy = VIRTIO_MMIO(opaque);
- vmstate_save_state(f, &vmstate_virtio_mmio, proxy, NULL);
+ vmstate_save_state(f, &vmstate_virtio_mmio, proxy, NULL, &error_fatal);
}
static int virtio_mmio_load_extra_state(DeviceState *opaque, QEMUFile *f)
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index b04faa1e5c..d2595fbd55 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -154,7 +154,7 @@ static void virtio_pci_save_extra_state(DeviceState *d, QEMUFile *f)
{
VirtIOPCIProxy *proxy = to_virtio_pci_proxy(d);
- vmstate_save_state(f, &vmstate_virtio_pci, proxy, NULL);
+ vmstate_save_state(f, &vmstate_virtio_pci, proxy, NULL, &error_fatal);
}
static int virtio_pci_load_extra_state(DeviceState *d, QEMUFile *f)
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 018803c80d..0a68f1b6f1 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -2992,6 +2992,7 @@ int virtio_save(VirtIODevice *vdev, QEMUFile *f)
VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
uint32_t guest_features_lo = (vdev->guest_features & 0xffffffff);
int i;
+ Error *local_err = NULL;
if (k->save_config) {
k->save_config(qbus->parent, f);
@@ -3035,14 +3036,15 @@ int virtio_save(VirtIODevice *vdev, QEMUFile *f)
}
if (vdc->vmsd) {
- int ret = vmstate_save_state(f, vdc->vmsd, vdev, NULL);
+ int ret = vmstate_save_state(f, vdc->vmsd, vdev, NULL, &local_err);
if (ret) {
+ error_report_err(local_err);
return ret;
}
}
/* Subsections */
- return vmstate_save_state(f, &vmstate_virtio, vdev, NULL);
+ return vmstate_save_state(f, &vmstate_virtio, vdev, NULL, &error_fatal);
}
/* A wrapper for use as a VMState .put function */
diff --git a/migration/cpr.c b/migration/cpr.c
index 6b0e19651a..e0b47df222 100644
--- a/migration/cpr.c
+++ b/migration/cpr.c
@@ -183,9 +183,8 @@ int cpr_state_save(MigrationChannel *channel, Error **errp)
qemu_put_be32(f, QEMU_CPR_FILE_MAGIC);
qemu_put_be32(f, QEMU_CPR_FILE_VERSION);
- ret = vmstate_save_state(f, &vmstate_cpr_state, &cpr_state, 0);
+ ret = vmstate_save_state(f, &vmstate_cpr_state, &cpr_state, 0, errp);
if (ret) {
- error_setg(errp, "vmstate_save_state error %d", ret);
qemu_fclose(f);
return ret;
}
diff --git a/migration/savevm.c b/migration/savevm.c
index 996673b679..7b35ec4dd0 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1056,8 +1056,8 @@ static int vmstate_save(QEMUFile *f, SaveStateEntry *se, JSONWriter *vmdesc,
if (!se->vmsd) {
vmstate_save_old_style(f, se, vmdesc);
} else {
- ret = vmstate_save_state_with_err(f, se->vmsd, se->opaque, vmdesc,
- errp);
+ ret = vmstate_save_state(f, se->vmsd, se->opaque, vmdesc,
+ errp);
if (ret) {
return ret;
}
@@ -1285,6 +1285,7 @@ void qemu_savevm_state_header(QEMUFile *f)
{
MigrationState *s = migrate_get_current();
JSONWriter *vmdesc = s->vmdesc;
+ Error *local_err = NULL;
trace_savevm_state_header();
qemu_put_be32(f, QEMU_VM_FILE_MAGIC);
@@ -1303,7 +1304,11 @@ void qemu_savevm_state_header(QEMUFile *f)
json_writer_start_object(vmdesc, "configuration");
}
- vmstate_save_state(f, &vmstate_configuration, &savevm_state, vmdesc);
+ vmstate_save_state(f, &vmstate_configuration, &savevm_state,
+ vmdesc, &local_err);
+ if (local_err) {
+ error_report_err(local_err);
+ }
if (vmdesc) {
json_writer_end_object(vmdesc);
diff --git a/migration/vmstate-types.c b/migration/vmstate-types.c
index c5cfd861e3..a1cd7a95fa 100644
--- a/migration/vmstate-types.c
+++ b/migration/vmstate-types.c
@@ -565,10 +565,14 @@ static int put_tmp(QEMUFile *f, void *pv, size_t size,
const VMStateDescription *vmsd = field->vmsd;
void *tmp = g_malloc(size);
int ret;
+ Error *local_err = NULL;
/* Writes the parent field which is at the start of the tmp */
*(void **)tmp = pv;
- ret = vmstate_save_state(f, vmsd, tmp, vmdesc);
+ ret = vmstate_save_state(f, vmsd, tmp, vmdesc, &local_err);
+ if (ret) {
+ error_report_err(local_err);
+ }
g_free(tmp);
return ret;
@@ -676,13 +680,15 @@ static int put_qtailq(QEMUFile *f, void *pv, size_t unused_size,
size_t entry_offset = field->start;
void *elm;
int ret;
+ Error *local_err = NULL;
trace_put_qtailq(vmsd->name, vmsd->version_id);
QTAILQ_RAW_FOREACH(elm, pv, entry_offset) {
qemu_put_byte(f, true);
- ret = vmstate_save_state(f, vmsd, elm, vmdesc);
+ ret = vmstate_save_state(f, vmsd, elm, vmdesc, &local_err);
if (ret) {
+ error_report_err(local_err);
return ret;
}
}
@@ -711,6 +717,7 @@ static gboolean put_gtree_elem(gpointer key, gpointer value, gpointer data)
struct put_gtree_data *capsule = (struct put_gtree_data *)data;
QEMUFile *f = capsule->f;
int ret;
+ Error *local_err = NULL;
qemu_put_byte(f, true);
@@ -718,16 +725,20 @@ static gboolean put_gtree_elem(gpointer key, gpointer value, gpointer data)
if (!capsule->key_vmsd) {
qemu_put_be64(f, (uint64_t)(uintptr_t)(key)); /* direct key */
} else {
- ret = vmstate_save_state(f, capsule->key_vmsd, key, capsule->vmdesc);
+ ret = vmstate_save_state(f, capsule->key_vmsd, key, capsule->vmdesc,
+ &local_err);
if (ret) {
+ error_report_err(local_err);
capsule->ret = ret;
return true;
}
}
/* put the data */
- ret = vmstate_save_state(f, capsule->val_vmsd, value, capsule->vmdesc);
+ ret = vmstate_save_state(f, capsule->val_vmsd, value, capsule->vmdesc,
+ &local_err);
if (ret) {
+ error_report_err(local_err);
capsule->ret = ret;
return true;
}
@@ -857,14 +868,14 @@ static int put_qlist(QEMUFile *f, void *pv, size_t unused_size,
size_t entry_offset = field->start;
void *elm;
int ret;
+ Error *local_err = NULL;
trace_put_qlist(field->name, vmsd->name, vmsd->version_id);
QLIST_RAW_FOREACH(elm, pv, entry_offset) {
qemu_put_byte(f, true);
- ret = vmstate_save_state(f, vmsd, elm, vmdesc);
+ ret = vmstate_save_state(f, vmsd, elm, vmdesc, &local_err);
if (ret) {
- error_report("%s: failed to save %s (%d)", field->name,
- vmsd->name, ret);
+ error_report_err(local_err);
return ret;
}
}
diff --git a/migration/vmstate.c b/migration/vmstate.c
index 8d1e9eb62b..ad8e5b71ae 100644
--- a/migration/vmstate.c
+++ b/migration/vmstate.c
@@ -406,12 +406,6 @@ bool vmstate_section_needed(const VMStateDescription *vmsd, void *opaque)
int vmstate_save_state(QEMUFile *f, const VMStateDescription *vmsd,
- void *opaque, JSONWriter *vmdesc_id)
-{
- return vmstate_save_state_v(f, vmsd, opaque, vmdesc_id, vmsd->version_id, NULL);
-}
-
-int vmstate_save_state_with_err(QEMUFile *f, const VMStateDescription *vmsd,
void *opaque, JSONWriter *vmdesc_id, Error **errp)
{
return vmstate_save_state_v(f, vmsd, opaque, vmdesc_id, vmsd->version_id, errp);
@@ -512,7 +506,7 @@ int vmstate_save_state_v(QEMUFile *f, const VMStateDescription *vmsd,
if (inner_field->flags & VMS_STRUCT) {
ret = vmstate_save_state(f, inner_field->vmsd,
- curr_elem, vmdesc_loop);
+ curr_elem, vmdesc_loop, errp);
} else if (inner_field->flags & VMS_VSTRUCT) {
ret = vmstate_save_state_v(f, inner_field->vmsd,
curr_elem, vmdesc_loop,
@@ -674,7 +668,7 @@ static int vmstate_subsection_save(QEMUFile *f, const VMStateDescription *vmsd,
qemu_put_byte(f, len);
qemu_put_buffer(f, (uint8_t *)vmsdsub->name, len);
qemu_put_be32(f, vmsdsub->version_id);
- ret = vmstate_save_state_with_err(f, vmsdsub, opaque, vmdesc, errp);
+ ret = vmstate_save_state(f, vmsdsub, opaque, vmdesc, errp);
if (ret) {
return ret;
}
diff --git a/tests/unit/test-vmstate.c b/tests/unit/test-vmstate.c
index 4ff0ab632f..cadbab3c5e 100644
--- a/tests/unit/test-vmstate.c
+++ b/tests/unit/test-vmstate.c
@@ -67,9 +67,13 @@ static QEMUFile *open_test_file(bool write)
static void save_vmstate(const VMStateDescription *desc, void *obj)
{
QEMUFile *f = open_test_file(true);
+ Error *local_err = NULL;
/* Save file with vmstate */
- int ret = vmstate_save_state(f, desc, obj, NULL);
+ int ret = vmstate_save_state(f, desc, obj, NULL, &local_err);
+ if (ret) {
+ error_report_err(local_err);
+ }
g_assert(!ret);
qemu_put_byte(f, QEMU_VM_EOF);
g_assert(!qemu_file_get_error(f));
@@ -438,10 +442,15 @@ static const VMStateDescription vmstate_skipping = {
static void test_save_noskip(void)
{
+ Error *local_err = NULL;
QEMUFile *fsave = open_test_file(true);
TestStruct obj = { .a = 1, .b = 2, .c = 3, .d = 4, .e = 5, .f = 6,
.skip_c_e = false };
- int ret = vmstate_save_state(fsave, &vmstate_skipping, &obj, NULL);
+ int ret = vmstate_save_state(fsave, &vmstate_skipping, &obj, NULL,
+ &local_err);
+ if (ret) {
+ error_report_err(local_err);
+ }
g_assert(!ret);
g_assert(!qemu_file_get_error(fsave));
@@ -460,10 +469,15 @@ static void test_save_noskip(void)
static void test_save_skip(void)
{
+ Error *local_err = NULL;
QEMUFile *fsave = open_test_file(true);
TestStruct obj = { .a = 1, .b = 2, .c = 3, .d = 4, .e = 5, .f = 6,
.skip_c_e = true };
- int ret = vmstate_save_state(fsave, &vmstate_skipping, &obj, NULL);
+ int ret = vmstate_save_state(fsave, &vmstate_skipping, &obj, NULL,
+ &local_err);
+ if (ret) {
+ error_report_err(local_err);
+ }
g_assert(!ret);
g_assert(!qemu_file_get_error(fsave));
diff --git a/ui/vdagent.c b/ui/vdagent.c
index bc3c77f013..ddb91e75c6 100644
--- a/ui/vdagent.c
+++ b/ui/vdagent.c
@@ -992,7 +992,8 @@ static int put_cbinfo(QEMUFile *f, void *pv, size_t size,
}
}
- return vmstate_save_state(f, &vmstate_cbinfo_array, &cbinfo, vmdesc);
+ return vmstate_save_state(f, &vmstate_cbinfo_array, &cbinfo, vmdesc,
+ &error_fatal);
}
static int get_cbinfo(QEMUFile *f, void *pv, size_t size,
--
2.50.1
* [PULL 25/45] migration: Add error-parameterized function variants in VMSD struct
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (23 preceding siblings ...)
2025-10-03 15:39 ` [PULL 24/45] migration: Remove error variant of vmstate_save_state() function Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-03 15:39 ` [PULL 26/45] backends/tpm: Propagate vTPM error on migration failure Peter Xu
` (20 subsequent siblings)
45 siblings, 0 replies; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Arun Menon, Akihiko Odaki
From: Arun Menon <armenon@redhat.com>
- We need to have good error reporting in the callbacks in
VMStateDescription struct. Specifically pre_save, pre_load
and post_load callbacks.
- It is not possible to change these functions everywhere in one
patch, therefore, we introduce a duplicate set of callbacks
with Error object passed to them.
- So, in this commit, we implement 'errp' variants of these callbacks,
introducing an explicit Error object parameter.
- This is a functional step towards transitioning the entire codebase
to the new error-parameterized functions.
- The 'errp' variants are deliberately called in mutual exclusion with
their counterparts, to prevent conflicts during the transition.
- New impls should preferentially use 'errp' variants of
these methods, and existing impls should be incrementally converted.
The variants without 'errp' are intended to be removed
once all usage is converted.
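
The mutual-exclusion rule described above can be sketched in isolation. This is a minimal illustration only, not the QEMU implementation: ToyError, ToyDesc and toy_run_pre_load() are hypothetical stand-ins for Error, VMStateDescription and the dispatch inside vmstate_load_state(), and the message text is simplified.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Toy stand-in for an error object; NOT QEMU's Error API. */
typedef struct ToyError {
    char msg[128];
} ToyError;

/* Hypothetical, trimmed-down description carrying both hook flavours. */
typedef struct ToyDesc {
    const char *name;
    int (*pre_load)(void *opaque);
    int (*pre_load_errp)(void *opaque, ToyError *err);
} ToyDesc;

/*
 * Run at most one of the two hooks: the errp variant wins when it is
 * set, and the legacy hook is only used as a fallback.
 */
static int toy_run_pre_load(const ToyDesc *d, void *opaque, ToyError *err)
{
    int ret = 0;

    assert(d != NULL && err != NULL);
    err->msg[0] = '\0';

    if (d->pre_load_errp) {
        /* The hook itself fills in the detailed message. */
        ret = d->pre_load_errp(opaque, err);
    } else if (d->pre_load) {
        ret = d->pre_load(opaque);
        if (ret) {
            /* Legacy hooks return only a code, so synthesize a message. */
            snprintf(err->msg, sizeof(err->msg),
                     "pre load hook failed for '%s', ret: %d", d->name, ret);
        }
    }
    return ret;
}

/* Example hooks used to exercise the dispatcher. */
static int legacy_fail(void *opaque)
{
    (void)opaque;
    return -5;
}

static int errp_fail(void *opaque, ToyError *err)
{
    (void)opaque;
    snprintf(err->msg, sizeof(err->msg), "secret mismatch");
    return -22;
}
```

When both hooks are set in this sketch, only errp_fail() runs, so the caller sees the hook's own message rather than a synthesized one.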
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-26-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
docs/devel/migration/main.rst | 19 +++++++++++++++++++
include/migration/vmstate.h | 14 ++++++++++++++
migration/vmstate.c | 31 ++++++++++++++++++++++++++++---
3 files changed, 61 insertions(+), 3 deletions(-)
diff --git a/docs/devel/migration/main.rst b/docs/devel/migration/main.rst
index 6493c1d2bc..1afe7b9689 100644
--- a/docs/devel/migration/main.rst
+++ b/docs/devel/migration/main.rst
@@ -444,6 +444,25 @@ The functions to do that are inside a vmstate definition, and are called:
This function is called after we save the state of one device
(even upon failure, unless the call to pre_save returned an error).
+Following are the errp variants of these functions.
+
+- ``int (*pre_load_errp)(void *opaque, Error **errp);``
+
+ This function is called before we load the state of one device.
+
+- ``int (*post_load_errp)(void *opaque, int version_id, Error **errp);``
+
+ This function is called after we load the state of one device.
+
+- ``int (*pre_save_errp)(void *opaque, Error **errp);``
+
+ This function is called before we save the state of one device.
+
+New impls should preferentally use 'errp' variants of these
+methods and existing impls incrementally converted.
+The variants without 'errp' are intended to be removed
+once all usage is converted.
+
Example: You can look at hpet.c, that uses the first three functions
to massage the state that is transferred.
diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
index 5fe9bbf390..5567fd78d0 100644
--- a/include/migration/vmstate.h
+++ b/include/migration/vmstate.h
@@ -200,14 +200,28 @@ struct VMStateDescription {
* exclusive. For this reason, also early_setup VMSDs are migrated in a
* QEMU_VM_SECTION_FULL section, while save_setup() data is migrated in
* a QEMU_VM_SECTION_START section.
+ *
+ * There are duplicate impls of the post/pre save/load hooks.
+ * New impls should preferentally use 'errp' variants of these
+ * methods and existing impls incrementally converted.
+ * The variants without 'errp' are intended to be removed
+ * once all usage is converted.
+ *
+ * For the errp variants,
+ * Returns: 0 on success,
+ * <0 on error where -value is an error number from errno.h
*/
+
bool early_setup;
int version_id;
int minimum_version_id;
MigrationPriority priority;
int (*pre_load)(void *opaque);
+ int (*pre_load_errp)(void *opaque, Error **errp);
int (*post_load)(void *opaque, int version_id);
+ int (*post_load_errp)(void *opaque, int version_id, Error **errp);
int (*pre_save)(void *opaque);
+ int (*pre_save_errp)(void *opaque, Error **errp);
int (*post_save)(void *opaque);
bool (*needed)(void *opaque);
bool (*dev_unplug_pending)(void *opaque);
diff --git a/migration/vmstate.c b/migration/vmstate.c
index ad8e5b71ae..81eadde553 100644
--- a/migration/vmstate.c
+++ b/migration/vmstate.c
@@ -134,6 +134,7 @@ static void vmstate_handle_alloc(void *ptr, const VMStateField *field,
int vmstate_load_state(QEMUFile *f, const VMStateDescription *vmsd,
void *opaque, int version_id, Error **errp)
{
+ ERRP_GUARD();
const VMStateField *field = vmsd->fields;
int ret = 0;
@@ -152,7 +153,16 @@ int vmstate_load_state(QEMUFile *f, const VMStateDescription *vmsd,
trace_vmstate_load_state_end(vmsd->name, "too old", -EINVAL);
return -EINVAL;
}
- if (vmsd->pre_load) {
+ if (vmsd->pre_load_errp) {
+ ret = vmsd->pre_load_errp(opaque, errp);
+ if (ret < 0) {
+ error_prepend(errp, "pre load hook failed for: '%s', "
+ "version_id: %d, minimum version_id: %d, "
+ "ret: %d: ", vmsd->name, vmsd->version_id,
+ vmsd->minimum_version_id, ret);
+ return ret;
+ }
+ } else if (vmsd->pre_load) {
ret = vmsd->pre_load(opaque);
if (ret) {
error_setg(errp, "pre load hook failed for: '%s', "
@@ -245,7 +255,14 @@ int vmstate_load_state(QEMUFile *f, const VMStateDescription *vmsd,
qemu_file_set_error(f, ret);
return ret;
}
- if (vmsd->post_load) {
+ if (vmsd->post_load_errp) {
+ ret = vmsd->post_load_errp(opaque, version_id, errp);
+ if (ret < 0) {
+ error_prepend(errp, "post load hook failed for: %s, version_id: "
+ "%d, minimum_version: %d, ret: %d: ", vmsd->name,
+ vmsd->version_id, vmsd->minimum_version_id, ret);
+ }
+ } else if (vmsd->post_load) {
ret = vmsd->post_load(opaque, version_id);
if (ret < 0) {
error_setg(errp,
@@ -414,12 +431,20 @@ int vmstate_save_state(QEMUFile *f, const VMStateDescription *vmsd,
int vmstate_save_state_v(QEMUFile *f, const VMStateDescription *vmsd,
void *opaque, JSONWriter *vmdesc, int version_id, Error **errp)
{
+ ERRP_GUARD();
int ret = 0;
const VMStateField *field = vmsd->fields;
trace_vmstate_save_state_top(vmsd->name);
- if (vmsd->pre_save) {
+ if (vmsd->pre_save_errp) {
+ ret = vmsd->pre_save_errp(opaque, errp);
+ trace_vmstate_save_state_pre_save_res(vmsd->name, ret);
+ if (ret < 0) {
+ error_prepend(errp, "pre-save for %s failed, ret: %d: ",
+ vmsd->name, ret);
+ }
+ } else if (vmsd->pre_save) {
ret = vmsd->pre_save(opaque);
trace_vmstate_save_state_pre_save_res(vmsd->name, ret);
if (ret) {
--
2.50.1
* [PULL 26/45] backends/tpm: Propagate vTPM error on migration failure
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (24 preceding siblings ...)
2025-10-03 15:39 ` [PULL 25/45] migration: Add error-parameterized function variants in VMSD struct Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-03 15:39 ` [PULL 27/45] io/crypto: Move tls premature termination handling into QIO layer Peter Xu
` (19 subsequent siblings)
45 siblings, 0 replies; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Arun Menon, Stefan Berger, Daniel P. Berrangé, Akihiko Odaki
From: Arun Menon <armenon@redhat.com>
- When migration of a VM with encrypted vTPM fails on the
destination host (e.g., due to a mismatch in secret values),
the error message displayed on the source host is generic and unhelpful.
- For example, a typical error looks like this:
"operation failed: job 'migration out' failed: Sibling indicated error 1.
operation failed: job 'migration in' failed: load of migration failed:
Input/output error"
- Such generic errors are logged using error_report(), which prints to
the console/monitor but does not make the detailed error accessible via
the QMP query-migrate command.
- This change, together with the changes that pass an Error **errp object
to the VM state loading functions, helps address the issue.
We use the post_load_errp hook of VMStateDescription to propagate errors
by setting Error **errp objects in case of failure in the TPM backend.
- It can then be retrieved using QMP command:
{"execute" : "query-migrate"}
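
The difference between printing an error and handing it back to the caller can be sketched as follows. This is an illustration under stated assumptions: ToyMigration, toy_set_state_blob() and toy_load() are hypothetical names, and the plain character buffer stands in for an Error object; none of this is the tpm_emulator code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical migration state; last_error is what a status query returns. */
typedef struct ToyMigration {
    char last_error[128];
} ToyMigration;

/*
 * Device-side step. On failure it describes the problem in @err instead
 * of printing it, so the caller decides where the message ends up.
 */
static int toy_set_state_blob(unsigned tpm_result, char *err, size_t errlen)
{
    if (tpm_result != 0) {
        snprintf(err, errlen,
                 "setting the state blob failed with TPM error 0x%x",
                 tpm_result);
        return -1;
    }
    return 0;
}

/* Framework-side step: keep the detailed message for later queries. */
static int toy_load(ToyMigration *m, unsigned tpm_result)
{
    char err[96] = "";

    assert(m != NULL);
    if (toy_set_state_blob(tpm_result, err, sizeof(err)) < 0) {
        snprintf(m->last_error, sizeof(m->last_error),
                 "load of migration failed: %s", err);
        return -5;
    }
    m->last_error[0] = '\0';
    return 0;
}
```

Because the device-side step never prints, the detailed text survives in last_error and a later status query can return it.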
Buglink: https://issues.redhat.com/browse/RHEL-82826
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-27-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
backends/tpm/tpm_emulator.c | 40 +++++++++++++++++++------------------
1 file changed, 21 insertions(+), 19 deletions(-)
diff --git a/backends/tpm/tpm_emulator.c b/backends/tpm/tpm_emulator.c
index 4a234ab2c0..dacfca5ab7 100644
--- a/backends/tpm/tpm_emulator.c
+++ b/backends/tpm/tpm_emulator.c
@@ -819,7 +819,8 @@ static int tpm_emulator_get_state_blobs(TPMEmulator *tpm_emu)
static int tpm_emulator_set_state_blob(TPMEmulator *tpm_emu,
uint32_t type,
TPMSizedBuffer *tsb,
- uint32_t flags)
+ uint32_t flags,
+ Error **errp)
{
ssize_t n;
ptm_setstate pss;
@@ -838,17 +839,18 @@ static int tpm_emulator_set_state_blob(TPMEmulator *tpm_emu,
/* write the header only */
if (tpm_emulator_ctrlcmd(tpm_emu, CMD_SET_STATEBLOB, &pss,
offsetof(ptm_setstate, u.req.data), 0, 0) < 0) {
- error_report("tpm-emulator: could not set state blob type %d : %s",
- type, strerror(errno));
+ error_setg_errno(errp, errno,
+ "tpm-emulator: could not set state blob type %d",
+ type);
return -1;
}
/* now the body */
n = qemu_chr_fe_write_all(&tpm_emu->ctrl_chr, tsb->buffer, tsb->size);
if (n != tsb->size) {
- error_report("tpm-emulator: Writing the stateblob (type %d) "
- "failed; could not write %u bytes, but only %zd",
- type, tsb->size, n);
+ error_setg(errp, "tpm-emulator: Writing the stateblob (type %d) "
+ "failed; could not write %u bytes, but only %zd",
+ type, tsb->size, n);
return -1;
}
@@ -856,17 +858,17 @@ static int tpm_emulator_set_state_blob(TPMEmulator *tpm_emu,
n = qemu_chr_fe_read_all(&tpm_emu->ctrl_chr,
(uint8_t *)&pss, sizeof(pss.u.resp));
if (n != sizeof(pss.u.resp)) {
- error_report("tpm-emulator: Reading response from writing stateblob "
- "(type %d) failed; expected %zu bytes, got %zd", type,
- sizeof(pss.u.resp), n);
+ error_setg(errp, "tpm-emulator: Reading response from writing "
+ "stateblob (type %d) failed; expected %zu bytes, "
+ "got %zd", type, sizeof(pss.u.resp), n);
return -1;
}
tpm_result = be32_to_cpu(pss.u.resp.tpm_result);
if (tpm_result != 0) {
- error_report("tpm-emulator: Setting the stateblob (type %d) failed "
- "with a TPM error 0x%x %s", type, tpm_result,
- tpm_emulator_strerror(tpm_result));
+ error_setg(errp, "tpm-emulator: Setting the stateblob (type %d) "
+ "failed with a TPM error 0x%x %s", type, tpm_result,
+ tpm_emulator_strerror(tpm_result));
return -1;
}
@@ -880,7 +882,7 @@ static int tpm_emulator_set_state_blob(TPMEmulator *tpm_emu,
*
* Returns a negative errno code in case of error.
*/
-static int tpm_emulator_set_state_blobs(TPMBackend *tb)
+static int tpm_emulator_set_state_blobs(TPMBackend *tb, Error **errp)
{
TPMEmulator *tpm_emu = TPM_EMULATOR(tb);
TPMBlobBuffers *state_blobs = &tpm_emu->state_blobs;
@@ -894,13 +896,13 @@ static int tpm_emulator_set_state_blobs(TPMBackend *tb)
if (tpm_emulator_set_state_blob(tpm_emu, PTM_BLOB_TYPE_PERMANENT,
&state_blobs->permanent,
- state_blobs->permanent_flags) < 0 ||
+ state_blobs->permanent_flags, errp) < 0 ||
tpm_emulator_set_state_blob(tpm_emu, PTM_BLOB_TYPE_VOLATILE,
&state_blobs->volatil,
- state_blobs->volatil_flags) < 0 ||
+ state_blobs->volatil_flags, errp) < 0 ||
tpm_emulator_set_state_blob(tpm_emu, PTM_BLOB_TYPE_SAVESTATE,
&state_blobs->savestate,
- state_blobs->savestate_flags) < 0) {
+ state_blobs->savestate_flags, errp) < 0) {
return -EIO;
}
@@ -948,12 +950,12 @@ static void tpm_emulator_vm_state_change(void *opaque, bool running,
*
* Returns negative errno codes in case of error.
*/
-static int tpm_emulator_post_load(void *opaque, int version_id)
+static int tpm_emulator_post_load(void *opaque, int version_id, Error **errp)
{
TPMBackend *tb = opaque;
int ret;
- ret = tpm_emulator_set_state_blobs(tb);
+ ret = tpm_emulator_set_state_blobs(tb, errp);
if (ret < 0) {
return ret;
}
@@ -969,7 +971,7 @@ static const VMStateDescription vmstate_tpm_emulator = {
.name = "tpm-emulator",
.version_id = 0,
.pre_save = tpm_emulator_pre_save,
- .post_load = tpm_emulator_post_load,
+ .post_load_errp = tpm_emulator_post_load,
.fields = (const VMStateField[]) {
VMSTATE_UINT32(state_blobs.permanent_flags, TPMEmulator),
VMSTATE_UINT32(state_blobs.permanent.size, TPMEmulator),
--
2.50.1
* [PULL 27/45] io/crypto: Move tls premature termination handling into QIO layer
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (25 preceding siblings ...)
2025-10-03 15:39 ` [PULL 26/45] backends/tpm: Propagate vTPM error on migration failure Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-10 8:00 ` iotest 233 is failing (was: [PULL 27/45] io/crypto: Move tls premature termination handling into QIO layer) Thomas Huth
2025-10-03 15:39 ` [PULL 28/45] migration: Make migration_has_failed() work even for CANCELLING Peter Xu
` (18 subsequent siblings)
45 siblings, 1 reply; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Daniel P. Berrangé, Juraj Marcin
QCryptoTLSSession allows TLS premature termination in two cases, one of
which is when the channel shutdown() is invoked on the READ side.
It's possible the shutdown() happened after the read thread blocked at
gnutls_record_recv(). In this case, we should allow the premature
termination to happen.
The problem is that by the time qcrypto_tls_session_read() was invoked,
tioc->shutdown may not have been set, so this may instead be treated as an
error if there are concurrent shutdown() calls.
To allow the flag to reflect the latest status of tioc->shutdown, move the
check up into the QIOChannel level, so as to read the flag only after
QEMU gets a GNUTLS_E_PREMATURE_TERMINATION.
While at it, introduce a qio_channel_tls_allow_premature_termination() helper
to make the condition checks easier to read. When doing so, change the
qatomic_load_acquire() to qatomic_read(): here we don't need any ordering
of memory accesses, only to read a flag. qatomic_read() suffices
because it guarantees fetching from memory. There is nothing else whose
memory accesses need to be ordered.
This patch will fix a qemu qtest warning when running the preempt tls test,
reporting premature termination:
QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test --full -r /x86_64/migration/postcopy/preempt/tls/psk
...
qemu-kvm: Cannot read from TLS channel: The TLS connection was non-properly terminated.
...
In this specific case, the error was set by postcopy_preempt_thread, which
normally will be concurrently shutdown()ed by the main thread.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20250918203937.200833-2-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
include/crypto/tlssession.h | 10 +++-------
crypto/tlssession.c | 7 ++-----
io/channel-tls.c | 21 +++++++++++++++++++--
3 files changed, 24 insertions(+), 14 deletions(-)
diff --git a/include/crypto/tlssession.h b/include/crypto/tlssession.h
index 2f62ce2d67..2e9fe11cf6 100644
--- a/include/crypto/tlssession.h
+++ b/include/crypto/tlssession.h
@@ -110,6 +110,7 @@
typedef struct QCryptoTLSSession QCryptoTLSSession;
#define QCRYPTO_TLS_SESSION_ERR_BLOCK -2
+#define QCRYPTO_TLS_SESSION_PREMATURE_TERMINATION -3
/**
* qcrypto_tls_session_new:
@@ -259,7 +260,6 @@ ssize_t qcrypto_tls_session_write(QCryptoTLSSession *sess,
* @sess: the TLS session object
* @buf: to fill with plain text received
* @len: the length of @buf
- * @gracefulTermination: treat premature termination as graceful EOF
* @errp: pointer to hold returned error object
*
* Receive up to @len bytes of data from the remote peer
@@ -267,22 +267,18 @@ ssize_t qcrypto_tls_session_write(QCryptoTLSSession *sess,
* qcrypto_tls_session_set_callbacks(), decrypt it and
* store it in @buf.
*
- * If @gracefulTermination is true, then a premature termination
- * of the TLS session will be treated as indicating EOF, as
- * opposed to an error.
- *
* It is an error to call this before
* qcrypto_tls_session_handshake() returns
* QCRYPTO_TLS_HANDSHAKE_COMPLETE
*
* Returns: the number of bytes received,
* or QCRYPTO_TLS_SESSION_ERR_BLOCK if the receive would block,
- * or -1 on error.
+ * or QCRYPTO_TLS_SESSION_PREMATURE_TERMINATION if a premature termination
+ * is detected, or -1 on error.
*/
ssize_t qcrypto_tls_session_read(QCryptoTLSSession *sess,
char *buf,
size_t len,
- bool gracefulTermination,
Error **errp);
/**
diff --git a/crypto/tlssession.c b/crypto/tlssession.c
index 86d407a142..ac38c2121d 100644
--- a/crypto/tlssession.c
+++ b/crypto/tlssession.c
@@ -552,7 +552,6 @@ ssize_t
qcrypto_tls_session_read(QCryptoTLSSession *session,
char *buf,
size_t len,
- bool gracefulTermination,
Error **errp)
{
ssize_t ret;
@@ -570,9 +569,8 @@ qcrypto_tls_session_read(QCryptoTLSSession *session,
if (ret < 0) {
if (ret == GNUTLS_E_AGAIN) {
return QCRYPTO_TLS_SESSION_ERR_BLOCK;
- } else if ((ret == GNUTLS_E_PREMATURE_TERMINATION) &&
- gracefulTermination){
- return 0;
+ } else if (ret == GNUTLS_E_PREMATURE_TERMINATION) {
+ return QCRYPTO_TLS_SESSION_PREMATURE_TERMINATION;
} else {
if (session->rerr) {
error_propagate(errp, session->rerr);
@@ -789,7 +787,6 @@ ssize_t
qcrypto_tls_session_read(QCryptoTLSSession *sess,
char *buf,
size_t len,
- bool gracefulTermination,
Error **errp)
{
error_setg(errp, "TLS requires GNUTLS support");
diff --git a/io/channel-tls.c b/io/channel-tls.c
index 7135896f79..1fbed4be0c 100644
--- a/io/channel-tls.c
+++ b/io/channel-tls.c
@@ -346,6 +346,19 @@ static void qio_channel_tls_finalize(Object *obj)
qcrypto_tls_session_free(ioc->session);
}
+static bool
+qio_channel_tls_allow_premature_termination(QIOChannelTLS *tioc, int flags)
+{
+ if (flags & QIO_CHANNEL_READ_FLAG_RELAXED_EOF) {
+ return true;
+ }
+
+ if (qatomic_read(&tioc->shutdown) & QIO_CHANNEL_SHUTDOWN_READ) {
+ return true;
+ }
+
+ return false;
+}
static ssize_t qio_channel_tls_readv(QIOChannel *ioc,
const struct iovec *iov,
@@ -364,8 +377,6 @@ static ssize_t qio_channel_tls_readv(QIOChannel *ioc,
tioc->session,
iov[i].iov_base,
iov[i].iov_len,
- flags & QIO_CHANNEL_READ_FLAG_RELAXED_EOF ||
- qatomic_load_acquire(&tioc->shutdown) & QIO_CHANNEL_SHUTDOWN_READ,
errp);
if (ret == QCRYPTO_TLS_SESSION_ERR_BLOCK) {
if (got) {
@@ -373,6 +384,12 @@ static ssize_t qio_channel_tls_readv(QIOChannel *ioc,
} else {
return QIO_CHANNEL_ERR_BLOCK;
}
+ } else if (ret == QCRYPTO_TLS_SESSION_PREMATURE_TERMINATION) {
+ if (qio_channel_tls_allow_premature_termination(tioc, flags)) {
+ ret = 0;
+ } else {
+ return -1;
+ }
} else if (ret < 0) {
return -1;
}
--
2.50.1
* [PULL 28/45] migration: Make migration_has_failed() work even for CANCELLING
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (26 preceding siblings ...)
2025-10-03 15:39 ` [PULL 27/45] io/crypto: Move tls premature termination handling into QIO layer Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-03 15:39 ` [PULL 29/45] migration: HMP: Adjust the order of output fields Peter Xu
` (17 subsequent siblings)
45 siblings, 0 replies; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Juraj Marcin
No issue was actually hit; this change comes only from reading the code
while looking at a TLS premature termination issue.
We set CANCELLED very late, which means migration_has_failed() may not work
correctly if it is invoked before CANCELLING is updated to CANCELLED.
Allowing that state makes migration_has_failed() work as expected even
if it is invoked slightly earlier.
One current user is the multifd code for the TLS graceful termination,
which runs before the state is updated to CANCELLED.
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20250918203937.200833-3-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/migration.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/migration/migration.c b/migration/migration.c
index 2f55f2784b..3ff85098d5 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1712,7 +1712,8 @@ int migration_call_notifiers(MigrationState *s, MigrationEventType type,
bool migration_has_failed(MigrationState *s)
{
- return (s->state == MIGRATION_STATUS_CANCELLED ||
+ return (s->state == MIGRATION_STATUS_CANCELLING ||
+ s->state == MIGRATION_STATUS_CANCELLED ||
s->state == MIGRATION_STATUS_FAILED);
}
--
2.50.1
* [PULL 29/45] migration: HMP: Adjust the order of output fields
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (27 preceding siblings ...)
2025-10-03 15:39 ` [PULL 28/45] migration: Make migration_has_failed() work even for CANCELLING Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-03 15:39 ` [PULL 30/45] migration/multifd/tls: Cleanup BYE message processing on sender side Peter Xu
` (16 subsequent siblings)
45 siblings, 0 replies; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini, Bin Guo,
Dr. David Alan Gilbert
From: Bin Guo <guobin@linux.alibaba.com>
Adjust the positions of 'tls-authz' and 'max-postcopy-bandwidth' in
the fields output by the 'info migrate_parameters' command so that
related fields are next to each other.
For clarity only, no functional changes.
Sample output after this commit:
(qemu) info migrate_parameters
...
max-cpu-throttle: 99
tls-creds: ''
tls-hostname: ''
tls-authz: ''
max-bandwidth: 134217728 bytes/second
avail-switchover-bandwidth: 0 bytes/second
max-postcopy-bandwidth: 0 bytes/second
downtime-limit: 300 ms
...
Cc: Dr. David Alan Gilbert <dave@treblig.org>
Signed-off-by: Bin Guo <guobin@linux.alibaba.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20250929021213.28369-1-guobin@linux.alibaba.com
[peterx: move postcopy-bw before avail-switchover-bw]
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/migration-hmp-cmds.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c
index 0fc21f0647..814221b260 100644
--- a/migration/migration-hmp-cmds.c
+++ b/migration/migration-hmp-cmds.c
@@ -353,6 +353,10 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
monitor_printf(mon, "%s: '%s'\n",
MigrationParameter_str(MIGRATION_PARAMETER_TLS_HOSTNAME),
params->tls_hostname);
+ assert(params->tls_authz);
+ monitor_printf(mon, "%s: '%s'\n",
+ MigrationParameter_str(MIGRATION_PARAMETER_TLS_AUTHZ),
+ params->tls_authz);
assert(params->has_max_bandwidth);
monitor_printf(mon, "%s: %" PRIu64 " bytes/second\n",
MigrationParameter_str(MIGRATION_PARAMETER_MAX_BANDWIDTH),
@@ -361,6 +365,10 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
monitor_printf(mon, "%s: %" PRIu64 " bytes/second\n",
MigrationParameter_str(MIGRATION_PARAMETER_AVAIL_SWITCHOVER_BANDWIDTH),
params->avail_switchover_bandwidth);
+ assert(params->has_max_postcopy_bandwidth);
+ monitor_printf(mon, "%s: %" PRIu64 " bytes/second\n",
+ MigrationParameter_str(MIGRATION_PARAMETER_MAX_POSTCOPY_BANDWIDTH),
+ params->max_postcopy_bandwidth);
assert(params->has_downtime_limit);
monitor_printf(mon, "%s: %" PRIu64 " ms\n",
MigrationParameter_str(MIGRATION_PARAMETER_DOWNTIME_LIMIT),
@@ -383,12 +391,6 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
monitor_printf(mon, "%s: %" PRIu64 " bytes\n",
MigrationParameter_str(MIGRATION_PARAMETER_XBZRLE_CACHE_SIZE),
params->xbzrle_cache_size);
- monitor_printf(mon, "%s: %" PRIu64 "\n",
- MigrationParameter_str(MIGRATION_PARAMETER_MAX_POSTCOPY_BANDWIDTH),
- params->max_postcopy_bandwidth);
- monitor_printf(mon, "%s: '%s'\n",
- MigrationParameter_str(MIGRATION_PARAMETER_TLS_AUTHZ),
- params->tls_authz);
if (params->has_block_bitmap_mapping) {
const BitmapMigrationNodeAliasList *bmnal;
--
2.50.1
* [PULL 30/45] migration/multifd/tls: Cleanup BYE message processing on sender side
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (28 preceding siblings ...)
2025-10-03 15:39 ` [PULL 29/45] migration: HMP: Adjust the order of output fields Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-03 15:39 ` [PULL 31/45] migration: Fix state transition in postcopy_start() error handling Peter Xu
` (15 subsequent siblings)
45 siblings, 0 replies; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini
This patch is a trivial cleanup of the BYE messages on the multifd sender
side. It could also be a fix, but since we do not have a solid clue,
it is treated as a cleanup only.
One minor concern is that migration_tls_channel_end() might be unsafe to
invoke in the migration thread if migration is not successful, because
when it failed / was cancelled we do not know whether the multifd sender
threads may still be writing to the channels, while the GnuTLS library
(when it's a TLS channel) logically doesn't support concurrent writes.
While at it, clean up a few things. What changed:
- Introduce a helper to do graceful shutdowns with rich comment, hiding
the details
- Only send bye() if migration succeeded, skip if it failed / cancelled
- Detect TLS channel using channel type rather than thread created flags
- Move the loop into the existing one that will close the channels, but
do graceful shutdowns before channel shutdowns
- local_err seems to have been leaked if set, fix it along the way
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20250925201601.290546-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/multifd.c | 65 ++++++++++++++++++++++++---------------------
1 file changed, 34 insertions(+), 31 deletions(-)
diff --git a/migration/multifd.c b/migration/multifd.c
index b255778855..98873cee74 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -439,6 +439,39 @@ static void multifd_send_set_error(Error *err)
}
}
+/*
+ * Gracefully shutdown IOChannels. Only needed for successful migrations on
+ * top of TLS channels. Otherwise it is the same as qio_channel_shutdown().
+ *
+ * A successful migration also guarantees multifd sender threads are
+ * properly flushed and halted. It is only safe to send BYE in the
+ * migration thread here when we know there's no other thread writing to
+ * the channel, because GnuTLS doesn't support concurrent writers.
+ */
+static void migration_ioc_shutdown_gracefully(QIOChannel *ioc)
+{
+ g_autoptr(Error) local_err = NULL;
+
+ if (!migration_has_failed(migrate_get_current()) &&
+ object_dynamic_cast((Object *)ioc, TYPE_QIO_CHANNEL_TLS)) {
+
+ /*
+ * The destination expects the TLS session to always be properly
+ * terminated. This helps to detect a premature termination in the
+ * middle of the stream. Note that older QEMUs always break the
+ * connection on the source and the destination always sees
+ * GNUTLS_E_PREMATURE_TERMINATION.
+ */
+ migration_tls_channel_end(ioc, &local_err);
+ if (local_err) {
+ warn_report("Failed to gracefully terminate TLS connection: %s",
+ error_get_pretty(local_err));
+ }
+ }
+
+ qio_channel_shutdown(ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
+}
+
static void multifd_send_terminate_threads(void)
{
int i;
@@ -460,7 +493,7 @@ static void multifd_send_terminate_threads(void)
qemu_sem_post(&p->sem);
if (p->c) {
- qio_channel_shutdown(p->c, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
+ migration_ioc_shutdown_gracefully(p->c);
}
}
@@ -547,36 +580,6 @@ void multifd_send_shutdown(void)
return;
}
- for (i = 0; i < migrate_multifd_channels(); i++) {
- MultiFDSendParams *p = &multifd_send_state->params[i];
-
- /* thread_created implies the TLS handshake has succeeded */
- if (p->tls_thread_created && p->thread_created) {
- Error *local_err = NULL;
- /*
- * The destination expects the TLS session to always be
- * properly terminated. This helps to detect a premature
- * termination in the middle of the stream. Note that
- * older QEMUs always break the connection on the source
- * and the destination always sees
- * GNUTLS_E_PREMATURE_TERMINATION.
- */
- migration_tls_channel_end(p->c, &local_err);
-
- /*
- * The above can return an error in case the migration has
- * already failed. If the migration succeeded, errors are
- * not expected but there's no need to kill the source.
- */
- if (local_err && !migration_has_failed(migrate_get_current())) {
- warn_report(
- "multifd_send_%d: Failed to terminate TLS connection: %s",
- p->id, error_get_pretty(local_err));
- break;
- }
- }
- }
-
multifd_send_terminate_threads();
for (i = 0; i < migrate_multifd_channels(); i++) {
--
2.50.1
* [PULL 31/45] migration: Fix state transition in postcopy_start() error handling
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (29 preceding siblings ...)
2025-10-03 15:39 ` [PULL 30/45] migration/multifd/tls: Cleanup BYE message processing on sender side Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-03 15:39 ` [PULL 32/45] migration: ensure APIC is loaded prior to VFIO PCI devices Peter Xu
` (14 subsequent siblings)
45 siblings, 0 replies; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Juraj Marcin, qemu-stable
From: Juraj Marcin <jmarcin@redhat.com>
Commit 48814111366b ("migration: Always set DEVICE state") introduced
DEVICE state to postcopy, which moved the actual state transition that
leads to POSTCOPY_ACTIVE.
However, the error handling part of the postcopy_start() function still
expects the state POSTCOPY_ACTIVE, but depending on where an error
happens, now the state can be either ACTIVE, DEVICE or CANCELLING, but
never POSTCOPY_ACTIVE, as this transition now happens just before a
successful return from the function.
Instead, accept any state except CANCELLING when transitioning to FAILED
state.
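The intended behaviour of the error path can be sketched in isolation.
This is only an illustration: the Status enum and fail_transition() below
are hypothetical, reduced stand-ins, not QEMU's MigrationStatus or
migrate_set_state().

```c
#include <assert.h>

/* Hypothetical, reduced stand-in for QEMU's migration status values. */
typedef enum {
    STATUS_ACTIVE,
    STATUS_DEVICE,
    STATUS_CANCELLING,
    STATUS_POSTCOPY_ACTIVE,
    STATUS_FAILED,
} Status;

/*
 * Models the error path described above: move to FAILED from whatever
 * state we are currently in, unless a cancel is already in progress,
 * in which case the state is left untouched.
 */
Status fail_transition(Status cur)
{
    assert(cur <= STATUS_FAILED);
    if (cur != STATUS_CANCELLING) {
        return STATUS_FAILED;
    }
    return cur;
}
```

With this shape, a failure from ACTIVE or DEVICE ends in FAILED, while a
concurrent cancel keeps CANCELLING so the cancel path can finish.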
Cc: qemu-stable@nongnu.org
Fixes: 48814111366b ("migration: Always set DEVICE state")
Signed-off-by: Juraj Marcin <jmarcin@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20250826115145.871272-1-jmarcin@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/migration.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index 3ff85098d5..edb8ff0d46 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2878,8 +2878,9 @@ static int postcopy_start(MigrationState *ms, Error **errp)
fail_closefb:
qemu_fclose(fb);
fail:
- migrate_set_state(&ms->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
- MIGRATION_STATUS_FAILED);
+ if (ms->state != MIGRATION_STATUS_CANCELLING) {
+ migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILED);
+ }
migration_block_activate(NULL);
migration_call_notifiers(ms, MIG_EVENT_PRECOPY_FAILED, NULL);
bql_unlock();
--
2.50.1
* [PULL 32/45] migration: ensure APIC is loaded prior to VFIO PCI devices
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (30 preceding siblings ...)
2025-10-03 15:39 ` [PULL 31/45] migration: Fix state transition in postcopy_start() error handling Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-03 15:39 ` [PULL 33/45] include/system/memory.h: Clarify address_space_destroy() behaviour Peter Xu
` (13 subsequent siblings)
45 siblings, 0 replies; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Yanfei Xu, Yicong Shen
From: Yanfei Xu <yanfei.xu@bytedance.com>
The load procedure of VFIO PCI devices involves setting up the IRT
for each VFIO PCI device. This requires determining whether an
interrupt is a single-destination interrupt, to decide between
Posted Interrupt (PI) and remapping mode for the IRTE. However,
determining this may require accessing the VM's APIC registers.
For example:
ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irqs)
...
kvm_arch_irq_bypass_add_producer
kvm_x86_call(pi_update_irte)
vmx_pi_update_irte
kvm_intr_is_single_vcpu
If the LAPIC has not been loaded yet, interrupts will use remapping
mode. To prevent this fallback of the interrupt mode, ensure the APIC is
always loaded prior to VFIO PCI devices.
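The ordering idea can be illustrated with a small standalone sketch. It
assumes, as the comments in the priority enum suggest, that entries with
a higher priority value are handled earlier. The Dev struct and
order_for_load() are hypothetical names for illustration and are not part
of the vmstate API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical device record: name plus a load priority. */
typedef struct {
    const char *name;
    int priority;   /* assumption: higher value is loaded earlier */
} Dev;

/* qsort() comparator that orders by descending priority. */
static int by_priority_desc(const void *a, const void *b)
{
    const Dev *x = a;
    const Dev *y = b;

    return (y->priority > x->priority) - (y->priority < x->priority);
}

/* Sort the devices into the order in which they would be loaded. */
void order_for_load(Dev *devs, size_t n)
{
    assert(devs != NULL || n == 0);
    qsort(devs, n, sizeof(*devs), by_priority_desc);
}
```

Giving the APIC a priority above the default is what places it ahead of
devices such as VFIO PCI that keep the default priority.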
Signed-off-by: Yicong Shen <shenyicong.1023@bytedance.com>
Signed-off-by: Yanfei Xu <yanfei.xu@bytedance.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20250818131127.1021648-1-yanfei.xu@bytedance.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
include/migration/vmstate.h | 1 +
hw/intc/apic_common.c | 1 +
2 files changed, 2 insertions(+)
diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
index 5567fd78d0..6f5a9fed68 100644
--- a/include/migration/vmstate.h
+++ b/include/migration/vmstate.h
@@ -163,6 +163,7 @@ typedef enum {
MIG_PRI_IOMMU, /* Must happen before PCI devices */
MIG_PRI_PCI_BUS, /* Must happen before IOMMU */
MIG_PRI_VIRTIO_MEM, /* Must happen before IOMMU */
+ MIG_PRI_APIC, /* Must happen before PCI devices */
MIG_PRI_GICV3_ITS, /* Must happen before PCI devices */
MIG_PRI_GICV3, /* Must happen before the ITS */
MIG_PRI_MAX,
diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
index 37a7a7019d..394fe02013 100644
--- a/hw/intc/apic_common.c
+++ b/hw/intc/apic_common.c
@@ -379,6 +379,7 @@ static const VMStateDescription vmstate_apic_common = {
.pre_load = apic_pre_load,
.pre_save = apic_dispatch_pre_save,
.post_load = apic_dispatch_post_load,
+ .priority = MIG_PRI_APIC,
.fields = (const VMStateField[]) {
VMSTATE_UINT32(apicbase, APICCommonState),
VMSTATE_UINT8(id, APICCommonState),
--
2.50.1
* [PULL 33/45] include/system/memory.h: Clarify address_space_destroy() behaviour
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (31 preceding siblings ...)
2025-10-03 15:39 ` [PULL 32/45] migration: ensure APIC is loaded prior to VFIO PCI devices Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-03 15:39 ` [PULL 34/45] memory: New AS helper to serialize destroy+free Peter Xu
` (12 subsequent siblings)
45 siblings, 0 replies; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini
From: Peter Maydell <peter.maydell@linaro.org>
address_space_destroy() doesn't actually immediately destroy the AS;
it queues it to be destroyed via RCU. This means you can't g_free()
the memory the AS struct is in until that has happened.
Clarify this in the documentation.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20250929144228.1994037-2-peter.maydell@linaro.org
Signed-off-by: Peter Xu <peterx@redhat.com>
---
include/system/memory.h | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/include/system/memory.h b/include/system/memory.h
index aa85fc27a1..827e2c5aa4 100644
--- a/include/system/memory.h
+++ b/include/system/memory.h
@@ -2727,9 +2727,14 @@ void address_space_init(AddressSpace *as, MemoryRegion *root, const char *name);
/**
* address_space_destroy: destroy an address space
*
- * Releases all resources associated with an address space. After an address space
- * is destroyed, its root memory region (given by address_space_init()) may be destroyed
- * as well.
+ * Releases all resources associated with an address space. After an
+ * address space is destroyed, the reference the AddressSpace had to
+ * its root memory region is dropped, which may result in the
+ * destruction of that memory region as well.
+ *
+ * Note that destruction of the AddressSpace is done via RCU;
+ * it is therefore not valid to free the memory the AddressSpace
+ * struct is in until after that RCU callback has completed.
*
* @as: address space to be destroyed
*/
--
2.50.1
* [PULL 34/45] memory: New AS helper to serialize destroy+free
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (32 preceding siblings ...)
2025-10-03 15:39 ` [PULL 33/45] include/system/memory.h: Clarify address_space_destroy() behaviour Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-03 15:39 ` [PULL 35/45] physmem: Destroy all CPU AddressSpaces on unrealize Peter Xu
` (11 subsequent siblings)
45 siblings, 0 replies; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
qemu-stable
If an AddressSpace has been created in its own allocated
memory, cleaning it up requires first destroying the AS
and then freeing the memory. Doing this doesn't work:
address_space_destroy(as);
g_free_rcu(as, rcu);
because both address_space_destroy() and g_free_rcu()
try to use the same 'rcu' node in the AddressSpace struct
and the address_space_destroy hook gets overwritten.
Provide a new address_space_destroy_free() function which
will destroy the AS and then free the memory it uses, all
in one RCU callback.
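The overwrite problem can be modelled with a deliberately simplified
sketch. Everything here is hypothetical: RcuHead, fake_call_rcu() and
fake_rcu_run() only mimic the idea of one callback slot per node, and are
not QEMU's RCU implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Toy RCU node: a single callback slot, as a stand-in for the real one. */
typedef struct RcuHead {
    void (*func)(struct RcuHead *head);
} RcuHead;

typedef struct {
    RcuHead rcu;      /* must stay first so the casts below are valid */
    int destroyed;
    int freed;
} FakeAS;

/* Queue a callback; a second call on the same node replaces the first. */
void fake_call_rcu(RcuHead *head, void (*func)(RcuHead *head))
{
    assert(head != NULL);
    head->func = func;
}

/* Pretend the grace period has elapsed and run the queued callback. */
void fake_rcu_run(RcuHead *head)
{
    if (head->func) {
        head->func(head);
        head->func = NULL;
    }
}

void do_destroy(RcuHead *head) { ((FakeAS *)head)->destroyed = 1; }
void do_free(RcuHead *head)    { ((FakeAS *)head)->freed = 1; }

/* The combined callback: destroy, then free, in one RCU callback. */
void do_destroy_free(RcuHead *head)
{
    do_destroy(head);
    do_free(head);
}
```

In this model, queueing do_destroy and then do_free on the same node runs
only do_free, whereas queueing do_destroy_free once runs both steps in
order.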
(CC to stable because the next commit needs this function.)
Cc: qemu-stable@nongnu.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20250929144228.1994037-3-peter.maydell@linaro.org
Signed-off-by: Peter Xu <peterx@redhat.com>
---
include/system/memory.h | 13 +++++++++++++
system/memory.c | 20 +++++++++++++++++++-
2 files changed, 32 insertions(+), 1 deletion(-)
diff --git a/include/system/memory.h b/include/system/memory.h
index 827e2c5aa4..08daf0fc59 100644
--- a/include/system/memory.h
+++ b/include/system/memory.h
@@ -2735,11 +2735,24 @@ void address_space_init(AddressSpace *as, MemoryRegion *root, const char *name);
* Note that destruction of the AddressSpace is done via RCU;
* it is therefore not valid to free the memory the AddressSpace
* struct is in until after that RCU callback has completed.
+ * If you want to g_free() the AddressSpace after destruction you
+ * can do that with address_space_destroy_free().
*
* @as: address space to be destroyed
*/
void address_space_destroy(AddressSpace *as);
+/**
+ * address_space_destroy_free: destroy an address space and free it
+ *
+ * This does the same thing as address_space_destroy(), and then also
+ * frees (via g_free()) the AddressSpace itself once the destruction
+ * is complete.
+ *
+ * @as: address space to be destroyed
+ */
+void address_space_destroy_free(AddressSpace *as);
+
/**
* address_space_remove_listeners: unregister all listeners of an address space
*
diff --git a/system/memory.c b/system/memory.c
index cf8cad6961..fe8b28a096 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -3278,7 +3278,14 @@ static void do_address_space_destroy(AddressSpace *as)
memory_region_unref(as->root);
}
-void address_space_destroy(AddressSpace *as)
+static void do_address_space_destroy_free(AddressSpace *as)
+{
+ do_address_space_destroy(as);
+ g_free(as);
+}
+
+/* Detach address space from global view, notify all listeners */
+static void address_space_detach(AddressSpace *as)
{
MemoryRegion *root = as->root;
@@ -3293,9 +3300,20 @@ void address_space_destroy(AddressSpace *as)
* values to expire before freeing the data.
*/
as->root = root;
+}
+
+void address_space_destroy(AddressSpace *as)
+{
+ address_space_detach(as);
call_rcu(as, do_address_space_destroy, rcu);
}
+void address_space_destroy_free(AddressSpace *as)
+{
+ address_space_detach(as);
+ call_rcu(as, do_address_space_destroy_free, rcu);
+}
+
static const char *memory_region_type(MemoryRegion *mr)
{
if (mr->alias) {
--
2.50.1
* [PULL 35/45] physmem: Destroy all CPU AddressSpaces on unrealize
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (33 preceding siblings ...)
2025-10-03 15:39 ` [PULL 34/45] memory: New AS helper to serialize destroy+free Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-03 15:39 ` [PULL 36/45] migration: simplify error reporting after channel read Peter Xu
` (10 subsequent siblings)
45 siblings, 0 replies; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
qemu-stable
From: Peter Maydell <peter.maydell@linaro.org>
When we unrealize a CPU object (which happens on vCPU hot-unplug), we
should destroy all the AddressSpace objects we created via calls to
cpu_address_space_init() when the CPU was realized.
Commit 24bec42f3d6eae added a function to do this for a specific
AddressSpace, but did not add any places where the function was
called.
Since we always want to destroy all the AddressSpaces on unrealize,
regardless of the target architecture, we don't need to try to keep
track of how many are still undestroyed, or make the target
architecture code manually call a destroy function for each AS it
created. Instead we can adjust the function to always completely
destroy the whole cpu->ases array, and arrange for it to be called
during CPU unrealize as part of the common code.
Without this fix, AddressSanitizer will report a leak like this
from a run where we hot-plugged and then hot-unplugged an x86 KVM
vCPU:
Direct leak of 416 byte(s) in 1 object(s) allocated from:
#0 0x5b638565053d in calloc (/data_nvme1n1/linaro/qemu-from-laptop/qemu/build/x86-tgts-asan/qemu-system-x86_64+0x1ee153d) (BuildId: c1cd6022b195142106e1bffeca23498c2b752bca)
#1 0x7c28083f77b1 in g_malloc0 (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x637b1) (BuildId: 1eb6131419edb83b2178b682829a6913cf682d75)
#2 0x5b6386999c7c in cpu_address_space_init /data_nvme1n1/linaro/qemu-from-laptop/qemu/build/x86-tgts-asan/../../system/physmem.c:797:25
#3 0x5b638727f049 in kvm_cpu_realizefn /data_nvme1n1/linaro/qemu-from-laptop/qemu/build/x86-tgts-asan/../../target/i386/kvm/kvm-cpu.c:102:5
#4 0x5b6385745f40 in accel_cpu_common_realize /data_nvme1n1/linaro/qemu-from-laptop/qemu/build/x86-tgts-asan/../../accel/accel-common.c:101:13
#5 0x5b638568fe3c in cpu_exec_realizefn /data_nvme1n1/linaro/qemu-from-laptop/qemu/build/x86-tgts-asan/../../hw/core/cpu-common.c:232:10
#6 0x5b63874a2cd5 in x86_cpu_realizefn /data_nvme1n1/linaro/qemu-from-laptop/qemu/build/x86-tgts-asan/../../target/i386/cpu.c:9321:5
#7 0x5b6387a0469a in device_set_realized /data_nvme1n1/linaro/qemu-from-laptop/qemu/build/x86-tgts-asan/../../hw/core/qdev.c:494:13
#8 0x5b6387a27d9e in property_set_bool /data_nvme1n1/linaro/qemu-from-laptop/qemu/build/x86-tgts-asan/../../qom/object.c:2375:5
#9 0x5b6387a2090b in object_property_set /data_nvme1n1/linaro/qemu-from-laptop/qemu/build/x86-tgts-asan/../../qom/object.c:1450:5
#10 0x5b6387a35b05 in object_property_set_qobject /data_nvme1n1/linaro/qemu-from-laptop/qemu/build/x86-tgts-asan/../../qom/qom-qobject.c:28:10
#11 0x5b6387a21739 in object_property_set_bool /data_nvme1n1/linaro/qemu-from-laptop/qemu/build/x86-tgts-asan/../../qom/object.c:1520:15
#12 0x5b63879fe510 in qdev_realize /data_nvme1n1/linaro/qemu-from-laptop/qemu/build/x86-tgts-asan/../../hw/core/qdev.c:276:12
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2517
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20250929144228.1994037-4-peter.maydell@linaro.org
Signed-off-by: Peter Xu <peterx@redhat.com>
---
include/exec/cpu-common.h | 10 +++++-----
include/hw/core/cpu.h | 1 -
hw/core/cpu-common.c | 1 +
stubs/cpu-destroy-address-spaces.c | 15 ++++++++++++++
system/physmem.c | 32 ++++++++++++++----------------
stubs/meson.build | 1 +
6 files changed, 37 insertions(+), 23 deletions(-)
create mode 100644 stubs/cpu-destroy-address-spaces.c
diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index f373781ae0..b96ac49844 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -123,13 +123,13 @@ size_t qemu_ram_pagesize_largest(void);
void cpu_address_space_init(CPUState *cpu, int asidx,
const char *prefix, MemoryRegion *mr);
/**
- * cpu_address_space_destroy:
- * @cpu: CPU for which address space needs to be destroyed
- * @asidx: integer index of this address space
+ * cpu_destroy_address_spaces:
+ * @cpu: CPU for which address spaces need to be destroyed
*
- * Note that with KVM only one address space is supported.
+ * Destroy all address spaces associated with this CPU; this
+ * is called as part of unrealizing the CPU.
*/
-void cpu_address_space_destroy(CPUState *cpu, int asidx);
+void cpu_destroy_address_spaces(CPUState *cpu);
void cpu_physical_memory_rw(hwaddr addr, void *buf,
hwaddr len, bool is_write);
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index c9f40c2539..0fcbc923f3 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -515,7 +515,6 @@ struct CPUState {
QSIMPLEQ_HEAD(, qemu_work_item) work_list;
struct CPUAddressSpace *cpu_ases;
- int cpu_ases_count;
int num_ases;
AddressSpace *as;
MemoryRegion *memory;
diff --git a/hw/core/cpu-common.c b/hw/core/cpu-common.c
index 41a339903c..8c306c89e4 100644
--- a/hw/core/cpu-common.c
+++ b/hw/core/cpu-common.c
@@ -294,6 +294,7 @@ void cpu_exec_unrealizefn(CPUState *cpu)
* accel_cpu_common_unrealize, which may free fields using call_rcu.
*/
accel_cpu_common_unrealize(cpu);
+ cpu_destroy_address_spaces(cpu);
}
static void cpu_common_initfn(Object *obj)
diff --git a/stubs/cpu-destroy-address-spaces.c b/stubs/cpu-destroy-address-spaces.c
new file mode 100644
index 0000000000..dc6813f5bd
--- /dev/null
+++ b/stubs/cpu-destroy-address-spaces.c
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+
+#include "qemu/osdep.h"
+#include "exec/cpu-common.h"
+
+/*
+ * user-mode CPUs never create address spaces with
+ * cpu_address_space_init(), so the cleanup function doesn't
+ * need to do anything. We need this stub because cpu-common.c
+ * is built-once so it can't #ifndef CONFIG_USER around the
+ * call; the real function is in physmem.c which is system-only.
+ */
+void cpu_destroy_address_spaces(CPUState *cpu)
+{
+}
diff --git a/system/physmem.c b/system/physmem.c
index ae8ecd50ea..dbb2a4e017 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -795,7 +795,6 @@ void cpu_address_space_init(CPUState *cpu, int asidx,
if (!cpu->cpu_ases) {
cpu->cpu_ases = g_new0(CPUAddressSpace, cpu->num_ases);
- cpu->cpu_ases_count = cpu->num_ases;
}
newas = &cpu->cpu_ases[asidx];
@@ -809,30 +808,29 @@ void cpu_address_space_init(CPUState *cpu, int asidx,
}
}
-void cpu_address_space_destroy(CPUState *cpu, int asidx)
+void cpu_destroy_address_spaces(CPUState *cpu)
{
CPUAddressSpace *cpuas;
+ int asidx;
assert(cpu->cpu_ases);
- assert(asidx >= 0 && asidx < cpu->num_ases);
- cpuas = &cpu->cpu_ases[asidx];
- if (tcg_enabled()) {
- memory_listener_unregister(&cpuas->tcg_as_listener);
- }
+ /* convenience alias just points to some cpu_ases[n] */
+ cpu->as = NULL;
- address_space_destroy(cpuas->as);
- g_free_rcu(cpuas->as, rcu);
-
- if (asidx == 0) {
- /* reset the convenience alias for address space 0 */
- cpu->as = NULL;
+ for (asidx = 0; asidx < cpu->num_ases; asidx++) {
+ cpuas = &cpu->cpu_ases[asidx];
+ if (!cpuas->as) {
+ /* This index was never initialized; no deinit needed */
+ continue;
+ }
+ if (tcg_enabled()) {
+ memory_listener_unregister(&cpuas->tcg_as_listener);
+ }
+ g_clear_pointer(&cpuas->as, address_space_destroy_free);
}
- if (--cpu->cpu_ases_count == 0) {
- g_free(cpu->cpu_ases);
- cpu->cpu_ases = NULL;
- }
+ g_clear_pointer(&cpu->cpu_ases, g_free);
}
AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx)
diff --git a/stubs/meson.build b/stubs/meson.build
index cef046e685..5d577467bf 100644
--- a/stubs/meson.build
+++ b/stubs/meson.build
@@ -55,6 +55,7 @@ endif
if have_user
# Symbols that are used by hw/core.
stub_ss.add(files('cpu-synchronize-state.c'))
+ stub_ss.add(files('cpu-destroy-address-spaces.c'))
# Stubs for QAPI events. Those can always be included in the build, but
# they are not built at all for --disable-system builds.
--
2.50.1
* [PULL 36/45] migration: simplify error reporting after channel read
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (34 preceding siblings ...)
2025-10-03 15:39 ` [PULL 35/45] physmem: Destroy all CPU AddressSpaces on unrealize Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-03 15:39 ` [PULL 37/45] migration: multi-mode notifier Peter Xu
` (9 subsequent siblings)
45 siblings, 0 replies; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Daniel P. Berrangé, Prasad Pandit
From: Daniel P. Berrangé <berrange@redhat.com>
The code handling the return value of qio_channel_read() processes
len == 0 (EOF) separately from len < 0 (error), but in both
cases ends up calling qemu_file_set_error_obj() with -EIO as the
errno. This logic can be merged into one codepath to simplify it.
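A minimal sketch of the merged logic, outside of QEMU: account_read() is
a hypothetical helper that either grows the buffer size or reports -EIO,
treating EOF and read errors identically. It is not the real
qemu_fill_buffer().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>  /* ssize_t */

/*
 * Hypothetical helper: account for the result of a read.
 * Returns 0 and grows *buf_size when data arrived; returns -EIO for
 * both EOF (len == 0) and errors (len < 0).
 */
int account_read(ssize_t len, size_t *buf_size)
{
    assert(buf_size != NULL);
    if (len > 0) {
        *buf_size += (size_t)len;
        return 0;
    }
    return -EIO;
}
```

Since both non-positive cases map to the same errno, a single else branch
is sufficient.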
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Prasad Pandit <pjp@fedoraproject.org>
Link: https://lore.kernel.org/r/20250801170212.54409-2-berrange@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/qemu-file.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 0ee0f48a3e..2d4ce174a5 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -348,17 +348,13 @@ static ssize_t coroutine_mixed_fn qemu_fill_buffer(QEMUFile *f)
} else {
qio_channel_wait(f->ioc, G_IO_IN);
}
- } else if (len < 0) {
- len = -EIO;
}
} while (len == QIO_CHANNEL_ERR_BLOCK);
if (len > 0) {
f->buf_size += len;
- } else if (len == 0) {
- qemu_file_set_error_obj(f, -EIO, local_error);
} else {
- qemu_file_set_error_obj(f, len, local_error);
+ qemu_file_set_error_obj(f, -EIO, local_error);
}
for (int i = 0; i < nfd; i++) {
--
2.50.1
* [PULL 37/45] migration: multi-mode notifier
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (35 preceding siblings ...)
2025-10-03 15:39 ` [PULL 36/45] migration: simplify error reporting after channel read Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-03 15:39 ` [PULL 38/45] migration: add cpr_walk_fd Peter Xu
` (8 subsequent siblings)
45 siblings, 0 replies; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Steve Sistare
From: Steve Sistare <steven.sistare@oracle.com>
Allow a notifier to be added for multiple migration modes.
To allow a notifier to appear on multiple per-mode lists, use
a generic list type. We can no longer use NotifierWithReturnList,
because it shoehorns the notifier onto a single list.
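One way to turn a variadic mode list into a bitmask is sketched below.
This is an assumption about the general technique only: the enum values
and modes_to_mask() are hypothetical and are not the get_modes() helper
used by the patch.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical mode values; MODE_ALL is a sentinel meaning every mode. */
enum {
    MODE_NORMAL,
    MODE_CPR_REBOOT,
    MODE_CPR_TRANSFER,
    MODE__MAX,
    MODE_ALL = MODE__MAX,
};

/*
 * Collect the modes from an argument list terminated by -1 or MODE_ALL
 * into a bitmask with one bit per mode.
 */
unsigned modes_to_mask(int mode, ...)
{
    unsigned mask = 0;
    va_list ap;

    va_start(ap, mode);
    while (mode != -1 && mode != MODE_ALL) {
        assert(mode >= 0 && mode < MODE__MAX);
        mask |= 1u << mode;
        mode = va_arg(ap, int);
    }
    va_end(ap);

    if (mode == MODE_ALL) {
        mask = (1u << MODE__MAX) - 1;   /* every mode bit set */
    }
    return mask;
}
```

A caller would then walk the mask and add the notifier to the list of
each mode whose bit is set.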
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/1759332851-370353-2-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
include/migration/misc.h | 12 ++++++++
migration/migration.c | 60 +++++++++++++++++++++++++++++++---------
2 files changed, 59 insertions(+), 13 deletions(-)
diff --git a/include/migration/misc.h b/include/migration/misc.h
index a261f99d89..592b93021e 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -95,7 +95,19 @@ void migration_add_notifier(NotifierWithReturn *notify,
void migration_add_notifier_mode(NotifierWithReturn *notify,
MigrationNotifyFunc func, MigMode mode);
+/*
+ * Same as migration_add_notifier, but applies to all @mode in the argument
+ * list. The list is terminated by -1 or MIG_MODE_ALL. For the latter,
+ * the notifier is added for all modes.
+ */
+void migration_add_notifier_modes(NotifierWithReturn *notify,
+ MigrationNotifyFunc func, MigMode mode, ...);
+
+/*
+ * Remove a notifier from all modes.
+ */
void migration_remove_notifier(NotifierWithReturn *notify);
+
void migration_file_set_error(int ret, Error *err);
/* True if incoming migration entered POSTCOPY_INCOMING_DISCARD */
diff --git a/migration/migration.c b/migration/migration.c
index edb8ff0d46..a399735f02 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -74,11 +74,7 @@
#define INMIGRATE_DEFAULT_EXIT_ON_ERROR true
-static NotifierWithReturnList migration_state_notifiers[] = {
- NOTIFIER_ELEM_INIT(migration_state_notifiers, MIG_MODE_NORMAL),
- NOTIFIER_ELEM_INIT(migration_state_notifiers, MIG_MODE_CPR_REBOOT),
- NOTIFIER_ELEM_INIT(migration_state_notifiers, MIG_MODE_CPR_TRANSFER),
-};
+static GSList *migration_state_notifiers[MIG_MODE__MAX];
/* Messages sent on the return path from destination to source */
enum mig_rp_message_type {
@@ -1675,23 +1671,51 @@ void migration_cancel(void)
}
}
+static int get_modes(MigMode mode, va_list ap);
+
+static void add_notifiers(NotifierWithReturn *notify, int modes)
+{
+ for (MigMode mode = 0; mode < MIG_MODE__MAX; mode++) {
+ if (modes & BIT(mode)) {
+ migration_state_notifiers[mode] =
+ g_slist_prepend(migration_state_notifiers[mode], notify);
+ }
+ }
+}
+
+void migration_add_notifier_modes(NotifierWithReturn *notify,
+ MigrationNotifyFunc func, MigMode mode, ...)
+{
+ int modes;
+ va_list ap;
+
+ va_start(ap, mode);
+ modes = get_modes(mode, ap);
+ va_end(ap);
+
+ notify->notify = (NotifierWithReturnFunc)func;
+ add_notifiers(notify, modes);
+}
+
void migration_add_notifier_mode(NotifierWithReturn *notify,
MigrationNotifyFunc func, MigMode mode)
{
- notify->notify = (NotifierWithReturnFunc)func;
- notifier_with_return_list_add(&migration_state_notifiers[mode], notify);
+ migration_add_notifier_modes(notify, func, mode, -1);
}
void migration_add_notifier(NotifierWithReturn *notify,
MigrationNotifyFunc func)
{
- migration_add_notifier_mode(notify, func, MIG_MODE_NORMAL);
+ migration_add_notifier_modes(notify, func, MIG_MODE_NORMAL, -1);
}
void migration_remove_notifier(NotifierWithReturn *notify)
{
if (notify->notify) {
- notifier_with_return_remove(notify);
+ for (MigMode mode = 0; mode < MIG_MODE__MAX; mode++) {
+ migration_state_notifiers[mode] =
+ g_slist_remove(migration_state_notifiers[mode], notify);
+ }
notify->notify = NULL;
}
}
@@ -1701,13 +1725,23 @@ int migration_call_notifiers(MigrationState *s, MigrationEventType type,
{
MigMode mode = s->parameters.mode;
MigrationEvent e;
+ NotifierWithReturn *notifier;
+ GSList *elem, *next;
int ret;
e.type = type;
- ret = notifier_with_return_list_notify(&migration_state_notifiers[mode],
- &e, errp);
- assert(!ret || type == MIG_EVENT_PRECOPY_SETUP);
- return ret;
+
+ for (elem = migration_state_notifiers[mode]; elem; elem = next) {
+ next = elem->next;
+ notifier = (NotifierWithReturn *)elem->data;
+ ret = notifier->notify(notifier, &e, errp);
+ if (ret) {
+ assert(type == MIG_EVENT_PRECOPY_SETUP);
+ return ret;
+ }
+ }
+
+ return 0;
}
bool migration_has_failed(MigrationState *s)
--
2.50.1
* [PULL 38/45] migration: add cpr_walk_fd
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (36 preceding siblings ...)
2025-10-03 15:39 ` [PULL 37/45] migration: multi-mode notifier Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-03 15:39 ` [PULL 39/45] oslib: qemu_clear_cloexec Peter Xu
` (7 subsequent siblings)
45 siblings, 0 replies; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Steve Sistare
From: Steve Sistare <steven.sistare@oracle.com>
Add a helper to walk all CPR fds and run a callback for each, stopping
early if a callback returns false.
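The walk-with-callback pattern can be sketched on a plain linked list.
FdNode, walk_fds() and the two example callbacks are hypothetical and
only illustrate the early-stop contract; they are not the CPR state
structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical singly linked list of file descriptors. */
typedef struct FdNode {
    int fd;
    struct FdNode *next;
} FdNode;

typedef bool (*walk_fd_cb)(int fd);

/* Call cb for each fd; stop and return false on the first failure. */
bool walk_fds(const FdNode *head, walk_fd_cb cb)
{
    assert(cb != NULL);
    for (const FdNode *n = head; n; n = n->next) {
        if (!cb(n->fd)) {
            return false;
        }
    }
    return true;
}

/* Example callbacks, used only to exercise the walker. */
bool fd_is_nonneg(int fd)    { return fd >= 0; }
bool fd_below_limit(int fd)  { return fd < 100; }
```

Returning bool from both the callback and the walker lets the caller tell
a complete walk apart from one that was cut short.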
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1759332851-370353-3-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
include/migration/cpr.h | 3 +++
migration/cpr.c | 13 +++++++++++++
2 files changed, 16 insertions(+)
diff --git a/include/migration/cpr.h b/include/migration/cpr.h
index 3fc19a74ef..2b074d7a65 100644
--- a/include/migration/cpr.h
+++ b/include/migration/cpr.h
@@ -34,6 +34,9 @@ void cpr_resave_fd(const char *name, int id, int fd);
int cpr_open_fd(const char *path, int flags, const char *name, int id,
Error **errp);
+typedef bool (*cpr_walk_fd_cb)(int fd);
+bool cpr_walk_fd(cpr_walk_fd_cb cb);
+
MigMode cpr_get_incoming_mode(void);
void cpr_set_incoming_mode(MigMode mode);
bool cpr_is_incoming(void);
diff --git a/migration/cpr.c b/migration/cpr.c
index e0b47df222..6feda78f1b 100644
--- a/migration/cpr.c
+++ b/migration/cpr.c
@@ -122,6 +122,19 @@ int cpr_open_fd(const char *path, int flags, const char *name, int id,
return fd;
}
+bool cpr_walk_fd(cpr_walk_fd_cb cb)
+{
+ CprFd *elem;
+
+ QLIST_FOREACH(elem, &cpr_state.fds, next) {
+ g_assert(elem->fd >= 0);
+ if (!cb(elem->fd)) {
+ return false;
+ }
+ }
+ return true;
+}
+
/*************************************************************************/
static const VMStateDescription vmstate_cpr_state = {
.name = CPR_STATE,
--
2.50.1
* [PULL 39/45] oslib: qemu_clear_cloexec
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (37 preceding siblings ...)
2025-10-03 15:39 ` [PULL 38/45] migration: add cpr_walk_fd Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-03 15:39 ` [PULL 40/45] migration: cpr-exec-command parameter Peter Xu
` (6 subsequent siblings)
45 siblings, 0 replies; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Steve Sistare, Dr. David Alan Gilbert, Marc-André Lureau
From: Steve Sistare <steven.sistare@oracle.com>
Define qemu_clear_cloexec, analogous to qemu_set_cloexec.
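
As a rough standalone illustration of what setting and clearing the flag involves on POSIX hosts, the sketch below toggles FD_CLOEXEC with fcntl(). The helper names `fd_set_cloexec` and `fd_has_cloexec` are hypothetical, and unlike the patch this version returns an error code rather than asserting on fcntl() failure.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>

/*
 * Set (on != 0) or clear (on == 0) FD_CLOEXEC on fd.
 * Returns 0 on success, -1 on error with errno set by fcntl().
 */
int fd_set_cloexec(int fd, int on)
{
    int flags;

    assert(fd >= 0);
    flags = fcntl(fd, F_GETFD);
    if (flags == -1) {
        return -1;
    }
    flags = on ? (flags | FD_CLOEXEC) : (flags & ~FD_CLOEXEC);
    return fcntl(fd, F_SETFD, flags) == -1 ? -1 : 0;
}

/* Returns 1 if FD_CLOEXEC is set on fd, 0 if it is clear, -1 on error. */
int fd_has_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);

    if (flags == -1) {
        return -1;
    }
    return (flags & FD_CLOEXEC) ? 1 : 0;
}
```

Reading the flags with F_GETFD before writing them back keeps any other descriptor flags intact, which is the same read-modify-write shape the patch uses.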
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/1759332851-370353-4-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
include/qemu/osdep.h | 9 +++++++++
util/oslib-posix.c | 9 +++++++++
util/oslib-win32.c | 4 ++++
3 files changed, 22 insertions(+)
diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
index 1b38cb7e45..ed3e511a8e 100644
--- a/include/qemu/osdep.h
+++ b/include/qemu/osdep.h
@@ -689,6 +689,15 @@ ssize_t qemu_write_full(int fd, const void *buf, size_t count)
void qemu_set_cloexec(int fd);
bool qemu_set_blocking(int fd, bool block, Error **errp);
+/*
+ * Clear FD_CLOEXEC for a descriptor.
+ *
+ * The caller must guarantee that no other fork+exec's occur before the
+ * exec that is intended to inherit this descriptor, e.g. by suspending CPUs
+ * and blocking monitor commands.
+ */
+void qemu_clear_cloexec(int fd);
+
/* Return a dynamically allocated directory path that is appropriate for storing
* local state.
*
diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index 14cf94ac03..3c14b72665 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -305,6 +305,15 @@ int qemu_socketpair(int domain, int type, int protocol, int sv[2])
return ret;
}
+void qemu_clear_cloexec(int fd)
+{
+ int f;
+ f = fcntl(fd, F_GETFD);
+ assert(f != -1);
+ f = fcntl(fd, F_SETFD, f & ~FD_CLOEXEC);
+ assert(f != -1);
+}
+
char *
qemu_get_local_state_dir(void)
{
diff --git a/util/oslib-win32.c b/util/oslib-win32.c
index 84bc65a765..839b8a4170 100644
--- a/util/oslib-win32.c
+++ b/util/oslib-win32.c
@@ -219,6 +219,10 @@ void qemu_set_cloexec(int fd)
{
}
+void qemu_clear_cloexec(int fd)
+{
+}
+
int qemu_get_thread_id(void)
{
return GetCurrentThreadId();
--
2.50.1
* [PULL 40/45] migration: cpr-exec-command parameter
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (38 preceding siblings ...)
2025-10-03 15:39 ` [PULL 39/45] oslib: qemu_clear_cloexec Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-03 15:39 ` [PULL 41/45] migration: cpr-exec save and load Peter Xu
` (5 subsequent siblings)
45 siblings, 0 replies; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Steve Sistare, Markus Armbruster
From: Steve Sistare <steven.sistare@oracle.com>
Create the cpr-exec-command migration parameter, defined as a list of
strings. It will be used for cpr-exec migration mode in a subsequent
patch, and contains forward references to cpr-exec mode in the qapi
doc.
No functional change, except that cpr-exec-command is shown by the
'info migrate' command.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Link: https://lore.kernel.org/r/1759332851-370353-5-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
qapi/migration.json | 21 ++++++++++++++++++---
migration/migration-hmp-cmds.c | 30 ++++++++++++++++++++++++++++++
migration/options.c | 14 ++++++++++++++
hmp-commands.hx | 2 +-
4 files changed, 63 insertions(+), 4 deletions(-)
diff --git a/qapi/migration.json b/qapi/migration.json
index 2387c21e9c..2be8fa1d16 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -924,6 +924,10 @@
# only has effect if the @mapped-ram capability is enabled.
# (Since 9.1)
#
+# @cpr-exec-command: Command to start the new QEMU process when @mode
+# is @cpr-exec. The first list element is the program's filename,
+# the remainder its arguments. (Since 10.2)
+#
# Features:
#
# @unstable: Members @x-checkpoint-delay and
@@ -950,7 +954,8 @@
'vcpu-dirty-limit',
'mode',
'zero-page-detection',
- 'direct-io'] }
+ 'direct-io',
+ 'cpr-exec-command'] }
##
# @MigrateSetParameters:
@@ -1105,6 +1110,10 @@
# only has effect if the @mapped-ram capability is enabled.
# (Since 9.1)
#
+# @cpr-exec-command: Command to start the new QEMU process when @mode
+# is @cpr-exec. The first list element is the program's filename,
+# the remainder its arguments. (Since 10.2)
+#
# Features:
#
# @unstable: Members @x-checkpoint-delay and
@@ -1146,7 +1155,8 @@
'*vcpu-dirty-limit': 'uint64',
'*mode': 'MigMode',
'*zero-page-detection': 'ZeroPageDetection',
- '*direct-io': 'bool' } }
+ '*direct-io': 'bool',
+ '*cpr-exec-command': [ 'str' ]} }
##
# @migrate-set-parameters:
@@ -1315,6 +1325,10 @@
# only has effect if the @mapped-ram capability is enabled.
# (Since 9.1)
#
+# @cpr-exec-command: Command to start the new QEMU process when @mode
+# is @cpr-exec. The first list element is the program's filename,
+# the remainder its arguments. (Since 10.2)
+#
# Features:
#
# @unstable: Members @x-checkpoint-delay and
@@ -1353,7 +1367,8 @@
'*vcpu-dirty-limit': 'uint64',
'*mode': 'MigMode',
'*zero-page-detection': 'ZeroPageDetection',
- '*direct-io': 'bool' } }
+ '*direct-io': 'bool',
+ '*cpr-exec-command': [ 'str' ]} }
##
# @query-migrate-parameters:
diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c
index 814221b260..847d18faaa 100644
--- a/migration/migration-hmp-cmds.c
+++ b/migration/migration-hmp-cmds.c
@@ -306,6 +306,18 @@ void hmp_info_migrate_capabilities(Monitor *mon, const QDict *qdict)
qapi_free_MigrationCapabilityStatusList(caps);
}
+static void monitor_print_cpr_exec_command(Monitor *mon, strList *args)
+{
+ monitor_printf(mon, "%s:",
+ MigrationParameter_str(MIGRATION_PARAMETER_CPR_EXEC_COMMAND));
+
+ while (args) {
+ monitor_printf(mon, " %s", args->value);
+ args = args->next;
+ }
+ monitor_printf(mon, "\n");
+}
+
void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
{
MigrationParameters *params;
@@ -437,6 +449,9 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
MIGRATION_PARAMETER_DIRECT_IO),
params->direct_io ? "on" : "off");
}
+
+ assert(params->has_cpr_exec_command);
+ monitor_print_cpr_exec_command(mon, params->cpr_exec_command);
}
qapi_free_MigrationParameters(params);
@@ -718,6 +733,21 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
p->has_direct_io = true;
visit_type_bool(v, param, &p->direct_io, &err);
break;
+ case MIGRATION_PARAMETER_CPR_EXEC_COMMAND: {
+ g_autofree char **strv = NULL;
+ g_autoptr(GError) gerr = NULL;
+ strList **tail = &p->cpr_exec_command;
+
+ if (!g_shell_parse_argv(valuestr, NULL, &strv, &gerr)) {
+ error_setg(&err, "%s", gerr->message);
+ break;
+ }
+ for (int i = 0; strv[i]; i++) {
+ QAPI_LIST_APPEND(tail, strv[i]);
+ }
+ p->has_cpr_exec_command = true;
+ break;
+ }
default:
g_assert_not_reached();
}
diff --git a/migration/options.c b/migration/options.c
index 4e923a2e07..5183112775 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -959,6 +959,9 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp)
params->zero_page_detection = s->parameters.zero_page_detection;
params->has_direct_io = true;
params->direct_io = s->parameters.direct_io;
+ params->has_cpr_exec_command = true;
+ params->cpr_exec_command = QAPI_CLONE(strList,
+ s->parameters.cpr_exec_command);
return params;
}
@@ -993,6 +996,7 @@ void migrate_params_init(MigrationParameters *params)
params->has_mode = true;
params->has_zero_page_detection = true;
params->has_direct_io = true;
+ params->has_cpr_exec_command = true;
}
/*
@@ -1297,6 +1301,10 @@ static void migrate_params_test_apply(MigrateSetParameters *params,
if (params->has_direct_io) {
dest->direct_io = params->direct_io;
}
+
+ if (params->has_cpr_exec_command) {
+ dest->cpr_exec_command = params->cpr_exec_command;
+ }
}
static void migrate_params_apply(MigrateSetParameters *params, Error **errp)
@@ -1429,6 +1437,12 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp)
if (params->has_direct_io) {
s->parameters.direct_io = params->direct_io;
}
+
+ if (params->has_cpr_exec_command) {
+ qapi_free_strList(s->parameters.cpr_exec_command);
+ s->parameters.cpr_exec_command =
+ QAPI_CLONE(strList, params->cpr_exec_command);
+ }
}
void qmp_migrate_set_parameters(MigrateSetParameters *params, Error **errp)
diff --git a/hmp-commands.hx b/hmp-commands.hx
index d0e4f35a30..3cace8f1f7 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1009,7 +1009,7 @@ ERST
{
.name = "migrate_set_parameter",
- .args_type = "parameter:s,value:s",
+ .args_type = "parameter:s,value:S",
.params = "parameter value",
.help = "Set the parameter for migration",
.cmd = hmp_migrate_set_parameter,
--
2.50.1
* [PULL 41/45] migration: cpr-exec save and load
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (39 preceding siblings ...)
2025-10-03 15:39 ` [PULL 40/45] migration: cpr-exec-command parameter Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-03 15:39 ` [PULL 42/45] migration: cpr-exec mode Peter Xu
` (4 subsequent siblings)
45 siblings, 0 replies; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Steve Sistare
From: Steve Sistare <steven.sistare@oracle.com>
To preserve CPR state across exec, create a QEMUFile based on a memfd, and
keep the memfd open across exec. Save the value of the memfd in an
environment variable so post-exec QEMU can find it.
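
The hand-off of a descriptor number through the environment can be sketched in isolation as below. This is a simplified illustration, not the patch's code: the variable name `EXAMPLE_CPR_STATE_FD` and the helpers `persist_fd` and `find_persisted_fd` are hypothetical, it uses plain libc instead of GLib, and it reports a missing or malformed value with -1 where the patch asserts.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

#define STATE_ENV "EXAMPLE_CPR_STATE_FD"   /* hypothetical variable name */

/* Record fd in the environment. Returns 0 on success, -1 on error. */
int persist_fd(int fd)
{
    char val[16];

    assert(fd >= 0);
    snprintf(val, sizeof(val), "%d", fd);
    return setenv(STATE_ENV, val, 1);
}

/*
 * Fetch and remove the recorded fd. Returns the fd, or -1 if the variable
 * is absent or does not hold a valid non-negative decimal integer.
 */
int find_persisted_fd(void)
{
    const char *val = getenv(STATE_ENV);
    char *end;
    long n;
    int ok;

    if (!val) {
        return -1;
    }
    /* Parse before unsetenv(): val points into the environment itself. */
    errno = 0;
    n = strtol(val, &end, 10);
    ok = !errno && end != val && *end == '\0' && n >= 0 && n <= INT_MAX;
    unsetenv(STATE_ENV);
    return ok ? (int)n : -1;
}
```

Removing the variable once it has been consumed means a later exec of an unrelated child does not see a stale descriptor number.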
These new functions are called in a subsequent patch.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Link: https://lore.kernel.org/r/1759332851-370353-6-git-send-email-steven.sistare@oracle.com
[peterx: fix build for Windows]
Signed-off-by: Peter Xu <peterx@redhat.com>
---
include/migration/cpr.h | 5 +++
migration/cpr-exec.c | 99 +++++++++++++++++++++++++++++++++++++++++
migration/meson.build | 1 +
3 files changed, 105 insertions(+)
create mode 100644 migration/cpr-exec.c
diff --git a/include/migration/cpr.h b/include/migration/cpr.h
index 2b074d7a65..b84389ff04 100644
--- a/include/migration/cpr.h
+++ b/include/migration/cpr.h
@@ -53,4 +53,9 @@ int cpr_get_fd_param(const char *name, const char *fdname, int index,
QEMUFile *cpr_transfer_output(MigrationChannel *channel, Error **errp);
QEMUFile *cpr_transfer_input(MigrationChannel *channel, Error **errp);
+QEMUFile *cpr_exec_output(Error **errp);
+QEMUFile *cpr_exec_input(Error **errp);
+void cpr_exec_persist_state(QEMUFile *f);
+bool cpr_exec_has_state(void);
+void cpr_exec_unpersist_state(void);
#endif
diff --git a/migration/cpr-exec.c b/migration/cpr-exec.c
new file mode 100644
index 0000000000..81d84425e1
--- /dev/null
+++ b/migration/cpr-exec.c
@@ -0,0 +1,99 @@
+/*
+ * Copyright (c) 2021-2025 Oracle and/or its affiliates.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/cutils.h"
+#include "qemu/memfd.h"
+#include "qapi/error.h"
+#include "io/channel-file.h"
+#include "io/channel-socket.h"
+#include "migration/cpr.h"
+#include "migration/qemu-file.h"
+#include "migration/misc.h"
+#include "migration/vmstate.h"
+#include "system/runstate.h"
+
+#define CPR_EXEC_STATE_NAME "QEMU_CPR_EXEC_STATE"
+
+static QEMUFile *qemu_file_new_fd_input(int fd, const char *name)
+{
+ g_autoptr(QIOChannelFile) fioc = qio_channel_file_new_fd(fd);
+ QIOChannel *ioc = QIO_CHANNEL(fioc);
+ qio_channel_set_name(ioc, name);
+ return qemu_file_new_input(ioc);
+}
+
+static QEMUFile *qemu_file_new_fd_output(int fd, const char *name)
+{
+ g_autoptr(QIOChannelFile) fioc = qio_channel_file_new_fd(fd);
+ QIOChannel *ioc = QIO_CHANNEL(fioc);
+ qio_channel_set_name(ioc, name);
+ return qemu_file_new_output(ioc);
+}
+
+void cpr_exec_persist_state(QEMUFile *f)
+{
+ QIOChannelFile *fioc = QIO_CHANNEL_FILE(qemu_file_get_ioc(f));
+ int mfd = dup(fioc->fd);
+ char val[16];
+
+ /* Remember mfd in environment for post-exec load */
+ qemu_clear_cloexec(mfd);
+ snprintf(val, sizeof(val), "%d", mfd);
+ g_setenv(CPR_EXEC_STATE_NAME, val, 1);
+}
+
+static int cpr_exec_find_state(void)
+{
+ const char *val = g_getenv(CPR_EXEC_STATE_NAME);
+ int mfd;
+
+ assert(val);
+ g_unsetenv(CPR_EXEC_STATE_NAME);
+ assert(!qemu_strtoi(val, NULL, 10, &mfd));
+ return mfd;
+}
+
+bool cpr_exec_has_state(void)
+{
+ return g_getenv(CPR_EXEC_STATE_NAME) != NULL;
+}
+
+void cpr_exec_unpersist_state(void)
+{
+ int mfd;
+ const char *val = g_getenv(CPR_EXEC_STATE_NAME);
+
+ g_unsetenv(CPR_EXEC_STATE_NAME);
+ assert(val);
+ assert(!qemu_strtoi(val, NULL, 10, &mfd));
+ close(mfd);
+}
+
+QEMUFile *cpr_exec_output(Error **errp)
+{
+ int mfd;
+
+#ifdef CONFIG_LINUX
+ mfd = qemu_memfd_create(CPR_EXEC_STATE_NAME, 0, false, 0, 0, errp);
+#else
+ mfd = -1;
+#endif
+
+ if (mfd < 0) {
+ return NULL;
+ }
+
+ return qemu_file_new_fd_output(mfd, CPR_EXEC_STATE_NAME);
+}
+
+QEMUFile *cpr_exec_input(Error **errp)
+{
+ int mfd = cpr_exec_find_state();
+
+ lseek(mfd, 0, SEEK_SET);
+ return qemu_file_new_fd_input(mfd, CPR_EXEC_STATE_NAME);
+}
diff --git a/migration/meson.build b/migration/meson.build
index 0f71544a82..16909d54c5 100644
--- a/migration/meson.build
+++ b/migration/meson.build
@@ -16,6 +16,7 @@ system_ss.add(files(
'channel-block.c',
'cpr.c',
'cpr-transfer.c',
+ 'cpr-exec.c',
'cpu-throttle.c',
'dirtyrate.c',
'exec.c',
--
2.50.1
* [PULL 42/45] migration: cpr-exec mode
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (40 preceding siblings ...)
2025-10-03 15:39 ` [PULL 41/45] migration: cpr-exec save and load Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-03 15:39 ` [PULL 43/45] migration: cpr-exec docs Peter Xu
` (3 subsequent siblings)
45 siblings, 0 replies; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Steve Sistare, Markus Armbruster
From: Steve Sistare <steven.sistare@oracle.com>
Add the cpr-exec migration mode. Usage:
qemu-system-$arch -machine aux-ram-share=on ...
migrate_set_parameter mode cpr-exec
migrate_set_parameter cpr-exec-command \
<arg1> <arg2> ... -incoming <uri-1>
migrate -d <uri-1>
The migrate command stops the VM, saves state to uri-1,
directly exec's a new version of QEMU on the same host,
replacing the original process while retaining its PID, and
loads state from uri-1. Guest RAM is preserved in place,
albeit with new virtual addresses.
The new QEMU process is started by exec'ing the command
specified by the @cpr-exec-command parameter. The first word of
the command is the binary, and the remaining words are its
arguments. The command may be a direct invocation of new QEMU,
or may be a non-QEMU command that exec's the new QEMU binary.
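
The exec step and its failure path can be sketched outside QEMU roughly as follows. This is a minimal illustration under stated assumptions: `mark_fds` and `exec_keeping_fds` are hypothetical names, the preserved descriptors are passed as a plain array, and all migration state handling is omitted.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Set (on != 0) or clear FD_CLOEXEC on every fd in the array. */
static void mark_fds(const int *fds, size_t n, int on)
{
    for (size_t i = 0; i < n; i++) {
        int flags = fcntl(fds[i], F_GETFD);

        if (flags != -1) {
            flags = on ? (flags | FD_CLOEXEC) : (flags & ~FD_CLOEXEC);
            fcntl(fds[i], F_SETFD, flags);
        }
    }
}

/*
 * Replace the current process with argv[0], letting fds[] survive the exec.
 * Does not return on success. On failure, restores FD_CLOEXEC on fds[] and
 * returns the errno value reported by execvp().
 */
int exec_keeping_fds(char *const argv[], const int *fds, size_t n)
{
    int saved_errno;

    assert(argv && argv[0]);
    mark_fds(fds, n, 0);
    execvp(argv[0], argv);

    /* Capture errno first: the cleanup below may overwrite it. */
    saved_errno = errno;
    mark_fds(fds, n, 1);
    return saved_errno;
}
```

Because execvp() only returns on failure, everything after the call is error handling; saving errno immediately and keeping argv alive until the error has been reported avoids the two easy mistakes in this path.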
This mode creates a second migration channel that is not visible
to the user. At the start of migration, old QEMU saves CPR state
to the second channel, and at the end of migration, it tells the
main loop to call cpr_exec. New QEMU loads CPR state early, before
objects are created.
Because old QEMU terminates when new QEMU starts, one cannot
stream data between the two, so uri-1 must be a type,
such as a file, that accepts all data before old QEMU exits.
Otherwise, old QEMU may quietly block writing to the channel.
Memory-backend objects must have the share=on attribute, but
memory-backend-epc is not supported. The VM must be started with
the '-machine aux-ram-share=on' option, which allows anonymous
memory to be transferred in place to the new process. The memfds
are kept open across exec by clearing the close-on-exec flag, their
values are saved in CPR state, and they are mmap'd in new QEMU.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Link: https://lore.kernel.org/r/1759332851-370353-7-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
qapi/migration.json | 25 ++++++++++-
include/migration/cpr.h | 2 +
migration/cpr-exec.c | 95 +++++++++++++++++++++++++++++++++++++++
migration/cpr.c | 23 +++++++++-
migration/migration.c | 10 ++++-
migration/ram.c | 1 +
migration/vmstate-types.c | 8 ++++
system/vl.c | 4 +-
migration/trace-events | 1 +
9 files changed, 164 insertions(+), 5 deletions(-)
diff --git a/qapi/migration.json b/qapi/migration.json
index 2be8fa1d16..be0f3fcc12 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -694,9 +694,32 @@
# until you issue the `migrate-incoming` command.
#
# (since 10.0)
+#
+# @cpr-exec: The migrate command stops the VM, saves state to the
+# migration channel, directly exec's a new version of QEMU on the
+# same host, replacing the original process while retaining its
+# PID, and loads state from the channel. Guest RAM is preserved
+# in place. Devices and their pinned pages are also preserved for
+# VFIO and IOMMUFD.
+#
+# Old QEMU starts new QEMU by exec'ing the command specified by
+# the @cpr-exec-command parameter. The command may be a direct
+# invocation of new QEMU, or may be a wrapper that exec's the new
+# QEMU binary.
+#
+# Because old QEMU terminates when new QEMU starts, one cannot
+# stream data between the two, so the channel must be a type,
+# such as a file, that accepts all data before old QEMU exits.
+# Otherwise, old QEMU may quietly block writing to the channel.
+#
+# Memory-backend objects must have the share=on attribute, but
+# memory-backend-epc is not supported. The VM must be started
+# with the '-machine aux-ram-share=on' option.
+#
+# (since 10.2)
##
{ 'enum': 'MigMode',
- 'data': [ 'normal', 'cpr-reboot', 'cpr-transfer' ] }
+ 'data': [ 'normal', 'cpr-reboot', 'cpr-transfer', 'cpr-exec' ] }
##
# @ZeroPageDetection:
diff --git a/include/migration/cpr.h b/include/migration/cpr.h
index b84389ff04..a412d6663c 100644
--- a/include/migration/cpr.h
+++ b/include/migration/cpr.h
@@ -53,9 +53,11 @@ int cpr_get_fd_param(const char *name, const char *fdname, int index,
QEMUFile *cpr_transfer_output(MigrationChannel *channel, Error **errp);
QEMUFile *cpr_transfer_input(MigrationChannel *channel, Error **errp);
+void cpr_exec_init(void);
QEMUFile *cpr_exec_output(Error **errp);
QEMUFile *cpr_exec_input(Error **errp);
void cpr_exec_persist_state(QEMUFile *f);
bool cpr_exec_has_state(void);
void cpr_exec_unpersist_state(void);
+void cpr_exec_unpreserve_fds(void);
#endif
diff --git a/migration/cpr-exec.c b/migration/cpr-exec.c
index 81d84425e1..d57714bc5d 100644
--- a/migration/cpr-exec.c
+++ b/migration/cpr-exec.c
@@ -6,15 +6,21 @@
#include "qemu/osdep.h"
#include "qemu/cutils.h"
+#include "qemu/error-report.h"
#include "qemu/memfd.h"
#include "qapi/error.h"
+#include "qapi/type-helpers.h"
#include "io/channel-file.h"
#include "io/channel-socket.h"
+#include "block/block-global-state.h"
+#include "qemu/main-loop.h"
#include "migration/cpr.h"
#include "migration/qemu-file.h"
+#include "migration/migration.h"
#include "migration/misc.h"
#include "migration/vmstate.h"
#include "system/runstate.h"
+#include "trace.h"
#define CPR_EXEC_STATE_NAME "QEMU_CPR_EXEC_STATE"
@@ -97,3 +103,92 @@ QEMUFile *cpr_exec_input(Error **errp)
lseek(mfd, 0, SEEK_SET);
return qemu_file_new_fd_input(mfd, CPR_EXEC_STATE_NAME);
}
+
+static bool preserve_fd(int fd)
+{
+ qemu_clear_cloexec(fd);
+ return true;
+}
+
+static bool unpreserve_fd(int fd)
+{
+ qemu_set_cloexec(fd);
+ return true;
+}
+
+static void cpr_exec_preserve_fds(void)
+{
+ cpr_walk_fd(preserve_fd);
+}
+
+void cpr_exec_unpreserve_fds(void)
+{
+ cpr_walk_fd(unpreserve_fd);
+}
+
+static void cpr_exec_cb(void *opaque)
+{
+ MigrationState *s = migrate_get_current();
+ char **argv = strv_from_str_list(s->parameters.cpr_exec_command);
+ Error *err = NULL;
+
+ /*
+ * Clear the close-on-exec flag for all preserved fd's. We cannot do so
+ * earlier because they should not persist across miscellaneous fork and
+ * exec calls that are performed during normal operation.
+ */
+ cpr_exec_preserve_fds();
+
+ trace_cpr_exec();
+ execvp(argv[0], argv);
+
+ /*
+ * exec should only fail if argv[0] is bogus, or has a permissions problem,
+ * or the system is very short on resources.
+ */
+ error_setg_errno(&err, errno, "execvp %s failed", argv[0]);
+ g_strfreev(argv);
+ cpr_exec_unpreserve_fds();
+
+ error_report_err(error_copy(err));
+ migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
+ migrate_set_error(s, err);
+
+ /* Note, we can go from state COMPLETED to FAILED */
+ migration_call_notifiers(s, MIG_EVENT_PRECOPY_FAILED, NULL);
+
+ err = NULL;
+ if (!migration_block_activate(&err)) {
+ /* error was already reported */
+ error_free(err);
+ return;
+ }
+
+ if (runstate_is_live(s->vm_old_state)) {
+ vm_start();
+ }
+}
+
+static int cpr_exec_notifier(NotifierWithReturn *notifier, MigrationEvent *e,
+ Error **errp)
+{
+ MigrationState *s = migrate_get_current();
+
+ if (e->type == MIG_EVENT_PRECOPY_DONE) {
+ QEMUBH *cpr_exec_bh = qemu_bh_new(cpr_exec_cb, NULL);
+ assert(s->state == MIGRATION_STATUS_COMPLETED);
+ qemu_bh_schedule(cpr_exec_bh);
+ qemu_notify_event();
+ } else if (e->type == MIG_EVENT_PRECOPY_FAILED) {
+ cpr_exec_unpersist_state();
+ }
+ return 0;
+}
+
+void cpr_exec_init(void)
+{
+ static NotifierWithReturn exec_notifier;
+
+ migration_add_notifier_mode(&exec_notifier, cpr_exec_notifier,
+ MIG_MODE_CPR_EXEC);
+}
diff --git a/migration/cpr.c b/migration/cpr.c
index 6feda78f1b..22dbac7c72 100644
--- a/migration/cpr.c
+++ b/migration/cpr.c
@@ -6,6 +6,7 @@
*/
#include "qemu/osdep.h"
+#include "qemu/error-report.h"
#include "qapi/error.h"
#include "qemu/error-report.h"
#include "hw/vfio/vfio-device.h"
@@ -186,6 +187,8 @@ int cpr_state_save(MigrationChannel *channel, Error **errp)
if (mode == MIG_MODE_CPR_TRANSFER) {
g_assert(channel);
f = cpr_transfer_output(channel, errp);
+ } else if (mode == MIG_MODE_CPR_EXEC) {
+ f = cpr_exec_output(errp);
} else {
return 0;
}
@@ -202,6 +205,10 @@ int cpr_state_save(MigrationChannel *channel, Error **errp)
return ret;
}
+ if (migrate_mode() == MIG_MODE_CPR_EXEC) {
+ cpr_exec_persist_state(f);
+ }
+
/*
* Close the socket only partially so we can later detect when the other
* end closes by getting a HUP event.
@@ -220,7 +227,13 @@ int cpr_state_load(MigrationChannel *channel, Error **errp)
QEMUFile *f;
MigMode mode = 0;
- if (channel) {
+ if (cpr_exec_has_state()) {
+ mode = MIG_MODE_CPR_EXEC;
+ f = cpr_exec_input(errp);
+ if (channel) {
+ warn_report("ignoring cpr channel for migration mode cpr-exec");
+ }
+ } else if (channel) {
mode = MIG_MODE_CPR_TRANSFER;
cpr_set_incoming_mode(mode);
f = cpr_transfer_input(channel, errp);
@@ -232,6 +245,7 @@ int cpr_state_load(MigrationChannel *channel, Error **errp)
}
trace_cpr_state_load(MigMode_str(mode));
+ cpr_set_incoming_mode(mode);
v = qemu_get_be32(f);
if (v != QEMU_CPR_FILE_MAGIC) {
@@ -252,6 +266,11 @@ int cpr_state_load(MigrationChannel *channel, Error **errp)
return ret;
}
+ if (migrate_mode() == MIG_MODE_CPR_EXEC) {
+ /* Set cloexec to prevent fd leaks from fork until the next cpr-exec */
+ cpr_exec_unpreserve_fds();
+ }
+
/*
* Let the caller decide when to close the socket (and generate a HUP event
* for the sending side).
@@ -272,7 +291,7 @@ void cpr_state_close(void)
bool cpr_incoming_needed(void *opaque)
{
MigMode mode = migrate_mode();
- return mode == MIG_MODE_CPR_TRANSFER;
+ return mode == MIG_MODE_CPR_TRANSFER || mode == MIG_MODE_CPR_EXEC;
}
/*
diff --git a/migration/migration.c b/migration/migration.c
index a399735f02..a63b46bbef 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -333,6 +333,7 @@ void migration_object_init(void)
ram_mig_init();
dirty_bitmap_mig_init();
+ cpr_exec_init();
/* Initialize cpu throttle timers */
cpu_throttle_init();
@@ -1807,7 +1808,8 @@ bool migrate_mode_is_cpr(MigrationState *s)
{
MigMode mode = s->parameters.mode;
return mode == MIG_MODE_CPR_REBOOT ||
- mode == MIG_MODE_CPR_TRANSFER;
+ mode == MIG_MODE_CPR_TRANSFER ||
+ mode == MIG_MODE_CPR_EXEC;
}
int migrate_init(MigrationState *s, Error **errp)
@@ -2156,6 +2158,12 @@ static bool migrate_prepare(MigrationState *s, bool resume, Error **errp)
return false;
}
+ if (migrate_mode() == MIG_MODE_CPR_EXEC &&
+ !s->parameters.has_cpr_exec_command) {
+ error_setg(errp, "cpr-exec mode requires setting cpr-exec-command");
+ return false;
+ }
+
if (migration_is_blocked(errp)) {
return false;
}
diff --git a/migration/ram.c b/migration/ram.c
index a8e8d2cc67..9aac89638a 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -228,6 +228,7 @@ bool migrate_ram_is_ignored(RAMBlock *block)
MigMode mode = migrate_mode();
return !qemu_ram_is_migratable(block) ||
mode == MIG_MODE_CPR_TRANSFER ||
+ mode == MIG_MODE_CPR_EXEC ||
(migrate_ignore_shared() && qemu_ram_is_shared(block)
&& qemu_ram_is_named_file(block));
}
diff --git a/migration/vmstate-types.c b/migration/vmstate-types.c
index a1cd7a95fa..4b01dc19c2 100644
--- a/migration/vmstate-types.c
+++ b/migration/vmstate-types.c
@@ -322,6 +322,10 @@ static int get_fd(QEMUFile *f, void *pv, size_t size,
const VMStateField *field)
{
int32_t *v = pv;
+ if (migrate_mode() == MIG_MODE_CPR_EXEC) {
+ qemu_get_sbe32s(f, v);
+ return 0;
+ }
*v = qemu_file_get_fd(f);
return 0;
}
@@ -330,6 +334,10 @@ static int put_fd(QEMUFile *f, void *pv, size_t size,
const VMStateField *field, JSONWriter *vmdesc)
{
int32_t *v = pv;
+ if (migrate_mode() == MIG_MODE_CPR_EXEC) {
+ qemu_put_sbe32s(f, v);
+ return 0;
+ }
return qemu_file_put_fd(f, *v);
}
diff --git a/system/vl.c b/system/vl.c
index 00f3694725..646239e4a6 100644
--- a/system/vl.c
+++ b/system/vl.c
@@ -3837,6 +3837,8 @@ void qemu_init(int argc, char **argv)
}
qemu_init_displays();
accel_setup_post(current_machine);
- os_setup_post();
+ if (migrate_mode() != MIG_MODE_CPR_EXEC) {
+ os_setup_post();
+ }
resume_mux_open();
}
diff --git a/migration/trace-events b/migration/trace-events
index 706db97def..e8edd1fbba 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -354,6 +354,7 @@ cpr_state_save(const char *mode) "%s mode"
cpr_state_load(const char *mode) "%s mode"
cpr_transfer_input(const char *path) "%s"
cpr_transfer_output(const char *path) "%s"
+cpr_exec(void) ""
# block-dirty-bitmap.c
send_bitmap_header_enter(void) ""
--
2.50.1
* [PULL 43/45] migration: cpr-exec docs
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (41 preceding siblings ...)
2025-10-03 15:39 ` [PULL 42/45] migration: cpr-exec mode Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-03 15:39 ` [PULL 44/45] vfio: cpr-exec mode Peter Xu
` (2 subsequent siblings)
45 siblings, 0 replies; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Steve Sistare
From: Steve Sistare <steven.sistare@oracle.com>
Update developer documentation for cpr-exec mode.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/1759332851-370353-8-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
docs/devel/migration/CPR.rst | 112 ++++++++++++++++++++++++++++++++++-
1 file changed, 111 insertions(+), 1 deletion(-)
diff --git a/docs/devel/migration/CPR.rst b/docs/devel/migration/CPR.rst
index 0a0fd4f6dc..b6178568a8 100644
--- a/docs/devel/migration/CPR.rst
+++ b/docs/devel/migration/CPR.rst
@@ -5,7 +5,7 @@ CPR is the umbrella name for a set of migration modes in which the
VM is migrated to a new QEMU instance on the same host. It is
intended for use when the goal is to update host software components
that run the VM, such as QEMU or even the host kernel. At this time,
-the cpr-reboot and cpr-transfer modes are available.
+the cpr-reboot, cpr-transfer, and cpr-exec modes are available.
Because QEMU is restarted on the same host, with access to the same
local devices, CPR is allowed in certain cases where normal migration
@@ -324,3 +324,113 @@ descriptors from old to new QEMU. In the future, descriptors for
vhost, and char devices could be transferred,
preserving those devices and their kernel state without interruption,
even if they do not explicitly support live migration.
+
+cpr-exec mode
+-------------
+
+In this mode, QEMU stops the VM, writes VM state to the migration
+URI, and directly exec's a new version of QEMU on the same host,
+replacing the original process while retaining its PID. Guest RAM is
+preserved in place, albeit with new virtual addresses. The user
+completes the migration by specifying the ``-incoming`` option, and
+by issuing the ``migrate-incoming`` command if necessary; see details
+below.
+
+This mode supports VFIO/IOMMUFD devices by preserving device
+descriptors and hence kernel state across the exec, even for devices
+that do not support live migration.
+
+Because the old and new QEMU instances are not active concurrently,
+the URI cannot be a type that streams data from one instance to the
+other.
+
+This mode does not require a channel of type ``cpr``. The information
+that is passed over that channel for cpr-transfer mode is instead
+serialized to a memfd. The number of the fd is saved in the
+QEMU_CPR_EXEC_STATE environment variable across the exec of new QEMU,
+and new QEMU reads the state back from the memfd.
+
+Usage
+^^^^^
+
+Arguments for the new QEMU process are taken from the
+``cpr-exec-command`` parameter. The first argument should be the
+path of a new QEMU binary, or a prefix command that exec's the
+new QEMU binary, and the arguments should include the ``-incoming``
+option.
+
+Memory backend objects must have the ``share=on`` attribute.
+The VM must be started with the ``-machine aux-ram-share=on`` option.
+
+Outgoing:
+ * Set the migration mode parameter to ``cpr-exec``.
+ * Set the ``cpr-exec-command`` parameter.
+ * Issue the ``migrate`` command. It is recommended that the URI be
+ a ``file`` type, but one can use other types such as ``exec``,
+ provided the command captures all the data from the outgoing side,
+ and provides all the data to the incoming side.
+
+Incoming:
+ * You do not need to explicitly start new QEMU. It is started as
+ a side effect of the migrate command above.
+ * If the VM was running when the outgoing ``migrate`` command was
+ issued, then QEMU automatically resumes VM execution.
+
+Example 1: incoming URI
+^^^^^^^^^^^^^^^^^^^^^^^
+
+In these examples, we simply restart the same version of QEMU, but in
+a real scenario one would set a new QEMU binary path in
+cpr-exec-command.
+
+::
+
+ # qemu-kvm -monitor stdio
+ -object memory-backend-memfd,id=ram0,size=4G
+ -machine memory-backend=ram0
+ -machine aux-ram-share=on
+ ...
+
+ QEMU 10.2.50 monitor - type 'help' for more information
+ (qemu) info status
+ VM status: running
+ (qemu) migrate_set_parameter mode cpr-exec
+ (qemu) migrate_set_parameter cpr-exec-command qemu-kvm ... -incoming file:vm.state
+ (qemu) migrate -d file:vm.state
+ (qemu) QEMU 10.2.50 monitor - type 'help' for more information
+ (qemu) info status
+ VM status: running
+
+Example 2: incoming defer
+^^^^^^^^^^^^^^^^^^^^^^^^^
+::
+
+ # qemu-kvm -monitor stdio
+ -object memory-backend-memfd,id=ram0,size=4G
+ -machine memory-backend=ram0
+ -machine aux-ram-share=on
+ ...
+
+ QEMU 10.2.50 monitor - type 'help' for more information
+ (qemu) info status
+ VM status: running
+ (qemu) migrate_set_parameter mode cpr-exec
+ (qemu) migrate_set_parameter cpr-exec-command qemu-kvm ... -incoming defer
+ (qemu) migrate -d file:vm.state
+ (qemu) QEMU 10.2.50 monitor - type 'help' for more information
+ (qemu) info status
+ VM status: paused (inmigrate)
+ (qemu) migrate_incoming file:vm.state
+ (qemu) info status
+ VM status: running
+
+Caveats
+^^^^^^^
+
+cpr-exec mode may not be used with postcopy, background-snapshot,
+or COLO.
+
+cpr-exec mode requires permission to use the exec system call, which
+is denied by certain sandbox options, such as spawn.
+
+The guest pause time increases for large guest RAM backed by small pages.
--
2.50.1
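
A minimal C sketch of the descriptor handoff that the docs in this patch describe: a descriptor is kept open across exec, its number is published in an environment variable, and the new process reads the number back. This is not the QEMU implementation. The variable name SKETCH_STATE_FD and all three helper functions are hypothetical, and the sketch works on any open descriptor rather than creating and mapping a memfd.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical name for this sketch; the docs name QEMU_CPR_EXEC_STATE. */
#define STATE_ENV "SKETCH_STATE_FD"

/* Clear FD_CLOEXEC so the descriptor survives exec. Returns 0 on success. */
static int keep_fd_across_exec(int fd)
{
    int flags = fcntl(fd, F_GETFD);

    if (flags < 0) {
        return -1;
    }
    return fcntl(fd, F_SETFD, flags & ~FD_CLOEXEC);
}

/* Old process: record the descriptor number in the environment. */
static int publish_state_fd(int fd)
{
    char buf[32];

    assert(fd >= 0);
    snprintf(buf, sizeof(buf), "%d", fd);
    return setenv(STATE_ENV, buf, 1);
}

/* New process: read the number back. Returns -1 if unset or malformed. */
static int recover_state_fd(void)
{
    const char *val = getenv(STATE_ENV);
    char *end;
    long fd;

    if (!val || !*val) {
        return -1;
    }
    fd = strtol(val, &end, 10);
    if (*end != '\0' || fd < 0 || fd > INT_MAX) {
        return -1;
    }
    return (int)fd;
}
```

In this sketch the old process would call keep_fd_across_exec() and publish_state_fd() just before execv(), and the new process would call recover_state_fd() early in startup and then map or read the descriptor.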
* [PULL 44/45] vfio: cpr-exec mode
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (42 preceding siblings ...)
2025-10-03 15:39 ` [PULL 43/45] migration: cpr-exec docs Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-03 15:39 ` [PULL 45/45] migration-test: test cpr-exec Peter Xu
2025-10-04 17:53 ` [PULL 00/45] Staging patches Richard Henderson
45 siblings, 0 replies; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Steve Sistare, Cédric Le Goater
From: Steve Sistare <steven.sistare@oracle.com>
All blockers and notifiers for cpr-transfer mode also apply to cpr-exec.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Acked-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/30750362-d4a1-4392-8dd6-016624d01be1@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
hw/vfio/container-legacy.c | 3 ++-
hw/vfio/cpr-iommufd.c | 3 ++-
hw/vfio/cpr-legacy.c | 9 +++++----
hw/vfio/cpr.c | 13 +++++++------
4 files changed, 16 insertions(+), 12 deletions(-)
diff --git a/hw/vfio/container-legacy.c b/hw/vfio/container-legacy.c
index c0f87f774a..c0540f2bdc 100644
--- a/hw/vfio/container-legacy.c
+++ b/hw/vfio/container-legacy.c
@@ -990,7 +990,8 @@ static bool vfio_legacy_attach_device(const char *name, VFIODevice *vbasedev,
error_setg(&vbasedev->cpr.mdev_blocker,
"CPR does not support vfio mdev %s", vbasedev->name);
if (migrate_add_blocker_modes(&vbasedev->cpr.mdev_blocker, errp,
- MIG_MODE_CPR_TRANSFER, -1) < 0) {
+ MIG_MODE_CPR_TRANSFER, MIG_MODE_CPR_EXEC,
+ -1) < 0) {
goto hiod_unref_exit;
}
}
diff --git a/hw/vfio/cpr-iommufd.c b/hw/vfio/cpr-iommufd.c
index 1d70c87996..8a4d65de5e 100644
--- a/hw/vfio/cpr-iommufd.c
+++ b/hw/vfio/cpr-iommufd.c
@@ -159,7 +159,8 @@ bool vfio_iommufd_cpr_register_iommufd(IOMMUFDBackend *be, Error **errp)
if (!vfio_cpr_supported(be, cpr_blocker)) {
return migrate_add_blocker_modes(cpr_blocker, errp,
- MIG_MODE_CPR_TRANSFER, -1) == 0;
+ MIG_MODE_CPR_TRANSFER,
+ MIG_MODE_CPR_EXEC, -1) == 0;
}
vmstate_register(NULL, -1, &iommufd_cpr_vmstate, be);
diff --git a/hw/vfio/cpr-legacy.c b/hw/vfio/cpr-legacy.c
index bbf7a0d35f..1a16cb1188 100644
--- a/hw/vfio/cpr-legacy.c
+++ b/hw/vfio/cpr-legacy.c
@@ -179,16 +179,17 @@ bool vfio_legacy_cpr_register_container(VFIOLegacyContainer *container,
if (!vfio_cpr_supported(container, cpr_blocker)) {
return migrate_add_blocker_modes(cpr_blocker, errp,
- MIG_MODE_CPR_TRANSFER, -1) == 0;
+ MIG_MODE_CPR_TRANSFER,
+ MIG_MODE_CPR_EXEC, -1) == 0;
}
vfio_cpr_add_kvm_notifier();
vmstate_register(NULL, -1, &vfio_container_vmstate, container);
- migration_add_notifier_mode(&container->cpr.transfer_notifier,
- vfio_cpr_fail_notifier,
- MIG_MODE_CPR_TRANSFER);
+ migration_add_notifier_modes(&container->cpr.transfer_notifier,
+ vfio_cpr_fail_notifier,
+ MIG_MODE_CPR_TRANSFER, MIG_MODE_CPR_EXEC, -1);
return true;
}
diff --git a/hw/vfio/cpr.c b/hw/vfio/cpr.c
index 2c71fc1e8e..db462aabcb 100644
--- a/hw/vfio/cpr.c
+++ b/hw/vfio/cpr.c
@@ -195,9 +195,10 @@ static int vfio_cpr_kvm_close_notifier(NotifierWithReturn *notifier,
void vfio_cpr_add_kvm_notifier(void)
{
if (!kvm_close_notifier.notify) {
- migration_add_notifier_mode(&kvm_close_notifier,
- vfio_cpr_kvm_close_notifier,
- MIG_MODE_CPR_TRANSFER);
+ migration_add_notifier_modes(&kvm_close_notifier,
+ vfio_cpr_kvm_close_notifier,
+ MIG_MODE_CPR_TRANSFER, MIG_MODE_CPR_EXEC,
+ -1);
}
}
@@ -282,9 +283,9 @@ static int vfio_cpr_pci_notifier(NotifierWithReturn *notifier,
void vfio_cpr_pci_register_device(VFIOPCIDevice *vdev)
{
- migration_add_notifier_mode(&vdev->cpr.transfer_notifier,
- vfio_cpr_pci_notifier,
- MIG_MODE_CPR_TRANSFER);
+ migration_add_notifier_modes(&vdev->cpr.transfer_notifier,
+ vfio_cpr_pci_notifier,
+ MIG_MODE_CPR_TRANSFER, MIG_MODE_CPR_EXEC, -1);
}
void vfio_cpr_pci_unregister_device(VFIOPCIDevice *vdev)
--
2.50.1
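
The calls in this patch pass a list of migration modes terminated by -1. A minimal C sketch of how such a sentinel-terminated variadic list can be folded into a bitmask follows. It only illustrates the calling convention. The enum values and both helpers are hypothetical and are not taken from the QEMU sources, whose internal handling may differ.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the migration mode values. */
enum sketch_mode {
    SKETCH_MODE_NORMAL = 0,
    SKETCH_MODE_CPR_TRANSFER = 1,
    SKETCH_MODE_CPR_EXEC = 2,
    SKETCH_MODE_COUNT
};

/* Fold a list of modes, terminated by -1, into a bitmask. */
static unsigned modes_to_mask(int first, ...)
{
    unsigned mask = 0;
    va_list ap;
    int mode;

    va_start(ap, first);
    for (mode = first; mode != -1; mode = va_arg(ap, int)) {
        /* Reject values outside the known range. */
        assert(mode >= 0 && mode < SKETCH_MODE_COUNT);
        mask |= 1u << mode;
    }
    va_end(ap);
    return mask;
}

/* True if the given mode is present in the mask. */
static bool mask_has_mode(unsigned mask, int mode)
{
    return (mask & (1u << mode)) != 0;
}
```

The terminator must always be present. If it is omitted, va_arg reads past the supplied arguments, which is undefined behavior.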
* [PULL 45/45] migration-test: test cpr-exec
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (43 preceding siblings ...)
2025-10-03 15:39 ` [PULL 44/45] vfio: cpr-exec mode Peter Xu
@ 2025-10-03 15:39 ` Peter Xu
2025-10-04 17:53 ` [PULL 00/45] Staging patches Richard Henderson
45 siblings, 0 replies; 49+ messages in thread
From: Peter Xu @ 2025-10-03 15:39 UTC (permalink / raw)
To: Peter Maydell, qemu-devel
Cc: Fabiano Rosas, peterx, David Hildenbrand, Paolo Bonzini,
Steve Sistare
From: Steve Sistare <steven.sistare@oracle.com>
Add a test for the cpr-exec migration mode.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Link: https://lore.kernel.org/r/1759332851-370353-20-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
tests/qtest/migration/cpr-tests.c | 133 ++++++++++++++++++++++++++++++
1 file changed, 133 insertions(+)
diff --git a/tests/qtest/migration/cpr-tests.c b/tests/qtest/migration/cpr-tests.c
index c4ce60ff66..9388ad64be 100644
--- a/tests/qtest/migration/cpr-tests.c
+++ b/tests/qtest/migration/cpr-tests.c
@@ -113,6 +113,138 @@ static void test_mode_transfer_defer(void)
test_mode_transfer_common(true);
}
+static void set_cpr_exec_args(QTestState *who, MigrateCommon *args)
+{
+ g_autofree char *qtest_from_args = NULL;
+ g_autofree char *from_args = NULL;
+ g_autofree char *to_args = NULL;
+ g_autofree char *exec_args = NULL;
+ g_auto(GStrv) argv = NULL;
+ char *from_str, *src, *dst;
+ int ret;
+
+ /*
+ * hide_stderr appends "2>/dev/null" to the command line, but cpr-exec
+ * passes the command-line words to execv, not to the shell, so suppress it
+ * here. fd 2 was already bound in the source VM, and execv preserves it.
+ */
+ g_assert(args->start.hide_stderr == false);
+
+ ret = migrate_args(&from_args, &to_args, args->listen_uri, &args->start);
+ g_assert(!ret);
+ qtest_from_args = qtest_qemu_args(from_args);
+
+ /*
+ * The generated args may have been formatted using "%s %s" with empty
+ * strings, which can produce consecutive spaces, which g_strsplit would
+ * convert into empty strings. Ditto for leading and trailing space.
+ * De-dup spaces to avoid that.
+ */
+
+ from_str = src = dst = g_strstrip(qtest_from_args);
+ do {
+ if (*src != ' ' || src[-1] != ' ') {
+ *dst++ = *src;
+ }
+ } while (*src++);
+
+ exec_args = g_strconcat(qtest_qemu_binary(migration_get_env()->qemu_dst),
+ " -incoming defer ", from_str, NULL);
+ argv = g_strsplit(exec_args, " ", -1);
+ migrate_set_parameter_strv(who, "cpr-exec-command", argv);
+}
+
+static void wait_for_migration_event(QTestState *who, const char *waitfor)
+{
+ QDict *rsp, *data;
+ const char *status;
+ bool done = false;
+
+ while (!done) {
+ rsp = qtest_qmp_eventwait_ref(who, "MIGRATION");
+ g_assert(qdict_haskey(rsp, "data"));
+ data = qdict_get_qdict(rsp, "data");
+ g_assert(qdict_haskey(data, "status"));
+ status = qdict_get_str(data, "status");
+ g_assert(strcmp(status, "failed"));
+ done = !strcmp(status, waitfor);
+ qobject_unref(rsp);
+ }
+}
+
+static void test_cpr_exec(MigrateCommon *args)
+{
+ QTestState *from, *to;
+ void *data_hook = NULL;
+ g_autofree char *connect_uri = g_strdup(args->connect_uri);
+ g_autofree char *filename = g_strdup_printf("%s/%s", tmpfs,
+ FILE_TEST_FILENAME);
+
+ if (migrate_start(&from, NULL, args->listen_uri, &args->start)) {
+ return;
+ }
+
+ /* Source and dest never run concurrently */
+ g_assert_false(args->live);
+
+ if (args->start_hook) {
+ data_hook = args->start_hook(from, NULL);
+ }
+
+ wait_for_serial("src_serial");
+ set_cpr_exec_args(from, args);
+ migrate_set_capability(from, "events", true);
+ migrate_qmp(from, NULL, connect_uri, NULL, "{}");
+ wait_for_migration_event(from, "completed");
+
+ to = qtest_init_after_exec(from);
+
+ qtest_qmp_assert_success(to, "{ 'execute': 'migrate-incoming',"
+ " 'arguments': { "
+ " 'channels': [ { 'channel-type': 'main',"
+ " 'addr': { 'transport': 'file',"
+ " 'filename': %s,"
+ " 'offset': 0 } } ] } }",
+ filename);
+ wait_for_migration_complete(to);
+
+ wait_for_resume(to, get_dst());
+ /* Device on target is still named src_serial because args do not change */
+ wait_for_serial("src_serial");
+
+ if (args->end_hook) {
+ args->end_hook(from, to, data_hook);
+ }
+
+ migrate_end(from, to, args->result == MIG_TEST_SUCCEED);
+}
+
+static void *test_mode_exec_start(QTestState *from, QTestState *to)
+{
+ assert(!to);
+ migrate_set_parameter_str(from, "mode", "cpr-exec");
+ return NULL;
+}
+
+static void test_mode_exec(void)
+{
+ g_autofree char *uri = g_strdup_printf("file:%s/%s", tmpfs,
+ FILE_TEST_FILENAME);
+ g_autofree char *listen_uri = g_strdup_printf("defer");
+
+ MigrateCommon args = {
+ .start.only_source = true,
+ .start.opts_source = "-machine aux-ram-share=on -nodefaults",
+ .start.memory_backend = "-object memory-backend-memfd,id=pc.ram,size=%s"
+ " -machine memory-backend=pc.ram",
+ .connect_uri = uri,
+ .listen_uri = listen_uri,
+ .start_hook = test_mode_exec_start,
+ };
+
+ test_cpr_exec(&args);
+}
+
void migration_test_add_cpr(MigrationTestEnv *env)
{
tmpfs = env->tmpfs;
@@ -135,5 +267,6 @@ void migration_test_add_cpr(MigrationTestEnv *env)
migration_test_add("/migration/mode/transfer", test_mode_transfer);
migration_test_add("/migration/mode/transfer/defer",
test_mode_transfer_defer);
+ migration_test_add("/migration/mode/exec", test_mode_exec);
}
}
--
2.50.1
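
The comment in set_cpr_exec_args() explains that consecutive spaces must be collapsed before the string is split on " ", otherwise the split would yield empty argv entries. A standalone C sketch of that de-dup step follows. The function name collapse_spaces is hypothetical. Unlike the loop in the patch, it tracks the previous character in a local variable instead of reading src[-1].

```c
#include <assert.h>
#include <string.h>

/*
 * Collapse each run of spaces in s to a single space, in place.
 * Leading and trailing spaces are not stripped here; the caller is
 * expected to strip them first, as the test does with g_strstrip().
 */
static char *collapse_spaces(char *s)
{
    char *src = s;
    char *dst = s;
    char prev = '\0';

    assert(s != NULL);
    do {
        /* Copy everything except a space that follows another space. */
        if (*src != ' ' || prev != ' ') {
            *dst++ = *src;
        }
        prev = *src;
    } while (*src++);   /* the terminating NUL is copied before exit */

    return s;
}
```

After this step, splitting the result on a single space gives one argv entry per word.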
* Re: [PULL 00/45] Staging patches
2025-10-03 15:39 [PULL 00/45] Staging patches Peter Xu
` (44 preceding siblings ...)
2025-10-03 15:39 ` [PULL 45/45] migration-test: test cpr-exec Peter Xu
@ 2025-10-04 17:53 ` Richard Henderson
45 siblings, 0 replies; 49+ messages in thread
From: Richard Henderson @ 2025-10-04 17:53 UTC (permalink / raw)
To: qemu-devel
On 10/3/25 08:39, Peter Xu wrote:
> The following changes since commit 517e9b4862cc9798b7a24b1935d94c2f96787f12:
>
> Merge tag 'qtest-20251001-pull-request' of https://gitlab.com/farosas/qemu into staging (2025-10-01 15:03:00 -0700)
>
> are available in the Git repository at:
>
> https://gitlab.com/peterx/qemu.git tags/staging-pull-request
>
> for you to fetch changes up to 27cffe16354816d57710d2d4357f16139405c749:
>
> migration-test: test cpr-exec (2025-10-03 09:48:02 -0400)
>
> ----------------------------------------------------------------
> Migration/Memory Pull for 10.2
>
> - PeterX's fix on tls warning for preempt channel when migration completes
> - Arun's series to enhance error reporting for vTPM and migration framework
> - PeterX's patch to cleanup multifd send TLS BYE messages
> - Juraj's fix on postcopy start state transition when switchover failed
> - Yanfei's fix to migrate APIC before VFIO-PCI to avoid irq fallbacks
> - Dan's cleanup to simplify error reporting in qemu_fill_buffer()
> - PeterM's fix on address space leak when cpu hot plug / unplug
> - Steve's cpr-exec wholeset
Applied, thanks. Please update https://wiki.qemu.org/ChangeLog/10.2 as appropriate.
r~
* iotest 233 is failing (was: [PULL 27/45] io/crypto: Move tls premature termination handling into QIO layer)
2025-10-03 15:39 ` [PULL 27/45] io/crypto: Move tls premature termination handling into QIO layer Peter Xu
@ 2025-10-10 8:00 ` Thomas Huth
2025-10-10 8:35 ` iotest 233 is failing Thomas Huth
0 siblings, 1 reply; 49+ messages in thread
From: Thomas Huth @ 2025-10-10 8:00 UTC (permalink / raw)
To: Peter Xu, Peter Maydell, qemu-devel
Cc: Fabiano Rosas, David Hildenbrand, Paolo Bonzini,
Daniel P. Berrangé, Juraj Marcin, Kevin Wolf, Qemu-block
On 03/10/2025 17.39, Peter Xu wrote:
> QCryptoTLSSession allows TLS premature termination in two cases, one of the
> cases is when the channel shutdown() is invoked on READ side.
Hi Peter,
this patch breaks iotest 233 for me:
thuth:~/tmp/qemu-build$ cd tests/qemu-iotests/
thuth:~/tmp/qemu-build/tests/qemu-iotests$ ./check 233
QEMU -- "/home/thuth/tmp/qemu-build/qemu-system-x86_64" -nodefaults
-display none -accel qtest
QEMU_IMG -- "/home/thuth/tmp/qemu-build/qemu-img"
QEMU_IO -- "/home/thuth/tmp/qemu-build/qemu-io" --cache writeback
--aio threads -f raw
QEMU_NBD -- "/home/thuth/tmp/qemu-build/qemu-nbd"
IMGFMT -- raw
IMGPROTO -- file
PLATFORM -- Linux/x86_64 thuth-p1g4 6.16.10-200.fc42.x86_64
TEST_DIR -- /home/thuth/tmp/qemu-build/tests/qemu-iotests/scratch
SOCK_DIR -- /tmp/qemu-iotests-eidif2rs
GDB_OPTIONS --
VALGRIND_QEMU --
PRINT_QEMU_OUTPUT --
233 fail [09:58:28] [09:58:30] 2.5s (last: 2.0s) output
mismatch (see
/home/thuth/tmp/qemu-build/tests/qemu-iotests/scratch/raw-file-233/233.out.bad)
--- /home/thuth/devel/qemu/tests/qemu-iotests/233.out
+++
/home/thuth/tmp/qemu-build/tests/qemu-iotests/scratch/raw-file-233/233.out.bad
@@ -43,51 +43,37 @@
== check TLS fail over TCP with mismatched hostname ==
qemu-img: Could not open
'driver=nbd,host=localhost,port=PORT,tls-creds=tls0': Certificate does not
match the hostname localhost
-qemu-nbd: Certificate does not match the hostname localhost
+qemu-nbd: Failed to read initial magic: Unable to read from socket:
Connection reset by peer
== check TLS works over TCP with mismatched hostname and override ==
-image: nbd://localhost:PORT
-file format: nbd
-virtual size: 64 MiB (67108864 bytes)
-disk size: unavailable
-exports available: 1
- export: ''
- size: 67108864
- min block: 1
- transaction size: 64-bit
+qemu-img: Could not open
'driver=nbd,host=localhost,port=PORT,tls-creds=tls0,tls-hostname=127.0.0.1':
Failed to connect to 'localhost:PORT': Connection refused
+qemu-nbd: Failed to connect to 'localhost:10809': Connection refused
== check TLS with different CA fails ==
-qemu-img: Could not open
'driver=nbd,host=127.0.0.1,port=PORT,tls-creds=tls0': The certificate hasn't
got a known issuer
-qemu-nbd: The certificate hasn't got a known issuer
+qemu-img: Could not open
'driver=nbd,host=127.0.0.1,port=PORT,tls-creds=tls0': Failed to connect to
'127.0.0.1:PORT': Connection refused
+qemu-nbd: Failed to connect to '127.0.0.1:10809': Connection refused
== perform I/O over TLS ==
-read 1048576/1048576 bytes at offset 1048576
-1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-wrote 1048576/1048576 bytes at offset 1048576
-1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+qemu-io: can't open: Failed to connect to '127.0.0.1:10809': Connection refused
+Pattern verification failed at offset 1048576, 1048576 bytes
read 1048576/1048576 bytes at offset 1048576
1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
== check TLS with authorization ==
-qemu-img: Could not open
'driver=nbd,host=127.0.0.1,port=PORT,tls-creds=tls0': Failed to read option
reply: Cannot read from TLS channel: The TLS connection was non-properly
terminated.
-qemu-img: Could not open
'driver=nbd,host=127.0.0.1,port=PORT,tls-creds=tls0': Failed to read option
reply: Cannot read from TLS channel: The TLS connection was non-properly
terminated.
+./common.nbd: line 38: kill: (545045) - No such process
+./common.rc: line 208: 545147 Segmentation fault (core dumped) (
VALGRIND_QEMU="${VALGRIND_QEMU_IMG}" _qemu_proc_exec "${VALGRIND_LOGFILE}"
"$QEMU_IMG_PROG" $QEMU_IMG_OPTIONS "$@" )
+./common.rc: line 208: 545163 Segmentation fault (core dumped) (
VALGRIND_QEMU="${VALGRIND_QEMU_IMG}" _qemu_proc_exec "${VALGRIND_LOGFILE}"
"$QEMU_IMG_PROG" $QEMU_IMG_OPTIONS "$@" )
== check TLS fail over UNIX with no hostname ==
qemu-img: Could not open
'driver=nbd,path=SOCK_DIR/qemu-nbd.sock,tls-creds=tls0': No hostname for
certificate validation
-qemu-nbd: No hostname for certificate validation
+qemu-nbd: Failed to read initial magic: Unable to read from socket:
Connection reset by peer
== check TLS works over UNIX with hostname override ==
-image: nbd+unix://?socket=SOCK_DIR/qemu-nbd.sock
-file format: nbd
-virtual size: 64 MiB (67108864 bytes)
-disk size: unavailable
-exports available: 1
- export: ''
- size: 67108864
- min block: 1
- transaction size: 64-bit
+qemu-img: Could not open
'driver=nbd,path=SOCK_DIR/qemu-nbd.sock,tls-creds=tls0,tls-hostname=127.0.0.1':
Failed to connect to
'/tmp/qemu-iotests-eidif2rs/raw-file-233/qemu-nbd.sock': Connection refused
+qemu-nbd: Failed to connect to
'/tmp/qemu-iotests-eidif2rs/raw-file-233/qemu-nbd.sock': Connection refused
== check TLS works over UNIX with PSK ==
+./common.nbd: line 38: kill: (545184) - No such process
image: nbd+unix://?socket=SOCK_DIR/qemu-nbd.sock
file format: nbd
virtual size: 64 MiB (67108864 bytes)
@@ -103,14 +89,8 @@
qemu-nbd: TLS handshake failed: The TLS connection was non-properly
terminated.
== final server log ==
-qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read
from TLS channel: The TLS connection was non-properly terminated.
-qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read
from TLS channel: The TLS connection was non-properly terminated.
-qemu-nbd: option negotiation failed: Verify failed: No certificate was found.
-qemu-nbd: option negotiation failed: Verify failed: No certificate was found.
qemu-nbd: option negotiation failed: TLS x509 authz check for
DISTINGUISHED-NAME is denied
qemu-nbd: option negotiation failed: TLS x509 authz check for
DISTINGUISHED-NAME is denied
-qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read
from TLS channel: The TLS connection was non-properly terminated.
-qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read
from TLS channel: The TLS connection was non-properly terminated.
qemu-nbd: option negotiation failed: TLS handshake failed: An illegal
parameter has been received.
qemu-nbd: option negotiation failed: TLS handshake failed: An illegal
parameter has been received.
*** done
Failures: 233
Failed 1 of 1 iotests
Could you please have a look?
Thanks,
Thomas
* Re: iotest 233 is failing
2025-10-10 8:00 ` iotest 233 is failing (was: [PULL 27/45] io/crypto: Move tls premature termination handling into QIO layer) Thomas Huth
@ 2025-10-10 8:35 ` Thomas Huth
0 siblings, 0 replies; 49+ messages in thread
From: Thomas Huth @ 2025-10-10 8:35 UTC (permalink / raw)
To: Peter Xu, Peter Maydell, qemu-devel
Cc: Fabiano Rosas, David Hildenbrand, Paolo Bonzini,
Daniel P. Berrangé, Juraj Marcin, Kevin Wolf, Qemu-block
On 10/10/2025 10.00, Thomas Huth wrote:
> On 03/10/2025 17.39, Peter Xu wrote:
>> QCryptoTLSSession allows TLS premature termination in two cases, one of the
>> cases is when the channel shutdown() is invoked on READ side.
> Hi Peter,
>
> this patch breaks iotest 233 for me:
...
> Could you please have a look?
Never mind, Daniel just told me that there is already a patch available:
https://lore.kernel.org/qemu-devel/20251006190126.4159590-1-berrange@redhat.com/
Thomas