From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Arun Menon <armenon@redhat.com>
Cc: qemu-devel@nongnu.org, "Michael S. Tsirkin" <mst@redhat.com>,
"Marcel Apfelbaum" <marcel.apfelbaum@gmail.com>,
"Cornelia Huck" <cohuck@redhat.com>,
"Halil Pasic" <pasic@linux.ibm.com>,
"Eric Farman" <farman@linux.ibm.com>,
"Richard Henderson" <richard.henderson@linaro.org>,
"David Hildenbrand" <david@redhat.com>,
"Ilya Leoshkevich" <iii@linux.ibm.com>,
"Thomas Huth" <thuth@redhat.com>,
"Christian Borntraeger" <borntraeger@linux.ibm.com>,
"Paolo Bonzini" <pbonzini@redhat.com>,
"Fam Zheng" <fam@euphon.net>,
"Nicholas Piggin" <npiggin@gmail.com>,
"Daniel Henrique Barboza" <danielhb413@gmail.com>,
"Harsh Prateek Bora" <harshpb@linux.ibm.com>,
"Alex Williamson" <alex.williamson@redhat.com>,
"Cédric Le Goater" <clg@redhat.com>,
"Peter Xu" <peterx@redhat.com>, "Fabiano Rosas" <farosas@suse.de>,
"Hailiang Zhang" <zhanghailiang@xfusion.com>,
"Steve Sistare" <steven.sistare@oracle.com>,
qemu-s390x@nongnu.org, qemu-ppc@nongnu.org,
"Stefan Berger" <stefanb@linux.vnet.ibm.com>
Subject: Re: [PATCH v3 1/3] migration: Pass Error object errp into vm state loading functions
Date: Wed, 2 Jul 2025 13:08:51 +0100 [thread overview]
Message-ID: <aGUhU_Sjsp68oTWp@redhat.com> (raw)
In-Reply-To: <20250702-propagate_tpm_error-v3-1-986d94540528@redhat.com>
On Wed, Jul 02, 2025 at 05:06:50PM +0530, Arun Menon wrote:
> - This is an incremental step in converting vmstate loading
> code to report errors.
> - Minimal changes to the signature and body of the following
> functions are done,
> - vmstate_load()
> - vmstate_load_state()
> - vmstate_subsection_load()
> - qemu_load_device_state()
> - qemu_loadvm_state()
> - qemu_loadvm_section_start_full()
> - qemu_loadvm_section_part_end()
> - qemu_loadvm_state_header()
> - qemu_loadvm_state_main()
> - error_report() has been replaced with error_setg();
> and in places where error has been already set,
> error_prepend() is used to not lose information.
>
> Signed-off-by: Arun Menon <armenon@redhat.com>
> ---
> hw/display/virtio-gpu.c | 2 +-
> hw/pci/pci.c | 2 +-
> hw/s390x/virtio-ccw.c | 2 +-
> hw/scsi/spapr_vscsi.c | 2 +-
> hw/vfio/pci.c | 2 +-
> hw/virtio/virtio-mmio.c | 2 +-
> hw/virtio/virtio-pci.c | 2 +-
> hw/virtio/virtio.c | 4 +-
> include/migration/vmstate.h | 2 +-
> migration/colo.c | 13 +++--
> migration/cpr.c | 2 +-
> migration/migration.c | 19 ++++--
> migration/savevm.c | 137 +++++++++++++++++++++++++++-----------------
> migration/savevm.h | 7 ++-
> migration/vmstate-types.c | 10 ++--
> migration/vmstate.c | 40 +++++++------
> tests/unit/test-vmstate.c | 18 +++---
> 17 files changed, 158 insertions(+), 108 deletions(-)
>
> diff --git a/migration/colo.c b/migration/colo.c
> index e0f713c837f5da25d67afbd02ceb6c54024ca3af..808c980b4af3199968771cdc6cca3c2451a2f4a6 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -659,6 +659,7 @@ void migrate_start_colo_process(MigrationState *s)
> static void colo_incoming_process_checkpoint(MigrationIncomingState *mis,
> QEMUFile *fb, QIOChannelBuffer *bioc, Error **errp)
> {
> + ERRP_GUARD();
> uint64_t total_size;
> uint64_t value;
> Error *local_err = NULL;
> @@ -686,11 +687,13 @@ static void colo_incoming_process_checkpoint(MigrationIncomingState *mis,
>
> bql_lock();
> cpu_synchronize_all_states();
> - ret = qemu_loadvm_state_main(mis->from_src_file, mis);
> + ret = qemu_loadvm_state_main(mis->from_src_file, mis, &local_err);
> bql_unlock();
>
> if (ret < 0) {
> - error_setg(errp, "Load VM's live state (ram) error");
> + if (local_err != NULL) {
> + error_prepend(errp, "Load VM's live state (ram) error");
> + }
This doesn't look right to me.
The old code would unconditionally report an error, but this code
now only appends an error if 'qemu_loadvm_state_main' already
reported an error - if 'qemu_loadvm_state_main' was silent this
method reports nothing too.
IMHO this is a bad design for qemu_loadvm_state_main - when we
add an 'errp' to that method we *MUST* ensure it fills that
in *all* possible error code paths.
This code should then unconditionally use error_prepend().
> return;
> }
>
> @@ -729,9 +732,11 @@ static void colo_incoming_process_checkpoint(MigrationIncomingState *mis,
> bql_lock();
> vmstate_loading = true;
> colo_flush_ram_cache();
> - ret = qemu_load_device_state(fb);
> + ret = qemu_load_device_state(fb, &local_err);
> if (ret < 0) {
> - error_setg(errp, "COLO: load device state failed");
> + if (*errp != NULL) {
> + error_prepend(errp, "COLO: load device state failed");
> + }
Same flawed design here - qemu_load_device_state must guarantee it always
fills its 'errp' parameter on failure code paths.
There are more instances of this general error reporting problem through
this patch
IMHO this patch is changing too many methods at once to be confident when
reviewing it.
This would be better changing 1 single method and its *immediate* callers
only.
IOW, if we have a sequence a() -> b() -> c() -> d() all of which
currently use 'error_report', don't try to change the whole call
chain at once.
First add an "errp' to 'd()', and make 'c()' use 'error_report_err'.
Then add an "errp' to 'c'()', and make 'b()' use 'error_report_err',
...repeat..
This is what I tried todo previously to address this problem in
migration code
https://lists.gnu.org/archive/html/qemu-devel/2021-02/msg01727.html
Though those patches are horribly outdated now, so can't be used as
is, IMHO they show the right kind of incremental conversion
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
next prev parent reply other threads:[~2025-07-02 12:10 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-07-02 11:36 [PATCH v3 0/3] migration: propagate vTPM errors using Error objects Arun Menon
2025-07-02 11:36 ` [PATCH v3 1/3] migration: Pass Error object errp into vm state loading functions Arun Menon
2025-07-02 12:08 ` Daniel P. Berrangé [this message]
2025-07-02 14:42 ` Arun Menon
2025-07-02 13:01 ` Markus Armbruster
2025-07-02 14:53 ` Arun Menon
2025-07-02 11:36 ` [PATCH v3 2/3] migration: Introduce a post_load_with_error hook Arun Menon
2025-07-02 11:53 ` Daniel P. Berrangé
2025-07-02 11:36 ` [PATCH v3 3/3] backends/tpm: Propagate vTPM error on migration failure Arun Menon
2025-07-02 11:55 ` Daniel P. Berrangé
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aGUhU_Sjsp68oTWp@redhat.com \
--to=berrange@redhat.com \
--cc=alex.williamson@redhat.com \
--cc=armenon@redhat.com \
--cc=borntraeger@linux.ibm.com \
--cc=clg@redhat.com \
--cc=cohuck@redhat.com \
--cc=danielhb413@gmail.com \
--cc=david@redhat.com \
--cc=fam@euphon.net \
--cc=farman@linux.ibm.com \
--cc=farosas@suse.de \
--cc=harshpb@linux.ibm.com \
--cc=iii@linux.ibm.com \
--cc=marcel.apfelbaum@gmail.com \
--cc=mst@redhat.com \
--cc=npiggin@gmail.com \
--cc=pasic@linux.ibm.com \
--cc=pbonzini@redhat.com \
--cc=peterx@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=qemu-ppc@nongnu.org \
--cc=qemu-s390x@nongnu.org \
--cc=richard.henderson@linaro.org \
--cc=stefanb@linux.vnet.ibm.com \
--cc=steven.sistare@oracle.com \
--cc=thuth@redhat.com \
--cc=zhanghailiang@xfusion.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.