From: Fabiano Rosas <farosas@suse.de>
To: Arun Menon <armenon@redhat.com>, qemu-devel@nongnu.org
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
"Marcel Apfelbaum" <marcel.apfelbaum@gmail.com>,
"Cornelia Huck" <cohuck@redhat.com>,
"Halil Pasic" <pasic@linux.ibm.com>,
"Eric Farman" <farman@linux.ibm.com>,
"Richard Henderson" <richard.henderson@linaro.org>,
"David Hildenbrand" <david@redhat.com>,
"Ilya Leoshkevich" <iii@linux.ibm.com>,
"Thomas Huth" <thuth@redhat.com>,
"Christian Borntraeger" <borntraeger@linux.ibm.com>,
"Paolo Bonzini" <pbonzini@redhat.com>,
"Fam Zheng" <fam@euphon.net>,
"Nicholas Piggin" <npiggin@gmail.com>,
"Daniel Henrique Barboza" <danielhb413@gmail.com>,
"Harsh Prateek Bora" <harshpb@linux.ibm.com>,
"Alex Williamson" <alex.williamson@redhat.com>,
"Cédric Le Goater" <clg@redhat.com>,
"Peter Xu" <peterx@redhat.com>,
"Hailiang Zhang" <zhanghailiang@xfusion.com>,
"Steve Sistare" <steven.sistare@oracle.com>,
qemu-s390x@nongnu.org, qemu-ppc@nongnu.org,
"Stefan Berger" <stefanb@linux.vnet.ibm.com>,
"Marc-André Lureau" <marcandre.lureau@redhat.com>,
"Alex Bennée" <alex.bennee@linaro.org>,
"Akihiko Odaki" <odaki@rsg.ci.i.u-tokyo.ac.jp>,
"Dmitry Osipenko" <dmitry.osipenko@collabora.com>,
"Matthew Rosato" <mjrosato@linux.ibm.com>,
"Arun Menon" <armenon@redhat.com>,
"Stefan Berger" <stefanb@linux.vnet.ibm.com>
Subject: Re: [PATCH v4 00/23] migration: propagate vTPM errors using Error objects
Date: Wed, 16 Jul 2025 16:58:52 -0300
Message-ID: <87zfd3lxfn.fsf@suse.de>
In-Reply-To: <20250716-propagate_tpm_error-v4-0-7141902077c0@redhat.com>
Arun Menon <armenon@redhat.com> writes:
> Hello,
>
> Currently, when a migration of a VM with an encrypted vTPM
> fails on the destination host (e.g., due to a mismatch in secret values),
> the error message displayed on the source host is generic and unhelpful.
>
> For example, a typical error looks like this:
> "operation failed: job 'migration out' failed: Sibling indicated error 1.
> operation failed: job 'migration in' failed: load of migration failed:
> Input/output error"
>
> This message does not provide any specific indication of a vTPM failure.
> Such generic errors are logged using error_report(), which prints to
> the console/monitor but does not make the detailed error accessible via
> the QMP query-migrate command.
>
> This series addresses the issue by ensuring that specific TPM error
> messages are propagated via the QEMU Error object.
> To make this possible:
> - A set of functions in the call stack is changed
> to incorporate an Error object as an additional parameter.
> - The TPM backend makes use of a new hook called post_load_errp()
> that explicitly passes an Error object.
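
The shape of such a hook can be sketched outside QEMU. This is only a
standalone mock under stated assumptions, not the code from the series:
MockError, MockVMSD, mock_error_set(), mock_run_post_load() and the two
example hooks are hypothetical names, the error strings are made up for
illustration, and the real post_load_errp() signature in patch 22 may differ.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for QEMU's Error object: just an owned message. */
typedef struct MockError {
    char *msg;
} MockError;

/* Store a copy of msg in *errp. A NULL errp means the caller ignores errors. */
static void mock_error_set(MockError **errp, const char *msg)
{
    if (errp == NULL) {
        return;
    }
    MockError *err = malloc(sizeof(*err));
    size_t len = strlen(msg) + 1;
    if (err == NULL || (err->msg = malloc(len)) == NULL) {
        abort();
    }
    memcpy(err->msg, msg, len);
    *errp = err;
}

static void mock_error_free(MockError *err)
{
    if (err != NULL) {
        free(err->msg);
        free(err);
    }
}

/* Hypothetical description struct with an old-style and a new-style hook. */
typedef struct MockVMSD {
    const char *name;
    int (*post_load)(void *opaque, int version_id);
    int (*post_load_errp)(void *opaque, int version_id, MockError **errp);
} MockVMSD;

/* Prefer the errp variant; otherwise wrap the legacy return code in a
 * generic message, which is all a caller could report without errp. */
static int mock_run_post_load(const MockVMSD *vmsd, void *opaque,
                              int version_id, MockError **errp)
{
    assert(vmsd != NULL);
    if (vmsd->post_load_errp != NULL) {
        return vmsd->post_load_errp(opaque, version_id, errp);
    }
    if (vmsd->post_load != NULL) {
        int ret = vmsd->post_load(opaque, version_id);
        if (ret < 0) {
            mock_error_set(errp, "post_load failed (no details available)");
        }
        return ret;
    }
    return 0;
}

/* Legacy-style hook: can only signal failure through its return value. */
static int mock_legacy_post_load(void *opaque, int version_id)
{
    (void)opaque;
    (void)version_id;
    return -1;
}

/* errp-style hook: reports a device-specific (illustrative) message. */
static int mock_tpm_post_load_errp(void *opaque, int version_id,
                                   MockError **errp)
{
    (void)opaque;
    (void)version_id;
    mock_error_set(errp, "mock-tpm: could not load state (illustrative)");
    return -1;
}
```

With the errp variant, the caller ends up holding the device's own message
instead of a bare return code, which is the property the cover letter is
after.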
>
> It is organized as follows:
> - Patches 1-21 focus on pushing the Error object into the functions
> that are important in the call stack where TPM errors are observed.
> We still need to change the rest of the functions in savevm.c
> so that they also take the errp object for propagating errors.
> - Patch 22 introduces the new variants of the hooks in VMStateDescription
> structure. These hooks should be used in future implementations.
> - Patch 23 focuses on changing the TPM backend such that the errors are
> set in the Error object.
>
> While this series focuses specifically on TPM error reporting during
> live migration, it lays the groundwork for broader improvements.
> Many functions in savevm.c that previously returned an integer now capture
> errors in the Error object, enabling other modules to adopt the
> post_load_errp hook in the future.
>
> One such change previously attempted:
> https://lists.gnu.org/archive/html/qemu-devel/2021-02/msg01727.html
>
> Resolves: https://issues.redhat.com/browse/RHEL-82826
>
> Signed-off-by: Arun Menon <armenon@redhat.com>
> ---
> Changes in v4:
> - Split the patches into smaller ones based on functions. Pass NULL in the
> caller until errp is made available. Every function that has an
> Error **errp object passed to it ensures that it sets the errp object
> in case of failure.
> - A few more functions within loadvm_process_command() now handle errors using
> the errp object. I've converted these for consistency, taking Daniel's
> patches (link above) as a reference.
> - Along with the post_load_errp() hook, other duplicate hooks are also introduced.
> This will enable us to migrate to the newer versions eventually.
> - Fix some semantic errors, like using error_propagate_prepend() in places where
> we need to preserve existing behaviour of accumulating the error in local_err
> and then propagating it to errp. This can be refactored in a later commit.
> - Add more information in commit messages explaining the changes.
> - Link to v3: https://lore.kernel.org/qemu-devel/20250702-propagate_tpm_error-v3-0-986d94540528@redhat.com
>
> Changes in v3:
> - Split the 2nd patch into 2. Introducing post_load_with_error() hook
> has been separated from using it in the backends TPM module. This is
> so that it can be acknowledged.
> - Link to v2: https://lore.kernel.org/qemu-devel/20250627-propagate_tpm_error-v2-0-85990c89da29@redhat.com
>
> Changes in v2:
> - Combine the first two changes into one, focusing on passing the
> Error object (errp) consistently through functions involved in
> loading the VM's state. Other functions are not yet changed.
> - As suggested in the review comment, add null checks for errp
> before adding error messages, preventing crashes.
> We also now correctly set errors when post-copy migration fails.
> - In process_incoming_migration_co(), switch to error_prepend
> instead of error_setg. This means we now null-check local_err in
> the "fail" section before using it, preventing dereferencing issues.
> - Link to v1: https://lore.kernel.org/qemu-devel/20250624-propagate_tpm_error-v1-0-2171487a593d@redhat.com
>
> ---
> Arun Menon (23):
> migration: push Error **errp into vmstate_subsection_load()
> migration: push Error **errp into vmstate_load_state()
> migration: push Error **errp into qemu_loadvm_state_header()
> migration: push Error **errp into vmstate_load()
> migration: push Error **errp into qemu_loadvm_section_start_full()
> migration: push Error **errp into qemu_loadvm_section_part_end()
> migration: push Error **errp into loadvm_process_command()
> migration: push Error **errp into loadvm_handle_cmd_packaged()
> migration: push Error **errp into ram_postcopy_incoming_init()
> migration: push Error **errp into loadvm_postcopy_handle_advise()
> migration: push Error **errp into loadvm_postcopy_handle_listen()
> migration: push Error **errp into loadvm_postcopy_handle_run()
> migration: push Error **errp into loadvm_postcopy_ram_handle_discard()
> migration: make loadvm_postcopy_handle_resume() void
> migration: push Error **errp into loadvm_handle_recv_bitmap()
> migration: push Error **errp into loadvm_process_enable_colo()
> migration: push Error **errp into loadvm_postcopy_handle_switchover_start()
> migration: push Error **errp into qemu_loadvm_state_main()
> migration: push Error **errp into qemu_loadvm_state()
> migration: push Error **errp into qemu_load_device_state()
> migration: Capture error in postcopy_ram_listen_thread()
> migration: Add error-parameterized function variants in VMSD struct
> backends/tpm: Propagate vTPM error on migration failure
>
> backends/tpm/tpm_emulator.c | 39 +++---
> hw/display/virtio-gpu.c | 2 +-
> hw/pci/pci.c | 2 +-
> hw/s390x/virtio-ccw.c | 2 +-
> hw/scsi/spapr_vscsi.c | 2 +-
> hw/vfio/pci.c | 2 +-
> hw/virtio/virtio-mmio.c | 2 +-
> hw/virtio/virtio-pci.c | 2 +-
> hw/virtio/virtio.c | 4 +-
> include/migration/colo.h | 2 +-
> include/migration/vmstate.h | 13 +-
> migration/colo.c | 10 +-
> migration/cpr.c | 4 +-
> migration/migration.c | 19 +--
> migration/postcopy-ram.c | 9 +-
> migration/postcopy-ram.h | 2 +-
> migration/ram.c | 14 +--
> migration/ram.h | 4 +-
> migration/savevm.c | 299 +++++++++++++++++++++++++-------------------
> migration/savevm.h | 7 +-
> migration/vmstate-types.c | 10 +-
> migration/vmstate.c | 83 ++++++++----
> tests/unit/test-vmstate.c | 18 +--
> ui/vdagent.c | 2 +-
> 24 files changed, 325 insertions(+), 228 deletions(-)
> ---
> base-commit: 9a4e273ddec3927920c5958d2226c6b38b543336
> change-id: 20250624-propagate_tpm_error-bf4ae6c23d30
>
> Best regards,

Hi Arun,

make check is failing. Please take a look:

QTEST_LOG=1 QTEST_QEMU_BINARY=./qemu-system-x86_64 \
./tests/qtest/migration-test \
--full -p /x86_64/migration/postcopy/recovery/double-failures/handshake
...
qemu-system-x86_64: ../util/error.c:65: error_setv: Assertion `*errp == NULL' failed.
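
For reference, this assertion fires when an error is set through an errp
that already holds one. Which call site does that in this test is not
established here. The following is a standalone mock of the rule, not
QEMU code: the Mock* types, mock_step() and both mock_load_*() functions
are hypothetical, and the "step A/B" failures are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for QEMU's Error object. */
typedef struct MockError {
    char *msg;
} MockError;

/* Same rule as the failed assertion: a non-NULL errp must not already
 * point at an error when a new one is set. */
static void mock_error_set(MockError **errp, const char *msg)
{
    if (errp == NULL) {
        return;                 /* caller chose to ignore errors */
    }
    assert(*errp == NULL);      /* setting twice is a programming error */
    MockError *err = malloc(sizeof(*err));
    size_t len = strlen(msg) + 1;
    if (err == NULL || (err->msg = malloc(len)) == NULL) {
        abort();
    }
    memcpy(err->msg, msg, len);
    *errp = err;
}

static void mock_error_free(MockError *err)
{
    if (err != NULL) {
        free(err->msg);
        free(err);
    }
}

/* One load step that optionally fails and reports why. */
static int mock_step(bool fail, const char *what, MockError **errp)
{
    if (fail) {
        mock_error_set(errp, what);
        return -1;
    }
    return 0;
}

/* Buggy: continues after the first failure, so with a non-NULL errp the
 * second mock_error_set() call trips the assertion. */
static int mock_load_buggy(MockError **errp)
{
    int ret = mock_step(true, "step A failed", errp);
    ret |= mock_step(true, "step B failed", errp);
    return ret;
}

/* Fixed: return on the first failure, leaving exactly one error set. */
static int mock_load_fixed(MockError **errp)
{
    if (mock_step(true, "step A failed", errp) < 0) {
        return -1;
    }
    return mock_step(true, "step B failed", errp);
}
```

Calling mock_load_buggy() with a real errp aborts in the same way as the
log above, while mock_load_fixed() returns -1 with only the first error
recorded. The buggy variant goes unnoticed when errp is NULL, which is one
reason such a path can survive until errp is finally passed down.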

Thread overview: 27+ messages
2025-07-16 10:42 [PATCH v4 00/23] migration: propagate vTPM errors using Error objects Arun Menon
2025-07-16 10:42 ` [PATCH v4 01/23] migration: push Error **errp into vmstate_subsection_load() Arun Menon
2025-07-16 10:42 ` [PATCH v4 02/23] migration: push Error **errp into vmstate_load_state() Arun Menon
2025-07-16 10:42 ` [PATCH v4 03/23] migration: push Error **errp into qemu_loadvm_state_header() Arun Menon
2025-07-16 10:42 ` [PATCH v4 04/23] migration: push Error **errp into vmstate_load() Arun Menon
2025-07-16 10:42 ` [PATCH v4 05/23] migration: push Error **errp into qemu_loadvm_section_start_full() Arun Menon
2025-07-16 10:42 ` [PATCH v4 06/23] migration: push Error **errp into qemu_loadvm_section_part_end() Arun Menon
2025-07-16 10:42 ` [PATCH v4 07/23] migration: push Error **errp into loadvm_process_command() Arun Menon
2025-07-16 10:42 ` [PATCH v4 08/23] migration: push Error **errp into loadvm_handle_cmd_packaged() Arun Menon
2025-07-16 10:42 ` [PATCH v4 09/23] migration: push Error **errp into ram_postcopy_incoming_init() Arun Menon
2025-07-16 10:43 ` [PATCH v4 10/23] migration: push Error **errp into loadvm_postcopy_handle_advise() Arun Menon
2025-07-16 10:43 ` [PATCH v4 11/23] migration: push Error **errp into loadvm_postcopy_handle_listen() Arun Menon
2025-07-16 10:43 ` [PATCH v4 12/23] migration: push Error **errp into loadvm_postcopy_handle_run() Arun Menon
2025-07-16 10:43 ` [PATCH v4 13/23] migration: push Error **errp into loadvm_postcopy_ram_handle_discard() Arun Menon
2025-07-16 10:43 ` [PATCH v4 14/23] migration: make loadvm_postcopy_handle_resume() void Arun Menon
2025-07-16 10:43 ` [PATCH v4 15/23] migration: push Error **errp into loadvm_handle_recv_bitmap() Arun Menon
2025-07-16 10:43 ` [PATCH v4 16/23] migration: push Error **errp into loadvm_process_enable_colo() Arun Menon
2025-07-16 10:43 ` [PATCH v4 17/23] migration: push Error **errp into loadvm_postcopy_handle_switchover_start() Arun Menon
2025-07-16 10:43 ` [PATCH v4 18/23] migration: push Error **errp into qemu_loadvm_state_main() Arun Menon
2025-07-16 10:43 ` [PATCH v4 19/23] migration: push Error **errp into qemu_loadvm_state() Arun Menon
2025-07-16 10:43 ` [PATCH v4 20/23] migration: push Error **errp into qemu_load_device_state() Arun Menon
2025-07-16 10:43 ` [PATCH v4 21/23] migration: Capture error in postcopy_ram_listen_thread() Arun Menon
2025-07-16 10:43 ` [PATCH v4 22/23] migration: Add error-parameterized function variants in VMSD struct Arun Menon
2025-07-16 10:43 ` [PATCH v4 23/23] backends/tpm: Propagate vTPM error on migration failure Arun Menon
2025-07-16 19:58 ` Fabiano Rosas [this message]
2025-07-16 22:38 ` [PATCH v4 00/23] migration: propagate vTPM errors using Error objects Arun Menon
2025-07-17 12:30 ` Fabiano Rosas