From: "Cédric Le Goater" <clg@redhat.com>
To: qemu-devel@nongnu.org
Cc: "Peter Xu" <peterx@redhat.com>, "Fabiano Rosas" <farosas@suse.de>,
"Alex Williamson" <alex.williamson@redhat.com>,
"Cédric Le Goater" <clg@redhat.com>
Subject: [PATCH 00/14] migration: Improve error reporting
Date: Wed, 7 Feb 2024 14:33:33 +0100
Message-ID: <20240207133347.1115903-1-clg@redhat.com>
Hello,
The motivation behind these changes is to report more detailed errors
to the upper management layer (libvirt), so that it can decide,
depending on the reported error, whether to try migration again later.
This would be useful in cases where migration fails due to a lack of
HW resources on the host. For instance, some adapters can only
initiate a limited number of simultaneous dirty tracking requests, and
this imposes a limit on the number of VMs that can be migrated
simultaneously.
We are not quite ready for such a mechanism, but what we can do first
is to clean up the error reporting in the early save_setup sequence.
This is what the following changes propose, by adding an Error argument
to various handlers and propagating it to the core migration subsystem.
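As a rough illustration of the pattern (not code from the series), the
sketch below shows a setup handler that takes an Error ** argument and
a caller that passes it through. Everything prefixed with demo_ or Demo
is hypothetical, and the Error type here is a simplified stand-in for
QEMU's real Error API, kept only so the example compiles on its own.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Simplified stand-in for QEMU's Error object (assumption, not the real type). */
typedef struct Error {
    char msg[256];
} Error;

/* Stand-in for error_setg(): store a formatted message in *errp. */
static void demo_error_setg(Error **errp, const char *fmt, ...)
{
    va_list ap;
    Error *err;

    if (!errp) {
        return;                 /* caller is not interested in details */
    }
    assert(*errp == NULL);      /* an error must not be set twice */

    err = calloc(1, sizeof(*err));
    if (!err) {
        abort();
    }
    va_start(ap, fmt);
    vsnprintf(err->msg, sizeof(err->msg), fmt, ap);
    va_end(ap);
    *errp = err;
}

static void demo_error_free(Error *err)
{
    free(err);
}

/* Hypothetical handler table whose setup hook now takes an Error **. */
typedef struct DemoHandlers {
    int (*save_setup)(void *opaque, Error **errp);
} DemoHandlers;

/* Hypothetical device with a limited number of dirty tracking slots. */
typedef struct DemoDevice {
    const char *name;
    int trackers_in_use;
    int max_trackers;
} DemoDevice;

/* The handler fills errp with a detailed reason and returns -errno. */
static int demo_device_save_setup(void *opaque, Error **errp)
{
    DemoDevice *dev = opaque;

    if (dev->trackers_in_use >= dev->max_trackers) {
        demo_error_setg(errp, "%s: no free dirty tracking slot (%d/%d in use)",
                        dev->name, dev->trackers_in_use, dev->max_trackers);
        return -EBUSY;
    }
    dev->trackers_in_use++;
    return 0;
}

/* The core passes errp through, so the detailed error reaches its caller. */
static int demo_migration_setup(const DemoHandlers *ops, void *opaque,
                                Error **errp)
{
    return ops->save_setup(opaque, errp);
}
```

With this shape, the caller of demo_migration_setup() gets both a
return code and a human-readable reason, which is the kind of detail a
management layer would need to decide whether a retry makes sense.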
The last patches try to address a related issue found on VMs with
assigned MLX5 VF devices. These adapters have the HW limitation
described above. If dirty tracking setup fails and the return path is
in use, the return-path thread does not terminate, leaving the source
and destination VMs waiting for an event to occur.
The last patch is still an RFC because the correct fix is not obvious
and implies reworking the QEMUFile software construct, built on top of
the QEMU I/O channel.
Thanks,
C.
[1] https://lore.kernel.org/qemu-devel/20240201184853.890471-1-clg@redhat.com/
Cédric Le Goater (14):
migration: Add Error** argument to .save_setup() handler
migration: Add Error** argument to .load_setup() handler
memory: Add Error** argument to .log_global*() handlers
migration: Modify ram_init_bitmaps() to report dirty tracking errors
vfio: Add Error** argument to .set_dirty_page_tracking() handler
vfio: Add Error** argument to vfio_devices_dma_logging_start()
vfio: Add Error** argument to vfio_devices_dma_logging_stop()
vfio: Use new Error** argument in vfio_save_setup()
vfio: Add Error** argument to .vfio_save_config() handler
vfio: Also trace event failures in vfio_save_complete_precopy()
vfio: Extend vfio_set_migration_error() with Error* argument
migration: Report error when shutdown fails
migration: Use migrate_has_error() in close_return_path_on_source()
migration: Fix return-path thread exit
include/exec/memory.h | 12 ++--
include/hw/vfio/vfio-common.h | 2 +-
include/hw/vfio/vfio-container-base.h | 4 +-
include/migration/register.h | 4 +-
hw/i386/xen/xen-hvm.c | 8 +--
hw/ppc/spapr.c | 2 +-
hw/s390x/s390-stattrib.c | 2 +-
hw/vfio/common.c | 96 ++++++++++++++++-----------
hw/vfio/container-base.c | 4 +-
hw/vfio/container.c | 6 +-
hw/vfio/migration.c | 87 +++++++++++++++---------
hw/vfio/pci.c | 5 +-
hw/virtio/vhost.c | 4 +-
migration/block-dirty-bitmap.c | 2 +-
migration/block.c | 2 +-
migration/dirtyrate.c | 24 +++++--
migration/migration.c | 16 ++---
migration/qemu-file.c | 5 +-
migration/ram.c | 40 ++++++++---
migration/savevm.c | 14 ++--
system/memory.c | 37 +++++++----
21 files changed, 236 insertions(+), 140 deletions(-)
--
2.43.0
Thread overview: 65+ messages
2024-02-07 13:33 Cédric Le Goater [this message]
2024-02-07 13:33 ` [PATCH 01/14] migration: Add Error** argument to .save_setup() handler Cédric Le Goater
2024-02-07 20:11 ` Philippe Mathieu-Daudé
2024-02-08 13:27 ` Cédric Le Goater
2024-02-08 4:26 ` Peter Xu
2024-02-12 8:36 ` Avihai Horon
2024-02-12 14:49 ` Cédric Le Goater
2024-02-12 15:57 ` Avihai Horon
2024-02-07 13:33 ` [PATCH 02/14] migration: Add Error** argument to .load_setup() handler Cédric Le Goater
2024-02-07 20:12 ` Philippe Mathieu-Daudé
2024-02-08 4:30 ` Peter Xu
2024-02-09 9:35 ` Cédric Le Goater
2024-02-07 13:33 ` [PATCH 03/14] memory: Add Error** argument to .log_global*() handlers Cédric Le Goater
2024-02-08 5:48 ` Peter Xu
2024-02-09 10:14 ` Cédric Le Goater
2024-02-12 8:43 ` Avihai Horon
2024-02-12 16:36 ` Cédric Le Goater
2024-02-08 15:59 ` Philippe Mathieu-Daudé
2024-02-07 13:33 ` [PATCH 04/14] migration: Modify ram_init_bitmaps() to report dirty tracking errors Cédric Le Goater
2024-02-07 20:15 ` Philippe Mathieu-Daudé
2024-02-12 8:51 ` Avihai Horon
2024-02-12 16:37 ` Cédric Le Goater
2024-02-07 13:33 ` [PATCH 05/14] vfio: Add Error** argument to .set_dirty_page_tracking() handler Cédric Le Goater
2024-02-07 20:16 ` Philippe Mathieu-Daudé
2024-02-07 13:33 ` [PATCH 06/14] vfio: Add Error** argument to vfio_devices_dma_logging_start() Cédric Le Goater
2024-02-07 20:17 ` Philippe Mathieu-Daudé
2024-02-07 13:33 ` [PATCH 07/14] vfio: Add Error** argument to vfio_devices_dma_logging_stop() Cédric Le Goater
2024-02-07 20:18 ` Philippe Mathieu-Daudé
2024-02-07 13:33 ` [PATCH 08/14] vfio: Use new Error** argument in vfio_save_setup() Cédric Le Goater
2024-02-07 20:21 ` Philippe Mathieu-Daudé
2024-02-12 9:17 ` Avihai Horon
2024-02-12 17:54 ` Cédric Le Goater
2024-02-13 13:57 ` Avihai Horon
2024-02-07 13:33 ` [PATCH 09/14] vfio: Add Error** argument to .vfio_save_config() handler Cédric Le Goater
2024-02-07 20:22 ` Philippe Mathieu-Daudé
2024-02-12 9:21 ` Avihai Horon
2024-02-07 13:33 ` [PATCH 10/14] vfio: Also trace event failures in vfio_save_complete_precopy() Cédric Le Goater
2024-02-07 13:33 ` [PATCH 11/14] vfio: Extend vfio_set_migration_error() with Error* argument Cédric Le Goater
2024-02-07 20:25 ` Philippe Mathieu-Daudé
2024-02-12 9:35 ` Avihai Horon
2024-02-16 13:12 ` Cédric Le Goater
2024-02-07 13:33 ` [PATCH 12/14] migration: Report error when shutdown fails Cédric Le Goater
2024-02-07 20:26 ` Philippe Mathieu-Daudé
2024-02-08 5:52 ` Peter Xu
2024-02-07 13:33 ` [PATCH 13/14] migration: Use migrate_has_error() in close_return_path_on_source() Cédric Le Goater
2024-02-08 5:52 ` Peter Xu
2024-02-08 13:07 ` Fabiano Rosas
2024-02-08 13:45 ` Cédric Le Goater
2024-02-08 13:57 ` Fabiano Rosas
2024-02-12 13:03 ` Cédric Le Goater
2024-02-14 16:00 ` Fabiano Rosas
2024-02-16 15:17 ` Cédric Le Goater
2024-02-23 4:14 ` Peter Xu
2024-02-07 13:33 ` [RFC PATCH 14/14] migration: Fix return-path thread exit Cédric Le Goater
2024-02-08 5:57 ` Peter Xu
2024-02-12 16:04 ` Cédric Le Goater
2024-02-23 4:25 ` Peter Xu
2024-02-08 13:29 ` Fabiano Rosas
2024-02-12 15:44 ` Cédric Le Goater
2024-02-14 20:35 ` Fabiano Rosas
2024-02-16 15:08 ` Cédric Le Goater
2024-02-16 17:35 ` Fabiano Rosas
2024-02-23 4:31 ` Peter Xu
2024-02-23 14:05 ` Fabiano Rosas
2024-02-26 8:44 ` Cédric Le Goater