qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 00/21] migration: Improve error reporting
@ 2024-02-27 18:03 Cédric Le Goater
  2024-02-27 18:03 ` [PATCH v2 01/21] migration: Report error when shutdown fails Cédric Le Goater
                   ` (20 more replies)
  0 siblings, 21 replies; 42+ messages in thread
From: Cédric Le Goater @ 2024-02-27 18:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Fabiano Rosas, Alex Williamson, Avihai Horon,
	Philippe Mathieu-Daudé, Cédric Le Goater

Hello,

The motivation behind these changes is to improve error reporting to
the upper management layer (libvirt) with a more detailed error, this
to let it decide, depending on the reported error, whether to try
migration again later. It would be useful in cases where migration
fails due to lack of HW resources on the host. For instance, some
adapters can only initiate a limited number of simultaneous dirty
tracking requests and this imposes a limit on the the number of VMs
that can be migrated simultaneously.

We are not quite ready for such a mechanism but what we can do first is
to cleanup the error reporting ​in the early save_setup sequence. This
is what the following changes propose, by adding an Error** argument to
various handlers and propagating it to the core migration subsystem.
 
Thanks,

C.

Changes in v2:

- Removed v1 patches addressing the return-path thread termination as
  they are now superseded by :  
  https://lore.kernel.org/qemu-devel/20240226203122.22894-1-farosas@suse.de/
- Documentation updates of handlers
- Removed call to PRECOPY_NOTIFY_SETUP notifiers in case of errors
- Modified routines taking an Error** argument to return a bool when
  possible and made adjustments in callers.
- new MEMORY_LISTENER_CALL_LOG_GLOBAL macro for .log_global*()
  handlers
- Handled SETUP state when migration terminates
- Modified memory_get_xlat_addr() to take an Error** argument
- Various refinements on error handling

Cédric Le Goater (21):
  migration: Report error when shutdown fails
  migration: Remove SaveStateHandler and LoadStateHandler typedefs
  migration: Add documentation for SaveVMHandlers
  migration: Do not call PRECOPY_NOTIFY_SETUP notifiers in case of error
  migration: Add Error** argument to qemu_savevm_state_setup()
  migration: Add Error** argument to .save_setup() handler
  migration: Add Error** argument to .load_setup() handler
  memory: Add Error** argument to .log_global*() handlers
  memory: Add Error** argument to the global_dirty_log routines
  migration: Modify ram_init_bitmaps() to report dirty tracking errors
  migration: Fix migration termination
  vfio: Add Error** argument to .set_dirty_page_tracking() handler
  vfio: Add Error** argument to vfio_devices_dma_logging_start()
  vfio: Add Error** argument to vfio_devices_dma_logging_stop()
  vfio: Use new Error** argument in vfio_save_setup()
  vfio: Add Error** argument to .vfio_save_config() handler
  vfio: Reverse test on vfio_get_dirty_bitmap()
  memory: Add Error** argument to memory_get_xlat_addr()
  vfio: Add Error** argument to .get_dirty_bitmap() handler
  vfio: Also trace event failures in vfio_save_complete_precopy()
  vfio: Extend vfio_set_migration_error() with Error* argument

 include/exec/memory.h                 |  40 +++-
 include/hw/vfio/vfio-common.h         |  29 ++-
 include/hw/vfio/vfio-container-base.h |  35 +++-
 include/migration/register.h          | 267 +++++++++++++++++++++++---
 include/qemu/typedefs.h               |   2 -
 migration/savevm.h                    |   2 +-
 hw/i386/xen/xen-hvm.c                 |  10 +-
 hw/ppc/spapr.c                        |   2 +-
 hw/s390x/s390-stattrib.c              |   2 +-
 hw/vfio/common.c                      | 160 +++++++++------
 hw/vfio/container-base.c              |   9 +-
 hw/vfio/container.c                   |  19 +-
 hw/vfio/migration.c                   |  89 ++++++---
 hw/vfio/pci.c                         |   5 +-
 hw/virtio/vhost-vdpa.c                |   5 +-
 hw/virtio/vhost.c                     |   6 +-
 migration/block-dirty-bitmap.c        |   2 +-
 migration/block.c                     |   2 +-
 migration/dirtyrate.c                 |  21 +-
 migration/migration.c                 |  24 ++-
 migration/qemu-file.c                 |   5 +-
 migration/ram.c                       |  48 ++++-
 migration/savevm.c                    |  28 +--
 system/memory.c                       |  95 +++++++--
 system/physmem.c                      |   5 +-
 25 files changed, 699 insertions(+), 213 deletions(-)

-- 
2.43.2



^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2024-03-04 13:39 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-02-27 18:03 [PATCH v2 00/21] migration: Improve error reporting Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 01/21] migration: Report error when shutdown fails Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 02/21] migration: Remove SaveStateHandler and LoadStateHandler typedefs Cédric Le Goater
2024-02-29  1:21   ` Fabiano Rosas
2024-02-29  3:55   ` Peter Xu
2024-02-27 18:03 ` [PATCH v2 03/21] migration: Add documentation for SaveVMHandlers Cédric Le Goater
2024-02-29  4:10   ` Peter Xu
2024-02-29 13:19     ` Cédric Le Goater
2024-03-04  9:05   ` Avihai Horon
2024-03-04 10:29     ` Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 04/21] migration: Do not call PRECOPY_NOTIFY_SETUP notifiers in case of error Cédric Le Goater
2024-02-29  4:12   ` Peter Xu
2024-02-27 18:03 ` [PATCH v2 05/21] migration: Add Error** argument to qemu_savevm_state_setup() Cédric Le Goater
2024-02-29  4:19   ` Peter Xu
2024-02-27 18:03 ` [PATCH v2 06/21] migration: Add Error** argument to .save_setup() handler Cédric Le Goater
2024-02-29  6:32   ` Markus Armbruster
2024-02-29  7:20     ` Vladimir Sementsov-Ogievskiy
2024-02-29 10:35       ` Thomas Huth
2024-02-29 13:21         ` Markus Armbruster
2024-03-01 14:44           ` Vladimir Sementsov-Ogievskiy
2024-03-01 14:52             ` Vladimir Sementsov-Ogievskiy
2024-02-29 13:34     ` Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 07/21] migration: Add Error** argument to .load_setup() handler Cédric Le Goater
2024-02-29  4:21   ` Peter Xu
2024-02-27 18:03 ` [PATCH v2 08/21] memory: Add Error** argument to .log_global*() handlers Cédric Le Goater
2024-02-29  5:29   ` Peter Xu
2024-03-04 13:38     ` Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 09/21] memory: Add Error** argument to the global_dirty_log routines Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 10/21] migration: Modify ram_init_bitmaps() to report dirty tracking errors Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 11/21] migration: Fix migration termination Cédric Le Goater
2024-02-29  5:34   ` Peter Xu
2024-02-29 15:13     ` Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 12/21] vfio: Add Error** argument to .set_dirty_page_tracking() handler Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 13/21] vfio: Add Error** argument to vfio_devices_dma_logging_start() Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 14/21] vfio: Add Error** argument to vfio_devices_dma_logging_stop() Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 15/21] vfio: Use new Error** argument in vfio_save_setup() Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 16/21] vfio: Add Error** argument to .vfio_save_config() handler Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 17/21] vfio: Reverse test on vfio_get_dirty_bitmap() Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 18/21] memory: Add Error** argument to memory_get_xlat_addr() Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 19/21] vfio: Add Error** argument to .get_dirty_bitmap() handler Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 20/21] vfio: Also trace event failures in vfio_save_complete_precopy() Cédric Le Goater
2024-02-27 18:03 ` [PATCH v2 21/21] vfio: Extend vfio_set_migration_error() with Error* argument Cédric Le Goater

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).