qemu-devel.nongnu.org archive mirror
From: "Cédric Le Goater" <clg@redhat.com>
To: "Maciej S. Szmigiero" <mail@maciej.szmigiero.name>,
	Peter Xu <peterx@redhat.com>, Fabiano Rosas <farosas@suse.de>
Cc: "Alex Williamson" <alex.williamson@redhat.com>,
	"Eric Blake" <eblake@redhat.com>,
	"Markus Armbruster" <armbru@redhat.com>,
	"Daniel P . Berrangé" <berrange@redhat.com>,
	"Avihai Horon" <avihaih@nvidia.com>,
	"Joao Martins" <joao.m.martins@oracle.com>,
	qemu-devel@nongnu.org
Subject: Re: [PATCH v6 00/36] Multifd 🔀 device state transfer support with VFIO consumer
Date: Wed, 5 Mar 2025 10:29:44 +0100
Message-ID: <4ea12608-ec9d-4eed-a20c-75f3ac6a5d0d@redhat.com>
In-Reply-To: <cover.1741124640.git.maciej.szmigiero@oracle.com>

Hello,

On 3/4/25 23:03, Maciej S. Szmigiero wrote:
> From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
> 
> This is an updated v6 patch series of the v5 series located here:
> https://lore.kernel.org/qemu-devel/cover.1739994627.git.maciej.szmigiero@oracle.com/
> 
> What is this patch set about?
> Current live migration device state transfer is done via the main (single)
> migration channel, which reduces performance and severely impacts the
> migration downtime for VMs having large device state that needs to be
> transferred during the switchover phase.
> 
> Example devices that have such large switchover phase device state are some
> types of VFIO SmartNICs and GPUs.
> 
> This patch set allows parallelizing this transfer by using multifd channels
> for it.
> It also introduces new load and save threads per VFIO device for decoupling
> these operations from the main migration thread.
> These threads run on newly introduced generic (non-AIO) thread pools,
> instantiated by the core migration code.

I think we are ready to apply 1-33. Avihai, please take a look!

7, 15 and 17 still need an Ack from Peter and/or Fabiano, though.

34 can be reworked a bit before -rc0.
35 is for QEMU 10.1.
36 needs some massaging. I will do that.

This can go through the vfio tree if everyone agrees.

Thanks,

C.




> Changes from v5:
> * Add bql_locked() assertion to migration_incoming_state_destroy() with a
> comment describing why holding BQL there is necessary.
> 
> * Add SPDX-License-Identifier to newly added files.
> 
> * Move consistency of multifd transfer settings check to the patch adding
> x-migration-multifd-transfer property.
> 
> * Change packet->idx == UINT32_MAX message to the suggested one.
> 
> * Use WITH_QEMU_LOCK_GUARD() in vfio_load_state_buffer().
> 
> * Add vfio_load_bufs_thread_{start,end} trace events.
> 
> * Invert "ret" value computation logic in vfio_load_bufs_thread() and
>    vfio_multifd_save_complete_precopy_thread() - initialize "ret" to false
>    at definition, remove "ret = false" at every failure/early exit block and
>    add "ret = true" just before the early exit jump label.
> 
> * Make vfio_load_bufs_thread_load_config() return a bool and take an
>    "Error **" parameter.
> 
> * Make vfio_multifd_setup() (previously called vfio_multifd_transfer_setup())
>    allocate struct VFIOMultifd if requested by "alloc_multifd" parameter.
> 
> * Add vfio_multifd_cleanup() call to vfio_save_cleanup() (for consistency
>    with the load code), with a comment describing that it is currently a NOP
>    there.
> 
> * Move vfio_multifd_cleanup() to migration-multifd.c.
> 
> * Move general multifd migration description in docs/devel/migration/vfio.rst
>    from the top section to new "Multifd" section at the bottom.
> 
> * Add comment describing why x-migration-multifd-transfer needs to be
>    a custom property above the variable containing that custom property type
>    in register_vfio_pci_dev_type().
> 
> * Add object_class_property_set_description() description for all 3 newly
>    added parameters: x-migration-multifd-transfer,
>    x-migration-load-config-after-iter and x-migration-max-queued-buffers.
> 
> * Split out wiring vfio_multifd_setup() and vfio_multifd_cleanup() into
>    general VFIO load/save setup and cleanup methods into a brand new
>    patch/commit.
> 
> * Squash the patch introducing VFIOStateBuffer(s) into the "received buffers
>    queuing" commit to fix building the interim code form at the time of this
>    patch with "-Werror".
>    
> * Change device state packet "idstr" field to NULL-terminated and drop
>    QEMU_NONSTRING marking from its definition.
> 
> * Add vbasedev->name to VFIO error messages to know which device caused
>    that error.
> 
> * Move BQL lock ordering assert closer to the other lock in the lock order
>    in vfio_load_state_buffer().
> 
> * Drop orphan "QemuThread load_bufs_thread" VFIOMultifd member leftover
>    from the days of the version 2 of this patch set.
> 
> * Change "guint" into an "unsigned int" where it was present in this
>    patch set.
> 
> * Use g_autoptr() for QEMUFile also in vfio_load_bufs_thread_load_config().
> 
> * Call multifd_abort_device_state_save_threads() if a migration error is
>    already set in the save path to avoid needlessly waiting for the remaining
>    threads to do all of their normal work.
> 
> * Other minor changes that should not have functional impact, like:
>    renamed functions/labels, moved code lines between patches contained
>    in this patch set, added review tags, code formatting, rebased on top
>    of the latest QEMU git master, etc.
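
For readers unfamiliar with the inverted "ret" logic mentioned in the
changelog above, here is a minimal sketch of that pattern. It is not taken
from the series: load_example() and step_ok() are hypothetical names used
only for illustration, and the real vfio_load_bufs_thread() does far more.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one step of a load thread; not QEMU code. */
static bool step_ok(int step, int fail_at)
{
    return step != fail_at;
}

/*
 * "ret" starts as false, so failure paths only need to jump to the label.
 * The single "ret = true" sits just before the label, on the success path.
 */
bool load_example(int fail_at, int *steps_done)
{
    bool ret = false;

    assert(steps_done != NULL);
    *steps_done = 0;

    for (int step = 0; step < 3; step++) {
        if (!step_ok(step, fail_at)) {
            goto out; /* no "ret = false" needed here */
        }
        (*steps_done)++;
    }

    ret = true;

out:
    /* Common cleanup would go here; it runs on both paths. */
    return ret;
}
```

With this shape, a newly added early exit cannot accidentally report
success, since the only assignment of "true" is on the fall-through path.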
> 
> ========================================================================
> 
> This patch set is targeting QEMU 10.0.
> 
> It is also exported as a git tree:
> https://gitlab.com/maciejsszmigiero/qemu/-/commits/multifd-device-state-transfer-vfio
> 
> ========================================================================
> 
> Maciej S. Szmigiero (35):
>    migration: Clarify that {load,save}_cleanup handlers can run without
>      setup
>    thread-pool: Remove thread_pool_submit() function
>    thread-pool: Rename AIO pool functions to *_aio() and data types to
>      *Aio
>    thread-pool: Implement generic (non-AIO) pool support
>    migration: Add MIG_CMD_SWITCHOVER_START and its load handler
>    migration: Add qemu_loadvm_load_state_buffer() and its handler
>    migration: postcopy_ram_listen_thread() should take BQL for some calls
>    error: define g_autoptr() cleanup function for the Error type
>    migration: Add thread pool of optional load threads
>    migration/multifd: Split packet into header and RAM data
>    migration/multifd: Device state transfer support - receive side
>    migration/multifd: Make multifd_send() thread safe
>    migration/multifd: Add an explicit MultiFDSendData destructor
>    migration/multifd: Device state transfer support - send side
>    migration/multifd: Add multifd_device_state_supported()
>    migration: Add save_live_complete_precopy_thread handler
>    vfio/migration: Add load_device_config_state_start trace event
>    vfio/migration: Convert bytes_transferred counter to atomic
>    vfio/migration: Add vfio_add_bytes_transferred()
>    vfio/migration: Move migration channel flags to vfio-common.h header
>      file
>    vfio/migration: Multifd device state transfer support - basic types
>    vfio/migration: Multifd device state transfer - add support checking
>      function
>    vfio/migration: Multifd setup/cleanup functions and associated
>      VFIOMultifd
>    vfio/migration: Setup and cleanup multifd transfer in these general
>      methods
>    vfio/migration: Multifd device state transfer support - received
>      buffers queuing
>    vfio/migration: Multifd device state transfer support - load thread
>    migration/qemu-file: Define g_autoptr() cleanup function for QEMUFile
>    vfio/migration: Multifd device state transfer support - config loading
>      support
>    vfio/migration: Multifd device state transfer support - send side
>    vfio/migration: Add x-migration-multifd-transfer VFIO property
>    vfio/migration: Make x-migration-multifd-transfer VFIO property
>      mutable
>    hw/core/machine: Add compat for x-migration-multifd-transfer VFIO
>      property
>    vfio/migration: Max in-flight VFIO device state buffer count limit
>    vfio/migration: Add x-migration-load-config-after-iter VFIO property
>    vfio/migration: Update VFIO migration documentation
> 
> Peter Xu (1):
>    migration/multifd: Make MultiFDSendData a struct
> 
>   docs/devel/migration/vfio.rst      |  79 ++-
>   hw/core/machine.c                  |   2 +
>   hw/vfio/meson.build                |   1 +
>   hw/vfio/migration-multifd.c        | 786 +++++++++++++++++++++++++++++
>   hw/vfio/migration-multifd.h        |  37 ++
>   hw/vfio/migration.c                | 111 ++--
>   hw/vfio/pci.c                      |  40 ++
>   hw/vfio/trace-events               |  13 +-
>   include/block/aio.h                |   8 +-
>   include/block/thread-pool.h        |  62 ++-
>   include/hw/vfio/vfio-common.h      |  34 ++
>   include/migration/client-options.h |   4 +
>   include/migration/misc.h           |  25 +
>   include/migration/register.h       |  52 +-
>   include/qapi/error.h               |   2 +
>   include/qemu/typedefs.h            |   5 +
>   migration/colo.c                   |   3 +
>   migration/meson.build              |   1 +
>   migration/migration-hmp-cmds.c     |   2 +
>   migration/migration.c              |  20 +-
>   migration/migration.h              |   7 +
>   migration/multifd-device-state.c   | 212 ++++++++
>   migration/multifd-nocomp.c         |  30 +-
>   migration/multifd.c                | 248 +++++++--
>   migration/multifd.h                |  74 ++-
>   migration/options.c                |   9 +
>   migration/qemu-file.h              |   2 +
>   migration/savevm.c                 | 201 +++++++-
>   migration/savevm.h                 |   6 +-
>   migration/trace-events             |   1 +
>   scripts/analyze-migration.py       |  11 +
>   tests/unit/test-thread-pool.c      |   6 +-
>   util/async.c                       |   6 +-
>   util/thread-pool.c                 | 184 +++++--
>   util/trace-events                  |   6 +-
>   35 files changed, 2125 insertions(+), 165 deletions(-)
>   create mode 100644 hw/vfio/migration-multifd.c
>   create mode 100644 hw/vfio/migration-multifd.h
>   create mode 100644 migration/multifd-device-state.c
> 



