From: "Maciej S. Szmigiero" <mail@maciej.szmigiero.name>
To: Peter Xu <peterx@redhat.com>, Fabiano Rosas <farosas@suse.de>
Cc: "Alex Williamson" <alex.williamson@redhat.com>,
"Cédric Le Goater" <clg@redhat.com>,
"Eric Blake" <eblake@redhat.com>,
"Markus Armbruster" <armbru@redhat.com>,
"Avihai Horon" <avihaih@nvidia.com>,
"Joao Martins" <joao.m.martins@oracle.com>,
qemu-devel@nongnu.org
Subject: [PATCH RFC 00/26] Multifd 🔀 device state transfer support with VFIO consumer
Date: Tue, 16 Apr 2024 16:42:39 +0200
Message-ID: <cover.1713269378.git.maciej.szmigiero@oracle.com>
From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
VFIO device state transfer is currently done via the main migration channel.
This means that transfers from multiple VFIO devices are done sequentially
and via just a single common migration channel.
This way of transferring VFIO device state migration data reduces
performance and severely impacts the migration downtime (~50%) for VMs
that have multiple such devices with large state size - see the test
results below.
However, we already have a way to transfer migration data using multiple
connections - that's what multifd channels are.
Unfortunately, multifd channels are currently utilized for RAM transfer
only.
This patch set adds a new framework allowing their use for device state
transfer too.
The wire protocol is based on Avihai's x-channel-header patches, which
introduce a header for migration channels that allows the migration source
to explicitly indicate the migration channel type without having the
target deduce the channel type by peeking in the channel's content.
The new wire protocol can be switched on and off via the
migration.x-channel-header option for compatibility with older QEMU
versions and for testing.
Switching the new wire protocol off also disables device state transfer via
multifd channels.
The device state transfer can happen either via the same multifd channels
that carry RAM data, mixed with that RAM data (when
migration.x-multifd-channels-device-state is 0), or exclusively via
dedicated device state transfer channels (when
migration.x-multifd-channels-device-state > 0).
Using dedicated device state transfer multifd channels brings further
performance benefits since these channels don't need to participate in
the RAM sync process.
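As a rough illustration of the two modes above, here is a minimal sketch
of how a channel's role could follow from the two options. It is not code
from this series: MigCfg, ChannelRole and channel_role() are hypothetical
names, and placing the dedicated channels in the highest-numbered slots is
purely an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the options described above; not QEMU code. */
typedef struct {
    bool channel_header;         /* models migration.x-channel-header */
    unsigned channels_total;     /* total number of multifd channels */
    unsigned channels_devstate;  /* models migration.x-multifd-channels-device-state */
} MigCfg;

typedef enum {
    CH_RAM_ONLY,           /* channel carries RAM data only */
    CH_MIXED,              /* channel carries RAM and device state */
    CH_DEVICE_STATE_ONLY   /* channel is dedicated to device state */
} ChannelRole;

/*
 * Assumption for this sketch: dedicated device state channels occupy the
 * highest-numbered slots. The real layout may differ.
 */
ChannelRole channel_role(const MigCfg *cfg, unsigned idx)
{
    assert(cfg->channels_devstate <= cfg->channels_total);
    assert(idx < cfg->channels_total);

    if (!cfg->channel_header) {
        /* Old wire protocol: no device state over multifd at all. */
        return CH_RAM_ONLY;
    }
    if (cfg->channels_devstate == 0) {
        /* Device state shares every channel with RAM data. */
        return CH_MIXED;
    }
    return idx >= cfg->channels_total - cfg->channels_devstate
           ? CH_DEVICE_STATE_ONLY : CH_RAM_ONLY;
}
```

With 15 channels and 4 of them dedicated, this model would give channels
0..10 to RAM and channels 11..14 to device state.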
These patches introduce a few new SaveVMHandlers:
* "save_live_complete_precopy_async{,wait}" handlers that allow a device to
provide its own asynchronous transmission of the remaining data at the
end of a precopy phase.
The "save_live_complete_precopy_async" handler is supposed to start such
transmission (for example, by launching appropriate threads) while the
"save_live_complete_precopy_async_wait" handler is supposed to wait until
such transfer has finished (for example, until the sending threads
have exited).
* "load_state_buffer" and its caller qemu_loadvm_load_state_buffer() that
allow providing a device state buffer to an explicitly specified device,
identified by its idstr and instance id.
* "load_finish" that allows migration code to poll whether a device-specific
asynchronous device state loading operation has finished before proceeding
further in the migration process (with an associated condition variable for
notification to avoid unnecessary polling).
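The start/wait split of the save-side handlers can be sketched with plain
pthreads as below. This is only an illustration of the pattern, not the
SaveVMHandlers definition from this series: DemoSaveHandlers, DemoDevice
and all demo_* functions are hypothetical, and the "transmission" is just
a counter. The point is that every device is started before any of them
is waited for, so the transfers can overlap.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the two new save-side hooks. */
typedef struct {
    int (*complete_precopy_async)(void *opaque);      /* start transmission */
    int (*complete_precopy_async_wait)(void *opaque); /* wait until it ends */
} DemoSaveHandlers;

typedef struct {
    pthread_t thread;
    size_t remaining;  /* bytes of device state still to send */
    size_t sent;       /* bytes "sent" by the worker thread */
} DemoDevice;

static void *demo_send_thread(void *opaque)
{
    DemoDevice *dev = opaque;

    while (dev->remaining > 0) {
        size_t chunk = dev->remaining < 4096 ? dev->remaining : 4096;

        /* A real device would write this chunk to a channel here. */
        dev->remaining -= chunk;
        dev->sent += chunk;
    }
    return NULL;
}

static int demo_async_start(void *opaque)
{
    DemoDevice *dev = opaque;

    return pthread_create(&dev->thread, NULL, demo_send_thread, dev) == 0
           ? 0 : -1;
}

static int demo_async_wait(void *opaque)
{
    DemoDevice *dev = opaque;

    return pthread_join(dev->thread, NULL) == 0 ? 0 : -1;
}

const DemoSaveHandlers demo_handlers = {
    .complete_precopy_async = demo_async_start,
    .complete_precopy_async_wait = demo_async_wait,
};

/* Start all devices first, then wait for each one that was started. */
int demo_complete_precopy(const DemoSaveHandlers *ops,
                          DemoDevice *devs, size_t n)
{
    size_t started = 0;
    int ret = 0;

    assert(ops != NULL && devs != NULL);
    for (; started < n; started++) {
        if (ops->complete_precopy_async(&devs[started]) != 0) {
            ret = -1;
            break;
        }
    }
    for (size_t i = 0; i < started; i++) {
        if (ops->complete_precopy_async_wait(&devs[i]) != 0) {
            ret = -1;
        }
    }
    return ret;
}
```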
A VFIO device migration consumer for the new multifd channels device state
migration framework was implemented with a reassembly process for the multifd
received data since device state packets sent via different multifd channels
can arrive out-of-order.
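One simple way to picture such a reassembly step is a table of slots
indexed by a per-device chunk number, drained only while the next expected
slot is filled. The sketch below assumes exactly that; it is not the VFIO
code from this series, and DemoReassembly, the fixed slot sizes and the
demo_* functions are all made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_CHUNKS 8
#define DEMO_CHUNK_SIZE 16

/* Hypothetical reassembly buffer: slots are indexed by chunk number. */
typedef struct {
    unsigned char data[DEMO_MAX_CHUNKS][DEMO_CHUNK_SIZE];
    size_t len[DEMO_MAX_CHUNKS];
    bool present[DEMO_MAX_CHUNKS];
    unsigned next_to_load;  /* first chunk not yet handed to the device */
} DemoReassembly;

/* Store a chunk that may arrive in any order; reject bad or duplicate ones. */
bool demo_store_chunk(DemoReassembly *r, unsigned idx,
                      const void *buf, size_t len)
{
    assert(r != NULL && buf != NULL);
    if (idx >= DEMO_MAX_CHUNKS || len > DEMO_CHUNK_SIZE || r->present[idx]) {
        return false;
    }
    memcpy(r->data[idx], buf, len);
    r->len[idx] = len;
    r->present[idx] = true;
    return true;
}

/*
 * Copy every chunk that is now available in order into 'out'.
 * Stops at the first gap. Returns the number of bytes copied.
 */
size_t demo_drain_in_order(DemoReassembly *r, unsigned char *out,
                           size_t out_cap)
{
    size_t total = 0;

    while (r->next_to_load < DEMO_MAX_CHUNKS && r->present[r->next_to_load]) {
        size_t len = r->len[r->next_to_load];

        if (total + len > out_cap) {
            break;
        }
        memcpy(out + total, r->data[r->next_to_load], len);
        total += len;
        r->next_to_load++;
    }
    return total;
}
```

If chunk 1 arrives before chunk 0, nothing is drained until chunk 0 shows
up, at which point both are delivered in order.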
The VFIO device data loading process happens in a separate thread in order
to avoid blocking a multifd receive thread during this fairly long process.
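The combination of a separate loading thread and a "load_finish"-style
check backed by a condition variable might look roughly like the sketch
below. Everything here is hypothetical (DemoLoadState and the demo_*
functions are not part of this series) and the "loading" is just a
counter; the sketch only shows a waiter sleeping on the condition variable
instead of busy-polling the finished flag.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical per-device load state. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t finished_cond;
    pthread_t thread;
    bool finished;
    int chunks_loaded;
    int chunks_total;
} DemoLoadState;

static void *demo_load_thread(void *opaque)
{
    DemoLoadState *s = opaque;

    for (int i = 0; i < s->chunks_total; i++) {
        pthread_mutex_lock(&s->lock);
        s->chunks_loaded++;  /* a real device would load state here */
        pthread_mutex_unlock(&s->lock);
    }

    /* Notify whoever is waiting that loading is complete. */
    pthread_mutex_lock(&s->lock);
    s->finished = true;
    pthread_cond_signal(&s->finished_cond);
    pthread_mutex_unlock(&s->lock);
    return NULL;
}

/* Launch the loading thread so the receive path is not blocked. */
int demo_load_start(DemoLoadState *s, int chunks_total)
{
    assert(s != NULL && chunks_total >= 0);
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->finished_cond, NULL);
    s->finished = false;
    s->chunks_loaded = 0;
    s->chunks_total = chunks_total;
    return pthread_create(&s->thread, NULL, demo_load_thread, s) == 0
           ? 0 : -1;
}

/* Non-blocking check, in the spirit of a "load_finish" poll. */
bool demo_load_is_finished(DemoLoadState *s)
{
    bool done;

    pthread_mutex_lock(&s->lock);
    done = s->finished;
    pthread_mutex_unlock(&s->lock);
    return done;
}

/* Sleep until the load thread signals completion, then clean up. */
int demo_load_wait_finished(DemoLoadState *s)
{
    pthread_mutex_lock(&s->lock);
    while (!s->finished) {
        pthread_cond_wait(&s->finished_cond, &s->lock);
    }
    pthread_mutex_unlock(&s->lock);

    pthread_join(s->thread, NULL);
    pthread_cond_destroy(&s->finished_cond);
    pthread_mutex_destroy(&s->lock);
    return s->chunks_loaded;
}
```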
Test setup:
Source machine: 2x Xeon Gold 5218 / 192 GiB RAM
                Mellanox ConnectX-7 with 100GbE link
                6.9.0-rc1+ kernel
Target machine: 2x Xeon Platinum 8260 / 376 GiB RAM
                Mellanox ConnectX-7 with 100GbE link
                6.6.0+ kernel
VM: CPU 12cores x 2threads / 15 GiB RAM / 4x Mellanox ConnectX-7 VF
Migration config: 15 multifd channels total
                  new way had 4 channels dedicated to device state transfer
                  x-return-path=true, x-switchover-ack=true
Downtime with ~400 MiB VFIO total device state size:
                                                TLS off     TLS on
  migration.x-channel-header=false (old way)   ~2100 ms   ~2300 ms
  migration.x-channel-header=true (new way)    ~1100 ms   ~1200 ms
  IMPROVEMENT                                      ~50%       ~50%
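For reference, the improvement row can be re-derived from the rounded
downtime figures above as a relative reduction. The helper below is just
that arithmetic, with a hypothetical function name; since its inputs are
the approximate values from the table, the result is approximate too
(a bit under 48% in both columns, which the table rounds to ~50%).

```c
#include <assert.h>

/* Relative downtime reduction in percent: 100 * (old - new) / old. */
double downtime_reduction_pct(double old_ms, double new_ms)
{
    assert(old_ms > 0.0);
    return 100.0 * (old_ms - new_ms) / old_ms;
}
```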
This patch set is also available as a git tree:
https://github.com/maciejsszmigiero/qemu/tree/multifd-device-state-transfer-vfio
Avihai Horon (7):
  migration: Add x-channel-header pseudo-capability
  migration: Add migration channel header send/receive
  migration: Add send/receive header for main channel
  migration: Allow passing migration header in migration channel
    creation
  migration: Add send/receive header for postcopy preempt channel
  migration: Add send/receive header for multifd channel
  migration: Enable x-channel-header pseudo-capability

Maciej S. Szmigiero (19):
  multifd: change multifd_new_send_channel_create() param type
  migration: Add a DestroyNotify parameter to
    socket_send_channel_create()
  multifd: pass MFDSendChannelConnectData when connecting sending socket
  migration/postcopy: pass PostcopyPChannelConnectData when connecting
    sending preempt socket
  migration/options: Mapped-ram is not channel header compatible
  vfio/migration: Add save_{iterate,complete_precopy}_started trace
    events
  migration/ram: Add load start trace event
  migration/multifd: Zero p->flags before starting filling a packet
  migration: Add save_live_complete_precopy_async{,wait} handlers
  migration: Add qemu_loadvm_load_state_buffer() and its handler
  migration: Add load_finish handler and associated functions
  migration: Add x-multifd-channels-device-state parameter
  migration: Add MULTIFD_DEVICE_STATE migration channel type
  migration/multifd: Device state transfer support - receive side
  migration/multifd: Convert multifd_send_pages::next_channel to atomic
  migration/multifd: Device state transfer support - send side
  migration/multifd: Add migration_has_device_state_support()
  vfio/migration: Multifd device state transfer support - receive side
  vfio/migration: Multifd device state transfer support - send side

hw/core/machine.c | 1 +
hw/vfio/migration.c | 530 ++++++++++++++++++++++++++++++++-
hw/vfio/trace-events | 15 +-
include/hw/vfio/vfio-common.h | 25 ++
include/migration/misc.h | 5 +
include/migration/register.h | 67 +++++
migration/channel.c | 68 +++++
migration/channel.h | 17 ++
migration/file.c | 5 +-
migration/migration-hmp-cmds.c | 7 +
migration/migration.c | 110 ++++++-
migration/migration.h | 6 +
migration/multifd-zlib.c | 2 +-
migration/multifd-zstd.c | 2 +-
migration/multifd.c | 512 ++++++++++++++++++++++++++-----
migration/multifd.h | 62 +++-
migration/options.c | 66 ++++
migration/options.h | 2 +
migration/postcopy-ram.c | 81 ++++-
migration/ram.c | 1 +
migration/savevm.c | 112 +++++++
migration/savevm.h | 7 +
migration/socket.c | 6 +-
migration/socket.h | 3 +-
migration/trace-events | 3 +
qapi/migration.json | 16 +-
26 files changed, 1626 insertions(+), 105 deletions(-)