From: "Maciej S. Szmigiero" <mail@maciej.szmigiero.name>
To: Peter Xu <peterx@redhat.com>, Fabiano Rosas <farosas@suse.de>
Cc: "Alex Williamson" <alex.williamson@redhat.com>,
	"Cédric Le Goater" <clg@redhat.com>,
	"Eric Blake" <eblake@redhat.com>,
	"Markus Armbruster" <armbru@redhat.com>,
	"Daniel P . Berrangé" <berrange@redhat.com>,
	"Avihai Horon" <avihaih@nvidia.com>,
	"Joao Martins" <joao.m.martins@oracle.com>,
	qemu-devel@nongnu.org
Subject: [PATCH v1 00/13] Multifd 🔀 device state transfer support with VFIO consumer
Date: Tue, 18 Jun 2024 18:12:18 +0200
Message-ID: <cover.1718717584.git.maciej.szmigiero@oracle.com>

From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>

This is an updated (v1) version of the RFC (v0) patch series located here:
https://lore.kernel.org/qemu-devel/cover.1713269378.git.maciej.szmigiero@oracle.com/


Changes from the RFC (v0):
* Extend the existing multifd packet format instead of introducing a new
migration channel header.

* Instead of a knob for switching the migration channel header on or off,
introduce an "x-migration-multifd-transfer" VFIO device property that
allows configuring at runtime whether to send a particular device's state
via multifd channels when live migrating that device.

This property defaults to "false" for bit stream compatibility with older
QEMU versions.
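
As a rough illustration only (not the actual patch code), a boolean
vfio-pci device property with a "false" default is typically declared
like this; the field name "migration_multifd_transfer" is just an
assumption for the example:

    #include "hw/qdev-properties.h"
    /* VFIOPCIDevice is declared in hw/vfio's internal pci.h header. */

    static Property vfio_pci_dev_properties[] = {
        /* ... existing vfio-pci properties ... */
        /* Defaults to false to keep the old bit stream on the wire. */
        DEFINE_PROP_BOOL("x-migration-multifd-transfer", VFIOPCIDevice,
                         migration_multifd_transfer, false),
        DEFINE_PROP_END_OF_LIST(),
    };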

* Remove support for dedicated device state transfer multifd channels,
since the same downtime can be attained in a shared channel configuration
by simply reducing the total number of multifd channels to the number
that the dedicated device state configuration left available for RAM
transfer.

For example, the best downtime from the dedicated device state config
on my setup (achieved with 10 total multifd channels, 4 of them
dedicated to device state) can also be achieved in the shared
RAM/device state channel configuration by reducing the total multifd
channel count to 6.

It looks like not having too many RAM transfer multifd channels is key
to a good downtime: the results are reproducibly worse with 15 shared
channels in total, yet just as good as with 6 shared channels when
there are 15 channels in total but 4 of them are dedicated to
transferring device state (leaving 11 for RAM transfer).

* Make the next multifd channel selection more fair when converting
multifd_send_pages::next_channel to atomic.
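
A minimal sketch of the idea (the helper below is illustrative, not the
patch's actual code): an atomic ticket counter hands out channel indices
in round-robin order, so concurrent senders do not all race to the same
channel.

    #include "qemu/atomic.h"

    /* Illustrative only: round-robin channel pick via an atomic counter. */
    static unsigned int pick_next_channel(unsigned int num_channels)
    {
        static unsigned int next_channel;

        /* Every caller gets a unique ticket, so channels are used in turn. */
        return qatomic_fetch_inc(&next_channel) % num_channels;
    }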

* Convert the code to use QEMU thread sync primitives (QemuMutex with
QemuLockable, QemuCond) instead of their GLib equivalents (GMutex,
GMutexLocker and GCond).
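
For reference, a small sketch of the QEMU primitives involved (the
structure and field names below are made up for the example);
QEMU_LOCK_GUARD plays roughly the role GMutexLocker did:

    #include "qemu/thread.h"
    #include "qemu/lockable.h"

    typedef struct ExampleQueue {
        QemuMutex lock;   /* was GMutex */
        QemuCond cond;    /* was GCond */
        bool has_data;
    } ExampleQueue;

    static void example_wait_for_data(ExampleQueue *q)
    {
        /* Scoped lock via QemuLockable, analogous to GMutexLocker. */
        QEMU_LOCK_GUARD(&q->lock);
        while (!q->has_data) {
            qemu_cond_wait(&q->cond, &q->lock);
        }
    }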

* Rename complete_precopy_async{,_wait} to complete_precopy_{begin,end},
as suggested.

* Rebase onto the last week's QEMU git master and retest.


When working on the updated patch set version I also investigated the
possibility of refactoring VM live phase (non-downtime) transfers to
happen via multifd channels.

However, the VM live phase transfer works differently: it happens
opportunistically until the remaining data drops below the switchover
threshold, rather than always transferring the whole device state data
until it is exhausted.

For this reason, there would need to be some way in the migration
framework to update the remaining data estimate from the per-device
saving/transfer queuing threads and then to stop these threads once the
migration core has decided to stop the VM and switch over. Such
functionality would need to be introduced first.
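
Purely as a hypothetical sketch of the kind of hook described above
(nothing like this exists in this series and all names are invented for
illustration): a per-device transfer thread could publish its
remaining-data estimate and poll a stop flag set by the migration core
at switchover time.

    #include <stdbool.h>
    #include <stdint.h>
    #include "qemu/atomic.h"

    /* Hypothetical per-device live phase transfer state (illustration only). */
    typedef struct DeviceLiveXfer {
        uint64_t remaining_bytes;   /* read for the switchover estimate */
        bool stop_requested;        /* set by the core when switching over */
    } DeviceLiveXfer;

    static void device_live_xfer_loop(DeviceLiveXfer *s)
    {
        while (!qatomic_read(&s->stop_requested)) {
            uint64_t sent = 0;
            /* ... queue a chunk of device state onto a multifd channel,
             *     setting 'sent' to the number of bytes handed off ... */

            /* Only this thread writes the estimate; readers use qatomic_read(). */
            qatomic_set(&s->remaining_bytes, s->remaining_bytes - sent);
        }
    }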

There would also need to be some fairness guarantees so that every
device gets similar access to the multifd channels. Otherwise, the
remaining data might never drop below the switchover threshold because
some devices are starved of access to the multifd transfer channels
while, during the VM live phase, additional device data is constantly
being generated.

Moreover, there's nothing stopping a QEMU device driver from requiring
VM live phase data to be handled (loaded, etc.) differently from the
post-switchover data.
For cases like this, some kind of new per-device handler for loading
incoming VM live phase data would need to be introduced too.

For the above reasons, the VM live phase multifd transfer functionality
isn't a simple extension of the functionality introduced by this patch
set.


For convenience, this patch set is also available as a git tree:
https://github.com/maciejsszmigiero/qemu/tree/multifd-device-state-transfer-vfio


Maciej S. Szmigiero (13):
  vfio/migration: Add save_{iterate,complete_precopy}_started trace
    events
  migration/ram: Add load start trace event
  migration/multifd: Zero p->flags before starting filling a packet
  migration: Add save_live_complete_precopy_{begin,end} handlers
  migration: Add qemu_loadvm_load_state_buffer() and its handler
  migration: Add load_finish handler and associated functions
  migration/multifd: Device state transfer support - receive side
  migration/multifd: Convert multifd_send_pages::next_channel to atomic
  migration/multifd: Device state transfer support - send side
  migration/multifd: Add migration_has_device_state_support()
  vfio/migration: Multifd device state transfer support - receive side
  vfio/migration: Add x-migration-multifd-transfer VFIO property
  vfio/migration: Multifd device state transfer support - send side

 hw/vfio/migration.c           | 545 +++++++++++++++++++++++++++++++++-
 hw/vfio/pci.c                 |   7 +
 hw/vfio/trace-events          |  15 +-
 include/hw/vfio/vfio-common.h |  27 ++
 include/migration/misc.h      |   5 +
 include/migration/register.h  |  70 +++++
 migration/migration.c         |   6 +
 migration/migration.h         |   3 +
 migration/multifd-zlib.c      |   2 +-
 migration/multifd-zstd.c      |   2 +-
 migration/multifd.c           | 336 +++++++++++++++++----
 migration/multifd.h           |  57 +++-
 migration/ram.c               |   1 +
 migration/savevm.c            | 112 +++++++
 migration/savevm.h            |   7 +
 migration/trace-events        |   1 +
 16 files changed, 1132 insertions(+), 64 deletions(-)



