From: "Cédric Le Goater" <clg@redhat.com>
To: "Maciej S. Szmigiero" <mail@maciej.szmigiero.name>,
Peter Xu <peterx@redhat.com>, Fabiano Rosas <farosas@suse.de>
Cc: "Alex Williamson" <alex.williamson@redhat.com>,
"Eric Blake" <eblake@redhat.com>,
"Markus Armbruster" <armbru@redhat.com>,
"Daniel P . Berrangé" <berrange@redhat.com>,
"Avihai Horon" <avihaih@nvidia.com>,
"Joao Martins" <joao.m.martins@oracle.com>,
qemu-devel@nongnu.org
Subject: Re: [PATCH v4 27/33] vfio/migration: Multifd device state transfer support - received buffers queuing
Date: Wed, 12 Feb 2025 14:47:43 +0100
Message-ID: <1b708674-e14d-46c2-8373-a0b12cf08b10@redhat.com>
In-Reply-To: <74c4bbaaccd81e883504ae478e84394ddd96bbae.1738171076.git.maciej.szmigiero@oracle.com>
On 1/30/25 11:08, Maciej S. Szmigiero wrote:
> From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
>
> The multifd received data needs to be reassembled since device state
> packets sent via different multifd channels can arrive out-of-order.
>
> Therefore, each VFIO device state packet carries a header indicating its
> position in the stream.
> The raw device state data is saved into a VFIOStateBuffer for later
> in-order loading into the device.
>
> The last such VFIO device state packet should have
> VFIO_DEVICE_STATE_CONFIG_STATE flag set and carry the device config state.
>
> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
> ---
> hw/vfio/migration.c | 116 ++++++++++++++++++++++++++++++++++
> hw/vfio/pci.c | 2 +
> hw/vfio/trace-events | 1 +
> include/hw/vfio/vfio-common.h | 1 +
> 4 files changed, 120 insertions(+)
>
> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> index bcdf204d5cf4..0c0caec1bd64 100644
> --- a/hw/vfio/migration.c
> +++ b/hw/vfio/migration.c
> @@ -301,6 +301,12 @@ typedef struct VFIOStateBuffer {
>  } VFIOStateBuffer;
>
>  typedef struct VFIOMultifd {
> +    VFIOStateBuffers load_bufs;
> +    QemuCond load_bufs_buffer_ready_cond;
> +    QemuMutex load_bufs_mutex; /* Lock order: this lock -> BQL */
> +    uint32_t load_buf_idx;
> +    uint32_t load_buf_idx_last;
> +    uint32_t load_buf_queued_pending_buffers;
>  } VFIOMultifd;
>
>  static void vfio_state_buffer_clear(gpointer data)
> @@ -346,6 +352,103 @@ static VFIOStateBuffer *vfio_state_buffers_at(VFIOStateBuffers *bufs, guint idx)
>      return &g_array_index(bufs->array, VFIOStateBuffer, idx);
>  }
>

Each routine executed from a migration thread should have a preliminary
comment saying from which context it is called: migration or VFIO.

> +static bool vfio_load_state_buffer_insert(VFIODevice *vbasedev,
> +                                          VFIODeviceStatePacket *packet,
> +                                          size_t packet_total_size,
> +                                          Error **errp)
> +{
> +    VFIOMigration *migration = vbasedev->migration;
> +    VFIOMultifd *multifd = migration->multifd;
> +    VFIOStateBuffer *lb;
> +
> +    vfio_state_buffers_assert_init(&multifd->load_bufs);
> +    if (packet->idx >= vfio_state_buffers_size_get(&multifd->load_bufs)) {
> +        vfio_state_buffers_size_set(&multifd->load_bufs, packet->idx + 1);
> +    }
> +
> +    lb = vfio_state_buffers_at(&multifd->load_bufs, packet->idx);
> +    if (lb->is_present) {
> +        error_setg(errp, "state buffer %" PRIu32 " already filled",
> +                   packet->idx);
> +        return false;
> +    }
> +
> +    assert(packet->idx >= multifd->load_buf_idx);
> +
> +    multifd->load_buf_queued_pending_buffers++;
> +    if (multifd->load_buf_queued_pending_buffers >
> +        vbasedev->migration_max_queued_buffers) {
> +        error_setg(errp,
> +                   "queuing state buffer %" PRIu32 " would exceed the max of %" PRIu64,
> +                   packet->idx, vbasedev->migration_max_queued_buffers);
> +        return false;
> +    }

AFAICT, the attributes multifd->load_buf_queued_pending_buffers and
vbasedev->migration_max_queued_buffers are not strictly necessary.
They allow counting buffers and checking them against an arbitrary
limit, which is UINT64_MAX today. It makes me wonder how useful they are.

Please introduce them in a separate patch at the end of the series,
adding documentation on the "x-migration-max-queued-buffers" property
and also general documentation on why and how to use it.

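For readers following the discussion, the bookkeeping in question can be sketched outside of QEMU as below. This is only an illustration under stated assumptions: `DemoQueue`, `DemoBuffer`, `demo_queue_insert()` and `demo_queue_pop()` are hypothetical names, plain malloc() replaces GLib, and an index below the next expected one is rejected with an error where the patch uses an assert. The limit is checked before counting; the patch counts first and compares with `>`, which admits the same number of buffers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins, not the VFIOStateBuffer types from the patch. */
typedef struct DemoBuffer {
    bool is_present;
    void *data;
    size_t len;
} DemoBuffer;

typedef struct DemoQueue {
    DemoBuffer *bufs;
    uint32_t size;        /* number of slots currently allocated */
    uint32_t next_idx;    /* next index the loader expects */
    uint64_t pending;     /* buffers queued but not yet loaded */
    uint64_t max_pending; /* arbitrary cap on queued buffers */
} DemoQueue;

/* Store a packet payload at its stream position; packets may arrive in any order. */
static bool demo_queue_insert(DemoQueue *q, uint32_t idx,
                              const void *data, size_t len)
{
    if (idx == UINT32_MAX || idx < q->next_idx) {
        return false;
    }
    if (idx >= q->size) {
        uint32_t new_size = idx + 1;
        DemoBuffer *nb = realloc(q->bufs, (size_t)new_size * sizeof(*nb));
        if (!nb) {
            return false;
        }
        /* Newly added slots start out empty. */
        memset(nb + q->size, 0, (size_t)(new_size - q->size) * sizeof(*nb));
        q->bufs = nb;
        q->size = new_size;
    }
    if (q->bufs[idx].is_present) {
        return false; /* duplicate index */
    }
    if (q->pending >= q->max_pending) {
        return false; /* would exceed the cap */
    }
    void *copy = malloc(len ? len : 1);
    if (!copy) {
        return false;
    }
    if (len) {
        memcpy(copy, data, len);
    }
    q->bufs[idx].data = copy;
    q->bufs[idx].len = len;
    q->bufs[idx].is_present = true;
    q->pending++;
    return true;
}

/* Hand out the next in-order buffer, if it has already arrived. */
static bool demo_queue_pop(DemoQueue *q, DemoBuffer *out)
{
    if (q->next_idx >= q->size || !q->bufs[q->next_idx].is_present) {
        return false;
    }
    assert(q->pending > 0);
    *out = q->bufs[q->next_idx];
    q->bufs[q->next_idx].is_present = false;
    q->bufs[q->next_idx].data = NULL;
    q->next_idx++;
    q->pending--;
    return true;
}
```

With a cap of UINT64_MAX the `pending` check can never trigger, which is the point of the remark above: the counter only matters once a meaningful limit is configured.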
> +
> +    lb->data = g_memdup2(&packet->data, packet_total_size - sizeof(*packet));
> +    lb->len = packet_total_size - sizeof(*packet);
> +    lb->is_present = true;
> +
> +    return true;
> +}
> +
> +static bool vfio_load_state_buffer(void *opaque, char *data, size_t data_size,
> +                                   Error **errp)
> +{
> +    VFIODevice *vbasedev = opaque;
> +    VFIOMigration *migration = vbasedev->migration;
> +    VFIOMultifd *multifd = migration->multifd;
> +    VFIODeviceStatePacket *packet = (VFIODeviceStatePacket *)data;
> +
> +    /*
> +     * Holding BQL here would violate the lock order and can cause
> +     * a deadlock once we attempt to lock load_bufs_mutex below.
> +     */
> +    assert(!bql_locked());
> +
> +    if (!migration->multifd_transfer) {
> +        error_setg(errp,
> +                   "got device state packet but not doing multifd transfer");
> +        return false;
> +    }
> +
> +    assert(multifd);
> +
> +    if (data_size < sizeof(*packet)) {
> +        error_setg(errp, "packet too short at %zu (min is %zu)",
> +                   data_size, sizeof(*packet));
> +        return false;
> +    }
> +
> +    if (packet->version != 0) {

Please add a define for version, even if 0.

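A possible shape for such a define, as a sketch only: `DEMO_PACKET_VERSION`, `DemoPacket` and `demo_packet_version_ok()` are hypothetical names chosen for illustration, and the struct lists only the header fields this function reads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name; the series is free to pick a different one. */
#define DEMO_PACKET_VERSION 0

/* Simplified stand-in for the packet header (payload omitted). */
typedef struct DemoPacket {
    uint32_t version;
    uint32_t idx;
    uint32_t flags;
} DemoPacket;

/* Compare against the named constant instead of a bare literal. */
static bool demo_packet_version_ok(const DemoPacket *packet)
{
    assert(packet != NULL);
    return packet->version == DEMO_PACKET_VERSION;
}
```

A named constant gives the sender and the receiver a single place to change if the packet format is ever revised.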
> +        error_setg(errp, "packet has unknown version %" PRIu32,
> +                   packet->version);
> +        return false;
> +    }
> +
> +    if (packet->idx == UINT32_MAX) {
> +        error_setg(errp, "packet has too high idx %" PRIu32,
> +                   packet->idx);

I don't think printing out packet->idx is useful here.

> +        return false;
> +    }
> +
> +    trace_vfio_load_state_device_buffer_incoming(vbasedev->name, packet->idx);

I wonder if we can add thread ids to trace events. It would be useful.

> +
> +    QEMU_LOCK_GUARD(&multifd->load_bufs_mutex);
> +
> +    /* config state packet should be the last one in the stream */
> +    if (packet->flags & VFIO_DEVICE_STATE_CONFIG_STATE) {
> +        multifd->load_buf_idx_last = packet->idx;
> +    }
> +
> +    if (!vfio_load_state_buffer_insert(vbasedev, packet, data_size, errp)) {

So the migration thread calling multifd_device_state_recv() will
exit and the vfio thread loading the state into the device will
hang until it is aborted?

This sequence is expected to be called to release the vfio thread

    while (multifd->load_bufs_thread_running) {
        multifd->load_bufs_thread_want_exit = true;
        qemu_cond_signal(&multifd->load_bufs_buffer_ready_cond);
        ...
    }

right?

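To make the question concrete, here is a minimal standalone sketch of the general pattern, written with plain pthreads rather than QEMU's primitives. Every name in it (`LoadState`, `loader_thread()`, `receiver_queue_buffer()`, `receiver_abort()`) is hypothetical, and it does not claim to reproduce what the series does; it only shows why a loader blocked on a condition variable needs someone to set an exit flag and signal it when the receive path fails.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the load-side state; not QEMU's VFIOMultifd. */
typedef struct LoadState {
    pthread_mutex_t lock;
    pthread_cond_t buffer_ready;
    bool want_exit;   /* set by whoever needs the loader to stop */
    unsigned ready;   /* buffers queued and not yet consumed */
    unsigned loaded;  /* buffers consumed by the loader */
} LoadState;

static void load_state_init(LoadState *s)
{
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->buffer_ready, NULL);
    s->want_exit = false;
    s->ready = 0;
    s->loaded = 0;
}

/* Loader thread: consumes buffers until it is asked to exit. */
static void *loader_thread(void *opaque)
{
    LoadState *s = opaque;

    pthread_mutex_lock(&s->lock);
    for (;;) {
        /* Without want_exit, a failed receiver would leave us waiting forever. */
        while (!s->want_exit && s->ready == 0) {
            pthread_cond_wait(&s->buffer_ready, &s->lock);
        }
        if (s->want_exit) {
            break;
        }
        assert(s->ready > 0);
        s->ready--;
        s->loaded++;
    }
    pthread_mutex_unlock(&s->lock);
    return NULL;
}

/* Receive path, success: queue one buffer and wake the loader. */
static void receiver_queue_buffer(LoadState *s)
{
    pthread_mutex_lock(&s->lock);
    s->ready++;
    pthread_cond_signal(&s->buffer_ready);
    pthread_mutex_unlock(&s->lock);
}

/* Receive path, failure or cleanup: request exit and wake the loader. */
static void receiver_abort(LoadState *s)
{
    pthread_mutex_lock(&s->lock);
    s->want_exit = true;
    pthread_cond_signal(&s->buffer_ready);
    pthread_mutex_unlock(&s->lock);
}
```

In this sketch, if the receive path returned on error without calling something like `receiver_abort()`, the loader would stay in `pthread_cond_wait()` until another party released it.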

The way the series is presented makes it a bit complex to follow the
proposal, especially regarding the creation and termination of
threads, something the reader should be aware of.

As an initial step in clarifying the design, I would have preferred
a series of patches introducing the various threads, migration threads
and VFIO threads, without any workload. Once the creation and termination
points are established, I would then introduce the workload for each
thread.

Thanks,

C.

> +        return false;
> +    }
> +
> +    qemu_cond_signal(&multifd->load_bufs_buffer_ready_cond);
> +
> +    return true;
> +}
> +
>  static int vfio_save_device_config_state(QEMUFile *f, void *opaque,
>                                           Error **errp)
>  {
> @@ -405,11 +508,23 @@ static VFIOMultifd *vfio_multifd_new(void)
>  {
>      VFIOMultifd *multifd = g_new(VFIOMultifd, 1);
>
> +    vfio_state_buffers_init(&multifd->load_bufs);
> +
> +    qemu_mutex_init(&multifd->load_bufs_mutex);
> +
> +    multifd->load_buf_idx = 0;
> +    multifd->load_buf_idx_last = UINT32_MAX;
> +    multifd->load_buf_queued_pending_buffers = 0;
> +    qemu_cond_init(&multifd->load_bufs_buffer_ready_cond);
> +
>      return multifd;
>  }
>
>  static void vfio_multifd_free(VFIOMultifd *multifd)
>  {
> +    qemu_cond_destroy(&multifd->load_bufs_buffer_ready_cond);
> +    qemu_mutex_destroy(&multifd->load_bufs_mutex);
> +
>      g_free(multifd);
>  }
>
> @@ -940,6 +1055,7 @@ static const SaveVMHandlers savevm_vfio_handlers = {
>      .load_setup = vfio_load_setup,
>      .load_cleanup = vfio_load_cleanup,
>      .load_state = vfio_load_state,
> +    .load_state_buffer = vfio_load_state_buffer,
>      .switchover_ack_needed = vfio_switchover_ack_needed,
>  };
>
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index 83090c544d95..2700b355ecf1 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -3380,6 +3380,8 @@ static const Property vfio_pci_dev_properties[] = {
>      DEFINE_PROP_ON_OFF_AUTO("x-migration-load-config-after-iter", VFIOPCIDevice,
>                              vbasedev.migration_load_config_after_iter,
>                              ON_OFF_AUTO_AUTO),
> +    DEFINE_PROP_UINT64("x-migration-max-queued-buffers", VFIOPCIDevice,
> +                       vbasedev.migration_max_queued_buffers, UINT64_MAX),
>      DEFINE_PROP_BOOL("migration-events", VFIOPCIDevice,
>                       vbasedev.migration_events, false),
>      DEFINE_PROP_BOOL("x-no-mmap", VFIOPCIDevice, vbasedev.no_mmap, false),
> diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
> index 1bebe9877d88..042a3dc54a33 100644
> --- a/hw/vfio/trace-events
> +++ b/hw/vfio/trace-events
> @@ -153,6 +153,7 @@ vfio_load_device_config_state_start(const char *name) " (%s)"
> vfio_load_device_config_state_end(const char *name) " (%s)"
> vfio_load_state(const char *name, uint64_t data) " (%s) data 0x%"PRIx64
> vfio_load_state_device_data(const char *name, uint64_t data_size, int ret) " (%s) size %"PRIu64" ret %d"
> +vfio_load_state_device_buffer_incoming(const char *name, uint32_t idx) " (%s) idx %"PRIu32
> vfio_migration_realize(const char *name) " (%s)"
> vfio_migration_set_device_state(const char *name, const char *state) " (%s) state %s"
> vfio_migration_set_state(const char *name, const char *new_state, const char *recover_state) " (%s) new state %s, recover state %s"
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index c0c9c0b1b263..0e8b0848882e 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -139,6 +139,7 @@ typedef struct VFIODevice {
>      OnOffAuto enable_migration;
>      OnOffAuto migration_multifd_transfer;
>      OnOffAuto migration_load_config_after_iter;
> +    uint64_t migration_max_queued_buffers;
>      bool migration_events;
>      VFIODeviceOps *ops;
>      unsigned int num_irqs;
>