From: Avihai Horon <avihaih@nvidia.com>
To: "Cédric Le Goater" <clg@redhat.com>,
"Maciej S. Szmigiero" <mail@maciej.szmigiero.name>
Cc: "Alex Williamson" <alex.williamson@redhat.com>,
"Eric Blake" <eblake@redhat.com>, "Peter Xu" <peterx@redhat.com>,
"Fabiano Rosas" <farosas@suse.de>,
"Markus Armbruster" <armbru@redhat.com>,
"Daniel P . Berrangé" <berrange@redhat.com>,
"Joao Martins" <joao.m.martins@oracle.com>,
qemu-devel@nongnu.org
Subject: Re: [PATCH 1/2] vfio/migration: Add also max in-flight VFIO device state buffers size limit
Date: Tue, 11 Mar 2025 16:57:08 +0200
Message-ID: <bdd69682-3d0f-4687-a8a5-43a6cb4cecc3@nvidia.com>
In-Reply-To: <fc547687-b313-404c-a6a6-dd599b0a9dbc@redhat.com>
On 11/03/2025 15:04, Cédric Le Goater wrote:
> On 3/7/25 14:45, Maciej S. Szmigiero wrote:
>> On 7.03.2025 13:03, Cédric Le Goater wrote:
>>> On 3/7/25 11:57, Maciej S. Szmigiero wrote:
>>>> From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
>>>>
>>>> There's already a max in-flight VFIO device state buffers *count*
>>>> limit,
>>>
>>> no. there isn't. Do we need both ?
>>
>> This is on top of the remaining patches (x-migration-load-config-after-iter
>> and x-migration-max-queued-buffers) - I thought we were supposed to work
>> on these after the main series was merged as they are relatively
>> non-critical.
>
> yes. we don't need both count and size limits though, a size limit is
> enough.
>
>> I would also give x-migration-load-config-after-iter priority over
>> x-migration-max-queued-buffers{,-size} as the former is a correctness fix
>> while the latter are just additional functionality.
>
> ok. I have kept both patches in my tree with the doc updates.
>
>> Also, if some setup is truly worried about these buffers consuming too much
>> memory then roughly the same thing could be achieved by (temporarily) putting
>> the target QEMU process in a memory-limited cgroup.
>
> yes.
>
> That said,
>
> since QEMU exchanges 1MB VFIODeviceStatePackets when using multifd and
> the overall device state is on the order of 100MB:
>
> /*
>  * This is an arbitrary size based on migration of mlx5 devices, where typically
>  * total device migration size is on the order of 100s of MB. Testing with
>  * larger values, e.g. 128MB and 1GB, did not show a performance improvement.
>  */
> #define VFIO_MIG_DEFAULT_DATA_BUFFER_SIZE (1 * MiB)
>
>
> Could we define the limit to 1GB ?
>
> Avihai, would that make sense ?
>
There can be many use cases, each one with its own requirements and
constraints, so it's hard for me to think of a "good" default value.
IIUC this limit is mostly relevant for the extreme cases where devices
have big state + writing the buffers to the device is slow.
So IMHO let's set it to unlimited by default and let the users decide if
they want to set such a limit and to what value. (Note also that even when
unlimited, it is really limited to 2 * device_state_size.)
Unless you have other reasons why 1GB or other value is preferable?
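
To make the semantics concrete, here is a minimal sketch of what a queued
buffers size limit amounts to on the destination. The struct and function
names are hypothetical and are not taken from hw/vfio/migration-multifd.c;
the only detail borrowed from the patch is that UINT64_MAX means "no limit".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical accounting state; not the actual VFIOMultifd layout. */
typedef struct {
    uint64_t queued_bytes;     /* queued, not yet written to the device */
    uint64_t max_queued_bytes; /* UINT64_MAX disables the limit */
} QueuedBufLimit;

/* Account for a newly received buffer; returns false if it would exceed the limit. */
static bool queued_buf_try_add(QueuedBufLimit *l, uint64_t len)
{
    /* Compare against the remaining room so queued_bytes + len cannot overflow. */
    if (len > l->max_queued_bytes - l->queued_bytes) {
        return false;
    }
    l->queued_bytes += len;
    return true;
}

/* Called after a buffer was written to the device and freed. */
static void queued_buf_release(QueuedBufLimit *l, uint64_t len)
{
    l->queued_bytes -= len;
}
```

With the default of UINT64_MAX the check above effectively never fails, so
only setups that explicitly configure a smaller value would see a load
rejected for exceeding it.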
Thanks.
>
> Thanks,
>
> C.
>
>
>
>
>>
>> On the other hand, the network endianness patch is urgent since it affects
>> the bit stream.
>>
>>>> add also max queued buffers *size* limit.
>>>>
>>>> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
>>>> ---
>>>> docs/devel/migration/vfio.rst | 8 +++++---
>>>> hw/vfio/migration-multifd.c | 21 +++++++++++++++++++--
>>>> hw/vfio/pci.c | 9 +++++++++
>>>> include/hw/vfio/vfio-common.h | 1 +
>>>> 4 files changed, 34 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/docs/devel/migration/vfio.rst b/docs/devel/migration/vfio.rst
>>>> index 7c9cb7bdbf87..127a1db35949 100644
>>>> --- a/docs/devel/migration/vfio.rst
>>>> +++ b/docs/devel/migration/vfio.rst
>>>> @@ -254,12 +254,14 @@ This means that a malicious QEMU source could theoretically cause the target
>>>> QEMU to allocate unlimited amounts of memory for such buffers-in-flight.
>>>> The "x-migration-max-queued-buffers" property allows capping the maximum count
>>>> -of these VFIO device state buffers queued at the destination.
>>>> +of these VFIO device state buffers queued at the destination while
>>>> +"x-migration-max-queued-buffers-size" property allows capping their total queued
>>>> +size.
>>>> Because a malicious QEMU source causing OOM on the target is not expected to be
>>>> a realistic threat in most of VFIO live migration use cases and the right value
>>>> -depends on the particular setup by default this queued buffers limit is
>>>> -disabled by setting it to UINT64_MAX.
>>>> +depends on the particular setup by default these queued buffers limits are
>>>> +disabled by setting them to UINT64_MAX.
>>>> Some host platforms (like ARM64) require that VFIO device config is loaded only
>>>> after all iterables were loaded.
>>>> diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c
>>>> index dccd763d7c39..a9d41b9f1cb1 100644
>>>> --- a/hw/vfio/migration-multifd.c
>>>> +++ b/hw/vfio/migration-multifd.c
>>>> @@ -83,6 +83,7 @@ typedef struct VFIOMultifd {
>>>> uint32_t load_buf_idx;
>>>> uint32_t load_buf_idx_last;
>>>> uint32_t load_buf_queued_pending_buffers;
>>>
>>> 'load_buf_queued_pending_buffers' is not in mainline. Please rebase.
>>>
>>>
>>> Thanks,
>>>
>>> C.
>>
>> Thanks,
>> Maciej
>>
>