From: Fabiano Rosas <farosas@suse.de>
To: Prasad Pandit <ppandit@redhat.com>
Cc: qemu-devel@nongnu.org, "Peter Xu" <peterx@redhat.com>,
"Maciej S . Szmigiero" <mail@maciej.szmigiero.name>,
"Cédric Le Goater" <clg@redhat.com>
Subject: Re: [PATCH 1/2] migration: Add some documentation for multifd
Date: Thu, 20 Mar 2025 11:45:29 -0300
Message-ID: <875xk3bw1i.fsf@suse.de>
In-Reply-To: <CAE8KmOx0KQ7OfbyivQ_256JVRugtJ8ekykxtQw-uz91Uiuv-tg@mail.gmail.com>
Prasad Pandit <ppandit@redhat.com> writes:
> Hello Fabiano,
>
> * First big thank you for starting/writing this document. It is a
> great resource.
>
> On Fri, 7 Mar 2025 at 19:13, Fabiano Rosas <farosas@suse.de> wrote:
>> +++ b/docs/devel/migration/multifd.rst
>> @@ -0,0 +1,254 @@
>> +Multifd
>> +Multifd is the name given for the migration capability that enables
>> +data transfer using multiple threads. Multifd supports all the
>> +transport types currently in use with migration (inet, unix, vsock,
>> +fd, file).
>
> * Multifd is Multiple File Descriptors, right? Ie. Does it work with
> one thread but multiple file descriptors? OR one thread per file
> descriptor is always the case? I have not used/tried 'multifd +
> file://' migration, but I imagined there one thread might be able to
> read/write to multiple file descriptors at a time.
>
Technically both can happen, but one thread with multiple fds is only
the case for file:fdset migration, which requires an extra fd for
O_DIRECT. So "multiple" in the usual sense of "more is better" only
applies to fd-per-thread. IOW, using multiple fds is an implementation
detail IMO. What people really care about is saturating the medium,
which we can only get (with multifd) via parallelization.
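
To make the fd-per-thread idea concrete, here is a toy sketch. It is not
QEMU code: the function name send_parallel, the use of socket pairs as
"channels" and the round-robin split are all assumptions made for
illustration. Each channel gets its own fd and its own sender thread, so
the channels make progress independently of each other.

```python
import socket
import threading


def send_parallel(chunks, channels):
    """Send `chunks` over `channels` socket pairs, one sender thread per fd.

    Hypothetical helper for illustration; returns the bytes received on
    each channel, in channel order.
    """
    pairs = [socket.socketpair() for _ in range(channels)]
    received = [bytearray() for _ in range(channels)]

    def sender(idx):
        tx = pairs[idx][0]
        # Round-robin split: channel idx sends chunks idx, idx+channels, ...
        for chunk in chunks[idx::channels]:
            tx.sendall(chunk)
        tx.close()  # EOF tells the receiver on this channel to stop

    def receiver(idx):
        rx = pairs[idx][1]
        while True:
            data = rx.recv(65536)
            if not data:
                break
            received[idx] += data
        rx.close()

    threads = [
        threading.Thread(target=fn, args=(i,))
        for i in range(channels)
        for fn in (sender, receiver)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [bytes(buf) for buf in received]
```

The point of the sketch is only that the number of fds follows from the
number of threads, not the other way around.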
>> +Usage
>> +-----
>> +
>> +On both source and destination, enable the ``multifd`` capability:
>> +
>> + ``migrate_set_capability multifd on``
>> +
>> +Define a number of channels to use (default is 2, but 8 usually
>> +provides best performance).
>> +
>> + ``migrate_set_parameter multifd-channels 8``
>> +
>
> * I get that this is a QEMU documentation, but for users/reader's
> convenience it'll help to point to libvirt:virsh migrate usage here ->
> https://www.libvirt.org/manpages/virsh.html#migrate , just as an
> alternative.
AFAIK, we tend to not do that in QEMU docs.
> Because doing migration via QMP commands is not as
> straightforward, I wonder who might do that and why.
>
All QEMU developers, libvirt developers, cloud software developers,
kernel developers, etc.
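
For anyone driving this over QMP rather than HMP, the two commands quoted
above map onto migrate-set-capabilities and migrate-set-parameters. Below
is a minimal sketch that only builds the JSON messages. The helper name
multifd_setup_commands is made up for illustration, and sending the
messages over a QMP socket (including the initial qmp_capabilities
handshake) is left out.

```python
import json


def multifd_setup_commands(channels=8):
    """Build QMP messages mirroring the HMP usage quoted above.

    Hypothetical helper: it returns plain dicts and does not talk to QEMU.
    """
    return [
        {
            "execute": "migrate-set-capabilities",
            "arguments": {
                "capabilities": [{"capability": "multifd", "state": True}]
            },
        },
        {
            "execute": "migrate-set-parameters",
            "arguments": {"multifd-channels": channels},
        },
    ]


def to_wire(commands):
    """Serialize each message as one JSON document per line."""
    return "\n".join(json.dumps(cmd) for cmd in commands)
```

The same messages would have to be sent on both source and destination,
as the quoted text says.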
>
>> +Restrictions
>> +------------
>> +
>> +For migration to a file, support is conditional on the presence of the
>> +mapped-ram capability, see `mapped-ram`.
>> +
>> +Snapshots are currently not supported.
>
> * Maybe: Snapshot using multiple threads (multifd) is not supported.
>
>> +`postcopy` migration is currently not supported.
>
> * Maybe - 'postcopy' migration using multiple threads (multifd) is not
> supported. ie. 'postcopy' uses a single thread to transfer migration
> data.
>
> * Reason for these suggestions: as a writer it is easy to think
> everything written in this page is to be taken with multifd context,
> but readers may not do that, they may take sentences in isolation.
> (just sharing thoughts)
>
Sure, I can expand on those.
>> +Multifd consists of:
>> +
>> +- A client that produces the data on the migration source side and
>> + consumes it on the destination. Currently the main client code is
>> + ram.c, which selects the RAM pages for migration;
>
> * So multifd mechanism can be used to transfer non-ram data as well? I
> thought it's only used for RAM migration. Are device/gpu states etc
> bits also transferred via multifd threads?
>
Device state migration with multifd has been merged for 10.0.
<rant>
If it were up to me, we'd have a pool of multifd threads that transmit
everything migration-related. Unfortunately, that's not so
straightforward to implement without rewriting a lot of code, because
multifd requires too much entanglement from the data producer. We're
constantly dealing with details of data transmission getting in the way
of data production/consumption (e.g. try to change ram.c to produce
multiple pages at once and watch everything explode).

I've been experimenting with a MultiFDIov payload type to allow
separation between the data type handling details and multifd inner
workings. However, in order for that to be useful, we'd need to have a
sync that doesn't depend on control data on the main migration
thread. That's why I've been asking Peter about a multifd-only sync
in the other thread.

There's a bunch of other issues as well:
- no clear distinction between what should go in the header and what
should go in the packet
- the header taking up one slot in the iov, which should in theory be
the responsibility of the client
- the whole multifd_ops situation, which doesn't allow a clear interface
between multifd and client
- the lack of uniformity between send/recv with regard to doing I/O from
multifd code or from client code
- the recv side having two different modes of operation, socket and file
the list goes on...
</rant>
>> +- A packet which is the final result of all the data aggregation
>> + and/or transformation. The packet contains: a *header* with magic and
>> + version numbers and flags that inform of special processing needed
>> + on the destination; a *payload-specific header* with metadata referent
>> + to the packet's data portion, e.g. page counts; and a variable-size
>> + *data portion* which contains the actual opaque payload data.
>
> * It'll help to define the exact packet format here. Like they do in RFCs.
I'll try to produce some ascii art.
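
In the meantime, the three parts described in the quoted text (header,
payload-specific header, data portion) can be sketched as below. This is
NOT the actual multifd wire format: the field widths, the MAGIC and
VERSION values, the fixed PAGE_SIZE and the function names are all
placeholders chosen for illustration.

```python
import struct

# Hypothetical layout, for illustration only; not QEMU's wire format.
HEADER = struct.Struct("!III")        # magic, version, flags
PAYLOAD_HEADER = struct.Struct("!I")  # page count (payload-specific)
MAGIC = 0x4D464431                    # placeholder value
VERSION = 1                           # placeholder value
PAGE_SIZE = 4096                      # assumed fixed page size


def pack_packet(flags, pages):
    """Concatenate header, payload-specific header and data portion."""
    for page in pages:
        if len(page) != PAGE_SIZE:
            raise ValueError("every page must be PAGE_SIZE bytes")
    return (
        HEADER.pack(MAGIC, VERSION, flags)
        + PAYLOAD_HEADER.pack(len(pages))
        + b"".join(pages)
    )


def unpack_packet(buf):
    """Validate the header, then split the data portion back into pages."""
    magic, version, flags = HEADER.unpack_from(buf, 0)
    if magic != MAGIC or version != VERSION:
        raise ValueError("bad magic or version")
    (count,) = PAYLOAD_HEADER.unpack_from(buf, HEADER.size)
    data = buf[HEADER.size + PAYLOAD_HEADER.size:]
    if len(data) != count * PAGE_SIZE:
        raise ValueError("data portion does not match page count")
    pages = [data[i * PAGE_SIZE:(i + 1) * PAGE_SIZE] for i in range(count)]
    return flags, pages
```

The receiver only needs the generic header to decide how to process the
packet; the page count is what lets it interpret the otherwise opaque
data portion.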
>
> Thank you for writing this.
> ---
> - Prasad