qemu-devel.nongnu.org archive mirror
From: Fabiano Rosas <farosas@suse.de>
To: Prasad Pandit <ppandit@redhat.com>
Cc: qemu-devel@nongnu.org, "Peter Xu" <peterx@redhat.com>,
	"Maciej S . Szmigiero" <mail@maciej.szmigiero.name>,
	"Cédric Le Goater" <clg@redhat.com>
Subject: Re: [PATCH 1/2] migration: Add some documentation for multifd
Date: Thu, 20 Mar 2025 11:45:29 -0300	[thread overview]
Message-ID: <875xk3bw1i.fsf@suse.de> (raw)
In-Reply-To: <CAE8KmOx0KQ7OfbyivQ_256JVRugtJ8ekykxtQw-uz91Uiuv-tg@mail.gmail.com>

Prasad Pandit <ppandit@redhat.com> writes:

> Hello Fabiano,
>
> * First big thank you for starting/writing this document. It is a
> great resource.
>
> On Fri, 7 Mar 2025 at 19:13, Fabiano Rosas <farosas@suse.de> wrote:
>> +++ b/docs/devel/migration/multifd.rst
>> @@ -0,0 +1,254 @@
>> +Multifd
>> +Multifd is the name given for the migration capability that enables
>> +data transfer using multiple threads. Multifd supports all the
>> +transport types currently in use with migration (inet, unix, vsock,
>> +fd, file).
>
> * Multifd is "multiple file descriptors", right? I.e., does it work
> with one thread but multiple file descriptors, or is one thread per
> file descriptor always the case? I have not used/tried 'multifd +
> file://' migration, but I imagined there one thread might be able to
> read/write to multiple file descriptors at a time.
>

Technically both can happen, but that would just be the case of
file:fdset migration, which requires an extra fd for O_DIRECT. So
"multiple" in the usual sense of "more is better" is only
fd-per-thread. IOW, using multiple fds is an implementation detail
IMO; what people really care about is medium saturation, which (with
multifd) we can only get via parallelization.
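To make the fd-per-thread point concrete, here's a toy sketch (plain
Python, nothing QEMU-specific): each thread owns exactly one
descriptor, and any throughput gain comes from running the writers in
parallel, not from the number of fds per se.

```python
import os
import tempfile
import threading

def writer(fd: int, payload: bytes) -> None:
    # Each thread owns exactly one file descriptor, mirroring
    # multifd's one-thread-per-channel model.
    os.write(fd, payload)

# One fd per thread; the parallel writers are what saturate the medium.
channels = []
for _ in range(4):
    fd, path = tempfile.mkstemp()
    channels.append((fd, path))

threads = [
    threading.Thread(target=writer, args=(fd, b"x" * 4096))
    for fd, _ in channels
]
for t in threads:
    t.start()
for t in threads:
    t.join()

sizes = []
for fd, path in channels:
    os.close(fd)
    sizes.append(os.path.getsize(path))
    os.unlink(path)
print(sizes)
```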

>> +Usage
>> +-----
>> +
>> +On both source and destination, enable the ``multifd`` capability:
>> +
>> +    ``migrate_set_capability multifd on``
>> +
>> +Define the number of channels to use (default is 2, but 8 usually
>> +provides the best performance).
>> +
>> +    ``migrate_set_parameter multifd-channels 8``
>> +
>
> * I get that this is a QEMU documentation, but for users/reader's
> convenience it'll help to point to libvirt:virsh migrate usage here ->
> https://www.libvirt.org/manpages/virsh.html#migrate , just as an
> alternative.

AFAIK, we tend not to do that in QEMU docs.

> Because doing migration via QMP commands is not as
> straightforward, I wonder who might do that and why.
>

All of QEMU developers, libvirt developers, cloud software developers,
kernel developers etc.
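For reference, the QMP equivalents of the HMP commands quoted above
look like this (just the command payloads; the QMP greeting/handshake
and socket plumbing are omitted):

```python
import json

# QMP counterparts of "migrate_set_capability multifd on" and
# "migrate_set_parameter multifd-channels 8". The capability and
# parameter names are the same as in HMP.
enable_multifd = {
    "execute": "migrate-set-capabilities",
    "arguments": {
        "capabilities": [{"capability": "multifd", "state": True}],
    },
}
set_channels = {
    "execute": "migrate-set-parameters",
    "arguments": {"multifd-channels": 8},
}

# Each command is sent as one JSON object over the QMP socket.
for cmd in (enable_multifd, set_channels):
    print(json.dumps(cmd))
```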

>
>> +Restrictions
>> +------------
>> +
>> +For migration to a file, support is conditional on the presence of the
>> +mapped-ram capability, see `mapped-ram`.
>> +
>> +Snapshots are currently not supported.
>
> * Maybe: Snapshot using multiple threads (multifd) is not supported.
>
>> +`postcopy` migration is currently not supported.
>
> * Maybe: 'postcopy' migration using multiple threads (multifd) is not
> supported, i.e. 'postcopy' uses a single thread to transfer migration
> data.
>
> * Reason for these suggestions: as a writer it is easy to think
> everything written in this page is to be taken with multifd context,
> but readers may not do that, they may take sentences in isolation.
> (just sharing thoughts)
>

Sure, I can expand on those.

>> +Multifd consists of:
>> +
>> +- A client that produces the data on the migration source side and
>> +  consumes it on the destination. Currently the main client code is
>> +  ram.c, which selects the RAM pages for migration;
>
> * So the multifd mechanism can be used to transfer non-RAM data as
> well? I thought it was only used for RAM migration. Are device/GPU
> states etc. also transferred via multifd threads?
>

Device state migration with multifd has been merged for 10.0.

<rant>
If it were up to me, we'd have a pool of multifd threads that transmit
everything migration-related. Unfortunately, that's not so
straightforward to implement without rewriting a lot of code; multifd
requires too much entanglement from the data producer. We're constantly
dealing with details of data transmission getting in the way of data
production/consumption (e.g. try to change ram.c to produce multiple
pages at once and watch everything explode).

I've been experimenting with a MultiFDIov payload type to allow
separation between the data type handling details and multifd inner
workings. However, in order for that to be useful, we'd need a sync
that doesn't depend on control data on the main migration thread.
That's why I've been asking about a multifd-only sync with Peter
in the other thread.

There's a bunch of other issues as well:

- no clear distinction between what should go in the header and what
  should go in the packet.

- the header taking up one slot in the iov, which should in theory be
  the responsibility of the client

- the whole multifd_ops situation which doesn't allow a clear interface
  between multifd and client

- the lack of uniformity between send/recv in regards to doing I/O from
  multifd code or from client code

- the recv having two different modes of operation, socket and file

the list goes on...
</rant>

>> +- A packet which is the final result of all the data aggregation
>> +  and/or transformation. The packet contains: a *header* with magic and
>> +  version numbers and flags that inform of special processing needed
>> +  on the destination; a *payload-specific header* with metadata
>> +  referring to the packet's data portion, e.g. page counts; and a variable-size
>> +  *data portion* which contains the actual opaque payload data.
>
> * It would help to define the exact packet format here, like they do in RFCs.

I'll try to produce some ascii art.
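In the meantime, a rough sketch of packing/unpacking just the outer
header fields described above (the field widths and the magic value
here are illustrative, not a verbatim copy of the MultiFDPacket_t
definition):

```python
import struct

# Illustrative layout only: three big-endian u32 fields for
# magic, version, and flags, as described in the doc text.
HEADER_FMT = ">III"

def pack_header(magic: int, version: int, flags: int) -> bytes:
    # Serialize the outer header; payload-specific metadata and the
    # opaque data portion would follow this in the real packet.
    return struct.pack(HEADER_FMT, magic, version, flags)

def unpack_header(buf: bytes) -> tuple:
    # Parse only the fixed-size outer header from the front of a packet.
    return struct.unpack(HEADER_FMT, buf[:struct.calcsize(HEADER_FMT)])

hdr = pack_header(0x11223344, 1, 0)
print(unpack_header(hdr))
```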

>
> Thank you for writing this.
> ---
>   - Prasad


Thread overview: 19+ messages
2025-03-07 13:42 [PATCH 0/2] migration: multifd documentation Fabiano Rosas
2025-03-07 13:42 ` [PATCH 1/2] migration: Add some documentation for multifd Fabiano Rosas
2025-03-07 17:27   ` Peter Xu
2025-03-07 19:06     ` Fabiano Rosas
2025-03-07 22:15       ` Peter Xu
2025-03-10 14:24         ` Fabiano Rosas
2025-03-10 15:22           ` Peter Xu
2025-03-10 19:27             ` Fabiano Rosas
2025-03-20 12:06               ` Prasad Pandit
2025-03-20 13:38                 ` Fabiano Rosas
2025-03-20 11:50   ` Prasad Pandit
2025-03-20 14:45     ` Fabiano Rosas [this message]
2025-03-20 15:56       ` Peter Xu
2025-03-20 17:12         ` Fabiano Rosas
2025-03-21 10:47       ` Prasad Pandit
2025-03-21 14:04         ` Fabiano Rosas
2025-03-24 11:14           ` Prasad Pandit
2025-03-07 13:42 ` [PATCH 2/2] migration: Move compression docs under multifd Fabiano Rosas
2025-03-07 17:28   ` Peter Xu
