From: Peter Xu <peterx@redhat.com>
To: Fabiano Rosas <farosas@suse.de>
Cc: qemu-devel@nongnu.org,
	"Maciej S . Szmigiero" <mail@maciej.szmigiero.name>,
	"Cédric Le Goater" <clg@redhat.com>
Subject: Re: [PATCH 1/2] migration: Add some documentation for multifd
Date: Fri, 7 Mar 2025 17:15:03 -0500
Message-ID: <Z8tv53G5s9MLYv6f@x1.local>
In-Reply-To: <87tt84u0d2.fsf@suse.de>

On Fri, Mar 07, 2025 at 04:06:17PM -0300, Fabiano Rosas wrote:
> > I never tried vsock; would it be used in any use case?
> >
> 
> I don't know, I'm going by what's in the code.
> 
> > It seems to have been introduced by accident in 72a8192e225cea, but I'm
> > not sure.  Maybe there's something I missed.
> 
> The code has always had some variation of:
> 
> static bool transport_supports_multi_channels(const char *uri)
> {
>     return strstart(uri, "tcp:", NULL) || strstart(uri, "unix:", NULL) ||
>            strstart(uri, "vsock:", NULL);
> }
> 
> Introduced by b7acd65707 ("migration: allow multifd for socket protocol
> only").

Looks like a copy-paste error..  I found it; it should be 9ba3b2baa1
("migration: add vsock as data channel support").

https://lore.kernel.org/all/2bc0e226-ee71-330a-1bcd-bd9d097509bc@huawei.com/
https://kvmforum2019.sched.com/event/Tmzh/zero-next-generation-virtualization-platform-for-huawei-cloud-jinsong-liu-zhichao-huang-huawei
https://e.huawei.com/sa/material/event/HC/e37b9c4c33e14e869bb1183fab468fed

So if I read it right.. the VM in this case is inside a container or
something, then it talks to an "agent" on a PCIe device which understands
the virtio-vsock protocol.  So maybe vsock just performs better than the
alternatives for tunneling out of the container.

In that case, mentioning vsock is at least ok.
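
For reference, the URI format that commit added is vsock:<cid>:<port>, so
a vsock multifd migration looks like any other socket transport.  A quick
example (CID 3 and port 4321 are made up for illustration; the capability
needs enabling on the dest side too):

    (qemu) migrate_set_capability multifd on
    (qemu) migrate -d vsock:3:4321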

[...]

> >> +After the channels have been put into a wait state by the sync
> >> +functions, the client code may continue to transmit additional data by
> >> +issuing ``multifd_send()`` once again.
> >> +
> >> +Note:
> >> +
> >> +- the RAM migration does, effectively, a global synchronization by
> >> +  chaining a call to ``multifd_send_sync_main()`` with the emission of a
> >> +  flag on the main migration channel (``RAM_SAVE_FLAG_MULTIFD_FLUSH``)
> >
> > ... or RAM_SAVE_FLAG_EOS ... depending on the machine type.
> >
> 
> Eh.. big compatibility mess.  I'd rather not mention it.

It's not strictly a compatibility mess.  IIUC, it used to be designed to
always work with EOS.  I think at that time Juan was still focused on
making it work rather than on perf tuning, but then we found it can be a
major perf issue if we flush too often.  Then if we flush only once per
round, it may not always pair with an EOS.  That's why we needed a new
message.

But hey, you're writing a doc that helps everyone.  You get to decide
whether you want to mention it or not. :)

IIRC we updated our compat rule so we only maintain each machine type for
6 years.  It means the whole per-iteration + EOS stuff can be removed in
3.5 years or so - we did that work in July 2022.  So it isn't that
important to mention either.
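
For the record, the dest side dispatch ends up roughly like below.  This
is a simplified sketch based on ram.c, not verbatim code (the helper name
is made up); migrate_multifd_flush_after_each_section() is the compat
knob for the old per-section behavior:

    static void ram_load_handle_flush(int flags)
    {
        /* New machine types: explicit per-round flush message. */
        if (flags & RAM_SAVE_FLAG_MULTIFD_FLUSH) {
            multifd_recv_sync_main();
        }

        /* Old machine types: sync on every EOS, i.e. once per section. */
        if ((flags & RAM_SAVE_FLAG_EOS) &&
            migrate_multifd_flush_after_each_section()) {
            multifd_recv_sync_main();
        }
    }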

> 
> > Maybe we should also add a sentence on the relationship of
> > MULTIFD_FLAG_SYNC and RAM_SAVE_FLAG_MULTIFD_FLUSH (or RAM_SAVE_FLAG_EOS ),
> > in that they should always be sent together, and only if so would it
> > provide ordering of multifd messages and what happens in the main migration
> > thread.
> >
> 
> The problem is that RAM_SAVE_FLAGs are a ram.c thing. In theory the need
> for RAM_SAVE_FLAG_MULTIFD_FLUSH is just because the RAM migration is
> driven by the source machine by the flags that are put on the
> stream. IOW, this is a RAM migration design, not a multifd design. The
> multifd design is (could be, we decide) that once sync packets are sent,
> _something_ must do the following:
> 
>     for (i = 0; i < thread_count; i++) {
>         trace_multifd_recv_sync_main_wait(i);
>         qemu_sem_wait(&multifd_recv_state->sem_sync);
>     }
> 
> ... which is already part of multifd_recv_sync_main(), but that just
> _happens to be_ called by ram.c when it sees the
> RAM_SAVE_FLAG_MULTIFD_FLUSH flag on the stream; that's not a multifd
> design requirement. The ram.c code could, for instance, do the sync when
> some QEMU_VM_SECTION_EOS (or whatever it's called) appears.

I still think it should be done in RAM code only.  One major goal (if not
the only goal..) is to order different versions of pages, and that's
purely a RAM module concern, not migration in general.

From that POV, a QEMU_VM_* message is kind of the wrong layer - those
should work for migration in general.

That said, I agree we do violate it from time to time; for example, we
have a bunch of subcmds (MIG_CMD_POSTCOPY*) just for postcopy, under
QEMU_VM_COMMAND.  But IIUC those were either kind of ancient (so we need
to stick with them now.. postcopy has been around for 10 years) or they
need some ping-pong messages, in which case QEMU_VM_COMMAND is the
easiest.  IMHO we should still try to stick with the layering if
possible.
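
To make the pairing explicit, the send side in ram.c is roughly the below
(simplified sketch, signatures trimmed; the main channel flag is only
emitted for new machine types, old ones rely on EOS instead):

    /*
     * Kick MULTIFD_FLAG_SYNC onto every multifd channel, then tell the
     * dest on the main channel when it should wait for those syncs.
     */
    multifd_send_sync_main();
    if (!migrate_multifd_flush_after_each_section()) {
        qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
    }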

-- 
Peter Xu


