From: Fabiano Rosas <farosas@suse.de>
To: Peter Xu <peterx@redhat.com>
Cc: qemu-devel@nongnu.org,
"Maciej S . Szmigiero" <mail@maciej.szmigiero.name>
Subject: Re: [RFC PATCH v2 0/9] migration/multifd: Remove multifd_send_state->pages
Date: Tue, 23 Jul 2024 14:48:48 -0300 [thread overview]
Message-ID: <8734o0yp6n.fsf@suse.de> (raw)
In-Reply-To: <Zp7k4wF1W6Fzp7YW@x1n>
Peter Xu <peterx@redhat.com> writes:
> On Mon, Jul 22, 2024 at 06:20:28PM -0300, Fabiano Rosas wrote:
>> Peter Xu <peterx@redhat.com> writes:
>>
>> > On Mon, Jul 22, 2024 at 05:21:48PM -0300, Fabiano Rosas wrote:
>> >> Peter Xu <peterx@redhat.com> writes:
>> >>
>> >> > On Mon, Jul 22, 2024 at 02:59:05PM -0300, Fabiano Rosas wrote:
>> >> >> Hi,
>> >> >>
>> >> >> In this v2 I took Peter's suggestion of keeping the channels' pointers
>> >> >> and moving only the extra slot. The major changes are in patches 5 and
>> >> >> 9. Patch 3 introduces the structure:
>> >> >>
>> >> >> typedef enum {
>> >> >> MULTIFD_PAYLOAD_NONE,
>> >> >> MULTIFD_PAYLOAD_RAM,
>> >> >> } MultiFDPayloadType;
>> >> >>
>> >> >> struct MultiFDSendData {
>> >> >> MultiFDPayloadType type;
>> >> >> union {
>> >> >> MultiFDPages_t ram;
>> >> >> } u;
>> >> >> };
>> >> >>
>> >> >> I added a NONE type so we can use it to tell when the channel has
>> >> >> finished sending a packet, since we'll need to switch types between
>> >> >> clients anyway. This avoids having to introduce a 'size', or 'free'
>> >> >> variable.
>> >> >
>> >> > This at least looks better to me, thanks.
>> >> >
>> >> >>
>> >> >> WHAT'S MISSING:
>> >> >>
>> >> >> - The support for calling multifd_send() concurrently. Maciej has this
>> >> >> in his series so I didn't touch it.
>> >> >>
>> >> >> - A way of adding methods for the new payload type. Currently, the
>> >> >> compression methods are somewhat coupled with ram migration, so I'm
>> >> >> not sure how to proceed.
>> >> >
>> >> > What is this one? Why compression methods need new payload? Aren't they
>> >> > ram-typed?
>> >>
>> >> The data we transport is MultiFDPages_t, yes, but the MultiFDMethods are
>> >> either nocomp, or the compression-specific methods
>> >> (e.g. zlib_send_prepare).
>> >>
>> >> How do we add methods for the upcoming new payload types? I don't expect
>> >> us to continue using nocomp and then do "if (ram)... else if
>> >> (device_state) ..." inside of them. I would expect us to rename
>> >> s/nocomp/ram/ and add a new set of MultiFDMethods for the new data type
>> >> (e.g. vfio_send_prepare, vmstate_send_prepare, etc).
>> >>
>> >> multifd_nocomp_ops -> multifd_ram_ops // rename
>> >> multifd_zlib_ops // existing
>> >> multifd_device_ops // new
>> >>
>> >> The challenge here is that the current framework is nocomp
>> >> vs. compression. It needs to become ram + compression vs. other types.
>> >
>> > IMHO we can keep multifd_ops[] only for RAM. There's only send_prepare()
>> > that device state will need, and so far it's only (referring Maciej's
>> > code):
>> >
>> > static int nocomp_send_prepare_device_state(MultiFDSendParams *p,
>> > Error **errp)
>> > {
>> > multifd_send_prepare_header_device_state(p);
>> >
>> > assert(!(p->flags & MULTIFD_FLAG_SYNC));
>> >
>> > p->next_packet_size = p->device_state->buf_len;
>> > if (p->next_packet_size > 0) {
>> > p->iov[p->iovs_num].iov_base = p->device_state->buf;
>> > p->iov[p->iovs_num].iov_len = p->next_packet_size;
>> > p->iovs_num++;
>> > }
>> >
>> > p->flags |= MULTIFD_FLAG_NOCOMP | MULTIFD_FLAG_DEVICE_STATE;
>> >
>> > multifd_send_fill_packet_device_state(p);
>> >
>> > return 0;
>> > }
>> >
>> > None of other multifd_ops are used.
>>
>> There's also a conditional around device_state when calling
>> ->recv(). That could seems like it could go to a hook.
>>
>> https://lore.kernel.org/r/41dedaf2c9abebb5e45f88c052daa26320715a92.1718717584.git.maciej.szmigiero@oracle.com
>
> Actually that's exactly what I think is right.. it looks to me now that we
> could bypass anything in MultifdOps (including recv()) but let device state
> be a parallel layer of MultifdOps itself, leaving MultifdOps only for
> compressors.
>
> And yeah, I still remember you just renamed it from recv_pages() to
> recv().. it's just that now when think it again it looks like cleaner to
> make it only about pages..
>
>>
>> >
>> > I think we can directly invoke this part of device state code in
>> > multifd_send_thread() for now. So far I think it should be ok.
>>
>> It's not just that. There's also a check for "if (ram)" at every call to
>> multifd_ops to avoid calling the ram code when doing the device
>> migration. It would be way easier to just set noop functions for those.
>>
>> static MultiFDMethods multifd_devstate_ops = {
>> .send_setup = noop_send_setup,
>> .send_cleanup = noop_send_cleanup,
>> .send_prepare = devstate_send_prepare,
>> .recv_setup = noop_recv_setup,
>> .recv_cleanup = noop_recv_cleanup,
>> .recv = devstate_recv
>> };
>>
>> I'm not saying this needs to be done in this series though. But I do
>> think that's the correct design choice for the long term.
>
> Yes it should be separate.
>
> And what I meant is we don't need all these noops, but recv() keeps being
> ignored just like above, then for sender side, right now it's:
>
> ret = multifd_send_state->ops->send_prepare(p, &local_err);
> if (migrate_mapped_ram()) {
> file_write_ramblock_iov();
> } else {
> ret = qio_channel_writev_full_all();
> }
>
> VFIO can process device state in parallel, so:
>
> if (ram) {
> ret = multifd_send_state->ops->send_prepare(p, &local_err);
> if (migrate_mapped_ram()) {
> file_write_ramblock_iov();
> } else {
> qio_channel_writev_full_all();
> }
> } else {
> // device state handling
> multifd_send_device_prepare(...);
> ...
> qio_channel_writev_full_all();
> }
>
> Then MultifdOps doesn't apply to device states.
To avoid getting into bikeshed territory, I think we should postpone
this discussion until after Maciej's series is merged, so we can speak
more concretely about the implications. It's easy enough to go from your
suggestion to mine than the other way around, so let's leave at that.
I had it already written, so more of my reasoning below, if you're
interested.
======
We already have the send/recv threads structured in relation to what we
do inside the hooks. You're just defining a function that's not a hook,
but it has the same signature and responsibilities and needs to be
called at the same moment.
I think the dissonance here is that you don't see the multifd thread
code and the payloads (ram, device) as separate layers. Payload-specific
code should not be at top level. Otherwise, it breaks any semblance of
proper layering:
- payload code will have access to MultiFD*Params, which has a bunch of
control variables for the loop, the semaphores, etc. that should not
be touched;
- payload code ends up influencing the flow of the thread
function. E.g. when zero_copy_send used to dictate whether we'd have
separate IO for the packet or not.
- temporary variables needed by the payload code will have to be
declared inside the thread funcion, which makes tempting to use them
across payload types and also in the thread code itself;
- it creates doubt as to whether new changes go inside the hooks, in the
if/else or outside of it;
Think about how easy it has has been to review and merge the various
compression features we had. It doesn't matter how much they mess up
inside the hooks, it will never cause the dreaded "Memory content
inconsistency at ..." error from check_guest_ram(). At least not in a
way that affects other people. Now compare that with for instance the
zero-page work, or even mapped-ram, that required a bunch of changes to
the multifd control flow itself (e.g. all of the sync changes w/
mapped-ram).
next prev parent reply other threads:[~2024-07-23 17:49 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-07-22 17:59 [RFC PATCH v2 0/9] migration/multifd: Remove multifd_send_state->pages Fabiano Rosas
2024-07-22 17:59 ` [RFC PATCH v2 1/9] migration/multifd: Reduce access to p->pages Fabiano Rosas
2024-07-22 17:59 ` [RFC PATCH v2 2/9] migration/multifd: Pass in MultiFDPages_t to file_write_ramblock_iov Fabiano Rosas
2024-07-22 19:21 ` Peter Xu
2024-07-22 17:59 ` [RFC PATCH v2 3/9] migration/multifd: Introduce MultiFDSendData Fabiano Rosas
2024-07-22 19:22 ` Peter Xu
2024-07-22 17:59 ` [RFC PATCH v2 4/9] migration/multifd: Make MultiFDPages_t:offset a flexible array member Fabiano Rosas
2024-07-22 19:20 ` Peter Xu
2024-07-22 20:36 ` Fabiano Rosas
2024-07-22 17:59 ` [RFC PATCH v2 5/9] migration/multifd: Replace p->pages with an union pointer Fabiano Rosas
2024-07-22 17:59 ` [RFC PATCH v2 6/9] migration/multifd: Move pages accounting into multifd_send_zero_page_detect() Fabiano Rosas
2024-07-22 19:29 ` Peter Xu
2024-07-22 20:07 ` Fabiano Rosas
2024-07-22 20:38 ` Peter Xu
2024-07-22 17:59 ` [RFC PATCH v2 7/9] migration/multifd: Isolate ram pages packet data Fabiano Rosas
2024-07-22 19:37 ` Peter Xu
2024-07-22 20:34 ` Fabiano Rosas
2024-07-22 21:06 ` Peter Xu
2024-07-24 9:30 ` Maciej S. Szmigiero
2024-07-22 17:59 ` [RFC PATCH v2 8/9] migration/multifd: Don't send ram data during SYNC Fabiano Rosas
2024-07-22 21:03 ` Peter Xu
2024-07-22 21:24 ` Fabiano Rosas
2024-07-22 17:59 ` [RFC PATCH v2 9/9] migration/multifd: Replace multifd_send_state->pages with client data Fabiano Rosas
2024-07-22 19:55 ` Peter Xu
2024-07-22 20:26 ` Fabiano Rosas
2024-07-22 20:41 ` Peter Xu
2024-07-22 19:58 ` [RFC PATCH v2 0/9] migration/multifd: Remove multifd_send_state->pages Peter Xu
2024-07-22 20:21 ` Fabiano Rosas
2024-07-22 20:54 ` Peter Xu
2024-07-22 21:20 ` Fabiano Rosas
2024-07-22 23:01 ` Peter Xu
2024-07-23 17:48 ` Fabiano Rosas [this message]
2024-07-23 18:20 ` Peter Xu
2024-07-23 20:50 ` Fabiano Rosas
2024-07-23 21:11 ` Peter Xu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=8734o0yp6n.fsf@suse.de \
--to=farosas@suse.de \
--cc=mail@maciej.szmigiero.name \
--cc=peterx@redhat.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.