qemu-devel.nongnu.org archive mirror
From: Fabiano Rosas <farosas@suse.de>
To: Peter Xu <peterx@redhat.com>
Cc: qemu-devel@nongnu.org,
	"Maciej S . Szmigiero" <mail@maciej.szmigiero.name>,
	"Cédric Le Goater" <clg@redhat.com>,
	"Avihai Horon" <avihaih@nvidia.com>,
	"Alex Williamson" <alex.williamson@redhat.com>,
	"Prasad Pandit" <ppandit@redhat.com>
Subject: Re: [PATCH v2 7/7] migration/multifd: Document the reason to sync for save_setup()
Date: Fri, 06 Dec 2024 14:01:01 -0300
Message-ID: <87frn0bvaq.fsf@suse.de>
In-Reply-To: <Z1MZ8hwPUQBLMLcb@x1n>

Peter Xu <peterx@redhat.com> writes:

> On Fri, Dec 06, 2024 at 11:40:27AM -0300, Fabiano Rosas wrote:
>> Peter Xu <peterx@redhat.com> writes:
>> 
>> > It's not straightforward to see why src QEMU needs to sync multifd during
>> > setup() phase.  After all, there's no page queued at that point.
>> >
>> > For old QEMUs, there's a solid reason: EOS requires it to work.  It's less
>> > clear for new QEMUs, which do not treat EOS messages as sync requests.
>> >
>> > One will figure that out only when this is conditionally removed.  In fact,
>> > the author did try it out.  Logically we could still avoid doing this on
>> > new machine types, however that would need a separate compat field, which
>> > can be overkill just to save some trivial overhead in the setup() phase.
>> >
>> > Let's instead document it completely, to avoid someone else trying this
>> > again and redoing the debugging, or anyone being confused about why this
>> > ever existed.
>> >
>> > Signed-off-by: Peter Xu <peterx@redhat.com>
>> > ---
>> >  migration/ram.c | 27 +++++++++++++++++++++++++++
>> >  1 file changed, 27 insertions(+)
>> >
>> > diff --git a/migration/ram.c b/migration/ram.c
>> > index 5d4bdefe69..ddee703585 100644
>> > --- a/migration/ram.c
>> > +++ b/migration/ram.c
>> > @@ -3036,6 +3036,33 @@ static int ram_save_setup(QEMUFile *f, void *opaque, Error **errp)
>> >          migration_ops->ram_save_target_page = ram_save_target_page_legacy;
>> >      }
>> >  
>> > +    /*
>> > +     * This operation is unfortunate..
>> > +     *
>> > +     * For legacy QEMUs using per-section sync
>> > +     * =======================================
>> > +     *
>> > +     * This must exist because the EOS below requires the SYNC messages
>> > +     * per-channel to work.
>> > +     *
>> > +     * For modern QEMUs using per-round sync
>> > +     * =====================================
>> > +     *
>> > +     * Logically this sync is not needed (because we know there's nothing
>> > +     * in the multifd queue yet!).
>> 
>> This is a bit misleading because even today we could split the
>> multifd_ram_flush_and_sync() into _flush and _sync (haven't I seen a
>> patch doing this somewhere? Maybe from Maciej...) and call just the
>> _sync here, which is unrelated to any multifd queue.
>
> Yeah you have a point, maybe at least I shouldn't mention the queues, they
> can be irrelevant.
>
>> 
>> I think we shouldn't tie "sync" with "wait for multifd threads to finish
>> sending their data (a kind of flush)" as this implies. The sync is as
>> much making sure the threads are ready to receive as it is making sure
>> the data is received in order in relation to ram scanning rounds.
>> 
>> IOW, the local sync is what ensures multifd send threads are flushed
>> while this code deals with the sync of src&dst threads, which is "just"
>> a synchronization point between the two QEMUs.
>
> This is a remote sync, not a local sync.  But yes, it can be the
> synchronization point.
>
> As I mentioned below, such a sync point has nothing to do with src, so it
> could be implemented on dest alone without such a sync message.  It works,
> though.
>
>> 
>> > However as a side effect, this makes
>> > +     * sure the dest side won't receive any data before it properly reaches
>> > +     * ram_load_precopy().
>> 
>> I'm not sure it's a side-effect. It seems deliberate to me, seeing that
>> multifd usually does its own synchronization. For instance, on the send
>> side we also need some sync to make sure ram.c doesn't send data to
>> multifd send threads that are not ready yet (i.e. the wait on
>> channels_ready at the start of multifd_send()).
>
> Yes, and that's exactly what I wanted to express.  If dest has that
> "channels_ready" thing, I'm pretty sure we don't need this remote sync.
> We're using a heavier sync to serve the purpose of a "local sync for
> dest".  It's ok, but it's very unclear what it really does.
>
>> 
>> > +     *
>> > +     * Without this sync, src QEMU can send data too soon so that dest may
>> > +     * not have been ready to receive it (e.g., rb->receivedmap may be
>> > +     * uninitialized, for example).
>> > +     *
>> > +     * Logically "wait for recv setup ready" shouldn't need to involve src
>> > +     * QEMU at all, however to be compatible with old QEMUs, let's stick
>> 
>> I don't understand this statement, you're saying that QEMU src could
>> just start dumping data on the channel without a remote end? Certainly
>> for file migrations, but socket as well?
>
> Yes.
>
> When reaching here on sender side, all multifd channels are already there:
> multifd_send_setup() guaranteed it.  We can start dumping things, AFAICT,
> regardless of what dest is doing.
>
> Maybe the recv threads are not even created, but that isn't relevant, IMO,
> as long as they'll be there at some point and start collecting socket
> buffers.
>
>> 
>> > +     * with this.  Fortunately the overhead is low to sync during setup
>> > +     * because the VM is running, so at least it's not accounted as part of
>> > +     * downtime.
>> > +     */
>> >      bql_unlock();
>> >      ret = multifd_ram_flush_and_sync(f);
>> >      bql_lock();
>> 
>
> I removed ambiguous wordings on "queue is empty", and simplified it a bit.
> How's this one look?
>
>     /*
>      * This operation is unfortunate..
>      *
>      * For legacy QEMUs using per-section sync
>      * =======================================
>      *
>      * This must exist because the EOS below requires the SYNC messages
>      * per-channel to work.
>      *
>      * For modern QEMUs using per-round sync
>      * =====================================
>      *
>      * Logically such sync is not needed, and recv threads should not run
>      * until setup ready (using things like channels_ready on src).  Then
>      * we should be all fine.
>      *
>      * However even if we add channels_ready to recv side in new QEMUs, old
>      * QEMU won't have them so this sync will still be needed to make sure
>      * multifd recv threads won't start processing guest pages early before
>      * ram_load_setup() is properly done.
>      *
>      * Let's stick with this.  Fortunately the overhead is low to sync
>      * during setup because the VM is running, so at least it's not
>      * accounted as part of downtime.
>      */

Looks ok, thanks!


Thread overview: 25+ messages
2024-12-06  0:58 [PATCH v2 0/7] migration/multifd: Some VFIO / postcopy preparations on flush Peter Xu
2024-12-06  0:58 ` [PATCH v2 1/7] migration/multifd: Further remove the SYNC on complete Peter Xu
2024-12-06 13:17   ` Fabiano Rosas
2024-12-06 14:40     ` Peter Xu
2024-12-06  0:58 ` [PATCH v2 2/7] migration/multifd: Allow to sync with sender threads only Peter Xu
2024-12-06 13:26   ` Fabiano Rosas
2024-12-06 14:50     ` Peter Xu
2024-12-06 15:00       ` Fabiano Rosas
2024-12-06  0:58 ` [PATCH v2 3/7] migration/ram: Move RAM_SAVE_FLAG* into ram.h Peter Xu
2024-12-06 13:43   ` Fabiano Rosas
2024-12-06 15:03     ` Peter Xu
2024-12-06 15:10       ` Fabiano Rosas
2024-12-06 15:46         ` Peter Xu
2024-12-06 16:58           ` Fabiano Rosas
2024-12-06  0:58 ` [PATCH v2 4/7] migration/multifd: Unify RAM_SAVE_FLAG_MULTIFD_FLUSH messages Peter Xu
2024-12-06 14:12   ` Fabiano Rosas
2024-12-06  0:58 ` [PATCH v2 5/7] migration/multifd: Remove sync processing on postcopy Peter Xu
2024-12-06 14:19   ` Fabiano Rosas
2024-12-06  0:58 ` [PATCH v2 6/7] migration/multifd: Cleanup src flushes on condition check Peter Xu
2024-12-06 14:18   ` Fabiano Rosas
2024-12-06 15:13     ` Peter Xu
2024-12-06  0:58 ` [PATCH v2 7/7] migration/multifd: Document the reason to sync for save_setup() Peter Xu
2024-12-06 14:40   ` Fabiano Rosas
2024-12-06 15:36     ` Peter Xu
2024-12-06 17:01       ` Fabiano Rosas [this message]
