From: Peter Xu <peterx@redhat.com>
To: Fabiano Rosas <farosas@suse.de>
Cc: qemu-devel@nongnu.org, Hao Xiang <hao.xiang@bytedance.com>,
	Bryan Zhang <bryan.zhang@bytedance.com>,
	Avihai Horon <avihaih@nvidia.com>, Yuan Liu <yuan1.liu@intel.com>,
	Prasad Pandit <ppandit@redhat.com>
Subject: Re: [PATCH v2 18/23] migration/multifd: Rewrite multifd_queue_page()
Date: Mon, 5 Feb 2024 12:03:10 +0800
Message-ID: <ZcBd_hDAUKgmUAhX@x1n>
In-Reply-To: <87cyteh9va.fsf@suse.de>

On Fri, Feb 02, 2024 at 05:47:05PM -0300, Fabiano Rosas wrote:
> peterx@redhat.com writes:
> 
> > From: Peter Xu <peterx@redhat.com>
> >
> > The current multifd_queue_page() is not easy to read and follow.  It is
> > not good for a few reasons:
> >
> >   - No helpers at all to show what exactly a condition means; in short,
> >   readability is low.
> >
> >   - Relies on pages->ramblock being cleared to detect an empty queue.
> >   That is a slight overload of the ramblock pointer, per Fabiano [1],
> >   and I agree.
> >
> >   - Contains a self-recursion, even though it is not necessary.
> >
> > Rewrite this function, adding some comments to make it even clearer
> > what it does.
> >
> > [1] https://lore.kernel.org/r/87wmrpjzew.fsf@suse.de
> >
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> 
> Reviewed-by: Fabiano Rosas <farosas@suse.de>
> 
> Patch looks good, but I have a question below.
> 
> > ---
> >  migration/multifd.c | 56 ++++++++++++++++++++++++++++++---------------
> >  1 file changed, 37 insertions(+), 19 deletions(-)
> >
> > diff --git a/migration/multifd.c b/migration/multifd.c
> > index 35d4e8ad1f..4ab8e6eff2 100644
> > --- a/migration/multifd.c
> > +++ b/migration/multifd.c
> > @@ -506,35 +506,53 @@ static bool multifd_send_pages(void)
> >      return true;
> >  }
> >  
> > +static inline bool multifd_queue_empty(MultiFDPages_t *pages)
> > +{
> > +    return pages->num == 0;
> > +}
> > +
> > +static inline bool multifd_queue_full(MultiFDPages_t *pages)
> > +{
> > +    return pages->num == pages->allocated;
> > +}
> > +
> > +static inline void multifd_enqueue(MultiFDPages_t *pages, ram_addr_t offset)
> > +{
> > +    pages->offset[pages->num++] = offset;
> > +}
> > +
> >  /* Returns true if enqueue successful, false otherwise */
> >  bool multifd_queue_page(RAMBlock *block, ram_addr_t offset)
> >  {
> > -    MultiFDPages_t *pages = multifd_send_state->pages;
> > -    bool changed = false;
> > +    MultiFDPages_t *pages;
> > +
> > +retry:
> > +    pages = multifd_send_state->pages;
> >  
> > -    if (!pages->block) {
> > +    /* If the queue is empty, we can enqueue the page right away */
> > +    if (multifd_queue_empty(pages)) {
> >          pages->block = block;
> > +        multifd_enqueue(pages, offset);
> > +        return true;
> >      }
> >  
> > -    if (pages->block == block) {
> > -        pages->offset[pages->num] = offset;
> > -        pages->num++;
> > -
> > -        if (pages->num < pages->allocated) {
> > -            return true;
> > +    /*
> > +     * Not empty, but we may need a flush first.  It can be because of either:
> > +     *
> > +     * (1) The page is not on the same ramblock as the previous ones, or,
> > +     * (2) The queue is full.
> > +     *
> > +     * After a flush, always retry.
> > +     */
> > +    if (pages->block != block || multifd_queue_full(pages)) {
> > +        if (!multifd_send_pages()) {
> > +            return false;
> >          }
> > -    } else {
> > -        changed = true;
> > -    }
> > -
> > -    if (!multifd_send_pages()) {
> > -        return false;
> > -    }
> > -
> > -    if (changed) {
> > -        return multifd_queue_page(block, offset);
> > +        goto retry;
> >      }
> >  
> > +    /* Not empty, and we still have space, do it! */
> > +    multifd_enqueue(pages, offset);
> 
> Hm, here you're missing the flush of the last group of pages of the last
> ramblock. Just like the current code...
> 
> ...which means we're relying on the multifd_send_pages() in
> multifd_send_sync_main() to send the last few pages. So how can that
> work when multifd_flush_after_each_section==false? Because it skips the
> sync flag, but it would also skip the last send. I'm confused.

IIUC it won't skip the final flush of the last pages.  See
find_dirty_block():

            if (migrate_multifd() &&
                !migrate_multifd_flush_after_each_section()) {
                QEMUFile *f = rs->pss[RAM_CHANNEL_PRECOPY].pss_channel;
                int ret = multifd_send_sync_main();
                if (ret < 0) {
                    return ret;
                }
                qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
                qemu_fflush(f);
            }

IMHO this should be the last flush of the pages, done when we loop one
more round.
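
To illustrate, a condensed sketch of find_dirty_block()'s wrap-around
path, as I read it (only the multifd part quoted above is verbatim; the
rest is paraphrased, so take the exact shape with a grain of salt):

    if (!pss->block) {
        /* Ran past the last ramblock: one full round of the scan is done */
        if (migrate_multifd() &&
            !migrate_multifd_flush_after_each_section()) {
            /* Drain whatever multifd still has queued before wrapping */
            int ret = multifd_send_sync_main();
            if (ret < 0) {
                return ret;
            }
            qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
            qemu_fflush(f);
        }
        /* Wrap back to the first ramblock and keep scanning */
        pss->block = QLIST_FIRST_RCU(&ram_list.blocks);
        pss->complete_round = true;
    }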

Maybe what you're talking about is this one (in ram_save_complete())?

    if (migrate_multifd() && !migrate_multifd_flush_after_each_section()) {
        qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
    }

I remember we talked about this somewhere in your "file" series, but..
AFAIU this last RAM_SAVE_FLAG_MULTIFD_FLUSH might be redundant; it just
needs some justification, to double check that I didn't miss something.

Now multifd_queue_page() is kind of lazy on flushing.  I think that may
make some sense (we don't assign a job until required, so maybe there's a
higher chance that one thread is free?), but I'm not sure whether that's a
huge deal if the NIC is the bottleneck, because in that case we'll wait
for the sender threads anyway, and they should all be busy at any time.

However, even if we flush immediately as soon as the queue is full, we'd
still better check that the queue is empty before migration completes,
to make sure nothing is left behind.
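
Roughly, what I mean is something like below (purely illustrative, only
reusing the helpers introduced in this patch; multifd_send_flush_final()
is a made-up name, not anything that exists):

    static int multifd_send_flush_final(void)
    {
        MultiFDPages_t *pages = multifd_send_state->pages;

        /* Push any last partially-filled batch to a sender thread */
        if (!multifd_queue_empty(pages) && !multifd_send_pages()) {
            return -1;
        }

        return 0;
    }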

-- 
Peter Xu


