From: Peter Xu <peterx@redhat.com>
To: Fabiano Rosas <farosas@suse.de>
Cc: qemu-devel@nongnu.org, Juan Quintela <quintela@redhat.com>,
	Leonardo Bras <leobras@redhat.com>,
	Elena Ufimtseva <elena.ufimtseva@oracle.com>
Subject: Re: [RFC PATCH 1/3] migration/multifd: Move channels_ready semaphore
Date: Tue, 10 Oct 2023 17:00:37 -0400
Message-ID: <ZSW7dfSgV2dc6n0D@x1n>
In-Reply-To: <20230922145319.27380-2-farosas@suse.de>

On Fri, Sep 22, 2023 at 11:53:17AM -0300, Fabiano Rosas wrote:
> Commit d2026ee117 ("multifd: Fix the number of channels ready") moved
> the "post" of channels_ready to the start of the multifd_send_thread()
> loop and added a missing "wait" at multifd_send_sync_main(). While it
> does work, the placement of the wait goes against what the rest of the
> code does.
> 
> The sequence at multifd_send_thread() is:
> 
>     qemu_sem_post(&multifd_send_state->channels_ready);
>     qemu_sem_wait(&p->sem);
>     <work>
>     if (flags & MULTIFD_FLAG_SYNC) {
>         qemu_sem_post(&p->sem_sync);
>     }
> 
> Which means that the sending thread makes itself available
> (channels_ready) and waits for more work (sem). So the sequence in the
> migration thread should be to check if any channel is available
> (channels_ready), give it some work and set it off (sem):
> 
>     qemu_sem_wait(&multifd_send_state->channels_ready);

Here it means we have at least 1 free send thread, then...

>     <enqueue work>
>     qemu_sem_post(&p->sem);

... here we enqueue some work to the current thread (pointed to by "i"),
whether or not it's free, as "i" may not always point to a free thread.
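
To make that concrete, here is a toy model (not the multifd code):
channels_ready is a plain counter, and toy_state / toy_enqueue_to() are
made-up names for illustration only.  It just shows that a successful wait
tells us "some channel is free", not "channel i is free":

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NCHAN 3

/* Hypothetical stand-in for the multifd send state. */
struct toy_state {
    int channels_ready;      /* counts free channels, not which ones */
    bool busy[TOY_NCHAN];    /* per-channel "has work" marker */
};

/*
 * Model wait(channels_ready) followed by queueing work on channel i.
 * Returns -1 if the wait would block, 0 if the work landed on a channel
 * that was already busy, 1 if it landed on a free channel.
 */
static int toy_enqueue_to(struct toy_state *s, int i)
{
    bool was_free;

    assert(i >= 0 && i < TOY_NCHAN);
    if (s->channels_ready == 0) {
        return -1;
    }
    s->channels_ready--;     /* the wait succeeded: *some* channel is free */
    was_free = !s->busy[i];
    s->busy[i] = true;       /* work goes to channel i regardless */
    return was_free ? 1 : 0;
}
```

With channels_ready == 1 and only channel 1 free, toy_enqueue_to(&s, 0)
returns 0: the wait succeeded, yet the work went to a busy channel.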

>     if (flags & MULTIFD_FLAG_SYNC) {
>         qemu_sem_wait(&p->sem_sync);
>     }

So I must confess I have never fully digested how these sem/mutex/.. work in
multifd, since the first day it was introduced.. so please take the comments
below with a grain of salt..

It seems to me that the current design allows >1 pending_job for a thread.
Here the current code doesn't do "wait(channels_ready)" because it doesn't
need to - it simply always queues a MULTIFD_FLAG_SYNC pending job on the
thread, and waits for it to run.

From that POV I think I can understand why "wait(channels_ready)" is not
needed here.  But then I'm confused because we don't have a real QUEUE to
put those requests in; we simply apply this:

multifd_send_sync_main():
        p->flags |= MULTIFD_FLAG_SYNC;

We do this even though the send thread may be busy handling a batch of pages
and accessing p->flags.  I think it can actually race with the send thread
reading the flag at the exact same time:

multifd_send_thread():
            multifd_send_fill_packet(p);
            flags = p->flags;  <-------------- here

And whether it sees MULTIFD_FLAG_SYNC is unpredictable.  If it sees it,
it'll post(sem_sync) in this round.  If it doesn't see it, it'll
post(sem_sync) in the next round.  Either way, I think we'll put an empty
multifd packet on the wire, even though I don't know whether that's
needed at all...
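
A single-threaded toy model of the two outcomes, in case it helps.  It is
only a sketch of my reading above, not the real multifd_send_thread():
toy_chan, toy_request_sync() and toy_run() are made-up names, and the
"race" is replaced by a switch that decides whether the SYNC request lands
before or after the sender's snapshot of the flags:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_FLAG_SYNC 1u     /* stand-in for MULTIFD_FLAG_SYNC */

/* Hypothetical stand-in for the per-channel params. */
struct toy_chan {
    uint32_t flags;
    int pending_job;
    int packets_sent;
    int sync_posts;          /* counts post(sem_sync) */
};

/* Requester side: mark SYNC and queue one more job on the channel. */
static void toy_request_sync(struct toy_chan *p)
{
    p->flags |= TOY_FLAG_SYNC;
    p->pending_job++;
}

/*
 * Run the sender until it has no pending job.  The channel starts with one
 * batch of pages queued.  Returns the round in which sem_sync was posted
 * and stores the number of packets sent in *packets.
 */
static int toy_run(bool sync_before_snapshot, int *packets)
{
    struct toy_chan p = { .pending_job = 1 };
    int round = 0, sync_round = 0;

    if (sync_before_snapshot) {
        toy_request_sync(&p);
    }
    while (p.pending_job > 0) {
        uint32_t flags = p.flags;        /* flags = p->flags; */

        round++;
        p.flags = 0;
        if (!sync_before_snapshot && round == 1) {
            toy_request_sync(&p);        /* request arrives just too late */
        }
        p.packets_sent++;                /* "send" this round's packet */
        p.pending_job--;
        assert(p.pending_job >= 0);
        if (flags & TOY_FLAG_SYNC) {
            p.sync_posts++;
            sync_round = round;
        }
    }
    *packets = p.packets_sent;
    return sync_round;
}
```

In this model the post happens in round 1 or round 2 depending on the
switch, and two packets go out in both cases: one for the pages and one
that exists only because of the extra pending job.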

I'm not sure whether we should fix it in a more complete form, by not
sending that empty multifd packet at all?  It only contains the header
without any real page inside, IIUC, so it seems to be a waste of
resources.  What we want here is only to kick sem_sync?

> 
> The reason there's no deadlock today is that the migration thread
> enqueues the SYNC packet right before the wait on channels_ready and
> we end up taking advantage of the out-of-order post to sem:
> 
>         ...
>         qemu_sem_post(&p->sem);
>     }
>     for (i = 0; i < migrate_multifd_channels(); i++) {
>         MultiFDSendParams *p = &multifd_send_state->params[i];
> 
>         qemu_sem_wait(&multifd_send_state->channels_ready);
>         trace_multifd_send_sync_main_wait(p->id);
>         qemu_sem_wait(&p->sem_sync);
> 	...
> 
> Move the channels_ready wait before the sem post to keep the sequence
> consistent. Also fix the error path to post to channels_ready and
> sem_sync in the correct order.
> 
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> ---
>  migration/multifd.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/migration/multifd.c b/migration/multifd.c
> index a7c7a947e3..d626740f2f 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -618,6 +618,7 @@ int multifd_send_sync_main(QEMUFile *f)
>  
>          trace_multifd_send_sync_main_signal(p->id);
>  
> +        qemu_sem_wait(&multifd_send_state->channels_ready);
>          qemu_mutex_lock(&p->mutex);
>  
>          if (p->quit) {
> @@ -635,7 +636,6 @@ int multifd_send_sync_main(QEMUFile *f)
>      for (i = 0; i < migrate_multifd_channels(); i++) {
>          MultiFDSendParams *p = &multifd_send_state->params[i];
>  
> -        qemu_sem_wait(&multifd_send_state->channels_ready);
>          trace_multifd_send_sync_main_wait(p->id);
>          qemu_sem_wait(&p->sem_sync);
>  
> @@ -763,8 +763,8 @@ out:
>       * who pay attention to me.
>       */
>      if (ret != 0) {
> -        qemu_sem_post(&p->sem_sync);
>          qemu_sem_post(&multifd_send_state->channels_ready);
> +        qemu_sem_post(&p->sem_sync);

I'm not sure why this reordering would make a difference; afaiu, with
semaphore semantics, the order of post() to two different sems doesn't matter?
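
What I have in mind, as a minimal sketch with plain POSIX semaphores
(assuming Linux/glibc, where sem_init() works for unnamed semaphores; sems
"a" and "b" are stand-ins for channels_ready and sem_sync, and
toy_both_takeable() is a made-up helper).  It only covers a waiter that
looks at the sems after both posts have happened:

```c
#include <assert.h>
#include <stdbool.h>
#include <semaphore.h>

/*
 * Post two semaphores in the requested order, then check that a waiter
 * taking them in a fixed order (a, then b) gets both without blocking.
 */
static bool toy_both_takeable(bool post_a_first)
{
    sem_t a, b;
    sem_t *first = post_a_first ? &a : &b;
    sem_t *second = post_a_first ? &b : &a;
    bool ok;
    int rc;

    rc = sem_init(&a, 0, 0);
    assert(rc == 0);
    rc = sem_init(&b, 0, 0);
    assert(rc == 0);
    (void)rc;

    ok = sem_post(first) == 0 && sem_post(second) == 0;

    /* sem_trywait() fails instead of blocking if the count is zero. */
    ok = ok && sem_trywait(&a) == 0 && sem_trywait(&b) == 0;

    sem_destroy(&a);
    sem_destroy(&b);
    return ok;
}
```

Both toy_both_takeable(true) and toy_both_takeable(false) return true here.
It says nothing about a waiter that runs in between the two posts, which
may be the case the patch cares about.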

>      }
>  
>      qemu_mutex_lock(&p->mutex);
> -- 
> 2.35.3
> 

-- 
Peter Xu


