All of lore.kernel.org
 help / color / mirror / Atom feed
From: Fabiano Rosas <farosas@suse.de>
To: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Cc: qemu-devel@nongnu.org, Juan Quintela <quintela@redhat.com>,
	Peter Xu <peterx@redhat.com>, Leonardo Bras <leobras@redhat.com>
Subject: Re: [RFC PATCH 1/3] migration/multifd: Move channels_ready semaphore
Date: Fri, 29 Sep 2023 11:41:42 -0300	[thread overview]
Message-ID: <8734yx11p5.fsf@suse.de> (raw)
In-Reply-To: <o66r3yvserkbkfxczdweqpr3vzo2fj5jix2dv4hs22g32alqqh@lwk6qjp3fi74>

Elena Ufimtseva <elena.ufimtseva@oracle.com> writes:

> On Fri, Sep 22, 2023 at 11:53:17AM -0300, Fabiano Rosas wrote:
>> Commit d2026ee117 ("multifd: Fix the number of channels ready") moved
>> the "post" of channels_ready to the start of the multifd_send_thread()
>> loop and added a missing "wait" at multifd_send_sync_main(). While it
>> does work, the placement of the wait goes against what the rest of the
>> code does.
>> 
>> The sequence at multifd_send_thread() is:
>> 
>>     qemu_sem_post(&multifd_send_state->channels_ready);
>>     qemu_sem_wait(&p->sem);
>>     <work>
>>     if (flags & MULTIFD_FLAG_SYNC) {
>>         qemu_sem_post(&p->sem_sync);
>>     }
>> 
>> Which means that the sending thread makes itself available
>> (channels_ready) and waits for more work (sem). So the sequence in the
>> migration thread should be to check if any channel is available
>> (channels_ready), give it some work and set it off (sem):
>> 
>>     qemu_sem_wait(&multifd_send_state->channels_ready);
>>     <enqueue work>
>>     qemu_sem_post(&p->sem);
>>     if (flags & MULTIFD_FLAG_SYNC) {
>>         qemu_sem_wait(&p->sem_sync);
>>     }
>> 
>> The reason there's no deadlock today is that the migration thread
>> enqueues the SYNC packet right before the wait on channels_ready and
>> we end up taking advantage of the out-of-order post to sem:
>> 
>>         ...
>>         qemu_sem_post(&p->sem);
>>     }
>>     for (i = 0; i < migrate_multifd_channels(); i++) {
>>         MultiFDSendParams *p = &multifd_send_state->params[i];
>> 
>>         qemu_sem_wait(&multifd_send_state->channels_ready);
>>         trace_multifd_send_sync_main_wait(p->id);
>>         qemu_sem_wait(&p->sem_sync);
>> 	...
>> 
>> Move the channels_ready wait before the sem post to keep the sequence
>> consistent. Also fix the error path to post to channels_ready and
>> sem_sync in the correct order.
>>
>
> Thank you Fabiano,
>
> Your solution is more complete. I also had in mind getting rid of
> sem_sync.
>
> With your second patch, this one could be merged with it?
>
>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>> ---
>>  migration/multifd.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>> 
>> diff --git a/migration/multifd.c b/migration/multifd.c
>> index a7c7a947e3..d626740f2f 100644
>> --- a/migration/multifd.c
>> +++ b/migration/multifd.c
>> @@ -618,6 +618,7 @@ int multifd_send_sync_main(QEMUFile *f)
>>  
>>          trace_multifd_send_sync_main_signal(p->id);
>>  
>> +        qemu_sem_wait(&multifd_send_state->channels_ready);
>>          qemu_mutex_lock(&p->mutex);
>>  
>>          if (p->quit) {
>> @@ -635,7 +636,6 @@ int multifd_send_sync_main(QEMUFile *f)
>>      for (i = 0; i < migrate_multifd_channels(); i++) {
>>          MultiFDSendParams *p = &multifd_send_state->params[i];
>>  
>> -        qemu_sem_wait(&multifd_send_state->channels_ready);
>>          trace_multifd_send_sync_main_wait(p->id);
>>          qemu_sem_wait(&p->sem_sync);
>>  
>> @@ -763,8 +763,8 @@ out:
>>       * who pay attention to me.
>>       */
>>      if (ret != 0) {
>> -        qemu_sem_post(&p->sem_sync);
>>          qemu_sem_post(&multifd_send_state->channels_ready);
>> +        qemu_sem_post(&p->sem_sync);
>
> Can this thread in this error case be woken up again between
> these two qemu_sem_posts?
> I see in other places p->quit is set to true before it.
> Or maybe it should one more patch to make these consistent 
> as well.

That's a good point. There's clearly something going on here if we need
a 'running', a 'quit' and a 'exiting' flag. The tls code uses quit as a
signal in one direction while the regular multifd path uses it in
another.

I'll give it some more thought. Thanks


  reply	other threads:[~2023-09-29 14:42 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-22 14:53 [RFC PATCH 0/3] migration/multifd: SYNC packet changes Fabiano Rosas
2023-09-22 14:53 ` [RFC PATCH 1/3] migration/multifd: Move channels_ready semaphore Fabiano Rosas
2023-09-22 22:33   ` Elena Ufimtseva
2023-09-29 14:41     ` Fabiano Rosas [this message]
2023-10-10 21:00   ` Peter Xu
2023-10-10 21:40     ` Peter Xu
2023-10-10 21:43     ` Fabiano Rosas
2023-10-10 21:59       ` Peter Xu
2023-09-22 14:53 ` [RFC PATCH 2/3] migration/multifd: Decouple control flow from the SYNC packet Fabiano Rosas
2023-09-22 14:53 ` [RFC PATCH 3/3] migration/multifd: Extract sem_done waiting into a function Fabiano Rosas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8734yx11p5.fsf@suse.de \
    --to=farosas@suse.de \
    --cc=elena.ufimtseva@oracle.com \
    --cc=leobras@redhat.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.