From: Fabiano Rosas <farosas@suse.de>
To: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Cc: qemu-devel@nongnu.org, Juan Quintela <quintela@redhat.com>,
Peter Xu <peterx@redhat.com>, Leonardo Bras <leobras@redhat.com>
Subject: Re: [RFC PATCH 1/3] migration/multifd: Move channels_ready semaphore
Date: Fri, 29 Sep 2023 11:41:42 -0300 [thread overview]
Message-ID: <8734yx11p5.fsf@suse.de> (raw)
In-Reply-To: <o66r3yvserkbkfxczdweqpr3vzo2fj5jix2dv4hs22g32alqqh@lwk6qjp3fi74>
Elena Ufimtseva <elena.ufimtseva@oracle.com> writes:
> On Fri, Sep 22, 2023 at 11:53:17AM -0300, Fabiano Rosas wrote:
>> Commit d2026ee117 ("multifd: Fix the number of channels ready") moved
>> the "post" of channels_ready to the start of the multifd_send_thread()
>> loop and added a missing "wait" at multifd_send_sync_main(). While it
>> does work, the placement of the wait goes against what the rest of the
>> code does.
>>
>> The sequence at multifd_send_thread() is:
>>
>> qemu_sem_post(&multifd_send_state->channels_ready);
>> qemu_sem_wait(&p->sem);
>> <work>
>> if (flags & MULTIFD_FLAG_SYNC) {
>> qemu_sem_post(&p->sem_sync);
>> }
>>
>> Which means that the sending thread makes itself available
>> (channels_ready) and waits for more work (sem). So the sequence in the
>> migration thread should be to check if any channel is available
>> (channels_ready), give it some work and set it off (sem):
>>
>> qemu_sem_wait(&multifd_send_state->channels_ready);
>> <enqueue work>
>> qemu_sem_post(&p->sem);
>> if (flags & MULTIFD_FLAG_SYNC) {
>> qemu_sem_wait(&p->sem_sync);
>> }
>>
>> The reason there's no deadlock today is that the migration thread
>> enqueues the SYNC packet right before the wait on channels_ready and
>> we end up taking advantage of the out-of-order post to sem:
>>
>> ...
>> qemu_sem_post(&p->sem);
>> }
>> for (i = 0; i < migrate_multifd_channels(); i++) {
>> MultiFDSendParams *p = &multifd_send_state->params[i];
>>
>> qemu_sem_wait(&multifd_send_state->channels_ready);
>> trace_multifd_send_sync_main_wait(p->id);
>> qemu_sem_wait(&p->sem_sync);
>> ...
>>
>> Move the channels_ready wait before the sem post to keep the sequence
>> consistent. Also fix the error path to post to channels_ready and
>> sem_sync in the correct order.
>>
>
> Thank you Fabiano,
>
> Your solution is more complete. I also had in mind getting rid of
> sem_sync.
>
> With your second patch, this one could be merged with it?
>
>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>> ---
>> migration/multifd.c | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/migration/multifd.c b/migration/multifd.c
>> index a7c7a947e3..d626740f2f 100644
>> --- a/migration/multifd.c
>> +++ b/migration/multifd.c
>> @@ -618,6 +618,7 @@ int multifd_send_sync_main(QEMUFile *f)
>>
>> trace_multifd_send_sync_main_signal(p->id);
>>
>> + qemu_sem_wait(&multifd_send_state->channels_ready);
>> qemu_mutex_lock(&p->mutex);
>>
>> if (p->quit) {
>> @@ -635,7 +636,6 @@ int multifd_send_sync_main(QEMUFile *f)
>> for (i = 0; i < migrate_multifd_channels(); i++) {
>> MultiFDSendParams *p = &multifd_send_state->params[i];
>>
>> - qemu_sem_wait(&multifd_send_state->channels_ready);
>> trace_multifd_send_sync_main_wait(p->id);
>> qemu_sem_wait(&p->sem_sync);
>>
>> @@ -763,8 +763,8 @@ out:
>> * who pay attention to me.
>> */
>> if (ret != 0) {
>> - qemu_sem_post(&p->sem_sync);
>> qemu_sem_post(&multifd_send_state->channels_ready);
>> + qemu_sem_post(&p->sem_sync);
>
> Can this thread in this error case be woken up again between
> these two qemu_sem_posts?
> I see in other places p->quit is set to true before it.
> Or maybe it should one more patch to make these consistent
> as well.
That's a good point. There's clearly something going on here if we need
a 'running', a 'quit' and a 'exiting' flag. The tls code uses quit as a
signal in one direction while the regular multifd path uses it in
another.
I'll give it some more thought. Thanks
next prev parent reply other threads:[~2023-09-29 14:42 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-09-22 14:53 [RFC PATCH 0/3] migration/multifd: SYNC packet changes Fabiano Rosas
2023-09-22 14:53 ` [RFC PATCH 1/3] migration/multifd: Move channels_ready semaphore Fabiano Rosas
2023-09-22 22:33 ` Elena Ufimtseva
2023-09-29 14:41 ` Fabiano Rosas [this message]
2023-10-10 21:00 ` Peter Xu
2023-10-10 21:40 ` Peter Xu
2023-10-10 21:43 ` Fabiano Rosas
2023-10-10 21:59 ` Peter Xu
2023-09-22 14:53 ` [RFC PATCH 2/3] migration/multifd: Decouple control flow from the SYNC packet Fabiano Rosas
2023-09-22 14:53 ` [RFC PATCH 3/3] migration/multifd: Extract sem_done waiting into a function Fabiano Rosas
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=8734yx11p5.fsf@suse.de \
--to=farosas@suse.de \
--cc=elena.ufimtseva@oracle.com \
--cc=leobras@redhat.com \
--cc=peterx@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=quintela@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).