From: Juan Quintela <quintela@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: "Fabiano Rosas" <farosas@suse.de>,
qemu-devel@nongnu.org, "Leonardo Bras" <leobras@redhat.com>,
"Philippe Mathieu-Daudé" <philmd@linaro.org>
Subject: Re: [RFC PATCH 2/2] migration/multifd: Move semaphore release into main thread
Date: Thu, 16 Nov 2023 15:56:49 +0100 [thread overview]
Message-ID: <875y21rbke.fsf@secure.mitica> (raw)
In-Reply-To: <ZU0rY662a5C1mvyf@x1n> (Peter Xu's message of "Thu, 9 Nov 2023 13:56:35 -0500")
Peter Xu <peterx@redhat.com> wrote:
> On Thu, Nov 09, 2023 at 01:58:56PM -0300, Fabiano Rosas wrote:
>> We cannot operate on the multifd semaphores outside of the multifd
>> channel thread
>> because multifd_save_cleanup() can run in parallel and
>> attempt to destroy the mutexes, which causes an assert.
>>
>> Looking at the places where we use the semaphores aside from the
>> migration thread, there's only the TLS handshake thread and the
>> initial multifd_channel_connect() in the main thread. These are places
>> where creating the multifd channel cannot fail, so releasing the
>> semaphores at these places only serves the purpose of avoiding a
>> deadlock when an error happens before the channel(s) have started, but
>> after the migration thread has already called
>> multifd_send_sync_main().
>>
>> Instead of attempting to release the semaphores at those places, move
>> the release into multifd_save_cleanup(). This puts the semaphore usage
>> in the same thread that does the cleanup, eliminating the race.
>>
>> Move the call to multifd_save_cleanup() before joining for the
>> migration thread so we release the semaphores before.
>>
>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>> ---
>> migration/migration.c | 4 +++-
>> migration/multifd.c | 29 +++++++++++------------------
>> 2 files changed, 14 insertions(+), 19 deletions(-)
>>
>> diff --git a/migration/migration.c b/migration/migration.c
>> index cca32c553c..52be20561b 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -1300,6 +1300,9 @@ static void migrate_fd_cleanup(MigrationState *s)
>> QEMUFile *tmp;
>>
>> trace_migrate_fd_cleanup();
>> +
>> + multifd_save_cleanup();
>> +
>> qemu_mutex_unlock_iothread();
>> if (s->migration_thread_running) {
>> qemu_thread_join(&s->thread);
>> @@ -1307,7 +1310,6 @@ static void migrate_fd_cleanup(MigrationState *s)
>> }
>> qemu_mutex_lock_iothread();
>>
>> - multifd_save_cleanup();
>> qemu_mutex_lock(&s->qemu_file_lock);
>> tmp = s->to_dst_file;
>> s->to_dst_file = NULL;
>> diff --git a/migration/multifd.c b/migration/multifd.c
>> index ec58c58082..9ca482cfbe 100644
>> --- a/migration/multifd.c
>> +++ b/migration/multifd.c
>> @@ -527,6 +527,9 @@ void multifd_save_cleanup(void)
>>
>> if (p->running) {
>> qemu_thread_join(&p->thread);
>> + } else {
>> + qemu_sem_post(&p->sem_sync);
>> + qemu_sem_post(&multifd_send_state->channels_ready);
>
> I think relying on p->running to join the thread is already problematic.
Why?
> Now all threads are created with JOINABLE, so we must join them to release
> the thread resources. Clearing "running" at the end of the thread is
> already wrong to me, because it means if the thread quits before the main
> thread reaching here, we will not join the thread, and the thread resource
> will be leaked.
The bug is that the thread quits from other place.
It is not different that forgetting to do a mutex_unlock(). It is an
error that needs fixing.
> Here IMHO we should set p->running=true right before thread created,
(that is basically what is done)
Meaning of ->running is that multifd thread has started running. it a
false is that it is not running normally anymore.
thread function:
> and never clear it.
static void *multifd_send_thread(void *opaque)
{
// No error can happens here
while (true) {
// No return command here, just breaks
}
out:
// This can fail
qemu_mutex_lock(&p->mutex);
p->running = false;
qemu_mutex_unlock(&p->mutex);
/* this can't fail */
return NULL;
}
What running here means is that we don't need to "stop" this thread
anymore. That happens as soon as we get out of the loop.
> We may even want to rename it to p->thread_created?
We can rename it, but I am not sure if it buys it too much. Notice that
we also mean that we have created the channel.
>
>> }
>> }
>> for (i = 0; i < migrate_multifd_channels(); i++) {
>> @@ -751,8 +754,6 @@ out:
>> assert(local_err);
>> trace_multifd_send_error(p->id);
>> multifd_send_terminate_threads(local_err);
>> - qemu_sem_post(&p->sem_sync);
>> - qemu_sem_post(&multifd_send_state->channels_ready);
>> error_free(local_err);
>> }
>>
>> @@ -780,20 +781,15 @@ static void multifd_tls_outgoing_handshake(QIOTask *task,
>>
>> if (!qio_task_propagate_error(task, &err)) {
>> trace_multifd_tls_outgoing_handshake_complete(ioc);
>> - if (multifd_channel_connect(p, ioc, &err)) {
>> - return;
>> - }
>> + multifd_channel_connect(p, ioc, NULL);
>
> Ignoring Error** is not good..
>
> I think you meant to say "it should never fail", then we should put
> &error_abort. Another cleaner way to do this is split the current
> multifd_channel_connect() into tls and non-tls helpers, then we can call
> the non-tls helpers here (which may not need an Error**).
This code is really weird because it is (very) asynchronous.
And TLS is even worse, because we need to do the equilent of two
listen().
Sniff.
>> + } else {
>> + /*
>> + * The multifd client could already be waiting to queue data,
>> + * so let it know that we didn't even start.
>> + */
>> + p->quit = true;
>> + trace_multifd_tls_outgoing_handshake_error(ioc, error_get_pretty(err));
>> }
>> -
>> - trace_multifd_tls_outgoing_handshake_error(ioc, error_get_pretty(err));
>> -
>> - /*
>> - * Error happen, mark multifd_send_thread status as 'quit' although it
>> - * is not created, and then tell who pay attention to me.
>> - */
>> - p->quit = true;
>> - qemu_sem_post(&multifd_send_state->channels_ready);
>> - qemu_sem_post(&p->sem_sync);
>> }
>>
>> static void *multifd_tls_handshake_thread(void *opaque)
>> @@ -862,9 +858,6 @@ static void multifd_new_send_channel_cleanup(MultiFDSendParams *p,
>> QIOChannel *ioc, Error *err)
>> {
>> migrate_set_error(migrate_get_current(), err);
>> - /* Error happen, we need to tell who pay attention to me */
>> - qemu_sem_post(&multifd_send_state->channels_ready);
>> - qemu_sem_post(&p->sem_sync);
>> /*
>> * Although multifd_send_thread is not created, but main migration
>> * thread need to judge whether it is running, so we need to mark
>> --
>
> I may still need some more time to digest your whole solution, currently
> not very clear to me. It may or may not be the patch's problem, though.
>
> But let me also share how I see this.. I think we should rework the
> multifd thread model on channel establishment.
>
> Firstly, as I mentioned above, we must always join() the threads if it's
> JOINABLE or the resource is leaked, afaict. That's the first thing to fix.
The problem that running is trying to fix is this:
* we get an additional error while we are exiting due to any reason.
So we end calling multifd_fd_cleanup() several times, and things get
really weird.
So what we really need is to be sure that:
- we join each thread once and only once
https://linux.die.net/man/3/pthread_join
Joining with a thread that has previously been joined results in undefined behavior.
> Then let's see when TLS is there and what we do: we'll create "two" threads
> just for that, what's even worse:
>
> - We'll create tls handshake thread in multifd_tls_channel_connect()
> first, setting &p->thread.
>
> - Within the same thread, we do multifd_tls_outgoing_handshake() when
> handshake done -> multifd_channel_connect() -> we yet create the real
> multifd_send_thread(), setting &p->thread too.
>
> So AFAICT, the tls handshake thread is already leaked, got p->thread
> overwritten by the new thread created by itself..
>
> I think we should fix this then. I haven't figured the best way to do,
> two things I can come up with now:
>
> 1) At least make the tls handshake thread detached.
>
> 2) Make the handshake done in multifd send thread directly; I don't yet
> see why we must create two threads..
The way we listen() in sockets anynchronously. QIO channels are a
mystery to me, and this was really weird.
Using glib_main_loop() has lots of advantages. Understanding how to
listen asynchronously is not one of them.
> Then assuming we have a clear model with all these threads issue fixed (no
> matter whether we'd shrink 2N threads into N threads), then what we need to
> do, IMHO, is making sure to join() all of them before destroying anything
> (say, per-channel MultiFDSendParams). Then when we destroy everything
> safely, either mutex/sem/etc.. Because no one will race us anymore.
> Would it make sense?
I can't see why we need to join the threads created for tls. That ones
could be not joinable and just call it a day.
Later, Juan.
next prev parent reply other threads:[~2023-11-16 14:58 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-09 16:58 [RFC PATCH 0/2] migration: Fix multifd qemu_mutex_destroy race Fabiano Rosas
2023-11-09 16:58 ` [RFC PATCH 1/2] migration: Report error in incoming migration Fabiano Rosas
2023-11-09 18:57 ` Peter Xu
2023-11-10 10:58 ` Fabiano Rosas
2023-11-13 16:51 ` Peter Xu
2023-11-14 1:54 ` Fabiano Rosas
2023-11-09 16:58 ` [RFC PATCH 2/2] migration/multifd: Move semaphore release into main thread Fabiano Rosas
2023-11-09 18:56 ` Peter Xu
2023-11-10 12:05 ` Fabiano Rosas
2023-11-10 12:37 ` Fabiano Rosas
2023-11-16 15:51 ` Juan Quintela
2023-11-13 16:45 ` Peter Xu
2023-11-14 1:50 ` Fabiano Rosas
2023-11-14 17:28 ` Peter Xu
2023-11-16 15:44 ` Juan Quintela
2023-11-16 14:56 ` Juan Quintela [this message]
2023-11-16 18:13 ` Fabiano Rosas
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=875y21rbke.fsf@secure.mitica \
--to=quintela@redhat.com \
--cc=farosas@suse.de \
--cc=leobras@redhat.com \
--cc=peterx@redhat.com \
--cc=philmd@linaro.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).