From: Fabiano Rosas <farosas@suse.de>
To: qemu-devel@nongnu.org
Cc: "Juan Quintela" <quintela@redhat.com>,
"Peter Xu" <peterx@redhat.com>,
"Leonardo Bras" <leobras@redhat.com>,
"Philippe Mathieu-Daudé" <philmd@linaro.org>
Subject: [RFC PATCH 2/2] migration/multifd: Move semaphore release into main thread
Date: Thu, 9 Nov 2023 13:58:56 -0300
Message-ID: <20231109165856.15224-3-farosas@suse.de>
In-Reply-To: <20231109165856.15224-1-farosas@suse.de>
We cannot operate on the multifd semaphores outside of the multifd
channel thread because multifd_save_cleanup() can run in parallel and
attempt to destroy the mutexes, which causes an assert.
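As an illustration of the shape of the race, here is a standalone
sketch using plain POSIX threads and semaphores rather than the actual
QEMU primitives and call sites; all names below are made up:

  #include <pthread.h>
  #include <semaphore.h>

  static sem_t channel_sem;    /* stands in for p->sem_sync */

  /* Stands in for an error path outside the channel thread (e.g. the
   * TLS handshake thread) posting the semaphore. */
  static void *error_path(void *arg)
  {
      sem_post(&channel_sem);
      return NULL;
  }

  int main(void)
  {
      pthread_t t;

      sem_init(&channel_sem, 0, 0);
      pthread_create(&t, NULL, error_path, NULL);

      /* Stands in for multifd_save_cleanup() running in parallel:
       * nothing orders this destroy against the post above, so the
       * two can race; this is where the assert fires in the real
       * code. */
      sem_destroy(&channel_sem);

      pthread_join(&t, NULL);
      return 0;
  }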
Looking at the places where we use the semaphores aside from the
migration thread, there are only the TLS handshake thread and the
initial multifd_channel_connect() call in the main thread. These are
places where creating the multifd channel can fail, so releasing the
semaphores there only serves to avoid a deadlock when an error happens
before the channel(s) have started, but after the migration thread has
already called multifd_send_sync_main().
Instead of attempting to release the semaphores at those places, move
the release into multifd_save_cleanup(). This puts the semaphore usage
in the same thread that does the cleanup, eliminating the race.
Move the call to multifd_save_cleanup() to before the join on the
migration thread, so the semaphores are released first; otherwise the
migration thread could still be blocked in multifd_send_sync_main()
and the join would never return.
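The resulting ordering, again sketched with plain POSIX primitives
rather than the real QEMU code (the waiter below stands in for the
migration thread blocked in multifd_send_sync_main()):

  #include <pthread.h>
  #include <semaphore.h>

  static sem_t channel_sem;    /* stands in for p->sem_sync */

  /* Stands in for the migration thread blocked in
   * multifd_send_sync_main(). */
  static void *waiter(void *arg)
  {
      sem_wait(&channel_sem);
      return NULL;
  }

  int main(void)
  {
      pthread_t t;

      sem_init(&channel_sem, 0, 0);
      pthread_create(&t, NULL, waiter, NULL);

      /* Cleanup thread: release first, so a blocked waiter can make
       * progress... */
      sem_post(&channel_sem);
      /* ...then join the thread that may have been waiting... */
      pthread_join(&t, NULL);
      /* ...and only destroy once nobody can touch the semaphore any
       * more. */
      sem_destroy(&channel_sem);
      return 0;
  }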
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
migration/migration.c | 4 +++-
migration/multifd.c | 29 +++++++++++------------------
2 files changed, 14 insertions(+), 19 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index cca32c553c..52be20561b 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1300,6 +1300,9 @@ static void migrate_fd_cleanup(MigrationState *s)
QEMUFile *tmp;
trace_migrate_fd_cleanup();
+
+ multifd_save_cleanup();
+
qemu_mutex_unlock_iothread();
if (s->migration_thread_running) {
qemu_thread_join(&s->thread);
@@ -1307,7 +1310,6 @@ static void migrate_fd_cleanup(MigrationState *s)
}
qemu_mutex_lock_iothread();
- multifd_save_cleanup();
qemu_mutex_lock(&s->qemu_file_lock);
tmp = s->to_dst_file;
s->to_dst_file = NULL;
diff --git a/migration/multifd.c b/migration/multifd.c
index ec58c58082..9ca482cfbe 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -527,6 +527,9 @@ void multifd_save_cleanup(void)
if (p->running) {
qemu_thread_join(&p->thread);
+ } else {
+ qemu_sem_post(&p->sem_sync);
+ qemu_sem_post(&multifd_send_state->channels_ready);
}
}
for (i = 0; i < migrate_multifd_channels(); i++) {
@@ -751,8 +754,6 @@ out:
assert(local_err);
trace_multifd_send_error(p->id);
multifd_send_terminate_threads(local_err);
- qemu_sem_post(&p->sem_sync);
- qemu_sem_post(&multifd_send_state->channels_ready);
error_free(local_err);
}
@@ -780,20 +781,15 @@ static void multifd_tls_outgoing_handshake(QIOTask *task,
if (!qio_task_propagate_error(task, &err)) {
trace_multifd_tls_outgoing_handshake_complete(ioc);
- if (multifd_channel_connect(p, ioc, &err)) {
- return;
- }
+ multifd_channel_connect(p, ioc, NULL);
+ } else {
+ /*
+ * The multifd client could already be waiting to queue data,
+ * so let it know that we didn't even start.
+ */
+ p->quit = true;
+ trace_multifd_tls_outgoing_handshake_error(ioc, error_get_pretty(err));
}
-
- trace_multifd_tls_outgoing_handshake_error(ioc, error_get_pretty(err));
-
- /*
- * Error happen, mark multifd_send_thread status as 'quit' although it
- * is not created, and then tell who pay attention to me.
- */
- p->quit = true;
- qemu_sem_post(&multifd_send_state->channels_ready);
- qemu_sem_post(&p->sem_sync);
}
static void *multifd_tls_handshake_thread(void *opaque)
@@ -862,9 +858,6 @@ static void multifd_new_send_channel_cleanup(MultiFDSendParams *p,
QIOChannel *ioc, Error *err)
{
migrate_set_error(migrate_get_current(), err);
- /* Error happen, we need to tell who pay attention to me */
- qemu_sem_post(&multifd_send_state->channels_ready);
- qemu_sem_post(&p->sem_sync);
/*
* Although multifd_send_thread is not created, but main migration
* thread need to judge whether it is running, so we need to mark
--
2.35.3