From: Juan Quintela <quintela@redhat.com>
To: Fabiano Rosas <farosas@suse.de>
Cc: qemu-devel@nongnu.org, Peter Xu <peterx@redhat.com>,
Leonardo Bras <leobras@redhat.com>,
Elena Ufimtseva <elena.ufimtseva@oracle.com>
Subject: Re: [RFC PATCH v2 1/6] migration/multifd: Remove channels_ready semaphore
Date: Thu, 19 Oct 2023 11:06:06 +0200
Message-ID: <87sf676kxt.fsf@secure.mitica>
In-Reply-To: <20231012140651.13122-2-farosas@suse.de> (Fabiano Rosas's message of "Thu, 12 Oct 2023 11:06:46 -0300")
Fabiano Rosas <farosas@suse.de> wrote:
> The channels_ready semaphore is a global variable not linked to any
> single multifd channel. Waiting on it only means that "some" channel
> has become ready to send data. Since we need to address the channels
> by index (multifd_send_state->params[i]), that information adds
> nothing of value.
NAK.
I disagree here O:-)
The reason that semaphore exists is multifd_send_pages().
Simplified, what that function does is:
    sem_wait(channels_ready);
    for_each_channel()
        look if it is empty()
But with the semaphore, we guarantee that when we enter the loop there
is a channel ready, so we know we don't busy wait searching for a
channel that is free.
Notice that I fully agree that the sem is not needed for locking.
Locking is done with the mutex.  The sem is just there to make sure
that we don't busy wait in that loop.
And we use a sem because it is the easiest way to know how many
channels are ready (even though we only care whether there is at least
one when we arrive at that code).
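
A minimal standalone sketch of that pattern (plain POSIX primitives
instead of QEMU's QemuSemaphore/QemuMutex; the names Channel,
send_pages() and channel_thread() are illustrative, not the real
multifd code):

/*
 * Sketch only: each channel thread posts channels_ready when it goes
 * idle, so the producer's scan is guaranteed to find a free channel
 * without busy waiting.  Build with: cc -pthread sketch.c
 */
#include <pthread.h>
#include <semaphore.h>
#include <stdbool.h>
#include <stdio.h>

#define N_CHANNELS 4

typedef struct {
    pthread_mutex_t mutex;   /* the real locking: protects pending_job */
    bool pending_job;        /* channel already has work queued */
    int id;
} Channel;

static Channel channels[N_CHANNELS];
static sem_t channels_ready; /* counts channels that reported "idle" */
static int next_channel;

/* A channel thread announces it is idle, then would wait for work. */
static void *channel_thread(void *opaque)
{
    Channel *c = opaque;

    sem_post(&channels_ready);   /* "I am ready to take more work" */
    /* ... the real thread waits on its own sem and sends the pages ... */
    (void)c;
    return NULL;
}

/*
 * Hand work to some free channel.  sem_wait() guarantees at least one
 * channel is free, so the round-robin scan below never busy waits.
 */
static int send_pages(void)
{
    sem_wait(&channels_ready);

    for (int i = next_channel; ; i = (i + 1) % N_CHANNELS) {
        Channel *c = &channels[i];

        pthread_mutex_lock(&c->mutex);
        if (!c->pending_job) {
            c->pending_job = true;              /* queue the work */
            next_channel = (i + 1) % N_CHANNELS;
            pthread_mutex_unlock(&c->mutex);
            return c->id;
        }
        pthread_mutex_unlock(&c->mutex);
    }
}

int main(void)
{
    pthread_t th[N_CHANNELS];

    sem_init(&channels_ready, 0, 0);
    for (int i = 0; i < N_CHANNELS; i++) {
        channels[i].id = i;
        pthread_mutex_init(&channels[i].mutex, NULL);
        pthread_create(&th[i], NULL, channel_thread, &channels[i]);
    }

    printf("channel %d took the job\n", send_pages());

    for (int i = 0; i < N_CHANNELS; i++) {
        pthread_join(th[i], NULL);
    }
    return 0;
}
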
We had lost track of that count, and we fixed it here:
    commit d2026ee117147893f8d80f060cede6d872ecbd7f
    Author: Juan Quintela <quintela@redhat.com>
    Date:   Wed Apr 26 12:20:36 2023 +0200

        multifd: Fix the number of channels ready

        We don't wait in the sem when we are doing a sync_main.  Make it
And we were addressing the problem that some users were finding that we
were busy waiting in that loop.
> The channel being addressed is not necessarily the
> one that just released the semaphore.
We only care that at least one channel is free; we are going to search
for the next free one.
Does this explanation make sense?
Later, Juan.
> The only usage of this semaphore that makes sense is to wait for it in
> a loop that iterates for the number of channels. That could mean: all
> channels have been setup and are operational OR all channels have
> finished their work and are idle.
>
> Currently all code that waits on channels_ready is redundant. There is
> always a subsequent lock or semaphore that does the actual data
> protection/synchronization.
>
> - at multifd_send_pages: Waiting on channels_ready doesn't mean the
> 'next_channel' is ready, it could be any other channel. So there are
> already cases where this code runs as if no semaphore was there.
> Waiting outside of the loop is also incorrect because if the current
> channel already has a pending_job, then it will loop into the next
> one without waiting the semaphore and the count will be greater than
> zero at the end of the execution.
>
> Checking that "any" channel is ready as a proxy for all channels
> being ready would work, but it's not what the code is doing and not
> really needed because the channel lock and 'sem' would be enough.
>
> - at multifd_send_sync: This usage is correct, but it is made
> redundant by the wait on sem_sync. What this piece of code is doing
> is making sure all channels have sent the SYNC packet and became
> idle afterwards.
>
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> ---
> migration/multifd.c | 10 ----------
> 1 file changed, 10 deletions(-)
>
> diff --git a/migration/multifd.c b/migration/multifd.c
> index 0f6b203877..e26f5f246d 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -362,8 +362,6 @@ struct {
> MultiFDPages_t *pages;
> /* global number of generated multifd packets */
> uint64_t packet_num;
> - /* send channels ready */
> - QemuSemaphore channels_ready;
> /*
> * Have we already run terminate threads. There is a race when it
> * happens that we got one error while we are exiting.
> @@ -403,7 +401,6 @@ static int multifd_send_pages(QEMUFile *f)
> return -1;
> }
>
> - qemu_sem_wait(&multifd_send_state->channels_ready);
> /*
> * next_channel can remain from a previous migration that was
> * using more channels, so ensure it doesn't overflow if the
> @@ -554,7 +551,6 @@ void multifd_save_cleanup(void)
> error_free(local_err);
> }
> }
> - qemu_sem_destroy(&multifd_send_state->channels_ready);
> g_free(multifd_send_state->params);
> multifd_send_state->params = NULL;
> multifd_pages_clear(multifd_send_state->pages);
> @@ -630,7 +626,6 @@ int multifd_send_sync_main(QEMUFile *f)
> for (i = 0; i < migrate_multifd_channels(); i++) {
> MultiFDSendParams *p = &multifd_send_state->params[i];
>
> - qemu_sem_wait(&multifd_send_state->channels_ready);
> trace_multifd_send_sync_main_wait(p->id);
> qemu_sem_wait(&p->sem_sync);
>
> @@ -664,7 +659,6 @@ static void *multifd_send_thread(void *opaque)
> p->num_packets = 1;
>
> while (true) {
> - qemu_sem_post(&multifd_send_state->channels_ready);
> qemu_sem_wait(&p->sem);
>
> if (qatomic_read(&multifd_send_state->exiting)) {
> @@ -759,7 +753,6 @@ out:
> */
> if (ret != 0) {
> qemu_sem_post(&p->sem_sync);
> - qemu_sem_post(&multifd_send_state->channels_ready);
> }
>
> qemu_mutex_lock(&p->mutex);
> @@ -796,7 +789,6 @@ static void multifd_tls_outgoing_handshake(QIOTask *task,
> * is not created, and then tell who pay attention to me.
> */
> p->quit = true;
> - qemu_sem_post(&multifd_send_state->channels_ready);
> qemu_sem_post(&p->sem_sync);
> }
> }
> @@ -874,7 +866,6 @@ static void multifd_new_send_channel_cleanup(MultiFDSendParams *p,
> {
> migrate_set_error(migrate_get_current(), err);
> /* Error happen, we need to tell who pay attention to me */
> - qemu_sem_post(&multifd_send_state->channels_ready);
> qemu_sem_post(&p->sem_sync);
> /*
> * Although multifd_send_thread is not created, but main migration
> @@ -919,7 +910,6 @@ int multifd_save_setup(Error **errp)
> multifd_send_state = g_malloc0(sizeof(*multifd_send_state));
> multifd_send_state->params = g_new0(MultiFDSendParams, thread_count);
> multifd_send_state->pages = multifd_pages_init(page_count);
> - qemu_sem_init(&multifd_send_state->channels_ready, 0);
> qatomic_set(&multifd_send_state->exiting, 0);
> multifd_send_state->ops = multifd_ops[migrate_multifd_compression()];