qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Fabiano Rosas <farosas@suse.de>
To: quintela@redhat.com, Peter Xu <peterx@redhat.com>
Cc: qemu-devel@nongnu.org, Leonardo Bras <leobras@redhat.com>,
	Elena Ufimtseva <elena.ufimtseva@oracle.com>
Subject: Re: [RFC PATCH v2 1/6] migration/multifd: Remove channels_ready semaphore
Date: Fri, 20 Oct 2023 09:48:54 -0300	[thread overview]
Message-ID: <87ttqlh32h.fsf@suse.de> (raw)
In-Reply-To: <87ttql20hz.fsf@secure.mitica>

Juan Quintela <quintela@redhat.com> writes:

> Peter Xu <peterx@redhat.com> wrote:
>> On Thu, Oct 19, 2023 at 08:41:26PM +0200, Juan Quintela wrote:
>>> We can changing pending_job to a bool if you preffer.  I think that we
>>> have nailed all the off_by_one errors by now (famous last words).
>>
>> Would it work to make pending_job a bool, even with SYNC?  It seems to me
>> multifd_send_sync_main() now can boost pending_job even if pending_job==1.
>
> Then a int is ok, I think.
>
>> That's also the place where I really think confusing too; where it looks
>> like the migration thread can modify a pending job's flag as long as it is
>> fast enough before the send thread put that onto the wire.
>
> It never does.
>
>     for (i = next_channel;; i = (i + 1) % migrate_multifd_channels()) {
>         qemu_mutex_lock(&p->mutex);
>         ...
>         if (!p->pending_job) {
>             p->pending_job++;
>             next_channel = (i + 1) % migrate_multifd_channels();
>             break;
>         }
>         qemu_mutex_unlock(&p->mutex);
>     }

We copy the flags into the packet header at multifd_send_fill_packet()
before unlocking. I think that is even independent of the pending_jobs
scheme.

> If pending_job == 0 -> owner of the channel is migration_thread and it
> can use it.
>
> If pending_job > 0 -> owner of the channel is the channel thread and
> migration_thread can't use it.

Do you really mean "migration_thread" here or just multifd_send_pages()?
Because multifd_send_sync_main() doesn't care about this ownership
rule. Does that mean that code is incorrect?

> I think that this is easy to understand.  You are right that it is not
> _explained_.  And clearly, if you have to ask, it is not obvious O:-)

It is explained at multifd.h. But it doesn't really matter because code
can drift and documentation doesn't assure correctness. That's why we
have to ask about seemingly obvious stuff.

> Yes, it was obvious to me, that is the reason why I wrote it on the
> 1st place.  Notice also that it is a common idiom in multithreaded
> apps.  That allows it to do stuff without having to have a mutex
> locked, so other threads can "look" into the state.
>> Then it's
>> unpredictable whether the SYNC flag will be sent with current packet (where
>> due to pending_jobs==1 already, can contain valid pages), or will be only
>> set for the next one (where there will have 0 real page).
>
> I have to think about this one.
> Decrease pending_jobs there if we are doing multiple jobs?
> But we still have the issue of the semaphore.
>
>> IMHO it'll be good to separate the sync task, then we can change
>> pending_jobs to bool.  Something like:
>>
>>   bool pending_send_page;
>>   bool pending_send_sync;
>
> current code:
>
>         qemu_mutex_lock(&p->mutex);
>         qemu_mutex_lock(&p->mutex);
>
>         if (p->pending_job) {
>             uint64_t packet_num = p->packet_num;
>             uint32_t flags;
>             p->normal_num = 0;
>
>             if (use_zero_copy_send) {
>                 p->iovs_num = 0;
>             } else {
>                 p->iovs_num = 1;
>             }
>
>             for (int i = 0; i < p->pages->num; i++) {
>                 p->normal[p->normal_num] = p->pages->offset[i];
>                 p->normal_num++;
>             }
>
>             if (p->normal_num) {
>                 ret = multifd_send_state->ops->send_prepare(p, &local_err);
>                 if (ret != 0) {
>                     qemu_mutex_unlock(&p->mutex);
>                     break;
>                 }
>             }
>             multifd_send_fill_packet(p);
>             flags = p->flags;
>             p->flags = 0;
>             p->num_packets++;
>             p->total_normal_pages += p->normal_num;
>             p->pages->num = 0;
>             p->pages->block = NULL;
>             qemu_mutex_unlock(&p->mutex);
>
>             trace_multifd_send(p->id, packet_num, p->normal_num, flags,
>                                p->next_packet_size);
>
>             if (use_zero_copy_send) {
>                 /* Send header first, without zerocopy */
>                 ret = qio_channel_write_all(p->c, (void *)p->packet,
>                                             p->packet_len, &local_err);
>                 if (ret != 0) {
>                     break;
>                 }
>             } else {
>                 /* Send header using the same writev call */
>                 p->iov[0].iov_len = p->packet_len;
>                 p->iov[0].iov_base = p->packet;
>             }
>
>             ret = qio_channel_writev_full_all(p->c, p->iov, p->iovs_num, NULL,
>                                               0, p->write_flags, &local_err);
>             if (ret != 0) {
>                 break;
>             }
>
>             stat64_add(&mig_stats.multifd_bytes,
>                        p->next_packet_size + p->packet_len);
>             stat64_add(&mig_stats.transferred,
>                        p->next_packet_size + p->packet_len);
>             p->next_packet_size = 0;
>             qemu_mutex_lock(&p->mutex);
>             p->pending_job--;
>             qemu_mutex_unlock(&p->mutex);
>
>             if (flags & MULTIFD_FLAG_SYNC) {
>                 qemu_sem_post(&p->sem_sync);
>             }
>         } else {
>             qemu_mutex_unlock(&p->mutex);
>             /* sometimes there are spurious wakeups */
>         }
>
> Your suggested change:
>
>        qemu_mutex_lock(&p->mutex);
>
>        if (p->pending_job_page) {

It's a semantic issue really, but I'd rather we avoid locking ourselves
more into the "pages" idea for multifd threads. The data being sent by
the multifd thread should be opaque.



  reply	other threads:[~2023-10-20 12:49 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-12 14:06 [RFC PATCH v2 0/6] migration/multifd: Locking changes Fabiano Rosas
2023-10-12 14:06 ` [RFC PATCH v2 1/6] migration/multifd: Remove channels_ready semaphore Fabiano Rosas
2023-10-19  9:06   ` Juan Quintela
2023-10-19 14:35     ` Peter Xu
2023-10-19 15:00       ` Juan Quintela
2023-10-19 15:46         ` Peter Xu
2023-10-19 18:28           ` Juan Quintela
2023-10-19 18:50             ` Peter Xu
2023-10-20  7:56               ` Juan Quintela
2023-10-19 14:55     ` Fabiano Rosas
2023-10-19 15:18       ` Juan Quintela
2023-10-19 15:56         ` Fabiano Rosas
2023-10-19 18:41           ` Juan Quintela
2023-10-19 19:04             ` Peter Xu
2023-10-20  7:53               ` Juan Quintela
2023-10-20 12:48                 ` Fabiano Rosas [this message]
2023-10-22 20:17                   ` Peter Xu
2023-10-12 14:06 ` [RFC PATCH v2 2/6] migration/multifd: Stop checking p->quit in multifd_send_thread Fabiano Rosas
2023-10-19  9:08   ` Juan Quintela
2023-10-19 14:58     ` Fabiano Rosas
2023-10-19 15:19       ` Peter Xu
2023-10-19 15:19       ` Juan Quintela
2023-10-12 14:06 ` [RFC PATCH v2 3/6] migration/multifd: Decouple control flow from the SYNC packet Fabiano Rosas
2023-10-19 10:28   ` Juan Quintela
2023-10-19 15:31     ` Peter Xu
2023-10-12 14:06 ` [RFC PATCH v2 4/6] migration/multifd: Extract sem_done waiting into a function Fabiano Rosas
2023-10-12 14:06 ` [RFC PATCH v2 5/6] migration/multifd: Stop setting 'quit' outside of channels Fabiano Rosas
2023-10-19 10:35   ` Juan Quintela
2023-10-12 14:06 ` [RFC PATCH v2 6/6] migration/multifd: Bring back the 'ready' semaphore Fabiano Rosas
2023-10-19 10:43   ` Juan Quintela

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87ttqlh32h.fsf@suse.de \
    --to=farosas@suse.de \
    --cc=elena.ufimtseva@oracle.com \
    --cc=leobras@redhat.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).