All of lore.kernel.org
 help / color / mirror / Atom feed
From: Fabiano Rosas <farosas@suse.de>
To: peterx@redhat.com, qemu-devel@nongnu.org
Cc: Hao Xiang <hao.xiang@bytedance.com>,
	Bryan Zhang <bryan.zhang@bytedance.com>,
	peterx@redhat.com, Avihai Horon <avihaih@nvidia.com>,
	Yuan Liu <yuan1.liu@intel.com>,
	Prasad Pandit <ppandit@redhat.com>
Subject: Re: [PATCH v2 23/23] migration/multifd: Optimize sender side to be lockless
Date: Fri, 02 Feb 2024 18:34:08 -0300	[thread overview]
Message-ID: <87wmrmft4f.fsf@suse.de> (raw)
In-Reply-To: <20240202102857.110210-24-peterx@redhat.com>

peterx@redhat.com writes:

> From: Peter Xu <peterx@redhat.com>
>
> When reviewing my attempt to refactor send_prepare(), Fabiano suggested we
> try out with dropping the mutex in multifd code [1].
>
> I thought about that before but I never tried to change the code.  Now
> maybe it's time to give it a stab.  This only optimizes the sender side.
>
> The trick here is multifd has a clear provider/consumer model, that the
> migration main thread publishes requests (either pending_job/pending_sync),
> while the multifd sender threads are consumers.  Here we don't have a lot
> of comlicated data sharing, and the jobs can logically be submitted lockless.

complicated

>
> Arm the code with atomic weapons.  Two things worth mentioning:
>
>   - For multifd_send_pages(): we can use qatomic_load_acquire() when trying
>   to find a free channel, but that's expensive if we attach one ACQUIRE per
>   channel.  Instead, make it atomic_read() on the pending_job flag, but

s/make it/keep it/

The diff doesn't show the atomic_read already there so it's confusing.

>   merge the ACQUIRE into one single smp_mb_acquire() later.
>
>   - For pending_sync: it doesn't have any extra data required since now
>   p->flags are never touched, it should be safe to not use memory barrier.
>   That's different from pending_sync.

pending_job?

>
> Provide rich comments for all the lockless operations to state how they are
> paired.  With that, we can remove the mutex.
>
> [1] https://lore.kernel.org/r/87o7d1jlu5.fsf@suse.de
>
> Suggested-by: Fabiano Rosas <farosas@suse.de>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  migration/multifd.h |  2 --
>  migration/multifd.c | 51 +++++++++++++++++++++++----------------------
>  2 files changed, 26 insertions(+), 27 deletions(-)
>
> diff --git a/migration/multifd.h b/migration/multifd.h
> index 98876ff94a..78a2317263 100644
> --- a/migration/multifd.h
> +++ b/migration/multifd.h
> @@ -91,8 +91,6 @@ typedef struct {
>      /* syncs main thread and channels */
>      QemuSemaphore sem_sync;
>  
> -    /* this mutex protects the following parameters */
> -    QemuMutex mutex;
>      /* is this channel thread running */
>      bool running;
>      /* multifd flags for each packet */
> diff --git a/migration/multifd.c b/migration/multifd.c
> index b317d57d61..ef13e2e781 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -501,19 +501,19 @@ static bool multifd_send_pages(void)
>          }
>      }
>  
> -    qemu_mutex_lock(&p->mutex);
> -    assert(!p->pages->num);
> -    assert(!p->pages->block);
>      /*
> -     * Double check on pending_job==false with the lock.  In the future if
> -     * we can have >1 requester thread, we can replace this with a "goto
> -     * retry", but that is for later.
> +     * Make sure we read p->pending_job before all the rest.  Pairs with
> +     * qatomic_store_release() in multifd_send_thread().
>       */
> -    assert(qatomic_read(&p->pending_job) == false);
> -    qatomic_set(&p->pending_job, true);
> +    smp_mb_acquire();
> +    assert(!p->pages->num);
>      multifd_send_state->pages = p->pages;
>      p->pages = pages;
> -    qemu_mutex_unlock(&p->mutex);
> +    /*
> +     * Making sure p->pages is setup before marking pending_job=true. Pairs
> +     * with the qatomic_load_acquire() in multifd_send_thread().
> +     */
> +    qatomic_store_release(&p->pending_job, true);
>      qemu_sem_post(&p->sem);
>  
>      return true;
> @@ -648,7 +648,6 @@ static bool multifd_send_cleanup_channel(MultiFDSendParams *p, Error **errp)
>      }
>      multifd_send_channel_destroy(p->c);
>      p->c = NULL;
> -    qemu_mutex_destroy(&p->mutex);
>      qemu_sem_destroy(&p->sem);
>      qemu_sem_destroy(&p->sem_sync);
>      g_free(p->name);
> @@ -742,14 +741,12 @@ int multifd_send_sync_main(void)
>  
>          trace_multifd_send_sync_main_signal(p->id);
>  
> -        qemu_mutex_lock(&p->mutex);
>          /*
>           * We should be the only user so far, so not possible to be set by
>           * others concurrently.
>           */
>          assert(qatomic_read(&p->pending_sync) == false);
>          qatomic_set(&p->pending_sync, true);
> -        qemu_mutex_unlock(&p->mutex);
>          qemu_sem_post(&p->sem);
>      }
>      for (i = 0; i < migrate_multifd_channels(); i++) {
> @@ -796,9 +793,12 @@ static void *multifd_send_thread(void *opaque)
>          if (multifd_send_should_exit()) {
>              break;
>          }
> -        qemu_mutex_lock(&p->mutex);
>  
> -        if (qatomic_read(&p->pending_job)) {
> +        /*
> +         * Read pending_job flag before p->pages.  Pairs with the
> +         * qatomic_store_release() in multifd_send_pages().
> +         */
> +        if (qatomic_load_acquire(&p->pending_job)) {
>              MultiFDPages_t *pages = p->pages;
>  
>              p->iovs_num = 0;
> @@ -806,14 +806,12 @@ static void *multifd_send_thread(void *opaque)
>  
>              ret = multifd_send_state->ops->send_prepare(p, &local_err);
>              if (ret != 0) {
> -                qemu_mutex_unlock(&p->mutex);
>                  break;
>              }
>  
>              ret = qio_channel_writev_full_all(p->c, p->iov, p->iovs_num, NULL,
>                                                0, p->write_flags, &local_err);
>              if (ret != 0) {
> -                qemu_mutex_unlock(&p->mutex);
>                  break;
>              }
>  
> @@ -822,24 +820,31 @@ static void *multifd_send_thread(void *opaque)
>  
>              multifd_pages_reset(p->pages);
>              p->next_packet_size = 0;
> -            qatomic_set(&p->pending_job, false);
> -            qemu_mutex_unlock(&p->mutex);
> +
> +            /*
> +             * Making sure p->pages is published before saying "we're
> +             * free".  Pairs with the qatomic_load_acquire() in

smp_mb_acquire()

> +             * multifd_send_pages().
> +             */
> +            qatomic_store_release(&p->pending_job, false);
>          } else {
> -            /* If not a normal job, must be a sync request */
> +            /*
> +             * If not a normal job, must be a sync request.  Note that
> +             * pending_sync is a standalone flag (unlike pending_job), so
> +             * it doesn't require explicit memory barriers.
> +             */
>              assert(qatomic_read(&p->pending_sync));
>              p->flags = MULTIFD_FLAG_SYNC;
>              multifd_send_fill_packet(p);
>              ret = qio_channel_write_all(p->c, (void *)p->packet,
>                                          p->packet_len, &local_err);
>              if (ret != 0) {
> -                qemu_mutex_unlock(&p->mutex);
>                  break;
>              }
>              /* p->next_packet_size will always be zero for a SYNC packet */
>              stat64_add(&mig_stats.multifd_bytes, p->packet_len);
>              p->flags = 0;
>              qatomic_set(&p->pending_sync, false);
> -            qemu_mutex_unlock(&p->mutex);
>              qemu_sem_post(&p->sem_sync);
>          }
>      }
> @@ -853,10 +858,7 @@ out:
>          error_free(local_err);
>      }
>  
> -    qemu_mutex_lock(&p->mutex);
>      p->running = false;
> -    qemu_mutex_unlock(&p->mutex);
> -
>      rcu_unregister_thread();
>      migration_threads_remove(thread);
>      trace_multifd_send_thread_end(p->id, p->packets_sent, p->total_normal_pages);
> @@ -998,7 +1000,6 @@ int multifd_send_setup(Error **errp)
>      for (i = 0; i < thread_count; i++) {
>          MultiFDSendParams *p = &multifd_send_state->params[i];
>  
> -        qemu_mutex_init(&p->mutex);
>          qemu_sem_init(&p->sem, 0);
>          qemu_sem_init(&p->sem_sync, 0);
>          p->id = i;


  reply	other threads:[~2024-02-02 21:35 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-02 10:28 [PATCH v2 00/23] migration/multifd: Refactor ->send_prepare() and cleanups peterx
2024-02-02 10:28 ` [PATCH v2 01/23] migration/multifd: Drop stale comment for multifd zero copy peterx
2024-02-02 10:28 ` [PATCH v2 02/23] migration/multifd: multifd_send_kick_main() peterx
2024-02-02 10:28 ` [PATCH v2 03/23] migration/multifd: Drop MultiFDSendParams.quit, cleanup error paths peterx
2024-02-02 19:15   ` Fabiano Rosas
2024-02-02 10:28 ` [PATCH v2 04/23] migration/multifd: Postpone reset of MultiFDPages_t peterx
2024-02-02 10:28 ` [PATCH v2 05/23] migration/multifd: Drop MultiFDSendParams.normal[] array peterx
2024-02-09  0:06   ` [External] " Hao Xiang
2024-02-09 12:20     ` Fabiano Rosas
2024-02-14  2:16       ` Hao Xiang
2024-02-14 17:17         ` Fabiano Rosas
2024-02-02 10:28 ` [PATCH v2 06/23] migration/multifd: Separate SYNC request with normal jobs peterx
2024-02-02 19:21   ` Fabiano Rosas
2024-02-02 10:28 ` [PATCH v2 07/23] migration/multifd: Simplify locking in sender thread peterx
2024-02-02 19:23   ` Fabiano Rosas
2024-02-02 10:28 ` [PATCH v2 08/23] migration/multifd: Drop pages->num check " peterx
2024-02-02 10:28 ` [PATCH v2 09/23] migration/multifd: Rename p->num_packets and clean it up peterx
2024-02-02 10:28 ` [PATCH v2 10/23] migration/multifd: Move total_normal_pages accounting peterx
2024-02-02 10:28 ` [PATCH v2 11/23] migration/multifd: Move trace_multifd_send|recv() peterx
2024-02-02 10:28 ` [PATCH v2 12/23] migration/multifd: multifd_send_prepare_header() peterx
2024-02-02 10:28 ` [PATCH v2 13/23] migration/multifd: Move header prepare/fill into send_prepare() peterx
2024-02-02 19:26   ` Fabiano Rosas
2024-02-02 10:28 ` [PATCH v2 14/23] migration/multifd: Forbid spurious wakeups peterx
2024-02-02 10:28 ` [PATCH v2 15/23] migration/multifd: Split multifd_send_terminate_threads() peterx
2024-02-02 19:28   ` Fabiano Rosas
2024-02-02 10:28 ` [PATCH v2 16/23] migration/multifd: Change retval of multifd_queue_page() peterx
2024-02-02 19:29   ` Fabiano Rosas
2024-02-02 10:28 ` [PATCH v2 17/23] migration/multifd: Change retval of multifd_send_pages() peterx
2024-02-02 19:30   ` Fabiano Rosas
2024-02-02 10:28 ` [PATCH v2 18/23] migration/multifd: Rewrite multifd_queue_page() peterx
2024-02-02 20:47   ` Fabiano Rosas
2024-02-05  4:03     ` Peter Xu
2024-02-02 10:28 ` [PATCH v2 19/23] migration/multifd: Cleanup multifd_save_cleanup() peterx
2024-02-02 20:54   ` Fabiano Rosas
2024-02-05  4:25     ` Peter Xu
2024-02-02 10:28 ` [PATCH v2 20/23] migration/multifd: Cleanup multifd_load_cleanup() peterx
2024-02-02 20:55   ` Fabiano Rosas
2024-02-02 10:28 ` [PATCH v2 21/23] migration/multifd: Stick with send/recv on function names peterx
2024-02-02 21:03   ` Fabiano Rosas
2024-02-02 10:28 ` [PATCH v2 22/23] migration/multifd: Fix MultiFDSendParams.packet_num race peterx
2024-02-02 21:08   ` Fabiano Rosas
2024-02-05  4:05     ` Peter Xu
2024-02-02 10:28 ` [PATCH v2 23/23] migration/multifd: Optimize sender side to be lockless peterx
2024-02-02 21:34   ` Fabiano Rosas [this message]
2024-02-05  4:35     ` Peter Xu
2024-02-05 14:10       ` Fabiano Rosas
2024-02-05 14:24         ` Peter Xu
2024-02-05 17:59           ` Fabiano Rosas
2024-02-06  3:05 ` [PATCH v2 00/23] migration/multifd: Refactor ->send_prepare() and cleanups Peter Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87wmrmft4f.fsf@suse.de \
    --to=farosas@suse.de \
    --cc=avihaih@nvidia.com \
    --cc=bryan.zhang@bytedance.com \
    --cc=hao.xiang@bytedance.com \
    --cc=peterx@redhat.com \
    --cc=ppandit@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=yuan1.liu@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.