All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: Juan Quintela <quintela@redhat.com>
Cc: qemu-devel@nongnu.org, dgilbert@redhat.com, lvivier@redhat.com,
	berrange@redhat.com
Subject: Re: [Qemu-devel] [PATCH v5 17/17] migration: Flush receive queue
Date: Thu, 10 Aug 2017 14:49:03 +0800	[thread overview]
Message-ID: <20170810064903.GN13486@pxdev.xzpeter.org> (raw)
In-Reply-To: <877eye9tit.fsf@secure.mitica>

On Tue, Aug 08, 2017 at 01:40:58PM +0200, Juan Quintela wrote:
> Peter Xu <peterx@redhat.com> wrote:
> > On Mon, Jul 17, 2017 at 03:42:38PM +0200, Juan Quintela wrote:
> >> Each time that we sync the bitmap, it is a possiblity that we receive
> >> a page that is being processed by a different thread.  We fix this
> >> problem just making sure that we wait for all receiving threads to
> >> finish its work before we procedeed with the next stage.
> >> 
> >> We are low on page flags, so we use a combination that is not valid to
> >> emit that message:  MULTIFD_PAGE and COMPRESSED.
> >> 
> >> I tried to make a migration command for it, but it don't work because
> >> we sync the bitmap sometimes when we have already sent the beggining
> >> of the section, so I just added a new page flag.
> >> 
> >> Signed-off-by: Juan Quintela <quintela@redhat.com>
> 
> >> @@ -675,6 +686,10 @@ static void *multifd_recv_thread(void *opaque)
> >>                  return NULL;
> >>              }
> >>              p->done = true;
> >> +            if (p->sync) {
> >> +                qemu_cond_signal(&p->cond_sync);
> >> +                p->sync = false;
> >> +            }
> >
> > Could we use the same p->ready for this purpose? They looks similar:
> > all we want to do is to let the main thread know "worker thread has
> > finished receiving the last piece and becomes idle again", right?
> 
> We *could*, but "ready" is used for each page that we sent, sync is only
> used once every round.  Notice that "ready" is a semaphore, and its
> semantic is weird.  See next comment.
> 
> 
> >> +static int multifd_flush(void)
> >> +{
> >> +    int i, thread_count;
> >> +
> >> +    if (!migrate_use_multifd()) {
> >> +        return 0;
> >> +    }
> >> +    thread_count = migrate_multifd_threads();
> >> +    for (i = 0; i < thread_count; i++) {
> >> +        MultiFDRecvParams *p = multifd_recv_state->params[i];
> >> +
> >> +        qemu_mutex_lock(&p->mutex);
> >> +        while (!p->done) {
> >> +            p->sync = true;
> >> +            qemu_cond_wait(&p->cond_sync, &p->mutex);
> >
> > (similar comment like above)
> 
> We need to look at the two pieces of code at the same time.  What are we
> trying to do:
> 
> - making sure that all threads have finished the current round.
>   in this particular case, that this thread has finished its current
>   round OR  that it is waiting for work.
> 
> So, the main thread is the one that does the sem_wait(ready) and the channel
> thread is the one that does the sem_post(ready).
> 
> multifd_recv_thread()
> 
>     if (p->sync) {
>         sem_post(ready);
>         p->sync = false;
>     }
> 
> multifd_flush()
>    if (!p->done) {
>        p->sync = true;
>        sem_wait(ready);
>    }
> 
> Ah, but done and sync can be changed from other threads, so current code
> will become:
> 
> multifd_recv_thread()
> 
>     if (p->sync) {
>         sem_post(ready);
>         p->sync = false;
>     }
> 
> multifd_flush()
>    ...
>    mutex_lock(lock);
>    if (!p->done) {
>        p->sync = true;
>        mutex_unlock(lock)
>        sem_wait(ready);
>        mutex_lock(lock)
>    }
>    mutex_unlock(lock)
> 
> That I would claim that it is more complicated to understand.  Mixing
> locks and semaphores is ..... interesting to say the least.  With
> variable conditions it becomes easy.
> 
> Yes, we can change sync/done to atomic access, but not sure that makes
> things so much simpler.

I was thinking that p->ready can be used a notification channel from
recv thread to main thread for any reason. But I'm also fine that if
you want to do this separately to have different sync channels for
page-level completions and global flushes especially in first version.

(but I'd say I feel the whole thing slightly complicated, while I feel
 it can be simpler somewhere...)

> 
> >> +        }
> >> +        qemu_mutex_unlock(&p->mutex);
> >> +    }
> >> +    return 0;
> >> +}
> >> +
> >>  /**
> >>   * save_page_header: write page header to wire
> >>   *
> >> @@ -809,6 +847,12 @@ static size_t save_page_header(RAMState *rs, QEMUFile *f,  RAMBlock *block,
> >>  {
> >>      size_t size, len;
> >>  
> >> +    if (rs->multifd_needs_flush &&
> >> +        (offset & RAM_SAVE_FLAG_MULTIFD_PAGE)) {
> >
> > If multifd_needs_flush is only for multifd, then we may skip this
> > check, but it looks more like an assertion:
> >
> >     if (rs->multifd_needs_flush) {
> >         assert(offset & RAM_SAVE_FLAG_MULTIFD_PAGE);
> >         offset |= RAM_SAVE_FLAG_ZERO;
> >     }
> 
> No, it could be that this page is a _non_ multifd page, and then ZERO
> means something different.  So, we can only send this for MULTIFD pages.

But if multifd_needs_flush==true, it must be a multifd page, no? :)

I think this is trivial, so both work for me.

> 
> > (Dave mentioned about unaligned flag used in commit message and here:
> >  ZERO is used, but COMPRESS is mentioned)
> 
> OK, I can change the message.
> 
> >> @@ -2496,6 +2540,9 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
> >>  
> >>      if (!migration_in_postcopy()) {
> >>          migration_bitmap_sync(rs);
> >> +        if (migrate_use_multifd()) {
> >> +            rs->multifd_needs_flush = true;
> >> +        }
> >
> > Would it be good to move this block into entry of
> > migration_bitmap_sync(), instead of setting it up at the callers of
> > migration_bitmap_sync()?
> 
> We can't have all of it.
> 
> We call migration_bitmap_sync() in 4 places.
> - We don't need to set the flag for the 1st synchronization
> - We don't need to set it on postcopy (yet).

[1]

I see.

> 
> So, we can add code inside to check if we are on the 1st round, and
> forget about postcopy (we check in other place), or we maintain it this way.
> 
> So, change becomes:
> 
> modified   migration/ram.c
> @@ -1131,6 +1131,9 @@ static void migration_bitmap_sync(RAMState *rs)
>      if (migrate_use_events()) {
>          qapi_event_send_migration_pass(ram_counters.dirty_sync_count, NULL);
>      }
> +    if (rs->ram_bulk_stage && migrate_use_multifd()) {

Should this be "!rs->ram_bulk_stage && migrate_use_multifd()"?

> +        rs->multifd_needs_flush = true;
> +    }
>  }
>  
>  /**
> @@ -2533,9 +2536,6 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
>  
>      if (!migration_in_postcopy()) {
>          migration_bitmap_sync(rs);
> -        if (migrate_use_multifd()) {
> -            rs->multifd_needs_flush = true;
> -        }
>      }
>  
>      ram_control_before_iterate(f, RAM_CONTROL_FINISH);
> @@ -2578,9 +2578,6 @@ static void ram_save_pending(QEMUFile *f, void *opaque, uint64_t max_size,
>          qemu_mutex_lock_iothread();
>          rcu_read_lock();
>          migration_bitmap_sync(rs);
> -        if (migrate_use_multifd()) {
> -            rs->multifd_needs_flush = true;
> -        }
>          rcu_read_unlock();
>          qemu_mutex_unlock_iothread();
>          remaining_size = rs->migration_dirty_pages * TARGET_PAGE_SIZE;
> 
> three less lines, you win.  We need to check in otherplace already that
> postcopy & multifd are not enabled at the same time.

I got the point. I would slightly prefer the new way to have only one
single place to set multifd_needs_flush (it would be nice to have some
comments like [1] there), but I'm also fine if you prefer the old one.

Thanks,

-- 
Peter Xu

  reply	other threads:[~2017-08-10  6:49 UTC|newest]

Thread overview: 93+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-07-17 13:42 [Qemu-devel] [PATCH v5 00/17] Multifd Juan Quintela
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 01/17] migrate: Add gboolean return type to migrate_channel_process_incoming Juan Quintela
2017-07-19 15:01   ` Dr. David Alan Gilbert
2017-07-20  7:00     ` Peter Xu
2017-07-20  8:47       ` Daniel P. Berrange
2017-07-24 10:18         ` Juan Quintela
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 02/17] migration: Create migration_ioc_process_incoming() Juan Quintela
2017-07-19 13:38   ` Daniel P. Berrange
2017-07-24 11:09     ` Juan Quintela
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 03/17] qio: Create new qio_channel_{readv, writev}_all Juan Quintela
2017-07-19 13:44   ` Daniel P. Berrange
2017-08-08  8:40     ` Juan Quintela
2017-08-08  9:25       ` Daniel P. Berrange
2017-07-19 15:42   ` Dr. David Alan Gilbert
2017-07-19 15:43     ` Daniel P. Berrange
2017-07-19 16:04       ` Dr. David Alan Gilbert
2017-07-19 16:08         ` Daniel P. Berrange
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 04/17] migration: Add multifd capability Juan Quintela
2017-07-19 15:44   ` Dr. David Alan Gilbert
2017-08-08  8:42     ` Juan Quintela
2017-07-19 17:14   ` Eric Blake
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 05/17] migration: Create x-multifd-threads parameter Juan Quintela
2017-07-19 16:00   ` Dr. David Alan Gilbert
2017-08-08  8:46     ` Juan Quintela
2017-08-08  9:44       ` Dr. David Alan Gilbert
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 06/17] migration: Create x-multifd-group parameter Juan Quintela
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 07/17] migration: Create multifd migration threads Juan Quintela
2017-07-19 16:49   ` Dr. David Alan Gilbert
2017-08-08  8:58     ` Juan Quintela
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 08/17] migration: Split migration_fd_process_incomming Juan Quintela
2017-07-19 17:08   ` Dr. David Alan Gilbert
2017-07-21 12:39     ` Eric Blake
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 09/17] migration: Start of multiple fd work Juan Quintela
2017-07-19 13:56   ` Daniel P. Berrange
2017-07-19 17:35   ` Dr. David Alan Gilbert
2017-08-08  9:35     ` Juan Quintela
2017-08-08  9:54       ` Dr. David Alan Gilbert
2017-07-20  9:34   ` Peter Xu
2017-08-08  9:19     ` Juan Quintela
2017-08-09  8:08       ` Peter Xu
2017-08-09 11:12         ` Juan Quintela
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 10/17] migration: Create ram_multifd_page Juan Quintela
2017-07-19 19:02   ` Dr. David Alan Gilbert
2017-07-20  8:10     ` Peter Xu
2017-07-20 11:48       ` Dr. David Alan Gilbert
2017-08-08 15:58         ` Juan Quintela
2017-08-08 16:04       ` Juan Quintela
2017-08-09  7:42         ` Peter Xu
2017-08-08 15:56     ` Juan Quintela
2017-08-08 16:30       ` Dr. David Alan Gilbert
2017-08-08 18:02         ` Juan Quintela
2017-08-08 19:14           ` Dr. David Alan Gilbert
2017-08-09 16:48             ` Paolo Bonzini
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 11/17] migration: Really use multiple pages at a time Juan Quintela
2017-07-19 13:58   ` Daniel P. Berrange
2017-08-08 11:55     ` Juan Quintela
2017-07-20  9:44   ` Dr. David Alan Gilbert
2017-08-08 12:11     ` Juan Quintela
2017-07-20  9:49   ` Peter Xu
2017-07-20 10:09     ` Peter Xu
2017-08-08 16:06     ` Juan Quintela
2017-08-09  7:48       ` Peter Xu
2017-08-09  8:05         ` Juan Quintela
2017-08-09  8:12           ` Peter Xu
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 12/17] migration: Send the fd number which we are going to use for this page Juan Quintela
2017-07-20  9:58   ` Dr. David Alan Gilbert
2017-08-09 16:48   ` Paolo Bonzini
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 13/17] migration: Create thread infrastructure for multifd recv side Juan Quintela
2017-07-20 10:22   ` Peter Xu
2017-08-08 11:41     ` Juan Quintela
2017-08-09  5:53       ` Peter Xu
2017-07-20 10:29   ` Dr. David Alan Gilbert
2017-08-08 11:51     ` Juan Quintela
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 14/17] migration: Delay the start of reception on main channel Juan Quintela
2017-07-20 10:56   ` Dr. David Alan Gilbert
2017-08-08 11:29     ` Juan Quintela
2017-07-20 11:10   ` Peter Xu
2017-08-08 11:30     ` Juan Quintela
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 15/17] migration: Test new fd infrastructure Juan Quintela
2017-07-20 11:20   ` Dr. David Alan Gilbert
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 16/17] migration: Transfer pages over new channels Juan Quintela
2017-07-20 11:31   ` Dr. David Alan Gilbert
2017-08-08 11:13     ` Juan Quintela
2017-08-08 11:32       ` Dr. David Alan Gilbert
2017-07-17 13:42 ` [Qemu-devel] [PATCH v5 17/17] migration: Flush receive queue Juan Quintela
2017-07-20 11:45   ` Dr. David Alan Gilbert
2017-08-08 10:43     ` Juan Quintela
2017-08-08 11:25       ` Dr. David Alan Gilbert
2017-07-21  2:40   ` Peter Xu
2017-08-08 11:40     ` Juan Quintela
2017-08-10  6:49       ` Peter Xu [this message]
2017-07-21  6:03   ` Peter Xu
2017-07-21 10:53     ` Juan Quintela

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170810064903.GN13486@pxdev.xzpeter.org \
    --to=peterx@redhat.com \
    --cc=berrange@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=lvivier@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.