From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:57829) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ddGTj-0007Md-Ra for qemu-devel@nongnu.org; Thu, 03 Aug 2017 09:47:57 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ddGTg-0007hG-SX for qemu-devel@nongnu.org; Thu, 03 Aug 2017 09:47:55 -0400 Received: from mx1.redhat.com ([209.132.183.28]:59886) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1ddGTg-0007gV-J6 for qemu-devel@nongnu.org; Thu, 03 Aug 2017 09:47:52 -0400 Date: Thu, 3 Aug 2017 14:47:44 +0100 From: "Dr. David Alan Gilbert" Message-ID: <20170803134744.GL2076@work-vm> References: <1501229198-30588-1-git-send-email-peterx@redhat.com> <1501229198-30588-29-git-send-email-peterx@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1501229198-30588-29-git-send-email-peterx@redhat.com> Subject: Re: [Qemu-devel] [RFC 28/29] migration: final handshake for the resume List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Peter Xu Cc: qemu-devel@nongnu.org, Laurent Vivier , Alexey Perevalov , Juan Quintela , Andrea Arcangeli * Peter Xu (peterx@redhat.com) wrote: > Finish the last step to do the final handshake for the recovery. > > First source sends one MIG_CMD_RESUME to dst, telling that source is > ready to resume. > > Then, dest replies with MIG_RP_MSG_RESUME_ACK to source, telling that > dest is ready to resume (after switch to postcopy-active state). > > When source received the RESUME_ACK, it switches its state to > postcopy-active, and finally the recovery is completed. > > Signed-off-by: Peter Xu > --- > migration/migration.c | 39 +++++++++++++++++++++++++++++++++++---- > 1 file changed, 35 insertions(+), 4 deletions(-) > > diff --git a/migration/migration.c b/migration/migration.c > index 93fbc96..ecebe30 100644 > --- a/migration/migration.c > +++ b/migration/migration.c > @@ -1666,6 +1666,13 @@ static int migrate_handle_rp_recv_bitmap(MigrationState *s, char *block_name) > return ram_dirty_bitmap_reload(s, block); > } > > +static void postcopy_resume_handshake_ack(MigrationState *s) > +{ > + qemu_mutex_lock(&s->resume_lock); > + qemu_cond_signal(&s->resume_cond); > + qemu_mutex_unlock(&s->resume_lock); > +} > + > static int migrate_handle_rp_resume_ack(MigrationState *s, uint32_t value) > { > trace_source_return_path_thread_resume_ack(value); > @@ -1680,7 +1687,8 @@ static int migrate_handle_rp_resume_ack(MigrationState *s, uint32_t value) > migrate_set_state(&s->state, MIGRATION_STATUS_POSTCOPY_RECOVER, > MIGRATION_STATUS_POSTCOPY_ACTIVE); > > - /* TODO: notify send thread that time to continue send pages */ > + /* Notify send thread that time to continue send pages */ > + postcopy_resume_handshake_ack(s); > > return 0; > } > @@ -2151,6 +2159,25 @@ static bool postcopy_should_start(MigrationState *s) > return atomic_read(&s->start_postcopy) || s->start_postcopy_fast; > } > > +static int postcopy_resume_handshake(MigrationState *s) > +{ > + qemu_mutex_lock(&s->resume_lock); > + > + qemu_savevm_send_postcopy_resume(s->to_dst_file); > + > + while (s->state == MIGRATION_STATUS_POSTCOPY_RECOVER) { > + qemu_cond_wait(&s->resume_cond, &s->resume_lock); > + } > + > + qemu_mutex_unlock(&s->resume_lock); > + > + if (s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) { > + return 0; > + } That feels to be a small racy - couldn't that validly become a MIGRATION_STATUS_COMPLETED before that check? I wonder if we need to change migrate_fd_cancel to be able to cause a cancel in this case? Dave > + return -1; > +} > + > /* Return zero if success, or <0 for error */ > static int postcopy_do_resume(MigrationState *s) > { > @@ -2168,10 +2195,14 @@ static int postcopy_do_resume(MigrationState *s) > } > > /* > - * TODO: handshake with dest using MIG_CMD_RESUME, > - * MIG_RP_MSG_RESUME_ACK, then switch source state to > - * "postcopy-active" > + * Last handshake with destination on the resume (destination will > + * switch to postcopy-active afterwards) > */ > + ret = postcopy_resume_handshake(s); > + if (ret) { > + error_report("%s: handshake failed: %d", __func__, ret); > + return ret; > + } > > return 0; > } > -- > 2.7.4 > -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK