From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Zhang Chen <zhangckid@gmail.com>, stefanha@redhat.com
Cc: qemu-devel@nongnu.org, Eric Blake <eblake@redhat.com>,
	Markus Armbruster <armbru@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Jason Wang <jasowang@redhat.com>,
	zhanghailiang <zhang.zhanghailiang@huawei.com>,
	Li Zhijian <lizhijian@cn.fujitsu.com>
Subject: Re: [Qemu-devel] [PATCH V7 RESEND 05/17] COLO: Add block replication into colo process
Date: Wed, 16 May 2018 16:54:54 +0100
Message-ID: <20180516155454.GB15675@work-vm>
In-Reply-To: <20180514165424.12884-6-zhangckid@gmail.com>

* Zhang Chen (zhangckid@gmail.com) wrote:
> Make sure the master starts block replication after the slave's block
> replication has started.
> 
> Besides, we need to activate the VM's blocks before it goes into
> COLO state.

Stefan: This looks mostly OK to me; how does it look from the block
side?

The only thing I'd like to be convinced of is that
replication_do_checkpoint_all() is synchronous enough for us
to know that the destination has received all disk IO
for one checkpoint before the primary starts running the next one.

Also, in colo_do_checkpoint_transaction() the replication checkpoint is
called near the start; is that the right point, or should it come after
the device saves (could they spit out one last write)?
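
To make the ordering question concrete, here is a toy model of it. This
is not QEMU code: ToyDisk and every toy_* name are hypothetical, and the
assumption that a device save handler can issue a late write is exactly
the thing being asked about, not something the patch confirms.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; none of these names exist in QEMU. */
typedef struct {
    int pending;   /* writes issued since the last replication checkpoint */
    int covered;   /* writes already folded into a checkpoint */
} ToyDisk;

/* A write that reaches the replicated disk. */
static void toy_disk_write(ToyDisk *d)
{
    d->pending++;
}

/* Stand-in for the replication checkpoint: fold all pending writes in. */
static void toy_replication_checkpoint(ToyDisk *d)
{
    d->covered += d->pending;
    d->pending = 0;
}

/* Stand-in for a device save handler that may emit one last write. */
static void toy_device_save(ToyDisk *d, bool late_write)
{
    if (late_write) {
        toy_disk_write(d);
    }
}

/* Run one checkpoint; return how many writes were left outside it. */
static int toy_writes_missed(bool checkpoint_before_save, bool late_write)
{
    ToyDisk d = { .pending = 3, .covered = 0 };

    if (checkpoint_before_save) {
        toy_replication_checkpoint(&d);
        toy_device_save(&d, late_write);
    } else {
        toy_device_save(&d, late_write);
        toy_replication_checkpoint(&d);
    }
    return d.pending;
}

/* Self-check of the two orderings in this model. */
static void toy_self_check(void)
{
    assert(toy_writes_missed(true, true) == 1);   /* late write escapes */
    assert(toy_writes_missed(false, true) == 0);  /* late write is covered */
}
```

In this model, checkpointing before the device save leaves the late
write outside the checkpoint, while checkpointing after it does not.
Whether real device save handlers ever behave like toy_device_save()
with late_write set is the open question.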

Dave

> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> Signed-off-by: Zhang Chen <zhangckid@gmail.com>
> ---
>  migration/colo.c      | 43 +++++++++++++++++++++++++++++++++++++++++++
>  migration/migration.c |  9 +++++++++
>  2 files changed, 52 insertions(+)
> 
> diff --git a/migration/colo.c b/migration/colo.c
> index 081df1835f..e06640c3d6 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -27,6 +27,7 @@
>  #include "replication.h"
>  #include "net/colo-compare.h"
>  #include "net/colo.h"
> +#include "block/block.h"
>  
>  static bool vmstate_loading;
>  static Notifier packets_compare_notifier;
> @@ -56,6 +57,7 @@ static void secondary_vm_do_failover(void)
>  {
>      int old_state;
>      MigrationIncomingState *mis = migration_incoming_get_current();
> +    Error *local_err = NULL;
>  
>      /* Can not do failover during the process of VM's loading VMstate, Or
>       * it will break the secondary VM.
> @@ -73,6 +75,11 @@ static void secondary_vm_do_failover(void)
>      migrate_set_state(&mis->state, MIGRATION_STATUS_COLO,
>                        MIGRATION_STATUS_COMPLETED);
>  
> +    replication_stop_all(true, &local_err);
> +    if (local_err) {
> +        error_report_err(local_err);
> +    }
> +
>      if (!autostart) {
>          error_report("\"-S\" qemu option will be ignored in secondary side");
>          /* recover runstate to normal migration finish state */
> @@ -110,6 +117,7 @@ static void primary_vm_do_failover(void)
>  {
>      MigrationState *s = migrate_get_current();
>      int old_state;
> +    Error *local_err = NULL;
>  
>      migrate_set_state(&s->state, MIGRATION_STATUS_COLO,
>                        MIGRATION_STATUS_COMPLETED);
> @@ -133,6 +141,13 @@ static void primary_vm_do_failover(void)
>                       FailoverStatus_str(old_state));
>          return;
>      }
> +
> +    replication_stop_all(true, &local_err);
> +    if (local_err) {
> +        error_report_err(local_err);
> +        local_err = NULL;
> +    }
> +
>      /* Notify COLO thread that failover work is finished */
>      qemu_sem_post(&s->colo_exit_sem);
>  }
> @@ -356,6 +371,11 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
>      qemu_savevm_state_header(fb);
>      qemu_savevm_state_setup(fb);
>      qemu_mutex_lock_iothread();
> +    replication_do_checkpoint_all(&local_err);
> +    if (local_err) {
> +        qemu_mutex_unlock_iothread();
> +        goto out;
> +    }
>      qemu_savevm_state_complete_precopy(fb, false, false);
>      qemu_mutex_unlock_iothread();
>  
> @@ -446,6 +466,12 @@ static void colo_process_checkpoint(MigrationState *s)
>      object_unref(OBJECT(bioc));
>  
>      qemu_mutex_lock_iothread();
> +    replication_start_all(REPLICATION_MODE_PRIMARY, &local_err);
> +    if (local_err) {
> +        qemu_mutex_unlock_iothread();
> +        goto out;
> +    }
> +
>      vm_start();
>      qemu_mutex_unlock_iothread();
>      trace_colo_vm_state_change("stop", "run");
> @@ -585,6 +611,11 @@ void *colo_process_incoming_thread(void *opaque)
>      object_unref(OBJECT(bioc));
>  
>      qemu_mutex_lock_iothread();
> +    replication_start_all(REPLICATION_MODE_SECONDARY, &local_err);
> +    if (local_err) {
> +        qemu_mutex_unlock_iothread();
> +        goto out;
> +    }
>      vm_start();
>      trace_colo_vm_state_change("stop", "run");
>      qemu_mutex_unlock_iothread();
> @@ -665,6 +696,18 @@ void *colo_process_incoming_thread(void *opaque)
>              goto out;
>          }
>  
> +        replication_get_error_all(&local_err);
> +        if (local_err) {
> +            qemu_mutex_unlock_iothread();
> +            goto out;
> +        }
> +        /* discard colo disk buffer */
> +        replication_do_checkpoint_all(&local_err);
> +        if (local_err) {
> +            qemu_mutex_unlock_iothread();
> +            goto out;
> +        }
> +
>          vmstate_loading = false;
>          vm_start();
>          trace_colo_vm_state_change("stop", "run");
> diff --git a/migration/migration.c b/migration/migration.c
> index bca187275a..ddd0c4b988 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -357,6 +357,7 @@ static void process_incoming_migration_co(void *opaque)
>      MigrationIncomingState *mis = migration_incoming_get_current();
>      PostcopyState ps;
>      int ret;
> +    Error *local_err = NULL;
>  
>      assert(mis->from_src_file);
>      mis->largest_page_size = qemu_ram_pagesize_largest();
> @@ -388,6 +389,14 @@ static void process_incoming_migration_co(void *opaque)
>  
>      /* we get COLO info, and know if we are in COLO mode */
>      if (!ret && migration_incoming_enable_colo()) {
> +        /* Make sure all file formats flush their mutable metadata */
> +        bdrv_invalidate_cache_all(&local_err);
> +        if (local_err) {
> +            migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
> +                    MIGRATION_STATUS_FAILED);
> +            error_report_err(local_err);
> +            exit(EXIT_FAILURE);
> +        }
>          mis->migration_incoming_co = qemu_coroutine_self();
>          qemu_thread_create(&mis->colo_incoming_thread, "COLO incoming",
>               colo_process_incoming_thread, mis, QEMU_THREAD_JOINABLE);
> -- 
> 2.17.0
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
