qemu-devel.nongnu.org archive mirror
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Zhang Chen <zhangckid@gmail.com>
Cc: qemu-devel@nongnu.org, Paolo Bonzini <pbonzini@redhat.com>,
	Juan Quintela <quintela@redhat.com>,
	Jason Wang <jasowang@redhat.com>, Eric Blake <eblake@redhat.com>,
	Markus Armbruster <armbru@redhat.com>,
	zhanghailiang <zhang.zhanghailiang@huawei.com>,
	Li Zhijian <lizhijian@cn.fujitsu.com>
Subject: Re: [Qemu-devel] [PATCH V10 05/20] COLO: Add block replication into colo process
Date: Tue, 7 Aug 2018 15:30:36 +0100
Message-ID: <20180807143036.GJ2556@work-vm>
In-Reply-To: <20180722193350.6028-6-zhangckid@gmail.com>

* Zhang Chen (zhangckid@gmail.com) wrote:
> Make sure the master starts block replication only after the slave's
> block replication has started.
> 
> In addition, we need to activate the VM's block devices before going
> into the COLO state.
> 
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> Signed-off-by: Zhang Chen <zhangckid@gmail.com>
> ---
>  migration/colo.c      | 43 +++++++++++++++++++++++++++++++++++++++++++
>  migration/migration.c |  9 +++++++++
>  2 files changed, 52 insertions(+)
> 
> diff --git a/migration/colo.c b/migration/colo.c
> index 081df1835f..e06640c3d6 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -27,6 +27,7 @@
>  #include "replication.h"
>  #include "net/colo-compare.h"
>  #include "net/colo.h"
> +#include "block/block.h"
>  
>  static bool vmstate_loading;
>  static Notifier packets_compare_notifier;
> @@ -56,6 +57,7 @@ static void secondary_vm_do_failover(void)
>  {
>      int old_state;
>      MigrationIncomingState *mis = migration_incoming_get_current();
> +    Error *local_err = NULL;
>  
>      /* Can not do failover during the process of VM's loading VMstate, Or
>       * it will break the secondary VM.
> @@ -73,6 +75,11 @@ static void secondary_vm_do_failover(void)
>      migrate_set_state(&mis->state, MIGRATION_STATUS_COLO,
>                        MIGRATION_STATUS_COMPLETED);
>  
> +    replication_stop_all(true, &local_err);
> +    if (local_err) {
> +        error_report_err(local_err);
> +    }
> +
>      if (!autostart) {
>          error_report("\"-S\" qemu option will be ignored in secondary side");
>          /* recover runstate to normal migration finish state */
> @@ -110,6 +117,7 @@ static void primary_vm_do_failover(void)
>  {
>      MigrationState *s = migrate_get_current();
>      int old_state;
> +    Error *local_err = NULL;
>  
>      migrate_set_state(&s->state, MIGRATION_STATUS_COLO,
>                        MIGRATION_STATUS_COMPLETED);
> @@ -133,6 +141,13 @@ static void primary_vm_do_failover(void)
>                       FailoverStatus_str(old_state));
>          return;
>      }
> +
> +    replication_stop_all(true, &local_err);
> +    if (local_err) {
> +        error_report_err(local_err);
> +        local_err = NULL;
> +    }
> +
>      /* Notify COLO thread that failover work is finished */
>      qemu_sem_post(&s->colo_exit_sem);
>  }
> @@ -356,6 +371,11 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
>      qemu_savevm_state_header(fb);
>      qemu_savevm_state_setup(fb);
>      qemu_mutex_lock_iothread();
> +    replication_do_checkpoint_all(&local_err);
> +    if (local_err) {
> +        qemu_mutex_unlock_iothread();
> +        goto out;
> +    }

In docs/block-replication.txt it says:
  b. replication_do_checkpoint_all()
   This interface is called after all VM state is transferred to
   Secondary QEMU. The Disk buffer will be dropped in this interface.
   The caller must hold the I/O mutex lock if it is in migration/checkpoint
   thread.

but here we call it before qemu_savevm_state_complete_precopy() below, which
is the call that actually transfers all the state.  Which is right?
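
To make the question concrete, here is a toy sketch of the two orderings.
Everything in it (the ToyCheckpoint struct and the toy_* functions) is a
hypothetical stand-in, not QEMU code.  It only records whether the checkpoint
step ran after the state-transfer step, and it takes the wording in
docs/block-replication.txt as the reference ordering, which is an assumption
and is exactly the point in question.

```c
/* Toy model only: none of these names are QEMU APIs. */
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the primary side's checkpoint bookkeeping. */
typedef struct {
    bool lock_held;            /* models holding the iothread lock */
    bool state_transferred;    /* models qemu_savevm_state_complete_precopy() */
    bool checkpoint_done;      /* models replication_do_checkpoint_all() */
    bool doc_order_respected;  /* did the checkpoint follow the transfer? */
} ToyCheckpoint;

static void toy_lock(ToyCheckpoint *c)   { c->lock_held = true; }
static void toy_unlock(ToyCheckpoint *c) { c->lock_held = false; }

static void toy_transfer_state(ToyCheckpoint *c)
{
    assert(c->lock_held);
    c->state_transferred = true;
}

static void toy_do_checkpoint(ToyCheckpoint *c)
{
    /* The documentation says the caller must hold the lock. */
    assert(c->lock_held);
    /* Record whether the state had already been transferred. */
    c->doc_order_respected = c->state_transferred;
    c->checkpoint_done = true;
}

/* Ordering used by this patch: checkpoint first, then transfer. */
bool toy_patch_order(void)
{
    ToyCheckpoint c = { 0 };

    toy_lock(&c);
    toy_do_checkpoint(&c);
    toy_transfer_state(&c);
    toy_unlock(&c);
    return c.doc_order_respected;
}

/* Ordering as the documentation words it: transfer, then checkpoint. */
bool toy_doc_order(void)
{
    ToyCheckpoint c = { 0 };

    toy_lock(&c);
    toy_transfer_state(&c);
    toy_do_checkpoint(&c);
    toy_unlock(&c);
    return c.doc_order_respected;
}
```

In this model toy_patch_order() returns false and toy_doc_order() returns
true.  That says nothing about which ordering is correct for real block
replication; it only shows that the code and the documentation disagree.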

Other than that I think it's OK.

Dave

>      qemu_savevm_state_complete_precopy(fb, false, false);
>      qemu_mutex_unlock_iothread();
>  
> @@ -446,6 +466,12 @@ static void colo_process_checkpoint(MigrationState *s)
>      object_unref(OBJECT(bioc));
>  
>      qemu_mutex_lock_iothread();
> +    replication_start_all(REPLICATION_MODE_PRIMARY, &local_err);
> +    if (local_err) {
> +        qemu_mutex_unlock_iothread();
> +        goto out;
> +    }
> +
>      vm_start();
>      qemu_mutex_unlock_iothread();
>      trace_colo_vm_state_change("stop", "run");
> @@ -585,6 +611,11 @@ void *colo_process_incoming_thread(void *opaque)
>      object_unref(OBJECT(bioc));
>  
>      qemu_mutex_lock_iothread();
> +    replication_start_all(REPLICATION_MODE_SECONDARY, &local_err);
> +    if (local_err) {
> +        qemu_mutex_unlock_iothread();
> +        goto out;
> +    }
>      vm_start();
>      trace_colo_vm_state_change("stop", "run");
>      qemu_mutex_unlock_iothread();
> @@ -665,6 +696,18 @@ void *colo_process_incoming_thread(void *opaque)
>              goto out;
>          }
>  
> +        replication_get_error_all(&local_err);
> +        if (local_err) {
> +            qemu_mutex_unlock_iothread();
> +            goto out;
> +        }
> +        /* discard colo disk buffer */
> +        replication_do_checkpoint_all(&local_err);
> +        if (local_err) {
> +            qemu_mutex_unlock_iothread();
> +            goto out;
> +        }
> +
>          vmstate_loading = false;
>          vm_start();
>          trace_colo_vm_state_change("stop", "run");
> diff --git a/migration/migration.c b/migration/migration.c
> index ce06941706..c97b7660af 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -385,6 +385,7 @@ static void process_incoming_migration_co(void *opaque)
>      MigrationIncomingState *mis = migration_incoming_get_current();
>      PostcopyState ps;
>      int ret;
> +    Error *local_err = NULL;
>  
>      assert(mis->from_src_file);
>      mis->largest_page_size = qemu_ram_pagesize_largest();
> @@ -416,6 +417,14 @@ static void process_incoming_migration_co(void *opaque)
>  
>      /* we get COLO info, and know if we are in COLO mode */
>      if (!ret && migration_incoming_enable_colo()) {
> +        /* Make sure all file formats flush their mutable metadata */
> +        bdrv_invalidate_cache_all(&local_err);
> +        if (local_err) {
> +            migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
> +                    MIGRATION_STATUS_FAILED);
> +            error_report_err(local_err);
> +            exit(EXIT_FAILURE);
> +        }
>          mis->migration_incoming_co = qemu_coroutine_self();
>          qemu_thread_create(&mis->colo_incoming_thread, "COLO incoming",
>               colo_process_incoming_thread, mis, QEMU_THREAD_JOINABLE);
> -- 
> 2.17.1
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
