From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Zhang Chen <zhangckid@gmail.com>
Cc: qemu-devel@nongnu.org, Paolo Bonzini <pbonzini@redhat.com>,
Juan Quintela <quintela@redhat.com>,
jasowang@redhat.com, Eric Blake <eblake@redhat.com>,
Markus Armbruster <armbru@redhat.com>,
zhang.zhanghailiang@huawei.com, lizhijian@cn.fujitsu.com
Subject: Re: [Qemu-devel] [PATCH V10 05/20] COLO: Add block replication into colo process
Date: Fri, 17 Aug 2018 12:07:09 +0100
Message-ID: <20180817110709.GF2459@work-vm>
In-Reply-To: <CAK3tnvK=yQz8=mE8_7OMHiiNwPiRY8dG-7yek_iv_yXHgQYqKg@mail.gmail.com>
* Zhang Chen (zhangckid@gmail.com) wrote:
> On Tue, Aug 7, 2018 at 10:30 PM Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
>
> > * Zhang Chen (zhangckid@gmail.com) wrote:
> > > Make sure the master starts block replication after the slave's block
> > > replication has started.
> > >
> > > Besides, we need to activate the VM's blocks before it goes into
> > > COLO state.
> > >
> > > Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> > > Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> > > Signed-off-by: Zhang Chen <zhangckid@gmail.com>
> > > ---
> > > migration/colo.c | 43 +++++++++++++++++++++++++++++++++++++++++++
> > > migration/migration.c | 9 +++++++++
> > > 2 files changed, 52 insertions(+)
> > >
> > > diff --git a/migration/colo.c b/migration/colo.c
> > > index 081df1835f..e06640c3d6 100644
> > > --- a/migration/colo.c
> > > +++ b/migration/colo.c
> > > @@ -27,6 +27,7 @@
> > > #include "replication.h"
> > > #include "net/colo-compare.h"
> > > #include "net/colo.h"
> > > +#include "block/block.h"
> > >
> > > static bool vmstate_loading;
> > > static Notifier packets_compare_notifier;
> > > @@ -56,6 +57,7 @@ static void secondary_vm_do_failover(void)
> > > {
> > > int old_state;
> > > MigrationIncomingState *mis = migration_incoming_get_current();
> > > + Error *local_err = NULL;
> > >
> > > /* Can not do failover during the process of VM's loading VMstate, Or
> > > * it will break the secondary VM.
> > > @@ -73,6 +75,11 @@ static void secondary_vm_do_failover(void)
> > > migrate_set_state(&mis->state, MIGRATION_STATUS_COLO,
> > > MIGRATION_STATUS_COMPLETED);
> > >
> > > + replication_stop_all(true, &local_err);
> > > + if (local_err) {
> > > + error_report_err(local_err);
> > > + }
> > > +
> > > if (!autostart) {
> > > error_report("\"-S\" qemu option will be ignored in secondary side");
> > > /* recover runstate to normal migration finish state */
> > > @@ -110,6 +117,7 @@ static void primary_vm_do_failover(void)
> > > {
> > > MigrationState *s = migrate_get_current();
> > > int old_state;
> > > + Error *local_err = NULL;
> > >
> > > migrate_set_state(&s->state, MIGRATION_STATUS_COLO,
> > > MIGRATION_STATUS_COMPLETED);
> > > @@ -133,6 +141,13 @@ static void primary_vm_do_failover(void)
> > > FailoverStatus_str(old_state));
> > > return;
> > > }
> > > +
> > > + replication_stop_all(true, &local_err);
> > > + if (local_err) {
> > > + error_report_err(local_err);
> > > + local_err = NULL;
> > > + }
> > > +
> > > /* Notify COLO thread that failover work is finished */
> > > qemu_sem_post(&s->colo_exit_sem);
> > > }
> > > @@ -356,6 +371,11 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
> > > qemu_savevm_state_header(fb);
> > > qemu_savevm_state_setup(fb);
> > > qemu_mutex_lock_iothread();
> > > + replication_do_checkpoint_all(&local_err);
> > > + if (local_err) {
> > > + qemu_mutex_unlock_iothread();
> > > + goto out;
> > > + }
> >
> > In docs/block-replication.txt it says:
> > b. replication_do_checkpoint_all()
> > This interface is called after all VM state is transferred to
> > Secondary QEMU. The Disk buffer will be dropped in this interface.
> > The caller must hold the I/O mutex lock if it is in migration/checkpoint
> > thread.
> >
> > but we're making this call before the call below that actually transfers
> > all the state. Which is right?
> >
>
> Hi Dave,
>
> docs/block-replication.txt means that replication_do_checkpoint_all()
> should be called after the VM state has been transferred on the secondary
> node; on the primary node, this function does not need to wait until all
> the state has been transferred.

OK, it might be worth clarifying the docs.

Dave

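To make the ordering under discussion concrete, here is a minimal,
standalone sketch in C. It is not QEMU code: the stub_* functions are
hypothetical placeholders named after the calls in the patch, and they
only append to an in-memory event log. The secondary-side sequence
assumes, per Zhang Chen's explanation, that the VM state has already
been received before the disk buffer is dropped.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical event log standing in for real QEMU side effects. */
#define MAX_EVENTS 8
static const char *events[MAX_EVENTS];
static int n_events;

static void record(const char *what)
{
    if (n_events < MAX_EVENTS) {
        events[n_events++] = what;
    }
}

/* Placeholder stubs; the names mirror the patch, the bodies do not. */
static void stub_replication_do_checkpoint_all(void) { record("do_checkpoint"); }
static void stub_replication_get_error_all(void)     { record("get_error"); }
static void stub_send_vmstate(void)                  { record("send_vmstate"); }
static void stub_load_vmstate(void)                  { record("load_vmstate"); }
static void stub_vm_start(void)                      { record("vm_start"); }

/* Primary: the patch calls the checkpoint before the state is sent. */
void primary_checkpoint(void)
{
    n_events = 0;
    stub_replication_do_checkpoint_all();
    stub_send_vmstate();
}

/* Secondary: state is loaded first, then the disk buffer is dropped. */
void secondary_checkpoint(void)
{
    n_events = 0;
    stub_load_vmstate();
    stub_replication_get_error_all();
    stub_replication_do_checkpoint_all();
    stub_vm_start();
}

/* Return the position of an event in the log, or -1 if it is absent. */
int event_index(const char *what)
{
    for (int i = 0; i < n_events; i++) {
        if (strcmp(events[i], what) == 0) {
            return i;
        }
    }
    return -1;
}

/* Print the recorded events in order, one per line. */
void print_events(void)
{
    for (int i = 0; i < n_events; i++) {
        printf("%d: %s\n", i, events[i]);
    }
}
```

Calling primary_checkpoint() or secondary_checkpoint() followed by
print_events() shows the two orderings side by side, which is the
asymmetry the docs wording currently does not spell out.
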
> Thanks
> Zhang Chen
>
>
>
> >
> > Other than that I think it's OK.
> >
> > Dave
> >
> > > qemu_savevm_state_complete_precopy(fb, false, false);
> > > qemu_mutex_unlock_iothread();
> > >
> > > @@ -446,6 +466,12 @@ static void colo_process_checkpoint(MigrationState *s)
> > > object_unref(OBJECT(bioc));
> > >
> > > qemu_mutex_lock_iothread();
> > > + replication_start_all(REPLICATION_MODE_PRIMARY, &local_err);
> > > + if (local_err) {
> > > + qemu_mutex_unlock_iothread();
> > > + goto out;
> > > + }
> > > +
> > > vm_start();
> > > qemu_mutex_unlock_iothread();
> > > trace_colo_vm_state_change("stop", "run");
> > > @@ -585,6 +611,11 @@ void *colo_process_incoming_thread(void *opaque)
> > > object_unref(OBJECT(bioc));
> > >
> > > qemu_mutex_lock_iothread();
> > > + replication_start_all(REPLICATION_MODE_SECONDARY, &local_err);
> > > + if (local_err) {
> > > + qemu_mutex_unlock_iothread();
> > > + goto out;
> > > + }
> > > vm_start();
> > > trace_colo_vm_state_change("stop", "run");
> > > qemu_mutex_unlock_iothread();
> > > @@ -665,6 +696,18 @@ void *colo_process_incoming_thread(void *opaque)
> > > goto out;
> > > }
> > >
> > > + replication_get_error_all(&local_err);
> > > + if (local_err) {
> > > + qemu_mutex_unlock_iothread();
> > > + goto out;
> > > + }
> > > + /* discard colo disk buffer */
> > > + replication_do_checkpoint_all(&local_err);
> > > + if (local_err) {
> > > + qemu_mutex_unlock_iothread();
> > > + goto out;
> > > + }
> > > +
> > > vmstate_loading = false;
> > > vm_start();
> > > trace_colo_vm_state_change("stop", "run");
> > > diff --git a/migration/migration.c b/migration/migration.c
> > > index ce06941706..c97b7660af 100644
> > > --- a/migration/migration.c
> > > +++ b/migration/migration.c
> > > @@ -385,6 +385,7 @@ static void process_incoming_migration_co(void *opaque)
> > > MigrationIncomingState *mis = migration_incoming_get_current();
> > > PostcopyState ps;
> > > int ret;
> > > + Error *local_err = NULL;
> > >
> > > assert(mis->from_src_file);
> > > mis->largest_page_size = qemu_ram_pagesize_largest();
> > > @@ -416,6 +417,14 @@ static void process_incoming_migration_co(void *opaque)
> > >
> > > /* we get COLO info, and know if we are in COLO mode */
> > > if (!ret && migration_incoming_enable_colo()) {
> > > + /* Make sure all file formats flush their mutable metadata */
> > > + bdrv_invalidate_cache_all(&local_err);
> > > + if (local_err) {
> > > + migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
> > > + MIGRATION_STATUS_FAILED);
> > > + error_report_err(local_err);
> > > + exit(EXIT_FAILURE);
> > > + }
> > > mis->migration_incoming_co = qemu_coroutine_self();
> > > qemu_thread_create(&mis->colo_incoming_thread, "COLO incoming",
> > > colo_process_incoming_thread, mis, QEMU_THREAD_JOINABLE);
> > > --
> > > 2.17.1
> > >
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> >
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK