From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Zhang Chen <zhangckid@gmail.com>
Cc: qemu-devel@nongnu.org, Eric Blake <eblake@redhat.com>,
Markus Armbruster <armbru@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Jason Wang <jasowang@redhat.com>,
zhanghailiang <zhang.zhanghailiang@huawei.com>,
Li Zhijian <lizhijian@cn.fujitsu.com>
Subject: Re: [Qemu-devel] [PATCH V7 RESEND 12/17] savevm: split the process of different stages for loadvm/savevm
Date: Tue, 15 May 2018 19:56:03 +0100
Message-ID: <20180515185603.GF2749@work-vm>
In-Reply-To: <20180514165424.12884-13-zhangckid@gmail.com>
* Zhang Chen (zhangckid@gmail.com) wrote:
> From: zhanghailiang <zhang.zhanghailiang@huawei.com>
>
> There are several stages during loadvm/savevm process. In different stage,
> migration incoming processes different types of sections.
> We want to control these stages more accuracy, it will benefit COLO
> performance, we don't have to save type of QEMU_VM_SECTION_START
> sections everytime while do checkpoint, besides, we want to separate
> the process of saving/loading memory and devices state.
>
> So we add three new helper functions: qemu_load_device_state() and
> qemu_savevm_live_state() to achieve different process during migration.
>
> Besides, we make qemu_loadvm_state_main() and qemu_save_device_state()
> public, and simplify the codes of qemu_save_device_state() by calling the
> wrapper qemu_savevm_state_header().
>
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> Signed-off-by: Zhang Chen <zhangckid@gmail.com>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
> migration/colo.c | 36 ++++++++++++++++++++++++++++--------
> migration/savevm.c | 35 ++++++++++++++++++++++++++++-------
> migration/savevm.h | 4 ++++
> 3 files changed, 60 insertions(+), 15 deletions(-)
>
> diff --git a/migration/colo.c b/migration/colo.c
> index cdff0a2490..5b055f79f1 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -30,6 +30,7 @@
> #include "block/block.h"
> #include "qapi/qapi-events-migration.h"
> #include "qapi/qmp/qerror.h"
> +#include "sysemu/cpus.h"
>
> static bool vmstate_loading;
> static Notifier packets_compare_notifier;
> @@ -414,23 +415,30 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
>
> /* Disable block migration */
> migrate_set_block_enabled(false, &local_err);
> - qemu_savevm_state_header(fb);
> - qemu_savevm_state_setup(fb);
> qemu_mutex_lock_iothread();
> replication_do_checkpoint_all(&local_err);
> if (local_err) {
> qemu_mutex_unlock_iothread();
> goto out;
> }
> - qemu_savevm_state_complete_precopy(fb, false, false);
> - qemu_mutex_unlock_iothread();
> -
> - qemu_fflush(fb);
>
> colo_send_message(s->to_dst_file, COLO_MESSAGE_VMSTATE_SEND, &local_err);
> if (local_err) {
> goto out;
> }
> + /*
> + * Only save VM's live state, which not including device state.
> + * TODO: We may need a timeout mechanism to prevent COLO process
> + * to be blocked here.
> + */
I guess that's the downside of transmitting it directly rather than into
the buffer: a stalled destination can block the COLO thread here.
Peter Xu's OOB command system would let you kill the connection, and
that's something I think COLO should use.
Still, the change saves you having that huge outgoing buffer on the
source side and lets you start sending the checkpoint sooner, which
means the pause time should be smaller.
> + qemu_savevm_live_state(s->to_dst_file);
Does this actually need to be inside the qemu_mutex_lock_iothread()
section? I'm pretty sure qemu_save_device_state() needs to be, but I'm
not sure qemu_savevm_live_state() does.
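To make the question concrete, here is a toy model of the alternative
ordering. Nothing in it is QEMU code: the lock is a plain flag, and
sketch_send_live_state() / sketch_save_device_state() are hypothetical
stand-ins for qemu_savevm_live_state() and qemu_save_device_state().
It only shows the shape of "drop the lock around the slow send, retake
it for the device state"; whether that is actually safe is the open
question.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the iothread lock: just a flag, no real threading. */
bool sketch_lock_held;

/* Recorded by the stand-ins so the ordering can be checked afterwards. */
bool live_state_sent_under_lock;
bool device_state_saved_under_lock;

void sketch_lock(void)
{
    assert(!sketch_lock_held);      /* no recursive locking in this model */
    sketch_lock_held = true;
}

void sketch_unlock(void)
{
    assert(sketch_lock_held);
    sketch_lock_held = false;
}

/* Stand-in for qemu_savevm_live_state(): may block on the socket. */
void sketch_send_live_state(void)
{
    live_state_sent_under_lock = sketch_lock_held;
}

/* Stand-in for qemu_save_device_state(): assumed to need the lock. */
int sketch_save_device_state(void)
{
    device_state_saved_under_lock = sketch_lock_held;
    return 0;
}

/* Hypothetical ordering: only the device state is saved under the lock. */
int sketch_checkpoint(void)
{
    int ret;

    sketch_lock();
    /* replication_do_checkpoint_all() would run here */
    sketch_unlock();

    sketch_send_live_state();       /* slow part, lock not held */

    sketch_lock();
    ret = sketch_save_device_state();
    sketch_unlock();

    return ret;
}
```

After sketch_checkpoint() returns, live_state_sent_under_lock is false
and device_state_saved_under_lock is true, which is the split being
asked about.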
> + /* Note: device state is saved into buffer */
> + ret = qemu_save_device_state(fb);
> +
> + qemu_mutex_unlock_iothread();
> +
> + qemu_fflush(fb);
> +
> /*
> * We need the size of the VMstate data in Secondary side,
> * With which we can decide how much data should be read.
> @@ -643,6 +651,7 @@ void *colo_process_incoming_thread(void *opaque)
> uint64_t total_size;
> uint64_t value;
> Error *local_err = NULL;
> + int ret;
>
> qemu_sem_init(&mis->colo_incoming_sem, 0);
>
> @@ -715,6 +724,16 @@ void *colo_process_incoming_thread(void *opaque)
> goto out;
> }
>
> + qemu_mutex_lock_iothread();
> + cpu_synchronize_all_pre_loadvm();
> + ret = qemu_loadvm_state_main(mis->from_src_file, mis);
> + qemu_mutex_unlock_iothread();
> +
> + if (ret < 0) {
> + error_report("Load VM's live state (ram) error");
> + goto out;
> + }
> +
> value = colo_receive_message_value(mis->from_src_file,
> COLO_MESSAGE_VMSTATE_SIZE, &local_err);
> if (local_err) {
> @@ -748,8 +767,9 @@ void *colo_process_incoming_thread(void *opaque)
> qemu_mutex_lock_iothread();
> qemu_system_reset(SHUTDOWN_CAUSE_NONE);
Is the reset safe? Are you sure it doesn't change the ram you've just
loaded?
> vmstate_loading = true;
> - if (qemu_loadvm_state(fb) < 0) {
> - error_report("COLO: loadvm failed");
> + ret = qemu_load_device_state(fb);
> + if (ret < 0) {
> + error_report("COLO: load device state failed");
> qemu_mutex_unlock_iothread();
> goto out;
> }
> diff --git a/migration/savevm.c b/migration/savevm.c
> index ec0bff09ce..0f61239429 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1332,13 +1332,20 @@ done:
> return ret;
> }
>
> -static int qemu_save_device_state(QEMUFile *f)
> +void qemu_savevm_live_state(QEMUFile *f)
> {
> - SaveStateEntry *se;
> + /* save QEMU_VM_SECTION_END section */
> + qemu_savevm_state_complete_precopy(f, true, false);
> + qemu_put_byte(f, QEMU_VM_EOF);
> +}
>
> - qemu_put_be32(f, QEMU_VM_FILE_MAGIC);
> - qemu_put_be32(f, QEMU_VM_FILE_VERSION);
> +int qemu_save_device_state(QEMUFile *f)
> +{
> + SaveStateEntry *se;
>
> + if (!migration_in_colo_state()) {
> + qemu_savevm_state_header(f);
> + }
> cpu_synchronize_all_states();
So this changes qemu_save_device_state() to use
qemu_savevm_state_header(), which feels reasonable, but that also emits
the 'configuration' section; do we want that? Is that OK for Xen's use
in qmp_xen_save_devices_state?
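A small self-contained illustration of why this matters for an existing
consumer of the stream. The sketch_* names and the buffer type are
hypothetical, and the constant values are as I recall them from
migration/savevm.c, so treat them as assumptions. The real
configuration section carries a vmstate payload after the marker byte;
that payload is left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values, recalled from migration/savevm.c; verify before relying on them. */
#define SKETCH_VM_FILE_MAGIC    0x5145564dU
#define SKETCH_VM_FILE_VERSION  0x00000003U
#define SKETCH_VM_CONFIGURATION 0x07

/* Hypothetical in-memory stand-in for a QEMUFile. */
struct sketch_buf {
    uint8_t data[64];
    size_t len;
};

void sketch_put_byte(struct sketch_buf *b, uint8_t v)
{
    assert(b->len < sizeof(b->data));
    b->data[b->len++] = v;
}

void sketch_put_be32(struct sketch_buf *b, uint32_t v)
{
    sketch_put_byte(b, (uint8_t)(v >> 24));
    sketch_put_byte(b, (uint8_t)(v >> 16));
    sketch_put_byte(b, (uint8_t)(v >> 8));
    sketch_put_byte(b, (uint8_t)v);
}

/* What the old qemu_save_device_state() open-coded: magic and version only. */
void sketch_old_header(struct sketch_buf *b)
{
    sketch_put_be32(b, SKETCH_VM_FILE_MAGIC);
    sketch_put_be32(b, SKETCH_VM_FILE_VERSION);
}

/* Shape of the shared header helper: may append a configuration section. */
void sketch_new_header(struct sketch_buf *b, bool send_configuration)
{
    sketch_put_be32(b, SKETCH_VM_FILE_MAGIC);
    sketch_put_be32(b, SKETCH_VM_FILE_VERSION);
    if (send_configuration) {
        sketch_put_byte(b, SKETCH_VM_CONFIGURATION);
        /* the real helper would follow this with the configuration payload */
    }
}
```

With send_configuration set, the stream no longer starts its device
sections at byte 8, so anything that parses the old layout would see an
unexpected section type there.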
> QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
> @@ -1394,8 +1401,6 @@ enum LoadVMExitCodes {
> LOADVM_QUIT = 1,
> };
>
> -static int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis);
> -
> /* ------ incoming postcopy messages ------ */
> /* 'advise' arrives before any transfers just to tell us that a postcopy
> * *might* happen - it might be skipped if precopy transferred everything
> @@ -2075,7 +2080,7 @@ void qemu_loadvm_state_cleanup(void)
> }
> }
>
> -static int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis)
> +int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis)
> {
> uint8_t section_type;
> int ret = 0;
> @@ -2229,6 +2234,22 @@ int qemu_loadvm_state(QEMUFile *f)
> return ret;
> }
>
> +int qemu_load_device_state(QEMUFile *f)
> +{
> + MigrationIncomingState *mis = migration_incoming_get_current();
> + int ret;
> +
> + /* Load QEMU_VM_SECTION_FULL section */
> + ret = qemu_loadvm_state_main(f, mis);
> + if (ret < 0) {
> + error_report("Failed to load device state: %d", ret);
> + return ret;
> + }
> +
> + cpu_synchronize_all_post_init();
> + return 0;
> +}
> +
> int save_snapshot(const char *name, Error **errp)
> {
> BlockDriverState *bs, *bs1;
> diff --git a/migration/savevm.h b/migration/savevm.h
> index c6d46b37a2..cf7935dd68 100644
> --- a/migration/savevm.h
> +++ b/migration/savevm.h
> @@ -53,8 +53,12 @@ void qemu_savevm_send_postcopy_ram_discard(QEMUFile *f, const char *name,
> uint64_t *start_list,
> uint64_t *length_list);
> void qemu_savevm_send_colo_enable(QEMUFile *f);
> +void qemu_savevm_live_state(QEMUFile *f);
> +int qemu_save_device_state(QEMUFile *f);
>
> int qemu_loadvm_state(QEMUFile *f);
> void qemu_loadvm_state_cleanup(void);
> +int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis);
> +int qemu_load_device_state(QEMUFile *f);
>
> #endif
> --
> 2.17.0
Dave
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK