From: Li Guang <lig.fnst@cn.fujitsu.com>
To: "Michael R. Hines" <mrhines@linux.vnet.ibm.com>
Cc: GILR@il.ibm.com, SADEKJ@il.ibm.com, pbonzini@redhat.com,
quintela@redhat.com, qemu-devel@nongnu.org, EREZH@il.ibm.com,
owasserm@redhat.com, junqing.wang@cs2c.com.cn, onom@us.ibm.com,
abali@us.ibm.com, "Michael R. Hines" <mrhines@us.ibm.com>,
gokul@us.ibm.com, dbulkow@gmail.com, hinesmr@cn.ibm.com,
BIRAN@il.ibm.com, isaku.yamahata@gmail.com
Subject: Re: [Qemu-devel] [RFC PATCH v2 06/12] mc: introduce state machine changes for MC
Date: Mon, 24 Feb 2014 14:48:34 +0800 [thread overview]
Message-ID: <530AEB42.6020809@cn.fujitsu.com> (raw)
In-Reply-To: <53070A8E.9040508@linux.vnet.ibm.com>
Michael R. Hines wrote:
> On 02/19/2014 09:00 AM, Li Guang wrote:
>> Hi,
>>
>> mrhines@linux.vnet.ibm.com wrote:
>>> From: "Michael R. Hines"<mrhines@us.ibm.com>
>>>
>>> This patch sets up the initial changes to the migration state
>>> machine and prototypes to be used by the checkpointing code
>>> to interact with the state machine so that we can later handle
>>> failure and recovery scenarios.
>>>
>>> Signed-off-by: Michael R. Hines<mrhines@us.ibm.com>
>>> ---
>>> arch_init.c | 29 ++++++++++++++++++++++++-----
>>> include/migration/migration.h | 2 ++
>>> migration.c | 37
>>> +++++++++++++++++++++----------------
>>> 3 files changed, 47 insertions(+), 21 deletions(-)
>>>
>>> diff --git a/arch_init.c b/arch_init.c
>>> index db75120..e9d4d9e 100644
>>> --- a/arch_init.c
>>> +++ b/arch_init.c
>>> @@ -658,13 +658,13 @@ static void ram_migration_cancel(void *opaque)
>>> migration_end();
>>> }
>>>
>>> -static void reset_ram_globals(void)
>>> +static void reset_ram_globals(bool reset_bulk_stage)
>>> {
>>> last_seen_block = NULL;
>>> last_sent_block = NULL;
>>> last_offset = 0;
>>> last_version = ram_list.version;
>>> - ram_bulk_stage = true;
>>> + ram_bulk_stage = reset_bulk_stage;
>>> }
>>>
>>
>> here is a chance that ram_save_block will never break while loop
>> if loat_seen_block be reset for mc when there are no dirty pages
>> to be migrated.
>>
>> Thanks!
>>
>
> This bug is fixed now - you can re-pull from github.com.
>
> Believe it or not, when there is no network devices attached to the
> guest whatsoever, the initial bootup process can be extremely slow,
> where there are almost no processes dirtying memory at all or
> only occasionally except for maybe a DHCP client. This results in
> some 100ms periods of time where there are actually *no* dirty
> pages - hard to believe, but it does happen.
sorry, seems all my pull requests for github was blocked,
let me check it later.
Thanks!
>
> ram_save_block() really doesn't understand this possibility,
> surprisingly. It results in an infinite loop because it was expecting
> last_seen_block to always be non-NULL, when in fact, we have reset
> the value to start from the beginning of the guest can scan the
> entire VM for dirty memory.
>
>
>>> #define MAX_WAIT 50 /* ms, half buffered_file limit */
>>> @@ -674,6 +674,15 @@ static int ram_save_setup(QEMUFile *f, void
>>> *opaque)
>>> RAMBlock *block;
>>> int64_t ram_pages = last_ram_offset()>> TARGET_PAGE_BITS;
>>>
>>> + /*
>>> + * RAM stays open during micro-checkpointing for the next
>>> transaction.
>>> + */
>>> + if (migration_is_mc(migrate_get_current())) {
>>> + qemu_mutex_lock_ramlist();
>>> + reset_ram_globals(false);
>>> + goto skip_setup;
>>> + }
>>> +
>>> migration_bitmap = bitmap_new(ram_pages);
>>> bitmap_set(migration_bitmap, 0, ram_pages);
>>> migration_dirty_pages = ram_pages;
>>> @@ -710,12 +719,14 @@ static int ram_save_setup(QEMUFile *f, void
>>> *opaque)
>>> qemu_mutex_lock_iothread();
>>> qemu_mutex_lock_ramlist();
>>> bytes_transferred = 0;
>>> - reset_ram_globals();
>>> + reset_ram_globals(true);
>>>
>>> memory_global_dirty_log_start();
>>> migration_bitmap_sync();
>>> qemu_mutex_unlock_iothread();
>>>
>>> +skip_setup:
>>> +
>>> qemu_put_be64(f, ram_bytes_total() | RAM_SAVE_FLAG_MEM_SIZE);
>>>
>>> QTAILQ_FOREACH(block,&ram_list.blocks, next) {
>>> @@ -744,7 +755,7 @@ static int ram_save_iterate(QEMUFile *f, void
>>> *opaque)
>>> qemu_mutex_lock_ramlist();
>>>
>>> if (ram_list.version != last_version) {
>>> - reset_ram_globals();
>>> + reset_ram_globals(true);
>>> }
>>>
>>> ram_control_before_iterate(f, RAM_CONTROL_ROUND);
>>> @@ -825,7 +836,15 @@ static int ram_save_complete(QEMUFile *f, void
>>> *opaque)
>>> }
>>>
>>> ram_control_after_iterate(f, RAM_CONTROL_FINISH);
>>> - migration_end();
>>> +
>>> + /*
>>> + * Only cleanup at the end of normal migrations
>>> + * or if the MC destination failed and we got an error.
>>> + * Otherwise, we are (or will soon be) in MIG_STATE_CHECKPOINTING.
>>> + */
>>> + if(!migrate_use_mc() ||
>>> migration_has_failed(migrate_get_current())) {
>>> + migration_end();
>>> + }
>>>
>>> qemu_mutex_unlock_ramlist();
>>> qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
>>> diff --git a/include/migration/migration.h
>>> b/include/migration/migration.h
>>> index a7c54fe..e876a2c 100644
>>> --- a/include/migration/migration.h
>>> +++ b/include/migration/migration.h
>>> @@ -101,7 +101,9 @@ int migrate_fd_close(MigrationState *s);
>>>
>>> void add_migration_state_change_notifier(Notifier *notify);
>>> void remove_migration_state_change_notifier(Notifier *notify);
>>> +bool migration_is_active(MigrationState *);
>>> bool migration_in_setup(MigrationState *);
>>> +bool migration_is_mc(MigrationState *s);
>>> bool migration_has_finished(MigrationState *);
>>> bool migration_has_failed(MigrationState *);
>>> MigrationState *migrate_get_current(void);
>>> diff --git a/migration.c b/migration.c
>>> index 25add6f..f42dae4 100644
>>> --- a/migration.c
>>> +++ b/migration.c
>>> @@ -36,16 +36,6 @@
>>> do { } while (0)
>>> #endif
>>>
>>> -enum {
>>> - MIG_STATE_ERROR = -1,
>>> - MIG_STATE_NONE,
>>> - MIG_STATE_SETUP,
>>> - MIG_STATE_CANCELLING,
>>> - MIG_STATE_CANCELLED,
>>> - MIG_STATE_ACTIVE,
>>> - MIG_STATE_COMPLETED,
>>> -};
>>> -
>>> #define MAX_THROTTLE (32<< 20) /* Migration speed
>>> throttling */
>>>
>>> /* Amount of time to allocate to each "chunk" of bandwidth-throttled
>>> @@ -273,7 +263,7 @@ void
>>> qmp_migrate_set_capabilities(MigrationCapabilityStatusList *params,
>>> MigrationState *s = migrate_get_current();
>>> MigrationCapabilityStatusList *cap;
>>>
>>> - if (s->state == MIG_STATE_ACTIVE || s->state == MIG_STATE_SETUP) {
>>> + if (migration_is_active(s)) {
>>> error_set(errp, QERR_MIGRATION_ACTIVE);
>>> return;
>>> }
>>> @@ -285,7 +275,13 @@ void
>>> qmp_migrate_set_capabilities(MigrationCapabilityStatusList *params,
>>>
>>> /* shared migration helpers */
>>>
>>> -static void migrate_set_state(MigrationState *s, int old_state, int
>>> new_state)
>>> +bool migration_is_active(MigrationState *s)
>>> +{
>>> + return (s->state == MIG_STATE_ACTIVE) || s->state ==
>>> MIG_STATE_SETUP
>>> + || s->state == MIG_STATE_CHECKPOINTING;
>>> +}
>>> +
>>> +void migrate_set_state(MigrationState *s, int old_state, int
>>> new_state)
>>> {
>>> if (atomic_cmpxchg(&s->state, old_state, new_state) ==
>>> new_state) {
>>> trace_migrate_set_state(new_state);
>>> @@ -309,7 +305,7 @@ static void migrate_fd_cleanup(void *opaque)
>>> s->file = NULL;
>>> }
>>>
>>> - assert(s->state != MIG_STATE_ACTIVE);
>>> + assert(!migration_is_active(s));
>>>
>>> if (s->state != MIG_STATE_COMPLETED) {
>>> qemu_savevm_state_cancel();
>>> @@ -356,7 +352,12 @@ void
>>> remove_migration_state_change_notifier(Notifier *notify)
>>>
>>> bool migration_in_setup(MigrationState *s)
>>> {
>>> - return s->state == MIG_STATE_SETUP;
>>> + return s->state == MIG_STATE_SETUP;
>>> +}
>>> +
>>> +bool migration_is_mc(MigrationState *s)
>>> +{
>>> + return s->state == MIG_STATE_CHECKPOINTING;
>>> }
>>>
>>> bool migration_has_finished(MigrationState *s)
>>> @@ -419,7 +420,8 @@ void qmp_migrate(const char *uri, bool has_blk,
>>> bool blk,
>>> params.shared = has_inc&& inc;
>>>
>>> if (s->state == MIG_STATE_ACTIVE || s->state ==
>>> MIG_STATE_SETUP ||
>>> - s->state == MIG_STATE_CANCELLING) {
>>> + s->state == MIG_STATE_CANCELLING
>>> + || s->state == MIG_STATE_CHECKPOINTING) {
>>> error_set(errp, QERR_MIGRATION_ACTIVE);
>>> return;
>>> }
>>> @@ -624,7 +626,10 @@ static void *migration_thread(void *opaque)
>>> }
>>>
>>> if (!qemu_file_get_error(s->file)) {
>>> - migrate_set_state(s, MIG_STATE_ACTIVE,
>>> MIG_STATE_COMPLETED);
>>> + if (!migrate_use_mc()) {
>>> + migrate_set_state(s,
>>> + MIG_STATE_ACTIVE, MIG_STATE_COMPLETED);
>>> + }
>>> break;
>>> }
>>> }
>>
>
>
next prev parent reply other threads:[~2014-02-24 7:00 UTC|newest]
Thread overview: 68+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-02-18 8:50 [Qemu-devel] [RFC PATCH v2 00/12] mc: fault tolerante through micro-checkpointing mrhines
2014-02-18 8:50 ` [Qemu-devel] [RFC PATCH v2 01/12] mc: add documentation for micro-checkpointing mrhines
2014-02-18 12:45 ` Dr. David Alan Gilbert
2014-02-19 1:40 ` Michael R. Hines
2014-02-19 11:27 ` Dr. David Alan Gilbert
2014-02-20 1:17 ` Michael R. Hines
2014-02-20 10:09 ` Dr. David Alan Gilbert
2014-02-20 11:14 ` Li Guang
2014-02-20 14:58 ` Michael R. Hines
2014-02-20 14:57 ` Michael R. Hines
2014-02-20 16:32 ` Dr. David Alan Gilbert
2014-02-21 4:54 ` Michael R. Hines
2014-02-21 9:44 ` Dr. David Alan Gilbert
2014-03-03 6:08 ` Michael R. Hines
2014-02-18 8:50 ` [Qemu-devel] [RFC PATCH v2 02/12] mc: timestamp migration_bitmap and KVM logdirty usage mrhines
2014-02-18 10:32 ` Dr. David Alan Gilbert
2014-02-19 1:42 ` Michael R. Hines
2014-03-11 21:31 ` Juan Quintela
2014-04-04 3:08 ` Michael R. Hines
2014-02-18 8:50 ` [Qemu-devel] [RFC PATCH v2 03/12] mc: introduce a 'checkpointing' status check into the VCPU states mrhines
2014-03-11 21:36 ` Juan Quintela
2014-04-04 3:11 ` Michael R. Hines
2014-03-11 21:40 ` Eric Blake
2014-04-04 3:12 ` Michael R. Hines
2014-02-18 8:50 ` [Qemu-devel] [RFC PATCH v2 04/12] mc: support custom page loading and copying mrhines
2014-02-18 8:50 ` [Qemu-devel] [RFC PATCH v2 05/12] rdma: accelerated memcpy() support and better external RDMA user interfaces mrhines
2014-02-18 8:50 ` [Qemu-devel] [RFC PATCH v2 06/12] mc: introduce state machine changes for MC mrhines
2014-02-19 1:00 ` Li Guang
2014-02-19 2:14 ` Michael R. Hines
2014-02-20 5:03 ` Michael R. Hines
2014-02-21 8:13 ` Michael R. Hines
2014-02-24 6:48 ` Li Guang [this message]
2014-02-26 2:52 ` Li Guang
2014-03-11 21:57 ` Juan Quintela
2014-04-04 3:50 ` Michael R. Hines
2014-02-18 8:50 ` [Qemu-devel] [RFC PATCH v2 07/12] mc: introduce additional QMP statistics for micro-checkpointing mrhines
2014-03-11 21:45 ` Eric Blake
2014-04-04 3:15 ` Michael R. Hines
2014-04-04 4:22 ` Eric Blake
2014-03-11 21:59 ` Juan Quintela
2014-04-04 3:55 ` Michael R. Hines
2014-02-18 8:50 ` [Qemu-devel] [RFC PATCH v2 08/12] mc: core logic mrhines
2014-02-19 1:07 ` Li Guang
2014-02-19 2:16 ` Michael R. Hines
2014-02-19 2:53 ` Li Guang
2014-02-19 4:27 ` Michael R. Hines
2014-02-18 8:50 ` [Qemu-devel] [RFC PATCH v2 09/12] mc: configure and makefile support mrhines
2014-02-18 8:50 ` [Qemu-devel] [RFC PATCH v2 10/12] mc: expose tunable parameter for checkpointing frequency mrhines
2014-03-11 21:49 ` Eric Blake
2014-03-11 22:15 ` Juan Quintela
2014-03-11 22:49 ` Eric Blake
2014-04-04 5:29 ` Michael R. Hines
2014-04-04 14:56 ` Eric Blake
2014-04-11 6:10 ` Michael R. Hines
2014-04-04 16:28 ` Dr. David Alan Gilbert
2014-04-04 16:35 ` Eric Blake
2014-04-04 3:29 ` Michael R. Hines
2014-02-18 8:50 ` [Qemu-devel] [RFC PATCH v2 11/12] mc: introduce new capabilities to control micro-checkpointing mrhines
2014-03-11 21:57 ` Eric Blake
2014-04-04 3:38 ` Michael R. Hines
2014-04-04 4:25 ` Eric Blake
2014-03-11 22:02 ` Juan Quintela
2014-03-11 22:07 ` Eric Blake
2014-04-04 3:57 ` Michael R. Hines
2014-04-04 3:56 ` Michael R. Hines
2014-02-18 8:50 ` [Qemu-devel] [RFC PATCH v2 12/12] mc: activate and use MC if requested mrhines
2014-02-18 9:28 ` [Qemu-devel] [RFC PATCH v2 00/12] mc: fault tolerante through micro-checkpointing Li Guang
2014-02-19 1:29 ` Michael R. Hines
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=530AEB42.6020809@cn.fujitsu.com \
--to=lig.fnst@cn.fujitsu.com \
--cc=BIRAN@il.ibm.com \
--cc=EREZH@il.ibm.com \
--cc=GILR@il.ibm.com \
--cc=SADEKJ@il.ibm.com \
--cc=abali@us.ibm.com \
--cc=dbulkow@gmail.com \
--cc=gokul@us.ibm.com \
--cc=hinesmr@cn.ibm.com \
--cc=isaku.yamahata@gmail.com \
--cc=junqing.wang@cs2c.com.cn \
--cc=mrhines@linux.vnet.ibm.com \
--cc=mrhines@us.ibm.com \
--cc=onom@us.ibm.com \
--cc=owasserm@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=quintela@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).