Re: [Qemu-devel] [RFC PATCH v2 06/12] mc: introduce state machine changes for MC


From: "Michael R. Hines" <mrhines@linux.vnet.ibm.com>
To: Li Guang <lig.fnst@cn.fujitsu.com>
Cc: GILR@il.ibm.com, SADEKJ@il.ibm.com, pbonzini@redhat.com,
	quintela@redhat.com, qemu-devel@nongnu.org, EREZH@il.ibm.com,
	owasserm@redhat.com, junqing.wang@cs2c.com.cn, onom@us.ibm.com,
	abali@us.ibm.com, "Michael R. Hines" <mrhines@us.ibm.com>,
	gokul@us.ibm.com, dbulkow@gmail.com, hinesmr@cn.ibm.com,
	BIRAN@il.ibm.com, isaku.yamahata@gmail.com
Subject: Re: [Qemu-devel] [RFC PATCH v2 06/12] mc: introduce state machine changes for MC
Date: Fri, 21 Feb 2014 16:13:02 +0800
Message-ID: <53070A8E.9040508@linux.vnet.ibm.com>
In-Reply-To: <53040215.2080102@cn.fujitsu.com>

On 02/19/2014 09:00 AM, Li Guang wrote:
> Hi,
>
> mrhines@linux.vnet.ibm.com wrote:
>> From: "Michael R. Hines"<mrhines@us.ibm.com>
>>
>> This patch sets up the initial changes to the migration state
>> machine and prototypes to be used by the checkpointing code
>> to interact with the state machine so that we can later handle
>> failure and recovery scenarios.
>>
>> Signed-off-by: Michael R. Hines<mrhines@us.ibm.com>
>> ---
>>   arch_init.c                   | 29 ++++++++++++++++++++++++-----
>>   include/migration/migration.h |  2 ++
>>   migration.c                   | 37 +++++++++++++++++++++----------------
>>   3 files changed, 47 insertions(+), 21 deletions(-)
>>
>> diff --git a/arch_init.c b/arch_init.c
>> index db75120..e9d4d9e 100644
>> --- a/arch_init.c
>> +++ b/arch_init.c
>> @@ -658,13 +658,13 @@ static void ram_migration_cancel(void *opaque)
>>       migration_end();
>>   }
>>
>> -static void reset_ram_globals(void)
>> +static void reset_ram_globals(bool reset_bulk_stage)
>>   {
>>       last_seen_block = NULL;
>>       last_sent_block = NULL;
>>       last_offset = 0;
>>       last_version = ram_list.version;
>> -    ram_bulk_stage = true;
>> +    ram_bulk_stage = reset_bulk_stage;
>>   }
>>
>
> There is a chance that ram_save_block will never break out of its
> while loop if last_seen_block is reset for MC when there are no
> dirty pages to be migrated.
>
> Thanks!
>

This bug is fixed now - you can re-pull from github.com.

     Believe it or not, when there are no network devices attached to the
     guest whatsoever, the initial bootup process can be extremely slow:
     almost no processes are dirtying memory at all, or they do so only
     occasionally, except for maybe a DHCP client. This results in
     periods of some 100ms during which there are actually *no* dirty
     pages - hard to believe, but it does happen.

     ram_save_block() really doesn't understand this possibility,
     surprisingly. It results in an infinite loop because it was expecting
     last_seen_block to always be non-NULL, when in fact we have reset
     the value to start from the beginning of the guest and scan the
     entire VM for dirty memory.
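
     To make the failure mode concrete, here is a minimal, standalone
     sketch of a dirty-page scan that terminates even when the saved
     cursor is NULL and nothing is dirty. It is not the actual fix and
     does not use QEMU's data structures: Block, blocks[], NBLOCKS,
     PAGES_PER_BLOCK and find_next_dirty() are all hypothetical names
     invented for illustration. Instead of waiting to come back around to
     last_seen_block, the scan is bounded by the total number of pages, so
     one full pass without a hit simply reports "nothing to send".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NBLOCKS         3   /* hypothetical: number of RAM blocks */
#define PAGES_PER_BLOCK 4   /* hypothetical: pages per block */

/* Hypothetical stand-in for a RAM block: one dirty flag per page. */
typedef struct Block {
    bool dirty[PAGES_PER_BLOCK];
} Block;

static Block blocks[NBLOCKS];

/* Scan cursor. A NULL block means "start from the first block". */
static Block *last_seen_block;
static size_t last_offset;

/*
 * Find and clear the next dirty page, resuming from the saved cursor.
 * Returns true and stores the block/page index on success.
 * Returns false after exactly one full pass with no dirty page, so the
 * loop cannot spin forever, whether or not the cursor was reset to NULL.
 */
static bool find_next_dirty(size_t *blk, size_t *page)
{
    assert(blk != NULL && page != NULL);

    size_t total = (size_t)NBLOCKS * PAGES_PER_BLOCK;
    size_t start = 0;

    if (last_seen_block != NULL) {
        start = (size_t)(last_seen_block - blocks) * PAGES_PER_BLOCK
                + last_offset;
    }

    for (size_t i = 0; i < total; i++) {
        size_t pos = (start + i) % total;   /* wrap around to block 0 */
        size_t b = pos / PAGES_PER_BLOCK;
        size_t p = pos % PAGES_PER_BLOCK;

        if (blocks[b].dirty[p]) {
            blocks[b].dirty[p] = false;     /* page is about to be sent */
            last_seen_block = &blocks[b];
            last_offset = p;
            *blk = b;
            *page = p;
            return true;
        }
    }
    return false;                           /* full pass, nothing dirty */
}
```

     A caller would loop on find_next_dirty() and stop sending as soon as
     it returns false, which is the case that matters during those idle
     periods with no dirty pages.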


>>   #define MAX_WAIT 50 /* ms, half buffered_file limit */
>> @@ -674,6 +674,15 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>>       RAMBlock *block;
>>       int64_t ram_pages = last_ram_offset()>> TARGET_PAGE_BITS;
>>
>> +    /*
>> +     * RAM stays open during micro-checkpointing for the next transaction.
>> +     */
>> +    if (migration_is_mc(migrate_get_current())) {
>> +        qemu_mutex_lock_ramlist();
>> +        reset_ram_globals(false);
>> +        goto skip_setup;
>> +    }
>> +
>>       migration_bitmap = bitmap_new(ram_pages);
>>       bitmap_set(migration_bitmap, 0, ram_pages);
>>       migration_dirty_pages = ram_pages;
>> @@ -710,12 +719,14 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>>       qemu_mutex_lock_iothread();
>>       qemu_mutex_lock_ramlist();
>>       bytes_transferred = 0;
>> -    reset_ram_globals();
>> +    reset_ram_globals(true);
>>
>>       memory_global_dirty_log_start();
>>       migration_bitmap_sync();
>>       qemu_mutex_unlock_iothread();
>>
>> +skip_setup:
>> +
>>       qemu_put_be64(f, ram_bytes_total() | RAM_SAVE_FLAG_MEM_SIZE);
>>
>>       QTAILQ_FOREACH(block,&ram_list.blocks, next) {
>> @@ -744,7 +755,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>>       qemu_mutex_lock_ramlist();
>>
>>       if (ram_list.version != last_version) {
>> -        reset_ram_globals();
>> +        reset_ram_globals(true);
>>       }
>>
>>       ram_control_before_iterate(f, RAM_CONTROL_ROUND);
>> @@ -825,7 +836,15 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
>>       }
>>
>>       ram_control_after_iterate(f, RAM_CONTROL_FINISH);
>> -    migration_end();
>> +
>> +    /*
>> +     * Only cleanup at the end of normal migrations
>> +     * or if the MC destination failed and we got an error.
>> +     * Otherwise, we are (or will soon be) in MIG_STATE_CHECKPOINTING.
>> +     */
>> +    if(!migrate_use_mc() || migration_has_failed(migrate_get_current())) {
>> +        migration_end();
>> +    }
>>
>>       qemu_mutex_unlock_ramlist();
>>       qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
>> diff --git a/include/migration/migration.h b/include/migration/migration.h
>> index a7c54fe..e876a2c 100644
>> --- a/include/migration/migration.h
>> +++ b/include/migration/migration.h
>> @@ -101,7 +101,9 @@ int migrate_fd_close(MigrationState *s);
>>
>>   void add_migration_state_change_notifier(Notifier *notify);
>>   void remove_migration_state_change_notifier(Notifier *notify);
>> +bool migration_is_active(MigrationState *);
>>   bool migration_in_setup(MigrationState *);
>> +bool migration_is_mc(MigrationState *s);
>>   bool migration_has_finished(MigrationState *);
>>   bool migration_has_failed(MigrationState *);
>>   MigrationState *migrate_get_current(void);
>> diff --git a/migration.c b/migration.c
>> index 25add6f..f42dae4 100644
>> --- a/migration.c
>> +++ b/migration.c
>> @@ -36,16 +36,6 @@
>>       do { } while (0)
>>   #endif
>>
>> -enum {
>> -    MIG_STATE_ERROR = -1,
>> -    MIG_STATE_NONE,
>> -    MIG_STATE_SETUP,
>> -    MIG_STATE_CANCELLING,
>> -    MIG_STATE_CANCELLED,
>> -    MIG_STATE_ACTIVE,
>> -    MIG_STATE_COMPLETED,
>> -};
>> -
>>   #define MAX_THROTTLE  (32<<  20)      /* Migration speed throttling */
>>
>>   /* Amount of time to allocate to each "chunk" of bandwidth-throttled
>> @@ -273,7 +263,7 @@ void qmp_migrate_set_capabilities(MigrationCapabilityStatusList *params,
>>       MigrationState *s = migrate_get_current();
>>       MigrationCapabilityStatusList *cap;
>>
>> -    if (s->state == MIG_STATE_ACTIVE || s->state == MIG_STATE_SETUP) {
>> +    if (migration_is_active(s)) {
>>           error_set(errp, QERR_MIGRATION_ACTIVE);
>>           return;
>>       }
>> @@ -285,7 +275,13 @@ void qmp_migrate_set_capabilities(MigrationCapabilityStatusList *params,
>>
>>   /* shared migration helpers */
>>
>> -static void migrate_set_state(MigrationState *s, int old_state, int new_state)
>> +bool migration_is_active(MigrationState *s)
>> +{
>> +    return (s->state == MIG_STATE_ACTIVE) || s->state == MIG_STATE_SETUP
>> +            || s->state == MIG_STATE_CHECKPOINTING;
>> +}
>> +
>> +void migrate_set_state(MigrationState *s, int old_state, int new_state)
>>   {
>>       if (atomic_cmpxchg(&s->state, old_state, new_state) == new_state) {
>>           trace_migrate_set_state(new_state);
>> @@ -309,7 +305,7 @@ static void migrate_fd_cleanup(void *opaque)
>>           s->file = NULL;
>>       }
>>
>> -    assert(s->state != MIG_STATE_ACTIVE);
>> +    assert(!migration_is_active(s));
>>
>>       if (s->state != MIG_STATE_COMPLETED) {
>>           qemu_savevm_state_cancel();
>> @@ -356,7 +352,12 @@ void remove_migration_state_change_notifier(Notifier *notify)
>>
>>   bool migration_in_setup(MigrationState *s)
>>   {
>> -    return s->state == MIG_STATE_SETUP;
>> +        return s->state == MIG_STATE_SETUP;
>> +}
>> +
>> +bool migration_is_mc(MigrationState *s)
>> +{
>> +        return s->state == MIG_STATE_CHECKPOINTING;
>>   }
>>
>>   bool migration_has_finished(MigrationState *s)
>> @@ -419,7 +420,8 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
>>       params.shared = has_inc&&  inc;
>>
>>       if (s->state == MIG_STATE_ACTIVE || s->state == MIG_STATE_SETUP ||
>> -        s->state == MIG_STATE_CANCELLING) {
>> +        s->state == MIG_STATE_CANCELLING
>> +         || s->state == MIG_STATE_CHECKPOINTING) {
>>           error_set(errp, QERR_MIGRATION_ACTIVE);
>>           return;
>>       }
>> @@ -624,7 +626,10 @@ static void *migration_thread(void *opaque)
>>                   }
>>
>>                   if (!qemu_file_get_error(s->file)) {
>> -                    migrate_set_state(s, MIG_STATE_ACTIVE, MIG_STATE_COMPLETED);
>> +                    if (!migrate_use_mc()) {
>> +                        migrate_set_state(s,
>> +                            MIG_STATE_ACTIVE, MIG_STATE_COMPLETED);
>> +                    }
>>                       break;
>>                   }
>>               }
>
