From: "Michael R. Hines" <mrhines@linux.vnet.ibm.com>
To: quintela@redhat.com
Cc: GILR@il.ibm.com, SADEKJ@il.ibm.com, BIRAN@il.ibm.com,
hinesmr@cn.ibm.com, qemu-devel@nongnu.org, EREZH@il.ibm.com,
owasserm@redhat.com, onom@us.ibm.com, junqing.wang@cs2c.com.cn,
lig.fnst@cn.fujitsu.com, gokul@us.ibm.com, dbulkow@gmail.com,
pbonzini@redhat.com, abali@us.ibm.com, isaku.yamahata@gmail.com,
"Michael R. Hines" <mrhines@us.ibm.com>
Subject: Re: [Qemu-devel] [RFC PATCH v2 06/12] mc: introduce state machine changes for MC
Date: Fri, 04 Apr 2014 11:50:12 +0800 [thread overview]
Message-ID: <533E2BF4.7020301@linux.vnet.ibm.com> (raw)
In-Reply-To: <87zjkwz9d2.fsf@elfo.mitica>
On 03/12/2014 05:57 AM, Juan Quintela wrote:
> mrhines@linux.vnet.ibm.com wrote:
>> From: "Michael R. Hines" <mrhines@us.ibm.com>
>>
>> This patch sets up the initial changes to the migration state
>> machine and prototypes to be used by the checkpointing code
>> to interact with the state machine so that we can later handle
>> failure and recovery scenarios.
>>
>> Signed-off-by: Michael R. Hines <mrhines@us.ibm.com>
>> ---
>> arch_init.c | 29 ++++++++++++++++++++++++-----
>> include/migration/migration.h | 2 ++
>> migration.c | 37 +++++++++++++++++++++----------------
>> 3 files changed, 47 insertions(+), 21 deletions(-)
>>
>> diff --git a/arch_init.c b/arch_init.c
>> index db75120..e9d4d9e 100644
>> --- a/arch_init.c
>> +++ b/arch_init.c
>> @@ -658,13 +658,13 @@ static void ram_migration_cancel(void *opaque)
>> migration_end();
>> }
>>
>> -static void reset_ram_globals(void)
>> +static void reset_ram_globals(bool reset_bulk_stage)
>> {
>> last_seen_block = NULL;
>> last_sent_block = NULL;
>> last_offset = 0;
>> last_version = ram_list.version;
>> - ram_bulk_stage = true;
>> + ram_bulk_stage = reset_bulk_stage;
>> }
>>
>> #define MAX_WAIT 50 /* ms, half buffered_file limit */
>> @@ -674,6 +674,15 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>> RAMBlock *block;
>> int64_t ram_pages = last_ram_offset() >> TARGET_PAGE_BITS;
>>
>> + /*
>> + * RAM stays open during micro-checkpointing for the next transaction.
>> + */
>> + if (migration_is_mc(migrate_get_current())) {
>> + qemu_mutex_lock_ramlist();
>> + reset_ram_globals(false);
>> + goto skip_setup;
>> + }
>> +
>> migration_bitmap = bitmap_new(ram_pages);
>> bitmap_set(migration_bitmap, 0, ram_pages);
>> migration_dirty_pages = ram_pages;
>> @@ -710,12 +719,14 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>> qemu_mutex_lock_iothread();
>> qemu_mutex_lock_ramlist();
>> bytes_transferred = 0;
>> - reset_ram_globals();
>> + reset_ram_globals(true);
>>
>> memory_global_dirty_log_start();
>> migration_bitmap_sync();
>> qemu_mutex_unlock_iothread();
>>
>> +skip_setup:
>> +
>> qemu_put_be64(f, ram_bytes_total() | RAM_SAVE_FLAG_MEM_SIZE);
>>
>> QTAILQ_FOREACH(block, &ram_list.blocks, next) {
>> @@ -744,7 +755,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>> qemu_mutex_lock_ramlist();
>>
>> if (ram_list.version != last_version) {
>> - reset_ram_globals();
>> + reset_ram_globals(true);
>> }
>>
>> ram_control_before_iterate(f, RAM_CONTROL_ROUND);
>> @@ -825,7 +836,15 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
>> }
>>
>> ram_control_after_iterate(f, RAM_CONTROL_FINISH);
>> - migration_end();
>> +
>> + /*
>> + * Only cleanup at the end of normal migrations
>> + * or if the MC destination failed and we got an error.
>> + * Otherwise, we are (or will soon be) in MIG_STATE_CHECKPOINTING.
>> + */
>> + if(!migrate_use_mc() || migration_has_failed(migrate_get_current())) {
>> + migration_end();
>> + }
>>
>> qemu_mutex_unlock_ramlist();
>> qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
>
>
> I haven't looked at the code in detail, but what we have here is
> esentially:
>
>
> ram_save_complete()
> {
> code not needed for mc
> common codo for migration and mc
> code not needed for mc
> }
>
> Similar code on ram_save_setup. Yes, I know that there are some locking
> issues here.
>
>
> SHould we be able do do something like
>
> __ram_save_complete()
> {
> common code
> }
>
> mc_ram_save_complete()
> {
> # Possible something else here
> __ram_save_complete()
> }
>
> rest_ram_save_complete()
> {
> code not needed for mc
> __ram_save_complete()
> code not needed for mc
> }
>
> My problem here is that current code is already quite complex and
> convoluted. At some point we are going to need to change it to
> something that is easier to understand?
Understood: So it looks like we needs some "accessor" function
pointers or something here, similar to the way Paolo suggested
that we breakout "before" and "after" iteration methods for
localhost migration and RDMA migration.
I'll cook something up and re-submit.
>
>> -enum {
>> - MIG_STATE_ERROR = -1,
>> - MIG_STATE_NONE,
>> - MIG_STATE_SETUP,
>> - MIG_STATE_CANCELLING,
>> - MIG_STATE_CANCELLED,
>> - MIG_STATE_ACTIVE,
>> - MIG_STATE_COMPLETED,
>> -};
>> -
> Here comes the code seen on the previous patch O:-)
>
>>
>> -static void migrate_set_state(MigrationState *s, int old_state, int new_state)
>> +bool migration_is_active(MigrationState *s)
>> +{
>> + return (s->state == MIG_STATE_ACTIVE) || s->state == MIG_STATE_SETUP
>> + || s->state == MIG_STATE_CHECKPOINTING;
>> +}
> The whole idea of moving MIG_STATE_* to this file was to "force" all
> other users to use accessor functions. This way we know what the others
> expect frum us.
Acknowleged - I'll work on creating (or enhancing) the accessor
functions to avoid moving the flags again.
>> - assert(s->state != MIG_STATE_ACTIVE);
>> + assert(!migration_is_active(s));
> I can understand that we want here MIG_STATE_CHECKPOINTING, but _SETUP?
> Or it is a bug on upstream?
My fault, I think there is some merge breakage here when I started
walking through the diff. Ignore this one for now..........
- Michael
next prev parent reply other threads:[~2014-04-04 3:51 UTC|newest]
Thread overview: 68+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-02-18 8:50 [Qemu-devel] [RFC PATCH v2 00/12] mc: fault tolerante through micro-checkpointing mrhines
2014-02-18 8:50 ` [Qemu-devel] [RFC PATCH v2 01/12] mc: add documentation for micro-checkpointing mrhines
2014-02-18 12:45 ` Dr. David Alan Gilbert
2014-02-19 1:40 ` Michael R. Hines
2014-02-19 11:27 ` Dr. David Alan Gilbert
2014-02-20 1:17 ` Michael R. Hines
2014-02-20 10:09 ` Dr. David Alan Gilbert
2014-02-20 11:14 ` Li Guang
2014-02-20 14:58 ` Michael R. Hines
2014-02-20 14:57 ` Michael R. Hines
2014-02-20 16:32 ` Dr. David Alan Gilbert
2014-02-21 4:54 ` Michael R. Hines
2014-02-21 9:44 ` Dr. David Alan Gilbert
2014-03-03 6:08 ` Michael R. Hines
2014-02-18 8:50 ` [Qemu-devel] [RFC PATCH v2 02/12] mc: timestamp migration_bitmap and KVM logdirty usage mrhines
2014-02-18 10:32 ` Dr. David Alan Gilbert
2014-02-19 1:42 ` Michael R. Hines
2014-03-11 21:31 ` Juan Quintela
2014-04-04 3:08 ` Michael R. Hines
2014-02-18 8:50 ` [Qemu-devel] [RFC PATCH v2 03/12] mc: introduce a 'checkpointing' status check into the VCPU states mrhines
2014-03-11 21:36 ` Juan Quintela
2014-04-04 3:11 ` Michael R. Hines
2014-03-11 21:40 ` Eric Blake
2014-04-04 3:12 ` Michael R. Hines
2014-02-18 8:50 ` [Qemu-devel] [RFC PATCH v2 04/12] mc: support custom page loading and copying mrhines
2014-02-18 8:50 ` [Qemu-devel] [RFC PATCH v2 05/12] rdma: accelerated memcpy() support and better external RDMA user interfaces mrhines
2014-02-18 8:50 ` [Qemu-devel] [RFC PATCH v2 06/12] mc: introduce state machine changes for MC mrhines
2014-02-19 1:00 ` Li Guang
2014-02-19 2:14 ` Michael R. Hines
2014-02-20 5:03 ` Michael R. Hines
2014-02-21 8:13 ` Michael R. Hines
2014-02-24 6:48 ` Li Guang
2014-02-26 2:52 ` Li Guang
2014-03-11 21:57 ` Juan Quintela
2014-04-04 3:50 ` Michael R. Hines [this message]
2014-02-18 8:50 ` [Qemu-devel] [RFC PATCH v2 07/12] mc: introduce additional QMP statistics for micro-checkpointing mrhines
2014-03-11 21:45 ` Eric Blake
2014-04-04 3:15 ` Michael R. Hines
2014-04-04 4:22 ` Eric Blake
2014-03-11 21:59 ` Juan Quintela
2014-04-04 3:55 ` Michael R. Hines
2014-02-18 8:50 ` [Qemu-devel] [RFC PATCH v2 08/12] mc: core logic mrhines
2014-02-19 1:07 ` Li Guang
2014-02-19 2:16 ` Michael R. Hines
2014-02-19 2:53 ` Li Guang
2014-02-19 4:27 ` Michael R. Hines
2014-02-18 8:50 ` [Qemu-devel] [RFC PATCH v2 09/12] mc: configure and makefile support mrhines
2014-02-18 8:50 ` [Qemu-devel] [RFC PATCH v2 10/12] mc: expose tunable parameter for checkpointing frequency mrhines
2014-03-11 21:49 ` Eric Blake
2014-03-11 22:15 ` Juan Quintela
2014-03-11 22:49 ` Eric Blake
2014-04-04 5:29 ` Michael R. Hines
2014-04-04 14:56 ` Eric Blake
2014-04-11 6:10 ` Michael R. Hines
2014-04-04 16:28 ` Dr. David Alan Gilbert
2014-04-04 16:35 ` Eric Blake
2014-04-04 3:29 ` Michael R. Hines
2014-02-18 8:50 ` [Qemu-devel] [RFC PATCH v2 11/12] mc: introduce new capabilities to control micro-checkpointing mrhines
2014-03-11 21:57 ` Eric Blake
2014-04-04 3:38 ` Michael R. Hines
2014-04-04 4:25 ` Eric Blake
2014-03-11 22:02 ` Juan Quintela
2014-03-11 22:07 ` Eric Blake
2014-04-04 3:57 ` Michael R. Hines
2014-04-04 3:56 ` Michael R. Hines
2014-02-18 8:50 ` [Qemu-devel] [RFC PATCH v2 12/12] mc: activate and use MC if requested mrhines
2014-02-18 9:28 ` [Qemu-devel] [RFC PATCH v2 00/12] mc: fault tolerante through micro-checkpointing Li Guang
2014-02-19 1:29 ` Michael R. Hines
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=533E2BF4.7020301@linux.vnet.ibm.com \
--to=mrhines@linux.vnet.ibm.com \
--cc=BIRAN@il.ibm.com \
--cc=EREZH@il.ibm.com \
--cc=GILR@il.ibm.com \
--cc=SADEKJ@il.ibm.com \
--cc=abali@us.ibm.com \
--cc=dbulkow@gmail.com \
--cc=gokul@us.ibm.com \
--cc=hinesmr@cn.ibm.com \
--cc=isaku.yamahata@gmail.com \
--cc=junqing.wang@cs2c.com.cn \
--cc=lig.fnst@cn.fujitsu.com \
--cc=mrhines@us.ibm.com \
--cc=onom@us.ibm.com \
--cc=owasserm@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=quintela@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).