From: Hailiang Zhang <zhang.zhanghailiang@huawei.com>
To: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: xuquan8@huawei.com, qemu-devel@nongnu.org, quintela@redhat.com,
eddie.dong@intel.com, lizhijian@cn.fujitsu.com,
zhangchen.fnst@cn.fujitsu.com
Subject: Re: [Qemu-devel] [PATCH 1/3] COLO: fix setting checkpoint-delay not working properly
Date: Wed, 8 Feb 2017 19:18:34 +0800 [thread overview]
Message-ID: <589AFE8A.2000304@huawei.com> (raw)
In-Reply-To: <20170208103820.GB2341@work-vm>
On 2017/2/8 18:38, Dr. David Alan Gilbert wrote:
> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>> If we set checkpoint-delay through command 'migrate-set-parameters',
>> It will not take effect until we finish last sleep chekpoint-delay,
>> That's will be offensive espeically when we want to change its value
>> from an extreme big one to a proper value.
>>
>> Fix it by using timer to realize checkpoint-delay.
>
> Yes, I think this works; you only kick the timer in COLO state,
> and you create the timer before going into COLO state and clean it up
> after we leave, so it feels safe.
>
> Are you also going to kick this semaphore when doing a failover to
> cause this thread to exit quickly?
>
Ha, good suggestion! I think it will work.
Accepted, thanks very much.
>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>
> Dave
>> ---
>> include/migration/colo.h | 2 ++
>> include/migration/migration.h | 5 +++++
>> migration/colo.c | 33 +++++++++++++++++++++++----------
>> migration/migration.c | 3 +++
>> 4 files changed, 33 insertions(+), 10 deletions(-)
>>
>> diff --git a/include/migration/colo.h b/include/migration/colo.h
>> index e32eef4..2bbff9e 100644
>> --- a/include/migration/colo.h
>> +++ b/include/migration/colo.h
>> @@ -35,4 +35,6 @@ COLOMode get_colo_mode(void);
>>
>> /* failover */
>> void colo_do_failover(MigrationState *s);
>> +
>> +void colo_checkpoint_notify(void *opaque);
>> #endif
>> diff --git a/include/migration/migration.h b/include/migration/migration.h
>> index c309d23..487ac1e 100644
>> --- a/include/migration/migration.h
>> +++ b/include/migration/migration.h
>> @@ -183,6 +183,11 @@ struct MigrationState
>> /* The RAMBlock used in the last src_page_request */
>> RAMBlock *last_req_rb;
>>
>> + /* The semaphore is used to notify COLO thread to do checkpoint */
>> + QemuSemaphore colo_checkpoint_sem;
>> + int64_t colo_checkpoint_time;
>> + QEMUTimer *colo_delay_timer;
>> +
>> /* The last error that occurred */
>> Error *error;
>> };
>> diff --git a/migration/colo.c b/migration/colo.c
>> index 93c85c5..08b2e46 100644
>> --- a/migration/colo.c
>> +++ b/migration/colo.c
>> @@ -302,7 +302,7 @@ static void colo_process_checkpoint(MigrationState *s)
>> {
>> QIOChannelBuffer *bioc;
>> QEMUFile *fb = NULL;
>> - int64_t current_time, checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>> + int64_t current_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>> Error *local_err = NULL;
>> int ret;
>>
>> @@ -332,26 +332,21 @@ static void colo_process_checkpoint(MigrationState *s)
>> qemu_mutex_unlock_iothread();
>> trace_colo_vm_state_change("stop", "run");
>>
>> + timer_mod(s->colo_delay_timer,
>> + current_time + s->parameters.x_checkpoint_delay);
>> +
>> while (s->state == MIGRATION_STATUS_COLO) {
>> if (failover_get_state() != FAILOVER_STATUS_NONE) {
>> error_report("failover request");
>> goto out;
>> }
>>
>> - current_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>> - if (current_time - checkpoint_time <
>> - s->parameters.x_checkpoint_delay) {
>> - int64_t delay_ms;
>> + qemu_sem_wait(&s->colo_checkpoint_sem);
>>
>> - delay_ms = s->parameters.x_checkpoint_delay -
>> - (current_time - checkpoint_time);
>> - g_usleep(delay_ms * 1000);
>> - }
>> ret = colo_do_checkpoint_transaction(s, bioc, fb);
>> if (ret < 0) {
>> goto out;
>> }
>> - checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>> }
>>
>> out:
>> @@ -364,14 +359,32 @@ out:
>> qemu_fclose(fb);
>> }
>>
>> + timer_del(s->colo_delay_timer);
>> +
>> if (s->rp_state.from_dst_file) {
>> qemu_fclose(s->rp_state.from_dst_file);
>> }
>> }
>>
>> +void colo_checkpoint_notify(void *opaque)
>> +{
>> + MigrationState *s = opaque;
>> + int64_t next_notify_time;
>> +
>> + qemu_sem_post(&s->colo_checkpoint_sem);
>> + s->colo_checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>> + next_notify_time = s->colo_checkpoint_time +
>> + s->parameters.x_checkpoint_delay;
>> + timer_mod(s->colo_delay_timer, next_notify_time);
>> +}
>> +
>> void migrate_start_colo_process(MigrationState *s)
>> {
>> qemu_mutex_unlock_iothread();
>> + qemu_sem_init(&s->colo_checkpoint_sem, 0);
>> + s->colo_delay_timer = timer_new_ms(QEMU_CLOCK_HOST,
>> + colo_checkpoint_notify, s);
>> +
>> migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
>> MIGRATION_STATUS_COLO);
>> colo_process_checkpoint(s);
>> diff --git a/migration/migration.c b/migration/migration.c
>> index f498ab8..a4861da 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -895,6 +895,9 @@ void qmp_migrate_set_parameters(MigrationParameters *params, Error **errp)
>>
>> if (params->has_x_checkpoint_delay) {
>> s->parameters.x_checkpoint_delay = params->x_checkpoint_delay;
>> + if (migration_in_colo_state()) {
>> + colo_checkpoint_notify(s);
>> + }
>> }
>> }
>>
>> --
>> 1.8.3.1
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
> .
>
next prev parent reply other threads:[~2017-02-08 11:21 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-01-17 12:57 [Qemu-devel] [PATCH 0/3] COLO: fix some bugs zhanghailiang
2017-01-17 12:57 ` [Qemu-devel] [PATCH 1/3] COLO: fix setting checkpoint-delay not working properly zhanghailiang
2017-02-08 10:38 ` Dr. David Alan Gilbert
2017-02-08 11:18 ` Hailiang Zhang [this message]
2017-01-17 12:57 ` [Qemu-devel] [PATCH 2/3] COLO: Shutdown related socket fd while do failover zhanghailiang
2017-01-18 11:01 ` Dr. David Alan Gilbert
2017-02-08 11:14 ` Hailiang Zhang
2017-02-08 19:53 ` Dr. David Alan Gilbert
2017-02-13 4:13 ` Hailiang Zhang
2017-01-17 12:57 ` [Qemu-devel] [PATCH 3/3] COLO: Don't process failover request while loading VM's state zhanghailiang
2017-01-17 18:24 ` Eric Blake
2017-01-18 8:19 ` Hailiang Zhang
2017-02-10 15:44 ` [Qemu-devel] [PATCH 0/3] COLO: fix some bugs Dr. David Alan Gilbert
2017-02-13 8:46 ` Hailiang Zhang
2017-02-13 10:17 ` Dr. David Alan Gilbert
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=589AFE8A.2000304@huawei.com \
--to=zhang.zhanghailiang@huawei.com \
--cc=dgilbert@redhat.com \
--cc=eddie.dong@intel.com \
--cc=lizhijian@cn.fujitsu.com \
--cc=qemu-devel@nongnu.org \
--cc=quintela@redhat.com \
--cc=xuquan8@huawei.com \
--cc=zhangchen.fnst@cn.fujitsu.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).