All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jay Zhou <jianjay.zhou@huawei.com>
To: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: qemu-devel@nongnu.org, quintela@redhat.com, armbru@redhat.com,
	arei.gonglei@huawei.com, zhang.zhanghailiang@huawei.com,
	wangxinxin.wang@huawei.com, weidong.huang@huawei.com
Subject: Re: [Qemu-devel] [PATCH] migration: optimize the downtime
Date: Fri, 21 Jul 2017 20:23:34 +0800	[thread overview]
Message-ID: <5971F246.5080000@huawei.com> (raw)
In-Reply-To: <20170721094943.GC2133@work-vm>

Hi Dave,

On 2017/7/21 17:49, Dr. David Alan Gilbert wrote:
> * Jay Zhou (jianjay.zhou@huawei.com) wrote:
>> Qemu_savevm_state_cleanup() takes about 300ms in my ram migration tests
>> with a 8U24G vm(20G is really occupied), the main cost comes from
>> KVM_SET_USER_MEMORY_REGION ioctl when mem.memory_size = 0 in
>> kvm_set_user_memory_region(). In kmod, the main cost is
>> kvm_zap_obsolete_pages(), which traverses the active_mmu_pages list to
>> zap the unsync sptes.
>
> Hi Jay,
>    Is this actually increasing the real downtime when the guest isn't
> running, or is it just the reported time? I see that the s->downtime
> value is calculated right after where we currently call
> qemu_savevm_state_cleanup.

It actually increased the real downtime, I used the "ping" command to
test. Reason is that the source side libvirt sends qmp to qemu to query
the status of migration, which needs the BQL. qemu_savevm_state_cleanup
is done with BQL, qemu can not handle the qmp if qemu_savevm_state_cleanup
has not finished. And the source side libvirt delays about 300ms to notify
the destination side libvirt to send the "cont" command to start the vm.

I think the value of s->downtime is not accurate enough, maybe we could
move the calculation of end_time after qemu_savevm_state_cleanup has done.

>    I guess the biggest problem is that 300ms happens before we restart
> the guest on the source if a migration fails.

300ms happens even if a migration succeeds.

>> I think it can be optimized:
>> (1) source vm will be destroyed if the migration is successfully done,
>>      so the resources will be cleanuped automatically by the system
>> (2) delay the cleanup if the migration failed
>
> I don't like putting it in qmp_cont; that shouldn't have migration magic
> in it.

Yes, it is not a ideal place. :(

> I guess we could put it in migrate_fd_cleanup perhaps? It gets called on
> a bh near the end -  or could we just move it closer to the end of
> migration_thread?

I have tested putting it in migrate_fd_cleanup, but the downtime is not
optimized. So I think it is the same to move it closer to the end of
migration_thread if it holds the BQL.
Could we put it in migrate_init?

Thanks,
Jay

> However, we would need to be a bit careful of anything that needs
> cleaning up before the source restarts on failure; I'm not sure of
> the semantics of all the current things wired into save_cleanup.
>
> Dave
>
>
>> Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com>
>> ---
>>   migration/migration.c | 16 +++++++++-------
>>   qmp.c                 | 10 ++++++++++
>>   2 files changed, 19 insertions(+), 7 deletions(-)
>>
>> diff --git a/migration/migration.c b/migration/migration.c
>> index a0db40d..72832be 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -1877,6 +1877,15 @@ static void *migration_thread(void *opaque)
>>           if (qemu_file_get_error(s->to_dst_file)) {
>>               migrate_set_state(&s->state, current_active_state,
>>                                 MIGRATION_STATUS_FAILED);
>> +            /*
>> +             * The resource has been allocated by migration will be reused in
>> +             * COLO process, so don't release them.
>> +             */
>> +            if (!enable_colo) {
>> +                qemu_mutex_lock_iothread();
>> +                qemu_savevm_state_cleanup();
>> +                qemu_mutex_unlock_iothread();
>> +            }
>>               trace_migration_thread_file_err();
>>               break;
>>           }
>> @@ -1916,13 +1925,6 @@ static void *migration_thread(void *opaque)
>>       end_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
>>
>>       qemu_mutex_lock_iothread();
>> -    /*
>> -     * The resource has been allocated by migration will be reused in COLO
>> -     * process, so don't release them.
>> -     */
>> -    if (!enable_colo) {
>> -        qemu_savevm_state_cleanup();
>> -    }
>>       if (s->state == MIGRATION_STATUS_COMPLETED) {
>>           uint64_t transferred_bytes = qemu_ftell(s->to_dst_file);
>>           s->total_time = end_time - s->total_time;
>> diff --git a/qmp.c b/qmp.c
>> index b86201e..0e68eaa 100644
>> --- a/qmp.c
>> +++ b/qmp.c
>> @@ -37,6 +37,8 @@
>>   #include "qom/object_interfaces.h"
>>   #include "hw/mem/pc-dimm.h"
>>   #include "hw/acpi/acpi_dev_interface.h"
>> +#include "migration/migration.h"
>> +#include "migration/savevm.h"
>>
>>   NameInfo *qmp_query_name(Error **errp)
>>   {
>> @@ -200,6 +202,14 @@ void qmp_cont(Error **errp)
>>       if (runstate_check(RUN_STATE_INMIGRATE)) {
>>           autostart = 1;
>>       } else {
>> +        /*
>> +         * Delay the cleanup to reduce the downtime of migration.
>> +         * The resource has been allocated by migration will be reused
>> +         * in COLO process, so don't release them.
>> +         */
>> +        if (runstate_check(RUN_STATE_POSTMIGRATE) && !migrate_colo_enabled()) {
>> +            qemu_savevm_state_cleanup();
>> +        }
>>           vm_start();
>>       }
>>   }
>> --
>> 1.8.3.1
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
> .
>

  reply	other threads:[~2017-07-21 12:24 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-07-20  3:49 [Qemu-devel] [PATCH] migration: optimize the downtime Jay Zhou
2017-07-20  4:23 ` no-reply
2017-07-21  9:49 ` Dr. David Alan Gilbert
2017-07-21 12:23   ` Jay Zhou [this message]
2017-07-24 15:35     ` Dr. David Alan Gilbert
2017-07-24 16:33       ` Paolo Bonzini
2017-07-24 19:03         ` Dr. David Alan Gilbert
2017-07-24 20:38           ` Paolo Bonzini
2017-07-25 19:15             ` Dr. David Alan Gilbert
2017-07-27 14:26               ` Paolo Bonzini
2017-07-25  7:29         ` Jay Zhou
2017-07-25  8:18           ` Paolo Bonzini
2017-07-25  7:09       ` Jay Zhou
2017-07-25 10:34         ` Dr. David Alan Gilbert
2017-07-31  7:04           ` Jay Zhou
2017-07-31 13:33             ` Dr. David Alan Gilbert

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5971F246.5080000@huawei.com \
    --to=jianjay.zhou@huawei.com \
    --cc=arei.gonglei@huawei.com \
    --cc=armbru@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=wangxinxin.wang@huawei.com \
    --cc=weidong.huang@huawei.com \
    --cc=zhang.zhanghailiang@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.