Re: [Qemu-devel] About QEMU BQL and dirty log switch in Migration


From: Jay Zhou <jianjay.zhou@huawei.com>
To: Wanpeng Li <kernellwp@gmail.com>, Paolo Bonzini <pbonzini@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	yanghongyang <yanghongyang@huawei.com>,
	Juan Quintela <quintela@redhat.com>,
	"wangxin (U)" <wangxinxin.wang@huawei.com>,
	"qemu-devel@nongnu.org Developers" <qemu-devel@nongnu.org>,
	"Gonglei (Arei)" <arei.gonglei@huawei.com>,
	Huangzhichao <huangzhichao@huawei.com>,
	Zhanghailiang <zhang.zhanghailiang@huawei.com>,
	"Herongguang (Stephen)" <herongguang.he@huawei.com>,
	Xiao Guangrong <xiaoguangrong@tencent.com>,
	"Huangweidong (C)" <weidong.huang@huawei.com>
Subject: Re: [Qemu-devel] About QEMU BQL and dirty log switch in Migration
Date: Fri, 19 May 2017 16:09:49 +0800
Message-ID: <591EA84D.1030800@huawei.com>
In-Reply-To: <CANRm+CwbTaoxq35zyfNF1BP3d5GqRmtq0sY+aRKRaOsFwGB1Mw@mail.gmail.com>

Hi Paolo and Wanpeng,

On 2017/5/17 16:38, Wanpeng Li wrote:
> 2017-05-17 15:43 GMT+08:00 Paolo Bonzini <pbonzini@redhat.com>:
>>> Recently, I have tested the performance before migration and after migration failure
>>> using spec cpu2006 https://www.spec.org/cpu2006/, which is a standard performance
>>> evaluation tool.
>>>
>>> These are the steps:
>>> ======
>>>   (1) the version of kmod is 4.4.11(with slightly modified) and the version of
>>>   qemu is 2.6.0
>>>      (with slightly modified), the kmod is applied with the following patch
>>>
>>> diff --git a/source/x86/x86.c b/source/x86/x86.c
>>> index 054a7d3..75a4bb3 100644
>>> --- a/source/x86/x86.c
>>> +++ b/source/x86/x86.c
>>> @@ -8550,8 +8550,10 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
>>>           */
>>>          if ((change != KVM_MR_DELETE) &&
>>>                  (old->flags & KVM_MEM_LOG_DIRTY_PAGES) &&
>>> -               !(new->flags & KVM_MEM_LOG_DIRTY_PAGES))
>>> -               kvm_mmu_zap_collapsible_sptes(kvm, new);
>>> +               !(new->flags & KVM_MEM_LOG_DIRTY_PAGES)) {
>>> +               printk(KERN_ERR "zj make KVM_REQ_MMU_RELOAD request\n");
>>> +               kvm_make_all_cpus_request(kvm, KVM_REQ_MMU_RELOAD);
>>> +       }
>>>
>>>          /*
>>>           * Set up write protection and/or dirty logging for the new slot.
>>
>> Try these modifications to the setup:
>>
>> 1) set up 1G hugetlbfs hugepages and use those for the guest's memory
>>
>> 2) test both without and with the above patch.
>>

To rule out effects from random memory allocation, I reran the test cases:
(1) setup: start a 4U10G VM with its memory preallocated; each vcpu is pinned
to its own pcpu, and all the resources (memory and pcpus) allocated to the VM
come from NUMA node 0
(2) sequence: first, I run 429.mcf of spec cpu2006 before migration and record
the result. Then I force a migration failure. Finally, I run the test case
again and record a second result.
(3) results:
Host hugepages           THP on(2M)  THP on(2M)   THP on(2M)   THP on(2M)
Patch                    patch1      patch2       patch3       -
Before migration         No          No           No           Yes
After migration failed   Yes         Yes          Yes          No
Largepages               67->1862    62->1890     95->1865     1926
score of 429.mcf         189         188          188          189

Host hugepages           1G hugepages  1G hugepages  1G hugepages  1G hugepages
Patch                    patch1        patch2        patch3        -
Before migration         No            No            No            Yes
After migration failed   Yes           Yes           Yes           No
Largepages               21            21            26            39
score of 429.mcf         188           188           186           188

Notes:
patch1  means with the "lazy collapse small sptes into large sptes" code
patch2  means with the "lazy collapse small sptes into large sptes" code
         commented out
patch3  means using kvm_make_all_cpus_request(kvm, KVM_REQ_MMU_RELOAD)
         instead of kvm_mmu_zap_collapsible_sptes(kvm, new)

"Largepages" means the value of /sys/kernel/debug/kvm/largepages

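For anyone who wants to repeat the Largepages measurement, here is a minimal
sketch of how the counter could be sampled over time. It assumes that debugfs
is mounted at /sys/kernel/debug, that the file holds a single integer, and
that the script runs as root; read_counter() and sample() are hypothetical
helper names, not part of QEMU or KVM.

```python
import time
from pathlib import Path

# Assumption: debugfs is mounted at /sys/kernel/debug and the kernel exposes
# the counter under this name (newer kernels may name it differently).
LARGEPAGES = Path("/sys/kernel/debug/kvm/largepages")


def read_counter(path=LARGEPAGES):
    """Return the integer stored in a single-value debugfs file."""
    return int(Path(path).read_text().strip())


def sample(path=LARGEPAGES, count=5, interval=1.0):
    """Collect `count` readings of the counter, `interval` seconds apart."""
    readings = []
    for i in range(count):
        readings.append(read_counter(path))
        if i < count - 1:
            time.sleep(interval)  # no sleep after the last reading
    return readings
```

Calling sample(count=60, interval=5.0) right after the migration failure and
while 429.mcf is running would show whether the value climbs back gradually.
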
> In addition, we can compare /sys/kernel/debug/kvm/largepages w/ and
> w/o the patch. IIRC, /sys/kernel/debug/kvm/largepages will drop during
> live migration, it will keep a small value if live migration fails and
> w/o "lazy collapse small sptes into large sptes" codes, however, it
> will increase gradually if w/ the "lazy collapse small sptes into
> large sptes" codes.
>

No. Without the "lazy collapse small sptes into large sptes" code,
/sys/kernel/debug/kvm/largepages does drop during live migration, but it
still increases gradually after live migration fails; see the results
above. I printed the backtrace at the point where it increases after the
migration failure:

[139574.369098]  [<ffffffff81644a7f>] dump_stack+0x19/0x1b
[139574.369111]  [<ffffffffa02c3af6>] mmu_set_spte+0x2f6/0x310 [kvm]
[139574.369122]  [<ffffffffa02c4f7e>] __direct_map.isra.109+0x1de/0x250 [kvm]
[139574.369133]  [<ffffffffa02c8a76>] tdp_page_fault+0x246/0x280 [kvm]
[139574.369144]  [<ffffffffa02bf4e4>] kvm_mmu_page_fault+0x24/0x130 [kvm]
[139574.369148]  [<ffffffffa07c8116>] handle_ept_violation+0x96/0x170 [kvm_intel]
[139574.369153]  [<ffffffffa07cf949>] vmx_handle_exit+0x299/0xbf0 [kvm_intel]
[139574.369157]  [<ffffffff816559f0>] ? uv_bau_message_intr1+0x80/0x80
[139574.369161]  [<ffffffffa07cd5e0>] ? vmx_inject_irq+0xf0/0xf0 [kvm_intel]
[139574.369172]  [<ffffffffa02b35cd>] vcpu_enter_guest+0x76d/0x1160 [kvm]
[139574.369184]  [<ffffffffa02d9285>] ? kvm_apic_local_deliver+0x65/0x70 [kvm]
[139574.369196]  [<ffffffffa02bb125>] kvm_arch_vcpu_ioctl_run+0xd5/0x440 [kvm]
[139574.369205]  [<ffffffffa02a2b11>] kvm_vcpu_ioctl+0x2b1/0x640 [kvm]
[139574.369209]  [<ffffffff810e7852>] ? do_futex+0x122/0x5b0
[139574.369212]  [<ffffffff811fd9d5>] do_vfs_ioctl+0x2e5/0x4c0
[139574.369223]  [<ffffffffa02b0cf5>] ? kvm_on_user_return+0x75/0xb0 [kvm]
[139574.369225]  [<ffffffff811fdc51>] SyS_ioctl+0xa1/0xc0
[139574.369229]  [<ffffffff81654e09>] system_call_fastpath+0x16/0x1b
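
To put the Largepages figures from the THP results table in perspective, the
sketch below computes how much of the unmigrated baseline each variant gets
back. It assumes that "a->b" in the table means the counter right after the
failed migration and the counter after rerunning 429.mcf;
recovered_percent() is a hypothetical helper name.

```python
# Largepages readings copied from the "THP on(2M)" results table.
# Assumption: "a->b" is the value right after the failed migration (a)
# and the value after rerunning 429.mcf (b).
BASELINE = 1926  # run without any migration

AFTER_FAILURE = {
    "patch1": (67, 1862),
    "patch2": (62, 1890),
    "patch3": (95, 1865),
}


def recovered_percent(final, baseline=BASELINE):
    """Share of the baseline large-page count that was reached, in percent."""
    return 100.0 * final / baseline


for name, (start, final) in AFTER_FAILURE.items():
    print(f"{name}: {start} -> {final} "
          f"({recovered_percent(final):.1f}% of baseline)")
```

All three variants end up at roughly 97% to 98% of the baseline, which
matches the nearly identical 429.mcf scores.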

Any suggestions would be appreciated. Thanks!


Regards,
Jay Zhou
