Re: vm performance degradation after kvm live migration or save-restore with EPT enabled

From: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
To: "Zhanghaoyu (A)" <haoyu.zhang@huawei.com>
Cc: KVM <kvm@vger.kernel.org>, qemu-devel <qemu-devel@nongnu.org>,
	"cloudfantom@gmail.com" <cloudfantom@gmail.com>,
	"mpetersen@peak6.com" <mpetersen@peak6.com>,
	"Shouta.Uehara@jp.yokogawa.com" <Shouta.Uehara@jp.yokogawa.com>,
	"paolo.bonzini@gmail.com" <paolo.bonzini@gmail.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Luonengjun <luonengjun@huawei.com>,
	Zanghongyong <zanghongyong@huawei.com>,
	Hanweidong <hanweidong@huawei.com>,
	"Huangweidong (C)" <weidong.huang@huawei.com>
Subject: Re: vm performance degradation after kvm live migration or save-restore with ETP enabled
Date: Thu, 11 Jul 2013 18:39:39 +0800
Message-ID: <51DE8B6B.7010802@linux.vnet.ibm.com>
In-Reply-To: <D3E216785288A145B7BC975F83A2ED103FEE9853@szxeml556-mbx.china.huawei.com>

Hi,

Could you please test this patch?

From 48df7db2ec2721e35d024a8d9850dbb34b557c1c Mon Sep 17 00:00:00 2001
From: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Date: Thu, 6 Sep 2012 16:56:01 +0800
Subject: [PATCH 10/11] using huge page on fast page fault path

---
 arch/x86/kvm/mmu.c |   27 ++++++++++++++++++++-------
 1 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 6945ef4..7d177c7 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2663,6 +2663,13 @@ static int kvm_handle_bad_page(struct kvm_vcpu *vcpu, gfn_t gfn, pfn_t pfn)
 	return -EFAULT;
 }

+static bool pfn_can_adjust(pfn_t pfn, int level)
+{
+	return !is_error_pfn(pfn) && !kvm_is_mmio_pfn(pfn) &&
+		   level == PT_PAGE_TABLE_LEVEL &&
+		      PageTransCompound(pfn_to_page(pfn));
+}
+
 static void transparent_hugepage_adjust(struct kvm_vcpu *vcpu,
 					gfn_t *gfnp, pfn_t *pfnp, int *levelp)
 {
@@ -2676,10 +2683,8 @@ static void transparent_hugepage_adjust(struct kvm_vcpu *vcpu,
 	 * PT_PAGE_TABLE_LEVEL and there would be no adjustment done
 	 * here.
 	 */
-	if (!is_error_pfn(pfn) && !kvm_is_mmio_pfn(pfn) &&
-	    level == PT_PAGE_TABLE_LEVEL &&
-	    PageTransCompound(pfn_to_page(pfn)) &&
-	    !has_wrprotected_page(vcpu->kvm, gfn, PT_DIRECTORY_LEVEL)) {
+	if (pfn_can_adjust(pfn, level) &&
+	      !has_wrprotected_page(vcpu->kvm, gfn, PT_DIRECTORY_LEVEL)) {
 		unsigned long mask;
 		/*
 		 * mmu_notifier_retry was successful and we hold the
@@ -2768,7 +2773,7 @@ fast_pf_fix_direct_spte(struct kvm_vcpu *vcpu, u64 *sptep, u64 spte)
  * - false: let the real page fault path to fix it.
  */
 static bool fast_page_fault(struct kvm_vcpu *vcpu, gva_t gva, int level,
-			    u32 error_code)
+			    u32 error_code, bool force_pt_level)
 {
 	struct kvm_shadow_walk_iterator iterator;
 	bool ret = false;
@@ -2795,6 +2800,14 @@ static bool fast_page_fault(struct kvm_vcpu *vcpu, gva_t gva, int level,
 		goto exit;

 	/*
+	 * Let the real page fault path change the mapping if large
+	 * mapping is allowed, for example, the memslot dirty log is
+	 * disabled.
+	 */
+	if (!force_pt_level && pfn_can_adjust(spte_to_pfn(spte), level))
+		goto exit;
+
+	/*
 	 * Check if it is a spurious fault caused by TLB lazily flushed.
 	 *
 	 * Need not check the access of upper level table entries since
@@ -2854,7 +2867,7 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, u32 error_code,
 	} else
 		level = PT_PAGE_TABLE_LEVEL;

-	if (fast_page_fault(vcpu, v, level, error_code))
+	if (fast_page_fault(vcpu, v, level, error_code, force_pt_level))
 		return 0;

 	mmu_seq = vcpu->kvm->mmu_notifier_seq;
@@ -3323,7 +3336,7 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa, u32 error_code,
 	} else
 		level = PT_PAGE_TABLE_LEVEL;

-	if (fast_page_fault(vcpu, gpa, level, error_code))
+	if (fast_page_fault(vcpu, gpa, level, error_code, force_pt_level))
 		return 0;

 	mmu_seq = vcpu->kvm->mmu_notifier_seq;
-- 
1.7.7.6
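
For readers who want to reason about the new check outside the kernel, here is a minimal userspace sketch of the decision the patch adds to fast_page_fault(). It is an illustration under stated assumptions, not kernel code: struct toy_pfn, the TOY_* constants and both toy_* functions are hypothetical stand-ins for is_error_pfn(), kvm_is_mmio_pfn(), PageTransCompound() and the PT_* levels used in the diff.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-pfn state the patch inspects. */
struct toy_pfn {
	bool is_error;       /* models is_error_pfn(pfn) */
	bool is_mmio;        /* models kvm_is_mmio_pfn(pfn) */
	bool trans_compound; /* models PageTransCompound(pfn_to_page(pfn)) */
};

/* Hypothetical level constants, named after the PT_* levels in the diff. */
enum { TOY_PT_PAGE_TABLE_LEVEL = 1, TOY_PT_DIRECTORY_LEVEL = 2 };

/* Models pfn_can_adjust(): a 4K-level mapping backed by a huge page. */
static bool toy_pfn_can_adjust(const struct toy_pfn *pfn, int level)
{
	return !pfn->is_error && !pfn->is_mmio &&
	       level == TOY_PT_PAGE_TABLE_LEVEL &&
	       pfn->trans_compound;
}

/*
 * Models the check added to fast_page_fault(): returns true when the
 * fast path should give up ("goto exit" in the patch) so that the real
 * page fault path can install a large mapping instead.
 */
static bool toy_fast_path_defers(bool force_pt_level,
				 const struct toy_pfn *pfn, int level)
{
	return !force_pt_level && toy_pfn_can_adjust(pfn, level);
}
```

In this model the fast path defers only when large mappings are allowed (force_pt_level is false) and the faulting 4K mapping sits on a transparent huge page; with force_pt_level set, for example while dirty logging is active, the fast path proceeds as before.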


On 07/11/2013 05:36 PM, Zhanghaoyu (A) wrote:
> hi all,
> 
> I hit a problem similar to the ones reported below while running live migration and save-restore tests on the KVM platform (qemu: 1.4.0, host: suse11sp2, guest: suse11sp2), with a telecommunication software suite running in the guest:
> https://lists.gnu.org/archive/html/qemu-devel/2013-05/msg00098.html
> http://comments.gmane.org/gmane.comp.emulators.kvm.devel/102506
> http://thread.gmane.org/gmane.comp.emulators.kvm.devel/100592
> https://bugzilla.kernel.org/show_bug.cgi?id=58771
> 
> After live migration or virsh restore [savefile], one process's CPU utilization went up by about 30%, which degraded that process's throughput.
> oprofile report for this process in the guest,
> pre live migration:
> CPU: CPU with timer interrupt, speed 0 MHz (estimated)
> Profiling through timer interrupt
> samples  %        app name                 symbol name
> 248      12.3016  no-vmlinux               (no symbols)
> 78        3.8690  libc.so.6                memset
> 68        3.3730  libc.so.6                memcpy
> 30        1.4881  cscf.scu                 SipMmBufMemAlloc
> 29        1.4385  libpthread.so.0          pthread_mutex_lock
> 26        1.2897  cscf.scu                 SipApiGetNextIe
> 25        1.2401  cscf.scu                 DBFI_DATA_Search
> 20        0.9921  libpthread.so.0          __pthread_mutex_unlock_usercnt
> 16        0.7937  cscf.scu                 DLM_FreeSlice
> 16        0.7937  cscf.scu                 receivemessage
> 15        0.7440  cscf.scu                 SipSmCopyString
> 14        0.6944  cscf.scu                 DLM_AllocSlice
> 
> post live migration:
> CPU: CPU with timer interrupt, speed 0 MHz (estimated)
> Profiling through timer interrupt
> samples  %        app name                 symbol name
> 1586     42.2370  libc.so.6                memcpy
> 271       7.2170  no-vmlinux               (no symbols)
> 83        2.2104  libc.so.6                memset
> 41        1.0919  libpthread.so.0          __pthread_mutex_unlock_usercnt
> 35        0.9321  cscf.scu                 SipMmBufMemAlloc
> 29        0.7723  cscf.scu                 DLM_AllocSlice
> 28        0.7457  libpthread.so.0          pthread_mutex_lock
> 23        0.6125  cscf.scu                 SipApiGetNextIe
> 17        0.4527  cscf.scu                 SipSmCopyString
> 16        0.4261  cscf.scu                 receivemessage
> 15        0.3995  cscf.scu                 SipcMsgStatHandle
> 14        0.3728  cscf.scu                 Urilex
> 12        0.3196  cscf.scu                 DBFI_DATA_Search
> 12        0.3196  cscf.scu                 SipDsmGetHdrBitValInner
> 12        0.3196  cscf.scu                 SipSmGetDataFromRefString
> 
> So memcpy costs many more CPU cycles after live migration. When I restarted the process, the problem disappeared. Save-restore shows a similar problem.
> 
> perf stat output for the vcpu thread on the host,
> pre live migration:
> Performance counter stats for thread id '21082':
> 
>                  0 page-faults
>                  0 minor-faults
>                  0 major-faults
>              31616 cs
>                506 migrations
>                  0 alignment-faults
>                  0 emulation-faults
>         5075957539 L1-dcache-loads                                              [21.32%]
>          324685106 L1-dcache-load-misses     #    6.40% of all L1-dcache hits   [21.85%]
>         3681777120 L1-dcache-stores                                             [21.65%]
>           65251823 L1-dcache-store-misses    #    1.77%                         [22.78%]
>                  0 L1-dcache-prefetches                                         [22.84%]
>                  0 L1-dcache-prefetch-misses                                    [22.32%]
>         9321652613 L1-icache-loads                                              [22.60%]
>         1353418869 L1-icache-load-misses     #   14.52% of all L1-icache hits   [21.92%]
>          169126969 LLC-loads                                                    [21.87%]
>           12583605 LLC-load-misses           #    7.44% of all LL-cache hits    [ 5.84%]
>          132853447 LLC-stores                                                   [ 6.61%]
>           10601171 LLC-store-misses          #     7.9%                         [ 5.01%]
>           25309497 LLC-prefetches            #      30%                         [ 4.96%]
>            7723198 LLC-prefetch-misses                                          [ 6.04%]
>         4954075817 dTLB-loads                                                   [11.56%]
>           26753106 dTLB-load-misses          #    0.54% of all dTLB cache hits  [16.80%]
>         3553702874 dTLB-stores                                                  [22.37%]
>            4720313 dTLB-store-misses         #    0.13%                         [21.46%]
>      <not counted> dTLB-prefetches
>      <not counted> dTLB-prefetch-misses
> 
>       60.000920666 seconds time elapsed
> 
> post live migration:
> Performance counter stats for thread id '1579':
> 
>                  0 page-faults                                                  [100.00%]
>                  0 minor-faults                                                 [100.00%]
>                  0 major-faults                                                 [100.00%]
>              34979 cs                                                           [100.00%]
>                441 migrations                                                   [100.00%]
>                  0 alignment-faults                                             [100.00%]
>                  0 emulation-faults
>         6903585501 L1-dcache-loads                                              [22.06%]
>          525939560 L1-dcache-load-misses     #    7.62% of all L1-dcache hits   [21.97%]
>         5042552685 L1-dcache-stores                                             [22.20%]
>           94493742 L1-dcache-store-misses    #     1.8%                         [22.06%]
>                  0 L1-dcache-prefetches                                         [22.39%]
>                  0 L1-dcache-prefetch-misses                                    [22.47%]
>        13022953030 L1-icache-loads                                              [22.25%]
>         1957161101 L1-icache-load-misses     #   15.03% of all L1-icache hits   [22.47%]
>          348479792 LLC-loads                                                    [22.27%]
>           80662778 LLC-load-misses           #   23.15% of all LL-cache hits    [ 5.64%]
>          198745620 LLC-stores                                                   [ 5.63%]
>           14236497 LLC-store-misses          #     7.1%                         [ 5.41%]
>           20757435 LLC-prefetches                                               [ 5.42%]
>            5361819 LLC-prefetch-misses       #      25%                         [ 5.69%]
>         7235715124 dTLB-loads                                                   [11.26%]
>           49895163 dTLB-load-misses          #    0.69% of all dTLB cache hits  [16.96%]
>         5168276218 dTLB-stores                                                  [22.44%]
>            6765983 dTLB-store-misses         #    0.13%                         [22.24%]
>      <not counted> dTLB-prefetches
>      <not counted> dTLB-prefetch-misses
> 
> The "LLC-load-misses" rate went up by about 16 percentage points (from 7.44% to 23.15%). After I restarted the process in the guest, the perf data went back to normal:
> Performance counter stats for thread id '1579':
> 
>                  0 page-faults                                                  [100.00%]
>                  0 minor-faults                                                 [100.00%]
>                  0 major-faults                                                 [100.00%]
>              30594 cs                                                           [100.00%]
>                327 migrations                                                   [100.00%]
>                  0 alignment-faults                                             [100.00%]
>                  0 emulation-faults
>         7707091948 L1-dcache-loads                                              [22.10%]
>          559829176 L1-dcache-load-misses     #    7.26% of all L1-dcache hits   [22.28%]
>         5976654983 L1-dcache-stores                                             [23.22%]
>          160436114 L1-dcache-store-misses                                       [22.80%]
>                  0 L1-dcache-prefetches                                         [22.51%]
>                  0 L1-dcache-prefetch-misses                                    [22.53%]
>        13798415672 L1-icache-loads                                              [22.28%]
>         2017724676 L1-icache-load-misses     #   14.62% of all L1-icache hits   [22.49%]
>          254598008 LLC-loads                                                    [22.86%]
>           16035378 LLC-load-misses           #    6.30% of all LL-cache hits    [ 5.36%]
>          307019606 LLC-stores                                                   [ 5.60%]
>           13665033 LLC-store-misses                                             [ 5.43%]
>           17715554 LLC-prefetches                                               [ 5.57%]
>            4187006 LLC-prefetch-misses                                          [ 5.44%]
>         7811502895 dTLB-loads                                                   [10.72%]
>           40547330 dTLB-load-misses          #    0.52% of all dTLB cache hits  [16.31%]
>         6144202516 dTLB-stores                                                  [21.58%]
>            6313363 dTLB-store-misses                                            [21.91%]
>      <not counted> dTLB-prefetches
>      <not counted> dTLB-prefetch-misses
> 
>       60.000812523 seconds time elapsed
> 
> With EPT disabled, the problem goes away.
> 
> I suspect that the KVM hypervisor is involved in this problem.
> Based on that suspicion, I want to find two adjacent kvm-kmod versions, one that triggers the problem and one that does not (e.g. 2.6.39 and 3.0-rc1),
> then analyze the differences between the two versions, or bisect the patches between them, to find the key patches.
> 
> Any better ideas?
> 
> Thanks,
> Zhang Haoyu
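
As a cross-check on the report above, the LLC load miss percentages can be recomputed from the raw counters. The sketch below is only illustrative arithmetic: miss_rate_pct(), struct llc_sample and llc_samples are hypothetical names, and the counter values are copied from the three perf outputs quoted above. Several of those counters carry small bracketed coverage percentages, so the ratios should probably be read as estimates rather than exact counts.

```c
#include <stdint.h>

/* One LLC-loads / LLC-load-misses pair from a perf output above. */
struct llc_sample {
	const char *label;
	uint64_t loads;       /* LLC-loads */
	uint64_t load_misses; /* LLC-load-misses */
};

/* Counter values copied from the quoted report. */
static const struct llc_sample llc_samples[] = {
	{ "pre live migration",    169126969ULL, 12583605ULL },
	{ "post live migration",   348479792ULL, 80662778ULL },
	{ "after process restart", 254598008ULL, 16035378ULL },
};

/* Miss rate in percent; returns 0.0 when there were no accesses. */
static double miss_rate_pct(uint64_t misses, uint64_t accesses)
{
	if (accesses == 0)
		return 0.0;
	return 100.0 * (double)misses / (double)accesses;
}
```

Calling miss_rate_pct() on the three samples gives roughly 7.44%, 23.15% and 6.30%, matching the figures perf printed. The jump after migration is therefore about 16 percentage points, or roughly a threefold increase in the miss rate.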
