public inbox for kvm@vger.kernel.org
* [kvm-unit-tests Patch 0/5] Fix PMU test failures on Sapphire Rapids
@ 2023-10-24  7:57 Dapeng Mi
  2023-10-24  7:57 ` [kvm-unit-tests Patch 1/5] x86: pmu: Remove duplicate code in pmu_init() Dapeng Mi
                   ` (5 more replies)
  0 siblings, 6 replies; 19+ messages in thread
From: Dapeng Mi @ 2023-10-24  7:57 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Zhenyu Wang, Zhang Xiong, Jim Mattson,
	Mingwei Zhang, Like Xu, Dapeng Mi, Dapeng Mi

When running the pmu test on Intel Sapphire Rapids, we encountered
several failures, such as the "llc misses", "all counters" and
"fixed counter 3" failures.

Intel Sapphire Rapids introduces a new fixed counter 3, raises the total
number of PMU counters (GP plus fixed) to 12, and also optimizes the
cache subsystem. All these changes invalidate the original assumptions
of the pmu test on Sapphire Rapids. Patches 2-4 fix these failures,
patch 1 removes duplicate code, and patch 5 adds asserts to ensure the
predefined fixed events match the HW fixed counters.

Dapeng Mi (4):
  x86: pmu: Change the minimum value of llc_misses event to 0
  x86: pmu: Enlarge cnt array length to 64 in check_counters_many()
  x86: pmu: Support validation for Intel PMU fixed counter 3
  x86: pmu: Add asserts to warn inconsistent fixed events and counters

Xiong Zhang (1):
  x86: pmu: Remove duplicate code in pmu_init()

 lib/x86/pmu.c |  5 -----
 x86/pmu.c     | 17 ++++++++++++-----
 2 files changed, 12 insertions(+), 10 deletions(-)


base-commit: bfe5d7d0e14c8199d134df84d6ae8487a9772c48
-- 
2.34.1



* [kvm-unit-tests Patch 1/5] x86: pmu: Remove duplicate code in pmu_init()
  2023-10-24  7:57 [kvm-unit-tests Patch 0/5] Fix PMU test failures on Sapphire Rapids Dapeng Mi
@ 2023-10-24  7:57 ` Dapeng Mi
  2023-10-24  7:57 ` [kvm-unit-tests Patch 2/5] x86: pmu: Change the minimum value of llc_misses event to 0 Dapeng Mi
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 19+ messages in thread
From: Dapeng Mi @ 2023-10-24  7:57 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Zhenyu Wang, Zhang Xiong, Jim Mattson,
	Mingwei Zhang, Like Xu, Dapeng Mi, Dapeng Mi

From: Xiong Zhang <xiong.y.zhang@intel.com>

pmu_init() contains two identical blocks of code; remove the duplicate
block.

Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
 lib/x86/pmu.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/lib/x86/pmu.c b/lib/x86/pmu.c
index 0f2afd650bc9..d06e94553024 100644
--- a/lib/x86/pmu.c
+++ b/lib/x86/pmu.c
@@ -16,11 +16,6 @@ void pmu_init(void)
 			pmu.fixed_counter_width = (cpuid_10.d >> 5) & 0xff;
 		}
 
-		if (pmu.version > 1) {
-			pmu.nr_fixed_counters = cpuid_10.d & 0x1f;
-			pmu.fixed_counter_width = (cpuid_10.d >> 5) & 0xff;
-		}
-
 		pmu.nr_gp_counters = (cpuid_10.a >> 8) & 0xff;
 		pmu.gp_counter_width = (cpuid_10.a >> 16) & 0xff;
 		pmu.gp_counter_mask_length = (cpuid_10.a >> 24) & 0xff;
-- 
2.34.1



* [kvm-unit-tests Patch 2/5] x86: pmu: Change the minimum value of llc_misses event to 0
  2023-10-24  7:57 [kvm-unit-tests Patch 0/5] Fix PMU test failures on Sapphire Rapids Dapeng Mi
  2023-10-24  7:57 ` [kvm-unit-tests Patch 1/5] x86: pmu: Remove duplicate code in pmu_init() Dapeng Mi
@ 2023-10-24  7:57 ` Dapeng Mi
  2023-10-24 13:03   ` Jim Mattson
  2023-10-24  7:57 ` [kvm-unit-tests Patch 3/5] x86: pmu: Enlarge cnt array length to 64 in check_counters_many() Dapeng Mi
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 19+ messages in thread
From: Dapeng Mi @ 2023-10-24  7:57 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Zhenyu Wang, Zhang Xiong, Jim Mattson,
	Mingwei Zhang, Like Xu, Dapeng Mi, Dapeng Mi

As CPU hardware is upgraded and optimized, the LLC misses count for
running the loop() helper can legitimately be 0, as observed on
Sapphire Rapids.

So lower the minimum of the expected count range for the LLC misses
event to 0 to avoid the LLC misses test failure on Sapphire Rapids.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
 x86/pmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/x86/pmu.c b/x86/pmu.c
index 0def28695c70..7443fdab5c8a 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -35,7 +35,7 @@ struct pmu_event {
 	{"instructions", 0x00c0, 10*N, 10.2*N},
 	{"ref cycles", 0x013c, 1*N, 30*N},
 	{"llc references", 0x4f2e, 1, 2*N},
-	{"llc misses", 0x412e, 1, 1*N},
+	{"llc misses", 0x412e, 0, 1*N},
 	{"branches", 0x00c4, 1*N, 1.1*N},
 	{"branch misses", 0x00c5, 0, 0.1*N},
 }, amd_gp_events[] = {
-- 
2.34.1



* [kvm-unit-tests Patch 3/5] x86: pmu: Enlarge cnt array length to 64 in check_counters_many()
  2023-10-24  7:57 [kvm-unit-tests Patch 0/5] Fix PMU test failures on Sapphire Rapids Dapeng Mi
  2023-10-24  7:57 ` [kvm-unit-tests Patch 1/5] x86: pmu: Remove duplicate code in pmu_init() Dapeng Mi
  2023-10-24  7:57 ` [kvm-unit-tests Patch 2/5] x86: pmu: Change the minimum value of llc_misses event to 0 Dapeng Mi
@ 2023-10-24  7:57 ` Dapeng Mi
  2023-10-24  7:57 ` [kvm-unit-tests Patch 4/5] x86: pmu: Support validation for Intel PMU fixed counter 3 Dapeng Mi
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 19+ messages in thread
From: Dapeng Mi @ 2023-10-24  7:57 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Zhenyu Wang, Zhang Xiong, Jim Mattson,
	Mingwei Zhang, Like Xu, Dapeng Mi, Dapeng Mi

The latest Intel CPUs, such as Sapphire Rapids, already provide 8 GP
counters and 4 fixed counters. The original cnt array length of 10 is
not enough to cover all supported PMU counters on these new CPUs and
causes PMU counter validation failures.

Since more GP and fixed counters will likely be introduced in the
future, directly extend the cnt array length to 64.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
 x86/pmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/x86/pmu.c b/x86/pmu.c
index 7443fdab5c8a..1bebf493d4a4 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -254,7 +254,7 @@ static void check_fixed_counters(void)
 
 static void check_counters_many(void)
 {
-	pmu_counter_t cnt[10];
+	pmu_counter_t cnt[64];
 	int i, n;
 
 	for (i = 0, n = 0; n < pmu.nr_gp_counters; i++) {
-- 
2.34.1



* [kvm-unit-tests Patch 4/5] x86: pmu: Support validation for Intel PMU fixed counter 3
  2023-10-24  7:57 [kvm-unit-tests Patch 0/5] Fix PMU test failures on Sapphire Rapids Dapeng Mi
                   ` (2 preceding siblings ...)
  2023-10-24  7:57 ` [kvm-unit-tests Patch 3/5] x86: pmu: Enlarge cnt array length to 64 in check_counters_many() Dapeng Mi
@ 2023-10-24  7:57 ` Dapeng Mi
  2023-10-24 19:05   ` Jim Mattson
  2023-10-24  7:57 ` [kvm-unit-tests Patch 5/5] x86: pmu: Add asserts to warn inconsistent fixed events and counters Dapeng Mi
  2023-10-25 23:47 ` [kvm-unit-tests Patch 0/5] Fix PMU test failures on Sapphire Rapids Mingwei Zhang
  5 siblings, 1 reply; 19+ messages in thread
From: Dapeng Mi @ 2023-10-24  7:57 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Zhenyu Wang, Zhang Xiong, Jim Mattson,
	Mingwei Zhang, Like Xu, Dapeng Mi, Dapeng Mi

Intel CPUs, such as Sapphire Rapids, introduce a new fixed counter
(fixed counter 3) to count/sample the topdown.slots event, but the
current code doesn't cover this new fixed counter.

So add code to validate this new fixed counter.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
 x86/pmu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/x86/pmu.c b/x86/pmu.c
index 1bebf493d4a4..41165e168d8e 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -46,7 +46,8 @@ struct pmu_event {
 }, fixed_events[] = {
 	{"fixed 1", MSR_CORE_PERF_FIXED_CTR0, 10*N, 10.2*N},
 	{"fixed 2", MSR_CORE_PERF_FIXED_CTR0 + 1, 1*N, 30*N},
-	{"fixed 3", MSR_CORE_PERF_FIXED_CTR0 + 2, 0.1*N, 30*N}
+	{"fixed 3", MSR_CORE_PERF_FIXED_CTR0 + 2, 0.1*N, 30*N},
+	{"fixed 4", MSR_CORE_PERF_FIXED_CTR0 + 3, 1*N, 100*N}
 };
 
 char *buf;
-- 
2.34.1



* [kvm-unit-tests Patch 5/5] x86: pmu: Add asserts to warn inconsistent fixed events and counters
  2023-10-24  7:57 [kvm-unit-tests Patch 0/5] Fix PMU test failures on Sapphire Rapids Dapeng Mi
                   ` (3 preceding siblings ...)
  2023-10-24  7:57 ` [kvm-unit-tests Patch 4/5] x86: pmu: Support validation for Intel PMU fixed counter 3 Dapeng Mi
@ 2023-10-24  7:57 ` Dapeng Mi
  2023-10-25 23:47 ` [kvm-unit-tests Patch 0/5] Fix PMU test failures on Sapphire Rapids Mingwei Zhang
  5 siblings, 0 replies; 19+ messages in thread
From: Dapeng Mi @ 2023-10-24  7:57 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Zhenyu Wang, Zhang Xiong, Jim Mattson,
	Mingwei Zhang, Like Xu, Dapeng Mi, Dapeng Mi

The current PMU code doesn't check whether the number of PMU fixed
counters exceeds the number of pre-defined fixed events. If it does,
the fixed_events array is accessed out of range.

So add asserts to warn about this invalid case.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
 x86/pmu.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/x86/pmu.c b/x86/pmu.c
index 41165e168d8e..a1d615fbab3d 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -112,8 +112,12 @@ static struct pmu_event* get_counter_event(pmu_counter_t *cnt)
 		for (i = 0; i < gp_events_size; i++)
 			if (gp_events[i].unit_sel == (cnt->config & 0xffff))
 				return &gp_events[i];
-	} else
-		return &fixed_events[cnt->ctr - MSR_CORE_PERF_FIXED_CTR0];
+	} else {
+		int idx = cnt->ctr - MSR_CORE_PERF_FIXED_CTR0;
+
+		assert(idx < ARRAY_SIZE(fixed_events));
+		return &fixed_events[idx];
+	}
 
 	return (void*)0;
 }
@@ -246,6 +250,7 @@ static void check_fixed_counters(void)
 	};
 	int i;
 
+	assert(pmu.nr_fixed_counters <= ARRAY_SIZE(fixed_events));
 	for (i = 0; i < pmu.nr_fixed_counters; i++) {
 		cnt.ctr = fixed_events[i].unit_sel;
 		measure_one(&cnt);
@@ -267,6 +272,7 @@ static void check_counters_many(void)
 			gp_events[i % gp_events_size].unit_sel;
 		n++;
 	}
+	assert(pmu.nr_fixed_counters <= ARRAY_SIZE(fixed_events));
 	for (i = 0; i < pmu.nr_fixed_counters; i++) {
 		cnt[n].ctr = fixed_events[i].unit_sel;
 		cnt[n].config = EVNTSEL_OS | EVNTSEL_USR;
-- 
2.34.1



* Re: [kvm-unit-tests Patch 2/5] x86: pmu: Change the minimum value of llc_misses event to 0
  2023-10-24  7:57 ` [kvm-unit-tests Patch 2/5] x86: pmu: Change the minimum value of llc_misses event to 0 Dapeng Mi
@ 2023-10-24 13:03   ` Jim Mattson
  2023-10-25 11:22     ` Mi, Dapeng
  0 siblings, 1 reply; 19+ messages in thread
From: Jim Mattson @ 2023-10-24 13:03 UTC (permalink / raw)
  To: Dapeng Mi
  Cc: Sean Christopherson, Paolo Bonzini, kvm, linux-kernel,
	Zhenyu Wang, Zhang Xiong, Mingwei Zhang, Like Xu, Dapeng Mi

On Tue, Oct 24, 2023 at 12:51 AM Dapeng Mi <dapeng1.mi@linux.intel.com> wrote:
>
> Along with the CPU HW's upgrade and optimization, the count of LLC
> misses event for running loop() helper could be 0 just like seen on
> Sapphire Rapids.
>
> So modify the lower limit of possible count range for LLC misses
> events to 0 to avoid LLC misses event test failure on Sapphire Rapids.

I'm not convinced that these tests are really indicative of whether or
not the PMU is working properly. If 0 is allowed for llc misses, for
instance, doesn't this sub-test pass even when the PMU is disabled?

Surely, we can do better.

> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> ---
>  x86/pmu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/x86/pmu.c b/x86/pmu.c
> index 0def28695c70..7443fdab5c8a 100644
> --- a/x86/pmu.c
> +++ b/x86/pmu.c
> @@ -35,7 +35,7 @@ struct pmu_event {
>         {"instructions", 0x00c0, 10*N, 10.2*N},
>         {"ref cycles", 0x013c, 1*N, 30*N},
>         {"llc references", 0x4f2e, 1, 2*N},
> -       {"llc misses", 0x412e, 1, 1*N},
> +       {"llc misses", 0x412e, 0, 1*N},
>         {"branches", 0x00c4, 1*N, 1.1*N},
>         {"branch misses", 0x00c5, 0, 0.1*N},
>  }, amd_gp_events[] = {
> --
> 2.34.1
>


* Re: [kvm-unit-tests Patch 4/5] x86: pmu: Support validation for Intel PMU fixed counter 3
  2023-10-24  7:57 ` [kvm-unit-tests Patch 4/5] x86: pmu: Support validation for Intel PMU fixed counter 3 Dapeng Mi
@ 2023-10-24 19:05   ` Jim Mattson
  2023-10-25 11:26     ` Mi, Dapeng
  0 siblings, 1 reply; 19+ messages in thread
From: Jim Mattson @ 2023-10-24 19:05 UTC (permalink / raw)
  To: Dapeng Mi
  Cc: Sean Christopherson, Paolo Bonzini, kvm, linux-kernel,
	Zhenyu Wang, Zhang Xiong, Mingwei Zhang, Like Xu, Dapeng Mi

On Tue, Oct 24, 2023 at 12:51 AM Dapeng Mi <dapeng1.mi@linux.intel.com> wrote:
>
> Intel CPUs, like Sapphire Rapids, introduces a new fixed counter
> (fixed counter 3) to counter/sample topdown.slots event, but current
> code still doesn't cover this new fixed counter.
>
> So add code to validate this new fixed counter.

Can you explain how this "validates" anything?

> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> ---
>  x86/pmu.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/x86/pmu.c b/x86/pmu.c
> index 1bebf493d4a4..41165e168d8e 100644
> --- a/x86/pmu.c
> +++ b/x86/pmu.c
> @@ -46,7 +46,8 @@ struct pmu_event {
>  }, fixed_events[] = {
>         {"fixed 1", MSR_CORE_PERF_FIXED_CTR0, 10*N, 10.2*N},
>         {"fixed 2", MSR_CORE_PERF_FIXED_CTR0 + 1, 1*N, 30*N},
> -       {"fixed 3", MSR_CORE_PERF_FIXED_CTR0 + 2, 0.1*N, 30*N}
> +       {"fixed 3", MSR_CORE_PERF_FIXED_CTR0 + 2, 0.1*N, 30*N},
> +       {"fixed 4", MSR_CORE_PERF_FIXED_CTR0 + 3, 1*N, 100*N}
>  };
>
>  char *buf;
> --
> 2.34.1
>


* Re: [kvm-unit-tests Patch 2/5] x86: pmu: Change the minimum value of llc_misses event to 0
  2023-10-24 13:03   ` Jim Mattson
@ 2023-10-25 11:22     ` Mi, Dapeng
  2023-10-25 12:35       ` Jim Mattson
  0 siblings, 1 reply; 19+ messages in thread
From: Mi, Dapeng @ 2023-10-25 11:22 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Sean Christopherson, Paolo Bonzini, kvm, linux-kernel,
	Zhenyu Wang, Zhang Xiong, Mingwei Zhang, Like Xu, Dapeng Mi


On 10/24/2023 9:03 PM, Jim Mattson wrote:
> On Tue, Oct 24, 2023 at 12:51 AM Dapeng Mi <dapeng1.mi@linux.intel.com> wrote:
>> Along with the CPU HW's upgrade and optimization, the count of LLC
>> misses event for running loop() helper could be 0 just like seen on
>> Sapphire Rapids.
>>
>> So modify the lower limit of possible count range for LLC misses
>> events to 0 to avoid LLC misses event test failure on Sapphire Rapids.
> I'm not convinced that these tests are really indicative of whether or
> not the PMU is working properly. If 0 is allowed for llc misses, for
> instance, doesn't this sub-test pass even when the PMU is disabled?
>
> Surely, we can do better.


Considering the testing workload is just a simple adding loop, it's 
reasonable and possible that it gets a 0 result for the LLC misses and 
branch misses events. Yeah, I agree the 0 count makes the results less 
credible. If we want to avoid these 0 counts, we may have to complicate 
the workload, such as adding cache-flush instructions or something like 
that (I'm not sure whether there are instructions which can force branch 
misses). What's your idea about this?


>
>> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
>> ---
>>   x86/pmu.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/x86/pmu.c b/x86/pmu.c
>> index 0def28695c70..7443fdab5c8a 100644
>> --- a/x86/pmu.c
>> +++ b/x86/pmu.c
>> @@ -35,7 +35,7 @@ struct pmu_event {
>>          {"instructions", 0x00c0, 10*N, 10.2*N},
>>          {"ref cycles", 0x013c, 1*N, 30*N},
>>          {"llc references", 0x4f2e, 1, 2*N},
>> -       {"llc misses", 0x412e, 1, 1*N},
>> +       {"llc misses", 0x412e, 0, 1*N},
>>          {"branches", 0x00c4, 1*N, 1.1*N},
>>          {"branch misses", 0x00c5, 0, 0.1*N},
>>   }, amd_gp_events[] = {
>> --
>> 2.34.1
>>


* Re: [kvm-unit-tests Patch 4/5] x86: pmu: Support validation for Intel PMU fixed counter 3
  2023-10-24 19:05   ` Jim Mattson
@ 2023-10-25 11:26     ` Mi, Dapeng
  2023-10-25 12:38       ` Jim Mattson
  0 siblings, 1 reply; 19+ messages in thread
From: Mi, Dapeng @ 2023-10-25 11:26 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Sean Christopherson, Paolo Bonzini, kvm, linux-kernel,
	Zhenyu Wang, Zhang Xiong, Mingwei Zhang, Like Xu, Dapeng Mi


On 10/25/2023 3:05 AM, Jim Mattson wrote:
> On Tue, Oct 24, 2023 at 12:51 AM Dapeng Mi <dapeng1.mi@linux.intel.com> wrote:
>> Intel CPUs, like Sapphire Rapids, introduces a new fixed counter
>> (fixed counter 3) to counter/sample topdown.slots event, but current
>> code still doesn't cover this new fixed counter.
>>
>> So add code to validate this new fixed counter.
> Can you explain how this "validates" anything?


I may not have described it clearly. This validates that fixed counter 3 
can count the slots event and yields a count in a reasonable range. 
Thanks.


>
>> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
>> ---
>>   x86/pmu.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/x86/pmu.c b/x86/pmu.c
>> index 1bebf493d4a4..41165e168d8e 100644
>> --- a/x86/pmu.c
>> +++ b/x86/pmu.c
>> @@ -46,7 +46,8 @@ struct pmu_event {
>>   }, fixed_events[] = {
>>          {"fixed 1", MSR_CORE_PERF_FIXED_CTR0, 10*N, 10.2*N},
>>          {"fixed 2", MSR_CORE_PERF_FIXED_CTR0 + 1, 1*N, 30*N},
>> -       {"fixed 3", MSR_CORE_PERF_FIXED_CTR0 + 2, 0.1*N, 30*N}
>> +       {"fixed 3", MSR_CORE_PERF_FIXED_CTR0 + 2, 0.1*N, 30*N},
>> +       {"fixed 4", MSR_CORE_PERF_FIXED_CTR0 + 3, 1*N, 100*N}
>>   };
>>
>>   char *buf;
>> --
>> 2.34.1
>>


* Re: [kvm-unit-tests Patch 2/5] x86: pmu: Change the minimum value of llc_misses event to 0
  2023-10-25 11:22     ` Mi, Dapeng
@ 2023-10-25 12:35       ` Jim Mattson
  2023-10-26  2:14         ` Mi, Dapeng
  0 siblings, 1 reply; 19+ messages in thread
From: Jim Mattson @ 2023-10-25 12:35 UTC (permalink / raw)
  To: Mi, Dapeng
  Cc: Sean Christopherson, Paolo Bonzini, kvm, linux-kernel,
	Zhenyu Wang, Zhang Xiong, Mingwei Zhang, Like Xu, Dapeng Mi

On Wed, Oct 25, 2023 at 4:23 AM Mi, Dapeng <dapeng1.mi@linux.intel.com> wrote:
>
>
> On 10/24/2023 9:03 PM, Jim Mattson wrote:
> > On Tue, Oct 24, 2023 at 12:51 AM Dapeng Mi <dapeng1.mi@linux.intel.com> wrote:
> >> Along with the CPU HW's upgrade and optimization, the count of LLC
> >> misses event for running loop() helper could be 0 just like seen on
> >> Sapphire Rapids.
> >>
> >> So modify the lower limit of possible count range for LLC misses
> >> events to 0 to avoid LLC misses event test failure on Sapphire Rapids.
> > I'm not convinced that these tests are really indicative of whether or
> > not the PMU is working properly. If 0 is allowed for llc misses, for
> > instance, doesn't this sub-test pass even when the PMU is disabled?
> >
> > Surely, we can do better.
>
>
> Considering the testing workload is just a simple adding loop, it's
> reasonable and possible that it gets a 0 result for LLC misses and
> branch misses events. Yeah, I agree the 0 count makes the results not so
> credible. If we want to avoid these 0 count values, we may have to
> complicate the workload, such as adding flush cache instructions, or
> something like that (I'm not sure if there are instructions which can
> force branch misses). How's your idea about this?

CLFLUSH is probably a good way to ensure cache misses. IBPB may be a
good way to ensure branch mispredictions, or IBRS on parts without
eIBRS.

>
> >
> >> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> >> ---
> >>   x86/pmu.c | 2 +-
> >>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/x86/pmu.c b/x86/pmu.c
> >> index 0def28695c70..7443fdab5c8a 100644
> >> --- a/x86/pmu.c
> >> +++ b/x86/pmu.c
> >> @@ -35,7 +35,7 @@ struct pmu_event {
> >>          {"instructions", 0x00c0, 10*N, 10.2*N},
> >>          {"ref cycles", 0x013c, 1*N, 30*N},
> >>          {"llc references", 0x4f2e, 1, 2*N},
> >> -       {"llc misses", 0x412e, 1, 1*N},
> >> +       {"llc misses", 0x412e, 0, 1*N},
> >>          {"branches", 0x00c4, 1*N, 1.1*N},
> >>          {"branch misses", 0x00c5, 0, 0.1*N},
> >>   }, amd_gp_events[] = {
> >> --
> >> 2.34.1
> >>


* Re: [kvm-unit-tests Patch 4/5] x86: pmu: Support validation for Intel PMU fixed counter 3
  2023-10-25 11:26     ` Mi, Dapeng
@ 2023-10-25 12:38       ` Jim Mattson
  2023-10-26  2:29         ` Mi, Dapeng
  0 siblings, 1 reply; 19+ messages in thread
From: Jim Mattson @ 2023-10-25 12:38 UTC (permalink / raw)
  To: Mi, Dapeng
  Cc: Sean Christopherson, Paolo Bonzini, kvm, linux-kernel,
	Zhenyu Wang, Zhang Xiong, Mingwei Zhang, Like Xu, Dapeng Mi

On Wed, Oct 25, 2023 at 4:26 AM Mi, Dapeng <dapeng1.mi@linux.intel.com> wrote:
>
>
> On 10/25/2023 3:05 AM, Jim Mattson wrote:
> > On Tue, Oct 24, 2023 at 12:51 AM Dapeng Mi <dapeng1.mi@linux.intel.com> wrote:
> >> Intel CPUs, like Sapphire Rapids, introduces a new fixed counter
> >> (fixed counter 3) to counter/sample topdown.slots event, but current
> >> code still doesn't cover this new fixed counter.
> >>
> >> So add code to validate this new fixed counter.
> > Can you explain how this "validates" anything?
>
>
> I may not describe the sentence clearly. This would validate the fixed
> counter 3 can count the slots event and get a valid count in a
> reasonable range. Thanks.

I thought the current vPMU implementation did not actually support
top-down slots. If it doesn't work, how can it be validated?

>
> >
> >> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> >> ---
> >>   x86/pmu.c | 3 ++-
> >>   1 file changed, 2 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/x86/pmu.c b/x86/pmu.c
> >> index 1bebf493d4a4..41165e168d8e 100644
> >> --- a/x86/pmu.c
> >> +++ b/x86/pmu.c
> >> @@ -46,7 +46,8 @@ struct pmu_event {
> >>   }, fixed_events[] = {
> >>          {"fixed 1", MSR_CORE_PERF_FIXED_CTR0, 10*N, 10.2*N},
> >>          {"fixed 2", MSR_CORE_PERF_FIXED_CTR0 + 1, 1*N, 30*N},
> >> -       {"fixed 3", MSR_CORE_PERF_FIXED_CTR0 + 2, 0.1*N, 30*N}
> >> +       {"fixed 3", MSR_CORE_PERF_FIXED_CTR0 + 2, 0.1*N, 30*N},
> >> +       {"fixed 4", MSR_CORE_PERF_FIXED_CTR0 + 3, 1*N, 100*N}
> >>   };
> >>
> >>   char *buf;
> >> --
> >> 2.34.1
> >>


* Re: [kvm-unit-tests Patch 0/5] Fix PMU test failures on Sapphire Rapids
  2023-10-24  7:57 [kvm-unit-tests Patch 0/5] Fix PMU test failures on Sapphire Rapids Dapeng Mi
                   ` (4 preceding siblings ...)
  2023-10-24  7:57 ` [kvm-unit-tests Patch 5/5] x86: pmu: Add asserts to warn inconsistent fixed events and counters Dapeng Mi
@ 2023-10-25 23:47 ` Mingwei Zhang
  2023-10-26  3:32   ` Mi, Dapeng
  5 siblings, 1 reply; 19+ messages in thread
From: Mingwei Zhang @ 2023-10-25 23:47 UTC (permalink / raw)
  To: Dapeng Mi
  Cc: Sean Christopherson, Paolo Bonzini, kvm, linux-kernel,
	Zhenyu Wang, Zhang Xiong, Jim Mattson, Like Xu, Dapeng Mi

On Tue, Oct 24, 2023, Dapeng Mi wrote:
> When running pmu test on Intel Sapphire Rapids, we found several
> failures are encountered, such as "llc misses" failure, "all counters"
> failure and "fixed counter 3" failure.

hmm, I have tested your series on an SPR machine. It looks like all "llc
misses" tests already pass on my side. "all counters" always fails
with/without your patches. "fixed counter 3" never exists... I have "fixed
cntr-{0,1,2}" and "fixed-{0,1,2}".

You may want to double check the requirements of your series, and explain
those settings in detail instead of assuming your own setup.

Maybe what I am missing is your topdown series? So, before your topdown
series is checked in, I don't see value in this series.

Thanks.
-Mingwei
> 
> Intel Sapphire Rapids introduces new fixed counter 3, total PMU counters
> including GP and fixed counters increase to 12 and also optimizes cache
> subsystem. All these changes make the original assumptions in pmu test
> unavailable any more on Sapphire Rapids. Patches 2-4 fixes these
> failures, patch 0 remove the duplicate code and patch 5 adds assert to
> ensure predefine fixed events are matched with HW fixed counters.
> 
> Dapeng Mi (4):
>   x86: pmu: Change the minimum value of llc_misses event to 0
>   x86: pmu: Enlarge cnt array length to 64 in check_counters_many()
>   x86: pmu: Support validation for Intel PMU fixed counter 3
>   x86: pmu: Add asserts to warn inconsistent fixed events and counters
> 
> Xiong Zhang (1):
>   x86: pmu: Remove duplicate code in pmu_init()
> 
>  lib/x86/pmu.c |  5 -----
>  x86/pmu.c     | 17 ++++++++++++-----
>  2 files changed, 12 insertions(+), 10 deletions(-)
> 
> 
> base-commit: bfe5d7d0e14c8199d134df84d6ae8487a9772c48
> -- 
> 2.34.1
> 


* Re: [kvm-unit-tests Patch 2/5] x86: pmu: Change the minimum value of llc_misses event to 0
  2023-10-25 12:35       ` Jim Mattson
@ 2023-10-26  2:14         ` Mi, Dapeng
  2023-10-26 12:19           ` Jim Mattson
  0 siblings, 1 reply; 19+ messages in thread
From: Mi, Dapeng @ 2023-10-26  2:14 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Sean Christopherson, Paolo Bonzini, kvm, linux-kernel,
	Zhenyu Wang, Zhang Xiong, Mingwei Zhang, Like Xu, Dapeng Mi


On 10/25/2023 8:35 PM, Jim Mattson wrote:
> On Wed, Oct 25, 2023 at 4:23 AM Mi, Dapeng <dapeng1.mi@linux.intel.com> wrote:
>>
>> On 10/24/2023 9:03 PM, Jim Mattson wrote:
>>> On Tue, Oct 24, 2023 at 12:51 AM Dapeng Mi <dapeng1.mi@linux.intel.com> wrote:
>>>> Along with the CPU HW's upgrade and optimization, the count of LLC
>>>> misses event for running loop() helper could be 0 just like seen on
>>>> Sapphire Rapids.
>>>>
>>>> So modify the lower limit of possible count range for LLC misses
>>>> events to 0 to avoid LLC misses event test failure on Sapphire Rapids.
>>> I'm not convinced that these tests are really indicative of whether or
>>> not the PMU is working properly. If 0 is allowed for llc misses, for
>>> instance, doesn't this sub-test pass even when the PMU is disabled?
>>>
>>> Surely, we can do better.
>>
>> Considering the testing workload is just a simple adding loop, it's
>> reasonable and possible that it gets a 0 result for LLC misses and
>> branch misses events. Yeah, I agree the 0 count makes the results not so
>> credible. If we want to avoid these 0 count values, we may have to
>> complicate the workload, such as adding flush cache instructions, or
>> something like that (I'm not sure if there are instructions which can
>> force branch misses). How's your idea about this?
> CLFLUSH is probably a good way to ensure cache misses. IBPB may be a
> good way to ensure branch mispredictions, or IBRS on parts without
> eIBRS.


Thanks Jim for the information. I'm not familiar with the IBPB/IBRS 
mechanisms, but at a glance they look like advanced features: not all 
Intel CPUs support them, and I'm not sure whether AMD has something 
similar. It would be better if there were a more generic instruction 
sequence to trigger branch misses. Anyway, I will look at the details 
and come back again.


>>>> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
>>>> ---
>>>>    x86/pmu.c | 2 +-
>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/x86/pmu.c b/x86/pmu.c
>>>> index 0def28695c70..7443fdab5c8a 100644
>>>> --- a/x86/pmu.c
>>>> +++ b/x86/pmu.c
>>>> @@ -35,7 +35,7 @@ struct pmu_event {
>>>>           {"instructions", 0x00c0, 10*N, 10.2*N},
>>>>           {"ref cycles", 0x013c, 1*N, 30*N},
>>>>           {"llc references", 0x4f2e, 1, 2*N},
>>>> -       {"llc misses", 0x412e, 1, 1*N},
>>>> +       {"llc misses", 0x412e, 0, 1*N},
>>>>           {"branches", 0x00c4, 1*N, 1.1*N},
>>>>           {"branch misses", 0x00c5, 0, 0.1*N},
>>>>    }, amd_gp_events[] = {
>>>> --
>>>> 2.34.1
>>>>


* Re: [kvm-unit-tests Patch 4/5] x86: pmu: Support validation for Intel PMU fixed counter 3
  2023-10-25 12:38       ` Jim Mattson
@ 2023-10-26  2:29         ` Mi, Dapeng
  0 siblings, 0 replies; 19+ messages in thread
From: Mi, Dapeng @ 2023-10-26  2:29 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Sean Christopherson, Paolo Bonzini, kvm, linux-kernel,
	Zhenyu Wang, Zhang Xiong, Mingwei Zhang, Like Xu, Dapeng Mi


On 10/25/2023 8:38 PM, Jim Mattson wrote:
> On Wed, Oct 25, 2023 at 4:26 AM Mi, Dapeng <dapeng1.mi@linux.intel.com> wrote:
>>
>> On 10/25/2023 3:05 AM, Jim Mattson wrote:
>>> On Tue, Oct 24, 2023 at 12:51 AM Dapeng Mi <dapeng1.mi@linux.intel.com> wrote:
>>>> Intel CPUs, like Sapphire Rapids, introduces a new fixed counter
>>>> (fixed counter 3) to counter/sample topdown.slots event, but current
>>>> code still doesn't cover this new fixed counter.
>>>>
>>>> So add code to validate this new fixed counter.
>>> Can you explain how this "validates" anything?
>>
>> I may not describe the sentence clearly. This would validate the fixed
>> counter 3 can count the slots event and get a valid count in a
>> reasonable range. Thanks.
> I thought the current vPMU implementation did not actually support
> top-down slots. If it doesn't work, how can it be validated?

Oops, you reminded me, I made a mistake: the kernel I used includes the 
vtopdown supporting patches, so the topdown slots event is supported. 
Since there were big arguments on the original vtopdown RFC patches, 
the topdown metrics feature will probably not be supported in the 
current vPMU emulation framework, but the slots event support patches 
(the first two patches, 
https://lore.kernel.org/all/20230927033124.1226509-1-dapeng1.mi@linux.intel.com/T/#m53883e39177eb9a0d8e23e4c382ddc6190c7f0f4 
and 
https://lore.kernel.org/all/20230927033124.1226509-1-dapeng1.mi@linux.intel.com/T/#m1d9c433eb6ce83b32e50f6d976fbfeee2b731fb9) 
are still valuable; they are a small piece of work and don't touch any 
perf code. I'd like to split these two patches into an independent 
patchset and resend them to LKML.

>
>>>> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
>>>> ---
>>>>    x86/pmu.c | 3 ++-
>>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/x86/pmu.c b/x86/pmu.c
>>>> index 1bebf493d4a4..41165e168d8e 100644
>>>> --- a/x86/pmu.c
>>>> +++ b/x86/pmu.c
>>>> @@ -46,7 +46,8 @@ struct pmu_event {
>>>>    }, fixed_events[] = {
>>>>           {"fixed 1", MSR_CORE_PERF_FIXED_CTR0, 10*N, 10.2*N},
>>>>           {"fixed 2", MSR_CORE_PERF_FIXED_CTR0 + 1, 1*N, 30*N},
>>>> -       {"fixed 3", MSR_CORE_PERF_FIXED_CTR0 + 2, 0.1*N, 30*N}
>>>> +       {"fixed 3", MSR_CORE_PERF_FIXED_CTR0 + 2, 0.1*N, 30*N},
>>>> +       {"fixed 4", MSR_CORE_PERF_FIXED_CTR0 + 3, 1*N, 100*N}
>>>>    };
>>>>
>>>>    char *buf;
>>>> --
>>>> 2.34.1
>>>>


* Re: [kvm-unit-tests Patch 0/5] Fix PMU test failures on Sapphire Rapids
  2023-10-25 23:47 ` [kvm-unit-tests Patch 0/5] Fix PMU test failures on Sapphire Rapids Mingwei Zhang
@ 2023-10-26  3:32   ` Mi, Dapeng
  2023-10-30  3:57     ` Mingwei Zhang
  0 siblings, 1 reply; 19+ messages in thread
From: Mi, Dapeng @ 2023-10-26  3:32 UTC (permalink / raw)
  To: Mingwei Zhang
  Cc: Sean Christopherson, Paolo Bonzini, kvm, linux-kernel,
	Zhenyu Wang, Zhang Xiong, Jim Mattson, Like Xu, Dapeng Mi

On 10/26/2023 7:47 AM, Mingwei Zhang wrote:
> On Tue, Oct 24, 2023, Dapeng Mi wrote:
>> When running pmu test on Intel Sapphire Rapids, we found several
>> failures are encountered, such as "llc misses" failure, "all counters"
>> failure and "fixed counter 3" failure.
> hmm, I have tested your series on a SPR machine. It looks like, all "llc
> misses" already pass on my side. "all counters" always fail with/without
> your patches. "fixed counter 3" never exists... I have "fixed
> cntr-{0,1,2}" and "fixed-{0,1,2}"

1. "LLC misses" failure

Yeah, the "LLC misses" failure is not always seen. I see it 2~3 times 
out of 10 runs of the standalone PMU test, and you are more likely to 
see it if you run the full kvm-unit-tests suite. Whether you see the 
"LLC misses" failure really depends on the current cache state of your 
system, i.e. how much cache is being consumed by other programs. If 
there are lots of free cache lines when the pmu test runs, you have a 
higher chance of seeing LLC-miss failures, just like what I see below.

PASS: Intel: llc references-7
*FAIL*: Intel: llc misses-0
PASS: Intel: llc misses-1
PASS: Intel: llc misses-2

2. "all counters" failure

Actually the "all counters" failure is not always seen either, but 
that doesn't mean the current code is correct. In the current code the 
"cnt" array in check_counters_many() has length 10, but at least 11 
counters (8 GP counters + 3 fixed counters) are supported on SPR, even 
though fixed counter 3 is not supported in the current upstream code. 
So there is an out-of-range memory access in check_counters_many().

>
> You may want to double check the requirements of your series. Not just
> under your setting without explainning those setting in detail.
>
> Maybe what I am missing is your topdown series? So, before your topdown
> series checked in. I don't see value in this series.

3. "fixed counter 3" failure

Yeah, after Jim's reminder I realized I had used a kernel which 
includes the vtopdown supporting patches. As my reply to Jim's 
comments says, the patches supporting the slots event are still 
valuable for the current emulation framework, and I will split them 
from the original vtopdown patchset and resend them as an independent 
patchset. Anyway, even without slots event support in the kernel, that 
only impacts patch 4/5; the other patches are still valuable.


>
> Thanks.
> -Mingwei
>> Intel Sapphire Rapids introduces new fixed counter 3, total PMU counters
>> including GP and fixed counters increase to 12 and also optimizes cache
>> subsystem. All these changes make the original assumptions in pmu test
>> unavailable any more on Sapphire Rapids. Patches 2-4 fixes these
>> failures, patch 0 remove the duplicate code and patch 5 adds assert to
>> ensure predefine fixed events are matched with HW fixed counters.
>>
>> Dapeng Mi (4):
>>    x86: pmu: Change the minimum value of llc_misses event to 0
>>    x86: pmu: Enlarge cnt array length to 64 in check_counters_many()
>>    x86: pmu: Support validation for Intel PMU fixed counter 3
>>    x86: pmu: Add asserts to warn inconsistent fixed events and counters
>>
>> Xiong Zhang (1):
>>    x86: pmu: Remove duplicate code in pmu_init()
>>
>>   lib/x86/pmu.c |  5 -----
>>   x86/pmu.c     | 17 ++++++++++++-----
>>   2 files changed, 12 insertions(+), 10 deletions(-)
>>
>>
>> base-commit: bfe5d7d0e14c8199d134df84d6ae8487a9772c48
>> -- 
>> 2.34.1
>>


* Re: [kvm-unit-tests Patch 2/5] x86: pmu: Change the minimum value of llc_misses event to 0
  2023-10-26  2:14         ` Mi, Dapeng
@ 2023-10-26 12:19           ` Jim Mattson
  2023-10-27 10:17             ` Mi, Dapeng
  0 siblings, 1 reply; 19+ messages in thread
From: Jim Mattson @ 2023-10-26 12:19 UTC (permalink / raw)
  To: Mi, Dapeng
  Cc: Sean Christopherson, Paolo Bonzini, kvm, linux-kernel,
	Zhenyu Wang, Zhang Xiong, Mingwei Zhang, Like Xu, Dapeng Mi

On Wed, Oct 25, 2023 at 7:14 PM Mi, Dapeng <dapeng1.mi@linux.intel.com> wrote:
>
>
> On 10/25/2023 8:35 PM, Jim Mattson wrote:
> > On Wed, Oct 25, 2023 at 4:23 AM Mi, Dapeng <dapeng1.mi@linux.intel.com> wrote:
> >>
> >> On 10/24/2023 9:03 PM, Jim Mattson wrote:
> >>> On Tue, Oct 24, 2023 at 12:51 AM Dapeng Mi <dapeng1.mi@linux.intel.com> wrote:
> >>>> Along with the CPU HW's upgrade and optimization, the count of LLC
> >>>> misses event for running loop() helper could be 0 just like seen on
> >>>> Sapphire Rapids.
> >>>>
> >>>> So modify the lower limit of possible count range for LLC misses
> >>>> events to 0 to avoid LLC misses event test failure on Sapphire Rapids.
> >>> I'm not convinced that these tests are really indicative of whether or
> >>> not the PMU is working properly. If 0 is allowed for llc misses, for
> >>> instance, doesn't this sub-test pass even when the PMU is disabled?
> >>>
> >>> Surely, we can do better.
> >>
> >> Considering the testing workload is just a simple adding loop, it's
> >> reasonable and possible that it gets a 0 result for LLC misses and
> >> branch misses events. Yeah, I agree the 0 count makes the results not so
> >> credible. If we want to avoid these 0 count values, we may have to
> >> complicate the workload, such as adding flush cache instructions, or
> >> something like that (I'm not sure if there are instructions which can
> >> force branch misses). How's your idea about this?
> > CLFLUSH is probably a good way to ensure cache misses. IBPB may be a
> > good way to ensure branch mispredictions, or IBRS on parts without
> > eIBRS.
>
>
> Thanks Jim for the information. I'm not familiar with IBPB/IBRS
> instructions, but just a glance, it looks there two instructions are
> some kind of advanced instructions,  Not all Intel CPUs support these
> instructions and not sure if AMD has similar instructions. It would be
> better if there are more generic instruction to trigger branch miss.
> Anyway I would look at the details and come back again.

IBPB and IBRS are not instructions. IBPB (indirect branch predictor
barrier) is triggered by setting bit 0 of the IA32_PRED_CMD MSR. IBRS
(indirect branch restricted speculation) is triggered by setting bit 0
of the IA32_SPEC_CTRL MSR. It is true that the desired behavior of
IBRS (causing branch mispredictions) is only exhibited by certain
older parts. However, IBPB is now universally available, as it is
necessary to mitigate many speculative execution attacks. For Intel
documentation, see
https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/cpuid-enumeration-and-architectural-msrs.html.

If you don't want to use these, you could train a branch to go one way
prior to measurement, and then arrange for the branch under test go
the other way.

> >>>> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> >>>> ---
> >>>>    x86/pmu.c | 2 +-
> >>>>    1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/x86/pmu.c b/x86/pmu.c
> >>>> index 0def28695c70..7443fdab5c8a 100644
> >>>> --- a/x86/pmu.c
> >>>> +++ b/x86/pmu.c
> >>>> @@ -35,7 +35,7 @@ struct pmu_event {
> >>>>           {"instructions", 0x00c0, 10*N, 10.2*N},
> >>>>           {"ref cycles", 0x013c, 1*N, 30*N},
> >>>>           {"llc references", 0x4f2e, 1, 2*N},
> >>>> -       {"llc misses", 0x412e, 1, 1*N},
> >>>> +       {"llc misses", 0x412e, 0, 1*N},
> >>>>           {"branches", 0x00c4, 1*N, 1.1*N},
> >>>>           {"branch misses", 0x00c5, 0, 0.1*N},
> >>>>    }, amd_gp_events[] = {
> >>>> --
> >>>> 2.34.1
> >>>>


* Re: [kvm-unit-tests Patch 2/5] x86: pmu: Change the minimum value of llc_misses event to 0
  2023-10-26 12:19           ` Jim Mattson
@ 2023-10-27 10:17             ` Mi, Dapeng
  0 siblings, 0 replies; 19+ messages in thread
From: Mi, Dapeng @ 2023-10-27 10:17 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Sean Christopherson, Paolo Bonzini, kvm, linux-kernel,
	Zhenyu Wang, Zhang Xiong, Mingwei Zhang, Like Xu, Dapeng Mi


On 10/26/2023 8:19 PM, Jim Mattson wrote:
> On Wed, Oct 25, 2023 at 7:14 PM Mi, Dapeng <dapeng1.mi@linux.intel.com> wrote:
>>
>> On 10/25/2023 8:35 PM, Jim Mattson wrote:
>>> On Wed, Oct 25, 2023 at 4:23 AM Mi, Dapeng <dapeng1.mi@linux.intel.com> wrote:
>>>> On 10/24/2023 9:03 PM, Jim Mattson wrote:
>>>>> On Tue, Oct 24, 2023 at 12:51 AM Dapeng Mi <dapeng1.mi@linux.intel.com> wrote:
>>>>>> Along with the CPU HW's upgrade and optimization, the count of LLC
>>>>>> misses event for running loop() helper could be 0 just like seen on
>>>>>> Sapphire Rapids.
>>>>>>
>>>>>> So modify the lower limit of possible count range for LLC misses
>>>>>> events to 0 to avoid LLC misses event test failure on Sapphire Rapids.
>>>>> I'm not convinced that these tests are really indicative of whether or
>>>>> not the PMU is working properly. If 0 is allowed for llc misses, for
>>>>> instance, doesn't this sub-test pass even when the PMU is disabled?
>>>>>
>>>>> Surely, we can do better.
>>>> Considering the testing workload is just a simple adding loop, it's
>>>> reasonable and possible that it gets a 0 result for LLC misses and
>>>> branch misses events. Yeah, I agree the 0 count makes the results not so
>>>> credible. If we want to avoid these 0 count values, we may have to
>>>> complicate the workload, such as adding flush cache instructions, or
>>>> something like that (I'm not sure if there are instructions which can
>>>> force branch misses). How's your idea about this?
>>> CLFLUSH is probably a good way to ensure cache misses. IBPB may be a
>>> good way to ensure branch mispredictions, or IBRS on parts without
>>> eIBRS.
>>
>> Thanks Jim for the information. I'm not familiar with IBPB/IBRS
>> instructions, but just a glance, it looks there two instructions are
>> some kind of advanced instructions,  Not all Intel CPUs support these
>> instructions and not sure if AMD has similar instructions. It would be
>> better if there are more generic instruction to trigger branch miss.
>> Anyway I would look at the details and come back again.
> IBPB and IBRS are not instructions. IBPB (indirect branch predictor
> barrier) is triggered by setting bit 0 of the IA32_PRED_CMD MSR. IBRS
> (indirect branch restricted speculation) is triggered by setting bit 0
> of the IA32_SPEC_CTRL MSR. It is true that the desired behavior of
> IBRS (causing branch mispredictions) is only exhibited by certain
> older parts. However, IBPB is now universally available, as it is
> necessary to mitigate many speculative execution attacks. For Intel
> documentation, see
> https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/cpuid-enumeration-and-architectural-msrs.html.
>
> If you don't want to use these, you could train a branch to go one way
> prior to measurement, and then arrange for the branch under test go
> the other way.


Thanks Jim. From my point of view, IBPB is still an extended feature 
which may not be supported on some older platforms. Considering that 
kvm-unit-tests may still be run on those old platforms, IBPB seems not 
the best choice. An alternative I'm thinking of is to use the 'rdrand' 
instruction to get a random value and then branch based on that value. 
That would reliably cause branch misses and looks more generic.


>
>>>>>> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
>>>>>> ---
>>>>>>     x86/pmu.c | 2 +-
>>>>>>     1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/x86/pmu.c b/x86/pmu.c
>>>>>> index 0def28695c70..7443fdab5c8a 100644
>>>>>> --- a/x86/pmu.c
>>>>>> +++ b/x86/pmu.c
>>>>>> @@ -35,7 +35,7 @@ struct pmu_event {
>>>>>>            {"instructions", 0x00c0, 10*N, 10.2*N},
>>>>>>            {"ref cycles", 0x013c, 1*N, 30*N},
>>>>>>            {"llc references", 0x4f2e, 1, 2*N},
>>>>>> -       {"llc misses", 0x412e, 1, 1*N},
>>>>>> +       {"llc misses", 0x412e, 0, 1*N},
>>>>>>            {"branches", 0x00c4, 1*N, 1.1*N},
>>>>>>            {"branch misses", 0x00c5, 0, 0.1*N},
>>>>>>     }, amd_gp_events[] = {
>>>>>> --
>>>>>> 2.34.1
>>>>>>


* Re: [kvm-unit-tests Patch 0/5] Fix PMU test failures on Sapphire Rapids
  2023-10-26  3:32   ` Mi, Dapeng
@ 2023-10-30  3:57     ` Mingwei Zhang
  0 siblings, 0 replies; 19+ messages in thread
From: Mingwei Zhang @ 2023-10-30  3:57 UTC (permalink / raw)
  To: Mi, Dapeng
  Cc: Sean Christopherson, Paolo Bonzini, kvm, linux-kernel,
	Zhenyu Wang, Zhang Xiong, Jim Mattson, Like Xu, Dapeng Mi

On Thu, Oct 26, 2023, Mi, Dapeng wrote:
> On 10/26/2023 7:47 AM, Mingwei Zhang wrote:
> > On Tue, Oct 24, 2023, Dapeng Mi wrote:
> > > When running pmu test on Intel Sapphire Rapids, we found several
> > > failures are encountered, such as "llc misses" failure, "all counters"
> > > failure and "fixed counter 3" failure.
> > hmm, I have tested your series on a SPR machine. It looks like, all "llc
> > misses" already pass on my side. "all counters" always fail with/without
> > your patches. "fixed counter 3" never exists... I have "fixed
> > cntr-{0,1,2}" and "fixed-{0,1,2}"
> 
> 1. "LLC misses" failure
> 
> Yeah, the "LLC misses" failure is not always seen. I can see the "LLC 
> misses" 2 ~3 times out of 10 runs of PMU standalone test and you could see
> the failure with higher possibility if you run the full kvm-unit-tests. I
> think whether you can see the "LLC misses" failure it really depends on
> current cache status on your system, how much cache memory are consumed by
> other programs. If there are lots of free cache lines on system when running
> the pmu test, you may have higher possibility to see the LLC misses failures
> just like what I see below.
> 
> PASS: Intel: llc references-7
> *FAIL*: Intel: llc misses-0
> PASS: Intel: llc misses-1
> PASS: Intel: llc misses-2
> 
> 2. "all counters" failure
> 
> Actually the "all counters" failure are not always seen, but it doesn't mean
> current code is correct. In current code, the length of "cnt[10]" array in
> check_counters_many() is defined as 10, but there are at least 11 counters
> supported (8 GP counters + 3 fixed counters) on SPR even though fixed
> counter 3 is not supported in current upstream code. Obviously there would
> be out of range memory access in check_counters_many().
> 

OK, I will double-check these. Thanks.

> > 
> > You may want to double check the requirements of your series. Not just
> > under your setting without explainning those setting in detail.
> > 
> > Maybe what I am missing is your topdown series? So, before your topdown
> > series checked in. I don't see value in this series.
> 
> 3. "fixed counter 3" failure
> 
> Yeah, I just realized I used the kernel which includes the vtopdown
> supporting patches after Jim's reminding. As the reply for Jim's comments
> says, the patches for support slots event are still valuable for current
> emulation framework and I would split them from the original vtopdown
> patchset and resend them as an independent patchset. Anyway, even though
> there is not slots event support in Kernel, it only impacts the patch 4/5,
> other patches are still valuable.
> 
> 
> > 
> > Thanks.
> > -Mingwei
> > > Intel Sapphire Rapids introduces new fixed counter 3, total PMU counters
> > > including GP and fixed counters increase to 12 and also optimizes cache
> > > subsystem. All these changes make the original assumptions in pmu test
> > > unavailable any more on Sapphire Rapids. Patches 2-4 fixes these
> > > failures, patch 0 remove the duplicate code and patch 5 adds assert to
> > > ensure predefine fixed events are matched with HW fixed counters.
> > > 
> > > Dapeng Mi (4):
> > >    x86: pmu: Change the minimum value of llc_misses event to 0
> > >    x86: pmu: Enlarge cnt array length to 64 in check_counters_many()
> > >    x86: pmu: Support validation for Intel PMU fixed counter 3
> > >    x86: pmu: Add asserts to warn inconsistent fixed events and counters
> > > 
> > > Xiong Zhang (1):
> > >    x86: pmu: Remove duplicate code in pmu_init()
> > > 
> > >   lib/x86/pmu.c |  5 -----
> > >   x86/pmu.c     | 17 ++++++++++++-----
> > >   2 files changed, 12 insertions(+), 10 deletions(-)
> > > 
> > > 
> > > base-commit: bfe5d7d0e14c8199d134df84d6ae8487a9772c48
> > > -- 
> > > 2.34.1
> > > 



Thread overview: 19+ messages
2023-10-24  7:57 [kvm-unit-tests Patch 0/5] Fix PMU test failures on Sapphire Rapids Dapeng Mi
2023-10-24  7:57 ` [kvm-unit-tests Patch 1/5] x86: pmu: Remove duplicate code in pmu_init() Dapeng Mi
2023-10-24  7:57 ` [kvm-unit-tests Patch 2/5] x86: pmu: Change the minimum value of llc_misses event to 0 Dapeng Mi
2023-10-24 13:03   ` Jim Mattson
2023-10-25 11:22     ` Mi, Dapeng
2023-10-25 12:35       ` Jim Mattson
2023-10-26  2:14         ` Mi, Dapeng
2023-10-26 12:19           ` Jim Mattson
2023-10-27 10:17             ` Mi, Dapeng
2023-10-24  7:57 ` [kvm-unit-tests Patch 3/5] x86: pmu: Enlarge cnt array length to 64 in check_counters_many() Dapeng Mi
2023-10-24  7:57 ` [kvm-unit-tests Patch 4/5] x86: pmu: Support validation for Intel PMU fixed counter 3 Dapeng Mi
2023-10-24 19:05   ` Jim Mattson
2023-10-25 11:26     ` Mi, Dapeng
2023-10-25 12:38       ` Jim Mattson
2023-10-26  2:29         ` Mi, Dapeng
2023-10-24  7:57 ` [kvm-unit-tests Patch 5/5] x86: pmu: Add asserts to warn inconsistent fixed events and counters Dapeng Mi
2023-10-25 23:47 ` [kvm-unit-tests Patch 0/5] Fix PMU test failures on Sapphire Rapids Mingwei Zhang
2023-10-26  3:32   ` Mi, Dapeng
2023-10-30  3:57     ` Mingwei Zhang
