* [kvm-unit-tests patch 0/5] Fix pmu test errors on SRF/CWF
@ 2025-07-12 17:49 Dapeng Mi
2025-07-12 17:49 ` [kvm-unit-tests patch 1/5] x86/pmu: Add helper to detect Intel overcount issues Dapeng Mi
` (4 more replies)
0 siblings, 5 replies; 8+ messages in thread
From: Dapeng Mi @ 2025-07-12 17:49 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Jim Mattson, Mingwei Zhang, Zide Chen,
Das Sandipan, Shukla Manali, Yi Lai, Dapeng Mi, Dapeng Mi
This patchset fixes pmu test errors on Atom servers such as Sierra
Forest (SRF) and Clearwater Forest (CWF).
On Intel Atom platforms, the PMU events "Instruction Retired" and
"Branch Instruction Retired" may be overcounted for certain
instructions, such as FAR CALL/JMP, RETF, IRET, VMENTRY/VMEXIT/VMPTRLD
and complex SGX/SMX/CSTATE instructions/flows[1].
In detail, on Atom platforms up to and including Sierra Forest, both
the "Instruction Retired" and "Branch Instruction Retired" events are
overcounted on these instructions, whereas on Clearwater Forest only
the "Instruction Retired" event is overcounted.
Due to this overcount issue, the pmu test fails to validate the
precise counts for these two events on SRF and CWF. Patches 1-3/5
detect whether the platform has this overcount issue and, if so, relax
the precise count validation for these two events.
Besides, it appears that more LLC references are generated on SRF/CWF,
so adjust the "LLC references" event count range.
Tests:
* pmu test passes on Intel GNR/SRF/CWF platforms.
Ref:
[1] https://edc.intel.com/content/www/us/en/design/products-and-solutions/processors-and-chipsets/sierra-forest/xeon-6700-series-processor-with-e-cores-specification-update/errata-details
dongsheng (5):
x86/pmu: Add helper to detect Intel overcount issues
x86/pmu: Relax precise count validation for Intel overcounted
platforms
x86/pmu: Fix incorrect masking of fixed counters
x86/pmu: Handle instruction overcount issue in overflow test
x86/pmu: Expand "llc references" upper limit for broader compatibility
lib/x86/processor.h | 17 +++++++++
x86/pmu.c | 93 +++++++++++++++++++++++++++++++++++++++------
2 files changed, 98 insertions(+), 12 deletions(-)
base-commit: 525bdb5d65d51a367341f471eb1bcd505d73c51f
--
2.43.0
* [kvm-unit-tests patch 1/5] x86/pmu: Add helper to detect Intel overcount issues
2025-07-12 17:49 [kvm-unit-tests patch 0/5] Fix pmu test errors on SRF/CWF Dapeng Mi
@ 2025-07-12 17:49 ` Dapeng Mi
2025-07-15 13:27 ` Xiaoyao Li
2025-07-12 17:49 ` [kvm-unit-tests patch 2/5] x86/pmu: Relax precise count validation for Intel overcounted platforms Dapeng Mi
` (3 subsequent siblings)
4 siblings, 1 reply; 8+ messages in thread
From: Dapeng Mi @ 2025-07-12 17:49 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Jim Mattson, Mingwei Zhang, Zide Chen,
Das Sandipan, Shukla Manali, Yi Lai, Dapeng Mi, dongsheng,
Dapeng Mi
From: dongsheng <dongsheng.x.zhang@intel.com>
For Intel Atom CPUs, the PMU events "Instruction Retired" and
"Branch Instruction Retired" may be overcounted for certain
instructions, such as FAR CALL/JMP, RETF, IRET, VMENTRY/VMEXIT/VMPTRLD
and complex SGX/SMX/CSTATE instructions/flows.
The detailed information can be found in the errata (section SRF7):
https://edc.intel.com/content/www/us/en/design/products-and-solutions/processors-and-chipsets/sierra-forest/xeon-6700-series-processor-with-e-cores-specification-update/errata-details/
On Atom platforms up to and including Sierra Forest, both the
"Instruction Retired" and "Branch Instruction Retired" events are
overcounted on these instructions, whereas on Clearwater Forest only
the "Instruction Retired" event is overcounted.
So add a helper, detect_inst_overcount_flags(), to detect whether the
platform has the overcount issue; later patches relax the precise
count checks by leveraging the overcount flags returned by this
helper.
Signed-off-by: dongsheng <dongsheng.x.zhang@intel.com>
[Rewrite comments and commit message - Dapeng]
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Tested-by: Yi Lai <yi1.lai@intel.com>
---
lib/x86/processor.h | 17 ++++++++++++++++
x86/pmu.c | 47 +++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 64 insertions(+)
diff --git a/lib/x86/processor.h b/lib/x86/processor.h
index 62f3d578..3f475c21 100644
--- a/lib/x86/processor.h
+++ b/lib/x86/processor.h
@@ -1188,4 +1188,21 @@ static inline bool is_lam_u57_enabled(void)
return !!(read_cr3() & X86_CR3_LAM_U57);
}
+static inline u32 x86_family(u32 eax)
+{
+ u32 x86;
+
+ x86 = (eax >> 8) & 0xf;
+
+ if (x86 == 0xf)
+ x86 += (eax >> 20) & 0xff;
+
+ return x86;
+}
+
+static inline u32 x86_model(u32 eax)
+{
+ return ((eax >> 12) & 0xf0) | ((eax >> 4) & 0x0f);
+}
+
#endif
diff --git a/x86/pmu.c b/x86/pmu.c
index a6b0cfcc..87365aff 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -159,6 +159,14 @@ static struct pmu_event *gp_events;
static unsigned int gp_events_size;
static unsigned int fixed_counters_num;
+/*
+ * Flags for Intel "Instruction Retired" and "Branch Instruction Retired"
+ * overcount flaws.
+ */
+#define INST_RETIRED_OVERCOUNT BIT(0)
+#define BR_RETIRED_OVERCOUNT BIT(1)
+static u32 intel_inst_overcount_flags;
+
static int has_ibpb(void)
{
return this_cpu_has(X86_FEATURE_SPEC_CTRL) ||
@@ -959,6 +967,43 @@ static void check_invalid_rdpmc_gp(void)
"Expected #GP on RDPMC(64)");
}
+/*
+ * For Intel Atom CPUs, the PMU events "Instruction Retired" and
+ * "Branch Instruction Retired" may be overcounted for certain
+ * instructions, such as FAR CALL/JMP, RETF, IRET, VMENTRY/VMEXIT/
+ * VMPTRLD and complex SGX/SMX/CSTATE instructions/flows.
+ *
+ * The detailed information can be found in the errata (section SRF7):
+ * https://edc.intel.com/content/www/us/en/design/products-and-solutions/processors-and-chipsets/sierra-forest/xeon-6700-series-processor-with-e-cores-specification-update/errata-details/
+ *
+ * On Atom platforms up to and including Sierra Forest, both the
+ * "Instruction Retired" and "Branch Instruction Retired" events are
+ * overcounted on these instructions, whereas on Clearwater Forest
+ * only the "Instruction Retired" event is overcounted.
+ */
+static u32 detect_inst_overcount_flags(void)
+{
+ u32 flags = 0;
+ struct cpuid c = cpuid(1);
+
+ if (x86_family(c.a) == 0x6) {
+ switch (x86_model(c.a)) {
+ case 0xDD: /* Clearwater Forest */
+ flags = INST_RETIRED_OVERCOUNT;
+ break;
+
+ case 0xAF: /* Sierra Forest */
+ case 0x4D: /* Avaton, Rangely */
+ case 0x5F: /* Denverton */
+ case 0x86: /* Jacobsville */
+ flags = INST_RETIRED_OVERCOUNT | BR_RETIRED_OVERCOUNT;
+ break;
+ }
+ }
+
+ return flags;
+}
+
int main(int ac, char **av)
{
int instruction_idx;
@@ -985,6 +1030,8 @@ int main(int ac, char **av)
branch_idx = INTEL_BRANCHES_IDX;
branch_miss_idx = INTEL_BRANCH_MISS_IDX;
+ intel_inst_overcount_flags = detect_inst_overcount_flags();
+
/*
* For legacy Intel CPUS without clflush/clflushopt support,
* there is no way to force to trigger a LLC miss, thus set
--
2.43.0
* [kvm-unit-tests patch 2/5] x86/pmu: Relax precise count validation for Intel overcounted platforms
2025-07-12 17:49 [kvm-unit-tests patch 0/5] Fix pmu test errors on SRF/CWF Dapeng Mi
2025-07-12 17:49 ` [kvm-unit-tests patch 1/5] x86/pmu: Add helper to detect Intel overcount issues Dapeng Mi
@ 2025-07-12 17:49 ` Dapeng Mi
2025-07-12 17:49 ` [kvm-unit-tests patch 3/5] x86/pmu: Fix incorrect masking of fixed counters Dapeng Mi
` (2 subsequent siblings)
4 siblings, 0 replies; 8+ messages in thread
From: Dapeng Mi @ 2025-07-12 17:49 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Jim Mattson, Mingwei Zhang, Zide Chen,
Das Sandipan, Shukla Manali, Yi Lai, Dapeng Mi, dongsheng,
Dapeng Mi
From: dongsheng <dongsheng.x.zhang@intel.com>
Due to the VM-Exit/VM-Entry overcount issue on Intel Atom platforms,
there is no way to validate the precise counts for the "instructions"
and "branches" events on the affected platforms. Thus, relax the
precise count validation on these platforms.
Signed-off-by: dongsheng <dongsheng.x.zhang@intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Tested-by: Yi Lai <yi1.lai@intel.com>
---
x86/pmu.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/x86/pmu.c b/x86/pmu.c
index 87365aff..04946d10 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -237,10 +237,15 @@ static void adjust_events_range(struct pmu_event *gp_events,
* occur while running the measured code, e.g. if the host takes IRQs.
*/
if (pmu.is_intel && this_cpu_has_perf_global_ctrl()) {
- gp_events[instruction_idx].min = LOOP_INSNS;
- gp_events[instruction_idx].max = LOOP_INSNS;
- gp_events[branch_idx].min = LOOP_BRANCHES;
- gp_events[branch_idx].max = LOOP_BRANCHES;
+ if (!(intel_inst_overcount_flags & INST_RETIRED_OVERCOUNT)) {
+ gp_events[instruction_idx].min = LOOP_INSNS;
+ gp_events[instruction_idx].max = LOOP_INSNS;
+ }
+
+ if (!(intel_inst_overcount_flags & BR_RETIRED_OVERCOUNT)) {
+ gp_events[branch_idx].min = LOOP_BRANCHES;
+ gp_events[branch_idx].max = LOOP_BRANCHES;
+ }
}
/*
--
2.43.0
* [kvm-unit-tests patch 3/5] x86/pmu: Fix incorrect masking of fixed counters
2025-07-12 17:49 [kvm-unit-tests patch 0/5] Fix pmu test errors on SRF/CWF Dapeng Mi
2025-07-12 17:49 ` [kvm-unit-tests patch 1/5] x86/pmu: Add helper to detect Intel overcount issues Dapeng Mi
2025-07-12 17:49 ` [kvm-unit-tests patch 2/5] x86/pmu: Relax precise count validation for Intel overcounted platforms Dapeng Mi
@ 2025-07-12 17:49 ` Dapeng Mi
2025-07-12 17:49 ` [kvm-unit-tests patch 4/5] x86/pmu: Handle instruction overcount issue in overflow test Dapeng Mi
2025-07-12 17:49 ` [kvm-unit-tests patch 5/5] x86/pmu: Expand "llc references" upper limit for broader compatibility Dapeng Mi
4 siblings, 0 replies; 8+ messages in thread
From: Dapeng Mi @ 2025-07-12 17:49 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Jim Mattson, Mingwei Zhang, Zide Chen,
Das Sandipan, Shukla Manali, Yi Lai, Dapeng Mi, dongsheng,
Dapeng Mi
From: dongsheng <dongsheng.x.zhang@intel.com>
The current implementation mistakenly limits the width of fixed
counters to the width of GP counters. Correct the logic to ensure
fixed counters are properly masked according to their own width.
Opportunistically refine the GP counter bitwidth processing code.
Signed-off-by: dongsheng <dongsheng.x.zhang@intel.com>
Co-developed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Tested-by: Yi Lai <yi1.lai@intel.com>
---
x86/pmu.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/x86/pmu.c b/x86/pmu.c
index 04946d10..44c728a5 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -556,18 +556,16 @@ static void check_counter_overflow(void)
int idx;
cnt.count = overflow_preset;
- if (pmu_use_full_writes())
- cnt.count &= (1ull << pmu.gp_counter_width) - 1;
-
if (i == pmu.nr_gp_counters) {
if (!pmu.is_intel)
break;
cnt.ctr = fixed_events[0].unit_sel;
- cnt.count = measure_for_overflow(&cnt);
- cnt.count &= (1ull << pmu.gp_counter_width) - 1;
+ cnt.count &= (1ull << pmu.fixed_counter_width) - 1;
} else {
cnt.ctr = MSR_GP_COUNTERx(i);
+ if (pmu_use_full_writes())
+ cnt.count &= (1ull << pmu.gp_counter_width) - 1;
}
if (i % 2)
--
2.43.0
* [kvm-unit-tests patch 4/5] x86/pmu: Handle instruction overcount issue in overflow test
2025-07-12 17:49 [kvm-unit-tests patch 0/5] Fix pmu test errors on SRF/CWF Dapeng Mi
` (2 preceding siblings ...)
2025-07-12 17:49 ` [kvm-unit-tests patch 3/5] x86/pmu: Fix incorrect masking of fixed counters Dapeng Mi
@ 2025-07-12 17:49 ` Dapeng Mi
2025-07-12 17:49 ` [kvm-unit-tests patch 5/5] x86/pmu: Expand "llc references" upper limit for broader compatibility Dapeng Mi
4 siblings, 0 replies; 8+ messages in thread
From: Dapeng Mi @ 2025-07-12 17:49 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Jim Mattson, Mingwei Zhang, Zide Chen,
Das Sandipan, Shukla Manali, Yi Lai, Dapeng Mi, dongsheng,
Dapeng Mi
From: dongsheng <dongsheng.x.zhang@intel.com>
During the execution of __measure(), VM exits (e.g., due to
WRMSR/EXTERNAL_INTERRUPT) may occur. On systems affected by the
instruction overcount issue, each VM-Exit/VM-Entry can erroneously
increment the instruction count by one, leading to false failures in
overflow tests.
To address this, the patch introduces a range-based validation in place
of precise instruction count checks. Additionally, overflow_preset is
now statically set to 1 - LOOP_INSNS, rather than being dynamically
determined via measure_for_overflow().
These changes ensure consistent and predictable behavior aligned with the
intended loop instruction count, while avoiding modifications to the
subsequent status and status-clear testing logic.
The chosen validation range is empirically derived to maintain test
reliability across hardware variations.
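As a standalone illustration of the preset arithmetic (the LOOP_INSNS
value and counter width below are assumed for this sketch, not taken
from the test):

#include <stdio.h>
#include <stdint.h>

#define LOOP_INSNS	16384	/* assumed value, for illustration only */
#define COUNTER_WIDTH	48	/* assumed counter width */

int main(void)
{
	uint64_t mask = (1ull << COUNTER_WIDTH) - 1;

	/*
	 * 1 - LOOP_INSNS is negative, so the counter holds its two's
	 * complement, a value just below the overflow boundary.
	 */
	uint64_t preset = (uint64_t)(1 - LOOP_INSNS) & mask;

	/*
	 * After exactly LOOP_INSNS retired instructions the counter
	 * crosses zero (setting the overflow bit) and reads 1; any
	 * overcounted instructions push it slightly above 1, which is
	 * why the overflow test accepts a small range instead of
	 * exactly 1 on affected parts.
	 */
	uint64_t final = (preset + LOOP_INSNS) & mask;

	printf("preset = %#llx, final = %llu\n",
	       (unsigned long long)preset, (unsigned long long)final);
	return 0;
}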
Signed-off-by: dongsheng <dongsheng.x.zhang@intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Tested-by: Yi Lai <yi1.lai@intel.com>
---
x86/pmu.c | 23 +++++++++++++++++++++--
1 file changed, 21 insertions(+), 2 deletions(-)
diff --git a/x86/pmu.c b/x86/pmu.c
index 44c728a5..c54c0988 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -518,6 +518,21 @@ static void check_counters_many(void)
static uint64_t measure_for_overflow(pmu_counter_t *cnt)
{
+ /*
+ * During the execution of __measure(), VM exits (e.g., due to
+ * WRMSR/EXTERNAL_INTERRUPT) may occur. On systems affected by the
+ * instruction overcount issue, each VM-Exit/VM-Entry can erroneously
+ * increment the instruction count by one, leading to false failures
+ * in overflow tests.
+ *
+ * To mitigate this, if the overcount issue is detected, we hardcode
+ * the overflow preset to (1 - LOOP_INSNS) instead of calculating it
+ * dynamically. This ensures that an overflow will reliably occur,
+ * regardless of any overcounting caused by VM exits.
+ */
+ if (intel_inst_overcount_flags & INST_RETIRED_OVERCOUNT)
+ return 1 - LOOP_INSNS;
+
__measure(cnt, 0);
/*
* To generate overflow, i.e. roll over to '0', the initial count just
@@ -574,8 +589,12 @@ static void check_counter_overflow(void)
cnt.config &= ~EVNTSEL_INT;
idx = event_to_global_idx(&cnt);
__measure(&cnt, cnt.count);
- if (pmu.is_intel)
- report(cnt.count == 1, "cntr-%d", i);
+ if (pmu.is_intel) {
+ if (intel_inst_overcount_flags & INST_RETIRED_OVERCOUNT)
+ report(cnt.count < 14, "cntr-%d", i);
+ else
+ report(cnt.count == 1, "cntr-%d", i);
+ }
else
report(cnt.count == 0xffffffffffff || cnt.count < 7, "cntr-%d", i);
--
2.43.0
* [kvm-unit-tests patch 5/5] x86/pmu: Expand "llc references" upper limit for broader compatibility
2025-07-12 17:49 [kvm-unit-tests patch 0/5] Fix pmu test errors on SRF/CWF Dapeng Mi
` (3 preceding siblings ...)
2025-07-12 17:49 ` [kvm-unit-tests patch 4/5] x86/pmu: Handle instruction overcount issue in overflow test Dapeng Mi
@ 2025-07-12 17:49 ` Dapeng Mi
4 siblings, 0 replies; 8+ messages in thread
From: Dapeng Mi @ 2025-07-12 17:49 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Jim Mattson, Mingwei Zhang, Zide Chen,
Das Sandipan, Shukla Manali, Yi Lai, Dapeng Mi, dongsheng,
Dapeng Mi
From: dongsheng <dongsheng.x.zhang@intel.com>
Increase the upper limit of the "llc references" test to accommodate
results observed on additional Intel CPU models, including CWF and
SRF.
These CPUs exhibited higher reference counts that previously caused
the test to fail.
Signed-off-by: dongsheng <dongsheng.x.zhang@intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Tested-by: Yi Lai <yi1.lai@intel.com>
---
x86/pmu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/x86/pmu.c b/x86/pmu.c
index c54c0988..445ea6b4 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -116,7 +116,7 @@ struct pmu_event {
{"core cycles", 0x003c, 1*N, 50*N},
{"instructions", 0x00c0, 10*N, 10.2*N},
{"ref cycles", 0x013c, 1*N, 30*N},
- {"llc references", 0x4f2e, 1, 2*N},
+ {"llc references", 0x4f2e, 1, 2.5*N},
{"llc misses", 0x412e, 1, 1*N},
{"branches", 0x00c4, 1*N, 1.1*N},
{"branch misses", 0x00c5, 1, 0.1*N},
--
2.43.0
* Re: [kvm-unit-tests patch 1/5] x86/pmu: Add helper to detect Intel overcount issues
2025-07-12 17:49 ` [kvm-unit-tests patch 1/5] x86/pmu: Add helper to detect Intel overcount issues Dapeng Mi
@ 2025-07-15 13:27 ` Xiaoyao Li
2025-07-16 1:13 ` Mi, Dapeng
0 siblings, 1 reply; 8+ messages in thread
From: Xiaoyao Li @ 2025-07-15 13:27 UTC (permalink / raw)
To: Dapeng Mi, Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Jim Mattson, Mingwei Zhang, Zide Chen,
Das Sandipan, Shukla Manali, Yi Lai, Dapeng Mi, dongsheng
On 7/13/2025 1:49 AM, Dapeng Mi wrote:
> From: dongsheng <dongsheng.x.zhang@intel.com>
>
> For Intel Atom CPUs, the PMU events "Instruction Retired" and
> "Branch Instruction Retired" may be overcounted for certain
> instructions, such as FAR CALL/JMP, RETF, IRET, VMENTRY/VMEXIT/VMPTRLD
> and complex SGX/SMX/CSTATE instructions/flows.
>
> The detailed information can be found in the errata (section SRF7):
> https://edc.intel.com/content/www/us/en/design/products-and-solutions/processors-and-chipsets/sierra-forest/xeon-6700-series-processor-with-e-cores-specification-update/errata-details/
>
> On Atom platforms up to and including Sierra Forest, both the
> "Instruction Retired" and "Branch Instruction Retired" events are
> overcounted on these instructions, whereas on Clearwater Forest only
> the "Instruction Retired" event is overcounted.
>
> So add a helper, detect_inst_overcount_flags(), to detect whether the
> platform has the overcount issue; later patches relax the precise
> count checks by leveraging the overcount flags returned by this
> helper.
>
> Signed-off-by: dongsheng <dongsheng.x.zhang@intel.com>
> [Rewrite comments and commit message - Dapeng]
> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> Tested-by: Yi Lai <yi1.lai@intel.com>
> ---
> lib/x86/processor.h | 17 ++++++++++++++++
> x86/pmu.c | 47 +++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 64 insertions(+)
>
> diff --git a/lib/x86/processor.h b/lib/x86/processor.h
> index 62f3d578..3f475c21 100644
> --- a/lib/x86/processor.h
> +++ b/lib/x86/processor.h
> @@ -1188,4 +1188,21 @@ static inline bool is_lam_u57_enabled(void)
> return !!(read_cr3() & X86_CR3_LAM_U57);
> }
>
> +static inline u32 x86_family(u32 eax)
> +{
> + u32 x86;
> +
> + x86 = (eax >> 8) & 0xf;
> +
> + if (x86 == 0xf)
> + x86 += (eax >> 20) & 0xff;
> +
> + return x86;
> +}
> +
> +static inline u32 x86_model(u32 eax)
> +{
> + return ((eax >> 12) & 0xf0) | ((eax >> 4) & 0x0f);
> +}
It seems to be copied from the KVM selftests implementation.
I need to point out that it's not correct (I recently fixed a similar
issue in QEMU).
We cannot count the Extended Model ID unconditionally. Intel counts the
Extended Model when the (base) Family is 0x6 or 0xF, while AMD counts
the Extended Model only when the (base) Family is 0xF.
You can refer to the kernel's x86_model() in arch/x86/lib/cpu.c, which
optimizes the condition to "family >= 0x6"; this seems to assume that
Intel has no processors with family IDs from 7 to 0xe and AMD has no
processors with family IDs from 6 to 0xe.
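For illustration, a minimal sketch of the vendor-aware decoding
described above, modeled on the kernel's arch/x86/lib/cpu.c (this is
illustrative only, not the code proposed by the patch):

#include <stdint.h>

typedef uint32_t u32;

/* The Extended Family field only counts when the base Family is 0xF. */
static u32 x86_family(u32 eax)
{
	u32 family = (eax >> 8) & 0xf;

	if (family == 0xf)
		family += (eax >> 20) & 0xff;

	return family;
}

/*
 * Count the Extended Model field only for Family 0x6/0xF (Intel) or
 * 0xF (AMD); the kernel folds both vendors' rules into "family >= 0x6".
 */
static u32 x86_model(u32 eax)
{
	u32 family = x86_family(eax);
	u32 model = (eax >> 4) & 0xf;

	if (family >= 0x6)
		model |= ((eax >> 16) & 0xf) << 4;

	return model;
}

int main(void)
{
	/* Hypothetical leaf-1 EAX: family 0x6, model 0xaf (Sierra Forest). */
	u32 eax = 0x000a06f3;

	return !(x86_family(eax) == 0x6 && x86_model(eax) == 0xaf);
}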
* Re: [kvm-unit-tests patch 1/5] x86/pmu: Add helper to detect Intel overcount issues
2025-07-15 13:27 ` Xiaoyao Li
@ 2025-07-16 1:13 ` Mi, Dapeng
0 siblings, 0 replies; 8+ messages in thread
From: Mi, Dapeng @ 2025-07-16 1:13 UTC (permalink / raw)
To: Xiaoyao Li, Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Jim Mattson, Mingwei Zhang, Zide Chen,
Das Sandipan, Shukla Manali, Yi Lai, Dapeng Mi, dongsheng
On 7/15/2025 9:27 PM, Xiaoyao Li wrote:
> On 7/13/2025 1:49 AM, Dapeng Mi wrote:
>> From: dongsheng <dongsheng.x.zhang@intel.com>
>>
>> For Intel Atom CPUs, the PMU events "Instruction Retired" and
>> "Branch Instruction Retired" may be overcounted for certain
>> instructions, such as FAR CALL/JMP, RETF, IRET, VMENTRY/VMEXIT/VMPTRLD
>> and complex SGX/SMX/CSTATE instructions/flows.
>>
>> The detailed information can be found in the errata (section SRF7):
>> https://edc.intel.com/content/www/us/en/design/products-and-solutions/processors-and-chipsets/sierra-forest/xeon-6700-series-processor-with-e-cores-specification-update/errata-details/
>>
>> On Atom platforms up to and including Sierra Forest, both the
>> "Instruction Retired" and "Branch Instruction Retired" events are
>> overcounted on these instructions, whereas on Clearwater Forest only
>> the "Instruction Retired" event is overcounted.
>>
>> So add a helper, detect_inst_overcount_flags(), to detect whether the
>> platform has the overcount issue; later patches relax the precise
>> count checks by leveraging the overcount flags returned by this
>> helper.
>>
>> Signed-off-by: dongsheng <dongsheng.x.zhang@intel.com>
>> [Rewrite comments and commit message - Dapeng]
>> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
>> Tested-by: Yi Lai <yi1.lai@intel.com>
>> ---
>> lib/x86/processor.h | 17 ++++++++++++++++
>> x86/pmu.c | 47 +++++++++++++++++++++++++++++++++++++++++++++
>> 2 files changed, 64 insertions(+)
>>
>> diff --git a/lib/x86/processor.h b/lib/x86/processor.h
>> index 62f3d578..3f475c21 100644
>> --- a/lib/x86/processor.h
>> +++ b/lib/x86/processor.h
>> @@ -1188,4 +1188,21 @@ static inline bool is_lam_u57_enabled(void)
>> return !!(read_cr3() & X86_CR3_LAM_U57);
>> }
>>
>> +static inline u32 x86_family(u32 eax)
>> +{
>> + u32 x86;
>> +
>> + x86 = (eax >> 8) & 0xf;
>> +
>> + if (x86 == 0xf)
>> + x86 += (eax >> 20) & 0xff;
>> +
>> + return x86;
>> +}
>> +
>> +static inline u32 x86_model(u32 eax)
>> +{
>> + return ((eax >> 12) & 0xf0) | ((eax >> 4) & 0x0f);
>> +}
> It seems to be copied from the KVM selftests implementation.
>
> I need to point out that it's not correct (I recently fixed a similar
> issue in QEMU).
>
> We cannot count the Extended Model ID unconditionally. Intel counts the
> Extended Model when the (base) Family is 0x6 or 0xF, while AMD counts
> the Extended Model only when the (base) Family is 0xF.
>
> You can refer to the kernel's x86_model() in arch/x86/lib/cpu.c, which
> optimizes the condition to "family >= 0x6"; this seems to assume that
> Intel has no processors with family IDs from 7 to 0xe and AMD has no
> processors with family IDs from 6 to 0xe.
Sure. Thanks for reviewing.