From: Sean Christopherson <seanjc@google.com>
To: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
Jim Mattson <jmattson@google.com>,
Mingwei Zhang <mizhang@google.com>,
Zide Chen <zide.chen@intel.com>,
Das Sandipan <Sandipan.Das@amd.com>,
Shukla Manali <Manali.Shukla@amd.com>,
Yi Lai <yi1.lai@intel.com>, Dapeng Mi <dapeng1.mi@intel.com>,
dongsheng <dongsheng.x.zhang@intel.com>
Subject: Re: [PATCH v2 4/5] KVM: selftests: Relax precise event count validation as overcount issue
Date: Wed, 10 Sep 2025 16:56:52 -0700
Message-ID: <aMIQRGRg59dvcHaP@google.com>
In-Reply-To: <20250718001905.196989-5-dapeng1.mi@linux.intel.com>
On Fri, Jul 18, 2025, Dapeng Mi wrote:
> From: dongsheng <dongsheng.x.zhang@intel.com>
>
> For Intel Atom CPUs, the PMU events "Instruction Retired" or
> "Branch Instruction Retired" may be overcounted for some certain
> instructions, like FAR CALL/JMP, RETF, IRET, VMENTRY/VMEXIT/VMPTRLD
> and complex SGX/SMX/CSTATE instructions/flows.
>
> The detailed information can be found in the errata (section SRF7):
> https://edc.intel.com/content/www/us/en/design/products-and-solutions/processors-and-chipsets/sierra-forest/xeon-6700-series-processor-with-e-cores-specification-update/errata-details/
>
> For the Atom platforms before Sierra Forest (including Sierra Forest),
> Both 2 events "Instruction Retired" and "Branch Instruction Retired" would
> be overcounted on these certain instructions, but for Clearwater Forest
> only "Instruction Retired" event is overcounted on these instructions.
>
> As the overcount issue on VM-Exit/VM-Entry, it has no way to validate
> the precise count for these 2 events on these affected Atom platforms,
> so just relax the precise event count check for these 2 events on these
> Atom platforms.
>
> Signed-off-by: dongsheng <dongsheng.x.zhang@intel.com>
> Co-developed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> Tested-by: Yi Lai <yi1.lai@intel.com>
> ---
...
> diff --git a/tools/testing/selftests/kvm/x86/pmu_counters_test.c b/tools/testing/selftests/kvm/x86/pmu_counters_test.c
> index 342a72420177..074cdf323406 100644
> --- a/tools/testing/selftests/kvm/x86/pmu_counters_test.c
> +++ b/tools/testing/selftests/kvm/x86/pmu_counters_test.c
> @@ -52,6 +52,9 @@ struct kvm_intel_pmu_event {
> struct kvm_x86_pmu_feature fixed_event;
> };
>
> +
> +static uint8_t inst_overcount_flags;
> +
> /*
> * Wrap the array to appease the compiler, as the macros used to construct each
> * kvm_x86_pmu_feature use syntax that's only valid in function scope, and the
> @@ -163,10 +166,18 @@ static void guest_assert_event_count(uint8_t idx, uint32_t pmc, uint32_t pmc_msr
>
> switch (idx) {
> case INTEL_ARCH_INSTRUCTIONS_RETIRED_INDEX:
> - GUEST_ASSERT_EQ(count, NUM_INSNS_RETIRED);
> + /* Relax precise count check due to VM-EXIT/VM-ENTRY overcount issue */
> + if (inst_overcount_flags & INST_RETIRED_OVERCOUNT)
> + GUEST_ASSERT(count >= NUM_INSNS_RETIRED);
> + else
> + GUEST_ASSERT_EQ(count, NUM_INSNS_RETIRED);
> break;
> case INTEL_ARCH_BRANCHES_RETIRED_INDEX:
> - GUEST_ASSERT_EQ(count, NUM_BRANCH_INSNS_RETIRED);
> + /* Relax precise count check due to VM-EXIT/VM-ENTRY overcount issue */
> + if (inst_overcount_flags & BR_RETIRED_OVERCOUNT)
> + GUEST_ASSERT(count >= NUM_BRANCH_INSNS_RETIRED);
> + else
> + GUEST_ASSERT_EQ(count, NUM_BRANCH_INSNS_RETIRED);
> break;
> case INTEL_ARCH_LLC_REFERENCES_INDEX:
> case INTEL_ARCH_LLC_MISSES_INDEX:
> @@ -335,6 +346,7 @@ static void test_arch_events(uint8_t pmu_version, uint64_t perf_capabilities,
> length);
> vcpu_set_cpuid_property(vcpu, X86_PROPERTY_PMU_EVENTS_MASK,
> unavailable_mask);
> + sync_global_to_guest(vm, inst_overcount_flags);

Rather than force individual tests to sync_global_to_guest(), and to cache the
value, I think it makes sense to handle this automatically in kvm_arch_vm_post_create(),
similar to things like host_cpu_is_intel and host_cpu_is_amd.

And explicitly call these out as errata, so that it's super clear that we're
working around PMU/CPU flaws, not KVM bugs.  With some shenanigans, we can even
reuse the this_pmu_has()/this_cpu_has() terminology as this_pmu_has_errata(), and
hide the use of a bitmask too.

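To illustrate the bitmask-hiding idea in isolation, here is a rough standalone
sketch. It is not the selftests code. pmu_errata_mask, init_pmu_errata() and
event_count_ok() are hypothetical names made up for this example, the enum
layout is an assumption, and host CPU model detection is replaced by two
booleans supplied by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical errata identifiers; each value is a bit position in the mask. */
enum pmu_errata {
	INSTRUCTIONS_RETIRED_OVERCOUNT,
	BRANCHES_RETIRED_OVERCOUNT,
	NR_PMU_ERRATA,
};

/* Hypothetical global: computed once, then consulted by every check. */
static uint64_t pmu_errata_mask;

/*
 * Stand-in for host CPU model detection, which is deliberately left out.
 * The caller states which events overcount on the platform.
 */
void init_pmu_errata(bool insns_overcount, bool branches_overcount)
{
	pmu_errata_mask = 0;
	if (insns_overcount)
		pmu_errata_mask |= UINT64_C(1) << INSTRUCTIONS_RETIRED_OVERCOUNT;
	if (branches_overcount)
		pmu_errata_mask |= UINT64_C(1) << BRANCHES_RETIRED_OVERCOUNT;
}

/* Callers name an erratum and never touch the mask directly. */
bool this_pmu_has_errata(enum pmu_errata errata)
{
	assert(errata < NR_PMU_ERRATA);
	return pmu_errata_mask & (UINT64_C(1) << errata);
}

/* Exact match by default, "at least expected" when the erratum applies. */
bool event_count_ok(enum pmu_errata errata, uint64_t count, uint64_t expected)
{
	return this_pmu_has_errata(errata) ? count >= expected : count == expected;
}
```

In this sketch a test would call init_pmu_errata() once up front and then use
event_count_ok() wherever it currently compares a count against an exact value.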
diff --git a/tools/testing/selftests/kvm/x86/pmu_counters_test.c b/tools/testing/selftests/kvm/x86/pmu_counters_test.c
index d4f90f5ec5b8..046d992c5940 100644
--- a/tools/testing/selftests/kvm/x86/pmu_counters_test.c
+++ b/tools/testing/selftests/kvm/x86/pmu_counters_test.c
@@ -163,10 +163,18 @@ static void guest_assert_event_count(uint8_t idx, uint32_t pmc, uint32_t pmc_msr
switch (idx) {
case INTEL_ARCH_INSTRUCTIONS_RETIRED_INDEX:
- GUEST_ASSERT_EQ(count, NUM_INSNS_RETIRED);
+ /* Relax precise count check due to VM-EXIT/VM-ENTRY overcount issue */
+ if (this_pmu_has_errata(INSTRUCTIONS_RETIRED_OVERCOUNT))
+ GUEST_ASSERT(count >= NUM_INSNS_RETIRED);
+ else
+ GUEST_ASSERT_EQ(count, NUM_INSNS_RETIRED);
break;
case INTEL_ARCH_BRANCHES_RETIRED_INDEX:
- GUEST_ASSERT_EQ(count, NUM_BRANCH_INSNS_RETIRED);
+ /* Relax precise count check due to VM-EXIT/VM-ENTRY overcount issue */
+ if (this_pmu_has_errata(BRANCHES_RETIRED_OVERCOUNT))
+ GUEST_ASSERT(count >= NUM_BRANCH_INSNS_RETIRED);
+ else
+ GUEST_ASSERT_EQ(count, NUM_BRANCH_INSNS_RETIRED);
break;
case INTEL_ARCH_LLC_REFERENCES_INDEX:
case INTEL_ARCH_LLC_MISSES_INDEX:
diff --git a/tools/testing/selftests/kvm/x86/pmu_event_filter_test.c b/tools/testing/selftests/kvm/x86/pmu_event_filter_test.c
index c15513cd74d1..1c5b7611db24 100644
--- a/tools/testing/selftests/kvm/x86/pmu_event_filter_test.c
+++ b/tools/testing/selftests/kvm/x86/pmu_event_filter_test.c
@@ -214,8 +214,10 @@ static void remove_event(struct __kvm_pmu_event_filter *f, uint64_t event)
do { \
uint64_t br = pmc_results.branches_retired; \
uint64_t ir = pmc_results.instructions_retired; \
+ bool br_matched = this_pmu_has_errata(BRANCHES_RETIRED_OVERCOUNT) ? \
+ br >= NUM_BRANCHES : br == NUM_BRANCHES; \
\
- if (br && br != NUM_BRANCHES) \
+ if (br && !br_matched) \
pr_info("%s: Branch instructions retired = %lu (expected %u)\n", \
__func__, br, NUM_BRANCHES); \
TEST_ASSERT(br, "%s: Branch instructions retired = %lu (expected > 0)", \