From: Sean Christopherson <seanjc@google.com>
To: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
Jim Mattson <jmattson@google.com>,
Mingwei Zhang <mizhang@google.com>,
Zide Chen <zide.chen@intel.com>,
Das Sandipan <Sandipan.Das@amd.com>,
Shukla Manali <Manali.Shukla@amd.com>,
Yi Lai <yi1.lai@intel.com>, Dapeng Mi <dapeng1.mi@intel.com>,
dongsheng <dongsheng.x.zhang@intel.com>
Subject: Re: [PATCH v2 4/5] KVM: selftests: Relax precise event count validation as overcount issue
Date: Wed, 10 Sep 2025 16:56:52 -0700
Message-ID: <aMIQRGRg59dvcHaP@google.com>
In-Reply-To: <20250718001905.196989-5-dapeng1.mi@linux.intel.com>
On Fri, Jul 18, 2025, Dapeng Mi wrote:
> From: dongsheng <dongsheng.x.zhang@intel.com>
>
> For Intel Atom CPUs, the PMU events "Instruction Retired" or
> "Branch Instruction Retired" may be overcounted for some certain
> instructions, like FAR CALL/JMP, RETF, IRET, VMENTRY/VMEXIT/VMPTRLD
> and complex SGX/SMX/CSTATE instructions/flows.
>
> The detailed information can be found in the errata (section SRF7):
> https://edc.intel.com/content/www/us/en/design/products-and-solutions/processors-and-chipsets/sierra-forest/xeon-6700-series-processor-with-e-cores-specification-update/errata-details/
>
> For the Atom platforms before Sierra Forest (including Sierra Forest),
> Both 2 events "Instruction Retired" and "Branch Instruction Retired" would
> be overcounted on these certain instructions, but for Clearwater Forest
> only "Instruction Retired" event is overcounted on these instructions.
>
> As the overcount issue on VM-Exit/VM-Entry, it has no way to validate
> the precise count for these 2 events on these affected Atom platforms,
> so just relax the precise event count check for these 2 events on these
> Atom platforms.
>
> Signed-off-by: dongsheng <dongsheng.x.zhang@intel.com>
> Co-developed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> Tested-by: Yi Lai <yi1.lai@intel.com>
> ---
...
> diff --git a/tools/testing/selftests/kvm/x86/pmu_counters_test.c b/tools/testing/selftests/kvm/x86/pmu_counters_test.c
> index 342a72420177..074cdf323406 100644
> --- a/tools/testing/selftests/kvm/x86/pmu_counters_test.c
> +++ b/tools/testing/selftests/kvm/x86/pmu_counters_test.c
> @@ -52,6 +52,9 @@ struct kvm_intel_pmu_event {
> struct kvm_x86_pmu_feature fixed_event;
> };
>
> +
> +static uint8_t inst_overcount_flags;
> +
> /*
> * Wrap the array to appease the compiler, as the macros used to construct each
> * kvm_x86_pmu_feature use syntax that's only valid in function scope, and the
> @@ -163,10 +166,18 @@ static void guest_assert_event_count(uint8_t idx, uint32_t pmc, uint32_t pmc_msr
>
> switch (idx) {
> case INTEL_ARCH_INSTRUCTIONS_RETIRED_INDEX:
> - GUEST_ASSERT_EQ(count, NUM_INSNS_RETIRED);
> + /* Relax precise count check due to VM-EXIT/VM-ENTRY overcount issue */
> + if (inst_overcount_flags & INST_RETIRED_OVERCOUNT)
> + GUEST_ASSERT(count >= NUM_INSNS_RETIRED);
> + else
> + GUEST_ASSERT_EQ(count, NUM_INSNS_RETIRED);
> break;
> case INTEL_ARCH_BRANCHES_RETIRED_INDEX:
> - GUEST_ASSERT_EQ(count, NUM_BRANCH_INSNS_RETIRED);
> + /* Relax precise count check due to VM-EXIT/VM-ENTRY overcount issue */
> + if (inst_overcount_flags & BR_RETIRED_OVERCOUNT)
> + GUEST_ASSERT(count >= NUM_BRANCH_INSNS_RETIRED);
> + else
> + GUEST_ASSERT_EQ(count, NUM_BRANCH_INSNS_RETIRED);
> break;
> case INTEL_ARCH_LLC_REFERENCES_INDEX:
> case INTEL_ARCH_LLC_MISSES_INDEX:
> @@ -335,6 +346,7 @@ static void test_arch_events(uint8_t pmu_version, uint64_t perf_capabilities,
> length);
> vcpu_set_cpuid_property(vcpu, X86_PROPERTY_PMU_EVENTS_MASK,
> unavailable_mask);
> + sync_global_to_guest(vm, inst_overcount_flags);
Rather than force individual tests to sync_global_to_guest(), and to cache the
value, I think it makes sense to handle this automatically in kvm_arch_vm_post_create(),
similar to things like host_cpu_is_intel and host_cpu_is_amd.

And explicitly call these out as errata, so that it's super clear that we're
working around PMU/CPU flaws, not KVM bugs.  With some shenanigans, we can even
reuse the this_pmu_has()/this_cpu_has() terminology as this_pmu_has_errata(), and
hide the use of a bitmask too.
diff --git a/tools/testing/selftests/kvm/x86/pmu_counters_test.c b/tools/testing/selftests/kvm/x86/pmu_counters_test.c
index d4f90f5ec5b8..046d992c5940 100644
--- a/tools/testing/selftests/kvm/x86/pmu_counters_test.c
+++ b/tools/testing/selftests/kvm/x86/pmu_counters_test.c
@@ -163,10 +163,18 @@ static void guest_assert_event_count(uint8_t idx, uint32_t pmc, uint32_t pmc_msr
switch (idx) {
case INTEL_ARCH_INSTRUCTIONS_RETIRED_INDEX:
- GUEST_ASSERT_EQ(count, NUM_INSNS_RETIRED);
+ /* Relax precise count check due to VM-EXIT/VM-ENTRY overcount issue */
+ if (this_pmu_has_errata(INSTRUCTIONS_RETIRED_OVERCOUNT))
+ GUEST_ASSERT(count >= NUM_INSNS_RETIRED);
+ else
+ GUEST_ASSERT_EQ(count, NUM_INSNS_RETIRED);
break;
case INTEL_ARCH_BRANCHES_RETIRED_INDEX:
- GUEST_ASSERT_EQ(count, NUM_BRANCH_INSNS_RETIRED);
+ /* Relax precise count check due to VM-EXIT/VM-ENTRY overcount issue */
+ if (this_pmu_has_errata(BRANCHES_RETIRED_OVERCOUNT))
+ GUEST_ASSERT(count >= NUM_BRANCH_INSNS_RETIRED);
+ else
+ GUEST_ASSERT_EQ(count, NUM_BRANCH_INSNS_RETIRED);
break;
case INTEL_ARCH_LLC_REFERENCES_INDEX:
case INTEL_ARCH_LLC_MISSES_INDEX:
diff --git a/tools/testing/selftests/kvm/x86/pmu_event_filter_test.c b/tools/testing/selftests/kvm/x86/pmu_event_filter_test.c
index c15513cd74d1..1c5b7611db24 100644
--- a/tools/testing/selftests/kvm/x86/pmu_event_filter_test.c
+++ b/tools/testing/selftests/kvm/x86/pmu_event_filter_test.c
@@ -214,8 +214,10 @@ static void remove_event(struct __kvm_pmu_event_filter *f, uint64_t event)
do { \
uint64_t br = pmc_results.branches_retired; \
uint64_t ir = pmc_results.instructions_retired; \
+ bool br_matched = this_pmu_has_errata(BRANCHES_RETIRED_OVERCOUNT) ? \
+ br >= NUM_BRANCHES : br == NUM_BRANCHES; \
\
- if (br && br != NUM_BRANCHES) \
+ if (br && !br_matched) \
pr_info("%s: Branch instructions retired = %lu (expected %u)\n", \
__func__, br, NUM_BRANCHES); \
TEST_ASSERT(br, "%s: Branch instructions retired = %lu (expected > 0)", \
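
The diff above calls this_pmu_has_errata() but doesn't show its definition.
As a rough, standalone sketch of the idea (not the actual selftests code): the
enum layout, pmu_errata_mask, pmu_set_errata() and insn_count_ok() below are
all assumptions made up for illustration, and the real helper would presumably
be populated on the host and synced to the guest at VM creation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical errata IDs.  Each enumerator doubles as a bit position, so
 * callers name an erratum and never touch the underlying bitmask.
 */
enum pmu_errata {
	INSTRUCTIONS_RETIRED_OVERCOUNT,
	BRANCHES_RETIRED_OVERCOUNT,
	NR_PMU_ERRATA,
};

/* Assumed global; a real test would fill this in from the host CPU model. */
static uint64_t pmu_errata_mask;

/* Illustration-only helper to flag an erratum as present. */
static inline void pmu_set_errata(enum pmu_errata errata)
{
	assert(errata < NR_PMU_ERRATA);
	pmu_errata_mask |= UINT64_C(1) << errata;
}

/* Mirrors the this_cpu_has()/this_pmu_has() naming used in the reply. */
static inline bool this_pmu_has_errata(enum pmu_errata errata)
{
	return pmu_errata_mask & (UINT64_C(1) << errata);
}

/* Exact match normally; only a lower bound when the CPU may overcount. */
static inline bool insn_count_ok(uint64_t count, uint64_t expected)
{
	if (this_pmu_has_errata(INSTRUCTIONS_RETIRED_OVERCOUNT))
		return count >= expected;
	return count == expected;
}
```

Using enumerators as bit indices keeps the call sites readable, e.g.
this_pmu_has_errata(BRANCHES_RETIRED_OVERCOUNT), while leaving the storage
format free to change later.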