From: Zide Chen <zide.chen@intel.com>
To: Sean Christopherson <seanjc@google.com>,
Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
Jim Mattson <jmattson@google.com>,
Mingwei Zhang <mizhang@google.com>,
Zide Chen <zide.chen@intel.com>,
Das Sandipan <Sandipan.Das@amd.com>,
Shukla Manali <Manali.Shukla@amd.com>,
Dapeng Mi <dapeng1.mi@linux.intel.com>,
Falcon Thomas <thomas.falcon@intel.com>,
Xudong Hao <xudong.hao@intel.com>
Subject: [PATCH V3 0/4] KVM: x86/pmu: Add hardware Topdown metrics support
Date: Mon, 15 Jun 2026 16:01:14 -0700
Message-ID: <20260615230118.50718-1-zide.chen@intel.com>
The Top-Down Microarchitecture Analysis (TMA) method is a structured
approach for identifying performance bottlenecks in out-of-order
processors.
Currently, guests support the TMA method by collecting Topdown events
using GP counters, which may trigger multiplexing. To free up scarce
GP counters, eliminate multiplexing-induced skew, and obtain coherent
Topdown metric ratios, it is desirable to expose fixed counter 3 and
the IA32_PERF_METRICS MSR to guests.
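For readers unfamiliar with the metrics MSR, here is a minimal sketch of how its value relates to fixed counter 3. It assumes the architectural level-1 layout (one byte per metric, in the order retiring, bad speculation, frontend bound, backend bound, each a fraction of 0xff) and is not code from this series; the names td_metric, td_metric_raw() and td_metric_slots() are hypothetical.

```c
#include <stdint.h>

/* Hypothetical names; level-1 Topdown metrics, one byte each in the MSR value. */
enum td_metric {
	TD_RETIRING = 0,	/* bits 7:0 */
	TD_BAD_SPEC = 1,	/* bits 15:8 */
	TD_FE_BOUND = 2,	/* bits 23:16 */
	TD_BE_BOUND = 3,	/* bits 31:24 */
};

/* Extract the raw 8-bit fraction of one metric (0xff is roughly 100%). */
static inline uint8_t td_metric_raw(uint64_t perf_metrics, enum td_metric m)
{
	return (perf_metrics >> (m * 8)) & 0xff;
}

/*
 * Scale a metric fraction by the SLOTS count read from fixed counter 3.
 * Assumes slots fits in 56 bits so the multiplication cannot overflow.
 */
static inline uint64_t td_metric_slots(uint64_t perf_metrics, enum td_metric m,
				       uint64_t slots)
{
	return slots * td_metric_raw(perf_metrics, m) / 0xff;
}
```

Because all four fractions come from a single MSR read, they describe the same interval, which is what makes the resulting ratios coherent compared with four separately multiplexed GP counters.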
Several attempts have been made to virtualize this under the legacy
vPMU model [1][2][3], but they were unsuccessful. With the new mediated
vPMU, enabling TMA support in guests becomes much simpler. It avoids
invasive changes to the perf core, eliminates CPU pinning and
fixed-counter affinity issues, and reduces the large overhead of
trapping and emulating MSR accesses.
[1] https://lore.kernel.org/kvm/20231031090613.2872700-1-dapeng1.mi@linux.intel.com/
[2] https://lore.kernel.org/all/20230927033124.1226509-1-dapeng1.mi@linux.intel.com/T/
[3] https://lwn.net/ml/linux-kernel/20221212125844.41157-1-likexu@tencent.com/
Tested on a Sapphire Rapids (SPR) system. Without this series, only raw topdown.*_slots events
work in the guest, and metric events (e.g. cpu/topdown-bad-spec/) are
not available.
With this series, metric events are visible in the guest. Run this
command on both host and guest:
$ perf stat --topdown --no-metric-only -- taskset -c 2 perf bench sched messaging
Host results:
# Running 'sched/messaging' benchmark:
# 20 sender and receiver processes per group
# 10 groups == 400 processes run
Total time: 1.500 [sec]
Performance counter stats for 'taskset -c 2 perf bench sched messaging':
4,266,060,558 TOPDOWN.SLOTS:u # 32.0 % tma_frontend_bound
# 5.2 % tma_bad_speculation
588,397,905 topdown-retiring:u # 13.8 % tma_retiring
# 49.0 % tma_backend_bound
1,376,283,990 topdown-fe-bound:u
2,096,827,304 topdown-be-bound:u
217,425,841 topdown-bad-spec:u
5,050,520 INT_MISC.UOP_DROPPING:u
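As a cross-check on the host numbers above, the sketch below recomputes the four tma_* percentages from the raw counts. The formulas are an assumption based on how perf's level-1 TMA metrics are commonly defined for this platform (each topdown-* count divided by the sum of the four, with INT_MISC.UOP_DROPPING / SLOTS subtracted from the frontend share, and bad speculation taken as the remainder); the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Raw event counts as printed by perf stat (hypothetical struct). */
struct td_counts {
	uint64_t slots, retiring, fe_bound, be_bound, bad_spec, uop_dropping;
};

/* Level-1 ratios in the range 0.0 to 1.0. */
struct td_ratios {
	double frontend_bound, bad_speculation, retiring, backend_bound;
};

/* Host sample copied from the perf output above. */
static const struct td_counts host_sample = {
	.slots        = 4266060558ULL,
	.retiring     =  588397905ULL,
	.fe_bound     = 1376283990ULL,
	.be_bound     = 2096827304ULL,
	.bad_spec     =  217425841ULL,
	.uop_dropping =    5050520ULL,
};

/* Assumed metric formulas; see the lead-in for the caveats. */
static inline struct td_ratios td_level1(const struct td_counts *c)
{
	double sum = (double)c->retiring + (double)c->fe_bound +
		     (double)c->be_bound + (double)c->bad_spec;
	struct td_ratios r;

	r.frontend_bound = (double)c->fe_bound / sum -
			   (double)c->uop_dropping / (double)c->slots;
	r.backend_bound = (double)c->be_bound / sum;
	r.retiring = (double)c->retiring / sum;
	/* Bad speculation is whatever the other three leave over, clamped at 0. */
	r.bad_speculation = 1.0 - (r.frontend_bound + r.backend_bound + r.retiring);
	if (r.bad_speculation < 0.0)
		r.bad_speculation = 0.0;
	return r;
}
```

Under these assumptions, td_level1(&host_sample) yields roughly 0.320, 0.052, 0.138 and 0.490 for frontend bound, bad speculation, retiring and backend bound, which agrees with the percentages perf printed.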
Only minor changes in v3.
Rebased to kvm-x86/next: c1f730330292
v3 changes:
- patch 2/4: Move the non-contiguous counter filter code to pmu.c (Dapeng)
- patch 3/4: Replace WARN_ON() with WARN_ON_ONCE(). (Dapeng)
- patch 4/4: Replace abs() with explicit bounds (sum >= 0xfd && sum <= 0x102).
- Minor comment cleanups.
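To illustrate the explicit-bounds check mentioned for patch 4/4, here is a minimal sketch. It assumes that "sum" is the sum of the four level-1 metric bytes of a PERF_METRICS value, which should land close to 0xff; the helper name td_l1_sum_is_sane() is hypothetical and this is not the selftest code itself.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: add up the four level-1 metric bytes (bits 31:0)
 * and accept the value only if the sum is within the bounds quoted in
 * the changelog, i.e. a small tolerance around 0xff.
 */
static inline bool td_l1_sum_is_sane(uint64_t perf_metrics)
{
	unsigned int sum = 0;

	for (int i = 0; i < 4; i++)
		sum += (perf_metrics >> (i * 8)) & 0xff;

	return sum >= 0xfd && sum <= 0x102;
}
```

Writing the range out explicitly keeps the arithmetic unsigned and makes the accepted tolerance (a few counts below and above 0xff) visible at the call site.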
v2 changes:
- As suggested by Dapeng, implement a new selftest patch.
- Don't advertise fixed counter 3 if the host doesn't support it.
- Minor change in patch 1 to remove a magic number.
v2:
https://lore.kernel.org/kvm/20260423174639.56149-1-zide.chen@intel.com/T/#u
v1:
https://lore.kernel.org/kvm/20260226230606.146532-1-zide.chen@intel.com/T/#t
QEMU:
https://lore.kernel.org/qemu-devel/20260604025546.19378-7-zide.chen@intel.com/
Dapeng Mi (2):
KVM: x86/pmu: Support Intel fixed counter 3 on mediated vPMU
KVM: x86/pmu: Support PERF_METRICS MSR in mediated vPMU
Zide Chen (2):
KVM: x86/pmu: Do not map fixed counters >= 3 to generic perf events
KVM: selftests: Add perf_metrics and fixed counter 3 tests
arch/x86/include/asm/kvm_host.h | 3 +-
arch/x86/include/asm/msr-index.h | 1 +
arch/x86/include/asm/perf_event.h | 1 +
arch/x86/kvm/pmu.c | 18 +++++
arch/x86/kvm/vmx/pmu_intel.c | 62 ++++++++++++----
arch/x86/kvm/vmx/pmu_intel.h | 5 ++
arch/x86/kvm/vmx/vmx.c | 6 ++
arch/x86/kvm/x86.c | 10 ++-
tools/arch/x86/include/asm/msr-index.h | 1 +
tools/testing/selftests/kvm/include/x86/pmu.h | 3 +
.../selftests/kvm/x86/pmu_counters_test.c | 72 +++++++++++++++++--
11 files changed, 161 insertions(+), 21 deletions(-)
--
2.54.0