linux-pm.vger.kernel.org archive mirror
* [RFC PATCH 00/22] KVM: x86: Virtualize IA32_APERF and IA32_MPERF MSRs
@ 2024-11-21 18:52 Mingwei Zhang
  2024-11-21 18:52 ` [RFC PATCH 01/22] x86/aperfmperf: Introduce get_host_[am]perf() Mingwei Zhang
                   ` (23 more replies)
  0 siblings, 24 replies; 35+ messages in thread
From: Mingwei Zhang @ 2024-11-21 18:52 UTC
  To: Sean Christopherson, Paolo Bonzini, Huang Rui, Gautham R. Shenoy,
	Mario Limonciello, Rafael J. Wysocki, Viresh Kumar,
	Srinivas Pandruvada, Len Brown
  Cc: H. Peter Anvin, Perry Yuan, kvm, linux-kernel, linux-pm,
	Jim Mattson, Mingwei Zhang

Linux guests read IA32_APERF and IA32_MPERF on every scheduler tick
(250 Hz by default) to measure their effective CPU frequency. To avoid
the overhead of intercepting these frequent MSR reads, allow the guest
to read them directly by loading guest values into the hardware MSRs.
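
For readers unfamiliar with how these counters are consumed, here is a
minimal user-space sketch of the usual ratio calculation: effective
frequency is the base frequency scaled by delta(APERF) / delta(MPERF)
over a sampling window. The function name effective_khz() and the
base_khz parameter are assumptions made for illustration; this is not
code from the series or from the guest scheduler.

```c
#include <stdint.h>

/*
 * Effective frequency over one sampling window, in kHz.
 * base_khz is the rate at which MPERF counts (an assumed input).
 * Returns 0 if MPERF did not advance during the window.
 */
static inline uint64_t effective_khz(uint64_t base_khz,
                                     uint64_t aperf_prev, uint64_t mperf_prev,
                                     uint64_t aperf_now, uint64_t mperf_now)
{
    /* Unsigned subtraction stays correct across 64-bit wraparound. */
    uint64_t d_aperf = aperf_now - aperf_prev;
    uint64_t d_mperf = mperf_now - mperf_prev;

    if (d_mperf == 0)
        return 0;

    /* Illustration only: the product can overflow for very long windows. */
    return base_khz * d_aperf / d_mperf;
}
```

Because only deltas matter, the guest does not care about the absolute
counter values, only that both counters advance consistently between
two reads.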

These MSRs are continuously running counters whose values must be
carefully tracked during all vCPU state transitions:
- Guest IA32_APERF advances only during guest execution
- Guest IA32_MPERF advances at the TSC frequency whenever the vCPU is
  in C0 state, even when not actively running
- Host kernel access is redirected through get_host_[am]perf() which
  adds per-CPU offsets to the hardware MSR values
- Remote MSR reads through /dev/cpu/*/msr also account for these
  offsets
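
The offset bookkeeping described above can be sketched as a small
user-space model. Everything here is hypothetical: struct sim_counter
and the sim_*() helpers only imitate the idea behind
get_host_[am]perf(), set_guest_[am]perf() and restore_host_[am]perf(),
with a plain variable standing in for the hardware MSR. The actual
patches may differ in detail.

```c
#include <stdint.h>

/* Hypothetical model of one counter on one CPU. */
struct sim_counter {
    uint64_t hw;          /* value currently held in the "hardware MSR" */
    uint64_t host_offset; /* host value minus hardware value */
};

/* Host view: hardware value adjusted by the per-CPU offset. */
static inline uint64_t sim_get_host(const struct sim_counter *c)
{
    return c->hw + c->host_offset;
}

/* Load a guest value into hardware without disturbing the host view. */
static inline void sim_set_guest(struct sim_counter *c, uint64_t guest_val)
{
    c->host_offset += c->hw - guest_val; /* modular arithmetic, may wrap */
    c->hw = guest_val;
}

/* Put the host value back into hardware and clear the offset. */
static inline void sim_restore_host(struct sim_counter *c)
{
    c->hw += c->host_offset;
    c->host_offset = 0;
}

/* Stand-in for the counter advancing in hardware. */
static inline void sim_tick(struct sim_counter *c, uint64_t cycles)
{
    c->hw += cycles;
}
```

In this model the host-visible value keeps advancing across a guest
load, since cycles counted while the guest value is in hardware are
still added to the host total through the offset.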

Guest values persist in hardware while the vCPU is loaded and
running. Host MSR values are restored on vcpu_put (either at KVM_RUN
completion or when preempted) and when transitioning to halt state.
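
As a rough illustration of the checkpoint idea, the sketch below saves
the guest counters together with a TSC timestamp when they leave
hardware, then credits MPERF for the elapsed TSC cycles before they are
loaded again, unless the vCPU was halted. The struct sim_vcpu and the
sim_*() names are invented for this example, and it assumes MPERF
counts one-to-one with the TSC (no guest TSC scaling).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-vCPU checkpoint of the guest counters. */
struct sim_vcpu {
    uint64_t guest_aperf;
    uint64_t guest_mperf;
    uint64_t tsc_at_checkpoint;
    bool halted; /* true if the checkpoint was taken because of HLT */
};

/* Save guest counters as read from hardware, plus the current TSC. */
static inline void sim_checkpoint(struct sim_vcpu *v, uint64_t hw_aperf,
                                  uint64_t hw_mperf, uint64_t tsc_now,
                                  bool halted)
{
    v->guest_aperf = hw_aperf;
    v->guest_mperf = hw_mperf;
    v->tsc_at_checkpoint = tsc_now;
    v->halted = halted;
}

/* Bring the checkpoint up to date before reloading it into hardware. */
static inline void sim_catch_up(struct sim_vcpu *v, uint64_t tsc_now)
{
    /* MPERF keeps counting in C0 even while the vCPU is descheduled. */
    if (!v->halted)
        v->guest_mperf += tsc_now - v->tsc_at_checkpoint;

    /* APERF only advances during guest execution, so it is unchanged. */
    v->tsc_at_checkpoint = tsc_now;
}
```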

Note that guest TSC scaling via KVM_SET_TSC_KHZ is not supported, as
it would require either intercepting MPERF reads on Intel (where MPERF
ticks at host rate regardless of guest TSC scaling) or significantly
complicating the cycle accounting on AMD.

The host must have both CONSTANT_TSC and NONSTOP_TSC capabilities,
since these ensure a stable TSC frequency across C-states and P-states,
which is required for accurate background MPERF accounting.
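
From user space, a comparable check can be made against the "flags"
line of /proc/cpuinfo, where these capabilities appear as constant_tsc
and nonstop_tsc. The helpers below are a sketch with invented names
(has_cpu_flag(), host_tsc_is_suitable()); in the kernel the test would
presumably be made against the CPU feature bits rather than a string.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if `name` occurs as a whole whitespace-separated token in `flags`. */
static inline bool has_cpu_flag(const char *flags, const char *name)
{
    size_t len = strlen(name);
    const char *p = flags;

    if (len == 0)
        return false;

    while ((p = strstr(p, name)) != NULL) {
        bool starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
        bool ends = p[len] == '\0' || p[len] == ' ' || p[len] == '\n';

        if (starts && ends)
            return true;
        p++; /* partial match (e.g. a longer flag name); keep scanning */
    }
    return false;
}

/* Both capabilities are needed for background MPERF accounting. */
static inline bool host_tsc_is_suitable(const char *flags)
{
    return has_cpu_flag(flags, "constant_tsc") &&
           has_cpu_flag(flags, "nonstop_tsc");
}
```

Matching whole tokens matters here because "tsc" on its own is also a
flag and is a substring of both names.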

Jim Mattson (14):
  x86/aperfmperf: Introduce get_host_[am]perf()
  x86/aperfmperf: Introduce set_guest_[am]perf()
  x86/aperfmperf: Introduce restore_host_[am]perf()
  x86/msr: Adjust remote reads of IA32_[AM]PERF by the per-cpu host
    offset
  KVM: x86: Introduce kvm_vcpu_make_runnable()
  KVM: x86: INIT may transition from HALTED to RUNNABLE
  KVM: nSVM: Nested #VMEXIT may transition from HALTED to RUNNABLE
  KVM: nVMX: Nested VM-exit may transition from HALTED to RUNNABLE
  KVM: x86: Make APERFMPERF a governed feature
  KVM: x86: Initialize guest [am]perf at vcpu power-on
  KVM: x86: Load guest [am]perf when leaving halt state
  KVM: x86: Introduce kvm_user_return_notifier_register()
  KVM: x86: Restore host IA32_[AM]PERF on userspace return
  KVM: x86: Update aperfmperf on host-initiated MP_STATE transitions

Mingwei Zhang (8):
  KVM: x86: Introduce KVM_X86_FEATURE_APERFMPERF
  KVM: x86: Load guest [am]perf into hardware MSRs at vcpu_load()
  KVM: x86: Save guest [am]perf checkpoint on HLT
  KVM: x86: Save guest [am]perf checkpoint on vcpu_put()
  KVM: x86: Allow host and guest access to IA32_[AM]PERF
  KVM: VMX: Pass through guest reads of IA32_[AM]PERF
  KVM: SVM: Pass through guest reads of IA32_[AM]PERF
  KVM: x86: Enable guest usage of X86_FEATURE_APERFMPERF

 arch/x86/include/asm/kvm_host.h  |  11 ++
 arch/x86/include/asm/topology.h  |  10 ++
 arch/x86/kernel/cpu/aperfmperf.c |  65 +++++++++++-
 arch/x86/kvm/cpuid.c             |  12 ++-
 arch/x86/kvm/governed_features.h |   1 +
 arch/x86/kvm/lapic.c             |   5 +-
 arch/x86/kvm/reverse_cpuid.h     |   6 ++
 arch/x86/kvm/svm/nested.c        |   2 +-
 arch/x86/kvm/svm/svm.c           |   7 ++
 arch/x86/kvm/svm/svm.h           |   2 +-
 arch/x86/kvm/vmx/nested.c        |   2 +-
 arch/x86/kvm/vmx/vmx.c           |   7 ++
 arch/x86/kvm/vmx/vmx.h           |   2 +-
 arch/x86/kvm/x86.c               | 171 ++++++++++++++++++++++++++++---
 arch/x86/lib/msr-smp.c           |  11 ++
 drivers/cpufreq/amd-pstate.c     |   4 +-
 drivers/cpufreq/intel_pstate.c   |   5 +-
 17 files changed, 295 insertions(+), 28 deletions(-)


base-commit: 0a9b9d17f3a781dea03baca01c835deaa07f7cc3
-- 
2.47.0.371.ga323438b13-goog



end of thread, other threads:[~2025-01-13 19:15 UTC | newest]

Thread overview: 35+ messages
2024-11-21 18:52 [RFC PATCH 00/22] KVM: x86: Virtualize IA32_APERF and IA32_MPERF MSRs Mingwei Zhang
2024-11-21 18:52 ` [RFC PATCH 01/22] x86/aperfmperf: Introduce get_host_[am]perf() Mingwei Zhang
2024-11-21 18:52 ` [RFC PATCH 02/22] x86/aperfmperf: Introduce set_guest_[am]perf() Mingwei Zhang
2024-11-21 18:52 ` [RFC PATCH 03/22] x86/aperfmperf: Introduce restore_host_[am]perf() Mingwei Zhang
2024-11-21 18:52 ` [RFC PATCH 04/22] x86/msr: Adjust remote reads of IA32_[AM]PERF by the per-cpu host offset Mingwei Zhang
2024-11-21 18:52 ` [RFC PATCH 05/22] KVM: x86: Introduce kvm_vcpu_make_runnable() Mingwei Zhang
2024-11-21 18:52 ` [RFC PATCH 06/22] KVM: x86: INIT may transition from HALTED to RUNNABLE Mingwei Zhang
2024-12-03 19:07   ` Sean Christopherson
2024-11-21 18:52 ` [RFC PATCH 07/22] KVM: nSVM: Nested #VMEXIT " Mingwei Zhang
2024-11-21 18:53 ` [RFC PATCH 08/22] KVM: nVMX: Nested VM-exit " Mingwei Zhang
2024-11-21 18:53 ` [RFC PATCH 09/22] KVM: x86: Introduce KVM_X86_FEATURE_APERFMPERF Mingwei Zhang
2024-11-21 18:53 ` [RFC PATCH 10/22] KVM: x86: Make APERFMPERF a governed feature Mingwei Zhang
2024-11-21 18:53 ` [RFC PATCH 11/22] KVM: x86: Initialize guest [am]perf at vcpu power-on Mingwei Zhang
2024-11-21 18:53 ` [RFC PATCH 12/22] KVM: x86: Load guest [am]perf into hardware MSRs at vcpu_load() Mingwei Zhang
2024-11-21 18:53 ` [RFC PATCH 13/22] KVM: x86: Load guest [am]perf when leaving halt state Mingwei Zhang
2024-11-21 18:53 ` [RFC PATCH 14/22] KVM: x86: Introduce kvm_user_return_notifier_register() Mingwei Zhang
2024-11-21 18:53 ` [RFC PATCH 15/22] KVM: x86: Restore host IA32_[AM]PERF on userspace return Mingwei Zhang
2024-11-21 18:53 ` [RFC PATCH 16/22] KVM: x86: Save guest [am]perf checkpoint on HLT Mingwei Zhang
2024-11-21 18:53 ` [RFC PATCH 17/22] KVM: x86: Save guest [am]perf checkpoint on vcpu_put() Mingwei Zhang
2024-11-21 18:53 ` [RFC PATCH 18/22] KVM: x86: Update aperfmperf on host-initiated MP_STATE transitions Mingwei Zhang
2024-11-21 18:53 ` [RFC PATCH 19/22] KVM: x86: Allow host and guest access to IA32_[AM]PERF Mingwei Zhang
2024-11-21 18:53 ` [RFC PATCH 20/22] KVM: VMX: Pass through guest reads of IA32_[AM]PERF Mingwei Zhang
2024-11-21 18:53 ` [RFC PATCH 21/22] KVM: SVM: " Mingwei Zhang
2024-11-21 18:53 ` [RFC PATCH 22/22] KVM: x86: Enable guest usage of X86_FEATURE_APERFMPERF Mingwei Zhang
2024-12-03 23:19 ` [RFC PATCH 00/22] KVM: x86: Virtualize IA32_APERF and IA32_MPERF MSRs Sean Christopherson
2024-12-04  1:13   ` Jim Mattson
2024-12-04  1:59     ` Sean Christopherson
2024-12-04  4:00       ` Jim Mattson
2024-12-04  5:11       ` Mingwei Zhang
2024-12-04 12:30       ` Jim Mattson
2024-12-06 16:34         ` Sean Christopherson
2024-12-18 22:23           ` Jim Mattson
2025-01-13 19:15             ` Sean Christopherson
2024-12-05  8:59 ` Nikunj A Dadhania
2024-12-05 13:48   ` Jim Mattson
