From: Mingwei Zhang <mizhang@google.com>
To: Sean Christopherson <seanjc@google.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Huang Rui <ray.huang@amd.com>,
"Gautham R. Shenoy" <gautham.shenoy@amd.com>,
Mario Limonciello <mario.limonciello@amd.com>,
"Rafael J. Wysocki" <rafael@kernel.org>,
Viresh Kumar <viresh.kumar@linaro.org>,
Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>,
Len Brown <lenb@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>, Perry Yuan <perry.yuan@amd.com>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-pm@vger.kernel.org, Jim Mattson <jmattson@google.com>,
Mingwei Zhang <mizhang@google.com>
Subject: [RFC PATCH 02/22] x86/aperfmperf: Introduce set_guest_[am]perf()
Date: Thu, 21 Nov 2024 18:52:54 +0000
Message-ID: <20241121185315.3416855-3-mizhang@google.com>
In-Reply-To: <20241121185315.3416855-1-mizhang@google.com>

From: Jim Mattson <jmattson@google.com>

KVM guests need access to IA32_APERF and IA32_MPERF to observe their
effective CPU frequency, but intercepting reads of these MSRs is too
expensive since Linux guests read them every scheduler tick (250 Hz by
default). Allow the guest to read these MSRs without interception by
loading guest values into the hardware MSRs.

When loading a guest value into IA32_APERF or IA32_MPERF:
1. Query the current host value
2. Record the offset between guest and host values in a per-CPU variable
3. Load the guest value into the MSR

Modify get_host_[am]perf() to add the per-CPU offset to the raw MSR
value, so that host kernel code can still obtain correct host values
even when the MSRs contain guest values.

Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Mingwei Zhang <mizhang@google.com>
Signed-off-by: Mingwei Zhang <mizhang@google.com>
---
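The offset bookkeeping described above can be modeled in userspace. The
following is a minimal sketch, not kernel code: sim_msr_aperf,
sim_aperf_offset, and the sim_*() helpers are hypothetical stand-ins for
one CPU's IA32_APERF and its host_aperf_offset. It assumes a single CPU
and no interruption between the read and the write.

```c
#include <stdint.h>

/* Hypothetical stand-ins for one CPU's hardware MSR and per-CPU offset. */
static uint64_t sim_msr_aperf;		/* models IA32_APERF on this CPU */
static uint64_t sim_aperf_offset;	/* models host_aperf_offset */

/* Host view: raw MSR value plus the recorded offset. */
static uint64_t sim_get_host_aperf(void)
{
	return sim_msr_aperf + sim_aperf_offset;
}

/* Load a guest value and remember (host - guest) so the host view survives. */
static void sim_set_guest_aperf(uint64_t guest_aperf)
{
	uint64_t host_aperf = sim_get_host_aperf();

	sim_msr_aperf = guest_aperf;
	/* Unsigned subtraction wraps modulo 2^64 when guest > host. */
	sim_aperf_offset = host_aperf - guest_aperf;
}

/* Models the counter advancing while the CPU runs. */
static void sim_tick(uint64_t cycles)
{
	sim_msr_aperf += cycles;
}
```

Starting from a raw value of 1000 and a zero offset, calling
sim_set_guest_aperf(40) leaves 40 in the simulated MSR while
sim_get_host_aperf() still returns 1000. The arithmetic is unsigned, so
the host view stays correct even when the guest value is larger than the
host value.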
 arch/x86/include/asm/topology.h  |  5 +++++
 arch/x86/kernel/cpu/aperfmperf.c | 31 +++++++++++++++++++++++++++++--
 2 files changed, 34 insertions(+), 2 deletions(-)
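
For context on why a guest reads these MSRs on every tick: the ratio of
the APERF delta to the MPERF delta over an interval scales the nominal
frequency to the effective one. A minimal sketch, assuming a hypothetical
helper effective_khz() and a caller-supplied base_khz (the frequency at
which MPERF counts); neither name comes from this series.

```c
#include <stdint.h>

/*
 * Hypothetical helper: effective kHz = base kHz * dAPERF / dMPERF.
 * Deltas use unsigned subtraction, so a counter wrap between the two
 * samples is handled. The multiplication can overflow for very large
 * deltas; callers are assumed to sample at short intervals.
 */
static uint64_t effective_khz(uint64_t base_khz,
			      uint64_t aperf_prev, uint64_t aperf_now,
			      uint64_t mperf_prev, uint64_t mperf_now)
{
	uint64_t delta_aperf = aperf_now - aperf_prev;
	uint64_t delta_mperf = mperf_now - mperf_prev;

	if (delta_mperf == 0)
		return 0;	/* no reference cycles elapsed, no estimate */

	return base_khz * delta_aperf / delta_mperf;
}
```

With base_khz = 2000000, an APERF delta of 1500 and an MPERF delta of
1000, the helper returns 3000000, i.e. the CPU ran at 1.5x its nominal
frequency over the interval.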

diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index 2ef9903cf85d7..fef5846c01976 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -302,8 +302,13 @@ static inline void arch_set_max_freq_ratio(bool turbo_disabled) { }
 static inline void freq_invariance_set_perf_ratio(u64 ratio, bool turbo_disabled) { }
 #endif
 
+DECLARE_PER_CPU(u64, host_aperf_offset);
+DECLARE_PER_CPU(u64, host_mperf_offset);
+
 extern u64 get_host_aperf(void);
 extern u64 get_host_mperf(void);
+extern void set_guest_aperf(u64 aperf);
+extern void set_guest_mperf(u64 mperf);
 extern void arch_scale_freq_tick(void);
 #define arch_scale_freq_tick arch_scale_freq_tick
 
diff --git a/arch/x86/kernel/cpu/aperfmperf.c b/arch/x86/kernel/cpu/aperfmperf.c
index 3be5070ba3361..8b66872aa98c1 100644
--- a/arch/x86/kernel/cpu/aperfmperf.c
+++ b/arch/x86/kernel/cpu/aperfmperf.c
@@ -94,20 +94,47 @@ void arch_set_max_freq_ratio(bool turbo_disabled)
 }
 EXPORT_SYMBOL_GPL(arch_set_max_freq_ratio);
 
+DEFINE_PER_CPU(u64, host_aperf_offset);
+DEFINE_PER_CPU(u64, host_mperf_offset);
+
 u64 get_host_aperf(void)
 {
 	WARN_ON_ONCE(!irqs_disabled());
-	return native_read_msr(MSR_IA32_APERF);
+	return native_read_msr(MSR_IA32_APERF) +
+		this_cpu_read(host_aperf_offset);
 }
 EXPORT_SYMBOL_GPL(get_host_aperf);
 
 u64 get_host_mperf(void)
 {
 	WARN_ON_ONCE(!irqs_disabled());
-	return native_read_msr(MSR_IA32_MPERF);
+	return native_read_msr(MSR_IA32_MPERF) +
+		this_cpu_read(host_mperf_offset);
 }
 EXPORT_SYMBOL_GPL(get_host_mperf);
 
+void set_guest_aperf(u64 guest_aperf)
+{
+	u64 host_aperf;
+
+	WARN_ON_ONCE(!irqs_disabled());
+	host_aperf = get_host_aperf();
+	wrmsrl(MSR_IA32_APERF, guest_aperf);
+	this_cpu_write(host_aperf_offset, host_aperf - guest_aperf);
+}
+EXPORT_SYMBOL_GPL(set_guest_aperf);
+
+void set_guest_mperf(u64 guest_mperf)
+{
+	u64 host_mperf;
+
+	WARN_ON_ONCE(!irqs_disabled());
+	host_mperf = get_host_mperf();
+	wrmsrl(MSR_IA32_MPERF, guest_mperf);
+	this_cpu_write(host_mperf_offset, host_mperf - guest_mperf);
+}
+EXPORT_SYMBOL_GPL(set_guest_mperf);
+
 static bool __init turbo_disabled(void)
 {
 	u64 misc_en;
--
2.47.0.371.ga323438b13-goog