From: Sean Christopherson <seanjc@google.com>
To: Dapeng Mi <dapeng1.mi@intel.com>
Cc: pbonzini@redhat.com, tglx@linutronix.de, mingo@redhat.com,
dave.hansen@linux.intel.com, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org, zhenyuw@linux.intel.com
Subject: Re: [PATCH] KVM: x86: disable halt polling when powersave governor is used
Date: Fri, 7 Oct 2022 17:51:36 +0000 [thread overview]
Message-ID: <Y0BnKIW+7sqJbTyY@google.com> (raw)
In-Reply-To: <20220915073121.1038840-1-dapeng1.mi@intel.com>
On Thu, Sep 15, 2022, Dapeng Mi wrote:
> Halt polling is enabled by default even through the CPU frequency
> governor is configured to powersave. Generally halt polling would
> consume extra power and this's not identical with the intent of
> powersave governor.
>
> disabling halt polling in powersave governor can save the precious
> power in power critical case.
>
> FIO random read test on Alder Lake platform shows halt polling
> occupies ~17% CPU utilization and consume 7% extra CPU power.
> After disabling halt polling, CPU has more chance to enter deeper
> C-states (C1E%: 25.3% -> 33.4%, C10%: 4.4% -> 17.4%).
>
> On Alder Lake platform, we don't find there are obvious performance
> downgrade after disabling halt polling on FIO and Netperf cases.
> Netperf UDP_RR case runs from two VMs locate on two different physical
> machines.
>
> FIO(MB/s) Base Disable-halt-polling Delta%
> Rand-read 432.6 436.3 0.8%
>
> Netperf Base Disable-halt-polling Delta%
> UDP_RR 509.8 508.5 -0.3%
>
> Signed-off-by: Dapeng Mi <dapeng1.mi@intel.com>
> ---
> arch/x86/kvm/x86.c | 17 ++++++++++++++++-
> 1 file changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index d7374d768296..c0eb6574cbbb 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -13015,7 +13015,22 @@ bool kvm_vector_hashing_enabled(void)
>
> bool kvm_arch_no_poll(struct kvm_vcpu *vcpu)
> {
> - return (vcpu->arch.msr_kvm_poll_control & 1) == 0;
> + struct cpufreq_policy *policy = cpufreq_cpu_get(vcpu->cpu);
Preemption is not disabled at this point, which means that using vcpu->cpu is
potentially unsafe. Given that cpufreq is refcounting the returned object, I gotta
imaging get migrated to a different pCPU would be problematic.
> + bool powersave = false;
I don't see anything in here that's x86 specific. Unless I'm missing something,
this belongs in common KVM.
> +
> + /*
> + * Halt polling could consume much CPU power, if CPU frequency
> + * governor is set to "powersave", disable halt polling.
> + */
> + if (policy) {
> + if ((policy->policy == CPUFREQ_POLICY_POWERSAVE) ||
> + (policy->governor &&
Indentation is messed up.
> + !strncmp(policy->governor->name, "powersave",
KVM should not be comparing magic strings. If the cpufreq subsystem can't get
policy->policy right, then that needs to be fixed.
> + CPUFREQ_NAME_LEN)))
> + powersave = true;
> + cpufreq_cpu_put(policy);
> + }
> + return ((vcpu->arch.msr_kvm_poll_control & 1) == 0) || powersave;
Doing all of the above work if polling is disabled is silly.
> }
> EXPORT_SYMBOL_GPL(kvm_arch_no_poll);
All in all, _if_ we want to do this automatically and not let userspace decide how
to manage powersave vs. halt-poll, I think this should be more like:
---
virt/kvm/kvm_main.c | 21 ++++++++++++++++++++-
1 file changed, 20 insertions(+), 1 deletion(-)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index e30f1b4ecfa5..01116859cb31 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -29,6 +29,7 @@
#include <linux/file.h>
#include <linux/syscore_ops.h>
#include <linux/cpu.h>
+#include <linux/cpufreq.h>
#include <linux/sched/signal.h>
#include <linux/sched/mm.h>
#include <linux/sched/stat.h>
@@ -3483,6 +3484,23 @@ static inline void update_halt_poll_stats(struct kvm_vcpu *vcpu, ktime_t start,
}
}
+static bool kvm_cpufreq_no_halt_poll(struct kvm_vcpu *vcpu)
+{
+ struct cpufreq_policy *policy;
+ bool powersave = false;
+
+ preempt_disable();
+
+ policy = cpufreq_cpu_get(vcpu->cpu);
+ if (policy) {
+ powersave = (policy->policy == CPUFREQ_POLICY_POWERSAVE);
+ cpufreq_cpu_put(policy);
+ }
+
+ preempt_enable();
+ return powersave;
+}
+
/*
* Emulate a vCPU halt condition, e.g. HLT on x86, WFI on arm, etc... If halt
* polling is enabled, busy wait for a short time before blocking to avoid the
@@ -3491,7 +3509,8 @@ static inline void update_halt_poll_stats(struct kvm_vcpu *vcpu, ktime_t start,
*/
void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
{
- bool halt_poll_allowed = !kvm_arch_no_poll(vcpu);
+ const bool halt_poll_allowed = !kvm_arch_no_poll(vcpu) &&
+ !kvm_cpufreq_no_halt_poll(vcpu);
bool do_halt_poll = halt_poll_allowed && vcpu->halt_poll_ns;
ktime_t start, cur, poll_end;
bool waited = false;
base-commit: e18d6152ff0f41b7f01f9817372022df04e0d354
--
next prev parent reply other threads:[~2022-10-07 17:51 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-09-15 7:31 [PATCH] KVM: x86: disable halt polling when powersave governor is used Dapeng Mi
2022-10-07 17:51 ` Sean Christopherson [this message]
2022-10-08 9:40 ` Mi, Dapeng1
2022-10-25 17:06 ` Sean Christopherson
2022-11-15 6:25 ` Mi, Dapeng1
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Y0BnKIW+7sqJbTyY@google.com \
--to=seanjc@google.com \
--cc=dapeng1.mi@intel.com \
--cc=dave.hansen@linux.intel.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=pbonzini@redhat.com \
--cc=tglx@linutronix.de \
--cc=zhenyuw@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox