From: "Radim Krčmář" <rkrcmar@redhat.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: linux-kernel@vger.kernel.org, Paolo Bonzini <pbonzini@redhat.com>,
Jonathan Corbet <corbet@lwn.net>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
x86@kernel.org, kvm@vger.kernel.org, linux-doc@vger.kernel.org
Subject: Re: [PATCH] kvm: better MWAIT emulation for guests
Date: Mon, 13 Mar 2017 20:39:11 +0100
Message-ID: <20170313193910.GB4547@potion>
In-Reply-To: <20170313180046-mutt-send-email-mst@kernel.org>
2017-03-13 18:08+0200, Michael S. Tsirkin:
> On Mon, Mar 13, 2017 at 04:46:20PM +0100, Radim Krčmář wrote:
>> 2017-03-10 00:29+0200, Michael S. Tsirkin:
>> > Some guests call mwait without checking the cpu flags. We currently
>> > emulate that as a NOP but on VMX we can do better: let guest stop the
>> > CPU until timer or IPI. CPU will be busy but that isn't any worse than
>> > a NOP emulation.
>> >
>> > Note that mwait within guests is not the same as on real hardware
>> > because you must halt if you want to go deep into sleep.
>>
>> SDM (25.3 CHANGES TO INSTRUCTION BEHAVIOR IN VMX NON-ROOT OPERATION)
>> says that "MWAIT operates normally". What is the reason why MWAIT
>> inside VMX cannot reach the same states as MWAIT outside VMX?
>
> If you are going into a deep sleep state with huge latency you are
> better off exiting and paying an extra microsecond latency
> since a chance some other task will want to schedule seems higher.

Oh, so MWAIT behavior is the same and it can reach deep sleep, just the
use-cases differ ... If the guest VCPU is running on an isolated CPU,
then you might want to reach a deep state to save power when there is no
better use for that CPU.

>> > Thus it isn't
>> > a good idea to use the regular MWAIT flag in CPUID for that. Add a flag
>> > in the hypervisor leaf instead.
>> >
>> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>> > ---
>> [...]
>> > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>> > @@ -594,6 +594,9 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>> > + if (this_cpu_has(X86_FEATURE_MWAIT))
>> > + entry->eax = (1 << KVM_FEATURE_MWAIT);
>>
>> I'd rather not add it as a paravirt feature:
>>
>> - MWAIT requires the software to provide a target state, but we're not
>> doing anything to expose those states.
>
> Current linux guests just discover these states based on
> CPU model, so we do expose enough info.

Linux still filters the hardcoded hints through CPUID[5].edx, which is 0
in our case.

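For illustration, a minimal sketch of that filtering, modelled on what
intel_idle does with the sub-state counts. The helper names are made up
for this example and this is not the kernel code itself. Each 4-bit
field of CPUID[5].edx holds the number of sub-states of one C-state, and
a hardcoded hint whose field is 0 gets dropped:

```c
#include <stdint.h>

/* MWAIT hint layout (EAX): bits 7:4 select the C-state, bits 3:0 the sub-state. */
#define HINT_CSTATE(hint)	(((hint) >> 4) & 0xf)

/*
 * CPUID[5].edx: bits 3:0 count the C0 sub-states, bits 7:4 the C1
 * sub-states, and so on up to C7.  A C-state field of 0 in the hint
 * means C1, hence the "+ 1" when picking the field.
 */
static inline unsigned int substates_for_hint(uint32_t cpuid5_edx, uint32_t hint)
{
	unsigned int field = HINT_CSTATE(hint) + 1;

	/* Only eight 4-bit fields exist; anything beyond has no sub-states. */
	if (field > 7)
		return 0;

	return (cpuid5_edx >> (field * 4)) & 0xf;
}

/* Hypothetical helper: would a hardcoded table entry survive the filter? */
static inline int hint_is_usable(uint32_t cpuid5_edx, uint32_t hint)
{
	return substates_for_hint(cpuid5_edx, hint) != 0;
}
```

With edx == 0, as in the guest here, hint_is_usable() is false for every
hint, so none of the model-specific states would be used. The value
0x00001120 is an invented example in which hints 0x00, 0x10 and 0x20
pass and 0x30 does not.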
>> The feature would need very constrained setup, which is hard to
>> support
>
> Why would it? It works without any tweaking on several boxes
> I own.

MWAIT hints do not always mean the same thing, so they could lead to
different power/performance tradeoffs than the application expects. We
should at least specify that the paravirt feature allows only hint 0.
You probably don't run weird combinations of host/guest CPUs.

>> - we've had requests to support MWAIT emulation for Linux and fully
>> emulating MWAIT would be best.
>> MWAIT is not going to be enabled by default, of course; it would be
>> targeted at LPAR-like uses of KVM.
>
> Yes I think this limited emulation is safe to enable by default.
> Pretending mwait is equivalent to halt maybe isn't.

Right, we must keep the VCPU thread running when emulating mwait as it
is different from a hlt.

>> What about keeping just the last hunk to improve OS X, for now?
>>
>> Thanks.
>
> IMHO if we have a new functionality we are better off creating
> some way for guests to discover it is there.
>
> Do we really have to argue about a single bit in HV leaf?
> What harm does it do?

It adds code to both guests and hosts, and needs documentation ...

The bit is acceptable. I just see no point in having it when there
already is a detection mechanism for mwait.
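
To make the two detection paths concrete, here is a hedged sketch of
what a guest would test in each case. The architectural MWAIT flag is
CPUID.01H:ECX bit 3, and KVM paravirt feature bits are reported in EAX
of hypervisor leaf 0x40000001. The function names are invented for this
example, and the paravirt bit number is left as a parameter because the
value of KVM_FEATURE_MWAIT is not shown in this thread:

```c
#include <stdint.h>

/* CPUID.01H:ECX bit 3 is the architectural MONITOR/MWAIT feature flag. */
#define CPUID1_ECX_MWAIT_BIT	3

/* KVM reports its paravirt feature bits in EAX of this hypervisor leaf. */
#define KVM_CPUID_FEATURES_LEAF	0x40000001u

/* Standard detection: takes the ECX value returned by CPUID leaf 1. */
static inline int cpu_advertises_mwait(uint32_t cpuid1_ecx)
{
	return (cpuid1_ecx >> CPUID1_ECX_MWAIT_BIT) & 1;
}

/*
 * Paravirt detection: takes the EAX value returned by the KVM features
 * leaf.  pv_bit is a placeholder for whatever bit the patch would assign
 * to KVM_FEATURE_MWAIT; it is an assumption, not a real constant.
 */
static inline int kvm_advertises_feature(uint32_t kvm_features_eax,
					 unsigned int pv_bit)
{
	return pv_bit < 32 && ((kvm_features_eax >> pv_bit) & 1);
}
```

Both checks take raw register values, so the only difference for the
guest is which leaf it reads and which bit it tests.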

In any case, this patch should also remove VM exits under SVM and add
KVM_CAP_MWAIT for userspace. Userspace can then set the MWAIT feature
if it wishes the guest to use it in a more standard way.

I can do a cleanup of the VM exits that become unused on top of it.

Thanks.
Thread overview: 19+ messages
2017-03-09 22:29 [PATCH] kvm: better MWAIT emulation for guests Michael S. Tsirkin
2017-03-10 0:51 ` Gabriel L. Somlo
2017-03-10 1:12 ` Michael S. Tsirkin
2017-03-13 7:44 ` Wanpeng Li
2017-03-10 23:46 ` Jim Mattson
2017-03-12 0:01 ` Michael S. Tsirkin
2017-03-12 21:18 ` Gabriel L. Somlo
2017-03-13 15:46 ` Radim Krčmář
2017-03-13 16:08 ` Michael S. Tsirkin
2017-03-13 19:39 ` Radim Krčmář [this message]
2017-03-13 20:03 ` Michael S. Tsirkin
2017-03-13 21:43 ` Radim Krčmář
2017-03-15 18:14 ` Gabriel L. Somlo
2017-03-15 18:29 ` Michael S. Tsirkin
2017-03-15 19:01 ` Gabriel L. Somlo
2017-03-15 19:05 ` Michael S. Tsirkin
2017-03-15 19:29 ` Michael S. Tsirkin
2017-03-15 19:43 ` Gabriel L. Somlo
2017-03-15 20:13 ` Michael S. Tsirkin