From: Alexander Graf <agraf@suse.de>
To: "Radim Krčmář" <rkrcmar@redhat.com>, "Jim Mattson" <jmattson@google.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
LKML <linux-kernel@vger.kernel.org>,
"Gabriel L. Somlo" <gsomlo@gmail.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Jonathan Corbet <corbet@lwn.net>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
the arch/x86 maintainers <x86@kernel.org>,
Joerg Roedel <joro@8bytes.org>, kvm list <kvm@vger.kernel.org>,
linux-doc@vger.kernel.org
Subject: Re: [PATCH v5 untested] kvm: better MWAIT emulation for guests
Date: Mon, 3 Apr 2017 12:04:34 +0200
Message-ID: <f6607513-4cbd-3fa0-1663-5477e855e783@suse.de>
In-Reply-To: <20170329121147.GA5129@potion>

On 03/29/2017 02:11 PM, Radim Krčmář wrote:
> 2017-03-28 13:35-0700, Jim Mattson:
>> On Tue, Mar 28, 2017 at 7:28 AM, Radim Krčmář <rkrcmar@redhat.com> wrote:
>>> 2017-03-27 15:34+0200, Alexander Graf:
>>>> On 15/03/2017 22:22, Michael S. Tsirkin wrote:
>>>>> Guests running Mac OS X 10.5, 10.6, and 10.7 (Leopard through Lion) have a problem:
>>>>> unless explicitly provided with kernel command line argument
>>>>> "idlehalt=0" they'd implicitly assume MONITOR and MWAIT availability,
>>>>> without checking CPUID.
>>>>>
>>>>> We currently emulate that as a NOP, but on VMX we can do better: let the
>>>>> guest stop the CPU until a timer, IPI or memory change. The CPU will be
>>>>> busy, but that isn't any worse than a NOP emulation.
>>>>>
>>>>> Note that mwait within guests is not the same as on real hardware
>>>>> because halt causes an exit while mwait doesn't. For this reason it
>>>>> might not be a good idea to use the regular MWAIT flag in CPUID to
>>>>> signal this capability. Add a flag in the hypervisor leaf instead.
>>>> So imagine we had proper MWAIT emulation capabilities based on page faults.
>>>> In that case, we could do something as fancy as
>>>>
>>>> Treat MWAIT as pass-through by default
>>>>
>>>> Have a per-vcpu monitor timer 10 times a second in the background that
>>>> checks which instruction we're in
>>>>
>>>> If we've been in mwait for the last - say - 1 second, switch to emulated
>>>> MWAIT; if $IP was in non-mwait code within that time, reset the counter.
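
A minimal sketch of the sampling heuristic described above, in C. It counts consecutive samples that land in MWAIT and switches to emulation after roughly one second. All names here (mwait_sampler, mwait_sampler_tick, the two constants) are hypothetical and not taken from KVM. How the sampled instruction is identified, and whether a vcpu should ever switch back to pass-through, are left open because the thread does not specify them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names for illustration; none of this is KVM's actual API. */
enum mwait_mode { MWAIT_PASSTHROUGH, MWAIT_EMULATED };

#define SAMPLES_PER_SEC       10                     /* "10 times a second" */
#define SWITCH_AFTER_SAMPLES  (1 * SAMPLES_PER_SEC)  /* about 1 second in MWAIT */

struct mwait_sampler {
    enum mwait_mode mode;
    unsigned int consecutive_mwait;  /* samples in a row that hit MWAIT */
};

static void mwait_sampler_init(struct mwait_sampler *s)
{
    assert(s != NULL);
    s->mode = MWAIT_PASSTHROUGH;     /* pass-through by default */
    s->consecutive_mwait = 0;
}

/*
 * Called once per sample. in_mwait says whether the guest's instruction
 * pointer was sitting in MWAIT when the sample was taken (how to find that
 * out is outside this sketch). Returns the mode to use from now on.
 */
static enum mwait_mode mwait_sampler_tick(struct mwait_sampler *s, bool in_mwait)
{
    assert(s != NULL);

    if (!in_mwait) {
        /* The guest ran something else within the window: reset the counter. */
        s->consecutive_mwait = 0;
        return s->mode;
    }

    if (s->consecutive_mwait < SWITCH_AFTER_SAMPLES)
        s->consecutive_mwait++;

    /* A full window of MWAIT samples: assume the vcpu is idle, emulate. */
    if (s->consecutive_mwait >= SWITCH_AFTER_SAMPLES)
        s->mode = MWAIT_EMULATED;

    return s->mode;
}
```

In this sketch a single non-MWAIT sample resets the count, so a vcpu that does any real work within the window stays in pass-through mode.
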
>>> Or we could reuse external interrupts for sampling. Exits triggered by
>>> them would check for current instruction (probably would be best to
>>> limit just to timer tick) and a sufficient ratio (> 0?) of other exits
>>> would imply that MWAIT is not used.
>>>
>>>> Or instead maybe just reuse the adaptive hlt logic?
>>> Emulated MWAIT is very similar to emulated HLT, so reusing the logic
>>> makes sense. We would just add new wakeup methods.
>>>
>>>> Either way, with that we should be able to get super low latency IPIs
>>>> running while still maintaining some sanity on systems which don't have
>>>> dedicated CPUs for workloads.
>>>>
>>>> And we wouldn't need guest modifications, which is a great plus. So older
>>>> guests (and Windows?) could benefit from mwait as well.
>>> There is no need for guest modifications -- it could be exposed as standard
>>> MWAIT feature to the guest, with responsibilities for guest/host-impact
>>> on the user.
>>>
>>> I think that the page-fault based MWAIT would require paravirt if it
>>> should be enabled by default, because of performance concerns:
>>> Enabling write protection on a page needs a VM exit on all other VCPUs
>>> when beginning monitoring (to reload page permissions and prevent missed
>>> writes).
>>> We'd want to keep trapping writes to the page all the time because
>>> toggling is slow, but this could regress performance for an OS that has
>>> other data accessed by other VCPUs in that page.
>>> No current interface can tell the guest that it should reserve the whole
>>> page instead of what CPUID[5] says and that writes to the monitored page
>>> are not "cheap", but can trigger a VM exit ...
>> CPUID.05H:EBX is supposed to address the false sharing issue. IIRC,
>> VMware Fusion reports 64 in CPUID.05H:EAX and 4096 in CPUID.05H:EBX
>> when running Mac OS X guests. Per Intel's SDM volume 3, section
>> 8.10.5, "To avoid false wake-ups; use the largest monitor line size to
>> pad the data structure used to monitor writes. Software must make sure
>> that beyond the data structure, no unrelated data variable exists in
>> the triggering area for MWAIT. A pad may be needed to avoid this
>> situation." Unfortunately, most operating systems do not follow this
>> advice.
> Right, EBX provides what we need to expose that the whole page is
> monitored, thanks!
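
To make the CPUID.05H point concrete, here is a small C sketch that decodes the two monitor-line sizes and pads a structure size to the largest one, as the quoted SDM text advises. The helper names (monitor_line_sizes, decode_cpuid_leaf5, pad_to_monitor_line) are made up for illustration. The register values are passed in rather than read from the CPU so the fragment stays portable; with GCC or Clang on x86 they could come from `__get_cpuid(5, ...)` in `<cpuid.h>`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper types and names, for illustration only. */
struct monitor_line_sizes {
    uint32_t smallest;  /* CPUID.05H:EAX[15:0], in bytes */
    uint32_t largest;   /* CPUID.05H:EBX[15:0], in bytes */
};

/* Extract the monitor-line sizes from raw CPUID leaf 5 register values. */
static struct monitor_line_sizes decode_cpuid_leaf5(uint32_t eax, uint32_t ebx)
{
    struct monitor_line_sizes m = {
        .smallest = eax & 0xffff,
        .largest  = ebx & 0xffff,
    };
    return m;
}

/*
 * Round a structure size up to a whole number of monitor lines, so that no
 * unrelated data shares the triggering area with the monitored variable.
 * The structure would also have to be aligned to the same boundary.
 */
static size_t pad_to_monitor_line(size_t size, uint32_t line)
{
    assert(line != 0);
    return (size + line - 1) / line * line;
}
```

With the values mentioned above (64 in EAX, 4096 in EBX), an 8-byte monitored variable would be padded to a full 4096-byte page, which is how a hypervisor could tell the guest that the whole page is monitored.
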
So coming back to the original patch, is there anything that should keep
us from exposing MWAIT straight into the guest at all times?
Alex