From: "Radim Krčmář" <rkrcmar@redhat.com>
To: Alexander Graf <agraf@suse.de>
Cc: Jim Mattson <jmattson@google.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
LKML <linux-kernel@vger.kernel.org>,
"Gabriel L. Somlo" <gsomlo@gmail.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Jonathan Corbet <corbet@lwn.net>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
the arch/x86 maintainers <x86@kernel.org>,
Joerg Roedel <joro@8bytes.org>, kvm list <kvm@vger.kernel.org>,
linux-doc@vger.kernel.org
Subject: Re: [PATCH v5 untested] kvm: better MWAIT emulation for guests
Date: Tue, 4 Apr 2017 14:39:16 +0200
Message-ID: <20170404123915.GA9525@potion>
In-Reply-To: <f6607513-4cbd-3fa0-1663-5477e855e783@suse.de>
2017-04-03 12:04+0200, Alexander Graf:
> On 03/29/2017 02:11 PM, Radim Krčmář wrote:
>> 2017-03-28 13:35-0700, Jim Mattson:
>> > On Tue, Mar 28, 2017 at 7:28 AM, Radim Krčmář <rkrcmar@redhat.com> wrote:
>> > > 2017-03-27 15:34+0200, Alexander Graf:
>> > > > On 15/03/2017 22:22, Michael S. Tsirkin wrote:
>> > > > > Guests running Mac OS X 10.5, 10.6, and 10.7 (Leopard through Lion) have a problem:
>> > > > > unless explicitly provided with kernel command line argument
>> > > > > "idlehalt=0" they'd implicitly assume MONITOR and MWAIT availability,
>> > > > > without checking CPUID.
>> > > > >
>> > > > > We currently emulate that as a NOP but on VMX we can do better: let
>> > > > > guest stop the CPU until timer, IPI or memory change. CPU will be busy
>> > > > > but that isn't any worse than a NOP emulation.
>> > > > >
>> > > > > Note that mwait within guests is not the same as on real hardware
>> > > > > because halt causes an exit while mwait doesn't. For this reason it
>> > > > > might not be a good idea to use the regular MWAIT flag in CPUID to
>> > > > > signal this capability. Add a flag in the hypervisor leaf instead.
>> > > > So imagine we had proper MWAIT emulation capabilities based on page faults.
>> > > > In that case, we could do something as fancy as
>> > > >
>> > > > Treat MWAIT as pass-through by default
>> > > >
>> > > > Have a per-vcpu monitor timer 10 times a second in the background that
>> > > > checks which instruction we're in
>> > > >
>> > > > If we're in mwait for the last - say - 1 second, switch to emulated MWAIT,
>> > > > if $IP was in non-mwait within that time, reset counter.
>> > > Or we could reuse external interrupts for sampling. Exits triggered by
>> > > them would check for current instruction (probably would be best to
>> > > limit just to timer tick) and a sufficient ratio (> 0?) of other exits
>> > > would imply that MWAIT is not used.
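
The timer-based heuristic suggested above (sample the vCPU 10 times a second, switch to emulated MWAIT after about one second spent in MWAIT, reset the counter otherwise) can be sketched as a small state machine. This is only an illustration: `struct mwait_sampler`, `mwait_sampler_tick()` and both constants are hypothetical names, not KVM code, and the thread does not say how to switch back to pass-through, so the sketch leaves that out.

```c
#include <stdbool.h>

/* Hypothetical constants taken from the numbers mentioned in the thread. */
#define SAMPLES_PER_SEC      10
#define SWITCH_AFTER_SAMPLES (1 * SAMPLES_PER_SEC) /* about one second */

/* Hypothetical per-vCPU state; not a KVM structure. */
struct mwait_sampler {
	unsigned int in_mwait_samples; /* consecutive samples that found the vCPU in MWAIT */
	bool emulate;                  /* set once the vCPU should trap and emulate MWAIT */
};

/*
 * Called once per sample. ip_at_mwait says whether the sampled guest
 * instruction was MWAIT. Returns true if MWAIT should be emulated.
 */
static bool mwait_sampler_tick(struct mwait_sampler *s, bool ip_at_mwait)
{
	if (!ip_at_mwait) {
		/* The guest did real work: restart the count. */
		s->in_mwait_samples = 0;
		return s->emulate;
	}

	if (s->in_mwait_samples < SWITCH_AFTER_SAMPLES)
		s->in_mwait_samples++;

	if (s->in_mwait_samples >= SWITCH_AFTER_SAMPLES)
		s->emulate = true;

	return s->emulate;
}
```

The interrupt-based variant would call the same function from the timer-tick exit path instead of a dedicated timer.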
>> > >
>> > > > Or instead maybe just reuse the adapter hlt logic?
>> > > Emulated MWAIT is very similar to emulated HLT, so reusing the logic
>> > > makes sense. We would just add new wakeup methods.
>> > >
>> > > > Either way, with that we should be able to get super low latency IPIs
>> > > > running while still maintaining some sanity on systems which don't have
>> > > > dedicated CPUs for workloads.
>> > > >
>> > > > And we wouldn't need guest modifications, which is a great plus. So older
>> > > > guests (and Windows?) could benefit from mwait as well.
>> > > There is no need for guest modifications -- it could be exposed as standard
>> > > MWAIT feature to the guest, with responsibilities for guest/host-impact
>> > > on the user.
>> > >
>> > > I think that the page-fault based MWAIT would require paravirt if it
>> > > should be enabled by default, because of performance concerns:
>> > > Enabling write protection on a page needs a VM exit on all other VCPUs
>> > > when beginning monitoring (to reload page permissions and prevent missed
>> > > writes).
>> > > We'd want to keep trapping writes to the page all the time because
>> > > toggling is slow, but this could regress performance for an OS that has
>> > > other data accessed by other VCPUs in that page.
>> > > No current interface can tell the guest that it should reserve the whole
>> > > page instead of what CPUID[5] says and that writes to the monitored page
>> > > are not "cheap", but can trigger a VM exit ...
>> > CPUID.05H:EBX is supposed to address the false sharing issue. IIRC,
>> > VMware Fusion reports 64 in CPUID.05H:EAX and 4096 in CPUID.05H:EBX
>> > when running Mac OS X guests. Per Intel's SDM volume 3, section
>> > 8.10.5, "To avoid false wake-ups; use the largest monitor line size to
>> > pad the data structure used to monitor writes. Software must make sure
>> > that beyond the data structure, no unrelated data variable exists in
>> > the triggering area for MWAIT. A pad may be needed to avoid this
>> > situation." Unfortunately, most operating systems do not follow this
>> > advice.
>> Right, EBX provides what we need to expose that the whole page is
>> monitored, thanks!
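
To illustrate the padding advice quoted from the SDM: a guest that reads the largest monitor line size from CPUID.05H:EBX would round its monitored structure up to a whole number of lines. A minimal sketch, assuming the line size has already been read; `monitor_padded_size()` is a hypothetical helper, not an existing kernel function.

```c
#include <stddef.h>

/*
 * Round data_size up to a whole number of monitor lines.
 * line_size is assumed to be the largest monitor line size in bytes,
 * as reported in CPUID.05H:EBX[15:0]. A line_size of 0 is treated
 * as "no padding information available".
 */
static size_t monitor_padded_size(size_t data_size, size_t line_size)
{
	if (line_size == 0)
		return data_size;

	return (data_size + line_size - 1) / line_size * line_size;
}
```

With the 4096 that VMware Fusion is said to report in EBX, an 8-byte wakeup flag would be padded to a full page. The structure also has to be aligned to the line size for the padding to keep unrelated data out of the triggering area.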
>
> So coming back to the original patch, is there anything that should keep us
> from exposing MWAIT straight into the guest at all times?
Just minor issues:

 * OS X on Core 2 fails for an unknown reason if we disable the instruction
   trapping, which is an argument against doing it by default
 * idling guests would consume host CPU, which is a significant change
   in behavior and shouldn't be done without userspace's involvement

I think the best compromise is to add a capability for the MWAIT VM-exit
controls and let userspace expose MWAIT if it wishes to.

Will send a patch.