From mboxrd@z Thu Jan 1 00:00:00 1970
From: Alexander Graf
Subject: Re: [PATCH v2] kvm: x86: emulate monitor and mwait instructions as nop
Date: Wed, 04 Jun 2014 17:09:49 +0200
Message-ID: <538F36BD.9040404@suse.de>
References: <20140507205210.GA30030@ERROL.INI.CMU.EDU> <20140602192530.GC1653@ERROL.INI.CMU.EDU> <538D92BC.4060203@redhat.com> <20140604143941.GF1653@ERROL.INI.CMU.EDU> <538F30BD.5000501@suse.de> <20140604150519.GG1653@ERROL.INI.CMU.EDU>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Paolo Bonzini, kvm@vger.kernel.org, mst@redhat.com
To: "Gabriel L. Somlo"
Return-path:
Received: from cantor2.suse.de ([195.135.220.15]:36261 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751592AbaFDPJw (ORCPT ); Wed, 4 Jun 2014 11:09:52 -0400
In-Reply-To: <20140604150519.GG1653@ERROL.INI.CMU.EDU>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 04.06.14 17:05, Gabriel L. Somlo wrote:
> On Wed, Jun 04, 2014 at 04:44:13PM +0200, Alexander Graf wrote:
>> On 04.06.14 16:39, Gabriel L. Somlo wrote:
>>> Paolo,
>>>
>>> I noticed the monitor=mwait=nop patch is making its way upstream, so
>>> thanks !
>>>
>>> I'm still interested in following up with something that would enable
>>> this behavior only conditionally (e.g. following an ioctl call from
>>> userspace to enable it only for the (set of) vcpu(s) belonging to one
>>> guest VM at a time), which should then also include advertising the
>>> feature in CPUID.
>>>
>>> I grep-ed through the kvm sources for KVM_CAP for some inspiration,
>>> and it looks more like KVM_CAP_* is a way to tell userspace what the
>>> kernel supports, but nothing I saw showed me an example of a "tunable"
>>> feature that userspace may ask to be turned on or off (e.g per-vcpu).
>>>
>>> Is there something like that I could use as an example ?
>> Sure, we use it all over the place on PPC :).
> Allright, I'll grep harder, then :)
>
>>> Obviously, if you really like the current behavior better you can
>>> always reject whatever patch I'll come up with, but I'd like to at
>>> least try and see what it would look like :)
>> I think it's perfectly fine to leave mwait always implemented as NOP - it's
>> valid behavior.
> NOP is valid MWAIT behavior, *unless* MWAIT should generate an invalid
> opcode (i.e., if CPUID says mwait not supported). In that respect,
> we're cheating only to hook up guests which misbehave. I'd feel less
> "dirty" if I could explicitly tell KVM "ok, just this once is OK, but
> don't make a habit of it" :)

We don't limit instructions the guest can execute properly anyway. If
CPUID doesn't expose AVX, but the host CPU supports AVX, the guest can
still call AVX instructions. So I think we're safe to always handle
MWAIT :).

>
>> As for the CPUID exposure, that should be a pure QEMU thing. If overriding
>> CPUID bits the kernel mask tells us doesn't work today, we should just make
>> it possible :).
>>
>> Eventually I really think that -cpu foo,+mwait,+monitor or whatever the bits
>> are should override any safety net that KVM gives us on features it thinks
>> are safe to use.
> I need to look at the qemu source, doing what you said
> (+monitor,+mwait,+whatever) right now "works", doesn't generate an error,
> but silently ignores you if it's not implemented. So I'd actually have to
> generate a patch to make something happen when they're present on the
> command line.
>
> The part I'm unsure about is "how bad is it to cheat the way we do right
> now", vs. "how much is it worth to be pedantic and require explicitly
> enabling things, in both qemu and kvm"... I feel like I don't know
> enough to 1. have a strong opinion either way, and 2. have my opinion
> be *right* :) Which is why I won't let it go already (and thanks for
> all your patience, BTW) :)

I think it's sane behavior to not expose the MWAIT capability in the
default CPUID mask (which comes from KVM) unless we can actually emulate
it properly ;).

However, I think it's very important to be able to force CPUID bits to
on from QEMU even when KVM says it doesn't support them. I actually
thought we could do that already, but that code got refactored a number
of times over the years, so maybe that ability got lost.

Basically KVM gives QEMU 2 ioctls:

  * get list of KVM supported CPUIDs
  * set guest exposed CPUIDs

Whether QEMU wants to only set CPUID bits that the kernel actually
supports is up to its own implementation. Usually the "enforce" option
is there to guarantee that all CPUID bits are actually supported.

Apparently all unsupported bits just get dropped silently today. IMHO
they shouldn't if they were specified through -cpu ...,+feature.


Alex
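
For illustration only, here is a minimal sketch of the userspace policy
described above: bits that come from the CPU model are filtered by the mask
KVM reports as supported, while bits given explicitly via +feature are
exposed regardless. This is not QEMU code. The struct, the function name
resolve_cpuid_bits() and its parameters are hypothetical, and treating
"enforce" as a hard failure on any unsupported requested bit is an
assumption. The two bit positions are the architectural CPUID.01H:ECX bits
for MONITOR/MWAIT and AVX.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CPUID.01H:ECX feature bits (architectural positions). */
#define CPUID_ECX_MONITOR (1u << 3)   /* MONITOR/MWAIT */
#define CPUID_ECX_AVX     (1u << 28)  /* AVX */

/* Hypothetical result type, for illustration only. */
struct cpuid_policy_result {
    uint32_t exposed;  /* bits the guest would be shown */
    uint32_t dropped;  /* requested bits that were filtered out */
    bool ok;           /* false if "enforce" rejects the configuration */
};

/*
 * model_bits:    bits implied by the chosen CPU model
 * forced_bits:   bits given explicitly, e.g. via -cpu ...,+feature
 * kvm_supported: mask reported by the "get supported CPUIDs" ioctl
 * enforce:       assumed to mean "fail if any requested bit is unsupported"
 */
static struct cpuid_policy_result
resolve_cpuid_bits(uint32_t model_bits, uint32_t forced_bits,
                   uint32_t kvm_supported, bool enforce)
{
    struct cpuid_policy_result r;
    uint32_t requested = model_bits | forced_bits;

    /* Model bits pass through KVM's mask; explicit bits bypass it. */
    r.exposed = (model_bits & kvm_supported) | forced_bits;
    r.dropped = requested & ~r.exposed;
    r.ok = !enforce || (requested & ~kvm_supported) == 0;

    /* A bit is either exposed or dropped, never both. */
    assert((r.exposed & r.dropped) == 0);
    return r;
}
```

With kvm_supported containing only AVX, a model that implies MONITOR loses
that bit (it ends up in "dropped"), whereas passing MONITOR in forced_bits
keeps it in "exposed". The result would then be handed to the "set guest
exposed CPUIDs" ioctl.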