Message-ID: <4FAA3EBB.5050709@suse.de>
Date: Wed, 09 May 2012 11:54:03 +0200
From: Alexander Graf
To: Gleb Natapov
Cc: Andre Przywara, Eduardo Habkost, "kvm@vger.kernel.org", Jan Kiszka, "qemu-devel@nongnu.org", Avi Kivity
Subject: Re: [Qemu-devel] Semantics of "-cpu host" (was Re: [PATCH 2/2] Expose tsc deadline timer cpuid to guest)

On 05/09/2012 11:38 AM, Gleb Natapov wrote:
> On Wed, May 09, 2012 at 11:05:58AM +0200, Alexander Graf wrote:
>> On 09.05.2012, at 10:51, Gleb Natapov wrote:
>>
>>> On Wed, May 09, 2012 at 10:42:26AM +0200, Alexander Graf wrote:
>>>>
>>>> On 09.05.2012, at 10:14, Gleb Natapov wrote:
>>>>
>>>>> On Wed, May 09, 2012 at 12:07:04AM +0200, Alexander Graf wrote:
>>>>>> On 08.05.2012, at 22:14, Eduardo Habkost wrote:
>>>>>>
>>>>>>> On Tue, May 08, 2012 at 02:58:11AM +0200, Alexander Graf wrote:
>>>>>>>> On 07.05.2012, at 20:21, Eduardo Habkost wrote:
>>>>>>>>
>>>>>>>>> Andre? Are you able to help to answer the question below?
>>>>>>>>>
>>>>>>>>> I would like to clarify what's the expected behavior of "-cpu host" to
>>>>>>>>> be able to continue working on it. I believe the code will need to be
>>>>>>>>> fixed in either case, but first we need to figure out what are the
>>>>>>>>> expectations/requirements, to know _which_ changes will be needed.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Apr 24, 2012 at 02:19:25PM -0300, Eduardo Habkost wrote:
>>>>>>>>>> (CCing Andre Przywara, in case he can help to clarify what's the
>>>>>>>>>> expected meaning of "-cpu host")
>>>>>>>>>>
>>>>>>>>> [...]
>>>>>>>>>> I am not sure I understand what you are proposing. Let me explain the
>>>>>>>>>> use case I am thinking about:
>>>>>>>>>>
>>>>>>>>>> - Feature FOO is of type (A) (e.g. just a new instruction set that
>>>>>>>>>>   doesn't require additional userspace support)
>>>>>>>>>> - User has a Qemu version that doesn't know anything about feature FOO
>>>>>>>>>> - User gets a new CPU that supports feature FOO
>>>>>>>>>> - User gets a new kernel that supports feature FOO (i.e. has FOO in
>>>>>>>>>>   GET_SUPPORTED_CPUID)
>>>>>>>>>> - User does _not_ upgrade Qemu.
>>>>>>>>>> - User expects to get feature FOO enabled if using "-cpu host", without
>>>>>>>>>>   upgrading Qemu.
>>>>>>>>>>
>>>>>>>>>> The problem here is: to support the above use-case, userspace needs a
>>>>>>>>>> probing mechanism that can differentiate _new_ (previously unknown)
>>>>>>>>>> features that are in group (A) (safe to blindly enable) from features
>>>>>>>>>> that are in group (B) (that can't be enabled without a userspace
>>>>>>>>>> upgrade).
>>>>>>>>>>
>>>>>>>>>> In short, it becomes a problem if we consider the following case:
>>>>>>>>>>
>>>>>>>>>> - Feature BAR is of type (B) (it can't be enabled without extra
>>>>>>>>>>   userspace support)
>>>>>>>>>> - User has a Qemu version that doesn't know anything about feature BAR
>>>>>>>>>> - User gets a new CPU that supports feature BAR
>>>>>>>>>> - User gets a new kernel that supports feature BAR (i.e. has BAR in
>>>>>>>>>>   GET_SUPPORTED_CPUID)
>>>>>>>>>> - User does _not_ upgrade Qemu.
>>>>>>>>>> - User simply shouldn't get feature BAR enabled, even if using "-cpu
>>>>>>>>>>   host", otherwise Qemu would break.
>>>>>>>>>>
>>>>>>>>>> If userspace always limited itself to features it knows about, it would
>>>>>>>>>> be really easy to implement the feature without any new probing
>>>>>>>>>> mechanism from the kernel. But that's not how I think users expect "-cpu
>>>>>>>>>> host" to work. Maybe I am wrong, I don't know. I am CCing Andre, who
>>>>>>>>>> introduced the "-cpu host" feature, in case he can explain what's the
>>>>>>>>>> expected semantics on the cases above.
>>>>>>>> Can you think of any feature that'd go into category B?
>>>>>>> - TSC-deadline: can't be enabled unless userspace takes care to enable
>>>>>>>   the in-kernel irqchip.
>>>>>> The kernel can check if in-kernel irqchip has it enabled and otherwise mask it out, no?
>>>>>>
>>>>> How kernel should know that userspace does not emulate it?
>>>> You have to enable the in-kernel apic to use it, at which point the kernel knows it's in use, right?
>>>>
>>>>>>> - x2apic: ditto.
>>>>>> Same here. For user space irqchip the kernel side doesn't care. If in-kernel APIC is enabled, check for its capabilities.
>>>>>>
>>>>> Same here.
>>>>>
>>>>> Well, technically both of those features can't be implemented in
>>>>> userspace right now since MSRs are terminated in the kernel, but I
>>>> Doesn't sound like the greatest design - unless you deprecate the non-in-kernel apic case.
>>>>
>>> You mean terminating MSRs in kernel does not sound like the greatest
>>> design? I do not disagree. That is why IMO kernel can't filter out
>>> TSC-deadline and x2apic like you suggest.
>> I still don't see why it can't.
>>
>> Imagine we would filter TSC-deadline and x2apic by default in the kernel - they are not known to exist yet.
>> Now, we implement TSC-deadline in the kernel. We still filter TSC-deadline out in GET_SUPPORTED_CPUID in the kernel. But we provide an interface to user space that says "call me to enable TSC-deadline CPUID, but only if you're using the in-kernel apic"
>> New user space calls that ioctl when it's using the in-kernel apic, it doesn't when it's using the user space apic.
>> Old user space doesn't call that ioctl.
> First of all we already have TSC-deadline in GET_SUPPORTED_CPUID without
> any additional ioctls. And second I do not see why we need additional
> ioctls here.

Yeah, sometimes our ABI is already broken :(. What a shame...

> Hmm, so maybe I misunderstood you. You propose to mask TSC-deadline and
> x2apic out from GET_SUPPORTED_CPUID if irq chip is not in kernel, not
> from KVM_SET_CPUID? For those two features it may make sense indeed. Not
> sure there won't be others that are not dependent on irq chip presence.
> You propose to add additional ioctls to enable them if they appear?

That's the only backwards compatible way to design this without putting a plethora of knowledge into user space I can see, yes.

>
>> So at the end all bits in GET_SUPPORTED_CPUID are consistent with what user space is capable of.
>>
> GET_SUPPORTED_CPUID should not necessarily be consistent with what user
> space is capable of. Userspace may emulate features that are not in
> GET_SUPPORTED_CPUID.

If it does, it can OR them in. GET_SUPPORTED_CPUID should return "these are the bits that your currently configured kvm is capable of".


Alex
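
To make the semantics discussed above concrete, here is a rough toy model, not actual KVM or QEMU code. The function names (supported_cpuid, guest_cpuid) and the NEEDS_KERNEL_IRQCHIP mask are made up for illustration. Only the CPUID.01H:ECX bit positions for x2APIC (21), TSC-deadline (24) and AVX (28) are real. It assumes the reading that GET_SUPPORTED_CPUID reports only what the currently configured kvm can do, and that user space ORs in whatever it emulates itself.

```python
# Toy model only: these functions are hypothetical and do not correspond to
# real KVM ioctls or QEMU code.

# CPUID.01H:ECX feature bits.
X2APIC = 1 << 21
TSC_DEADLINE = 1 << 24
AVX = 1 << 28

# Assumption for this sketch: features that only work with the in-kernel irqchip.
NEEDS_KERNEL_IRQCHIP = X2APIC | TSC_DEADLINE


def supported_cpuid(host_bits: int, kernel_irqchip: bool) -> int:
    """Model of GET_SUPPORTED_CPUID reflecting the current configuration."""
    if kernel_irqchip:
        return host_bits
    # Without the in-kernel irqchip, mask out the bits that depend on it.
    return host_bits & ~NEEDS_KERNEL_IRQCHIP


def guest_cpuid(supported: int, userspace_emulated: int = 0) -> int:
    """Model of "-cpu host": kernel-reported bits plus user-space-emulated bits."""
    return supported | userspace_emulated


if __name__ == "__main__":
    host = X2APIC | TSC_DEADLINE | AVX
    for irqchip in (True, False):
        bits = guest_cpuid(supported_cpuid(host, irqchip))
        print(f"kernel_irqchip={irqchip}: ecx={bits:#010x}")
```

With this model an old user space that knows nothing about a new bit can still pass the reported mask through unchanged, because the kernel side has already dropped anything the current configuration cannot back.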