From mboxrd@z Thu Jan 1 00:00:00 1970
From: Konrad Rzeszutek Wilk
Subject: Re: fpu_taskswitch adjustment proposal
Date: Mon, 18 Jun 2012 14:24:51 -0400
Message-ID: <20120618182450.GF24750@phenom.dumpdata.com>
References: <4FDB790F020000780008A406@nat28.tlf.novell.com> <4FDEF595020000780008A5B7@nat28.tlf.novell.com>
In-Reply-To: <4FDEF595020000780008A5B7@nat28.tlf.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Jan Beulich
Cc: Keir Fraser, xen-devel
List-Id: xen-devel@lists.xenproject.org

On Mon, Jun 18, 2012 at 08:32:05AM +0100, Jan Beulich wrote:
> >>> On 15.06.12 at 19:06, Keir Fraser wrote:
> > On 15/06/2012 17:03, "Jan Beulich" wrote:
> >
> >> While pv-ops so far doesn't care to eliminate the two
> >> trap-and-emulate CR0 accesses from the asm/xor.h save/restore
> >> operations, the legacy x86-64 kernel uses conditional clts()/stts()
> >> for this purpose. While looking into whether to extend this to the
> >> newly added (in 3.5) AVX operations there I realized that this isn't
> >> fully correct: It doesn't properly nest inside a kernel_fpu_begin()/
> >> kernel_fpu_end() pair (as it will stts() at the end no matter what
> >> the original state of CR0.TS was).

That sounds like a bug in the generic code then?

> >>
> >> In order to not introduce completely new hypercalls to overcome
> >> this (fpu_taskswitch isn't really extensible on its own), I'm
> >> considering to add a new VM assist, altering the fpu_taskswitch
> >> behavior so that it would return an indicator whether any change
> >> to the virtual CR0.TS was done. That way, the kernel can
> >> implement a true save/restore cycle here.

How would that work with the multi-calls? Right now clts is batched
and so is cr0 write.
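
For illustration only, the nesting problem and the proposed "did anything
change" indicator can be modelled in a small userspace simulation. This is
a sketch, not kernel or Xen code: sim_cr0_ts, sim_fpu_taskswitch() and the
two xor_block_*() helpers are hypothetical names, and the return value is
an assumption about what the altered fpu_taskswitch behavior might report.

```c
#include <stdbool.h>

/* Simulated virtual CR0.TS of one vCPU (true = TS set). Hypothetical. */
static bool sim_cr0_ts = true;

/*
 * Hypothetical stand-in for fpu_taskswitch with the proposed behavior:
 * set or clear the virtual CR0.TS and report whether it changed.
 * This is not the real Xen hypercall ABI.
 */
static bool sim_fpu_taskswitch(bool set)
{
    bool changed = (sim_cr0_ts != set);

    sim_cr0_ts = set;
    return changed;
}

/* Behavior described in the thread: stts() at the end no matter what. */
static void xor_block_unconditional(void)
{
    sim_fpu_taskswitch(false);      /* clts() */
    /* ... SSE/AVX xor work would go here ... */
    sim_fpu_taskswitch(true);       /* stts(), ignoring the prior state */
}

/* Save/restore variant: set TS again only if this code cleared it. */
static void xor_block_nested(void)
{
    bool cleared = sim_fpu_taskswitch(false);

    /* ... SSE/AVX xor work would go here ... */
    if (cleared)
        sim_fpu_taskswitch(true);
}
```

If sim_cr0_ts is already clear on entry (as it would be inside a
kernel_fpu_begin()/kernel_fpu_end() section), the unconditional variant
leaves it set, while the nested variant leaves it as it found it.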
> >
> > It should be possible for the guest kernel to track its CR0.TS setting
> > shouldn't it? It gets modified only via a few paravirt hooks, and implicitly

Hm, the clts() paravirt could take advantage of the per-cpu cr0 to figure
out whether it truly needs to do anything.

> > cleared on #NM.
>
> Sure, but selling this to the Linux maintainers I would expect to be
> harder than fitting the Xen side of things into the current
> save-and-restore model the native xor code uses. It would only be
> straightforward to implement on the legacy, forward ported trees.
>
> However, with the #NM handler in pv-ops apparently not
> leveraging the fact that CR0.TS is already cleared for it on entry,
> maybe this could indeed be introduced together. Konrad?

Would this require an extra pvops call from the #NM handler?

>
> Jan
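
For illustration only, the guest-side tracking idea (a cached CR0.TS that
lets clts() skip the hypercall, kept in sync on #NM) might look roughly
like the userspace model below. All names here (shadow_ts, hv_ts,
sim_set_ts(), pv_clts(), pv_stts(), nm_entry()) are hypothetical, and a
single global stands in for what would be per-cpu state in a real kernel.

```c
#include <stdbool.h>

static bool hv_ts = true;        /* hypervisor's virtual CR0.TS (simulated) */
static bool shadow_ts = true;    /* guest's cached copy; per-cpu in a real kernel */
static unsigned int hypercalls;  /* number of simulated traps into the hypervisor */

/* Hypothetical stand-in for the hypercall that changes the virtual CR0.TS. */
static void sim_set_ts(bool set)
{
    hv_ts = set;
    hypercalls++;
}

/* clts() hook: skip the hypercall when the cached TS is already clear. */
static void pv_clts(void)
{
    if (!shadow_ts)
        return;
    sim_set_ts(false);
    shadow_ts = false;
}

/* stts() hook: skip the hypercall when the cached TS is already set. */
static void pv_stts(void)
{
    if (shadow_ts)
        return;
    sim_set_ts(true);
    shadow_ts = true;
}

/*
 * #NM entry: the model assumes TS has already been cleared on the
 * hypervisor side before the guest handler runs, so the guest only
 * updates its cached copy and issues no hypercall.
 */
static void nm_entry(void)
{
    hv_ts = false;      /* done by the hypervisor in this model */
    shadow_ts = false;  /* guest bookkeeping only */
}
```

In this model the #NM path needs only the bookkeeping store, not an
additional hypercall; whether that store would be done through a separate
pvops hook is left open here.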