Date: Wed, 3 Jul 2013 12:17:34 -0500
From: Scott Wood
Subject: Re: [PATCH 3/6] KVM: PPC: Book3E: Increase FPU laziness
To: Alexander Graf
In-Reply-To: <23C56B31-5145-481E-9877-F1878F66959D@suse.de> (from agraf@suse.de on Wed Jul 3 11:59:45 2013)
Message-ID: <1372871854.8183.132@snotra>
Cc: Caraman Mihai Claudiu-B02008, "linuxppc-dev@lists.ozlabs.org", "kvm@vger.kernel.org", "kvm-ppc@vger.kernel.org"
List-Id: Linux on PowerPC Developers Mail List

On 07/03/2013 11:59:45 AM, Alexander Graf wrote:
>
> On 03.07.2013, at 17:41, Caraman Mihai Claudiu-B02008 wrote:
>
> >>>>> Increase FPU laziness by calling kvmppc_load_guest_fp() just before
> >>>>> returning to guest instead of each sched in. Without this improvement
> >>>>> an interrupt may also claim floating point corrupting guest state.
> >>>>
> >>>> Not sure I follow. Could you please describe exactly what's happening?
> >>>
> >>> This was already discussed on the list, I will forward you the thread.
> >>
> >> The only thing I've seen in that thread was some pathetic theoretical
> >> case where an interrupt handler would enable fp and clobber state
> >> carelessly. That's not something I'm worried about.
> >
> > Neither me though I don't find it pathetic. Please refer it to Scott.
>
> If from Linux's point of view we look like a user space program with
> active floating point registers, we don't have to worry about this
> case. Kernel code that would clobber that fp state would clobber
> random user space's fp state too.

This patch makes it closer to how it works with a user space program.
Or rather, it reduces the time window when we don't (and can't) act
like a normal userspace program -- and ensures that we have interrupts
disabled during that window. An interrupt can't randomly clobber FP
state; it has to call enable_kernel_fp() just like KVM does.
enable_kernel_fp() clears the userspace MSR_FP to ensure that the state
it saves gets restored before userspace uses it again, but that won't
have any effect on guest execution (especially in HV-mode). Thus
kvmppc_load_guest_fp() needs to be atomic with guest entry.
Conceptually it's like taking an automatic FP unavailable trap when we
enter the guest, since we can't be lazy in HV-mode.

> >> I really don't see where this patch improves anything tbh. It certainly
> >> makes the code flow more awkward.
> >
> > I was pointing you to this: The idea of FPU/AltiVec laziness that the kernel
> > is struggling to achieve is to reduce the number of store/restore operations.
> > Without this improvement we restore the unit each time we are sched in. If
> > another process takes the ownership of the unit (on SMP it's even worse but
> > don't bother with this) the kernel stores the unit state to qemu task. This
> > can happen multiple times during handle_exit().
> >
> > Do you see it now?
>
> Yup. Looks good. The code flow is very hard to follow though - there
> are a lot of implicit assumptions that don't get documented anywhere.
> For example the fact that we rely on giveup_fpu() to remove MSR_FP
> from our thread.

That's not new to this patch...

-Scott
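
The argument in the reply above (an interrupt has to claim the FPU through enable_kernel_fp(), and clearing MSR_FP on the host thread does nothing for a guest that is about to run) can be illustrated with a toy userspace model in C. This is only a sketch under loud assumptions: every toy_ name is hypothetical, a single fp_reg stands in for the whole FP register file, and the two scenarios illustrate the window being discussed rather than the actual KVM code paths before and after the patch.

```c
/* Toy userspace model of the FP ownership window; not kernel code.
 * All toy_* names are hypothetical and exist only for illustration. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_thread {
    double fp_save; /* saved copy of the FP state */
    bool msr_fp;    /* set while the thread's state is live in the FPU */
};

struct toy_cpu {
    double fp_reg;            /* stands in for the whole FP register file */
    struct toy_thread *owner; /* thread whose state is live, or NULL */
    bool irqs_enabled;
};

/* Save the live state back to its owner and clear the owner's MSR_FP. */
static void toy_giveup_fp(struct toy_cpu *cpu)
{
    if (cpu->owner != NULL) {
        cpu->owner->fp_save = cpu->fp_reg;
        cpu->owner->msr_fp = false;
        cpu->owner = NULL;
    }
}

/* An interrupt handler that uses FP: it must claim the unit first. */
static void toy_irq_using_fp(struct toy_cpu *cpu)
{
    if (!cpu->irqs_enabled)
        return;             /* interrupt cannot be delivered */
    toy_giveup_fp(cpu);     /* models the enable_kernel_fp() step */
    cpu->fp_reg = -1.0;     /* handler overwrites the register */
}

/* Load guest FP state unless the thread already owns the unit. */
static void toy_load_guest_fp(struct toy_cpu *cpu, struct toy_thread *vcpu)
{
    if (vcpu->msr_fp)
        return;
    toy_giveup_fp(cpu);
    assert(cpu->owner == NULL);
    cpu->fp_reg = vcpu->fp_save;
    cpu->owner = vcpu;
    vcpu->msr_fp = true;
}

/* The guest reads the hardware register directly; the host thread's
 * MSR_FP bit is not consulted, so there is no lazy reload here. */
static double toy_enter_guest(const struct toy_cpu *cpu)
{
    return cpu->fp_reg;
}

/* Load early, then leave a window with interrupts enabled. */
double toy_run_early_load(double guest_val)
{
    struct toy_thread vcpu = { .fp_save = guest_val, .msr_fp = false };
    struct toy_cpu cpu = { .fp_reg = 0.0, .owner = NULL, .irqs_enabled = true };

    toy_load_guest_fp(&cpu, &vcpu);
    toy_irq_using_fp(&cpu);     /* window is open: handler runs */
    cpu.irqs_enabled = false;
    return toy_enter_guest(&cpu);
}

/* Load with interrupts already disabled, immediately before entry. */
double toy_run_late_load(double guest_val)
{
    struct toy_thread vcpu = { .fp_save = guest_val, .msr_fp = false };
    struct toy_cpu cpu = { .fp_reg = 0.0, .owner = NULL, .irqs_enabled = true };

    toy_irq_using_fp(&cpu);     /* harmless: guest state not loaded yet */
    cpu.irqs_enabled = false;
    toy_load_guest_fp(&cpu, &vcpu);
    toy_irq_using_fp(&cpu);     /* blocked: interrupts are disabled */
    return toy_enter_guest(&cpu);
}
```

In this model toy_run_early_load() hands the guest the value written by the interrupt handler, even though the guest's own state was saved correctly and MSR_FP was cleared, whereas toy_run_late_load() hands the guest its own value. That is the sense in which the load needs to be atomic with guest entry.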