From mboxrd@z Thu Jan 1 00:00:00 1970
From: Avi Kivity
Subject: Re: [PATCH] Lazy FPU save/restore for VT
Date: Thu, 26 Apr 2007 19:03:59 +0300
Message-ID: <4630CD6F.8050706@qumranet.com>
References: <4630CA1E.2000808@us.ibm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Cc: kvm-devel
To: Anthony Liguori
In-Reply-To: <4630CA1E.2000808-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
Sender: kvm-devel-bounces-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
Errors-To: kvm-devel-bounces-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
List-Id: kvm.vger.kernel.org

Anthony Liguori wrote:
> Howdy,
>
> Attached patch implements lazy-fpu save/restore for VT.  VMEXIT time
> improves by about 10% (~550 cycles).  I had to do a couple more things
> to get it working than I had to do with SVM.  I changed the CR0 host
> mask to be all 1's so that any attempt to write to CR0 causes a
> VMEXIT.  I don't think there are any remaining bits now that we want
> to trap TS that are safe for the guest to access and are in the
> fast-paths.

Great!  With a couple more msrs we'll get to 50% off what we had two
weeks ago.

>
> Since we're trapping TS, I had to implement CLTS exit handling.
> vcpu->cr0 also had a rather bizarre life cycle.  After a set_cr0, it
> was a proper shadow of the guest's CR0.  However, after a decache_cr0,
> it would contain the host's version of the bits covered by the CR0
> host mask so it was no longer a proper shadow.
>
> I got rid of the CR0 caching and made vcpu->cr0 always be equivalent
> to CR0_READ_SHADOW.  Once these changes were made, the rest of the
> patch was much like the SVM one.
>
>
> -	if ((intr_info & INTR_INFO_INTR_TYPE_MASK) == 0x200) { /* nmi */
> +	switch (intr_info & INTR_INFO_VECTOR_MASK) {
> +	case NMI_VECTOR:
>  		asm ("int $2");
>  		return 1;
> +	case NM_VECTOR:
> +		vcpu->fpu_active = 1;
> +		vmcs_clear_bits(EXCEPTION_BITMAP, 1 << NM_VECTOR);
> +		if (!(vcpu->cr0 & CR0_TS_MASK))
> +			vmcs_clear_bits(GUEST_CR0, CR0_TS_MASK);
> +		return 1;
>  	}
>

I'd like to be conservative here and not depend just on the vector:
check type == 2 for nmi, and type == 6, vector == NM for #NM.  In fact, I
see that is_page_fault() and is_external_interrupt() implement something
like that already.

Also, can you split the patch into the cr0 cache fix and the lazy fpu?

-- 
error compiling committee.c: too many arguments to function
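For illustration, the type-plus-vector check requested in the reply could look roughly like the sketch below. It is not taken from the patch or from the KVM tree: the helper names is_nmi() and is_no_device(), the check_helpers() function, and the literal mask values are all assumptions made so that the example stands alone. The sketch also tests the valid bit, which the original line in the quoted diff does not. One caveat: the reply cites type 6 for #NM, while the Intel SDM's interruption-information format appears to list hardware exceptions as type 3 (type 6 being software exceptions), so the exception type is kept as a single named constant that should be checked against the manual before use.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout of the VMX interruption-information word, spelled out
 * here only so the sketch is self-contained. */
#define INTR_INFO_VECTOR_MASK     0x000000ffu  /* bits 7:0  */
#define INTR_INFO_INTR_TYPE_MASK  0x00000700u  /* bits 10:8 */
#define INTR_INFO_VALID_MASK      0x80000000u  /* bit 31    */

#define INTR_TYPE_NMI        (2u << 8)
/* Assumption: hardware-exception type as listed in the Intel SDM.
 * The mail thread mentions 6; verify before relying on either value. */
#define INTR_TYPE_EXCEPTION  (3u << 8)

#define NM_VECTOR 7u  /* #NM, device not available */

/* Hypothetical helper: true only for a valid NMI, judged by type. */
static inline int is_nmi(uint32_t intr_info)
{
    return (intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VALID_MASK)) ==
           (INTR_TYPE_NMI | INTR_INFO_VALID_MASK);
}

/* Hypothetical helper: true only for a valid exception with vector #NM,
 * so an unrelated event that happens to carry vector 7 is rejected. */
static inline int is_no_device(uint32_t intr_info)
{
    return (intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VECTOR_MASK |
                         INTR_INFO_VALID_MASK)) ==
           (INTR_TYPE_EXCEPTION | NM_VECTOR | INTR_INFO_VALID_MASK);
}

/* Usage example on hand-built values. */
void check_helpers(void)
{
    assert(is_nmi(0x80000202u));         /* valid, type 2, vector 2 */
    assert(is_no_device(0x80000307u));   /* valid, type 3, vector 7 */
    assert(!is_no_device(0x80000207u));  /* vector 7 but NMI type   */
    assert(!is_no_device(0x00000307u));  /* valid bit clear         */
}
```

Matching on type and vector together keeps the #NM path from firing on an event of a different type that merely shares the vector number, which is the concern raised about switching on the vector alone.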