From: Anthony Liguori
Subject: Re: [PATCH] Lazy FPU save/restore for VT
Date: Thu, 26 Apr 2007 11:10:21 -0500
Message-ID: <4630CEED.3000206@codemonkey.ws>
References: <4630CA1E.2000808@us.ibm.com> <4630CD6F.8050706@qumranet.com>
In-Reply-To: <4630CD6F.8050706-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
To: Avi Kivity
Cc: kvm-devel
List-Id: kvm.vger.kernel.org

Avi Kivity wrote:
> Anthony Liguori wrote:
>
>> Howdy,
>>
>> Attached patch implements lazy-fpu save/restore for VT.  VMEXIT time
>> improves by about 10% (~550 cycles).  I had to do a couple more things
>> to get it working than I had to do with SVM.  I changed the CR0 host
>> mask to be all 1's so that any attempt to write to CR0 causes a
>> VMEXIT.  I don't think there are any remaining bits now that we want
>> to trap TS that are safe for the guest to access and are in the
>> fast-paths.
>>
>
> Great!  With a couple more msrs we'll get to 50% off what we had two
> weeks ago.
>
>
>> Since we're trapping TS, I had to implement CLTS exit handling.
>> vcpu->cr0 also had a rather bizarre life cycle.  After a set_cr0, it
>> was a proper shadow of the guest's CR0.  However, after a decache_cr0,
>> it would contain the host's version of the bits covered by the CR0
>> host mask so it was no longer a proper shadow.
>>
>> I got rid of the CR0 caching and made vcpu->cr0 always be equivalent
>> to CR0_READ_SHADOW.  Once these changes were made, the rest of the
>> patch was much like the SVM one.
>>
>>
>> -	if ((intr_info & INTR_INFO_INTR_TYPE_MASK) == 0x200) { /* nmi */
>> +	switch (intr_info & INTR_INFO_VECTOR_MASK) {
>> +	case NMI_VECTOR:
>>  		asm ("int $2");
>>  		return 1;
>> +	case NM_VECTOR:
>> +		vcpu->fpu_active = 1;
>> +		vmcs_clear_bits(EXCEPTION_BITMAP, 1 << NM_VECTOR);
>> +		if (!(vcpu->cr0 & CR0_TS_MASK))
>> +			vmcs_clear_bits(GUEST_CR0, CR0_TS_MASK);
>> +		return 1;
>>  	}
>>
>>
>
> I'd like to be conservative here and not depend just on the vector:
> check type == 2 for nmi, and type == 6, vector == NM for #NM.  In fact,
> I see that is_page_fault() and is_external_interrupt() implement
> something like that already.
>

Okay.

> Also, can you split the patch into the cr0 cache fix and the lazy fpu?
>

Sure, I was considering that already before I submitted.

Regards,

Anthony Liguori
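
The type-plus-vector check suggested above could look roughly like the following. This is only a sketch, not the follow-up patch: the helper names is_nmi() and is_no_device() are hypothetical, INTR_INFO_VALID_MASK and all numeric mask values are assumptions (only the 0x200 NMI test appears in the quoted diff), and the exception type of 6 is taken as stated in the mail and should be checked against the hardware manual before use.

```c
#include <stdint.h>

/* Assumed layout of the exit interruption-info word: vector in the low
 * byte, type in bits 10:8, valid flag in bit 31. */
#define INTR_INFO_VECTOR_MASK     0x000000ffu
#define INTR_INFO_INTR_TYPE_MASK  0x00000700u
#define INTR_INFO_VALID_MASK      0x80000000u  /* assumed name and value */

#define INTR_TYPE_NMI        (2u << 8)  /* same value as the old 0x200 test */
#define INTR_TYPE_EXCEPTION  (6u << 8)  /* "type == 6" as given in the mail */

#define NMI_VECTOR  2u
#define NM_VECTOR   7u   /* #NM, device not available */

/* Hypothetical helper: NMI is identified by type alone, as suggested. */
static inline int is_nmi(uint32_t intr_info)
{
	return (intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VALID_MASK)) ==
		(INTR_TYPE_NMI | INTR_INFO_VALID_MASK);
}

/* Hypothetical helper: #NM must match valid bit, type and vector together. */
static inline int is_no_device(uint32_t intr_info)
{
	return (intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VECTOR_MASK |
			     INTR_INFO_VALID_MASK)) ==
		(INTR_TYPE_EXCEPTION | NM_VECTOR | INTR_INFO_VALID_MASK);
}
```

The exit handler would then test is_nmi(intr_info) and is_no_device(intr_info) in turn instead of switching on the vector alone, so an event that merely shares vector 2 or 7 but has a different type is not misrouted.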