From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Jan Beulich" Subject: Re: Re: [PATCH] FPU LWP 6/8: create lazy and non-lazy FPU restore functions Date: Fri, 06 May 2011 08:49:18 +0100 Message-ID: <4DC3C41E020000780003FFB2@vpn.id2.novell.com> References: <4DC062ED.3070802@amd.com> <4DC117D6020000780003F9CB@vpn.id2.novell.com> <4DC17FF3.5080706@amd.com> <4DC26A46020000780003FC3E@vpn.id2.novell.com> <4DC31973.8040009@amd.com> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: quoted-printable Return-path: In-Reply-To: <4DC31973.8040009@amd.com> Content-Disposition: inline List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: Wei Huang Cc: "xen-devel@lists.xensource.com" , KeirFraser List-Id: xen-devel@lists.xenproject.org >>> On 05.05.11 at 23:41, Wei Huang wrote: > Hi Jan, >=20 > If we want to make LWP restore optional in vcpu_restore_fpu_eager(), = we=20 > have to change vcpu_save_fpu() as well. Otherwise, the extended state=20 > will become inconsistent for non-LWP VCPUs (because save and restore = is=20 > asymmetric). There are two approaches: >=20 > 1. In vcpu_save_fpu(), clean physical CPU's extended state for VCPU=20 > which is being scheduled in. This prevents messy states from causing=20 > problems. The disadvantage is the cleaning cost, which would out-weight= =20 > the benefits. Cleaning cost? Wasn't it that one can express to default-initialize fields during xrstor (which, if indeed expensive, you'd want to trigger only if you know the physical CPU's state is dirty, i.e. in this case requiring a per-CPU variable that gets evaluated and updated on context restore). > 2. Add a new variable in VCPU to track whether nonlazy state is dirty. = I=20 > think this is better. See the attached file. >=20 > Let me know if it is what you want. After that, I will re-spin the = patches. Yes, this looks like what I meant. Two suggestions: The new field's name (nonlazy_xstate_dirty) would perhaps better be something like nonlazy_xstate_used, so that name and use are in sync. And the check in vcpu_restore_fpu_eager() probably doesn't need to re-evaluate xsave_enabled(v), since the flag can't get set without this (if you absolutely want to, put in an ASSERT() to this effect). Jan