All of lore.kernel.org
 help / color / mirror / Atom feed
From: Wei Huang <wei.huang2@amd.com>
To: Jan Beulich <JBeulich@novell.com>
Cc: "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
	KeirFraser <keir@xen.org>
Subject: Re: Re: [PATCH] FPU LWP 6/8: create lazy and non-lazy FPU	restore functions
Date: Thu, 5 May 2011 16:41:07 -0500	[thread overview]
Message-ID: <4DC31973.8040009@amd.com> (raw)
In-Reply-To: <4DC26A46020000780003FC3E@vpn.id2.novell.com>

[-- Attachment #1: Type: text/plain, Size: 2363 bytes --]

Hi Jan,

If we want to make LWP restore optional in vcpu_restore_fpu_eager(), we 
have to change vcpu_save_fpu() as well. Otherwise, the extended state 
will become inconsistent for non-LWP VCPUs (because save and restore is 
asymmetric). There are two approaches:

1. In vcpu_save_fpu(), clean physical CPU's extended state for VCPU 
which is being scheduled in. This prevents messy states from causing 
problems. The disadvantage is the cleaning cost, which would out-weight 
the benefits.
2. Add a new variable in VCPU to track whether nonlazy state is dirty. I 
think this is better. See the attached file.

Let me know if it is what you want. After that, I will re-spin the patches.

Thanks,
-Wei


On 05/05/2011 02:13 AM, Jan Beulich wrote:
>>>> On 04.05.11 at 18:33, Wei Huang<wei.huang2@amd.com>  wrote:
>> Checking whether there is a non-lazy state to save is architectural
>> specific and very messy. For instance, we need to read LWP_CBADDR to
>> confirm LWP's dirty state. This MSR is AMD specific and we don't want to
>> add it here. Plus reading data from LWP_CBADDR MSR might be as expensive
>> as clts/stts.
>>
>> My previous email showed that the overhead with LWP is around 1%-2% of
>> __context_switch(). For non lwp-capable CPU, this overhead should be
>> much smaller (only clts and stts) because xfeature_mask[LWP] is 0.
> I wasn't talking about determining whether LWP state is dirty, but
> much rather about LWP not being in use at all.
>
>> Yes, clts() and stts() don't have to called every time. How about this one?
>>
>> /* Restore FPU state whenever VCPU is schduled in. */
>> void vcpu_restore_fpu_eager(struct vcpu *v)
>> {
>>       ASSERT(!is_idle_vcpu(v));
>>
>>
>>       /* save the nonlazy extended state which is not tracked by CR0.TS bit */
>>       if ( xsave_enabled(v) )
>>       {
>>           /* Avoid recursion */
>>           clts();
>>           fpu_xrstor(v, XSTATE_NONLAZY);
>>           stts();
>>       }
> That's certainly better, but I'd still like to see the xsave_enabled()
> check to be replaced by some form of lwp_enabled() or
> lazy_xsave_needed() or some such (which will at once exclude all
> pv guests until you care to add support for them).
>
> Jan
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
>


[-- Attachment #2: nonlazy_dirty.txt --]
[-- Type: text/plain, Size: 2594 bytes --]

diff -r 343470b5ad6b xen/arch/x86/hvm/svm/svm.c
--- a/xen/arch/x86/hvm/svm/svm.c	Tue May 03 14:05:28 2011 -0500
+++ b/xen/arch/x86/hvm/svm/svm.c	Thu May 05 16:40:41 2011 -0500
@@ -716,7 +716,7 @@
     unsigned int eax, ebx, ecx, edx;
     uint32_t msr_low;
     
-    if ( cpu_has_lwp )
+    if ( xsave_enabled(v) && cpu_has_lwp )
     {
         hvm_cpuid(0x8000001c, &eax, &ebx, &ecx, &edx);
         msr_low = (uint32_t)msr_content;
@@ -729,6 +729,9 @@
         /* CPU might automatically correct reserved bits. So read it back. */
         rdmsrl(MSR_AMD64_LWP_CFG, msr_content);
         v->arch.hvm_svm.guest_lwp_cfg = msr_content;
+
+        /* track nonalzy state if LWP_CFG is non-zero. */
+        v->arch.nonlazy_xstate_dirty = !!(msr_content);
     }
 
     return 0;
diff -r 343470b5ad6b xen/arch/x86/i387.c
--- a/xen/arch/x86/i387.c	Tue May 03 14:05:28 2011 -0500
+++ b/xen/arch/x86/i387.c	Thu May 05 16:40:41 2011 -0500
@@ -98,13 +98,13 @@
 /*      FPU Save Functions     */
 /*******************************/
 /* Save x87 extended state */
-static inline void fpu_xsave(struct vcpu *v, uint64_t mask)
+static inline void fpu_xsave(struct vcpu *v)
 {
     /* XCR0 normally represents what guest OS set. In case of Xen itself,
      * we set all accumulated feature mask before doing save/restore.
      */
     set_xcr0(v->arch.xcr0_accum);
-    xsave(v, mask);
+    xsave(v, v->arch.nonlazy_xstate_dirty ? XSTATE_ALL : XSTATE_LAZY);
     set_xcr0(v->arch.xcr0);    
 }
 
@@ -164,15 +164,15 @@
 void vcpu_restore_fpu_eager(struct vcpu *v)
 {
     ASSERT(!is_idle_vcpu(v));
-
-    /* Avoid recursion */
-    clts();
     
     /* save the nonlazy extended state which is not tracked by CR0.TS bit */
-    if ( xsave_enabled(v) )
+    if ( xsave_enabled(v) && v->arch.nonlazy_xstate_dirty )
+    {
+        /* Avoid recursion */
+        clts();
         fpu_xrstor(v, XSTATE_NONLAZY);
-
-    stts();
+        stts();
+    }
 }
 
 /* 
@@ -219,7 +219,7 @@
     clts();
 
     if ( xsave_enabled(v) )
-        fpu_xsave(v, XSTATE_ALL);
+        fpu_xsave(v);
     else if ( cpu_has_fxsr )
         fpu_fxsave(v);
     else
diff -r 343470b5ad6b xen/include/asm-x86/domain.h
--- a/xen/include/asm-x86/domain.h	Tue May 03 14:05:28 2011 -0500
+++ b/xen/include/asm-x86/domain.h	Thu May 05 16:40:41 2011 -0500
@@ -492,6 +492,9 @@
      * it explicitly enables it via xcr0.
      */
     uint64_t xcr0_accum;
+    /* This variable determines whether nonlazy extended state is dirty and 
+     * needs to be tracked. */
+    bool_t nonlazy_xstate_dirty;
 
     struct paging_vcpu paging;
 

[-- Attachment #3: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

  reply	other threads:[~2011-05-05 21:41 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-05-03 20:17 [PATCH] FPU LWP 6/8: create lazy and non-lazy FPU restore functions Wei Huang
2011-05-04  7:09 ` Jan Beulich
2011-05-04 16:33   ` Wei Huang
2011-05-05  7:13     ` Jan Beulich
2011-05-05 21:41       ` Wei Huang [this message]
2011-05-06  7:49         ` Jan Beulich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4DC31973.8040009@amd.com \
    --to=wei.huang2@amd.com \
    --cc=JBeulich@novell.com \
    --cc=keir@xen.org \
    --cc=xen-devel@lists.xensource.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.