From: Jan Kiszka <jan.kiszka@siemens.com>
To: Gilles Chanteperdrix <gilles.chanteperdrix@xenomai.org>
Cc: Xenomai <xenomai@xenomai.org>
Subject: Re: [Xenomai] ipipe/x86: do not restore during context switch
Date: Wed, 06 Feb 2013 18:40:41 +0100
Message-ID: <51129599.3080709@siemens.com>
In-Reply-To: <5112945F.8080102@xenomai.org>

On 2013-02-06 18:35, Gilles Chanteperdrix wrote:
> On 02/06/2013 06:33 PM, Jan Kiszka wrote:
> 
>> On 2013-02-06 18:09, Gilles Chanteperdrix wrote:
>>> On 02/06/2013 06:03 PM, Jan Kiszka wrote:
>>>
>>>> Gilles,
>>>>
>>>> do you remember if this core-3.4 change was a performance optimization
>>>> or a necessary fix? Also, I'm not yet understanding why we need all the
>>>> #ifdefs except for the first one which forces fpu.preload to 0.
>>>
>>>
>>> It is a performance optimization: without it, we systematically hit the
>>> maximum latency when the timer would tick during a context switch which
>>> restores the FPU. Note that if you change that, you will probably break
>>> -forge.
>>
>> According to the Intel folks who introduced eagerfpu, xsave, or at least
>> xsaveopt (which I haven't implemented yet) is now faster than serializing
>> clts/stts. On the other hand, the worst case is a full SSE + AVX restore
>> while the target RT task is not depending on the FPU.
> 
> 
> Without xsave, we never restore fpu if the RT task never used it. This
> changes with xsave?

This would change with eagerfpu which depends on xsave. The kernel
sticks with lazy switching in the absence of xsaveopt.
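
As a rough illustration of that dependency, the sketch below checks a
/proc/cpuinfo "flags" line for the xsaveopt feature. This is user-space
code, not what the kernel or I-pipe does internally, and has_cpu_flag()
is a hypothetical helper name introduced only for this example.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if `flag` appears as a whole word in a
 * /proc/cpuinfo-style "flags" line, 0 otherwise. Whole-word matching
 * matters here because "xsave" is a prefix of "xsaveopt".
 */
static int has_cpu_flag(const char *flags, const char *flag)
{
    size_t len;
    const char *p;

    if (flags == NULL || flag == NULL || *flag == '\0')
        return 0;

    len = strlen(flag);
    p = flags;
    while ((p = strstr(p, flag)) != NULL) {
        /* The match must start at the line start or after whitespace... */
        int starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
        /* ...and must end at the line end or before whitespace. */
        int ends = p[len] == '\0' || p[len] == ' ' ||
                   p[len] == '\t' || p[len] == '\n';

        if (starts && ends)
            return 1;
        p++;    /* keep scanning after a partial match */
    }
    return 0;
}
```

Feeding it the "flags" line read from /proc/cpuinfo and asking for
"xsaveopt" gives a hint whether the eager model would be picked by
default on that machine, under the assumption stated above.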

From the log message of the related commit:

    Reasons driving this model change [Jan: eagerfpu] are:
    
    i. Newer processors support optimized state save/restore using xsaveopt and
    xrstor by tracking the INIT state and MODIFIED state during context-switch.
    This is faster than modifying the cr0.TS bit which has serializing semantics.
    
    ii. Newer glibc versions use SSE for some of the optimized copy/clear routines.
    With certain workloads (like boot, kernel-compilation etc), application
    completes its work with in the first 5 task switches, thus taking upto 5 #DNA
    traps with the kernel not getting a chance to apply the above mentioned
    pre-load heuristic.
    
    iii. Some xstate features (like AMD's LWP feature) don't honor the cr0.TS bit
    and thus will not work correctly in the presence of lazy restore. Non-lazy
    state restore is needed for enabling such features.
    
    Some data on a two socket SNB system:
     * Saved 20K DNA exceptions during boot on a two socket SNB system.
     * Saved 50K DNA exceptions during kernel-compilation workload.
     * Improved throughput of the AVX based checksumming function inside the
       kernel by ~15% as xsave/xrstor is faster than the serializing clts/stts
       pair.

I guess for a first 3.8 version I will now simply force eagerfpu off at
I-pipe level. We should then likely benchmark the current code against
an eagerfpu+xsaveopt-enabled version to decide.
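
To picture the trade-off before any real benchmark, here is a toy
user-space model that only counts events. It is a sketch under strong
assumptions: it ignores the preload heuristic, ignores the case where
the FPU owner does not change, and says nothing about how long a
restore or a trap takes. All names (toy_task, toy_stats, run_lazy,
run_eager) are made up for this example.

```c
#include <stddef.h>

/* Hypothetical record: one entry per task switched in. */
struct toy_task {
    int uses_fpu;       /* 1 if the task touches FPU/SSE/AVX state */
};

struct toy_stats {
    unsigned restores;  /* FPU state restores performed */
    unsigned traps;     /* #DNA (device-not-available) traps taken */
};

/*
 * Lazy model: the switch itself restores nothing. A task that touches
 * the FPU takes one #DNA trap, and the state is restored in the trap.
 */
static struct toy_stats run_lazy(const struct toy_task *sched, size_t n)
{
    struct toy_stats s = { 0, 0 };
    size_t i;

    for (i = 0; i < n; i++) {
        if (sched[i].uses_fpu) {
            s.traps++;
            s.restores++;
        }
    }
    return s;
}

/*
 * Eager model: every switch restores the FPU state, whether or not the
 * incoming task will use it, so no #DNA traps occur.
 */
static struct toy_stats run_eager(const struct toy_task *sched, size_t n)
{
    struct toy_stats s = { 0, 0 };
    size_t i;

    for (i = 0; i < n; i++) {
        (void)sched[i];     /* FPU usage does not matter in this model */
        s.restores++;
    }
    return s;
}
```

For a schedule where only one of four switched-in tasks uses the FPU,
the lazy model counts one trap and one restore, while the eager model
counts four restores and no traps. Whether four cheap restores beat one
trap plus the clts/stts cost, in particular for the worst-case latency
of an RT task that never uses the FPU, is what the benchmark has to
show.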

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux

