From: Gilles Chanteperdrix <gilles.chanteperdrix@xenomai.org>
To: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Xenomai <xenomai@xenomai.org>
Subject: Re: [Xenomai] ipipe/x86: do not restore during context switch
Date: Wed, 06 Feb 2013 19:40:18 +0100
Message-ID: <5112A392.3050302@xenomai.org>
In-Reply-To: <5112A269.40609@siemens.com>

On 02/06/2013 07:35 PM, Jan Kiszka wrote:

> On 2013-02-06 19:31, Gilles Chanteperdrix wrote:
>> On 02/06/2013 07:26 PM, Jan Kiszka wrote:
>>
>>> On 2013-02-06 18:51, Gilles Chanteperdrix wrote:
>>>> On 02/06/2013 06:47 PM, Jan Kiszka wrote:
>>>>
>>>>> On 2013-02-06 18:44, Gilles Chanteperdrix wrote:
>>>>>> On 02/06/2013 06:40 PM, Jan Kiszka wrote:
>>>>>>
>>>>>>> On 2013-02-06 18:35, Gilles Chanteperdrix wrote:
>>>>>>>> On 02/06/2013 06:33 PM, Jan Kiszka wrote:
>>>>>>>>
>>>>>>>>> On 2013-02-06 18:09, Gilles Chanteperdrix wrote:
>>>>>>>>>> On 02/06/2013 06:03 PM, Jan Kiszka wrote:
>>>>>>>>>>
>>>>>>>>>>> Gilles,
>>>>>>>>>>>
>>>>>>>>>>> do you remember if this core-3.4 change was a performance optimization
>>>>>>>>>>> or a necessary fix? Also, I'm not yet understanding why we need all the
>>>>>>>>>>> #ifdefs except for the first one which forces fpu.preload to 0.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> It is a performance optimization: without it, we systematically hit the
>>>>>>>>>> maximum latency whenever the timer ticks during a context switch that
>>>>>>>>>> restores the FPU. Note that if you change that, you will probably break
>>>>>>>>>> -forge.
>>>>>>>>>
>>>>>>>>> According to the Intel folks who introduced eagerfpu, xsave, or at least
>>>>>>>>> xsaveopt (which I haven't implemented yet) is now faster than serializing
>>>>>>>>> clts/stts. On the other hand, the worst case is a full SSE + AVX restore
>>>>>>>>> while the target RT task is not depending on the FPU.
>>>>>>>>
>>>>>>>>
>>>>>>>> Without xsave, we never restore fpu if the RT task never used it. This
>>>>>>>> changes with xsave?
>>>>>>>
>>>>>>> This would change with eagerfpu which depends on xsave. The kernel
>>>>>>> sticks with lazy switching in the absence of xsaveopt.
>>>>>>
>>>>>>
>>>>>> I am not sure you understand what I mean, so, I am going to reformulate.
>>>>>> Without xsave, Linux uses lazy fpu restore, and Xenomai uses eager fpu
>>>>>> restore. But Xenomai eager fpu restore is a nop if the RT task never
>>>>>> used FPU since its inception (and all the parents from which it is
>>>>>> cloned never used FPU either). Does Linux eager switching mean the same
>>>>>> thing?
>>>>>
>>>>> eagerfpu means: always call xsaveopt/xrstor, it will optimize the case
>>>>> that the FPU was unused by the source/destination. And no fiddling with
>>>>> TS anymore, at no time.
>>>>
>>>>
>>>> I still do not understand this sentence then: "the worst case is a full
>>>> SSE + AVX restore while the target RT task is not depending on the FPU."
>>>> If the RT task does not depend on the FPU, why would xsaveopt/xrstor
>>>> restore SSE and AVX context?
>>>
>>> Switching between two tasks that both use the full state space defines
>>> the maximum latency of the FPU save/restore step. We cannot interrupt
>>> xsave or xrstor instructions, but we couldn't interrupt fxsave either.
>>>
>>> What we can do, though, is to ensure that we have at least a preemption
>>> point between both. Do we have such a thing so far, a chance to handle a
>>> Xenomai IRQ between an FPU save for Linux task A and an FPU restore for
>>> the following task B? If not, the discussion is moot and we are just
>>> shifting probabilities of the very same worst case.
>>
>>
>> We can implement unlocked context switch support on x86 as we do on
>> other platforms. I tried that on atom actually and it did not really
>> improve latencies. You have not answered my question, though: why would
>> xsave/xrstor do anything if the RT thread has not used the FPU (and none
>> of its parents has used the FPU)?
> 
> We first of all would have to wait for the unrelated switch between
> those two Linux tasks before we could handle the IRQ and switch to the
> FPU-free RT task. __switch_to is atomic, also for Linux->Linux, no?


Only the *IP and *SP switch needs to be atomic; the rest of __switch_to can
be split into several atomic sections, which is what I tested on Atom. But
as I said, it did not lead to any latency improvement.
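
As a rough illustration of the splitting idea, here is a toy model in C.
It is not ipipe or kernel code; the function names and the step costs are
made up for this sketch. If the whole switch is one atomic section, an IRQ
can be blocked for the sum of all steps; with a preemption point between
steps, it is blocked for at most the longest single step.

```c
#include <stddef.h>

/*
 * Toy model only: step_cost[i] is a hypothetical duration of one step of a
 * context switch (e.g. FPU save, *IP/*SP switch, FPU restore).
 */

/* Whole switch is one atomic section: an IRQ arriving right at the start
 * has to wait for every step, so the bound is the sum of all costs. */
unsigned atomic_worst_case(const unsigned *step_cost, size_t n)
{
    unsigned total = 0;
    for (size_t i = 0; i < n; i++)
        total += step_cost[i];
    return total;
}

/* Each step is its own atomic section with a preemption point in between:
 * an IRQ waits at most for the step currently running, so the bound is
 * the cost of the longest single step. */
unsigned split_worst_case(const unsigned *step_cost, size_t n)
{
    unsigned longest = 0;
    for (size_t i = 0; i < n; i++)
        if (step_cost[i] > longest)
            longest = step_cost[i];
    return longest;
}
```

With hypothetical costs of 3 (FPU save), 1 (*IP/*SP switch) and 4 (FPU
restore), the bound in this model drops from 8 to 4, but never below the
cost of the longest uninterruptible step.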

-- 
                                                                Gilles.

