From: Jan Kiszka <jan.kiszka@domain.hid>
To: Gilles Chanteperdrix <gilles.chanteperdrix@xenomai.org>
Cc: xenomai-core <xenomai@xenomai.org>
Subject: Re: [Xenomai-core] x86: Endless minor faults
Date: Sun, 05 Jul 2009 19:12:46 +0200 [thread overview]
Message-ID: <4A50DF0E.3070800@domain.hid> (raw)
In-Reply-To: <4A50BF29.4000007@domain.hid>
[-- Attachment #1: Type: text/plain, Size: 5555 bytes --]
Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Gilles Chanteperdrix wrote:
>>> Jan Kiszka wrote:
>>>> Gilles Chanteperdrix wrote:
>>>>> Jan Kiszka wrote:
>>>>>> Hi again,
>>>>>>
>>>>>> this is now basically the patch which seems to stabilized x86 /wrt mmu
>>>>>> switches again:
>>>>>>
>>>>>> There were 3 race windows between setting active_mm of the current task
>>>>>> and actually switching it (that's a noarch issue), there were several
>>>>>> calls into switch_mm without proper hard interrupt protection (on archs
>>>>>> that have no preemptible switch_mm, like x86) and there was a race in
>>>>>> x86's leave_mm (as Gilles already remarked earlier in this thread -
>>>>>> while I was looking at an old tree where smp_invalidate_interrupt took
>>>>>> care of this).
>>>>>>
>>>>>> The patch is thought as a basis for further discussions about
>>>>>>
>>>>>> o how to solve all the issues for all archs technically (ideally
>>>>>> without the need to patch noarch parts in an arch-specific way...)
>>>>>>
>>>>>> o if anyone thinks there could be more spots like these (I've checked
>>>>>> the code only for x86 so far)
>>>>>> diff --git a/fs/aio.c b/fs/aio.c
>>>>>> index 76da125..0286f0f 100644
>>>>>> --- a/fs/aio.c
>>>>>> +++ b/fs/aio.c
>>>>>> @@ -618,13 +618,16 @@ static void use_mm(struct mm_struct *mm)
>>>>>> {
>>>>>> struct mm_struct *active_mm;
>>>>>> struct task_struct *tsk = current;
>>>>>> + unsigned long flags;
>>>>>>
>>>>>> task_lock(tsk);
>>>>>> active_mm = tsk->active_mm;
>>>>>> atomic_inc(&mm->mm_count);
>>>>>> tsk->mm = mm;
>>>>>> + local_irq_save_hw_cond(flags);
>>>>>> tsk->active_mm = mm;
>>>>>> - switch_mm(active_mm, mm, tsk);
>>>>>> + __switch_mm(active_mm, mm, tsk);
>>>>>> + local_irq_restore_hw_cond(flags);
>>>>>> task_unlock(tsk);
>>>>>>
>>>>>> mmdrop(active_mm);
>>>>>> diff --git a/fs/exec.c b/fs/exec.c
>>>>>> index 3b36c69..06591ac 100644
>>>>>> --- a/fs/exec.c
>>>>>> +++ b/fs/exec.c
>>>>>> @@ -718,6 +718,7 @@ static int exec_mmap(struct mm_struct *mm)
>>>>>> {
>>>>>> struct task_struct *tsk;
>>>>>> struct mm_struct * old_mm, *active_mm;
>>>>>> + unsigned long flags;
>>>>>>
>>>>>> /* Notify parent that we're no longer interested in the old VM */
>>>>>> tsk = current;
>>>>>> @@ -740,8 +741,10 @@ static int exec_mmap(struct mm_struct *mm)
>>>>>> task_lock(tsk);
>>>>>> active_mm = tsk->active_mm;
>>>>>> tsk->mm = mm;
>>>>>> + local_irq_save_hw_cond(flags);
>>>>>> tsk->active_mm = mm;
>>>>>> activate_mm(active_mm, mm);
>>>>>> + local_irq_restore_hw_cond(flags);
>>>>>> task_unlock(tsk);
>>>>>> arch_pick_mmap_layout(mm);
>>>>>> if (old_mm) {
>>>>>> diff --git a/kernel/fork.c b/kernel/fork.c
>>>>>> index 01a836b..cf3b68a 100644
>>>>>> --- a/kernel/fork.c
>>>>>> +++ b/kernel/fork.c
>>>>>> @@ -1665,11 +1665,14 @@ SYSCALL_DEFINE1(unshare, unsigned long, unshare_flags)
>>>>>> }
>>>>>>
>>>>>> if (new_mm) {
>>>>>> + unsigned long flags;
>>>>>> mm = current->mm;
>>>>>> active_mm = current->active_mm;
>>>>>> current->mm = new_mm;
>>>>>> + local_irq_save_hw_cond(flags);
>>>>>> current->active_mm = new_mm;
>>>>>> activate_mm(active_mm, new_mm);
>>>>>> + local_irq_restore_hw_cond(flags);
>>>>>> new_mm = mm;
>>>>>> }
>>>>> Ok. These are patches of the noarch code which are dependent on x86
>>>>> implementation.
>>>>>
>>>>> I think we need something like
>>>>>
>>>>> mm_change_enter(flags);
>>>>> mm_change_leave(flags);
>>>>>
>>>>> which would resolve to ipipe_active_mm = NULL on architectures with
>>>>> unlocked context switches, and to local_irq_save_hw/local_irq_restore on
>>>>> x86?
>>>> + we have to change switch_mm in the lockless case, and maybe also its
>>>> caller (if I look at the arm code), to use ipipe_active_mm in order to
>>>> decide if to switch or not - see my explanation of the possible race in
>>>> aio.c.
>>> This is already the way it is currently working. the "prev" passed to
>>> switch_mm is always ipipe_active_mm.
>>>
>> Ah, I missed the difference in arm's xnarch_leave_root. Now I got it.
>>
>> OK, but this needs to be thought through in details again, specifically
>> also the enter-root scenarios. I have no safe feeling about it yet,
>> partly because switch_mm fiddles with ipipe_active_mm as well, but more
>> generally because we used to have such critical races in this area for a
>> "fairly" long time.
>
> When reentering root with ipipe_active_mm NULL, the TIF_MMSWTICH_INT bit
> is set, in which case the switch_mm preempted by Xenomai is restarted.
>
Right. But now we are also dealing with preemptions outside that loop.
And we are dealing with archs that have atomic switch_mm, thus no such
loops.
The goal should be to find a solution against the races outside
switch_mm that is applicable to all archs. But the question is still
unanswered to me if we can simply reuse existing ipipe_active_mm without
disturbing its current use in non-atomic switch_mm.
Moreover, part of the problems on x86 was the non-atomic update of
cpu_vm_mask vs. actually switching the hardware. How do you deal with
cpu_vm_mask on ARM? Why can you call switch_mm without IRQ protection?
My impression is that only the (costly) cpu_switch_mm should be allowed
to be preemptible to have a safe state around it. I'm also asking as we
would have to address this on x86 & co. once switch_mm could be called
with undefined prev_mm.
Jan
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 257 bytes --]
next prev parent reply other threads:[~2009-07-05 17:12 UTC|newest]
Thread overview: 45+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-06-29 17:35 [Xenomai-core] x86: Endless minor faults Jan Kiszka
2009-06-29 18:09 ` Philippe Gerum
2009-06-29 18:20 ` Jan Kiszka
2009-06-30 8:32 ` Jan Kiszka
2009-06-30 8:36 ` Gilles Chanteperdrix
2009-06-30 8:44 ` Jan Kiszka
2009-06-30 8:42 ` Gilles Chanteperdrix
2009-06-30 8:56 ` Jan Kiszka
2009-06-30 9:20 ` Philippe Gerum
2009-06-30 9:21 ` Gilles Chanteperdrix
2009-06-30 9:25 ` Philippe Gerum
2009-06-30 9:26 ` Gilles Chanteperdrix
2009-06-30 9:27 ` Philippe Gerum
2009-06-30 9:34 ` Gilles Chanteperdrix
2009-06-30 16:11 ` Jan Kiszka
2009-07-01 11:56 ` Jan Kiszka
2009-07-01 12:05 ` Jan Kiszka
2009-07-01 12:24 ` Gilles Chanteperdrix
2009-07-01 12:39 ` Jan Kiszka
2009-07-01 12:41 ` Gilles Chanteperdrix
2009-07-01 12:41 ` Jan Kiszka
2009-07-01 15:51 ` Gilles Chanteperdrix
2009-07-01 16:01 ` Jan Kiszka
2009-07-01 16:04 ` Jan Kiszka
2009-07-01 17:56 ` Jan Kiszka
2009-07-01 18:15 ` Philippe Gerum
2009-07-01 18:27 ` Philippe Gerum
2009-07-01 18:58 ` Jan Kiszka
2009-07-01 19:14 ` Jan Kiszka
2009-07-02 2:05 ` Gilles Chanteperdrix
2009-07-02 6:24 ` Jan Kiszka
2009-07-02 6:59 ` Gilles Chanteperdrix
2009-07-02 7:16 ` Jan Kiszka
2009-07-02 7:44 ` Gilles Chanteperdrix
2009-07-01 18:56 ` Jan Kiszka
2009-07-02 7:11 ` Philippe Gerum
2009-07-02 17:14 ` Jan Kiszka
2009-07-03 14:54 ` Gilles Chanteperdrix
2009-07-03 15:06 ` Jan Kiszka
2009-07-04 16:39 ` Gilles Chanteperdrix
2009-07-05 12:01 ` Jan Kiszka
2009-07-05 14:56 ` Gilles Chanteperdrix
2009-07-05 17:12 ` Jan Kiszka [this message]
2009-07-06 7:54 ` Gilles Chanteperdrix
2009-07-07 18:45 ` Jan Kiszka
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4A50DF0E.3070800@domain.hid \
--to=jan.kiszka@domain.hid \
--cc=gilles.chanteperdrix@xenomai.org \
--cc=xenomai@xenomai.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.