From: "Petr Cervenka" <grugh@domain.hid>
To: jan.kiszka@domain.hid
Cc: xenomai@xenomai.org
Subject: Re: [Xenomai-help] FPU not available
Date: Fri, 08 Feb 2008 13:41:48 +0100
Message-ID: <200802081341.13030@domain.hid>
In-Reply-To: <47AB1C21.3070702@domain.hid>
>Philippe Gerum wrote:
>> Jan Kiszka wrote:
>>> Jan Kiszka wrote:
>>>> Gilles Chanteperdrix wrote:
>>>>> On Wed, Feb 6, 2008 at 3:09 PM, Petr Cervenka <grugh@domain.hid> wrote:
>>>>>> Hello.
>>>>>> Recently, we switched to a newer Linux distribution (Kubuntu 7.10). During this switch we changed many things (Xenomai 2.4.1, Linux kernel 2.6.24, x86_64 architecture, ...).
>>>>>> Now we have a problem: in one of our tasks we are sometimes unable to use floating-point operations (under very specific circumstances). In that case the task crashes immediately, but the rest of the application runs "normally". The output from dmesg is attached to this message. The task was created with the T_FPU flag.
>>>>>> Is there anything we can check or change?
>>>>>> Petr Cervenka
>>>>> I do not know if this is related to the issue you are facing, but the
>>>>> first FPU fault of a thread running in primary mode may be handled by
>>>>> Xenomai without switching to secondary mode. So, maybe the fault
>>>>> epilogue implicitly expects Xenomai to have switched the fault to
>>>>> secondary mode and uses some secondary mode services such as
>>>>> ipipe_restore_root, whereas the thread never left primary mode.
>>>>>
>>>> Good point! That is probably this path (and not the one I stared at):
>>>>
>>>> __ipipe_handle_exception()
>>>> ...
>>>> 	if (unlikely(ipipe_trap_notify(vector, regs))) {
>>>> 		local_irq_restore(flags);
>>>> 		return 1;
>>>> 	}
>>>>
>>>> That needs some more thought...
>>> Looking at the whole __ipipe_handle_exception, the problem is related to
>>> the early, context-independent __ipipe_stall_root(). Can we postpone
>>> this safely after having called any potential high-stage hooks for this
>>> exception, and then only if the callee migrated the thread to the root
>>> domain? Or is there a need to have the root domain stalled across the
>>> post-fault migration?
>>>
>>
>> Someone from the root domain may want to get notified of the exceptions
>> occurring in that domain too, in which case we may not postpone the
>> virtual mask fixup after the notifier invocation, otherwise we would
>> call the handler with a broken interrupt state.
>>
>>> In the latter case, we would have to fiddle with the stall bits directly
>>> instead of calling local_irq_restore - not just to work around the
>>> BUG_ON, but also to avoid sync'ing root over potentially stalled
>>> non-root domains...
>>>
>>
>> This used to be done by ipipe_restore_pipeline_nosync() in older
>> patches, but this one has disappeared after the flat log refactoring. We
>> indeed need to resurrect something similar in order to reset the stall bit
>> without calling the syncer, when taking the fast exit path after
>> ipipe_trap_notify().
>
>Hmm, so it could be fairly simple in fact:
>
>--- a/arch/x86/kernel/ipipe.c
>+++ b/arch/x86/kernel/ipipe.c
>@@ -755,7 +755,9 @@ int __ipipe_handle_exception(struct pt_r
> #endif /* CONFIG_KGDB */
>
> 	if (unlikely(ipipe_trap_notify(vector, regs))) {
>-		local_irq_restore(flags);
>+		if (!flags)
>+			__clear_bit(IPIPE_STALL_FLAG,
>+				    &ipipe_root_cpudom_var(status));
> 		return 1;
> 	}
>
>Petr, ready to try?
>
I tried this patch and the problem (or the race condition) disappeared. ;-)
Is there any (easy) method to recognise if the problem was solved?
To your previous questions:
We use an Athlon 64 X2 (2 cores, 64-bit) with Kubuntu 7.10 amd64.
We have 2 real-time userspace applications: a server that handles rtnet communication with special measuring hardware, and clients (1-4 instances) that do computing, configuration, Ethernet communication, etc. Communication between the server and the clients goes through named rt_queues. Providing a small "failing example" is probably impossible.
Any attempt to enable IPIPE_DEBUG and the tracer makes the race condition disappear.
Thank you VERY MUCH for your help and support (all of you).
Petr
>Jan
>
>--
>Siemens AG, Corporate Technology, CT SE 2
>Corporate Competence Center Embedded Linux
>
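The difference between the removed local_irq_restore(flags) and the bare __clear_bit() in the patch above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not I-pipe code: struct toy_domain and every toy_* function are hypothetical names invented for this illustration, "stalled" stands in for the root stall bit, and "pending" stands in for interrupts logged while the domain was stalled.

```c
/* Toy user-space model of a stall bit plus an interrupt log.
 * NOT the I-pipe implementation; all names here are hypothetical. */
#include <assert.h>
#include <stdbool.h>

struct toy_domain {
	bool stalled;       /* models the virtual interrupt mask (stall bit) */
	unsigned pending;   /* models interrupts logged while stalled */
	unsigned delivered; /* counts interrupts replayed by the syncer */
};

/* Models the "syncer": replays everything logged for the domain. */
static void toy_sync(struct toy_domain *d)
{
	d->delivered += d->pending;
	d->pending = 0;
}

/* Models a restore that unstalls the domain and then synchronizes
 * its log, which the thread wants to avoid on the fast exit path. */
static void toy_restore_sync(struct toy_domain *d, bool was_stalled)
{
	d->stalled = was_stalled;
	if (!was_stalled && d->pending)
		toy_sync(d);
}

/* Models the patched path: undo the stall bit only, never replay. */
static void toy_restore_nosync(struct toy_domain *d, bool was_stalled)
{
	if (!was_stalled)
		d->stalled = false;
}

/* Small self-check contrasting both exits from a stalled state. */
static void toy_selfcheck(void)
{
	struct toy_domain a = { true, 2, 0 }, b = { true, 2, 0 };

	toy_restore_nosync(&a, false);
	assert(!a.stalled && a.pending == 2 && a.delivered == 0);

	toy_restore_sync(&b, false);
	assert(!b.stalled && b.pending == 0 && b.delivered == 2);
}
```

In this model, toy_restore_nosync() leaves the logged interrupts untouched so that they can be replayed later from a context where that is safe, while toy_restore_sync() replays them immediately. The real code additionally has to deal with per-CPU state and with non-root domains, which this sketch ignores.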