* Re: [Xenomai-help] Xenomai and futexes - Native API optimized for user space only applications
@ 2007-05-09 18:10 M. Koehrer
0 siblings, 0 replies; 8+ messages in thread
From: M. Koehrer @ 2007-05-09 18:10 UTC (permalink / raw)
To: gilles.chanteperdrix, rpm
Cc: xenomai@xenomai.org, mathias_koehrer@domain.hid
In-Reply-To: <4641F7A4.50105@domain.hid>
References: <4641F7A4.50105@domain.hid> <22908039.1178707932628.JavaMail.ngmail@domain.hid>
<1178727256.11688.35.camel@domain.hid>
Hi Gilles,
> We could have a "sync object area" where all atomic counters would live,
> that would be mapped into real-time processes address space, the sync
> object would then only contain an index in this area. This would mean
> that the number of sync objects is limited, but it is a restriction far
> more acceptable than to allocate and share a full page by sync object.
That sounds great. I think typically only a few (up to 20) sync objects are
required.
Thus, one memory page should be enough for most applications.
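To make the sizing argument concrete, here is a minimal sketch in C11 (not Xenomai code). It assumes a 4096-byte page and one 32-bit atomic counter per sync object, which leaves room for 1024 counters in a single page. The names `sync_area`, `sync_obj`, `sync_obj_alloc` and `sync_obj_counter` are hypothetical, and the area is a plain static array here rather than memory mapped into several processes.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u   /* assumed page size */
#define SYNC_SLOTS (PAGE_SIZE_BYTES / sizeof(_Atomic uint32_t))

/* One page worth of counters. In the proposal this area would be mapped
 * into every real-time process; here it is just a static array. */
static _Atomic uint32_t sync_area[SYNC_SLOTS];
static unsigned next_free;      /* naive allocator, not thread-safe */

/* The sync object itself only stores an index into the shared area. */
struct sync_obj {
    unsigned index;
};

/* Returns 0 on success, -1 when the area is exhausted. */
static int sync_obj_alloc(struct sync_obj *obj)
{
    if (next_free >= SYNC_SLOTS)
        return -1;
    obj->index = next_free++;
    atomic_store(&sync_area[obj->index], 0);
    return 0;
}

/* Resolve the index to the counter living in the shared area. */
static _Atomic uint32_t *sync_obj_counter(const struct sync_obj *obj)
{
    return &sync_area[obj->index];
}
```

With about 20 sync objects per application, such an area would be used only to a small fraction.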
Regards
Mathias
--
Mathias Koehrer
mathias_koehrer@domain.hid
A lot or a little? Fast or slow? Unlimited surfing and phone calls
with no time or volume limits? THE TOP OFFER NOW at Arcor: affordable
and fast with DSL - the all-inclusive package for clever double savers,
only EUR 39.85 incl. DSL and ISDN basic fees!
http://www.arcor.de/rd/emf-dsl-2
* Re: [Xenomai-help] Xenomai and futexes - Native API optimized for
@ 2007-05-09 13:02 Fillod Stephane
0 siblings, 0 replies; 8+ messages in thread
From: Fillod Stephane @ 2007-05-09 13:02 UTC (permalink / raw)
To: xenomai; +Cc: M. Koehrer
Jan Kiszka wrote:
[...]
>Not saying that the futex idea itself is not worth pursuing, but I would
>first of all think about if there aren't other, far simpler
>optimisations [...]
OProfile time?
--
Stephane
* [Xenomai-help] Xenomai and futexes - Native API optimized for user space only applications
@ 2007-05-09 10:52 M. Koehrer
2007-05-09 11:51 ` Jan Kiszka
0 siblings, 1 reply; 8+ messages in thread
From: M. Koehrer @ 2007-05-09 10:52 UTC (permalink / raw)
To: xenomai
Hi everybody,
I am using Xenomai for a high-performance real-time simulation system.
All of the simulation code is executed in user space. A single application is running that consists
of several Xenomai real-time threads.
For performance reasons I always use the latest PC technology available (currently
Pentium D or Core 2 Duo PCs; we are even thinking of using quad-core CPUs).
Hardware I/O is done via RTnet or via PCI I/O boards that work in user space as well (PCI memory
mapped into user space).
The threads have to be synchronised, e.g. when accessing shared data.
This is done, for example, with semaphores, mutexes, etc.
I like the Xenomai native skin as it provides a very clear API that is easy to use.
However, for a user-space-only application it is a performance killer that all API calls
lead to a mode switch from user to kernel space. Each API call takes about 1-2 microseconds (us)
on my PC, which is really expensive.
Especially when inter-process communication is used to protect access to shared data,
the calling thread mostly does not have to wait. In this situation there is no
need for a context switch: the API call does not lead to a rescheduling of the available tasks.
And for this case the 1-2 us really hurt.
Thus my question/proposal is whether there is a plan to offer a "variant" of the native API that is optimized for
user-space-only applications. In this case futexes, for example, could be used. If there is a need to
reschedule to another task, it is fine to "invest" the 2 us, but mostly this can be avoided, which should
increase the overall performance dramatically.
This would lead to a library where a big part of the functionality is handled directly in the library
(in user space). Currently the skin passes the (user-space) API call via a Xenomai system call
to kernel space, where the actual functionality is executed.
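The idea of handling the uncontended case entirely in user space can be sketched as follows. This is only an illustration in C11 of the general futex-style scheme (a three-state lock word), not the native skin and not an existing Xenomai API. All names (`fast_mutex`, `slow_wait`, `slow_wake`, `slow_path_calls`) are hypothetical, and the two slow-path functions are stand-ins that merely count how often a kernel entry would have been needed instead of actually blocking or waking a thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Lock word: 0 = unlocked, 1 = locked without waiters,
 * 2 = locked with possible waiters. */
struct fast_mutex {
    atomic_int state;
};

/* Counts simulated kernel entries (illustration only). */
static atomic_int slow_path_calls;

/* Stand-in for the system call that would block the caller. */
static void slow_wait(struct fast_mutex *m)
{
    (void)m;
    atomic_fetch_add(&slow_path_calls, 1);
}

/* Stand-in for the system call that would wake one waiter. */
static void slow_wake(struct fast_mutex *m)
{
    (void)m;
    atomic_fetch_add(&slow_path_calls, 1);
}

static void fast_mutex_init(struct fast_mutex *m)
{
    atomic_init(&m->state, 0);
}

/* Uncontended acquire: one compare-and-swap, no kernel entry. */
static bool fast_mutex_trylock(struct fast_mutex *m)
{
    int expected = 0;
    return atomic_compare_exchange_strong(&m->state, &expected, 1);
}

static void fast_mutex_lock(struct fast_mutex *m)
{
    if (fast_mutex_trylock(m))
        return;                         /* fast path */
    /* Slow path: mark the lock as contended and wait until it is ours.
     * With the stand-in above this loop spins instead of sleeping. */
    while (atomic_exchange(&m->state, 2) != 0)
        slow_wait(m);
}

static void fast_mutex_unlock(struct fast_mutex *m)
{
    /* Only enter the kernel if somebody may be waiting. */
    if (atomic_exchange(&m->state, 0) == 2)
        slow_wake(m);
}
```

In this sketch a lock/unlock pair that never meets contention touches only the atomic word, so the 1-2 us mode switch would be paid only when a thread really has to wait or be woken.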
Thanks for any feedback on this proposal.
Regards
Mathias
--
Mathias Koehrer
mathias_koehrer@domain.hid
* Re: [Xenomai-help] Xenomai and futexes - Native API optimized for user space only applications
2007-05-09 10:52 [Xenomai-help] Xenomai and futexes - Native API optimized for user space only applications M. Koehrer
@ 2007-05-09 11:51 ` Jan Kiszka
2007-05-09 12:27 ` [Xenomai-help] Xenomai and futexes - Native API optimized for M. Koehrer
0 siblings, 1 reply; 8+ messages in thread
From: Jan Kiszka @ 2007-05-09 11:51 UTC (permalink / raw)
To: M. Koehrer; +Cc: xenomai
M. Koehrer wrote:
> Hi everybody,
>
> I am using Xenomai for a high-performance real-time simulation system.
> All of the simulation code is executed in user space. A single application is running that consists
> of several Xenomai real-time threads.
> For performance reasons I always use the latest PC technology available (currently
> Pentium D or Core 2 Duo PCs; we are even thinking of using quad-core CPUs).
> Hardware I/O is done via RTnet or via PCI I/O boards that work in user space as well (PCI memory
> mapped into user space).
> The threads have to be synchronised, e.g. when accessing shared data.
> This is done, for example, with semaphores, mutexes, etc.
> I like the Xenomai native skin as it provides a very clear API that is easy to use.
> However, for a user-space-only application it is a performance killer that all API calls
> lead to a mode switch from user to kernel space. Each API call takes about 1-2 microseconds (us)
> on my PC, which is really expensive.
> Especially when inter-process communication is used to protect access to shared data,
> the calling thread mostly does not have to wait. In this situation there is no
> need for a context switch: the API call does not lead to a rescheduling of the available tasks.
> And for this case the 1-2 us really hurt.
>
> Thus my question/proposal is whether there is a plan to offer a "variant" of the native API that is optimized for
> user-space-only applications. In this case futexes, for example, could be used. If there is a need to
> reschedule to another task, it is fine to "invest" the 2 us, but mostly this can be avoided, which should
> increase the overall performance dramatically.
Your analysis is not wrong. But before going into any detail, may I ask
you for one clarification: What do you want to optimise *primarily*, the
average execution time of your RT threads in order to gain more CPU time
for non-RT Linux jobs *or* the worst-case execution time of your RT threads?
> This would lead to a library where a big part of the functionality is handled directly in the library
> (in user space). Currently the skin passes the (user-space) API call via a Xenomai system call
> to kernel space, where the actual functionality is executed.
>
> Thanks for any feedback on this proposal.
>
> Regards
>
> Mathias
>
Jan
* Re: [Xenomai-help] Xenomai and futexes - Native API optimized for
2007-05-09 11:51 ` Jan Kiszka
@ 2007-05-09 12:27 ` M. Koehrer
2007-05-09 12:51 ` Jan Kiszka
2007-05-09 13:35 ` Herman Bruyninckx
0 siblings, 2 replies; 8+ messages in thread
From: M. Koehrer @ 2007-05-09 12:27 UTC (permalink / raw)
To: jan.kiszka, mathias_koehrer; +Cc: xenomai
Hi Jan,
primarily I want to execute my simulation models as fast as possible. That is, I want
to optimize the average execution time. As there is a series of critical sections to
be passed, this should also have an impact on the worst-case execution time.
The critical regions in the simulation are at locations where a lower-prio thread and a higher-prio
thread access the same data. This happens a couple of times within a simulation step
(within such a critical region there are typically some copy operations).
However, once the higher-prio thread has entered one critical region, it is certain (at least if
both threads run on the same CPU core) that it will be able to enter the next critical region without
waiting.
That means that improving the average runtime should also help to improve the worst-case
runtime of the highest-prio thread.
Regards
Mathias
>
> Your analysis is not wrong. But before going into any detail, may I ask
> you for one clarification: What do you want to optimise *primarily*, the
> average execution time of your RT threads in order to gain more CPU time
> for non-RT Linux jobs *or* the worst-case execution time of your RT
> threads?
>
--
Mathias Koehrer
mathias_koehrer@domain.hid
* Re: [Xenomai-help] Xenomai and futexes - Native API optimized for
2007-05-09 12:27 ` [Xenomai-help] Xenomai and futexes - Native API optimized for M. Koehrer
@ 2007-05-09 12:51 ` Jan Kiszka
2007-05-09 13:35 ` Herman Bruyninckx
1 sibling, 0 replies; 8+ messages in thread
From: Jan Kiszka @ 2007-05-09 12:51 UTC (permalink / raw)
To: M. Koehrer; +Cc: xenomai
M. Koehrer wrote:
> Hi Jan,
>
> primarily I want to execute my simulation models as fast as possible. That is, I want
> to optimize the average execution time. As there is a series of critical sections to
> be passed, this should also have an impact on the worst-case execution time.
>
> The critical regions in the simulation are at locations where a lower-prio thread and a higher-prio
> thread access the same data. This happens a couple of times within a simulation step
> (within such a critical region there are typically some copy operations).
> However, once the higher-prio thread has entered one critical region, it is certain (at least if
> both threads run on the same CPU core) that it will be able to enter the next critical region without
> waiting.
> That means that improving the average runtime should also help to improve the worst-case
> runtime of the highest-prio thread.
Not saying that the futex idea itself is not worth pursuing, but I would
first of all think about whether there aren't other, far simpler
optimisations available at application level: e.g. lock-less algorithms
that flip some current-buffer pointers instead of locking the access to
a single shared buffer. Or a locking scheme that reduces the number of
locks the highest-prio thread has to take in a row. Specifically, when
you say that taking lock A implies succeeding in taking B as well, I
wonder if there is a need for A != B.
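A buffer-flipping scheme of the kind mentioned above could look roughly like this. It is a minimal C11 sketch under strong assumptions, not a general-purpose lock-free structure: there is exactly one writer, and the reader can never be overtaken by two consecutive writes while it is copying (for example because the reader is the higher-priority thread on the same CPU core as the writer). Without that assumption a reader could see a buffer that is being refilled. The names `flip_buffer`, `flip_write`, `flip_read` and `SAMPLE_WORDS` are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <string.h>

#define SAMPLE_WORDS 4          /* size of one data set, chosen arbitrarily */

struct flip_buffer {
    double buf[2][SAMPLE_WORDS];
    atomic_int current;         /* index of the buffer readers may use */
};

static void flip_init(struct flip_buffer *fb)
{
    memset(fb->buf, 0, sizeof fb->buf);
    atomic_init(&fb->current, 0);
}

/* Writer: fill the inactive buffer, then publish it with a single
 * atomic store. No lock is taken and the scheduler is not involved. */
static void flip_write(struct flip_buffer *fb, const double *src)
{
    int next = 1 - atomic_load_explicit(&fb->current, memory_order_relaxed);
    memcpy(fb->buf[next], src, sizeof fb->buf[next]);
    atomic_store_explicit(&fb->current, next, memory_order_release);
}

/* Reader: copy out the most recently published buffer.
 * Assumes the writer cannot publish twice during this copy. */
static void flip_read(struct flip_buffer *fb, double *dst)
{
    int cur = atomic_load_explicit(&fb->current, memory_order_acquire);
    memcpy(dst, fb->buf[cur], sizeof fb->buf[cur]);
}
```

If the writer is preempted in the middle of `flip_write`, the reader still sees the previously published buffer, because the index is only switched after the copy has completed.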
Jan
* Re: [Xenomai-help] Xenomai and futexes - Native API optimized for
2007-05-09 12:27 ` [Xenomai-help] Xenomai and futexes - Native API optimized for M. Koehrer
2007-05-09 12:51 ` Jan Kiszka
@ 2007-05-09 13:35 ` Herman Bruyninckx
2007-05-09 13:58 ` M. Koehrer
1 sibling, 1 reply; 8+ messages in thread
From: Herman Bruyninckx @ 2007-05-09 13:35 UTC (permalink / raw)
To: M. Koehrer; +Cc: xenomai, jan.kiszka
On Wed, 9 May 2007, M. Koehrer wrote:
[...]
> The critical regions in the simulation are at locations where a lower-prio
> thread and a higher-prio thread access the same data. This happens a
> couple of times within a simulation step (within such a critical region
> there are typically some copy operations).
I agree with Jan Kiszka (in another reply in this thread) that it's
probably better to improve the design of your application than to shift the
problem to asking for "yet another feature" of the RTOS!
In our robot control framework (orocos.org), which works as a user-space
application on top of Xenomai, we have implemented lock-free data exchange,
which doesn't require involvement of the scheduler at all, and which
improves the timing performance of the highest-priority thread. It might be
useful for you to discuss this implementation aspect on the orocos
developers mailing list...
Herman Bruyninckx
--
K.U.Leuven, Mechanical Eng., Mechatronics & Robotics Research Group
<http://people.mech.kuleuven.be/~bruyninc> Tel: +32 16 322480
Coordinator EURON (European Robotics Research Network)
<http://www.euron.org>
Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
* Re: [Xenomai-help] Xenomai and futexes - Native API optimized for
2007-05-09 13:35 ` Herman Bruyninckx
@ 2007-05-09 13:58 ` M. Koehrer
2007-05-09 15:46 ` Jan Kiszka
2007-05-09 22:36 ` Herman Bruyninckx
0 siblings, 2 replies; 8+ messages in thread
From: M. Koehrer @ 2007-05-09 13:58 UTC (permalink / raw)
To: herman.bruyninckx, mathias_koehrer; +Cc: xenomai, jan.kiszka
Hi Herman and Jan,
thanks for your replies.
I agree that it is best to take an approach that avoids critical sections.
However, we have here the typical "historical heritage". Parts of this code have
been ported from another system (where they worked with cli/sti).
Other parts are generated by a third-party code generator.
This code generator (which is not in my hands) is designed primarily
for offline simulation within a single task.
Thus, the people who programmed it did not care about multi-threaded programming.
The programs write their results into output variables.
To keep the data consistent I now have to either make
an additional copy (also quite expensive) of the output data to a location
where I can work without any risk of being interrupted, or lock the access
to the data.
That means I either do lots of additional copy operations or I work
with the critical region.
Thus for me it is not an either-or option.
I cannot optimize the code to be 100% free of critical sections (although it can be done in
some parts). That means I still have to count on the IPC provided by the OS.
Regards
Mathias
> > The critical regions in the simulation are at locations where a lower-prio
> > thread and a higher-prio thread access the same data. This happens a
> > couple of times within a simulation step (within such a critical region
> > there are typically some copy operations).
>
> I agree with Jan Kiszka (in another reply in this thread) that it's
> probably better to improve the design of your application than to shift the
> problem to asking for "yet another feature" of the RTOS!
>
> In our robot control framework (orocos.org), which works as a user-space
> application on top of Xenomai, we have implemented lock-free data exchange,
> which doesn't require involvement of the scheduler at all, and which
> improves the timing performance of the highest-priority thread. It might be
> useful for you to discuss this implementation aspect on the orocos
> developers mailing list...
>
> Herman Bruyninckx
>
--
Mathias Koehrer
mathias_koehrer@domain.hid
* Re: [Xenomai-help] Xenomai and futexes - Native API optimized for
2007-05-09 13:58 ` M. Koehrer
@ 2007-05-09 15:46 ` Jan Kiszka
2007-05-09 22:36 ` Herman Bruyninckx
1 sibling, 0 replies; 8+ messages in thread
From: Jan Kiszka @ 2007-05-09 15:46 UTC (permalink / raw)
To: M. Koehrer; +Cc: xenomai
M. Koehrer wrote:
> Hi Herman and Jan,
>
> thanks for your replies.
> I agree that it is best to take an approach that avoids critical sections.
> However, we have here the typical "historical heritage". Parts of this code have
> been ported from another system (where they worked with cli/sti).
> Other parts are generated by a third-party code generator.
> This code generator (which is not in my hands) is designed primarily
> for offline simulation within a single task.
> Thus, the people who programmed it did not care about multi-threaded programming.
> The programs write their results into output variables.
> To keep the data consistent I now have to either make
> an additional copy (also quite expensive) of the output data to a location
> where I can work without any risk of being interrupted, or lock the access
> to the data.
> That means I either do lots of additional copy operations or I work
> with the critical region.
>
> Thus for me it is not an either-or option.
> I cannot optimize the code to be 100% free of critical sections (although it can be done in
> some parts). That means I still have to count on the IPC provided by the OS.
OK, seeing your motivation, I would now like to roughly sketch the issue
I see with futexes for Xenomai:
You need user space and kernel space working on the same piece of memory
atomically AND you need to get the contention case right. The latter is
tricky because Xenomai does - for various reasons like robustness - the
sequence "try to acquire, fail, enqueue waiter, suspend" under one
common lock. Since with futexes that "try to acquire" step may cause page
faults (in buggy programs), you need to redesign things.
I have no model for such a redesign at hand (nor would I like to think
deeper and distract myself from other pending stuff), but my feeling is
that it will be at least, well, challenging. Looking at the kernel futex
code and watching its pending issues, you may see that this is actually
not so much Xenomai-specific.
Anyway, go wild people, maybe I'm just too pessimistic. :)
Jan
* Re: [Xenomai-help] Xenomai and futexes - Native API optimized for
2007-05-09 13:58 ` M. Koehrer
2007-05-09 15:46 ` Jan Kiszka
@ 2007-05-09 22:36 ` Herman Bruyninckx
1 sibling, 0 replies; 8+ messages in thread
From: Herman Bruyninckx @ 2007-05-09 22:36 UTC (permalink / raw)
To: M. Koehrer; +Cc: xenomai, jan.kiszka
On Wed, 9 May 2007, M. Koehrer wrote:
> thanks for your replies.
> I agree that it is best to take an approach that avoids critical sections.
> However, we have here the typical "historical heritage".
Nevertheless, I repeat my plea _not_ to put new features in Xenomai
because of bad programming in legacy code! Every new feature will
automatically be considered a good feature by newcomers, and that's _not_
the case! In my opinion, a good RTOS is one with a minimum of features, but
with a maximum of good application templates. (Call it "software patterns"
if you like.)
> That means I still have to count on the IPC provided by the OS.
I understand, but then I can't support your "complaint" about the Xenomai
features not being computationally optimal :-) _You_ have bad legacy code,
so _you_ should pay the price for it, not the RTOS development or
maintainability...
Herman
end of thread, other threads:[~2007-05-09 22:36 UTC | newest]
Thread overview: 8+ messages
2007-05-09 18:10 [Xenomai-help] Xenomai and futexes - Native API optimized for M. Koehrer
-- strict thread matches above, loose matches on Subject: below --
2007-05-09 13:02 Fillod Stephane
2007-05-09 10:52 [Xenomai-help] Xenomai and futexes - Native API optimized for user space only applications M. Koehrer
2007-05-09 11:51 ` Jan Kiszka
2007-05-09 12:27 ` [Xenomai-help] Xenomai and futexes - Native API optimized for M. Koehrer
2007-05-09 12:51 ` Jan Kiszka
2007-05-09 13:35 ` Herman Bruyninckx
2007-05-09 13:58 ` M. Koehrer
2007-05-09 15:46 ` Jan Kiszka
2007-05-09 22:36 ` Herman Bruyninckx