* [Xenomai-core] Getting the clock model right
From: Jan Kiszka @ 2007-04-06 12:07 UTC
To: xenomai-core
Hi,
recent announcement of some new TSC synchronisation feature in RTAI made
me stick my nose into this and think about the whole issue of clock
synchronisation again. Well, let's not talk about RTAI details here, but
they got one thing right: as long as we cannot handle unsynch'ed TSC on
SMP, we need some detection and alarming as the bare minimum.
Why can't we handle such cases yet? First, there seems to be still some
bugs hidden in the core (one example: xntimer_start_aperiodic() uses the
local time stamp to start a remote timer). While this can be addressed
by reviewing the code and fixing what is wrong, the more severe issue is
that we cannot help the application or driver developer to cope with
unsynch'ed time stamps properly. We either have to propagate dates as a
tuple of nanoseconds (or ticks) and clock ID (TSC of CPU x, RTC, remote
clock, etc.), or we should try harder to provide consistent time across
the whole system. The former means API breakage in many places, so I'm
more and more convinced that it is the _wrong_ path. That leaves us with
option 2 (please point me to any other alternative, I don't see them).
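To make the first option concrete, here is a minimal sketch of a date carried as a (nanoseconds, clock ID) tuple. All names (clock_id, tagged_date, tagged_date_diff) are hypothetical and not part of any existing Xenomai API; the point is only that every consumer has to check the clock ID before comparing two dates, which is where the API breakage would come from.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical clock identifiers; not taken from any Xenomai header. */
enum clock_id { CLK_TSC_CPU0, CLK_TSC_CPU1, CLK_RTC, CLK_REMOTE };

/* A date that carries the clock it was read from. */
struct tagged_date {
    uint64_t ns;          /* nanoseconds on the clock named by 'clock' */
    enum clock_id clock;
};

/*
 * Signed difference a - b in nanoseconds. Returns false if the two dates
 * come from different clocks and so cannot be compared directly.
 */
static bool tagged_date_diff(struct tagged_date a, struct tagged_date b,
                             int64_t *diff_ns)
{
    assert(diff_ns != NULL);
    if (a.clock != b.clock)
        return false;
    *diff_ns = (int64_t)(a.ns - b.ns);
    return true;
}
```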
Let's take a step back again and look at why we currently claim that
unsynch'ed per-CPU clocks are the official model in Xenomai: certain
multi-processor or multi-core systems (specifically x86 and x86_64) do
not provide synchronised TSCs across all nodes, neither with respect to
their offsets nor regarding drifts due to transient freezing of TSCs.
That's a pity for now, but it will not remain so in the long term. On
one side, there are alternatives, specifically HPET. On the other, CPU
manufacturers have realised that TSCs are used for timekeeping these days
and promise to fix the issue in hardware [1].
So we should really forget about designing around this shortcoming of
today's hardware and rather look for viable workarounds until the sun
breaks through again. That means we need
A) drift detection and alarming (highest prio to-do)
B) offset and drift compensation where feasible
C) support for alternatives (=> HPET-based clock source)
Regarding B): The issue should actually be not that tricky for most
reasonable systems. We already rely on consistent, monotonic CPU-local
TSCs (which implies, e.g., switching off power management). Thus we should
see only small drifts in reality that should be manageable, no?
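As a rough illustration of A), drift between two CPUs could be estimated from two paired TSC readings taken some time apart. The sketch below assumes that such paired readings can be obtained with negligible skew (how to do that is the hard part and is not shown); the names tsc_pair, tsc_drift_ppm and tsc_drift_alarm are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One paired reading of two CPUs' TSCs, assumed to be taken simultaneously. */
struct tsc_pair {
    uint64_t ref;   /* TSC of the reference CPU */
    uint64_t other; /* TSC of the CPU under test */
};

/*
 * Relative drift of 'other' against 'ref' between two paired readings,
 * in parts per million. Positive means 'other' runs fast.
 */
static int64_t tsc_drift_ppm(struct tsc_pair first, struct tsc_pair second)
{
    int64_t d_ref = (int64_t)(second.ref - first.ref);
    int64_t d_other = (int64_t)(second.other - first.other);

    assert(d_ref > 0); /* readings must be ordered and distinct */
    return (d_other - d_ref) * 1000000 / d_ref;
}

/* True if the measured drift exceeds the tolerated limit in either direction. */
static bool tsc_drift_alarm(struct tsc_pair first, struct tsc_pair second,
                            int64_t limit_ppm)
{
    int64_t ppm = tsc_drift_ppm(first, second);
    return ppm > limit_ppm || ppm < -limit_ppm;
}
```

A constant offset between the two TSCs cancels out here, since only the increments are compared; offsets would have to be measured and handled separately.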
Comments and thoughts are welcome. I would really like to see a clear
roadmap for this (IMHO) important issue before 2.4 gets on the road.
Also, I would like to draw a line and add things like timers to the next
RTDM revision - also before 2.4.
BTW, there is another to-do regarding the time subsystem: optimised
tsc-to-ns conversion (and vice versa), including uninlining of those
huge functions. While looking at this, considering adding some means
for smoothly adjusting clocks during runtime would be great. I'm
thinking about a generic infrastructure to synchronise the Xenomai time
on external sources (=>distributed clocks).
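One common way to make tsc-to-ns conversion cheap is to replace the division by a precomputed multiplier and a shift. The following is only a sketch of that technique, not the existing Xenomai code; tsc_scale, tsc_scale_init and tsc_to_ns are hypothetical names, and the 64-bit value is split into halves so that no 128-bit intermediate is needed.

```c
#include <assert.h>
#include <stdint.h>

/* Precomputed scaling so that ns = (tsc * mult) >> shift. */
struct tsc_scale {
    uint32_t mult;
    uint32_t shift;
};

/* Derive the multiplier for a given TSC frequency and shift (shift < 32). */
static struct tsc_scale tsc_scale_init(uint64_t tsc_hz, uint32_t shift)
{
    struct tsc_scale s;
    uint64_t mult;

    assert(tsc_hz > 0 && shift < 32);
    mult = (UINT64_C(1000000000) << shift) / tsc_hz;
    assert(mult > 0 && mult <= UINT32_MAX); /* frequency/shift combination must fit */
    s.mult = (uint32_t)mult;
    s.shift = shift;
    return s;
}

/*
 * Convert TSC ticks to nanoseconds without dividing. The high and low
 * 32-bit halves are scaled separately; the result wraps modulo 2^64 if
 * the nanosecond value does not fit.
 */
static uint64_t tsc_to_ns(struct tsc_scale s, uint64_t tsc)
{
    uint64_t hi = tsc >> 32;
    uint64_t lo = tsc & UINT64_C(0xffffffff);

    return ((hi * s.mult) << (32 - s.shift)) + ((lo * s.mult) >> s.shift);
}
```

Changing mult slightly at runtime would also be one conceivable hook for the smooth clock adjustment mentioned above, though that is an assumption of this sketch rather than a worked-out design.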
Jan
[1] http://developer.amd.com/article_print.jsp?id=92
* Re: [Xenomai-core] Getting the clock model right
From: Gilles Chanteperdrix @ 2007-04-06 13:47 UTC
To: Jan Kiszka; +Cc: xenomai-core
Jan Kiszka wrote:
> Hi,
>
> recent announcement of some new TSC synchronisation feature in RTAI made
> me stick my nose into this and think about the whole issue of clock
> synchronisation again. Well, let's not talk about RTAI details here, but
> they got one thing right: as long as we cannot handle unsynch'ed TSC on
> SMP, we need some detection and alarming as the bare minimum.
>
> Why can't we handle such cases yet? First, there seems to be still some
> bugs hidden in the core (one example: xntimer_start_aperiodic() uses the
> local time stamp to start a remote timer).
Reading the code, there seem to be only two places where the local tsc
is used to set a remote timer: xntimer_start_aperiodic and
xntimer_move_aperiodic, the latter being used by xntimer_migrate. So we
are left with only one bug: starting a timer on a remote CPU. This could
easily be implemented with a queue which would be handled by the timer IPI.
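A minimal sketch of that queue idea, under stated assumptions: the requesting CPU only queues a relative delay, and the target CPU turns it into an absolute expiry date using its own TSC when it handles the IPI. The names (timer_req, cpu_timer_queue, queue_start_request, drain_start_requests) are hypothetical, locking and the IPI itself are left out, and the time a request spends in the queue is simply added to the delay.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define REQ_SLOTS 8

/* A request to start a timer, expressed relative so it is valid on any CPU. */
struct timer_req {
    int timer_id;          /* hypothetical timer handle */
    uint64_t delay_ticks;  /* relative delay in TSC ticks */
};

/* Per-CPU ring of pending requests (no locking shown in this sketch). */
struct cpu_timer_queue {
    struct timer_req slots[REQ_SLOTS];
    size_t head, tail;     /* head: next to read, tail: next to write */
};

/* Result of arming a timer on the target CPU. */
struct armed_timer {
    int timer_id;
    uint64_t expiry;       /* absolute date on the target CPU's TSC */
};

/* Called on the requesting CPU; the caller would then send the timer IPI. */
static bool queue_start_request(struct cpu_timer_queue *q, int timer_id,
                                uint64_t delay_ticks)
{
    assert(q != NULL);
    if (q->tail - q->head == REQ_SLOTS)
        return false; /* queue full */
    q->slots[q->tail % REQ_SLOTS] = (struct timer_req){ timer_id, delay_ticks };
    q->tail++;
    return true;
}

/*
 * Called on the target CPU from the IPI handler. 'local_now' is the
 * target's own TSC, so no remote time stamp is ever used.
 */
static size_t drain_start_requests(struct cpu_timer_queue *q, uint64_t local_now,
                                   struct armed_timer *out, size_t out_len)
{
    size_t n = 0;

    assert(q != NULL && out != NULL);
    while (q->head != q->tail && n < out_len) {
        struct timer_req r = q->slots[q->head % REQ_SLOTS];
        q->head++;
        out[n].timer_id = r.timer_id;
        out[n].expiry = local_now + r.delay_ticks;
        n++;
    }
    return n;
}
```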
> (...)
> for smoothly adjusting clocks during runtime would be great. I'm
> thinking about a generic infrastructure to synchronise the Xenomai time
> on external sources (=>distributed clocks).
What do you want, NTP? Or calling xnpod_settime periodically?
--
Gilles Chanteperdrix
* Re: [Xenomai-core] Getting the clock model right
From: Jan Kiszka @ 2007-04-06 14:29 UTC
To: Gilles Chanteperdrix; +Cc: xenomai-core
Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Hi,
>>
>> recent announcement of some new TSC synchronisation feature in RTAI made
>> me stick my nose into this and think about the whole issue of clock
>> synchronisation again. Well, let's not talk about RTAI details here, but
>> they got one thing right: as long as we cannot handle unsynch'ed TSC on
>> SMP, we need some detection and alarming as the bare minimum.
>>
>> Why can't we handle such cases yet? First, there seems to be still some
>> bugs hidden in the core (one example: xntimer_start_aperiodic() uses the
>> local time stamp to start a remote timer).
>
> Reading the code, there seem to be only two places where the local tsc
> is used to set a remote timer, it is xntimer_start_aperiodic, and
> xntimer_move_aperiodic, which is used by xntimer_migrate. So we are left
> with only one bug: starting a timer on the remote CPU, this could easily
> be implemented with a queue which would be handled by the timer IPI.
>
As I said: fixable based on thorough review - but only a minor part of
the problem.
>
>> (...)
>> for smoothly adjusting clocks during runtime would be great. I'm
>> thinking about a generic infrastructure to synchronise the Xenomai time
>> on external sources (=>distributed clocks).
>
> What do you want, NTP ? Or calling xnpod_settime periodically ?
>
More like NTP: monotonic clocks that can be derived from custom
synchronisation signals, maybe optimised/simplified with the fact in mind
that those signals should have bounded worst-case jitters (on a hard
real-time system like Xenomai).
We just implemented such synchronisation based on CAN and serial
null-modem signals. RTnet comes with a high-quality distributed clock as
well. We have not yet implemented a smart clock adjustment (specifically
because our application only needs about 1 millisecond accuracy).
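To illustrate what a "smart" adjustment could look like, here is a small sketch of slewing: a pending correction is spread over time by bounding the rate change, so the adjusted clock never jumps and never runs backwards. This is an assumption about one possible approach, not what we have implemented; slew_clock and slew_clock_read are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clock whose corrections are applied gradually instead of as a step. */
struct slew_clock {
    uint64_t last_raw;     /* last raw reading, ns */
    uint64_t last_out;     /* last adjusted value handed out, ns */
    int64_t pending_ns;    /* correction still to be applied */
    uint32_t max_slew_ppm; /* bound on the rate change, must be < 1000000 */
};

/*
 * Return the adjusted time for a raw reading. At most max_slew_ppm of the
 * elapsed raw time is added or removed per call, so the output is
 * monotonic as long as the raw clock is. Elapsed intervals are assumed
 * small enough that elapsed * max_slew_ppm fits in 64 bits.
 */
static uint64_t slew_clock_read(struct slew_clock *c, uint64_t raw_ns)
{
    uint64_t elapsed;
    int64_t budget, step;

    assert(c != NULL && c->max_slew_ppm < 1000000);
    elapsed = raw_ns - c->last_raw;
    budget = (int64_t)(elapsed * c->max_slew_ppm / 1000000);

    step = c->pending_ns;
    if (step > budget)
        step = budget;
    if (step < -budget)
        step = -budget;

    c->pending_ns -= step;
    c->last_raw = raw_ns;
    c->last_out += (uint64_t)((int64_t)elapsed + step);
    return c->last_out;
}
```

A synchronisation source (CAN, serial, RTnet, ...) would only have to add its measured error to pending_ns; bounded jitter on those signals would keep the pending correction small.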
Jan
* Re: [Xenomai-core] Getting the clock model right
From: Gilles Chanteperdrix @ 2007-04-06 15:13 UTC
To: Jan Kiszka; +Cc: xenomai
Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>
>>Jan Kiszka wrote:
>>
>>>Hi,
>>>
>>>recent announcement of some new TSC synchronisation feature in RTAI made
>>>me stick my nose into this and think about the whole issue of clock
>>>synchronisation again. Well, let's not talk about RTAI details here, but
>>>they got one thing right: as long as we cannot handle unsynch'ed TSC on
>>>SMP, we need some detection and alarming as the bare minimum.
>>>
>>>Why can't we handle such cases yet? First, there seems to be still some
>>>bugs hidden in the core (one example: xntimer_start_aperiodic() uses the
>>>local time stamp to start a remote timer).
>>
>>Reading the code, there seem to be only two places where the local tsc
>>is used to set a remote timer, it is xntimer_start_aperiodic, and
>>xntimer_move_aperiodic, which is used by xntimer_migrate. So we are left
>>with only one bug: starting a timer on the remote CPU, this could easily
>>be implemented with a queue which would be handled by the timer IPI.
>>
>
>
> As I said: fixable based on thorough review - but only a minor part of
> the problem.
I fail to see the remaining part of the problem.
--
Gilles Chanteperdrix
* Re: [Xenomai-core] Getting the clock model right
From: Jan Kiszka @ 2007-04-06 16:36 UTC
To: Gilles Chanteperdrix; +Cc: xenomai
Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Gilles Chanteperdrix wrote:
>>
>>> Jan Kiszka wrote:
>>>
>>>> Hi,
>>>>
>>>> recent announcement of some new TSC synchronisation feature in RTAI made
>>>> me stick my nose into this and think about the whole issue of clock
>>>> synchronisation again. Well, let's not talk about RTAI details here, but
>>>> they got one thing right: as long as we cannot handle unsynch'ed TSC on
>>>> SMP, we need some detection and alarming as the bare minimum.
>>>>
>>>> Why can't we handle such cases yet? First, there seems to be still some
>>>> bugs hidden in the core (one example: xntimer_start_aperiodic() uses the
>>>> local time stamp to start a remote timer).
>>> Reading the code, there seem to be only two places where the local tsc
>>> is used to set a remote timer, it is xntimer_start_aperiodic, and
>>> xntimer_move_aperiodic, which is used by xntimer_migrate. So we are left
>>> with only one bug: starting a timer on the remote CPU, this could easily
>>> be implemented with a queue which would be handled by the timer IPI.
>>>
>>
>> As I said: fixable based on thorough review - but only a minor part of
>> the problem.
>
> I fail to see the remaining part of the problem.
>
Consider a simple scenario consisting of a shared communication device
over which packets arrive and get time-stamped. Now, if applications
that receive those packets sit on different, unsynchronised CPUs, they
have to know on which CPU the time stamps were taken in order to relate
them to other events correctly. Basically the same issue you have on
distributed systems as well.
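A sketch of what the receiving application would have to do in that scenario, assuming the time stamp carries the CPU it was taken on and that per-CPU TSC offsets against CPU 0 have been measured somehow and do not drift. The names (cpu_stamp, stamp_to_cpu, NR_CPUS_SKETCH) are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 4

/* A time stamp together with the CPU whose TSC produced it. */
struct cpu_stamp {
    uint64_t tsc;
    unsigned cpu;
};

/*
 * Translate a stamp into the time base of 'target_cpu'.
 * offset[i] is assumed to hold TSC(cpu i) - TSC(cpu 0), measured
 * beforehand and constant (i.e. no drift between the CPUs).
 */
static uint64_t stamp_to_cpu(struct cpu_stamp s, unsigned target_cpu,
                             const int64_t offset[NR_CPUS_SKETCH])
{
    assert(s.cpu < NR_CPUS_SKETCH && target_cpu < NR_CPUS_SKETCH);
    /* Unsigned arithmetic wraps, so negative offsets work as expected. */
    return s.tsc - (uint64_t)offset[s.cpu] + (uint64_t)offset[target_cpu];
}
```

Without the cpu field in the stamp, the translation is impossible, which is exactly the information the current API does not provide.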
If we leave the user with broken local clocks, we _must_ provide the
information about the clock source. There are always scenarios where you
_cannot_ separate your applications in a way that they run totally
independently on different CPUs. And then we should provide means to
synchronise the clocks, or the user has to re-invent the wheel over and
over again. Given the latter, doing this inside the core in a
transparent manner is far smarter.
Jan
* Re: [Xenomai-core] Getting the clock model right
From: Jan Kiszka @ 2007-04-06 16:54 UTC
To: Gilles Chanteperdrix; +Cc: xenomai-core
Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Gilles Chanteperdrix wrote:
>>
>>> Jan Kiszka wrote:
>>>
>>>> Hi,
>>>>
>>>> recent announcement of some new TSC synchronisation feature in RTAI made
>>>> me stick my nose into this and think about the whole issue of clock
>>>> synchronisation again. Well, let's not talk about RTAI details here, but
>>>> they got one thing right: as long as we cannot handle unsynch'ed TSC on
>>>> SMP, we need some detection and alarming as the bare minimum.
>>>>
>>>> Why can't we handle such cases yet? First, there seems to be still some
>>>> bugs hidden in the core (one example: xntimer_start_aperiodic() uses the
>>>> local time stamp to start a remote timer).
>>> Reading the code, there seem to be only two places where the local tsc
>>> is used to set a remote timer, it is xntimer_start_aperiodic, and
>>> xntimer_move_aperiodic, which is used by xntimer_migrate. So we are left
>>> with only one bug: starting a timer on the remote CPU, this could easily
>>> be implemented with a queue which would be handled by the timer IPI.
>>>
>>
>> As I said: fixable based on thorough review - but only a minor part of
>> the problem.
>>
>>
>>>> (...)
>>>> for smoothly adjusting clocks during runtime would be great. I'm
>>>> thinking about a generic infrastructure to synchronise the Xenomai time
>>>> on external sources (=>distributed clocks).
>>> What do you want, NTP ? Or calling xnpod_settime periodically ?
>>>
>>
>> More like NTP: monotonic clocks that can be derived from custom
>> synchronisation signals, maybe optimised/simplified with fact in mind
>> that those signals should have bounded worst-case jitters (on a hard
>> real-time system like Xenomai).
>>
>> We just implemented such synchronisation based on CAN and serial
>> null-modem signals. RTnet comes with a high-quality distributed clock as
>> well. We have not yet implemented a smart clock adjustment (specifically
>> because our application only needs about 1 millisecond accuracy).
>
> What about emitting Adeos events when receiving NTP corrections ? This
> way, we would avoid reinventing NTP ?
>
IIRC, NTP was designed to synchronise clocks over unreliable and slow
media. The math behind it /may/ be useful (though it may also turn out
to be too heavy), but beyond that... Have you ever seen an NTP-over-CAN
implementation, for example?
And what would NTP over standard network buy us on a Xenomai system when
the accuracy of NTP time stamps taken under Linux gets additionally
degraded by Xenomai activity? I thought about NTP for our problem for a
short while, but then quickly dropped the idea due to lacking guarantees.
Likely, we rather need something like IEEE 1588, but there are
unfortunate patents around that protocol.
* Re: [Xenomai-core] Getting the clock model right
From: Jan Kiszka @ 2007-04-06 17:06 UTC
To: Gilles Chanteperdrix; +Cc: xenomai-core
Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> What about emitting Adeos events when receiving NTP corrections ? This
>> way, we would avoid reinventing NTP ?
>>
>
> IIRC, NTP was designed to synchronise clocks over unreliable and slow
> media. The math behind it /may/ be useful (though it may also turn out
> to be too heavy), but beyond that... Already seen a NTP-over-CAN
> realisation, e.g.?
>
> And what would NTP over standard network buy us on a Xenomai system when
> the accuracy of NTP time stamps taken under Linux gets additionally
> degraded by Xenomai activity? I thought about NTP for our problem for a
> short while, but then quickly dropped the idea due to lacking guarantees.
>
> Likely, we rather need something like IEEE 1588, but there are
> unfortunate patents around that protocol.
>
Hmm, I think I just got distracted from my original idea.
IEEE 1588, NTP, whatever, those are synchronisation protocols, designed
for specific media. What we probably need in the Xenomai nucleus is a
generic infrastructure to tune the local clock based on some offset and
drift factor, however those were obtained. That leaves the door open for
any protocol and communication medium to exchange the required sync
information.
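Roughly, such an infrastructure could boil down to something like the sketch below: a clock defined by a base point plus a rate correction, and one entry point through which any protocol feeds its measured offset and drift. This is only meant to show the shape of the interface; tuned_clock, tuned_clock_read and tuned_clock_set are hypothetical names, and the offset is applied as a plain step here for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local clock tuned by an offset and a drift factor. */
struct tuned_clock {
    uint64_t base_raw;  /* raw reading (ns) when the parameters were last set */
    uint64_t base_adj;  /* adjusted time (ns) at that instant */
    int32_t drift_ppb;  /* rate correction in parts per billion */
};

/* Adjusted time for a raw reading: base + elapsed * (1 + drift). */
static uint64_t tuned_clock_read(const struct tuned_clock *c, uint64_t raw)
{
    uint64_t elapsed;
    int64_t corr;

    assert(c != NULL);
    elapsed = raw - c->base_raw;
    /* Split into whole seconds and remainder to avoid overflow. */
    corr = (int64_t)(elapsed / 1000000000) * c->drift_ppb
         + (int64_t)(elapsed % 1000000000) * c->drift_ppb / 1000000000;
    return c->base_adj + elapsed + (uint64_t)corr;
}

/*
 * Feed new sync information, however it was obtained. The clock is
 * re-based at 'raw' so that time before this call is not rewritten.
 */
static void tuned_clock_set(struct tuned_clock *c, uint64_t raw,
                            int64_t offset_ns, int32_t drift_ppb)
{
    uint64_t now = tuned_clock_read(c, raw);

    c->base_raw = raw;
    c->base_adj = now + (uint64_t)offset_ns;
    c->drift_ppb = drift_ppb;
}
```

A CAN, serial or RTnet based synchronisation service would then only compute (offset_ns, drift_ppb) and call the setter; the nucleus would not need to know which medium was used.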
Jan