* Detecting shift of CLOCK_REALTIME with clock_nanosleep (again) @ 2012-10-31 15:54 Scot Salmon 2012-10-31 20:08 ` Thomas Gleixner 0 siblings, 1 reply; 15+ messages in thread From: Scot Salmon @ 2012-10-31 15:54 UTC (permalink / raw) To: linux-rt-users; +Cc: Thomas Gleixner, Alexander Shishkin At RTLWS, I talked to tglx about patching clock_nanosleep to detect shifts in CLOCK_REALTIME. He suggested I bring it up here for comments. Alexander Shishkin proposed a patch for this a while back: http://thread.gmane.org/gmane.linux.kernel/1110759/focus=1131917 At that time Thomas rightly responded that hacking the POSIX clock_nanosleep API (both overloading a parameter and adding a new return value) was quite horrible. The nanosleep use case in that thread seemed somewhat speculative, with references to "weird" cron jobs, and Alexander seemed satisfied when Thomas added this feature to timerfd. I described a more concrete use case to Thomas that is not solved by timerfd. We have multiple devices running control loops using clock_nanosleep and TIMER_ABSTIME to get good periodic wakeups. The clocks need to be synchronized across the controllers so that the loops themselves can be in sync. In order to use a synchronized clock we have to use CLOCK_REALTIME. But if the control loop starts, and then the time sync protocol kicks in and shifts the clock, that breaks the control loop, the most obvious case being if time shifts backwards and a loop that should be running at 100us takes 100us + some arbitrary amount of time shift, potentially measured in minutes or even days. timerfd has the behavior I need, but its performance is much worse than clock_nanosleep, we believe because the wakeup goes through ksoftirqd. I applied Alexander's patches 2 and 3, which is the subset I care about (avoiding 1 and 4, since we already implemented 1 as a first pass at this issue, and Thomas already implemented 4 separately), and it does what I need. But, Thomas's objections are still valid so I am not sure I like this solution. Could reworking the way that timerfd handles wakeups be an option? We'd like some guidance: if that's something you feel worth investigating, we're willing to do so. -Scot ------ Scot Salmon (scot.salmon@ni.com) National Instruments, Austin, TX 101010, 222, 52, ... ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Detecting shift of CLOCK_REALTIME with clock_nanosleep (again) 2012-10-31 15:54 Detecting shift of CLOCK_REALTIME with clock_nanosleep (again) Scot Salmon @ 2012-10-31 20:08 ` Thomas Gleixner 2012-11-15 19:28 ` Scot Salmon 0 siblings, 1 reply; 15+ messages in thread From: Thomas Gleixner @ 2012-10-31 20:08 UTC (permalink / raw) To: Scot Salmon Cc: linux-rt-users, Alexander Shishkin, Peter Zijlstra, John Stultz On Wed, 31 Oct 2012, Scot Salmon wrote: > I described a more concrete use case to Thomas that is not solved by > timerfd. We have multiple devices running control loops using > clock_nanosleep and TIMER_ABSTIME to get good periodic wakeups. The > clocks need to be synchronized across the controllers so that the loops > themselves can be in sync. In order to use a synchronized clock we have > to use CLOCK_REALTIME. But if the control loop starts, and then the time > sync protocol kicks in and shifts the clock, that breaks the control loop, > the most obvious case being if time shifts backwards and a loop that > should be running at 100us takes 100us + some arbitrary amount of time > shift, potentially measured in minutes or even days. timerfd has the > behavior I need, but its performance is much worse than clock_nanosleep, > we believe because the wakeup goes through ksoftirqd. With less conference induced brain damage I think your problem needs to be solved differently. What you are concerned about is keeping the machines in sync on a common global timeline. Though your more fundamental requirement is that you get the wakeup on each machine in the given cycle time. The global synchronization mechanism just adjusts that local periodic schedule. So when you start up a control process on a node you align the cycle time of this node to the global CLOCK_REALTIME timeline. That's why you decided to use CLOCK_REALTIME in the first place, but then as you observed correctly this sucks due to the nature of CLOCK_REALTIME which can be affected by leap seconds, daylight saving changes and other interesting events. So ideally you should use CLOCK_MONOTONIC for scheduling your periodic timeline, but you can't as you do not have a proper correlation between CLOCK_REALTIME, which provides your global synchronization, and the machine local CLOCK_MONOTONIC. What you really want is an atomic readout facility for CLOCK_MONOTONIC and CLOCK_REALTIME. That allows you to align the CLOCK_MONOTONIC based timer with the global CLOCK_REALTIME based time line and in the event that the CLOCK_REALTIME clock was set and jumped forward/backward you have full software control over the aligning mechanism including the ability to do sanity checking. Lets look at an example. T1 1000 1050 <--- Time correction resets global time to 1000 T2 1100 Now you have the problem when your wakeup is actually happening. 50 us delta is not a huge amount of time to propagate this to all CPUs and all involved distributed systems. So what happens if system 1 sees that update right away, but system 2 sees it just at the real timer wakeup point? Then suddenly your loops are off by 50us for at least one cycle. Not what you want, right? So in the CLOCK_MONOTONIC case you still maintain the accuracy of your periodic 100us event. The accuracy of CLOCK_MONOTONIC across (NTP/PTP) time synced systems is way better than any mechanism which relies on "timely" notification of CLOCK_REALTIME changes. The minimal clock skew adjustments which affect the global CLOCK_REALTIME are propagated to CLOCK_MONOTONIC as well, so you don't have to worry about that at all. All what you need to be concerned about is the time jump issue. But then again CLOCK_MONOTONIC will not follow those time jumps and therefor maintain your XXXus periods for quite some time with accurate synchronous behaviour. With an atomic readout of CLOCK_MONOTONIC and CLOCK_REALTIME you can be clever and safe about adjusting to a 50us or whatever large scale global time line change. You can actually verify in your cluster whether this was a legitimate change or just the random typo of the sysadmin and you can agree on how to deal with the time jump in a coordinated way, i.e. jumping forward sychronously on a given time stamp or gradually adjusting it in microsecond steps. Thanks, tglx ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Detecting shift of CLOCK_REALTIME with clock_nanosleep (again) 2012-10-31 20:08 ` Thomas Gleixner @ 2012-11-15 19:28 ` Scot Salmon 2012-11-15 20:53 ` John Stultz 0 siblings, 1 reply; 15+ messages in thread From: Scot Salmon @ 2012-11-15 19:28 UTC (permalink / raw) To: linux-rt-users; +Cc: John Stultz, Thomas Gleixner Thomas Gleixner <tglx@linutronix.de> wrote on 10/31/2012 03:08:45 PM: > What you are concerned about is keeping the machines in sync on a > common global timeline. Though your more fundamental requirement is > that you get the wakeup on each machine in the given cycle time. The > global synchronization mechanism just adjusts that local periodic > schedule. [...] > What you really want is an atomic readout facility for CLOCK_MONOTONIC > and CLOCK_REALTIME. That allows you to align the CLOCK_MONOTONIC based > timer with the global CLOCK_REALTIME based time line and in the event > that the CLOCK_REALTIME clock was set and jumped forward/backward you > have full software control over the aligning mechanism including the > ability to do sanity checking. This works for me. We discussed it on IRC and agreed on "something like clock_gettime2(id, id, *t1, *t2) with a sanity check on the ids, that allows us other combinations than MONO/REAL as well". -Scot ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Detecting shift of CLOCK_REALTIME with clock_nanosleep (again) 2012-11-15 19:28 ` Scot Salmon @ 2012-11-15 20:53 ` John Stultz 2012-11-15 21:01 ` Thomas Gleixner 2012-12-19 20:43 ` Scot Salmon 0 siblings, 2 replies; 15+ messages in thread From: John Stultz @ 2012-11-15 20:53 UTC (permalink / raw) To: Scot Salmon; +Cc: linux-rt-users, Thomas Gleixner On 11/15/2012 11:28 AM, Scot Salmon wrote: > Thomas Gleixner <tglx@linutronix.de> wrote on 10/31/2012 03:08:45 PM: >> What you are concerned about is keeping the machines in sync on a >> common global timeline. Though your more fundamental requirement is >> that you get the wakeup on each machine in the given cycle time. The >> global synchronization mechanism just adjusts that local periodic >> schedule. > [...] >> What you really want is an atomic readout facility for CLOCK_MONOTONIC >> and CLOCK_REALTIME. That allows you to align the CLOCK_MONOTONIC based >> timer with the global CLOCK_REALTIME based time line and in the event >> that the CLOCK_REALTIME clock was set and jumped forward/backward you >> have full software control over the aligning mechanism including the >> ability to do sanity checking. > This works for me. We discussed it on IRC and agreed on "something > like clock_gettime2(id, id, *t1, *t2) with a sanity check on the ids, > that allows us other combinations than MONO/REAL as well". Yea, there's been a few times that folks have brought up the need, and this sort of interface is probably the best way to go. I suspect clock_gettime3 isn't far behind though :) thanks -john ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Detecting shift of CLOCK_REALTIME with clock_nanosleep (again) 2012-11-15 20:53 ` John Stultz @ 2012-11-15 21:01 ` Thomas Gleixner 2012-11-15 22:25 ` John Stultz 2012-12-19 20:43 ` Scot Salmon 1 sibling, 1 reply; 15+ messages in thread From: Thomas Gleixner @ 2012-11-15 21:01 UTC (permalink / raw) To: John Stultz; +Cc: Scot Salmon, linux-rt-users On Thu, 15 Nov 2012, John Stultz wrote: > On 11/15/2012 11:28 AM, Scot Salmon wrote: > > Thomas Gleixner <tglx@linutronix.de> wrote on 10/31/2012 03:08:45 PM: > > > What you are concerned about is keeping the machines in sync on a > > > common global timeline. Though your more fundamental requirement is > > > that you get the wakeup on each machine in the given cycle time. The > > > global synchronization mechanism just adjusts that local periodic > > > schedule. > > [...] > > > What you really want is an atomic readout facility for CLOCK_MONOTONIC > > > and CLOCK_REALTIME. That allows you to align the CLOCK_MONOTONIC based > > > timer with the global CLOCK_REALTIME based time line and in the event > > > that the CLOCK_REALTIME clock was set and jumped forward/backward you > > > have full software control over the aligning mechanism including the > > > ability to do sanity checking. > > This works for me. We discussed it on IRC and agreed on "something > > like clock_gettime2(id, id, *t1, *t2) with a sanity check on the ids, > > that allows us other combinations than MONO/REAL as well". > Yea, there's been a few times that folks have brought up the need, and this > sort of interface is probably the best way to go. > > I suspect clock_gettime3 isn't far behind though :) I guess you are talking about MONO/MONORAW/REAL. So if you already know about such requirements we should go for that, but we should try to restrict it to combinations which can be easily supported in the VDSO as well. Thanks, tglx ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Detecting shift of CLOCK_REALTIME with clock_nanosleep (again) 2012-11-15 21:01 ` Thomas Gleixner @ 2012-11-15 22:25 ` John Stultz 2012-11-19 18:27 ` Thomas Gleixner 0 siblings, 1 reply; 15+ messages in thread From: John Stultz @ 2012-11-15 22:25 UTC (permalink / raw) To: Thomas Gleixner; +Cc: Scot Salmon, linux-rt-users On 11/15/2012 01:01 PM, Thomas Gleixner wrote: > On Thu, 15 Nov 2012, John Stultz wrote: >> On 11/15/2012 11:28 AM, Scot Salmon wrote: >>> Thomas Gleixner <tglx@linutronix.de> wrote on 10/31/2012 03:08:45 PM: >>>> What you are concerned about is keeping the machines in sync on a >>>> common global timeline. Though your more fundamental requirement is >>>> that you get the wakeup on each machine in the given cycle time. The >>>> global synchronization mechanism just adjusts that local periodic >>>> schedule. >>> [...] >>>> What you really want is an atomic readout facility for CLOCK_MONOTONIC >>>> and CLOCK_REALTIME. That allows you to align the CLOCK_MONOTONIC based >>>> timer with the global CLOCK_REALTIME based time line and in the event >>>> that the CLOCK_REALTIME clock was set and jumped forward/backward you >>>> have full software control over the aligning mechanism including the >>>> ability to do sanity checking. >>> This works for me. We discussed it on IRC and agreed on "something >>> like clock_gettime2(id, id, *t1, *t2) with a sanity check on the ids, >>> that allows us other combinations than MONO/REAL as well". >> Yea, there's been a few times that folks have brought up the need, and this >> sort of interface is probably the best way to go. >> >> I suspect clock_gettime3 isn't far behind though :) > I guess you are talking about MONO/MONORAW/REAL. Or MONO/BOOT/REAL or eventually MONO/TAI/REAL, etc.. But there's probably someone out there who will want all kernel state exported atomically, and its just not reasonable. The real problem is that figuring out the relationship between clocks has required the three read interpoloation, which isn't something easily or quickly done with good accuracy. If we provide clock_gettime2(), that will allow the relationship between clockid pairs to be accurately determined. And if folks need to calculate the relationship between three clockids, they can do a similar three read and bound the output. ie: do { clock_gettime2(MONO, REAL, mon1,real1); clock_gettime2(MONO,BOOT, mon2,boot2); clock_gettime2(MONO,REAL, mon3, real3); } while( real1-mon1 != real3-mon3); So it might be over-designing to provide a clock_gettime3 from the start (since then folks will want clock_gettime5.. :) > So if you already > know about such requirements we should go for that, but we should try > to restrict it to combinations which can be easily supported in the > VDSO as well. Other then the cputime clockids, I'm not sure I see what combinations wouldn't work with the vdso. As long as the clocksource is accessible, everything else is exportable. thanks -john ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Detecting shift of CLOCK_REALTIME with clock_nanosleep (again) 2012-11-15 22:25 ` John Stultz @ 2012-11-19 18:27 ` Thomas Gleixner 0 siblings, 0 replies; 15+ messages in thread From: Thomas Gleixner @ 2012-11-19 18:27 UTC (permalink / raw) To: John Stultz; +Cc: Scot Salmon, linux-rt-users On Thu, 15 Nov 2012, John Stultz wrote: > On 11/15/2012 01:01 PM, Thomas Gleixner wrote: > > On Thu, 15 Nov 2012, John Stultz wrote: > > > On 11/15/2012 11:28 AM, Scot Salmon wrote: > > > > Thomas Gleixner <tglx@linutronix.de> wrote on 10/31/2012 03:08:45 PM: > > > > > What you are concerned about is keeping the machines in sync on a > > > > > common global timeline. Though your more fundamental requirement is > > > > > that you get the wakeup on each machine in the given cycle time. The > > > > > global synchronization mechanism just adjusts that local periodic > > > > > schedule. > > > > [...] > > > > > What you really want is an atomic readout facility for CLOCK_MONOTONIC > > > > > and CLOCK_REALTIME. That allows you to align the CLOCK_MONOTONIC based > > > > > timer with the global CLOCK_REALTIME based time line and in the event > > > > > that the CLOCK_REALTIME clock was set and jumped forward/backward you > > > > > have full software control over the aligning mechanism including the > > > > > ability to do sanity checking. > > > > This works for me. We discussed it on IRC and agreed on "something > > > > like clock_gettime2(id, id, *t1, *t2) with a sanity check on the ids, > > > > that allows us other combinations than MONO/REAL as well". > > > Yea, there's been a few times that folks have brought up the need, and > > > this > > > sort of interface is probably the best way to go. > > > > > > I suspect clock_gettime3 isn't far behind though :) > > I guess you are talking about MONO/MONORAW/REAL. > Or MONO/BOOT/REAL or eventually MONO/TAI/REAL, etc.. > > But there's probably someone out there who will want all kernel state exported > atomically, and its just not reasonable. > > The real problem is that figuring out the relationship between clocks has > required the three read interpoloation, which isn't something easily or > quickly done with good accuracy. If we provide clock_gettime2(), that will > allow the relationship between clockid pairs to be accurately determined. And > if folks need to calculate the relationship between three clockids, they can > do a similar three read and bound the output. > ie: > do { > clock_gettime2(MONO, REAL, mon1,real1); > clock_gettime2(MONO,BOOT, mon2,boot2); > clock_gettime2(MONO,REAL, mon3, real3); > } while( real1-mon1 != real3-mon3); > > So it might be over-designing to provide a clock_gettime3 from the start > (since then folks will want clock_gettime5.. :) Fair enough! > > So if you already > > know about such requirements we should go for that, but we should try > > to restrict it to combinations which can be easily supported in the > > VDSO as well. > Other then the cputime clockids, I'm not sure I see what combinations wouldn't > work with the vdso. > As long as the clocksource is accessible, everything else is exportable. Ok. We probably should add the missing ones to the VDSO then. Thanks, tglx ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Detecting shift of CLOCK_REALTIME with clock_nanosleep (again) 2012-11-15 20:53 ` John Stultz 2012-11-15 21:01 ` Thomas Gleixner @ 2012-12-19 20:43 ` Scot Salmon 2012-12-19 20:57 ` John Stultz 1 sibling, 1 reply; 15+ messages in thread From: Scot Salmon @ 2012-12-19 20:43 UTC (permalink / raw) To: John Stultz; +Cc: linux-rt-users, Thomas Gleixner John Stultz wrote on 11/15/2012 02:53:56 PM: > On 11/15/2012 11:28 AM, Scot Salmon wrote: > > Thomas Gleixner wrote on 10/31/2012 03:08:45 PM: > > > What you are concerned about is keeping the machines in sync on a > > > common global timeline. Though your more fundamental requirement is > > > that you get the wakeup on each machine in the given cycle time. The > > > global synchronization mechanism just adjusts that local periodic > > > schedule. > > [...] > > > What you really want is an atomic readout facility for > > > CLOCK_MONOTONIC and CLOCK_REALTIME. That allows you to align the > > > CLOCK_MONOTONIC based timer with the global CLOCK_REALTIME based > > > time line and in the event that the CLOCK_REALTIME clock was set > > > and jumped forward/backward you have full software control over > > > the aligning mechanism including the ability to do sanity checking. > > This works for me. We discussed it on IRC and agreed on "something > > like clock_gettime2(id, id, *t1, *t2) with a sanity check on the ids, > > that allows us other combinations than MONO/REAL as well". > Yea, there's been a few times that folks have brought up the need, and > this sort of interface is probably the best way to go. I spent a few days working on clock_gettime2 this week, and I've hit a snag. I tried an implementation where I look up the two kclocks from their clockid and call kc1->clock_get() and kc2->clock_get() with the timekeeping data locked. The problem is, if id1 and id2 are the obvious choices of MONO and REAL, their clock_get's end up calling timekeeping_get_ns which samples the clock source synchronously, so I get two discrete samples of the underlying clock even if I have locked out any changes to the timekeeping data. This implementation does work for the _COARSE clockid variants, but it seems from some quick checks that those aren't going to be suitable for my needs (too coarse). Since there's nothing I can do to keep the underlying hardware clock source from ticking, it seems like my options are pretty limited. I can't do anything that will lead to two samples of the clock source. All I can think of is to sample it for the first clockid, then derive the value for the second from the first, for example by offsetting a CLOCK_MONOTONIC read with tk->offs_real and returning that as the synchronized value for CLOCK_REALTIME. Which isn't incredibly horrible for those two, but making it general would get very ugly. Any thoughts? -Scot ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Detecting shift of CLOCK_REALTIME with clock_nanosleep (again) 2012-12-19 20:43 ` Scot Salmon @ 2012-12-19 20:57 ` John Stultz 2013-01-05 4:09 ` Richard Cochran 0 siblings, 1 reply; 15+ messages in thread From: John Stultz @ 2012-12-19 20:57 UTC (permalink / raw) To: Scot Salmon; +Cc: linux-rt-users, Thomas Gleixner On 12/19/2012 12:43 PM, Scot Salmon wrote: > John Stultz wrote on 11/15/2012 02:53:56 PM: >> On 11/15/2012 11:28 AM, Scot Salmon wrote: >>> Thomas Gleixner wrote on 10/31/2012 03:08:45 PM: >>>> What you are concerned about is keeping the machines in sync on a >>>> common global timeline. Though your more fundamental requirement is >>>> that you get the wakeup on each machine in the given cycle time. The >>>> global synchronization mechanism just adjusts that local periodic >>>> schedule. >>> [...] >>>> What you really want is an atomic readout facility for >>>> CLOCK_MONOTONIC and CLOCK_REALTIME. That allows you to align the >>>> CLOCK_MONOTONIC based timer with the global CLOCK_REALTIME based >>>> time line and in the event that the CLOCK_REALTIME clock was set >>>> and jumped forward/backward you have full software control over >>>> the aligning mechanism including the ability to do sanity checking. >>> This works for me. We discussed it on IRC and agreed on "something >>> like clock_gettime2(id, id, *t1, *t2) with a sanity check on the ids, >>> that allows us other combinations than MONO/REAL as well". >> Yea, there's been a few times that folks have brought up the need, and >> this sort of interface is probably the best way to go. > I spent a few days working on clock_gettime2 this week, and I've hit a > snag. I tried an implementation where I look up the two kclocks from > their clockid and call kc1->clock_get() and kc2->clock_get() with the > timekeeping data locked. The problem is, if id1 and id2 are the obvious > choices of MONO and REAL, their clock_get's end up calling > timekeeping_get_ns which samples the clock source synchronously, so I > get two discrete samples of the underlying clock even if I have locked > out any changes to the timekeeping data. > > This implementation does work for the _COARSE clockid variants, but it > seems from some quick checks that those aren't going to be suitable for > my needs (too coarse). > > Since there's nothing I can do to keep the underlying hardware clock > source from ticking, it seems like my options are pretty limited. I > can't do anything that will lead to two samples of the clock source. > All I can think of is to sample it for the first clockid, then derive > the value for the second from the first, for example by offsetting a > CLOCK_MONOTONIC read with tk->offs_real and returning that as the > synchronized value for CLOCK_REALTIME. Which isn't incredibly horrible > for those two, but making it general would get very ugly. > > Any thoughts? Likely this will require some refactoring of the current timekeeping accessor functions. Basically you need to split the current accsessor functions into two parts: 1) The clocksource hardware read 2) The time calculation, given the clocksource value Then for the gettime2() implementation, do a *single* read of the hardware, and pass that clocksource value into the different specified clockid functions. Another important thing to consider here, is that if we are considering adding a new syscall, we should avoid the 2038 issue, and specify a new time64_t and timespec64 types, and use them in the new interface. This way we can provide a 64bit time interface for 32bit systems. thanks -john ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Detecting shift of CLOCK_REALTIME with clock_nanosleep (again) 2012-12-19 20:57 ` John Stultz @ 2013-01-05 4:09 ` Richard Cochran 2013-01-21 15:36 ` Scot Salmon 0 siblings, 1 reply; 15+ messages in thread From: Richard Cochran @ 2013-01-05 4:09 UTC (permalink / raw) To: John Stultz; +Cc: Scot Salmon, linux-rt-users, Thomas Gleixner Instead of looking for new system calls, why not just solve this problem at the application level? 1. boot machine 2. start ptp4l 3. wait five seconds 4. start control app, and just use clock_realtime HTH, Richard ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Detecting shift of CLOCK_REALTIME with clock_nanosleep (again) 2013-01-05 4:09 ` Richard Cochran @ 2013-01-21 15:36 ` Scot Salmon 2013-01-21 19:08 ` John Stultz 2013-01-21 19:12 ` Richard Cochran 0 siblings, 2 replies; 15+ messages in thread From: Scot Salmon @ 2013-01-21 15:36 UTC (permalink / raw) To: Richard Cochran; +Cc: John Stultz, linux-rt-users, Thomas Gleixner Richard Cochran <richardcochran@gmail.com> wrote on 01/04/2013 10:09:46 PM: > Instead of looking for new system calls, why not just solve this > problem at the application level? > > 1. boot machine > 2. start ptp4l > 3. wait five seconds > 4. start control app, and just use clock_realtime > > HTH, > Richard Richard and I talked about this a bit (thank you again Richard). My solution needs to handle a variety of different clock shift scenarios, because I'm not designing the final application but trying to provide a system that can support a variety of applications. I think the two variables that matter most are (1) that the two controllers might be on the same subnet or in different countries, and (2) that they might be synced by 1588 or SNTP. Another possibly important variable is that the time sync server might not be reliably available. Ideally the same system would work on all the combinations. Richard suggested a user-mode solution where I just sample the two clocks in a tight loop and determine the shift between them by the smallest delta across N loops. That solution is nice for a couple of reasons, first, no syscall, second, it supports arbitrary combinations of clockid's really nicely. One drawback of that solution for me is that for the specific clocks I care about (MONO and REAL) it is not optimal. The delta between those two clocks is available in the kernel timekeeping data, so a syscall just handling those two would be able to get the exact offset quickly. I'd still really like to be able to get that offset. One way to do that would be to take a small step along the clock_gettime2 path and just have it return an error if the clockid's are anything other than MONO and REAL. That seems ugly, but it's straightforward to implement, and would work for me and maybe others with basically similar requirements, so I'll throw it out there. Another option is to continue all the way down the clock_gettime2 path, as John suggested: > Basically you need to split the current accsessor functions into two parts: > 1) The clocksource hardware read > 2) The time calculation, given the clocksource value > > Then for the gettime2() implementation, do a *single* read of the > hardware, and pass that clocksource value into the different specified > clockid functions. > > Another important thing to consider here, is that if we are considering > adding a new syscall, we should avoid the 2038 issue, and specify a new > time64_t and timespec64 types, and use them in the new interface. This > way we can provide a 64bit time interface for 32bit systems. This is something I'd like to work on, but it seems like a significant project, and I'm wondering if it might be acceptable to start with the reduced functionality version I mentioned above and work towards this. As a side note, Richard mentioned that the idea of a CLOCK_TAI has been thrown around before. I agree with him that it would be a useful addition for test & measurement applications. -Scot ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Detecting shift of CLOCK_REALTIME with clock_nanosleep (again) 2013-01-21 15:36 ` Scot Salmon @ 2013-01-21 19:08 ` John Stultz 2013-01-22 2:42 ` John Stultz 2013-01-21 19:12 ` Richard Cochran 1 sibling, 1 reply; 15+ messages in thread From: John Stultz @ 2013-01-21 19:08 UTC (permalink / raw) To: Scot Salmon; +Cc: Richard Cochran, linux-rt-users, Thomas Gleixner On 01/21/2013 07:36 AM, Scot Salmon wrote: > > One drawback of that solution for me is that for the specific clocks I > care about (MONO and REAL) it is not optimal. The delta between those two > clocks is available in the kernel timekeeping data, so a syscall just > handling those two would be able to get the exact offset quickly. I'd > still really like to be able to get that offset. > > One way to do that would be to take a small step along the clock_gettime2 > path and just have it return an error if the clockid's are anything other > than MONO and REAL. That seems ugly, but it's straightforward to > implement, and would work for me and maybe others with basically similar > requirements, so I'll throw it out there. Another option is to continue > all the way down the clock_gettime2 path, as John suggested: > >> Basically you need to split the current accsessor functions into two > parts: >> 1) The clocksource hardware read >> 2) The time calculation, given the clocksource value >> >> Then for the gettime2() implementation, do a *single* read of the >> hardware, and pass that clocksource value into the different specified >> clockid functions. >> >> Another important thing to consider here, is that if we are considering >> adding a new syscall, we should avoid the 2038 issue, and specify a new >> time64_t and timespec64 types, and use them in the new interface. This >> way we can provide a 64bit time interface for 32bit systems. > This is something I'd like to work on, but it seems like a significant > project, and I'm wondering if it might be acceptable to start with the > reduced functionality version I mentioned above and work towards this. Yea, its likely to be a reasonable amount of work to do it right (although really, its not too crazy, given its pretty limited scope - you're really just moving code into sub-functions). The problem with starting with the reduced functionality, especially when there's a sense of what the right approach should be, is that if we merge that, *someone* still has to clean things up to really finish the interface. So its usually not a good idea to merge such an approach unless there's a clear commitment and history of the submitter following through. The exceptions for this are usually when there's some interesting feature, but no one knows how to best integrate it (ie: how deeply), and in those cases, having a reduced functionality (even hackish) implementation, lets folks get a better sense of the functionality and what it duplicates or would integrate well with. In those cases, having something merged to learn from is more valuable. Now, as far as the 2038 issue goes, just defining the interface to be 64bit would be fine. You wouldn't have to convert all the internals to 64bits for just the interface to be useful (since converting the internals to be 64bits is something on my todo already, and on 64bit systems it would already be supported). > As a side note, Richard mentioned that the idea of a CLOCK_TAI has been > thrown around before. I agree with him that it would be a useful addition > for test & measurement applications. Yea, I've had some CLOCK_TAI patches in my tree for awhile now. However, its sort of half done, as I need to still integrate CLOCK_TAI hrtimer support before it can properly be merged. I've not had the time recently to finish this, but I'm hoping to get to it before too long (or so I've said for the last few releases). thanks -john ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Detecting shift of CLOCK_REALTIME with clock_nanosleep (again) 2013-01-21 19:08 ` John Stultz @ 2013-01-22 2:42 ` John Stultz 0 siblings, 0 replies; 15+ messages in thread From: John Stultz @ 2013-01-22 2:42 UTC (permalink / raw) To: Scot Salmon; +Cc: Richard Cochran, linux-rt-users, Thomas Gleixner On 01/21/2013 11:08 AM, John Stultz wrote: > On 01/21/2013 07:36 AM, Scot Salmon wrote: >> As a side note, Richard mentioned that the idea of a CLOCK_TAI has been >> thrown around before. I agree with him that it would be a useful >> addition >> for test & measurement applications. > Yea, I've had some CLOCK_TAI patches in my tree for awhile now. > However, its sort of half done, as I need to still integrate CLOCK_TAI > hrtimer support before it can properly be merged. I've not had the > time recently to finish this, but I'm hoping to get to it before too > long (or so I've said for the last few releases). So I figured I'd spend a few minutes and try to get some progress on this. My current tree can be found here: http://git.linaro.org/gitweb?p=people/jstultz/linux.git;a=shortlog;h=refs/heads/fortglx/3.10/time There's still some issues around notifying the hrtimer when the tai_offset changes that have to be dealt with, but if folks want to play around with it, I'd appreciate any feedback. thanks -john ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Detecting shift of CLOCK_REALTIME with clock_nanosleep (again) 2013-01-21 15:36 ` Scot Salmon 2013-01-21 19:08 ` John Stultz @ 2013-01-21 19:12 ` Richard Cochran 2013-01-23 15:14 ` Scot Salmon 1 sibling, 1 reply; 15+ messages in thread From: Richard Cochran @ 2013-01-21 19:12 UTC (permalink / raw) To: Scot Salmon; +Cc: John Stultz, linux-rt-users, Thomas Gleixner I still think that your problems can be solved at the application level quite easily. Just out of curiosity, I wrote a little test program, below, and I think it shows a promising direction to solve your issue. (The 'clk_cmp' function was taken from the userland servo in the phc2sys program, from the linuxptp project.) I am not sure what your application wants to do if it should notice a jump in CLOCK_REALTIME: keep the old phase angle or re-align to the new UTC second. My test program does not shift its phase, but I think you would know how to do it. Running on my dinky atom netbook (32 bit, no vdso, no RT) in an unscientific test, I see the value of 'expected - offset' typically around a few hundred nanoseconds, with occasional outliers of a few microseconds. With RT-PREMPT these could surely be avoided, or by using a moving average of say, 10 readings. Also, rather than jumping to the new phase angle (if that is what you want in fact to do), you could smoothly ajust the loop phase using a PI controller or similar. Anyhow, here it is. What do you think? Thanks, Richard --- #include <errno.h> #include <stdint.h> #include <stdio.h> #include <stdlib.h> #include <time.h> #include <unistd.h> #include <inttypes.h> #define NS_PER_SEC 1000000000LL static int clk_cmp(clockid_t clkid, clockid_t sysclk, int readings, int64_t *offset, uint64_t *ts) { struct timespec tdst1, tdst2, tsrc; int i; int64_t interval, best_interval = INT64_MAX; /* Pick the quickest clkid reading. */ for (i = 0; i < readings; i++) { if (clock_gettime(sysclk, &tdst1) || clock_gettime(clkid, &tsrc) || clock_gettime(sysclk, &tdst2)) { perror("clock_gettime"); return -1; } interval = (tdst2.tv_sec - tdst1.tv_sec) * NS_PER_SEC + tdst2.tv_nsec - tdst1.tv_nsec; if (best_interval > interval) { best_interval = interval; *offset = (tdst1.tv_sec - tsrc.tv_sec) * NS_PER_SEC + tdst1.tv_nsec - tsrc.tv_nsec + interval / 2; *ts = tdst2.tv_sec * NS_PER_SEC + tdst2.tv_nsec; } } return 0; } int main(int argc, char *argv[]) { int64_t expected, offset; uint64_t systs; struct timespec next; struct timespec incr = { 0, 250000000 }; clockid_t clk = CLOCK_MONOTONIC; int n_readings = 1; /* * Calibrate the offset. */ if (clk_cmp(clk, CLOCK_REALTIME, 25, &expected, &systs)) { return -1; } if (clock_gettime(clk, &next)) { perror("clock_gettime"); return -1; } next.tv_sec++; while (1) { if (clock_nanosleep(clk, TIMER_ABSTIME, &next, NULL)) { perror("clock_nanosleep"); break; } if (clk_cmp(clk, CLOCK_REALTIME, n_readings, &offset, &systs)) { break; } printf("%" PRId64 "\n", expected - offset); /* * At this point, if the difference is more than, say, * 10 microseconds, then you could simply recalibrate * the offset and set the next deadline. */ next.tv_sec += incr.tv_sec; next.tv_nsec += incr.tv_nsec; while (next.tv_nsec >= NS_PER_SEC) { next.tv_nsec -= NS_PER_SEC; next.tv_sec++; } } return 0; } ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Detecting shift of CLOCK_REALTIME with clock_nanosleep (again) 2013-01-21 19:12 ` Richard Cochran @ 2013-01-23 15:14 ` Scot Salmon 0 siblings, 0 replies; 15+ messages in thread From: Scot Salmon @ 2013-01-23 15:14 UTC (permalink / raw) To: Richard Cochran Cc: John Stultz, linux-rt-users, linux-rt-users-owner, Thomas Gleixner Richard- Skipping to the punch line, that worked for us. Thanks again. Details: > I still think that your problems can be solved at the application > level quite easily. Just out of curiosity, I wrote a little test > program, below, and I think it shows a promising direction to solve > your issue. (The 'clk_cmp' function was taken from the userland servo > in the phc2sys program, from the linuxptp project.) ... > Running on my dinky atom netbook (32 bit, no vdso, no RT) in an > unscientific test, I see the value of 'expected - offset' typically > around a few hundred nanoseconds, with occasional outliers of a few > microseconds. With RT-PREMPT these could surely be avoided... I tried your code on a 667MHz ARM Cortex-A9 RT patched system and the initial results looked good. I made a few tweaks to make the test a little more aggressive, changing the period to something faster and not-round so that it would be more likely to conflict with periodic interrupts, changing the reporting so that it only tells me when the delta got worse so that the spam from the faster rate is manageable, and setting it to run at an RT priority appropriate for our applications (without the RT priority, the results were predictably very bad under load, like 180us of variation). Then I ran it under some load (gzip'ing two large files to load both cores, then scp'ing another large file to hit the network interrupt). I saw a few readings more than 10us from the expected value which concerned me at first. But, I think this turns out to be okay, as you predicted: > ...using a moving average of say, 10 readings. > > Also, rather than jumping to the new phase angle (if that is what you > want in fact to do), you could smoothly ajust the loop phase using a > PI controller or similar. I talked to the guys who actually use the data I'm providing here, and they already do what you suggest, and figure that the occasional spike I reported is acceptable for that reason. They also like the fact that the user mode solution would work on older kernels where the syscall would not be available. I also noticed later that the 2nd...n'th readings after the nanosleep are consistently much better than the first one, i.e. n_readings of at least 2 gives a perhaps-surprising improvement in the results. The 1st offset reading is consistently low, i.e. with n_readings = 1, expected - offset is almost always positive. This may be because of a first-iteration cache miss in the gettime(MONO) code path, and is mitigated either by using n_readings = 2 or adding a dummy gettime(MONO) before clk_cmp. It still makes me itch a little to use an approximation when the exact value is tantalizingly close, but all things considered your user mode solution is the right answer. Thank you very much for the idea, the demo, and all the patient advice. John had mentioned earlier in the thread, regarding clock_gettime2: > Yea, there's been a few times that folks have brought up the need, and > this sort of interface is probably the best way to go. I'd be interested to know more about the other needs. Ultimately if gettime2 is still useful and worth adding anyway, then I'd use it, so I can still pursue it if others think it's worthwhile for uses not covered by Richard's solution. -Scot ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2013-01-23 15:53 UTC | newest] Thread overview: 15+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-10-31 15:54 Detecting shift of CLOCK_REALTIME with clock_nanosleep (again) Scot Salmon 2012-10-31 20:08 ` Thomas Gleixner 2012-11-15 19:28 ` Scot Salmon 2012-11-15 20:53 ` John Stultz 2012-11-15 21:01 ` Thomas Gleixner 2012-11-15 22:25 ` John Stultz 2012-11-19 18:27 ` Thomas Gleixner 2012-12-19 20:43 ` Scot Salmon 2012-12-19 20:57 ` John Stultz 2013-01-05 4:09 ` Richard Cochran 2013-01-21 15:36 ` Scot Salmon 2013-01-21 19:08 ` John Stultz 2013-01-22 2:42 ` John Stultz 2013-01-21 19:12 ` Richard Cochran 2013-01-23 15:14 ` Scot Salmon
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).