* Re: [ntp:hackers] Linux feature
[not found] <3BCB4401.E2E84470@udel.edu>
@ 2001-10-16 6:48 ` Ulrich Windl
From: Ulrich Windl @ 2001-10-16 6:48 UTC (permalink / raw)
To: David L. Mills
On 15 Oct 2001, at 21:16, David L. Mills wrote:
> Guys,
>
> My Linux test box backroom.udel.edu has gone nuts. The clock frequency
> is pegged at 500 PPM and the offset swings wildly from peg to peg at 900
> s intervals. Sound familiar?
>
> I found a bit of apparently misguided sleaze that explains a lot. The
> initial conditions happened to be a time error of about 600 ms and a
> frequency error of 500 PPM (from the ntp.drift file). This triggers a
> clock transition to state 3 on the assumption the frequency error is
> very, very large (which it is). After the 900-s stepout threshold, the
> time is reset and the frequency recalculated. Ordinarily, this corrects
> the errors right away and good timekeeping continues eventually to state
> 4. However, the step correction requested is only partially implemented
> by the Linux kernel, so the frequency correction is in error. This makes
Can you elaborate on that? I think in standard Linux a running
adjtime() continues even after setting the time, and MOD_OFFSET and
adjtime() both use the same variable, so don't expect the other to
finish once you have used either one. Is that the problem?
You could (to send me some flames) try a PPSkit-patched kernel instead.
The latest offer (not officially released) is for kernel 2.4.7. It has
been working at home for some weeks now...
> the problem worse and operation continues in simulate-pinball mode
> forever. Restarting the daemon with ntp.drift set to zero does not fix
> the problem.
>
> Additional evidence suggests the ntpdate program reports a step
> correction, but the step is not implemented immediately; in fact, it
> takes maybe twenty seconds to complete. I expect this is why a wait
Odd, very odd! Can you track it down to a system call?
> interval is implemented in some startup scripts. Note the ntpdate
> documentation (not written by me):
>
> -B
> Force the time to always be slewed using the adjtime() system call,
> even if the measured offset is greater than +-128 ms. The default is to
> step the time using settimeofday() if the offset is greater than +-128
> ms. Note that, if the offset is much greater than +-128 ms in this case,
> that it can take a long time (hours) to slew the clock to the correct
> value. During this time, the host should not be used to synchronize
> clients.
That's how adjtime() works.
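To put a rough number on the "hours" in the quoted documentation, here is a minimal Python sketch of the arithmetic. The slew rate is an assumption: 500 PPM is borrowed from the frequency limit mentioned earlier in this thread and is not a statement about what any particular kernel's adjtime() actually does. The function name is hypothetical.

```python
# Assumed constant slew rate in parts per million. This value is an
# illustration only; the real rate depends on the kernel implementation.
ASSUMED_SLEW_PPM = 500.0


def slew_duration_s(offset_s, slew_ppm=ASSUMED_SLEW_PPM):
    """Seconds of real time needed to slew away offset_s at a constant rate."""
    return abs(offset_s) / (slew_ppm * 1e-6)


if __name__ == "__main__":
    # The 128 ms threshold, the 600 ms error from the report, and a large offset.
    for offset in (0.128, 0.6, 10.0):
        secs = slew_duration_s(offset)
        print(f"{offset:7.3f} s offset: {secs:9.0f} s ({secs / 3600:.2f} h)")
```

Under that assumption a 10 s offset needs 20000 s, about 5.6 hours, which is the order of magnitude the documentation warns about.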
>
> -b
> Force the time to be stepped using the settimeofday() system call,
> rather than slewed (default) using the adjtime() system call. This
> option should be used when called from a startup file at boot time.
>
> One would assume the time is always reset above 128 ms, unless the -B
> option is set, and always slewed below 128 ms, unless the -b option is
> set. This confirms the Linux kernel responds to a step request
> (settimeofday()) by slewing the clock anyway.
I don't believe that, but I believe that adjtime() simply continues
after a settimeofday(). Of course I've fixed that in my patches a long
time ago.
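For reference, the step/slew choice as Dave reads it from the ntpdate documentation can be written down as a small Python sketch. This models only the documented decision, not the ntpdate source; the function and parameter names are hypothetical, and treating -B and -b as mutually exclusive is an assumption of this sketch.

```python
STEP_THRESHOLD_S = 0.128  # the +-128 ms threshold from the quoted documentation


def choose_adjustment(offset_s, force_slew=False, force_step=False):
    """Return "step" (settimeofday) or "slew" (adjtime) for a measured offset.

    force_slew models the -B option and force_step models the -b option.
    """
    if force_slew and force_step:
        # Assumption of this sketch: the two options cannot be combined.
        raise ValueError("-B and -b are treated as mutually exclusive here")
    if force_slew:
        return "slew"
    if force_step:
        return "step"
    # Default: step above the threshold, slew at or below it.
    return "step" if abs(offset_s) > STEP_THRESHOLD_S else "slew"
```

With these semantics the 600 ms initial error from the report would be stepped by default, so any slewing seen afterwards would have to come from somewhere else, for example an earlier adjtime() that is still running.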
>
> The Linux behavior is broken to the max and directly responsible for the
> above scenario. In other words, should some transient kick the clock
> time and/or frequency in a significant way, ntpd will become unstable
> and degenerate to pinball mode. This will also cause ntpd to break in
> ntpdate mode. I expect this problem is the origin of many reports about
> Linux time stability. One more example where Linux refuses to conform to
> conventional semantics. Game over.
Honestly, Dave, your sample kernel code completely leaves out the topic
of adjtime() vs. settimeofday(). One has to guess with some sound
reasoning what's probably the best to do.
>
> Dave
Ulrich
P.S. BCC'd to linux-kernel, so maybe some new people will respond.
* Re: [ntp:hackers] Linux feature
@ 2001-10-16 18:27 Kristofer T. Karas
From: Kristofer T. Karas @ 2001-10-16 18:27 UTC (permalink / raw)
To: linux-kernel
> On 15 Oct 2001, at 21:16, David L. Mills wrote:
> > My Linux test box backroom.udel.edu has gone nuts. [...]
> > However, the step correction requested is only partially implemented
> > by the Linux kernel, so the frequency correction is in error. [...]
The slew rate in the kernel (sans PPSkit patch) is small relative to other
implementations; and the error in the clock circuitry of the typical PC is
worse than other implementations; and the kernel default is to talk to IDE
disk drives with interrupts disabled, unless you tell it your IDE drive is
not so crippled. All of these compound the timekeeping problem, perhaps to
the point where the slew rate simply cannot compensate. If this is the case,
then you could expect to see a time offset curve that approximates a sawtooth
wave, with step adjustments occurring on the sharp edge. I don't have enough
data to know if that's what's happening here.
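The sawtooth I have in mind can be reproduced with a toy simulation in Python. Everything here is an assumption made for illustration: the one-second time step, the drift and slew figures, and the rule that the clock is stepped to zero offset once a threshold is exceeded. It is not a model of ntpd's actual control loop.

```python
def simulate_offset(drift_ppm, max_slew_ppm, step_threshold_s, duration_s):
    """Simulate clock offset once per second.

    Returns (offsets, step_times): the offset after each second and the
    seconds at which a step correction was applied.
    """
    offset = 0.0
    offsets = []
    step_times = []
    for t in range(duration_s):
        # Error accumulated during this second by the drifting oscillator.
        offset += drift_ppm * 1e-6
        # Slew back toward zero, limited by the maximum slew rate.
        correction = min(abs(offset), max_slew_ppm * 1e-6)
        offset -= correction if offset > 0 else -correction
        # If the slew cannot keep up, the offset eventually gets stepped away.
        if abs(offset) > step_threshold_s:
            offset = 0.0
            step_times.append(t)
        offsets.append(offset)
    return offsets, step_times


if __name__ == "__main__":
    offsets, steps = simulate_offset(600.0, 500.0, 0.128, 4000)
    print("step corrections at seconds:", steps)
```

With a drift larger than the slew rate, the offset ramps up linearly and drops at each step, which is the sawtooth. With a drift smaller than the slew rate, the offset stays at zero and no steps occur.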
I would use the 'adjtimex' program when booting your test platform to ensure
the tick value is optimal. Also, issue the command "hdparm -u 1 /dev/hda" if
your test platform uses an IDE disk; warning: if you have an old vintage box,
this could cause harm, so read the man page for 'hdparm'.
On Tuesday 16 October 2001 02:48 am, Ulrich Windl wrote:
> I think in standard Linux a running
> adjtime() continues even after setting the time, and MOD_OFFSET and
> adjtime() both use the same variable, so don't expect the other to
> finish once you have used either one. Is that the problem?
Uh, unless I'm totally whacked, I should think so. Once a step adjustment
has been made, the very data from which a slew adjustment is calculated is no
longer relevant; continuing to slew will undo the good of the step, worsening
timekeeping but in the other direction (leading to oscillation if the slew is
not cancelled or adjusted).
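A toy model in Python shows the effect. This is a sketch under stated assumptions, not the Linux kernel code: the class, its fields, and the 500 microseconds per second slew limit are all hypothetical, chosen only to show what happens when a queued slew survives a step.

```python
class ToyClock:
    """Toy clock with an error term and a queued gradual adjustment."""

    def __init__(self, error_s):
        self.error_s = error_s    # clock time minus true time
        self.pending_s = 0.0      # adjustment still queued by a slew request

    def request_slew(self, delta_s):
        """Queue a gradual adjustment (stands in for adjtime())."""
        self.pending_s = delta_s

    def step(self, delta_s, cancel_pending):
        """Apply an immediate adjustment (stands in for settimeofday())."""
        self.error_s += delta_s
        if cancel_pending:
            self.pending_s = 0.0

    def tick(self, max_slew_s=0.0005):
        """Advance one second, applying at most max_slew_s of the queued slew."""
        chunk = max(-max_slew_s, min(max_slew_s, self.pending_s))
        self.error_s += chunk
        self.pending_s -= chunk


def run(cancel_pending, seconds=2000):
    """Slew for 10 s, step away the remaining error, then keep running."""
    clock = ToyClock(error_s=0.6)
    clock.request_slew(-0.6)                    # start slewing 600 ms away
    for _ in range(10):
        clock.tick()
    clock.step(-clock.error_s, cancel_pending)  # step the remaining error
    for _ in range(seconds):
        clock.tick()
    return clock.error_s


if __name__ == "__main__":
    print("slew cancelled by step:", run(cancel_pending=True))
    print("slew left running:     ", run(cancel_pending=False))
```

When the step cancels the queued slew, the error stays at zero. When it does not, the leftover slew drags the clock to roughly 595 ms of error on the opposite side, which the daemon then has to correct again.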
> > One more example where Linux refuses to conform to
> > conventional semantics. Game over.
Dave, "Game Over" is not a terribly helpful strategy if the goal is to help
Linux do the right thing. Better would be if you would look at the code in
question and provide some constructive critique. Ulrich and the people on
linux-kernel will cheerfully implement needed functionality if the father of
the mechanism blesses it so.
Kris