ntp: Adjustment of time_maxerror with 500ppm instead of 15ppm

public inbox for linux-kernel@vger.kernel.org

* ntp: Adjustment of time_maxerror with 500ppm instead of 15ppm
@ 2025-05-07  9:15 Keno Goertz
  0 siblings, 0 replies; 7+ messages in thread
From: Keno Goertz @ 2025-05-07  9:15 UTC (permalink / raw)
  To: linux-kernel

Hello,

I've been looking into the kernel's NTP code and found what I understand 
to be a deviation from NTP as standardized by RFC 5905.  The 
documentation of this part of the kernel is pretty sparse, so there may 
be some motivation behind this that I don't know of.  Perhaps someone 
with more knowledge can explain this.

The doc string of `struct ntp_data` states that `time_maxerror` holds 
the "NTP sync distance (NTP dispersion + delay / 2)".

ntpd indeed sets this value to what RFC 5905 calls the "root 
synchronization distance" LAMBDA.

In RFC 5905, this LAMBDA increases over time because the root dispersion 
increases at a rate of PHI, which is set to 15ppm.  Running

$ ntpq -c "rv 0 rootdisp"

a couple of times confirms that the root dispersion reported by ntpd 
increases at this rate.  Consequently, so does the root
synchronization distance LAMBDA.

However, the function `ntp.c:second_overflow()` instead increases the 
value of `time_maxerror` at the rate MAXFREQ, which is set to 500ppm.

This leads to standard library functions like ntp_gettime() reporting 
much bigger values of `maxerror` than ntpd is working with.  This can be 
confirmed by running

$ adjtimex -p

a couple of times.

MAXFREQ *can* be found in the reference implementation of RFC 5905 and 
is also set to 500ppm there, but it is used in a different context: 
MAXFREQ is an upper bound for the local clock's frequency offset, while 
PHI is an upper bound for the frequency drift of a clock synchronized 
with NTP.

At least this is my understanding.  Can someone explain this?

Best regards
Keno


* ntp: Adjustment of time_maxerror with 500ppm instead of 15ppm
@ 2025-05-07 11:28 Keno Goertz
  2025-05-08 19:45 ` John Stultz
  0 siblings, 1 reply; 7+ messages in thread
From: Keno Goertz @ 2025-05-07 11:28 UTC (permalink / raw)
  To: tglx, zippel, mingo, john.stulz; +Cc: linux-kernel


* Re: ntp: Adjustment of time_maxerror with 500ppm instead of 15ppm
  2025-05-07 11:28 ntp: Adjustment of time_maxerror with 500ppm instead of 15ppm Keno Goertz
@ 2025-05-08 19:45 ` John Stultz
  2025-05-09 19:40   ` Keno Goertz
  2025-05-12  8:57   ` Miroslav Lichvar
  0 siblings, 2 replies; 7+ messages in thread
From: John Stultz @ 2025-05-08 19:45 UTC (permalink / raw)
  To: Keno Goertz; +Cc: tglx, zippel, mingo, linux-kernel, Miroslav Lichvar

On Wed, May 7, 2025 at 6:56 AM Keno Goertz <contact@kenogo.org> wrote:
>
> I've been looking into the kernel's NTP code and found what I understand
> to be a deviation from NTP as standardized by RFC 5905.  The
> documentation of this part of the kernel is pretty sparse, so there may
> be some motivation behind this that I don't know of.  Perhaps someone
> with more knowledge can explain this.
>
> The doc string of `struct ntp_data` states that `time_maxerror` holds
> the "NTP sync distance (NTP dispersion + delay / 2)".
>
> ntpd indeed sets this value to what RFC 5905 calls the "root
> synchronization distance" LAMBDA.
>
> In RFC 5905, this LAMBDA increases over time because the root dispersion
> increases at a rate of PHI, which is set to 15ppm.  Running
>
> $ ntpq -c "rv 0 rootdisp"
>
> a couple of times confirms that the root dispersion reported by ntpd
> increases at this rate.  Consequently, so does the root
> synchronization distance LAMBDA.
>
> However, the function `ntp.c:second_overflow()` instead increases the
> value of `time_maxerror` at the rate MAXFREQ, which is set to 500ppm.
>
> This leads to standard library functions like ntp_gettime() reporting
> much bigger values of `maxerror` than ntpd is working with.  This can be
> confirmed by running
>
> $ adjtimex -p
>
> a couple of times.
>
> MAXFREQ *can* be found in the reference implementation of RFC 5905 and
> is also set to 500ppm there, but it is used in a different context:
> MAXFREQ is an upper bound for the local clock's frequency offset, while
> PHI is an upper bound for the frequency drift of a clock synchronized
> with NTP.
>
> At least this is my understanding.  Can someone explain this?

Hey! Thanks for reaching out with your findings!

Looking back through the commit history, we used to increment
time_maxerror by (time_tolerance >> SHIFT_USEC), but all the way back
in the git history (2.6.12, and seemingly back as far as 2.3?)
time_tolerance was always set to MAXFREQ.

So, as it predates my involvement, I can only guess this was due to a
misreading of the spec in an early implementation?

Have you tried a patch introducing PHI (likely setting it to 15000 as
MAXFREQ is represented as nsec/sec) and using it instead of MAXFREQ in
the calculation? Do you see any behavioral change in fixing it, or is
this just a reporting correctness issue?

Adding Miroslav, as he might have more insight into the potential
impact to existing applications of slowing time_maxerror's growth.

thanks
-john


* Re: ntp: Adjustment of time_maxerror with 500ppm instead of 15ppm
  2025-05-08 19:45 ` John Stultz
@ 2025-05-09 19:40   ` Keno Goertz
  2025-05-09 19:49     ` John Stultz
  2025-05-12  8:57   ` Miroslav Lichvar
  1 sibling, 1 reply; 7+ messages in thread
From: Keno Goertz @ 2025-05-09 19:40 UTC (permalink / raw)
  To: John Stultz; +Cc: tglx, zippel, mingo, linux-kernel, Miroslav Lichvar

Hey!

On 5/8/25 21:45, John Stultz wrote:
> Have you tried a patch introducing PHI (likely setting it to 15000 as
> MAXFREQ is represented as nsec/sec) and using it instead of MAXFREQ in
> the calculation? Do you see any behavioral change in fixing it, or is
> this just a reporting correctness issue?

I haven't tried a patch, but I couldn't find any place in the kernel 
itself that uses time_maxerror.  So I wouldn't expect behavioral changes 
within the kernel.  Of course, user space applications may depend on the 
values returned by the adjtimex system call.

In fact, I only noticed this behavior in the first place because I am 
writing a distributed time-stamping program and the maxerror reported by 
libc's ntp_gettime (which calls adjtimex on Linux) just felt way too large.

I'm curious to hear what Miroslav might know about other user space 
applications that take an interest in this value.

Best regards
Keno


* Re: ntp: Adjustment of time_maxerror with 500ppm instead of 15ppm
  2025-05-09 19:40   ` Keno Goertz
@ 2025-05-09 19:49     ` John Stultz
  0 siblings, 0 replies; 7+ messages in thread
From: John Stultz @ 2025-05-09 19:49 UTC (permalink / raw)
  To: Keno Goertz; +Cc: tglx, zippel, linux-kernel, Miroslav Lichvar, Ingo Molnar

On Fri, May 9, 2025 at 12:40 PM Keno Goertz <contact@kenogo.org> wrote:
> On 5/8/25 21:45, John Stultz wrote:
> > Have you tried a patch introducing PHI (likely setting it to 15000 as
> > MAXFREQ is represented as nsec/sec) and using it instead of MAXFREQ in
> > the calculation? Do you see any behavioral change in fixing it, or is
> > this just a reporting correctness issue?
>
> I haven't tried a patch, but I couldn't find any place in the kernel
> itself that uses time_maxerror.  So I wouldn't expect behavioral changes
> within the kernel.  Of course, user space applications may depend on the
> values returned by the adjtimex system call.

Agreed. Established userspace expectations (of the current behavior)
are what I'm worried about.

> In fact, I only noticed this behavior in the first place because I am
> writing a distributed time-stamping program and the maxerror reported by
> libc's ntp_gettime (which calls adjtimex on Linux) just felt way too large.
>
> I'm curious to hear what Miroslav might know about other user space
> applications that take an interest in this value.

It might be good to share a kernel patch so that folks can test such a
change as well.

thanks
-john


* Re: ntp: Adjustment of time_maxerror with 500ppm instead of 15ppm
  2025-05-08 19:45 ` John Stultz
  2025-05-09 19:40   ` Keno Goertz
@ 2025-05-12  8:57   ` Miroslav Lichvar
  2025-05-14 10:01     ` Keno Goertz
  1 sibling, 1 reply; 7+ messages in thread
From: Miroslav Lichvar @ 2025-05-12  8:57 UTC (permalink / raw)
  To: John Stultz; +Cc: Keno Goertz, tglx, zippel, mingo, linux-kernel

On Thu, May 08, 2025 at 12:45:13PM -0700, John Stultz wrote:
> Looking back through the commit history, we used to increment
> time_maxerror by (time_tolerance >> SHIFT_USEC), but all the way back
> in the git history (2.6.12, and seemingly back as far as 2.3?)
> time_tolerance was always set to MAXFREQ.

This 500 ppm increment goes all the way back to the original nanokernel
implementation by David Mills, on which, IIRC, the timekeeping code of
Linux and other systems was based:
https://www.eecis.udel.edu/~mills/ntp/html/kern.html

I think the idea to use MAXFREQ (reported as tolerance in timex) was
to cover the case when the clock is not synchronized at all with the
frequency offset set to any value in the +/- 500 ppm range. The Linux
adjtimex also allows setting the tick length, which gives it a much
wider range of +/-10% adjustment, so that is not fully covered.

Changing the hardcoded rate to 15 ppm to match RFC5905 doesn't seem
like a good idea to me. The kernel doesn't know how well the clock is
synchronized and I'm sure in some cases it would be too small.

The best solution would be to add a new mode to adjtimex to make it
configurable, e.g. named ADJ_MAXERRORRATE and the actual value
provided in the timex tolerance field. For compatibility with existing
NTP/PTP clients the rate could be reset to the default 500 ppm on
every ADJ_MAXERROR setting. To get a reduced rate, updated applications
could set both ADJ_MAXERROR and ADJ_MAXERRORRATE at the same time.

-- 
Miroslav Lichvar


* Re: ntp: Adjustment of time_maxerror with 500ppm instead of 15ppm
  2025-05-12  8:57   ` Miroslav Lichvar
@ 2025-05-14 10:01     ` Keno Goertz
  0 siblings, 0 replies; 7+ messages in thread
From: Keno Goertz @ 2025-05-14 10:01 UTC (permalink / raw)
  To: Miroslav Lichvar, John Stultz; +Cc: tglx, zippel, mingo, linux-kernel

Hey,

On 5/12/25 10:57, Miroslav Lichvar wrote:
> This 500 ppm increment goes all the way back to the original nanokernel
> implementation by David Mills, on which, IIRC, the timekeeping code of
> Linux and other systems was based:
> https://www.eecis.udel.edu/~mills/ntp/html/kern.html
> 
> I think the idea to use MAXFREQ (reported as tolerance in timex) was
> to cover the case when the clock is not synchronized at all with the
> frequency offset set to any value in the +/- 500 ppm range. The Linux
> adjtimex also allows setting the tick length, which gives it a much
> wider range of +/-10% adjustment, so that is not fully covered.
> 
> Changing the hardcoded rate to 15 ppm to match RFC5905 doesn't seem
> like a good idea to me. The kernel doesn't know how well the clock is
> synchronized and I'm sure in some cases it would be too small.

Thank you for these insights!

The site you linked references this RFC, which describes the kernel 
model for timekeeping as used by the Linux kernel.

https://www.rfc-editor.org/rfc/rfc1589.html

Just skimming this document really helped my understanding of what's 
going on.  It also includes a more accurate description of time_maxerror:

> This variable establishes the maximum error of the indicated
> time relative to the primary synchronization source in
> microseconds. For NTP, the value is initialized by a
> ntp_adjtime() call to the synchronization distance, which is
> equal to the root dispersion plus one-half the root delay. It
> is increased by a small amount (time_tolerance) each second to
> reflect the clock frequency tolerance. This variable is
> computed by the synchronization daemon and the kernel, but is
> otherwise not used by the kernel.

In RFC 1589, time_tolerance is set to MAXFREQ by default and can be 
changed by the kernel.  The Linux kernel instead hard-codes the 
per-second increment of time_maxerror to MAXFREQ (500 ppm).

A quick fix would be to change the misleading docstring of time_maxerror:

> Maximum error in microseconds holding the NTP sync distance
> (NTP dispersion + delay / 2)

I think something like this is clearer:

Maximum error in microseconds.  The NTP daemon sets this to the root 
synchronization distance (root dispersion + delay / 2).  It is then 
incremented by MAXFREQ each second to reflect the clock frequency tolerance.

Best regards
Keno

