public inbox for linux-kernel@vger.kernel.org
From: Thomas Gleixner <tglx@linutronix.de>
To: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>, Pavel Machek <pavel@ucw.cz>,
	Mikael Pettersson <mikpe@it.uu.se>,
	linux-kernel@vger.kernel.org, Ingo Molnar <mingo@elte.hu>
Subject: Re: [BUG] APM resume breakage from 2.6.18-rc1 clocksource changes
Date: Thu, 13 Jul 2006 22:27:13 +0200
Message-ID: <1152822433.24345.19.camel@localhost.localdomain>
In-Reply-To: <1152664944.760.70.camel@cog.beaverton.ibm.com>

John,

On Tue, 2006-07-11 at 17:42 -0700, john stultz wrote: 
> > > 
> > > Time keeping code reads from a given source and does the necessary
> > > adjustments when NTP is active. There is no relation to an interrupt
> > > event at all. At the very end it boils down to a linear equation which
> > > is recalculated at the synchronization points.
> > > 
> > > The timer interrupt itself is not a synchronization point.
> > 
> > If the timer interrupt is not a synchronization point, what is?
> 
> Synchronization point seems like a bad term in my mind. I prefer to
> think of it as an accumulation point (to avoid clocksource overflows) as
> well as a calculation point where we can make careful adjustments to the
> clocksource frequency. Please note that these two actions are not
> logically linked and could be done separately (although there really
> isn't a need to do so).

I used the term synchronization point for the points on the timeline
where information from an external time reference is available, e.g.
NTP data or a PPS interrupt.

At those points we calculate the deviation from the external time
reference and refine the conversion and adjustment factors. In the
simplest form this boils down to a linear equation where we interpolate
between the points which are defined by the external time reference.
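
To make that concrete, here is a minimal sketch of the interpolation in
C. It is only an illustration: struct sync_point and interpolate_ns()
are made-up names, not kernel code, and the intermediate product is
assumed to fit in 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sample: a raw counter value paired with the external
 * reference time observed at the same instant. */
struct sync_point {
	uint64_t cycles;	/* raw clocksource counter */
	uint64_t ref_ns;	/* external reference time in ns */
};

/*
 * Linear interpolation between two synchronization points a and b:
 *   t(now) = a.ref_ns + (now - a.cycles) * (b.ref_ns - a.ref_ns)
 *                                        / (b.cycles - a.cycles)
 * Assumes a.cycles <= now and that the product fits in 64 bits.
 */
static uint64_t interpolate_ns(struct sync_point a, struct sync_point b,
			       uint64_t now)
{
	uint64_t dc = b.cycles - a.cycles;
	uint64_t dn = b.ref_ns - a.ref_ns;

	assert(b.cycles > a.cycles && b.ref_ns >= a.ref_ns && now >= a.cycles);
	return a.ref_ns + (now - a.cycles) * dn / dc;
}
```

With the points (1000 cycles, 0 ns) and (2000 cycles, 1000500 ns), a
read at 1500 cycles yields 500250 ns.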

> > The timekeeping doesn't care much where synchronization points come from,
> > but it can do a much better job if it knows when they come.
> > Without knowing this, the timekeeping code has to do extra work and has to
> > be either optimistic or assume the worst case. Neither is good for precise
> > timekeeping: the first causes extra jitter, the latter very slow
> > adjustments.
> 
> This I agree with. But first let's bound the issue to make it more clear
> for everyone: The conflict lies only with the high-precision clocksource
> adjustment code and not so much with the rest of the timekeeping code.
> 
> The issue being that the high-precision clocksource adjustment code is
> needed because when NTP makes an adjustment, that very precise
> adjustment might be finer than the smallest multiplier unit adjustment
> that could be made to the clocksource (1/(2^clock->shift)).
> 
> To compensate for that, at interrupt time we accumulate the
> high-precision error between what NTP told us to use and what we're able
> to use and store it in the error value. Then we apply extra short-term
> correction when the error grows large enough to keep our long term
> accuracy finer than the (1/2^shift) clocksource resolution.
> 
> Without this high-precision adjustment, you would have to rely on NTPd
> to detect and adjust for this granularity difference, which would cause
> a slower, but larger jitter.
> 
> Now the problem is, that if interrupts are delayed, this extra short
> term correction might be applied for too long, causing the overshoot
> issue we've seen (which I believe is the "extra jitter" issue mentioned
> above).
> 
> However, if we greatly dampen our adjustments, so we are less likely to
> overshoot, this means it will take longer for us to converge (i.e. the
> "slow adjustment" issue above).
> 
> My only problem here is this: I don't think the slow adjustment issue is
> as severe as claimed. NTPd itself limits its adjustment speed to 500ppm
> and the frequency of its adjustment changes are in the minutes range. So
> I'm not sure I see why damping the error correction so we converge a bit
> more slowly over a period of a second or two is such an issue. 
> 
> I think this would give a bit of independence between the clocksource
> adjustment code and the interrupt frequency (and likely improve
> robustness as well).

John, that's exactly the central point. In an environment where we have
non-periodic interrupts (high resolution timers, dynamic ticks), this
adjustment mechanism, which relies on a precise periodic event to do
the accumulation, does not work any more. I do not like the idea of
modifying it so that the timekeeping code does this fine-grained
adjustment on the non-periodic timer events. That would be a nightmare of
math and would decrease robustness a lot.
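
For reference, the accumulation scheme under discussion can be sketched
roughly like this. It is a simplified illustration, not the code in
2.6.18-rc1: SHIFT, FINE_BITS, struct clock_state and accumulate() are
invented names, and the error handling is reduced to a single carry of
one multiplier unit.

```c
#include <assert.h>
#include <stdint.h>

#define SHIFT		8	/* hypothetical clock->shift, kept small */
#define FINE_BITS	8	/* hypothetical extra fraction bits of the NTP rate */

struct clock_state {
	uint64_t want_fine;	/* requested ns/cycle << (SHIFT + FINE_BITS) */
	uint64_t error;		/* accumulated shortfall, same scaling, times cycles */
	uint64_t ns_shifted;	/* accumulated time in ns << SHIFT */
};

/* Accumulate one interval of `cycles` counter ticks. */
static void accumulate(struct clock_state *cs, uint64_t cycles)
{
	/* Coarse multiplier: the best rate representable in 1/2^SHIFT steps. */
	uint64_t mult = cs->want_fine >> FINE_BITS;
	/* Per-cycle shortfall of the coarse multiplier vs. the requested rate. */
	uint64_t resid = cs->want_fine - (mult << FINE_BITS);

	assert(cycles < (1ull << 32));	/* keep the products within 64 bits */

	cs->error += resid * cycles;
	/* One extra multiplier unit over this interval is worth cycles << FINE_BITS. */
	if (cs->error >= (cycles << FINE_BITS)) {
		mult += 1;
		cs->error -= cycles << FINE_BITS;
	}
	cs->ns_shifted += mult * cycles;
}
```

With a requested rate of 10.25 multiplier units and four intervals of
100 cycles each, the first three intervals use mult = 10 and the fourth
uses mult = 11, so ns_shifted ends at 4100 and the error is back to
zero. In this sketch the correction is only ever applied when
accumulate() runs, which is where the dependency on the timing of the
event comes from.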

The whole concept of doing the fine-grained adjustment on a _periodic_
event, in order to compensate for the inaccuracy of the scaled math
factors, is flawed by design. The latency of interrupts in the kernel and
the fact that the periodic interrupt might be driven by a different
hardware clock, which has a different drift behaviour than the clock
which drives the time source, are reason enough to think about a
solution which makes this interdependency go away completely. In a high
resolution timer / dyntick system, and also in virtualized environments,
we need to get this dependency removed anyway.

The adjustment code is simply interpolation between points and boils
down to linear equations. Because the conversion factor is not accurate
enough, we need some mechanism to compensate for this. I accept
that the current design has its charm, but I'm quite sure that we can do
an equally precise calculation without the interaction with the timer
interrupt code.

I know you prefer shopping over math, but it is a solvable problem.

	tglx








