linuxppc-dev.lists.ozlabs.org archive mirror
* Re: Time precision, adjtime(x) vs. gettimeofday
@ 2003-10-10  5:12 Bill Fink
  2003-10-10  7:33 ` Gabriel Paubert
  2003-10-10  7:53 ` Ethan Benson
  0 siblings, 2 replies; 9+ messages in thread
From: Bill Fink @ 2003-10-10  5:12 UTC (permalink / raw)
  To: LinuxPPC Developers; +Cc: Bill Fink


On Wed, 08 Oct 2003, Benjamin Herrenschmidt wrote:

> > I repeat the question: what are the values of drift on the machines
> > that encounter the problem ? Is this drift stable or unstable?
>
> So far, there is no problem. The problem that was happening
> was a via_calibrate_decr() bug with HZ != 100, but when
> investigating, I figured out that we had a potential problem
> there, that's all and that's why I want people like you who
> know those problems well to state if it's worth bothering ;)
>
> > > On all cases, those will drift some way from what the NTP server
> > > will give, either a lot or not, it will. So we may end up adjusting
> > > our kernel rate and thus opening a window for the problem.
> >
> > The worst variations of drift I've seen are a few ppm for a given
> > machine, barring the occasional boot-time calibration problems that I
> > have encountered.
>
> OK.

This discussion prompted me to finally ask about another clock related
problem I see on the 867 MHz G4 systems at work.  The clocks on these
systems continuously run 0.2% slow (about 3 minutes per day).  Apparently
this is more than ntp can adjust for (using scaling), as I get many of
these error messages in the log:

Oct 10 00:11:29 clifford ntpd[425]: time reset 2.641342 s
Oct 10 00:11:29 clifford ntpd[425]: synchronisation lost
Oct 10 00:32:07 clifford ntpd[425]: time reset 2.671741 s
Oct 10 00:32:07 clifford ntpd[425]: synchronisation lost
Oct 10 00:52:46 clifford ntpd[425]: time reset 2.671729 s
Oct 10 00:52:46 clifford ntpd[425]: synchronisation lost

This causes problems if I take these systems off the network for a few
hours and forget to reset them to the correct time when I reconnect
them: we use Kerberos for security, and the time difference between
the system and the Kerberos KDC will prevent remote logins.

These systems are using a 2.4.20-ben1 kernel.
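
As a cross-check on the figures above, the drift can be estimated two ways:
from the stated 0.2% rate, and from the size and spacing of the ntpd steps in
the log. The sketch below is illustrative only; the function names are made
up here, and the step-based estimate assumes the whole offset accumulated
since the previous step, ignoring whatever slewing ntpd applied in between.

```c
/* Seconds gained or lost per day for a fractional rate error
 * (0.002 corresponds to 0.2%). */
double seconds_off_per_day(double rate_error)
{
    return rate_error * 86400.0;   /* 86400 seconds in a day */
}

/* Apparent drift in parts per million, estimated from one ntpd step.
 * Assumption: the step equals the offset accumulated over the interval
 * since the previous step, with no slewing in between. */
double drift_ppm_from_step(double step_seconds, double interval_seconds)
{
    if (interval_seconds <= 0.0)
        return 0.0;                /* no meaningful estimate */
    return step_seconds / interval_seconds * 1e6;
}
```

A 0.2% error works out to 172.8 s per day, a little under 3 minutes. With the
logged values (a 2.671741 s step 1238 s after the previous one), the
step-based estimate comes out near 2160 ppm, or about 0.22%, in line with the
0.2% observation.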

						-Bill

** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/


* Re: Time precision, adjtime(x) vs. gettimeofday
  2003-10-10  5:12 Time precision, adjtime(x) vs. gettimeofday Bill Fink
@ 2003-10-10  7:33 ` Gabriel Paubert
  2003-10-10 16:39   ` Bill Fink
  2003-10-10  7:53 ` Ethan Benson
  1 sibling, 1 reply; 9+ messages in thread
From: Gabriel Paubert @ 2003-10-10  7:33 UTC (permalink / raw)
  To: Bill Fink; +Cc: LinuxPPC Developers


On Fri, Oct 10, 2003 at 01:12:54AM -0400, Bill Fink wrote:
>
> On Wed, 08 Oct 2003, Benjamin Herrenschmidt wrote:
>
> > > I repeat the question: what are the values of drift on the machines
> > > that encounter the problem ? Is this drift stable or unstable?
> >
> > So far, there is no problem. The problem that was happening
> > was a via_calibrate_decr() bug with HZ != 100, but when
> > investigating, I figured out that we had a potential problem
> > there, that's all and that's why I want people like you who
> > know those problems well to state if it's worth bothering ;)
> >
> > > > On all cases, those will drift some way from what the NTP server
> > > > will give, either a lot or not, it will. So we may end up adjusting
> > > > our kernel rate and thus opening a window for the problem.
> > >
> > > The worst variations of drift I've seen are a few ppm for a given
> > > machine, barring the occasional boot-time calibration problems that I
> > > have encountered.
> >
> > OK.
>
> This discussion prompted me to finally ask about another clock related
> problem I see on the 867 MHz G4 systems at work.  The clocks on these
> systems continuously run 0.2% slow (about 3 minutes per day).  Apparently
> this is more than ntp can adjust for (using scaling), as I get many of
> these error messages in the log:

Indeed, the limit of NTP is about 500ppm (0.05%) AFAIR. Anything higher
and you go into periodic time steps like the ones you report.
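
To put the two figures in the same unit: 1% is 10,000 ppm, so a 0.2% error is
2000 ppm, four times the roughly 500 ppm limit quoted above. A minimal sketch
of that comparison follows, with hypothetical helper names and the 500 ppm
value taken from this message ("AFAIR") rather than from any ntpd
documentation:

```c
#include <stdbool.h>

/* Slew limit as quoted in the thread; treat it as an assumption. */
#define NTP_MAX_SLEW_PPM 500.0

/* 1 percent = 10,000 parts per million. */
double percent_to_ppm(double percent)
{
    return percent * 1e4;
}

/* True when the drift magnitude is beyond what slewing alone can absorb,
 * so the daemon would have to fall back to stepping the clock. */
bool exceeds_slew_limit(double drift_ppm)
{
    double magnitude = drift_ppm < 0.0 ? -drift_ppm : drift_ppm;
    return magnitude > NTP_MAX_SLEW_PPM;
}
```

The sign is ignored on purpose: a clock that runs slow by 2000 ppm is as far
out of range as one that runs fast by the same amount.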

2.4.20 is recent enough and should not have this kind of problem.

What is the initial decrementer frequency from boot messages log?

What is the timebase frequency from OF?
(od -td4 /proc/device-tree/cpus/PowerPC,G4/timebase-frequency)

	Gabriel


* Re: Time precision, adjtime(x) vs. gettimeofday
  2003-10-10  5:12 Time precision, adjtime(x) vs. gettimeofday Bill Fink
  2003-10-10  7:33 ` Gabriel Paubert
@ 2003-10-10  7:53 ` Ethan Benson
  1 sibling, 0 replies; 9+ messages in thread
From: Ethan Benson @ 2003-10-10  7:53 UTC (permalink / raw)
  To: LinuxPPC Developers


On Fri, Oct 10, 2003 at 01:12:54AM -0400, Bill Fink wrote:
>
> This discussion prompted me to finally ask about another clock related
> problem I see on the 867 MHz G4 systems at work. The clocks on
> these systems continuously run 0.2% slow (about 3 minutes per day).
> Apparently this is more than ntp can adjust for (using scaling), as I
> get many of these error messages in the log:

is it a quicksilver G4?  i maintain one of those and its time goes off
much faster than that (3 minutes within a couple hours).

the fix is rather simple:

--- linux.old/arch/ppc/platforms/pmac_time.c.orig	Sat Nov 30 02:33:49 2002
+++ linux/arch/ppc/platforms/pmac_time.c	Sat Nov 30 02:33:22 2002
@@ -262,7 +262,9 @@
 	 * calibration. That's better since the VIA itself seems
 	 * to be slightly off. --BenH
 	 */
+#if 0
 	if (!machine_is_compatible("MacRISC2"))
+#endif
 		if (via_calibrate_decr())
 			return;

in the case of the quicksilver VIA is FAR better than whatever it uses
instead.

--
Ethan Benson
http://www.alaska.net/~erbenson/


* Re: Time precision, adjtime(x) vs. gettimeofday
  2003-10-10  7:33 ` Gabriel Paubert
@ 2003-10-10 16:39   ` Bill Fink
  0 siblings, 0 replies; 9+ messages in thread
From: Bill Fink @ 2003-10-10 16:39 UTC (permalink / raw)
  To: Gabriel Paubert; +Cc: linuxppc-dev, Bill Fink


On Fri, 10 Oct 2003, Gabriel Paubert wrote:

> On Fri, Oct 10, 2003 at 01:12:54AM -0400, Bill Fink wrote:
> >
> > This discussion prompted me to finally ask about another clock related
> > problem I see on the 867 MHz G4 systems at work.  The clocks on these
> > systems continuously run 0.2% slow (about 3 minutes per day).  Apparently
> > this is more than ntp can adjust for (using scaling), as I get many of
> > these error messages in the log:
>
> Indeed, the limit of NTP is about 500ppm (0.05%) AFAIR. Anything higher
> and you go into periodic time steps like the ones you report.
>
> 2.4.20 is recent enough and should not have this kind of problem.
>
> What is the initial decrementer frequency from boot messages log?

clifford% dmesg | grep -i decr
time_init: decrementer frequency = 33.290001 MHz

> What is the timebase frequency from OF?
> (od -td4 /proc/device-tree/cpus/PowerPC,G4/timebase-frequency)

clifford% od -td4 /proc/device-tree/cpus/PowerPC,G4/timebase-frequency
0000000    33290001
0000004
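
Both commands report the same 33290001 Hz, which suggests the kernel took its
decrementer frequency straight from the OF property. For reference, here is a
hedged sketch of reading such a property programmatically. It assumes the
property is a single 32-bit big-endian cell (the usual device-tree encoding);
`od -td4` shows the right value here only because the host is itself
big-endian. The helper names are illustrative.

```c
#include <stdint.h>
#include <stdio.h>

/* Decode one 32-bit big-endian cell regardless of host byte order. */
uint32_t decode_be32(const unsigned char b[4])
{
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

/* Read a one-cell property such as
 * /proc/device-tree/cpus/PowerPC,G4/timebase-frequency.
 * Returns 0 on success, -1 if the file is missing or too short. */
int read_be32_property(const char *path, uint32_t *out)
{
    unsigned char raw[4];
    size_t n;
    FILE *f = fopen(path, "rb");

    if (f == NULL)
        return -1;
    n = fread(raw, 1, sizeof raw, f);
    fclose(f);
    if (n != sizeof raw)
        return -1;
    *out = decode_be32(raw);
    return 0;
}
```

Decoding explicitly keeps the result correct if the same code is ever run on
a little-endian host against a copied property file.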

						-Bill


* Re: Time precision, adjtime(x) vs. gettimeofday
@ 2003-10-11  4:45 Bill Fink
  2003-10-11  5:27 ` Ethan Benson
  0 siblings, 1 reply; 9+ messages in thread
From: Bill Fink @ 2003-10-11  4:45 UTC (permalink / raw)
  To: LinuxPPC Developers; +Cc: Bill Fink


On Thu, 9 Oct 2003, Ethan Benson wrote:

> On Fri, Oct 10, 2003 at 01:12:54AM -0400, Bill Fink wrote:
> >
> > This discussion prompted me to finally ask about another clock related
> > problem I see on the 867 MHz G4 systems at work. The clocks on
> > these systems continuously run 0.2% slow (about 3 minutes per day).
> > Apparently this is more than ntp can adjust for (using scaling), as I
> > get many of these error messages in the log:
>
> is it a quicksilver G4? i maintain one of those and its time goes off
> much faster than that (3 minutes within a couple hours).

Yes I believe it's a quicksilver G4.

clifford% cat /proc/cpuinfo
cpu             : 7450, altivec supported
clock           : 866MHz
revision        : 2.1 (pvr 8000 0201)
bogomips        : 865.07
machine         : PowerMac3,5
motherboard     : PowerMac3,5 MacRISC2 MacRISC Power Macintosh
detected as     : 69 (PowerMac G4 Silver)
pmac flags      : 00000000
L2 cache        : 256K unified
memory          : 640MB
pmac-generation : NewWorld

> the fix is rather simple:
>
> --- linux.old/arch/ppc/platforms/pmac_time.c.orig	Sat Nov 30 02:33:49 2002
> +++ linux/arch/ppc/platforms/pmac_time.c	Sat Nov 30 02:33:22 2002
> @@ -262,7 +262,9 @@
>  	 * calibration. That's better since the VIA itself seems
>  	 * to be slightly off. --BenH
>  	 */
> +#if 0
>  	if (!machine_is_compatible("MacRISC2"))
> +#endif
>  		if (via_calibrate_decr())
>  			return;

Thanks for the suggested fix.  I'll give it a try when I get a chance.

> in the case of the quicksilver VIA is FAR better than whatever it uses
> instead.

Assuming the fix works, is there a simple way to test for the
quicksilver G4 model rather than doing the "#if 0"?  I like
to run a common kernel across a variety of different processor
models.

						-Bill


* Re: Time precision, adjtime(x) vs. gettimeofday
  2003-10-11  4:45 Bill Fink
@ 2003-10-11  5:27 ` Ethan Benson
  2003-10-11 14:58   ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 9+ messages in thread
From: Ethan Benson @ 2003-10-11  5:27 UTC (permalink / raw)
  To: LinuxPPC Developers


On Sat, Oct 11, 2003 at 12:45:41AM -0400, Bill Fink wrote:
>
> On Thu, 9 Oct 2003, Ethan Benson wrote:
>
> > On Fri, Oct 10, 2003 at 01:12:54AM -0400, Bill Fink wrote:
> > >
> > > This discussion prompted me to finally ask about another clock
> > > related problem I see on the 867 MHz G4 systems at work. The
> > > clocks on these systems continuously run 0.2% slow (about 3
> > > minutes per day). Apparently this is more than ntp can adjust for
> > > (using scaling), as I get many of these error messages in the log:
> >
> > is it a quicksilver G4? i maintain one of those and its time goes
> > off much faster than that (3 minutes within a couple hours).
>
> Yes I believe it's a quicksilver G4.
>
> clifford% cat /proc/cpuinfo
> cpu             : 7450, altivec supported
> clock           : 866MHz
> revision        : 2.1 (pvr 8000 0201)
> bogomips        : 865.07
> machine         : PowerMac3,5
> motherboard     : PowerMac3,5 MacRISC2 MacRISC Power Macintosh
> detected as     : 69 (PowerMac G4 Silver)
> pmac flags      : 00000000
> L2 cache        : 256K unified
> memory          : 640MB
> pmac-generation : NewWorld

that's a quicksilver all right.

> > the fix is rather simple:
> >
> > --- linux.old/arch/ppc/platforms/pmac_time.c.orig	Sat Nov 30 02:33:49 2002
> > +++ linux/arch/ppc/platforms/pmac_time.c	Sat Nov 30 02:33:22 2002
> > @@ -262,7 +262,9 @@
> >  	 * calibration. That's better since the VIA itself seems
> >  	 * to be slightly off. --BenH
> >  	 */
> > +#if 0
> >  	if (!machine_is_compatible("MacRISC2"))
> > +#endif
> >  		if (via_calibrate_decr())
> >  			return;
>
> Thanks for the suggested fix. I'll give it a try when I get a chance.
>
> > in the case of the quicksilver VIA is FAR better than whatever it
> > uses instead.
>
> Assuming the fix works, is there a simple way to test for the
> quicksilver G4 model rather than doing the "#if 0"? I like to
> run a common kernel across a variety of different processor models.

i don't know, i've discussed it with benh, but he won't accept that VIA
is a better choice here.

--
Ethan Benson
http://www.alaska.net/~erbenson/


* Re: Time precision, adjtime(x) vs. gettimeofday
  2003-10-11  5:27 ` Ethan Benson
@ 2003-10-11 14:58   ` Benjamin Herrenschmidt
  2003-10-14  7:07     ` Gabriel Paubert
  0 siblings, 1 reply; 9+ messages in thread
From: Benjamin Herrenschmidt @ 2003-10-11 14:58 UTC (permalink / raw)
  To: Ethan Benson; +Cc: LinuxPPC Developers


> > Assuming the fix works, is there a simple way to test for the
> > quicksilver G4 model rather than doing the "#if 0"? I like to
> > run a common kernel across a variety of different processor models.
>
> i don't know, i've discussed it with benh, but he won't accept that VIA
> is a better choice here.

Hrm... I do accept that it's a better choice, but I'm sure using
the KeyLargo timer is even better :) Anyway, I'll switch to VIA by
default on "PowerMac3,5" type machines for now.
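
One way to express that per-model choice without the "#if 0" is sketched
below as stand-alone C. This is not the change that went into any kernel
tree; compat_has() is a hypothetical user-space stand-in for the kernel's
machine_is_compatible(), and the compatible strings are the ones shown in the
/proc/cpuinfo output earlier in the thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for machine_is_compatible(): true if `name`
 * appears in the machine's list of compatible strings. */
bool compat_has(const char *const *compat, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(compat[i], name) == 0)
            return true;
    return false;
}

/* Decide whether VIA-based decrementer calibration should be tried.
 * Existing rule from the patch context: skip VIA on MacRISC2 machines.
 * Proposed exception from this thread: use VIA on "PowerMac3,5" anyway. */
bool should_try_via_calibration(const char *const *compat, size_t n)
{
    if (compat_has(compat, n, "PowerMac3,5"))
        return true;
    return !compat_has(compat, n, "MacRISC2");
}
```

Checking the specific model first keeps the existing MacRISC2 behaviour for
every other machine, so a single kernel image can still be shared across
models.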

Ben.



* Re: Time precision, adjtime(x) vs. gettimeofday
  2003-10-11 14:58   ` Benjamin Herrenschmidt
@ 2003-10-14  7:07     ` Gabriel Paubert
  2003-10-14 11:16       ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 9+ messages in thread
From: Gabriel Paubert @ 2003-10-14  7:07 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Ethan Benson, LinuxPPC Developers


On Sat, Oct 11, 2003 at 04:58:14PM +0200, Benjamin Herrenschmidt wrote:
>
> > > Assuming the fix works, is there a simple way to test for the
> > > quicksilver G4 model rather than doing the "#if 0"? I like to
> > > run a common kernel across a variety of different processor models.
> >
> > i don't know, i've discussed it with benh, but he won't accept that VIA
> > is a better choice here.
>
> Hrm... I do accept that it's a better choice, but I'm sure using
> the KeyLargo timer is even better :) Anyway, I'll switch to VIA by
> default on "PowerMac3,5" type machines for now.

How does the Keylargo timer work? Any pointer?

Also for these machines it seems that OF also returns wrong values.
Maybe there is an OF update somewhere.

Does anybody know what MacOS X (most MacOS X machines probably use
ntp) does?

Sorry, more questions than answers. It superficially looks
like a HW screw-up in one specific series of machines.

	Gabriel


* Re: Time precision, adjtime(x) vs. gettimeofday
  2003-10-14  7:07     ` Gabriel Paubert
@ 2003-10-14 11:16       ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 9+ messages in thread
From: Benjamin Herrenschmidt @ 2003-10-14 11:16 UTC (permalink / raw)
  To: Gabriel Paubert; +Cc: Ethan Benson, LinuxPPC Developers


> How does the Keylargo timer work? Any pointer?

Darwin code... But it's basically a 64-bit counter at
KL base +

#define   kKeyLargoCounterLoOffset      0x15038
#define   kKeyLargoCounterHiOffset      0x1503C

MacOS X calls an asm function "TimeSystemBusKeyLargo" which measures
the number of KeyLargo "ticks" for 1,048,575 PowerPC decrementer/tb
units.

(copied below)

And then uses that "tick" value this way:

        ticks = TimeSystemBusKeyLargo (keyLargoBaseAddress);
        if (intLock) {
                IOSimpleLockUnlockEnableInterrupt(intLock, is); // As you were
                IOSimpleLockFree (intLock);
        }

        systemBusHz = 4194300;
        systemBusHz *= 18432000;
        systemBusHz /= ticks;
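
Reading the constants: 4194300 is 4 x 1,048,575, and 18432000 looks like the
KeyLargo timer rate in Hz (18.432 MHz), so the expression scales the measured
tick count to a bus frequency on the assumption that the decrementer runs at
one quarter of the bus clock. Both interpretations are inferred from the
numbers, not stated in the Darwin snippet. Note that the product
4194300 * 18432000 does not fit in 32 bits, so a C version needs a 64-bit
intermediate. A sketch with hypothetical names:

```c
#include <stdint.h>

#define DEC_TICKS_MEASURED 1048575ULL   /* decrementer ticks in the window */
#define KL_TIMER_HZ        18432000ULL  /* assumed KeyLargo timer rate */

/* Bus frequency from the KeyLargo tick count, same arithmetic as the
 * Darwin snippet: 4194300 * 18432000 / ticks, done in 64 bits. */
uint64_t system_bus_hz(uint32_t kl_ticks)
{
    if (kl_ticks == 0)
        return 0;                        /* avoid division by zero */
    return (4 * DEC_TICKS_MEASURED * KL_TIMER_HZ) / kl_ticks;
}

/* Decrementer/timebase frequency implied by the same measurement,
 * i.e. the bus figure without the assumed factor of 4. */
uint64_t decrementer_hz(uint32_t kl_ticks)
{
    if (kl_ticks == 0)
        return 0;
    return (DEC_TICKS_MEASURED * KL_TIMER_HZ) / kl_ticks;
}
```

The second function is the one that would matter for timekeeping: it yields a
decrementer frequency measured against the KeyLargo timer instead of the value
reported by OF.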


;
; TimeSystemBusKeyLargo(inKeyLargoBaseAddress)
;
; TimeSystemBusKeyLargo - Times how long it takes the PowerPC decrementer to count down
; 1,048,575 ticks.
;
; returns, in r3, the number of KeyLargo timer ticks per 1,048,575 PowerPC decrementer ticks.
;
; trashes r3 - r10
;
; NOTE - interrupts should be disabled when calling this code
;



ENTRY(TimeSystemBusKeyLargo, TAG_NO_FRAME_USED)

                        lis             r4, 0x000F
                        ori             r4, r4, 0xFFFF          ; Load decrementer tick count (1,048,575)
                        lis             r6, kKeyLargoCounterLoOffset >> 16
                        ori             r6, r6, kKeyLargoCounterLoOffset & 0xFFFF ; Counter lo offset
                        lis             r7, kKeyLargoCounterHiOffset >> 16
                        ori             r7, r7, kKeyLargoCounterHiOffset & 0xFFFF ; Counter hi offset
                        lwbrx   r8, r6, r3                      ; Read low 32-bits of counter
                        lwbrx   r9, r7, r3                      ; Read hi 32-bits of counter

                        ; Set up decrementer and wait for it to tick down

                        mtdec   r4                                      ; Set decrementer to 1,048,575
                        isync

NewDecrementerLoop:
                        mfdec   r5                                      ; Read current decrementer value
                        cmpwi   r5, 0                           ; Check if decrementer is zero
                        bgt+    NewDecrementerLoop              ; If not yet to zero, keep looping
                        sync

                        ; Read current value of KeyLargo to get delta time

                        lwbrx   r4, r6, r3                      ; Load low 32-bits of timer (latches all 64 bits)
                        lwbrx   r5, r7, r3                      ; Load high 32-bits of timer (clear latch)

                        ; Calculate difference
                        subf    r3, r8, r4                      ; Subtract low bits (ignore wrap)
                        blr                                                     ; Return
(END)
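
For readers less used to PowerPC assembly, the routine boils down to two
ideas: lwbrx does a byte-reversed load, which on a big-endian CPU reads a
little-endian register, and the final subf takes the difference of the low
counter words modulo 2^32. A portable C sketch of just those two pieces
follows; the function names are made up, and it models the arithmetic only,
not the MMIO access or the decrementer busy-wait.

```c
#include <stdint.h>

/* Offsets from the KeyLargo base, as given in the message above. */
#define KL_COUNTER_LO_OFFSET 0x15038u
#define KL_COUNTER_HI_OFFSET 0x1503Cu

/* Equivalent of lwbrx on a big-endian host: assemble a 32-bit value
 * from four bytes stored least-significant byte first. */
uint32_t load_le32(const volatile unsigned char *p)
{
    return  (uint32_t)p[0]        | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Equivalent of "subf r3, r8, r4": elapsed ticks from the low words
 * only. Unsigned subtraction wraps modulo 2^32, so a single rollover
 * of the low word between the two reads still gives the right delta. */
uint32_t kl_elapsed_ticks(uint32_t start_lo, uint32_t end_lo)
{
    return end_lo - start_lo;
}
```

Ignoring the high word is safe as long as fewer than 2^32 timer ticks elapse
during the window. If the counter does run at 18.432 MHz, as the constant in
the C snippet suggests, that is roughly 233 seconds, far longer than a
1,048,575-tick decrementer countdown at tens of MHz.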



> Also for these machines it seems that OF also returns wrong values.
> Maybe there is an OF update somewhere.
>
> Does anybody know what MacOS X (most MacOS X machines probably use
> ntp) does?
>
> Sorry, more questions than answers. It superficially looks
> like a HW screw-up in one specific series of machines.
>
> 	Gabriel
--
Benjamin Herrenschmidt <benh@kernel.crashing.org>



Thread overview: 9+ messages
2003-10-10  5:12 Time precision, adjtime(x) vs. gettimeofday Bill Fink
2003-10-10  7:33 ` Gabriel Paubert
2003-10-10 16:39   ` Bill Fink
2003-10-10  7:53 ` Ethan Benson
  -- strict thread matches above, loose matches on Subject: below --
2003-10-11  4:45 Bill Fink
2003-10-11  5:27 ` Ethan Benson
2003-10-11 14:58   ` Benjamin Herrenschmidt
2003-10-14  7:07     ` Gabriel Paubert
2003-10-14 11:16       ` Benjamin Herrenschmidt
