public inbox for linux-kernel@vger.kernel.org
* Re: Losing too many ticks! TSC cannot be used as time sourc
From: Petr Vandrovec @ 2003-11-19 22:10 UTC
  To: john stultz; +Cc: lkml

On 19 Nov 03 at 14:37, john stultz wrote:
> On Wed, 2003-11-19 at 03:21, Petr Vandrovec wrote:
> > Hi,
> >   this morning my kernel (2.6.0-test9-something) said that the
> > TSC became unusable. There are no other messages around this
> > time in the log. It could be thermal throttling, but it should
> > print some message when X86_MCE_P4THERMAL is enabled, yes?
> > It happened after 17hrs 56min of uptime. The system never produced
> > this message before. 
> 
> Hmmm. Interesting. You haven't seen it before and you've been running
> 2.6.0-testX for awhile?

I do not know when this monitoring entered the kernel, but the system has
been running 2.6.0-testX kernels since July 22, when it booted 2.6.0-test1-c1534...
 
> Any heavy cron jobs running at that time? 

atsar says:
         cpu %usr %sys %nice %idle pswtch/s runq nrproc lavg1 ..5 ..15
08:30:01 all 3    1    0     95    134      3    125    0.03 0.03 0.00
08:40:01 all 3    1    0     95    108      4    125    0.07 0.05 0.01
08:50:01 all 3    1    0     95     83      4    123    0.03 0.03 0.00

The high process switch rate is caused by a running teletext grabber. There
is a peak at 6:30-6:50 caused by cron.daily.
 
> The message is caused after 100 consecutive timer ticks where it appears
> that the system has lost ticks. The assumption is that something has
> gone wrong and we can no longer trust the TSC as a time source
> (speedstep, for instance). Thermal throttling is a possibility, but I've
> not actually seen it occur. I'd have to defer to the cpufreq folks on
> that one. 
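
[A minimal sketch of the check described above, not the kernel's actual
code. The names tick_monitor, LOST_TICK_LIMIT and cycles_per_tick are
hypothetical, and the exact threshold comparison used by the kernel is an
assumption. On every timer tick the cycle counter delta is compared with
the expected cycles per tick; a run of 100 ticks that each appear to have
lost time marks the counter as untrusted, and any clean tick resets the
run.]

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed threshold, taken from the "100 consecutive ticks" in the thread. */
#define LOST_TICK_LIMIT 100

/* Hypothetical state for monitoring a cycle counter against timer ticks. */
struct tick_monitor {
    uint64_t cycles_per_tick;   /* expected counter cycles per timer tick */
    uint64_t last_cycles;       /* counter value seen at the previous tick */
    unsigned consecutive_lost;  /* current run of ticks that looked lossy */
    bool counter_trusted;       /* false once the run reaches the limit */
};

void tick_monitor_init(struct tick_monitor *m, uint64_t cycles_per_tick,
                       uint64_t now)
{
    m->cycles_per_tick = cycles_per_tick;
    m->last_cycles = now;
    m->consecutive_lost = 0;
    m->counter_trusted = true;
}

/*
 * Call once per timer tick with the current counter value.
 * Returns how many whole ticks appear to have been lost since the
 * previous call (0 when the interval looks normal).
 */
unsigned tick_monitor_update(struct tick_monitor *m, uint64_t now)
{
    uint64_t delta = now - m->last_cycles;
    uint64_t elapsed_ticks = delta / m->cycles_per_tick;
    unsigned lost = elapsed_ticks >= 2 ? (unsigned)(elapsed_ticks - 1) : 0;

    m->last_cycles = now;

    if (lost > 0) {
        /* Another lossy tick in a row; give up on the counter at the limit. */
        if (++m->consecutive_lost >= LOST_TICK_LIMIT)
            m->counter_trusted = false;
    } else {
        /* A clean tick breaks the run. */
        m->consecutive_lost = 0;
    }
    return lost;
}
```

[With this structure a single clean tick resets the run, so only a
sustained mismatch between the counter and the tick rate (for example a
CPU frequency change) trips the fallback. Raising LOST_TICK_LIMIT is the
knob for reducing false positives.]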

bash# grep FREQ .config 
# CONFIG_CPU_FREQ is not set
bash#

> If we're getting false positives, we may have to bump that
> 100-consecutive-ticks number up. 
> 
> Anything else quirky about the system?

I'm not aware of any problem. Three or four months ago I reported memory
corruption in one of the bitkeeper files (32 bits replaced with something
else), but it has not occurred again since then.
 
> >   Is there any more information I could supply, or should I
> > simply live with the fact that the TSC stopped working on my P4 this
> > morning?
> 
> Well, the TSC didn't stop working, we just stopped using it as a time
> source. Things should work fine using just the PIT, although I'd be
> interested to hear if ntpd finally settled down after the change. If it
> cannot stay synced it may be an issue w/ your system's PIT. 

It has not produced any message since 9:54 (>14 hrs ago). The only difference
is that in the past /var/lib/ntp/ntp.drift listed values from 4.496 to 4.967
(looking at all values I have logged in /var/log/syslog), but now
it says 45.662. But what can I expect from the PIT...
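
[For scale, assuming the ntp.drift values are frequency offsets in parts
per million, which is how ntpd writes its drift file, the figures above
can be converted to seconds of uncorrected drift per day. The helper name
below is hypothetical; this is only a worked conversion, not a statement
about how ntpd uses the value.]

```c
/* Seconds in one day, used to scale a fractional frequency error. */
#define SECONDS_PER_DAY 86400.0

/*
 * Convert a frequency offset in parts per million to the clock error,
 * in seconds, that would accumulate over one day if left uncorrected.
 */
double ppm_to_seconds_per_day(double ppm)
{
    return ppm * 1e-6 * SECONDS_PER_DAY;
}
```

[Under that assumption, 4.496 ppm is about 0.39 s/day and 45.662 ppm is
about 3.95 s/day, so the reported offset is roughly ten times larger than
before, though still a rate ntpd can normally correct.]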

> You may also want to check the ACPI PM timesource code in -mm4, and see
> how that works for you. 

Sometime ;-) Maybe once it hits Linus's kernel ;-)
                                                    Petr Vandrovec
                                                    



* Re: Losing too many ticks! TSC cannot be used as time sourc
From: john stultz @ 2003-11-20  3:56 UTC
  To: Petr Vandrovec; +Cc: lkml

On Wed, 2003-11-19 at 14:10, Petr Vandrovec wrote:
> On 19 Nov 03 at 14:37, john stultz wrote:
> > On Wed, 2003-11-19 at 03:21, Petr Vandrovec wrote:
> > > Hi,
> > >   this morning my kernel (2.6.0-test9-something) said that the
> > > TSC became unusable. There are no other messages around this
> > > time in the log. It could be thermal throttling, but it should
> > > print some message when X86_MCE_P4THERMAL is enabled, yes?
> > > It happened after 17hrs 56min of uptime. The system never produced
> > > this message before. 
> > 
> > Hmmm. Interesting. You haven't seen it before and you've been running
> > 2.6.0-testX for awhile?
> 
> I do not know when this monitoring entered the kernel, but the system has
> been running 2.6.0-testX kernels since July 22, when it booted 2.6.0-test1-c1534...

Alrighty. If it happens again let me know. The fallback code may need to
be tweaked if folks start tripping this too easily. 


> > Well, the TSC didn't stop working, we just stopped using it as a time
> > source. Things should work fine using just the PIT, although I'd be
> > interested to hear if ntpd finally settled down after the change. If it
> > cannot stay synced it may be an issue w/ your system's PIT. 
> 
> It has not produced any message since 9:54 (>14 hrs ago). The only difference
> is that in the past /var/lib/ntp/ntp.drift listed values from 4.496 to 4.967
> (looking at all values I have logged in /var/log/syslog), but now
> it says 45.662. But what can I expect from the PIT...

Ok, that all looks fine. I just wanted to be sure your PIT wasn't
continually acting up.

thanks
-john

