public inbox for linux-kernel@vger.kernel.org
From: Andi Kleen <ak@suse.de>
To: Bernd Paysan <bernd.paysan@gmx.de>
Cc: suse-amd64@suse.com, linux-kernel@vger.kernel.org
Subject: Re: [suse-amd64] False "lost ticks" on dual-Opteron system (=> timer twice as fast)
Date: Sun, 8 May 2005 15:40:35 +0200
Message-ID: <20050508134035.GC15724@wotan.suse.de>
In-Reply-To: <200505081445.26663.bernd.paysan@gmx.de>

On Sun, May 08, 2005 at 02:45:20PM +0200, Bernd Paysan wrote:
> Hi,
> 
> I've recently set up a dual Opteron RAID server (AMD-8000-based Tyan 
> Thunder K8S Pro SCSI board, 2 246 Opterons, stepping 10). Kernel is a 
> modified 2.6.11.4-20a from SuSE 9.3 (SMP version, sure). The Opterons 
> are capable of changing the CPU frequency (between 1GHz and 2GHz).

Your system should be using the HPET timer precisely to work around
this. The AMD 8000 has an HPET. Can you post a boot.log?
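
One quick way to see whether the kernel picked up the HPET is to pull
the relevant lines out of the boot log. A minimal sketch follows; the
function name is made up for illustration, and the exact wording of the
kernel's HPET messages varies by version, so it only does a
case-insensitive substring match rather than looking for a specific
message.

```python
def hpet_lines(boot_log_text):
    """Return the boot-log lines that mention HPET (case-insensitive).

    Hypothetical helper: it does not know what a "good" HPET message
    looks like, it only narrows the log down to the lines worth reading.
    """
    return [
        line
        for line in boot_log_text.splitlines()
        if "hpet" in line.lower()
    ]
```

Feed it the contents of whatever file holds the boot messages on your
system (or the output of dmesg). An empty result suggests the HPET was
not detected or not enabled.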

> 
> The system clock runs (on average) about twice as fast as it should be. 
> A closer observation revealed that the clock jumps forward by about 
> 10-30 seconds every 10-30 seconds (plus other oddities, including 
> backward clock jumps). The timer interrupts are distributed roughly 
> evenly among the two CPUs, but looking at the timer interrupt number 
> (grep timer /proc/interrupts) revealed that for about 10-30 seconds, 
> one CPU gets the interrupt, and then the other CPU gets them; the 
> transition causes the system clock to advance.
> 
> A quick look at timer_interrupt shows what I suspect is the culprit: 
> Each CPU keeps track of the last TSC at a timer interrupt, and adds the 

No, it doesn't. The TSC is kept only globally right now. Obviously
that is a problem if the TSCs run at different frequencies (it actually
is a problem even without PowerNow, just a much smaller one), but
that is why HPET is used instead.

There are some plans to change that in the future, but it hasn't 
happened yet.
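
To illustrate why a single global last-TSC value is fragile when the
TSCs run at different rates, here is a toy model. It is not kernel
code: the names, the 2 GHz calibration, and the 1 GHz throttled CPU are
all assumptions picked to match the frequencies mentioned in the
report.

```python
CALIBRATED_KHZ = 2_000_000  # assumption: TSC rate measured once, at 2 GHz


def tsc_at_us(microseconds, khz):
    """TSC of a CPU that started at 0 and ran at a constant `khz`."""
    return microseconds * khz // 1000


def elapsed_us(tsc_now, last_tsc, tsc_khz=CALIBRATED_KHZ):
    """Microseconds since the last tick, trusting the calibrated rate."""
    return (tsc_now - last_tsc) * 1000 // tsc_khz


# The last tick is recorded at t = 10 s; the reader looks 1 ms later.
T_TICK_US = 10_000_000
T_READ_US = 10_001_000

# Tick and read on the same full-speed CPU: the offset is correct.
same_cpu = elapsed_us(
    tsc_at_us(T_READ_US, 2_000_000),
    tsc_at_us(T_TICK_US, 2_000_000),
)

# Tick recorded from a CPU throttled to 1 GHz, read on the 2 GHz CPU:
# the two counters have drifted apart, so the offset is wildly off.
cross_cpu = elapsed_us(
    tsc_at_us(T_READ_US, 2_000_000),
    tsc_at_us(T_TICK_US, 1_000_000),
)

print(same_cpu)   # 1000 (1 ms, as expected)
print(cross_cpu)  # 5001000 (about 5 s for a real 1 ms interval)
```

The point is only that the error grows with how far the counters have
drifted apart, which is why a timer that does not depend on the CPU
clock is preferred.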

> "lost" ticks to jiffies when perceived necessary. If there's only a 
> single jiffies, but two vxtime.last_tsc, it can't work.
> 
> A quick workaround would be to ditch the handling of the "lost" jiffies. 
> I still expect annoying time skews from do_gettimeoffset() 
> (that's what explains the other oddities - if I do gettimeofday() on 
> the CPU that isn't getting interrupts, I'm going to add the "lost" 
> jiffies, too). A proposed fix would be to *also* store the last jiffies 
> value in the vxtime variable, and verify that it's really *this* CPU that 
> missed the timer interrupts. This local "last-stored-jiffies" can 
> help do_gettimeoffset() calculate the local time well enough on both 
> CPUs.

The current design is that only the BP runs the main timer, and the other
CPUs use the APIC timer and don't do any timekeeping of their own. I think
you misread the code quite a bit.

And no, the lost-jiffies handling can't be dropped.

A common problem, however, is that IRQ 0 is misrouted somehow and
gets broadcast to and processed on multiple CPUs. That results
in time running far too fast. You can check that by looking
at /proc/interrupts.
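
If you'd rather not eyeball it, the check can be scripted. A rough
sketch follows, assuming the usual /proc/interrupts layout (a header row
of CPU names, then one row per IRQ with one count per CPU); the function
names are made up for illustration.

```python
import time


def timer_counts(interrupts_text):
    """Per-CPU counts for IRQ 0, or None if no such row is found."""
    lines = interrupts_text.splitlines()
    if not lines:
        return None
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    for line in lines[1:]:
        fields = line.split()
        if fields and fields[0] == "0:":
            return [int(f) for f in fields[1:1 + ncpus]]
    return None


def deltas(before, after):
    """Per-CPU increase in the IRQ 0 count between two samples."""
    return [b - a for a, b in zip(before, after)]


def sample_deltas(path="/proc/interrupts", interval=1.0):
    """Read IRQ 0 counts twice, `interval` seconds apart."""
    with open(path) as f:
        before = timer_counts(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = timer_counts(f.read())
    return deltas(before, after)
```

Calling sample_deltas() gives the number of timer interrupts each CPU
handled in the interval. If the sum over all CPUs is well above the
configured tick rate for that interval, IRQ 0 is probably being
delivered more than once.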

-Andi

Thread overview: 19+ messages
2005-05-08 12:45 False "lost ticks" on dual-Opteron system (=> timer twice as fast) Bernd Paysan
2005-05-08 13:40 ` Andi Kleen [this message]
2005-05-08 16:22   ` [suse-amd64] " Bernd Paysan
2005-05-09 10:53   ` Bernd Paysan
2005-05-09 13:17     ` Bernd Paysan
2005-05-10 10:53       ` Ed Tomlinson
2005-05-10 13:32         ` Andi Kleen
2005-05-10 11:12       ` Andi Kleen
2005-05-10 11:36         ` Bernd Paysan
2005-05-10 11:54         ` Bernd Paysan
2005-05-10 13:07           ` Andi Kleen
2005-05-10 13:15             ` Bernd Paysan
2005-05-10 13:21               ` Andi Kleen
2005-05-10 13:39                 ` Arjan van de Ven
2005-05-21 19:42 ` Hendrik Visage
2005-05-21 20:54   ` Scott Robert Ladd
     [not found]   ` <428F9FA6.1000800@coyotegulch.com>
     [not found]     ` <d93f04c70505211500216d8614@mail.gmail.com>
2005-05-23 11:50       ` Scott Robert Ladd
2005-05-23 23:04         ` Hendrik Visage
2005-05-25 17:06           ` Andi Kleen
