False "lost ticks" on dual-Opteron system (=> timer twice as fast)

public inbox for linux-kernel@vger.kernel.org

* False "lost ticks" on dual-Opteron system (=> timer twice as fast)
@ 2005-05-08 12:45 Bernd Paysan
  2005-05-08 13:40 ` [suse-amd64] " Andi Kleen
  2005-05-21 19:42 ` Hendrik Visage
  0 siblings, 2 replies; 19+ messages in thread
From: Bernd Paysan @ 2005-05-08 12:45 UTC (permalink / raw)
  To: suse-amd64, linux-kernel

Hi,

I've recently set up a dual Opteron RAID server (AMD-8000-based Tyan 
Thunder K8S Pro SCSI board, two Opteron 246 CPUs, stepping 10). The 
kernel is a modified 2.6.11.4-20a from SuSE 9.3 (the SMP version, of 
course). The Opterons can change their CPU frequency (between 1GHz and 
2GHz).

The system clock runs (on average) about twice as fast as it should be. 
A closer observation revealed that the clock jumps forward by about 
10-30 seconds every 10-30 seconds (plus other oddities, including 
backward clock jumps). The timer interrupts are distributed roughly 
evenly among the two CPUs, but looking at the timer interrupt number 
(grep timer /proc/interrupts) revealed that for about 10-30 seconds, 
one CPU gets the interrupt, and then the other CPU gets them; the 
transition causes the system clock to advance.

A quick look at timer_interrupt shows what I suspect is the culprit: 
Each CPU keeps track of the last TSC at a timer interrupt, and adds the 
"lost" ticks to jiffies when perceived necessary. If there's only a 
single jiffies, but two vxtime.last_tsc, it can't work.

A quick workaround would be to ditch the handling of the "lost" jiffies. 
I would still expect annoying time skews from do_gettimeoffset() 
(that's what explains the other oddities - if I do gettimeofday() on 
the CPU that isn't getting interrupts, I'm going to add the "lost" 
jiffies, too). A proposed fix would be to *also* store the last jiffies 
value in the vxtime variable, and verify whether it's really *this* CPU 
that missed the timer interrupts. This local "last-stored-jiffies" can 
help do_gettimeoffset() calculate the local time well enough on both 
CPUs.
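
The double counting suspected above can be illustrated with a toy model.
This is a minimal sketch in Python, not the kernel's timer_interrupt code:
the function name simulate, the fixed cycles_per_tick, and the assumption
that both TSCs run perfectly in sync are all hypothetical. It compares the
jiffies count when the "last TSC" value is kept per CPU against keeping it
once globally, while the timer interrupt moves to the other CPU every
handover_every ticks.

```python
def simulate(ticks, handover_every, per_cpu_last_tsc, cycles_per_tick=1000):
    """Return the jiffies count after `ticks` real timer interrupts.

    Toy model only: both CPUs are assumed to share one ideal TSC, and the
    interrupt is handled by CPU 0 and CPU 1 in alternating blocks.
    """
    jiffies = 0
    last_tsc = [0, 0]  # one slot per CPU; slot 0 doubles as the global value
    for n in range(1, ticks + 1):
        tsc_now = n * cycles_per_tick
        cpu = ((n - 1) // handover_every) % 2
        slot = cpu if per_cpu_last_tsc else 0
        # Ticks that appear to have elapsed since this slot was last updated.
        elapsed = (tsc_now - last_tsc[slot]) // cycles_per_tick
        lost = max(elapsed - 1, 0)  # everything beyond the current tick
        jiffies += 1 + lost
        last_tsc[slot] = tsc_now
    return jiffies


print(simulate(10000, 1000, per_cpu_last_tsc=False))
print(simulate(10000, 1000, per_cpu_last_tsc=True))
```

In this model a single shared value keeps jiffies equal to the number of
real ticks, whereas a per-CPU value adds a full block of phantom "lost"
ticks at every handover, so the clock runs at almost twice the real rate.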

What I can't believe is that I'm the only one who has this problem.

<rant>I know the timer system on an Intel or AMD system is broken by 
design, because there should be a single constant-clocked atomically 
read-only system-wide timer. But this is no excuse for that ;-).</rant>

-- 
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/

* Re: [suse-amd64] False "lost ticks" on dual-Opteron system (=> timer twice as fast)
  2005-05-08 12:45 False "lost ticks" on dual-Opteron system (=> timer twice as fast) Bernd Paysan
@ 2005-05-08 13:40 ` Andi Kleen
  2005-05-08 16:22   ` Bernd Paysan
  2005-05-09 10:53   ` Bernd Paysan
  2005-05-21 19:42 ` Hendrik Visage
  1 sibling, 2 replies; 19+ messages in thread
From: Andi Kleen @ 2005-05-08 13:40 UTC (permalink / raw)
  To: Bernd Paysan; +Cc: suse-amd64, linux-kernel

On Sun, May 08, 2005 at 02:45:20PM +0200, Bernd Paysan wrote:
> Hi,
> 
> I've recently set up a dual Opteron RAID server (AMD-8000-based Tyan 
> Thunder K8S Pro SCSI board, two Opteron 246 CPUs, stepping 10). The 
> kernel is a modified 2.6.11.4-20a from SuSE 9.3 (the SMP version, of 
> course). The Opterons can change their CPU frequency (between 1GHz and 
> 2GHz).

Your system should be using the HPET timer to work exactly around
this. AMD 8000 has HPET. Can you post a boot.log?

> 
> The system clock runs (on average) about twice as fast as it should be. 
> A closer observation revealed that the clock jumps forward by about 
> 10-30 seconds every 10-30 seconds (plus other oddities, including 
> backward clock jumps). The timer interrupts are distributed roughly 
> evenly among the two CPUs, but looking at the timer interrupt number 
> (grep timer /proc/interrupts) revealed that for about 10-30 seconds, 
> one CPU gets the interrupt, and then the other CPU gets them; the 
> transition causes the system clock to advance.
> 
> A quick look at timer_interrupt shows what I suspect is the culprit: 
> Each CPU keeps track of the last TSC at a timer interrupt, and adds the 

No, it doesn't. TSC is kept only globally right now.  Obviously
that is a problem if the TSCs run at different frequencies (it actually
is a problem even without powernow, just a much smaller one), but
that is why HPET is used instead.

There are some plans to change that in the future, but it hasn't 
happened yet.

> "lost" ticks to jiffies when perceived necessary. If there's only a 
> single jiffies, but two vxtime.last_tsc, it can't work.
> 
> A quick workaround would be to ditch the handling of the "lost" jiffies. 
> I would still expect annoying time skews from do_gettimeoffset() 
> (that's what explains the other oddities - if I do gettimeofday() on 
> the CPU that isn't getting interrupts, I'm going to add the "lost" 
> jiffies, too). A proposed fix would be to *also* store the last jiffies 
> value in the vxtime variable, and verify whether it's really *this* CPU 
> that missed the timer interrupts. This local "last-stored-jiffies" can 
> help do_gettimeoffset() calculate the local time well enough on both 
> CPUs.

The current design is that only the BP (boot processor) runs the main
timer, and the other CPUs use the APIC timer and don't do any timekeeping
of their own. I think you misread the code quite a bit.

And no, lost jiffies handling can't be dropped.

A common problem however is that irq 0 is misrouted somehow,
and gets broadcast and processed on multiple CPUs. That results
in the time running far too fast. You can check that by looking
at /proc/interrupts.
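
The /proc/interrupts check can be scripted. The following is a minimal
sketch in Python; the helper names timer_counts and advancing_cpus and the
min_ticks threshold are illustrative assumptions, not an existing tool. It
extracts the per-CPU counts from the irq 0 "timer" line and reports which
CPUs handled timer interrupts between two snapshots.

```python
def timer_counts(proc_interrupts_text):
    """Return the per-CPU counts from the irq 0 'timer' line, or []."""
    for line in proc_interrupts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "0:" and fields[-1] == "timer":
            counts = []
            for field in fields[1:]:
                if not field.isdigit():
                    break  # reached the controller name, e.g. IO-APIC-edge
                counts.append(int(field))
            return counts
    return []


def advancing_cpus(before, after, min_ticks=100):
    """Return indices of CPUs whose timer count rose by at least min_ticks."""
    return [cpu for cpu, (a, b) in enumerate(zip(before, after))
            if b - a >= min_ticks]


snapshot = "  0:    4156834    4466062    IO-APIC-edge  timer"
print(timer_counts(snapshot))
```

Taking two snapshots about one second apart and passing their counts to
advancing_cpus shows whether one CPU or several are counting irq 0. If
every CPU advances at the full tick rate in the same interval, that would
match the broadcast case described above.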

-Andi

* Re: [suse-amd64] False "lost ticks" on dual-Opteron system (=> timer twice as fast)
  2005-05-08 13:40 ` [suse-amd64] " Andi Kleen
@ 2005-05-08 16:22   ` Bernd Paysan
  2005-05-09 10:53   ` Bernd Paysan
  1 sibling, 0 replies; 19+ messages in thread
From: Bernd Paysan @ 2005-05-08 16:22 UTC (permalink / raw)
  To: suse-amd64; +Cc: Andi Kleen, linux-kernel

On Sunday 08 May 2005 15:40, Andi Kleen wrote:
> Your system should be using the HPET timer to work exactly around
> this. AMD 8000 has HPET. Can you post a boot.log?

Will come tomorrow - I don't sit right at the machine, and while trying 
to figure out what happens, I accidentally shut it down or caused it to 
crash (I can't log in remotely ATM).

> The current design is that only the BP (boot processor) runs the main
> timer, and the other CPUs use the APIC timer and don't do any
> timekeeping of their own. I think you misread the code quite a bit.
> 
> And no, lost jiffies handling can't be dropped.
>
> A common problem however is that irq 0 is misrouted somehow,
> and gets broadcast and processed on multiple CPUs. That results
> in the time running far too fast. You can check that by looking
> at /proc/interrupts.

Yes, that's sort of what's happening. /proc/interrupts shows that all 
CPUs overall get an even share of IRQ 0 - but each IRQ0 is processed by 
just one CPU. How can I examine and set the interrupt routing?

-- 
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/

* Re: [suse-amd64] False "lost ticks" on dual-Opteron system (=> timer twice as fast)
  2005-05-08 13:40 ` [suse-amd64] " Andi Kleen
  2005-05-08 16:22   ` Bernd Paysan
@ 2005-05-09 10:53   ` Bernd Paysan
  2005-05-09 13:17     ` Bernd Paysan
  1 sibling, 1 reply; 19+ messages in thread
From: Bernd Paysan @ 2005-05-09 10:53 UTC (permalink / raw)
  To: suse-amd64; +Cc: Andi Kleen, linux-kernel


On Sunday 08 May 2005 15:40, Andi Kleen wrote:
> Your system should be using the HPET timer to work exactly around
> this. AMD 8000 has HPET. Can you post a boot.log?

Ok, boot.log attached. The only entry with hpet seems to indicate some 
problems.

> A common problem however is that irq 0 is misrouted somehow,
> and gets broadcast and processed on multiple CPUs. That results
> in the time running far too fast. You can check that by looking
> at /proc/interrupts.

After rebooting today (and doing basically nothing), things look like this:

# while [ .T. ]; do sleep 1; echo -n $(date); grep timer /proc/interrupts; done
Mo Mai 9 12:47:37 CEST 2005  0:    4156834    4466062    IO-APIC-edge  timer
Mo Mai 9 12:47:38 CEST 2005  0:    4157847    4466062    IO-APIC-edge  timer
Mo Mai 9 12:47:39 CEST 2005  0:    4158861    4466062    IO-APIC-edge  timer
Mo Mai 9 12:47:40 CEST 2005  0:    4159874    4466062    IO-APIC-edge  timer
Mo Mai 9 12:47:41 CEST 2005  0:    4160886    4466062    IO-APIC-edge  timer
Mo Mai 9 12:47:42 CEST 2005  0:    4161899    4466062    IO-APIC-edge  timer
Mo Mai 9 12:47:43 CEST 2005  0:    4162913    4466062    IO-APIC-edge  timer
Mo Mai 9 12:47:44 CEST 2005  0:    4163926    4466062    IO-APIC-edge  timer
Mo Mai 9 12:47:45 CEST 2005  0:    4164938    4466062    IO-APIC-edge  timer
Mo Mai 9 12:47:46 CEST 2005  0:    4165951    4466062    IO-APIC-edge  timer
Mo Mai 9 12:47:47 CEST 2005  0:    4166396    4466631    IO-APIC-edge  timer
Mo Mai 9 12:47:48 CEST 2005  0:    4166396    4467644    IO-APIC-edge  timer
Mo Mai 9 12:47:49 CEST 2005  0:    4166396    4468656    IO-APIC-edge  timer
Mo Mai 9 12:47:50 CEST 2005  0:    4166396    4469668    IO-APIC-edge  timer
Mo Mai 9 12:47:51 CEST 2005  0:    4166396    4470681    IO-APIC-edge  timer
Mo Mai 9 12:47:52 CEST 2005  0:    4166396    4471694    IO-APIC-edge  timer
Mo Mai 9 12:47:53 CEST 2005  0:    4166396    4472708    IO-APIC-edge  timer
Mo Mai 9 12:47:54 CEST 2005  0:    4166396    4473720    IO-APIC-edge  timer
Mo Mai 9 12:47:55 CEST 2005  0:    4166396    4474733    IO-APIC-edge  timer
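
The handover visible in the samples above can be made explicit by
differencing consecutive readings. This is a small Python sketch; the
helper name per_cpu_deltas is an assumption, and the numbers are copied
from the 12:47:44 to 12:47:48 lines of the output.

```python
def per_cpu_deltas(samples):
    """Return per-CPU count increases between consecutive samples."""
    return [tuple(cur_n - prev_n for prev_n, cur_n in zip(prev, cur))
            for prev, cur in zip(samples, samples[1:])]


# (CPU0, CPU1) timer counts taken from the listing above.
samples = [
    (4163926, 4466062),  # 12:47:44
    (4164938, 4466062),  # 12:47:45
    (4165951, 4466062),  # 12:47:46
    (4166396, 4466631),  # 12:47:47
    (4166396, 4467644),  # 12:47:48
]

deltas = per_cpu_deltas(samples)
totals = [sum(d) for d in deltas]
print(deltas)  # [(1012, 0), (1013, 0), (445, 569), (0, 1013)]
print(totals)  # [1012, 1013, 1014, 1013]
```

The total per interval stays at roughly 1013 even across the switch from
CPU0 to CPU1, which suggests that each interrupt is counted by exactly one
CPU rather than by both.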

Adding load to one CPU changes things:

# cat /dev/zero >/dev/null &
# speed
1000000
2000000
# while [ .T. ]; do sleep 1; echo -n $(date); grep timer /proc/interrupts; done
Mo Mai 9 12:48:52 CEST 2005  0:    4195741    4500873    IO-APIC-edge  timer
Mo Mai 9 12:48:53 CEST 2005  0:    4195741    4501882    IO-APIC-edge  timer
Mo Mai 9 12:48:54 CEST 2005  0:    4195741    4502893    IO-APIC-edge  timer
Mo Mai 9 12:48:55 CEST 2005  0:    4195741    4503902    IO-APIC-edge  timer
Mo Mai 9 12:48:56 CEST 2005  0:    4195741    4504913    IO-APIC-edge  timer
Mo Mai 9 12:49:01 CEST 2005  0:    4195958    4505706    IO-APIC-edge  timer
Mo Mai 9 12:49:03 CEST 2005  0:    4196968    4505706    IO-APIC-edge  timer
Mo Mai 9 12:49:04 CEST 2005  0:    4197977    4505706    IO-APIC-edge  timer
Mo Mai 9 12:49:06 CEST 2005  0:    4198986    4505706    IO-APIC-edge  timer
Mo Mai 9 12:49:07 CEST 2005  0:    4199997    4505706    IO-APIC-edge  timer
Mo Mai 9 12:49:09 CEST 2005  0:    4201006    4505706    IO-APIC-edge  timer
Mo Mai 9 12:49:10 CEST 2005  0:    4202015    4505706    IO-APIC-edge  timer
Mo Mai 9 12:49:04 CEST 2005  0:    4202868    4505706    IO-APIC-edge  timer
Mo Mai 9 12:49:12 CEST 2005  0:    4203675    4505706    IO-APIC-edge  timer
Mo Mai 9 12:49:14 CEST 2005  0:    4204685    4505706    IO-APIC-edge  timer
Mo Mai 9 12:49:15 CEST 2005  0:    4205376    4505713    IO-APIC-edge  timer
Mo Mai 9 12:49:16 CEST 2005  0:    4205376    4506724    IO-APIC-edge  timer
Mo Mai 9 12:49:17 CEST 2005  0:    4205376    4507734    IO-APIC-edge  timer
Mo Mai 9 12:49:18 CEST 2005  0:    4205376    4508743    IO-APIC-edge  timer
Mo Mai 9 12:49:19 CEST 2005  0:    4205376    4509752    IO-APIC-edge  timer
Mo Mai 9 12:49:20 CEST 2005  0:    4205376    4510761    IO-APIC-edge  timer

After stopping the load, the hiccups continue:

Mo Mai 9 12:56:28 CEST 2005  0:    4312541    4585753    IO-APIC-edge  timer
Mo Mai 9 12:56:29 CEST 2005  0:    4313554    4585753    IO-APIC-edge  timer
Mo Mai 9 12:56:30 CEST 2005  0:    4314568    4585753    IO-APIC-edge  timer
Mo Mai 9 12:57:20 CEST 2005  0:    4315424    4585756    IO-APIC-edge  timer
Mo Mai 9 12:57:21 CEST 2005  0:    4316437    4585756    IO-APIC-edge  timer
Mo Mai 9 12:57:22 CEST 2005  0:    4317449    4585756    IO-APIC-edge  timer
Mo Mai 9 12:57:23 CEST 2005  0:    4318461    4585756    IO-APIC-edge  timer
Mo Mai 9 12:57:24 CEST 2005  0:    4319474    4585756    IO-APIC-edge  timer
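
The size of such a jump can be quantified by comparing wall-clock time
with the time implied by the interrupt counts. The sketch below is Python
and makes two assumptions that the thread does not state: a tick rate of
1000 Hz (suggested by roughly 1013 interrupts per loop iteration) and an
arbitrary 2-second tolerance. The helper name clock_jumps is likewise
illustrative.

```python
from datetime import datetime


def clock_jumps(samples, hz=1000, tolerance=2.0):
    """Flag intervals where wall-clock time and tick time disagree.

    samples: (HH:MM:SS, cpu0_count, cpu1_count) tuples in order.
    Returns (start, end, clock_seconds, tick_seconds) per flagged interval.
    """
    jumps = []
    for (t0, a0, b0), (t1, a1, b1) in zip(samples, samples[1:]):
        clock_s = (datetime.strptime(t1, "%H:%M:%S")
                   - datetime.strptime(t0, "%H:%M:%S")).total_seconds()
        tick_s = ((a1 - a0) + (b1 - b0)) / hz  # assumes hz ticks per second
        if abs(clock_s - tick_s) > tolerance:
            jumps.append((t0, t1, clock_s, tick_s))
    return jumps


# Readings copied from the listing above.
samples = [
    ("12:56:29", 4313554, 4585753),
    ("12:56:30", 4314568, 4585753),
    ("12:57:20", 4315424, 4585756),
    ("12:57:21", 4316437, 4585756),
]
print(clock_jumps(samples))
```

For these readings the only flagged interval is 12:56:30 to 12:57:20: the
clock advanced 50 seconds while the two CPUs together counted only 859
timer interrupts, i.e. under one second of ticks at the assumed rate.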

-- 
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/

[-- Attachment #1.2: boot.log --]
[-- Type: text/x-log, Size: 19798 bytes --]

Bootdata ok (command line is root=/dev/sda5 vga=0x317 selinux=0  splash=silent console=tty0 resume=/dev/sda1)
Linux version 2.6.11.4-20a-smp (geeko@buildhost) (gcc version 3.3.5 20050117 (prerelease) (SUSE Linux)) #1 SMP Wed Mar 23 21:52:37 UTC 2005
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009f400 (usable)
 BIOS-e820: 000000000009f400 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000e8ff0000 (usable)
 BIOS-e820: 00000000e8ff0000 - 00000000e8fff000 (ACPI data)
 BIOS-e820: 00000000e8fff000 - 00000000e9000000 (ACPI NVS)
 BIOS-e820: 00000000ff780000 - 0000000100000000 (reserved)
ACPI: RSDP (v000 ACPIAM                                ) @ 0x00000000000f6cb0
ACPI: RSDT (v001 A M I  OEMRSDT  0x04000518 MSFT 0x00000097) @ 0x00000000e8ff0000
ACPI: FADT (v001 A M I  OEMFACP  0x04000518 MSFT 0x00000097) @ 0x00000000e8ff0200
ACPI: MADT (v001 A M I  OEMAPIC  0x04000518 MSFT 0x00000097) @ 0x00000000e8ff0380
ACPI: OEMB (v001 A M I  OEMBIOS  0x04000518 MSFT 0x00000097) @ 0x00000000e8fff040
ACPI: SRAT (v001 A M I  OEMSRAT  0x04000518 MSFT 0x00000097) @ 0x00000000e8ff3b20
ACPI: ASF! (v001 AMIASF AMDSTRET 0x00000001 INTL 0x02002026) @ 0x00000000e8ff3c30
ACPI: DSDT (v001  0AAAA 0AAAA000 0x00000000 INTL 0x02002026) @ 0x0000000000000000
SRAT: PXM 0 -> APIC 0 -> Node 0
SRAT: PXM 1 -> APIC 1 -> Node 1
SRAT: Node 0 PXM 0 100000-7fffffff
SRAT: Node 1 PXM 1 80000000-e8ffffff
SRAT: Node 0 PXM 0 0-7fffffff
Bootmem setup node 0 0000000000000000-000000007fffffff
Bootmem setup node 1 0000000080000000-00000000e8feffff
On node 0 totalpages: 524287
  DMA zone: 4096 pages, LIFO batch:1
  Normal zone: 520191 pages, LIFO batch:16
  HighMem zone: 0 pages, LIFO batch:1
On node 1 totalpages: 430063
  DMA zone: 0 pages, LIFO batch:1
  Normal zone: 430063 pages, LIFO batch:16
  HighMem zone: 0 pages, LIFO batch:1
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 15:5 APIC version 16
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Processor #1 15:5 APIC version 16
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x82] disabled)
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x83] disabled)
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, version 17, address 0xfec00000, GSI 0-23
ACPI: IOAPIC (id[0x03] address[0xfebff000] gsi_base[24])
IOAPIC[1]: apic_id 3, version 17, address 0xfebff000, GSI 24-27
ACPI: IOAPIC (id[0x04] address[0xfebfe000] gsi_base[28])
IOAPIC[2]: apic_id 4, version 17, address 0xfebfe000, GSI 28-31
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Setting APIC routing to flat
Using ACPI (MADT) for SMP configuration information
Checking aperture...
CPU 0: aperture @ 1d20000000 size 256 MB
Aperture from northbridge cpu 0 beyond 4GB. Ignoring.
No AGP bridge found
Built 2 zonelists
Kernel command line: root=/dev/sda5 vga=0x317 selinux=0  splash=silent console=tty0 resume=/dev/sda1
bootsplash: silent mode.
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 131072 bytes)
time.c: Using 1.193182 MHz PIT timer.
time.c: Detected 1991.823 MHz processor.
Console: colour dummy device 80x25
Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes)
Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
Memory: 3751076k/3817408k available (2280k kernel code, 0k reserved, 1179k data, 212k init)
Calibrating delay loop... 3915.77 BogoMIPS (lpj=1957888)
Security Framework v1.0.0 initialized
SELinux:  Disabled at boot.
Mount-cache hash table entries: 256 (order: 0, 4096 bytes)
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
CPU 0(1) -> Node 0
checking if image is initramfs... it is
ACPI: Looking for DSDT in initrd... not found!
 not found!
Using local APIC NMI watchdog using perfctr0
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
CPU 0(1) -> Node 0
CPU0: AMD Opteron(tm) Processor 246 stepping 0a
per-CPU timeslice cutoff: 1024.34 usecs.
task migration cache decay timeout: 2 msecs.
Booting processor 1/1 rip 6000 rsp ffff8100e8c67f58
Initializing CPU#1
Calibrating delay loop... 3981.31 BogoMIPS (lpj=1990656)
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
CPU 1(1) -> Node 1
AMD Opteron(tm) Processor 246 stepping 0a
Total of 2 processors activated (7897.08 BogoMIPS).
Using local APIC timer interrupts.
Detected 12.448 MHz APIC timer.
checking TSC synchronization across 2 CPUs: passed.
time.c: Using PIT/TSC based timekeeping.
Brought up 2 CPUs
NET: Registered protocol family 16
PCI: Using configuration type 1
mtrr: v2.0 (20020519)
ACPI: Subsystem revision 20050211
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (00:00)
PCI: Probing PCI hardware (bus 00)
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PCI1._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.GOLA._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.GOLB._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 *5 6 7 9 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 *9 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 9 *10 11 12 14 15)
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq".  If it helps, post a report
TC classifier action (bugs to netdev@oss.sgi.com cc hadi@cyberus.ca)
PCI-DMA: Disabling IOMMU.
IA32 emulation $Id: sys_ia32.c,v 1.32 2002/03/24 13:02:28 ak Exp $
audit: initializing netlink socket (disabled)
audit(1115626867.414:0): initialized
Total HugeTLB memory allocated, 0
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
Initializing Cryptographic API
vesafb: framebuffer at 0xfd000000, mapped to 0xffffc20000080000, using 6144k, total 8128k
vesafb: mode is 1024x768x16, linelength=2048, pages=4
vesafb: scrolling: redraw
vesafb: Truecolor: size=0:5:6:5, shift=0:11:5:0
bootsplash 3.1.6-2004/03/31: looking for picture...<6> silentjpeg size 57135 bytes,<6>...found (1024x768, 36789 bytes, v3).
Console: switching to colour frame buffer device 127x44
fb0: VESA VGA frame buffer device
Real Time Clock Driver v1.12
hpet_acpi_add: no address or irqs in _CRS
Non-volatile memory driver v1.2
Linux agpgart interface v0.100 (c) Dave Jones
serio: i8042 AUX port at 0x60,0x64 irq 12
serio: i8042 KBD port at 0x60,0x64 irq 1
Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing disabled
ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered
RAMDISK driver initialized: 16 RAM disks of 128000K size 1024 blocksize
loop: loaded (max 8 devices)
mice: PS/2 mouse device common for all mice
input: PC Speaker
md: md driver 0.90.1 MAX_MD_DEVS=256, MD_SB_DISKS=27
NET: Registered protocol family 2
IP: routing cache hash table of 16384 buckets, 256Kbytes
TCP established hash table entries: 262144 (order: 10, 4194304 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 262144 bind 65536)
NET: Registered protocol family 1
ACPI wakeup devices: 
PCI1 USB0 USB1 PS2K UAR1 GOLA GLAN GOLB SMBC AC97 MODM PWRB 
ACPI: (supports S0 S1 S4 S5)
Freeing unused kernel memory: 212k freed
input: AT Translated Set 2 keyboard on isa0060/serio0
SCSI subsystem initialized
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
AMD8111: IDE controller at PCI slot 0000:00:07.1
AMD8111: chipset revision 3
AMD8111: not 100% native mode: will probe irqs later
AMD8111: 0000:00:07.1 (rev 03) UDMA133 controller
    ide0: BM-DMA at 0xffa0-0xffa7, BIOS settings: hda:pio, hdb:pio
    ide1: BM-DMA at 0xffa8-0xffaf, BIOS settings: hdc:DMA, hdd:pio
Probing IDE interface ide0...
Probing IDE interface ide1...
hdc: SONY DVD RW DW-D56A, ATAPI CD/DVD-ROM drive
ide1 at 0x170-0x177,0x376 on irq 15
3ware 9000 Storage Controller device driver for Linux v2.26.02.001.
ACPI: PCI interrupt 0000:02:03.0[A] -> GSI 27 (level, low) -> IRQ 169
scsi0 : 3ware 9000 Storage Controller
3w-9xxx: scsi0: Found a 3ware 9000 Storage Controller at 0xfc9ffc00, IRQ: 169.
3w-9xxx: scsi0: Firmware FE9X 2.06.00.009, BIOS BE9X 2.03.01.051, Ports: 4.
  Vendor: AMCC      Model: 9500S-4LP  DISK   Rev: 2.06
  Type:   Direct-Access                      ANSI SCSI revision: 03
SCSI device sda: 136697856 512-byte hdwr sectors (69989 MB)
SCSI device sda: drive cache: write back, no read (daft)
SCSI device sda: 136697856 512-byte hdwr sectors (69989 MB)
SCSI device sda: drive cache: write back, no read (daft)
 sda: sda1 sda2 < sda5 sda6 sda7 >
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
ACPI: PCI interrupt 0000:01:03.0[A] -> GSI 28 (level, low) -> IRQ 177
3w-9xxx: scsi0: ERROR: (0x03:0x0104): SGL entry has illegal length:address=0xE7E0C000, length=0xFE, cmd=X.
scsi_id[967]: 0:0:0:0: sg_io failed status 0x8 0x0 0x0 0x4
scsi_id[967]: 0:0:0:0: Unable to get INQUIRY vpd 1 page 0x0.
scsi1 : 3ware 9000 Storage Controller
3w-9xxx: scsi1: Found a 3ware 9000 Storage Controller at 0xfc6ffc00, IRQ: 177.
3w-9xxx: scsi1: Firmware FE9X 2.06.00.009, BIOS BE9X 2.03.01.051, Ports: 12.
  Vendor: AMCC      Model: 9500S-12   DISK   Rev: 2.06
  Type:   Direct-Access                      ANSI SCSI revision: 03
SCSI device sdb: 1953038336 512-byte hdwr sectors (999956 MB)
SCSI device sdb: drive cache: write back, no read (daft)
SCSI device sdb: 1953038336 512-byte hdwr sectors (999956 MB)
SCSI device sdb: drive cache: write back, no read (daft)
 sdb: sdb1
Attached scsi disk sdb at scsi1, channel 0, id 0, lun 0
  Vendor: AMCC      Model: 9500S-12   DISK   Rev: 2.06
  Type:   Direct-Access                      ANSI SCSI revision: 03
SCSI device sdc: 1953038336 512-byte hdwr sectors (999956 MB)
SCSI device sdc: drive cache: write back, no read (daft)
SCSI device sdc: 1953038336 512-byte hdwr sectors (999956 MB)
3w-9xxx: scsi1: ERROR: (0x03:0x0104): SGL entry has illegal length:address=0xE7DBB000, length=0xFE, cmd=X.
SCSI device sdc: drive cache: write back, no read (daft)
 sdc:<4>scsi_id[1114]: 1:0:0:0: sg_io failed status 0x8 0x0 0x0 0x4
 sdc1
Attached scsi disk sdc at scsi1, channel 0, id 1, lun 0
scsi_id[1114]: 1:0:0:0: Unable to get INQUIRY vpd 1 page 0x0.
3w-9xxx: scsi1: ERROR: (0x03:0x0104): SGL entry has illegal length:address=0xE80D6000, length=0xFE, cmd=X.
scsi_id[1152]: 1:0:1:0: sg_io failed status 0x8 0x0 0x0 0x4
scsi_id[1152]: 1:0:1:0: Unable to get INQUIRY vpd 1 page 0x0.
ACPI: PCI interrupt 0000:02:06.0[A] -> GSI 24 (level, low) -> IRQ 185
ACPI: PCI interrupt 0000:02:06.1[B] -> GSI 25 (level, low) -> IRQ 193
scsi2 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 1.3.11
        <Adaptec AIC7902 Ultra320 SCSI adapter>
        aic7902: Ultra320 Wide Channel A, SCSI Id=7, PCI 33 or 66Mhz, 512 SCBs

(scsi2:A:6): 160.000MB/s transfers (80.000MHz DT, 16bit)
  Vendor: CERTANCE  Model: ULTRIUM 2         Rev: 1703
  Type:   Sequential-Access                  ANSI SCSI revision: 03
scsi3 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 1.3.11
        <Adaptec AIC7902 Ultra320 SCSI adapter>
        aic7902: Ultra320 Wide Channel B, SCSI Id=7, PCI 33 or 66Mhz, 512 SCBs

hdc: ATAPI 24X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.20
kjournald starting.  Commit interval 5 seconds
EXT3 FS on sda5, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
md: Autodetecting RAID arrays.
JBD: barrier-based sync failed on sda5 - disabling barriers
md: autorun ...
md: ... autorun DONE.
Adding 8393920k swap on /dev/sda1.  Priority:42 extents:1
device-mapper: 4.4.0-ioctl (2005-01-12) initialised: dm-devel@redhat.com
lp: driver loaded but no devices found
cdrom: open failed.
usbcore: registered new driver usbfs
usbcore: registered new driver hub
load_module: err 0xffffffffffffffef (dont worry)
ohci_hcd: 2004 Nov 08 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI)
ACPI: PCI interrupt 0000:03:00.0[D] -> GSI 19 (level, low) -> IRQ 201
ohci_hcd 0000:03:00.0: OHCI Host Controller
ohci_hcd 0000:03:00.0: irq 201, pci mem 0xfeafd000
ohci_hcd 0000:03:00.0: new USB bus registered, assigned bus number 1
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 3 ports detected
ACPI: PCI interrupt 0000:03:00.1[D] -> GSI 19 (level, low) -> IRQ 201
ohci_hcd 0000:03:00.1: OHCI Host Controller
ohci_hcd 0000:03:00.1: irq 201, pci mem 0xfeafe000
ohci_hcd 0000:03:00.1: new USB bus registered, assigned bus number 2
e100: Intel(R) PRO/100 Network Driver, 3.4.1-NAPI
e100: Copyright(c) 1999-2005 Intel Corporation
hub 2-0:1.0: USB hub found
Losing some ticks... checking if CPU frequency changed.
hub 2-0:1.0: 3 ports detected
load_module: err 0xffffffffffffffef (dont worry)
tg3.c:v3.23 (February 15, 2005)
ACPI: PCI interrupt 0000:02:09.0[A] -> GSI 24 (level, low) -> IRQ 185
eth0: Tigon3 [partno(BCM95704A7) rev 2003 PHY(5704)] (PCI:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:e0:81:29:af:1c
eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1] 
ACPI: PCI interrupt 0000:02:09.1[B] -> GSI 25 (level, low) -> IRQ 193
eth1: Tigon3 [partno(BCM95704A7) rev 2003 PHY(5704)] (PCI:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:e0:81:29:af:1d
eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1] 
ACPI: PCI interrupt 0000:03:08.0[A] -> GSI 18 (level, low) -> IRQ 209
e100: eth2: e100_probe: addr 0xfeafb000, irq 209, MAC addr 00:E0:81:29:AF:1B
st: Version 20041025, fixed bufsize 32768, s/g segs 256
Attached scsi tape st0 at scsi2, channel 0, id 6, lun 0
st0: try direct i/o: yes (alignment 512 B), max page reachable by HBA 134217727
Attached scsi generic sg0 at scsi0, channel 0, id 0, lun 0,  type 0
Attached scsi generic sg1 at scsi1, channel 0, id 0, lun 0,  type 0
Attached scsi generic sg2 at scsi1, channel 0, id 1, lun 0,  type 0
Attached scsi generic sg3 at scsi2, channel 0, id 6, lun 0,  type 1
BIOS EDD facility v0.16 2004-Jun-25, 3 devices found
3w-9xxx: scsi1: ERROR: (0x03:0x0104): SGL entry has illegal length:address=0xCE4A2000, length=0xFE, cmd=X.
3w-9xxx: scsi1: ERROR: (0x03:0x0104): SGL entry has illegal length:address=0xCECE3000, length=0xFE, cmd=X.
scsi_id[4718]: 1:0:1:0: sg_io failed status 0x8 0x0 0x0 0x4
scsi_id[4718]: 1:0:1:0: Unable to get INQUIRY vpd 1 page 0x0.
scsi_id[4781]: 1:0:0:0: sg_io failed status 0x8 0x0 0x0 0x4
scsi_id[4781]: 1:0:0:0: Unable to get INQUIRY vpd 1 page 0x0.
3w-9xxx: scsi0: ERROR: (0x03:0x0104): SGL entry has illegal length:address=0xCF292000, length=0xFE, cmd=X.
scsi_id[4909]: 0:0:0:0: sg_io failed status 0x8 0x0 0x0 0x4
scsi_id[4909]: 0:0:0:0: Unable to get INQUIRY vpd 1 page 0x0.
cdrom: open failed.
3w-9xxx: scsi1: ERROR: (0x03:0x0104): SGL entry has illegal length:address=0xCEE2C000, length=0x24, cmd=X.
kjournald starting.  Commit interval 5 seconds
EXT3 FS on sda6, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
ReiserFS: sda7: found reiserfs format "3.6" with standard journal
ReiserFS: sda7: using ordered data mode
reiserfs: using flush barriers
ReiserFS: sda7: journal params: device sda7, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: sda7: checking transaction log (sda7)
reiserfs: disabling flush barriers on sda7
ReiserFS: sda7: Using r5 hash to sort names
ReiserFS: dm-1: found reiserfs format "3.6" with standard journal
ReiserFS: dm-1: using ordered data mode
reiserfs: using flush barriers
ReiserFS: dm-1: journal params: device dm-1, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: dm-1: checking transaction log (dm-1)
reiserfs: disabling flush barriers on dm-1
ReiserFS: dm-1: Using r5 hash to sort names
ReiserFS: dm-0: found reiserfs format "3.6" with standard journal
ReiserFS: dm-0: using ordered data mode
reiserfs: using flush barriers
ReiserFS: dm-0: journal params: device dm-0, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: dm-0: checking transaction log (dm-0)
reiserfs: disabling flush barriers on dm-0
ReiserFS: dm-0: Using r5 hash to sort names
ReiserFS: dm-2: found reiserfs format "3.6" with standard journal
ReiserFS: dm-2: using ordered data mode
reiserfs: using flush barriers
ReiserFS: dm-2: journal params: device dm-2, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: dm-2: checking transaction log (dm-2)
reiserfs: disabling flush barriers on dm-2
ReiserFS: dm-2: Using r5 hash to sort names
Capability LSM initialized
ieee1394: Initialized config rom entry `ip1394'
ieee1394: raw1394: /dev/raw1394 device initialized
video1394: Installed video1394 module
tg3: eth0: Link is up at 1000 Mbps, full duplex.
tg3: eth0: Flow control is on for TX and on for RX.
mtrr: 0xfd000000,0x800000 overlaps existing 0xfd000000,0x400000
mtrr: 0xfd000000,0x800000 overlaps existing 0xfd000000,0x400000
ACPI: Power Button (FF) [PWRF]
ACPI: Processor [CPU1] (supports 8 throttling states)
    ACPI-0484: *** Warning: Error getting cpuindex for acpiid 0x3
    ACPI-0484: *** Warning: Error getting cpuindex for acpiid 0x4
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
end_request: I/O error, dev fd0, sector 0
end_request: I/O error, dev fd0, sector 0
3w-9xxx: scsi0: ERROR: (0x03:0x0104): SGL entry has illegal length:address=0xD6866000, length=0x24, cmd=X.
3w-9xxx: scsi1: ERROR: (0x03:0x0104): SGL entry has illegal length:address=0x7D2CB000, length=0x24, cmd=X.
3w-9xxx: scsi1: ERROR: (0x03:0x0104): SGL entry has illegal length:address=0x7D2CB000, length=0x24, cmd=X.
NET: Registered protocol family 10
Disabled Privacy Extensions on device ffffffff80425c00(lo)
IPv6 over IPv4 tunneling driver
Disabled Privacy Extensions on device ffff81007cc65000(sit0)
powernow-k8: Found 2 AMD Athlon 64 / Opteron processors (version 1.00.09e)
powernow-k8:    0 : fid 0xc (2000 MHz), vid 0x2 (1500 mV)
powernow-k8:    1 : fid 0xa (1800 MHz), vid 0x6 (1400 mV)
powernow-k8:    2 : fid 0x2 (1000 MHz), vid 0xe (1200 mV)
cpu_init done, current fid 0xc, vid 0x2
powernow-k8:    0 : fid 0xc (2000 MHz), vid 0x2 (1500 mV)
powernow-k8:    1 : fid 0xa (1800 MHz), vid 0x6 (1400 mV)
powernow-k8:    2 : fid 0x2 (1000 MHz), vid 0xe (1200 mV)
cpu_init done, current fid 0xc, vid 0x2
drivers/usb/serial/usb-serial.c: USB Serial support registered for Generic
usbcore: registered new driver usbserial_generic
usbcore: registered new driver usbserial
drivers/usb/serial/usb-serial.c: USB Serial Driver core v2.0
Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
3w-9xxx: scsi1: ERROR: (0x03:0x0104): SGL entry has illegal length:address=0x766DE000, length=0xFF, cmd=X.
3w-9xxx: scsi1: ERROR: (0x03:0x0104): SGL entry has illegal length:address=0x7C31B000, length=0xFF, cmd=X.
eth0: no IPv6 routers present
3w-9xxx: scsi0: ERROR: (0x03:0x0104): SGL entry has illegal length:address=0xDC53A000, length=0xFF, cmd=X.
subfs 0.9
end_request: I/O error, dev fd0, sector 0

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]


* Re: [suse-amd64] False "lost ticks" on dual-Opteron system (=> timer twice as fast)
  2005-05-09 10:53   ` Bernd Paysan
@ 2005-05-09 13:17     ` Bernd Paysan
  2005-05-10 10:53       ` Ed Tomlinson
  2005-05-10 11:12       ` Andi Kleen
  0 siblings, 2 replies; 19+ messages in thread
From: Bernd Paysan @ 2005-05-09 13:17 UTC (permalink / raw)
  To: suse-amd64; +Cc: Andi Kleen, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1275 bytes --]

On Monday 09 May 2005 12:53, Bernd Paysan wrote:
> On Sunday 08 May 2005 15:40, Andi Kleen wrote:
> > Your system should be using the HPET timer to work exactly around
> > this. AMD 8000 has HPET. Can you post a boot.log?
>
> Ok, boot.log attached. The only entry with hpet seems to indicate some
> problems.

I went through the BIOS setup, and found a disabled feature "ACPI 2.0", 
which I enabled. Now Linux finds the HPET timer.

# grep -i hpet boot.log 
ACPI: HPET (v001 A M I  OEMHPET  0x04000518 MSFT 0x00000097) @ 
0x00000000e8ff3c30
ACPI: HPET id: 0x102282a0 base: 0xfec01000
time.c: Using 14.318180 MHz HPET timer.
time.c: Using HPET based timekeeping.
hpet0: at MMIO 0xfec01000, IRQs 2, 8, 0
hpet0: 69ns tick, 3 32-bit timers
hpet_acpi_add: no address or irqs in _CRS

and everything appears to work (though there's still no designated CPU to 
handle the timer interrupts). xntpd syncs quickly, I'm happy (so far ;-).
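
The two HPET lines in the log agree with each other: a 14.318180 MHz
counter has a period of just under 70 ns, which matches the "69ns tick"
reported by hpet0 if the value is truncated rather than rounded (an
assumption, the thread does not show the driver code). A minimal sketch
of the arithmetic in Python, with a hypothetical helper name:

```python
def tick_period_ns(freq_mhz: float) -> float:
    """Period of one counter tick in nanoseconds for a clock given in MHz."""
    return 1_000.0 / freq_mhz


# Frequency taken from the "time.c: Using 14.318180 MHz HPET timer." line.
period = tick_period_ns(14.318180)
print(f"{period:.2f} ns per tick, truncated to {int(period)} ns")
```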

So that explains why nobody sees this problem. But the TSC-based fallback 
timekeeping is still broken on SMP systems with PowerNow and distributed 
IRQ handling, which both together seem to be rare enough ;-).
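
The suspected failure mode can be illustrated with a toy model. This is
a sketch of the hypothesis from the start of this thread (per-CPU "last
TSC" bookkeeping feeding one shared jiffies counter), not the kernel's
actual timer_interrupt code; the function name, the parameters and the
assumption of perfectly synchronised TSCs are all invented for
illustration:

```python
HZ = 1000  # timer interrupts per second, as discussed in the thread


def simulate(seconds: int, switch_every: int,
             cycles_per_tick: int = 2_000_000) -> int:
    """Return the jiffies counted after `seconds` of real time.

    Toy model: each CPU remembers the TSC of the last timer interrupt
    *it* handled and credits the whole gap since then to the shared
    jiffies counter (one real tick plus any "lost" ticks).
    The interrupt moves to the other CPU every `switch_every` seconds.
    """
    last_tsc = [0, 0]      # per-CPU bookkeeping (the hypothesised flaw)
    jiffies = 0
    for tick in range(1, seconds * HZ + 1):
        tsc = tick * cycles_per_tick                 # TSCs assumed in sync
        cpu = ((tick - 1) // (switch_every * HZ)) % 2
        jiffies += (tsc - last_tsc[cpu]) // cycles_per_tick
        last_tsc[cpu] = tsc
    return jiffies


print(simulate(60, 60))   # interrupt stays on one CPU
print(simulate(60, 10))   # interrupt moves to the other CPU every 10 s
```

Under these assumptions the run without handovers counts exactly one
jiffy per tick, while the run with a handover every 10 seconds counts
each idle period a second time at the handover and ends up close to
twice the real tick count.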

-- 
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]


* Re: [suse-amd64] False "lost ticks" on dual-Opteron system (=> timer twice as fast)
  2005-05-09 13:17     ` Bernd Paysan
@ 2005-05-10 10:53       ` Ed Tomlinson
  2005-05-10 13:32         ` Andi Kleen
  2005-05-10 11:12       ` Andi Kleen
  1 sibling, 1 reply; 19+ messages in thread
From: Ed Tomlinson @ 2005-05-10 10:53 UTC (permalink / raw)
  To: Bernd Paysan; +Cc: suse-amd64, Andi Kleen, linux-kernel

On Monday 09 May 2005 09:17, Bernd Paysan wrote:
> On Monday 09 May 2005 12:53, Bernd Paysan wrote:
> > On Sunday 08 May 2005 15:40, Andi Kleen wrote:
> > > Your system should be using the HPET timer to work exactly around
> > > this. AMD 8000 has HPET. Can you post a boot.log?
> >
> > Ok, boot.log attached. The only entry with hpet seems to indicate some
> > problems.
> 
> I went through the BIOS setup, and found a disabled feature "ACPI 2.0", 
> which I enabled. Now Linux finds the HPET timer.
> 
> # grep -i hpet boot.log 
> ACPI: HPET (v001 A M I  OEMHPET  0x04000518 MSFT 0x00000097) @ 
> 0x00000000e8ff3c30
> ACPI: HPET id: 0x102282a0 base: 0xfec01000
> time.c: Using 14.318180 MHz HPET timer.
> time.c: Using HPET based timekeeping.
> hpet0: at MMIO 0xfec01000, IRQs 2, 8, 0
> hpet0: 69ns tick, 3 32-bit timers
> hpet_acpi_add: no address or irqs in _CRS
> 
> and everything appears to work (though there's still no designated CPU to 
> handle the timer interrupts). xntpd syncs quickly, I'm happy (so far ;-).
> 
> So that explains why nobody sees this problem. But the TSC-based fallback 
> timekeeping is still broken on SMP systems with PowerNow and distributed 
> IRQ handling, which both together seem to be rare enough ;-).

Maybe on UP too:

May  8 18:50:25 grover kernel: [143640.507170] Attached scsi removable disk sda at scsi5, channel 0, id 0, lun 0
May  8 18:50:25 grover kernel: [143640.520422] rtc: lost some interrupts at 1024Hz.
May  8 18:50:25 grover kernel: [143640.554134] Attached scsi generic sg0 at scsi5, channel 0, id 0, lun 0,  type 0
May  8 18:50:25 grover kernel: [143640.567693] rtc: lost some interrupts at 1024Hz.
May  8 18:50:25 grover kernel: [143640.596402] usb-storage: device scan complete

This from 12-rc3-mm3, UP x86_64 with powernow active.  

It might be that the message is OK here since I do not see a fast clock.  I mention this
since powernow is active here.

Should HPET be available in UP?

Ed Tomlinson



* Re: [suse-amd64] False "lost ticks" on dual-Opteron system (=> timer twice as fast)
  2005-05-09 13:17     ` Bernd Paysan
  2005-05-10 10:53       ` Ed Tomlinson
@ 2005-05-10 11:12       ` Andi Kleen
  2005-05-10 11:36         ` Bernd Paysan
  2005-05-10 11:54         ` Bernd Paysan
  1 sibling, 2 replies; 19+ messages in thread
From: Andi Kleen @ 2005-05-10 11:12 UTC (permalink / raw)
  To: Bernd Paysan; +Cc: suse-amd64, Andi Kleen, linux-kernel

> I went through the BIOS setup, and found a disabled feature "ACPI 2.0", 
> which I enabled. Now Linux finds the HPET timer.

Great. The machine came like this? I wish OEMs wouldn't do such things...

> 
> # grep -i hpet boot.log 
> ACPI: HPET (v001 A M I  OEMHPET  0x04000518 MSFT 0x00000097) @ 
> 0x00000000e8ff3c30
> ACPI: HPET id: 0x102282a0 base: 0xfec01000
> time.c: Using 14.318180 MHz HPET timer.
> time.c: Using HPET based timekeeping.
> hpet0: at MMIO 0xfec01000, IRQs 2, 8, 0
> hpet0: 69ns tick, 3 32-bit timers
> hpet_acpi_add: no address or irqs in _CRS
> 
> and everything appears to work (though there's still no designated CPU to 
> handle the timer interrupts). xntpd syncs quickly, I'm happy (so far ;-).

Great.

> 
> So that explains why nobody sees this problem. But the TSC-based fallback 
> timekeeping is still broken on SMP systems with PowerNow and distributed 
> IRQ handling, which both together seem to be rare enough ;-).

There is a patch pending for the TSC problem - using the pmtimer instead
in this case.

But the distributed timer interrupt problem is weird. It should not happen.
You sure it was IRQ 0 that was duplicated and not "LOC" ?

When you watch -n1 cat /proc/interrupts does the rate roughly match
up to 1000Hz?


-Andi



* Re: [suse-amd64] False "lost ticks" on dual-Opteron system (=> timer twice as fast)
  2005-05-10 11:12       ` Andi Kleen
@ 2005-05-10 11:36         ` Bernd Paysan
  2005-05-10 11:54         ` Bernd Paysan
  1 sibling, 0 replies; 19+ messages in thread
From: Bernd Paysan @ 2005-05-10 11:36 UTC (permalink / raw)
  To: Andi Kleen; +Cc: suse-amd64, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1965 bytes --]

On Tuesday 10 May 2005 13:12, Andi Kleen wrote:
> > So that explains why nobody sees this problem. But the TSC-based
> > fallback timekeeping is still broken on SMP systems with PowerNow and
> > distributed IRQ handling, which both together seem to be rare enough
> > ;-).
>
> There is a patch pending for the TSC problem - using the pmtimer instead
> in this case.
>
> But the distributed timer interrupt problem is weird. It should not
> happen. You sure it was IRQ 0 that was duplicated and not "LOC" ?

Yes. Only one CPU actually gets and handles the timer interrupt, but which 
one is somewhat random (for about 10 seconds, it's the same CPU, then it 
switches over).

> When you watch -n1 cat /proc/interrupts does the rate roughly match
> up to 1000Hz?

Yes, and this is confirmed over longer time:

# grep timer /proc/interrupts; uptime
  0:   40347440   40582285    IO-APIC-edge  timer
  1:26pm  up  22:28,  1 user,  load average: 0.00, 0.01, 0.04
# echo $[(3600*22+28*60)*1000] $[40347440+40582285]
80880000 80929725

Given that uptime is only accurate to the minute, this sounds very 
reasonable. The distribution also is close to 50:50. That's (almost) true 
for all interrupt sources:

 # cat /proc/interrupts 
           CPU0       CPU1       
  0:   40523846   40753939    IO-APIC-edge  timer
  1:          3        189    IO-APIC-edge  i8042
  8:        261        280    IO-APIC-edge  rtc
  9:          0          0   IO-APIC-level  acpi
 15:     364369     364479    IO-APIC-edge  ide1
169:      59195      55498   IO-APIC-level  3w-9xxx
177:     618198     604643   IO-APIC-level  3w-9xxx
185:    8195891    8147619   IO-APIC-level  aic79xx, eth1
193:          0         30   IO-APIC-level  aic79xx
201:          0          0   IO-APIC-level  ohci_hcd, ohci_hcd
NMI:       1184       1013 
LOC:   81273966   81271958 
ERR:          0
MIS:          0
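
The same check can be scripted. A minimal sketch in Python, assuming the
/proc/interrupts layout shown above (IRQ label, one count per CPU, then
controller and device name); the function names are made up for
illustration:

```python
def irq_counts(text: str, irq: str) -> list[int]:
    """Return the per-CPU counts for one IRQ line of /proc/interrupts."""
    for line in text.splitlines():
        label, _, rest = line.partition(":")
        if label.strip() == irq:
            counts = []
            for field in rest.split():
                if not field.isdigit():
                    break          # first non-numeric field ends the counts
                counts.append(int(field))
            return counts
    raise KeyError(irq)


def rate_ratio(counts: list[int], uptime_s: int, hz: int = 1000) -> float:
    """Observed interrupts divided by the number expected at `hz`."""
    return sum(counts) / (uptime_s * hz)


# Timer line and uptime (22:28) copied from the output quoted above.
sample = """\
           CPU0       CPU1
  0:   40347440   40582285    IO-APIC-edge  timer
"""
counts = irq_counts(sample, "0")
ratio = rate_ratio(counts, 22 * 3600 + 28 * 60)
print(counts, round(ratio, 4))
```

A ratio close to 1.0 suggests the timer interrupt itself arrives at the
expected rate, so a fast clock would have to come from the tick
accounting rather than from extra interrupts.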

-- 
Bernd Paysan
http://www.mikron.de/

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]


* Re: [suse-amd64] False "lost ticks" on dual-Opteron system (=> timer twice as fast)
  2005-05-10 11:12       ` Andi Kleen
  2005-05-10 11:36         ` Bernd Paysan
@ 2005-05-10 11:54         ` Bernd Paysan
  2005-05-10 13:07           ` Andi Kleen
  1 sibling, 1 reply; 19+ messages in thread
From: Bernd Paysan @ 2005-05-10 11:54 UTC (permalink / raw)
  To: Andi Kleen; +Cc: suse-amd64, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2029 bytes --]

On Tuesday 10 May 2005 13:12, Andi Kleen wrote:
> > So that explains why nobody sees this problem. But the TSC-based
> > fallback timekeeping is still broken on SMP systems with PowerNow and
> > distributed IRQ handling, which both together seem to be rare enough
> > ;-).
>
> There is a patch pending for the TSC problem - using the pmtimer instead
> in this case.
>
> But the distributed timer interrupt problem is weird. It should not
> happen. You sure it was IRQ 0 that was duplicated and not "LOC" ?

Yes. Only one CPU actually gets and handles the timer interrupt, but which 
one is somewhat random (for about 10 seconds, it's the same CPU, then it 
switches over).

> When you watch -n1 cat /proc/interrupts does the rate roughly match
> up to 1000Hz?

Yes, and this is confirmed over longer time:

# grep timer /proc/interrupts; uptime
  0:   40347440   40582285    IO-APIC-edge  timer
  1:26pm  up  22:28,  1 user,  load average: 0.00, 0.01, 0.04
# echo $[(3600*22+28*60)*1000] $[40347440+40582285]
80880000 80929725

Given that uptime is only accurate to the minute, this sounds very 
reasonable. The distribution also is close to 50:50. That's (almost) true 
for all interrupt sources:

 # cat /proc/interrupts 
           CPU0       CPU1       
  0:   40523846   40753939    IO-APIC-edge  timer
  1:          3        189    IO-APIC-edge  i8042
  8:        261        280    IO-APIC-edge  rtc
  9:          0          0   IO-APIC-level  acpi
 15:     364369     364479    IO-APIC-edge  ide1
169:      59195      55498   IO-APIC-level  3w-9xxx
177:     618198     604643   IO-APIC-level  3w-9xxx
185:    8195891    8147619   IO-APIC-level  aic79xx, eth1
193:          0         30   IO-APIC-level  aic79xx
201:          0          0   IO-APIC-level  ohci_hcd, ohci_hcd
NMI:       1184       1013 
LOC:   81273966   81271958 
ERR:          0
MIS:          0

-- 
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]


* Re: [suse-amd64] False "lost ticks" on dual-Opteron system (=> timer twice as fast)
  2005-05-10 11:54         ` Bernd Paysan
@ 2005-05-10 13:07           ` Andi Kleen
  2005-05-10 13:15             ` Bernd Paysan
  0 siblings, 1 reply; 19+ messages in thread
From: Andi Kleen @ 2005-05-10 13:07 UTC (permalink / raw)
  To: Bernd Paysan; +Cc: Andi Kleen, suse-amd64, linux-kernel

On Tue, May 10, 2005 at 01:54:53PM +0200, Bernd Paysan wrote:
> On Tuesday 10 May 2005 13:12, Andi Kleen wrote:
> > > So that explains why nobody sees this problem. But the TSC-based
> > > fallback timekeeping is still broken on SMP systems with PowerNow and
> > > distributed IRQ handling, which both together seem to be rare enough
> > > ;-).
> >
> > There is a patch pending for the TSC problem - using the pmtimer instead
> > in this case.
> >
> > But the distributed timer interrupt problem is weird. It should not
> > happen. You sure it was IRQ 0 that was duplicated and not "LOC" ?
> 
> Yes. Only one CPU actually gets and handles the timer interrupt, but which 
> one is somewhat random (for about 10 seconds, it's the same CPU, then it 
> switches over).

That could be irqbalance doing its thing. Does it go away when
you stop it?

-Andi



* Re: [suse-amd64] False "lost ticks" on dual-Opteron system (=> timer twice as fast)
  2005-05-10 13:07           ` Andi Kleen
@ 2005-05-10 13:15             ` Bernd Paysan
  2005-05-10 13:21               ` Andi Kleen
  0 siblings, 1 reply; 19+ messages in thread
From: Bernd Paysan @ 2005-05-10 13:15 UTC (permalink / raw)
  To: Andi Kleen; +Cc: suse-amd64, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 268 bytes --]

On Tuesday 10 May 2005 15:07, Andi Kleen wrote:
> That could be irqbalance doing its thing. Does it go away when
> you stop it?

Yes, it seems to go away.

-- 
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]


* Re: [suse-amd64] False "lost ticks" on dual-Opteron system (=> timer twice as fast)
  2005-05-10 13:15             ` Bernd Paysan
@ 2005-05-10 13:21               ` Andi Kleen
  2005-05-10 13:39                 ` Arjan van de Ven
  0 siblings, 1 reply; 19+ messages in thread
From: Andi Kleen @ 2005-05-10 13:21 UTC (permalink / raw)
  To: Bernd Paysan; +Cc: Andi Kleen, suse-amd64, linux-kernel

On Tue, May 10, 2005 at 03:15:44PM +0200, Bernd Paysan wrote:
> On Tuesday 10 May 2005 15:07, Andi Kleen wrote:
> > That could be irqbalance doing its thing. Does it go away when
> > you stop it?
> 
> Yes, it seems to go away.

Ok, it is fine then. A bit unexpected, but fine.

-Andi


* Re: [suse-amd64] False "lost ticks" on dual-Opteron system (=> timer twice as fast)
  2005-05-10 10:53       ` Ed Tomlinson
@ 2005-05-10 13:32         ` Andi Kleen
  0 siblings, 0 replies; 19+ messages in thread
From: Andi Kleen @ 2005-05-10 13:32 UTC (permalink / raw)
  To: Ed Tomlinson; +Cc: Bernd Paysan, suse-amd64, Andi Kleen, linux-kernel

On Tue, May 10, 2005 at 06:53:49AM -0400, Ed Tomlinson wrote:
> Maybe on UP too:
> 
> May  8 18:50:25 grover kernel: [143640.507170] Attached scsi removable disk sda at scsi5, channel 0, id 0, lun 0
> May  8 18:50:25 grover kernel: [143640.520422] rtc: lost some interrupts at 1024Hz.
> May  8 18:50:25 grover kernel: [143640.554134] Attached scsi generic sg0 at scsi5, channel 0, id 0, lun 0,  type 0
> May  8 18:50:25 grover kernel: [143640.567693] rtc: lost some interrupts at 1024Hz.
> May  8 18:50:25 grover kernel: [143640.596402] usb-storage: device scan complete
> 
> This from 12-rc3-mm3, UP x86_64 with powernow active.  

More likely it is some driver hogging interrupts (turning them off
for too long). You can boot with report_lost_ticks, then the normal
timer interrupt will complain. Note it always triggers at boot a few
times.

When you find the culprit, report it to the driver maintainer.

> 
> It might be that the message is OK here since I do not see a fast clock.  I mention this
> since powernow is active here.
> 
> Should HPET be available in UP?

HPET is used on UP only as the backing clock and to run the timer interrupt,
but not for gettimeofday because using the TSC is considerably
faster there. Actually the strategy is a bit more complicated;
it also depends on some other things.

-Andi


* Re: [suse-amd64] False "lost ticks" on dual-Opteron system (=> timer twice as fast)
  2005-05-10 13:21               ` Andi Kleen
@ 2005-05-10 13:39                 ` Arjan van de Ven
  0 siblings, 0 replies; 19+ messages in thread
From: Arjan van de Ven @ 2005-05-10 13:39 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Bernd Paysan, suse-amd64, linux-kernel

On Tue, 2005-05-10 at 15:21 +0200, Andi Kleen wrote:
> On Tue, May 10, 2005 at 03:15:44PM +0200, Bernd Paysan wrote:
> > On Tuesday 10 May 2005 15:07, Andi Kleen wrote:
> > > That could be irqbalance doing its thing. Does it go away when
> > > you stop it?
> > 
> > Yes, it seems to go away.
> 
> Ok, it is fine then. A bit unexpected, but fine.

irqbalance nowadays rotates the timer interrupt every 10 seconds after
some people complained that having it always on the same cpu penalized
their HPC apps unbalanced. (they glued those tasks to each cpu). It's
not a big deal (the irq rate isn't that high) and it does make things
slightly more fair in the HPC case.
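
The rotation can be observed by comparing two snapshots of the timer
line taken a moment apart: whichever CPU's counter advanced is the one
currently receiving IRQ 0. A minimal sketch in Python; the helper name
is hypothetical and the second snapshot is invented for illustration:

```python
def active_cpus(before: list[int], after: list[int]) -> list[int]:
    """Indices of CPUs whose interrupt count grew between two snapshots."""
    return [cpu for cpu, (b, a) in enumerate(zip(before, after)) if a > b]


# First snapshot: IRQ 0 counts quoted earlier in the thread.
# Second snapshot: made up, one second later at HZ=1000, all on CPU1.
print(active_cpus([40347440, 40582285], [40347440, 40583285]))
```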




* Re: False "lost ticks" on dual-Opteron system (=> timer twice as fast)
  2005-05-08 12:45 False "lost ticks" on dual-Opteron system (=> timer twice as fast) Bernd Paysan
  2005-05-08 13:40 ` [suse-amd64] " Andi Kleen
@ 2005-05-21 19:42 ` Hendrik Visage
  2005-05-21 20:54   ` Scott Robert Ladd
       [not found]   ` <428F9FA6.1000800@coyotegulch.com>
  1 sibling, 2 replies; 19+ messages in thread
From: Hendrik Visage @ 2005-05-21 19:42 UTC (permalink / raw)
  To: Bernd Paysan; +Cc: suse-amd64, linux-kernel

On 5/8/05, Bernd Paysan <bernd.paysan@gmx.de> wrote:
> Hi,
> 
> I've recently set up a dual Opteron RAID server (AMD-8000-based Tyan
> Thunder K8S Pro SCSI board, 2 246 Opterons, stepping 10). Kernel is a
> modified 2.6.11.4-20a from SuSE 9.3 (SMP version, sure). The Opterons
> are capable of changing the CPU frequency (between 1GHz and 2GHz).

I'll be delving deeper into this thread soon, but I'm seeing similar
strangeness on an Athlon64 (rated:3G+ real:2009MHz clock), 2.6.11-r8
(gentoo), MSI K8N Neo Platinum.

ntp syncs time, then I start a couple of compiles, and I see ntp
losing track of time, big jitter etc. (and the one time source is on
the local LAN, syncing to the same remote servers). I noticed it with
openntp also.

What I have noticed in my dmesg output is that I see "lost timer ticks
CPU Frequency change?" messages very early in the boot up.

> What I can't believe is that I'm the only one who has this problem.

I've seen this for about a week or three, and somehow I believe it
wasn't a problem before 2.6.11.

-- 
Hendrik Visage


* Re: False "lost ticks" on dual-Opteron system (=> timer twice as fast)
  2005-05-21 19:42 ` Hendrik Visage
@ 2005-05-21 20:54   ` Scott Robert Ladd
       [not found]   ` <428F9FA6.1000800@coyotegulch.com>
  1 sibling, 0 replies; 19+ messages in thread
From: Scott Robert Ladd @ 2005-05-21 20:54 UTC (permalink / raw)
  To: Linux Kernel Mailing List

Bernd Paysan <bernd.paysan@gmx.de> wrote:
>>I've recently set up a dual Opteron RAID server (AMD-8000-based Tyan
>>Thunder K8S Pro SCSI board, 2 246 Opterons, stepping 10). Kernel is a
>>modified 2.6.11.4-20a from SuSE 9.3 (SMP version, sure). The Opterons
>>are capable of changing the CPU frequency (between 1GHz and 2GHz).
>>
>>What I can't believe is that I'm the only one who has this problem.

Hendrik Visage wrote:
> I'll be delving deeper into this thread soon, but I'm seeing similar
> strangeness on an Athlon64 (rated:3G+ real:2009MHz clock), 2.6.11-r8
> (gentoo), MSI K8N Neo Platinum.
> 
> ntp syncs time, then I start a couple of compiles, and I see ntp
> losing track of time, big jitter etc. (and the one time source is on
> the local LAN, syncing to the same remote servers). I noticed it with
> openntp also.
> 
> What I have noticed in my dmesg output is that I see "lost timer ticks
> CPU Frequency change?" messages very early in the boot up.
> 
> I've seen this for about a week or three, and somehow I believe it
> wasn't a problem before 2.6.11.

I *don't* have any timer problems running 2.6.11-r8 (gentoo) on a dual
Opteron 250 system using a Tyan K8W 2885. Perhaps the problem is that
the two of you are running SCSI main drives, and I'm not?

..Scott



* Re: False "lost ticks" on dual-Opteron system (=> timer twice as fast)
       [not found]     ` <d93f04c70505211500216d8614@mail.gmail.com>
@ 2005-05-23 11:50       ` Scott Robert Ladd
  2005-05-23 23:04         ` Hendrik Visage
  0 siblings, 1 reply; 19+ messages in thread
From: Scott Robert Ladd @ 2005-05-23 11:50 UTC (permalink / raw)
  To: Linux Kernel Mailing List

> On 5/21/05, Scott Robert Ladd <lkml@coyotegulch.com> wrote:
>>I *don't* have any timer problems running 2.6.11-r8 (gentoo) on a dual
>>Opteron 250 system using a Tyan K8W 2885. Perhaps the problem is that
>>the two of you are running SCSI main drives, and I'm not?

Hendrik Visage wrote:
> IDE/PATA, SATA (libsata/SCSI) and external USB/FireWire adapters

Hi,

You can find my .config at:

    http://www.coyotegulch.com/distfiles/corwin.config

Perhaps a comparison of my .config to yours might identify the source of
your problem.

..Scott



* Re: False "lost ticks" on dual-Opteron system (=> timer twice as fast)
  2005-05-23 11:50       ` Scott Robert Ladd
@ 2005-05-23 23:04         ` Hendrik Visage
  2005-05-25 17:06           ` Andi Kleen
  0 siblings, 1 reply; 19+ messages in thread
From: Hendrik Visage @ 2005-05-23 23:04 UTC (permalink / raw)
  To: Scott Robert Ladd; +Cc: Linux Kernel Mailing List

On 5/23/05, Scott Robert Ladd <lkml@coyotegulch.com> wrote:

> 
> Perhaps a comparison of my .config to yours might identify the source of
> your problem.

I think I've isolated the problem child: MSI's Dynamic Overclocking features...

The MSI "Core Cell" chipset will modify the CPU frequency within
certain parameters as the CPU needs more oomph (like a kernel
recompile :^P ), but will drop it once it heats up or things become
"unstable" or the CPU is getting idle again.

What I still need to ascertain is whether the MSI K8N Neo Platinum
(nForce2 250) *does* have an HPET implementation or not, as my timer.c
only reports a PIT timer :(

-- 
Hendrik Visage


* Re: False "lost ticks" on dual-Opteron system (=> timer twice as fast)
  2005-05-23 23:04         ` Hendrik Visage
@ 2005-05-25 17:06           ` Andi Kleen
  0 siblings, 0 replies; 19+ messages in thread
From: Andi Kleen @ 2005-05-25 17:06 UTC (permalink / raw)
  To: Hendrik Visage; +Cc: Linux Kernel Mailing List

Hendrik Visage <hvjunk@gmail.com> writes:

> What I still need to ascertain is whether the MSI K8N Neo Platinum
> (nForce2 250) *does* have an HPET implementation or not, as my timer.c
> only reports a PIT timer :(

Nvidia never reports HPET in their BIOS, so the kernel cannot use it.
There are rumours it is actually in the hardware though, but it would
need magic code to enable.

-Andi


end of thread, other threads:[~2005-05-25 17:08 UTC | newest]

Thread overview: 19+ messages
2005-05-08 12:45 False "lost ticks" on dual-Opteron system (=> timer twice as fast) Bernd Paysan
2005-05-08 13:40 ` [suse-amd64] " Andi Kleen
2005-05-08 16:22   ` Bernd Paysan
2005-05-09 10:53   ` Bernd Paysan
2005-05-09 13:17     ` Bernd Paysan
2005-05-10 10:53       ` Ed Tomlinson
2005-05-10 13:32         ` Andi Kleen
2005-05-10 11:12       ` Andi Kleen
2005-05-10 11:36         ` Bernd Paysan
2005-05-10 11:54         ` Bernd Paysan
2005-05-10 13:07           ` Andi Kleen
2005-05-10 13:15             ` Bernd Paysan
2005-05-10 13:21               ` Andi Kleen
2005-05-10 13:39                 ` Arjan van de Ven
2005-05-21 19:42 ` Hendrik Visage
2005-05-21 20:54   ` Scott Robert Ladd
     [not found]   ` <428F9FA6.1000800@coyotegulch.com>
     [not found]     ` <d93f04c70505211500216d8614@mail.gmail.com>
2005-05-23 11:50       ` Scott Robert Ladd
2005-05-23 23:04         ` Hendrik Visage
2005-05-25 17:06           ` Andi Kleen
