* Xen 3.2.2 - Timer ISR/0: Time went backwards @ 2008-08-04 18:51 Christopher S. Aker 2008-08-04 18:59 ` Keir Fraser 2008-08-05 7:03 ` Jan Beulich 0 siblings, 2 replies; 9+ messages in thread From: Christopher S. Aker @ 2008-08-04 18:51 UTC (permalink / raw) To: xen devel Hardware: Xen: 3.2.1-rc2 64bit dom0: 2.6.18.8 at changeset 622, PAE # xm dmesg | grep -e sync -e timer (XEN) checking TSC synchronization across 8 CPUs: passed. (XEN) Platform timer overflows in 234 jiffies. (XEN) Platform timer is 3.579MHz ACPI PM Timer (XEN) Machine check exception polling timer started. Spools one of these to console every few seconds: Timer ISR/0: Time went backwards: delta=-4270576170971 delta_cpu=254829029 shadow=2037844042151244163 off=261710497 processed=2037848312989081849 cpu_processed=2037844042158081849 0: 2037844042158081849 1: 2037828468354081849 2: 2037848312989081849 3: 2037837726866081849 4: 2037842059197081849 5: 2037840075526081849 6: 2037845844663081849 7: 2037841593777081849 A few t's into Xen's console: (XEN) *** Serial input -> Xen (type 'CTRL-a' three times to switch input to DOM0) (XEN) Min = 2037829427350793281 ; Max = 2037848310626701146 ; Diff = 18883275907865 (18883275907 microseconds) (XEN) Min = 2037829428349256182 ; Max = 2037848311625163843 ; Diff = 18883275907661 (18883275907 microseconds) (XEN) Min = 2037829428565188930 ; Max = 2037848311841096807 ; Diff = 18883275907877 (18883275907 microseconds) This particular box does this with 3.2.0 - 3.2.2-rc2. I have another box doing the same thing, except the delta is more sane (0 - 2 microseconds), however eventually dom0 freezes. -Chris ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Xen 3.2.2 - Timer ISR/0: Time went backwards 2008-08-04 18:51 Xen 3.2.2 - Timer ISR/0: Time went backwards Christopher S. Aker @ 2008-08-04 18:59 ` Keir Fraser 2008-08-04 19:18 ` Christopher S. Aker 2008-08-05 7:03 ` Jan Beulich 1 sibling, 1 reply; 9+ messages in thread From: Keir Fraser @ 2008-08-04 18:59 UTC (permalink / raw) To: Christopher S. Aker, xen devel What sort of hardware? -- Keir On 4/8/08 19:51, "Christopher S. Aker" <caker@theshore.net> wrote: > Hardware: > Xen: 3.2.1-rc2 64bit > dom0: 2.6.18.8 at changeset 622, PAE > > # xm dmesg | grep -e sync -e timer > (XEN) checking TSC synchronization across 8 CPUs: passed. > (XEN) Platform timer overflows in 234 jiffies. > (XEN) Platform timer is 3.579MHz ACPI PM Timer > (XEN) Machine check exception polling timer started. > > Spools one of these to console every few seconds: > > Timer ISR/0: Time went backwards: delta=-4270576170971 > delta_cpu=254829029 shadow=2037844042151244163 off=261710497 > processed=2037848312989081849 cpu_processed=2037844042158081849 > 0: 2037844042158081849 > 1: 2037828468354081849 > 2: 2037848312989081849 > 3: 2037837726866081849 > 4: 2037842059197081849 > 5: 2037840075526081849 > 6: 2037845844663081849 > 7: 2037841593777081849 > > A few t's into Xen's console: > > (XEN) *** Serial input -> Xen (type 'CTRL-a' three times to switch input > to DOM0) > (XEN) Min = 2037829427350793281 ; Max = 2037848310626701146 ; Diff = > 18883275907865 (18883275907 microseconds) > (XEN) Min = 2037829428349256182 ; Max = 2037848311625163843 ; Diff = > 18883275907661 (18883275907 microseconds) > (XEN) Min = 2037829428565188930 ; Max = 2037848311841096807 ; Diff = > 18883275907877 (18883275907 microseconds) > > This particular box does this with 3.2.0 - 3.2.2-rc2. I have another > box doing the same thing, except the delta is more sane (0 - 2 > microseconds), however eventually dom0 freezes. > > -Chris > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Xen 3.2.2 - Timer ISR/0: Time went backwards 2008-08-04 18:59 ` Keir Fraser @ 2008-08-04 19:18 ` Christopher S. Aker 0 siblings, 0 replies; 9+ messages in thread From: Christopher S. Aker @ 2008-08-04 19:18 UTC (permalink / raw) To: Keir Fraser; +Cc: xen devel Keir Fraser wrote: > What sort of hardware? http://www.supermicro.com/products/system/1U/6015/SYS-6015B-T+.cfm processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 23 model name : Intel(R) Xeon(R) CPU L5420 @ 2.50GHz stepping : 6 cpu MHz : 2500.088 cache size : 6144 KB fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 10 wp : yes flags : fpu de tsc msr pae cx8 apic sep mtrr cmov pat clflush acpi mmx fxsr sse sse2 ss ht nx constant_tsc pni bogomips : 5003.94 ^--- 8x # xm dmesg __ __ _____ ____ ____ ____ \ \/ /___ _ __ |___ / |___ \ |___ \ _ __ ___|___ \ _ __ _ __ ___ \ // _ \ \047_ \ |_ \ __) | __) |__| \047__/ __| __) |__| \047_ \| \047__/ _ \ / \ __/ | | | ___) | / __/ _ / __/|__| | | (__ / __/|__| |_) | | | __/ /_/\_\___|_| |_| |____(_)_____(_)_____| |_| \___|_____| | .__/|_| \___| |_| (XEN) Xen version 3.2.2-rc2-pre (...) (gcc version 4.0.3 (Ubuntu 4.0.3-1ubuntu5)) Thu Jul 31 14:47:33 EDT 2008 (XEN) Latest ChangeSet: unavailable (XEN) Command line: dom0_mem=307200 com1=115200,8n1 console=com1 (XEN) Video information: (XEN) VGA is text mode 80x25, font 8x16 (XEN) VBE/DDC methods: none; EDID transfer time: 2 seconds (XEN) EDID info not retrieved because no DDC retrieval method detected (XEN) Disc information: (XEN) Found 1 MBR signatures (XEN) Found 1 EDD information structures (XEN) Xen-e820 RAM map: (XEN) 0000000000000000 - 000000000009d400 (usable) (XEN) 000000000009d400 - 00000000000a0000 (reserved) (XEN) 00000000000e4000 - 0000000000100000 (reserved) (XEN) 0000000000100000 - 00000000cff60000 (usable) (XEN) 00000000cff60000 - 00000000cff69000 (ACPI data) (XEN) 00000000cff69000 - 00000000cff80000 (ACPI NVS) (XEN) 00000000cff80000 - 00000000d0000000 (reserved) (XEN) 00000000e0000000 - 00000000f0000000 (reserved) (XEN) 00000000fec00000 - 00000000fec10000 (reserved) (XEN) 00000000fee00000 - 00000000fee01000 (reserved) (XEN) 00000000ff000000 - 0000000100000000 (reserved) (XEN) 0000000100000000 - 0000000630000000 (usable) (XEN) System RAM: 24574MB (25164788kB) (XEN) ACPI: RSDP (v000 PTLTD ) @ 0x00000000000f5e60 (XEN) ACPI: RSDT (v001 PTLTD RSDT 0x06040000 LTP 0x00000000) @ 0x00000000cff63321 (XEN) ACPI: FADT (v001 INTEL TUMWATER 0x06040000 PTL 0x00000003) @ 0x00000000cff68e10 (XEN) ACPI: MADT (v001 PTLTD APIC 0x06040000 LTP 0x00000000) @ 0x00000000cff68e84 (XEN) ACPI: MCFG (v001 PTLTD MCFG 0x06040000 LTP 0x00000000) @ 0x00000000cff68f4c (XEN) ACPI: BOOT (v001 PTLTD $SBFTBL$ 0x06040000 LTP 0x00000001) @ 0x00000000cff68f88 (XEN) ACPI: SPCR (v001 PTLTD $UCRTBL$ 0x06040000 PTL 0x00000001) @ 0x00000000cff68fb0 (XEN) ACPI: SSDT (v001 PmRef Cpu0Tst 0x00003000 INTL 0x20050228) @ 0x00000000cff64bf0 (XEN) ACPI: SSDT (v001 PmRef Cpu7Tst 0x00003000 INTL 0x20050228) @ 0x00000000cff64b4a (XEN) ACPI: SSDT (v001 PmRef Cpu6Tst 0x00003000 INTL 0x20050228) @ 0x00000000cff64aa4 (XEN) ACPI: SSDT (v001 PmRef Cpu5Tst 0x00003000 INTL 0x20050228) @ 0x00000000cff649fe (XEN) ACPI: SSDT (v001 PmRef Cpu4Tst 0x00003000 INTL 0x20050228) @ 0x00000000cff64958 (XEN) ACPI: SSDT (v001 PmRef Cpu3Tst 0x00003000 INTL 0x20050228) @ 0x00000000cff648b2 (XEN) ACPI: SSDT (v001 PmRef Cpu2Tst 0x00003000 INTL 0x20050228) @ 0x00000000cff6480c (XEN) ACPI: SSDT (v001 PmRef Cpu1Tst 0x00003000 INTL 0x20050228) @ 0x00000000cff64766 (XEN) ACPI: SSDT (v001 PmRef CpuPm 0x00003000 INTL 0x20050228) @ 0x00000000cff6337d (XEN) ACPI: DSDT (v001 Intel BLAKFORD 0x06040000 MSFT 0x0100000e) @ 0x0000000000000000 (XEN) NUMA turned off (XEN) Faking a node at 0000000000000000-0000000630000000 (XEN) Xen heap: 13MB (14120kB) (XEN) Domain heap initialised: DMA width 32 bits (XEN) found SMP MP-table at 000f5e90 (XEN) DMI present. (XEN) Using APIC driver default (XEN) ACPI: PM-Timer IO Port: 0x1008 (XEN) ACPI: ACPI SLEEP INFO: pm1x_cnt[1004,0], pm1x_evt[1000,0] (XEN) ACPI: wakeup_vec[cff69fcc], vec_size[20] (XEN) ACPI: Local APIC address 0xfee00000 (XEN) ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled) (XEN) Processor #0 7:7 APIC version 20 (XEN) ACPI: LAPIC (acpi_id[0x01] lapic_id[0x04] enabled) (XEN) Processor #4 7:7 APIC version 20 (XEN) ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled) (XEN) Processor #1 7:7 APIC version 20 (XEN) ACPI: LAPIC (acpi_id[0x03] lapic_id[0x05] enabled) (XEN) Processor #5 7:7 APIC version 20 (XEN) ACPI: LAPIC (acpi_id[0x04] lapic_id[0x02] enabled) (XEN) Processor #2 7:7 APIC version 20 (XEN) ACPI: LAPIC (acpi_id[0x05] lapic_id[0x06] enabled) (XEN) Processor #6 7:7 APIC version 20 (XEN) ACPI: LAPIC (acpi_id[0x06] lapic_id[0x03] enabled) (XEN) Processor #3 7:7 APIC version 20 (XEN) ACPI: LAPIC (acpi_id[0x07] lapic_id[0x07] enabled) (XEN) Processor #7 7:7 APIC version 20 (XEN) ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1]) (XEN) ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1]) (XEN) ACPI: LAPIC_NMI (acpi_id[0x02] high edge lint[0x1]) (XEN) ACPI: LAPIC_NMI (acpi_id[0x03] high edge lint[0x1]) (XEN) ACPI: LAPIC_NMI (acpi_id[0x04] high edge lint[0x1]) (XEN) ACPI: LAPIC_NMI (acpi_id[0x05] high edge lint[0x1]) (XEN) ACPI: LAPIC_NMI (acpi_id[0x06] high edge lint[0x1]) (XEN) ACPI: LAPIC_NMI (acpi_id[0x07] high edge lint[0x1]) (XEN) ACPI: IOAPIC (id[0x08] address[0xfec00000] gsi_base[0]) (XEN) IOAPIC[0]: apic_id 8, version 32, address 0xfec00000, GSI 0-23 (XEN) ACPI: IOAPIC (id[0x09] address[0xfec80000] gsi_base[24]) (XEN) IOAPIC[1]: apic_id 9, version 32, address 0xfec80000, GSI 24-47 (XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 high edge) (XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level) (XEN) ACPI: IRQ0 used by override. (XEN) ACPI: IRQ2 used by override. (XEN) ACPI: IRQ9 used by override. (XEN) Enabling APIC mode: Flat. Using 2 I/O APICs (XEN) Using ACPI (MADT) for SMP configuration information (XEN) Using scheduler: SMP Credit Scheduler (credit) (XEN) Initializing CPU#0 (XEN) Detected 2500.174 MHz processor. (XEN) CPU: L1 I cache: 32K, L1 D cache: 32K (XEN) CPU: Physical Processor ID: 0 (XEN) CPU: Processor Core ID: 0 (XEN) HVM: VMX enabled (XEN) CPU0: Intel(R) Xeon(R) CPU L5420 @ 2.50GHz stepping 06 (XEN) Booting processor 1/4 eip 8c000 (XEN) Initializing CPU#1 (XEN) CPU: L1 I cache: 32K, L1 D cache: 32K (XEN) CPU: Physical Processor ID: 1 (XEN) CPU: Processor Core ID: 0 (XEN) CPU1: Intel(R) Xeon(R) CPU L5420 @ 2.50GHz stepping 06 (XEN) Booting processor 2/1 eip 8c000 (XEN) Initializing CPU#2 (XEN) CPU: L1 I cache: 32K, L1 D cache: 32K (XEN) CPU: Physical Processor ID: 0 (XEN) CPU: Processor Core ID: 1 (XEN) CPU2: Intel(R) Xeon(R) CPU L5420 @ 2.50GHz stepping 06 (XEN) Booting processor 3/5 eip 8c000 (XEN) Initializing CPU#3 (XEN) CPU: L1 I cache: 32K, L1 D cache: 32K (XEN) CPU: Physical Processor ID: 1 (XEN) CPU: Processor Core ID: 1 (XEN) CPU3: Intel(R) Xeon(R) CPU L5420 @ 2.50GHz stepping 06 (XEN) Booting processor 4/2 eip 8c000 (XEN) Initializing CPU#4 (XEN) CPU: L1 I cache: 32K, L1 D cache: 32K (XEN) CPU: Physical Processor ID: 0 (XEN) CPU: Processor Core ID: 2 (XEN) CPU4: Intel(R) Xeon(R) CPU L5420 @ 2.50GHz stepping 06 (XEN) Booting processor 5/6 eip 8c000 (XEN) Initializing CPU#5 (XEN) CPU: L1 I cache: 32K, L1 D cache: 32K (XEN) CPU: Physical Processor ID: 1 (XEN) CPU: Processor Core ID: 2 (XEN) CPU5: Intel(R) Xeon(R) CPU L5420 @ 2.50GHz stepping 06 (XEN) Booting processor 6/3 eip 8c000 (XEN) Initializing CPU#6 (XEN) CPU: L1 I cache: 32K, L1 D cache: 32K (XEN) CPU: Physical Processor ID: 0 (XEN) CPU: Processor Core ID: 3 (XEN) CPU6: Intel(R) Xeon(R) CPU L5420 @ 2.50GHz stepping 06 (XEN) Booting processor 7/7 eip 8c000 (XEN) Initializing CPU#7 (XEN) CPU: L1 I cache: 32K, L1 D cache: 32K (XEN) CPU: Physical Processor ID: 1 (XEN) CPU: Processor Core ID: 3 (XEN) CPU7: Intel(R) Xeon(R) CPU L5420 @ 2.50GHz stepping 06 (XEN) Total of 8 processors activated. (XEN) ENABLING IO-APIC IRQs (XEN) -> Using new ACK method (XEN) ..TIMER: vector=0xF0 apic1=0 pin1=2 apic2=-1 pin2=-1 (XEN) checking TSC synchronization across 8 CPUs: passed. (XEN) Platform timer overflows in 234 jiffies. (XEN) Platform timer is 3.579MHz ACPI PM Timer (XEN) Brought up 8 CPUs (XEN) ACPI sleep modes: S3 (XEN) Machine check exception polling timer started. (XEN) xenoprof: Initialization failed. Intel processor model 23 for P6 class family is not supported (XEN) AMD IOMMU: Disabled (XEN) *** LOADING DOMAIN 0 *** (XEN) elf_parse_binary: phdr: paddr=0x100000 memsz=0x42aa28 (XEN) elf_parse_binary: phdr: paddr=0x52b000 memsz=0x12de44 (XEN) elf_parse_binary: memory: 0x100000 -> 0x658e44 (XEN) elf_xen_parse_note: GUEST_OS = "linux" (XEN) elf_xen_parse_note: GUEST_VERSION = "2.6" (XEN) elf_xen_parse_note: XEN_VERSION = "xen-3.0" (XEN) elf_xen_parse_note: VIRT_BASE = 0xc0000000 (XEN) elf_xen_parse_note: PADDR_OFFSET = 0x0 (XEN) elf_xen_parse_note: ENTRY = 0xc0100000 (XEN) elf_xen_parse_note: HYPERCALL_PAGE = 0xc0101000 (XEN) elf_xen_parse_note: HV_START_LOW = 0xf5800000 (XEN) elf_xen_parse_note: FEATURES = "writable_page_tables|writable_descriptor_tables|auto_translated_physmap|pae_pgdir_above_4gb|supervisor_mode_kernel" (XEN) elf_xen_parse_note: PAE_MODE = "yes" (XEN) elf_xen_parse_note: unknown xen elf note (0xd) (XEN) elf_xen_parse_note: LOADER = "generic" (XEN) elf_xen_parse_note: SUSPEND_CANCEL = 0x1 (XEN) elf_xen_parse: using notes from SHT_NOTE section (XEN) elf_xen_addr_calc_check: addresses: (XEN) virt_base = 0xc0000000 (XEN) elf_paddr_offset = 0x0 (XEN) virt_offset = 0xc0000000 (XEN) virt_kstart = 0xc0100000 (XEN) virt_kend = 0xc0658e44 (XEN) virt_entry = 0xc0100000 (XEN) Xen kernel: 64-bit, lsb, compat32 (XEN) Dom0 kernel: 32-bit, PAE, lsb, paddr 0x100000 -> 0x658e44 (XEN) PHYSICAL MEMORY ARRANGEMENT: (XEN) Dom0 alloc.: 000000061b000000->000000061b800000 (74752 pages to be allocated) (XEN) VIRTUAL MEMORY ARRANGEMENT: (XEN) Loaded kernel: 00000000c0100000->00000000c0658e44 (XEN) Init. ramdisk: 00000000c0659000->00000000c0659000 (XEN) Phys-Mach map: 00000000c0659000->00000000c06a4000 (XEN) Start info: 00000000c06a4000->00000000c06a44a4 (XEN) Page tables: 00000000c06a5000->00000000c06af000 (XEN) Boot stack: 00000000c06af000->00000000c06b0000 (XEN) TOTAL: 00000000c0000000->00000000c0800000 (XEN) ENTRY ADDRESS: 00000000c0100000 (XEN) Dom0 has maximum 8 VCPUs (XEN) elf_load_binary: phdr 0 at 0x00000000c0100000 -> 0x00000000c052aa28 (XEN) elf_load_binary: phdr 1 at 0x00000000c052b000 -> 0x00000000c05efbe8 (XEN) Scrubbing Free RAM: ................................................................................................................................................................................................................................................done. (XEN) Xen trace buffers: disabled (XEN) Std. Loglevel: All (XEN) Guest Loglevel: All -Chris ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Xen 3.2.2 - Timer ISR/0: Time went backwards 2008-08-04 18:51 Xen 3.2.2 - Timer ISR/0: Time went backwards Christopher S. Aker 2008-08-04 18:59 ` Keir Fraser @ 2008-08-05 7:03 ` Jan Beulich 2008-08-05 15:00 ` Dan Magenheimer 2008-08-06 15:46 ` Dan Magenheimer 1 sibling, 2 replies; 9+ messages in thread From: Jan Beulich @ 2008-08-05 7:03 UTC (permalink / raw) To: Christopher S. Aker; +Cc: xen-devel This looks very similar to bug report we've got from IBM I'm currently trying to research (difficult, as I can't touch the hardware). What I know so far is that we're losing, starting a few seconds after dom0 boot began, 20-50+% timer interrupts. The moment this rate exceeds about 50%, platform time calibration breaks (as it sets the timer to half the overflow period). Since jiffies aren't used much elsewhere, this loss of timer ticks doesn't seem to matter much elsewhere. I've got no real clue so far *why* there's such a high rate of lost interrupts, though. The only (albeit small, since appearing very unlikely) possibility would be frequent and extensive SMM entries after ACPI mode got enabled on the system. Btw., does -unstable exhibit the same behavior? Jan >>> "Christopher S. Aker" <caker@theshore.net> 04.08.08 20:51 >>> Hardware: Xen: 3.2.1-rc2 64bit dom0: 2.6.18.8 at changeset 622, PAE # xm dmesg | grep -e sync -e timer (XEN) checking TSC synchronization across 8 CPUs: passed. (XEN) Platform timer overflows in 234 jiffies. (XEN) Platform timer is 3.579MHz ACPI PM Timer (XEN) Machine check exception polling timer started. Spools one of these to console every few seconds: Timer ISR/0: Time went backwards: delta=-4270576170971 delta_cpu=254829029 shadow=2037844042151244163 off=261710497 processed=2037848312989081849 cpu_processed=2037844042158081849 0: 2037844042158081849 1: 2037828468354081849 2: 2037848312989081849 3: 2037837726866081849 4: 2037842059197081849 5: 2037840075526081849 6: 2037845844663081849 7: 2037841593777081849 A few t's into Xen's console: (XEN) *** Serial input -> Xen (type 'CTRL-a' three times to switch input to DOM0) (XEN) Min = 2037829427350793281 ; Max = 2037848310626701146 ; Diff = 18883275907865 (18883275907 microseconds) (XEN) Min = 2037829428349256182 ; Max = 2037848311625163843 ; Diff = 18883275907661 (18883275907 microseconds) (XEN) Min = 2037829428565188930 ; Max = 2037848311841096807 ; Diff = 18883275907877 (18883275907 microseconds) This particular box does this with 3.2.0 - 3.2.2-rc2. I have another box doing the same thing, except the delta is more sane (0 - 2 microseconds), however eventually dom0 freezes. -Chris _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel ^ permalink raw reply [flat|nested] 9+ messages in thread
* RE: Xen 3.2.2 - Timer ISR/0: Time went backwards 2008-08-05 7:03 ` Jan Beulich @ 2008-08-05 15:00 ` Dan Magenheimer 2008-08-06 2:50 ` Christopher S. Aker 2008-08-06 15:46 ` Dan Magenheimer 1 sibling, 1 reply; 9+ messages in thread From: Dan Magenheimer @ 2008-08-05 15:00 UTC (permalink / raw) To: Jan Beulich, Christopher S. Aker; +Cc: xen-devel@lists.xensource.com > Btw., does -unstable exhibit the same behavior? Yes if you are able to reproduce the problem on a very recent unstable (or on 4.0 when it is released), it would be very interesting to see the results of this run in dom0: xm debug-key t; xm dmesg | tail -2 or just do the t into the debug console that you did before. In fact if on your system you can do: watch "xm debug-key t; xm dmesg | tail -2" and watch for dynamic patterns, that might shed some light as well. (E.g. what values are growing, does it grow continuously or spurt every N seconds, etc.) (Note that the output from this is different and less useful prior to 4.0.) This will tell us if one of the physical cpu's tsc's has gone bonkers or if the problem is elsewhere. > -----Original Message----- > From: xen-devel-bounces@lists.xensource.com > [mailto:xen-devel-bounces@lists.xensource.com]On Behalf Of Jan Beulich > Sent: Tuesday, August 05, 2008 1:04 AM > To: Christopher S. Aker > Cc: xen-devel@lists.xensource.com > Subject: Re: [Xen-devel] Xen 3.2.2 - Timer ISR/0: Time went backwards > > > This looks very similar to bug report we've got from IBM I'm currently > trying to research (difficult, as I can't touch the > hardware). What I know > so far is that we're losing, starting a few seconds after > dom0 boot began, > 20-50+% timer interrupts. The moment this rate exceeds about 50%, > platform time calibration breaks (as it sets the timer to > half the overflow > period). Since jiffies aren't used much elsewhere, this loss > of timer ticks > doesn't seem to matter much elsewhere. > > I've got no real clue so far *why* there's such a high rate > of lost interrupts, > though. The only (albeit small, since appearing very > unlikely) possibility > would be frequent and extensive SMM entries after ACPI mode got > enabled on the system. > > Btw., does -unstable exhibit the same behavior? > > Jan > > >>> "Christopher S. Aker" <caker@theshore.net> 04.08.08 20:51 >>> > Hardware: > Xen: 3.2.1-rc2 64bit > dom0: 2.6.18.8 at changeset 622, PAE > > # xm dmesg | grep -e sync -e timer > (XEN) checking TSC synchronization across 8 CPUs: passed. > (XEN) Platform timer overflows in 234 jiffies. > (XEN) Platform timer is 3.579MHz ACPI PM Timer > (XEN) Machine check exception polling timer started. > > Spools one of these to console every few seconds: > > Timer ISR/0: Time went backwards: delta=-4270576170971 > delta_cpu=254829029 shadow=2037844042151244163 off=261710497 > processed=2037848312989081849 cpu_processed=2037844042158081849 > 0: 2037844042158081849 > 1: 2037828468354081849 > 2: 2037848312989081849 > 3: 2037837726866081849 > 4: 2037842059197081849 > 5: 2037840075526081849 > 6: 2037845844663081849 > 7: 2037841593777081849 > > A few t's into Xen's console: > > (XEN) *** Serial input -> Xen (type 'CTRL-a' three times to > switch input > to DOM0) > (XEN) Min = 2037829427350793281 ; Max = 2037848310626701146 ; Diff = > 18883275907865 (18883275907 microseconds) > (XEN) Min = 2037829428349256182 ; Max = 2037848311625163843 ; Diff = > 18883275907661 (18883275907 microseconds) > (XEN) Min = 2037829428565188930 ; Max = 2037848311841096807 ; Diff = > 18883275907877 (18883275907 microseconds) > > This particular box does this with 3.2.0 - 3.2.2-rc2. I have another > box doing the same thing, except the delta is more sane (0 - 2 > microseconds), however eventually dom0 freezes. > > -Chris > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Xen 3.2.2 - Timer ISR/0: Time went backwards 2008-08-05 15:00 ` Dan Magenheimer @ 2008-08-06 2:50 ` Christopher S. Aker 2008-08-06 7:16 ` Keir Fraser 0 siblings, 1 reply; 9+ messages in thread From: Christopher S. Aker @ 2008-08-06 2:50 UTC (permalink / raw) To: dan.magenheimer@oracle.com; +Cc: xen-devel@lists.xensource.com Jan Beulich wrote: > Btw., does -unstable exhibit the same behavior? Meh .. xen-unstable build from earlier today has been running fine. With 3.2.x, I would have had lockups and/or the "Time went backwards" error by now. So .. what's that tell us? -Chris ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Xen 3.2.2 - Timer ISR/0: Time went backwards 2008-08-06 2:50 ` Christopher S. Aker @ 2008-08-06 7:16 ` Keir Fraser 0 siblings, 0 replies; 9+ messages in thread From: Keir Fraser @ 2008-08-06 7:16 UTC (permalink / raw) To: Christopher S. Aker, dan.magenheimer@oracle.com Cc: xen-devel@lists.xensource.com On 6/8/08 03:50, "Christopher S. Aker" <caker@theshore.net> wrote: > Jan Beulich wrote: >> Btw., does -unstable exhibit the same behavior? > > Meh .. xen-unstable build from earlier today has been running fine. > With 3.2.x, I would have had lockups and/or the "Time went backwards" > error by now. > > So .. what's that tell us? It tells us that the CPUs now track the platform timer together much better even if it's a bit screwy. This greatly reduces the chances of them skewing relative to each other. -- Keir ^ permalink raw reply [flat|nested] 9+ messages in thread
* RE: Xen 3.2.2 - Timer ISR/0: Time went backwards 2008-08-05 7:03 ` Jan Beulich 2008-08-05 15:00 ` Dan Magenheimer @ 2008-08-06 15:46 ` Dan Magenheimer 2008-08-06 16:27 ` [PATCH] warn if time calibration goes wacko (was RE: Xen 3.2.2 - Timer ISR/0: Time went backwards) Dan Magenheimer 1 sibling, 1 reply; 9+ messages in thread From: Dan Magenheimer @ 2008-08-06 15:46 UTC (permalink / raw) To: Jan Beulich, Christopher S. Aker, Keir Fraser Cc: xen-devel@lists.xensource.com > 20-50+% timer interrupts. The moment this rate exceeds about 50%, > platform time calibration breaks (as it sets the timer to > half the overflow period). I've looked at that code in local_time_calibration() a few times and even added debug code once to see if it occurs. It didn't on my machine, but I can see how it would cause problems if it did happen. Keir, would you accept a patch (or just add the two lines yourself) to printk a warning if that "goto out" ever occurs and/or maybe if the "scale factor is clamped"? (Chris, this might not be your problem so apologies for the topic drift, but if the printk had been there awhile ago, we'd at least know if it is or is not the problem.) Dan P.S. This is also what led to the separate thread about measuring interrupt latency. If this problem is due to huge periods with interrupts off, it would be nice to know. > -----Original Message----- > From: xen-devel-bounces@lists.xensource.com > [mailto:xen-devel-bounces@lists.xensource.com]On Behalf Of Jan Beulich > Sent: Tuesday, August 05, 2008 1:04 AM > To: Christopher S. Aker > Cc: xen-devel@lists.xensource.com > Subject: Re: [Xen-devel] Xen 3.2.2 - Timer ISR/0: Time went backwards > > > This looks very similar to bug report we've got from IBM I'm currently > trying to research (difficult, as I can't touch the > hardware). What I know > so far is that we're losing, starting a few seconds after > dom0 boot began, > 20-50+% timer interrupts. The moment this rate exceeds about 50%, > platform time calibration breaks (as it sets the timer to > half the overflow > period). Since jiffies aren't used much elsewhere, this loss > of timer ticks > doesn't seem to matter much elsewhere. > > I've got no real clue so far *why* there's such a high rate > of lost interrupts, > though. The only (albeit small, since appearing very > unlikely) possibility > would be frequent and extensive SMM entries after ACPI mode got > enabled on the system. > > Btw., does -unstable exhibit the same behavior? > > Jan > > >>> "Christopher S. Aker" <caker@theshore.net> 04.08.08 20:51 >>> > Hardware: > Xen: 3.2.1-rc2 64bit > dom0: 2.6.18.8 at changeset 622, PAE > > # xm dmesg | grep -e sync -e timer > (XEN) checking TSC synchronization across 8 CPUs: passed. > (XEN) Platform timer overflows in 234 jiffies. > (XEN) Platform timer is 3.579MHz ACPI PM Timer > (XEN) Machine check exception polling timer started. > > Spools one of these to console every few seconds: > > Timer ISR/0: Time went backwards: delta=-4270576170971 > delta_cpu=254829029 shadow=2037844042151244163 off=261710497 > processed=2037848312989081849 cpu_processed=2037844042158081849 > 0: 2037844042158081849 > 1: 2037828468354081849 > 2: 2037848312989081849 > 3: 2037837726866081849 > 4: 2037842059197081849 > 5: 2037840075526081849 > 6: 2037845844663081849 > 7: 2037841593777081849 > > A few t's into Xen's console: > > (XEN) *** Serial input -> Xen (type 'CTRL-a' three times to > switch input > to DOM0) > (XEN) Min = 2037829427350793281 ; Max = 2037848310626701146 ; Diff = > 18883275907865 (18883275907 microseconds) > (XEN) Min = 2037829428349256182 ; Max = 2037848311625163843 ; Diff = > 18883275907661 (18883275907 microseconds) > (XEN) Min = 2037829428565188930 ; Max = 2037848311841096807 ; Diff = > 18883275907877 (18883275907 microseconds) > > This particular box does this with 3.2.0 - 3.2.2-rc2. I have another > box doing the same thing, except the delta is more sane (0 - 2 > microseconds), however eventually dom0 freezes. > > -Chris > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel > ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] warn if time calibration goes wacko (was RE: Xen 3.2.2 - Timer ISR/0: Time went backwards) 2008-08-06 15:46 ` Dan Magenheimer @ 2008-08-06 16:27 ` Dan Magenheimer 0 siblings, 0 replies; 9+ messages in thread From: Dan Magenheimer @ 2008-08-06 16:27 UTC (permalink / raw) To: dan.magenheimer@oracle.com, Jan Beulich, Christopher S. Aker, Keir Fraser Cc: xen-devel@lists.xensource.com [-- Attachment #1: Type: text/plain, Size: 4659 bytes --] See below. Bad things will happen if these situations happen, so at least we can diagnose it easier. > -----Original Message----- > From: Dan Magenheimer [mailto:dan.magenheimer@oracle.com] > Sent: Wednesday, August 06, 2008 9:47 AM > To: 'Jan Beulich'; 'Christopher S. Aker'; 'Keir Fraser' > Cc: 'xen-devel@lists.xensource.com' > Subject: RE: [Xen-devel] Xen 3.2.2 - Timer ISR/0: Time went backwards > > > > 20-50+% timer interrupts. The moment this rate exceeds about 50%, > > platform time calibration breaks (as it sets the timer to > > half the overflow period). > > I've looked at that code in local_time_calibration() a few times > and even added debug code once to see if it occurs. It > didn't on my machine, but I can see how it would cause problems > if it did happen. > > Keir, would you accept a patch (or just add the two lines yourself) > to printk a warning if that "goto out" ever occurs and/or maybe > if the "scale factor is clamped"? > > (Chris, this might not be your problem so apologies for the topic > drift, but if the printk had been there awhile ago, we'd at least > know if it is or is not the problem.) > > Dan > > P.S. This is also what led to the separate thread about measuring > interrupt latency. If this problem is due to huge periods with > interrupts off, it would be nice to know. > > > -----Original Message----- > > From: xen-devel-bounces@lists.xensource.com > > [mailto:xen-devel-bounces@lists.xensource.com]On Behalf Of > Jan Beulich > > Sent: Tuesday, August 05, 2008 1:04 AM > > To: Christopher S. Aker > > Cc: xen-devel@lists.xensource.com > > Subject: Re: [Xen-devel] Xen 3.2.2 - Timer ISR/0: Time went > backwards > > > > > > This looks very similar to bug report we've got from IBM > I'm currently > > trying to research (difficult, as I can't touch the > > hardware). What I know > > so far is that we're losing, starting a few seconds after > > dom0 boot began, > > 20-50+% timer interrupts. The moment this rate exceeds about 50%, > > platform time calibration breaks (as it sets the timer to > > half the overflow > > period). Since jiffies aren't used much elsewhere, this loss > > of timer ticks > > doesn't seem to matter much elsewhere. > > > > I've got no real clue so far *why* there's such a high rate > > of lost interrupts, > > though. The only (albeit small, since appearing very > > unlikely) possibility > > would be frequent and extensive SMM entries after ACPI mode got > > enabled on the system. > > > > Btw., does -unstable exhibit the same behavior? > > > > Jan > > > > >>> "Christopher S. Aker" <caker@theshore.net> 04.08.08 20:51 >>> > > Hardware: > > Xen: 3.2.1-rc2 64bit > > dom0: 2.6.18.8 at changeset 622, PAE > > > > # xm dmesg | grep -e sync -e timer > > (XEN) checking TSC synchronization across 8 CPUs: passed. > > (XEN) Platform timer overflows in 234 jiffies. > > (XEN) Platform timer is 3.579MHz ACPI PM Timer > > (XEN) Machine check exception polling timer started. > > > > Spools one of these to console every few seconds: > > > > Timer ISR/0: Time went backwards: delta=-4270576170971 > > delta_cpu=254829029 shadow=2037844042151244163 off=261710497 > > processed=2037848312989081849 cpu_processed=2037844042158081849 > > 0: 2037844042158081849 > > 1: 2037828468354081849 > > 2: 2037848312989081849 > > 3: 2037837726866081849 > > 4: 2037842059197081849 > > 5: 2037840075526081849 > > 6: 2037845844663081849 > > 7: 2037841593777081849 > > > > A few t's into Xen's console: > > > > (XEN) *** Serial input -> Xen (type 'CTRL-a' three times to > > switch input > > to DOM0) > > (XEN) Min = 2037829427350793281 ; Max = 2037848310626701146 > ; Diff = > > 18883275907865 (18883275907 microseconds) > > (XEN) Min = 2037829428349256182 ; Max = 2037848311625163843 > ; Diff = > > 18883275907661 (18883275907 microseconds) > > (XEN) Min = 2037829428565188930 ; Max = 2037848311841096807 > ; Diff = > > 18883275907877 (18883275907 microseconds) > > > > This particular box does this with 3.2.0 - 3.2.2-rc2. I > have another > > box doing the same thing, except the delta is more sane (0 - 2 > > microseconds), however eventually dom0 freezes. > > > > -Chris > > > > > > _______________________________________________ > > Xen-devel mailing list > > Xen-devel@lists.xensource.com > > http://lists.xensource.com/xen-devel > > > > > > _______________________________________________ > > Xen-devel mailing list > > Xen-devel@lists.xensource.com > > http://lists.xensource.com/xen-devel > > [-- Attachment #2: timewarn.patch --] [-- Type: application/octet-stream, Size: 937 bytes --] diff -r f20fb83dac2c xen/arch/x86/time.c --- a/xen/arch/x86/time.c Tue Aug 05 13:55:14 2008 +0100 +++ b/xen/arch/x86/time.c Wed Aug 06 10:24:13 2008 -0600 @@ -909,7 +909,10 @@ static void local_time_calibration(void * We could be smarter here: resync platform timer with local timer? */ if ( ((s64)stime_elapsed64 < (EPOCH / 2)) ) + { + printk("WARNING: local time lost sync with platform timer\n"); goto out; + } /* * Calculate error-correction factor. This only slows down a fast local @@ -920,7 +923,10 @@ static void local_time_calibration(void { local_stime_err = curr_local_stime - curr_master_stime; if ( local_stime_err > EPOCH ) + { + printk("WARNING: scale factor clamped in local time calibration\n"); local_stime_err = EPOCH; + } error_factor = div_frac(EPOCH, EPOCH + (u32)local_stime_err); } [-- Attachment #3: Type: text/plain, Size: 138 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2008-08-06 16:27 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-08-04 18:51 Xen 3.2.2 - Timer ISR/0: Time went backwards Christopher S. Aker 2008-08-04 18:59 ` Keir Fraser 2008-08-04 19:18 ` Christopher S. Aker 2008-08-05 7:03 ` Jan Beulich 2008-08-05 15:00 ` Dan Magenheimer 2008-08-06 2:50 ` Christopher S. Aker 2008-08-06 7:16 ` Keir Fraser 2008-08-06 15:46 ` Dan Magenheimer 2008-08-06 16:27 ` [PATCH] warn if time calibration goes wacko (was RE: Xen 3.2.2 - Timer ISR/0: Time went backwards) Dan Magenheimer
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.