From: Arjan Koers <0h61vkll2ly8@xutrox.com>
To: kvm@vger.kernel.org
Cc: Michael Tokarev <mjt@tls.msk.ru>,
Zachary Amsden <zamsden@redhat.com>, Avi Kivity <avi@redhat.com>,
Glauber Costa <glommer@redhat.com>,
Andre Przywara <andre.przywara@amd.com>
Subject: Re: 2.6.35-rc1 regression with pvclock and smp guests
Date: Thu, 30 Sep 2010 20:49:44 +0200
Message-ID: <4CA4DBC8.6070606@xutrox.com>
In-Reply-To: <4CA4AD87.8060502@redhat.com>

On 2010-09-30 17:32, Zachary Amsden wrote:
> On 09/30/2010 05:12 AM, Michael Tokarev wrote:
>> 30.09.2010 17:54, Zachary Amsden wrote:
>> []
>>
>>> The printk movement is just a bandaid patch, correct? Anything which
>>> does printk before kvmclock is registered could trigger the same bug.
>>>
>> Well, I'd not say it's just a bandaid patch, it's real bug -- either
>> we can read kvmclock (so it's initialized), or we don't touch it (at
>> least before registration).
>>
>
> Yes, that's the bug, but moving the printk doesn't fix that, it just
> hides it.

Correct. It's just luck that it works for my 64-bit 2.6.34.* and
2.6.35.* kernels. The working kernels will break when compiled to
print additional debug information.

Here's a partial boot log of 2.6.32.23 with smpboot.c compiled
with DEBUG defined. I modified printk to display the CPU number
(printk_cpu). All lines on CPU 1 up to 0.136487 use the invalid clock
and would cause the kernel to hang later if I hadn't patched pvclock
to correct for the clock going backwards.

...
[0: 0.124221] Booting processor 1 APIC 0x1 ip 0x6000
[0: 0.124579] Setting warm reset code and vector.
[0: 0.124585] 1.
[0: 0.124587] 2.
[0: 0.124588] 3.
[0: 0.124601] Asserting INIT.
[0: 0.124613] Waiting for send to finish...
[0: 0.134490] Deasserting INIT.
[0: 0.134497] Waiting for send to finish...
[0: 0.134501] #startup loops: 2.
[0: 0.134503] Sending STARTUP #1.
[0: 0.134508] After apic_write.
[1: 0.008000] Initializing CPU#1
[1: 0.008000] CPU#1 (phys ID: 1) waiting for CALLOUT
[0: 0.134826] Startup point 1.
[0: 0.135133] Waiting for send to finish...
[0: 0.135340] Sending STARTUP #2.
[0: 0.135346] After apic_write.
[0: 0.135650] Startup point 1.
[0: 0.135651] Waiting for send to finish...
[0: 0.135858] After Startup.
[0: 0.135859] Before Callout 1.
[0: 0.135861] After Callout 1.
[1: 0.008000] CALLIN, before setup_local_APIC().
[1: 0.008000] Stack at about ffff88001f889f44
[1: 0.008000] CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
[1: 0.008000] CPU: L2 Cache: 512K (64 bytes/line)
[1: 0.008000] kvm-clock: cpu 1, msr 0:1511601, secondary cpu clock
[0: 0.136461] OK.
[0: 0.136463] CPU1: AMD Athlon(tm) II X2 240 Processor stepping 02
[0: 0.136465] CPU has booted.
[0: 0.136488] Brought up 2 CPUs
[0: 0.136489] Boot done.
[0: 0.136490] Before bogomips.
[0: 0.136491] Total of 2 processors activated (11202.17 BogoMIPS).
[0: 0.136493] Before bogocount - setting activated=1.
[1: 0.136487] x86 PAT enabled: cpu 1, old 0x0, new 0x7010600070106
[0: 0.144007] CPU0 attaching sched-domain:
[0: 0.144010] domain 0: span 0-1 level CPU
[0: 0.144012] groups: 0 1
[0: 0.144016] CPU1 attaching sched-domain:
[0: 0.144018] domain 0: span 0-1 level CPU
[0: 0.144020] groups: 1 0
[0: 0.144219] NET: Registered protocol family 16
[0: 0.148091] PCI: Using configuration type 1 for base access
[0: 0.148451] PCI: Using configuration type 1 for extended access
[0: 0.148870] mtrr: your CPUs had inconsistent variable MTRR settings
[0: 0.148870] mtrr: your CPUs had inconsistent MTRRdefType settings
[0: 0.148870] mtrr: probably your BIOS does not setup all CPUs.
[0: 0.149185] mtrr: corrected configuration.
[0: 0.156112] bio: create slab <bio-0> at 0
[0: 0.156635] vgaarb: loaded
[0: 0.156635] PCI: Probing PCI hardware
[0: 0.156635] PCI: Probing PCI hardware (bus 00)
[0: 0.156635] pci 0000:00:01.1: reg 20 io port: [0xc000-0xc00f]
[0: 0.156773] pci 0000:00:01.3: quirk: region b000-b03f claimed by PIIX4 ACPI
[0: 0.160012] pci 0000:00:01.3: quirk: region b100-b10f claimed by PIIX4 SMB
[0: 0.163379] pci 0000:00:02.0: reg 10 32bit mmio pref: [0xf0000000-0xf1ffffff]
[0: 0.164660] pci 0000:00:02.0: reg 14 32bit mmio: [0xf2000000-0xf2000fff]
[0: 0.170537] pci 0000:00:03.0: reg 10 io port: [0xc020-0xc03f]
[0: 0.170629] pci 0000:00:03.0: reg 14 32bit mmio: [0xf2001000-0xf2001fff]
[0: 0.171037] pci 0000:00:04.0: reg 10 io port: [0xc040-0xc05f]
[0: 0.171373] pci 0000:00:05.0: reg 10 io port: [0xc080-0xc0bf]
[0: 0.172273] pci 0000:00:06.0: reg 10 io port: [0xc0c0-0xc0ff]
[0: 0.173099] vgaarb: device added: PCI:0000:00:02.0,decodes=io+mem,owns=io+mem,locks=none
[0: 0.176131] pci 0000:00:01.0: PIIX/ICH IRQ router [8086:7000]
[0: 0.177112] Switching to clocksource kvm-clock
[1: 0.181401] pci_bus 0000:00: resource 0 io: [0x00-0xffff]
[1: 0.181412] pci_bus 0000:00: resource 1 mem: [0x000000-0xffffffffffffffff]
[1: 0.181825] NET: Registered protocol family 2
...
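
For illustration only, the kind of "never go backwards" correction mentioned above the log could look roughly like the sketch below. This is not the actual pvclock patch: `struct mono_clock` and `mono_clock_read()` are hypothetical names, the values are plain microsecond counts, and the code is user-space C with no locking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state: the largest timestamp handed out so far. */
struct mono_clock {
    uint64_t last_us;
};

/*
 * Return raw_us, unless it is older than a value already returned,
 * in which case repeat the previous value so time never steps back.
 * Not SMP-safe as written; shared state would need an atomic update.
 */
uint64_t mono_clock_read(struct mono_clock *mc, uint64_t raw_us)
{
    assert(mc != NULL);
    if (raw_us < mc->last_us)
        return mc->last_us;   /* stale reading: reuse the last value */
    mc->last_us = raw_us;
    return raw_us;
}
```

Fed the timestamps from the log in order, a reading of 134508 (CPU 0 at 0.134508) followed by 8000 (CPU 1 at 0.008000) would return 134508 both times. Like moving the printk, this only masks the stale reading; it does not make the unregistered clock valid.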
>>> Can you try with printk timing disabled and see if the bug disappears?
>>>
>> Yes it disappears so far, at last I can't trigger it anymore, tried
>> numerous boots including the 2.6.35.6 32bit kernel (patched with the
>> printk registration patch!) which shows the prob in almost every boot.
>>
>
> So, looks like we need to do the real fix.
Your ideas to zero hv_clock or to use an initialized flag may be usable.
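
As a rough sketch of the "initialized flag" idea, assuming a much simplified stand-in for the per-CPU clock area: `struct cpu_clock`, its fields and `cpu_clock_read()` are hypothetical names for illustration, not the kernel's real structures or API. A zero-filled structure reads as "not registered", so the same check would also cover the zeroing variant.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical stand-in for one CPU's clock area. */
struct cpu_clock {
    bool     registered;      /* set only after this CPU registers its clock */
    uint64_t system_time_ns;  /* assumed to be filled in after registration */
};

/*
 * Store the time in *out_ns and return true only if this CPU's clock
 * has been registered. Otherwise leave *out_ns untouched and return
 * false, so the caller can fall back to another time source.
 */
bool cpu_clock_read(const struct cpu_clock *clk, uint64_t *out_ns)
{
    assert(clk != NULL && out_ns != NULL);
    if (!clk->registered)
        return false;
    *out_ns = clk->system_time_ns;
    return true;
}
```

With a guard like this, an early printk on a secondary CPU would never consume an unregistered clock value; what it should use instead is a separate decision.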