From mboxrd@z Thu Jan 1 00:00:00 1970 From: Arjan Koers <0h61vkll2ly8@xutrox.com> Subject: Re: 2.6.35-rc1 regression with pvclock and smp guests Date: Sat, 31 Jul 2010 13:53:45 +0200 Message-ID: <4C540EC9.1010008@xutrox.com> References: <4C4D4B8B.80006@amd.com> <4C4DDB00.50203@xutrox.com> <4C4F48D0.8090609@xutrox.com> <4C500872.1020809@redhat.com> <4C536F80.5090205@xutrox.com> <4C538CCE.1010104@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: kvm@vger.kernel.org, Avi Kivity , Glauber Costa To: Zachary Amsden Return-path: Received: from smtp-out3.tiscali.nl ([195.241.79.178]:42661 "EHLO smtp-out3.tiscali.nl" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755707Ab0GaLxw (ORCPT ); Sat, 31 Jul 2010 07:53:52 -0400 In-Reply-To: <4C538CCE.1010104@redhat.com> Sender: kvm-owner@vger.kernel.org List-ID: On 2010-07-31 04:39, Zachary Amsden wrote: > On 07/30/2010 02:34 PM, Arjan Koers wrote: >> On 2010-07-28 12:37, Avi Kivity wrote: >> >>> On 07/28/2010 12:00 AM, Arjan Koers wrote: >>> >>>> On 2010-07-26 20:59, Arjan Koers wrote: >>>> >>>> >>>>> I ran into the same problem. 2.6.34.1 and 2.6.35-rc6 SMP guest >>>>> kernels hang during boot. >>>>> >>>> It appears that last is way ahead of ret twice. >>>> The kernel boots with this debug patch that makes the clock go >>>> backwards if the difference is big: >>>> >>>> last = atomic64_read(&last_value); >>>> do { >>>> - if (ret< last) >>>> - return last; >>>> + if (ret< last) { >>>> + if ( last - ret< 25000000 ) >>>> + return last; >>>> + else >>>> + printk("pvclock backwards: ret = %llx; last = >>>> %llx\n", ret, last); >>>> + } >>>> last = atomic64_cmpxchg(&last_value, last, ret); >>>> } while (unlikely(last != ret)); >>>> >>>> >>>> >>>> [ 0.037122] Total of 2 processors activated (11198.08 BogoMIPS). >>>> [ 0.037118] x86 PAT enabled: cpu 1, old 0x0, new 0x7010600070106 >>>> [ 0.040000] pvclock backwards: ret = 108373705fe2; last = >>>> 210aff61470a >>>> >>> Zaaaacchhhh?! >>> >>> >> >> The lists below show some debug data of the first 99 calls to >> pvclock_clocksource_read since the kernel booted. The situation >> after the 'do ... while (version != src->version)' loop is >> displayed. >> >> Meaning of the columns: >> - src pointer >> - shadow.tsc_timestamp >> - shadow.system_timestamp >> - shadow.version >> - native_read_tsc() >> - delta = native_read_tsc() - shadow.tsc_timestamp >> - offset = scale_delta(delta, shadow.tsc_to_nsec_mul, shadow.tsc_shift) >> - ret = shadow.system_timestamp + offset >> >> Fields left out, because they were the same for all rows: >> - shadow.tsc_to_nsec_mul: b6dc43b6 >> - shadow.tsc_shift: ffffffff >> - shadow.flags: 0 >> >> Debug log of guest after cold boot of virtual machine: >> 70: ffff880001411c00 232f4a54e b42c0e25f5e7 c17c >> 232f63f61 19a13 9274 b42c0e26885b >> 71: ffff880001511c00 20afec946 b42bffe0b604 130 1f890681eacdf >> 1f88e5d1fe399 b433ab005565 1685faae10b69 >> > > Okay, I think I know what's going on and why Glauber's patch causes > problems for you. It looks like your kernel is reading the kvmclock on > the AP before it is initialized. Looking at the guest side of things, > it seems entirely plausible this could happen. > > You did mention printk timing causes the bug to appear? Perhaps it is > not just coincidental. Printk getting the time might very well call > back into the timer code before the clock is initialized, and you've got > tons of stuff in cpu_init and friends that are likely to want to printk > all kinds of bootup messages. > > If this were in fact the case, the cmpxchg that was added by Glauber's > patch could leap your clock forward to some very uninitialized random > value and then you could end up stuck in a timeout loop for days, as you > are seeing. Yes. That large wrong value is stored in last_value and all future correct values are ignored, because they are smaller then last_value. > Can you try very simply disabling printk timing to see if that might be > the source of the bug? In the meantime, what kernel version do you have > in the guests? The kernel boots successfully when CONFIG_PRINTK_TIME is not set. I'm testing with 2.6.35-rc6 now. The problem also occurs with 2.6.34.1, which also has Glauber's patch. Version 2.6.34 is working.