From mboxrd@z Thu Jan 1 00:00:00 1970 From: Zachary Amsden Subject: Re: 2.6.35-rc1 regression with pvclock and smp guests Date: Fri, 30 Jul 2010 15:38:32 -1000 Message-ID: <4C537E98.7000704@redhat.com> References: <4C4D4B8B.80006@amd.com> <4C4DDB00.50203@xutrox.com> <4C4F48D0.8090609@xutrox.com> <4C500872.1020809@redhat.com> <4C536F80.5090205@xutrox.com> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Cc: kvm@vger.kernel.org, Avi Kivity To: Arjan Koers <0h61vkll2ly8@xutrox.com> Return-path: Received: from mx1.redhat.com ([209.132.183.28]:55667 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754141Ab0GaBig (ORCPT ); Fri, 30 Jul 2010 21:38:36 -0400 In-Reply-To: <4C536F80.5090205@xutrox.com> Sender: kvm-owner@vger.kernel.org List-ID: On 07/30/2010 02:34 PM, Arjan Koers wrote: > On 2010-07-28 12:37, Avi Kivity wrote: > >> On 07/28/2010 12:00 AM, Arjan Koers wrote: >> >>> On 2010-07-26 20:59, Arjan Koers wrote: >>> >>> >>>> I ran into the same problem. 2.6.34.1 and 2.6.35-rc6 SMP guest >>>> kernels hang during boot. >>>> >>> It appears that last is way ahead of ret twice. >>> The kernel boots with this debug patch that makes the clock go >>> backwards if the difference is big: >>> >>> last = atomic64_read(&last_value); >>> do { >>> - if (ret< last) >>> - return last; >>> + if (ret< last) { >>> + if ( last - ret< 25000000 ) >>> + return last; >>> + else >>> + printk("pvclock backwards: ret = %llx; last = >>> %llx\n", ret, last); >>> + } >>> last = atomic64_cmpxchg(&last_value, last, ret); >>> } while (unlikely(last != ret)); >>> >>> >>> >>> [ 0.037122] Total of 2 processors activated (11198.08 BogoMIPS). >>> [ 0.037118] x86 PAT enabled: cpu 1, old 0x0, new 0x7010600070106 >>> [ 0.040000] pvclock backwards: ret = 108373705fe2; last = 210aff61470a >>> >> Zaaaacchhhh?! >> >> > > The lists below show some debug data of the first 99 calls to > pvclock_clocksource_read since the kernel booted. The situation > after the 'do ... while (version != src->version)' loop is > displayed. > > Meaning of the columns: > - src pointer > - shadow.tsc_timestamp > - shadow.system_timestamp > - shadow.version > - native_read_tsc() > - delta = native_read_tsc() - shadow.tsc_timestamp > - offset = scale_delta(delta, shadow.tsc_to_nsec_mul, shadow.tsc_shift) > - ret = shadow.system_timestamp + offset > > Fields left out, because they were the same for all rows: > - shadow.tsc_to_nsec_mul: b6dc43b6 > - shadow.tsc_shift: ffffffff > - shadow.flags: 0 > > Debug log of guest after cold boot of virtual machine: > 1: ffff880001411c00 2107d5a4e b42c01d704c6 8294 210d8d4b5 5b7a67 20abdc b42c01f7b0a2 > 2: ffff880001411c00 2107d5a4e b42c01d704c6 8294 210dc2b61 5ed113 21dd1b b42c01f8e1e1 > 3: ffff880001411c00 21cb0d4a8 b42c0632768f bb70 21cb10a00 3558 130d b42c0632899c > 4: ffff880001411c00 21cb0d4a8 b42c0632768f bb70 21cb11f17 4a6f 1a95 b42c06329124 > 5: ffff880001411c00 21cceaad2 b42c063d1e45 bbd8 21ccec522 1a50 965 b42c063d27aa > 6: ffff880001411c00 21cde0644 b42c06429a42 bc10 21ce25457 44e13 1899a b42c064423dc > 7: ffff880001411c00 21cf905c1 b42c064c3e76 bc46 21cfa182b 1126a 6201 b42c064ca077 > 8: ffff880001411c00 21d088194 b42c0651c601 bc7a 21d089592 13fe 723 b42c0651cd24 > 9: ffff880001411c00 21d1ad073 b42c06584fc3 bcde 21d1b135d 42ea 17e5 b42c065867a8 > 10: ffff880001411c00 21d2a3837 b42c065dd039 bd10 21d2a4825 fee 5b0 b42c065dd5e9 > 11: ffff880001411c00 21d38bab3 b42c0662fea6 bd42 21d38caa1 fee 5b0 b42c06630456 > 12: ffff880001411c00 21d47459b b42c06683029 bd78 21d475517 f7c 587 b42c066835b0 > 13: ffff880001411c00 21d578ce7 b42c066e005f bdb2 21d57d70c 4a25 1a7a b42c066e1ad9 > 14: ffff880001411c00 21d578ce7 b42c066e005f bdb2 21d57d8d6 4bef 1b1e b42c066e1b7d > 15: ffff880001411c00 21d578ce7 b42c066e005f bdb2 21d57da22 4d3b 1b94 b42c066e1bf3 > 16: ffff880001411c00 21d578ce7 b42c066e005f bdb2 21d57fc5e 6f77 27ce b42c066e282d > 17: ffff880001411c00 21d67c77c b42c0673cc0a bde4 21d67d685 f09 55e b42c0673d168 > 18: ffff880001411c00 21d7625b2 b42c0678ed96 be16 21d763488 ed6 54c b42c0678f2e2 > 19: ffff880001411c00 21df3db36 b42c06a5d222 be54 21dfa78b9 69d83 25cd5 b42c06a82ef7 > 20: ffff880001411c00 21df3db36 b42c06a5d222 be54 21dfa7a3f 69f09 25d61 b42c06a82f83 > 21: ffff880001411c00 21df3db36 b42c06a5d222 be54 21dfa7f8b 6a455 25f45 b42c06a83167 > 22: ffff880001411c00 21e3a50ea b42c06befbb1 be58 21e3c1750 1c666 a249 b42c06bf9dfa > 23: ffff880001411c00 21e4bfe47 b42c06c54bc5 be92 21e4c4c61 4e1a 1be4 b42c06c567a9 > 24: ffff880001411c00 21ea2a56e b42c06e43dbc beca 21ea4b224 20cb6 bb66 b42c06e4f922 > 25: ffff880001411c00 21ea2a56e b42c06e43dbc beca 21ea52748 281da e53c b42c06e522f8 > 26: ffff880001411c00 21ea2a56e b42c06e43dbc beca 21ea52907 28399 e5db b42c06e52397 > 27: ffff880001411c00 21ea2a56e b42c06e43dbc beca 21ea52a76 28508 e65f b42c06e5241b > 28: ffff880001411c00 21ea2a56e b42c06e43dbc beca 21ea5c86a 322fc 11ec9 b42c06e55c85 > 29: ffff880001411c00 21ea2a56e b42c06e43dbc beca 21ea60e3a 368cc 137b7 b42c06e57573 > 30: ffff880001411c00 21ea2a56e b42c06e43dbc beca 21ea64dc8 3a85a 14e6a b42c06e58c26 > 31: ffff880001411c00 21ed8a003 b42c06f78496 bf02 21efda28b 250288 d37d2 b42c0704bc68 > 32: ffff880001411c00 21f0e9488 b42c070ac93f bf38 21f0eacdb 1853 8af b42c070ad1ee > 33: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230aeeac6 3e60 1646 b42c0d5636ed > 34: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230af06d0 5a6a 204a b42c0d5640f1 > 35: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b03f25 192bf 8fd6 b42c0d56b07d > 36: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b043c8 19762 917f b42c0d56b226 > 37: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b0526b 1a605 96b8 b42c0d56b75f > 38: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b05632 1a9cc 9812 b42c0d56b8b9 > 39: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b07eaa 1d244 a686 b42c0d56c72d > 40: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b094e9 1e883 ae78 b42c0d56cf1f > 41: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b09962 1ecfc b011 b42c0d56d0b8 > 42: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b10590 2592a d6b4 b42c0d56f75b > 43: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b1090d 25ca7 d7f3 b42c0d56f89a > 44: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b10f99 26333 da49 b42c0d56faf0 > 45: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b11204 2659e db27 b42c0d56fbce > 46: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b1217c 27516 e0ad b42c0d570154 > 47: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b1483f 29bd9 ee85 b42c0d570f2c > 48: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b14ba6 29f40 efbc b42c0d571063 > 49: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b15569 2a903 f338 b42c0d5713df > 50: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b250b3 3a44d 14cf8 b42c0d576d9f > 51: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b254a0 3a83a 14e5f b42c0d576f06 > 52: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b25bd8 3af72 150f3 b42c0d57719a > 53: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b25ec3 3b25d 151fd b42c0d5772a4 > 54: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b5fcab 75045 29cad b42c0d58bd54 > 55: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b6013b 754d5 29e4e b42c0d58bef5 > 56: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b6b86c 80c06 2dfbc b42c0d590063 > 57: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b6bc41 80fdb 2e11a b42c0d5901c1 > 58: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b6c4e5 8187f 2e430 b42c0d5904d7 > 59: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b6c776 81b10 2e51b b42c0d5905c2 > 60: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b7f97b 94d15 35266 b42c0d59730d > 61: ffff880001411c00 230aeac66 b42c0d5620a7 c100 230b864af 9b849 378b0 b42c0d599957 > 62: ffff880001411c00 23132e49d b42c0d855884 c16e 231599c3a 26b79d dd3ec b42c0d932c70 > 63: ffff880001411c00 23132e49d b42c0d855884 c16e 231599dbc 26b91f dd476 b42c0d932cfa > 64: ffff880001411c00 23132e49d b42c0d855884 c16e 231599f5f 26bac2 dd50c b42c0d932d90 > 65: ffff880001411c00 231fdf357 b42c0dcddc47 c176 232046a74 6771d 24f1e b42c0dd02b65 > 66: ffff880001411c00 231fdf357 b42c0dcddc47 c176 232046c53 678fc 24fca b42c0dd02c11 > 67: ffff880001411c00 231fdf357 b42c0dcddc47 c176 232046da0 67a49 25040 b42c0dd02c87 > 68: ffff880001411c00 232f4a54e b42c0e25f5e7 c17c 232f62a2d 184df 8ae2 b42c0e2680c9 > 69: ffff880001411c00 232f4a54e b42c0e25f5e7 c17c 232f63478 18f2a 8e8f b42c0e268476 > 70: ffff880001411c00 232f4a54e b42c0e25f5e7 c17c 232f63f61 19a13 9274 b42c0e26885b > 71: ffff880001511c00 20afec946 b42bffe0b604 130 1f890681eacdf 1f88e5d1fe399 b433ab005565 1685faae10b69 > 72: ffff880001411c00 2334400d3 b42c0e424ccd c180 23344a923 a850 3c1c b42c0e4288e9 > 73: ffff880001411c00 2334400d3 b42c0e424ccd c180 2334632f1 2321e c8c2 b42c0e43158f > 74: ffff880001411c00 2334400d3 b42c0e424ccd c180 23346a094 29fc1 efea b42c0e433cb7 > 75: ffff880001411c00 2334400d3 b42c0e424ccd c180 23347021d 3014a 112c0 b42c0e435f8d > 76: ffff880001411c00 2334400d3 b42c0e424ccd c180 2335ba33b 17a268 870e5 b42c0e4abdb2 > 77: ffff880001411c00 2334400d3 b42c0e424ccd c180 2335ba9f8 17a925 8734d b42c0e4ac01a > 78: ffff880001411c00 2334400d3 b42c0e424ccd c180 2335bb17d 17b0aa 875fd b42c0e4ac2ca > 79: ffff880001511c00 20afec946 b42bffe0b604 130 1f89068505355 1f88e5d518a0f b433ab1210ed 1685faaf2c6f1 > 80: ffff880001511c00 1f8906862c74e b42c0e59371c 2 1f8906863ad24 e5d6 5215 b42c0e598931 > 81: ffff880001511c00 1f8906862c74e b42c0e59371c 2 1f8906863b980 f232 567f b42c0e598d9b > 82: ffff880001511c00 1f8906862c74e b42c0e59371c 2 1f8906863bbdd f48f 5757 b42c0e598e73 > 83: ffff880001511c00 1f8906862c74e b42c0e59371c 2 1f8906863e9d2 12284 67c1 b42c0e599edd > 84: ffff880001411c00 2334400d3 b42c0e424ccd c180 233855729 415656 1755cc b42c0e59a299 > 85: ffff880001511c00 1f8906862c74e b42c0e59371c 2 1f890686410b4 14966 75a4 b42c0e59acc0 > 86: ffff880001411c00 2334400d3 b42c0e424ccd c180 233857b87 417ab4 1762c9 b42c0e59af96 > 87: ffff880001511c00 1f8906862c74e b42c0e59371c 2 1f89068646b9e 1a450 961d b42c0e59cd39 > 88: ffff880001411c00 2334400d3 b42c0e424ccd c180 233894271 45419e 18bc1e b42c0e5b08eb > 89: ffff880001411c00 2334400d3 b42c0e424ccd c180 2338ab48a 46b3b7 19404c b42c0e5b8d19 > 90: ffff880001411c00 2334400d3 b42c0e424ccd c180 2338adf39 46de66 194f8b b42c0e5b9c58 > 91: ffff880001411c00 2334400d3 b42c0e424ccd c180 2338b39b8 4738e5 196fdc b42c0e5bbca9 > 92: ffff880001511c00 1f890686bf9e1 b42c0e5c8045 4 1f890686cf137 f756 5855 b42c0e5cd89a > 93: ffff880001511c00 1f890686bf9e1 b42c0e5c8045 4 1f890686cfd6f 1038e 5cb3 b42c0e5cdcf8 > 94: ffff880001511c00 1f890686bf9e1 b42c0e5c8045 4 1f890686d9f4d 1a56c 9682 b42c0e5d16c7 > 95: ffff880001511c00 1f890686bf9e1 b42c0e5c8045 4 1f890686e5610 25c2f d7c8 b42c0e5d580d > 96: ffff880001511c00 1f890686bf9e1 b42c0e5c8045 4 1f890686e8326 28945 e7e2 b42c0e5d6827 > 97: ffff880001411c00 233907ea7 b42c0e5d9e8b c182 23391ad48 12ea1 6c15 b42c0e5e0aa0 > 98: ffff880001411c00 233907ea7 b42c0e5d9e8b c182 23391b539 13692 6eeb b42c0e5e0d76 > 99: ffff880001411c00 233907ea7 b42c0e5d9e8b c182 2339270a3 1f1fc b1da b42c0e5e5065 > > The data for the first CPU (ffff880001411c00) looks OK to me. > For the second CPU (ffff880001511c00), the contents of the shadow struct > appear to be wrong on line 71 and 79: shadow.tsc_timestamp and > native_read_tsc() are very dissimilar, which results in a wrong value > of ret. > On line 80, the struct is OK again. > Notice that shadow.version appears to have been be reset back to 0. That > doesn't happen when the guest is rebooted without stopping the virtual machine. > How are you printing shadow.version? From a local variable captured during the barrier window or directly in a printk afterwards? If should never go backwards like this, and the vcpus come from a zalloc. This is not easily explainable by anything other than a memory ordering or compiler issue. Note that receiving a startup IPI will cause the TSC to (mistakenly) pass through the host value, but this should be corrected for. This happens because SVM will call init_vmcb, clearing the tsc_offset field. This seems to explain the huge difference in TSC presented to the CPUs. It should affect kvmclock, because kvmclock won't be running at that time yet. Zach