From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ian Campbell
Subject: Re: [xen-unstable test] 58821: tolerable FAIL
Date: Mon, 22 Jun 2015 16:17:08 +0100
Message-ID: <1434986228.28264.172.camel@citrix.com>
References:
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Jan Beulich, Andrew Cooper
Cc: xen-devel@lists.xensource.com, ian.jackson@eu.citrix.com
List-Id: xen-devel@lists.xenproject.org

On Mon, 2015-06-22 at 14:09 +0000, osstest service user wrote:
> flight 58821 xen-unstable real [real]
> http://logs.test-lab.xenproject.org/osstest/logs/58821/
> [...]
> test-amd64-amd64-libvirt 11 guest-start fail like 58789

http://logs.test-lab.xenproject.org/osstest/logs/58821/test-amd64-amd64-libvirt/info.html

While investigating why libvirt hasn't been succeeding very well on
merlot* I came across some things in the serial log [0] which initially
struck me as odd, but which I suspect are nothing (or at least not
terribly relevant). If someone could confirm, that would be great.

The first is:

Jun 22 12:41:09.633294 (XEN) microcode: CPU2 updated from revision 0x6000822 to 0x6000832
Jun 22 12:41:09.665099 (XEN) microcode: CPU4 updated from revision 0x6000822 to 0x6000832
Jun 22 12:41:09.729089 (XEN) microcode: CPU6 updated from revision 0x6000822 to 0x6000832
Jun 22 12:41:09.793224 (XEN) microcode: CPU8 updated from revision 0x6000822 to 0x6000832
Jun 22 12:41:09.857118 (XEN) microcode: CPU10 updated from revision 0x6000822 to 0x6000832
Jun 22 12:41:09.921123 (XEN) microcode: CPU12 updated from revision 0x6000822 to 0x6000832
Jun 22 12:41:09.985563 (XEN) microcode: CPU14 updated from revision 0x6000822 to 0x6000832
Jun 22 12:41:10.049212 (XEN) microcode: CPU16 updated from revision 0x6000822 to 0x6000832
Jun 22 12:41:10.121106 (XEN) microcode: CPU18 updated from revision 0x6000822 to 0x6000832
Jun 22 12:41:10.185059 (XEN) microcode: CPU20 updated from revision 0x6000822 to 0x6000832
Jun 22 12:41:10.249070 (XEN) microcode: CPU22 updated from revision 0x6000822 to 0x6000832
Jun 22 12:41:10.313063 (XEN) microcode: CPU24 updated from revision 0x6000822 to 0x6000832
Jun 22 12:41:10.393217 (XEN) microcode: CPU26 updated from revision 0x6000822 to 0x6000832
Jun 22 12:41:10.457126 (XEN) microcode: CPU28 updated from revision 0x6000822 to 0x6000832
Jun 22 12:41:10.521228 (XEN) microcode: CPU30 updated from revision 0x6000822 to 0x6000832

i.e. only even-numbered CPUs are updated (CPU0 was done earlier in boot).
I suspect that the answer here is "hyperthreading", and the cpuinfo
shows all CPUs have in fact been updated. So I think that's just a
red herring.

The second thing is:

Jun 22 12:41:10.601103 (XEN) Brought up 32 CPUs
Jun 22 12:41:10.625270 (XEN) Testing NMI watchdog on all CPUs: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 stuck

i.e. at least one CPU has issues with the NMI watchdog (looking at other
runs it seems to vary between 29 and 31). Is this just that the NMI
watchdog doesn't deal well with so many pCPUs?
Or is it a real issue?

Lastly, the eventual "'0' pressed -> dumping Dom0's registers" thing only
seems to dump CPUs 0..9 (inclusive). It seems to vary a bit. I suspect
this is another "didn't wait long enough" thing, possibly on the osstest
end.

Ian.

[0] http://logs.test-lab.xenproject.org/osstest/logs/58821/test-amd64-amd64-libvirt/serial-merlot1.log
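
For anyone wanting to compare these two observations across several runs, here is a rough Python sketch of how the relevant serial log lines could be summarised. It is only an illustration: the function name `summarize_serial_log` and the keys it returns are made up for this example, and the regular expressions assume the "(XEN)" lines look exactly like the ones quoted above. It deliberately does not interpret whether the numbers printed on the watchdog line are the CPUs that responded or the ones that did not; it only reports which CPU numbers appear and which do not.

```python
import re

# Patterns assume the "(XEN)" lines look exactly like those quoted in this mail.
MICROCODE_RE = re.compile(
    r"\(XEN\) microcode: CPU(\d+) updated from revision "
    r"(0x[0-9a-fA-F]+) to (0x[0-9a-fA-F]+)"
)
BROUGHT_UP_RE = re.compile(r"\(XEN\) Brought up (\d+) CPUs")
WATCHDOG_RE = re.compile(
    r"\(XEN\) Testing NMI watchdog on all CPUs:([ \d]*)\s*(\w+)?"
)

# A short excerpt copied from the serial log quoted above.
SAMPLE = """\
Jun 22 12:41:09.633294 (XEN) microcode: CPU2 updated from revision 0x6000822 to 0x6000832
Jun 22 12:41:09.665099 (XEN) microcode: CPU4 updated from revision 0x6000822 to 0x6000832
Jun 22 12:41:10.601103 (XEN) Brought up 32 CPUs
Jun 22 12:41:10.625270 (XEN) Testing NMI watchdog on all CPUs: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 stuck
"""


def summarize_serial_log(text):
    """Hypothetical helper: collect CPU numbers from the lines of interest."""
    # CPUs for which a "microcode: CPUn updated" line was logged.
    updated = sorted({int(m.group(1)) for m in MICROCODE_RE.finditer(text)})

    # Total CPU count, taken from the "Brought up N CPUs" line if present.
    brought_up = BROUGHT_UP_RE.search(text)
    ncpus = int(brought_up.group(1)) if brought_up else None

    # CPU numbers printed on the watchdog line, plus the trailing word.
    listed, verdict = [], None
    watchdog = WATCHDOG_RE.search(text)
    if watchdog:
        listed = [int(tok) for tok in watchdog.group(1).split()]
        verdict = watchdog.group(2)

    all_cpus = range(ncpus) if ncpus is not None else []
    return {
        "ncpus": ncpus,
        "microcode_logged": updated,
        "microcode_not_logged": [c for c in all_cpus if c not in updated],
        "watchdog_listed": listed,
        "watchdog_not_listed": [c for c in all_cpus if c not in listed],
        "watchdog_verdict": verdict,
    }


if __name__ == "__main__":
    for key, value in summarize_serial_log(SAMPLE).items():
        print(f"{key}: {value}")
```

Running it over the full serial log for each flight, rather than this excerpt, should make it easy to see whether the set of CPU numbers on the watchdog line really does move around between runs.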