From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jim Fehlig
Subject: Re: [libvirt test] 58119: regressions - FAIL
Date: Fri, 12 Jun 2015 18:38:37 -0600
Message-ID: <557B7B8D.8090702@suse.com>
References: <1433755348.7108.402.camel@citrix.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <1433755348.7108.402.camel@citrix.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Ian Campbell
Cc: Anthony Perard, xen-devel@lists.xensource.com, ian.jackson@eu.citrix.com
List-Id: xen-devel@lists.xenproject.org

Ian Campbell wrote:
> On Mon, 2015-06-08 at 04:37 +0000, osstest service user wrote:
>
>> flight 58119 libvirt real [real]
>> http://logs.test-lab.xenproject.org/osstest/logs/58119/
>>
>> Regressions :-(
>>
>> Tests which did not succeed and are blocking,
>> including tests which could not be run:
>>
>
> This has been failing for a while now, sorry for not bringing it to your
> attention sooner.
>

I've noticed all the failures, but unfortunately they are all the same
and contain no info as to why qemu failed to start. The issue was
discussed during the last OpenStack+Xen+libvirt meetup, since Anthony is
seeing a similar issue with the OpenStack CI.

I've attempted to reproduce it locally with a script that continuously
runs concurrent create, shutdown, and destroy operations, but after many
thousands of iterations I eventually ran into other issues:

# xl list
Name                 ID   Mem VCPUs   State    Time(s)
Domain-0              0  1871    48   r-----  152599.8
(null)            14058     0     1   --ps-d      24.8
(null)            14060     0     4   --p--d      20.9

libvirt has finished all cleanup of these domains. The qemu processes
and all xenstore entries associated with these domains are gone. Yet
something still holds a refcnt on them.
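
The leftover rows in the xl list output above can be picked out
mechanically, which may help when the reproducer runs unattended. This
is only a rough sketch and not part of my test script: the function name
dying_domains and the SAMPLE text are made up for illustration, and the
row pattern assumes the six-character state column that xl prints, where
'd' marks a dying domain.

```python
import re

# One data row of `xl list`: name, domid, memory, vcpus, state flags, time.
# The state column is six characters drawn from r, b, p, s, c, d or '-'.
ROW = re.compile(
    r"^(?P<name>.+?)\s+(?P<id>\d+)\s+(?P<mem>\d+)\s+(?P<vcpus>\d+)"
    r"\s+(?P<state>[rbpscd-]{6})\s+(?P<time>[\d.]+)$"
)


def dying_domains(xl_list_output):
    """Return (name, domid, state) for each row whose state has the 'd' flag."""
    found = []
    for line in xl_list_output.splitlines():
        match = ROW.match(line.strip())
        # The header row does not match because its ID column is not numeric.
        if match and "d" in match.group("state"):
            found.append(
                (match.group("name"), int(match.group("id")), match.group("state"))
            )
    return found


# Hypothetical sample mirroring the listing quoted in this mail.
SAMPLE = """\
Name                 ID   Mem VCPUs   State    Time(s)
Domain-0              0  1871    48   r-----  152599.8
(null)            14058     0     1   --ps-d      24.8
(null)            14060     0     4   --p--d      20.9
"""

if __name__ == "__main__":
    for name, domid, state in dying_domains(SAMPLE):
        print(domid, state)
```

In a real run the text would come from invoking xl list rather than from
a fixed string; checking only the state column avoids false hits on
domain names that happen to contain a 'd'.
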

From q debug key:

(XEN) General information for domain 14058:
(XEN)     refcnt=1 dying=2 pause_count=2
(XEN)     nr_pages=2 xenheap_pages=0 shared_pages=0 paged_pages=0 dirty_cpus={} max_pages=131328
(XEN)     handle=039e9ee6-4a84-3055-4c81-8ba426ae2656 vm_assist=00000004
(XEN) General information for domain 14060:
(XEN)     refcnt=1 dying=2 pause_count=2
(XEN)     nr_pages=2 xenheap_pages=0 shared_pages=0 paged_pages=0 dirty_cpus={} max_pages=131328
(XEN)     handle=04f95218-76b6-60f1-87f1-b51795e3b6ae vm_assist=00000000
(XEN)     paging assistance: hap refcounts translate external

Further debug suggestions appreciated :-).

My setup is running recent xen.git (e13013db) and libvirt.git
(efc68de5) master. libvirt contains an additional patch to clean up
virDomainObj ref counting in the libxl driver, which fixed the first
problem the test script encountered. I sent the patch to Anthony to test
in his OpenStack CI setup.

Regards,
Jim