From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ian Campbell
Subject: Re: Enormous size of libvirt libxl-driver.log with Xen 4.2 and 4.3
Date: Tue, 11 Aug 2015 11:19:50 +0100
Message-ID: <1439288390.9747.216.camel@citrix.com>
References: <1438598868.30740.128.camel@citrix.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <1438598868.30740.128.camel@citrix.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Wei Liu, Jim Fehlig, Ian Jackson, Jan Beulich, Stefano Stabellini
Cc: xen-devel
List-Id: xen-devel@lists.xenproject.org

On Mon, 2015-08-03 at 11:47 +0100, Ian Campbell wrote:
> After the initial expected logging the file is simply full of:
>
> 2015-08-02 19:12:12 UTC libxl: debug: libxl.c:1004:domain_death_xswatch_callback: [evg=0x7f3cc44fa3f0:3] from domid=0 nentries=1 rc=1
> 2015-08-02 19:12:12 UTC libxl: debug: libxl.c:1015:domain_death_xswatch_callback: [evg=0x7f3cc44fa3f0:3] got=domaininfos[0] got->domain=0
> 2015-08-02 19:12:12 UTC libxl: debug: libxl.c:1015:domain_death_xswatch_callback: [evg=0x7f3cc44fa3f0:3] got=domaininfos[1] got->domain=-1
> 2015-08-02 19:12:12 UTC libxl: debug: libxl.c:1023:domain_death_xswatch_callback: got==gotend
>
> Repeated at around 51KHz.

This sounds a lot like 4783c99aab8 (see below for full log message),
which perhaps ought to be backported to the affected branches, i.e. 4.2
and 4.3.

Looks like it was backported to 4.5 (as 0b19348f3cd1) and 4.4 (as
13623d5d8e85) already.

Ian?

Ian.

commit 4783c99aab866f470bd59368cfbf5ad5f677b0ec
Author: Ian Jackson
Date: Tue Mar 17 09:30:57 2015 -0600

libxl: In domain death search, start search at first domid we want

From: Ian Jackson

When domain_death_xswatch_callback needed a further call to
xc_domain_getinfolist it would restart it with the last domain it
found rather than the first one it wants.
If it only wants one it will also only ask for one domain. The result
would then be that it gets the previous domain again (ie, the previous
one to the one it wants), which still doesn't reveal the answer to the
question, and it would therefore loop again.

It's completely unclear to me why I thought it was a good idea to
start the xc_domain_getinfolist with the last domain previously found
rather than the first one left un-confirmed. The code has been that
way since it was introduced.

Instead, start each xc_domain_getinfolist at the next domain whose
status we need to check.

We also need to move the test for !evg into the loop, we now need evg
to compute the arguments to getinfolist.

Signed-off-by: Ian Jackson
Reported-by: Jim Fehlig
Reviewed-by: Jim Fehlig
Tested-by: Jim Fehlig
Acked-by: Wei Liu
Acked-by: Ian Campbell
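
For anyone trying to picture the loop described in the commit message,
here is a rough toy model in Python. It is not the libxl code:
HOST_DOMIDS, mock_getinfolist() and search() are hypothetical names made
up for illustration, and the mock only assumes that
xc_domain_getinfolist returns existing domains in ascending order
starting at the domid it is given. It looks up a single watched domid
with one entry per call, once restarting from the last domid found (the
old behaviour) and once starting at the wanted domid (the fix).

```python
# Toy model only: HOST_DOMIDS and both helpers are hypothetical, not libxl code.
HOST_DOMIDS = [0, 3, 7]  # domains currently present, in ascending order


def mock_getinfolist(first_domid, max_domains):
    """Return up to max_domains existing domids >= first_domid, ascending."""
    return [d for d in HOST_DOMIDS if d >= first_domid][:max_domains]


def search(wanted, start_at_wanted, max_calls=1000):
    """Look up one watched domid, asking for one entry per call.

    Returns (verdict, calls); verdict is "found", "gone" or "gave up".
    max_calls is an artificial cap so the old behaviour terminates here.
    """
    first = wanted if start_at_wanted else 0
    for calls in range(1, max_calls + 1):
        got = mock_getinfolist(first, 1)
        if not got or got[0] > wanted:
            return "gone", calls   # the list has moved past the wanted domid
        if got[0] == wanted:
            return "found", calls
        # got[0] < wanted: inconclusive, so another call is needed.
        # The old behaviour restarts from the last domid found, which just
        # returns the same entry again; the fix restarts from the wanted one.
        first = wanted if start_at_wanted else got[0]
    return "gave up", max_calls


print(search(7, start_at_wanted=False))  # old behaviour: hits the cap
print(search(7, start_at_wanted=True))   # fixed: one call
print(search(5, start_at_wanted=True))   # fixed: domid 5 no longer exists
```

In this model the old behaviour never gets past domid 0 and only stops
because of the artificial cap, which would be consistent with the
repeated "from domid=0 nentries=1 rc=1" lines quoted above.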