* xl: libxl_domain_info: getting domain info list: Bad address
From: Julien Grall @ 2015-09-10 17:07 UTC
To: Ian Campbell, Ian Jackson, Wei Liu, xen-devel; +Cc: Riku Voipio
Hi,
Riku reported to me an error seen in their CI loop while running Xen on the Arndale:
Starting /usr/sbin/xenstored...
Setting domain 0 name, domid and JSON config...
libxl: error: libxl.c:675:libxl_domain_info: getting domain info list: Bad address
libxl: error: libxl_dom.c:1869:libxl__userdata_path: unable to find domain info for domain 0: Bad address
cannot store stub json config for Dom0
Starting xenconsoled...
Starting QEMU as disk backend for dom0
/etc/init.d/xencommons: line 102: qemu-system-i386: command not found
libxl: error: libxl.c:656:libxl_list_domain: getting domain info list: Bad address
libxl_list_domain failed.
The full log can be found here: https://paste.debian.net/311187
I've looked at the osstest logs and found the same errors very often
on the Arndale, although the tests still seem to pass. See, for
instance, [1].
Does anyone have an idea what could be going wrong?
Regards,
[1] http://logs.test-lab.xenproject.org/osstest/logs/61618/test-armhf-armhf-xl-arndale/info.html
--
Julien Grall

* Re: xl: libxl_domain_info: getting domain info list: Bad address
From: Riku Voipio @ 2015-09-11 7:55 UTC
To: Julien Grall; +Cc: Wei Liu, Ian Jackson, Ian Campbell, xen-devel

Hi,

On 10 September 2015 at 20:07, Julien Grall <julien.grall@citrix.com> wrote:
> Hi,
>
> Riku reported me an error on their CI loop while run Xen on the Arndale:
>
> Starting /usr/sbin/xenstored...
> Setting domain 0 name, domid and JSON config...
> libxl: error: libxl.c:675:libxl_domain_info: getting domain info list: Bad address
> libxl: error: libxl_dom.c:1869:libxl__userdata_path: unable to find domain info for domain 0: Bad address
> cannot store stub json config for Dom0
> Starting xenconsoled...
> Starting QEMU as disk backend for dom0
> /etc/init.d/xencommons: line 102: qemu-system-i386: command not found
> libxl: error: libxl.c:656:libxl_list_domain: getting domain info list: Bad address
> libxl_list_domain failed.
>
> The full log can be found here: https://paste.debian.net/311187
>
> I've looked at the osstest log and was able to find the same errors very often
> on the arndale. Although, it seems that the tests are still passing. For
> instance [1].

Hi,

It looks like the errors started on Sep 4th, while Sep 3rd was still OK.
The Xen binary was the same for both test runs; only the kernel (which
follows mainline) changed.

The failing kernel was 807249d3ada1ff28a47c4054ca4edd479421b671
while the last succeeding one was 1e1a4e8f439113b7820bc7150569f685e1cc2b43

This range includes the merge of ARM development updates from Russell King
for 4.3, which probably contains the change that breaks Xen.

Riku

* Re: xl: libxl_domain_info: getting domain info list: Bad address
From: Julien Grall @ 2015-09-11 9:50 UTC
To: Riku Voipio; +Cc: Wei Liu, Ian Jackson, Ian Campbell, xen-devel

On 11/09/2015 08:55, Riku Voipio wrote:
> It looks like the errors started Sep 4th, while Sep 3rd was still OK.
> The Xen binary was same for both test runs, only the kernel (which
> follows mainline) was changed.
>
> Failing kernel was 807249d3ada1ff28a47c4054ca4edd479421b671
> While last succeeding was 1e1a4e8f439113b7820bc7150569f685e1cc2b43

Thank you for narrowing it down.

> This range includes merge of ARM development updates from Russell King
> for 4.3, which probably contains the change that breaks Xen.

I've been able to reproduce it on midway, so it's not specific to the
Arndale board. However, I don't see anything obvious in the log which
could break Xen.

I will try a manual bisection to see if I can pinpoint a specific commit.

--
Julien Grall

* Re: xl: libxl_domain_info: getting domain info list: Bad address
From: Ian Campbell @ 2015-09-11 8:52 UTC
To: Julien Grall, Ian Jackson, Wei Liu, xen-devel; +Cc: Riku Voipio

On Thu, 2015-09-10 at 18:07 +0100, Julien Grall wrote:
> Hi,
>
> Riku reported me an error on their CI loop while run Xen on the Arndale:
>
> Starting /usr/sbin/xenstored...
> Setting domain 0 name, domid and JSON config...
> libxl: error: libxl.c:675:libxl_domain_info: getting domain info list: Bad address
> libxl: error: libxl_dom.c:1869:libxl__userdata_path: unable to find domain info for domain 0: Bad address
> cannot store stub json config for Dom0
> Starting xenconsoled...
> Starting QEMU as disk backend for dom0
> /etc/init.d/xencommons: line 102: qemu-system-i386: command not found
> libxl: error: libxl.c:656:libxl_list_domain: getting domain info list: Bad address
> libxl_list_domain failed.

FWIW this caused this recent test failure of linux-next:
http://logs.test-lab.xenproject.org/osstest/logs/61690/test-armhf-armhf-xl-arndale/info.html
I don't know for how long it has been failing, but it may or may not be
bisectable by the automated bisector.

> The full log can be found here: https://paste.debian.net/311187
>
> I've looked at the osstest log and was able to find the same errors very often
> on the arndale. Although, it seems that the tests are still passing. For
> instance [1].

I don't see this message in any of the test logs.

It does appear in the serial logs, but at "Sep 9 00:47:23" and "Sep
9 01:47:29", while this test was running from "2015-09-09 08:01:32 Z" until
just after "2015-09-09 15:16:16 Z".

This is because the serial logs are not rotated for each new job and they
aren't trimmed to only the relevant time span, so they can (and almost
always do) contain stuff from previous tests. You generally need to scroll
to the end.

There are two related potential improvements which could be made to osstest
here. The first would be to arrange for the serial logs to be trimmed to the
relevant time span; the second would be a new ts-logs-audit step which runs
after ts-logs-capture and checks for anything amiss (e.g. segfaults in the
kernel logs).

Obviously if you do the second without the first, the code would need to be
careful to only look at relevant lines in serial.log (other logs should be
ok).

Ian.
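
The first improvement suggested above, trimming a serial log to one job's time span, could be sketched roughly as follows. This is not osstest code: the function name `trim_serial_log`, the assumption that each log line starts with a syslog-style "Sep 9 00:47:23" timestamp, the explicit `year` argument, and the assumption that log timestamps and job window share a time zone are all hypothetical choices made for illustration.

```python
from datetime import datetime


def trim_serial_log(lines, start, end, year):
    """Return the lines whose leading timestamp falls inside [start, end].

    Assumption: each timestamped line starts with "Mon D HH:MM:SS", which
    carries no year, so the caller supplies one. Lines without a parseable
    timestamp are treated as continuations of the previous line.
    """
    kept = []
    in_window = False
    for line in lines:
        fields = line.split()[:3]
        try:
            stamp = datetime.strptime(
                f"{year} " + " ".join(fields), "%Y %b %d %H:%M:%S"
            )
        except ValueError:
            # No timestamp here: keep the line only if its predecessor was kept.
            if in_window:
                kept.append(line)
            continue
        in_window = start <= stamp <= end
        if in_window:
            kept.append(line)
    return kept
```

With the job window quoted above (08:01:32 to 15:16:16 on 2015-09-09), a line stamped "Sep 9 00:47:23" would be dropped as belonging to an earlier job. A later audit step could then scan only the returned lines.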

* Re: xl: libxl_domain_info: getting domain info list: Bad address
From: Julien Grall @ 2015-09-11 9:26 UTC
To: Ian Campbell, Ian Jackson, Wei Liu, xen-devel; +Cc: Riku Voipio

Hi Ian,

On 11/09/2015 09:52, Ian Campbell wrote:
> On Thu, 2015-09-10 at 18:07 +0100, Julien Grall wrote:
>> Hi,
>>
>> Riku reported me an error on their CI loop while run Xen on the Arndale:
>>
>> Starting /usr/sbin/xenstored...
>> Setting domain 0 name, domid and JSON config...
>> libxl: error: libxl.c:675:libxl_domain_info: getting domain info list: Bad address
>> libxl: error: libxl_dom.c:1869:libxl__userdata_path: unable to find domain info for domain 0: Bad address
>> cannot store stub json config for Dom0
>> Starting xenconsoled...
>> Starting QEMU as disk backend for dom0
>> /etc/init.d/xencommons: line 102: qemu-system-i386: command not found
>> libxl: error: libxl.c:656:libxl_list_domain: getting domain info list: Bad address
>> libxl_list_domain failed.
>
> FWIW this caused this recent test failure of linux-next:
> http://logs.test-lab.xenproject.org/osstest/logs/61690/test-armhf-armhf-xl-arndale/info.html
> I don't know for how long it has been failing, but may or may not be
> bisectable by the automated bisector.

It has been causing the issue on linux-next since the end of August:
http://logs.test-lab.xenproject.org/osstest/results/history/test-armhf-armhf-xl-arndale/linux-next.html

The same problem has appeared in linux-linus since the 8th of September:
http://logs.test-lab.xenproject.org/osstest/results/history/test-armhf-armhf-xl-arndale/linux-linus.html

Maybe the last job can be bisected from the v4.2 tag?

>
>> The full log can be found here: https://paste.debian.net/311187
>>
>> I've looked at the osstest log and was able to find the same errors very often
>> on the arndale. Although, it seems that the tests are still passing. For
>> instance [1].
>
> I don't see this message in any of the test logs.
>
> It does appear in the serial logs, but with at "Sep 9 00:47:23" and "Sep
> 9 01:47:29" while this test was running from "2015-09-09 08:01:32 Z" until
> just after "2015-09-09 15:16:16 Z".
>
> This is because the serial logs are not rotated for each new job and they
> aren't trimmed to only the relevant time span, so they can (and almost
> always do) contain stuff from previous tests. You generally need to scroll
> to the end.

Damn, I thought the serial logs only contained the serial messages
for the current job.

I was grepping the logs to see when the error first appeared. Sorry
for the confusion.

Regards,

--
Julien Grall

* Re: xl: libxl_domain_info: getting domain info list: Bad address
From: Ian Campbell @ 2015-09-11 9:38 UTC
To: Julien Grall, Ian Jackson, Wei Liu, xen-devel; +Cc: Riku Voipio

On Fri, 2015-09-11 at 10:26 +0100, Julien Grall wrote:
> Hi Ian,
>
> On 11/09/2015 09:52, Ian Campbell wrote:
> > On Thu, 2015-09-10 at 18:07 +0100, Julien Grall wrote:
> > > Hi,
> > >
> > > Riku reported me an error on their CI loop while run Xen on the
> > > Arndale:
> > >
> > > Starting /usr/sbin/xenstored...
> > > Setting domain 0 name, domid and JSON config...
> > > libxl: error: libxl.c:675:libxl_domain_info: getting domain info list: Bad address
> > > libxl: error: libxl_dom.c:1869:libxl__userdata_path: unable to find domain info for domain 0: Bad address
> > > cannot store stub json config for Dom0
> > > Starting xenconsoled...
> > > Starting QEMU as disk backend for dom0
> > > /etc/init.d/xencommons: line 102: qemu-system-i386: command not found
> > > libxl: error: libxl.c:656:libxl_list_domain: getting domain info list: Bad address
> > > libxl_list_domain failed.
> >
> > FWIW this caused this recent test failure of linux-next:
> > http://logs.test-lab.xenproject.org/osstest/logs/61690/test-armhf-armhf-xl-arndale/info.html
> > I don't know for how long it has been failing, but may or may not be
> > bisectable by the automated bisector.
>
> It's causing the issue on Linux next since the end of august:
> http://logs.test-lab.xenproject.org/osstest/results/history/test-armhf-armhf-xl-arndale/linux-next.html
>
> The same problem appears in linux-linus from the 8th of september:
> http://logs.test-lab.xenproject.org/osstest/results/history/test-armhf-armhf-xl-arndale/linux-linus.html
>
> Maybe the last job can be bisect from v4.2 tag?

It looks like it tried and got some really weird error:
http://logs.test-lab.xenproject.org/osstest/results/bisect/linux-linus/test-armhf-armhf-xl-arndale.leak-check--basis%288%29.html

Revision graph generation failed!
Error message:
dot -Tps -o/home/logs/results/bisect/linux-linus/test-armhf-armhf-xl-arndale.leak-check--basis(8).ps /home/logs/results/bisect/linux-linus/test-armhf-armhf-xl-arndale.leak-check--basis(8).dot: 512 at Osstest.pm line 357.

It worked for me manually, I wonder if the issue is lack of escaping for
the ()'s? Ian?

> > > The full log can be found here: https://paste.debian.net/311187
> > >
> > > I've looked at the osstest log and was able to find the same errors
> > > very often
> > > on the arndale. Although, it seems that the tests are still passing.
> > > For
> > > instance [1].
> >
> > I don't see this message in any of the test logs.
> >
> > It does appear in the serial logs, but with at "Sep 9 00:47:23" and
> > "Sep
> > 9 01:47:29" while this test was running from "2015-09-09 08:01:32 Z"
> > until
> > just after "2015-09-09 15:16:16 Z".
> >
> > This is because the serial logs are not rotated for each new job and
> > they
> > aren't trimmed to only the relevant time span, so they can (and almost
> > always do) contain stuff from previous tests. You generally need to
> > scroll
> > to the end.
>
> Damn, I though the serial logs was only containing the serial messages
> for the current job.
>
> I was doing some grep in the logs to see when it first appears. Sorry
> for the confusion
>
> Regards,
>

* Re: xl: libxl_domain_info: getting domain info list: Bad address
From: Ian Jackson @ 2015-09-11 15:34 UTC
To: Ian Campbell; +Cc: Julien Grall, Riku Voipio, Wei Liu, xen-devel

Ian Campbell writes ("Re: xl: libxl_domain_info: getting domain info list: Bad address"):
> It worked for me manually, I wonder if the issue is lack of escaping for
> the ()'s?

Yes.

Ian.

From 29e08dfa3a5c5a5aeb51fd01c67345e20cbb33c5 Mon Sep 17 00:00:00 2001
From: Ian Jackson <ian.jackson@eu.citrix.com>
Date: Fri, 11 Sep 2015 16:27:08 +0100
Subject: [OSSTEST PATCH] cs-bisection-step: Cope with graph-out (testids)
 containing ( ) etc.

cr-try-bisect launders / in the testid but relies on other characters
being handled appropriately by cs-bisection-step. So for example it
can pass

  graph-out=/home/logs/results/bisect/linux-linus/test-armhf-armhf-xl-arndale.leak-check--basis(8)

But cs-bisection step foolishly assumed that the --graph-out argument
did not contain any shell metacharacters. Fix this.

Specifically:

 * Change invocations of perl's open to use the 3-argument form
 * Change invocations of system to pass individual arguments rather
   than constructing a shell script fragment and relying on the shell
   to split it up.
 * In particular, in the png processing pipeline, use the "sh -ec
   <script> x <arg>..." technique to pass the input and output
   filenames in a way that does not expose them to the shell's parser.
   To avoid making this code more tangled than it already is, also
   break out the construction of what is now $scriptlet.
 * Escape metacharacters in the URIs we put in the html output.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 cs-bisection-step | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/cs-bisection-step b/cs-bisection-step
index b676044..ebecda4 100755
--- a/cs-bisection-step
+++ b/cs-bisection-step
@@ -32,6 +32,7 @@ use Data::Dumper;
 no warnings qw(recursion);
 use HTML::Entities;
 use Osstest::Executive;
+use URI::Escape;
 
 our @blessings= qw(real real-bisect);
 our @revtuplegenargs= ();
@@ -945,7 +946,7 @@ sub odot ($) { print DOT $_[0] or die $!; }
 sub writegraph () {
     return unless length $graphfile;
 
-    open DOT, "> $graphfile.dot" or die "$graphfile.dot $!";
+    open DOT, ">", "$graphfile.dot" or die "$graphfile.dot $!";
 
     odot(<<END);
 digraph "$job $testid" {
@@ -1005,8 +1006,9 @@ END
 
     my $href= $graphfile;
     $href =~ s,.*/,,;
+    $href = uri_escape($href, '^-._+,=0-9a-zA-Z');
 
-    open HTML, "> $graphfile.html" or die "$graphfile.html $!";
+    open HTML, ">", "$graphfile.html" or die "$graphfile.html $!";
     print HTML <<END
 <html><head><title>bisection $branch $job $testid</title></head>
 <body>
@@ -1021,17 +1023,20 @@ END
 
     if (eval {
         foreach my $fmt (qw(ps png)) {
-            system_checked("dot -T$fmt -o$graphfile.$fmt $graphfile.dot");
+            system_checked("dot", "-T$fmt", "-o$graphfile.$fmt",
+                           "$graphfile.dot");
         }
         1;
     }) {
         my $gsize = $c{BisectionRevisonGraphSize};
-        system_checked("pngtopnm <$graphfile.png".
-                       " | pnmscale -xysize ".
-                       ($gsize =~ m/^(\d+)x(\d+)$/ ? "$1 $2" :
-                        $gsize =~ m/^(\d+)$/ ? "$1 $1" :
-                        die "$gsize ?").
-                       " | pnmtopng >$graphfile.mini.png");
+        my $scriptlet = 'pngtopnm <$1';
+        $scriptlet .= " | pnmscale -xysize ";
+        $scriptlet .= $gsize =~ m/^(\d+)x(\d+)$/ ? "$1 $2" :
+                      $gsize =~ m/^(\d+)$/ ? "$1 $1" :
+                      die "$gsize ?";
+        $scriptlet .= ' | pnmtopng >$2';
+        system_checked(qw(sh -ec), $scriptlet, 'x',
+                       "$graphfile.png", "$graphfile.mini.png");
         print HTML <<END or die $!;
 <h2>Revision graph overview</h2>
 <img src="$href.mini.png">
-- 
1.7.10.4

* Re: xl: libxl_domain_info: getting domain info list: Bad address
From: Ian Campbell @ 2015-09-11 16:01 UTC
To: Ian Jackson; +Cc: Julien Grall, Riku Voipio, Wei Liu, xen-devel

On Fri, 2015-09-11 at 16:34 +0100, Ian Jackson wrote:
> From 29e08dfa3a5c5a5aeb51fd01c67345e20cbb33c5 Mon Sep 17 00:00:00 2001
> From: Ian Jackson <ian.jackson@eu.citrix.com>
> Date: Fri, 11 Sep 2015 16:27:08 +0100
> Subject: [OSSTEST PATCH] cs-bisection-step: Cope with graph-out (testids)
>  containing ( ) etc.
>
> cr-try-bisect launders / in the testid but relies on other characters
> being handled appropriately by cs-bisection-step. So for example it
> can pass
>
>   graph-out=/home/logs/results/bisect/linux-linus/test-armhf-armhf-xl-arndale.leak-check--basis(8)
>
> But cs-bisection step foolishly assumed that the --graph-out argument
> did not contain any shell metacharacters. Fix this.
>
> Specifically:
>
>  * Change invocations of perl's open to use the 3-argument form
>  * Change invocations of system to pass individual arguments rather
>    than constructing a shell script fragment and relying on the shell
>    to split it up.
>  * In particular, in the png processing pipeline, use the "sh -ec
>    <script> x <arg>..." technique to pass the input and output
>    filenames in a way that does not expose them to the shell's parser.
>    To avoid making this code more tangled than it already is, also
>    break out the construction of what is now $scriptlet.
>  * Escape metacharacters in the URIs we put in the html output.
>
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>
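
For readers unfamiliar with the "sh -ec <script> x <arg>..." technique named in the patch description, here is a minimal sketch of the same idea in Python rather than Perl. It is an illustration under assumptions, not the osstest code: the function names `run_interpolated` and `run_positional` and the temporary file names are hypothetical, and `tr` stands in for the pngtopnm/pnmscale pipeline so the example does not depend on netpbm being installed.

```python
import os
import subprocess
import tempfile


def run_interpolated(src, dst):
    # Unsafe: the filenames become part of the script text, so the shell
    # parses any metacharacters they contain, such as ( and ).
    script = f"tr a-z A-Z <{src} >{dst}"
    return subprocess.run(
        ["sh", "-ec", script], stderr=subprocess.DEVNULL
    ).returncode


def run_positional(src, dst):
    # Safe: the script text is constant. 'x' fills $0, and the filenames
    # arrive as $1 and $2, which the shell expands without re-parsing them.
    script = 'tr a-z A-Z <"$1" >"$2"'
    return subprocess.run(["sh", "-ec", script, "x", src, dst]).returncode


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        # A filename with parentheses, like the testid in the failing job.
        src = os.path.join(tmp, "leak-check--basis(8).txt")
        with open(src, "w") as f:
            f.write("graph\n")
        unsafe_rc = run_interpolated(src, src + ".unsafe")
        safe_rc = run_positional(src, src + ".safe")
        with open(src + ".safe") as f:
            out = f.read()
        return unsafe_rc, safe_rc, out
```

Calling `demo()` should show the interpolated form failing with a non-zero status (the shell rejects the unquoted parentheses) while the positional form succeeds. The design point is the same as in the patch: keep untrusted strings out of the text the shell has to parse.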

* Re: xl: libxl_domain_info: getting domain info list: Bad address
From: Julien Grall @ 2015-09-11 14:50 UTC
To: Ian Campbell, Ian Jackson, Wei Liu, xen-devel; +Cc: Riku Voipio

On 10/09/15 18:07, Julien Grall wrote:
> Hi,
>
> Riku reported me an error on their CI loop while run Xen on the Arndale:
>
> Starting /usr/sbin/xenstored...
> Setting domain 0 name, domid and JSON config...
> libxl: error: libxl.c:675:libxl_domain_info: getting domain info list: Bad address
> libxl: error: libxl_dom.c:1869:libxl__userdata_path: unable to find domain info for domain 0: Bad address
> cannot store stub json config for Dom0
> Starting xenconsoled...
> Starting QEMU as disk backend for dom0
> /etc/init.d/xencommons: line 102: qemu-system-i386: command not found
> libxl: error: libxl.c:656:libxl_list_domain: getting domain info list: Bad address
> libxl_list_domain failed.
>
> The full log can be found here: https://paste.debian.net/311187
>
> I've looked at the osstest log and was able to find the same errors very often
> on the arndale. Although, it seems that the tests are still passing. For
> instance [1].
>
> Does anyone have an idea what could go wrong?

FYI, it was an issue in Linux. A patch has been posted [1].

[1] https://lkml.org/lkml/2015/9/11/373

--
Julien Grall