From: Richard Purdie <richard.purdie@linuxfoundation.org>
To: Randy MacLeod <randy.macleod@windriver.com>,
openembedded-core@lists.openembedded.org,
Armin Kuster <akuster808@gmail.com>,
"Huang, Jie (Jackie)" <Jackie.Huang@windriver.com>,
"Xu, Chi" <Chi.Xu@windriver.com>,
"Yang, Liezhi" <Liezhi.Yang@windriver.com>,
"WOLD, SAUL" <saul.wold@intel.com>
Subject: Re: Yocto Project Status WW47’17
Date: Thu, 23 Nov 2017 10:20:07 +0000
Message-ID: <1511432407.862.111.camel@linuxfoundation.org>
In-Reply-To: <45922039-30bd-7fbc-89be-c037c068a0db@windriver.com>
[-- Attachment #1: Type: text/plain, Size: 1667 bytes --]
On Mon, 2017-11-20 at 20:22 -0500, Randy MacLeod wrote:
> > o Issues with 4.13.10 host kernels booting kvm x86 guests on
> > Tumbleweed (Suse) and Fedora 26 (attempting to see if 4.13.12
> > helps)
>
> Robert, can you test Fedora 26. It would help to have a defect open
> with steps to reproduce or
> something about the typical workflow/ build time/ day of the week/
> phase of the moon.
We have some further data:
a) The issue occurs on 4.13.12
rpurdie@tumbleweed:/tmp> cat /proc/version
Linux version 4.13.12-1-vanilla (geeko@buildhost) (gcc version 7.2.1 20171020 [gcc-7-branch revision 253932] (SUSE Linux)) #1 SMP PREEMPT Wed Nov 8 11:21:09 UTC 2017 (9151c66)
b) The hang usually occurs at the TIMER line in the kernel logs but can
occur after booting into userspace around the udevd line, or
occasionally later in the boot process.
c) The similarity between this and the ppc bug I worked on makes me
strongly suspect that qemu's timers stop firing and the guest is left
sitting in the idle loop.
d) I now have a way to brute-force the hangs at will. The attached
"runqemu-parallel.py" script runs 50 qemus in parallel. Within around
45s, 10 had hung on the autobuilder. I can provide more info on using
the script if it's not obvious. It assumes my recent master changes to
the qemuconf files, so we don't need to run bitbake -e to run runqemu.
This could well be the same kind of locking issue we saw on ppc. I'll
continue to look into that.
Hopefully this extra information will put us on a good track to
resolving it now. The issue continues to break builds and block patch
merging.
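Point (b) above names the places where a boot tends to stop. As a rough aid, a helper along the following lines could tally where each console log ended. This is a minimal sketch and not part of the attached script: the function names, the labels, and the assumption that the strings "TIMER", "udevd" and "login:" appear literally in the console log are all illustrative.

```python
import glob
from collections import Counter

# Hypothetical markers, latest boot stage first. The strings are assumed
# to appear verbatim in the guest console log.
STAGES = [
    ("login:", "booted"),
    ("udevd", "userspace (udevd or later)"),
    ("TIMER", "kernel (TIMER)"),
]


def classify_boot_log(text):
    """Return a label for the furthest boot stage seen in a console log."""
    for marker, label in STAGES:
        if marker in text:
            return label
    return "early (before TIMER)"


def summarise_logs(pattern):
    """Count the stage reached by every log file matching a glob pattern."""
    counts = Counter()
    for path in glob.glob(pattern):
        with open(path, "r", errors="replace") as f:
            counts[classify_boot_log(f.read())] += 1
    return counts
```

Calling summarise_logs("/tmp/runqemu-log.*") after stopping the parallel run would give a count per stage, using the log path from the attached script.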
Cheers,
Richard
[-- Attachment #2: runqemu-parallel.py --]
[-- Type: text/x-python, Size: 2190 bytes --]
#!/usr/bin/env python3
import shutil
import time
import subprocess
import os
procs = list(range(1, 50))
logs = "/tmp/runqemu-log"
errlogs = "/tmp/zrunqemu-log"
logfds = {}
errlogfds = {}
processes = {}
image = "/media/build1/poky/build/tmp-musl/deploy/images/qemuppc/core-image-sato-qemuppc.qemuboot.conf"
kernel = "tmp-musl/deploy/images/qemuppc/vmlinux-qemuppc.bin"
machine = "qemuppc"
image = "/media/build1/poky/build/tmp-musl/deploy/images/qemux86-64/core-image-sato-qemux86-64.qemuboot.conf"
kernel = "tmp-musl/deploy/images/qemux86-64/bzImage-qemux86-64.bin"
machine = "qemux86-64"
env = os.environ.copy()
env['MACHINE'] = machine
def start(p):
    logfds[p] = open(logs + ".%d" % p, "w")
    errlogfds[p] = open(errlogs + ".%d" % p, "w")
    #debugopts = "-d unimp,guest_errors,int,out_asm,op,op_opt,op_ind,mmu,pcall,cpu_reset,nochain"
    #exec,cpu,in_asm,
    debugopts = "-d unimp,guest_errors,int"
    debugopts += " -D /tmp/qemu.%d -monitor pty" % p
    processes[p] = subprocess.Popen(["runqemu", machine, "nographic", "snapshot", "kvm", kernel, image, "qemuparams=-pidfile /tmp/zzqemu.%d.pid %s" % (p, debugopts)], stdout=logfds[p], stderr=errlogfds[p], env=env)
    print("Started %d" % p)

for p in procs:
    start(p)

lastsize = {}

while procs:
    sizes = {}
    time.sleep(0.5)
    for p in procs:
        logfile = logs + ".%d" % p
        pidfn = "/tmp/zzqemu.%d.pid" % p
        with open(logfile, "r") as lf:
            for l in lf:
                if "login:" in l:
                    with open(pidfn) as pidfile:
                        pid = int(pidfile.readline().strip())
                    print("Killing %d with pid %d" % (p, pid))
                    os.kill(pid, 9)
                    procs.remove(p)
                    if p in lastsize:
                        del lastsize[p]
                    time.sleep(0.1)
                    logfds[p].close()
                    errlogfds[p].close()
                    start(p)
                    procs.append(p)
        sizes[p] = os.stat(logfile).st_size
        if p in lastsize and sizes[p]:
            if not sizes[p] > lastsize[p]:
                print("%d stalled?" % p)
    lastsize = sizes
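
The loop above reports "stalled?" whenever a log has not grown between two polls 0.5s apart, so a slow but healthy boot can be flagged as well. One possible refinement is to require several consecutive idle polls before calling an instance hung. The sketch below is an assumption-laden illustration, not part of the attachment: the StallTracker class, its method names, and the default patience of 20 polls are all hypothetical.

```python
class StallTracker:
    """Flag an instance as hung after `patience` consecutive idle polls."""

    def __init__(self, patience=20):
        # patience=20 is an arbitrary choice: about 10s at one poll per 0.5s.
        self.patience = patience
        self.last_size = {}
        self.idle_polls = {}

    def update(self, p, size):
        """Record the log size for instance p; return True if it looks hung."""
        if size > self.last_size.get(p, -1):
            self.idle_polls[p] = 0          # log grew, so reset the counter
        else:
            self.idle_polls[p] = self.idle_polls.get(p, 0) + 1
        self.last_size[p] = size
        # Like the original check, never report an empty log as stalled.
        return size > 0 and self.idle_polls[p] >= self.patience

    def reset(self, p):
        """Forget instance p, e.g. after it was killed and restarted."""
        self.last_size.pop(p, None)
        self.idle_polls.pop(p, None)
```

In the polling loop, tracker.update(p, os.stat(logfile).st_size) would stand in for the lastsize comparison, with tracker.reset(p) called wherever an instance is restarted.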
Thread overview: 13+ messages
2017-11-20 15:36 Yocto Project Status WW47’17 Jolley, Stephen K
2017-11-21 1:22 ` Randy MacLeod
2017-11-21 1:56 ` Huang, Jie (Jackie)
2017-11-21 11:09 ` Richard Purdie
2017-11-21 11:14 ` Burton, Ross
2017-11-21 15:32 ` Wold, Saul
2017-11-22 6:09 ` Robert Yang
2017-11-23 10:20 ` Richard Purdie [this message]
2017-11-28 7:49 ` Robert Yang
2017-11-28 10:47 ` Robert Yang
2017-11-28 12:48 ` Good news " Robert Yang
2017-11-28 13:15 ` Richard Purdie
2017-11-28 13:46 ` Robert Yang