Openembedded Core Discussions
From: Richard Purdie <richard.purdie@linuxfoundation.org>
To: openembedded-core <openembedded-core@lists.openembedded.org>
Subject: Sanity Failures - Segfaults in qemu images
Date: Sun, 07 Apr 2013 09:23:27 +0100
Message-ID: <1365323007.6526.229.camel@ted>

We're coming up to release; however, we're struggling with various
sanity test failures that keep showing up on the autobuilder.

A lot of them have been caused by issues in the qemu scripts and the
fact that the systems are being asked to do more in parallel due to the
new autobuilder infrastructure. I believe we have these ones resolved
now.

The ones that worry me are like these two, which happened in the last
build:

http://autobuilder.yoctoproject.org:8011/builders/nightly-arm-lsb/builds/95/steps/Running%20Sanity%20Tests_1/logs/stdio
http://autobuilder.yoctoproject.org:8011/builders/nightly-x86-64-lsb/builds/87/steps/Running%20Sanity%20Tests/logs/stdio

In both cases we have a segfault happening in the guest, one directly
triggered by a sanity test, the other being detected in dmesg.

We saw one of these on the previous build:

http://autobuilder.yoctoproject.org:8011/builders/nightly-x86/builds/92/steps/Running%20Sanity%20Tests/logs/stdio
(ignore the minimal failure, that was likely a timeout issue, resolved
by a recent change)

I've also seen the smart help segfault on a qemumips image. I did
download that one locally and saw the same fault the first time I booted
it. I then didn't see it again, despite running the image many times.
The booting was of a copy of the image so it wasn't a first boot issue.
The checksum matched that on the autobuilder.

At this point I think it may well be a qemu issue but we don't know that
for sure. I've not seen any report of this on real hardware.

The question is how do we debug this? Does anyone have any ideas?

The best idea I've heard so far is to generate a coredump in the image
and save that off; maybe it would give some clue in later analysis. We
could also, upon failure, move the actually booted image somewhere for
later analysis. I wondered if we could save off the qemu state too
somehow. The trouble is none of these are simple coming up to release.
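
As a rough illustration of the "move the booted image somewhere" idea,
here is a minimal sketch, not what the autobuilder actually does. The
helper name preserve_failed_image, the archive directory layout and the
label argument are all hypothetical. It copies the image that was
booted into a timestamped directory and records its SHA-256, so a later
run can confirm it is analysing the same bits that failed.

```python
import hashlib
import shutil
import time
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def preserve_failed_image(image_path, archive_dir, label):
    """Copy the booted image into archive_dir after a sanity failure.

    Hypothetical helper: the directory layout and naming are assumptions
    made for illustration only. Returns (saved_path, checksum).
    """
    image_path = Path(image_path)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest_dir = Path(archive_dir) / f"{label}-{stamp}"
    dest_dir.mkdir(parents=True, exist_ok=True)

    # copy2 keeps timestamps, which can help when correlating with logs.
    saved = dest_dir / image_path.name
    shutil.copy2(image_path, saved)

    # Store the checksum next to the copy, in sha256sum-style format.
    checksum = sha256_of(saved)
    checksum_file = dest_dir / (image_path.name + ".sha256")
    checksum_file.write_text(f"{checksum}  {image_path.name}\n")
    return saved, checksum
```

A sanity-test wrapper could call something like this only when a test
fails, passing the path of the image copy that qemu was started with.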

So if anyone has any ideas on what is causing this or how to debug/fix
it, I'd be very receptive to them.

Cheers,

Richard


Thread overview: 3+ messages
2013-04-07  8:23 Richard Purdie [this message]
2013-04-07 22:32 ` Sanity Failures - Segfaults in qemu images Khem Raj
2013-04-08 15:54   ` Richard Purdie
