Openembedded Core Discussions
* Sanity Failures - Segfaults in qemu images
@ 2013-04-07  8:23 Richard Purdie
  2013-04-07 22:32 ` Khem Raj
  0 siblings, 1 reply; 3+ messages in thread
From: Richard Purdie @ 2013-04-07  8:23 UTC
  To: openembedded-core

We're coming up to release; however, we're struggling with various sanity
test failures that keep showing up on the autobuilder.

A lot of them have been caused by issues in the qemu scripts and the
fact that the systems are being asked to do more in parallel due to the
new autobuilder infrastructure. I believe we have these ones resolved
now.

The ones that worry me are like the two that happened in the last build,
for example:

http://autobuilder.yoctoproject.org:8011/builders/nightly-arm-lsb/builds/95/steps/Running%20Sanity%20Tests_1/logs/stdio
http://autobuilder.yoctoproject.org:8011/builders/nightly-x86-64-lsb/builds/87/steps/Running%20Sanity%20Tests/logs/stdio

In both cases we have a segfault happening in the guest, one directly
triggered by a sanity test, the other being detected in dmesg.
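
A minimal sketch of how the dmesg check could be automated. It assumes the
x86 kernel log format for user-space segfaults; the function name and the
sample log line are hypothetical, and other architectures (ARM, MIPS) report
faults in a different format that this pattern would not match.

```python
import re

# Assumed x86 kernel log format for a user-space segfault, for example:
#   smart[612]: segfault at 0 ip 00007f3a5c1b2d10 sp 00007ffc1e2a9b48 error 4 in libc-2.17.so[7f3a5c130000+1a0000]
# The trailing " in <object>[...]" part is optional in the kernel output.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<error>[0-9a-f]+)"
    r"(?: in (?P<obj>[^\[\s]+)\[)?"
)


def find_segfaults(dmesg_text):
    """Return one dict per segfault line found in the given dmesg text."""
    return [m.groupdict() for m in SEGFAULT_RE.finditer(dmesg_text)]
```

Feeding it the captured guest dmesg gives the faulting command, pid and
(when reported) the mapped object, which makes it easier to see whether the
faults cluster in one binary or library.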

We saw one of these on the previous build:

http://autobuilder.yoctoproject.org:8011/builders/nightly-x86/builds/92/steps/Running%20Sanity%20Tests/logs/stdio
(ignore the minimal failure, that was likely a timeout issue, resolved
by a recent change)

I've also seen the smart help segfault on a qemumips image. I did
download that one locally and saw the same fault the first time I booted
it. I then didn't see it again, despite running the image many times.
The booting was of a copy of the image so it wasn't a first boot issue.
The checksum matched that on the autobuilder.

At this point I think it may well be a qemu issue but we don't know that
for sure. I've not seen any report of this on real hardware.

The question is how do we debug this? Does anyone have any ideas?

The best idea I've heard so far is to generate a coredump in the image
and save that off; maybe it would give some clue in later analysis. We
could also, upon failure, move the actually booted image somewhere for
later analysis. I wondered if we could save off the qemu state too somehow.
The trouble is that none of these are simple coming up to release.
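
A rough sketch of the "move the booted image somewhere on failure" idea,
written as a standalone helper. The function names, the archive layout and
the label argument are all hypothetical; nothing here reflects how the
autobuilder scripts are actually structured. It copies the image and records
a SHA-256 checksum next to it so a later download can be compared against
what was really booted.

```python
import hashlib
import shutil
import time
from pathlib import Path


def sha256sum(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def preserve_failed_image(image, archive_dir, label):
    """Copy a booted image into archive_dir and write a .sha256 file beside it.

    'label' is a free-form tag (for example a build number) chosen by the caller.
    Returns the path of the archived copy.
    """
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = archive_dir / f"{label}-{stamp}-{Path(image).name}"
    shutil.copy2(image, dest)
    checksum_file = dest.with_name(dest.name + ".sha256")
    checksum_file.write_text(f"{sha256sum(dest)}  {dest.name}\n")
    return dest
```

A test harness would call preserve_failed_image() only when a sanity test
fails, so passing runs cost nothing extra in disk space.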

So if anyone has any ideas on what is causing this or how to debug/fix
it, I'd be very receptive to them.

Cheers,

Richard

* Re: Sanity Failures - Segfaults in qemu images
  2013-04-07  8:23 Sanity Failures - Segfaults in qemu images Richard Purdie
@ 2013-04-07 22:32 ` Khem Raj
  2013-04-08 15:54   ` Richard Purdie
  0 siblings, 1 reply; 3+ messages in thread
From: Khem Raj @ 2013-04-07 22:32 UTC
  To: Richard Purdie; +Cc: openembedded-core

On Sun, Apr 7, 2013 at 1:23 AM, Richard Purdie <
richard.purdie@linuxfoundation.org> wrote:

>
> http://autobuilder.yoctoproject.org:8011/builders/nightly-x86-64-lsb/builds/87/steps/Running%20Sanity%20Tests/logs/stdio
>
>

What does the complete dmesg output look like? I have seen ld.so segfaults on
real x86_64 hardware but haven't narrowed it down since
it does not seem to bother my testing.


* Re: Sanity Failures - Segfaults in qemu images
  2013-04-07 22:32 ` Khem Raj
@ 2013-04-08 15:54   ` Richard Purdie
  0 siblings, 0 replies; 3+ messages in thread
From: Richard Purdie @ 2013-04-08 15:54 UTC
  To: Khem Raj; +Cc: openembedded-core

On Sun, 2013-04-07 at 15:32 -0700, Khem Raj wrote:
> On Sun, Apr 7, 2013 at 1:23 AM, Richard Purdie
> <richard.purdie@linuxfoundation.org> wrote:
>         http://autobuilder.yoctoproject.org:8011/builders/nightly-x86-64-lsb/builds/87/steps/Running%20Sanity%20Tests/logs/stdio
>
>
> What does the complete dmesg output look like? I have seen ld.so
> segfaults on real x86_64 hardware but haven't narrowed it down since
> it does not seem to bother my testing.

I don't have the full dmesg output; however, I have put a wealth of info
about this into:

https://bugzilla.yoctoproject.org/show_bug.cgi?id=4216

Basically these do look like "random" segfaults happening at any point on
the system and having a variety of effects. The double fault is
particularly worrying as it suggests the problem is in the kernel or qemu.
I've put a script which tends to reproduce the issue into the bugzilla.

When you say you've seen ld.so faults on real hardware, have you any
more details? Are they happening often? Just ld.so?

I need to know if this is a qemu issue or on real hardware too since the
latter is a release blocker and extremely serious.

Cheers,

Richard