Openembedded Core Discussions
From: "Richard Purdie" <richard.purdie@linuxfoundation.org>
To: swat@lists.yoctoproject.org,
	openembedded-core <openembedded-core@lists.openembedded.org>,
	Bruce Ashfield <bruce.ashfield@gmail.com>,
	Randy MacLeod <randy.macleod@windriver.com>,
	Paul Gortmaker <paul.gortmaker@windriver.com>
Subject: Re: [swat] ltp failures on autobuilder
Date: Fri, 11 Jun 2021 12:36:38 +0100
Message-ID: <22099aeed4395a3fba3e2508ecafec25b7d64c2a.camel@linuxfoundation.org>
In-Reply-To: <1687473EDD63E45B.21776@lists.yoctoproject.org>

On Thu, 2021-06-10 at 18:02 +0100, Richard Purdie via lists.yoctoproject.org wrote:
> Noting down what we know about the ltp issue:
> 
> We've seen intermittent issues on the autobuilder where some ltp tests fail or 
> hang. I've been trying to figure out how to reproduce the issue and narrow down
> the cause.
> 
> I was able to isolate a patch which reproduces the issue for me:
> 
> http://git.yoctoproject.org/cgit.cgi/poky-contrib/commit/?h=rpurdie/t222&id=d7d65aae104caa03afc28837b0abe0b486d5a8b8
> 
> with master-next, setting:
> 
> IMAGE_INSTALL_append = ' ltp' 
> TEST_SUITES = 'ping ssh ltp' 

also:

IMAGE_CLASSES += "testimage"
QEMU_USE_KVM_qemux86-64 = "True"


> then 
> 
> bitbake core-image-sato; bitbake core-image-sato -c testimage
> 
> where the issue shows up as a kernel "BUG:" in the logs in WORKDIR/testimage/qemu_*
> 
> The above patch runs the minimum of ltp tests I could find which replicate the issue.
> 
> I've reproduced this on 5.10.1 -> 5.10.42, 5.4.123 and 5.13-rc5.
> (and we've ruled out linux-yocto with plain kernels)
> Also reproduced on both qemu 6.0.0 and 5.2.0.
> 
> My build machine is an Ubuntu 20.04.2 LTS with:
> Linux version 5.4.0-74-generic (buildd@lgw01-amd64-038) (gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04)) #83-Ubuntu SMP Sat May 8 02:35:39 UTC 2021

Good news (for me) is that Randy and Paul can now reproduce this with the above 
additional key pieces of config.
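
Putting the pieces together, the local.conf additions from the quoted mail
and the two lines above would look roughly like this (a sketch only; MACHINE
is an assumption implied by the qemux86-64 override and isn't stated anywhere
above):

# Assumption: the qemux86-64 override below implies this machine
MACHINE = "qemux86-64"

# Enable the testimage task and run the ltp suite against the image
IMAGE_CLASSES += "testimage"
IMAGE_INSTALL_append = ' ltp'
TEST_SUITES = 'ping ssh ltp'

# Run qemu under KVM on the build host
QEMU_USE_KVM_qemux86-64 = "True"

The reproducer patch linked above still needs to be applied on top of
master-next before building and running testimage.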

We have confirmed that the issue is present:

* with gcc 11.1.1 and 10.3
* in hardknott
* if QB_SMP is disabled (i.e. in a single processor qemu)
* on 18.04, 20.04 and 21.04 Ubuntu host distros which have varying 5.4 and 5.11 
  host kernels

I have not been able to make the bug appear in gatesgarth as yet
(gcc 10.2, 5.8 kernel, qemu 5.1.0; I had to hack -b /dev/null into the ltp
commandline).

I did backport the qemu platform, smp and qemu commandline changes to
gatesgarth and it still doesn't crash.

I also found that setting CONFIG_DEBUG_KERNEL makes the issue 'go away'. 
Since that is a large hammer, I tried:

CONFIG_DEBUG_KERNEL=y
# CONFIG_CGROUP_DEBUG is not set
# CONFIG_SCHED_DEBUG is not set
# CONFIG_DEBUG_PREEMPT is not set
# CONFIG_RCU_TRACE is not set
# CONFIG_X86_DEBUG_FPU is not set
# CONFIG_CONSOLE_POLL is not set
# CONFIG_DEBUG_INFO is not set
# CONFIG_KGDB is not set
# CONFIG_KGDB_HONOUR_BLOCKLIST is not set
# CONFIG_KGDB_SERIAL_CONSOLE is not set
# CONFIG_KGDB_LOW_LEVEL_TRAP is not set
# CONFIG_KGDB_KDB is not set
# CONFIG_KDB_KEYBOARD is not set
# CONFIG_DEBUG_MISC is not set

as a .cfg fragment for the kernel and that still reproduced the crash. However:

CONFIG_DEBUG_KERNEL=y
CONFIG_CGROUP_DEBUG=y
CONFIG_SCHED_DEBUG=y
CONFIG_DEBUG_PREEMPT=y
# CONFIG_RCU_TRACE is not set
# CONFIG_X86_DEBUG_FPU is not set
# CONFIG_CONSOLE_POLL is not set
# CONFIG_DEBUG_INFO is not set
# CONFIG_KGDB is not set
# CONFIG_KGDB_HONOUR_BLOCKLIST is not set
# CONFIG_KGDB_SERIAL_CONSOLE is not set
# CONFIG_KGDB_LOW_LEVEL_TRAP is not set
# CONFIG_KGDB_KDB is not set
# CONFIG_KDB_KEYBOARD is not set
# CONFIG_DEBUG_MISC is not set

doesn't seem to reproduce the crash, so something about those three options
(CONFIG_CGROUP_DEBUG, CONFIG_SCHED_DEBUG and CONFIG_DEBUG_PREEMPT) seems to
make things 'work'.
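
For anyone wanting to repeat the experiment: one way to feed a fragment like
the ones above to a linux-yocto kernel is a bbappend along these lines. This
is a minimal sketch; the bbappend and fragment file names are made up for
illustration and this is not necessarily how it was wired up for the tests
above.

# linux-yocto_%.bbappend (hypothetical name, in a layer of your own)
# Look for the fragment in a files/ directory next to this bbappend
FILESEXTRAPATHS_prepend := "${THISDIR}/files:"

# ltp-debug.cfg (hypothetical name) holds one of the option lists above
SRC_URI += "file://ltp-debug.cfg"

linux-yocto merges .cfg files listed in SRC_URI into the kernel
configuration, so swapping the fragment contents between the two lists should
be enough to switch between the crashing and non-crashing configs.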

What does that all mean? No idea.

Cheers,

Richard





