Openembedded Core Discussions
From: Mikko Rapeli <mikko.rapeli@linaro.org>
To: Richard Purdie <richard.purdie@linuxfoundation.org>
Cc: Khem Raj <raj.khem@gmail.com>, openembedded-core@lists.openembedded.org
Subject: Re: [OE-core] [PATCH 1/2] qemurunner: Impove stdout logging handling
Date: Tue, 19 Dec 2023 14:12:36 +0200
Message-ID: <ZYGItGxg-TAO4JIE@nuoska>
In-Reply-To: <8ba14102f0ea38d9bb691538e3b2171e82aead33.camel@linuxfoundation.org>

Hi,

On Tue, Dec 19, 2023 at 12:03:07PM +0000, Richard Purdie wrote:
> On Mon, 2023-12-18 at 23:01 +0000, Richard Purdie via
> lists.openembedded.org wrote:
> > On Mon, 2023-12-18 at 10:07 -0800, Khem Raj wrote:
> > > On Mon, Dec 18, 2023 at 9:58 AM Richard Purdie
> > > <richard.purdie@linuxfoundation.org> wrote:
> > > > 
> > > > On Mon, 2023-12-18 at 09:45 -0800, Khem Raj wrote:
> > > > > > I tried the two patches in this series. They did improve the situation
> > > > > > but I am still getting SSH timeouts. This time it's 13 tests;
> > > > > > earlier it used to be 40+.
> > > > > > Btw, my images are using systemd, so it might be good to check whether
> > > > > > we see this with poky-altcfg as well.
> > > > 
> > > > Do you have the log.do_testimage and the ${WORKDIR}/testimage/ files?
> > > 
> > > > Yes. Further, I ran the failing tests in a loop one after another; still
> > > > one test, gzip, fails with SSH timeouts.
> > > 
> > > https://busybox.net/~kraj/log.do_testimage.503
> > > https://busybox.net/~kraj/testimage/
> > > 
> > > > There are two runs in the testimage folder. In one you see the RCU
> > > > stall and in the second you do not,
> > > > but it fails with the same SSH timeout issue.
> > > 
> > > > 
> > > > Did you still see rcu stalls in the logs?
> > 
> > What is interesting is there is ~3MB of nulls in the .2 serial log. The
> > rcu stall is also:
> > 
> > [   88.261687]  serial8250_tx_chars+0xea/0x2b0
> > [   88.261689]  serial8250_handle_irq+0x1e9/0x330
> > [   88.261691]  serial8250_default_handle_irq+0x4a/0x90
> > [   88.261693]  serial8250_interrupt+0x66/0xc0
> > [   88.261696]  __handle_irq_event_percpu+0x54/0x1c0
> > [   88.261701]  handle_irq_event+0x3d/0x80
> > 
> > i.e. it is stalled in the serial TX path.
> > 
> > The big question is why there are so many nulls on the serial port. I
> > see a few on my local x86 test runs but only ~4kb, not megabytes of
> > them. I haven't worked out where/what they are from yet.
> > 
> > I suspect something in the serial/kernel/qemu space isn't interacting
> > correctly.
> > 
> > Find the cause of the nulls and we might make progress on this.
> 
> I went digging and it is a mess. That 3MB file is actually a wtmp file
> from /var/log. The reason is a failure of the "target_dumper" commands
> and that the simplistic serial command handling doesn't like binary
> files.
> 
> I have a few patches which should improve the situation, but there are
> clearly multiple issues at play here, stacking on top of each other.
> My rough list of issues:
> 
> a) stdout logging from qemu wasn't being read and could overflow
> buffers locking qemu
> b) serial data from commands wasn't being fully read and could overflow
> buffers locking qemu
> c) the dumper log reading command is reading binary files like wtmp
> d) the dumper logs overwrite each other so are useless anyway
> e) the dumper logs run on every failed command
> f) qmp monitor dump command also run on every failed command
> 
> I think we'd be much better off if we drop the dumper/monitor stuff and
> qemu runs might be quite a bit faster too.

I agree with this. If the target has serious issues such as a full lockup, the dumper
and monitor commands don't help anyway. Capturing the full serial logs would be
sufficient, though I'd also like to see a live capture of /var/log/messages and the
systemd journal in case services started by init have major issues.
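
For reference, the buffer problem in (a) and (b) above can be reproduced outside
of qemu. The following is only a minimal sketch, not the qemurunner code: the
names drain() and run_and_capture() are made up for illustration, and a Python
child process that writes about 1 MB stands in for qemu. The idea is simply that
a background thread keeps reading the pipe until EOF, so the child can never
block on a full pipe buffer.

```python
import subprocess
import sys
import threading


def drain(stream, sink):
    """Read the stream until EOF so the child never blocks on a full pipe."""
    for chunk in iter(lambda: stream.read(4096), b""):
        sink.append(chunk)


def run_and_capture(cmd):
    """Run cmd, draining its stdout in a background thread.

    Hypothetical helper for illustration; returns (exit code, captured bytes).
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    chunks = []
    reader = threading.Thread(target=drain, args=(proc.stdout, chunks), daemon=True)
    reader.start()
    rc = proc.wait()   # safe: the reader thread keeps emptying the pipe
    reader.join()      # returns once the pipe reaches EOF
    return rc, b"".join(chunks)


if __name__ == "__main__":
    # Stand-in for qemu (assumption): a child that writes far more than a
    # typical pipe buffer can hold.
    child = [sys.executable, "-c", "import sys; sys.stdout.write('x' * 1_000_000)"]
    rc, out = run_and_capture(child)
    print(rc, len(out))
```

Without the reader thread, calling proc.wait() on a child that produces this
much output would hang once the pipe fills, which is roughly the kind of lockup
described in (a) and (b).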

Cheers,

-Mikko

