From: Mikko Rapeli <mikko.rapeli@linaro.org>
To: Richard Purdie <richard.purdie@linuxfoundation.org>
Cc: Khem Raj <raj.khem@gmail.com>, openembedded-core@lists.openembedded.org
Subject: Re: [OE-core] [PATCH 1/2] qemurunner: Impove stdout logging handling
Date: Tue, 19 Dec 2023 14:12:36 +0200
Message-ID: <ZYGItGxg-TAO4JIE@nuoska>
In-Reply-To: <8ba14102f0ea38d9bb691538e3b2171e82aead33.camel@linuxfoundation.org>
Hi,
On Tue, Dec 19, 2023 at 12:03:07PM +0000, Richard Purdie wrote:
> On Mon, 2023-12-18 at 23:01 +0000, Richard Purdie via
> lists.openembedded.org wrote:
> > On Mon, 2023-12-18 at 10:07 -0800, Khem Raj wrote:
> > > On Mon, Dec 18, 2023 at 9:58 AM Richard Purdie
> > > <richard.purdie@linuxfoundation.org> wrote:
> > > >
> > > > On Mon, 2023-12-18 at 09:45 -0800, Khem Raj wrote:
> > > > > > I tried the two patches in this series. It did improve the situation
> > > > > > but I am still getting SSH timeouts. This time it's 13 tests;
> > > > > > earlier it used to be 40+.
> > > > > btw. my images are using systemd. So it might be good to see if we see
> > > > > this with poky-altcfg as well or not.
> > > >
> > > > Do you have the log.do_testimage and the ${WORKDIR}/testimage/ files?
> > >
> > > > Yes. Further, I ran the failing tests in a loop, one after another; still,
> > > > one test (gzip) fails with SSH timeouts.
> > >
> > > https://busybox.net/~kraj/log.do_testimage.503
> > > https://busybox.net/~kraj/testimage/
> > >
> > > > There are two runs in the testimage folder. In one you see the RCU
> > > > stall and in the second you do not,
> > > > but it fails with the same SSH timeout issue.
> > >
> > > >
> > > > Did you still see rcu stalls in the logs?
> >
> > What is interesting is there is ~3MB of nulls in the .2 serial log. The
> > rcu stall is also:
> >
> > [ 88.261687] serial8250_tx_chars+0xea/0x2b0
> > [ 88.261689] serial8250_handle_irq+0x1e9/0x330
> > [ 88.261691] serial8250_default_handle_irq+0x4a/0x90
> > [ 88.261693] serial8250_interrupt+0x66/0xc0
> > [ 88.261696] __handle_irq_event_percpu+0x54/0x1c0
> > [ 88.261701] handle_irq_event+0x3d/0x80
> >
> > i.e. it is stalled in the serial TX path.
> >
> > The big question is why there are so many nulls on the serial port. I
> > see a few on my local x86 test runs but only ~4kb, not megabytes of
> > them. I hadn't worked out where/what they are from yet.
> >
> > I suspect something in the serial/kernel/qemu space isn't interacting
> > correctly.
> >
> > Find the cause of the nulls and we might make progress on this.
>
> I went digging and it is a mess. That 3MB file is actually a wtmp file
> from /var/log. The reason is a failure of the "target_dumper" commands
> and that the simplistic serial command handling doesn't like binary
> files.
>
> I've a few patches which should improve the situation but there are clearly
> multiple issues at play here, with issues stacking on top of other
> issues. My rough list of issues:
>
> a) stdout logging from qemu wasn't being read and could overflow
> buffers locking qemu
> b) serial data from commands wasn't being fully read and could overflow
> buffers locking qemu
> c) the dumper log reading command is reading binary files like wtmp
> d) the dumper logs overwrite each other so are useless anyway
> e) the dumper logs run on every failed command
> f) the qmp monitor dump command also runs on every failed command
>
> I think we'd be much better off if we drop the dumper/monitor stuff and
> qemu runs might be quite a bit faster too.
I agree with this. If the target has serious issues like a full lockup, then the dumper
and monitor commands don't help anyway. Capturing the full serial logs would be sufficient,
though I'd like to see a live capture of /var/log/messages and the systemd journal in case
services started by init have major issues.
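To illustrate the kind of continuous capture meant here, and why an unread
pipe (issue a above) can stall the child, here is a minimal sketch. It is not
the qemurunner code; the names drain_to_log and run_and_capture are made up
for illustration, and it assumes the output arrives on the child's stdout
pipe rather than on a serial socket.

```python
import subprocess
import threading


def drain_to_log(stream, log_path):
    """Copy everything from a binary stream to a log file until EOF.

    Reading continuously keeps the OS pipe buffer from filling up, which
    would otherwise block the writing process.
    """
    with open(log_path, "ab") as log:
        # read1() returns as soon as some data is available, b"" at EOF.
        for chunk in iter(lambda: stream.read1(4096), b""):
            log.write(chunk)
            log.flush()  # keep the log usable while the child still runs


def run_and_capture(cmd, log_path):
    """Run cmd, logging its stdout and stderr to log_path as it is produced."""
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
    )
    reader = threading.Thread(
        target=drain_to_log, args=(proc.stdout, log_path), daemon=True
    )
    reader.start()
    returncode = proc.wait()
    reader.join()  # make sure the tail of the output reached the log
    return returncode
```

The log is opened in binary mode on purpose, so stray nulls or other binary
data end up in the file unchanged instead of breaking the reader.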
Cheers,
-Mikko