From: "Marek Marczykowski-Górecki" <marmarek@invisiblethingslab.com>
To: Jan Beulich <jbeulich@suse.com>
Cc: xen-devel <xen-devel@lists.xenproject.org>
Subject: Re: Serial console stuck during boot, unblocked with xl debug-key
Date: Thu, 4 Jan 2024 14:02:32 +0100
Message-ID: <ZZasam3zMBtrGvte@mail-itl>
In-Reply-To: <7d5ecc76-ecd3-4940-b658-fee60e3ab740@suse.com>

On Thu, Jan 04, 2024 at 12:59:28PM +0100, Jan Beulich wrote:
> On 29.12.2023 10:50, Marek Marczykowski-Górecki wrote:
> > Hi,
> > 
> > This is a continuation of the Matrix chat. There is an occasional failure
> > on the qubes-hw2 GitLab runner where the console becomes stuck during boot.
> > I can now reproduce it _much_ more often on another system, and the serial
> > console output ends with:
> > 
> >     (XEN) Allocated console ring of 256 KiB.
> >     (XEN) Using HWP for cpufreq
> >     (XEN) mwait-idle: does not run on family 6
> > 
> > It should be:
> > 
> >     (XEN) Allocated console ring of 256 KiB.
> >     (XEN) Using HWP for cpufreq
> >     (XEN) mwait-idle: does not run on family 6 model 183
> >     (XEN) VMX: Supported advanced features:
> >     (XEN)  - APIC MMIO access virtualisation
> >     (XEN)  - APIC TPR shadow
> >     ...
> > 
> > 
> > Otherwise the system works perfectly fine, and the logs are available in
> > full via `xl dmesg` etc. Doing (any?) `xl debug-key` unblocks the
> > console, and the missing logs get dumped there too. I narrowed it down to
> > the serial console tx buffer and collected some info with the attached
> > patch (it collects the info during boot, after the place where it
> > usually breaks). When it works, I get:
> > 
> >     (XEN) SERIAL DEBUG: txbufc: 0x5b5, txbufp: 0x9f7, uart intr_works: 1, serial_txbufsz: 0x4000, tx_ready: 0, lsr_mask: 0x20, msi: 0, io_size: 8, skipped_interrupts: 0
> > 
> > And when it breaks, I get:
> > 
> >     (XEN) SERIAL DEBUG: txbufc: 0x70, txbufp: 0x9fd, uart intr_works: 1, serial_txbufsz: 0x4000, tx_ready: 16, lsr_mask: 0x20, msi: 0, io_size: 8, skipped_interrupts: 0
> 
> The only meaningful difference is tx_ready then. Looking at
> ns16550_tx_ready() I wonder whether the LSR reports inconsistent
> values on successive reads (there are at least three separate calls
> to the function out of serial_tx_interrupt() alone). What you didn't
> log is the LSR value itself; from the tx_ready value one can conclude
> though that in the bad case fifo_size was returned, while in the good
> case 0 was passed back. At first glance this looks backwards, or
> in other words I can't explain why it would be this way round. (I
> assume you've had each case multiple times, and the output was
> sufficiently consistent; that doesn't go without saying as your
> invocation of serial_debug() is competing with the asynchronous
> transmitting of data [if any].) It being this way round might suggest
> that we lost an interrupt.

That is my current hypothesis too. It could be lost either at the hw level
(the interrupt being masked for some reason at some point?) or at the sw
level (the handler somehow not being called, which is why I added
skipped_interrupts).
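
To make the lost-interrupt hypothesis concrete, here is a toy model of an
interrupt-driven transmit path. It is a sketch only, not Xen's code: every
name in it (toy_port, toy_tx_ready(), toy_tx_interrupt(), toy_hw_drain(),
toy_is_stuck()) is hypothetical, and the FIFO size of 16 is taken from the
tx_ready value logged above. It shows how a single missed interrupt leaves
the transmitter idle (tx_ready equal to the FIFO size) while data is still
queued in the software buffer.

```c
#include <stdbool.h>

/* Toy model, NOT Xen's code: all names below are hypothetical. */
#define TOY_FIFO_SIZE 16u

struct toy_port {
    unsigned int txbufp;    /* producer index: bytes queued so far */
    unsigned int txbufc;    /* consumer index: bytes handed to the UART */
    unsigned int fifo_used; /* bytes currently in the hardware FIFO */
};

/* Free FIFO slots; TOY_FIFO_SIZE means the transmitter is fully idle. */
static unsigned int toy_tx_ready(const struct toy_port *p)
{
    return TOY_FIFO_SIZE - p->fifo_used;
}

/* What a TX interrupt handler does: refill the FIFO from the buffer. */
static void toy_tx_interrupt(struct toy_port *p)
{
    unsigned int n = toy_tx_ready(p);

    while (n > 0 && p->txbufc != p->txbufp) {
        p->txbufc++;
        p->fifo_used++;
        n--;
    }
}

/*
 * The hardware finishes sending the FIFO contents. Normally this raises
 * an interrupt and toy_tx_interrupt() runs; if that interrupt is lost,
 * nothing calls the handler again.
 */
static void toy_hw_drain(struct toy_port *p)
{
    p->fifo_used = 0;
}

/* Stuck state: transmitter idle, but the software buffer is not empty. */
static bool toy_is_stuck(const struct toy_port *p)
{
    return toy_tx_ready(p) == TOY_FIFO_SIZE && p->txbufc != p->txbufp;
}
```

In this model, calling toy_hw_drain() without a following
toy_tx_interrupt() reproduces the broken state, and any later call to the
handler gets output moving again. That would be consistent with
`xl debug-key` unblocking the console, but whether that is the actual
mechanism is not established here.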

> Is this a real serial port, or one mimicked
> by a BMC (SoL or alike)?

This one is a real serial port. It isn't fully reproducible, but
happened sufficiently often that I'm quite sure of the above info.
Yes, my serial_debug() can interfere with data transfer, but I
intentionally added it significantly later than the point where the issue
happens (I realize that the end of the console output may not coincide
exactly with the actual time of the problem due to async sending, but IMO
it should still be good enough). I later moved it to a keyhandler, but that
didn't give any more info.
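
For reference, the amount of data still queued can be read off the two
debug lines. The sketch below assumes that txbufp and txbufc are producer
and consumer indices into the tx buffer, as their names suggest; that is an
assumption about the patch's output, and pending_bytes() is a hypothetical
helper, not a Xen function.

```c
#include <assert.h>

/*
 * Bytes queued but not yet handed to the UART, assuming txbufp is the
 * producer index and txbufc the consumer index (an assumption based on
 * the field names). bufsz is serial_txbufsz from the debug line.
 */
static unsigned int pending_bytes(unsigned int txbufp, unsigned int txbufc,
                                  unsigned int bufsz)
{
    unsigned int pending = txbufp - txbufc;

    /* Sanity check: more than a full buffer cannot be pending. */
    assert(pending <= bufsz);
    return pending;
}
```

With the logged values, pending_bytes(0x9f7, 0x5b5, 0x4000) gives 0x442
(1090 bytes) in the working case and pending_bytes(0x9fd, 0x70, 0x4000)
gives 0x98d (2445 bytes) in the broken case. So in the broken case there is
plenty of data waiting while tx_ready reports the transmitter as idle.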

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
