public inbox for linux-doc@vger.kernel.org
From: Keith Busch <kbusch@kernel.org>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: Breno Leitao <leitao@debian.org>,
	Jonathan Corbet <corbet@lwn.net>,
	Mahesh J Salgaonkar <mahesh@linux.ibm.com>,
	Oliver O'Halloran <oohall@gmail.com>,
	Bjorn Helgaas <bhelgaas@google.com>,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, linux-pci@vger.kernel.org,
	dcostantino@meta.com, rneu@meta.com, kernel-team@meta.com
Subject: Re: [PATCH] PCI/AER: Add option to panic on unrecoverable errors
Date: Fri, 6 Feb 2026 12:22:44 -0700
Message-ID: <aYY_hMZyVp7GZvX2@kbusch-mbp>
In-Reply-To: <20260206185232.GA70936@bhelgaas>

On Fri, Feb 06, 2026 at 12:52:32PM -0600, Bjorn Helgaas wrote:
> Just from an overall complexity point of view, I'm a little hesitant
> to add new kernel parameters because this seems like a very specific
> case.
> 
> Is there anything we could do to improve the logging to make the issue
> more recognizable?  I assume you already look for KERN_CRIT, KERN_ERR,
> etc., but it looks like the current message is just KERN_INFO.  I
> think we could make a good case for at least KERN_WARNING.
> 
> But I guess you probably want something that's just impossible to
> ignore.

It's not necessarily about improving visibility with a higher alert
level. It's more that the system can't be trusted to operate correctly
from here on. Consider an interconnected GPU setup where only one GPU
experiences an unrecoverable error. We don't want to leave the system
limping along with this unresolved error, as it can't do anything
useful. A panic-induced reboot is the least bad option for returning
the system to operation, and if we're actively debugging, it crashes
the system temporally close to the failure so we can get logs for the
vendor.
 
> Are there any other similar flags you already use that we could
> piggy-back on?  E.g., if we raised the level to KERN_WARNING, maybe
> the existing "panic_on_warn" would be enough?

There are many KERN_WARNING messages that don't rise to the level of
warranting a 'panic', so we don't want to enable such an option in
production. It looks like panic_on_warn was introduced for developer
debugging.

I agree the current INFO level is too low for the generic unrecovered
condition, though.
