From: Lukas Wunner <lukas@wunner.de>
To: dan.j.williams@intel.com
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>,
Terry Bowman <terry.bowman@amd.com>,
dave@stgolabs.net, dave.jiang@intel.com,
alison.schofield@intel.com, bhelgaas@google.com,
shiju.jose@huawei.com, ming.li@zohomail.com,
Smita.KoralahalliChannabasappa@amd.com, rrichter@amd.com,
dan.carpenter@linaro.org, PradeepVineshReddy.Kodamati@amd.com,
Benjamin.Cheatham@amd.com,
sathyanarayanan.kuppuswamy@linux.intel.com,
linux-cxl@vger.kernel.org, vishal.l.verma@intel.com,
alucerop@amd.com, ira.weiny@intel.com,
linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org
Subject: Re: [PATCH v14 10/34] PCI/AER: Update is_internal_error() to be non-static is_aer_internal_error()
Date: Thu, 22 Jan 2026 14:34:21 +0100 [thread overview]
Message-ID: <aXInXZiTCSN06si8@wunner.de> (raw)
In-Reply-To: <6969513c2b1a4_34d2a1008a@dwillia2-mobl4.notmuch>
On Thu, Jan 15, 2026 at 12:42:36PM -0800, dan.j.williams@intel.com wrote:
> I agree with the general sentiment, but not the conclusion, especially
> because this is a private detail. Linux has long ignored internal
> errors. The only reason to consider them now is because CXL decided to
> multiplex its error model on top of this oft-ignored feature of PCIe
> AER.
>
> Specifically, portdrv.h is not in the global include namespace, this is
> a private detail of the only conumer of internal errors:
> drivers/pci/pcie/aer_cxl_{rch,vh}.c
>
> At most we should have this as a comment to clarify:
>
> /*
> * Note, internal errors are only considered for the CXL error model,
> * not for other implementations.
> */
>
> ...and the pci_aer_unmask_internal_errors() export should be:
>
> EXPORT_SYMBOL_FOR_MODULES(pci_aer_unmask_internal_errors, "cxl_core")
>
> ...for the same reason. Steer folks away from thinking that it is open
> season for adding more internal error support.
It's not like Internal Errors are a bad thing per se. They're a way
to signal "other" errors besides the spec-defined ones.
As an example, and I'm keeping this in general terms to avoid devulging
information about future products, a device possessing ECC RAM may raise
a Correctable Internal Error when ECC successfully recovers from flipped
bits because it allows alerting the user in advance that the device might
need to be replaced in the near future. If ECC recovery fails, the device
might try to use a reserved spare portion of RAM in lieu of the failing one
and instruct the AER driver to recover through a bus reset. Such errors
are not covered by the spec-defined types. Using the Internal Error type
is the only possibility it seems.
My point is, there are valid (upcoming, not theoretical) use cases for
Internal Errors and creating infrastructure in the kernel to take advantage
of them is a good thing. Hence my continued pushing back on hiding or
discouraging their use.
Thanks,
Lukas
next prev parent reply other threads:[~2026-01-22 13:34 UTC|newest]
Thread overview: 129+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-01-14 18:20 [PATCH v14 00/34] Enable CXL PCIe Port Protocol Error handling and logging Terry Bowman
2026-01-14 18:20 ` [PATCH v14 01/34] PCI: Move CXL DVSEC definitions into uapi/linux/pci_regs.h Terry Bowman
2026-01-22 18:58 ` Bjorn Helgaas
2026-01-22 19:43 ` Bowman, Terry
2026-01-14 18:20 ` [PATCH v14 02/34] PCI: Update CXL DVSEC definitions Terry Bowman
2026-01-14 18:53 ` Jonathan Cameron
2026-01-19 23:44 ` dan.j.williams
2026-01-22 18:37 ` Bjorn Helgaas
2026-01-14 18:20 ` [PATCH v14 03/34] PCI: Introduce pcie_is_cxl() Terry Bowman
2026-01-21 1:19 ` dan.j.williams
2026-01-22 18:39 ` Bjorn Helgaas
2026-01-14 18:20 ` [PATCH v14 04/34] cxl/pci: Remove unnecessary CXL Endpoint handling helper functions Terry Bowman
2026-01-14 18:20 ` [PATCH v14 05/34] cxl/pci: Remove unnecessary CXL RCH " Terry Bowman
2026-01-14 18:20 ` [PATCH v14 06/34] PCI: Replace cxl_error_is_native() with pcie_aer_is_native() Terry Bowman
2026-01-14 18:55 ` Jonathan Cameron
2026-01-14 20:16 ` Dave Jiang
2026-01-14 20:15 ` Dave Jiang
2026-01-22 18:23 ` Bjorn Helgaas
2026-01-14 18:20 ` [PATCH v14 07/34] cxl/pci: Remove CXL VH handling in CONFIG_PCIEAER_CXL conditional blocks from core/pci.c Terry Bowman
2026-01-14 20:51 ` Dave Jiang
2026-01-14 18:20 ` [PATCH v14 08/34] cxl/pci: Move CXL driver's RCH error handling into core/ras_rch.c Terry Bowman
2026-01-14 20:35 ` Dave Jiang
2026-01-14 18:20 ` [PATCH v14 09/34] PCI/AER: Export pci_aer_unmask_internal_errors() Terry Bowman
2026-01-14 19:01 ` Jonathan Cameron
2026-01-14 19:09 ` Kuppuswamy Sathyanarayanan
2026-01-14 20:40 ` Dave Jiang
2026-01-20 2:09 ` dan.j.williams
2026-01-22 10:31 ` Lukas Wunner
2026-01-22 16:48 ` dan.j.williams
2026-01-22 18:51 ` Lukas Wunner
2026-01-22 18:49 ` Bjorn Helgaas
2026-01-14 18:20 ` [PATCH v14 10/34] PCI/AER: Update is_internal_error() to be non-static is_aer_internal_error() Terry Bowman
2026-01-14 19:08 ` Jonathan Cameron
2026-01-15 20:42 ` dan.j.williams
2026-01-22 13:34 ` Lukas Wunner [this message]
2026-01-22 19:09 ` dan.j.williams
2026-01-22 19:32 ` Lukas Wunner
2026-01-22 21:32 ` dan.j.williams
2026-01-23 12:22 ` Jonathan Cameron
2026-01-20 2:20 ` dan.j.williams
2026-01-20 15:15 ` Bowman, Terry
2026-01-20 16:53 ` dan.j.williams
2026-01-22 18:48 ` Bjorn Helgaas
2026-01-14 18:20 ` [PATCH v14 11/34] PCI/AER: Move CXL RCH error handling to aer_cxl_rch.c Terry Bowman
2026-01-22 17:23 ` Markus Elfring
2026-01-22 20:05 ` Bowman, Terry
2026-01-22 18:53 ` Bjorn Helgaas
2026-01-14 18:20 ` [PATCH v14 12/34] PCI/AER: Use guard() in cxl_rch_handle_error_iter() Terry Bowman
2026-01-14 19:11 ` Jonathan Cameron
2026-01-14 18:20 ` [PATCH v14 13/34] PCI/AER: Replace PCIEAER_CXL symbol with CXL_RAS Terry Bowman
2026-01-14 19:12 ` Jonathan Cameron
2026-01-14 20:49 ` Dave Jiang
2026-01-14 20:50 ` Dave Jiang
2026-01-22 18:24 ` Bjorn Helgaas
2026-01-14 18:20 ` [PATCH v14 14/34] PCI/AER: Report CXL or PCIe bus type in AER trace logging Terry Bowman
2026-01-14 19:45 ` Jonathan Cameron
2026-01-15 15:55 ` Mauro Carvalho Chehab
2026-01-14 20:56 ` Dave Jiang
2026-01-14 18:20 ` [PATCH v14 15/34] PCI/AER: Update struct aer_err_info with kernel-doc formatting Terry Bowman
2026-01-14 19:48 ` Jonathan Cameron
2026-01-15 20:56 ` dan.j.williams
2026-01-14 21:06 ` Dave Jiang
2026-01-22 18:29 ` Bjorn Helgaas
2026-01-14 18:20 ` [PATCH v14 16/34] cxl/mem: Clarify @host for devm_cxl_add_nvdimm() Terry Bowman
2026-01-14 19:49 ` Jonathan Cameron
2026-01-14 21:08 ` Dave Jiang
2026-01-16 3:07 ` dan.j.williams
2026-01-16 16:22 ` Dave Jiang
2026-01-14 18:20 ` [PATCH v14 17/34] cxl: Update RAS handler interfaces to also support CXL Ports Terry Bowman
2026-01-14 18:20 ` [PATCH v14 18/34] cxl/port: Remove "enumerate dports" helpers Terry Bowman
2026-01-14 19:50 ` Jonathan Cameron
2026-01-14 21:23 ` Dave Jiang
2026-01-16 3:15 ` dan.j.williams
2026-01-14 21:24 ` Dave Jiang
2026-01-16 3:21 ` dan.j.williams
2026-01-14 18:20 ` [PATCH v14 19/34] cxl/port: Fix devm resource leaks around with dport management Terry Bowman
2026-01-14 21:26 ` Dave Jiang
2026-01-15 14:46 ` Jonathan Cameron
2026-01-16 4:45 ` dan.j.williams
2026-01-16 15:01 ` Jonathan Cameron
2026-01-16 16:16 ` Jonathan Cameron
2026-01-19 23:02 ` dan.j.williams
2026-01-20 12:25 ` Jonathan Cameron
2026-01-19 2:48 ` dan.j.williams
2026-01-14 18:20 ` [PATCH v14 20/34] cxl/port: Move dport operations to a driver event Terry Bowman
2026-01-14 21:45 ` Dave Jiang
2026-01-15 14:56 ` Jonathan Cameron
2026-01-14 18:20 ` [PATCH v14 21/34] cxl/port: Move dport RAS reporting to a port resource Terry Bowman
2026-01-14 21:47 ` Dave Jiang
2026-01-15 15:02 ` Jonathan Cameron
2026-01-14 18:20 ` [PATCH v14 22/34] cxl: Update CXL Endpoint tracing Terry Bowman
2026-01-14 18:20 ` [PATCH v14 23/34] cxl: Map CXL Endpoint Port and CXL Switch Port RAS registers Terry Bowman
2026-01-14 21:53 ` Dave Jiang
2026-01-15 15:17 ` Jonathan Cameron
2026-01-14 18:20 ` [PATCH v14 24/34] cxl/port: Move endpoint component register management to cxl_port Terry Bowman
2026-01-14 21:55 ` Dave Jiang
2026-01-15 15:28 ` Jonathan Cameron
2026-01-14 18:20 ` [PATCH v14 25/34] cxl/port: Map Port component registers before switchport init Terry Bowman
2026-01-14 21:59 ` Dave Jiang
2026-01-15 15:30 ` Jonathan Cameron
2026-01-14 18:20 ` [PATCH v14 26/34] cxl: Change CXL handlers to use guard() instead of scoped_guard() Terry Bowman
2026-01-23 10:05 ` Markus Elfring
2026-01-14 18:20 ` [PATCH v14 27/34] PCI/ERR: Introduce PCI_ERS_RESULT_PANIC Terry Bowman
2026-01-14 18:58 ` Kuppuswamy Sathyanarayanan
2026-01-14 19:20 ` Bowman, Terry
2026-01-14 19:45 ` Kuppuswamy Sathyanarayanan
2026-01-14 18:20 ` [PATCH v14 28/34] PCI/AER: Move AER driver's CXL VH handling to pcie/aer_cxl_vh.c Terry Bowman
2026-01-15 15:40 ` Jonathan Cameron
2026-01-14 18:20 ` [PATCH v14 29/34] cxl/port: Unify endpoint and switch port lookup Terry Bowman
2026-01-14 23:04 ` Dave Jiang
2026-01-15 15:44 ` Jonathan Cameron
2026-01-14 18:20 ` [PATCH v14 30/34] PCI/AER: Dequeue forwarded CXL error Terry Bowman
2026-01-14 23:18 ` Dave Jiang
2026-01-16 14:42 ` Bowman, Terry
2026-01-15 16:01 ` Jonathan Cameron
2026-01-15 17:29 ` Bowman, Terry
2026-01-22 18:32 ` Bjorn Helgaas
2026-01-14 18:20 ` [PATCH v14 31/34] PCI: Introduce CXL Port protocol error handlers Terry Bowman
2026-01-14 23:37 ` Dave Jiang
2026-01-15 16:12 ` Jonathan Cameron
2026-01-22 18:27 ` Bjorn Helgaas
2026-01-14 18:20 ` [PATCH v14 32/34] cxl: Update Endpoint uncorrectable protocol error handling Terry Bowman
2026-01-14 22:07 ` dan.j.williams
2026-01-15 15:26 ` Bowman, Terry
2026-01-15 15:27 ` Bowman, Terry
2026-01-14 18:20 ` [PATCH v14 33/34] cxl: Update Endpoint correctable " Terry Bowman
2026-01-14 18:20 ` [PATCH v14 34/34] cxl: Enable CXL protocol errors during CXL Port probe Terry Bowman
2026-01-15 16:18 ` Jonathan Cameron
2026-01-15 19:41 ` Bowman, Terry
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aXInXZiTCSN06si8@wunner.de \
--to=lukas@wunner.de \
--cc=Benjamin.Cheatham@amd.com \
--cc=PradeepVineshReddy.Kodamati@amd.com \
--cc=Smita.KoralahalliChannabasappa@amd.com \
--cc=alison.schofield@intel.com \
--cc=alucerop@amd.com \
--cc=bhelgaas@google.com \
--cc=dan.carpenter@linaro.org \
--cc=dan.j.williams@intel.com \
--cc=dave.jiang@intel.com \
--cc=dave@stgolabs.net \
--cc=ira.weiny@intel.com \
--cc=jonathan.cameron@huawei.com \
--cc=linux-cxl@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=ming.li@zohomail.com \
--cc=rrichter@amd.com \
--cc=sathyanarayanan.kuppuswamy@linux.intel.com \
--cc=shiju.jose@huawei.com \
--cc=terry.bowman@amd.com \
--cc=vishal.l.verma@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox