From: Terry Bowman <terry.bowman@amd.com>
To: <dave@stgolabs.net>, <jic23@kernel.org>, <dave.jiang@intel.com>,
<alison.schofield@intel.com>, <djbw@kernel.org>,
<bhelgaas@google.com>, <shiju.jose@huawei.com>,
<ming.li@zohomail.com>, <Smita.KoralahalliChannabasappa@amd.com>,
<rrichter@amd.com>, <dan.carpenter@linaro.org>,
<PradeepVineshReddy.Kodamati@amd.com>, <lukas@wunner.de>,
<Benjamin.Cheatham@amd.com>,
<sathyanarayanan.kuppuswamy@linux.intel.com>,
<vishal.l.verma@intel.com>, <alucerop@amd.com>,
<ira.weiny@intel.com>, <corbet@lwn.net>, <rafael@kernel.org>,
<xueshuai@linux.alibaba.com>, <linux-cxl@vger.kernel.org>
Cc: <linux-kernel@vger.kernel.org>, <linux-pci@vger.kernel.org>,
<linux-acpi@vger.kernel.org>, <linux-doc@vger.kernel.org>,
<terry.bowman@amd.com>
Subject: [PATCH v17 10/11] PCI/CXL: Mask/Unmask CXL protocol errors
Date: Tue, 5 May 2026 12:30:28 -0500 [thread overview]
Message-ID: <20260505173029.2718246-11-terry.bowman@amd.com> (raw)
In-Reply-To: <20260505173029.2718246-1-terry.bowman@amd.com>
CXL protocol errors are not enabled for all CXL devices after boot. They
must be enabled in order to process CXL protocol errors. Provide matching
teardown helpers so the masks are restored when a CXL Port or Downstream
Port goes away.
Add pci_aer_mask_internal_errors() as the symmetric counterpart to
pci_aer_unmask_internal_errors() and export both for the cxl_core module.
Introduce cxl_unmask_proto_interrupts() and cxl_mask_proto_interrupts()
in cxl_core to wrap the PCI helpers with the dev_is_pci() and
pcie_aer_is_native() gating CXL needs. Both helpers tolerate a NULL
@dev so teardown callers do not have to special-case it.
Wire cxl_unmask_proto_interrupts() into the success path of
cxl_dport_map_ras() and devm_cxl_port_ras_setup() so the unmask only
runs when the RAS register block was actually mapped. Pair each unmask
with a devm_add_action_or_reset() registration of
cxl_mask_proto_interrupts() scoped to the cxl_port device. The mask is
then restored when the cxl_port device releases its devres. This
applies to Endpoints, Upstream Switch Ports, Downstream Switch Ports,
and Root Ports.
Co-developed-by: Dan Williams <djbw@kernel.org>
Signed-off-by: Dan Williams <djbw@kernel.org>
Signed-off-by: Terry Bowman <terry.bowman@amd.com>
---
Changes in v16->v17:
- Drop redundant cxl_mask_proto_interrupts() calls from unregister_port()
and cxl_dport_remove(); the devres action registered alongside the unmask
is the sole mask path.
- Update title
- Remove unnecessary check for aer_capabilities
- Gate cxl_unmask_proto_interrupts() on pcie_aer_is_native()
- Add pci_aer_mask_internal_errors() and cxl_mask_proto_interrupts()
- Only unmask on successful cxl_map_component_regs()
- NULL-check @dev in cxl_{un,}mask_proto_interrupts()
- Drop static and declare in core/core.h
Change in v15 -> v16:
- None
Change in v14 -> v15:
- None
Changes in v13->v14:
- Update commit title's prefix (Bjorn)
Changes in v12->v13:
- Add dev and dev_is_pci() NULL checks in cxl_unmask_proto_interrupts() (Terry)
- Add Dave Jiang's and Ben's review-by
Changes in v11->v12:
- None
---
drivers/cxl/core/core.h | 4 +++
drivers/cxl/core/ras.c | 63 ++++++++++++++++++++++++++++++++++++++---
drivers/pci/pcie/aer.c | 25 ++++++++++++++++
include/linux/aer.h | 2 ++
4 files changed, 90 insertions(+), 4 deletions(-)
diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index 2c7387506dfb..ff39985d363f 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -190,6 +190,8 @@ void cxl_dport_map_rch_aer(struct cxl_dport *dport);
void cxl_disable_rch_root_ints(struct cxl_dport *dport);
void cxl_handle_rdport_errors(struct pci_dev *pdev);
void devm_cxl_dport_ras_setup(struct cxl_dport *dport);
+void cxl_unmask_proto_interrupts(struct device *dev);
+void cxl_mask_proto_interrupts(struct device *dev);
#else
static inline int cxl_ras_init(void)
{
@@ -207,6 +209,8 @@ static inline void cxl_dport_map_rch_aer(struct cxl_dport *dport) { }
static inline void cxl_disable_rch_root_ints(struct cxl_dport *dport) { }
static inline void cxl_handle_rdport_errors(struct pci_dev *pdev) { }
static inline void devm_cxl_dport_ras_setup(struct cxl_dport *dport) { }
+static inline void cxl_unmask_proto_interrupts(struct device *dev) { }
+static inline void cxl_mask_proto_interrupts(struct device *dev) { }
#endif /* CONFIG_CXL_RAS */
int cxl_gpf_port_setup(struct cxl_dport *dport);
diff --git a/drivers/cxl/core/ras.c b/drivers/cxl/core/ras.c
index a98ce0f412ad..b45e2b539b5f 100644
--- a/drivers/cxl/core/ras.c
+++ b/drivers/cxl/core/ras.c
@@ -66,16 +66,59 @@ static void cxl_cper_prot_err_work_fn(struct work_struct *work)
}
static DECLARE_WORK(cxl_cper_prot_err_work, cxl_cper_prot_err_work_fn);
+void cxl_unmask_proto_interrupts(struct device *dev)
+{
+ struct pci_dev *pdev;
+
+ if (!dev || !dev_is_pci(dev))
+ return;
+
+ pdev = to_pci_dev(dev);
+ if (!pcie_aer_is_native(pdev))
+ return;
+
+ pci_aer_unmask_internal_errors(pdev);
+}
+
+void cxl_mask_proto_interrupts(struct device *dev)
+{
+ struct pci_dev *pdev;
+
+ if (!dev || !dev_is_pci(dev))
+ return;
+
+ pdev = to_pci_dev(dev);
+ if (!pcie_aer_is_native(pdev))
+ return;
+
+ pci_aer_mask_internal_errors(pdev);
+}
+
+static void cxl_mask_proto_irqs(void *dev)
+{
+ cxl_mask_proto_interrupts(dev);
+}
+
static void cxl_dport_map_ras(struct cxl_dport *dport)
{
struct cxl_register_map *map = &dport->reg_map;
struct device *dev = dport->dport_dev;
- if (!map->component_map.ras.valid)
+ if (!map->component_map.ras.valid) {
dev_dbg(dev, "RAS registers not found\n");
- else if (cxl_map_component_regs(map, &dport->regs.component,
- BIT(CXL_CM_CAP_CAP_ID_RAS)))
+ return;
+ }
+
+ if (cxl_map_component_regs(map, &dport->regs.component,
+ BIT(CXL_CM_CAP_CAP_ID_RAS))) {
dev_dbg(dev, "Failed to map RAS capability.\n");
+ return;
+ }
+
+ cxl_unmask_proto_interrupts(dev);
+ if (devm_add_action_or_reset(dport_to_host(dport),
+ cxl_mask_proto_irqs, dev))
+ dev_warn(dev, "failed to register CXL proto-irq mask cleanup\n");
}
/**
@@ -109,6 +152,7 @@ EXPORT_SYMBOL_NS_GPL(devm_cxl_dport_rch_ras_setup, "CXL");
void devm_cxl_port_ras_setup(struct cxl_port *port)
{
struct cxl_register_map *map = &port->reg_map;
+ struct device *dev;
if (!map->component_map.ras.valid) {
dev_dbg(&port->dev, "RAS registers not found\n");
@@ -117,8 +161,19 @@ void devm_cxl_port_ras_setup(struct cxl_port *port)
map->host = &port->dev;
if (cxl_map_component_regs(map, &port->regs,
- BIT(CXL_CM_CAP_CAP_ID_RAS)))
+ BIT(CXL_CM_CAP_CAP_ID_RAS))) {
dev_dbg(&port->dev, "Failed to map RAS capability\n");
+ return;
+ }
+
+ dev = is_cxl_endpoint(port) ? port->uport_dev->parent : port->uport_dev;
+ if (!dev_is_pci(dev))
+ return;
+
+ cxl_unmask_proto_interrupts(dev);
+ if (devm_add_action_or_reset(&port->dev, cxl_mask_proto_irqs, dev))
+ dev_warn(&port->dev,
+ "Failed to register CXL proto-irq mask cleanup\n");
}
EXPORT_SYMBOL_NS_GPL(devm_cxl_port_ras_setup, "CXL");
diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index b9c6c7b97217..eaa36fe0eb31 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -1151,6 +1151,31 @@ void pci_aer_unmask_internal_errors(struct pci_dev *dev)
*/
EXPORT_SYMBOL_FOR_MODULES(pci_aer_unmask_internal_errors, "cxl_core");
+/**
+ * pci_aer_mask_internal_errors - mask internal errors
+ * @dev: pointer to the pci_dev data structure
+ *
+ * Mask internal errors in the Uncorrectable and Correctable Error
+ * Mask registers.
+ *
+ * Note: AER must be enabled and supported by the device which must be
+ * checked in advance, e.g. with pcie_aer_is_native().
+ */
+void pci_aer_mask_internal_errors(struct pci_dev *dev)
+{
+ int aer = dev->aer_cap;
+ u32 mask;
+
+ pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_MASK, &mask);
+ mask |= PCI_ERR_UNC_INTN;
+ pci_write_config_dword(dev, aer + PCI_ERR_UNCOR_MASK, mask);
+
+ pci_read_config_dword(dev, aer + PCI_ERR_COR_MASK, &mask);
+ mask |= PCI_ERR_COR_INTERNAL;
+ pci_write_config_dword(dev, aer + PCI_ERR_COR_MASK, mask);
+}
+EXPORT_SYMBOL_FOR_MODULES(pci_aer_mask_internal_errors, "cxl_core");
+
/**
* pci_aer_handle_error - handle logging error into an event log
* @dev: pointer to pci_dev data structure of error source device
diff --git a/include/linux/aer.h b/include/linux/aer.h
index 979ed2f9fd38..c52db62d4c7e 100644
--- a/include/linux/aer.h
+++ b/include/linux/aer.h
@@ -71,6 +71,7 @@ int pci_aer_clear_nonfatal_status(struct pci_dev *dev);
void pci_aer_clear_fatal_status(struct pci_dev *dev);
int pcie_aer_is_native(struct pci_dev *dev);
void pci_aer_unmask_internal_errors(struct pci_dev *dev);
+void pci_aer_mask_internal_errors(struct pci_dev *dev);
#else
static inline int pci_aer_clear_nonfatal_status(struct pci_dev *dev)
{
@@ -79,6 +80,7 @@ static inline int pci_aer_clear_nonfatal_status(struct pci_dev *dev)
static inline void pci_aer_clear_fatal_status(struct pci_dev *dev) { }
static inline int pcie_aer_is_native(struct pci_dev *dev) { return 0; }
static inline void pci_aer_unmask_internal_errors(struct pci_dev *dev) { }
+static inline void pci_aer_mask_internal_errors(struct pci_dev *dev) { }
#endif
#ifdef CONFIG_CXL_RAS
--
2.34.1
next prev parent reply other threads:[~2026-05-05 17:32 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-05 17:30 [PATCH v17 00/11] Enable CXL PCIe Port Protocol Error handling and logging Terry Bowman
2026-05-05 17:30 ` [PATCH v17 01/11] PCI/AER: Introduce AER-CXL Kfifo Terry Bowman
2026-05-05 21:17 ` Dave Jiang
2026-05-05 17:30 ` [PATCH v17 02/11] cxl/ras: Unify Endpoint and Port AER trace events Terry Bowman
2026-05-05 21:46 ` Dave Jiang
2026-05-05 17:30 ` [PATCH v17 03/11] cxl: Use common CPER handling for all CXL devices Terry Bowman
2026-05-05 22:02 ` Dave Jiang
2026-05-05 17:30 ` [PATCH v17 04/11] cxl: Rename find_cxl_port() to find_cxl_port_by_dport() Terry Bowman
2026-05-05 22:06 ` Dave Jiang
2026-05-05 17:30 ` [PATCH v17 05/11] cxl: Limit CXL-CPER kfifo registration functions scope Terry Bowman
2026-05-05 22:16 ` Dave Jiang
2026-05-05 17:30 ` [PATCH v17 06/11] PCI: Establish common CXL Port protocol error flow Terry Bowman
2026-05-05 17:30 ` [PATCH v17 07/11] PCI/CXL: Add RCH support to CXL handlers Terry Bowman
2026-05-05 23:59 ` Dave Jiang
2026-05-05 17:30 ` [PATCH v17 08/11] cxl: Remove Endpoint AER correctable handler Terry Bowman
2026-05-05 17:30 ` [PATCH v17 09/11] cxl: Update Endpoint AER uncorrectable handler Terry Bowman
2026-05-06 17:43 ` Dave Jiang
2026-05-05 17:30 ` Terry Bowman [this message]
2026-05-06 18:00 ` [PATCH v17 10/11] PCI/CXL: Mask/Unmask CXL protocol errors Dave Jiang
2026-05-05 17:30 ` [PATCH v17 11/11] Documentation: cxl: Document CXL protocol error handling Terry Bowman
2026-05-06 18:34 ` Dave Jiang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260505173029.2718246-11-terry.bowman@amd.com \
--to=terry.bowman@amd.com \
--cc=Benjamin.Cheatham@amd.com \
--cc=PradeepVineshReddy.Kodamati@amd.com \
--cc=Smita.KoralahalliChannabasappa@amd.com \
--cc=alison.schofield@intel.com \
--cc=alucerop@amd.com \
--cc=bhelgaas@google.com \
--cc=corbet@lwn.net \
--cc=dan.carpenter@linaro.org \
--cc=dave.jiang@intel.com \
--cc=dave@stgolabs.net \
--cc=djbw@kernel.org \
--cc=ira.weiny@intel.com \
--cc=jic23@kernel.org \
--cc=linux-acpi@vger.kernel.org \
--cc=linux-cxl@vger.kernel.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=lukas@wunner.de \
--cc=ming.li@zohomail.com \
--cc=rafael@kernel.org \
--cc=rrichter@amd.com \
--cc=sathyanarayanan.kuppuswamy@linux.intel.com \
--cc=shiju.jose@huawei.com \
--cc=vishal.l.verma@intel.com \
--cc=xueshuai@linux.alibaba.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox