* [PATCH 0/6 v7] Make ELOG and GHES log and trace consistently
@ 2025-11-04 18:22 Fabio M. De Francesco
2025-11-04 18:22 ` [PATCH 1/6 v7] ACPI: extlog: Trace CPER Non-standard Section Body Fabio M. De Francesco
` (5 more replies)
0 siblings, 6 replies; 18+ messages in thread
From: Fabio M. De Francesco @ 2025-11-04 18:22 UTC
To: linux-cxl
Cc: Rafael J . Wysocki, Len Brown, Tony Luck, Borislav Petkov,
Hanjun Guo, Mauro Carvalho Chehab, Shuai Xue, Davidlohr Bueso,
Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma,
Ira Weiny, Dan Williams, Mahesh J Salgaonkar,
Oliver O'Halloran, Bjorn Helgaas, linux-kernel, linux-acpi,
linuxppc-dev, linux-pci, Fabio M. De Francesco
When Firmware First is enabled, BIOS handles errors first and then
makes them available to the kernel via Common Platform Error Record
(CPER) sections (UEFI 2.11, Appendix N). Linux parses the CPER sections
via one of two similar paths, either ELOG or GHES.
Currently, ELOG and GHES are inconsistent in how they print to
the kernel log and in how they report to userspace via trace
events.
Make the two paths behave consistently with respect to logging
and tracing.
--- Changes for v7 ---
- Reference UEFI v2.11 (Sathyanarayanan)
- Substitute !(A || B) with !(A && B) in an 'if' statement to
convey the intended logic (Jonathan)
- Make ACPI_APEI_GHES explicitly select PCIEAER because the needed
ACPI_APEI_PCIEAER doesn't recursively select that prerequisite (Jonathan)
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202510232204.7aYBpl7h-lkp@intel.com/
Closes: https://lore.kernel.org/oe-kbuild-all/202510232204.XIXgPWD7-lkp@intel.com/
- Don't add the unnecessary cxl_cper_ras_handle_prot_err() wrapper
for cxl_cper_handle_prot_err() (Jonathan)
- Make ACPI_EXTLOG explicitly select PCIEAER and ACPI_APEI because
the needed ACPI_APEI_PCIEAER doesn't recursively select the
prerequisites
- Make ACPI_EXTLOG select CXL_BUS
--- Changes for v6 ---
- Rename the helper that copies the CPER CXL protocol error
information to work struct (Dave)
- Return -EOPNOTSUPP (instead of -EINVAL) from the two helpers if
ACPI_APEI_PCIEAER is not defined (Dave)
--- Changes for v5 ---
- Add 3/6 to select ACPI_APEI_PCIEAER for GHES
- Add 4/6 and 5/6 to move code common to ELOG and GHES out to new
helpers, and use them in 6/6 (Jonathan).
--- Changes for v4 ---
- Re-base on top of recent changes of the AER error logging and
drop obsoleted 2/4 (Sathyanarayanan)
- Log with pr_warn_ratelimited() (Dave)
- Collect tags
--- Changes for v3 ---
1/4, 2/4:
- collect tags; no functional changes
3/4:
- Invert logic of checks (Yazen)
- Select CONFIG_ACPI_APEI_PCIEAER (Yazen)
4/4:
- Check serial number only for CXL devices (Yazen)
- Replace "invalid" with "unknown" in the output of a pr_err()
(Yazen)
--- Changes for v2 ---
- Add a patch to pass log levels to pci_print_aer() (Dan)
- Add a patch to trace CPER CXL Protocol Errors
- Rework commit messages (Dan)
- Use log_non_standard_event() (Bjorn)
--- Changes for v1 ---
- Drop the RFC prefix and restart from PATCH v1
- Drop patch 3/3 because a discussion on it has not yet been
settled
- Drop namespacing in export of pci_print_aer() (Dan)
- Don't use '#ifdef' in *.c files (Dan)
- Drop a reference on pdev after operation is complete (Dan)
- Don't log an error message if pdev is NULL (Dan)
Fabio M. De Francesco (6):
ACPI: extlog: Trace CPER Non-standard Section Body
ACPI: extlog: Trace CPER PCI Express Error Section
acpi/ghes: Make GHES select ACPI_APEI_PCIEAER
acpi/ghes: Add helper for CPER CXL protocol errors validity checks
acpi/ghes: Add helper to copy CPER CXL protocol error information to
work struct
ACPI: extlog: Trace CPER CXL Protocol Error Section
drivers/acpi/Kconfig | 7 ++++-
drivers/acpi/acpi_extlog.c | 60 ++++++++++++++++++++++++++++++++++++
drivers/acpi/apei/Kconfig | 2 ++
drivers/acpi/apei/ghes.c | 62 +++++++++++++++++++++++++-------------
drivers/cxl/core/ras.c | 3 +-
drivers/pci/pcie/aer.c | 2 +-
include/cxl/event.h | 22 ++++++++++++++
7 files changed, 134 insertions(+), 24 deletions(-)
base-commit: c9cfc122f03711a5124b4aafab3211cf4d35a2ac
--
2.51.1
* [PATCH 1/6 v7] ACPI: extlog: Trace CPER Non-standard Section Body
2025-11-04 18:22 [PATCH 0/6 v7] Make ELOG and GHES log and trace consistently Fabio M. De Francesco
@ 2025-11-04 18:22 ` Fabio M. De Francesco
2025-11-04 18:22 ` [PATCH 2/6 v7] ACPI: extlog: Trace CPER PCI Express Error Section Fabio M. De Francesco
` (4 subsequent siblings)
5 siblings, 0 replies; 18+ messages in thread
From: Fabio M. De Francesco @ 2025-11-04 18:22 UTC
To: linux-cxl
Cc: Rafael J . Wysocki, Len Brown, Tony Luck, Borislav Petkov,
Hanjun Guo, Mauro Carvalho Chehab, Shuai Xue, Davidlohr Bueso,
Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma,
Ira Weiny, Dan Williams, Mahesh J Salgaonkar,
Oliver O'Halloran, Bjorn Helgaas, linux-kernel, linux-acpi,
linuxppc-dev, linux-pci, Fabio M. De Francesco, Jonathan Cameron,
Kuppuswamy Sathyanarayanan, Qiuxu Zhuo
ghes_do_proc() has a catch-all for unknown or unhandled CPER formats
(UEFI v2.11, Appendix N.2.3); extlog_print() does not. This gap was
noticed by a RAS test that injected CXL protocol errors, which were
reported to extlog_print() via the IOMCA (I/O Machine Check
Architecture) mechanism. Bring parity to the extlog_print() path by
adding a similar call to log_non_standard_event().
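The catch-all idea can be illustrated outside the kernel with a small
userspace sketch. It is not the extlog code: the GUID type, the SEC_MEM
constant, and dispatch_section() are hypothetical names chosen for the
example. The point is only that a section type without a dedicated handler
still ends up in a generic reporting branch instead of being dropped.

```c
#include <string.h>

/* Hypothetical stand-in for a 16-byte CPER section type GUID. */
typedef struct { unsigned char b[16]; } sec_guid;

/* Hypothetical "memory error" section type for this sketch. */
static const sec_guid SEC_MEM = { { 0x01 } };

enum handled_by { BY_MEM, BY_CATCH_ALL };

static int guid_eq(const sec_guid *a, const sec_guid *b)
{
	return memcmp(a, b, sizeof(*a)) == 0;
}

/*
 * Known section types get a dedicated handler. Every other type is
 * routed to the catch-all branch, which mirrors the role played by
 * log_non_standard_event() in the patch.
 */
static enum handled_by dispatch_section(const sec_guid *type)
{
	if (guid_eq(type, &SEC_MEM))
		return BY_MEM;
	return BY_CATCH_ALL;
}
```

With this shape, adding a handler for a new section type later only means
inserting another branch ahead of the final return.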
Cc: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
---
drivers/acpi/acpi_extlog.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/acpi/acpi_extlog.c b/drivers/acpi/acpi_extlog.c
index f6b9562779de..47d11cb5c912 100644
--- a/drivers/acpi/acpi_extlog.c
+++ b/drivers/acpi/acpi_extlog.c
@@ -183,6 +183,12 @@ static int extlog_print(struct notifier_block *nb, unsigned long val,
if (gdata->error_data_length >= sizeof(*mem))
trace_extlog_mem_event(mem, err_seq, fru_id, fru_text,
(u8)gdata->error_severity);
+ } else {
+ void *err = acpi_hest_get_payload(gdata);
+
+ log_non_standard_event(sec_type, fru_id, fru_text,
+ gdata->error_severity, err,
+ gdata->error_data_length);
}
}
--
2.51.1
* [PATCH 2/6 v7] ACPI: extlog: Trace CPER PCI Express Error Section
2025-11-04 18:22 [PATCH 0/6 v7] Make ELOG and GHES log and trace consistently Fabio M. De Francesco
2025-11-04 18:22 ` [PATCH 1/6 v7] ACPI: extlog: Trace CPER Non-standard Section Body Fabio M. De Francesco
@ 2025-11-04 18:22 ` Fabio M. De Francesco
2025-11-05 9:23 ` kernel test robot
2025-11-04 18:22 ` [PATCH 3/6 v7] acpi/ghes: Make GHES select ACPI_APEI_PCIEAER Fabio M. De Francesco
` (3 subsequent siblings)
5 siblings, 1 reply; 18+ messages in thread
From: Fabio M. De Francesco @ 2025-11-04 18:22 UTC
To: linux-cxl
Cc: Rafael J . Wysocki, Len Brown, Tony Luck, Borislav Petkov,
Hanjun Guo, Mauro Carvalho Chehab, Shuai Xue, Davidlohr Bueso,
Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma,
Ira Weiny, Dan Williams, Mahesh J Salgaonkar,
Oliver O'Halloran, Bjorn Helgaas, linux-kernel, linux-acpi,
linuxppc-dev, linux-pci, Fabio M. De Francesco
I/O Machine Check Architecture events may signal failing PCIe components
or links. The AER event contains details on what was happening on the wire
when the error was signaled.
Trace the CPER PCIe Error section (UEFI v2.11, Appendix N.2.7) reported
by the I/O MCA.
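As a rough userspace sketch of the two checks the new helper relies on: the
section is only used when both the device ID and the AER info are marked
valid, and the device/function pair is packed into a devfn byte. The bit
values below are illustrative assumptions (the kernel's definitions live in
<linux/cper.h>), and pcie_section_usable() is a hypothetical name. DEVFN()
follows the same encoding as the kernel's PCI_DEVFN().

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values, assumed for this sketch only. */
#define VALID_DEVICE_ID 0x0008u
#define VALID_AER_INFO  0x0080u

/* Same encoding as PCI_DEVFN(): 5-bit device number, 3-bit function. */
#define DEVFN(dev, fn) ((((dev) & 0x1fu) << 3) | ((fn) & 0x07u))

/*
 * The section is usable only if BOTH fields are valid. Negated, this
 * is the early-return condition !(A && B): bail out when either bit
 * is missing. Writing !(A || B) would bail out only when both are
 * missing, which is not the intended logic.
 */
static bool pcie_section_usable(uint64_t validation_bits)
{
	return (validation_bits & VALID_DEVICE_ID) &&
	       (validation_bits & VALID_AER_INFO);
}
```

For example, device 3 function 1 encodes to devfn 25, and a record that sets
only one of the two validity bits is skipped.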
Cc: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
---
drivers/acpi/Kconfig | 6 +++++-
drivers/acpi/acpi_extlog.c | 32 ++++++++++++++++++++++++++++++++
drivers/pci/pcie/aer.c | 2 +-
3 files changed, 38 insertions(+), 2 deletions(-)
diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index ca00a5dbcf75..be02634f2320 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -492,7 +492,11 @@ config ACPI_WATCHDOG
config ACPI_EXTLOG
tristate "Extended Error Log support"
- depends on X86_MCE && X86_LOCAL_APIC && EDAC
+ depends on X86_MCE && X86_LOCAL_APIC
+ select EDAC
+ select PCIEAER
+ select ACPI_APEI
+ select ACPI_APEI_PCIEAER
select UEFI_CPER
help
Certain usages such as Predictive Failure Analysis (PFA) require
diff --git a/drivers/acpi/acpi_extlog.c b/drivers/acpi/acpi_extlog.c
index 47d11cb5c912..b3976ceb4ee4 100644
--- a/drivers/acpi/acpi_extlog.c
+++ b/drivers/acpi/acpi_extlog.c
@@ -132,6 +132,34 @@ static int print_extlog_rcd(const char *pfx,
return 1;
}
+static void extlog_print_pcie(struct cper_sec_pcie *pcie_err,
+ int severity)
+{
+ struct aer_capability_regs *aer;
+ struct pci_dev *pdev;
+ unsigned int devfn;
+ unsigned int bus;
+ int aer_severity;
+ int domain;
+
+ if (!(pcie_err->validation_bits & CPER_PCIE_VALID_DEVICE_ID &&
+ pcie_err->validation_bits & CPER_PCIE_VALID_AER_INFO))
+ return;
+
+ aer_severity = cper_severity_to_aer(severity);
+ aer = (struct aer_capability_regs *)pcie_err->aer_info;
+ domain = pcie_err->device_id.segment;
+ bus = pcie_err->device_id.bus;
+ devfn = PCI_DEVFN(pcie_err->device_id.device,
+ pcie_err->device_id.function);
+ pdev = pci_get_domain_bus_and_slot(domain, bus, devfn);
+ if (!pdev)
+ return;
+
+ pci_print_aer(pdev, aer_severity, aer);
+ pci_dev_put(pdev);
+}
+
static int extlog_print(struct notifier_block *nb, unsigned long val,
void *data)
{
@@ -183,6 +211,10 @@ static int extlog_print(struct notifier_block *nb, unsigned long val,
if (gdata->error_data_length >= sizeof(*mem))
trace_extlog_mem_event(mem, err_seq, fru_id, fru_text,
(u8)gdata->error_severity);
+ } else if (guid_equal(sec_type, &CPER_SEC_PCIE)) {
+ struct cper_sec_pcie *pcie_err = acpi_hest_get_payload(gdata);
+
+ extlog_print_pcie(pcie_err, gdata->error_severity);
} else {
void *err = acpi_hest_get_payload(gdata);
diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 0b5ed4722ac3..1b903e0644d6 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -971,7 +971,7 @@ void pci_print_aer(struct pci_dev *dev, int aer_severity,
pcie_print_tlp_log(dev, &aer->header_log, info.level,
dev_fmt(" "));
}
-EXPORT_SYMBOL_NS_GPL(pci_print_aer, "CXL");
+EXPORT_SYMBOL_GPL(pci_print_aer);
/**
* add_error_device - list device to be handled
--
2.51.1
* [PATCH 3/6 v7] acpi/ghes: Make GHES select ACPI_APEI_PCIEAER
2025-11-04 18:22 [PATCH 0/6 v7] Make ELOG and GHES log and trace consistently Fabio M. De Francesco
2025-11-04 18:22 ` [PATCH 1/6 v7] ACPI: extlog: Trace CPER Non-standard Section Body Fabio M. De Francesco
2025-11-04 18:22 ` [PATCH 2/6 v7] ACPI: extlog: Trace CPER PCI Express Error Section Fabio M. De Francesco
@ 2025-11-04 18:22 ` Fabio M. De Francesco
2025-11-05 10:05 ` kernel test robot
2025-11-11 15:42 ` Jonathan Cameron
2025-11-04 18:22 ` [PATCH 4/6 v7] acpi/ghes: Add helper for CXL protocol errors checks Fabio M. De Francesco
` (2 subsequent siblings)
5 siblings, 2 replies; 18+ messages in thread
From: Fabio M. De Francesco @ 2025-11-04 18:22 UTC
To: linux-cxl
Cc: Rafael J . Wysocki, Len Brown, Tony Luck, Borislav Petkov,
Hanjun Guo, Mauro Carvalho Chehab, Shuai Xue, Davidlohr Bueso,
Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma,
Ira Weiny, Dan Williams, Mahesh J Salgaonkar,
Oliver O'Halloran, Bjorn Helgaas, linux-kernel, linux-acpi,
linuxppc-dev, linux-pci, Fabio M. De Francesco
GHES handles the PCI Express Error Section and also the Compute Express
Link (CXL) Protocol Error Section. Two of its functions depend on the
APEI PCIe AER logging/recovering support (ACPI_APEI_PCIEAER).
Make GHES select ACPI_APEI_PCIEAER and remove the conditional
compilation from the body of two static functions that handle the CPER
Error Sections mentioned above.
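For context, patches 4/6 and 5/6 of this series keep function bodies free of
'#ifdef' in a different way, by providing header stubs that return
-EOPNOTSUPP. A minimal userspace sketch of that pattern follows. All names
(DEMO_FEATURE_ENABLED, demo_record_valid(), demo_handle()) are hypothetical
and only illustrate the shape, not the actual kernel declarations.

```c
#include <errno.h>

/* Hypothetical feature switch standing in for a Kconfig symbol. */
/* #define DEMO_FEATURE_ENABLED 1 */

struct demo_record { unsigned int valid_bits; };

#ifdef DEMO_FEATURE_ENABLED
/* Real implementation would be provided by another translation unit. */
int demo_record_valid(const struct demo_record *rec);
#else
/* Stub used when the feature is compiled out: callers need no #ifdef. */
static inline int demo_record_valid(const struct demo_record *rec)
{
	(void)rec;
	return -EOPNOTSUPP;
}
#endif

/* The caller's body stays free of conditional compilation. */
static int demo_handle(const struct demo_record *rec)
{
	if (demo_record_valid(rec))
		return -1;	/* invalid or unsupported: skip the record */
	return 0;
}
```

Selecting the dependency, as this patch does, removes the need for the stub
entirely in the GHES case, at the cost of a harder Kconfig requirement.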
Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
---
drivers/acpi/apei/Kconfig | 2 ++
drivers/acpi/apei/ghes.c | 4 ----
2 files changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/acpi/apei/Kconfig b/drivers/acpi/apei/Kconfig
index 070c07d68dfb..cdf3cfa233b9 100644
--- a/drivers/acpi/apei/Kconfig
+++ b/drivers/acpi/apei/Kconfig
@@ -23,6 +23,8 @@ config ACPI_APEI_GHES
select ACPI_HED
select IRQ_WORK
select GENERIC_ALLOCATOR
+ select PCIEAER
+ select ACPI_APEI_PCIEAER
select ARM_SDE_INTERFACE if ARM64
help
Generic Hardware Error Source provides a way to report
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 97ee19f2cae0..d6fe5f020e96 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -613,7 +613,6 @@ static bool ghes_handle_arm_hw_error(struct acpi_hest_generic_data *gdata,
*/
static void ghes_handle_aer(struct acpi_hest_generic_data *gdata)
{
-#ifdef CONFIG_ACPI_APEI_PCIEAER
struct cper_sec_pcie *pcie_err = acpi_hest_get_payload(gdata);
if (pcie_err->validation_bits & CPER_PCIE_VALID_DEVICE_ID &&
@@ -646,7 +645,6 @@ static void ghes_handle_aer(struct acpi_hest_generic_data *gdata)
(struct aer_capability_regs *)
aer_info);
}
-#endif
}
static BLOCKING_NOTIFIER_HEAD(vendor_record_notify_list);
@@ -711,7 +709,6 @@ struct work_struct *cxl_cper_prot_err_work;
static void cxl_cper_post_prot_err(struct cxl_cper_sec_prot_err *prot_err,
int severity)
{
-#ifdef CONFIG_ACPI_APEI_PCIEAER
struct cxl_cper_prot_err_work_data wd;
u8 *dvsec_start, *cap_start;
@@ -767,7 +764,6 @@ static void cxl_cper_post_prot_err(struct cxl_cper_sec_prot_err *prot_err,
}
schedule_work(cxl_cper_prot_err_work);
-#endif
}
int cxl_cper_register_prot_err_work(struct work_struct *work)
--
2.51.1
* [PATCH 4/6 v7] acpi/ghes: Add helper for CXL protocol errors checks
2025-11-04 18:22 [PATCH 0/6 v7] Make ELOG and GHES log and trace consistently Fabio M. De Francesco
` (2 preceding siblings ...)
2025-11-04 18:22 ` [PATCH 3/6 v7] acpi/ghes: Make GHES select ACPI_APEI_PCIEAER Fabio M. De Francesco
@ 2025-11-04 18:22 ` Fabio M. De Francesco
2025-11-07 18:30 ` Dave Jiang
` (2 more replies)
2025-11-04 18:22 ` [PATCH 5/6 v7] acpi/ghes: Add helper to copy CXL protocol error info to work struct Fabio M. De Francesco
2025-11-04 18:22 ` [PATCH 6/6 v7] ACPI: extlog: Trace CPER CXL Protocol Error Section Fabio M. De Francesco
5 siblings, 3 replies; 18+ messages in thread
From: Fabio M. De Francesco @ 2025-11-04 18:22 UTC
To: linux-cxl
Cc: Rafael J . Wysocki, Len Brown, Tony Luck, Borislav Petkov,
Hanjun Guo, Mauro Carvalho Chehab, Shuai Xue, Davidlohr Bueso,
Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma,
Ira Weiny, Dan Williams, Mahesh J Salgaonkar,
Oliver O'Halloran, Bjorn Helgaas, linux-kernel, linux-acpi,
linuxppc-dev, linux-pci, Fabio M. De Francesco
Move the CPER CXL protocol error validity checks out of
cxl_cper_post_prot_err() into the new cxl_cper_sec_prot_err_valid() and
limit the serial number check to CXL agents that are CXL devices (UEFI
v2.11, Appendix N.2.13).
Export the new symbol for reuse by ELOG.
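The narrowed serial number check can be sketched in userspace as follows.
The agent type names are the ones used in the patch, plus RCH_DP, which is
included on the assumption that it sits between RCD and DEVICE in the kernel
enum. The numeric values and the VALID_SERIAL_NUMBER bit are assumptions for
the example, and the two helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Agent types named as in the patch; numeric values are assumed. */
enum agent_type { RCD, RCH_DP, DEVICE, LD, FMLD, RP, DSP, USP };

/* Illustrative bit, not necessarily the kernel's value. */
#define VALID_SERIAL_NUMBER (1u << 3)

/* Only these agent types are CXL devices that carry a serial number. */
static bool agent_is_cxl_device(enum agent_type t)
{
	return t == RCD || t == DEVICE || t == LD || t == FMLD;
}

/*
 * Warn about a missing serial number only for device-type agents.
 * Ports (RCH_DP, RP, DSP, USP) never trigger the warning.
 */
static bool should_warn_missing_serial(enum agent_type t, uint64_t valid_bits)
{
	return agent_is_cxl_device(t) && !(valid_bits & VALID_SERIAL_NUMBER);
}
```

Note that in the patch a missing serial number only produces a rate-limited
firmware warning; it does not make the validity helper return an error.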
Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
---
drivers/acpi/apei/ghes.c | 32 ++++++++++++++++++++++----------
include/cxl/event.h | 10 ++++++++++
2 files changed, 32 insertions(+), 10 deletions(-)
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index d6fe5f020e96..e69ae864f43d 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -706,30 +706,42 @@ static DEFINE_KFIFO(cxl_cper_prot_err_fifo, struct cxl_cper_prot_err_work_data,
static DEFINE_SPINLOCK(cxl_cper_prot_err_work_lock);
struct work_struct *cxl_cper_prot_err_work;
-static void cxl_cper_post_prot_err(struct cxl_cper_sec_prot_err *prot_err,
- int severity)
+int cxl_cper_sec_prot_err_valid(struct cxl_cper_sec_prot_err *prot_err)
{
- struct cxl_cper_prot_err_work_data wd;
- u8 *dvsec_start, *cap_start;
-
if (!(prot_err->valid_bits & PROT_ERR_VALID_AGENT_ADDRESS)) {
pr_err_ratelimited("CXL CPER invalid agent type\n");
- return;
+ return -EINVAL;
}
if (!(prot_err->valid_bits & PROT_ERR_VALID_ERROR_LOG)) {
pr_err_ratelimited("CXL CPER invalid protocol error log\n");
- return;
+ return -EINVAL;
}
if (prot_err->err_len != sizeof(struct cxl_ras_capability_regs)) {
pr_err_ratelimited("CXL CPER invalid RAS Cap size (%u)\n",
prot_err->err_len);
- return;
+ return -EINVAL;
}
- if (!(prot_err->valid_bits & PROT_ERR_VALID_SERIAL_NUMBER))
- pr_warn(FW_WARN "CXL CPER no device serial number\n");
+ if ((prot_err->agent_type == RCD || prot_err->agent_type == DEVICE ||
+ prot_err->agent_type == LD || prot_err->agent_type == FMLD) &&
+ !(prot_err->valid_bits & PROT_ERR_VALID_SERIAL_NUMBER))
+ pr_warn_ratelimited(FW_WARN
+ "CXL CPER no device serial number\n");
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(cxl_cper_sec_prot_err_valid);
+
+static void cxl_cper_post_prot_err(struct cxl_cper_sec_prot_err *prot_err,
+ int severity)
+{
+ struct cxl_cper_prot_err_work_data wd;
+ u8 *dvsec_start, *cap_start;
+
+ if (cxl_cper_sec_prot_err_valid(prot_err))
+ return;
guard(spinlock_irqsave)(&cxl_cper_prot_err_work_lock);
diff --git a/include/cxl/event.h b/include/cxl/event.h
index 6fd90f9cc203..4d7d1036ea9c 100644
--- a/include/cxl/event.h
+++ b/include/cxl/event.h
@@ -320,4 +320,14 @@ static inline int cxl_cper_prot_err_kfifo_get(struct cxl_cper_prot_err_work_data
}
#endif
+#ifdef CONFIG_ACPI_APEI_PCIEAER
+int cxl_cper_sec_prot_err_valid(struct cxl_cper_sec_prot_err *prot_err);
+#else
+static inline int
+cxl_cper_sec_prot_err_valid(struct cxl_cper_sec_prot_err *prot_err)
+{
+ return -EOPNOTSUPP;
+}
+#endif
+
#endif /* _LINUX_CXL_EVENT_H */
--
2.51.1
* [PATCH 5/6 v7] acpi/ghes: Add helper to copy CXL protocol error info to work struct
2025-11-04 18:22 [PATCH 0/6 v7] Make ELOG and GHES log and trace consistently Fabio M. De Francesco
` (3 preceding siblings ...)
2025-11-04 18:22 ` [PATCH 4/6 v7] acpi/ghes: Add helper for CXL protocol errors checks Fabio M. De Francesco
@ 2025-11-04 18:22 ` Fabio M. De Francesco
2025-11-07 18:31 ` Dave Jiang
2025-11-21 2:22 ` Hanjun Guo
2025-11-04 18:22 ` [PATCH 6/6 v7] ACPI: extlog: Trace CPER CXL Protocol Error Section Fabio M. De Francesco
5 siblings, 2 replies; 18+ messages in thread
From: Fabio M. De Francesco @ 2025-11-04 18:22 UTC
To: linux-cxl
Cc: Rafael J . Wysocki, Len Brown, Tony Luck, Borislav Petkov,
Hanjun Guo, Mauro Carvalho Chehab, Shuai Xue, Davidlohr Bueso,
Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma,
Ira Weiny, Dan Williams, Mahesh J Salgaonkar,
Oliver O'Halloran, Bjorn Helgaas, linux-kernel, linux-acpi,
linuxppc-dev, linux-pci, Fabio M. De Francesco
Factor out of cxl_cper_post_prot_err() a helper that checks the CXL agent
type and copies the CPER CXL protocol error information to a work data
structure.
Export the new symbol for reuse by ELOG.
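The copy performed by the helper depends on the section layout: a fixed
header, then a DVSEC of dvsec_len bytes, then the RAS capability registers.
A simplified userspace sketch of that offset arithmetic is shown below. The
struct layouts are hypothetical and much smaller than the real ones, and the
explicit bounds check against section_len is an addition for the sketch; the
helper in the patch does not take a buffer length.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified section header for this sketch. */
struct demo_hdr {
	uint16_t dvsec_len;	/* bytes of DVSEC following the header */
	uint16_t err_len;	/* bytes of error log after the DVSEC */
};

/* Hypothetical stand-in for the RAS capability register block. */
struct demo_ras_cap { uint32_t regs[4]; };

/*
 * Copy the register block that sits after the header and the DVSEC.
 * Returns 0 on success, -1 if the lengths do not fit the buffer or
 * the error log is not exactly one register block.
 */
static int copy_ras_cap(struct demo_ras_cap *out, const uint8_t *section,
			size_t section_len)
{
	struct demo_hdr h;
	size_t off;

	if (section_len < sizeof(h))
		return -1;
	memcpy(&h, section, sizeof(h));

	/* Same idea as cap_start = dvsec_start + dvsec_len in the patch. */
	off = sizeof(h) + h.dvsec_len;
	if (h.err_len != sizeof(*out) || section_len < off + sizeof(*out))
		return -1;

	memcpy(out, section + off, sizeof(*out));
	return 0;
}
```

Copying into a caller-provided structure, as the new helper does with its
wd argument, lets both the GHES work-queue path and a direct caller reuse
the same code.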
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
---
drivers/acpi/apei/ghes.c | 42 ++++++++++++++++++++++++++--------------
include/cxl/event.h | 10 ++++++++++
2 files changed, 37 insertions(+), 15 deletions(-)
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index e69ae864f43d..2f4632d9855a 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -734,20 +734,12 @@ int cxl_cper_sec_prot_err_valid(struct cxl_cper_sec_prot_err *prot_err)
}
EXPORT_SYMBOL_GPL(cxl_cper_sec_prot_err_valid);
-static void cxl_cper_post_prot_err(struct cxl_cper_sec_prot_err *prot_err,
- int severity)
+int cxl_cper_setup_prot_err_work_data(struct cxl_cper_prot_err_work_data *wd,
+ struct cxl_cper_sec_prot_err *prot_err,
+ int severity)
{
- struct cxl_cper_prot_err_work_data wd;
u8 *dvsec_start, *cap_start;
- if (cxl_cper_sec_prot_err_valid(prot_err))
- return;
-
- guard(spinlock_irqsave)(&cxl_cper_prot_err_work_lock);
-
- if (!cxl_cper_prot_err_work)
- return;
-
switch (prot_err->agent_type) {
case RCD:
case DEVICE:
@@ -756,20 +748,40 @@ static void cxl_cper_post_prot_err(struct cxl_cper_sec_prot_err *prot_err,
case RP:
case DSP:
case USP:
- memcpy(&wd.prot_err, prot_err, sizeof(wd.prot_err));
+ memcpy(&wd->prot_err, prot_err, sizeof(wd->prot_err));
dvsec_start = (u8 *)(prot_err + 1);
cap_start = dvsec_start + prot_err->dvsec_len;
- memcpy(&wd.ras_cap, cap_start, sizeof(wd.ras_cap));
- wd.severity = cper_severity_to_aer(severity);
+ memcpy(&wd->ras_cap, cap_start, sizeof(wd->ras_cap));
+ wd->severity = cper_severity_to_aer(severity);
break;
default:
pr_err_ratelimited("CXL CPER invalid agent type: %d\n",
prot_err->agent_type);
- return;
+ return -EINVAL;
}
+ return 0;
+}
+EXPORT_SYMBOL_GPL(cxl_cper_setup_prot_err_work_data);
+
+static void cxl_cper_post_prot_err(struct cxl_cper_sec_prot_err *prot_err,
+ int severity)
+{
+ struct cxl_cper_prot_err_work_data wd;
+
+ if (cxl_cper_sec_prot_err_valid(prot_err))
+ return;
+
+ guard(spinlock_irqsave)(&cxl_cper_prot_err_work_lock);
+
+ if (!cxl_cper_prot_err_work)
+ return;
+
+ if (cxl_cper_setup_prot_err_work_data(&wd, prot_err, severity))
+ return;
+
if (!kfifo_put(&cxl_cper_prot_err_fifo, wd)) {
pr_err_ratelimited("CXL CPER kfifo overflow\n");
return;
diff --git a/include/cxl/event.h b/include/cxl/event.h
index 4d7d1036ea9c..94081aec597a 100644
--- a/include/cxl/event.h
+++ b/include/cxl/event.h
@@ -322,12 +322,22 @@ static inline int cxl_cper_prot_err_kfifo_get(struct cxl_cper_prot_err_work_data
#ifdef CONFIG_ACPI_APEI_PCIEAER
int cxl_cper_sec_prot_err_valid(struct cxl_cper_sec_prot_err *prot_err);
+int cxl_cper_setup_prot_err_work_data(struct cxl_cper_prot_err_work_data *wd,
+ struct cxl_cper_sec_prot_err *prot_err,
+ int severity);
#else
static inline int
cxl_cper_sec_prot_err_valid(struct cxl_cper_sec_prot_err *prot_err)
{
return -EOPNOTSUPP;
}
+static inline int
+cxl_cper_setup_prot_err_work_data(struct cxl_cper_prot_err_work_data *wd,
+ struct cxl_cper_sec_prot_err *prot_err,
+ int severity)
+{
+ return -EOPNOTSUPP;
+}
#endif
#endif /* _LINUX_CXL_EVENT_H */
--
2.51.1
* [PATCH 6/6 v7] ACPI: extlog: Trace CPER CXL Protocol Error Section
2025-11-04 18:22 [PATCH 0/6 v7] Make ELOG and GHES log and trace consistently Fabio M. De Francesco
` (4 preceding siblings ...)
2025-11-04 18:22 ` [PATCH 5/6 v7] acpi/ghes: Add helper to copy CXL protocol error info to work struct Fabio M. De Francesco
@ 2025-11-04 18:22 ` Fabio M. De Francesco
2025-11-05 9:01 ` kernel test robot
2025-11-07 19:59 ` Dave Jiang
5 siblings, 2 replies; 18+ messages in thread
From: Fabio M. De Francesco @ 2025-11-04 18:22 UTC
To: linux-cxl
Cc: Rafael J . Wysocki, Len Brown, Tony Luck, Borislav Petkov,
Hanjun Guo, Mauro Carvalho Chehab, Shuai Xue, Davidlohr Bueso,
Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma,
Ira Weiny, Dan Williams, Mahesh J Salgaonkar,
Oliver O'Halloran, Bjorn Helgaas, linux-kernel, linux-acpi,
linuxppc-dev, linux-pci, Fabio M. De Francesco,
Kuppuswamy Sathyanarayanan
When Firmware First is enabled, BIOS handles errors first and then
makes them available to the kernel via Common Platform Error Record
(CPER) sections (UEFI 2.11, Appendix N.2.13). Linux parses the CPER
sections via one of two similar paths, either ELOG or GHES. The errors
managed by ELOG are signaled to the BIOS by the I/O Machine Check
Architecture (I/O MCA).
Currently, ELOG and GHES are inconsistent in how they report to
userspace via trace events.
Therefore, make the two paths consistent by tracing the CPER CXL
Protocol Error Section in ELOG as well.
Cc: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
---
drivers/acpi/Kconfig | 1 +
drivers/acpi/acpi_extlog.c | 22 ++++++++++++++++++++++
drivers/cxl/core/ras.c | 3 ++-
include/cxl/event.h | 2 ++
4 files changed, 27 insertions(+), 1 deletion(-)
diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index be02634f2320..c2ad24e77ddf 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -498,6 +498,7 @@ config ACPI_EXTLOG
select ACPI_APEI
select ACPI_APEI_PCIEAER
select UEFI_CPER
+ select CXL_BUS
help
Certain usages such as Predictive Failure Analysis (PFA) require
more information about the error than what can be described in
diff --git a/drivers/acpi/acpi_extlog.c b/drivers/acpi/acpi_extlog.c
index b3976ceb4ee4..e6fb25395984 100644
--- a/drivers/acpi/acpi_extlog.c
+++ b/drivers/acpi/acpi_extlog.c
@@ -12,6 +12,7 @@
#include <linux/ratelimit.h>
#include <linux/edac.h>
#include <linux/ras.h>
+#include <cxl/event.h>
#include <acpi/ghes.h>
#include <asm/cpu.h>
#include <asm/mce.h>
@@ -160,6 +161,21 @@ static void extlog_print_pcie(struct cper_sec_pcie *pcie_err,
pci_dev_put(pdev);
}
+static void
+extlog_cxl_cper_handle_prot_err(struct cxl_cper_sec_prot_err *prot_err,
+ int severity)
+{
+ struct cxl_cper_prot_err_work_data wd;
+
+ if (cxl_cper_sec_prot_err_valid(prot_err))
+ return;
+
+ if (cxl_cper_setup_prot_err_work_data(&wd, prot_err, severity))
+ return;
+
+ cxl_cper_handle_prot_err(&wd);
+}
+
static int extlog_print(struct notifier_block *nb, unsigned long val,
void *data)
{
@@ -211,6 +227,12 @@ static int extlog_print(struct notifier_block *nb, unsigned long val,
if (gdata->error_data_length >= sizeof(*mem))
trace_extlog_mem_event(mem, err_seq, fru_id, fru_text,
(u8)gdata->error_severity);
+ } else if (guid_equal(sec_type, &CPER_SEC_CXL_PROT_ERR)) {
+ struct cxl_cper_sec_prot_err *prot_err =
+ acpi_hest_get_payload(gdata);
+
+ extlog_cxl_cper_handle_prot_err(prot_err,
+ gdata->error_severity);
} else if (guid_equal(sec_type, &CPER_SEC_PCIE)) {
struct cper_sec_pcie *pcie_err = acpi_hest_get_payload(gdata);
diff --git a/drivers/cxl/core/ras.c b/drivers/cxl/core/ras.c
index 2731ba3a0799..a90480d07c87 100644
--- a/drivers/cxl/core/ras.c
+++ b/drivers/cxl/core/ras.c
@@ -63,7 +63,7 @@ static int match_memdev_by_parent(struct device *dev, const void *uport)
return 0;
}
-static void cxl_cper_handle_prot_err(struct cxl_cper_prot_err_work_data *data)
+void cxl_cper_handle_prot_err(struct cxl_cper_prot_err_work_data *data)
{
unsigned int devfn = PCI_DEVFN(data->prot_err.agent_addr.device,
data->prot_err.agent_addr.function);
@@ -104,6 +104,7 @@ static void cxl_cper_handle_prot_err(struct cxl_cper_prot_err_work_data *data)
else
cxl_cper_trace_uncorr_prot_err(cxlmd, data->ras_cap);
}
+EXPORT_SYMBOL_GPL(cxl_cper_handle_prot_err);
static void cxl_cper_prot_err_work_fn(struct work_struct *work)
{
diff --git a/include/cxl/event.h b/include/cxl/event.h
index 94081aec597a..ff97fea718d2 100644
--- a/include/cxl/event.h
+++ b/include/cxl/event.h
@@ -340,4 +340,6 @@ cxl_cper_setup_prot_err_work_data(struct cxl_cper_prot_err_work_data *wd,
}
#endif
+void cxl_cper_handle_prot_err(struct cxl_cper_prot_err_work_data *wd);
+
#endif /* _LINUX_CXL_EVENT_H */
--
2.51.1
* Re: [PATCH 6/6 v7] ACPI: extlog: Trace CPER CXL Protocol Error Section
2025-11-04 18:22 ` [PATCH 6/6 v7] ACPI: extlog: Trace CPER CXL Protocol Error Section Fabio M. De Francesco
@ 2025-11-05 9:01 ` kernel test robot
2025-11-07 19:59 ` Dave Jiang
1 sibling, 0 replies; 18+ messages in thread
From: kernel test robot @ 2025-11-05 9:01 UTC
To: Fabio M. De Francesco; +Cc: oe-kbuild-all
Hi Fabio,
kernel test robot noticed the following build errors:
[auto build test ERROR on c9cfc122f03711a5124b4aafab3211cf4d35a2ac]
url: https://github.com/intel-lab-lkp/linux/commits/Fabio-M-De-Francesco/ACPI-extlog-Trace-CPER-Non-standard-Section-Body/20251105-022733
base: c9cfc122f03711a5124b4aafab3211cf4d35a2ac
patch link: https://lore.kernel.org/r/20251104182446.863422-7-fabio.m.de.francesco%40linux.intel.com
patch subject: [PATCH 6/6 v7] ACPI: extlog: Trace CPER CXL Protocol Error Section
config: x86_64-buildonly-randconfig-006-20251105 (https://download.01.org/0day-ci/archive/20251105/202511051610.Shh8WKBG-lkp@intel.com/config)
compiler: gcc-13 (Debian 13.3.0-16) 13.3.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251105/202511051610.Shh8WKBG-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202511051610.Shh8WKBG-lkp@intel.com/
All errors (new ones prefixed by >>):
In file included from drivers/cxl/core/pci.c:10:
drivers/cxl/cxlpci.h: In function 'cxl_pci_flit_256':
drivers/cxl/cxlpci.h:126:9: error: implicit declaration of function 'pcie_capability_read_word' [-Werror=implicit-function-declaration]
126 | pcie_capability_read_word(pdev, PCI_EXP_LNKSTA2, &lnksta2);
| ^~~~~~~~~~~~~~~~~~~~~~~~~
drivers/cxl/core/pci.c: In function 'devm_cxl_port_enumerate_dports':
drivers/cxl/core/pci.c:143:9: error: implicit declaration of function 'pci_walk_bus' [-Werror=implicit-function-declaration]
143 | pci_walk_bus(bus, match_add_dports, &ctx);
| ^~~~~~~~~~~~
drivers/cxl/core/pci.c: In function 'cxl_pci_get_latency':
drivers/cxl/core/pci.c:1059:14: error: implicit declaration of function 'pcie_link_speed_mbps'; did you mean 'pcie_get_speed_cap'? [-Werror=implicit-function-declaration]
1059 | bw = pcie_link_speed_mbps(pdev);
| ^~~~~~~~~~~~~~~~~~~~
| pcie_get_speed_cap
drivers/cxl/core/pci.c: In function 'cxl_gpf_get_dvsec':
>> drivers/cxl/core/pci.c:1149:17: error: implicit declaration of function 'pci_find_dvsec_capability'; did you mean 'pci_find_ext_capability'? [-Werror=implicit-function-declaration]
1149 | dvsec = pci_find_dvsec_capability(pdev, PCI_VENDOR_ID_CXL,
| ^~~~~~~~~~~~~~~~~~~~~~~~~
| pci_find_ext_capability
cc1: some warnings being treated as errors
Kconfig warnings: (for reference only)
WARNING: unmet direct dependencies detected for PCIEAER
Depends on [n]: PCI [=n] && PCIEPORTBUS [=n]
Selected by [m]:
- ACPI_EXTLOG [=m] && ACPI [=y] && X86_MCE [=y] && X86_LOCAL_APIC [=y]
WARNING: unmet direct dependencies detected for CXL_BUS
Depends on [n]: PCI [=n]
Selected by [m]:
- ACPI_EXTLOG [=m] && ACPI [=y] && X86_MCE [=y] && X86_LOCAL_APIC [=y]
vim +1149 drivers/cxl/core/pci.c
a52b6a2c1c997b Davidlohr Bueso 2025-01-24 1135
36aace15d9bdcf Li Ming 2025-03-23 1136 u16 cxl_gpf_get_dvsec(struct device *dev)
021b7e42fa7bc2 Davidlohr Bueso 2025-02-20 1137 {
36aace15d9bdcf Li Ming 2025-03-23 1138 struct pci_dev *pdev;
36aace15d9bdcf Li Ming 2025-03-23 1139 bool is_port = true;
021b7e42fa7bc2 Davidlohr Bueso 2025-02-20 1140 u16 dvsec;
021b7e42fa7bc2 Davidlohr Bueso 2025-02-20 1141
021b7e42fa7bc2 Davidlohr Bueso 2025-02-20 1142 if (!dev_is_pci(dev))
021b7e42fa7bc2 Davidlohr Bueso 2025-02-20 1143 return 0;
021b7e42fa7bc2 Davidlohr Bueso 2025-02-20 1144
36aace15d9bdcf Li Ming 2025-03-23 1145 pdev = to_pci_dev(dev);
36aace15d9bdcf Li Ming 2025-03-23 1146 if (pci_pcie_type(pdev) == PCI_EXP_TYPE_ENDPOINT)
36aace15d9bdcf Li Ming 2025-03-23 1147 is_port = false;
36aace15d9bdcf Li Ming 2025-03-23 1148
36aace15d9bdcf Li Ming 2025-03-23 @1149 dvsec = pci_find_dvsec_capability(pdev, PCI_VENDOR_ID_CXL,
021b7e42fa7bc2 Davidlohr Bueso 2025-02-20 1150 is_port ? CXL_DVSEC_PORT_GPF : CXL_DVSEC_DEVICE_GPF);
021b7e42fa7bc2 Davidlohr Bueso 2025-02-20 1151 if (!dvsec)
021b7e42fa7bc2 Davidlohr Bueso 2025-02-20 1152 dev_warn(dev, "%s GPF DVSEC not present\n",
021b7e42fa7bc2 Davidlohr Bueso 2025-02-20 1153 is_port ? "Port" : "Device");
021b7e42fa7bc2 Davidlohr Bueso 2025-02-20 1154 return dvsec;
021b7e42fa7bc2 Davidlohr Bueso 2025-02-20 1155 }
021b7e42fa7bc2 Davidlohr Bueso 2025-02-20 1156 EXPORT_SYMBOL_NS_GPL(cxl_gpf_get_dvsec, "CXL");
021b7e42fa7bc2 Davidlohr Bueso 2025-02-20 1157
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 2/6 v7] ACPI: extlog: Trace CPER PCI Express Error Section
2025-11-04 18:22 ` [PATCH 2/6 v7] ACPI: extlog: Trace CPER PCI Express Error Section Fabio M. De Francesco
@ 2025-11-05 9:23 ` kernel test robot
0 siblings, 0 replies; 18+ messages in thread
From: kernel test robot @ 2025-11-05 9:23 UTC (permalink / raw)
To: Fabio M. De Francesco; +Cc: oe-kbuild-all
Hi Fabio,
kernel test robot noticed the following build errors:
[auto build test ERROR on c9cfc122f03711a5124b4aafab3211cf4d35a2ac]
url: https://github.com/intel-lab-lkp/linux/commits/Fabio-M-De-Francesco/ACPI-extlog-Trace-CPER-Non-standard-Section-Body/20251105-022733
base: c9cfc122f03711a5124b4aafab3211cf4d35a2ac
patch link: https://lore.kernel.org/r/20251104182446.863422-3-fabio.m.de.francesco%40linux.intel.com
patch subject: [PATCH 2/6 v7] ACPI: extlog: Trace CPER PCI Express Error Section
config: i386-buildonly-randconfig-004-20251105 (https://download.01.org/0day-ci/archive/20251105/202511051721.LfgJrwLO-lkp@intel.com/config)
compiler: gcc-14 (Debian 14.2.0-19) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251105/202511051721.LfgJrwLO-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202511051721.LfgJrwLO-lkp@intel.com/
All error/warnings (new ones prefixed by >>):
>> drivers/pci/pci-driver.c:1591:6: warning: no previous prototype for 'pci_uevent_ers' [-Wmissing-prototypes]
1591 | void pci_uevent_ers(struct pci_dev *pdev, enum pci_ers_result err_type)
| ^~~~~~~~~~~~~~
--
drivers/pci/pcie/aer.c: In function 'aer_root_reset':
>> drivers/pci/pcie/aer.c:1754:27: error: 'struct pci_dev' has no member named 'rcec'
1754 | root = dev->rcec;
| ^~
--
drivers/pci/pcie/err.c: In function 'report_error_detected':
>> drivers/pci/pcie/err.c:83:9: error: implicit declaration of function 'pci_uevent_ers'; did you mean 'pci_select_bars'? [-Wimplicit-function-declaration]
83 | pci_uevent_ers(dev, vote);
| ^~~~~~~~~~~~~~
| pci_select_bars
Kconfig warnings: (for reference only)
WARNING: unmet direct dependencies detected for PCIEAER
Depends on [n]: PCI [=y] && PCIEPORTBUS [=n]
Selected by [y]:
- ACPI_EXTLOG [=y] && ACPI [=y] && X86_MCE [=y] && X86_LOCAL_APIC [=y]
WARNING: unmet direct dependencies detected for OF_GPIO
Depends on [n]: GPIOLIB [=y] && OF [=n] && HAS_IOMEM [=y]
Selected by [y]:
- GPIO_TB10X [=y] && GPIOLIB [=y] && HAS_IOMEM [=y] && (ARC_PLAT_TB10X || COMPILE_TEST [=y])
WARNING: unmet direct dependencies detected for MFD_STMFX
Depends on [n]: HAS_IOMEM [=y] && I2C [=y] && OF [=n]
Selected by [y]:
- PINCTRL_STMFX [=y] && PINCTRL [=y] && I2C [=y] && OF_GPIO [=y] && HAS_IOMEM [=y]
WARNING: unmet direct dependencies detected for GPIO_SYSCON
Depends on [n]: GPIOLIB [=y] && HAS_IOMEM [=y] && MFD_SYSCON [=y] && OF [=n]
Selected by [y]:
- GPIO_SAMA5D2_PIOBU [=y] && GPIOLIB [=y] && HAS_IOMEM [=y] && MFD_SYSCON [=y] && OF_GPIO [=y] && (ARCH_AT91 || COMPILE_TEST [=y])
WARNING: unmet direct dependencies detected for I2C_K1
Depends on [n]: I2C [=y] && HAS_IOMEM [=y] && (ARCH_SPACEMIT || COMPILE_TEST [=y]) && OF [=n]
Selected by [y]:
- MFD_SPACEMIT_P1 [=y] && HAS_IOMEM [=y] && (ARCH_SPACEMIT || COMPILE_TEST [=y]) && I2C [=y]
vim +1754 drivers/pci/pcie/aer.c
5afc2f763edc5d drivers/pci/pcie/aer.c Kai-Heng Feng 2024-04-16 1732
6c2b374d74857e drivers/pci/pcie/aer/aerdrv.c Zhang, Yanmin 2006-07-31 1733 /**
5790862255028c drivers/pci/pcie/aer.c Qiuxu Zhuo 2020-11-20 1734 * aer_root_reset - reset Root Port hierarchy, RCEC, or RCiEP
5790862255028c drivers/pci/pcie/aer.c Qiuxu Zhuo 2020-11-20 1735 * @dev: pointer to Root Port, RCEC, or RCiEP
6c2b374d74857e drivers/pci/pcie/aer/aerdrv.c Zhang, Yanmin 2006-07-31 1736 *
a175102b0a82fc drivers/pci/pcie/aer.c Sean V Kelley 2020-12-02 1737 * Invoked by Port Bus driver when performing reset.
f6d37800612830 drivers/pci/pcie/aer/aerdrv.c Hidetoshi Seto 2010-04-15 1738 */
6c2b374d74857e drivers/pci/pcie/aer/aerdrv.c Zhang, Yanmin 2006-07-31 1739 static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
6c2b374d74857e drivers/pci/pcie/aer/aerdrv.c Zhang, Yanmin 2006-07-31 1740 {
a175102b0a82fc drivers/pci/pcie/aer.c Sean V Kelley 2020-12-02 1741 int type = pci_pcie_type(dev);
a175102b0a82fc drivers/pci/pcie/aer.c Sean V Kelley 2020-12-02 1742 struct pci_dev *root;
a175102b0a82fc drivers/pci/pcie/aer.c Sean V Kelley 2020-12-02 1743 int aer;
a175102b0a82fc drivers/pci/pcie/aer.c Sean V Kelley 2020-12-02 1744 struct pci_host_bridge *host = pci_find_host_bridge(dev->bus);
c6d34eddecb34f drivers/pci/pcie/aer/aerdrv.c Hidetoshi Seto 2010-04-15 1745 u32 reg32;
1842623850d09b drivers/pci/pcie/aer.c Sinan Kaya 2018-07-19 1746 int rc;
6c2b374d74857e drivers/pci/pcie/aer/aerdrv.c Zhang, Yanmin 2006-07-31 1747
5790862255028c drivers/pci/pcie/aer.c Qiuxu Zhuo 2020-11-20 1748 /*
5790862255028c drivers/pci/pcie/aer.c Qiuxu Zhuo 2020-11-20 1749 * Only Root Ports and RCECs have AER Root Command and Root Status
5790862255028c drivers/pci/pcie/aer.c Qiuxu Zhuo 2020-11-20 1750 * registers. If "dev" is an RCiEP, the relevant registers are in
5790862255028c drivers/pci/pcie/aer.c Qiuxu Zhuo 2020-11-20 1751 * the RCEC.
5790862255028c drivers/pci/pcie/aer.c Qiuxu Zhuo 2020-11-20 1752 */
5790862255028c drivers/pci/pcie/aer.c Qiuxu Zhuo 2020-11-20 1753 if (type == PCI_EXP_TYPE_RC_END)
5790862255028c drivers/pci/pcie/aer.c Qiuxu Zhuo 2020-11-20 @1754 root = dev->rcec;
5790862255028c drivers/pci/pcie/aer.c Qiuxu Zhuo 2020-11-20 1755 else
7a8a22be35a505 drivers/pci/pcie/aer.c Keith Busch 2021-01-04 1756 root = pcie_find_root_port(dev);
5790862255028c drivers/pci/pcie/aer.c Qiuxu Zhuo 2020-11-20 1757
5790862255028c drivers/pci/pcie/aer.c Qiuxu Zhuo 2020-11-20 1758 /*
5790862255028c drivers/pci/pcie/aer.c Qiuxu Zhuo 2020-11-20 1759 * If the platform retained control of AER, an RCiEP may not have
5790862255028c drivers/pci/pcie/aer.c Qiuxu Zhuo 2020-11-20 1760 * an RCEC visible to us, so dev->rcec ("root") may be NULL. In
5790862255028c drivers/pci/pcie/aer.c Qiuxu Zhuo 2020-11-20 1761 * that case, firmware is responsible for these registers.
5790862255028c drivers/pci/pcie/aer.c Qiuxu Zhuo 2020-11-20 1762 */
5790862255028c drivers/pci/pcie/aer.c Qiuxu Zhuo 2020-11-20 1763 aer = root ? root->aer_cap : 0;
a175102b0a82fc drivers/pci/pcie/aer.c Sean V Kelley 2020-12-02 1764
13cf36c648df0c drivers/pci/pcie/aer.c Kai-Heng Feng 2023-05-12 1765 if ((host->native_aer || pcie_ports_native) && aer)
13cf36c648df0c drivers/pci/pcie/aer.c Kai-Heng Feng 2023-05-12 1766 aer_disable_irq(root);
6c2b374d74857e drivers/pci/pcie/aer/aerdrv.c Zhang, Yanmin 2006-07-31 1767
5790862255028c drivers/pci/pcie/aer.c Qiuxu Zhuo 2020-11-20 1768 if (type == PCI_EXP_TYPE_RC_EC || type == PCI_EXP_TYPE_RC_END) {
9bdc81ce440ec6 drivers/pci/pcie/aer.c Amey Narkhede 2021-08-17 1769 rc = pcie_reset_flr(dev, PCI_RESET_DO_RESET);
56f107d7813f11 drivers/pci/pcie/aer.c Amey Narkhede 2021-08-17 1770 if (!rc)
56f107d7813f11 drivers/pci/pcie/aer.c Amey Narkhede 2021-08-17 1771 pci_info(dev, "has been reset\n");
56f107d7813f11 drivers/pci/pcie/aer.c Amey Narkhede 2021-08-17 1772 else
56f107d7813f11 drivers/pci/pcie/aer.c Amey Narkhede 2021-08-17 1773 pci_info(dev, "not reset (no FLR support: %d)\n", rc);
a175102b0a82fc drivers/pci/pcie/aer.c Sean V Kelley 2020-12-02 1774 } else {
c4eed62a214330 drivers/pci/pcie/aer.c Keith Busch 2018-09-20 1775 rc = pci_bus_error_reset(dev);
33ac78bd3b509d drivers/pci/pcie/aer.c Keith Busch 2021-01-04 1776 pci_info(dev, "%s Port link has been reset (%d)\n",
33ac78bd3b509d drivers/pci/pcie/aer.c Keith Busch 2021-01-04 1777 pci_is_root_bus(dev->bus) ? "Root" : "Downstream", rc);
a175102b0a82fc drivers/pci/pcie/aer.c Sean V Kelley 2020-12-02 1778 }
6c2b374d74857e drivers/pci/pcie/aer/aerdrv.c Zhang, Yanmin 2006-07-31 1779
a175102b0a82fc drivers/pci/pcie/aer.c Sean V Kelley 2020-12-02 1780 if ((host->native_aer || pcie_ports_native) && aer) {
c6d34eddecb34f drivers/pci/pcie/aer/aerdrv.c Hidetoshi Seto 2010-04-15 1781 /* Clear Root Error Status */
a175102b0a82fc drivers/pci/pcie/aer.c Sean V Kelley 2020-12-02 1782 pci_read_config_dword(root, aer + PCI_ERR_ROOT_STATUS, &reg32);
a175102b0a82fc drivers/pci/pcie/aer.c Sean V Kelley 2020-12-02 1783 pci_write_config_dword(root, aer + PCI_ERR_ROOT_STATUS, reg32);
c6d34eddecb34f drivers/pci/pcie/aer/aerdrv.c Hidetoshi Seto 2010-04-15 1784
13cf36c648df0c drivers/pci/pcie/aer.c Kai-Heng Feng 2023-05-12 1785 aer_enable_irq(root);
50cc18fcd3053f drivers/pci/pcie/aer.c Sean V Kelley 2020-11-20 1786 }
6c2b374d74857e drivers/pci/pcie/aer/aerdrv.c Zhang, Yanmin 2006-07-31 1787
1842623850d09b drivers/pci/pcie/aer.c Sinan Kaya 2018-07-19 1788 return rc ? PCI_ERS_RESULT_DISCONNECT : PCI_ERS_RESULT_RECOVERED;
6c2b374d74857e drivers/pci/pcie/aer/aerdrv.c Zhang, Yanmin 2006-07-31 1789 }
6c2b374d74857e drivers/pci/pcie/aer/aerdrv.c Zhang, Yanmin 2006-07-31 1790
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 3/6 v7] acpi/ghes: Make GHES select ACPI_APEI_PCIEAER
2025-11-04 18:22 ` [PATCH 3/6 v7] acpi/ghes: Make GHES select ACPI_APEI_PCIEAER Fabio M. De Francesco
@ 2025-11-05 10:05 ` kernel test robot
2025-11-11 15:42 ` Jonathan Cameron
1 sibling, 0 replies; 18+ messages in thread
From: kernel test robot @ 2025-11-05 10:05 UTC (permalink / raw)
To: Fabio M. De Francesco; +Cc: llvm, oe-kbuild-all
Hi Fabio,
kernel test robot noticed the following build errors:
[auto build test ERROR on c9cfc122f03711a5124b4aafab3211cf4d35a2ac]
url: https://github.com/intel-lab-lkp/linux/commits/Fabio-M-De-Francesco/ACPI-extlog-Trace-CPER-Non-standard-Section-Body/20251105-022733
base: c9cfc122f03711a5124b4aafab3211cf4d35a2ac
patch link: https://lore.kernel.org/r/20251104182446.863422-4-fabio.m.de.francesco%40linux.intel.com
patch subject: [PATCH 3/6 v7] acpi/ghes: Make GHES select ACPI_APEI_PCIEAER
config: x86_64-randconfig-074-20251105 (https://download.01.org/0day-ci/archive/20251105/202511051740.HlE2bTd2-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251105/202511051740.HlE2bTd2-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202511051740.HlE2bTd2-lkp@intel.com/
All error/warnings (new ones prefixed by >>):
>> drivers/pci/pci-driver.c:1591:6: warning: no previous prototype for function 'pci_uevent_ers' [-Wmissing-prototypes]
1591 | void pci_uevent_ers(struct pci_dev *pdev, enum pci_ers_result err_type)
| ^
drivers/pci/pci-driver.c:1591:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
1591 | void pci_uevent_ers(struct pci_dev *pdev, enum pci_ers_result err_type)
| ^
| static
1 warning generated.
--
>> drivers/pci/pcie/err.c:83:2: error: call to undeclared function 'pci_uevent_ers'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
83 | pci_uevent_ers(dev, vote);
| ^
drivers/pci/pcie/err.c:124:2: error: call to undeclared function 'pci_uevent_ers'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
124 | pci_uevent_ers(dev, PCI_ERS_RESULT_DISCONNECT);
| ^
drivers/pci/pcie/err.c:182:2: error: call to undeclared function 'pci_uevent_ers'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
182 | pci_uevent_ers(dev, PCI_ERS_RESULT_RECOVERED);
| ^
3 errors generated.
--
>> drivers/pci/pcie/aer.c:1754:15: error: no member named 'rcec' in 'struct pci_dev'
1754 | root = dev->rcec;
| ~~~ ^
1 error generated.
Kconfig warnings: (for reference only)
WARNING: unmet direct dependencies detected for PCIEAER
Depends on [n]: PCI [=y] && PCIEPORTBUS [=n]
Selected by [y]:
- ACPI_APEI_GHES [=y] && ACPI [=y] && ACPI_APEI [=y]
vim +/pci_uevent_ers +83 drivers/pci/pcie/err.c
2e28bc84cf6eecd Oza Pawandeep 2018-05-17 48
542aeb9c8f930e4 Keith Busch 2018-09-20 49 static int report_error_detected(struct pci_dev *dev,
16d79cd4e23b196 Luc Van Oostenryck 2020-07-02 50 pci_channel_state_t state,
542aeb9c8f930e4 Keith Busch 2018-09-20 51 enum pci_ers_result *result)
2e28bc84cf6eecd Oza Pawandeep 2018-05-17 52 {
171d149ce8d11f7 Bjorn Helgaas 2021-10-12 53 struct pci_driver *pdrv;
2e28bc84cf6eecd Oza Pawandeep 2018-05-17 54 pci_ers_result_t vote;
2e28bc84cf6eecd Oza Pawandeep 2018-05-17 55 const struct pci_error_handlers *err_handler;
2e28bc84cf6eecd Oza Pawandeep 2018-05-17 56
2e28bc84cf6eecd Oza Pawandeep 2018-05-17 57 device_lock(&dev->dev);
e0217c5ba10d7bf Bjorn Helgaas 2021-11-10 58 pdrv = dev->driver;
5e69a33c5cec019 Christoph Hellwig 2022-06-01 59 if (pci_dev_is_disconnected(dev)) {
5e69a33c5cec019 Christoph Hellwig 2022-06-01 60 vote = PCI_ERS_RESULT_DISCONNECT;
5e69a33c5cec019 Christoph Hellwig 2022-06-01 61 } else if (!pci_dev_set_io_state(dev, state)) {
5e69a33c5cec019 Christoph Hellwig 2022-06-01 62 pci_info(dev, "can't recover (state transition %u -> %u invalid)\n",
5e69a33c5cec019 Christoph Hellwig 2022-06-01 63 dev->error_state, state);
5e69a33c5cec019 Christoph Hellwig 2022-06-01 64 vote = PCI_ERS_RESULT_NONE;
5e69a33c5cec019 Christoph Hellwig 2022-06-01 65 } else if (!pdrv || !pdrv->err_handler ||
171d149ce8d11f7 Bjorn Helgaas 2021-10-12 66 !pdrv->err_handler->error_detected) {
2e28bc84cf6eecd Oza Pawandeep 2018-05-17 67 /*
bfcb79fca19d267 Keith Busch 2018-09-20 68 * If any device in the subtree does not have an error_detected
bfcb79fca19d267 Keith Busch 2018-09-20 69 * callback, PCI_ERS_RESULT_NO_AER_DRIVER prevents subsequent
bfcb79fca19d267 Keith Busch 2018-09-20 70 * error callbacks of "any" device in the subtree, and will
bfcb79fca19d267 Keith Busch 2018-09-20 71 * exit in the disconnected error state.
2e28bc84cf6eecd Oza Pawandeep 2018-05-17 72 */
01daacfb9035e5b Yicong Yang 2019-12-13 73 if (dev->hdr_type != PCI_HEADER_TYPE_BRIDGE) {
2e28bc84cf6eecd Oza Pawandeep 2018-05-17 74 vote = PCI_ERS_RESULT_NO_AER_DRIVER;
8d077c3ce0109c4 Bjorn Helgaas 2019-12-13 75 pci_info(dev, "can't recover (no error_detected callback)\n");
01daacfb9035e5b Yicong Yang 2019-12-13 76 } else {
2e28bc84cf6eecd Oza Pawandeep 2018-05-17 77 vote = PCI_ERS_RESULT_NONE;
01daacfb9035e5b Yicong Yang 2019-12-13 78 }
2e28bc84cf6eecd Oza Pawandeep 2018-05-17 79 } else {
171d149ce8d11f7 Bjorn Helgaas 2021-10-12 80 err_handler = pdrv->err_handler;
542aeb9c8f930e4 Keith Busch 2018-09-20 81 vote = err_handler->error_detected(dev, state);
2e28bc84cf6eecd Oza Pawandeep 2018-05-17 82 }
7b42d97e99d3a2b Keith Busch 2018-09-20 @83 pci_uevent_ers(dev, vote);
542aeb9c8f930e4 Keith Busch 2018-09-20 84 *result = merge_result(*result, vote);
2e28bc84cf6eecd Oza Pawandeep 2018-05-17 85 device_unlock(&dev->dev);
2e28bc84cf6eecd Oza Pawandeep 2018-05-17 86 return 0;
2e28bc84cf6eecd Oza Pawandeep 2018-05-17 87 }
2e28bc84cf6eecd Oza Pawandeep 2018-05-17 88
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 4/6 v7] acpi/ghes: Add helper for CXL protocol errors checks
2025-11-04 18:22 ` [PATCH 4/6 v7] acpi/ghes: Add helper for CXL protocol errors checks Fabio M. De Francesco
@ 2025-11-07 18:30 ` Dave Jiang
2025-11-11 15:43 ` Jonathan Cameron
2025-11-21 2:16 ` Hanjun Guo
2 siblings, 0 replies; 18+ messages in thread
From: Dave Jiang @ 2025-11-07 18:30 UTC (permalink / raw)
To: Fabio M. De Francesco, linux-cxl
Cc: Rafael J . Wysocki, Len Brown, Tony Luck, Borislav Petkov,
Hanjun Guo, Mauro Carvalho Chehab, Shuai Xue, Davidlohr Bueso,
Jonathan Cameron, Alison Schofield, Vishal Verma, Ira Weiny,
Dan Williams, Mahesh J Salgaonkar, Oliver O'Halloran,
Bjorn Helgaas, linux-kernel, linux-acpi, linuxppc-dev, linux-pci
On 11/4/25 11:22 AM, Fabio M. De Francesco wrote:
> Move the CPER CXL protocol errors validity check out of
> cxl_cper_post_prot_err() to new cxl_cper_sec_prot_err_valid() and limit
> the serial number check only to CXL agents that are CXL devices (UEFI
> v2.10, Appendix N.2.13).
>
> Export the new symbol for reuse by ELOG.
>
> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> ---
> drivers/acpi/apei/ghes.c | 32 ++++++++++++++++++++++----------
> include/cxl/event.h | 10 ++++++++++
> 2 files changed, 32 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index d6fe5f020e96..e69ae864f43d 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -706,30 +706,42 @@ static DEFINE_KFIFO(cxl_cper_prot_err_fifo, struct cxl_cper_prot_err_work_data,
> static DEFINE_SPINLOCK(cxl_cper_prot_err_work_lock);
> struct work_struct *cxl_cper_prot_err_work;
>
> -static void cxl_cper_post_prot_err(struct cxl_cper_sec_prot_err *prot_err,
> - int severity)
> +int cxl_cper_sec_prot_err_valid(struct cxl_cper_sec_prot_err *prot_err)
> {
> - struct cxl_cper_prot_err_work_data wd;
> - u8 *dvsec_start, *cap_start;
> -
> if (!(prot_err->valid_bits & PROT_ERR_VALID_AGENT_ADDRESS)) {
> pr_err_ratelimited("CXL CPER invalid agent type\n");
> - return;
> + return -EINVAL;
> }
>
> if (!(prot_err->valid_bits & PROT_ERR_VALID_ERROR_LOG)) {
> pr_err_ratelimited("CXL CPER invalid protocol error log\n");
> - return;
> + return -EINVAL;
> }
>
> if (prot_err->err_len != sizeof(struct cxl_ras_capability_regs)) {
> pr_err_ratelimited("CXL CPER invalid RAS Cap size (%u)\n",
> prot_err->err_len);
> - return;
> + return -EINVAL;
> }
>
> - if (!(prot_err->valid_bits & PROT_ERR_VALID_SERIAL_NUMBER))
> - pr_warn(FW_WARN "CXL CPER no device serial number\n");
> + if ((prot_err->agent_type == RCD || prot_err->agent_type == DEVICE ||
> + prot_err->agent_type == LD || prot_err->agent_type == FMLD) &&
> + !(prot_err->valid_bits & PROT_ERR_VALID_SERIAL_NUMBER))
> + pr_warn_ratelimited(FW_WARN
> + "CXL CPER no device serial number\n");
> +
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(cxl_cper_sec_prot_err_valid);
> +
> +static void cxl_cper_post_prot_err(struct cxl_cper_sec_prot_err *prot_err,
> + int severity)
> +{
> + struct cxl_cper_prot_err_work_data wd;
> + u8 *dvsec_start, *cap_start;
> +
> + if (cxl_cper_sec_prot_err_valid(prot_err))
> + return;
>
> guard(spinlock_irqsave)(&cxl_cper_prot_err_work_lock);
>
> diff --git a/include/cxl/event.h b/include/cxl/event.h
> index 6fd90f9cc203..4d7d1036ea9c 100644
> --- a/include/cxl/event.h
> +++ b/include/cxl/event.h
> @@ -320,4 +320,14 @@ static inline int cxl_cper_prot_err_kfifo_get(struct cxl_cper_prot_err_work_data
> }
> #endif
>
> +#ifdef CONFIG_ACPI_APEI_PCIEAER
> +int cxl_cper_sec_prot_err_valid(struct cxl_cper_sec_prot_err *prot_err);
> +#else
> +static inline int
> +cxl_cper_sec_prot_err_valid(struct cxl_cper_sec_prot_err *prot_err)
> +{
> + return -EOPNOTSUPP;
> +}
> +#endif
> +
> #endif /* _LINUX_CXL_EVENT_H */
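
For readers following the thread, the validity helper quoted above can be modelled outside the kernel roughly as follows. This is only a minimal userspace sketch: every `sketch_`/`SKETCH_` name, the bit positions, and the RAS capability size are hypothetical stand-ins, not the kernel's definitions. It shows the two points of the patch: the helper returns 0 or -EINVAL instead of returning void, and the missing-serial-number warning is raised only for agents that are CXL devices.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical agent types; only the names mirror the quoted patch. */
enum sketch_agent_type {
	SKETCH_RCD, SKETCH_RCH_DP, SKETCH_DEVICE, SKETCH_LD,
	SKETCH_FMLD, SKETCH_RP, SKETCH_DSP, SKETCH_USP,
};

/* Hypothetical validity bits and RAS capability size (illustrative values). */
#define SKETCH_VALID_AGENT_ADDRESS (1u << 1)
#define SKETCH_VALID_SERIAL_NUMBER (1u << 3)
#define SKETCH_VALID_ERROR_LOG     (1u << 6)
#define SKETCH_RAS_CAP_SIZE        88u

struct sketch_prot_err {
	uint64_t valid_bits;
	uint8_t agent_type;
	uint16_t err_len;
};

/* Only these agent types are CXL devices that carry a serial number. */
static int sketch_agent_is_device(uint8_t type)
{
	return type == SKETCH_RCD || type == SKETCH_DEVICE ||
	       type == SKETCH_LD || type == SKETCH_FMLD;
}

/*
 * Returns 0 if the record is usable, -EINVAL otherwise.
 * *warned is set to 1 when a device agent lacks a serial number; that
 * condition is reported but does not invalidate the record.
 */
static int sketch_prot_err_valid(const struct sketch_prot_err *e, int *warned)
{
	assert(e != NULL && warned != NULL);
	*warned = 0;

	if (!(e->valid_bits & SKETCH_VALID_AGENT_ADDRESS))
		return -EINVAL;
	if (!(e->valid_bits & SKETCH_VALID_ERROR_LOG))
		return -EINVAL;
	if (e->err_len != SKETCH_RAS_CAP_SIZE)
		return -EINVAL;

	if (sketch_agent_is_device(e->agent_type) &&
	    !(e->valid_bits & SKETCH_VALID_SERIAL_NUMBER))
		*warned = 1;

	return 0;
}
```

A caller would bail out on a non-zero return, as cxl_cper_post_prot_err() does in the quoted diff; the `warned` out-parameter here merely replaces pr_warn_ratelimited() so the behaviour can be observed in userspace.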
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 5/6 v7] acpi/ghes: Add helper to copy CXL protocol error info to work struct
2025-11-04 18:22 ` [PATCH 5/6 v7] acpi/ghes: Add helper to copy CXL protocol error info to work struct Fabio M. De Francesco
@ 2025-11-07 18:31 ` Dave Jiang
2025-11-21 2:22 ` Hanjun Guo
1 sibling, 0 replies; 18+ messages in thread
From: Dave Jiang @ 2025-11-07 18:31 UTC (permalink / raw)
To: Fabio M. De Francesco, linux-cxl
Cc: Rafael J . Wysocki, Len Brown, Tony Luck, Borislav Petkov,
Hanjun Guo, Mauro Carvalho Chehab, Shuai Xue, Davidlohr Bueso,
Jonathan Cameron, Alison Schofield, Vishal Verma, Ira Weiny,
Dan Williams, Mahesh J Salgaonkar, Oliver O'Halloran,
Bjorn Helgaas, linux-kernel, linux-acpi, linuxppc-dev, linux-pci
On 11/4/25 11:22 AM, Fabio M. De Francesco wrote:
> Make a helper out of cxl_cper_post_prot_err() that checks the CXL agent
> type and copy the CPER CXL protocol errors information to a work data
> structure.
>
> Export the new symbol for reuse by ELOG.
>
> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> ---
> drivers/acpi/apei/ghes.c | 42 ++++++++++++++++++++++++++--------------
> include/cxl/event.h | 10 ++++++++++
> 2 files changed, 37 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index e69ae864f43d..2f4632d9855a 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -734,20 +734,12 @@ int cxl_cper_sec_prot_err_valid(struct cxl_cper_sec_prot_err *prot_err)
> }
> EXPORT_SYMBOL_GPL(cxl_cper_sec_prot_err_valid);
>
> -static void cxl_cper_post_prot_err(struct cxl_cper_sec_prot_err *prot_err,
> - int severity)
> +int cxl_cper_setup_prot_err_work_data(struct cxl_cper_prot_err_work_data *wd,
> + struct cxl_cper_sec_prot_err *prot_err,
> + int severity)
> {
> - struct cxl_cper_prot_err_work_data wd;
> u8 *dvsec_start, *cap_start;
>
> - if (cxl_cper_sec_prot_err_valid(prot_err))
> - return;
> -
> - guard(spinlock_irqsave)(&cxl_cper_prot_err_work_lock);
> -
> - if (!cxl_cper_prot_err_work)
> - return;
> -
> switch (prot_err->agent_type) {
> case RCD:
> case DEVICE:
> @@ -756,20 +748,40 @@ static void cxl_cper_post_prot_err(struct cxl_cper_sec_prot_err *prot_err,
> case RP:
> case DSP:
> case USP:
> - memcpy(&wd.prot_err, prot_err, sizeof(wd.prot_err));
> + memcpy(&wd->prot_err, prot_err, sizeof(wd->prot_err));
>
> dvsec_start = (u8 *)(prot_err + 1);
> cap_start = dvsec_start + prot_err->dvsec_len;
>
> - memcpy(&wd.ras_cap, cap_start, sizeof(wd.ras_cap));
> - wd.severity = cper_severity_to_aer(severity);
> + memcpy(&wd->ras_cap, cap_start, sizeof(wd->ras_cap));
> + wd->severity = cper_severity_to_aer(severity);
> break;
> default:
> pr_err_ratelimited("CXL CPER invalid agent type: %d\n",
> prot_err->agent_type);
> - return;
> + return -EINVAL;
> }
>
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(cxl_cper_setup_prot_err_work_data);
> +
> +static void cxl_cper_post_prot_err(struct cxl_cper_sec_prot_err *prot_err,
> + int severity)
> +{
> + struct cxl_cper_prot_err_work_data wd;
> +
> + if (cxl_cper_sec_prot_err_valid(prot_err))
> + return;
> +
> + guard(spinlock_irqsave)(&cxl_cper_prot_err_work_lock);
> +
> + if (!cxl_cper_prot_err_work)
> + return;
> +
> + if (cxl_cper_setup_prot_err_work_data(&wd, prot_err, severity))
> + return;
> +
> if (!kfifo_put(&cxl_cper_prot_err_fifo, wd)) {
> pr_err_ratelimited("CXL CPER kfifo overflow\n");
> return;
> diff --git a/include/cxl/event.h b/include/cxl/event.h
> index 4d7d1036ea9c..94081aec597a 100644
> --- a/include/cxl/event.h
> +++ b/include/cxl/event.h
> @@ -322,12 +322,22 @@ static inline int cxl_cper_prot_err_kfifo_get(struct cxl_cper_prot_err_work_data
>
> #ifdef CONFIG_ACPI_APEI_PCIEAER
> int cxl_cper_sec_prot_err_valid(struct cxl_cper_sec_prot_err *prot_err);
> +int cxl_cper_setup_prot_err_work_data(struct cxl_cper_prot_err_work_data *wd,
> + struct cxl_cper_sec_prot_err *prot_err,
> + int severity);
> #else
> static inline int
> cxl_cper_sec_prot_err_valid(struct cxl_cper_sec_prot_err *prot_err)
> {
> return -EOPNOTSUPP;
> }
> +static inline int
> +cxl_cper_setup_prot_err_work_data(struct cxl_cper_prot_err_work_data *wd,
> + struct cxl_cper_sec_prot_err *prot_err,
> + int severity)
> +{
> + return -EOPNOTSUPP;
> +}
> #endif
>
> #endif /* _LINUX_CXL_EVENT_H */
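
The pointer arithmetic in the helper quoted above (a fixed header, then `dvsec_len` bytes of DVSEC, then the RAS capability registers) can be illustrated with a small userspace sketch. All `sketch_` names and the struct layouts are hypothetical and much smaller than the real CPER section. The explicit length check is an assumption added for the sketch and is not part of the quoted patch, which relies on the earlier validity helper.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified section header. */
struct sketch_hdr {
	uint16_t dvsec_len;	/* bytes of DVSEC following the header */
	uint16_t err_len;	/* bytes of RAS capability after the DVSEC */
};

/* Hypothetical stand-in for the RAS capability registers. */
struct sketch_ras_cap {
	uint32_t regs[4];
};

struct sketch_work {
	struct sketch_hdr hdr;
	struct sketch_ras_cap ras_cap;
};

/*
 * Copy the header and the RAS capability out of a raw section buffer.
 * Layout assumed: [header][DVSEC: dvsec_len bytes][RAS capability].
 * Returns 0 on success, -EINVAL if the buffer is too short.
 */
static int sketch_setup_work(struct sketch_work *wd, const uint8_t *buf,
			     size_t len)
{
	struct sketch_hdr hdr;
	size_t cap_off;

	assert(wd != NULL && buf != NULL);

	if (len < sizeof(hdr))
		return -EINVAL;
	memcpy(&hdr, buf, sizeof(hdr));

	/* The capability starts right after the variable-length DVSEC. */
	cap_off = sizeof(hdr) + hdr.dvsec_len;
	if (len < cap_off || len - cap_off < sizeof(wd->ras_cap))
		return -EINVAL;

	wd->hdr = hdr;
	memcpy(&wd->ras_cap, buf + cap_off, sizeof(wd->ras_cap));
	return 0;
}
```

Returning an error code rather than void is what lets both callers, the kfifo producer in ghes.c and the direct call in ELOG, share one helper and decide for themselves what to do on failure.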
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 6/6 v7] ACPI: extlog: Trace CPER CXL Protocol Error Section
2025-11-04 18:22 ` [PATCH 6/6 v7] ACPI: extlog: Trace CPER CXL Protocol Error Section Fabio M. De Francesco
2025-11-05 9:01 ` kernel test robot
@ 2025-11-07 19:59 ` Dave Jiang
1 sibling, 0 replies; 18+ messages in thread
From: Dave Jiang @ 2025-11-07 19:59 UTC (permalink / raw)
To: Fabio M. De Francesco, linux-cxl
Cc: Rafael J . Wysocki, Len Brown, Tony Luck, Borislav Petkov,
Hanjun Guo, Mauro Carvalho Chehab, Shuai Xue, Davidlohr Bueso,
Jonathan Cameron, Alison Schofield, Vishal Verma, Ira Weiny,
Dan Williams, Mahesh J Salgaonkar, Oliver O'Halloran,
Bjorn Helgaas, linux-kernel, linux-acpi, linuxppc-dev, linux-pci,
Kuppuswamy Sathyanarayanan
On 11/4/25 11:22 AM, Fabio M. De Francesco wrote:
> When Firmware First is enabled, BIOS handles errors first and then it
> makes them available to the kernel via the Common Platform Error Record
> (CPER) sections (UEFI 2.11 Appendix N.2.13). Linux parses the CPER
> sections via one of two similar paths, either ELOG or GHES. The errors
> managed by ELOG are signaled to the BIOS by the I/O Machine Check
> Architecture (I/O MCA).
>
> Currently, ELOG and GHES show some inconsistencies in how they report to
> userspace via trace events.
>
> Therefore, make the two mentioned paths act similarly by tracing the CPER
> CXL Protocol Error Section.
>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> ---
> drivers/acpi/Kconfig | 1 +
> drivers/acpi/acpi_extlog.c | 22 ++++++++++++++++++++++
> drivers/cxl/core/ras.c | 3 ++-
> include/cxl/event.h | 2 ++
> 4 files changed, 27 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
> index be02634f2320..c2ad24e77ddf 100644
> --- a/drivers/acpi/Kconfig
> +++ b/drivers/acpi/Kconfig
> @@ -498,6 +498,7 @@ config ACPI_EXTLOG
> select ACPI_APEI
> select ACPI_APEI_PCIEAER
> select UEFI_CPER
> + select CXL_BUS
> help
> Certain usages such as Predictive Failure Analysis (PFA) require
> more information about the error than what can be described in
> diff --git a/drivers/acpi/acpi_extlog.c b/drivers/acpi/acpi_extlog.c
> index b3976ceb4ee4..e6fb25395984 100644
> --- a/drivers/acpi/acpi_extlog.c
> +++ b/drivers/acpi/acpi_extlog.c
> @@ -12,6 +12,7 @@
> #include <linux/ratelimit.h>
> #include <linux/edac.h>
> #include <linux/ras.h>
> +#include <cxl/event.h>
> #include <acpi/ghes.h>
> #include <asm/cpu.h>
> #include <asm/mce.h>
> @@ -160,6 +161,21 @@ static void extlog_print_pcie(struct cper_sec_pcie *pcie_err,
> pci_dev_put(pdev);
> }
>
> +static void
> +extlog_cxl_cper_handle_prot_err(struct cxl_cper_sec_prot_err *prot_err,
> + int severity)
> +{
> + struct cxl_cper_prot_err_work_data wd;
> +
> + if (cxl_cper_sec_prot_err_valid(prot_err))
> + return;
> +
> + if (cxl_cper_setup_prot_err_work_data(&wd, prot_err, severity))
> + return;
> +
> + cxl_cper_handle_prot_err(&wd);
> +}
> +
> static int extlog_print(struct notifier_block *nb, unsigned long val,
> void *data)
> {
> @@ -211,6 +227,12 @@ static int extlog_print(struct notifier_block *nb, unsigned long val,
> if (gdata->error_data_length >= sizeof(*mem))
> trace_extlog_mem_event(mem, err_seq, fru_id, fru_text,
> (u8)gdata->error_severity);
> + } else if (guid_equal(sec_type, &CPER_SEC_CXL_PROT_ERR)) {
> + struct cxl_cper_sec_prot_err *prot_err =
> + acpi_hest_get_payload(gdata);
> +
> + extlog_cxl_cper_handle_prot_err(prot_err,
> + gdata->error_severity);
> } else if (guid_equal(sec_type, &CPER_SEC_PCIE)) {
> struct cper_sec_pcie *pcie_err = acpi_hest_get_payload(gdata);
>
> diff --git a/drivers/cxl/core/ras.c b/drivers/cxl/core/ras.c
> index 2731ba3a0799..a90480d07c87 100644
> --- a/drivers/cxl/core/ras.c
> +++ b/drivers/cxl/core/ras.c
> @@ -63,7 +63,7 @@ static int match_memdev_by_parent(struct device *dev, const void *uport)
> return 0;
> }
>
> -static void cxl_cper_handle_prot_err(struct cxl_cper_prot_err_work_data *data)
> +void cxl_cper_handle_prot_err(struct cxl_cper_prot_err_work_data *data)
> {
> unsigned int devfn = PCI_DEVFN(data->prot_err.agent_addr.device,
> data->prot_err.agent_addr.function);
> @@ -104,6 +104,7 @@ static void cxl_cper_handle_prot_err(struct cxl_cper_prot_err_work_data *data)
> else
> cxl_cper_trace_uncorr_prot_err(cxlmd, data->ras_cap);
> }
> +EXPORT_SYMBOL_GPL(cxl_cper_handle_prot_err);
>
> static void cxl_cper_prot_err_work_fn(struct work_struct *work)
> {
> diff --git a/include/cxl/event.h b/include/cxl/event.h
> index 94081aec597a..ff97fea718d2 100644
> --- a/include/cxl/event.h
> +++ b/include/cxl/event.h
> @@ -340,4 +340,6 @@ cxl_cper_setup_prot_err_work_data(struct cxl_cper_prot_err_work_data *wd,
> }
> #endif
>
> +void cxl_cper_handle_prot_err(struct cxl_cper_prot_err_work_data *wd);
> +
> #endif /* _LINUX_CXL_EVENT_H */
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 3/6 v7] acpi/ghes: Make GHES select ACPI_APEI_PCIEAER
2025-11-04 18:22 ` [PATCH 3/6 v7] acpi/ghes: Make GHES select ACPI_APEI_PCIEAER Fabio M. De Francesco
2025-11-05 10:05 ` kernel test robot
@ 2025-11-11 15:42 ` Jonathan Cameron
2025-11-21 2:12 ` Hanjun Guo
1 sibling, 1 reply; 18+ messages in thread
From: Jonathan Cameron @ 2025-11-11 15:42 UTC (permalink / raw)
To: Fabio M. De Francesco
Cc: linux-cxl, Rafael J . Wysocki, Len Brown, Tony Luck,
Borislav Petkov, Hanjun Guo, Mauro Carvalho Chehab, Shuai Xue,
Davidlohr Bueso, Dave Jiang, Alison Schofield, Vishal Verma,
Ira Weiny, Dan Williams, Mahesh J Salgaonkar,
Oliver O'Halloran, Bjorn Helgaas, linux-kernel, linux-acpi,
linuxppc-dev, linux-pci
On Tue, 4 Nov 2025 19:22:34 +0100
"Fabio M. De Francesco" <fabio.m.de.francesco@linux.intel.com> wrote:
> GHES handles the PCI Express Error Section and also the Compute Express
> Link (CXL) Protocol Error Section. Two of its functions depend on the
> APEI PCIe AER logging/recovering support (ACPI_APEI_PCIEAER).
>
> Make GHES select ACPI_APEI_PCIEAER and remove the conditional
> compilation from the body of two static functions that handle the CPER
> Error Sections mentioned above.
Hi Fabio,
I'm not seeing a justification here for the change and there may be
APEI platforms without PCI support. So is this just to simplify things or
is there a functional reason that it is necessary?
Jonathan
>
> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> ---
> drivers/acpi/apei/Kconfig | 2 ++
> drivers/acpi/apei/ghes.c | 4 ----
> 2 files changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/acpi/apei/Kconfig b/drivers/acpi/apei/Kconfig
> index 070c07d68dfb..cdf3cfa233b9 100644
> --- a/drivers/acpi/apei/Kconfig
> +++ b/drivers/acpi/apei/Kconfig
> @@ -23,6 +23,8 @@ config ACPI_APEI_GHES
> select ACPI_HED
> select IRQ_WORK
> select GENERIC_ALLOCATOR
> + select PCIEAER
> + select ACPI_APEI_PCIEAER
> select ARM_SDE_INTERFACE if ARM64
> help
> Generic Hardware Error Source provides a way to report
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index 97ee19f2cae0..d6fe5f020e96 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -613,7 +613,6 @@ static bool ghes_handle_arm_hw_error(struct acpi_hest_generic_data *gdata,
> */
> static void ghes_handle_aer(struct acpi_hest_generic_data *gdata)
> {
> -#ifdef CONFIG_ACPI_APEI_PCIEAER
> struct cper_sec_pcie *pcie_err = acpi_hest_get_payload(gdata);
>
> if (pcie_err->validation_bits & CPER_PCIE_VALID_DEVICE_ID &&
> @@ -646,7 +645,6 @@ static void ghes_handle_aer(struct acpi_hest_generic_data *gdata)
> (struct aer_capability_regs *)
> aer_info);
> }
> -#endif
> }
>
> static BLOCKING_NOTIFIER_HEAD(vendor_record_notify_list);
> @@ -711,7 +709,6 @@ struct work_struct *cxl_cper_prot_err_work;
> static void cxl_cper_post_prot_err(struct cxl_cper_sec_prot_err *prot_err,
> int severity)
> {
> -#ifdef CONFIG_ACPI_APEI_PCIEAER
> struct cxl_cper_prot_err_work_data wd;
> u8 *dvsec_start, *cap_start;
>
> @@ -767,7 +764,6 @@ static void cxl_cper_post_prot_err(struct cxl_cper_sec_prot_err *prot_err,
> }
>
> schedule_work(cxl_cper_prot_err_work);
> -#endif
> }
>
> int cxl_cper_register_prot_err_work(struct work_struct *work)
* Re: [PATCH 4/6 v7] acpi/ghes: Add helper for CXL protocol errors checks
2025-11-04 18:22 ` [PATCH 4/6 v7] acpi/ghes: Add helper for CXL protocol errors checks Fabio M. De Francesco
2025-11-07 18:30 ` Dave Jiang
@ 2025-11-11 15:43 ` Jonathan Cameron
2025-11-21 2:16 ` Hanjun Guo
2 siblings, 0 replies; 18+ messages in thread
From: Jonathan Cameron @ 2025-11-11 15:43 UTC (permalink / raw)
To: Fabio M. De Francesco
Cc: linux-cxl, Rafael J . Wysocki, Len Brown, Tony Luck,
Borislav Petkov, Hanjun Guo, Mauro Carvalho Chehab, Shuai Xue,
Davidlohr Bueso, Dave Jiang, Alison Schofield, Vishal Verma,
Ira Weiny, Dan Williams, Mahesh J Salgaonkar,
Oliver O'Halloran, Bjorn Helgaas, linux-kernel, linux-acpi,
linuxppc-dev, linux-pci
On Tue, 4 Nov 2025 19:22:35 +0100
"Fabio M. De Francesco" <fabio.m.de.francesco@linux.intel.com> wrote:
> Move the CPER CXL protocol errors validity check out of
> cxl_cper_post_prot_err() to new cxl_cper_sec_prot_err_valid() and limit
> the serial number check only to CXL agents that are CXL devices (UEFI
> v2.10, Appendix N.2.13).
>
> Export the new symbol for reuse by ELOG.
>
> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
* Re: [PATCH 3/6 v7] acpi/ghes: Make GHES select ACPI_APEI_PCIEAER
2025-11-11 15:42 ` Jonathan Cameron
@ 2025-11-21 2:12 ` Hanjun Guo
0 siblings, 0 replies; 18+ messages in thread
From: Hanjun Guo @ 2025-11-21 2:12 UTC (permalink / raw)
To: Jonathan Cameron, Fabio M. De Francesco
Cc: linux-cxl, Rafael J . Wysocki, Len Brown, Tony Luck,
Borislav Petkov, Mauro Carvalho Chehab, Shuai Xue,
Davidlohr Bueso, Dave Jiang, Alison Schofield, Vishal Verma,
Ira Weiny, Dan Williams, Mahesh J Salgaonkar,
Oliver O'Halloran, Bjorn Helgaas, linux-kernel, linux-acpi,
linuxppc-dev, linux-pci
On 2025/11/11 23:42, Jonathan Cameron wrote:
> On Tue, 4 Nov 2025 19:22:34 +0100
> "Fabio M. De Francesco" <fabio.m.de.francesco@linux.intel.com> wrote:
>
>> GHES handles the PCI Express Error Section and also the Compute Express
>> Link (CXL) Protocol Error Section. Two of its functions depend on the
>> APEI PCIe AER logging/recovering support (ACPI_APEI_PCIEAER).
>>
>> Make GHES select ACPI_APEI_PCIEAER and remove the conditional
>> compilation from the body of two static functions that handle the CPER
>> Error Sections mentioned above.
>
> Hi Fabio,
>
> I'm not seeing a justification here for the change and there may be
> APEI platforms without PCI support. So is this just to simplify things or
> is there a functional reason that it is necessary?
I have the same worry: an embedded system with ACPI support may not have
PCI. And for APEI, AER is only one of the error types, and it is optional.
Thanks
Hanjun
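As a rough illustration of the concern raised in this sub-thread (not a proposal made by anyone in it), the user-space sketch below models a handler that stays callable on every configuration while its body is compiled out when AER support is absent. The macro only emulates a Kconfig symbol, and handle_aer_section() and aer_sections_handled are hypothetical names that do not appear in the patch.

```c
/*
 * Assumption: in a kernel build this symbol would be provided by Kconfig.
 * It is deliberately left undefined here to model a platform without PCI.
 */
/* #define CONFIG_ACPI_APEI_PCIEAER 1 */

/* Hypothetical counter, present only to make the effect observable. */
static int aer_sections_handled;

/* Hypothetical stand-in for a handler such as ghes_handle_aer(). */
static void handle_aer_section(const void *payload)
{
#ifdef CONFIG_ACPI_APEI_PCIEAER
	if (payload)
		aer_sections_handled++;
#else
	(void)payload;	/* AER support compiled out: nothing to do */
#endif
}
```

With the symbol undefined the call is a no-op, so the caller builds without a hard dependency on PCI; a `select` in Kconfig would remove the need for the #ifdef at the cost of always pulling in PCIe AER.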
* Re: [PATCH 4/6 v7] acpi/ghes: Add helper for CXL protocol errors checks
2025-11-04 18:22 ` [PATCH 4/6 v7] acpi/ghes: Add helper for CXL protocol errors checks Fabio M. De Francesco
2025-11-07 18:30 ` Dave Jiang
2025-11-11 15:43 ` Jonathan Cameron
@ 2025-11-21 2:16 ` Hanjun Guo
2 siblings, 0 replies; 18+ messages in thread
From: Hanjun Guo @ 2025-11-21 2:16 UTC (permalink / raw)
To: Fabio M. De Francesco, linux-cxl
Cc: Rafael J . Wysocki, Len Brown, Tony Luck, Borislav Petkov,
Mauro Carvalho Chehab, Shuai Xue, Davidlohr Bueso,
Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma,
Ira Weiny, Dan Williams, Mahesh J Salgaonkar,
Oliver O'Halloran, Bjorn Helgaas, linux-kernel, linux-acpi,
linuxppc-dev, linux-pci
On 2025/11/5 2:22, Fabio M. De Francesco wrote:
> Move the CPER CXL protocol errors validity check out of
> cxl_cper_post_prot_err() to new cxl_cper_sec_prot_err_valid() and limit
> the serial number check only to CXL agents that are CXL devices (UEFI
> v2.10, Appendix N.2.13).
>
> Export the new symbol for reuse by ELOG.
>
> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> ---
> drivers/acpi/apei/ghes.c | 32 ++++++++++++++++++++++----------
> include/cxl/event.h | 10 ++++++++++
> 2 files changed, 32 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index d6fe5f020e96..e69ae864f43d 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -706,30 +706,42 @@ static DEFINE_KFIFO(cxl_cper_prot_err_fifo, struct cxl_cper_prot_err_work_data,
> static DEFINE_SPINLOCK(cxl_cper_prot_err_work_lock);
> struct work_struct *cxl_cper_prot_err_work;
>
> -static void cxl_cper_post_prot_err(struct cxl_cper_sec_prot_err *prot_err,
> - int severity)
> +int cxl_cper_sec_prot_err_valid(struct cxl_cper_sec_prot_err *prot_err)
> {
> - struct cxl_cper_prot_err_work_data wd;
> - u8 *dvsec_start, *cap_start;
> -
> if (!(prot_err->valid_bits & PROT_ERR_VALID_AGENT_ADDRESS)) {
> pr_err_ratelimited("CXL CPER invalid agent type\n");
> - return;
> + return -EINVAL;
> }
>
> if (!(prot_err->valid_bits & PROT_ERR_VALID_ERROR_LOG)) {
> pr_err_ratelimited("CXL CPER invalid protocol error log\n");
> - return;
> + return -EINVAL;
> }
>
> if (prot_err->err_len != sizeof(struct cxl_ras_capability_regs)) {
> pr_err_ratelimited("CXL CPER invalid RAS Cap size (%u)\n",
> prot_err->err_len);
> - return;
> + return -EINVAL;
> }
>
> - if (!(prot_err->valid_bits & PROT_ERR_VALID_SERIAL_NUMBER))
> - pr_warn(FW_WARN "CXL CPER no device serial number\n");
> + if ((prot_err->agent_type == RCD || prot_err->agent_type == DEVICE ||
> + prot_err->agent_type == LD || prot_err->agent_type == FMLD) &&
> + !(prot_err->valid_bits & PROT_ERR_VALID_SERIAL_NUMBER))
> + pr_warn_ratelimited(FW_WARN
> + "CXL CPER no device serial number\n");
> +
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(cxl_cper_sec_prot_err_valid);
> +
> +static void cxl_cper_post_prot_err(struct cxl_cper_sec_prot_err *prot_err,
> + int severity)
> +{
> + struct cxl_cper_prot_err_work_data wd;
> + u8 *dvsec_start, *cap_start;
> +
> + if (cxl_cper_sec_prot_err_valid(prot_err))
> + return;
>
> guard(spinlock_irqsave)(&cxl_cper_prot_err_work_lock);
>
> diff --git a/include/cxl/event.h b/include/cxl/event.h
> index 6fd90f9cc203..4d7d1036ea9c 100644
> --- a/include/cxl/event.h
> +++ b/include/cxl/event.h
> @@ -320,4 +320,14 @@ static inline int cxl_cper_prot_err_kfifo_get(struct cxl_cper_prot_err_work_data
> }
> #endif
>
> +#ifdef CONFIG_ACPI_APEI_PCIEAER
> +int cxl_cper_sec_prot_err_valid(struct cxl_cper_sec_prot_err *prot_err);
> +#else
> +static inline int
> +cxl_cper_sec_prot_err_valid(struct cxl_cper_sec_prot_err *prot_err)
> +{
> + return -EOPNOTSUPP;
> +}
> +#endif
> +
> #endif /* _LINUX_CXL_EVENT_H */
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Thanks
Hanjun
* Re: [PATCH 5/6 v7] acpi/ghes: Add helper to copy CXL protocol error info to work struct
2025-11-04 18:22 ` [PATCH 5/6 v7] acpi/ghes: Add helper to copy CXL protocol error info to work struct Fabio M. De Francesco
2025-11-07 18:31 ` Dave Jiang
@ 2025-11-21 2:22 ` Hanjun Guo
1 sibling, 0 replies; 18+ messages in thread
From: Hanjun Guo @ 2025-11-21 2:22 UTC (permalink / raw)
To: Fabio M. De Francesco, linux-cxl
Cc: Rafael J . Wysocki, Len Brown, Tony Luck, Borislav Petkov,
Mauro Carvalho Chehab, Shuai Xue, Davidlohr Bueso,
Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma,
Ira Weiny, Dan Williams, Mahesh J Salgaonkar,
Oliver O'Halloran, Bjorn Helgaas, linux-kernel, linux-acpi,
linuxppc-dev, linux-pci
On 2025/11/5 2:22, Fabio M. De Francesco wrote:
> Make a helper out of cxl_cper_post_prot_err() that checks the CXL agent
> type and copies the CPER CXL protocol error information to a work data
> structure.
>
> Export the new symbol for reuse by ELOG.
>
> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> ---
> drivers/acpi/apei/ghes.c | 42 ++++++++++++++++++++++++++--------------
> include/cxl/event.h | 10 ++++++++++
> 2 files changed, 37 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index e69ae864f43d..2f4632d9855a 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -734,20 +734,12 @@ int cxl_cper_sec_prot_err_valid(struct cxl_cper_sec_prot_err *prot_err)
> }
> EXPORT_SYMBOL_GPL(cxl_cper_sec_prot_err_valid);
>
> -static void cxl_cper_post_prot_err(struct cxl_cper_sec_prot_err *prot_err,
> - int severity)
> +int cxl_cper_setup_prot_err_work_data(struct cxl_cper_prot_err_work_data *wd,
> + struct cxl_cper_sec_prot_err *prot_err,
> + int severity)
> {
> - struct cxl_cper_prot_err_work_data wd;
> u8 *dvsec_start, *cap_start;
>
> - if (cxl_cper_sec_prot_err_valid(prot_err))
> - return;
> -
> - guard(spinlock_irqsave)(&cxl_cper_prot_err_work_lock);
> -
> - if (!cxl_cper_prot_err_work)
> - return;
> -
> switch (prot_err->agent_type) {
> case RCD:
> case DEVICE:
> @@ -756,20 +748,40 @@ static void cxl_cper_post_prot_err(struct cxl_cper_sec_prot_err *prot_err,
> case RP:
> case DSP:
> case USP:
> - memcpy(&wd.prot_err, prot_err, sizeof(wd.prot_err));
> + memcpy(&wd->prot_err, prot_err, sizeof(wd->prot_err));
>
> dvsec_start = (u8 *)(prot_err + 1);
> cap_start = dvsec_start + prot_err->dvsec_len;
>
> - memcpy(&wd.ras_cap, cap_start, sizeof(wd.ras_cap));
> - wd.severity = cper_severity_to_aer(severity);
> + memcpy(&wd->ras_cap, cap_start, sizeof(wd->ras_cap));
> + wd->severity = cper_severity_to_aer(severity);
> break;
> default:
> pr_err_ratelimited("CXL CPER invalid agent type: %d\n",
> prot_err->agent_type);
> - return;
> + return -EINVAL;
> }
>
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(cxl_cper_setup_prot_err_work_data);
> +
> +static void cxl_cper_post_prot_err(struct cxl_cper_sec_prot_err *prot_err,
> + int severity)
> +{
> + struct cxl_cper_prot_err_work_data wd;
> +
> + if (cxl_cper_sec_prot_err_valid(prot_err))
> + return;
> +
> + guard(spinlock_irqsave)(&cxl_cper_prot_err_work_lock);
> +
> + if (!cxl_cper_prot_err_work)
> + return;
> +
> + if (cxl_cper_setup_prot_err_work_data(&wd, prot_err, severity))
> + return;
> +
> if (!kfifo_put(&cxl_cper_prot_err_fifo, wd)) {
> pr_err_ratelimited("CXL CPER kfifo overflow\n");
> return;
> diff --git a/include/cxl/event.h b/include/cxl/event.h
> index 4d7d1036ea9c..94081aec597a 100644
> --- a/include/cxl/event.h
> +++ b/include/cxl/event.h
> @@ -322,12 +322,22 @@ static inline int cxl_cper_prot_err_kfifo_get(struct cxl_cper_prot_err_work_data
>
> #ifdef CONFIG_ACPI_APEI_PCIEAER
> int cxl_cper_sec_prot_err_valid(struct cxl_cper_sec_prot_err *prot_err);
> +int cxl_cper_setup_prot_err_work_data(struct cxl_cper_prot_err_work_data *wd,
> + struct cxl_cper_sec_prot_err *prot_err,
> + int severity);
> #else
> static inline int
> cxl_cper_sec_prot_err_valid(struct cxl_cper_sec_prot_err *prot_err)
> {
> return -EOPNOTSUPP;
> }
> +static inline int
> +cxl_cper_setup_prot_err_work_data(struct cxl_cper_prot_err_work_data *wd,
> + struct cxl_cper_sec_prot_err *prot_err,
> + int severity)
> +{
> + return -EOPNOTSUPP;
> +}
> #endif
>
> #endif /* _LINUX_CXL_EVENT_H */
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Thanks
Hanjun
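The pointer arithmetic in the helper relies on the record layout: a fixed header, then a DVSEC of dvsec_len bytes, then the RAS capability registers. A minimal user-space sketch of that layout walk is given below, under the assumption of heavily simplified structures; hdr_model, ras_cap_model and copy_ras_cap() are hypothetical names, and the real kernel structures carry many more fields.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified layouts; the real structures are larger. */
struct hdr_model {
	uint16_t dvsec_len;	/* length of the DVSEC that follows the header */
	uint16_t err_len;	/* length of the RAS capability block */
};

struct ras_cap_model { uint32_t regs[4]; };

/*
 * Copy the RAS capability block that follows the header and the
 * variable-length DVSEC. The caller is assumed to have validated the
 * record lengths beforehand, as the validity helper does for err_len.
 */
static void copy_ras_cap(struct ras_cap_model *dst, const uint8_t *record)
{
	struct hdr_model hdr;
	const uint8_t *dvsec_start, *cap_start;

	/* Read the header through memcpy to avoid alignment assumptions. */
	memcpy(&hdr, record, sizeof(hdr));

	dvsec_start = record + sizeof(hdr);
	cap_start = dvsec_start + hdr.dvsec_len;

	memcpy(dst, cap_start, sizeof(*dst));
}
```

Passing the destination by pointer, as the patch does with its wd argument, is what would let a second caller such as ELOG fill its own work data without going through the GHES-private kfifo path.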
end of thread (newest message: 2025-11-21 2:22 UTC)
Thread overview: 18+ messages
2025-11-04 18:22 [PATCH 0/6 v7] Make ELOG and GHES log and trace consistently Fabio M. De Francesco
2025-11-04 18:22 ` [PATCH 1/6 v7] ACPI: extlog: Trace CPER Non-standard Section Body Fabio M. De Francesco
2025-11-04 18:22 ` [PATCH 2/6 v7] ACPI: extlog: Trace CPER PCI Express Error Section Fabio M. De Francesco
2025-11-05 9:23 ` kernel test robot
2025-11-04 18:22 ` [PATCH 3/6 v7] acpi/ghes: Make GHES select ACPI_APEI_PCIEAER Fabio M. De Francesco
2025-11-05 10:05 ` kernel test robot
2025-11-11 15:42 ` Jonathan Cameron
2025-11-21 2:12 ` Hanjun Guo
2025-11-04 18:22 ` [PATCH 4/6 v7] acpi/ghes: Add helper for CXL protocol errors checks Fabio M. De Francesco
2025-11-07 18:30 ` Dave Jiang
2025-11-11 15:43 ` Jonathan Cameron
2025-11-21 2:16 ` Hanjun Guo
2025-11-04 18:22 ` [PATCH 5/6 v7] acpi/ghes: Add helper to copy CXL protocol error info to work struct Fabio M. De Francesco
2025-11-07 18:31 ` Dave Jiang
2025-11-21 2:22 ` Hanjun Guo
2025-11-04 18:22 ` [PATCH 6/6 v7] ACPI: extlog: Trace CPER CXL Protocol Error Section Fabio M. De Francesco
2025-11-05 9:01 ` kernel test robot
2025-11-07 19:59 ` Dave Jiang