* [PATCH v2 0/4] Add support for Versal Xilsem edac
@ 2025-07-22 16:03 Rama devi Veggalam
2025-07-22 16:03 ` [PATCH v2 1/4] dt-bindings: edac: Add bindings for Xilinx Versal EDAC for XilSem Rama devi Veggalam
` (4 more replies)
0 siblings, 5 replies; 7+ messages in thread
From: Rama devi Veggalam @ 2025-07-22 16:03 UTC (permalink / raw)
To: bp, tony.luck, michal.simek, robh, krzk+dt, conor+dt
Cc: linux-kernel, linux-edac, devicetree, james.morse, mchehab, rric,
git, Rama devi Veggalam
Add a sysfs interface for XilSEM scan operations: initialize, start and
stop scan, inject errors, and read ECC, status and configuration values.
Handle correctable and uncorrectable XilSEM error events.
This series depends on
https://lore.kernel.org/r/20250701123851.1314531-1-jay.buddhabhatti@amd.com
Changes in v2:
- Patches created on top of dependent patch series
"enhance zynqmp_pm_get_family_info()"
- Removed non-relevant SOB names in error event header files
- Added details for eprobe_defer conditions
- Updated copyright information
- Merged Versal and Versal NET error event definitions to firmware
patch
- Updated "Date" field in sysfs file
- Changed "xlnx,versal-xilsem-edac" to constant
- Removed unused macros
- Fixed formatting issues
- Removed ARCH_ZYNQMP in dependent list of XilSEM Kconfig
- Added error code for invalid versal device type
- Removed redundant sysfs details in function headers
- Included MAINTAINERS to 4/4 patch
- Added more description in commit message 4/4
- Removed "items" in compatible
- Fixed indentation in examples
- Removed print for probe success
- Removed function comments for remove()
Rama devi Veggalam (4):
dt-bindings: edac: Add bindings for Xilinx Versal EDAC for XilSem
Documentation: ABI: Add ABI doc for xilsem edac sysfs
firmware: xilinx: Add support for Xilsem scan operations
edac: xilinx: Add EDAC support for Xilinx XilSem
.../ABI/testing/sysfs-driver-xilsem-edac | 104 +++
.../edac/xlnx,versal-xilsem-edac.yaml | 42 +
MAINTAINERS | 6 +
drivers/edac/Kconfig | 23 +
drivers/edac/Makefile | 1 +
drivers/edac/xilinx_xilsem_edac.c | 746 ++++++++++++++++++
drivers/firmware/xilinx/zynqmp.c | 91 ++-
drivers/soc/xilinx/xlnx_event_manager.c | 10 +-
.../linux/firmware/xlnx-versal-error-events.h | 49 ++
.../firmware/xlnx-versal-net-error-events.h | 51 ++
include/linux/firmware/xlnx-zynqmp.h | 47 +-
11 files changed, 1149 insertions(+), 21 deletions(-)
create mode 100644 Documentation/ABI/testing/sysfs-driver-xilsem-edac
create mode 100644 Documentation/devicetree/bindings/edac/xlnx,versal-xilsem-edac.yaml
create mode 100644 drivers/edac/xilinx_xilsem_edac.c
create mode 100644 include/linux/firmware/xlnx-versal-error-events.h
create mode 100644 include/linux/firmware/xlnx-versal-net-error-events.h
--
2.23.0
* [PATCH v2 1/4] dt-bindings: edac: Add bindings for Xilinx Versal EDAC for XilSem
2025-07-22 16:03 [PATCH v2 0/4] Add support for Versal Xilsem edac Rama devi Veggalam
@ 2025-07-22 16:03 ` Rama devi Veggalam
2025-07-23 8:05 ` Krzysztof Kozlowski
2025-07-22 16:03 ` [PATCH v2 2/4] Documentation: ABI: Add ABI doc for xilsem edac sysfs Rama devi Veggalam
` (3 subsequent siblings)
4 siblings, 1 reply; 7+ messages in thread
From: Rama devi Veggalam @ 2025-07-22 16:03 UTC (permalink / raw)
To: bp, tony.luck, michal.simek, robh, krzk+dt, conor+dt
Cc: linux-kernel, linux-edac, devicetree, james.morse, mchehab, rric,
git, Rama devi Veggalam
Add device tree bindings for Xilinx Versal EDAC for
the XilSEM controller.
Signed-off-by: Rama devi Veggalam <rama.devi.veggalam@amd.com>
---
Changes in v2:
- Changed "xlnx,versal-xilsem-edac" to constant
- Removed "compatible" in required section
- Removed "|" in description
- Removed "items" in compatible
- Fixed indentation in examples
- Updated title and description
---
.../edac/xlnx,versal-xilsem-edac.yaml | 42 +++++++++++++++++++
1 file changed, 42 insertions(+)
create mode 100644 Documentation/devicetree/bindings/edac/xlnx,versal-xilsem-edac.yaml
diff --git a/Documentation/devicetree/bindings/edac/xlnx,versal-xilsem-edac.yaml b/Documentation/devicetree/bindings/edac/xlnx,versal-xilsem-edac.yaml
new file mode 100644
index 000000000000..23c1d0557a66
--- /dev/null
+++ b/Documentation/devicetree/bindings/edac/xlnx,versal-xilsem-edac.yaml
@@ -0,0 +1,42 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/edac/xlnx,versal-xilsem-edac.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Xilinx Versal Soft Error Mitigation (XilSEM) EDAC
+
+maintainers:
+ - Rama Devi Veggalam <rama.devi.veggalam@amd.com>
+
+description:
+ Xilinx Versal Soft Error Mitigation (XilSEM) is part of the
+ Platform Loader and Manager (PLM) which is loaded into and runs on the
+ Platform Management Controller (PMC). XilSEM is responsible for reporting
+ and optionally correcting soft errors in Configuration Memory of Versal.
+ The memory is scanned by a hardware controller in the Versal Programmable
+ Logic (PL). During the scan, if the controller detects any error, be it
+ correctable or uncorrectable, it reports the error to PLM. The XilSEM on PLM
+ performs the error validation and notifies the user application of the errors.
+ This XilSEM EDAC node is responsible for handling error events received from
+ XilSEM on PLM and also provides an interface to control scan operations and
+ fetching the scan status & configuration information.
+
+properties:
+ compatible:
+ const: xlnx,versal-xilsem-edac
+
+ reg:
+ maxItems: 1
+
+required:
+ - reg
+
+additionalProperties: false
+
+examples:
+ - |
+ edac@f2014050 {
+ compatible = "xlnx,versal-xilsem-edac";
+ reg = <0xf2014050 0xc4>;
+ };
--
2.23.0
* [PATCH v2 2/4] Documentation: ABI: Add ABI doc for xilsem edac sysfs
2025-07-22 16:03 [PATCH v2 0/4] Add support for Versal Xilsem edac Rama devi Veggalam
2025-07-22 16:03 ` [PATCH v2 1/4] dt-bindings: edac: Add bindings for Xilinx Versal EDAC for XilSem Rama devi Veggalam
@ 2025-07-22 16:03 ` Rama devi Veggalam
2025-07-22 16:03 ` [PATCH v2 3/4] firmware: xilinx: Add support for Xilsem scan operations Rama devi Veggalam
` (2 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Rama devi Veggalam @ 2025-07-22 16:03 UTC (permalink / raw)
To: bp, tony.luck, michal.simek, robh, krzk+dt, conor+dt
Cc: linux-kernel, linux-edac, devicetree, james.morse, mchehab, rric,
git, Rama devi Veggalam
Add documentation for the sysfs entries created for
Versal XilSEM EDAC.
Signed-off-by: Rama devi Veggalam <rama.devi.veggalam@amd.com>
---
Changes in v2:
- Updated Date field in sysfs file
---
.../ABI/testing/sysfs-driver-xilsem-edac | 104 ++++++++++++++++++
1 file changed, 104 insertions(+)
create mode 100644 Documentation/ABI/testing/sysfs-driver-xilsem-edac
diff --git a/Documentation/ABI/testing/sysfs-driver-xilsem-edac b/Documentation/ABI/testing/sysfs-driver-xilsem-edac
new file mode 100644
index 000000000000..80180a7b16fb
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-driver-xilsem-edac
@@ -0,0 +1,104 @@
+What: /sys/devices/system/edac/versal_xilsem/xsem_scan_control
+Date: July 2025
+Contact: rama.devi.veggalam@amd.com
+Description:
+ It is a read/write file.
+ Writing to this file causes the software to initiate a
+ request to the firmware for doing requested scan operation in
+ Configuration RAM (CRAM) or NPI of Versal devices. User needs
+ to provide scan operation id (init, start, stop) details.
+ The scan operation id values are as given below:
+ 1 - Initialize the scan
+ 2 - Start CRAM scan
+ 3 - Stop CRAM scan
+ 5 - Start NPI scan
+ 6 - Stop NPI scan
+ 7 - Inject NPI error in first descriptor
+
+ When read, it shows the current scan status with error code.
+ The format is <0x1030 | operation Id> <error code>.
+ The different error codes are as given below:
+ ========== =====
+ Error Code Cause
+ ========== =====
+ 0x0 Scan operation success
+ 0x1 Failure in NPI scan
+ 0x80 Calibration timeout
+ 0x2000 Internal error
+ 0x500000 CRAM initialization not yet done
+ 0x600000 Start scan failed
+ 0x700000 Stop scan failed
+ 0xF00000 Active CRC/UE error
+ 0x1000000 ECC/CRC error detected during calibration
+ ========== =====
+
+What: /sys/devices/system/edac/versal_xilsem/xsem_cram_injecterr
+Date: July 2025
+Contact: rama.devi.veggalam@amd.com
+Description:
+ It is a read/write file.
+ Writing to this file causes the software to initiate a
+ request to the firmware for doing error injection in
+ Configuration RAM (CRAM) of Versal devices. User needs
+ to provide the location details of CRAM
+ (frame, qword, bit number, row number) to inject the error.
+ When read, it shows the current error injection status. The
+ format is <header> <error code>.
+ Example: 0x10304 0
+ The different error codes are as given below:
+ ========== =====
+ Error Code Cause
+ ========== =====
+ 0x0 Error injection success
+ 0x2000 Internal NULL pointer error
+ 0x500000 CRAM initialization not yet done
+ 0x800000 Invalid row
+ 0x900000 Invalid qword
+ 0xA00000 Invalid bit
+ 0xB00000 Invalid frame address
+ 0xC00000 Unexpected bits flipped
+ 0xD00000 Masked bit
+ 0xE00000 Invalid block type
+ 0xF00000 Active CRC/UE error in CRAM
+ ========== =====
+
+What: /sys/devices/system/edac/versal_xilsem/xsem_cram_framecc_read
+Date: July 2025
+Contact: rama.devi.veggalam@amd.com
+Description:
+ It is a read/write file.
+ Writing to this file causes the software to initiate a
+ request to the firmware for reading frame ECC values in
+ Configuration RAM (CRAM) of Versal devices. User needs
+ to provide the location details of CRAM
+ (frame, row number) to read the ECC values.
+ When read, it shows the ECC values for the requested frame.
+ The format is <status> <header> <ECC_0> <ECC_1>
+ Example: 0 0x1030A 0x363B1A 0x8A0200
+
+What: /sys/devices/system/edac/versal_xilsem/xsem_read_config
+Date: July 2025
+Contact: rama.devi.veggalam@amd.com
+Description:
+ It is a read/write file.
+ Writing to this file causes the software to initiate a
+ request to the firmware for reading Xilsem configuration.
+ When read, it shows the CRAM and NPI scan configuration.
+ The format is <status> <header> <CRAM config> <NPI config>
+ Example: 0 0x1030A 0x26 0x5016
+
+What: /sys/devices/system/edac/versal_xilsem/xsem_read_status
+Date: July 2025
+Contact: rama.devi.veggalam@amd.com
+Description:
+ It is a read/write file.
+ Writing to this file causes the software to initiate a
+ request to read the Xilsem status. User needs to provide
+ the module id for status. The module id values are as given below:
+ 1 - CRAM scan
+ 2 - NPI scan
+ When read, it shows the status of the requested module.
+ For CRAM: <status> <CE count>
+ Example: 0x10005 0
+ For NPI: <status> <scan count> <heartbeat count>
+ Example: 0xA01 0x10 0x1
--
2.23.0
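The error-code table documented for xsem_scan_control in the patch above lends itself to a small userspace lookup. The following is a minimal sketch in C, not part of the patch: the struct name, table name and the helper xsem_scan_err_cause() are hypothetical, and only the code/cause pairs are taken from the ABI text.

```c
#include <stddef.h>
#include <stdint.h>

/* One row of the xsem_scan_control error-code table from the ABI text. */
struct xsem_err_desc {
	uint32_t code;
	const char *cause;
};

/* Code/cause pairs copied from the documented table; nothing else is assumed. */
static const struct xsem_err_desc xsem_scan_errs[] = {
	{ 0x0,       "Scan operation success" },
	{ 0x1,       "Failure in NPI scan" },
	{ 0x80,      "Calibration timeout" },
	{ 0x2000,    "Internal error" },
	{ 0x500000,  "CRAM initialization not yet done" },
	{ 0x600000,  "Start scan failed" },
	{ 0x700000,  "Stop scan failed" },
	{ 0xF00000,  "Active CRC/UE error" },
	{ 0x1000000, "ECC/CRC error detected during calibration" },
};

/*
 * Hypothetical helper: map an error code read back from the sysfs file
 * to its documented cause. Returns NULL for a code not in the table.
 */
static const char *xsem_scan_err_cause(uint32_t code)
{
	size_t i;

	for (i = 0; i < sizeof(xsem_scan_errs) / sizeof(xsem_scan_errs[0]); i++) {
		if (xsem_scan_errs[i].code == code)
			return xsem_scan_errs[i].cause;
	}
	return NULL;
}
```

A tool reading the file would parse the second field of the status line and pass it to xsem_scan_err_cause(). How the two fields are delimited in the real output should be checked against the driver's show() routine rather than assumed from this sketch.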
* [PATCH v2 3/4] firmware: xilinx: Add support for Xilsem scan operations
2025-07-22 16:03 [PATCH v2 0/4] Add support for Versal Xilsem edac Rama devi Veggalam
2025-07-22 16:03 ` [PATCH v2 1/4] dt-bindings: edac: Add bindings for Xilinx Versal EDAC for XilSem Rama devi Veggalam
2025-07-22 16:03 ` [PATCH v2 2/4] Documentation: ABI: Add ABI doc for xilsem edac sysfs Rama devi Veggalam
@ 2025-07-22 16:03 ` Rama devi Veggalam
2025-07-22 16:03 ` [PATCH v2 4/4] edac: xilinx: Add EDAC support for Xilinx XilSem Rama devi Veggalam
2025-07-22 16:49 ` [PATCH v2 0/4] Add support for Versal Xilsem edac Borislav Petkov
4 siblings, 0 replies; 7+ messages in thread
From: Rama devi Veggalam @ 2025-07-22 16:03 UTC (permalink / raw)
To: bp, tony.luck, michal.simek, robh, krzk+dt, conor+dt
Cc: linux-kernel, linux-edac, devicetree, james.morse, mchehab, rric,
git, Rama devi Veggalam
Add ATF EEMI call support for XilSEM scan operations: initialize,
start and stop scan, inject errors, read configuration and status,
and register for software error events.
Add macros for XilSem correctable and uncorrectable error events.
These new macros need to be used during registration of XilSem error
events for Versal and Versal NET devices.
Signed-off-by: Rama devi Veggalam <rama.devi.veggalam@amd.com>
---
Changes in v2:
- Patch created on top of dependent patch series
"enhance zynqmp_pm_get_family_info()"
- Removed non-relevant SOB names in error event header files
- Updated copyright information
- Merged Versal and Versal NET error event definitions to firmware
driver changes
---
drivers/firmware/xilinx/zynqmp.c | 91 ++++++++++++++++++-
drivers/soc/xilinx/xlnx_event_manager.c | 10 +-
.../linux/firmware/xlnx-versal-error-events.h | 49 ++++++++++
.../firmware/xlnx-versal-net-error-events.h | 51 +++++++++++
include/linux/firmware/xlnx-zynqmp.h | 47 ++++++----
5 files changed, 227 insertions(+), 21 deletions(-)
create mode 100644 include/linux/firmware/xlnx-versal-error-events.h
create mode 100644 include/linux/firmware/xlnx-versal-net-error-events.h
diff --git a/drivers/firmware/xilinx/zynqmp.c b/drivers/firmware/xilinx/zynqmp.c
index 17156eea78f2..9712ff353246 100644
--- a/drivers/firmware/xilinx/zynqmp.c
+++ b/drivers/firmware/xilinx/zynqmp.c
@@ -3,7 +3,7 @@
* Xilinx Zynq MPSoC Firmware layer
*
* Copyright (C) 2014-2022 Xilinx, Inc.
- * Copyright (C) 2022 - 2024, Advanced Micro Devices, Inc.
+ * Copyright (C) 2022 - 2025, Advanced Micro Devices, Inc.
*
* Michal Simek <michal.simek@amd.com>
* Davorin Mista <davorin.mista@aggios.com>
@@ -1643,6 +1643,95 @@ int zynqmp_pm_set_gem_config(u32 node, enum pm_gem_config_type config,
}
EXPORT_SYMBOL_GPL(zynqmp_pm_set_gem_config);
+/**
+ * zynqmp_pm_xilsem_cntrl_ops - PM call to perform XilSEM operations
+ * @cmd: Command for XilSEM scan control operations
+ * @response: Output response (command header, error code or status)
+ *
+ * Return: Returns 0 on success or error value on failure.
+ */
+int zynqmp_pm_xilsem_cntrl_ops(u32 cmd, u32 *const response)
+{
+ u32 ret_buf[PAYLOAD_ARG_CNT];
+ int ret;
+
+ ret = zynqmp_pm_invoke_fn(PM_XSEM_HEADER | cmd, ret_buf, 0);
+ response[0] = ret_buf[1];
+ response[1] = ret_buf[2];
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(zynqmp_pm_xilsem_cntrl_ops);
+
+/**
+ * zynqmp_pm_xilsem_cram_errinj - PM call to perform CRAM error injection
+ * @frame: Frame number to be used for error injection
+ * @qword: Word number to be used for error injection
+ * @bit: Bit location to be used for error injection
+ * @row: CFRAME row number to be used for error injection
+ * @response: Output response (command header, error code or status)
+ *
+ * Return: Returns 0 on success or error value on failure.
+ */
+int zynqmp_pm_xilsem_cram_errinj(u32 frame, u32 qword, u32 bit, u32 row,
+ u32 *const response)
+{
+ u32 ret_buf[PAYLOAD_ARG_CNT];
+ int ret;
+
+ ret = zynqmp_pm_invoke_fn(PM_XSEM_CRAM_ERRINJ, ret_buf, 4, frame,
+ qword, bit, row);
+ response[0] = ret_buf[1];
+ response[1] = ret_buf[2];
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(zynqmp_pm_xilsem_cram_errinj);
+
+/**
+ * zynqmp_pm_xilsem_cram_readecc - PM call to perform CFRAME ECC read
+ * @frame: Frame number to be used for reading ECC
+ * @row: CFRAME row number to be used for reading ECC
+ * @response: Output response (status, Frame ecc header, ECC values)
+ *
+ * Return: Returns 0 on success or error value on failure.
+ */
+int zynqmp_pm_xilsem_cram_readecc(u32 frame, u32 row, u32 *const response)
+{
+ u32 ret_buf[PAYLOAD_ARG_CNT];
+ int ret;
+
+ ret = zynqmp_pm_invoke_fn(PM_XSEM_CRAM_RD_ECC, ret_buf, 2, frame, row);
+ response[0] = ret_buf[0];
+ response[1] = ret_buf[1];
+ response[2] = ret_buf[2];
+ response[3] = ret_buf[3];
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(zynqmp_pm_xilsem_cram_readecc);
+
+/**
+ * zynqmp_pm_xilsem_read_cfg - PM call to perform Xilsem configuration read
+ * @response: Output response (status, config header, Xilsem config)
+ *
+ * Return: Returns 0 on success or error value on failure.
+ */
+int zynqmp_pm_xilsem_read_cfg(u32 *const response)
+{
+ u32 ret_buf[PAYLOAD_ARG_CNT];
+ int ret;
+
+ ret = zynqmp_pm_invoke_fn(PM_XSEM_RD_CONFIG, ret_buf, 0);
+ response[0] = ret_buf[0];
+ response[1] = ret_buf[1];
+ response[2] = ret_buf[2];
+ response[3] = ret_buf[3];
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(zynqmp_pm_xilsem_read_cfg);
+
/**
* struct zynqmp_pm_shutdown_scope - Struct for shutdown scope
* @subtype: Shutdown subtype
diff --git a/drivers/soc/xilinx/xlnx_event_manager.c b/drivers/soc/xilinx/xlnx_event_manager.c
index 6fdf4d14b7e7..f292a68ad5d5 100644
--- a/drivers/soc/xilinx/xlnx_event_manager.c
+++ b/drivers/soc/xilinx/xlnx_event_manager.c
@@ -3,12 +3,14 @@
* Xilinx Event Management Driver
*
* Copyright (C) 2021 Xilinx, Inc.
- * Copyright (C) 2024 Advanced Micro Devices, Inc.
+ * Copyright (C) 2024 - 2025 Advanced Micro Devices, Inc.
*
* Abhyuday Godhasara <abhyuday.godhasara@xilinx.com>
*/
#include <linux/cpuhotplug.h>
+#include <linux/firmware/xlnx-versal-error-events.h>
+#include <linux/firmware/xlnx-versal-net-error-events.h>
#include <linux/firmware/xlnx-event-manager.h>
#include <linux/firmware/xlnx-zynqmp.h>
#include <linux/hashtable.h>
@@ -85,7 +87,8 @@ static bool xlnx_is_error_event(const u32 node_id)
if (node_id == VERSAL_EVENT_ERROR_PMC_ERR1 ||
node_id == VERSAL_EVENT_ERROR_PMC_ERR2 ||
node_id == VERSAL_EVENT_ERROR_PSM_ERR1 ||
- node_id == VERSAL_EVENT_ERROR_PSM_ERR2)
+ node_id == VERSAL_EVENT_ERROR_PSM_ERR2 ||
+ node_id == VERSAL_EVENT_ERROR_SW_ERR)
return true;
} else if (pm_family_code == PM_VERSAL_NET_FAMILY_CODE) {
if (node_id == VERSAL_NET_EVENT_ERROR_PMC_ERR1 ||
@@ -94,7 +97,8 @@ static bool xlnx_is_error_event(const u32 node_id)
node_id == VERSAL_NET_EVENT_ERROR_PSM_ERR1 ||
node_id == VERSAL_NET_EVENT_ERROR_PSM_ERR2 ||
node_id == VERSAL_NET_EVENT_ERROR_PSM_ERR3 ||
- node_id == VERSAL_NET_EVENT_ERROR_PSM_ERR4)
+ node_id == VERSAL_NET_EVENT_ERROR_PSM_ERR4 ||
+ node_id == VERSAL_NET_EVENT_ERROR_SW_ERR)
return true;
}
diff --git a/include/linux/firmware/xlnx-versal-error-events.h b/include/linux/firmware/xlnx-versal-error-events.h
new file mode 100644
index 000000000000..2d3be7c9e84a
--- /dev/null
+++ b/include/linux/firmware/xlnx-versal-error-events.h
@@ -0,0 +1,49 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Xilinx Versal Error Event Node IDs and Error Event Mask.
+ * Use with Xilinx Event Management Driver
+ *
+ * Copyright (C) 2021-2022 Xilinx
+ * Copyright (C) 2023-2025 Advanced Micro Devices, Inc.
+ *
+ */
+
+#ifndef _FIRMWARE_XLNX_VERSAL_ERROR_EVENTS_H_
+#define _FIRMWARE_XLNX_VERSAL_ERROR_EVENTS_H_
+
+/*
+ * Error Event Node Ids
+ */
+#define VERSAL_EVENT_ERROR_PMC_ERR1 (0x28100000U)
+#define VERSAL_EVENT_ERROR_PMC_ERR2 (0x28104000U)
+#define VERSAL_EVENT_ERROR_PSM_ERR1 (0x28108000U)
+#define VERSAL_EVENT_ERROR_PSM_ERR2 (0x2810C000U)
+#define VERSAL_EVENT_ERROR_SW_ERR (0x28110000U)
+
+/*
+ * Error Event Mask belongs to SW ERR node,
+ * For which Node_Id = VERSAL_EVENT_ERROR_SW_ERR
+ */
+
+/**
+ * XPM_VERSAL_EVENT_ERROR_MASK_XSEM_CRAM_CE_5: Error event mask for handling
+ * correctable error in Versal Configuration RAM which is reported by
+ * Soft Error Mitigation (XilSEM).
+ */
+#define XPM_VERSAL_EVENT_ERROR_MASK_XSEM_CRAM_CE_5 BIT(5)
+
+/**
+ * XPM_VERSAL_EVENT_ERROR_MASK_XSEM_CRAM_UE_6: Error event mask for handling
+ * uncorrectable error in Versal Configuration RAM which is reported by
+ * Soft Error Mitigation (XilSEM).
+ */
+#define XPM_VERSAL_EVENT_ERROR_MASK_XSEM_CRAM_UE_6 BIT(6)
+
+/**
+ * XPM_VERSAL_EVENT_ERROR_MASK_XSEM_NPI_UE_7: Error event mask for handling
+ * uncorrectable error in Versal NoC programming interface (NPI)
+ * register which is reported by Soft Error Mitigation (XilSEM).
+ */
+#define XPM_VERSAL_EVENT_ERROR_MASK_XSEM_NPI_UE_7 BIT(7)
+
+#endif /* _FIRMWARE_XLNX_VERSAL_ERROR_EVENTS_H_ */
diff --git a/include/linux/firmware/xlnx-versal-net-error-events.h b/include/linux/firmware/xlnx-versal-net-error-events.h
new file mode 100644
index 000000000000..690337c6b9e7
--- /dev/null
+++ b/include/linux/firmware/xlnx-versal-net-error-events.h
@@ -0,0 +1,51 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Xilinx Versal NET Error Event Node IDs and Error Event Mask.
+ * Use with Xilinx Event Management Driver
+ *
+ * Copyright (C) 2023-2025, Advanced Micro Devices, Inc.
+ *
+ */
+
+#ifndef _FIRMWARE_XLNX_VERSAL_NET_ERROR_EVENTS_H_
+#define _FIRMWARE_XLNX_VERSAL_NET_ERROR_EVENTS_H_
+
+/*
+ * Error Event Node Ids
+ */
+#define VERSAL_NET_EVENT_ERROR_PMC_ERR1 (0x28100000U)
+#define VERSAL_NET_EVENT_ERROR_PMC_ERR2 (0x28104000U)
+#define VERSAL_NET_EVENT_ERROR_PMC_ERR3 (0x28108000U)
+#define VERSAL_NET_EVENT_ERROR_PSM_ERR1 (0x2810C000U)
+#define VERSAL_NET_EVENT_ERROR_PSM_ERR2 (0x28110000U)
+#define VERSAL_NET_EVENT_ERROR_PSM_ERR3 (0x28114000U)
+#define VERSAL_NET_EVENT_ERROR_PSM_ERR4 (0x28118000U)
+#define VERSAL_NET_EVENT_ERROR_SW_ERR (0x2811C000U)
+
+/*
+ * Error Event Mask belongs to SW ERR node,
+ * For which Node_Id = VERSAL_NET_EVENT_ERROR_SW_ERR
+ */
+
+/**
+ * XPM_VERSAL_NET_EVENT_ERROR_MASK_XSEM_CRAM_CE: Error event mask for handling
+ * correctable error in Versal Configuration RAM which is reported by
+ * Soft Error Mitigation (XilSEM).
+ */
+#define XPM_VERSAL_NET_EVENT_ERROR_MASK_XSEM_CRAM_CE BIT(7)
+
+/**
+ * XPM_VERSAL_NET_EVENT_ERROR_MASK_XSEM_CRAM_UE: Error event mask for handling
+ * uncorrectable error in Versal Configuration RAM which is reported by
+ * Soft Error Mitigation (XilSEM).
+ */
+#define XPM_VERSAL_NET_EVENT_ERROR_MASK_XSEM_CRAM_UE BIT(8)
+
+/**
+ * XPM_VERSAL_NET_EVENT_ERROR_MASK_XSEM_NPI_UE: Error event mask for handling
+ * uncorrectable error in Versal NoC programming interface (NPI)
+ * register which is reported by Soft Error Mitigation (XilSEM).
+ */
+#define XPM_VERSAL_NET_EVENT_ERROR_MASK_XSEM_NPI_UE BIT(9)
+
+#endif /* _FIRMWARE_XLNX_VERSAL_NET_ERROR_EVENTS_H_ */
diff --git a/include/linux/firmware/xlnx-zynqmp.h b/include/linux/firmware/xlnx-zynqmp.h
index 4bfe314e99ef..69f545eee743 100644
--- a/include/linux/firmware/xlnx-zynqmp.h
+++ b/include/linux/firmware/xlnx-zynqmp.h
@@ -3,7 +3,7 @@
* Xilinx Zynq MPSoC Firmware layer
*
* Copyright (C) 2014-2021 Xilinx
- * Copyright (C) 2022 - 2024, Advanced Micro Devices, Inc.
+ * Copyright (C) 2022 - 2025, Advanced Micro Devices, Inc.
*
* Michal Simek <michal.simek@amd.com>
* Davorin Mista <davorin.mista@aggios.com>
@@ -69,6 +69,11 @@
#define PM_SET_SUSPEND_MODE 0xa02
#define GET_CALLBACK_DATA 0xa01
+/* XilSEM commands */
+#define PM_XSEM_HEADER 0x300
+#define PM_XSEM_CRAM_ERRINJ 0x304
+#define PM_XSEM_RD_CONFIG 0x309
+#define PM_XSEM_CRAM_RD_ECC 0x30B
/* Number of 32bits values in payload */
#define PAYLOAD_ARG_CNT 7U
@@ -110,22 +115,6 @@
#define XILINX_ZYNQMP_PM_FPGA_CONFIG_STAT_OFFSET 7U
#define XILINX_ZYNQMP_PM_FPGA_READ_CONFIG_REG 0U
-/*
- * Node IDs for the Error Events.
- */
-#define VERSAL_EVENT_ERROR_PMC_ERR1 (0x28100000U)
-#define VERSAL_EVENT_ERROR_PMC_ERR2 (0x28104000U)
-#define VERSAL_EVENT_ERROR_PSM_ERR1 (0x28108000U)
-#define VERSAL_EVENT_ERROR_PSM_ERR2 (0x2810C000U)
-
-#define VERSAL_NET_EVENT_ERROR_PMC_ERR1 (0x28100000U)
-#define VERSAL_NET_EVENT_ERROR_PMC_ERR2 (0x28104000U)
-#define VERSAL_NET_EVENT_ERROR_PMC_ERR3 (0x28108000U)
-#define VERSAL_NET_EVENT_ERROR_PSM_ERR1 (0x2810C000U)
-#define VERSAL_NET_EVENT_ERROR_PSM_ERR2 (0x28110000U)
-#define VERSAL_NET_EVENT_ERROR_PSM_ERR3 (0x28114000U)
-#define VERSAL_NET_EVENT_ERROR_PSM_ERR4 (0x28118000U)
-
/* ZynqMP SD tap delay tuning */
#define SD_ITAPDLY 0xFF180314
#define SD_OTAPDLYSEL 0xFF180318
@@ -627,6 +616,10 @@ int zynqmp_pm_set_tcm_config(u32 node_id, enum rpu_tcm_comb tcm_mode);
int zynqmp_pm_set_sd_config(u32 node, enum pm_sd_config_type config, u32 value);
int zynqmp_pm_set_gem_config(u32 node, enum pm_gem_config_type config,
u32 value);
+int zynqmp_pm_xilsem_cntrl_ops(u32 cmd, u32 *const response);
+int zynqmp_pm_xilsem_cram_errinj(u32 frame, u32 qword, u32 bit, u32 row, u32 *const response);
+int zynqmp_pm_xilsem_cram_readecc(u32 frame, u32 row, u32 *const response);
+int zynqmp_pm_xilsem_read_cfg(u32 *const response);
#else
static inline int zynqmp_pm_get_api_version(u32 *version)
{
@@ -945,6 +938,26 @@ static inline int zynqmp_pm_set_gem_config(u32 node,
return -ENODEV;
}
+static inline int zynqmp_pm_xilsem_cntrl_ops(u32 cmd, u32 *const response)
+{
+ return -ENODEV;
+}
+
+static inline int zynqmp_pm_xilsem_cram_readecc(u32 frame, u32 row, u32 *const response)
+{
+ return -ENODEV;
+}
+
+static inline int zynqmp_pm_xilsem_cram_errinj(u32 frame, u32 qword, u32 bit,
+ u32 row, u32 *const response)
+{
+ return -ENODEV;
+}
+
+static inline int zynqmp_pm_xilsem_read_cfg(u32 *const response)
+{
+ return -ENODEV;
+}
#endif
#endif /* __FIRMWARE_ZYNQMP_H__ */
--
2.23.0
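The two new headers in the patch above give Versal and Versal NET different SW_ERR node IDs and different XilSEM mask bits, so a consumer has to pick the set that matches the device family. The following is a minimal userspace sketch of that selection, not the driver's code: the enum xsem_family, struct xsem_event_cfg and xsem_select_events() are hypothetical names, BIT() is redefined locally, and the macro values are copied from the headers. How the family is actually discovered is not shown here.

```c
#include <errno.h>
#include <stdint.h>

#define BIT(n) (1U << (n))

/* Values copied from xlnx-versal-error-events.h in this patch. */
#define VERSAL_EVENT_ERROR_SW_ERR			(0x28110000U)
#define XPM_VERSAL_EVENT_ERROR_MASK_XSEM_CRAM_CE_5	BIT(5)
#define XPM_VERSAL_EVENT_ERROR_MASK_XSEM_CRAM_UE_6	BIT(6)
#define XPM_VERSAL_EVENT_ERROR_MASK_XSEM_NPI_UE_7	BIT(7)

/* Values copied from xlnx-versal-net-error-events.h in this patch. */
#define VERSAL_NET_EVENT_ERROR_SW_ERR			(0x2811C000U)
#define XPM_VERSAL_NET_EVENT_ERROR_MASK_XSEM_CRAM_CE	BIT(7)
#define XPM_VERSAL_NET_EVENT_ERROR_MASK_XSEM_CRAM_UE	BIT(8)
#define XPM_VERSAL_NET_EVENT_ERROR_MASK_XSEM_NPI_UE	BIT(9)

/* Hypothetical family selector, for illustration only. */
enum xsem_family {
	XSEM_FAMILY_VERSAL,
	XSEM_FAMILY_VERSAL_NET,
};

/* Hypothetical container for the per-family event parameters. */
struct xsem_event_cfg {
	uint32_t sw_event_node_id;
	uint32_t cram_ce_mask;
	uint32_t cram_ue_mask;
	uint32_t npi_ue_mask;
};

/* Fill @cfg for @family; returns 0 on success, -EINVAL for an unknown family. */
static int xsem_select_events(enum xsem_family family, struct xsem_event_cfg *cfg)
{
	switch (family) {
	case XSEM_FAMILY_VERSAL:
		cfg->sw_event_node_id = VERSAL_EVENT_ERROR_SW_ERR;
		cfg->cram_ce_mask = XPM_VERSAL_EVENT_ERROR_MASK_XSEM_CRAM_CE_5;
		cfg->cram_ue_mask = XPM_VERSAL_EVENT_ERROR_MASK_XSEM_CRAM_UE_6;
		cfg->npi_ue_mask = XPM_VERSAL_EVENT_ERROR_MASK_XSEM_NPI_UE_7;
		return 0;
	case XSEM_FAMILY_VERSAL_NET:
		cfg->sw_event_node_id = VERSAL_NET_EVENT_ERROR_SW_ERR;
		cfg->cram_ce_mask = XPM_VERSAL_NET_EVENT_ERROR_MASK_XSEM_CRAM_CE;
		cfg->cram_ue_mask = XPM_VERSAL_NET_EVENT_ERROR_MASK_XSEM_CRAM_UE;
		cfg->npi_ue_mask = XPM_VERSAL_NET_EVENT_ERROR_MASK_XSEM_NPI_UE;
		return 0;
	default:
		return -EINVAL;
	}
}
```

The family check matters because the node IDs overlap between the two headers: 0x28110000 is VERSAL_EVENT_ERROR_SW_ERR on Versal but VERSAL_NET_EVENT_ERROR_PSM_ERR2 on Versal NET.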
* [PATCH v2 4/4] edac: xilinx: Add EDAC support for Xilinx XilSem
2025-07-22 16:03 [PATCH v2 0/4] Add support for Versal Xilsem edac Rama devi Veggalam
` (2 preceding siblings ...)
2025-07-22 16:03 ` [PATCH v2 3/4] firmware: xilinx: Add support for Xilsem scan operations Rama devi Veggalam
@ 2025-07-22 16:03 ` Rama devi Veggalam
2025-07-22 16:49 ` [PATCH v2 0/4] Add support for Versal Xilsem edac Borislav Petkov
4 siblings, 0 replies; 7+ messages in thread
From: Rama devi Veggalam @ 2025-07-22 16:03 UTC (permalink / raw)
To: bp, tony.luck, michal.simek, robh, krzk+dt, conor+dt
Cc: linux-kernel, linux-edac, devicetree, james.morse, mchehab, rric,
git, Rama devi Veggalam
Xilinx Versal Soft Error Mitigation (XilSEM) is responsible for reporting
and optionally correcting soft errors in Configuration Memory of Versal.
The Configuration Memory includes Configuration RAM and
Network on Chip (NoC) peripheral interconnect (NPI) Registers.
The Configuration RAM (CRAM) memory is used for storing configuration
data for the programmable logic (PL) fabric. The NPI registers are used
for configuring the memory controllers, miscellaneous integrated hardware,
NoC interface units in the Versal device.
Add EDAC support for Xilinx XilSEM. This driver
handles correctable and uncorrectable error events:
- Register for the notification.
- After receiving the notification, handle the correctable and
uncorrectable errors.
Add a sysfs interface for XilSEM scan operations:
initialize, start and stop scan, inject errors, and read ECC, scan status and
configuration values.
Signed-off-by: Rama devi Veggalam <rama.devi.veggalam@amd.com>
---
Changes in v2:
- Patch created on top of dependent patch series
"enhance zynqmp_pm_get_family_info()"
- Fixed maximum length warning in patch description
- Added details for eprobe_defer conditions
- Updated copyright information
- Removed ARCH_ZYNQMP in dependent list of XilSEM Kconfig
- Added error code for invalid versal device type
- Removed redundant sysfs details in function headers
- Included MAINTAINERS to this patch
- Added more description in commit message
- Removed print for probe success
- Removed function comments for xsem_edac_remove()
---
MAINTAINERS | 6 +
drivers/edac/Kconfig | 23 +
drivers/edac/Makefile | 1 +
drivers/edac/xilinx_xilsem_edac.c | 746 ++++++++++++++++++++++++++++++
4 files changed, 776 insertions(+)
create mode 100644 drivers/edac/xilinx_xilsem_edac.c
diff --git a/MAINTAINERS b/MAINTAINERS
index 10850512c118..6fcdc084716e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -27298,6 +27298,12 @@ S: Maintained
F: Documentation/devicetree/bindings/nvmem/xlnx,zynqmp-nvmem.yaml
F: drivers/nvmem/zynqmp_nvmem.c
+XILINX VERSAL XILSEM EDAC DRIVER
+M: Rama Devi Veggalam <rama.devi.veggalam@amd.com>
+S: Maintained
+F: Documentation/devicetree/bindings/edac/xlnx,versal-xilsem-edac.yaml
+F: drivers/edac/xilinx_xilsem_edac.
+
XILLYBUS DRIVER
M: Eli Billauer <eli.billauer@gmail.com>
L: linux-kernel@vger.kernel.org
diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig
index 19ad3c3b675d..4557248cc596 100644
--- a/drivers/edac/Kconfig
+++ b/drivers/edac/Kconfig
@@ -567,6 +567,29 @@ config EDAC_VERSAL
Support injecting both correctable and uncorrectable errors
for debugging purposes.
+config EDAC_XILINX_XILSEM
+ tristate "Xilinx Versal XilSEM Controller"
+ depends on ZYNQMP_FIRMWARE
+ help
+ Support for handling error events on the Xilinx Versal Xilsem
+ controller.
+
+ Xilinx Versal Soft Error Mitigation (XilSEM) is part of the
+ Platform Loader and Manager (PLM) which is loaded into and runs on the
+ Platform Management Controller (PMC). XilSEM is responsible for detecting
+ and optionally correcting soft errors in Configuration Memory of Versal.
+ The Configuration RAM (CRAM) memory is used for storing configuration
+ data for the programmable logic (PL) fabric. The NPI registers are used for
+ configuring the memory controllers, miscellaneous integrated hardware,
+ NoC interface units in the Versal device.
+
+ Whenever an error is detected in the Configuration Memory, be it correctable
+ or uncorrectable, XilSEM reports the error to user application.
+ This driver is responsible for handling the reported correctable and
+ uncorrectable errors. It also provides a sysfs interface for scan control operations.
+ The scan operations include initialization, start, stop, error inject, read ECC,
+ scan status and configuration values.
+
config EDAC_LOONGSON
tristate "Loongson Memory Controller"
depends on LOONGARCH && ACPI
diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile
index a8f2d8f6c894..55d9bf7f818d 100644
--- a/drivers/edac/Makefile
+++ b/drivers/edac/Makefile
@@ -83,6 +83,7 @@ obj-$(CONFIG_EDAC_TI) += ti_edac.o
obj-$(CONFIG_EDAC_QCOM) += qcom_edac.o
obj-$(CONFIG_EDAC_ASPEED) += aspeed_edac.o
obj-$(CONFIG_EDAC_BLUEFIELD) += bluefield_edac.o
+obj-$(CONFIG_EDAC_XILINX_XILSEM) += xilinx_xilsem_edac.o
obj-$(CONFIG_EDAC_DMC520) += dmc520_edac.o
obj-$(CONFIG_EDAC_NPCM) += npcm_edac.o
obj-$(CONFIG_EDAC_ZYNQMP) += zynqmp_edac.o
diff --git a/drivers/edac/xilinx_xilsem_edac.c b/drivers/edac/xilinx_xilsem_edac.c
new file mode 100644
index 000000000000..d74f5d0150e6
--- /dev/null
+++ b/drivers/edac/xilinx_xilsem_edac.c
@@ -0,0 +1,746 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2022 - 2025, Advanced Micro Devices, Inc.
+ */
+
+#include <linux/edac.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/of_device.h>
+#include <linux/platform_device.h>
+#include <linux/sizes.h>
+#include <linux/bitfield.h>
+#include <linux/firmware/xlnx-zynqmp.h>
+#include <linux/firmware/xlnx-versal-error-events.h>
+#include <linux/firmware/xlnx-versal-net-error-events.h>
+#include <linux/firmware/xlnx-event-manager.h>
+
+#include "edac_module.h"
+
+#define VERSAL_XILSEM_EDAC_MSG_SIZE 256
+#define VERSAL_XILSEM_EDAC_STRNG "versal_xilsem"
+#define EDAC_DEVICE "Xilsem"
+
+/* XilSem CE Error log count */
+#define XILSEM_MAX_CE_LOG_CNT 0x07
+
+/* XilSem_CRAM scan error info registers */
+#define CRAM_STS_INFO_OFFSET 0x34
+#define CRAM_CE_ADDRL0_OFFSET 0x38
+#define CRAM_CE_ADDRH0_OFFSET 0x3C
+#define CRAM_CE_COUNT_OFFSET 0x70
+
+/* XilSem_NPI_Scan uncorrectable error info registers */
+#define NPI_SCAN_COUNT 0x24
+#define NPI_SCAN_HB_COUNT 0x28
+#define NPI_ERR0_INFO_OFFSET 0x2C
+#define NPI_ERR1_INFO_OFFSET 0x30
+
+/* XilSem bit masks for extracting error details */
+#define CRAM_ERR_ROW_MASK GENMASK(26, 23)
+#define CRAM_ERR_BIT_MASK GENMASK(22, 16)
+#define CRAM_ERR_QWRD_MASK GENMASK(27, 23)
+#define CRAM_ERR_FRAME_MASK GENMASK(22, 0)
+
+enum xsem_cmd_id {
+ CRAM_INIT_SCAN = 1, /* To initialize CRAM scan */
+ CRAM_START_SCAN = 2, /* To start CRAM scan */
+ CRAM_STOP_SCAN = 3, /* To stop CRAM scan */
+ CRAM_ERR_INJECT = 4, /* To inject CRAM error */
+ NPI_START_SCAN = 5, /* To start NPI scan */
+ NPI_STOP_SCAN = 6, /* To stop NPI scan */
+ NPI_ERR_INJECT = 7, /* To inject NPI error */
+};
+
+/* XilSem Module IDs */
+#define CRAM_MOD_ID 0x1
+#define NPI_MOD_ID 0x2
+
+/**
+ * struct ecc_error_info - ECC error log information
+ * @status: CRAM/NPI scan error status
+ * @data0: Checksum of the error descriptor
+ * @data1: Index of the error descriptor
+ * @frame_addr: Frame location at which error occurred
+ * @block_type: Block type
+ * @row_id: Row number
+ * @bit_loc: Bit position in the Qword
+ * @qword: Qword location in the frame
+ */
+struct ecc_error_info {
+ u32 status;
+ u32 data0;
+ u32 data1;
+ u32 frame_addr;
+ u8 block_type;
+ u8 row_id;
+ u8 bit_loc;
+ u8 qword;
+};
+
+/**
+ * struct xsem_error_status - ECC status information to report
+ * @ce_cnt: Correctable error count
+ * @ue_cnt: Uncorrectable error count
+ * @ceinfo: Correctable error log information
+ * @ueinfo: Uncorrectable error log information
+ */
+struct xsem_error_status {
+ u32 ce_cnt;
+ u32 ue_cnt;
+ struct ecc_error_info ceinfo;
+ struct ecc_error_info ueinfo;
+};
+
+/**
+ * struct xsem_edac_priv - Xilsem private instance data
+ * @baseaddr: Base address of the XilSem PLM RTCA module
+ * @scan_ctrl_status: Buffer for scan ctrl commands
+ * @cram_errinj_status: Buffer for CRAM error injection
+ * @cram_frame_ecc: Buffer for CRAM frame ECC
+ * @xilsem_status: Buffer for CRAM & NPI status
+ * @sw_event_node_id: Error event node Id
+ * @xilsem_cfg: Buffer for CRAM & NPI configuration
+ * @cram_ce_mask: Event bit mask for CRAM correctable error
+ * @cram_ue_mask: Event bit mask for CRAM uncorrectable error
+ * @npi_ue_mask: Event bit mask for NPI uncorrectable error
+ * @ce_cnt: Correctable Error count
+ * @ue_cnt: Uncorrectable Error count
+ */
+struct xsem_edac_priv {
+ void __iomem *baseaddr;
+ u32 scan_ctrl_status[2];
+ u32 cram_errinj_status[2];
+ u32 cram_frame_ecc[4];
+ u32 xilsem_status[4];
+ u32 sw_event_node_id;
+ u32 xilsem_cfg[4];
+ u32 cram_ce_mask;
+ u32 cram_ue_mask;
+ u32 npi_ue_mask;
+ u32 ce_cnt;
+ u32 ue_cnt;
+};
+
+/**
+ * xsem_scan_control_show - Shows scan control operation status
+ * @dci: Pointer to the edac device struct
+ * @data: Pointer to user data
+ *
+ * Shows the scan control operations status
+ * Return: Number of bytes copied.
+ */
+static ssize_t xsem_scan_control_show(struct edac_device_ctl_info *dci, char *data)
+{
+ struct xsem_edac_priv *priv = dci->pvt_info;
+
+ return sysfs_emit(data, "[0x%x][0x%x]\n",
+ priv->scan_ctrl_status[0], priv->scan_ctrl_status[1]);
+}
+
+/**
+ * xsem_scan_control_store - Set scan control operation
+ * @dci: Pointer to the edac device struct
+ * @data: Pointer to user data
+ * @count: Number of bytes in the input buffer
+ *
+ * User-space interface for doing Xilsem scan operations
+ * (initialization, start, stop)
+ * Return: count argument if request succeeds, else error code
+ */
+static ssize_t xsem_scan_control_store(struct edac_device_ctl_info *dci,
+ const char *data, size_t count)
+{
+ struct xsem_edac_priv *priv = dci->pvt_info;
+ u32 cmd;
+ int ret;
+
+ if (!data)
+ return -EFAULT;
+
+ if (kstrtouint(data, 0, &cmd))
+ return -EINVAL;
+
+ if (cmd < CRAM_INIT_SCAN || cmd > NPI_ERR_INJECT ||
+ cmd == CRAM_ERR_INJECT)
+ return -EINVAL;
+
+ ret = zynqmp_pm_xilsem_cntrl_ops(cmd, priv->scan_ctrl_status);
+ if (ret)
+ return ret;
+
+ return count;
+}
+
+/**
+ * xsem_cram_injecterr_show - Shows CRAM error injection status
+ * @dci: Pointer to the edac device struct
+ * @data: Pointer to user data
+ *
+ * Shows CRAM error injection status
+ * Return: Number of bytes copied.
+ */
+static ssize_t xsem_cram_injecterr_show(struct edac_device_ctl_info *dci, char *data)
+{
+ struct xsem_edac_priv *priv = dci->pvt_info;
+
+ return sysfs_emit(data, "[0x%x][0x%x]\n", priv->cram_errinj_status[0],
+ priv->cram_errinj_status[1]);
+}
+
+/**
+ * xsem_cram_injecterr_store - Start error injection
+ * @dci: Pointer to the edac device struct
+ * @data: Pointer to user data
+ * @count: Number of bytes in the input buffer
+ *
+ * User-space interface for doing CRAM error injection
+ * Return: count argument if request succeeds, else error code
+ */
+static ssize_t xsem_cram_injecterr_store(struct edac_device_ctl_info *dci,
+ const char *data, size_t count)
+{
+ struct xsem_edac_priv *priv = dci->pvt_info;
+ char *kern_buff, *inbuf, *tok;
+ u32 row, frame, qword, bitloc;
+ int ret;
+
+ /* Copy the whole input; kstrndup() always NUL-terminates the copy */
+ kern_buff = kstrndup(data, count, GFP_KERNEL);
+ if (!kern_buff)
+ return -ENOMEM;
+
+ inbuf = kern_buff;
+
+ /* Read Frame number */
+ tok = strsep(&inbuf, " ");
+ if (!tok) {
+ ret = -EFAULT;
+ goto err;
+ }
+
+ ret = kstrtouint(tok, 0, &frame);
+ if (ret) {
+ ret = -EFAULT;
+ goto err;
+ }
+
+ /* Read Qword number */
+ tok = strsep(&inbuf, " ");
+ if (!tok) {
+ ret = -EFAULT;
+ goto err;
+ }
+
+ ret = kstrtouint(tok, 0, &qword);
+ if (ret) {
+ ret = -EFAULT;
+ goto err;
+ }
+
+ /* Read Bit location */
+ tok = strsep(&inbuf, " ");
+ if (!tok) {
+ ret = -EFAULT;
+ goto err;
+ }
+
+ ret = kstrtouint(tok, 0, &bitloc);
+ if (ret) {
+ ret = -EFAULT;
+ goto err;
+ }
+
+ /* Read Row number */
+ tok = strsep(&inbuf, " ");
+ if (!tok) {
+ ret = -EFAULT;
+ goto err;
+ }
+
+ ret = kstrtouint(tok, 0, &row);
+ if (ret) {
+ ret = -EFAULT;
+ goto err;
+ }
+
+ ret = zynqmp_pm_xilsem_cram_errinj(frame, qword, bitloc, row,
+ priv->cram_errinj_status);
+err:
+ kfree(kern_buff);
+
+ if (ret)
+ return ret;
+
+ return count;
+}
+
+/**
+ * xsem_cram_framecc_read_show - Shows CRAM Frame ECC
+ * @dci: Pointer to the edac device struct
+ * @data: Pointer to user data
+ *
+ * Shows CRAM Frame ECC value
+ * Return: Number of bytes copied.
+ */
+static ssize_t xsem_cram_framecc_read_show(struct edac_device_ctl_info *dci,
+ char *data)
+{
+ struct xsem_edac_priv *priv = dci->pvt_info;
+
+ return sysfs_emit(data, "[0x%x][0x%x][0x%x][0x%x]\n",
+ priv->cram_frame_ecc[0], priv->cram_frame_ecc[1],
+ priv->cram_frame_ecc[2], priv->cram_frame_ecc[3]);
+}
+
+/**
+ * xsem_cram_framecc_read_store - Read CRAM Frame ECC
+ * @dci: Pointer to the edac device struct
+ * @data: Pointer to user data
+ * @count: Number of bytes in the input buffer
+ *
+ * User-space interface for reading CRAM frame ECC
+ * Return: count argument if request succeeds, else error code
+ */
+static ssize_t xsem_cram_framecc_read_store(struct edac_device_ctl_info *dci,
+ const char *data, size_t count)
+{
+ struct xsem_edac_priv *priv = dci->pvt_info;
+ char *kern_buff, *inbuf, *tok;
+ u32 frameaddr, row;
+ int ret;
+
+ /* Copy the whole input; kstrndup() always NUL-terminates the copy */
+ kern_buff = kstrndup(data, count, GFP_KERNEL);
+ if (!kern_buff)
+ return -ENOMEM;
+
+ inbuf = kern_buff;
+
+ /* Read Frame address */
+ tok = strsep(&inbuf, " ");
+ if (!tok) {
+ ret = -EFAULT;
+ goto err;
+ }
+
+ ret = kstrtouint(tok, 0, &frameaddr);
+ if (ret) {
+ ret = -EFAULT;
+ goto err;
+ }
+
+ /* Read Row number */
+ tok = strsep(&inbuf, " ");
+ if (!tok) {
+ ret = -EFAULT;
+ goto err;
+ }
+
+ ret = kstrtouint(tok, 0, &row);
+ if (ret) {
+ ret = -EFAULT;
+ goto err;
+ }
+
+ ret = zynqmp_pm_xilsem_cram_readecc(frameaddr, row, priv->cram_frame_ecc);
+err:
+ kfree(kern_buff);
+
+ if (ret)
+ return ret;
+
+ return count;
+}
+
+/**
+ * xsem_read_status_show - Shows CRAM & NPI scan status
+ * @dci: Pointer to the edac device struct
+ * @data: Pointer to user data
+ *
+ * Shows CRAM & NPI scan status
+ * Return: Number of bytes copied.
+ */
+static ssize_t xsem_read_status_show(struct edac_device_ctl_info *dci, char *data)
+{
+ struct xsem_edac_priv *priv = dci->pvt_info;
+
+ return sysfs_emit(data, "[0x%x][0x%x][0x%x]\n",
+ priv->xilsem_status[0], priv->xilsem_status[1],
+ priv->xilsem_status[2]);
+}
+
+/**
+ * xsem_read_status_store - Read CRAM & NPI scan status
+ * @dci: Pointer to the edac device struct
+ * @data: Pointer to user data
+ * @count: Number of bytes in the input buffer
+ *
+ * User-space interface for reading Xilsem CRAM & NPI scan status
+ * Return: count argument if read succeeds, else error code
+ */
+static ssize_t xsem_read_status_store(struct edac_device_ctl_info *dci,
+ const char *data, size_t count)
+{
+ struct xsem_edac_priv *priv = dci->pvt_info;
+ u32 module;
+
+ if (!data)
+ return -EFAULT;
+
+ if (kstrtouint(data, 0, &module))
+ return -EINVAL;
+
+ if (module == CRAM_MOD_ID) {
+ priv->xilsem_status[0] = readl(priv->baseaddr + CRAM_STS_INFO_OFFSET);
+ priv->xilsem_status[1] = readl(priv->baseaddr + CRAM_CE_COUNT_OFFSET);
+ priv->xilsem_status[2] = 0;
+ } else if (module == NPI_MOD_ID) {
+ priv->xilsem_status[0] = readl(priv->baseaddr);
+ priv->xilsem_status[1] = readl(priv->baseaddr + NPI_SCAN_COUNT);
+ priv->xilsem_status[2] = readl(priv->baseaddr + NPI_SCAN_HB_COUNT);
+ } else {
+ edac_printk(KERN_ERR, EDAC_DEVICE, "Invalid module %u\n", module);
+ return -EINVAL;
+ }
+
+ return count;
+}
+
+/**
+ * xsem_read_config_show - Shows CRAM & NPI configuration
+ * @dci: Pointer to the edac device struct
+ * @data: Pointer to user data
+ *
+ * Shows CRAM & NPI configuration
+ * Return: Number of bytes copied.
+ */
+static ssize_t xsem_read_config_show(struct edac_device_ctl_info *dci, char *data)
+{
+ struct xsem_edac_priv *priv = dci->pvt_info;
+
+ return sysfs_emit(data, "[0x%x][0x%x][0x%x][0x%x]\n",
+ priv->xilsem_cfg[0], priv->xilsem_cfg[1],
+ priv->xilsem_cfg[2], priv->xilsem_cfg[3]);
+}
+
+/**
+ * xsem_read_config_store - Read CRAM & NPI configuration
+ * @dci: Pointer to the edac device struct
+ * @data: Pointer to user data
+ * @count: Number of bytes in the input buffer
+ *
+ * User-space interface for reading Xilsem configuration
+ * Return: count argument if request succeeds, else error code
+ */
+static ssize_t xsem_read_config_store(struct edac_device_ctl_info *dci,
+ const char *data, size_t count)
+{
+ struct xsem_edac_priv *priv = dci->pvt_info;
+ int ret;
+
+ if (!data)
+ return -EFAULT;
+
+ ret = zynqmp_pm_xilsem_read_cfg(priv->xilsem_cfg);
+
+ if (ret)
+ return ret;
+
+ return count;
+}
+
+/**
+ * xsem_handle_error - Handle XilSem error types CE and UE
+ * @dci: Pointer to the edac device controller instance
+ * @p: Pointer to the xilsem error status structure
+ *
+ * Handles the correctable and uncorrectable error.
+ */
+static void xsem_handle_error(struct edac_device_ctl_info *dci, struct xsem_error_status *p)
+{
+ struct ecc_error_info *pinf;
+ char message[VERSAL_XILSEM_EDAC_MSG_SIZE];
+
+ if (p->ce_cnt) {
+ pinf = &p->ceinfo;
+ snprintf(message, VERSAL_XILSEM_EDAC_MSG_SIZE,
+ "\n\rXILSEM CRAM error type :%s\n\r"
+ "\nFrame_Addr: [0x%X]\t Row_num: [0x%X]\t Bit_loc: [0x%X]\t Qword: [0x%X]\n\r",
+ "CE", pinf->frame_addr, pinf->row_id,
+ pinf->bit_loc, pinf->qword);
+ edac_device_handle_ce(dci, 0, 0, message);
+ }
+
+ if (p->ue_cnt) {
+ pinf = &p->ueinfo;
+ snprintf(message, VERSAL_XILSEM_EDAC_MSG_SIZE,
+ "\n\rXILSEM error type :%s\n\r"
+ "status: [0x%X]\n\rError_Info0: [0x%X]\n\r"
+ "Error_Info1: [0x%X]",
+ "UE", pinf->status, pinf->data0, pinf->data1);
+ edac_device_handle_ue(dci, 0, 0, message);
+ }
+}
+
+/**
+ * xsem_geterror_info - Get the current ECC error info
+ * @dci: Pointer to the edac device controller instance
+ * @p: Pointer to the Xilsem error status structure
+ * @mask: Event mask indicating the error type
+ *
+ * Determines whether an ECC error occurred and records its details in @p.
+ */
+static void xsem_geterror_info(struct edac_device_ctl_info *dci, struct xsem_error_status *p,
+ int mask)
+{
+ struct xsem_edac_priv *priv = dci->pvt_info;
+ u32 error_word_0, error_word_1, ce_count;
+ u8 index;
+
+ if (mask & priv->cram_ce_mask) {
+ p->ce_cnt++;
+
+ /* Read CRAM total correctable error count */
+ ce_count = readl(priv->baseaddr + CRAM_CE_COUNT_OFFSET);
+ /* Calculate index for error log */
+ index = (ce_count % XILSEM_MAX_CE_LOG_CNT);
+ /*
+ * Check if addr index is not 0
+ * if yes, then decrement index, else set index as last entry
+ */
+ if (index != 0U) {
+ /* Decrement Index */
+ --index;
+ } else {
+ /* Set log index to 6 (Max-1) */
+ index = (XILSEM_MAX_CE_LOG_CNT - 1);
+ }
+ error_word_0 = readl(priv->baseaddr + CRAM_CE_ADDRL0_OFFSET + (index * 8U));
+ error_word_1 = readl(priv->baseaddr + CRAM_CE_ADDRH0_OFFSET + (index * 8U));
+
+ /* Frame is at 22:0 bits of SEM_CRAMERR_ADDRH0 reg */
+ p->ceinfo.frame_addr = FIELD_GET(CRAM_ERR_FRAME_MASK, error_word_1);
+
+ /* row is at 26:23 bits of SEM_CRAMERR_ADDRH0 reg */
+ p->ceinfo.row_id = FIELD_GET(CRAM_ERR_ROW_MASK, error_word_1);
+
+ /* bit is at 22:16 bits of SEM_CRAMERR_ADDRL0 reg */
+ p->ceinfo.bit_loc = FIELD_GET(CRAM_ERR_BIT_MASK, error_word_0);
+
+ /* Qword is at 27:23 bits of SEM_CRAMERR_ADDRL0 reg */
+ p->ceinfo.qword = FIELD_GET(CRAM_ERR_QWRD_MASK, error_word_0);
+
+ /* Read CRAM status */
+ p->ceinfo.status = readl(priv->baseaddr + CRAM_STS_INFO_OFFSET);
+ } else if (mask & priv->cram_ue_mask) {
+ p->ue_cnt++;
+ p->ueinfo.data0 = 0;
+ p->ueinfo.data1 = 0;
+ p->ueinfo.status = readl(priv->baseaddr + CRAM_STS_INFO_OFFSET);
+ } else if (mask & priv->npi_ue_mask) {
+ p->ue_cnt++;
+ p->ueinfo.data0 = readl(priv->baseaddr + NPI_ERR0_INFO_OFFSET);
+ p->ueinfo.data1 = readl(priv->baseaddr + NPI_ERR1_INFO_OFFSET);
+ p->ueinfo.status = readl(priv->baseaddr);
+ } else {
+ edac_printk(KERN_ERR, EDAC_DEVICE, "Invalid Event received %d\n", mask);
+ }
+}
+
+/**
+ * xsem_err_callback - Handle Correctable and Uncorrectable errors.
+ * @payload: payload data.
+ * @data: controller data.
+ *
+ * Handles ECC correctable and uncorrectable errors.
+ */
+static void xsem_err_callback(const u32 *payload, void *data)
+{
+ struct edac_device_ctl_info *dci = data;
+ struct xsem_error_status stat;
+ struct xsem_edac_priv *priv;
+ int event;
+
+ priv = dci->pvt_info;
+ memset(&stat, 0, sizeof(stat));
+ /* Read payload to get the event type */
+ event = payload[2];
+ edac_printk(KERN_INFO, EDAC_DEVICE, "Event received %x\n", event);
+ xsem_geterror_info(dci, &stat, event);
+
+ priv->ce_cnt += stat.ce_cnt;
+ priv->ue_cnt += stat.ue_cnt;
+ xsem_handle_error(dci, &stat);
+}
+
+static struct edac_dev_sysfs_attribute xsem_edac_sysfs_attributes[] = {
+ {
+ .attr = {
+ .name = "xsem_scan_control_ops",
+ .mode = (0644)
+ },
+ .show = xsem_scan_control_show,
+ .store = xsem_scan_control_store},
+ {
+ .attr = {
+ .name = "xsem_cram_injecterr",
+ .mode = (0644)
+ },
+ .show = xsem_cram_injecterr_show,
+ .store = xsem_cram_injecterr_store},
+ {
+ .attr = {
+ .name = "xsem_cram_framecc_read",
+ .mode = (0644)
+ },
+ .show = xsem_cram_framecc_read_show,
+ .store = xsem_cram_framecc_read_store},
+ {
+ .attr = {
+ .name = "xsem_read_status",
+ .mode = (0644)
+ },
+ .show = xsem_read_status_show,
+ .store = xsem_read_status_store},
+ {
+ .attr = {
+ .name = "xsem_read_config",
+ .mode = (0644)
+ },
+ .show = xsem_read_config_show,
+ .store = xsem_read_config_store},
+ {
+ .attr = {.name = NULL}
+ }
+};
+
+/**
+ * xsem_edac_probe - Check controller and bind driver.
+ * @pdev: platform device.
+ *
+ * Probe a specific controller instance for binding with the driver.
+ *
+ * Return: 0 if the controller instance was successfully bound to the
+ * driver; otherwise, < 0 on error.
+ */
+static int xsem_edac_probe(struct platform_device *pdev)
+{
+ struct xsem_edac_priv *priv;
+ void __iomem *plmrtca_baseaddr;
+ struct edac_device_ctl_info *dci;
+ u32 family_code;
+ int rc;
+
+ plmrtca_baseaddr = devm_platform_ioremap_resource(pdev, 0);
+ if (IS_ERR(plmrtca_baseaddr))
+ return PTR_ERR(plmrtca_baseaddr);
+
+ dci = edac_device_alloc_ctl_info(sizeof(*priv), VERSAL_XILSEM_EDAC_STRNG,
+ 1, VERSAL_XILSEM_EDAC_STRNG, 1, 0,
+ edac_device_alloc_index());
+ if (!dci) {
+ edac_printk(KERN_ERR, EDAC_DEVICE, "Unable to allocate EDAC device\n");
+ return -ENOMEM;
+ }
+
+ priv = dci->pvt_info;
+ platform_set_drvdata(pdev, dci);
+ dci->dev = &pdev->dev;
+ priv->baseaddr = plmrtca_baseaddr;
+ dci->mod_name = pdev->dev.driver->name;
+ dci->ctl_name = VERSAL_XILSEM_EDAC_STRNG;
+ dci->dev_name = dev_name(&pdev->dev);
+
+ dci->sysfs_attributes = xsem_edac_sysfs_attributes;
+ rc = edac_device_add_device(dci);
+ if (rc)
+ goto free_dev_ctl;
+
+ rc = zynqmp_pm_get_family_info(&family_code);
+ if (rc) {
+ /*
+ * Firmware driver returns -ENODEV if it is not probed. In this case,
+ * defer xsem_edac_probe.
+ */
+ if (rc == -ENODEV)
+ rc = -EPROBE_DEFER;
+ goto free_edac_dev;
+ }
+
+ if (family_code == PM_VERSAL_NET_FAMILY_CODE) {
+ priv->sw_event_node_id = VERSAL_NET_EVENT_ERROR_SW_ERR;
+ priv->cram_ce_mask = XPM_VERSAL_NET_EVENT_ERROR_MASK_XSEM_CRAM_CE;
+ priv->cram_ue_mask = XPM_VERSAL_NET_EVENT_ERROR_MASK_XSEM_CRAM_UE;
+ priv->npi_ue_mask = XPM_VERSAL_NET_EVENT_ERROR_MASK_XSEM_NPI_UE;
+ } else if (family_code == PM_VERSAL_FAMILY_CODE) {
+ priv->sw_event_node_id = VERSAL_EVENT_ERROR_SW_ERR;
+ priv->cram_ce_mask = XPM_VERSAL_EVENT_ERROR_MASK_XSEM_CRAM_CE_5;
+ priv->cram_ue_mask = XPM_VERSAL_EVENT_ERROR_MASK_XSEM_CRAM_UE_6;
+ priv->npi_ue_mask = XPM_VERSAL_EVENT_ERROR_MASK_XSEM_NPI_UE_7;
+ } else {
+ edac_printk(KERN_ERR, EDAC_DEVICE, "Invalid Device Sub family code %u\n",
+ family_code);
+ rc = -ENODEV;
+ goto free_edac_dev;
+ }
+ rc = xlnx_register_event(PM_NOTIFY_CB, priv->sw_event_node_id,
+ priv->cram_ce_mask | priv->cram_ue_mask | priv->npi_ue_mask,
+ false, xsem_err_callback, dci);
+ if (rc) {
+ /*
+ * Register event API returns -EACCES if the event manager driver is not probed.
+ * In this case, defer xsem_edac_probe.
+ */
+ if (rc == -EACCES)
+ rc = -EPROBE_DEFER;
+ goto free_edac_dev;
+ }
+
+ return rc;
+
+free_edac_dev:
+ edac_device_del_device(&pdev->dev);
+free_dev_ctl:
+ edac_device_free_ctl_info(dci);
+
+ return rc;
+}
+
+static void xsem_edac_remove(struct platform_device *pdev)
+{
+ struct edac_device_ctl_info *dci = platform_get_drvdata(pdev);
+ struct xsem_edac_priv *priv = dci->pvt_info;
+
+ xlnx_unregister_event(PM_NOTIFY_CB, priv->sw_event_node_id,
+ priv->cram_ce_mask | priv->cram_ue_mask | priv->npi_ue_mask,
+ xsem_err_callback, dci);
+ edac_device_del_device(&pdev->dev);
+ edac_device_free_ctl_info(dci);
+}
+
+static const struct of_device_id xlnx_xsem_edac_match[] = {
+ { .compatible = "xlnx,versal-xilsem-edac", },
+ {
+ /* end of table */
+ }
+};
+
+MODULE_DEVICE_TABLE(of, xlnx_xsem_edac_match);
+
+static struct platform_driver xilinx_xsem_edac_driver = {
+ .driver = {
+ .name = "xilinx-xilsem-edac",
+ .of_match_table = xlnx_xsem_edac_match,
+ },
+ .probe = xsem_edac_probe,
+ .remove = xsem_edac_remove,
+};
+
+module_platform_driver(xilinx_xsem_edac_driver);
+
+MODULE_AUTHOR("Advanced Micro Devices, Inc.");
+MODULE_DESCRIPTION("Xilinx XilSEM driver");
+MODULE_LICENSE("GPL");
--
2.23.0
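The correctable-error branch of xsem_geterror_info() in patch 4/4 picks the most recent entry of a seven-slot error log and then splits two registers into frame, row, bit and Qword fields. What follows is a minimal userspace sketch of that arithmetic, not driver code. XSEM_GENMASK(), field_get(), ce_log_index(), ce_log_addrl_offset(), decode_ce() and struct ce_location are hypothetical names introduced here as stand-ins for the kernel's GENMASK()/FIELD_GET() and for the inline logic in the patch; the offset, log depth and bit ranges are copied from the patch's #defines.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's GENMASK(h, l) on 32-bit values. */
#define XSEM_GENMASK(h, l) \
	((uint32_t)((~UINT32_C(0) >> (31 - (h))) & (~UINT32_C(0) << (l))))

#define MAX_CE_LOG_CNT      7u    /* XILSEM_MAX_CE_LOG_CNT in the patch */
#define CE_ADDRL0_OFFSET    0x38u /* CRAM_CE_ADDRL0_OFFSET in the patch */

/* Hypothetical container for the decoded error location. */
struct ce_location {
	uint32_t frame;
	uint32_t row;
	uint32_t bit;
	uint32_t qword;
};

/* Userspace stand-in for FIELD_GET(): mask the field, then shift it down. */
static uint32_t field_get(uint32_t mask, uint32_t reg)
{
	assert(mask != 0);
	/* (mask & -mask) is the lowest set bit; dividing by it shifts right. */
	return (reg & mask) / (mask & (~mask + 1u));
}

/* Slot holding the most recent CE entry, as computed in the patch. */
static uint32_t ce_log_index(uint32_t ce_count)
{
	uint32_t index = ce_count % MAX_CE_LOG_CNT;

	/* A remainder of 0 wraps to the last slot, otherwise step back one. */
	return index ? index - 1u : MAX_CE_LOG_CNT - 1u;
}

/* Register offset of the low address word; entries are 8 bytes apart. */
static uint32_t ce_log_addrl_offset(uint32_t ce_count)
{
	return CE_ADDRL0_OFFSET + ce_log_index(ce_count) * 8u;
}

/* Split the low/high address words using the bit ranges from the patch. */
static struct ce_location decode_ce(uint32_t addr_lo, uint32_t addr_hi)
{
	struct ce_location loc;

	loc.frame = field_get(XSEM_GENMASK(22, 0), addr_hi);
	loc.row   = field_get(XSEM_GENMASK(26, 23), addr_hi);
	loc.bit   = field_get(XSEM_GENMASK(22, 16), addr_lo);
	loc.qword = field_get(XSEM_GENMASK(27, 23), addr_lo);
	return loc;
}
```

With these helpers, a CE count of 3 selects slot 2, so the low address word would be read at offset 0x48. A count that is a multiple of 7 (including 0) selects slot 6, which matches the else branch in the patch.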
* Re: [PATCH v2 0/4] Add support for Versal Xilsem edac
From: Borislav Petkov @ 2025-07-22 16:49 UTC (permalink / raw)
To: Rama devi Veggalam
Cc: tony.luck, michal.simek, robh, krzk+dt, conor+dt, linux-kernel,
linux-edac, devicetree, james.morse, mchehab, rric, git
On Tue, Jul 22, 2025 at 09:33:11PM +0530, Rama devi Veggalam wrote:
> Add sysfs interface for Xilsem scan operations initialize, start,
> stop scan, error inject, read ECC, status and configuration values.
> Handle correctable and uncorrectable xilsem error events.
I had questions last time:
https://lore.kernel.org/all/20250422171737.GAaAfPMbFtNKN6paJT@renoirsky.local/
which went unanswered.
You shouldn't wonder if your submissions get ignored too.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH v2 1/4] dt-bindings: edac: Add bindings for Xilinx Versal EDAC for XilSem
From: Krzysztof Kozlowski @ 2025-07-23 8:05 UTC (permalink / raw)
To: Rama devi Veggalam
Cc: bp, tony.luck, michal.simek, robh, krzk+dt, conor+dt,
linux-kernel, linux-edac, devicetree, james.morse, mchehab, rric,
git
On Tue, Jul 22, 2025 at 09:33:12PM +0530, Rama devi Veggalam wrote:
> + Xilinx Versal Soft Error Mitigation (XilSEM) is part of the
> + Platform Loader and Manager (PLM) which is loaded into and runs on the
> + Platform Management Controller (PMC). XilSEM is responsible for reporting
> + and optionally correcting soft errors in Configuration Memory of Versal.
> + The memory is scanned by a hardware controller in the Versal Programmable
> + Logic (PL). During the scan, if the controller detects any error, be it
> + correctable or uncorrectable, it reports the error to PLM. The XilSEM on PLM
> + performs the error validation and notifies the errors to user application.
> + This XilSEM EDAC node is responsible for handling error events received from
> + XilSEM on PLM and also provides an interface to control scan operations and
> + fetching the scan status & configuration information.
> +
> +properties:
> + compatible:
> + const: xlnx,versal-xilsem-edac
Implement or respond to previous comment.
Best regards,
Krzysztof